Agents will not reconnect after the server hosting Bamboo ran out of disk space
After the Bamboo server runs out of disk space, all agents (Elastic and/or Remote) go offline and do not reconnect, even though the storage issue has since been resolved.
The following may appear in the agent logs, bamboo-elastic-agent.out or atlassian-bamboo-agent.log:
INFO | jvm 2 | 2017/06/14 14:12:10 | 2017-06-14 14:12:10,585 INFO [AgentRunnerThread] [AgentRegistrationBean] Registering agent on the server
INFO | jvm 2 | 2017/06/14 14:17:13 | 2017-06-14 14:17:13,029 WARN [AgentRunnerThread] [RemoteAgent$1] Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'agentRegistrationBean': Invocation of init method failed; nested exception is org.springframework.remoting.RemoteTimeoutException: Receive timeout after 300000 ms for RemoteInvocation: method name 'registerAgent'; parameter types [com.atlassian.bamboo.buildqueue.RemotableRemoteAgentDefinition]
The above is a generic warning indicating that the agent timed out while trying to communicate with the messaging broker on the Bamboo Server. On its own, this warning does not imply that your Bamboo Server has run out of disk space or that the resolution below needs to be followed.
Errors similar to the below are present on the Bamboo Server within the
2017-06-13 21:48:27,985 ERROR [ConcurrentQueueStoreAndDispatch] [MessageDatabase] KahaDB failed to store to Journal
java.io.IOException: No space left on device
	at java.io.RandomAccessFile.writeBytes(Native Method)
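When the server log is large, a small script can help confirm whether this error is present. The following is a minimal sketch, not an Atlassian tool: the function name find_disk_full_errors and the marker strings are assumptions for illustration, with the markers copied from the example error above. It takes log lines as input, so it can be fed from whichever log file you are inspecting.

```python
from typing import Iterable, List

# Substrings copied from the example error above (assumption: your log
# uses the same wording; adjust the markers if it differs).
MARKERS = (
    "KahaDB failed to store to Journal",
    "No space left on device",
)


def find_disk_full_errors(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain any of the disk-full markers."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in MARKERS)
    ]


if __name__ == "__main__":
    # In-memory sample; in practice, pass an open file object instead.
    sample = [
        "2017-06-13 21:48:27,985 ERROR [MessageDatabase] KahaDB failed to store to Journal\n",
        "java.io.IOException: No space left on device\n",
        "2017-06-13 21:48:30,000 INFO unrelated entry\n",
    ]
    for hit in find_disk_full_errors(sample):
        print(hit)
```

A match only shows that the broker hit a full disk at some point; it does not by itself show that the broker is still in a failed state.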
The ActiveMQ JMS Broker used for agent communication never recovered after the server ran out of disk space, despite the storage situation being resolved.
- Ensure there's enough disk space on the server hosting Bamboo.
- Restart Bamboo so that the ActiveMQ JMS Broker also restarts successfully.
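The first step above can be checked before restarting. The following is a minimal sketch using Python's standard shutil.disk_usage. The function name has_enough_free_space, the path "." and the 5 GiB threshold are assumptions for illustration, not Atlassian recommendations; point the path at the volume that holds your Bamboo data and pick a threshold that suits your instance.

```python
import shutil

GIB = 1024 ** 3


def has_enough_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the volume containing `path` has at least
    `min_free_bytes` bytes free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Hypothetical values: replace "." with the Bamboo data volume and
    # adjust the threshold for your environment.
    path = "."
    threshold = 5 * GIB
    free_gib = shutil.disk_usage(path).free / GIB
    if has_enough_free_space(path, threshold):
        print(f"{free_gib:.1f} GiB free on {path}: OK to restart Bamboo")
    else:
        print(f"Only {free_gib:.1f} GiB free on {path}: free up space first")
```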
Generally, after a system runs out of disk space entirely and the storage issue has been resolved, it is a good idea to restart the entire server (not just Bamboo).
It is not uncommon for certain Bamboo XML configuration files to become corrupt when the server runs out of disk space during operation. This can surface after the server is restarted and cause a Bamboo outage. Please be aware of the two knowledge-base articles below, which may be applicable in such an event: