Starting Bitbucket Server/DC after upgrade to 8.x fails due to sidecar issue without any error
プラットフォームについて: Data Center のみ - この記事は、Data Center プラットフォームのアトラシアン製品にのみ適用されます。
この KB は Data Center バージョンの製品用に作成されています。Data Center 固有ではない機能の Data Center KB は、製品のサーバー バージョンでも動作する可能性はありますが、テストは行われていません。サーバー*製品のサポートは 2024 年 2 月 15 日に終了しました。サーバー製品を利用している場合は、アトラシアンのサーバー製品のサポート終了のお知らせページにて移行オプションをご確認ください。
*Fisheye および Crucible は除く
要約
In addition to the case described in KB Starting Bitbucket Server/Data Center after upgrade to 8.x fails with the error UnsatisfiedDependencyException Mesh process may fail to start on time due to long hostname resolution.
You may also notice a behavior where Sidecar starts, but not on time, which causes Bitbucket to fail to start with the following error in UI:
Git is temporarily not available due to dead mesh sidecar child process
環境
Any Bitbucket version 8.x
診断
If your Bitbucket instance is failing to start due to Mesh Sidecar is currently unavailable but not throwing UnsatisfiedDependencyException or any other clearly defined exception like in the example below:
xxxx-xx-xx xx:xx:xx WARN [spring-startup] o.s.w.c.s.XmlWebApplicationContext Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'meshController'; nested exception is com.atlassian.stash.internal.mesh.SidecarUnavailableException: Sidecar is currently unavailable
xxxx-xx-xx xx:xx:xx,848 ERROR [hz.hazelcast.event-1] c.h.s.i.e.impl.EventServiceImpl [192.168.2.2]:5701 [atlbitbucket] [3.12.13] hz.hazelcast.event-1 caught an exception while processing:com.hazelcast.spi.impl.eventservice.impl.LocalEventDispatcher@3c0f46ab
com.atlassian.stash.internal.mesh.SidecarUnavailableException: Sidecar is currently unavailable
at com.atlassian.stash.internal.mesh.DefaultMeshController.maybeConfigureSidecar(DefaultMeshController.java:259)
at com.atlassian.stash.internal.mesh.DefaultMeshController.access$700(DefaultMeshController.java:71)
at com.atlassian.stash.internal.mesh.DefaultMeshController$ControlPlaneStateListener.lambda$onUpdated$0(DefaultMeshController.java:292)
at java.base/java.util.Optional.ifPresent(Unknown Source)
at com.atlassian.stash.internal.mesh.DefaultMeshController$ControlPlaneStateListener.onUpdated(DefaultMeshController.java:291)
at com.atlassian.stash.internal.mesh.DefaultMeshController$ControlPlaneStateListener.entryAdded(DefaultMeshController.java:278)
at com.hazelcast.map.impl.MapListenerAdaptors$1$1.onEvent(MapListenerAdaptors.java:67)
at com.hazelcast.map.impl.MapListenerAdaptors$1$1.onEvent(MapListenerAdaptors.java:64)
at com.hazelcast.map.impl.InternalMapListenerAdapter.onEvent(InternalMapListenerAdapter.java:56)
at com.hazelcast.map.impl.InternalMapListenerAdapter.onEvent(InternalMapListenerAdapter.java:35)
at com.hazelcast.map.impl.event.MapEventPublishingService.callListener(MapEventPublishingService.java:172)
at com.hazelcast.map.impl.event.MapEventPublishingService.dispatchEntryEventData(MapEventPublishingService.java:184)
at com.hazelcast.map.impl.event.MapEventPublishingService.dispatchEvent(MapEventPublishingService.java:83)
at com.hazelcast.map.impl.event.MapEventPublishingService.dispatchEvent(MapEventPublishingService.java:47)
at com.hazelcast.map.impl.MapService.dispatchEvent(MapService.java:105)
at com.hazelcast.map.impl.MapService.dispatchEvent(MapService.java:78)
at com.hazelcast.spi.impl.eventservice.impl.LocalEventDispatcher.run(LocalEventDispatcher.java:64)
at com.hazelcast.util.executor.StripedExecutor$Worker.process(StripedExecutor.java:244)
at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:227)
xxxx-xx-xx xx:xx:xx,300 INFO [spring-startup] c.a.s.i.s.g.m.DefaultSidecarManager Sidecar has stopped (Exit code: 0)
xxxx-xx-xx xx:xx:xx,350 INFO [spring-startup] c.a.s.internal.home.HomeLockAcquirer Releasing lock on /var/atlassian/application-data/bitbucket/shared
xxxx-xx-xx xx:xx:xx,363 INFO [spring-startup] c.a.s.internal.home.HomeLockAcquirer Releasing lock on /var/atlassian/application-data/bitbucket
xxxx-xx-xx xx:xx:xx,366 ERROR [spring-startup] c.a.j.s.w.s.JohnsonDispatcherServlet SpringMVC dispatcher [springMvc] could not be started
org.springframework.context.ApplicationContextException: Failed to start bean 'meshController'; nested exception is com.atlassian.stash.internal.mesh.SidecarUnavailableException: Sidecar is currently unavailable
at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:181)
at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:356)
at java.base/java.lang.Iterable.forEach(Unknown Source)
at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:155)
at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:123)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:935)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
at javax.servlet.GenericServlet.init(GenericServlet.java:158)
at java.base/java.lang.Thread.run(Unknown Source)
... 8 frames trimmed
Caused by: com.atlassian.stash.internal.mesh.SidecarUnavailableException: Sidecar is currently unavailable
at com.atlassian.stash.internal.mesh.DefaultMeshController.maybeConfigureSidecar(DefaultMeshController.java:259)
at com.atlassian.stash.internal.mesh.DefaultMeshController.lambda$maybeConfigureSidecar$0(DefaultMeshController.java:250)
at java.base/java.util.Optional.ifPresent(Unknown Source)
at com.atlassian.stash.internal.mesh.DefaultMeshController.maybeConfigureSidecar(DefaultMeshController.java:246)
at com.atlassian.stash.internal.mesh.DefaultMeshController.start(DefaultMeshController.java:233)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
... 10 common frames omitted
xxxx-xx-xx xx:xx:xx,625 WARN [http-nio-7990-exec-10] c.a.j.s.w.f.BypassableDelegatingFilterProxy Bypassing [hazelcastFilter]; no Spring WebApplicationContext is available
When the issue has happened, please check atlassian-mesh.log. You will see the following warnings:
xxxx-xx-xx xx:xx:xx,654 WARN [main] - o.s.boot.StartupInfoLogger InetAddress.getLocalHost().getHostName() took 5016 milliseconds to respond. Please verify your network configuration.
xxxx-xx-xx xx:xx:xx,675 WARN [main] - o.s.boot.system.ApplicationPid ManagementFactory.getRuntimeMXBean().getName() took 5017 milliseconds to respond. This may be due to slow host name resolution. Please verify your network configuration.
Please note that Mesh should startup without any warnings. If you see any please create a ticket to Atlassian Support
You may also see the following errors being thrown by Bitbucket while Sidecar is starting, but delays:
2024-10-14 08:49:05,805 ERROR [spring-startup] c.a.s.internal.scm.PluginScmService ***************************************************************************
2024-10-14 08:49:05,805 ERROR [spring-startup] c.a.s.internal.scm.PluginScmService Git is temporarily not available due to dead mesh sidecar child process.
2024-10-14 08:49:05,805 ERROR [spring-startup] c.a.s.internal.scm.PluginScmService ***************************************************************************
2024-10-14 08:49:14,876 WARN [threadpool:thread-1] c.a.s.i.mesh.MeshReadOnlyController Failed to update the set of readonly repositories on Mesh node Sidecar(http://localhost:7777)
com.atlassian.bitbucket.mesh.MeshNodeConnectionException: Mesh node is currently unavailable.
at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.translateUnavailable(DefaultErrorTranslator.java:447)
at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.translateIfKnownCause(DefaultErrorTranslator.java:285)
at com.atlassian.stash.internal.scm.git.mesh.DefaultErrorTranslator.maybeTranslate(DefaultErrorTranslator.java:57)
at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.maybeTranslate(AbstractFutureResponseObserver.java:209)
at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver$ResponseFuture.translateCause(AbstractFutureResponseObserver.java:298)
at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver$ResponseFuture.get(AbstractFutureResponseObserver.java:235)
at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.asResult(AbstractFutureResponseObserver.java:151)
at com.atlassian.stash.internal.scm.git.mesh.AbstractFutureResponseObserver.asResult(AbstractFutureResponseObserver.java:139)
at com.atlassian.stash.internal.scm.git.mesh.GrpcManagementClient$2.onEnd(GrpcManagementClient.java:360)
at com.atlassian.stash.internal.repository.HibernateRepositoryDao.streamRepositories(HibernateRepositoryDao.java:556)
at com.atlassian.stash.internal.repository.HibernateRepositoryDao.streamReadOnly(HibernateRepositoryDao.java:382)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor188.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.proxy3/jdk.proxy3.$Proxy301.streamReadOnly(Unknown Source)
at com.atlassian.stash.internal.repository.DefaultRepositoryService.streamReadOnly(DefaultRepositoryService.java:708)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor188.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.proxy3/jdk.proxy3.$Proxy350.streamReadOnly(Unknown Source)
at com.atlassian.stash.internal.scm.git.DefaultGitRepositorySupplier.streamReadOnly(DefaultGitRepositorySupplier.java:21)
at com.atlassian.stash.internal.scm.git.mesh.GrpcManagementClient.replaceReadOnlyRepositories(GrpcManagementClient.java:272)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.internal.reflect.GeneratedMethodAccessor188.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at jdk.proxy3/jdk.proxy3.$Proxy443.replaceReadOnlyRepositories(Unknown Source)
at com.atlassian.stash.internal.mesh.MeshReadOnlyController.replaceReadOnlyRepositories(MeshReadOnlyController.java:171)
at com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
at com.atlassian.stash.internal.mesh.MeshReadOnlyController.lambda$onApplicationEvent$1(MeshReadOnlyController.java:116)
at com.atlassian.stash.internal.concurrent.StateTransferringRunnable.run(StateTransferringRunnable.java:50)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
... 49 frames trimmed
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:537)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:481)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.atlassian.stash.internal.scm.git.mesh.LastSeenClientInterceptor$LastSeenClientListener.onClose(LastSeenClientInterceptor.java:40)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at com.atlassian.stash.internal.scm.git.mesh.StatefulClientCallListener.onClose(StatefulClientCallListener.java:34)
at com.atlassian.stash.internal.scm.git.mesh.ErrorHandlingClientInterceptor$ErrorHandlingCall$1.onClose(ErrorHandlingClientInterceptor.java:149)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
... 3 common frames omitted
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: localhost/127.0.0.1:7777
Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
at io.netty.channel.unix.Errors.newConnectException0(Errors.java:166)
at io.netty.channel.unix.Errors.handleConnectErrno(Errors.java:131)
at io.netty.channel.unix.Socket.finishConnect(Socket.java:359)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:715)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:692)
at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:567)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:491)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
Even after seeing sidecar starting successfully after a few minutes, Bitbucket is still not able to be brought back online:
2024-10-14 08:49:28,154 INFO [mesh-grpc-request:thread-5] c.a.s.i.mesh.DefaultMeshNodeRegistry Node Sidecar (http://localhost:7777) came online
Lastly, you may see some hints of IPv6 being used while Sidecar is being started, which could delay its startup:
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: localhost/[0:0:0:0:0:0:0:1]:7777
原因
The WARN you can observe above is caused by getHostName() function failed to obtain the hostname value from the OS within the timeout. The function getHostName() is called during the startup of both Bitbucket application and mesh nodes. If mesh node fails to obtain the hostname within timeout Bitbucket considers that sidecar has failed to start and initiate the termination sequence.
Additionally, Sidecar could possibly delay to be started due IPv6 use and that could be confirmed capturing Thread Dumps on Bitbucket's startup.
Thread Dumps capture:
MESH_PID=$(pgrep -f mesh-app.jar | head -n1)
while true; do
if [ -n "$MESH_PID" ]; then
jstack -l $MESH_PID >> "jstack-$(date +%s).out"
echo "Captured jstack at $(date +%s)"
sleep 2
fi
done
ソリューション
There are two solutions available:
- Fix the network configuration to make hostname resolution faster - check resolv.conf or speak to DNS admin of your network.
In /etc/hosts set the hostname to both 127.0.0.1 and the IP address of the server.
# Hosts file example ~ cat /etc/hosts 127.0.0.1 localhost ::1 localhost
# Comment the line with ip6-loopback (::1) or change to 127.0.0.1 127.0.0.1 localhost #::1 localhost
If none of above workarounds have resolved the issue, we could try to force Bitbucket and Mesh to use IPv4 instead by adding below properties to $BITBUCKET_INSTALLATION_DIRECTORY/bin/_start-webapp.sh:
JVM_SUPPORT_RECOMMENDED_ARGS="$JVM_SUPPORT_RECOMMENDED_ARGS -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false"
And to pass these into sidecar, we need to specify in $BITBUCKET_HOME/shared/bitbucket.properties file below properties as well and restart Bitbucket:plugin.bitbucket-git.mesh.sidecar.jvmArgs[0]=-Djava.net.preferIPv4Stack=true plugin.bitbucket-git.mesh.sidecar.jvmArgs[1]=-Djava.net.preferIPv6Addresses=false