Starting Bitbucket Server takes a long time after upgrading to version 4.12 or newer

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

After upgrading Bitbucket Server or Data Center to version 4.12 or newer, startup takes significantly longer than before. In a Data Center installation, the issue affects every node in the cluster.

During startup, you can see the following errors in the logs:

2019-06-11 12:00:08,114 ERROR [spring-startup]  c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79998/tmp-61d593556c512d39-config.lock
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
	at java.nio.channels.FileChannel.open(FileChannel.java:287)
	at java.nio.channels.FileChannel.open(FileChannel.java:335)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.lang.Thread.run(Thread.java:748)
	... 1 frame trimmed

Diagnosis

Environment

  • The instance has been upgraded to version 4.12 or newer
  • The issue started only after the upgrade
  • No significant load on the database or filesystem is observable during the long startup
  • The issue is still reproducible in UPM Safe Mode
  • The startup is stuck on "Preparing plugin framework"

Diagnostic Steps

  1. Enable debug logging and profiling
  2. Restart the instance
  3. Search the logs for occurrences of:

    SalGitUpgradeManager IncludeSystemConfigTask failed for repository

  4. Check atlassian-bitbucket-profiler.log for the time consumed by the task git: apply IncludeSystemConfigTask
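
Step 3 can be automated when the log is large. The following is a minimal sketch, assuming the failure lines have the same shape as the excerpt in the Problem section (PROJECT/slug[id] at the end of the line). The function name find_failed_repositories and the regular expression are illustrative and are not part of Bitbucket.

```python
import re

# Matches the per-repository failure line shown in this article, for example:
# "... SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]"
# The pattern is an assumption derived from that single excerpt.
FAILURE_RE = re.compile(
    r"IncludeSystemConfigTask failed for repository "
    r"(?P<project>[^/\s]+)/(?P<slug>[^\[\s]+)\[(?P<repo_id>\d+)\]"
)


def find_failed_repositories(log_lines):
    """Return {repository id: "PROJECT/slug"} for every failure line.

    Duplicate entries (the task logs the same repository on every
    restart) are collapsed; first-seen order is preserved.
    """
    failed = {}
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match:
            name = f"{match['project']}/{match['slug']}"
            failed.setdefault(int(match["repo_id"]), name)
    return failed
```

Passing an open handle on atlassian-bitbucket.log as log_lines yields the distinct affected repositories, and the size of the result is the number to compare against the 50-repository threshold mentioned in the Solution section.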

Cause

In version 4.12 we introduced IncludeSystemConfigTask, which rewrites the config files of all repositories to add settings for a shared config file and to add a repository config file for each repository. We also introduced additional filesystem locks to provide the isolation required to prevent concurrent changes to individual repository settings. In other words, while Bitbucket Server has a config.lock file in place, Git rejects any attempt to edit the same configuration with git config.

To implement this locking mechanism, version 4.12 added an upgrade task that performs the following actions:

  1. Query all the repositories from the database (git: apply IncludeSystemConfigTask)
  2. Create the tmp-<some_hash>-config.lock file in each repository as a hard link
  3. If the creation fails, log an exception at ERROR level and schedule a retry for all repositories at the next restart
  4. Retry on each restart until the task finishes successfully for all repositories
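
To illustrate why a missing repository directory makes step 2 fail, here is a simplified sketch of a hard-link based lock. It is not Bitbucket's implementation: the function acquire_config_lock, the random suffix, and the exact sequence of calls are assumptions made for illustration. Only the file names tmp-<some_hash>-config.lock and config.lock come from this article.

```python
import os
import secrets
from pathlib import Path


def acquire_config_lock(repo_dir):
    """Take config.lock in repo_dir using a temporary file and a hard link.

    Returns the path of the lock file on success.
    Raises FileNotFoundError if repo_dir does not exist (the Python
    counterpart of the NoSuchFileException in the log excerpt) and
    FileExistsError if another process already holds the lock.
    """
    repo_dir = Path(repo_dir)
    # Hypothetical naming: a random hex suffix stands in for <some_hash>.
    tmp = repo_dir / f"tmp-{secrets.token_hex(8)}-config.lock"
    lock = repo_dir / "config.lock"

    # Creating the temporary file fails when the directory is missing.
    fd = os.open(tmp, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)
    try:
        # Hard-linking fails if config.lock already exists, so only one
        # caller can obtain the lock.
        os.link(tmp, lock)
    finally:
        # The temporary name is no longer needed in either case.
        os.unlink(tmp)
    return lock
```

In this sketch, a repository that exists in the database but has no directory on disk fails at the very first filesystem call, which is consistent with the stack trace ending in FileChannel.open on the tmp-...-config.lock path.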

This logic causes a problem when the list of repositories stored in the database differs from the repositories actually present on the filesystem. In that case, Bitbucket fails to create the lock file because the path does not exist on the filesystem, and the task is marked as failed:

c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories

The task is then rescheduled for the next restart. Every node restart therefore runs the task again, and as long as it cannot create the lock, it is rescheduled once more. As a result, every node sees increased startup times, and the delay becomes more pronounced as more repositories are added to the system (more repositories for IncludeSystemConfigTask to check).
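
The mismatch between the database and the filesystem can be checked directly. The sketch below assumes that the repository IDs have already been exported from the database by some other means (not covered here) and that each repository lives in a directory named after its ID under <shared home>/data/repositories, as the path in the log excerpt suggests. The function name find_missing_repository_dirs is hypothetical.

```python
from pathlib import Path


def find_missing_repository_dirs(repo_ids, repositories_dir):
    """Return the sorted repository IDs that have no directory on disk.

    repo_ids: iterable of integer IDs known to the database.
    repositories_dir: for example
        /var/atlassian/application-data/bitbucket/shared/data/repositories
    """
    root = Path(repositories_dir)
    return sorted(
        repo_id
        for repo_id in set(repo_ids)
        if not (root / str(repo_id)).is_dir()
    )
```

Every ID returned by this function is a repository for which the lock file cannot be created, so the task would be expected to fail and be rescheduled for as long as any such ID remains.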

Solution

There are two resolutions available:

  1. The root cause is an inconsistency between the database and the filesystem, so in Data Center installations the issue can be resolved with the Bitbucket Integrity checker.
    (warning) Running the integrity check can take a very long time on an instance with a significant number of repositories. Use the Integrity checker as a resolution only if the errors reported in the logs affect more than 50 repositories.
  2. For Bitbucket Server installations, or Bitbucket Data Center installations with fewer than 50 affected repositories, follow these steps:
    1. Recreate (delete and create again) all the impacted repositories via the UI
    2. Restart the instance
    3. Verify that atlassian-bitbucket.log no longer contains errors from IncludeSystemConfigTask, for example:

      Errors showing that the task failed:

      2019-01-11 12:53:08,114 ERROR [spring-startup]  c.a.s.i.s.g.u.s.SalGitUpgradeManager IncludeSystemConfigTask failed for repository TEST/document[79998]
      java.nio.file.NoSuchFileException: /var/atlassian/application-data/bitbucket/shared/data/repositories/79904/tmp-61d593556c518d39-config.lock
      	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
      	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
      	at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
      	at java.nio.channels.FileChannel.open(FileChannel.java:287)
      	at java.nio.channels.FileChannel.open(FileChannel.java:335)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.openLockWithHardLink(DefaultGitRepositoryLayout.java:293)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.withLock(DefaultGitRepositoryLayout.java:160)
      	at com.atlassian.stash.internal.scm.git.DefaultGitRepositoryLayout.editConfig(DefaultGitRepositoryLayout.java:83)
      	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.upgrade(IncludeSystemConfigTask.java:96)
      	at com.atlassian.stash.internal.scm.git.upgrade.IncludeSystemConfigTask.lambda$parallelUpgrade$1(IncludeSystemConfigTask.java:143)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.lang.Thread.run(Thread.java:748)
      	... 1 frame trimmed
      2019-01-11 12:53:08,126 ERROR [spring-startup]  c.a.sal.core.upgrade.PluginUpgrader Upgrade failed: IncludeSystemConfigTask failed for one or more repositories
      java.lang.RuntimeException: IncludeSystemConfigTask failed for one or more repositories
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:366)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.perform(SalGitUpgradeManager.java:317)
      	at com.atlassian.stash.internal.user.DefaultEscalatedSecurityContext.call(DefaultEscalatedSecurityContext.java:58)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$DelegatingUpgradeTask.apply(SalGitUpgradeManager.java:264)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SalUpgradeTask.doUpgrade(SalGitUpgradeManager.java:325)
      	at com.atlassian.sal.core.upgrade.PluginUpgrader.doUpgrade(PluginUpgrader.java:72)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalPluginUpgrader.apply(SalPluginUpgrader.java:27)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:382)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager$SynchronousUpgrader.doInTransaction(SalGitUpgradeManager.java:373)
      	at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
      	at com.atlassian.stash.internal.scm.git.upgrade.sal.SalGitUpgradeManager.start(SalGitUpgradeManager.java:133)
      	at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:173)
      	at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:50)
      	at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:350)
      	at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:149)
      	at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:112)
      	at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:880)
      	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:546)
      	at javax.servlet.GenericServlet.init(GenericServlet.java:158)
      	at java.lang.Thread.run(Thread.java:748)
      	... 8 frames trimmed


      Messages for successful task completion:

      c.a.s.i.s.g.u.IncludeSystemConfigTask Executor service has shutdown gracefully
      
      c.a.sal.core.upgrade.PluginUpgrader Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git to version 8 - Updates all repositories to include system-config for common configuration 
      



    4. Perform another restart to confirm that the issue is resolved.
    5. If you do not see any errors but the instance still takes a long time to start up, contact Atlassian Support and attach the log files.
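
The verification in step 3 can be scripted. This is a minimal sketch that relies only on the two message fragments quoted above. The function name upgrade_task_status and the "last marker wins" rule are assumptions, chosen because atlassian-bitbucket.log may contain output from several restarts.

```python
# Fragments copied from the log excerpts in this article.
FAILURE_MARKER = "IncludeSystemConfigTask failed for one or more repositories"
SUCCESS_MARKER = (
    "Upgraded plugin com.atlassian.bitbucket.server.bitbucket-git to version 8"
)


def upgrade_task_status(log_lines):
    """Return "failed", "succeeded", or "unknown".

    The log is read in order and the most recent marker decides the
    result, so an old failure followed by a later success reports
    "succeeded".
    """
    status = "unknown"
    for line in log_lines:
        if FAILURE_MARKER in line:
            status = "failed"
        elif SUCCESS_MARKER in line:
            status = "succeeded"
    return status
```

A result of "unknown" means neither message was found, which may simply indicate that the log was rotated. In that case, check the older log files before drawing a conclusion.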


Description: Slow startup issue troubleshooting after the upgrade.
Product: Bitbucket Server

Last updated: September 24, 2020
