Bitbucket Server DIY Backup fails - Operations from one or more SCMs did not finish within the allotted timeout

Symptoms

The Bitbucket Server DIY Backup fails gracefully because the bitbucket_backup_wait function does not finish in time. The following appears in atlassian-bitbucket-YYYY-MM-DD.log:

2015-10-09 02:50:42,925 WARN  [threadpool:thread-6] backup *AA1BB1x165x3972463x2 10.10.10.10 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.LatchAndDrainScmStep The SCMs could not be drained. Aborting...
2015-10-09 02:50:42,962 WARN  [threadpool:thread-6] backup *AA1BB1x165x3972463x2 10.10.10.10 "POST /mvc/admin/backups HTTP/1.1" c.a.s.i.m.DefaultMaintenanceTaskMonitor BACKUP maintenance has failed (Cause: BackupException: A backup file could not be created.)
com.atlassian.stash.internal.backup.BackupException: A backup file could not be created.
                at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:78) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:130) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:69) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.MaintenanceModePhase.run(MaintenanceModePhase.java:27) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.backup.AbstractBackupTask.run(AbstractBackupTask.java:85) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.DefaultMaintenanceTaskMonitor.run(DefaultMaintenanceTaskMonitor.java:212) ~[stash-service-impl-3.11.1.jar:na]
                at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_45]
                at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_45]
                at com.atlassian.stash.internal.concurrent.StateTransferringExecutor$StateTransferringRunnable.run(StateTransferringExecutor.java:73) [stash-platform-3.11.1.jar:na]
                at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [na:1.8.0_45]
                at java.util.concurrent.FutureTask.run(Unknown Source) [na:1.8.0_45]
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source) [na:1.8.0_45]
                at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [na:1.8.0_45]
                at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_45]
                at java.lang.Thread.run(Unknown Source) [na:1.8.0_45]
                ... 1 frame trimmed
Caused by: com.atlassian.stash.internal.backup.BackupException: Operations from one or more SCMs did not finish within the allotted timeout. To prevent corruption due to inconsistent state, the backup has been aborted. Please try backup up again when the system is under less load.
                at com.atlassian.stash.internal.maintenance.LatchAndDrainScmStep.newDrainFailedException(LatchAndDrainScmStep.java:36) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.AbstractLatchAndDrainTask.run(AbstractLatchAndDrainTask.java:86) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask$Step.run(CompositeMaintenanceTask.java:130) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.CompositeMaintenanceTask.run(CompositeMaintenanceTask.java:69) ~[stash-service-impl-3.11.1.jar:na]
                at com.atlassian.stash.internal.maintenance.backup.BackupPhase.run(BackupPhase.java:74) ~[stash-service-impl-3.11.1.jar:na]
                ... 15 common frames omitted

Diagnosis

  1. Run ps -ef to find the Git processes still running on the back end, e.g.

    atlstash 2964 6 0.0 00:00:00 0.0 1184 113132 ? S 16:38:31 git http-backend
    atlstash 2965 0 0.0 00:00:00 0.0 1232 112188 ? S 16:38:31 git-http-backend <noArgs>
    atlstash 2966 7 0.0 00:00:00 0.0 1316 123376 ? S 16:38:31 git receive-pack_--stateless-rpc_.
    atlstash 2971 0 0.1 00:01:16 0.0 13024 137468 ? S 16:38:31 git index-pack_-
  2. Run lsof against those process IDs to determine the affected repositories (the cwd entry and any open pack files show the repository path), e.g.

    git 2971 atlstash cwd DIR 0,19 4096 67504227 /apps/stash-data/shared/data/repositories/1
    git 2971 atlstash rtd DIR 253,0 4096 2 /
    git 2971 atlstash txt REG 253,0 7303493 286128 /usr/local/libexec/git-core/git
    git 2971 atlstash mem REG 253,0 91096 139390 /lib64/libz.so.1.2.3
    git 2971 atlstash mem REG 253,0 99158576 263410 /usr/lib/locale/locale-archive
    git 2971 atlstash mem REG 253,0 19536 135157 /lib64/libdl-2.12.so
    git 2971 atlstash mem REG 253,0 1921216 131097 /lib64/libc-2.12.so
    git 2971 atlstash mem REG 253,0 142640 131121 /lib64/libpthread-2.12.so
    git 2971 atlstash mem REG 253,0 1963296 266130 /usr/lib64/libcrypto.so.1.0.1e
    git 2971 atlstash mem REG 253,0 154664 132201 /lib64/ld-2.12.so
    git 2971 atlstash mem REG 253,0 26060 527143 /usr/lib64/gconv/gconv-modules.cache
    git 2971 atlstash 0r FIFO 0,8 ? 591801098 pipe
    git 2971 atlstash 1w FIFO 0,8 ? 591801518 pipe
    git 2971 atlstash 2w FIFO 0,8 ? 591801517 pipe
    git 2971 atlstash 3u REG 0,19 4186750018 67295571 /apps/stash-data/shared/data/repositories/1/objects/pack/tmp_pack_lCDLna
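
The two steps above can be combined in a small shell helper. This is a minimal sketch, not part of the DIY Backup scripts: the function name repo_dirs_from_lsof is hypothetical, and it assumes that repositories live under a .../repositories/<id> directory and that Git runs as the atlstash user, as in the sample output.

```shell
# Hypothetical helper: read `lsof` output on stdin and print each
# repository directory that appears in it, once.
# Assumes repository paths look like /.../repositories/<numeric id>.
repo_dirs_from_lsof() {
  grep -oE '/[^ ]*/repositories/[0-9]+' | sort -u
}

# Example usage (the atlstash user name is taken from the sample output;
# adjust it for your installation):
#   for pid in $(pgrep -u atlstash git); do
#     lsof -p "$pid"
#   done | repo_dirs_from_lsof
```

For the lsof sample above, this would report the single repository directory that PID 2971 is writing a temporary pack file into.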

Cause

One or more Git processes were still active and did not finish within the default timeout (60 seconds) that the backup allows.

Resolution

Increase the timeout from the 60-second default (for example, to 2 minutes) to give Git operations more time to complete. Add the following parameter to bitbucket.properties and restart the application:

  • backup.drain.scm.timeout=120

There is no single correct timeout value. Increase it step by step until in-flight SCM requests have enough time to finish.
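
Because the value may need several adjustments, it can help to script the property change. The following is a minimal sketch under stated assumptions: set_drain_timeout is a hypothetical helper, the path to bitbucket.properties is passed in by the caller, and the existing entry (if any) is assumed to be written as key=value with no surrounding spaces. The application still has to be restarted afterwards.

```shell
# Hypothetical helper: set backup.drain.scm.timeout in a properties file,
# replacing any existing entry for that key.
#   $1: path to bitbucket.properties (location depends on your installation)
#   $2: timeout in seconds
set_drain_timeout() {
  sdt_props_file=$1
  sdt_timeout=$2
  sdt_tmp=$(mktemp) || return 1
  # Copy every line except an existing backup.drain.scm.timeout entry.
  if [ -f "$sdt_props_file" ]; then
    grep -v '^backup\.drain\.scm\.timeout=' "$sdt_props_file" > "$sdt_tmp" || true
  fi
  printf 'backup.drain.scm.timeout=%s\n' "$sdt_timeout" >> "$sdt_tmp"
  # Write back in place so the file keeps its owner and permissions.
  cat "$sdt_tmp" > "$sdt_props_file"
  rm -f "$sdt_tmp"
}

# Example usage (the path is a placeholder):
#   set_drain_timeout /path/to/bitbucket.properties 120
```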

Last updated: April 20, 2016
