Build plans queued for extended duration reporting "Updating source code to latest..." inside Build activity dashboard

Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Summary

Build plans have been queued for an extended duration, reporting "Updating source code to latest..." inside the Build activity dashboard. Several agents capable of processing the builds are available, but the builds remain queued for an excessive period.

Diagnosis

The fact that several build plans are queued and appear stuck at "Updating source code to latest..." doesn't necessarily mean they are waiting for Bamboo to update the caches of the repositories they use before dispatching the build. A separate article outlines the potential causes and fixes for that type of problem.

The issue described in this article is slightly different: it affects builds after change detection has happened and the respective caches have been updated, even though Bamboo still reports "Updating source code to latest..." inside the Build activity dashboard. The two scenarios described below share one common factor that is very important for diagnosing this issue:

Build plans are in fact getting dispatched and built by agents while Bamboo reports "Updating source code to latest...". So the very first step to diagnosing this issue would be to review your agent logs and see if they are building the plans that Bamboo says are in the queue under the status "Updating source code to latest...".
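
As a starting point for that log review, the following is a minimal sketch that searches an agent log for a given build. It assumes only that the agent log mentions the build result key as plain text; the exact log line format depends on your Bamboo version, and the log path and key passed on the command line are values you supply yourself.

```python
import sys
from pathlib import Path
from typing import Iterable, List


def lines_mentioning(lines: Iterable[str], build_key: str) -> List[str]:
    """Return the lines that contain build_key (plain substring match)."""
    return [line.rstrip("\n") for line in lines if build_key in line]


def scan_agent_log(log_path: str, build_key: str) -> List[str]:
    """Scan one agent log file for mentions of build_key."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with Path(log_path).open(encoding="utf-8", errors="replace") as handle:
        return lines_mentioning(handle, build_key)


if __name__ == "__main__":
    # Usage: python scan_agent_log.py <agent log path> <build result key>
    if len(sys.argv) == 3:
        for hit in scan_agent_log(sys.argv[1], sys.argv[2]):
            print(hit)
```

If the script prints lines for a build that the dashboard still shows as "Updating source code to latest...", the agent is working on that build and the symptoms match this article.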

It helps to know the basics of the Bamboo build plan workflow in order to see where the issue lies when the symptoms appear:

  1. The build plan is started.
  2. The plan's status changes to "Queued".
  3. Change detection happens on the server side. This is where Bamboo reaches out to the repository to determine whether there are any changes it needs for the build.
  4. The plan is added to the build queue. This is when it appears in the Build activity dashboard.
  5. The server assigns an agent to it and sends an event to that agent.
  6. The agent receives the event and starts the build.
  7. The agent finishes building and sends the results back to the server.

The problem described in this article occurs when Bamboo has to process the events/messages sent from the agent. The fact that builds go to the queue, get picked up by available agents, and are built, all while Bamboo reports "Updating source code to latest...", means Bamboo is struggling to update the status of your builds in the database.

Diagnosis 1

Thread dumps taken while several build plans are queued and appear stuck at "Updating source code to latest..." contain RUNNABLE threads with stack frames in the following classes:

...
at com.atlassian.bamboo.user.rename.UserRenameHelper.updateUserInTable(UserRenameHelper.java:38)
at com.atlassian.bamboo.user.rename.UserRenameHelper.renameUserInBuildResultSummary(UserRenameHelper.java:82)
at com.atlassian.bamboo.user.rename.UserRenameServiceImpl.doRenameUser(UserRenameServiceImpl.java:179)
...

This suggests that a user renaming process is happening.
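
Checking a large thread dump by eye is tedious, so the check above can be scripted. The sketch below assumes a jstack-style text dump, where thread blocks are separated by blank lines, each block starts with a quoted thread name, and the state appears as `java.lang.Thread.State: RUNNABLE`. Dumps rendered by other tools may be laid out differently and would need a different parser. The function name is illustrative only.

```python
import re
from typing import List

# Matches the state line that jstack prints under each thread header.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def runnable_threads_with_frame(dump_text: str, frame_marker: str) -> List[str]:
    """Return names of RUNNABLE threads whose stack contains frame_marker.

    Assumes jstack-style output: thread blocks separated by blank lines,
    each beginning with a quoted thread name.
    """
    matches = []
    for block in re.split(r"\n\s*\n", dump_text):
        lines = block.strip().splitlines()
        if not lines or not lines[0].startswith('"'):
            continue  # not a thread block (for example, the dump header)
        state = STATE_RE.search(block)
        if state is None or state.group(1) != "RUNNABLE":
            continue
        if any(frame_marker in line for line in lines[1:]):
            # The thread name is the text between the first pair of quotes.
            matches.append(lines[0].split('"')[1])
    return matches
```

Calling `runnable_threads_with_frame(dump_text, "UserRenameHelper")` on each dump taken during the incident shows whether the rename frames keep appearing in RUNNABLE threads over time.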

Diagnosis 2

Important Bamboo threads such as IndexerService and BuildTailMessageProcessingThread can be seen in thread dumps spending extended periods in filesystem operations. Example:

8-BuildTailMessageProcessingThread-expensive:pool-16-thread-102
State
Runnable
Java Stack
at java.io.RandomAccessFile.open0(Native Method) 
at java.io.RandomAccessFile.open(RandomAccessFile.java:316) 
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193) 
at org.apache.lucene.store.Directory.copy(Directory.java:185) 
at org.apache.lucene.store.TrackingDirectoryWrapper.copy(TrackingDirectoryWrapper.java:50) 
at org.apache.lucene.index.IndexWriter.createCompoundFile(IndexWriter.java:4582) 
at org.apache.lucene.index.DocumentsWriterPerThread.sealFlushedSegment(DocumentsWriterPerThread.java:535) 
at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:502) 
at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:506) 
at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:616) 
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2815) 
- locked [0x00000003ce9fd4e8] (a java.lang.Object) 
- locked [0x00000003cf2b1460] (a java.lang.Object) 
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2970) 
- locked [0x00000003cf2b1460] (a java.lang.Object) 
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2940) 
at com.atlassian.bonnie.LuceneConnection.commitAndRefreshSearcher(LuceneConnection.java:566) 
at com.atlassian.bonnie.LuceneConnection.withWriter(LuceneConnection.java:506) 
at com.atlassian.bamboo.index.IndexerServiceImpl$8.run(IndexerServiceImpl.java:314) 


A quick look at current process utilization on the server running Bamboo shows anti-virus software consuming a lot of resources. Here's an example from running top with McAfee On-Access Scanner active while the problem is happening:

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
 1019 bamboo    20   0  170052   3648   1628 R 63.0  0.0   0:00.36 top
 4028 root      20   0 1133188  27388  13512 S 50.0  0.0  79:33.16 oacore
 6404 root      20   0 1756680 420504   8660 S 50.0  0.6 461:45.27 OASManager
 6402 root      20   0 1756680 420504   8660 R 47.8  0.6 461:23.14 OASManager
 6408 root      20   0 1756680 420504   8660 S 47.8  0.6 461:33.43 OASManager
 6406 root      20   0 1756680 420504   8660 S 45.7  0.6 460:56.47 OASManager
 6410 root      20   0 1756680 420504   8660 S 43.5  0.6 461:46.93 OASManager
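
When a scanner runs as many threads or processes, its total load is easier to judge once the rows are summed per command. The following is a minimal sketch that does this for text captured from top. It assumes the capture contains a header row starting with PID and including %CPU and COMMAND columns, as in the sample above; the function name is illustrative.

```python
from collections import defaultdict
from typing import Dict


def cpu_by_command(top_output: str) -> Dict[str, float]:
    """Sum the %CPU column per COMMAND for the process rows of top output."""
    lines = top_output.strip().splitlines()
    # Locate the column header row; anything before it is the top summary.
    header_idx = next(
        i for i, line in enumerate(lines) if line.split()[:1] == ["PID"]
    )
    columns = lines[header_idx].split()
    cpu_col = columns.index("%CPU")
    cmd_col = columns.index("COMMAND")

    totals: Dict[str, float] = defaultdict(float)
    for line in lines[header_idx + 1:]:
        # maxsplit keeps a COMMAND that contains spaces in one field.
        fields = line.strip().split(None, cmd_col)
        if len(fields) <= cmd_col:
            continue  # blank or truncated row
        totals[fields[cmd_col].strip()] += float(fields[cpu_col])
    return dict(totals)
```

Applied to the sample above, the five OASManager rows add up to 234.8 %CPU, with oacore adding another 50.0.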


Cause

Cause 1

This is caused by a bug: BAM-20993. The user renaming process can be extensive and time-consuming, depending on the number of records that need to be updated in the database. This can affect Bamboo's ability to keep up with reading and writing the status and results of all builds.

Cause 2

This is caused by the anti-virus software, which is likely intercepting or blocking read, open, and write operations in Lucene indexing (for build results and status) and/or in ActiveMQ threads. Communication and data transfer between the Bamboo server and agents go through Apache ActiveMQ (AMQ). In Bamboo, AMQ is configured as a persistent queue, meaning that messages are written to disk in the <Bamboo server home directory>/jms-store directory before they get to the database.
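
One rough way to see whether small file operations are unusually slow on the volume that holds the Bamboo home directory is to time them directly. The sketch below is an illustration and not an Atlassian tool: it times creating, writing, syncing, and deleting a small file in a directory you choose. It is assumed here that you point it at a scratch directory on the same filesystem rather than at jms-store itself, so the probe never touches Bamboo's own files. The function name and defaults are hypothetical.

```python
import os
import tempfile
import time
from typing import List


def probe_write_latency(directory: str, samples: int = 20, size: int = 4096) -> List[float]:
    """Time create + write + fsync + delete of a small file, `samples` times.

    Returns the duration of each iteration in seconds.
    """
    payload = b"\0" * size
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        fd, path = tempfile.mkstemp(dir=directory, prefix="latency-probe-")
        try:
            os.write(fd, payload)
            os.fsync(fd)  # flush to disk so the write is not only cached
        finally:
            os.close(fd)
            os.unlink(path)  # leave no probe files behind
        timings.append(time.perf_counter() - start)
    return timings
```

Comparing `max(timings)` and the typical value between a period when the scanner is busy and a quiet period gives a first indication of whether file operations are being delayed. It does not prove that the scanner is the cause.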

If you are using McAfee On-Access Scanner, the cause might be the one described in (McAfee) Slow performance with Java-based applications.

Solution

Solution 1

There's no immediate solution to this issue. If the user renaming process is running, you must wait until it finishes. Avoid renaming a large batch of users at once until BAM-20993 has been fixed.

Solution 2

There are a few options to consider when it comes to anti-virus software:

Last updated: April 6, 2021
