Unlimited LDAP read timeout can cause Cluster Locks health check to fail if there are communication issues


Platform notice: Server and Data Center only. This article applies only to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

Problem

When the LDAP read timeout is high (or unlimited) and JIRA cannot communicate properly with your LDAP server, the Cluster Locks health check can fail. By default, this check fails if a process holds a cluster lock for more than 300 seconds. That happens when JIRA can connect to the LDAP server but cannot read information from it. The health check then reports a message such as:

 

Node 'node1' has been holding cluster lock, 'com.atlassian.crowd.embedded.api.Directory:10100', for 503 seconds.
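
To triage many such messages, the hold time can be extracted and compared with the 300-second default. The following is a minimal sketch, assuming the message always has the shape of the single sample above; the function names and the regular expression are illustrative and not part of JIRA.

```python
import re

# Default health check threshold described in this article, in seconds.
DEFAULT_THRESHOLD_SECONDS = 300

# Pattern derived from the one sample message above (an assumption about its general shape).
LOCK_MESSAGE = re.compile(
    r"Node '(?P<node>[^']+)' has been holding cluster lock, "
    r"'(?P<lock>[^']+)', for (?P<seconds>\d+) seconds"
)


def parse_lock_message(message):
    """Return (node, lock_name, held_seconds), or None if the text does not match."""
    match = LOCK_MESSAGE.search(message)
    if match is None:
        return None
    return match.group("node"), match.group("lock"), int(match.group("seconds"))


def exceeds_threshold(message, threshold=DEFAULT_THRESHOLD_SECONDS):
    """True if the message reports a lock held longer than the threshold."""
    parsed = parse_lock_message(message)
    return parsed is not None and parsed[2] > threshold


SAMPLE_MESSAGE = (
    "Node 'node1' has been holding cluster lock, "
    "'com.atlassian.crowd.embedded.api.Directory:10100', for 503 seconds."
)
print(parse_lock_message(SAMPLE_MESSAGE))
print(exceeds_threshold(SAMPLE_MESSAGE))
```

The lock name ends with the directory ID, which helps identify which user directory is involved.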

 

Diagnosis

Environment

  • A JIRA instance that is configured with LDAP/AD and has a read timeout of 0 (infinite) or higher than 300 seconds.
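
The condition above can be expressed as a small check. This is a sketch only, assuming the read timeout is expressed in seconds and that the health check uses its 300-second default; the function name is hypothetical.

```python
# Default duration after which the Cluster Locks health check fails, per this article.
HEALTH_CHECK_THRESHOLD_SECONDS = 300


def read_timeout_is_risky(read_timeout_seconds, threshold=HEALTH_CHECK_THRESHOLD_SECONDS):
    """Return True if a stalled LDAP read could outlast the health check threshold.

    A value of 0 means "wait forever", so it is always risky.
    """
    if read_timeout_seconds < 0:
        raise ValueError("read timeout must not be negative")
    return read_timeout_seconds == 0 or read_timeout_seconds > threshold


for value in (0, 120, 300, 600):
    print(value, read_timeout_is_risky(value))
```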

Diagnostic Steps

  • This only happens when JIRA has trouble reading from the LDAP server. The atlassian-jira.log file shows a directory synchronisation starting but not completing as expected:

    2017-01-31 11:34:08,377 atlassian-scheduler-quartz1.clustered_Worker-1 INFO ServiceRunner     [atlassian.crowd.directory.DbCachingRemoteDirectory] INCREMENTAL synchronisation for directory [ 10400 ] starting
    2017-01-31 11:34:08,377 atlassian-scheduler-quartz1.clustered_Worker-1 INFO ServiceRunner     [atlassian.crowd.directory.DbCachingRemoteDirectory] Attempting INCREMENTAL synchronisation for directory [ 10400 ]
  • Thread dumps show the same long-running thread waiting on a com.sun.jndi.ldap.LdapRequest in every dump. This suggests that JIRA is waiting for a response from the LDAP server and is stuck at this stage:

    "atlassian-scheduler-quartz1.clustered_Worker-1" #140 prio=5 tid=0x00007f6659c33000 nid=0x4e98 in Object.wait() [0x00007f6575984000]
       java.lang.Thread.State: WAITING (on object monitor)
    	at java.lang.Object.wait(Native Method)
    	at java.lang.Object.wait(Object.java:502)
    	at com.sun.jndi.ldap.Connection.readReply(Connection.java:467)
    	- locked <0x00000005e0912700> (a com.sun.jndi.ldap.LdapRequest)
    	at com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:640)
    	at com.sun.jndi.ldap.LdapClient.search(LdapClient.java:563)
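
When several thread dumps are available, the comparison described above can be automated. The sketch below assumes each dump is plain text in the format shown, with thread headers starting with a quoted thread name; the helper names are illustrative.

```python
import re

# A thread header line starts with the quoted thread name.
THREAD_HEADER = re.compile(r'^\s*"(?P<name>[^"]+)"')
# Stack frame that indicates a thread is waiting for an LDAP reply.
LDAP_READ_FRAME = "com.sun.jndi.ldap.Connection.readReply"


def threads_waiting_on_ldap(dump_text):
    """Return the names of threads whose stack contains the LDAP read frame."""
    waiting = set()
    current = None
    for line in dump_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
        elif current is not None and LDAP_READ_FRAME in line:
            waiting.add(current)
    return waiting


def stuck_in_every_dump(dumps):
    """Return thread names that wait on an LDAP reply in every dump supplied."""
    if not dumps:
        return set()
    result = threads_waiting_on_ldap(dumps[0])
    for dump_text in dumps[1:]:
        result &= threads_waiting_on_ldap(dump_text)
    return result


SAMPLE_DUMP = '''"atlassian-scheduler-quartz1.clustered_Worker-1" #140 prio=5 tid=0x00007f6659c33000 nid=0x4e98 in Object.wait() [0x00007f6575984000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at java.lang.Object.wait(Object.java:502)
	at com.sun.jndi.ldap.Connection.readReply(Connection.java:467)
	- locked <0x00000005e0912700> (a com.sun.jndi.ldap.LdapRequest)
	at com.sun.jndi.ldap.LdapClient.getSearchReply(LdapClient.java:640)
	at com.sun.jndi.ldap.LdapClient.search(LdapClient.java:563)
'''

print(stuck_in_every_dump([SAMPLE_DUMP, SAMPLE_DUMP]))
```

A thread that appears in the result for dumps taken several minutes apart is a strong candidate for the process holding the cluster lock.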

Workaround

The following steps help as a short-term workaround:

  • Manually start the LDAP directory sync again.
  • Change the LDAP read and connection timeouts to finite values (that is, not 0, and lower than the 300-second health check threshold) so that a stalled sync is terminated with a read timeout exception if there are communication issues.
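
The effect of a finite read timeout can be illustrated with a generic socket example. This is an analogy only, not JIRA or JNDI code: a local socket pair stands in for an LDAP server that accepts the connection but never answers, and the function name is hypothetical.

```python
import socket


def read_with_timeout(timeout_seconds):
    """Try to read from a peer that never replies, giving up after the timeout."""
    # socketpair() returns two connected sockets; the second one stays silent,
    # like a server that is reachable but does not send a response.
    client, silent_server = socket.socketpair()
    try:
        # A finite timeout bounds the wait. None would block indefinitely,
        # which mirrors an unlimited read timeout.
        client.settimeout(timeout_seconds)
        try:
            client.recv(1)
        except socket.timeout:
            return "timed out"
        return "received data"
    finally:
        client.close()
        silent_server.close()


print(read_with_timeout(0.2))
```

With a finite timeout the blocked read ends in an exception, so the caller regains control and can release whatever it was holding instead of waiting forever.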

Resolution

The underlying communication issue will likely need to be investigated on the network side to determine why JIRA cannot communicate reliably with LDAP/AD. That investigation is outside the scope of this article.

 

Last updated: November 2, 2018
