Directory Synchronisation against Active Directory times out with a load balancer in place
Platform notice: Cloud and Data Center - This article applies equally to both the Cloud and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
After upgrading Confluence, synchronizations against your Active Directory may fail and, as a consequence, login attempts may fail as well.
Environment
Confluence Server and Data Center
Active Directory behind a load balancer
Diagnosis
The Atlassian Crowd libraries that Confluence uses for directory synchronization may show this failure after an upgrade. The issue typically applies when:
- There is a load balancer between Confluence and Active Directory
- The same setup, with no changes, worked before the upgrade
To verify the issue, search the atlassian-confluence.log file for a timeout error similar to the following:
2020-07-29 10:38:53,254 ERROR [Caesium-1-3] [atlassian.crowd.directory.DbCachingDirectoryPoller] pollChanges Error occurred while refreshing the cache for directory [ 42860545 ].
com.atlassian.crowd.exception.OperationFailedException: Error looking up attributes for highestCommittedUSN
at com.atlassian.crowd.directory.MicrosoftActiveDirectory.fetchHighestCommittedUSN(MicrosoftActiveDirectory.java:703)
at com.atlassian.crowd.directory.ldap.cache.UsnChangedCacheRefresher.synchroniseAll(UsnChangedCacheRefresher.java:148)
at com.atlassian.crowd.directory.DbCachingRemoteDirectory.synchroniseCache(DbCachingRemoteDirectory.java:978)
at com.atlassian.crowd.manager.directory.DirectorySynchroniserImpl.synchronise(DirectorySynchroniserImpl.java:67)
at com.atlassian.crowd.directory.DbCachingDirectoryPoller.pollChanges(DbCachingDirectoryPoller.java:45)
at com.atlassian.crowd.manager.directory.monitor.poller.DirectoryPollerJobRunner.runJob(DirectoryPollerJobRunner.java:85)
at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.doRunJob(JobRunnerWrapper.java:117)
at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.lambda$runJob$0(JobRunnerWrapper.java:87)
at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContextInternal(VCacheRequestContextManager.java:84)
at com.atlassian.confluence.impl.vcache.VCacheRequestContextManager.doInRequestContext(VCacheRequestContextManager.java:68)
at com.atlassian.confluence.impl.schedule.caesium.JobRunnerWrapper.runJob(JobRunnerWrapper.java:87)
at com.atlassian.scheduler.core.JobLauncher.runJob(JobLauncher.java:134)
at com.atlassian.scheduler.core.JobLauncher.launchAndBuildResponse(JobLauncher.java:106)
at com.atlassian.scheduler.core.JobLauncher.launch(JobLauncher.java:90)
at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.launchJob(CaesiumSchedulerService.java:435)
at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeLocalJob(CaesiumSchedulerService.java:402)
at com.atlassian.scheduler.caesium.impl.CaesiumSchedulerService.executeQueuedJob(CaesiumSchedulerService.java:380)
at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeJob(SchedulerQueueWorker.java:66)
at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.executeNextJob(SchedulerQueueWorker.java:60)
at com.atlassian.scheduler.caesium.impl.SchedulerQueueWorker.run(SchedulerQueueWorker.java:35)
at java.lang.Thread.run(Unknown Source)
Caused by: org.springframework.ldap.UncategorizedLdapException: Uncategorized exception occured during LDAP processing; nested exception is javax.naming.NamingException: LDAP response read timed out, timeout used:120000ms.; remaining name '/'
at org.springframework.ldap.support.LdapUtils.convertLdapException(LdapUtils.java:228)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:397)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:440)
at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$2.timedCall(SpringLdapTemplateWrapper.java:178)
at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$TimedCallable.call(SpringLdapTemplateWrapper.java:130)
at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper.invokeWithContextClassLoader(SpringLdapTemplateWrapper.java:100)
at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper.lookup(SpringLdapTemplateWrapper.java:168)
at com.atlassian.crowd.directory.MicrosoftActiveDirectory.fetchHighestCommittedUSN(MicrosoftActiveDirectory.java:688)
... 20 more
Caused by: javax.naming.NamingException: LDAP response read timed out, timeout used:120000ms.; remaining name '/'
at com.sun.jndi.ldap.Connection.readReply(Unknown Source)
at com.sun.jndi.ldap.LdapClient.getSearchReply(Unknown Source)
at com.sun.jndi.ldap.LdapClient.search(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.doSearch(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.searchAux(Unknown Source)
at com.sun.jndi.ldap.LdapCtx.c_search(Unknown Source)
at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(Unknown Source)
at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(Unknown Source)
at javax.naming.directory.InitialDirContext.search(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1508.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.springframework.ldap.transaction.compensating.manager.TransactionAwareDirContextInvocationHandler.invoke(TransactionAwareDirContextInvocationHandler.java:90)
at com.sun.proxy.$Proxy3196.search(Unknown Source)
at com.atlassian.crowd.directory.ldap.SpringLdapTemplateWrapper$2.lambda$timedCall$0(SpringLdapTemplateWrapper.java:175)
at org.springframework.ldap.core.LdapTemplate.search(LdapTemplate.java:363)
... 26 more
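As a rough aid for the log check described above, the following Python sketch scans log lines for the two messages shown in the stack trace: the failed cache refresh for a directory and the LDAP read timeout. The function name find_ldap_timeouts and both regular expressions are illustrative assumptions, not part of Confluence; adjust the patterns if your log wording differs.

```python
import re

# Matches the JNDI read-timeout message from the stack trace above.
TIMEOUT_RE = re.compile(r"LDAP response read timed out, timeout used:\s*(\d+)\s*ms")
# Matches the directory ID in the DbCachingDirectoryPoller error line.
DIRECTORY_RE = re.compile(r"refreshing the cache for directory \[\s*(\d+)\s*\]")


def find_ldap_timeouts(lines):
    """Return (directory_ids, timeouts_ms) found in an iterable of log lines."""
    directory_ids = set()
    timeouts_ms = []
    for line in lines:
        match = DIRECTORY_RE.search(line)
        if match:
            directory_ids.add(match.group(1))
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts_ms.append(int(match.group(1)))
    return directory_ids, timeouts_ms
```

To use it, open atlassian-confluence.log and pass the file object to find_ldap_timeouts; the returned directory IDs show which user directories failed to synchronize, and the timeout values show the read timeout that was in effect.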
Cause
There are several possible causes, and not all of them apply to every environment. Work through the checks in the Solution section to see which ones apply to your case and help resolve the issue.
Solution
Has "Follow Referrals" been enabled?
The most common cause of timeouts is that "Follow Referrals" is enabled. These timeouts generally have one of two root causes:
- The DNS for the domain is not valid, causing timeouts.
- The domain is large, particularly if it is partitioned. Disabling this option prevents Crowd from following referrals into other partitions, which should speed up synchronization (but may not give a complete result).
If "Follow Referrals" is enabled, disable it and then run the synchronization again.
Restricting the LDAP Scope
Using a narrower filter, try to limit the LDAP search to a single, smaller OU; the smaller the better.
Increase the connection timeout of the LDAP directory
- Go to Administration > Users > User Directories
- Edit the LDAP directory
- Under Advanced Settings, increase the value of 'Connection Timeout (seconds)'
- Save the directory
Can you bypass the load balancer?
Bypass the load balancer and connect Confluence directly to Active Directory. Doing so, even temporarily, either confirms the load balancer as the cause of the problem or removes it from consideration. Some customers have reported problems with certain load balancer products or configurations after upgrading, even though the same configuration worked without any problems in earlier versions of Confluence. Some customers have reported success using HAProxy as a load balancer in front of Active Directory.
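Before changing the directory configuration, it can help to compare basic reachability of the load-balanced address and a direct Active Directory address from the Confluence host. The Python sketch below only times a TCP connection; it does not send an LDAP query, so it cannot reproduce the read timeout itself. The function names and the example host names in the comments are hypothetical, and port 389 assumes plain LDAP (use 636 for LDAPS).

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, DNS failures and connect timeouts.
        return None


def compare_endpoints(hosts, port=389, timeout=5.0):
    """Map each host name to its connect time in seconds (None if unreachable)."""
    return {host: time_tcp_connect(host, port, timeout) for host in hosts}


# Example call with hypothetical host names:
# compare_endpoints(["ldap-lb.example.com", "dc01.example.com"])
```

If the direct address connects quickly while the load-balanced address is slow or unreachable, that points toward the load balancer; if both behave the same, the other checks in this section are more likely to be relevant.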
Please note that the setup or configuration of a load balancer is not covered by Atlassian Support Offerings.
Additionally, load-balanced LDAP is not supported by Atlassian Support; see CONFSERVER-23073.