Incoming e-mails processing issue
症状
- JIRA intermittently processes emails from the inbox. For example :
The emails that are sent to the inbox will stay on the inbox the entire day and only JIRA successfully processes e-mails during the night
- This issue occurred on Microsoft Outlook ( Exchange Inbox ) & all JIRA mail handlers are affected.
The following appears in the incoming mail log:
2015-02-04 09:32:50,153 DEBUG [Office 365 - JIRA] QuartzScheduler_Worker-3 ServiceRunner IT Tasks DEBUG POP3: connecting to host "outlook.office365.com", port 995, isSSL true
2015-02-04 21:00:26,605 WARN [Office 365 - JIRA] QuartzScheduler_Worker-0 ServiceRunner IT Purchases IT Purchases[10100]: javax.mail.MessagingException: Connect failed while connecting to host 'outlook.office365.com' as user 'xxxx@xxxxxx.com' via protocol 'pop3s': javax.mail.MessagingException: Connect failed;
nested exception is:
java.net.UnknownHostException: outlook.office365.com
javax.mail.MessagingException: Connect failed;
nested exception is:
java.net.UnknownHostException: outlook.office365.com
at com.sun.mail.pop3.POP3Store.protocolConnect(POP3Store.java:210)
at javax.mail.Service.connect(Service.java:295)
at com.atlassian.jira.service.services.mail.MailFetcherService$MessageProviderImpl.getAndProcessMail(MailFetcherService.java:221)
at com.atlassian.jira.service.services.mail.MailFetcherService.runImpl(MailFetcherService.java:401)
at com.atlassian.jira.service.services.file.AbstractMessageHandlingService.run(AbstractMessageHandlingService.java:257)
at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:61)
at com.atlassian.jira.service.ServiceRunner.execute(ServiceRunner.java:48)
at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: java.net.UnknownHostException: outlook.office365.com
Notice the jump in time from 09:32 to 21:00
The following appears in the incoming mail log
2015-02-04 09:29:03,955 QuartzScheduler_Worker-2 INFO ServiceRunner Backup Service [jira.bc.dataimport.DefaultExportService] Data export completed in 73833ms. Wrote 627242 entities to export in memory.
2015-02-04 09:29:03,956 QuartzScheduler_Worker-2 INFO ServiceRunner Backup Service [jira.bc.dataimport.DefaultExportService] Attempting to save the Active Objects Backup
2015-02-04 09:29:22,848 QuartzScheduler_Worker-2 INFO ServiceRunner Backup Service [jira.bc.dataimport.DefaultExportService] Finished saving the Active Objects Backup
2015-02-04 10:01:00,919 http-bio-443-exec-9018 WARN anonymous 601x768726x1 - 192.168.0.214 /rest/dev-status/1.0/webhooks/update-summary [oauth.serviceprovider.internal.AuthenticatorImpl] No executing user assigned for 2LO requests
2015-02-04 10:21:17,569 http-bio-443-exec-9043 INFO anonymous 621x769213x2 1qx8nzy 192.168.59.111 /s/en_USaxj3u2-1988229788/6258/7/2.5.2.25201808/_/download/resources/com.thed.zephyr.je:zephyr-je-schedule/content/css/images/wait.gif [atlassian.plugin.webresource.SingleDownloadableResourceBuilder] Unable to find resource for plugin: com.thed.zephyr.je:zephyr-je-schedule and path: content/css/images/wait.gif
We noticed that the backup service had kicked off a few minutes earlier, but there is nothing else in the logs really telling us what is happening. What we can see is that there is a huge time frame where no services appear to be running. From the time when the backup service started at 09:29, and the next time we see a service start is not until when the incoming mail starts processing:
2015-02-04 20:31:30,015 http-bio-443-exec-9386 INFO xxxxxxx.xx-xxxx 1231x783119x2 f63dh3 192.168.0.47 /rest/gadget/1.0/issueTable/filter [fields.layout.field.AbstractFieldLayoutManager] Field layout contains non-orderable field with id 'customfield_10403'.
2015-02-04 21:00:26,789 QuartzScheduler_Worker-0 INFO ServiceRunner [atlassian.crowd.directory.DbCachingRemoteDirectory] synchronisation for directory [ 10000 ] starting
We then needed to generate thread dumps during the period when the incoming mail isn't processing , as outlines on Troubleshooting JIRA Services
Looking at the thread dumps, we can actually see what the quartzworker threads are up to:
"QuartzScheduler_Worker-3" prio=10 tid=0x00007f046c92a800 nid=0xb1a in Object.wait() [0x00007f04a816f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at java.net.InetAddress.checkLookupTable(Unknown Source)
- locked <0x00000000d02972b8> (a java.util.HashMap)
at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
at java.net.InetAddress.getAllByName0(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getByName(Unknown Source)
at java.net.InetSocketAddress.<init>(Unknown Source)
at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:319)
at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:233)
at com.sun.mail.pop3.Protocol.<init>(Protocol.java:111)
at com.sun.mail.pop3.POP3Store.getPort(POP3Store.java:261)
- locked <0x00000000e9425b88> (a com.sun.mail.pop3.POP3SSLStore)
at com.sun.mail.pop3.POP3Store.protocolConnect(POP3Store.java:206)
- locked <0x00000000e9425b88> (a com.sun.mail.pop3.POP3SSLStore)
at javax.mail.Service.connect(Service.java:295)
- locked <0x00000000e9425b88> (a com.sun.mail.pop3.POP3SSLStore)
at com.atlassian.jira.service.services.mail.MailFetcherService$MessageProviderImpl.getAndProcessMail(MailFetcherService.java:221)
at com.atlassian.jira.service.services.mail.MailFetcherService.runImpl(MailFetcherService.java:401)
at com.atlassian.jira.service.services.file.AbstractMessageHandlingService.run(AbstractMessageHandlingService.java:257)
at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:61)
at com.atlassian.jira.service.ServiceRunner.execute(ServiceRunner.java:48)
at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
"QuartzScheduler_Worker-2" prio=10 tid=0x00007f046c929000 nid=0xb19 in Object.wait() [0x00007f04a8270000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at java.net.InetAddress.checkLookupTable(Unknown Source)
- locked <0x00000000d02972b8> (a java.util.HashMap)
at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
at java.net.InetAddress.getAllByName0(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getByName(Unknown Source)
at java.net.InetSocketAddress.<init>(Unknown Source)
at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:319)
at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:233)
at com.sun.mail.pop3.Protocol.<init>(Protocol.java:111)
at com.sun.mail.pop3.POP3Store.getPort(POP3Store.java:261)
- locked <0x00000000e942e3d8> (a com.sun.mail.pop3.POP3SSLStore)
at com.sun.mail.pop3.POP3Store.protocolConnect(POP3Store.java:206)
- locked <0x00000000e942e3d8> (a com.sun.mail.pop3.POP3SSLStore)
at javax.mail.Service.connect(Service.java:295)
- locked <0x00000000e942e3d8> (a com.sun.mail.pop3.POP3SSLStore)
at com.atlassian.jira.service.services.mail.MailFetcherService$MessageProviderImpl.getAndProcessMail(MailFetcherService.java:221)
at com.atlassian.jira.service.services.mail.MailFetcherService.runImpl(MailFetcherService.java:401)
at com.atlassian.jira.service.services.file.AbstractMessageHandlingService.run(AbstractMessageHandlingService.java:257)
at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:61)
at com.atlassian.jira.service.ServiceRunner.execute(ServiceRunner.java:48)
at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
"QuartzScheduler_Worker-1" prio=10 tid=0x00007f046cef4800 nid=0xb18 runnable [0x00007f04a8371000]
java.lang.Thread.State: RUNNABLE
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
at java.net.InetAddress.getAllByName0(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getByName(Unknown Source)
at java.net.InetSocketAddress.<init>(Unknown Source)
at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:319)
at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:233)
at com.sun.mail.pop3.Protocol.<init>(Protocol.java:111)
at com.sun.mail.pop3.POP3Store.getPort(POP3Store.java:261)
- locked <0x00000000e918eb28> (a com.sun.mail.pop3.POP3SSLStore)
at com.sun.mail.pop3.POP3Store.protocolConnect(POP3Store.java:206)
- locked <0x00000000e918eb28> (a com.sun.mail.pop3.POP3SSLStore)
at javax.mail.Service.connect(Service.java:295)
- locked <0x00000000e918eb28> (a com.sun.mail.pop3.POP3SSLStore)
at com.atlassian.jira.service.services.mail.MailFetcherService$MessageProviderImpl.getAndProcessMail(MailFetcherService.java:221)
at com.atlassian.jira.service.services.mail.MailFetcherService.runImpl(MailFetcherService.java:401)
at com.atlassian.jira.service.services.file.AbstractMessageHandlingService.run(AbstractMessageHandlingService.java:257)
at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:61)
at com.atlassian.jira.service.ServiceRunner.execute(ServiceRunner.java:48)
at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
"QuartzScheduler_Worker-0" prio=10 tid=0x00007f046cef3000 nid=0xb17 in Object.wait() [0x00007f04a8472000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:503)
at java.net.InetAddress.checkLookupTable(Unknown Source)
- locked <0x00000000d02972b8> (a java.util.HashMap)
at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
at java.net.InetAddress.getAllByName0(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getAllByName(Unknown Source)
at java.net.InetAddress.getByName(Unknown Source)
at java.net.InetSocketAddress.<init>(Unknown Source)
at com.sun.mail.util.SocketFetcher.createSocket(SocketFetcher.java:319)
at com.sun.mail.util.SocketFetcher.getSocket(SocketFetcher.java:233)
at com.sun.mail.pop3.Protocol.<init>(Protocol.java:111)
at com.sun.mail.pop3.POP3Store.getPort(POP3Store.java:261)
- locked <0x00000000e925e320> (a com.sun.mail.pop3.POP3SSLStore)
at com.sun.mail.pop3.POP3Store.protocolConnect(POP3Store.java:206)
- locked <0x00000000e925e320> (a com.sun.mail.pop3.POP3SSLStore)
at javax.mail.Service.connect(Service.java:295)
- locked <0x00000000e925e320> (a com.sun.mail.pop3.POP3SSLStore)
at com.atlassian.jira.service.services.mail.MailFetcherService$MessageProviderImpl.getAndProcessMail(MailFetcherService.java:221)
at com.atlassian.jira.service.services.mail.MailFetcherService.runImpl(MailFetcherService.java:401)
at com.atlassian.jira.service.services.file.AbstractMessageHandlingService.run(AbstractMessageHandlingService.java:257)
at com.atlassian.jira.service.JiraServiceContainerImpl.run(JiraServiceContainerImpl.java:61)
at com.atlassian.jira.service.ServiceRunner.execute(ServiceRunner.java:48)
at org.quartz.core.JobRunShell.run(JobRunShell.java:195)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
- We noticed 3 out of 4 of the workers are just in waiting state; waiting on InetAddress.checkLookupTable
The one quartz worker thread that is running, appears to be doing some work on a look up table as well - hereInet6AddressImpl.lookupAllHostAddr
After doing some Googling around on these threads, we came across similar symptoms mentioned here :
原因
We are suspecting that the lock which the running thread has on the look up table is holding up other threads
Troubleshooting Steps
- Enable GC Logging referring to this page : Using Garbage Collection Logs to Analyze JIRA Performance /Troubleshooting JIRA Services
- Jira を再起動します。
- Enable the outgoing mail logs & debugging at the Logging and Profiling page in your JIRA instance
- Next time services stop running ( when the problem starts occurring ) grab a series of thread dumps to be analyzed.
ソリューション
- Tell JIRA to specifically use the IPv4 stack only.
That should avoid the above call toInet6AddressImpl.lookupAllHostAddr entirely.
Add the following JVM argument and restart JIRA to pick it up:
-Djava.net.preferIPv4Stack=true
See Setting Properties and Options on Startup on how to add that argument.