Jira Notification Delays Due to Caesium Thread Issue
Platform notice: Server and Data Center only. This article applies only to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Symptoms
All notification emails from Jira (batched notifications, non-batched notifications, customer notifications) are piling up in the queue. The emails are eventually sent, but at random times.
Diagnosis
Looking at the file application.xml, we can see that at least 4 Jira incoming mail handlers are listed in the <services> section:
    <service>
        <name>Jira mail handler 1</name>
        <delay>60ms</delay>
        <last-run>12.01.2021 11:21</last-run>
    </service>
    <service>
        <name>Jira mail handler 2</name>
        <delay>60ms</delay>
        <last-run>12.01.2021 11:21</last-run>
    </service>
    <service>
        <name>Jira mail handler 3</name>
        <delay>60ms</delay>
        <last-run>12.01.2021 11:21</last-run>
    </service>
    <service>
        <name>Jira mail handler 4</name>
        <delay>60ms</delay>
        <last-run>12.01.2021 11:21</last-run>
    </service>
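To review the configured services without reading the XML by hand, the section can be parsed with a short script. This is only a sketch: it assumes the input is well-formed XML that uses the element names shown above (service, name, delay), and the function name list_services is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_services(services_xml):
    """Return (name, delay) pairs for every <service> element in the XML text."""
    root = ET.fromstring(services_xml)
    return [
        (svc.findtext("name", default=""), svc.findtext("delay", default=""))
        for svc in root.iter("service")
    ]


# Example call (assumes the <services> section was saved to services.xml):
# with open("services.xml", encoding="utf-8") as f:
#     for name, delay in list_services(f.read()):
#         print(name, delay)
```

Several entries whose names point to incoming mail handlers are a hint that the rest of this diagnosis may apply.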
If DEBUG logging has been enabled for the package com.atlassian.jira.service in ⚙ > System > Logging and Profiling, the file atlassian-jira.log should show that:
- The Mail Queue Service (responsible for emptying the mail queue) does not run every minute as it should. Instead, it runs at irregular intervals:
    grep 'Mail Queue Service' atlassian-jira.log
    2020-07-01 12:00:38,278+0000 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service
    2020-07-01 12:16:23,597+0000 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service
    2020-07-01 12:35:49,036+0000 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service
    2020-07-01 12:48:40,567+0000 Caesium-1-1 DEBUG anonymous Mail Queue Service [c.a.j.s.services.mail.MailQueueService] Attempting to run mail queue service
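The gaps between runs can also be measured rather than read off by eye. The sketch below assumes each log line starts with a 23-character timestamp in the format shown above; the function name mail_queue_gaps is hypothetical.

```python
from datetime import datetime


def mail_queue_gaps(lines):
    """Return the gaps, in seconds, between consecutive Mail Queue Service runs."""
    stamps = []
    for line in lines:
        if "Attempting to run mail queue service" not in line:
            continue
        # The first 23 characters hold the timestamp, e.g. "2020-07-01 12:00:38,278".
        stamps.append(datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Example call:
# with open("atlassian-jira.log", encoding="utf-8", errors="replace") as f:
#     print(mail_queue_gaps(f))
```

On a healthy instance the gaps should stay close to 60 seconds. For the excerpt above they range from roughly 13 to 19 minutes.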
- The Jira mail handlers take a long time to complete (10 to 15 minutes per run in the example below), frequently exhausting all the Caesium threads available to run scheduled services such as the mail handlers, the backup service and the Mail Queue Service:
    grep 'com.atlassian.jira.service.services.mail.MailFetcherService' atlassian-jira.log
    2020-07-29 09:41:00,008 Caesium-1-3 DEBUG ServiceRunner Jira Mail Handler 2 [c.a.jira.service.ServiceRunner] Running Service [Container: com.atlassian.jira.service.services.mail.MailFetcherService delay [60000ms]]
    2020-07-29 09:56:00,379 Caesium-1-3 DEBUG anonymous Jira Mail Handler 2 [c.a.jira.service.ServiceRunner] Finished Running Service [Container: com.atlassian.jira.service.services.mail.MailFetcherService delay [60000ms]]
    2020-07-29 09:57:00,008 Caesium-1-2 DEBUG ServiceRunner Jira Mail Handler 2 [c.a.jira.service.ServiceRunner] Running Service [Container: com.atlassian.jira.service.services.mail.MailFetcherService delay [60000ms]]
    2020-07-29 10:07:00,426 Caesium-1-2 DEBUG anonymous Jira Mail Handler 2 [c.a.jira.service.ServiceRunner] Finished Running Service [Container: com.atlassian.jira.service.services.mail.MailFetcherService delay [60000ms]]
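To see how long each mail handler run takes, the "Running Service" and "Finished Running Service" lines can be paired per handler. This is a sketch that assumes the log line layout shown above (timestamp, thread, level, user, service name, logger); the regular expression and the function name handler_durations are illustrative rather than an official format definition.

```python
import re
from datetime import datetime

# Matches the ServiceRunner DEBUG lines emitted for MailFetcherService (assumed layout).
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\S* \S+ DEBUG \S+ (.+?) "
    r"\[c\.a\.jira\.service\.ServiceRunner\] (Finished Running|Running) Service "
    r"\[Container: com\.atlassian\.jira\.service\.services\.mail\.MailFetcherService"
)


def handler_durations(lines):
    """Return (handler name, seconds) for each completed mail handler run."""
    started = {}    # handler name -> start time of the run in progress
    durations = []
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        name, event = match.group(2), match.group(3)
        if event == "Running":
            started[name] = when
        elif name in started:
            durations.append((name, (when - started.pop(name)).total_seconds()))
    return durations


# Example call:
# with open("atlassian-jira.log", encoding="utf-8", errors="replace") as f:
#     for name, seconds in handler_durations(f):
#         print(f"{name}: {seconds / 60:.1f} min")
```

Runs that last many minutes against a 60000 ms delay indicate that a handler is holding a Caesium thread for most of the time.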
If the logging and debugging options have been enabled under Incoming Mails in ⚙ > System > Logging and Profiling, the file atlassian-jira-incoming-mail.log should show a huge number of emails being rejected with the error "does not match catch mail list" or "do not match the catch email". In fact, the same emails keep being processed and rejected over and over. The grep commands below should return a high number of matches:
    grep -c 'does not match catch mail list' atlassian-jira-incoming-mail.log
    grep -c 'do not match the catch email' atlassian-jira-incoming-mail.log
Checking how the Jira mail handlers are configured in ⚙ > System > Incoming Mail, you may see that most mail handlers are configured with a "Catch Email Address".
Cause
When a mail handler is configured with a catch email address, any incoming email that does not match this address is rejected, but it is neither removed from the incoming mailbox nor marked as READ. As a result, the more non-matching emails a mailbox receives, the more emails each mail handler has to read on every run, because every handler re-reads the same emails over and over for as long as they remain UNREAD. The mail handlers therefore take longer and longer to complete as the mailboxes keep growing. This issue is tracked in the public ticket below:
JRASERVER-33345
Since all the Jira scheduled services (Mail Queue Service, mail handlers, and so on) share the same pool of Caesium threads, long-running mail handlers leave the Mail Queue Service little chance of obtaining a thread. As a result, this service may end up running very rarely and at irregular times.
Workaround
Short-term workaround
- Go to the mailboxes from which the Jira Mail Handlers are pulling incoming mail
- Either mark all the UNREAD emails as READ or move them to a separate folder, so that the mail handlers no longer process these emails
Long-term workaround
Add rules to all the mailboxes so that any email whose recipients do not match the catch email address configured in the mail handler is automatically either marked as READ or moved to a different folder.
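Where server-side mailbox rules are not available, a scheduled script could approximate the same rule over IMAP. The sketch below is an alternative technique, not an Atlassian-provided tool: the host, credentials, folder name and function names are placeholders, and it assumes that only the To and Cc headers need to be compared with the catch address.

```python
import email
import imaplib
from email.utils import getaddresses


def matches_catch_address(header_bytes, catch_address):
    """Return True if any To/Cc recipient equals the catch address (case-insensitive)."""
    msg = email.message_from_bytes(header_bytes)
    recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
    wanted = catch_address.lower()
    return any(addr.lower() == wanted for _, addr in getaddresses(recipients))


def mark_non_matching_as_read(host, user, password, catch_address, folder="INBOX"):
    """Mark UNREAD emails that do not match the catch address as READ."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(folder)
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            # BODY.PEEK fetches the headers without setting the \Seen flag.
            _, parts = conn.fetch(num, "(BODY.PEEK[HEADER.FIELDS (TO CC)])")
            if not matches_catch_address(parts[0][1], catch_address):
                conn.store(num, "+FLAGS", "\\Seen")
    finally:
        conn.logout()


# Example call (placeholder values):
# mark_non_matching_as_read("imap.example.com", "jira", "secret", "jira@example.com")
```

Emails that do match the catch address are left UNREAD, so the mail handler still picks them up on its next run.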