JSM project automation rule doesn't send any webhook
Platform notice: Server and Data Center only. This article only applies to Atlassian products on the Server and Data Center platforms.
Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.
*Except Fisheye and Crucible
Summary
Jira Service Management Automation does not fire webhooks, even though the automation log shows that the rule was executed successfully.
Environment
Jira Service Management 3.x or higher
Diagnosis
Enable JSM debug logging
- Log in as a user with the Jira System Administrators global permission.
- Navigate to Administration > System > Logging and profiling
- Scroll up to the "Mark Logs" section
- In the "Optional Message" field, enter "replication starts here"
- Click on Mark
- Under the "Default Loggers" section, click on Configure logging level for another package
- In the "Package name" field, enable DEBUG for each of the following packages:
com.atlassian.servicedesk.plugins.automation
com.atlassian.webhooks
org.apache.http
Reproducing the webhook issue
With debug logging enabled, trigger the rule again (it must have a webhook as its then action). The $JIRA_HOME/log/atlassian-jira.log file is then updated with additional details.
This is the expected output when a rule fires a webhook (extracted from a working JSM system):
2021-06-15 13:33:40,755-0300 PsmqAsyncExecutors-job:thread-3 DEBUG anonymous 813x764x1 1ijnd4m 0:0:0:0:0:0:0:1 /rest/api/2/issue/10100/comment [c.a.s.p.a.i.e.e.r.async.psmq.PsmqExecutionJobRunnerImpl] Queue 'com.atlassian.servicedesk.plugins.automation.execution.async.job.queue.issue.ITSM-51' released
2021-06-15 13:33:42,367-0300 PsmqAsyncExecutors-then:thread-5 DEBUG xxxxx 813x764x1 1ijnd4m 0:0:0:0:0:0:0:1 /rest/api/2/issue/10100/comment [c.a.s.p.a.webhook.executor.WebhookExecutorImpl] JSD webhook request successful:
------------------------------------------------------------
url: https://webhook.site/59f0ea66-cb12-45a1-bcb3-c44b5d562528
headers: {}
response code: 200
exception: null
------------------------------------------------------------
2021-06-15 13:33:42,367-0300 PsmqAsyncExecutors-then:thread-5 INFO xxxxxx 813x764x1 1ijnd4m 0:0:0:0:0:0:0:1 /rest/api/2/issue/10100/comment [c.a.s.p.a.i.e.engine.asyncthen.AsyncThenJobProcessor] Execution of Asynchronous ThenAction com.atlassian.servicedesk.plugins.automation.webhook.rulethen.WebhookThenAction is successful
However, on an affected JSM instance, the log only shows that the job is released and the webhook action starts running, but never that the webhook request was sent:
2021-06-10 13:07:30,640+0000 PsmqAsyncExecutors-job:thread-4 DEBUG xxxxxxx 778x304x1 6ieu9b xxxxxxx /rest/api/2/issue/101241/comment [c.a.s.p.a.i.e.h.dao.querydsl.ExecutionHistoryDaoImpl] New rule execution record saved in the database with id: '345487'
2021-06-10 13:07:30,657+0000 PsmqAsyncExecutors-job:thread-4 DEBUG anonymous 778x304x1 6ieu9b xxxxxx /rest/api/2/issue/101241/comment [c.a.s.p.a.i.e.e.r.async.psmq.PsmqExecutionJobRunnerImpl] Queue 'com.atlassian.servicedesk.plugins.automation.execution.async.job.queue.issue.ATL-42' released
2021-06-10 13:07:30,665+0000 PsmqAsyncExecutors-then:thread-25 DEBUG xxxxxxx 778x304x1 6ieu9b xxxxxxxx /rest/api/2/issue/101241/comment [c.a.s.p.a.i.e.engine.asyncthen.AsyncThenJobProcessor] Running com.atlassian.servicedesk.plugins.automation.webhook.rulethen.WebhookThenAction as user xxxxxxxx
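The comparison between the two log excerpts can be automated. The following is a minimal sketch, not an Atlassian tool: it counts how many webhook then actions started ("Running ... WebhookThenAction") and how many reported success ("... WebhookThenAction is successful"). The function name summarize_webhook_actions is hypothetical, and the sketch assumes that a healthy instance also logs the "Running" line before the success line, which the working excerpt above does not show.

```python
import re

# Fully qualified class name of the webhook then action, as printed in the log.
ACTION = re.escape(
    "com.atlassian.servicedesk.plugins.automation.webhook.rulethen.WebhookThenAction"
)
# "\s+" tolerates line wraps introduced when log excerpts are copied around.
STARTED = re.compile(r"Running\s+" + ACTION)
SUCCEEDED = re.compile(r"ThenAction\s+" + ACTION + r"\s+is\s+successful")


def summarize_webhook_actions(log_text: str) -> dict:
    """Count webhook then actions that started and that reported success."""
    started = len(STARTED.findall(log_text))
    succeeded = len(SUCCEEDED.findall(log_text))
    return {
        "started": started,
        "succeeded": succeeded,
        # Assumption: every success is preceded by a "Running" line.
        "unfinished": max(started - succeeded, 0),
    }
```

To use it, read the part of atlassian-jira.log that follows the "replication starts here" mark and pass the text to summarize_webhook_actions. A non-zero "unfinished" count is only a hint that matches the symptom described here; the database check below is still needed to confirm it.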
Database queries
After confirming the webhook issue in the logs, check the database for queue messages that have no value in the CLAIMANT column.
1. Get the ID of the queue from the AO_319474_QUEUE table:
SELECT * FROM "AO_319474_QUEUE" WHERE "NAME" = 'com.atlassian.servicedesk.plugins.automation.execution.asyncthen.queue';
2. Confirm whether there are entries in the AO_319474_MESSAGE table with no value in the CLAIMANT column, like the sample below:
select "CLAIMANT", "CLAIMANT_TIME", "CLAIM_COUNT", "CREATED_TIME", "EXPIRY_TIME" from "AO_319474_MESSAGE" where "QUEUE_ID" = <ID from step #1>;
 CLAIMANT | CLAIMANT_TIME | CLAIM_COUNT | CREATED_TIME  | EXPIRY_TIME
----------+---------------+-------------+---------------+-------------
          |               |           0 | 1610387949718 |
          |               |           0 | 1610384203181 |
          |               |           0 | 1610384223160 |
          |               |           0 | 1613583799197 |
          |               |           0 | 1616423617227 |
          |               |           0 | 1615930960907 |
          |               |           0 | 1615930960915 |
          |               |           0 | 1610730914633 |
          |               |           0 | 1610382733300 |
          |               |           0 | 1610502246041 |
          |               |           0 | 1610493711317 |
(20950 rows)
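The CREATED_TIME values in the sample look like Unix timestamps in milliseconds. That is an assumption based on their magnitude, not something the article states. Under that assumption, a short sketch like the one below converts them to readable dates, which helps to judge how long unclaimed messages have been sitting in the table. The helper names created_at and age_in_days are hypothetical.

```python
from datetime import datetime, timezone


def created_at(created_time_ms: int) -> datetime:
    """Interpret CREATED_TIME as epoch milliseconds (assumption) in UTC."""
    return datetime.fromtimestamp(created_time_ms / 1000, tz=timezone.utc)


def age_in_days(created_time_ms: int, now: datetime) -> float:
    """Age of a message in days relative to a timezone-aware 'now'."""
    return (now - created_at(created_time_ms)).total_seconds() / 86400
```

For example, created_at(1610387949718) falls on 2021-01-11 (UTC), several months before the June 2021 log excerpts, which is consistent with messages that were never consumed.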
Cause
The traces show that Automation Rule actions are executed asynchronously, and that Jira uses PSMQ to manage and orchestrate that execution.
PSMQ mainly uses two tables to manage its queues: AO_319474_QUEUE and AO_319474_MESSAGE. This issue occurs when the AO_319474_MESSAGE table is never flushed.
AO_319474_MESSAGE
MESSAGEs are meant to be added when needed and deleted once they have been consumed.
If JSM gets into a deadlock situation, it stops clearing messages from the queue and they continue to build up. To clear the queue, the admin restarts the instance.
The message payload itself is held in the node's memory, so once the node is restarted, the rows left in the database are no longer referenced and are never read or removed again.
Rows in the message table can be deleted directly in the database under two conditions:
- The claimant is empty
- The expiry time has passed
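The two conditions can be written as a small predicate. This is an illustrative sketch only: it assumes that either condition on its own is sufficient (the sample rows above have an empty claimant and no expiry time at all), that an empty claimant may be stored as NULL or as an empty string, and that EXPIRY_TIME is in epoch milliseconds. The function name is_deletable is hypothetical.

```python
from typing import Optional


def is_deletable(
    claimant: Optional[str], expiry_time_ms: Optional[int], now_ms: int
) -> bool:
    """Return True if a message row meets at least one deletion condition."""
    # Condition 1: no node has claimed the message.
    unclaimed = claimant is None or claimant.strip() == ""
    # Condition 2: an expiry time is set and lies in the past.
    expired = expiry_time_ms is not None and expiry_time_ms < now_ms
    return unclaimed or expired
```

A row such as the ones in the sample output (no claimant, no expiry time) evaluates to True, while a row that is currently claimed and not yet expired evaluates to False.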
AO_319474_QUEUE
QUEUEs are created and removed dynamically by the application. If an event occurs that requires a PSMQ queue, JSD checks whether one exists and adds one if it does not. There is one queue per issue, and each QUEUE can have multiple MESSAGEs.
Solution
Before making any changes to your database, make sure to create a backup of it.
- Stop your Jira instance.
- After confirming that there are many entries and that all of them have no value in the CLAIMANT column, delete these entries:
DELETE FROM "AO_319474_MESSAGE" WHERE "QUEUE_ID" = <ID from Diagnosis step #1>;
- Then update the AO_319474_QUEUE table, setting MESSAGE_COUNT to 0:
UPDATE "AO_319474_QUEUE" set "MESSAGE_COUNT" = 0 WHERE "ID" = <ID from Diagnosis step #1>;
- Start the Jira instance.
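To rehearse the order of the database steps without touching a real Jira database, the sketch below replays them against an in-memory SQLite stand-in. Everything about the stand-in is an assumption made for illustration: the tables contain only the columns mentioned in this article, and the helpers build_demo_db and clear_queue are hypothetical. It is not a replacement for running the statements above against your actual database with Jira stopped and a backup in place.

```python
import sqlite3

QUEUE_NAME = "com.atlassian.servicedesk.plugins.automation.execution.asyncthen.queue"


def build_demo_db(message_count: int) -> sqlite3.Connection:
    """Create an in-memory stand-in with only the columns the article mentions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE "AO_319474_QUEUE" '
        '("ID" INTEGER PRIMARY KEY, "NAME" TEXT, "MESSAGE_COUNT" INTEGER)'
    )
    conn.execute(
        'CREATE TABLE "AO_319474_MESSAGE" '
        '("ID" INTEGER PRIMARY KEY, "QUEUE_ID" INTEGER, "CLAIMANT" TEXT)'
    )
    conn.execute(
        'INSERT INTO "AO_319474_QUEUE" VALUES (1, ?, ?)', (QUEUE_NAME, message_count)
    )
    conn.executemany(
        'INSERT INTO "AO_319474_MESSAGE" ("QUEUE_ID", "CLAIMANT") VALUES (?, NULL)',
        [(1,)] * message_count,
    )
    conn.commit()
    return conn


def clear_queue(conn: sqlite3.Connection, queue_name: str) -> int:
    """Delete a queue's messages and reset its counter; return rows deleted."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            'SELECT "ID" FROM "AO_319474_QUEUE" WHERE "NAME" = ?', (queue_name,)
        ).fetchone()
        if row is None:
            raise LookupError(f"queue not found: {queue_name}")
        queue_id = row[0]
        # Guard: proceed only if no message of this queue has a claimant.
        claimed = conn.execute(
            """SELECT COUNT(*) FROM "AO_319474_MESSAGE"
               WHERE "QUEUE_ID" = ? AND "CLAIMANT" IS NOT NULL AND "CLAIMANT" <> ''""",
            (queue_id,),
        ).fetchone()[0]
        if claimed:
            raise ValueError(f"{claimed} message(s) still have a claimant")
        deleted = conn.execute(
            'DELETE FROM "AO_319474_MESSAGE" WHERE "QUEUE_ID" = ?', (queue_id,)
        ).rowcount
        conn.execute(
            'UPDATE "AO_319474_QUEUE" SET "MESSAGE_COUNT" = 0 WHERE "ID" = ?',
            (queue_id,),
        )
    return deleted
```

Running the delete and the counter reset in one transaction is a design choice of this sketch, so that the stand-in never ends up with an emptied message table and a stale MESSAGE_COUNT. Calling clear_queue(build_demo_db(5), QUEUE_NAME) removes the five demo rows and resets the counter.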
A job runs on the system every day to validate the AO_319474_QUEUE table, but it does not check the AO_319474_MESSAGE table. There is a feature request to change that: JSDSERVER-7162.