Jira server process is unexpectedly terminated in Linux (Out Of Memory Killer)

Symptoms

The Jira process is terminated unexpectedly on Linux by the Out Of Memory (OOM) killer, and Jira's logs show no clean shutdown.

For example, the atlassian-jira.log file shows that no clean shutdown took place before the next startup:

2021-06-22 13:53:14,025+0000 plugin-transaction-0 INFO [c.a.jira.plugin.PluginTransactionListener] [plugin-transaction] numberStartEvents:1011, numberEndEvents:1011, numberSendEvents:545, numberEventsInTransactions:18710, numberOfPluginEnableEvents:313
#### No shutdown process was started as noted by the lack of localhost-startStop-1 logging. 

2021-06-22 14:08:38,286+0000 localhost-startStop-1 INFO [c.a.jira.startup.JiraHomeStartupCheck] The jira.home directory '/var/atlassian/application-data/jira' is validated and locked for exclusive use by this instance.
2021-06-22 14:08:38,337+0000 JIRA-Bootstrap INFO [c.a.jira.startup.JiraStartupLogger] 
 
 ****************
 JIRA starting...
 ****************
 
2021-06-22 14:08:38,521+0000 JIRA-Bootstrap INFO [c.a.jira.startup.JiraStartupLogger]

To confirm that this is happening, search the kernel log for OOM killer activity with the following command:

# dmesg -T | grep -Ei -B 1 'killed process'

Example output:
[Tue Jun 22 13:42:49 2021] Out of memory: Kill process 90619 (java) score 440 or sacrifice child
[Tue Jun 22 13:42:49 2021] Killed process 95510 (java), UID 752, total-vm:12301500kB, anon-rss:1873032kB, file-rss:0kB, shmem-rss:0kB
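
As a rough sketch, the same check can be wrapped in a small shell helper that prints only the PID and process name of each victim. The function name oom_victims is hypothetical (it is not part of Jira or of any standard tool), and it assumes the "Killed process <pid> (<name>)" wording shown in the example output.

```shell
# Hypothetical helper: read kernel log text on stdin and print
# "<pid> <name>" for every process the OOM killer terminated.
# Assumes the "Killed process <pid> (<name>)" message format.
oom_victims() {
  sed -n 's/.*Killed process \([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}

# Usage (run as root; not executed here):
#   dmesg -T | oom_victims
#   journalctl -k --no-pager | oom_victims
```

Reading from stdin keeps the helper usable with either dmesg or the systemd kernel journal.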

The following appears in /var/log/messages, /var/log/syslog, or the systemd kernel journal:

Aug 12 19:12:19 ussclpdapjra002 kernel: java invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
Aug 12 19:12:19 ussclpdapjra002 kernel:
Aug 12 19:12:19 ussclpdapjra002 kernel: Call Trace:
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800c82e8>] out_of_memory+0x8e/0x2f3
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8000f506>] __alloc_pages+0x27f/0x308
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff80017949>] cache_grow+0x133/0x3c1
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8005c6f9>] cache_alloc_refill+0x136/0x186
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800dc9e3>] kmem_cache_zalloc+0x6f/0x94
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800bf56f>] taskstats_exit_alloc+0x32/0x89
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff80015693>] do_exit+0x186/0x911
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800496a1>] cpuset_exit+0x0/0x88
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8002b29e>] get_signal_to_deliver+0x465/0x494
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8005b295>] do_notify_resume+0x9c/0x7af
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8008e16d>] default_wake_function+0x0/0xe
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff800a52a6>] sys_futex+0x10b/0x12b
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8005e19f>] sysret_signal+0x1c/0x27
Aug 12 19:12:19 ussclpdapjra002 kernel:  [<ffffffff8005e427>] ptregscall_common+0x67/0xac

Additionally, /var/log/messages, /var/log/syslog, or the systemd kernel journal may include the following hung-task warning:

Aug 12 19:11:52 ussclpdapjra002 kernel: INFO: task java:5491 blocked for more than 120 seconds.
Aug 12 19:11:52 ussclpdapjra002 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 12 19:11:52 ussclpdapjra002 kernel: java          D 0000000000000014     0  5491      1          5492  5490 (NOTLB)
Aug 12 19:11:52 ussclpdapjra002 kernel:  ffff810722859e18 0000000000000082 0000000000000000 0000000000000001
Aug 12 19:11:52 ussclpdapjra002 kernel:  ffff810722859e88 000000000000000a ffff81083673a100 ffff8107024d7080
Aug 12 19:11:52 ussclpdapjra002 kernel:  0000d1276b8c3dc6 0000000000003296 ffff81083673a2e8 0000000400000000
Aug 12 19:11:52 ussclpdapjra002 kernel: Call Trace:
Aug 12 19:11:52 ussclpdapjra002 kernel:  [<ffffffff80016dd4>] generic_file_aio_read+0x34/0x39
Aug 12 19:11:52 ussclpdapjra002 kernel:  [<ffffffff800656ac>] __down_read+0x7a/0x92
Aug 12 19:11:52 ussclpdapjra002 kernel:  [<ffffffff80067ad0>] do_page_fault+0x446/0x874
Aug 12 19:11:52 ussclpdapjra002 kernel:  [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
Aug 12 19:11:52 ussclpdapjra002 kernel:  [<ffffffff8000c62d>] _atomic_dec_and_lock+0x39/0x57
Aug 12 19:12:08 ussclpdapjra002 kernel:  [<ffffffff8000d3fa>] dput+0x3d/0x114
Aug 12 19:12:10 ussclpdapjra002 kernel:  [<ffffffff8005ede9>] error_exit+0x0/0x84
Aug 12 19:12:11 ussclpdapjra002 kernel:

Cause

When the system runs out of memory, the Linux kernel's OOM killer automatically starts terminating processes, favoring those that consume the most memory. In this case, the Jira JVM was the largest consumer, so it was killed.
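
The kernel exposes the score it uses for this choice in /proc/<pid>/oom_score, where a higher score makes a process a more likely victim. The sketch below lists the highest-scoring processes. The function name top_oom_scores and its optional second argument (an alternative proc root, useful for testing) are hypothetical.

```shell
# Hypothetical helper: print "<oom_score> <pid> <command>" for the
# processes the OOM killer is most likely to pick, highest score first.
#   $1 = number of rows to show (default 5)
#   $2 = proc filesystem root (default /proc)
top_oom_scores() {
  count=${1:-5}
  proc_root=${2:-/proc}
  for dir in "$proc_root"/[0-9]*; do
    [ -r "$dir/oom_score" ] || continue
    # A process may exit between the glob and the read; skip it if so.
    score=$(cat "$dir/oom_score" 2>/dev/null) || continue
    name=$(cat "$dir/comm" 2>/dev/null)
    echo "$score ${dir##*/} $name"
  done | sort -rn | head -n "$count"
}

# Usage (not executed here):
#   top_oom_scores 10
```

On a host where the Jira JVM holds most of the memory, it would be expected to appear at or near the top of this list.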

This is usually caused by one or more of the following:

  1. The memory configured for Jira's JVM with the -Xmx parameter is not available on the machine.
  2. There is not enough physical memory allocated to the Jira node to run Jira and other processes.
  3. Jira's JVM is configured with a higher -Xmx value than the size of the instance requires.
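
A quick way to test the first two causes is to compare the configured maximum heap with the machine's physical memory. The following is a minimal sketch under stated assumptions: the function names xmx_to_mib, mem_total_mib, and check_heap_fits are hypothetical, and the -Xmx value is assumed to use a k, m, or g suffix (or plain bytes).

```shell
# Convert a JVM heap flag such as -Xmx4g or -Xmx2048m to MiB.
xmx_to_mib() {
  value=${1#-Xmx}
  number=${value%[kKmMgG]}
  case "$value" in
    *[gG]) echo $((number * 1024)) ;;
    *[mM]) echo "$number" ;;
    *[kK]) echo $((number / 1024)) ;;
    *)     echo $((number / 1024 / 1024)) ;;  # no suffix: bytes
  esac
}

# Total physical memory in MiB (MemTotal in /proc/meminfo is in kB).
mem_total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Warn when the maximum heap alone is as large as physical memory.
#   $1 = heap flag, $2 = total memory in MiB (default: this machine)
check_heap_fits() {
  heap_mib=$(xmx_to_mib "$1")
  total_mib=${2:-$(mem_total_mib)}
  if [ "$heap_mib" -ge "$total_mib" ]; then
    echo "WARNING: heap ${heap_mib} MiB >= physical memory ${total_mib} MiB"
    return 1
  fi
  echo "OK: heap ${heap_mib} MiB < physical memory ${total_mib} MiB"
}

# Usage (not executed here); the flag can be read from the running process:
#   ps -o args= -C java | grep -o -- '-Xmx[0-9]*[kKmMgG]*'
#   check_heap_fits -Xmx4g
```

An "OK" result only rules out the most obvious misconfiguration. The JVM also uses memory outside the heap, and other processes on the node need room too, so real headroom has to be judged from observed usage.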

Solution

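The content on this page relates to platforms that are not supported for Jira applications. Consequently, Atlassian cannot guarantee providing any support for it. This material is provided for your information only; use it at your own risk.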

Resolving this error requires a careful analysis of memory usage patterns to decide how much memory the Jira instance needs, and then adjusting the server's capacity accordingly. The documents below will help in making the right decision about the requirements.

Kubernetes

Exit code 137 is important because it means the system terminated the container with SIGKILL (137 = 128 + 9), typically because the container tried to use more memory than its limit.

Run this command to check for exit code 137:

kubectl get pod <pod-name> -o yaml

The output should show exit code 137, for example:

    state:
      terminated:
        containerID: docker://054cd898b7fff24f75f467895d4b0680c83fc54f49679faeaae975a579af87b8
        exitCode: 137
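
To read only the exit code instead of scanning the full YAML, a jsonpath query can be used, and the code can be decoded by subtracting 128 to get the signal number. In the sketch below, exit_code_signal is a hypothetical helper, and the jsonpath assumes the termination is recorded under lastState (as for a container that has since restarted); use .state.terminated if the container has not restarted.

```shell
# Hypothetical helper: exit codes above 128 mean the process was
# killed by signal number (code - 128); print that number, or 0 if
# the code does not indicate a signal.
exit_code_signal() {
  code=$1
  if [ "$code" -gt 128 ]; then
    echo $((code - 128))
  else
    echo 0
  fi
}

# Usage (not executed here; <pod-name> is a placeholder):
#   kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[*].lastState.terminated.exitCode}'
#   kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
#   exit_code_signal 137   # signal 9 is SIGKILL
```

When the memory limit was the cause, the reason field is expected to read OOMKilled.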

External resource: https://sysdig.com/blog/troubleshoot-kubernetes-oom/




Last modified on Nov 12, 2021
