Soft lockup messages from Linux kernel on Hipchat Server

Platform notice: Server and Data Center only. This article applies only to Atlassian products on the Server and Data Center platforms.

Support for Server* products ended on February 15th 2024. If you are running a Server product, you can visit the Atlassian Server end of support announcement to review your migration options.

*Except Fisheye and Crucible

 

 

Problem

An administrator sees soft lockup messages from the Linux kernel in /var/log/hipchat/kern.log.

The soft lockups can cause Hipchat Server to freeze or stop responding, which in turn disrupts normal operation.
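To gauge how often the problem occurs, the matching lines in the log can be counted. The following is a minimal sketch, not an Atlassian-provided tool: the function name count_soft_lockups is hypothetical, and the search string assumes the messages contain the phrase "soft lockup", which can vary between kernel versions.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that mention a soft lockup.
# The default path is the log file named in this article; pass another
# path as the first argument to inspect a rotated or copied log.
count_soft_lockups() {
    log_file="${1:-/var/log/hipchat/kern.log}"
    # -i: ignore case, -c: print the number of matching lines.
    # Note: grep exits non-zero when the count is 0.
    grep -ic 'soft lockup' "$log_file"
}
```

Running count_soft_lockups with no argument reads /var/log/hipchat/kern.log. Recording the count before and after a change gives a simple baseline for comparison.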

Cause

The hypervisor is not keeping up with the CPU demand of the Hipchat Server virtual machine (VM). Typically, resources at the hypervisor level are insufficient, or the underlying compute capacity is oversubscribed, so the Hipchat Server appliance does not get the CPU time it needs and its performance degrades.

Here's a related VMware article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009996

When running a Linux kernel in a symmetric multiprocessing (SMP) enabled virtual machine, messages similar to BUG: soft lockup detected on CPU#1! are written to the message log file. The exact format of these messages vary from kernel to kernel, and might be accompanied by a kernel stack backtrace.
...
When running in a virtual machine, this might instead indicate high levels of overcommitment (especially memory overcommitment) or other virtualization overheads.

The soft lockup messages indicate that the vCPUs are waiting some amount of time before the hypervisor is able to provide the resources necessary for a particular process inside the VM to continue functioning.
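One rough way to look for this kind of waiting from inside the guest is the "steal" counter in /proc/stat, which accumulates time the vCPUs were ready to run but were not scheduled by the hypervisor. This is only a sketch under assumptions: the function name steal_ticks is hypothetical, and not every hypervisor or kernel reports steal time to the guest, so a value of 0 does not rule out contention. Host-side metrics from the hypervisor remain the more reliable source.

```shell
#!/bin/sh
# Hypothetical helper: print the cumulative steal time (in clock ticks)
# from the aggregate "cpu" line of a /proc/stat-style file.
# Field order after the "cpu" label:
#   user nice system idle iowait irq softirq steal guest guest_nice
# so steal is the 8th value, which is awk field $9.
steal_ticks() {
    stat_file="${1:-/proc/stat}"
    awk '$1 == "cpu" { print $9 }' "$stat_file"
}
```

Calling steal_ticks twice, some seconds apart, and comparing the two numbers shows whether steal time is growing while the system is under load.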

Resolution

We recommend evaluating the virtual machine and its host to determine whether the VM needs more resources, whether other VMs should be moved off the host, or whether to switch to vSphere ESXi.

If the Virtual Machine has been allocated the proper amount of resources for the number of users per our System Requirements, then another possible workaround involves increasing the watchdog_thresh value on the server by running the following commands:

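# Open a root shell on the Hipchat Server appliance
sudo dont-blame-hipchat
# From the root shell, raise the kernel watchdog threshold to 60 seconds
echo 60 > /proc/sys/kernel/watchdog_thresh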

Monitor the system after applying this setting to confirm that it stabilizes and that the number of soft lockup messages decreases.
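As part of that monitoring, it can help to confirm that the new value is actually in effect. The sketch below reads the setting back and prints the resulting soft lockup threshold, which the Linux kernel documentation describes as twice watchdog_thresh. The function name soft_lockup_threshold is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the effective soft lockup threshold in seconds.
# The kernel reports a soft lockup after 2 * watchdog_thresh seconds,
# so a watchdog_thresh of 60 corresponds to 120 seconds.
soft_lockup_threshold() {
    thresh_file="${1:-/proc/sys/kernel/watchdog_thresh}"
    thresh=$(cat "$thresh_file")
    echo $((thresh * 2))
}
```

A value written to /proc/sys with echo does not survive a reboot, so it is worth running soft_lockup_threshold again after any restart of the appliance.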

 

Last modified on November 2, 2018
