Cluster Cache replication health check fails with error SocketException: Broken pipe exception
プラットフォームについて: Data Center のみ - この記事は、Data Center プラットフォームのアトラシアン製品にのみ適用されます。
この KB は Data Center バージョンの製品用に作成されています。Data Center 固有ではない機能の Data Center KB は、製品のサーバー バージョンでも動作する可能性はありますが、テストは行われていません。サーバー*製品のサポートは 2024 年 2 月 15 日に終了しました。サーバー製品を利用している場合は、アトラシアンのサーバー製品のサポート終了のお知らせページにて移行オプションをご確認ください。
*Fisheye および Crucible は除く
要約
Cluster Cache replication health check fails and the nodes cannot communicate with each other to replicate cache.
Name: Cluster Cache Replication
NodeId: null
Is healthy: false
Failure reason: ["The node node3 is not replicating","The node node2 is not replicating"]
Severity: CRITICAL
Exception from the atlassian-jira.log:
2021-06-04 18:26:25,830+0000 localq-reader-12 ERROR [c.a.j.c.distribution.localq.LocalQCacheOpReader] [LOCALQ] [VIA-COPY] Abandoning sending: LocalQCacheOp{cacheName='com.atlassian.jira.plugins.healthcheck.service.HeartBeatService.heartbeat', action=PUT, key=node2, value == null ? false, replicatePutsViaCopy=true, creationTimeInMillis=1622831185825} from cache replication queue: [queueId=queue_node1_2_164546f60261c7e4be0c5f5f9aaeec86_put, queuePath=/var/atlassian/application-data/jira-home/localq/queue_node1_2_164546f60261c7e4be0c5f5f9aaeec86_put], failuresCount: 1/1. Removing from queue. Error: java.rmi.MarshalException: error marshalling arguments; nested exception is:
java.net.SocketException: Broken pipe (Write failed)
com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpSender$UnrecoverableFailure: java.rmi.MarshalException: error marshalling arguments; nested exception is:
java.net.SocketException: Broken pipe (Write failed)
at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.send(LocalQCacheOpRMISender.java:90)
at com.atlassian.jira.cluster.distribution.localq.LocalQCacheOpReader.run(LocalQCacheOpReader.java:96)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.rmi.MarshalException: error marshalling arguments; nested exception is:
java.net.SocketException: Broken pipe (Write failed)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:158)
at net.sf.ehcache.distribution.RMICachePeer_Stub.put(Unknown Source)
at com.atlassian.jira.cluster.distribution.localq.rmi.LocalQCacheOpRMISender.lambda$send$2(LocalQCacheOpRMISender.java:69)
at com.atlassian.jira.cluster.distribution.localq.rmi.CachingRMICachePeerManager.withCachePeer(CachingRMICachePeerManager.java:94)
... 6 more
Caused by: java.net.SocketException: Broken pipe (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1915)
at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1576)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:351)
at sun.rmi.server.UnicastRef.marshalValue(UnicastRef.java:294)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:153)
原因
The nodes are unable to communicate with each other due to a loop in the IP configuration of the system.
診断
Verify if the IP entries are correct under /etc/hosts. Ensure there are no duplicated entries pointing to the node with both external and internal IP addresses:
127.0.1.1 ip-node-1
127.0.0.1 ip-node-1
ソリューション
- Remove the nodes' hostname from the local loopback IP 127.0.0.1 – Leaving only its external IP entries
- Repeat this for all nodes
- Jira を再起動します。
- Check if cluster cache replication health is resolved.
You may need to check if the IP was statically defined in other settings of the system, such as:
JIRA_LOCAL_HOME/cluster.properties, entry:
ehcache.listener.hostName
JIRA_INSTALL/bin/setenv.sh, entry:
-Djava.rmi.server.hostname