Confluence node fails to start after a restart due to the error "Caused by: com.hazelcast.config.ConfigurationException"
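Platform notice: Data Center - This article applies to Atlassian products on the Data Center platform.
This knowledge base article was written for the Data Center version of the product. Data Center knowledge base articles for features that are not Data Center-specific may also work for Server versions of the product, but they have not been tested. Support for Server* products ended on February 15, 2024. If you are running a Server product, see the Atlassian Server end of support announcement page to review your migration options.
*Except Fisheye and Crucible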
Summary
After a restart, a Confluence node fails to start because of the error "Caused by: com.hazelcast.config.ConfigurationException".
Environment
Confluence Data Center
Diagnosis
The atlassian-confluence.log file shows the above error, along with a 'Cannot add a dynamic configuration' message that refers to a 'MapConfig', as below:
Caused by: com.hazelcast.config.ConfigurationException: Cannot add a dynamic configuration 'MapConfig{name='atlassian-cache.Cache.com.atlassian.confluence.user.ConfluenceUserPropertySetFactory.propertysets', inMemoryFormat=BINARY', backupCount=0, asyncBackupCount=0, timeToLiveSeconds=3600, maxIdleSeconds=3600, evictionPolicy='LFU', mapEvictionPolicy='null', evictionPercentage=25, minEvictionCheckMillis=100, maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=40000}, readBackupData=false, hotRestart=HotRestartConfig{enabled=false, fsync=false}, nearCacheConfig=NearCacheConfig{name=default, inMemoryFormat=OBJECT, invalidateOnChange=true, timeToLiveSeconds=3600, maxIdleSeconds=3600, maxSize=40000, evictionPolicy='LFU', evictionConfig=EvictionConfig{size=40000, maxSizePolicy=ENTRY_COUNT, evictionPolicy=LFU, comparatorClassName=null, comparator=null}, cacheLocalEntries=true, localUpdatePolicy=INVALIDATE, preloaderConfig=NearCachePreloaderConfig{enabled=false, directory=, storeInitialDelaySeconds=600, storeIntervalSeconds=600}}, mapStoreConfig=MapStoreConfig{enabled=false, className='null', factoryClassName='null', writeDelaySeconds=0, writeBatchSize=1, implementation=null, factoryImplementation=null, properties={}, initialLoadMode=LAZY, writeCoalescing=true}, mergePolicyConfig=MergePolicyConfig{policy='com.atlassian.confluence.cluster.hazelcast.AlwaysNullMapMergePolicy', batchSize=100}, wanReplicationRef=null, entryListenerConfigs=[], mapIndexConfigs=[], mapAttributeConfigs=[], quorumName=null, queryCacheConfigs=[], cacheDeserializedValues=INDEX_ONLY}' as there is already a conflicting configuration
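To see which caches are affected, the log can be searched for this message. The following is a minimal sketch, not an official Atlassian tool: the function name conflicting_caches is hypothetical, and the log path must be supplied by you because its location depends on your installation.

```shell
# Hypothetical helper (not part of Confluence): print the unique cache names
# that appear in "Cannot add a dynamic configuration 'MapConfig{name='...'"
# messages in the given log file.
# Usage: conflicting_caches /path/to/atlassian-confluence.log
conflicting_caches() {
  # Keep only the lines that carry the MapConfig conflict message,
  # then extract the value of the first name='...' field after MapConfig{.
  grep -F "Cannot add a dynamic configuration 'MapConfig{name='" "$1" \
    | sed -E "s/.*MapConfig\{name='([^']+)'.*/\1/" \
    | sort -u
}
```

If the function prints one or more cache names, the node hit the configuration conflict described in this article. No output means the message was not found in that file.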
Cause
The reference to MapConfig indicates a cluster cache issue. MapConfig is part of the Hazelcast library that Confluence uses for clustering, and there are two known causes of this problem:
Cause #1: CONFSERVER-60142: Changing distributed cache settings prevents Confluence cluster node restart due to a Hazelcast exception
or
Cause #2: More than one node in the cluster was started simultaneously, and the nodes tried to join the cluster at the same time. This simultaneous join may have corrupted the EhCache and prevented a node from starting.
Solution
If you've recently made changes to distributed cache sizes (Confluence Administration >> General Configuration >> Cache Management >> Show advanced view), cause #1 above is likely the issue, so changing the cache settings back to the previous values should allow nodes to restart.
However, if nodes have recently been started at the same time, stopping all nodes in the cluster before restarting one node at a time should fix the problem.
Regardless of the cause, a full shutdown and subsequent restart will fix the problem, as the memory cache will be fully destroyed once the last node leaves the cluster.
- Stop Confluence on all nodes to bring the whole cluster down.
- Confirm that Confluence has fully stopped by checking that the Java process for Confluence has exited, e.g. 'ps -ef | grep -i confluence'.
- Once you've confirmed that Confluence has fully stopped on all nodes, restart only one node.
This can be one of the nodes that's currently considered 'good' or it can be this particular problematic node. This will reset the whole cluster cache.
- Check that the first node is fully up.
You can do this by pointing a browser directly at the node that was started in the previous step and verifying that it is responsive.
- Once the first node is fully up, start the remaining nodes, one by one. Each node startup will typically take a few minutes. Confirm that each node is fully up before starting the next node.
You can confirm that a node has finished starting up either by pointing a browser at the node that has just been started and checking that it's responsive, or via the UI under Confluence administration >> General Configuration >> Clustering.
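The two checks in the steps above (confirming that the Java process has exited, and confirming that a node is up before starting the next one) can be scripted. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the process check simply filters 'ps -ef' output as in the step above, and the readiness check assumes that the node answers on a /status URL with a JSON body containing "state":"RUNNING". Verify that assumption against your Confluence version before relying on it.

```shell
# Hypothetical helpers, not part of Confluence.

# Reads process listing lines on stdin and succeeds if any line mentions
# both "confluence" and "java" (case-insensitive).
has_confluence_java() {
  grep -i 'confluence' | grep -qi 'java'
}

# Succeeds while a Confluence Java process still appears in 'ps -ef'.
# 'grep -v grep' drops grep's own entry from the listing.
confluence_still_running() {
  ps -ef | grep -v grep | has_confluence_java
}

# Succeeds if the given response body reports the RUNNING state.
# Assumption: the status response is JSON such as {"state":"RUNNING"}.
state_is_running() {
  printf '%s' "$1" | grep -Eq '"state"[[:space:]]*:[[:space:]]*"RUNNING"'
}

# Polls <base_url>/status every 10 seconds until the node reports RUNNING
# or the attempts run out (default 60 attempts, about 10 minutes).
# Usage: wait_for_node http://node1.example.com:8090
wait_for_node() {
  base_url=$1
  attempts=${2:-60}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    body=$(curl -fsS --max-time 5 "$base_url/status" 2>/dev/null) || body=""
    if state_is_running "$body"; then
      return 0
    fi
    i=$((i + 1))
    sleep 10
  done
  return 1
}
```

On each node, run confluence_still_running after stopping Confluence and continue only once it fails, meaning no process was found. After starting the first node, run wait_for_node with that node's own address (not the load balancer's), and start the next node only after it returns successfully. The host name and port in the usage comment are placeholders.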