I just upgraded from Cassandra 3.11.13 to Cassandra 4.1.1, and I now see errors like this in the logs:
ERROR [ReadStage-1] 2023-05-18 09:00:13,032 JVMStabilityInspector.java:68 - Exception in thread Thread[ReadStage-1,5,SharedPool]
java.lang.NullPointerException: null
at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:854)
at org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:793)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:219)
at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:158)
at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:770)
at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.mergeStaticRows(UnfilteredRowIterators.java:494)
at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.<init>(UnfilteredRowIterators.java:407)
at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.create(UnfilteredRowIterators.java:426)
at org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.access$000(UnfilteredRowIterators.java:391)
at org.apache.cassandra.db.rows.UnfilteredRowIterators.merge(UnfilteredRowIterators.java:135)
at org.apache.cassandra.db.SinglePartitionReadCommand.withSSTablesIterated(SinglePartitionReadCommand.java:817)
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDiskInternal(SinglePartitionReadCommand.java:766)
at org.apache.cassandra.db.SinglePartitionReadCommand.queryMemtableAndDisk(SinglePartitionReadCommand.java:625)
at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:459)
at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:419)
at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:61)
at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124)
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:120)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
ERROR [ReadStage-3] 2023-05-18 09:00:41,324 JVMStabilityInspector.java:68 - Exception in thread Thread[ReadStage-3,5,SharedPool]
java.lang.NullPointerException: null
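It is a 6-node cluster with 8 CPUs and 64 GB RAM per node, running Java 11 with Shenandoah GC and a 31 GB heap. Do you have any idea where to start troubleshooting? Thank you.
The output of nodetool tpstats is as follows: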
Pool Name Active Pending Completed Blocked All time blocked
RequestResponseStage 0 0 4597757 0 0
MutationStage 0 0 2280940 0 0
ReadStage 0 0 3770853 0 0
CompactionExecutor 0 0 6828 0 0
MemtableReclaimMemory 0 0 34 0 0
PendingRangeCalculator 0 0 10 0 0
GossipStage 0 0 11283 0 0
SecondaryIndexManagement 0 0 1 0 0
HintsDispatcher 0 0 0 0 0
MigrationStage 0 0 5 0 0
MemtablePostFlush 0 0 49 0 0
PerDiskMemtableFlushWriter_0 0 0 22 0 0
ValidationExecutor 0 0 0 0 0
Sampler 0 0 0 0 0
ViewBuildExecutor 0 0 0 0 0
MemtableFlushWriter 0 0 34 0 0
CacheCleanupExecutor 0 0 0 0 0
Native-Transport-Requests 1 0 1745221 0 0
Latencies waiting in queue (micros) per dropped message types
Message type Dropped 50% 95% 99% Max
READ_RSP 0 924.0 2299.0 2759.0 8239.0
RANGE_REQ 0 642.0 1916.0 2759.0 2759.0
PING_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_COMMIT_REMOTE_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_COMMIT_AND_PREPARE_RSP 0 0.0 0.0 0.0 0.0
_SAMPLE 0 0.0 0.0 0.0 0.0
VALIDATION_RSP 0 0.0 0.0 0.0 0.0
SCHEMA_PULL_RSP 0 0.0 0.0 0.0 0.0
SYNC_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_START_PREPARE_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_PREPARE_REFRESH_REQ 0 0.0 0.0 0.0 0.0
SCHEMA_VERSION_REQ 0 0.0 0.0 0.0 0.0
HINT_RSP 0 0.0 0.0 0.0 0.0
BATCH_REMOVE_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_FINISH_PREPARE_RSP 0 0.0 0.0 0.0 0.0
PAXOS_COMMIT_REQ 0 642.0 1916.0 2299.0 2759.0
SNAPSHOT_RSP 0 0.0 0.0 0.0 0.0
COUNTER_MUTATION_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_PROPOSE_REQ 0 0.0 0.0 0.0 0.0
GOSSIP_DIGEST_SYN 0 924.0 35425.0 88148.0 88148.0
PAXOS_PREPARE_REQ 0 924.0 2299.0 2759.0 14237.0
PREPARE_MSG 0 0.0 0.0 0.0 0.0
PAXOS2_PREPARE_REFRESH_RSP 0 0.0 0.0 0.0 0.0
PAXOS_COMMIT_RSP 0 1109.0 2299.0 2759.0 2759.0
HINT_REQ 0 0.0 0.0 0.0 0.0
BATCH_REMOVE_REQ 0 642.0 1916.0 2299.0 2759.0
STATUS_RSP 0 0.0 0.0 0.0 0.0
READ_REPAIR_RSP 0 535.0 535.0 535.0 535.0
PAXOS2_PROPOSE_RSP 0 0.0 0.0 0.0 0.0
GOSSIP_DIGEST_ACK2 0 1109.0 35425.0 61214.0 61214.0
CLEANUP_MSG 0 0.0 0.0 0.0 0.0
REQUEST_RSP 0 642.0 73457.0 88148.0 219342.0
TRUNCATE_RSP 0 0.0 0.0 0.0 0.0
UNUSED_CUSTOM_VERB 0 0.0 0.0 0.0 0.0
REPLICATION_DONE_RSP 0 0.0 0.0 0.0 0.0
SNAPSHOT_REQ 0 0.0 0.0 0.0 0.0
ECHO_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_REPAIR_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_COMPLETE_RSP 0 0.0 0.0 0.0 0.0
PREPARE_CONSISTENT_REQ 0 0.0 0.0 0.0 0.0
FAILURE_RSP 0 1916.0 1916.0 1916.0 1916.0
BATCH_STORE_RSP 0 924.0 2299.0 2759.0 2759.0
SCHEMA_PUSH_RSP 0 0.0 0.0 0.0 0.0
MUTATION_RSP 0 924.0 2299.0 2759.0 14237.0
FINALIZE_PROPOSE_MSG 0 0.0 0.0 0.0 0.0
ECHO_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_REPAIR_RSP 0 0.0 0.0 0.0 0.0
INTERNAL_RSP 0 0.0 0.0 0.0 0.0
FAILED_SESSION_MSG 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_COMPLETE_REQ 0 0.0 0.0 0.0 0.0
_TRACE 0 0.0 0.0 0.0 0.0
SCHEMA_VERSION_RSP 0 0.0 0.0 0.0 0.0
FINALIZE_COMMIT_MSG 0 0.0 0.0 0.0 0.0
SNAPSHOT_MSG 0 0.0 0.0 0.0 0.0
PREPARE_CONSISTENT_RSP 0 0.0 0.0 0.0 0.0
PAXOS_PROPOSE_REQ 0 642.0 1916.0 2299.0 3311.0
PAXOS_PREPARE_RSP 0 924.0 2299.0 2759.0 14237.0
MUTATION_REQ 0 642.0 1916.0 2299.0 11864.0
PAXOS2_CLEANUP_RSP2 0 0.0 0.0 0.0 0.0
READ_REQ 0 642.0 1597.0 2299.0 9887.0
PING_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_COMMIT_REMOTE_REQ 0 0.0 0.0 0.0 0.0
RANGE_RSP 0 1109.0 2299.0 2299.0 2299.0
VALIDATION_REQ 0 0.0 0.0 0.0 0.0
SYNC_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_PREPARE_RSP 0 0.0 0.0 0.0 0.0
_TEST_1 0 0.0 0.0 0.0 0.0
GOSSIP_SHUTDOWN 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_START_PREPARE_RSP 0 0.0 0.0 0.0 0.0
TRUNCATE_REQ 0 0.0 0.0 0.0 0.0
_TEST_2 0 0.0 0.0 0.0 0.0
GOSSIP_DIGEST_ACK 0 924.0 35425.0 61214.0 61214.0
PAXOS2_CLEANUP_REQ 0 0.0 0.0 0.0 0.0
SCHEMA_PUSH_REQ 0 0.0 0.0 0.0 0.0
FINALIZE_PROMISE_MSG 0 0.0 0.0 0.0 0.0
PAXOS2_CLEANUP_FINISH_PREPARE_REQ 0 0.0 0.0 0.0 0.0
PAXOS2_PREPARE_REQ 0 0.0 0.0 0.0 0.0
BATCH_STORE_REQ 0 642.0 1916.0 2299.0 9887.0
COUNTER_MUTATION_RSP 0 0.0 0.0 0.0 0.0
REPAIR_RSP 0 0.0 0.0 0.0 0.0
PAXOS2_COMMIT_AND_PREPARE_REQ 0 0.0 0.0 0.0 0.0
STATUS_REQ 0 0.0 0.0 0.0 0.0
SCHEMA_PULL_REQ 0 0.0 0.0 0.0 0.0
READ_REPAIR_REQ 0 446.0 535.0 535.0 535.0
REPLICATION_DONE_REQ 0 0.0 0.0 0.0 0.0
PAXOS_PROPOSE_RSP 0 924.0 2299.0 2759.0 3311.0
Any help is much appreciated, thanks.
1 Answer
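Unfortunately, nothing in your post provides a clue as to what the problem is.
In fact, you truncated the most important part of the error, which is the stack trace.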
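In any case, you will need to investigate further and analyse the logs for clues. Pay attention to the messages in the minutes leading up to the error, but I recommend that you first "zoom out" by looking at system.log. DEBUG messages are not helpful at this stage of your investigation because they are not relevant until you have zoomed in on a specific problem. Cheers!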
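As a rough illustration of the "minutes leading up to the error" advice, here is a minimal Python sketch that collects the non-DEBUG entries preceding each ERROR entry. It assumes the log line layout shown in the question (level, thread in square brackets, then a timestamp) and that entries are in chronological order. The function names and the 5-minute default window are illustrative choices, not anything provided by Cassandra.

```python
import re
from datetime import datetime, timedelta

# Matches the start of a log entry in the layout shown in the question, e.g.
# "ERROR [ReadStage-1] 2023-05-18 09:00:13,032 JVMStabilityInspector.java:68 - ..."
# (assumption: the default Cassandra logback pattern is in use).
ENTRY = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s"
)


def parse_entries(lines):
    """Return (timestamp, level, text) tuples.

    Continuation lines such as stack frames do not match ENTRY and are skipped.
    """
    entries = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            entries.append((ts, match.group("level"), line.rstrip("\n")))
    return entries


def context_before_errors(lines, window_minutes=5):
    """For each ERROR entry, list the earlier non-DEBUG entries inside the window.

    Assumes the input is in chronological order, as a log file normally is.
    """
    entries = parse_entries(lines)
    window = timedelta(minutes=window_minutes)
    result = []
    for i, (ts, level, text) in enumerate(entries):
        if level != "ERROR":
            continue
        context = [
            earlier_text
            for (earlier_ts, earlier_level, earlier_text) in entries[:i]
            if earlier_level != "DEBUG" and ts - window <= earlier_ts
        ]
        result.append((text, context))
    return result
```

To try it, open your node's system.log (its location depends on how Cassandra was installed) and pass the file object to `context_before_errors`; each returned pair holds one ERROR line and the INFO/WARN/ERROR lines that preceded it within the window.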