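I have a 3-node Cassandra 2.0.9 cluster in production. Whenever I run a count, or any specific query, from cqlsh, I always get an rpc_timeout. Strangely, this only happens on cassandra1; the other nodes, which have the same configuration, work fine.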
[cqlsh 4.1.1 | Cassandra 2.0.9 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> use xdata;
cqlsh:xdata> select count(*) from blobstore limit 100;
Request did not complete within rpc_timeout.
Here is the output from system.log, recorded while the query was executing:
INFO [MemoryMeter:1] 2022-08-03 10:40:10,910 Memtable.java (line 481) CFS(Keyspace='system', ColumnFamily='sstable_activity') liveRatio is 14.607407883739976 (just-counted was 14.607407407407408). calculation took 2ms for 54 cells
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,061 MessagingService.java (line 857) 1 REQUEST_RESPONSE messages dropped in last 5000ms
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,061 StatusLogger.java (line 55) Pool Name Active Pending Completed Blocked All Time Blocked
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,062 StatusLogger.java (line 70) MutationStage 0 0 8726 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,062 StatusLogger.java (line 70) RequestResponseStage 0 0 193404 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,062 StatusLogger.java (line 70) ReadRepairStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,062 StatusLogger.java (line 70) ReadStage 0 0 295316 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,062 StatusLogger.java (line 70) ReplicateOnWriteStage 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) MiscStage 0 0 2582 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) AntiEntropySessions 0 0 1028 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) HintedHandoff 0 0 112 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) FlushWriter 0 0 39 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) MemoryMeter 0 0 50 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) GossipStage 0 0 150208 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,063 StatusLogger.java (line 70) CacheCleanupExecutor 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) InternalResponseStage 0 0 4112 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) CompactionExecutor 0 0 271 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) ValidationExecutor 0 0 2582 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) MigrationStage 0 0 2 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) commitlog_archiver 0 0 0 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) AntiEntropyStage 0 0 11332 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,064 StatusLogger.java (line 70) PendingRangeCalculator 0 0 3 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 70) MemtablePostFlusher 0 0 6062 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 79) CompactionManager 0 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 81) Commitlog n/a 0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 93) MessagingService n/a 0/0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 103) Cache Type Size Capacity KeysToSave
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 105) KeyCache 13808 104857600 all
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 111) RowCache 0 0 all
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 118) ColumnFamily Memtable ops,data
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 121) system.compaction_history 9,3184
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 121) system.hints 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,065 StatusLogger.java (line 121) system.IndexInfo 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.schema_columnfamilies 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.schema_triggers 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.NodeIdInfo 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.paxos 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.peer_events 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.range_xfers 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.compactions_in_progress 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.peers 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.schema_keyspaces 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.local 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.sstable_activity 639,23664
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.schema_columns 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,066 StatusLogger.java (line 121) system.batchlog 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,067 StatusLogger.java (line 121) xdata.blobstore 127,54400
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,067 StatusLogger.java (line 121) xdata.document 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,067 StatusLogger.java (line 121) xdata.blobstoremeta 252,121960
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,067 StatusLogger.java (line 121) system_traces.sessions 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:40:21,067 StatusLogger.java (line 121) system_traces.events 0,0
INFO [ScheduledTasks:1] 2022-08-03 10:42:13,051 GCInspector.java (line 116) GC for ParNew: 261 ms for 1 collections, 2073077016 used; max is 8482586624
2 Answers
3xiyfsfu #1
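Doing a SELECT COUNT() in CQL is not a good idea in a distributed architecture. When you run an unbounded COUNT(), Cassandra has to perform a full table scan to read every record on every node. As I explained in detail in Why COUNT() is bad in Cassandra, counting the rows in a table is very expensive.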
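You should consider using other software such as Spark or Solr, or a tool such as the DataStax Bulk Loader (DSBulk), which can count the data in a Cassandra table efficiently because they use algorithms optimized for the task. I have provided details in the DBA Stack Exchange post linked above. Cheers!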
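To illustrate the general idea behind counting in smaller pieces, here is a minimal Python sketch that splits the token ring into sub-ranges and builds one bounded COUNT query per range, so that no single request has to scan the whole table. It is not how any particular tool is implemented. It assumes the cluster uses Murmur3Partitioner (tokens from -2^63 to 2^63-1), and the partition key column name `id` is hypothetical, since the schema of xdata.blobstore is not shown in the question. The sketch only generates the CQL text; running the queries and summing the results is left to whatever client you use.

```python
# Assumption: Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_ring(n_splits):
    """Return n_splits contiguous (start, end) pairs covering the whole ring."""
    if n_splits < 1:
        raise ValueError("n_splits must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    # Integer arithmetic keeps the boundaries exact for 64-bit tokens.
    bounds = [MIN_TOKEN + (span * i) // n_splits for i in range(n_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def count_queries(keyspace, table, partition_key, n_splits):
    """Build one COUNT(*) statement per token sub-range.

    partition_key is a hypothetical column name; replace it with the
    table's real partition key.
    """
    queries = []
    for i, (start, end) in enumerate(split_token_ring(n_splits)):
        # Ranges are (start, end]; the first one is closed on the left
        # so that the minimum token is not skipped.
        lower_op = ">=" if i == 0 else ">"
        queries.append(
            f"SELECT COUNT(*) FROM {keyspace}.{table} "
            f"WHERE token({partition_key}) {lower_op} {start} "
            f"AND token({partition_key}) <= {end};"
        )
    return queries


if __name__ == "__main__":
    for query in count_queries("xdata", "blobstore", "id", 8):
        print(query)
```

Each sub-query still reads every row in its slice, so the total work is the same; the benefit is only that each request is small enough to have a chance of finishing within the timeout, and the partial counts can be added up afterwards.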
46scxncf #2
Solved by adding more nodes and purging some unused data.