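I am concerned about the "Compacted partition maximum bytes" value, because at about 89 MB it seems quite high.
Does this indicate a broken data model or some other problem? No issues have been observed on the application side.
Data stored in the table is packed into weekly buckets per device, using (week_first_day, device_id) as the partition key.
Data model of the table: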
CREATE TABLE device_data (
    week_first_day timestamp,
    device_id uuid,
    nano_since_epoch bigint,
    sensor_id uuid,
    source text,
    unit text,
    username text,
    value double,
    PRIMARY KEY ((week_first_day, device_id), nano_since_epoch, sensor_id)
);
nodetool cfstats output:
Table: device_data
SSTable count: 5
Space used (live): 447558297
Space used (total): 447558297
Space used by snapshots (total): 0
Off heap memory used (total): 211264
SSTable Compression Ratio: 0.2610509614736755
Number of partitions (estimate): 939
Memtable cell count: 458
Memtable data size: 63785
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 0
Local read latency: NaN ms
Local write count: 458
Local write latency: 0.058 ms
Pending flushes: 0
Percent repaired: 99.83
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 2216
Bloom filter off heap memory used: 2176
Index summary off heap memory used: 672
Compression metadata off heap memory used: 208416
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 89970660
Compacted partition mean bytes: 1100241
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0
1 Answer
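It really depends on the access pattern for the data in that partition. If you often read the whole partition, this could cause problems; if you only read part of it, it shouldn't be an issue. You could, for example, use the day as the bucket to break the partitions up.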
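As a rough illustration of day bucketing, here is a minimal Python sketch of how a client might derive a daily bucket value from nano_since_epoch before writing. It assumes UTC day boundaries, and the helper names (day_first_instant, estimated_daily_partition_bytes) are hypothetical, not part of the original schema or of any Cassandra API. The size estimate also assumes writes are spread evenly across the week, which may not hold for your devices.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 86_400


def day_first_instant(nano_since_epoch: int) -> datetime:
    """Return midnight UTC of the day containing the given timestamp.

    Assumption: buckets follow UTC days; adjust if your weeks/days
    are defined in another time zone.
    """
    seconds = nano_since_epoch // NANOS_PER_SECOND
    day_start = seconds - (seconds % SECONDS_PER_DAY)
    return datetime.fromtimestamp(day_start, tz=timezone.utc)


def estimated_daily_partition_bytes(weekly_bytes: int, days: int = 7) -> float:
    """Naive estimate: a weekly partition split evenly into daily ones."""
    return weekly_bytes / days


if __name__ == "__main__":
    # A sample reading taken at 2017-07-14 02:40:00 UTC.
    sample = 1_500_000_000 * NANOS_PER_SECOND
    print(day_first_instant(sample).isoformat())
    # Largest compacted partition reported by nodetool cfstats above.
    print(round(estimated_daily_partition_bytes(89_970_660)))
```

With this approach the partition key would become something like (day bucket, device_id) instead of (week_first_day, device_id), so a query covering a week would need to read up to seven partitions per device instead of one.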
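Take a look at the talk about the myths of big partitions from Cassandra Summit two years ago. It has more details on how large partitions are handled in Cassandra 3.x.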