How do I select the last timestamp for each distinct combination of columns?

Posted by ijnw1ujt on 2021-06-10 in Cassandra

Suppose I have a table like this:

| user_id | location_id | datetime            | other_field |
| ------- | ----------- | ------------------- | ----------- |
| 12      | 1           | 2020-02-01 10:00:00 | asdqwe      |
| 12      | 1           | 2020-02-01 10:30:00 | asdqwe      |
| 12      | 2           | 2020-02-01 10:40:00 | asdqwe      |
| 12      | 2           | 2020-02-01 10:50:00 | asdqwe      |
| 13      | 1           | 2020-02-01 10:10:00 | asdqwe      |
| 13      | 1           | 2020-02-01 10:20:00 | asdqwe      |
| 14      | 3           | 2020-02-01 09:00:00 | asdqwe      |

I want to select the last datetime for each distinct combination of user_id and location_id. This is the result I want:

| user_id | location_id | datetime            | other_field |
| ------- | ----------- | ------------------- | ----------- |
| 12      | 1           | 2020-02-01 10:30:00 | asdqwe      |
| 12      | 2           | 2020-02-01 10:50:00 | asdqwe      |
| 13      | 1           | 2020-02-01 10:20:00 | asdqwe      |
| 14      | 3           | 2020-02-01 09:00:00 | asdqwe      |

Here is the table definition:

CREATE TABLE mykeyspace.mytable (
    user_id int,
    location_id int,
    datetime timestamp,
    other_field text,
    PRIMARY KEY ((user_id, location_id, other_field), datetime)
) WITH CLUSTERING ORDER BY (datetime ASC)
    AND read_repair_chance = 0.0
    AND dclocal_read_repair_chance = 0.1
    AND gc_grace_seconds = 864000
    AND bloom_filter_fp_chance = 0.01
    AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
    AND comment = ''
    AND compaction = { 'class' : 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold' : 32, 'min_threshold' : 4 }
    AND compression = { 'chunk_length_in_kb' : 64, 'class' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
    AND default_time_to_live = 0
    AND speculative_retry = '99PERCENTILE'
    AND min_index_interval = 128
    AND max_index_interval = 2048
    AND crc_check_chance = 1.0
    AND cdc = false;

Answer #1, by rbpvctlc

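For this case, CQL has the PER PARTITION LIMIT clause (available in Cassandra 3.6+, IIRC). To use it with your table, you need to change the table definition to CLUSTERING ORDER BY (datetime DESC), so that the newest row comes first within each partition. Then you can write: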

SELECT * FROM mykeyspace.mytable PER PARTITION LIMIT 1;

and get the row with the latest timestamp for each partition key.
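
As a rough sketch of what this could look like end to end: the clustering order of an existing table cannot be changed with ALTER TABLE, so the example below assumes the data is reloaded into a new table. The name mykeyspace.mytable_latest is hypothetical. It also assumes other_field is moved out of the partition key, which is not part of the answer above: PER PARTITION LIMIT works per partition, and with the original key ((user_id, location_id, other_field)) rows that differ in other_field would sit in separate partitions and each return their own "latest" row.

```cql
-- Hypothetical replacement table (name and key layout are assumptions).
-- Partitioning by (user_id, location_id) gives one partition per pair,
-- and DESC clustering stores the newest datetime first in each partition.
CREATE TABLE mykeyspace.mytable_latest (
    user_id int,
    location_id int,
    datetime timestamp,
    other_field text,
    PRIMARY KEY ((user_id, location_id), datetime)
) WITH CLUSTERING ORDER BY (datetime DESC);

-- Returns the first row of every partition, i.e. the newest row
-- for each (user_id, location_id) pair.
SELECT user_id, location_id, datetime, other_field
FROM mykeyspace.mytable_latest
PER PARTITION LIMIT 1;
```

Note that a query with no WHERE clause reads every partition in the table, so this is likely to be reasonable for small tables but expensive on large ones.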
