Cassandra按时间戳列中的日期搜索

3j86kqsm  于 2022-12-12  发布在  Cassandra
关注(0)|答案(2)|浏览(152)

如何根据时间戳列中的日期获取记录。表的详细信息包括

CREATE TABLE hlragent_logs.hlragent_logs_2021 (
    msisdn text,
    date_time timestamp,
    cmd_no text,
    agent_id text,
    cmd_executed text,
    dummy text,
    id bigint,
    imsi text,
    mml_cmd text,
    module text,
    node text,
    node_id text,
    node_ip text,
    p text,
    pno text,
    serial text,
    vhlr_name text,
    PRIMARY KEY (msisdn, date_time, cmd_no)
) WITH CLUSTERING ORDER BY (date_time ASC, cmd_no ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
CREATE INDEX indx_agentlogs_2021 ON hlragent_logs.hlragent_logs_2021 (imsi)

选择 * from hlragent_logs_2021,其中今天(日期时间)=“2021-08-10”允许过滤;语法异常:第1行:45在输入'('处没有可行的替代项(从hlragent_logs_2021中选择 *,其中todate

o2gm4chl

o2gm4chl1#

您不能在WHERE子句中使用用户定义的函数(有一个Jira ticket for it,但我不记得有人在使用它)。

fwzugrvs

fwzugrvs2#

It isn't necessary to use the native CQL TODATE() function to work on timestamp columns. It's possible to directly work on the column with:

SELECT * FROM ... WHERE ... AND date_time < '2021-08-10';

But you cannot use the equality operator ( = ) because the CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so you need to be precise when you're working with timestamps.
Depending on where you're running the query, the filter could be translated in the local timezone. Let me illustrate with this example table:

CREATE TABLE community.tstamptbl (
    id int,
    tstamp timestamp,
    PRIMARY KEY (id, tstamp)
)

These 2 statements may appear similar but translate to 2 different entries:

INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09');
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09 +0000');

The first statement creates an entry with a timestamp in my local timezone (Melbourne, Australia) while the second statement creates an entry with a timestamp in UTC ( +0000 ):

cqlsh:community> SELECT * FROM tstamptbl WHERE id = 5;

 id | tstamp
----+---------------------------------
  5 | 2021-08-08 14:00:00.000000+0000
  5 | 2021-08-09 00:00:00.000000+0000

Similarly, you need to be precise when reading the data. You need to specify the timezone to remove ambiguity. Here are some examples:

SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-09 +0000';
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-10 12:00+0000';
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-08 12:34:56+0000';

Again since timestamps are encoded in milliseconds (instead of days), there is a whole range of possible values for a given date. If I wanted to retrieve all rows for the date 2021-08-09 , I need to filter based on a range as in this example:

SELECT * FROM tstamptbl 
WHERE id = 5
    AND tstamp >= '2021-08-09 +0000'
    AND tstamp < '2021-08-10 +0000';

Cheers!

相关问题