如何使用flink按时间过滤数据?

8cdiaqws  于 2022-12-09  发布在  Apache
关注(0)|答案(1)|浏览(208)

我有一些类似这样格式的数据:

(id, time, value)

给出以下模拟数据(可能有重复数据):

("a1", "2022-06-28 00:00:00", "0.23"), // The time interval is 15 minutes, and there is only 24-hour data of the day
("a1", "2022-06-28 00:15:00", "0.89"),
...
("a1", "2022-06-28 23:59:59", "0.11"),

("b1", "2022-06-28 00:00:00", "0.23"), 
("b1", "2022-06-28 00:15:00", "0.89"),
...
("b1", "2022-06-28 23:59:59", "0.11"),

("c1", "2022-06-28 00:00:00", "0.23"), 
("c1", "2022-06-28 00:15:00", "0.89"),
...
("c1", "2022-06-28 23:59:59", "0.11"),

假设现在是2022-06-28 16:00:00,我想计算1小时、45分钟、30分钟、15分钟前和现在的数据。
输出应如下所示:

("a1", "2022-06-28 15:00:00", "1"),
("a1", "2022-06-28 15:15:00", "1"),
("a1", "2022-06-28 15:30:00", "1"),
("a1", "2022-06-28 15:45:00", "1"),
("a1", "2022-06-28 16:00:00", "1"),
("b1", "2022-06-28 15:00:00", "1"),
("b1", "2022-06-28 15:15:00", "1"),
("b1", "2022-06-28 15:30:00", "1"),
("b1", "2022-06-28 15:45:00", "1"),
("b1", "2022-06-28 16:00:00", "1"),
("c1", "2022-06-28 15:00:00", "1"),
("c1", "2022-06-28 15:15:00", "1"),
("c1", "2022-06-28 15:30:00", "1"),
("c1", "2022-06-28 15:45:00", "1"),
("c1", "2022-06-28 16:00:00", "1"),

如何编写Flink程序?最好用Java或Scala编写。如果你能给我一些代码片段,我将非常感激!

t8e9dugd

t8e9dugd1#

I would recommend to have a look at the Windowing Table Valued Functions from Flink. You can find the documentation and examples at https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-tvf/#tumble

相关问题