我必须找到一组带有共享密钥的结果,这些结果必须在不早于过去90天、但不迟于过去7天的时间内发布到本地数据存储的api中。
这将是一个有效的组,我需要为其捕获1234:
account_id,company_id,posted_date
1234,A,2018-02-28
1234,B,2018-03-13
1234,C.2018-04-23
1234,D,2018-05-15
这将是一个无效的组。如果单个日期超出查询的上限或下限,则应将帐户id从最终结果中排除:
account_id,company_id,posted_date
5678,Z,2018-02-01
5678,Y,2018-03-13
5678,X.2018-04-23
5678,W,2018-05-21
这是使用子查询的查询的初稿:
SELECT DISTINCT account_id, company_id FROM local_data_store.result_api
WHERE account_id NOT IN (
SELECT account_id FROM local_data_store.result_api
GROUP BY account_id
HAVING posted_date > DATE_SUB(NOW(), INTERVAL 7 DAY)
)
AND account_did IN (
SELECT account_did FROM local_data_store.result_api
GROUP BY account_did
HAVING posted_date > DATE_SUB(NOW(), INTERVAL 90 DAY)
)
GROUP BY account_id, company_id
LIMIT 100000;
这是我现在正在处理的没有子查询的查询(我尝试了连接,但它们确实不起作用):
SELECT DISTINCT account_id, company_id,
COUNT(ra1.posted_date > DATE_SUB(NOW(), INTERVAL 90 DAY)) AS day90,
COUNT(ra1.posted_date > DATE_SUB(NOW(), INTERVAL 7 DAY)) as day7
FROM local_data_store.result_api ra1
GROUP BY posted_date, account_id;
但它运行的时间太长,数据库连接超时。这只存在于375000行的数据库表上。
1条答案
按热度按时间dddzy1tm1#
这是我想出的解决办法。希望它能帮助别人。