Mysql-如何获得最大的日期时,案件的数量已经达到峰值

dffbzjpn  于 2022-12-22  发布在  Mysql
关注(0)|答案(4)|浏览(122)

我正在尝试对sqlzoo(https://sqlzoo.net/wiki/Window_LAG#LAG_using_a_JOIN)中的这个问题集做第8个问题。
问题是“对于每一个在一天内至少有1000个新病例的国家,显示新病例峰值的日期。”
下表给出了各国每天的新冠肺炎病例、死亡人数和康复人数,如下所示:

+-------------+-------------------------------+-----------+--------+-----------+
|    Name     |              whn              | confirmed | deaths | recovered |
+-------------+-------------------------------+-----------+--------+-----------+
| Afghanistan | Sun, 01 Mar 2020 00:00:00 GMT |         1 |      0 |         0 |
| Albania     | Sun, 01 Mar 2020 00:00:00 GMT |         0 |      0 |         0 |
| Algeria     | Sun, 01 Mar 2020 00:00:00 GMT |         1 |      0 |         0 |
+-------------+-------------------------------+-----------+--------+-----------+

目前我有这个代码:

SELECT c.name, DATE_FORMAT(c.whn,'%Y-%m-%d') as this, d.peak
from ( select tw.name, max(tw.confirmed-lw.confirmed) as peak
FROM covid tw LEFT JOIN covid lw ON 
  DATE_ADD(lw.whn, INTERVAL 1 DAY) = tw.whn
   AND tw.name=lw.name
where tw.confirmed-lw.confirmed > 1000
group by tw.name) d
join covid as c
on d.name = c.name
group by name

它为我提供了每个国家、日期和病例峰值。但是,日期显示的是每个国家病例超过1000的第一天。我如何获得病例峰值出现的日期?

|  Name   |    this    | peak |
|---------|------------|------|
| Austria | 2020-03-26 | 1321 |
| Belarus | 2020-04-20 | 1485 |
| Belgium | 2020-03-26 | 2454 |
y0u0uwnf

y0u0uwnf1#

您可以通过比较随后几天的confirmed来计算 new 案例的数量;为此,lag()派上了用场:

select 
    t.*,
    confirmed - lag(confirmed, 1, 0) over(partition by name order by whn) new_cases
from mytable t

这假设每个国家/地区每天有一笔记录。然后,您可以以此为依据对每个国家/地区的记录进行排名,并筛选每个国家/地区排名靠前的日期:

select *
from (
    select 
        t*, 
        rank() over(partition by name order by new_cases desc) rn
    from (
        select 
            t.*,
            confirmed - lag(confirmed, 1, 0) over(partition by name order by whn) new_cases
        from mytable t
    ) t
    where new_cases > 1000
) t
where rn = 1
cwtwac6a

cwtwac6a2#

它看起来并不好,但它适用于MySQL。由于某些原因,其他答案没有通过。

select t2.name,DATE_FORMAT(t3.whn,'%Y-%m-%d'),t2.peakNewCases
from
    (select a.name, max(a.confirmed-b.confirmed) as peakNewCases
     from covid a 
     left join covid b
     on DATE_ADD(b.whn, interval 1 day)=a.whn 
     and 
     a.name=b.name
     where a.confirmed-b.confirmed>999
     group by name
    ) t2 
join 
    (select name, whn, confirmed-lag(confirmed,1) over (partition by name order by confirmed) as peakNewCases
     from covid
    ) t3 
on t2.peakNewCases=t3.peakNewCases 
and 
t2.name=t3.name
order by t3.whn
but5z9lq

but5z9lq3#

使用RANK() OVER (PARTITION BY name ORDER BY peak) AS rank对每个国家每天的新病例数进行排名,然后选择rank = 1在外部查询中的病例数以找到峰值。

SELECT name, date, peak FROM
 (SELECT name, date, peak, RANK() OVER (PARTITION BY name ORDER BY peak DESC) as rank
  FROM 
   (SELECT name, DATE_FORMAT(whn,'%Y-%m-%d') AS date, 
    confirmed - (LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn)) AS peak 
    FROM 
     covid)  
  TAB WHERE peak >= 1000)
TAB WHERE rank = 1
ORDER BY date

看起来输出的顺序需要调整才能完全符合答案,但我认为这是可行的。

wkyowqbh

wkyowqbh4#

SELECT 
   total_cases - LAG(total_cases,1) OVER (PARTITION BY state ORDER BY date) 
   AS new_cases,
   total_cases,
   state,
   date
FROM statistics

相关问题