python - Select only rows that have a column changed from the rows before it, given a unique id

Posted by c9x0cxw0 on 2021-07-26 in Java


I have a PostgreSQL database in which I want to record how specific columns change over time for each id. Table 1:

personID | status | unixtime | column d | column e | column f
    1        2       213214      x            y        z
    1        2       213325      x            y        z
    1        2       213326      x            y        z
    1        2       213327      x            y        z
    1        2       213328      x            y        z
    1        3       214330      x            y        z
    1        3       214331      x            y        z
    1        3       214332      x            y        z
    1        2       324543      x            y        z

I want to track every status change over time. Based on this, I need a new table, Table 2, containing the following data:

personID | status | unixtime | column d | column e | column f
    1        2       213214      x            y        z
    1        3       214330      x            y        z
    1        2       324543      x            y        z

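x, y and z are variables that can and will change from row to row. The table also holds thousands of other persons, each with their own id, whose changes I want to capture as well. Grouping by status and personID alone is not enough (as I see it), because the same status can legitimately appear in several rows for the same personID, as long as the status changed in between.
I am doing this in Python, but it is very slow (and I guess it involves a lot of I/O):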

for person in personid:
    status = -1  # sentinel: no status seen yet for this person
    records = getPersonRecords(person)  # sorted by unixtime in query
    newrecords = []
    for record in records:
        # keep a record only when its status differs from the previous one
        if record.status != status:
            status = record.status
            newrecords.append(record)
    appendtoDB(newrecords)
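
For checking the expected output, here is a minimal self-contained sketch of the same loop on in-memory data. The `Record` type, the `first_of_each_run` helper and the sample rows are assumptions made for illustration; they stand in for `getPersonRecords` and `appendtoDB`, which are not shown in the question.

```python
from collections import namedtuple

# Hypothetical record type; real rows would come from the database.
Record = namedtuple("Record", ["person_id", "status", "unixtime"])

_MISSING = object()  # sentinel meaning "no status seen yet for this person"


def first_of_each_run(records):
    """Keep a record only when its status differs from the previous
    record of the same person. Assumes records are sorted by unixtime."""
    kept = []
    previous = {}  # person_id -> last seen status
    for record in records:
        if previous.get(record.person_id, _MISSING) != record.status:
            kept.append(record)
        previous[record.person_id] = record.status
    return kept


# Sample rows taken from Table 1 (columns d, e, f omitted).
rows = [
    Record(1, 2, 213214), Record(1, 2, 213325), Record(1, 2, 213326),
    Record(1, 2, 213327), Record(1, 2, 213328), Record(1, 3, 214330),
    Record(1, 3, 214331), Record(1, 3, 214332), Record(1, 2, 324543),
]

changes = first_of_each_run(rows)
print([(r.status, r.unixtime) for r in changes])
# [(2, 213214), (3, 214330), (2, 324543)]
```

Because the last status is tracked per person in a dictionary, this version can process all persons in a single pass over one result set sorted by unixtime, rather than issuing one query per person.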

sql postgresql python window-functions gaps-and-islands

Source: https://stackoverflow.com/questions/62214728/select-only-rows-that-has-a-column-changed-from-the-rows-before-it-given-an-uni

1 Answer


7d7tgy0s1#

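This is a gaps-and-islands problem. You want the start of each island, which you can identify by comparing the status on the current row with the status on the "previous" record.
Window functions come in handy here: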

select t.*
from (
    select t.*, lag(status) over(partition by personID order by unixtime) lag_status
    from mytable t
) t
where lag_status is null or status <> lag_status
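
To try the query without a PostgreSQL server, the following sketch runs it from Python against an in-memory SQLite database and writes the result into a new table. SQLite is only a stand-in here (an assumption; window functions require SQLite 3.25 or later), the table name `table2` is hypothetical, columns d, e and f are omitted, and the rows for person 2 are invented to show the effect of `partition by`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (personID integer, status integer, unixtime integer)"
)
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        # Person 1: rows from Table 1 in the question.
        (1, 2, 213214), (1, 2, 213325), (1, 2, 213326), (1, 2, 213327),
        (1, 2, 213328), (1, 3, 214330), (1, 3, 214331), (1, 3, 214332),
        (1, 2, 324543),
        # Person 2: invented rows, not from the question.
        (2, 5, 213300), (2, 5, 213400), (2, 6, 213500),
    ],
)

# Materialize the first row of each island into a new table in one statement,
# so the filtering happens inside the database instead of in a Python loop.
conn.execute(
    """
    create table table2 as
    select personID, status, unixtime
    from (
        select t.*,
               lag(status) over (partition by personID order by unixtime) as lag_status
        from mytable t
    ) t
    where lag_status is null or status <> lag_status
    """
)

rows = conn.execute(
    "select personID, status, unixtime from table2 order by personID, unixtime"
).fetchall()
print(rows)
# [(1, 2, 213214), (1, 3, 214330), (1, 2, 324543), (2, 5, 213300), (2, 6, 213500)]
```

Doing the comparison in a single SQL statement avoids fetching every record into Python, which is likely where most of the I/O in the original loop comes from.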
