How to select data from a dataset under a given condition using Pandas time series

x6492ojm  posted on 2023-01-19  in Other
  Patient_id     timestamp      date      time  blood_sugar
0   pid11928  1.670000e+12  30-12-22  16:53:20          100
1   pid11928  1.600000e+12  05-10-20  12:46:40           98
2   pid11928  1.580000e+12  10-03-20   3:13:20          102
3   pid12334  1.480000e+12  07-01-17  17:26:40           99
4   pid12334  1.490000e+12  03-05-17  11:13:20           98
5   pid14556  1.500000e+12  30-06-17   8:06:40          115
6   pid14556  1.490000e+12  06-03-17  14:20:00          114
7   pid11223  1.600000e+12  11-09-20   7:40:00          100
8   pid11223  1.590000e+12  15-07-20  10:46:40          100
9   pid11223  1.580000e+12  21-03-20  17:00:00           95

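How can I select the users whose blood sugar rose or fell within a given time interval?
Here, a single user has multiple readings taken at different times, so I need to select the list of users who meet the rising-blood-sugar condition and find by how much it rose.
I tried splitting the df by unique user and then performing the operation on each piece, but that gets too messy for larger datasets.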

k97glaaz  #1

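So you want to check whether the difference between a patient's previous reading and their new reading is greater than zero? If so, try the following: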

import pandas as pd

df = pd.DataFrame({'Patient_id': ['pid11928', 'pid11928', 'pid11928', 'pid12334', 'pid12334', 'pid14556', 'pid14556', 'pid11223', 'pid11223', 'pid11223'], 'timestamp': [1670000000000.0, 1600000000000.0, 1580000000000.0, 1480000000000.0, 1490000000000.0, 1500000000000.0, 1490000000000.0, 1600000000000.0, 1590000000000.0, 1580000000000.0], 'date': ['30-12-22', '05-10-20', '10-03-20', '07-01-17', '03-05-17', '30-06-17', '06-03-17', '11-09-20', '15-07-20', '21-03-20'], 'time': ['16:53:20', '12:46:40', '3:13:20', '17:26:40', '11:13:20', '8:06:40', '14:20:00', '7:40:00', '10:46:40', '17:00:00'], 'blood_sugar': [100, 98, 102, 99, 98, 115, 114, 100, 100, 95]})

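# Per-patient difference between each reading and the previous row.
# Rows are compared in their existing order; nothing is sorted by time here.
df['diff'] = df.groupby('Patient_id')['blood_sugar'].diff()
# Keep only the rows where blood sugar is higher than in the previous row
df.loc[df['diff'].gt(0)]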

Output

  Patient_id     timestamp      date     time  blood_sugar  diff
2   pid11928  1.580000e+12  10-03-20  3:13:20          102   4.0
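
Note that the snippet above takes differences in the rows' existing order, and in the sample data that order is not chronological. If the goal is a rise or fall over time within a given interval, one possible variant is sketched below. It assumes that `date` is day-month-year and `time` is hour:minute:second, and the names `when`, `start`, `end`, `window`, `rises` and `falls` are made up for this illustration. The chosen interval (all of 2020) is also just an example.

```python
import pandas as pd

df = pd.DataFrame({
    'Patient_id': ['pid11928', 'pid11928', 'pid11928', 'pid12334', 'pid12334',
                   'pid14556', 'pid14556', 'pid11223', 'pid11223', 'pid11223'],
    'date': ['30-12-22', '05-10-20', '10-03-20', '07-01-17', '03-05-17',
             '30-06-17', '06-03-17', '11-09-20', '15-07-20', '21-03-20'],
    'time': ['16:53:20', '12:46:40', '3:13:20', '17:26:40', '11:13:20',
             '8:06:40', '14:20:00', '7:40:00', '10:46:40', '17:00:00'],
    'blood_sugar': [100, 98, 102, 99, 98, 115, 114, 100, 100, 95],
})

# Assumption: dates are DD-MM-YY and times are H:M:S (24-hour clock).
df['when'] = pd.to_datetime(df['date'] + ' ' + df['time'],
                            format='%d-%m-%y %H:%M:%S')

# Hypothetical interval of interest; adjust as needed.
start, end = pd.Timestamp('2020-01-01'), pd.Timestamp('2020-12-31')

# Keep readings inside the interval, ordered chronologically per patient.
window = df[df['when'].between(start, end)].sort_values(['Patient_id', 'when'])

# Change relative to the same patient's previous reading in time.
window = window.assign(diff=window.groupby('Patient_id')['blood_sugar'].diff())

cols = ['Patient_id', 'when', 'blood_sugar', 'diff']
rises = window.loc[window['diff'] > 0, cols]   # blood sugar went up
falls = window.loc[window['diff'] < 0, cols]   # blood sugar went down

print(rises)
print(falls)
```

With this ordering, a patient's first reading inside the interval has no predecessor, so its `diff` is NaN and it is excluded from both `rises` and `falls`. `rises['Patient_id'].unique()` then gives the list of users with at least one increase in the interval, and the `diff` column holds the size of each increase.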
