I have an employee schedule that I filter to get a DF of name, timein, timeout, like so:
from datetime import datetime
import pandas as pd

employees = [('BOB', datetime(2022,12,1,6,0,0), datetime(2022,12,1,14,0,0)),
             ('BOB', datetime(2022,12,2,6,0,0), datetime(2022,12,2,14,0,0)),
             ('GILL', datetime(2022,12,1,6,0,0), datetime(2022,12,1,14,0,0)),
             ('GILL', datetime(2022,12,3,6,0,0), datetime(2022,12,3,14,0,0)),
             ('TOBY', datetime(2022,12,1,14,0,0), datetime(2022,12,1,20,30,0))]
labels = ['name', 'timein', 'timeout']
df = pd.DataFrame.from_records(employees, columns=labels)
**I need to compare the time delta between the current timeout and the next timein value.** My idea was to filter, select, and update into a dict:
{'BOB' : [(datetime(2022,12,1,6,0,0), datetime(2022,12,1,14,0,0)), (datetime(2022,12,2,6,0,0), datetime(2022,12,2,14,0,0)), etc...}
Then it should be a simple test (for a common error pattern): dict['BOB'][i+1][0] - dict['BOB'][i][1] < fixed_duration
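To make the intended test concrete, here is a minimal sketch on a plain Python dict. The 24-hour `fixed_duration`, the `shifts` name, and the `short_gaps` helper are assumptions for illustration only; the real threshold is not stated here.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; the actual fixed_duration is not given in the question.
fixed_duration = timedelta(hours=24)

# Shifts keyed by name, each stored as a (timein, timeout) tuple from the sample rows.
shifts = {
    'BOB': [(datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
            (datetime(2022, 12, 2, 6), datetime(2022, 12, 2, 14))],
    'GILL': [(datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
             (datetime(2022, 12, 3, 6), datetime(2022, 12, 3, 14))],
}

def short_gaps(rows, limit):
    """Return each index i where (timein of shift i+1) - (timeout of shift i) < limit."""
    return [i for i in range(len(rows) - 1)
            if rows[i + 1][0] - rows[i][1] < limit]

# Indices of shifts followed too closely by the next one, per employee.
flagged = {name: short_gaps(rows, fixed_duration) for name, rows in shifts.items()}
```

With these sample rows BOB's gap is 16 hours and GILL's is 40 hours, so only BOB's first shift is flagged under the assumed 24-hour limit.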
But Pandas puts it through some NumPy wringer and produces God knows what:
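results = {}
names = df['name'].unique().tolist()
for name in names:
    # Note: 'BOB' is hard-coded in the filter, so every name receives BOB's rows.
    # Column names adjusted to match the sample df above (timein/timeout).
    times = df.loc[df['name'] == 'BOB', ['timein', 'timeout']].values.tolist()
    results.update({name: times})
results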
{'BOB': [[1669874400000000000, 1669903200000000000],
[1669960800000000000, 1669989600000000000]],
'GILL': [[1669874400000000000, 1669903200000000000],
[1669960800000000000, 1669989600000000000]],
'TOBY': [[1669874400000000000, 1669903200000000000],
[1669960800000000000, 1669989600000000000]]}
Why can't I get the datetimes back out?
Bonus points if you know a more Pandas way to do what I call "filter, select".
1 Answer
8fq7wneg #1
Here is what you want to do:
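One pandas-native way to get the gap between each timeout and the next timein is `groupby` plus `shift`. This is a sketch, not necessarily the code originally posted here. It assumes shifts are compared per employee in timein order, and the 24-hour `fixed_duration` and the `next_timein`, `gap`, and `too_short` column names are hypothetical.

```python
from datetime import datetime, timedelta
import pandas as pd

employees = [('BOB', datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
             ('BOB', datetime(2022, 12, 2, 6), datetime(2022, 12, 2, 14)),
             ('GILL', datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
             ('GILL', datetime(2022, 12, 3, 6), datetime(2022, 12, 3, 14)),
             ('TOBY', datetime(2022, 12, 1, 14), datetime(2022, 12, 1, 20, 30))]
df = pd.DataFrame.from_records(employees, columns=['name', 'timein', 'timeout'])

# Hypothetical threshold; the question does not say what fixed_duration is.
fixed_duration = timedelta(hours=24)

# Order each employee's shifts chronologically so "next" is well defined.
df = df.sort_values(['name', 'timein']).reset_index(drop=True)

# Next timein for the same employee; NaT on each employee's last shift.
df['next_timein'] = df.groupby('name')['timein'].shift(-1)

# Time between the end of this shift and the start of the next one.
df['gap'] = df['next_timein'] - df['timeout']

# Comparisons against NaT evaluate to False, so last shifts are never flagged.
df['too_short'] = df['gap'] < fixed_duration
```

Everything stays in datetime64/timedelta64 columns, so there is no round trip through NumPy lists and no explicit loop over names.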
It gives you:
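As for why the datetimes come back as integers: `.values` hands the columns to NumPy as a datetime64 array, and calling `.tolist()` on a nanosecond-resolution datetime64 array returns integer nanoseconds since the epoch, because Python's `datetime` cannot represent nanoseconds. Note also that the loop in the question filters on the literal 'BOB', which is why every key shows the same rows. If the dict is still wanted, a minimal sketch that keeps pandas Timestamps (the df is rebuilt here so the snippet runs on its own):

```python
from datetime import datetime
import pandas as pd

employees = [('BOB', datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
             ('BOB', datetime(2022, 12, 2, 6), datetime(2022, 12, 2, 14)),
             ('GILL', datetime(2022, 12, 1, 6), datetime(2022, 12, 1, 14)),
             ('GILL', datetime(2022, 12, 3, 6), datetime(2022, 12, 3, 14)),
             ('TOBY', datetime(2022, 12, 1, 14), datetime(2022, 12, 1, 20, 30))]
df = pd.DataFrame.from_records(employees, columns=['name', 'timein', 'timeout'])

# Iterating a datetime Series yields pd.Timestamp objects, so zipping the two
# columns keeps real datetimes instead of raw integers.
results = {
    name: list(zip(group['timein'], group['timeout']))
    for name, group in df.groupby('name')
}

# The test from the question now works directly on the dict entries.
gap = results['BOB'][1][0] - results['BOB'][0][1]
```

`pd.Timestamp` is a subclass of `datetime.datetime`, so the subtraction in the question's test works unchanged and yields a `Timedelta`.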