This question already has answers here:
How to assign a name to the size() column? (5 answers)
How can I use groupby() to get the count of each employee type on a given day, and feed the result back into the original DataFrame?
Here is the data:
import pandas as pd

shifts = [("Cashier", "Thursday"), ("Cashier", "Thursday"),
          ("Cashier", "Thursday"), ("Cook", "Thursday"),
          ("Cashier", "Friday"), ("Cashier", "Friday"),
          ("Cook", "Friday"), ("Cook", "Friday"),
          ("Cashier", "Saturday"), ("Cook", "Saturday"),
          ("Cook", "Saturday")]
labels = ["JOB_TITLE", "DAY"]
df = pd.DataFrame.from_records(shifts, columns=labels)
This use of value_counts() produces the correct counts:
shifts_series = df.groupby('DAY')['JOB_TITLE'].value_counts()
So how do I feed the values it returns back into the original DataFrame, like this:
   JOB_TITLE       DAY  TYPE
0    Cashier  Thursday     3
1    Cashier  Thursday     3
2    Cashier  Thursday     3
3       Cook  Thursday     1
4    Cashier    Friday     2
5    Cashier    Friday     2
6       Cook    Friday     2
7       Cook    Friday     2
8    Cashier  Saturday     1
9       Cook  Saturday     2
10      Cook  Saturday     2
I found some answers that suggest using transform(), but the result only counts the number of rows per 'DAY':
df.groupby('DAY')['JOB_TITLE'].transform('count')
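To make the symptom concrete, here is a small sketch using the same data as above (the variable name per_day is only illustrative). Because the grouping key is 'DAY' alone, every row receives the total number of shifts on its day, regardless of job title:

```python
import pandas as pd

shifts = [("Cashier", "Thursday"), ("Cashier", "Thursday"),
          ("Cashier", "Thursday"), ("Cook", "Thursday"),
          ("Cashier", "Friday"), ("Cashier", "Friday"),
          ("Cook", "Friday"), ("Cook", "Friday"),
          ("Cashier", "Saturday"), ("Cook", "Saturday"),
          ("Cook", "Saturday")]
df = pd.DataFrame.from_records(shifts, columns=["JOB_TITLE", "DAY"])

# Grouping by DAY only: each row gets the row count of its whole day.
per_day = df.groupby("DAY")["JOB_TITLE"].transform("count")
print(per_day.tolist())  # [4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3]
```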
Using the answer to a different question, I managed to create a nasty little pandas anti-pattern in which I loop over the results and try to flag the pairs [('Saturday', 'Cashier'), ('Thursday', 'Cook')]:
import numpy as np

shift_filter1 = shifts_series[shifts_series == 1].index.tolist()
df['WORKED_SOLO'] = np.nan
for workday, title in shift_filter1:
    df['WORKED_SOLO'] = (np.where(((df['WORKED_SOLO'].isna()) & (df['DAY'] == workday) & (df['JOB_TITLE'] == title)), True, np.nan))
But the resulting DataFrame overwrites the result of the previous loop iteration, despite the isna() test.
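A likely explanation, assuming the loop runs as posted: np.where(cond, True, np.nan) writes np.nan into every row where the condition is false, including rows that an earlier iteration had already flagged, so only the last pair survives. The isna() test narrows which rows become True, but it does not protect the others. One possible repair, offered as a sketch, is to assign through a boolean mask with df.loc so that only matching rows are touched. The name mask and the False default are illustrative choices, not part of the original post.

```python
import pandas as pd

shifts = [("Cashier", "Thursday"), ("Cashier", "Thursday"),
          ("Cashier", "Thursday"), ("Cook", "Thursday"),
          ("Cashier", "Friday"), ("Cashier", "Friday"),
          ("Cook", "Friday"), ("Cook", "Friday"),
          ("Cashier", "Saturday"), ("Cook", "Saturday"),
          ("Cook", "Saturday")]
df = pd.DataFrame.from_records(shifts, columns=["JOB_TITLE", "DAY"])

shifts_series = df.groupby("DAY")["JOB_TITLE"].value_counts()
# (DAY, JOB_TITLE) pairs that occur exactly once
shift_filter1 = shifts_series[shifts_series == 1].index.tolist()

df["WORKED_SOLO"] = False  # assumed default; the original used np.nan
for workday, title in shift_filter1:
    mask = (df["DAY"] == workday) & (df["JOB_TITLE"] == title)
    # Only rows selected by the mask are modified; earlier flags are kept.
    df.loc[mask, "WORKED_SOLO"] = True
```

This keeps the loop for comparison with the original attempt; a fully vectorized version would avoid the loop altogether.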
1 answer
You can do the following:
Which gives:
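A minimal sketch of one approach that produces the table asked for in the question (not necessarily the code this answer originally showed): group by both 'DAY' and 'JOB_TITLE', then use transform('count') so the per-pair count is broadcast back to every row. The second variant, using size() with reset_index(name=...) followed by a merge, follows the idea in the linked duplicate. The names with_counts, counts, and merged are illustrative.

```python
import pandas as pd

shifts = [("Cashier", "Thursday"), ("Cashier", "Thursday"),
          ("Cashier", "Thursday"), ("Cook", "Thursday"),
          ("Cashier", "Friday"), ("Cashier", "Friday"),
          ("Cook", "Friday"), ("Cook", "Friday"),
          ("Cashier", "Saturday"), ("Cook", "Saturday"),
          ("Cook", "Saturday")]
df = pd.DataFrame.from_records(shifts, columns=["JOB_TITLE", "DAY"])

# Variant 1: transform keeps the original index, so the result aligns row by row.
with_counts = df.assign(
    TYPE=df.groupby(["DAY", "JOB_TITLE"])["JOB_TITLE"].transform("count")
)

# Variant 2: build a small lookup table of counts, then left-merge it back.
counts = df.groupby(["DAY", "JOB_TITLE"]).size().reset_index(name="TYPE")
merged = df.merge(counts, on=["DAY", "JOB_TITLE"], how="left")

print(with_counts)
```

The transform variant is usually the shorter of the two; the merge variant can be handy when the counts table is also needed on its own.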