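I have a dataset in which one column contains dates (Date) and another contains categorical data (A: Yes, No, Unknown).
I want to show the overall percentage of "Yes" over time, relative to the point in time being observed (i.e., the number of "Yes" values divided by the cumulative total at that point in time).
Suppose I have data like this: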
df
Date A
2022-08-22 Unknown
2022-08-23 Yes
2022-08-24 No
2022-08-25 Unknown
2022-09-13 Yes
# . . .
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 246 entries, 0 to 245
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 246 non-null datetime64[ns]
1 A 246 non-null object
dtypes: datetime64[ns](1), object(1)
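The main question is: is there a trend showing that the frequency of A == "Yes" increases over time, year over year or month over month?
I want to show, for each year/month, the percentage of "Yes" out of all rows in that year/month. So if June 2022 has 10 records, 2 of which have A == "Yes", the value for June 2022 is 20%. It *might* look like this, where the values (A) are the percentages, expressed as fractions: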
Date Date
2022 1 0.05
2 0.22
3 0.88
4 0.79
5 0.51
6 0.04
7 0.20
8 0.91
9 0.98
Name: A, dtype: float64
I can compute the *count* of "Yes" by year and month like this:
df.loc[df["A"] == "Yes"]["A"].groupby([df["Date"].dt.year, df["Date"].dt.month]).agg("count")
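But I can't figure out how to get the percentage for each month relative to the total for that year/month, which requires dividing the number of rows with A == "Yes" by the total number of rows in each year/month.
To get the sample data: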
import pandas as pd
from pandas import Timestamp

d = [{'Date': Timestamp('2022-08-02 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-14 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-18 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-01-19 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-20 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-21 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-01-22 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-01-23 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-01-24 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-01-25 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-01-26 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-01-27 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-29 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-30 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-01-31 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-03 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-04 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-05 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-06 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-02-07 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-20 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-21 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-22 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-23 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-24 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-02-25 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-02-26 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-27 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-02-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-02 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-03 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-04 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-05 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-06 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-07 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-08 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-09 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-10 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-11 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-12 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-13 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-14 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-15 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-16 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-18 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-19 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-14 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-01-18 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-01-19 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-20 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-01-21 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-26 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-27 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-28 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-03-29 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-03-30 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-03-31 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-03 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-04 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-05 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-06 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-07 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-08 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-09 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-10 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-11 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-12 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-13 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-14 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-15 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-16 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-18 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-19 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-20 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-21 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-22 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-23 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-24 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-25 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-04-26 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-27 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-28 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-04-29 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-04-30 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-05-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-03 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-04 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-05 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-06 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-07 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-08 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-09 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-10 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-11 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-12 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-13 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-14 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-15 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-16 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-05-17 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-18 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-19 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-20 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-05-21 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-05-22 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-05-23 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-24 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-25 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-26 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-27 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-05-29 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-30 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-05-31 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-01 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-03 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-04 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-05 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-06 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-07 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-08 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-09 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-10 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-11 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-12 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-13 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-14 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-15 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-16 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-18 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-19 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-20 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-21 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-22 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-23 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-24 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-25 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-26 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-27 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-06-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-06-29 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-06-30 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-02 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-03 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-04 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-05 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-06 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-07 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-08 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-09 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-10 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-11 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-12 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-13 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-14 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-15 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-16 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-16 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-16 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-16 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-17 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-28 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-07-29 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-30 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-07-31 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-01 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-02 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-03 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-04 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-05 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-06 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-07 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-08 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-09 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-10 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-11 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-12 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-13 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-14 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-15 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-16 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-17 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-18 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-19 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-20 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-21 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-22 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-23 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-24 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-25 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-08-26 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-27 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-29 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-08-30 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-08-31 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-01 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-02 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-03 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-04 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-05 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-06 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-07 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-08 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-09 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-10 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-11 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-12 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-13 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-14 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-15 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-16 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-17 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-18 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-19 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-20 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-21 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-22 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-23 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-24 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-25 00:00:00'), 'A': 'Yes'},
{'Date': Timestamp('2022-09-26 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-27 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-28 00:00:00'), 'A': 'Unknown'},
{'Date': Timestamp('2022-09-29 00:00:00'), 'A': 'No'},
{'Date': Timestamp('2022-09-30 00:00:00'), 'A': 'No'}]
df = pd.DataFrame(d)
1 Answer
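You can try the following:
Use .pivot_table, grouping on ["Year", "Month"], to count the occurrences of each value in column A and place the counts into corresponding columns. The Yes column is then divided by the total count for each Year-Month (df_counts.sum(axis=1)). Result for the sample DataFrame: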
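For reference, the steps described in this answer can be sketched roughly as follows. This is a minimal sketch under some assumptions, not necessarily the exact code of the original answer: the small DataFrame is made up for illustration (it is not the question's data), the Year and Month columns are derived from Date, and the name yes_share as well as the choice of values="Date" with aggfunc="count" are assumptions.

```python
import pandas as pd

# Small made-up sample (NOT the question's data) so the sketch is self-contained.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2022-01-18", "2022-01-19", "2022-01-21", "2022-01-23",
        "2022-02-01", "2022-02-02", "2022-02-06", "2022-02-24", "2022-02-26",
    ]),
    "A": ["Yes", "Unknown", "No", "Yes",
          "Unknown", "No", "Yes", "Yes", "No"],
})

# Derive the grouping keys from the Date column (assumed column names).
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month

# One row per (Year, Month), one column per value of A; cells hold row counts.
df_counts = df.pivot_table(
    index=["Year", "Month"],
    columns="A",
    values="Date",
    aggfunc="count",
    fill_value=0,
)

# Fraction of "Yes" among all rows of each Year-Month.
yes_share = df_counts["Yes"] / df_counts.sum(axis=1)
print(yes_share)
```

In this made-up sample, January has 2 "Yes" rows out of 4 and February has 2 out of 5, so the fractions are 0.5 and 0.4. Multiply by 100 if you want the values as percentages.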