Pandas: add an incrementing index

Posted by hc8w905p on 2023-08-01 in Other


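I have a Pandas DataFrame with Key and Value columns.

I need to create an index that increments only when a key reappears after a run of different keys. So I need this output: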

Key Value Index
A    10     1
A    20     1
B    30     1
B    40     1
C    50     1
A    60     2
A    70     2
A    70     2
B    80     2
A    90     3

Thank you.
I tried using groupby with cumcount() + 1, but it does not work.
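
For reference, here is a minimal sketch of why that attempt falls short, assuming the input holds the Key and Value columns shown in the desired output (the name `attempt` is only illustrative). `cumcount()` numbers the rows inside each key, not the consecutive runs of a key:

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['A', 'A', 'B', 'B', 'C', 'A', 'A', 'A', 'B', 'A'],
    'Value': [10, 20, 30, 40, 50, 60, 70, 70, 80, 90],
})

# cumcount() counts rows within each Key group, so every repeated row
# gets a new number even when it belongs to the same consecutive run.
attempt = df.groupby('Key').cumcount() + 1
print(attempt.tolist())  # [1, 2, 1, 2, 1, 3, 4, 5, 3, 6]
```

The wanted result instead stays constant within a run and only grows when the same key starts a new run.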

pandas

Source: https://stackoverflow.com/questions/76806273/pandas-add-increment-index

4 Answers

zpqajqem1#

import pandas as pd

df = pd.DataFrame({
    'Key': ['A', 'A', 'B', 'B', 'C', 'A', 'A', 'A', 'B', 'A'],
    'Value': [10, 20, 30, 40, 50, 60, 70, 70, 80, 90]
})

df['Index'] = (df.Key != df.Key.shift()).cumsum()
df['Index'] = df.groupby('Key')['Index'].rank(method='dense').astype(int)

print(df)  # display(df) also works inside a Jupyter notebook


  Key  Value  Index
0   A     10      1
1   A     20      1
2   B     30      1
3   B     40      1
4   C     50      1
5   A     60      2
6   A     70      2
7   A     70      2
8   B     80      2
9   A     90      3

A quick breakdown of what is happening:

# Check whether the current Key differs from the previous Key, which gives a boolean Series.
# The cumulative sum of that Series then assigns one number to each consecutive run of equal keys.

(df.Key != df.Key.shift()).cumsum()

# Dense-rank these run numbers within each Key group: rows of the same run share a rank,
# and the rank grows by 1 each time the key starts a new run.

df.groupby('Key')['Index'].rank(method='dense')
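
The two steps above can be traced on the sample keys. This is only a sketch; the intermediate names `starts`, `run_id` and `index` are illustrative and not part of the answer:

```python
import pandas as pd

key = pd.Series(['A', 'A', 'B', 'B', 'C', 'A', 'A', 'A', 'B', 'A'])

# Step 1: True wherever a new run of equal keys starts.
starts = key != key.shift()

# Step 2: the cumulative sum numbers the runs 1, 2, 3, ...
run_id = starts.cumsum()
print(run_id.tolist())  # [1, 1, 2, 2, 3, 4, 4, 4, 5, 6]

# Step 3: dense-rank the run numbers separately for each key.
# For 'A' the runs 1, 4, 6 become ranks 1, 2, 3.
index = run_id.groupby(key).rank(method='dense').astype(int)
print(index.tolist())   # [1, 1, 1, 1, 1, 2, 2, 2, 2, 3]
```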


ttp71kqs2#

Using an ordered Categorical and numpy.cumsum:

import numpy as np

s = pd.Categorical(df['Key'], ordered=True)
df['Index'] = np.cumsum(s<s.shift())+1

If you want a custom order, pass categories=['X', 'Z', 'Y'].

Or, as @SimonT commented, if your categories are already in lexicographic order:

df['Index'] = np.cumsum(df['Key']<df['Key'].shift())+1

Output:

Key  Value  Index
0   A     10      1
1   A     20      1
2   B     30      1
3   B     40      1
4   C     50      1
5   A     60      2
6   A     70      2
7   A     70      2
8   B     80      2
9   A     90      3
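
As an illustration of the custom-order remark above, here is a small sketch with hypothetical keys X, Z and Y that cycle in that order. It wraps the Categorical in a `pd.Series` (a choice made here, not taken from the answer) so that `shift()` and `cumsum()` are the ordinary Series methods:

```python
import pandas as pd

# Hypothetical keys whose intended order is X -> Z -> Y.
keys = ['X', 'Z', 'Y', 'X', 'Z', 'X']
s = pd.Series(pd.Categorical(keys, categories=['X', 'Z', 'Y'], ordered=True))

# A key that sorts before its predecessor marks the start of a new cycle.
# The first row compares against a missing value, which evaluates to False.
index = (s < s.shift()).cumsum() + 1
print(index.tolist())  # [1, 1, 1, 2, 2, 3]
```

Note that this technique increments whenever the order goes backwards, so it relies on the keys always appearing in the declared order within a cycle.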


fgw7neuy3#

Another approach is to compute the dense rank with pd.factorize:

df['Index'] = (df['Key'] != df['Key'].shift()).cumsum()
df['Index'] = df.groupby('Key')['Index'].transform(lambda x: pd.factorize(x)[0] + 1)

Output:

Key  Value  Index
0   A     10      1
1   A     20      1
2   B     30      1
3   B     40      1
4   C     50      1
5   A     60      2
6   A     70      2
7   A     70      2
8   B     80      2
9   A     90      3
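
To see why `pd.factorize` behaves like a dense rank here, consider this sketch. The values in `runs` are the run numbers that key A would receive from the cumsum step on the sample data; the variable names are illustrative:

```python
import pandas as pd

# Run numbers of key 'A' after the cumsum step.
runs = pd.Series([1, 1, 4, 4, 4, 6])

# factorize labels each distinct value by order of first appearance, starting at 0.
codes, uniques = pd.factorize(runs)
print(codes.tolist())        # [0, 0, 1, 1, 1, 2]
print(uniques.tolist())      # [1, 4, 6]
print((codes + 1).tolist())  # [1, 1, 2, 2, 2, 3]
```

Because the run numbers only ever increase, order of first appearance coincides with sorted order, so the codes plus 1 match the dense rank.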


a1o7rhls4#

Try this:

df['Key'].ne(df['Key'].shift()).groupby(df['Key']).cumsum()

Or:

df.loc[df['Key'].ne(df['Key'].shift())].groupby('Key').cumcount().add(1).reindex(df.index,method = 'ffill')
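
Both one-liners can be unpacked as follows. This is a sketch that assumes the same sample DataFrame as in the first answer; `starts`, `first`, `run_rows` and `second` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['A', 'A', 'B', 'B', 'C', 'A', 'A', 'A', 'B', 'A'],
    'Value': [10, 20, 30, 40, 50, 60, 70, 70, 80, 90],
})

# True on the first row of every consecutive run of equal keys.
starts = df['Key'].ne(df['Key'].shift())

# Variant 1: count the run starts seen so far, separately for each key.
first = starts.groupby(df['Key']).cumsum().astype(int)

# Variant 2: keep only the first row of each run, number those rows per key,
# then forward-fill the numbers onto the remaining rows of each run.
run_rows = df.loc[starts]
second = run_rows.groupby('Key').cumcount().add(1).reindex(df.index, method='ffill')

print(first.tolist())   # [1, 1, 1, 1, 1, 2, 2, 2, 2, 3]
print(second.tolist())  # [1, 1, 1, 1, 1, 2, 2, 2, 2, 3]
```

Either result can be assigned back with `df['Index'] = ...`.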

