pandas: create an ID with information based on dict keys

Posted by kgqe7b3p on 2023-02-20 in Other


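I have a large DataFrame with the following input:

| client | type | country | potential |
| ------ | ---- | ------- | --------- |
| Client 1 | Private | USA | low |
| Client 2 | Private | Ireland | high |
| Client 3 | Institutional | GB | mid |
| Client 4 | Institutional | GB | mid |
| Client 5 | Institutional | GB | mid |

I want to create an ID for each client. The ID should not be random (I tried the uuid package); it should contain information about the client and distinguish clients that share the same attributes.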

ID_classification = {'type':{'A':'Private','B':'Institutional'},
                     'country':{'1':'USA','2':'GB','3':'Ireland'},
                     'potential':{'1':'low','2':'mid','3':'high'}}
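
For reproducibility, the sample data can be built as in the sketch below. It is reconstructed from the table in the question; the column names `client`, `type`, `country` and `potential` and the value spellings (for example `GB`) are assumptions chosen to match the values in `ID_classification`.

```python
import pandas as pd

# Sample data reconstructed from the question's table.
# Spellings are assumed so that they match the values in ID_classification.
df = pd.DataFrame({
    'client': ['Client 1', 'Client 2', 'Client 3', 'Client 4', 'Client 5'],
    'type': ['Private', 'Private', 'Institutional', 'Institutional', 'Institutional'],
    'country': ['USA', 'Ireland', 'GB', 'GB', 'GB'],
    'potential': ['low', 'high', 'mid', 'mid', 'mid'],
})
print(df)
```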

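The ID pattern could look like this (the final pattern is not decided yet):
type.key-country.key-potential.key-unique_id
resulting in:

| id | client | type | country | potential |
| -- | ------ | ---- | ------- | --------- |
| A-1-1-1 | Client 1 | Private | USA | low |
| A-3-3-1 | Client 2 | Private | Ireland | high |
| B-2-2-1 | Client 3 | Institutional | GB | mid |
| B-2-2-2 | Client 4 | Institutional | GB | mid |
| B-2-2-3 | Client 5 | Institutional | GB | mid |

Thanks in advance.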

pandas

Source: https://stackoverflow.com/questions/75508158/create-id-with-information-based-on-dict-keys

2 Answers


brccelvz1#

You can use:

# reorganize your mapping dictionary
# to have the key: value in correct order
mapper = {k1: {k: v for v, k in d.items()}
          for k1, d in ID_classification.items()}

# map all desired columns
df['id'] = df[list(mapper)].apply(lambda s: s.map(mapper[s.name])).agg('-'.join, axis=1)

# add unique id
df['id'] += '-' + df.groupby('id').cumcount().add(1).astype(str)

Output:

     client           type  country potential       id
0  Client 1        Private      USA       low  A-1-1-1
1  Client 2        Private  Ireland      high  A-3-3-1
2  Client 3  Institutional       GB       mid  B-2-2-1
3  Client 4  Institutional       GB       mid  B-2-2-2
4  Client 5  Institutional       GB       mid  B-2-2-3
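
One caveat, offered as a hedged sketch rather than as part of the answer above: `Series.map` returns `NaN` for any value that is missing from the mapping, and `'-'.join` then raises a `TypeError` on that row. The check below lists such values before any ID is built. The second row (`Client 6`, country `France`) is made-up data used only to trigger the case.

```python
import pandas as pd

ID_classification = {'type': {'A': 'Private', 'B': 'Institutional'},
                     'country': {'1': 'USA', '2': 'GB', '3': 'Ireland'},
                     'potential': {'1': 'low', '2': 'mid', '3': 'high'}}

# Invert each inner dict so that labels map to keys
mapper = {col: {label: key for key, label in d.items()}
          for col, d in ID_classification.items()}

# Hypothetical data: 'France' has no key in ID_classification
df = pd.DataFrame({
    'client': ['Client 1', 'Client 6'],
    'type': ['Private', 'Private'],
    'country': ['USA', 'France'],
    'potential': ['low', 'low'],
})

# Map every classified column; unmapped labels become NaN
codes = df[list(mapper)].apply(lambda s: s.map(mapper[s.name]))

# Collect, per column, the original labels that did not map to a key
unmapped = {col: sorted(df.loc[codes[col].isna(), col].unique())
            for col in codes if codes[col].isna().any()}
print(unmapped)
```

If `unmapped` is non-empty, either extend `ID_classification` or decide on a placeholder key before joining the parts.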

2023-02-20

zengzsys2#

Use:

import pandas as pd

# swap keys and values in the inner dictionaries
d = {k: {v1: k1 for k1, v1 in v.items()} for k, v in ID_classification.items()}

# map each column through d, then join the resulting keys with '-'
s = pd.DataFrame([df[k].map(v) for k, v in d.items()]).agg('-'.join)

# append a per-group counter, grouping by the Series s
df['id'] = s + '-' + df.groupby(s).cumcount().add(1).astype(str)
print(df)
     client           type  country potential       id
0  Client 1        Private      USA       low  A-1-1-1
1  Client 2        Private  Ireland      high  A-3-3-1
2  Client 3  Institutional       GB       mid  B-2-2-1
3  Client 4  Institutional       GB       mid  B-2-2-2
4  Client 5  Institutional       GB       mid  B-2-2-3
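
Since the final pattern is not decided yet, the steps could also be wrapped in a small function with a configurable separator. This is only a sketch under assumptions: `add_ids` and its `sep` and `id_col` parameters are hypothetical names, not part of pandas or of either answer, and the sample frame is reconstructed from the question.

```python
import pandas as pd

ID_classification = {'type': {'A': 'Private', 'B': 'Institutional'},
                     'country': {'1': 'USA', '2': 'GB', '3': 'Ireland'},
                     'potential': {'1': 'low', '2': 'mid', '3': 'high'}}


def add_ids(df, classification, sep='-', id_col='id'):
    """Return a copy of df with an ID column built from classification keys.

    Hypothetical helper: assumes every value in the classified columns
    appears in `classification`.
    """
    # Invert each inner dict: label -> key
    inverse = {col: {label: key for key, label in d.items()}
               for col, d in classification.items()}
    # One key per classified column, concatenated row-wise with sep
    parts = [df[col].map(m) for col, m in inverse.items()]
    prefix = parts[0].str.cat(parts[1:], sep=sep)
    # Running counter within each identical prefix, starting at 1
    counter = df.groupby(prefix).cumcount().add(1).astype(str)
    return df.assign(**{id_col: prefix + sep + counter})


# Sample data reconstructed from the question (assumed spellings)
df = pd.DataFrame({
    'client': ['Client 1', 'Client 2', 'Client 3', 'Client 4', 'Client 5'],
    'type': ['Private', 'Private', 'Institutional', 'Institutional', 'Institutional'],
    'country': ['USA', 'Ireland', 'GB', 'GB', 'GB'],
    'potential': ['low', 'high', 'mid', 'mid', 'mid'],
})

out = add_ids(df, ID_classification)
print(out['id'].tolist())
```

Returning a copy via `assign` leaves the input frame untouched, which makes it easier to try different separators while the pattern is still open.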

2023-02-20
