pandas: creating IDs from information based on dict keys

kgqe7b3p posted on 2023-02-20 in Other

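I have a large DataFrame with the following input:
| client | type | country | potential |
| --- | --- | --- | --- |
| Client 1 | Private | USA | low |
| Client 2 | Private | Ireland | high |
| Client 3 | Institutional | GB | mid |
| Client 4 | Institutional | GB | mid |
| Client 5 | Institutional | GB | mid |
I would like to create an ID for each client. The ID should not be random (I tried the uuid package). Instead, it should encode the client's information and still distinguish clients that share the same attributes.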

ID_classification = {'type':{'A':'Private','B':'Institutional'},
                     'country':{'1':'USA','2':'GB','3':'Ireland'},
                     'potential':{'1':'low','2':'mid','3':'high'}}

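The ID pattern could look like this (the final pattern has not been decided yet):
type.key-country.key-potential.key-unique_id
This would result in:
| id | client | type | country | potential |
| --- | --- | --- | --- | --- |
| A-1-1-1 | Client 1 | Private | USA | low |
| A-3-3-1 | Client 2 | Private | Ireland | high |
| B-2-2-1 | Client 3 | Institutional | GB | mid |
| B-2-2-2 | Client 4 | Institutional | GB | mid |
| B-2-2-3 | Client 5 | Institutional | GB | mid |
Thanks in advance.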
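
For anyone who wants to run the answers below, the sample input can be rebuilt roughly as follows. This is only a sketch: the column names (`client`, `type`, `country`, `potential`) are taken from the printed output in the answers, and the real DataFrame is presumably much larger.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({
    'client': ['Client 1', 'Client 2', 'Client 3', 'Client 4', 'Client 5'],
    'type': ['Private', 'Private', 'Institutional', 'Institutional', 'Institutional'],
    'country': ['USA', 'Ireland', 'GB', 'GB', 'GB'],
    'potential': ['low', 'high', 'mid', 'mid', 'mid'],
})

# Classification dictionary as given in the question (key -> label)
ID_classification = {'type': {'A': 'Private', 'B': 'Institutional'},
                     'country': {'1': 'USA', '2': 'GB', '3': 'Ireland'},
                     'potential': {'1': 'low', '2': 'mid', '3': 'high'}}

print(df)
```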

brccelvz #1

You can use:

# reorganize your mapping dictionary
# to have the key: value in correct order
mapper = {k1: {k: v for v, k in d.items()}
          for k1, d in ID_classification.items()}

# map all desired columns
df['id'] = df[list(mapper)].apply(lambda s: s.map(mapper[s.name])).agg('-'.join, axis=1)

# add unique id
df['id'] += '-' + df.groupby('id').cumcount().add(1).astype(str)

Output:

     client           type  country potential       id
0  Client 1        Private      USA       low  A-1-1-1
1  Client 2        Private  Ireland      high  A-3-3-1
2  Client 3  Institutional       GB       mid  B-2-2-1
3  Client 4  Institutional       GB       mid  B-2-2-2
4  Client 5  Institutional       GB       mid  B-2-2-3
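
One caveat with this approach: `Series.map` returns NaN for any value that has no entry in the mapping, and `'-'.join` then fails because NaN is not a string. A possible way to guard against that is sketched below. The placeholder code `'X'` and the tiny two-row frame are assumptions made for illustration only, not part of the original answer.

```python
import pandas as pd

# Small illustrative frame; 'Unknown' has no entry in the mapping
df = pd.DataFrame({'type': ['Private', 'Unknown'],
                   'country': ['USA', 'GB'],
                   'potential': ['low', 'mid']})

# Already inverted mapping (label -> key), as built in the answer
mapper = {'type': {'Private': 'A', 'Institutional': 'B'},
          'country': {'USA': '1', 'GB': '2', 'Ireland': '3'},
          'potential': {'low': '1', 'mid': '2', 'high': '3'}}

# Map each column; values without a key become NaN
codes = df[list(mapper)].apply(lambda s: s.map(mapper[s.name]))

# Flag rows with at least one unmapped value so they can be inspected
unmapped = codes.isna().any(axis=1)

# Assumption: use 'X' as a placeholder code instead of failing on join
prefix = codes.fillna('X').agg('-'.join, axis=1)

# Append a running counter per identical prefix
ids = prefix + '-' + df.groupby(prefix).cumcount().add(1).astype(str)
print(ids.tolist())
print(unmapped.tolist())
```

Whether to fill with a placeholder or raise an error on unmapped rows depends on how strict the final ID scheme needs to be.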

zengzsys #2

Use:

import pandas as pd

# swap keys and values in the inner dictionaries (label -> key)
d = {k:{v1:k1 for k1, v1 in v.items()} for k, v in ID_classification.items()}

# map each column with d, then join the codes per row with '-'
s = pd.DataFrame([df[k].map(v) for k, v in d.items()]).agg('-'.join)

# append a per-group counter, grouping by the Series s
df['id'] = s + '-' + df.groupby(s).cumcount().add(1).astype(str)
print(df)
     client           type  country potential       id
0  Client 1        Private      USA       low  A-1-1-1
1  Client 2        Private  Ireland      high  A-3-3-1
2  Client 3  Institutional       GB       mid  B-2-2-1
3  Client 4  Institutional       GB       mid  B-2-2-2
4  Client 5  Institutional       GB       mid  B-2-2-3
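
Since the point of these IDs is that they carry the client's attributes, it may also be handy to go the other way and read an ID back. The sketch below is not from either answer: `decode_id` is a hypothetical helper, and it assumes that the keys themselves never contain the separator and that the ID segments follow the order of the dictionary.

```python
# Classification dictionary as given in the question (key -> label)
ID_classification = {'type': {'A': 'Private', 'B': 'Institutional'},
                     'country': {'1': 'USA', '2': 'GB', '3': 'Ireland'},
                     'potential': {'1': 'low', '2': 'mid', '3': 'high'}}


def decode_id(client_id, classification=ID_classification, sep='-'):
    """Hypothetical helper: turn 'B-2-2-3' back into readable attributes."""
    # The last segment is the running counter, the rest are classification keys
    *keys, counter = client_id.split(sep)
    if len(keys) != len(classification):
        raise ValueError(f'unexpected number of segments in {client_id!r}')
    # Pair each segment with its field, relying on dict insertion order
    decoded = {field: codes[key]
               for (field, codes), key in zip(classification.items(), keys)}
    decoded['counter'] = int(counter)
    return decoded


print(decode_id('B-2-2-3'))
# {'type': 'Institutional', 'country': 'GB', 'potential': 'mid', 'counter': 3}
```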
