I want to group a DataFrame by the values of its ID column. I have the following:
import pandas as pd

data = {
    "ID": ["1", "1", "2"],
    "start": [5, 6, 30],
    "end": [10, 20, 50],
    "label": ["age", "gender", "history"]
}
df = pd.DataFrame(data)
The result I'm looking for is something like this:
Dict = [{'ID': 1, 'Labels': [[5, 10, 'age'], [6, 20, 'gender']]}, {'ID': 2, 'Labels': [[30, 50, 'history']]}]
I've tried several approaches, but they all take a very long time. Is there a way to optimize this code?
inner_dict = {}
inner_list = []
middle_list = []
labels_dict = []
for idx in df.index:
    ID = df['ID'][idx]
    for ddx in df.index:
        if df['ID'][ddx] == ID:
            # The column is named 'label' (lowercase) in the DataFrame above
            inner_list = [df['start'][ddx], df['end'][ddx], df['label'][ddx]]
            middle_list.append(inner_list)
        else:
            inner_dict = {'ID': ID, 'Label': middle_list}
    # print(inner_dict)
    # inner_dict = ({'Label': inner_list})
    labels_dict.append(inner_dict)
1 Answer
bweufnob1#
Group the DataFrame inside a list comprehension and collect the labels for each ID:
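The answer's code did not survive on this page, so the following is only a minimal sketch of what such a list comprehension might look like, not necessarily the original answer. It assumes the column names from the question (`ID`, `start`, `end`, `label`); the variable names `result`, `key` and `group` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1", "1", "2"],
    "start": [5, 6, 30],
    "end": [10, 20, 50],
    "label": ["age", "gender", "history"],
})

# Iterating over df.groupby("ID") yields (key, sub-DataFrame) pairs.
# Each group's rows are converted to [start, end, label] lists.
result = [
    {"ID": key, "Labels": group[["start", "end", "label"]].values.tolist()}
    for key, group in df.groupby("ID")
]
```

This makes one pass over the groups instead of the nested loop over `df.index` in the question, which compares every row against every other row. Note that the IDs stay strings here because the sample data stores them as strings.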
Result
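The output itself is missing from this page. As a rough check, the self-contained snippet below rebuilds the grouped structure for the question's sample data with a plain dictionary and prints it. The names `grouped` and `result` are illustrative, and the IDs are assumed to remain strings as in the question's data.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1", "1", "2"],
    "start": [5, 6, 30],
    "end": [10, 20, 50],
    "label": ["age", "gender", "history"],
})

# Single pass over the rows; dicts preserve insertion order,
# so IDs appear in order of first occurrence.
grouped = {}
for row in df.itertuples(index=False):
    grouped.setdefault(row.ID, []).append([row.start, row.end, row.label])

result = [{"ID": key, "Labels": labels} for key, labels in grouped.items()]
print(result)
# [{'ID': '1', 'Labels': [[5, 10, 'age'], [6, 20, 'gender']]},
#  {'ID': '2', 'Labels': [[30, 50, 'history']]}]
```

If integer IDs are wanted, as in the question's desired output, converting the column first with `df["ID"].astype(int)` is one option.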