This question is similar to "pandas: split a string column into multiple columns and dynamically name columns". I modified the data as follows:
import pandas as pd

# Each fuzzy_market string is a comma-separated list of "[key: value]" tags;
# a key such as "Country" may appear more than once per study.
df = pd.DataFrame.from_dict({
    'study_id': {0: 'study1',
                 1: 'study2',
                 2: 'study3',
                 3: 'study4',
                 4: 'study5'},
    'fuzzy_market': {0: '[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], [Country: Austria], [Country: Germany], [Language: German]',
                     1: '[Country: Germany], [Management experience: Yes]',
                     2: '[Country: United Kingdom], [Language: English]',
                     3: '[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], [Country: Austria], [Country: Germany], [Language: German]',
                     4: '[Age: 48-99]'}})
I would like the output to look like this:
study_id  Age    Country of Birth  Country         Language  Management experience
study1    18-67  Austria           Austria         German    None
study1    18-67  Germany           Germany         German    None
study2    None   None              Germany         None      Yes
study3    None   None              United Kingdom  English   None
How should I modify the following code? Thanks.
p = df['fuzzy_market'].str.findall(r'([^:\[]+): ([^\]]+)')
df[['study_id']].join(pd.DataFrame(map(dict, p)))
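For reference, a minimal sketch of why the snippet above returns only one row per study, plus one possible way to get the layout shown in the expected output. Building a dict from the (key, value) pairs keeps only the last value for a repeated key, so "Country: Austria" is overwritten by "Country: Germany". The helper name expand is hypothetical, and the pairing rule is an assumption inferred from the expected output: the i-th occurrence of each repeated key goes on the i-th row, and a key that occurs once is repeated on every row of that study.

```python
import re
from collections import defaultdict

import pandas as pd

# Two rows taken from the question's data, shortened for the example.
df = pd.DataFrame({
    "study_id": ["study1", "study2"],
    "fuzzy_market": [
        "[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], "
        "[Country: Austria], [Country: Germany], [Language: German]",
        "[Country: Germany], [Management experience: Yes]",
    ],
})

# Same pattern as in the question: captures (key, value) from each "[key: value]".
PAIR = re.compile(r"([^:\[]+): ([^\]]+)")

# What the original snippet effectively does: dict() keeps only the LAST
# value for a repeated key, so each study collapses to a single row.
collapsed = pd.DataFrame([dict(PAIR.findall(s)) for s in df["fuzzy_market"]])


def expand(study_id, text):
    """Hypothetical helper: one output row per repeated-key occurrence."""
    values = defaultdict(list)
    for key, value in PAIR.findall(text):
        values[key].append(value)

    # The key with the most occurrences decides how many rows the study gets.
    n_rows = max((len(v) for v in values.values()), default=0)
    rows = []
    for i in range(n_rows):
        row = {"study_id": study_id}
        for key, vals in values.items():
            if len(vals) == 1:
                row[key] = vals[0]   # assumption: single values repeat on every row
            else:
                row[key] = vals[i] if i < len(vals) else None
        rows.append(row)
    return rows


rows = [
    row
    for study_id, text in zip(df["study_id"], df["fuzzy_market"])
    for row in expand(study_id, text)
]
result = pd.DataFrame(rows)
print(result)
```

Keys that a study does not have show up as NaN rather than None in result. A study whose string contains no "[key: value]" tag at all would produce no rows under this sketch.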
1 Answer
jv4diomz #1
Try this: