pandas: split a string column into multiple columns and dynamically name the columns

k4emjkb1  posted on 2022-11-20  in  Other

This question is similar to "pandas: split a string column into multiple columns and dynamically name columns". I have modified the data as follows:

import pandas as pd

df = pd.DataFrame.from_dict({'study_id': {0: 'study1',
  1: 'study2',
  2: 'study3',
  3: 'study4',
  4: 'study5'},
 'fuzzy_market': {0: '[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], [Country: Austria], [Country: Germany], [Language: German]',
  1: '[Country: Germany], [Management experience: Yes]',
  2: '[Country: United Kingdom], [Language: English]',
  3: '[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], [Country: Austria], [Country: Germany], [Language: German]',
  4: '[Age: 48-99]'}})

I would like the output to look like this:

study_id    Age     Country of Birth    Country         Language      Management experience
study1     18-67    Austria             Austria         German        None
study1     18-67    Germany             Germany         German        None
study2     None     None                Germany         None          Yes
study3     None     None                United Kingdom  English       None

How should I modify the code below? Thanks.

p = df['fuzzy_market'].str.findall(r'([^:\[]+): ([^\]]+)')
df[['study_id']].join(pd.DataFrame(map(dict, p)))
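
For reference, a quick check of what the pattern extracts may help explain why the snippet above gives only one row per study. `Series.str.findall` applies `re.findall` to each element, so the same pattern can be tried on a single string. The shortened `text` value is an illustrative assumption, not one of the rows from the data.

```python
import re

PAIR = r'([^:\[]+): ([^\]]+)'  # same pattern as in the snippet above
text = '[Country of Birth: Austria], [Country of Birth: Germany], [Language: German]'

# Each match becomes a (key, value) tuple.
pairs = re.findall(PAIR, text)
print(pairs)
# [('Country of Birth', 'Austria'), ('Country of Birth', 'Germany'), ('Language', 'German')]

# dict() keeps only the last value for a repeated key, so 'Austria' is lost.
collapsed = dict(pairs)
print(collapsed)
# {'Country of Birth': 'Germany', 'Language': 'German'}
```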

jv4diomz  #1

Try this:

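# Extract (key, value) pairs from each row and turn each list of pairs into a dict.
# Note: pop() removes 'fuzzy_market' from df, and dict() keeps only the last value of a repeated key.
data = [*df.pop('fuzzy_market').str.findall(r'([^:\[]+): ([^\]]+)').map(dict)]
res = df.join(pd.DataFrame(data, index=df.index))
print(res)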
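
If one row per repeated value is wanted, as in the expected output in the question, one possible sketch is to collect the values of each key into a list and emit one row per position. This is not part of the original answer. The helper name `expand_row`, the two-row `sample` frame, and the rule that a key with a single value is repeated on every row of its study are assumptions made for illustration.

```python
import re
from collections import defaultdict

import pandas as pd

PAIR = re.compile(r'([^:\[]+): ([^\]]+)')  # same pattern as in the question


def expand_row(study_id, text):
    """Hypothetical helper: turn one 'fuzzy_market' string into one or more row dicts."""
    grouped = defaultdict(list)
    for key, value in PAIR.findall(text):
        grouped[key].append(value)

    # One output row per position of the longest value list (at least one row).
    n_rows = max((len(values) for values in grouped.values()), default=1)
    rows = []
    for i in range(n_rows):
        row = {'study_id': study_id}
        for key, values in grouped.items():
            if len(values) == 1:
                row[key] = values[0]  # single value: repeat it on every row
            else:
                row[key] = values[i] if i < len(values) else None
        rows.append(row)
    return rows


# Small sample in the shape of the question's data (assumed for illustration).
sample = pd.DataFrame({
    'study_id': ['study1', 'study2'],
    'fuzzy_market': [
        '[Age: 18-67], [Country of Birth: Austria], [Country of Birth: Germany], '
        '[Country: Austria], [Country: Germany], [Language: German]',
        '[Country: Germany], [Management experience: Yes]',
    ],
})

records = [
    row
    for study_id, text in zip(sample['study_id'], sample['fuzzy_market'])
    for row in expand_row(study_id, text)
]
expanded = pd.DataFrame(records)  # keys missing for a study become NaN
print(expanded)
```

With this sample, study1 yields two rows (Austria/Austria and Germany/Germany, with Age and Language repeated) and study2 yields one. Repeated values are paired by position, which matches the expected output only if the lists for different keys are meant to line up that way.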
