Pandas: Converting a DataFrame with a MultiIndex to a dict

Posted by ecbunoof on 2023-03-06 in Other

Another Pandas newbie question. I want to convert a DataFrame to a dictionary, but in a different shape from what DataFrame.to_dict() produces. An example:

import pandas as pd

df = pd.DataFrame({'co':['DE','DE','FR','FR'],
                   'tp':['Lake','Forest','Lake','Forest'],
                   'area':[10,20,30,40],
                   'count':[7,5,2,3]})
# Use 'co' and 'tp' as a two-level MultiIndex
df = df.set_index(['co','tp'])

Before:

           area  count
co tp
DE Lake      10      7
   Forest    20      5
FR Lake      30      2
   Forest    40      3

After:

{('DE', 'Lake', 'area'): 10,
 ('DE', 'Lake', 'count'): 7,
 ('DE', 'Forest', 'area'): 20,
 ...
 ('FR', 'Forest', 'count'): 3 }

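The dict keys should be tuples made of the row's index values plus the column header, and the dict values should be the individual DataFrame values. For the example above, I managed to come up with this expression: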

after = {(r[0],r[1],c):df.loc[r,c] for c in df.columns for r in df.index}

How can I generalize this code to a MultiIndex with N levels (instead of 2)?

Answering my own question:

Thanks to DSM's answer, I found out that I actually just need tuple concatenation, r+(c,), and the two-level comprehension above generalizes to N levels:

after = {r + (c,): df.loc[r,c] for c in df.columns for r in df.index}
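To see that the tuple-concatenation form really is independent of the number of levels, here is a minimal sketch with a three-level index. The extra 'yr' level and the names df3 and after3 are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical data: same as the question, plus an invented third level 'yr'.
df3 = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                    'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                    'yr': [2020, 2021, 2020, 2021],
                    'area': [10, 20, 30, 40],
                    'count': [7, 5, 2, 3]})
df3 = df3.set_index(['co', 'tp', 'yr'])

# Each r is a 3-tuple of index values; r + (c,) appends the column name.
after3 = {r + (c,): df3.loc[r, c] for c in df3.columns for r in df3.index}
```

Note that this relies on every r being a tuple, which holds for a MultiIndex. With a plain single-level index, r would be a scalar and r + (c,) would fail, so that case would need (r, c) instead.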
Answer 1 (wb1gzix0)

How about this:

>>> df
           area  count
co tp                 
DE Lake      10      7
   Forest    20      5
FR Lake      30      2
   Forest    40      3
>>> after = {r + (k,): v for r, kv in df.iterrows() for k,v in kv.to_dict().items()}
>>> import pprint
>>> pprint.pprint(after)
{('DE', 'Forest', 'area'): 20,
 ('DE', 'Forest', 'count'): 5,
 ('DE', 'Lake', 'area'): 10,
 ('DE', 'Lake', 'count'): 7,
 ('FR', 'Forest', 'area'): 40,
 ('FR', 'Forest', 'count'): 3,
 ('FR', 'Lake', 'area'): 30,
 ('FR', 'Lake', 'count'): 2}
Answer 2 (5jvtdoz2)

df.stack().to_dict()

Output:

{('DE', 'Lake', 'area'): 10,
 ('DE', 'Lake', 'count'): 7,
 ('DE', 'Forest', 'area'): 20,
 ('DE', 'Forest', 'count'): 5,
 ('FR', 'Lake', 'area'): 30,
 ('FR', 'Lake', 'count'): 2,
 ('FR', 'Forest', 'area'): 40,
 ('FR', 'Forest', 'count'): 3}
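As a quick sanity check, the sketch below compares the stack() one-liner against the dict comprehension from the question on the same data. The names via_stack and via_loop are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'co': ['DE', 'DE', 'FR', 'FR'],
                   'tp': ['Lake', 'Forest', 'Lake', 'Forest'],
                   'area': [10, 20, 30, 40],
                   'count': [7, 5, 2, 3]})
df = df.set_index(['co', 'tp'])

# stack() moves the column labels into an extra innermost index level,
# so the resulting Series is keyed by (co, tp, column) tuples.
via_stack = df.stack().to_dict()

# The comprehension from the question, for comparison.
via_loop = {r + (c,): df.loc[r, c] for c in df.columns for r in df.index}

# Both dicts hold the same keys and values (dict comparison ignores order).
same = via_stack == via_loop
```

One thing to keep in mind: stack() produces a single Series, so columns of different dtypes are converted to a common dtype (for example, an integer column next to a float column comes out as floats). With the all-integer data above this makes no difference.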
