pandas: How to add subtotal columns to a DataFrame with multilevel columns?

Posted by yruzcnhs on 2022-12-17 in Other

I have a DataFrame whose columns are a MultiIndex with 3 levels:

quarter           Q1                        Q2                        Totals
year              2021        2022           2021         2022                      
                 qty orders  qty orders    qty orders   qty orders   qty orders
month name                                       
January          40  2        5   1         1   2         0 0             46  5
February         20  8        2   3         4   6         0 0             26  17
March            2  10        7   4         3   3         0 0             12  17
Totals           62 20       14   8         8   11        0 0             84  39

After grouping the columns by levels (0, 2), I get the following subtotal DataFrame:

quarter           Q1           Q2          Totals                     
                 qty orders  qty orders    qty orders  
month name                                       
January          45  3        1   2         46   5     
February         22  10       4   6         26   16     
March            9  14        3   3         12   17   
Totals           76 28        8   11        84   39
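For reference, the grouping step mentioned above might look like the following sketch. The sample frame is an assumption: it reproduces only the Q1 and Q2 columns (no Totals column) and leaves the column levels unnamed. The transpose-based `groupby` is used here because grouping directly on columns with `axis=1` is deprecated in recent pandas versions. Note that Q1 orders for February sums to 8 + 3 = 11 with this data.

```python
import pandas as pd

# Assumed sample data: quarters x years x measures, without the Totals column
df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

# Sum over the year level by grouping on column levels 0 (quarter) and 2 (measure).
# sort=False keeps the groups in order of first appearance (qty before orders).
sub = df.T.groupby(level=[0, 2], sort=False).sum().T

print(sub)
```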

I need to insert the second DataFrame into the first without disturbing the columns, levels, or index, so that I get the following DataFrame:

quarter       Q1                                   Q2                        Totals
year        2021        2022      Subtotal    2021        2022     Subtotal                 
            qty orders qty orders qty orders qty orders qty orders qty orders qty orders
month name                                       
January     40  2       5   1     45   3       1  2       0  0       1  2     46  5
February    20  8       2   3     22   10      4  6       0  0       4  6     26  16
March       2  10       7   4     9    14      3  3       0  0       3  3     12  17
Totals      62 20      14   8     76   28      8  11      0  0       8  11    84 39

How can I do this?

pandas

Source: https://stackoverflow.com/questions/74733258/how-to-add-sub-total-columns-to-a-multilevel-columns-dataframe

1 Answer

nfs0ujit1#

Starting from the initial DataFrame (before the groupby):

import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

Here is one way to do it (using product from the itertools module in the Python standard library; nested for loops would work as well):

from itertools import product

# Add a subtotal column (2021 + 2022) for each quarter and measure
for level1, level2 in product(["Q1", "Q2"], ["qty", "orders"]):
    df.loc[:, (level1, "subtotal", level2)] = (
        df.loc[:, (level1, "2021", level2)] + df.loc[:, (level1, "2022", level2)]
    )

# Sort columns
df = df.reindex(
    pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
    ),
    axis=1,
)

Then:

print(df)
# Output
           Q1                                      Q2                     \
         2021        2022        subtotal        2021        2022
          qty orders  qty orders      qty orders  qty orders  qty orders   
January    40      2    5      1       45      3    1      2    0      0   
February   20      8    2      3       22     11    4      6    0      0   
March       2     10    7      4        9     14    3      3    0      0   
Totals     62     20   14      8       76     28    8     11    0      0   

         subtotal
              qty orders  
January         1      2  
February        4      6  
March           3      3  
Totals          8     11
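The output above does not include the Totals columns from the question. As a possible variation (not the answerer's method), the sketch below builds the subtotals and grand totals with a transposed `groupby`, relabels them to three column levels, and then uses `pd.concat` plus `reindex` to place them. The empty string used for the middle level of the Totals columns and the lowercase "subtotal" label are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

quarters, years, measures = ("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")

# Subtotals per (quarter, measure), relabelled with a "subtotal" middle level
sub = df.T.groupby(level=[0, 2], sort=False).sum().T
sub.columns = pd.MultiIndex.from_tuples([(q, "subtotal", m) for q, m in sub.columns])

# Grand totals per measure; the "" middle level is an assumed placeholder
tot = df.T.groupby(level=2, sort=False).sum().T
tot.columns = pd.MultiIndex.from_tuples([("Totals", "", m) for m in tot.columns])

# Glue everything together, then put the columns in the desired order
order = [(q, y, m) for q in quarters for y in years + ("subtotal",) for m in measures]
order += [("Totals", "", m) for m in measures]
result = pd.concat([df, sub, tot], axis=1).reindex(
    columns=pd.MultiIndex.from_tuples(order)
)

print(result)
```

Spelling out the column order explicitly avoids `sort_index`, which would sort the labels lexicographically and place "orders" before "qty".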
