pandas: How do I insert a pre-initialized DataFrame, or several columns, into another DataFrame at a specified column position?

c90pui9n  posted on 2023-01-28  in Other

Suppose we have the following DataFrame.

  col1 col2   col3
0  one  two  three
1  one  two  three
2  one  two  three
3  one  two  three
4  one  two  three

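We want to add 31 columns to this DataFrame, one for each day of the month.
They should be inserted exactly between col2 and col3.
How can we achieve this?
For simplicity, the new columns can be numbered 1 to 31.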

Starting source code

import pandas as pd

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })
q7solyqu #1

Another possible solution:

pd.concat([src.iloc[:, :2].assign(
    **{str(col): 0 for col in range(1, 32)}), src['col3']], axis=1)

Output:

  col1 col2  1  2  3  4  5  6  7  8  ...  23  24  25  26  27  28  29  30  31  \
0  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
1  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
2  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
3  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   
4  one  two  0  0  0  0  0  0  0  0  ...   0   0   0   0   0   0   0   0   0   

    col3  
0  three  
1  three  
2  three  
3  three  
4  three  

[5 rows x 34 columns]
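
One detail of this approach is worth spelling out: `assign` receives the new columns as keyword arguments, so their labels have to be strings. The sketch below repeats the same call on the question's `src` and checks the resulting labels; the variable names `day_cols` and `out` are only illustrative.

```python
import pandas as pd

src = pd.DataFrame({
    "col1": ["one"] * 5,
    "col2": ["two"] * 5,
    "col3": ["three"] * 5,
})

# assign() takes keyword arguments, so the new labels must be strings.
day_cols = {str(day): 0 for day in range(1, 32)}

# First two columns plus the 31 new ones, then col3 appended on the right.
out = pd.concat([src.iloc[:, :2].assign(**day_cols), src["col3"]], axis=1)

print(out.shape)                 # (5, 34)
print(out.columns[:4].tolist())  # ['col1', 'col2', '1', '2']
```

If integer labels are needed instead, the string labels would have to be converted afterwards, or the columns built some other way.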
ut6juiuv #2

You can use pd.concat and slice the columns with iloc to put them in the order you want, as follows:

import numpy as np

# Create a dataframe with 31 columns and 5 rows
tmp = pd.DataFrame(np.zeros((5, 31)), columns=range(1, 32))

# Concat two dataframes and reorder columns as you like
df = pd.concat([src.iloc[:,:2], tmp, src.iloc[:, 2:]], axis=1)

Output:

  col1 col2    1    2    3    4    5    6    7    8  ...   23   24   25   26  \
0  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
1  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
2  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
3  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   
4  one  two  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0   

    27   28   29   30   31   col3  
0  0.0  0.0  0.0  0.0  0.0  three  
1  0.0  0.0  0.0  0.0  0.0  three  
2  0.0  0.0  0.0  0.0  0.0  three  
3  0.0  0.0  0.0  0.0  0.0  three  
4  0.0  0.0  0.0  0.0  0.0  three  

[5 rows x 34 columns]
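
The same slicing idea can be wrapped in a small helper for the general case in the title, where the block to insert is a DataFrame that already exists. This is only a sketch: `insert_frame` is a hypothetical name, not a pandas function, and it assumes the rows of the two frames correspond by position. Because `pd.concat(axis=1)` aligns rows on the index, the helper first gives the block the target's index.

```python
import pandas as pd


def insert_frame(df: pd.DataFrame, block: pd.DataFrame, loc: int) -> pd.DataFrame:
    """Return a new frame with the columns of `block` placed before position `loc`.

    Hypothetical helper: assumes both frames have the same number of rows
    and that rows match by position, not by index label.
    """
    # concat(axis=1) aligns on the index, so reuse df's index for the block.
    block = block.set_axis(df.index, axis=0)
    return pd.concat([df.iloc[:, :loc], block, df.iloc[:, loc:]], axis=1)


# A non-default index on src shows why the alignment step matters.
src = pd.DataFrame(
    {"col1": ["one"] * 5, "col2": ["two"] * 5, "col3": ["three"] * 5},
    index=list("abcde"),
)

# Pre-initialized block of integer zeros with columns labelled 1..31.
days = pd.DataFrame(0, index=range(len(src)), columns=range(1, 32))

out = insert_frame(src, days, loc=2)
print(out.shape)  # (5, 34)
```

Without the `set_axis` step, the two indexes in this example would not overlap and the concatenation would produce extra rows filled with NaN.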
34gzjxbg #3

I would assign the new columns to the original DataFrame and then reorder the columns with column selection.

src[list(range(1, 32))] = 0
src = src[[*src.columns[:2], *range(1, 32), src.columns[2]]]

Or, for a brand-new copy, use assign:

cols = list(map(str, range(1, 32)))
new_df = (
    src
    .assign(**dict.fromkeys(cols, 0))
    .reindex(columns=[*src.columns[:2], *cols, *src.columns[2:]])
)
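
For comparison, pandas also provides `DataFrame.insert`, which places a single column at a given position and modifies the frame in place. The sketch below is not part of the answer above; it applies that method in a loop, and the names `df` and `start` are illustrative.

```python
import pandas as pd

src = pd.DataFrame({
    "col1": ["one"] * 5,
    "col2": ["two"] * 5,
    "col3": ["three"] * 5,
})

df = src.copy()  # insert() works in place, so keep src untouched

# Position right after col2, looked up by name rather than hard-coded.
start = df.columns.get_loc("col2") + 1

for offset, day in enumerate(range(1, 32)):
    # insert(loc, column, value): each new column goes one slot further right.
    df.insert(start + offset, day, 0)

print(df.shape)  # (5, 34)
```

For 31 columns this is fine, but repeated `insert` calls add one column at a time, so for a large number of columns a single `concat` or `reindex` is likely the better fit.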

j13ufse2 #4

If your goal is to add and initialize new columns, use reindex:

cols = list(src)
cols[2:2] = range(1,31+1)

df = src.reindex(columns=cols, fill_value=0)

Output:

  col1 col2  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31   col3
0  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
1  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
2  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
3  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
4  one  two  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  three
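
The key step here is the slice assignment `cols[2:2] = ...`, which inserts into the list of labels without replacing anything. A slightly more general sketch, assuming the insertion point should be found by column name instead of a hard-coded 2 (the variable `pos` is illustrative):

```python
import pandas as pd

src = pd.DataFrame({
    "col1": ["one"] * 5,
    "col2": ["two"] * 5,
    "col3": ["three"] * 5,
})

cols = list(src.columns)
pos = cols.index("col2") + 1   # position right after col2

# Assigning to an empty slice inserts the new labels at that position.
cols[pos:pos] = range(1, 32)

# reindex keeps existing columns as they are; only the new labels get fill_value.
df = src.reindex(columns=cols, fill_value=0)

print(df.shape)  # (5, 34)
```

Existing columns keep their values, so `fill_value` only affects the labels that were not in `src` before.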
fjaof16o #5

For reference:
import pandas as pd
import numpy as np

src = pd.DataFrame({'col1': ['one', 'one', 'one', 'one','one'],    
                    'col2': ['two', 'two', 'two', 'two','two'],    
                    'col3': ['three', 'three', 'three', 'three','three'],
                    })

m = np.matrix([0]*31) # Builds a 1 x 31 NumPy matrix of zeros
df = pd.DataFrame(m) # Converts the matrix to a one-row dataframe
df.columns = df.columns+1 # Shifts the column labels from 0..30 to 1..31

# Operations on the dataframe: extend to len(src) rows + reset index + replace NaN with 0
df = (df.reindex(list(range(0, len(src))))
        .reset_index(drop=True)
        .fillna(0))

df = pd.concat([src.iloc[:, :2], df, src.iloc[:, 2:]], axis=1) # inserts by slicing source in two parts
Result:
  col1 col2    1    2    3    4    5  ...   26   27   28   29   30   31   col3
0  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
1  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
2  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
3  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three
4  one  two  0.0  0.0  0.0  0.0  0.0  ...  0.0  0.0  0.0  0.0  0.0  0.0  three

[5 rows x 34 columns]
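
The zeros appear as 0.0 because extending a one-row frame with reindex first creates NaN rows, which turns the integer columns into floats before fillna runs. The sketch below reproduces that effect and, as an alternative that is not part of the answer above, builds the block at full size so the zeros stay integers. The names `one_row`, `stretched` and `direct` are illustrative.

```python
import numpy as np
import pandas as pd

n_rows = 5

# One row of integer zeros, then stretched to n_rows: the new rows are NaN,
# so every column is upcast to float before fillna(0) runs.
one_row = pd.DataFrame(np.zeros((1, 31), dtype=int), columns=range(1, 32))
stretched = one_row.reindex(range(n_rows)).fillna(0)

# Building the block at full size avoids NaN altogether and keeps integers.
direct = pd.DataFrame(0, index=range(n_rows), columns=range(1, 32))

print(stretched.dtypes.iloc[0].kind)  # 'f' (float)
print(direct.dtypes.iloc[0].kind)     # 'i' (integer)
```

Either block can then be placed between the two slices of src with pd.concat, as in the code above.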
