Python: concatenate DataFrames and reset the ID column

Posted by h22fl7wq on 2023-01-01 in Python
Followers (0) | Answers (5) | Views (129)

Given the following DataFrames:

   ID Name  Time
0   0    A   100
1   1    B    70

   ID Name  Time
0   0    C    40
1   1    D    90

I want to concatenate them row-wise and renumber the ID column, so the final DataFrame should be:

   ID Name  Time
0   0    A   100
1   1    B    70
2   2    C    40
3   3    D    90

The code is:

big_df = pd.DataFrame()
for i in range(1,3):
    fname = 'test_' + str(i) + '.csv'
    small_df = pd.read_csv(fname, skiprows=[1])
    print(small_df)
    frames = [big_df, small_df]
    big_df = pd.concat(frames) 
    i += 1
big_df.set_index('ID', inplace=True)
print(big_df)

But the output is:

Name  Time
ID           
0     A   100
1     B    70
0     C    40
1     D    90

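I want the index values copied into the ID column, but I know set_index turns that column into the index. How can I fix the code to achieve this?

Update:

I found that big_df['ID'] = big_df.index copies the index values into the ID column.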

nukf8bse  #1

One option is to use concat with keys and add a per-frame offset to ID:

dfs = [df1, df2]

dic = dict(enumerate(map(len, dfs), start=1))
dic[0] = 0

out = (pd
  .concat(dfs, keys=range(len(dfs)))
  .assign(ID=lambda d: d['ID'].add(d.index.get_level_values(0).map(dic)))
  .reset_index(drop=True)
)

Output:

   ID Name  Time
0   0    A   100
1   1    B    70
2   2    C    40
3   3    D    90
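
Note that the dictionary above maps each frame to the length of the frame just before it, which is enough for two frames. With three or more frames the offset presumably needs to be the running total of all earlier lengths. The sketch below is one way to do that with itertools.accumulate; the third frame df3 is made up for illustration, and the approach assumes every frame's ID starts at 0 and counts up by 1.

```python
from itertools import accumulate

import pandas as pd

df1 = pd.DataFrame({"ID": [0, 1], "Name": ["A", "B"], "Time": [100, 70]})
df2 = pd.DataFrame({"ID": [0, 1], "Name": ["C", "D"], "Time": [40, 90]})
# Hypothetical third frame, not part of the original question.
df3 = pd.DataFrame({"ID": [0, 1], "Name": ["E", "F"], "Time": [55, 65]})
dfs = [df1, df2, df3]

# Offset for frame k = total number of rows in frames 0..k-1.
# accumulate(..., initial=0) needs Python 3.8 or newer.
offsets = dict(enumerate(accumulate(map(len, dfs), initial=0)))

out = (
    pd.concat(dfs, keys=range(len(dfs)))
    # Level 0 of the MultiIndex is the frame number; map it to its offset.
    .assign(ID=lambda d: d["ID"] + d.index.get_level_values(0).map(offsets).to_numpy())
    .reset_index(drop=True)
)

print(out)
```

With these three frames the ID column comes out as 0 through 5.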

vulvrdjw  #2

Have you tried the ignore_index option when concatenating?

pd.concat(frames, ignore_index=True)
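
Applied to the loop in the question, this could look roughly like the sketch below: read every file into a list first, then call concat once instead of growing big_df on each pass. The CSV contents are stand-ins held in memory (an assumption, since the real test_1.csv and test_2.csv are not shown), and skiprows=[1] from the question is left out because its purpose is not clear from the post.

```python
import io

import pandas as pd

# Stand-ins for test_1.csv and test_2.csv (hypothetical contents).
csv_files = {
    "test_1.csv": "ID,Name,Time\n0,A,100\n1,B,70\n",
    "test_2.csv": "ID,Name,Time\n0,C,40\n1,D,90\n",
}

# Read every file first, then concatenate once with a fresh 0..n-1 index.
frames = [pd.read_csv(io.StringIO(text)) for text in csv_files.values()]
big_df = pd.concat(frames, ignore_index=True)

print(big_df.index.tolist())  # [0, 1, 2, 3]
```

This renumbers the row index only; the ID column keeps the values read from each file.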

hc8w905p  #3

You can do the following:

import pandas as pd

big_df = pd.DataFrame({'ID': [0, 1], 'Name': ['A', 'B'], 'Time': [100, 70]})
small_df = pd.DataFrame({'ID': [0, 1], 'Name': ['C', 'D'], 'Time': [40, 90]})

df = pd.concat([big_df, small_df])
df = df.reset_index(drop=True)

print(df)
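
As written, reset_index(drop=True) renumbers the index but the ID column still reads 0, 1, 0, 1. One possible way to rebuild ID from the new index, sketched here with the same two frames, is to drop the old column, name the index ID, and move it back into the columns. The variable name fixed is only illustrative.

```python
import pandas as pd

big_df = pd.DataFrame({"ID": [0, 1], "Name": ["A", "B"], "Time": [100, 70]})
small_df = pd.DataFrame({"ID": [0, 1], "Name": ["C", "D"], "Time": [40, 90]})

df = pd.concat([big_df, small_df]).reset_index(drop=True)
print(df["ID"].tolist())  # [0, 1, 0, 1]

# Discard the stale ID values, label the fresh index "ID",
# then turn that index back into the first column.
fixed = df.drop(columns="ID").rename_axis("ID").reset_index()
print(fixed)
```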

qvsjd97n  #4

If you are trying to concatenate DataFrames, you can do it like this:

In [1]: df1 = pd.DataFrame(
   ...:     {
   ...:         "A": ["A0", "A1", "A2", "A3"],
   ...:         "B": ["B0", "B1", "B2", "B3"],
   ...:         "C": ["C0", "C1", "C2", "C3"],
   ...:         "D": ["D0", "D1", "D2", "D3"],
   ...:     },
   ...:     index=[0, 1, 2, 3],
   ...: )
   ...: 

In [2]: df2 = pd.DataFrame(
   ...:     {
   ...:         "A": ["A4", "A5", "A6", "A7"],
   ...:         "B": ["B4", "B5", "B6", "B7"],
   ...:         "C": ["C4", "C5", "C6", "C7"],
   ...:         "D": ["D4", "D5", "D6", "D7"],
   ...:     },
   ...:     index=[4, 5, 6, 7],
   ...: )
   ...: 

In [3]: df3 = pd.DataFrame(
   ...:     {
   ...:         "A": ["A8", "A9", "A10", "A11"],
   ...:         "B": ["B8", "B9", "B10", "B11"],
   ...:         "C": ["C8", "C9", "C10", "C11"],
   ...:         "D": ["D8", "D9", "D10", "D11"],
   ...:     },
   ...:     index=[8, 9, 10, 11],
   ...: )
   ...: 

In [4]: frames = [df1, df2, df3]

In [5]: result = pd.concat(frames, ignore_index=True)

For more documentation, see:
https://pandas.pydata.org/docs/user_guide/merging.html

qpgpyjmq  #5

Here is a proposition using pandas.concat and pandas.DataFrame.index:

big_df = pd.concat([df1, df2], ignore_index=True).assign(ID=lambda x: x.index)
# Output:
print(big_df)

   ID Name  Time
0   0    A   100
1   1    B    70
2   2    C    40
3   3    D    90
