pandas: creating a DataFrame from a CSV that contains only rows

xt0899hw  posted on 2023-04-28  in Other

I have a CSV file that looks like this:

feature1
feature2
feature3
f1_v1
f2_v1
f3_v1
f1_v2
f2_v2
f3_v2
...

I want to get a DataFrame like this:

  feature1 feature2 feature3
0    f1_v1    f2_v1    f3_v1
1    f1_v2    f2_v2    f3_v2
...

How can I do this?

hi3rlvi2  1#

I don't know why so many people are afraid of preprocessing their data to make it fit pandas, but this should be a common strategy.

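import pandas as pd

headers = []
rows = []

# Use a context manager so the file is closed after reading
with open('x.csv') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The first three lines are the column names
        if len(headers) < 3:
            headers.append(line)
            continue
        # Start a new row once the current one holds three values
        if not rows or len(rows[-1]) == 3:
            rows.append([])
        rows[-1].append(line)

df = pd.DataFrame(rows, columns=headers)
print(df)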

Output:

  feature1 feature2 feature3
0    f1_v1    f2_v1    f3_v1
1    f1_v2    f2_v2    f3_v2
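
The same preprocessing idea can be wrapped in a small function when the number of columns varies. This is only a minimal sketch: the helper name `rows_to_frame` and its `n_cols` parameter are made up for illustration, and it assumes the column count is known in advance and that blank lines can be ignored. An in-memory `io.StringIO` stands in for the real file.

```python
import io
import pandas as pd


def rows_to_frame(lines, n_cols):
    """Build a DataFrame from one-value-per-line input (hypothetical helper).

    The first n_cols non-empty lines are the column names; the rest are
    values listed record by record.
    """
    values = [line.strip() for line in lines if line.strip()]
    headers, body = values[:n_cols], values[n_cols:]
    if len(body) % n_cols:
        raise ValueError("number of values is not a multiple of n_cols")
    # Slice the flat list into consecutive chunks of n_cols values
    rows = [body[i:i + n_cols] for i in range(0, len(body), n_cols)]
    return pd.DataFrame(rows, columns=headers)


# In-memory stand-in for the file from the question
sample = io.StringIO(
    "feature1\nfeature2\nfeature3\n"
    "f1_v1\nf2_v1\nf3_v1\n"
    "f1_v2\nf2_v2\nf3_v2\n"
)
df = rows_to_frame(sample, n_cols=3)
print(df)
```

With a real file, pass the open file object instead of `sample`, for example inside a `with open(...) as f:` block.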

yks3o0rb  2#

You can read all the lines first and then reshape them with numpy:

import pandas as pd
import numpy as np

with open('text.csv', 'r') as f:
    data = f.readlines()
# Strip trailing newlines from every line
data = [line.strip() for line in data]
# One row per record, three values each; row 0 holds the column names
data = np.array(data).reshape(-1, 3)
data = pd.DataFrame(data[1:, :], columns=data[0])
print(data.head())
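
One detail worth checking with this approach is the order of the reshape arguments: the fixed dimension has to be the number of columns, with `-1` left for the number of rows. A small sketch with made-up sample data (three records, because with exactly two records the array happens to be 3x3 either way and the difference stays hidden):

```python
import numpy as np

# Flat values in file order (made-up sample): 3 headers, then 3 records
flat = np.array([
    "feature1", "feature2", "feature3",
    "f1_v1", "f2_v1", "f3_v1",
    "f1_v2", "f2_v2", "f3_v2",
    "f1_v3", "f2_v3", "f3_v3",
])

good = flat.reshape(-1, 3)  # 4 rows of 3: one row per record
bad = flat.reshape(3, -1)   # 3 rows of 4: records get split across rows

print(good[1])  # ['f1_v1' 'f2_v1' 'f3_v1']
print(bad[1])   # ['f2_v1' 'f3_v1' 'f1_v2' 'f2_v2']
```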

cczfrluj  3#

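import pandas as pd

# Read every line, including the three header names, into a single column
df = pd.read_csv('filename.csv', header=None, names=['value'])

# The first three lines are the column names; the rest are the values
headers = df['value'].iloc[:3].tolist()
long_df = df.iloc[3:].reset_index(drop=True)

# Label each value with its output row and the feature it belongs to
long_df['row'] = long_df.index // 3
long_df['feature'] = [headers[i % 3] for i in long_df.index]

# Reshape using pivot_table(); aggfunc='first' keeps the string values as-is
new_df = pd.pivot_table(long_df, index='row', columns='feature', values='value', aggfunc='first')[headers]

# Clear the axis names to remove the "feature" and "row" labels
new_df.columns.name = None
new_df.index.name = None

# Display the new DataFrame
print(new_df)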

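And this is how you would change it depending on the number of values in each row:

import pandas as pd

# Number of values per output row; change this to match the file
n_cols = 2

# Read every line, including the header names, into a single column
df = pd.read_csv('filename.csv', header=None, names=['value'])

# The first n_cols lines are the column names; the rest are the values
headers = df['value'].iloc[:n_cols].tolist()
long_df = df.iloc[n_cols:].reset_index(drop=True)

# Label each value with its output row and the feature it belongs to
long_df['row'] = long_df.index // n_cols
long_df['feature'] = [headers[i % n_cols] for i in long_df.index]

# Reshape the DataFrame using pivot_table()
new_df = pd.pivot_table(long_df, index='row', columns='feature', values='value', aggfunc='first')[headers]

# Clear the axis names to remove the "feature" and "row" labels
new_df.columns.name = None
new_df.index.name = None

# Display the new DataFrame
print(new_df)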
