Pandas DataFrame created from a dictionary vs. from a list

3bygqnnd  posted on 2023-03-16  in Other

Is there a line or two of code that would make a DataFrame created from a list behave like one created from a dictionary?

#DataFrame created from dictionary, this works:
import pandas as pd
data= {'Salary': [30000, 40000, 50000, 85000, 75000],            
        'Exp': [1, 3, 5, 10, 25],          
        'Gender': ['M','F', 'M', 'F', 'M']} 
df = pd.DataFrame(data)
print(df), print()

new_df1 = df[df['Salary'] >= 50000]
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis = 0, ascending=[False])
print(new_df2)

#This doesn't work with the df.functions, sort and conditionals    
data = [['Salary', 'Exp', 'Gender'],[30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

df = pd.DataFrame(data)
print(df), print()

new_df1 = df[df['Salary'] >= 50000]  #doesn't work
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis = 0, ascending=[False])  #ditto
print(new_df2)

qxsslcnc1#

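In your second snippet, the first sublist is treated as data rather than as the column names.
Instead, pass the first sublist as the columns argument of the DataFrame constructor and the remaining sublists as the data: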

df = pd.DataFrame(data[1:], columns=data[0])

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M
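
As a rough illustration of why this matches the dictionary version: the same rows can also be regrouped into a column-oriented dict, which is the shape the first snippet in the question used. This is only a sketch, and the names `header`, `rows`, `as_dict`, `from_rows` and `from_dict` are made up for the example.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'], [40000, 3, 'F'],
        [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# Split the header row from the data rows.
header, *rows = data

# Row-oriented construction, as in the answer above.
from_rows = pd.DataFrame(rows, columns=header)

# Column-oriented construction: zip(*rows) transposes the rows into
# columns, and each column is paired with its name from the header.
as_dict = {name: list(col) for name, col in zip(header, zip(*rows))}
from_dict = pd.DataFrame(as_dict)

# Compare the two frames (labels, values and dtypes).
same = from_rows.equals(from_dict)
print(same)
```

Either construction should give a frame that supports `df['Salary']` and `sort_values(['Exp'])`, so which one to use is mostly a matter of how the data arrives.
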
Why your code failed

Your code incorrectly maps the first sublist as data, so the columns get the default integer labels 0, 1, 2:

pd.DataFrame(data)

        0    1       2   # incorrect header
0  Salary  Exp  Gender   # this shouldn't be a data row
1   30000    1       M
2   40000    3       F
3   50000    5       M
4   85000   10       F
5   75000   25       M
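
A quick check along these lines (just a sketch, with `raw` as an arbitrary name) should confirm that the column labels are integers, so a lookup by `'Salary'` has nothing to match:

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'], [40000, 3, 'F'],
        [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# No columns argument: pandas labels the columns by position.
raw = pd.DataFrame(data)

column_labels = list(raw.columns)       # positional integer labels
has_salary = 'Salary' in raw.columns    # the name only exists as a cell value

# Selecting a label that is not in the columns raises KeyError.
try:
    raw['Salary']
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(column_labels, has_salary, lookup_failed)
```
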
Full code:
df = pd.DataFrame(data[1:], columns=data[0])
print(df), print()

new_df1 = df[df['Salary'] >= 50000]  #works now
print(new_df1), print()

new_df2 = df.sort_values(['Exp'], axis = 0, ascending=[False])  #ditto
print(new_df2)

Output:

   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
2   50000    5      M
3   85000   10      F
4   75000   25      M

   Salary  Exp Gender
4   75000   25      M
3   85000   10      F
2   50000    5      M
1   40000    3      F
0   30000    1      M

q5iwbnjs2#

You need to create the DataFrame from all the values except the first sublist, and pass that first sublist to the columns parameter:

#The first sublist holds the column names, the rest are the data rows
data = [['Salary', 'Exp', 'Gender'],[30000, 1, 'M'],
        [40000, 3, 'F'], [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

df = pd.DataFrame(data[1:], columns=data[0])
print(df), print()
   Salary  Exp Gender
0   30000    1      M
1   40000    3      F
2   50000    5      M
3   85000   10      F
4   75000   25      M

new_df1 = df[df['Salary'] >= 50000]  #working well
print(new_df1), print()
   Salary  Exp Gender
2   50000    5      M
3   85000   10      F
4   75000   25      M

new_df2 = df.sort_values(['Exp'], axis = 0, ascending=[False])  #ditto
print(new_df2)

   Salary  Exp Gender
4   75000   25      M
3   85000   10      F
2   50000    5      M
1   40000    3      F
0   30000    1      M
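
If the DataFrame has already been built the wrong way and you cannot easily rebuild it from the list, one possible repair is to promote the first row to the header afterwards. This is a sketch under that assumption, and `raw` and `fixed` are illustrative names. The `infer_objects()` call is there because the header strings were mixed in with the numbers, so the numeric columns start out as object dtype.

```python
import pandas as pd

data = [['Salary', 'Exp', 'Gender'], [30000, 1, 'M'], [40000, 3, 'F'],
        [50000, 5, 'M'], [85000, 10, 'F'], [75000, 25, 'M']]

# Built without columns=, so the header ended up as row 0.
raw = pd.DataFrame(data)

# Drop the header row and renumber the index from 0.
fixed = raw.iloc[1:].reset_index(drop=True)
# Use the old first row as the column names.
fixed.columns = raw.iloc[0].tolist()
# Ask pandas to re-infer better dtypes for the object columns.
fixed = fixed.infer_objects()

# The operations from the question, applied to the repaired frame.
new_df1 = fixed[fixed['Salary'] >= 50000]
new_df2 = fixed.sort_values('Exp', ascending=False)
print(fixed.dtypes)
```

Passing `columns=data[0]` at construction time is still the simpler route when the original list is available.
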
