Below is the code I was given to join two datasets.
import pandas as pd
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
df= pd.read_csv("student/student-por.csv")
ds= pd.read_csv("student/student-mat.csv")
print("before merge")
print(df)
print(ds)
print("After merging:")
dq = pd.merge(df,ds,by=c("school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nursery","internet"))
print(dq)
I get this error:
Traceback (most recent call last):
File "/Users/PycharmProjects/datamining/main.py", line 15, in <module>
dq = pd.merge(df, ds,by=c ("school","sex","age","address","famsize","Pstatus","Medu","Fedu","Mjob","Fjob","reason","nursery","internet"))
NameError: name 'c' is not defined
Any help would be appreciated. I have been trying for a while, and I believe 'by=c' is the problem.
Thanks
1 Answer
Hi 👋🏻 Hope you're doing well!
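The error occurs because the arguments to the merge function include a call to c(...). That is R syntax for building a vector; Python has no built-in named c, hence the NameError. Also, the merge function has a different signature from the R one: it has no by parameter. The parameter is called on, and it takes a column name or a list of column names 🙂. So, in summary, pass the columns to on as a plain Python list instead of by=c(...).
Docs: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html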