Manually creating an ARFF file from a CSV

fcipmucu, posted on 2023-06-27 in Other

I want to create an ARFF file based on this CSV from Kaggle:
https://www.kaggle.com/c/titanic/download/train.csv
Here is part of the ARFF file I created:

@relation titanic

@attribute PassengerId numeric
@attribute Survived {0,1}
@attribute Pclass {1,2,3}
@attribute Name string
@attribute Sex {male,female}
@attribute Age numeric
@attribute SibSp numeric
@attribute Parch numeric
@attribute Ticket string
@attribute Fare numeric
@attribute Cabin string
@attribute Embarked {C,Q,S}

@data
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S

But when I load it in Weka, it returns the following error:

nominal value not declared in header, read Token[C85], line 18 % the second line of my data

Is there something wrong with my declarations?

sirbozc5

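The problem is that the name "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" contains a comma. Weka parses it as two fields, despite the double quotes.
You can try removing such commas (that is, commas inside double quotes) with a regular expression.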
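
As a rough illustration of the regular-expression approach suggested above, here is a minimal Python sketch. The helper name `strip_quoted_commas` is hypothetical, and the sketch assumes that double quotes in each line are balanced; it is one possible way to do the cleanup, not necessarily what the answerer had in mind.

```python
import re

# Hypothetical helper: removes commas that appear inside double-quoted fields,
# leaving the commas that separate fields untouched.
# Assumes every opening double quote on the line has a matching closing quote.
def strip_quoted_commas(line: str) -> str:
    # Match each "..." segment and delete the commas inside it.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(",", ""), line)

row = ('2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",'
       'female,38,1,0,PC 17599,71.2833,C85,C')
cleaned = strip_quoted_commas(row)
print(cleaned)
```

Applied to every line after `@data`, this leaves only the field-separating commas. Separately, the ARFF format expects values that contain spaces (such as `PC 17599`) to be quoted and missing values to be written as `?` rather than left empty, so those fields may be worth checking as well if the error persists.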
