Uploading .csv data into Hive when the file is in an enclosed (quoted) format

mzaanser  posted on 2021-06-04  in  Hadoop

My .csv file is in an enclosed format: every field is wrapped in double quotes.

"13","9827259163","0","D","2"
    "13","9827961481","0","D","2"
    "13","9827202228","0","A","2"
    "13","9827529897","0","A","2"
    "13","9827700249","0","A","2"
    "12","9883219029","0","A","2"
    "17","9861065312","0","A","2"
    "17","9861220761","0","D","2"
    "13","9827438384","0","A","2"
    "13","9827336733","0","D","2"
    "13","9827380905","0","D","2"
    "13","9827115358","0","D","2"
    "17","9861475884","0","D","2"
    "17","9861511646","0","D","2"
    "17","9861310397","0","D","2"
    "13","9827035035","0","A","2"
    "13","9827304969","0","D","2"
    "13","9827355786","0","A","2"
    "13","9827702373","0","A","2"

Just as in MySQL, I tried using the ENCLOSED BY keyword, as shown below:

CREATE EXTERNAL TABLE dnd (ServiceAreaCode varchar(50), PhoneNumber varchar(15), Preferences varchar(15), Opstype varchar(15), PhoneType varchar(10))
ROW FORMAT DELIMITED
        FIELDS TERMINATED BY ',' ENCLOSED BY '"'
        LINES TERMINATED BY '\n'
LOCATION '/dnd';

However, it fails with the following error:

NoViableAltException(26@[1704:103: ( tableRowFormatMapKeysIdentifier )?])
    at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
    at org.antlr.runtime.DFA.predict(DFA.java:144)
    at org.apache.hadoop.hive.ql.parse.HiveParser.rowFormatDelimited(HiveParser.java:30427)
    at org.apache.hadoop.hive.ql.parse.HiveParser.tableRowFormat(HiveParser.java:30662)
    at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:4683)
    at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2144)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1398)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1036)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:359)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:456)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:466)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:748)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
FAILED: ParseException line 5:33 cannot recognize input near 'ENCLOSED' 'BY' ''"'' in serde properties specification

Is there any way to import this file directly? Thanks in advance.

wbgh16ku #1

There is another way: use a SerDe. Download the SerDe jar from the following link: https://github.com/downloads/illyayalovyy/csv-serde/csv-serde-0.9.1.jar
Then run the following steps at the Hive prompt:

add jar path/to/csv-serde.jar;

create table dnd (ServiceAreaCode varchar(50), PhoneNumber varchar(15), Preferences varchar(15), Opstype varchar(15), PhoneType varchar(10))
row format serde 'com.bizo.hive.serde.csv.CSVSerde'
with serdeproperties(
"separatorChar" = "\,",
"quoteChar" = "\"")
stored as textfile
;

Then load the data from the given path using the following query:

load data local inpath 'path/xyz.csv' into table dnd;

and then run:

select * from dnd;
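
As a quick check that the SerDe really stripped the quotes, something like the following could be run after the load. This is only a sketch: it reuses the dnd table and column names from above, and the explicit CAST assumes the SerDe hands every column back as a string, which is worth confirming with DESCRIBE on your own setup.

```sql
-- Column types as Hive actually reports them for this table.
describe dnd;

-- Values should come back without the surrounding double quotes.
select ServiceAreaCode, PhoneNumber, Opstype from dnd limit 5;

-- Row counts per area code and Opstype, casting the area code explicitly.
select cast(ServiceAreaCode as int) as area, Opstype, count(*) as n
from dnd
group by cast(ServiceAreaCode as int), Opstype;
```

If the quotes still appear in the output, the table is most likely not using the SerDe, for example because it was created with ROW FORMAT DELIMITED instead.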
ejk8hzay #2

Hey, I loaded quoted CSV data into a Hive table as follows. First download csv-serde (I downloaded csv-serde-1.1.2.jar), then:

hive> add jar /opt/hive-1.1.1/lib/csv-serde-1.1.2.jar;
hive> create table t1(schema) row format serde 'com.bizo.hive.serde.csv.CSVSerde' with serdeproperties ("separatorChar" = ",") LOCATION '/user/hive/warehouse/dwb/ot1/';

Then we have to register the SerDe jar in hive-site.xml, as shown below, so that the table can be queried from the Hive shell.

<property><name>hive.aux.jars.path</name><value>hdfs://master-ip:54310/hive-serde/csv-serde-1.1.2.jar</value></property>
gdx19jrr #3

In Hive, we can use a SerDe jar to read data that is enclosed in double quotes.
For your question, see the following link:

http://stackoverflow.com/questions/21156071/why-dont-hive-have-fields-enclosed-by-like-in-mysql
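
For reference, Hive 0.14 and later ship a built-in CSV SerDe, org.apache.hadoop.hive.serde2.OpenCSVSerde, so a separately downloaded jar may not be needed on newer installations. A minimal sketch, assuming the column names and the /dnd location from the question; note that this SerDe treats every column as a string, so the columns are declared as STRING here and would need a CAST wherever numbers are wanted:

```sql
-- Sketch only: table layout and LOCATION are taken from the question.
CREATE EXTERNAL TABLE dnd (
  ServiceAreaCode STRING,
  PhoneNumber     STRING,
  Preferences     STRING,
  Opstype         STRING,
  PhoneType       STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\"
)
STORED AS TEXTFILE
LOCATION '/dnd';
```

Because the table is external and points at the existing directory, no separate load step should be required; a plain select * from dnd limit 5; is enough to confirm that the quotes are gone.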
