Hive: JSON SerDe file returns "NULL" in external table

Posted by o8x7eapl on 2021-05-29 in Hadoop


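I have a DynamoDB table on Amazon containing a bunch of tweets and related data (user, location, and so on). I exported it via a pipeline and got a JSON file. Exporting to CSV was a bad idea, because many of the tweets have commas in their text field. I am new to Hive, but I do know that loading a JSON file requires some kind of SerDe.
This is how I created the table: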

create external table tablename (
id string,
created_at string,
followers_count string,
geo string,
location string,
polarity string,
screen_name string,
sentiment string,
subjectivity string,
tweet string,
username string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;

That gave me no errors, and neither did the load:

load data inpath '/user/exam'
overwrite into table tablename;

(this is where the JSON file is stored)
But when I run "select * from tablename limit 5;" everything comes back NULL:

hive> select * from wcd.tablename limit 5;
OK
{   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
{   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
{   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
{   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL
{   NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL    NULL

If anyone wants to look at the files in question, they are at:
http://www.vaughn-s.net/hadoop
Any help would be greatly appreciated!

hadoop Hive JSON amazon-dynamodb

Source: https://stackoverflow.com/questions/45558199/hive-json-serde-file-returns-null-in-external-table

1 Answer

sc4hvdpw1#

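The reason is that your JSON does not match your table definition. Each value is wrapped in a nested object with a single key s, while the table declares every column as a plain string: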

{"id":{"s":"894643473017561088"},"sentiment":{"s":"neutral"},"subjectivity":{"s":"0.0"},"username":{"s":"Jessi"},"geo":{"s":"None"},"location":{"s":"Valley of the sunâ˜€ï¸"},"polarity":{"s":"0.0"},"tweet":{"s":"b\"RT @bannerite: Donald Trump's lies have consequences. We're seeing them now | Charlotte Observer #DemForce https""},"created_at":{"s":"Mon Aug 07 19:36:40 +0000 2017"},"screen_name":{"s":"JessiAtkins06"},"followers_count":{"s":"19"}}

Try declaring each column as a struct with a single string field s, for example:

id struct<s:string>
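
Applied to every column from the question, that suggestion might look like the sketch below. It is untested against the actual export. The table name tablename_nested is made up for illustration, and the sketch assumes each JSON record sits on a single line of the file. The data would be loaded with the same load data statement as in the question.

-- Hypothetical table name; columns and SerDe are taken from the question.
-- Each column is a struct so that it matches the {"s": "..."} wrapper in the export.
create external table tablename_nested (
id struct<s:string>,
created_at struct<s:string>,
followers_count struct<s:string>,
geo struct<s:string>,
location struct<s:string>,
polarity struct<s:string>,
screen_name struct<s:string>,
sentiment struct<s:string>,
subjectivity struct<s:string>,
tweet struct<s:string>,
username struct<s:string>)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;

-- Read the inner string with dot notation on the struct field.
select id.s as id, screen_name.s as screen_name, tweet.s as tweet
from tablename_nested
limit 5;

Keeping the structs in the table and unwrapping them in the query leaves the exported file untouched. A view over this table could expose plain string columns if the dot notation becomes tedious.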
