How do I read JSON data in Pig?

deyfvvtc  posted on 2021-06-02  in  Hadoop

I have a JSON file of the following form:

{"employees":[
    {"firstName":"John", "lastName":"Doe"},
    {"firstName":"Anna", "lastName":"Smith"},
    {"firstName":"Peter", "lastName":"Jones"}
]}

I am trying to run the following Pig script to load the JSON data:

A = load 'pigdemo/employeejson.json' using JsonLoader ('employees:{(firstName:chararray)},{(lastName:chararray)}');

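I get this error:
Unable to recreate exception from backed error: Error: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.ByteArrayInputStream@1553f9b2; line: 1, column: 1]) at [Source: java.io.ByteArrayInputStream@1553f9b2; line: 1, column: 29]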

l3zydbqr1#

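First, the reason you see Unexpected end-of-input is that JsonLoader reads the file one line at a time, so each JSON record has to sit entirely on a single line. In your file the first line is just {"employees":[ and the parser reaches the end of that line before the array is closed. Put the whole record on one line, like this:

{"employees":[{"firstName":"John", "lastName":"Doe"}, {"firstName":"Anna", "lastName":"Smith"}, {"firstName":"Peter", "lastName":"Jones"}]}

Now each line is one employees list, so declare employees as a bag of tuples and run the following (here $flurryData is a Pig parameter standing for the path to the JSON file):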

A = load '$flurryData' using JsonLoader ('employees:bag {t:tuple(firstName:chararray, lastName:chararray)}');
describe A;
dump A;

which gives the following output:

A: {employees: {t: (firstName: chararray,lastName: chararray)}}

({(John,Doe),(Anna,Smith),(Peter,Jones)})
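If the file is pretty-printed across several lines, as in the question, it needs to be collapsed to one record per line before loading. A minimal sketch of one way to do that outside Pig, using Python's standard json module; the function name and the file names in the usage comment are hypothetical and not part of the original answer:

```python
import json


def compact_json_file(src_path, dst_path):
    """Rewrite a multi-line JSON document as a single-line record.

    Assumes the source file holds exactly one JSON document.
    """
    with open(src_path, encoding="utf-8") as src:
        record = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        # json.dumps without an indent argument emits no line breaks,
        # and key order within each object is preserved.
        dst.write(json.dumps(record) + "\n")


# Hypothetical usage:
# compact_json_file("employeejson.json", "employeejson_oneline.json")
```

The resulting file can then be passed to the load statement above in place of the original.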

Hope this helps!
