Pig: result of JsonLoader is empty

7nbnzgx9 posted on 2021-06-21 in Pig


I am using the CDH5 QuickStart VM, and I have a file like this (not shown in full here):

{"user_id": "kim95",
 "type": "Book",
 "title": "Modern Database Systems: The Object Model, Interoperability, and
Beyond.",
 "year": "1995",
 "publisher": "ACM Press and Addison-Wesley",
 "authors": {},
 "source": "DBLP"
}
{"user_id": "marshallo79",
 "type": "Book",
 "title": "Inequalities: Theory of Majorization and Its Application.",
 "year": "1979",
 "publisher": "Academic Press",
 "authors": {("Albert W. Marshall"), ("Ingram Olkin")},
 "source": "DBLP"
}

I used this script:

books = load 'data/book-seded.json'
        using JsonLoader('t1:tuple(user_id:chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,source:chararray,authors:bag{T:tuple(author:chararray)})');

STORE books INTO 'book-no-seded.tsv';

The script runs without errors, but the generated file is empty. Do you have any idea why?

JSON hue cloudera-cdh apache-pig

Source: https://stackoverflow.com/questions/24976373/pig-result-of-json-loader-empty

3 answers


n6lpvg4x #1

Try: STORE books INTO 'book-no-seded.tsv' USING org.apache.pig.piggybank.storage.JsonStorage();


62o28rlo #2

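You need to make sure the load schema is correct. You can run DUMP books as a quick check.
When using Pig's JsonLoader in this tutorial, we had to be careful with both the input data and the schema: http://gethue.com/hadoop-tutorials-ii-1-prepare-the-data-for-analysis/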
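
Before running DUMP, the input file itself can be checked outside Pig. The following is only a minimal sketch in Python, not part of the original answers. It assumes, as the experience reported in this thread suggests, that the loader wants one complete JSON object per line with the fields in the same order as the schema. The function name check_lines and the list EXPECTED_FIELDS are made up for illustration; the field order is copied from the schema that ended up working in this thread.

```python
import json

# Assumed field order, copied from the schema that worked in this thread.
EXPECTED_FIELDS = ["user_id", "type", "title", "year",
                   "publisher", "authors", "source"]


def check_lines(lines, expected=EXPECTED_FIELDS):
    """Return (line_number, problem) pairs for lines that look unsafe to load."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines are ignored by this check
        try:
            record = json.loads(text)
        except json.JSONDecodeError as err:
            # A record spread over several lines fails here, line by line.
            problems.append((number, "not a complete JSON value: " + err.msg))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
        elif list(record.keys()) != expected:
            problems.append((number, "unexpected fields or field order"))
    return problems


if __name__ == "__main__":
    # The first two lines of the multi-line record from the question.
    sample = ['{"user_id": "kim95",', ' "type": "Book",']
    for number, problem in check_lines(sample):
        print(number, problem)
```

With the multi-line layout from the question, every physical line is reported, which is consistent with the loader producing no usable rows.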


5kgi1eie #3

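In the end, only this layout worked. If I add or remove even a single space compared with this configuration, I get an error. (I also added a "name" key to the tuples, specified null where a tuple was empty, and swapped the order of authors and source, but it would still have failed without the exact layout below.)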

{"user_id": "kim95", "type": "Book","title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.", "year": "1995", "publisher": "ACM Press and Addison-Wesley", "authors": [{"name":null}], "source": "DBLP"}
{"user_id": "marshallo79", "type": "Book", "title": "Inequalities: Theory of Majorization and Its Application.", "year": "1979", "publisher": "Academic Press", "authors": [{"name":"Albert W. Marshall"},{"name":"Ingram Olkin"}], "source": "DBLP"}

The working script is this:

books = load 'data/book-seded-workings-reduced.json'
        using JsonLoader('user_id:chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,authors:{(name:chararray)},source:chararray');

STORE books INTO 'book-table.csv';  -- works whether the output is named .tsv or .csv
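
If the records are produced by a program, the one-line layout can be generated rather than edited by hand. This is a rough sketch in Python under the same assumptions as the data in this answer: one object per line, fields in schema order, authors written as a list of {"name": ...} objects, and an empty author list written as a single null name. The names to_loader_line and FIELD_ORDER are hypothetical, and the exact whitespace json.dumps emits has not been verified against JsonLoader.

```python
import json

# Assumed field order, matching the schema in the working script.
FIELD_ORDER = ["user_id", "type", "title", "year",
               "publisher", "authors", "source"]


def to_loader_line(record):
    """Serialize one book record as a single-line JSON object in FIELD_ORDER."""
    # An empty or missing author list becomes one entry with a null name.
    names = record.get("authors") or [None]
    row = {key: record.get(key) for key in FIELD_ORDER}
    row["authors"] = [{"name": name} for name in names]
    # json.dumps escapes embedded newlines, so the result stays on one line.
    return json.dumps(row, ensure_ascii=False)


if __name__ == "__main__":
    book = {
        "user_id": "marshallo79",
        "type": "Book",
        "title": "Inequalities: Theory of Majorization and Its Application.",
        "year": "1979",
        "publisher": "Academic Press",
        "authors": ["Albert W. Marshall", "Ingram Olkin"],
        "source": "DBLP",
    }
    print(to_loader_line(book))
```

Writing each returned string on its own line of the input file keeps every record in the same field order as the schema.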

