CREATE EXTERNAL TABLE invoiceitems (
InvoiceNo INT,
StockCode INT,
Description STRING,
Quantity INT,
InvoiceDate BIGINT,
UnitPrice DOUBLE,
CustomerID INT,
Country STRING,
LineNo INT,
InvoiceTime STRING,
StoreID INT,
TransactionID STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3a://streamingdata/data/*';
The data files were created by a Spark Structured Streaming job:
...
data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json 7.1 KB 29/08/2018 10:27:32 PM
data/part-00000-0075634b-8513-47b3-b5f8-19df8269cf9d-c000.json 1.3 KB 30/08/2018 10:47:32 AM
data/part-00000-00b6b230-8bb3-49d1-a42e-ad768c1f9a94-c000.json 2.3 KB 30/08/2018 1:25:02 AM
...
Here are the first few lines of the first file:
{"InvoiceNo":5421462,"StockCode":22426,"Description":"ENAMEL WASH BOWL CREAM","Quantity":8,"InvoiceDate":1535578020000,"UnitPrice":3.75,"CustomerID":13405,"Country":"United Kingdom","LineNo":6,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"542146260180829"}
{"InvoiceNo":5501932,"StockCode":22170,"Description":"PICTURE FRAME WOOD TRIPLE PORTRAIT","Quantity":4,"InvoiceDate":1535578020000,"UnitPrice":6.75,"CustomerID":13952,"Country":"United Kingdom","LineNo":26,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"5501932260180829"}
However, when I run a query, no data is returned:
hive> select * from invoiceitems limit 5;
OK
Time taken: 24.127 seconds
The Hive log directories are empty:
$ ls /var/log/hive*
/var/log/hive:
/var/log/hive-hcatalog:
/var/log/hive2:
How can I debug this further?
1 Answer
I got more hints about the error when I ran:
This returned the following error:
...
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1535521291031_0011_00, diagnostics=[Vertex vertex_1535521291031_0011_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: invoiceitems initializer failed, vertex=vertex_1535521291031_0011_00 [Map 1], java.io.IOException: cannot find dir = s3a://streamingdata/data/part-00000-006fc42a-c6a1-42a2 in pathToPartitionInfo: [s3a://streamingdata/data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json]
I decided to change the CREATE TABLE definition from:
to:
This solved the problem.
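The before and after snippets did not survive in this copy of the answer. The following is only a sketch of what such a change could look like, under the assumption (not confirmed by the text) that the fix was to drop the `*` wildcard from `LOCATION`. Hive expects `LOCATION` to name a directory and reads every file under it, which is consistent with the "cannot find dir" error above. The bucket path and column list are taken from the question; the trailing-slash form of the path is the assumed part.

```sql
-- Assumption: the table is external, so dropping it removes only the
-- metastore entry and leaves the JSON files in S3 untouched.
DROP TABLE IF EXISTS invoiceitems;

CREATE EXTERNAL TABLE invoiceitems (
  InvoiceNo     INT,
  StockCode     INT,
  Description   STRING,
  Quantity      INT,
  InvoiceDate   BIGINT,
  UnitPrice     DOUBLE,
  CustomerID    INT,
  Country       STRING,
  LineNo        INT,
  InvoiceTime   STRING,
  StoreID       INT,
  TransactionID STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
-- Assumed fix: point at the directory itself, with no glob pattern.
LOCATION 's3a://streamingdata/data/';

-- Check the "Location:" field that Hive registered, then re-run the query.
DESCRIBE FORMATTED invoiceitems;
SELECT * FROM invoiceitems LIMIT 5;
```

If re-creating the table is undesirable, `ALTER TABLE invoiceitems SET LOCATION 's3a://streamingdata/data/';` should have the same effect on an existing table.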