我有一个配置单元表来加载json数据。我的json中有两个值。两者的数据类型都是字符串。如果我将它们保留为bigint,则此表上的select将给出以下错误:
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException: org.codehaus.jackson.JsonParseException: Current token (VALUE_STRING) not numeric, can not use numeric value accessors
at [Source: java.io.ByteArrayInputStream@3b6c740b; line: 1, column: 21]
如果我把它改成两个字符串,那么它就可以工作了。
现在,因为这些列是字符串,所以我不能对这些列使用from\ unixtime方法。
如果我尝试将这些列的数据类型从string改为bigint,会出现以下错误:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions : uploadtimestamp
下面是我的create table语句:
create table ABC
(
uploadTimeStamp bigint
,PDID string
,data array
<
struct
<
Data:struct
<
unit:string
,value:string
,heading:string
,loc:string
,loc1:string
,loc2:string
,loc3:string
,speed:string
,xvalue:string
,yvalue:string
,zvalue:string
>
,Event:string
,PDID:string
,`Timestamp`:string
,Timezone:string
,Version:string
,pii:struct<dummy:string>
>
>
)
row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;
我的json:
{"uploadTimeStamp":"1488793268598","PDID":"123","data":[{"Data":{"unit":"rpm","value":"100"},"EventID":"E1","PDID":"123","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"Data":{"heading":"N","loc":"false","loc1":"16.032425","loc2":"80.770587","loc3":"false","speed":"10"},"EventID":"Location","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.1","pii":{}},{"Data":{"xvalue":"1.1","yvalue":"1.2","zvalue":"2.2"},"EventID":"AccelerometerInfo","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}},{"EventID":"FuelLevel","Data":{"value":"50","unit":"percentage"},"Version":"1.0","Timestamp":1488793268598,"PDID":"skga06031430gedvcl1pdid2367","Timezone":330},{"Data":{"unit":"kmph","value":"70"},"EventID":"VehicleSpeed","PDID":"skga06031430gedvcl1pdid2367","Timestamp":1488793268598,"Timezone":330,"Version":"1.0","pii":{}}]}
有什么方法可以将这个字符串unixtimestamp转换为标准时间,或者我可以为这些列使用bigint?
1条答案
按热度按时间fcipmucu1#
如果您讨论的是时间戳和时区,那么可以将它们定义为int/big int类型。
如果您查看它们的定义,就会发现值周围没有限定符(“),因此在json文档中它们是数字类型:
“时间戳”:1488793268598,“时区”:330
即使您已经将timestamp定义为字符串,您仍然可以在需要bigint的函数中使用它之前将其强制转换为bigint。
强制转换(
timestamp
为bigint)失败:semanticexception[error 10014]:行1:45错误的参数“timestamp”:类org.apache.hadoop.hive.ql.udf.udffromunixtime没有与(string)匹配的方法。可能的选择:func(bigint)func(bigint,string)func(int)func(int,string)