Hive query returns output like '\x00e\x00"\x00'

wgxvkvu9  posted on 2021-05-29  in Hadoop

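I created a table in Hive and loaded data into it from an external CSV file. When I try to print the data from Python, I get output like ['\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00']. When I run the query in the Hive GUI, the result is correct. Please tell me how to get the same result from a Python program.
My Python code: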

import pyhs2

with pyhs2.connect(host='192.168.56.101',
               port=10000,
               authMechanism='PLAIN',
               user='hiveuser',
               password='password',
               database='anuvrat') as conn:
    with conn.cursor() as cur:
        cur.execute('SELECT message FROM ABC_NEWS LIMIT 5')

        print cur.fetchone()

The output is:

/usr/bin/python2.7 /home/anuvrattiku/SPRING_2017/CMPE239/Facebook_Fake_news_detection/code_fake_news/code.py
['\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00']

Process finished with exit code 0

When I query the same table in Hive, I get the following output:

This is how I created the table:

CREATE TABLE ABC_NEWS(
ID STRING, 
PAGE_ID INT, 
NAME STRING, 
MESSAGE STRING, 
DESCRIPTION STRING, 
CAPTION STRING, 
POST_TYPE STRING, 
STATUS_TYPE STRING, 
LIKES_COUNT SMALLINT, 
COMMENTS SMALLINT, 
SHARES_COUNT SMALLINT, 
LOVE_COUNT SMALLINT, 
WOW_COUNT SMALLINT, 
HAHA_COUNT SMALLINT, 
SAD_COUNT SMALLINT, 
THANKFUL_COUNT SMALLINT, 
ANGRY_COUNT SMALLINT, 
LINK STRING, 
IMAGE_LINK STRING, 
POSTED_AT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "," ESCAPED BY '\\';

The CSV file used to load the table is at the following path: https://www.dropbox.com/s/fiwygyqt8u9eo5s/-news-86680728811.csv?dl=0

Answer #1, by tkqqtvp1

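Since the text is qualified (enclosed in ") and the delimiter (,) appears inside the qualified text, you should use the CSV SerDe.
You are printing cur.fetchone(), which is a list and not a string, so the output looks like a byte array. You should print the first element of the list instead: cur.fetchone()[0]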
create external table abc_news
(
id string
,page_id int
,name string
,message string
,description string
,caption string
,post_type string
,status_type string
,likes_count smallint
,comments smallint
,shares_count smallint
,love_count smallint
,wow_count smallint
,haha_count smallint
,sad_count smallint
,thankful_count smallint
,angry_count smallint
,link string
,image_link string
,posted_at string
)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties
(
'separatorChar' = ','
,'quoteChar' = '"'
)
stored as textfile
;
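
To illustrate why quoted fields need a CSV-aware parser, here is a minimal sketch using Python's standard csv module. It is only an analogy: the sample line is hypothetical (it is not taken from the actual file), the csv module is not guaranteed to behave identically to OpenCSVSerde, and the snippet is written for Python 3 rather than the Python 2 used elsewhere in this thread.

```python
import csv
import io

# Hypothetical two-column line: an id, then a quoted message containing a comma.
line = '1,"Roberts took the unusual step, devoting his report to ethics."\n'

# Plain delimiter splitting (the idea behind FIELDS TERMINATED BY ",")
# cuts the quoted message in two at the embedded comma.
naive_fields = line.rstrip("\n").split(",")

# A CSV-aware parser treats everything between the quote characters as one field.
csv_fields = next(csv.reader(io.StringIO(line), delimiter=",", quotechar='"'))

print(len(naive_fields), naive_fields)
print(len(csv_fields), csv_fields)
```

With plain splitting, every column after the quoted message shifts by one position, which is the kind of misalignment the separatorChar and quoteChar properties above are meant to prevent.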

import pyhs2

# Connect to HiveServer2 and print only the first column of each fetched row.
with pyhs2.connect(host='localhost', port=10000, authMechanism='PLAIN',
                   user='cloudera', password='cloudera',
                   database='local_db') as conn:
    with conn.cursor() as cur:
        cur.execute('SELECT message FROM ABC_NEWS LIMIT 10')
        for i in cur.fetch():
            print i[0]  # i is a list; i[0] is the message string

Output:

"message"
"Roberts took the unusual step of devoting the majority of his annual report to the issue of judicial ethics."
"Do you agree with the new law?"
"Some pretty cool confetti will rain down on New York City celebrators."
NULL
"The pharmacy was held up by a man seeking prescription medication. "
NULL
"There were no immediate reports of damage or injuries."
"Were you an LCD screen early adopter? A settlement may be headed your way."
"As Americans get bigger, passenger limits are becoming more restrictive."
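
The difference between printing the list and printing its first element can be reproduced without a Hive connection. The following is a minimal sketch, assuming Python 3 and a hard-coded row shaped like the value shown in the question; the helper name as_printed is made up for this illustration.

```python
# Hypothetical row, shaped like the value cur.fetchone() returned in the question.
row = ['\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00']


def as_printed(value):
    """Return the text print() would write for value (without the newline)."""
    return str(value)


# Printing the list shows each element through repr(), so every NUL
# character appears as the four visible characters \x00.
list_text = as_printed(row)

# Printing the element writes the string itself: the NUL characters are
# emitted raw, and most terminals do not display them.
element_text = as_printed(row[0])

print(list_text)
print(element_text)
```

Note that the NUL characters are still present in the element; they are simply not rendered as escape sequences once the string is printed directly.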
