Ambari Hive UTF-8 problem

dwthyt8l  posted on 2021-05-29  in Hadoop

I have a problem with Cyrillic characters in Hive tables. Installed versions:

ambari-server 2.4.2.0-136
hive-2-5-3-0-37 1.2.1000.2.5.3.0-37
Ubuntu 14.04

The problem:
The locale is set to ru_RU.UTF-8:

spark@hadoop:~$ locale
LANG=ru_RU.UTF-8
LANGUAGE=ru_RU:ru
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=ru_RU.UTF-8

Connect to Hive and create a test table:

spark@hadoop:~$ beeline -n spark -u jdbc:hive2://spark@hadoop.domain.com:10000/

Connecting to jdbc:hive2://spark@hadoop.domain.com:10000/
Connected to: Apache Hive (version 1.2.1000.2.5.3.0-37)
Driver: Hive JDBC (version 1.2.1000.2.5.3.0-37)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.5.3.0-37 by Apache Hive

0: jdbc:hive2://spark@hadoop.domain.com> CREATE TABLE `test`(`name` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'serialization.encoding'='UTF-8');
No rows affected (0,127 seconds)

Insert Cyrillic characters:

0: jdbc:hive2://spark@hadoop.domain.com> insert into test values('привет');

INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: insert into test values('привет')(Stage-1)
INFO  :

INFO  : Status: Running (Executing on YARN cluster with App id application_1490211406894_2481)

INFO  : Map 1: -/-
INFO  : Map 1: 0/1
INFO  : Map 1: 0(+1)/1
INFO  : Map 1: 1/1
INFO  : Loading data to table default.test from hdfs://hadoop.domain.com:8020/apps/hive/warehouse/test/.hive-staging_hive_2017-03-23_13-41-46_215_3133047104896717605-116/-ext-10000
INFO  : Table default.test stats: [numFiles=1, numRows=1, totalSize=7, rawDataSize=6]
No rows affected (6,652 seconds)
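The stats line in the log above already hints at where the data gets damaged. As a rough check, the sketch below compares the UTF-8 size of the inserted string with the reported rawDataSize=6. That the extra byte in totalSize=7 is a row terminator is an assumption, not something the log states.

```python
# Compare the UTF-8 size of the inserted value with the sizes Hive reported.
value = "привет"

utf8_bytes = value.encode("utf-8")
char_count = len(value)       # number of characters in the string
utf8_size = len(utf8_bytes)   # number of bytes when encoded as UTF-8

# Value copied from the INSERT log: rawDataSize=6 (totalSize=7).
reported_raw_size = 6

print("characters:", char_count)
print("UTF-8 bytes:", utf8_size)
print("matches UTF-8 size:", utf8_size == reported_raw_size)
print("matches character count:", char_count == reported_raw_size)
```

Each Cyrillic letter takes two bytes in UTF-8, so a correctly encoded value would be 12 bytes. The reported size equals the character count instead, which suggests the string was reduced to one byte per character on the write path, before SELECT or the terminal were involved.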

Select from the table:

0: jdbc:hive2://spark@hadoop.domain.com> select * from test;
+------------+--+
| test.name  |
+------------+--+
| ?@825B     |
+------------+--+
1 row selected (0,162 seconds)
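The garbled value is not random. As a small sketch (an illustration of the symptom, not a description of Hive's internals), keeping only the low 8 bits of each character's code point in the inserted string reproduces the value returned by the SELECT. The helper name `low_byte_truncate` is hypothetical.

```python
def low_byte_truncate(text: str) -> str:
    """Keep only the low 8 bits of each code point (hypothetical helper)."""
    return "".join(chr(ord(ch) & 0xFF) for ch in text)


original = "привет"

# Code points of the Cyrillic letters, e.g. 'п' is U+043F.
print([hex(ord(ch)) for ch in original])

# Dropping the high byte of each code point leaves plain ASCII characters.
garbled = low_byte_truncate(original)
print(garbled)  # ?@825B
```

This pattern is typical of code that narrows 16-bit characters to bytes instead of encoding them. If that is what happens here, changing the locale or the `serialization.encoding` property would not be expected to help, which fits the results described in the question.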

I have read about many Apache Hive bugs and tried Unicode, UTF-8, UTF-16, and several ISO encodings, but none of them worked.
Can anyone help me?
Thanks!

bmp9r5qi


The Hortonworks folks helped me solve this problem. It appears to be a bug.
https://community.hortonworks.com/answers/90989/view.html
https://issues.apache.org/jira/browse/hive-13983
