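I am using RStudio to process data from Hive, and I connect through RJDBC. RJDBC converts the result of a select statement into a data frame. Unfortunately, the Hive column data types "date" and "timestamp" do not seem to be recognized during the conversion, so such columns come back as character from dbReadTable(conn, db2.ibor_lending), which is not what I want.
Any idea about this? I do not want to recast the columns in R, because that is 1. overhead, 2. tight coupling and 3. additional maintenance effort.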
library(DBI)
library(rJava)
library(RJDBC)

print("Attempting Hive Connection...")

# Collect the Hadoop/Hive client jars that make up the JVM classpath
hadoop.class.path <- list.files(path = "/usr/hdp/current/hadoop-client", pattern = "jar", full.names = TRUE)
hadoop.client.lib <- list.files(path = "/usr/hdp/current/hadoop-client/lib", pattern = "jar", full.names = TRUE)
hive.class.path <- list.files(path = "/usr/hdp/current/hive-client/lib", pattern = "jar", full.names = TRUE)
hadoop.hdfs.lib.path <- list.files(path = "/usr/hdp/current/hadoop-hdfs-client", pattern = "jar", full.names = TRUE)
zookeeper.lib.path <- list.files(path = "/usr/hdp/current/zookeeper-client", pattern = "jar", full.names = TRUE)
mapred.class.path <- list.files(path = "/usr/hdp/current/hadoop-mapreduce-client", pattern = "jar", full.names = TRUE)
cp <- c(hive.class.path, mapred.class.path, hadoop.class.path, hadoop.client.lib, hadoop.hdfs.lib.path)

# Start the JVM (Kerberos credentials are taken from outside the JAAS subject)
.jinit(classpath = cp, parameters = "-Djavax.security.auth.useSubjectCredsOnly=false")

# Load the Hive JDBC driver and connect via ZooKeeper service discovery
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "/usr/hdp/current/hive-client/lib/hive-jdbc.jar", identifier.quote = "`")
conn <- dbConnect(drv, "jdbc:hive2://xxx:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@yyyy")

# Sample rows, column metadata, and the full table
show_databases <- dbGetQuery(conn, "select * from db2.ibor_lending LIMIT 100")
show_datatypes <- dbGetQuery(conn, "describe db2.ibor_lending")
# The table name must be passed as a character string
show_table <- dbReadTable(conn, "db2.ibor_lending")
The result is:
Hive: col_name data_type comment
cutoffdate timestamp
R dataframe: ibor_lending.cutoffdate character
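For comparison, the recast I would like to avoid could at least be made generic by driving it from the describe output instead of hard-coding column names. This is only a minimal sketch: coerce_hive_types is a hypothetical helper (not part of RJDBC), and it assumes that the describe result has the columns col_name and data_type shown above, that select * returns names prefixed with the table name, and that timestamps arrive as strings like "2020-01-31 12:34:56.0".

```r
# Hypothetical helper (not part of RJDBC): coerce character columns to
# Date/POSIXct according to a Hive `describe` result.
# Assumes `types` has the columns `col_name` and `data_type`.
coerce_hive_types <- function(df, types, tz = "UTC") {
  # Strip an assumed "<table>." prefix from the data frame column names
  bare <- sub("^.*\\.", "", names(df))
  for (i in seq_along(bare)) {
    idx <- match(bare[i], trimws(types$col_name))
    hive_type <- tolower(trimws(types$data_type[idx]))
    if (is.na(hive_type)) next  # column not described: leave untouched
    if (hive_type == "timestamp") {
      # Assumed string layout: "YYYY-MM-DD HH:MM:SS[.fraction]"
      df[[i]] <- as.POSIXct(df[[i]], tz = tz, format = "%Y-%m-%d %H:%M:%OS")
    } else if (hive_type == "date") {
      df[[i]] <- as.Date(df[[i]], format = "%Y-%m-%d")
    }
  }
  df
}

# Stand-in data mimicking the result above (no Hive connection needed)
types <- data.frame(col_name = "cutoffdate", data_type = "timestamp",
                    stringsAsFactors = FALSE)
df <- data.frame(ibor_lending.cutoffdate = "2020-01-31 12:34:56.0",
                 stringsAsFactors = FALSE)
out <- coerce_hive_types(df, types)
class(out$ibor_lending.cutoffdate)
```

With the real connection this would be called as coerce_hive_types(show_table, show_datatypes). It still costs an extra pass over the data in R, so it only addresses the coupling and maintenance concerns, not the overhead.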
Best regards, Dennis