我建立了一个aws电子病历集群。我选择了emr-6.0.0。选择的应用程序是:
spark:Hadoop3.2.1Yarn上的spark 2.4.4,ganglia 3.7.2和zeppelin 0.9.0-snapshot
之后,我创建了一个jupyter笔记本并将其连接到集群。问题是笔记本中的以下代码行引发错误:
data_frame = spark.read.json("s3://transactions-bucket-demo/")
data_frame.createOrReplaceTempView("table")
spark.sql("SELECT * from table")
错误:
'java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;'
Traceback (most recent call last):
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 767, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: 'java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;'
如何解决由于笔记本中的sql查询导致的此错误?
暂无答案!
目前还没有任何答案,快来回答吧!