Running Spark SQL queries in AWS EMR

Posted by z0qdvdin on 2021-05-29 in Spark

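I set up an AWS EMR cluster using release emr-6.0.0. The application bundle I selected was:
Spark: Spark 2.4.4 on Hadoop 3.2.1 YARN with Ganglia 3.7.2 and Zeppelin 0.9.0-SNAPSHOT
I then created a Jupyter notebook and attached it to the cluster. The problem is that the following lines of code in the notebook raise an error: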

data_frame = spark.read.json("s3://transactions-bucket-demo/")
data_frame.createOrReplaceTempView("table")
spark.sql("SELECT * from table")

Error:

'java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;'
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 767, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: 'java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;'

How can I resolve this error, which is triggered by the SQL query in the notebook?
