I am trying to establish a connection between AWS Glue and Snowflake over a private link.
I have configured the VPC endpoints, and I can connect from Glue as long as I don't use PySpark.
The code is as follows:
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
from awsglue.context import GlueContext
from pyspark.context import SparkContext
from pyspark.sql import SQLContext
from awsglue.job import Job
import re

# Standard Glue setup: wrap the Spark context in a GlueContext
glueContext = GlueContext(SparkContext.getOrCreate())
spark = glueContext.spark_session
job = Job(glueContext)
key_file = """-----BEGIN ENCRYPTED PRIVATE KEY-----
-----END ENCRYPTED PRIVATE KEY-----
"""
# Decrypt the encrypted PEM key, then re-serialize it as an
# unencrypted PKCS8 key for the Spark connector
p_key = serialization.load_pem_private_key(
    str.encode(key_file),
    password="<pass>".encode(),
    backend=default_backend()
)
pkb = p_key.private_bytes(
    encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.PKCS8,
    encryption_algorithm=serialization.NoEncryption()
)
# Strip the PEM header/footer and newlines, leaving only the base64 body
pkb = pkb.decode("UTF-8")
pkb = re.sub(r"-*(BEGIN|END) PRIVATE KEY-*\n", "", pkb).replace("\n", "")
sfOptions = {
    "sfURL": "",
    "sfUser": "",
    "pem_private_key": pkb,
    "sfDatabase": "",
    "sfSchema": "",
    "sfWarehouse": "",
    "continue_on_error": "",
    "sfAccount": "",
    "tracing": all,
}
query = "SELECT COUNT(*) FROM my_table"
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
df = spark.read.format(SNOWFLAKE_SOURCE_NAME).options(**sfOptions).option("query", query).load()
df.show()
job.commit()
The JAR versions I am using are:
- snowflake-jdbc-3.13.22.jar
- spark-snowflake_2.12-2.11.0-spark_3.3.jar
I currently get the following error:
Traceback (most recent call last):
  File "/tmp/test snowflake.py", line 79, in <module>
    df = spark.read.format(SNOWFLAKE_SOURCE_NAME).options(**sfOptions).option("query", query).load()
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 210, in load
    return self._df(self._jreader.load())
  File "/opt/amazon/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException: Bad level "<BUILT-IN FUNCTION ALL>"
So far I have changed the query several times and tried different versions of the JARs, but the error stays the same.
1 Answer
I was able to reproduce this with snowflake-jdbc-3.13.30.jar and spark-snowflake_2.12-2.13.0-spark_3.3.jar. Changing "tracing": all to "tracing": "all" fixed it for me; removing the tracing option altogether also works. The bare name all refers to Python's built-in all() function, so its string representation is what gets passed to the connector as the tracing level, which is exactly the <BUILT-IN FUNCTION ALL> shown in the "Bad level" error.
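For reference, here is a minimal sketch of the fixed options dict, keeping the placeholder values from the question; the only change is that the tracing level is now the string "all":

sfOptions = {
    "sfURL": "",
    "sfUser": "",
    "pem_private_key": pkb,
    "sfDatabase": "",
    "sfSchema": "",
    "sfWarehouse": "",
    "continue_on_error": "",
    "sfAccount": "",
    "tracing": "all",  # a string literal, not the Python built-in all
}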