I am following the documentation and trying to run a simple script: https://docs.snowflake.com/en/user-guide/spark-connector-use.html
However, I get the following error:
Py4JJavaError: An error occurred while calling o37.load.
: java.lang.ClassNotFoundException: Failed to find data source: net.snowflake.spark.snowflake.
My code is below. I also tried setting the config option to point at the JDBC and Spark Snowflake jars under /Users/Hana/spark-sf/, but with no luck.
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession, SQLContext
from pyspark.sql.types import *
spark = SparkSession \
    .builder \
    .appName("Python Spark SQL basic example") \
    .config('spark.jars', '/Users/Hana/spark-sf/snowflake-jdbc-3.12.9.jar,/Users/Hana/spark-sf/spark-snowflake_2.12-2.8.1-spark_3.0.jar') \
    .getOrCreate()
# Set options below
sfOptions = {
    "sfURL": "<account_name>.snowflakecomputing.com",
    "sfUser": "<user_name>",
    "sfPassword": "<password>",
    "sfDatabase": "<database>",
    "sfSchema": "<schema>",
    "sfWarehouse": "<warehouse>"
}
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"
df = spark.read.format(SNOWFLAKE_SOURCE_NAME) \
    .options(**sfOptions) \
    .option("query", "select * from table limit 200") \
    .load()
df.show()
How do I set the variables correctly, and which ones need to be set? I would really appreciate it if someone could list the steps!
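A ClassNotFoundException for the data source usually means the connector jar never reached the driver classpath, so a first check is whether both files named in spark.jars actually exist. The following is a minimal sketch of that check, not part of the Snowflake documentation; the helper name build_spark_jars is hypothetical and it uses only the Python standard library.

```python
from pathlib import Path


def build_spark_jars(jar_dir, jar_names):
    """Return a comma-separated value for spark.jars.

    Raises FileNotFoundError early if any of the named jars is missing,
    instead of letting Spark fail later with a ClassNotFoundException.
    """
    paths = [Path(jar_dir) / name for name in jar_names]
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError("Missing jar(s): " + ", ".join(missing))
    # spark.jars expects a single comma-separated string of jar paths.
    return ",".join(str(p) for p in paths)
```

The result could be passed as .config('spark.jars', build_spark_jars('/Users/Hana/spark-sf', [...])) with the two jar file names from the code above. Note that getOrCreate() returns an already running session if one exists (for example in a notebook), in which case a newly supplied spark.jars value may not take effect until the session is restarted.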
1 Answer
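Can you try using the "snowflake" format? Your DataFrame read would then call spark.read.format("snowflake"), or you can set the SNOWFLAKE_SOURCE_NAME variable to "snowflake".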
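The suggestion above can be sketched as a small helper that takes the format name as a parameter, so the short name and the full class name can be swapped easily. This is an illustrative sketch: the function name read_snowflake_query and its parameters are hypothetical, and it assumes the connector and JDBC jars are already on the Spark classpath, since the short name cannot resolve without them either.

```python
def read_snowflake_query(spark, sf_options, query, source_name="snowflake"):
    """Run `query` through the Snowflake connector and return the DataFrame.

    `spark` is an existing SparkSession, `sf_options` is a dict such as
    sfOptions from the question, and `source_name` defaults to the short
    format name suggested in this answer.
    """
    return (
        spark.read.format(source_name)
        .options(**sf_options)
        .option("query", query)
        .load()
    )
```

With the session and options from the question, this would be called as df = read_snowflake_query(spark, sfOptions, "select * from table limit 200"), followed by df.show(). Passing source_name="net.snowflake.spark.snowflake" reproduces the original call.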