SageMaker Jupyter R kernel: spark_read_csv error

7xzttuei posted on 2021-05-16 in Spark

This is on a SageMaker Jupyter instance using the R kernel.
After we upgraded Spark to 3.0.1 (with Hadoop 2.7), the spark_read_csv function fails with the error shown below.

'/home/ec2-user/spark/spark-3.0.1-bin-hadoop2.7'
A data.frame: 4 × 3
spark   hadoop  dir
<chr>   <chr>   <chr>
2.4.3   2.7     /home/ec2-user/spark/spark-2.4.3-bin-hadoop2.7
2.4.7   2.7     /home/ec2-user/spark/spark-2.4.7-bin-hadoop2.7
3.0.1   2.7     /home/ec2-user/spark/spark-3.0.1-bin-hadoop2.7
3.0.1   3.2     /home/ec2-user/spark/spark-3.0.1-bin-hadoop3.2
library(sparklyr)

getSC <- function() {
  config <- spark_config()

  config$sparklyr.defaultPackages <- c(
    "com.databricks:spark-csv_2.10:1.5.0",
    "com.amazonaws:aws-java-sdk-pom:1.10.34",
    "org.apache.hadoop:hadoop-aws:2.7.3"
  )

  print(config)
  options(sparklyr.log.console = TRUE)

  # Note: config is built above but is not passed to spark_connect() here.
  sc <- spark_connect(master = "local")
  ctx <- spark_context(sc)

  # Get the Hadoop configuration through the Java Spark context.
  jsc <- invoke_static(
    sc,
    "org.apache.spark.api.java.JavaSparkContext",
    "fromSparkContext",
    ctx
  )
  hconf <- jsc %>% invoke("hadoopConfiguration")

  # S3A settings (credentials and endpoint redacted).
  hconf %>% invoke("set", "com.amazonaws.services.s3a.enableV4", "true")
  hconf %>% invoke("set", "fs.s3a.S3AFileSystem", "true")
  hconf %>% invoke("set", "fs.s3a.fast.upload", "true")
  hconf %>% invoke("set", "fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  hconf %>% invoke("set", "fs.s3a.access.key", "Xxxx")
  hconf %>% invoke("set", "fs.s3a.secret.key", "Xxxxx")
  hconf %>% invoke("set", "fs.s3a.endpoint", "xxxx")

  return(sc)
}

sc <- getSC()

constituents <- spark_read_csv(
  sc,
  name = "constituents",
  null_value = "NA",
  path = folder_files,
  infer_schema = TRUE,
  header = TRUE,
  delimiter = ",",
  mode = "overwrite"
)
print(3)

Exception:

Error: java.util.ServiceConfigurationError: org.apache.spark.sql.sources.DataSourceRegister: Provider com.amazonaws.services.sagemaker.sparksdk.protobuf.SageMakerProtobufFileFormat could not be instantiated
    at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:581)
    at java.base/java.util.ServiceLoader$ProviderImpl.newInstance(ServiceLoader.java:803)
    at java.base/java.util.ServiceLoader$ProviderImpl.get(ServiceLoader.java:721)
    at java.base/java.util.ServiceLoader$3.next(ServiceLoader.java:1394)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
