I am trying to use Spark to write a dataset in text format to an S3 bucket,
but I am getting the following error:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.CommandLineWrapper.main(CommandLineWrapper.java:64)
Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified by setting the fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey properties (respectively).
My code:
override fun write(input: Dataset<String>) =
    input.coalesce(NUMBER_PARTITIONS).write().text(S3_BUCKET_PATH)
        .also {
            LOGGER.logInfo(
                LOG_MESSAGE_TEMPLATE,
                READ_DATA_METHOD,
                WRITE_MESSAGE
            )
        }
My Spark configuration:
object SparkConfiguration {

    private const val SPARK_MASTER_NAME = "spark.master"
    private const val SPARK_APP_NAME_CONFIG = "spark.app.name"

    fun buildSparkSession(config: Config): SparkSession {
        return SparkSession.builder()
            .config(buildSparkConfig(config))
            .orCreate
    }

    fun buildSparkConfig(config: Config): SparkConf = SparkConf()
        .setMaster(config.getString(SPARK_MASTER_NAME))
        .setAppName(config.getString(SPARK_APP_NAME_CONFIG))
}
1 Answer
This happens because the Spark job runs without credentials/permissions for S3:
Make sure you have permission to access S3 from your code.
Ideally, set the credentials in conf/core-site.xml, for example:
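A minimal sketch of what that core-site.xml entry could look like, using the two property names quoted in the error message (the placeholder values are assumptions and must be replaced with your own credentials):

<configuration>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>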
Alternatively, reinstall and reconfigure the awscli on your machine.
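If editing core-site.xml is not convenient, the same credentials can also be supplied programmatically. This is only a sketch extending the buildSparkConfig function from the question: the config keys aws.accessKeyId and aws.secretAccessKey are hypothetical names, and it relies on Spark copying spark.hadoop.* entries into the Hadoop configuration used by the S3 filesystem:

fun buildSparkConfig(config: Config): SparkConf = SparkConf()
    .setMaster(config.getString(SPARK_MASTER_NAME))
    .setAppName(config.getString(SPARK_APP_NAME_CONFIG))
    // spark.hadoop.* properties are forwarded to the Hadoop configuration,
    // which is where the fs.s3 connector looks for its credentials.
    // "aws.accessKeyId" / "aws.secretAccessKey" are hypothetical config keys.
    .set("spark.hadoop.fs.s3.awsAccessKeyId", config.getString("aws.accessKeyId"))
    .set("spark.hadoop.fs.s3.awsSecretAccessKey", config.getString("aws.secretAccessKey"))

With either approach the job should pick up the credentials the next time the SparkSession is built.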