我正在和Kafka一起运行spark结构化流媒体。下面是pom.xml
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<encoding>UTF-8</encoding>
<!-- Put the Scala version of the cluster -->
<scalaVersion>2.12.10</scalaVersion>
<sparkVersion>3.0.1</sparkVersion>
</properties>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>${sparkVersion}</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>${sparkVersion}</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.12</artifactId>
<version>${sparkVersion}</version>
</dependency>
用shade插件构建胖jar。jar在我的本地安装程序中按预期运行
spark-submit --master local[*] --class com.stream.Main --num-executors 3 --driver-memory 2g --executor-cores 2 --executor-memory 3g prism-event-synch-rta.jar
但当我尝试在spark cluster中使用yarn with command运行同一个jar时:
spark-submit --master yarn --deploy-mode cluster --class com.stream.Main --num-executors 4 --driver-memory 2g --executor-cores 1 --executor-memory 4g gs://jars/prism-event-synch-rta.jar
获取此异常的名称:
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:355)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
Caused by: org.apache.kafka.common.config.ConfigException: Missing required configuration "partition.assignment.strategy" which has no default value.
at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:124)
我试过设置“分区、分配、策略”,但也不起作用。
编辑:也尝试使用包选项发送kafka客户端。结果是同样的例外。
spark-submit --packages org.apache.kafka:kafka-clients:2.1.0,org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1 --master yarn --deploy-mode cluster --class com.stream.Main --num-executors 4 --driver-memory 2g --executor-cores 1 --executor-memory 4g gs://jars/prism-event-synch-rta.jar
请帮忙。
暂无答案!
目前还没有任何答案,快来回答吧!