Spark reading from Azure Event Hubs => StreamingQueryException: Input byte array has wrong 4-byte ending unit

but5z9lq  posted on 2021-05-19  in Spark

I am trying to read Azure Event Hubs messages with Spark/Python. Every time, I get the exception "StreamingQueryException: Input byte array has wrong 4-byte ending unit".
Any ideas?

conf = {}
conf["eventhubs.connectionString"] = "Endpoint=sb://XXXXXXXXX.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=XXXXXXXXXXXXX=;EntityPath=XXXXXX"

read_df = spark.readStream.format("eventhubs").options(**conf).load()
stream = read_df.writeStream.format("console").start()
stream.awaitTermination()

q5iwbnjs #1

Note that for version 2.3.15 and above, you need to encrypt the connection string in the configuration dictionary:

ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)

https://github.com/Azure/azure-event-hubs-spark/blob/master/docs/PySpark/structured-streaming-pyspark.md#event-hubs-configuration
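
Applied to the code in the question, the fix might look like the sketch below. The likely cause of the error is that newer connector versions try to decrypt the connection string, so a plain-text value fails Base64 decoding; treat that explanation as an assumption rather than something confirmed in this thread. The helper names build_eventhubs_conf and start_console_stream are hypothetical, introduced only for illustration; the only connector call used is EventHubsUtils.encrypt from the answer above, and the azure-event-hubs-spark package is assumed to be on the cluster classpath.

```python
def build_eventhubs_conf(sc, connection_string):
    """Return an Event Hubs options dict with the connection string encrypted.

    `sc` is the active SparkContext. Hypothetical helper, not part of the connector.
    """
    # EventHubsUtils.encrypt is the JVM-side helper named in the answer above.
    encrypt = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt
    return {"eventhubs.connectionString": encrypt(connection_string)}


def start_console_stream(spark, connection_string):
    """Start the same console stream as in the question, using the encrypted conf.

    Hypothetical helper; `spark` is an existing SparkSession.
    """
    conf = build_eventhubs_conf(spark.sparkContext, connection_string)
    read_df = spark.readStream.format("eventhubs").options(**conf).load()
    return read_df.writeStream.format("console").start()
```

With an existing SparkSession, calling start_console_stream(spark, "Endpoint=sb://...") and then awaitTermination() on the returned query would replace the last three lines of the question's code.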
