Why does reading a stream from Kafka fail with "Unable to find encoder for type stored in a Dataset"?

Posted by nc1teljy on 2021-06-07 in Kafka


I am trying to use Spark Structured Streaming with Kafka.

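import org.apache.spark.sql.SparkSession

object StructuredStreaming {

  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: StructuredStreaming <hostname> <port>")
      System.exit(1)
    }

    // Parsed from the command line but not used below.
    val host = args(0)
    val port = args(1).toInt

    val spark = SparkSession
      .builder
      .appName("StructuredStreaming")
      .config("spark.master", "local")
      .getOrCreate()

    // Brings the implicit Encoders for primitives, tuples and case classes into scope.
    import spark.implicits._

    // Subscribe to 1 topic
    val lines = spark
      .readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9093")
      .option("subscribe", "sparkss")
      .load()

    // Cast the binary Kafka key and value columns to strings.
    // The .as[(String, String)] call is where the build error is reported.
    lines.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
      .as[(String, String)]
  }
}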

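I took the code from the Spark documentation, and I get the following build error:
Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases. .as[(String, String)]
I read in another post that this is caused by a missing import spark.implicits._, but adding it changed nothing for me.
Update: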
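For context, the error means the compiler could not find an implicit Encoder[(String, String)] at the .as[...] call. The following is a minimal sketch, assuming spark-sql 2.1.0 is on the classpath, that passes that encoder explicitly instead of relying on implicits. It only illustrates what .as[(String, String)] needs; nothing in this thread confirms that it resolves the build error under Scala 2.10. The object name ExplicitEncoderSketch and the helper keyValuePairs are hypothetical, and a small batch DataFrame built with spark.range stands in for the Kafka source.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical illustration, not code from the original post.
object ExplicitEncoderSketch {

  // The encoder that .as[(String, String)] would otherwise look up implicitly
  // through import spark.implicits._
  val pairEncoder: Encoder[(String, String)] =
    Encoders.tuple(Encoders.STRING, Encoders.STRING)

  // Builds a small batch Dataset with the same key/value string shape as the
  // casted Kafka columns (assumption: spark.range replaces the Kafka source).
  def keyValuePairs(spark: SparkSession): Dataset[(String, String)] =
    spark.range(3)
      .selectExpr("CAST(id AS STRING) AS key", "CAST(id * 10 AS STRING) AS value")
      .as[(String, String)](pairEncoder) // encoder passed explicitly

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("ExplicitEncoderSketch")
      .master("local[1]")
      .getOrCreate()
    try keyValuePairs(spark).collect().foreach(println)
    finally spark.stop()
  }
}
```

If this variant compiles while the implicit one does not, the problem lies in implicit resolution for the tuple encoder rather than in the Kafka source itself.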

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <slf4j.version>1.7.12</slf4j.version>
    <spark.version>2.1.0</spark.version>
    <scala.version>2.10.4</scala.version>
    <scala.binary.version>2.10</scala.binary.version>
</properties>

<dependencies>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>2.1.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql-kafka-0-10_2.10</artifactId>
        <version>2.1.0</version>
    </dependency>
</dependencies>

scala apache-kafka apache-spark spark-structured-streaming

Source: https://stackoverflow.com/questions/43235764/why-does-reading-stream-from-kafka-fail-with-unable-to-find-encoder-for-type-st

1 answer

fcg9iug31#

I tried it with Scala 2.11.8:

<scala.version>2.11.8</scala.version>
<scala.binary.version>2.11</scala.binary.version>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>

</dependencies>

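With the corresponding dependencies (the Scala 2.11 artifacts), it finally worked.
Warning: you need to restart the project in IntelliJ. I think there is a problem when the version is changed without a restart, because the error persists until then.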
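To check that the version change has actually taken effect after the restart, one option is to print the Scala version the application is running on. This is a minimal sketch using only the Scala standard library; the object name ScalaVersionCheck and the helper binaryVersion are hypothetical and not part of the original answer.

```scala
// Hypothetical helper, not code from the original answer.
object ScalaVersionCheck {

  // Reduces a full version such as "2.11.8" to its binary version "2.11",
  // which is the suffix used in artifact ids like spark-sql_2.11.
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the Scala library found on the runtime classpath.
    val full = scala.util.Properties.versionNumberString
    println(s"Scala version: $full (binary version: ${binaryVersion(full)})")
  }
}
```

If the printed binary version is still 2.10 after the pom has been edited, the IDE is likely still using the old project configuration.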

Answered on 2021-06-07
