How can I test Scala Spark unit-test/DQ checks with the ZIO library?

utugiqy6 posted on 2023-01-26 in Scala

I'm new to Scala. I'm trying to use the ZIO library to write unit-test assertions for UT/DQ checks on Scala Spark DataFrames. Has anyone worked with ZIO who could help me?

kadbb459


I recommend spark-fast-tests for asserting on Spark DataFrames in Scala. ZIO Test is not one of the frameworks spark-fast-tests supports, but you should still be able to use it.

Example

Suppose you need to test a transformation on a DataFrame:

import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.DataFrame

object Transformations {
  def appendLiteral(incomingData: DataFrame): DataFrame =
    incomingData.withColumn("foo", lit("bar"))
}

A simple test that doesn't make use of the broader ZIO effect ecosystem might look like this:

import com.github.mrpowers.spark.fast.tests.DataFrameComparer
import org.apache.spark.sql.{DataFrame, SparkSession}
import zio.test._
import zio.test.Assertion._

object TransformationsSpec extends ZIOSpecDefault with DataFrameComparer {
  val spark: SparkSession = SparkSession.builder().config("spark.master", "local").getOrCreate()
  import spark.implicits._

  def spec = suite("TransformationSpec")(
    test("appendLiteral adds a column named 'foo' with value 'bar'") {
      val testInput: DataFrame = Seq("Hello", "hi", "howdy").toDF("greeting")
      val expected: DataFrame =  Seq(("Hello", "bar"), ("hi", "bar"), ("howdy", "bar")).toDF("greeting", "foo")

      val result = testInput.transform(Transformations.appendLiteral)

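      // spark-fast-tests takes the actual DataFrame first, then the expected one;
      // the call returns Unit on success and throws on a mismatch.
      assert(assertSmallDataFrameEquality(result, expected, ignoreNullable = true))(isUnit)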
    }
  )
}
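
If you do want to lean on ZIO a bit more, and to cover the DQ part of the question (assuming DQ means data-quality checks such as "no nulls" or "no duplicate keys"), here is a rough sketch for ZIO 2. `DataQualityChecks`, `nullCount`, `duplicateCount` and `sparkLayer` are names I made up for illustration; they are not part of Spark, ZIO or spark-fast-tests. The `SparkSession` is acquired once through a scoped `ZLayer` and shared across the suite.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import zio._
import zio.test._

// Hypothetical helpers (not library APIs): plain functions over a DataFrame.
object DataQualityChecks {
  // Number of rows where `column` is null.
  def nullCount(df: DataFrame, column: String): Long =
    df.filter(col(column).isNull).count()

  // Number of surplus rows that repeat an already-seen value of `column`.
  def duplicateCount(df: DataFrame, column: String): Long =
    df.count() - df.dropDuplicates(column).count()
}

object DataQualitySpec extends ZIOSpecDefault {
  // Assumed setup: a local session, started once and stopped when the suite ends.
  val sparkLayer: ZLayer[Any, Throwable, SparkSession] =
    ZLayer.scoped {
      ZIO.acquireRelease(
        ZIO.attempt(
          SparkSession.builder().master("local[1]").appName("dq-spec").getOrCreate()
        )
      )(spark => ZIO.attempt(spark.stop()).ignore)
    }

  def spec = suite("DataQualitySpec")(
    test("greeting has no nulls and id has no duplicates") {
      for {
        spark <- ZIO.service[SparkSession]
        df <- ZIO.attempt(
                spark
                  .createDataFrame(Seq((1, "Hello"), (2, "hi"), (3, "howdy")))
                  .toDF("id", "greeting")
              )
        nulls <- ZIO.attempt(DataQualityChecks.nullCount(df, "greeting"))
        dups  <- ZIO.attempt(DataQualityChecks.duplicateCount(df, "id"))
      } yield assertTrue(nulls == 0L) && assertTrue(dups == 0L)
    }
  ).provideLayerShared(sparkLayer)
}
```

Keeping the checks as ordinary functions means they don't depend on ZIO at all; only the test wiring does, so the same helpers can be reused from a job or another test framework.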
