Scala: filter a column by it not matching another column

b4qexyjb  posted on 2023-06-29  in Scala

How can I create a DataFrame containing the records of table1 that do not match table2 on the fields istituto, servizio_rap, filiale_rap, codice_rap?
I tried the following (but it does not work):

val result: Dataset[Row] = table1.where($"istituto".notEqual(table2("istituto")))
val result: Dataset[Row] = table1.where($"istituto" =!= (table2("istituto")))

Error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Resolved attribute(s) istituto#16 missing from istituto#42,servizio_rap#43,filiale_rap#44,codice_rap#45,ndg#46 in operator !Filter NOT (istituto#42 = istituto#16). Attribute(s) with the same name appear in the operation: istituto. Please check if the right attribute(s) are used.;
!Filter NOT (istituto#42 = istituto#16)

Table 1:

private val table1: DataFrame = Seq(
    ("03104", "001", "00002", "123456", "ndg1"),
    ("03104", "001", "00002", "123455", "ndg2")
  ).toDF("istituto", "servizio_rap", "filiale_rap", "codice_rap", "ndg")

Table 2:

private val secondInput: DataFrame = Seq(
    ("03106", "001", "00002", "123456", "ndg1"))
    .toDF("istituto", "servizio_rap", "filiale_rap", "codice_rap", "ndg")

Expected result:

+--------+------------+-----------+----------+----+
|istituto|servizio_rap|filiale_rap|codice_rap|ndg |
+--------+------------+-----------+----------+----+
|03106   |002         |00003      |123465    |ndg1|
+--------+------------+-----------+----------+----+

myzjeezk1#

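From a comment by Miko.
Solved with a left anti join, which keeps only the rows of the left DataFrame that have no match in the right DataFrame on the join columns: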

val result: DataFrame = secondInput.join(input,Seq("servizio_rap","filiale_rap","codice_rap","istituto"),"leftanti")
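
For anyone who wants to try this end to end, here is a minimal self-contained sketch using the data from the question. It assumes that `input` in the line above refers to the question's `table1`, which the thread does not actually confirm. The object name `LeftAntiJoinSketch` and the helper `rowsWithoutMatch` are hypothetical names introduced only for illustration. The sketch runs the anti join in both directions, since the direction decides which DataFrame's rows are kept.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object LeftAntiJoinSketch {

  // Hypothetical helper: returns the rows of `left` that have no row in
  // `right` with the same values in all of the `keys` columns.
  def rowsWithoutMatch(left: DataFrame, right: DataFrame, keys: Seq[String]): DataFrame =
    left.join(right, keys, "leftanti")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("left-anti-join-sketch")
      .master("local[*]") // assumption: local run, adjust for a cluster
      .getOrCreate()
    import spark.implicits._

    // Data copied from the question.
    val table1: DataFrame = Seq(
      ("03104", "001", "00002", "123456", "ndg1"),
      ("03104", "001", "00002", "123455", "ndg2")
    ).toDF("istituto", "servizio_rap", "filiale_rap", "codice_rap", "ndg")

    val secondInput: DataFrame = Seq(
      ("03106", "001", "00002", "123456", "ndg1")
    ).toDF("istituto", "servizio_rap", "filiale_rap", "codice_rap", "ndg")

    val keys = Seq("istituto", "servizio_rap", "filiale_rap", "codice_rap")

    // Rows of table1 with no counterpart in secondInput on the key columns.
    rowsWithoutMatch(table1, secondInput, keys).show(false)

    // Same direction as the answer: rows of secondInput with no counterpart in table1.
    rowsWithoutMatch(secondInput, table1, keys).show(false)

    spark.stop()
  }
}
```

With the sample data no combination of the four key columns appears in both DataFrames, so the first call returns both rows of `table1` and the second returns the single row of `secondInput`. The original `where` attempt fails because a filter on `table1` cannot reference a column that belongs to another DataFrame without joining the two first.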
