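import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// Load data for the first DataFrame and number its rows consecutively in "id".
// The window is ordered but not partitioned: partitioning by the unique generated id would give every row the number 1.
val dfa = dfaData.withColumn("id", monotonically_increasing_id()).withColumn("id", row_number().over(Window.orderBy($"id".asc)))
// Load data for the second DataFrame and number its rows the same way.
val dfb = dfbData.withColumn("id", monotonically_increasing_id()).withColumn("id", row_number().over(Window.orderBy($"id".asc)))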
// Cross join pairs every dfa row with every dfb row, then flags the pairs where the two columns are equal.
dfa.crossJoin(dfb).withColumn("matched", when($"filtereddescription" === $"name", lit("matched")).otherwise(lit("not matched"))).show(false)
1 answer
7dl7o3gd #1
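Since you have not described the complete flow of your logic, I am only adding the logic below to match a single column across the two tables.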
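As a rough illustration of matching one column across two DataFrames, the sketch below uses a left join instead of a cross join, so each `dfa` row gets a single flag rather than one row per pair. The sample values and the local `SparkSession` settings are assumptions made up for this example; only the column names `filtereddescription` and `name` come from the question. It is written in script style, for example for pasting into `spark-shell`, and is not necessarily the approach the rest of this answer takes.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}

// Local session for illustration only (assumption); in spark-shell this returns the existing session.
val spark = SparkSession.builder().master("local[*]").appName("match-sketch").getOrCreate()

// Hypothetical sample data: the real dfaData / dfbData are not shown in the question.
val dfa = spark.createDataFrame(Seq(Tuple1("apple"), Tuple1("banana"), Tuple1("cherry"))).toDF("filtereddescription")
val dfb = spark.createDataFrame(Seq(Tuple1("banana"), Tuple1("cherry"), Tuple1("date"))).toDF("name")

// A left join keeps every dfa row; rows without a partner in dfb end up with a null "name".
val result = dfa
  .join(dfb, dfa("filtereddescription") === dfb("name"), "left")
  .withColumn("matched", when(col("name").isNotNull, lit("matched")).otherwise(lit("not matched")))

result.show(false)
```

Note that if `name` contains duplicate values in `dfb`, the join repeats the matching `dfa` row once per duplicate, so deduplicating `dfb` first may be needed depending on the intended result.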