Scala: converting a SQL query to Spark

ff29svar asked on 2023-08-05 in Scala

I have a SQL query that I want to convert to Spark Scala:

SELECT aid,DId,BM,BY 
FROM (SELECT DISTINCT aid,DId,BM,BY,TO FROM SU WHERE cd =2) t 
GROUP BY aid,DId,BM,BY HAVING COUNT(*) >1;

SU is my DataFrame. Right now I run the query with:

sqlContext.sql("""
  SELECT aid,DId,BM,BY 
  FROM (SELECT DISTINCT aid,DId,BM,BY,TO FROM SU WHERE cd =2) t 
  GROUP BY aid,DId,BM,BY HAVING COUNT(*) >1
""")


Instead, I need to express this directly with the DataFrame API on SU.
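
For reference, the sqlContext.sql call can only resolve the table name SU if the DataFrame has been registered as a temporary view; a minimal sketch, assuming SU is the DataFrame mentioned above:

// Registration needed so that SQL can refer to the DataFrame as the table "SU"
// (on Spark 1.x, SU.registerTempTable("SU") instead)
SU.createOrReplaceTempView("SU")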


vq8itlhq (answer 1)

This should be the DataFrame equivalent:

SU.filter($"cd" === 2)
  .select("aid","DId","BM","BY","TO")
  .distinct()
  .groupBy("aid","DId","BM","BY")
  .count()
  .filter($"count" > 1)
  .select("aid","DId","BM","BY")

