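I'm trying to find the names of all columns that contain illegal values (in this case, values greater than 1 or less than -1). I'm using the aggregate functions min and max, and I'd like to know whether there is a more efficient implementation than aggregation. Also, this approach doesn't work if I want to find the columns that contain cells with certain specific values.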
import org.apache.spark.sql.functions.{col, max, min}

val columnsSelected = List("column 1", "column 2", "column 3", "column 4", "column 5")
val selectedColsDF = inputDF.select(columnsSelected.map(c => col(c)): _*)
selectedColsDF.show()
val colsWithIllegalVals = selectedColsDF
  .select(selectedColsDF.columns.map(c => (max(col(c)) > 1 || min(col(c)) < -1).alias(c)): _*) // true if the column contains illegal values, else false
  .head().toSeq.zipWithIndex // convert to a list of tuples of (true/false flag, column index)
  .filter(_._1 == true).map(_._2) // keep only the indices of the columns flagged true
  .map(i => selectedColsDF.columns.apply(i)) // use the index to look up the column name
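
One possible variation, offered as a sketch rather than a definitive answer: instead of comparing min and max, count the cells that satisfy an arbitrary predicate in each column. This is still a single aggregation over the data, so it is not claimed to be faster, but it also covers the "cells with certain specific values" case. The helper name `columnsMatching`, the object `IllegalValueColumns`, and the sample data are hypothetical and not from the question; only `col`, `when`, `count` and `isin` are existing Spark functions.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count, when}

object IllegalValueColumns {
  // Hypothetical helper: names of the columns in which at least one cell satisfies `isIllegal`.
  def columnsMatching(df: DataFrame, isIllegal: Column => Column): Seq[String] = {
    // when(...) without otherwise(...) yields null for non-matching rows, and count skips nulls,
    // so each aggregate is the number of matching cells in that column.
    val counts = df.columns.map(c => count(when(isIllegal(col(c)), true)).alias(c))
    val row = df.select(counts: _*).head()
    // Keep the names of the columns whose match count is positive.
    df.columns.zipWithIndex.collect { case (name, i) if row.getLong(i) > 0 => name }.toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("illegal-values").getOrCreate()
    import spark.implicits._

    // Made-up sample data, for illustration only.
    val df = Seq((0.5, 2.0, -0.3), (-0.2, 0.1, -1.5), (1.0, 0.0, 0.9))
      .toDF("column 1", "column 2", "column 3")

    // Columns with values outside [-1, 1].
    println(columnsMatching(df, c => c > 1 || c < -1).mkString(", "))
    // Columns containing any of a set of specific values.
    println(columnsMatching(df, c => c.isin(0.0, 0.9)).mkString(", "))

    spark.stop()
  }
}
```

Because the predicate is passed in as a function, the same helper should handle both the range check and the specific-value check. Note that an all-null column is never flagged, since null comparisons do not match.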