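I have a schema like the one below. What is the best way in Spark to select the seat and drive elements and cast each of them to a string? I am reading the data into a DataFrame in Spark 1.6.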
|-- cars: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- carId: string (nullable = true)
| | |-- carCode: string (nullable = true)
| | |-- carNumber: string (nullable = true)
| | |-- features: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- seat: string (nullable = true)
| | | | |-- drive: string (nullable = true)
Output of cars.features, aliased as car_features, in JSON:
"cars_features":[[{"seat":"Auto","drive":"Manual"}]]
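I am trying to select "Auto" and put it into one DataFrame column, and select "Manual" and put it into another column.
My current attempt, shown after the table below, returns the whole struct instead: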
+-------------------+
|car_features |
+-------------------+
| [[Auto,Manual]] |
+-------------------+
col("car.features").getItem(0).as("car_features_seat")
1 Answer
I had to index into the array twice, because cars.features is an array of arrays:
This extracts "Auto":
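A minimal sketch of that double indexing in Scala, assuming a Spark 1.6 SQLContext and a hypothetical sample record shaped like the schema in the question. The object name, the sample values for carId, carCode and carNumber, and the car_features_drive alias are illustrative and not from the original post:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.col

// Hypothetical driver object, for illustration only.
object ExtractSeatAndDrive {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("extract-seat-drive").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Assumed sample record that mirrors the schema in the question.
    val json =
      """{"cars":[{"carId":"1","carCode":"A","carNumber":"100","features":[{"seat":"Auto","drive":"Manual"}]}]}"""
    val df = sqlContext.read.json(sc.parallelize(Seq(json)))

    // cars.features has type array<array<struct<...>>>:
    // the first getItem(0) picks the first car's feature array,
    // the second getItem(0) picks the first struct inside that array.
    val firstFeature = col("cars.features").getItem(0).getItem(0)

    val result = df.select(
      // getField pulls a single field out of the struct; the cast is a
      // no-op here because the fields are already strings.
      firstFeature.getField("seat").cast("string").as("car_features_seat"),
      firstFeature.getField("drive").cast("string").as("car_features_drive")
    )

    result.show()
    sc.stop()
  }
}
```

With the sample record, the two columns should hold "Auto" and "Manual" respectively. This only reads the first feature of the first car; if a row can hold several cars or features, exploding the arrays with functions.explode before selecting the fields is probably a better fit.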