Spark Java: how to sort an ArrayType(MapType) column by key and access the values of the sorted array

Posted by rggaifut on 2021-05-18 in Spark

So, I have a DataFrame with the following schema:

StructType(StructField(experimentid,StringType,true), StructField(descinten,ArrayType(MapType(StringType,DoubleType,true),true),true))

Its contents look like this:

+----------------+-------------------------------------------------------------+
|experimentid    |descinten                                                    |
+----------------+-------------------------------------------------------------+
|id1             |[[x1->120.51513], [x2->57.59762], [x3->83028.867]]           |
|id2             |[[x2->478.5698], [x3->79.6873], [x1->341.89]]                |
+----------------+-------------------------------------------------------------+

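I want to sort "descinten" by key in ascending order and then take the values in that sorted order. I tried mapping and sorting each row individually, but got errors such as:
ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to scala.collection.Map
or similar. Is there a more direct way to do this in Java?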
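
A likely cause of that exception, judging from the schema, is that the "descinten" cell is an array whose elements are maps, so reading it as a single map (for example with Row.getMap or Row.getJavaMap) fails, while Row.getList succeeds. The plain-Java sketch below only imitates that situation with java.util collections; the class and method names are hypothetical and no Spark code is involved.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ArrayOfMapsCast {

    // Returns true when treating the cell as a single Map throws ClassCastException.
    public static boolean castToMapFails(Object cell) {
        try {
            Map<?, ?> asMap = (Map<?, ?>) cell;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Stand-in for one "descinten" cell: a list of single-entry maps.
        List<Map<String, Double>> cell = Arrays.asList(
            Collections.singletonMap("x1", 120.51513),
            Collections.singletonMap("x2", 57.59762));

        // The cell as a whole is a list, so the cast to Map fails.
        System.out.println(castToMapFails(cell));        // prints true
        // Each element of the list is a map, so the cast succeeds.
        System.out.println(castToMapFails(cell.get(0))); // prints false
    }
}
```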

Answer #1 (kuhbmx9i)

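For anyone interested, I managed to solve it with map and a TreeMap. My goal was to build a vector of the values, ordered by their keys in ascending order.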

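import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.catalyst.encoders.RowEncoder;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import scala.collection.JavaConversions;
// VectorUDT and Vectors: import both from the same package, either
// org.apache.spark.ml.linalg or org.apache.spark.mllib.linalg.

// Output schema: the experiment id plus a vector of the values ordered by key.
StructType schema = new StructType(new StructField[] {
    new StructField("expids", DataTypes.StringType, false, Metadata.empty()),
    new StructField("intensity", new VectorUDT(), false, Metadata.empty())
});

// The MapFunction cast avoids ambiguity between the Scala and Java overloads of map.
Dataset<Row> newdf = olddf.map((MapFunction<Row, Row>) originalrow -> {
    String expid = originalrow.get(0).toString();
    // Column 1 is an array of maps, so read it as a list, not as a map.
    List<scala.collection.Map<String, Double>> mplist = originalrow.getList(1);
    // A TreeMap keeps its entries sorted by key in ascending order.
    Map<String, Double> treemp = new TreeMap<>();
    for (scala.collection.Map<String, Double> m : mplist) {
        // JavaConversions is deprecated since Scala 2.12 and removed in 2.13.
        treemp.putAll(JavaConversions.mapAsJavaMap(m));
    }
    // Copy the values, now in key order, into a dense vector.
    double[] values = treemp.values().stream().mapToDouble(Double::doubleValue).toArray();
    return RowFactory.create(expid, Vectors.dense(values));
}, RowEncoder.apply(schema));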
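
The core of this approach, merging the per-row maps into a TreeMap and reading the values back in key order, can be tried without a Spark cluster. The following is a minimal sketch in plain Java, assuming each row's cell has already been converted to a java.util.List of java.util.Map; the class name SortedValues and the method valuesSortedByKey are hypothetical and not part of Spark.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedValues {

    // Merges all maps into one TreeMap and returns the values in ascending key order.
    public static double[] valuesSortedByKey(List<Map<String, Double>> maps) {
        // TreeMap iterates in the natural order of its keys (lexicographic for String).
        Map<String, Double> sorted = new TreeMap<>();
        for (Map<String, Double> m : maps) {
            sorted.putAll(m);
        }
        return sorted.values().stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        // Row "id2" from the question: keys arrive in the order x2, x3, x1.
        List<Map<String, Double>> id2 = Arrays.asList(
            Collections.singletonMap("x2", 478.5698),
            Collections.singletonMap("x3", 79.6873),
            Collections.singletonMap("x1", 341.89));

        System.out.println(Arrays.toString(valuesSortedByKey(id2)));
        // prints [341.89, 478.5698, 79.6873]
    }
}
```

Note that String keys sort lexicographically, so a key such as "x10" would come before "x2". If the keys carry a numeric suffix that should drive the order, a TreeMap built with a custom Comparator would be needed instead.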
