How do I create a graph in GraphX with this?

Posted by ffscu2ro on 2021-05-29 in Hadoop


Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 4 years ago.
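I am trying to understand how to create the following in Apache Spark's GraphX. I am given:
An HDFS file containing a large amount of data in this form:
node: connectingNode1, connectingNode2, ...
For example:
123214: 521345, 235213, 657323
I need to somehow store this data in an EdgeRDD so that I can create a graph in GraphX, but I don't know how to go about it.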

hadoop mapreduce scala apache-spark spark-graphx

Source: https://stackoverflow.com/questions/41190668/how-do-i-create-a-graph-in-graphx-with-this

1 Answer


chy5wohz1#

After you have read from your HDFS source and have your data in an RDD, you can try something like the following:

import org.apache.spark.rdd.RDD
import org.apache.spark.graphx.Edge
// Sample data
val rdd = sc.parallelize(Seq("1: 1, 2, 3", "2: 2, 3"))

val edges: RDD[Edge[Int]] = rdd.flatMap {
  row => 
    // split around ":"
    val splitted = row.split(":").map(_.trim)
    // the value to the left of ":" is the source vertex:
    val srcVertex = splitted(0).toLong
    // for the values to the right of ":", we split around "," to get the other vertices
    val otherVertices = splitted(1).split(",").map(_.trim)
    // for each vertex to the right of ":", we create an Edge object connecting them to the srcVertex:
    otherVertices.map(v => Edge(srcVertex, v.toLong, 1))
}
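The snippet above assumes every line is well formed: a line without ":" or with nothing after it makes `splitted(1)` throw, and a non-numeric id makes `toLong` throw. As a minimal sketch of a more tolerant parser, the helper below (the name `parseAdjacencyLine` is hypothetical and not part of GraphX) turns one line into (source, destination) pairs and silently skips anything it cannot parse. Whether skipping is the right policy for your data is an assumption; you may prefer to log or count bad lines instead.

```scala
import scala.util.Try

// Hypothetical helper, not part of Spark or GraphX.
// Parses one adjacency line such as "123214: 521345, 235213, 657323"
// into (sourceId, destinationId) pairs. Malformed lines or ids yield
// no pairs instead of throwing.
def parseAdjacencyLine(line: String): List[(Long, Long)] =
  line.split(":", 2).map(_.trim) match {
    case Array(src, rest) =>
      // Only continue if the source id is a valid Long
      Try(src.toLong).toOption.toList.flatMap { srcId =>
        rest.split(",").toList
          .map(_.trim)
          .filter(_.nonEmpty)                          // drop empty tokens
          .flatMap(tok => Try(tok.toLong).toOption)    // drop non-numeric ids
          .map(dstId => (srcId, dstId))
      }
    case _ => Nil // no ":" separator found
  }

val pairs = parseAdjacencyLine("123214: 521345, 235213, 657323")
val skipped = parseAdjacencyLine("not a valid line")
```

With Spark available, the same helper could probably be plugged in as `sc.textFile(path).flatMap(parseAdjacencyLine).map { case (s, d) => Edge(s, d, 1) }`, where `path` stands for wherever your HDFS file lives.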

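Edit
Additionally, if your vertices have a constant default weight, you can create the graph directly from the edges, so you don't need to create a verticesRDD: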

import org.apache.spark.graphx.Graph
val g = Graph.fromEdges(edges, defaultValue = 1)
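To check that the resulting graph looks the way you expect, a small local-mode sketch like the one below may help. It hard-codes the edges that the sample data ("1: 1, 2, 3" and "2: 2, 3") would produce, and the application name and `local[*]` master are assumptions for running outside a cluster. In spark-shell, `getOrCreate` should simply return the existing context.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumed local setup for experimentation only
val conf = new SparkConf().setAppName("graphx-sketch").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

// Edges corresponding to the sample lines "1: 1, 2, 3" and "2: 2, 3"
val edges = sc.parallelize(Seq(
  Edge(1L, 1L, 1), Edge(1L, 2L, 1), Edge(1L, 3L, 1),
  Edge(2L, 2L, 1), Edge(2L, 3L, 1)
))

// Every vertex id mentioned by an edge gets the default attribute 1
val g = Graph.fromEdges(edges, defaultValue = 1)

val numVertices = g.vertices.count() // distinct vertex ids seen in the edges
val numEdges = g.edges.count()       // self-loops such as 1 -> 1 are kept
// outDegrees only lists vertices that have at least one outgoing edge
val outDegrees = g.outDegrees.collect().toMap
```

Note that GraphX keeps self-loops and duplicate edges, so if the input lists a node as its own neighbour (as the sample data does), those edges end up in the graph unless you filter them out beforehand.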

