EdgeRDD Class

abstract class EdgeRDD[ED] extends RDD[Edge[ED]]

Instance constructor:

new EdgeRDD(sc: SparkContext, deps: Seq[Dependency[_]])

abstract def innerJoin[ED2, ED3](other: EdgeRDD[ED2])(f: (VertexId, VertexId, ED, ED2) ⇒ ED3)(implicit arg0: ClassTag[ED2], arg1: ClassTag[ED3]): EdgeRDD[ED3]

Inner joins this EdgeRDD with another EdgeRDD, assuming both are partitioned using the same PartitionStrategy.
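A minimal sketch of innerJoin, assuming a local SparkContext and a small hand-made edge list (the names labels, weights and joined are illustrative, not part of the API). The second EdgeRDD is derived from the first with mapValues so that both sides have matching partitions, which the join relies on:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD}

// Assumption: local mode; inside spark-shell, getOrCreate returns the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("edge-rdd-inner-join"))

// The second type parameter is the vertex attribute type; it is not used in this sketch.
val labels = EdgeRDD.fromEdges[String, Int](sc.parallelize(Seq(
  Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"), Edge(2L, 5L, "colleague"))))

// Derive the other side from `labels` so both EdgeRDDs are partitioned identically.
val weights = labels.mapValues(e => e.attr.length)

// f receives (srcId, dstId, attribute from this, attribute from other) for each matching edge.
val joined = labels.innerJoin(weights) { (src, dst, label, weight) => s"$label:$weight" }
joined.collect().foreach(println)
```

Each edge in the result keeps its endpoints and carries the combined attribute; the printed order depends on how the edges are partitioned.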

abstract def mapValues[ED2](f: (Edge[ED]) ⇒ ED2)(implicit arg0: ClassTag[ED2]): EdgeRDD[ED2]

Maps the attribute of every edge to a new value, preserving the edge structure (source and destination IDs) and changing only the attributes.
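As a short illustration of mapValues, the sketch below upper-cases each edge attribute. It assumes a local SparkContext and a hypothetical two-edge input; only the attributes change, while the source and destination IDs stay as they were:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD}

// Assumption: local mode; inside spark-shell, getOrCreate returns the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("edge-rdd-map-values"))

// The second type parameter is the vertex attribute type; it is not used in this sketch.
val edges = EdgeRDD.fromEdges[String, Int](sc.parallelize(Seq(
  Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"))))

// The function sees the whole Edge but returns only the new attribute.
val upper = edges.mapValues(e => e.attr.toUpperCase)
upper.collect().foreach(println)
```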

abstract def reverse: EdgeRDD[ED]

Reverses all the edges in this RDD, swapping the source and destination of each edge.
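A small sketch of reverse, again assuming a local SparkContext and a hypothetical two-edge input. Each edge keeps its attribute while its source and destination IDs are swapped:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD}

// Assumption: local mode; inside spark-shell, getOrCreate returns the existing context.
val sc = SparkContext.getOrCreate(
  new SparkConf().setMaster("local[2]").setAppName("edge-rdd-reverse"))

// The second type parameter is the vertex attribute type; it is not used in this sketch.
val edges = EdgeRDD.fromEdges[String, Int](sc.parallelize(Seq(
  Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"))))

// Edge(3, 7, "collab") becomes Edge(7, 3, "collab"), and so on.
val reversed = edges.reverse
reversed.collect().foreach(println)
```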

Example:

import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Create an RDD for the vertices
val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Array((3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
                       (5L, ("franklin", "prof")), (2L, ("istoica", "prof")),
                       (4L, ("peter", "student"))))
// Create an RDD for edges
val relationships: RDD[Edge[String]] =
  sc.parallelize(Array(Edge(3L, 7L, "collab"),    Edge(5L, 3L, "advisor"),
                       Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi"),
                       Edge(4L, 0L, "student"),   Edge(5L, 0L, "colleague")))
// Define a default user in case a relationship refers to a missing user
val defaultUser = ("John Doe", "Missing")

val graph = Graph(users, relationships, defaultUser)

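// Build an EdgeRDD directly from the edges, replacing every attribute with "EdgeRDD"
val edges = EdgeRDD.fromEdges(relationships.map(x => Edge(x.srcId, x.dstId, "EdgeRDD")))
edges.take(5).foreach(println)
/* Sample output (the order depends on how the edges are partitioned):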
Edge(2,5,EdgeRDD)
Edge(3,7,EdgeRDD)
Edge(5,3,EdgeRDD)
Edge(4,0,EdgeRDD)
Edge(5,0,EdgeRDD)
*/

Reference:

https://github.com/apache/spark/blob/v2.4.5/graphx/src/main/scala/org/apache/spark/graphx/EdgeRDD.scala
