Subgraphs

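In GraphX, the subgraph() method takes an edge triplet (the edge, its source vertex, and its destination vertex, plus their attributes) and lets the user select a subgraph based on triplet and vertex filters.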

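GraphFrames provide a more powerful way to select subgraphs, by combining motif finding with DataFrame filters. Three helper methods are provided for subgraph selection: filterVertices(condition), filterEdges(condition), and dropIsolatedVertices().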
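To make the behaviour of the three helpers concrete, here is a minimal plain-Python model of them. It is a sketch of the semantics only, not GraphFrames code: the function names filter_vertices, filter_edges and drop_isolated_vertices, the dict-based graph representation, and the toy data are all assumptions made for illustration. It assumes that filtering vertices also removes edges that lose an endpoint, while filtering edges leaves the vertex set unchanged.

```python
# Plain-Python model of the three helpers; this is NOT the GraphFrames API.
# Vertices are dicts with an "id"; edges are dicts with "src" and "dst".

def filter_vertices(vertices, edges, keep):
    """Keep vertices passing `keep`; drop edges that lose an endpoint."""
    kept = [v for v in vertices if keep(v)]
    ids = {v["id"] for v in kept}
    return kept, [e for e in edges if e["src"] in ids and e["dst"] in ids]

def filter_edges(vertices, edges, keep):
    """Keep edges passing `keep`; the vertices are left untouched."""
    return list(vertices), [e for e in edges if keep(e)]

def drop_isolated_vertices(vertices, edges):
    """Drop vertices that are not an endpoint of any remaining edge."""
    used = {e["src"] for e in edges} | {e["dst"] for e in edges}
    return [v for v in vertices if v["id"] in used], list(edges)

# Made-up toy data, for illustration only.
vertices = [
    {"id": "a", "age": 34},
    {"id": "b", "age": 36},
    {"id": "c", "age": 30},
    {"id": "e", "age": 32},
]
edges = [
    {"src": "a", "dst": "b", "relationship": "friend"},
    {"src": "b", "dst": "c", "relationship": "follow"},
    {"src": "e", "dst": "a", "relationship": "follow"},
]

# Same chain as filterVertices -> filterEdges -> dropIsolatedVertices.
v1, e1 = filter_vertices(vertices, edges, lambda v: v["age"] > 30)
v2, e2 = filter_edges(v1, e1, lambda e: e["relationship"] == "friend")
v3, e3 = drop_isolated_vertices(v2, e2)
```

In this toy run, vertex "e" survives the age filter but ends up with no "friend" edge, so only the final step removes it.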

Simple subgraphs

The following example shows how to select a subgraph based on vertex and edge filters.

Python API

from graphframes.examples import Graphs

g = Graphs(spark).friends()  # Get example graph

# Select subgraph of users older than 30, and relationships of type "friend"
# Drop isolated vertices (users) which are not contained in any edges (relationships)
g1 = g.filterVertices("age > 30").filterEdges("relationship = 'friend'").dropIsolatedVertices()

Scala API

import org.graphframes.{examples,GraphFrame}

val g: GraphFrame = examples.Graphs.friends

// Select subgraph of users older than 30, and relationships of type "friend".
// Drop isolated vertices (users) which are not contained in any edges (relationships).
val g1 = g.filterVertices("age > 30").filterEdges("relationship = 'friend'").dropIsolatedVertices()

Complex subgraphs: triplet filters

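The following example shows how to select a subgraph based on triplet filters, which operate on an edge together with its source and destination vertices. The example can be extended beyond triplets by using more complex motifs.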

Python API

from graphframes import GraphFrame  # needed to construct the subgraph below
from graphframes.examples import Graphs

g = Graphs(spark).friends()  # Get example graph

# Select subgraph based on edges "e" of type "follow"
# pointing from a younger user "a" to an older user "b"
paths = g.find("(a)-[e]->(b)")\
  .filter("e.relationship = 'follow'")\
  .filter("a.age < b.age")

# "paths" contains vertex info. Extract the edges
e2 = paths.select("e.src", "e.dst", "e.relationship")

# In Spark 1.5+, the user may simplify this call:
# e2 = paths.select("e.*")

# Construct the subgraph
g2 = GraphFrame(g.vertices, e2)

Scala API

import org.graphframes.{examples,GraphFrame}

val g: GraphFrame = examples.Graphs.friends  // get example graph

// Select subgraph based on edges "e" of type "follow"
// pointing from a younger user "a" to an older user "b".
val paths = { g.find("(a)-[e]->(b)")
  .filter("e.relationship = 'follow'")
  .filter("a.age < b.age") }
// "paths" contains vertex info. Extract the edges.
val e2 = paths.select("e.*")

// Construct the subgraph
val g2 = GraphFrame(g.vertices, e2)
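
The triplet filter used above can be pictured without Spark as a join of each edge with its two endpoint vertices, followed by a filter on the joined record. The sketch below is a plain-Python illustration under that assumption. The triplets helper, the dict-based representation, and the toy data are hypothetical and not part of GraphFrames.

```python
# Plain-Python illustration of a triplet filter; this is NOT the GraphFrames API.

def triplets(vertices, edges):
    """Yield (src_vertex, edge, dst_vertex) per edge, like the motif (a)-[e]->(b)."""
    by_id = {v["id"]: v for v in vertices}
    for e in edges:
        yield by_id[e["src"]], e, by_id[e["dst"]]

# Made-up toy data, for illustration only.
vertices = [
    {"id": "a", "age": 34},
    {"id": "b", "age": 36},
    {"id": "c", "age": 30},
]
edges = [
    {"src": "a", "dst": "b", "relationship": "friend"},
    {"src": "b", "dst": "c", "relationship": "follow"},
    {"src": "c", "dst": "b", "relationship": "follow"},
]

# Keep "follow" edges that point from a younger user to an older user.
kept_edges = [
    e
    for a, e, b in triplets(vertices, edges)
    if e["relationship"] == "follow" and a["age"] < b["age"]
]

# As in the examples above, the subgraph keeps the full vertex set
# and only the matching edges.
subgraph = (vertices, kept_edges)
```

Because the vertex set is reused unchanged, the resulting subgraph can contain isolated vertices (here, "a").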

Property Graphs

For more advanced subgraph selection, see the Property Graphs section.