Subgraphs
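In GraphX, the subgraph() method takes an edge triplet (an edge with its source and destination vertices, plus their attributes) and lets the user select a subgraph using triplet and vertex filters.
GraphFrames provides a more powerful way to select subgraphs by combining motif finding with DataFrame filters. Three helper methods support subgraph selection: filterVertices(condition), filterEdges(condition), and dropIsolatedVertices().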
Simple subgraph
The following example shows how to select a subgraph based on vertex and edge filters.
Python API
from graphframes.examples import Graphs
g = Graphs(spark).friends() # Get example graph
# Select subgraph of users older than 30, and relationships of type "friend"
# Drop isolated vertices (users) which are not contained in any edges (relationships)
g1 = g.filterVertices("age > 30").filterEdges("relationship = 'friend'").dropIsolatedVertices()
Scala API
import org.graphframes.{examples,GraphFrame}
val g: GraphFrame = examples.Graphs.friends
// Select subgraph of users older than 30, and relationships of type "friend".
// Drop isolated vertices (users) which are not contained in any edges (relationships).
val g1 = g.filterVertices("age > 30").filterEdges("relationship = 'friend'").dropIsolatedVertices()
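To make the effect of the chained helpers concrete without a Spark session, the following is a minimal plain-Python model of their semantics. It is not the GraphFrames implementation: the function names filter_vertices, filter_edges, and drop_isolated_vertices and the sample data are hypothetical, and the sketch assumes that filtering vertices also removes edges touching a dropped vertex while filtering edges leaves vertices untouched.

```python
# Plain-Python model of the three helpers. NOT the GraphFrames implementation;
# the helper names and the sample data are hypothetical.

def filter_vertices(vertices, edges, keep):
    """Keep vertices satisfying `keep`; drop edges that touch a removed vertex."""
    kept = {vid: attrs for vid, attrs in vertices.items() if keep(attrs)}
    kept_edges = [e for e in edges if e["src"] in kept and e["dst"] in kept]
    return kept, kept_edges

def filter_edges(vertices, edges, keep):
    """Keep edges satisfying `keep`; vertices are left untouched."""
    return vertices, [e for e in edges if keep(e)]

def drop_isolated_vertices(vertices, edges):
    """Drop vertices that are not an endpoint of any remaining edge."""
    used = {e["src"] for e in edges} | {e["dst"] for e in edges}
    return {vid: attrs for vid, attrs in vertices.items() if vid in used}, edges

# Hypothetical sample graph: users and their relationships.
vertices = {
    "a": {"name": "Alice", "age": 34},
    "b": {"name": "Bob", "age": 36},
    "c": {"name": "Charlie", "age": 30},
    "d": {"name": "David", "age": 29},
    "e": {"name": "Esther", "age": 32},
}
edges = [
    {"src": "a", "dst": "b", "relationship": "friend"},
    {"src": "b", "dst": "c", "relationship": "follow"},
    {"src": "e", "dst": "a", "relationship": "follow"},
    {"src": "d", "dst": "a", "relationship": "friend"},
]

# Same order as the example: vertex filter, edge filter, then drop isolated vertices.
v, e = filter_vertices(vertices, edges, lambda attrs: attrs["age"] > 30)
v, e = filter_edges(v, e, lambda edge: edge["relationship"] == "friend")
v, e = drop_isolated_vertices(v, e)
print(sorted(v), e)
```

In this model Esther passes the age filter but ends up isolated once only "friend" edges remain, so the last step removes her. That is the case dropIsolatedVertices() is meant to handle.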
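Complex subgraph: triplet filters
The following example shows how to select a subgraph based on triplet filters, which operate on an edge together with its source and destination vertices. The same approach extends beyond triplets by using more complex motifs.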
Python API
from graphframes import GraphFrame
from graphframes.examples import Graphs
g = Graphs(spark).friends() # Get example graph
# Select subgraph based on edges "e" of type "follow"
# pointing from a younger user "a" to an older user "b"
paths = g.find("(a)-[e]->(b)")\
    .filter("e.relationship = 'follow'")\
    .filter("a.age < b.age")
# "paths" contains vertex info. Extract the edges
e2 = paths.select("e.src", "e.dst", "e.relationship")
# In Spark 1.5+, the user may simplify this call:
# e2 = paths.select("e.*")
# Construct the subgraph
g2 = GraphFrame(g.vertices, e2)
Scala API
import org.graphframes.{examples,GraphFrame}
val g: GraphFrame = examples.Graphs.friends // get example graph
// Select subgraph based on edges "e" of type "follow"
// pointing from a younger user "a" to an older user "b".
val paths = { g.find("(a)-[e]->(b)")
.filter("e.relationship = 'follow'")
.filter("a.age < b.age") }
// "paths" contains vertex info. Extract the edges.
val e2 = paths.select("e.*")
// Construct the subgraph
val g2 = GraphFrame(g.vertices, e2)
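The triplet filter can also be pictured in plain Python, independent of Spark. The sketch below is an illustration under stated assumptions rather than how GraphFrames evaluates motifs: the sample data is hypothetical, and the list comprehension only mimics the shape of the rows produced by the "(a)-[e]->(b)" motif, with one row per edge carrying both endpoint records.

```python
# Plain-Python model of a triplet filter. NOT the GraphFrames implementation;
# the sample data is hypothetical.

vertices = {
    "a": {"name": "Alice", "age": 34},
    "b": {"name": "Bob", "age": 36},
    "c": {"name": "Charlie", "age": 30},
    "e": {"name": "Esther", "age": 32},
    "f": {"name": "Fanny", "age": 36},
}
edges = [
    {"src": "a", "dst": "b", "relationship": "friend"},
    {"src": "b", "dst": "c", "relationship": "follow"},
    {"src": "c", "dst": "b", "relationship": "follow"},
    {"src": "e", "dst": "f", "relationship": "follow"},
    {"src": "f", "dst": "c", "relationship": "follow"},
]

# Analogue of the "(a)-[e]->(b)" motif: one row per edge with both endpoints attached.
triplets = [
    {"a": vertices[edge["src"]], "e": edge, "b": vertices[edge["dst"]]}
    for edge in edges
    if edge["src"] in vertices and edge["dst"] in vertices
]

# Keep "follow" edges that point from a younger user to an older user.
paths = [
    t for t in triplets
    if t["e"]["relationship"] == "follow" and t["a"]["age"] < t["b"]["age"]
]

# Extract only the edge records, then pair them with the full vertex set.
e2 = [t["e"] for t in paths]
subgraph = (vertices, e2)
print([(edge["src"], edge["dst"]) for edge in e2])
```

As in the example above, the subgraph keeps every original vertex and only narrows the edge set. Vertices left without edges would have to be removed in a separate step.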
Property Graphs
For more advanced subgraph selection, see the Property Graphs section.