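Saving and loading
GraphFrames does not currently provide built-in save and load functionality. Instead, use Spark SQL to save and load the vertices and edges as DataFrames. For details, see the Spark SQL data sources user guide.
Note: The GraphFrames maintainers are currently working on support for saving and loading GraphFrames in various formats, such as GraphML and Apache GraphAr (incubating).
The following example shows how to save a graph as Parquet files for its vertices and edges, and then load them back.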
Python API
from graphframes import GraphFrame
from graphframes.examples import Graphs
g = Graphs(spark).friends() # Get example graph
# Save vertices and edges as Parquet to some location
g.vertices.write.parquet("hdfs://myLocation/vertices")
g.edges.write.parquet("hdfs://myLocation/edges")
# Load the vertices and edges back
sameV = spark.read.parquet("hdfs://myLocation/vertices")
sameE = spark.read.parquet("hdfs://myLocation/edges")
# Create an identical GraphFrame
sameG = GraphFrame(sameV, sameE)
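The save and load steps above can be wrapped in a pair of small helpers. This is only a sketch, not part of the GraphFrames API: the names `graph_paths`, `save_graph`, and `load_graph` are hypothetical, as is the layout of a `vertices` and an `edges` directory under one base location. The `mode` argument is passed to Spark's `DataFrameWriter.mode`. With Spark's default save mode, a write fails if the target path already exists, so the sketch assumes that overwriting is acceptable.

```python
def graph_paths(base_path):
    """Return (vertices_path, edges_path) under a common base location.

    Hypothetical layout: <base>/vertices and <base>/edges.
    """
    base = base_path.rstrip("/")
    return base + "/vertices", base + "/edges"


def save_graph(g, base_path, mode="overwrite"):
    """Write the vertices and edges of a GraphFrame as Parquet.

    mode is forwarded to DataFrameWriter.mode; "overwrite" replaces
    any data already stored at the target paths.
    """
    v_path, e_path = graph_paths(base_path)
    g.vertices.write.mode(mode).parquet(v_path)
    g.edges.write.mode(mode).parquet(e_path)


def load_graph(spark, base_path):
    """Read both Parquet datasets back and rebuild the GraphFrame."""
    # Imported here so the path helper can be used without graphframes installed.
    from graphframes import GraphFrame

    v_path, e_path = graph_paths(base_path)
    return GraphFrame(spark.read.parquet(v_path), spark.read.parquet(e_path))
```

With an active `spark` session and the example graph `g` from above, usage would look like `save_graph(g, "hdfs://myLocation")` followed by `sameG = load_graph(spark, "hdfs://myLocation")`.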
Scala API
import org.graphframes.{examples, GraphFrame}
val g: GraphFrame = examples.Graphs.friends // get example graph
// Save vertices and edges as Parquet to some location.
g.vertices.write.parquet("hdfs://myLocation/vertices")
g.edges.write.parquet("hdfs://myLocation/edges")
// Load the vertices and edges back.
val sameV = spark.read.parquet("hdfs://myLocation/vertices")
val sameE = spark.read.parquet("hdfs://myLocation/edges")
// Create an identical GraphFrame.
val sameG = GraphFrame(sameV, sameE)