Saving and loading

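Currently, GraphFrames does not support saving and loading out of the box. Instead, use Spark SQL to save and load the vertices and edges as DataFrames. For more details, see the Spark SQL data sources user guide.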

Note: The GraphFrames maintainers are currently working on adding support for saving and loading GraphFrames in various formats, such as GraphML and Apache GraphAr (incubating).

The following example shows how to save a graph as Parquet files of vertices and edges, and then load them back.

Python API

from graphframes import GraphFrame
from graphframes.examples import Graphs


g = Graphs(spark).friends()  # Get example graph

# Save vertices and edges as Parquet to some location
g.vertices.write.parquet("hdfs://myLocation/vertices")
g.edges.write.parquet("hdfs://myLocation/edges")

# Load the vertices and edges back
sameV = spark.read.parquet("hdfs://myLocation/vertices")
sameE = spark.read.parquet("hdfs://myLocation/edges")

# Create an identical GraphFrame
sameG = GraphFrame(sameV, sameE)
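
The save and load steps above can be wrapped in a pair of small helpers. This is only a sketch, not part of the GraphFrames API: the names graph_paths, save_graph, and load_graph are hypothetical, and the "vertices" and "edges" subdirectory layout simply mirrors the paths used in the example. The mode argument is passed to Spark's DataFrameWriter.mode; with Spark's default ("errorifexists"), writing to a location that already exists fails.

```python
def graph_paths(base_path):
    """Return the (vertices, edges) locations under base_path (hypothetical layout)."""
    base = base_path.rstrip("/")
    return base + "/vertices", base + "/edges"


def save_graph(g, base_path, mode="errorifexists"):
    """Write g.vertices and g.edges as Parquet under base_path."""
    v_path, e_path = graph_paths(base_path)
    g.vertices.write.mode(mode).parquet(v_path)
    g.edges.write.mode(mode).parquet(e_path)


def load_graph(spark, base_path):
    """Read both Parquet datasets back and rebuild a GraphFrame."""
    # Imported here so the helpers above can be defined without graphframes installed.
    from graphframes import GraphFrame

    v_path, e_path = graph_paths(base_path)
    return GraphFrame(spark.read.parquet(v_path), spark.read.parquet(e_path))
```

For example, save_graph(g, "hdfs://myLocation", mode="overwrite") followed by load_graph(spark, "hdfs://myLocation") would reproduce the round trip shown above while replacing any earlier copy at that location.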

Scala API

import org.graphframes.{examples,GraphFrame}

val g: GraphFrame = examples.Graphs.friends  // get example graph

// Save vertices and edges as Parquet to some location.
g.vertices.write.parquet("hdfs://myLocation/vertices")
g.edges.write.parquet("hdfs://myLocation/edges")

// Load the vertices and edges back.
val sameV = spark.read.parquet("hdfs://myLocation/vertices")
val sameE = spark.read.parquet("hdfs://myLocation/edges")

// Create an identical GraphFrame.
val sameG = GraphFrame(sameV, sameE)