ArangoDB with Graphistry

ArangoDB and Graphistry

We explore Game of Thrones data stored in ArangoDB to show how quickly Arango's graph support interoperates with Graphistry.

This tutorial shares two sample transforms:

  • Visualize the full graph

  • Visualize the result of a traversal query

Each one queries ArangoDB through python-arango, converts the result to pandas DataFrames, and plots it with graphistry.

Setup

[ ]:
!pip install python-arango --user -q
[1]:
from arango import ArangoClient
import pandas as pd
import graphistry
[3]:
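def paths_to_graph(paths, source='_from', destination='_to', node='_id'):
    # Each path is a dict with 'vertices' and 'edges' lists, as in traverse(...)['paths'].
    nodes_df = pd.DataFrame()
    edges_df = pd.DataFrame()
    for graph in paths:
        nodes_df = pd.concat([ nodes_df, pd.DataFrame(graph['vertices']) ], ignore_index=True)
        edges_df = pd.concat([ edges_df, pd.DataFrame(graph['edges']) ], ignore_index=True)
    # Paths overlap, so the same vertex or edge appears many times: keep one row per document.
    nodes_df = nodes_df.drop_duplicates([node])
    # Edge documents carry their own '_id', so dedupe on it rather than on the node column name.
    edges_df = edges_df.drop_duplicates(['_id'])
    return graphistry.bind(source=source, destination=destination, node=node).nodes(nodes_df).edges(edges_df)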

def graph_to_graphistry(graph, source='_from', destination='_to', node='_id'):
    # Download every document from each vertex collection of the graph into one node table.
    nodes_df = pd.DataFrame()
    for vc_name in graph.vertex_collections():
        nodes_df = pd.concat([nodes_df, pd.DataFrame([x for x in graph.vertex_collection(vc_name)])], ignore_index=True)
    # Do the same for every edge collection named in the graph's edge definitions.
    edges_df = pd.DataFrame()
    for edge_def in graph.edge_definitions():
        edges_df = pd.concat([edges_df, pd.DataFrame([x for x in graph.edge_collection(edge_def['edge_collection'])])], ignore_index=True)
    return graphistry.bind(source=source, destination=destination, node=node).nodes(nodes_df).edges(edges_df)
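
To make the deduplication step in paths_to_graph concrete, here is a minimal sketch that runs the same flatten-then-dedupe logic on hand-written paths, with no ArangoDB or Graphistry connection. The path contents (IDs, names, the `Edges` collection) are made up for illustration and only assume the `vertices`/`edges` shape that the helper reads.

```python
import pandas as pd

# Toy stand-in for traverse(...)['paths'] (hypothetical data): every path
# repeats the vertices and edges leading up to its last vertex.
paths = [
    {'vertices': [{'_id': 'Characters/1', 'name': 'Ned'}],
     'edges': []},
    {'vertices': [{'_id': 'Characters/1', 'name': 'Ned'},
                  {'_id': 'Characters/2', 'name': 'Robb'}],
     'edges': [{'_id': 'Edges/10', '_from': 'Characters/1', '_to': 'Characters/2'}]},
    {'vertices': [{'_id': 'Characters/1', 'name': 'Ned'},
                  {'_id': 'Characters/3', 'name': 'Sansa'}],
     'edges': [{'_id': 'Edges/11', '_from': 'Characters/1', '_to': 'Characters/3'}]},
]

# Stack all per-path tables, skipping paths that contribute no edges.
nodes_df = pd.concat([pd.DataFrame(p['vertices']) for p in paths], ignore_index=True)
edges_df = pd.concat([pd.DataFrame(p['edges']) for p in paths if p['edges']], ignore_index=True)

# Keep one row per ArangoDB document.
nodes_df = nodes_df.drop_duplicates(['_id'])
edges_df = edges_df.drop_duplicates(['_id'])

print(nodes_df['name'].tolist())  # ['Ned', 'Robb', 'Sansa']
print(len(edges_df))              # 2
```

Ned appears in all three paths but ends up as a single node row, which is what keeps the plotted graph free of duplicate points.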

Connect

[ ]:
# To specify Graphistry account & server, use:
# graphistry.register(api=3, username='...', password='...', protocol='https', server='hub.graphistry.com')
# For more options, see https://github.com/graphistry/pygraphistry#configure
[4]:
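# Replace the host and credentials below with your own.
# Newer python-arango releases take a single URL instead: ArangoClient(hosts='http://localhost:8529')
client = ArangoClient(protocol='http', host='localhost', port=8529)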
db = client.db('GoT', username='root', password='1234')

Demo 1: Traversal visualization

  • Use python-arango's traverse() call to walk Ned Stark's descendants

  • Convert the resulting paths to pandas and Graphistry

  • Plot the graph, labeling nodes by name instead of raw Arango vertex IDs

[7]:
paths = db.graph('theGraph').traverse(
    start_vertex='Characters/4814',
    direction='outbound',
    strategy='breadthfirst'
)['paths']
[8]:
g = paths_to_graph(paths)
g.bind(point_title='name').plot()
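
The same traversal can also be written as an AQL query, which may be useful if traverse() is unavailable in your python-arango or ArangoDB version. The sketch below is an assumption-laden alternative, not the method used above: `fetch_outbound_paths` and `max_depth` are hypothetical names, the graph name `theGraph` is taken from the demo, and the depth limit of 3 is arbitrary. It relies on python-arango's `db.aql.execute(query, bind_vars=...)` and on AQL returning each path as an object with `vertices` and `edges`, which is the shape paths_to_graph expects.

```python
def fetch_outbound_paths(db, start_vertex, max_depth=3):
    """Return an iterable of AQL traversal paths starting at start_vertex.

    Hypothetical helper: each path is a dict with 'vertices' and 'edges'.
    """
    if not isinstance(max_depth, int) or max_depth < 1:
        raise ValueError('max_depth must be a positive integer')
    # The depth is validated above and formatted into the query text;
    # the start vertex is passed as a bind variable.
    query = f"""
    FOR v, e, p IN 1..{max_depth} OUTBOUND @start GRAPH 'theGraph'
        RETURN p
    """
    return db.aql.execute(query, bind_vars={'start': start_vertex})

# Usage against the connection from the Connect step (needs a live database):
# paths = fetch_outbound_paths(db, 'Characters/4814')
# g = paths_to_graph(paths)
# g.bind(point_title='name').plot()
```

Unlike traverse(), this query caps the depth explicitly, so pick `max_depth` large enough for the part of the family tree you want to see.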

Demo 2: Full graph

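  • Use python-arango on a graph to identify and download the involved vertex/edge collections

  • Convert the results to pandas and Graphistry

  • Plot the graph, labeling nodes by name instead of raw Arango vertex IDs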

[11]:
g = graph_to_graphistry( db.graph('theGraph') )
g.bind(point_title='name').plot()
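
When a whole graph is downloaded collection by collection, it can help to check that every edge endpoint actually exists in the node table before plotting. The following is a small sketch of such a check; `dangling_edges` is a hypothetical helper, and the sample rows are invented.

```python
import pandas as pd

def dangling_edges(nodes_df, edges_df, source='_from', destination='_to', node='_id'):
    """Return the edges whose source or destination is missing from nodes_df."""
    known = set(nodes_df[node])
    ok = edges_df[source].isin(known) & edges_df[destination].isin(known)
    return edges_df[~ok]

# Invented sample: the second edge points at a vertex that was never downloaded.
nodes_df = pd.DataFrame({'_id': ['Characters/1', 'Characters/2']})
edges_df = pd.DataFrame({
    '_id': ['Edges/1', 'Edges/2'],
    '_from': ['Characters/1', 'Characters/2'],
    '_to': ['Characters/2', 'Characters/99'],
})

print(dangling_edges(nodes_df, edges_df)['_id'].tolist())  # ['Edges/2']
```

With the `g` built above, the bound tables should be reachable as `g._nodes` and `g._edges`, so `dangling_edges(g._nodes, g._edges)` would list any edges that refer to missing vertices.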