Quick Start

This guide helps you get started quickly with GraphScope for graph learning tasks on your local machine.

Installation

GraphScope can be installed with a single command.

python3 -m pip install graphscope --upgrade

If the download is very slow, try a pip mirror site.

python3 -m pip install graphscope --upgrade \
    -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host=mirrors.aliyun.com

By default, the GraphScope Learning Engine uses TensorFlow as its neural network backend, so you also need to install tensorflow.

# Installing the latest version of tensorflow may cause dependency
# conflicts with GraphScope, so v2.8.0 is pinned here.
python3 -m pip install tensorflow==2.8.0
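
After both installs, it can be useful to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical and not part of GraphScope.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages this guide installs.
for name in ("graphscope", "tensorflow"):
    print(name, installed_version(name) or "not installed")
```

If tensorflow reports a version other than the one pinned above, reinstalling the pinned version is probably the simplest way to avoid dependency conflicts.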

TensorFlow v2.11.0 supports running on the Linux aarch64 platform:

>>> import platform
>>> platform.system()
'Linux'
>>> platform.processor()
'aarch64'
# Install the fixed 'v2.11.0' version of tensorflow under the linux aarch64 platform
python3 -m pip install tensorflow==2.11.0
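
The platform rule above can be sketched as a small helper that prints the matching install command. This is only an illustration: the function name pick_tensorflow_version is hypothetical, and it uses platform.machine() rather than platform.processor() on the assumption that the latter may return an empty string on some Linux systems.

```python
import platform


def pick_tensorflow_version(system=None, machine=None):
    """Pick the tensorflow version this guide suggests for a platform.

    Hypothetical helper: 2.11.0 on Linux aarch64, otherwise the
    guide's default of 2.8.0.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if system == "Linux" and machine == "aarch64":
        return "2.11.0"
    return "2.8.0"


# Print the install command for the current machine.
print(f"python3 -m pip install tensorflow=={pick_tensorflow_version()}")
```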

Running the GraphScope Learning Engine Locally

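The graphscope package includes everything needed to train GNN models on your local machine. You can now import it in a Python session and start working. The following example trains an EgoGraphSAGE model that classifies nodes (papers) into 349 classes, each representing an academic venue (for example, a preprint server or a conference).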

try:
    # https://www.tensorflow.org/guide/migrate
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()
except ImportError:
    import tensorflow as tf

import graphscope as gs
from graphscope.dataset import load_ogbn_mag
from graphscope.learning.examples import EgoGraphSAGE
from graphscope.learning.examples import EgoSAGESupervisedDataLoader
from graphscope.learning.examples.tf.trainer import LocalTrainer

gs.set_option(show_log=True)

# Define the training process of EgoGraphSAGE
def train(graph, node_type, edge_type, class_num, features_num,
          hops_num=2, nbrs_num=[25, 10], epochs=2,
          hidden_dim=256, in_drop_rate=0.5, learning_rate=0.01):
    gs.learning.reset_default_tf_graph()

    dimensions = [features_num] + [hidden_dim] * (hops_num - 1) + [class_num]
    model = EgoGraphSAGE(dimensions, act_func=tf.nn.relu, dropout=in_drop_rate)

    # prepare training dataset
    train_data = EgoSAGESupervisedDataLoader(
        graph, gs.learning.Mask.TRAIN,
        node_type=node_type, edge_type=edge_type,
        nbrs_num=nbrs_num, hops_num=hops_num,
    )
    train_embedding = model.forward(train_data.src_ego)
    train_labels = train_data.src_ego.src.labels
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=train_labels, logits=train_embedding,
        )
    )
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

    # prepare test dataset
    test_data = EgoSAGESupervisedDataLoader(
        graph, gs.learning.Mask.TEST,
        node_type=node_type, edge_type=edge_type,
        nbrs_num=nbrs_num, hops_num=hops_num,
    )
    test_embedding = model.forward(test_data.src_ego)
    test_labels = test_data.src_ego.src.labels
    test_indices = tf.math.argmax(test_embedding, 1, output_type=tf.int32)
    test_acc = tf.div(
        tf.reduce_sum(tf.cast(tf.math.equal(test_indices, test_labels), tf.float32)),
        tf.cast(tf.shape(test_labels)[0], tf.float32),
    )

    # train and test
    trainer = LocalTrainer()
    trainer.train(train_data.iterator, loss, optimizer, epochs=epochs)
    trainer.test(test_data.iterator, test_acc)

# load the ogbn-mag graph as an example.
g = load_ogbn_mag()

# define the features for learning.
paper_features = [f"feat_{i}" for i in range(128)]

# launch a learning engine.
lg = gs.graphlearn(
    g,
    nodes=[("paper", paper_features)],
    edges=[("paper", "cites", "paper")],
    gen_labels=[
        ("train", "paper", 100, (0, 75)),
        ("val", "paper", 100, (75, 85)),
        ("test", "paper", 100, (85, 100))
    ]
)

train(lg, node_type="paper", edge_type="cites",
      class_num=349,  # output dimension
      features_num=128,  # input dimension
)
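
Two details in the example are easier to see with concrete numbers: the layer sizes passed to EgoGraphSAGE and the train/val/test split requested through gen_labels. The sketch below is plain Python and does not call GraphScope. The helper names are hypothetical, and it assumes each gen_labels tuple means (label, node type, number of buckets, bucket range) with a half-open range, which should be checked against the GraphScope documentation.

```python
def layer_dimensions(features_num, hidden_dim, hops_num, class_num):
    """Reproduce the `dimensions` expression used inside train()."""
    return [features_num] + [hidden_dim] * (hops_num - 1) + [class_num]


def split_fractions(gen_labels):
    """Map each label to the fraction of buckets its range covers.

    Assumption: the range (lo, hi) is half-open over `total` buckets.
    """
    return {
        name: (hi - lo) / total
        for name, _node_type, total, (lo, hi) in gen_labels
    }


# Values taken from the example above.
print(layer_dimensions(features_num=128, hidden_dim=256, hops_num=2, class_num=349))
# prints [128, 256, 349]

gen_labels = [
    ("train", "paper", 100, (0, 75)),
    ("val", "paper", 100, (75, 85)),
    ("test", "paper", 100, (85, 100)),
]
print(split_fractions(gen_labels))
# prints {'train': 0.75, 'val': 0.1, 'test': 0.15}
```

With hops_num=2 the model has one hidden layer of 256 units between the 128 input features and the 349 output classes. Under the stated assumption, the split corresponds to 75% of paper nodes for training, 10% for validation, and 15% for testing.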

What's Next

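As the example above shows, training a GNN model on your local machine with GraphScope is straightforward. Next, you may want to learn more about the following topics: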
