OnDiskDataset for Heterogeneous Graph
This tutorial shows how to create an OnDiskDataset for a heterogeneous graph that can be used in the GraphBolt framework. The major difference from creating a dataset for a homogeneous graph is that we need to specify node/edge types for edges, feature data, and training/validation/test sets.
By the end of this tutorial, you will be able to:
- organize graph structure data,
- organize feature data,
- organize training/validation/test sets for specific tasks.
To create an OnDiskDataset object, you need to organize all the data, including graph structure, feature data, and tasks, into a single directory. The directory should contain a metadata.yaml file that describes the metadata of the dataset.
Now let's generate the various data step by step, organize them together, and finally instantiate OnDiskDataset.
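By the end of the tutorial, the base directory will look roughly like this (a sketch showing only a few of the files generated below):

ondisk_dataset_heterograph/
├── metadata.yaml
├── like-edges.csv            # graph structure, one file per edge type
├── follow-edges.csv
├── node-user-feat-0.npy      # feature data, per node/edge type
├── node-user-feat-1.pt
├── nc-train-user-ids.npy     # training/validation/test sets, per task
└── ...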
Install DGL package
[1]:
# Install required packages.
import os
import torch
import numpy as np
os.environ['TORCH'] = torch.__version__
os.environ['DGLBACKEND'] = "pytorch"
# Install the CPU version.
device = torch.device("cpu")
!pip install --pre dgl -f https://data.dgl.ai/wheels-test/repo.html
try:
    import dgl
    import dgl.graphbolt as gb
    installed = True
except ImportError as error:
    installed = False
    print(error)
print("DGL installed!" if installed else "DGL not found!")
DGL installed!
Data preparation
In order to demonstrate how to organize various data, let's create a base directory first.
[2]:
base_dir = './ondisk_dataset_heterograph'
os.makedirs(base_dir, exist_ok=True)
print(f"Created base directory: {base_dir}")
Created base directory: ./ondisk_dataset_heterograph
Generate graph structure data
For a heterogeneous graph, we need to save the edges (namely seeds) of each edge type into separate NumPy or CSV files.
Note:
- When saving to NumPy, the array is required to be of shape (2, N). This format is recommended, as constructing the graph from it is much faster than from a CSV file.
- When saving to a CSV file, do not save the index or the header.
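For reference, here is a minimal sketch of the recommended NumPy alternative to the CSV files used in this tutorial. It reuses base_dir and the like_edges array created in the next cells; the .npy file name is a hypothetical choice:

# Hypothetical alternative: save the "user:like:item" edges as a (2, N) NumPy array.
# `like_edges` below has shape (num_edges, 2), so transpose it before saving.
like_edges_npy_path = os.path.join(base_dir, "like-edges.npy")
np.save(like_edges_npy_path, like_edges.T)
# The corresponding edge entry in metadata.yaml would then use `format: numpy` instead of `format: csv`.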
[3]:
import numpy as np
import pandas as pd
# For simplicity, we create a heterogeneous graph with
# 2 node types: `user`, `item`
# 2 edge types: `user:like:item`, `user:follow:user`
# And each node/edge type has the same number of nodes/edges.
num_nodes = 1000
num_edges = 10 * num_nodes
# Edge type: "user:like:item"
like_edges_path = os.path.join(base_dir, "like-edges.csv")
like_edges = np.random.randint(0, num_nodes, size=(num_edges, 2))
print(f"Part of [user:like:item] edges: {like_edges[:5, :]}\n")
df = pd.DataFrame(like_edges)
df.to_csv(like_edges_path, index=False, header=False)
print(f"[user:like:item] edges are saved into {like_edges_path}\n")
# Edge type: "user:follow:user"
follow_edges_path = os.path.join(base_dir, "follow-edges.csv")
follow_edges = np.random.randint(0, num_nodes, size=(num_edges, 2))
print(f"Part of [user:follow:user] edges: {follow_edges[:5, :]}\n")
df = pd.DataFrame(follow_edges)
df.to_csv(follow_edges_path, index=False, header=False)
print(f"[user:follow:user] edges are saved into {follow_edges_path}\n")
Part of [user:like:item] edges: [[223 454]
[488 169]
[636 944]
[556 151]
[ 64 198]]
[user:like:item] edges are saved into ./ondisk_dataset_heterograph/like-edges.csv
Part of [user:follow:user] edges: [[784 947]
[ 47 185]
[663 659]
[ 25 938]
[676 964]]
[user:follow:user] edges are saved into ./ondisk_dataset_heterograph/follow-edges.csv
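Optionally, we can verify that the CSV files round-trip exactly as written; a small sketch, not part of the original tutorial:

# Read the edges back; they were saved without index or header.
reloaded_like_edges = pd.read_csv(like_edges_path, header=None).to_numpy()
assert reloaded_like_edges.shape == (num_edges, 2)
assert (reloaded_like_edges == like_edges).all()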
Generate feature data for graph
For feature data, NumPy arrays and torch tensors are supported for now. Let's generate feature data for each node/edge type.
[4]:
# Generate node[user] feature in numpy array.
node_user_feat_0_path = os.path.join(base_dir, "node-user-feat-0.npy")
node_user_feat_0 = np.random.rand(num_nodes, 5)
print(f"Part of node[user] feature [feat_0]: {node_user_feat_0[:3, :]}")
np.save(node_user_feat_0_path, node_user_feat_0)
print(f"Node[user] feature [feat_0] is saved to {node_user_feat_0_path}\n")
# Generate another node[user] feature in torch tensor
node_user_feat_1_path = os.path.join(base_dir, "node-user-feat-1.pt")
node_user_feat_1 = torch.rand(num_nodes, 5)
print(f"Part of node[user] feature [feat_1]: {node_user_feat_1[:3, :]}")
torch.save(node_user_feat_1, node_user_feat_1_path)
print(f"Node[user] feature [feat_1] is saved to {node_user_feat_1_path}\n")
# Generate node[item] feature in numpy array.
node_item_feat_0_path = os.path.join(base_dir, "node-item-feat-0.npy")
node_item_feat_0 = np.random.rand(num_nodes, 5)
print(f"Part of node[item] feature [feat_0]: {node_item_feat_0[:3, :]}")
np.save(node_item_feat_0_path, node_item_feat_0)
print(f"Node[item] feature [feat_0] is saved to {node_item_feat_0_path}\n")
# Generate another node[item] feature in torch tensor
node_item_feat_1_path = os.path.join(base_dir, "node-item-feat-1.pt")
node_item_feat_1 = torch.rand(num_nodes, 5)
print(f"Part of node[item] feature [feat_1]: {node_item_feat_1[:3, :]}")
torch.save(node_item_feat_1, node_item_feat_1_path)
print(f"Node[item] feature [feat_1] is saved to {node_item_feat_1_path}\n")
# Generate edge[user:like:item] feature in numpy array.
edge_like_feat_0_path = os.path.join(base_dir, "edge-like-feat-0.npy")
edge_like_feat_0 = np.random.rand(num_edges, 5)
print(f"Part of edge[user:like:item] feature [feat_0]: {edge_like_feat_0[:3, :]}")
np.save(edge_like_feat_0_path, edge_like_feat_0)
print(f"Edge[user:like:item] feature [feat_0] is saved to {edge_like_feat_0_path}\n")
# Generate another edge[user:like:item] feature in torch tensor
edge_like_feat_1_path = os.path.join(base_dir, "edge-like-feat-1.pt")
edge_like_feat_1 = torch.rand(num_edges, 5)
print(f"Part of edge[user:like:item] feature [feat_1]: {edge_like_feat_1[:3, :]}")
torch.save(edge_like_feat_1, edge_like_feat_1_path)
print(f"Edge[user:like:item] feature [feat_1] is saved to {edge_like_feat_1_path}\n")
# Generate edge[user:follow:user] feature in numpy array.
edge_follow_feat_0_path = os.path.join(base_dir, "edge-follow-feat-0.npy")
edge_follow_feat_0 = np.random.rand(num_edges, 5)
print(f"Part of edge[user:follow:user] feature [feat_0]: {edge_follow_feat_0[:3, :]}")
np.save(edge_follow_feat_0_path, edge_follow_feat_0)
print(f"Edge[user:follow:user] feature [feat_0] is saved to {edge_follow_feat_0_path}\n")
# Generate another edge[user:follow:user] feature in torch tensor
edge_follow_feat_1_path = os.path.join(base_dir, "edge-follow-feat-1.pt")
edge_follow_feat_1 = torch.rand(num_edges, 5)
print(f"Part of edge[user:follow:user] feature [feat_1]: {edge_follow_feat_1[:3, :]}")
torch.save(edge_follow_feat_1, edge_follow_feat_1_path)
print(f"Edge[user:follow:user] feature [feat_1] is saved to {edge_follow_feat_1_path}\n")
Part of node[user] feature [feat_0]: [[0.34530439 0.51361204 0.44487117 0.47657383 0.33869241]
[0.14177148 0.88608203 0.18546059 0.94919518 0.42267535]
[0.57366088 0.02856163 0.43094464 0.90240259 0.78299396]]
Node[user] feature [feat_0] is saved to ./ondisk_dataset_heterograph/node-user-feat-0.npy
Part of node[user] feature [feat_1]: tensor([[0.5241, 0.0636, 0.9646, 0.8781, 0.7311],
[0.6916, 0.6266, 0.4001, 0.1896, 0.1379],
[0.4709, 0.3672, 0.3312, 0.6207, 0.0476]])
Node[user] feature [feat_1] is saved to ./ondisk_dataset_heterograph/node-user-feat-1.pt
Part of node[item] feature [feat_0]: [[0.25721427 0.51771909 0.8278702 0.62791233 0.29943825]
[0.95104018 0.72910706 0.50528496 0.19739672 0.65760257]
[0.37939125 0.86060118 0.50888008 0.00102259 0.60673931]]
Node[item] feature [feat_0] is saved to ./ondisk_dataset_heterograph/node-item-feat-0.npy
Part of node[item] feature [feat_1]: tensor([[0.2722, 0.6056, 0.7261, 0.1551, 0.4849],
[0.1026, 0.2786, 0.7825, 0.7189, 0.4327],
[0.0944, 0.3622, 0.7748, 0.1745, 0.0108]])
Node[item] feature [feat_1] is saved to ./ondisk_dataset_heterograph/node-item-feat-1.pt
Part of edge[user:like:item] feature [feat_0]: [[0.35191634 0.55082039 0.64912221 0.29406678 0.54022448]
[0.1076132 0.80823449 0.19773621 0.73198129 0.15072053]
[0.52699964 0.69212724 0.75319037 0.39414802 0.53260797]]
Edge[user:like:item] feature [feat_0] is saved to ./ondisk_dataset_heterograph/edge-like-feat-0.npy
Part of edge[user:like:item] feature [feat_1]: tensor([[0.7484, 0.3172, 0.4100, 0.1923, 0.3520],
[0.0528, 0.4414, 0.4194, 0.7457, 0.6756],
[0.7973, 0.6465, 0.2455, 0.6211, 0.1246]])
Edge[user:like:item] feature [feat_1] is saved to ./ondisk_dataset_heterograph/edge-like-feat-1.pt
Part of edge[user:follow:user] feature [feat_0]: [[0.0760699 0.5714282 0.44356893 0.59451301 0.0425682 ]
[0.88398336 0.37164033 0.87853397 0.47953317 0.32131761]
[0.19834625 0.56664029 0.74931118 0.75929103 0.11101125]]
Edge[user:follow:user] feature [feat_0] is saved to ./ondisk_dataset_heterograph/edge-follow-feat-0.npy
Part of edge[user:follow:user] feature [feat_1]: tensor([[0.8681, 0.8286, 0.3780, 0.7100, 0.7202],
[0.0604, 0.0072, 0.2118, 0.4139, 0.7075],
[0.4383, 0.2968, 0.0711, 0.6355, 0.2753]])
Edge[user:follow:user] feature [feat_1] is saved to ./ondisk_dataset_heterograph/edge-follow-feat-1.pt
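Feature files saved as .npy can later be memory-mapped instead of fully loaded into RAM, which is what the optional in_memory: false setting in metadata.yaml (see below) builds on. A minimal sketch of the idea using NumPy directly:

# Memory-map the saved feature file; rows are read from disk on access.
mmapped_feat = np.load(node_user_feat_0_path, mmap_mode="r")
print(mmapped_feat.shape)  # (num_nodes, 5)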
Generate tasks
OnDiskDataset supports multiple tasks. For each task, we need to prepare training/validation/test sets respectively; such sets usually vary across tasks. In this tutorial, let's create a node classification task and a link prediction task.
Node classification task
For the node classification task, we need node IDs and corresponding labels for each training/validation/test set. Like feature data, NumPy arrays and torch tensors are supported for these sets.
[5]:
# For illustration, let's generate item sets for each node type.
num_trains = int(num_nodes * 0.6)
num_vals = int(num_nodes * 0.2)
num_tests = num_nodes - num_trains - num_vals
user_ids = np.arange(num_nodes)
np.random.shuffle(user_ids)
item_ids = np.arange(num_nodes)
np.random.shuffle(item_ids)
# Train IDs for user.
nc_train_user_ids_path = os.path.join(base_dir, "nc-train-user-ids.npy")
nc_train_user_ids = user_ids[:num_trains]
print(f"Part of train ids[user] for node classification: {nc_train_user_ids[:3]}")
np.save(nc_train_user_ids_path, nc_train_user_ids)
print(f"NC train ids[user] are saved to {nc_train_user_ids_path}\n")
# Train labels for user.
nc_train_user_labels_path = os.path.join(base_dir, "nc-train-user-labels.pt")
nc_train_user_labels = torch.randint(0, 10, (num_trains,))
print(f"Part of train labels[user] for node classification: {nc_train_user_labels[:3]}")
torch.save(nc_train_user_labels, nc_train_user_labels_path)
print(f"NC train labels[user] are saved to {nc_train_user_labels_path}\n")
# Train IDs for item.
nc_train_item_ids_path = os.path.join(base_dir, "nc-train-item-ids.npy")
nc_train_item_ids = item_ids[:num_trains]
print(f"Part of train ids[item] for node classification: {nc_train_item_ids[:3]}")
np.save(nc_train_item_ids_path, nc_train_item_ids)
print(f"NC train ids[item] are saved to {nc_train_item_ids_path}\n")
# Train labels for item.
nc_train_item_labels_path = os.path.join(base_dir, "nc-train-item-labels.pt")
nc_train_item_labels = torch.randint(0, 10, (num_trains,))
print(f"Part of train labels[item] for node classification: {nc_train_item_labels[:3]}")
torch.save(nc_train_item_labels, nc_train_item_labels_path)
print(f"NC train labels[item] are saved to {nc_train_item_labels_path}\n")
# Val IDs for user.
nc_val_user_ids_path = os.path.join(base_dir, "nc-val-user-ids.npy")
nc_val_user_ids = user_ids[num_trains:num_trains+num_vals]
print(f"Part of val ids[user] for node classification: {nc_val_user_ids[:3]}")
np.save(nc_val_user_ids_path, nc_val_user_ids)
print(f"NC val ids[user] are saved to {nc_val_user_ids_path}\n")
# Val labels for user.
nc_val_user_labels_path = os.path.join(base_dir, "nc-val-user-labels.pt")
nc_val_user_labels = torch.randint(0, 10, (num_vals,))
print(f"Part of val labels[user] for node classification: {nc_val_user_labels[:3]}")
torch.save(nc_val_user_labels, nc_val_user_labels_path)
print(f"NC val labels[user] are saved to {nc_val_user_labels_path}\n")
# Val IDs for item.
nc_val_item_ids_path = os.path.join(base_dir, "nc-val-item-ids.npy")
nc_val_item_ids = item_ids[num_trains:num_trains+num_vals]
print(f"Part of val ids[item] for node classification: {nc_val_item_ids[:3]}")
np.save(nc_val_item_ids_path, nc_val_item_ids)
print(f"NC val ids[item] are saved to {nc_val_item_ids_path}\n")
# Val labels for item.
nc_val_item_labels_path = os.path.join(base_dir, "nc-val-item-labels.pt")
nc_val_item_labels = torch.randint(0, 10, (num_vals,))
print(f"Part of val labels[item] for node classification: {nc_val_item_labels[:3]}")
torch.save(nc_val_item_labels, nc_val_item_labels_path)
print(f"NC val labels[item] are saved to {nc_val_item_labels_path}\n")
# Test IDs for user.
nc_test_user_ids_path = os.path.join(base_dir, "nc-test-user-ids.npy")
nc_test_user_ids = user_ids[-num_tests:]
print(f"Part of test ids[user] for node classification: {nc_test_user_ids[:3]}")
np.save(nc_test_user_ids_path, nc_test_user_ids)
print(f"NC test ids[user] are saved to {nc_test_user_ids_path}\n")
# Test labels for user.
nc_test_user_labels_path = os.path.join(base_dir, "nc-test-user-labels.pt")
nc_test_user_labels = torch.randint(0, 10, (num_tests,))
print(f"Part of test labels[user] for node classification: {nc_test_user_labels[:3]}")
torch.save(nc_test_user_labels, nc_test_user_labels_path)
print(f"NC test labels[user] are saved to {nc_test_user_labels_path}\n")
# Test IDs for item.
nc_test_item_ids_path = os.path.join(base_dir, "nc-test-item-ids.npy")
nc_test_item_ids = item_ids[-num_tests:]
print(f"Part of test ids[item] for node classification: {nc_test_item_ids[:3]}")
np.save(nc_test_item_ids_path, nc_test_item_ids)
print(f"NC test ids[item] are saved to {nc_test_item_ids_path}\n")
# Test labels for item.
nc_test_item_labels_path = os.path.join(base_dir, "nc-test-item-labels.pt")
nc_test_item_labels = torch.randint(0, 10, (num_tests,))
print(f"Part of test labels[item] for node classification: {nc_test_item_labels[:3]}")
torch.save(nc_test_item_labels, nc_test_item_labels_path)
print(f"NC test labels[item] are saved to {nc_test_item_labels_path}\n")
Part of train ids[user] for node classification: [752 872 543]
NC train ids[user] are saved to ./ondisk_dataset_heterograph/nc-train-user-ids.npy
Part of train labels[user] for node classification: tensor([1, 8, 7])
NC train labels[user] are saved to ./ondisk_dataset_heterograph/nc-train-user-labels.pt
Part of train ids[item] for node classification: [176 746 328]
NC train ids[item] are saved to ./ondisk_dataset_heterograph/nc-train-item-ids.npy
Part of train labels[item] for node classification: tensor([1, 7, 7])
NC train labels[item] are saved to ./ondisk_dataset_heterograph/nc-train-item-labels.pt
Part of val ids[user] for node classification: [232 843 87]
NC val ids[user] are saved to ./ondisk_dataset_heterograph/nc-val-user-ids.npy
Part of val labels[user] for node classification: tensor([4, 9, 2])
NC val labels[user] are saved to ./ondisk_dataset_heterograph/nc-val-user-labels.pt
Part of val ids[item] for node classification: [797 431 908]
NC val ids[item] are saved to ./ondisk_dataset_heterograph/nc-val-item-ids.npy
Part of val labels[item] for node classification: tensor([2, 7, 9])
NC val labels[item] are saved to ./ondisk_dataset_heterograph/nc-val-item-labels.pt
Part of test ids[user] for node classification: [ 30 988 822]
NC test ids[user] are saved to ./ondisk_dataset_heterograph/nc-test-user-ids.npy
Part of test labels[user] for node classification: tensor([8, 5, 6])
NC test labels[user] are saved to ./ondisk_dataset_heterograph/nc-test-user-labels.pt
Part of test ids[item] for node classification: [847 694 108]
NC test ids[item] are saved to ./ondisk_dataset_heterograph/nc-test-item-ids.npy
Part of test labels[item] for node classification: tensor([3, 0, 0])
NC test labels[item] are saved to ./ondisk_dataset_heterograph/nc-test-item-labels.pt
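Optionally, we can check that the saved IDs and labels line up and that the splits are disjoint; a small sanity-check sketch, not part of the original tutorial:

# IDs and labels of each split must have the same length.
assert len(nc_train_user_ids) == len(nc_train_user_labels)
assert len(nc_val_user_ids) == len(nc_val_user_labels)
assert len(nc_test_user_ids) == len(nc_test_user_labels)
# The user splits must not overlap, since they partition a shuffled permutation.
train, val, test = set(nc_train_user_ids), set(nc_val_user_ids), set(nc_test_user_ids)
assert train.isdisjoint(val) and train.isdisjoint(test) and val.isdisjoint(test)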
Link prediction task
For the link prediction task, we need seeds for each training/validation/test set, and optionally corresponding labels and indexes: labels mark each seed as positive or negative, and indexes group every negative seed with the positive seed it was generated from, as illustrated in the sketch below. Like feature data, NumPy arrays and torch tensors are supported for these sets.
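To make this layout concrete, here is a tiny hand-worked sketch with 2 positive seeds and 2 negatives per positive (the cells below use 10 negatives per positive):

# Seeds: positives first, then negatives.
seeds = np.array([[0, 5], [1, 6],    # 2 positive edges
                  [0, 7], [0, 8],    # negatives drawn for positive 0
                  [1, 9], [1, 3]])   # negatives drawn for positive 1
labels = np.array([1, 1, 0, 0, 0, 0])      # 1 = positive, 0 = negative
indexes = np.array([0, 1, 0, 0, 1, 1])     # maps each row to its positive seed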
[6]:
# For illustration, let's generate item sets for each edge type.
num_trains = int(num_edges * 0.6)
num_vals = int(num_edges * 0.2)
num_tests = num_edges - num_trains - num_vals
# Train seeds for user:like:item.
lp_train_like_seeds_path = os.path.join(base_dir, "lp-train-like-seeds.npy")
lp_train_like_seeds = like_edges[:num_trains, :]
print(f"Part of train seeds[user:like:item] for link prediction: {lp_train_like_seeds[:3]}")
np.save(lp_train_like_seeds_path, lp_train_like_seeds)
print(f"LP train seeds[user:like:item] are saved to {lp_train_like_seeds_path}\n")
# Train seeds for user:follow:user.
lp_train_follow_seeds_path = os.path.join(base_dir, "lp-train-follow-seeds.npy")
lp_train_follow_seeds = follow_edges[:num_trains, :]
print(f"Part of train seeds[user:follow:user] for link prediction: {lp_train_follow_seeds[:3]}")
np.save(lp_train_follow_seeds_path, lp_train_follow_seeds)
print(f"LP train seeds[user:follow:user] are saved to {lp_train_follow_seeds_path}\n")
# Val seeds for user:like:item.
lp_val_like_seeds_path = os.path.join(base_dir, "lp-val-like-seeds.npy")
lp_val_like_seeds = like_edges[num_trains:num_trains+num_vals, :]
lp_val_like_neg_dsts = np.random.randint(0, num_nodes, (num_vals, 10)).reshape(-1)
lp_val_like_neg_srcs = np.repeat(lp_val_like_seeds[:,0], 10)
lp_val_like_neg_seeds = np.concatenate((lp_val_like_neg_srcs, lp_val_like_neg_dsts)).reshape(2,-1).T
lp_val_like_seeds = np.concatenate((lp_val_like_seeds, lp_val_like_neg_seeds))
print(f"Part of val seeds[user:like:item] for link prediction: {lp_val_like_seeds[:3]}")
np.save(lp_val_like_seeds_path, lp_val_like_seeds)
print(f"LP val seeds[user:like:item] are saved to {lp_val_like_seeds_path}\n")
# Val labels for user:like:item.
lp_val_like_labels_path = os.path.join(base_dir, "lp-val-like-labels.npy")
lp_val_like_labels = np.empty(num_vals * (10 + 1))
lp_val_like_labels[:num_vals] = 1
lp_val_like_labels[num_vals:] = 0
print(f"Part of val labels[user:like:item] for link prediction: {lp_val_like_labels[:3]}")
np.save(lp_val_like_labels_path, lp_val_like_labels)
print(f"LP val labels[user:like:item] are saved to {lp_val_like_labels_path}\n")
# Val indexes for user:like:item.
lp_val_like_indexes_path = os.path.join(base_dir, "lp-val-like-indexes.npy")
lp_val_like_indexes = np.arange(0, num_vals)
lp_val_like_neg_indexes = np.repeat(lp_val_like_indexes, 10)
lp_val_like_indexes = np.concatenate([lp_val_like_indexes, lp_val_like_neg_indexes])
print(f"Part of val indexes[user:like:item] for link prediction: {lp_val_like_indexes[:3]}")
np.save(lp_val_like_indexes_path, lp_val_like_indexes)
print(f"LP val indexes[user:like:item] are saved to {lp_val_like_indexes_path}\n")
# Val seeds for user:follow:user.
lp_val_follow_seeds_path = os.path.join(base_dir, "lp-val-follow-seeds.npy")
lp_val_follow_seeds = follow_edges[num_trains:num_trains+num_vals, :]
lp_val_follow_neg_dsts = np.random.randint(0, num_nodes, (num_vals, 10)).reshape(-1)
lp_val_follow_neg_srcs = np.repeat(lp_val_follow_seeds[:,0], 10)
lp_val_follow_neg_seeds = np.concatenate((lp_val_follow_neg_srcs, lp_val_follow_neg_dsts)).reshape(2,-1).T
lp_val_follow_seeds = np.concatenate((lp_val_follow_seeds, lp_val_follow_neg_seeds))
print(f"Part of val seeds[user:follow:user] for link prediction: {lp_val_follow_seeds[:3]}")
np.save(lp_val_follow_seeds_path, lp_val_follow_seeds)
print(f"LP val seeds[user:follow:user] are saved to {lp_val_follow_seeds_path}\n")
# Val labels for user:follow:user.
lp_val_follow_labels_path = os.path.join(base_dir, "lp-val-follow-labels.npy")
lp_val_follow_labels = np.empty(num_vals * (10 + 1))
lp_val_follow_labels[:num_vals] = 1
lp_val_follow_labels[num_vals:] = 0
print(f"Part of val labels[user:follow:user] for link prediction: {lp_val_follow_labels[:3]}")
np.save(lp_val_follow_labels_path, lp_val_follow_labels)
print(f"LP val labels[user:follow:user] are saved to {lp_val_follow_labels_path}\n")
# Val indexes for user:follow:user.
lp_val_follow_indexes_path = os.path.join(base_dir, "lp-val-follow-indexes.npy")
lp_val_follow_indexes = np.arange(0, num_vals)
lp_val_follow_neg_indexes = np.repeat(lp_val_follow_indexes, 10)
lp_val_follow_indexes = np.concatenate([lp_val_follow_indexes, lp_val_follow_neg_indexes])
print(f"Part of val indexes[user:follow:user] for link prediction: {lp_val_follow_indexes[:3]}")
np.save(lp_val_follow_indexes_path, lp_val_follow_indexes)
print(f"LP val indexes[user:follow:user] are saved to {lp_val_follow_indexes_path}\n")
# Test seeds for user:like:item.
lp_test_like_seeds_path = os.path.join(base_dir, "lp-test-like-seeds.npy")
lp_test_like_seeds = like_edges[-num_tests:, :]
lp_test_like_neg_dsts = np.random.randint(0, num_nodes, (num_tests, 10)).reshape(-1)
lp_test_like_neg_srcs = np.repeat(lp_test_like_seeds[:,0], 10)
lp_test_like_neg_seeds = np.concatenate((lp_test_like_neg_srcs, lp_test_like_neg_dsts)).reshape(2,-1).T
lp_test_like_seeds = np.concatenate((lp_test_like_seeds, lp_test_like_neg_seeds))
print(f"Part of test seeds[user:like:item] for link prediction: {lp_test_like_seeds[:3]}")
np.save(lp_test_like_seeds_path, lp_test_like_seeds)
print(f"LP test seeds[user:like:item] are saved to {lp_test_like_seeds_path}\n")
# Test labels for user:like:item.
lp_test_like_labels_path = os.path.join(base_dir, "lp-test-like-labels.npy")
lp_test_like_labels = np.empty(num_tests * (10 + 1))
lp_test_like_labels[:num_tests] = 1
lp_test_like_labels[num_tests:] = 0
print(f"Part of test labels[user:like:item] for link prediction: {lp_test_like_labels[:3]}")
np.save(lp_test_like_labels_path, lp_test_like_labels)
print(f"LP test labels[user:like:item] are saved to {lp_test_like_labels_path}\n")
# Test indexes for user:like:item.
lp_test_like_indexes_path = os.path.join(base_dir, "lp-test-like-indexes.npy")
lp_test_like_indexes = np.arange(0, num_tests)
lp_test_like_neg_indexes = np.repeat(lp_test_like_indexes, 10)
lp_test_like_indexes = np.concatenate([lp_test_like_indexes, lp_test_like_neg_indexes])
print(f"Part of test indexes[user:like:item] for link prediction: {lp_test_like_indexes[:3]}")
np.save(lp_test_like_indexes_path, lp_test_like_indexes)
print(f"LP test indexes[user:like:item] are saved to {lp_test_like_indexes_path}\n")
# Test seeds for user:follow:user.
lp_test_follow_seeds_path = os.path.join(base_dir, "lp-test-follow-seeds.npy")
lp_test_follow_seeds = follow_edges[-num_tests:, :]
lp_test_follow_neg_dsts = np.random.randint(0, num_nodes, (num_tests, 10)).reshape(-1)
lp_test_follow_neg_srcs = np.repeat(lp_test_follow_seeds[:,0], 10)
lp_test_follow_neg_seeds = np.concatenate((lp_test_follow_neg_srcs, lp_test_follow_neg_dsts)).reshape(2,-1).T
lp_test_follow_seeds = np.concatenate((lp_test_follow_seeds, lp_test_follow_neg_seeds))
print(f"Part of test seeds[user:follow:user] for link prediction: {lp_test_follow_seeds[:3]}")
np.save(lp_test_follow_seeds_path, lp_test_follow_seeds)
print(f"LP test seeds[user:follow:user] are saved to {lp_test_follow_seeds_path}\n")
# Test labels for user:follow:user.
lp_test_follow_labels_path = os.path.join(base_dir, "lp-test-follow-labels.npy")
lp_test_follow_labels = np.empty(num_tests * (10 + 1))
lp_test_follow_labels[:num_tests] = 1
lp_test_follow_labels[num_tests:] = 0
print(f"Part of test labels[user:follow:user] for link prediction: {lp_test_follow_labels[:3]}")
np.save(lp_test_follow_labels_path, lp_test_follow_labels)
print(f"LP test labels[user:follow:user] are saved to {lp_test_follow_labels_path}\n")
# Test indexes for user:follow:user.
lp_test_follow_indexes_path = os.path.join(base_dir, "lp-test-follow-indexes.npy")
lp_test_follow_indexes = np.arange(0, num_tests)
lp_test_follow_neg_indexes = np.repeat(lp_test_follow_indexes, 10)
lp_test_follow_indexes = np.concatenate([lp_test_follow_indexes, lp_test_follow_neg_indexes])
print(f"Part of test indexes[user:follow:user] for link prediction: {lp_test_follow_indexes[:3]}")
np.save(lp_test_follow_indexes_path, lp_test_follow_indexes)
print(f"LP test indexes[user:follow:user] are saved to {lp_test_follow_indexes_path}\n")
Part of train seeds[user:like:item] for link prediction: [[223 454]
[488 169]
[636 944]]
LP train seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-train-like-seeds.npy
Part of train seeds[user:follow:user] for link prediction: [[784 947]
[ 47 185]
[663 659]]
LP train seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-train-follow-seeds.npy
Part of val seeds[user:like:item] for link prediction: [[951 816]
[ 82 256]
[472 47]]
LP val seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-seeds.npy
Part of val labels[user:like:item] for link prediction: [1. 1. 1.]
LP val labels[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-labels.npy
Part of val indexes[user:like:item] for link prediction: [0 1 2]
LP val indexes[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-val-like-indexes.npy
Part of val seeds[user:follow:user] for link prediction: [[548 14]
[458 847]
[669 933]]
LP val seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-seeds.npy
Part of val labels[user:follow:user] for link prediction: [1. 1. 1.]
LP val labels[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-labels.npy
Part of val indexes[user:follow:user] for link prediction: [0 1 2]
LP val indexes[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-val-follow-indexes.npy
Part of test seeds[user:like:item] for link prediction: [[357 647]
[ 81 89]
[880 39]]
LP test seeds[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-seeds.npy
Part of test labels[user:like:item] for link prediction: [1. 1. 1.]
LP test labels[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-labels.npy
Part of test indexes[user:like:item] for link prediction: [0 1 2]
LP test indexes[user:like:item] are saved to ./ondisk_dataset_heterograph/lp-test-like-indexes.npy
Part of test seeds[user:follow:user] for link prediction: [[195 196]
[403 599]
[394 249]]
LP test seeds[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-seeds.npy
Part of test labels[user:follow:user] for link prediction: [1. 1. 1.]
LP test labels[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-labels.npy
Part of test indexes[user:follow:user] for link prediction: [0 1 2]
LP test indexes[user:follow:user] are saved to ./ondisk_dataset_heterograph/lp-test-follow-indexes.npy
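As before, a quick optional consistency check on the saved arrays; a sketch, not part of the original tutorial:

# Positives plus 10 negatives per positive => num_vals * 11 rows everywhere.
assert lp_val_like_seeds.shape == (num_vals * 11, 2)
assert lp_val_like_labels.shape[0] == num_vals * 11
assert lp_val_like_indexes.shape[0] == num_vals * 11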
Organize Data into YAML File
Now we need to create a metadata.yaml file which contains the paths and data types of the graph structure, feature data, and training/validation/test sets. Note that all paths should be relative to metadata.yaml.
For a heterogeneous graph, we need to specify the node/edge type in the type fields. For an edge type, a canonical etype is required: a string that joins the source node type, the edge type, and the destination node type with ":".
Note:
- All paths should be relative to metadata.yaml.
- The following field is optional and is not specified in the example below:
  - in_memory: indicates whether to load the data into memory or to mmap it. Default is True.
Please refer to the YAML specification for more details.
[7]:
yaml_content = f"""
dataset_name: heterogeneous_graph_nc_lp
graph:
nodes:
- type: user
num: {num_nodes}
- type: item
num: {num_nodes}
edges:
- type: "user:like:item"
format: csv
path: {os.path.basename(like_edges_path)}
- type: "user:follow:user"
format: csv
path: {os.path.basename(follow_edges_path)}
feature_data:
- domain: node
type: user
name: feat_0
format: numpy
path: {os.path.basename(node_user_feat_0_path)}
- domain: node
type: user
name: feat_1
format: torch
path: {os.path.basename(node_user_feat_1_path)}
- domain: node
type: item
name: feat_0
format: numpy
path: {os.path.basename(node_item_feat_0_path)}
- domain: node
type: item
name: feat_1
format: torch
path: {os.path.basename(node_item_feat_1_path)}
- domain: edge
type: "user:like:item"
name: feat_0
format: numpy
path: {os.path.basename(edge_like_feat_0_path)}
- domain: edge
type: "user:like:item"
name: feat_1
format: torch
path: {os.path.basename(edge_like_feat_1_path)}
- domain: edge
type: "user:follow:user"
name: feat_0
format: numpy
path: {os.path.basename(edge_follow_feat_0_path)}
- domain: edge
type: "user:follow:user"
name: feat_1
format: torch
path: {os.path.basename(edge_follow_feat_1_path)}
tasks:
- name: node_classification
num_classes: 10
train_set:
- type: user
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_train_user_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_train_user_labels_path)}
- type: item
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_train_item_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_train_item_labels_path)}
validation_set:
- type: user
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_val_user_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_val_user_labels_path)}
- type: item
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_val_item_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_val_item_labels_path)}
test_set:
- type: user
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_test_user_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_test_user_labels_path)}
- type: item
data:
- name: seeds
format: numpy
path: {os.path.basename(nc_test_item_ids_path)}
- name: labels
format: torch
path: {os.path.basename(nc_test_item_labels_path)}
- name: link_prediction
num_classes: 10
train_set:
- type: "user:like:item"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_train_like_seeds_path)}
- type: "user:follow:user"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_train_follow_seeds_path)}
validation_set:
- type: "user:like:item"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_val_like_seeds_path)}
- name: labels
format: numpy
path: {os.path.basename(lp_val_like_labels_path)}
- name: indexes
format: numpy
path: {os.path.basename(lp_val_like_indexes_path)}
- type: "user:follow:user"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_val_follow_seeds_path)}
- name: labels
format: numpy
path: {os.path.basename(lp_val_follow_labels_path)}
- name: indexes
format: numpy
path: {os.path.basename(lp_val_follow_indexes_path)}
test_set:
- type: "user:like:item"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_test_like_seeds_path)}
- name: labels
format: numpy
path: {os.path.basename(lp_test_like_labels_path)}
- name: indexes
format: numpy
path: {os.path.basename(lp_test_like_indexes_path)}
- type: "user:follow:user"
data:
- name: seeds
format: numpy
path: {os.path.basename(lp_test_follow_seeds_path)}
- name: labels
format: numpy
path: {os.path.basename(lp_test_follow_labels_path)}
- name: indexes
format: numpy
path: {os.path.basename(lp_test_follow_indexes_path)}
"""
metadata_path = os.path.join(base_dir, "metadata.yaml")
with open(metadata_path, "w") as f:
    f.write(yaml_content)
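Before moving on, it can be useful to confirm that the file we just wrote parses as valid YAML; a sketch that assumes the pyyaml package is available:

import yaml  # assumption: PyYAML is installed
with open(metadata_path) as f:
    spec = yaml.safe_load(f)
print(spec["dataset_name"])                      # heterogeneous_graph_nc_lp
print([task["name"] for task in spec["tasks"]])  # ['node_classification', 'link_prediction']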
Instantiate OnDiskDataset
Now we're ready to load the dataset via dgl.graphbolt.OnDiskDataset. When instantiating, we just pass in the base directory where the metadata.yaml file lies.
During the first instantiation, GraphBolt preprocesses the raw data, for example constructing a FusedCSCSamplingGraph from the edges. All data, including the graph, feature data, and training/validation/test sets, are put into the preprocessed directory after preprocessing. Any subsequent dataset loading will skip the preprocess stage.
After preprocessing, load() must be called explicitly in order to load the graph, feature data, and tasks.
[8]:
dataset = gb.OnDiskDataset(base_dir).load()
graph = dataset.graph
print(f"Loaded graph: {graph}\n")
feature = dataset.feature
print(f"Loaded feature store: {feature}\n")
tasks = dataset.tasks
nc_task = tasks[0]
print(f"Loaded node classification task: {nc_task}\n")
lp_task = tasks[1]
print(f"Loaded link prediction task: {lp_task}\n")
Start to preprocess the on-disk dataset.
Finish preprocessing the on-disk dataset.
Loaded graph: FusedCSCSamplingGraph(csc_indptr=tensor([ 0, 11, 25, ..., 19976, 19985, 20000], dtype=torch.int32),
indices=tensor([1852, 1789, 1674, ..., 1760, 1566, 1492], dtype=torch.int32),
total_num_nodes=2000, num_edges={'user:follow:user': 10000, 'user:like:item': 10000},
node_type_offset=tensor([ 0, 1000, 2000], dtype=torch.int32),
type_per_edge=tensor([1, 1, 1, ..., 0, 0, 0], dtype=torch.uint8),
node_type_to_id={'item': 0, 'user': 1},
edge_type_to_id={'user:follow:user': 0, 'user:like:item': 1},)
Loaded feature store: TorchBasedFeatureStore(
{(<OnDiskFeatureDataDomain.NODE: 'node'>, 'user', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.3453, 0.5136, 0.4449, 0.4766, 0.3387],
[0.1418, 0.8861, 0.1855, 0.9492, 0.4227],
[0.5737, 0.0286, 0.4309, 0.9024, 0.7830],
...,
[0.4259, 0.3646, 0.2193, 0.1301, 0.2112],
[0.3391, 0.9038, 0.9678, 0.2951, 0.4638],
[0.0489, 0.3332, 0.5256, 0.1272, 0.5993]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'user', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.5241, 0.0636, 0.9646, 0.8781, 0.7311],
[0.6916, 0.6266, 0.4001, 0.1896, 0.1379],
[0.4709, 0.3672, 0.3312, 0.6207, 0.0476],
...,
[0.2340, 0.3408, 0.1988, 0.9623, 0.8946],
[0.4456, 0.0447, 0.8039, 0.1333, 0.2468],
[0.8423, 0.6072, 0.2308, 0.0939, 0.9637]]),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'item', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.2572, 0.5177, 0.8279, 0.6279, 0.2994],
[0.9510, 0.7291, 0.5053, 0.1974, 0.6576],
[0.3794, 0.8606, 0.5089, 0.0010, 0.6067],
...,
[0.8492, 0.7437, 0.0712, 0.0188, 0.0103],
[0.4733, 0.4335, 0.2466, 0.0357, 0.7442],
[0.6500, 0.6994, 0.3546, 0.0236, 0.4502]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.NODE: 'node'>, 'item', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.2722, 0.6056, 0.7261, 0.1551, 0.4849],
[0.1026, 0.2786, 0.7825, 0.7189, 0.4327],
[0.0944, 0.3622, 0.7748, 0.1745, 0.0108],
...,
[0.8066, 0.3587, 0.2633, 0.2288, 0.6713],
[0.9799, 0.3191, 0.7592, 0.3989, 0.0747],
[0.3062, 0.2187, 0.2080, 0.7417, 0.4526]]),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:like:item', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.3519, 0.5508, 0.6491, 0.2941, 0.5402],
[0.1076, 0.8082, 0.1977, 0.7320, 0.1507],
[0.5270, 0.6921, 0.7532, 0.3941, 0.5326],
...,
[0.5103, 0.9926, 0.6133, 0.4438, 0.2042],
[0.7844, 0.8919, 0.0859, 0.4417, 0.0571],
[0.7587, 0.8504, 0.3398, 0.6426, 0.8632]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:like:item', 'feat_1'): TorchBasedFeature(
feature=tensor([[7.4837e-01, 3.1719e-01, 4.1000e-01, 1.9229e-01, 3.5197e-01],
[5.2805e-02, 4.4137e-01, 4.1943e-01, 7.4570e-01, 6.7561e-01],
[7.9727e-01, 6.4647e-01, 2.4549e-01, 6.2105e-01, 1.2463e-01],
...,
[2.6980e-01, 3.9593e-01, 8.5360e-01, 4.9777e-01, 3.7384e-01],
[2.3341e-04, 6.4598e-01, 7.3573e-01, 7.9099e-01, 3.2468e-01],
[1.8835e-01, 5.2857e-01, 5.3158e-01, 1.2036e-01, 9.1056e-01]]),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:follow:user', 'feat_0'): TorchBasedFeature(
feature=tensor([[0.0761, 0.5714, 0.4436, 0.5945, 0.0426],
[0.8840, 0.3716, 0.8785, 0.4795, 0.3213],
[0.1983, 0.5666, 0.7493, 0.7593, 0.1110],
...,
[0.2941, 0.0110, 0.9637, 0.2852, 0.4968],
[0.5539, 0.8754, 0.2235, 0.5246, 0.2844],
[0.5825, 0.6678, 0.2807, 0.3584, 0.0613]], dtype=torch.float64),
metadata={},
), (<OnDiskFeatureDataDomain.EDGE: 'edge'>, 'user:follow:user', 'feat_1'): TorchBasedFeature(
feature=tensor([[0.8681, 0.8286, 0.3780, 0.7100, 0.7202],
[0.0604, 0.0072, 0.2118, 0.4139, 0.7075],
[0.4383, 0.2968, 0.0711, 0.6355, 0.2753],
...,
[0.6545, 0.1100, 0.4310, 0.4985, 0.0644],
[0.1214, 0.6535, 0.5049, 0.8754, 0.6724],
[0.8938, 0.0103, 0.2399, 0.6832, 0.2074]]),
metadata={},
)}
)
Loaded node classification task: OnDiskTask(validation_set=ItemSetDict(
itemsets={'user': ItemSet(
items=(tensor([232, 843, 87, 550, 610, 515, 667, 249, 509, 457, 306, 86, 153, 304,
148, 288, 487, 150, 283, 946, 167, 768, 90, 432, 444, 47, 724, 847,
257, 214, 616, 893, 410, 52, 717, 206, 35, 518, 368, 314, 446, 664,
268, 660, 392, 121, 233, 128, 810, 569, 889, 348, 189, 745, 259, 262,
80, 881, 281, 447, 185, 231, 420, 387, 169, 821, 71, 58, 721, 949,
203, 152, 598, 715, 72, 828, 663, 905, 334, 741, 595, 184, 897, 108,
44, 979, 958, 389, 955, 99, 160, 211, 761, 709, 355, 413, 42, 637,
285, 558, 125, 173, 449, 298, 273, 871, 123, 734, 508, 55, 344, 656,
479, 175, 264, 145, 899, 604, 641, 846, 207, 642, 459, 874, 681, 258,
219, 325, 868, 147, 21, 867, 675, 575, 516, 397, 469, 580, 209, 277,
136, 415, 902, 115, 375, 428, 501, 992, 395, 224, 799, 638, 825, 942,
778, 261, 842, 759, 756, 23, 118, 723, 628, 910, 602, 726, 286, 704,
201, 480, 545, 811, 126, 221, 995, 196, 463, 680, 987, 875, 120, 862,
45, 504, 631, 46, 31, 841, 157, 473, 765, 454, 796, 496, 947, 729,
748, 172, 870, 636], dtype=torch.int32), tensor([4, 9, 2, 5, 1, 9, 9, 3, 9, 7, 9, 7, 9, 8, 1, 2, 6, 5, 1, 6, 4, 1, 7, 3,
5, 2, 3, 8, 7, 8, 6, 5, 8, 0, 5, 3, 1, 1, 2, 2, 5, 9, 1, 6, 0, 4, 9, 2,
9, 9, 0, 4, 3, 2, 1, 1, 3, 7, 1, 2, 4, 2, 6, 7, 8, 5, 3, 5, 3, 5, 1, 1,
1, 4, 6, 1, 3, 1, 8, 1, 7, 8, 7, 5, 7, 4, 0, 6, 5, 1, 9, 7, 9, 4, 8, 2,
4, 0, 0, 2, 6, 4, 3, 1, 1, 6, 1, 1, 7, 9, 1, 6, 4, 8, 9, 2, 1, 6, 2, 4,
6, 5, 1, 2, 6, 0, 7, 1, 5, 0, 9, 3, 6, 1, 4, 5, 7, 7, 1, 6, 5, 7, 2, 8,
4, 7, 5, 3, 9, 4, 3, 6, 1, 9, 0, 4, 3, 6, 2, 8, 7, 2, 0, 5, 6, 2, 1, 5,
1, 3, 9, 9, 9, 9, 3, 9, 2, 8, 8, 1, 5, 7, 0, 0, 1, 4, 9, 9, 9, 1, 1, 6,
6, 9, 5, 4, 7, 1, 4, 3])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([797, 431, 908, 696, 676, 976, 319, 322, 995, 712, 631, 390, 87, 901,
598, 394, 670, 211, 812, 189, 154, 859, 933, 398, 835, 750, 531, 237,
856, 209, 495, 277, 539, 684, 520, 806, 359, 499, 370, 78, 720, 600,
432, 420, 232, 71, 276, 145, 445, 315, 920, 53, 327, 673, 530, 414,
101, 535, 271, 69, 991, 508, 46, 357, 492, 249, 662, 986, 558, 368,
215, 926, 377, 754, 969, 355, 476, 172, 683, 584, 254, 126, 342, 320,
934, 449, 757, 620, 893, 654, 84, 879, 187, 568, 198, 115, 645, 613,
769, 939, 932, 458, 482, 283, 799, 975, 153, 771, 796, 114, 691, 217,
637, 521, 208, 993, 740, 526, 898, 66, 286, 823, 180, 493, 894, 455,
252, 513, 956, 439, 628, 243, 235, 490, 668, 923, 24, 40, 863, 606,
106, 294, 280, 527, 193, 220, 399, 960, 524, 779, 340, 98, 404, 756,
494, 326, 813, 536, 138, 267, 728, 877, 615, 792, 489, 344, 700, 129,
591, 178, 435, 967, 984, 732, 544, 26, 905, 137, 203, 226, 184, 166,
13, 804, 260, 147, 3, 798, 269, 44, 622, 936, 403, 916, 119, 497,
854, 334, 469, 485], dtype=torch.int32), tensor([2, 7, 9, 3, 1, 6, 0, 6, 6, 7, 4, 3, 1, 7, 4, 2, 3, 7, 9, 1, 7, 7, 4, 9,
3, 4, 5, 5, 8, 6, 0, 2, 7, 4, 7, 1, 1, 5, 1, 1, 7, 4, 7, 2, 7, 5, 4, 5,
6, 2, 5, 6, 8, 1, 2, 4, 9, 4, 3, 0, 3, 2, 8, 6, 0, 4, 9, 9, 6, 1, 7, 2,
6, 6, 0, 0, 7, 0, 2, 6, 3, 1, 9, 1, 6, 9, 4, 1, 8, 6, 7, 6, 2, 7, 3, 3,
4, 0, 9, 6, 3, 3, 0, 0, 8, 2, 3, 3, 7, 7, 2, 7, 3, 5, 9, 3, 1, 9, 7, 5,
4, 7, 1, 2, 1, 0, 3, 7, 9, 3, 7, 1, 1, 0, 8, 0, 5, 0, 3, 6, 4, 5, 6, 1,
6, 2, 3, 3, 1, 7, 9, 8, 9, 0, 5, 5, 2, 6, 6, 3, 0, 5, 1, 6, 4, 7, 2, 5,
9, 7, 1, 1, 2, 8, 7, 8, 7, 0, 7, 5, 5, 1, 0, 1, 0, 2, 4, 5, 8, 9, 0, 9,
4, 8, 1, 1, 0, 6, 6, 5])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
train_set=ItemSetDict(
itemsets={'user': ItemSet(
items=(tensor([752, 872, 543, 804, 624, 536, 179, 599, 384, 498, 669, 716, 561, 119,
658, 686, 552, 901, 954, 467, 705, 486, 816, 436, 760, 608, 155, 612,
583, 161, 287, 606, 181, 906, 448, 978, 943, 827, 54, 609, 122, 423,
855, 740, 938, 922, 149, 605, 650, 400, 800, 309, 326, 476, 63, 484,
197, 412, 785, 758, 864, 204, 786, 876, 698, 13, 402, 12, 370, 8,
990, 645, 519, 60, 353, 396, 885, 962, 614, 34, 972, 294, 547, 465,
856, 925, 134, 225, 84, 887, 260, 102, 159, 512, 538, 802, 59, 625,
534, 330, 725, 132, 67, 743, 961, 996, 162, 967, 180, 632, 310, 929,
356, 983, 944, 833, 300, 670, 963, 455, 861, 926, 878, 791, 777, 135,
373, 474, 844, 226, 83, 584, 376, 576, 361, 495, 813, 250, 997, 711,
272, 567, 437, 312, 554, 328, 363, 142, 324, 581, 190, 369, 343, 701,
817, 367, 797, 332, 916, 19, 886, 168, 458, 500, 877, 295, 5, 914,
662, 865, 907, 98, 103, 401, 327, 917, 464, 930, 714, 365, 838, 462,
676, 892, 89, 582, 751, 794, 533, 199, 429, 787, 305, 404, 857, 292,
882, 321, 553, 236, 541, 439, 880, 630, 657, 687, 848, 50, 613, 684,
994, 790, 488, 820, 839, 129, 131, 585, 588, 852, 427, 354, 191, 784,
451, 76, 386, 274, 879, 358, 611, 269, 252, 359, 869, 634, 971, 331,
470, 78, 240, 146, 933, 245, 342, 991, 336, 62, 477, 372, 450, 460,
441, 235, 707, 73, 414, 973, 654, 767, 345, 831, 251, 471, 793, 890,
941, 139, 744, 113, 497, 311, 255, 540, 293, 158, 719, 419, 884, 202,
801, 695, 775, 708, 416, 544, 263, 524, 490, 267, 362, 346, 41, 819,
644, 216, 824, 888, 506, 952, 475, 920, 727, 170, 434, 713, 313, 20,
578, 873, 307, 284, 379, 503, 394, 301, 510, 110, 593, 237, 37, 535,
770, 651, 568, 101, 850, 623, 229, 965, 228, 563, 74, 919, 738, 492,
409, 688, 549, 408, 936, 970, 565, 649, 472, 10, 299, 571, 635, 750,
894, 931, 697, 671, 15, 176, 773, 747, 935, 835, 591, 296, 626, 210,
898, 555, 92, 278, 489, 303, 438, 6, 68, 648, 564, 731, 950, 218,
620, 270, 114, 421, 836, 138, 3, 96, 840, 619, 539, 753, 832, 982,
109, 133, 388, 183, 678, 526, 993, 739, 1, 499, 61, 244, 322, 164,
27, 928, 26, 629, 481, 403, 546, 254, 939, 853, 491, 597, 679, 690,
980, 290, 673, 517, 323, 16, 596, 851, 378, 505, 951, 256, 830, 431,
710, 241, 805, 570, 75, 452, 195, 70, 829, 702, 896, 282, 445, 668,
601, 587, 97, 918, 200, 780, 823, 124, 706, 366, 964, 699, 617, 945,
297, 230, 523, 186, 266, 755, 736, 382, 289, 903, 932, 682, 212, 418,
732, 171, 692, 532, 981, 56, 863, 482, 208, 652, 127, 351, 639, 106,
36, 666, 341, 338, 912, 559, 17, 677, 66, 390, 808, 694, 483, 165,
82, 308, 530, 93, 665, 718, 891, 220, 399, 51, 9, 79, 934, 393,
647, 860, 105, 615, 69, 683, 911, 178, 383, 779, 749, 360, 291, 141,
276, 522, 653, 974, 712, 904, 529, 783, 521, 317, 722, 38, 627, 968,
812, 182, 440, 774, 198, 858, 998, 177, 2, 730, 975, 188, 7, 442,
809, 248, 347, 542, 746, 39, 643, 525, 927, 339, 14, 849, 646, 144,
957, 589, 984, 661, 29, 633, 223, 776, 803, 435, 845, 940, 859, 520,
494, 556, 398, 381, 908, 551, 117, 243, 527, 234, 350, 837],
dtype=torch.int32), tensor([1, 8, 7, 5, 0, 5, 6, 3, 1, 6, 5, 4, 1, 4, 0, 0, 3, 8, 6, 6, 6, 7, 3, 8,
3, 9, 0, 8, 0, 6, 5, 1, 6, 5, 9, 9, 1, 3, 2, 9, 0, 8, 1, 6, 0, 3, 4, 1,
3, 5, 2, 0, 3, 2, 4, 3, 3, 9, 0, 7, 4, 5, 7, 1, 8, 4, 8, 1, 4, 2, 5, 1,
0, 2, 4, 0, 2, 9, 4, 0, 0, 0, 5, 9, 7, 7, 6, 3, 6, 5, 7, 5, 8, 3, 0, 6,
8, 5, 9, 3, 7, 9, 4, 5, 0, 6, 2, 1, 7, 4, 3, 6, 9, 7, 1, 1, 6, 6, 1, 2,
6, 5, 0, 4, 2, 8, 1, 1, 3, 1, 4, 0, 0, 3, 9, 0, 3, 0, 3, 1, 7, 3, 0, 8,
7, 7, 6, 8, 7, 1, 2, 5, 5, 8, 7, 3, 8, 4, 2, 6, 5, 4, 6, 6, 0, 8, 2, 6,
3, 3, 6, 3, 2, 6, 2, 5, 4, 8, 1, 9, 2, 3, 5, 8, 6, 7, 5, 2, 4, 8, 5, 4,
6, 1, 7, 3, 6, 5, 8, 7, 4, 8, 6, 0, 9, 4, 1, 5, 0, 6, 3, 5, 6, 3, 3, 0,
7, 8, 6, 8, 9, 3, 5, 5, 7, 8, 8, 5, 6, 1, 7, 3, 6, 7, 6, 7, 2, 9, 5, 8,
1, 6, 0, 8, 9, 5, 4, 8, 2, 0, 3, 1, 5, 6, 4, 4, 6, 0, 9, 8, 5, 4, 8, 2,
3, 5, 5, 1, 6, 5, 5, 6, 8, 5, 0, 9, 9, 9, 4, 8, 0, 3, 7, 2, 3, 5, 3, 2,
7, 4, 4, 2, 7, 3, 0, 4, 2, 0, 9, 6, 7, 9, 2, 6, 7, 5, 8, 9, 9, 0, 8, 1,
4, 5, 2, 6, 3, 6, 2, 0, 5, 4, 8, 6, 0, 4, 1, 4, 9, 2, 0, 5, 0, 3, 2, 6,
8, 6, 2, 5, 2, 8, 8, 1, 3, 3, 5, 5, 3, 3, 2, 9, 2, 6, 7, 9, 3, 9, 6, 2,
0, 3, 5, 1, 2, 3, 5, 0, 2, 8, 6, 8, 5, 7, 1, 8, 8, 1, 3, 5, 1, 0, 4, 0,
7, 0, 0, 0, 9, 0, 4, 8, 0, 7, 0, 4, 3, 2, 3, 5, 3, 1, 2, 1, 0, 5, 6, 6,
4, 4, 5, 4, 1, 3, 7, 9, 5, 4, 3, 6, 8, 2, 0, 7, 9, 6, 8, 2, 4, 5, 7, 2,
4, 2, 2, 0, 2, 4, 9, 3, 1, 0, 3, 8, 7, 9, 5, 4, 1, 3, 2, 4, 5, 6, 3, 1,
5, 7, 8, 1, 4, 2, 2, 6, 8, 8, 9, 1, 6, 1, 3, 8, 8, 9, 7, 5, 2, 1, 3, 1,
1, 2, 3, 0, 3, 8, 7, 4, 0, 7, 0, 3, 2, 5, 8, 3, 1, 2, 8, 9, 6, 3, 3, 6,
7, 4, 4, 7, 2, 8, 4, 5, 0, 2, 3, 5, 0, 6, 2, 4, 1, 8, 6, 5, 3, 1, 0, 3,
2, 7, 1, 1, 6, 3, 2, 2, 9, 2, 9, 9, 0, 4, 4, 5, 1, 9, 2, 8, 4, 7, 8, 9,
3, 5, 1, 2, 3, 8, 9, 6, 5, 0, 5, 4, 8, 4, 6, 2, 4, 4, 8, 7, 3, 8, 2, 4,
9, 1, 0, 8, 7, 3, 8, 8, 4, 6, 0, 1, 1, 6, 1, 0, 9, 5, 5, 9, 6, 9, 3, 1])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([176, 746, 328, 770, 509, 191, 39, 955, 660, 417, 773, 272, 65, 580,
562, 380, 257, 678, 614, 240, 817, 419, 197, 25, 413, 332, 982, 52,
466, 625, 442, 692, 167, 266, 33, 196, 12, 450, 329, 764, 309, 123,
255, 648, 360, 159, 697, 290, 800, 930, 250, 733, 475, 793, 443, 872,
502, 7, 518, 63, 836, 917, 980, 655, 135, 356, 174, 608, 924, 314,
781, 547, 858, 347, 36, 28, 744, 616, 333, 791, 983, 246, 808, 37,
15, 965, 846, 510, 199, 550, 1, 994, 245, 569, 717, 463, 429, 795,
679, 528, 200, 11, 786, 541, 567, 761, 825, 421, 43, 169, 296, 886,
656, 59, 810, 851, 212, 583, 704, 735, 736, 594, 324, 231, 275, 263,
857, 787, 605, 72, 762, 667, 974, 278, 391, 658, 34, 665, 738, 844,
20, 695, 425, 534, 410, 785, 966, 201, 395, 745, 60, 833, 287, 590,
783, 156, 471, 89, 996, 177, 384, 385, 635, 752, 876, 171, 285, 990,
210, 350, 102, 554, 672, 729, 251, 557, 680, 222, 801, 353, 751, 626,
258, 902, 400, 621, 944, 636, 627, 45, 317, 131, 630, 619, 311, 588,
291, 31, 230, 415, 743, 989, 928, 701, 820, 734, 681, 248, 570, 888,
721, 988, 822, 382, 774, 321, 674, 68, 214, 58, 487, 452, 143, 204,
483, 652, 782, 716, 179, 422, 375, 83, 657, 964, 504, 64, 973, 462,
850, 481, 777, 840, 809, 837, 134, 842, 977, 867, 453, 868, 864, 74,
111, 308, 848, 978, 816, 709, 834, 373, 753, 173, 749, 54, 592, 563,
852, 241, 525, 79, 405, 971, 389, 312, 759, 772, 909, 151, 165, 581,
81, 560, 140, 725, 100, 739, 374, 875, 163, 999, 992, 349, 132, 689,
884, 234, 302, 57, 281, 477, 433, 9, 221, 666, 325, 164, 543, 141,
416, 339, 97, 295, 457, 987, 146, 183, 565, 755, 829, 484, 448, 27,
651, 361, 763, 345, 514, 985, 576, 972, 589, 318, 907, 647, 922, 961,
949, 551, 0, 577, 602, 963, 61, 706, 634, 587, 17, 310, 362, 586,
609, 223, 722, 737, 148, 997, 505, 881, 546, 303, 168, 110, 669, 242,
354, 819, 941, 124, 814, 811, 529, 264, 402, 760, 376, 776, 363, 301,
942, 473, 67, 460, 426, 708, 624, 579, 427, 566, 780, 896, 127, 144,
436, 491, 86, 571, 724, 120, 47, 838, 659, 228, 139, 540, 954, 274,
573, 511, 386, 962, 611, 878, 778, 38, 682, 610, 447, 155, 273, 559,
915, 519, 640, 703, 873, 937, 19, 824, 95, 555, 910, 742, 91, 663,
707, 496, 348, 364, 365, 794, 152, 288, 160, 142, 726, 122, 705, 564,
423, 268, 331, 897, 162, 899, 578, 843, 225, 282, 597, 85, 467, 486,
998, 76, 675, 125, 639, 690, 186, 927, 718, 585, 599, 512, 444, 224,
2, 775, 32, 170, 855, 366, 236, 92, 472, 552, 5, 80, 807, 190,
548, 758, 889, 239, 912, 947, 133, 351, 874, 790, 629, 698, 219, 532,
638, 632, 821, 358, 381, 765, 699, 41, 784, 686, 284, 259, 882, 688,
767, 42, 116, 35, 316, 10, 574, 715, 865, 346, 279, 206, 218, 869,
500, 136, 161, 118, 803, 70, 468, 693, 938, 247, 306, 623, 82, 826,
112, 919, 839, 88, 945, 107, 121, 727, 612, 99, 948, 507, 685, 501,
194, 182, 8, 815, 299, 128, 103, 542, 595, 959, 434, 336, 456, 261,
337, 943, 73, 397, 593, 383, 604, 406, 661, 304, 904, 438, 387, 418,
104, 149, 788, 538, 253, 227, 396, 617, 741, 747, 861, 4],
dtype=torch.int32), tensor([1, 7, 7, 0, 8, 4, 9, 2, 7, 9, 2, 3, 9, 1, 1, 6, 6, 6, 9, 1, 4, 7, 6, 3,
1, 5, 3, 1, 6, 7, 6, 3, 2, 1, 1, 1, 8, 0, 3, 0, 5, 1, 3, 1, 0, 1, 9, 9,
5, 3, 1, 5, 3, 7, 8, 3, 0, 0, 5, 5, 6, 5, 4, 3, 3, 1, 1, 1, 2, 2, 3, 5,
4, 7, 6, 7, 2, 2, 7, 2, 1, 6, 8, 1, 9, 7, 1, 9, 0, 3, 8, 4, 6, 4, 6, 6,
6, 6, 4, 1, 0, 8, 0, 6, 9, 4, 3, 8, 3, 3, 8, 7, 0, 4, 3, 1, 5, 1, 2, 8,
8, 2, 6, 6, 8, 8, 4, 1, 8, 2, 9, 2, 6, 4, 6, 0, 9, 3, 1, 6, 3, 6, 1, 1,
4, 0, 5, 8, 8, 7, 0, 6, 5, 7, 6, 8, 0, 9, 9, 3, 8, 5, 4, 0, 5, 7, 7, 3,
7, 0, 7, 8, 5, 1, 0, 9, 8, 7, 0, 3, 5, 0, 3, 4, 1, 6, 4, 1, 2, 3, 5, 2,
3, 9, 4, 3, 3, 8, 4, 1, 4, 2, 7, 6, 5, 4, 7, 7, 7, 9, 3, 1, 9, 0, 1, 3,
9, 9, 1, 7, 1, 1, 2, 6, 8, 5, 9, 2, 8, 1, 7, 3, 5, 3, 2, 3, 4, 9, 8, 7,
6, 7, 0, 3, 0, 2, 1, 0, 0, 0, 1, 5, 3, 7, 9, 8, 2, 1, 2, 4, 5, 6, 1, 7,
5, 5, 9, 7, 8, 7, 8, 7, 6, 9, 2, 9, 2, 1, 8, 9, 0, 1, 1, 1, 7, 1, 6, 2,
1, 9, 8, 7, 9, 0, 5, 1, 0, 6, 9, 1, 8, 5, 7, 8, 2, 2, 8, 6, 0, 1, 0, 0,
8, 2, 0, 0, 1, 0, 4, 4, 2, 0, 6, 5, 3, 2, 0, 0, 7, 0, 1, 3, 4, 0, 7, 3,
3, 5, 6, 8, 6, 3, 9, 7, 1, 3, 2, 0, 3, 0, 2, 4, 2, 3, 1, 2, 7, 9, 1, 6,
7, 8, 7, 4, 0, 6, 1, 5, 2, 3, 4, 6, 4, 4, 8, 7, 3, 6, 4, 3, 3, 4, 3, 6,
5, 8, 2, 8, 8, 7, 5, 3, 9, 3, 7, 2, 7, 8, 7, 6, 0, 5, 6, 1, 0, 1, 1, 1,
9, 4, 2, 2, 6, 7, 0, 5, 7, 8, 7, 5, 0, 0, 2, 1, 9, 0, 9, 2, 7, 0, 5, 8,
1, 6, 8, 8, 4, 8, 3, 2, 7, 0, 3, 4, 9, 9, 4, 5, 1, 5, 3, 4, 2, 3, 4, 1,
4, 7, 9, 6, 6, 6, 3, 5, 3, 9, 6, 5, 7, 3, 8, 5, 2, 4, 9, 6, 5, 9, 8, 8,
4, 6, 4, 3, 4, 6, 2, 7, 3, 7, 0, 1, 4, 3, 6, 6, 4, 5, 6, 2, 6, 6, 6, 0,
0, 2, 4, 7, 3, 0, 1, 9, 8, 3, 3, 7, 7, 5, 7, 9, 0, 0, 5, 2, 5, 6, 3, 0,
1, 2, 6, 5, 8, 3, 6, 8, 0, 3, 8, 6, 3, 2, 6, 0, 9, 1, 9, 8, 2, 5, 2, 2,
9, 2, 4, 0, 6, 2, 2, 1, 4, 0, 0, 6, 1, 5, 5, 8, 4, 3, 4, 3, 4, 8, 5, 0,
9, 1, 5, 5, 8, 1, 5, 6, 5, 8, 0, 5, 7, 4, 0, 3, 8, 8, 4, 4, 8, 9, 2, 7])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
test_set=ItemSetDict(
itemsets={'user': ItemSet(
items=(tensor([ 30, 988, 822, 989, 795, 302, 64, 674, 374, 253, 217, 788, 691, 43,
340, 407, 514, 242, 100, 364, 854, 116, 94, 572, 985, 49, 537, 377,
443, 782, 371, 456, 265, 411, 959, 329, 187, 685, 333, 562, 53, 349,
696, 40, 728, 689, 999, 4, 603, 205, 461, 156, 814, 151, 969, 772,
976, 433, 579, 762, 513, 577, 425, 502, 316, 275, 883, 352, 213, 531,
913, 391, 895, 85, 143, 607, 466, 560, 65, 453, 166, 900, 742, 485,
11, 137, 104, 573, 32, 88, 566, 640, 238, 693, 764, 921, 600, 781,
320, 966, 915, 818, 655, 57, 28, 766, 622, 174, 357, 22, 769, 528,
590, 81, 956, 406, 511, 405, 24, 33, 789, 763, 95, 771, 280, 227,
977, 154, 960, 319, 112, 586, 239, 48, 621, 737, 426, 222, 815, 866,
507, 279, 417, 909, 937, 337, 111, 557, 335, 163, 792, 834, 130, 25,
807, 720, 315, 700, 806, 733, 592, 247, 826, 0, 318, 672, 478, 77,
18, 923, 754, 735, 468, 271, 798, 924, 948, 194, 953, 424, 385, 493,
422, 192, 430, 594, 91, 703, 380, 140, 215, 548, 193, 757, 246, 574,
659, 986, 107, 618], dtype=torch.int32), tensor([8, 5, 6, 7, 0, 6, 9, 3, 7, 9, 4, 4, 7, 4, 9, 9, 9, 9, 0, 2, 1, 4, 0, 8,
8, 7, 9, 7, 4, 3, 1, 8, 7, 2, 4, 2, 3, 4, 0, 1, 5, 2, 1, 1, 1, 1, 5, 0,
8, 1, 1, 5, 7, 9, 9, 3, 1, 9, 3, 5, 6, 9, 9, 9, 0, 0, 3, 2, 7, 7, 8, 5,
9, 4, 0, 4, 2, 4, 2, 3, 9, 1, 4, 5, 9, 8, 9, 0, 7, 8, 2, 4, 0, 3, 6, 5,
0, 8, 4, 5, 8, 9, 0, 1, 9, 7, 7, 1, 9, 9, 8, 9, 4, 2, 5, 9, 6, 3, 4, 2,
0, 4, 7, 8, 2, 8, 1, 4, 2, 9, 2, 3, 8, 8, 6, 8, 2, 3, 7, 0, 7, 2, 5, 4,
6, 6, 6, 7, 8, 6, 0, 3, 3, 5, 1, 1, 9, 0, 0, 2, 0, 5, 3, 7, 3, 8, 0, 7,
8, 7, 4, 4, 6, 2, 3, 7, 7, 6, 2, 8, 4, 8, 8, 3, 0, 8, 6, 0, 8, 1, 4, 3,
0, 0, 8, 9, 5, 0, 5, 7])),
names=('seeds', 'labels'),
), 'item': ItemSet(
items=(tensor([847, 694, 108, 270, 330, 16, 900, 150, 202, 158, 157, 853, 256, 338,
300, 575, 883, 289, 175, 768, 192, 929, 677, 818, 117, 832, 970, 185,
714, 238, 188, 181, 710, 23, 105, 229, 646, 633, 213, 6, 75, 644,
341, 957, 77, 866, 441, 748, 830, 892, 671, 723, 109, 643, 49, 766,
642, 958, 951, 195, 556, 113, 903, 522, 459, 731, 914, 378, 292, 702,
935, 925, 428, 596, 22, 515, 940, 216, 409, 451, 862, 388, 479, 827,
523, 649, 265, 506, 62, 205, 14, 470, 244, 454, 572, 424, 607, 906,
297, 440, 48, 90, 30, 480, 262, 461, 981, 50, 465, 412, 952, 371,
887, 549, 430, 918, 537, 503, 870, 18, 411, 517, 730, 789, 871, 603,
478, 307, 29, 650, 392, 582, 880, 845, 831, 96, 407, 93, 21, 561,
352, 711, 849, 369, 233, 293, 464, 895, 931, 207, 474, 913, 641, 968,
393, 335, 488, 719, 408, 885, 979, 533, 401, 130, 379, 516, 545, 828,
298, 911, 890, 367, 618, 305, 687, 94, 664, 841, 498, 51, 553, 372,
950, 921, 860, 713, 953, 653, 802, 946, 343, 55, 323, 56, 437, 891,
446, 805, 313, 601], dtype=torch.int32), tensor([3, 0, 0, 0, 1, 6, 7, 6, 9, 3, 4, 6, 1, 4, 5, 4, 4, 6, 8, 8, 1, 0, 7, 3,
6, 3, 5, 0, 9, 5, 9, 3, 9, 7, 1, 2, 5, 6, 0, 6, 4, 3, 7, 3, 3, 4, 7, 5,
9, 9, 7, 0, 0, 6, 7, 4, 8, 1, 9, 4, 5, 8, 2, 0, 8, 2, 3, 7, 7, 0, 2, 3,
1, 9, 6, 9, 9, 7, 1, 2, 0, 1, 0, 0, 3, 2, 6, 4, 2, 8, 9, 0, 7, 5, 5, 1,
1, 9, 8, 2, 6, 9, 6, 1, 2, 4, 3, 7, 3, 6, 6, 8, 3, 4, 4, 7, 2, 4, 0, 7,
2, 4, 5, 5, 2, 0, 3, 5, 0, 5, 2, 8, 5, 2, 0, 1, 2, 5, 4, 0, 9, 6, 8, 6,
0, 5, 7, 4, 8, 7, 3, 9, 2, 2, 6, 2, 7, 7, 6, 0, 1, 6, 1, 0, 1, 9, 0, 8,
0, 7, 5, 8, 9, 4, 0, 3, 0, 7, 9, 6, 4, 4, 1, 5, 7, 9, 1, 5, 4, 9, 1, 4,
8, 6, 3, 7, 1, 0, 3, 3])),
names=('seeds', 'labels'),
)},
names=('seeds', 'labels'),
),
metadata={'name': 'node_classification', 'num_classes': 10},)
Loaded link prediction task: OnDiskTask(validation_set=ItemSetDict(
itemsets={'user:like:item': ItemSet(
items=(tensor([[951, 816],
[ 82, 256],
[472, 47],
...,
[ 4, 236],
[ 4, 763],
[ 4, 512]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
), 'user:follow:user': ItemSet(
items=(tensor([[548, 14],
[458, 847],
[669, 933],
...,
[695, 837],
[695, 434],
[695, 619]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
)},
names=('seeds', 'labels', 'indexes'),
),
train_set=ItemSetDict(
itemsets={'user:like:item': ItemSet(
items=(tensor([[223, 454],
[488, 169],
[636, 944],
...,
[634, 229],
[107, 230],
[838, 918]], dtype=torch.int32),),
names=('seeds',),
), 'user:follow:user': ItemSet(
items=(tensor([[784, 947],
[ 47, 185],
[663, 659],
...,
[717, 932],
[878, 352],
[335, 261]], dtype=torch.int32),),
names=('seeds',),
)},
names=('seeds',),
),
test_set=ItemSetDict(
itemsets={'user:like:item': ItemSet(
items=(tensor([[357, 647],
[ 81, 89],
[880, 39],
...,
[252, 282],
[252, 302],
[252, 377]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
), 'user:follow:user': ItemSet(
items=(tensor([[195, 196],
[403, 599],
[394, 249],
...,
[354, 302],
[354, 478],
[354, 539]], dtype=torch.int32), tensor([1., 1., 1., ..., 0., 0., 0.], dtype=torch.float64), tensor([ 0, 1, 2, ..., 1999, 1999, 1999])),
names=('seeds', 'labels', 'indexes'),
)},
names=('seeds', 'labels', 'indexes'),
),
metadata={'name': 'link_prediction', 'num_classes': 10},)
/home/ubuntu/prod-doc/readthedocs.org/user_builds/dgl/envs/latest/lib/python3.8/site-packages/dgl-2.3-py3.8-linux-x86_64.egg/dgl/graphbolt/impl/ondisk_dataset.py:460: DGLWarning: Edge feature is stored, but edge IDs are not saved.
dgl_warning("Edge feature is stored, but edge IDs are not saved.")
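With the dataset loaded, the task sets can be plugged into a GraphBolt sampling pipeline. Below is a minimal sketch of one possible pipeline over the node classification training set; the batch size, fanouts, and feature keys are illustrative choices, not part of this tutorial:

# Sample mini-batches from the training set, then sample neighbors and fetch features.
datapipe = gb.ItemSampler(nc_task.train_set, batch_size=128, shuffle=True)
datapipe = datapipe.sample_neighbor(graph, fanouts=[4, 4])  # 2-layer neighbor sampling
datapipe = datapipe.fetch_feature(
    feature, node_feature_keys={"user": ["feat_0"], "item": ["feat_0"]}
)
dataloader = gb.DataLoader(datapipe)
for minibatch in dataloader:
    print(minibatch)
    break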