超图神经网络

本教程将说明什么是超图以及如何使用DGL的稀疏矩阵API构建超图神经网络。

[ ]:

# Install required packages.
import os
import torch
os.environ['TORCH'] = torch.__version__
os.environ['DGLBACKEND'] = "pytorch"

# Uncomment below to install required packages. If the CUDA version is not 11.8,
# check the https://www.dgl.ai/pages/start.html to find the supported CUDA
# version and corresponding command to install DGL.
#!pip install dgl -f https://data.dgl.ai/wheels/cu118/repo.html > /dev/null
#!pip install torchmetrics > /dev/null

try:
    import dgl
    installed = True
except ImportError:
    installed = False
print("DGL installed!" if installed else "Failed to install DGL!")

超图

一个超图由节点和超边组成。与图中的边不同，超边可以连接任意数量的节点。例如，下图展示了一个包含11个节点和5条超边的超图，超边用不同的颜色绘制。

超图在数据集内数据点之间的关系不是二元时特别有用。例如，在电子商务系统中，可以一起购买两种以上的产品，因此共同购买的关系是\(n\)元的而不是二元的，因此它更适合被描述为超图而不是普通图。

超图通常由其关联矩阵 \(H\) 来表征，其行代表节点，列代表超边。如果超边 \(j\) 包含节点 \(i\)，则条目 \(H_{ij}\) 为1，否则为0。例如，上图中的超图可以用一个 \(11 \times 5\) 的矩阵来表征，如下所示：

\[\begin{split}H = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ \end{bmatrix}\end{split}\]

可以通过指定两个张量 nodes 和 hyperedges 来构建超图关联矩阵，其中节点 ID nodes[i] 属于超边 ID hyperedges[i] 对于所有 i。在上述情况下，可以构建如下关联矩阵。

[ ]:

import dgl.sparse as dglsp
import torch

H = dglsp.spmatrix(
    torch.LongTensor([[0, 1, 2, 2, 2, 2, 3, 4, 5, 5, 5, 5, 6, 7, 7, 8, 8, 9, 9, 10],
                      [0, 0, 0, 1, 3, 4, 2, 1, 0, 2, 3, 4, 2, 1, 3, 1, 3, 2, 4, 4]])
)

print(H.to_dense())

tensor([[1., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0.],
        [1., 1., 0., 1., 1.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 0., 0.],
        [1., 0., 1., 1., 1.],
        [0., 0., 1., 0., 0.],
        [0., 1., 0., 1., 0.],
        [0., 1., 0., 1., 0.],
        [0., 0., 1., 0., 1.],
        [0., 0., 0., 0., 1.]])

超图中节点的度被定义为包含该节点的超边的数量。同样，超图中超边的度被定义为包含该超边的节点的数量。在上面的例子中，超边的度可以通过行向量的和来计算（即全部为4），而节点的度可以通过列向量的和来计算。

[ ]:

node_degrees = H.sum(1)
print("Node degrees", node_degrees)

hyperedge_degrees = H.sum(0)
print("Hyperedge degrees", hyperedge_degrees)

Node degrees tensor([1., 1., 4., 1., 1., 4., 1., 2., 2., 2., 1.])
Hyperedge degrees tensor([4., 4., 4., 4., 4.])

超图神经网络 (HGNN) 层

HGNN层的定义如下：

\[f(X^{(l)}, H; W^{(l)}) = \sigma(L X^{(l)} W^{(l)})\]

\[L = D_v^{-1/2} H B D_e^{-1} H^\top D_v^{-1/2}\]

其中

\(H \in \mathbb{R}^{N \times M}\) 是具有 \(N\) 个节点和 \(M\) 个超边的超图的关联矩阵。
\(D_v \in \mathbb{R}^{N \times N}\) 是一个表示节点度的对角矩阵，其第 \(i\) 个对角元素为 \(\sum_{j=1}^M H_{ij}\)。
\(D_e \in \mathbb{R}^{M \times M}\) 是一个表示超边度数的对角矩阵，其第 \(j\) 个对角元素为 \(\sum_{i=1}^N H_{ij}\)。
\(B \in \mathbb{R}^{M \times M}\) 是一个表示超边权重的对角矩阵，其\(j\)-th对角元素是第\(j\)-th超边的权重。在我们的例子中，\(B\) 是一个单位矩阵。

以下代码构建了一个两层的HGNN。

[ ]:

import dgl.sparse as dglsp
import torch
import torch.nn as nn
import torch.nn.functional as F
import tqdm
from dgl.data import CoraGraphDataset
from torchmetrics.functional import accuracy


class HGNN(nn.Module):
    def __init__(self, H, in_size, out_size, hidden_dims=16):
        super().__init__()

        self.W1 = nn.Linear(in_size, hidden_dims)
        self.W2 = nn.Linear(hidden_dims, out_size)
        self.dropout = nn.Dropout(0.5)

        ###########################################################
        # (HIGHLIGHT) Compute the Laplacian with Sparse Matrix API
        ###########################################################
        # Compute node degree.
        d_V = H.sum(1)
        # Compute edge degree.
        d_E = H.sum(0)
        # Compute the inverse of the square root of the diagonal D_v.
        D_v_invsqrt = dglsp.diag(d_V**-0.5)
        # Compute the inverse of the diagonal D_e.
        D_e_inv = dglsp.diag(d_E**-1)
        # In our example, B is an identity matrix.
        n_edges = d_E.shape[0]
        B = dglsp.identity((n_edges, n_edges))
        # Compute Laplacian from the equation above.
        self.L = D_v_invsqrt @ H @ B @ D_e_inv @ H.T @ D_v_invsqrt

    def forward(self, X):
        X = self.L @ self.W1(self.dropout(X))
        X = F.relu(X)
        X = self.L @ self.W2(self.dropout(X))
        return X

Loading Data

我们在示例中使用了Cora引用网络。但是，我们没有使用论文之间的原始“引用”关系，而是考虑了论文之间的“共同引用”关系。我们从原始引用网络构建了一个超图，其中对于每篇论文，我们构建了一个超边，该超边包括它引用的所有其他论文以及论文本身。

请注意，以这种方式构建的超图的关联矩阵与原始图的邻接矩阵完全相同（加上自环的单位矩阵）。这是因为每个超边与每篇论文之间存在一一对应关系。因此，我们可以直接取图的邻接矩阵并加上单位矩阵，然后将其用作超图的关联矩阵。

[ ]:

def load_data():
    dataset = CoraGraphDataset()

    graph = dataset[0]
    indices = torch.stack(graph.edges())
    H = dglsp.spmatrix(indices)
    H = H + dglsp.identity(H.shape)

    X = graph.ndata["feat"]
    Y = graph.ndata["label"]
    train_mask = graph.ndata["train_mask"]
    val_mask = graph.ndata["val_mask"]
    test_mask = graph.ndata["test_mask"]
    return H, X, Y, dataset.num_classes, train_mask, val_mask, test_mask

训练和评估

现在我们可以编写训练和评估函数如下。

[ ]:

def train(model, optimizer, X, Y, train_mask):
    model.train()
    Y_hat = model(X)
    loss = F.cross_entropy(Y_hat[train_mask], Y[train_mask])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


def evaluate(model, X, Y, val_mask, test_mask, num_classes):
    model.eval()
    Y_hat = model(X)
    val_acc = accuracy(
        Y_hat[val_mask], Y[val_mask], task="multiclass", num_classes=num_classes
    )
    test_acc = accuracy(
        Y_hat[test_mask],
        Y[test_mask],
        task="multiclass",
        num_classes=num_classes,
    )
    return val_acc, test_acc


H, X, Y, num_classes, train_mask, val_mask, test_mask = load_data()
model = HGNN(H, X.shape[1], num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

with tqdm.trange(500) as tq:
    for epoch in tq:
        train(model, optimizer, X, Y, train_mask)
        val_acc, test_acc = evaluate(
            model, X, Y, val_mask, test_mask, num_classes
        )
        tq.set_postfix(
            {
                "Val acc": f"{val_acc:.5f}",
                "Test acc": f"{test_acc:.5f}",
            },
            refresh=False,
        )

print(f"Test acc: {test_acc:.3f}")

Downloading /root/.dgl/cora_v2.zip from https://data.dgl.ai/dataset/cora_v2.zip...
Extracting file to /root/.dgl/cora_v2
Finished data loading and preprocessing.
  NumNodes: 2708
  NumEdges: 10556
  NumFeats: 1433
  NumClasses: 7
  NumTrainingSamples: 140
  NumValidationSamples: 500
  NumTestSamples: 1000
Done saving data into cached files.

100%|██████████| 500/500 [00:57<00:00,  8.70it/s, Val acc=0.77800, Test acc=0.78100]

Test acc: 0.781

有关HGNN的完整示例，请参考这里。