GINDataset

class dgl.data.GINDataset(name, self_loop, degree_as_nlabel=False, raw_dir=None, force_reload=False, verbose=False, transform=None)

Bases: DGLBuiltinDataset

Dataset class for the paper "How Powerful Are Graph Neural Networks?"

It is adapted from https://github.com/weihua916/powerful-gnns/blob/master/dataset.zip.

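The class provides an interface to the nine datasets used in the paper, together with the paper-specific settings. The datasets are 'MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', and 'REDDITMULTI5K'.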

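If degree_as_nlabel is set to False, ndata['label'] stores the provided node labels; otherwise, ndata['label'] stores the node in-degrees.

For graphs that have node attributes, ndata['attr'] stores the node attributes. For graphs that have no attributes, ndata['attr'] stores the corresponding one-hot encoding of ndata['label'].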
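The relationship between ndata['label'] and ndata['attr'] for attribute-free graphs can be illustrated without DGL. The following is a minimal pure-Python sketch, not the library's implementation: the helper names in_degrees and one_hot are hypothetical, and it assumes the one-hot vocabulary is the sorted set of labels seen in the input.

```python
def in_degrees(num_nodes, edges):
    """Count incoming edges per node; edges are (src, dst) pairs."""
    deg = [0] * num_nodes
    for _src, dst in edges:
        deg[dst] += 1
    return deg


def one_hot(labels):
    """One-hot encode integer labels over the sorted set of observed labels.

    Assumption: the vocabulary is built from the labels passed in; the real
    dataset may build it differently (for example, across all graphs).
    """
    vocab = sorted(set(labels))
    index = {lab: i for i, lab in enumerate(vocab)}
    return [
        [1.0 if index[lab] == j else 0.0 for j in range(len(vocab))]
        for lab in labels
    ]


# A toy 4-node path graph 0-1-2-3, stored as directed edges in both directions.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]

labels = in_degrees(4, edges)  # plays the role of ndata['label'] when degree_as_nlabel=True
attrs = one_hot(labels)        # plays the role of ndata['attr'] for attribute-free graphs
```

Here the two end nodes share one label (in-degree 1) and the two inner nodes share another (in-degree 2), so each attribute vector has length 2.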

Parameters:

name (str) – Dataset name, one of ('MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', 'REDDITMULTI5K').
self_loop (bool) – If True, add self-loop edges.
degree_as_nlabel (bool) – If True, use node degrees as the node labels and features.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
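As an illustration of the transform parameter, a transform is simply a callable that receives a graph and returns a graph. The sketch below is hypothetical: the function name add_in_degree and the feature key 'in_deg' are made up for this example, and it relies only on the graph exposing ndata and in_degrees(), as DGLGraph does.

```python
def add_in_degree(g):
    """Hypothetical transform: store each node's in-degree under ndata['in_deg'].

    The graph is modified in place and returned, matching the
    "takes a graph, returns a graph" contract described above.
    """
    g.ndata['in_deg'] = g.in_degrees()
    return g


# Assumed usage (requires DGL; the dataset is downloaded on first use):
#
#   from dgl.data import GINDataset
#   data = GINDataset(name='MUTAG', self_loop=False, transform=add_in_degree)
#   g, label = data[0]   # g.ndata should now also carry 'in_deg'
```

Because the transform is applied on every access, it is best kept cheap; expensive preprocessing is usually better done once up front.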

num_classes

Number of classes for multiclass classification.

Type: int

Examples

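>>> import dgl
>>> import torch
>>> from dgl.data import GINDataset
>>> data = GINDataset(name='MUTAG', self_loop=False)

The dataset instance is iterable and supports indexing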

>>> len(data)
188
>>> g, label = data[128]
>>> g
Graph(num_nodes=13, num_edges=26,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
>>> label
tensor(1)

Batch the graphs and labels for mini-batch training

>>> graphs, labels = zip(*[data[i] for i in range(16)])
>>> batched_graphs = dgl.batch(graphs)
>>> batched_labels = torch.tensor(labels)
>>> batched_graphs
Graph(num_nodes=330, num_edges=748,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
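The example above builds a single batch from the first 16 graphs. To cover the whole dataset in an epoch, one option is to generate shuffled index batches and apply the same zip / dgl.batch steps to each. The helper below is a hypothetical pure-Python sketch (index_batches is not part of DGL); DGL also ships dgl.dataloading.GraphDataLoader, which is usually the more convenient choice.

```python
import random


def index_batches(num_graphs, batch_size, seed=0):
    """Yield lists of dataset indices that together cover 0..num_graphs-1 once.

    Indices are shuffled with a fixed seed so the order is reproducible;
    the last batch may be smaller than batch_size.
    """
    order = list(range(num_graphs))
    random.Random(seed).shuffle(order)
    for start in range(0, num_graphs, batch_size):
        yield order[start:start + batch_size]


# MUTAG has 188 graphs (see len(data) above); 16 is the batch size used above.
batches = list(index_batches(188, 16))

# Assumed usage with the dataset object from the example above:
#
#   for idx in batches:
#       graphs, labels = zip(*[data[i] for i in idx])
#       batched_graphs = dgl.batch(graphs)
#       batched_labels = torch.tensor(labels)
```

With 188 graphs and a batch size of 16 this yields 12 batches, the last one holding the remaining 12 graphs.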

__getitem__(idx)

Get the idx-th sample.

Parameters: idx (int) – The sample index.
Returns: The graph and its label.
Return type: (dgl.Graph, Tensor)

__len__(): Return the number of graphs in the dataset.