GINDataset
class dgl.data.GINDataset(name, self_loop, degree_as_nlabel=False, raw_dir=None, force_reload=False, verbose=False, transform=None)
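Bases: DGLBuiltinDataset
Dataset class for the paper How Powerful Are Graph Neural Networks?.
It is adapted from https://github.com/weihua916/powerful-gnns/blob/master/dataset.zip.
The class provides an interface to the nine datasets used in the paper, along with the paper-specific settings. The datasets are 'MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', and 'REDDITMULTI5K'.
If degree_as_nlabel is set to False, ndata['label'] stores the provided node labels; otherwise ndata['label'] stores the node in-degrees.
For graphs that have node attributes, ndata['attr'] stores the node attributes. For graphs without attributes, ndata['attr'] stores the one-hot encoding of ndata['label'].
Parameters: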
name (str) – Dataset name, one of ('MUTAG', 'COLLAB', 'IMDBBINARY', 'IMDBMULTI', 'NCI1', 'PROTEINS', 'PTC', 'REDDITBINARY', 'REDDITMULTI5K').
self_loop (bool) – If True, add self-loop edges.
degree_as_nlabel (bool) – If True, use the node degree as both the label and the feature.
transform (callable, optional) – A transform that takes in a DGLGraph object and returns a transformed version. The DGLGraph object will be transformed before every access.
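The relationship between degree_as_nlabel, ndata['label'], and ndata['attr'] can be illustrated without DGL. The following is a minimal plain-Python sketch, not the library's implementation: the helpers in_degrees and one_hot and the toy graph are hypothetical, and ordering the one-hot columns by sorted distinct label values is an assumption made for illustration.

```python
def in_degrees(num_nodes, edges):
    """Count incoming edges per node for a list of (src, dst) pairs."""
    deg = [0] * num_nodes
    for _src, dst in edges:
        deg[dst] += 1
    return deg


def one_hot(labels):
    """One-hot encode labels; columns follow the sorted distinct values (assumption)."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return [[1.0 if index[lab] == j else 0.0 for j in range(len(classes))]
            for lab in labels]


# Toy directed graph with 4 nodes (hypothetical data, not from any GIN dataset).
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (3, 1)]
provided_labels = [2, 0, 2, 1]

# degree_as_nlabel=False: 'label' holds the provided node labels.
label_default = provided_labels
# degree_as_nlabel=True: 'label' holds the node in-degrees.
label_degree = in_degrees(4, edges)

# For graphs without node attributes, 'attr' is the one-hot encoding of 'label'.
attr_default = one_hot(label_default)
attr_degree = one_hot(label_degree)
```

In this sketch the width of each attr row equals the number of distinct label values, so switching to degree-based labels can change the feature dimension.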
Examples
>>> import dgl
>>> import torch
>>> from dgl.data import GINDataset
>>> data = GINDataset(name='MUTAG', self_loop=False)
The dataset instance is an iterable
>>> len(data)
188
>>> g, label = data[128]
>>> g
Graph(num_nodes=13, num_edges=26,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
>>> label
tensor(1)
Batch the graphs and labels for mini-batch training
>>> graphs, labels = zip(*[data[i] for i in range(16)])
>>> batched_graphs = dgl.batch(graphs)
>>> batched_labels = torch.tensor(labels)
>>> batched_graphs
Graph(num_nodes=330, num_edges=748,
      ndata_schemes={'label': Scheme(shape=(), dtype=torch.int64), 'attr': Scheme(shape=(7,), dtype=torch.float32)}
      edata_schemes={})
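Conceptually, batching merges several graphs into one larger graph whose components stay disconnected, which is why the node and edge counts of batched_graphs are sums over the 16 input graphs. A minimal plain-Python sketch of that idea follows. It assumes each graph is given as a (num_nodes, edge list) pair, and batch_edge_lists is a hypothetical helper, not a DGL function.

```python
def batch_edge_lists(graphs):
    """Merge (num_nodes, edges) graphs into one by offsetting node IDs."""
    total_nodes = 0
    merged_edges = []
    for num_nodes, edges in graphs:
        # Shift this graph's node IDs past all nodes seen so far,
        # so the merged components never share a node.
        offset = total_nodes
        merged_edges.extend((s + offset, d + offset) for s, d in edges)
        total_nodes += num_nodes
    return total_nodes, merged_edges


# Two toy graphs (hypothetical data).
g1 = (3, [(0, 1), (1, 2)])
g2 = (2, [(0, 1), (1, 0)])

num_nodes, edges = batch_edge_lists([g1, g2])
print(num_nodes, edges)
```

Because no edge crosses between the original graphs, message passing on the merged graph behaves the same as running it on each graph separately.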