torch_geometric.datasets.Planetoid

class Planetoid(root: str, name: str, split: str = 'public', num_train_per_class: int = 20, num_val: int = 500, num_test: int = 1000, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, force_reload: bool = False)

Bases: InMemoryDataset

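The citation network datasets "Cora", "CiteSeer" and "PubMed" from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. Nodes represent documents and edges represent citation links. Training, validation and test splits are given by binary masks.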

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • name (str) – The name of the dataset ("Cora", "CiteSeer", "PubMed").

  • split (str, optional) –

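    The type of dataset split ("public", "full", "geom-gcn", "random"). If set to "public", the split will be the public fixed split from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. If set to "full", all nodes except those in the validation and test sets will be used for training (as in the “FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling” paper). If set to "geom-gcn", the 10 public fixed splits from the “Geom-GCN: Geometric Graph Convolutional Networks” paper are given. If set to "random", the training, validation and test sets will be randomly generated according to num_train_per_class, num_val and num_test. (default: "public")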

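  • num_train_per_class (int, optional) – The number of training samples per class in case of "random" split. (default: 20)

  • num_val (int, optional) – The number of validation samples in case of "random" split. (default: 500)

  • num_test (int, optional) – The number of test samples in case of "random" split. (default: 1000)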

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
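The "random" and "full" split rules described under split can be illustrated with a small, dependency-free sketch. This is not the library's code: the helper names random_split_masks and full_train_mask are hypothetical, plain Python lists stand in for tensors, and the seed argument is an assumption added only to make the illustration reproducible.

```python
import random
from typing import List, Sequence, Tuple


def random_split_masks(
    labels: Sequence[int],
    num_train_per_class: int = 20,
    num_val: int = 500,
    num_test: int = 1000,
    seed: int = 0,  # assumption: added for reproducibility of this sketch
) -> Tuple[List[bool], List[bool], List[bool]]:
    """Build boolean train/val/test masks in the spirit of split="random"."""
    rng = random.Random(seed)
    n = len(labels)
    train = [False] * n
    val = [False] * n
    test = [False] * n

    # Draw up to num_train_per_class training nodes from every class.
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for i in idx[:num_train_per_class]:
            train[i] = True

    # Validation and test nodes come from the nodes not used for training.
    remaining = [i for i in range(n) if not train[i]]
    rng.shuffle(remaining)
    for i in remaining[:num_val]:
        val[i] = True
    for i in remaining[num_val:num_val + num_test]:
        test[i] = True
    return train, val, test


def full_train_mask(val: Sequence[bool], test: Sequence[bool]) -> List[bool]:
    """split="full": train on every node that is in neither val nor test."""
    return [not (v or t) for v, t in zip(val, test)]
```

With the default arguments and seven classes (as in Cora), the first helper marks 7 × 20 = 140 training nodes, then 500 validation and 1000 test nodes drawn from the rest, provided enough nodes are available.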

Stats:

  Name       #nodes   #edges   #features   #classes
  Cora        2,708   10,556       1,433          7
  CiteSeer    3,327    9,104       3,703          6
  PubMed     19,717   88,648         500          3
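The statistics listed above can be checked by loading a dataset and inspecting its single graph. The following is a minimal sketch, assuming torch_geometric is installed and the dataset can be downloaded on first use; the helper name summarize_planetoid is hypothetical, and the import sits inside the function so that defining the helper does not require the library.

```python
from typing import Dict


def summarize_planetoid(root: str, name: str = "Cora",
                        split: str = "public") -> Dict[str, int]:
    """Load a Planetoid dataset and return its basic statistics."""
    # Imported lazily so this sketch can be defined without torch_geometric.
    from torch_geometric.datasets import Planetoid

    dataset = Planetoid(root=root, name=name, split=split)
    data = dataset[0]  # each Planetoid dataset holds exactly one graph

    return {
        "num_nodes": data.num_nodes,
        "num_edges": data.num_edges,
        "num_features": dataset.num_features,
        "num_classes": dataset.num_classes,
        # The masks are boolean tensors; summing counts the selected nodes.
        "num_train": int(data.train_mask.sum()),
        "num_val": int(data.val_mask.sum()),
        "num_test": int(data.test_mask.sum()),
    }
```

Calling summarize_planetoid("data/Planetoid", "Cora") (the path is only an example) downloads the raw files into the root directory on first use. The mask counts are meaningful for splits that store one mask per subset; with split="geom-gcn" the masks hold 10 splits, so the sums would aggregate over all of them.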