torch_geometric.datasets.Planetoid
- class Planetoid(root: str, name: str, split: str = 'public', num_train_per_class: int = 20, num_val: int = 500, num_test: int = 1000, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, force_reload: bool = False)
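Bases: InMemoryDataset
The citation network datasets "Cora", "CiteSeer" and "PubMed" from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. Nodes represent documents and edges represent citation links. Training, validation and test splits are given by binary masks.
- Parameters: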
root (str) – Root directory where the dataset should be saved.
name (str) – The name of the dataset ("Cora", "CiteSeer", "PubMed").
split (str, optional) – The type of dataset split ("public", "full", "geom-gcn", "random"). If set to "public", the split will be the public fixed split from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. If set to "full", all nodes except those in the validation and test sets will be used for training (as in the “FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling” paper). If set to "geom-gcn", the 10 public fixed splits from the “Geom-GCN: Geometric Graph Convolutional Networks” paper are given. If set to "random", the training, validation and test sets will be randomly generated according to num_train_per_class, num_val and num_test. (default: "public")
num_train_per_class (int, optional) – The number of training samples per class in case of "random" split. (default: 20)
num_val (int, optional) – The number of validation samples in case of "random" split. (default: 500)
num_test (int, optional) – The number of test samples in case of "random" split. (default: 1000)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
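The "random" and "full" split descriptions above can be made concrete with a small standard-library sketch. It is an illustration of the described behaviour, not the library's implementation: the helper names `random_split` and `full_split`, the `seed` argument, and the toy label list are all assumptions introduced here, and masks are plain Python lists of booleans rather than tensors.

```python
import random
from collections import defaultdict


def random_split(labels, num_train_per_class=20, num_val=500, num_test=1000, seed=None):
    """Return boolean train/val/test masks for a list of integer class labels.

    Hypothetical helper: mirrors the documented "random" split semantics only.
    """
    rng = random.Random(seed)
    num_nodes = len(labels)

    # Group node indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw up to `num_train_per_class` training nodes from every class.
    train_mask = [False] * num_nodes
    for label in sorted(by_class):
        members = by_class[label]
        for idx in rng.sample(members, min(num_train_per_class, len(members))):
            train_mask[idx] = True

    # Shuffle the remaining nodes, then carve out validation and test sets.
    remaining = [i for i in range(num_nodes) if not train_mask[i]]
    rng.shuffle(remaining)

    val_mask = [False] * num_nodes
    for idx in remaining[:num_val]:
        val_mask[idx] = True

    test_mask = [False] * num_nodes
    for idx in remaining[num_val:num_val + num_test]:
        test_mask[idx] = True

    return train_mask, val_mask, test_mask


def full_split(val_mask, test_mask):
    """"full" semantics: every node outside validation and test is a training node."""
    return [not (v or t) for v, t in zip(val_mask, test_mask)]


# Toy input: 3 classes with 10 nodes each (made-up labels, not a real dataset).
labels = [c for c in range(3) for _ in range(10)]
train, val, test = random_split(labels, num_train_per_class=2, num_val=5, num_test=8, seed=0)
print(sum(train), sum(val), sum(test))  # 6 5 8
print(sum(full_split(val, test)))       # 17
```

With the dataset class itself, the same request would be expressed through the constructor arguments listed above, for example `Planetoid(root, name="Cora", split="random", num_train_per_class=20, num_val=500, num_test=1000)`; the resulting masks are then read from the loaded data object.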
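Statistics:

| Name | #nodes | #edges | #features | #classes |
|----------|--------|--------|-----------|----------|
| Cora | 2,708 | 10,556 | 1,433 | 7 |
| CiteSeer | 3,327 | 9,104 | 3,703 | 6 |
| PubMed | 19,717 | 88,648 | 500 | 3 |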
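To put the statistics listed above in perspective, the following sketch derives two quantities from them: the fraction of nodes that would carry a training label with 20 training nodes per class (the default value of num_train_per_class), and the number of stored edges per node. The `STATS` dictionary and the helper names are assumptions made for this illustration; the numbers are copied from the statistics above.

```python
# (nodes, edges, features, classes), copied from the statistics above.
STATS = {
    "Cora": (2708, 10556, 1433, 7),
    "CiteSeer": (3327, 9104, 3703, 6),
    "PubMed": (19717, 88648, 500, 3),
}


def label_rate(name, num_train_per_class=20):
    """Fraction of nodes used for training with a fixed number of labels per class."""
    nodes, _, _, classes = STATS[name]
    return num_train_per_class * classes / nodes


def edges_per_node(name):
    """Number of stored edges divided by the number of nodes."""
    nodes, edges, _, _ = STATS[name]
    return edges / nodes


for name in STATS:
    print(f"{name}: label rate {label_rate(name):.2%}, "
          f"edges per node {edges_per_node(name):.2f}")
# Cora: label rate 5.17%, edges per node 3.90
# CiteSeer: label rate 3.61%, edges per node 2.74
# PubMed: label rate 0.30%, edges per node 4.50
```

The small label rates show why these datasets are commonly used for semi-supervised node classification: only a few percent of the nodes, or less, are labeled for training.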