torch_geometric.datasets.TUDataset
- class TUDataset(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False, use_node_attr: bool = False, use_edge_attr: bool = False, cleaned: bool = False)
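Bases: InMemoryDataset

A variety of graph kernel benchmark datasets, e.g., "IMDB-BINARY", "REDDIT-BINARY" or "PROTEINS", collected from TU Dortmund University. In addition, this dataset wrapper provides cleaned versions of the datasets, as motivated by the "Understanding Isomorphism Bias in Graph Data Sets" paper, which contain only non-isomorphic graphs.

Note
Some datasets may not come with any node labels. In that case, you can either use the argument use_node_attr to load additional continuous node attributes (if present), or provide synthetic node features through transforms such as torch_geometric.transforms.Constant or torch_geometric.transforms.OneHotDegree.

- Parameters: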
root (str) – Root directory where the dataset should be saved.
name (str) – The name of the dataset.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
use_node_attr (bool, optional) – If True, the dataset will contain additional continuous node attributes (if present). (default: False)
use_edge_attr (bool, optional) – If True, the dataset will contain additional continuous edge attributes (if present). (default: False)
cleaned (bool, optional) – If True, the dataset will contain only non-isomorphic graphs. (default: False)
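
The parameters above can be combined roughly as in the following sketch. It assumes PyTorch Geometric is installed and that the dataset files can be downloaded. The helper names keep_small_graphs and load_mutag, the max_nodes threshold, and the root path are illustrative assumptions and not part of the library.

```python
from types import SimpleNamespace


def keep_small_graphs(data, max_nodes=100):
    """pre_filter callable: return True to keep a graph in the dataset."""
    return data.num_nodes <= max_nodes


def load_mutag(root="data/TUDataset"):
    """Load MUTAG, requesting continuous node attributes if present."""
    # Imported lazily so the filter above can be tried without PyG installed.
    from torch_geometric.datasets import TUDataset

    return TUDataset(
        root=root,                     # hypothetical storage directory
        name="MUTAG",
        use_node_attr=True,            # adds continuous attributes, if any
        pre_filter=keep_small_graphs,  # applied once, during processing
    )


# Duck-typed stand-in for a torch_geometric.data.Data object (assumption:
# the filter only reads the num_nodes attribute).
toy_graph = SimpleNamespace(num_nodes=17)
print(keep_small_graphs(toy_graph))  # True
```

Because pre_filter runs during processing, changing it for an already processed dataset would typically call for force_reload=True or a fresh root directory.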
Statistics:

Name            #graphs   #nodes    #edges     #features   #classes
MUTAG           188       ~17.9     ~39.6      7           2
ENZYMES         600       ~32.6     ~124.3     3           6
PROTEINS        1,113     ~39.1     ~145.6     3           2
COLLAB          5,000     ~74.5     ~4914.4    0           3
IMDB-BINARY     1,000     ~19.8     ~193.1     0           2
REDDIT-BINARY   2,000     ~429.6    ~995.5     0           2
…
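
Datasets listed with 0 node features, such as IMDB-BINARY and REDDIT-BINARY, are the cases where a transform has to supply node features. As a rough illustration of what a degree-based one-hot feature looks like, the plain-Python sketch below mimics the idea behind torch_geometric.transforms.OneHotDegree. It is not the library's implementation; the function name one_hot_degree and the edge-list layout are assumptions made for this example.

```python
def one_hot_degree(edges, num_nodes, max_degree):
    """Return one one-hot row of length max_degree + 1 per node.

    edges is a list of directed (source, target) pairs; an undirected
    edge is assumed to appear once in each direction.
    """
    degree = [0] * num_nodes
    for source, _target in edges:
        degree[source] += 1  # count outgoing edges per node

    features = []
    for d in degree:
        if d > max_degree:
            raise ValueError(f"degree {d} exceeds max_degree {max_degree}")
        row = [0] * (max_degree + 1)
        row[d] = 1  # mark the position matching this node's degree
        features.append(row)
    return features


# A 3-node path 0 - 1 - 2, stored as directed pairs in both directions.
path_edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
features = one_hot_degree(path_edges, num_nodes=3, max_degree=2)
print(features)  # [[0, 1, 0], [0, 0, 1], [0, 1, 0]]
```

With PyTorch Geometric installed, the comparable step would be to pass an OneHotDegree instance as the transform argument of TUDataset, with max_degree chosen large enough for the dataset at hand.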