torch_geometric.datasets.MoleculeNet

class MoleculeNet(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False, from_smiles: Optional[Callable] = None)

Bases: InMemoryDataset

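The MoleculeNet benchmark collection from the "MoleculeNet: A Benchmark for Molecular Machine Learning" paper, containing datasets from physical chemistry, biophysics and physiology. All datasets come with the additional node and edge features introduced by the Open Graph Benchmark.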

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • name (str) – The name of the dataset ("ESOL", "FreeSolv", "Lipo", "PCBA", "MUV", "HIV", "BACE", "BBBP", "Tox21", "ToxCast", "SIDER", "ClinTox").

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)

  • from_smiles (callable, optional) – A custom function that takes a SMILES string and outputs a Data object. If not set, defaults to from_smiles(). (default: None)
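The parameters above can be combined as in the following minimal sketch. It assumes torch_geometric is installed and that the dataset can be downloaded on first use. The helper names canonical_name, has_edges and load_dataset, and the default root path, are hypothetical and not part of the library.

```python
# Dataset names as listed in the `name` parameter description above.
SUPPORTED_NAMES = (
    "ESOL", "FreeSolv", "Lipo", "PCBA", "MUV", "HIV",
    "BACE", "BBBP", "Tox21", "ToxCast", "SIDER", "ClinTox",
)


def canonical_name(name: str) -> str:
    """Hypothetical helper: map user input to a documented dataset name."""
    for known in SUPPORTED_NAMES:
        if known.lower() == name.lower():
            return known
    raise ValueError(f"Unknown MoleculeNet dataset: {name!r}")


def has_edges(data) -> bool:
    """Example pre_filter: keep only molecules with at least one bond."""
    return data.num_edges > 0


def load_dataset(name: str, root: str = "data/MoleculeNet"):
    """Load one MoleculeNet dataset, dropping molecules without bonds."""
    # Imported lazily so the helpers above work without torch_geometric.
    from torch_geometric.datasets import MoleculeNet

    return MoleculeNet(
        root=root,                  # assumed location for the saved files
        name=canonical_name(name),
        pre_filter=has_edges,       # applied once, before saving to disk
    )
```

With this in place, load_dataset("ESOL") would return the dataset object, and indexing it (for example dataset[0]) yields a single Data object. Because pre_filter runs before the processed data is saved, changing it later generally calls for force_reload=True.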

Stats:

Name            #graphs   #nodes   #edges   #features   #classes
ESOL            1,128     ~13.3    ~27.4    9           1
FreeSolv        642       ~8.7     ~16.8    9           1
Lipophilicity   4,200     ~27.0    ~59.0    9           1
PCBA            437,929   ~26.0    ~56.2    9           128
MUV             93,087    ~24.2    ~52.6    9           17
HIV             41,127    ~25.5    ~54.9    9           1
BACE            1,513     ~34.1    ~73.7    9           1
BBBP            2,050     ~23.9    ~51.6    9           1
Tox21           7,831     ~18.6    ~38.6    9           12
ToxCast         8,597     ~18.7    ~38.4    9           617
SIDER           1,427     ~33.6    ~70.7    9           27
ClinTox         1,484     ~26.1    ~55.5    9           2
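The graph count and the average node and edge counts listed above can be recomputed from any loaded dataset. The following is a minimal sketch, assuming each element exposes num_nodes and num_edges attributes, as torch_geometric.data.Data objects do. The summarize function and the Graph stand-in class are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Graph:
    """Hypothetical stand-in for a Data object, used for a quick check."""
    num_nodes: int
    num_edges: int


def summarize(graphs):
    """Return (#graphs, mean #nodes, mean #edges) for an iterable of graphs."""
    count = total_nodes = total_edges = 0
    for g in graphs:
        count += 1
        total_nodes += g.num_nodes
        total_edges += g.num_edges
    if count == 0:
        raise ValueError("cannot summarize an empty collection")
    return count, total_nodes / count, total_edges / count


# Two toy graphs: 2 and 4 nodes, 2 and 6 edges.
print(summarize([Graph(2, 2), Graph(4, 6)]))  # (2, 3.0, 4.0)
```

Passing a loaded MoleculeNet dataset instead of the toy list should give values close to the corresponding row, although iterating over the larger datasets such as PCBA can take a while.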