torch_geometric.datasets.HydroNet

class HydroNet(root: str, name: Optional[str] = None, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, force_reload: bool = False, num_workers: int = 8, clusters: Optional[Union[int, List[int]]] = None, use_processed: bool = True)

Bases: InMemoryDataset

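The HydroNet dataset from the "HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data" paper, consisting of 5 million water clusters held together by hydrogen bonding networks. The dataset provides the atomic coordinates and the total energy in kcal/mol for each cluster.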

Parameters:
  • root (str) – Root directory where the dataset should be saved.

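  • name (str, optional) – Name of the subset of the full dataset to use: "small" uses 500k graphs sampled from the "medium" dataset, "medium" uses 2.7m graphs with a maximum size of 75 nodes. Mutually exclusive with the clusters argument. (default: None)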

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)

  • num_workers (int) – Number of multiprocessing workers to use for pre-processing the dataset. (default: 8)

  • clusters (int or List[int], optional) – Select a subset of clusters from the full dataset. If set to None, all clusters are selected. (default: None)

  • use_processed (bool) – Whether to use a pre-processed version of the original xyz dataset. (default: True)
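
Because name and clusters are documented as mutually exclusive, it can help to check the arguments before constructing the dataset. The following is a minimal sketch, not part of the library: hydronet_kwargs and load_hydronet are hypothetical helper names, the restriction of name to "small" and "medium" assumes that only the subsets listed above exist, and the path "data/HydroNet" is a placeholder. The import of HydroNet is deferred so the argument check runs without triggering a download.

```python
from typing import Any, Dict, List, Optional, Union


def hydronet_kwargs(
    root: str,
    name: Optional[str] = None,
    clusters: Optional[Union[int, List[int]]] = None,
    num_workers: int = 8,
    use_processed: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect HydroNet constructor arguments."""
    # The documentation states that `name` and `clusters` are mutually exclusive.
    if name is not None and clusters is not None:
        raise ValueError("'name' and 'clusters' are mutually exclusive")
    # Assumption: only the two subsets named in the documentation are valid.
    if name is not None and name not in ("small", "medium"):
        raise ValueError(f"unknown subset name: {name!r}")
    # `clusters` accepts an int or a list of ints; normalize to a list here.
    if isinstance(clusters, int):
        clusters = [clusters]
    return {
        "root": root,
        "name": name,
        "clusters": clusters,
        "num_workers": num_workers,
        "use_processed": use_processed,
    }


def load_hydronet(**kwargs: Any):
    """Hypothetical wrapper: build the dataset from validated arguments."""
    # Imported lazily so that validation works without torch_geometric installed.
    from torch_geometric.datasets import HydroNet

    return HydroNet(**kwargs)
```

Calling load_hydronet(**hydronet_kwargs("data/HydroNet", name="small")) would then construct the "small" subset; expect the first call to download and process the data under root.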