torch_geometric.data.HeteroData
- class HeteroData(_mapping: Optional[Dict[str, Any]] = None, **kwargs)[source]
Bases: BaseData, FeatureStore, GraphStore
A data object describing a heterogeneous graph, holding multiple node and/or edge types in separate storage objects. Storage objects can hold node-level, link-level, or graph-level attributes. In general, HeteroData tries to mimic the behavior of a regular nested Python dictionary. In addition, it provides useful functionality for analyzing graph structures, as well as basic PyTorch tensor functionality.

    import torch
    from torch_geometric.data import HeteroData

    data = HeteroData()

    # Create two node types "paper" and "author", each holding a feature matrix:
    data['paper'].x = torch.randn(num_papers, num_paper_features)
    data['author'].x = torch.randn(num_authors, num_authors_features)

    # Create an edge type "(author, writes, paper)" and build the
    # graph connectivity:
    data['author', 'writes', 'paper'].edge_index = ...  # [2, num_edges]

    data['paper'].num_nodes
    >>> 23

    data['author', 'writes', 'paper'].num_edges
    >>> 52

    # PyTorch tensor functionality:
    data = data.pin_memory()
    data = data.to('cuda:0', non_blocking=True)
Note that there are multiple ways to create a heterogeneous graph data object, e.g.:
To initialize a node of type "paper" holding a node feature matrix x_paper named x:

    from torch_geometric.data import HeteroData

    # (1) Assign attributes after initialization,
    data = HeteroData()
    data['paper'].x = x_paper

    # or (2) pass them as keyword arguments during initialization,
    data = HeteroData(paper={ 'x': x_paper })

    # or (3) pass them as dictionaries during initialization,
    data = HeteroData({'paper': { 'x': x_paper }})

To initialize an edge from source node type "author" to destination node type "paper" with relation type "writes", holding a graph connectivity matrix edge_index_author_paper named edge_index:

    # (1) Assign attributes after initialization,
    data = HeteroData()
    data['author', 'writes', 'paper'].edge_index = edge_index_author_paper

    # or (2) pass them as keyword arguments during initialization,
    data = HeteroData(author__writes__paper={
        'edge_index': edge_index_author_paper
    })

    # or (3) pass them as dictionaries during initialization,
    data = HeteroData({
        ('author', 'writes', 'paper'):
        { 'edge_index': edge_index_author_paper }
    })
- to_namedtuple() NamedTuple[source]
Returns a NamedTuple of stored key/value pairs.
- Return type:
NamedTuple
- set_value_dict(key: str, value_dict: Dict[str, Any]) Self[source]
Sets the values in the dictionary value_dict as the attribute named key for all node/edge types present in the dictionary.

    data = HeteroData()
    data.set_value_dict('x', {
        'paper': torch.randn(4, 16),
        'author': torch.randn(8, 32),
    })
    print(data['paper'].x)
- Return type:
Self
- __cat_dim__(key: str, value: Any, store: Optional[Union[NodeStorage, EdgeStorage]] = None, *args, **kwargs) Any[source]
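Returns the dimension along which the value value of the attribute key will get concatenated when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden if the mini-batch creation process is corrupted for a specific attribute.
- Return type:
Any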
- __inc__(key: str, value: Any, store: Optional[Union[NodeStorage, EdgeStorage]] = None, *args, **kwargs) Any[source]
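Returns the incremental count to cumulatively increase the value value of the attribute key when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden if the mini-batch creation process is corrupted for a specific attribute.
- Return type:
Any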
- property num_features: Dict[str, int]
Returns the number of features per node type in the graph.
Alias for num_node_features.
- metadata() Tuple[List[str], List[Tuple[str, str, str]]][source]
Returns the heterogeneous metadata, i.e. its node and edge types.

    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...
    data['author', 'writes', 'paper'].edge_index = ...
    print(data.metadata())
    >>> (['paper', 'author'], [('author', 'writes', 'paper')])
- collect(key: str, allow_empty: bool = False) Dict[Union[str, Tuple[str, str, str]], Any][source]
Collects the attribute key from all node and edge types.

    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...
    print(data.collect('x'))
    >>> { 'paper': ..., 'author': ...}

Note
This is equivalent to writing data.x_dict.
- get_node_store(key: str) NodeStorage[source]
Gets the NodeStorage object of a particular node type key. If the storage is not present yet, will create a new torch_geometric.data.storage.NodeStorage object for the given node type.

    data = HeteroData()
    node_storage = data.get_node_store('paper')
- Return type:
NodeStorage
- get_edge_store(src: str, rel: str, dst: str) EdgeStorage[source]
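Gets the EdgeStorage object of a particular edge type given by the tuple (src, rel, dst). If the storage is not present yet, will create a new torch_geometric.data.storage.EdgeStorage object for the given edge type.

    data = HeteroData()
    edge_storage = data.get_edge_store('author', 'writes', 'paper')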
- Return type:
EdgeStorage
- subgraph(subset_dict: Dict[str, Tensor]) Self[source]
Returns the induced subgraph containing the node types and corresponding nodes in subset_dict.
If a node type is not a key in subset_dict, then all nodes of that type remain in the graph.

    data = HeteroData()
    data['paper'].x = ...
    data['author'].x = ...
    data['conference'].x = ...
    data['paper', 'cites', 'paper'].edge_index = ...
    data['author', 'paper'].edge_index = ...
    data['paper', 'conference'].edge_index = ...
    print(data)
    >>> HeteroData(
        paper={ x=[10, 16] },
        author={ x=[5, 32] },
        conference={ x=[5, 8] },
        (paper, cites, paper)={ edge_index=[2, 50] },
        (author, to, paper)={ edge_index=[2, 30] },
        (paper, to, conference)={ edge_index=[2, 25] }
    )

    subset_dict = {
        'paper': torch.tensor([3, 4, 5, 6]),
        'author': torch.tensor([0, 2]),
    }

    print(data.subgraph(subset_dict))
    >>> HeteroData(
        paper={ x=[4, 16] },
        author={ x=[2, 32] },
        conference={ x=[5, 8] },
        (paper, cites, paper)={ edge_index=[2, 24] },
        (author, to, paper)={ edge_index=[2, 5] },
        (paper, to, conference)={ edge_index=[2, 10] }
    )

- Parameters:
subset_dict (Dict[str, LongTensor or BoolTensor]) – A dictionary holding the nodes to keep for each node type.
- Return type:
Self
- edge_subgraph(subset_dict: Dict[Tuple[str, str, str], Tensor]) Self[source]
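Returns the induced subgraph given by the edge indices in subset_dict for certain edge types. Currently, all nodes in the graph are preserved, even if they are isolated after the subgraph computation.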
- node_type_subgraph(node_types: List[str]) Self[source]
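Returns the subgraph induced by the given node_types, i.e. the returned HeteroData object only contains the node types included in node_types, and only contains the edge types whose end points are both included in node_types.
- Return type: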
Self
- edge_type_subgraph(edge_types: List[Tuple[str, str, str]]) Self[source]
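Returns the subgraph induced by the given edge_types, i.e. the returned HeteroData object only contains the edge types included in edge_types, and only contains the node types of the end points included in node_types.
- Return type: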
Self
- to_homogeneous(node_attrs: Optional[List[str]] = None, edge_attrs: Optional[List[str]] = None, add_node_type: bool = True, add_edge_type: bool = True, dummy_values: bool = True) Data[source]
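Converts a HeteroData object to a homogeneous Data object. By default, all features with the same feature dimensionality across different types will be merged into a single representation, unless otherwise specified via the node_attrs and edge_attrs arguments. Furthermore, attributes named node_type and edge_type will be added to the returned Data object, denoting node-level and edge-level vectors that hold the node and edge type as integers, respectively.
- Parameters:
node_attrs (List[str], optional) – The node features to combine across all node types. These node features need to be of the same feature dimensionality. If set to None, will automatically determine which node features to combine. (default: None)
edge_attrs (List[str], optional) – The edge features to combine across all edge types. These edge features need to be of the same feature dimensionality. If set to None, will automatically determine which edge features to combine. (default: None)
add_node_type (bool, optional) – If set to False, will not add the node-level vector node_type to the returned Data object. (default: True)
add_edge_type (bool, optional) – If set to False, will not add the edge-level vector edge_type to the returned Data object. (default: True)
dummy_values (bool, optional) – If set to True, will fill attributes of remaining types with dummy values. Dummy values are NaN for floating point attributes, False for booleans, and -1 for integers. (default: True)
- Return type:
Data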
- get_all_tensor_attrs() List[TensorAttr][source]
Returns all registered tensor attributes.
- Return type:
List[TensorAttr]
- apply(func: Callable, *args: str)
Applies the function func, either to all attributes or only the ones given in *args.
- apply_(func: Callable, *args: str)
Applies the in-place function func, either to all attributes or only the ones given in *args.
- clone(*args: str)
Performs cloning of tensors, either for all attributes or only the ones given in *args.
- coalesce() Self
Sorts and removes duplicated entries from edge indices edge_index.
- Return type:
Self
- concat(data: Self) Self
Concatenates self with another data object. All values need to have matching shapes at non-concat dimensions.
- Return type:
Self
- contiguous(*args: str)
Ensures a contiguous memory layout, either for all attributes or only the ones given in *args.
- coo(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Returns the edge indices in the GraphStore in COO format.
- cpu(*args: str)
Copies attributes to CPU memory, either for all attributes or only the ones given in *args.
- csc(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Returns the edge indices in the GraphStore in CSC format.
- csr(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Returns the edge indices in the GraphStore in CSR format.
- cuda(device: Optional[Union[int, str]] = None, *args: str, non_blocking: bool = False)
Copies attributes to CUDA memory, either for all attributes or only the ones given in *args.
- detach(*args: str)
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in *args.
- detach_(*args: str)
Detaches attributes from the computation graph, either for all attributes or only the ones given in *args.
- generate_ids()
Generates and sets n_id and e_id attributes to assign each node and edge to a continuously ascending and unique ID.
- get_edge_index(*args, **kwargs) Tuple[Tensor, Tensor]
Synchronously obtains an edge_index tuple from the GraphStore.
- get_tensor(*args, convert_type: bool = False, **kwargs) Union[Tensor, ndarray]
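Synchronously obtains a tensor from the FeatureStore.
- Parameters:
*args – Arguments passed to TensorAttr.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
**kwargs – Keyword arguments passed to TensorAttr.
- Raises:
ValueError – If the input TensorAttr is not fully specified.
- Return type:
Union[Tensor, ndarray]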
- is_coalesced() bool
Returns True if edge indices edge_index are sorted and do not contain duplicate entries.
- property is_cuda: bool
Returns True if any torch.Tensor attribute is stored on the GPU, False otherwise.
- is_sorted(sort_by_row: bool = True) bool
Returns True if edge indices edge_index are sorted.
- Parameters:
sort_by_row (bool, optional) – If set to False, will require column-wise order, i.e. ordering by destination node, of edge_index. (default: True)
- Return type:
bool
- multi_get_tensor(attrs: List[TensorAttr], convert_type: bool = False) List[Union[Tensor, ndarray]]
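Synchronously obtains a list of tensors from the FeatureStore, one for each tensor associated with the attributes in attrs.
Note
The default implementation simply iterates over all calls to get_tensor(). Implementer classes that can provide additional, more performant functionality are recommended to override this method.
- Parameters:
attrs (List[TensorAttr]) – A list of input TensorAttr objects that identify the tensors to obtain.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
- Raises:
ValueError – If any input TensorAttr is not fully specified.
- Return type:
List[Union[Tensor, ndarray]]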
- pin_memory(*args: str)
Copies attributes to pinned memory, either for all attributes or only the ones given in *args.
- put_edge_index(edge_index: Tuple[Tensor, Tensor], *args, **kwargs) bool
Synchronously adds an edge_index tuple to the GraphStore. Returns whether insertion was successful.
- Parameters:
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The edge_index tuple in a format specified in EdgeAttr.
*args – Arguments passed to EdgeAttr.
**kwargs – Keyword arguments passed to EdgeAttr.
- Return type:
bool
- put_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Synchronously adds a tensor to the FeatureStore. Returns whether insertion was successful.
- Parameters:
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
- Raises:
ValueError – If the input TensorAttr is not fully specified.
- Return type:
bool
- record_stream(stream: Stream, *args: str)
Ensures that the tensor memory is not reused for another tensor until all current work queued on stream has been completed, either for all attributes or only the ones given in *args.
- remove_edge_index(*args, **kwargs) bool
Synchronously deletes an edge_index tuple from the GraphStore. Returns whether deletion was successful.
- Parameters:
*args – Arguments passed to EdgeAttr.
**kwargs – Keyword arguments passed to EdgeAttr.
- Return type:
bool
- remove_tensor(*args, **kwargs) bool
Removes a tensor from the FeatureStore. Returns whether deletion was successful.
- Parameters:
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
- Raises:
ValueError – If the input TensorAttr is not fully specified.
- Return type:
bool
- requires_grad_(*args: str, requires_grad: bool = True)
Tracks gradient computation, either for all attributes or only the ones given in *args.
- share_memory_(*args: str)
Moves attributes to shared memory, either for all attributes or only the ones given in *args.
- size(dim: Optional[int] = None) Optional[Union[Tuple[Optional[int], Optional[int]], int]]
Returns the size of the adjacency matrix induced by the graph.
- snapshot(start_time: Union[float, int], end_time: Union[float, int], attr: str = 'time') Self
Returns a snapshot of data to only hold events that occurred in period [start_time, end_time].
- Return type:
Self
- sort(sort_by_row: bool = True) Self
Sorts edge indices edge_index and their corresponding edge features.
- to(device: Union[int, str], *args: str, non_blocking: bool = False)
Performs tensor device conversion, either for all attributes or only the ones given in *args.
- up_to(end_time: Union[float, int]) Self
Returns a snapshot of data to only hold events that occurred up to end_time (inclusive of edge_time).
- Return type:
Self
- update_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Updates a tensor in the FeatureStore with a new value. Returns whether the update was successful.
Note
Implementer classes can choose to define more efficient update methods; the default performs a removal and insertion.
- Parameters:
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
*args – Arguments passed to TensorAttr.
**kwargs – Keyword arguments passed to TensorAttr.
- Return type:
bool
- view(*args, **kwargs) AttrView
Returns a view of the FeatureStore given a not yet fully-specified TensorAttr.
- Return type:
AttrView