dgl.heterograph

dgl.heterograph(data_dict, num_nodes_dict=None, idtype=None, device=None)

Create a heterogeneous graph and return it.

Parameters:

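data_dict (graph data) –
The data dictionary for constructing the heterogeneous graph. Each key is a string triplet (src_type, edge_type, dst_type) that specifies the source node type, the edge type, and the destination node type. Each value is graph data of the form \((U, V)\), where \((U[i], V[i])\) forms the edge with ID \(i\). The allowed graph data formats are:
- (Tensor, Tensor): Each tensor must be a 1D tensor containing node IDs. DGL calls this format "tuple of node-tensors". The tensors should have the same data type, which must be either int32 or int64. They should also have the same device context (see the descriptions of idtype and device below).
- ('coo', (Tensor, Tensor)): Same as (Tensor, Tensor).
- ('csr', (Tensor, Tensor, Tensor)): The three tensors form the CSR representation of the graph's adjacency matrix. The first is the row index pointer. The second is the column indices. The third is the edge IDs, which can be empty (i.e. have 0 elements) to represent consecutive integer IDs starting from 0.
- ('csc', (Tensor, Tensor, Tensor)): The three tensors form the CSC representation of the graph's adjacency matrix. The first is the column index pointer. The second is the row indices. The third is the edge IDs, which can be empty to represent consecutive integer IDs starting from 0.
The tensors can be replaced with any iterable of integers (e.g. list, tuple, numpy.ndarray).
num_nodes_dict (dict[str, int], optional) – The number of nodes for some node types, given as a dictionary that maps a node type \(T\) to the number of \(T\)-typed nodes. If not given for a node type \(T\), DGL finds the largest ID appearing in every graph data whose source or destination node type is \(T\), and sets the number of nodes to that ID plus one. If given and the value is no greater than the largest ID for some node type, DGL raises an error. By default, DGL infers the number of nodes for all node types.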
idtype (int32 or int64, optional) – The data type for storing the structure-related graph information such as node and edge IDs. It should be a framework-specific data type object (e.g., torch.int32). If None (default), DGL infers the ID type from the data_dict argument.
device (device context, optional) – The device of the returned graph, which should be a framework-specific device object (e.g., torch.device). If None (default), DGL uses the device of the tensors in the data_dict argument. If the graph data is not a tuple of node-tensors, the returned graph is on CPU. If the specified device differs from that of the provided tensors, DGL casts the given tensors to the specified device first.
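
To make the relation between the 'csr' format and the \((U, V)\) form described under data_dict more concrete, the following pure-Python sketch expands a CSR triple into source and destination lists ordered by edge ID. It is only an illustration: csr_to_coo is a hypothetical helper and not part of DGL, and it assumes that the rows of the adjacency matrix index source nodes and the columns index destination nodes.

```python
def csr_to_coo(indptr, indices, edge_ids=()):
    """Expand a CSR triple into (U, V) lists ordered by edge ID.

    Hypothetical helper for illustration only; not part of DGL.
    Assumes rows are source nodes and columns are destination nodes.
    """
    # Row r owns the column entries indices[indptr[r]:indptr[r + 1]].
    pairs = []
    for row in range(len(indptr) - 1):
        for pos in range(indptr[row], indptr[row + 1]):
            pairs.append((row, indices[pos]))

    # An empty edge-ID sequence stands for consecutive IDs starting from 0.
    ids = list(edge_ids) if len(edge_ids) > 0 else list(range(len(pairs)))

    # Place each (source, destination) pair at the slot given by its edge ID.
    src = [0] * len(pairs)
    dst = [0] * len(pairs)
    for (u, v), eid in zip(pairs, ids):
        src[eid] = u
        dst[eid] = v
    return src, dst


# Three source nodes: node 0 -> {1, 2}, node 1 -> {2}, node 2 -> {}.
indptr = [0, 2, 3, 3]
indices = [1, 2, 2]

U, V = csr_to_coo(indptr, indices)
print(U, V)  # [0, 0, 1] [1, 2, 2]

# With explicit edge IDs, the same edges are listed in edge-ID order.
U2, V2 = csr_to_coo(indptr, indices, edge_ids=[2, 0, 1])
print(U2, V2)  # [0, 1, 0] [2, 2, 1]
```

Under the same assumption, the 'csc' format is the mirror image: the pointer array runs over destination nodes and the index array holds source nodes.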

Returns:

The created graph.

Return type:

DGLGraph

Notes

If the idtype argument is not given then:
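- in the case of the tuple of node-tensors format, DGL uses the data type of the given ID tensors.
- in the case of the tuple of sequences format, DGL uses int64.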
Once the graph has been created, you can change the data type by using dgl.DGLGraph.long() or dgl.DGLGraph.int().

If the specified idtype argument differs from the data type of the provided tensors, it casts the given tensors to the specified data type first.
The most efficient construction approach is to provide a tuple of node tensors without specifying idtype and device. This is because the returned graph shares the storage with the input node-tensors in this case.
DGL internally maintains multiple copies of the graph structure in different sparse formats and chooses the most efficient one depending on the computation invoked. If memory usage becomes an issue in the case of large graphs, use dgl.DGLGraph.formats() to restrict the allowed formats.
DGL internally decides a deterministic order for the same set of node types and canonical edge types, which does not necessarily follow the order in data_dict.
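
The note on deterministic ordering can be illustrated with a small pure-Python sketch. It assumes that the order is a plain lexicographic sort of the node type names and of the canonical edge type triplets. That assumption and the helper canonical_order are for illustration only and are not a statement of DGL's internal implementation.

```python
def canonical_order(data_dict):
    """Return (node types, canonical edge types) in a deterministic order.

    Hypothetical helper, not DGL code. Assumes lexicographic sorting.
    """
    # Sorting the keys removes any dependence on dict insertion order.
    etypes = sorted(data_dict)
    # Collect every type that appears as a source or a destination.
    ntypes = sorted({t for src, _, dst in data_dict for t in (src, dst)})
    return ntypes, etypes


# The same relations given in two different insertion orders.
a = {
    ('user', 'plays', 'game'): ([0, 3], [3, 4]),
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'follows', 'topic'): ([1, 1], [1, 2]),
}
b = dict(reversed(list(a.items())))

ntypes, etypes = canonical_order(a)
print(ntypes)  # ['game', 'topic', 'user']
print(etypes[0])  # ('user', 'follows', 'topic')
print(canonical_order(a) == canonical_order(b))  # True
```

In practice this means that code should look up types by name rather than rely on the position of a key in data_dict.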

Examples

The following examples use the PyTorch backend.

>>> import dgl
>>> import torch

Create a heterogeneous graph with three canonical edge types.

>>> data_dict = {
...     ('user', 'follows', 'user'): (torch.tensor([0, 1]), torch.tensor([1, 2])),
...     ('user', 'follows', 'topic'): (torch.tensor([1, 1]), torch.tensor([1, 2])),
...     ('user', 'plays', 'game'): (torch.tensor([0, 3]), torch.tensor([3, 4]))
... }
>>> g = dgl.heterograph(data_dict)
>>> g
Graph(num_nodes={'game': 5, 'topic': 3, 'user': 4},
      num_edges={('user', 'follows', 'topic'): 2, ('user', 'follows', 'user'): 2,
                 ('user', 'plays', 'game'): 2},
      metagraph=[('user', 'topic', 'follows'), ('user', 'user', 'follows'),
                 ('user', 'game', 'plays')])

Explicitly specify the number of nodes for each node type in the graph.

>>> num_nodes_dict = {'user': 4, 'topic': 4, 'game': 6}
>>> g = dgl.heterograph(data_dict, num_nodes_dict=num_nodes_dict)
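
The rule that num_nodes_dict follows (largest ID plus one when a type is not given, and an error when a given value does not exceed the largest ID) can be sketched in plain Python. The helper infer_num_nodes below is hypothetical and only mirrors the documented behaviour. It is not DGL's implementation, and the exact exception type DGL raises is not specified here, so ValueError is an assumption.

```python
def infer_num_nodes(data_dict, num_nodes_dict=None):
    """Mirror the documented node-count rule for each node type.

    Hypothetical helper for illustration only; not part of DGL.
    """
    inferred = {}
    for (src_type, _etype, dst_type), (u, v) in data_dict.items():
        # A node type's count is the largest ID seen as source or
        # destination in any relation, plus one.
        for ntype, ids in ((src_type, u), (dst_type, v)):
            largest = max(ids, default=-1)
            inferred[ntype] = max(inferred.get(ntype, 0), largest + 1)

    for ntype, count in (num_nodes_dict or {}).items():
        # A user-supplied count must be strictly greater than the largest ID.
        if count < inferred.get(ntype, 0):
            raise ValueError(
                f"num_nodes_dict[{ntype!r}]={count} is not greater than "
                f"the largest ID {inferred[ntype] - 1}"
            )
        inferred[ntype] = count
    return inferred


# Same edges as the example above, written as plain integer lists.
data_dict = {
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'follows', 'topic'): ([1, 1], [1, 2]),
    ('user', 'plays', 'game'): ([0, 3], [3, 4]),
}

print(infer_num_nodes(data_dict))
# {'user': 4, 'topic': 3, 'game': 5}

print(infer_num_nodes(data_dict, {'user': 4, 'topic': 4, 'game': 6}))
# {'user': 4, 'topic': 4, 'game': 6}
```

Passing a count such as {'game': 4} is rejected in this sketch because game node ID 4 appears in the data, so at least 5 game nodes are needed.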

Create a graph on the first GPU with data type int32.

>>> g = dgl.heterograph(data_dict, idtype=torch.int32, device='cuda:0')