dgl.knn_graph

dgl.knn_graph(x, k, algorithm='bruteforce-blas', dist='euclidean', exclude_self=False)[source]

Construct a graph from a set of points according to k-nearest-neighbor (KNN) and return.

The function transforms the coordinates/features of a point set into a directed homogeneous graph. The coordinates of the point set are specified as a matrix whose rows correspond to points and whose columns correspond to coordinate/feature dimensions.

The nodes of the returned graph correspond to the points, where the predecessors of each point are its k-nearest neighbors measured by the chosen distance.

If x is a 3D tensor, then each submatrix will be transformed into a separate graph. DGL then composes these graphs into a large batched graph of multiple (\(shape(x)[0]\)) connected components.

See the benchmark for complete benchmark results.

Parameters:
  • x (Tensor) –

    The point coordinates. It can be either on CPU or GPU.

    • If 2D, x[i] corresponds to the i-th node in the KNN graph.

    • If 3D, x[i] corresponds to the i-th KNN graph and x[i][j] corresponds to the j-th node in the i-th KNN graph.

  • k (int) – The number of nearest neighbors per node.

  • algorithm (str, optional) –

    Algorithm used to compute the k-nearest neighbors.

    • ‘bruteforce-blas’ will first compute the distance matrix using the BLAS matrix multiplication operations provided by the backend framework, and then use a top-k selection to get the k-nearest neighbors. This method is fast when the point set is small but has \(O(N^2)\) memory complexity where \(N\) is the number of points.

    • ’bruteforce’ will compute distances pair by pair and directly select the k-nearest neighbors during distance computation. This method is slower than ‘bruteforce-blas’ but has less memory overhead (i.e., \(O(Nk)\) where \(N\) is the number of points, \(k\) is the number of nearest neighbors per node) since we do not need to store all distances.

    • ‘bruteforce-sharemem’ (CUDA only) is similar to ‘bruteforce’ but uses shared memory on CUDA devices as a buffer. This method is faster than ‘bruteforce’ when the dimension of the input points is not large. This method is only available on CUDA devices.

    • ‘kd-tree’ will use the kd-tree algorithm (CPU only). This method is suitable for low-dimensional data (e.g., 3D point clouds).

    • ‘nn-descent’ is an approximate approach from the paper Efficient k-nearest neighbor graph construction for generic similarity measures. This method searches for nearest-neighbor candidates among “neighbors’ neighbors”.

    (default: ‘bruteforce-blas’)

  • dist (str, optional) –

    The distance metric used to compute distance between points. It can be one of the following metrics:

    • ‘euclidean’: Use Euclidean distance (L2 norm) \(\sqrt{\sum_{i} (x_{i} - y_{i})^{2}}\).

    • ‘cosine’: Use cosine distance.

    (default: ‘euclidean’)

  • exclude_self (bool, optional) – If True, the output graph will not contain self loop edges, and each node will not be counted as one of its own k neighbors. If False, the output graph will contain self loop edges, and a node will be counted as one of its own k neighbors.
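
As a minimal sketch of how the algorithm-related keyword arguments fit together (the point-cloud shape, the value of k, and the availability of a CUDA device below are illustrative assumptions, not requirements of the API):

>>> import dgl
>>> import torch
>>> pts = torch.rand(1000, 3)  # hypothetical 3D point cloud on CPU
>>> g_cpu = dgl.knn_graph(pts, 8, algorithm='kd-tree')  # kd-tree suits low-dimensional data
>>> g_gpu = dgl.knn_graph(pts.to('cuda'), 8,            # assumes a CUDA-enabled build of DGL
...                       algorithm='bruteforce-sharemem')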

Returns:

The constructed graph. The node IDs are in the same order as x.

Return type:

DGLGraph

Examples

The following examples use the PyTorch backend.

>>> import dgl
>>> import torch

When x is a 2D tensor, a single KNN graph is constructed.

>>> x = torch.tensor([[0.0, 0.0, 1.0],
...                   [1.0, 0.5, 0.5],
...                   [0.5, 0.2, 0.2],
...                   [0.3, 0.2, 0.4]])
>>> knn_g = dgl.knn_graph(x, 2)  # Each node has two predecessors
>>> knn_g.edges()
(tensor([0, 1, 2, 2, 2, 3, 3, 3]), tensor([0, 1, 1, 2, 3, 0, 2, 3]))
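
Because each node keeps exactly k predecessors, every node in the graph above has in-degree 2, which can be verified directly (a small sanity check; DGLGraph.in_degrees returns the in-degree of every node):

>>> knn_g.in_degrees()
tensor([2, 2, 2, 2])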

When x is a 3D tensor, DGL constructs multiple KNN graphs and then composes them into a single graph with multiple connected components.

>>> x1 = torch.tensor([[0.0, 0.0, 1.0],
...                    [1.0, 0.5, 0.5],
...                    [0.5, 0.2, 0.2],
...                    [0.3, 0.2, 0.4]])
>>> x2 = torch.tensor([[0.0, 1.0, 1.0],
...                    [0.3, 0.3, 0.3],
...                    [0.4, 0.4, 1.0],
...                    [0.3, 0.8, 0.2]])
>>> x = torch.stack([x1, x2], dim=0)
>>> knn_g = dgl.knn_graph(x, 2)  # Each node has two predecessors
>>> knn_g.edges()
(tensor([0, 1, 2, 2, 2, 3, 3, 3, 4, 5, 5, 5, 6, 6, 7, 7]),
 tensor([0, 1, 1, 2, 3, 0, 2, 3, 4, 5, 6, 7, 4, 6, 5, 7]))
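
The remaining keyword arguments compose in the same way. As a minimal sketch (the resulting edges depend on the data, so no output is shown here), self loops can be excluded and the cosine metric selected in the same call:

>>> # each node's 2 predecessors are now guaranteed to be nodes other than itself
>>> knn_g = dgl.knn_graph(x, 2, dist='cosine', exclude_self=True)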