dgl.distributed.sample_etype_neighbors

dgl.distributed.sample_etype_neighbors(g, nodes, fanout, edge_dir='in', prob=None, replace=False, etype_sorted=True, use_graphbolt=False)

Sample from the neighbors of the given nodes in a distributed graph.

For each node, a number of inbound (or outbound when edge_dir == 'out') edges will be randomly chosen. The returned graph will contain all the nodes in the original graph, but only the sampled edges.

Node/edge features are not preserved. The original IDs of the sampled edges are stored as the dgl.EID feature in the returned graph.

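This function assumes the input is a homogeneous DGLGraph whose edges are ordered by edge type. The sampled subgraph is also stored in the homogeneous graph format. That is, all nodes and edges are assigned unique IDs (in contrast, a node or an edge in a DGLGraph is usually identified by a type name together with a node/edge ID). We refer to this kind of ID as a homogeneous ID. Users can call dgl.distributed.GraphPartitionBook.map_to_per_ntype() and dgl.distributed.GraphPartitionBook.map_to_per_etype() to recover the node/edge type and the node/edge ID within that type.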
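To picture the homogeneous ID scheme, here is a minimal sketch that assumes each edge type occupies one contiguous range of homogeneous edge IDs. That layout is a simplifying assumption for illustration only; in a real partitioned graph the mapping is performed by the partition book methods named above. The names `ETYPES`, `OFFSETS` and `to_per_etype` are hypothetical and are not part of DGL.

```python
from bisect import bisect_right

# Hypothetical layout (an assumption, not taken from DGL):
# homogeneous edge IDs 0-3 are "follows", 4-9 are "likes", 10-11 are "plays".
# OFFSETS[i] is the first homogeneous ID of ETYPES[i]; the last entry is
# the total number of edges.
ETYPES = ["follows", "likes", "plays"]
OFFSETS = [0, 4, 10, 12]


def to_per_etype(homo_eid):
    """Map a homogeneous edge ID to (edge type name, per-type edge ID)."""
    if not 0 <= homo_eid < OFFSETS[-1]:
        raise ValueError(f"edge ID {homo_eid} is out of range")
    # Index of the last offset that is <= homo_eid.
    type_idx = bisect_right(OFFSETS, homo_eid) - 1
    return ETYPES[type_idx], homo_eid - OFFSETS[type_idx]


print(to_per_etype(5))  # ('likes', 1)
```

The same idea applies to node IDs, with one offset table per node type ordering.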

Parameters:
  • g (DistGraph) – The distributed graph.

  • nodes (tensor or dict) – Node IDs to sample neighbors from. If it’s a dict, it should contain only one key-value pair to make this API consistent with dgl.sampling.sample_neighbors.

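  • fanout (int or dict[etype, int]) –

    The number of edges to sample per node for each edge type. If an integer is given, DGL assumes the same fanout applies to every edge type.

    If -1 is given, all the neighbors are selected.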

  • edge_dir (str, optional) –

    Determines whether to sample inbound or outbound edges.

    Can take either 'in' for inbound edges or 'out' for outbound edges.

  • prob (str, optional) –

    Feature name used as the (unnormalized) probabilities associated with each neighboring edge of a node. The feature must have only one element for each edge.

    The features must be non-negative floats, and the sum of the features of inbound/outbound edges for every node must be positive (though they don’t have to sum up to one). Otherwise, the result will be undefined.

  • replace (bool, optional) –

    If True, sample with replacement.

    When sampling with replacement, the sampled subgraph could have parallel edges.

    For sampling without replacement, if fanout > the number of neighbors, all the neighbors are sampled. If fanout == -1, all neighbors are collected.

  • etype_sorted (bool, optional) – Indicates whether the edge types are sorted.

  • use_graphbolt (bool, optional) – Whether to use GraphBolt for sampling.
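The interaction between fanout, prob and replace described above can be sketched in plain Python for a single node. This is an illustration of the documented semantics, not DGL's implementation: the helper names `normalize_fanout`, `sample_one_etype` and `sample_node` are hypothetical, and the handling of zero-weight edges when sampling without replacement (stop early once only zero-weight edges remain) is an assumption, since the text above does not specify it.

```python
import random


def normalize_fanout(fanout, etypes):
    """Expand an integer fanout into one entry per edge type."""
    if isinstance(fanout, int):
        return {etype: fanout for etype in etypes}
    return dict(fanout)


def sample_one_etype(edge_ids, fanout, weights=None, replace=False, rng=random):
    """Pick edge IDs from one node's edges of a single edge type."""
    edge_ids = list(edge_ids)
    if fanout == -1 or not edge_ids:
        return edge_ids  # -1 selects every neighbor
    weights = [1.0] * len(edge_ids) if weights is None else list(weights)
    if len(weights) != len(edge_ids):
        raise ValueError("need exactly one weight per edge")
    # Weights are unnormalized: non-negative, with a positive sum.
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if replace:
        # With replacement the same edge can be drawn more than once.
        return rng.choices(edge_ids, weights=weights, k=fanout)
    if fanout >= len(edge_ids):
        return edge_ids  # fewer neighbors than fanout: keep them all
    # Without replacement: draw one edge, remove it from the pool, repeat.
    pool = list(zip(edge_ids, weights))
    chosen = []
    for _ in range(fanout):
        pool_weights = [w for _, w in pool]
        if sum(pool_weights) <= 0:
            break  # assumption: never pick edges whose weight is zero
        idx = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        chosen.append(pool.pop(idx)[0])
    return chosen


def sample_node(edges_by_etype, fanout, replace=False, rng=random):
    """Apply the per-edge-type fanout to every edge type of one node."""
    per_type = normalize_fanout(fanout, edges_by_etype)
    return {
        etype: sample_one_etype(eids, per_type[etype], replace=replace, rng=rng)
        for etype, eids in edges_by_etype.items()
    }


rng = random.Random(0)
edges = {"follows": [0, 1, 2, 3], "likes": [4, 5]}
picked = sample_node(edges, fanout=3, rng=rng)
# "follows" yields 3 distinct edges; "likes" has only 2, so both are kept.
print(picked)
```

Passing a dict such as `{"follows": 1, "likes": -1}` as `fanout` would sample one "follows" edge and keep every "likes" edge.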

Returns:

A sampled subgraph that contains only the sampled neighboring edges. It is on the CPU.

Return type:

DGLGraph
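A call might look like the following sketch. It assumes `g` is a DistGraph that has already been created in an initialized distributed job and that `seeds` is a tensor of homogeneous node IDs; the wrapper name `sample_and_resolve` is hypothetical. The function is only defined here, not invoked, because running it requires a live distributed setup.

```python
def sample_and_resolve(g, seeds, fanout):
    """Sample inbound edges for `seeds` and recover their edge types.

    Assumes `g` is an existing dgl.distributed.DistGraph and `seeds`
    holds homogeneous node IDs.
    """
    import dgl  # imported lazily so this sketch loads without DGL installed

    sub = dgl.distributed.sample_etype_neighbors(g, seeds, fanout, edge_dir="in")
    src, dst = sub.edges()          # endpoints, as homogeneous node IDs
    homo_eids = sub.edata[dgl.EID]  # original (homogeneous) edge IDs
    # Translate homogeneous edge IDs into (edge type ID, per-type edge ID).
    gpb = g.get_partition_book()
    etype_ids, per_type_eids = gpb.map_to_per_etype(homo_eids)
    return src, dst, etype_ids, per_type_eids
```

Because node and edge features are not preserved in the returned graph, any features needed downstream have to be fetched from `g` using the returned IDs.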