cdlib.evaluation.adjusted_rand_index

cdlib.evaluation.adjusted_rand_index(first_partition: object, second_partition: object) → MatchingResult

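Rand index adjusted for chance.

The Rand Index (RI) measures the similarity between two clusterings by considering all pairs of samples and counting the pairs that are assigned to the same cluster or to different clusters in both the predicted and the true clustering.

The raw RI score is then "adjusted for chance" into the ARI score using the following scheme: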

ARI = (RI - Expected_RI) / (max(RI) - Expected_RI)

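The adjusted Rand index is therefore close to 0.0 for random labelings, regardless of the number of clusters and samples, and exactly 1.0 when the two clusterings are identical (up to a permutation of the labels).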

ARI is a symmetric measure:

adjusted_rand_index(a, b) == adjusted_rand_index(b, a)
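The adjustment scheme above can be sketched in plain Python. This is a minimal illustration of the pair-counting formulation from Hubert and Arabie, not the cdlib implementation; the function name `adjusted_rand_index_from_labels` is hypothetical, and the inputs are assumed to be flat label lists (one label per sample, same sample order in both lists).

```python
from collections import Counter
from math import comb


def adjusted_rand_index_from_labels(labels_a, labels_b):
    """ARI of two flat labelings of the same samples (illustrative helper)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_a)
    if n < 2:
        # No sample pairs to compare; treat the labelings as identical.
        return 1.0

    # Contingency counts: samples sharing each (label_a, label_b) combination.
    joint = Counter(zip(labels_a, labels_b))
    sizes_a = Counter(labels_a)
    sizes_b = Counter(labels_b)

    # Pairs placed together in both labelings, in a only, and in b only.
    index = sum(comb(c, 2) for c in joint.values())
    pairs_a = sum(comb(c, 2) for c in sizes_a.values())
    pairs_b = sum(comb(c, 2) for c in sizes_b.values())

    expected = pairs_a * pairs_b / comb(n, 2)  # expected index under chance
    maximum = (pairs_a + pairs_b) / 2          # upper bound of the index
    if maximum == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (index - expected) / (maximum - expected)


a = [0, 0, 1, 1]
b = [1, 1, 0, 0]  # same grouping as a, labels permuted
c = [0, 0, 1, 2]

print(adjusted_rand_index_from_labels(a, b))
print(adjusted_rand_index_from_labels(a, c) == adjusted_rand_index_from_labels(c, a))
```

The first call compares two labelings that differ only by a renaming of the clusters, and the second checks that swapping the arguments does not change the score.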

Parameters:

first_partition – NodeClustering object
second_partition – NodeClustering object
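A pair-counting index compares per-node labels, whereas a clustering is usually stored as a list of communities. The sketch below shows one way to go from one form to the other. It is an assumption-laden illustration rather than cdlib's own code: `communities_to_labels` is a hypothetical helper, the partitions are plain lists of node lists, and both are assumed to be non-overlapping and to cover the same node set.

```python
def communities_to_labels(communities, nodes):
    """Map each node to the index of its community (hypothetical helper)."""
    membership = {}
    for community_id, community in enumerate(communities):
        for node in community:
            if node in membership:
                raise ValueError(f"node {node!r} belongs to more than one community")
            membership[node] = community_id
    # One label per node, in a node order shared by both partitions.
    # A node missing from the partition raises KeyError here.
    return [membership[node] for node in nodes]


# Two toy partitions of the same five nodes (illustrative data).
first = [[0, 1, 2], [3, 4]]
second = [[3, 4], [0, 1], [2]]
nodes = sorted({node for community in first for node in community})

labels_first = communities_to_labels(first, nodes)
labels_second = communities_to_labels(second, nodes)
print(labels_first)
print(labels_second)
```

Using a single shared node order for both partitions is what makes the two label lists comparable position by position.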

Returns:

MatchingResult object

Example:

>>> from cdlib import evaluation, algorithms
>>> import networkx as nx
>>> g = nx.karate_club_graph()
>>> louvain_communities = algorithms.louvain(g)
>>> leiden_communities = algorithms.leiden(g)
>>> evaluation.adjusted_rand_index(louvain_communities, leiden_communities)

Reference:

Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193-218.