torcharrow.functional.get_jaccard_similarity

torcharrow.functional.get_jaccard_similarity(input_ids: ListColumn, matching_ids: ListColumn)

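Returns the Jaccard similarity between input_ids and matching_ids, computed row by row. The Jaccard similarity is |input_ids.intersect(matching_ids)| / |input_ids.union(matching_ids)|. The example output below is consistent with multiset semantics, in which repeated IDs count toward both the intersection and the union.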

Parameters:
  • input_ids (ListColumn) – the first list of IDs

  • matching_ids (ListColumn) – the second list of IDs

Example

>>> import torcharrow as ta
>>> from torcharrow import functional
>>> input_ids = ta.column([[1, 1, 2, 3],[5,8],[13]])
>>> matching_ids = ta.column([[1,2,3],[2,3],[13,13,13,13,13]])
>>> functional.get_jaccard_similarity(input_ids, matching_ids)
0  0.75
1  0
2  0.2
dtype: Float32(nullable=True), length: 3, null_count: 0
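
The values above can be checked by hand with a small pure-Python sketch. It is not torcharrow's implementation; it assumes multiset semantics (which is what the 0.75 in the first row suggests, since plain sets would give 1.0), and the helper name multiset_jaccard and its handling of two empty lists are hypothetical choices made for illustration.

```python
from collections import Counter


def multiset_jaccard(input_ids, matching_ids):
    """Hypothetical reference helper: Jaccard similarity over multisets.

    Assumption: duplicates count, so the intersection uses the minimum
    count of each ID and the union uses the maximum count.
    """
    a, b = Counter(input_ids), Counter(matching_ids)
    intersection = sum((a & b).values())  # per-ID minimum of counts
    union = sum((a | b).values())         # per-ID maximum of counts
    if union == 0:
        # Assumption: two empty lists are given similarity 0.0.
        return 0.0
    return intersection / union


rows_a = [[1, 1, 2, 3], [5, 8], [13]]
rows_b = [[1, 2, 3], [2, 3], [13, 13, 13, 13, 13]]
scores = [multiset_jaccard(x, y) for x, y in zip(rows_a, rows_b)]
print(scores)  # [0.75, 0.0, 0.2]
```

For the first row, the intersection holds one each of 1, 2, and 3 (size 3) and the union holds two 1s plus 2 and 3 (size 4), giving 3/4 = 0.75. For the third row, the intersection has one 13 and the union has five, giving 0.2.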