torcharrow.functional.sigrid_hash

torcharrow.functional.sigrid_hash(value_col: NumericalColumn, salt: int, max_value: int)

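Apply hashing to an index or a list of indices. This is a common operation in the recommendation domain, used to produce valid inputs for a reduced-size embedding table.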

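Parameters:
  • value_col (NumericalColumn) – numerical column that defines the indices.

  • salt (int) – value used to initialize the random hashing process.

  • max_value (int) – upper bound of the output range; values are hashed into [0, max_value).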

Examples

>>> import torcharrow as ta
>>> from torcharrow import functional
>>> a = ta.column([1, 2, 3, 5, 8, 10, 11])
>>> functional.sigrid_hash(a, 0, 100)
0  60
1  54
2  54
3   4
4  67
5   2
6  25
dtype: Int64(nullable=True), length: 7, null_count: 0
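
The contract described above (a deterministic, salt-dependent mapping into [0, max_value)) can be illustrated without torcharrow. The following is a minimal sketch only: `toy_salted_hash` is a hypothetical stand-in built on `hashlib`, not the algorithm that `sigrid_hash` uses, so its outputs will not match the values shown above. The embedding table and its width of 4 are likewise assumptions made for illustration.

```python
import hashlib


def toy_salted_hash(value: int, salt: int, max_value: int) -> int:
    """Hypothetical stand-in for illustration; NOT the sigrid_hash algorithm."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Mix the salt into the input so that different salts give different mappings.
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
    # Reduce the first 8 bytes of the digest into the range [0, max_value).
    return int.from_bytes(digest[:8], "big") % max_value


raw_ids = [1, 2, 3, 5, 8, 10, 11]
num_rows = 100  # plays the role of max_value: the row count of the reduced table

hashed = [toy_salted_hash(v, salt=0, max_value=num_rows) for v in raw_ids]

# Assumed toy embedding table: num_rows rows, 4 values per row.
embedding_table = [[0.0] * 4 for _ in range(num_rows)]

# Every hashed index is a valid row, whatever the size of the raw ID space.
looked_up = [embedding_table[i] for i in hashed]
```

Because the output range is usually much smaller than the raw ID space, distinct inputs can map to the same row. The example output above shows this: inputs 2 and 3 both hash to 54.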