cudf.core.column.string.StringMethods.character_tokenize
- StringMethods.character_tokenize() -> SeriesOrIndex
Each string is split into individual characters. The returned sequence contains each character as a separate string.
- Returns:
- Series or Index of object.
Examples
>>> import cudf
>>> data = ["hello world", None, "goodbye, thank you."]
>>> ser = cudf.Series(data)
>>> ser.str.character_tokenize()
0    h
0    e
0    l
0    l
0    o
0     
0    w
0    o
0    r
0    l
0    d
2    g
2    o
2    o
2    d
2    b
2    y
2    e
2    ,
2     
2    t
2    h
2    a
2    n
2    k
2     
2    y
2    o
2    u
2    .
dtype: object
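The output above suggests two behaviors worth spelling out: every character keeps the index label of the row it came from, and the null row (label 1) contributes no characters at all. The following CPU-only sketch mimics that behavior in plain Python so it can be run without a GPU. The helper name `character_tokenize_reference` is hypothetical and is not part of cuDF; it assumes a default integer index and is only an illustration of the semantics, not how cuDF implements the method.

```python
from typing import Iterable, List, Optional, Tuple


def character_tokenize_reference(
    values: Iterable[Optional[str]],
) -> List[Tuple[int, str]]:
    """Hypothetical CPU illustration: return (row_label, character) pairs.

    Assumes a default 0..n-1 integer index, as in the example above.
    """
    pairs: List[Tuple[int, str]] = []
    for label, value in enumerate(values):
        if value is None:
            # Null rows yield no characters, so their label never appears.
            continue
        for ch in value:
            # Each character becomes its own entry, tagged with its source row.
            pairs.append((label, ch))
    return pairs


data = ["hello world", None, "goodbye, thank you."]
pairs = character_tokenize_reference(data)
print(pairs[:3])   # first three (label, character) pairs
print(len(pairs))  # total number of characters emitted
```

Here the result has 30 entries (11 from row 0 and 19 from row 2), matching the number of rows in the cuDF output shown above.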