pyspark.sql.Window.orderBy

static Window.orderBy(*cols: Union[ColumnOrName, List[ColumnOrName_]]) → WindowSpec

Creates a WindowSpec with the ordering defined.

New in version 1.4.0.

Parameters
cols str, Column or list

Names of columns or expressions to order by.

Returns
WindowSpec

A WindowSpec with the ordering defined.

Examples

>>> from pyspark.sql import Window
>>> from pyspark.sql.functions import row_number
>>> df = spark.createDataFrame(
...      [(1, "a"), (1, "a"), (2, "a"), (1, "b"), (2, "b"), (3, "b")], ["id", "category"])
>>> df.show()
+---+--------+
| id|category|
+---+--------+
|  1|       a|
|  1|       a|
|  2|       a|
|  1|       b|
|  2|       b|
|  3|       b|
+---+--------+

Show the row number of each row, ordered by category within each id partition.

>>> window = Window.partitionBy("id").orderBy("category")
>>> df.withColumn("row_number", row_number().over(window)).show()
+---+--------+----------+
| id|category|row_number|
+---+--------+----------+
|  1|       a|         1|
|  1|       a|         2|
|  1|       b|         3|
|  2|       a|         1|
|  2|       b|         2|
|  3|       b|         1|
+---+--------+----------+
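
To make the numbering in the result above easier to follow, here is a minimal pure-Python sketch of what row_number() computes over a window partitioned by id and ordered by category. It is only an illustration of the semantics, not how Spark evaluates the window; the helper name row_numbers is hypothetical, and the sketch assumes the same six input rows. Note that Spark does not guarantee the relative order of tied rows (the two (1, "a") rows here) or the order in which partitions appear in the output.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as the DataFrame in the example: (id, category).
rows = [(1, "a"), (1, "a"), (2, "a"), (1, "b"), (2, "b"), (3, "b")]


def row_numbers(rows):
    """Hypothetical helper: emulate row_number() over
    Window.partitionBy("id").orderBy("category") on plain tuples."""
    # Sort by the partition key (id), then by the ordering key (category).
    ordered = sorted(rows, key=itemgetter(0, 1))
    result = []
    # Each run of equal ids is one partition; numbering restarts at 1.
    for _, partition in groupby(ordered, key=itemgetter(0)):
        for n, (id_, category) in enumerate(partition, start=1):
            result.append((id_, category, n))
    return result


numbered = row_numbers(rows)
for row in numbered:
    print(row)
```

The counter restarts for every distinct id, which is why id 1 receives numbers 1 through 3 while id 3 receives only 1.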