pyspark.sql.Window.orderBy
static Window.orderBy(*cols: Union[ColumnOrName, List[ColumnOrName_]]) → WindowSpec
Creates a WindowSpec with the ordering defined.
New in version 1.4.0.
Parameters
    cols : str, Column or list
        Names of columns or expressions.
Returns
    WindowSpec
        A WindowSpec with the ordering defined.
Examples
>>> from pyspark.sql import Window
>>> from pyspark.sql.functions import row_number
>>> df = spark.createDataFrame(
...     [(1, "a"), (1, "a"), (2, "a"), (1, "b"), (2, "b"), (3, "b")], ["id", "category"])
>>> df.show()
+---+--------+
| id|category|
+---+--------+
|  1|       a|
|  1|       a|
|  2|       a|
|  1|       b|
|  2|       b|
|  3|       b|
+---+--------+
Show the row number ordered by category within each id partition.

>>> window = Window.partitionBy("id").orderBy("category")
>>> df.withColumn("row_number", row_number().over(window)).show()
+---+--------+----------+
| id|category|row_number|
+---+--------+----------+
|  1|       a|         1|
|  1|       a|         2|
|  1|       b|         3|
|  2|       a|         1|
|  2|       b|         2|
|  3|       b|         1|
+---+--------+----------+