连接

描述

SQL连接用于根据连接条件组合来自两个关系的行。以下部分描述了整体连接语法,子部分涵盖不同类型的连接及其示例。

语法

relation { [ join_type ] JOIN [ LATERAL ] relation [ join_criteria ] | NATURAL join_type JOIN [ LATERAL ] relation }

参数

连接类型

内连接

内连接是Spark SQL中的默认连接。它选择在两个关系中具有匹配值的行。

语法:

relation [ INNER ] JOIN relation [ join_criteria ]

左连接

左连接返回左关系中的所有值和右关系中匹配的值,如果没有匹配则追加NULL。它也被称为左外连接。

语法:

relation LEFT [ OUTER ] JOIN relation [ join_criteria ]

右连接

右连接返回右关系中的所有值以及左关系中匹配的值,如果没有匹配则返回 NULL。它也被称为右外连接。

语法:

relation RIGHT [ OUTER ] JOIN relation [ join_criteria ]

完全连接

全连接返回两个关系中的所有值,在没有匹配的一方附加NULL值。它也被称为全外连接。

语法:

relation FULL [ OUTER ] JOIN relation [ join_criteria ]

交叉连接

交叉连接返回两个关系的笛卡尔积。

语法:

relation CROSS JOIN relation [ join_criteria ]

半连接

半连接返回与右侧匹配的关系左侧的值。它也被称为左半连接。

语法:

relation [ LEFT ] SEMI JOIN relation [ join_criteria ]

反连接

反连接返回左侧关系中与右侧没有匹配的值。这也被称为左反连接。

语法:

relation [ LEFT ] ANTI JOIN relation [ join_criteria ]

示例

-- 使用员工和部门表来演示不同类型的连接。
SELECT * FROM employee;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|105|Chloe| 5|
|103| Paul| 3|
|101| John| 1|
|102| Lisa| 2|
|104| Evan| 4|
|106| Amy| 6|
+---+-----+------+
SELECT * FROM department;
+------+-----------+
|deptno| deptname|
+------+-----------+
| 3|Engineering|
| 2| Sales|
| 1| Marketing|
+------+-----------+
-- 使用员工和部门表来演示内连接。
SELECT id, name, employee.deptno, deptname
FROM employee INNER JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------|
| id| name|deptno| deptname|
+---+-----+------+-----------|
|103| Paul| 3|Engineering|
|101| John| 1| Marketing|
|102| Lisa| 2| Sales|
+---+-----+------+-----------|
-- 使用员工和部门表来演示左连接。
SELECT id, name, employee.deptno, deptname
FROM employee LEFT JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------|
| id| name|deptno| deptname|
+---+-----+------+-----------|
|105|Chloe| 5| NULL|
|103| Paul| 3|Engineering|
|101| John| 1| Marketing|
|102| Lisa| 2| Sales|
|104| Evan| 4| NULL|
|106| Amy| 6| NULL|
+---+-----+------+-----------|
-- 使用员工和部门表来演示右连接。
SELECT id, name, employee.deptno, deptname
FROM employee RIGHT JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------|
| id| name|deptno| deptname|
+---+-----+------+-----------|
|103| Paul| 3|Engineering|
|101| John| 1| Marketing|
|102| Lisa| 2| Sales|
+---+-----+------+-----------|
-- 使用员工和部门表来演示全连接。
SELECT id, name, employee.deptno, deptname
FROM employee FULL JOIN department ON employee.deptno = department.deptno;
+---+-----+------+-----------|
| id| name|deptno| deptname|
+---+-----+------+-----------|
|101| John| 1| Marketing|
|106| Amy| 6| NULL|
|103| Paul| 3|Engineering|
|105|Chloe| 5| NULL|
|104| Evan| 4| NULL|
|102| Lisa| 2| Sales|
+---+-----+------+-----------|
-- 使用员工和部门表来演示笛卡尔连接。
SELECT id, name, employee.deptno, deptname FROM employee CROSS JOIN department;
+---+-----+------+-----------|
| id| name|deptno| deptname|
+---+-----+------+-----------|
|105|Chloe| 5|Engineering|
|105|Chloe| 5| Marketing|
|105|Chloe| 5| Sales|
|103| Paul| 3|Engineering|
|103| Paul| 3| Marketing|
|103| Paul| 3| Sales|
|101| John| 1|Engineering|
|101| John| 1| Marketing|
|101| John| 1| Sales|
|102| Lisa| 2|Engineering|
|102| Lisa| 2| Marketing|
|102| Lisa| 2| Sales|
|104| Evan| 4|Engineering|
|104| Evan| 4| Marketing|
|104| Evan| 4| Sales|
|106| Amy| 4|Engineering|
|106| Amy| 4| Marketing|
|106| Amy| 4| Sales|
+---+-----+------+-----------|
-- 使用员工和部门表来演示半连接。
SELECT * FROM employee SEMI JOIN department ON employee.deptno = department.deptno;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|103| Paul| 3|
|101| John| 1|
|102| Lisa| 2|
+---+-----+------+
-- 使用员工和部门表来演示反连接。
SELECT * FROM employee ANTI JOIN department ON employee.deptno = department.deptno;
+---+-----+------+
| id| name|deptno|
+---+-----+------+
|105|Chloe| 5|
|104| Evan| 4|
|106| Amy| 6|
+---+-----+------+