How does the SQL EXISTS statement work?

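I'm trying to learn SQL and am having a hard time understanding the EXISTS statement. I came across this passage about EXISTS, and there is something I don't understand:

Using the exists operator, your subquery can return zero, one, or many rows, and the condition simply checks whether the subquery returned any rows. If you look at the select clause of the subquery, you will see that it consists of a single literal (1); since the condition in the containing query only needs to know how many rows have been returned, the actual data the subquery returned is irrelevant.

What I don't understand is how the outer query knows which row the subquery is checking. For example: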

SELECT *
FROM suppliers
WHERE EXISTS (select *
from orders
where suppliers.supplier_id = orders.supplier_id);

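I understand that if the id from the suppliers table matches one in the orders table, the subquery returns true and all the columns from the matching row in the suppliers table are output. What I don't get is how the subquery communicates which specific row should be printed (let's say the row with supplier id 25) if only true or false is being returned.

It appears to me that there is no relationship between the outer query and the subquery.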

EXISTS means that the subquery returns at least one row; that's really it. In this case it is a correlated subquery, because it compares the supplier_id of the outer table with the supplier_id of the inner table. This query says, in effect:

SELECT all suppliers
For each supplier ID, see if an order exists for this supplier
If the supplier is not present in the orders table, remove the supplier from the results
RETURN all suppliers who have corresponding rows in the orders table

You could do the same thing in this case with an INNER JOIN.

SELECT suppliers.*
FROM suppliers
INNER JOIN orders
ON suppliers.supplier_id = orders.supplier_id;

Ponies' comment is correct: you'd need to group the results of that join, or use SELECT DISTINCT, depending on the data you need.
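To make the duplicate-row point concrete, here is a minimal sketch that runs the queries against an in-memory SQLite database through Python's sqlite3 module. The sample rows and supplier names are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical sample data: supplier 25 has two orders, 26 has none, 27 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (25, 'Acme'), (26, 'Globex'), (27, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 27);
""")

# Plain INNER JOIN: one output row per matching order.
join_rows = conn.execute("""
    SELECT suppliers.supplier_id
    FROM suppliers
    INNER JOIN orders ON suppliers.supplier_id = orders.supplier_id
    ORDER BY suppliers.supplier_id
""").fetchall()

# Same join with DISTINCT: duplicates collapsed.
distinct_rows = conn.execute("""
    SELECT DISTINCT suppliers.supplier_id
    FROM suppliers
    INNER JOIN orders ON suppliers.supplier_id = orders.supplier_id
    ORDER BY suppliers.supplier_id
""").fetchall()

# EXISTS: each supplier appears at most once.
exists_rows = conn.execute("""
    SELECT suppliers.supplier_id
    FROM suppliers
    WHERE EXISTS (SELECT * FROM orders
                  WHERE suppliers.supplier_id = orders.supplier_id)
    ORDER BY suppliers.supplier_id
""").fetchall()

print(join_rows)      # [(25,), (25,), (27,)]
print(distinct_rows)  # [(25,), (27,)]
print(exists_rows)    # [(25,), (27,)]
```

With this data, supplier 25 shows up twice in the plain join because it has two orders, while the DISTINCT and EXISTS versions list it once.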

What you describe is a query with a so-called correlated subquery.

In general, it's something you should try to avoid by writing the query with a join instead:

SELECT suppliers.*
FROM suppliers
JOIN orders USING (supplier_id)
GROUP BY suppliers.supplier_id

Otherwise, the subquery will be executed for each row in the outer query.

If you had a where clause that looked like this:

WHERE id in (25,26,27) -- and so on

you can easily understand why some rows are returned and some are not.

When the where clause is like this:

WHERE EXISTS (select * from orders where suppliers.supplier_id = orders.supplier_id);

it just means: return the rows that have an existing record in the orders table with the same id.

It appears to me that there is no relationship between the outer query and the subquery.

What do you think the WHERE clause inside the EXISTS example is doing? How do you come to that conclusion when the SUPPLIERS reference isn't in the FROM or JOIN clauses within the EXISTS clause?

EXISTS evaluates to TRUE/FALSE, and exits as TRUE on the first match of the criteria; this is why it can be faster than IN. Also be aware that the SELECT clause in an EXISTS is ignored, i.e.:

SELECT s.*
FROM SUPPLIERS s
WHERE EXISTS (SELECT 1/0
FROM ORDERS o
WHERE o.supplier_id = s.supplier_id)

...should hit a division by zero error, but it won't. The WHERE clause is the most important piece of an EXISTS clause.
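As a rough check that the select list inside EXISTS does not affect the result, the sketch below runs the same query with three different select lists on an in-memory SQLite database via Python's sqlite3 module. It uses NULL rather than 1/0, because whether 1/0 raises an error outside EXISTS depends on the database engine. The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (25, 'Acme'), (26, 'Globex'), (27, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 27);
""")

results = {}
# The select lists are fixed literals, so formatting them into the SQL is safe here.
for select_list in ("*", "1", "NULL"):
    query = f"""
        SELECT s.supplier_id
        FROM suppliers s
        WHERE EXISTS (SELECT {select_list}
                      FROM orders o
                      WHERE o.supplier_id = s.supplier_id)
        ORDER BY s.supplier_id
    """
    results[select_list] = conn.execute(query).fetchall()

for select_list, rows in results.items():
    print(select_list, rows)
```

All three variants return suppliers 25 and 27: EXISTS only asks whether the subquery produced a row, not what that row contains, so even a row consisting of a single NULL counts.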

Also be aware that a JOIN is not a direct replacement for EXISTS, because there will be duplicate parent records if there's more than one child record associated to the parent.

You can produce identical results using either JOIN, EXISTS, IN, or INTERSECT:

SELECT s.supplier_id
FROM suppliers s
INNER JOIN (SELECT DISTINCT o.supplier_id FROM orders o) o
ON o.supplier_id = s.supplier_id


SELECT s.supplier_id
FROM suppliers s
WHERE EXISTS (SELECT * FROM orders o WHERE o.supplier_id = s.supplier_id)


SELECT s.supplier_id
FROM suppliers s
WHERE s.supplier_id IN (SELECT o.supplier_id FROM orders o)


SELECT s.supplier_id
FROM suppliers s
INTERSECT
SELECT o.supplier_id
FROM orders o
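A quick way to convince yourself that the four forms agree is to run them side by side. This is a sketch using Python's sqlite3 module with an in-memory database and made-up rows (including an order whose supplier_id 99 has no supplier), not a statement about any particular production database.

```python
import sqlite3

# Hypothetical sample data; order 4 references a supplier that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (25, 'Acme'), (26, 'Globex'), (27, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 27), (4, 99);
""")

queries = {
    "JOIN": """
        SELECT s.supplier_id
        FROM suppliers s
        INNER JOIN (SELECT DISTINCT o.supplier_id FROM orders o) o
                ON o.supplier_id = s.supplier_id
    """,
    "EXISTS": """
        SELECT s.supplier_id
        FROM suppliers s
        WHERE EXISTS (SELECT * FROM orders o WHERE o.supplier_id = s.supplier_id)
    """,
    "IN": """
        SELECT s.supplier_id
        FROM suppliers s
        WHERE s.supplier_id IN (SELECT o.supplier_id FROM orders o)
    """,
    "INTERSECT": """
        SELECT s.supplier_id FROM suppliers s
        INTERSECT
        SELECT o.supplier_id FROM orders o
    """,
}

# Sort in Python so the comparison does not depend on each query's row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}

for name, rows in results.items():
    print(name, rows)
```

On this data every variant yields supplier ids 25 and 27; supplier 26 has no orders, and the stray order for 99 has no supplier.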

Think of it this way:

For 'each' row from Suppliers, check if there 'exists' a row in the Orders table that meets the condition Suppliers.supplier_id (this comes from the outer query's current 'row') = Orders.supplier_id. When you find the first matching row, stop right there: the WHERE EXISTS has been satisfied.

The magic link between the outer query and the subquery lies in the fact that Supplier_id gets passed from the outer query to the subquery for each row evaluated.

Or, to put it another way, the subquery is executed for each row of the outer query.

It is NOT the case that the subquery is executed once as a whole, produces a single 'true/false', and that this 'true/false' is then somehow matched against the outer query.
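The per-row model described above can be written out by hand. The sketch below, using Python's sqlite3 module with an in-memory database and made-up rows, loops over the outer rows in Python and passes each supplier_id into the inner query as a parameter. It illustrates the logical meaning only; a real engine is free to choose a different execution plan.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, supplier_id INTEGER);
    INSERT INTO suppliers VALUES (25, 'Acme'), (26, 'Globex'), (27, 'Initech');
    INSERT INTO orders VALUES (1, 25), (2, 25), (3, 27);
""")

# Outer query: every supplier row.
outer_rows = conn.execute(
    "SELECT supplier_id, name FROM suppliers ORDER BY supplier_id"
).fetchall()

kept = []
for supplier_id, name in outer_rows:
    # Inner query: the current outer row's supplier_id is passed in.
    # LIMIT 1 mirrors EXISTS stopping at the first matching row.
    match = conn.execute(
        "SELECT 1 FROM orders WHERE supplier_id = ? LIMIT 1", (supplier_id,)
    ).fetchone()
    if match is not None:  # "exists" is true for this particular supplier row
        kept.append((supplier_id, name))

# The real correlated EXISTS query, for comparison.
exists_rows = conn.execute("""
    SELECT supplier_id, name
    FROM suppliers
    WHERE EXISTS (SELECT 1 FROM orders
                  WHERE orders.supplier_id = suppliers.supplier_id)
    ORDER BY supplier_id
""").fetchall()

print(kept)         # [(25, 'Acme'), (27, 'Initech')]
print(exists_rows)  # [(25, 'Acme'), (27, 'Initech')]
```

The true/false answer is produced separately for each supplier row, which is how the outer query "knows" that row 25 should be printed and row 26 should not.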

Database table model

Let’s assume we have the following two tables in our database, that form a one-to-many table relationship.

The student table is the parent, and the student_grade is the child table since it has a student_id Foreign Key column referencing the id Primary Key column in the student table.

The student table contains the following two records:

| id | first_name | last_name | admission_score |
|----|------------|-----------|-----------------|
| 1  | Alice      | Smith     | 8.95            |
| 2  | Bob        | Johnson   | 8.75            |

And, the student_grade table stores the grades the students received:

| id | class_name | grade | student_id |
|----|------------|-------|------------|
| 1  | Math       | 10    | 1          |
| 2  | Math       | 9.5   | 1          |
| 3  | Math       | 9.75  | 1          |
| 4  | Science    | 9.5   | 1          |
| 5  | Science    | 9     | 1          |
| 6  | Science    | 9.25  | 1          |
| 7  | Math       | 8.5   | 2          |
| 8  | Math       | 9.5   | 2          |
| 9  | Math       | 9     | 2          |
| 10 | Science    | 10    | 2          |
| 11 | Science    | 9.4   | 2          |

SQL EXISTS

Let’s say we want to get all students who have received a grade of 10 in the Math class.

If we are only interested in the student identifier, then we can run a query like this one:

SELECT
    student_grade.student_id
FROM
    student_grade
WHERE
    student_grade.grade = 10 AND
    student_grade.class_name = 'Math'
ORDER BY
    student_grade.student_id

But, the application is interested in displaying the full name of a student, not just the identifier, so we need info from the student table as well.

In order to filter the student records that have a 10 grade in Math, we can use the EXISTS SQL operator, like this:

SELECT
    id, first_name, last_name
FROM
    student
WHERE EXISTS (
    SELECT 1
    FROM
        student_grade
    WHERE
        student_grade.student_id = student.id AND
        student_grade.grade = 10 AND
        student_grade.class_name = 'Math'
)
ORDER BY id

When running the query above, we can see that only the Alice row is selected:

| id | first_name | last_name |
|----|------------|-----------|
| 1  | Alice      | Smith     |

The outer query selects the student row columns we are interested in returning to the client. However, the WHERE clause is using the EXISTS operator with an associated inner subquery.

The EXISTS operator returns true if the subquery returns at least one record and false if no row is selected. The database engine does not have to run the subquery to completion: as soon as a single record is matched, the EXISTS operator returns true, and the associated outer query row is selected.

The inner subquery is correlated because the student_id column of the student_grade table is matched against the id column of the outer student table.
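For readers who want to reproduce the result, here is a minimal sketch that loads the two tables shown above into an in-memory SQLite database through Python's sqlite3 module and runs the EXISTS query. The column types are assumptions, since the original only shows the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the source only lists column names and values.
conn.executescript("""
    CREATE TABLE student (
        id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        admission_score REAL
    );
    CREATE TABLE student_grade (
        id INTEGER PRIMARY KEY,
        class_name TEXT,
        grade REAL,
        student_id INTEGER REFERENCES student(id)
    );
    INSERT INTO student VALUES
        (1, 'Alice', 'Smith', 8.95),
        (2, 'Bob', 'Johnson', 8.75);
    INSERT INTO student_grade VALUES
        (1, 'Math', 10, 1), (2, 'Math', 9.5, 1), (3, 'Math', 9.75, 1),
        (4, 'Science', 9.5, 1), (5, 'Science', 9, 1), (6, 'Science', 9.25, 1),
        (7, 'Math', 8.5, 2), (8, 'Math', 9.5, 2), (9, 'Math', 9, 2),
        (10, 'Science', 10, 2), (11, 'Science', 9.4, 2);
""")

rows = conn.execute("""
    SELECT id, first_name, last_name
    FROM student
    WHERE EXISTS (
        SELECT 1
        FROM student_grade
        WHERE student_grade.student_id = student.id
          AND student_grade.grade = 10
          AND student_grade.class_name = 'Math'
    )
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 'Alice', 'Smith')]
```

Bob also has a grade of 10, but in Science, so the class_name condition inside the subquery filters him out.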