哪个 SQL 查询更快? 过滤连接条件还是 Where 子句？

小开

我猜是第一个，因为它对数据进行了更具体的过滤。但是应该看看执行计划和任何优化一样，因为它可以根据数据的大小、服务器硬件等而有很大的不同。

小开

Is it faster? Try it and see.

哪个更容易读？第一个对我来说看起来更“正确”，因为移动的条件实际上与 join 无关。

小开

对于任何值一分钱的查询优化器... ... 它们都是相同的。

小开

最佳答案

在性能方面，它们是相同的(并且产生相同的计划)

逻辑上，如果用 LEFT JOIN替换 INNER JOIN，应该使操作仍然有意义。

在你的情况下，这将是这样的:

SELECT  *
FROM    TableA a
LEFT JOIN
TableXRef x
ON      x.TableAID = a.ID
AND a.ID = 1
LEFT JOIN
TableB b
ON      x.TableBID = b.ID

或者这样:

SELECT  *
FROM    TableA a
LEFT JOIN
TableXRef x
ON      x.TableAID = a.ID
LEFT JOIN
TableB b
ON      b.id = x.TableBID
WHERE   a.id = 1

前一个查询不会返回除 1以外的 a.id的任何实际匹配，因此后一种语法(使用 WHERE)在逻辑上更加一致。

小开

As far as the two methods go.

JOIN/ON 用于连接表
WHERE 用于筛选结果

虽然你可以用不同的方式使用它们，但对我来说，它总是像一种气味。

当性能有问题的时候就处理它，然后你就可以研究这样的“优化”。

小开

这种联接的位置不太可能成为性能的决定性因素。我并不十分熟悉 tsql 的执行计划，但是它们很可能会针对类似的计划自动进行优化。

小开

规则 # 0: 运行一些基准测试，然后看看！要想知道哪个更快，唯一的办法就是试一试。使用 SQL 分析器很容易执行这些类型的基准测试。

Also, examine the execution plan for the query written with a JOIN and with a WHERE clause to see what differences stand out.

最后，正如其他人所说的那样，任何像样的优化器都应该对这两个优化器进行相同的处理，包括内置在 SQLServer 中的优化器。

小开

For inner joins it doesn't matter where you put your criteria. The SQL compiler will transform both into an execution plan in which the filtering occurs below the join (ie. as if the filter expressions appears is in the join condition).

外部联接是另一回事，因为过滤器的位置改变了查询的语义。

小开

在后面他们是一样的。我们知道这一点，因为如果对每个查询执行 explain analyze，计划结果是相同的。举个例子:

# explain analyze select e.* from event e join result r on e.id = r.event_id and r.team_2_score=24;


QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Join  (cost=27.09..38.22 rows=7 width=899) (actual time=0.045..0.047 rows=1 loops=1)
Hash Cond: (e.id = r.event_id)
->  Seq Scan on event e  (cost=0.00..10.80 rows=80 width=899) (actual time=0.009..0.010 rows=2 loops=1)
->  Hash  (cost=27.00..27.00 rows=7 width=8) (actual time=0.017..0.017 rows=1 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 9kB
->  Seq Scan on result r  (cost=0.00..27.00 rows=7 width=8) (actual time=0.006..0.008 rows=1 loops=1)
Filter: (team_2_score = 24)
Rows Removed by Filter: 1
Planning time: 0.182 ms
Execution time: 0.101 ms
(10 rows)


# explain analyze select e.* from event e join result r on e.id = r.event_id where r.team_2_score=24;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Hash Join  (cost=27.09..38.22 rows=7 width=899) (actual time=0.027..0.029 rows=1 loops=1)
Hash Cond: (e.id = r.event_id)
->  Seq Scan on event e  (cost=0.00..10.80 rows=80 width=899) (actual time=0.010..0.011 rows=2 loops=1)
->  Hash  (cost=27.00..27.00 rows=7 width=8) (actual time=0.010..0.010 rows=1 loops=1)
Buckets: 1024  Batches: 1  Memory Usage: 9kB
->  Seq Scan on result r  (cost=0.00..27.00 rows=7 width=8) (actual time=0.006..0.007 rows=1 loops=1)
Filter: (team_2_score = 24)
Rows Removed by Filter: 1
Planning time: 0.140 ms
Execution time: 0.058 ms
(10 rows)

它们具有相同的最小和最大成本以及相同的查询计划。另外，请注意，即使在最上面的查询中，team _ score _ 2也被应用为“ Filter”。