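Select from one table where not in another

I'm trying to find the rows that are in one table but not in another. The two tables are in different databases, and the columns I'm matching on have different names.

I've got a query (code below) that I think probably works, but it's way too slow: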

SELECT `pm`.`id`
FROM `R2R`.`partmaster` `pm`
WHERE NOT EXISTS (
SELECT *
FROM `wpsapi4`.`product_details` `pd`
WHERE `pm`.`id` = `pd`.`part_num`
)

So the query is trying to do the following:

Select all the ids from the R2R.partmaster table that are not in the wpsapi4.product_details table. The columns I'm matching are partmaster.id and product_details.part_num.


You can LEFT JOIN the two tables. If there is no corresponding row in the second table, the values will be NULL.

SELECT id FROM partmaster LEFT JOIN product_details ON (...) WHERE product_details.part_num IS NULL
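A fuller sketch of this anti-join, with the join condition taken from the question (partmaster.id matched against product_details.part_num). It runs against an in-memory SQLite database from Python purely so that it is self-contained; the sample rows are made up, and the R2R / wpsapi4 database prefixes are left out. The SELECT itself should carry over to MySQL unchanged once the prefixes are added back.

```python
import sqlite3

# In-memory SQLite stands in for the two MySQL databases (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partmaster (id TEXT PRIMARY KEY);
    CREATE TABLE product_details (part_num TEXT);
    INSERT INTO partmaster VALUES ('A1'), ('B2'), ('C3');
    INSERT INTO product_details VALUES ('A1'), ('C3');
""")

# Rows of partmaster with no match get NULL for every pd column,
# so filtering on pd.part_num IS NULL keeps only the unmatched ids.
missing = [row[0] for row in conn.execute("""
    SELECT pm.id
    FROM partmaster pm
    LEFT JOIN product_details pd ON pm.id = pd.part_num
    WHERE pd.part_num IS NULL
    ORDER BY pm.id
""")]
print(missing)  # ['B2']
```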

Expanding on Sjoerd's anti-join, you can also use the easy-to-understand SELECT ... WHERE x NOT IN (SELECT ...) pattern.

SELECT pm.id FROM r2r.partmaster pm
WHERE pm.id NOT IN (SELECT pd.part_num FROM wpsapi4.product_details pd)

Note that you only need to use ` backticks on reserved words, names with spaces and such, not with normal column names.

On MySQL 5+ this kind of query runs pretty fast.
On MySQL 3/4 it's slow.

Make sure you have indexes on the fields in question: you need one on pm.id and one on pd.part_num.
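As a rough illustration of adding those two indexes, here is a sketch that again uses in-memory SQLite from Python so that it runs on its own. The index names are hypothetical. In MySQL the CREATE INDEX statements have the same shape, with the table names qualified as r2r.partmaster and wpsapi4.product_details; if id is already the primary key of partmaster, it is already indexed and only part_num needs one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partmaster (id TEXT);
    CREATE TABLE product_details (part_num TEXT);

    -- Index names below are made up for this example.
    CREATE INDEX idx_partmaster_id ON partmaster (id);
    CREATE INDEX idx_product_details_part_num ON product_details (part_num);
""")

# List the indexes that now exist, to confirm both were created.
index_names = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)  # ['idx_partmaster_id', 'idx_product_details_part_num']
```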

There are loads of posts on the web that show how to do this. I've found three ways, the same ones pointed out by Johan and Sjoerd. None of them ran acceptably for me: the queries themselves are fine, but my database isn't behaving properly, and they all ran slowly.

So I worked out another way that someone else may find useful:

The basic gist of it is to create a temporary table and fill it with all the information, then remove all the rows that ARE in the other table.

So I ran these 3 queries, and they finished quickly (in a few moments).

CREATE TEMPORARY TABLE `database1`.`newRows`
SELECT `t1`.`id` AS `columnID`
FROM `database2`.`table` AS `t1`;

CREATE INDEX `columnID` ON `database1`.`newRows` (`columnID`);

DELETE FROM `database1`.`newRows`
WHERE EXISTS (
SELECT `columnID` FROM `database1`.`product_details` WHERE `columnID` = `database1`.`newRows`.`columnID`
);
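Here is the same three-step idea applied to the tables from the question, as a minimal sketch. It uses in-memory SQLite from Python so that it can be run as-is; the names new_rows, part_id and idx_new_rows_part_id are made up for the example, and SQLite needs the AS keyword in CREATE TABLE ... AS SELECT where MySQL lets you omit it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partmaster (id TEXT);
    CREATE TABLE product_details (part_num TEXT);
    INSERT INTO partmaster VALUES ('A1'), ('B2'), ('C3');
    INSERT INTO product_details VALUES ('A1'), ('C3');

    -- 1. Copy every candidate id into a temporary table.
    CREATE TEMPORARY TABLE new_rows AS
        SELECT pm.id AS part_id FROM partmaster pm;

    -- 2. Index the copy so the delete can look rows up quickly.
    CREATE INDEX idx_new_rows_part_id ON new_rows (part_id);

    -- 3. Remove every id that DOES exist in the other table.
    DELETE FROM new_rows
    WHERE EXISTS (
        SELECT 1 FROM product_details pd
        WHERE pd.part_num = new_rows.part_id
    );
""")

# Whatever is left is in partmaster but not in product_details.
remaining = [row[0] for row in conn.execute("SELECT part_id FROM new_rows")]
print(remaining)  # ['B2']
```

Qualifying both sides of the comparison (pd.part_num and new_rows.part_id) avoids any ambiguity about which table an unqualified column name refers to inside the subquery.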

To expand on Johan's answer: if the part_num column in the sub-select can contain NULL values, the NOT IN comparison can never evaluate to true, so the query returns no rows at all.

To correct this, add a null check...

SELECT pm.id FROM r2r.partmaster pm
WHERE pm.id NOT IN
(SELECT pd.part_num FROM wpsapi4.product_details pd
WHERE pd.part_num IS NOT NULL)
  • Sorry but I couldn't add a comment as I don't have the rep!
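To see the NULL problem concretely, here is a small sketch using in-memory SQLite from Python, with made-up sample rows and without the database prefixes. SQLite follows the same three-valued logic as MySQL for NOT IN, so a single NULL in the sub-select makes the unguarded query come back empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partmaster (id TEXT);
    CREATE TABLE product_details (part_num TEXT);
    INSERT INTO partmaster VALUES ('A1'), ('B2');
    INSERT INTO product_details VALUES ('A1'), (NULL);
""")

# Without the guard: 'B2' NOT IN ('A1', NULL) evaluates to NULL, not true.
without_check = conn.execute("""
    SELECT pm.id FROM partmaster pm
    WHERE pm.id NOT IN (SELECT pd.part_num FROM product_details pd)
""").fetchall()

# With the guard: NULLs are filtered out before the comparison.
with_check = conn.execute("""
    SELECT pm.id FROM partmaster pm
    WHERE pm.id NOT IN (
        SELECT pd.part_num FROM product_details pd
        WHERE pd.part_num IS NOT NULL
    )
""").fetchall()

print(without_check)  # []
print(with_check)     # [('B2',)]
```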