连接表上带有 LIMIT1的 MySQLJOIN

我想连接两个表,但是每个表1上的表2只有1条记录

例如:

SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id

这样我就能拿到 products的所有记录,这不是我想要的。我希望每个类别有1个(第一个)产品(我在 products 字段中有一个 sort列)。

我该怎么做呢?

177374 次浏览

Assuming you want product with MIN()imial value in sort column, it would look something like this.

SELECT
c.id, c.title, p.id AS product_id, p.title
FROM
categories AS c
INNER JOIN (
SELECT
p.id, p.category_id, p.title
FROM
products AS p
CROSS JOIN (
SELECT p.category_id, MIN(sort) AS sort
FROM products
GROUP BY category_id
) AS sq USING (category_id)
) AS p ON c.id = p.category_id

What about this?

SELECT c.id, c.title, (SELECT id from products AS p
WHERE c.id = p.category_id
ORDER BY ...
LIMIT 1)
FROM categories AS c;

I would try something like this:

SELECT C.*,
(SELECT P.id, P.title
FROM products as P
WHERE P.category_id = C.id
LIMIT 1)
FROM categories C

I like more another approach described in a similar question: https://stackoverflow.com/a/11885521/2215679

This approach is better especially in case if you need to show more than one field in SELECT. To avoid Error Code: 1241. Operand should contain 1 column(s) or double sub-select for each column.

For your situation the Query should looks like (this query also will work in PostgresQL and it is pretty fast, see my update below):

SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = (                                 --- the PRIMARY KEY
SELECT p1.id FROM products AS p1
WHERE c.id=p1.category_id
ORDER BY p1.id LIMIT 1
)

PS. I did the performance test of the query vs other proposed here, and this query is the best option yet!

UPDATE (2022-07-20, PostgresSQL)

I'm not working with mySQL for a while already, so, I decided to test the performance of my solution (which actually works perfect in both MySQL and PostgresQL) with solution provided by @Gravy in PostgresQL v.12.9.

For that I decided to create a dummy tables and data with 100 categories and 100000 products. You can check the code on this gist

I run my query above and it took only 13ms to run.

After I slightly modified (for postgres) the query from @Gravy:

SELECT
id,
category_title,
(array_agg(product_title))[1]
FROM
(SELECT c.id, c.title AS category_title, p.id AS product_id, p.title AS product_title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id, category_title;

and run it too. It took more than 150ms in my machine. Which is >10x times slower.

In defense of @gravy's solution, I agree with n+1 problem. But, in this particular case, usually the number of products is way larger than categories. So, running through each category is way less expensive than running through each product as in @Gravy's query.

By the way, if your table has 1mln products with 100 categories, the speed of my query is still the same (between 9-17ms), but the query from [@Gravy] takes more than 2 seconds to run

In resume, at this moment, my query is the most performant and optimal solution for the current task.

Feel free to comment.

SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
GROUP BY c.id

This will return the first data in products (equals limit 1)

Accepted answer by @goggin13 looks wrong. Other solutions provided to-date will work, but suffer from the n+1 problem and as such, suffer a performance hit.

n+1 problem: If there are 100 categories, then we would have to do 1 select to get the categories, then for each of the 100 categories returned, we would need to do a select to get the products in that category. So 101 SELECT queries would be performed.

My alternative solution solves the n+1 problem and consequently should be significantly more performant as only 2 selects are being performed.

SELECT
*
FROM
(SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id;

Replace the tables with yours:

SELECT * FROM works w
LEFT JOIN
(SELECT photoPath, photoUrl, videoUrl FROM workmedias LIMIT 1) AS wm ON wm.idWork = w.idWork

The With clause would do the trick. Something like this:

WITH SELECTION AS (SELECT id FROM products LIMIT 1)
SELECT a.id, c.id, c.title FROM selection a JOIN categories c ON (c.id = a.id);

When using postgres you can use the DISTINCT ON syntex to limit the number of columns returned from either table.

Here is a sample of the code:

SELECT c.id, c.title, p.id AS product_id, p.title FROM categories AS c JOIN ( SELECT DISTINCT ON(p1.id) id, p1.title, p1.category_id FROM products p1 ) p ON (c.id = p.category_id)
The trick is not to join directly on the table with multiple occurrences of the id, rather, first create a table with only a single occurrence for each id

Another example with 3 nested tables: 1/ User 2/ UserRoleCompanie 3/ Companie

  • 1 user has many UserRoleCompanie.
  • 1 UserRoleCompanie has 1 user and 1 Company
  • 1 Companie has many UserRoleCompanie
SELECT
u.id as userId,
u.firstName,
u.lastName,
u.email,
urc.id ,
urc.companieRole,
c.id as companieId,
c.name as companieName
FROM User as u
JOIN UserRoleCompanie as urc ON u.id = urc.userId
AND urc.id = (
SELECT urc2.id
FROM UserRoleCompanie urc2
JOIN Companie ON urc2.companieId = Companie.id
AND urc2.userId = u.id
AND Companie.isPersonal = false
order by Companie.createdAt DESC
        

limit 1
)
    

LEFT JOIN Companie as c ON urc.companieId = c.id
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| userId                    | firstName | lastName           | email                     | id                        | companieRole | companieId                | companieName      |
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| cjjt9s9iw037f0748raxmnnde | henry     | pierrot            | henry@gmail.com           | cjtuflye81dwt0748e4hnkiv0 | OWNER        | cjtuflye71dws0748r7vtuqmg | leclerc           |

In my opinion, this is the best answer (making it general):

SELECT
TB1.Id
FROM Table1 AS TB1
INNER JOIN Table2 AS TB2 ON (TB1.Id = TB2.Id_TB1)
AND TB2.Id = (
SELECT Id
FROM Table2
WHERE TB1.Id = Id_TB1
ORDER BY Table2.Id DESC
LIMIT 1
)
SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = (
SELECT MIN(p1.id) FROM products AS p1
WHERE c.id=p1.category_id
)

Using MIN or MAX in a subquery will make your query run much faster.