How to select the first row for each group in MySQL?

In C# it would be like this:

table
.GroupBy(row => row.SomeColumn)
.Select(group => group
.OrderBy(row => row.AnotherColumn)
.First()
)

LINQ to SQL translates it to the following T-SQL code:

SELECT [t3].[AnotherColumn], [t3].[SomeColumn]
FROM (
SELECT [t0].[SomeColumn]
FROM [Table] AS [t0]
GROUP BY [t0].[SomeColumn]
) AS [t1]
OUTER APPLY (
SELECT TOP (1) [t2].[AnotherColumn], [t2].[SomeColumn]
FROM [Table] AS [t2]
WHERE (([t1].[SomeColumn] IS NULL) AND ([t2].[SomeColumn] IS NULL))
OR (([t1].[SomeColumn] IS NOT NULL) AND ([t2].[SomeColumn] IS NOT NULL)
AND ([t1].[SomeColumn] = [t2].[SomeColumn]))
ORDER BY [t2].[AnotherColumn]
) AS [t3]
ORDER BY [t3].[AnotherColumn]

But it is incompatible with MySQL.


I based my answer on the title of your post only, as I don't know C# and didn't understand the given query. But in MySQL I suggest you try subselects. First get a set of primary keys of interesting columns then select data from those rows:

SELECT somecolumn, anothercolumn
FROM sometable
WHERE id IN (
SELECT min(id)
FROM sometable
GROUP BY somecolumn
);

You should use some aggregate function to get the value of AnotherColumn that you want. That is, if you want the lowest value of AnotherColumn for each value of SomeColumn (either numerically or lexicographically), you can use:

SELECT SomeColumn, MIN(AnotherColumn)
FROM YourTable
GROUP BY SomeColumn

Some hopefully helpful links:

http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html

http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html

Here's another way you could try, that doesn't need that ID field.

select some_column, min(another_column)
from i_have_a_table
group by some_column

Still, I agree with lfagundes that you should add a primary key.

Also beware that by doing this, you cannot (easily) get at the other values in the same row as the resulting some_column, another_column pair! You'd need lfagundes' approach and a PK to do that!

When I write

SELECT AnotherColumn
FROM `Table`
GROUP BY SomeColumn
;

It works. IIRC, in other RDBMSs such a statement is not allowed, because a column that doesn't belong to the grouping key is referenced without any aggregation.

This "quirk" behaves very closely to what I want. So I used it to get the result I wanted:

SELECT * FROM
(
SELECT * FROM `table`
ORDER BY AnotherColumn
) t1
GROUP BY SomeColumn
;

Yet another way to do it

Select max from group that works in views

SELECT * FROM action a
WHERE NOT EXISTS (
SELECT 1 FROM action a2
WHERE a2.user_id = a.user_id
AND a2.action_date > a.action_date
AND a2.action_type = a.action_type
)
AND a.action_type = "CF"

Why not use MySQL LIMIT keyword?

SELECT [t2].[AnotherColumn], [t2].[SomeColumn]
FROM [Table] AS [t2]
WHERE (([t1].[SomeColumn] IS NULL) AND ([t2].[SomeColumn] IS NULL))
OR (([t1].[SomeColumn] IS NOT NULL) AND ([t2].[SomeColumn] IS NOT NULL)
AND ([t1].[SomeColumn] = [t2].[SomeColumn]))
ORDER BY [t2].[AnotherColumn]
LIMIT 1

How about this:

SELECT SUBSTRING_INDEX(
MIN(CONCAT(OrderColumn, '|', IFNULL(TargetColumn, ''))
), '|', -1) as TargetColumn
FROM `table`
GROUP BY GroupColumn

From MySQL 5.7 documentation

MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.

This means that @Jader Dias's solution wouldn't work everywhere.
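To see whether a given server is affected, the session's SQL mode can be inspected first. This is only a minimal sketch: SomeTable, SomeColumn and AnotherColumn are the placeholder names used in this thread, not real objects.

```sql
-- Show the SQL modes active for the current session;
-- look for ONLY_FULL_GROUP_BY in the result.
SELECT @@SESSION.sql_mode;

-- With ONLY_FULL_GROUP_BY enabled, MySQL 5.7.5+ rejects this query
-- because AnotherColumn is neither aggregated nor named in GROUP BY
-- (unless it is functionally dependent on SomeColumn, for example
-- when SomeColumn is the primary key).
SELECT AnotherColumn
FROM SomeTable
GROUP BY SomeColumn;
```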

Here is a solution that would work when ONLY_FULL_GROUP_BY is enabled:

SET @row := NULL;
SELECT
SomeColumn,
AnotherColumn
FROM (
SELECT
CASE @id <=> SomeColumn AND @row IS NOT NULL
WHEN TRUE THEN @row := @row+1
ELSE @row := 0
END AS rownum,
@id := SomeColumn AS SomeColumn,
AnotherColumn
FROM
SomeTable
ORDER BY
SomeColumn, -AnotherColumn DESC
) _values
WHERE rownum = 0
ORDER BY SomeColumn;

I have not seen the following solution among the answers, so I thought I'd put it out there.

The problem is to select rows which are the first rows when ordered by AnotherColumn in all groups grouped by SomeColumn.

The following solution will do this in MySQL. id has to be a unique column which must not hold values containing - (which I use as a separator).

select t1.*
from mytable t1
inner join (
select SUBSTRING_INDEX(
GROUP_CONCAT(t3.id ORDER BY t3.AnotherColumn DESC SEPARATOR '-'),
'-',
1
) as id
from mytable t3
group by t3.SomeColumn
) t2 on t2.id = t1.id




-- Where
SUBSTRING_INDEX(GROUP_CONCAT(id order by AnotherColumn desc separator '-'), '-', 1)
-- can be seen as:
FIRST(id order by AnotherColumn desc)


-- For completeness sake:
SUBSTRING_INDEX(GROUP_CONCAT(id order by AnotherColumn desc separator '-'), '-', -1)
-- would then be seen as:
LAST(id order by AnotherColumn desc)

There is a feature request for FIRST() and LAST() in the MySQL bug tracker, but it was closed many years back.

Yet another way to do it (without the primary key) would be using the JSON functions:

select somecolumn, json_unquote( json_extract(json_arrayagg(othercolumn), "$[0]") )
from sometable group by somecolumn

or, before MySQL 5.7.22 (the release that added JSON_ARRAYAGG):

select somecolumn,
json_unquote(
json_extract(
concat('["', group_concat(othercolumn separator '","') ,'"]')
,"$[0]" )
)
from sometable group by somecolumn

Ordering (or filtering) can be done before grouping:

select somecolumn, json_unquote( json_extract(json_arrayagg(othercolumn), "$[0]") )
from (select * from sometable order by othercolumn) as t group by somecolumn

... or after grouping (of course):

select somecolumn, json_unquote( json_extract(json_arrayagg(othercolumn), "$[0]") ) as other
from sometable group by somecolumn order by other

Admittedly, it's rather convoluted and performance is probably not great (didn't test it on large data, works well on my limited data sets).

SELECT t1.*
FROM table_name AS t1
LEFT JOIN table_name AS t2 ON (
t2.group_by_column = t1.group_by_column
-- group_by_column is the column you would use in the GROUP BY clause
AND t2.order_by_column < t1.order_by_column
-- order_by_column is the column you would use in the ORDER BY clause,
-- usually the auto-incremented key column
)
WHERE t2.group_by_column IS NULL;

With MySQL 8.0 and later you can use window functions.
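A minimal sketch of that approach, assuming MySQL 8.0 or later and the placeholder names sometable, somecolumn and anothercolumn used in earlier answers (the aliases rn and ranked are made up for illustration):

```sql
-- Number the rows inside each somecolumn group, ordered by anothercolumn,
-- then keep only the first row of every group.
SELECT somecolumn, anothercolumn
FROM (
    SELECT somecolumn,
           anothercolumn,
           ROW_NUMBER() OVER (
               PARTITION BY somecolumn
               ORDER BY anothercolumn
           ) AS rn
    FROM sometable
) AS ranked
WHERE rn = 1;
```

If several rows in a group tie on anothercolumn, ROW_NUMBER() picks one of them arbitrarily; adding a unique column to the ORDER BY inside OVER makes the choice deterministic.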

Best performance and easy to use:

SELECT id, code,
SUBSTRING_INDEX( GROUP_CONCAT(price ORDER BY id DESC), ',', 1) first_found_price
FROM stocks
GROUP BY code
ORDER BY id DESC

You can specify a different SEPARATOR (the default is a comma).

Select the first row for each group (as ordered by a column) in MySQL.

We have:

a table: mytable
a column we are ordering by: the_column_to_order_by
a column that we wish to group by: the_group_by_column

Here's my solution. The inner query gets you a unique set of rows, selected as a dual key. The outer query joins the same table by joining on both of those keys (with AND).

SELECT * FROM
(
SELECT the_group_by_column, MAX(the_column_to_order_by) the_column_to_order_by
FROM mytable
GROUP BY the_group_by_column
ORDER BY MAX(the_column_to_order_by) DESC
) as mytable1
JOIN mytable mytable2 ON mytable2.the_group_by_column =
mytable1.the_group_by_column
AND mytable2.the_column_to_order_by = mytable1.the_column_to_order_by;

FYI: I haven't thought about efficiency at all for this and can't speak to that one way or the other.

I suggest using this approach from the official MySQL documentation:

SELECT article, dealer, price
FROM   shop s1
WHERE  price=(SELECT MAX(s2.price)
FROM shop s2
WHERE s1.article = s2.article
GROUP BY s2.article)
ORDER BY article;

This way, we get the row with the highest price for each article.

I recently discovered a cool trick to accomplish this. Basically just make two different subqueries from a table and join them together. One of the subqueries does the aggregation based on a grouping, and the other subquery just grabs the first DISTINCT row for each grouped item.

When you join these subqueries together, you will get the first distinct item from each group, but will also get the aggregated columns across the whole group for each item. This is essentially the same result as having ONLY_FULL_GROUP_BY turned off.

SELECT non_aggregated_data.foo_column AS foo_column,
non_aggregated_data.bar_column AS bar_column,
aggregated_data.value_1_sum    AS value_1_sum,
aggregated_data.value_2_sum    AS value_2_sum
FROM (SELECT column_to_self_join_on,
sum(value_1) AS value_1_sum,
sum(value_2) AS value_2_sum
FROM example_table
GROUP BY column_to_self_join_on) AS aggregated_data
LEFT JOIN (SELECT DISTINCT(column_to_self_join_on),
foo_column,
bar_column
FROM example_table) AS non_aggregated_data
ON non_aggregated_data.column_to_self_join_on = aggregated_data.column_to_self_join_on

rtribaldos mentioned that in newer database versions, window functions can be used.
Here is code that worked for me and was as fast as Martin Zwarík's SUBSTRING_INDEX solution (in MariaDB 10.5.16):

SELECT group_col, order_col FROM (
SELECT group_col, order_col
, ROW_NUMBER() OVER(PARTITION BY group_col ORDER BY order_col) rnr
FROM some_table
WHERE <some_condition>
) i
WHERE rnr=1;