从多个列中选择最小值的最佳方法是什么?

给定 SQLServer2005中的下表:

ID   Col1   Col2   Col3
--   ----   ----   ----
1       3     34     76
2      32    976     24
3       7    235      3
4     245      1    792

写出产生以下结果的查询的最佳方式是什么(即产生最后一列的查询——一个包含 Col1、 Col2和 Col 3 每一行的最小值的列) ?

ID   Col1   Col2   Col3  TheMin
--   ----   ----   ----  ------
1       3     34     76       3
2      32    976     24      24
3       7    235      3       3
4     245      1    792       1

更新:

为了澄清(正如我在评论中所说的) ,在实际场景中,数据库是 正常化了。这些“数组”列不在实际的表中,而是在报表中需要的结果集中。新的要求是报表也需要这个 MinValue 列。我不能改变底层的结果集,因此我希望 T-SQL 能够方便地“摆脱牢狱之灾”。

我尝试了下面提到的 CASE 方法,虽然有点麻烦,但是它很有效。它也比答案中所说的更复杂,因为您需要考虑到同一行中有两个 min 值这一事实。

无论如何,我认为我应该发布我当前的解决方案,考虑到我的约束,它工作得很好。它使用了 UNPIVOT 操作符:

with cte (ID, Col1, Col2, Col3)
as
(
select ID, Col1, Col2, Col3
from TestTable
)
select cte.ID, Col1, Col2, Col3, TheMin from cte
join
(
select
ID, min(Amount) as TheMin
from
cte
UNPIVOT (Amount for AmountCol in (Col1, Col2, Col3)) as unpvt
group by ID
) as minValues
on cte.ID = minValues.ID

首先我要说的是,我并不期望它能提供最好的性能,但考虑到目前的情况(我不能仅仅为了新的 MinValue 列需求而重新设计所有查询) ,这是一张相当优雅的“出狱卡”。

316776 次浏览

There are likely to be many ways to accomplish this. My suggestion is to use Case/When to do it. With 3 columns, it's not too bad.

Select Id,
Case When Col1 < Col2 And Col1 < Col3 Then Col1
When Col2 < Col1 And Col2 < Col3 Then Col2
Else Col3
End As TheMin
From   YourTableNameHere

This is brute force but works

 select case when col1 <= col2 and col1 <= col3 then col1
case when col2 <= col1 and col2 <= col3 then col2
case when col3 <= col1 and col3 <= col2 then col3
as 'TheMin'
end


from Table T

... because min() works only on one column and not across columns.

The best way to do that is probably not to do it - it's strange that people insist on storing their data in a way that requires SQL "gymnastics" to extract meaningful information, when there are far easier ways to achieve the desired result if you just structure your schema a little better :-)

The right way to do this, in my opinion, is to have the following table:

ID    Col    Val
--    ---    ---
1      1      3
1      2     34
1      3     76


2      1     32
2      2    976
2      3     24


3      1      7
3      2    235
3      3      3


4      1    245
4      2      1
4      3    792

with ID/Col as the primary key (and possibly Col as an extra key, depending on your needs). Then your query becomes a simple select min(val) from tbl and you can still treat the individual 'old columns' separately by using where col = 2 in your other queries. This also allows for easy expansion should the number of 'old columns' grow.

This makes your queries so much easier. The general guideline I tend to use is, if you ever have something that looks like an array in a database row, you're probably doing something wrong and should think about restructuring the data.


However, if for some reason you can't change those columns, I'd suggest using insert and update triggers and add another column which these triggers set to the minimum on Col1/2/3. This will move the 'cost' of the operation away from the select to the update/insert where it belongs - most database tables in my experience are read far more often than written so incurring the cost on write tends to be more efficient over time.

In other words, the minimum for a row only changes when one of the other columns change, so that's when you should be calculating it, not every time you select (which is wasted if the data isn't changing). You would then end up with a table like:

ID   Col1   Col2   Col3   MinVal
--   ----   ----   ----   ------
1      3     34     76        3
2     32    976     24       24
3      7    235      3        3
4    245      1    792        1

Any other option that has to make decisions at select time is usually a bad idea performance-wise, since the data only changes on insert/update - the addition of another column takes up more space in the DB and will be slightly slower for the inserts and updates but can be much faster for selects - the preferred approach should depend on your priorities there but, as stated, most tables are read far more often than they're written.

If you're able to make a stored procedure, it could take an array of values, and you could just call that.

select *,
case when column1 < columnl2 And column1 < column3 then column1
when columnl2 < column1 And columnl2 < column3 then columnl2
else column3
end As minValue
from   tbl_example

You could also do this with a union query. As the number of columns increase, you would need to modify the query, but at least it would be a straight forward modification.

Select T.Id, T.Col1, T.Col2, T.Col3, A.TheMin
From   YourTable T
Inner Join (
Select A.Id, Min(A.Col1) As TheMin
From   (
Select Id, Col1
From   YourTable


Union All


Select Id, Col2
From   YourTable


Union All


Select Id, Col3
From   YourTable
) As A
Group By A.Id
) As A
On T.Id = A.Id

If you use SQL 2005 you can do something neat like this:

;WITH    res
AS ( SELECT   t.YourID ,
CAST(( SELECT   Col1 AS c01 ,
Col2 AS c02 ,
Col3 AS c03 ,
Col4 AS c04 ,
Col5 AS c05
FROM     YourTable AS cols
WHERE    YourID = t.YourID
FOR
XML AUTO ,
ELEMENTS
) AS XML) AS colslist
FROM     YourTable AS t
)
SELECT  YourID ,
colslist.query('for $c in //cols return min(data($c/*))').value('.',
'real') AS YourMin ,
colslist.query('for $c in //cols return avg(data($c/*))').value('.',
'real') AS YourAvg ,
colslist.query('for $c in //cols return max(data($c/*))').value('.',
'real') AS YourMax
FROM    res

This way you don't get lost in so many operators :)

However, this could be slower than the other choice.

It's your choice...

Both this question And this question try to answer this.

The recap is that Oracle has a built in function for this, with Sql Server you are stuck either defining a user-defined-function or using case statements.

A little twist on the union query:

DECLARE @Foo TABLE (ID INT, Col1 INT, Col2 INT, Col3 INT)


INSERT @Foo (ID, Col1, Col2, Col3)
VALUES
(1, 3, 34, 76),
(2, 32, 976, 24),
(3, 7, 235, 3),
(4, 245, 1, 792)


SELECT
ID,
Col1,
Col2,
Col3,
(
SELECT MIN(T.Col)
FROM
(
SELECT Foo.Col1 AS Col UNION ALL
SELECT Foo.Col2 AS Col UNION ALL
SELECT Foo.Col3 AS Col
) AS T
) AS TheMin
FROM
@Foo AS Foo

If the columns were integers as in your example I would create a function:

create function f_min_int(@a as int, @b as int)
returns int
as
begin
return case when @a < @b then @a else coalesce(@b,@a) end
end

then when I need to use it I would do :

select col1, col2, col3, dbo.f_min_int(dbo.f_min_int(col1,col2),col3)

if you have 5 colums then the above becomes

select col1, col2, col3, col4, col5,
dbo.f_min_int(dbo.f_min_int(dbo.f_min_int(dbo.f_min_int(col1,col2),col3),col4),col5)

If you know what values you are looking for, usually a status code, the following can be helpful:

select case when 0 in (PAGE1STATUS ,PAGE2STATUS ,PAGE3STATUS,
PAGE4STATUS,PAGE5STATUS ,PAGE6STATUS) then 0 else 1 end
FROM CUSTOMERS_FORMS

On MySQL, use this:

select least(col1, col2, col3) FROM yourtable

You can use the "brute force" approach with a twist:

SELECT CASE
WHEN Col1 <= Col2 AND Col1 <= Col3 THEN Col1
WHEN                  Col2 <= Col3 THEN Col2
ELSE                                    Col3
END AS [Min Value] FROM [Your Table]

When the first when condition fails it guarantees that Col1 is not the smallest value therefore you can eliminate it from rest of the conditions. Likewise for subsequent conditions. For five columns your query becomes:

SELECT CASE
WHEN Col1 <= Col2 AND Col1 <= Col3 AND Col1 <= Col4 AND Col1 <= Col5 THEN Col1
WHEN                  Col2 <= Col3 AND Col2 <= Col4 AND Col2 <= Col5 THEN Col2
WHEN                                   Col3 <= Col4 AND Col3 <= Col5 THEN Col3
WHEN                                                    Col4 <= Col5 THEN Col4
ELSE                                                                      Col5
END AS [Min Value] FROM [Your Table]

Note that if there is a tie between two or more columns then <= ensures that we exit the CASE statement as early as possible.

Using CROSS APPLY:

SELECT ID, Col1, Col2, Col3, MinValue
FROM YourTable
CROSS APPLY (SELECT MIN(d) AS MinValue FROM (VALUES (Col1), (Col2), (Col3)) AS a(d)) A

SQL Fiddle

Below I use a temp table to get the minimum of several dates. The first temp table queries several joined tables to get various dates (as well as other values for the query), the second temp table then gets the various columns and the minimum date using as many passes as there are date columns.

This is essentially like the union query, the same number of passes are required, but may be more efficient (based on experience, but would need testing). Efficiency wasn't an issue in this case (8,000 records). One could index etc.

--==================== this gets minimums and global min
if object_id('tempdb..#temp1') is not null
drop table #temp1
if object_id('tempdb..#temp2') is not null
drop table #temp2


select r.recordid ,  r.ReferenceNumber, i.InventionTitle, RecordDate, i.ReceivedDate
, min(fi.uploaddate) [Min File Upload], min(fi.CorrespondenceDate) [Min File Correspondence]
into #temp1
from record r
join Invention i on i.inventionid = r.recordid
left join LnkRecordFile lrf on lrf.recordid = r.recordid
left join fileinformation fi on fi.fileid = lrf.fileid
where r.recorddate > '2015-05-26'
group by  r.recordid, recorddate, i.ReceivedDate,
r.ReferenceNumber, i.InventionTitle






select recordid, recorddate [min date]
into #temp2
from #temp1


update #temp2
set [min date] = ReceivedDate
from #temp1 t1 join #temp2 t2 on t1.recordid = t2.recordid
where t1.ReceivedDate < [min date] and  t1.ReceivedDate > '2001-01-01'


update #temp2
set [min date] = t1.[Min File Upload]
from #temp1 t1 join #temp2 t2 on t1.recordid = t2.recordid
where t1.[Min File Upload] < [min date] and  t1.[Min File Upload] > '2001-01-01'


update #temp2
set [min date] = t1.[Min File Correspondence]
from #temp1 t1 join #temp2 t2 on t1.recordid = t2.recordid
where t1.[Min File Correspondence] < [min date] and t1.[Min File Correspondence] > '2001-01-01'




select t1.*, t2.[min date] [LOWEST DATE]
from #temp1 t1 join #temp2 t2 on t1.recordid = t2.recordid
order by t1.recordid
SELECT ID, Col1, Col2, Col3,
(SELECT MIN(Col) FROM (VALUES (Col1), (Col2), (Col3)) AS X(Col)) AS TheMin
FROM Table

For multiple columns its best to use a CASE statement, however for two numeric columns i and j you can use simple math:

min(i,j) = (i+j)/2 - abs(i-j)/2

This formula can be used to get the minimum value of multiple columns but its really messy past 2, min(i,j,k) would be min(i,min(j,k))

SELECT [ID],
(
SELECT MIN([value].[MinValue])
FROM
(
VALUES
([Col1]),
([Col1]),
([Col2]),
([Col3])
) AS [value] ([MinValue])
) AS [MinValue]
FROM Table;

I know that question is old, but I was still in the need of the answer and was not happy with other answers so I had to devise my own which is a twist on @paxdiablo´s answer.


I came from land of SAP ASE 16.0, and I only needed a peek at statistics of certain data which are IMHO validly stored in different columns of a single row (they represent different times - when arrival of something was planned, what it was expected when the action started and finally what was the actual time). Thus I had transposed columns into the rows of temporary table and preformed my query over this as usually.

N.B. Not the one-size-fits-all solution ahead!

CREATE TABLE #tempTable (ID int, columnName varchar(20), dataValue int)


INSERT INTO #tempTable
SELECT ID, 'Col1', Col1
FROM sourceTable
WHERE Col1 IS NOT NULL
INSERT INTO #tempTable
SELECT ID, 'Col2', Col2
FROM sourceTable
WHERE Col2 IS NOT NULL
INSERT INTO #tempTable
SELECT ID, 'Col3', Col3
FROM sourceTable
WHERE Col3 IS NOT NULL


SELECT ID
, min(dataValue) AS 'Min'
, max(dataValue) AS 'Max'
, max(dataValue) - min(dataValue) AS 'Diff'
FROM #tempTable
GROUP BY ID

This took some 30 seconds on source set of 630000 rows and used only index-data, so not the thing to run in time-critical process but for things like one-time data inspection or end-of-the-day report you might be fine (but verify this with your peers or superiors, please!). Main bonus of this style for me was that I could readily use more/less columns and change grouping, filtering, etc., especially once data was copyied over.

The additional data (columnName, maxes, ...) were to aid me in my search, so you might not need them; I left them here to maybe spark some ideas :-).

case when Col1 < Col2 and Col1 < Col3 then Col1 when Col2 is null and Col3 is null then Col1 when Col1 < Col2 and Col3 is null then Col1 when Col1 < Col3 and Col2 is null then Col1 when Col2 < Col1 and Col2 < Col3 then Col2 when Col1 is null and Col3 is null then Col2 when Col2 < Col1 and Col3 is null then Col2 when Col2 < Col3 and Col1 is null then Col2 when Col3 < Col1 and Col3 < Col2 then Col3 when Col1 is null and Col2 is null then Col3 when Col3 < Col1 and Col2 is null then Col3 when Col3 < Col2 and Col1 is null then Col3 when Col2 = Col3 then Col2 when Col1 = Col3 then Col1 when Col1 = Col2 then Col1 when Col2 = Col3 and Col1 = Col3 then Col1
else null end as 'MIN'