了解 T-SQL 中的 PIVOT 函数

我是 SQL 的新手。

我有一张这样的桌子:

ID | TeamID | UserID | ElementID | PhaseID | Effort
-----------------------------------------------------
1  |   1    |  1      |   3       |  5     |   6.74
2  |   1    |  1      |   3       |  6     |   8.25
3  |   1    |  1      |   4       |  1     |   2.23
4  |   1    |  1      |   4       |  5     |   6.8
5  |   1    |  1      |   4       |  6     |   1.5

我被告知要得到这样的数据

ElementID | PhaseID1 | PhaseID5 | PhaseID6
--------------------------------------------
3     |   NULL   |   6.74   |   8.25
4     |   2.23   |   6.8    |   1.5

我知道我需要使用 PIVOT 函数,但不能清楚地理解它。 如果有人能在上面的例子中解释一下,那将是非常有帮助的

221649 次浏览

These are the very basic pivot example kindly go through that.

SQL SERVER – PIVOT and UNPIVOT Table Examples

Example from above link for the product table:

SELECT PRODUCT, FRED, KATE
FROM (
SELECT CUST, PRODUCT, QTY
FROM Product) up
PIVOT (SUM(QTY) FOR CUST IN (FRED, KATE)) AS pvt
ORDER BY PRODUCT

renders:

 PRODUCT FRED  KATE
--------------------
BEER     24    12
MILK      3     1
SODA   NULL     6
VEG    NULL     5

Similar examples can be found in the blog post Pivot tables in SQL Server. A simple sample

A PIVOT used to rotate the data from one column into multiple columns.

For your example here is a STATIC Pivot meaning you hard code the columns that you want to rotate:

create table temp
(
id int,
teamid int,
userid int,
elementid int,
phaseid int,
effort decimal(10, 5)
)


insert into temp values (1,1,1,3,5,6.74)
insert into temp values (2,1,1,3,6,8.25)
insert into temp values (3,1,1,4,1,2.23)
insert into temp values (4,1,1,4,5,6.8)
insert into temp values (5,1,1,4,6,1.5)


select elementid
, [1] as phaseid1
, [5] as phaseid5
, [6] as phaseid6
from
(
select elementid, phaseid, effort
from temp
) x
pivot
(
max(effort)
for phaseid in([1], [5], [6])
)p

Here is a SQL Demo with a working version.

This can also be done through a dynamic PIVOT where you create the list of columns dynamically and perform the PIVOT.

DECLARE @cols AS NVARCHAR(MAX),
@query  AS NVARCHAR(MAX);


select @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.phaseid)
FROM temp c
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')


set @query = 'SELECT elementid, ' + @cols + ' from
(
select elementid, phaseid, effort
from temp
) x
pivot
(
max(effort)
for phaseid in (' + @cols + ')
) p '




execute(@query)

The results for both:

ELEMENTID   PHASEID1    PHASEID5    PHASEID6
3           Null        6.74        8.25
4           2.23        6.8         1.5

To set Compatibility error

use this before using pivot function

ALTER DATABASE [dbname] SET COMPATIBILITY_LEVEL = 100
    SELECT <non-pivoted column>,
[first pivoted column] AS <column name>,
[second pivoted column] AS <column name>,
...
[last pivoted column] AS <column name>
FROM
(<SELECT query that produces the data>)
AS <alias for the source query>
PIVOT
(
<aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
IN ( [first pivoted column], [second pivoted column],
... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;


USE AdventureWorks2008R2 ;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost
FROM Production.Product
GROUP BY DaysToManufacture;


DaysToManufacture          AverageCost
0                          5.0885
1                          223.88
2                          359.1082
4                          949.4105


-- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days,
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost
FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;








Here is the result set.
Cost_Sorted_By_Production_Days    0         1         2           3       4
AverageCost                       5.0885    223.88    359.1082    NULL    949.4105

Ive something to add here which no one mentioned.

The pivot function works great when the source has 3 columns: One for the aggregate, one to spread as columns with for, and one as a pivot for row distribution. In the product example it's QTY, CUST, PRODUCT.

However, if you have more columns in the source it will break the results into multiple rows instead of one row per pivot based on unique values per additional column (as Group By would do in a simple query).

See this example, ive added a timestamp column to the source table:

enter image description here

Now see its impact:

SELECT CUST, MILK


FROM Product
-- FROM (SELECT CUST, Product, QTY FROM PRODUCT) p
PIVOT (
SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt


ORDER BY CUST

enter image description here


In order to fix this, you can either pull a subquery as a source as everyone has done above - with only 3 columns (this is not always going to work for your scenario, imagine if you need to put a where condition for the timestamp).

Second solution is to use a group by and do a sum of the pivoted column values again.

SELECT
CUST,
sum(MILK) t_MILK


FROM Product
PIVOT (
SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt


GROUP BY CUST
ORDER BY CUST


GO

enter image description here

A pivot is used to convert one of the columns in your data set from rows into columns (this is typically referred to as the spreading column). In the example you have given, this means converting the PhaseID rows into a set of columns, where there is one column for each distinct value that PhaseID can contain - 1, 5 and 6 in this case.

These pivoted values are grouped via the ElementID column in the example that you have given.

Typically you also then need to provide some form of aggregation that gives you the values referenced by the intersection of the spreading value (PhaseID) and the grouping value (ElementID). Although in the example given the aggregation that will be used is unclear, but involves the Effort column.

Once this pivoting is done, the grouping and spreading columns are used to find an aggregation value. Or in your case, ElementID and PhaseIDX lookup Effort.

Using the grouping, spreading, aggregation terminology you will typically see example syntax for a pivot as:

WITH PivotData AS
(
SELECT <grouping column>
, <spreading column>
, <aggregation column>
FROM <source table>
)
SELECT <grouping column>, <distinct spreading values>
FROM PivotData
PIVOT (<aggregation function>(<aggregation column>)
FOR <spreading column> IN <distinct spreading values>));

This gives a graphical explanation of how the grouping, spreading and aggregation columns convert from the source to pivoted tables if that helps further.

FOR XML PATH might not work on Microsoft Azure Synapse Serve. A possible alternative, following @Taryn dynamic generated cols approach, same results is obtained by using STRING_AGG.

DECLARE @cols AS NVARCHAR(MAX), @query AS NVARCHAR(MAX)


SELECT @cols = STRING_AGG(QUOTENAME(c.phaseid),', ')
/*OPTIONAL: within group (order by cast(t1.[FLOW_SP_SLPM] as INT) asc)*/
FROM (SELECT phaseid FROM temp
GROUP BY phaseid) c


set @query = 'SELECT elementid,' + @cols + ' from
(
select elementid,
phaseid,
effort
from temp
) x
PIVOT
(
max(effort)
for phaseid in (' + @cols + ')
) p '


execute(@query)