SQL Server Understanding PIVOT function in T-SQL

carvr3hs  于 2023-03-11  发布在  其他
关注(0)|答案(7)|浏览(134)

I am very new to SQL.

I have a table like this:
| ID | TeamID | UserID | ElementID | PhaseID | Effort |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 1 | 1 | 1 | 3 | 5 | 6.74 |
| 2 | 1 | 1 | 3 | 6 | 8.25 |
| 3 | 1 | 1 | 4 | 1 | 2.23 |
| 4 | 1 | 1 | 4 | 5 | 6.8 |
| 5 | 1 | 1 | 4 | 6 | 1.5 |

And I was told to get data like this

ElementIDPhaseID1PhaseID5PhaseID6
3NULL6.748.25
42.236.81.5

I understand I need to use PIVOT function. But can't understand it clearly. It would be great help if somebody can explain it in above case.(or any alternatives if any)

2skhul33

2skhul331#

A PIVOT used to rotate the data from one column into multiple columns.

For your example here is a STATIC Pivot meaning you hard code the columns that you want to rotate:

create table temp
(
  id int,
  teamid int,
  userid int,
  elementid int,
  phaseid int,
  effort decimal(10, 5)
)

insert into temp values (1,1,1,3,5,6.74)
insert into temp values (2,1,1,3,6,8.25)
insert into temp values (3,1,1,4,1,2.23)
insert into temp values (4,1,1,4,5,6.8)
insert into temp values (5,1,1,4,6,1.5)

select elementid
  , [1] as phaseid1
  , [5] as phaseid5
  , [6] as phaseid6
from
(
  select elementid, phaseid, effort
  from temp
) x
pivot
(
  max(effort)
  for phaseid in([1], [5], [6])
)p

Here is a SQL Demo with a working version.

This can also be done through a dynamic PIVOT where you create the list of columns dynamically and perform the PIVOT.

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX);

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.phaseid) 
            FROM temp c
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT elementid, ' + @cols + ' from 
            (
                select elementid, phaseid, effort
                from temp
           ) x
            pivot 
            (
                 max(effort)
                for phaseid in (' + @cols + ')
            ) p '

execute(@query)

The results for both:

ELEMENTID   PHASEID1    PHASEID5    PHASEID6
3           Null        6.74        8.25
4           2.23        6.8         1.5
pcww981p

pcww981p2#

These are the very basic pivot example kindly go through that.

SQL SERVER – PIVOT and UNPIVOT Table Examples

Example from above link for the product table:

SELECT PRODUCT, FRED, KATE
FROM (
SELECT CUST, PRODUCT, QTY
FROM Product) up
 PIVOT (SUM(QTY) FOR CUST IN (FRED, KATE)) AS pvt
ORDER BY PRODUCT

renders:

PRODUCT FRED  KATE
 --------------------
 BEER     24    12
 MILK      3     1
 SODA   NULL     6
 VEG    NULL     5

Similar examples can be found in the blog post Pivot tables in SQL Server. A simple sample

3npbholx

3npbholx3#

Ive something to add here which no one mentioned.

The pivot function works great when the source has 3 columns: One for the aggregate , one to spread as columns with for , and one as a pivot for row distribution. In the product example it's QTY, CUST, PRODUCT .

However, if you have more columns in the source it will break the results into multiple rows instead of one row per pivot based on unique values per additional column (as Group By would do in a simple query).

See this example, ive added a timestamp column to the source table:

Now see its impact:

SELECT CUST, MILK

FROM Product
-- FROM (SELECT CUST, Product, QTY FROM PRODUCT) p
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

ORDER BY CUST

In order to fix this, you can either pull a subquery as a source as everyone has done above - with only 3 columns (this is not always going to work for your scenario, imagine if you need to put a where condition for the timestamp).

Second solution is to use a group by and do a sum of the pivoted column values again.

SELECT 
CUST, 
sum(MILK) t_MILK

FROM Product
PIVOT (
    SUM(QTY) FOR PRODUCT IN (MILK)
) AS pvt

GROUP BY CUST
ORDER BY CUST

GO

vc6uscn9

vc6uscn94#

A pivot is used to convert one of the columns in your data set from rows into columns (this is typically referred to as the spreading column). In the example you have given, this means converting the PhaseID rows into a set of columns, where there is one column for each distinct value that PhaseID can contain - 1, 5 and 6 in this case.

These pivoted values are grouped via the ElementID column in the example that you have given.

Typically you also then need to provide some form of aggregation that gives you the values referenced by the intersection of the spreading value ( PhaseID ) and the grouping value ( ElementID ). Although in the example given the aggregation that will be used is unclear, but involves the Effort column.

Once this pivoting is done, the grouping and spreading columns are used to find an aggregation value. Or in your case, ElementID and PhaseIDX lookup Effort .

Using the grouping, spreading, aggregation terminology you will typically see example syntax for a pivot as:

WITH PivotData AS
(
    SELECT <grouping column>
        , <spreading column>
        , <aggregation column>
    FROM <source table>
)
SELECT <grouping column>, <distinct spreading values>
FROM PivotData
    PIVOT (<aggregation function>(<aggregation column>)
        FOR <spreading column> IN <distinct spreading values>));

This gives a graphical explanation of how the grouping, spreading and aggregation columns convert from the source to pivoted tables if that helps further.

c7rzv4ha

c7rzv4ha5#

SELECT <non-pivoted column>,
    [first pivoted column] AS <column name>,
    [second pivoted column] AS <column name>,
    ...
    [last pivoted column] AS <column name>
FROM
    (<SELECT query that produces the data>)
    AS <alias for the source query>
PIVOT
(
    <aggregation function>(<column being aggregated>)
FOR
[<column that contains the values that will become column headers>]
    IN ( [first pivoted column], [second pivoted column],
    ... [last pivoted column])
) AS <alias for the pivot table>
<optional ORDER BY clause>;

USE AdventureWorks2008R2 ;
GO
SELECT DaysToManufacture, AVG(StandardCost) AS AverageCost 
FROM Production.Product
GROUP BY DaysToManufacture;

    DaysToManufacture          AverageCost
0                          5.0885
1                          223.88
2                          359.1082
4                          949.4105

    -- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days, 
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost 
    FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;



Here is the result set.
Cost_Sorted_By_Production_Days    0         1         2           3       4       
AverageCost                       5.0885    223.88    359.1082    NULL    949.4105
t9eec4r0

t9eec4r06#

To set Compatibility error

use this before using pivot function

ALTER DATABASE [dbname] SET COMPATIBILITY_LEVEL = 100
v6ylcynt

v6ylcynt7#

FOR XML PATH might not work on Microsoft Azure Synapse Serve. A possible alternative, following @Taryn dynamic generated cols approach, same results is obtained by using STRING_AGG .

DECLARE @cols AS NVARCHAR(MAX), @query AS NVARCHAR(MAX)

SELECT @cols = STRING_AGG(QUOTENAME(c.phaseid),', ')
/*OPTIONAL: within group (order by cast(t1.[FLOW_SP_SLPM] as INT) asc)*/
FROM (SELECT phaseid FROM temp
GROUP BY phaseid) c

set @query = 'SELECT elementid,' + @cols + ' from 
             (
                select elementid,
                phaseid,
                effort
                from temp
            ) x
            PIVOT 
            (
                max(effort)
                for phaseid in (' + @cols + ')
            ) p '

execute(@query)

相关问题