Thursday, 14 August 2014

SQL: More PIVOT

As you have seen in the previous article the use of PIVOT with the months of the year. This is a common use of PIVOT.

Let's look at another example. Again, using Northwind. This time what I'd like to achieve is a list of employees as columns and a total of orders each has made in each month/year. Let's start by looking at the raw data

SELECT OrderDate, OrderId, CustomerId, EmployeeId FROM Orders
ORDER BY OrderDate, EmployeeId
Gives us some output that isn't quite the data we are looking for
OrderDate               OrderId     CustomerId EmployeeId
----------------------- ----------- ---------- -----------
1996-07-04 00:00:00.000 10248       VINET      5
1996-07-05 00:00:00.000 10249       TOMSP      6
1996-07-08 00:00:00.000 10251       VICTE      3
1996-07-08 00:00:00.000 10250       HANAR      4
1996-07-09 00:00:00.000 10252       SUPRD      4
...
1998-05-06 00:00:00.000 11074       SIMOB      7
1998-05-06 00:00:00.000 11075       RICSU      8

(830 row(s) affected)
So we will get this with the year and month of the OrderDate and then sort it
SELECT
    YEAR(OrderDate) AS YY, 
    MONTH(OrderDate) AS MM, 
    OrderId, 
    CustomerId, 
    EmployeeId FROM Orders
ORDER BY 
    YEAR(OrderDate), 
    MONTH(OrderDate),
    EmployeeId
Giving us something a little bit better
YY          MM          OrderId     CustomerId EmployeeId
----------- ----------- ----------- ---------- -----------
1996        7           10258       ERNSH      1
1996        7           10265       BLONP      2
1996        7           10266       WARTH      3
1996        7           10251       VICTE      3
1996        7           10253       HANAR      3
1996        7           10256       WELLI      3
1996        7           10257       HILAA      4
...
We can see that EmployeeId 1 sold one order in July 1996, EmployeeId 3 sold 4 etc. We can summarise this with
SELECT
    YEAR(OrderDate) AS YY, 
    MONTH(OrderDate) AS MM, 
    EmployeeId, 
    COUNT(*) AS NumberOfOrders 
FROM Orders
GROUP BY 
    YEAR(OrderDate), Month(OrderDate), EmployeeId
ORDER BY 
    YEAR(OrderDate), Month(OrderDate), EmployeeId
Producing
YY          MM          EmployeeId  NumberOfOrders
----------- ----------- ----------- --------------
1996        7           1           1
1996        7           2           1
1996        7           3           4
1996        7           4           7
...
Now to PIVOT this we can simply run this
SELECT * FROM
(
    SELECT 
        YEAR(OrderDate) As YY, 
        MONTH(OrderDate) AS MM, 
        CustomerId,
        EmployeeId from Orders) As Source
PIVOT
(
    COUNT(CustomerId) FOR EmployeeId IN ( [1], [2], [3], [4] )
) AS PivotTable
This gives us the output for Employees 1, 2, 3 and 4
YY          MM          1           2           3           4
----------- ----------- ----------- ----------- ----------- -----------
1997        1           3           4           7           8
1996        8           5           2           2           5
1997        9           8           7           4           5
...
1996        7           1           1           4           7
1998        4           8           18          10          10

(23 row(s) affected)
Not quite right. I'd rather it was in some sort of order
SELECT * FROM
(
    SELECT 
        YEAR(OrderDate) As YY,
        MONTH(OrderDate) AS MM,
        CustomerId,
        EmployeeId 
    FROM Orders) As Source
PIVOT
(
    COUNT(CustomerId) FOR EmployeeId IN ( [1], [2], [3], [4] )
) AS PivotTable
ORDER BY YY, MM
One thing to note - in the ORDER BY clause you can use the column names - you don't have to repeat the clause. The output will be
YY          MM          1           2           3           4
----------- ----------- ----------- ----------- ----------- -----------
1996        7           1           1           4           7
1996        8           5           2           2           5
1996        9           5           5           1           3
...
There are however nine employees. So to get all nine employees we need to modify our query for this.
SELECT * FROM
(
    SELECT 
        YEAR(OrderDate) As YY, 
        MONTH(OrderDate) AS MM, 
        CustomerId, 
        EmployeeId FROM Orders) As Source
PIVOT
(
    COUNT(CustomerId) FOR EmployeeId 
         IN ( [1], [2], [3], [4], [5], [6], [7], [8], [9] )
) AS PivotTable
ORDER BY YY, MM
Finally giving us
YY    MM    1     2     3     4     5     6     7     8     9
----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1996  7     1     1     4     7     3     2     0     2     2
1996  8     5     2     2     5     0     4     1     6     0
1996  9     5     5     1     3     1     3     2     3     0
...
This creates an issue. What happens when a new Employee joins. We need to modify our query to cope with this. The next article will be about creating a Dynamic PIVOT. No real rocket science, but using generated SQL.