Monday, 10 June 2013

SQL: Recursive Common Table Expressions (CTE)

This post follows up the post on Common Table Expressions to introduce Recursive CTE's. You shouldn't overly worry about this. The word recursion sends shivers down most developers! And to be honest the use of recursive CTE's will only be in very particular occasions.

A recursive CTE will reference itself. As we should know in recursion we should have a "exit" clause. This is achieved in a recursive CTE with an anchor member. The anchor member doesn't reference the CTE - whereas the recursive member of the CTE will do.

Our CTE is thus made up of two elements - the anchor member and the recursive member. These two parts are combined using a SET operator - normally UNION or UNION ALL.
WITH cte_name ( column_names ... )
AS (
   CTE_query_for_anchor_member
 UNION [ALL]
   CTE_query_for_recursive_member
)
SELECT * FROM cte_name -- Or other statement to use CTE
I will use the Northwind database Employees table.
SELECT 
   EmployeeID, LastName, Title, ReportsTo 
FROM Employees
Giving us the output
EmployeeID  LastName             Title                          ReportsTo
----------- -------------------- ------------------------------ -----------
1           Davolio              Sales Representative           2
2           Fuller               Vice President, Sales          NULL
3           Leverling            Sales Representative           2
4           Peacock              Sales Representative           2
5           Buchanan             Sales Manager                  2
6           Suyama               Sales Representative           5
7           King                 Sales Representative           5
8           Callahan             Inside Sales Coordinator       2
9           Dodsworth            Sales Representative           5
That's the raw data - let's try and put this into a "hierarchy" of the employees and there managers. Each employee's manager is defined by the ReportsTo column, so we can join on this.
SELECT 
   E.EmployeeID, E.LastName, E.Title, E.ReportsTo, M.LastName--, M.Title
FROM Employees M
JOIN Employees E ON (E.ReportsTo = M.EmployeeID)
Now showing the manager's name (through the JOIN)
EmployeeID  LastName             Title                          ReportsTo   LastName
----------- -------------------- ------------------------------ ----------- --------------------
1           Davolio              Sales Representative           2           Fuller
3           Leverling            Sales Representative           2           Fuller
4           Peacock              Sales Representative           2           Fuller
5           Buchanan             Sales Manager                  2           Fuller
6           Suyama               Sales Representative           5           Buchanan
7           King                 Sales Representative           5           Buchanan
8           Callahan             Inside Sales Coordinator       2           Fuller
9           Dodsworth            Sales Representative           5           Buchanan
We need to tweak this a little as the top manager (Fuller - Vice President of Sales) is missing, we need an Outer Join.
SELECT 
   E.EmployeeID, E.LastName, E.Title, E.ReportsTo, M.LastName
FROM Employees M
RIGHT OUTER JOIN Employees E ON (E.ReportsTo = M.EmployeeID)
Now showing Fuller - with Nulls for who he reports to.
EmployeeID  LastName             Title                          ReportsTo   LastName
----------- -------------------- ------------------------------ ----------- --------------------
1           Davolio              Sales Representative           2           Fuller
2           Fuller               Vice President, Sales          NULL        NULL
3           Leverling            Sales Representative           2           Fuller
4           Peacock              Sales Representative           2           Fuller
5           Buchanan             Sales Manager                  2           Fuller
6           Suyama               Sales Representative           5           Buchanan
7           King                 Sales Representative           5           Buchanan
8           Callahan             Inside Sales Coordinator       2           Fuller
9           Dodsworth            Sales Representative           5           Buchanan
There we go. Let's do this with a recursive CTE. The anchor condition will be when the ReportsTo column is NULL - i.e. when we have an employee who doesn't report to anyone else.
SELECT ReportsTo, EmployeeID, LastName, FirstName, Title
FROM Employees 
WHERE ReportsTo IS NULL
Doesn't seem like much - but let's think of this as a separate query. Imagine our CTE is called CTEEmployees and this SQL refers to it
SELECT e.ReportsTo, e.EmployeeID, e.LastName, e.FirstName, e.Title
FROM Employees AS e
    INNER JOIN CTEEmployees AS d
    ON e.ReportsTo = d.EmployeeID 
Putting this together into a CTE and a query to use the CTE we would have
WITH CTEEmployees(ReportsTo, EmployeeID, LastName, FirstName, Title) AS 
(
    SELECT ReportsTo, EmployeeID, LastName, FirstName, Title
    FROM Employees 
    WHERE ReportsTo IS NULL
    UNION ALL
    SELECT e.ReportsTo, e.EmployeeID, e.LastName, e.FirstName, e.Title
    FROM Employees AS e
        INNER JOIN CTEEmployees AS d
        ON e.ReportsTo = d.EmployeeID 
)
SELECT EmployeeID, LastName, FirstName, Title, ReportsTo
FROM CTEEmployees
ORDER BY EmployeeID
Giving the results
EmployeeID  LastName             FirstName  Title                          ReportsTo
----------- -------------------- ---------- ------------------------------ -----------
1           Davolio              Nancy      Sales Representative           2
2           Fuller               Andrew     Vice President, Sales          NULL
3           Leverling            Janet      Sales Representative           2
4           Peacock              Margaret   Sales Representative           2
5           Buchanan             Steven     Sales Manager                  2
6           Suyama               Michael    Sales Representative           5
7           King                 Robert     Sales Representative           5
8           Callahan             Laura      Inside Sales Coordinator       2
9           Dodsworth            Anne       Sales Representative           5
I'm not sure it was worth the hassle to get there when an outer join did this for me. But let's add another column to our output
WITH CTEEmployees(ReportsTo, EmployeeID, LastName, FirstName, Title, EmployeeLevel) AS 
(
    SELECT ReportsTo, EmployeeID, LastName, FirstName, Title, 0
    FROM Employees 
    WHERE ReportsTo IS NULL
    UNION ALL
    SELECT e.ReportsTo, e.EmployeeID, e.LastName, e.FirstName, e.Title, d.EmployeeLevel + 1
    FROM Employees AS e
        INNER JOIN CTEEmployees AS d
        ON e.ReportsTo = d.EmployeeID 
)
SELECT EmployeeID, LastName, FirstName, Title, ReportsTo, EmployeeLevel
FROM CTEEmployees
ORDER BY EmployeeID
Giving us the same output with a column showing the level
EmployeeID  LastName             FirstName  Title                          ReportsTo   EmployeeLevel
----------- -------------------- ---------- ------------------------------ ----------- -------------
1           Davolio              Nancy      Sales Representative           2           1
2           Fuller               Andrew     Vice President, Sales          NULL        0
3           Leverling            Janet      Sales Representative           2           1
4           Peacock              Margaret   Sales Representative           2           1
5           Buchanan             Steven     Sales Manager                  2           1
6           Suyama               Michael    Sales Representative           5           2
7           King                 Robert     Sales Representative           5           2
8           Callahan             Laura      Inside Sales Coordinator       2           1
9           Dodsworth            Anne       Sales Representative           5           2
This extra column would be useful when displaying the output.

The actual use of recursive of CTE is going to be limited to data that is stored in a hierarchical structure - i.e. with self joins.

No comments:

Post a Comment