My first introduction to this to look at sub-queries and inline views and show how a Common Table Expression can be used instead.
We are going to use the Northwind database for this example. The problem that I want to solve is to calculate is the average number of orders that each customer will make (using only customers who have made orders before - just for simplicity).
SELECT COUNT(*) AS NumberOfOrders FROM Orders SELECT COUNT(DISTINCT CustomerID) AS NumberOfCustomersWhoHaveMadeOrders FROM Orderswith the results
NumberOfOrders -------------- 830 NumberOfCustomersWhoHaveMadeOrders ---------------------------------- 89The answer is 830/89 - which is 9.3258. The following SQL will do this for you
SELECT COUNT(DISTINCT O.OrderID) / CAST(COUNT(DISTINCT O.CustomerID) AS float) AS AverageNumberOfOrdersPerCustomer FROM Orders Owith the results
AverageNumberOfOrdersPerCustomer --------------------------------------- 9.32584269662921One small downside of this - if there are no orders you will get a divide by zero error. In this simple example you won't see this, but taking the problem further we might want to retrieve the average number of orders by month - and we might not have any orders in one month. But the point of this is how to do this with a Common Table Expression.
WITH CTEOrders AS (SELECT CustomerID, COUNT(*) AS NumberOfOrdersPerCustomer FROM Orders GROUP BY CustomerID) SELECT AVG(CAST(NumberOfOrdersPerCustomer AS float)) AS AverageNumberOfOrdersPerCustomer FROM CTEOrderswith the results
AverageNumberOfOrdersPerCustomer -------------------------------- 9.32584269662921Let's look at the syntax of the CTE. We start with the WITH clause followed by the name of the Common Table Expression. Following this we can specify (in brackets) the names of the columns. This is optional if the columns are named within the SELECT statement within the CTE. The SELECT statement is enclosed in brackets.
WITH CTEName (column1, column2, ...) AS (SELECT statement)Immediately following the Common Table Expression definition you must use it. After the next statement the result of the CTE is gone.
Which is the best approach? I have always had problems reading Execution Plans, but we can have a go. The execution plan for the first query is this
Execution plan for standard query - click to enlarge |
Execution plan for Common Table Expression - click to enlarge |
Statistics for standard query |
Statistics for Common Table Expression |
So very similar in timing and the same results. But lets look at the name - it is called a Common Table Expression. One of the uses is to reuse the results from the CTE. So imagine that as well as the average I want the maximum number of orders per customer and minimum number of orders per customer. To do this with a CTE the query is
WITH CTEOrders AS (SELECT CustomerID, COUNT(*) AS NumberOfOrdersPerCustomer FROM Orders GROUP BY CustomerID) SELECT MAX(NumberOfOrdersPerCustomer) AS MaxOrders, MIN(NumberOfOrdersPerCustomer) AS MinOrders, AVG(CAST(NumberOfOrdersPerCustomer AS float)) AS AverageNumberOfOrdersPerCustomer FROM CTEOrdersGiving the results
MaxOrders MinOrders AverageNumberOfOrdersPerCustomer ----------- ----------- -------------------------------- 31 1 9.32584269662921Without much additional cost - the extra aggregates will be working with the data returned by the CTE. Doing this with the standard query will require quite a bit more work.
No comments:
Post a Comment