Let's set a problem to solve. I know I can solve this with some other SQL. Let's work with the Northwind database and the [Orders] and [Order Details] tables. An order (in the [Orders] table) can consist of multiple parts - the [Order Details]). Note that I have put the table names here in [] as the Northwind database have table names with spaces in them!
SELECT OrderID, CustomerID, OrderDate FROM OrdersHere are some of the columns and rows from the Orders table.
OrderID CustomerID OrderDate ----------- ---------- ----------------------- 10248 VINET 1996-07-04 00:00:00.000 10249 TOMSP 1996-07-05 00:00:00.000 10250 HANAR 1996-07-08 00:00:00.000 10251 VICTE 1996-07-08 00:00:00.000 ..... 11076 BONAP 1998-05-06 00:00:00.000 11077 RATTC 1998-05-06 00:00:00.000 (830 row(s) affected)And the order details with this SQL
SELECT * FROM [Order Details]giving
OrderID ProductID UnitPrice Quantity Discount ----------- ----------- --------------------- -------- ------------- 10248 11 14.00 12 0 10248 42 9.80 10 0 10248 72 34.80 5 0 10249 14 18.60 9 0 10249 51 42.40 40 0 10250 41 7.70 10 0 10250 51 42.40 35 0.15 10250 65 16.80 15 0.15 ..... 11077 75 7.75 4 0 11077 77 13.00 2 0 (2155 row(s) affected)Here is the first problem we want to solve. Return the highest order item for each order. With an ordinary correlated sub-query this can be done with this SQL
SELECT O.OrderID, O.Freight, (SELECT (MAX((UnitPrice * Quantity) * (1.0-Discount))) FROM [Order Details] OD WHERE OD.OrderID = O.OrderID ) AS MaxOrder FROM Orders OGives us this output
OrderID Freight MaxOrder ----------- --------------------- ------------- 10248 32.38 174 10249 11.61 1696 10250 65.83 1261.4 ..... 11076 38.28 375 11077 8.53 364.8 (830 row(s) affected)We have the highest order detail (as calculated using the unit price, quantity and discount). But what we don't know is the product for the highest order. A while ago I posted about using a Common Table Expression (CTE) as well as OVER with PARTITION, so let's do this with a CTE.
;WITH CTEOrderDetails AS ( SELECT OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost, ROW_NUMBER() OVER (PARTITION BY OD.OrderID ORDER BY ((UnitPrice * Quantity) * (1.0-Discount)) DESC) As RowNum FROM [Order Details] OD) SELECT O.OrderId, O.Freight, OD.ProductID, OD.TotalCost FROM Orders O JOIN CTEOrderDetails OD ON (OD.OrderID = O.OrderID) WHERE OD.RowNum = 1And now we can get the output we want
OrderId Freight ProductID TotalCost ----------- --------------------- ----------- ------------- 10248 32.38 72 174 10249 11.61 51 1696 10250 65.83 51 1261.4 ..... 11076 38.28 6 375 11077 8.53 2 364.8 (830 row(s) affected)So a common table expression allows us to do this problem. Now let's look at the solution using CROSS APPLY. Before doing this I looked at this as some SQL - it is a correlated join (which you can't do!)
SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost FROM Orders O JOIN (SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost FROM [Order Details] OD WHERE OD.OrderID = O.OrderID ) OO ON (O.OrderDate = OO.OrderID)The idea is to have an "inline view" (i.e. a query) but trying some sort of correlation. When I have a correlated sub-query, I think of it as like a "foreach" statement - for each row returned by the main query, run the sub-query. Which is what I was trying here. But if you change the JOIN to CROSS APPLY and remove the ON clause so you have this SQL
SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost FROM Orders O CROSS APPLY (SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost FROM [Order Details] OD WHERE OD.OrderID = O.OrderID ) OOYou get the results we are looking for
OrderId Freight ProductID TotalCost ----------- --------------------- ----------- ------------- 10248 32.38 11 168 10249 11.61 14 167.4 10250 65.83 41 77 ..... 11077 8.53 2 364.8 (830 row(s) affected)A little description of what is happening might be useful. The CROSS APPLY is sort of like an INNER JOIN, in so much as the rows returned from the main query are all used with any matches from the query within the CROSS APPLY clause. So for each row returned by the main query find a row in the CROSS APPLY query (which has a reference to the outer query - here in the WHERE clause). Let's prove the fact that only rows in the main query are returned and insert a row into the [Orders] table
SET IDENTITY_INSERT Orders ON INSERT INTO Orders (OrderID, CustomerID, EmployeeID, OrderDate, RequiredDate, ShippedDate, ShipVia, Freight, ShipName, ShipAddress, ShipCity, ShipRegion, ShipPostalCode, ShipCountry) VALUES (99999, 'VINET', 5, GETDATE(), GETDATE(), GETDATE(), 3, 40.10, 'Connor Macleod', 'Clan Maclead', 'Glenfinnan', 'Highlands', 'N/A', 'Scotland') SET IDENTITY_INSERT Orders OFFIf you re-run the query you will see that order 99999 does not display.
You may have noticed in my description of CROSS APPLY I didn't use the terms inner or outer. Microsoft have't use it in it's naming, but if you read the articles on this it talks about the outer query (the query not in the CROSS APPLY). There is a further option we can use - OUTER APPLY. The OUTER APPLY will include the rows from the outer query that do not have matches in the inner query. So using this query
SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost FROM Orders O OUTER APPLY (SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost FROM [Order Details] OD WHERE OD.OrderID = O.OrderID ) OOWill show the row that we require
OrderId Freight ProductID TotalCost ----------- --------------------- ----------- ------------- 10248 32.38 11 168 10249 11.61 14 167.4 10250 65.83 41 77 ..... 11076 38.28 6 375 11077 8.53 2 364.8 99999 40.10 NULL NULL (831 row(s) affected)If you want to do this with a Common Table Expression you need to firstly change the join (in our case to a LEFT OUTER JOIN) and secondly add a clause to the WHERE (remember we are only returning the top row - which would have RowNum = 1), we now need to include the rows that aren't included (so RowNum IS NULL).
;WITH CTEOrderDetails AS ( SELECT OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost, ROW_NUMBER() OVER (PARTITION BY OD.OrderID ORDER BY ((UnitPrice * Quantity) * (1.0-Discount)) DESC) As RowNum FROM [Order Details] OD) SELECT O.OrderId, O.Freight, OD.ProductID, OD.TotalCost FROM Orders O LEFT OUTER JOIN CTEOrderDetails OD ON (OD.OrderID = O.OrderID) WHERE OD.RowNum = 1 OR OD.RowNum IS NULLNow we have the correct results.
No comments:
Post a Comment