Andrew Breen: September 2013

Thursday, 19 September 2013

SQL: User Defined Functions - Multi Statement Table Valued

Nearly finished this ad-hoc series of articles. This started as an exploration of CROSS APPLY and took me back to using user defined functions and how to use functions that return tables.

Our Scalar user defined function returned a single value, but we could have complex code. This is akin to a stored procedure, with the change that we could use the value returned within our SQL. The inline table valued function gave us the ability to return a results set that we could use within our FROM clauses. It was similar to a view (and could be used like a view, in that it could be updateable as well - depending on the SQL).

This type - multi statement table valued - of function creates a table variable which we then populate within our function and return it. We can put additional code within our function rather than a single SELECT statement. So now we can return a complex result set - and use this within our SQL. To be able to do this without functions, you would use a stored procedure and create a table variable (or a temporary table) and execute the stored procedure.

I always like a solution to solve, rather than showing some syntax. So I'd like a function that returns for a particular employee (in Northwind again) the employees formatted name, their managers name and the number of people that reports to the employee.

The first part - the employee name and manager name can be found with this SQL

SELECT 
        E.EmployeeID, 
        E.TitleOfCourtesy + ' ' + E.FirstName + ' ' + E.LastName,
        M.TitleOfCourtesy + ' ' + M.FirstName + ' ' + M.LastName AS ManagerName
    FROM Employees E
    LEFT OUTER JOIN Employees M ON (E.ReportsTo = M.EmployeeID)
    WHERE E.EmployeeID=2

And then to get the number of reports, we can use this

SELECT COUNT(*)
    FROM Employees
    WHERE ReportsTo = 2

It is possible to do this as one query (as a sub-query), but for the sake of this example we are going to perform both SQL within this Multi Statement Table Valued User Defined function (try saying that after a few beers!).

As a function this is what we would write

CREATE FUNCTION EmployeeInfo(@EmployeeID int)
RETURNS @EmployeeDetails TABLE
   (EmployeeID int,
    EmployeeName nvarchar(20),
    ManagerName nvarchar(20),
    NumberOfReports int
   )
AS
BEGIN
    INSERT INTO @EmployeeDetails
        SELECT 
            E.EmployeeID, 
            E.TitleOfCourtesy + ' ' + E.FirstName + ' ' + E.LastName,
            M.TitleOfCourtesy + ' ' + M.FirstName + ' ' + M.LastName AS ManagerName,
            NULL
        FROM Employees E
        LEFT OUTER JOIN Employees M ON (E.ReportsTo = M.EmployeeID)
        WHERE E.EmployeeID=@EmployeeID

    DECLARE @NumberOfReports int

    SELECT @NumberOfReports = COUNT(*)
    FROM Employees
    WHERE ReportsTo = @EmployeeID

    UPDATE @EmployeeDetails SET NumberOfReports = @NumberOfReports
    RETURN
END

This can be executed with

SELECT * FROM dbo.EmployeeInfo(5)

Wednesday, 18 September 2013

SQL: User Defined Functions - Inline Table Valued

The next step on our User Defined Functions journey (this didn't start as a series of articles, but a quick demo of the CROSS APPLY statement!) is inline table valued functions.

They are very like a view. So let's start with an example of a view. Again from Northwind, we will work with a view that returns for each order

Order ID
Customer ID
Customer Name
Order Total (which is calculated from the Order Details)

To package as a view we will have this SQL

CREATE VIEW OrderSummary
AS 
SELECT 
   O.OrderID, C.CustomerID, C.CompanyName,
   CONVERT(money,SUM((OD.UnitPrice * OD.Quantity) * (1-OD.Discount))) AS OrderTotal
FROM Orders O
JOIN [Order Details] OD ON (O.OrderID = OD.OrderID)
JOIN Customers C ON (C.CustomerID = O.CustomerID)
GROUP BY 
 O.OrderID, C.CustomerID, C.CompanyName

Giving us the following output

OrderID     CustomerID CompanyName                              OrderTotal
----------- ---------- ---------------------------------------- ---------------------
10643       ALFKI      Alfreds Futterkiste                      814.50
10692       ALFKI      Alfreds Futterkiste                      878.00
10702       ALFKI      Alfreds Futterkiste                      330.00
...
10998       WOLZA      Wolski  Zajazd                           686.00
11044       WOLZA      Wolski  Zajazd                           591.60

(830 row(s) affected)

Thus we can use this within our SQL statements almost as if it is another table (and thus JOIN to it). If we want to restrict the output we use a WHERE clause, such as

SELECT * FROM OrderSummary
WHERE OrderTotal BETWEEN 140 and 145
ORDER BY OrderTotal

Which corresponds to these five rows

OrderID     CustomerID CompanyName                              OrderTotal
----------- ---------- ---------------------------------------- ---------------------
10809       WELLI      Wellington Importadora                   140.00
10349       SPLIR      Split Rail Beer & Ale                    141.60
10585       WELLI      Wellington Importadora                   142.50
10321       ISLAT      Island Trading                           144.00
10334       VICTE      Victuailles en stock                     144.80

(5 row(s) affected)

Now to do the exact same with a function we would create one, but return a TABLE

CREATE FUNCTION fOrderSummary1()
RETURNS TABLE
AS
RETURN(
   SELECT 
      O.OrderID, C.CustomerID, C.CompanyName,
      CONVERT(money,SUM((OD.UnitPrice * OD.Quantity) * (1-OD.Discount))) AS OrderTotal
   FROM Orders O
   JOIN [Order Details] OD ON (O.OrderID = OD.OrderID)
   JOIN Customers C ON (C.CustomerID = O.CustomerID)
   GROUP BY 
      O.OrderID, C.CustomerID, C.CompanyName
)

And we can execute this, and restrict the rows with

SELECT * FROM dbo.fOrderSummary1()
WHERE OrderTotal BETWEEN 140 and 145
ORDER BY OrderTotal

With the same output. So nothing more than we had before. However, we can create a function that takes parameters. In this case we can pass the lower and upper bounds.

CREATE FUNCTION fOrderSummary2(@Lower int, @Upper int)
RETURNS TABLE
AS
RETURN(
   SELECT 
      O.OrderID, C.CustomerID, C.CompanyName,
      CONVERT(money,SUM((OD.UnitPrice * OD.Quantity) * (1-OD.Discount))) AS OrderTotal
   FROM Orders O
   JOIN [Order Details] OD ON (O.OrderID = OD.OrderID)
   JOIN Customers C ON (C.CustomerID = O.CustomerID)
   GROUP BY 
      O.OrderID, C.CustomerID, C.CompanyName
   HAVING CONVERT(money,SUM((OD.UnitPrice * OD.Quantity) * (1-OD.Discount))) 
          BETWEEN @Lower AND @Upper
)

Which we can now use as

SELECT * FROM dbo.fOrderSummary2(140, 145)
ORDER BY OrderTotal

Which removes the WHERE clause in the main SQL - and hides it away.

Rules for inline table-valued functions are similar as for a view - if the SELECT statement is updateable, then the function will also be updateable.

Monday, 16 September 2013

SQL: User Defined Functions - Scalar - Round up

I might have posted my last article on User Defined Functions (UDF) in SQL a little quick. A few words of warning are probably required, or at least be aware. Plus a discussion of using a UDF in a computed column and a persisted computed column (using the PERSISTED keyword).

The example I gave is a simple function to work with some parameters. In this case we expect that one of the parameters might be NULL and handle this. Because this is some functionality we may want to reuse, we are utilising a UDF. All the good reasons for using a function.

The UDF I showed looks very much like a stored procedure. You can put other code within the function, perform some logic and access other tables. We had a little bit of logic in here (i.e. check if Region IS NULL).

Accessing Other tables

Lets show some use of accessing another table. I am going to make an example with a new table. But firstly, has anyone noticed that in the Northwind database Employee "Robert King" has a postcode RG1 9SP, but has the city as London - not Reading as per the postcode. Let's change his City from London to Reading.

UPDATE Employees SET City='Reading' WHERE EmployeeID=7

Now Reading is in Berkshire, but for the sake of argument (or internal Northwind politics!) we are not able to change the Employees table. We will create another table for Alternative Regions, with City and Region to use

CREATE TABLE AlternativeRegions
( City nvarchar(30) NOT NULL PRIMARY KEY,
  Region nvarchar(30) NOT NULL )

INSERT INTO AlternativeRegions VALUES
  ('Reading', 'Berkshire')

Now we can update our function to access this table

ALTER FUNCTION Location 
     (@City nvarchar(30), @Region nvarchar(30), @PostalCode nvarchar(20))
RETURNS nvarchar(80)
AS
BEGIN
 DECLARE @ret nvarchar(80)
 DECLARE @NewRegion nvarchar(30)
 IF @Region IS NULL
 BEGIN
  SELECT @NewRegion = Region
  FROM AlternativeRegions
  WHERE City = @City

  SELECT @ret = 
      CASE WHEN @NewRegion IS NULL THEN
          @City + ' ' + @PostalCode
        ELSE 
       @City + ', ' + @NewRegion + ' ' + @PostalCode
   END
 END
 ELSE
 BEGIN
  SET @ret = @City + ', ' + @Region + ' ' + @PostalCode
 END
RETURN @ret
END

And we can use this when Region is NULL

SELECT dbo.Location('Reading', NULL, 'RG1 9SP')

producing

Reading, Berkshire RG1 9SP

or within SQL

SELECT 
  EmployeeID, LastName,
  dbo.Location(City, Region, PostalCode) AS Location
FROM Employees

Performance

This example is just to show that you can access other tables.

But, I do need to ask when doing something like this why would you. Why not write SQL and structure to allow you to do something about this. If (as in the last piece of SQL) you were accessing all the rows in the Employee table would you then want to do a look-up of another table? And on a per-row basis (i.e. the table lookup will be done for each row returned).

Doing this type of modification, rather than changing the database structure to allow you the functionality you want seems like a maintenance nightmare. But on a performance point you are accessing each row. If you were doing it with only one row then maybe.

An example to really show this, is let's imagine we don't pass as parameters the City, Region and Postcode, but just have a function that takes the Employee ID.

CREATE FUNCTION EmployeeLocation
   (@EmployeeID int)
RETURNS nvarchar(80)
AS
BEGIN
 DECLARE @ret nvarchar(80)

 SELECT @ret = dbo.Location(City, Region, PostalCode)
 FROM Employees
 WHERE EmployeeID = @EmployeeID

 RETURN @ret
END
GO

And we would use this with

SELECT dbo.EmployeeLocation(9)

For accessing one employee, then we get the output

London WG2 7LT

But we can also do this

SELECT 
 EmployeeID, LastName, dbo.EmployeeLocation(EmployeeID)
FROM 
 Employees

And get our expected output. However, we are now accessing the Employees table once for the main query and each function will then access the Employees table for each row.

My advice is to be careful. SQL is very good with queries and how to optimise them. But when functions are involved it is more difficult.

One thing you cannot do in a is change a table. Anything you do within a scalar function is read-only

Using a computed column

Rather than calling the function you could add a computed column to your table. So in this example

ALTER TABLE Employees
   ADD Location AS dbo.Location(City, Region, PostalCode)

And then we can retrieve a column as if it is part of the table

SELECT EmployeeID, LastName, Location FROM Employees

By default computed columns are not physically stored in the database - but calculated each time they are referenced. There is an option that we can use - PERSISTED computed columns. Let's first try and use the Location function as PERSISTED, with this SQL

ALTER TABLE Employees
   ADD LocationPersisted AS dbo.Location(City, Region, PostalCode) PERSISTED

Gives us an error

Msg 4936, Level 16, State 1, Line 1
Computed column 'LocationPersisted' in table 'Employees'

   cannot be persisted because the column is non-deterministic.

So what is a non-deterministic column. A non-deterministic function returns different results each time they are called with a specific set of parameters, where as a deterministic function will always return the same result any time they are called. Our current Location function accesses our AlternativeRegions table - so this immediately makes it non deterministic. So let's revert to our previous version and create a function LocationNonDeterministic

ALTER FUNCTION LocationNonDeterministic 
     (@City nvarchar(30), @Region nvarchar(30), @PostalCode nvarchar(20))
RETURNS nvarchar(80)
WITH SCHEMABINDING
AS
BEGIN
 DECLARE @ret varchar(30)
 SELECT @ret=
   CASE
  WHEN @Region IS NULL THEN @City + ' ' + @PostalCode
  ELSE @City + ', ' + @Region + ' ' + @PostalCode
   END
 
RETURN @ret
END

Now if we try and add this computed column as PERSISTED we will get the same error.

Another requirement of a non-deterministic is it is "schema bound", here we add WITH SCHEMABINDING. This essentially means that the since the function is used elsewhere, we need to remove the dependencies to it before changing it. To create the function, we can do this

ALTER FUNCTION LocationNonDeterministic 
     (@City nvarchar(30), @Region nvarchar(30), @PostalCode nvarchar(20))
RETURNS nvarchar(80)
WITH SCHEMABINDING
AS
BEGIN
 DECLARE @ret varchar(30)
 SELECT @ret=
   CASE
  WHEN @Region IS NULL THEN @City + ' ' + @PostalCode
  ELSE @City + ', ' + @Region + ' ' + @PostalCode
   END
 
RETURN @ret
END
GO

And now to add the persisted computed column

ALTER TABLE Employees
   ADD LocationPersisted AS dbo.LocationNonDeterministic(City, Region, PostalCode) PERSISTED

Now we have a computed column which is persisted in the database. Remember our table has now got bigger. Also, we cannot change this function without dropping the persisted column, changing the function and then adding the persisted computed column back.

Friday, 13 September 2013

SQL: User Defined Functions - Scalar

The previous article on using CROSS APPLY discussed that it's use might not be something that you wouldn't use often. So a little bit of digging and one use where it has to be use is with a user defined functions. I decided to do a little bit of digging on this.

When I first started with SQL you couldn't created your own functions, but functionality was added with SQL Server 2000. That's a release I skipped and I first used them in SQL Server 2005. One type I didn't use was table functions. Before showing the options here, I will show the use of scalar user defined functions.

Again, using the Northwind database looking at the Employees table

SELECT 
   EmployeeID, LastName, City, Region, Postalcode, Country 
FROM Employees

Giving us

EmployeeID  LastName             City            Region          Postalcode Country
----------- -------------------- --------------- --------------- ---------- ---------------
1           Davolio              Seattle         WA              98122      USA
2           Fuller               Tacoma          WA              98401      USA
3           Leverling            Kirkland        WA              98033      USA
4           Peacock              Redmond         WA              98052      USA
5           Buchanan             London          NULL            SW1 8JR    UK
6           Suyama               London          NULL            EC2 7JR    UK
7           King                 London          NULL            RG1 9SP    UK
8           Callahan             Seattle         WA              98105      USA
9           Dodsworth            London          NULL            WG2 7LT    UK

Now Region can be null (for employees in the United Kingdom). Let's imagine the task that I want to solve is to return the location (city, region, postalcode and country) as one string. So the SQL rules are that if you to concatenate a null the result is a null, thus

SELECT 
    EmployeeID, LastName, City + ', ' + Region + ' ' + Postalcode + ' ' + Country 
FROM 
    Employees

Now gives us NULLS for all the addresses in the UK

EmployeeID  LastName             
----------- -------------------- -----------------------------------------------------------
1           Davolio              Seattle, WA 98122 USA
2           Fuller               Tacoma, WA 98401 USA
3           Leverling            Kirkland, WA 98033 USA
4           Peacock              Redmond, WA 98052 USA
5           Buchanan             NULL
6           Suyama               NULL
7           King                 NULL
8           Callahan             Seattle, WA 98105 USA
9           Dodsworth            NULL

The solution is to put a CASE statement in, thus this SQL

SELECT 
  EmployeeID, LastName,
  CASE
    WHEN Region IS NULL THEN City + ' ' + PostalCode
    ELSE City + ', ' + Region + ' ' + PostalCode
  END AS Location
FROM Employees

Will give us some nicely formatted text

EmployeeID  LastName             Location
----------- -------------------- -------------------------------------------
1           Davolio              Seattle, WA 98122
2           Fuller               Tacoma, WA 98401
3           Leverling            Kirkland, WA 98033
4           Peacock              Redmond, WA 98052
5           Buchanan             London SW1 8JR
6           Suyama               London EC2 7JR
7           King                 London RG1 9SP
8           Callahan             Seattle, WA 98105
9           Dodsworth            London WG2 7LT

But this might look a little messy, especially if this is something we do a lot. So we can create a user defined function

CREATE FUNCTION Location 
     (@City nvarchar(30), @Region nvarchar(30), @PostalCode nvarchar(20))
RETURNS nvarchar(80)
AS
BEGIN
 DECLARE @ret varchar(30)
 SELECT @ret=
   CASE
  WHEN @Region IS NULL THEN @City + ' ' + @PostalCode
  ELSE @City + ', ' + @Region + ' ' + @PostalCode
   END 

RETURN @ret
END
GO

So this SQL will return it

SELECT dbo.Location('Wallingford', 'Oxon', 'OX10')

Gives us

Wallingford, Oxon OX10

A function we can use in our SQL. Taking our Employees table we can now do this

SELECT 
  EmployeeID, LastName,
  dbo.Location(City, Region, PostalCode) AS Location
FROM Employees

Which gives us the expected result

EmployeeID  LastName             Location
----------- -------------------- ----------------------------------------------------------
1           Davolio              Seattle, WA 98122
2           Fuller               Tacoma, WA 98401
3           Leverling            Kirkland, WA 98033
4           Peacock              Redmond, WA 98052
5           Buchanan             London SW1 8JR
6           Suyama               London EC2 7JR
7           King                 London RG1 9SP
8           Callahan             Seattle, WA 98105
9           Dodsworth            London WG2 7LT

Wednesday, 11 September 2013

SQL: CROSS APPLY and OUTER APPLY

I didn't even notice that something called CROSS APPLY was added to SQL Server (in 2005) until recently. But found it difficult to work out when you would use it - in fact occasionally still not sure as I think I could do most things with other queries. Also not sure I am creating a convoluted example of it's use. Generally I think if you are struggling to work out how to use it, it is possibly an indication of something you wouldn't use often.

Let's set a problem to solve. I know I can solve this with some other SQL. Let's work with the Northwind database and the [Orders] and [Order Details] tables. An order (in the [Orders] table) can consist of multiple parts - the [Order Details]). Note that I have put the table names here in [] as the Northwind database have table names with spaces in them!

SELECT OrderID, CustomerID, OrderDate FROM Orders

Here are some of the columns and rows from the Orders table.

OrderID     CustomerID OrderDate
----------- ---------- -----------------------
10248       VINET      1996-07-04 00:00:00.000
10249       TOMSP      1996-07-05 00:00:00.000
10250       HANAR      1996-07-08 00:00:00.000
10251       VICTE      1996-07-08 00:00:00.000
.....
11076       BONAP      1998-05-06 00:00:00.000
11077       RATTC      1998-05-06 00:00:00.000

(830 row(s) affected)

And the order details with this SQL

SELECT * FROM [Order Details]

giving

OrderID     ProductID   UnitPrice             Quantity Discount
----------- ----------- --------------------- -------- -------------
10248       11          14.00                 12       0
10248       42          9.80                  10       0
10248       72          34.80                 5        0
10249       14          18.60                 9        0
10249       51          42.40                 40       0
10250       41          7.70                  10       0
10250       51          42.40                 35       0.15
10250       65          16.80                 15       0.15
.....
11077       75          7.75                  4        0
11077       77          13.00                 2        0

(2155 row(s) affected)

Here is the first problem we want to solve. Return the highest order item for each order. With an ordinary correlated sub-query this can be done with this SQL

SELECT O.OrderID, O.Freight,
(SELECT (MAX((UnitPrice * Quantity) * (1.0-Discount)))
  FROM [Order Details] OD 
  WHERE OD.OrderID = O.OrderID ) AS MaxOrder
FROM Orders O

Gives us this output

OrderID     Freight               MaxOrder
----------- --------------------- -------------
10248       32.38                 174
10249       11.61                 1696
10250       65.83                 1261.4
.....
11076       38.28                 375
11077       8.53                  364.8

(830 row(s) affected)

We have the highest order detail (as calculated using the unit price, quantity and discount). But what we don't know is the product for the highest order. A while ago I posted about using a Common Table Expression (CTE) as well as OVER with PARTITION, so let's do this with a CTE.

;WITH CTEOrderDetails
AS (
SELECT 
OD.OrderID, OD.ProductID,  ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost, 
ROW_NUMBER() OVER (PARTITION BY OD.OrderID ORDER BY ((UnitPrice * Quantity) * (1.0-Discount)) DESC) As RowNum
FROM [Order Details] OD)

SELECT O.OrderId, O.Freight, OD.ProductID, OD.TotalCost
FROM Orders O
JOIN CTEOrderDetails OD ON (OD.OrderID = O.OrderID)
WHERE OD.RowNum = 1

And now we can get the output we want

OrderId     Freight               ProductID   TotalCost
----------- --------------------- ----------- -------------
10248       32.38                 72          174
10249       11.61                 51          1696
10250       65.83                 51          1261.4
.....
11076       38.28                 6           375
11077       8.53                  2           364.8

(830 row(s) affected)

So a common table expression allows us to do this problem. Now let's look at the solution using CROSS APPLY. Before doing this I looked at this as some SQL - it is a correlated join (which you can't do!)

SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost
FROM Orders O
JOIN
(SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost
FROM [Order Details] OD 
WHERE OD.OrderID = O.OrderID
) OO ON (O.OrderDate = OO.OrderID)

The idea is to have an "inline view" (i.e. a query) but trying some sort of correlation. When I have a correlated sub-query, I think of it as like a "foreach" statement - for each row returned by the main query, run the sub-query. Which is what I was trying here. But if you change the JOIN to CROSS APPLY and remove the ON clause so you have this SQL

SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost
FROM Orders O
CROSS APPLY
(SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost
FROM [Order Details] OD 
WHERE OD.OrderID = O.OrderID
) OO

You get the results we are looking for

OrderId     Freight               ProductID   TotalCost
----------- --------------------- ----------- -------------
10248       32.38                 11          168
10249       11.61                 14          167.4
10250       65.83                 41          77
.....
11077       8.53                  2           364.8

(830 row(s) affected)

A little description of what is happening might be useful. The CROSS APPLY is sort of like an INNER JOIN, in so much as the rows returned from the main query are all used with any matches from the query within the CROSS APPLY clause. So for each row returned by the main query find a row in the CROSS APPLY query (which has a reference to the outer query - here in the WHERE clause). Let's prove the fact that only rows in the main query are returned and insert a row into the [Orders] table

SET IDENTITY_INSERT Orders ON
INSERT INTO Orders 
 (OrderID, CustomerID, EmployeeID, OrderDate, RequiredDate, ShippedDate, ShipVia, Freight, ShipName, ShipAddress, ShipCity, ShipRegion, ShipPostalCode, ShipCountry)
VALUES
 (99999, 'VINET', 5, GETDATE(), GETDATE(), GETDATE(), 3, 40.10, 'Connor Macleod', 'Clan Maclead', 'Glenfinnan', 'Highlands', 'N/A', 'Scotland')
SET IDENTITY_INSERT Orders OFF

If you re-run the query you will see that order 99999 does not display.

You may have noticed in my description of CROSS APPLY I didn't use the terms inner or outer. Microsoft have't use it in it's naming, but if you read the articles on this it talks about the outer query (the query not in the CROSS APPLY). There is a further option we can use - OUTER APPLY. The OUTER APPLY will include the rows from the outer query that do not have matches in the inner query. So using this query

SELECT O.OrderId, O.Freight, OO.ProductID, OO.TotalCost
FROM Orders O
OUTER APPLY
(SELECT TOP 1 OD.OrderID, OD.ProductID, ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost
FROM [Order Details] OD 
WHERE OD.OrderID = O.OrderID
) OO

Will show the row that we require

OrderId     Freight               ProductID   TotalCost
----------- --------------------- ----------- -------------
10248       32.38                 11          168
10249       11.61                 14          167.4
10250       65.83                 41          77
.....
11076       38.28                 6           375
11077       8.53                  2           364.8
99999       40.10                 NULL        NULL

(831 row(s) affected)

If you want to do this with a Common Table Expression you need to firstly change the join (in our case to a LEFT OUTER JOIN) and secondly add a clause to the WHERE (remember we are only returning the top row - which would have RowNum = 1), we now need to include the rows that aren't included (so RowNum IS NULL).

;WITH CTEOrderDetails
AS (
SELECT 
OD.OrderID, OD.ProductID,  ((UnitPrice * Quantity) * (1.0-Discount)) AS TotalCost, 
ROW_NUMBER() OVER (PARTITION BY OD.OrderID ORDER BY ((UnitPrice * Quantity) * (1.0-Discount)) DESC) As RowNum
FROM [Order Details] OD)

SELECT O.OrderId, O.Freight, OD.ProductID, OD.TotalCost
FROM Orders O
LEFT OUTER JOIN CTEOrderDetails OD ON (OD.OrderID = O.OrderID)
WHERE OD.RowNum = 1 OR OD.RowNum IS NULL

Now we have the correct results.

Andrew Breen