Friday, 17 May 2013

Splitting your train ticket

I have a meeting in Birmingham next week and was looking at train fares to get there. I have two real options - one is to take the train from Didcot and change in Oxford, the other is to drive to Oxford (slightly further to drive, but no need to change).

So I looked up the prices - return from Didcot is £85 and the return from Oxford was £68. Then I thought - it isn't £17 for a return ticket Didcot to Oxford. So I looked that up, it was only £6.60. 

So I did a bit of further research. The train from Oxford to Birmingham stops at Banbury. Oxford to Banbury is £11.30 (return) and Banbury to Birmingham is £19.90 (again return) - total return fare is £37.80. Not £68.

More research on the internet to check if you can do this - and you can. As long as the train you are on stops at the station you can use individual tickets. 

There are a few downsides. You need three sets of tickets and you will find the automated ticket machines will only let you buy tickets from the station you are in - not point to point from other stations (at least as far as I can see). Not too much of a problem - you buy online. But you then have three sets of confirmation codes to enter. On top of this if you book a seat (which you can do between Oxford and Birmingham) you can't get the same seat!

It might not be my money I am spending - but there is a principle here. 

Final few bits of research - if I buy a ticket from Banbury to Lemington-Spa then I can save another ten pence (so might not bother!). And it's the same going into London from Didcot - purchasing Didcot to Reading and then Reading to London saves a couple of pounds. And that is all on First Great Western.

Website you can use is http://www.splityourticket.co.uk which is useful - but not all routes are there.

I looked up prices on the TrainLine - but then book on CrossCountry trains as there is no booking fee,

Wednesday, 15 May 2013

SQL: ISNULL and COALESCE

It frequently comes up when teaching SQL about when to use the function ISNULL or the function COALESCE when writing code in SQL.

On an initial view it looks like both functions do the same thing except that COALESCE takes more arguments.
ISNULL(expression, replacement)
COALESCE(expression1, expression2 [,.... more expressions])
Let's do an example. Firstly we need some data
CREATE TABLE Person (
   Name varchar(20) NOT NULL,
   Postcode varchar(10) NULL
)
GO

INSERT INTO Person VALUES ('Karen', 'PA2 8NR')
INSERT INTO Person VALUES ('Julie', NULL)
INSERT INTO Person VALUES ('Mauda', NULL)
INSERT INTO Person VALUES ('Ali', NULL)
INSERT INTO Person VALUES ('Heidi', '2000')
INSERT INTO Person VALUES ('Helen', NULL)
INSERT INTO Person VALUES ('Rhona', 'G12 9SR')
GO
And if we run the following SQL we get the same results
SELECT Name, ISNULL(Postcode, 'None') FROM Person
SELECT Name, COALESCE(Postcode, 'None') FROM Person
Name                 
-------------------- ----------
Karen                PA2 8NR
Julie                None
Mauda                None
Ali                  None
Heidi                2000
Helen                None
Rhona                G12 9SR
The most obvious first difference is that ISNULL takes two parameters - an expression to check against (which can be any type), and an expression to return if the first parameter is NULL. The type of the replacement must be the same as the first expression or implicitly convertible to it. So the following can be done
SELECT Name, ISNULL(Postcode, 0) FROM Person
And will return the string '0' - converting the number 0 to a string. Doing this with COALESCE will
SELECT Name, COALESCE(Postcode, 0) FROM Person
Gives the following error message
Name                 
-------------------- -----------
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value 'PA2 8NR' to data type int.
Another point related to types will be apparent with this code sample
SELECT Name, ISNULL(Postcode, 'No Postcode') FROM Person
SELECT Name, COALESCE(Postcode, 'No Postcode') FROM Person
The output looks like this
Name                 
-------------------- ----------
Karen                PA2 8NR
Julie                No Postcod
Mauda                No Postcod
Ali                  No Postcod
Heidi                2000
Helen                No Postcod
Rhona                G12 9SR

Name                 
-------------------- -----------
Karen                PA2 8NR
Julie                No Postcode
Mauda                No Postcode
Ali                  No Postcode
Heidi                2000
Helen                No Postcode
Rhona                G12 9SR
The second set of results shows the output from COALESCE - which has the full text of "No postcode" whereas ISNULL has truncated it (to 10 characters) - which is the type of the column specified (Postcode) which is varchar(10).

The type has another knock on effect you might need to be aware of. If you use the type in a computed column then ISNULL will use the type of the first column (and the type created will be non-nullable). Let's create a table with as computed column. When you do this then you do not specify the type - SQL will infer the type from the result of the expression.

In the case of ISNULL this will be the type of the first parameter.
CREATE TABLE Person2 (
   Name varchar(20) NOT NULL,
   Postcode varchar(10) NULL,
   NotNullPostcode AS ISNULL(Postcode, 'None'))
GO
select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME='Person2' AND COLUMN_NAME='NotNullPostcode'
COLUMN_NAME     DATA_TYPE       CHARACTER_MAXIMUM_LENGTH IS_NULLABLE
--------------- --------------- ------------------------ -----------
NotNullPostcode varchar         10                       NO
Even if you change the literal 'None' to something greater than 10 characters then it will only ever be 10 characters (the type of the first expression - in this case the type of the column Postcode).

Doing this with Coalesce will take the result of the COALESCE statement.
CREATE TABLE Person3 (
   Name varchar(20) NOT NULL,
   Postcode varchar(10) NULL,
   NotNullPostcode AS COALESCE(Postcode, 'None'))
GO
select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME='Person3' AND COLUMN_NAME='NotNullPostcode'
COLUMN_NAME     DATA_TYPE       CHARACTER_MAXIMUM_LENGTH IS_NULLABLE
--------------- --------------- ------------------------ -----------
NotNullPostcode varchar         10                       YES
First thing to note is that even though we have specified a literal ('None') - the column is set as being Nullable. COALESCE will always be thought of returning a NULL - if any expression in the list can return a NULL. If all expressions are Non-Nullable then the resultant column won't allow nulls. Why you would then use COALESCE is then in doubt?

The Postcode column is longer than the literal 'None' - so the size is set to 10. Now let's make the literal larger in size.
CREATE TABLE Person3 (
   Name varchar(20) NOT NULL,
   Postcode varchar(10) NULL,
   NotNullPostcode AS COALESCE(Postcode, 'No postcode has been specified'))
GO
select COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
from INFORMATION_SCHEMA.COLUMNS
where TABLE_NAME='Person3' AND COLUMN_NAME='NotNullPostcode'
COLUMN_NAME     DATA_TYPE       CHARACTER_MAXIMUM_LENGTH IS_NULLABLE
--------------- --------------- ------------------------ -----------
NotNullPostcode varchar         30                       YES
The type of an expression (especially the null-ability) is important when used in queries as the execution plan (how SQL thinks the best way to execute it) will be different.

As we have said the main difference between COALESCE and ISNULL is the number of parameters. COALESCE will look at each parameter (in order) and return the first expression that doesn't evaluate to NULL (or NULL if they all evaluate to NULL).

COALESCE is converted by SQL into a CASE statement. Thus the SQL below
SELECT Name, COALESCE(Postcode, 'None') FROM Person
is the same as
SELECT Name,
  CASE
     WHEN(Postcode IS NOT NULL) THEN Postcode
  ELSE 'None'
  END
FROM Person
This means that each expression is evaluated at least once and one may be evaluated a second time (in this case if Postcode is not null then it is evaluated to determine if it is not null and then evaluated again to return it). Another downside of this is that if you have an expression with sub-queries you may get different answers when the sub-query is evaluated a second time. ISNULL on the other hand is evaluated only once.

When to use?

An understanding of the differences of ISNULL and COALESCE is useful for the occasions when the differences are important. But why use them at all.

One use is the one given above - change some text. In this case when you don't know the postcode of the person. Remember when you deal with Nulls the result of an expression containing a NULL is also a NULL. So if you have an address as well you cannot concatenate the strings.

Additionally, aggregate functions (such as Average) will work with all the rows where the column in question is Not Null. Consider this example
CREATE Table Numbers (Number float NULL)
GO

INSERT INTO Numbers VALUES (4)
INSERT INTO Numbers VALUES (2)
INSERT INTO Numbers VALUES (3)
INSERT INTO Numbers VALUES (NULL)
INSERT INTO Numbers VALUES (6)
INSERT INTO Numbers VALUES (8)
The sum of the numbers is 23. If you run this SQL to ask for the average
SELECT AVG(Number) FROM Numbers
You get
4.6
Which is the result of 23/5. But using ISNULL to check Number first
SELECT AVG(ISNULL(Number,0)) FROM Numbers
Will now give you
3.83333333333333
Which is 23/6