Monday, 24 February 2014

SQL: Using a Collation within a SELECT list or WHERE clause

Sometimes you when you do comparisons on string data within a table you do want it to be case sensitive. However for a lot of queries (or table structures to be more specific) they will come out as case insensitive. For example, the following query on the table Employees in Northwind
SELECT EmployeeID, LastName, FirstName FROM Employees
WHERE FirstName IN ('Andrew', 'robert')
Will return two rows:
EmployeeID  LastName             FirstName
----------- -------------------- ----------
2           Fuller               Andrew
7           King                 Robert
Even though I asked for FirstName of 'robert' - I got Robert (with an uppercase R).

This is due to the collation sequence. If you use sp_help Employees it will show you the structure of the table. I am using this query to return the structure (which is part of the query with in the system stored procedure sp_help).
SELECT
  'Column_name' = name,
  'Type' = type_name(user_type_id),
  'Computed' = CASE WHEN ColumnProperty(object_id, name, 'IsComputed') = 0 
     THEN 'No' ELSE 'Yes' END,
  'Length' = convert(int, max_length),
  'Nullable' = CASE WHEN is_nullable = 0 THEN 'No' ELSE 'Yes' END,
  'Collation' = collation_name
FROM 
  sys.all_columns
WHERE
  object_id = 245575913
AND 
  name IN ('EmployeeID', 'LastName', 'FirstName')
(If you are going to do this check your object_id is correct).

This is showing the structure as
Column_name  Type   Computed Length      Nullable Collation
------------ ------ -------- ----------- -------- --------------------
EmployeeID   int    No       4           No       NULL
LastName     nvarch No       40          No       Latin1_General_CI_AS
FirstName    nvarch No       20          No       Latin1_General_CI_AS
The collation here is Latin1_General_CI_AS - the CI tells us it is case insensitive (AS is accent insensitive and if you see WS/WI this is width sensitivity/insensitivity and KS/KI is kana sensitivity/insensitivity respectively).

But what if I want to perform queries that are case sensitive. After all when I display the data it is in the case as it is stored.

I can convert a column to a different collation - this is done by adding after the column name the COLLATION keyword and the name of collation I want. Thus
SELECT EmployeeID, LastName, FirstName,
FirstName COLLATE Latin1_General_CS_AS AS CS_FirstName
FROM Employees
WHERE FirstName IN ('Andrew', 'robert')
Adds an additional column (with an alias CS_FirstName):
EmployeeID  LastName             FirstName  CS_FirstName
----------- -------------------- ---------- ------------
2           Fuller               Andrew     Andrew
7           King                 Robert     Robert
But I can't use a column alias in a where clause, so using something like this
SELECT EmployeeID, LastName, FirstName,
FirstName COLLATE Latin1_General_CS_AS AS CS_FirstName
FROM Employees
WHERE FirstName COLLATE Latin1_General_CS_AS IN ('Andrew', 'robert')
will now give us the correct results.
EmployeeID  LastName             FirstName  CS_FirstName
----------- -------------------- ---------- ------------
2           Fuller               Andrew     Andrew
Of course you don't need the computed column in the final output.

No comments:

Post a Comment