Changing a database collation for new and existing objects in a few easy steps

Michal Piatek

Published May 20, 2019

Introduction

The other day a consultant from some company that is working for one of our clients approached me during the upgrade process of a software that I don't want to mention here and told me that they need a restore of the PRD database to the QA instance - and that they need the collation of the restored database changed to a case sensitive collation. There must have been some odd changes to the software from version X to version X+1 so that now they needed case sensitivity but: client is king. So I set down, did my research and found the following method to be applicable and I think also well understandable for beginners.

Now before I start...

... explaining the process of changing a DB's collation let's take a moment and think about what a collation is and what it does have an impact on. Let me mention at this point that this short paragraph does certainly not cover the topic of collation as a whole and in detail but the few most important aspects should be pointed out here. When we look at the standard collation that comes with SQL Server 2017 (SQL_Latin1_General_CP1_CI_AS) then the name of the collation already tells us quite a few things about it. But what is a collation in the first place?

Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted using rules that define the correct character sequence, with options for specifying case-sensitivity, accent marks, kana character types and character width.

This quote from is the best (and first :) ) I found. It basically tells us that the collation determines the sequence of the characters which is especially important when it comes to localized special characters like the German Ä (should it come after the A? Or at the end of the alphabet after the Z?). The collation also determines whether the used character set is case sensitive (CS; vs. CI = case insensitive) whether it is accent sensitive (AS vs. AI; important for i.e. French) and whether it is Kana sensitive (KS vs. KI; that's like... like AS vs, AI only from Asia or so...) aso. The two main aspects due to my experience are sorting and case sensitivity.

Server collation vs. user DB collation

We have to distinguish between the SQL Server Server-Collation which is set during the setup and is subsequently applied to the system databases (master, model, msdb and tempdb) and the user DB collation which is applied during the creation of the database. This blog covers the change of a user DB collation. The Server Collation can be changed with a setup option - however due to my experience if you have to change the server collation do a fresh install. Changing the collation of msdb is not funny even if the collation for master, model and tempdb has been changed. I have done that once and basically ended up fixing jobs with collation conflicts for 3 days. Collation conflicts can occur when 2 objects (i.e. 2 tables) with different collations are used in the same batch or the same statement. The experienced DBA solves this using the "collate" clause which allows us to change the collation of a specific value at runtime. But this is also off topic...

Carrying out the collation change

OK, so I had to change the collation of the database for all new and (!!!) all existing objects. I found the easiest way to do this is to actually carrying out these steps:

Script out the "ALTER TABLE [table_name] ALTER COLUMN [column_name] datatype-and-size COLLATE new-collation-name (NOT) NULL" statements
Script all indexes to a #temptable
Delete all indexes generating the DROP INDEX statements from our #temptable
Change the collation of the database
Execute the scripts generated in step 1
Recreate all indexes

Let's have a look into the procedure in detail...

Step 1 - generate scripts to change the tables' collation

I have found (and adapted) the following query that would script me all the tables that contain columns that are of interesst to me - these are tables that contain columns that have either of these datatypes: char, varchar, nvarchar, text

declare @TableName nvarchar(512)
declare @SQLText nvarchar(4000)
declare @ColumnName nvarchar(256)
declare @CharacterMaxLen nvarchar(100)
declare @datatype sysname
declare @IsNullable nvarchar(10)
declare @CollationName nvarchar(256) = N'Latin1_General_CS_AS'




declare MyTableCursor CURSOR FOR
select TABLE_NAME from INFORMATION_SCHEMA.TABLES 


OPEN MyTableCursor


FETCH NEXT FROM MyTableCursor INTO @TableName
WHILE @@FETCH_STATUS = 0
	BEGIN
		DECLARE MyColumnCursor Cursor
		FOR 
		SELECT COLUMN_NAME,DATA_TYPE, CHARACTER_MAXIMUM_LENGTH,IS_NULLABLE from information_schema.columns
			WHERE table_name = @TableName AND  (Data_Type LIKE '%char%' OR Data_Type LIKE '%text%') AND COLLATION_NAME <> @CollationName
			ORDER BY ordinal_position 
		Open MyColumnCursor


		FETCH NEXT FROM MyColumnCursor INTO @ColumnName, @DataType, @CharacterMaxLen, @IsNullable
		WHILE @@FETCH_STATUS = 0
			BEGIN
			SET @SQLText = 'ALTER TABLE ' + @TableName + ' ALTER COLUMN [' + @ColumnName + '] ' + @DataType + '(' + 
			CASE WHEN @CharacterMaxLen = -1 THEN 'MAX' ELSE @CharacterMaxLen END + ') COLLATE ' 
			+ @CollationName + ' ' + CASE WHEN @IsNullable = 'NO' THEN 'NOT NULL' ELSE 'NULL' END
			PRINT @SQLText 


		FETCH NEXT FROM MyColumnCursor INTO @ColumnName, @DataType, @CharacterMaxLen, @IsNullable
		END
		CLOSE MyColumnCursor
		DEALLOCATE MyColumnCursor


FETCH NEXT FROM MyTableCursor INTO @TableName
END
CLOSE MyTableCursor
DEALLOCATE MyTableCursor

This will generate us a number of statements that look kinda like this:

Es wurde kein Alt-Text für dieses Bild angegeben.

We save this script and continue with...

Step 2 - Script all indexes

There are several ways to do this (even via the SSMS GUI... you don't wanna do this via the SSMS GUI unless you need to get the girl/guy doing her/his internship of your back for... days... weeks... depending on the size of your DB - or the duration of her/his internship). I found the method suggested by Kendra Little to be very elegant altough it has some draw backs (i.e. compression and LOB compaction are not scripted but this can easily be corrected via dbatools.io or via Ola Hallengren's scripts ) - so I used the script in her article to script all indexes into a #temptable. Since the SELECT INTO statement is not part of her article I post the adapted script here. The name of my #temptable that will be referred to during this post is ##t

	/*****************************************************************************	
		MIT License, http://www.opensource.org/licenses/mit-license.php
		Contact: help@sqlworkbooks.com
		Copyright (c) 2018 SQL Workbooks LLC
		Permission is hereby granted, free of charge, to any person 
		obtaining a copy of this software and associated documentation
		files (the "Software"), to deal in the Software without 
		restriction, including without limitation the rights to use,
		copy, modify, merge, publish, distribute, sublicense, and/or
		sell copies of the Software, and to permit persons to whom 
		the Software is furnished to do so, subject to the following 
		conditions:
		The above copyright notice and this permission notice shall be
		included in all copies or substantial portions of the Software.
		THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
		EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES 
		OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 
		NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT 
		HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, 
		WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
		FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 
		OTHER DEALINGS IN THE SOFTWARE.
		*****************************************************************************/
		SELECT 
		DB_NAME() AS database_name,
		sc.name + N'.' + t.name AS table_name,
		(SELECT MAX(user_reads) 
		FROM (VALUES (last_user_seek), (last_user_scan), (last_user_lookup)) AS value(user_reads)) AS last_user_read,
		last_user_update,
		CASE si.index_id WHEN 0 THEN N'/* No create statement (Heap) */'
		ELSE 
		CASE is_primary_key WHEN 1 THEN
		N'ALTER TABLE ' + QUOTENAME(sc.name) + N'.' + QUOTENAME(t.name) + N' ADD CONSTRAINT ' + QUOTENAME(si.name) + N' PRIMARY KEY ' +
		CASE WHEN si.index_id > 1 THEN N'NON' ELSE N'' END + N'CLUSTERED '
		ELSE N'CREATE ' + 
		CASE WHEN si.is_unique = 1 then N'UNIQUE ' ELSE N'' END +
		CASE WHEN si.index_id > 1 THEN N'NON' ELSE N'' END + N'CLUSTERED ' +
		N'INDEX ' + QUOTENAME(si.name) + N' ON ' + QUOTENAME(sc.name) + N'.' + QUOTENAME(t.name) + N' '
		END +
		/* key def */ N'(' + key_definition + N')' +
		/* includes */ CASE WHEN include_definition IS NOT NULL THEN 
		N' INCLUDE (' + include_definition + N')'
		ELSE N''
		END +
		/* filters */ CASE WHEN filter_definition IS NOT NULL THEN 
		N' WHERE ' + filter_definition ELSE N''
		END +
		/* with clause - compression goes here */
		CASE WHEN row_compression_partition_list IS NOT NULL OR page_compression_partition_list IS NOT NULL 
		THEN N' WITH (' +
		CASE WHEN row_compression_partition_list IS NOT NULL THEN
		N'DATA_COMPRESSION = ROW ' + CASE WHEN psc.name IS NULL THEN N'' ELSE + N' ON PARTITIONS (' + row_compression_partition_list + N')' END
		ELSE N'' END +
		CASE WHEN row_compression_partition_list IS NOT NULL AND page_compression_partition_list IS NOT NULL THEN N', ' ELSE N'' END +
		CASE WHEN page_compression_partition_list IS NOT NULL THEN
		N'DATA_COMPRESSION = PAGE ' + CASE WHEN psc.name IS NULL THEN N'' ELSE + N' ON PARTITIONS (' + page_compression_partition_list + N')' END
		ELSE N'' END
		+ N')'
		ELSE N''
		END +
		/* ON where? filegroup? partition scheme? */
		' ON ' + CASE WHEN psc.name is null 
		THEN ISNULL(QUOTENAME(fg.name),N'')
		ELSE psc.name + N' (' + partitioning_column.column_name + N')' 
		END
		+ N';'
		END AS index_create_statement,
		si.index_id,
		si.name AS index_name,
		partition_sums.reserved_in_row_GB,
		partition_sums.reserved_LOB_GB,
		partition_sums.row_count,
		stat.user_seeks,
		stat.user_scans,
		stat.user_lookups,
		user_updates AS queries_that_modified,
		partition_sums.partition_count,
		si.allow_page_locks,
		si.allow_row_locks,
		si.is_hypothetical,
		si.has_filter,
		si.fill_factor,
		si.is_unique,
		ISNULL(pf.name, '/* Not partitioned */') AS partition_function,
		ISNULL(psc.name, fg.name) AS partition_scheme_or_filegroup,
		t.create_date AS table_created_date,
		t.modify_date AS table_modify_date into ##t
		FROM sys.indexes AS si
		JOIN sys.tables AS t ON si.object_id=t.object_id
		JOIN sys.schemas AS sc ON t.schema_id=sc.schema_id
		LEFT JOIN sys.dm_db_index_usage_stats AS stat ON 
		stat.database_id = DB_ID() 
		and si.object_id=stat.object_id 
		and si.index_id=stat.index_id
		LEFT JOIN sys.partition_schemes AS psc ON si.data_space_id=psc.data_space_id
		LEFT JOIN sys.partition_functions AS pf ON psc.function_id=pf.function_id
		LEFT JOIN sys.filegroups AS fg ON si.data_space_id=fg.data_space_id
		/* Key list */ OUTER APPLY ( SELECT STUFF (
		(SELECT N', ' + QUOTENAME(c.name) +
		CASE ic.is_descending_key WHEN 1 then N' DESC' ELSE N'' END
		
		FROM sys.index_columns AS ic 
		JOIN sys.columns AS c ON 
		ic.column_id=c.column_id 
		and ic.object_id=c.object_id
		WHERE ic.object_id = si.object_id
		and ic.index_id=si.index_id
		and ic.key_ordinal > 0
		ORDER BY ic.key_ordinal FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS keys ( key_definition )
		/* Partitioning Ordinal */ OUTER APPLY (
		SELECT MAX(QUOTENAME(c.name)) AS column_name
		FROM sys.index_columns AS ic 
		JOIN sys.columns AS c ON 
		ic.column_id=c.column_id 
		and ic.object_id=c.object_id
		WHERE ic.object_id = si.object_id
		and ic.index_id=si.index_id
		and ic.partition_ordinal = 1) AS partitioning_column
		/* Include list */ OUTER APPLY ( SELECT STUFF (
		(SELECT N', ' + QUOTENAME(c.name)
		FROM sys.index_columns AS ic 
		JOIN sys.columns AS c ON 
		ic.column_id=c.column_id 
		and ic.object_id=c.object_id
		WHERE ic.object_id = si.object_id
		and ic.index_id=si.index_id
		and ic.is_included_column = 1
		ORDER BY c.name FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS includes ( include_definition )
		/* Partitions */ OUTER APPLY ( 
		SELECT 
		COUNT(*) AS partition_count,
		CAST(SUM(ps.in_row_reserved_page_count)*8./1024./1024. AS NUMERIC(32,1)) AS reserved_in_row_GB,
		CAST(SUM(ps.lob_reserved_page_count)*8./1024./1024. AS NUMERIC(32,1)) AS reserved_LOB_GB,
		SUM(ps.row_count) AS row_count
		FROM sys.partitions AS p
		JOIN sys.dm_db_partition_stats AS ps ON
		p.partition_id=ps.partition_id
		WHERE p.object_id = si.object_id
		and p.index_id=si.index_id
		) AS partition_sums
		/* row compression list by partition */ OUTER APPLY ( SELECT STUFF (
		(SELECT N', ' + CAST(p.partition_number AS VARCHAR(32))
		FROM sys.partitions AS p
		WHERE p.object_id = si.object_id
		and p.index_id=si.index_id
		and p.data_compression = 1
		ORDER BY p.partition_number FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS row_compression_clause ( row_compression_partition_list )
		/* data compression list by partition */ OUTER APPLY ( SELECT STUFF (
		(SELECT N', ' + CAST(p.partition_number AS VARCHAR(32))
		FROM sys.partitions AS p
		WHERE p.object_id = si.object_id
		and p.index_id=si.index_id
		and p.data_compression = 2
		ORDER BY p.partition_number FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'),1,2,'')) AS page_compression_clause ( page_compression_partition_list )
		WHERE 
		si.type IN (0,1,2) /* heap, clustered, nonclustered */
		ORDER BY table_name, si.index_id
		OPTION (RECOMPILE);
		GO

The "into ##t" is in Line 87

So now we have our Indexes scripted into ##t we carry out a "SELECT * FROM ##t" and get sth. like this:

Step 3 - generate the DROP INDEX scripts

Next thing we do is we create our "DROP INDEX" scripts for the NC indexes

select N'drop index [' + index_name +'] on ' + table_name from ##t where index_id > 1

And the "ALTER TABLE DROP CONSTRAINT" scripts for the PKs

select N'alter table ' + table_name +' drop constraint ' + index_name from ##t where index_id = 1

And we get sth. like

respectively

So now we have our scripts we can start with the actual change. There are 2 things you should do at this point.

For Bugs Bunny's sake - take a backup

Change the Recovery Model to SIMPLE in order to avoid full transaction logs

At this point you should also prepare the "CREATE INDEX" resp. the "ALTER TABLE ADD CONSTRAINT" statements from our temp table as the connections to the DB will be broken after changing the DB collation.

Execute the following query to script the PKs (just copy the contents of the create_index_statement column):

select * from ##t where index_id = 1

Execute the following query to scripts all NC idx (again: the create_index_statement column)

select * from ##t where index_id > 1

Step 4 - Change the collation of the database

alter database [mydb] set single_user with rollback immediate

go

 


USE [master]

GO

ALTER DATABASE [mydb] COLLATE Latin1_General_CS_AS

GO

 

GO

 

alter database [mydb] set multi_user

This will assure that all new objects are created with the new collation

Step 5 - Delete all indexes

Execute the scripts you generated in Step 3 to delete all indexes. Make sure that you fist delete the NC indexes and then the PKs.

Step 6 - Run the ALTER TABLE statements

Execute the scripts generated in Step 1

Step 7 - Recreate the indexes

First create the PKs. Then create the NC indexes.

Esp. for the the PKs it is a good idea to seperate the statements with the batch seperator (normally this is GO but SSMS lets you define your own - in case you want your fellow DBAs in your company to hate you go to Tools --> Options --> Query Execution --> SQL Server --> General and

Run the scripts generated in Step 3 to create the PKs

Then create the NC indexes.

and that's basically it. Use this manual at your own risk - and have fun.

Ron Smit 2y

Nice. A few minor things could be improved, however. For instance, due to the implementation by Microsoft, the cursor loop will sometimes script ALTER TABLE statements for objects which are in fact VIEWS. (Yep, MS moves in mysterious ways.) Can easily be remedied by changing the code to something like this: declare MyTableCursor CURSOR FOR select TABLE_NAME from INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE <> 'VIEW'; -- or maybe you want to change it to WHERE TABLE_TYPE = 'BASE TABLE';

2 Reactions

Günter Hagedorn 2y

Very good. When trying to convert a db to a utf8 collation, there are problems with columns of type text though... and it seems as if unique constraints can't be droped the way the script suggests. but yes, thats about 4-5 years after the publication of this really helpful artcile. thanks.

See more comments

To view or add a comment, sign in

Changing a database collation for new and existing objects in a few easy steps

Michal Piatek

Introduction

Now before I start...

Server collation vs. user DB collation

Carrying out the collation change

Step 1 - generate scripts to change the tables' collation

Step 2 - Script all indexes

Step 3 - generate the DROP INDEX scripts

Step 4 - Change the collation of the database

Step 5 - Delete all indexes

Step 6 - Run the ALTER TABLE statements

Step 7 - Recreate the indexes

More articles by Michal Piatek

Others also viewed

SQL Indexes: Under the Hood Mechanics

Connecting the MCP Server to Your Production Database: Why a Read-Only Replica is Better and the Benefits of MCP

The Hi/Lo Algorithm

Understanding RANK() and PARTITION BY in SQL

Mastering SQL Common Table Expressions (CTEs): Simplify Your Queries

Boost SQL Efficiency with Indexing: When to Use and When to Avoid

Understanding SQL Joins with Real-Life Examples

Part 1: The Query Is Not the Program

Analyzing Query Performance with Execution Plans

Indexing In SQL

Explore content categories

Introduction

Now before I start...

Server collation vs. user DB collation

Carrying out the collation change

Step 1 - generate scripts to change the tables' collation

Step 2 - Script all indexes

Step 3 - generate the DROP INDEX scripts

Step 4 - Change the collation of the database

Step 5 - Delete all indexes

Step 6 - Run the ALTER TABLE statements

Step 7 - Recreate the indexes

More articles by Michal Piatek

Creating a linked server with integrated authentication

Others also viewed

SQL Indexes: Under the Hood Mechanics

Connecting the MCP Server to Your Production Database: Why a Read-Only Replica is Better and the Benefits of MCP

The Hi/Lo Algorithm

Understanding RANK() and PARTITION BY in SQL

Mastering SQL Common Table Expressions (CTEs): Simplify Your Queries

Boost SQL Efficiency with Indexing: When to Use and When to Avoid

Understanding SQL Joins with Real-Life Examples

Part 1: The Query Is Not the Program

Analyzing Query Performance with Execution Plans

Indexing In SQL

Explore content categories