Monday 14 April 2014

Understanding locks part 1 - Finding locks and blocks

One thing I've had trouble with has been locks. What are the different types of locks, and why are they important?  Locking can occur on various resources, such as a row, page, table or database.

How these locks behave varies according to the lock type.  For example, you may get a lock on a row that prevents any updates to that row but allows reading, or one that prevents reading as well. SQL Server creates locks on resources to maintain compliance with the ACID principles.  In a nutshell, a lock will prevent a second user from modifying a record at the same time as someone else, or prevent them reading from a resource before it has finished updating.

A block occurs when a lock held by one transaction prevents another transaction from completing because it needs the same resource.  Locking and blocking are part of normal everyday activity for SQL Server, but sometimes they can cause application issues due to things like badly written queries or large reports running in a live environment.  So it helps to know how to locate locks in order to zero in on the source of the issue.

Finding locks:

First, let's create a lock or two. I'm using the AdventureWorks2008R2 database for this example.

begin transaction updateit;
update person.Person set FirstName = 'Kenny' where BusinessEntityID = 1;
Now that we have an uncommitted transaction with an update, this should have created a lock, probably a row and / or key lock (more on the lock types in a later post).  So how do we locate the locks and find out what types they are?  This is where the sys.dm_tran_locks DMV comes in (2005+).  Try this:
select * from sys.dm_tran_locks;
This gives some useful information, but it would make a lot more sense if we joined it to some system tables.  Now try this:
select db.name as DBName,l.resource_type,o.name as ObjectName,au.type_desc,i.name as IndexName,p.rows,
    l.resource_description,l.request_mode
from sys.dm_tran_locks l  --Locking DMV
join sys.databases db on db.database_id = l.resource_database_id
left join sys.allocation_units au on au.allocation_unit_id = l.resource_associated_entity_id
left join sys.partitions p on p.hobt_id = l.resource_associated_entity_id
left join sys.indexes i on i.index_id = p.index_id
    and i.object_id = p.object_id
left join sys.objects o on o.object_id =
    case
        when p.partition_id is null then l.resource_associated_entity_id
        else p.object_id
    end
where db.database_id = DB_ID()  --Limit results to this database
    and l.resource_associated_entity_id <> 0 
    and o.name not like 'sys%'  --Ignore system base tables
order by o.name,i.name,l.resource_type,l.request_type
;
You will need to run this in the relevant database, i.e. AdventureWorks2008R2 for this example.  This should give you something that looks like:

[Screenshot: locks query results showing RID and KEY locks against the Person table, with X and IX request modes]

Now we know which resources are locked.  We can see they're against the Person table and that they are row (RID) and key locks.  We also know the request modes, which are exclusive (X) or intent exclusive (IX).


Finding blocks:

Now let's create a block.  Open up a separate tab / connection in SSMS to AdventureWorks2008R2 while the first transaction remains uncommitted, and run the following:

select * from person.Person where BusinessEntityID = 1;
Now because the first transaction hasn't been committed or rolled back yet, this statement won't complete and is now blocked.  If you run the locks query from earlier again, you should notice a few more rows, in particular intent shared locks.  But we're interested in blocks only at this point.

You could run "execute sp_who2;" and check the BlkBy column for spids that are blocking others, and then run "DBCC INPUTBUFFER(spid);" with the source spid (there may be several nested blocks) to get the locking statement that's creating the block.
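
For example, assuming spid 59 turns out to be the blocker (as it is in this walkthrough):

execute sp_who2;  --Check the BlkBy column for the blocking spid
DBCC INPUTBUFFER(59);  --Show the last statement run by that spid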

Or you could use a DMV again to make things easier to spot.  Try this:

SELECT r.session_id, db.name as DBName, r.start_time, r.status, r.command, r.blocking_session_id,
    wait_type, wait_time, wait_resource, cpu_time, st.text as SQLText
FROM sys.dm_exec_requests r
join sys.databases db on db.database_id = r.database_id
cross apply sys.dm_exec_sql_text(r.sql_handle) st
where r.blocking_session_id > 0
order by start_time
;
This should give you something like this:

[Screenshot: the blocked request, showing blocking_session_id 59, the wait type and the blocked SELECT statement]

Unlike sp_who2, this will only return blocked requests, and it also gives you the statement that is blocked and the time it started.  Now you can use DBCC INPUTBUFFER with the blocking_session_id, as above.

Now commit or roll back the original update statement, e.g. "rollback transaction updateit;".  You should now notice three things:
  1) The select statement has now returned its results.
  2) If you run the query to find locks, you should no longer see them.
  3) If you run the query to find blocks, these will be gone too.

If push came to shove, and there was no alternative, you could kill the spid of the transaction causing the blocks. In the above example, instead of committing or rolling back, you could run "kill 59;".  Note that in our example screenshot, 59 is the blocking spid.
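
Bear in mind that killing a session rolls back its open transaction, which can itself take a while on a large update.  Something like:

kill 59;  --Rolls back the blocking session's open transaction; use with care
kill 59 with statusonly;  --Reports rollback progress if it's taking a while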

In the next post I intend to talk about the different lock types.


Sunday 6 April 2014

Saving index definitions and dropping indexes for large inserts

I was going to talk about how to capture index definitions so that you can drop indexes and then at a later point re-create them... but I got a bit carried away and wrote an SP that will do it all for you.  Before I share that though, a little something on why I did this.

Recently I've had a couple of large data migrations to do, always to tables that have a number of indexes.  There are two problems with this.

  1. Indexes add an overhead when you are inserting data, and Microsoft advise dropping anywhere from one index to all of them depending on how much data you are inserting.  Dropping the indexes could add a good deal of speed to your insert.  Below is a guideline table of the indexes to drop relative to the amount of data being inserted.  For example, you should drop the clustered index if the amount of data you are inserting exceeds 30% of the number of rows that already exist in your destination table.  Note how Microsoft advise dropping the indexes, not disabling and then rebuilding them.  The table is taken from Guidelines for Optimizing Bulk Import.
Indexes                                    Relative amount of new data
Clustered index only                       30 percent
Clustered and one nonclustered index       25 percent
Clustered and two nonclustered indexes     25 percent
Single nonclustered index only             100 percent
Two nonclustered indexes                   60 percent

  2.  Minimal logging can help when inserting large numbers of rows.  There are a few things you can do to avoid your transaction logs filling up, and minimal logging is one of them.  Plus it could also greatly improve the performance of your insert.  However, minimal logging has a few requirements, and one of them is to not have a clustered index when inserting into a table that already has data. There are a number of additional requirements and these may vary according to your SQL version, but that's a discussion for another day (a quick sketch follows this list).
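As a minimal sketch of point 2 (the table names are made up for illustration), and assuming the database is in the SIMPLE or BULK_LOGGED recovery model and the destination table is a heap, an insert that can qualify for minimal logging looks something like this:

insert into dbo.OrdersArchive with (tablock)  --TABLOCK is one of the minimal logging requirements
select OrderID, CustomerID, OrderDate
from dbo.OrdersStaging;
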
Suffice it to say, with the two above points in mind, it helps to have something that will drop your indexes and later rebuild them, without you having to manually script out the indexes and run them back in afterwards.  You'll want something you can incorporate into your migration scripts or package.

To this end I've written an SP called dbo.DropCreateIndexes.  To use this, you should be aware of the following parameters:

  • @Action:
    • Purpose - Define what you want the SP to do, e.g. drop or re-create the indexes.
    • Options:
      • Drop - Drop the indexes on a table and save the definitions for later use.
      • DropOnly - Drop the indexes without saving the definitions.
      • Create - Recreate the indexes for a defined table, or for all saved indexes.  Definitions are deleted from the IndexDefinitions table for indexes that are re-created.
      • SaveOnly - Saves the index definitions to the IndexDefinitions table only.
      • Output (default) - Will output the definitions without doing any work.
  • @SchemaName:
    • Purpose - Define the schema to which the table belongs.  Defaults to dbo.
  • @TableName:
    • Purpose - Define the table to drop the indexes from and save them for, or to re-create them on.  For @Action = 'Create' you can also pass 'All' to re-create every saved index.
The SP will save any definitions to dbo.IndexDefinitions, which gets created in whatever database the SP itself has been created in. As SQL Server doesn't appear to store index definitions anywhere as text, this procedure builds the definitions from scratch according to what's in various system tables.  Simply use it to drop and save the indexes before your data load, then later re-create the indexes on the same table, as in the sketch below.
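
As a sketch of the intended workflow (dbo.BigTable is a made-up table name), a large load would be wrapped like this:

--Save the definitions and drop the indexes before the load:
exec dbo.DropCreateIndexes @Action = N'Drop', @SchemaName = N'dbo', @TableName = N'BigTable';

--...run the large insert here...

--Re-create the indexes from the saved definitions afterwards:
exec dbo.DropCreateIndexes @Action = N'Create', @SchemaName = N'dbo', @TableName = N'BigTable';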

I have tried to cover all bases, so this will also handle multi-column indexes, included columns, filtered indexes, fill factors, filegroups, index padding, allowing row locks and ignoring duplicate keys.  If I've missed anything, be sure to let me know :)
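
For example, a hypothetical index like the one below, which combines several of those features, should be captured and re-created faithfully:

create nonclustered index IX_Person_Email  --Hypothetical index for illustration
on dbo.TestTable (LastName asc, FirstName desc)  --Multi-column key with mixed sort order
include (EmailAddress)  --Included column
where EmailAddress is not null  --Filtered index
with (pad_index = on, fillfactor = 90, allow_row_locks = on);  --Padding, fill factor and row locks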

NOTE: I have not fully tested this in all scenarios, nor have I accounted for any indexes other than clustered, non-clustered and primary keys. Please ensure you test this properly before using it.  It still needs some error handling as well.  I expect I'll be making the occasional tweak to it (such as maybe dropping clustered indexes only) as I continue to use it.  Suggestions are welcome.

A user-defined function called dbo.concatStrings is also required; the script for it is below as well.  Make sure you install this along with the SP, as the SP needs it to build the create statements.

So here is the SP:

CREATE procedure [dbo].[DropCreateIndexes]
    @Action nvarchar(10) = 'Output',  --'Drop', 'DropOnly', 'Create', 'SaveOnly' or 'Output'
    @SchemaName nvarchar(128) = 'dbo',
    @TableName nvarchar(128)      --Table name or 'All' for create indexes only
   
AS

DECLARE
    @object_id bigint,
    @SQL nvarchar(max);
   
BEGIN
    set @object_id = OBJECT_ID(quotename(@SchemaName) + N'.' + quotename(@TableName), 'U');  --Resolve with the schema so non-dbo tables work
   
    --CREATE SAVE TABLE:
    if OBJECT_ID('dbo.IndexDefinitions','U') is null
    BEGIN
        create table dbo.IndexDefinitions(
            ID int identity(1,1) not null,
            IndexName nvarchar(128) not null,
            TableName nvarchar(128) not null,
            SchemaName nvarchar(128) not null,
            SaveDate datetime default getdate(),
            CreateSQL nvarchar(max)
        );
    END;

   
    --SAVE INDEX DEFINITIONS FOR TABLE:
    if @Action in ('Drop','SaveOnly','Output')
    BEGIN
        --Delete any existing saved rows for the same table ('Output' should do no work, so skip it)
        if @Action in ('Drop','SaveOnly')
        BEGIN
            delete from dbo.IndexDefinitions
            where TableName = @TableName
                and SchemaName = @SchemaName;
        END;
       
        --Create and save the definitions to dbo.indexDefinitions
        with ConcatCols as
        (
            select i.object_id,i.index_id,i.name as IndexName,stuff(
                (select ',[' + name + '] ' + case ic.is_descending_key
                        when 1 then 'DESC'
                        else 'ASC'
                    end
                from sys.columns c
                join sys.index_columns ic on ic.object_id = c.object_id
                    and ic.column_id = c.column_id
                where c.object_id = i.object_id
                    and ic.index_id = i.index_id
                    and ic.is_included_column <> 1
                for xml path(''))
            ,1,1,'')
            as ColumnParts
            from sys.indexes i
            where i.object_id = @object_id
        ),
        includedCols as (
            select i.object_id,i.index_id,i.name as IndexName,stuff(
                (select ',[' + name + ']'
                from sys.columns c
                join sys.index_columns ic on ic.object_id = c.object_id
                    and ic.column_id = c.column_id
                where c.object_id = i.object_id
                    and ic.index_id = i.index_id
                    and ic.is_included_column = 1
                for xml path(''))
            ,1,1,'')
            as IncludedColumns
            from sys.indexes i
        )
        select i.name as IndexName,o.name as TableName,schema_name(o.schema_id) as SchemaName,
            --PRIMARY KEY PART
            case i.is_primary_key
                when 1 then  'ALTER TABLE [' + SCHEMA_NAME(o.schema_id) + '].[' + o.name + '] ADD CONSTRAINT [' + i.name + '] PRIMARY KEY ' + i.type_desc collate database_default + ' ('
            --INDEX PART
                else 'CREATE ' + case is_unique when 1 then 'UNIQUE ' else ' ' end + i.type_desc collate database_default + ' INDEX [' + i.name + '] on [' + SCHEMA_NAME(o.schema_id) + '].[' + o.name + '] ('
            end + cc.ColumnParts + ') ' +
            --INCLUDED COLUMNS
            case
                when inc.IncludedColumns is null then ''
                else 'INCLUDE (' + inc.IncludedColumns + ')'
            end +
            --FILTER DEFINITIONS
            case i.has_filter
                when 0 then ''
                else ' WHERE ' + i.filter_definition
            end +   
            --COMPLETE WITH OPTIONS
            ' WITH (' +
                case i.is_primary_key
                    when 0 then
                        'SORT_IN_TEMPDB = OFF,' +
                        'ONLINE = ON,' +
                        'DROP_EXISTING = OFF,'
                    else ''
                end +
                'PAD_INDEX = ' + case i.is_padded when 1 then 'ON' else 'OFF' end + ',' +
                'IGNORE_DUP_KEY = ' + case i.ignore_dup_key when 1 then 'ON' else 'OFF' end + ',' +       
                'ALLOW_ROW_LOCKS = ' + case i.allow_row_locks when 1 then 'ON' else 'OFF' end + ',' +
                case i.fill_factor when 0 then '' else 'FILLFACTOR = ' + cast(i.fill_factor as varchar(3)) + ',' end +
                'STATISTICS_NORECOMPUTE = OFF)' +
            ' ON [' + fg.name + ']' as CreateSQL
        into #tempDef
        from sys.indexes i
        join sys.objects o on o.object_id = i.object_id
        join ConcatCols cc on cc.object_id = o.object_id
            and cc.index_id = i.index_id
        join sys.filegroups fg on fg.data_space_id = i.data_space_id
        left join includedCols inc on inc.object_id = i.object_id
            and inc.index_id = i.index_id
        where i.index_id > 0;  --Exclude the heap row that sys.indexes returns for index_id 0
           
           
        if @Action in ('Drop','SaveOnly')
        BEGIN
            insert into dbo.IndexDefinitions(IndexName,TableName,SchemaName,CreateSQL)
            select * from #tempDef;
        END;
       
        if @Action in ('Output')
        BEGIN
            Select * from #tempDef;
        END;
       
    END;
       
   
    --DROP ALL INDEX ON TABLE:
    if @Action in ('Drop', 'DropOnly')
    BEGIN
        select @SQL = (
            select case
                when i.is_primary_key = 1 then   
                    'ALTER TABLE [' + SCHEMA_NAME(o.schema_id) + '].[' + o.name +'] DROP CONSTRAINT ' + i.name + ';'
                else
                    'DROP INDEX [' + i.name + '] ON [' + SCHEMA_NAME(o.schema_id) + '].[' + o.name +'];'
            end
            from sys.indexes i
            join sys.objects o on o.object_id = i.object_id
            where o.object_id = @object_id
                and i.index_id > 0  --Exclude the heap row
            for xml path ('')
        );       

        EXEC sp_executesql @SQL;
               
    END;
   
    if @Action = 'Create'
    BEGIN
        select @SQL = dbo.concatStrings(@TableName,@SchemaName);

        EXEC sp_executesql @SQL;   
       
        delete from dbo.IndexDefinitions
        where TableName = case @TableName when 'All' then TableName else @TableName end
            and SchemaName = case @TableName when 'All' then SchemaName else @SchemaName end;
       
    END;       
END;

And now for the function:

CREATE function [dbo].[concatStrings](@TableName nvarchar(128), @SchemaName nvarchar(128) = 'dbo')
returns nvarchar(max)

as

BEGIN
    declare @x nvarchar(max);
   
    select @x = coalesce(@x+';','') + CreateSQL
            from IndexDefinitions
            where TableName = case @TableName when 'All' then TableName else @TableName end
                and SchemaName = case @TableName when 'All' then SchemaName else @SchemaName end
           
    return @x;
END;

There you go.  Enjoy.  Any feedback is welcome.

*EDIT*
Here's an example:
exec dbo.DropCreateIndexes
    @Action = N'Create',
    @TableName = N'dropIndexTest';
