DEADLOCK on SELECT due to UNORDERED PREFETCH

In a previous post, I have provided links to multiple articles that demonstrate how a SELECT statement involving a key lookup running under READ COMMITTED isolation level can deadlock with another session running an UPDATE statement.

In this post we will discuss about another case where a SELECT statement running under READ COMMITTED isolation level  can deadlock with another session due to the locking behavior.

Setup

Run the below script to create the test table and populate it.

IF DB_ID(‘DeadLockTest’) IS NULL

CREATE Database DeadLockTest

GO

USE DeadLockTest

GO

create table dbo.rcdl (id int identity(1,1) primary key clustered, id2 int, filler char(1000) default ‘abc’)

GO

set nocount on

declare @i int

set @i = 0

while @i < 1000

begin

insert dbo.rcdl(id2) values ( @i)

set @i = @i + 1

end

Go

create index ix_rcdl on dbo.rcdl(id2)

go

Now start another session (Session 1) and run the below command.

BEGIN TRAN

UPDATE dbo.rcdl

SET filler = ‘xyz’,id2=id

WHERE id = 24

WAITFOR DELAY ’00:00:05′

UPDATE dbo.rcdl

SET filler = ‘xyz’,id2=id

WHERE id = 1

ROLLBACK TRAN

Start one more session (session 2) and run the below query

SELECT filler FROM dbo.rcdl WHERE id2 < 25

In five seconds, the SELECT statement running in session 2 will fail with the below error.

Msg 1205, Level 13, State 51, Line 1

Transaction (Process ID 74) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Analyze

Let’s go ahead and populate the deadlock analysis table from the deadlock trace.

Process Resource Mode State Command
process593048 KEY: 26:72057594042056704 (1edefaa29392) S WAIT SELECT
process593048 KEY: 26:72057594042056704 (bb0d06c12baa) S GRANT SELECT
process13ee08 KEY: 26:72057594042056704 (bb0d06c12baa) X GRANT UPDATE
process13ee08 KEY: 26:72057594042056704 (1edefaa29392) X WAIT UPDATE

You can see that the update command is holding an X lock on one row and waiting for another X lock. The SELECT statement is also holding a shared (S) lock and waiting for another shared lock. But why the SELECT statement is holding a shared lock under READ COMMITTED isolation level? As stated in a previous post, SELECT statements should only hold the shared lock for the duration of processing the row. For a visual representation of the cocept of lock duration under different isolation levels, see Micheal J Swart’s (b |t) post “Visualizing Transaction Isolations For SQL Server”

But there are few exceptions to this rule. In certain cases, SQL server will keep the shared locks until the end of the transaction to ensure consistency. One of them involves “UNORDERED PREFETCH”, which happens to be the case here. If you look at the execution plan xml of the SELECT statement, you can see that our plan uses a Nested Loop Join with UNORDERED PREFETCH. Craig Freedman blogged about the scenarios where the shared locks are held until the end of the transaction here. He also wrote about UNORDERED PREFETCH in detail here.

Fix

You can use any of the below methods to resolve this issue.

  1. Combine the UPDATE statements to a single UPDATE statement.
  2. Cover the index by including the filler column
  3. Acquire non-compatible locks using a locking hint.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Advertisements

Deadlocks Involving Lock Partitions

Lock Partitioning is a feature in SQL Server – which is automatically enabled for machines with more than 16 logical processors – aimed at reducing lock contention. To quote from the MSDN link on the topic,

For large computer systems, locks on frequently referenced objects can become a performance bottleneck as acquiring and releasing locks place contention on internal locking resources. Lock partitioning enhances locking performance by splitting a single lock resource into multiple lock resources. This feature is only available for systems with 16 or more CPUs, and is automatically enabled and cannot be disabled. Only object locks can be partitioned.

When lock portioning is enabled NL, SCH-S, IS, IU, and IX lock modes are acquired on a single partition. All other lock modes must be acquired on all partitions.

One key thing to notice is that, when lock partitioning is enabled, Objects locks are acquired on lock partitions,not the object itself. For eg. If you run the below query on a server where lock partitioning is not enabled, you will get the below result. (irrelevant results omitted)

USE AdventureWorks
GO

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM Person.[Address]
WHERE AddressId = 1

SELECT * FROM Person.ContactType WITH(TABLOCKX)

SELECT OBJECT_NAME(resource_associated_entity_id) as objectName,resource_description, resource_lock_partition,request_mode  FROM sys.dm_tran_locks
WHERE resource_type = ‘OBJECT’

ROLLBACK TRAN

ObjectName resource_lock_partition request_mode
Address 0 IS
ContactType 0 X

If you run the same script on a machine on which lock partitioning is enabled, you will get the following results.

ObjectName resource_lock_partition request_mode
Address 5 IS
ContactType 15 X
ContactType 14 X
ContactType 13 X
ContactType 12 X
ContactType 11 X
ContactType 10 X
ContactType 9 X
ContactType 8 X
ContactType 7 X
ContactType 6 X
ContactType 5 X
ContactType 4 X
ContactType 3 X
ContactType 2 X
ContactType 1 X
ContactType 0 X

You can see that the IS lock is acquired only on one partition, whereas the X lock is acquired on all partitions. The actual number of partitions depend upon the number of available logical processors.

As the locking is done at the lock partition level, naturally these locks can participate in a deadlock and the resources listed under the resource list will be the lock partitions.

A deadlock can be identified as a lock portioning related deadlock from the below attributes.

1. The wait resource is “Object”

2. The wait resource identifier will be in the format “7:1797581442:12” where the last two digit indicates the lock partition.

3. In the resource list the value for sub resource is “FULL”

4. The resource list will be in the format “lockPartition=12”

5. If you execute the statement involved in the deadlock in a query and inspect resource_lock_partition column in sys.dm_tran_locks, you will see a non-zero value.

The following demo will illustrate a deadlock on lock partitions.

Setup

SQL server automatically enables lock partitioning on machines with more than 16 logical processors. If you don’t have a machine with 16 cores, you can use the (undocumented and unsupported) startup parameter –Pn.

I’ve used –P16 to simulate 16 processors.

When lock partitioning is enabled, you will see the following message in the SQL server log.

Lock partitioning is enabled.  This is an informational message only. No user action is required.

Once you verify that lock partitioning is enabled, run the below script to create the table for our test.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO

CREATE TABLE dbo.lpdl(id int identity(1,1) primary key clustered, filler char(100) default ‘abc’)
GO
SET NOCOUNT ON
INSERT INTO dbo.lpdl DEFAULT VALUES
GO 100

Now open a new session (session 1) and run the below query.

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 1

WAITFOR DELAY ’00:00:05′

SELECT * FROM dbo.lpdl WITH(TABLOCKX)

ROLLBACK TRAN

Open another session (Session 2) and run the below query.

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 1

WAITFOR DELAY ’00:00:05′

SELECT * FROM dbo.lpdl WITH(TABLOCKX)

ROLLBACK TRAN

One of the sessions will fail with the deadlock error. You have probably noticed that we are forcing a conversion deadlock here by first acquiring a compatible IX lock and the trying to acquire a non-compatible X lock on the same table.

Analyze

To analyze this deadlock we will go ahead and create our deadlock table.

Process Resource Mode State Command
process4ee8508 OBJECT: 26:242099903:2 X WAIT SELECT
process4ee8508 OBJECT: 26:242099903:0 X GRANT SELECT
process7c4988 OBJECT: 26:242099903:0 X WAIT SELECT
process7c4988 OBJECT: 26:242099903:2 IX GRANT UPDATE

The lock resource reported in the deadlock trace will be in the format OBJECT:DBID:OBJECTID:LOCKPARTITION. AS we can see here  process process7c4988 is holding an IX lock on partition 2. Since IX lock is acquired only on the local partition, we can assume that partition 2 is local to  process7c4988. This process now requires an X lock on partition 0. When a process requires any locks other than locks that can be acquired on the local partition, it must access the locks in a sequential order starting from partition 0. But since process process4ee8508 is already holding an X lock on partition 0, process process7c4988 is now blocked.

Meanwhile process4ee8508 is holding an X lock (converted its own IX lock on the partition to X lock) on partition 0 and it needs to acquire the X lock on the remaining 15 partitions in that order. When it reaches partition 2, it will be blocked since the other thread is holding an incompatible lock and a deadlock ensues.

Regardless of whether lock portioning was enabled or not, the above script will cause a deadlock.  But there are extremely rare scenarios where the lock partitioning feature will cause a deadlock whereas the normal locking behavior will only cause blocking. I will talk about that in the next post.

Fix

The key to avoid this kind of deadlock is to avoid object level locks. These kind of deadlocks are pretty rare. I’ve only found one reference of deadlock involving lock partitions in MSDN forums.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Conversion Deadlock

This post discusses another ‘classic’ deadlock.

Setup

This deadlock involves just one resource and two processes, each holding a compatible lock and trying for a non-compatible lock. We will use a more restrictive REPEATABLE READ ISOLATION level for this demo. You can see that unlike the cyclic deadlock example, both the sessions are running the same script here.

Run the below script to create the table and test data.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO

CREATE TABLE dbo.codl(id int identity(1,1) primary key clustered, col1 char(10) default ‘abc’)

INSERT INTO dbo.codl DEFAULT VALUES

Now run the below script simultaneously in two sessions.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT * FROM dbo.codl WHERE ID = 1
WAITFOR DELAY ’00:00:05′
UPDATE dbo.codl
SET col1 = ‘xyz’ WHERE id = 1
ROLLBACK TRAN

One of the sessions will fail with the deadlock error.

Analyze

If you prepare the deadlock analysis table from the trace data, it will look like the following.

Process Resource Mode State Command
process592e08 KEY: 26:72057594040745984 (8194443284a0) X WAIT UPDATE
process592e08 KEY: 26:72057594040745984 (8194443284a0) S GRANT SELECT
process579048 KEY: 26:72057594040745984 (8194443284a0) X WAIT UPDATE
process579048 KEY: 26:72057594040745984 (8194443284a0) S GRANT SELECT

You can see that only a single resource is involved(specifically a single key) and both the sessions are holding a compatible Shared lock on it and both are trying to convert it into an exclusive lock.

Fix

You won’t see this deadlock happening in READ COMMITTED isolation level since the lock duration of a shared lock in READ COMMITTED is only during the time the row is being processed, with some exceptions where the lock is held until the statement is completed. Obviously in REPEATABE READ isolation level, the locks are held until the end of the transaction. To fix this deadlock in REPEATABLE READ isolation level you will need to acquire a non-compatible lock in the beginning by using a UPDLOCK hint in the SELECT statement.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian