Cyclic deadlock

We will start with a classic deadlock scenario. The cyclic deadlocks are probably the most common and easily fixable deadlock.

Setup

This deadlock requires a minimum two resources and two process. For simplicity we will use two different tables.

Run the below script to setup the tables.

IF DB_ID(‘DeadLockTest’) IS NULL

CREATE Database DeadLockTest

GO

USE DeadLockTest

GO

CREATE TABLE Cdl1(id int identity(1,1), col1 char(10) default ‘abc’)

CREATE TABLE Cdl2(id int identity(1,1), col1 char(10) default ‘abc’)

GO

INSERT INTO Cdl1 DEFAULT VALUES

INSERT INTO Cdl2 DEFAULT VALUES

 

Now the run the below script in a new session (Session 1)

BEGIN TRAN

–update the first table

UPDATE Cdl1 SET col1= ‘xyz’

–Wait for a second for the other session to catchup

WAITFOR DELAY ’00:00:5′

–update the second table

UPDATE Cdl2 SET col1= ‘xyz’

ROLLBACK TRAN

And the below script in session 2

BEGIN TRAN

–update the first table

UPDATE Cdl2 SET col1= ‘xyz’

–Waitfor a second for the other session to catchup

WAITFOR DELAY ’00:00:5′

–update the second table

UPDATE Cdl1 SET col1= ‘xyz’

ROLLBACK TRAN

One of the session will fail with the error
Msg 1205, Level 13, State 45, Line 13

Transaction (Process ID 60) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Analyze

By analyzing the trace using your favorite method (I had enabled trace flag 1222), you will be able to arrive at the following table (The actual value of the process and resource will vary)

Process Resource Mode State Command Object
process579288 RID: 26:1:2671:0 U WAIT UPDATE DeadLockTest.dbo.Cdl1
process579288 RID: 26:1:2673:0 X GRANT UPDATE DeadLockTest.dbo.Cdl1
process13fdc8 RID: 26:1:2673:0 U WAIT UPDATE DeadLockTest.dbo.Cdl1
process13fdc8 RID: 26:1:2671:0 X GRANT UPDATE DeadLockTest.dbo.Cdl1

 

As you can see, both the processes are holding and waiting for incompatible locks. If you analyze the batch you can see that the tables are accessed in a different order.

Fix

This one is an easy fix. If you change the script to access the tables in the same order the deadlock will go away.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Advertisements

Deadlock due to Parallelism

 

Do you think a simple query like the below one can deadlock with another instance running the same query?

SELECT @id = id FROM dbo.midl WITH(UPDLOCK)
WHERE id2 = @id

Read on to find out.

Setup

Since this post is about deadlock due to parallelism, make sure that parallelism is enabled on your server.

Run the below script to create the table.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

CREATE TABLE dbo.midl(id int identity(1,1) primary key clustered, id2 int, filler char(100) default ‘abc’ )

SET NOCOUNT ON

DECLARE @i int
SET @i = 1
WHILE @i < 60000
BEGIN
INSERT INTO dbo.midl DEFAULT VALUES
SET @i = @i+1
END

UPDATE dbo.midl SET id2 = id+1

Now run the following query in multiple sessions or use a tool like SQLQueryStress developed by Adam Machanic (b | t) which can be used for executing a query in multiple sessions.

SET NOCOUNT ON

DECLARE @id int
WHILE 1=1
BEGIN
SET @id = (RAND() * 1000) + 1
SELECT @id = id FROM dbo.midl WITH(UPDLOCK)
WHERE id2 = @id
END

In my tests with five concurrent sessions, three of them deadlocked under a minute.

The queries failed with the error

Msg 1205, Level 13, State 51, Line 6

Transaction (Process ID 60) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Analyze

If you had paid close attention to the error message, it says the deadlock happened on lock | communication buffer resources, which is a hint about the actual issue here. If you pay close attention to the resource list in the deadlock trace you can see the exchange operators appearing in the waiter list.

exchangeEvent id=Pipe8ac26500 WaitType=e_waitPipeGetRow nodeId=1

Deadlocks involving parallelism always involve thread or lock resources. So we can proceed in our conventional route and prepare our deadlock table.

I had simplified the below table to include only two process instead of all four participants in the deadlock.

Process Resource Mode State Command Object
process579948 KEY: 26:72057594039500800 (8194443284a0) U WAIT SELECT indexname=PK__midl__X
process579948 KEY: 26:72057594039500800 (1524a5133a9d) U GRANT SELECT indexname=PK__midl__X
process81e714c8 KEY: 26:72057594039500800 (1524a5133a9d) U WAIT SELECT indexname=PK__midl__X
process81e714c8 KEY: 26:72057594039500800 (8194443284a0) U GRANT SELECT indexname=PK__midl__X

 

This only tells us that the contention is on the PK. We need to see what keys are being actually involved. We can get this using the undocumented %%lockres%% function.

By running the below query,

SELECT id, %%lockres%% as LockResource
FROM dbo.midl
WHERE %%lockres%% IN(‘(8194443284a0)’,'(1524a5133a9d)’)

Id LockResource
1 (8194443284a0)
12709 (1524a5133a9d)

 

Now we can see that, process81e714c8 is holding a lock on row with id 1 and is waiting for a lock on id 12709. This is OK since it is a normal blocking scenario. But process579948 is holding a lock on row with id 12709 and waiting for a lock on id 1. If we look at the execution plan, we can see why.

 

image

When a task is executed using a parallel operator, the input is divided into a number of parallel buckets and each parallel thread works on its own subset of data.

Fix

This is again an easy fix. You have probably noticed that the id2 column does not have an index. If you create a non-clustered index on the id2 column, the deadlock will disappear.

PS: If you did not get the deadlock error, try adding few more thousands of rows and/or few more concurrent processes running the same query.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

 

Troubleshooting deadlocks

 

There are quite a few ways to capture and troubleshoot deadlock errors. Personally I use the trace flag 1222 to capture deadlocks and manually analyze the trace results. The ultimate resource on deciphering the 1222 output is Bart Duncan’s blog. Please read this four part series. This will provide a lot of insights in troubleshooting deadlocks.

With SQL Server 2008, you can collect deadlock information using the system_health default event session in Extended Events. The most interesting thing about this is that you can collect the deadlock details after the fact. Jonathan Kehayias, probably the best expert you can find in the topic of extended events and a champion in deadlock troubleshooting wrote about using extended events and few other methods to capture and troubleshoot deadlock errors in his book Troubleshooting SQL server – A Guide for the Accidental DBA. He also discusses about some common deadlocks and how to troubleshoot them. You can read the chapter on deadlocks here.

Also see Amit Banerjee’s post on the retrieving deadlock data from the System Health event session  here. He also blogged about how you can use Powershell and TSQL to create individual deadlock graphs from System health event here.

Regardless of the method you sue to capture and analyze the deadlocks, you will need to arrive at something like below to understand the resources and processes involved in the deadlock to fix the deadlock issue.

Process Resource Mode State Command Object
process6dd708 KEY: 26:72057594042318848 (c904e608358d) U WAIT SELECT indexname=PK__spdl1
process6dd708 KEY: 26:72057594042318848 (30488d9fd081) U GRANT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (30488d9fd081) U WAIT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (c904e608358d) U GRANT SELECT indexname=PK__spdl1

The above table will give you an idea about the locks that are in granted and waiting state and the resources associated with it. The traces will only provide you the statements that are in the waiting state. You will need to analyze the batch to see what statement actually acquired the locks. In many cases you will also need to look at the execution plan to understand why the locks are being held.

We will follow this method to analyze the deadlocks we are going to discuss.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please leave a comment here or reach me at @sqlindian

Introduction to deadlocks

Anybody with little experience with SQL Server must have seen few deadlock errors and probably understood why it happens. So a definition may not be necessary. Books online defines deadlock as “A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock”

SQL server automatically detects deadlock conditions and choose one of the processes as the victim and rollback that transaction based on the DEADLOCK PRIORITY and log used. There are few exceptions to this rule for victim selection and in some cases more than one process may selected as a victim. We will discuss about this in detail in a later post.

A deadlock can occur on Locks, Worker Threads, Memory, Parallel query resources and MARS resources. Our discussions in this series will be around handling Lock related deadlocks.

A deadlock is not a bug, except for very few cases which we will discuss later. A deadlock is a natural side effect of a busy system and are not completely avoidable. But with careful database design, proper indexing and by handling deadlock errors, the possibility of deadlocks and the impact of it can be greatly reduced. Some of the deadlock scenarios we are going to discuss in the coming posts require use of additional locking hints or the use of sp_getAppLock

Deadlock Properties

  • If a lock can be described as a unit of contention, a deadlock involves at least two units of contention.
  • At least two processes will be involved in a deadlock.
  • Each of the processes involved in a deadlock will be – directly or indirectly – holding a lock and will be waiting for another.
  • The Pending requests are not compatible with the existing locks.

Deadlock Categories

Cyclic deadlock vs Conversion Deadlocks : Deadlocks are generally categorized as cyclic deadlocks or conversion deadlocks.

A cyclic deadlock happens when two processes accesses two resource in the opposite order. A cyclic deadlock involves two or more resources.

Following is a slightly modified version of the deadlock example from BOL

  1. Transaction A acquires an exclusive lock on row 1.
  1. Transaction B acquires an exclusive lock on row 2.
  1. Transaction A now requests an exclusive lock on row 2, and is blocked until transaction B finishes and releases the exclusive lock it has on row 2.
  1. Transaction B now requests an exclusive lock on row 1, and is blocked until transaction A finishes and releases the exclusive lock it has on row 1.

A conversion deadlock happens when two processes holding compatible locks try to Convert the lock into non-compatible mode. A conversion deadlock involves one or more resources.

If we modify the above example to show a conversion deadlock , it will look like

  1. Transaction A acquires a shared lock on row 1.
  1. Transaction B acquires a shared lock on row 1.
  1. Transaction A now requests an exclusive lock on row1, and is blocked until transaction B finishes and releases the share lock it has on row 1.
  1. Transaction B now requests an exclusive lock on row 1, and is blocked until transaction A finishes and releases the share lock it has on row 1.

Multi – Statement vs Single Statement deadlocks : Another way I would like to classify deadlocks are based on how many statements are involved in the deadlock. Simple deadlock scenarios involve multiple statements from multiple batches holding and waiting for conflicting locks. These deadlocks are relatively easy to troubleshoot. There are cases where the exact same statement executed by different processes ending up in a deadlock. We will see examples of both the cases in the coming posts.

Deadlocks and Isolation Levels

Deadlocks can happen in any isolation levels. The probability of deadlocks are higher in more restrictive isolation levels like SERIALIZABLE compared to less restrictive isolation levels like READ UNCOMMITTED. In the context of locks, Isolation levels only controls the duration and mode of Shared locks. Regardless of the isolation level, exclusive locks and update locks will be held until the end of the transaction.

In the next post I’ll introduce you to the resources I use for troubleshooting deadlocks.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

SQL Server deadlocks –Common Patterns

I get to play with a lot of deadlocks these days, part by chance and part by choice. So I thought it is a good idea to start a blog series about handling different types deadlocks.

I will try to establish some context about general deadlock properties and then introduce you to some common and some not so common deadlock patterns.

Introduction

Troubleshooting

Cyclic deadlock

Conversion deadlock

Multi-Victim Deadlocks

Deadlock due to different access Paths

Deadlock due to different access Order

Deadlock due to different lock granularity

Deadlock due to Foreign Key Constraints

Deadlock due to Parallelism

Deadlocks due to Partition Level Lock Escalation

DEADLOCK on SELECT due to UNORDERED PREFETCH

Deadlocks involving Select Into

Deadlock due to Savepoint Rollback behavior

Deadlocks due to Hash Collision

Deadlocks Involving Lock Partitions

Deadlocks due to Lock Partitioning

Deadlock due to Implicit Conversion

 

 

Deadlock due to Different Lock granularity

A deadlock can occur when a table is accessed with different lock granularity  in the same transaction. The following script demonstrates the concept.

Setup

Run the following script to create the necessary table.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO
CREATE TABLE dbo.lgdl(id int identity(1,1) primary key clustered, filler char(100) default ‘abc’)
GO

INSERT INTO lgdl DEFAULT VALUES
GO
INSERT INTO lgdl DEFAULT VALUES
GO

Now open a new session (session 1) and run the below script.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM lgdl WITH(ROWLOCK) WHERE id = 1

WAITFOR DELAY ’00:00:05′

UPDATE l
SET filler = ‘xyz’
FROM lgdl L WITH(PAGLOCK)
WHERE id =1

ROLLBACK TRAN

Open another session (session 2)  and run the below script.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM lgdl WITH(ROWLOCK) WHERE id = 2

WAITFOR DELAY ’00:00:05′

UPDATE l
SET filler = ‘xyz’
FROM lgdl L  WITH(PAGLOCK)
WHERE id =2

ROLLBACK TRAN

Analyze

Lets build the deadlock analysis table from the trace results.

Process Resource Mode State Command Object
process7ed948 PAGE: 26:1:2898 U WAIT UPDATE dbo.lgdl
process7ed948 PAGE: 26:1:2898 IS GRANT UPDATE dbo.lgdl
process44f708 PAGE: 26:1:2898 X WAIT UPDATE dbo.lgdl
process44f708 PAGE: 26:1:2898 U GRANT UPDATE dbo.lgdl

So what is happening here is, both the processes will start a S lock on the individual keys and an IS lock on the page. Since we explicitly specified  PAGLOCK in the update statement both transactions will need to acquire locks at the page level for the UPDATE statement. SQL server acquires locks for updates in two phases. First it will acquire a U lock for the read phase of the Update and then an X lock for the Update. So when the first transaction tries to acquire a U lock on the page, it will succeed because a U lock is compatible with an IS lock. However, when it tries to convert the U lock to an X lock, it will be blocked since an X lock is not compatible with an IS lock.

Meanwhile when the second the second transaction tries to convert the IS lock to a U lock, it will be blocked because a U lock is not compatible with another U lock. And when two transactions block each other the result is a deadlock.

For details on SQL server lock compatibility, please refer to this entry on MSDN.

Fix

Make sure a table is accessed using the same lock granularity inside a transaction especillly when using lock granularity hints.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Deadlocks due to different Access Order

A deadlock can occur when two sessions try to access the same index in a different order in non-compatible lock modes. The following script demonstrates this behavior.

Setup

Run the below scripts to create the necessary tables.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO

CREATE TABLE dbo.aodl(id int identity(1,1) primary key clustered, filler char(100) default ‘abc’)
GO

SET NOCOUNT ON

DECLARE @i int
SET @i = 1
WHILE @i < 10000
BEGIN
INSERT INTO dbo.aodl DEFAULT VALUES
SET @i = @i+1
END

Now open a new session (session 1) and run the below script. Make sure to change the WAITFOR TIME value according to the time you are running. Its is used to ensure that both the queries will be starting at the same time.

WAITFOR TIME ’18:02′
BEGIN TRAN
SELECT * from dbo.aodl WITH(UPDLOCK)
WHERE Id > 4000
ORDER BY ID ASC
ROLLBACK TRAN

Now open another session (Session 2) and run the below script after changing the start time according to the first script.

WAITFOR TIME ’18:02′
BEGIN TRAN
SELECT * from dbo.aodl WITH(UPDLOCK)
WHERE Id < 6000
ORDER BY ID DESC
ROLLBACK TRAN

One of the session will fail with the deadlock error.

Analyze

From the script itself, it is obvious why the deadlock is happening. Lets confirm this by preparing the deadlock table.

Process Resource Mode State Command Object
process6dd708 KEY: 26:72057594042318848 (c904e608358d) U WAIT SELECT indexname=PK__spdl1
process6dd708 KEY: 26:72057594042318848 (30488d9fd081) U GRANT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (30488d9fd081) U WAIT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (c904e608358d) U GRANT SELECT indexname=PK__spdl1

 

Lets also find out what are the rows involved in deadlock using the below query.

SELECT %%lockres%% as LockResource, ID
FROM dbo.aodl
WHERE %%lockres%% IN(‘(c904e608358d)’, ‘(30488d9fd081)’)

On my machine it yields the below result.

LockResource ID
(c904e608358d) 5006
(30488d9fd081) 5007

You can see that both the processes were scanning the table in the opposite order and collided in the middle.

Fix

Change the ORDER BY clause to ensure that both the processes were accessing the table in the same order. One process will be blocked until the other finishes, but it won’t fail with a deadlock error.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Deadlocks due to different Access Paths

A common deadlock pattern involving a single table is when the table is accessed using different access paths by means of different indexes. This post in Bart Duncan’s series explains this with a proper example.

Jonathan (b|t) offers a visual representation of the concept here.

Remus Rusanu(b|t) also offers an example of the problem here.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

Deadlock due to Savepoint Rollback behavior

Today I want to introduce you to a specific behavior about how locks are rolled back when you rollback a transaction to a Savepoint and how it can cause a deadlock.

Setup

Run the below script to create the necessary tables.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO

CREATE TABLE dbo.spdl1(id int identity(1,1) primary key clustered, status char(20))
CREATE TABLE dbo.spdl2(id int identity(1,1) primary key clustered, status char(20))
GO
SET NOCOUNT ON

INSERT INTO dbo.spdl1 DEFAULT VALUES
INSERT INTO dbo.spdl2 DEFAULT VALUES

Now open a new session (session 1) and run the below script.

USE DeadLockTest
GO

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ

BEGIN TRAN

SELECT * FROM dbo.spdl1 WHERE id = 1

SAVE TRANSACTION SP;

SELECT * FROM dbo.spdl2 WHERE id = 1

UPDATE dbo.spdl2 SET Status = ‘In Progress’
UPDATE dbo.spdl1 SET Status = ‘In Progress’

–Something went wrong.
–Rollback to the save point

ROLLBACK TRANSACTION SP;

Note that we performed two update statements and rolled it back. Open a second session (session 2) and run the below script.

USE DeadLockTest
GO

SET DEADLOCK_PRIORITY LOW

BEGIN TRAN

UPDATE dbo.spdl2 SET Status = ‘In Progress’ WHERE id = 1

The update will succeed.  Now run the below statement in the same window (session 2)

SELECT * FROM dbo.spdl1 WHERE id = 1

The above statement will be (surprisingly) blocked. Now go back to Session 1 and run the below script to cause a deadlock condition.

UPDATE dbo.spdl2 SET Status = ‘Failed’

The second session will now fail with the deadlock error.

Analyze

To see what is happening here, lets build the deadlock analysis table.

Process Resource Mode State Command Object
process6dd708 KEY: 26:72057594042122240 (8194443284a0) S WAIT SELECT indexname=PK__spdl1
process6dd708 KEY: 26:72057594042253312 (8194443284a0) X GRANT UPDATE indexname=PK__spdl2
process6c2e08 KEY: 26:72057594042253312 (8194443284a0) U WAIT UPDATE indexname=PK__spdl2
process6c2e08 KEY: 26:72057594042122240 (8194443284a0)  X GRANT SELECT ? indexname=PK__spdl1

As you can see above the process process6c2e08, which executed two update statements and rolled it back is still holding an exclusive lock (X) on one table, but obviously it released the exclusive lock on the other table, because the second session was able to acquire an X lock on that table and the UPDATE statement succeeded.

Couple of things to note here is that we are running Session 1 in REPEATABLE READ isolation level so we expect the shared locks to be held until the end of the transaction. The first SELECT statement is before the SAVE TRANSACTION statement and the second SELECT statement is after the SAVE TRANSACTION statement. So both these SELECT statements will first acquire Shared locks. When the update statements are executed, both the Shared locks will be converted to Exclusive locks. When we hit the ROLLBACK Savepoint, the ideal behavior is

1. Release the exclusive lock on the second table.

2. Downgrade the Exclusive lock on the first table to Shared Lock since the SELECT statement is outside the Savepoint.

But SQL Server does not undo a lock conversion by changing the lock mode to the previous mode existed before the Savepoint. So in our case, the exclusive lock on the first table will be held even if we rollback the UPDATE Statements.

IF you query the DMV sys.dm_tran_locks after the ROLLBACK savepoint statement, you can see that the process is still holding the exclusive lock.

Fix

Just move the first select statement below the SAVE TRANSACTION statement!

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian