Deadlock due to Parallelism

 

Do you think a simple query like the below one can deadlock with another instance running the same query?

SELECT @id = id FROM dbo.midl WITH(UPDLOCK)
WHERE id2 = @id

Read on to find out.

Setup

Since this post is about deadlock due to parallelism, make sure that parallelism is enabled on your server.

Run the below script to create the table.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

CREATE TABLE dbo.midl(id int identity(1,1) primary key clustered, id2 int, filler char(100) default ‘abc’ )

SET NOCOUNT ON

DECLARE @i int
SET @i = 1
WHILE @i < 60000
BEGIN
INSERT INTO dbo.midl DEFAULT VALUES
SET @i = @i+1
END

UPDATE dbo.midl SET id2 = id+1

Now run the following query in multiple sessions or use a tool like SQLQueryStress developed by Adam Machanic (b | t) which can be used for executing a query in multiple sessions.

SET NOCOUNT ON

DECLARE @id int
WHILE 1=1
BEGIN
SET @id = (RAND() * 1000) + 1
SELECT @id = id FROM dbo.midl WITH(UPDLOCK)
WHERE id2 = @id
END

In my tests with five concurrent sessions, three of them deadlocked under a minute.

The queries failed with the error

Msg 1205, Level 13, State 51, Line 6

Transaction (Process ID 60) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

Analyze

If you had paid close attention to the error message, it says the deadlock happened on lock | communication buffer resources, which is a hint about the actual issue here. If you pay close attention to the resource list in the deadlock trace you can see the exchange operators appearing in the waiter list.

exchangeEvent id=Pipe8ac26500 WaitType=e_waitPipeGetRow nodeId=1

Deadlocks involving parallelism always involve thread or lock resources. So we can proceed in our conventional route and prepare our deadlock table.

I had simplified the below table to include only two process instead of all four participants in the deadlock.

Process Resource Mode State Command Object
process579948 KEY: 26:72057594039500800 (8194443284a0) U WAIT SELECT indexname=PK__midl__X
process579948 KEY: 26:72057594039500800 (1524a5133a9d) U GRANT SELECT indexname=PK__midl__X
process81e714c8 KEY: 26:72057594039500800 (1524a5133a9d) U WAIT SELECT indexname=PK__midl__X
process81e714c8 KEY: 26:72057594039500800 (8194443284a0) U GRANT SELECT indexname=PK__midl__X

 

This only tells us that the contention is on the PK. We need to see what keys are being actually involved. We can get this using the undocumented %%lockres%% function.

By running the below query,

SELECT id, %%lockres%% as LockResource
FROM dbo.midl
WHERE %%lockres%% IN(‘(8194443284a0)’,'(1524a5133a9d)’)

Id LockResource
1 (8194443284a0)
12709 (1524a5133a9d)

 

Now we can see that, process81e714c8 is holding a lock on row with id 1 and is waiting for a lock on id 12709. This is OK since it is a normal blocking scenario. But process579948 is holding a lock on row with id 12709 and waiting for a lock on id 1. If we look at the execution plan, we can see why.

 

image

When a task is executed using a parallel operator, the input is divided into a number of parallel buckets and each parallel thread works on its own subset of data.

Fix

This is again an easy fix. You have probably noticed that the id2 column does not have an index. If you create a non-clustered index on the id2 column, the deadlock will disappear.

PS: If you did not get the deadlock error, try adding few more thousands of rows and/or few more concurrent processes running the same query.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

 

Advertisements

Troubleshooting deadlocks

 

There are quite a few ways to capture and troubleshoot deadlock errors. Personally I use the trace flag 1222 to capture deadlocks and manually analyze the trace results. The ultimate resource on deciphering the 1222 output is Bart Duncan’s blog. Please read this four part series. This will provide a lot of insights in troubleshooting deadlocks.

With SQL Server 2008, you can collect deadlock information using the system_health default event session in Extended Events. The most interesting thing about this is that you can collect the deadlock details after the fact. Jonathan Kehayias, probably the best expert you can find in the topic of extended events and a champion in deadlock troubleshooting wrote about using extended events and few other methods to capture and troubleshoot deadlock errors in his book Troubleshooting SQL server – A Guide for the Accidental DBA. He also discusses about some common deadlocks and how to troubleshoot them. You can read the chapter on deadlocks here.

Also see Amit Banerjee’s post on the retrieving deadlock data from the System Health event session  here. He also blogged about how you can use Powershell and TSQL to create individual deadlock graphs from System health event here.

Regardless of the method you sue to capture and analyze the deadlocks, you will need to arrive at something like below to understand the resources and processes involved in the deadlock to fix the deadlock issue.

Process Resource Mode State Command Object
process6dd708 KEY: 26:72057594042318848 (c904e608358d) U WAIT SELECT indexname=PK__spdl1
process6dd708 KEY: 26:72057594042318848 (30488d9fd081) U GRANT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (30488d9fd081) U WAIT SELECT indexname=PK__spdl2
process6c34c8 KEY: 26:72057594042318848 (c904e608358d) U GRANT SELECT indexname=PK__spdl1

The above table will give you an idea about the locks that are in granted and waiting state and the resources associated with it. The traces will only provide you the statements that are in the waiting state. You will need to analyze the batch to see what statement actually acquired the locks. In many cases you will also need to look at the execution plan to understand why the locks are being held.

We will follow this method to analyze the deadlocks we are going to discuss.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please leave a comment here or reach me at @sqlindian

Introduction to deadlocks

Anybody with little experience with SQL Server must have seen few deadlock errors and probably understood why it happens. So a definition may not be necessary. Books online defines deadlock as “A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock”

SQL server automatically detects deadlock conditions and choose one of the processes as the victim and rollback that transaction based on the DEADLOCK PRIORITY and log used. There are few exceptions to this rule for victim selection and in some cases more than one process may selected as a victim. We will discuss about this in detail in a later post.

A deadlock can occur on Locks, Worker Threads, Memory, Parallel query resources and MARS resources. Our discussions in this series will be around handling Lock related deadlocks.

A deadlock is not a bug, except for very few cases which we will discuss later. A deadlock is a natural side effect of a busy system and are not completely avoidable. But with careful database design, proper indexing and by handling deadlock errors, the possibility of deadlocks and the impact of it can be greatly reduced. Some of the deadlock scenarios we are going to discuss in the coming posts require use of additional locking hints or the use of sp_getAppLock

Deadlock Properties

  • If a lock can be described as a unit of contention, a deadlock involves at least two units of contention.
  • At least two processes will be involved in a deadlock.
  • Each of the processes involved in a deadlock will be – directly or indirectly – holding a lock and will be waiting for another.
  • The Pending requests are not compatible with the existing locks.

Deadlock Categories

Cyclic deadlock vs Conversion Deadlocks : Deadlocks are generally categorized as cyclic deadlocks or conversion deadlocks.

A cyclic deadlock happens when two processes accesses two resource in the opposite order. A cyclic deadlock involves two or more resources.

Following is a slightly modified version of the deadlock example from BOL

  1. Transaction A acquires an exclusive lock on row 1.
  1. Transaction B acquires an exclusive lock on row 2.
  1. Transaction A now requests an exclusive lock on row 2, and is blocked until transaction B finishes and releases the exclusive lock it has on row 2.
  1. Transaction B now requests an exclusive lock on row 1, and is blocked until transaction A finishes and releases the exclusive lock it has on row 1.

A conversion deadlock happens when two processes holding compatible locks try to Convert the lock into non-compatible mode. A conversion deadlock involves one or more resources.

If we modify the above example to show a conversion deadlock , it will look like

  1. Transaction A acquires a shared lock on row 1.
  1. Transaction B acquires a shared lock on row 1.
  1. Transaction A now requests an exclusive lock on row1, and is blocked until transaction B finishes and releases the share lock it has on row 1.
  1. Transaction B now requests an exclusive lock on row 1, and is blocked until transaction A finishes and releases the share lock it has on row 1.

Multi – Statement vs Single Statement deadlocks : Another way I would like to classify deadlocks are based on how many statements are involved in the deadlock. Simple deadlock scenarios involve multiple statements from multiple batches holding and waiting for conflicting locks. These deadlocks are relatively easy to troubleshoot. There are cases where the exact same statement executed by different processes ending up in a deadlock. We will see examples of both the cases in the coming posts.

Deadlocks and Isolation Levels

Deadlocks can happen in any isolation levels. The probability of deadlocks are higher in more restrictive isolation levels like SERIALIZABLE compared to less restrictive isolation levels like READ UNCOMMITTED. In the context of locks, Isolation levels only controls the duration and mode of Shared locks. Regardless of the isolation level, exclusive locks and update locks will be held until the end of the transaction.

In the next post I’ll introduce you to the resources I use for troubleshooting deadlocks.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian

SQL Server deadlocks –Common Patterns

I get to play with a lot of deadlocks these days, part by chance and part by choice. So I thought it is a good idea to start a blog series about handling different types deadlocks.

I will try to establish some context about general deadlock properties and then introduce you to some common and some not so common deadlock patterns.

Introduction

Troubleshooting

Cyclic deadlock

Conversion deadlock

Multi-Victim Deadlocks

Deadlock due to different access Paths

Deadlock due to different access Order

Deadlock due to different lock granularity

Deadlock due to Foreign Key Constraints

Deadlock due to Parallelism

Deadlocks due to Partition Level Lock Escalation

DEADLOCK on SELECT due to UNORDERED PREFETCH

Deadlocks involving Select Into

Deadlock due to Savepoint Rollback behavior

Deadlocks due to Hash Collision

Deadlocks Involving Lock Partitions

Deadlocks due to Lock Partitioning

Deadlock due to Implicit Conversion

 

 

Deadlock due to Different Lock granularity

A deadlock can occur when a table is accessed with different lock granularity  in the same transaction. The following script demonstrates the concept.

Setup

Run the following script to create the necessary table.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO
CREATE TABLE dbo.lgdl(id int identity(1,1) primary key clustered, filler char(100) default ‘abc’)
GO

INSERT INTO lgdl DEFAULT VALUES
GO
INSERT INTO lgdl DEFAULT VALUES
GO

Now open a new session (session 1) and run the below script.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM lgdl WITH(ROWLOCK) WHERE id = 1

WAITFOR DELAY ’00:00:05′

UPDATE l
SET filler = ‘xyz’
FROM lgdl L WITH(PAGLOCK)
WHERE id =1

ROLLBACK TRAN

Open another session (session 2)  and run the below script.

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM lgdl WITH(ROWLOCK) WHERE id = 2

WAITFOR DELAY ’00:00:05′

UPDATE l
SET filler = ‘xyz’
FROM lgdl L  WITH(PAGLOCK)
WHERE id =2

ROLLBACK TRAN

Analyze

Lets build the deadlock analysis table from the trace results.

Process Resource Mode State Command Object
process7ed948 PAGE: 26:1:2898 U WAIT UPDATE dbo.lgdl
process7ed948 PAGE: 26:1:2898 IS GRANT UPDATE dbo.lgdl
process44f708 PAGE: 26:1:2898 X WAIT UPDATE dbo.lgdl
process44f708 PAGE: 26:1:2898 U GRANT UPDATE dbo.lgdl

So what is happening here is, both the processes will start a S lock on the individual keys and an IS lock on the page. Since we explicitly specified  PAGLOCK in the update statement both transactions will need to acquire locks at the page level for the UPDATE statement. SQL server acquires locks for updates in two phases. First it will acquire a U lock for the read phase of the Update and then an X lock for the Update. So when the first transaction tries to acquire a U lock on the page, it will succeed because a U lock is compatible with an IS lock. However, when it tries to convert the U lock to an X lock, it will be blocked since an X lock is not compatible with an IS lock.

Meanwhile when the second the second transaction tries to convert the IS lock to a U lock, it will be blocked because a U lock is not compatible with another U lock. And when two transactions block each other the result is a deadlock.

For details on SQL server lock compatibility, please refer to this entry on MSDN.

Fix

Make sure a table is accessed using the same lock granularity inside a transaction especillly when using lock granularity hints.

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please post a comment here or reach me at @sqlindian