Home > SQL Server > Writing Code that scales – Part 2 – Cursors

Writing Code that scales – Part 2 – Cursors

Following up to my previous post on scalability, we now turn our attention to cursors. You can probably find lots of info on the web on cursors and where not to use them , but I had included it into the presentation for the benefit of junior developers, and I’ll just mention here the demo I did during the presentation.

So what are the problems with cursors ?
1) They are procedural. This means that programmers coming from other languages think that they are perfectly logical , without realising that SQL is set based.
2) Some types of cursors lock records in the database and prevent other users from changing them. Other types of cursors create an additional copy of all records and then work with them.
3) A cursor is a lot more code to maintain. You’ll see that set based code is normally smaller.

Lets move on to an example.

Suppose that we wanted to update a table based on values from another table. Yes, this is a simple example I know, but it will be easier to understand if you’re coming from a .NET background into T-SQL development. So you’re probably thinking about some kind of lookup at this point , and in code terms you’re thinking about implementing the following  :

horrible cursor
I’ve used tables from AdventureWorksDW, althought I modified the DimCustomers table to hold geographical information, and put them in a cursors schema. Now, the above is certainly not an example of best practice coding. When I ran it on my machine , it took about 10 minutes to complete. To get an idea of what is actually going on , run a profiler trace when the cursor is running. In the background , you’re actually getting thousands of UPDATE and SELECT statements executing on the database. This is very inefficient when compared to the set-based solution that we’ll investigate in a moment. It also means a tremendous amount if IO going on.

Why this doesn’t scale

Well , typically what happens is that the cursor runs fine in development because the volumes of data are very small. In production , the volumes grow monthly and the cursor gets slower every month. Another reason is that when the cursor is running , it will cause blocking in the production environment. Lets take a look at a set based solution. One way of writing this would be to write and UPDATE statement with a JOIN clause.

Set based code

When I ran this on my machine , it took exactly 1 second to execute , but did the same job as the cursor. Now you see the power of set based code. The update statement , because there is a join , can in 1 statement update the relevant cell with the right piece of information. That is why we see such a difference. Another thing you can do is to compare execution plans.

Once you get your head around Set-Based queries , you’ll find it much easier to come up with set-based alternatives to cursors in your daily programming life.

Categories: SQL Server
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: