Saturday, April 04, 2009

How to use SQL Server T-SQL cursors


The general advice within the SQL Server community is do not use any T-SQL cursors at any time.
I must say that I have to agree, although if you have browsed my blog you've noticed that there are a couple of occasions when I do use cursors but only when the following is true:

- All cursors are read_only / forward-only for optimum speed.
- They're not used in OLTP environments.
- Datasources are small sets, usually system objects, like a list of databases / tables.
- The impact the cursor is many times smaller then the processing steps taken with the row results.

(like a cursor based on sys.databases and with the result of each row a full database backup is made).

Coding guidelines:
- Avoid using cursors and try solve the problem by using set based operations.
- Try to base them on readonly tables or local (#temptable) temporarily objects.
- Cleanup! Always CLOSE and DEALLOCATE cursors.
- Specify the 'LOCAL' option.
- Specify the FAST_FORWARD clause on DECLARE CURSOR. This opens an optimized forward-only, read-only cursor.
- Never use cursors in OLTP environments.
- Never use them as source of an update: UPDATE table SET col1 = 2 WHERE CURRENT OF the_cursor;
- Cursors are memory intensive; base them on smaller sets (less 5,000-10,000 rows, your site could differ: test!).

template:

    DECLARE the_cursor CURSOR FAST_FORWARD LOCAL FOR
        SELECT col1,col2,col3
         FROM dbo.mytable

    OPEN the_cursor

    FETCH NEXT FROM the_cursor INTO @col1,@col2,@col3
    WHILE (@@FETCH_STATUS <> -1)
    BEGIN

        FETCH NEXT FROM the_cursor INTO @col1,@col2,@col3
    END

    CLOSE the_cursor
    DEALLOCATE the_cursor
 


TIP:
If you do need cursors in high load OLTP environments because of some complex calculation that can't be done set based take the following approach:

Copy the required result set in a temporary object. Retrieve only the rows and columns you need, but do include all the fields of the primary key.

create #temptable (CalcValue int, pkCol int)

INSERT INTO #temptable (CalcValue, pkCol)

SELECT 0, PrimaryKeyCol
FROM dbo.HighLoadOLTPtable
WHERE -- your where clause here

Base your cursor on the temp. object.

Loop the cursor, perform your calculation and store the result in the temp. object row:

UPDATE #temptable SET CalcValue=complex_calculated_value
WHERE pkCol=pk_fields_as_fetched_by_cursor

When done looping close/deallocate the cursor, and update the high load OLTP source table(s) set based by primarykey, use a BEGIN TRANSACTION / COMMIT if required:

UPDATE dbo.HighLoadOLTPtable SET CalculatedValue = #temptable.CalcValueFROM dbo.HighLoadOLTPtable, #temptableWHERE dbo.HighLoadOLTPtable.PrimaryKeyCol = #temptable.pkCol


see my posts on
TempDB configuration and TempDB objects



Bookmark and Share

Wednesday, April 01, 2009

Working with SQL Server TempDB objects


Try to avoid them in the first place by using derived tables or CTEs

Table variables are easy an convenient, because you don't have to clean them up.
But only use them with small sets (smaller 1000 rows)

Do not use SELECT col1 INTO #temptable FROM dbo.table1

But do use a create table statement and a INSERT/SELECT:
CREATE #temptable (col1 int,col2 int)

INSERT INTO #temptable (col1,col2) SELECT Col1, Col2 FROM dbo.table1

a bit more work, but less blocking in environments with a higher load.

With larger sets you could consider using indexes and check with the query plan and statistics io to see if there are any benefits in using them.
Create these indexes after the table is filled with a set:

INSERT INTO #temptable (col1,col2) SELECT Col1,Col2 FROM dbo.table1
CREATE CLUSTERED INDEX CIX_#temptable ON #temptable (col1)

CREATE INDEX IX_#temptable_1 ON #temptable (Col2)

Create ony the indexes you absolutely need.

Configure TempDB as suggested in my post configure TempDB

Can your TempDB database disks handle the current load? See my post on fn_virtualfilestats

Prefer locally (#temptable) over globaly (##temptable) scoped definitions

Store results of linked servers in TempDB objects instead of accessing the remote datasource multiple times. See my post on linked servers

Be a good citizen, at the end of your stored procedure or T-SQL code always cleanup after your self and drop all temporarily objects you have created.


    --drop temp tables if they exists
    IF OBJECT_ID('tempdb..#temptable') IS NOT NULL
        DROP TABLE #temptable
 



Bookmark and Share