Monday, March 30, 2009
How to configure AWE memory
AWE according to BOL:
SQL Server 2005 Enterprise Edition supports Address Windowing Extensions (AWE) allowing use of physical memory over 4 gigabytes (GB) on 32-bit versions of Microsoft Windows operating systems. Up to 64 GB of physical memory is supported.
My advice: Don't use AWE!
As mentioned above, AWE is only supported by the far more expensive SQL Server Enterprise Edition (32-bit of course) and it is only used by the core database engine, not by any other component.
SSIS -and the other components- won't benefit and remain limited to the 32-bit ceiling of 2GB per process.
AWE also carries a performance overhead, and it's always a hassle with the boot.ini switches:
If your system has <= 4 GB - use /3GB (none of the other options)
If your system has > 4 GB and < 16 GB - use /3GB and /PAE and configure AWE in SSMS
If your system has > 16 GB - use /PAE and configure AWE in SSMS
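If you do decide to use it anyway, the 'configure AWE in SSMS' part can also be scripted. A minimal sketch for SQL Server 2005 (the service account also needs the 'Lock pages in memory' privilege, and the instance must be restarted before the setting takes effect):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
RECONFIGURE;
GO
Don't forget to cap 'max server memory (MB)' as well, so the OS keeps some headroom.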
Go for SQL Server Standard Edition (which is more than enough for most shops) and go for a 64-bit environment with enough memory (start with 8GB) for SQL Server to use.
Far cheaper and far more scalable.
You can serve more databases on the same instance, leading to substantial cost savings.
See my post on Tools for performance and How many instances on a SQL Server
Saturday, March 28, 2009
Always run SQLIO on new servers!
When your vendor brings a new type of server to market, always check whether the hardware is as good as they claim it is.
A while ago I did a test run and compared the IO results of a Dell server and an almost identically configured IBM server. Both were small configurations with six internal drives. The tool I used was SQLIO.
SQLIO is a small .exe file (250KB) that mimics the I/O patterns of the SQL Server database engine. You don't have to install SQL Server for the tool to work; the Windows OS alone is enough, so you can also check the throughput of your SAN file system.
You can download SQLIO and some documentation here
The result? Guess what? The Dell server scored ten (10!) times better on IO performance than the more expensive IBM...
Never take your vendor's word for it, always test yourself. Because who gets the blame in the end?
The test file was 25MB, but results were roughly the same for bigger test files.
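For reference, the file passed with -F is a plain text parameter file; each line describes one test file as: path, number of threads, mask, and file size in MB. A sketch with hypothetical values matching the 25MB test file above:
C:\SQLIO\testfile.dat 2 0x0 25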
Results for: sqlio -kR -s360 -frandom -o8 -b64 -LS -Fparam.txt
IBM server local drives:
throughput metrics:
IOs/sec: 1177.91
MBs/sec: 73.61
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 13
Max_Latency(ms): 919
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 16 11 4 5 8 8 6 5 4 3 3 2 2 2 1 1 1 1 1 1 1 1 1 1 13
Dell server local drives:
throughput metrics:
IOs/sec: 11,292.99
MBs/sec: 705.81
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 1
Max_Latency(ms): 394
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 0 96 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
See my post on disk configuration tips.
Friday, March 27, 2009
How many instances on a SQL Server?
In my opinion easy and short answer: install one (1) instance and use SQL Server Standard edition 64bit with enough memory (at least 8GB) for SQL Server to run comfortably.
Why? With multiple instances on the same server you have to make assumptions about how much memory to assign to each instance, leading to a less than optimal configuration.
Also, the housekeeping tasks scheduled on the separate SQL Agents have to be configured and monitored separately.
On a dedicated SQL Server I usually leave 1-2GB of memory to Windows (depending on the features installed, like SSIS or Reporting Services), give all remaining memory to SQL Server and definitely go for a 64-bit (32 = dinosaur) environment. If you need additional memory to support more databases on the same instance, it can then easily be installed without the hassle of boot.ini switches and the performance overhead of AWE (and AWE is only supported by the more expensive SQL Server Enterprise Edition).
Do not forget to set an upper limit on how much memory SQL Server may use, to leave enough room for other apps and services.
Sample: on a 16GB server you can set a maximum of 14GB for SQL Server to use, leaving 2GB for the OS and other processes.
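A sketch of setting that limit from a query window instead of the SSMS properties dialog (14336 MB = 14GB, as in the sample above):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 14336;
RECONFIGURE;
GO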
With the single-instance approach, combined with the tips in Tools for performance to reduce the workload, you can host more databases on the same single-instance server, leading to significant cost savings.
Tuesday, March 17, 2009
Linked Server collation, performance and temp tables
When I create a linked server to a non-SQL Server platform or to a SQL Server instance that uses another collation, I always use the following linked server options:
- Collation Compatible: False
- Use Remote Collation: False
In all other cases set collation compatibility to True, so indexes on the remote SQL Server tables can be used, leading to a significant performance boost.
I usually set the RPC / RPC Out and Data Access properties to True.
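These settings can be scripted as well; a sketch using the SQLSERVER2 linked server name from the example below (this also covers the 'script your linked server definitions' advice further down):
EXEC sp_addlinkedserver @server = 'SQLSERVER2', @srvproduct = N'SQL Server';
EXEC sp_serveroption 'SQLSERVER2', 'collation compatible', 'false';
EXEC sp_serveroption 'SQLSERVER2', 'use remote collation', 'false';
EXEC sp_serveroption 'SQLSERVER2', 'rpc', 'true';
EXEC sp_serveroption 'SQLSERVER2', 'rpc out', 'true';
EXEC sp_serveroption 'SQLSERVER2', 'data access', 'true';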
When the linked server is a read-only data store I definitely prefer to use an OPENQUERY construction:
select * from openquery
(
SQLSERVER2,'SELECT col1,col2 FROM CodeCatalog.dbo.recommendations'
)
over the four part name equivalent:
SELECT col1,col2 FROM SQLSERVER2.CodeCatalog.dbo.recommendations
Reason?
Far fewer distributed transaction errors (none so far!) generated by the OLE DB provider.
Using OPENQUERY requires a bit more code, but it really, really pays off...
Performance.
Because linked servers are not the fastest objects on the planet, try to access them as little as you can. I usually access them just once, dump the result in a temporary table and use that as input for further processing.
sample:
create table #t (id int)
insert into #T (id)
select * from openquery
(
SQLSERVER2,'SELECT id FROM CodeCatalog.dbo.recommendations'
)
...process results from #T here....
Don't forget to script your linked server definitions and store them somewhere safe.
also, check out my post on msdtc.
Monday, March 16, 2009
SQL Server MSDTC settings
During installation of a new Windows OS I set the following properties for the MSDTC security settings and transaction timeout.
Required when you want to configure and use linked servers
Security settings:
Transaction Timeout:
To configure these properties in Windows Component Services, complete the following actions:
1. From your Microsoft Windows desktop, click Start > Settings > Administrative Tools > Component Services, or start dcomcnfg.exe from a command prompt.
2. Expand the tree view to locate the computer you want to change the configuration for; for example, My Computer.
3. Display the context menu for the computer name, and then click Properties.
4. Click the 'Options' tab, and set the 'Transaction Timeout' to a length of time that suits your environment. The minimum setting I use is 180 seconds.
5. Click the 'MSDTC' tab, and then 'Security Configuration'. Make the changes as shown above and click OK to save.
The MSDTC service must be restarted for changes to take effect. The OS will ask for confirmation.
Friday, March 13, 2009
Enable Instant File Initialization
In SQL Server 2005 (and higher versions), data files can be initialized instantaneously.
Instant file initialization reclaims used disk space without filling that space with zeros. Instead, disk content is overwritten as new data is written to the files.
This can lead to a significant performance boost of ‘create database’, restore and autogrow operations.
For example, when a thread executing a query triggers an autogrow, less disk activity is needed and the query can finish much quicker.
Unfortunately this option is not available for Log files.
How to assign ‘Perform volume maintenance tasks’ to your SQL Server instance:
Start the gpedit.msc utility, browse to ‘user rights assignment’ and add your SQL Server service account to the ‘Perform volume maintenance tasks’ policy.
Restart the SQL Server service (not the windows server), for changes to take effect.
Check the assigned privileges:
Open SQL Server Management Studio under the same account the SQL Server service is running under. Open a query window, set the results output to text and paste/run the following code:
EXEC sp_configure 'xp_cmdshell', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO
EXEC ('xp_cmdshell ''whoami /priv''');
EXEC sp_configure 'xp_cmdshell', 0;
GO
RECONFIGURE WITH OVERRIDE;
GO
Result in SSMS should contain SeManageVolumePrivilege with state Enabled
Thursday, March 12, 2009
SQL Server Windows OS Settings
Favor 64bit OS'es
32 bit = dinosaur
Impact of other applications:
Limit the numbers of services / applications running to an absolute minimum.
Virus scanner: do not scan on-line but schedule it as a task during off-peak hours.
Exclude the SQL Server file extensions (.MDF, .NDF, .LDF, .BAK, .DIF and .TRN), so that these files are not touched by the scanning process.
Pagefile
Create a pagefile of 1.5 - 2 times the amount of internal memory.
Don't put it on a busy database volume.
Memory:
Use the /3GB switch in boot.ini on dedicated 32-bit SQL Server OS'es with 4GB of memory and no other programs like Reporting Services installed. This switch instructs the OS to assign a maximum of 3GB to a process instead of the 32-bit OS limit of 2GB, leaving a maximum of 1GB to the OS and other services.
Favor 64 bit over AWE
Network properties:
Maximize Data Throughput for Network Applications
System properties:
'Application response' setting: 'Background services'.
Change the memory allocation to favor 'Programs'.
Wednesday, March 11, 2009
Five methods converting rows to columns
Post with five methods for converting rows to columns.
From the classic ‘CASE’ construction to the newer PIVOT and ROW_NUMBER() OVER (ORDER BY) functions.
Be careful!
Some methods presented here can have a severe negative impact on performance. Especially the inline function with a cursor should be used with the utmost precaution. Check the query plan and the STATISTICS IO output (see this post) of your queries before putting the code into your production environment!
1. use a CASE statement
SELECT DATEPART(YEAR,orderdate),
SUM(CASE WHEN DATEPART(q,orderdate) = 1 THEN orderamount ELSE 0 END) AS Qtr1,
SUM(CASE WHEN DATEPART(q,orderdate) = 2 THEN orderamount ELSE 0 END) AS Qtr2,
SUM(CASE WHEN DATEPART(q,orderdate) = 3 THEN orderamount ELSE 0 END) AS Qtr3,
SUM(CASE WHEN DATEPART(q,orderdate) = 4 THEN orderamount ELSE 0 END) AS Qtr4,
SUM(orderamount) AS Total
FROM Orders
-- additional where clause goes here...
GROUP BY DATEPART(YEAR,orderdate)
2. use the COALESCE function
DECLARE @AllValues VARCHAR(4000)
SELECT @AllValues = COALESCE(@AllValues + ',', '') + HandlingCode
FROM OrdersDetails
WHERE OrderNumber = @OrderNumber
3. use ROW_NUMBER() OVER (ORDER BY)
SELECT OrderNumber, OrderDate,
--get the special handling codes and show them as columns, max of 3 (agreed by users)
(SELECT HandlingCode
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
FROM OrdersDetails
WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
) HandlingCode
WHERE ROWNUMBER = 1) HandlingCode1,
(SELECT HandlingCode
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
FROM OrdersDetails
WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
) HandlingCode
WHERE ROWNUMBER = 2) HandlingCode2,
(SELECT HandlingCode
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY HandlingCode) AS ROWNUMBER,HandlingCode
FROM OrdersDetails
WHERE OrdersDetails.OrderNumber = Orders.OrderNumber
) HandlingCode
WHERE ROWNUMBER = 3) HandlingCode3
FROM Orders
WHERE OrderNumber = @OrderNumber
4. Use an inline function
SELECT OrderNumber, OrderDate, dbo.fn_GetHandlingCodes(OrderNumber) AS HandlingCodes
FROM Orders
WHERE OrderNumber = @OrderNumber
CREATE FUNCTION dbo.fn_GetHandlingCodes (@OrderNumber INT)
RETURNS VARCHAR(1200)
AS
BEGIN
DECLARE @HandlingCode VARCHAR(20)
DECLARE @ReturnValue VARCHAR(4000)
-- use the fastest cursor method: local fast_forward
DECLARE code_cursor CURSOR LOCAL fast_forward FOR
SELECT HandlingCode
FROM OrdersDetails
WHERE OrderNumber = @OrderNumber
AND NOT HandlingCode IS NULL -- filled
SET @ReturnValue = '' -- set to non null
OPEN code_cursor
FETCH NEXT FROM code_cursor INTO @HandlingCode
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @ReturnValue = @ReturnValue + @HandlingCode + ', '
IF LEN (@ReturnValue) > 1000 BREAK -- avoid overflow
FETCH NEXT FROM code_cursor INTO @HandlingCode
END
CLOSE code_cursor
DEALLOCATE code_cursor
-- remove last delimiter
IF LEN(@ReturnValue) > 1 SET @ReturnValue = SUBSTRING(@ReturnValue,1,LEN(@ReturnValue)-2)
RETURN @ReturnValue
END
5. Use a pivot
USE AdventureWorks
GO
SELECT VendorID, [164] AS Emp1, [198] AS Emp2, [223] AS Emp3, [231] AS Emp4, [233] AS Emp5
FROM
(SELECT PurchaseOrderID, EmployeeID, VendorID
FROM Purchasing.PurchaseOrderHeader) p
PIVOT
(
COUNT (PurchaseOrderID)
FOR EmployeeID IN
( [164], [198], [223], [231], [233] )
) AS pvt
ORDER BY VendorID
GO
Tuesday, March 10, 2009
SQL Server database settings
When creating a new SQL Server database I always use this checklist:
Name: use as few non-alphanumeric characters in the database name as possible (do not use '-', it will mess up your scripts). Use Pascal casing for readability (use OrdersDb, do not use Orders-db or Orders_db)
Set Page verify to Checksum
Pre-allocate data and log files to a reasonable size, do not rely on autogrowth.
Set a reasonable Autogrow size in MB (not a percentage) for data and log files
Enabling auto create/update statistics is fine for most situations.
Set the right recovery model.
If you're using the 'full' or 'bulk-logged' recovery model, schedule regular log backups (see this post)
Try to create at least two -equally sized!- device files for the data devices.
Not only does this reduce locking issues, but when a database is regularly under a heavy load, files can easily be moved to additional physical volumes (see this post)
Create one file for the Log
Use fn_virtualfilestats() to measure I/O load per device file (see this post )
Put the .MDF, .NDF and .LDF files in their own folder with the same name as the database. Don't store all your db's in one folder
Reorganize indexes/statistics on a regular basis (see this post )
Run DBCC CHECKDB on a regular basis (see this post )
Do not schedule any shrink operations
Configure instant file initialization (see this post )
Check out my posts on auto index management and tools for performance, to set up more databases on the same instance and save costs.
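Putting most of this checklist into code, a sketch with hypothetical names, paths and sizes (adjust to your own environment):
CREATE DATABASE OrdersDb
ON PRIMARY
( NAME = OrdersDb_Data1, FILENAME = 'D:\OrdersDb\OrdersDb_Data1.mdf', SIZE = 2048MB, FILEGROWTH = 256MB ),
( NAME = OrdersDb_Data2, FILENAME = 'D:\OrdersDb\OrdersDb_Data2.ndf', SIZE = 2048MB, FILEGROWTH = 256MB )
LOG ON
( NAME = OrdersDb_Log, FILENAME = 'E:\OrdersDb\OrdersDb_Log.ldf', SIZE = 1024MB, FILEGROWTH = 256MB );
GO
ALTER DATABASE OrdersDb SET PAGE_VERIFY CHECKSUM;
ALTER DATABASE OrdersDb SET RECOVERY FULL;
GO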
Saturday, March 07, 2009
SQL Server 2000 post upgrade steps
Small post on additional steps to perform when upgrading an existing SQL Server 2000 database to SQL Server 2005 / 2008 (after you backed up, ran Upgrade Advisor and tested everything twice, right!).
After a successful restore or attach database operation, execute the following commands in the new environment:
USE your_db_name
GO
DBCC UPDATEUSAGE ('your_db_name')
DBCC CHECKDB ('your_db_name') WITH DATA_PURITY
EXEC sp_updatestats
And:
- Set database compatibility level to ‘SQL Server 2005’ or ‘SQL Server 2008’
- Set Page verify option to Checksum
- Set 'Auto Create' and 'Auto Update' statistics to TRUE
- Reorganize all indexes
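Most of these settings can be applied with a few statements; a sketch, assuming SQL Server 2005 as the target (use compatibility level 100 on 2008):
EXEC sp_dbcmptlevel 'your_db_name', 90;
ALTER DATABASE your_db_name SET PAGE_VERIFY CHECKSUM;
ALTER DATABASE your_db_name SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE your_db_name SET AUTO_UPDATE_STATISTICS ON;
GO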
Running ‘DBCC UPDATEUSAGE’ and ‘DBCC WITH DATA_PURITY’ once is enough.
The DBCC CHECKDB of the newer (2005/2008) SQL Server versions has it all combined in one statement.
Upgrade to the newer varchar(max), nvarchar(max), and varbinary(max) data types instead of the text, ntext, and image data types.
You can check by executing this T-SQL on your upgraded database on SQL Server 2005 / 2008, to see which tables need some work:
USE your_db_name -- your db name here
GO
SELECT userobjects.name AS tablename, columns.name AS columnname, types.name AS columntype
FROM sys.all_columns columns
INNER JOIN sys.all_objects userobjects ON columns.object_id = userobjects.object_id
INNER JOIN sys.types types ON columns.system_type_id = types.system_type_id
WHERE userobjects.[type] = 'U'
and types.name IN ('image','text','ntext')
Friday, March 06, 2009
SQL Server disk configuration tips
The most important metric for SQL Server is throughput: how many MB/s can the disks handle, and what load is placed on the disks?
If you’re buying a standard package your vendor should know this.
Surprisingly however in the hardware proposal (top end server(s) / high rpm disks right?) this figure isn’t mentioned anywhere in most situations.
What you can do is ask your vendor if they could run the fn_virtualfilestats() function (see my fn_virtualfilestats post) on a similar production configuration at another customer (better: talk to them yourself!). This gives you insight into the I/O pattern (more read or more write activity) and the load on TempDB, to determine whether it's a good candidate -which it usually is- to store on its own disk spindles.
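If you want to run it yourself, a minimal sketch (SQL Server 2005 syntax; NULL, NULL returns the counters for every file of every database):
SELECT DB_NAME(DbId) AS database_name, FileId, NumberReads, NumberWrites, BytesRead, BytesWritten, IoStallMS
FROM sys.fn_virtualfilestats(NULL, NULL)
ORDER BY BytesRead + BytesWritten DESC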
Run the Microsoft SQLIO tool to determine a baseline on MB/s and latency per disk partition.
Store the results for later reference; the Windows Performance Monitor can also measure throughput, which gives you the possibility to compare the numbers later and check whether there is a disk throughput bottleneck in your production system.
Disk configuration tips:
Use as many disk spindles as you can, and try spreading data / log and tempdb on different raid arrays.
Stripe your RAID config at 256KB instead of 64KB / 128KB. This will increase the performance of SQL Server read-aheads.
If you use Windows 2000 / 2003, align the disk partitions!
(see Microsoft Knowledge Base article KB300415 on how to use Diskpart).
At the Diskpart command prompt, type: Create Partition Primary Align=X, where X is either 32 or 64, depending on the recommendation from your storage vendor. If your storage vendor does not have any specific recommendations, use 64.
This step is not needed in Windows 2008.
Full (not quick) format with a 64KB allocation unit size for volumes holding data, log and backup files.
Full (not quick) format with a 4KB allocation unit size for OS and program/swap volumes.
Data: RAID 5, or RAID 10 when there's more write than read activity (check with fn_virtualfilestats).
Log: RAID 1, or RAID 10 under heavy load. Separate backup disk: RAID 5.
Put TempDB on its own physical disks (raid1, raid10 under heavy load).
In a non-clustered environment, I normally use the local server storage for TempDB.
Use NTFS volumes
Do not use more than 80% of the volume
Defragment the disks on a regular basis
How to configure SQL Server TempDB
As a general guideline, create one data file for each CPU on the server.
Note that a dual-core CPU counts as two CPUs; logical processors (hyperthreading) do not count.
Only create one Log file
Do not use autoshrink on TempDB
If you're using SQL Server 2008, set Page verify to Checksum.
Investigate whether you can switch off the 'Auto create' and 'Auto update' statistics properties; it can speed up the creation of objects in TempDB. Be careful however: measure the impact these changes can have on production! Change and measure only one parameter at a time.
After installation, move the TempDB database to its own disks, separate from those used by the user databases:
ALTER DATABASE TempDB
MODIFY FILE (NAME=tempdev, FILENAME='T:\TempDB\tempdb.mdf'); -- path is an example, use your own dedicated TempDB volume
GO
ALTER DATABASE TempDB
MODIFY FILE (NAME=templog, FILENAME='T:\TempDB\templog.ldf'); -- example path
GO
Restart SQL Server service (not the Windows server), to make changes permanent
Pre-allocate data and log device sizes (create files of equal size), do not rely on small autogrowth steps.
If you use autogrowth, set it to a reasonable size in MB, not a percentage.
TempDB file size / FILEGROWTH increment:
0 to 100 MB: grow by 10 MB
100 to 200 MB: grow by 20 MB
200 to 1000 MB: grow by 50 to 75 MB
1 GB or more: grow by 150 to 250 MB
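Following the one-file-per-CPU guideline at the top of this post, a sketch of adding a second, equally sized data file with a fixed growth increment (hypothetical path and sizes):
ALTER DATABASE TempDB
ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf', SIZE = 200MB, FILEGROWTH = 50MB);
GO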
Measure TempDB usage over time with fn_virtualfilestats
New releases of your –vendor- software could have a different impact on TempDB load.
Backup the mssqlsystemresource database
Since the introduction of SQL Server 2005, Microsoft implemented the mssqlsystemresource database. This is a read-only database that contains all the system objects that are included with SQL Server. It is stored as an mssqlsystemresource.mdf / .ldf file combination, installed in the same location as the master database, and it should stay there!
It is a SQL Server requirement that both the master and resource databases are in the same location. Due to the low IO overhead it's in my opinion not worth the trouble moving both databases. They should however be stored on a redundant disk; too risky to lose them!
Its content is specific to a version / patch level and because it cannot be included in a regular SQL Server backup, you should always make a manual copy of both the .mdf and .ldf files:
- After a SQL Server instance is installed
- Before and after a service pack or hotfix is applied
Copying can be done while the SQL Server instance is on-line
Thursday, March 05, 2009
Do you have a datetime T-SQL 'BETWEEN AND' bug??
Create a table with a datetime column
create table #t (col1 datetime)
Add a couple of rows with date and time values
insert into #t values ('1/mar/2009 08:00')
insert into #t values ('1/mar/2009 09:00')
insert into #t values ('1/mar/2009 10:00')
get the rows
select * from #t
where col1 between '1/mar/2009' and '1/mar/2009'
and guess what?
(0 row(s) affected)
Why? You're trying to select a date without a time, and that's treated as 00:00 midnight;
these rows do not exist in the table.
Three ways to solve:
1.
append a time value to the sql string:
select * from #t
where col1 between '1/mar/2009 00:00:00' and '1/mar/2009 23:59:59'
(3 row(s) affected)
No go: too much work -especially with variables- and it could lead to (more) bugs.
2.
select * from #t
where CONVERT(datetime,CONVERT(varchar(20), col1, 112),112) between '1/mar/2009' and '1/mar/2009'
(3 row(s) affected)
This uses a double convert() T-SQL function to get rid of the time part and switch back to the datetime format.
No go: if you have an index defined on the column, it will not be used by the SQL Server query optimizer because of the convert function. Bad performance!
3.
my absolute thumbs up favorite:
select * from #t
where col1 >= '1/mar/2009' and col1 < dateadd(d,1,'1/mar/2009')
(3 row(s) affected)
Go: this construction has served me well over the years. Date/time values are handled correctly,
indexes are used and there's no hassle with convert(), leading to good performance and maintainable T-SQL code!
have fun
Tuesday, March 03, 2009
SQL Server tools for performance
Since the introduction of SQL Server 2005, Microsoft implemented Dynamic Management Views (DMVs) in the database engine that enable us to peek inside it and retrieve the internal performance counters.
In this post I present a couple of queries, gathered and used over time, that have helped me pin down problems when dealing with performance related issues.
If you've inherited a system from an older / upgraded SQL Server version, I recommend running the 'unused indexes' query mentioned later on. It has saved my life on more than one occasion.
In my experience, as tables grow bigger, indexes that were once created for specific needs become obsolete over time, but no one bothers to remove them, leaving the system with a lot of unnecessary I/O overhead.
Explore sys.dm_exec_query_stats: it contains more fields than used here (last_physical_reads for instance), and also check out the other DMVs for additional info you might be interested in.
Before starting any investigation, make sure that all indexes are defragmented and all statistics are up to date (see my post on index reorganization).
The queries described here, combined with my previous posts on fn_virtualfilestats, auto index management, STATISTICS IO, multiple files per database and index reorganization, should supply you with enough info to tackle most I/O related performance issues.
Beware: DMVs are reset when the SQL Server instance is restarted! So the numbers are only meaningful when the database engine has been running for a considerable amount of time.
First determine how long the instance has been running:
SELECT crdate AS Instance_Started_On
FROM sysdatabases
WHERE name = 'tempdb'
Queries with highest IO load:
SELECT TOP 100
total_logical_reads, total_logical_writes, execution_count,
total_logical_reads+total_logical_writes AS total_IO,
st.TEXT AS query_text,
st.dbid AS database_id,
DB_NAME(st.dbid) AS database_name
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
WHERE total_logical_reads+total_logical_writes > 0
ORDER BY (total_logical_reads+total_logical_writes) DESC -- most i/o intensive
Queries using the most CPU:
SELECT TOP 100
highest_cpu_queries.total_worker_time,
sql_text.dbid,
DB_NAME(sql_text.dbid) AS database_name,
sql_text.[TEXT]
FROM
(SELECT TOP 100 PERCENT
qs.plan_handle,
qs.total_worker_time
FROM
sys.dm_exec_query_stats qs
ORDER BY qs.total_worker_time DESC) AS highest_cpu_queries
cross apply sys.dm_exec_sql_text(plan_handle) AS sql_text
ORDER BY highest_cpu_queries.total_worker_time DESC
Get the unused indexes:
USE your_db_name -- switch to your own db here before executing the DMV query below
go
SELECT TOP 100
OBJECT_NAME(i.OBJECT_ID) AS tablename,
i.name AS indexname,
s.user_updates,
s.user_seeks + s.user_scans + s.user_lookups AS totalreads,
s.user_updates AS totalwrites
FROM sys.indexes i
LEFT join sys.dm_db_index_usage_stats s
ON s.OBJECT_ID = i.OBJECT_ID and
i.index_id = s.index_id
WHERE OBJECTPROPERTY(i.OBJECT_ID, 'IsIndexable') = 1 and
(
-- index_usage_stats has no reference to this index (not being used)
s.index_id IS null or
-- index is being updated, but not used by seeks/scans/lookups
(s.user_updates > 0 and s.user_seeks = 0 and s.user_scans = 0 and s.user_lookups = 0)
)
ORDER BY s.user_updates DESC
go
Sunday, March 01, 2009
SET STATISTICS IO ON!
The graphical query plans introduced in the latest SQL Server versions give you great insight into how your query is executing. Downside: sometimes they are difficult to read and they take time to fully understand.
Enter: SET STATISTICS IO ON
It gives you an overview of how many I/Os your query executes on each accessed table, and by working together with the query optimizer you can minimize these numbers.
By doing so you optimize and increase the workload a SQL Server instance can handle.
For example: what is the impact of replacing a SELECT ... IN sub-query with an IF EXISTS () construction?
This feature is available since SQL Server 6.x, so you can still tune your old instances.
In my opinion this option should always be used with any new or changed query against a representative amount of data: a copy from, or simulation of, a production environment to check if the results are in-line with expectations.
Here's a simple query and its result when the option is enabled; the output appears in the Messages tab of the results pane:
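A minimal sketch, assuming a hypothetical Orders table (your numbers will of course differ):
SET STATISTICS IO ON;
SELECT OrderNumber, OrderDate FROM Orders WHERE OrderNumber = 1001;
SET STATISTICS IO OFF;
-- the Messages tab then shows something like:
-- Table 'Orders'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, ...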
If you have a hosted SQL Server environment with tight security, it could well be that generating graphical query plans is prohibited, while this option can still be granted, giving you lots of meaningful information.
Before diving into it, check that all indexes are reorganized and the statistics are up to date.
(see my post index defragmentation)
Also check my post auto index management and let SQL Server assist you with your indexes!
Oh yeah, switch it to off again when you are done tuning your queries, don’t include it in your stored procedures.