SQL Trace, even though it is quite an old feature, is still loved by a lot of database administrators because it is so simple to create. Rather than hand-writing a SQL script that calls the sp_trace_xxx system stored procedures, you can create a trace by using SQL Server Profiler to export a trace definition script, as in the steps below:
1. Open SQL Server Profiler;
2. Create a new trace;
3. Select a trace template, pick the required events, columns, and set filters as you wish;
4. Start the trace;
5. Stop the trace;
6. Click "File" menu -> "Export" -> "Script Trace Definition" -> "For SQL Server 2005 - 2014..."
7. Save the exported sql file.
The exported trace definition SQL script will look something like the example below:
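For illustration, here is a minimal sketch of what the Profiler-exported definition typically contains; the exact sp_trace_setevent and sp_trace_setfilter calls depend on the events, columns, and filters you picked (event 10 = RPC:Completed, column 1 = TextData, and column 10 = ApplicationName are used here only as examples):
-- Create a trace (queue)
declare @rc int
declare @TraceID int
declare @maxfilesize bigint
set @maxfilesize = 5
-- Replace InsertFileNameHere with a path-prefixed filename (see step 8)
exec @rc = sp_trace_create @TraceID output, 0, N'InsertFileNameHere', @maxfilesize, NULL
if (@rc != 0) goto error
-- Set the events and columns (one call per event/column pair)
declare @on bit
set @on = 1
exec sp_trace_setevent @TraceID, 10, 1, @on   -- RPC:Completed, TextData
-- Set the filters (example: exclude SQL Server Profiler itself by ApplicationName)
exec sp_trace_setfilter @TraceID, 10, 0, 7, N'SQL Server Profiler%'
-- Start the trace
exec sp_trace_setstatus @TraceID, 1
-- Display the trace id for future reference
select TraceID = @TraceID
goto finish
error:
select ErrorCode = @rc
finish:
go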
8. Replace the text "InsertFileNameHere" with a filename prefixed by a path, e.g., c:\MyFolder\MyTrace. The .trc extension will be appended to the filename automatically;
9. If you want rollover files for the .trc trace output, change the second parameter value of sp_trace_create from 0 to 2 (TRACE_FILE_ROLLOVER). The default value of max_file_size is 5 MB;
10. Note that trace definitions are automatically removed when SQL Server restarts, so if you want the same trace to survive a server restart, you can create a stored procedure in the master database that wraps the above trace definition script, then execute sp_procoption to mark that stored procedure for automatic execution every time SQL Server starts (see the sketch after step 11). You can check which stored procedure(s) in the master database will be auto-executed by running the following query with a sysadmin login:
USE master
SELECT [name]
FROM sysobjects
WHERE type = 'P'
AND OBJECTPROPERTY(id, 'ExecIsStartUp') = 1;
11. Execute the trace definition script to create and start the trace.
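Below is a minimal sketch of the auto-start setup described in step 10, assuming the trace definition script has been wrapped in a stored procedure named usp_StartAuditTrace (a hypothetical name) in the master database; sp_procoption also turns on the 'scan for startup procs' server option for you:
USE master
GO
EXEC sp_procoption @ProcName = N'usp_StartAuditTrace',
@OptionName = 'startup',
@OptionValue = 'on';
GO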
After the trace is created, you can check the properties of a trace, including the path of the output .trc file, whether it uses rollover files, and its events, columns, and filters, with the queries below:
-- view created traces
SELECT * FROM sys.traces
-- view trace events & columns
SELECT
e.name AS Event_Name,
c.name AS Column_Name
FROM fn_trace_geteventinfo(3) ei -- replace the parameter value with the Trace ID you want to check
JOIN sys.trace_events e ON ei.eventid = e.trace_event_id
JOIN sys.trace_columns c ON ei.columnid = c.trace_column_id
-- view trace filters
SELECT
columnid,
c.name AS Column_Name,
logical_operator,
comparison_operator,
value
FROM fn_trace_getfilterinfo(3) fi -- replace the parameter value with the Trace ID you want to check
JOIN sys.trace_columns c ON fi.columnid = c.trace_column_id
To see the trace output, you can open the .trc files with SQL Profiler, or you can use fn_trace_gettable, which returns the content of one or more trace files in tabular form, e.g.
-- view trace output as table
SELECT *
FROM fn_trace_gettable('H:\SYSADMIN_AUDIT\sysadmin_audit_168.trc', default) -- 1st param: initial file; 2nd param: number of rollover files to read (default = read until the last file)
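When a trace is no longer needed, it can be stopped and its definition removed with sp_trace_setstatus; a quick sketch (replace 3 with the Trace ID you want to control):
-- stop the trace
EXEC sp_trace_setstatus @traceid = 3, @status = 0;
-- close the trace and delete its definition from the server
EXEC sp_trace_setstatus @traceid = 3, @status = 2;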
2016-10-11
2016-09-12
View Execution Plan for Multi-Statement TVF, and Why Inline TVF Is Better
When you execute a Multi-statement table valued function (TVF), you will see an execution plan that looks something like this:
The optimizer always estimates that a Multi-statement TVF returns only a single row, which is usually not true. To see the detailed plan inside the Multi-statement TVF, you can run the following SQL statement to retrieve it from the plan cache:
SELECT deqp.query_plan, dest.text,
SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
((CASE deqs.statement_end_offset WHEN -1 THEN DATALENGTH(dest.text)
ELSE deqs.statement_end_offset END - deqs.statement_start_offset) / 2) + 1) AS actualstatement, deqs.creation_time
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE deqp.objectid = OBJECT_ID('YourFunctionName');
The result will look like this:
A Multi-statement TVF can be bad for performance because it is treated like a table, except that there are no statistics available for SQL Server to base a good execution plan on, so it estimates the function as returning a very small number of rows. If the function actually returns a larger number of rows, the generated plan can be far from optimal.
Let's compare the showplan difference between an Inline TVF and a Multi-statement TVF. Below are two TVFs, one Inline and the other Multi-statement:
CREATE FUNCTION udfInline
(
@id int
)
RETURNS TABLE
AS
RETURN
SELECT * FROM Account
GO
CREATE FUNCTION udf_Multi
(
@id int
)
RETURNS
@result TABLE
(
AccountId int, UtcOffset int
)
AS
BEGIN
INSERT @result SELECT AccountId, UtcOffset FROM Account
RETURN
END
GO
The Inline TVF execution plan looks like this:
For the Inline TVF, the estimated number of rows is quite accurate. The actual underlying tables get brought into the main query, and a better execution plan can be generated based on proper statistics. You'll notice that in this case the execution plan will NOT mention the function at all, as it has basically been merged into the main query.
The Multi-statement TVF execution plan looks like this:
For the Multi-statement TVF, SQL Server estimated only one row. There's a good reference on it on MSDN: "But if you use multi-statement TVF, it's treated as just like another table. Because there is no statistics available, SQL Server has to make some assumptions and in general provide low estimate. If your TVF returns only a few rows, it will be fine. But if you intend to populate the TVF with thousands of rows and if this TVF is joined with other tables, inefficient plan can result from low cardinality estimate."
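To see the difference yourself, you can join each sample TVF back to the Account table and compare the estimated versus actual row counts in the actual execution plan; a sketch, assuming the same Account table and AccountId column used in the sample functions above:
-- Inline TVF: expanded into the outer query, so estimates come from Account's statistics
SELECT a.AccountId
FROM dbo.udfInline(1) AS f
JOIN dbo.Account AS a ON a.AccountId = f.AccountId;
-- Multi-statement TVF: appears as a Table Valued Function operator with an estimate of 1 row
SELECT a.AccountId
FROM dbo.udf_Multi(1) AS f
JOIN dbo.Account AS a ON a.AccountId = f.AccountId;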
2016-08-08
How to fix Forwarded Records in a Heap
A heap is a table without a clustered index. Forwarded records are records in a heap that have grown too large for the page they currently reside on. These types of records can only be found in heaps, because tables with a clustered index keep the data sorted based on the clustered index. If a heap row gets updated and grows too big to stay on the same page, SQL Server moves it to a new page slot and leaves a pointer at its original spot. Looking up such a row therefore becomes more expensive afterwards: a lookup of the data row in a heap is no longer a direct access using the page and slot address; instead of getting the data row, the server might have to follow a forward pointer first.
You can temporarily fix forwarded records in a heap by rebuilding the heap. Note that heap rebuilds were added in SQL Server 2008; you can't rebuild heaps in SQL Server 2005 or earlier. Creating a clustered index on the table, however, avoids forwarded records permanently (see the sketch after the queries below).
The SQL script below lists all heaps in the current database:
SELECT SCH.name + '.' + TBL.name AS TableName
FROM sys.tables AS TBL
INNER JOIN sys.schemas AS SCH
ON TBL.schema_id = SCH.schema_id
INNER JOIN sys.indexes AS IDX
ON TBL.object_id = IDX.object_id
AND IDX.type = 0 -- = Heap
ORDER BY TableName;
You can then check the forwarded_record_count table by table, using the DMF sys.dm_db_index_physical_stats:
SELECT OBJECT_NAME(object_id) AS table_name, forwarded_record_count, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('<tableName>'), DEFAULT, DEFAULT, 'DETAILED');
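As a sketch of the two fixes mentioned above (the table and column names are placeholders):
-- temporary fix: rebuild the heap (SQL Server 2008 and later)
ALTER TABLE dbo.MyHeapTable REBUILD;
-- permanent fix: create a clustered index so rows are no longer forwarded
CREATE CLUSTERED INDEX CIX_MyHeapTable ON dbo.MyHeapTable (Id);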
2016-07-11
Checking Buffer Pool Usage by Table/Index
In SQL Server, table data is stored in pages with a fixed size of 8 KB. Whenever a page is needed (for a read or a write), it is first read from disk and brought into a memory area called the buffer pool. The SQL script below can be used to check buffer pool usage by table and index in the current database; it calculates the usage by counting the rows in sys.dm_os_buffer_descriptors. Each cached data page has one buffer descriptor, and buffer descriptors uniquely identify each data page that is currently cached in an instance of SQL Server.
SELECT COUNT(*) AS cached_pages_count, COUNT(*) * 8 / 1024 AS cachedUsedMB,
obj.[object_id], obj.name AS tableName, idx.index_id, idx.name AS indexName
FROM sys.dm_os_buffer_descriptors AS bd
JOIN (
SELECT p.[object_id], OBJECT_NAME(p.[object_id]) AS name, p.index_id, au.allocation_unit_id
FROM sys.allocation_units AS au JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.[type] = 1 OR au.[type] = 3)
UNION ALL
SELECT p.[object_id], OBJECT_NAME(p.[object_id]) AS name, p.index_id, au.allocation_unit_id
FROM sys.allocation_units AS au JOIN sys.partitions AS p ON au.container_id = p.[partition_id] AND au.[type] = 2
) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id
JOIN sys.indexes AS idx ON idx.[object_id] = obj.[object_id] AND idx.index_id = obj.index_id
WHERE database_id = DB_ID()
GROUP BY obj.[object_id], obj.name, idx.index_id, idx.name
ORDER BY cached_pages_count DESC;
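In the same way, overall buffer pool usage per database on the instance can be checked with a query like this sketch:
SELECT CASE database_id WHEN 32767 THEN 'ResourceDb' ELSE DB_NAME(database_id) END AS database_name,
COUNT(*) AS cached_pages_count, COUNT(*) * 8 / 1024 AS cachedUsedMB
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cached_pages_count DESC;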
2016-06-06
SQL Server Buffer Pool Extension
Introduced in SQL Server 2014, the buffer pool extension (BPE) provides seamless integration of a nonvolatile random access memory (solid-state drive, SSD) extension to the database engine buffer pool to significantly improve I/O throughput.
Data and index pages are read from disk into the buffer pool, and modified pages (also known as dirty pages) are written back to disk. Memory pressure on the server and database checkpoints cause hot (active) dirty pages in the buffer cache to be evicted from the cache, written to mechanical disks, and then read back into the cache. These I/O operations are typically small random reads and writes on the order of 4 to 16 KB of data. Small random I/O patterns incur frequent seeks, competing for the mechanical disk arm, increasing I/O latency, and reducing aggregate I/O throughput of the system. The typical approach to resolving these I/O bottlenecks is to add more DRAM or, alternatively, to add high-performance SAS spindles.
The following list describes the benefits of the BPE feature.
- Increased random I/O throughput
- Reduced I/O latency
- Increased transaction throughput
- Improved read performance with a larger hybrid buffer pool
- A caching architecture that can take advantage of present and future low-cost memory drives
The following illustration provides a high-level architectural overview of the buffer pool relative to other SQL Server components.
Below steps demonstrate how to enable BPE:
1. Check the currently configured value of the MAX SERVER MEMORY.
USE master
GO
EXEC sp_configure 'show advanced options', 1
RECONFIGURE WITH OVERRIDE
GO
EXEC sp_configure 'max server memory (MB)'
GO
2. Configure the BPE buffer cache file. The minimum size is the value of MAX SERVER MEMORY. You should put this file on an SSD drive, e.g.
USE master
GO
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
(FILENAME = '<SSD>:\SSDCACHE\SqlServerCache.BPE', SIZE = 32 GB);
GO
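3. Optionally, verify the configuration through the sys.dm_os_buffer_pool_extension_configuration DMV; a sketch:
SELECT path, state_description, current_size_in_kb / 1024 AS current_size_in_MB
FROM sys.dm_os_buffer_pool_extension_configuration;
-- to turn the extension off again:
-- ALTER SERVER CONFIGURATION SET BUFFER POOL EXTENSION OFF;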
Best Practices:
Microsoft says that the BPE can be up to 32 times the MAX SERVER MEMORY value with SQL Server Enterprise edition, and up to 4 times with Standard edition, but it recommends a ratio of 1:16 or less between the amount of memory and the size of the extension, and cautions that a ratio of 1:4 to 1:8 may be optimal as a starting point.
2016-05-03
Restore Log Backup Files in a Directory
Most production databases run in the full recovery model, so there must be a transaction log backup job scheduled to run periodically. To reduce possible data loss, the log should be backed up as frequently as possible, say every minute. More log backups mean more files. Rather than manually restoring log backups through the GUI one at a time, you need a script-based method to restore all of the log files in a directory automatically. The script below can be used to generate RESTORE LOG statements by reading the directory in which the transaction log backup files reside. The backup time of each log file must be expressed in its file name, e.g. with a YYYYMMDDhhmmss suffix, so that the files sort in restore order.
USE Master;
GO
SET NOCOUNT ON
-- 1 - Variable declaration
DECLARE @dbName sysname
DECLARE @backupPath NVARCHAR(500)
DECLARE @cmd NVARCHAR(500)
DECLARE @fileList TABLE (backupFile NVARCHAR(255))
DECLARE @lastFullBackup NVARCHAR(500)
DECLARE @lastDiffBackup NVARCHAR(500)
DECLARE @backupFile NVARCHAR(500)
-- 2 - Initialize variables
SET @dbName = '<UserDB>'
SET @backupPath = 'D:\LogBackup\UserDB\'
-- 3 - get list of files
SET @cmd = 'DIR /b ' + @backupPath
INSERT INTO @fileList(backupFile)
EXEC master.sys.xp_cmdshell @cmd
SELECT 'RESTORE LOG ' + @dbName + ' FROM DISK = N''' + @backupPath + backupFile + ''' WITH NORECOVERY'
FROM @fileList WHERE backupFile IS NOT NULL ORDER BY backupFile
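The generated statements will look something like the sketch below (file names are placeholders), assuming the database has already been restored from its full (and any differential) backups WITH NORECOVERY; a final RECOVERY step then brings it online:
RESTORE LOG <UserDB> FROM DISK = N'D:\LogBackup\UserDB\UserDB_log_20160419093000.trn' WITH NORECOVERY
RESTORE LOG <UserDB> FROM DISK = N'D:\LogBackup\UserDB\UserDB_log_20160419093100.trn' WITH NORECOVERY
-- ... one statement per log backup file, in file-name order ...
RESTORE DATABASE <UserDB> WITH RECOVERY;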
2016-04-19
Pause Mirroring to Speed up Large Data Update
Database mirroring transports the changes in the production database to a mirror database, either synchronously or asynchronously. Asynchronous database mirroring is supported only by SQL Server Enterprise edition. During a large data update, the mirror commit overhead of a synchronous mirroring session becomes significant. To speed up your data update process, you can change the operating mode to asynchronous if your SQL Server is Enterprise edition. If it's Standard edition, you can pause the mirroring session instead; during bottlenecks, pausing can be useful to improve performance on the principal server. Below is a demonstration.
1. Pause the mirroring session (this can also be done with T-SQL; see the sketch after these steps).
2. "Database Mirroring Monitor" also shows the mirroring session as Suspended.
3. Run some loading on the exposed principal database, to fill up the Unsent Log Queue:
SET NOCOUNT ON;
WHILE 1 = 1
BEGIN
INSERT Table_1 VALUES (99, 'Test')
DELETE Table_1 WHERE pk = 99
END
4. In this example, I accumulate 14 GB of unsent transaction log.
5. Now “Resume” the mirroring session.
6. As there’s quite a lot of unsent log in the queue, it takes time to synchronize.
7. Even while the synchronous (high-safety mode) mirroring session is synchronizing, you can still commit new transactions WITHOUT any delay!
You should resume a paused session as soon as possible, because as long
as a database mirroring session remains paused, the transaction log
can NOT be truncated.
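For reference, steps 1 and 5 can also be done with T-SQL on the principal server instead of the GUI; a sketch, with the database name as a placeholder:
-- pause the mirroring session
ALTER DATABASE <UserDB> SET PARTNER SUSPEND;
-- resume the mirroring session
ALTER DATABASE <UserDB> SET PARTNER RESUME;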