2017-03-14

Storage Best Practice - 64KB NTFS Allocation Unit Size

According to the SQL Server Best Practices Article, it is recommended that you use a 64-KB NTFS allocation unit size for data, logs, and tempdb. Below is my experience of how this best practice improved the throughput of my production system by reducing transaction log disk write latency.

The transaction log file was originally placed in a disk partition with the default 4-KB allocation unit size, and disk write latency often climbed above 10 ms.
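You can gauge this yourself from sys.dm_io_virtual_file_stats; below is a minimal sketch that reports the average write latency of each transaction log file (the counters are cumulative since the last restart, so for a precise before/after comparison sample the DMV twice and compare the deltas):

-- Average write latency per transaction log file (cumulative since last restart)
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name AS logical_name,
       mf.physical_name,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
WHERE mf.type_desc = 'LOG'
ORDER BY avg_write_latency_ms DESC;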

Create a new disk partition with a 64-KB NTFS allocation unit size: Administrative Tools > Computer Management > Storage > Disk Management > Format > NTFS, 64K.

Double check the new partition:
  1. Open Administrator command prompt
  2. fsutil fsinfo ntfsinfo [your drive]
  3. Check "Bytes Per Cluster" value is 65536

Move the transaction log file to the new partition:
/* Ref. https://technet.microsoft.com/en-us/library/gg452698.aspx */
-- Check file size
USE master
GO
SELECT DB_NAME(database_id) AS DatabaseName, Name AS Logical_Name, Physical_Name, (size*8)/1024 SizeMB
FROM sys.master_files WHERE DB_NAME(database_id) = 'YourDB'
GO
-- Shrink the transaction log file to 1MB so the file copy is fast
USE YourDB
GO
DBCC SHRINKFILE(YourDB_log, 1);
GO
USE master
GO
-- Check file size
SELECT DB_NAME(database_id) AS DatabaseName, Name AS Logical_Name, Physical_Name, (size*8)/1024 SizeMB
FROM sys.master_files WHERE DB_NAME(database_id) = 'YourDB'
GO

-- Offline
ALTER DATABASE YourDB SET OFFLINE WITH ROLLBACK IMMEDIATE;
GO

-- *** COPY THE LOG FILE TO THE NEW 64-KB PARTITION USING WINDOWS FILE EXPLORER YOURSELF ***

-- Move file
USE master
GO
ALTER DATABASE YourDB MODIFY FILE (NAME = YourDB_log, FILENAME = '<*** NEW PATH ***>\YourDB_log.ldf')
GO
-- Check file location changed
USE master
GO
SELECT DB_NAME(database_id) AS DatabaseName, Name AS Logical_Name, Physical_Name, (size*8)/1024 SizeMB
FROM sys.master_files WHERE DB_NAME(database_id) = 'YourDB'
GO
-- Online
USE master
GO
ALTER DATABASE YourDB SET ONLINE
GO
-- Check file again
USE master
GO
SELECT DB_NAME(database_id) AS DatabaseName, Name AS Logical_Name, Physical_Name, (size*8)/1024 SizeMB
FROM sys.master_files WHERE DB_NAME(database_id) = 'YourDB'
GO
-- Grow the transaction log file back to its original size, e.g. 10 GB
USE master
GO
ALTER DATABASE [YourDB]
MODIFY FILE ( NAME = N'YourDB_log', SIZE = 10000MB )
GO


Now the transaction log disk write latency stays under 2 ms.

2017-02-02

SQL Server 2016 Always Encrypted - Part 1

Always Encrypted is a client-side encryption technology in which data is encrypted and decrypted inside the client driver, so it stays encrypted in the database and on the wire and can only be read in plain text by an approved application. Unlike Transparent Data Encryption (TDE), which encrypts the data on disk but allows it to be read by any application that queries it, Always Encrypted requires your client application to use an Always Encrypted-enabled driver to communicate with the database. At this time, the only Always Encrypted-enabled driver is the .NET Framework Data Provider for SQL Server (a.k.a. ADO.NET), which requires installation of .NET Framework version 4.6 on the client computer.
Below is an example of how to create a table with Always Encrypted columns:
-- 1. Creating a column master key
USE TestDB
GO
CREATE COLUMN MASTER KEY TestColumnMasterKey WITH (
    KEY_STORE_PROVIDER_NAME = N'MSSQL_CERTIFICATE_STORE',
    KEY_PATH = N'CurrentUser/My/DE3A770F25EBD6071305B77FB198D1AE434E6014'
);
GO
-- 2. Creating a column encryption key
CREATE COLUMN ENCRYPTION KEY TestColumnEncryptKey WITH VALUES (
    COLUMN_MASTER_KEY = [TestColumnMasterKey],
    ALGORITHM = 'RSA_OAEP',
    ENCRYPTED_VALUE = 0x016E008000630075007200720065006E00740075007300650072002F006D0079002F0064006500330061003700370030006600320035006500620064003600300037003100330030003500620037003700660062003100390038006400310061006500340033003400650036003000310034004D74119935C902E59F57A96C3E6F770826D247135FFFA759B5B013DF4DAF7CFB760A5864DD8381B91924D067BE4F574B50DE7F0D53F278E1C003B5D192865B808C1590224F4A4BB463255101C36D3089F46609B376D7B00FA9F9CEAF715398EECAB790AC6EC8BD18C17B3EB992CAE08FEA6A2F5A2BDDA4F5A700744E45861F993A3C488127E5897B30892DD2734DD5D84F096882A393D5877C5A20E392888FE0357F46DB578AEB4C677CFFCE2281276C4D12F3E5AC3BCCC09B78BB0E522D86F9B2CF989F14695B7CB95A478194ECBD175B5C7C1687B7589FD9145B2782CB0BBAB6F7F5B0AC7F8C256EB0D3D87ABAE4F73137FA4AFA387B791B54AC503B53271D
);
GO
-- 3. Creating a table using Always Encrypted (DETERMINISTIC can be indexed and used for equality lookups; RANDOMIZED is more secure)
CREATE TABLE TestEncryptTable (
    id int IDENTITY(1, 1) PRIMARY KEY,
    encryptedLookupCol varchar(11) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (
        ENCRYPTION_TYPE = DETERMINISTIC,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = TestColumnEncryptKey) NOT NULL,
    encryptedValueOnlyCol date ENCRYPTED WITH (
        ENCRYPTION_TYPE = RANDOMIZED,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = TestColumnEncryptKey) NOT NULL
);
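After running the three steps above, a quick sanity check against the catalog views confirms what was created (the object names are the ones used in this example). Note that a plain query from SSMS returns the encrypted columns as ciphertext unless the connection has Column Encryption Setting=Enabled and access to the certificate.

-- Column master keys and where they live
SELECT name, key_store_provider_name, key_path FROM sys.column_master_keys;

-- Column encryption keys and the algorithm protecting their values
SELECT cek.name, cekv.encryption_algorithm_name
FROM sys.column_encryption_keys cek
JOIN sys.column_encryption_key_values cekv
    ON cekv.column_encryption_key_id = cek.column_encryption_key_id;

-- Which columns of the test table are encrypted, and how
SELECT c.name AS column_name, c.encryption_type_desc, c.encryption_algorithm_name,
       cek.name AS column_encryption_key
FROM sys.columns c
JOIN sys.column_encryption_keys cek
    ON cek.column_encryption_key_id = c.column_encryption_key_id
WHERE c.object_id = OBJECT_ID('TestEncryptTable');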

2017-01-04

Coding Pattern of TRY CATCH and TRANSACTION inside a CURSOR

Sometimes you may need to create a SQL Agent job that processes SQL commands from a FIFO queue (implemented here as a user table with an identity integer column), and if one command raises an error, logs the error to an error log table and then continues with the next command. Such a job can be implemented with a LOCAL FAST_FORWARD cursor, a TRY...CATCH block, and explicit transactions. Below is an example illustrating how to do it.

-- Create Tables
CREATE TABLE QueueCommands (
    ID bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    SqlCommand varchar(500) NOT NULL
 )
 GO
CREATE TABLE CompletedCommands (
    ID bigint NOT NULL,
    SqlCommand varchar(500) NOT NULL
)
GO
-- Error Log Table
CREATE TABLE ErrorLog(
    errorTime datetime NOT NULL CONSTRAINT DF_ErrorLog_errorTime  DEFAULT (getdate()),
    errorMsg nvarchar(2048) NULL
)
GO

-- Error Logger Stored Procedure
CREATE OR ALTER PROC error_handler_sp AS
BEGIN
   DECLARE @errmsg   nvarchar(2048),
           @severity tinyint,
           @state    tinyint,
           @errno    int,
           @proc     sysname,
           @lineno   int
          
   SELECT @errmsg = error_message(), @severity = error_severity(),
          @state  = error_state(), @errno = error_number(),
          @proc   = error_procedure(), @lineno = error_line()
      
   IF @errmsg NOT LIKE '***%'
   BEGIN
      SELECT @errmsg = '*** ' + coalesce(quotename(@proc), '<dynamic SQL>') +
                       ', Line ' + ltrim(str(@lineno)) + '. Errno ' +
                       ltrim(str(@errno)) + ': ' + @errmsg
   END

   INSERT ErrorLog (errorMsg) VALUES (@errmsg);

   RAISERROR('%s', @severity, @state, @errmsg)
END
GO
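To see the logger in action on its own before wiring it into the job below, a quick test such as the following should add one row to ErrorLog and echo the divide-by-zero message back to the client (delete the test row afterwards if you want a clean ErrorLog for the later example):

BEGIN TRY
    SELECT 1 / 0;                 -- raises error 8134
END TRY
BEGIN CATCH
    EXEC error_handler_sp;        -- logs to ErrorLog, then re-raises via RAISERROR
END CATCH;
GO
SELECT * FROM ErrorLog;
GO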

-- Create the Stored Procedure to be called by sql agent job
CREATE OR ALTER PROC JobProcessCommands
AS
BEGIN

SET XACT_ABORT ON;
SET NOCOUNT ON;

CREATE TABLE #tmpQueueCommands (
    ID bigint PRIMARY KEY,
    SqlCommand varchar(500)
);
INSERT #tmpQueueCommands (ID, SqlCommand) SELECT ID, SqlCommand FROM QueueCommands;

DECLARE @cur_ID bigint, @cur_SqlCommand varchar(500);
DECLARE cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT ID, SqlCommand FROM #tmpQueueCommands ORDER BY ID
OPEN cur
WHILE 1 = 1
BEGIN
    FETCH cur INTO @cur_ID, @cur_SqlCommand
    IF @@FETCH_STATUS <> 0 BREAK
   
    BEGIN TRY
    BEGIN TRAN
   
    EXECUTE(@cur_SqlCommand);
    IF @@ERROR = 0
    BEGIN
        DELETE TOP(1) QueueCommands OUTPUT deleted.* INTO CompletedCommands WHERE ID = @cur_ID;
    END

    COMMIT
    END TRY
    BEGIN CATCH
    IF XACT_STATE() <> 0 ROLLBACK
    EXEC error_handler_sp
    END CATCH
END
CLOSE cur
DEALLOCATE cur

DROP TABLE #tmpQueueCommands;
END
GO

-- Example
INSERT QueueCommands (SqlCommand) VALUES ('SELECT 1');
INSERT QueueCommands (SqlCommand) VALUES ('SELECT 1/0');    -- Error
INSERT QueueCommands (SqlCommand) VALUES ('SELECT 2');

SELECT * FROM QueueCommands

EXEC JobProcessCommands

SELECT * FROM QueueCommands
SELECT * FROM CompletedCommands
SELECT * FROM ErrorLog

2016-12-13

Transaction Isolation Level

By default, when you open a connection to SQL Server, its transaction isolation level is READ COMMITTED. However, client applications can change the isolation level of their connections, for example by running a SET TRANSACTION ISOLATION LEVEL statement. When diagnosing blocking and deadlocks, make sure the transaction isolation level is not overlooked. The script below checks the transaction isolation level of all currently running requests:
SELECT
    r.session_id, s.transaction_isolation_level,
    CASE s.transaction_isolation_level
        WHEN 0 THEN 'Unspecified'
        WHEN 1 THEN 'ReadUncommitted'
        WHEN 2 THEN 'ReadCommitted'
        WHEN 3 THEN 'Repeatable'
        WHEN 4 THEN 'Serializable'
        WHEN 5 THEN 'Snapshot'
    END AS transaction_isolation_level_Name,
    OBJECT_NAME(t.objectid, t.dbid) AS object_name,
    t.[text],
    SUBSTRING(
        t.[text],
        (r.statement_start_offset / 2) + 1,
        ((CASE r.statement_end_offset
              WHEN -1 THEN DATALENGTH(t.[text])
              ELSE r.statement_end_offset
          END - r.statement_start_offset) / 2) + 1) AS stmt,
    r.start_time, r.status, r.command, DB_NAME(r.database_id) AS db_name,
    s.login_name, s.host_name, s.program_name, s.total_elapsed_time
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
CROSS APPLY sys.dm_exec_sql_text(r.[sql_handle]) t
WHERE r.status IN ('runnable', 'suspended', 'running')


If the isolation level is set by the client application and you are not able to change the application code, you can override it inside the stored procedure, provided the application calls stored procedures and you have permission to alter them. For example:
-- a sample stored procedure
CREATE PROCEDURE uspTestTranIsoLvl
    @set bit
AS
BEGIN
IF @set = 1
BEGIN
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
END
   
SELECT session_id, transaction_isolation_level,
CASE transaction_isolation_level
WHEN 0 THEN 'Unspecified'
WHEN 1 THEN 'ReadUncommitted'
WHEN 2 THEN 'ReadCommitted'
WHEN 3 THEN 'Repeatable'
WHEN 4 THEN 'Serializable'
WHEN 5 THEN 'Snapshot' END AS TRANSACTION_ISOLATION_LEVEL
FROM sys.dm_exec_sessions WHERE session_id = @@SPID;

END
GO


-- Demonstration
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT 'Before', session_id, transaction_isolation_level,
CASE transaction_isolation_level
WHEN 0 THEN 'Unspecified'
WHEN 1 THEN 'ReadUncommitted'
WHEN 2 THEN 'ReadCommitted'
WHEN 3 THEN 'Repeatable'
WHEN 4 THEN 'Serializable'
WHEN 5 THEN 'Snapshot' END AS TRANSACTION_ISOLATION_LEVEL
FROM sys.dm_exec_sessions WHERE session_id = @@SPID;

SELECT 'NOT set';
EXEC uspTestTranIsoLvl 0;

SELECT 'SET';
EXEC uspTestTranIsoLvl 1;

SELECT 'After', session_id, transaction_isolation_level,
CASE transaction_isolation_level
WHEN 0 THEN 'Unspecified'
WHEN 1 THEN 'ReadUncommitted'
WHEN 2 THEN 'ReadCommitted'
WHEN 3 THEN 'Repeatable'
WHEN 4 THEN 'Serializable'
WHEN 5 THEN 'Snapshot' END AS TRANSACTION_ISOLATION_LEVEL
FROM sys.dm_exec_sessions WHERE session_id = @@SPID;

The demonstration above shows that the transaction isolation level can be overridden inside a stored procedure, and that it reverts to the original level after the stored procedure returns.

2016-11-14

Increasing usability of Filtered Index

A filtered index is an optimized nonclustered index especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can improve query performance as well as reduce index maintenance and storage costs compared with full-table indexes.
To increase the chance that the query optimizer employs the filtered indexes you create, make sure to include all the filtering column(s) of the index in its index key or included column list. The example below demonstrates why.


USE AdventureWorks2008
GO
-- Filtered Column in index key
DROP INDEX [Sales].[SalesOrderDetail].IX_1
GO
CREATE INDEX IX_1 ON [Sales].[SalesOrderDetail] (ProductID, SalesOrderID, OrderQty) WHERE OrderQty >= 2;
GO
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 2        -- Use Filtered Index
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 3        -- Use Filtered Index

-- Filtered Column NOT in index
DROP INDEX [Sales].[SalesOrderDetail].IX_1
GO
CREATE INDEX IX_1 ON [Sales].[SalesOrderDetail] (ProductID, SalesOrderID) WHERE OrderQty >= 2;
GO
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 2        -- Use
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 3        -- NOT use!

-- Filtered Column in INCLUDED
DROP INDEX [Sales].[SalesOrderDetail].IX_1
GO
CREATE INDEX IX_1 ON [Sales].[SalesOrderDetail] (ProductID, SalesOrderID) INCLUDE (OrderQty) WHERE OrderQty >= 2;
GO
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 2        -- Use
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= 3        -- Use

By the way, the query optimizer won't consider filtered indexes if you're using local variables or parameterized SQL for the predicate that matches the filter, because the values are unknown at compile time. Dynamic SQL can help, as sketched below.
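Here's a rough sketch of the problem and two common workarounds, assuming the IX_1 index from the first example above is in place; OPTION (RECOMPILE) lets the optimizer see the variable's runtime value, while dynamic SQL embeds the value as a literal:

DECLARE @qty smallint = 3;

-- Local variable: value unknown at compile time, so the filtered index is typically not matched
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= @qty;

-- Workaround 1: recompile with the actual runtime value
SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= @qty
OPTION (RECOMPILE);

-- Workaround 2: build the value into dynamic SQL as a literal
DECLARE @sql nvarchar(300) =
    N'SELECT ProductID, SalesOrderID FROM [Sales].[SalesOrderDetail] WHERE OrderQty >= '
    + CAST(@qty AS nvarchar(10));
EXEC sp_executesql @sql;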


2016-10-11

SQL Trace

SQL Trace, even though it's quite an old feature, is still loved by a lot of database administrators due to its simplicity of creation. Rather than writing a SQL script that calls the sp_trace_xxx system stored procedures by hand, you can create one by using SQL Server Profiler to export a trace definition script, as in the steps below:

1. Open SQL Server Profiler;
2. Create a new trace;
3. Select a trace template, pick the required events, columns, and set filters as you wish;
4. Start the trace;
5. Stop the trace;
6. Click "File" menu -> "Export" -> "Script Trace Definition" -> "For SQL Server 2005 - 2014..."
7. Save the exported sql file.
(The exported trace definition script will look something like the sketch shown after this list.)
8. Replace the text "InsertFileNameHere" with a filename prefixed by a path, e.g., c:\MyFolder\MyTrace. The .trc extension will be appended to the filename automatically;
9. If you want to use rollover files for the .trc trace output, change the second parameter value of sp_trace_create from 0 to 2. The default value of max_file_size is 5 MB;
10. Note that trace definitions are removed automatically when SQL Server restarts, so if you want the same trace to survive a server restart, you can create a stored procedure in the master database that wraps the trace definition script, then execute sp_procoption to mark that stored procedure for automatic execution every time SQL Server starts. You can check which stored procedures in the master database will be auto-executed by running the following query with a sysadmin login:
USE master
SELECT [name]
FROM sysobjects
WHERE type = 'P'
  AND OBJECTPROPERTY(id, 'ExecIsStartUp') = 1; 

11. Execute the trace definition script to create and start the trace.
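For reference, the exported definition is just a wrapper around the sp_trace_* system procedures. The trimmed sketch below is only illustrative; the real export contains one sp_trace_setevent call per event/column pair you picked (here 12 = SQL:BatchCompleted, column 1 = TextData, 13 = Duration, 10 = ApplicationName):

-- Trace definition sketch (illustrative event/column IDs only)
DECLARE @rc int, @TraceID int, @maxfilesize bigint, @on bit;
SET @maxfilesize = 5;
SET @on = 1;

-- 2nd parameter: 0 = single file, 2 = enable file rollover (see step 9)
EXEC @rc = sp_trace_create @TraceID OUTPUT, 0, N'InsertFileNameHere', @maxfilesize, NULL;
IF @rc <> 0 GOTO error;

-- one call per event/column combination
EXEC sp_trace_setevent @TraceID, 12, 1, @on;     -- SQL:BatchCompleted, TextData
EXEC sp_trace_setevent @TraceID, 12, 13, @on;    -- SQL:BatchCompleted, Duration

-- filter out SQL Profiler's own traffic (0 = AND, 7 = NOT LIKE)
EXEC sp_trace_setfilter @TraceID, 10, 0, 7, N'SQL Server Profiler%';

-- start the trace and report its ID
EXEC sp_trace_setstatus @TraceID, 1;
SELECT TraceID = @TraceID;
GOTO finish;

error:
SELECT ErrorCode = @rc;

finish:
GO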

After the trace is created, you can check its properties, including the path of the output .trc file, whether it rolls over, and its events, columns, and filters, with the queries below:
-- view created traces
SELECT * FROM sys.traces
-- view trace events & columns
SELECT
     e.name AS Event_Name,
     c.name AS Column_Name
FROM fn_trace_geteventinfo(3) ei    -- replace the function parameter with the trace ID you want to check
JOIN sys.trace_events e ON ei.eventid = e.trace_event_id
JOIN sys.trace_columns c ON ei.columnid = c.trace_column_id
-- view trace filters
SELECT
     columnid,
    c.name AS Column_Name,
     logical_operator,
     comparison_operator,
     value
FROM fn_trace_getfilterinfo(3) fi    -- replace the function parameter with the trace ID you want to check
JOIN sys.trace_columns c ON fi.columnid = c.trace_column_id 


To see the trace output, you can open the .trc files with SQL Profiler, or you can use fn_trace_gettable, which returns the content of one or more trace files in tabular form, e.g.
-- view trace output as table
SELECT *
FROM fn_trace_gettable('H:\SYSADMIN_AUDIT\sysadmin_audit_168.trc', default)    -- 1st param:initial file; 2nd param:no. of rollover files to be read (default=until last file) 
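Since the function returns an ordinary rowset, you can filter and aggregate it like any table. For example (reusing the path above and assuming the trace captured these columns; fn_trace_gettable reports Duration in microseconds):

SELECT TextData, LoginName, StartTime, Duration / 1000 AS duration_ms
FROM fn_trace_gettable('H:\SYSADMIN_AUDIT\sysadmin_audit_168.trc', default)
WHERE EventClass = 12                 -- 12 = SQL:BatchCompleted
ORDER BY Duration DESC;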



2016-09-12

View Execution Plan for Multi Statement TVF, and why Inline TVF better

When you execute a multi-statement table-valued function (TVF), the execution plan shows just a single Table Valued Function operator and tells you nothing about the work done inside the function. The optimizer also estimates that a multi-statement TVF returns only a single row, which is mostly not the truth. To see the detailed plan inside the multi-statement TVF, you can run the following SQL statement to retrieve it from the plan cache:
SELECT  deqp.query_plan, dest.text,
SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1,
(deqs.statement_end_offset - deqs.statement_start_offset)
/ 2 + 1) AS actualstatement, deqs.creation_time
FROM    sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
WHERE   deqp.objectid = OBJECT_ID('YourFunctionName');

The result contains the cached plan (query_plan column) and the text of each statement inside the function; clicking the plan XML in SSMS opens the graphical plan.

A multi-statement TVF can be bad for performance because it is treated like a table, except that there are no statistics available for SQL Server to base a good execution plan on, so it estimates the function as returning a very small number of rows. If the function actually returns a larger number of rows, the generated plan can be far from optimal.
Let's compare the showplan difference between an inline TVF and a multi-statement TVF. Below are two TVFs, one inline and the other multi-statement:
CREATE FUNCTION udfInline
(   
    @id int
)
RETURNS TABLE
AS
RETURN
    SELECT * FROM Account
GO
CREATE FUNCTION udf_Multi
(
    @id int
)
RETURNS
@result TABLE
(
    AccountId int, UtcOffset int
)
AS
BEGIN
    INSERT @result SELECT * FROM Account
    RETURN
END
GO
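To reproduce the comparison, run both functions with "Include Actual Execution Plan" switched on (Account, AccountId and UtcOffset are just the placeholder table and columns used in the definitions above):

SELECT AccountId, UtcOffset FROM dbo.udfInline(1);    -- plan shows the Account table itself, estimated from its statistics
SELECT AccountId, UtcOffset FROM dbo.udf_Multi(1);    -- plan shows a Table Valued Function operator with a fixed low row estimate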

For the inline TVF, the estimated number of rows is quite accurate. The actual underlying tables get brought into the main query, and a better execution plan can be generated based on proper statistics. You'll notice that in this case the execution plan does NOT mention the function at all, as the function is basically just merged into the main query.
For the multi-statement TVF, SQL Server estimates only one row. There's a good explanation of this on MSDN: "But if you use multi-statement TVF, it's treated as just like another table. Because there is no statistics available, SQL Server has to make some assumptions and in general provide low estimate. If your TVF returns only a few rows, it will be fine. But if you intend to populate the TVF with thousands of rows and if this TVF is joined with other tables, inefficient plan can result from low cardinality estimate."