JSON is a popular data format used for storing unstructured data. Many RESTful web services return results and accept inputs in JSON format. Starting from SQL Server 2016, SQL Server has native support for parsing JSON data. This blog post demonstrates how to transform JSON data into rows and columns, and how to join the result with table data.
First, let us create a table named Products, which stores the product details in a text column in JSON format:
CREATE TABLE Products (
id int NOT NULL PRIMARY KEY,
[name] varchar(50) NOT NULL,
detailsInJson nvarchar(4000) NOT NULL
);
Then populate some sample rows into it:
INSERT Products (id, [name], detailsInJson) VALUES (1, 'Toy Car', '{"size" : "10cm x 20cm x 30cm", "color" : "red", "type" : "mini"}');
INSERT Products (id, [name], detailsInJson) VALUES (2, 'Teddy Bear', '{"color" : "brown", "texture" : "cloth"}');
Let's double check the table data:
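For example, a plain SELECT is enough to verify the inserted rows:
SELECT id, [name], detailsInJson FROM Products;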
As you can see, each row has its own set of attributes in the JSON data.
We can retrieve the JSON data using the OPENJSON function as below:
SELECT id, [name], D.*
FROM Products
CROSS APPLY OPENJSON (detailsInJson) WITH (
color varchar(50) '$.color',
size varchar(50) '$.size',
[type] varchar(50) '$.type',
[texture] varchar(50) '$.texture'
) AS D;
Below is the query result:
You can also check the execution plan of this query; OPENJSON is essentially a table-valued function which does the transformation:
SQL Server 2016 introduces various new built-in functions to process JSON data, which can be found in Microsoft SQL Docs.
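For example, ISJSON, JSON_VALUE and JSON_MODIFY can be used to validate JSON text, extract a scalar value from it, and update it in place; a small sketch against the Products table created above:
SELECT id, [name],
    ISJSON(detailsInJson) AS isValidJson,            -- 1 if the column holds valid JSON
    JSON_VALUE(detailsInJson, '$.color') AS color    -- extract a scalar property
FROM Products;

-- change the color of product 1 inside the JSON text
UPDATE Products SET detailsInJson = JSON_MODIFY(detailsInJson, '$.color', 'blue') WHERE id = 1;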
2019-12-19
2019-11-19
Striped Database Backup
SQL Server allows you to back up your database or transaction log to a striped media set. A striped set is a set of disk files on which backup data is divided into blocks and distributed in a fixed order. Most database administrators overlook this feature. It can be used to speed up the backup process by distributing the backup workload across multiple storage devices.
The script below demonstrates how to make a striped backup set:
BACKUP DATABASE [StackOverflow2013]
TO
DISK='D:\DbBackup\StackOverflow2013_1.bak',
DISK='D:\DbBackup\StackOverflow2013_2.bak',
DISK='D:\DbBackup\StackOverflow2013_3.bak',
DISK='D:\DbBackup\StackOverflow2013_4.bak'
WITH STATS = 10
GO
I used only the same drive D:\ in the above script, but you can specify a different drive for each DISK.
Here is the result:
The script below demonstrates how to restore from a striped backup:
RESTORE DATABASE [StackOverflow2013]
FROM
DISK='D:\DbBackup\StackOverflow2013_1.bak',
DISK='D:\DbBackup\StackOverflow2013_2.bak',
DISK='D:\DbBackup\StackOverflow2013_3.bak',
DISK='D:\DbBackup\StackOverflow2013_4.bak'
WITH STATS = 10
GO
The major downside of a striped backup is that if one backup file is lost or corrupted, you cannot restore the backup set at all.
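To reduce this risk, you can verify the media set after the backup completes; for example, RESTORE VERIFYONLY (a sketch assuming the same four files) checks that the striped set is complete and readable:
RESTORE VERIFYONLY
FROM
DISK='D:\DbBackup\StackOverflow2013_1.bak',
DISK='D:\DbBackup\StackOverflow2013_2.bak',
DISK='D:\DbBackup\StackOverflow2013_3.bak',
DISK='D:\DbBackup\StackOverflow2013_4.bak'
GO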
2019-10-15
Auditing Stored Procedure Execution
In the previous blog post, I demonstrated how to use the SQL Server Audit feature to keep track of login and logout events. This post shows how to keep track of stored procedure executions. The script below creates the audit:
USE [master]
GO
CREATE SERVER AUDIT [Audit_SP_Exec]
TO FILE
( FILEPATH = N'D:\SqlAudit' -- *** Modify the File Path ***
,MAXSIZE = 50 MB
,MAX_ROLLOVER_FILES = 2000
,RESERVE_DISK_SPACE = OFF
)
WITH
( QUEUE_DELAY = 5000
,ON_FAILURE = CONTINUE
)
GO
USE [TestDB] -- *** your application DB ***
GO
CREATE DATABASE AUDIT SPECIFICATION [DbAuditSpec_SP_Exec]
FOR SERVER AUDIT [Audit_SP_Exec]
ADD (EXECUTE ON OBJECT::[dbo].[uspTest] BY [public]),
ADD (EXECUTE ON OBJECT::[dbo].[uspTest2] BY [public])
GO
ALTER DATABASE AUDIT SPECIFICATION [DbAuditSpec_SP_Exec] WITH (STATE = ON)
GO
USE [master]
GO
ALTER SERVER AUDIT [Audit_SP_Exec] WITH (STATE = ON)
GO
Then you can view the audit log by right-clicking the server audit in SSMS, as shown below:
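Besides the SSMS log viewer, you can also read the audit files with T-SQL using the sys.fn_get_audit_file function; a minimal sketch, assuming the file path used above:
SELECT event_time, action_id, server_principal_name, database_name, [object_name], [statement]
FROM sys.fn_get_audit_file('D:\SqlAudit\*', DEFAULT, DEFAULT)
ORDER BY event_time DESC;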
2019-09-17
How to Audit Login and Logout for specific logins
Starting from SQL Server 2008, we can use the SQL Server Audit feature to record numerous actions that occur on our SQL Server. SQL Server Audit uses Extended Events under the hood, which imposes less overhead than plain old SQL Trace.
In this blog post, I would like to show you how to audit login and logout events for specific logins, which can only be done by setting a filter on the Server Audit.
USE [master]
GO
CREATE SERVER AUDIT [Audit_sysadm_login]
TO FILE
( FILEPATH = N'D:\SqlAudit\'
,MAXSIZE = 10 MB
,MAX_FILES = 100
,RESERVE_DISK_SPACE = OFF
)
WITH
( QUEUE_DELAY = 5000
,ON_FAILURE = CONTINUE
)
WHERE ([server_principal_name]='sa' OR [server_principal_name]='peter.lee')
GO
CREATE SERVER AUDIT SPECIFICATION [ServerAuditSpec_sysadm_login]
FOR SERVER AUDIT [Audit_sysadm_login]
ADD (SUCCESSFUL_LOGIN_GROUP),
ADD (LOGOUT_GROUP)
WITH (STATE = ON)
GO
ALTER SERVER AUDIT [Audit_sysadm_login] WITH (STATE = ON)
GO
* Please be reminded that ON_FAILURE is a critical argument of the CREATE SERVER AUDIT statement. If keeping your SQL Server service online is more important than the auditing requirement, set this argument to CONTINUE.
The simplest way to check the audit log is using the Log File Viewer, which can be opened in Object Explorer > Security > Audits > right-click the audit > View Audit Logs.
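Alternatively, the audit files can be queried directly with T-SQL; a small sketch, assuming the files are under D:\SqlAudit:
SELECT event_time, action_id, succeeded, server_principal_name, client_ip, application_name
FROM sys.fn_get_audit_file('D:\SqlAudit\*', DEFAULT, DEFAULT)
ORDER BY event_time DESC;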
2019-08-23
Disabling and Re-Enabling SQL Logins
During system maintenance, as a production DBA, you often need to deploy stored procedure changes and data patches on the production database. While you are applying those changes, you want to prevent users and applications from accessing the database; otherwise program errors will occur and logical data integrity will be broken. You can use the SQL script below to disable non-sysadmin SQL logins before you apply changes, and re-enable them after your maintenance tasks have completed.
DECLARE @sysadmins table ([name] sysname);
INSERT @sysadmins SELECT prin2.[name]
FROM sys.server_principals prin
JOIN sys.server_role_members mem ON prin.principal_id = mem.role_principal_id
JOIN sys.server_principals prin2 ON prin2.principal_id = mem.member_principal_id
WHERE prin.[type] = 'R' and prin.[name] = N'sysadmin';
--SELECT * FROM @sysadmins;
DECLARE @appLogins table (id int IDENTITY, [name] sysname);
INSERT @appLogins SELECT [name] FROM sys.server_principals
WHERE [type] = 'S' and [name] <> 'sa' AND [name] NOT IN (SELECT [name] FROM @sysadmins) AND [name] NOT LIKE '##%' AND [name] NOT LIKE '%\%';
SELECT * FROM sys.server_principals WHERE [name] IN (SELECT [name] FROM @appLogins);
DECLARE @cnt int, @LoginName varchar(100);
SELECT @cnt = COUNT(*) FROM @appLogins;
WHILE @cnt <> 0
BEGIN
SELECT @LoginName = [Name] FROM @appLogins WHERE id = @cnt;
PRINT 'ALTER LOGIN [' + @LoginName + '] DISABLE;';
PRINT 'ALTER LOGIN [' + @LoginName + '] ENABLE;';
--EXEC('ALTER LOGIN [' + @LoginName + '] DISABLE;');
--EXEC('ALTER LOGIN [' + @LoginName + '] ENABLE;');
SET @cnt = @cnt - 1;
END
SELECT * FROM sys.server_principals WHERE [name] IN (SELECT [name] FROM @appLogins);
2019-07-18
Solving Last-Page Insert Latch Contention by Index Partitioning
According to a Microsoft KB, if you have an ever-increasing index on a table, such as one on an IDENTITY or GETDATE() column, then under a huge number of concurrent queries/threads inserting new rows into that table, different threads will contend for an exclusive latch (PAGELATCH_EX) on the last data page of that index in the buffer pool.
Let's discuss in more detail what happens internally in SQL Server when a query reads and writes pages in memory. When SQL Server accesses a page in the buffer pool, the page access must be synchronized among multiple worker threads. Every time a query reads a page, its worker thread has to acquire a shared latch (PAGELATCH_SH), and every time a query writes a page, its worker thread has to acquire an exclusive latch (PAGELATCH_EX). When you run an INSERT statement, the worker thread exclusively latches the page; in the meantime no other concurrent worker thread can read or write from/to this page. With an ever-increasing index, parallel INSERT queries contend for the exclusive latch on the same last page of that index, as illustrated below:
In this blog post, I'm going to demonstrate how to solve this problem by partitioning the index. Starting from SQL Server 2016 SP1, the partitioning feature is available on all editions.
Let's create the demo table first, with an identity primary key:
CREATE TABLE Ticket (
id int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
colA varchar(50)
);
By using the SQLQueryStress tool created by Adam Machanic, you can simulate multiple concurrent INSERT statements into the demo table:
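The statement fed to the stress tool can be as simple as a single-row insert; a hypothetical example:
INSERT Ticket (colA) VALUES ('stress test row');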
You can see there are PAGELATCH_EX waits on some sessions:
SELECT wait_type, wait_resource, COUNT(1) cnt FROM sys.dm_exec_requests GROUP BY wait_type, wait_resource ORDER BY cnt DESC;
The wait resource in this case is 6:1:134489, meaning database id 6, file 1 and page 134489. You can check that page to determine what it is by using the DBCC PAGE command:
DBCC TRACEON(3604);
GO
DBCC PAGE(6, 1, 134489, 0);
GO
The ObjectId is 146099561. You can get the actual object name using the OBJECT_NAME function, which gives you the Ticket table you just created:
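For example (the ObjectId value comes from the DBCC PAGE output above):
SELECT OBJECT_NAME(146099561);       -- run this in the demo database
-- or specify the database id explicitly:
SELECT OBJECT_NAME(146099561, 6);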
In order to minimize the PAGELATCH_EX contention on the last page, we can use partitioning with a computed column. Let's do this:
-- suppose you have 4 CPU cores in your SQL Server
CREATE PARTITION FUNCTION pf_hash (tinyint) AS RANGE LEFT FOR VALUES (0, 1, 2, 3);
GO
CREATE PARTITION SCHEME ps_hash AS PARTITION pf_hash ALL TO ([PRIMARY]);
GO
DROP TABLE Ticket;
GO
CREATE TABLE Ticket (
id int IDENTITY(1, 1) NOT NULL,
colA varchar(50),
hashId AS CONVERT(tinyint, ABS(id % 4)) PERSISTED NOT NULL,
CONSTRAINT PK_Ticket PRIMARY KEY CLUSTERED (id, hashId)
) ON ps_hash(hashId);
GO
Let's check that the table is partitioned:
SELECT DISTINCT
table_name = O.[name],
partition_index_name = I.[name],
partition_index_type = I.[type_desc],
partition_scheme = PS.[name],
partition_function = PF.[name]
FROM sys.partitions P
JOIN sys.objects O ON O.[object_id] = P.[object_id]
JOIN sys.indexes I ON P.[object_id] = I.[object_id] AND P.index_id = I.index_id
JOIN sys.data_spaces DS ON I.data_space_id = DS.data_space_id
join sys.partition_schemes PS on DS.data_space_id = PS.data_space_id
JOIN sys.partition_functions PF on PS.function_id = PF.function_id;
Now run the stress test again; you will see the PAGELATCH_EX waits become fewer than before.
2019-06-18
Lock Escalation is evaluated per Statement only, NOT per Session
Lock Escalation is an internal optimization process in which SQL Server converts many fine-grained locks, such as row or page locks, into a single coarse-grained table or partition lock, in order to release the memory held by the many fine-grained locks. When SQL Server is going to modify a large number of rows, it is more efficient for the database engine to take fewer, larger locks instead of dealing with a large number of individual locks. In this blog post, I'm going to show that SQL Server evaluates lock escalation per statement, rather than per session. Let's take a look at the demonstration below:
USE AdventureWorks2008R2
GO
BEGIN TRAN
DELETE TOP(5000) FROM [Sales].[SalesOrderHeader];
-- Lock Escalation to Table Level
SELECT resource_type,resource_subtype,request_mode,COUNT(*)
FROM sys.dm_tran_locks WHERE request_session_id = @@SPID
GROUP BY resource_type,resource_subtype,request_mode;
ROLLBACK
GO
BEGIN TRAN
DECLARE @loop int = 1;
WHILE @loop < 10
BEGIN
DELETE TOP(1000) FROM [Sales].[SalesOrderHeader];
SET @loop += 1;
END
-- Lock Escalation NOT occurred!
SELECT resource_type,resource_subtype,request_mode,COUNT(*)
FROM sys.dm_tran_locks WHERE request_session_id = @@SPID
GROUP BY resource_type,resource_subtype,request_mode;
ROLLBACK
Below is the result:
As you can see, the first batch deletes 5000 rows in one statement, so the key and page locks are escalated into one table lock. The second batch deletes the rows in chunks using a WHILE loop; each statement deletes just 1000 rows, so no locks are escalated.
2019-05-06
Tracking Schema Changes using DDL Triggers
The simplest way to check the schema change history of a database is the Schema Changes History report, which can be opened in SSMS by right-clicking your SQL Server > Reports > Standard Reports > Schema Changes History. This built-in report shows the changes you've made using Data Definition Language (DDL) statements, relying on the Default Trace mechanism (SQL Server records certain events about your system in a "Default Trace", a binary file that watches SQL Server for certain events using the SQL Trace mechanism). But the built-in report has some weaknesses: it "rolls over", so the change history isn't persisted forever; it doesn't show the DDL statements that were run; and it only shows the impersonated login (when the EXECUTE AS statement is used) rather than the original login.
You can track schema changes better by implementing DDL Triggers yourself. Below is the script.
CREATE DATABASE AuditDB;
GO
USE AuditDB;
GO
CREATE TABLE DDLEvent (
eventDate datetime NOT NULL DEFAULT GETDATE(),
eventType nvarchar(255),
eventDDL nvarchar(max),
eventXML xml,
databaseName nvarchar(255),
schemaName nvarchar(255),
objectName nvarchar(255),
hostName nvarchar(255),
ipAddress nvarchar(255),
programName nvarchar(255),
loginName nvarchar(255)
);
GO
CREATE CLUSTERED INDEX CX_DDLEvent ON DDLEvent (eventDate)
GO
-- Modify it to your DB name
USE YourDB;
GO
CREATE OR ALTER TRIGGER DDLTrigger_AuditSchemaChange
ON DATABASE
FOR CREATE_PROCEDURE, ALTER_PROCEDURE, DROP_PROCEDURE, CREATE_TRIGGER, ALTER_TRIGGER, DROP_TRIGGER, CREATE_TABLE, ALTER_TABLE, DROP_TABLE, RENAME
AS
BEGIN
SET NOCOUNT ON;
DECLARE @eventData xml = EVENTDATA();
DECLARE @ip nvarchar(255) = CONVERT(nvarchar(255), CONNECTIONPROPERTY('client_net_address'));
INSERT AuditDB.dbo.DDLEvent (
eventType, eventDDL, eventXML, databaseName, schemaName, objectName, hostName, ipAddress, programName, loginName
) SELECT
@eventData.value('(/EVENT_INSTANCE/EventType)[1]', 'nvarchar(255)'),
@eventData.value('(/EVENT_INSTANCE/TSQLCommand)[1]', 'nvarchar(max)'),
@eventData,
DB_NAME(),
@eventData.value('(/EVENT_INSTANCE/SchemaName)[1]', 'nvarchar(255)'),
@eventData.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(255)'),
HOST_NAME(),
@ip,
PROGRAM_NAME(),
ORIGINAL_LOGIN()
;
END
GO
By default, once you create a DDL trigger, it is enabled immediately. Then you can check the audit table using a SELECT statement, as illustrated below.
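A query along these lines lists the most recent schema changes captured by the trigger:
SELECT TOP (20) eventDate, eventType, databaseName, objectName, loginName, eventDDL
FROM AuditDB.dbo.DDLEvent
ORDER BY eventDate DESC;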
2019-04-16
The Pitfall of Logon Trigger
Starting with SQL Server 2008, SQL Server provides a feature called Logon Trigger that can be used to audit and control login sessions. Some database administrators use it to restrict logins to valid client host IPs or host names. But in fact such a restriction can be cheated, and if there is any bug inside your Logon Trigger code, it will end up locking everyone out of the instance, and you must rely on the Dedicated Administrator Connection (DAC) to resume your database service.
Below is an example of such logon trigger:
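A minimal sketch of such a trigger, assuming a hard-coded whitelist of host names (APPSERVER01 and APPSERVER02 are placeholders), could look like this:
USE [master]
GO
CREATE OR ALTER TRIGGER LogonTrigger_RestrictHost
ON ALL SERVER
FOR LOGON
AS
BEGIN
    -- deny the connection if the client host is not in the allowed list
    IF HOST_NAME() NOT IN ('APPSERVER01', 'APPSERVER02')
        ROLLBACK;
END;
GO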
When someone tries to connect to your SQL Server using SSMS from an invalid host, the error message below will be shown:
In my opinion, a logon trigger should NOT be used as a mechanism to restrict logins when it can be cheated. What if someone changes their laptop's IP or name to match the allowed ones? You should use a firewall for this purpose.
2019-03-19
Tracking Page Split
When you insert rows into a table, they go on a page; each row has a row length, and you can fit only so many rows on an 8 KB data page. When a row's length increases, because you update a varchar column for instance, SQL Server needs to move the other rows along in order to make room for your modification. If the combined new length of all the rows on the page no longer fits on that page, SQL Server allocates a new page and moves the rows to the right or left of your modification onto it; that is called a 'page split'.
Page splits are normal and can't be avoided completely, especially in an OLTP database. But they can be a performance bottleneck for your DML workload. A page split leaves free space on data pages, and it also increases the chance of deadlocks and blocking, as page locks are acquired while the update statement is executing. We can specify a fill factor to reduce potential page splits by providing enough space for index expansion as data is added to the underlying table.
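For example, a lower fill factor can be applied when rebuilding an index (the index and table names below are placeholders):
-- leave 20% free space on each leaf page when the index is rebuilt
ALTER INDEX IX_YourIndex ON dbo.YourTable
REBUILD WITH (FILLFACTOR = 80);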
SQL Server provides a DMV to track page splits. sys.dm_db_index_operational_stats gives you the cumulative count of page splits. Below is the query which will allow us to track page splits:
SELECT
ios.index_id,
o.[name] AS [object_name],
i.[name] AS index_name,
ios.leaf_allocation_count AS page_split_for_index,
ios.nonleaf_allocation_count AS nonleaf_page_split_for_index
FROM sys.dm_db_index_operational_stats(DB_ID(N'DB_NAME'), NULL, NULL, NULL) AS ios -- replace DB_NAME with your database name
JOIN sys.indexes AS i ON ios.index_id = i.index_id AND ios.[object_id] = i.[object_id]
JOIN sys.objects AS o ON ios.[object_id] = o.[object_id]
WHERE o.[type_desc] ='user_table'
ORDER BY page_split_for_index DESC;
2019-02-07
Indexed View Overhead on DML - Part 1 (Non-aggregated view)
Starting from SQL Server 2000, an Indexed View is a powerful tool to speed up complicated SELECT queries. In this series, I would like to dive into the indexed view maintenance overhead on DML statements. In this blog post, I will first focus on an indexed view with no aggregation, i.e. no GROUP BY clause.
Like ordinary indexes, every time we modify data in the underlying tables, SQL Server must maintain not only the index entries on those tables, but also the index entries on the indexed view. This can affect write performance. SQL Server maintains the indexed view by embedding extra operations into the execution plan of the DML statement.
Let's say we have two tables, the Account table stores account id, player name, and the current balance; and the PlayerBetLog table stores the players' betting records.
USE [TESTDB]
GO
CREATE TABLE [dbo].[Account] (
[accid] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[name] [varchar](50) NOT NULL,
[balance] [int] NOT NULL DEFAULT(0)
)
GO
CREATE TABLE [dbo].[PlayerBetLog] (
[dividendId] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[amount] [decimal](18, 2) NOT NULL,
[playerId] [int] NOT NULL,
[gameCat] [int] NOT NULL
);
The balance column in the Account table will be updated frequently, but new records will seldom be added, and existing records are never deleted. On the other hand, new records will be frequently inserted into the PlayerBetLog table, but they never change.
Let's insert some dummy data into the two tables:
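The exact sample data does not matter; a hypothetical set of rows such as the following is enough for the demonstration:
INSERT Account ([name]) VALUES ('Alice'), ('Bob'), ('Carol');
INSERT PlayerBetLog (amount, playerId, gameCat)
SELECT 10.00, 1, 1 UNION ALL
SELECT 25.50, 2, 2 UNION ALL
SELECT 99.99, 3, 1;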
Before adding an indexed view, let's run a typical DML workload, to see the execution plan, statistics I/O, and locked resources:
SET NOCOUNT ON;
SET ARITHABORT OFF;
SET STATISTICS IO ON;
BEGIN TRAN
INSERT PlayerBetLog (playerId, gameCat, amount) SELECT 1, 2, 10 UNION ALL SELECT 1, 2, 20 UNION ALL SELECT 2, 2, 100;
UPDATE Account SET balance += 10 WHERE accid = 1;
SET STATISTICS IO OFF;
-- check locked resources
SELECT
t1.request_session_id AS spid
,t1.resource_type
,DB_NAME(t1.resource_database_id) AS DB
,(CASE resource_type
WHEN 'DATABASE' THEN ''
WHEN 'FILE' THEN ''
WHEN 'OBJECT' THEN OBJECT_NAME(t1.resource_associated_entity_id, t1.resource_database_id)
WHEN 'EXTENT' THEN ''
WHEN 'APPLICATION' THEN ''
WHEN 'METADATA' THEN ''
WHEN 'ALLOCATION_UNIT' THEN (
SELECT TOP 1 OBJECT_NAME(p.[object_id])
FROM sys.partitions AS p JOIN sys.allocation_units AS a
ON a.container_id = (CASE a.[type] WHEN 0 THEN NULL WHEN 2 THEN p.partition_id ELSE p.hobt_id END)
WHERE a.allocation_unit_id = t1.resource_associated_entity_id
)
ELSE (SELECT TOP 1 OBJECT_NAME([object_id]) FROM sys.partitions WHERE hobt_id = t1.resource_associated_entity_id)
END) AS [object_name]
,i.name AS index_name
,t1.request_mode
,t1.request_status
FROM sys.dm_tran_locks AS t1 LEFT JOIN sys.dm_os_waiting_tasks AS t2
ON t1.lock_owner_address = t2.resource_address
LEFT JOIN sys.partitions AS s ON s.hobt_id = t1.resource_associated_entity_id
LEFT JOIN sys.indexes AS i on i.[object_id] = s.[object_id] and i.index_id = s.index_id
LEFT JOIN (SELECT t.[text], p.*
FROM sys.sysprocesses AS p
CROSS APPLY sys.dm_exec_sql_text(p.[sql_handle]) AS t
) AS t3 ON t1.request_session_id = t3.spid
WHERE t3.spid = @@spid
ORDER BY index_name, resource_type
ROLLBACK;
Below is the execution plan:
As you can see in the above picture, the main operator for each DML statement is only the Clustered Index Insert / Update operator.
Let's also check the statistics I/O by using http://statisticsparser.com/
Table 'PlayerBetLog'. Scan count 0, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Account'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Below are the locked resources:
Now let's add an indexed view, which joins the Account table and the PlayerBetLog table, but excludes the frequently updated balance column:
USE [TESTDB]
GO
-- View
CREATE VIEW [dbo].[VwTranReport] WITH SCHEMABINDING AS
SELECT A.accid, A.name, B.dividendId, B.amount, B.gameCat FROM dbo.Account A JOIN dbo.PlayerBetLog B ON A.accid = B.playerId;
GO
-- Unique Clustered Index on View
CREATE UNIQUE CLUSTERED INDEX [VUX_VwTranReport] ON [dbo].[VwTranReport] ([dividendId]);
GO
-- another Nonclustered Index on View
CREATE NONCLUSTERED INDEX [NCI_VwTranReport_accid] ON [dbo].[VwTranReport] ([accid]);
GO
Let's run again the DML workload, and compare the new execution plan, statistics I/O, and locked resources:
Comparing with the old plan on the left-hand side, the new plan on the right-hand side adds an extra operation branch for the INSERT PlayerBetLog statement, which includes an Index Seek to get the Account name column value, and Index Insert operators to maintain the indexes on the indexed view.
On the other hand, as the indexed view excludes the balance column, the UPDATE statement on this column didn't incur any extra operation.
Below is the statistics I/O for the DML statements after indexed view added:
Table 'VwTranReport'. Scan count 0, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Account'. Scan count 0, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 2, logical reads 11, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'PlayerBetLog'. Scan count 0, logical reads 12, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Account'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Extra logical reads are placed on the Account table in order to get the name column for populating the indexed view, and extra I/O is incurred on the indexed view and the Worktable in order to maintain the indexed view.
The picture below compares the locked resources between the old plan (left) and the new plan (right); extra locks are placed on the indexed view in the new plan:
Final thought:
This blog post only investigates the overhead of a non-aggregated indexed view; for an aggregated one, much more complicated operations will be added into the DML execution plan. If you find that adding an indexed view really speeds up a reporting query, you must do a load test first to see whether your production database can afford the extra indexed view maintenance cost on the DML workload.
2019-01-03
An Overlooked Stored Procedure since SQL2012 - sp_describe_first_result_set
As database developers, we create stored procedures for application developers to call in their application code, and most of the time they ask us to provide a document listing the input parameters and result columns. While listing the stored procedure parameters is very easy, by copying them from the stored procedure definition, extracting the result columns from the stored procedure body may not be that easy, especially when there is a SELECT * or a join of multiple tables in the output statement. Starting with SQL Server 2012, there's a built-in stored procedure, sp_describe_first_result_set, which returns the metadata for the first possible result set of a Transact-SQL batch. Here is an example:
USE StackOverflow2010
GO
-- =============================================
-- Author: Peter Lee
-- Create date: 20180102
-- Description: SP Get User Posts
-- =============================================
CREATE OR ALTER PROCEDURE uspGetUserPosts
@DisplayName nvarchar(40)
AS
BEGIN
SET NOCOUNT ON;
SELECT @DisplayName AS UserName, U.Id AS UserId, U.CreationDate AS UserCreationDate,
U.Reputation, P.CreationDate AS PostCreationDate, P.Title
FROM Users AS U JOIN Posts AS P ON U.Id = P.OwnerUserId
WHERE U.DisplayName = @DisplayName
ORDER BY PostCreationDate DESC, Title
END
GO
EXEC dbo.uspGetUserPosts 'Anonymous User';
GO
EXEC sp_describe_first_result_set N'EXEC dbo.uspGetUserPosts ''Anonymous User''';
GO
The sp_describe_first_result_set stored procedure will tell you the metadata for the result set of your stored procedure, including the column name, data type, ordinal, columns in ORDER BY list, etc.
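A related dynamic management function, sys.dm_exec_describe_first_result_set_for_object, returns the same kind of metadata as a result set, which is handy when generating documentation; a small sketch:
SELECT column_ordinal, [name], system_type_name, is_nullable
FROM sys.dm_exec_describe_first_result_set_for_object(OBJECT_ID('dbo.uspGetUserPosts'), 0);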