2015-02-25

How to reduce the database size - Part 1

A large database is not bad in itself: it means the system operates at scale, many sales orders rush into the system every day, your company is doing good business, and as a DBA you don't need to worry about losing your job. The problem, however, is that large databases are more expensive to support and maintain; they usually require powerful hardware to run and carry a high storage cost.
Reducing the database size often helps to achieve other goals as well. Think about Disaster Recovery (DR): a smaller database is faster to restore. Also think about performance: smaller tables and rows mean less I/O to access the data.
In this series, I'm going to discuss different measures to reduce database file (mdf and ndf) size. Some of them are transparent to the client applications; others require code refactoring.

0. Check the free space in data file
In order to make sure the large data file size is really due to the data inside, you should check the free space in the data files before anything else. I am not telling you to shrink the database, which would result in fragmentation within the data files (ref. Don’t Touch that Shrink Database Button!). Again, this step is just to make sure you're dealing with a database that is really filled with data. The script below checks the amount of allocated and unallocated space on a per-file basis for the current database.
SELECT f.type_desc AS [Type]
,f.name AS [FileName]
,fg.name AS [FileGroup]
,f.physical_name AS [Path]
,f.size / 128.0 AS [CurrentSizeMB]
,f.size / 128.0 - convert(int,fileproperty(f.name,'SpaceUsed')) / 128.0 AS [FreeSpaceMb]
FROM sys.database_files f WITH (NOLOCK) LEFT JOIN sys.filegroups fg WITH (NOLOCK)
ON f.data_space_id = fg.data_space_id
OPTION (RECOMPILE)

1. Find out the most space consuming objects
Tables (essentially clustered indexes or heaps) and non-clustered indexes are the database objects that contain data. Other objects such as stored procedures, views, and functions don't take up much space. The script below helps you check space usage on a per-object basis.
;with SpaceInfo(ObjectId, IndexId, TableName, IndexName
,Rows, TotalSpaceMB, UsedSpaceMB)
as
(
select
t.object_id as [ObjectId]
,i.index_id as [IndexId]
,s.name + '.' + t.Name as [TableName]
,i.name as [Index Name]
,sum(p.[Rows]) as [Rows]
,sum(au.total_pages) * 8 / 1024 as [Total Space MB]
,sum(au.used_pages) * 8 / 1024 as [Used Space MB]
from
sys.tables t with (nolock) join
sys.schemas s with (nolock) on
s.schema_id = t.schema_id
join sys.indexes i with (nolock) on
t.object_id = i.object_id
join sys.partitions p with (nolock) on
i.object_id = p.object_id and
i.index_id = p.index_id
cross apply
(
select
sum(a.total_pages) as total_pages
,sum(a.used_pages) as used_pages
from sys.allocation_units a with (nolock)
where p.partition_id = a.container_id
) au
where
i.object_id > 255
group by
t.object_id, i.index_id, s.name, t.name, i.name
)
select
ObjectId, IndexId, TableName, IndexName
,Rows, TotalSpaceMB, UsedSpaceMB
,TotalSpaceMB - UsedSpaceMB as [ReservedSpaceMB]
from
SpaceInfo
order by
TotalSpaceMB desc
option (recompile)
The figure below shows the output of this script, which quickly pinpoints the most space-consuming objects.


2. Low page density (aka internal fragmentation) and index defragmentation
Typical day-to-day activity causes indexes to fragment over time. Changes to your data (inserts, updates, deletes, and even updates that lengthen varchar values) all contribute to fragmentation. Fragmentation exists in two kinds.
Logical fragmentation (aka external fragmentation) is caused by the page split operations SQL Server performs to accommodate new rows (inserts) and increased row lengths (updates). It means that the logical order of the pages (determined by the index key values) does not match their physical order on disk, and/or that logically subsequent pages are not located in the same or adjacent extents (an extent is a group of 8 pages).
Low page density (aka internal fragmentation) is empty space on the data file pages of an index, caused either by page split operations, by record deletions, or by a record size that lets only a few records fit on a page (e.g. a 5KB record size, where only one record can fit per 8KB page, leaving 3KB of empty, wasted space per page). This reduces data density: extra space is required on disk to store the data, buffer pool memory is wasted, and more I/Os are required to read the same amount of data.
A small degree of internal fragmentation (free space in index pages) is not necessarily bad, because it reduces page splits during insert and update operations (the same effect as a fill factor). Nonetheless, a large degree of internal fragmentation wastes space and reduces the performance of the system. Moreover, for indexes with ever-increasing keys, e.g. IDENTITY columns, internal fragmentation is not desirable because the data is always inserted at the end of the index.
You can monitor both internal and external fragmentation with sys.dm_db_index_physical_stats. Internal fragmentation is reflected in the avg_page_space_used_in_percent column; a lower value indicates a higher degree of internal fragmentation. The script below shows the page density of all indexes in the current database. It is suggested to run it on a non-production server.
/*
* Check Indexes' Page Density & Logical Fragmentation level.
* Suggested to run on a NON-Production environment to avoid extra system workload.
*/
SELECT
OBJECT_NAME([object_id], database_id) AS [object],
index_id, partition_number, index_level,
index_type_desc, alloc_unit_type_desc,
page_count, avg_page_space_used_in_percent AS [page density %], avg_fragmentation_in_percent AS [logical fragmentation %]
FROM sys.dm_db_index_physical_stats(DB_ID(), DEFAULT, DEFAULT, DEFAULT, 'DETAILED')
ORDER BY [object], index_id, partition_number, index_level DESC
The figure below illustrates partial output of the script.

To better understand the result above, let me briefly explain table and index organization. A table (essentially a clustered index or heap) and a (non-clustered) index are organized the same way. The figure below shows the organization of a table. A table is contained in one or more partitions, and each partition stores data rows in either a heap or a clustered index structure (b-tree). The pages of the heap or clustered index are managed in one or more allocation units (IN_ROW_DATA, LOB_DATA, and ROW_OVERFLOW_DATA), depending on the column types in the data rows.

As the tables in my example database are not partitioned, all allocation units there belong to partition 1. If you have any partitioned tables, you will see separate rows in the result, one per partition per allocation unit. The index_level column is the current level of the index: 0 for index leaf levels, heaps, and LOB_DATA or ROW_OVERFLOW_DATA allocation units; greater than 0 for non-leaf index levels. index_level is highest at the root level of an index.
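If you want to see which allocation units a particular table actually uses, sys.partitions joined to sys.allocation_units shows them directly. Below is a minimal sketch; the table name dbo.SalesOrder is an illustrative assumption, so replace it with one of your own tables.
-- List allocation unit types and page counts per partition and index
-- for a hypothetical table dbo.SalesOrder.
SELECT
    p.index_id,
    p.partition_number,
    au.type_desc AS allocation_unit_type, -- IN_ROW_DATA, LOB_DATA or ROW_OVERFLOW_DATA
    au.total_pages,
    au.used_pages
FROM sys.partitions p WITH (NOLOCK)
JOIN sys.allocation_units au WITH (NOLOCK)
    ON au.container_id = p.partition_id
WHERE p.object_id = OBJECT_ID(N'dbo.SalesOrder');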
You can remove internal fragmentation by rebuilding the index (ALTER INDEX ... REBUILD). Pay attention that an OFFLINE rebuild is the default (and I prefer OFFLINE over ONLINE, ref. Be careful when scheduling an online index rebuild). If possible, always try to define a maintenance window for the database, select the critical tables that are fragmented, and schedule offline index rebuilds for them. Generally you don't have to pay attention to heaps or to small tables. Take a log backup before rebuilding indexes if your database is in the FULL recovery model.
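As a sketch of what such a maintenance step might look like (the database, table, index, and backup path below are illustrative assumptions, not from a real system):
-- Take a log backup first if the database uses the FULL recovery model.
BACKUP LOG UserDB TO DISK = N'D:\Backup\UserDB_log.trn';
GO
-- Offline rebuild of a single fragmented index (hypothetical names).
ALTER INDEX IX_SalesOrder_CustomerId ON dbo.SalesOrder
REBUILD WITH (ONLINE = OFF, SORT_IN_TEMPDB = ON);
GO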
Defining an appropriate FILLFACTOR value can also reduce the number of page splits and the resulting internal fragmentation. Remember that FILLFACTOR is applied only during index creation or rebuild. Unfortunately, there is no universal FILLFACTOR value; as the DBA you should try to figure out the optimal FILLFACTOR value for different indexes. (Ref. 5 Things About Fillfactor)
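For reference, here is a hedged sketch of checking the fill factor currently defined on a table's indexes and applying a new one during a rebuild; the table and index names are again assumptions.
-- Check the fill factor currently defined on each index of a hypothetical table.
SELECT i.name, i.fill_factor
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID(N'dbo.SalesOrder');
-- Apply a new FILLFACTOR; it only takes effect when the index is created or rebuilt.
ALTER INDEX IX_SalesOrder_CustomerId ON dbo.SalesOrder
REBUILD WITH (FILLFACTOR = 90);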

2015-02-12

Catch the Lead SQL Blocker - the real culprit of system slowness

Your client makes an urgent call to you, saying that the software system is hung. You check the CPU, RAM, I/O, and network on all production servers, and everything seems healthy. You try to access the software system, but it's inaccessible. In this case, I suggest you check SQL Server. There's a free tool that I love so much called Idera SQL Check; if you see the Processes window looking just like a spider web, then you've found the cause: SQL process blocking.

Blocking scenarios are very common, and there may be a blocking chain, meaning one process (SPID) is blocking a second one, the second is blocking a third one, and so on. It's really difficult to track down the root of a blocking chain by simply running the system stored procedure sp_who2 or from the SSMS Activity Monitor when a number of SPIDs are involved. So here I wish to share a stored procedure I wrote that will help you find the root SPID and all associated details, so that you can troubleshoot further and make a prompt decision (maybe you will consider killing the lead blocker as a quick fix for the incident).

USE [master]
GO

CREATE PROC [dbo].[ViewProcessBlocking]
AS
BEGIN
SET NOCOUNT ON;

SELECT
s.spid, Blockingspid = s.blocked, DatabaseName = DB_NAME(s.dbid),
s.program_name, s.loginame, s.hostname, s.login_time, s.last_batch, s.waittime, s.waitresource,
ObjectName = OBJECT_NAME(objectid, s.dbid), Definition = CAST(text AS VARCHAR(MAX))
INTO #Processes
FROM sys.sysprocesses s
CROSS APPLY sys.dm_exec_sql_text (sql_handle)
WHERE
s.spid > 50
;
WITH Blocking(spid, Blockingspid, DB, BlockingObject, [Blocking Statement / Definition], program, login, host, login_time, last_batch, waittime_ms, waitresource, RowNo, LevelRow)
AS
(
SELECT
s.spid, s.Blockingspid, s.DatabaseName, s.ObjectName, s.Definition,
s.program_name, s.loginame, s.hostname, s.login_time, s.last_batch, s.waittime, s.waitresource,
ROW_NUMBER() OVER(ORDER BY s.spid),
0 AS LevelRow
FROM
#Processes s
JOIN #Processes s1 ON s.spid = s1.Blockingspid
WHERE
s.Blockingspid = 0
UNION ALL
SELECT
r.spid, r.Blockingspid, r.DatabaseName, r.ObjectName, r.Definition,
r.program_name, r.loginame, r.hostname, r.login_time, r.last_batch, r.waittime, r.waitresource,
d.RowNo,
d.LevelRow + 1
FROM
#Processes r
JOIN Blocking d ON r.Blockingspid = d.spid
WHERE
r.Blockingspid > 0
)
SELECT *
INTO #BlockTree
FROM Blocking
ORDER BY RowNo, LevelRow

-- Top HEAD
SELECT TOP 1 'Oldest HEAD may need to KILL' AS OldestHeadMayKill, 'KILL ' + CAST(spid AS varchar(10)) AS KillStmt,
* INTO #TopHead FROM #BlockTree WHERE LevelRow = 0 ORDER BY last_batch
SELECT * FROM #TopHead
SELECT COUNT(DISTINCT spid) AS [Total no. of affected process], CAST(MAX(waittime_ms) / 1000.0 / 60.0 AS decimal(38, 1)) AS [Longest Wait Minute(s)] FROM #BlockTree
-- All blocker(s) & blocked
SELECT DISTINCT
spid, Blockingspid, DB, BlockingObject, [Blocking Statement / Definition], program, [login], host, login_time, last_batch, waittime_ms, waitresource
FROM #BlockTree

DROP TABLE #TopHead
DROP TABLE #BlockTree
DROP TABLE #Processes

END

GO


Run this stored procedure:
EXEC master..ViewProcessBlocking
The result will be like this:

It shows you the lead blocker SPID, what SQL it's running, how many processes are being blocked, and how long the blocking has lasted. There's also a KILL statement to kill that culprit. As your blocking scenario may be caused by multiple lead blockers, you may need to kill and re-run ViewProcessBlocking several times in order to settle the incident. Bear in mind that killing the lead blocker(s) is just a quick fix; as a DBA you must investigate the problem further. You can refer to my older post on enabling the tracking events for blocked processes and deadlocks.
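For convenience, the server-level setting that drives the blocked process report can be raised like this; it is a minimal sketch of the standard sp_configure steps, and the 20-second threshold is just an example value.
-- Emit a blocked process report for any block lasting longer than 20 seconds.
-- Capture the event with a server-side trace or an Extended Events session.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold (s)', 20;
RECONFIGURE;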

2015-02-11

Why NOLOCK is a really, really bad idea

The NOLOCK / READ UNCOMMITTED hint is much more dangerous than its name suggests, and that is why people who don't understand the problem so often recommend it. It creates “incredibly hard to reproduce” bugs, the type that often destroy your end users' confidence in your product and your company.

What many people think NOLOCK is doing
Most people think the NOLOCK hint just reads rows and doesn't have to wait until others have committed their updates. If someone is updating, that is OK. If they've changed a value, then 99.999% of the time they will commit, so it's OK to read it before they commit. If they haven't changed the record yet, then it saves me waiting; it's as if my transaction happened before theirs did. But this reasoning is absolutely wrong!

The Problem
Those who know better will often point to the fact that NOLOCK allows dirty reads. Data that has not been committed can, and will, be returned. In plain terms, an insert, update, or delete that is in progress will be considered for the result set, regardless of the state of the transaction. This can mean returning rows from an insert that may later roll back, or returning rows that are in the middle of being deleted.
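Here is a hedged two-session sketch of a dirty read; the table, column, and values are purely illustrative assumptions. Run the first batch in one session, the SELECT in a second session, then roll back the first.
-- Session 1: modify a row inside an open transaction (hypothetical table dbo.Account).
BEGIN TRAN;
UPDATE dbo.Account SET Balance = Balance + 1000000 WHERE AccountId = 1;
-- ...transaction left open, not yet committed...

-- Session 2: NOLOCK happily returns the uncommitted balance.
SELECT AccountId, Balance FROM dbo.Account WITH (NOLOCK) WHERE AccountId = 1;

-- Session 1: roll back; the value Session 2 just read never officially existed.
ROLLBACK;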

What about times when a query returns rows that you know for certain are not being modified? Suppose you are updating rows for one client and need to return rows for a second client. In this case, would the use of NOLOCK be safe? The data isn't being modified for the second client, so you might assume that returning that data carries no risk of dirty reads. Unfortunately, this assumption is incorrect.

Same Data is Read Twice
There are rare occasions when the same data can be read twice when using the read-uncommitted isolation level or the NOLOCK hint. To illustrate the issue, a little background first. Clustered indexes are created on SQL Server tables to physically order the data within the table based on the cluster key. The leaf level of the index contains the data pages, which hold the actual data of the table. A data page can hold 8KB worth of data.
Scenario: you have an ETL process that extracts all records from a table, performs some type of transformation, and then loads that data into another table. There are two types of scans SQL Server uses to read data: allocation scans and range scans. Range scans occur when you have a specific filter (WHERE clause) for the data you are reading, and an index can be used to seek out those specific records. When you do not have a filter, an allocation scan is used to read all of the data pages that have been allocated to the table. Provided you are not doing any sort operations, the data pages are read in the order the scan finds them on disk.
For simplicity, assume there is no fragmentation, so your data pages are in order 1-10. So far your process has read pages 1-6. Remember, your NOLOCK process is not requesting shared (S) locks, so you are not blocking other users. Meanwhile, another process begins inserting records into your table. It attempts to insert records onto page 3, but the page is full and the records will not fit. As a result the page has to be split: half of the records remain on page 3 and the others are moved to a new page, page 11. Your process has already read the data that was on page 3, but now half of that data has been moved to page 11. As your scan continues, it will read page 11, which contains data that has already been read. If there are no checks on the destination table, you end up with bad, duplicated data.

Solution
One of the main concerns people have with locking is the blocking that occurs when two sessions try to access the same locked resource. As an alternative to using NOLOCK, try the READ_COMMITTED_SNAPSHOT / SNAPSHOT isolation levels instead. With row versioning, readers won't block writers, which reduces the amount of lock blocking on your data platform.
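As a sketch of how these options are enabled (the database name UserDB is an assumption, and both options add row-versioning overhead in tempdb):
-- Readers stop blocking writers under the default READ COMMITTED level.
ALTER DATABASE UserDB SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
-- Or allow explicit SNAPSHOT transactions.
ALTER DATABASE UserDB SET ALLOW_SNAPSHOT_ISOLATION ON;
-- Then, per session, when needed:
-- SET TRANSACTION ISOLATION LEVEL SNAPSHOT;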

Conclusion
SQL Server is a very complex enterprise database solution with many options and flags that can be changed to alter the behavior of SQL Server. Although many of these options have a justified use, it is important to understand the risks that are associated with changing these options. The read-uncommitted isolation level and nolock table hint are no exception to this rule. Generally it is best practice to stick with the default isolation level and refrain from using table/query hints unless it is absolutely necessary and the solution has been thoroughly tested. Using read-uncommitted and nolock should be the EXCEPTION and not the RULE.

2015-02-09

MSSQL 2012 server failure results in Identity gaps

In SQL Server 2012, the database engine changed its mechanism for generating identity values. Prior to SQL Server 2012, identity values were not cached in memory. SQL Server 2012 introduces an identity cache, which can leave gaps in the identity values after a server failover or unclean restart (ref.: Failover or Restart Results in Reseed of Identity). Below are the cache sizes for the different data types.
Data type    Cache size
TINYINT      10
SMALLINT     100
INT          1,000
BIGINT       10,000
NUMERIC      10,000
You can do a simple test to demonstrate how a server failure results in a gap in the identity values:
IF OBJECT_ID(N'dbo.T1' , N'U') IS NOT NULL DROP TABLE dbo.T1;
GO
CREATE TABLE dbo.T1 (keycol INT IDENTITY);
GO
INSERT INTO dbo.T1 DEFAULT VALUES;
GO 2
SELECT IDENT_CURRENT(N'dbo.T1');
The result is 2.
To force an unclean termination of the SQL Server process, open Task Manager (Ctrl+Shift+Esc), right-click the SQL Server process, and choose End task.
Next, start SQL Server again using SQL Server Configuration Manager.
Then query the current identity value again:
SELECT IDENT_CURRENT(N'dbo.T1');
The result is 1001.
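If these gaps matter to you, two commonly used workarounds are trace flag 272, which reverts to the pre-2012 behavior instance-wide, or replacing IDENTITY with a SEQUENCE declared with NO CACHE. The sketch below is illustrative; the object names are assumptions.
-- Workaround 1 (instance-wide): add -T272 as a SQL Server startup parameter
-- in SQL Server Configuration Manager.

-- Workaround 2: use a NO CACHE sequence instead of IDENTITY (hypothetical names).
CREATE SEQUENCE dbo.T1_Seq AS INT START WITH 1 INCREMENT BY 1 NO CACHE;
GO
CREATE TABLE dbo.T1_NoGap
(
    keycol INT NOT NULL DEFAULT (NEXT VALUE FOR dbo.T1_Seq)
);
GO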

2015-02-02

SEQUENCE Basics

The SEQUENCE object, introduced in SQL Server 2012, brings the ANSI SQL:2003 standard method of generating IDs. Unlike IDENTITY, which is a table property, a SEQUENCE is an independent schema-bound object, so different tables and objects can share the same SEQUENCE object. Below is the syntax of CREATE SEQUENCE:

CREATE SEQUENCE [schema_name.]sequence_name
[AS [built_in_integer_type | user-defined_integer_type]]
[START WITH constant]
[INCREMENT BY constant]
[{MINVALUE [constant]} | {NO MINVALUE}]
[{MAXVALUE [constant]} | {NO MAXVALUE}]
[CYCLE | {NO CYCLE}]
[{CACHE [constant]} | {NO CACHE}]
[;]

A sequence can be defined as any integer type; the default is bigint.
START WITH specifies the first value returned by the sequence object, which must be within the minimum and maximum values. The default is the minimum value for an ascending sequence and the maximum value for a descending sequence.
INCREMENT BY specifies the value used to increment (or decrement, if negative) the value of the sequence object for each call to NEXT VALUE FOR. If the increment is positive, the sequence is ascending; if the increment is negative, the sequence is descending. The default is 1.
MINVALUE specifies the minimum value. The default is the minimum value of the data type.
MAXVALUE specifies the maximum value. The default is the maximum value of the data type.
CYCLE specifies whether the sequence should restart from the minimum value (or maximum for descending sequence) or throw an exception when its minimum or maximum value is exceeded. The default is NO CYCLE. Note that cycling restarts from the minimum or maximum value, not from the start value.
The CACHE option is used to increase the performance of a sequence object by minimizing the number of disk IOs required to generate sequence numbers. If you specify neither CACHE nor NO CACHE when creating a sequence, SQL Server chooses a cache size for you. I am not going to explain it in depth here; further details are provided in the BOL.
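For instance, a sequence with an explicit cache of 50 values could be declared like this (a small sketch with an illustrative name):
CREATE SEQUENCE dbo.Cached_Seq
AS INT
START WITH 1
INCREMENT BY 1
CACHE 50; -- 50 values are pre-allocated at a time; an unclean shutdown
          -- can therefore leave a gap of up to 50 values.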
Let's create a simple sequence and play with it.
CREATE SEQUENCE Simple_Seq
AS INTEGER
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 9
NO CYCLE;
Then try to get values from it a few times...
SELECT NEXT VALUE FOR Simple_Seq;
GO 9
SELECT NEXT VALUE FOR Simple_Seq;
After you hit 9 and then invoke the next value, you will get this error message.
Msg 11728, Level 16, State 1, Line 1
The sequence object 'Simple_Seq' has reached its minimum or maximum value. Restart the sequence object to allow new values to be generated.
You can restart the sequence by using the ALTER SEQUENCE statement:
ALTER SEQUENCE Simple_Seq RESTART;
Also, if you specified the CYCLE option when creating the sequence, the sequence numbers will recycle automatically.
CREATE SEQUENCE Cycle_Seq
AS INTEGER
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 9
CYCLE;
SELECT NEXT VALUE FOR Cycle_Seq;
GO 9
SELECT NEXT VALUE FOR Cycle_Seq;

As I said, a SEQUENCE is an independent schema-bound object that can be shared by different tables. Below, two tables both use the same sequence as the default value of their primary key.
-- restart the sequence
ALTER SEQUENCE Cycle_Seq RESTART
GO
CREATE TABLE A (
id int DEFAULT NEXT VALUE FOR Cycle_Seq PRIMARY KEY,
n VARCHAR(50)
);
GO
CREATE TABLE B (
id int DEFAULT NEXT VALUE FOR Cycle_Seq PRIMARY KEY,
n VARCHAR(50)
);
GO
INSERT A (n) VALUES ('mouse');
INSERT B (n) VALUES ('Metal');
INSERT A (n) VALUES ('cow');
INSERT A (n) VALUES ('tiger');
INSERT B (n) VALUES ('wood');
SELECT * FROM A
SELECT * FROM B

Result:

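If you want to inspect a sequence's current state at any time, the sys.sequences catalog view exposes its current value and settings; for example:
SELECT name, current_value, start_value, increment,
       minimum_value, maximum_value, is_cycling, is_cached, cache_size
FROM sys.sequences
WHERE name = N'Cycle_Seq';
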
2015-02-01

Transparent Data Encryption (TDE)

Transparent Data Encryption (TDE) provides the ability to encrypt an entire database while keeping the encryption completely transparent to the applications that access the database. TDE encrypts the data stored in both the database's data file (.mdf) and log file (.ldf) using either Advanced Encryption Standard (AES) or Triple DES (3DES) encryption. In addition, any backups of the database are encrypted. This protects the data while it's at rest and also guards against losing sensitive information if the backup media are lost or stolen. The code below demonstrates how to enable TDE on a sample database named UserDB.
--Check if the Database Master Key is already present.
USE master;
GO
SELECT * FROM sys.symmetric_keys;
--Drop the existing Master Key.
USE master;
GO
BEGIN TRY
DROP MASTER KEY;
END TRY
BEGIN CATCH
PRINT 'Master Key does not exist.';
END CATCH
GO
--Create Master Key in master database.
USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'master key password';
GO
--Create Server Certificate in the master database.
USE master;
GO
OPEN MASTER KEY DECRYPTION BY PASSWORD = 'master key password';
CREATE CERTIFICATE SQL_TDE_CERT WITH SUBJECT = 'SQL TDE CERT';
GO
--Create User Database Encryption Key.
USE UserDB;
GO
CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_128 ENCRYPTION BY SERVER CERTIFICATE SQL_TDE_CERT;
GO
--Check Database Encryption Key created.
--The tempdb system database will also be encrypted
--if any other database on the instance of SQL Server is encrypted by using TDE.
SELECT DB_NAME(database_id) AS database_name, * FROM sys.dm_database_encryption_keys;
--Enabling Transparent Database Encryption for the User Database.
USE master;
GO
ALTER DATABASE UserDB SET ENCRYPTION ON;
GO
--Check User Database is encrypted.
SELECT name, is_encrypted FROM sys.databases;
--Backup Master Key to file.
USE master;
GO
OPEN MASTER KEY DECRYPTION BY PASSWORD = 'master key password';
BACKUP MASTER KEY TO FILE = 'D:\MSSQL_TDE_KEYS\MasterKey.mtk' ENCRYPTION BY PASSWORD = 'master key password';
GO
--Backup Server Certificate.
USE master;
GO
BACKUP CERTIFICATE SQL_TDE_CERT TO FILE = 'D:\MSSQL_TDE_KEYS\ServerCert.cer'
WITH PRIVATE KEY ( FILE = 'D:\MSSQL_TDE_KEYS\PrivateKey.pvk', ENCRYPTION BY PASSWORD = 'master key password');
GO
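To restore an encrypted database or one of its backups on another server, the certificate and private key backed up above must first be recreated there. A hedged sketch, assuming the same passwords and file paths as in the script above:
--On the target server: create a master key (if one does not exist yet),
--then restore the certificate from the backed-up files.
USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'master key password';
GO
CREATE CERTIFICATE SQL_TDE_CERT
FROM FILE = 'D:\MSSQL_TDE_KEYS\ServerCert.cer'
WITH PRIVATE KEY ( FILE = 'D:\MSSQL_TDE_KEYS\PrivateKey.pvk', DECRYPTION BY PASSWORD = 'master key password');
GO
--After this, RESTORE DATABASE UserDB ... can proceed as usual.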