2015-03-27

Transaction Log Disk Full Due to Transactional Replication Not Catching Up

Problem Description:
When a bulk data operation (e.g. a bulk insert/update, index creation or rebuild, or a program bug that updates a huge number of rows) runs on a database that publishes transactional replication, the Log Reader Agent may not be able to catch up with it. The transaction log of the affected database cannot be truncated, even when log backups are taken, until the Log Reader Agent has processed the pending changes. (Ref. http://support.microsoft.com/kb/317375 - section: Unreplicated transactions).


Emergency Resolution:
1. Assign extra transaction log space from another disk (if available) to buy time;
2. Stop the Log Reader Agent for the DB in Management Studio: right-click the affected publication, select View Log Reader Agent Status, then click Stop;
3. Clear the pending replication commands in the transaction log:
EXEC DB..sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1;
4. Drop all publications in DB:
EXEC DB..sp_droppublication 'all';
5. Disable the replication publisher option on DB:
EXEC sp_replicationdboption 'DB', 'publish', 'false';
6. Check that transaction log truncation for DB is no longer blocked by 'REPLICATION':
SELECT name AS DB, log_reuse_wait_desc FROM master.sys.databases;
7. Back up the transaction log again to free space;
8. Use DBCC SHRINKFILE to shrink the transaction log file if required (see the sketch below).
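
A minimal sketch of steps 7 and 8, assuming the affected database is named DB, its logical transaction log file name is DB_log, and using a hypothetical backup path:

BACKUP LOG DB TO DISK = N'X:\Backup\DB_log.trn'  -- free the inactive log space
GO
USE DB
GO
DBCC SHRINKFILE (DB_log, 1024)  -- shrink the log file to a 1024 MB target (adjust as needed)
GO

If you are not sure of the logical log file name, you can find it in sys.database_files (type_desc = 'LOG').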

2015-03-25

You reduced a column size, but the table gets bigger, why?

Choosing correct data types can decrease the row size and also improve performance. Sometimes you give a column more space than it needs during table creation, and after the database has been running in production for a while, you want to reduce the defined size of that column to something that still fits your system requirements. For example, you defined a column as fixed-length nchar(50), but later you find that variable-length nvarchar(10) is good enough.

For example, you created the table like this:
CREATE TABLE TestTbl (
    id int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    col nchar(50) NULL
)
GO

The initial table size is, of course, zero.
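
One quick way to check the table size is the built-in sp_spaceused procedure:

EXEC sp_spaceused 'TestTbl'  -- reports rows, reserved, data, index_size and unused space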


In order to simulate the production usage of this table, let's populate it with 100000 rows.
DECLARE @i int = 1
WHILE @i <= 100000
BEGIN
    INSERT TestTbl (col) VALUES (N'XXX')
    SET @i += 1
END
GO

The table is populated with data.


Having identified that this column can use a smaller defined size, you alter the column definition as below:
ALTER TABLE TestTbl ALTER COLUMN col nvarchar(10)

Now the defined size of the column is reduced, from fixed-length nchar(50) to variable-length nvarchar(10). But when you check the table size, surprisingly, it has become bigger than before.


Why? Unfortunately, one thing is not commonly known: altering a table never decreases the row size. When you drop a column, SQL Server removes the column from the metadata but does not reclaim the space or rebuild the rows. When you change a column type from int to tinyint, for example, the actual storage size remains intact; SQL Server just validates new values on insert and update. When you increase the size of a column (for example, from int to bigint), SQL Server creates another bigint column and keeps the old int column's space intact.
So how do you fix it? You need to rebuild the clustered index. SQL Server reclaims the space and rebuilds the rows when you do that. Keep in mind that a clustered index rebuild is a time-consuming operation that locks the table, so you should only do it during your system maintenance window.
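
The auto-generated primary key name will be different on your system, so you may want to look up the clustered index name first (a small helper query; index_id = 1 is always the clustered index on a clustered table):

SELECT name
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.TestTbl')
    AND index_id = 1  -- the clustered index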

Let's try to rebuild the clustered index:
ALTER INDEX [PK__TestTbl__3213E83F6E978ECC] ON TestTbl REBUILD

Now the table size is finally reduced!


The truth is, you should always carefully choose the type and length of every table column during the design phase.

2015-03-23

SQL Server 32-bit or 64-bit version?

You can use two different commands to check whether your SQL Server installation is 32-bit or 64-bit.

@@VERSION
It returns system and build information for the current installation of SQL Server.

SELECT @@VERSION

Sample result:
Microsoft SQL Server 2012 - 11.0.5058.0 (X64)
May 14 2014 18:34:29
Copyright (c) Microsoft Corporation
Express Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack 1)

In the first line of the result, (X86) indicates 32-bit and (X64) indicates 64-bit. The same applies to the operating system information shown on the last line.

SERVERPROPERTY('Edition')
The SERVERPROPERTY system function returns property information about the server instance. The 'Edition' property is the product edition of the instance of SQL Server. 64-bit versions of the Database Engine append (64-bit) to the edition name.

SELECT SERVERPROPERTY('Edition')

Sample result:
Express Edition (64-bit)
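
You can also query several documented server properties together in one statement, for example:

SELECT
    SERVERPROPERTY('ProductVersion') AS ProductVersion,
    SERVERPROPERTY('ProductLevel') AS ProductLevel,
    SERVERPROPERTY('Edition') AS Edition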

Hope this helps.

2015-03-18

Repairing a Corrupted Database

One of the most common reasons behind database corruption is a conflict with third-party software. A virus attack or a software bug can also corrupt the files. A hardware fault in your system or a crash of your hard disk drive (HDD) may cause the database files to become corrupt. Storing the database files in a compressed volume or folder can also lead to corruption, so you should avoid storing SQL Server database files in compressed volumes or folders.

Consider this scenario:
You have been working on a SQL Server database for the last few days. One day you find that the database status is tagged as suspect, which means the database file is corrupted, or you have problems connecting to the database. So how do you fix it?
According to the Microsoft KB article "How to troubleshoot database consistency errors reported by DBCC CHECKDB", the best solution for database consistency errors is to restore from a known good backup. However, if you cannot restore from a backup, you can try DBCC CHECKDB to repair the errors. Below are the steps:

1. DBCC CHECKDB(DBName) WITH NO_INFOMSGS
DBName is the name of your corrupted database. If this completes without any errors, then the database does not need to be repaired.

2. ALTER DATABASE DBName SET SINGLE_USER WITH ROLLBACK IMMEDIATE
The database must be set to single-user mode before repairing it.

3. DBCC CHECKDB(DBName, REPAIR_REBUILD)
There are several repair options; you should first try REPAIR_REBUILD, which involves no data loss. If it succeeds, go to step 5.e (multi-user mode); if not, go to the next step.

4. DBCC CHECKDB(DBName, REPAIR_ALLOW_DATA_LOSS)
This may cause data loss. If it succeeds, go to step 5.e (multi-user mode); if not, go to the next step.

5.
a. ALTER DATABASE DBName SET EMERGENCY
b. ALTER DATABASE DBName SET SINGLE_USER
c. DBCC CHECKDB (DBName, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS
d. ALTER DATABASE DBName SET ONLINE
e. ALTER DATABASE DBName SET MULTI_USER
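
Before and after running these steps, you can confirm the database state and access mode with a quick query against sys.databases (DBName is a placeholder for your database name):

SELECT name, state_desc, user_access_desc
FROM sys.databases
WHERE name = 'DBName'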

However, if you have tried all the steps above but are still unable to repair the database corruption, you can try a more powerful third-party SQL database recovery tool such as Stellar Phoenix SQL Database Repair.

2015-03-08

How to reduce the database size - Part 3

In the second part of this series, I explained how to use SQL Server data compression to reduce your database size. Here I explain how index tuning can help you to reduce your database size too.

Removing Unused Indexes

The downside of having too many indexes is that SQL Server has to maintain all of them, which can hurt DML performance, and the indexes also require additional storage. The script below can be used to identify possibly unused indexes in the current database.
select
    object_name(i.object_id) as ObjectName,
    i.name as [Unused Index],
    i.is_unique,
    i.is_primary_key,
    CASE i.index_id WHEN 0 THEN 'Heap' WHEN 1 THEN 'Clustered' ELSE 'Non-clustered' END as index_type,
    s.user_updates,
    SUM(p.[used_page_count]) * 8 / 1024 AS indexSizeMB,
    STATS_DATE(i.object_id, i.index_id) AS statistics_update_date
from sys.indexes i
    left join sys.dm_db_index_usage_stats s
        on s.object_id = i.object_id and i.index_id = s.index_id and s.database_id = db_id()
    join sys.dm_db_partition_stats p
        on i.object_id = p.object_id and i.index_id = p.index_id
where objectproperty(i.object_id, 'IsIndexable') = 1
    and objectproperty(i.object_id, 'IsIndexed') = 1
    and (
        s.index_id is null -- dm_db_index_usage_stats has no reference to this index
        or (s.user_updates > 0 and s.user_seeks = 0 and s.user_scans = 0 and s.user_lookups = 0)
           -- index is being updated, but not used by seeks/scans/lookups
    )
group by i.object_id, i.index_id, i.name, s.user_updates, i.is_unique, i.is_primary_key
order by indexSizeMB desc
Unlike most scripts, which I usually suggest running on a test server, this one should be RUN ON THE PRODUCTION server, because only the production server's usage statistics are accurate. Bear in mind that these statistics are reset when the SQL Server service is restarted, and in SQL Server 2012 and 2014 they are also reset when the index is rebuilt. The figure below illustrates the output of this script:

As you can see, the output includes the table name, index name, whether it is a unique/primary key index, whether it is clustered or non-clustered, the index size in MB, and the last statistics update time (remember that an index rebuild also updates statistics). A unique index may be enforcing a unique constraint, so pay attention that removing it would not violate the business requirements of the system. From the output of this unused-index script, if you find an index that is large, is non-clustered (the clustered index is the table itself, and dropping a clustered index forces all non-clustered indexes to be rebuilt, so think very carefully before dropping one), and was last rebuilt a long time ago, you can consider dropping that index. Keep in mind that you always need to test your system carefully after dropping any index, making sure there is no performance impact.
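
Once you have confirmed an index is safe to remove, dropping it is a single statement (the table and index names below are hypothetical):

DROP INDEX IX_UnusedIndex ON dbo.SomeTable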

2015-03-03

How to reduce the database size - Part 2

In the first part of this series, I explained how to find out the most space consuming objects and how to reduce the wasted space by defragmenting indexes with low page density. Here I explain how to implement data compression in SQL Server to further reduce your data size.

DATA COMPRESSION

If you have the Enterprise Edition of SQL Server, you can reduce the data size by implementing data compression. Data compression reduces storage costs and can increase query performance by reducing I/O and increasing buffer-hit rates. SQL Server supports two types of data compression for both tables and indexes: ROW and PAGE.

ROW compression addresses the storage inefficiency introduced by fixed-length data types. By default, in a non-compressed row, the size of fixed-length data is determined by the data type, so a CHAR(100) column always uses 100 bytes regardless of the value, even when it is NULL. ROW compression removes that overhead: the CHAR(100) value 'SQL Server 2012' would use just 15 bytes (plus 2 offset bytes) rather than 100 bytes. ROW compression also takes no disk space for zero or NULL values.

PAGE compression is a superset of ROW compression and takes into account redundant data in one or more rows on a given page. It also uses prefix and dictionary compression. Put simply, the storage engine cuts down on repeated data bytes in the page by storing the repeated bytes once on the page and then referring to that value from all other occurrences on the same page. PAGE compression only occurs when the page is full, to optimize performance.

Keep in mind that data compression works with IN_ROW allocation units only; it does not compress LOB or ROW_OVERFLOW data.

While it may seem obvious that data compression will reduce the size of your tables or indexes, you should first evaluate the estimated space savings for a table or index by using either the sp_estimate_data_compression_savings system stored procedure or the Data Compression Wizard. The estimation works by copying a sample of your data into tempdb, compressing it, and measuring the results. An intent shared (IS) lock is acquired on the table during this operation. As a result, I suggest running the estimation on a test server rather than directly on the production server. You should also check whether the data is fragmented before considering compression, as you might be able to reduce the size of the index simply by rebuilding it instead of compressing it.
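
For example, the estimated savings for ROW compression on a whole table can be obtained as follows (a minimal sketch with a hypothetical table name; NULL for @index_id and @partition_number covers all indexes and partitions):

EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'SomeTable',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'ROW'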

To use the Data Compression Wizard, open SQL Server Management Studio, right-click on the table you need to estimate, select Storage and click Manage Compression.

This will launch the Data Compression Wizard. On the Welcome page, click Next.
On the Select Compression Type page, use the drop-down in the Compression Type column to select the compression type, then click the Calculate button to display the estimated compressed size. You can select another compression type and click Calculate again to compare the results of the different compression types.

After you have decided to turn on compression for this table, click the Next button. On the Select an Output Option page, you can create a script that can be used later to apply the selected compression to that table.

Because the compression process can be very CPU and disk I/O intensive, you should only apply it on the production server during a maintenance window. Compressing a table (its heap or clustered index) does NOT mean that the non-clustered indexes of the table are automatically compressed for you; compression is applied per object (per index, and even per partition). If you want to compress the non-clustered indexes of the table, you have to compress each of them, as shown in the example below.
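
Compressing the table itself (its heap or clustered index) and compressing one of its non-clustered indexes are two separate statements (the names below are hypothetical):

ALTER TABLE dbo.SomeTable REBUILD WITH (DATA_COMPRESSION = PAGE)
ALTER INDEX IX_SomeIndex ON dbo.SomeTable REBUILD WITH (DATA_COMPRESSION = ROW)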

Compression and decompression add CPU load to the system, but they reduce the I/O load. The CPU overhead is relatively small for ROW compression, especially when reading data; for PAGE compression the overhead is more significant. Queries can even execute faster, especially on systems that are not heavily CPU bound, but batch inserts, batch updates, and index maintenance can take more time when the data is compressed. In conclusion, if the system is not heavily CPU bound, it is a good idea to implement ROW compression on indexes with volatile data (values that change frequently). PAGE compression, on the other hand, can be a good choice for old, static data, especially when that data is accessed infrequently. Partitioning helps you apply different compression schemes to different table partitions and allows you to reduce index maintenance overhead by rebuilding indexes at the partition level.