2020-12-07

Replication Error – Cannot execute as the database principal because the principal “dbo” does not exist

Today I was called by the operations team: a drive on a server running SQL Server was about to run out of free space. I found a database whose transaction log was much bigger than it should be. The scheduled transaction log backup was fine (I'm using Ola Hallengren's maintenance solution), and no active transaction preventing log truncation was reported by the DBCC OPENTRAN command run in the context of that database. However, DBCC OPENTRAN told me that the database has a transactional replication publication configured. I then queried the log_reuse_wait_desc column of the sys.databases system catalog view, which told me that the reuse of transaction log space was waiting on REPLICATION. So the replication log reader agent had run into a problem. The screenshot below shows the error message of the log reader agent, found in Replication Monitor > right-click the Log Reader Agent > View Details:
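The checks described above can be sketched in T-SQL as follows (PublicationDB is a placeholder for the affected database name):

```sql
USE PublicationDB;
GO
-- Look for an open transaction holding up log truncation;
-- this also reports any pending replicated transaction
DBCC OPENTRAN;
GO
-- Check what transaction log reuse is waiting on
SELECT [name], log_reuse_wait_desc
FROM sys.databases
WHERE [name] = N'PublicationDB';
```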

The error message was "Cannot execute as the database principal because the principal “dbo” does not exist, this type of principal cannot be impersonated, or you do not have permission". This message is quite misleading; the root cause is that the database owner of the publication database is invalid, which mostly happens when the database was created by a login that was later removed. The easiest solution is to set sa as the database owner. Even if the sa login is disabled or the instance allows Windows authentication only, sa is still a valid database owner. It can be set on the Files page of the Database Properties dialog:
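If you prefer T-SQL over the GUI, the same fix can be applied with ALTER AUTHORIZATION (PublicationDB is a placeholder name):

```sql
-- Make sa the owner of the publication database,
-- replacing the orphaned/invalid owner
ALTER AUTHORIZATION ON DATABASE::[PublicationDB] TO sa;
```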
Now the replication resumed running normally. But if there are too many undistributed replication commands, dropping and re-creating the publication and subscription may be a faster alternative, and that's why you should always remember to generate creation scripts for all new and changed publications.

2020-11-09

Checking Availability Group synchronization performance

One of the most important aspects of a successful SQL Server Availability Group deployment is that the synchronization speed of the secondary replicas with the primary replica fulfills your production application's performance requirements. Your application will be slowed down if a secondary replica in synchronous-commit mode is lagging behind the primary replica. Also, if any secondary replica falls far behind the primary replica, transaction log reuse will be hindered, which makes the transaction log file keep growing until the disk is full and the database becomes non-updatable. You can check the transaction log reuse wait by querying the log_reuse_wait_desc column of the sys.databases catalog view.

The easiest way to check the status of an AG is through the built-in dashboard in SSMS. You can open it by expanding the AlwaysOn High Availability folder in SSMS Object Explorer > Availability Groups > right-click the AG > Show Dashboard. The default layout of the dashboard doesn't provide a lot of detail, but you can add more columns through the Add/Remove Columns link on the dashboard, as shown below:

The descriptions of these columns can be found in the online documentation for the sys.dm_hadr_database_replica_states DMV. Below are some useful columns that I always add:

  • log_send_queue_size: Amount of log records of the primary database that have not been sent to the secondary databases, in kilobytes (KB).
  • log_send_rate: Average rate at which the primary replica instance sent data during the last active period, in kilobytes (KB) per second.
  • redo_queue_size: Amount of log records in the log files of the secondary replica that have not yet been redone, in kilobytes (KB).
  • redo_rate: Average rate at which the log records are being redone on a given secondary database, in kilobytes (KB) per second.
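The same columns can also be queried directly from the DMV, which is handy if you want to log the figures over time. A minimal sketch:

```sql
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.log_send_queue_size,
       drs.log_send_rate,
       drs.redo_queue_size,
       drs.redo_rate
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
    ON drs.replica_id = ar.replica_id;
```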

2020-10-11

Cross Domain SQL Server Replication

Setting up replication with the publisher in one domain and the subscriber in another domain is tricky. This blog post discusses the issues encountered and the steps to overcome them.

When you try to add a subscription by specifying the FQDN of the subscriber server, you will get the error "Sql Server replication requires the actual server name to make a connection to the server", as shown below.

As you can see in the SSMS Object Explorer in the picture above, the subscriber server can actually be reached from the publisher server by its FQDN, but the subscription wizard refuses to create the subscription. To overcome this, you need to create a SQL Server alias in SQL Server Configuration Manager. An alias provides an alternate name that can be used to connect to the target SQL Server. Note that SSMS runs as a 32-bit process, while most SQL Server installations nowadays are 64-bit; the replication agent processes are launched from SQL Server Agent, so they run in 64-bit. Therefore we need to set up the alias in both the 32-bit and the 64-bit SQL Native Client configuration nodes.

After the server aliases are defined, you should be able to add the subscriber by specifying the server alias.


2020-09-02

How to get the service account by using T-SQL

Suppose you have the sysadmin privilege on a SQL Server instance and, for some reason, need to check the service accounts running the database engine, agent, and other services. As we all know, you can get the answer from SQL Server Configuration Manager, but that requires logging in to the Windows OS on the database server first, which is a little inconvenient. It may even be the system engineering team, not you, who maintains the Windows server, and they may not let you log in to the OS. Since SQL Server 2008 R2 SP1 we have a documented DMV, sys.dm_server_services, which gives us information about the SQL Server, Full-Text, SQL Server Launchpad, and SQL Server Agent services on the current instance of SQL Server.
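A minimal query against this DMV looks like:

```sql
SELECT servicename,
       service_account,
       startup_type_desc,
       status_desc,
       last_startup_time
FROM sys.dm_server_services;
```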

As you can see in the result, this DMV also returns the startup type, running status, and other useful information about the services.

2020-08-08

SQL Server Always Encrypted Part 2

In part one, I demonstrated how to create a table with Always Encrypted columns. Now let's create a stored procedure to insert a row into it, another stored procedure to query a row from it, and a C# .NET console application to call these stored procedures. You will see that even if you have the sysadmin privilege, you cannot simply run a query or call the stored procedures to access the encrypted data.

Let's create the column master key and column encryption key using SSMS GUI:

In the user database, expand Security > Always Encrypted Keys > right-click Column Master Key > New > type the Name > click Generate Certificate.

Then right-click Column Encryption Key > New > type the Name > select the master key.

Let's create the table:

CREATE TABLE TestEncryptTable (
    id int IDENTITY(1, 1) PRIMARY KEY,
    encryptedLookupCol varchar(11) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (
        ENCRYPTION_TYPE = DETERMINISTIC,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = TestColumnEncryptKey) NOT NULL,
    encryptedValueOnlyCol varchar(11) COLLATE Latin1_General_BIN2 ENCRYPTED WITH (
        ENCRYPTION_TYPE = RANDOMIZED,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = TestColumnEncryptKey) NOT NULL
);

Here are the stored procedures that my application will call to add and get row from the always encrypted table:

USE TestDB
GO
CREATE OR ALTER PROC InsertEncryptRow (
    @encryptedLookupCol varchar(11),
    @encryptedValueOnlyCol varchar(11)
) AS
BEGIN
    INSERT TestEncryptTable (encryptedLookupCol, encryptedValueOnlyCol)
        VALUES (@encryptedLookupCol, @encryptedValueOnlyCol);
END
GO
CREATE OR ALTER PROC GetEncryptRow (
    @id int
) AS
BEGIN
    SELECT * FROM TestEncryptTable WHERE id = @id;
END

In the C# application, the database connection string must specify a new option for enabling Always Encrypted: the 'Column Encryption Setting = Enabled' option.

using System;
using System.Data;
using System.Data.SqlClient;

namespace TestAlwaysEncrypted
{
    public class Class1
    {
        public static string CS = "Data Source=.;Initial Catalog=TestDB;Column Encryption Setting=Enabled;Integrated Security=True";
        static void Main(string[] args)
        {
            string a1 = args[0];
            string a2 = args[1];
            SqlConnection conn = new SqlConnection(CS);
            using (conn)
            {
                conn.Open();
                string cmd = "InsertEncryptRow";
                SqlCommand sqlCmd = new SqlCommand(cmd, conn);
                sqlCmd.CommandType = CommandType.StoredProcedure;
                // specify the size to match the varchar(11) column definition,
                // so the driver sends the exact parameter metadata expected
                sqlCmd.Parameters.Add("@encryptedLookupCol", SqlDbType.VarChar, 11);
                sqlCmd.Parameters["@encryptedLookupCol"].Value = a1;
                sqlCmd.Parameters.Add("@encryptedValueOnlyCol", SqlDbType.VarChar, 11);
                sqlCmd.Parameters["@encryptedValueOnlyCol"].Value = a2;
                sqlCmd.ExecuteNonQuery();
            }
        }
    }
}

Execute the .NET project a few times and you can see the table has some new rows, but you cannot read the encrypted data even if you are a sysadmin of your SQL Server:

In order to see the data in SSMS, specify the Column Encryption Setting when connecting to the SQL instance, through the 'Additional Connection Parameters' tab:

Run the select query again, and you can see the data now.
You can also verify that the data transmitted from the client application to your SQL Server really is encrypted, by using SQL Profiler to trace the stored procedure execution:
The .NET library automatically calls a SQL Server built-in stored procedure, sp_describe_parameter_encryption, so it knows which parameters need to be encrypted.

2020-07-10

SQL Server 2019 great new feature - Accelerated Database Recovery (ADR)

In this month's blog post I would like to introduce a great new feature of SQL Server 2019, Accelerated Database Recovery (ADR), and see how fast a rollback can be with ADR enabled. This experiment uses the well-known WideWorldImporters sample database.
First, let's measure the update and rollback time for the following update, which involves 236667 rows:
USE WideWorldImporters;
SELECT COUNT(*) FROM Warehouse.StockItemTransactions;
BEGIN TRAN
DECLARE @updateStartTime datetime = GETDATE();
UPDATE Warehouse.StockItemTransactions SET SupplierID = 1;
DECLARE @updateEndTime datetime = GETDATE();
ROLLBACK
DECLARE @rollbackEndTime datetime = GETDATE();
SELECT DATEDIFF(millisecond, @updateStartTime, @updateEndTime) updateTime, DATEDIFF(millisecond, @updateEndTime, @rollbackEndTime) rollbackTime;

Without ADR, it takes more than 12 seconds to roll back the update statement.
Enable ADR on the WideWorldImporters database by executing the ALTER DATABASE statement below:
USE master
GO
ALTER DATABASE WideWorldImporters SET ACCELERATED_DATABASE_RECOVERY = ON;

You can verify that ADR is really enabled by checking the server error log:
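Alternatively, you can check the is_accelerated_database_recovery_on flag that sys.databases exposes in SQL Server 2019:

```sql
SELECT [name], is_accelerated_database_recovery_on
FROM sys.databases
WHERE [name] = N'WideWorldImporters';
```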


Run the update statement again, and you will see the rollback is instantaneous!


Now we SQL Server database administrators have a great reason to convince our boss to pay for a software upgrade :D

2020-06-01

Cannot shrink log file because the logical log file located at the end of the file is in use

As a database administrator, I seldom shrink my databases, as shrinking causes index fragmentation in the data file. But there are exceptions, such as a log backup job failing due to a full disk, and that was the case I faced this morning. The database belongs to an availability group, and the daily log backup job, set to run on a secondary replica, had been failing since Saturday. The transaction log file of the database grew until the disk it resides on was nearly out of free space. I manually ran a log backup first. Then I tried to run DBCC SHRINKFILE to shrink the log file, and got the following error message:

Cannot shrink log file because the logical log file located at the end of the file is in use.

This error is quite common, especially if your database has transactional replication publications and the log reader agent cannot keep up, but that wasn't the case this time. On a standalone or FCI SQL Server instance, as a quick fix, I would set the database to the SIMPLE recovery model, shrink the log file, then set it back to FULL recovery and take a full backup followed by a log backup. But this database is in an availability group, and we would have to remove the database from its availability group before setting it to SIMPLE recovery. So I troubleshot the case further, as below.
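For reference, the quick fix for a standalone or FCI instance mentioned above can be sketched as below (again, not applicable to a database in an availability group; UserDB and UserDB_log are placeholder names):

```sql
ALTER DATABASE [UserDB] SET RECOVERY SIMPLE;
GO
USE [UserDB];
GO
DBCC SHRINKFILE (N'UserDB_log', 1024);  -- target size in MB
GO
ALTER DATABASE [UserDB] SET RECOVERY FULL;
GO
-- Then take a full backup followed by a log backup
-- to re-establish the log backup chain
```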

1. What is log reuse waiting on? We can check by querying the sys.databases catalog view, whose log_reuse_wait and log_reuse_wait_desc columns tell you the reason.
USE [UserDB]
GO
DBCC LOGINFO; -- status 2 means the VLF is still in use
GO
SELECT [name], log_reuse_wait, log_reuse_wait_desc FROM sys.databases;
GO


2. I found the wait value was 13 (OLDEST_PAGE), which is caused by indirect checkpoint (the default for databases created in SQL Server 2016 and later), so I turned it off. Then the log shrink succeeded.
USE master
GO
ALTER DATABASE [UserDB] SET TARGET_RECOVERY_TIME = 0 SECONDS WITH NO_WAIT;
GO
USE [UserDB]
GO
DBCC SHRINKFILE (N'UserDB_log' , 1024);
GO


Learning never stops for a SQL Server database administrator :)

2020-05-07

Running Query in SQLCMD mode in SSMS

Today I would like to talk about another easily overlooked but very useful feature of SQL Server Management Studio: running queries in SQLCMD mode. This mode allows you to embed SQLCMD commands into your T-SQL script and execute it from SSMS. The most powerful use of this feature is that you can run one script against multiple server instances, using the :CONNECT SQLCMD command to connect to the different instances. For example, if you need to check the synchronization state of all databases on all production server instances, rather than connecting to the server instances one by one in SSMS and checking each Availability Group dashboard, it is much more comfortable to execute a single script file like the one below:
:CONNECT (local)\TARGETSVR1
SELECT R.replica_server_name, D.synchronization_state_desc, B.[name] AS DatabaseName FROM sys.dm_hadr_availability_replica_cluster_states R JOIN sys.dm_hadr_database_replica_states D ON R.replica_id = D.replica_id JOIN sys.databases B ON D.database_id = B.database_id WHERE replica_server_name = @@SERVERNAME;
GO
:CONNECT (local)\TARGETSVR2
SELECT R.replica_server_name, D.synchronization_state_desc, B.[name] AS DatabaseName FROM sys.dm_hadr_availability_replica_cluster_states R JOIN sys.dm_hadr_database_replica_states D ON R.replica_id = D.replica_id JOIN sys.databases B ON D.database_id = B.database_id WHERE replica_server_name = @@SERVERNAME;
GO

Before running the above script, you should enable SQLCMD mode for the open query editor:
Remember that you must put a GO command before each :CONNECT command, in order to separate the batches to be executed on the different server instances; otherwise all batches will be executed on the instance named in the last :CONNECT command in your script, which would be a mistake, e.g.

2020-04-08

Policy Based Management

In this month's blog post I would like to introduce another overlooked SQL Server built-in feature, Policy-Based Management (PBM), which is very useful indeed, especially if you are a server/infrastructure DBA in a big corporation who needs to administer many SQL Server instances. This feature helps us evaluate and enforce policies on database servers, such as naming conventions, file locations, and many configuration settings on different objects including servers, databases, files, logins, tables, etc. You can create one set of policies and apply the same set to multiple server instances. Here I will demonstrate how to create policies that check whether the data files and log files of user databases are placed on the intended disk drives, and another policy that enforces read committed snapshot isolation (RCSI) on databases.

1. In SSMS Object Explorer, expand Management > Policy Management > right-click Conditions > New Condition. Create four conditions:
Name: Not System Databases. Facet: Database. Expression: @IsSystemObject = False
Name: Data File in M drive. Facet: Data File. Expression: @FileName LIKE 'M:%'
Name: Log File in N drive. Facet: Log File. Expression: @FileName LIKE 'N:%'
Name: Database RCSI enable. Facet: Database. Expression: @IsReadCommittedSnapshotOn = True


2. Right-click Policies > New Policy. Create three policies:
Name: Data Files Location. Check condition: Data File in M drive. Against targets: Every File, in Every FileGroup, in Not System Databases Database.
Name: Log Files Location. Check condition: Log File in N drive. Against targets: Every LogFile, in Not System Databases Database.
Name: Database RCSI enable. Check condition: Database RCSI enable. Against targets: Not System Databases Database.
As you can see, a PBM condition can be used as a check condition, and also as a filtering condition on the target.


3. Now that the policies are created, let's evaluate them on the local server instance. Right-click the Policies folder in Object Explorer > Evaluate > tick your tailor-made policies > press the Evaluate button.

4. In the result, you can see which targets are violating your policies, and view the details.

5. For the Database RCSI enable policy, you can also apply the policy on the target, which essentially sets the database RCSI option on, i.e. it runs ALTER DATABASE [UserDB] SET READ_COMMITTED_SNAPSHOT ON for you.

6. You can also create all policies on a central server instance and evaluate those centralized policies on other server instances. In the Evaluate Policies window, you can choose the Source, where you can select the central server.

PBM is easy to use. You don't even need to write a single line of code to create your own set of custom conditions and policies.

2020-03-06

SQL Agent Multi-Server Administration

Let's say you are a SQL Server database administrator working for a big corporation that has hundreds of server instances. You have a bunch of agent jobs, like Ola Hallengren's maintenance solution, that you want to deploy on all the server instances in your company, and you would like to manage those jobs across all the instances from a centralized master. Many database administrators believe this can only be done with third-party management tools. In fact, the SQL Server Agent service has a built-in feature named Multi-Server Administration (a.k.a. MSX/TSX) that is overlooked by most of us. In this blog post I would like to demonstrate how to use this feature to deploy and manage agent jobs on multiple target servers. For the sake of simplicity, I set up three server instances on my local PC: the default instance will be the master server (MSX), while the other two instances, named TargetSvr1 and TargetSvr2, will be the target servers (TSX). Let's see how to do it.

In the MSX, right-click SQL Server Agent >  Multi Server Administration > Make this a Master.

Click Next in the welcome page, specify an operator email if you like, or just click Next to skip it.

Add the target servers, then click Next to continue.

It will check the server compatibility, click Close once all targets passed the checking.

Let the wizard create the login on your master server if necessary. Next.

Review the configuration. Click Finish.

Oops! There's an error.


In order to overcome it, for simplicity, modify the registry value stated in the online documentation (MsxEncryptChannelOptions) on each of your target servers, setting it to zero.

Repeat the steps above. This time it should succeed.

Now we can try to create a multi-server job. In order to prove that the job really executes on all the target servers, let's first create a simple table in a testing database on all the target servers.
USE master
GO
CREATE DATABASE TestingDB
GO
USE TestingDB
GO
CREATE TABLE TestTable (col varchar(50));
GO


Refresh the SSMS Object Explorer: SQL Server Agent on the master server becomes MSX, and those on the target servers become TSX.

And there's a new folder, Multi-Server Jobs, inside the master server. Let's create a job there that inserts a row into the testing table. Here's the job body:
DECLARE @s varchar(50) = CAST(GETDATE() AS varchar(50));
INSERT TestingDB..TestTable VALUES (@s);


In the target tab of the new job window, specify the target servers of this job.

The default polling interval of multi-server administration is one minute, which means jobs will be downloaded and executed on the target servers about one minute after the deployment and job start on the master.

After one minute, refresh the object explorer, you will see the job is deployed in all targets.

Let's execute the job on the master server.

After one minute, run a multi-server query in your targets, you should see the new row exists in all target servers.

You can also check the job history in the master server.

Please be reminded that MSX/TSX doesn't guarantee the job executions are at the same time among the targets.

2020-02-15

How to make a Table Non-Updatable

It's a common business requirement that some data must be unchangeable once it has entered the database, such as payments, bank transfers, bets, etc. Unfortunately, SQL Server has no built-in declarative support for immutable columns, but we database administrators and SQL developers can build it ourselves by writing triggers that roll back or skip the update operations. To tighten it up, we can obscure the body text of the trigger using the WITH ENCRYPTION option in the CREATE TRIGGER statement. Furthermore, we can create a SQL Server Audit on the SCHEMA_OBJECT_CHANGE_GROUP event class, in order to trace any alter/drop/disable operations on the triggers. Let's see a simple demo:

USE TestDB
GO
-- Create a demo Table, with one primary key column, one ordinary column, and one immutable column
CREATE TABLE DemoTable (pk int PRIMARY KEY, col1 varchar(50), immutableCol varchar(50));
GO

-- Create an encrypted trigger to prevent update on the immutable column
CREATE TRIGGER tr_DemoTable_Upd ON DemoTable
WITH ENCRYPTION
AFTER UPDATE
AS
BEGIN
SET NOCOUNT ON;
IF UPDATE(immutableCol) BEGIN ROLLBACK; RETURN; END
END
GO

-- Create a server audit object
USE master
GO
CREATE SERVER AUDIT Audit_Demo
TO FILE (
FILEPATH = N'D:\temp',
MAXSIZE = 50 MB,
MAX_ROLLOVER_FILES = 1000
)
WITH (ON_FAILURE = CONTINUE);
GO
ALTER SERVER AUDIT Audit_Demo WITH (STATE = ON);
GO
-- Create database audit specification to trace schema changes in the user database
USE TestDB
GO
CREATE DATABASE AUDIT SPECIFICATION DbAuditSpec_SchemaChg
FOR SERVER AUDIT Audit_Demo
ADD (SCHEMA_OBJECT_CHANGE_GROUP)
WITH (STATE = ON);
GO


Now that all the pieces are in place, let's test it.
-- add a testing row
INSERT DemoTable (pk, col1, immutableCol) VALUES (1, 'ABC', 'Immutable value');
GO
SELECT * FROM DemoTable;
GO
-- try updating the immutable column, which should fail
UPDATE DemoTable SET immutableCol = 'Update it';
GO
SELECT * FROM DemoTable;


Let's try to disable the trigger, then check the audit log to see that this action has been logged.
DISABLE TRIGGER tr_DemoTable_Upd ON DemoTable;

Although this approach cannot stop someone with sysadmin rights from modifying the data, it can deter them from attacking it.

2020-01-15

Change Data Capture (CDC) for tracking data change

SQL Server provides two features that track changes to table data: Change Data Capture (CDC) and Change Tracking (CT). CT is a synchronous mechanism with less overhead, but it only captures the fact that rows in a table were changed, not the actual data. CDC, on the other hand, relies on SQL Server Agent jobs to asynchronously capture changes to a table: both the fact that changes were made and the actual data. Before SQL Server 2016, CDC was an Enterprise-edition-only feature; starting with SQL Server 2016 SP1, CDC is also available in Standard edition. The picture below, taken from the online SQL Docs, illustrates the architecture of CDC.
Source tables are the user tables in your database that have CDC enabled. When DML statements are applied to the user tables, the database modifications are recorded in the transaction log. A capture agent job is created when the first table in the database is enabled for CDC; this capture job periodically scans the transaction log and adds information about the changes into change tables. SQL developers are provided with built-in functions to enumerate the captured changes.

Let's see a demo on how to use CDC.

1. Enable CDC for the current database.
USE TestDB;
GO

EXEC sys.sp_cdc_enable_db; 

2. Create a new table with primary key.
CREATE TABLE TestTable (ID int, col varchar(50), col2 varchar(50), CONSTRAINT PK_TestTable PRIMARY KEY (ID));

3. Enable CDC for the user table.
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'TestTable', @role_name = NULL;
Two Agent jobs are created for each CDC-enabled database: one captures changes, and the other is responsible for change table cleanup.
The capture job is started immediately and runs continuously. By default, it captures a maximum of 1000 transactions per cycle with a wait of 5 seconds between cycles. The cleanup job runs once daily at 2 AM and retains change table entries for 4320 minutes (3 days). You can check these configurations by executing the sys.sp_cdc_help_jobs stored procedure.


4. List the tables that have CDC enabled, and get the capture_instance name.
EXEC sys.sp_cdc_help_change_data_capture;
By default, the capture instance name is derived from the source schema name plus the source table name in the format schemaname_sourcename, in our example, it's dbo_TestTable.

5. Apply some DML on the table, then use the fn_cdc_get_all_changes function to get the changes.
DECLARE @begin binary(10), @end binary(10);
SET @begin = sys.fn_cdc_get_min_lsn('dbo_TestTable');
SET @end = sys.fn_cdc_get_max_lsn();
SELECT CASE __$operation WHEN 1 THEN 'delete' WHEN 2 THEN 'insert' WHEN 3 THEN 'PreUpdate' WHEN 4 THEN 'PostUpdate' ELSE 'Unknown' END AS Operation, *
    FROM cdc.fn_cdc_get_all_changes_dbo_TestTable(@begin, @end, N'all update old') ORDER BY __$start_lsn, __$seqval;
GO

As you can see, rolled-back DML won't be captured.
After inserting one row, the enumeration function returns one row stating an insert of a new row:
When multiple rows were inserted within one transaction, those inserts share the same __$start_lsn:
You can also use fn_cdc_get_net_changes function to get the changes, which only returns one net change row for each source row changed within the specified LSN range.


SQL Server provides quite a few functions and stored procedures that you can use to enumerate and manipulate CDC change tables according to your application requirement. Please refer to the online SQL Docs for all of them.