2025-07-05

Out of Sync status for the AlwaysOn Availability Group Replica

Last week, a customer told me that their SQL Server instance was running slowly. I checked and found that CPU usage was low and that Page Life Expectancy was high, so the instance was not under CPU or memory pressure. The application is a packaged vendor product with a light workload. Before reporting back to the customer on the health of their SQL Server, I ran Paul Randal's Wait Statistics script. It showed that the top wait type was HADR_SYNC_COMMIT, which means that transactions on the primary were waiting for the synchronous secondary replica to harden the log before the commit could complete. The screenshot below shows that 75% of the wait time is attributed to HADR_SYNC_COMMIT.
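To make the 75% figure concrete, here is a minimal Python sketch of how a per-wait-type share can be derived from accumulated wait times. The numbers are made up for illustration (they are not the customer's figures), and wait_percentages is a hypothetical helper, not part of Paul Randal's script.

```python
from typing import Dict


def wait_percentages(wait_ms: Dict[str, float]) -> Dict[str, float]:
    """Return each wait type's share of the total wait time, largest first."""
    total = sum(wait_ms.values())
    if total == 0:
        # No waits recorded at all: every share is zero.
        return {name: 0.0 for name in wait_ms}
    ordered = sorted(wait_ms.items(), key=lambda kv: kv[1], reverse=True)
    return {name: round(100.0 * ms / total, 2) for name, ms in ordered}


# Hypothetical accumulated wait times in milliseconds.
sample_waits = {
    "WRITELOG": 150_000,
    "HADR_SYNC_COMMIT": 750_000,
    "PAGEIOLATCH_SH": 100_000,
}

shares = wait_percentages(sample_waits)
for wait_type, pct in shares.items():
    print(f"{wait_type}: {pct}%")
```

With these sample values, HADR_SYNC_COMMIT accounts for 75% of the total, which is the kind of result that points at the synchronous secondary rather than at CPU or memory.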

The customer explained that the Availability Group has two replicas: the primary in Shanghai and the secondary in Hong Kong, connected by a network link with a bandwidth of 10Mb/s. They asked me to show how often the AAG replicas go out of sync. I showed them the hadr_db_partner_set_sync_state events from the AlwaysOn_health Extended Events session, as in the two screenshots below (sync_state NOT indicates out of sync, while sync_state LOG indicates that synchronization has resumed):


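If the events are exported from the Extended Events session, the question "how often and for how long" can be answered by pairing each out-of-sync event with the next resume event. The following is only a sketch under assumptions: the timestamps are invented, the events are assumed to be already reduced to (timestamp, sync_state) pairs for a single database, and the state names NOT and LOG are used exactly as described above.

```python
from datetime import datetime
from typing import List, Tuple

Event = Tuple[datetime, str]


def out_of_sync_intervals(events: List[Event]) -> List[Tuple[datetime, datetime]]:
    """Pair each 'NOT' (out of sync) event with the next 'LOG' (resumed) event."""
    intervals = []
    started = None
    for ts, state in sorted(events):
        if state == "NOT" and started is None:
            started = ts  # replica just went out of sync
        elif state == "LOG" and started is not None:
            intervals.append((started, ts))  # synchronization resumed
            started = None
    return intervals


# Hypothetical events, not taken from the customer's system.
sample_events = [
    (datetime(2025, 7, 1, 10, 0, 0), "NOT"),
    (datetime(2025, 7, 1, 10, 0, 45), "LOG"),
    (datetime(2025, 7, 1, 10, 30, 0), "NOT"),
    (datetime(2025, 7, 1, 10, 31, 30), "LOG"),
]

intervals = out_of_sync_intervals(sample_events)
total_seconds = sum((end - start).total_seconds() for start, end in intervals)
print(f"{len(intervals)} out-of-sync periods, {total_seconds:.0f} seconds in total")
```

An out-of-sync period that has not yet resumed is deliberately left out of the result, since its duration is not known.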
Because the replicas go out of sync so often, the condition is also easy to spot on the Availability Group Dashboard, as depicted in the image below.

To address this issue, I recommended that the customer switch the availability mode of the AAG from synchronous commit to asynchronous commit. The trade-off is that an asynchronous-commit replica does not support automatic failover; only a forced manual failover is possible, with potential data loss.
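The reasoning behind this recommendation can be shown with some back-of-the-envelope arithmetic. In synchronous-commit mode, every commit waits at least one network round trip to the secondary, so a single session that commits serially is capped by that latency. The sketch below assumes a 40 ms round trip between the two sites; that figure is purely hypothetical, as I did not measure it here, and the helper name is my own.

```python
def max_serial_commits_per_second(round_trip_ms: float, local_flush_ms: float = 0.0) -> float:
    """Upper bound on commits per second for one session committing serially."""
    per_commit_ms = round_trip_ms + local_flush_ms
    if per_commit_ms <= 0:
        raise ValueError("per-commit latency must be positive")
    return 1000.0 / per_commit_ms


# Assumed round-trip time between the sites (hypothetical value).
assumed_rtt_ms = 40.0
commit_ceiling = max_serial_commits_per_second(assumed_rtt_ms)
print(f"At {assumed_rtt_ms} ms per round trip: at most {commit_ceiling} commits/s per session")
```

In asynchronous-commit mode the primary does not wait for the secondary's acknowledgement, so this ceiling no longer applies to commits.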