This month blog post I would like to share a case that I encountered and was solved by calling Microsoft premier support. I have a SQL Server 2016 primary instance that configured high availability with other 3 nodes on an availability group, hosting a mission critical database total 6TB, the primary sql instance only had 64GB hardware memory. Client programs sometimes run a batch job which involves high volume of data import and read on that mission critical database. Database inaccessible occurs occasionally while the batch program running, made the batch failed and needed to restart again. Users complained on this problem. Checking the sql server error log, error messages were found as below:
Process 0:0:0 (0x5748) Worker 0x000001DD933D0160 appears to be non-yielding on Scheduler 7. Thread creation time: 13283945227971. Approx Thread CPU Used: kernel 130515 ms, user 0 ms. Process Utilization 62%. System Idle 33%. Interval: 130527 ms.
***Stack Dump being sent to D:\Program Files\Microsoft SQL Server\MSSQL13.HKRDBSP1\MSSQL\LOG\SQLDump0017.txt
Always On Availability Groups connection with secondary database terminated for primary database 'XXX' on the availability replica 'XXX' with Replica ID: {b94d35e1-6b50-48f3-8421-93fe6bf7c8b0}. This is an informational message only. No user action is required.
Unable to access availability database 'XXX' because the database replica is not in the PRIMARY or SECONDARY role. Connections to an availability database is permitted only when the database replica is in the PRIMARY or SECONDARY role. Try the operation again later.
As an application DBA, when we see dump was generated by SQL Server, what we can do is only call Microsoft for help. Below is the investigation done by Microsoft support engineer:
Below are non-yielding call stacks of the dump. We found SQL SERVER was
hung when allocating memory. Kernel time is the majority which means the
slowness is in windows OS.
# Call Site
00 sqldk!SOS_MemoryBlockAllocator::CommitBlockAndCheckNumaLocality
01 sqldk!SOS_MemoryBlockAllocator::AllocateBlock
02
sqldk!SOS_MemoryWorkSpace::AllocatePage
03
sqldk!MemoryNode::AllocatePagesInternal
04 sqldk!MemoryNode::AllocatePages
05
sqldk!MemoryClerkInternal::AllocatePagesWithFailureMode
06 sqldk!MemoryClerkInternal::AllocatePages
07
sqldk!TVarPageMgr<0>::PviNewVarPage
08
sqldk!TVarPageMgr<0>::PbAllocate
09 sqldk!CMemObj::InlineAlloc
0a sqldk!CMemObj::Alloc
IncidentType
NonYieldingScheduler
OffenderThreadId
~256
IsPreemptive
False
Wall_clock_time_ms
70215
Kernel_time_ms
70203
User_time_ms
0
passes
15
1. Please enable ‘Lock pages in memory’ for this SQL SERVER. ‘Lock pages in memory’ will use AWE api to allocate memory. It will be more efficient.
Please
follow below official article. Add “Lock pages in memory” privilege to “SQL Service account”.
https://technet.microsoft.com/en-us/library/ms190730(v=sql.105).aspx
2. Please increase physical memory for this VM to
128GB, and increase the sql server max server memory setting to 112GB.