HANA – Scale-Out System is Hanging With Many Threads in Synchronization::ReadWriteLock::timedWaitLockExclusive and High CPU Consumption on Worker Nodes
You observe constantly increasing CPU consumption on the worker nodes of a scale-out system until the system appears to hang.
Runtime dumps created as described in SAP KBA 1813020 show that, on the worker nodes, a considerable number of threads are in:
- State ExclusiveLock Enter according to the contents of m_dev_contexts
- Synchronization::ReadWriteLock::timedWaitLockExclusive according to section [STACK_SHORT]
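To judge whether "a considerable number" of threads are affected, it can help to count how often the frame appears in a runtime dump. The following is a minimal sketch, not an SAP tool: it assumes nothing about the dump layout beyond the frame name appearing as plain text, and the function names are hypothetical.

```python
# Frame name taken from the symptom description above.
WAIT_FRAME = "Synchronization::ReadWriteLock::timedWaitLockExclusive"


def count_waiting_lines(dump_text: str, frame: str = WAIT_FRAME) -> int:
    """Count the lines of a runtime dump that mention the given frame.

    This is a plain substring match per line; it does not parse the
    [STACK_SHORT] section or de-duplicate frames within one thread.
    """
    return sum(1 for line in dump_text.splitlines() if frame in line)


def count_waiting_lines_in_file(path: str) -> int:
    """Read a runtime dump file (hypothetical path) and count matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_waiting_lines(handle.read())
```

Comparing the count across the dumps of several worker nodes, or across dumps taken a few minutes apart, gives a rough indication of whether the number of waiting threads is growing.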
Creating deadlock graphs on the worker nodes using the following commands:
deadlockdetector wg -w -o /tmp/<hostname>_deadlockdetector.txt
dot -Tpdf -o <pdffile> /tmp/<hostname>_deadlockdetector.txt
will reveal that there are deadlocks on the worker nodes.
Other Terms: blocking locks, IS_LOGGED
Reason and Prerequisites
Reason: This issue is caused by accessing column store tables on the worker nodes that have been created with the no-logging option. Such tables are usually created and used by SAP BW, e.g. while executing SAP BW queries, but might also be created by other applications. On the worker nodes, these tables are accessed by a special routine for no-logging column store tables, which does not adhere to the standard locking order. For this reason, the access might lead to a deadlock with other threads that access the same objects in the standard locking order.
This issue cannot occur in SAP HANA database Revisions of SPS 8 and SPS 9.
Affected Releases:
SAP HANA 1.0 database:
Revisions of SPS10
Revisions <= 112.03 (SPS11)
Revision == 120.00 (SPS12)
Prerequisites:
Scale-out scenario
Column store tables with no-logging option are being accessed
Solution
Apply one of the following SAP HANA database Revisions:
SAP HANA 1.0 database:
Revisions >= 112.04 (SPS11) or
Revisions >= 121.00 (SPS12)
SAP HANA 2.0 database:
Revisions >= 000.00 (SPS0)
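The revision rules from the affected releases and the solution can be expressed as a small check. This is a minimal sketch under two assumptions that are not stated in this note: the version string has the form `<major>.<minor>.<revision>.<patch>.<build>` (for example as shown in the database version information), and the revision ranges 80-99, 100-109, 110-119 and 120-129 correspond to SPS 8/9, SPS10, SPS11 and SPS12. The function name is hypothetical.

```python
def classify_revision(version: str) -> str:
    """Classify a HANA version string as 'affected', 'fixed',
    'not affected' or 'unknown' according to the rules in this note.

    Assumed input format: '<major>.<minor>.<revision>.<patch>.<build>'.
    """
    parts = version.split(".")
    major, revision, patch = int(parts[0]), int(parts[2]), int(parts[3])

    if major == 2:
        return "fixed"              # SAP HANA 2.0: Revisions >= 000.00
    if major != 1:
        return "unknown"
    if 80 <= revision <= 99:        # assumed range for SPS 8 and SPS 9
        return "not affected"
    if 100 <= revision <= 109:      # assumed range for SPS10
        return "affected"
    if 110 <= revision <= 119:      # assumed range for SPS11
        return "affected" if (revision, patch) <= (112, 3) else "fixed"
    if 120 <= revision <= 129:      # assumed range for SPS12
        if (revision, patch) == (120, 0):
            return "affected"
        if (revision, patch) >= (121, 0):
            return "fixed"
    return "unknown"                # not covered by this note
```

Revisions that the note does not mention are deliberately reported as `unknown` rather than guessed.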
Workaround:
- Restart the hanging indexserver(s).
- Move all column store tables created with the no-logging option to the master node of the scale-out system.
If the no-logging column store tables are accessed on the master node, a different code path is used, which adheres to the standard locking order. Hence the issue cannot occur if all such tables reside on the master node.
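Moving the tables can be prepared by generating the required statements from a list of table locations. The sketch below is an illustration only: it assumes the system views `SYS.TABLES` (columns `IS_COLUMN_TABLE`, `IS_LOGGED`) and `SYS.M_TABLE_LOCATIONS` are available in this form on your Revision, it does not handle partitioned tables, and all helper names, hosts and table names are hypothetical. Review the generated statements before executing them.

```python
# Query (assumed catalog columns) listing no-logging column store tables
# together with their current location. Run it with the SQL client of
# your choice and pass the result rows to build_move_statements().
FIND_NO_LOGGING_TABLES_SQL = """
SELECT L.SCHEMA_NAME, L.TABLE_NAME, L.HOST, L.PORT
  FROM SYS.M_TABLE_LOCATIONS L
  JOIN SYS.TABLES T
    ON T.SCHEMA_NAME = L.SCHEMA_NAME AND T.TABLE_NAME = L.TABLE_NAME
 WHERE T.IS_COLUMN_TABLE = 'TRUE' AND T.IS_LOGGED = 'FALSE'
"""


def quote_identifier(name: str) -> str:
    """Quote an SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_move_statements(rows, master_host: str, master_port: int):
    """Build ALTER TABLE ... MOVE TO statements for every table that is
    not already located on the master indexserver.

    rows: iterable of (schema_name, table_name, host, port) tuples.
    """
    target = f"{master_host}:{master_port}"
    statements = []
    for schema, table, host, port in rows:
        if (host, int(port)) == (master_host, int(master_port)):
            continue  # already on the master node, nothing to do
        statements.append(
            f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} "
            f"MOVE TO '{target}'"
        )
    return statements
```

For example, with the made-up rows `("SAPBW", "/BI0/TMP1", "worker1", 30003)` and `("SAPBW", "T2", "master1", 30003)` and the master at `master1:30003`, only the first table yields a statement.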
If you want your BW system to create all temporary no-logging column store tables on the master node, you can change the TABLE_PLACEMENT for the TABLE_GROUPS BW_TEMP and sap.bw.temp to master. See SAP Note 1908075 for detailed instructions on adjusting the TABLE_PLACEMENT. After upgrading to an SAP HANA database Revision in which the bug is fixed, you should revert all workarounds.
Remarks:
As these tables require additional space, you should monitor the memory consumption on the master node after moving all no-logging column store tables to the master node.