Issue: Production was down for approximately 5 hours (4 hours 47 minutes), from 11:35 AM PST to 04:22 PM PST on 10/24
Root Cause Type: Platform, Third Party
Root Cause: According to Verizon, an error was reported on the failover cluster.
Error Message:
Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'managed.cln' for the following reason:
Unable to update password for computer account.
The text for the associated error code is: Access is denied.
The cluster identity 'DAC30415VIR001$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.
Solution: Verizon restarted the database and the cluster. Additionally, they opened a case with Microsoft to investigate this cluster behavior. As a precaution, all database clusters were patched with the latest Windows updates, and all cluster hotfixes were installed on all 4 nodes.
Error message – We observed that the error below was logged from 09/06/2017 until the cluster failover.
Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'managed.cln' for the following reason:
Unable to update password for computer account.
The text for the associated error code is: Access is denied.
The cluster identity 'DAC30415VIR001$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.
The image below shows that the timestamps of the following error match the exact times production went down and came back up; the failure of this resource is what took production down.
Cluster resource 'Cluster Disk 1 - Q:\Quorum' in clustered service or application 'Cluster Group' failed.
Our analysis: Both errors stopped being logged after the cluster was restarted. This indicates that the first error (the cluster network name resource failing to come online) is the root cause, and the second error (the failure of 'Cluster Disk 1 - Q:\Quorum') is its impact.
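
The timestamp comparison used in this analysis can be expressed as a short check. The following is a minimal sketch in Python, not the procedure Verizon or the team actually ran. It assumes the outage year is 2017 (inferred from the 09/06/2017 log date), and the 5-minute tolerance and the function names are hypothetical. Event timestamps would have to be taken from the cluster event log; none are included here.

```python
from datetime import datetime, timedelta

# Outage window from the Issue line. The year 2017 is an assumption
# inferred from the 09/06/2017 date mentioned in the report.
OUTAGE_START = datetime(2017, 10, 24, 11, 35)
OUTAGE_END = datetime(2017, 10, 24, 16, 22)


def outage_duration(start, end):
    """Return the length of the outage as a timedelta."""
    return end - start


def matches_outage(event_time, start=OUTAGE_START, end=OUTAGE_END,
                   tolerance=timedelta(minutes=5)):
    """Return True if an event falls inside the outage window.

    The tolerance (hypothetical value) allows for small clock or
    logging delays around the start and end of the window.
    """
    return start - tolerance <= event_time <= end + tolerance


if __name__ == "__main__":
    print(outage_duration(OUTAGE_START, OUTAGE_END))  # prints 4:47:00
```

An event logged only inside this window (as reported for the quorum disk failure) is consistent with being the direct impact, while an error that was already being logged weeks earlier falls outside it and points to a longer-standing cause.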