A database availability group is the primary mechanism to protect Exchange Server mailbox databases against failure. These DAGs are based on Windows Server's failover clustering mechanism and on database replication. A mailbox database can be replicated across multiple DAG members -- mailbox servers configured to participate in the DAG. That way, if a mailbox server drops offline, another mailbox server can take over hosting the mailbox database.
While this concept is simple in theory, there are a number of factors that can directly affect a
Of course, simply having extra database copies isn't enough. The DAG must also be constructed in a way that allows database copies to remain online during a failure. As previously mentioned, DAGs are based on failover clustering, which adheres to the Majority Node Set model. In other words, the majority of the cluster nodes -- in this case, mailbox servers configured to act as DAG members -- must remain online for the cluster and the DAG to remain functional. Microsoft refers to this as retaining quorum, and defines a majority as half plus one. In practice, that means more than half of the nodes: two nodes in a three-node cluster, or three nodes in a five-node cluster.
A quick history of clustering in Exchange
The requirement for a failover cluster to retain quorum has historically meant that larger clusters provide better protection than smaller ones. Imagine that you've created a database availability group with three mailbox servers. For the sake of simplicity, let's also assume that a copy of each mailbox database resides on each DAG member. If a mailbox server fails, the mailbox databases remain accessible. A three-node cluster requires two nodes to remain online to retain quorum, so it can keep running if a single node fails. It can't tolerate the failure of a second node, however, because the cluster would lose quorum.
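The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration of the majority rule described above, assuming one vote per node and no witness; the function names are hypothetical and not part of any Microsoft tool.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest number of online nodes that is more than half of the cluster."""
    return total_nodes // 2 + 1


def tolerable_failures(total_nodes: int) -> int:
    """How many nodes can fail before a static majority is lost."""
    return total_nodes - votes_needed(total_nodes)


if __name__ == "__main__":
    # Compare a few cluster sizes under the classic (static) majority rule.
    for size in (3, 4, 5):
        print(
            f"{size} nodes: need {votes_needed(size)} online, "
            f"can lose {tolerable_failures(size)}"
        )
```

Under this model a three-node cluster tolerates one failure and a five-node cluster tolerates two, while a fourth node adds no extra tolerance over three.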
Previously, the only way to protect Exchange against multiple simultaneous mailbox server failures was to create database availability groups with five or more members. But there are high costs associated with doing so, and that puts more resilient DAGs financially out of reach for smaller organizations.
Microsoft made an important change in Windows Server 2012 with its Failover Clustering Dynamic Quorum feature, which can keep a cluster running even if a majority of the cluster nodes fail. In fact, a cluster can keep running even if only a single node remains online.
What is Dynamic Quorum?
Dynamic Quorum is a feature in a Windows Server failover cluster that allows quorum to be recalculated when nodes fail or shut down.
Node voting makes this possible. Normally, each node receives a single vote. The cluster looks at the total number of votes that would be cast if every node in the cluster were online, then uses that number to calculate the quorum requirements. For instance, there are three potential votes (one for each node) in a three-node cluster. If only two cluster nodes are able to vote due to the failure of a node, those two votes constitute a majority and quorum is retained.
The idea behind dynamic quorum is that if you can manipulate a node's voting rights, you can change the quorum requirements. For instance, suppose you have a three-node cluster but you take the voting rights away from two nodes. A single cluster node is able to vote in that situation, so that one node constitutes a majority. The other two nodes can go offline and the cluster will retain quorum because the failed nodes didn't have any voting rights. But the voting rights have to be set before the nodes go offline.
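To make the effect of voting rights concrete, here is a minimal Python sketch of the three-node example above. It is a toy model, not how the cluster service is implemented: the server names (MBX1 to MBX3) and the `has_quorum` function are assumptions made up for illustration.

```python
def has_quorum(votes: dict[str, int], online: set[str]) -> bool:
    """Return True if the online nodes hold more than half of all assigned votes."""
    total_votes = sum(votes.values())
    online_votes = sum(v for node, v in votes.items() if node in online)
    return online_votes > total_votes / 2


# Default configuration: every node has one vote.
default_votes = {"MBX1": 1, "MBX2": 1, "MBX3": 1}

# Voting rights removed from two nodes, as in the example above.
reduced_votes = {"MBX1": 1, "MBX2": 0, "MBX3": 0}

if __name__ == "__main__":
    # Only MBX1 is still online in both scenarios.
    print(has_quorum(default_votes, {"MBX1"}))
    print(has_quorum(reduced_votes, {"MBX1"}))
```

With default votes, a lone surviving node holds one of three votes and quorum is lost. With the votes removed from MBX2 and MBX3 beforehand, MBX1 holds the only vote and the cluster retains quorum. The reverse also holds: if MBX1 fails in the second configuration, the remaining two nodes have no votes between them.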
Cluster node voting can be manipulated manually or dynamically.
To configure node voting, open the Failover Cluster Manager and select the cluster you want to configure. Go to the Actions pane and click More Actions | Configure Cluster Quorum Settings. This will cause Windows to launch the Configure Cluster Quorum Wizard. When the wizard begins, click Next to bypass the wizard's welcome screen, and then choose the Advanced Quorum Configuration and Witness Selection option.
The screen that follows lets you pick the nodes allowed to cast votes. You should allow voting rights for all nodes under normal circumstances. But on the next screen, you can select the option to allow the cluster to dynamically manage the assignment of node votes. This option enables the dynamic quorum feature.
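To illustrate what dynamic vote management changes, the following Python sketch compares a fixed quorum requirement with one that is recalculated after each failure. It is a deliberately simplified model under stated assumptions: nodes fail one at a time, the cluster has time to recalculate between failures, and no witness is involved. The function names are hypothetical. The sketch stops short of the single-node case described earlier, because that last step depends on how the cluster handles a two-node tie, which is not modeled here.

```python
def survives_static(total_nodes: int, failures: int) -> bool:
    """Static quorum: the requirement is fixed by the original node count."""
    needed = total_nodes // 2 + 1
    return total_nodes - failures >= needed


def survives_dynamic(total_nodes: int, failures: int) -> bool:
    """Simplified dynamic quorum: after each single failure, the failed
    node's vote is dropped and the majority is recalculated."""
    voters = total_nodes
    for _ in range(failures):
        survivors = voters - 1
        # Survivors must hold more than half of the current votes.
        if survivors <= voters / 2:
            return False
        voters = survivors  # the failed node no longer counts
    return True


if __name__ == "__main__":
    for failures in range(1, 5):
        print(
            f"5 nodes, {failures} sequential failure(s): "
            f"static={survives_static(5, failures)}, "
            f"dynamic={survives_dynamic(5, failures)}"
        )
```

In this model, a five-node cluster with a fixed requirement stops at the third failure, while the recalculating version is still running with two nodes left.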
About the author:
Brien Posey is an eight-time Microsoft MVP for his work with Windows Server, IIS, Exchange Server and file system storage technologies. Brien has served as CIO for a nationwide chain of hospitals and healthcare facilities, and was once responsible for IT operations at Fort Knox. He has also served as a network administrator for some of the nation's largest insurance companies.
This was first published in January 2014