Saturday, May 28, 2011

Blog Post: Understanding Quorum in a Failover Cluster by Symon Perriman

Hi Cluster Fans,

This blog post will clarify planning considerations around quorum in a Failover Cluster and answer some of the most common questions we hear. 

The quorum configuration in a failover cluster determines the number of failures that the cluster can sustain while still remaining online.  If an additional failure occurs beyond this threshold, the cluster will stop running.  A common perception is that the cluster stops running once too many failures occur in order to prevent the remaining nodes from taking on too many workloads and becoming overcommitted.  In fact, the cluster does not know your capacity limitations or whether you would be willing to take a performance hit in order to keep it online.  Rather, quorum is designed to handle the scenario when there is a problem with communication between sets of cluster nodes, so that two servers do not try to simultaneously host a resource group and write to the same disk at the same time.  This is known as a “split brain”, and we want to prevent it to avoid any potential corruption of a disk by having two simultaneous group owners.  With this concept of quorum, the cluster will force the cluster service to stop in one of the subsets of nodes to ensure that there is only one true owner of a particular resource group.  Once the nodes which have been stopped can once again communicate with the main group of nodes, they will automatically rejoin the cluster and start their cluster service.

For more information about quorum in a cluster, visit: http://technet.microsoft.com/en-us/library/cc731739.aspx.

Voting Towards Quorum

Having ‘quorum’, or a majority of voters, is based on a voting algorithm in which more than half of the voters must be online and able to communicate with each other.  Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster knows how many "votes" constitute a majority of votes, or quorum.  If the number of voters drops below the majority, the cluster service will stop on the nodes in that group.  These nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until quorum exists again.

It is important to realize that the cluster requires more than half of the total votes to achieve quorum.  This avoids having a ‘tie’ in the number of votes in a partition, since a majority always means that the other partition has less than half the votes.  In a 5-node cluster, 3 voters must be online; yet in a 4-node cluster, 3 voters must also be online to have a majority.  Because of this logic, it is recommended to always have an odd number of total voters in the cluster.  This does not necessarily mean an odd number of nodes, since either a disk or a file share can also contribute a vote, depending on the quorum model.

A voter can be:

  • A node
    • 1 Vote
    • Every node in the cluster has 1 vote
  • A “Disk Witness” or “File Share Witness”
    • 1 Vote
    • Either 1 Disk Witness or 1 File Share Witness may have a vote in the cluster, but not multiple disks, multiple file shares nor any combination of the two 
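
To make the “more than half” math concrete, here is a quick PowerShell sketch of the calculation (the vote count below is just an example value, not something read from a live cluster):

  # Majority arithmetic: total votes = number of nodes, plus 1 if a
  # Disk Witness or File Share Witness is configured (example values)
  $totalVotes  = 5                                   # e.g. 4 nodes + 1 witness
  $votesNeeded = [math]::Floor($totalVotes / 2) + 1  # more than half
  $canLose     = $totalVotes - $votesNeeded          # voters that can fail

  "Total votes: $totalVotes; needed for quorum: $votesNeeded; failures tolerated: $canLose"

Running the same math with 4 or 5 total votes shows why both a 4-node and a 5-node cluster need 3 voters online.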

Quorum Types

There are four quorum types.  This information is also available here: http://technet.microsoft.com/en-us/library/cc731739.aspx#BKMK_choices.

Node Majority

This is the easiest quorum type to understand and is recommended for clusters with an odd number of nodes (3 nodes, 5 nodes, etc.).  In this configuration, every node has 1 vote, so there is an odd number of total votes in the cluster.  If there is a partition between two subsets of nodes, the subset with more than half the nodes will maintain quorum.  For example, if a 5-node cluster partitions into a 3-node subset and a 2-node subset, the 3-node subset will stay online and the 2-node subset will go offline until it can reconnect with the other 3 nodes.
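
If you prefer scripting to the wizard, this mode can also be set with the FailoverClusters PowerShell module.  This is just a sketch; the cluster name “CLUSTER1” is a placeholder for your own cluster:

  Import-Module FailoverClusters

  # Switch the cluster to Node Majority - only the nodes carry votes
  Set-ClusterQuorum -Cluster "CLUSTER1" -NodeMajority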

Node & Disk Majority

This quorum configuration is most commonly used since it works well with 2-node and 4-node clusters which are the most common deployments.  This configuration is used when there is an even number of nodes in the cluster.  In this configuration, every node gets 1 vote, and additionally 1 disk gets 1 vote, so there is generally an odd number of total votes. 

This disk is called the Disk Witness (sometimes referred to as the ‘quorum disk’) and is simply a small clustered disk which is in the Cluster Available Storage group.  This disk is highly available and can fail over between nodes.  It is considered part of the Cluster Core Resources group; however, it is generally hidden from view in Failover Cluster Manager since it requires no direct interaction.

Since there are an even number of nodes and 1 additional Disk Witness vote, in total there will be an odd number of votes.  If there is a partition between two subsets of nodes, the subset with more than half the votes will maintain quorum.  For example, if a 4-node cluster with a Disk Witness partitions into two 2-node subsets, one of those subsets will also own the Disk Witness, so it will have 3 total votes and will stay online.  The other 2-node subset will go offline until it can reconnect with the other 3 voters.  This means that the cluster can lose communication with any two voters, whether they are 2 nodes, or 1 node and the Disk Witness.
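
As a rough PowerShell equivalent (again a sketch, with “CLUSTER1” and “Cluster Disk 1” as placeholders for your cluster and an available clustered disk):

  Import-Module FailoverClusters

  # Use Node & Disk Majority, nominating a small clustered disk as the Disk Witness
  Set-ClusterQuorum -Cluster "CLUSTER1" -NodeAndDiskMajority "Cluster Disk 1"

  # Confirm the quorum type and which resource is acting as the witness
  Get-ClusterQuorum -Cluster "CLUSTER1"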

Node & File Share Majority

This quorum configuration is usually used in multi-site clusters.  This configuration is used when there is an even number of nodes in the cluster, so it can be used interchangeably with the Node and Disk Majority quorum mode.  In this configuration every node gets 1 vote, and additionally 1 remote file share gets 1 vote. 

This file share is called the File Share Witness (FSW) and is simply a file share on any server in the same AD forest which all the cluster nodes have access to.  One node in the cluster will place a lock on the file share so that it is considered the ‘owner’ of that file share, and another node will grab the lock if the original owning node fails.  On a standalone server the file share by itself is not highly available; however, the file share can also be placed on a clustered file share on an independent cluster, making the FSW clustered and giving it the ability to fail over between nodes.  It is important that you do not put this vote on a node in the same cluster, nor within a VM on the same cluster, because losing that node would cause you to lose the FSW vote, causing two votes to be lost on a single failure.  A single file server can host multiple FSWs for multiple clusters.

Generally multi-site clusters have two sites with an equal number of nodes at each site, giving an even number of nodes.  By adding this additional vote at a 3rd site, there is an odd number of votes in the cluster, at very little expense compared to deploying a 3rd site with an active cluster node and a writable DC.  This means that either site or the FSW can be lost and the cluster can still maintain quorum.  For example, in a multi-site cluster with 2 nodes at Site1, 2 nodes at Site2 and a FSW at Site3, there are 5 total votes.  If there is a partition between the sites, one of the nodes at a site will own the lock to the FSW, so that site will have 3 total votes and will stay online.  The 2-node site will go offline until it can reconnect with the other 3 voters.
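
A PowerShell sketch of the same configuration (“CLUSTER1” and “\\FileServer3\Cluster1FSW” are placeholders for your cluster and the witness share at the third site):

  Import-Module FailoverClusters

  # Point the quorum at a File Share Witness hosted outside the cluster
  Set-ClusterQuorum -Cluster "CLUSTER1" -NodeAndFileShareMajority "\\FileServer3\Cluster1FSW"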

Legacy: Disk Only

Important: This quorum type is not recommended as it has a single point of failure.

The Disk Only quorum type was available in Windows Server 2003 and has been maintained for compatibility reasons; however, it is strongly recommended never to use this mode unless directed by a storage vendor.  In this mode, only the Disk Witness contains a vote and there are no other voters in the cluster.  This means that if the disk becomes unavailable, the entire cluster will go offline, so this is considered a single point of failure.  Some customers still choose to deploy this configuration to get a “last man standing” behavior where the cluster remains online as long as any one node is still operational and can access the cluster disk.  However, with this deployment objective, it is important to consider whether that last remaining node can even handle the capacity of all the workloads that have moved to it from other nodes.
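
For completeness, this mode can also be set from PowerShell, but again, only do so if directed by your storage vendor (placeholder names as in the earlier sketches):

  Import-Module FailoverClusters

  # Not recommended: the witness disk becomes a single point of failure
  Set-ClusterQuorum -Cluster "CLUSTER1" -DiskOnly "Cluster Disk 1"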

Default Quorum Selection

When the cluster is created using Failover Cluster Manager, Cluster.exe or PowerShell, the cluster will automatically select the best quorum type for you to simplify the deployment.  This choice is based on the number of nodes and available storage.  The logic is as follows:

  • Odd Number of Nodes – use Node Majority
  • Even Number of Nodes
    • Available Cluster Disks – use Node & Disk Majority
    • No Available Cluster Disk – use Node Majority

The cluster will never select Node and File Share Majority or Legacy: Disk Only.  The quorum type is still fully configurable by the admin if the default selections are not preferred.
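
To see which quorum model the cluster picked for you, a quick check from PowerShell (a sketch; “CLUSTER1” is a placeholder):

  Import-Module FailoverClusters

  # Show the automatically selected quorum type and witness resource (if any)
  Get-ClusterQuorum -Cluster "CLUSTER1" | Format-List QuorumType, QuorumResource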

Changing Quorum Types

Changing the quorum type is easy through Failover Cluster Manager.  Right-click on the name of the cluster, select More Actions…, then select Configure Cluster Quorum Settings… to launch the Configure Cluster Quorum Wizard.  From the wizard it is possible to configure all 4 quorum types and to change the Disk Witness or File Share Witness.  The wizard will even tell you the number of failures that can be sustained based on your configuration.
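
That “number of failures that can be sustained” figure can be approximated from PowerShell as well.  This is only a sketch that assumes one of the majority-based quorum modes (it does not apply to Legacy: Disk Only), and “CLUSTER1” is a placeholder:

  Import-Module FailoverClusters

  # Count the votes: every node gets 1, plus 1 if a witness resource is configured
  $nodes   = @(Get-ClusterNode -Cluster "CLUSTER1").Count
  $quorum  = Get-ClusterQuorum -Cluster "CLUSTER1"
  $witness = if ($quorum.QuorumResource) { 1 } else { 0 }

  $totalVotes  = $nodes + $witness
  $votesNeeded = [math]::Floor($totalVotes / 2) + 1
  "$totalVotes votes; quorum needs $votesNeeded; can sustain $($totalVotes - $votesNeeded) voter failure(s)"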

For a step-by-step guide of configuring quorum, visit: http://technet.microsoft.com/en-us/library/cc733130.aspx.

Thanks!
Symon Perriman
Technical Evangelist
Private Cloud Technologies
Microsoft


1 comment:

  1. Please remove my blog post from your site. You do not have my permission to repost this Microsoft content. Thank you.
