Overview

Brokers in a Qpid cluster communicate using the reliable multicast transport provided by corosync. It is recommended that they also use the Cluster Manager (cman) to enforce quorum. Without cman, a network partition may cause lost or duplicated messages and unpredictable shutdown of the Qpid brokers in the cluster.

A network partition or "split-brain" arises when a network failure splits the cluster into two or more sub-clusters that cannot communicate with each other. Each of the sub-clusters acts without knowledge of the others, resulting in inconsistent cluster state.

The cluster manager (cman) avoids this using a "quorum": a group of more than half the expected cluster nodes has quorum and can act. If a node loses contact with the quorum, the broker on that node shuts down to avoid inconsistency and to allow clients to fail over to a quorate broker. This ensures that only one group continues processing in the event of a partition. Note that this means a cluster should have an odd number of members.
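The quorum threshold described above is simply a strict majority of the expected votes. A minimal sketch of the arithmetic (the vote counts are illustrative, assuming the common case of one vote per node):

```python
def quorum_threshold(expected_votes: int) -> int:
    """A group has quorum when its votes exceed half of the expected votes.

    With one vote per node, that is floor(n/2) + 1 nodes.
    """
    return expected_votes // 2 + 1

# With 3 one-vote nodes, any 2 nodes form a quorum; a lone node does not.
print(quorum_threshold(3))  # 2

# An even-sized cluster can split into two equal halves, neither of which
# has quorum -- one reason to prefer an odd number of members.
print(quorum_threshold(4))  # 3
```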

It is possible to create a cluster of 2 nodes, but this requires additional hardware (shared storage and power fencing) and configuration that is not covered in this article.

Qpid brokers don't use shared storage: each Qpid broker has its own independent store, which is kept up to date with the other brokers via multicast. This means that Qpid brokers do not require hardware fencing to prevent corruption of a shared store. However, if your cluster runs any other services that use shared storage, then you do need fencing.

If a broker process crashes or shuts down it can be re-started automatically by the Resource Group Manager (rgmanager). Configuration is explained below.

Qpid broker clusters are active/active, meaning that clients can connect to any member at any time. This is different from "cold standby" services that use shared storage for fail-over, where only one instance is active at a time.

Note: cman and rgmanager are components of the Linux Cluster project at https://fedorahosted.org/cluster/wiki/HomePage

The corosync project is at: http://www.corosync.org/doku.php

Configuring Qpid brokers to use cman

To enable cman integration, add this to /etc/qpidd.conf:

cluster-cman=yes

When cluster-cman is enabled, the Qpid broker will wait until it belongs to a quorate cluster before accepting client connections. It continually monitors the quorum status and shuts down immediately if the node it runs on loses contact with the quorum. This avoids inconsistencies and allows clients to fail over to a quorate broker.
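To see the quorum state the broker is reacting to, you can query cman directly on any node. These commands require a running cluster, so the output shown by each will vary with your configuration:

```shell
# Show overall cluster status, including the current vote count,
# the quorum threshold, and whether this node is quorate
cman_tool status

# List the nodes cman currently sees and their membership state
cman_tool nodes
```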

Notes on using corosync and cman

When using cman, you should not start the corosync service, as it is started automatically by cman:

# service corosync stop
# chkconfig corosync off

The configuration for cman is in /etc/cluster/cluster.conf. You can use the tools conga or system-config-cluster to edit this file, see "Cluster Administration". An example cluster.conf file is shown at the end of this article.

When corosync is started by cman, the corosync.conf file is not used. Many of the configuration parameters listed in corosync.conf can be set in cluster.conf instead. See the cman man page for details.

Create a cluster configuration listing each of your nodes using system-config-cluster or conga, as described in "Cluster Administration".

You should enable the cman and rgmanager services:

# service cman start
# chkconfig cman on
# service rgmanager start
# chkconfig rgmanager on

Restarting failed broker processes

You can use the Resource Group Manager (rgmanager) to restart crashed broker processes automatically. There is an example configuration at the end of this article.

Using conga or system-config-cluster, edit the cman configuration as follows:

1. Create a failover domain for each node in the cluster. Add just that one node to the domain.
2. Create a qpidd service in each failover domain with "autostart" checked and recovery policy "restart".

Rgmanager only allows one instance of a service to be active in a domain. Since qpidd brokers are all active, putting each one in a separate failover domain allows rgmanager to restart any one of them automatically.
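In cluster.conf terms, each broker gets a restricted single-node failover domain and a service bound to it. A fragment for one node might look like the following (the node and service names are placeholders; a full worked example appears at the end of this article):

<rm>
  <failoverdomains>
    <!-- A restricted domain containing exactly one node -->
    <failoverdomain name="only_broker1" restricted="1">
      <failoverdomainnode name="node1" priority="1"/>
    </failoverdomain>
  </failoverdomains>
  <resources>
    <!-- The qpidd init script used to start, stop and monitor the broker -->
    <script name="qpidd" file="/etc/init.d/qpidd"/>
  </resources>
  <!-- Started automatically, restarted in place on failure -->
  <service name="qpidd_broker1" domain="only_broker1" autostart="1" recovery="restart">
    <script ref="qpidd"/>
  </service>
</rm>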

Example configuration

Note: if you are not using fencing you should disable the fence daemon by adding the following line to /etc/sysconfig/cman:

FENCE_JOIN="no"

Example cluster.conf for a 3-node cluster with hosts mrg33, mrg34, mrg35:

<?xml version="1.0" ?>
<cluster alias="test" config_version="37" name="test">
  <uidgid uid="qpidd" gid="qpidd" />
  <clusternodes>
    <clusternode name="mrg33" nodeid="1" votes="1"/>
    <clusternode name="mrg34" nodeid="2" votes="1"/>
    <clusternode name="mrg35" nodeid="3" votes="1"/>
  </clusternodes>
  <cman/>
  <rm log_level="7">
    <failoverdomains>
      <failoverdomain name="only_broker1" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="mrg33" priority="1"/>
      </failoverdomain>
      <failoverdomain name="only_broker2" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="mrg34" priority="1"/>
      </failoverdomain>
      <failoverdomain name="only_broker3" nofailback="0" ordered="0" restricted="1">
        <failoverdomainnode name="mrg35" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <script name="qpidd" file="/etc/init.d/qpidd"/>
    </resources>
    <service name="qpidd_broker1" domain="only_broker1">
      <script ref="qpidd"/>
    </service>
    <service name="qpidd_broker2" domain="only_broker2">
      <script ref="qpidd"/>
    </service>
    <service name="qpidd_broker3" domain="only_broker3">
      <script ref="qpidd"/>
    </service>
  </rm>
</cluster>