Terracotta failover tuning

Guaranteed data consistency for cross-data center clustering

Guarantee your data consistency with the failover tuning feature added in Terracotta 4.3.3, available October 2016. This new feature provides guaranteed data consistency when data redundancy is required to survive a network or data center failure.

Introduction

webMethods Integration Server users have often struggled to guarantee data consistency when implementing data redundancy, that is, when clustering application data with Terracotta BigMemory and distributing applications over two data centers. Because data is shared within a single Terracotta Server Array (TSA) cluster, the mirror (passive) node of the cluster must be located in the second data center for data redundancy. Splitting the TSA stripe over two data centers in this way puts data consistency at risk because of the potential for a split-brain scenario.

Fortunately, failover tuning, a new feature added in Terracotta 4.3.3 (available October 2016), provides guaranteed data consistency when data redundancy is needed, such as during a network or data center failure.

Typical cross-data center configuration

Data redundancy is most needed when data consistency across data centers is required. A typical example of a cross-data center configuration for webMethods users is shown in Figure 1.

In this use case, a single webMethods Integration Server (IS) cluster spans two data centers. An IS cluster may contain multiple Integration Servers, and in this case all Integration Servers are up and processing requests. If one data center goes down, the Integration Servers in the remaining data center stay active without losing data.

In this architecture, a single TSA stripe (i.e., active/mirror) is deployed across both data centers. As always, the IS cluster also requires a shared (or active-active replicated) database. 

Figure 1: webMethods Integration Server cluster across data centers with split Terracotta stripes

Split-brain scenario

Before the introduction of Terracotta Failover Tuning, the cross-data center configuration described in the previous section was not recommended because a communication disruption (e.g., a network partition) could cause a split-brain scenario. Even a very fast and reliable network with no single point of failure cannot rule this scenario out.

What is a split-brain scenario? Split brain refers to a situation in which two servers assume the role of active server. It can occur when a network problem disconnects the active and mirror servers, causing the mirror to promote itself to active while the original active node also remains active.

In a clustered environment, this can happen when a network, hardware, or other issue causes the active node to partition from the cluster. By default, the TSA detects such a situation and raises an active-left event. If the remaining passive node cannot find the active node within the default five-second window, it takes over as the active node.

With both nodes now acting as active, you will encounter a split-brain scenario. Any further operations performed on the data are likely to result in data inconsistencies.

To eliminate this split-brain scenario, Terracotta 4.3.3 introduces the failover tuning feature, which allows a cluster to be configured to emphasize consistency.

Terracotta 4.3.3 Failover Tuning feature

Terracotta 4.3.3 is a minor product release that includes the new failover tuning feature for guaranteeing consistency across distributed data center deployments. Since webMethods is the most common use case for this cross-data center configuration, this feature is available with webMethods 9.8 and later versions that support Terracotta 4.3. 

Terracotta Failover Tuning is the recommended solution for cross-data center clustering where data consistency is the primary goal. This feature allows for split stripes without the split-brain scenario by preventing automatic promotion of the mirror node to active when communication is disrupted.

How does failover tuning work?

You can tune the failover behavior of your system by turning on Terracotta Failover Tuning and then selecting either high availability or consistency.

If you choose high availability, the cluster behaves as it does today: passive (mirror) nodes immediately take over as active nodes once they detect that the active node has left the cluster. This ensures high availability of the cluster, but the behavior is prone to split-brain.

If you choose guaranteed consistency, the cluster is tuned for consistency: passive nodes wait for manual intervention before being promoted to active, and the split-brain scenario is prevented. Since the default setting in tc-config is AVAILABILITY, you must change the failover-priority element to CONSISTENCY under the servers tag.
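
As an illustration, a minimal sketch of the relevant portion of tc-config.xml is shown below. The host names, ports, and server names here are placeholder assumptions; consult the BigMemory Max Administration Guide for the exact schema.

    <servers>
      <server host="dc1-host" name="server1">
        <tsa-port>9510</tsa-port>
      </server>
      <server host="dc2-host" name="server2">
        <tsa-port>9510</tsa-port>
      </server>
      <!-- Default is AVAILABILITY; CONSISTENCY requires manual promotion of the passive -->
      <failover-priority>CONSISTENCY</failover-priority>
    </servers>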

There are a number of scenarios in which a network problem may arise (e.g., the primary site fails, the secondary site fails, or the network connection fails). Figures 2 and 3 reflect what happens when a primary site failure occurs while the guaranteed consistency option is selected.

Figure 2: When primary site fails and the secondary site stops operations, all data is preserved.

Figure 3: After primary site fails, secondary site is brought online through manual intervention.

First, the user must check whether the original active node is running before promoting the PASSIVE-STANDBY node. Once the status is confirmed, an operator must trigger the fail-over-action command:

$KIT/server/bin/fail-over-action.sh[bat] -f /path/to/tc-config.xml
-n <server-name> --promote|--restart|--failFast

Table 1: Manual intervention actions

--promote
Description: The node moves to the ACTIVE_COORDINATOR state, provided it was in the WAITING-FOR-PROMOTION state.
Guidelines: Use when the operator is sure that the active node is genuinely down.

--restart
Description: The node logs appropriately, shuts down, and marks the database as dirty. The server restarts automatically.
Guidelines: Use when the active node is still up and running and you do not want the waiting passive to become active.

--failFast
Description: The node logs appropriately and shuts down without any changes to the database. The server does not restart automatically.
Guidelines: Use when the active node is still up and running and you do not want to keep the passive node (e.g., in order to perform analysis).
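
To make the procedure concrete, here is a hedged example sequence, assuming a two-server stripe in which server2 is the waiting passive in the secondary data center. The server-stat flags and paths shown are placeholder assumptions; verify them against your kit.

    # Confirm the state of the stripe before acting (server-stat ships in the kit)
    $KIT/server/bin/server-stat.sh -f /path/to/tc-config.xml

    # Once the original active is confirmed down, promote the waiting passive
    $KIT/server/bin/fail-over-action.sh -f /path/to/tc-config.xml -n server2 --promote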

Summary

When data consistency is your primary goal, set the Terracotta Failover Tuning feature to CONSISTENCY. This provides guaranteed consistency across a distributed (two data center) deployment. The feature allows for split stripes without the split-brain scenario by preventing automatic promotion of the mirror node to active when communication is disrupted.

This feature is available with a BigMemory FX Edition license.

For more information, please refer to the Failover Tuning for Guaranteed Consistency section of the BigMemory Max Administration Guide in the Terracotta documentation.