Introduction
One of the great powers of the Tungsten Cluster is that the writable Primary role can be moved to another node or cluster at will.
When dealing with composite clusters and moving the Primary to another site entirely, there are different actions to take depending on the topology.
In this blog post we will compare the procedures for moving the Primary role from one Composite cluster to another peer Composite cluster.
The Composite Topologies, Reviewed
- CAP - One Active, writable cluster, one or more Passive clusters.
- CAA - All clusters are Active and writable, Connectors typically send writes to the local Primary.
- DAA - All clusters are Active and writable, with the Connectors configured to send all writes to just one cluster.
The Composite Topologies, Compared
Composite Active/Active (CAA) Versus Composite Active/Passive (CAP)
The key differences between CAP and CAA are:
- CAP has a single writable Primary in a single Active cluster, while all clusters in CAA are writable.
- In CAP, all writes go to the same Primary node in the single Active cluster, whilst in CAA, writes go to the local Primary by default (if it is available), and to another available cluster otherwise.
- If the Active cluster fails in CAP, human intervention is required to activate a different cluster as Active; with CAA, the Connector automatically routes writes to another available server based upon configurable rules.
- CAA has a higher risk of data integrity issues, due to the merged asynchronous writes from multiple source clusters.
- CAP allows for simplicity of application adoption because the risk for write conflicts inherent in CAA is not present.
Composite Active/Active Versus Dynamic Active/Active
- CAA clusters normally have writes going to the local Primary, while in DAA, all writes are routed to a single Primary, mimicking CAP behavior.
- DAA allows for simplicity of application adoption, lowering the risk for write conflicts.
Composite Active/Passive Versus Dynamic Active/Active
- DAA Connectors automatically reroute writes to another available cluster if the current Active cluster fails, while CAP requires human intervention to move the Active role to another cluster.
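The routing rules compared above can be condensed into a small shell function. This is only a simplified model of the behavior described in the bullets, not Connector code: the function name `write_cluster` and its arguments are hypothetical, and it ignores everything a real Connector considers beyond availability and ordering.

```shell
# Simplified model (an assumption-laden sketch, not Tungsten code) of which
# cluster receives writes under each composite topology.
# usage: write_cluster TOPOLOGY PREFERRED [AVAILABLE...]
#   TOPOLOGY  : CAP, CAA or DAA
#   PREFERRED : CAP = the Active cluster, CAA = the local cluster,
#               DAA = the first cluster in the affinity list
#   AVAILABLE : clusters currently available, in fallback order
write_cluster() {
    if [ $# -lt 2 ]; then
        echo "usage: write_cluster TOPOLOGY PREFERRED [AVAILABLE...]" >&2
        return 2
    fi
    topology=$1
    preferred=$2
    shift 2
    case $topology in
        CAP|CAA|DAA) ;;
        *) echo "unknown topology: $topology" >&2; return 2 ;;
    esac
    # Every topology uses the preferred cluster while it is available.
    for cluster in "$@"; do
        if [ "$cluster" = "$preferred" ]; then
            echo "$preferred"
            return 0
        fi
    done
    # CAP never reroutes on its own; an operator has to switch.
    if [ "$topology" = CAP ]; then
        echo "manual switch required" >&2
        return 1
    fi
    # CAA and DAA fall back to the next available cluster, if any.
    if [ $# -eq 0 ]; then
        echo "no cluster available" >&2
        return 1
    fi
    echo "$1"
}

write_cluster CAA usa emea usa    # local cluster is available: prints usa
write_cluster DAA emea usa        # emea unavailable, usa is next: prints usa
write_cluster CAP east west || echo "CAP: an operator must run switch"
```

In this model CAA and DAA differ only in what "preferred" means, which mirrors the point that DAA is a CAA cluster with different Connector configuration.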
Moving the Primary Role to Another Cluster
We will now compare the various switch methods.
Composite Active/Active (CAA)
With CAA, there is no switch or failover because all clusters are writable (Active), with a local Primary for each cluster.
Composite Active/Passive (CAP) Composite Switch
For CAP, there is a single command to run to change the Active cluster to another, currently Passive cluster.
CAP uses `cctrl> switch` at the composite level.
You may also use the `switch to {clusterName}` syntax.
For example:
shell> cctrl -multi
[LOGICAL] / > use global
[LOGICAL] /global > ls
...
DATASOURCES:
+---------------------------------------------------------------------------------+
|east(composite master:ONLINE) |
|STATUS [OK] [2023/01/18 01:18:56 PM UTC] |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|west(composite slave:ONLINE) |
|STATUS [OK] [2023/01/18 01:18:57 PM UTC] |
+---------------------------------------------------------------------------------+
[LOGICAL] /global > switch
Savepoint switch_0(cluster=global, source=db1-demo.continuent.com, created=2023/01/27 18:37:36 UTC) created
SELECTED SLAVE: 'west@global'
FLUSHING TRANSACTIONS THROUGH 'db1-demo.continuent.com@east'
REPLICATOR 'db1-demo.continuent.com' IS NOW USING MASTER CONNECT URI 'thls://db4-demo.continuent.com:2112/'
composite data source 'west@global' is now OFFLINE
PUT THE NEW MASTER 'west@global' ONLINE
PUT THE PRIOR MASTER 'east@global' ONLINE AS A SLAVE
REVERT POLICY: MAINTENANCE => AUTOMATIC
SWITCH TO 'west@global' WAS SUCCESSFUL
[LOGICAL] /global > ls
...
DATASOURCES:
+---------------------------------------------------------------------------------+
|east(composite slave:ONLINE) |
|STATUS [OK] [2023/01/27 06:37:51 PM UTC] |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|west(composite master:ONLINE) |
|STATUS [OK] [2023/01/27 06:37:51 PM UTC] |
+---------------------------------------------------------------------------------+
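If you want to confirm the result of a composite switch from a script, the `ls` listing above can be scanned for the data source marked `composite master`. The following is a minimal sketch under stated assumptions: the function name `composite_masters` is hypothetical, and it relies on the `ls` output keeping the layout shown in this post.

```shell
# Hypothetical helper: read cctrl `ls` output on stdin and print the name of
# every data source reported as an ONLINE composite master.
# Assumes the listing format shown in this post.
composite_masters() {
    sed -n 's/^|\([A-Za-z0-9_-]*\)(composite master:ONLINE.*/\1/p'
}

# Example using two lines from the post-switch listing above.
sample='|east(composite slave:ONLINE) |
|west(composite master:ONLINE) |'
printf '%s\n' "$sample" | composite_masters    # prints: west
```

In a CAP cluster this should print exactly one name. Any other result suggests the switch did not complete as expected and is worth checking by hand.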
Dynamic Active/Active (DAA) Composite Drain
When you want to move all writes to another site/cluster (like you would in a Composite Active/Passive cluster using the `switch` command at the composite level), there is no switch command available in Dynamic Active/Active.
DAA has no `switch` or `failover` command because both sites are Active - it is, at its core, a CAA composite cluster. It is the configuration of the Connectors alone which turns a CAA cluster into a DAA cluster by routing all writes to just one cluster.
With that in mind, the only way to signal all the Connectors at once that an entire site is no longer available is to shun it. Shunning a cluster forces the Connectors to send traffic to the next available cluster in the affinity list.
As of version 7.0.2, we strongly recommend that you use the cctrl command `datasource SERVICE drain [optional timeout in seconds]` at the composite level to shun the currently selected Active cluster. This will allow the Connector to finish (drain) all in-flight queries, shun the composite dataservice once fully drained, and then move all writes to another cluster.
For example:
shell> grep app_user /opt/continuent/tungsten/tungsten-connector/conf/user.map
app_user secret world emea:emea
shell> show_rw
R/W=db15-demo.continuent.com
R/O=db13-demo.continuent.com
shell> cctrl -multi
[LOGICAL] / > use world
[LOGICAL] /world > ls
...
DATASOURCES:
+---------------------------------------------------------------------------------+
|emea(composite master:ONLINE, global progress=1474239, max latency=0.704) |
|STATUS [OK] [2023/01/25 08:30:59 PM UTC] |
+---------------------------------------------------------------------------------+
| emea(master:ONLINE, progress=1452966, max latency=0.614) |
| emea_from_usa(relay:ONLINE, progress=21273, max latency=0.704) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|usa(composite master:ONLINE, global progress=1474239, max latency=0.656) |
|STATUS [OK] [2023/01/21 01:57:38 PM UTC] |
+---------------------------------------------------------------------------------+
| usa(master:ONLINE, progress=21273, max latency=0.656) |
| usa_from_emea(relay:ONLINE, progress=1452966, max latency=0.655) |
+---------------------------------------------------------------------------------+
[LOGICAL] /world > datasource emea drain 3
composite data source 'emea' is now SHUNNED
[LOGICAL] /world > ls
DATASOURCES:
+---------------------------------------------------------------------------------+
|emea(composite master:SHUNNED(DRAIN-CONNECTIONS), global progress=21273, max |
|latency=0.704) |
|STATUS [SHUNNED] [2023/01/27 06:47:25 PM UTC] |
+---------------------------------------------------------------------------------+
| emea(master:SHUNNED, progress=1452966, max latency=0.614) |
| emea_from_usa(relay:ONLINE, progress=21273, max latency=0.704) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|usa(composite master:ONLINE, global progress=1474239, max latency=0.656) |
|STATUS [OK] [2023/01/21 01:57:38 PM UTC] |
+---------------------------------------------------------------------------------+
| usa(master:ONLINE, progress=21273, max latency=0.656) |
| usa_from_emea(relay:ONLINE, progress=1452966, max latency=0.655) |
+---------------------------------------------------------------------------------+
shell> show_rw
R/W=db16-demo.continuent.com
R/O=db18-demo.continuent.com
[LOGICAL] /world > datasource emea welcome
WARNING: This is an expert-level command:
Incorrect use may cause data corruption
or make the cluster unavailable.
Do you want to continue? (y/n)> y
composite data source 'emea' is now ONLINE
[LOGICAL] /world > ls
DATASOURCES:
+---------------------------------------------------------------------------------+
|emea(composite master:ONLINE, global progress=1474241, max latency=2.127) |
|STATUS [OK] [2023/01/27 07:29:53 PM UTC] |
+---------------------------------------------------------------------------------+
| emea(master:ONLINE, progress=1452967, max latency=1.755) |
| emea_from_usa(relay:ONLINE, progress=21274, max latency=2.127) |
+---------------------------------------------------------------------------------+
+---------------------------------------------------------------------------------+
|usa(composite master:ONLINE, global progress=1474241, max latency=5.168) |
|STATUS [OK] [2023/01/21 01:57:38 PM UTC] |
+---------------------------------------------------------------------------------+
| usa(master:ONLINE, progress=21274, max latency=5.168) |
| usa_from_emea(relay:ONLINE, progress=1452967, max latency=1.532) |
+---------------------------------------------------------------------------------+
shell> show_rw
R/W=db15-demo.continuent.com
R/O=db14-demo.continuent.com
Please note that this is different from using the cctrl command `datasource SERVICE shun` (available prior to version 7.0.2) at the composite level to shun the currently selected Active cluster. Using `shun` instead of `drain` will force the Connector to immediately sever/terminate all in-flight queries, then move all writes to another cluster.
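Whichever command you use, it helps to check afterwards which composite data sources are shunned and which are still online. Here is a rough sketch that pulls the name and state out of an `ls` listing like the ones above. The function name `composite_states` is hypothetical, and it assumes the listing format shown in this post.

```shell
# Hypothetical helper: read cctrl `ls` output on stdin and print
# "NAME STATE" for each composite data source, e.g. "emea SHUNNED".
# Only the composite header lines match; member lines such as
# "| emea(master:SHUNNED, ..." are ignored.
composite_states() {
    sed -n 's/^|\([A-Za-z0-9_-]*\)(composite [a-z]*:\([A-Z]*\).*/\1 \2/p'
}

# Example using header lines from the listing taken after the drain.
printf '%s\n' \
    '|emea(composite master:SHUNNED(DRAIN-CONNECTIONS), global progress=21273, max |' \
    '|usa(composite master:ONLINE, global progress=1474239, max latency=0.656) |' \
    | composite_states
```

The sketch keeps only the leading state word, so the `(DRAIN-CONNECTIONS)` detail is dropped. Extend the pattern if you need to tell a drain apart from a plain shun.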
Bonus Command Line Alias - show_rw
For my demo nodes, I like to be able to easily display the read-write splitting status through the Connector with a single command, so I wrote the following shell alias and put it into my `.bashrc` file:
shell> alias show_rw="echo 'select concat(\"R/W=\",@@hostname) for update; select concat(\"R/O=\",@@hostname);' | tpm connector 2>/dev/null | grep -v '@@hostname'"
shell> show_rw
R/W=db15-demo.continuent.com
R/O=db14-demo.continuent.com
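Because `show_rw` prints one `R/W=` line and one `R/O=` line, its output is easy to compare before and after a drain. The sketch below shows one way to do that. The helper names `rw_host` and `ro_host` are hypothetical, and the sample values are copied from the drain example earlier in this post rather than captured live.

```shell
# Hypothetical helpers: extract host names from show_rw-style output on stdin.
rw_host() { sed -n 's/^R\/W=//p'; }
ro_host() { sed -n 's/^R\/O=//p'; }

# Sample output taken from the drain example in this post.
before='R/W=db15-demo.continuent.com
R/O=db13-demo.continuent.com'
after='R/W=db16-demo.continuent.com
R/O=db18-demo.continuent.com'

old=$(printf '%s\n' "$before" | rw_host)
new=$(printf '%s\n' "$after" | rw_host)

if [ "$old" != "$new" ]; then
    echo "writes moved from $old to $new"
else
    echo "writes still go to $old"
fi
```

With a live Connector you would capture the real `show_rw` output into `before` and `after` instead of the hard-coded samples.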
More Information
For even more details, please visit:
- Enabling Composite Dynamic Active/Active (scroll down to Manual Site-Level Switch, please)
- Manual Primary Switch
- De-Mystifying Tungsten Cluster Topologies, Part 3: CAP vs. CAA vs. DAA
- Tungsten Cluster: How Does Failover Work?
- Global WordPress High Availability Using Tungsten Clustering
Wrap-Up
In this blog post we compared the procedures for moving the Primary role from one Composite cluster to another peer Composite cluster.
Smooth sailing!