Long Live Backups!
Failure To Plan is Planning for Failure
Calculating acceptable business risk and then building strategies and protocols to address those risks are key parts of disaster recovery planning.
When planning for a disaster, two target objectives are often used to define the recovery plan:
- Recovery Time Objective (RTO)
- Recovery Point Objective (RPO)
RPO defines how much data can be lost. For example, if backups are made once per day, the worst-case loss is just under 24 hours of data, which occurs when the failure happens just before the next backup runs.
RTO defines how long the business can be down. For example, it includes the time it takes to locate the correct backup, transfer it, decompress it, and restore it so the database becomes available again.
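To make the two objectives concrete, here is a minimal Python sketch of the arithmetic, assuming backups are the only recovery mechanism. The function names and every duration below are hypothetical placeholders chosen for illustration, not measurements from any real system.

```python
from datetime import timedelta
from typing import Mapping


def worst_case_rpo(backup_interval: timedelta) -> timedelta:
    """Worst-case data loss window with backups only.

    A failure just before the next backup runs loses (almost) one
    full interval of writes, so the interval is the upper bound.
    """
    return backup_interval


def estimated_rto(step_durations: Mapping[str, timedelta]) -> timedelta:
    """Downtime estimate: the restore steps run one after another,
    so their durations simply add up."""
    return sum(step_durations.values(), timedelta())


# Hypothetical durations for each restore step (assumptions, not benchmarks).
restore_steps = {
    "locate backup": timedelta(minutes=15),
    "transfer": timedelta(minutes=45),
    "decompress": timedelta(minutes=20),
    "restore": timedelta(minutes=90),
}

rpo = worst_case_rpo(timedelta(hours=24))  # one backup per day
rto = estimated_rto(restore_steps)

print(f"Worst-case RPO: {rpo}")
print(f"Estimated RTO:  {rto}")
```

Plugging in your own backup schedule and measured restore times gives a quick check of whether a backup-only strategy meets the objectives the business has agreed to.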
Without clustering, a database failure requires manual recovery from a backup, a process that requires downtime and can lose any data written since the last backup.
While a proper clustering solution can enable continuous MySQL operations with a very fast recovery time (RTO) and a very small loss window (RPO), database clustering cannot protect against all eventualities.
To Backup, or Not To Backup, That Is the Question
Why are backups even needed? There are multiple MySQL replicas in a fault-tolerant cluster, so the data is completely safe, correct?
NO!
Please remember that writes to the Primary are copied to all Replica nodes in a cluster as quickly as possible. That means that a bad write will propagate throughout the entire cluster, rendering every copy of the data partially or completely useless.
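The effect can be illustrated with a toy Python model. This is only a sketch: the `Node` class, the `write` helper and the tuple-based statements are hypothetical stand-ins, not how MySQL or Tungsten replication is actually implemented. The point it shows is that every node faithfully applies the same statements, good or bad, while a point-in-time copy taken earlier is unaffected.

```python
import copy


class Node:
    """A toy database node holding tables as lists of rows."""

    def __init__(self, name):
        self.name = name
        self.tables = {}

    def apply(self, statement):
        # A statement is a tuple: (operation, table, *rows).
        op, table, *rows = statement
        if op == "insert":
            self.tables.setdefault(table, []).extend(rows)
        elif op == "drop":
            self.tables.pop(table, None)


primary = Node("primary")
replicas = [Node("replica1"), Node("replica2")]


def write(statement):
    """Apply a write on the primary, then copy it to every replica."""
    primary.apply(statement)
    for replica in replicas:
        replica.apply(statement)


write(("insert", "orders", {"id": 1}, {"id": 2}))

# Point-in-time backup taken before the mistake.
backup = copy.deepcopy(primary.tables)

# The bad write: it is replicated just as faithfully as the good one.
write(("drop", "orders"))

cluster_has_orders = any("orders" in node.tables for node in [primary, *replicas])
backup_has_orders = "orders" in backup

print("Any cluster node still has 'orders':", cluster_has_orders)
print("Backup still has 'orders':", backup_has_orders)
```

After the mistaken drop, no node in the toy cluster holds the table any more, and only the earlier backup copy can bring it back.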
There are many possible reasons for a backup to be needed. Here are just a few:
- Admin error resulting in loss or corruption of data (e.g., someone types "DROP DATABASE" on your Primary by mistake…)
- Application error, or automated SQL, leading to corruption
- Malicious activity/hacking leading to partial or total data loss
It is for reasons like these (and more!) that backups are REQUIRED for the safety of the data and the business operation.
If I Must Have Backups, Then Why Bother with Clustering?
“Instead of always using a hammer, you can use a screw driver, or pliers, and sometimes you don’t have to do anything at all.”
Continuent knows the pain of DBAs, SREs, DevOps and SysAdmins first-hand; that’s why our engineers have distilled various manual processes down to single, seemingly magical commands. Some of my favorites include:
- "switch" redirects connections to another part of the cluster, so you can perform maintenance, patches or updates without bringing your application down.
- "recover" restores optimal cluster health by automatically checking the status and state of all nodes and making any necessary configuration changes.
- "tprovision" takes a backup from one node and restores it on another.
You might be wondering if clustering is worth it if you still have to maintain a classic, proper backup process.
Clustering reduces the impact of a wide variety of risks that would otherwise cause a long outage with significant data loss. Scenarios like database, host, network, site and even regional failures can be protected against and remediated rapidly with clustering. Additionally, clustering provides for read-scaling and automated recovery. Backups alone provide none of this.
With a cluster in place (true at least for Tungsten Clustering), there is a:
- dramatic drop in the number of failure scenarios to recover from manually
- lower total cost of ownership (TCO)
- decrease in administrative overhead
Furthermore, having a fully-integrated, infrastructure-agnostic clustering solution like Tungsten Clustering makes it easier than ever to deal with complex cluster operations. Check out this blog about “the Boss,” a cluster manager or orchestrator such as Tungsten Manager, which makes DR, multi-site, hybrid-cloud and multi-cloud MySQL easy and cost-effective.
On top of reliability, resilience, HA, DR, load balancing, performance and geo-scale distribution, Tungsten Clustering comes with an essential toolkit to make your MySQL environment easy to manage.
Conclusion: Disasters Happen...So Plan For Them
Even with clustering, backups are always necessary!
Best practices for data availability include a number of methods for ensuring business continuity. Use them all, and do not rely upon any single tool.
Check out: “3, 2, 1 MySQL Backup is Fun!” to learn more about backup planning for a clustered environment.