Welcome to our new blog series, where we examine specific topics about how customers have benefited from moving from Galera Cluster to Tungsten Cluster. For our first topic, we are going to focus on application performance, specifically, how Tungsten cluster leverages asynchronous replication which provides better performance than synchronous replication, used by Galera.
Quick Review - Asynchronous vs. Synchronous Replication
I’ve written about this before here.
To summarize, synchronous database replication entails the following steps:
- Transaction is written to the database, but not committed. Application will wait.
- Transaction data is then sent to other databases in the cluster.
- If there are multiple primary databases, database nodes in the cluster participate in a “certification” process. This is to decide if the transaction can be safely committed to all hosts in the cluster and there are no conflicts.
- Otherwise, other hosts acknowledge that they have received the transaction.
- Transaction is committed on all hosts, and control returns back to the application.
What Makes Asynchronous Replication More Robust?
Tolerating Network Latency
Asynchronous replication tolerates network delays and interruptions. When utilizing asynchronous replication, the database commits the transaction immediately and returns control to the application. On the other hand, any network latency or outage will have a direct impact on applications when using synchronous replication. This is because the database hosts must participate in a group chat and acknowledge receipt of the transaction before it can be committed. Any delay or interruption to this process will force the application to wait as well.
Summary: Async tolerates network latency better than sync.
Performance
Deriving from the point above, asynchronous replication is completely decoupled from the primary database. This minimizes the impact on application performance, as the primary database does not have to wait for replication to complete. Synchronous replication can introduce delays while it waits for transactions to either be acknowledged by other hosts in the cluster, or certification. The larger the transaction, the higher the potential for increased latency. Transaction size does not affect performance with asynchronous replication, so developers do not have to adjust their code to deal with latency with synchronous replication.
Summary: Async performs better than sync, and even better for large events.
Scalability
The nature of synchronous replication requires acknowledgement and possibly certification from all hosts in the cluster before a transaction can be committed. This, of course, takes time, and introduces performance lag in applications. As we add more database hosts to the cluster, this latency increases, as more nodes are now participating in the group chat to decide when a transaction will be committed (or rolled back). This is actually the opposite of the desired effect: We want to add more database hosts to scale horizontally, however when using synchronous replication, this could actually make performance worse. Asynchronous replication is unaffected by the number of hosts. Scaling horizontally will not impact performance, and can actually improve performance (and we will see how to do that later in this series!).
Summary: Async scales better than sync because more nodes take longer using sync.
Multi-Master Clusters - More is Better?
Synchronous replication with Galera offers “multi-master” clusters. These clusters contain multiple read/write primary databases. One might assume that having “more is better,” but in the context of database clustering, having multiple primaries will actually hurt performance. Remember, when having multiple primaries, every transaction must go through a certification process to decide if it is safe to commit the transaction to all hosts. This is additional overhead that is non-existent in asynchronous replication/single primary. As we mentioned above, adding more primaries means additional time for all participating nodes to certify each transaction. A better approach is to tune for reads. As a rule of thumb, database queries are 70 - 90% reads, so tuning for reads can yield more significant performance advantages than creating multiple primaries. This is also a topic that we will cover later in the series!
Up Next
We are going to talk about another advantage of asynchronous replication, used by Tungsten Cluster: deploying database clusters at geo-scale. The topics discussed in this article will be used as a basis for that upcoming article, and I hope you will follow this series.
Comments
Add new comment