GIT – First of all, we will discuss how to loadbalance queries from application to our MySQL database cluster. It is nice to have a database cluster that provides high availability and performance, but if you don’t find a proper way to connect, your efforts will be useless.
Different approaches exist to loadbalance queries from application to the database cluster. The purpose is to loadbalance queries to capitalize available resources but it is not the only one. If a node failed, the requests have to be correctly redirected to the remaining functional one. There are several approaches.
The first approach depends of your application. In my case, the database cluster is dedicated to a Java application, that’s why I tried several JDBC drivers to find one that matched my needs. Then we will discuss the MySQL Proxy. Despite the fact that it is still on alpha release, it looks really promising! Finally we will mention the IP layer loadbalancer I detailed in this previous post.
In the following section, the goal is to quickly identify the best solutions available to loadbalance queries to a MySQL cluster. The details are voluntarily not mentioned since the goal of this post is to identify the optimal MySQL cluster architecture in a context of two nodes. I benchmarked every option according to specific application criteria, but I think the results can be addressed to a more general context. The proposed solutions do not imply any modification from application perspective.
- JDBC drivers
- MySQL Proxy
- IP layer loadbalancer
JDBC drivers as loadbalancer
- Load Balanced MySQL driver
I used the Connector/J 5.1. Official JDBC driver for MySQL in its replication context as explained here. I quickly decided to give it up for several reasons. First of all, the response times were not satisfying, and then this driver does not support too many parallel connections. Finally data integrity was not always assured in stress situations.
- lbpool driver
MySQL officially suggests this open driver in a replication context. You can find more information about it here. This driver revealed much better performance than the previous one. Unfortunately, in my case, after more than 100’000 inserts, deadlocks appeared randomly. The driver entered in an unstable state where there was no way to connect to the cluster any more. That’s why I had to find an alternative.
This tool revealed good performance. Queries were correctly loadbalanced between my two nodes. I simulated the failure of one node; the queries were correctly redirected to the remaining one. I didn’t notice any inconsistency even for more than 100’000 inserts and the overhead was very small compared to a traditional connection. The only drawback is that the MySQL Proxy is still in alpha release and depends of the LUA script’s truthfullnees used to split read/write queries. However, it was the best candidate so far.
Loadbalance queries at IP level (keepalived)
This option explained in this previous post is the most mature and as proved to work over years. It allows to loadbalance MySQL queries between two nodes. We have to define a MySQL Virtual IPaddress. The application connects to the cluster through that VIP and then we configure the distribution on a round robin fashion for example. This option handles also the failover. If the node that currently owns the VIP failed, it is automatically taken over by the other at IP layer. Unfortunately, loadbalancing cannot be done at this level to split read/write queries. It is the best candidate to loadbalance queries and to handle failover, but it is not able to split read/write queries at this level without modifying the application or adding an extra mechanism that is to say combining it with MySQL Proxy for example.
Now that we have completed our overview of mechanisms’ distribution, it is time to enter the heart of the matter. Using one of the loadbalancer or a combination of several techniques we will try to list available MySQL database cluster architecture in a context of two nodes and select the best one according to several criteria such as complexity, ease of administration, performance…
MySQL Cluster — NDB engine
First we will mention the MySQL Cluster. It is a recent and evolved technology to provide high-level cluster architecture composed of three types of nodes. In our limited architecture, this kind of evolved cluster architecture is clearly not adapted since it is intended to a large number of nodes, that’s why we will not go deeper in this direction.
|Automatically synchronized||4 nodes at least|
|Synchronous replication||Large amount of memory needed|
|High-level administration and configuration||Difficult to evaluate a priori behavior|
|Hot backup||Still discouraged by community in a production environment|
|Easy to add/remove nodes||Not mature|
|Not so easy to deploy|
|No mechanisme to loadbalancer queries at MySQL nodes|
|Application dependent -> Need lots of tests|
In this scenario, to avoid any modification from application point of view, we set a MySQL Virtual IPAddress (VIP) using keepalived as explained in this previous post and then configure the distribution so that queries are redirected to one or the other node on a round robin fashion. If a node failed, it is removed from IPVS table and then it does not receive further requests. Thus high availability and performance is assured since total amount of queries is split between the two machines and the global service remains available even if a node failed.
Each machine reads the logs of the other and executes the queries of its neighbor that modify (insert/update) the database content. The only performance enhancement becomes from the fact that two machines are available to accept read queries. If your application performs much more writes queries, this kind of architecture will not be very efficient. Then you have to be conscious that replication is asynchronous. The content of your two databases is not necessarily the same at a time t. It could be problematic for sensible read queries that should take in account your last write query. Furthermore, the MySQL community does not promote this architecture. They say it works but they do not advise it.
|Transparent for the application||Have to handle specifically auto_increments, duplicate keys, etc.|
|Best way to fail over||Asynchronous replication|
|Easy to deploy||No performance enhancement except for read queries|
|Not promoted by MySQL Community|
Master/Slave Replication without loadbalancing
This technique is the easiest to deploy, the most mature. I will recommend it although it is not the most efficient one. With a standard master/slave replication we simply add a level of redundancy to a singleMySQL database. To make it work correctly, as with the master/master replication, we configure aMySQL VIP address. Every request is forwarded to the node that currently owns the VIP. In this scenario, the master performs every request, there is no loadbalancing. The slave who reads the master log file replicates the master. In order to facilitate the failover, the read-only flag is not set on the slave. Thus if the master failed, the slave will take over the VIP and be able to ensure serviceability without manual intervention. This architecture is safety and easy to deploy. It ensures a level of redundancy in the case of failure without modifying the application. However, if a node failed, we have to repair it manually.
|Transparent for the application||Resources not capitalized since only one node works at a time|
|High availability||No high performance|
Master/Slave Replication with loadbalancing
In this scenario, which is the most complicated to test in our case, the IP loadbalancer can indifferently redirects queries to one or the other backend on a round robin fashion. Then, in order to split read and write queries, we decide to use the MySQL Proxy and the rw-splitting.lua LUA script, since it is the better solution we find to split queries safely.
mysql-proxy \ --proxy-lua-script=/etc/mysql-proxy-0.6.0/lib/rw-splitting.lua --proxy-backend-addresses=master_vip:3306 \ --proxy-read-only-address=slave_vip:3306 &
The trick is to configure two VIP (master and slave VIP). All write access are forwarded to master_vipand read access to slave_vip. At proxy startup, we define a RW backend accessible through the VIP, thus if the master failed, the slave will take over the master_vip and be able to handle write queries since there is currently no way to do it with a LUA script. Here we assume keepalived deals correctly with VIP addresses. In spite of that, we are force to admit there is another single point of failure, which is the proxy itself. That’s why we have to start over the two machines and monitor it. The application has to access it through another VIP. I didn’t test this global solution since it seemed very complicated to ensure safety. Nevertheless it should work and provide a good solution, but at the price of high complexity! That’s why I’m free to any other proposition; feel free to suggest a better schema. Otherwise I will keep on promoting the master/slave replication with no loadbalancing…
|Clearly the best solution in term of performance||But what about feasibility?!|
|As good as the others in term of availability|