“How can we replicate without using binary logs” just a thought popped up in my brain during a casual discussion. This blog post is a brain dump of the possible options and a PoC.
MySQL standard replication heavily relies on the primary writer’s binary log, which serves as a fundamental component. Within the writer node, the binary log records all changes, and the replica node retrieves these logs to apply the corresponding modifications locally. However, a question we’re dwelling here: Is it possible to propagate changes between servers without resorting to the traditional binary log mechanism? Let’s explore alternative approaches to achieve seamless data synchronization while bypassing the conventional binary log methodology.
Note that we’re considering no possibility of data consistency validation, as we cannot use pt-table-checksum for the way it is designed to work. This is a DBA’s wild run about an idea and sharing.
Quick note
Before we start, note that this blog was written in 2018 but never published. It got lost in time only to resurrect now. That said, the content are relevant and thought provoking.
Let’s call the writer as mysql1 and let mysql2 be our binlog-independent replica. So how can we keep the data of mysql1 and mysql2 in sync?
Writes on both nodes for replication
We can have the application / writers to write on both the servers. This will need the application level changes and that doesn’t sound like an appropriate solution.
System-level syncs for replication
One way to replicate data from the writer (mysql1 – the source) to the replica (mysql2 – the destination) is by using a system-level synchronization tool like rsync. However, it’s essential to consider that the InnoDB engine continues to operate in the background, maintaining buffers and undo logs, which can complicate achieving a consistent copy of the dataset. We may even end-up corrupting the data on the source itself.
To ensure data consistency during the replication process, the simplest approach will need us to temporarily stop the MySQL service on the writer node and then proceed taking the snapshot. This snapshot used to spin-up the fresh data. But would you call it replication? may be… 🙂
I’d also look at possible solutions that may arise from DRBD like technologies but my past experience tells me to not dwell in that direction and keep it simple.
Frequent backup and restores for replication
This is a little bit different than the previous one as my idea here was to use incremental backups. To start with a full backup of master is restored to replica and then we can use incremental backup from master and use it to get the replica in sync with writer. But a big assumption here would be that the replica is not really being used for writes! If the replica has applied changes or was used for writes, the instance will not allow any additional increments on it.
So, nothing yet seem to be a nice solution for us. Now rewind back to the first solution where we talked about “writing on both the nodes”, and there we rejected the idea as it will need application change. Would it be nice that’s possible without application change?
ProxySQL Mirroring as replication
Imagine you have a production MySQL environment, and you want to have Dev replicas that have as realistic data as possible, but not without any PII or similarly sensitive data. We can achieve this using ProxySQL’s mirroring combined with rewrite rules.
Let explore the ProxySQL Mirroring option to setup replica without binary logs. Know that this is a proof of concept and doesn’t necessarily suite you as a solution. My intention here is to share the ideas and encourage trying new possibilities.
Current setup has a writer, mysql1 and a replica, mysql2. There is another node, mysql3, which is having ProxySQL v1.4.6 running.
Prepare the ProxySQL configuration
(admin@localhost) [(none)]>select * from runtime_mysql_servers;
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 2 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 1 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
1 rows in set (0.00 sec)
(admin@localhost) [(none)]>select * from runtime_mysql_users;
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| username | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
| app | *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19 | 1 | 0 | 1 | | 0 | 1 | 0 | 0 | 1 | 10000 |
| app | *2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19 | 1 | 0 | 1 | | 0 | 1 | 0 | 1 | 0 | 10000 |
+----------+-------------------------------------------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
(admin@localhost) [(none)]>select rule_id, match_pattern, destination_hostgroup,mirror_hostgroup from runtime_mysql_query_rules;
+---------+---------------+-----------------------+------------------+
| rule_id | match_pattern | destination_hostgroup | mirror_hostgroup |
+---------+---------------+-----------------------+------------------+
| 1 | ^INSERT | 1 | 2 |
+---------+---------------+-----------------------+------------------+
1 row in set (0.00 sec)
We’re just considering INSERT for the experiment; to get complete functionality we may then add other patterns (UPDATE/DELETE). You may also use match_pattern=’^SELECT’ with negate_match_pattern=1 to let everything other than SELECT to be mirrored.
Testing the ProxySQL Mirroring as replication
Connect to ProxySQL and execute an insert
[root@mysql3 ~]# mysql -uapp -ppassword -h127.0.0.1 -P6033
...
(app@127.0.0.1) [(none)]>select @@hostname;
+------------+
| @@hostname |
+------------+
| mysql1 |
+------------+
1 row in set (0.00 sec)
(app@127.0.0.1) [(none)]>use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
(app@127.0.0.1) [test]>insert into a values (null);
Query OK, 1 row affected (0.00 sec)
Check if that insert land on both nodes, mysql1 & mysql2:
mysql1 (Master):
----------------
(root@localhost) [test]>select * from test.a;select count(*) from test.a;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)
(root@localhost) [test]>select @@hostname;
+------------+
| @@hostname |
+------------+
| mysql1 |
+------------+
1 row in set (0.00 sec)
mysql2 (Slave):
---------------
(root@localhost) [test]>select @@hostname;
+------------+
| @@hostname |
+------------+
| mysql2 |
+------------+
1 row in set (0.00 sec)
(root@localhost) [test]>select * from test.a; select count(*) from test.a;
+----+
| id |
+----+
| 1 |
+----+
1 row in set (0.00 sec)
+----------+
| count(*) |
+----------+
| 1 |
+----------+
1 row in set (0.00 sec)
(root@localhost) [test]>
This proves our point, we can replicate between two nodes without binary-logs using ProxySQL. Again, this is not a consistent mirroring solution and it is certainly not production ready, as Marco Tusa already pointed out last year in his blogpost about the mirroring feature. but what’s wrong in creating interest!
ProxySQL Mirroring with 3 nodes replication?
We could have ended our experiment here but I tried to add another node to the picture. Can we replicate to more than 2 nodes? Let’s see
Added new mysql node in the mysql_servers
(admin@localhost) [(none)]>select * from runtime_mysql_servers;
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 3 | mysql3 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 2 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 1 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
3 rows in set (0.01 sec)
Define query rules
(admin@localhost) [(none)]>select rule_id, match_pattern, destination_hostgroup,mirror_hostgroup from runtime_mysql_query_rules;
+---------+---------------+-----------------------+------------------+
| rule_id | match_pattern | destination_hostgroup | mirror_hostgroup |
+---------+---------------+-----------------------+------------------+
| 1 | ^INSERT | 1 | 2 |
| 2 | ^INSERT | 1 | 3 |
+---------+---------------+-----------------------+------------------+
Result: This setup didn’t work. To keep the length short I’d just explain why.
Why: The reason it didn’t work is that these rules are not chained. The proxy will start processing the query rules in the order of the rule_id. Our INSERT-query will match the first rule, get sent to hostgroup 1 and get mirrored to hostgroup 2 and that is where the rule processing ends. The 2nd rule will never be matched. If we chain these rules together (like 1 -> 2) then the 2nd rule will override the first one and the query will mirrored only to hostgroup 3. So this is also not what we wanted to achieve.
Use same hostgroup_id for all the hosts and update the MySQL query rules:
(admin@localhost) [(none)]>select * from runtime_mysql_servers;
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 2 | mysql3 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 2 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 2 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
+--------------+----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
3 rows in set (0.01 sec)
# Updated hostgroup_id of all the servers to 2.
Update the mysql query rules for propagating writes
(admin@localhost) [(none)]>select rule_id, match_pattern, destination_hostgroup from runtime_mysql_query_rules where match_pattern like '%insert';
+---------+---------------+-----------------------+------------------+
| rule_id | match_pattern | destination_hostgroup | mirror_hostgroup |
+---------+---------------+-----------------------+------------------+
| 1 | ^INSERT | 1 | 2 |
+---------+---------------+-----------------------+------------------+
Result: Didn’t work.
Why: ProxySQL will never send a query to multiple servers in a hostgroup. If you have 2 (or more) servers in a hostgroup, ProxySQL will pick only one to send the query. The mirroring feature works no different.
I tried various combinations of chained mysql_query_rules but nothing seemed to work. One thing is certain, the documentation on query mirroring could use a little more updating.
I consulted my friend – ProxySQL Guy and he shared a discussion explaining a “hacky way” to achieve this:
– Configure all servers in different hostgroups
– Set mirror to send the query back to proxysql itself, and create rules for the mirrored query (for example using different ports)
– The original mirrored query can be send to a hostgroup , and mirrored again back to proxysql.
– Iterate until all hostgroups are reached.
Note that: Even though there is this hacky way, officially ProxySQL doesn’t support it.
Final Attempt – The hacky way
Set ProxySQL to listen on two other ports
(admin@127.0.0.1) [(none)]>set mysql-interfaces='0.0.0.0:6033;/tmp/proxysql.sock;127.0.0.1:6034;127.0.0.1:6035';
Query OK, 1 row affected (0.00 sec)
(admin@127.0.0.1) [(none)]>save mysql variables to disk; proxysql restart;
Query OK, 93 rows affected (0.02 sec)
ERROR 2013 (HY000): Lost connection to MySQL server during query
(admin@127.0.0.1) [(none)]>show variables;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3
Current database: *** NONE ***
| mysql-interfaces | 0.0.0.0:6033;/tmp/proxysql.sock;127.0.0.1:6034;127.0.0.1:6035 |
Confirming ProxySQL is listening
[root@mysql3 ~]# netstat -anp | grep 60
tcp 0 0 127.0.0.1:6032 0.0.0.0:* LISTEN 8627/proxysql
tcp 0 0 0.0.0.0:6033 0.0.0.0:* LISTEN 8627/proxysql
tcp 0 0 127.0.0.1:6034 0.0.0.0:* LISTEN 8627/proxysql
tcp 0 0 127.0.0.1:6035 0.0.0.0:* LISTEN 8627/proxysql
Create mysql servers
(admin@127.0.0.1) [(none)]>select * from runtime_mysql_servers;
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 2 | mysql2 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 11 | 127.0.0.1 | 6035 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 10 | 127.0.0.1 | 6034 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 1 | mysql1 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
| 3 | mysql3 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | |
+--------------+-----------+------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
5 rows in set (0.01 sec)
The query rules
(admin@127.0.0.1) [(none)]>select rule_id,proxy_port, match_pattern, destination_hostgroup,mirror_hostgroup from runtime_mysql_query_rules;
+---------+------------+---------------+-----------------------+------------------+
| rule_id | proxy_port | match_pattern | destination_hostgroup | mirror_hostgroup |
+---------+------------+---------------+-----------------------+------------------+
| 10 | 6033 | ^INSERT | 1 | 10 |
| 11 | 6034 | NULL | 2 | 11 |
| 12 | 6035 | NULL | 3 | NULL |
+---------+------------+---------------+-----------------------+------------------+
3 rows in set (0.00 sec)
Let’s see how an INSERT would be processed:
- On rule_id 10, the ProxySQL will match the INSERT pattern, execute on destination_hostgroup 1 and mirror to hostgroup 10. Our write reaches mysql1
- Hostgroup 10 is proxy itself listening on 127.0.0.1:6034
- The rule_id 11 get matched with the proxy_port 6034 and will get executed for the destination_hostgroup 2 and then mirrored to hostgroup 11.
- Our write reaches mysql2
- The hostgroup 11 is proxysql itself, again, listening on 127.0.0.1:6035
- The rule_id 12 matches and get parsed with proxy_port 6035, which redirects to destination_hostgroup 3.
- Our write reaches mysql3
Conclusion
There is always a way – the right way or the hacky way. ProxySQL is great tool with potential and we look forward to explore more on it. Don’t miss on reading the ProxySQL Configuration files and startup process.
Our experiments showed that we can replicate to multiple hosts without binary logging using ProxySQL mirroring feature. This feature is currently designed for 1:1 mirroring but for 1:N, we really need to go a non-standard route. It still is experimental but it can surely have some awesome use-cases. Do experiment with it. If you find bugs or see use cases for cool new features, do report them to the authors. If we can create a bigger interest in this, the authors will surely start investing more time in making it better.
4 comments
Another way is to leverage storage replication which just offloads everything from database and can achieve data consistency.
Hi Rama. That’s right, I intended to convey the same when I covered “System-level” section noting “snapshot / drbd like”.
Thanks for suggesting.
Hi Kedar , it was nice blog.if you can share my.cnf files configuration for both MySQL servers .so that we will understand what parameters you were using in this blog .Thank you
There is no specific config for this experiment.