How MySQL Binary Log Gets Its Number

In this blog I share how MySQL binary logs get their unique numbers after noting my observations with MySQL binary logs sequencing. Recently I noted an odd binary log sequencing while tackling an issue. This issue made me learn “How MySQL Binary Log Gets Its Number”.

It is understood that MySQL appends a numeric extension to the binary log-base name to generate binary log file names. The number increases each time the server creates a new log file, thus creating an ordered series of files. This is the exact sequence numbering I am discussing.

A binary log can be named using the “log_bin_basename” variable, otherwise the default value is host_name-bin. This blog is not introducing you to basics of binary logs in MySQL but raising your pondering about the design decision for the numbering of mysql binary logs.

The server creates a new file in the series each time any of the following events occurs:

  • The server is started or restarted
  • The server flushes the logs.
  • The size of the current log file reaches max_binlog_size.

I am going to use MySQL’s FLUSH BINARY LOGS command to generate new binary logs for our tests. I am going to generate new binary logs, do some speak-able things with MySQL binary logs, surely not going to touch the index file and observe how MySQL behaves. I am using MySQL 8.0.33.

Let’s start with generating 5 binary logs using FLUSH BINARY LOGS command:

[root@centos_1 mysql]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 8.0.33-25 Percona Server (GPL), Release 25, Revision 60c9e2c5
…
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 157 | No |
+---------------+-----------+-----------+
1 row in set (0.00 sec)

mysql> flush binary logs;
Query OK, 0 rows affected (0.01 sec)

mysql> flush binary logs;
Query OK, 0 rows affected (0.01 sec)

mysql> flush binary logs;
Query OK, 0 rows affected (0.22 sec)

mysql> flush binary logs;
Query OK, 0 rows affected (0.24 sec)
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 201 | No |
| binlog.000002 | 201 | No |
| binlog.000003 | 201 | No |
| binlog.000004 | 201 | No |
| binlog.000005 | 157 | No |
+---------------+-----------+-----------+
5 rows in set (0.00 sec)

Compressing some MySQL Binary logs

What happens if the binary log file is compressed? or moved? It is possible for someone to compress a MySQL binary log file, may be to free-up the space but yet to keep the binary logs data. That’s what happened in the issue I tackled but I am not recommending this as a standard practice. One may even move the binary log to a separate partition. I am not saying it is a recommended practice but point here is move the binary log file that’s present in the directory and see how MySQL behaves.

[root@centos_1 mysql]# gzip binlog.00000[1234]

After compressing or moving away the binary logs you will get file-size as 0. Let’s insert some sample data and observe the binary log position or size move:

mysql> insert into x values (1),(1),(1),(1);
…
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 0 | No |
| binlog.000002 | 0 | No |
| binlog.000003 | 0 | No |
| binlog.000004 | 0 | No |
| binlog.000005 | 1245 | No |
+---------------+-----------+-----------+
5 rows in set (0.00 sec)
Observation

You may see here that the compressed binary logs are still present in the SHOW BINARY LOGS output, though the file-size is shown as 0.

Compressing active MySQL binary log

What happens if we compress the active binary log which is being used by MySQL?

[root@centos_1 mysql]# gzip binlog.00000[5]
mysql> use test
Database changed
mysql> insert into x values (3),(3),(3),(3),(3),(3),(3);
…
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 0 | No |
| binlog.000002 | 0 | No |
| binlog.000003 | 0 | No |
| binlog.000004 | 0 | No |
| binlog.000005 | 3149 | No |
+---------------+-----------+-----------+
5 rows in set (0.00 sec)

The compressed but recent binary log size is shown, plus, it keeps on increasing!

mysql> insert into x values (3),(3);

mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 0 | No |
| binlog.000002 | 0 | No |
| binlog.000003 | 0 | No |
| binlog.000004 | 0 | No |
| binlog.000005 | 3693 | No |
+---------------+-----------+-----------+
5 rows in set (0.00 sec)
Observation

When the active or current binary log is compressed, MySQL still uses it and adds data into it. As you can see the size of the binary log increases as well, even though the file remains compressed!

[root@centos_1 mysql]# ls -lhtr binlog.*
-rw-r----- 1 mysql mysql 153 Aug 19 21:37 binlog.000001.gz
-rw-r----- 1 mysql mysql 149 Aug 19 21:37 binlog.000002.gz
-rw-r----- 1 mysql mysql 150 Aug 19 21:37 binlog.000003.gz
-rw-r----- 1 mysql mysql 149 Aug 19 21:37 binlog.000004.gz
-rw-r----- 1 mysql mysql 80 Aug 19 21:37 binlog.index
-rw-r----- 1 mysql mysql 499 Aug 19 21:38 binlog.000005.gz
[root@centos_1 mysql]#
Task for you

Think about what happens if you rename the active binary log file as from “binlog.000005” to “binlog.000005.kedar”. Once done, run a test and let me know.

MySQL binary log number after compressed active binary log

Alright, now who is ready to stop here and tell me what happens if I issue the following command?

mysql> FLUSH BINARY LOGS;

Let’s do it:

mysql> flush binary logs;
Query OK, 0 rows affected (0.01 sec)

mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000001 | 157 | No |
| binlog.000002 | 0 | No |
| binlog.000003 | 0 | No |
| binlog.000004 | 0 | No |
| binlog.000005 | 0 | No |
| binlog.000001 | 157 | No |
+---------------+-----------+-----------+
6 rows in set (0.00 sec)

The binary log sequencing restarted from “000001”!! There are two binary logs in the index file with the same name: “binlog.000001”.

[root@centos_1 mysql]# ls -lhtr binlog.*
-rw-r----- 1 mysql mysql 153 Aug 19 21:37 binlog.000001.gz
-rw-r----- 1 mysql mysql 149 Aug 19 21:37 binlog.000002.gz
-rw-r----- 1 mysql mysql 150 Aug 19 21:37 binlog.000003.gz
-rw-r----- 1 mysql mysql 149 Aug 19 21:37 binlog.000004.gz
-rw-r----- 1 mysql mysql 499 Aug 19 21:38 binlog.000005.gz
-rw-r----- 1 mysql mysql 157 Aug 19 21:39 binlog.000001
-rw-r----- 1 mysql mysql 96 Aug 19 21:39 binlog.index
[root@centos_1 mysql]#

What if I compress some binary logs randomly? During my investigation I also noted that it will restart the sequencing from the last available binary log file sequence number.

[root@centos_1 mysql]# ls -lhtr binlog.*
-rw-r----- 1 mysql mysql 201 Aug 19 22:07 binlog.000005
-rw-r----- 1 mysql mysql 201 Aug 19 22:07 binlog.000006
-rw-r----- 1 mysql mysql 149 Aug 19 22:07 binlog.000007.gz
-rw-r----- 1 mysql mysql 150 Aug 19 22:07 binlog.000008.gz
-rw-r----- 1 mysql mysql 151 Aug 19 22:07 binlog.000009.gz
-rw-r----- 1 mysql mysql 201 Aug 19 22:07 binlog.000010
-rw-r----- 1 mysql mysql 151 Aug 19 22:07 binlog.000011.gz
-rw-r----- 1 mysql mysql 150 Aug 19 22:07 binlog.000012.gz
-rw-r----- 1 mysql mysql 126 Aug 19 22:07 binlog.000013.gz
-rw-r----- 1 mysql mysql 144 Aug 19 22:07 binlog.index
[root@centos_1 mysql]# mysql
…
mysql> flush binary logs;
Query OK, 0 rows affected (0.02 sec)
mysql> show binary logs;
+---------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000005 | 201 | No |
| binlog.000006 | 201 | No |
| binlog.000007 | 0 | No |
| binlog.000008 | 0 | No |
| binlog.000009 | 0 | No |
| binlog.000010 | 201 | No |
| binlog.000011 | 157 | No |
| binlog.000012 | 0 | No |
| binlog.000013 | 0 | No |
| binlog.000011 | 157 | No |
+---------------+-----------+-----------+
10 rows in set (0.00 sec)

Why is MySQL not using the index file for reference to form the binary log sequence? Why doesn’t it know that the last binary log file number logged is 000013 and it should just increment it by one? How did MySQL come to the conclusion of creating a binary log file of sequence number 000011?

Here is the answer for which this whole story is being built up.

How MySQL Binary Log Gets Its Number

The responsible function here is find_uniq_filename which you can find out in binlog.cc.

A very short explanation of the function would be:

  • The code iterates through the binary log directory from the log_bin_basename (or defaults to datadir) and looks for files with that name.
  • It loops through all the files that match the basename pattern and looks to see if everything after the log_bin_basename is a number.
  • For each file in /var/log/binarylogs/ which are like mysql-bin.[], identify if [] is a number.
  • If it is a number then set the max_found variable to that value. If not, skip to the next file.
  • Repeat this process until all the files are processed. If we have a max_found, then the next binary log file sequence number would be max_found+1. Alternatively the binary log will start from 1.

A shell script representation for understanding this idea (not matching the actual function):

  max_found=0
  log_bin_basename="/var/log/binarylogs/mysql-bin"
  for file in $log_bin_basename.*; do
      extension="${file##*.}"
      if [[ $extension =~ ^[0-9]+$ ]]; then
          ((max_found = extension > max_found ? extension : max_found))
      else continue;
      fi
  done;
  new_binlog_sequence=$((max_found + 1))
  echo "New Binary Log Sequence: $new_binlog_sequence"

Source Code Reference: binlog.cc.

It was an interesting investigation but I couldn’t do it alone. After an hour worth of code digging (binlog.cc), I met an excellent engineer, Venkatesh Prasad, not the cricketer but an excellent colleague at Percona (amazing place to work btw). He showed me how he could easily trace the code flow using gdb and helped me understand the MySQL binlog sequencing game.

I am sure you’re aware about the binlog transaction compression feature in MySQL 8.

Do you remember I said “MySQL continued writing to the recent binary log file even though it was compressed!“. Would you like to tell me in comments why or would you like to write a blog on that on nitty-witty?

Conclusion

This blog concludes with the knowledge of binary log sequencing by exposing how it works internally, specifically pointing at the find_uniq_filename function in binlog.cc. Encouraging you to explore the beauty of open-source tools and understand the product you’re working with.
Additionally, be careful while you’re moving around the binary logs and do not edit the binlog index file. Happy binlogging.

2 comments
    1. Inevitable… Do you have a thought about the “design decision”?… Why MySQL is not using index file to track the max binlog number?

Leave a Reply

Your email address will not be published. Required fields are marked *