A tale of Corrupt InnoDB table, MySQL crash & recovery

I’m going to narrate you a story that happened around a crashing MyQL, Corrupted InnoDB table and finally the recovery by table restore. We will see how our database administrator detected the issue and what he did to resolve it.

A day in MySQL Database Consultant’s day was taking its shape while a friend called for help.

Friend: Hey, my mysql is crashing and website isn't functioning well. Everything is down. Can you help me? I am seeing following errors in the log.

[FATAL] InnoDB: Aborting because of a corrupt database page in the system tablespace.

InnoDB: Assertion failure

mysqld got signal 6

Our database admin quickly jumps in and checks for the MySQL error log.

        2018-01-01T07:39:03.173398Z 0 [ERROR] InnoDB: Database page corruption on disk or a failed file read of page [page id: space=13701, page number=4603]. You may have to recover from a backup.
        2018-01-01T07:39:03.173428Z 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
        ...
        InnoDB: End of page dump
        2018-01-01T07:39:03.265864Z 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 4187651462, calculated checksums for field1: crc32 4128877936/1194051977, innodb 680941878, none 3735928559, stored checksum in field2 3735928559, calculated checksums for field2: crc32 4128877936/1194051977, innodb 1675940203, none 3735928559,  page LSN 400 3284265879, low 4 bytes of LSN at page end 3284252104, page number (if stored to page already) 4603, space id (if created with >= MySQL-4.1.1 and stored already) 13701
        InnoDB: Page may be an index page where index id is 33515
        2018-01-01T07:39:03.265911Z 0 [Note] InnoDB: Index 33515 is `PRIMARY` in table `nitty-witty`.`flat_address`
        2018-01-01T07:39:03.265919Z 0 [Note] InnoDB: It is also possible that your operating system has corrupted its own file cache and rebooting your computer removes the error. If the corrupt page is an index page. You can also try to fix the corruption by dumping, dropping, and reimporting the corrupt table. You can use CHECK TABLE to scan your table for corruption. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
        2018-01-01T07:39:03.265944Z 0 [ERROR] [FATAL] InnoDB: Aborting because of a corrupt database page in the system tablespace. Or,  there was a failure in tagging the tablespace  as corrupt.
        2018-01-01 07:39:03 0x7f5fa0466700  InnoDB: Assertion failure in thread 140048687589120 in file ut0ut.cc line 916
        InnoDB: We intentionally generate a memory trap.
        InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
        InnoDB: If you get repeated assertion failures or crashes, even
        InnoDB: immediately after the mysqld startup, there may be
        InnoDB: corruption in the InnoDB tablespace. Please refer to
        InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
        InnoDB: about forcing recovery.
        07:39:03 UTC - mysqld got signal 6 ;
        This could be because you hit a bug. It is also possible that this binary
        or one of the libraries it was linked against is corrupt, improperly built,
        or misconfigured. This error can also be caused by malfunctioning hardware.
        Attempting to collect some information that could help diagnose the problem.
        As this is a crash and something is definitely wrong, the information
        collection process might fail.

        key_buffer_size=536870912
        read_buffer_size=16777216
        max_used_connections=6
        max_threads=214
        thread_count=6
        connection_count=6
        It is possible that mysqld could use up to
        key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 7539498 K  bytes of memory
        Hope that's ok; if not, decrease some variables in the equation.

        Thread pointer: 0x0
        Attempting backtrace. You can use the following information to find out
        where mysqld died. If you see no messages after this, something went
        terribly wrong...

database admin: Yeah... your MySQL has a corrupted InnoDB table. What did you do?

Friend: Earlier today I had my disk full, I moved few files and deleted others. Then I tried to backup but my MySQL Crashed. So I tried to repair it.

mysql> repair table nitty-witty.flat_address;
+--------------------------------------+--------+----------+---------------------------------------------------------+
| Table                                | Op     | Msg_type | Msg_text                                                |
+--------------------------------------+--------+----------+---------------------------------------------------------+
| nitty-witty.flat_address | repair | note     | The storage engine for the table doesn't support repair |
+--------------------------------------+--------+----------+---------------------------------------------------------+
1 row in set (0.00 sec)

Friend (continued): Since this storage engine doesn't support repair, I tried to change the engine to MyISAM but it failed as well due to foreign keys.
ALTER TABLE nitty-witty.flat_address ENGINE=MyISAM;

database admin: No No... Good that it failed. Do not convert to MyISAM. Rather, forget MyISAM. Do you have backups? I see there are not binary logs either.

Friend:"yeah, I have one mysqldump from a week ago. I have some recent data in here. Can you repair it?"

No backups, corrupted data? Awesome day! right?

Our database admin looked further for hope. He found that MySQL is crashing frequently upon accessing the corrupted InnoDB table only and not otherwise. Error log does mention this is specific to Primary index. He quickly resorted to the innodb_force_recovery aiming to backup the data first.

You can use the innodb_force_recovery option to force the InnoDB storage engine to start up while preventing background operations from running, so that you can dump your tables.

1 (SRV_FORCE_IGNORE_CORRUPT): Lets the server run even if it detects a corrupt page. Tries to make SELECT * FROM tbl_name jump over corrupt index records and pages, which helps in dumping tables.

Above section comes from documentation and thus our Database Admin concluded to go ahead with following steps:
1. Chang MySQL configuration file (my.cnf) to add innodb_force_recovery=1
2. Restart MySQL
3. Dump the corrupted InnoDB table.

The backup of table was successful. Our database administrator later removes the innodb_force_recovery from my.cnf and restarts the MySQL server.

Further he restored the table as flat_address_NEW and reviewed the restored data.

mysql> select count(1) from flat_address_NEW;
+----------+
| count(1) |
+----------+
|   277878 |
+----------+
1 row in set (0.05 sec)

mysql> select count(1) from flat_address;
+----------+
| count(1) |
+----------+
|   277878 |
+----------+
1 row in set (0.06 sec)

Counts matched, his friend was happy to see his data back. Friend (table owner) manually verified some of the contents of the table and satisfied with the recovery.

Our database admin concluded to swap the tables as follows:

1. Rename flat_address to flat_address_OLD
2. Rename flat_address_NEW to flat_address

But wait there’s a problem. They had foreign keys remember?

See what documentation reads:

RENAME TABLE changes internally generated foreign key constraint names and user-defined foreign key constraint names that contain the string “tbl_name_ibfk_” to reflect the new table name. InnoDB interprets foreign key constraint names that contain the string “tbl_name_ibfk_” as internally generated names.

Foreign key constraint names that point to the renamed table are automatically updated unless there is a conflict, in which case, the statement fails with an error. A conflict occurs if the renamed constraint name already exists. In such cases, you must drop and re-create the foreign keys in order for them to function properly.

So, what will happen is if we swap, the related tables will point to flat_address_OLD table and not the newly restored (and renamed) table flat_address.

Also skipping foreign_key_checks won’t help here. They cannot simply swap tables, they will have to drop and recreate the references.
Next thing our database admin did is to look for tables referencing the corrupted table:

mysql>  SELECT TABLE_SCHEMA,TABLE_NAME,REFERENCED_TABLE_NAME,REFERENCED_COLUMN_NAME FROM information_schema.KEY_COLUMN_USAGE WHERE  REFERENCED_TABLE_SCHEMA = 'nitty-witty' and REFERENCED_TABLE_NAME='flat_address';
+--------------+-------------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME        | REFERENCED_TABLE_NAME    | REFERENCED_COLUMN_NAME |
+--------------+-------------------+--------------------------+------------------------+
| nitty-witty  | flat_address_item | flat_address             | address_id             |
| nitty-witty  | flat_rate         | flat_address             | address_id             |
+--------------+-------------------+--------------------------+------------------------+

The plan there after was as follows:

  1. Drop foreign keys.
  2. Swap corrupted table with recovered table.
  3. Recreate foreign keys.
mysql>  alter table flat_address_item drop foreign key FK_B521389746C00700D1B2B76EBBE53854;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table flat_rate drop foreign key FK_B1F177EFB73D3EDF5322BA64AC48D150;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> rename table flat_address to flat_address_OLD,flat_address_NEW to flat_address;
Query OK, 0 rows affected (0.01 sec)

mysql> alter table flat_address_item ADD CONSTRAINT `FK_B521389746C00700D1B2B76EBBE53854` FOREIGN KEY (`quote_address_id`) REFERENCES `flat_address` (`address_id`) ON DELETE CASCADE ON UPDATE CASCADE;
Query OK, 156 rows affected (0.06 sec)
Records: 156  Duplicates: 0  Warnings: 0

mysql> alter table flat_rate ADD CONSTRAINT `FK_B1F177EFB73D3EDF5322BA64AC48D150` FOREIGN KEY (`address_id`) REFERENCES `flat_address` (`address_id`) ON DELETE CASCADE ON UPDATE CASCADE;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

Finally our Database admin informed his friend that the table is restored and he will have to drop the original corrupted table after the data has been verified. Database Admin also provided him with two golden lines:
1. Backup your data regularly.
2. Setup basic monitoring.

To summarize: A corrupted InnoDB Table was recovered by mysqldump / restore after restarting MySQL with innodb_force_recovery. Once data was verified by owner, the tables were swapped carefully handling the Foreign Keys. Finally the corrupted table was dropped to avoid further MySQL crashes.

Note that the table was comparatively small but there was no recent backup. Also there was complete downtime during this recovery activity.

Hope you learnt something from our MySQL Database Administrator’s adventure.

Leave a Reply

Your email address will not be published. Required fields are marked *