Database Table Space Recovery

Database Table Space Recovery

Tags
Database
MySQL
Published
December 3, 2022
The data of a table can be stored either in a shared table space or as a separate file. This behavior is controlled by the parameter innodb_file_per_table:
  1. When this parameter is set to OFF, the data of the table is stored in the system shared table space, which is combined with the data dictionary;
  1. When this parameter is set to ON, the data of each InnoDB table is stored in a separate file with the .ibd suffix.
From MySQL 5.6.6 onwards, its default value is ON. It is recommended to set this value to ON.
Storing each table as a separate file makes management easier. When the table is no longer needed, the system can delete the file directly by using the drop table command. However, if the data is stored in a shared table space, the space will not be reclaimed even after the table is deleted.

Data Deletion Process

Assume that we want to delete the record R4. The InnoDB engine will only mark this record as deleted. If a new record with an ID between 300 and 600 is inserted later, the position may be reused. However, the size of the disk file will not shrink.
notion image
InnoDB data is stored in pages. If all records on a data page are deleted, the entire data page can be reused.
If the usage of two adjacent data pages is very low, the system will merge the data on these two pages onto one page, and the other data page will be marked as reusable.
If you use the delete command to delete all data in the table, all data pages will be marked as reusable. However, the file on the disk will not shrink.
Updating the value on the index can be understood as deleting an old value and inserting a new one. This will also cause fragmentation. Tables that have undergone a large amount of insertions, deletions, and modifications may have fragmentation. Therefore, if these fragments can be eliminated, the goal of shrinking the table space can be achieved.
Rebuilding the table can achieve this goal. Use the alter table A engine=InnoDB command to rebuild the table.

MySQL 5.6 introduced Online DDL to optimize this operation process.
Online DDL's table rebuilding process:
  1. Create a temporary file and scan all data pages of table A's primary key;
  1. Generate a B+ tree from the records of table A in the data pages and store them in the temporary file;
  1. During the creation of the temporary file, all operations on table A are recorded in a log file (row log), which corresponds to the state 2 in the diagram;
  1. After the temporary file is generated, apply the operations in the log file to the temporary file to obtain a data file that is logically the same as table A, which corresponds to the state 3 in the diagram;
  1. Replace the data file of table A with the temporary file.
Due to the existence of the function of recording and replaying operations in the log file, this scheme allows adding, deleting, and modifying operations on table A during the table rebuilding process. In versions before 5.5, if there is new data to be written to table A during this process, it will cause data loss.
The entire DDL process is completed within InnoDB. For the server layer, since the data is not moved to a temporary table, it is an "in-place" operation, which is the origin of the name "inplace".
If you have a 1TB table and the disk space is currently 1.2TB, you cannot perform an inplace DDL because the tmp_file also takes up temporary space.