Alter table in MySQL and MariaDB without locking table with Percona Toolkit
Run alter table query in MySQL or MariaDB database is is frequent in lifecycle of a project. But this simple operation becomes more and more difficult to plan with growing data. Only few hundred thousand of lines and you'll start to have trouble. InnoDB engins lock table during alter query, so you can't use it for insert, update or delete and select will be slower. Under a minute, you can do it whenever you want but when it takes 10 minutes or more, you can't (you must not) do it when you are a lot of user on your application.
Hopefully, there are solutions out there, Percona have one: pt-online-schema-change. This tool allow you to alter table without locking the table. To achieve this, pt-online-schema-change performs many task, it copies table, adds trigger to register every change on original table, applies alter query on new table, applies all change registered by trigger, swaps original table by the new one and cleans his work (deletes old table, trigger).
pt-online-schema-change is part of Percona Toolkit. You can download it here : https://www.percona.com/downloads/percona-toolkit/LATEST/
For example, I'll use password database provided by Have I Been Pwned. With file "update 1" I'll get a 1Gb table, it's enough to have conclusive tests.
Table schema is simple:
CREATE TABLE `passwd_dictionary` ( `id` int(11) unsigned NOT NULL AUTO_INCREMENT, `hashed_passwd` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=13696816 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
I use this query to load data from file I uncompress in /tmp directory:
MariaDB [hibp]>LOAD DATA INFILE '/tmp/password.txt' INTO TABLE passwd_dictionary (hashed_passwd);
To start I need some metrics, how long it take to find hash in table and to insert a new password.
MariaDB [hibp]> SELECT SQL_NO_CACHE * FROM passwd_dictionary WHERE hashed_passwd = '175FFE731F27432C706F872566DFD33EF5519DD9'; Empty set (2.98 sec) MariaDB [hibp]> INSERT INTO passwd_dictionary (hashed_passwd) VALUES ('password'); Query OK, 1 row affected (0.32 sec)
Now I'll change hash_passwd field size from 255 to 50. During the alter table, in another terminal, I will run previous query to measure impact of alter table.
MariaDB [hibp]> ALTER TABLE passwd_dictionary CHANGE hashed_passwd `hashed_passwd` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL ; Query OK, 13675934 rows affected (46.52 sec) Records: 13675934 Duplicates: 0 Warnings: 0 MariaDB [hibp]> SELECT SQL_NO_CACHE * FROM passwd_dictionary WHERE hashed_passwd = '175FFE731F27432C706F872566DFD33EF5519DD9'; Empty set (3.85 sec) MariaDB [hibp]> INSERT INTO passwd_dictionary (hashed_passwd) VALUES ('password2'); Query OK, 1 row affected (8.11 sec)
On my laptop, it tooks almost 47 seconds to perform alter table. Select query is slightly slower, Insert query need long time. In reality insert query waited the end of alter table query. If I was faster, my insert query should take 47 seconds. !in case of webpage, I'll get a timeout error, on many page it's a disappointed behavior, on specific page like payment page, it's a big problem.
pt-online-schema-change can't be use from mysql or mariadb client but from linux command line. This tool has a lot of options, I'll not covered them all, documentation is very explicit.
root@ulaptop:~# pt-online-schema-change \ > --alter "change hashed_passwd hashed_passwd varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL" \ > D=hibp,t=passwd_dictionary --alter-foreign-keys-method auto \ > --max-load Threads_running=50 --critical-load Threads_running=100 \ > --no-drop-old-table --host=localhost --user=root --execute No slaves found. See --recursion-method if host ulaptop has slaves. Not checking slave lag because no slaves were found and --check-slave-lag was not specified. Operation, tries, wait: analyze_table, 10, 1 copy_rows, 10, 0.25 create_triggers, 10, 1 drop_triggers, 10, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 No foreign keys reference `hibp`.`passwd_dictionary`; ignoring --alter-foreign-keys-method. Altering `hibp`.`passwd_dictionary`... Creating new table... Created new table hibp._passwd_dictionary_new OK. Altering new table... Altered `hibp`.`_passwd_dictionary_new` OK. 2018-01-07T21:30:10 Creating triggers... 2018-01-07T21:30:10 Created triggers OK. 2018-01-07T21:30:10 Copying approximately 13275314 rows... Copying `hibp`.`passwd_dictionary`: 71% 00:12 remain 2018-01-07T21:30:54 Copied rows OK. 2018-01-07T21:30:54 Analyzing new table... 2018-01-07T21:30:54 Swapping tables... 2018-01-07T21:30:54 Swapped original and new tables OK. Not dropping old table because --no-drop-old-table was specified. 2018-01-07T21:30:54 Dropping triggers... 2018-01-07T21:30:54 Dropped triggers OK. Successfully altered `hibp`.`passwd_dictionary`.
The output of the command contains a lot of detail of what happened. It took 50 seconds, it's slower than alter table query. As you can see, there is a lot of parameters, --alter only contains the change on the table. 
Be careful of D (database) and t (table) options, they are separated by coma not by space.
--max-load and --critical-load options allow to paused or stopped the command if there are too much load on the server. These options must be adjust with your server performance.
Sometimes, it could be interesting to keep old table, this is role of --no-drop-old-table option, but be careful of disk space.
Last option --execute will run the command, I advise you to always replace this option by --dry-run to test your command at first.
During pt-online-schema-change running, I run some insert query and they all run instantaneous.
pt-online-schema-change is a great tool when your database grow. It's so easy to use,why should we wait to use it? Other tools like this one exist, create by facebook or github, I'll let you choose which one suits you the most.
Add a comment