Database Schema Migration Tools Choice Dilemma
While source code versioning has a lot of implementations in the form of finished products, version control for database schemas often requires some compromises to deal with. From experience, I know many cases when fancy looking features ruin the deployment process of the next application version.
One day we were assigned to prepare a database of a recently purchased project for future evolutions. The management of our company has decided that this project will be rewritten in Rust and later included into our infrastructure. So we had some freedom in choice of tools.
Looking ahead, I’ll say that we opted for a new database schema migration tool for us, which is both framework-agnostic and just perfect for PostgreSQL.
Many frameworks already had something on board for refactoring the project database schema, and we knew a couple of those for Rust. But this time we decided to stay away from implementation-specific dependencies and stopped our choice on Schema Guard.
The most well-known flagman tools for schema migrations today are Flyway and Liquibase, and Diesel is the most frequently used in Rust world. But we don’t like Java as dependency and won’t count on a particular framework. Furthermore, both most advertised tools are in fact less handy for us.
The Schema Guard has similar features in both Flyway and Liquibase, but offers great flexibility in how we can define the database schema. This tool allows the good-old plain SQL way and supports declarative schema definition in YAML format at the same time.
So, we can dump the existing project database schema just like this:
rdbm -c test.cfg --snapshot_to=S1_0_initial_schema.yaml snapshot
And make it the starting point for the further evolutions. The output schema dump in YAML is better than XML when we are using the comparison diff-like tools because of the clean, easier to read format without any tags and other leftovers.
What needs to be done to make the same initial database schema description for the Diesel framework – I don't even want to think about it!
Of course, we need to do some extra work with existing triggers and functions. As far as I know, no such ideal tool exists to deal with everything in one shot. The same is true when software engineers are going to add a new stored function. You need to write the function in .sql file and put this file into your migration scripts folder:
CREATE OR REPLACE function new_function() RETURNS varchar AS $$
BEGIN
return …;
END;
$$ LANGUAGE plpgsql;
When the appropriate script checksums change, all database schema migration tools that we tested allow repeating “repeatable” migrations. These are typically carried out following the completion of all versioned migrations. We found these to be especially useful when working with stored procedures and views.
Looks like all the versioning tools for databases are the same, but we found the essential differences for us. The big brothers both rely on Java, they interact with databases through driver, while Schema Guard uses standard PostgreSQL command line client psql. It is also able to create the target DB by itself.
Built-in ETL functions support looks great out of the box. It might look like if using this tool to regularly insert thousands of records, you are better to choose the other way or consider adding bulk import API in your application. But it might be logical to load some data through migrations. This may be appropriate if you are creating a database with some core lookup tables used to enforce referential integrity that are rarely updated, like States or Country lists etc. This may be an overkill to use a schema migration tool for that if you want to provide a postal code lookup and want to load tens of thousands of records. But this tool does the job just fine.