In this lunchtime #WPQuickies, I talk about WordPress database synchronisation.
This topic is a bit technical so get ready for some information overload.
Let’s tackle the big question everyone wants to know.
The Question You All Want To Know Is…
“How do I synchronise my WordPress staging/development database with the live/production database?”
I see this question posted in forums and groups at least twice every week.
It’s a simple question with a rather complicated answer.
Some Database Related Words And Meanings
Often I find that those asking the question don’t understand the terminologies correctly so let’s cover that quickly.
Copy
A one-time duplicate snapshot of an existing database. Thereafter, copy can contain different data from the original source database.
Clone
A one-time duplicate snapshot of an existing database. The terminology suggests that the data is exactly the same as the original database, however, after cloning the copy can contain different data.
Replication
This is a process of frequent cloning from an original database to a new copy. The purpose is to create a distributed database system. Frequency is determined by the systems manager and can range from every few minutes to hourly, daily, weekly or as needed.
Replication is a one-way process. Usually termed “master” and “slave” where a single source of truth database is named as the “master”. The “slave” databases are the result of the replication process.
You can use this system with WordPress using a plugin called hyperDB and you can speed up your site by load-balancing between multiple database copies.
Asynchronous
This is when data is sent out of sequence between two computers. This could mean that data is sent “out of order” or ad-hoc. The receiving computer has to determine how to correctly insert the data into its systems.
Synchronous
This is when data is sent in sequence. The receiving computer expects the data received to be in the correct order/sequence to insert into its systems.
Synchronisation (Syncing)
Data synchronisation is the process of establishing consistency among data from a source to a target data storage and vice versa and the continuous harmonization of the data over time.
In layman’s terms, data can be added to a single cloned database and the system will ensure that the data is copied and synchronised across all the databases it manages.
Merging
Merging is a process used by synchronisation.
The database system must be able to determine which data is the most recent or “truth” in order to overwrite/update older data to support synchronisation.
This is pretty difficult and can result in conflicts where two or more data cannot be differentiated as to which one is the truth.
Transaction
This is a single database operation; a read, write, delete, update, query, etc.
Transaction Management
A system that looks after the database transactions, making sure they are completed and in the correct order.
Roll Back
Used by the transaction management system to undo one or more transactions (database operations).
Roll Forward
Used by the transaction management system to redo/action one or more transactions (database operations).
Tables
A table is a set of grouped data elements.
A database usually has multiple tables.
Columns
A table column identifies data by name and type: e.g. FirstName Type=text
A table usually contains multiple columns.
Rows
A row is one entry in a table where data is stored in in each column: e.g. FirstName, LastName, DoB, PostCode
Primary Key
This is a special column in a table with a unique identifier. Usually this is number but it can be anything unique.
The primary key is used to be able to uniquely identify a row of data in a table.
Those are some pretty technical terms but I think it’s important to understand the basic operations of a database before answering the main question.
A Typical WordPress Designer/Developer Scenario
Here’s a typical scenario that WordPress designers and developers will find themselves in.
The client has an existing website.
The designer/developer makes a copy of the live website and database onto another machine to apply the new updates.
The new website and updated database is ready on another machine.
The live client website has received numerous updates during the project timeline and now has new data, not present in the copy.
The designer/developer then asks the question: “How do I synchronise my WordPress staging/development database with the live/production database?”
Files are easy – they can just be uploaded but how do you handle the database updates?
Unfortunately, MySQL or MariaDB which most WordPress websites use doesn’t have a distributed transactional management system.
Database Table Primary Keys
What’s the problem here?
Do you remember the primary key?
This value must be unique and in the scenario of two databases synchronising together they must match between the database copies.
Here’s a snapshot of the wp_options table from one of my local websites.
The option_id field is the primary key and is an incremental number.
The problem is that the other database, the client’s live website, now has new data with their own primary keys, and some of those numbers may be the same as the data in our newly developed website database.
The two databases are out-of-sync with each other.
There is no transaction management system to determine the correct data; nobody knows which data to add, update, overwrite or delete.
Because WordPress writes so much mixed data to the database tables, specifically wp_posts and wp_options, it makes it nearly impossible to synchronise changes between out-of-sync databases.
There are distributed database transaction management systems available for MySQL, but the cost is out of reach for a typical freelancer. Usually reserved for big enterprises.
Some Have Tried, Failed and Sort-of Succeeded
The WordPress company Delicious Brains who author the popular plugin WP Migrate DB Pro tried to create a service called Mergebot to tackle this problem.
They failed. Not because they are not very good developers but because the problem of synchronising MySQL databases is technically very challenging.
You can read about their journey and the issues in their blog post at https://deliciousbrains.com/syncing-wordpress-database-changes-merging/
WP Stagecoach claims to offer a database merging service, but merging data is very different from synchronising data, and the merging system itself has known issues https://wpstagecoach.com/kb/known-incompatibilities/.
VersionPress https://versionpress.com/ also claim to have a solution based on storing database transactional data in Git, a popular development versioning tool.
However, their plugin is a “developer preview”.
The documentation clearly states it should not be used in production https://docs.versionpress.net/en/getting-started/about-eap/ Some big issues still there, I think, and the last update to the plugin was 2 years ago, suggesting development has been abandoned.
The service offered by WPMerge https://wpmerge.io/ seems to offer a solution.
However, their service needs to be installed and activated before any changes are made to either development or production. It doesn’t work already out-of-sync databases.
You would need to install the service on the production server first before copying the site over to development.
The client probably isn’t going to be keen on you installing untested and resource heavy services on their live website.
It works by recording transactions triggered by the WordPress WP_Query class, which means it won’t store any data updated by native SQL calls that many plugins still use.
None of these systems handle conflict resolution either.
Solutions To WordPress Database “Synchronisation”
That’s enough about the complexity of synchronising databases – let’s look at some solutions to the problem.
Solution 1
Ask the client to put a hold on any updated to the live website.
This may not be possible for ecommerce or user generated content websites but it’s the simplest solution.
Solution 2
Limit and record the number of updates.
This is kind of like creating your own transactional record of what’s been updated on the live website.
You can ask the client to add items to a website.
You will need to add this content into your development website before pushing it up to the live website.
This can work well for small business websites.
It can also work for e-commerce websites as long as you have a way of exporting and importing the relevant data
The plugin WP All Import does a great job of exporting WooCommerce data and during importing, keeping the association between the data.
Solution 3
Record and apply database updates like unit tests.
This is not going to work for non-developers.
The idea here is to update the database with scripts, rather than use the WordPress user interface.
This way allows you to apply those same updates to the other database, ensuring that the same data is present.
Although the primary keys may be different, this is not a synchronisation method.
Conclusion
There doesn’t seem to be a magic bullet solution for keeping two WordPress databases completely synchronised between two different environments.
The best you can do is to halt updates on the production site during your project and replace it with your new site when complete.
Or, try to record and merge new data in the production site into your development database copy just prior to launching.
If you have any other solutions that work for you, please let me know in the comments below.
#WPQuickies
Join me every Thursday at 1 pm Sydney time for some more WPQuickies – WordPress tips and tricks in thirty minutes or less.
Broadcasting live on YouTube and Facebook.
Suggest a #WPQuickies Topic
If you have an WordPress topic you’d like to see explained in 30 mins or under, fill out the form below.
https://forms.gle/mMWCNd3L2cyDFBA57