# MariaDB in a Galera Cluster: Maintenance and Recovery
## Introduction
This document covers how to perform system maintenance while the MariaDB database is in active production, and how to recover from power outages or network failures.
## Environment
SCM (Scyld Cloud Manager) currently leverages the Kolla OpenStack project, which packages the OpenStack services into Docker containers.
The MariaDB database uses Galera to run a database cluster across the three OpenStack controller systems, providing both high availability and scalability. Each of the three database instances runs inside a Docker container. Containerization changes how you interact with the database, as it differs from running the database directly on bare metal.
The most convenient way to access the database is to install the MariaDB client on the ops system, which has access to the private OpenStack API network. Several scripts are provided to interrogate the status of the OpenStack cluster as well as the database.
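For example, a minimal sketch of installing the client and connecting from the ops system (the package name assumes a CentOS/RHEL ops node, and the host address and credentials are placeholders to substitute for your site):

```
# yum install -y mariadb
# mysql -h <internal-api-vip> -P 3306 -u root -p
```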
Note: The scripts referenced in this document are shipped with SCM and are installed in the directory `/opt/scm/scripts/`.
## Checking Galera Cluster Status
The script `galeraStatus.sh` will report the current status of the database cluster.
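For example, run from the ops system (a minimal invocation sketch, assuming the install directory noted above and no arguments):

```
# /opt/scm/scripts/galeraStatus.sh
```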
Each OpenStack controller system keeps its own copy of the database for redundancy, and the Galera software ensures that the three copies are kept in sync with each other.
To protect the integrity of the data, Galera requires a quorum, or majority, of controller instances to agree on the last update to the database. In the SCM case there are three controllers, so a minimum of two must agree on the last committed transaction. This means that at least two instances of the database must be running and communicating in order to agree on every update.
Without quorum, Galera will not allow updates, ensuring that the instances stay synchronized and avoiding a condition called split-brain.
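These checks are based on Galera's `wsrep_*` status variables, which can also be queried directly with a MariaDB client on any controller; a minimal sketch (it assumes you have the kolla database root credentials at hand for the password prompt):

```
# docker exec -it mariadb mysql -u root -p \
    -e "SHOW GLOBAL STATUS LIKE 'wsrep_%';"
```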
A healthy cluster will produce a report similar to the following when the `galeraStatus.sh` script is run:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 3832840 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.10.3.5:3306,10.10.3.1:3306,10.10.3.3:3306 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
All the fields in this report are important.
- wsrep_last_committed: An ever-increasing transaction ID.
- wsrep_local_state_comment: The state of this node; Synced is the healthy state. Other possible states: Joining, Waiting on SST, Joined, or Donor.
- wsrep_incoming_addresses: The connected servers in the cluster; a healthy cluster lists all three.
- wsrep_cluster_size: The number of servers in the cluster; again, healthy is 3.
- wsrep_cluster_status: Primary means this node is part of the quorum and accepting updates; anything else is bad.
- wsrep_ready: ON means the node is communicating with an active cluster; OFF means it is not accepting queries.
Further details can be found on the Galera website at: http://galeracluster.com/documentation-webpages/monitoringthecluster.html
## Normal System Maintenance
Since the Galera-wrapped MariaDB database can continue in production with a quorum of instances (2 out of 3), maintenance can be performed without downtime. However, it must be done carefully so that quorum is maintained throughout.
The recommended way to temporarily remove a controller system from the cluster is to first stop the mariadb container on that controller, then stop the docker service on the same controller.
Manually stopping the docker service before a system shutdown will reduce recovery time when the system reboots. Relying on the shutdown command to stop all the docker containers may result in an unclean shutdown of the containers due to a systemd timeout.
In the smallest possible cluster scenario, three systems play the combined roles of controller, vmhost, and ceph data host. This scenario currently exists only in non-production development clusters.
Note: If there are no ceph OSDs running on the controller, skip the second command below.
```
# docker stop mariadb
# docker exec -t ceph_mon ceph osd set noout   # only when an OSD is on this controller!
# systemctl stop docker
```
With one mariadb instance shut down, a check of the Galera status will resemble the following report:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 3829914 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.10.3.1:3306,10.10.3.3:3306 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
Maintenance can now be performed on the controller with docker stopped, and the controller can then be rebooted.
### Restarting a Controller
Once the controller is booted back into service, docker will automatically start the containers, except for the mariadb container, which was manually stopped. Verify that all docker containers, with the exception of mariadb, have started and have a status of Up using the command:

```
# docker ps -a
```
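To narrow the listing to containers that are not currently running, standard docker status filters can help (a convenience sketch, not an SCM-specific command):

```
# docker ps -a --filter "status=exited" --filter "status=restarting"
```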
If the noout flag was previously set in ceph, unset that flag now using the command:
```
# docker exec -t ceph_mon ceph osd unset noout
```
Verify that the OpenStack services and rabbitmq have all restarted and connected successfully by running the `/opt/scm/scripts/OS_Status.sh` script.
After all the OpenStack services report an Up status, start the mariadb container on the controller you just rebooted:

```
# docker start mariadb
```
Monitor the status of the database cluster as the database on the rebooted node synchronizes with the other two nodes. It may take several minutes for the synchronization to begin, so keep watching for a few minutes before concluding that it has started.
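One convenient way to keep watching is to rerun the status script on an interval with the standard `watch` utility (a sketch; adjust the interval to taste):

```
# watch -n 30 /opt/scm/scripts/galeraStatus.sh
```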
Proceed with caution here: when the third instance connects to the two running instances, the cluster size immediately jumps to 3, before Galera discovers that the new instance still needs to be updated.
The state during synchronization will be either `Joining: receiving State Transfer` or `Donor/Sync`, depending on which mode of synchronization Galera decides to use: transferring just the deltas or the entire database.
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 3842876 |
| wsrep_local_state_comment | Joining: receiving State Transfer |
| wsrep_incoming_addresses | 10.10.3.1:3306,10.10.3.5:3306,10.10.3.3:3306 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_ready | OFF |
After the synchronization completes the results will be similar to the following:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 3868364 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.10.3.1:3306,10.10.3.5:3306,10.10.3.3:3306 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
To be safe, give the database some time, and check that all the OpenStack services are up and running before starting on the next controller. This may be a good time for a coffee break; check the status again when you return.
Carefully restarting the controllers one at a time will allow you to maintain the controllers without experiencing downtime.
## Recovery from Failures
### Network Failure or Power Loss
After a network failure or a power loss, check the status of the database by running `galeraStatus.sh`.
If the database did not recover, the report may resemble the following:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 3885809 |
| wsrep_local_state_comment | Initialized |
| wsrep_incoming_addresses | 10.10.3.5:3306 |
| wsrep_cluster_size | 1 |
| wsrep_cluster_status | non-Primary |
| wsrep_ready | OFF |
To recover, run the kolla-ansible playbook to start up the database. The playbook interrogates each instance of the database to find the latest committed transaction ID, then bootstraps the cluster from that instance.
Before running the playbook, stop the database containers on all three controllers using the command:

```
# docker stop mariadb
```
From the ops system, run the kolla-ansible playbook to restart the database:

```
# kolla-ansible -i inventory/<clustername> mariadb_recovery
```
The mariadb_recovery playbook is not always successful, but it does discover which controller has the most up-to-date version of the database, so as the playbook runs, make note of which controller it chooses to start first.
### Worst Case - mariadb_recovery Failed
On the controller with the most up-to-date version of the database, edit `/etc/kolla/mariadb/galera.cnf`.
Change the `wsrep_cluster_address` so that this node will bootstrap a new cluster, saving the original value by commenting out the line.

Change:

```
wsrep_cluster_address = gcomm://10.11.0.1:4567,10.11.0.8:4567,10.11.0.9:4567
```

to:

```
#wsrep_cluster_address = gcomm://10.11.0.1:4567,10.11.0.8:4567,10.11.0.9:4567
wsrep_cluster_address = gcomm://
```
Then edit `/var/lib/docker/volumes/mariadb/_data/grastate.dat`.
Change the value of `safe_to_bootstrap` from 0 to 1:

```
safe_to_bootstrap: 0
```

to:

```
safe_to_bootstrap: 1
```
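If you prefer a one-liner to an editor, the same change can be made with `sed` (a minimal sketch against the path given above):

```
# sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' \
    /var/lib/docker/volumes/mariadb/_data/grastate.dat
```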
Next, start mariadb on just this one controller:

```
# docker start mariadb
```
On this controller, in a separate window, monitor the mariadb startup log:

```
# tail -f /var/lib/docker/volumes/kolla_logs/_data/mariadb/mariadb.log
```
If there are no obvious errors in the log and the database starts running, cancel the tail monitoring with CTRL-C and switch to the ops management window to monitor the database with the `galeraStatus.sh` script.
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 140856471 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.11.0.9:3306 |
| wsrep_cluster_size | 1 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
Once you see the cluster state of Synced, size of 1, status of Primary and wsrep_ready ON, start the mariadb container on the second controller.
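On the second controller:

```
# docker start mariadb
```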
Again monitor the startup and you will see the data sync over to this node:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 140856195 |
| wsrep_local_state_comment | Joining: receiving State Transfer |
| wsrep_incoming_addresses | 10.11.0.9:3306,10.11.0.8:3306 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_status | Primary |
| wsrep_ready | OFF |
Rerun the `galeraStatus.sh` script until you see the following:
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 140857617 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.11.0.9:3306,10.11.0.8:3306 |
| wsrep_cluster_size | 2 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
Finally, start mariadb on the last controller and monitor.
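As before, on that controller:

```
# docker start mariadb
```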
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 140856155 |
| wsrep_local_state_comment | Joining: receiving State Transfer |
| wsrep_incoming_addresses | 10.11.0.1:3306,10.11.0.9:3306,10.11.0.8:3306 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_ready | OFF |
You should see the state change to Synced, size of 3, status of Primary and wsrep_ready of ON.
| Variable_name | Value |
|---|---|
| wsrep_last_committed | 140858943 |
| wsrep_local_state_comment | Synced |
| wsrep_incoming_addresses | 10.11.0.1:3306,10.11.0.9:3306,10.11.0.8:3306 |
| wsrep_cluster_size | 3 |
| wsrep_cluster_status | Primary |
| wsrep_ready | ON |
The database is recovered!
To clean up the controller where you started, you will want to change the settings back to their original values.
On the controller where you started:

```
# docker stop mariadb
```
Again, edit the file `/etc/kolla/mariadb/galera.cnf`, this time reverting the change made earlier.
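Restore the original `wsrep_cluster_address` by removing the bootstrap line and uncommenting the saved value, i.e. change:

```
#wsrep_cluster_address = gcomm://10.11.0.1:4567,10.11.0.8:4567,10.11.0.9:4567
wsrep_cluster_address = gcomm://
```

back to:

```
wsrep_cluster_address = gcomm://10.11.0.1:4567,10.11.0.8:4567,10.11.0.9:4567
```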
Then start the container again:

```
# docker start mariadb
```

and check `galeraStatus.sh` to verify that the controller rejoins the Galera cluster successfully.
### When One Instance Will Not Start
In rare cases one of the database instances will not start, and you will see errors in the `/var/lib/docker/volumes/kolla_logs/_data/mariadb/mariadb.log` logfile.
If the other two instances are working and synced, you can quickly recover the corrupted instance by letting Galera run a full sync of the database to replace the corrupted data. This is done by first stopping the mariadb container that is stuck in a restarting state, then removing the database files, and finally starting the mariadb container again.
```
# docker stop mariadb
# rm -rf /var/lib/docker/volumes/mariadb/_data/
# docker start mariadb
```
Verify that the database transfer completed by listing the directory:

```
# ls -lh /var/lib/docker/volumes/mariadb/_data/
```
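The full sync can take a while on a large database; one way to watch it progress is to poll the size of the data directory with standard tools (a convenience sketch):

```
# watch -n 30 du -sh /var/lib/docker/volumes/mariadb/_data/
```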
Also verify that the cluster is healthy again with the `galeraStatus.sh` script.
## References
- https://docs.openstack.org/kolla/newton/
- https://docs.openstack.org/kolla-ansible/latest/
- http://galeracluster.com/2016/11/introducing-the-safe-to-bootstrap-feature-in-galera-cluster
- http://galeracluster.com/documentation-webpages/restartingcluster.html#safe-to-bootstrap-protection
- http://galeracluster.com/documentation-webpages/galeraparameters.html#gcache-recover