Load BalancingMySQL/MariaDBWeb Servers

MariaDB Database Replication MariaDB Galera Database Cluster

Buckle up. We’re going to take another look at replicating MariaDB databases. Previously we’ve looked at MySQL Replication using the older MySQL Master-Salve (Binary) Replication and the newer Master-Master (Transaction-based) MySQL Replication. Both of these types of replication have their drawbacks as we found. In the case of MySQL Master-Master replication and MariaDB Galera, your database needs to meet the requirements of the Group Replication plugin. Even today, many developers have not modified their code to include simple things like a primary key which is vital for this type of replication.

Required Infrastructure

To complete this guide and successfully configure MariaDB Galera Database Replication with a working website you will need a total of 4 servers running Ubuntu. We have created a bundle of 4 servers for users to purchase. These will be delivered with 100GB NAS storage already mounted to /nas in each backend server. Order Here or Contact Us.

  • 3x NVMe VPS Servers (Backends)
    • 50GB HDD
    • 4 vCore
    • 5GB Ram
    • 100GB NAS Storage (mounted at /nas in each server)
  • 1x Cloud VPS (Balancer)
    • 10GB HDD
    • 2 vCore
    • 2 GB Ram

Save 15% off each server with the promotional code BPIBQ69YH6

HyperDB with MySQL Master-Slave Replication

In the article WordPress HyperDB Cluster with MySQL Replication we created a messy high availability platform for a WordPress website. In that guide, we used HyperDB to route connections to specific backends and MySQL Master-Slave replication to replicate the database. It was messy because we used LSYNCD to replicate data over the cluster and HyperDB isn’t the best way to achieve high availability. That’s because you create a single point of failure on the HyperDB Master. But it gets worse. Because it’s Master-Slave replication if you ever needed to promote a Slave server to a Master. For example, to fix MySQL on the original master your database would be out of sync. This means you would have to reset the replication and resync the database over the cluster. It’s a messy and dirty way of making something highly available.

For this guide, we have used a test domain which is nvme-vps.com. The nvme-vps.com website is powered by the infrastructure used to create this guide.

MariaDB Galera Cluster

So let’s look at one way to do this. We’re going to use three NVMe VPS Servers to create a Cluster with MariaDB Galera replicating database changes to the other nodes the cluster. MariaDB Galera uses Master-Master Replication or Active-Active Replication if you want to be politically correct. But the major change we are going to make here is the addition of shared storage. In each VPS Server, we are going to mount a 100GB network share. In the share, we are going to store our websites files. This removes the need for LSYNCD.

But wait. Some people reading this will rightly think I am creating a single point of failure by using a file share. Ordinarily, that would of course be correct. If the NAS was ever offline the websites files would not be accessible. In our case, the First2Host NAS is triple replicated with automatic failover. We are mounting the NAS in our three servers using a load-balancing IP.

Optional SSH Keys

Our VPS Servers are installed with Ubuntu 22. We can see our NAS partition is mounted at /nas. We will come back to this later on. Before continuing make sure each backend in your cluster is configured with an SSH key and that key is installed on each backend. You should be able to log in to each backend from another without the need for a password.

Filesystem                          Size  Used Avail Use% Mounted on
tmpfs                               499M  1.1M  498M   1% /run
/dev/sda1                            49G  2.1G   47G   5% /
tmpfs                               2.5G     0  2.5G   0% /dev/shm
tmpfs                               5.0M     0  5.0M   0% /run/lock
/dev/sda15                          105M  6.1M   99M   6% /boot/efi
10.210.19.178:/zpool-128976/HACloud  100G     0  100G   0% /nas
tmpfs                               499M  4.0K  499M   1% /run/user/0

Install MariaDB Galera Cluster

So let’s build. First, we need to install our MariaDB Galera Cluster. Each backend NVMe VPS Server in our cluster will be installed with MariaDB and the Galera plugin. The Galera cluster plugin has been bundled into MariaDB since v10.1. From here on, each backend VPS server is referenced as a Galera cluster node. So let’s get started with MariaDB Database Replication and install MariaDB on each node in your cluster.

apt install mariadb-server

Secure MariaDB

On each node run the mysql_secure_installation script. Use the below settings to secure MariaDB.

Switch to unix_socket authentication: N
Change the root password: N
Remove anonymous users: Y
Disallow root login remotely: N
Remove test database and access to it: Y
Reload privilege tables now: Y

Configure MariaDB Galera Cluster Nodes

So now we have three backend nodes each installed with MariaDB. We need them to work together. Open up the /etc/mysql/mariadb.conf.d/60-galera.cnf file.

nano /etc/mysql/mariadb.conf.d/60-galera.cnf

Here we are going to configure the basics of our cluster. The default file will look like this.

[galera]
# Mandatory settings
#wsrep_on                 = ON
#wsrep_cluster_name       = "MariaDB Galera Cluster"
#wsrep_cluster_address    = gcomm://
#binlog_format            = row
#default_storage_engine   = InnoDB
#innodb_autoinc_lock_mode = 2

# Allow server to accept connections on all interfaces.
#bind-address = 0.0.0.0

# Optional settings
#wsrep_slave_threads = 1
#innodb_flush_log_at_trx_commit = 0

MariaDB Galera Database Replication

So edit the file to reflect the below. But note that you can use any name for your cluster but IT MUST be the same on each node. We have commented on the file.

[galera]
# Mandatory settings
##
## Enable Write-set replication
wsrep_on                 = ON

## Location of wsrep library
wsrep_provider           = /usr/lib/galera/libgalera_smm.so

## Cluster Name
wsrep_cluster_name       = "F2HCloud VPS Cluster"

## Cluster Node IPs
wsrep_cluster_address    = gcomm://51.xXx.xX.10,193.Xx.xX.218,51.xXx.xXx.24

## Log Format
binlog_format            = row

## Sotrage Engline
default_storage_engine   = InnoDB

## Galera requires mode 2
innodb_autoinc_lock_mode = 2

## The below setting adds a primary key to tables that don't have one.
innodb_force_primary_key = 1

## Double write buffer
innodb_doublewrite       = 1

# Allow server to accept connections on all interfaces.
bind-address = 0.0.0.0

# Optional settings

## Number of threads. Aim for 4x the number of vCores. 4 vCores = 16 Threads.
wsrep_slave_threads = 16

## Write to file once per second. This could cause issues if all nodes go down at the same time
innodb_flush_log_at_trx_commit = 0

## Below settings to identify lines in logs
wsrep_node_name                = FR-N1
wsrep_node_address             = "THIS NODE IPv4"

## Log errors to error_log
log_error = /var/log/mysql/error.log

Save this file but do not start MariaDB. So next we need to copy this configuration to each node in our MariaDB Galera Cluster. Remember at the start we configured SSH Keys. We can use SCP to copy this file to other nodes. On each node you can then just update the wsrep_node_name and wsrep_node_address values.

scp /etc/mysql/mariadb.conf.d/60-galera.cnf root@server.f2hcloud.com:/etc/mysql/mariadb.conf.d/

We now have three identical nodes. The 60-galera.cnf exists on each node. So next let’s ensure the Firewall is configured to allow only our cluster IPs. Ideally offloading your Firewall rules to First2Host is the best solution because protection is provided at the PoP (Point of Presence) where connections enter our network. This also protects you from DDoS attacks up to 1.6TBPS. If you want to use a Software firewall like UFW. Use the rules below.

Firewall Rules

Ports that will be required for MariaDB Galera Database Replication are for MariaDB and OpenLiteSpeed which we will install later. We will simply whitelist each node’s IP.

  • 3306 (standard MariaDB port)
  • 4444 (SST port)
  • 4567 (Galera replication port)
  • 4568 (IST port)
  • 443
  • 80 (optional)
  • 22
  • 7080 (LiteSpeed)
# Replication Rules
ufw allow from 51.xXx.xx.10 proto tcp to any port 3306
ufw allow from 51.xXx.xx.10 proto tcp to any port 4444
ufw allow from 51.xXx.xx.10 proto tcp to any port 4567
ufw allow from 51.xXx.xx.10 proto tcp to any port 4568

# HTTP / HTTPS / SSH / OLS
ufw allow 443
ufw allow 80
ufw allow 22
ufw allow 7080

# Balancer Firewall rules

ufw allow from balancer-IPv4 proto tcp to any port 443
ufw allow from balancer-IPv4 proto tcp to any port 3306

After this step, if you run ufw status the firewall rules should look something like this.

Status: active

To                         Action      From
--                         ------      ----
443                        ALLOW       Anywhere
80                         ALLOW       Anywhere
22                         ALLOW       Anywhere
7080                       ALLOW       Anywhere
3306/tcp                   ALLOW       51.xXx.xXx.24
4444/tcp                   ALLOW       51.xXx.xXx.24
4567/tcp                   ALLOW       51.xXx.xXx.24
4568/tcp                   ALLOW       51.xXx.xXx.24
3306/tcp                   ALLOW       193.xXx.xXx.218
4444/tcp                   ALLOW       193.xXx.xXx.218
4567/tcp                   ALLOW       193.xXx.xXx.218
4568/tcp                   ALLOW       193.xXx.xXx.218
443 (v6)                   ALLOW       Anywhere (v6)
80 (v6)                    ALLOW       Anywhere (v6)
22 (v6)                    ALLOW       Anywhere (v6)
7080 (v6)                  ALLOW       Anywhere (v6)

We’re now almost ready to start our Database cluster. Here is a good time to take a snapshot of each node. If anything goes wrong you won’t have to start again then. So after taking a Snap or backup of your nodes stop the MariaDB service on two of the nodes. We will start the cluster from the remaining node.

systemctl stop mariadb

Start MariaDB Galera Cluster

On the remaining node, you can now stop the MariaDB service. To start a Galera Cluster for the first time you use a special script called galera_new_cluster. So enter that into the console.

galera_new_cluster

This command produces no output. To confirm it worked use the below command.

mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"

You should see that the cluster size is 1.

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+

Now bring up the MariaDB service on each remaining node. You can check the nodes are joining the cluster with the same command.

mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size'"

We have a three-node cluster so once all of our nodes are online we see the value set to three.

+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+

Test MariaDB Galera Database Replication

So now it’s time to test our MariaDB Galera Database Replication. We will create a MariaDB database and add some content to it. That data should be replicated to other nodes. On any node create a database.

CREATE DATABASE CLIENTS;

And then add a new table and some dummy information.

CREATE TABLE CLIENTS.data ( id INT NOT NULL AUTO_INCREMENT, prefix VARCHAR(50), doornum INT, name VARCHAR(25), PRIMARY KEY(id));

# Dummy data

INSERT INTO CLIENTS.data(prefix, doornum, name) VALUES ("Mr", 2, "Leon");'

Log in to another node in the cluster and query data in the database. Mr, 2 and Leon.

SELECT * FROM CLIENTS.data;

So, in the returned information we can see our data was replicated to this node.

+----+--------+---------+------+
| id | prefix | doornum | name |
+----+--------+---------+------+
|  3 | Mr     |       2 | Leon |
+----+--------+---------+------+

Repeat the process on each node to confirm replication is working. Once you have finished testing you can delete the database.

DROP DATABASE CLIENTS;

Create Proxy Health Check User

At the end of this guide, we will be configuring a Load Balancer. Create the below user to use with MySQL health checks from the balancer to your backends. This user will be created in each node. Just run the commands in any node.

# Create HAProxy User

CREATE USER 'haproxy'@'Loadbalancer-IP';

# Grant Privilages

GRANT ALL PRIVILEGES ON wpdatabase.* To 'haproxy'@'loadbalancer-IP' IDENTIFIED BY 'PaSsWoRd';

Prepare Production Database

We want to host and import a WordPress database to this high availability MariaDB cluster. We’re assuming you have already dumped your database and moved it to one of the nodes. Next, we will create a new database and user and then, import the database. Still in the MariaDB console create the database and user.

# Create Database 
CREATE DATABASE DATABASE_NAME;

# Create User
CREATE USER 'User-Name'@'localhost';

# Assign Privilages and create password
GRANT ALL PRIVILEGES ON DATABASE_NAME.* To 'User-Name'@'localhost' IDENTIFIED BY '@PaSsWoRd@';

So the MariaDB Galera Database created above has already likely been replicated to the other MariaDB Galera nodes in your cluster and it’s time to import the database. This bit can be a bit tricky. As we mentioned before for MariaDB Galera Database Replication to work, each table should have a primary key. We can bypass this requirement but the best practice is to have a primary key. We have configured the Galera cluster to reject any databases that don’t contain keys in all tables. Next, try to import the database dump to the database you just created.

mysql -u root -p database_name < dump.sql

You can use the command below to confirm the tables from the SQL dump have been added to the database.

SHOW TABLES FROM DATABASE_NAME;

The output should list the tables like in the image below.

MariaDB Replication Using a MariaDB Galera Database Cluster

So now is the time to maybe take a break and another Snapshot or backup of each server. We have successfully created our MariaDB Galera Cluster with Master-Master/Active-Active Replication. MariaDB Galera Database Replication is working.

But now we must turn our attention to our web server and install OpenLiteSpeed in each node in the cluster. Then we need to configure a virtual host for our website.

Install OpenLiteSpeed

OpenLiteSpeed or OLS will serve our website files. It’s a fast and secure web server with native caching for free. Thankfully, the people over at OLS have made this step really easy. Run the following command to install OpenLiteSpeed. Just complete the following on one node for now. This command installs OpenLiteSpeed with lsphp version 8.1. You can change the 81 bit for different versions of PHP like 8.2

Install OpenLiteSpeed With PHP

bash <( curl -k https://raw.githubusercontent.com/litespeedtech/ols1clk/master/ols1clk.sh ) --lsphp 81

Install OpenLiteSpeed With PHP and WordPress

bash <( curl -k https://raw.githubusercontent.com/litespeedtech/ols1clk/master/ols1clk.sh ) -w --lsphp 81

Once the installation has been completed the login information for the LiteSpeed control panel is displayed in the console and in the /usr/local/lsws/password file.

cat  /usr/local/lsws/password

Create Virtual Host

At this point, we need to create a virtual host for our website. We could do this manually but there is also a script to complete this. So use the command below to create a virtual host for your domain name. For this guide, we are importing the nvme-vps.com website to the cluster.

/bin/bash <( curl -sk https://raw.githubusercontent.com/litespeedtech/ls-cloud-image/master/Setup/vhsetup.sh ) -d www.nvme-vps.com

OK. We now have OpenLiteSpeed installed. We need to tell OLS where our website files will be located. This is where our NAS comes into play. In each node, we mounted a 100GB NAS partition at /nas. Let’s finish configuring the vhost. Log in to the OLD control panel.

https://51.255.56.10:7080

In the OLS control panel navigate to Virtual Hosts and select the virtual host you created in the previous step.

In the Virtual host configuration notice the Virtual Host Root field. By default, OpenLiteSpeed uses /var/www/www.domain.com. We need to modify this to a folder in our mounded NAS. The command below assumes your NAS is mounted to /nas in your node.

mkdir -p /nas/www/www.nvme-vps.com

And in the Virtual Host Root we now specify the location we created above.

So because we have edited the OLS configuration we must restart the webserver. Use the control panel to perform a Graceful restart. Now we need to place our website files in the /nas/www/www.nvme-vps.com file location.

Migrate Website Data

And now you need to place your website files in the location specified above. /nas/www/www.nvme-vps.com in our case. You could ZIP the files and SCP them to the node. You will only need to complete this once.

zip -r files.zip /var/www/path/to/site

scp files.zip root@IP_node:/nas/www.nvme-vps.com/

Once the files are in place modify the wp-config.php file to reflect the database you created previously.

nano /nas/www/www.nvme-vps.com/wp-config.php

Fix Permissions, Groups and Ownership

Finally, ensure the permissions on your files and folders are correct and that the correct user and owner is set. If you miss this step your WordPress may not function.

# To change all the directories to 0755 (drwxr-xr-x):

find /nas/www/www.nvme-vps.com/ -type d -exec chmod 755 {} \;

# To change all the files to 0644 (-rw-r--r--):

find /nas/www/www.nvme-vps.com/ -type f -exec chmod 644 {} \;

# CHOWN all files and folders to www-data

chown -R www-data:www-data /nas/www/www.nvme-vps.com

And at this point, you are almost complete. You just need to think about an SSL and configuring LiteSpeed on the other nodes. We will come back to these. However, we are at a stage where we can see if this first node is working. Update the A record on your domain name to point to this nodes IPv4. All being well you should see your website albeit with a possible SSL error which we will correct soon.

Rinse & Repeat

Now that you have a single node mostly configured it’s time to repeat this process on the rest of the cluster nodes. It should be a lot quicker this time. Install LiteSpeed Web Server and a virtual host for the website on the other nodes.

bash <( curl -k https://raw.githubusercontent.com/litespeedtech/ols1clk/master/ols1clk.sh ) --lsphp 81

/bin/bash <( curl -sk https://raw.githubusercontent.com/litespeedtech/ls-cloud-image/master/Setup/vhsetup.sh ) -d www.nvme-vps.com

Ensure you correct the Virtual Host Root to point to the NAS and double-check each node is working by updating the A record on the website to point to each node.

Configure HAProxy Load Balancer

We’ve left any SSL provision until last. As it stands we are using Cloudflare with a Flexible SSL setting in SSL/TLS > Overview. Because LiteSpeed has generated a self-signed certificate we’re not seeing any errors on our nvme-vps.com domain. Leaving the MariaDB cluster like this would be insecure. So next we will introduce a Load Balancer to distribute https connections over our cluster. We have two options. SSL Passthrough and SSL Termination. The balancer needs to be a Highly Available VPS Server with 2 vCores and at least 2GB RAM. In the balancer set the hostname with the below commands and configure an A record so the hostname resolves to the balancer.

systemctl set-hostname balancer.f2hcloud.com

Install Balancer SSL Certificate

So next we install an SSL Certificate. You can use a Cloudflare Origin Certificate or a Let’s Encrypt one. We’re going to install Certbot and request a cetficiate that way.

systemctl install certbot -y

certbot certonly -d balancer.f2hcloud.com

Choose option 1 to spin up a webserver. Once complete the SSL certificate will be saved to /etc/letsencrypt/live/your.domain.com/

HAProxy is going to be expecting our SSL certificate to be packaged in a specific way. Combine the certificate using the commands below. We will configure a simple script to take care of renewals later. First, navigate into the directory with your SSL certificate. Then combine the certificate and private key.

cd /etc/letsencrypt/live/your.domain.com/
cat fullchain.pem privkey.pem > balancer.f2hcloud.com.pem

You can name yours anything but it must end in .pem. So the certificate is now located at /etc/letsencrypt/live/your.domain.com/balancer.f2hcloud.com.pem. This is the correct one to use with HAProxy.

Install HAProxy

So next we move on to the actual balancer software. We are going to use HAProxy however you can choose a different balancer if you want.

 apt install haproxy -y

The HAProxy configuration file is located at /etc/haproxy/haproxy.cfg but we will create our own confiuration file from scratch. So just move the existing file out of the way and create a new one.

mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy_original.cfg && nano /etc/haproxy/haproxy.cfg

Configure HAProxy For HTTPS & MySQL Checks

Copy the below configuration to the recreated /etc/haproxy/haproxy.cfg file.

# Frontend connections for HTTPS and MySQL

frontend http_front
   bind :80
   bind :443 ssl crt /etc/letsencrypt/live/balancer.f2hcloud.com/balancer.f2hcloud.com.pem
   default_backend f2h_hacluster

frontend mysql-frontend
    bind *:3306
    mode tcp
    default_backend mysql-backend

# Specify backend nodes. These are https connections

backend f2h_hacluster
   balance leastconn
   server frrbx-bk 193.70.65.218:443 check ssl verify none
   server frgra-bk 51.255.56.10:443 check ssl verify none
   server uklon-bk 51.195.251.24:443 check ssl verify none

backend mysql-backend
    mode tcp
    balance roundrobin
   server frrbx-bk 193.70.65.218:3306 check
   server frgra-bk 51.255.56.10:3306 check
   server uklon-bk 51.195.251.24:3306 check

global
        log /dev/log    local0
        log /dev/log    local1 notice
#       chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5000
        timeout client  50000
        timeout server  50000
        errorfile 400 /etc/haproxy/errors/400.http
        errorfile 403 /etc/haproxy/errors/403.http
        errorfile 408 /etc/haproxy/errors/408.http
        errorfile 500 /etc/haproxy/errors/500.http
        errorfile 502 /etc/haproxy/errors/502.http
        errorfile 503 /etc/haproxy/errors/503.http
        errorfile 504 /etc/haproxy/errors/504.http

listen stats
    bind 57.128.122.127:1936 ssl crt /etc/letsencrypt/live/balancer.f2hcloud.com/balancer.f2hcloud.com.pem
    mode http
    stats enable
    stats hide-version
    stats show-node
    stats realm Haproxy\ Statistics
    stats uri /haproxy?stats
    stats admin if TRUE
    stats auth User:Password


Let’s go through this.

  • Frontend
    • We are binding ports 80 and 443 to our balancers IPv4.
    • These we specify the certificate we combined above for the balancer.
    • Then we create a default backend which will hold our backend nodes, backend is called f2h_hacluster.
  • Backend
    • The balancer is set to use leastconn. The server with the least number of connections will receive new connections.
    • The specify your backend nodes. Its imporant to append :443 to each IP
      • check – Check port :443 for a response.
      • SSL – Use SSL.
      • Verify none – Don’t verify the certificate.
  • Listen Stats
    • Bind port :1936 to the server IP and specify the location to the SSL.
    • Stats realm – Location of the HAProxy stats URL. (https://balancer.f2hcloud.com/haproxy?stats).
    • Stats auth – Choose a username and password to access the stats. Comment (#) this out for now authtication on the stats page.

OK. Now restart HAProxy.

systemctl restart haproxy

Test MariaDB Galera Database Cluster

Get ready. Everything should be working now. Finally, update the A record on your website to point to the balancer’s IPv4. If everything went correctly your connection will be sent to one of the backend servers specified in the HAProxy file. Open up the HAProxy stats page.

https://57.xXx.122.xXx:1936/haproxy?stats

So, if we navigate to our test domain, nvme-vps.com we do not see any certificate warnings and in our HAProxy stats, we can see our connection was forwarded to a backend server.

So who has noticed the elephant in the room? Let me know in the comments if you did! You may have noticed the MySQL frontend is reporting no connections. That’s normal because we are only sending HTTPS traffic through the balancer. The purpose of the MySQL section is not to forward MySQL connections but to take out one of the backend nodes if MySQL was to ever fail on it.

Final Thoughts

This is an amazing configuration to have in your toolbox. As with any High Availability Cluster, you do sacrifice some speed. But these types of setups with Active-Active database replication guarantee uptime and are highly scaleable which is perfect for Enterprise users. You can add further backends within 30 minutes and because this cluster is spread over multiple geographical locations it is extremely unlikely it would fail.

Everyone’s High Availability is different. I don’t feel there is a single correct way to provide high availability to a website or application. However, some people may feel that a Cloud that distributes content like Azure or AWS would be a better alternative. Some would then argue that by using a server on AWS or Azure you are creating a single point of failure (SPF). More would then argue that the balancer in this configuration is a SPF and they would be right. However, Load Balancers are so simple they never fail if configured correctly. Do you have a better way of achieving High availability? Would you separate the MariaDB servers into a separate cluster on private IPs? Leave your comments below.

Going Further

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button