How to set up a WordPress cluster using Docker and Google Cloud Platform
Setting up a scalable and highly available WordPress cluster with Docker and the Google Cloud Platform can be quite a bit of work. This guide should make it easier!
In this guide, we will see how to create and deploy a scalable WordPress cluster hosted on the Google Cloud Platform. The cluster is made of:
- A Cloud SQL database containing the WordPress data.
- A Cloud Storage bucket containing the media content uploaded from WordPress – it is shared by all the worker instances.
- An auto-scaling instance group which contains all the worker instances.
- An HTTP Load Balancer which handles all the requests and forwards the traffic to the worker instances.
In order to ease the development and deployment of the WordPress site, we run it inside a Docker container. All the site content is packaged in a custom WordPress image (loosely based on the official one), automatically built and hosted in a private Docker Hub repository.
Our custom image contains a typical Apache/PHP/WordPress stack, adds the site content and declares a volume mounting point on the WordPress media upload directory: /var/www/html/wp-content/uploads.
WordPress media upload folder
In a single node configuration, the media files uploaded to WordPress are stored directly on the server filesystem. In a clustered configuration with multiple servers, this cannot work anymore, as these files must be accessible by all instances.
Unfortunately, WordPress does not provide any native solution to this problem, so we had to find a way to keep the media repositories of all the instances synchronized. Several blog articles cover the deployment of a WordPress cluster, with different solutions to this problem:
- A NFS mount point.
- A master/slave WordPress setup storing the files on Amazon S3, with cron-based synchronization of the local upload folder.
- A GlusterFS distributed filesystem.
In our case, the site is deployed on the Google Cloud Platform, which allows a more elegant approach: mounting a shared Cloud Storage bucket on the local file system with Cloud Storage FUSE. As per the documentation, "Cloud Storage FUSE provides a tool for applications to upload and download Google Cloud Storage objects using standard file system semantics". The content of the mount point is automatically synchronized between all the instances, without having to set up a periodic cron synchronization (which implies replication delays) or to use a third-party product such as GlusterFS.
Warning: Cloud Storage FUSE is currently in Beta and has no guarantee of long-term maintenance.
Here are the required steps to achieve media synchronization between all the WordPress instances:
- Create a new Cloud Storage bucket.
- Create a new directory on the worker’s file system.
- Install and configure Cloud Storage FUSE to use this directory as a mount point for the bucket.
- Mount this directory as a volume when running WordPress in a Docker container.
Info: Using Cloud Storage FUSE requires activating full access to the Storage API in the Access scopes configuration of the worker instance (or instance template). For security reasons, API access scopes cannot be modified after a compute instance has been created, so they must be set at creation time.
First we have to create a new bucket named wp-uploads-bucket.
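Assuming the Cloud SDK is installed, the bucket can be created from the command line with gsutil (bucket names are global, so this exact name may already be taken and would need to be adapted):

```shell
# Create the bucket that will hold the WordPress media uploads
gsutil mb gs://wp-uploads-bucket
```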
We then have to create a new directory that will be used as the bucket mounting point:
sudo mkdir /mnt/wp-uploads
sudo chmod a+w /mnt/wp-uploads
Cloud Storage FUSE is simple to install, and the procedure is well described in the documentation.
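For reference, at the time of writing the documented installation on a Debian/Ubuntu worker looks roughly like this (check the Cloud Storage FUSE documentation for the current instructions):

```shell
# Add the gcsfuse package repository matching the distribution release
export GCSFUSE_REPO=gcsfuse-$(lsb_release -c -s)
echo "deb https://packages.cloud.google.com/apt $GCSFUSE_REPO main" | \
    sudo tee /etc/apt/sources.list.d/gcsfuse.list

# Import the Google Cloud public package-signing key
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

# Install gcsfuse itself
sudo apt-get update && sudo apt-get install -y gcsfuse
```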
As we create the mount before running the Docker container, and access it from inside the container, we need to authorize access for all users. To do so, we first need to allow the allow_other option to be used in the mount command: open the /etc/fuse.conf file and uncomment the line containing user_allow_other.
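This edit can also be scripted, which is convenient when preparing the worker image; a minimal sketch using sed:

```shell
# Uncomment the user_allow_other line in /etc/fuse.conf so that
# non-root users can pass the allow_other option to mount commands
sudo sed -i 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf
```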
The mounting of the bucket can then be performed with the following command:
gcsfuse --dir-mode "777" -o allow_other wp-uploads-bucket /mnt/wp-uploads
The final step is to use this directory as the volume mount point when running a container from our image:
docker run [...] -v /mnt/wp-uploads:/var/www/html/wp-content/uploads -p 80:80 sfeir/custom-wordpress
Cloud SQL proxy
WordPress data are stored in a Cloud SQL database, a fully managed, cloud-hosted MySQL instance. Access control on a Cloud SQL database is achieved by exhaustively whitelisting all the IP addresses that are allowed to access it. As we are working with a dynamic number of worker instances, automatically created depending on the site traffic, we know neither the number of instances in advance nor their IP addresses.
Fortunately, the GCP provides a tool to cover this use case: the Cloud SQL Proxy. Once installed and configured on a GCE instance, the proxy will handle the connection to the Cloud SQL database with the provided credentials using the Cloud SQL API which does not require IP whitelisting.
Info: To be able to use the Cloud SQL Proxy, you must enable the Cloud SQL API when creating your GCE instance or instance template.
The Cloud SQL Proxy can be configured in many ways, depending on which credentials you want to use and how you provide them, whether you specify the Cloud SQL instances explicitly or rely on automatic instance discovery, and so on. Please refer to the documentation to select the appropriate configuration for your project.
The simplest configuration is Cloud SDK authentication and automatic instance discovery, which will use the credentials of the service account associated with the GCE instance and create a socket for all the Cloud SQL databases of the project.
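The proxy itself is distributed as a standalone binary; at the time of writing, it can be downloaded as follows on a 64-bit Linux worker (see the Cloud SQL Proxy documentation for the current URL):

```shell
# Download the Cloud SQL Proxy binary and make it executable
wget https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 -O cloud_sql_proxy
chmod +x cloud_sql_proxy
```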
For example, here are the commands allowing you to use the proxy and configure it to create Unix sockets instead of TCP sockets.
Create a new directory that will contain the socket files:
sudo mkdir /cloudsql
sudo chmod 777 /cloudsql
Start the proxy using Cloud SDK authentication and automatic instance discovery (this has to be done every time the GCE instance is started):
./cloud_sql_proxy -dir=/cloudsql &
Once the proxy is connected to the database and the Unix socket has been created, the WordPress container can be run with the appropriate configuration:
docker run [...] \
  -v /cloudsql:/cloudsql \
  -e "WORDPRESS_DB_HOST=localhost:/cloudsql/SOCKET_TO_USE" \
  -e "WORDPRESS_DB_USER=USER" \
  -e "WORDPRESS_DB_PASSWORD=PASSWORD" \
  -e "WORDPRESS_DB_NAME=DATABASE_NAME" \
  -e "WORDPRESS_TABLE_PREFIX=prefix_" \
  -p 80:80 sfeir/custom-wordpress
Instance image and startup scripts
The initial configuration of the worker instance (the installation of Cloud Storage FUSE and the Cloud SQL Proxy, and the creation of the mount point directories) needs to be performed only once and can then be committed to a new image containing all the static system configuration. An instance template, based on this image, can then be created and used to spawn the worker instances.
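As a sketch, committing the configured worker to a reusable image can be done with gcloud (the instance, image and zone names below are illustrative):

```shell
# Create a reusable image from the boot disk of the configured worker
gcloud compute images create wordpress-worker-image \
    --source-disk wordpress-worker-1 \
    --source-disk-zone europe-west1-b
```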
The startup scripts are executed every time a new worker instance is created. They perform the actions required to initialize the instance and run a new Docker container with our WordPress image. Some actions need to be executed as root, and others as a dedicated user allowed to access Docker.
These scripts can be hosted on Cloud Storage, on a dedicated bucket.
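For illustration, the scripts can be uploaded with gsutil and wired to the instance template through the startup-script-url metadata key (the bucket, template and scope names are assumptions to adapt to your project):

```shell
# Upload the startup scripts to the dedicated configuration bucket
gsutil cp startup-root.sh startup-docker.sh gs://CONFIG_BUCKET/

# Create the instance template from the prepared image; every new worker
# downloads and executes the referenced script at boot
gcloud compute instance-templates create wordpress-worker-template \
    --image wordpress-worker-image \
    --scopes storage-full,sql-admin \
    --metadata startup-script-url=gs://CONFIG_BUCKET/startup-root.sh
```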
Here is an example of a startup script executed as root when a worker instance is started:
#!/bin/bash
#
# Initialize the Cloud SQL Proxy
mkdir /cloudsql
chmod 777 /cloudsql
cloud_sql_proxy -dir=/cloudsql &
#
# Retrieve the next startup script and execute it as the Docker user
gsutil cp gs://CONFIG_BUCKET/startup-docker.sh startup-docker.sh
chmod +x startup-docker.sh
sudo -u DOCKER_USER ./startup-docker.sh
And here is an example of the next startup script, executed as the Docker user:
#!/bin/bash
#
# Perform the Cloud Storage FUSE mounting
gcsfuse --dir-mode "777" -o allow_other wp-uploads-bucket /mnt/wp-uploads
#
# Pull our custom WordPress image
gsutil cp gs://CONFIG_BUCKET/docker-config/.dockercfg ~/.dockercfg
docker pull sfeir/custom-wordpress
#
# Create a new container to run WordPress
docker run -d \
  -v /mnt/wp-uploads:/var/www/html/wp-content/uploads \
  -v /cloudsql:/cloudsql \
  -e "WORDPRESS_DB_HOST=localhost:/cloudsql/SOCKET_TO_USE" \
  -e "WORDPRESS_DB_USER=USER" \
  -e "WORDPRESS_DB_PASSWORD=PASSWORD" \
  -e "WORDPRESS_DB_NAME=DATABASE_NAME" \
  -e "WORDPRESS_TABLE_PREFIX=prefix_" \
  -p 80:80 \
  sfeir/custom-wordpress:latest
Instance group and load balancer
The final step is to set up an instance group for the workers and to configure the load balancer. The instance group is based on the instance template and automatically handles the scaling of the worker pool depending on the load of the instances. An instance group can be configured to create new instances in a single zone (datacenter) or across a whole region (multiple datacenters) for higher availability.
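As a sketch, the managed instance group and its autoscaler can be created with gcloud (the names, zone and thresholds are illustrative):

```shell
# Create a managed instance group based on the worker template
gcloud compute instance-groups managed create wordpress-workers \
    --zone europe-west1-b \
    --template wordpress-worker-template \
    --size 2

# Scale between 2 and 10 workers based on CPU utilization
gcloud compute instance-groups managed set-autoscaling wordpress-workers \
    --zone europe-west1-b \
    --min-num-replicas 2 \
    --max-num-replicas 10 \
    --target-cpu-utilization 0.75
```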
The worker instances do not have a public IP address and thus cannot be accessed directly from the Internet.
The load balancer is the only component of the cluster with a public IP address, which is the one to register in the DNS. It handles the incoming requests and distributes the traffic evenly across the worker instances of the cluster. On the GCP, the load balancer is a global resource: it is not tied to a particular datacenter or region.
Both the instance group and the load balancer use a configurable health check that reports the status of each instance. Depending on the load of the workers, the instance group determines whether new instances must be created or whether some can be stopped. The load balancer uses these reports to distribute the requests evenly between the healthy workers.
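For example, an HTTP health check polling the WordPress front page could be created like this (the name, path and thresholds are illustrative):

```shell
# Health check used by both the instance group and the load balancer
gcloud compute http-health-checks create wordpress-health-check \
    --request-path / \
    --check-interval 10s \
    --unhealthy-threshold 3 \
    --healthy-threshold 2
```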
To this day, WordPress does not provide any straightforward solution for running in a clustered mode. Fortunately, the Google Cloud Platform is a powerful infrastructure offering many tools we can use to build the desired cluster. The WordPress data and media content can be hosted on Cloud SQL and Cloud Storage, two fully managed GCP products, which means we don't have to worry about their availability or scalability.
The use of a load balancer backed by an instance group lets us dynamically size the worker pool according to the number of visitors. We also don't have to worry about a hypothetical worker instance failure, as a new instance would automatically be spawned.
All these mechanisms working together allow us to build a scalable and highly available WordPress cluster, able to absorb traffic peaks and troughs with a constant level of performance while limiting the costs associated with the number of running instances.