For one of my clients, Cutify Media & Marketing, who run competition websites, we needed to scale their servers to handle peak load.
Their monthly competition sites get fairly busy, but they have two annual competition sites which are like trying to navigate Black Friday on a dial-up modem. This year I was tasked with scaling the servers up so that we could sail through the peak load times.
We stuck with a relatively simple setup, as the scale was only in the order of a few thousand requests per second at peak times. Had it been hundreds of thousands of requests over sustained periods we would certainly have added load balancers at the very least, but since our load, while big enough to break a small server, was not that huge, we kept it nice and simple.
The websites all run WordPress. They were hosted on a managed server by a very reputable company, but I don’t like managed servers because they take control away, and when things do break I have to go begging to the hosting company to make changes they don’t want to make because it breaks their automation.
So we got two self-managed dedicated servers, each with 64 GB of RAM and SSD storage. The “larger” server uses NVMe, so disk I/O is blazingly fast.
Web Hosting Server
The larger server (I only call it larger because of the NVMe; it’s pretty much identical in all other respects) is used as the web hosting server, and the second server is used as a dedicated MySQL server.
On our web hosting server I installed Nginx and PHP 7.0. I made a few optimisations to PHP and Linux based on a book I had bought called Scaling PHP.
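I won’t reproduce the book’s advice here, but to give a flavour, the PHP side of the tuning is along these lines. The values below are illustrative assumptions for a dedicated 64 GB box, not our actual production settings:

```ini
; php-fpm pool tuning (www.conf) - values are assumptions for illustration
pm = static            ; fixed worker pool suits a dedicated web box
pm.max_children = 64   ; sized so all workers fit comfortably in RAM
pm.max_requests = 1000 ; recycle workers periodically to contain memory leaks
```

A static pool avoids the fork/spawn churn of `pm = dynamic` when the server has nothing else competing for the memory.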
In WordPress we installed the W3 Total Cache plugin. This plugin works amazingly well with Nginx because it comes with an Nginx config file, so any cached files it creates are served from Nginx directly, without invoking PHP, MySQL or WordPress. Given our NVMe SSDs and Nginx, this makes serving our sites super quick!
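Wiring that up is just an `include` in the site’s server block. The paths and domain below are assumptions for illustration, not our actual config:

```nginx
# Hypothetical server block; W3 Total Cache writes its rewrite rules to
# nginx.conf in the site root, and including it lets Nginx serve cached
# pages straight from disk without touching PHP.
server {
    listen 80;
    server_name example.com;
    root /var/www/example.com;

    include /var/www/example.com/nginx.conf;

    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/run/php/php7.0-fpm.sock;
    }
}
```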
The one issue I had with W3 Total Cache is that purging the cache didn’t seem to remove the cache files. This is a problem because Nginx will serve the files if they exist. My workaround was a fairly simple bash script which ran every 10 minutes and deleted the entire cache directory. I know that sounds a bit harsh, but there are things that make it OK (at least in my mind):
- We also use Cloudflare in front of this, so a cached copy is still kept there, and
- The bash script checks server load. If the load is above some threshold then it skips deleting the local cache, so during periods of high load the cache lives longer.
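The script is roughly as follows. The cache path and load threshold here are assumptions for illustration, not the production values:

```shell
#!/bin/sh
# Sketch of the cron'd purge script.
# purge_cache DIR MAX_LOAD
# Deletes DIR unless the 1-minute load average exceeds MAX_LOAD, so the
# cache survives longer while the server is busy.
purge_cache() {
    cache_dir=$1
    max_load=$2
    load=$(cut -d' ' -f1 /proc/loadavg)   # 1-minute load average
    # awk handles the floating-point comparison; exit 0 here means "too busy"
    if awk -v l="$load" -v m="$max_load" 'BEGIN { exit !(l > m) }'; then
        echo "load $load above $max_load, keeping cache"
        return 0
    fi
    rm -rf "$cache_dir"
    echo "cache purged"
}

# Illustrative path: W3 Total Cache keeps its page cache under wp-content.
purge_cache "/var/www/example.com/wp-content/cache/page_enhanced" 4.0
```

Cron then runs it every 10 minutes, e.g. `*/10 * * * * /usr/local/bin/purge-cache.sh`.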
We also had an issue that certain pages needed to be fairly fresh (e.g. entry pages where a vote count might change often). We couldn’t set up Cloudflare page rules for all the different categories, so instead we set Cloudflare to obey our Cache-Control headers and I then coded up a filter in WordPress to send the correct cache timeout values. This filter also checks the server load, so during periods of high load it automatically gets Cloudflare to cache the content for a little longer.
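In spirit the WordPress side looks something like this. The `send_headers` hook and `sys_getloadavg()` are real WordPress/PHP APIs, but the thresholds and TTLs below are illustrative assumptions, not our actual values:

```php
<?php
// Hedged sketch: send a Cache-Control header whose TTL grows with load,
// so Cloudflare absorbs more traffic while the origin is busy.
add_action( 'send_headers', function () {
    $load = sys_getloadavg()[0];              // 1-minute load average
    $ttl  = ( $load > 2.0 ) ? 600 : 60;       // busier server => longer TTL
    header( "Cache-Control: public, max-age={$ttl}, s-maxage={$ttl}" );
} );
```

In the real filter the base TTL also varies by page type, so vote-count pages stay fresher than static content.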
Our MySQL server is the second 64 GB RAM, SSD machine. It is optimised purely for MySQL, and with that as its only task, combined with W3 Total Cache and Cloudflare on the front end, it did amazingly well.
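For a flavour of what “optimised to handle MySQL” means on a dedicated 64 GB box, the tuning is along these lines. The values are illustrative assumptions, not our actual config:

```ini
# my.cnf fragment - illustrative values for a dedicated 64 GB MySQL server
[mysqld]
innodb_buffer_pool_size = 48G       # give InnoDB most of the RAM
innodb_log_file_size    = 1G        # larger redo logs smooth out write bursts
innodb_flush_method     = O_DIRECT  # skip double-buffering through the OS cache
```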
As I’ve mentioned, Cloudflare sits right at the front and handles a lot of the traffic. Because I’ve scripted our application to send timeouts based on server load, we manage to keep our pages fresh while balancing freshness against uptime during peak load times.
Cloudflare is great for caching, but also for providing edge locations closer to our users than our data centre.
The result was that during certain peaks the server load topped out at about 1.8. That’s quite high, but it should be noted that this was early on, while we were still making adjustments and clearing caches a bit more often than we would have liked. In spite of the load, the site kept working well and remained fast throughout, so all round we had smiles on our faces!
If we found we were hitting breaking points, it would have been really simple to set up a few web servers on AWS behind a couple of HAProxy load balancers. Our load balancers would point to our servers, and Cloudflare would point to our load balancers.
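For the web tier that would be a plain HTTP-mode HAProxy config along these lines. The names and addresses are assumptions for illustration:

```haproxy
# Illustrative web-tier load balancing; Cloudflare would point at this box.
frontend www
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    option httpchk GET /
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check
```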
If we really felt the database was the bottleneck (which it usually is), we could have set up load-balanced MySQL servers with HAProxy too. This is something I’m sure we’ll need to revisit as the site grows its user base.
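The MySQL variant runs HAProxy in TCP mode with its built-in MySQL health check. Again, the addresses and the check user are illustrative assumptions, and the replicas themselves would still need replication set up separately:

```haproxy
# Illustrative MySQL load balancing; requires a 'haproxy_check' user on
# each database server for the health check.
listen mysql
    bind *:3306
    mode tcp
    balance leastconn
    option mysql-check user haproxy_check
    server db1 10.0.1.11:3306 check
    server db2 10.0.1.12:3306 check
```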