The path to bliss: Autoscaling

by scalingexperts


I’ve been playing with the idea of autoscaling for quite some time (years, for real!). Only recently has this become more and more feasible, and I think I finally figured out a way to do it on my own.

Simply complex

I think it’s quite simple, but my best friend, who happens to be a kick-ass sysadmin (@patrixl), gave me the longest blank stare ever when I tried to explain this to him. For that reason I’ve added a lot more text to this article than I really wanted to. Hopefully it’ll make my explanation easier to grasp.

The idea

For starters, be aware this applies only to virtualized server instances; you’ll understand why shortly. The idea stemmed from two big problems you run into when running a business:

  1. Physical servers with exact specifications take way too long to provision (whether you’re ordering the equipment yourself, or relying on a web hosting company to do it for you).
  2. Capacity planning is extremely difficult, and usually very expensive (planning time and money lost from unused resources).

At my previous employer (a web hosting company), customers would sometimes send urgent requests for more servers with exact specifications, only to be told it would take one week due to back-ordered parts or mismatched RAM modules (accidents happen).

Most people have no idea how much capacity they need until a blog post puts them on “The Map” and their website suddenly requires some serious additional resources.

What to do?

Well, you can throw your arms in the air and quit your job, or you can laugh in amusement at how awesome it feels to automate your previous “job”, and go back to enjoying that wonderful mojito on the beach in DR.

OK, shut up and tell me how!?!

Try this: create a virtual server for your web server, but limit the resources with absolute and relative values:

  • ionice the disk IO with a low priority (say… 7, the lowest best-effort level)
  • limit the CPU to just 25% (1/4 of CPU availability)
  • limit the memory to just 1GB (of 4GB)
  • limit the network to just 200Mbps (approx 1/4 of actual network throughput ~800Mbps)

Next, run a benchmark on that virtual server to see exactly how many website visitors that instance can handle.

Finally, duplicate that virtual server on the same physical machine. Technically, you should be able to serve exactly twice as many concurrent visitors, assuming there isn’t a bottleneck elsewhere (disk/network/external database).

Now what?

Well, there you have it! You just created your own local “EC2-ish” instance for handling X visitors. Next time you need to handle Y more visitors, just launch the appropriate number of virtual servers to support those visitors. The best part: you don’t need to provision physical machines with exact specifications anymore. Since each virtual server only requires a small amount of resources, you can simply request one or more physical machines with “at least this much CPU, at least this much RAM, etc”.
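The "how many virtual servers for Y more visitors" arithmetic is just a division rounded up. Here is a minimal Python sketch of it; the function name and the sample numbers (500 visitors per instance, 1200 extra visitors) are made up for illustration, so plug in whatever your own benchmark gave you.

```python
def instances_needed(extra_visitors, visitors_per_instance):
    """Virtual servers required to absorb extra_visitors, rounded up.

    visitors_per_instance is the benchmark result for ONE limited instance.
    """
    if visitors_per_instance <= 0:
        raise ValueError("visitors_per_instance must be positive")
    if extra_visitors <= 0:
        return 0
    # Integer ceiling division: a partially used instance still counts.
    return (extra_visitors + visitors_per_instance - 1) // visitors_per_instance


# Illustrative numbers only: the benchmark said 500 visitors per instance.
print(instances_needed(1200, 500))  # prints 3
```

Rounding up matters: 1200 visitors at 500 per instance is 2.4 instances, and 0.4 of a server is not something you can launch.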

In our example above, if you provision a server with 8GB RAM and 2x 1 Gigabit NICs, you should technically be able to deploy 8 virtual server instances (8 x 1GB of RAM and 8 x 200Mbps across roughly 1600Mbps of real throughput, assuming the machine also has enough CPU for eight 25% slices). If your hosting provider can only give you 4 servers with 2GB of RAM, no problem! You can still deploy 8 virtual server instances, but spread across 4 machines instead of 1. Hah!
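To make that arithmetic explicit, here is a small Python sketch: the number of instances a machine can hold is set by whichever resource runs out first. The per-instance limits come from the example above, read as a quarter of one CPU core each. The host CPU counts (2 cores for the big box, 1 core for the small ones) and the ~800Mbps usable per NIC are my assumptions, since the example doesn't spell them out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu_cores: float
    ram_gb: float
    net_mbps: float


# Per-instance limits from the example: 25% of a CPU, 1GB RAM, 200Mbps.
INSTANCE = Resources(cpu_cores=0.25, ram_gb=1, net_mbps=200)


def instances_that_fit(host, instance=INSTANCE):
    """How many instances fit on host; the scarcest resource decides."""
    return int(min(
        host.cpu_cores // instance.cpu_cores,
        host.ram_gb // instance.ram_gb,
        host.net_mbps // instance.net_mbps,
    ))


# Assumed hosts: CPU counts and usable throughput are illustrative guesses.
big = Resources(cpu_cores=2, ram_gb=8, net_mbps=1600)   # 8GB, 2 NICs
small = Resources(cpu_cores=1, ram_gb=2, net_mbps=800)  # 2GB, 1 NIC

print(instances_that_fit(big))        # prints 8
print(4 * instances_that_fit(small))  # prints 8 (2 per machine, 4 machines)
```

Note that on the small machines RAM is the limit (2 instances each), even though CPU and network would allow 4.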

Beats the hell out of “NEEDING” 48GB RAM and 4 Quad Xeons by tomorrow.

Autoscaling it

To get to the point where you can auto-scale, you need the following:

  • The ability to detect your website’s usage and automatically trigger an alert when a limit has been reached.

This alert should NOT email/page/call/sms/ping/tweet/like/+1 you. Duh, you’re busy sleeping! The alert should call a script or command that launches the deployment of a new virtual server. Any alerts you do receive should be due to something critical, such as: the script didn’t work and all hell broke loose!
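A rough Python sketch of that alert logic, under some assumptions: `./deploy-vm.sh` is a hypothetical stand-in for your own deployment command, the 80% threshold is an arbitrary pick, and how you measure current visitors is left to your monitoring system. The point is only the shape: scale first, and bother a human only when scaling itself fails.

```python
import subprocess

# Hypothetical deployment command; replace it with whatever launches a
# new virtual server in your environment.
DEPLOY_CMD = ["./deploy-vm.sh"]


def check_and_scale(current_visitors, capacity, threshold=0.8,
                    run=subprocess.run):
    """Return "ok", "scaled", or "page-human".

    capacity is the total number of visitors the running instances can
    handle; threshold is the fraction of it that triggers a new instance.
    run is injectable so the logic can be tested without deploying.
    """
    if current_visitors < threshold * capacity:
        return "ok"
    try:
        # check=True raises CalledProcessError on a non-zero exit code.
        run(DEPLOY_CMD, check=True)
    except (OSError, subprocess.CalledProcessError):
        # Only now is it worth waking somebody up.
        return "page-human"
    return "scaled"
```

You would call `check_and_scale(...)` from whatever fires your alerts (a cron job, a monitoring hook) and only send a page when it returns "page-human".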

  • The ability to find a physical server in your network with available resources for your virtual server instance.

If your physical servers are all at capacity, you need to get on that ASAP. In the meantime, your deployment/automation scripts should be able to deploy your virtual server in an alternate location, such as a Cloud Hosting provider or a VPS provider somewhere. Obviously you’ll have a lot of work for the initial setup, but make sure you do it. You’ll sleep better at night.
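Picking a home for the new instance can be sketched in a few lines of Python. This version looks only at free RAM to keep it short (a real one would check CPU and network too), and the host names and the "cloud" fallback label are invented for the example.

```python
def pick_host(free_ram_gb_by_host, needed_gb=1, fallback="cloud"):
    """Return the physical host with the most free RAM that can fit the
    instance, or the fallback location when every host is full."""
    candidates = [
        (free_gb, name)
        for name, free_gb in free_ram_gb_by_host.items()
        if free_gb >= needed_gb
    ]
    if not candidates:
        # All physical servers are at capacity: deploy elsewhere for now.
        return fallback
    # max() compares free RAM first, then the name as a tie-breaker.
    return max(candidates)[1]


# Hypothetical inventory: free RAM in GB per physical server.
print(pick_host({"rack1": 0.5, "rack2": 3, "rack3": 2}))  # prints rack2
print(pick_host({"rack1": 0.5}))                          # prints cloud
```

Choosing the emptiest host spreads the load; choosing the fullest host that still fits would pack machines tighter. Either is a fair policy, so pick the one that matches how you buy hardware.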

  • The ability to deploy a perfect/fully configured virtual server instance with only 1 command.

You can do this by running your own PXE server with a pre-configured network boot based on a MAC address prefix which your virtual server instances will use (beware: network booting doesn’t scale well past 10,000 servers).

You can also do this by scripted commands, or with the help of a sweet Puppet configuration to automate your virtual server installation/configuration.

  • A load-balancer pre-configured with like, hundreds of IPs for your future servers.

You don’t HAVE to pre-configure IPs, but you need some place to reserve those IPs which will be assigned/configured on your future virtual servers. One technique is to run a script which modifies your load-balancer once a new virtual server is deployed, and another is to simply configure 100 servers and hope you don’t need to scale from 1 to 100 while you’re sleeping on the beach.
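The "some place to reserve those IPs" can be as simple as a small pool object that your deployment script asks for an address, and that can list the current backends for the load-balancer. Below is a Python sketch using the standard `ipaddress` module. The `IPPool` class, the VM names, and the 192.0.2.0/29 range (a documentation-only network) are all illustrative, and actually pushing the backend list into your load-balancer is left out because it depends on which one you run.

```python
import ipaddress


class IPPool:
    """Reserve addresses from a fixed range for future virtual servers."""

    def __init__(self, cidr):
        # hosts() skips the network and broadcast addresses.
        self._free = [str(ip) for ip in ipaddress.ip_network(cidr).hosts()]
        self._used = {}

    def reserve(self, vm_name):
        """Give vm_name an address; asking twice returns the same one."""
        if vm_name in self._used:
            return self._used[vm_name]
        if not self._free:
            raise RuntimeError("IP pool exhausted")
        ip = self._free.pop(0)
        self._used[vm_name] = ip
        return ip

    def release(self, vm_name):
        """Return vm_name's address to the pool when the VM is destroyed."""
        self._free.append(self._used.pop(vm_name))

    def backends(self):
        """Addresses currently in use, ready to write into the LB config."""
        return sorted(self._used.values(), key=ipaddress.ip_address)


pool = IPPool("192.0.2.0/29")
print(pool.reserve("web1"))  # prints 192.0.2.1
print(pool.reserve("web2"))  # prints 192.0.2.2
print(pool.backends())       # prints ['192.0.2.1', '192.0.2.2']
```

In real life the pool's state would have to live somewhere persistent (a file, a database) so it survives between script runs.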

  • The ability to remove a virtual server if it’s not needed.

Once again, you should automate the destruction of any virtual server you no longer need, since it keeps costing you money (e.g. an actual EC2 instance deployed in an emergency).
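Scaling down is the same arithmetic in reverse. Here is a hedged Python sketch that decides which instances to destroy, getting rid of the paid external ones first. The instance records, the `external` flag, and the one-instance `headroom` are assumptions of mine and not part of any real API.

```python
def instances_to_remove(current_visitors, visitors_per_instance,
                        instances, headroom=1):
    """Return the names of instances that can be destroyed.

    instances is a list of dicts like {"name": "web1", "external": False}.
    headroom is how many spare instances to keep around, so a small
    traffic bump does not immediately trigger a new deployment.
    """
    # Ceiling division, but always keep at least one instance serving.
    needed = max(1, -(-current_visitors // visitors_per_instance)) + headroom
    surplus = len(instances) - needed
    if surplus <= 0:
        return []
    # External (paid) instances sort first; sorted() is stable, so the
    # original order is kept within each group.
    ordered = sorted(instances, key=lambda inst: not inst["external"])
    return [inst["name"] for inst in ordered[:surplus]]


fleet = [
    {"name": "web1", "external": False},
    {"name": "web2", "external": False},
    {"name": "ec2-1", "external": True},
    {"name": "web3", "external": False},
]
print(instances_to_remove(400, 500, fleet))  # prints ['ec2-1', 'web1']
```

The headroom keeps you from flapping: without it, traffic hovering right at an instance boundary would have you creating and destroying servers all night.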

Feel free to post your ideas, techniques and thoughts.