Containers all the way down

This article is a technical deep dive into aspects of the QuirkLocal architecture. I’ll give a brief overview of how we used to build application infrastructure, then talk about containers and how QuirkLocal uses them to provide a modern, highly automated, and efficient application platform.

Pre-history: Bare Metal

There’s been a revolution in systems architecture over the last decade, and it mirrors an earlier trend towards virtualization. Virtualization has been with us for decades. IBM was doing it as far back as the 1970s, but it became much more mainstream, especially for servers, in the aughts. VMware exploded onto the scene first with a desktop product, and then took over the data center. Many competitors followed suit.

Before virtualization, you had to provision an entire physical server with exactly the right amount of CPU, memory, and disk space, install an OS, and then install your application. These days we call this a ‘bare metal’ installation, but back then it was just the way things were done.

But nobody actually sized their servers correctly on bare metal, as this was mostly guesswork, and it was difficult to change after the fact. So the vast majority of servers sat around doing mostly nothing, wasting most of their dedicated resources. From this NRDC report https://www.nrdc.org/sites/default/files/data-center-efficiency-assessment-IP.pdf – “Studies show that average server utilization remained static at 12 to 18 percent between 2006 and 2012. This underutilized equipment not only has a significant energy draw but also is a constraint on data center capaity.[sic]”

Modern Virtualization

With virtualization, you could create a virtual machine, allocating just a portion of the system’s resources to each partition, and it was easy to adjust if you found your application used more or fewer resources. Even better, you could overcommit the system’s resources, as long as the partitions never used their maximum allocations all at once. This allowed you to get more virtual hardware than physical hardware.

Virtualization meant that we could much more efficiently utilize the resources we had. It also provided an important abstraction layer between the virtual environments and the underlying hardware. This meant you could easily move instances between physical machines, without the instance knowing that anything had changed.

But this form of virtualization is still relatively heavyweight. You need to manually create partitions, then install an operating system, then install all of the other support software and drivers, and then install and configure the target software. Tools were of course created to automate all of this, but still, it’s not ideal.

Containers all the way down

Then came containerization. Containerization is a lighter-weight form of virtualization built on top of a layered virtual file system. Containers use a layered file system to track changes and uniquely identify the contents of each layer. You can start a new container on top of any already published file system layer. For example, c9cf959fd83770dfdefd8fb42cfef0761432af36a764c077aed54bbc5bb25368 is the SHA-256 hash of an entire Ubuntu Linux installation.
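To make that content-addressing concrete, here’s a small sketch (plain shell, no Docker required) of the property layer IDs rely on: a SHA-256 digest is determined entirely by the contents, so identical layers always get identical IDs.

```shell
# Container layers are content-addressed: a layer's ID is the SHA-256
# digest of its contents, so the same bytes always produce the same ID,
# and any change produces a completely different one.
printf 'contents of a layer' > layer_a
printf 'contents of a layer' > layer_b   # identical contents...
sha256sum layer_a layer_b                # ...print identical digests
```

This is why an already published layer can be shared safely between images: the ID guarantees exactly what’s in it.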

You can build your own layered file system on top of a pre-existing container using a build script called a ‘Dockerfile’. Docker is currently the most popular containerization product, and Docker Hub https://hub.docker.com/ is the place to go if you want to find a preconfigured container.
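As a sketch of what such a build script looks like (a hypothetical example, not one of our files), a Dockerfile that layers on top of the public Ubuntu image might read:

```Dockerfile
# Hypothetical Dockerfile: each instruction records its changes as a new
# file system layer on top of the public Ubuntu base image.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y curl
COPY ./app /opt/app
CMD ["/opt/app/start.sh"]
```

Each instruction produces a layer, and layers already present on a machine are reused rather than rebuilt.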

The best use of containers is not to run an entire general-purpose operating system but to run a single instance of a server application: for example, a web server or a database server. The application, with all of its configuration, comes pre-installed in the container.

For example, https://hub.docker.com/_/postgres contains a full installation of Postgres, a popular open source database. You can, with a single docker command, download and run this database in a container. Compared to manually installing Postgres in a virtual machine, this is a huge increase in deployment efficiency. You get all of the benefits of virtualization: hardware abstraction, ease of moving between physical hosts, and dynamic allocation of resources. And you get a batteries-included installation that somebody else built for you and recorded as file system layers in a container.
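That single command looks roughly like this (the container name is made up here; POSTGRES_PASSWORD is required by the official image, per its Docker Hub page):

```shell
# Pull the official Postgres image (if not already cached) and start it
# detached, exposing the database on the host's port 5432.
docker run -d --name my-postgres \
  -e POSTGRES_PASSWORD=mysecret \
  -p 5432:5432 \
  postgres:16
```

One command, and a fully configured database server is running.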

You build complex applications from a collection of simple containers, each holding a single piece of the application, independent of all of the others, and running in its own virtual environment.

This is the way QuirkLocal is run, both by developers and on our production servers that serve Internet-facing websites. For the most part, the containers used to develop our code are the same as the containers we deploy to. This means we rarely get that ‘works for me’ issue, where a local developer misconfiguration results in something breaking when it’s deployed to a different environment. Because of containers, we have almost identical environments everywhere.

How we build QuirkLocal using containers

To build our application containers, we start with stock Docker images, customize them with our own configuration, and then merge in our application code. This process is fully automated. We publish the resulting images to our hosting provider, fly.io, which then deploys the image and starts up a virtual environment to run it.

Here’s our web server Dockerfile:

# Start from the stock nginx image; 'deploy' names this build stage
FROM nginx:1.23 AS deploy
# Copy the compiled frontend from the 'FrontendBuild' stage (defined elsewhere in our build) into nginx's web root
COPY --from=FrontendBuild /workspaces/packages/frontend/dist /usr/share/nginx/html
# Install our nginx configuration template
COPY ./build/nginx.conf.template /etc/nginx/templates/default.conf.template

It starts with a stock version of nginx, a popular web server, pulling from a container image that already has nginx fully installed and configured. Then it copies files from our application build environment, and finally it copies over an nginx configuration file. That’s it, that’s all. And once this container is built, we can publish it anywhere. I can run it locally, on Azure, on Amazon, on Google Cloud, on fly.io… anywhere that supports Docker images.
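Concretely, building and running it anywhere comes down to two commands (the image tag here is a made-up example):

```shell
# Build the image from the Dockerfile in the current directory, then run
# it, mapping the host's port 8080 to nginx's port 80 in the container.
docker build -t quirklocal-web .
docker run -d -p 8080:80 quirklocal-web
```

Push the same image to any registry and those two commands work unchanged on any Docker-compatible host.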

Scaling and immutable file systems

There’s another benefit to containers: their file system layers are ‘immutable’. A running instance of the application cannot change the contents of the container image. This took me a bit to wrap my head around when I first started using them. I wanted to make configuration tweaks in a running environment and then save them back to the container image. This is frowned upon. A container can write to its running file system, but these changes are ephemeral: if the container stops and starts again, all of those changes are gone, and the file system is reloaded from the container image.
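A quick way to see this ephemerality for yourself (a sketch that assumes a local Docker daemon):

```shell
# Write a file inside a container, remove the container, then start a
# fresh one from the same image: the file is gone, because the write went
# to an ephemeral layer, never to the image itself.
docker run --name demo ubuntu:22.04 sh -c 'echo tweak > /tmp/my-tweak'
docker rm demo
docker run --rm ubuntu:22.04 cat /tmp/my-tweak   # error: no such file
```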

This makes containers stateless. You don’t need to save any state when you stop one. You don’t need to load any state when you start one. If you know anything about scaling multiple instances of an application, this sounds like heaven. You can transparently start and stop instances of the container as demand ramps up or down. There is a dark and complex scaling art practiced by those who intone ‘Kubernetes’, but there are simpler platforms that allow for easy scaling of your containers, fly.io for instance.

Scale to zero

And finally, this statelessness allows ‘scale to zero’. With scale to zero, your server starts when there are requests, and stops when there are none. This can significantly reduce hosting costs, as many cloud providers bill per CPU minute or second. A related model is called ‘serverless’: with serverless, the server is started and then stopped for every request, and caching of virtual machine state keeps startup times fast.
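On fly.io, for example, scale to zero is a matter of configuration. This is a sketch from memory of the relevant fly.toml settings; verify the exact field names against fly.io’s current documentation before relying on them.

```toml
# Sketch of a fly.toml [http_service] section enabling scale to zero
# (field names as I recall them from fly.io's docs; check the current
# documentation before use).
[http_service]
  internal_port = 8080
  auto_stop_machines = true    # stop machines when traffic goes quiet
  auto_start_machines = true   # start a machine when a request arrives
  min_machines_running = 0     # allow scaling all the way down to zero
```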

Containers: Custom, secure, economical deployments for QuirkLocal

All of this allows QuirkLocal to deploy a fully independent instance of the QuirkLocal application stack to fly.io for every customer, built from the same container images. Each instance is isolated from the others, providing added security and performance guarantees.

Containers allow each instance to use only the resources it needs. We can dynamically scale resources for heavier usage, and we can use ‘scale to zero’ to provide economical options for smaller customers.

Containers provide QuirkLocal an extremely flexible, highly automated means of rapidly deploying and scaling new instances for our customers.

If you liked this deep dive into the QuirkLocal application architecture, check out our post about some of the design decisions that went into making our application frontend. https://quirklocal.com/the-tech-behind-quirklocal/