The Benefits of Containers

In the old days, software installation, especially on servers, was a nightmare. Not only did you have to deal with massive installer programs and piles of documentation to get something working properly, but it tended not to be a repeatable process. If you installed software on multiple servers, there was a good chance that one or more servers were improperly configured, resulting in the waste of huge amounts of time troubleshooting. Additionally, it meant that scaling out was much harder and required a lot more work. This tended to mean that companies would spend a lot of money on hardware, simply to avoid spending even more money on salaries when there was a need to scale out.

However, tech has evolved. Rather than treating servers like pets that have to be carefully monitored and cared for, we now treat the machines like cattle. They are interchangeable, easily added or removed, and we can quickly get things working in a new environment (at least, compared to how things used to be). As part of this process, we’ve changed our approach to how we deploy code at scale. First, we started with virtual machines, so that we could abstract away the underlying hardware under our applications. While this helped considerably, it wasn’t enough, and it wasted a ton of resources on duplicate functionality (operating system installations). As we moved further along, we started switching to a container-based approach, preferring to abstract away everything but the operating system kernel.

Before we get too far into this, there are some terms we need to discuss quickly. A container is used to host a chunk of code in an isolated environment that only directly interacts with the operating system’s kernel. An image is essentially a template for a container (think of an image as a class and a container as an object; that’s close enough for the purposes of this show, but not for an actual certification test). Images are built in layers, so you can compose functionality for a new image by utilizing a set of other images, including some that you may build yourself. Images are available from image hubs (like Docker Hub), where they are versioned and where documentation is available on using them. The host is the machine on which the containers are run.
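To make the class/object analogy a little more concrete, here is a toy Python model. It is only a sketch of the idea, not how any container runtime is implemented. The `Image` and `Container` classes, the layer names, and the `my-app` image are all made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Tuple

_next_id = count(1)  # hypothetical ID generator for the toy containers


@dataclass(frozen=True)
class Image:
    """Read-only template: a name, a version tag, and an ordered stack of layers."""
    name: str
    tag: str
    layers: Tuple[str, ...]

    def extend(self, name: str, tag: str, *new_layers: str) -> "Image":
        # A new image reuses the parent's layers and stacks its own on top.
        return Image(name, tag, self.layers + new_layers)


@dataclass
class Container:
    """One running instance of an image, with its own private writable state."""
    image: Image
    container_id: int = field(default_factory=lambda: next(_next_id))
    writable: Dict[str, str] = field(default_factory=dict)


# Illustrative layer names only; real layers are filesystem diffs, not strings.
base = Image("python", "3.12-slim", ("os-base", "python-runtime"))
app = base.extend("my-app", "1.0", "pip-dependencies", "app-code")

first = Container(app)
second = Container(app)
first.writable["/tmp/cache"] = "only visible in the first container"
```

Both containers share the same immutable image, but a change made inside `first` never shows up in `second`, which is the property the analogy is meant to capture.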

Episode Breakdown

Isolation from the rest of the system, except for the kernel.

This means that each container has its own copy of the very small set of things required to run itself, without other stuff in the mix. This tends to keep things smaller and shrinks the attack surface area. Because each container has its own copy of any dependencies, you don’t have to worry about an update in another container breaking your container. This also puts a security boundary in place. If another container is compromised, it is less likely to be able to compromise your container.

Ability to control resource usage

Along with security boundaries, you want to make sure that one container can’t cause a denial of service for other containers on the machine (intentionally or otherwise). Modern container technology allows you to limit how much RAM, CPU, network traffic, and disk I/O a container is allowed to use. This can keep a rogue container from taking down your server. It can also make it easier to determine which container is misbehaving, because the container with the problem is the one that gets the errors, instead of some other random container.
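As a rough sketch of what this looks like with Docker, the snippet below builds (but does not run) a `docker run` command with a memory cap and a CPU cap. `--memory` and `--cpus` are Docker CLI flags; the helper name `resource_limit_args`, the specific limits, and the `nginx:latest` image are just assumptions for the example.

```python
from typing import List


def resource_limit_args(memory: str, cpus: float) -> List[str]:
    """Return docker run flags that cap RAM and CPU for one container.

    memory uses Docker's size notation (for example "512m" or "2g").
    cpus is the number of CPUs the container may use (fractions allowed).
    """
    if cpus <= 0:
        raise ValueError("cpus must be greater than zero")
    return ["--memory", memory, "--cpus", str(cpus)]


# Assemble the command as a list so it could be handed to subprocess.run later.
command = ["docker", "run", "--detach", *resource_limit_args("512m", 1.5), "nginx:latest"]
print(" ".join(command))
```

Building the command as a list rather than a single string avoids shell quoting problems if you later pass it to `subprocess.run`.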

Ability to scale out in a repeatable fashion

Container technology is designed to facilitate horizontal scaling. Rather than simply throwing more resources at a container (although you can often do that as well), it makes it easier to quickly spin up more instances as required. Back in the day, spinning up new instances meant buying or leasing new hardware and dealing with all the associated paperwork for capital expenditures. With containers, the cost of scaling becomes an operating expense rather than a capital one, which means you can react to demand more quickly.

Obviously, this is only going to work to the degree that the code in the container is built to scale outward. The same repeatability can also make it easier to onboard new developers onto your team, since they no longer have to spend a day or more manually installing everything required on their system. It can also make it easier to keep everyone up to date.

Ability to mount file system objects inside the container

Because the container is an isolated unit, you are likely going to need to persist some stuff in a place outside the container. This keeps your data from being destroyed when the container is destroyed. The app running within the container doesn’t need to know about things in the file system outside the container, so essentially what you do is mount a file system location on the host to a path within the container. This can also make it a lot easier to do backups, move your container to another physical device, or simply upgrade the container to a newer version.
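Here is a small sketch of a host-to-container mount using Docker's `--volume` flag, which takes `host_path:container_path` with an optional `:ro` suffix for read-only mounts. The helper name `bind_mount_arg`, both paths, and the `my-app:1.0` image are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def bind_mount_arg(host_path: str, container_path: str, read_only: bool = False) -> List[str]:
    """Return the docker run flag that mounts a host path inside the container."""
    for path in (host_path, container_path):
        # Relative paths are rejected to keep the mapping unambiguous.
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"expected an absolute path, got {path!r}")
    spec = f"{host_path}:{container_path}"
    if read_only:
        spec += ":ro"
    return ["--volume", spec]


# Data written to /var/lib/app inside the container lands in /srv/app-data on the host.
command = [
    "docker", "run", "--detach",
    *bind_mount_arg("/srv/app-data", "/var/lib/app"),
    *bind_mount_arg("/srv/app-config", "/etc/app", read_only=True),
    "my-app:1.0",
]
print(" ".join(command))
```

Because the data lives on the host, replacing the container with a newer image version and reusing the same mounts leaves the data in place.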

Ability to remap network ports from the host to the container.

In addition to being able to map file system paths, containers also have the ability to map ports on the host to ports within themselves. This allows you to forward traffic from the host to a container. Because a lot of applications use default ports and some of them don’t make this configurable, running inside a container means that you can give the application a different port on the host, with the application being none the wiser about it. Typically you can also specify which protocol (such as TCP or UDP) a mapping applies to. You can also disable networking for a container altogether. While this doesn’t truly air-gap the container and its contents, it does make it a lot harder to get to.
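As an illustration, Docker's `--publish` flag takes a `host_port:container_port/protocol` mapping. The sketch below builds that flag and only handles TCP and UDP, the two protocols mentioned above. The helper name `publish_arg`, the port numbers, and the `nginx:latest` image are assumptions for the example.

```python
from typing import List


def publish_arg(host_port: int, container_port: int, protocol: str = "tcp") -> List[str]:
    """Return the docker run flag that forwards a host port to a container port."""
    if protocol not in ("tcp", "udp"):
        raise ValueError(f"unsupported protocol in this sketch: {protocol!r}")
    for port in (host_port, container_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    return ["--publish", f"{host_port}:{container_port}/{protocol}"]


# The app inside still listens on its default port 80; the host exposes it on 8080.
command = ["docker", "run", "--detach", *publish_arg(8080, 80), "nginx:latest"]
print(" ".join(command))
```

Running a second copy of the same image only requires a different host port, such as `publish_arg(8081, 80)`, while the application inside keeps its default.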

Ability to configure the network between containers and the outside world

If you are running multiple containers (or any container that you’d like to be secure), you probably will want at least some of the containers to be able to communicate with each other, while not allowing everything on the network to talk to them. There are several different network types (these examples will be from Docker).

Bridge networks are usually used when applications are running in standalone containers that need to communicate with things outside.

Host networks directly map the network of the hosting OS to the container itself. While this can work pretty well, it’s probably better to avoid it just to keep the attack surface smaller on your container.

Macvlan (MAC VLAN) networks let you assign a MAC address to a container, making it appear as a separate, physical device on your network. This is useful, especially for legacy apps that expect to be connected directly to the network.
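The three network types above can be sketched as Docker CLI commands. The code below only builds the commands, using `docker network create` with its `--driver`, `--subnet`, `--gateway`, and `--opt` flags, and `docker run --network`. The network names, the subnet, the `eth0` parent interface, and the image names are all assumptions for the example.

```python
import ipaddress
from typing import List


def create_bridge_network(name: str) -> List[str]:
    """Command for a user-defined bridge network that containers can join."""
    return ["docker", "network", "create", "--driver", "bridge", name]


def create_macvlan_network(name: str, subnet: str, gateway: str, parent: str) -> List[str]:
    """Command for a macvlan network attached to a physical host interface."""
    network = ipaddress.ip_network(subnet)  # raises ValueError on a malformed subnet
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is not inside {subnet}")
    return [
        "docker", "network", "create", "--driver", "macvlan",
        "--subnet", str(network), "--gateway", gateway,
        "--opt", f"parent={parent}", name,
    ]


def run_on_network(image: str, network: str) -> List[str]:
    """Command that starts a container on a named network (or "host")."""
    return ["docker", "run", "--detach", "--network", network, image]


commands = [
    create_bridge_network("app-net"),
    run_on_network("my-app:1.0", "app-net"),
    run_on_network("my-monitor:1.0", "host"),  # shares the host's network stack
    create_macvlan_network("lan-net", "192.168.1.0/24", "192.168.1.1", "eth0"),
    run_on_network("legacy-app:1.0", "lan-net"),
]
for command in commands:
    print(" ".join(command))
```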

Ability to control security in the container in a way that isolates it from other containers and keeps it from compromising the host.

You can run a container as an unprivileged user, which reduces the risk of privilege escalation attacks from within the container. You can also map users (such as root) within a container to a less privileged user outside of the container.
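A minimal sketch of the first option with Docker: the `--user` flag sets the UID and GID the container process runs as, `--cap-drop ALL` removes Linux capabilities, and `--read-only` makes the container's root file system read-only. The helper name `hardened_run`, the UID/GID of 1000, and the image name are assumptions. Mapping container root to a less privileged host user is configured on the Docker daemon rather than per container, so it is not shown here.

```python
from typing import List


def hardened_run(image: str, uid: int, gid: int) -> List[str]:
    """Build a docker run command that starts the image as a non-root user."""
    if uid == 0 or gid == 0:
        # Refuse root so the unprivileged intent can't be undone by accident.
        raise ValueError("refusing to run as root (uid or gid 0)")
    if uid < 0 or gid < 0:
        raise ValueError("uid and gid must be positive")
    return [
        "docker", "run", "--detach",
        "--user", f"{uid}:{gid}",   # run the process as this user and group
        "--cap-drop", "ALL",        # drop all Linux capabilities
        "--read-only",              # root file system becomes read-only
        image,
    ]


command = hardened_run("my-app:1.0", uid=1000, gid=1000)
print(" ".join(command))
```

Whether an image actually works with all capabilities dropped and a read-only file system depends on the application, so treat these flags as a starting point to test against rather than a guaranteed-safe default.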

Between this, the networking options, private VLAN capabilities, and the ability to limit resource usage, you are more able to trust containers that might otherwise be a bit more risky, simply because you can limit what they can do. While this doesn’t give you free rein to download every dodgy container that you find on the internet, it does reduce the risk of a breach caused by a third party. Bear in mind also that even if you trust a particular container right now, the next version may have vulnerabilities or even be the result of a supply chain attack.

Ability to isolate environment variables and secrets

We’ve talked about how containers can store data, and receive commands from outside over the network, but you probably also need to be able to configure basic settings within the container itself, without having to rebuild the container image every time you make a change. An industry standard way of doing this is to supply environment variables to the container. These variables will be accessed by the code within the container to determine how things should work at runtime.
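From the application's side, this might look something like the sketch below, which reads its settings from environment variables at startup. The variable names (`APP_DATABASE_URL`, `APP_LOG_LEVEL`, `APP_WORKER_COUNT`), the defaults, and the `Settings` class are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    worker_count: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read runtime configuration from environment variables.

    APP_DATABASE_URL is required; the other two fall back to defaults.
    """
    try:
        database_url = env["APP_DATABASE_URL"]
    except KeyError:
        raise RuntimeError("APP_DATABASE_URL must be set") from None
    return Settings(
        database_url=database_url,
        log_level=env.get("APP_LOG_LEVEL", "INFO").upper(),
        worker_count=int(env.get("APP_WORKER_COUNT", "2")),
    )


# Passing a plain dict stands in for the container's environment here.
settings = load_settings({"APP_DATABASE_URL": "postgres://db/app", "APP_LOG_LEVEL": "debug"})
print(settings)
```

With Docker, values like these could be supplied when the container starts using `--env APP_LOG_LEVEL=debug` or an `--env-file`, so the same image can run with different settings without being rebuilt.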

You can also supply configuration files from outside the container and have them mounted into the container. This allows you to manage configuration outside the container itself for a bit more flexibility. Container technologies also allow you to manage secrets that containers will use. This is useful for keeping sensitive data out of source control and out of plaintext environment variables where it can be seen.

Containers are minimal and put together using composition to do more complex tasks

Rather than a bloated image that can support all kinds of scenarios, containers are built using composition from smaller container images. This makes it easier to avoid putting too much stuff in a single container. This slim design means that it is easier to patch container images quickly, as the area to test is smaller.

This is in contrast to things like virtual machines, which have a boatload (an OS-load, actually) of functionality included in them. This bloats their size and also makes patching more time consuming. The smaller size also makes things like package registries much cheaper to implement, since less bandwidth and storage are used.

Ability to declaratively specify configuration for ops and keep it in source control

Declarative configuration that is stored in source control also allows you to see how changes in deployment configuration might have caused problems. This also means that new developers on your team can more quickly spin up an environment, which reduces the amount of time spent on onboarding.

Since you are running the same environment configuration on your own machine, there are fewer things that could cause your code not to work when it is rolled out to production, because many configuration issues are no longer a problem. Finally, this practice makes it easier to quickly and horizontally scale your systems without necessarily involving development staff (or staff at all, for that matter).
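To show why keeping this configuration as data in source control is handy, here is a toy sketch. It renders a deployment description in a stable form (so diffs stay small) and reports which settings changed between two revisions. The setting names, values, and helper functions are hypothetical and not tied to any particular tool's file format.

```python
import json
from typing import Any, Dict, Tuple


def render(spec: Dict[str, Any]) -> str:
    """Serialize a spec deterministically so source control diffs stay minimal."""
    return json.dumps(spec, indent=2, sort_keys=True) + "\n"


def changed_settings(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each setting that differs to its (old, new) pair of values."""
    keys = sorted(set(old) | set(new))
    return {key: (old.get(key), new.get(key)) for key in keys if old.get(key) != new.get(key)}


# Two hypothetical revisions of the same service description.
previous = {"image": "my-app:1.0", "memory": "512m", "ports": ["8080:80"]}
current = {"image": "my-app:1.1", "memory": "256m", "ports": ["8080:80"]}

for setting, (before, after) in changed_settings(previous, current).items():
    print(f"{setting}: {before} -> {after}")
```

If a deployment starts failing after the second revision, the diff immediately points at the new image version and the halved memory limit as the things to investigate.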

Ability to package a complete application in the way that it would work when deployed, including dependencies

This means that deployment is a repeatable process, so you will have fewer false positives from QA resulting from configuration issues. When operations rolls out an update, they can keep the old container and roll out a new one with the changes. This allows rapid rollback to a known good configuration in the event of an error.

As configuration rolls out to developers, this also makes it easy to switch configuration when you switch branches and to do so transparently. This reduces the overhead of configuration changes and makes it more likely that problems created by them will be caught by the people most likely to be able to troubleshoot them.

Tricks of the Trade

Learning the concepts behind the code we write is important. We often complain or hear complaints about colleges teaching only concepts and not the practical skills. Coding bootcamps are full of college graduates looking to learn practical skills, because employers no longer want a junior with knowledge of concepts but one with usable skills. For those of us who didn’t go to college for computer science, the rush to learn the skills can cause us to not take the time to learn the concepts. Learning the concepts allows us to apply what we learned in one language or framework to the next one we learn. They allow us to move between different skills within programming without having to learn a whole new set.
