Enhancing Cloud Security by Reducing Container Images Through Distroless Techniques
We analyzed the Distroless technique for reducing the size of container images and explored its capabilities to address security concerns. We provide an alternative approach to Distroless that reduces the attack surface for malicious actors targeting cloud-native applications while optimizing cloud resources.
Since its inception in 2013, Docker has transformed how developers use containers, and Docker Hub has influenced how they share container images. So as not to reinvent the proverbial wheel, most developers who deploy their code to containers use a publicly available image from Docker Hub as a base. Some of the most popular images in Docker Hub, like the key-value store server official image, follow the same trend and use the official Debian image as their base. While reusing a base image seems like a good idea, verifying the footprint of each image is not a cloud security practice that developers regularly implement.
The official images are often maintained by a community of developers. Apart from the base operating system (OS) images, most application images are built on top of other images. This means that the vulnerabilities and security weaknesses found in the base images are carried over to the application images that use them.
In this article, we take a closer look at the Distroless technique for optimizing, among other things, security in container images and offer an alternative approach that can reduce both the size of container images and the attack surface for malicious actors seeking to exploit cloud-native applications.
A brief background on current industry practices
The software bill of materials (SBOM) has recently become a popular concept in the information security community. An SBOM is a list of all the packages installed in a specific container or file system, and Syft has started to become the industry standard for generating SBOMs.
We used Syft to generate a package list from the official public image of Debian:
Figure 1 shows that there are 96 packages installed in this image. We can also use Grype, another increasingly popular tool, to analyze the SBOM generated by Syft and scan the original image for vulnerabilities.
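To make the package count concrete, the following is a minimal Python sketch of how such a tally could be reproduced from an SBOM. It assumes the SBOM was exported in Syft's JSON format (for example with `syft <image> -o json`), where packages are listed under a top-level `artifacts` array. The function name `summarize_sbom` and the sample data are hypothetical and are not taken from our scans.

```python
import json
from collections import Counter


def summarize_sbom(sbom):
    """Return (total package count, per-type tally) for a Syft-style SBOM dict."""
    # Assumption: packages live under the top-level "artifacts" key
    artifacts = sbom.get("artifacts", [])
    by_type = Counter(item.get("type", "unknown") for item in artifacts)
    return len(artifacts), by_type


# Illustrative data only; a real SBOM would be loaded from the exported file
sample = json.loads("""
{
  "artifacts": [
    {"name": "apt", "version": "0.0.0", "type": "deb"},
    {"name": "bash", "version": "0.0.0", "type": "deb"},
    {"name": "example-lib", "version": "0.0.0", "type": "python"}
  ]
}
""")

total, by_type = summarize_sbom(sample)
print(total, dict(by_type))
```

Running the same function against the SBOMs of two candidate base images gives a quick, if rough, comparison of how many packages each one ships.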
The extent of the risk of using Debian-based images is plain to see: The more packages there are, the larger the attack surface becomes. More packages also mean a bigger disk and bandwidth footprint, which has pushed many developers to migrate from Debian-based images to Alpine-based ones. For newcomers, Alpine Linux is a security-oriented, lightweight Linux distribution based on musl libc and BusyBox.
We can see the benefits that Alpine Linux offers here:
And, as of this writing, that translates to zero reported vulnerabilities.
The security improvement that Alpine Linux provides is great news. Alpine Linux has also been releasing timely updates.
It would be naive to imagine that the only vulnerable pieces of code inside a given container would be the packages of the original base image. The applications written by the developers introduce potential vulnerabilities to the container as well. To visualize the potential problems, let us suppose for now the possibility that any container might be running a vulnerable application that allows remote code execution (RCE) inside the container.
If a developer has an application running in a Debian-based image, one of the packages available for an attacker to use is apt, Debian's package manager, which creates opportunities for malicious actors to exploit. Alpine-based official images are not that different, since they contain apk, the Alpine package manager, and BusyBox, which combines tiny versions of many common UNIX utilities, such as wget, into a single small executable.
Malicious actors will always look for a vulnerability to exploit, hence the need to eliminate every opportunity that could help them advance to the next phase of an attack.
From the standpoint of an attacker trying to gain access to the shell of a potentially exposed container, package managers are seen as obstacles that need to be overcome. But that was not the only concern we had when we attempted to map the attack surface. Depending on the base image, there are also several native Linux tools that can be used for malevolent purposes, so the images would be more secure without them.
One approach to solving this issue involves mapping out those tools and removing the actual binaries during build. There are, however, two issues with such an approach: the effort of mapping all available tools and the creativity of attackers to use what is left.
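The mapping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image's root file system has already been unpacked to a local directory, and `DENYLIST` and `find_denylisted` are hypothetical names with a deliberately short list of tools.

```python
import os
import tempfile

# Hypothetical denylist; a realistic one would be much longer
DENYLIST = {"apt", "apk", "wget", "curl", "base64"}


def find_denylisted(rootfs, denylist=DENYLIST):
    """Walk an unpacked root file system and return paths of denylisted tools."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(rootfs):
        for name in filenames:
            if name in denylist:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


# Build a tiny fake root file system to demonstrate the scan
with tempfile.TemporaryDirectory() as root:
    bindir = os.path.join(root, "usr", "bin")
    os.makedirs(bindir)
    for tool in ("wget", "base64", "python3"):
        open(os.path.join(bindir, tool), "w").close()
    found = [os.path.relpath(path, root) for path in find_denylisted(root)]

print(found)
```

The scan flags wget and base64 but not python3, even though an interpreter can download and decode data just as well. That gap illustrates both issues: the list is never complete, and attackers can be creative with whatever remains.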
A simple yet powerful example is the base64 command, given its presence in all the container base images, as well as in full Linux distributions. Its intent is to encode and decode data for ease of transfer. We also noticed that attackers focused on cloud-native environments use this technique on a large scale to download or drop malicious parts of their arsenal encoded in base64, based on the assumption that the target victim has the command installed so that they can decode the data at runtime and exploit the container further.
Another notable issue is that many function-as-a-service offerings from cloud service providers (CSPs) also run in containers or micro virtual machines (VMs) that are based on images with more than the minimal required packages installed.
The stage for a cyberattack is set if the application running on the exposed container is breached, as this enables malicious actors to use the tools inside the container to advance to the next level, whether the application is running on-premises or through a CSP.
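A harmless round trip shows why base64 is convenient for moving content around. The sketch below uses Python's standard `base64` module rather than the command-line tool, and the payload is a benign placeholder script invented for illustration.

```python
import base64

# Benign placeholder content standing in for any file an actor might transfer
payload = b"#!/bin/sh\necho hello from a decoded script\n"

# Encoding turns arbitrary bytes into plain ASCII text that is easy to transfer
encoded = base64.b64encode(payload).decode("ascii")

# Decoding on the target restores the original bytes exactly
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == payload)
```

Because the decoding step here runs in an interpreter, the example also suggests that removing the base64 binary alone is not enough when a language runtime remains in the image.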
How can we address the security issues?
Clearly, the attack surface needs to be reduced. Google created Distroless container images, which are images that contain only the application and its runtime dependencies. Unlike images for standard Linux distributions, Distroless container images do not have package managers, shells, or other programs.
The Amazon Web Services (AWS) images shown in Figures 8 and 9 are not necessarily identical to the ones AWS uses for its hosted offering, but they are base images that AWS provides so that users can create their own:
This approach allows us to tackle two main security issues that we have observed. We can significantly reduce the number of packages inside the image and retain only what is necessary for the intended application to run. By doing so, we also decrease the attack surface that cybercriminals can exploit. This approach also allows us to drastically reduce the number of vulnerabilities, even bringing it down to zero in most cases. This new approach makes the application more secure when deployed.
An alternative approach to Distroless
When we started this research, we noticed that most of the Distroless approaches we analyzed sought to achieve lighter and faster containers. In many cases, we observed that the container images did not have unnecessary tools and libraries, while some even used scratch images with just a few base file systems mounted afterward as layers.
We propose an alternative approach to Distroless, which is to use a multistage build technique plus a scratch image that contains only the necessary supporting binaries for the intended application to run.
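The shape of such a build can be sketched as follows. This Python helper only renders the text of a two-stage Dockerfile; it is an illustration, not the exact build we used. The function name `render_multistage_dockerfile`, the builder image tag, the build command, and the file paths are all hypothetical, while `FROM ... AS`, `FROM scratch`, and `COPY --from` are standard Dockerfile instructions.

```python
import json


def render_multistage_dockerfile(builder_image, build_cmd, artifacts, entrypoint):
    """Render a two-stage Dockerfile: build in a full image, ship from scratch."""
    lines = [
        f"FROM {builder_image} AS build",
        "WORKDIR /src",
        "COPY . .",
        f"RUN {build_cmd}",
        "",
        # The final stage starts empty: no shell, no package manager
        "FROM scratch",
    ]
    # Copy only the listed files out of the build stage; nothing else ships
    lines += [f"COPY --from=build {path} {path}" for path in artifacts]
    # json.dumps produces the exec-form array that ENTRYPOINT expects
    lines.append(f"ENTRYPOINT {json.dumps(entrypoint)}")
    return "\n".join(lines) + "\n"


# Hypothetical example: a statically linked binary as the only shipped file
dockerfile = render_multistage_dockerfile(
    builder_image="golang:1.22",
    build_cmd="CGO_ENABLED=0 go build -o /out/app .",
    artifacts=["/out/app"],
    entrypoint=["/out/app"],
)
print(dockerfile)
```

For an interpreted function, the `artifacts` list would also need to name the interpreter and every shared library it loads, since a scratch image provides none of them.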
This approach dovetails neatly with serverless since its core concept is to break down applications into smaller functions and use the serverless functions to process data. In other words, each function has only one purpose. While this is the intended use, it might not reflect the real-world usage for all the users.
With the desired usage for container images in mind, there are two requirements for the function to run: the language interpreter and the CSP's internal application programming interface (API) binaries. Our test results showed that we can drastically reduce both the size of the container and the attack surface and vulnerabilities found on the CSP-provided images.
The concept of Distroless container images may have been in existence for quite some time, but it is far from being the norm. As the body of research on container security is slowly being built, we continue to channel our expertise into the implications container security has for the cloud infrastructure. Our research showed its potential and how it can be adopted for resource optimization and for addressing security concerns. However, given the perceived shortfalls in the Distroless approach, we devised an alternative technique that uses a multistage build with a scratch image that contains only the essential supporting binaries for the intended application to run. If properly implemented, this approach can address vulnerability management issues and the need to minimize the attack surface that malicious actors targeting cloud-native applications exploit.
The multistage build with scratch image technique we discussed to optimize container images offers the following benefits for developers who strive to improve cloud security:
- It can work well with serverless, as the core concept of serverless is to segment applications into smaller functions and use those functions to process data.
- It can also significantly reduce not only the size of the container but also the attack surface and vulnerabilities found on the CSP-provided images.