Our previous two articles about securing the Docker ecosystem addressed two specific and critical areas: the Daemon and the container Build Phase. In each article, we shed light on the dark side of various default configurations, presented evidence of the latest attack methods during the build phase, danced with the Daemon and secured its weak underbelly… and less fantastically, we provided expertly-articulated recommendations and best practices to help you in your Docker-securing endeavors. If you haven’t read these articles yet, here are the links to the first and second blog posts.
In this final post, we conclude our series by focusing on the Docker container runtime. As prepared as one may be after securely implementing the build phase, the equally challenging half of the Docker security equation is the implementation and maintenance of containers once they are deployed to production. We pull the thread on why the security of a running container is important and explore some of the challenges you may experience as you try to put a security strategy in place. And as with the previous two articles, we will present some battle-tested countermeasures you can incorporate into your strategy to drastically improve your Docker runtime container security.
In the first article of this series, we explained that Docker containers enable developers to run applications in resource-isolated processes on any platform – without having to worry about compatibility issues. Often, a container instance provides a single service or piece of an application (“microservice”). Containers enable more streamlined – and therefore faster – application delivery and greater portability, and also support dynamic workloads.
Once the Docker build phase has been successfully implemented, we move to the deployment-to-production phase, the last stage of a container’s lifecycle and usually the most critical. Security must remain an integral component of the entire container lifecycle, even as the security considerations and configurations naturally morph with the maturation of the container.
Thus, even if a containerized application’s Software Development Life Cycle (SDLC) exists in one continuous pipeline - all the way from development, to integration, to production - the security controls are often “siloed” rather than part of a comprehensive approach enforced throughout the entire process. That’s why many companies fall short of providing a must-have, “holistic” Defense in Depth security posture that aims to thwart, or at least minimize, any security breach that could have high operational and reputational costs for the organization.
And now – on to some best practices to incorporate into your Docker runtime container security strategy.
It’s critical to understand the security implications inherent in Docker and take purposeful steps to ensure the security team responsible for tuning runtime parameters does so properly and safely by correctly adhering to the security mechanisms at their disposal.
Here are a few effective ways to secure your Docker runtime containers.
The best practice here is to drop all non-essential capabilities and only enable the ones that are actually required by the container.
There are various Linux capabilities used as isolation techniques by Docker to restrict the privileges of the processes running in a container. By default, Docker starts containers with a restricted set of Linux capabilities, as shown in this table.
| Capability Key | Capability Description |
| --- | --- |
| AUDIT_WRITE | Write records to the kernel auditing log. |
| CHOWN | Make arbitrary changes to file UIDs and GIDs (see chown(2)). |
| DAC_OVERRIDE | Bypass file read, write, and execute permission checks. |
| FOWNER | Bypass permission checks on operations that normally require the filesystem UID of the process to match the UID of the file. |
| FSETID | Don’t clear set-user-ID and set-group-ID permission bits when a file is modified. |
| KILL | Bypass permission checks for sending signals. |
| MKNOD | Create special files using mknod(2). |
| NET_BIND_SERVICE | Bind a socket to Internet domain privileged ports (port numbers less than 1024). |
| NET_RAW | Use RAW and PACKET sockets. |
| SETFCAP | Set file capabilities. |
| SETGID | Make arbitrary manipulations of process GIDs and the supplementary GID list. |
| SETPCAP | Modify process capabilities. |
| SETUID | Make arbitrary manipulations of process UIDs. |
| SYS_CHROOT | Use chroot(2), i.e. change the root directory. |
While most of the default capabilities are inoffensive within a container, some of them can definitely pose security risks, depending on the scenario. The NET_RAW capability, for example, enables the container to craft raw packets and can potentially be used to perform ARP spoofing and Man-in-the-Middle (MitM) attacks. In certain network configurations, it might also allow network traffic sniffing. This capability was recently abused to exploit a vulnerability that impacted many Docker-based networking stacks… and it is still enabled - by default - in Docker.
The question of whether these capabilities should be granted by default is one for another day (or blog post). In any case, Docker allows any default capability to be dropped and additional functionality not granted by default to be added. So for starters, your containers’ security can be improved simply by enabling capabilities on an as-needed basis. For example, launching a container with --cap-drop=all --cap-add=net_bind_service would only give the container the capability to bind to privileged ports (port numbers less than 1024).
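As a quick sketch of this approach (the nginx image and host port are illustrative choices, not part of the original recommendation), a web server that only needs to bind a privileged port inside the container could be started like this:

```shell
# Drop every default capability, then grant back only NET_BIND_SERVICE
# so the server can bind to port 80 inside the container.
# Note: some images need a few extra capabilities for their own setup
# (e.g. setuid/setgid to drop root privileges); add those as needed.
docker run -d --name web \
  --cap-drop=all \
  --cap-add=net_bind_service \
  -p 8080:80 \
  nginx
```

Any other privileged operation the contained process attempts will now fail with a permission error rather than succeed silently.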
Use the below command to show all the capabilities that have been explicitly added at runtime.
docker ps -a -q | xargs docker inspect -f '{{ .Id }}: {{.HostConfig.CapAdd}}'
To show the full list of capabilities of a running container instead, run the command below.
docker exec CONTAINER_ID capsh --print
Install the “libcap” package in your container if capsh is missing.
Adding capabilities reduces container isolation and may pose security risks. Before adding capabilities, carefully evaluate the potential impact on the underlying environment.
Do not run privileged containers.
Running a container with the --privileged flag effectively disables all isolation features. A privileged container has all available capabilities and complete access to all of the host’s devices. It also runs with isolation mechanisms such as cgroups, AppArmor, and seccomp disabled.
In other words, an attacker on a privileged container can get “root” access on the underlying host with little effort.
This flag exists solely to enable fringe use cases, like running Docker in Docker. Realistically, this lack of isolation should almost always be avoided in favour of more fine-grained oversight and control of capabilities; using the --cap-add and --cap-drop flags is highly recommended in this case.
Use the below command to detect the privileged containers running in your system.
docker ps -q | xargs docker inspect -f '{{ .Id }}: {{ .HostConfig.Privileged }}'
Do not weaken the containerization by running containers with “host” network mode.
By default, Docker uses kernel namespaces to give each container its own network stack, meaning that a container cannot see or affect network sockets running in another container or in the host system. This can be disabled by running a container with the --network=host flag, which gives it full access to the host’s network stack.
Do not disable network isolation unless strictly necessary: doing so exposes the host’s network stack and services to manipulation by a malicious actor positioned in the container.
Use the command below to ensure that all your containers run with the “default” network mode.
docker ps -a -q | xargs docker inspect -f '{{ .Id }}: {{ .HostConfig.NetworkMode }}'
Do not disable the isolation provided by running containers with “host” PID mode.
By default, Docker uses kernel namespaces to provide separation of processes. If this isolation is disabled, the entire host PID namespace is shared with the container, allowing processes within the container to see and potentially interact with the processes on the host system.
Use the command below to ensure that none of your containers run with the “host” PID mode.
docker ps -a -q | xargs docker inspect -f '{{ .Id }}: {{ .HostConfig.PidMode }}'
Containers’ ports are often exposed on every network interface instead of being bound to a specific one.
By default, when you run a container, it does not publish any of its ports to the outside world unless the --publish or -p flag is used.
The publish flag accepts an IP address to bind the port on a specific network interface of the host. If no IP address is specified, the port is bound to all network interfaces. This might not be desirable because connections on interfaces that are not designated for this type of traffic might be overlooked by intrusion detection, firewall, and other monitoring systems. Ensure that exposed container ports are bound to a specific interface and not to the wildcard IP address 0.0.0.0.
Use the command below to list the ports exposed by your running containers.
docker ps -q | xargs docker inspect -f '{{ .Id }} {{ .NetworkSettings.Ports }}'
The containers’ ports that have the HostIp field set to 0.0.0.0 are exposed on every interface. Consider running the containers again to publish the port to a specific interface.
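A hedged illustration of the difference (the image name and port numbers are arbitrary):

```shell
# Exposed on every interface: HostIp will show as 0.0.0.0.
docker run -d -p 8080:80 nginx

# Bound only to the loopback interface: reachable from the host
# itself but not from other machines on the network.
docker run -d -p 127.0.0.1:8080:80 nginx
```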
“Linking” is a deprecated feature that was introduced to facilitate communications between containers. Its functionalities were superseded by safer information and network-sharing mechanisms.
Using --link to create relationships between containers shares all the environment variables from the source container with the recipient container and updates the /etc/hosts file to resolve the names of linked containers.
Sharing environment variables could have serious security implications if sensitive data is stored in them. You can achieve the same result in a more controlled manner by mounting shared volumes or by using the environment variable definitions in Docker Compose. The container name resolution can be replaced by using user-defined Docker networks.
Use the command below to list any link between containers.
docker ps -a -q | xargs docker inspect -f "{{ .Id}} {{ .HostConfig.Links }}"
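As a sketch of the recommended replacement (the network and container names are illustrative), a user-defined bridge network gives containers name resolution without sharing any environment variables:

```shell
# Create a user-defined bridge network.
docker network create app-net

# Containers attached to the same user-defined network can resolve
# each other by container name (e.g. "db"), with no --link required
# and no environment variables leaked between them.
docker run -d --name db --network app-net postgres
docker run -d --name web --network app-net my-web-app
```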
Mounting the host’s files and directories must be done carefully to avoid exposing sensitive resources of the host system.
Containers have access to any host file or directory mounted with the -v or --volume flag. Depending on which files are mounted, and whether they are mounted in read-write mode, this feature could be abused by an attacker positioned in the container to compromise the integrity of the Docker host and break out of the containerization.
Files and directories at paths such as those given below are critical to the host system and should not be mounted.
/
/boot
/dev
/etc
/lib
/proc
/sys
/usr
Use the command below to list the mounted resources and check whether they are mounted in read-write mode for each container instance.
docker ps -a -q | xargs docker inspect -f '{{ .Id }} {{ .Mounts }}'
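When a host path genuinely must be shared, mounting it read-only limits the damage a compromised container can do; a minimal sketch with illustrative paths and image name:

```shell
# The ":ro" suffix mounts the host directory in read-only mode, so a
# process in the container can read the configuration but not alter it.
docker run -d -v /srv/app/config:/etc/app/config:ro my-web-app
```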
Use Docker secrets to manage sensitive data in your containers.
Storing sensitive data in your image is a dangerous practice that enables any attacker that gains access to a container to quickly recover the data from the filesystem.
Passing sensitive data to your containers via environment variables is also not recommended. Anyone who is able to run docker inspect against your container, or an attacker who gains root access to your container, would be able to steal your secrets fairly quickly.
Instead, use Docker secrets (see the docker secret command) to manage sensitive data and securely transmit it only to those containers that need access to it. Secrets are encrypted in transit and at rest and are only accessible to running services that have been granted explicit access.
Secrets are available for Docker Swarm and Docker Compose.
Read more about Docker Secrets here.
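A minimal sketch of the workflow with Swarm-mode services (the secret and service names are illustrative):

```shell
# Swarm mode is required for the docker secret command.
docker swarm init

# Create a secret from stdin.
printf 'S3cr3tPassw0rd' | docker secret create db_password -

# Grant the secret to a service: it is mounted inside the container
# as an in-memory file at /run/secrets/db_password, never baked into
# the image or exposed via environment variables.
docker service create --name db --secret db_password postgres
```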
Consider applying SELinux policies to your Docker containers.
Security-Enhanced Linux (SELinux) is a Linux kernel security module that allows the use of Mandatory Access Controls (MAC) on system resources.
As explained in the first article of this three-part series, SELinux must be first enabled at host-level, and then in the Docker daemon, before you can apply custom per-container policies.
Once SELinux is configured, new containers can be run with the --security-opt label=policy_name flag to apply container-specific SELinux policies.
Use the command below to show and review all the security options currently configured on the containers listed.
docker ps -a -q | xargs docker inspect -f '{{ .Id }} {{ .HostConfig.SecurityOpt }}'
AppArmor is a Linux kernel security module that can be used to restrict the capabilities of single running processes by applying security profiles to any process. The profile allows or disallows resources such as network access, raw socket access, and the permission to access files in the filesystem.
AppArmor can be applied in the same way to Docker containers by running them with the --security-opt apparmor=profile_name flag to apply container-specific profiles.
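A hedged sketch (the profile name and path are illustrative, and the profile must be loaded into the kernel before it can be referenced):

```shell
# Load (or replace) a custom AppArmor profile in the kernel.
sudo apparmor_parser -r -W /etc/apparmor.d/containers/docker-webapp

# Start the container confined by that profile.
docker run -d --security-opt apparmor=docker-webapp nginx
```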
Use the command below to show and review all the AppArmor profiles currently configured on the containers listed.
docker ps -a -q | xargs docker inspect -f '{{ .Id }} {{ .AppArmorProfile }}'
If you run your containers on the cloud, make sure you use network policies to restrict the containers’ access to the API metadata.
This misconfiguration is not Docker-specific, but it’s often overlooked when defining the secure architecture of cloud infrastructure running on microservices.
Cloud platforms commonly expose metadata services to instances. These are typically provided via REST APIs and furnish information about the instance itself, such as network, storage, and temporary security credentials. Since these APIs are accessible by containers running on an instance, the credentials they return can help attackers move laterally within your infrastructure and even to other cloud services under the same account.
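A common host-level countermeasure (a sketch: 169.254.169.254 is the conventional metadata address on the major cloud providers, and DOCKER-USER is the iptables chain Docker reserves for user-defined rules) is to block container traffic to the metadata endpoint:

```shell
# Drop all traffic from containers to the cloud metadata service.
sudo iptables -I DOCKER-USER -d 169.254.169.254 -j DROP
```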
The measures we have outlined above should be part of a robust strategy to manage and ensure runtime container security. This strategy should include steps to thoroughly inspect internal traffic and fully safeguard container environments and sensitive data across the entire application lifecycle. Furthermore, it should be scalable and enable teams to manage it like any container service without burdening them or causing additional overhead.
However, any strategy designed to construct runtime container security is useless if it cannot be executed properly, consistently, and reliably by you and your team. And the only way to ensure that this happens is by developing and nurturing a ‘security-first’ mentality from day one of the SDLC. When you and your developers are well informed about the real threats to Docker runtime container security, you are better positioned to guard against them. And if a threat does manage to creep into your Docker ecosystem, you and your team are equipped with the right tools and techniques to minimize the damage. But for this to happen, you need both real-world knowledge and hands-on practice.
SecureFlag’s “real world-focused” training programs empower developers to learn and implement modern secure coding practices through 100% hands-on exercises. They learn about real threats to runtime container security, and can practice how to remediate these threats in a familiar “Integrated Development Environment” (IDE). This IDE is created on-demand and can be easily accessed through a familiar web browser. The developer can select the code of each exercise, exploit vulnerabilities, and then remediate each security issue in real-time – all of which gives them enhanced control over both their learning process, and its outcomes.
Just as old-fashioned tools and practices are not enough to secure the entire application lifecycle, old-fashioned training with videos and multiple-choice questions is inadequate to prepare developers to fight current and emerging threats. SecureFlag addresses these lacunae very effectively. Our unique teaching methodology includes engaging and interactive lessons, plus logically-linked units called “learning paths”, so dev teams can learn critical secure coding practices in a hands-on manner. This improves their retention and gives them a thorough grounding to deal with issues once they get back to their work.
SecureFlag’s training can help protect your applications from attacks and exploits throughout the full “build-ship-run” lifecycle. To know more, reach out to us today.