The quest for minimalism

Earlier I wrote about 'elastic-prune', a simple cron job that lived in Kubernetes to clean up an Elasticsearch database. When I wrote it, I decided to give 'distroless' a whirl. Why distroless? Some will say it's because of size: they are chasing the last byte of free space (and thus launch speed). But I think that point is moot. The Ubuntu 18.04 image and the Alpine image are pretty close in size, and the last couple of MB don't matter.

'distroless' is all the code, none of the cruft. No /etc directory. The side effect is that it's small, but the rationale is that it's secure. It's (more) secure because there are no extra tools lying around for an intruder to 'live off the land'. This limits the 'blast radius'. If something wiggles its way into a 'distroless' container, it has fewer tools available to go onward and do more damage.

No shell, no awk, no netcat, no busybox. The only executable is yours. The Dockerfile below is what the build looks like. We use a normal 'fat old alpine' image as the build stage and run 'pip' in there. Then we create a new container, copying from the 'build' stage only the files we need. We are done.

Doing the below I ended up with a 'mere' 3726 files. Yup, that is the list, see if your favourite tool made the cut.

Going 'distroless' saved me 33MB (from 86.3MB to 53.3MB). Was this worth it?

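# Stage 1: a full Alpine-based Python image, used only to install dependencies.
FROM python:3-alpine AS build
LABEL maintainer="don@agilicus.com"

COPY . /elastic-prune
WORKDIR /elastic-prune

# Install the dependencies next to the script so one directory holds everything.
RUN pip install --target=./ -r requirements.txt

# Stage 2: the distroless runtime image; copy in only the application directory.
FROM gcr.io/distroless/python3
COPY --from=build /elastic-prune /elastic-prune
WORKDIR /elastic-prune
ENTRYPOINT ["/usr/bin/python3", "./elastic-prune.py"]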
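
To see for yourself which tools made the cut in an image, a small inventory script can help. What follows is a minimal sketch, not part of elastic-prune: the names `count_files` and `present_tools` and the `TOOLS` watch list are made up for illustration, and it assumes it is run with the Python interpreter inside the container, since there is no shell to fall back on.

```python
import os
import shutil

# Hypothetical watch list of "live off the land" tools; adjust to taste.
TOOLS = ("sh", "bash", "awk", "nc", "busybox", "curl", "wget")

# Pseudo-filesystems that would inflate the count when walking from "/".
SKIP = {"/proc", "/sys", "/dev"}


def count_files(root="/"):
    """Count the non-directory entries found under root."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune the skipped directories in place so os.walk does not descend.
        dirnames[:] = [d for d in dirnames
                       if os.path.join(dirpath, d) not in SKIP]
        total += len(filenames)
    return total


def present_tools(names=TOOLS, path=None):
    """Return the names that resolve to an executable on the search path.

    path=None means the PATH environment variable is used.
    """
    return [name for name in names if shutil.which(name, path=path)]
```

Calling `count_files("/")` gives a number to compare against the file count quoted above, and `present_tools()` returns whichever entries of the watch list are still reachable. An empty list only means nothing on that list was found on the search path, not that the image has nothing an intruder could use.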
5 comments on “The quest for minimalism”
  1. Dave D says:

    You’ve got python, and that’s a pretty good tool for living off the land. Is your app in python? Consider compiling it with Cython.

    • db says:

      we were just discussing this here. Perhaps the solution is to make a mod to Python to restrict what ‘entry’ point script it can run.
      The in-progress-proof-of-concept-hack I’m working on is for nodejs, same comment there. Currently I can get a remote shell exploit and use netcat. If all local tools vanished, it would be much tougher, but ultimately I would try and use nodejs.

      Usually the rootfs is mounted read-only, making it hard to inject a script to run, leaving one typing manually to the interpreter.

      So e.g. if python would refuse to run anything other than my main entry script, someone getting in would be sorely restricted in what they could do.

      The strategy is called Defence in Depth: it’s not about absolute security at each layer, it’s a successive fallback and delay strategy.

      • Kevin Nisbet says:

        > Usually the rootfs is mounted read-only, making it hard to inject a script to run, leaving one typing manually to the interpreter.

        I’m not sure I agree with this statement. I think the common case is for the root filesystem to be mounted read-write, and you need to go to extra lengths to get read-only behaviour (SecurityContext readOnlyRootFilesystem on Kubernetes). I would be fairly surprised to find a kube distribution whose default pod security policy required this.

        Medium term, I think the Kubernetes way to address this would be seccomp support, where you add a seccomp filter to block exec* syscalls (https://github.com/kubernetes/features/issues/135). I suspect this would have a similar effect, but it also works for processes that need a writable filesystem, or containers / software pulled in from other sources that aren’t built to the distroless standard.

        I’m curious now, though, whether this would work with a user execing into a container. I’m a believer in debugging in production, and while having a service restricted in what it can do is very desirable, as you point out, having an admin able to exec in and debug the process is also desirable. I believe SmartOS has a really cool behaviour around this: they mount all the system utilities under a /native mount point in any container. This allows someone to exec into the container and have a consistent way to find a shell, run common debug tasks, etc.

        I wonder if this would also be achievable, with say the service running as a user, and all the utils or /native with only the root execute permission set.

        Lots to think about.

        • db says:

          So to debug a container without tools, I wrote and contributed https://github.com/Agilicus/endoscope.
          It starts a new pod which *shares* the process + network namespace of the ‘debugee’.
          this allows you to have tools like ‘gdb’, ‘tcpdump’, ‘root’ without having them in the container under debug.
          it means that ‘distroless’ can be more fun 🙂

          On the ‘usually rootfs mounted r/o’, yes you need to enable it, but I’m assuming everyone does. Is this not the case? It’s really hard to imagine blocking exec since most containers are going to have an entrypoint.sh that sets some env vars and then fork+execs the main thing.

          • Kevin Nisbet says:

            As a very unscientific check, out of 234 stable helm charts (assuming all of them have a pod spec), only 4 seem to have `readOnlyRootFilesystem: true`. Even looking at a kubeadm-based install, only CoreDNS appears to have readOnlyRootFilesystem set.

            As for using an entrypoint script, yeah, I overlooked that. When I was looking through the Kubernetes issue tracker, there seemed to be a couple of references to embedding seccomp in the container, and dropping exec on the running daemon / exposed process only. But I doubt that’s a common practice.
