# boring (dev-)containers for ai

We live in exciting times. Over the last year, the way I write software has changed dramatically, roughly once a quarter. Claude Code, OpenAI's Codex, OpenCode, GitHub CLI, pi... There are so many tools (harnesses) right now that make it much easier and faster to write code.

Most of this progress comes from the harness that enabled LLMs to use your computer in a much more integrated way. Unfortunately, this means that the LLM can now use your computer in a much more integrated way. AI coding agents usually ask for your approval before performing an action, but to 'enjoy' full automation, you will have to pass a flag such as `--dangerously-skip-permissions` or `--yolo`, depending on the tool. So basically: give the AI a free pass.

## security is lagging behind

While most of the aforementioned tools provide some sort of rudimentary "guardrails", e.g. via sandboxing, these can still be circumvented relatively easily (and Claude's new Mythos escaped its sandbox while a researcher on the team was eating a sandwich in the park).

Anthropic now even enforces a denylist in auto mode by means of a separate LLM classifier. That's a bit like fighting fire with fire. Prompt injection is just not that easy to fix.

There are three 'layers' that are exposed to the agent:

- Kernel/OS
- Filesystem
- Network
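
To make the three layers a bit more concrete, here is a minimal Python sketch that maps each of them to restrictions a container runtime can impose. The mapping, the function name `sandbox_command`, the image, and the paths are my own assumptions for illustration; the flags themselves are regular `podman run` / `docker run` options.

```python
import shlex

# Hypothetical mapping from the three layers to Podman/Docker run flags.
LAYER_FLAGS = {
    "kernel": ["--cap-drop=ALL", "--security-opt=no-new-privileges"],
    "filesystem": ["--read-only", "--tmpfs=/tmp"],
    "network": ["--network=none"],
}


def sandbox_command(image, workdir, runtime="podman"):
    """Build a run command that restricts all three layers."""
    args = [runtime, "run", "--rm", "-it"]
    for layer in ("kernel", "filesystem", "network"):
        args.extend(LAYER_FLAGS[layer])
    # Only the project directory is writable from inside the container.
    args.append(f"--volume={workdir}:/workspace:rw")
    args.extend(["--workdir=/workspace", image])
    return args


if __name__ == "__main__":
    # Image and path are placeholders for illustration.
    cmd = sandbox_command("docker.io/library/python:3.12-slim", "/home/me/project")
    print(shlex.join(cmd))
```

Note that `--network=none` also cuts the agent off from its model API, so a real setup would more likely need some form of restricted egress instead. The sketch only shows which knob belongs to which layer.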

## sandboxing

We need something to secure the environment that AI agents run in. And it should not rely on the vibe-coded harness itself. We need something we can trust and verify. Not another AI that does it for us.

There is a great in-depth article by Luis Cardoso about the different ways of sandboxing AI (container, microVM, gVisor, WebAssembly, Seatbelt & Bubblewrap...), and I highly encourage you to read it. There are also several projects/companies that want to tighten whatever 'sandboxing' AI coding agents provide out of the box. A non-exhaustive list:

- Nono, which looks most promising to me
- NVIDIA's OpenShell, which does a lot and is fairly complicated (k3s just to sandbox my agent?!), but does not seem to provide gVisor- or Kata Containers-level isolation (check Luis' article above)
- Alibaba's OpenSandbox, also too complicated for my taste
- Docker's MicroVM Sandbox, which is nice but not open-source
- ... and many, many more

Some of these options look intriguing. But I would not trust most of them (yet) because they are either vibe-coded, come from non-established players, or are complicated and obscure. Ideally, I would like to rely on existing and established technology that is easy to use.

## devcontainers to the rescue

And the probably boring answer: DevContainers. The idea is to define your development environment (not only the software that you ship) in code. The tooling then builds a container from that definition, and you connect your IDE to it and start developing. The good thing? Onboarding a new colleague is now much easier. Sounds good?!

And because DevContainers support Podman, you can create more secure rootless containers. This is also possible with Docker/Moby, but much easier to achieve with Podman. Bonus: On macOS, you have to create a VM to be able to spin up GNU/Linux containers, which gives us another level of abstraction (and security!). Podman Desktop is a very nice frontend to all of that. DevContainers are the boring option. But more often than not, we should choose boring technology.
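
As a rough illustration of the rootless Podman setup described above, the following Python sketch generates a `devcontainer.json`. The helper name `rootless_podman_devcontainer`, the container name, the image, and the `vscode` user are assumptions for this example. The keys (`runArgs`, `containerUser`, `remoteUser`, `containerEnv`) are regular `devcontainer.json` properties, and `--userns=keep-id` is the Podman option commonly used so that files in the bind-mounted workspace keep sensible ownership.

```python
import json


def rootless_podman_devcontainer(name, image, user="vscode"):
    """Return a devcontainer.json dict aimed at rootless Podman."""
    return {
        "name": name,
        "image": image,
        "runArgs": [
            # Map the host user onto the container user (Podman only).
            "--userns=keep-id",
            # Assumed hardening choice: blocks setuid escalation, so
            # sudo inside the container will stop working.
            "--security-opt=no-new-privileges",
        ],
        "containerUser": user,
        "remoteUser": user,
        "containerEnv": {"HOME": f"/home/{user}"},
    }


if __name__ == "__main__":
    # Name and image are placeholders for illustration.
    config = rootless_podman_devcontainer(
        "agent-sandbox", "mcr.microsoft.com/devcontainers/base:ubuntu"
    )
    print(json.dumps(config, indent=2))
```

The output would go into `.devcontainer/devcontainer.json`. Your editor also has to be told to use Podman instead of Docker; in VS Code this is, as far as I know, the `dev.containers.dockerPath` setting.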

However, DevContainers do not solve everything. Because they have to communicate with your host and IDE, there are some attack vectors that you should be aware of. They also create some friction here and there due to the abstraction layer. Podman supports 'docker-in-docker' (or container-in-container) workflows, where you do the seemingly impossible and run a container inside a container, but I was only able to achieve this with the `--privileged` flag when running Podman in the VM. In theory, and in practice outside a VM, this works even with rootless Podman (and if you managed to set it up on macOS, please write to me).
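
Since settings like `--privileged` quietly undo much of the isolation, it can help to check a DevContainer definition before handing it to an agent. The following is a minimal sketch of such a check. The function name `audit_devcontainer` and the two lists of 'risky' patterns are my own assumptions and far from exhaustive.

```python
import json

# Hypothetical patterns worth a second look; not an exhaustive list.
RISKY_RUN_ARGS = ("--privileged", "--network=host", "--pid=host")
RISKY_MOUNT_HINTS = ("docker.sock", "podman.sock", ".ssh", ".aws")


def audit_devcontainer(config):
    """Return human-readable warnings for a parsed devcontainer.json."""
    warnings = []
    if config.get("privileged"):
        warnings.append("privileged: true gives the container near-host access")
    for arg in config.get("runArgs", []):
        if arg in RISKY_RUN_ARGS:
            warnings.append(f"runArgs contains {arg}")
    for mount in config.get("mounts", []):
        # Mounts may be strings ("source=...,target=...") or objects.
        text = mount if isinstance(mount, str) else json.dumps(mount)
        for hint in RISKY_MOUNT_HINTS:
            if hint in text:
                warnings.append(f"mount exposes {hint}: {text}")
    return warnings


if __name__ == "__main__":
    example = {
        "runArgs": ["--privileged"],
        "mounts": [
            "source=/var/run/docker.sock,target=/var/run/docker.sock,type=bind"
        ],
    }
    for line in audit_devcontainer(example):
        print(line)
```

The function takes an already parsed dict on purpose: real `devcontainer.json` files may contain comments, which Python's `json` module does not accept, so you would need to strip those first.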

## bonus: apple containers?

With macOS Tahoe, Apple introduced its own containerization method. And the cool thing: it uses Kata Containers. What do we gain from this? In contrast to Podman or Docker Desktop, it spins up a separate VM for every container instead of sharing one for all. This is great from a security perspective. And since Apple knows their hardware (which cannot be said of their corner radii), it will probably be more performant. So why not use it now? There is no official Compose support yet. For DevContainers alone, this is not needed. But personally, I would like to use something that works for all my container workflows.

Let's make AI boring again!