Building a Minimal Container Runtime

Nov 22, 2025 · 10 min read

go · containers · docker · systems-programming · devops

Have you ever wondered how Docker works under the hood? Containers have revolutionized software development, but their inner workings can seem mysterious. In this blog post, I'll walk you through a minimal container runtime I built called "smol-docker" that demonstrates the core concepts of containerization in just a few hundred lines of Go code.

What is smol-docker?

smol-docker is a lightweight container runtime I wrote in Go that can pull and run Docker images. I built it for educational purposes—both to deepen my own understanding of containers and to help other developers learn about container fundamentals without the complexity of a full-featured container engine like Docker or containerd.

Despite its simplicity, smol-docker implements the basic building blocks of containerization:

Image pulling and extraction
Filesystem isolation
Process isolation (on Linux)
Cross-platform support (Linux and macOS)

While I definitely don't recommend it for production use, it provides a clear view of how container runtimes work at their core.

How Containers Work: The Basics

Before diving into the code, let's understand what containers actually are. When I started this project, one of my first realizations was that containers are not lightweight VMs—they're isolated processes running on the host operating system. This distinction is crucial.

A container runtime needs to provide:

Filesystem isolation: Each container sees its own filesystem
Process isolation: Processes in one container can't see processes in another
Resource constraints: Limiting CPU, memory, etc. (I didn't implement this in smol-docker)
Networking: Isolated network stack (also not implemented in smol-docker)

Here's a visual representation of how containers achieve isolation:

Figure: Container isolation architecture showing how multiple containers share the same kernel while maintaining isolation.

Building `smol-docker`: The Core Components

Let's explore how I implemented these concepts:

1. Project Structure

I kept the project structure simple and flat:

smol-docker/
├── main.go         # Core container runtime implementation
├── linux.go        # Linux-specific container isolation features
├── pull.sh         # Script to extract Docker images
├── Makefile        # Build and distribution management
├── go.mod          # Go module definition
├── go.sum          # Go module checksums
└── README.md       # Project documentation

2. Pulling Docker Images

The first step in running a container is getting the image. Initially, I considered implementing the Docker Registry API from scratch, but I quickly realized that was overkill for an educational project. Instead, I created a simple shell script (pull.sh) that leverages Docker's CLI to pull and export images:

#!/bin/bash

set -e

defaultImage="hello-world"
folder="dumps"

image="${1:-$defaultImage}"
imageFolder="${folder}/${image}"

mkdir -p "${folder}"

if [ -d "$imageFolder" ]; then
    echo "Folder '$imageFolder' already exists"
else
    echo "Creating image-specific folder '$imageFolder'..."
    mkdir -p "${imageFolder}"
fi

container=$(docker create "$image")

docker export "$container" -o "./${imageFolder}/${image}.tar.gz" > /dev/null

# Strip the surrounding brackets and the trailing newline from the formatted command
docker inspect -f '{{.Config.Cmd}}' "$image:latest" | tr -d '[]\n' > "${imageFolder}/${image}-cmd"

docker rm "$container" > /dev/null

echo "Image content stored in ${imageFolder}/${image}.tar.gz"
echo "Command configuration stored in ${imageFolder}/${image}-cmd"
echo "Done."

This script:

Creates a temporary container from the specified image
Exports the container's filesystem as a tar archive
Extracts the default command from the image
Cleans up the temporary container
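
One limitation of storing the bracketed form is that `[nginx -g daemon off;]` no longer shows where one argument ends and the next begins. A minimal sketch of an alternative, assuming the script asked Docker for JSON with `docker inspect -f '{{json .Config.Cmd}}'` instead; `parseImageCmd` is a hypothetical helper and not part of smol-docker:

```go
package main

import (
    "encoding/json"
    "fmt"
    "strings"
)

// parseImageCmd decodes the JSON array printed by
//   docker inspect -f '{{json .Config.Cmd}}' IMAGE
// into an argv slice. Hypothetical helper, not part of smol-docker.
func parseImageCmd(raw string) ([]string, error) {
    raw = strings.TrimSpace(raw)
    if raw == "" || raw == "null" {
        // An image without a CMD reports null here.
        return nil, fmt.Errorf("image defines no default command")
    }
    var argv []string
    if err := json.Unmarshal([]byte(raw), &argv); err != nil {
        return nil, fmt.Errorf("decoding command %q: %w", raw, err)
    }
    if len(argv) == 0 {
        return nil, fmt.Errorf("image defines an empty command")
    }
    return argv, nil
}

func main() {
    argv, err := parseImageCmd(`["nginx","-g","daemon off;"]` + "\n")
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Printf("%d args, last=%q\n", len(argv), argv[len(argv)-1])
}
```

With the arguments kept separate, the runtime could pass `argv[0]` and `argv[1:]...` to `exec.Command` rather than treating the whole string as one executable name.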

3. Container Representation

In main.go, I defined a simple Container struct to represent our container:

type Container struct {
    Image   string
    Command string
    RootDir string
    TempDir string
}

This struct holds everything needed to run a container—the image name, the command to execute, where the dumps are stored, and where we'll extract the image.
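
To make the relationship between the fields concrete, here is a small sketch of how such a struct might be filled in. The constructor `newContainer` and the method `tarPath` are hypothetical names, and the sketch assumes `RootDir` is the per-image folder that `pull.sh` creates under `dumps`:

```go
package main

import (
    "fmt"
    "path/filepath"
)

type Container struct {
    Image   string
    Command string
    RootDir string
    TempDir string
}

// newContainer is a hypothetical constructor. It assumes RootDir is the
// per-image folder under the dumps directory, as laid out by pull.sh.
func newContainer(dumpsDir, image, command string) (*Container, error) {
    if image == "" {
        return nil, fmt.Errorf("image name required")
    }
    if command == "" {
        return nil, fmt.Errorf("no command given for image %s", image)
    }
    return &Container{
        Image:   image,
        Command: command,
        RootDir: filepath.Join(dumpsDir, image),
        // TempDir stays empty until the image is unpacked.
    }, nil
}

// tarPath returns where the exported filesystem archive is expected to be.
func (c *Container) tarPath() string {
    return filepath.Join(c.RootDir, c.Image+".tar.gz")
}

func main() {
    c, err := newContainer("dumps", "hello-world", "/hello")
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(c.tarPath())
}
```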

4. Setting Up the Container

The Setup method prepares the container environment:

func (c *Container) Setup() error {
    tempDir, err := c.createTempDir()
    if err != nil {
        return fmt.Errorf("failed to create temp directory: %w", err)
    }
    c.TempDir = tempDir

    tarPath := filepath.Join(c.RootDir, fmt.Sprintf("%s.tar.gz", c.Image))
    if err := c.unpackImage(tarPath); err != nil {
        return fmt.Errorf("failed to unpack image: %w", err)
    }

    return nil
}

This creates a temporary directory and extracts the container image into it. I learned that having a clean temporary workspace is essential for container isolation.
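
The `unpackImage` helper is not shown in this post. As a rough idea of what unpacking involves, here is a sketch built on the standard `archive/tar` package. It assumes a plain tar stream, which is what `docker export` writes despite the `.tar.gz` file name, and it skips hard links, device nodes, and ownership. The names `unpackTar`, `extract`, `writeFile`, and `unpackSample` are my own for illustration:

```go
package main

import (
    "archive/tar"
    "bytes"
    "fmt"
    "io"
    "os"
    "path/filepath"
    "strings"
)

// unpackTar extracts an uncompressed tar stream into dest.
func unpackTar(r io.Reader, dest string) error {
    return extract(tar.NewReader(r), filepath.Clean(dest))
}

func extract(tr *tar.Reader, root string) error {
    for {
        hdr, err := tr.Next()
        if err == io.EOF {
            return nil // end of archive
        }
        if err != nil {
            return err
        }
        target := filepath.Join(root, hdr.Name)
        // Refuse names such as "../../etc/passwd" that would land outside root.
        if target != root && !strings.HasPrefix(target, root+string(os.PathSeparator)) {
            return fmt.Errorf("entry %q escapes %s", hdr.Name, root)
        }
        if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
            return err
        }
        switch hdr.Typeflag {
        case tar.TypeDir:
            err = os.MkdirAll(target, 0o755)
        case tar.TypeSymlink:
            err = os.Symlink(hdr.Linkname, target)
        case tar.TypeReg:
            err = writeFile(target, tr, hdr.FileInfo().Mode().Perm())
        }
        if err != nil {
            return err
        }
    }
}

func writeFile(path string, src io.Reader, perm os.FileMode) error {
    f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, perm)
    if err != nil {
        return err
    }
    if _, err := io.Copy(f, src); err != nil {
        f.Close()
        return err
    }
    return f.Close()
}

// unpackSample builds a one-file archive in memory, unpacks it into a fresh
// temporary directory, and returns the unpacked file's contents.
func unpackSample() (string, error) {
    body := "hello from the image\n"
    var buf bytes.Buffer
    tw := tar.NewWriter(&buf)
    hdr := &tar.Header{Name: "etc/motd", Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(body))}
    if err := tw.WriteHeader(hdr); err != nil {
        return "", err
    }
    if _, err := tw.Write([]byte(body)); err != nil {
        return "", err
    }
    if err := tw.Close(); err != nil {
        return "", err
    }
    dir, err := os.MkdirTemp("", "smol-docker-")
    if err != nil {
        return "", err
    }
    defer os.RemoveAll(dir)
    if err := unpackTar(&buf, dir); err != nil {
        return "", err
    }
    got, err := os.ReadFile(filepath.Join(dir, "etc", "motd"))
    return string(got), err
}

func main() {
    content, err := unpackSample()
    if err != nil {
        fmt.Println("unpack failed:", err)
        return
    }
    fmt.Print(content)
}
```

The prefix check only guards against entry names that climb out of the target directory. A hardened unpacker would also have to consider symlinks inside the archive that point outside it.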

5. Running Containers on Linux

The real magic happens in the Linux-specific implementation (linux.go), which uses two key kernel features. This is where I had my biggest "aha!" moments about how containers actually work.

Filesystem Isolation with chroot

func (c *Container) changeRoot() error {
    oldRoot, err := os.Open("/")
    if err != nil {
        return fmt.Errorf("failed to open root: %w", err)
    }
    defer oldRoot.Close()

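    if err := syscall.Chdir(c.TempDir); err != nil {
        return fmt.Errorf("failed to change directory: %w", err)
    }

    if err := syscall.Chroot(c.TempDir); err != nil {
        return fmt.Errorf("failed to chroot: %w", err)
    }

    // Build the command only after chroot: exec.Command resolves a bare name
    // such as "sh" through PATH immediately, and that lookup should see the
    // container's filesystem rather than the host's.
    cmd := exec.Command(c.Command)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr

    if err := cmd.Run(); err != nil {
        log.Printf("Command failed: %v", err)
    }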

    // Return to the original root
    if err := syscall.Fchdir(int(oldRoot.Fd())); err != nil {
        return fmt.Errorf("failed to restore old root directory: %w", err)
    }

    if err := syscall.Chroot("."); err != nil {
        return fmt.Errorf("failed to restore old root: %w", err)
    }

    return nil
}

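I chose chroot for filesystem isolation because it's straightforward and doesn't require kernel namespace features. The chroot system call changes the root directory of the calling process, so absolute paths resolve inside the new root. That gives basic filesystem isolation, but it is not a hard security boundary: a privileged process that still holds a file descriptor from outside the new root can get back out, which is exactly the trick this function uses with oldRoot to restore the original root.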
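
Because the function above changes the root of the runtime process itself, it has to climb back out afterwards. One alternative, shown here only as a sketch and not as what smol-docker does, is to let `os/exec` apply the chroot in the child process alone through the `Chroot` field of `syscall.SysProcAttr`. The helper name `chrootedCommand` and the `/tmp/rootfs` path are made up for illustration, and running the command still needs root:

```go
package main

import (
    "fmt"
    "os"
    "os/exec"
    "syscall"
)

// chrootedCommand prepares a command whose root directory is changed in the
// child only, so the parent never has to restore its own root.
// path should be absolute and is interpreted inside the new root.
func chrootedCommand(root, path string, args ...string) *exec.Cmd {
    cmd := exec.Command(path, args...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.Dir = "/" // working directory, applied after the chroot
    cmd.SysProcAttr = &syscall.SysProcAttr{Chroot: root}
    return cmd
}

func main() {
    cmd := chrootedCommand("/tmp/rootfs", "/bin/sh", "-c", "ls /")
    fmt.Printf("would run %v with root %s\n", cmd.Args, cmd.SysProcAttr.Chroot)
    // cmd.Run() is deliberately not called here: it needs root privileges
    // and a populated root filesystem at /tmp/rootfs.
}
```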

Mount Namespace for /proc

func (c *Container) setupMounts() error {
    procPath := filepath.Join(c.TempDir, "proc")
    if err := os.MkdirAll(procPath, 0755); err != nil {
        return fmt.Errorf("failed to create proc directory: %w", err)
    }

    if err := syscall.Mount("proc", procPath, "proc", 0, ""); err != nil {
        return fmt.Errorf("failed to mount proc: %w", err)
    }

    return nil
}

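This mounts a fresh proc filesystem at /proc inside the container's root, which I discovered is essential for many Linux applications to function correctly: tools such as ps and top read /proc and fail without it. Note that syscall.Mount on its own does not create a namespace. The mount only stays private to the container if the process is already running in its own mount namespace.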
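
The code that puts the process into a new namespace is not shown in this post. As a minimal, Linux-only sketch of one way to request namespaces from Go, the child process can be started with clone flags set on `syscall.SysProcAttr`. The names `namespacedCommand` and `isolationFlags` are hypothetical, and starting such a command requires root:

```go
package main

import (
    "fmt"
    "os"
    "os/exec"
    "syscall"
)

// isolationFlags asks the kernel for a new mount namespace and a new PID
// namespace when the child is created.
const isolationFlags = syscall.CLONE_NEWNS | syscall.CLONE_NEWPID

// namespacedCommand prepares a command that starts in fresh namespaces.
// Inside the new PID namespace the child sees itself as PID 1.
func namespacedCommand(path string, args ...string) *exec.Cmd {
    cmd := exec.Command(path, args...)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: isolationFlags}
    return cmd
}

func main() {
    cmd := namespacedCommand("/bin/sh")
    fmt.Printf("clone flags: %#x\n", cmd.SysProcAttr.Cloneflags)
    // cmd.Run() is not called here because creating namespaces needs root.
}
```

Depending on the host's mount propagation settings, the root mount may also need to be remounted as private inside the new namespace before mounts made there stop showing up on the host.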

6. Cross-Platform Support (macOS)

One of the biggest challenges I faced was making smol-docker work on macOS. macOS has no Linux-style namespaces, and although chroot exists there, it requires root and still cannot make Linux binaries run on the macOS kernel, so I had to get creative with a simulated environment:

func (c *Container) runMacOS() error {
    // For macOS, we'll use a simpler approach that doesn't require root privileges
    // We'll just run the command in the extracted directory with modified environment
    
    // Make the command path relative to the extracted directory
    cmdPath := c.Command
    if filepath.IsAbs(cmdPath) {
        cmdPath = filepath.Join(c.TempDir, cmdPath[1:])
    } else {
        cmdPath = filepath.Join(c.TempDir, cmdPath)
    }
    
    // Check that the file exists (this does not verify the executable bit)
    if _, err := os.Stat(cmdPath); os.IsNotExist(err) {
        return fmt.Errorf("executable not found in container: %s", cmdPath)
    }
    
    // Try to execute the command
    cmd := exec.Command(cmdPath)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    
    // Set the working directory to the extracted image
    cmd.Dir = c.TempDir
    
    // Set environment variables to simulate container environment
    cmd.Env = append(os.Environ(),
        "CONTAINER=true",
        "HOME="+filepath.Join(c.TempDir, "home"),
        "PATH="+filepath.Join(c.TempDir, "bin")+":"+os.Getenv("PATH"),
    )

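    if err := cmd.Run(); err != nil {
        // ENOEXEC ("exec format error") arrives wrapped in an *os.PathError, so
        // match it with errors.Is (needs the "errors" and "syscall" imports)
        // instead of comparing the full error string.
        if errors.Is(err, syscall.ENOEXEC) {
            return fmt.Errorf("cannot run Linux binaries on macOS; use Docker Desktop or a Linux VM to run containers")
        }
        return fmt.Errorf("command failed: %w", err)
    }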

    return nil
}

This macOS implementation doesn't provide true isolation, but it taught me a lot about what real isolation requires. It simulates a container-like environment by:

Setting the working directory to the extracted image
Configuring environment variables to mimic a container
Detecting when users try to run Linux binaries on macOS

The last point was important—I wanted users to get a clear error message rather than cryptic failures.
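
Another way to catch this case, sketched here as an assumption rather than something smol-docker does, is to look at the file's magic bytes before trying to execute it. The function names `binaryFormat` and `sniff` are made up for this example:

```go
package main

import (
    "bytes"
    "fmt"
    "io"
    "os"
)

// binaryFormat classifies an executable by its leading magic bytes.
func binaryFormat(header []byte) string {
    switch {
    case bytes.HasPrefix(header, []byte("\x7fELF")):
        return "elf" // the format used by Linux binaries
    case bytes.HasPrefix(header, []byte{0xcf, 0xfa, 0xed, 0xfe}):
        return "mach-o" // 64-bit little-endian Mach-O, native on macOS
    case bytes.HasPrefix(header, []byte("#!")):
        return "script"
    default:
        return "unknown"
    }
}

// sniff reads the first four bytes of the file at path and classifies them.
func sniff(path string) (string, error) {
    f, err := os.Open(path)
    if err != nil {
        return "", err
    }
    defer f.Close()
    header := make([]byte, 4)
    n, err := io.ReadFull(f, header)
    if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
        return "", err
    }
    return binaryFormat(header[:n]), nil
}

func main() {
    // Inspect the running program itself as a convenient sample file.
    self, err := os.Executable()
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    format, err := sniff(self)
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(self, "is", format)
}
```

A check like this could run right after the existence check, so the user gets the "Linux binary" message without the runtime attempting to start the process at all.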

Using `smol-docker`

Let me show you how to use the container runtime I built:

Building the Project

First, clone and build the project:

git clone https://github.com/smol-go/smol-docker.git
cd smol-docker
make build

Pulling an Image

To download a Docker image:

./smol-docker pull nginx

This will:

Create a container from the nginx image
Export its filesystem to ./dumps/nginx/nginx.tar.gz
Extract the default command to ./dumps/nginx/nginx-cmd

Running a Container

To run the container:

./smol-docker run nginx

Or with a custom command:

./smol-docker run nginx /bin/bash

How It All Works Together

Let me trace through what happens when you run a container with smol-docker:

Argument Parsing: The parseArgs() function extracts the subcommand (run or pull), the image name, and an optional command to run inside the container.

func parseArgs() (string, string, string, error) {
    if len(os.Args) < 2 {
        return "", "", "", fmt.Errorf("insufficient arguments")
    }

    command := os.Args[1]

    if command != "run" && command != "pull" {
        return "", "", "", fmt.Errorf("invalid command: %s", command)
    }

    if len(os.Args) < 3 {
        return "", "", "", fmt.Errorf("image name required")
    }

    image := os.Args[2]
    var cmd string

    if command == "run" {
        if len(os.Args) > 3 {
            cmd = os.Args[3]
        } else {
            cmdFile := filepath.Join(dumpsDir, image, fmt.Sprintf("%s-cmd", image))
            buf, err := os.ReadFile(cmdFile)
            if err != nil {
                return "", "", "", fmt.Errorf("failed to read command file: %w", err)
            }
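            // Drop stray whitespace such as a trailing newline (needs the "strings" import)
            cmd = strings.TrimSpace(string(buf))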
        }
    }

    return command, image, cmd, nil
}

Image Pulling: The pullImage() function calls the shell script to download and extract the image.
Container Setup: The Setup() method creates a temporary directory and unpacks the image.
Platform Detection: The code checks the operating system and uses the appropriate isolation method.
Container Execution: On Linux, it uses chroot and mount namespaces. On macOS, it simulates a container environment.
Cleanup: After execution, the temporary directory is removed.
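
The platform detection step can be pictured with a small sketch based on `runtime.GOOS`. How smol-docker actually selects the implementation is not shown in this post, so `runner` and `selectRunner` are hypothetical names:

```go
package main

import (
    "fmt"
    "runtime"
)

// runner is whatever executes the container's command on one platform.
type runner func() error

// selectRunner picks the implementation that matches the operating system.
func selectRunner(goos string, linux, darwin runner) (runner, error) {
    switch goos {
    case "linux":
        return linux, nil
    case "darwin":
        return darwin, nil
    default:
        return nil, fmt.Errorf("unsupported platform: %s", goos)
    }
}

func main() {
    run, err := selectRunner(runtime.GOOS,
        func() error { fmt.Println("linux: chroot and /proc mount"); return nil },
        func() error { fmt.Println("macOS: simulated environment"); return nil },
    )
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    if err := run(); err != nil {
        fmt.Println("run failed:", err)
    }
}
```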

What I Learned: Security in Production Containers

While building smol-docker, I gained a deep appreciation for what production container runtimes actually do for security. My implementation only scratches the surface with basic chroot and mount namespace isolation.

In production container runtimes, multiple isolation mechanisms work together:

Namespaces: Isolate processes, networks, mounts, users, etc. (I only used mount namespaces)
Capabilities: Limit what privileged operations a process can perform (not implemented)
Seccomp: Filter system calls to reduce attack surface (not implemented)
AppArmor/SELinux: Mandatory access control systems (not implemented)

This multi-layered approach is what makes production containers hard to break out of, while smol-docker is emphatically not suitable for anything beyond learning and experimentation.

What I Left Out (And Why)

I deliberately kept smol-docker minimal to focus on the core concepts. Here's what I didn't implement and why:

Network Isolation: Adding network namespaces would have doubled the complexity. For understanding the basics, filesystem and process isolation are sufficient.

Resource Limits (cgroups): While cgroups are crucial for production, they're more about resource management than isolation. I wanted to focus on the "what is a container" question first.

User Namespaces: Running containers as non-root is important for security, but it adds another layer of complexity that would obscure the fundamentals.

Image Layers: Docker's layered filesystem with overlayfs is clever, but understanding a single unpacked filesystem is easier for learning purposes.

Multi-process Support: Real containers often run several processes, sometimes under a small init process that reaps zombies. I kept it to one process to keep the code simple.
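
For anyone curious about the resource-limit part, cgroup v2 exposes its controls as plain files, so a first experiment can be quite small. The sketch below assumes cgroup v2 (normally mounted at /sys/fs/cgroup), root privileges, and a memory controller that is enabled for the parent cgroup. All the names in it are hypothetical, and `main` writes into a scratch directory so it can run unprivileged:

```go
package main

import (
    "fmt"
    "os"
    "path/filepath"
    "strconv"
)

// cgroupWrite is one "write this value to this control file" step.
type cgroupWrite struct {
    Path  string // relative to the cgroup v2 mount point
    Value string
}

// memoryLimitWrites lists the writes that cap a cgroup's memory and then
// move a process into it.
func memoryLimitWrites(name string, limitBytes int64, pid int) []cgroupWrite {
    return []cgroupWrite{
        // memory.max is the hard memory limit in bytes.
        {filepath.Join(name, "memory.max"), strconv.FormatInt(limitBytes, 10)},
        // Writing a PID to cgroup.procs moves that process into the cgroup.
        {filepath.Join(name, "cgroup.procs"), strconv.Itoa(pid)},
    }
}

// apply performs the writes below root, creating the cgroup directory first.
func apply(root string, writes []cgroupWrite) error {
    for _, w := range writes {
        full := filepath.Join(root, w.Path)
        if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
            return err
        }
        if err := os.WriteFile(full, []byte(w.Value), 0o644); err != nil {
            return fmt.Errorf("writing %s: %w", full, err)
        }
    }
    return nil
}

func main() {
    // A scratch directory stands in for /sys/fs/cgroup in this demo.
    root, err := os.MkdirTemp("", "fake-cgroup-")
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    defer os.RemoveAll(root)

    writes := memoryLimitWrites("smol-docker", 64<<20, os.Getpid())
    if err := apply(root, writes); err != nil {
        fmt.Println("error:", err)
        return
    }
    got, _ := os.ReadFile(filepath.Join(root, "smol-docker", "memory.max"))
    fmt.Printf("memory.max = %s\n", got)
}
```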

What's Next

If I continue working on this project, here are some features I'm considering:

Network namespaces for true network isolation
Basic cgroups to show resource limiting
User namespaces for better security
A simple layer system using overlayfs

But honestly, the current version achieves its goal: helping people (including myself) understand how containers work at a fundamental level.

If you fork this project, these would be great features to add as learning exercises. Each one teaches something different about Linux kernel features and container isolation.

Conclusion

Building smol-docker helped me demystify how containers work. At their core, containers are just processes with enhanced isolation provided by Linux kernel features.

My key takeaways from this project:

Containers are not VMs: They're isolated processes sharing the host kernel—this seems obvious now, but implementing it made the difference crystal clear
Isolation is multi-layered: Filesystem, process, network, and resource isolation work together in production systems
Platform-specific: Full container isolation relies heavily on Linux kernel features. The macOS version taught me just how much we take for granted on Linux
Simplicity is powerful: Even a basic implementation can demonstrate core container concepts effectively

By understanding these fundamentals, I now have a much deeper appreciation for what container technologies like Docker, containerd, and Kubernetes are doing behind the scenes.

If you're interested in exploring container internals, I encourage you to fork smol-docker and experiment with it. Try adding some of the missing features, or dive into the source code of more complete container runtimes like runc or crun.

Happy containerizing!

Want to explore the code? The full source code is on GitHub at https://github.com/smol-go/smol-docker.