Docker Storage

Overview

  • Don’t store app data in the container’s writable layer. Use named volumes for persistent data, bind mounts for local dev, and tmpfs for sensitive runtime files.
  • Prefer --mount syntax over -v for clarity.
  • On Linux, Docker uses OverlayFS (overlay2) to assemble image layers and the writable container layer via copy-on-write.
  • Storage hygiene: check usage with docker system df, audit volumes with docker volume ls and docker volume inspect, and prune deliberately.

Storage Building Blocks

Union Filesystems & Copy-on-Write

Docker images are stacks of read-only layers. On docker run, Docker adds a thin writable layer on top. Reads walk down the stack, and the first match wins. The first write to a file that lives in a lower layer copies it up into the writable layer (copy-on-write, CoW), and the change is made to that copy.

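  • Why it matters: Data you write inside the container without a mount lives in that writable layer. It survives a stop and restart, but it is deleted when the container is removed (docker rm, or --rm on exit), so re-creating the container starts from a clean image.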
  • Where it’s implemented: On Linux this is a storage driver (commonly overlay2). Others exist (btrfs, zfs, windowsfilter on Windows), but overlay2 is the default for modern Linux.
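
The lookup and copy-up rules can be mimicked with plain directories. The sketch below is a toy model only, not how overlay2 is implemented; the directory names (layer1, layer2, upper) and the helper functions union_read and union_append are hypothetical and exist just for this illustration.

```shell
#!/bin/sh
# Toy model of union lookup and copy-on-write using ordinary directories.
# Assumption: three directories stand in for two image layers and the
# container's writable layer. Real storage drivers work in the kernel.
set -eu

root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/layer1" "$root/layer2" "$root/upper"

# Two read-only "image layers"; layer2 sits above layer1.
echo "base config"  > "$root/layer1/app.conf"
echo "base motd"    > "$root/layer1/motd"
echo "patched motd" > "$root/layer2/motd"

# Read: walk from the top layer down; the first match wins.
union_read() {
  for layer in upper layer2 layer1; do
    if [ -f "$root/$layer/$1" ]; then
      cat "$root/$layer/$1"
      return 0
    fi
  done
  return 1
}

# Write: copy the file up into the writable layer if it is not there yet,
# then modify the copy. The lower layers are never touched.
union_append() {
  if [ ! -f "$root/upper/$1" ]; then
    for layer in layer2 layer1; do
      if [ -f "$root/$layer/$1" ]; then
        cp "$root/$layer/$1" "$root/upper/$1"
        break
      fi
    done
  fi
  echo "$2" >> "$root/upper/$1"
}

union_read motd                         # prints "patched motd"
union_append app.conf "container edit"
union_read app.conf                     # prints "base config" then "container edit"
cat "$root/layer1/app.conf"             # still prints "base config"
```

Deleting the upper directory in this model corresponds to removing the container: the edit disappears and the layers underneath are unchanged.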

Check your driver:

docker info | grep -i 'Storage Driver'

Mounts vs. the Writable Layer

  • Volumes (type=volume): Managed by Docker, live under /var/lib/docker/volumes/... (Linux). Decoupled from the container lifecycle; great for databases and persistent state.
  • Bind mounts (type=bind): Mount a host path into the container. Perfect for local dev (live code edits). You manage the host path.
  • tmpfs (type=tmpfs): In-memory mount for ephemeral, sensitive runtime data (secrets, caches).

Choosing the Right Mount

| Use case | Recommended | Why |
|---|---|---|
| Databases / app state | Named volume | Durable, portable, easy to back up, not tied to host path layout |
| Local dev (edit on host) | Bind mount | Instant code reload and editor support |
| Secrets / short-lived files | tmpfs | Never written to disk, faster |
| Multi-host / external storage | Volume + plugin driver | NFS/NetApp/EBS/Ceph/etc. via driver |

--mount (explicit fields, easier to read):

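# The mysql image refuses to start without a root password setting
docker run \
  --mount type=volume,src=mydata,target=/var/lib/mysql \
  -e MYSQL_ROOT_PASSWORD=example \
  --name db -d mysql:8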

-v|--volume (shorthand):

docker run -v mydata:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=example --name db -d mysql:8

Bind mount:

# Quote $PWD so paths containing spaces do not split the argument
docker run \
  --mount type=bind,src="$PWD/app",target=/usr/src/app \
  -p 3000:3000 node:20 node /usr/src/app/server.js

tmpfs:

docker run \
  --mount type=tmpfs,target=/run/secrets,tmpfs-size=64m \
  alpine:3.20 sleep 3600

Volumes: Create, Inspect, Label, Clean Up

Create & label:

docker volume create --label project=shop --label env=prod mydata

Inspect:

docker volume inspect mydata

List (with filters):

docker volume ls --filter label=project=shop

Remove unused:

docker volume prune        # CAREFUL: on Docker 23.0+ removes unused anonymous volumes
docker volume prune --all  # CAREFUL: also removes unused named volumes

Back up a named volume:

# Creates backup.tar.gz of the 'mydata' volume
docker run --rm \
  --mount type=volume,src=mydata,target=/data,readonly \
  --mount type=bind,src="$PWD",target=/backup \
  alpine:3.20 sh -c "tar -czf /backup/backup.tar.gz -C /data ."

Restore:

docker run --rm \
  --mount type=volume,src=mydata,target=/data \
  --mount type=bind,src="$PWD",target=/backup \
  alpine:3.20 sh -c "tar -xzf /backup/backup.tar.gz -C /data"
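
Because the backup command archives "." relative to the data directory (-C /data .), the archive holds relative paths and can be extracted into any target. A minimal sketch of that round trip, using temporary directories as stand-ins for the volume and the restore target (the paths and file names here are made up for the demonstration, and no Docker is involved):

```shell
#!/bin/sh
# Round-trip check for the tar pattern: archive a directory's contents,
# list the archive, extract elsewhere, and compare.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/data/sub" "$work/restored"
echo "row 1" > "$work/data/sub/table.txt"

# Same archive shape as the backup command: paths are relative to the data dir.
tar -czf "$work/backup.tar.gz" -C "$work/data" .

# List the entries before trusting the backup.
tar -tzf "$work/backup.tar.gz"

# Restore into an empty directory and compare it with the source.
tar -xzf "$work/backup.tar.gz" -C "$work/restored"
diff -r "$work/data" "$work/restored" && echo "round trip OK"
```

Listing the archive with tar -tzf before a restore is a cheap way to catch an empty or truncated backup.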

Bind Mounts: Power and Pitfalls

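Dev loop example (host edits are visible in the container immediately; go run does not reload on its own, so re-run it or add a file watcher):

docker run --rm -it \
  --mount type=bind,src="$PWD/api",target=/app \
  -p 8080:8080 golang:1.22 \
  sh -c "cd /app && go run ."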

Read-only bind (protect host files):

docker run --rm \
  --mount type=bind,src="$PWD/config",target=/etc/myapp,readonly \
  myimg:latest

SELinux (Fedora/RHEL/CentOS) relabeling:

  • :Z → private label for a single container
  • :z → shared label for multiple containers accessing the same path

With -v shorthand:

docker run -v "$PWD/data":/data:Z myimg:latest

Mount propagation (advanced, for nested mounts):

docker run --privileged \
  --mount type=bind,src=/mnt,target=/mnt,bind-propagation=rshared \
  myimg:latest

Common issues

  • Permissions/UID mismatch: The container’s process UID must own or be allowed to access the host path. Fix with chown, setfacl, --user, or user-namespaces.
  • SELinux denials: Use :z/:Z or adjust policy.
  • Docker Desktop performance (macOS/Windows): Bind mounts cross a VM boundary, so file sharing is slower than on native Linux. Prefer volumes for heavy I/O, use VirtioFS file sharing where it is available (it is generally faster than the older gRPC FUSE backend), and avoid chatty file patterns.
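
A quick way to spot the UID mismatch described above is to compare the owner of the host directory with the UID the container process will run as. The sketch below assumes Linux with GNU or BusyBox stat (on macOS the equivalent is stat -f '%u'); the temporary directory stands in for your real bind-mount source, and the variable names are illustrative.

```shell
#!/bin/sh
# Compare the owner of a host directory with the UID a container would run as.
set -eu

host_dir=$(mktemp -d)   # stand-in for the directory you plan to bind-mount
trap 'rm -rf "$host_dir"' EXIT
run_uid=$(id -u)        # UID you intend to pass to the container via --user

owner_uid=$(stat -c '%u' "$host_dir")

if [ "$owner_uid" = "$run_uid" ]; then
  verdict="match"
else
  verdict="mismatch"
fi
echo "$host_dir is owned by UID $owner_uid; container UID $run_uid: $verdict"

# Possible follow-up (requires Docker; myimg:latest is the placeholder image
# used elsewhere on this page):
# docker run --rm --user "$(id -u):$(id -g)" \
#   --mount type=bind,src="$host_dir",target=/data myimg:latest
```

On a mismatch, the options listed above apply: chown the host path, grant access with setfacl, or run the container with a matching --user.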

Docker Compose Examples

Named volume (database)

# compose.yml
services:
  db:
    image: mysql:8
    environment:
      MYSQL_ROOT_PASSWORD: example
    volumes:
      - dbdata:/var/lib/mysql
volumes:
  dbdata:
    driver: local

Bind mount (dev app)

services:
  web:
    image: node:20
    working_dir: /usr/src/app
    command: ["npm", "run", "dev"]
    ports:
      - "3000:3000"
    volumes:
      - type: bind
        source: ./app
        target: /usr/src/app

tmpfs for sensitive runtime data

services:
  worker:
    image: alpine:3.20
    command: ["sh", "-c", "do-work"]
    tmpfs:
      - /run/secrets:rw,noexec,nosuid,size=64m

External/remote volume via driver (example: NFS with local driver)

volumes:
  shared:
    driver: local
    driver_opts:
      type: "nfs"
      o: "addr=10.0.0.10,nolock,soft,rw"
      device: ":/export/share"

Tip: For cloud/enterprise backends (EBS, NetApp, Ceph, etc.), use the appropriate volume plugin driver and follow its options.

Disk Usage, Auditing, and Cleanup

High-level view:

docker system df
docker system df -v # per-image/container breakdown

Find large volumes (Linux):

sudo du -sh /var/lib/docker/volumes/* 2>/dev/null | sort -h

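Prune safely (these commands have no dry-run mode, so check what is unused before you run them):

docker system prune        # removes stopped containers, unused networks, dangling images, unused build cache
docker system prune -a     # also removes unused images (not just dangling)
docker volume prune        # on Docker 23.0+ removes unused anonymous volumes
docker volume prune --all  # also removes unused named volumes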
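
Since prune acts immediately, it can help to list the candidates first. The function below is a rough preview under stated assumptions: preview_prune is a hypothetical helper name, it only covers exited containers (prune also removes other stopped states), and it simply prints a notice when the docker CLI is not installed.

```shell
#!/bin/sh
# Preview objects that the prune commands would likely remove.
# preview_prune is an illustrative helper, not a Docker command.
preview_prune() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker CLI not found; nothing to preview"
    return 0
  fi
  echo "Volumes not referenced by any container:"
  docker volume ls --filter dangling=true
  echo "Exited containers:"
  docker ps -a --filter status=exited
  echo "Dangling images:"
  docker images --filter dangling=true
}

preview_prune || echo "preview failed (is the Docker daemon running?)"
```

Anything in the volume list that you still need should be attached to a container or backed up before pruning.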

Performance & Best Practices

  • Prefer named volumes in production. They’re lifecycle-safe and don’t depend on a fragile host path.
  • Keep hot I/O off bind mounts on macOS/Windows. Use volumes for DBs, caches, build artifacts.
  • Make mounts read-only wherever possible: readonly (or :ro) reduces risk.
  • Separate code and data. App code bind-mounted is fine; persistent state belongs in volumes.
  • Back up volumes with the tar pattern above, or snapshot the underlying storage (LVM/ZFS/cloud disk).
  • Label volumes (e.g., --label project=...) for searchability and cleanup.
  • Avoid mounting the Docker socket (/var/run/docker.sock) into containers unless the container truly needs to control the host's Docker daemon; it effectively grants root-equivalent power on the host.
  • Use tmpfs for secrets and session files: fast and not persisted to disk.
  • Watch inode usage for build systems that create many tiny files. df -i on the host can diagnose inode exhaustion.
  • Know your driver: overlay2 is robust and standard on Linux; if you’re on btrfs/zfs, leverage snapshotting/quotas thoughtfully.
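
For the inode point above, df -i gives the filesystem-wide picture, and counting directory entries helps find which tree is consuming inodes. A small sketch, where count_entries is a hypothetical helper and the temporary tree stands in for a real build or cache directory:

```shell
#!/bin/sh
# Count filesystem entries (files and directories) under a path as a rough
# proxy for inode consumption, staying on one filesystem (-xdev).
set -eu

count_entries() {
  find "$1" -xdev | wc -l | tr -d ' '
}

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/cache"
touch "$demo/cache/a" "$demo/cache/b" "$demo/cache/c"

count_entries "$demo"     # prints 5: the root, cache/, and three files

# Filesystem-wide inode usage for the same path (output format varies by OS
# and filesystem, so it is shown as-is rather than parsed).
df -i "$demo" || true
```

Pointing count_entries at subdirectories of /var/lib/docker (with sudo) would show where the entries accumulate.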

Troubleshooting Quick Hits

  • “Permission denied” on bind mount: Check UID/GID, --user, ACLs, and SELinux labels (:z/:Z).
  • DB won’t start after container recreation: Data lived in the container’s writable layer—move it to a named volume mounted at the DB data directory.
  • Slow file access on Docker Desktop: Switch heavy I/O paths to volumes; minimize watchers and file churn; consider caching strategies.
  • Can’t see volume size in inspect: Use du from a throwaway container mounted to the volume or check the host path under /var/lib/docker/volumes/<id>/_data.

Measure a volume’s size from a container:

docker run --rm \
  --mount type=volume,src=mydata,target=/data,readonly \
  alpine:3.20 sh -c "du -sh /data"

Security Notes

  • Least privilege mounts: readonly, avoid mounting host sensitive paths, and don’t expose host sockets unless necessary.
  • User namespaces: dockerd --userns-remap can map container root to an unprivileged host UID; ensure volume paths are owned/ACL’d appropriately.
  • SELinux/AppArmor: Keep them enabled; use SELinux :z/:Z or AppArmor profiles as needed.

Appendix: Quick Reference

Create named volume

docker volume create mydata

Run with volume

docker run --mount type=volume,src=mydata,target=/var/lib/myapp myimg

Run with bind mount

docker run --mount type=bind,src=/opt/config,target=/etc/myapp,readonly myimg

Run with tmpfs

docker run --mount type=tmpfs,target=/run/mytmp,tmpfs-size=32m myimg

Inspect/clean

docker volume ls
docker volume inspect mydata
docker volume rm mydata     # only if unused
docker volume prune         # on Docker 23.0+ removes unused anonymous volumes
docker volume prune --all   # also removes unused named volumes