Docker has revolutionized how developers and operations teams build, ship, and run applications using containers. But what exactly makes Docker tick? In this comprehensive guide, we'll dive into Docker's architecture and components to see what makes it so powerful.
Why Docker? The Benefits
Before we look at what Docker is and how it works, let's understand why Docker became so popular.
Faster application deployment – Containers allow you to package an application with all its dependencies into a standardized unit for software development. This brings portability and consistency across environments.
More efficient resource utilization – Containers share the host kernel and don't need a full OS, allowing higher server density than VMs. This table summarizes the differences:
| | Containers | Virtual Machines |
| --- | --- | --- |
| Startup time | Seconds | Minutes |
| Size | MBs | GBs |
| Performance | Near native | Hypervisor overhead |
| Resource isolation | Process level | Hardware virtualized |
Dev and IT alignment – Docker bridges the gap by letting developers build containers locally using the same tools used in production. Ops can deploy the containers consistently across environments.
According to Datadog's 2022 report, over 65% of companies now use Docker in production. The growth has been phenomenal and adoption continues to rise.
Docker Architecture
Docker follows a client-server architecture with a central Docker daemon that does the heavy lifting. The Docker client talks to the daemon using a REST API.
Docker Engine
The Docker Engine is the layer on the host machine that runs and manages Docker objects. It consists of:
Docker Daemon – Background service (dockerd) that manages images, containers, networks, and volumes.
REST API – An API for interacting with the daemon to create and manage Docker objects.
Command Line Interface (CLI) – The interface users employ to communicate with the daemon.
According to Solomon Hykes, Docker founder, their goal was to build a "portable engine" to provide an open API and runtime for anyone to build and run containers. The Docker Engine enabled exactly this on any infrastructure.
Docker Client
The command-line tool that talks to the daemon is called docker. This is the primary interface most users interact with to build, run, and manage Dockerized applications.
You can use the native Docker client or API clients built in many languages like Python, Java, JavaScript etc. to communicate with the daemon programmatically.
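As a minimal sketch of programmatic access, assuming the docker-py SDK (`pip install docker`) and a running local daemon, a client can talk to the daemon like this:

```python
import docker  # the docker-py SDK, a client for the Docker Engine API

# Connect to the local daemon using environment defaults
# (typically the Unix socket /var/run/docker.sock).
client = docker.from_env()

# Run a throwaway container and capture its stdout.
output = client.containers.run("alpine", ["echo", "hello"], remove=True)
print(output.decode().strip())

# List images known to the daemon.
for image in client.images.list():
    print(image.tags)
```

Under the hood, each of these calls is an HTTP request to the daemon's REST API.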
Docker Registries
A registry stores and distributes Docker images. For example, the default registry, Docker Hub, contains over 8 million public images you can use off the shelf or build upon.
Users can also host private registries behind the firewall to share team images. Popular registry options include Docker Hub, AWS ECR, GCR, Quay, GitLab registry etc.
When you run `docker pull`, the image is fetched from the configured registry; `docker push` uploads an image to the target registry.
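For example (the image tag and registry hostname here are placeholders):

```shell
# Pull an image from the default registry (Docker Hub)
docker pull nginx:1.25

# Re-tag it for a private registry
docker tag nginx:1.25 registry.example.com/team/nginx:1.25

# Push it to that registry (requires docker login first)
docker push registry.example.com/team/nginx:1.25
```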
Docker Objects
Docker exposes these main resources that you can build and manage.
Images
Docker images are read-only templates used for container creation. They provide a filesystem and contain everything needed to run applications – app code, libraries, dependencies, configs.
Images are built from a Dockerfile, which specifies instructions to build the image. Once built, images are stored in a registry and used to launch containers.
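A minimal Dockerfile for a hypothetical Python app might look like this (the file names are illustrative):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Copy the dependency list first so this layer is cached across rebuilds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY . .
# Default command when a container starts
CMD ["python", "app.py"]
```

You would build it with `docker build -t myapp .` and launch a container with `docker run myapp`.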
Some key aspects of images:
- Built on layered filesystem using UnionFS – this allows sharing common files between images
- Layers are stacked with copy-on-write – new layers just add diffs rather than a complete copy
- Immutable and portable – enables consistency across envs
- Reusable components to accelerate builds
By distributing images rather than full VM images, Docker enables efficient scaling and sharing of artifacts.
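The layered, copy-on-write model can be sketched with Python's `collections.ChainMap`, where each layer records only its own files and lookups fall through to lower layers. This is a simplified analogy, not Docker's actual storage driver:

```python
from collections import ChainMap

# Each dict is one read-only image layer: path -> file contents.
base_layer = {"/bin/sh": "shell binary", "/etc/os-release": "alpine"}
app_layer = {"/app/main.py": "print('hi')"}

# A container adds an empty writable layer on top of the image layers.
writable = {}
container_fs = ChainMap(writable, app_layer, base_layer)

# Reads fall through to the lower (image) layers...
assert container_fs["/etc/os-release"] == "alpine"

# ...while writes land only in the container's writable layer,
# leaving the shared image layers untouched.
container_fs["/tmp/scratch"] = "new data"
assert "/tmp/scratch" in writable
assert "/tmp/scratch" not in base_layer
```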
Containers
Containers are runtime instances of Docker images. The daemon uses an image's read-only layers and adds a writable layer on top to create a container. All user application code runs inside containers.
Containers are isolated, portable execution environments. You can run, start, stop and delete containers using the CLI and API.
Some features:
- Each container runs as an isolated process on the host OS
- Resource constraints can be applied to limit container access
- Network interfaces provide access to internal/external networks
- Volumes can be attached and shared between containers
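For instance, resource constraints are plain flags on `docker run`:

```shell
# Cap memory at 256 MB and CPU at half a core
docker run --rm --memory=256m --cpus=0.5 alpine sleep 5

# Show a one-shot snapshot of resource usage for running containers
docker stats --no-stream
```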
According to Arun Gupta, containers utilize OS-level virtualization by leveraging technologies like namespaces and cgroups rather than hardware virtualization. This provides portability and performance.
Volumes
Docker volumes provide data persistence outside the lifecycle of containers. They are fully managed by Docker.
Benefits include:
- Avoid storing data in container writable layer
- Persist data after containers are deleted
- Share data between containers
- Portable across environments
Volume drivers allow storing volumes on cloud storage systems. Bind mounts can also mount files/dirs on host into containers.
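A typical workflow looks like this (the volume and path names are illustrative):

```shell
# Create a named volume managed by Docker
docker volume create app-data

# Mount the volume into a container; the data survives container deletion
docker run --rm -v app-data:/var/lib/data alpine sh -c 'echo hello > /var/lib/data/f'
docker run --rm -v app-data:/var/lib/data alpine cat /var/lib/data/f

# Bind-mount a host directory instead of a managed volume
docker run --rm -v "$(pwd)":/src alpine ls /src
```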
Networks
Docker networks enable communication between containers across hosts. Built-in network drivers include:
- bridge – Default virtual network on Docker host.
- host – Removes network isolation between host and containers.
- overlay – Enables Swarm services to communicate across cluster.
- macvlan – Macvlan networks allow assigning MAC addresses to containers to make them appear as physical devices on the network.
- none – Disables networking completely.
The Docker network architecture allows flexible connectivity:
- Cross-host networking – Communicate across Docker daemons
- Embedded DNS – A built-in DNS server provides name resolution
- Load balancing – L7 load balancing using swarm mode
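As a quick illustration of bridge networking and the embedded DNS (the container and network names are made up):

```shell
# Create a user-defined bridge network
docker network create --driver bridge appnet

# Containers on the same user-defined network can reach each other by name
docker run -d --name web --network appnet nginx
docker run --rm --network appnet alpine ping -c 1 web
```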
How Docker Works
When you run `docker container run`, this is what happens under the hood:
- The Docker client sends the request to the daemon process
- The daemon pulls the image from the registry if it is not present locally
- A writable container layer is created on top of the image
- Networking and storage are set up
- The container process is spawned from the image and the app runs
- The app accesses host resources (CPU, memory, network) via namespaces and cgroups
- Changes to the filesystem are stored in the container's writable layer
Let's understand some key concepts:
Copy-on-write strategy – Docker uses a CoW strategy for layering filesystems: new layers store only differences rather than duplicating files.
Namespaces – Provide isolation for containers at the process level. Each aspect, such as users, network, and hostname, is namespaced (virtualized) per container.
Control groups (cgroups) – Limit the amount of resources, such as CPU and memory, available to a container, allowing resources to be shared efficiently across containers.
Union file system (UnionFS) – A kernel-provided facility for transparently combining multiple directories. Docker combines an image's layers into a single union filesystem for each container.
Comparing Docker and Virtual Machines
Unlike VMs which virtualize hardware, containers provide operating system level virtualization by isolating resources for processes. This makes containers more lightweight and efficient.
Some differences:
| | Containers | Virtual Machines |
| --- | --- | --- |
| OS | Share host kernel | Guest OS kernel |
| Startup time | Seconds | Minutes |
| Size | MBs | GBs |
| Performance | Near native | Hypervisor overhead |
| Hardware utilization | Efficient | Emulated |
| Isolation | Process level | Hardware virtualized |
Containers have lower startup times and overhead. But VMs provide complete isolation which may be preferred for some multi-tenant use cases.
Docker Installation
There are several installation options for Docker:
- Docker Desktop – Easiest option for Mac and Windows. Bundles the Docker CLI, daemon, and SDK tools.
- Docker Engine on Linux – Package containing the CLI, daemon, and engine. Install on Linux distros like Ubuntu, Fedora, etc.
- Cloud options – Managed Docker services on AWS, Azure, and GCP. Saves installation effort.
- Docker swarm mode – Native clustering for Docker daemons. Lets you join nodes as a swarm and deploy services.
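On a Linux host, Docker's convenience script is one common way to install the Engine (review any downloaded script before running it):

```shell
# Install Docker Engine via the official convenience script
curl -fsSL https://get.docker.com -o get-docker.sh
sh get-docker.sh

# Verify the client and daemon are working
docker version
docker run --rm hello-world
```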
Best Practices for Docker
Here are some tips from Docker experts on best practices:
- Keep containers ephemeral – As per Kelsey Hightower, containers should be disposable, immutable infrastructure.
- One process per container – Rob Baillie recommends sticking to one process to keep container restarts fast.
- Minimize layers in images – Addy Osmani suggests minimizing layers to reduce attack surface and size.
- Leverage multi-stage builds – Tara Z. Manicsic advises using multi-stage builds to keep final images lean.
- Scan images – According to Aqua Security, scan images in registries for vulnerabilities during CI/CD.
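A multi-stage build, for example, compiles in one stage and copies only the resulting artifact into a small final image. Here is a sketch for a hypothetical Go service:

```dockerfile
# Stage 1: build the binary with the full toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /server .

# Stage 2: ship only the static binary on a minimal base
FROM alpine:3.19
COPY --from=build /server /server
ENTRYPOINT ["/server"]
```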
Conclusion
In this extensive guide, we covered Docker's architecture: why Docker took off, the client-server design, core components like the daemon, CLI, and registries, and Docker objects like images, containers, volumes, and networks.
We also dove into how Docker differs from virtual machines, installation options, and best practices from industry experts.
Docker adoption is accelerating, driven by benefits like standardized environments, speed, and scalability. I hope you found this guide helpful. Please share your feedback and feel free to reach out with any other questions on mastering Docker!