Docker :: Rise of the Containers
What problems were there before Docker and virtual machines?
→ Before virtualization, if you developed multiple servers written in Java 8, Java 9, Python 2, Python 3, etc., and had to run them all on the same machine, there would likely be conflicts.
→ Also, when multiple servers are installed on the same machine, one server could, by accident, delete a file created by another server.
→ A server could have a memory leak and hoard all the RAM away from the other servers. In other words, there is no isolation between processes.
→ The only solution to the above problems would be to buy separate computers for each server, which is expensive and time-consuming to set up.
→ "It works on my machine".
→ Time taken to setup language, libraries, configuration and application on every machine.
What is a process?
It is a program that is currently under execution, using libraries and resources on the OS.
Virtual machines: A guest OS was installed on top of the hypervisor layer every time you wanted to run a program in isolation. This duplicates guest OSes, each of which hogs CPU, RAM and storage and must be maintained and upgraded separately.
What is a container?
→ It is an isolated process. (But it can spawn its own child processes)
→ It is achieved in Linux using Control Groups and Namespaces.
→ Control Groups allow you to control and allocate resources (CPU, memory, network etc.) to containers.
→ Namespaces allow you to isolate an application's view of the OS, so that it sees only certain other processes, user IDs etc. This way containers will not interfere with each other's resources.
→ Since it is a process, it will have a process id. Stepping inside the container and killing that id will kill the container itself.
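As a sketch of how Docker exposes cgroup limits, `docker run` accepts flags that cap a container's resources (the container name and limit values below are arbitrary examples; this needs a running Docker daemon):

```shell
# Cap the container at 512 MB of RAM and half a CPU core;
# Docker translates these flags into cgroup settings on the host.
docker run -d --name limited --memory=512m --cpus=0.5 nginx

# Inspect the memory limit Docker recorded for the container (in bytes)
docker inspect --format '{{.HostConfig.Memory}}' limited
```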
<aside> 💡 LXC (Linux Containers) is an OS-level virtualization method that is used to run multiple isolated containers on a control host using a single Linux kernel.
</aside>
What is the advantage of Linux containers over VMs? → Since it is a single Linux kernel, it has eliminated the need of virtual machines to create new OS instances every time you want isolation. Instead, isolation is achieved through namespaces, while the OS (or kernel) is shared between containers.
→ Also, since VMs required OS to be installed, there are a lot of background processes running to keep the OS up. Our application has to run on top of these processes. Using containers eliminates this requirement as well, since the host kernel is running all the background processes, while the container can focus on just the process that is required by our application. This massively reduces the size of a container as compared to a VM.
In simple words, a VM is an OS with dedicated resources. A container is a single process with resource usage limits applied.
How do containers solve the problem of "It works on my machine"?
Containers contain (lol) the programming language, libraries, and application already installed and ready to be executed. The host does not need to know the details. This also implies that containers have been standardised to work on any machine which uses a tool like Docker.
<aside> ⚠️ IMPORTANT: Above, I have mentioned that Java would come pre-installed in the container. This is only partially true. When we create our own custom containers, to be passed around to other machines as custom images, the image actually can contain a set of instructions to use other base images. In the example above, the custom container would have been spawned by instructions that said "Create a new container from the base Java image, then start my application and libraries on top of that Java container." This eliminates the need to pass around base images like Java, in case the other machine already has it.
</aside>
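The "set of instructions" mentioned in the aside above lives in a Dockerfile. A minimal sketch, assuming a Java application (the jar name, paths, and base-image tag are hypothetical):

```dockerfile
# Start from a base Java image; Docker reuses it if the machine already has it
FROM openjdk:8-jre-alpine
# Add our application and libraries as a new layer on top of the base
COPY myapp.jar /app/myapp.jar
# Command to run when a container is started from this image
CMD ["java", "-jar", "/app/myapp.jar"]
```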
<aside> 💡 Since LXC came before Docker and it was exclusive to Linux, the containers could directly use the Linux host kernel. But with Docker needing to be run on Windows and Mac, Docker first creates a virtual Linux machine and runs all Linux containers inside that.
</aside>
Once Docker is installed on a host, it exposes an API which can be accessed by the Docker client from a terminal. The client can be on a remote system as well.
The Docker daemon on the host keeps the containers running and spawns new containers from images.
The Docker registry works much the same way as GitHub: it hosts images that can be pushed and pulled.
→ Client tells the Daemon to create a new container from an image 'A', version '1.0'.
→ Daemon checks if the image 'A' of version '1.0' already exists. If not, it is downloaded from the registry.
→ Daemon then spawns a container and gives it a process id.
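The three steps above can be observed from the client (the image name and tag here are just examples; a Docker daemon must be running):

```shell
# 1. Ask the daemon for a container from image 'mongo', tag '4.4';
#    the daemon pulls the image from the registry if it is not local
docker run -d --name mydb mongo:4.4

# 2. The container is an ordinary process, so it has a PID on the host
docker inspect --format '{{.State.Pid}}' mydb
```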
<aside>
💡 Naming convention of images: Images from Docker hub have simple names like 'java', 'mongo'. Images from private registries usually add the registry name as a prefix to the image name. E.g.: artifactory.dev.tw.com/tech-radar
</aside>
docker images
Shows list of all images available in local.
docker ps
Show running containers.
docker ps -a
Show both running and stopped containers.
docker run mongo
Create and run a container of mongo
docker run -d mongo
Create a container and run it in the background. -d stands for 'detach'.
docker stop <container id>
Stop a running container. That is, the process is killed, but the state of the container is saved, which allows you to start the container up again, like an unpause button.
docker start <container id>
Start the stopped container again.
docker rm <container id>
Kill and destroy the state of a container. Possible only if the container was stopped first with the stop command.
docker pull mongo
Fetch the latest version of the mongo image from the registry. Just like with git, you can change the registry URL to point at the default public Docker registry or your own private registry.
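Following the naming convention from the aside above, pulling from a private registry just means prefixing the image name with the registry host (the private registry name below is the one from the example earlier, reused for illustration):

```shell
# Default public registry (Docker Hub): short image names
docker pull mongo

# Private registry: the registry host is part of the image name
docker pull artifactory.dev.tw.com/tech-radar
```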
docker run -p 8080:80 nginx
Create an nginx container, bind the host port of 8080 to port 80 of the container (nginx runs on port 80 by default), and start it.
docker run -d -p 8000-9000:5000 training/webapp python app.py
This will bind port 5000 in the container to a randomly available port between 8000 and 9000 on the host.
docker run -d -p 127.0.0.1:80:5000 training/webapp python app.py
This binds port 5000 of the container to a specific interface of the host, the localhost.
docker run --rm -p 8080:80 nginx
Same as above, but the container gets deleted automatically when stopped.
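A quick way to check that a port binding works, assuming curl is available on the host and a Docker daemon is running:

```shell
# Map host port 8080 to nginx's port 80, then request the default page
docker run -d --rm --name web -p 8080:80 nginx
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080
# An HTTP 200 here means the host port reached the container
docker stop web
```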
docker run --name hello nginx
Create a container of nginx, give the container a custom name of 'hello', and start it.
docker create --name joe nginx
Create a container with custom name 'joe', but don't start it.
docker logs <container id>
Display logs of the container
docker logs -f <container id>
Bind the command line to keep displaying the logs and updating it live.
docker exec <container id> ls
Execute the command 'ls' INSIDE the container. Keep in mind that the container needs to be able to understand 'ls' for it to be able to execute it.
docker exec -it <container id> bash
Bind the command line to that inside the container. 'i' stands for interactive, 't' stands for tty, which is related to the Linux terminal. Interactive means that it has the ability to wait for your inputs.
Ctrl + D to come back out of the container.
docker inspect <container id>
Display metadata about the container
Adding environment variables
docker run -it --rm -p 5432:5432 \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mypassword \
postgres:9.6.10-alpine
# -it = bind your console to the container's console
# --rm = destroy the container when it is stopped
# -p = bind the local port to the container port
# -e = set an environment variable
# 9.6.10-alpine = tag of the image to use for creating the container
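To confirm the variables actually reached the container, you can run the same image detached and read its environment from inside (the container name 'mydb' is hypothetical; requires a Docker daemon):

```shell
# Start postgres in the background with a custom user and password
docker run -d --name mydb \
  -e POSTGRES_USER=myuser \
  -e POSTGRES_PASSWORD=mypassword \
  postgres:9.6.10-alpine

# Print the environment that the process inside the container sees
docker exec mydb env | grep POSTGRES
```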
docker container rm <container id>
Remove a container
Copy files to/from a container
# copy from container to local machine (current directory)
docker cp my-container:afolder/report.xml .
#copy from local machine to container
docker cp report.xml my-container:report_folder/report.xml
docker rm $(docker ps -a -q -f status=exited)
Remove all exited containers
Stop and delete all containers in 1 shot:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi <image id>
Delete an image.
docker rmi $(docker images -a -q)
Delete all images in 1 shot. (Add -f after the parentheses when images don't get removed because they are still referenced by some repository.)
docker history myapp:v1
Show the history of an image
docker run -td <image>
Keep the container running in the background instead of closing. -t stands for tty. It allocates a tty to the container which waits for an input. -d stands for detach, which makes the container run in the background.
→ An image is an inert, immutable file that is essentially a snapshot of a container.
→ An image can consist of multiple layers. Each one specifies things needed to run an application. Binary files, tools, runtimes, dependencies etc.
→ The layers are isolated enough that if you already have some of the layers of an image locally, they won't be downloaded again when you pull again from the registry.
→ An unused image is one which is not being used in any container, exited or currently running.
→ A dangling image is an old version of an image with no name or tag. It is not referenced by any container, nor by any child image; hence it is 'dangling' and serves no purpose. It shows up as <none> in both the name and tag columns when you list images, which is why dangling images are also called <none>:<none> images. (Note that some <none>:<none> images are still referenced by child images; those are not dangling.)
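A dangling image can be produced deliberately by building the same tag twice (the Dockerfile contents and tag below are stand-ins; requires a Docker daemon):

```shell
# Build an image, change it, and rebuild under the same tag;
# the first build's image loses its tag and becomes <none>:<none>
printf 'FROM alpine\nCMD ["echo","v1"]\n' > Dockerfile
docker build -t myapp:v1 .
printf 'FROM alpine\nCMD ["echo","v2"]\n' > Dockerfile
docker build -t myapp:v1 .

# The image from the first build now shows up as dangling
docker images -f dangling=true
```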
docker images
List all local images.
docker images alpine
List local images named 'alpine'.
docker images --filter=reference="busy*:*libc"
List images whose name matches 'busy*' and whose tag matches '*libc'.
docker images -f dangling=true
List dangling images.
docker image prune
Remove all dangling images.
docker system prune
Remove all stopped containers, dangling images, unused networks, and the build cache in 1 shot.