Docker :: Rise of the Containers
What problems were there before Docker and virtual machines?
→ Before virtualization, if you developed multiple servers written in Java 8, Java 9, Python 2, Python 3, etc., and had to run them all on the same machine, there would likely be conflicts.
→ Also, when multiple servers are installed on the same machine, one server could, by accident, delete a file created by another server.
→ A server could have a memory leak and hoard all the RAM away from the other servers. In other words, there is no isolation between processes.
→ The only solution to the above problems would be to buy separate computers for each server, which is expensive and time-consuming to set up.
→ "It works on my machine".
→ Time taken to setup language, libraries, configuration and application on every machine.
What is a process?
It is a program that is currently under execution, using libraries and resources on the OS.
Virtual machines: A guest OS was installed on top of the hypervisor layer every time you wanted to run a program in isolation. This duplicates guest OSes, each of which hogs CPU, RAM and storage and must be maintained and upgraded separately.
What is a container?
→ It is an isolated process. (But it can spawn its own child processes)
→ It is achieved in Linux using Control Groups and Namespaces.
→ Control Groups allow you to control and allocate resources (CPU, memory, network etc.) to containers.
→ Namespaces allow you to isolate an application's view of the OS, so that it sees only certain other processes, user IDs etc. This way containers will not interfere with each other's resources.
→ Since it is a process, it will have a process id. Stepping inside the container and killing that id will kill the container itself.
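As a sketch of how Docker exposes cgroup limits, `docker run` accepts flags that cap a container's resources (the container name and limit values below are arbitrary examples; this needs a running Docker daemon):

```shell
# Cap the container at 512 MB of RAM and half a CPU core;
# Docker translates these flags into cgroup settings on the host.
docker run -d --name limited --memory=512m --cpus=0.5 nginx

# Inspect the memory limit Docker recorded for the container (in bytes)
docker inspect --format '{{.HostConfig.Memory}}' limited
```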
<aside> 💡 LXC (Linux Containers) is an OS-level virtualization method that is used to run multiple isolated containers on a control host using a single Linux kernel.
</aside>
What is the advantage of Linux containers over VMs? → Since it is a single Linux kernel, it has eliminated the need of virtual machines to create new OS instances every time you want isolation. Instead, isolation is achieved through namespaces, while the OS (or kernel) is shared between containers.
→ Also, since VMs required OS to be installed, there are a lot of background processes running to keep the OS up. Our application has to run on top of these processes. Using containers eliminates this requirement as well, since the host kernel is running all the background processes, while the container can focus on just the process that is required by our application. This massively reduces the size of a container as compared to a VM.
In simple words, a VM is an OS with dedicated resources. A container is a single process with resource usage limits applied.
How do containers solve the problem of "It works on my machine"?
Containers contain (lol) the programming language, libraries, and application already installed and ready to be executed. The host does not need to know the details. This also implies that containers have been standardised to work on any machine which uses a tool like Docker.
<aside> ⚠️ IMPORTANT: Above, I have mentioned that Java would come pre-installed in the container. This is only partially true. When we create our own custom containers, to be passed around to other machines as custom images, the image actually can contain a set of instructions to use other base images. In the example above, the custom container would have been spawned by instructions that said "Create a new container from the base Java image, then start my application and libraries on top of that Java container." This eliminates the need to pass around base images like Java, in case the other machine already has it.
</aside>
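The "set of instructions" mentioned in the aside above lives in a Dockerfile. A minimal sketch, assuming a Java application (the jar name, paths, and base-image tag are hypothetical):

```dockerfile
# Start from a base Java image; Docker reuses it if the machine already has it
FROM openjdk:8-jre-alpine
# Add our application and libraries as a new layer on top of the base
COPY myapp.jar /app/myapp.jar
# Command to run when a container is started from this image
CMD ["java", "-jar", "/app/myapp.jar"]
```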
<aside> 💡 Since LXC came before Docker and it was exclusive to Linux, the containers could directly use the Linux host kernel. But with Docker needing to be run on Windows and Mac, Docker first creates a virtual Linux machine and runs all Linux containers inside that.
</aside>
Once Docker is installed on a host, it exposes an API which can be accessed by the Docker client from a terminal. The client can be on a remote system as well.
The Docker daemon on the host keeps the containers running and spawns new containers from images.
The Docker registry works much the same way as GitHub: it hosts images that can be pushed and pulled.
→ Client tells the Daemon to create a new container from an image 'A', version '1.0'.
→ Daemon checks if the image 'A' of version '1.0' already exists. If not, it is downloaded from the registry.
→ Daemon then spawns a container and gives it a process id.
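The three steps above can be observed from the client (the image name and tag here are just examples; a Docker daemon must be running):

```shell
# 1. Ask the daemon for a container from image 'mongo', tag '4.4';
#    the daemon pulls the image from the registry if it is not local
docker run -d --name mydb mongo:4.4

# 2. The container is an ordinary process, so it has a PID on the host
docker inspect --format '{{.State.Pid}}' mydb
```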
<aside>
💡 Naming convention of images: Images from Docker hub have simple names like 'java', 'mongo'. Images from private registries usually add the registry name as a prefix to the image name. E.g.: artifactory.dev.tw.com/tech-radar
</aside>
docker images
Shows list of all images available in local.
docker ps
Show running containers.
docker ps -a
Show both running and stopped containers.
docker run mongo
Create and run a container of mongo
docker run -d mongo
Create a container and run it in the background. -d stands for 'detach'.
docker stop <container id>
Stop a running container. That is, the process is killed, but the state of the container is saved, which allows you to start the container up again, like an unpause button.
docker start <container id>
Start the stopped container again.
docker rm <container id>
Kill and destroy the state of a container. Possible only if the container was stopped first with the stop command.
docker pull mongo
Fetch the latest version of the mongo image from the registry. Just like with git, you can change the registry URL to point at the default public Docker registry or your own private registry.
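Following the naming convention from the aside above, pulling from a private registry just means prefixing the image name with the registry host (the private registry name below is the one from the example earlier, reused for illustration):

```shell
# Default public registry (Docker Hub): short image names
docker pull mongo

# Private registry: the registry host is part of the image name
docker pull artifactory.dev.tw.com/tech-radar
```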
docker run -p 8080:80 nginx
Create an nginx container, bind the host port of 8080 to port 80 of the container (nginx runs on port 80 by default), and start it.
docker run -d -p 8000-9000:5000 training/webapp python app.py
This will bind port 5000 in the container to a randomly available port between 8000 and 9000 on the host.
docker run -d -p 127.0.0.1:80:5000 training/webapp python app.py
This binds port 5000 of the container to a specific interface of the host, the localhost.
docker run --rm -p 8080:80 nginx
Same as above, but the container gets deleted automatically when stopped.
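A quick way to check that a port binding works, assuming curl is available on the host and a Docker daemon is running:

```shell
# Map host port 8080 to nginx's port 80, then request the default page
docker run -d --rm --name web -p 8080:80 nginx
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080
# An HTTP 200 here means the host port reached the container
docker stop web
```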
docker run --name hello nginx
Create a container of nginx, give the container a custom name of 'hello', and start it.
docker create --name joe nginx
Create a container with custom name 'joe', but don't start it.
docker logs <container id>
Display logs of the container
docker logs -f <container id>
Bind the command line to keep displaying the logs and updating it live.
docker exec <container id> ls
Execute the command 'ls' INSIDE the container. Keep in mind that the container needs to be able to understand 'ls' for it to be able to execute it.
docker exec -it <container id> bash
Bind the command line to that inside the container. 'i' stands for interactive, 't' stands for tty, which is related to the Linux terminal. Interactive means that it has the ability to wait for your inputs.
Ctrl + D to come back out of the container.
docker inspect <container id>
Display metadata about the container
Adding environment variables
docker run -it --rm -p 5432:5432 \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mypassword \
postgres:9.6.10-alpine
# -it = bind your console to the container's console
# --rm = destroy the container when it is stopped
# -p = bind the local port to the container port
# -e = set an environment variable
# 9.6.10-alpine = tag of the image to use for creating the container
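To confirm the variables actually reached the container, you can run the same image detached and read its environment from inside (the container name 'mydb' is hypothetical; requires a Docker daemon):

```shell
# Start postgres in the background with a custom user and password
docker run -d --name mydb \
  -e POSTGRES_USER=myuser \
  -e POSTGRES_PASSWORD=mypassword \
  postgres:9.6.10-alpine

# Print the environment that the process inside the container sees
docker exec mydb env | grep POSTGRES
```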
docker container rm <container id>
Remove a container
Copy files to/from a container
# copy from container to local machine (current directory)
docker cp my-container:afolder/report.xml .
#copy from local machine to container
docker cp report.xml my-container:report_folder/report.xml
docker rm $(docker ps -a -q -f status=exited)
Remove all exited containers
Stop and delete all containers in 1 shot:
docker stop $(docker ps -a -q)
docker rm $(docker ps -a -q)
docker rmi <image id>
Delete an image.
docker rmi $(docker images -a -q)
Delete all images in 1 shot. (Add -f after the parentheses when images don't get removed because they are still referenced by some repository.)
docker history myapp:v1
Show the history of an image
docker run -td <image>
Keep the container running in the background instead of closing. -t stands for tty. It allocates a tty to the container which waits for an input. -d stands for detach, which makes the container run in the background.
→ An image is an inert, immutable file that is essentially a snapshot of a container.
→ An image can consist of multiple layers. Each one specifies things needed to run an application. Binary files, tools, runtimes, dependencies etc.
→ The layers are isolated enough that if you already have some of the layers of an image locally, they won't be downloaded again when you pull again from the registry.
→ An unused image is one which is not being used in any container, exited or currently running.
→ A dangling image is an old version of an image with no name or tag. It is not referenced by any container, nor by any child image; hence it is 'dangling' and serves no purpose. It shows up as <none> in both the name and tag columns when you list images, which is why dangling images are also called <none>:<none> images. (Note that some <none>:<none> images are still referenced by child images; those are not dangling.)
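A dangling image can be produced deliberately by building the same tag twice (the Dockerfile contents and tag below are stand-ins; requires a Docker daemon):

```shell
# Build an image, change it, and rebuild under the same tag;
# the first build's image loses its tag and becomes <none>:<none>
printf 'FROM alpine\nCMD ["echo","v1"]\n' > Dockerfile
docker build -t myapp:v1 .
printf 'FROM alpine\nCMD ["echo","v2"]\n' > Dockerfile
docker build -t myapp:v1 .

# The image from the first build now shows up as dangling
docker images -f dangling=true
```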
docker images
List all local images.
docker images alpine
List local images named 'alpine'.
docker images --filter=reference="busy*:*libc"
List images whose name matches 'busy*' and whose tag matches '*libc'.
docker images -f dangling=true
List dangling images.
docker image prune
Remove all dangling images.
docker system prune
Remove all stopped containers, dangling images, unused networks, and the build cache in 1 shot.