Docker 101

If your goal is to ship software in the real world, one of the most powerful concepts to understand is containerization.

When developing locally, it solves the age-old problem of "it works on my machine," and when deploying to the cloud it solves the age-old problem of "this architecture doesn't scale."

In this article, we will unlock the power inside the container by learning 101 concepts and terms related to computer science, the cloud and, of course, Docker.

A computer is a box with three important components inside.

  1. A Central Processing Unit (CPU) for calculating things.

  2. Random Access Memory (RAM) for the applications you are using right now.

  3. A disk to store things you might use later.

This is bare-metal hardware, but in order to use it we need an Operating System.

Most importantly, the operating system provides a Kernel that sits on top of the bare metal, allowing software applications to run on it.

Physical Media: In the old days, you would go to the store and buy software on physical media to install on your machine. Nowadays, most software is delivered over the internet through networking.

When you use software over the internet, your computer is called the client, while you and a billion other users get that data from remote computers called servers. When an app starts reaching millions of people, weird things begin to happen.

  1. The CPU becomes exhausted handling all the incoming requests.

  2. Disk input/output slows down.

  3. Network bandwidth gets maxed out.

  4. The database becomes too large to query effectively.

  5. On top of that, you wrote some garbage code that's causing race conditions, memory leaks and unhandled errors that will eventually grind your server to a halt.

The big question is, how do we scale our infrastructure?

A server can scale up in two ways.

  1. Vertical

  2. Horizontal

To scale vertically, you take your one server and increase its RAM and CPU. This can take you pretty far, but eventually you'll hit a ceiling.

To scale horizontally, you distribute your code over multiple smaller servers, which are often broken down into microservices that can run and scale independently. But distributed systems like this aren't very practical on bare metal, because the resources each service actually needs vary while the underlying hardware is fixed.

One way engineers address this is with virtual machines, using a tool called a hypervisor, which can isolate and run multiple operating systems on a single machine.

That helps, but each virtual machine's allocation of CPU and memory is still fixed (fixed resource allocation). And that's where Docker comes in.

Applications running on top of the Docker Engine all share the same host operating system kernel (shared kernel) and use resources dynamically (dynamic resource allocation) according to their needs.

Under the hood, Docker runs a daemon, a persistent background process (dockerd), that makes all this possible and gives us operating-system-level virtualization.

What's awesome is that any developer can easily harness this power by simply installing Docker Desktop. It allows you to develop software without having to make massive changes to your local system.
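Once Docker Desktop is installed, you can see the client/daemon split for yourself from a terminal. Both of the following are standard CLI commands; nothing here is specific to our example:

  docker version   # prints the Client (CLI) version and the Server (dockerd) version
  docker info      # prints daemon-level details such as the storage driver and container counts

If the daemon isn't running, these commands will complain that they can't connect to it, which is a good reminder that the CLI is just a thin client talking to dockerd.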

How it works

Step 1

You start with a Dockerfile. This is like a blueprint that tells Docker how to configure the environment that runs your application. The Dockerfile is then used to build an image, which contains an operating system, your dependencies and your code, like a template for running your application. We can upload this image to a registry in the cloud, like Docker Hub, and share it with the world.

But an image by itself doesn't do anything. You need to run it as a container, which is an isolated package running your code that, in theory, can scale infinitely in the cloud. Containers are stateless, meaning that when they shut down, all the data inside them is lost. But that also makes them portable: they can run on every major cloud platform without vendor lock-in. Pretty cool, but the best way to learn Docker is to actually run a container:

  • Create a Dockerfile (a complete sketch of the finished file follows this list).

    • A Dockerfile contains a collection of instructions, which by convention are written in all caps.

      • FROM is usually the first instruction you'll see; it points to a base image to start from. This will often be a Linux distro, possibly followed by a colon and an optional image tag, which in this case specifies the version of the operating system.

      • WORKDIR is the working-directory instruction; it creates a source directory and cd's into it. This is where we will put all our source code, and all subsequent commands will be executed from this working directory.

      • RUN uses the Linux package manager to install our dependencies. RUN lets you run any command, just like you would from the command line. By default you'd be running as the root user, but for better security we can also create a non-root user.

      • USER appuser

        The USER instruction switches to that non-root user for the instructions that follow.

      • COPY app.py .

        With the user in place, COPY copies the code on our local machine over to the image.

      • ENV API_KEY=hi_mom

        To run this code we need an API key, which we can set as an environment variable with the ENV instruction.

      • EXPOSE 8000

        We're building in a web server that people can connect to, which requires a port for external traffic. The EXPOSE instruction makes that port accessible.

      • CMD [ "python3", "-m", "http.server", "8000" ]

        CMD is the command you want to run when starting up a container; in this case, it starts our web server.

        There can only be one CMD instruction per Dockerfile.

        Although you might also add an ENTRYPOINT.

      • ENTRYPOINT [ "python3", "-m", "http.server", "8000"]

        CMD ["8000"]

        Allowing you to pass arguments to the command when you run it (Override arguments). i.e. $> docker run your-image 5000

  • That's everything we need for the Dockerfile. But as an added touch, we can also use LABEL to add some extra metadata.

    • LABEL maintainer="Ligalah Hezron"

      LABEL version="1.0"

  • Or we could add a health check to make sure the container is running properly, for example by curl-ing the web server it exposes.

    • HEALTHCHECK --interval=30s --timeout=10s CMD curl -f http://localhost:8000/ || exit 1

  • Or, if the container needs to store data that will be used later or shared by multiple containers, we can mount a volume backed by a persistent disk.

    • VOLUME /db/data
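Putting all of those instructions together, the finished Dockerfile sketched above might look something like this. Treat it as an illustrative sketch rather than a production-ready file: the base image tag, the installed packages, the app.py filename and the curl-based health check are assumptions made for the example.

  # Dockerfile — illustrative sketch assembling the instructions above

  # Base image, with an optional version tag after the colon
  FROM ubuntu:22.04

  # Create the working directory and cd into it
  WORKDIR /app

  # Install dependencies with the Linux package manager
  RUN apt-get update && apt-get install -y python3 curl && rm -rf /var/lib/apt/lists/*

  # Create a non-root user for better security and switch to it
  RUN useradd --create-home appuser
  USER appuser

  # Copy our code from the local machine into the image
  COPY app.py .

  # Set an environment variable
  ENV API_KEY=hi_mom

  # Document the port our web server listens on
  EXPOSE 8000

  # Extra metadata
  LABEL maintainer="Ligalah Hezron"
  LABEL version="1.0"

  # Periodically check that the server is still responding
  HEALTHCHECK --interval=30s --timeout=10s \
    CMD curl -f http://localhost:8000/ || exit 1

  # Mount point for persistent data
  VOLUME /db/data

  # Fixed entrypoint plus a default argument that can be overridden,
  # e.g. docker run <image> 5000
  ENTRYPOINT [ "python3", "-m", "http.server" ]
  CMD [ "8000" ]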

Okay, we have a Dockerfile, so now what?

When you installed Docker Desktop, it also installed the Docker CLI, which you can run from the terminal.

Run docker help to see all the possible commands, but the one we need right now is docker build, which will turn this Dockerfile into an image.

When you run the command, it's a good idea to use the -t flag to tag it with a recognizable name.

docker build -t awesome .

Docker builds the image in layers, and every layer is identified by a SHA256 hash. Layers are cached, which means that if you modify your Dockerfile, Docker only has to rebuild the layers that actually changed, and that makes your workflow as a developer far more efficient.
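To see that caching in action, build the image twice; the tag awesome matches the command above:

  docker build -t awesome .   # first build: every layer is built from scratch
  # ...edit app.py, then run the same command again...
  docker build -t awesome .   # layers before the COPY are reused from cache; only the COPY layer and everything after it are rebuilt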

It is important to point out that sometimes you don't want certain files to end up in the Docker image. You can add them to a .dockerignore file to exclude them from the files that get copied.
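A .dockerignore file sits next to the Dockerfile and lists patterns to exclude, one per line, similar in spirit to .gitignore. These particular entries are just common examples:

  # .dockerignore — keep these out of the build context and the image
  .git
  __pycache__/
  *.pyc
  # local secrets should not be baked into the image
  .env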

Open Docker Desktop and view the image there.

Not only does it give us a detailed breakdown, but thanks to Docker Scout we can proactively identify security vulnerabilities in each layer of the image. It works by extracting the software bill of materials (SBOM) from the image and comparing it against a number of security advisory databases; when there is a match, the vulnerability is given a severity rating so you can prioritize your security efforts.
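If you prefer the terminal, roughly the same analysis is available through the Docker Scout CLI plugin, assuming it is installed alongside Docker Desktop; the image name matches our earlier example:

  docker scout quickview awesome   # summary of vulnerabilities found in the image, grouped by severity
  docker scout cves awesome        # detailed list of CVEs per package and layer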

The Run button executes the docker run command, and we can now access our server on localhost. In addition, we can see the running container in Docker Desktop, which is the equivalent of the docker ps command that you can run from the terminal to get a breakdown of all the running and stopped containers on your machine. If we click on the container, we can inspect its logs or view its file system. We can even execute commands directly inside the running container.

When it comes to shutting it down, we can use docker stop to stop it gracefully or docker kill to forcefully stop it.

You can still see the stopped container in the user interface, or use docker rm to get rid of it.
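Here is that same lifecycle from the terminal, using the image we built earlier; the container name awesome-app and the port mapping are illustrative:

  docker run -d -p 8000:8000 --name awesome-app awesome   # start a container in the background and publish port 8000
  docker ps                                               # list running containers (add -a to include stopped ones)
  docker logs awesome-app                                 # inspect the container's logs
  docker exec -it awesome-app bash                        # execute a command (here, an interactive shell) inside it
  docker stop awesome-app                                 # stop it gracefully (SIGTERM, then SIGKILL after a timeout)
  docker kill awesome-app                                 # or stop it forcefully (SIGKILL right away)
  docker rm awesome-app                                   # remove the stopped container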

Running a container in the cloud

docker push will upload your image to a remote registry, where it can run on a cloud like AWS with Elastic Container Service, or be launched on serverless platforms like Google Cloud Run.

Conversely, you might want to use someone else's Docker image, which can be downloaded from a registry with docker pull. Now you can run any developer's code without having to make any changes to your local environment or machine.
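In practice, pushing requires the image to carry a registry-style name first; yourname below stands in for a real Docker Hub username:

  docker login                              # authenticate with Docker Hub (or another registry)
  docker tag awesome yourname/awesome:1.0   # give the local image a registry-style name and tag
  docker push yourname/awesome:1.0          # upload it to the registry
  docker pull yourname/awesome:1.0          # download it on any other machine
  docker run -p 8000:8000 yourname/awesome:1.0   # and run it there, no other setup required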

Congratulations, you are now a bona fide, certified Docker expert.

But Docker itself is only the beginning. There is a good chance that your application has more than one service, and for that you'll want to know about Docker Compose, a tool for managing multi-container applications. It lets you define multiple services and their Docker images in a single YAML file (YAML config), for example a frontend, a backend and a database. The docker-compose up command will spin up all the containers simultaneously, while the docker-compose down command will stop them.
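As a rough sketch, a docker-compose.yml for the frontend, backend and database described above might look like the following; the service names, images, ports and credentials are all illustrative (and older Compose releases also expect a top-level version: field):

  # docker-compose.yml — illustrative sketch, not a drop-in config
  services:
    frontend:
      build: ./frontend        # built from a Dockerfile in ./frontend
      ports:
        - "3000:3000"          # host:container port mapping
    backend:
      build: ./backend
      environment:
        - API_KEY=hi_mom       # environment variables, just like ENV
      depends_on:
        - db
    db:
      image: postgres:16       # an off-the-shelf image pulled from Docker Hub
      environment:
        - POSTGRES_PASSWORD=example
      volumes:
        - db-data:/var/lib/postgresql/data   # named volume so the data survives restarts
  volumes:
    db-data:

Run docker-compose up from the directory containing this file and all three services are built (or pulled) and started together.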

That works on an individual server, but once you reach massive scale you'll likely need an orchestration tool like Kubernetes to run and manage containers all over the world. It works this way:

  • You have a control plane that exposes an API that can manage the cluster.

  • The cluster has multiple nodes or machines, each one running a kubelet and multiple pods.

  • A pod is the minimal deployable unit in Kubernetes, and it has one or more containers inside of it.
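To make that concrete, here is a minimal sketch of a pod manifest; the pod name and image reuse the illustrative names from earlier:

  # pod.yaml — minimal illustrative pod manifest
  apiVersion: v1
  kind: Pod
  metadata:
    name: awesome-pod
  spec:
    containers:
      - name: awesome                  # one container inside the pod
        image: yourname/awesome:1.0    # the image we pushed to the registry earlier
        ports:
          - containerPort: 8000

You hand a description like this to the control plane (for example with kubectl apply -f pod.yaml) and Kubernetes takes care of scheduling it onto a node.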

What makes Kubernetes so effective is that you can describe the desired state of the system, and it will automatically scale up or down while providing fault tolerance, automatically healing (auto-heal) if one of your servers goes down. It gets pretty complicated, but the good news is that you probably don't need Kubernetes. It was developed at Google, based on its internal Borg system, and is really only necessary for highly complex, high-traffic systems. If that sounds like you, you can use extensions in Docker Desktop to debug your pods.