Want to significantly speed up the Dockerfile code/build/test cycle? In this article I'll discuss how the Docker image cache works and then give you some tips for using it effectively.

Caching Image Layers

Each instruction in your Dockerfile results in a new image layer being created and added to your local image cache. That image then becomes the parent for the image created by the next instruction (see my previous article for a detailed explanation of the image creation process). Let's look at an example:

FROM debian:wheezy
MAINTAINER [email protected]
RUN apt-get update && apt-get install -y vim

ENTRYPOINT 'vim'

If we docker build this Dockerfile and inspect the local image cache we'll see something like this:

$ docker images --tree
Warning: '--tree' is deprecated, it will be removed soon. See usage.
└─511136ea3c5a Virtual Size: 0 B Tags: scratch:latest
  └─59e359cb35ef Virtual Size: 85.18 MB
    └─e8d37d9e3476 Virtual Size: 85.18 MB Tags: debian:wheezy
      └─c58b36b8f285 Virtual Size: 85.18 MB
        └─90ea6e05b074 Virtual Size: 118.6 MB
          └─5dc74cffc471 Virtual Size: 118.6 MB Tags: vim:latest

The FROM instruction in our Dockerfile corresponds to the image layer tagged with debian:wheezy. The three child layers shown underneath that correspond to the other three instructions from our Dockerfile.

Another way to look at this is with the docker history command:

$ docker history vim
IMAGE         CREATED         CREATED BY                              SIZE
5dc74cffc471  15 minutes ago  /bin/sh -c #(nop) ENTRYPOINT [/bin/sh   0 B
90ea6e05b074  15 minutes ago  /bin/sh -c apt-get update && apt-get    33.41 MB
c58b36b8f285  15 minutes ago  /bin/sh -c #(nop) MAINTAINER brian.de   0 B
e8d37d9e3476  2 weeks ago     /bin/sh -c #(nop) CMD [/bin/bash]       0 B
59e359cb35ef  2 weeks ago     /bin/sh -c #(nop) ADD file:1e2ba3d937   85.18 MB
511136ea3c5a  13 months ago

With this view, the order is reversed (the child image appears before the parent) but you do get to see the Dockerfile instruction that was responsible for generating each layer.
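To make the ordering concrete, here is a small Python sketch of why the history view lists the child first: it starts at the tagged image and follows parent links back to the root. The PARENT table and the history function are my own illustration (the IDs are copied from the output above), not how Docker stores images internally.

```python
# Hypothetical model: each image ID maps to its parent ID (None for the root).
# The IDs are copied from the `docker history vim` output in this article.
PARENT = {
    "5dc74cffc471": "90ea6e05b074",  # ENTRYPOINT
    "90ea6e05b074": "c58b36b8f285",  # RUN apt-get ...
    "c58b36b8f285": "e8d37d9e3476",  # MAINTAINER
    "e8d37d9e3476": "59e359cb35ef",  # debian:wheezy
    "59e359cb35ef": "511136ea3c5a",
    "511136ea3c5a": None,            # scratch
}


def history(image_id, parent=PARENT):
    """Return image IDs from the given image back to the root, child first."""
    chain = []
    current = image_id
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain


if __name__ == "__main__":
    for layer in history("5dc74cffc471"):
        print(layer)
```

Reversing the returned list gives the same top-down order as the tree view.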

After you've successfully built an image from your Dockerfile you should notice that subsequent builds of the same Dockerfile finish significantly faster. Once Docker has cached the image layer for an instruction, that layer doesn't need to be rebuilt.

Let's look at an example for the Dockerfile above. We'll run the docker build command with the time utility so that we can see how long the initial build takes to complete.

$ time docker build -q -t vim .
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM debian:wheezy
 ---> e8d37d9e3476
Step 1 : MAINTAINER [email protected]
 ---> Running in 6b08074996d3
 ---> c58b36b8f285
Removing intermediate container 6b08074996d3
Step 2 : RUN apt-get update && apt-get install -y vim
 ---> Running in ef1603171a30
 ---> 90ea6e05b074
Removing intermediate container ef1603171a30
Step 3 : ENTRYPOINT 'vim'
 ---> Running in b3e0ad883ec5
 ---> 5dc74cffc471
Removing intermediate container b3e0ad883ec5
Successfully built 5dc74cffc471

real   0m21.917s
user   0m0.003s
sys    0m0.005s

You can see that it took 21 seconds to complete the build. Our example here is fairly trivial but it's not uncommon for builds to take many minutes once you start adding more instructions to your Dockerfile.

If we immediately execute the same command again we should see something like this:

$ time docker build -q -t vim .
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM debian:wheezy
 ---> e8d37d9e3476
Step 1 : MAINTAINER [email protected]
 ---> Using cache
 ---> c58b36b8f285
Step 2 : RUN apt-get update && apt-get install -y vim
 ---> Using cache
 ---> 90ea6e05b074
Step 3 : ENTRYPOINT 'vim'
 ---> Using cache
 ---> 5dc74cffc471
Successfully built 5dc74cffc471

real   0m0.032s
user   0m0.003s
sys    0m0.002s

Note how each instruction was followed by the "Using cache" message and the total build time dropped from 21 seconds to less than a second. Since we didn't change anything between the two builds there was really nothing for Docker to do: everything was already in the cache.

Cache Invalidation

As Docker is processing your Dockerfile to determine whether a particular image layer is already cached it looks at two things: the instruction being executed and the parent image.

Docker will scan all of the children of the parent image and look for one whose command matches the current instruction. If a match is found, Docker skips to the next instruction and repeats the process.

If a matching image is not found in the cache, a new image is created.
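The lookup described above can be sketched in a few lines of Python. This is a simplified model, not Docker's actual implementation: the Image class, find_cached_child, build_step, and the "img1"-style IDs are all hypothetical names I've made up for illustration.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Image:
    image_id: str
    instruction: Optional[str]  # the instruction that created this layer
    children: list = field(default_factory=list)


# Hypothetical ID generator; real image IDs look nothing like this.
_ids = itertools.count(1)


def find_cached_child(parent, instruction):
    """Scan the parent's children for one created by the same instruction."""
    for child in parent.children:
        if child.instruction == instruction:
            return child
    return None


def build_step(parent, instruction):
    """Return (image, used_cache) for one instruction on top of parent."""
    cached = find_cached_child(parent, instruction)
    if cached is not None:
        return cached, True
    # No match in the cache: create a new image and record it as a child.
    image = Image(f"img{next(_ids)}", instruction)
    parent.children.append(image)
    return image, False


if __name__ == "__main__":
    base = Image("debian:wheezy", None)
    print(build_step(base, "RUN apt-get update")[1])  # first build: no cache
    print(build_step(base, "RUN apt-get update")[1])  # second build: cache
```

The image returned by build_step becomes the parent for the next instruction, which is why both the instruction and the parent matter for a cache hit.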

Since the cache relies on both the instruction being executed and the image generated from the previous instruction it should come as no surprise that changing any instruction in the Dockerfile will invalidate the cache for all of the instructions that follow it. Invalidating an image also invalidates all the children of that image.

Let's make a change to our Dockerfile and see how it impacts the local image cache. We'll update the apt-get install instruction to install Emacs in addition to Vim:

FROM debian:wheezy
MAINTAINER [email protected]
RUN apt-get update && apt-get install -y vim emacs

ENTRYPOINT 'vim'

Let's build our new image and see what happens:

$ time docker build -q -t vim .
Sending build context to Docker daemon 2.56 kB
Sending build context to Docker daemon
Step 0 : FROM debian:wheezy
 ---> e8d37d9e3476
Step 1 : MAINTAINER [email protected]
 ---> Using cache
 ---> c58b36b8f285
Step 2 : RUN apt-get update && apt-get install -y vim emacs
 ---> Running in 33824d0f33ff
 ---> d6e06afe57c5
Removing intermediate container 33824d0f33ff
Step 3 : ENTRYPOINT 'vim'
 ---> Running in 27b1ae56612d
 ---> c3bf7baa3f34
Removing intermediate container 27b1ae56612d
Successfully built c3bf7baa3f34

real   2m1.511s
user   0m0.003s
sys    0m0.005s

Since we didn't alter the MAINTAINER instruction it was found in the cache and used as-is. However, we did edit the apt-get line so it resulted in a completely new image layer being created.

Furthermore, even though we didn't change the ENTRYPOINT instruction at all, its layer also had to be rebuilt since its parent image changed.

If we look at the image tree again we can see the two new layers that we created alongside the layers that were generated from the previous version of our Dockerfile:

$ docker images --tree
Warning: '--tree' is deprecated, it will be removed soon. See usage.
└─511136ea3c5a Virtual Size: 0 B Tags: scratch:latest
  └─59e359cb35ef Virtual Size: 85.18 MB
    └─e8d37d9e3476 Virtual Size: 85.18 MB Tags: debian:wheezy
      └─c58b36b8f285 Virtual Size: 85.18 MB
        ├─d6e06afe57c5 Virtual Size: 320.5 MB
        │ └─c3bf7baa3f34 Virtual Size: 320.5 MB Tags: vim:latest
        └─90ea6e05b074 Virtual Size: 118.6 MB
          └─5dc74cffc471 Virtual Size: 118.6 MB

Note how the layer for the MAINTAINER instruction (c58b36b8f285) remained the same, but it now has two children. The layers generated from the previous version of our Dockerfile are still in the cache; they are simply no longer part of the tree tagged as vim:latest.
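Here is a rough Python simulation of the two builds we just walked through. It assumes the cache can be modeled as a dictionary keyed by (parent image, instruction); the build function, the "layer-N" IDs, and the placeholder MAINTAINER value are my own inventions for the sketch, not anything Docker exposes.

```python
def build(instructions, cache, base="debian:wheezy"):
    """Simulate a build and return a list of (instruction, used_cache) pairs."""
    parent = base
    report = []
    for instruction in instructions:
        key = (parent, instruction)
        hit = key in cache
        if not hit:
            # Hypothetical ID scheme: a counter stands in for a real image ID.
            cache[key] = f"layer-{len(cache) + 1}"
        # The layer for this step is the parent for the next step.
        parent = cache[key]
        report.append((instruction, hit))
    return report


# The MAINTAINER value is a placeholder, not the address from the article.
V1 = [
    "MAINTAINER placeholder",
    "RUN apt-get update && apt-get install -y vim",
    "ENTRYPOINT 'vim'",
]
V2 = [
    "MAINTAINER placeholder",
    "RUN apt-get update && apt-get install -y vim emacs",
    "ENTRYPOINT 'vim'",
]

if __name__ == "__main__":
    cache = {}
    for name, dockerfile in (("v1", V1), ("v2", V2), ("v1 again", V1)):
        print(name)
        for instruction, hit in build(dockerfile, cache):
            print("  ", "Using cache" if hit else "Running    ", instruction)
```

In this model the first build misses on every step, and the second hits only on MAINTAINER: the unchanged ENTRYPOINT misses because its parent is a new layer. Building the original version a third time hits on all three steps, since the old layers were never removed from the dictionary.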

Conclusion

Now that you are familiar with how the Docker image cache works, next week we will discuss some strategies for making the most of it when working on your own Dockerfile.