19  Image-building best practices

19.1 Image layering

Did you know that you can look at what makes up an image? Using the docker image history command, you can see the command that was used to create each layer within an image.

  • Use the docker image history command to see the layers in the getting-started image you created earlier in the tutorial.
sudo docker image history getting-started

Each of the lines in the output represents a layer in the image. This view also lets you quickly see the size of each layer, which helps when diagnosing large images.

  • You’ll notice that several of the lines are truncated. If you add the --no-trunc flag, you’ll get the full output:
sudo docker image history --no-trunc getting-started
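
If you are mainly interested in layer sizes, the --format flag (which accepts a Go template) can trim the output down. For example, something like the following should print just the size and the creating command for each layer:
sudo docker image history --format "table {{.Size}}\t{{.CreatedBy}}" getting-started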

19.2 Layer caching

Now that you’ve seen the layering in action, there’s an important lesson to learn to help decrease build times for your container images.

Once a layer changes, all downstream layers have to be recreated as well

Let’s look at the Dockerfile we were using one more time…

# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN yarn install --production
CMD ["node", "src/index.js"]

Going back to the image history output, we see that each command in the Dockerfile becomes a new layer in the image. You might remember that when we made a change to the image, the yarn dependencies had to be reinstalled. Is there a way to fix this? It doesn’t make much sense to ship around the same dependencies every time we build, right?

To fix this, we need to restructure our Dockerfile to support caching of the dependencies. For Node-based applications, those dependencies are defined in the package.json file. So, what if we copied only that file in first, installed the dependencies, and then copied in everything else? Then, we would only recreate the yarn dependencies if there was a change to the package.json. Make sense?

  • Update the Dockerfile to copy in the package.json first, install dependencies, and then copy everything else in.
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --production
COPY . .
CMD ["node", "src/index.js"]
EXPOSE 3000
  • Next, we create a file named .dockerignore (in the same folder as the Dockerfile) and include the text node_modules. We use the terminal to accomplish this:
echo "node_modules" > .dockerignore

.dockerignore files are an easy way to selectively copy only image-relevant files. You can read more about them in the Dockerfile reference documentation.

In this case, the node_modules folder should be omitted from the second COPY step because otherwise it could overwrite files that were created by the command in the RUN step. For further details on why this is recommended for Node.js applications, as well as other best practices, have a look at the Node.js guide on Dockerizing a Node.js web app.
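
If your project folder contains other files the image doesn't need (version-control metadata, logs, and so on), you can list them in the same .dockerignore. The entries below beyond node_modules are purely illustrative; only add them if they actually exist in your project:

# keep host-installed dependencies out of the build context
node_modules
# illustrative extras (adjust to your project)
.git
npm-debug.log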

  • Build a new image using docker build.
sudo docker build -t getting-started .

You should see output like this…

[+] Building 15.4s (10/10) FINISHED                                
 => [internal] load build definition from Dockerfile          0.2s
 => => transferring dockerfile: 182B                          0.0s
 => [internal] load .dockerignore                             0.3s
 => => transferring context: 53B                              0.0s
 => [internal] load metadata for docker.io/library/node:18-a  0.0s
 => [internal] load build context                             0.2s
 => => transferring context: 3.22kB                           0.2s
 => [1/5] FROM docker.io/library/node:18-alpine               0.0s
 => CACHED [2/5] WORKDIR /app                                 0.0s
 => [3/5] COPY package.json yarn.lock ./                      0.7s
 => [4/5] RUN yarn install --production                      12.2s
 => [5/5] COPY . .                                            0.5s
 => exporting to image                                        1.5s
 => => exporting layers                                       1.5s
 => => writing image sha256:19dfa47bc86d7dceaa01a048c43a7dbc  0.0s
 => => naming to docker.io/library/getting-started            0.0s

You’ll see that all layers were rebuilt. That’s perfectly fine, since we changed the Dockerfile quite a bit.

  • Now, make a change to the src/static/index.html file (for example, change the <title> in line 11 to say “The Awesome Todo App”).
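If you'd rather make that edit from the terminal, a one-liner along these lines should work (this assumes GNU sed and that the <title> element sits on a single line of the file):
sed -i 's|<title>.*</title>|<title>The Awesome Todo App</title>|' src/static/index.html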

  • Build the Docker image again using sudo docker build -t getting-started . This time, your output should look a little different.

 => [internal] load build definition from Dockerfile          0.2s
 => => transferring dockerfile: 182B                          0.0s
 => [internal] load .dockerignore                             0.3s
 => => transferring context: 53B                              0.0s
 => [internal] load metadata for docker.io/library/node:18-a  0.0s
 => [1/5] FROM docker.io/library/node:18-alpine               0.0s
 => [internal] load build context                             0.0s
 => => transferring context: 3.45kB                           0.0s
 => CACHED [2/5] WORKDIR /app                                 0.0s
 => CACHED [3/5] COPY package.json yarn.lock ./               0.0s
 => CACHED [4/5] RUN yarn install --production                0.0s
 => [5/5] COPY . .                                            0.3s
 => exporting to image                                        0.1s
 => => exporting layers                                       0.1s
 => => writing image sha256:a85579d9ee5192bc7593ad0b2f263d25  0.0s
 => => naming to docker.io/library/getting-started            0.0s

First off, you should notice that the build was MUCH faster! And, you’ll see that several steps are using previously cached layers. So, yes! We’re using the build cache. Pushing and pulling this image and updates to it will be much faster as well.
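
If you ever want to compare against a completely fresh build, you can tell Docker to ignore the cache with the --no-cache flag; expect the timings to look much more like the very first build:
sudo docker build --no-cache -t getting-started .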

19.3 Next steps

By understanding a little bit about the structure of images, you can build images faster and ship fewer changes. Scanning images gives you confidence that the containers you are running and distributing are secure.

In the next section, you’ll learn about additional resources you can use to continue learning about containers.