Using Docker and Yarn for Development
Why Develop Using Docker
Provisioning a Docker container is cheap. You can use it to set up a development environment that closely mirrors your production environment — where each service runs in its own VM. It is also great for having a consistent development environment across your entire team and for new developers to get started on your stack.
Why Docker and Yarn
Dependencies are ever changing during development. So as not to compromise on productivity, rebuilding a docker image needs to be fast. Combining Docker with Yarn makes for particularly speedy builds.
There has been some discussion on how to make Yarn play nice with Docker. The approaches are well documented here and here.
TL;DR
Here’s a basic implementation for a Node development environment with Yarn for dependency management and nodemon to watch for code changes and restart the server.
We’ll use Docker Compose to define the run-time instructions (port mappings, bind mounts, entrypoints and commands) and keep them separate from the build instructions in the Dockerfile.
Dockerfile
FROM node:6.11.0
RUN curl -o- -L https://yarnpkg.com/install.sh | \
    bash -s -- --version 0.26.1
RUN yarn global add nodemon@1.11.0
RUN mkdir -p /usr/src/app
ADD .yarn_cache /usr/local/share/.cache/yarn/v1/
ADD ./package.json ./yarn.* /tmp/
RUN cd /tmp && yarn
RUN cd /usr/src/app && ln -s /tmp/node_modules
ADD . /usr/src/app/
EXPOSE 3000
WORKDIR /usr/src/app
docker-compose.yml
version: '3'
services:
  my-app:
    build: .
    ports:
      - 3000:3000
    volumes:
      - .:/usr/src/app
    entrypoint: ["sh", "-c"]
    command: ["cp /tmp/yarn.lock yarn.lock & nodemon server.js & if [ \"$$(tar -cf - /usr/local/share/.cache/yarn/v1 | crc32)\" != \"$$(tar -cf - .yarn_cache | crc32)\" ]; then cp -r /usr/local/share/.cache/yarn/v1/. .yarn_cache/; fi"]
.dockerignore
node_modules
.gitignore
.yarn_cache/*
!.yarn_cache/.gitkeep
In your project directory, create an empty directory with mkdir .yarn_cache and build your container using docker-compose build. Running is simple: docker-compose up. You can then use your favorite text editor/IDE to work on the source code and see the changes reflected immediately in your app, thanks to nodemon.
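The directory preparation can be wrapped in a small helper. This is only a sketch: the function name prepare_yarn_cache_dir is hypothetical, and creating the .gitkeep placeholder is an assumption inferred from the .gitignore rules above.

```shell
#!/bin/sh
# Hypothetical helper: prepare a project directory for the setup described above.
# $1 is the project directory (defaults to the current directory).
prepare_yarn_cache_dir() {
    project=${1:-.}
    # Create the directory that will receive the container's yarn cache.
    mkdir -p "$project/.yarn_cache"
    # Add an empty placeholder so Git can track the otherwise ignored directory.
    touch "$project/.yarn_cache/.gitkeep"
}
```

After calling prepare_yarn_cache_dir in the project root, docker-compose build and docker-compose up work as described.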
If you need to add new dependencies, just edit package.json directly and rebuild with docker-compose build.
Understanding the Docker Cache
Docker images are incredibly quick to build the second time round. This is because of the layered caching that Docker uses. Think of the Docker build process as creating a series of intermediate images, representing the state of the container at each instruction point. Each image has a checksum that will be compared during subsequent builds. If it tallies it means Docker will reuse the cache. If it doesn’t tally, Docker will invalidate all cached images from that point onward and generate new images.
Note: When calculating checksums, Docker looks at the files involved in ADD and COPY but only considers the command string for RUN.
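To get a feel for a file-based cache key, the idea can be imitated in plain shell. This is an analogy only: Docker computes its real cache key internally, and both the cksum utility and the function name manifest_key are stand-ins chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reduce the contents of the given files to one CRC.
# Files that do not exist (e.g. yarn.lock before the first build) are skipped.
manifest_key() {
    for f in "$@"; do
        [ -f "$f" ] && cat "$f"
    done | cksum | cut -d ' ' -f 1
}
```

Calling manifest_key package.json yarn.lock twice without touching either file prints the same number both times, which corresponds to a cache hit. Editing package.json changes the number, which corresponds to Docker invalidating that layer and every layer after it.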
Ideally, we only want to install node_modules when package.json or yarn.lock changes:
ADD ./package.json ./yarn.* /tmp/
RUN cd /tmp && yarn
RUN cd /usr/src/app && ln -s /tmp/node_modules
The above instructions copy package.json and yarn.lock from the host’s project directory into the container’s /tmp directory. The yarn.* glob makes copying yarn.lock optional, because the file doesn’t exist during the first build. The packages are then installed using yarn, and node_modules is symlinked into the container’s /usr/src/app directory.
We install in /tmp because we eventually want to bind mount the host’s project directory onto /usr/src/app so that we can have live code updates during development.
We can’t install in /usr/src/app directly because the bind mount ‘replaces’ it with your working directory, which does not contain node_modules, and you won’t be able to start your app.
Persisting the Yarn Cache
Yarn stores all installed packages in a global cache in your home directory. As Docker containers are ephemeral and data is lost between builds, having to rebuild node_modules from scratch without using the yarn cache would be a terrible crime. Hence, we need a way to extract the yarn cache from the container to be reused in the next build.
One possible approach would be to copy it out at run-time:
entrypoint: ["sh", "-c"]
command: ["cp /tmp/yarn.lock yarn.lock & nodemon server.js & if [ \"$$(tar -cf - /usr/local/share/.cache/yarn/v1 | crc32)\" != \"$$(tar -cf - .yarn_cache | crc32)\" ]; then cp -r /usr/local/share/.cache/yarn/v1/. .yarn_cache/; fi"]
Remember that empty .yarn_cache directory that we created on the host? We’re going to populate it with the container’s yarn cache. In the docker-compose.yml, we can define an entrypoint, which in this case starts a shell process, and supply a command to it. The command is kind of verbose but essentially it does the following:
- Tar the container’s yarn cache at /usr/local/share/.cache/yarn/v1
- Calculate its checksum using a cheap crc32
- Do the same for the host’s yarn cache
- If the checksums don’t match, replace the host’s cache with the container’s
The above is performed in parallel with copying out yarn.lock and starting nodemon.
The yarn cache will then be added during build-time, before the yarn install:
ADD .yarn_cache /usr/local/share/.cache/yarn/v1/
Some tweaks/improvements
You could persist the tar files (maybe also gzip them) and copy them instead of recursively copying out the directory.
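A minimal sketch of the archive idea, assuming a tar that supports the -z (gzip) flag. The function names pack_cache and unpack_cache are hypothetical, as are any paths you pass to them.

```shell
#!/bin/sh
# Hypothetical helper: pack the contents of directory $1 into gzip archive $2.
# -C makes the stored paths relative to the cache directory itself.
pack_cache() {
    tar -czf "$2" -C "$1" .
}

# Hypothetical helper: unpack gzip archive $1 into directory $2.
unpack_cache() {
    mkdir -p "$2"
    tar -xzf "$1" -C "$2"
}
```

Inside the container this might be called as pack_cache /usr/local/share/.cache/yarn/v1 .yarn_cache/cache.tgz, where the archive name is made up for this example. Since ADD unpacks local tar archives automatically, the build step could then add the archive rather than the directory, though that variation is untested here.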
Perhaps the run-time shell scripts can be extracted into a standalone file, which is the approach taken by Martino Fornasa.
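As a rough idea of what such a standalone file might contain, here is a sketch of the cache sync step as shell functions. It is not Martino Fornasa's script. It also swaps the tar and crc32 pipeline for find plus the POSIX cksum utility, so that the fingerprint depends only on file contents and relative paths. The names dir_fingerprint and sync_cache are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fingerprint directory $1 by file contents and relative paths.
# Sorting makes the result independent of directory listing order.
dir_fingerprint() {
    ( cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort | cksum | cut -d ' ' -f 1 )
}

# Hypothetical helper: copy the contents of $1 into $2 only when the
# fingerprints differ. Prints "copied" or "unchanged".
sync_cache() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    if [ "$(dir_fingerprint "$src")" != "$(dir_fingerprint "$dest")" ]; then
        cp -R "$src"/. "$dest"/
        echo "copied"
    else
        echo "unchanged"
    fi
}
```

The Compose command could then shrink to a call such as sync_cache /usr/local/share/.cache/yarn/v1 .yarn_cache, assuming the file defining these functions is available inside the container.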
Fin
Hope you found this article useful. Feel free to reach out to me if you have suggestions/questions. Cheers!