Using Docker and Yarn for Development

Shem Leong
Jun 16, 2017

Why Develop Using Docker

Provisioning a Docker container is cheap. You can use it to set up a development environment that closely mirrors your production environment, where each service runs in its own VM. It also gives your entire team a consistent development environment and makes it easy for new developers to get started on your stack.

Why Docker and Yarn

Dependencies change constantly during development. To avoid compromising productivity, rebuilding a Docker image needs to be fast. Combining Docker with Yarn makes for particularly speedy builds.

There has been some discussion on how to make Yarn play nicely with Docker, and the approaches are well documented.

TL;DR

Here’s a basic implementation for a Node development environment with Yarn for dependency management and nodemon to watch for code changes and restart the server.

We’ll use Docker Compose to define the run-time instructions (port mappings, bind mounts, entrypoints and commands) and keep them separate from the build instructions in the Dockerfile.

Dockerfile

FROM node:6.11.0

RUN curl -o- -L https://yarnpkg.com/install.sh | \
  bash -s -- --version 0.26.1
RUN yarn global add nodemon@1.11.0
RUN mkdir -p /usr/src/app
ADD .yarn_cache /usr/local/share/.cache/yarn/v1/
ADD ./package.json ./yarn.* /tmp/
RUN cd /tmp && yarn
RUN cd /usr/src/app && ln -s /tmp/node_modules
ADD . /usr/src/app/
EXPOSE 3000
WORKDIR /usr/src/app

docker-compose.yml

version: '3'
services:
  my-app:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - .:/usr/src/app
    entrypoint: ["sh", "-c"]
    command: ["cp /tmp/yarn.lock yarn.lock & nodemon server.js & if [ \"$$(tar -cf - /usr/local/share/.cache/yarn/v1 | crc32)\" != \"$$(tar -cf - .yarn_cache | crc32)\" ]; then cp -r /usr/local/share/.cache/yarn/v1/. .yarn_cache/; fi"]

.dockerignore

node_modules

.gitignore

.yarn_cache/*
!.yarn_cache/.gitkeep

In your project directory, create an empty directory with mkdir .yarn_cache and build your image using docker-compose build. Running is simple: docker-compose up. You can then use your favorite text editor or IDE to work on the source code and see the changes reflected immediately in your app, thanks to nodemon.
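The setup steps above could be wrapped in two small shell helpers. This is only a sketch: the function names init_yarn_cache and dev_up are hypothetical, and the .gitkeep file is created only because the .gitignore shown earlier whitelists it.

```shell
#!/bin/sh
# Hypothetical helpers for the setup steps described in the article.

# Create the empty host-side cache directory inside project directory $1,
# plus the .gitkeep placeholder that the .gitignore above keeps in Git.
init_yarn_cache() {
  mkdir -p "$1/.yarn_cache" && touch "$1/.yarn_cache/.gitkeep"
}

# Build the image, then start the service defined in docker-compose.yml.
# Assumes it is run from the project directory.
dev_up() {
  init_yarn_cache . && docker-compose build && docker-compose up
}
```

Calling dev_up from the project root would then run all three steps in order.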

If you need to add new dependencies, just edit package.json directly and rebuild with docker-compose build.

Understanding the Docker Cache

Docker images are incredibly quick to build the second time round. This is because of the layered caching that Docker uses. Think of the Docker build process as creating a series of intermediate images, representing the state of the container at each instruction point. Each image has a checksum that will be compared during subsequent builds. If it matches, Docker reuses the cached image. If it doesn’t, Docker invalidates all cached images from that point onward and generates new ones.

Note: When calculating checksums, Docker looks at the files involved in ADD and COPY but only considers the command string for RUN.
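As a rough mental model of that rule, the shell sketch below derives a cache key for each build step from its parent's key. It is a toy illustration, not Docker's real algorithm: run_key and add_key are hypothetical names, and cksum merely stands in for whatever checksums Docker computes internally.

```shell
#!/bin/sh
# Toy model of layer cache keys (an illustration, not Docker's implementation).

# Key for a RUN step: depends only on the parent key ($1)
# and the command string ($2).
run_key() {
  printf '%s\n' "$1" "RUN $2" | cksum | cut -d' ' -f1
}

# Key for an ADD/COPY step: depends on the parent key ($1), the file path ($2)
# and a checksum of that file's contents.
add_key() {
  content_sum=$(cksum < "$2" | cut -d' ' -f1)
  printf '%s\n' "$1" "ADD $2" "$content_sum" | cksum | cut -d' ' -f1
}
```

In this model, editing package.json changes the key of its ADD step, and because every later key is derived from its parent, the following RUN steps get new keys too even though their command strings are unchanged.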

Ideally, we only want to install node_modules when package.json and yarn.lock change:

ADD ./package.json ./yarn.* /tmp/
RUN cd /tmp && yarn
RUN cd /usr/src/app && ln -s /tmp/node_modules

The above instructions copy package.json and yarn.lock from the host’s project directory into the container’s /tmp directory. The yarn.* wildcard makes copying yarn.lock optional, because the file doesn’t exist during the first build. The packages are then installed using yarn, and node_modules is symlinked into the container’s /usr/src/app directory.

We install in /tmp because we eventually want to bind mount the host’s project directory onto /usr/src/app so that we can have live code updates during development.

We can’t install in /usr/src/app directly because the bind mount ‘replaces’ it with your working directory, which does not contain node_modules, so you wouldn’t be able to start your app.

Persisting the Yarn Cache

Yarn stores all installed packages in a global cache in your home directory. As Docker containers are ephemeral and data is lost between builds, having to rebuild node_modules from scratch without using the yarn cache would be a terrible crime. Hence, we need a way to extract the yarn cache from the container to be reused in the next build.

One possible approach would be to copy it out at run-time:

    entrypoint: ["sh", "-c"]
    command: ["cp /tmp/yarn.lock yarn.lock & nodemon server.js & if [ \"$$(tar -cf - /usr/local/share/.cache/yarn/v1 | crc32)\" != \"$$(tar -cf - .yarn_cache | crc32)\" ]; then cp -r /usr/local/share/.cache/yarn/v1/. .yarn_cache/; fi"]

Remember that empty .yarn_cache directory that we created on the host? We’re going to populate it with the container’s yarn cache. In the docker-compose.yml, we can define an entrypoint, which in this case starts a shell process, and supply a command to it. The command is kind of verbose but essentially it does the following:

  • Tar the container’s yarn cache /usr/local/share/.cache/yarn/v1
  • Calculate its checksum using a cheap crc32
  • Do the same for the host’s yarn cache
  • If the checksums don’t match, replace host’s cache with the container’s

The above is performed in parallel with copying out yarn.lock and starting nodemon.
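The compare-then-copy steps above can also be sketched as standalone shell functions. Note the assumptions: dir_checksum and sync_cache are hypothetical names, and instead of tar piped into crc32 this variant runs the POSIX cksum utility over each file, so the result depends only on relative paths and file contents.

```shell
#!/bin/sh
# Sketch: copy the container's Yarn cache out only when it differs
# from the host copy. Swaps tar | crc32 for per-file cksum.

# Print one checksum covering the relative path and contents
# of every file under directory $1.
dir_checksum() {
  (cd "$1" && find . -type f -exec cksum {} + | LC_ALL=C sort | cksum)
}

# Copy the contents of $1 (container cache) into $2 (host cache)
# if their checksums differ. Like the original cp -r, this adds and
# overwrites files but never deletes anything from $2.
sync_cache() {
  mkdir -p "$2"
  if [ "$(dir_checksum "$1")" != "$(dir_checksum "$2")" ]; then
    cp -r "$1/." "$2/"
  fi
}
```

Inside the container this would be invoked as sync_cache /usr/local/share/.cache/yarn/v1 .yarn_cache, in place of the inline if statement.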

The yarn cache will then be added during build-time, before the yarn install:

ADD .yarn_cache /usr/local/share/.cache/yarn/v1/

Some tweaks/improvements

You could persist the tar files (maybe also gzip them) and copy those instead of recursively copying out the directory.
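A minimal sketch of that idea, assuming hypothetical helper names pack_cache and unpack_cache and an archive path of your choosing:

```shell
#!/bin/sh
# Sketch: persist the Yarn cache as a single gzipped tarball.

# Archive the contents of cache directory $1 into the gzipped tarball $2.
# Paths are stored relative to $1, so the archive can be unpacked anywhere.
pack_cache() {
  tar -czf "$2" -C "$1" .
}

# Unpack tarball $1 into directory $2, creating the directory if needed.
unpack_cache() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

On the build side, Docker's ADD instruction unpacks a local gzipped tar archive into the destination directory automatically, so the Dockerfile would probably not need an explicit unpack step.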

Perhaps the run-time shell scripts can be extracted into a standalone file, which is the approach taken by Martino Fornasa.

Fin

Hope you found this article useful. Feel free to reach out to me if you have suggestions/questions. Cheers!
