This blog post will help you understand the differences between two similar Dockerfile instructions – ADD and COPY – how they became what they are today, and our recommendation on which instruction you should use. (Hint: It's not ADD)
When building Docker images from a Dockerfile you have two instructions you can choose from to add directories/files to your image: ADD and COPY. Both instructions follow the same basic form and accomplish pretty much the same thing:
ADD <src>... <dest>
COPY <src>... <dest>
In both cases, directories or files (the <src>) are copied and added to the filesystem of the container at the specified <dest> path.
So if both instructions are equivalent, why do they both exist and which one should you use? Read on to find out.
If you're not interested in the nuances of ADD and COPY and just want an answer to "which one should I use?", all you need to know is: use COPY.
In the Beginning...
Unlike the COPY instruction, ADD was part of Docker from the beginning and supports a few additional tricks beyond simply copying files from the build context.
The ADD instruction allows you to use a URL as the <src> parameter. When a URL is provided, a file is downloaded from the URL and copied to the <dest>:
ADD http://foo.com/bar.go /tmp/main.go
The file above will be downloaded from the specified URL and added to the container's filesystem at /tmp/main.go. Another form allows you to simply specify the destination directory for the downloaded file:
ADD http://foo.com/bar.go /tmp/
Because the <dest> argument ends with a trailing slash, Docker will infer the filename from the URL and add it to the specified directory. In this case, a file named /tmp/bar.go will be added to the container's filesystem.
Another feature of ADD is the ability to automatically unpack compressed files. If the <src> argument is a local file in a recognized compression format (tar, gzip, bzip2, etc.) then it is unpacked at the specified <dest> in the container's filesystem.
ADD /foo.tar.gz /tmp/
The command above would result in the contents of the foo.tar.gz archive being unpacked into the container's /tmp directory.
Interestingly, the URL download and archive unpacking features cannot be used together. Any archives copied via URL will NOT be automatically unpacked.
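To make the difference concrete, here is a minimal sketch. It assumes a foo.tar.gz file in the local build context and a hypothetical remote copy of the same archive at http://foo.com/foo.tar.gz; the base image is also just an assumption chosen for illustration.

```dockerfile
FROM debian:stable-slim

# Local archive from the build context: its contents are unpacked under /opt/local/.
ADD foo.tar.gz /opt/local/

# Remote archive (hypothetical URL): downloaded as-is to /opt/remote/foo.tar.gz.
# It is NOT unpacked, even though it is the same kind of file.
ADD http://foo.com/foo.tar.gz /opt/remote/
```

After a build like this, /opt/local/ should hold the extracted files, while /opt/remote/ should hold a single, still-compressed foo.tar.gz.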
Too Much Magic
Clearly, there is a lot of functionality behind the simple ADD instruction. While this makes ADD quite versatile it does NOT make it particularly predictable. Here's a quote from an issue that was logged against the ADD command back in December of 2013:
Currently the ADD command is IMO far too magical. It can add local and remote files. It will sometimes untar a file and it will sometimes not untar a file. If a file is a tarball that you want to copy, you accidentally untar it. If the file is a tarball in some unrecognized compressed format that you want to untar, you accidentally copy it. - amluto
The consensus seemed to be that ADD tried to do too much and was confusing to the user. Obviously, no one wanted to break backward compatibility with existing usage of ADD, so it was decided that a new instruction would be added which behaved more predictably.
Like ADD, But Less
COPY doesn't support URLs as a <src> argument so it can't be used to download files from remote locations. Anything that you want to COPY into the container must be present in the local build context.
Also, COPY doesn't give any special treatment to archives. If you COPY an archive file it will land in the container exactly as it appears in the build context without any attempt to unpack it.
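As a rough illustration of that behavior, the sketch below assumes a foo.tar.gz archive and a src/ directory in the build context (both names, and the base image, are hypothetical):

```dockerfile
FROM debian:stable-slim

# The archive is copied unchanged to /tmp/foo.tar.gz; nothing is unpacked.
COPY foo.tar.gz /tmp/

# Ordinary files and directories from the build context are copied the same way.
COPY src/ /app/src/
```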
COPY is really just a stripped-down version of ADD that aims to meet the majority of the "copy-files-to-container" use cases without any surprises.
Which to Use?
In case it isn't obvious by now, the recommendation from the Docker team is to use COPY in almost all cases.
Really, the only reason to use ADD is when you have an archive file that you definitely want to have auto-extracted into the image. Ideally, ADD would be renamed to something like EXTRACT to really drive this point home (again, for backward-compatibility reasons, this is unlikely to happen).
OK, but what about fetching packages from remote URLs, isn't ADD still useful for that? Technically, yes, but in most cases you're probably better off RUNning a curl or wget. Consider the following example:
ADD http://foo.com/package.tar.bz2 /tmp/
RUN tar -xjf /tmp/package.tar.bz2 \
  && make -C /tmp/package \
  && rm /tmp/package.tar.bz2
Here we have an ADD instruction which retrieves a package from a URL, followed by a RUN instruction which unpacks it, builds it and then attempts to clean up the downloaded archive.
Unfortunately, since the package retrieval and the rm command are in separate image layers, we don't actually save any space in our final image (for a more detailed explanation of this phenomenon, see my Optimizing Docker Images article).
In this case you're better off doing something like this:
RUN mkdir -p /tmp/package \
  && curl http://foo.com/package.tar.bz2 \
  | tar -xjC /tmp/package \
  && make -C /tmp/package
Here we curl the package and pipe it right into the tar command for extraction. This way we aren't left with an archive file on the filesystem that we need to clean up.
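If the archive does need to exist on disk for a moment, the same layer reasoning suggests keeping the download, the build and the cleanup inside a single RUN, so the archive never ends up in a committed layer. The following is only a sketch: the base image and the package installation step are assumptions, the URL is the placeholder used above, and it presumes the archive unpacks its sources directly into the target directory.

```dockerfile
FROM debian:stable-slim

# Assumption: the tools used below are not in the base image, so install them first.
RUN apt-get update \
  && apt-get install -y --no-install-recommends curl bzip2 make \
  && rm -rf /var/lib/apt/lists/*

# Download, extract, build and delete the archive in one layer.
RUN mkdir -p /tmp/package \
  && curl -fsSL -o /tmp/package.tar.bz2 http://foo.com/package.tar.bz2 \
  && tar -xjf /tmp/package.tar.bz2 -C /tmp/package \
  && make -C /tmp/package \
  && rm /tmp/package.tar.bz2
```

Because the rm runs in the same RUN instruction that created the file, the archive is already gone when the layer is committed.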
There may still be valid reasons to ADD a remote file to your image, but that should be an explicit decision and not your default choice.
Ultimately, the rule is this: use COPY (unless you're absolutely sure you need ADD).