What if your application has been around a long time already? Is it too late for containers? Can you teach an old dog new tricks?

Yes you can! Before you throw in the towel and load up your fatty VM, let's consider what constraints 'containerizing' puts on the application. In this post, I will demonstrate containerizing a legacy application using Birdwatch, an application written by Matthias Nehlsen. His application was not designed for containers, but with a few alterations Birdwatch can be deployed and configured within a Docker container.

Step 1: Destructure

If the application relies on external services (e.g. a database, an nginx proxy, or a message queue), these services should now run within their own containers. Of course, this step is not unique to containers; it's a common step for any complex application. The Birdwatch application has two major components: the application itself, built using the Play framework, and the ElasticSearch index it relies on. Therefore, I will design the architecture using two containers: one for the application and one for ElasticSearch. This implementation has several advantages:

  • ElasticSearch could be leveraged by several applications
  • Each image can version independently
  • Proxies could be added to the architecture later

An alternative would be to create a single image which includes all components. While this may appear simpler, it has the same drawbacks as creating a monolithic black box - any alterations to a component will require a new image of the entire application. The larger the image, the larger the download, which will have a negative impact on deployment.

Step 2: Find reliable base images

A search on the Docker Registry will provide a list of possible images upon which to base the application. The difficulty is determining which to use. The community can star an image, which gives a rough measure of its popularity, and the Registry also shows the number of times an image has been downloaded. These measures are subjective, however, so ask the following questions of each candidate:

  • does the image provide a tag other than latest?
  • does the image provide a Dockerfile?
  • do the images have a common base?

Using an image with a tag provides version control for the application. The latest tag is a rolling version, so pulling the latest image may introduce a breaking change to the application. The Dockerfile is the complete recipe for the image, so it can be used to validate all the code and actions contained within it. Having a common base image will improve the development cycle because the images will share layers in the image cache. Read Working with the Docker image cache to learn more.

Our Birdwatch application needs an ElasticSearch image with version 1.1 or greater. Searching the Registry for 'elasticsearch' returned 236 results. The first entry is a trusted image and has a Dockerfile, but only provides a latest tag.
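The tag question above can be checked mechanically. The following is a minimal sketch, not part of the Birdwatch project: is_pinned is a hypothetical helper, and example/elasticsearch:1.1 is an illustrative image reference rather than a tag known to exist on the Registry.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when an image reference carries an
# explicit tag other than "latest" (for example "repo/name:1.1").
is_pinned() {
    ref=$1
    name=${ref##*/}              # last path component, so a registry host:port is ignored
    case "$name" in
        *:*) tag=${name##*:} ;;
        *)   return 1 ;;         # no tag given: Docker falls back to "latest"
    esac
    [ -n "$tag" ] && [ "$tag" != "latest" ]
}

# example/elasticsearch:1.1 is an illustrative reference, not a real image.
for image in dockerfile/elasticsearch dockerfile/elasticsearch:latest example/elasticsearch:1.1; do
    if is_pinned "$image"; then
        echo "$image: pinned to a fixed tag"
    else
        echo "$image: rolling (latest)"
    fi
done
```

A check like this could run before a deployment to flag any image reference that would silently follow latest.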

[Image: elastic search]

Birdwatch also requires the Play 2 framework. A Registry search returns only three candidates. Inspecting the FROM command in the Dockerfile of each result shows that none of them share a common base with our trusted ElasticSearch image. Initially, I tried the mzkrelx/playframework2-dev image but found its startup was slow. I finally settled on the reubenbond/playframework2 image. For an efficient production release of Birdwatch, I would consider creating a new Play framework image built FROM the common dockerfile/java image used by the trusted ElasticSearch image.

[Image: play framework]

Step 3: Manage Configuration

Applications require some configuration. This could include:

  • Log levels and location
  • Database location and credentials
  • Security information
  • Application settings

An application running in a container will need the same configuration. This configuration data could be sealed within the image, but a better approach is to provide it at runtime. Currently, there are three methods to provide runtime configuration.

Environment Variables

The first is using environment variables. All environment variables provided on the run command are exposed to the container. Therefore, any configuration that reads environment variables can be set on the run command. The Birdwatch application uses a set of configurations defined in an application.conf file. Assigning environment variables to these items is the only step required to make them configurable.

twitter.consumer.key=${TWITTER_KEY}
twitter.consumer.secret=${TWITTER_SECRET}
twitter.accessToken.key=${TWITTER_ACCESS_KEY}
twitter.accessToken.secret=${TWITTER_ACCESS_SECRET}
application.topics=${BIRDWATCH_TOPICS}
application.users=${TWITTER_USERS}

Each of these must be set by a -e option on the docker run command.
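Because every one of these substitutions has to be supplied, it can help to verify them before starting the container. The following is a rough sketch under that assumption: missing_vars and env_flags are hypothetical helpers, and the variable names are copied from the application.conf excerpt above.

```shell
#!/bin/sh
# Variable names copied from the application.conf excerpt above.
REQUIRED_VARS="TWITTER_KEY TWITTER_SECRET TWITTER_ACCESS_KEY TWITTER_ACCESS_SECRET BIRDWATCH_TOPICS TWITTER_USERS"

# Hypothetical helper: print the name of every required variable that is
# unset or empty in the current shell.
missing_vars() {
    for name in $REQUIRED_VARS; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
        fi
    done
}

# Hypothetical helper: print one "-e NAME" flag per required variable.
# Without an "=value" part, docker run copies the value from the calling
# environment, so the variables must be exported there.
env_flags() {
    flags=""
    for name in $REQUIRED_VARS; do
        flags="$flags -e $name"
    done
    echo "${flags# }"
}

missing=$(missing_vars)
if [ -n "$missing" ]; then
    echo "not set:" $missing
else
    echo "all set; flags for docker run: $(env_flags)"
fi
```

Passing names only keeps the secret values out of the command line itself; they still have to be exported in the shell that invokes docker run.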

Links

Using multiple containers requires setting up a communication channel between them. Docker calls this linking. The links are presented to the container as environment variables using a specific naming convention. Our Birdwatch application is going to be linked to ElasticSearch using --link dockerfile_elasticsearch_latest:db. This link option creates several environment variables based on the db alias provided in the link. The TCP address and port are used in application.conf to build the URLs for the ElasticSearch container.

elastic.TweetURL="http://"${DB_PORT_9200_TCP_ADDR}":"${DB_PORT_9200_TCP_PORT}"/birdwatch_tech/tweets/"
elastic.LogURL="http://"${DB_PORT_9200_TCP_ADDR}":"${DB_PORT_9200_TCP_PORT}"/logstash-"
elastic.PercolatorURL="http://"${DB_PORT_9200_TCP_ADDR}":"${DB_PORT_9200_TCP_PORT}"/persistent_searches/tweets/_percolate/"
elastic.PercolationQueryURL="http://"${DB_PORT_9200_TCP_ADDR}":"${DB_PORT_9200_TCP_PORT}"/persistent_searches/.percolator/"
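To make the naming convention concrete, here is a small simulation of the variables a link with the db alias defines for one exposed TCP port. It does not call Docker; link_env is a hypothetical helper, and the address 172.17.0.2 is an illustrative value, since the real address is assigned by Docker when the container starts.

```shell
#!/bin/sh
# Hypothetical helper: print the per-port variables that a --link with the
# given alias defines for one exposed TCP port.
# Usage: link_env ALIAS ADDRESS PORT
link_env() {
    alias_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    addr=$2
    port=$3
    prefix="${alias_upper}_PORT_${port}_TCP"
    echo "${prefix}=tcp://${addr}:${port}"
    echo "${prefix}_ADDR=${addr}"
    echo "${prefix}_PORT=${port}"
    echo "${prefix}_PROTO=tcp"
}

# 172.17.0.2 is an illustrative address, not one taken from the article.
link_env db 172.17.0.2 9200

# Load the simulated variables and assemble a URL the same way the
# elastic.TweetURL entry in application.conf does.
eval "$(link_env db 172.17.0.2 9200)"
echo "http://${DB_PORT_9200_TCP_ADDR}:${DB_PORT_9200_TCP_PORT}/birdwatch_tech/tweets/"
```

The alias chosen on the --link option is what fixes the DB_ prefix, so renaming the alias means renaming the substitutions in application.conf as well.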

Entrypoint Command

The final method for providing runtime configuration is through parameters to the Dockerfile entrypoint. Our Birdwatch application did not use this configuration process.
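For completeness, here is a sketch of what an entrypoint-based approach might look like. Everything in it is hypothetical: the script, the to_system_props helper, and the assumption that the bin/birdwatch launcher accepts -Dkey=value arguments are illustrations, not something the Birdwatch image does.

```shell
#!/bin/sh
# Hypothetical entrypoint script. When an image declares it as its
# ENTRYPOINT, arguments given after the image name on docker run
# arrive here as "$@".

# Hypothetical helper: turn each key=value argument into a -Dkey=value
# JVM system property and skip anything else.
to_system_props() {
    props=""
    for arg in "$@"; do
        case "$arg" in
            *=*) props="$props -D$arg" ;;
            *)   echo "ignoring argument without '=': $arg" >&2 ;;
        esac
    done
    echo "${props# }"
}

# Dry run: print the command instead of exec'ing it.
# Assumes the launcher accepts -D options, which is not confirmed here.
echo "would exec: bin/birdwatch $(to_system_props "$@")"
```

With such a script in place, a setting could be passed as a trailing key=value argument on the run command instead of a -e option.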

Step 4: Logging

The current best practice for logging is to write to standard output. The journal and container logs capture this data, which can be viewed in case of an application failure. This subject is far from concluded; many discussions have been circling around the ability of a container to provide logging data. The Birdwatch application used system logging and a log file, so no alterations to the codebase were needed to access its log data.

Step 5: Injecting Code

A Dockerfile must be used to create an image. This file outlines all the steps required to build the image. This article will not discuss the optimization of images; a great discussion about image optimization can be found here.

The key command for adding code is the ADD command. It has two parameters: a src and a dest. The src parameter is a path that is valid at build time, and the dest parameter is an absolute path within the container. If you want to ignore files or directories when adding files, place a .dockerignore file in the root directory. It works similarly to a .gitignore file. Once you have an image created, it can be added to the Docker Hub.

Our Birdwatch application has a pretty involved build process. Each of the UI frameworks needs to be built first, and then the application can be constructed. This could be handled in a script executed on build, but for this demonstration I chose to run those steps outside of the docker build process and only add the resulting compiled code. I added a .dockerignore file to ignore the .git and logs directories:

.git
logs


The Dockerfile to assemble the image:

FROM reubenbond/playframework2
MAINTAINER centurylink

# Copy the application files.
ADD ./target/universal/stage /var/birdwatch
WORKDIR /var/birdwatch
RUN chmod +x bin/birdwatch

# Start the application.
EXPOSE 9000
CMD bin/birdwatch < /dev/zero
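Since the compiled code is produced outside of the docker build, the ADD step only works if the staged output already exists. The following is a minimal preflight sketch under that assumption: stage_ready is a hypothetical helper, the path comes from the ADD line above, and example/birdwatch is an illustrative image name.

```shell
#!/bin/sh
# Path taken from the ADD line in the Dockerfile above.
STAGE_DIR=${STAGE_DIR:-./target/universal/stage}

# Hypothetical preflight check: succeed only when the staged launcher
# that the Dockerfile expects is present.
stage_ready() {
    [ -f "$1/bin/birdwatch" ]
}

if stage_ready "$STAGE_DIR"; then
    # example/birdwatch is an illustrative image name.
    echo "stage found; ready for: docker build -t example/birdwatch ."
else
    echo "no launcher under $STAGE_DIR; run the application build first"
fi
```

Running a check like this before the build gives a clearer message than a failed ADD step.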



The final image is on the Docker Hub: Birdwatch Image

Running with Docker

docker run -d --name dockerfile_elasticsearch_latest dockerfile/elasticsearch

docker run -p 9000:9000 \
  -e "BIRDWATCH_TOPICS=docker,coreos,centurylink,clojure,clojurescript,panamax" \
  -e "TWITTER_KEY=[KEY]" -e "TWITTER_SECRET=[SECRET]" \
  -e "TWITTER_ACCESS_KEY=[KEY]" -e "TWITTER_ACCESS_SECRET=[SECRET]" \
  -e "TWITTER_USERS=2384071,15358364" \
  --link dockerfile_elasticsearch_latest:db argvader/birdwatc


Install via CenturyLink Panamax

Search for Birdwatch and click install.

Conclusion

We have shown with the Birdwatch application that it is possible to 'containerize' an application after it has completed its development cycles. With some knowledge of how Docker exposes linked containers, and by using environment variables, we also provided runtime controls to the application. Now the Birdwatch application can be deployed and configured for any topic using any Twitter credentials. No more excuses. Go forth and containerize!