Ben Firshman, Product Manager at Docker, Inc.
Ben Firshman and his partner, Aanand Prasad, were the founders of Orchard. In the summer of 2014, Docker, Inc. acquired their two-man startup. Ben and Aanand became Docker employees with the mission to create great developer experiences for Docker users. Read more about it on the Docker blog.
Ben, Thank you for speaking with us. How are you doing?
I'm doing good, it's a pleasure to be here, thanks for inviting me as well.
So how was it to start a Docker-based startup and then be acquired by Docker?
It's quite flattering, I guess to sort of know you are doing the right thing, it's been very exciting. One of the main reasons we joined is we just started talking to each other and realized we actually had the same goals in mind, we wanted to build really great open source tools for orchestrating containers. They seemed to line up pretty well with what we had in mind, so we thought "hey, we should team up" and do this together.
How long had you been running Orchard before you got acquired?
It's a little-known fact that we started Orchard about six months before we started using Docker, I think. We were sort of trying to solve this problem of how do you run development environments and get them through your app lifecycle and get them into production. We tried loads of technologies, sort of experimenting with containerization with really small VMs and things like that. And then Docker came along and we sort of realized, "oh, this is the thing". This was obviously getting traction and we should jump on this bandwagon because it is the right technology. It turned out to be the right decision. We picked Docker in about September 2013 and then we got acquired in August, so it was just under a year.
So people might notice you have an accent, you actually live in the UK, right?
Yep, that is correct. Orchard turned into what is now the Docker London office. We have a little office there where we are building little bits of orchestration tooling.
How many people are in the London office?
It's only three of us for now, we are growing and building the team.
How many people are at Docker now?
Docker is huge. And it sort of grows every week. Last count I heard was 130 I think. Most of that is based in San Francisco, we've got a bunch of people on the east coast, a bunch of people in Europe and a little team in London, as well.
So what are you working on, what drives you and motivates you? What is exciting?
I want to make really great tools for developers and admins to deploy their stuff. That is from doing that job myself in companies where I had both developer and sysadmin hats. And from watching people do this and it just kind of being a pain at the moment. Lots of tools which are doing really good jobs but there are sort of some missing bits in there. What we are trying to do is to fill some of those holes and trying to create a really great user experience, particularly for developers running development environments and stuff like that.
So Docker orchestration is probably one of the most difficult parts of moving into a Docker-ized environment. Figuring out how to stitch a bunch of containers together, how to stitch them across hosts. It's easy to set up a container that has everything in it, put your MySQL and your Apache together, but trying to stitch together a complex system, that's not so easy. Right now there are tools like Mesos out there; can you explain to somebody who isn't familiar with the details of those tools what the big differentiators are between Mesos and the Docker orchestration that is being built out right now?
There are a lot of similarities, obviously. Mesos was kind of the first thing that appeared, 2009/2010 I think. It was being built as an internal tool at Twitter to run Twitter and kind of inspired by some of the papers Google had published on their Borg system. So it didn't use any containerization at all, it was pre-containerization hype. It was intended to run processes on machines that were set up in the correct environment by things like Puppet and Chef. So it worked very well if you had an application where you could predict its environment.
Twitter ran a lot of Java apps as well, which was really easy because you just need Java installed on the machine. As long as you live in the JVM world it's a bit like having Docker containers. One of the things we are also building internally is a thing called "Swarm".
What we are trying to do with Swarm is sort of say "there are a lot of tools appearing that let you schedule containers and other things onto clusters" and what we are saying is "you don't have to choose between technologies". What we are doing is building a thing that on one end speaks what is called the Docker Remote API, which is the thing on your computer that Docker uses to actually run containers. And exposing that API to a thing that can run containers on lots and lots of hosts but still talk the same API. That can be backed by a simple scheduler that we've included in Swarm if you want to get started really quickly. And it can also be backed by Mesos.
So, the idea is that you can use all this Docker tooling you already have so things like your Docker client, tools like Shipyard, things like Compose (which is another of our tools that I can talk about in a bit) and you don't have to modify these but you can use them on top of Mesos or your laptop as well, depending on what environment you have. You don't have to write it specifically for these environments.
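To illustrate the idea of tooling that talks to one API regardless of what sits behind it, here is a minimal Go sketch. It is not Swarm's code and does not call the Docker Remote API: the `Engine` interface, `LocalEngine`, `ClusterEngine`, and `Deploy` are all hypothetical names invented for this example, and the round-robin placement is just a placeholder for a real scheduler.

```go
package main

import "fmt"

// Engine is a hypothetical stand-in for "anything that speaks the same
// container API". It is NOT the real Docker Remote API.
type Engine interface {
	Run(image string) (string, error)
}

// LocalEngine models a single Docker host, such as a laptop.
type LocalEngine struct{}

// Run pretends to start a container and returns its name.
func (LocalEngine) Run(image string) (string, error) {
	if image == "" {
		return "", fmt.Errorf("image name is required")
	}
	return image, nil
}

// ClusterEngine models a cluster front end over several nodes.
// Round-robin placement is an assumption made only for this sketch.
type ClusterEngine struct {
	Nodes []string
	next  int
}

// Run places the container on the next node and prefixes the returned
// name with the node it was allocated to.
func (c *ClusterEngine) Run(image string) (string, error) {
	if image == "" {
		return "", fmt.Errorf("image name is required")
	}
	if len(c.Nodes) == 0 {
		return "", fmt.Errorf("cluster has no nodes")
	}
	node := c.Nodes[c.next%len(c.Nodes)]
	c.next++
	return node + "/" + image, nil
}

// Deploy plays the role of existing tooling: it only knows about Engine,
// so it works unchanged against a laptop or a cluster.
func Deploy(e Engine, images ...string) ([]string, error) {
	var names []string
	for _, img := range images {
		name, err := e.Run(img)
		if err != nil {
			return nil, err
		}
		names = append(names, name)
	}
	return names, nil
}

func main() {
	local, _ := Deploy(LocalEngine{}, "nginx", "redis")
	cluster, _ := Deploy(&ClusterEngine{Nodes: []string{"swarm-master", "swarm-01"}}, "nginx", "redis")
	fmt.Println("local:  ", local)
	fmt.Println("cluster:", cluster)
}
```

The point of the sketch is that `Deploy` never changes: only the value passed in as `Engine` does, which mirrors the "you don't have to modify these" claim above.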
One of the things I am excited about is that you offered to actually show us some of the stuff. Would you be able to do a demo of Swarm?
[Ben demos Docker Swarm by sharing his screen]. Here is a Swarm that I prepared earlier. We have a tool called "Machine", you gave a nice description of Orchard at the start, which was that it was a service for getting a Docker host started very quickly so you could run Docker containers without worrying about computers and things like that. Orchard has sort of turned into a thing called "Machine" now, which is just a way of really easily getting started with Docker so you can create Docker hosts easily for running containers. You can create them on VirtualBox or you can create them on Digital Ocean. Machine has an option called "--swarm", which lets you create Swarms.
If you are interested in how to set up Swarms there is some really good material on this in Machine's documentation. [Ben shows 'docker-machine ls'] You can see some of the machines I created earlier. There is a node called "swarm-master" which exposes the Docker Remote API and sort of controls your cluster. And also another one called "swarm-01", which is a node, and you can imagine this could be any number of Swarm nodes. So we have two nodes, we could create "swarm-02" and "swarm-03" and grow our cluster that way.
[Ben sets up the environment by setting environment variables in the shell pointing to the Swarm] Now what I can do is start running Docker commands against this. [Ben runs 'docker info'] This is a simple command that gives information about your Docker engine. We can see that this is kind of a special Docker engine where it says "I'm a Docker engine that has 2 nodes", which is "swarm-master" and "swarm-01". If I run containers against this Swarm cluster, let's say running nginx, which is an official Docker image... [Ben runs 'docker run -d -p80:80 nginx']
If I run nginx in the background, and I can expose a port here, what this is doing is that Swarm has received this API request from the Docker client to start up nginx. It then picked one of our nodes which has spare capacity and it has started nginx on it.
Is the logic for how it picks the backend something an everyday developer could modify?
Yes, there are a bunch of ways of configuring that logic that you can use. You can configure various strategies for Swarm. So Swarm can either try to binpack containers where there's enough capacity so it tries to fill up your nodes one by one. We can also just pick stuff randomly, or it can do a smart balance of that where it tries to balance it evenly but also picks nodes where there is some spare space. If you're feeling really adventurous you can also write your own strategies in Go and compile your own version of Swarm with those strategies.
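To make the three strategies Ben describes more concrete, here is a minimal Go sketch of binpack, random, and balanced placement. It is an illustration under stated assumptions, not Swarm's actual strategy interface: the `Node` type, the slot-based notion of capacity, and the function names `Binpack`, `Balanced`, and `Random` are all hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Node is a hypothetical, simplified view of a cluster node: a name plus
// how many container slots it has and how many are already in use.
type Node struct {
	Name     string
	Capacity int
	Used     int
}

// Strategy picks a node for a new container, or reports false if no
// node has spare capacity.
type Strategy func(nodes []Node) (string, bool)

// free returns only the nodes that still have at least one spare slot.
func free(nodes []Node) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Used < n.Capacity {
			out = append(out, n)
		}
	}
	return out
}

// Binpack fills nodes one by one: among nodes with room, it prefers the
// one with the fewest free slots. Ties go to the earlier node.
func Binpack(nodes []Node) (string, bool) {
	candidates := free(nodes)
	if len(candidates) == 0 {
		return "", false
	}
	best := candidates[0]
	for _, n := range candidates[1:] {
		if n.Capacity-n.Used < best.Capacity-best.Used {
			best = n
		}
	}
	return best.Name, true
}

// Balanced spreads load: it prefers the node with the most free slots.
// Ties go to the earlier node.
func Balanced(nodes []Node) (string, bool) {
	candidates := free(nodes)
	if len(candidates) == 0 {
		return "", false
	}
	best := candidates[0]
	for _, n := range candidates[1:] {
		if n.Capacity-n.Used > best.Capacity-best.Used {
			best = n
		}
	}
	return best.Name, true
}

// Random picks any node that still has room.
func Random(nodes []Node) (string, bool) {
	candidates := free(nodes)
	if len(candidates) == 0 {
		return "", false
	}
	return candidates[rand.Intn(len(candidates))].Name, true
}

func main() {
	nodes := []Node{
		{Name: "swarm-master", Capacity: 4, Used: 1},
		{Name: "swarm-01", Capacity: 4, Used: 3},
	}
	strategies := []struct {
		name string
		pick Strategy
	}{
		{"binpack", Binpack},
		{"balanced", Balanced},
		{"random", Random},
	}
	for _, s := range strategies {
		node, ok := s.pick(nodes)
		fmt.Println(s.name, "->", node, ok)
	}
}
```

With the two example nodes, `Binpack` chooses "swarm-01" (the fuller node that still has room) and `Balanced` chooses "swarm-master" (the emptier one); `Random` may return either.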
And those Swarm nodes they don't need to be on the same physical network, right?
Yes, correct. As long as they are accessible by each other then they don't need to be on the same network.
Through TCP, UDP, through a particular port?
Through TCP on the standard Docker port and it is secured by TLS as well, in the same way the Docker engine can be secured by TLS. [Ben runs 'docker ps'] And we can all see it running. It's giving the IP address and we can also see its name. Normally this would just be a container name but it is also prefixed with the Swarm node it's been allocated to.
If you launched an Apache container or a MySQL container, could they talk to the nginx container? Can you link them together if they are across different hosts?
This is the one bit where Swarm doesn't help you yet. If you link two containers together, Docker has this feature where you can link two containers together so they can talk to each other essentially, Swarm will automatically schedule them onto the same node. Because links don't work across hosts yet.
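The co-scheduling rule described here (a container that links to another is placed on the same node as its link target) can be sketched in a few lines of Go. This is a toy model, not Swarm's implementation: the `Cluster` type, `NewCluster`, the `Run` method, and the round-robin placement of unlinked containers are all assumptions made for illustration.

```go
package main

import "fmt"

// Cluster is a hypothetical in-memory model that remembers which node
// each container was placed on.
type Cluster struct {
	nodes     []string
	placement map[string]string // container name -> node name
	next      int
}

// NewCluster builds a cluster model from a list of node names.
func NewCluster(nodes ...string) *Cluster {
	return &Cluster{nodes: nodes, placement: make(map[string]string)}
}

// Run places a container and returns the node it landed on.
// If linkTo is non-empty, the container is pinned to the node of the
// container it links to, because links do not work across hosts.
// Otherwise nodes are used in round-robin order (an assumption of this
// sketch, standing in for a real scheduling strategy).
func (c *Cluster) Run(name, linkTo string) (string, error) {
	if linkTo != "" {
		node, ok := c.placement[linkTo]
		if !ok {
			return "", fmt.Errorf("cannot link to unknown container %q", linkTo)
		}
		c.placement[name] = node
		return node, nil
	}
	if len(c.nodes) == 0 {
		return "", fmt.Errorf("cluster has no nodes")
	}
	node := c.nodes[c.next%len(c.nodes)]
	c.next++
	c.placement[name] = node
	return node, nil
}

func main() {
	c := NewCluster("swarm-master", "swarm-01")
	redisNode, _ := c.Run("redis", "")
	otherNode, _ := c.Run("nginx", "")
	webNode, _ := c.Run("web", "redis") // linked, so it follows redis
	fmt.Println("redis on", redisNode)
	fmt.Println("nginx on", otherNode)
	fmt.Println("web   on", webNode)
}
```

In this model the link is resolved at launch time, so "web" always ends up on whichever node already holds "redis", no matter where the next free slot would otherwise have been.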
So if you launch a container that is not on the same node and then you link it, will it move it? Or do you have to link it while you're launching it?
So, if you start up a container, then start up another container and link it to the first one, it will schedule it to the same node. It's kind of a hack because we know that's kind of a limitation, that's something that we haven't sorted out yet. If you want, you can roll this yourself: you can make sure they are accessible on the same network, like if you put them into the same Amazon EC2 network, for example, and use host networking and know what ports things are on, and that will work absolutely fine.
There's a bunch of other options with things like networking systems like Weave, which is a company building container networking. We are also working on this internally at Docker. A few months ago we acquired a company called SocketPlane and they were building networking for Docker and they are building some quite cool things there. We will have more to talk about soon but we are working hard on making networking work really well on Swarms.
What else can you do with Swarm, can you show us anything about Compose?
There is a thing called "Compose" which we built at Orchard, originally called "Fig", and this was for running development environments locally. [Ben shows the contents of a 'docker-compose.yml' file] It defines a web server and a Redis server and how they link together. If I run this against my local computer... [Ben runs 'docker-compose up'] This is really handy for development environments where you are working on an application and you just want to run a bunch of services together.
The neat thing about Compose is that it is just a Docker client. So, Compose speaks to anything that exposes the Docker Remote API, including Swarm. This demo doesn't work with this Compose file because, as we said before, it requires a link. If this application didn't require links and it was a single service then you could run it on top of Swarm. And those things do work together and there is some good documentation on Compose that explains how this works.
So what is the future of Docker?
That's a good question. There are kind of two things I am excited about. I am excited about all these bits of technology that are essentially making people's lives easier. Like Compose has made loads of developers' lives so much easier by being able to run development environments and I think we can do the same with Swarm as well for people who just want to get an application up and running really quickly on their cluster.
And when Compose and Swarm are working together, that all works so well. Then, as well as the little bits of technology that we're building, I'm also excited about Docker making building and deploying applications just work really well. Like being a really great user experience for developers and something that is not a pain to use to deploy applications, that is something I am excited about.
Swarm seems to do a lot of backend stuff for you, why would somebody use Swarm with something like Mesos? Why would you use both, it seems like Swarm has some of the functionality you would get out of Mesos... what would you gain by adding Mesos to Swarm?
Mesos specifically, the reason you might want to use Mesos is, we are quite open about the fact that we have only been building Swarm for a year and Mesos has been around for several years and runs Twitter. So if you are building on an enormous scale and still want to deploy Docker containers onto it then you might want to run your infrastructure on Mesos.
Also, for the case where you've already set up a Mesos cluster, we've talked to lots of people who already have big Mesos infrastructure set up and Mesos has built-in Docker support as well. They kind of want to keep their huge Mesos infrastructure, obviously, since there is a lot of investment to set up but also be able to run Docker native applications on top of it. That's where Swarm comes in where Swarm essentially sits between Docker world and Mesos world and can let you use all of your Mesos infrastructure with Docker tools.
It has been a pleasure to see all of this in action and have you on the podcast. Is there anything else you would like to share, any cool open source projects you've run into lately?
Cool open source projects, that's a good question. I'm sure there is... but nothing that is on the top of my mind. It's been a pleasure to talk to you as well.
Have you seen ImageLayers?
What's ImageLayers? Oh yeah, that is really cool, actually. Have you seen RancherOS?
Before we let you go can you tell us have you found any interesting open source projects lately?
One of the things I have seen lately is this company called Rancher Labs building an operating system called RancherOS. Which is a, it's a bit mad really in an exciting kind of engineering mad scientist way. This operating system which runs entirely on Docker. So PID 1 is the Docker engine and it essentially boots up your init system and all of your system services inside Docker containers. And, that's all it is.
How is that different from CoreOS for people who have heard about CoreOS but aren't familiar with it?
It's kind of similar to CoreOS but CoreOS uses a system called systemd and has a bunch of other processes running on the host system. Which is, honestly, probably a more sensible way to do it than RancherOS for a lot of things. But RancherOS is sort of experimenting with the idea of "what if absolutely everything is inside of Docker containers?" Everything you need on your operating system is inside Docker containers. It's really interesting, it's sort of an experiment at this stage but could be a glimpse of the future as well.