Running a Selenium Grid with docker-compose

At ContaAzul, we had 31 Windows machines powering our Selenium tests - one running the grid and 30 more running clients. Needless to say, this is very expensive.

Since we already use Docker to run our builds (on Shippable), we decided to try it for our Selenium tests too.

It was no surprise that the Selenium folks had already made a ready-to-go set of Docker images. There is an image for the Selenium Grid itself, plus browser images for Chrome and Firefox. Both also come in debug versions, which let you connect over VNC to "see what's happening there". You can check them out in their GitHub repository.

Making it work

Basically, I created a c3.4xlarge EC2 machine and installed both Docker and docker-compose, following their respective READMEs:

# install docker:
wget -qO- https://get.docker.com/ | sh

# install docker-compose:
curl -L https://github.com/docker/compose/releases/download/1.2.0/docker-compose-`uname -s`-`uname -m` > /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose

With docker up and running, I created a simple docker-compose.yml file, which describes my containers and how they interact with each other. It ended up being something like this:

hub:
  image: selenium/hub
  ports:
    - "4444:4444"
firefox:
  image: selenium/node-firefox
  links:
    - hub
chrome:
  image: selenium/node-chrome
  links:
    - hub

This uses only the very basics of docker-compose syntax. You can always take a look at the docs to learn a bit more about it.

Now we can start the services by running:

docker-compose up

After that, it is just a matter of telling the Selenium test runner to connect to the Docker host machine on port 4444 and everything will just work.
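For illustration, here is a minimal Python sketch of pointing a test at the grid. It assumes the Selenium 4 Python bindings are installed, and `docker-host` is only a placeholder for the address of your Docker host; neither comes from our actual setup, and any runner that speaks the remote WebDriver protocol connects the same way.

```python
def hub_url(host, port=4444):
    """Build the remote WebDriver URL exposed by the Selenium hub."""
    return "http://{}:{}/wd/hub".format(host, port)


def open_remote_chrome(host):
    """Ask the hub for a Chrome session (assumes the Selenium 4 bindings)."""
    # Imported lazily so hub_url() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    return webdriver.Remote(command_executor=hub_url(host), options=options)


# Usage, with "docker-host" standing in for your own host name:
#   driver = open_remote_chrome("docker-host")
#   driver.get("https://example.com")
#   driver.quit()
```

The hub picks a free node that matches the requested browser, so the test code does not need to know how many containers are running.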

But, well, we had 30 machines before... now I only have 2 Selenium instances running (one with Firefox and the other with Chrome). How can I fix that? Well, I'm glad you asked:

docker-compose scale chrome=15 firefox=15

Around 10 seconds later, 30 Selenium instances are up and running. Sweet.

Let's talk money


The objective was to decrease our costs with EC2 instances in our development VPC.

With these changes, we dropped our monthly Selenium-related EC2 bill by ~77%! Ok, ok, we have also changed the main OS where Selenium runs. Well, even if the instances had already been Linux boxes before, it would still be a cut of ~57%:

[Image: table of EC2 values]

It is also important to note that we pay the Amazon bill in USD, and we pay around BRL 4.5 per USD. That said, USD 1161 costs us around BRL 5224.5, which can buy ~411L of beer here (using BRL ~12.69/L).
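The conversion is plain arithmetic. As a quick sanity check in Python, using only the figures quoted in this post (the variable names are mine):

```python
usd_amount = 1161       # USD figure quoted in the post
brl_per_usd = 4.5       # approximate exchange rate quoted in the post
brl_per_liter = 12.69   # approximate beer price (BRL per liter) quoted in the post

brl_amount = usd_amount * brl_per_usd
liters_of_beer = brl_amount / brl_per_liter

# int() drops the fraction, so this reports whole liters only.
print("BRL {:.1f} buys {} whole liters of beer".format(brl_amount, int(liters_of_beer)))
```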

Well, MISSION ACCOMPLISHED.

Note: the values assume 50% usage, because we only use the instances about 12 hours per day (business hours, give or take).

Try it out

In order to make it easier for you guys to put all this to work (and save you a bunch of copy-and-paste combos), I created a simple set of shell scripts that can easily get a Selenium grid up and running.

To test that, you can start a fresh linux machine (or not, your call) and hit:

git clone https://github.com/caarlos0/selenium-grid-on-docker-example.git grid && \
  cd grid && sudo ./install.sh

This will download the scripts and install Docker and docker-compose. When you install Docker, it will suggest that you add your user to the docker group. You should really do that. To help you out, it's something like this:

sudo usermod -aG docker your-username

Now, let's put it to run:

./run.sh

This command will pull and run 3 containers: hub, firefox and chrome. You can scale things up with:

./scale.sh 10

This will scale the grid to 10 Chrome containers and 10 Firefox containers (be advised that it will eat a lot of memory - it's 20 browsers, after all).

On my Mac, I scaled it to 4 Chrome and 4 Firefox instances, and it works:

Running 4 Firefox and 4 Chrome instances on my laptop

Just 5 seconds to start 8 containers. Neat.

Conclusions

Docker is great! (I can't say this enough)

Some people don't yet trust Docker enough to put it in production, or are scared of it because they don't know it well. I can only suggest that you start testing it on development machines, in CI environments and so forth. It is safe and you will surely learn a lot (and love almost every second of it). You can also read my previous post about how Docker works.

The best part: need more speed? Just change the instance type and let docker-compose scale it up! The other best part (yes there is more than one): you can put 30 Selenium machines to run in a few seconds. Try that with 30 Windows machines.

Future

Maybe autoscale the cluster using spot instances?


Docker: The very basics

Or "what the hell is this Docker thing?"

Intro

According to their website,

Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

The main idea is that, instead of shipping wars, ears, tars, debs or whatever package system you might think of, you ship a standardized container, which can run "anywhere" without much effort.

The terminology is based on real containers. A container, if you look at Wikipedia, is:

An intermodal container (also known as a container, freight container, ISO container, shipping container, hi-cube container, box, sea container, container van) is a standardized reusable steel box. Intermodal containers are used to store and move materials and products efficiently and securely in the global containerized intermodal freight transport system. "Intermodal" indicates that the container can be used across various modes of transport, (from ship to rail to truck) without unloading and reloading its contents.

So, the idea is that you can "ship" a container to production, and it can be the same container you used before to test the app on your own machine, maybe changing only the database URL or something.

It is common to "link" Docker to Microservices and vice-versa, but, even if it's nice to, you don't need to use Docker to ship Microservices. Docker is just the tool commonly used for that.

If you think outside the box of putting web apps in production using Docker, you will see that you can ship basically any software using it, from scripts to selenium nodes, for example.

It is very important to note that containers are meant to be stateless: anything written inside a container is gone once the container is removed, unless it lives in a volume. With that said, it is no surprise that CI companies are using Docker to "isolate" and distribute the builds.

How it works

I will assume that you are familiar with "default" virtualization software, like, let's say, VirtualBox. This is how it works:

Basically, each new VM you need to run will run isolated and load up the entire "guest" OS, using the "host" hardware through the "Hypervisor". This all may sound nice in some cases, but this strategy uses a lot of resources and may be a little slow too.

Docker, on the other hand, doesn't have a Hypervisor. Instead, it does operating-system-level virtualization, which is a server virtualization method where the kernel of an operating system allows for multiple isolated user space instances, instead of just one.

LXC (which Docker used by default until version 0.9) is nothing new. The first commit was made on Aug 6, 2008. The thing is that LXC is a little hard to work with, and Docker "abstracted" a lot of stuff with their libcontainer, which should allow us to, for example, run Docker on Windows Server in the near future.

Another key feature, in my humble opinion, is AuFS. AuFS (AnotherUnionFS) is a multi-layered filesystem that implements union mount, which basically allows several filesystems or directories to be simultaneously mounted and visible through a single mount point, appearing to be one filesystem to the end user (in this case, us).

Supposing you have 10 Docker containers based on, let's say, a 1GB Ubuntu Server image, they will all use only 1GB plus their specific data, instead of the 10+GB they would use if running on VirtualBox, for example.
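To put rough numbers on that, here is a small Python sketch. The 1GB image and the 10 containers come from the example above; the 50MB of data per container is a made-up figure used only to complete the sum.

```python
def shared_layers_mb(image_mb, containers, data_mb_each):
    """Disk use when every container shares one read-only copy of the image."""
    return image_mb + containers * data_mb_each


def full_copies_mb(image_mb, containers, data_mb_each):
    """Disk use when every VM carries its own full copy of the image."""
    return containers * (image_mb + data_mb_each)


image_mb = 1024     # the 1GB image from the text
containers = 10     # the 10 containers from the text
data_mb_each = 50   # hypothetical per-container data, not from the text

docker_total = shared_layers_mb(image_mb, containers, data_mb_each)
vm_total = full_copies_mb(image_mb, containers, data_mb_each)

print("shared layers: {} MB, full copies: {} MB".format(docker_total, vm_total))
```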

So, if you pull, let's say, the bobrik/image-cleaner image, you will see something like:

$ docker pull bobrik/image-cleaner
Pulling repository bobrik/image-cleaner
28b7cd17052f: Download complete
511136ea3c5a: Download complete
a5b60fe97da5: Download complete
9bff7ebd6f58: Download complete
5381e678f99a: Download complete
Status: Downloaded newer image for bobrik/image-cleaner:latest

Each of those IDs is nothing but an AuFS "layer". So, if I decide to build another image adding some stuff to this one, I will not have to push the entire image, just the "diff"... more or less like a git commit.
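The layering idea can be imitated in a few lines of Python. This is only a toy model built on `collections.ChainMap`, not how AuFS is implemented, and the paths and file contents are invented for the example. Each container gets an empty writable layer stacked on top of the shared, read-only image layers.

```python
from collections import ChainMap

# Two read-only image layers; paths and contents are made up for illustration.
base_layer = {"/bin/sh": "shell binary", "/etc/motd": "hello from base"}
app_layer = {"/app/run.sh": "echo Hello World", "/etc/motd": "hello from app"}


def new_container(*image_layers_top_first):
    """Stack an empty writable layer on top of the shared image layers."""
    # ChainMap looks keys up from the first mapping to the last,
    # and sends every write to the first mapping only.
    return ChainMap({}, *image_layers_top_first)


c1 = new_container(app_layer, base_layer)
c2 = new_container(app_layer, base_layer)

# The write lands in c1's own top layer; the image layers stay untouched.
c1["/tmp/scratch"] = "written by c1"

print(c1["/etc/motd"])       # upper layers shadow lower ones
print(dict(c1.maps[0]))      # the "diff" that belongs to c1 alone
```

Both containers point at the very same `app_layer` and `base_layer` objects, which is the toy equivalent of sharing the image on disk while keeping only a small per-container diff.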

Also, docker pull?

The Docker Registry

Yeah, you can push and pull container images to/from the Docker registry, which is something like a "GitHub" for Docker container images.

If you look at some public image in the registry, like this one, for example, you will see there is a Source project page link, which leads to the GitHub repository of that specific image, where you can finally see the Dockerfile for that image.

A Dockerfile is a text file that contains all the commands you would normally execute manually in order to build a Docker image. It extends another image, which can be a "clean" ubuntu server or really any other image.

The syntax is pretty simple. Here is a very basic example:

FROM busybox
ENTRYPOINT echo Hello World

You can then build this image, tagging it as hello-world:

$ docker build -t hello-world .
Sending build context to Docker daemon 2.048 kB
Sending build context to Docker daemon
Step 0 : FROM busybox
latest: Pulling from busybox
cf2616975b4a: Pull complete
6ce2e90b0bc7: Pull complete
8c2e06607696: Already exists
Digest: sha256:38a203e1986cf79639cfb9b2e1d6e773de84002feea2d4eb006b52004ee8502d
Status: Downloaded newer image for busybox:latest
 ---> 8c2e06607696
Step 1 : ENTRYPOINT echo Hello World
 ---> Running in b29590127d4f
 ---> 7fa687f18c73
Removing intermediate container b29590127d4f
Successfully built 7fa687f18c73

Now you can finally run it:

$ docker run -t hello-world
Hello World

If you create a Docker Registry account and if this wasn't a totally useless image, you could also tag it with your username and push it with something like:

$ docker tag hello-world caarlos0/hello-world
$ docker push caarlos0/hello-world

You can learn more about Dockerfile and the Docker cli here and here.

That's all folks!

I think this is enough information for now. You can basically Google and learn more about all those fancy names I cited, as well as those in the images, if you feel you need more info.

Also, feel free to ask questions and/or share your thoughts in the comments box below. :beers:


Docker Protips™

Like my old post on git, this is somewhat a collection of useful Docker commands/tricks/whatever.

Feel free to leave yours in the comments!

Stop all containers

$ docker stop $(docker ps -qa)
  • ps -qa will output the CONTAINER_ID of all containers;
  • stop will get ps -qa as input and stop all of them.

You can also kill all running containers instead of stopping them.

Delete all stopped containers

$ docker rm $(docker ps -qa -f status=exited)
  • ps -qa will output the CONTAINER_ID of all containers;
  • the -f status=exited flag will tell docker ps to keep only exited containers (the exited=0 filter would match only those that exited with code 0);
  • rm will remove the containers.

This command is particularly useful if you tested a lot of stuff on your machine, which is now running out of disk space.

Delete all images

$ docker rmi -f $(docker images -q)
  • images -q will output the IMAGE_ID of all known images;
  • rmi -f will force delete all the given images.

Delete unused images

docker run --rm -v /var/run/docker.sock:/var/run/docker.sock bobrik/image-cleaner

Kitematic

If you use a Mac, managing boot2docker by hand can be a little "boring"; you can use Kitematic for that.


Let's make this list bigger! Have your own tip/trick? Something I forgot to add? Share it with us!


QCon Sao Paulo - 2015: A short overview

So, this week I attended QCon-SP.

The conference was great (congratulations everyone :beers:), and I thought it would be nice to write an overview.

So, the top subjects were Microservices and Docker. A lot of Big Data too, but I like the Microservices thing more, so I didn't follow the Big Data track.

We saw a lot of company culture too, and, believe it or not, it was strongly related to Microservices.

Let me explain.

Basically, they advocate small teams (5-8 people), each team owning one or more Microservices. The team is responsible for both developing and deploying those services, so the teams are multidisciplinary. The teams, not the people.

Also, a real Microservice should be independent. Microservices sharing the same database are not Microservices.

The philosophy behind Microservices is based on Unix Philosophy:

Of course, you probably won't do that from day 1 in your new startup. You don't even know if it will work, and Microservices add a bit of complexity you might not want to pay for now.

About this, two quotes by @randyshoup:

And, as expected, Docker now seems to be the popular choice for deploying Microservices. Even Microsoft is working on it.

There are a lot of cool things that you can do with it right now, and there will probably be more soon, like running desktop software inside a Docker container and freezing user space to start a Docker container in a "warm" state - which seems nice if you think about JVM JIT, for example.

I wish I had attended the Docker tutorial by Jerome, but, unfortunately, that was not possible. There was very little practical stuff about Docker in the talks and keynotes, but they were nice anyway.

See you next year!


Using Mockito's @InjectMocks

FYI: Like the previous post, this is a really quick tip.

Let's imagine we have two classes, and one depends on another:

Another.java:

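import lombok.extern.java.Log;

@Log
public class Another {
    // Not final: Mockito's classic, subclass-based mock maker cannot intercept final methods.
    public void doSomething() {
        log.info("another service is working...");
    }
}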

One.java:

import lombok.RequiredArgsConstructor;
import lombok.extern.java.Log;

@Log
@RequiredArgsConstructor
public class One {
    private final transient Another another;

    public final void work() {
        log.info("Some service is working");
        another.doSomething();
        log.info("Worked!");
    }
}

Now, if we want to test One, we need an instance of Another. While we are testing One, we don't really care about Another, so we use a Mock instead.

In the Java world, it's pretty common to use Mockito for such cases. A common approach would be something like this:

OneTest.java:

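import org.junit.Before;
import org.junit.Test;
import org.mockito.Mock;
import org.mockito.Mockito;
import org.mockito.MockitoAnnotations;

public final class OneTest {
    @Mock
    private transient Another another;
    private transient One one;

    @Before
    public void setup() {
        // Creates the mocks for all @Mock fields of this test.
        MockitoAnnotations.initMocks(this);
        this.one = new One(another);
    }

    @Test
    public void oneCanWork() throws Exception {
        one.work();
        Mockito.verify(another).doSomething();
    }
}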

It works, but it's unnecessary to call the One constructor by hand; we can just use @InjectMocks instead:

OneTest.java (2):

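import org.junit.Before;
import org.junit.Test;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.Mockito;
import org.mockito.MockitoAnnotations;

public final class OneTest {
    @Mock
    private transient Another another;
    @InjectMocks
    private transient One one;

    @Before
    public void setup() {
        // Creates the mocks and builds One, passing the mocks in.
        MockitoAnnotations.initMocks(this);
    }

    @Test
    public void oneCanWork() throws Exception {
        one.work();
        Mockito.verify(another).doSomething();
    }
}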

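It does have some limitations: Mockito tries constructor injection first, then setter injection, then field injection, and it does not report a failure if none of them works. For most cases, though, it will work gracefully.

If you want more info, read its Javadoc.
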