At ContaAzul, we use the CI infrastructure a lot. We open several pull requests in several projects every day, and we block the merge until the build passes. We consider our master branches sacred, and we can’t afford to wait long to change them.
For the past year, we have been using Travis Enterprise. While it is pretty good, it also has some problems:
So, I went out looking for cheaper and better alternatives.
My friend Caio told me they were using Buildkite at MeusPedidos and that it was working well.
I decided to give it a try. My plan was:
terminate some machines.

Setting up the master and the workers was easy. Buildkite provides an Open Source tool called Elastic CI Stack for AWS.
It works great, letting us even use Spot Instances (which most of the time are cheaper).
I spent some time reading all its code before using it (trust no one). Then I launched a cluster on our sandbox AWS account. After that, Buildkite was already working.
Then, I set Buildkite up on the top 3 most changed repositories and let it run for a few days.
After a week, I was sold. I’m not the only user though, so I asked the team to fill out a short Google form. Here are the results:
Classify, from 0 to 10, your satisfaction with Travis:
Classify, from 0 to 10, your satisfaction with Buildkite:
How would you feel if we fully migrate to Buildkite?
All the scores were good:
The only downside I was able to see was the migration process.
I had 50 projects to migrate.
The migration process was:
.travis.yml file;
.buildkite/pipeline.yml and other required files;

7 steps x 50 repositories = a lot of work.
It would be easier if:
.travis.yml as a fallback or had a conversion tool of some sort.

But none of those were true, so I automated some steps of the process.
First of all, I cloned all repositories with the help of clone-org:
$ clone-org --org ContaAzul --destination /tmp/ca
Second, I created both Buildkite and GitHub API tokens to script the link between them. Of course I made the script open-source.
After that, I created some template files for Java Buildkite builds. Since most of our projects are in Java, I figured that this would do most of the work for me. This step included creating a Docker image with Java, Maven and other tools.
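To give an idea of what such a template might contain, here is a minimal sketch. It is not the actual template used at ContaAzul: the function name write_pipeline_template, the Maven command, the image name and the Docker plugin version are all assumptions made for illustration.

```shell
# Hypothetical template generator (not from the original post).
# Writes a minimal .buildkite/pipeline.yml into the given repository folder.
write_pipeline_template() {
  repo_dir="$1"
  image="$2"   # assumed: a build image that already ships Java and Maven
  mkdir -p "$repo_dir/.buildkite" &&
  cat > "$repo_dir/.buildkite/pipeline.yml" <<EOF
steps:
  - label: "build"
    command: "mvn -B verify"
    plugins:
      # The plugin version is a placeholder; pin a release you have verified.
      - docker#v5.0.0:
          image: "$image"
EOF
}
```

Calling write_pipeline_template /tmp/ca/some-repo example/java-maven:latest would create the file inside that clone. The repository and image names here are made up.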
With all that in place, I wrote a “one-liner” and a helper function that would do most of the work for me:
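# helper function: the ( ... ) body runs in a subshell, so the caller's
# working directory is left unchanged
setup_buildkite() (
  cd ~/Code/github-buildkite-wire &&
    ./setup-pipeline.sh ContaAzul "$(basename "$1")"
)

# the one liner: run it from the folder that holds the clones; each
# repository is handled in its own subshell so the next iteration starts
# from the same place
find . -mindepth 1 -maxdepth 1 -type d | while read -r folder; do
  (
    cd "$folder" &&
      test -f '.travis.yml' &&
      setup_buildkite "$folder" &&
      rm .travis.yml &&
      git checkout -b rm-travis &&
      git commit -am 'removed travis.yml' &&
      cp -r ~/Code/buildkite/template/* . &&
      git add -A &&
      git commit -m 'buildkite :wrench:' &&
      git pr
  )
done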
PS: The git pr alias calls this plugin.

git pr opened one pull request for each project. The majority worked, a few were completely wrong, and some needed tweaks. Still, I automated most of the work, which let me complete days of work in about 2 hours.
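Since a loop like the one above touches every repository at once, a dry run that only lists the candidates can be useful before running it. The following is a minimal sketch of that idea, not part of the original script. The function name list_travis_repos is hypothetical.

```shell
# Hypothetical helper (not part of the original script): print the name of
# every direct subdirectory of "$1" that contains a .travis.yml file.
list_travis_repos() {
  root="${1:-.}"
  find "$root" -mindepth 1 -maxdepth 1 -type d | sort | while read -r folder; do
    # Only report repositories the migration loop would actually touch.
    if [ -f "$folder/.travis.yml" ]; then
      basename "$folder"
    fi
  done
}
```

For example, list_travis_repos /tmp/ca would print one repository name per line, without changing anything on disk.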
Our custom Travis elastic stack wasn’t working very well.
We experienced a lot of random job failures and jobs stuck in queues for no obvious reason.
Buildkite’s native Elastic Stack works great. Their web interface feels faster and cleaner. We also didn’t experience any random job failures.
Having a base Docker image for our builds also gave us faster builds. Travis’ base image had a lot of stuff we didn’t need and configurations we didn’t know about. We tweaked ours for our use case, cutting the average build time by ~20%.