Ansible tweaks to get 500% faster runs

In this post, I’ll explain how we significantly decreased the time it takes to create an appliance build in Jidoteki Meta (Jidometa). We managed to bring the build times down from 6 minutes to a little less than 60 seconds.

Profiling

Before trying to optimize Ansible runs, it’s best to start by profiling the tasks to see what’s taking the most time. The Jidoteki Meta virtual appliance is quite small (~60MB), so we decided to profile a customer build instead, using this simple plugin (example output):

[Image: example output of the profiling plugin]

Various approaches

A simple search for “faster ansible” or “speeding up ansible” returns many results, which all provide rehashed versions of SSH/remote optimizations. Unfortunately, they are useless for us since we use Ansible in local mode. Fortunately, I did find one gem hidden at the bottom of an Async post, which is the basis for our huge speed gains (ironically, it has nothing to do with async).

Why not async?

Ansible’s async feature is quite complex: it requires another module (async_status), polls in the background, and is a big ugly hack that’s more suitable for SSH/remote Ansible runs. It wasn’t designed for our use case, so we didn’t include it in our optimization schemes.

Copying files

There is a known “bug” (feature) in Ansible since the early days: the copy module is terribly slow. Their recommended approach is to use the synchronize module which uses rsync behind the scenes, but that is also just a big hack.

Our 6 slowest tasks involve looping through a long array of with_items (~120 entries) and copying files from source -> destination. The gem was in this line:

','.join(dependencies)

If we could concatenate our list to a space-separated string, and then transform our task into a shell task (instead of copy), we could make one single call to the cp command, containing all 120 files. The processed command looks something like this:

cp file1 file2 file3 fileN file120 /dest/dir/

Without the need for looping or using with_items.
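To make the idea concrete outside of Ansible, here is a minimal Python sketch of joining a list into one cp command. The function name build_cp_command and the file names are hypothetical, and the quoting step is an assumption added here so that paths containing spaces don't break the command; it is not a copy of our actual task.

```python
import shlex


def build_cp_command(files, dest_dir):
    """Return one shell command string that copies every file to dest_dir."""
    if not files:
        raise ValueError("nothing to copy")
    # Quote each path so that names containing spaces survive the shell.
    sources = " ".join(shlex.quote(path) for path in files)
    return "cp {} {}".format(sources, shlex.quote(dest_dir))


# Hypothetical file list, standing in for the ~120 real entries.
files = ["file1", "file2", "file3"]
print(build_cp_command(files, "/dest/dir/"))
```

The point is that the list is flattened once, so the shell is invoked a single time rather than once per file.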

Those 6 tasks went from taking a combined 180 seconds to a total of only 3 seconds.

Wow!

Too simple

Of course, that solution was just too simple and probably should have been there from the start, so we made up for it by optimizing something a bit more complex ;)

At the end of our Ansible run, we generate a rootfs which is really just a CPIO archive containing the files/directory structure from the Ansible output. Since our Ansible repos are versioned in Git, every single Ansible run using the same git commit ref should produce the exact same rootfs (see our Reproducible Builds post). Of course, some minor differences exist in regards to file/directory modification times, and randomly generated passwords/keys, but everything else remains the same.

What I realized was that if we can find a previous build that used the exact same git commit ref as the current build, then we can re-use the entire rootfs instead of re-generating it from scratch. As a result, the Ansible role for generating the rootfs, which previously took 120 seconds, was reduced to only 8 seconds.
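The re-use logic can be sketched roughly as follows. This is only an illustration in Python, not our Ansible role: the function name, the cache directory layout, the rootfs-<ref>.cpio naming and the build_rootfs callback are all assumptions made for the example.

```python
import os
import shutil


def reuse_or_build_rootfs(commit_ref, cache_dir, output_path, build_rootfs):
    """Re-use a cached rootfs for commit_ref if present, else build and cache it.

    Returns True when a cached archive was re-used, False when it was rebuilt.
    """
    # Hypothetical naming scheme: one cached archive per git commit ref.
    cached = os.path.join(cache_dir, "rootfs-{}.cpio".format(commit_ref))
    if os.path.isfile(cached):
        # Fast path: same commit ref as an earlier build, so skip the rebuild.
        shutil.copyfile(cached, output_path)
        return True
    # Slow path: generate the archive from scratch, then keep a copy
    # so the next build of this commit ref can take the fast path.
    build_rootfs(output_path)
    os.makedirs(cache_dir, exist_ok=True)
    shutil.copyfile(output_path, cached)
    return False
```

In this sketch, a re-used archive is byte-for-byte the earlier one, so anything generated at build time is carried over from that earlier build.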

The sum of its parts

With all these optimizations in build times, we effectively managed to obtain a 500% speed increase for the final builds created with Jidometa.
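For anyone wondering where the 500% comes from, here is the arithmetic as a tiny Python sketch. The helper name is made up for the example, and it uses 60 seconds as a round figure for the "a little less than 60 seconds" build.

```python
def speed_increase_percent(before_seconds, after_seconds):
    """Percentage speed increase when a run drops from before to after."""
    # A run that is N times faster is an (N - 1) * 100 percent speed increase.
    return (before_seconds / after_seconds - 1) * 100


# 6 minutes (360 seconds) down to roughly 60 seconds is 6x, i.e. 500%.
print(speed_increase_percent(360, 60))
```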

Going back to our own Jidometa appliance build: it previously “only” took 38 seconds, and now takes exactly 5 seconds.

I’m laughing as I type this.

Dogfooding

I may sound like a broken record, but something really magical occurs when you use your own product on a daily basis. We discover and implement solutions which solve our own problems, which happen to also solve our customers’ problems.

Our focus is on creating the most solid enterprise-ready on-prem virtual appliances. We work exclusively with customers who want to distribute professional, complex applications to the enterprise, in a self-contained file, without having to figure out all the tough parts.

Feel free to contact us, and we’ll be happy to discuss your requirements.