Updating a virtual appliance in the wild (part 2)

Update: Read part 3 of this blog post series.

In our previous post about updating virtual appliances, we didn’t explain all the details and pitfalls to look out for.

In this post, I will explain the exact approach we use, and some mistakes to avoid.

A first minor mistake

When we first started building virtual appliances for our customers, we already had a decent, functional update process. Unfortunately, it assumed every customer would apply every update, sequentially. Hah!

People don’t always perform updates, and when they do it’s common for them to skip a few versions.

You shouldn’t force updates down your customers’ throats (*cough* Adobe *cough*), so it’s important to understand and handle situations where they might skip a few versions.

We ship update packages as encrypted tar files, so we solved this by adding a version.txt file to the root of each update package. We can then compare the versions and see if an update needs to be run, and we can deduce which updates haven’t been applied.

Example:

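server=$(cat /server/version.txt)   # version installed on the appliance
package=$(cat version.txt)          # version shipped in the update package

# bc prints 1 when the comparison is true, 0 otherwise
compare=$(echo "$server >= $package" | bc)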

This will give us 0 (the appliance is old) or 1 (the appliance is up-to-date).
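One caveat: bc compares the two values as decimal numbers. That works for versions like 1.1 and 1.2, but it would treat 1.10 as older than 1.9 and cannot parse something like 1.2.3. Here is a minimal sketch of an alternative, assuming GNU sort is available (the -V flag is a GNU coreutils extension); version_ge is a hypothetical helper name, not something from our actual scripts:

```shell
#!/bin/bash
# version_ge A B: succeeds (exit 0) when version A is the same as or newer than B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
    # The newest of the two versions sorts last; A >= B when A is that one.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example: the appliance is at 1.10, the package ships 1.9.
if version_ge "1.10" "1.9"; then
    echo "appliance is up-to-date"
else
    echo "appliance is old"
fi
```

The helper returns an exit status instead of printing 0 or 1, so it can be used directly in an if statement.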

Idempotent updates

By using idempotent update scripts, we don’t have to worry about problems caused by a certain task running twice.

Example:

- name: Generate some random data for no reason
  shell: >
    creates=/random.data
    openssl rand 512 > /random.data
  tags:
    - random
    - "1.1"

We use Ansible for everything. In the task above, the creates argument tells Ansible to skip the task if the /random.data file already exists.
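If part of your update logic lives in plain shell rather than Ansible, the same guard is easy to reproduce. This is a rough sketch only; run_unless_exists is a hypothetical helper name, and the demo writes to a temporary directory instead of / so it is safe to try:

```shell
#!/bin/bash
# run_unless_exists FILE CMD...: run CMD only when FILE does not exist yet.
# Hypothetical helper mirroring the idea behind Ansible's "creates" argument.
run_unless_exists() {
    local marker="$1"
    shift
    if [ -e "$marker" ]; then
        echo "skipping: $marker already exists"
        return 0
    fi
    "$@"
}

# Demo in a temporary directory rather than / so nothing important is touched.
demo_dir="$(mktemp -d)"
# First call creates the file, second call is skipped.
run_unless_exists "$demo_dir/random.data" openssl rand -out "$demo_dir/random.data" 512
run_unless_exists "$demo_dir/random.data" openssl rand -out "$demo_dir/random.data" 512
```

Running the same step twice leaves the file from the first run untouched, which is the property we want from every update task.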

Bundling all the updates

Our latest update process ensures all previous updates are included in the update package.

This seems tricky, but it’s actually quite simple. Here’s an example, and I’ll explain below:

- name: Generate some random data for no reason
  shell: >
    creates=/random.data
    openssl rand 512 > /random.data
  tags:
    - random
    - "1.2"

- name: Install the latest nginx
  yum: name=nginx state=latest enablerepo=myapp
  tags:
    - yum
    - "1.2"

Did you notice we specified the tag “1.2” for both tasks? The first task originally shipped with version 1.1, but in the 1.2 package it is tagged “1.2” as well. This is what allows customers to skip updates.

Even if someone’s virtual appliance is at version 1.0, and they skip version 1.1 (the one that generates the /random.data file), they will still get it when updating to version 1.2.
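How the tags get bumped at release time is up to you. One possible way, assuming version tags always appear as quoted list items on their own line as in the examples above, is a small sed-based helper. The name retag_playbook and the approach are illustrative assumptions, not a description of our build tooling:

```shell
#!/bin/bash
# retag_playbook OLD NEW FILE: print FILE with every `- "OLD"` tag line
# rewritten to `- "NEW"`. Hypothetical helper; other tags are left alone.
retag_playbook() {
    local old="$1" new="$2" file="$3"
    # Escape the dots so "1.1" only matches a literal 1.1.
    local old_re="${old//./\\.}"
    sed -E "s/^([[:space:]]*-[[:space:]]*)\"${old_re}\"[[:space:]]*\$/\1\"${new}\"/" "$file"
}

# Demo: a task that first shipped in version 1.1.
demo_file="$(mktemp)"
cat > "$demo_file" <<'EOF'
- name: Generate some random data for no reason
  shell: >
    creates=/random.data
    openssl rand 512 > /random.data
  tags:
    - random
    - "1.1"
EOF

# Print the playbook as it would ship in the 1.2 package.
retag_playbook "1.1" "1.2" "$demo_file"
```

Because every task is idempotent, re-running the older tasks on an appliance that already applied them is harmless.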

Updating your application

If you read the first part of this blog series, you’ll remember we created a local repository for packages.

This should be used to add your web application, and any other packages which need updating. If you create custom .deb or .rpm packages of your web application, then it becomes simple to update it behind the firewall.

For help creating custom distribution packages, I recommend Jordan Sissel’s FPM tool.

Example:

fpm -s dir -t deb -n ruby -v 2.1.0 -C /opt/ruby-2.1.0 \
    --prefix /opt/ruby-2.1.0 \
    bin include lib share

This generates a lovely custom ruby-2.1.0 .deb package based on directories found in /opt/ruby-2.1.0.

Encrypting the update package

Security is always our company’s top priority. We’ve actually made this one of our core values (I drafted a post about this, and we’ll publish it eventually).

High-grade encryption should be used to protect the update packages for distribution.

Example:

openssl enc -aes-256-cbc -pass file:passphrase_file \
    -in update_package-1.2.tar -salt -a \
    -out update_package-1.2.asc

The virtual appliance itself contains the passphrase file to decrypt the package, but it can be replaced quite easily.
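On the appliance, decryption is the same command with -d added. The sketch below wraps it in a function and runs a round trip in a temporary directory; decrypt_package, the demo file names and the demo passphrase are made up for illustration:

```shell
#!/bin/bash
# decrypt_package IN OUT PASSFILE: reverse of the encryption command above.
# -d decrypts; -a tells openssl the input is base64, matching the -a used to encrypt.
decrypt_package() {
    openssl enc -d -aes-256-cbc -pass "file:$3" -a -in "$1" -out "$2"
}

# Round-trip demo with throwaway files (names are illustrative only).
work="$(mktemp -d)"
echo "example passphrase" > "$work/passphrase_file"
echo "hello" > "$work/payload.txt"
tar -cf "$work/update_package-1.2.tar" -C "$work" payload.txt

# Encrypt exactly as in the example above.
openssl enc -aes-256-cbc -pass "file:$work/passphrase_file" \
    -in "$work/update_package-1.2.tar" -salt -a \
    -out "$work/update_package-1.2.asc"

# Decrypt and confirm the result is byte-for-byte identical to the original tar.
decrypt_package "$work/update_package-1.2.asc" "$work/decrypted.tar" "$work/passphrase_file"
cmp "$work/update_package-1.2.tar" "$work/decrypted.tar" && echo "round trip OK"
```

One thing to watch for: OpenSSL 1.1.0 changed the default digest used by enc from MD5 to SHA-256, so if the build machine and the appliance run different OpenSSL versions, you may need to pass the same -md option on both sides.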

We’ve already discussed the idea and techniques for protecting your source code, so this method could still end up being useless. Regardless, it’s much better than no security at all.

Another minor mistake

One of the assumptions we made was that our update script, the one distributed with the virtual appliance, was flawless.

It was definitely not as flexible as it should have been, because it contained a lot of update logic. This made it impossible to alter the update process without (1) updating the appliance’s update script and (2) running the update twice. Ew. Gross.

To solve this, we removed the appliance-specific update logic and made the appliance’s update script more generic.

Example:

#!/bin/bash
#
# Example script in the virtual appliance
decrypt_update_package
extract_update_package
ensure_everything_is_in_order
chmod +x update.sh
./update.sh || exit 1

We added the update logic to an update.sh script at the root of the tar update package. Here’s an example of what the update.sh script does.

Example:

#!/bin/bash
#
# Example update.sh script from an update package
ansible-playbook appliance.yml --tags="1.2" || exit 1
exit 0

With this new process, we can ensure that an update package can do whatever it wants, as long as it contains an update.sh script. In this case, it’s running an Ansible playbook, but it’s only running tasks tagged with “1.2”. Where have I seen that before? ;)
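To make the placeholder steps in the appliance script a bit more concrete, here is one way they could be fleshed out. Treat it as a sketch under assumptions: apply_update, the scratch directory layout and the error messages are invented for this example, and your own script will need more sanity checks.

```shell
#!/bin/bash
# Sketch of a generic appliance-side updater. Function name and paths are
# illustrative assumptions, not the original script.
apply_update() {
    local package="$1" passfile="$2"
    local workdir
    workdir="$(mktemp -d)" || return 1

    # Decrypt the base64-encoded, AES-256-CBC encrypted tar file.
    openssl enc -d -aes-256-cbc -pass "file:$passfile" -a \
        -in "$package" -out "$workdir/update.tar" || return 1

    # Extract the package into the scratch directory.
    tar -xf "$workdir/update.tar" -C "$workdir" || return 1

    # The only contract: the package must ship an update.sh at its root.
    if [ ! -f "$workdir/update.sh" ]; then
        echo "update.sh missing from package" >&2
        return 1
    fi

    # Run the package's own update logic from inside the extracted tree.
    ( cd "$workdir" && chmod +x update.sh && ./update.sh ) || return 1
}
```

It would be called with something like apply_update update_package-1.2.asc /path/to/passphrase_file. Note that the function knows nothing about Ansible or version tags; all of that stays inside the package.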

Final remarks

We’re constantly discussing and reviewing better ways of creating, updating, and distributing virtual appliances for our customers. When we improve processes such as this one, we try to share them so everyone can benefit.

We’re aware that turning a SaaS application into a distributable virtual appliance is not easy. With Jidoteki, we aim to make this entire process much easier for you. Make sure you sign up for early access.

In the meantime, if you have questions, feel free to contact us by email, on Twitter, or in our chat room #Jidoteki on Freenode IRC.

We at Unscramble are also available for consulting if you need some expertise for shipping a virtual appliance to your enterprise customers.