mga/blog

ISSN 2011-0146

Using Travis-CI with Github Pages to build a self-updating static site

published in general, programming, web by mga on October 25, 2015

Recently I came across a project where one collaborator was generating CSV files that would then be used to produce a static minisite with an ElasticSearch-powered search. I decided to use Github Pages and Travis-CI for two main reasons:

  • a simple git push would let anyone update the site any time
  • Travis-CI could take care of static file generation and search indexing whenever a new set of CSVs was deployed

If you have a similar need, read on. This post is informed by many blog posts around the web (see footnotes 1–4).

NOTE: This is not a recommendation on how to produce static sites. There are better tools out there for that. Check out StaticSiteGenerators if you need a proper system for static site generation. This tutorial is more the README I wish I had found on the web while looking to solve this particular problem.

About Travis-CI and Github Pages

Github Pages is a quick and easy way to host static websites (you do need to know git and have a Github account, but you already do, right?).

Travis-CI is a service that lets you trigger arbitrary code whenever you push changes to a code repository. The most popular use is code-testing. We will use the free version that requires your repository to be public. Take this into account if you require your code to be private.

In this example we will use Travis-CI to execute some Python code in the repository which takes care of indexing, static file generation and repository updates. The project in question has two branches: the mandatory gh-pages branch, which Github will use for hosting the static site, and a csv branch which will receive the latest CSVs. The gh-pages branch will be updated every time a new push arrives in the csv branch.

Suppose a basic structure of the project like this:
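Something along these lines, reconstructed from the files mentioned throughout this post (the exact layout of your project may differ):

```text
.
├── index.html          # the static site served by Github Pages
├── .travis.yml         # Travis-CI configuration
├── csv/
│   ├── data.csv        # the latest CSVs pushed by collaborators
│   └── static.txt      # pre-cached output generated by indexer.py
└── python/
    ├── indexer.py      # indexes data.csv and writes static.txt
    ├── requirements.txt
    └── build.sh        # the steps Travis-CI runs on each push
```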

indexer.py will update the ElasticSearch index using data.csv, and also generate static.txt as a sort of pre-caching of the site. This may sound a little roundabout but bear with me. This structure was actually useful in our case. I will eventually publish the final site.

You will also notice a /python/build.sh file. This file contains the steps you would use to create the index and static file manually. It is basically the list of UNIX commands you would type in your terminal to do the process yourself, except that you want Travis-CI to do it for you (magick!).

Set up permissions

Your Github account needs to allow Travis-CI some operations in your repositories.

NOTE: Make sure you consult others on security. I am not an expert on this subject. Refer to the footnotes for more details. I will just cover the basics.

In Github:

  • Click on your avatar in the top-right and select Settings > Personal access tokens
  • Generate a new token with these permissions: user:email, read:org, repo_deployment, repo:status, write:repo_hook, public_repo
  • IMPORTANT: Save the token somewhere you can easily retrieve it because Github shows it only once

In Travis-CI:

  • Install the Travis Ruby gem in your machine and login:
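For example (the --org flag targets the free travis-ci.org service and will prompt for your Github credentials):

```shell
gem install travis
travis login --org
```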

  • Go to your Travis-CI profile and turn on the repository you want to activate
  • Click the little gear icon to access the settings for that repo
  • Add any environment variables that your scripts use such as the URL to your ElasticSearch service or the path/to/some/file in the repository

Let’s look at a trimmed-down (useless) version of the Python indexer.py file (the # comments in the code will clarify the main parts):
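A sketch along these lines; the index name minisite, the columns id and text, and the file paths are illustrative rather than taken from the real project:

```python
# indexer.py -- trimmed-down sketch. The index name "minisite" and the
# columns "id" and "text" are illustrative, not from the real project.
import csv
import os


def read_rows(path):
    """Load the CSV rows as a list of dictionaries."""
    with open(path) as fh:
        return list(csv.DictReader(fh))


def index_rows(rows, es_url):
    """Push every row into the ElasticSearch index."""
    # Imported here so the rest of the file works without the module;
    # Travis-CI installs it via requirements.txt.
    from elasticsearch import Elasticsearch
    es = Elasticsearch([es_url])
    for row in rows:
        es.index(index="minisite", doc_type="row", id=row["id"], body=row)


def write_static(rows, path):
    """Pre-cache the site as a plain-text file, one row per line."""
    with open(path, "w") as fh:
        for row in rows:
            fh.write("{id}\t{text}\n".format(**row))


if __name__ == "__main__" and "ES_URL" in os.environ:
    # ES_URL is the environment variable added in the Travis settings panel
    rows = read_rows("csv/data.csv")
    index_rows(rows, os.environ["ES_URL"])
    write_static(rows, "csv/static.txt")
```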

The file has two purposes:

  1. index the CSV in ElasticSearch
  2. produce a csv/static.txt

The post-deploy script

You may have noticed a requirements.txt file above. A vanilla Python environment in Travis-CI does not include every module by default. We need this file to tell Travis what to install once the repository is deployed. You can add as many modules as you want; these are just examples:
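For instance, a minimal file could be (the exact modules depend on what your indexer imports; these two are illustrative):

```text
elasticsearch
requests
```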

Now let’s look at the build.sh file. This is where the magic happens! This is also where the token we created above enters the scene. We will encrypt it in a minute.

But first:

Travis-CI requires a .travis.yml file (you might have noticed it in the root folder, next to index.html) that describes what happens once a new deploy is detected. Let’s start with the basic structure (once again, the # comments will clarify):
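A sketch of that basic structure (adjust the Python version to your project; the branch restriction keeps Travis from reacting to pushes on gh-pages):

```yaml
# .travis.yml -- basic structure
language: python          # build.sh and indexer.py need a Python environment
python:
  - "2.7"
branches:
  only:
    - csv                 # only react to pushes on the csv branch
```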

Once we have that file, we can add the encrypted Github token using the following command:
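From the repository root, paste in the token you saved earlier (never commit it unencrypted):

```shell
travis encrypt GH_TOKEN="the-token-you-saved-earlier" --add
```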

The --add flag will append the encrypted string to the .travis.yml file like so:
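Assuming no other environment variables were present, the appended section looks roughly like:

```yaml
env:
  global:
    secure: "encrypted-stuff-here"
```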

Note the new env > global > secure structure in the above snippet where the Travis-CI command-line program inserted some encrypted-stuff-here automatically.

This creates a new environment variable named GH_TOKEN available to any scripts run by Travis-CI (similar to adding the variable in the settings panel, but more securely). We will also add a variable for the repository name. You may want to encrypt it too, but I will leave it in plain text for example purposes:
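With a hypothetical repository name added in plain text, the section becomes:

```yaml
env:
  global:
    - GH_REPO="your-username/your-repository"   # hypothetical, plain text
    - secure: "encrypted-stuff-here"
```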

Now we need to create the build script itself, python/build.sh. The steps are:

  1. clone the repository to a new folder (I had to do it this way because the scope in Travis-initiated processes seems to be limited to a single branch and I was not able to pull/push to other branches)
  2. checkout and pull the latest code in the csv branch
  3. checkout and merge the code into the gh-pages branch
  4. run the indexing and output new static files
  5. add the new files and create a new commit
  6. push the result to gh-pages

Below is a condensed version of this script. The echos (and all other terminal-visible commands in your scripts such as print) will be visible in the Travis console so you can debug what may be going wrong:
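A condensed sketch following the six steps above; the clone folder name and the commit message are illustrative, while GH_TOKEN and GH_REPO come from the env > global section of .travis.yml:

```shell
#!/bin/bash
set -e  # abort on the first failing command

echo "1. Cloning a fresh copy of the repository"
git clone "https://${GH_TOKEN}@github.com/${GH_REPO}.git" clone
cd clone

echo "2. Pulling the latest code in the csv branch"
git checkout csv
git pull origin csv

echo "3. Merging into gh-pages"
git checkout gh-pages
git merge csv

echo "4. Running the indexer and generating static files"
python python/indexer.py

echo "5. Committing the new files"
git config user.name "Travis CI"
git config user.email "travis@travis-ci.org"
git add csv/static.txt
git commit -m "Update index and static files (Travis build $TRAVIS_BUILD_NUMBER)"

echo "6. Pushing the result to gh-pages"
git push --quiet origin gh-pages   # --quiet avoids echoing the token URL in the log
```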

Now we need to tell Travis-CI to add executable permissions to this file and run it in the install part of the build lifecycle. We also add before_install and script sections to .travis.yml. The end result looks like:
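One reasonable arrangement of those sections (the chmod adds the executable permission before anything else runs, pip handles requirements.txt, and the script phase runs the build itself):

```yaml
language: python
python:
  - "2.7"
branches:
  only:
    - csv
env:
  global:
    - GH_REPO="your-username/your-repository"   # hypothetical name
    - secure: "encrypted-stuff-here"
before_install:
  - chmod +x python/build.sh                    # add executable permissions
install:
  - pip install -r python/requirements.txt
script:
  - ./python/build.sh
```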

And voilà! Once these files are added to the repository, on the next push to the csv branch Travis-CI will run the scripts and update both the data in ElasticSearch and the Github Pages website.

Hope this is useful to you, and do contact me if there are any glaring issues or omissions in this quick example. Special thanks to @auremoser for her feedback while writing this text.