I met a friend yesterday, and we discussed ick at length. The main topic we covered was how to let ick maximise concurrency during builds and notify the user of any problems as soon as possible.

As an example, imagine a project written in a language that gets compiled to native code, with some unit tests, and for which ick builds Debian packages. When the user pushes changes to git, ick will do roughly the following things, in order:

  • fetch the source (or updates to it)
  • build the code
  • run unit tests
  • run any integration tests that can be run from the build tree without installing the software
  • create a Debian source package (.dsc)
  • from the source package, build the binary package (.deb)
  • upload the source and binary packages to an APT repository
  • deploy the uploaded binary packages to a test server
  • test the installed software on the test server, possibly heavily (e.g., benchmarks, load tests)
  • deploy the uploaded binary packages to a production server

What the user wants is to be notified of any problem as soon as possible. For example, if there is an error during building, they should hear about it as soon as it happens.

Ick lets the user specify projects, each of which has one or more pipelines, each of which has one or more actions. When all the actions of all the pipelines are executed in order, the project is built and the result is something the user wants.

For simple cases, this is fine, and there's usually not much room for concurrency. If the worker has multiple CPUs, the action that runs the project's build system ("make") can exploit that by running multiple compilers concurrently ("make -j128"). That is however not of interest to ick.

For ick, what matters is building different parts of the project on different workers at the same time, and not waiting for things unnecessarily. For the simple example above, this is simple enough: execute the steps in sequence, and tell the user after each step if it failed.
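As a rough illustration (not ick's actual code), such a sequential runner with fail-fast notification might look like this; the step commands and the notify function are placeholders:

```python
# Hypothetical sketch of sequential pipeline execution with fail-fast
# notification; the commands and notify() are placeholders, not ick's API.
import subprocess

def notify(message):
    # A real system would mail or message the user; we just print.
    print(message)

def run_steps(steps):
    """Run (name, command) pairs in order; stop and notify on first failure."""
    for name, command in steps:
        result = subprocess.run(command, shell=True)
        if result.returncode != 0:
            notify("step %s failed" % name)
            return False
    notify("all steps succeeded")
    return True

# Illustrative step list, mirroring the example earlier in the text.
steps = [
    ("fetch", "git clone git://git.example.com/project"),
    ("build", "make"),
    ("unit-tests", "make check"),
]
```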

A more interesting case is when the program needs to be built on more than one CPU architecture. For example, amd64 (fast) and armhf (slow). Even more interesting is when the build takes a long time. Thus, we extend the example above with the following changes:

  • the project source code is large (think LibreOffice large)
  • it takes hours to build even on a fast machine, days on a slow machine
  • feedback should still come as soon as possible

The simplistic way would be to do the build, unit test, and package building steps sequentially, once per CPU type, starting with the fastest. However, this would slow things down too much: the slow workers wouldn't even start until the fast ones had finished, and if there are several slow architectures, the last one might take many days, even weeks. Not acceptable.

The solution is to start all builds at once. Basically, the build will effectively "fork" the pipeline for each type of build target, and execute actions within each fork sequentially.
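A minimal sketch of the forking idea, using threads to stand in for independent workers (the names and data shapes are illustrative assumptions):

```python
# Illustrative sketch: "fork" the pipeline once per build target and
# run each fork's actions sequentially, while the forks themselves run
# concurrently. Threads stand in for independent workers here.
import threading

def run_fork(target, actions, results):
    # Actions within one fork run strictly in order.
    for action in actions:
        results.append((target, action(target)))

def start_forks(targets, actions):
    results = []
    threads = [
        threading.Thread(target=run_fork, args=(t, actions, results))
        for t in targets
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```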

Some things will need to wait for all builds from forks to finish. For example, an automatic notification of a new build should probably not be sent out until the software is built for all types of targets.

The tricky part, then, is to give the user a way to express this in a way that's easy to get right and difficult to get wrong, and that still allows ick to do things with as much concurrency as possible. Here's a suggestion.

A pipeline's list of actions implies a dependency order: each action implicitly depends on the previous action in the same list. Depending means that the depended-on action must finish successfully before the depending action is started.

A "gate" action implies that all previous actions for the same build must finish successfully before the gate is passed. All gates must be explicit: there is no implicit gate at the end of a pipeline. This is so that a fast worker can continue opportunistically with the next pipeline, without having to wait (for days!) for a slow architecture build to finish.

A pipeline, or each of its actions, can specify that it should be run once per build, once per each type of target, or only for a specific type of target. This is done by using tags. Each worker will have a set of tags that describe it, and ick will automatically detect types of workers from the tags.

As a concrete example, a project and pipeline and workers might be specified as follows. We'll have three workers, two for amd64, and one for armhf.

workers:
  - worker: cheetah
    tags:
      - amd64-arch
  - worker: cheetah2
    tags:
      - amd64-arch
  - worker: turtle
    tags:
      - armhf-arch

We'll have one project, which we specify should be built for every architecture. More specifically, ick will automatically run multiple forks of the build, one fork for each set of available workers that matches the tag wildcards. In our example, the two amd64 workers are considered equal, and ick will choose one of them at random. With the example workers, ick will start two forks for this project when it starts building.

projects:
  - project: gcc
    pipelines:
      - clone
      - configure
      - build
      - check
      - dsc
      - deb
      - dput
      - announce
    target-tags:
      - "*-arch"
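Assuming the target-tags wildcards are shell-style patterns, the controller's matching could be sketched like this (the function name and data shapes are illustrative, not ick's API):

```python
# Sketch of wildcard tag matching, assuming shell-style patterns such
# as "*-arch"; the worker data mirrors the example above.
from fnmatch import fnmatch

workers = {
    "cheetah": {"amd64-arch"},
    "cheetah2": {"amd64-arch"},
    "turtle": {"armhf-arch"},
}

def target_types(workers, pattern):
    """Return the distinct tags, across all workers, matching the pattern."""
    found = set()
    for tags in workers.values():
        found |= {tag for tag in tags if fnmatch(tag, pattern)}
    return found
```

With the example workers, matching "*-arch" yields two distinct target types, hence two forks.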

Here are simplistic versions of the pipelines. The actual build actions are uninteresting; we concentrate on the way concurrency possibilities are expressed.

pipelines:
  - pipeline: clone
    actions:
      - shell: git clone git://git.example.com/gcc
    build: once

  - pipeline: configure
    actions:
      - shell: ./configure
    build: each

  - pipeline: build
    actions:
      - shell: make
    build: each

  - pipeline: check
    actions:
      - shell: make check
    build: each

  - pipeline: dsc
    actions:
      - shell: dpkg-buildpackage -S
    build: each

  - pipeline: deb
    actions:
      - shell: dpkg-buildpackage -b
    build: each

  - pipeline: dput
    actions:
      - shell: dput *.changes
    build: each

  - pipeline: announce
    actions:
      - action: gate
      - shell: echo GCC HAS BEEN UPDATED AND ALL IS WELL
    build: once

The "build" field in a pipeline should have one of the following values:

  • once – the pipeline should be executed once for all forks; if a fast worker executes the pipeline successfully, and a slower one arrives at it, the controller will just skip the pipeline for the slow worker

  • each – the pipeline should be executed by each fork

We might want to allow the "build" field for each action in a pipeline as well.
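The skip logic for "once" pipelines could be as simple as this sketch (the function and data structures are assumptions for illustration):

```python
# Sketch of the controller's decision for "build: once" vs "build: each":
# a once-pipeline is skipped if any fork has already completed it.
def should_run(build_mode, pipeline_name, completed_anywhere):
    """completed_anywhere: pipeline names finished by any fork so far."""
    if build_mode == "once" and pipeline_name in completed_anywhere:
        return False
    return True
```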

A gate action means that the controller will wait for all preceding actions to finish successfully before it gives any further actions to any workers to execute.
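A minimal sketch of that gate check, with the state encoding as an illustrative assumption:

```python
# Minimal sketch of gate semantics: the gate opens only when every
# preceding action in every fork has finished successfully.
def gate_open(fork_states):
    """fork_states maps fork name to a list of preceding action states."""
    return all(
        state == "done"
        for states in fork_states.values()
        for state in states
    )
```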

Ick will automatically archive the workspace in the blob service after each action, and each action will start by fetching the archived workspace from the previous action, whether from the same fork or not. (Obviously, the archiving and fetching will only be done if actions are given to different workers.)
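The workspace hand-off could be sketched like this, with a plain dict standing in for the blob service and gzipped tarballs as the archive format (both assumptions, not ick's actual design):

```python
# Sketch of workspace hand-off between workers via a blob service,
# modelled here as a dict keyed by (build id, action index).
import io
import tarfile

blob_service = {}

def archive_workspace(build_id, action_index, workspace_files):
    """Tar up the workspace and store it under a per-action key."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in workspace_files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    blob_service[(build_id, action_index)] = buf.getvalue()

def fetch_workspace(build_id, action_index):
    """Fetch and unpack the workspace archived after a previous action."""
    buf = io.BytesIO(blob_service[(build_id, action_index)])
    files = {}
    with tarfile.open(fileobj=buf, mode="r:gz") as tar:
        for member in tar.getmembers():
            files[member.name] = tar.extractfile(member).read()
    return files
```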

Conceptually, the ick controller will start a build by building a directed acyclic graph of actions that need to be executed, and label each node in the graph with the type of worker needed to execute it. Something like the graph below.

When the build is actually executing, the controller will start keeping track of the build at the "start" node, and when a worker wants something to do, the controller will pick the next node in the graph suitable for that worker. If it's not the worker that did the previous action, the controller will automatically tell the new worker to fetch the correct workspace blob from the blob service.
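A toy version of that scheduling decision, with a hand-written graph and an assumed encoding (node names and tags are illustrative):

```python
# Toy sketch of the controller picking the next action for a worker:
# a node is runnable when all its dependencies are done and its label
# matches one of the worker's tags.
graph = {
    # node: (worker tag needed, dependencies)
    "clone": ("any", set()),
    "build-amd64": ("amd64-arch", {"clone"}),
    "build-armhf": ("armhf-arch", {"clone"}),
}

def next_action(graph, done, worker_tags):
    """Return a runnable node for this worker, or None if nothing fits."""
    for node, (needed, deps) in graph.items():
        if node in done:
            continue
        if deps <= done and (needed == "any" or needed in worker_tags):
            return node
    return None
```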

I believe this will result in a nice amount of concurrency in builds orchestrated by ick. Also, the build graph will make for a nice visualisation in the UI, each node turning green when its action is finished, or something.