2012-08-09

Continuous Delivery with Maven and Go into Maven Central

In continuous delivery the idea is to have the capability to release the software to production at the press of a button, as often as it makes sense from a business point of view, even on every commit (which would then be called continuous deployment). This means that every binary built could potentially be released to production in a matter of minutes or seconds, if the powers that be so wish.

Maven's ideology is opposed to continuous delivery, because with Maven it must be decided before building whether the next artifact will be a release or a snapshot version, whereas with continuous delivery it will be known only long after building the binaries (e.g. after it has passed all automated and manual testing) whether the binary is fit for release.

In this article I'll explain how I tamed Maven to support continuous delivery, and how to do continuous delivery into the Maven Central repository. For continuous integration and release management I'm using the Go server from ThoughtWorks (not to be confused with Go, go or other uses of go). Git's distributed nature also comes in handy, because new commits and tags are created during each build.

The project in question is Jumi, a new test runner for Java to replace JUnit's test runner. It's an open source library and framework which will be used by testing frameworks, build tools, IDEs and such, so the method of distribution is publishing it to Maven Central through Sonatype OSSRH.

Version numbering with Maven to support continuous delivery

Maven Release Plugin is diametrically opposed to continuous delivery, as is the use of snapshot version numbers in Maven, so I defined the project's version numbering scheme to use the MAJOR.MINOR.BUILD format. In the source repository the version numbers are snapshot versions, but before building the binaries the CI server will use the following script to read the current version from pom.xml and append the build number to it. For example, the latest (non-published) build is build number 134 and the version number in pom.xml is 0.1-SNAPSHOT, so the resulting release version number will be 0.1.134.
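As an illustration, here is a minimal Ruby sketch of that version calculation. It is not the project's actual script: the function names are made up, the POM is parsed with a regular expression instead of an XML parser, and it assumes that the project's own `<version>` is the first one outside the `<parent>` block.

```ruby
# Extracts the project's own version from POM XML (hypothetical helper).
# Simplification: the <parent> block is cut out first, so that the parent's
# version is not mistaken for the project's, and then the first remaining
# <version> element is used.
def read_pom_version(pom_xml)
  without_parent = pom_xml.sub(%r{<parent>.*?</parent>}m, '')
  match = without_parent.match(%r{<version>\s*([^<\s]+)\s*</version>})
  raise 'no <version> element found' if match.nil?
  match[1]
end

# Replaces the -SNAPSHOT suffix with the build number.
def release_version(snapshot_version, build_number)
  unless snapshot_version.end_with?('-SNAPSHOT')
    raise ArgumentError, "not a snapshot version: #{snapshot_version}"
  end
  base = snapshot_version.sub(/-SNAPSHOT\z/, '')
  "#{base}.#{build_number}"
end

# Example POM; the coordinates are placeholders.
example_pom = <<~XML
  <project>
    <modelVersion>4.0.0</modelVersion>
    <parent>
      <groupId>com.example</groupId>
      <artifactId>parent</artifactId>
      <version>7</version>
    </parent>
    <artifactId>demo</artifactId>
    <version>0.1-SNAPSHOT</version>
  </project>
XML

puts release_version(read_pom_version(example_pom), 134) # prints 0.1.134
```

In a real build the build number would come from the CI server instead of being hard-coded.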

In the project's build script I'm then using Versions Maven Plugin to change the version numbers in all modules of this multi-module Maven project. The changed POM files are not checked in, because in continuous delivery each binary should be built only once anyway. To make the sources traceable I'm tagging every build, but I publish those tags only after releasing the build - more about that later.

P.S. versions:set requires the root aggregate POM to extend the parent POM, or else you will need to run it separately for both the root and the parent POMs by giving Maven the --file option. Also, to keep all plugin version numbers in the parent POM's pluginManagement section, the root should extend the parent, or else you will need to define the plugin version numbers on the command line and can't use the shorthand commands for running Maven plugins (unless you're OK with Maven deciding by itself which version to use, which can produce unrepeatable builds).
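The versions:set invocation could be assembled roughly like this in Ruby. The helper name and the parent/pom.xml path are hypothetical; newVersion and generateBackupPoms are parameters of the plugin's set goal.

```ruby
# Builds the Maven command line for rewriting the version in all modules
# (hypothetical helper). Pass pom_file when a POM has to be processed
# separately with Maven's --file option.
def versions_set_command(new_version, pom_file: nil)
  command = ['mvn', 'versions:set',
             "-DnewVersion=#{new_version}",
             '-DgenerateBackupPoms=false'] # no pom.xml.versionsBackup files
  command += ['--file', pom_file] unless pom_file.nil?
  command
end

puts versions_set_command('0.1.134').join(' ')
# Needed only if the root POM does not extend the parent POM:
puts versions_set_command('0.1.134', pom_file: 'parent/pom.xml').join(' ')

# To actually run it (Ruby 2.6 or newer):
#   system(*versions_set_command('0.1.134'), exception: true)
```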

Managing the deployment pipeline with Go

Before going into further topics, I'll need to explain a bit about Go's artifact management and give an overview of Jumi's deployment pipeline.

Go is more than just a continuous integration server - it is designed for release and deployment management. (I find that Go fits that purpose better than for example Jenkins.) In Go the builds are organized into a deployment pipeline, as described in the Continuous Delivery book. All pipeline runs are logged forever* and configuration changes are version controlled (Go uses Git internally), so it also provides full auditability, especially when it's used for deployment. The commercial version has additional deployment environment and access permission features, but the free version has been adequate for me for now (though as an open source project Jumi could get the enterprise edition for free). Update 2012-09-20: Since Go 12.3 the free version has the same features as the commercial version - only the maximum numbers of users and remote agents differ. Update 2014-02-25: Go is now open source and fully free!

* For experimenting I recommend cloning a pipeline with a temporary name, to keep the official pipeline's history from being filled with experimental runs. You can't remove things from the history (except by removing or hacking the database), but you can hide them by deleting the pipeline and not reusing the pipeline's name (if you create a new pipeline with the same name then the old history will become visible). Only build artifacts of old builds can be removed, and there's actually a feature for that to keep the disk from getting too full.

Go's central concepts are pipelines, stages and jobs. Each pipeline consists of one or more sequentially executed stages, and each stage consists of one or more jobs (which can be executed in parallel if you have multiple build agents). A pipeline can depend on the stages of other pipelines and a stage can be triggered automatically or manually with a button press, letting you manage complex builds by chaining multiple pipelines together.

A pipeline may even depend on multiple pipelines, for example if the system is composed of multiple separately built applications, in which case one pipeline could be used to select which versions to deploy together. Then further downstream pipelines or stages can be used to deploy the selected versions together into development, testing and finally into the production environment.

You can save artifacts produced by a job on the Go server and then access those artifacts in downstream jobs by fetching the artifacts into the working directory before running your scripts. Go uses environment variables to tell the build scripts about the build number, source control revision, identifiers of previous stages, custom parameters etc.
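A build script could read those environment variables along these lines. This is only a sketch: the variable names are the ones documented for current GoCD releases, so treat them as an assumption for other versions of Go, and the helper name is made up.

```ruby
# Collects build metadata from an environment-like object
# (pass ENV in a real job, a plain Hash when experimenting).
def build_info(env)
  {
    pipeline: env.fetch('GO_PIPELINE_NAME'),
    build_number: Integer(env.fetch('GO_PIPELINE_COUNTER')),
    stage: env.fetch('GO_STAGE_NAME'),
    # May be missing, e.g. when the pipeline has more than one material.
    revision: env['GO_REVISION']
  }
end

# Sample values for illustration only.
sample_env = {
  'GO_PIPELINE_NAME' => 'jumi',
  'GO_PIPELINE_COUNTER' => '134',
  'GO_STAGE_NAME' => 'build',
  'GO_REVISION' => '0123abc'
}
p build_info(sample_env)
```

Using fetch for the mandatory variables makes the script fail fast with a KeyError when it is run outside the CI server.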

Here you can see two dependent pipelines from Jumi's deployment pipeline. Clicking the button in jumi-publish would trigger that pipeline using the artifacts from this particular build of the jumi pipeline. I can trigger the downstream pipeline using any previous build - it doesn't have to be the latest build.

Running a shell command in Go requires more configuration than in Jenkins (which has a single text area for inputting multiple commands), which has the positive side effect of driving you to store all build scripts in version control. I have one shell script for each Go job, which in turn calls a bunch of Ruby scripts and Maven commands.

Below is a diagram showing Jumi's deployment pipeline at the time of writing. The names of pipelines are in bold, stages underscored, and jobs are links to their respective shell scripts. For more details see the pipeline configuration.

Git
 |
 |
 |--> jumi (polling automatically)
      build         --> analyze
      build-release     coverage-report
         |
         |
         |--> jumi-publish (manually triggered)
              pre-check           --> ossrh           --> github
              check-release-notes     promote-staging     push-staging

The jumi/build stage builds the Maven artifacts and saves them on the Go server for use in later stages. It also tags the release and updates the release notes, but those commits and tags are not published yet; they are only saved on the Go server.

The jumi/analyze stage runs PIT mutation testing and produces line coverage and mutation coverage reports. They can be viewed in Go on their own tab.

The jumi-publish/pre-check stage makes sure that release notes for the release have been filled in (no "TBD" line items), or else it will fail the pipeline and prevent the release.
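Such a pre-check can be quite small. The following Ruby sketch is not the actual check-release-notes job; it assumes that the release notes of the build have already been extracted into a string and that an unfinished item is any line containing the word TBD.

```ruby
# Returns the lines that still contain a "TBD" placeholder.
def unfinished_items(release_notes)
  release_notes.lines.map(&:strip).select { |line| line.match?(/\bTBD\b/) }
end

# Returns true when the notes are complete; otherwise explains why not.
def check_release_notes(release_notes)
  unfinished = unfinished_items(release_notes)
  return true if unfinished.empty?
  warn "Release notes are not filled in: #{unfinished.join('; ')}"
  false
end

# Placeholder release notes for illustration.
puts check_release_notes("- Fixed a bug\n- TBD\n")

# In a CI job a failed check would end the script with a non-zero exit
# status, which fails the stage:
#   exit(1) unless check_release_notes(notes)
```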

The jumi-publish/ossrh stage uploads the build artifacts from Go into OSSRH. It doesn't yet run the last Nexus command for promoting the artifacts from the OSSRH staging repository into Maven Central (I need to log in to OSSRH and click a button), because I haven't yet written smoke tests which would make sure that all artifacts were uploaded correctly.

The jumi-publish/github stage pushes the tags and commits which were created in jumi/build to the official Git repository. It will merge automatically if somebody has pushed commits there after this build was made.

Future plans for improving this pipeline include adding a new jumi-integration pipeline between jumi and jumi-publish. It will run consumer contract tests of programs using Jumi, against multiple versions of those programs to notice any backward incompatibility issues. This stage might eventually take hours to execute, in which case I may break it into multiple jobs and run them in parallel. I will also reuse a subset of those tests as smoke tests in the jumi-publish pipeline, after which I can automate the final step to promote from OSSRH to Maven Central.

Staging repository for Maven artifacts in Go

Creating publishable Maven artifacts happens with the Maven Deploy Plugin and the location of the Maven repository can be configured with altDeploymentRepository. I'm using -DaltDeploymentRepository="staging::default::file:staging" to create a staging repository with only this build's artifacts, which I then save on the Go server. (The file:staging path means the same as file://$PWD/staging but works also on Windows.)

That staging repository can be accessed from the Go server using HTTP, so it would be quite simple to let beta users access it (optionally using HTTP basic authentication). For example, the URL to a build's staging repository could be http://build-server/go/files/jumi/134/build/1/build-release/staging/. Inside Go jobs, though, it's easiest to fetch the staging repository into the working directory, which avoids the need to configure Maven's HTTP authentication settings.
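Judging from the example URL, the path appears to be made of the pipeline name, pipeline run number, stage name, stage run number, job name and artifact path. Under that assumption (the helper below is hypothetical and not part of Go), the URL could be built like this:

```ruby
# Builds the URL of an artifact directory saved on the Go server.
# The segment order is inferred from the example URL in the text.
def staging_repository_url(server_url, pipeline:, pipeline_counter:,
                           stage:, stage_counter:, job:, path: 'staging')
  parts = [pipeline, pipeline_counter, stage, stage_counter, job, path]
  "#{server_url.chomp('/')}/files/#{parts.join('/')}/"
end

puts staging_repository_url('http://build-server/go',
                            pipeline: 'jumi', pipeline_counter: 134,
                            stage: 'build', stage_counter: 1,
                            job: 'build-release')
# prints http://build-server/go/files/jumi/134/build/1/build-release/staging/
```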

When it is decided that a build can be published, I trigger the jumi-publish pipeline which uses the following script to upload the staging repository from a directory into a remote Maven repository. It uses curl to do HTTP PUT commands with HTTP basic authentication. (I wasn't able to find any documentation about the protocol of a Maven repository like Nexus, but was able to sniff it using Wireshark.)
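The idea of such an upload can be sketched in Ruby as follows. This is not the original script: the helper names, the repository URL and the credentials are placeholders, and the sketch only prints the curl commands instead of running them. It assumes that every file under the staging directory is PUT to the same relative path under the remote repository URL.

```ruby
require 'find'
require 'pathname'
require 'tmpdir'
require 'fileutils'

# Pairs every file under staging_dir with the URL it should be PUT to,
# keeping the directory layout of the Maven repository.
def upload_plan(staging_dir, repository_url)
  root = Pathname.new(staging_dir)
  base = repository_url.chomp('/')
  files = Find.find(staging_dir).select { |path| File.file?(path) }.sort
  files.map do |path|
    relative = Pathname.new(path).relative_path_from(root).to_s
    [path, "#{base}/#{relative}"]
  end
end

# curl command for one upload; --upload-file makes curl send an HTTP PUT
# and --user enables HTTP basic authentication.
def curl_put_command(file, url, user, password)
  ['curl', '--fail', '--silent', '--show-error',
   '--user', "#{user}:#{password}", '--upload-file', file, url]
end

# Demonstration against a throwaway directory; nothing is uploaded.
Dir.mktmpdir do |dir|
  artifact = File.join(dir, 'com/example/demo/0.1.134/demo-0.1.134.pom')
  FileUtils.mkdir_p(File.dirname(artifact))
  File.write(artifact, '<project/>')
  upload_plan(dir, 'https://repo.example.com/deploy/').each do |file, url|
    puts curl_put_command(file, url, 'user', 'secret').join(' ')
  end
end
```

The --fail option makes curl exit with an error status on HTTP error responses, so a rejected upload can fail the job instead of passing silently.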

In addition to the above uploading, the publish script uses Nexus Maven Plugin to close the OSSRH repository to which the artifacts were uploaded. It could also promote it to Maven Central, but I want to first create some smoke tests to make sure that all the necessary artifacts were uploaded. Until then I'll do a manual check before clicking the button in OSSRH to promote the artifacts to Maven Central.

Publishing to Maven Central puts some additional requirements on the artifacts. Since I'm not using Maven Release Plugin, I need to manually enable the sonatype-oss-release profile in Sonatype OSS Parent POM to generate all the required artifacts and to sign them. If you don't need to publish artifacts to Maven Central, then you might not need to do this signing. But if you do, it's good to know that the Maven GPG Plugin accepts as parameters the name and passphrase of the GPG key to use. They can be configured in the Go pipeline using secure environment variables which are automatically replaced with ******** in the console output. (For more security, don't save the passphrases on the Go server, but manually enter them when triggering the pipeline. Otherwise somebody with root access to the Go server could get the Go server's private key and decrypt the passphrase in Go's configuration files. Though using passphraseless SSH and GPG keys on the CI server is much simpler.)

Tagging the release and updating release notes with Git

When I do a release, I want it to be tagged and the version and date of the release added to release notes, which are in a text file in the root of the project. In order to get the release notes included in the tagged revision and for the tag to be on the revision which was built, that commit needs to be done before building (an additional benefit is that all GPG signing - the build artifacts and the tag - will be done in the build stage). Since I'm using Git, I can avoid the infinite loop, which would otherwise ensue from committing on every build, by pushing the commits only if the build is released.

The release notes for the next release can be read from the release notes file with a little regular expression (get-release-notes.rb). Writing the version number and date of the release into release notes is also solvable using regular expressions (prepare-release-notes.rb), as is preparing for the next release iteration by adding a placeholder for the future release notes (bump-release-notes.rb).
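To give an idea of the regular expressions involved, here is a Ruby sketch of the three operations. The release notes layout, the heading texts and the function signatures are assumptions made for this example; the real scripts may differ.

```ruby
# Assumed layout: the upcoming release is under a "### Upcoming Changes"
# heading, and released versions are under "### Jumi VERSION (DATE)".

# Returns the items listed under the upcoming-changes heading.
def get_release_notes(text, heading: '### Upcoming Changes')
  pattern = /^#{Regexp.escape(heading)}\n(.*?)(?=^### |\z)/m
  match = text.match(pattern)
  raise 'upcoming changes section not found' if match.nil?
  match[1].strip
end

# Replaces the upcoming-changes heading with the version and date.
def prepare_release_notes(text, version, date, heading: '### Upcoming Changes')
  text.sub(/^#{Regexp.escape(heading)}$/, "### Jumi #{version} (#{date})")
end

# Adds a placeholder section for the next release above the newest one.
def bump_release_notes(text, heading: '### Upcoming Changes')
  text.sub(/^### /, "#{heading}\n\n- TBD\n\n### ")
end

# Placeholder content for illustration.
notes = <<~NOTES
  Release Notes

  ### Upcoming Changes

  - Fixed a bug

  ### Jumi 0.1.100 (2012-07-01)

  - Older change
NOTES

puts get_release_notes(notes) # prints "- Fixed a bug"
released = prepare_release_notes(notes, '0.1.134', '2012-08-09')
puts bump_release_notes(released)
```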

With the help of those helper scripts the build script, shown below, will be able to create a commit that contains the finalized release notes and tag it with a GPG signed tag (I'm including the release notes also in the tag message). It saves the release metadata into files, so that later stages of the pipeline would not need to recalculate them (for example in promote-staging.sh), and so that I could see them in a custom tab in Go (build-summary.html). Then the script does the build with Maven and after that does another commit which prepares the release notes for a future release.

At the end of the above script you will see what lets me get away with doing commits on build. I'm creating a new repository in the directory staging.git and saving that on the Go server the same way as all build artifacts.
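One way to capture the build's commits and tag in such a repository is sketched below as the Git commands a build script might run. This is an assumption about the approach, not the actual script; the helper name, the tag name format and the master branch name are placeholders.

```ruby
# Git commands for copying the current commit and its release tag into a
# separate bare repository, which can then be saved as a build artifact.
def save_commits_commands(tag, staging_repo: 'staging.git')
  [
    ['git', 'init', '--bare', staging_repo],
    ['git', 'push', staging_repo, 'HEAD:refs/heads/master', "refs/tags/#{tag}"]
  ]
end

save_commits_commands('v0.1.134').each do |command|
  puts command.join(' ')
  # To actually run it (Ruby 2.6 or newer):
  #   system(*command, exception: true)
end
```

Because the commits exist only in that saved repository, nothing reaches the official repository until a later stage pushes them there.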

Then when a release is published, the following script is used to merge those commits to the master branch and push them to the official Git repository:

Hopefully this article has given you some ideas for implementing continuous delivery using Maven.