Please note that this page is outdated!

Introduction

This document details the current state of our CI/CB and where we want to be. Namely, the items we'll need to complete include:

  1. Migration to Apache (target date: June 30th), defined by:
    1. Apache Jenkins can run our PRs, merges, and a bare-bones nightly job
    2. We change ownership and organization of MXNet repo to someone from Apache/Apache
    3. Docs are on mxnet.apache.org
    4. Transfer domain mxnet.io to Apache and have them redirect it to mxnet.apache.org
  2. Nightly builds - provide nightly builds to the community every night so we can confidently pick release candidates
    1. Source package
    2. Pip
    3. Docker
  3. Adopting the Apache release process (see below) (next release: July 19th)
  4. Release automation - we should be able to confidently automate the release when tagged

Current State

Our release process is still very manual. We tag a commit as a release and either test the release manually or manually trigger the tests. We use two build tools, Jenkins and Travis; both are containerized.

Jenkins is our build solution for Linux (AML, Ubuntu 14.04) x (CPU, GPU) and Windows Server (CPU).

Travis is our build solution for macOS (CPU only).

PRs/Merges

PRs and merges trigger the classic build, language (Python, R, Scala, Julia) unit tests, and installation guide tests in Jenkins (http://ec2-52-25-96-65.us-west-2.compute.amazonaws.com/job/mxnet/) and Travis (https://travis-ci.org/dmlc/mxnet).

Jenkins builds are triggered via configuration https://github.com/dmlc/mxnet/blob/master/Jenkinsfile and Travis builds via configuration https://github.com/dmlc/mxnet/blob/master/.travis.yml.

Nightly Tests

Nightly tests run in Jenkins (http://jenkins-master-elb-1979848568.us-east-1.elb.amazonaws.com/). They only check that compilation and tests pass; they do not provide build artifacts for our community to use. NOTE: this is a different set of servers from the one used for PRs. These jobs are configured via the web app and should be moved into a Jenkinsfile similar to the PR builds. What's currently running:

  • Core
  • ARM
  • Amalgamation
  • Javascript
  • Notebooks
  • Tutorials
  • Pip (installs and tests what's currently in PyPI)
  • Docker (pulls and tests what's in DockerHub)
  • Installation Guide

Apache Release Process

In the abstract, the process to release Apache software during the incubator period starts with tagging a release candidate, proposing a release on the dev mailing list, and collecting votes. At least three +1 votes and a majority of +1 votes are required. A vote is then held with the Incubator PMC, which must also result in three +1 votes in order to move forward with the release.
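The voting thresholds described above can be written as a small check. This is only a sketch: the function name is hypothetical, and it assumes "majority" means more +1 votes than -1 votes.

```python
def vote_passes(plus_ones: int, minus_ones: int) -> bool:
    """Return True when a release vote meets both thresholds described above."""
    has_minimum = plus_ones >= 3           # at least three +1 votes
    has_majority = plus_ones > minus_ones  # assumption: majority = more +1 than -1
    return has_minimum and has_majority
```

Under these assumptions, a vote with three +1 and one -1 passes, while a vote with only two +1 does not, regardless of how few -1 votes there are.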

The release manager then tags the release, packages the source (suffixed with "incubator") into a tarball, and moves the source package to Apache's incubator distribution website. Any built binaries and environments (pip wheels, docker images) beyond the source packages are not controlled by a formal release process. "The public may also obtain Apache software from any number of downstream channels which redistribute our releases in either original or derived form (rpm, deb, homebrew, etc.). The vast majority of such downstream channels operate independently of Apache."

Read more here: http://www.apache.org/dev/#releases

Release process specific to MXNet here: Release Process

A good example to follow here: https://cwiki.apache.org/confluence/display/HAWQ/Release+Process%3A+Step+by+step+guide#ReleaseProcess:Stepbystepguide-PublishingandDistributingRelease

Release Items

In addition to the other special builds listed in the "nightly" section, we should build and test the following every night. Any run that passes its builds and tests can be used as a release candidate. Once there's a positive vote, the first three would then have to be released manually, or we would trigger a release from the tagging event via our own Jenkins setup.

  1. Source packages (.tgz)
  2. Pip wheels
  3. Docker images
  4. Docs

Tasks

Due to the lack of time between now (last week of June) and the next release, we'll prioritize the migration and providing nightly builds. We'll deprioritize automated releases and release manually instead. There is some work to generate the keys to sign the releases, get them trusted, and package the release.

1. Migration (1.5 weeks)

Mostly passive work - the requirements for completion are defined in the Introduction. The active work of moving nightly jobs (listed above) into a Jenkinsfile for Apache's build server to consume is where most of the effort is, but that wasn't defined as a requirement for completing the migration, because we already have nightly tests running in our in-house Jenkins.

Effort and FAQ: MXNet migration from DMLC to Apache

2. Source Packages (? day)

Projects typically create source packages (.tar.gz of source, no build files) as build artifacts inside Jenkins as part of their nightly builds (example from Apache JMeter). After the community votes on a release, the release manager packages a distribution, signs it with an OpenPGP-compatible ASCII-armored detached signature, and uploads it to the project's dist release website: https://dist.apache.org/repos/dist/release/.
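As a rough illustration of the packaging and signing steps, here is a minimal sketch. The function names, the excluded directory names, and the archive prefix are all hypothetical; the signing helper only builds the gpg command line (the `--armor` and `--detach-sign` options produce an ASCII-armored detached `.asc` file) and assumes the release manager runs it on a machine that holds the signing key.

```python
import tarfile
from pathlib import Path

# Assumption: directories that should not ship in a source package.
EXCLUDED_DIRS = {".git", "build"}


def _keep(member: tarfile.TarInfo):
    """Drop any archive member that lives under an excluded directory."""
    parts = Path(member.name).parts
    return None if any(p in EXCLUDED_DIRS for p in parts) else member


def make_source_package(src_dir: str, out_path: str, prefix: str) -> str:
    """Write src_dir to a gzip-compressed tarball rooted at `prefix`."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(src_dir, arcname=prefix, filter=_keep)
    return out_path


def sign_command(package_path: str) -> list:
    """Build the gpg command for an ASCII-armored detached signature."""
    return ["gpg", "--armor", "--detach-sign", package_path]
```

The command list returned by `sign_command` could be passed to `subprocess.run` by the release manager; it is kept separate here so the packaging step can run on a build server that has no access to the key.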

Task | Status | Start Date | Completion Date | Estimation (days) | Priority | Notes

Add build to nightly Jenkinsfile (running in Apache's build server and producing nightly source packages there) | | | | 1 day for EACH build type listed above? | |

What kind of builds do we want to run (there's a long list - can we simplify it to a couple of major build flags)? Tests? The effort here is in moving jobs configured manually in the Jenkins UI to a pipeline-as-code file (Jenkinsfile).

After the builds pass for the night, we can generate the source package that lives in Jenkins.

Example:

https://builds.apache.org/job/JMeter-trunk/lastSuccessfulBuild/artifact/trunk/dist/

NOTE: I disagree with treating 1) nightly source packages and 2) running these on Apache's build server as priorities. Nightly tests only have to run on our in-house server to give us confidence in picking release candidates. As long as we follow the voting process and get the PMC's blessing, we would be following the Apache release process.

Generate new keys for signing releases | | | | 1+ | High |

Need to generate a set of keys for the release manager to sign the releases. The public key has to be signed into the web of trust (BLOCKER), which is mostly done at in-person meetings such as conferences or signing parties. It also has to be uploaded to a public key server.

See:

http://www.apache.org/dev/release-signing.html

http://www.apache.org/dev/openpgp.html#generation-final-steps

http://www.apache.org/dev/openpgp.html#wot-link-in

Determine README, LICENSE, the directory structure, etc. for the project dist website | | | | 1 | High |

Defer to someone else, call a meeting

https://dist.apache.org/repos/dist/

Trigger automated release on tagging event | | | | 0.5 | Low |

3. Docker Images (2 days)

https://github.com/dmlc/mxnet/tree/master/docker

Need to add tests, and build/tag at the current commit.

Task | Status | Start Date | Completion Date | Estimation (days) | Priority | Notes
Add builds/tests to nightly Jenkinsfile | | | | 1.5 | Medium |
Trigger automated release on tagging event | | | | 0.5 | Low |

4. Pip Wheels (3 weeks)

 

Task | Status | Start Date | Completion Date | Estimation (days) | Priority | Notes
Make wheels for all variants | | | | | Medium | Assigning medium because the Apache release process has no requirements for pip distributions, and source packages have priority over pip distributions. We'll build and test these manually if we need to.
  CPU | Started | | | 2 | |
  MKL | | | | 2 | |
  CU75 | Build and test successful (Python 2.7). Need to test the upload and clean up the code. | | | 2 | |
  CU80 | | | | 2 | |
  CU75MKL | | | | 2 | |
  CU80MKL | | | | 2 | |
Run test suite for each variant, for each Python (2.7, 3.4, 3.5, 3.6) | | | | | Medium |
  For CPU: "nosetests unittest" | | | | 2 | |
  For GPU: "nosetests gpu" | | | | 2 | |
Release | | | | 0.5 | Low | GitHub should send a push notification to Jenkins to run this process each time there is a tagging event. Release the packages via twine (pip install twine).
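To get a feel for the size of the build and test matrix in the table above, the variants and Python versions can be enumerated as follows. This is a sketch only: the function names are hypothetical, and it assumes that every variant whose name starts with "CU" is a GPU build and everything else is a CPU build.

```python
from itertools import product

VARIANTS = ["CPU", "MKL", "CU75", "CU80", "CU75MKL", "CU80MKL"]
PYTHONS = ["2.7", "3.4", "3.5", "3.6"]


def nose_command(variant: str) -> str:
    """Pick the test command listed in the table for this variant."""
    # Assumption: variants prefixed with "CU" are CUDA (GPU) builds.
    return "nosetests gpu" if variant.startswith("CU") else "nosetests unittest"


def build_matrix() -> list:
    """One entry per (variant, Python version) combination."""
    return [
        {"variant": v, "python": py, "test": nose_command(v)}
        for v, py in product(VARIANTS, PYTHONS)
    ]
```

With six variants and four Python versions this yields 24 combinations, which is worth keeping in mind when estimating nightly build time.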

5. Windows, R, Scala Packages

Deferred

6. Docs build

Upon every release, build documentation and manually check/deploy

Workflow:

Design for versioned website:

  1. Push each stable release website static files to a separate repo, say dmlc/website-archive.
  2. Add a versions tab to master and the latest stable release that links to the other releases. Also add the version number in the top right corner to indicate the current website version.
  3. For other old release versions, add a "Latest release" tab to switch to current release.
  4. Point root url to latest release instead of master.

Once we have a new release, we need to:

  1. Archive the last release
  2. Update the versions list on master branch and release tag
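Updating the versions list means ordering release folders numerically rather than alphabetically (otherwise v0.9 would sort above v0.11). A minimal sketch, assuming release folders are named like "v0.11" and that the function names below are hypothetical:

```python
import re

# Assumption: release folders look like "v0.11" or "v0.11.0".
RELEASE_FOLDER = re.compile(r"v\d+(\.\d+)*")


def version_key(name: str) -> tuple:
    """Turn "v0.11" into (0, 11) so versions compare numerically."""
    return tuple(int(part) for part in name.lstrip("v").split("."))


def versions_list(folders: list) -> list:
    """Return release folders sorted from newest to oldest, ignoring others."""
    released = [f for f in folders if RELEASE_FOLDER.fullmatch(f)]
    return sorted(released, key=version_key, reverse=True)
```

For example, `versions_list(["v0.10", "latest", "v0.9", "v0.11"])` returns `["v0.11", "v0.10", "v0.9"]`; the first element is the release the root URL would point to.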

Releases

  1. Tagging a release on GitHub triggers a build job on Jenkins
  2. If the tag is not a release candidate ( ? ), the build job will create a new folder named after the tag (ex: v0.11)
  3. The job builds the docs, moves them into the new tagged folder, and points index.html to the latest versioned docs
  4. Commit and push that folder to the asf-site branch
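The release steps above could be sketched as follows. How a release candidate tag is recognized is still an open question in this plan, so the "rc" suffix check below is purely an assumption, as are the function names and the use of an HTML redirect for the root index.html. The commit and push to the asf-site branch is left out.

```python
import re
import shutil
from pathlib import Path

# Assumption: release candidate tags end in "rc" plus optional digits, e.g. "v0.12-rc1".
RC_PATTERN = re.compile(r"rc\d*$", re.IGNORECASE)


def is_release_candidate(tag: str) -> bool:
    return RC_PATTERN.search(tag) is not None


def publish_release_docs(tag: str, built_docs: Path, site_root: Path) -> bool:
    """Copy built docs into site_root/<tag>/ and repoint the root index.html."""
    if is_release_candidate(tag):
        return False  # release candidates do not get a versioned folder
    target = site_root / tag
    if target.exists():
        shutil.rmtree(target)  # rebuild the folder from scratch
    shutil.copytree(built_docs, target)
    # Assumption: the root index.html is a small redirect page.
    redirect = '<meta http-equiv="refresh" content="0; url={}/index.html">\n'.format(tag)
    (site_root / "index.html").write_text(redirect)
    return True
```

A Jenkins job triggered by the tagging event would call `publish_release_docs` with the tag name after the docs build finishes, then commit the updated site folder.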

Latest

  1. Every merge on master triggers the build job on Jenkins
  2. Build docs for the latest commit and replace the old latest/ with the new docs
  3. Commit and push that folder to the asf-site branch.

 

asf-site branch would look something like:

index.html -> v0.11/index.html (whichever folder is the latest stable release)
latest/ (mirrors the latest commits)
    index.html
    install/
    tutorials/ ...
v0.11/
    index.html
    install/
    tutorials/ ...
v0.10/
    index.html
    install/
    tutorials/ ...

 

This is PredictionIO's asf-site branch for reference: https://git-wip-us.apache.org/repos/asf?p=incubator-predictionio-site.git;a=tree;h=refs/heads/asf-site;hb=refs/heads/asf-site