This document covers the process for managing Spark releases.
Prerequisites for Managing A Release
Create a GPG Key (https://www.apache.org/dev/release-signing)
Install GPG tools
For Ubuntu:
sudo apt-get install gnupg
For Mac OSX:
Install GPG Suite from gpgtools.org
Create and upload keys
# Install gnupg
#  - For Ubuntu
$ sudo apt-get install gnupg
#  - For Mac OSX, install GPGTools
$ gpg --gen-key        # Create new key
$ gpg --fingerprint    # Get key digest
# Upload digest to id.apache.org (gpg --fingerprint)
$ gpg --send-keys <KEY_ID>    # Distribute key
$ gpg --output pwendell.asc --export -a <KEY_ID>
# Copy public key to Apache web space, name it <KEY_ID>.asc
# Create an FOAF file and add it via svn (see http://people.apache.org/foaf/)
#  -> should include key fingerprint
# Eventually key will show up on apache people page (e.g. https://people.apache.org/keys/committer/pwendell.asc)
Get Access to Apache Nexus for Publishing Artifacts
- You have access if and only if you can log in to repository.apache.org with your Apache username and password
- If you cannot get access, create an Apache infrastructure JIRA issue requesting it and include your Apache ID
- Install LDAP credentials in your ~/.m2/settings.xml file as described here for publishing
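As a rough illustration of the credentials step, the sketch below prints a minimal settings.xml fragment. The function name `print_nexus_settings` is hypothetical, and the server ids `apache.snapshots.https` and `apache.releases.https` are an assumption (they are the ids used by the standard Apache parent POM); check them against the linked instructions before relying on them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal Maven settings.xml holding Apache
# Nexus credentials. It writes to stdout only, so an existing
# ~/.m2/settings.xml is never overwritten.
# ASSUMPTION: the server ids below match the ones your build deploys to.
print_nexus_settings() {
  local user="$1" pass="$2"
  cat <<EOF
<settings>
  <servers>
    <server>
      <id>apache.snapshots.https</id>
      <username>${user}</username>
      <password>${pass}</password>
    </server>
    <server>
      <id>apache.releases.https</id>
      <username>${user}</username>
      <password>${pass}</password>
    </server>
  </servers>
</settings>
EOF
}
```

For example, `print_nexus_settings myapacheid 'secret' > /tmp/settings-snippet.xml` produces a file whose `<server>` entries can be merged by hand into ~/.m2/settings.xml. The password is written in plain text and is not XML-escaped, so Maven's password encryption (`mvn --encrypt-password`) may be a better fit on shared machines.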
Get "Push" Access to Apache Git Repository
git remote add apache https://git-wip-us.apache.org/repos/asf/incubator-spark.git
Preparing the Code for a Release
Ensure Spark is Ready for a Release
- Check JIRA for remaining issues tied to the release
- Review and merge any blocking features
- Bump other remaining features to subsequent releases
- Ensure Spark versions are correct in the codebase
- Includes SBT/Maven builds, docs, and ec2 scripts
- See this example commit
- NOTE: The version in pom.xml files should be SPARK-VERSION_SCALA-VERSION-SNAPSHOT and will be changed automatically when cutting the release
- NOTE: The yarn-alpha module should have its version bumped here because it is not enabled when publishing
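One way to spot modules whose version was missed is to list the pom.xml files that do not mention the expected version string. This is only a sketch: the function name `poms_missing_version` is hypothetical, and it checks for a literal `<version>...</version>` element, which will not cover the SBT build, the docs, or the ec2 scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list pom.xml files under a directory that do NOT
# contain <version>EXPECTED</version>.
poms_missing_version() {
  local root="$1" version="$2"
  # grep -L prints the names of files with no matching line;
  # -F treats the version string literally rather than as a regex.
  # "|| true" keeps the function's exit status at 0 whatever grep returns,
  # so callers should look at the output, not the status.
  find "$root" -name pom.xml -type f \
    -exec grep -LF -- "<version>${version}</version>" {} + || true
}
```

Run from the checkout as `poms_missing_version . <EXPECTED_VERSION>`, where the placeholder is the SNAPSHOT version described in the note above. An empty result means every pom.xml mentions that version; any file printed (yarn-alpha is a likely candidate) deserves a manual look.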
Check for dead links in the docs
$ cd $SPARK_HOME/docs
$ jekyll serve --watch    # keeps running; use a second terminal for the commands below
$ sudo apt-get install linkchecker
$ linkchecker -r 2 http://localhost:4000 --no-status --no-warnings
Checkout and Run Tests
$ git clone https://git-wip-us.apache.org/repos/asf/incubator-spark.git -b branch-0.8
$ cd incubator-spark
$ sbt/sbt assembly
$ export MAVEN_OPTS="-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g"
$ mvn test
Run License Audit Tool
$ java -jar /path/to/apache-rat-0.10.jar --dir . --exclude "*.md" > rat_results.txt
$ vi rat_results.txt
$ # Look for source files that seem to have missing headers
$ grep "???" rat_results.txt | grep -e '\.scala$' -e '\.java$' -e '\.py$' -e '\.sh$'
$ # Add missing headers if necessary
Create CHANGES.txt File
# Append to CHANGES.txt file required by Apache
# If doing a minor release, append to existing CHANGES.txt file in release branch
# If doing a major release, copy CHANGES.txt file from last major release
# and append to it (shown below)
$ cat CHANGES.txt | tail -n +3 > OLD_CHANGES.txt
$ echo "Spark Change Log" > CHANGES.txt
$ echo "" >> CHANGES.txt
$ echo "Release 0.9.0-incubating" >> CHANGES.txt
$ echo "" >> CHANGES.txt
$ # The pipeline below is an ugly hack. This will be much easier
$ # once all PRs use the new merge format.
$ git log v0.8.0-incubating..HEAD \
>   --grep "pull request" \
>   --pretty="QQ %h %cd%nQQ %s%nQQ QQQ%b%nQQ" \
>   | grep QQ | sed s/QQ// | sed "s/^ QQQ\(.*\)$/ [\1]/" >> CHANGES.txt
$ cat OLD_CHANGES.txt >> CHANGES.txt
$ rm OLD_CHANGES.txt
$ git add CHANGES.txt && git commit -m "Change log for release 0.9.0-incubating"
Cutting a Release Candidate
Overview
Cutting a release candidate involves two steps. First, we use the Maven release plug-in to create a release commit (a single commit where all of the version files have the correct number) and publish the code associated with that release to a staging repository in Maven. Second, we check out that release commit and package binary releases and documentation.
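To make the two steps concrete, the sketch below prints the kind of commands involved without running any of them. It is not the project's release script: the function name `release_plan` is hypothetical, and the use of `./make-distribution.sh` and `jekyll build` for the packaging step is an assumption. The `release:prepare` and `release:perform` goals and the `releaseVersion`, `developmentVersion`, and `tag` properties belong to the Maven release plug-in.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print (do not execute) the commands behind the two
# steps of cutting a release candidate.
#   $1 = release version, $2 = next development version
release_plan() {
  local release="$1" next_dev="$2" tag="v$1"
  cat <<EOF
mvn release:prepare -DreleaseVersion=${release} -DdevelopmentVersion=${next_dev} -Dtag=${tag}
mvn release:perform
git checkout ${tag}
./make-distribution.sh
(cd docs && jekyll build)
EOF
}
```

Calling `release_plan 0.9.0-incubating 0.9.1-incubating-SNAPSHOT` shows the first two lines for step one (release commit plus staging in Maven) and the last three for step two (check out the tag, then package binaries and docs). Printing rather than executing keeps the sketch safe to run anywhere.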
Release Script
- The process of creating releases has been automated via this release script
- Read and understand the script fully before you execute it. It will cut a Maven release, build binary releases and documentation, then copy the binary artifacts to a staging location on people.apache.org.
- NOTE: You must use git 1.7.X for this or else you'll hit this horrible bug
Rolling Back Release Candidates
- If a release candidate does not pass, it is necessary to roll back the commits which advanced Spark's versioning.
$ git fetch apache
$ git checkout apache/branch-0.8
$ git tag -d v0.8.1-incubating
$ git push origin :v0.8.1-incubating
$ git revert HEAD --no-edit      # revert dev version commit
$ git revert HEAD~2 --no-edit    # revert release commit
$ git push apache HEAD:branch-0.8
Auditing a Staged Release
- The process of auditing a release has been automated via this release audit script
- The release auditor will test example builds against the staged artifacts, verify signatures, and check for common mistakes made when cutting a release
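The signature and digest checks can also be done by hand on a single artifact. The helpers below are a minimal sketch, not the audit script itself: the names `verify_md5` and `verify_signature` are hypothetical, and `verify_md5` assumes the digest file is in `md5sum` format, so digests produced by another tool may need converting first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check an artifact against "<artifact>.md5".
# ASSUMPTION: the digest file uses md5sum's "<hex>  <filename>" format
# and names the artifact by its base name.
verify_md5() {
  local artifact="$1" dir file
  dir="$(dirname "$artifact")"
  file="$(basename "$artifact")"
  # --status suppresses output; the exit status reports success or failure.
  (cd "$dir" && md5sum --check --status "${file}.md5")
}

# Hypothetical helper: verify the detached signature "<artifact>.asc".
# The signer's public key must already be in the local GPG keyring.
verify_signature() {
  gpg --verify "$1.asc" "$1"
}
```

Both functions return a non-zero status on failure, so they compose in a loop over downloaded artifacts, for example `verify_md5 "$f" && verify_signature "$f" || echo "FAILED: $f"`.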
Calling a Release Vote
- The release voting happens in two stages. First, a vote takes place on the Apache Spark developers list (the podling PMC or PPMC is voting), then one takes place on the general@i.a.o list (the IPMC). I used the same template for both votes. Look at past vote threads to see how this goes. Once the vote is finished you should also send out a summary e-mail with the totals (subject “[RESULT] [VOTE]...”).
- If possible, attach a draft of the release notes with the e-mail
- Attach the CHANGES.txt file in the e-mail
- NOTE: This will change once we graduate and there will be a single vote
Please vote on releasing the following candidate as Apache Spark
(incubating) version 0.8.1.
A draft of the release notes along with the CHANGES.txt file is attached to this e-mail.
The tag to be voted on is v0.8.1-incubating (commit fba8738):
https://git-wip-us.apache.org/repos/asf?p=incubator-spark.git;a=tag;h=720e75581ae5f0c4835513ee06bfa0cb71923c57
The release files, including signatures, digests, etc can be found at:
http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc1/
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc
The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-022/
The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-0.8.1-incubating-rc1-docs/
Please vote on releasing this package as Apache Spark 0.8.1-incubating!
The vote is open until Thursday, September 19th at 05:00 UTC and passes if
a majority of at least 3 +1 [PPMC/IPMC] votes are cast.
[ ] +1 Release this package as Apache Spark 0.8.1-incubating
[ ] -1 Do not release this package because ...
To learn more about Apache Spark, please see
http://spark.incubator.apache.org/
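When writing the [RESULT] e-mail, the passing rule stated in the template (a majority, with at least 3 +1 votes) can be checked mechanically. This is a small sketch with a hypothetical function name; it assumes the counts passed in are the binding [PPMC/IPMC] votes only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a vote passes under the template's rule.
#   $1 = number of binding +1 votes, $2 = number of binding -1 votes
vote_passes() {
  local plus="$1" minus="$2"
  # At least three +1 votes, and more +1 votes than -1 votes.
  [ "$plus" -ge 3 ] && [ "$plus" -gt "$minus" ]
}
```

For instance, `vote_passes 4 1 && echo PASSED || echo FAILED` reports the outcome for four +1 votes and one -1 vote.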
Cutting the Official Release
Performing the Final Release in Nexus
Be Careful!
Make sure you choose the correct staging repository. THIS STEP IS IRREVERSIBLE.
- Find the staging repository and click "Release" and confirm.
Uploading Final Source and Binary Artifacts
Be Careful!
Once you move the artifacts into the release folder, they cannot be removed. THIS STEP IS IRREVERSIBLE.
# Create SVN folder and add the release artifacts there:
# https://dist.apache.org/repos/dist/dev/incubator/spark/spark-0.9.0-incubating-rc5
$ scp pwendell@people.apache.org:~/public_html/spark-0.9.0-incubating-rc5/* spark-0.9.0-incubating-rc5/
# Verify md5 sums
$ svn add spark-0.9.0-incubating-rc5
$ svn commit -m "Adding spark-0.9.0-incubating-rc5"
$ svn mv https://dist.apache.org/repos/dist/dev/incubator/spark/spark-0.9.0-incubating-rc5 \
>   https://dist.apache.org/repos/dist/release/incubator/spark/spark-0.9.0-incubating
# Look at http://www.apache.org/dist/incubator/spark/ to make sure it's there.
# This will be mirrored throughout the Apache network.
Packaging and Wrap-Up for the Release
- Update remaining version numbers in the release branch (see this example commit)
- Update the spark-ec2 scripts
- Upload the binary packages to the spark-related-packages bucket in S3 and make them public
- Alter the init scripts in amplab/spark-ec2 repository to pull new binaries (see this example commit)
- You can audit the ec2 set-up by launching a cluster and running this audit script
- Update the Spark website
- The website repo is at: https://svn.apache.org/repos/asf/incubator/spark
- Copy new documentation to /site/docs and update the "latest" link
- NOTE: For the below items, look at how previous releases are documented on the site
- Create release notes
- Update documentation page
- Update downloads page
- Update the main page with a news item
- Once everything is working (ec2, website docs, website changes) create an announcement on the website and then send an e-mail to the mailing list
- Enjoy an adult beverage of your choice, congrats on making a Spark release