This document covers the process for managing Spark releases.

Prerequisites for Managing a Release

Pre-prerequisites

Create a GPG Key (https://www.apache.org/dev/release-signing)

# ---- Install GPG ----
# For Ubuntu, install through apt-get
$ sudo apt-get install gnupg
# For Mac OS X, install GPG Suite from http://gpgtools.org

# ---- Generate key ----
$ gpg --gen-key                   # Create new key, make sure it is RSA and 4096 bits (see https://www.apache.org/dev/openpgp.html#generate-key)
$ gpg --output <KEY_ID>.asc --export -a <KEY_ID>  # Generate public key file for distribution to Apache infrastructure

# ---- Distribute key ----
$ gpg --send-key <KEY_ID>         # Distribute public key to a key server, <KEY_ID> is the 8 HEX characters in the output of the previous command "pub  4096R/<KEY_ID> "
$ gpg --fingerprint               # Get key digest
# Open http://id.apache.org , login with Apache account and upload the key digest
$ scp <KEY_ID>.asc <USER_NAME>@people.apache.org:~/   # Copy generated <KEY_ID>.asc to Apache web space
# Create an FOAF file and add it via svn (see http://people.apache.org/foaf/ )
#   - should include key fingerprint
# Eventually key will show up on apache people page (e.g. https://people.apache.org/keys/committer/pwendell.asc )
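
As a quick sanity check, you can confirm that the key is visible on a public key server and that the fingerprint matches what was uploaded to id.apache.org (the key server named below is only an example; any public key server works):

# ---- Verify key distribution (optional sanity check) ----
$ gpg --keyserver hkp://pool.sks-keyservers.net --recv-keys <KEY_ID>   # should find the key (or report it unchanged)
$ gpg --fingerprint <KEY_ID>                                           # fingerprint should match the one uploaded to id.apache.org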

Get Access to Apache Nexus for Publishing Artifacts
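
Once the Nexus account is set up, Maven needs your Apache credentials in ~/.m2/settings.xml in order to deploy to the staging repository. A minimal sketch is shown below; the server ids apache.snapshots.https and apache.releases.https are the ones defined by the Apache parent POM, so double-check them against the POM actually in use, and merge this into any existing settings.xml rather than overwriting it.

# Write a minimal ~/.m2/settings.xml with the Nexus credentials (sketch only)
$ cat > ~/.m2/settings.xml <<'EOF'
<settings>
  <servers>
    <server>
      <id>apache.snapshots.https</id>
      <username>YOUR_APACHE_USERNAME</username>
      <password>YOUR_APACHE_PASSWORD</password>
    </server>
    <server>
      <id>apache.releases.https</id>
      <username>YOUR_APACHE_USERNAME</username>
      <password>YOUR_APACHE_PASSWORD</password>
    </server>
  </servers>
</settings>
EOF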

Get "Push" Access to Apache Git Repository

Preparing the Code for a Release

Ensure Spark is Ready for a Release

Check Out and Run Tests

$ git clone https://git-wip-us.apache.org/repos/asf/spark.git -b branch-0.9
$ cd spark
$ sbt/sbt assembly
$ export MAVEN_OPTS="-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g"
$ mvn test

Check for Dead Links in the Docs

$ cd $SPARK_HOME/docs
$ jekyll serve --watch            # leave this running; run the remaining commands in a separate terminal
$ sudo apt-get install linkchecker
$ linkchecker -r 2 http://localhost:4000 --no-status --no-warnings

Create new CHANGES.txt File

The new CHANGES.txt can be generated using this script.
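
The script itself is not reproduced here; conceptually it collects the commits made since the previous release tag, roughly along the lines of the following (v0.9.0 is a stand-in for the previous release tag, and the actual script is the authoritative source):

# Rough sketch only; the real CHANGES.txt script does more formatting than this
$ git log v0.9.0..HEAD --pretty=format:"  %s (%an)" > CHANGES.txt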

Cutting a Release Candidate

Overview

Cutting a release candidate involves two steps. First, we use the Maven release plug-in to create a release commit (a single commit where all of the version files have the correct number) and publish the code associated with that release to a staging repository in Maven. Second, we check out that release commit and package binary releases and documentation.
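
As a rough sketch of the first step (the exact invocation and flags used for Spark have varied between releases, and the version numbers below are hypothetical), the Maven release plug-in is driven by something like:

# Dry run first to check the computed versions, then clean up the dry-run metadata
$ mvn release:prepare -DdryRun=true -DreleaseVersion=0.9.1 -DdevelopmentVersion=0.9.2-SNAPSHOT -Dtag=v0.9.1-rc3
$ mvn release:clean
# Create the release commit and tag, then publish the artifacts to the Nexus staging repository
$ mvn release:prepare -DreleaseVersion=0.9.1 -DdevelopmentVersion=0.9.2-SNAPSHOT -Dtag=v0.9.1-rc3
$ mvn release:perform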

Setting up an EC2 Instance (Recommended)

Creating Release Candidates

Rolling Back Release Candidates
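
No specific procedure is documented here; as a rough sketch, rolling back an RC typically means undoing the release commit, deleting the RC tag, and dropping the staging repository in Nexus (the tag name below is hypothetical):

# Revert the version changes made by the Maven release plug-in (if release:prepare was run)
$ mvn release:rollback
# Delete the RC tag locally and, if it was pushed, remotely
$ git tag -d v0.9.1-rc3
$ git push origin :refs/tags/v0.9.1-rc3
# Finally, select the corresponding staging repository in the Nexus UI and drop it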

Auditing a Staged Release Candidate

Calling a Release Vote

Cutting the Official Release

Performing the Final Release in Nexus

Make sure you choose the correct staging repository. THIS STEP IS IRREVERSIBLE.

Uploading Final Source and Binary Artifacts

Once you move the artifacts into the release folder, they cannot be removed. THIS STEP IS IRREVERSIBLE.

To upload the binaries, first upload them to the "dev" directory in the Apache distribution repo, then move them from the "dev" directory to the "release" directory. This move is the only way to add artifacts to the actual release directory.

# Check out the Spark directory in the Apache distribution SVN "dev" repo
$ svn co https://dist.apache.org/repos/dist/dev/spark/
 
# Move into the checked-out directory and make a directory for this RC
$ cd spark
$ mkdir spark-0.9.1-rc3
 
# Download the voted binaries into the RC subdirectory created above
$ scp tdas@people.apache.org:~/public_html/spark-0.9.1-rc3/* spark-0.9.1-rc3/
# Verify md5 sums
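# For example (assuming the .md5 and .asc files were downloaded along with the binaries):
$ (cd spark-0.9.1-rc3 && md5sum *.tgz)                                # compare against the contents of the .md5 files
$ (cd spark-0.9.1-rc3 && for f in *.tgz; do gpg --verify "$f.asc" "$f"; done)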
$ svn add spark-0.9.1-rc3
$ svn commit -m "Adding spark-0.9.1-rc3" 
 
# Move the subdirectory in "dev" to the corresponding directory in "release"
$ svn mv -m "Moving spark-0.9.1-rc3 to release" https://dist.apache.org/repos/dist/dev/spark/spark-0.9.1-rc3 https://dist.apache.org/repos/dist/release/spark/spark-0.9.1
# Look at http://www.apache.org/dist/spark/ to make sure it's there. It may take a while for them to be visible.
# This will be mirrored throughout the Apache network.

 

Packaging and Wrap-Up for the Release

 

Miscellaneous

Steps to create the AMI used for making releases

# Install necessary tools
$ sudo apt-get update --fix-missing
$ sudo apt-get install -y git openjdk-6-jdk maven rubygems python-epydoc gnupg-agent linkchecker libgfortran3
 
# Install Scala of the same version as that used by Spark
$ cd
$ wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz  
$ tar xvzf scala*.tgz
$ ln -s scala-2.10.3 scala

# Install SBT of a version compatible with the SBT of Spark (at least 0.13.1)
$ cd && mkdir sbt
$ cd sbt 
$ wget http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.1/sbt-launch.jar
# Create /home/ubuntu/sbt/sbt with the following code
	#!/bin/bash
	SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
	java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"
$ chmod u+x /home/ubuntu/sbt/sbt
 
# Add stuff to ~/.bashrc
$ echo "export SCALA_HOME=/home/ubuntu/scala/" >> ~/.bashrc 
$ echo "export SBT_HOME=/home/ubuntu/sbt/" >> ~/.bashrc 
$ echo "export MAVEN_OPTS='-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g'" >> ~/.bashrc
$ echo "export PATH='$SCALA_HOME/bin/:$SBT_HOME:$PATH'" >> ~/.bashrc
$ source ~/.bashrc
 
# Verify versions
java -version    # Make sure your Java version is 1.6! Jars built with Java 7 have known problems
sbt sbt-version  # Forces a download of SBT dependencies and finally prints the SBT version; verify that it is >= 0.13.1
scala -version   # Verify that the Scala version matches the one used by Spark