This document covers the process for managing Spark releases.
```
# ---- Install GPG ----
# For Ubuntu, install through apt-get
$ sudo apt-get install gnupg
# For Mac OS X, install GPG Suite from http://gpgtools.org

# ---- Generate key ----
# Create a new key; make sure it is RSA and 4096 bits
# (see https://www.apache.org/dev/openpgp.html#generate-key)
$ gpg --gen-key
# Generate a public key file for distribution to Apache infrastructure
$ gpg --output <KEY_ID>.asc --export -a <KEY_ID>

# ---- Distribute key ----
# Distribute the public key to a key server; <KEY_ID> is the 8 hex characters
# in the "pub 4096R/<KEY_ID>" line of the previous command's output
$ gpg --send-key <KEY_ID>
# Get the key digest
$ gpg --fingerprint
# Open http://id.apache.org , log in with your Apache account, and upload the key digest
# Copy the generated <KEY_ID>.asc to Apache web space
$ scp <KEY_ID>.asc <USER_NAME>@people.apache.org:~/
# Create a FOAF file and add it via svn (see http://people.apache.org/foaf/ )
# - it should include the key fingerprint
# Eventually the key will show up on the Apache people page
# (e.g. https://people.apache.org/keys/committer/pwendell.asc )
```
```
$ git remote add apache https://git-wip-us.apache.org/repos/asf/spark.git
```
Make sure you have configured git author info:
```
$ git config --global user.name <GIT USERNAME>
$ git config --global user.email <GIT EMAIL ADDRESS>
```
```
$ git clone https://git-wip-us.apache.org/repos/asf/spark.git -b branch-0.9
$ cd spark
$ sbt/sbt assembly
$ export MAVEN_OPTS="-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g"
$ mvn test
```
```
$ cd $SPARK_HOME/docs
$ jekyll serve --watch
$ sudo apt-get install linkchecker
$ linkchecker -r 2 http://localhost:4000 --no-status --no-warnings
```
The new CHANGES.txt can be generated using this script.
Set the SPARK_HOME environment variable and run the script.
```
$ export SPARK_HOME="..."
$ python -u generate-changelist.py
```
Cutting a release candidate involves two steps. First, we use the Maven release plug-in to create a release commit (a single commit in which all of the version files have the correct number) and publish the code associated with that release to a staging repository in Maven. Second, we check out that release commit and package binary releases and documentation.
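As a rough sketch, the two steps map onto the standard Maven release plug-in workflow shown below. The exact flags, version numbers, and tag names here are illustrative assumptions; Spark's release scripts may pass different options.

```
# Step 1: create the release commit and publish artifacts to the Apache staging repository.
# (Running release:prepare with -DdryRun=true first lets you rehearse without pushing anything.)
$ mvn release:prepare -Dtag=v0.9.1-rc1 \
    -DreleaseVersion=0.9.1 -DdevelopmentVersion=0.9.2-SNAPSHOT
$ mvn release:perform

# Step 2: check out the release commit and build the binary packages and docs from it.
$ git checkout v0.9.1-rc1
```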
Transfer your GPG keys from your home machine to the EC2 instance.
```
# == On home machine ==
# Identify the KEY_ID of the key you generated
$ gpg --list-keys
$ gpg --output pubkey.gpg --export <KEY_ID>
$ gpg --output - --export-secret-key <KEY_ID> | cat pubkey.gpg - | gpg --armor --output keys.asc --symmetric --cipher-algo AES256
# Copy keys.asc to the EC2 instance

# == On EC2 machine ==
# May be necessary if the ownership of the gpg files is not set to the current user
$ sudo chown -R ubuntu:ubuntu ~/.gnupg/*
# Import the keys
$ sudo gpg --no-use-agent --output - keys.asc | gpg --import
# Confirm that your key has been imported, then remove the keys file
$ gpg --list-keys
$ rm keys.asc
```
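After importing, it is worth confirming that the secret key actually works on the EC2 instance before starting the release build. A minimal smoke test, assuming `<KEY_ID>` is the key you just imported:

```
# Sign a throwaway message; gpg will prompt for the passphrase.
# If this prints "signing works", the secret key imported correctly.
$ echo "test" | gpg --clearsign --local-user <KEY_ID> > /dev/null && echo "signing works"
```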
Install the private key that gives you password-less access to the Apache webspace.
Set git user name and email (these are going to appear as the committer in the release commits).
```
$ git config --global user.name "Tathagata Das"
$ git config --global user.email tathagata.das1565@gmail.com
```
Check out the version of Spark that has the right release-related scripts. For instance, to check out the master branch, run "git clone https://git-wip-us.apache.org/repos/asf/spark.git".
Make sure Maven is configured with your Apache username and password. Your ~/.m2/settings.xml should have the following.
```
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <servers>
    <server>
      <id>apache.snapshots.https</id>
      <username>APACHE_USERNAME</username>
      <password>PASSWORD</password>
    </server>
    <server>
      <id>apache.releases.https</id>
      <username>APACHE_USERNAME</username>
      <password>PASSWORD</password>
    </server>
  </servers>
</settings>
```
If a release candidate does not pass, it is necessary to roll back the commits which advanced Spark's versioning.
```
# Checkout the release branch from the Apache repo

# Delete the earlier tag. If you are using RC-based tags (v0.9.1-rc1), skip this.
$ git tag -d v0.9.1
$ git push origin :v0.9.1

# Revert the changes made by the Maven release plugin
$ git revert HEAD --no-edit     # revert dev version commit
$ git revert HEAD~2 --no-edit   # revert release commit
$ git push apache HEAD:branch-0.9
```
Once the vote is done, you should also send out a summary e-mail with the totals (subject “[RESULT] [VOTE]...”).
```
Please vote on releasing the following candidate as Apache Spark version 0.9.1

A draft of the release notes along with the CHANGES.txt file is attached
to this e-mail.

The tag to be voted on is v0.9.1-rc1 (commit 81c6a06c):

The release files, including signatures, digests, etc can be found at:

Release artifacts are signed with the following key:

The staging repository for this release can be found at:

The documentation corresponding to this release can be found at:

Please vote on releasing this package as Apache Spark 0.9.1!

The vote is open until Thursday, September 19th at 05:00 UTC and passes if
a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 0.9.1

To learn more about Apache Spark, please see
http://spark.apache.org/
```
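When writing the [RESULT] e-mail, the totals can be tallied mechanically from a saved copy of the vote thread. A small self-contained sketch (votes.txt and its contents are hypothetical stand-ins for the real thread):

```shell
# Build a stand-in vote file; in practice you would save the reply
# lines from the actual vote thread into votes.txt.
cat > votes.txt <<'EOF'
+1 Release this package as Apache Spark 0.9.1
+1 (binding)
-1 Do not release because of a regression
+1 looks good
EOF

# Count votes by their leading marker.
echo "+1 votes: $(grep -c '^+1' votes.txt)"
echo "-1 votes: $(grep -c '^-1' votes.txt)"
```

With the sample file above this prints `+1 votes: 3` and `-1 votes: 1`.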
Make sure you choose the correct staging repository. THIS STEP IS IRREVERSIBLE.
Once you move the artifacts into the release folder, they cannot be removed. THIS STEP IS IRREVERSIBLE.
To upload the binaries, first upload them to the "dev" directory in the Apache Distribution repo, then move them from the "dev" directory to the "release" directory. This move is the only way to add artifacts to the actual release directory.
```
# Checkout the Spark directory in the Apache distribution SVN "dev" repo
$ svn co https://dist.apache.org/repos/dist/dev/spark/

# Make a directory for this RC in the above directory
$ mkdir spark-0.9.1-rc3

# Download the voted binaries and add them to the directory
$ scp tdas@people.apache.org:~/public_html/spark-0.9.1-rc3/* spark-0.9.1-rc3/

# Verify md5 sums
$ svn add spark-0.9.1-rc3
$ svn commit -m "Adding spark-0.9.1-rc3"

# Move the subdirectory in "dev" to the corresponding directory in "release"
$ svn mv https://dist.apache.org/repos/dist/dev/spark/spark-0.9.1-rc3 https://dist.apache.org/repos/dist/release/spark/spark-0.9.1

# Look at http://www.apache.org/dist/spark/ to make sure it's there. It may take
# a while for the files to be visible. They will be mirrored throughout the
# Apache network.
```
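The "Verify md5 sums" step above can be done with `md5sum -c`. A minimal self-contained sketch, using a throwaway file in place of the real release binaries:

```shell
# Create a stand-in artifact and its checksum file, then verify it,
# mimicking the check run against the real spark-0.9.1-rc3 binaries.
echo "release artifact contents" > artifact.tgz
md5sum artifact.tgz > artifact.tgz.md5

# Prints "artifact.tgz: OK" when the checksum matches.
md5sum -c artifact.tgz.md5
```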
Update the Spark Apache repository
Check out the tagged commit for the release candidate and apply the correct version tag.
```
# Apply the correct tag
$ git checkout v0.9.1-rc3   # checkout the RC that passed
$ git tag v0.9.1
$ git push apache v0.9.1

# Verify on the Apache git repo that the tag has been applied correctly

# Remove the old tag
$ git push apache :v0.9.1-rc3
```
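One way to verify the tags from the remote side without another clone is `git ls-remote`; a sketch (the grep pattern assumes the 0.9.1 tag names used above):

```
# Lists remote tags; the final tag should appear and the RC tag should be gone.
$ git ls-remote --tags https://git-wip-us.apache.org/repos/asf/spark.git | grep v0.9.1
```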
The website repo is at: https://svn.apache.org/repos/asf/spark
```
$ svn co https://svn.apache.org/repos/asf/spark
```
Copy the new documentation to spark/site/docs and update the "latest" link. Make sure the docs were generated with the PRODUCTION=1 environment variable, if they weren't already.
$ PRODUCTION=1 jekyll build |
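Copying the docs and repointing the "latest" link can be sketched as follows. The `_site` output directory and the versioned-directory layout under spark/site/docs are assumptions based on the description above; check the checkout before committing.

```
# From the Spark docs build, copy the generated site into the website checkout
$ cp -r _site spark/site/docs/0.9.1

# Repoint the "latest" symlink at the new release
$ cd spark/site/docs
$ rm latest
$ ln -s 0.9.1 latest

# Add and commit via svn
$ cd ../../..
$ svn add spark/site/docs/0.9.1
$ svn commit -m "Add docs for Spark 0.9.1"
```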
```
# Install necessary tools
$ sudo apt-get update --fix-missing
$ sudo apt-get install -y git openjdk-6-jdk maven rubygems python-epydoc gnupg-agent linkchecker libgfortran3

# Install the same Scala version as the one used by Spark
$ cd
$ wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz
$ tar xvzf scala*.tgz
$ ln -s scala-2.10.3 scala

# Install an SBT version compatible with the SBT of Spark (at least 0.13.1)
$ cd && mkdir sbt
$ cd sbt
$ wget http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.1/sbt-launch.jar

# Create /home/ubuntu/sbt/sbt with the following two lines
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"

$ chmod u+x /home/ubuntu/sbt/sbt

# Add the following to ~/.bashrc
$ echo "export SCALA_HOME=/home/ubuntu/scala/" >> ~/.bashrc
$ echo "export SBT_HOME=/home/ubuntu/sbt/" >> ~/.bashrc
$ echo "export MAVEN_OPTS='-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g'" >> ~/.bashrc
$ echo 'export PATH="$SCALA_HOME/bin/:$SBT_HOME:$PATH"' >> ~/.bashrc
$ source ~/.bashrc

# Verify versions
$ java -version    # Make sure the Java version is 1.6! Jars built with Java 7 have known problems.
$ sbt sbt-version  # Forces the download of SBT dependencies and finally prints the SBT version;
                   # verify that it is >= 0.13.1
$ scala -version   # Verify that the Scala version is the same as the one used by Spark
```