
This document provides a comprehensive checklist of the tasks to perform before, during, and after an Arrow release.

Preparing for the release

JIRA tidying

Before creating a source release, the release manager must ensure that any resolved JIRAs have the appropriate Fix Version set so that the changelog is generated properly.

To do this, search the Arrow project for resolved issues that have no Fix Version. Click the "Tools" dropdown menu in the top right of the page and select "Bulk Change". Indicate that you wish to edit the issues, then set the correct Fix Version and apply the change. Remember to uncheck the "send e-mail notifications" box to avoid sending excess mail to issues@arrow.apache.org.
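
The search step can also be scripted. The following is a minimal sketch, assuming the jira Python package and the JIRA_USER and JIRA_PASSWORD environment variables listed in the requirements below. The server URL, the JQL filter, and the function names are illustrative assumptions and are not part of the Arrow release scripts.

```python
import os

# Assumption: the ASF JIRA instance that hosts the ARROW project.
ASF_JIRA_URL = "https://issues.apache.org/jira"


def missing_fix_version_jql(project="ARROW"):
    """Build a JQL query for fixed issues that have no Fix Version.

    The exact filter is an assumption; adjust it to match how the
    project marks issues as resolved.
    """
    return (
        f"project = {project} AND resolution = Fixed "
        "AND fixVersion is EMPTY ORDER BY key ASC"
    )


def find_issues_missing_fix_version(project="ARROW", limit=200):
    """Return (key, summary) pairs for issues matching the query above."""
    # Imported lazily so the query builder works without the jira package.
    from jira import JIRA

    client = JIRA(
        server=ASF_JIRA_URL,
        basic_auth=(os.environ["JIRA_USER"], os.environ["JIRA_PASSWORD"]),
    )
    issues = client.search_issues(missing_fix_version_jql(project), maxResults=limit)
    return [(issue.key, issue.fields.summary) for issue in issues]
```

Printing the returned pairs gives a list to review before applying the bulk change in the JIRA web interface.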

Main source release and vote

Source release and vote

Requirements:

  • You must not have any arrow-cpp or parquet-cpp environment variables defined. The exceptions are CC and CXX, which you may set if you want to build with a compiler other than the default GCC (e.g. clang).
  • You must be a committer so that you can push to the dist and Maven repositories.
  • You need a GPG key to sign the artifacts.
  • Use Java 8 (at least for now; see ARROW-930 in JIRA) and check that your JAVA_HOME environment variable points to it.
  • Configure Maven to publish artifacts to the Apache repositories (see http://www.apache.org/dev/publishing-maven-artifacts.html).
  • Install the build requirements for cpp and c_glib (see their READMEs).
  • Set the JIRA_USER and JIRA_PASSWORD environment variables.
  • Install the jira Python package.
  • Install the en_US.UTF-8 locale (you can list the available locales with locale -a).
  • Install Python 3 as python.
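
Several of the requirements above can be checked mechanically before starting. The following is a minimal sketch of such a preflight check. The function name, the ARROW_ and PARQUET_ prefixes used to detect stray environment variables, and the message wording are assumptions made for illustration; the check does not cover the GPG key, Maven configuration, or committer access.

```python
# Variables the release steps expect to be set (from the list above).
REQUIRED_ENV_VARS = ("JIRA_USER", "JIRA_PASSWORD", "JAVA_HOME")

# Assumption: arrow-cpp and parquet-cpp variables use these prefixes.
SUSPECT_PREFIXES = ("ARROW_", "PARQUET_")


def check_release_environment(env, available_locales, python_version):
    """Return a list of human-readable problems; an empty list means OK.

    env               -- mapping of environment variables (e.g. os.environ)
    available_locales -- iterable of locale names (e.g. lines of `locale -a`)
    python_version    -- version tuple (e.g. sys.version_info)
    """
    problems = []
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    for name in sorted(env):
        if name.startswith(SUSPECT_PREFIXES):
            problems.append(f"{name} is set; unset it before building")
    # `locale -a` may print en_US.utf8 or en_US.UTF-8 depending on the system.
    normalized = {loc.strip().lower().replace("-", "") for loc in available_locales}
    if "en_us.utf8" not in normalized:
        problems.append("en_US.UTF-8 locale is not installed")
    if python_version[0] < 3:
        problems.append("python does not resolve to Python 3")
    return problems
```

For a real run, the inputs could come from os.environ, the lines printed by locale -a, and sys.version_info.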

To build the source release, run the following (replace 0.1.0 with version to release):

# create a release branch
git checkout -b release-0_1_0


# setup gpg agent for signing artifacts
source dev/release/setup-gpg-agent.sh


# prepare release 0.1.0 (run tests, sign artifacts); the next development version will be 0.1.1-SNAPSHOT
sh dev/release/00-prepare.sh 0.1.0 0.1.1


# push the tag to the apache remote
git push apache apache-arrow-0.1.0


# tag and stage artifacts to maven repo (repo will have to be finalized separately)
sh dev/release/01-perform.sh


# checkout the tag under a new branch name and push that branch to your fork's remote
#
# to launch a crossbow build this branch _must_ exist on your remote
git checkout -b zero-one-zero apache-arrow-0.1.0
git push <your fork's remote> zero-one-zero


# launch crossbow build, and wait for that to finish. the status can be checked
# using:
# python dev/tasks/crossbow.py status build-<build_number>
#
# <build_number> is output when you launch the build
python dev/tasks/crossbow.py -g conda -g linux -g wheel


# download and sign the artifacts
# 
# this will download packages to a directory called packages/
python dev/tasks/crossbow.py sign build-<build_number>


# create the source release
#
# <rc number> starts at 0 and increments every time the release candidate is burned
# <build_number> is the same as the one from the previous step
#
# so for the first RC this would be: sh dev/release/02-source.sh 0.1.0 0 packages/build-<build_number>
sh dev/release/02-source.sh 0.1.0 <rc number> <packages directory>

# once the vote has passed, publish the staged maven artifacts (see below)
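
The commands above use a consistent naming pattern derived from the version number. The following sketch spells that pattern out, based only on the 0.1.0 example shown here. The function name is hypothetical, and the patch-level bump for the next development version is an assumption taken from the example (0.1.0 followed by 0.1.1-SNAPSHOT); the release manager may choose a different next version.

```python
def release_names(version):
    """Derive the branch, tag, and next development version for a release.

    Expects a plain X.Y.Z version string such as "0.1.0".
    """
    major, minor, patch = (int(part) for part in version.split("."))
    # Assumption: the next development version bumps the patch number.
    next_version = f"{major}.{minor}.{patch + 1}"
    return {
        "branch": "release-" + version.replace(".", "_"),
        "tag": f"apache-arrow-{version}",
        "next_version": next_version,
        "next_snapshot": f"{next_version}-SNAPSHOT",
    }
```

For version 0.1.0 this yields the branch release-0_1_0, the tag apache-arrow-0.1.0, and the two arguments passed to 00-prepare.sh (0.1.0 and 0.1.1).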


Start the vote thread on dev@arrow.apache.org and supply instructions for verifying the integrity of the release. Approval requires a net of 3 +1 votes from PMC members. A release cannot be vetoed.
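
As an illustration of the approval rule, the sketch below tallies binding (PMC) votes. It assumes the general ASF release-vote convention, which is one reading of the rule stated above: at least three binding +1 votes and more +1 than -1 votes, with a -1 counting as a vote rather than a veto. The function name is hypothetical.

```python
def vote_passes(binding_votes):
    """Decide whether a release vote passes.

    binding_votes -- iterable of +1, 0, or -1 cast by PMC members.
    Assumption: ASF convention of at least three +1 votes and a
    majority of +1 over -1; a -1 does not veto a release.
    """
    plus = sum(1 for vote in binding_votes if vote > 0)
    minus = sum(1 for vote in binding_votes if vote < 0)
    return plus >= 3 and plus > minus
```

Non-binding votes from other community members are welcome on the thread but are not passed to this function.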

Useful commands:

To set the mvn version in the poms

mvn versions:set -DnewVersion=0.1-SNAPSHOT

Reset your workspace

git reset --hard

Setup gpg agent

eval $(gpg-agent --daemon --allow-preset-passphrase)
gpg --use-agent -s LICENSE.txt

Delete tag locally

git tag -d apache-arrow-0.1.0

How to stage maven artifacts:

Artifacts get staged during the perform phase of the scripts above.

If you need to stage the artifacts again, follow the instructions below:

# checkout the release tag
git checkout apache-arrow-0.1.0
# setup the gpg agent for signing artifacts
source dev/release/setup-gpg-agent.sh
# go in the java subfolder
cd java
# stage the artifacts
mvn -Papache-release deploy

How to publish the staged artifacts:

  1. Log on to the Apache repository: https://repository.apache.org/#stagingRepositories
  2. Select the Arrow staging repository you just created: orgapachearrow-100x
  3. Click the "close" button.
  4. Once validation has passed, click the "release" button.

Post-release tasks

After the release vote, we must undertake many tasks to update source artifacts, binary builds, and the Arrow website.

Updating the Arrow website

The website is a Jekyll project in the site/ directory in the main Arrow repository. As part of updating the website, we must perform various subtasks.

First, create a new entry to add to http://arrow.apache.org/release/; these entries are in the _release subdirectory. The contents of the new entry go into a new Markdown file named X.Y.Z.md. You can start by copying one of the other release entries.

Generate a web-friendly changelog by running the following (requires Python 3):

dev/release/changelog.py $VERSION 1


Copy and paste the result.

Update index.html as appropriate for the new release. Then update install.md to include links for the new release.

Finally, if appropriate, write a short blog post summarizing the new release highlights. Here is an example.

Uploading release artifacts to SVN

A PMC member must commit the source release artifacts to SVN:

# set the versions and paths for this release
PREVIOUS_RELEASE_VERSION="arrow-0.9.0"
VERSION="0.10.0"

RC_NUMBER=1
RC_CLONE="path/to/rc/checkout"
RELEASE_VERSION="arrow-${VERSION}"

# checkout the svn repo
svn co https://dist.apache.org/repos/dist/release/arrow tmp

# copy everything from the release candidate directory into the official release directory

mkdir tmp/"${RELEASE_VERSION}"
cp -r "${RC_CLONE}/apache-${RELEASE_VERSION}-rc${RC_NUMBER}"/* "tmp/${RELEASE_VERSION}"

# add to SVN
svn add tmp/"${RELEASE_VERSION}"


# delete the old version
svn delete tmp/"${PREVIOUS_RELEASE_VERSION}"

# commit both the new release directory and the removal of the old one
svn ci -m "Apache Arrow ${VERSION}" tmp
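
Before running the copy, it may help to print the directories involved and confirm that they match the release candidate that was voted on. The sketch below derives them from the same inputs as the shell variables above. The function name is hypothetical, and the default checkout directory tmp simply mirrors the svn co command.

```python
from pathlib import PurePosixPath


def svn_promotion_paths(version, rc_number, previous_version, rc_clone, checkout="tmp"):
    """Return the source, target, and obsolete directories for the SVN upload.

    Mirrors the shell variables above: the release directory is
    arrow-<version> and the release candidate directory is
    apache-arrow-<version>-rc<rc_number>.
    """
    release_dir = f"arrow-{version}"
    return {
        "rc_dir": str(PurePosixPath(rc_clone) / f"apache-{release_dir}-rc{rc_number}"),
        "target_dir": str(PurePosixPath(checkout) / release_dir),
        "old_dir": str(PurePosixPath(checkout) / f"arrow-{previous_version}"),
    }
```

With the values used above (version 0.10.0, RC number 1, previous version 0.9.0), the target directory is tmp/arrow-0.10.0 and the directory to delete is tmp/arrow-0.9.0.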


Announcing release

Add relevant release data for Arrow to https://reporter.apache.org.

Write a release announcement (see example) and send to announce@apache.org and dev@arrow.apache.org. The announcement to announce@apache.org must be sent from your apache.org e-mail address to be accepted.

Updating website with new API documentation

The API documentation for C++, C Glib, Python, Java, and JavaScript can be generated via a Docker-based setup. To generate the API documentation run the following command:

bash dev/gen_apidocs.sh


This script assumes that the parquet-cpp Git repository https://github.com/apache/parquet-cpp has been cloned beside the Arrow repository and that the current user can create a `dist` directory at the same level. Note that most of the software must be built in order to create the documentation, so this step may take some time to run, especially the first time, when the Docker container also has to be built.

To upload the updated documentation to the website, navigate to site/asf-site and commit all changes:

pushd site/asf-site
git add .
git commit -m "Updated API documentation for version X.Y.Z"


After successfully creating the API documentation, the website can be run locally to browse the API documentation from the top-level Documentation menu. To run the website, issue the command:

bash dev/run_site.sh


The local URL for the website running inside the docker container will be shown as `Server address:` in the output of the command. To stop the server press `Ctrl-C` in that window.

Updating C++ and Python packages

We have been making Arrow available to C++ and Python users on the 3 major platforms (Linux, macOS, and Windows) via two package managers: pip and conda.

Updating Python Artifacts

pip Packages

pip binary packages (called "wheels") are built using the crossbow tool that we used above during the release candidate creation process and then uploaded to PyPI (Python Package Index) under the pyarrow package.

We use the twine tool to upload wheels to PyPI:

# go to the python directory of your arrow clone
cd arrow/python

# upload wheels to a testing index
twine upload --repository-url https://test.pypi.org/legacy/ binaries/*/*.whl

# if all went well then upload to the live index
twine upload binaries/*/*.whl


# Upload source distribution; make sure you do so from a tagged release
git checkout apache-arrow-$VERSION
python setup.py sdist upload


Make sure you use twine >= 1.11.0, which supports the Markdown long description in setup.py. The Markdown long description also requires setuptools >= 38.6.0.
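
The version requirements can be checked before uploading. The following is a minimal sketch that compares plain numeric version strings against the minimums stated above. The function names are hypothetical, and pre-release suffixes such as rc1 are not handled.

```python
# Minimum versions stated in the text above.
MINIMUM_VERSIONS = {"twine": "1.11.0", "setuptools": "38.6.0"}


def parse_version(text):
    """Turn "1.11.0" into (1, 11, 0); only plain numeric parts are supported."""
    return tuple(int(part) for part in text.split(".")[:3])


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that 1.9.1 is older than 1.11.0."""
    return parse_version(installed) >= parse_version(minimum)


def outdated_tools(installed_versions):
    """Return the names of tools that are missing or older than required."""
    return sorted(
        name
        for name, minimum in MINIMUM_VERSIONS.items()
        if not meets_minimum(installed_versions.get(name, "0"), minimum)
    )
```

On Python 3.8 or later, the installed versions could be looked up with importlib.metadata.version("twine") and importlib.metadata.version("setuptools").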

You must have the correct permissions on PyPI to upload wheels; ask Wes McKinney or Uwe Korn if you need help with this.

Updating conda packages

We have been building conda packages using conda-forge. The three "feedstocks" must be updated in the following order:

  1. arrow-cpp-feedstock
  2. parquet-cpp-feedstock
  3. pyarrow-feedstock

To update a feedstock, open a pull request updating recipe/meta.yaml as appropriate. Once you are confident that the build is good and the metadata is updated properly, merge the pull request. You must wait until the results of each of the feedstocks land in anaconda.org before moving on to the next package.

Unfortunately, you cannot open pull requests to all three repositories at the same time because they are interdependent.
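
The required order follows from the dependencies between the packages. The sketch below derives it with a topological sort. The dependency edges are an assumption made for illustration (parquet-cpp builds on arrow-cpp, and pyarrow builds on both); the text above only states the resulting order. It uses graphlib, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Assumption: each feedstock maps to the feedstocks it depends on.
FEEDSTOCK_DEPENDENCIES = {
    "arrow-cpp-feedstock": set(),
    "parquet-cpp-feedstock": {"arrow-cpp-feedstock"},
    "pyarrow-feedstock": {"arrow-cpp-feedstock", "parquet-cpp-feedstock"},
}


def update_order(dependencies):
    """Return the feedstocks ordered so that dependencies come first."""
    return list(TopologicalSorter(dependencies).static_order())
```

Each pull request should be opened only after the package built from the previous feedstock in this order has appeared on anaconda.org.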

Updating Java Maven artifacts in Maven central

See instructions at end of dev/release/README in the main arrow repository.

You must set up Maven to be able to publish to Apache's repositories. Read more at https://www.apache.org/dev/publishing-maven-artifacts.html.

Updating Ruby packages

You need an account on https://rubygems.org/ to release Ruby packages.

Once you have an account on https://rubygems.org/, you must be added to the owners of the red-arrow and red-arrow-gpu gems. An existing owner can add a new account with the following commands:

gem owner red-arrow -a NEW_ACCOUNT
gem owner red-arrow-gpu -a NEW_ACCOUNT

Once you are an owner of both gems, you can update the Ruby packages:

# Download the target versioned source archive
wget http://www-us.apache.org/dist/arrow/arrow-0.10.0/apache-arrow-0.10.0.tar.gz

# Extract the source archive
tar xf apache-arrow-0.10.0.tar.gz

# Move to ruby/red-arrow and run "rake release" to update red-arrow gem
(cd apache-arrow-0.10.0/ruby/red-arrow && rake release)

# Move to ruby/red-arrow-gpu and run "rake release" to update red-arrow-gpu gem
(cd apache-arrow-0.10.0/ruby/red-arrow-gpu && rake release)

JavaScript Releases

Make a release branch, then tag the release:

git checkout -b release-js-X.Y.Z

Build the source release (requires Node.js) and push the tag. Omit "-p" for a dry run:

dev/release/js-source-release.sh -p X.Y.Z $RC_NUM
git push apache apache-arrow-js-X.Y.Z

After the release vote, rebase master on the release branch.
