Current Committers



Name                    Organization
Michael Armbrust        Databricks
Joseph Bradley          Databricks
Felix Cheung            Automattic
Mosharaf Chowdhury      University of Michigan, Ann Arbor
Jason Dai
Tathagata Das
Ankur Dave              UC Berkeley
Aaron Davidson          Databricks
Thomas Dudziak
Robert Evans
Wenchen Fan             Databricks
Joseph Gonzalez         UC Berkeley
Thomas Graves
Stephen Haberman
Mark Hamstra            ClearStory Data
Herman van Hovell       QuestTec B.V.
Yin Huai                Databricks
Shane Huang
Andy Konwinski
Ryan LeCompte
Haoyuan Li              Alluxio, UC Berkeley
Xiao Li                 IBM
Davies Liu              Databricks
Cheng Lian              Databricks
Yanbo Liang             Hortonworks
Sean McNamara
Xiangrui Meng           Databricks
Mridul Muralidharam
Andrew Or               Princeton University
Kay Ousterhout          UC Berkeley
Sean Owen               Cloudera
Nick Pentreath
Imran Rashid
Charles Reiss           UC Berkeley
Josh Rosen
Sandy Ryza              Clover Health
Kousuke Saruta          NTT Data
Prashant Sharma
Ram Sriharsha
DB Tsai                 Netflix
Marcelo Vanzin          Cloudera
Shivaram Venkataraman   UC Berkeley
Patrick Wendell
Andrew Xia
Reynold Xin
Matei Zaharia           Databricks, Stanford
Shixiong Zhu            Databricks

Becoming a Committer

To get started contributing to Spark, learn how to contribute – anyone can submit patches, documentation and examples to the project.

The PMC regularly adds new committers from the active contributors, based on their contributions to Spark. The qualifications for new committers include:

  1. Sustained contributions to Spark: Committers should have a history of major contributions to Spark. An ideal committer will have contributed broadly throughout the project, and have contributed at least one major component where they have taken an "ownership" role. An ownership role means that existing contributors feel that they should run patches for this component by this person.
  2. Quality of contributions: Committers, more than any other community member, should submit simple, well-tested, and well-designed patches. In addition, they should show sufficient expertise to be able to review patches, including making sure they fit within Spark's engineering practices (testability, documentation, API stability, code style, etc.). The committership is collectively responsible for the software quality and maintainability of Spark.
  3. Community involvement: Committers should have a constructive and friendly attitude in all community interactions. They should also be active on the dev and user lists and help mentor newer contributors and users. In design discussions, committers should maintain a professional and diplomatic approach, even in the face of disagreement.

The type and level of contributions considered may vary by project area. For example, we greatly encourage contributors who want to work mainly on the documentation, or mainly on platform support for specific OSes, storage systems, etc.

Review Process

All contributions should be reviewed before merging as described in Contributing to Spark. In particular, if you are working on an area of the codebase you are unfamiliar with, look at the Git history for that code to see who reviewed patches before. You can do this by running "git log --format=full <filename>" and examining the "Commit" field to see who committed each patch.
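As a rough illustration of reading the "Commit" field, the following sketch tallies committers from the text that "git log --format=full" prints. The function names, the sample log, and the people in it are hypothetical and only serve the example; the sketch also assumes each "Commit:" header has the usual "Name <email>" shape.

```python
import re
import subprocess
from collections import Counter

# Header line printed by `git log --format=full`,
# e.g. "Commit: Jane Doe <jane@example.org>".
# Commit message lines are indented, so they never match at column 0.
_COMMIT_LINE = re.compile(r"^Commit:\s+(.*?)\s*<[^>]*>\s*$")


def count_committers(log_text: str) -> Counter:
    """Count commits per committer name in `git log --format=full` output."""
    counts = Counter()
    for line in log_text.splitlines():
        match = _COMMIT_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def committers_for_file(path: str) -> Counter:
    """Run git in the current repository and tally committers for one file."""
    result = subprocess.run(
        ["git", "log", "--format=full", "--", path],
        capture_output=True, text=True, check=True,
    )
    return count_committers(result.stdout)


# Hypothetical log text, used so the example runs without a repository.
SAMPLE_LOG = """\
commit 1111111111111111111111111111111111111111
Author: Alice Example <alice@example.org>
Commit: Carol Example <carol@example.org>

    Hypothetical change one

commit 2222222222222222222222222222222222222222
Author: Bob Example <bob@example.org>
Commit: Carol Example <carol@example.org>

    Hypothetical change two

commit 3333333333333333333333333333333333333333
Author: Alice Example <alice@example.org>
Commit: Dave Example <dave@example.org>

    Hypothetical change three
"""

print(count_committers(SAMPLE_LOG).most_common())
# [('Carol Example', 2), ('Dave Example', 1)]
```

Inside a Spark checkout, committers_for_file("<filename>") would give the same tally for a real file; the names at the top of the list are likely good people to ask for a review.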

How to Merge a Pull Request

Changes pushed to the master branch on Apache cannot be removed; that is, we can't force-push to it. So please don't add any test commits or anything like that, only real patches.

All merges should be done using the dev/ script, which squashes the pull request's changes into one commit. To use this script, you will need to add a git remote called "apache" at, as well as one called "apache-github" at git:// For the "apache" repo, you can authenticate using your ASF username and password. Ask Patrick if you have trouble with this or want help doing your first merge.

The script is fairly self-explanatory and walks you through steps and options interactively.

To amend a commit before merging, which should be done only for trivial touch-ups, let the script wait at the point where it asks whether you want to push to Apache. Then, in a separate window, modify the code and push a commit. Run "git rebase -i HEAD~2" and "squash" your new commit. Immediately afterwards, edit the commit message to remove your own commit's message. You can verify that the result is a single change with "git log". Then resume the script in the other window.

Also, please remember to set Assignee on JIRAs where applicable when they are resolved. The script can't do this automatically.

Minimize use of MINOR, BUILD, and HOTFIX with no JIRA

From pwendell:

It would be great if people could create JIRAs for any and all merged pull requests. The reason is that when patches get reverted due to build breaks or other issues, it is very difficult to keep track of what is going on if there is no JIRA. Here is a list of 5 patches we had to revert recently that didn't include a JIRA:

  • Revert "[MINOR] [BUILD] Use custom temp directory during build."
  • Revert "[SQL] [TEST] [MINOR] Uses a temporary in HiveThriftServer2Test to ensure expected logging behavior"
  • Revert "[BUILD] Always run SQL tests in master build."
  • Revert "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic allocation on."
  • Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing its files"
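A committer could catch such titles before merging with a small check like the sketch below. It assumes a JIRA reference appears in the title as a bracketed tag of the form [SPARK-NNNN]; that pattern, the function names, and the final example title are assumptions made for illustration and are not taken from this page.

```python
import re

# Assumed convention: a JIRA reference is a bracketed tag such as "[SPARK-12345]".
JIRA_TAG = re.compile(r"\[SPARK-\d+\]")


def has_jira_id(title: str) -> bool:
    """Return True if the title contains a bracketed SPARK JIRA tag."""
    return JIRA_TAG.search(title) is not None


def titles_missing_jira(titles):
    """Return the titles that carry no JIRA tag, in their original order."""
    return [title for title in titles if not has_jira_id(title)]


# Titles of the five reverted patches listed above.
REVERTED_TITLES = [
    "[MINOR] [BUILD] Use custom temp directory during build.",
    "[SQL] [TEST] [MINOR] Uses a temporary in HiveThriftServer2Test "
    "to ensure expected logging behavior",
    "[BUILD] Always run SQL tests in master build.",
    "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic allocation on.",
    "[HOT FIX] [YARN] Check whether `/lib` exists before listing its files",
]

# Hypothetical title that does carry a JIRA tag.
TAGGED_TITLE = "[SPARK-12345] [CORE] Hypothetical change with a JIRA"

print(len(titles_missing_jira(REVERTED_TITLES)))  # 5
print(has_jira_id(TAGGED_TITLE))                  # True
```

All five reverted titles are flagged because they carry only tags such as [MINOR], [BUILD], or [HOT FIX], which is exactly the situation the request above asks committers to minimize.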


Policy on Backporting Bug Fixes

From pwendell:


Considerations when deciding whether to backport a bug fix:

  • Backports are an extremely valuable service to the community and should be considered for any bug fix.
  • Introducing a new bug in a maintenance release must be avoided at all costs. Over time, it would erode confidence in our release process.
  • Distributions or advanced users can always backport risky patches on their own, if they see fit.

Situations in which a backport is appropriate:

  • Both the bug and the fix are well understood and isolated. Code being modified is well tested.
  • The bug being addressed is high priority to the community.
  • The backported fix does not vary widely from the master branch fix.

Situations in which a backport is best avoided:

  • The bug or fix is not well understood. For instance, it relates to interactions between complex components or third-party libraries (e.g. Hadoop libraries). The code is not well tested outside of the immediate bug being fixed.
  • The bug is not clearly a high priority for the community.
  • The backported fix is widely different from the master branch fix.

