Current Committers
Name | Organization |
---|---|
Michael Armbrust | Databricks |
Mosharaf Chowdhury | UC Berkeley |
Jason Dai | Intel |
Tathagata Das | Databricks |
Ankur Dave | UC Berkeley |
Aaron Davidson | Databricks |
Thomas Dudziak | Groupon |
Robert Evans | Yahoo! |
Joseph Gonzalez | UC Berkeley |
Thomas Graves | Yahoo! |
Andy Konwinski | Databricks |
Stephen Haberman | Bizo |
Mark Hamstra | ClearStory Data |
Shane Huang | National University of Singapore |
Ryan LeCompte | Quantifind |
Haoyuan Li | UC Berkeley |
Sean McNamara | Webtrends |
Xiangrui Meng | Databricks |
Mridul Muralidharan | Yahoo! |
Andrew Or | Databricks |
Kay Ousterhout | UC Berkeley |
Nick Pentreath | Mxit |
Imran Rashid | Quantifind |
Charles Reiss | UC Berkeley |
Josh Rosen | Databricks |
Prashant Sharma | Imaginea, Pramati, Databricks |
Ram Sriharsha | Yahoo! |
Shivaram Venkataraman | UC Berkeley |
Patrick Wendell | Databricks |
Andrew Xia | Alibaba |
Reynold Xin | Databricks |
Matei Zaharia | Databricks |
Review Process and Maintainers
Spark development follows the Apache voting process, where changes to the code are approved through consensus. We use a review-then-commit model: at least one committer other than the patch author must review and approve a patch before it is merged, and any committer may vote against it. For certain modules, changes to the architecture and public API should also be reviewed by a maintainer for that module (who may or may not be the same person as the reviewer) before being merged. The PMC has designated the following maintainers:
Component | Maintainers |
---|---|
Spark core public API | Patrick Wendell, Reynold Xin, Matei Zaharia |
Job scheduler | Kay Ousterhout, Patrick Wendell, Matei Zaharia |
Shuffle and network | Aaron Davidson, Reynold Xin, Matei Zaharia |
Block manager | Aaron Davidson, Reynold Xin |
YARN | Thomas Graves, Andrew Or |
Python | Josh Rosen, Matei Zaharia |
MLlib | Xiangrui Meng, Matei Zaharia |
SQL | Michael Armbrust, Reynold Xin |
Streaming | Tathagata Das, Matei Zaharia |
GraphX | Ankur Dave, Joseph Gonzalez, Reynold Xin |
Note that the maintainers in Spark do not "own" each module – every committer is responsible for the quality of the whole codebase. Instead, maintainers are asked by the PMC to ensure that public APIs and changes to complex components are designed consistently. Any committer may contribute to any module, and any committer may vote on any code change.
Becoming a Committer
To get started contributing to Spark, first learn how to contribute – anyone can submit patches, documentation, and examples to the project.
The PMC regularly adds new committers from the active contributors, based on their contributions to Spark. The qualifications for new committers include:
- Sustained contributions to Spark: Committers should have a history of major contributions to Spark. An ideal committer will have contributed broadly throughout the project, and have contributed at least one major component where they have taken an "ownership" role. An ownership role means that existing contributors feel that they should run patches for this component by this person.
- Quality of contributions: Committers, more than any other community member, should submit simple, well-tested, and well-designed patches. In addition, they should show sufficient expertise to be able to review patches, including making sure they fit within Spark's engineering practices (testability, documentation, API stability, code style, etc.). The committership is collectively responsible for the software quality and maintainability of Spark.
- Community involvement: Committers should have a constructive and friendly attitude in all community interactions. They should also be active on the dev and user lists and help mentor newer contributors and users. In design discussions, committers should maintain a professional and diplomatic approach, even in the face of disagreement.
The type and level of contributions considered may vary by project area – for example, we greatly encourage contributors who want to work mainly on the documentation, or mainly on platform support for specific OSes, storage systems, etc.