Over the last couple of years we have seen a rapid influx of new components being added to the Apache big data stack. It is becoming more crucial to outline clear guidelines on how to add new components to the mix. Bigtop's motto is "Debian of Big Data"; as such, we are trying to be as inclusive as possible. However, certain constraints exist and have to be addressed accordingly. While we are trying to provide as complete a list of such requirements as possible, the list below might not be exhaustive.
The Bigtop stack introduces the notion of a component maintainer (see the top-level MAINTAINERS.txt file), and thus it is expected that front-line support for certain areas will be provided by certain individuals. However, the long-term maintenance costs are expected to be shared among all the members of our community. Hence it is important for us to make sure that all members are comfortable, at least at a very basic level, with all the projects that are being added.
2. Hard expectations
This is a list of requirements that we don't want to deviate from unless there is a really major shift in Bigtop's charter. Projects that violate at least one of the following would have to go through community review and PMC approval on a case-by-case basis:
- The code is expected to be licensed under the Apache License, Version 2.0 (and its dependencies are expected to be compatible with this license)
- There's an active and interested contributor ready to make the necessary patches for inclusion
- The project is expected to integrate well into "big data management software distribution based on Apache Hadoop". In other words, it's not a one-off, and has multiple integration points with the rest of our stack.
- The project is expected to be unavailable (at least in the desired version) from major Linux distributions (Debian, Ubuntu, RedHat, SuSE). In other words, we don't want to duplicate effort that has already been done elsewhere unless there's a very strong reason to do so.
- The project is expected to be compatible with all platforms on which the Bigtop stack is officially supported by this community
- Patches are expected to be added to the trunk first, before being added to any released branches. In rare cases, Bigtop can patch a component in flight:
  - to change the order of dependency resolution, i.e. to guarantee that local artifacts are picked first
  - to fix a component whose release build is outright broken (HADOOP-11489)
  - under some other special conditions, as considered by the community
- The following is expected to be provided with each patch adding a new project to Bigtop distribution:
  - packaging code and packaging tests
  - deployment code
  - smoke testing code
- The contribution passes the project's CI and builds with the standard Bigtop toolchain
3. Soft expectations
Violating any of these expectations will require an explicit explanation attached to the proposal. They are flexible, but that doesn't mean they can be disregarded willy-nilly.
- Project provides test artifacts that go beyond our basic smoke-testing requirement (integration testing)
- Project is an Apache Software Foundation project (top-level or incubating).
- The project produces at least one executable system artifact. There are exceptions to this rule, such as Apache Tez, which is a library.