This page describes the Flink Improvement Proposal (FLIP) process for proposing a major change to Flink.

Purpose

The purpose of FLIPs is to have a central place to collect and document planned major enhancements to Apache Flink. While JIRA is still the tool to track tasks, bugs, and progress, the FLIPs give an accessible high-level overview of the result of design discussions and proposals. Think of FLIPs as collections of major design documents for user-relevant changes.

We want to make Flink a core architectural component for users. We also support a large number of integrations with other tools, systems, and clients. Keeping this kind of usage healthy requires a high level of compatibility between releases — core architectural elements can't break compatibility or shift functionality from release to release. As a result each new major feature or public API has to be done in a way that we can stick with it going forward.

This means that when making this kind of change we need to think through what we are doing as best we can prior to release. And as we go forward we need to stick to our decisions as much as possible. All technical decisions have pros and cons, so it is important that we capture the thought process that led to a decision or design to avoid flip-flopping needlessly.

Hopefully we can make these proportional in effort to their magnitude: small changes should just need a couple of brief paragraphs, whereas large changes need detailed design discussions.

This process also isn't meant to discourage incompatible changes; proposing an incompatible change is totally legitimate. Sometimes we will have made a mistake and the best path forward is a clean break that cleans things up and gives us a good foundation going forward. Rather, this is intended to avoid accidentally introducing half-thought-out interfaces and protocols that cause needless heartburn when changed. Likewise, the definition of "compatible" is itself squishy: small details like which errors are thrown when are clearly part of the contract but may need to change in some circumstances, and performance isn't part of the public contract but dramatic changes may break use cases. So we just need to use good judgement about how big the impact of an incompatibility will be and how big the payoff is.

What is considered a "major change" that needs a FLIP?

Any of the following should be considered a major change:

  • Any major new feature, subsystem, or piece of functionality
  • Any change that impacts the public interfaces of the project

What are the "public interfaces" of the project?


All of the following are public interfaces that people build around:

  • DataStream, DataSet, SQL and Table API, including classes related to that, such as StreamExecutionEnvironment
  • Classes marked with the @Public annotation
  • On-disk binary formats, such as checkpoints/savepoints
  • User-facing scripts/command-line tools, e.g. bin/flink, YARN scripts, Kubernetes scripts
  • Configuration settings
  • Exposed monitoring information


Not all compatibility commitments are the same. We need to spend significantly more time on public APIs as these can break code for users. They cause people to rebuild code and lead to compatibility issues in large multi-dependency projects (which end up requiring multiple incompatible versions). Configuration, monitoring, and command line tools can be faster and looser — changes here will break monitoring dashboards and require a bit of care during upgrades but aren't a huge burden.

For the most part monitoring, command line tool changes, and configs are added with new features so these can be done with a single FLIP.

What should be included in a FLIP?

A FLIP should contain the following sections:

  • Motivation: describe the problem to be solved
  • Proposed Change: describe the new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
  • New or Changed Public Interfaces: impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
  • Migration Plan and Compatibility: if this feature requires additional support for a no-downtime upgrade, describe how that will work.
  • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.
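
As a rough illustration of the section list above, the following Python sketch reports which of the five sections are missing from a draft. It is not part of the FLIP process and no such tooling is prescribed; the function name `missing_flip_sections` and the assumption that each section title sits on a line of its own are hypothetical.

```python
from typing import List

# Section titles taken from the list above.
REQUIRED_SECTIONS = [
    "Motivation",
    "Proposed Change",
    "New or Changed Public Interfaces",
    "Migration Plan and Compatibility",
    "Rejected Alternatives",
]


def missing_flip_sections(draft: str) -> List[str]:
    """Return the required section titles that do not appear on a line of their own.

    Hypothetical helper: assumes each section title is written alone on a line.
    The comparison ignores surrounding whitespace and letter case.
    """
    lines = {line.strip().lower() for line in draft.splitlines()}
    return [title for title in REQUIRED_SECTIONS if title.lower() not in lines]
```

A draft that only has "Motivation" and "Proposed Change" headings would get the remaining three titles back, in the order listed above.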

Who should initiate the FLIP?

Anyone can initiate a FLIP but you shouldn't do it unless you have an intention of getting the work done to implement it (otherwise it is silly).

Create your Own FLIP

  • If you are an Apache Flink committer, create a page which is a child of this one. You can do that by clicking on "Create" in the header and choosing "FLIP-Template" (not "Blank page"). Take the next available FLIP number (see under "FLIP round-up") and give your proposal a descriptive heading, e.g. "FLIP-42: Enable Flink Streaming Jobs to stop gracefully".
  • If you don't have the necessary permissions for creating a new page, please create a Google Doc and make that view-only. As a FLIP number, please use FLIP-XXX. 
    • Post that Google Doc to the mailing list for a discussion thread. When the discussions have been resolved, the contributor asks a committer/PMC member on the Dev mailing list to copy the contents from the Google Doc and create a FLIP number for them. The contributor can then use that FLIP to start a VOTE thread.

Process

Here is the process for making a FLIP:

  1. Follow the instructions at "Create your Own FLIP".
  2. Fill in the sections as described above
  3. Start a [DISCUSS] thread on the Apache mailing list. Please ensure that the subject of the thread is of the format [DISCUSS] FLIP-{your FLIP number} {your FLIP heading}. The discussion should happen on the mailing list, not on the wiki, since the wiki comment system doesn't work well for larger discussions. In the process of the discussion you may update the proposal; let people know about the changes you are making. Include either the link to the FLIP page on Confluence or the link to the view-only Google Doc.
  4. Once the proposal is finalized and there are no more open discussions:
    1. If your FLIP is already in Confluence, proceed to step 5.
    2. If your FLIP is a Google Doc, please ask a committer/PMC member on the Dev mailing list to copy the contents from your Google Doc to a FLIP page and create a FLIP number for you before proceeding to step 5.
  5. Call a [VOTE] to have the proposal adopted. These proposals are more serious than code changes and more serious even than release votes. The criterion for acceptance is consensus.
  6. Please update the FLIP wiki page to reflect the current stage of the FLIP after a vote. This acts as the permanent record indicating the result of the FLIP (e.g., Accepted or Rejected). Also report the result of the FLIP vote to the voting thread on the mailing list so the conclusion is clear.
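
The subject format from step 3 can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `discuss_subject` is hypothetical, and accepting "XXX" as a number is an assumption based on the FLIP-XXX placeholder used for Google Doc drafts.

```python
import re


def discuss_subject(flip_number, heading: str) -> str:
    """Build a subject of the form '[DISCUSS] FLIP-{number} {heading}'.

    Hypothetical helper, not part of any Flink tooling.
    """
    number = str(flip_number).strip()
    # Assumption: either a numeric FLIP number or the 'XXX' placeholder is valid.
    if not re.fullmatch(r"\d+|XXX", number):
        raise ValueError(f"unexpected FLIP number: {number!r}")
    # Collapse stray whitespace so the subject stays on one clean line.
    title = " ".join(heading.split())
    if not title:
        raise ValueError("heading must not be empty")
    return f"[DISCUSS] FLIP-{number} {title}"
```

For example, `discuss_subject(42, "Enable Flink Streaming Jobs to stop gracefully")` yields `[DISCUSS] FLIP-42 Enable Flink Streaming Jobs to stop gracefully`.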

It's not unusual for a FLIP to need long discussions before it is finalized. Below are some general suggestions on driving FLIPs towards consensus. Note that these are hints rather than rules. Contributors should make pragmatic decisions in accordance with individual situations.

  • The progress of a FLIP should not be blocked for long by an unresponsive reviewer. A reviewer who blocks a FLIP with dissenting opinions should try to respond to subsequent replies in a timely manner, or at least provide a reasonable estimate of when they will respond.
  • A typical reasonable time to wait for responses is 1 week, but be pragmatic about it. Also, it would be considerate to wait longer during holiday seasons (e.g., Christmas, Chinese New Year, etc.).
  • We encourage FLIP proposers to actively reach out to the interested parties (e.g., previous contributors of the relevant part) early. It helps expose and address the potential dissenting opinions early, and also leaves more time for other parties to respond while the proposer works on the FLIP.
  • Committers should use their veto rights with care. According to the ASF policy, vetoes must be accompanied by a technical justification showing why the change is bad. They should not be used simply to block the process so that the voter has more time to catch up.

FLIP round-up

Next FLIP Number: 471

Use this number as the identifier for your FLIP and increment this value.

Under discussion

Accepted

Released

Discarded

