Overview

In order to support the needs of a vibrant and growing community, we plan to continuously improve Airflow by

...

Champions are typically committers, but don't need to be. Each roadmap item requires a commitment from its champion to corral the efforts of one or more developers and drive the item's development. Similarly, a developer who is interested in a particular area may contact the champion to be assigned work.

...

Roadmap

General

  • More Frequent Release Cycles - Champion : Max

  • Fault-isolation & dependency isolation

    • via better packaging/execution - Champion : Max
  • Improving Testing

Documentation

Airflow is a broad platform and documentation is critical not only for getting new users up and running but also helping users discover and utilize all of Airflow's features.

  • Add DAG Development Workflow - Champion : Sid

  • Feature Request: Why is my task not scheduled?

Python API

DAGs are code; the easier that code is to write, the better.

  • Rethink Start Date Handling - Champion : Jeremiah

    • The start date is confusing and is not handled or exposed consistently throughout the web app. If I recall correctly, the execution date and start date are reversed in some places.
  • Remove Start_Date & Interval from the DAG and let them be set by a UI calendar widget

    • This way, they will only be set to allowed values!
  • Remove Dependence on running everything in UTC or in a single TZ

  • Streamline workflow - Champion : Jeremiah

  • DAGRun-level triggers - Champion : Jeremiah

    • It would be helpful to tie Operators to DagRun state somehow, so they could act as cleanup steps. For example, say a DAG begins by launching a cluster, then fails while trying to execute a command on the cluster. The cleanup Operator would make sure the cluster was properly shut down. This could be mimicked today with a "one_failed" trigger attached to every node in the DAG.
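
The cleanup idea above can be sketched in plain Python. This is only a rough model of how a "one_failed"-style rule might decide whether a cleanup step runs; it does not call Airflow. The State enum and the one_failed and should_run_cleanup helpers are hypothetical names made up for this illustration, and counting "upstream_failed" as a failure is an assumption.

```python
from enum import Enum


class State(Enum):
    """Hypothetical task states, for this sketch only."""
    SUCCESS = "success"
    FAILED = "failed"
    UPSTREAM_FAILED = "upstream_failed"


def one_failed(upstream_states):
    """Return True if at least one upstream task is in a failed state.

    Assumption: both FAILED and UPSTREAM_FAILED count as failures.
    """
    failed = (State.FAILED, State.UPSTREAM_FAILED)
    return any(state in failed for state in upstream_states)


def should_run_cleanup(task_states):
    """Hypothetical helper: the cleanup step sits downstream of every task.

    task_states maps a task name to its State for one DAG run.
    """
    return one_failed(task_states.values())


# The cluster launched, but the command that ran on it failed.
failed_run = {"launch_cluster": State.SUCCESS, "run_command": State.FAILED}
# Every task succeeded, so there is nothing to clean up after.
clean_run = {"launch_cluster": State.SUCCESS, "run_command": State.SUCCESS}

print(should_run_cleanup(failed_run))
print(should_run_cleanup(clean_run))
```

Wiring the cleanup step to every node, as in this sketch, is what makes the workaround clumsy: a DAGRun-level trigger would need only the overall run state, not a dependency on each task.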

Execution

The heart and soul of Airflow.

  • Fundamentally improving the scheduler
  • Ensure correct handling of Skipped tasks - Champion : Sid

    • depends_on_past=True and Skipped : 1155 (Fixed)
    • DAGs or Tasks that we would like to manually skip as described in 262
    • DAGs that specify only_run_latest as in 59
      • In these cases, we want to skip DAG Runs except for the latest. Unfortunately, if we skip all but the latest, the latest will not run if depends_on_past=True
  • Concurrency Limits Not Honored : max active, concurrency, pool

  • Backfill offers a parallel code path to scheduling - Champion : Jeremiah & Sid

  • Only Run Latest - Champion : Sid

    • For cases where we need to run only the latest in a series of task instance runs and mark the others as skipped. For example, we may have a job that executes a DB snapshot every day. If the DAG is paused for 5 days and then unpaused, we don’t want to run all 5, just the latest. With this feature, we will provide “cron” functionality for task scheduling that is not related to ETL.
  • Backfill Oddity : needs one successful run!
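
To make the "Only Run Latest" behavior and its interaction with depends_on_past more concrete, here is a minimal Python sketch. It is not Airflow code: only_run_latest and blocked_by_past are hypothetical helpers, the dates are arbitrary, and the rule that depends_on_past accepts only a prior "success" is an assumption taken from the problem described above.

```python
from datetime import date, timedelta


def only_run_latest(pending_dates):
    """Split pending execution dates into (date_to_run, dates_to_skip).

    Only the most recent date is run; all earlier ones are skipped.
    """
    if not pending_dates:
        return None, []
    ordered = sorted(pending_dates)
    return ordered[-1], ordered[:-1]


def blocked_by_past(previous_state, depends_on_past):
    """Return True if a run cannot start because of its predecessor.

    Assumption: with depends_on_past set, only a previous "success"
    unblocks the run, so a "skipped" predecessor blocks it.
    """
    return depends_on_past and previous_state != "success"


# A daily DAG was paused for 5 days, so five runs have piled up.
first_day = date(2016, 5, 1)  # arbitrary example date
pending = [first_day + timedelta(days=i) for i in range(5)]

latest, skipped = only_run_latest(pending)
print("run:", latest)
print("skip:", skipped)

# The run just before the latest was skipped, so with depends_on_past
# the latest run is blocked, which is the problem noted in the list above.
print(blocked_by_past("skipped", depends_on_past=True))
print(blocked_by_past("skipped", depends_on_past=False))
```

One possible way out, under the same assumptions, would be for the dependency check to treat "skipped" like "success", but that is a design choice for the roadmap item rather than something this sketch settles.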

Security

CLI

  • CLI to use API
  • Ability to delete DAGs

UI

  • Revamp Connections UI

    • Increasingly, connections are putting fields in "extras", which works but means the correct fields are almost impossible to discover for new users. JDBCHook and GCloud hack the UI screen to show fields which are then automatically put into "extras", and that behavior should be supported more widely.
  • Ability to delete DAGs

Apache

  • Move towards Apache-community Friendly Licensed Dependencies

Deprecated Features