Status

State: Draft
Discussion Thread
JIRA




Motivation

**Note: this AIP replaces AIP-12 with a better-defined scope that should help the discussion surrounding the AIP.**

Issue

In the current implementation, Gunicorn webserver processes each maintain their own DAG representation using their own `DagBag` instance. Because there is no synchronization between the different webserver processes, these `DagBag` instances can contain different states, especially in situations when DAGs have just been added or modified.

As a result, users can see different DAGs in the webserver, depending on which process handles their request. This makes the UI appear unstable: DAGs seem to randomly appear and disappear between refreshes until all webserver processes have picked up the same state.
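
The effect can be reproduced in a few lines. The following is a minimal sketch, not Airflow code: `WorkerProcess` is a hypothetical stand-in for a Gunicorn worker, and a plain set plays the role of that worker's private `DagBag`. The DAG names are invented for illustration.

```python
import itertools


class WorkerProcess:
    """Hypothetical stand-in for one Gunicorn worker with a private DAG cache."""

    def __init__(self, name):
        self.name = name
        self.dag_ids = set()  # plays the role of this worker's own DagBag

    def refresh(self, dag_folder):
        # Each worker re-reads the DAG folder on its own schedule.
        self.dag_ids = set(dag_folder)

    def list_dags(self):
        return sorted(self.dag_ids)


dag_folder = {"etl_daily"}
workers = [WorkerProcess("w1"), WorkerProcess("w2")]
for worker in workers:
    worker.refresh(dag_folder)

# A new DAG file is added, but only the first worker has refreshed so far.
dag_folder.add("report_weekly")
workers[0].refresh(dag_folder)

# Requests are spread over the workers, so consecutive refreshes of the same
# page can show different DAG lists.
round_robin = itertools.cycle(workers)
responses = [next(round_robin).list_dags() for _ in range(4)]
for response in responses:
    print(response)
```

In this sketch the new DAG shows up in every other response until the second worker refreshes as well.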

An example of these issues is illustrated in the following video:

Video: https://www.youtube.com/watch?v=sNrBruPS3r4


Proposal

To avoid the issues outlined above, we aim to make the webserver endpoints stateless, so that queries to the webserver return the same result regardless of the process that handles the request. We propose to do so by modifying the endpoints to use a single, shared source of truth for displaying DAG- and task-related information.

The most obvious solution for maintaining a single source-of-truth for DAG-related information is the database, as this is already where Airflow persists DAG-related metadata. To make the webserver stateless, we then simply need to make sure that all required information is available in the database and can be queried by the webserver.

To keep this AIP tractable, we propose to leverage the existing ORM models for storing and querying DAG metadata from the database in the webserver. Following this approach, we should be able to achieve our objective by adding required fields/methods to the `DagModel` ORM class, which will serve as our entrypoint for querying DAG-related metadata from the database.

Required changes

To achieve our goal, we need to implement the following changes:

  • All references to the shared `DagBag` instance need to be removed from the webserver endpoints.
  • Functionality that is independent of the DAG file needs to be moved from the `DAG` class to the `DagModel` class, so that it can be used without loading the DAG file. The `following_schedule` method is one example. Backwards compatibility can be maintained if needed by forwarding calls on `DAG` instances to the backing `DagModel` instance.
  • The webserver endpoints need to be modified to reference the required methods/attributes on the `DagModel` class rather than the `DAG` class.
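
The forwarding approach mentioned in the list above could look roughly like the following. This is a sketch under stated assumptions: both classes are stripped-down stand-ins rather than the real Airflow classes, the `_model` attribute is a hypothetical name, and the schedule is assumed to be a plain `timedelta` (cron expressions are left out for brevity).

```python
from datetime import datetime, timedelta


class DagModel:
    """Simplified stand-in for the ORM class; only the fields used here."""

    def __init__(self, dag_id, schedule_interval):
        self.dag_id = dag_id
        self.schedule_interval = schedule_interval  # assumed to be a timedelta

    def following_schedule(self, dttm):
        # Needs only stored metadata, not the DAG file.
        return dttm + self.schedule_interval


class DAG:
    """Simplified stand-in for the file-backed DAG class."""

    def __init__(self, dag_id, schedule_interval):
        self.dag_id = dag_id
        # Hypothetical attribute holding the backing model.
        self._model = DagModel(dag_id, schedule_interval)

    def following_schedule(self, dttm):
        # Forward to the backing model so existing callers keep working.
        return self._model.following_schedule(dttm)


start = datetime(2019, 1, 1)
dag = DAG("etl_daily", timedelta(days=1))
from_dag = dag.following_schedule(start)
from_model = DagModel("etl_daily", timedelta(days=1)).following_schedule(start)
print(from_dag, from_model)
```

A webserver endpoint would call the method on the model directly, while code that already holds a `DAG` instance gets the same result through the forwarding method.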

Note that some endpoints may need to read data that is not present in the database directly from the corresponding DAG file, but only where this is strictly necessary. This will, for example, be required in the DAG graph view, as graph edges are currently not stored in the database. The only way to avoid this is to add the missing data to the database.
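
One way to express this fallback is sketched below. All names are hypothetical: `get_graph_edges`, the `db_edges` mapping and the `load_edges_from_file` callable are illustrative only, and the example assumes a possible future in which edges may be stored in the database.

```python
def get_graph_edges(dag_id, db_edges, load_edges_from_file):
    """Return (edges, source) for the graph view.

    db_edges: dict mapping dag_id to stored edges (hypothetical; edges are
        not stored in the database today).
    load_edges_from_file: callable that parses the DAG file (hypothetical).
    """
    edges = db_edges.get(dag_id)
    if edges is not None:
        return edges, "database"
    # Fall back to the DAG file only when the database has no edges.
    return load_edges_from_file(dag_id), "dag_file"


def fake_file_loader(dag_id):
    # Stand-in for parsing the DAG file.
    return [("extract", "transform"), ("transform", "load")]


# No edges in the database: the DAG file is read.
edges, source = get_graph_edges("etl_daily", {}, fake_file_loader)

# Edges available in the database: the DAG file is not touched.
stored = {"etl_daily": [("extract", "load")]}
edges_db, source_db = get_graph_edges("etl_daily", stored, fake_file_loader)
print(source, source_db)
```

Keeping the file access behind a single function like this would make it easy to drop the fallback once the missing data is stored in the database.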

Considerations

Larger discussions concerning serialization formats for DAGs and DAG versioning are not part of this AIP, as we only intend to make a few small changes to the existing classes to address the problem at hand. These other discussions involve larger (architectural) changes which are outside of the scope of this AIP.