Date: June 12th 2024

Authors: Elad Kalif, Shahar Epstein

Goal

This doc presents the current answer to the question "Who is the Airflow user?", followed by open questions for discussion about the Airflow user in Airflow 3.

Intro

Airflow defines a user as someone who interacts with the Airflow application. From Airflow's point of view, as of version 2.x, the user has the knowledge, skills, and permissions to act on any Airflow component.
For example, if a user wants to set a non-standard scheduling interval for their DAG and the documentation recommends using Timetables, then we presume that the feature exists and the problem is solved.
However, from the user's perspective, the problem may not be solved, as they might not have the skills needed to register new plugins, or have the skills but not the permissions.
In some organizations, especially large corporations, the authors of pipelines may not have full access to all of the components of the Airflow application.
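
To make the skills/permissions gap concrete, here is a minimal sketch of what "using Timetables" can require. The names and schedule below are illustrative; the key point is that a custom timetable has to be registered through a plugin placed in Airflow's plugins folder, which a DAG author may not be allowed to touch:

```python
# workday_timetable_plugin.py - must live in Airflow's plugins folder.
# Deploying this file typically requires Deployment Manager permissions,
# even though the timetable itself is meant for DAG authors.
from airflow.plugins_manager import AirflowPlugin
from airflow.timetables.interval import CronDataIntervalTimetable


class WorkdayTimetable(CronDataIntervalTimetable):
    """Illustrative "non-standard" schedule: weekdays at 06:00 UTC."""

    def __init__(self) -> None:
        super().__init__("0 6 * * 1-5", timezone="UTC")


class WorkdayTimetablePlugin(AirflowPlugin):
    # Registration is the step that needs plugin-deployment access,
    # not just write access to the DAG folder.
    name = "workday_timetable_plugin"
    timetables = [WorkdayTimetable]
```

A DAG author can only reference WorkdayTimetable() from a DAG after someone with the right permissions has deployed this file.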


Airflow already acknowledges the existence of different actors, as can be seen in the architecture docs. This doc will focus on:

  1. Who are the actors?
  2. What are their qualifications?
  3. What is the actor's scope?

Actors 

An actor can be seen as an Airflow user with a specific role. In smaller deployments, a single user may fill all of the actor roles, while in larger deployments there can be many actors with conflicting interests.
The conflicts may be caused by different agendas, roadmaps, etc.

For example, I need to deploy a plugin, which requires the assistance of the actor in charge of this area, but they don't have the time to do it.

Considering the existing architecture of Airflow, we could distinguish between four basic types of actors:

  1. Operational User - Interacts with Airflow's UI on a DAG/DAG Run-specific basis. Normally triggers a run manually via a pre-defined form in Airflow's UI.
    This actor may have no technical knowledge and may not even come from the data domain (for example, product managers).


  2. DAG Author - Creates, modifies, or removes DAGs from the system. May interact with various Airflow components (Connections, Variables, XComs, etc.). Within this type, we can recognize two subtypes (see the DAG-definition sketch after this list):
    1. Technical DAG Author - interacts with .py files directly.
    2. Non-technical DAG Author - interacts with an interface on top of .py files (for example, building DAGs via .yaml files).

  3. Deployment Manager - Owns the deployment. They have the ability to deploy plugins, install Python packages/providers, and handle the scaling of workers, the webserver, and schedulers.
    They also decide which executor to use, configure Airflow's settings, and handle zombie or undead tasks. This actor may also enforce Cluster Policies (see the Cluster Policy sketch after this list), and possibly owns the CI/CD process for deploying DAGs.


  4. Infrastructure Manager - Owns the compute/storage/systems provisioned for the deployment of Airflow. For example, owns the Kubernetes cluster, the DBs available in the company, etc.
    This actor is sometimes external to the Airflow app and isn't familiar with Airflow.
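
Below is a sketch for the two DAG Author subtypes. The Python file is a standard Airflow DAG; the YAML in the trailing comment is a hypothetical factory-layer format (in the spirit of community "DAG factory" tools), not a built-in Airflow format:

```python
# A Technical DAG Author edits .py files like this one directly.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_pipeline():
    @task
    def extract() -> dict:
        return {"rows": 42}

    @task
    def load(payload: dict) -> None:
        print(f"loading {payload['rows']} rows")

    load(extract())


example_pipeline()

# A Non-technical DAG Author might describe the same pipeline in YAML,
# which a (hypothetical) factory layer translates into the DAG above:
#
#   example_pipeline:
#     schedule: "@daily"
#     tasks:
#       extract: {}
#       load:
#         depends_on: [extract]
```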
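
And a sketch of the Deployment Manager's Cluster Policy responsibility. The dag_policy hook is a real Airflow mechanism; the specific rule is illustrative. It would live in airflow_local_settings.py, which typically only the Deployment Manager controls:

```python
# airflow_local_settings.py - loaded by Airflow at startup from its config
# path; changing it is a Deployment Manager action, not a DAG author one.
from airflow.exceptions import AirflowClusterPolicyViolation
from airflow.models.dag import DAG


def dag_policy(dag: DAG) -> None:
    """Illustrative cluster-wide rule: every DAG must declare a tag."""
    if not dag.tags:
        raise AirflowClusterPolicyViolation(
            f"DAG {dag.dag_id} must declare at least one tag."
        )
```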



| Actor | Required Skills | Required Permissions | Responsibility | Notes |
|-------|-----------------|----------------------|----------------|-------|
| Operational User | Non-technical | UI | Run/view existing DAGs | |
| DAG Author | Python knowledge | UI, API, DAG folder | Deploy DAGs | |
| Deployment Manager | Python, infrastructure | All components of Airflow | Health and stability of the application | |
| Infrastructure Manager | | | Provides the DB/K8s cluster to run the Airflow application | Sometimes, they're not even aware of Airflow. Airflow is just one of the many applications that run on the infrastructure. |



It’s important to note that with multi-tenancy, there might be additional sub-actors.

For example, the deployment manager creates a new tenant. Each tenant has an operational user and DAG authors, but also a tenant deployment manager who manages the infrastructure of the tenant.

Airflow 3 - Open Questions

  1. Are we happy with the current model? If not, what changes do we hope to make?
  2. Can we find mechanisms to reduce the conflicts between the different actors?
  3. Should we define and discuss how AIPs affect each one of the actors?

13 Comments

  1. I love this! If we're defining "Infrastructure Manager", I also think it's important to have a category for business stakeholders to represent the "customers" of DAG Authors. Something like this?

    Business Stakeholders - Data Consumers

    • Role: Utilize the outputs generated by Airflow pipelines for making business decisions, generating reports, and other data-driven activities.
    • Interactions: Primarily consume the end results of Airflow DAGs through dashboards, reports, and analytics tools. They rely on accurate and timely data to inform strategic and operational decisions.
    • Technical Knowledge: Generally have a low level of technical expertise in Airflow itself but have a good understanding of the data and its implications for the business.
    • Responsibilities:
      • Defining requirements for data reports and analytics.
      • Requesting new data pipelines or modifications to existing ones to meet business needs.
      • Evaluating the quality and reliability of the data provided.
    • Examples: Business analysts, product managers, executive team members.
  2. there might be a slightly more technical persona than "dag author who touches .py files" e.g. at some organizations you might have people who do most of the work of writing shared operators.

  3. why do we use the word "actor" instead of "persona"?  ("persona" seems more natural to me FWIW)

  4. However, from the user's perspective, the problem may not be solved, as they might not have the skills needed to register new plugins, or have the skills but not the permissions.

    That sounds reasonable, but what are the implications of that?  What's the "so what"?  Sorta wondering what is the goal of articulating these personas, and how it will be useful.  I think the personas that we articulate will depend on the context, on what we're trying to accomplish with them.


    1. It's important to have a shared understanding of attributes and skillsets of the persona you are targeting with a feature so that it is ultimately useful for them.

      Continuing with the Timetables example: Timetables are a feature meant to be used by DAG authors and are underutilized. They were designed to be defined and implemented by deployment managers. There were valid reasons from a security perspective to go this route, but if we don't expect DAG authors to create and install plugins, and we don't expect Deployment Managers to author DAGs, it means the design wasn't right from a usability perspective, and the feature should probably have been implemented differently, or perhaps not at all.

      TLDR: Shared understanding of personas is critical to defining proper requirements

  5. What about a viewer role/Actor? It looks like a basic role to me. For instance, a data analyst might want to check what is going on with a pipeline to be able to launch an analysis query in a downstream system.

    1. this is mentioned as operational user

      1. It is only mentioned as one of the responsibilities/rights of an operational user. A viewer-only user (external to any job/resource management in Airflow) is a different role/persona/actor from an Operator, which is meant to operate.

        1. I think we should clarify that this document is not meant to define detailed roles - it does not map 1-1 to the roles of Airflow, and it's not a security document. It's more to define the broad types of users we want to address in Airflow 3. The Operational User is anyone who wants to use Airflow via the UI to check (and possibly act on) the status of running DAGs. This actor does not author DAGs, does not deploy Airflow, and does not run other workflows that should become part of an Airflow DAG - so we name them an operator. That covers a broad spectrum of view/manage actions, with the common part being that these actors access Airflow via the UI to see how things are (and possibly deal with the situation).

          1. I agree, because "An actor can be seen as an Airflow user with a specific role" is ambiguous. Two additional comments:

            • You, along with the description of the Operator, mention "Airflow via UI", but if I am not mistaken, the interaction is not limited to the UI, since the component being interacted with is the webserver, which can also serve the API. (It is one of my inline comments in the table)
            • I agree with the above comment that the description or name is not yet clear for the Operational User. For instance, "Normally triggers a run manually" is only one of the (many?) use cases, and "normally" depends on the role of the user. I am fine with having one Actor concept to gather all "RBAC Airflow Webserver User roles", but in this case I do not fully get your comment on the deployment manager (team vs organization) thread, because if I understood AIP-67 correctly, the org deployment manager may cover the scope of the team deployment manager and will create team deployment managers at his/her discretion. In that sense it could be viewed as "details" and could fall under the same higher-level concept of a deployment manager.
            1. > You, along with the description of the Operator, mention "Airflow via UI", but if I am not mistaken, the interaction is not limited to the UI, since the component being interacted with is the webserver, which can also serve the API. (It is one of my inline comments in the table)

              Yes and no (smile). For me, the API is universally usable by generally all actors.

              Of course, the UI provides "operation" in the way the community decided the Airflow UI might be useful, while the API allows extending it via external UIs/integrations with other systems - so from that point of view, it's what you wrote.

              But the API is much more than that - it is a cross between "Operation" users, "Deployment Manager" and "DAG Author" - and any of those actors should be able to use the API. A Deployment Manager (in Airflow 3 it will be the only way) can, for example, extend Airflow via listeners and plugins and use the API for that, and DAG authors will have to use the API to interact with Airflow if they want to do more than what the "Task SDK" will allow them to do (again, in Airflow 3 it will be the only way; you won't be able to use the DB for that). I recently had a - somewhat heated - discussion about it here - https://github.com/apache/airflow/discussions/40296 - explaining that DB access will no longer be possible for user code, and the API will have to be used for all those purposes.

              > I agree with the above comment that the description or name is not yet clear for the Operational User. For instance, "Normally triggers a run manually" is only one of the (many?) use cases, and "normally" depends on the role of the user. I am fine with having one Actor concept to gather all "RBAC Airflow Webserver User roles", but in this case I do not fully get your comment on the deployment manager (team vs organization) thread, because if I understood AIP-67 correctly, the org deployment manager may cover the scope of the team deployment manager and will create team deployment managers at his/her discretion. In that sense it could be viewed as "details" and could fall under the same higher-level concept of a deployment manager.

              Agree. In this context - possibly there is no need to separate those two.



              1. On the API topic: I am not denying that the API can be used by other Actors (each one of them will use a subset of it), just saying that it is also a possibility for the Operational User to use the API. I can imagine that in some (crazy?) secure/strict environments these users will only have terminal access to the Airflow webserver and will therefore use the API only.

                That is why I do not understand the "no" part, and I think your sentence should be rephrased to "The Operational User is anyone who wants to use Airflow via the UI or API to check (and possibly act on) the status of running DAGs". I am not arguing about the second part of the sentence in this thread; I wrote a comment on the responsibility column.
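
                To make that scenario concrete, here is a minimal sketch of an Operational User triggering a run from a terminal via the stable REST API (the host, credentials, and DAG id are made up, and basic-auth API access is assumed):

                ```python
                # Trigger a DAG run through Airflow 2.x's stable REST API.
                import requests

                resp = requests.post(
                    "http://localhost:8080/api/v1/dags/example_pipeline/dagRuns",
                    auth=("api_user", "api_password"),  # assumed auth setup
                    json={"conf": {}},
                )
                resp.raise_for_status()
                print(resp.json()["dag_run_id"])
                ```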


  6. One thing that I would also add here: I think we are focusing a bit too much on the actors' access (security) and means (UI/API). This document should mostly define what kinds of people are interacting with Airflow, in what ways, and what is expected of them - what they should understand, what knowledge and skills they possess, and what they want to achieve by interacting with Airflow.

    For example - I think Deployment Managers should know how to install, set up, deploy, and monitor a running Airflow "application"; they should be able to inspect all kinds of signals and deployment details, know what Airflow configuration parameters do, etc. Whether they use the API, UI, or CLI for that is secondary. Similarly, DAG authors do not have to know how Airflow works, or even how it is deployed or configured. They should be able to write/edit and delete DAG files. The "non-Python DAG authors" might not even know Python, but they should be able to define and modify DAG structure. Whether they do it via DAG files, DAG factories, or the UI (for non-Python DAG authors we might open that possibility) is also secondary - where and how they get access is mostly a consequence of their needs. Similarly, Operations users should not even need to know how DAGs are defined - are they Python files? YAML? Some other way? They don't care. They want to be able to observe and monitor "pipelines" and interact with them (clear, rerun, backfill) - again, whether they use the UI, CLI, or API is secondary.

    I would really love it if we focused only on capabilities and needs here and left out the means by which the Actors interact with Airflow.