Status

State: Draft
Discussion Thread:
Vote Thread:
Vote Result Thread:
Progress Tracking (PR/GitHub Project/Issue Label):
Date Created: 2026.01.30
Version Released:
Authors: Shahar Epstein

Motivation

Airflow, as a leading open-source platform for data engineering and orchestration, sits at the center of many production data platforms, coordinating critical pipelines, integrations, and cross-team dependencies.

As AI-assisted development and operations become mainstream, Airflow users increasingly expect productivity gains for operational debugging and day-to-day troubleshooting: faster root-cause analysis, clearer next steps, and reduced dependence on tribal knowledge.

When things go wrong, the cost is paid in on-call time, delayed pipelines, and repeated debugging across Dag runs, task instances, logs, configs, and permissions.

Today, teams that want AI-assisted debugging in Airflow create ad-hoc, private integrations or rely on paid solutions. This creates duplicated effort, inconsistent security controls, and uneven user experience across deployments.

An official, opt-in assistant provides a safe and consistent foundation for AI assistance: a standard UI entry point, a supported backend interface, and a controlled way to retrieve Airflow state so answers can be grounded in real system data.

Because Airflow already centralizes the key signals needed for debugging (Dag definitions, task metadata, run state, logs, configs, and RBAC), it is well positioned to benefit from AI-assisted tooling to reduce time-to-diagnosis and improve operational reliability, without requiring Dag authors to change their code.

Considerations

This proposal introduces a new integration surface between Airflow and external AI systems, which expands both the security exposure and the operational footprint of a deployment.

Key considerations include:

  • Security surface expansion: The assistant may interact with external systems and process operational metadata. Strong guarantees around RBAC, redaction, and auditability are required.
  • Operational complexity: Enabling the feature requires additional configuration and operational awareness from deployment managers.
  • Opt-in nature: The feature must remain disabled by default and explicitly enabled by operators.
  • Dependency on standardized tooling interfaces: State-backed assistance depends on a consistent and secure way to retrieve Airflow state.
  • Maintenance overhead: The feature introduces additional long-term maintenance across UI, backend, and integration layers.

The design must strictly respect existing RBAC boundaries and ensure that the assistant never exceeds the permissions of the authenticated user.

What change do you propose to make?

This AIP proposes introducing an opt-in AI Assistant capability for Apache Airflow.

The assistant provides:

  • A conversational interface within the Airflow UI.
  • The ability to answer user questions about Airflow workflows.
  • Responses that may be grounded in live Airflow state through controlled, read-only access.

Phase 1 is explicitly limited to read-only assistance and does not allow any modification of Airflow state.

Figure 1 - Illustrations of the Assistant UI (screenshots from an initial POC; the final implementation may differ)

A non-binding reference implementation is provided separately (see Draft: AIP-101: Reference Implementation (Non-binding)) to illustrate one possible approach to realizing this proposal.

That document is intended to support discussion and demonstrate feasibility; it does not prescribe the final design.

What problem does it solve?

Troubleshooting Airflow workflows currently requires manually navigating between multiple sources of information, including Dag definitions, task instances, logs, configuration, and documentation.

This process is:

  • Time-consuming
  • Error-prone
  • Dependent on user experience and tribal knowledge

There is no standardized way to query and correlate this information through a single interface.

Why is it needed?

Users increasingly expect AI-assisted workflows to improve productivity in debugging and operations.

Without a standardized approach:

  • Teams build ad-hoc integrations
  • Security controls vary across implementations
  • User experience is inconsistent

Providing an official, opt-in assistant:

  • Reduces duplicated effort
  • Establishes consistent security and audit practices
  • Enables predictable and safe adoption of AI assistance

Are there any downsides to this change?

Yes:

  • Introduces additional security considerations due to interaction with external systems.
  • Adds operational complexity for deployments that enable the feature.
  • Increases maintenance overhead for the project.
  • Depends on evolving AI ecosystem components and integration surfaces.

Organizations with strict data handling requirements may choose not to enable this feature.

Which users are affected by the change?

  • Deployment Managers: responsible for enabling, configuring, and operating the assistant.
  • Dag Authors and Operational Users: interact with the assistant to retrieve insights about workflows.

How are users affected by the change? (e.g. DB upgrade required?)

By default (feature disabled):

  • No behavioral change.
  • No additional configuration or services required.

When enabled:

  • Deployment managers must configure and operate the assistant capability.
  • Additional components may be required to support assistant functionality.
  • The system’s security posture changes and must be managed accordingly.

The feature is opt-in and does not require changes to existing Dags or workflows.
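
The disabled-by-default behaviour could be sketched as follows. This is a minimal illustration, assuming a hypothetical `[ai_assistant] enabled` option read through Airflow's existing `AIRFLOW__{SECTION}__{KEY}` environment-variable convention. The section and option names are assumptions; this AIP does not define the actual configuration keys.

```python
import os

# Hypothetical option name; this AIP does not define the actual configuration keys.
ASSISTANT_ENABLED_ENV = "AIRFLOW__AI_ASSISTANT__ENABLED"


def assistant_enabled(environ=None):
    """Return True only when the operator has explicitly opted in."""
    env = os.environ if environ is None else environ
    # An absent or unrecognised value keeps the feature off (the default).
    value = env.get(ASSISTANT_ENABLED_ENV, "false")
    return value.strip().lower() in {"true", "t", "1"}
```

With no value set, `assistant_enabled({})` returns `False`, so a deployment that never touches the option sees no behavioral change.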

What is the level of migration effort (manual and automated) needed for the users to adapt to the breaking changes? (especially in context of Airflow 3)

There are no required migrations or breaking changes for existing users when the feature is disabled (default).

The assistant is introduced as an opt-in capability, and therefore does not impact existing Dags, workflows, or deployments unless explicitly enabled.

When the feature is enabled:

  • Deployment Managers must perform manual setup steps, such as enabling the feature and configuring required integrations.
  • Additional metadata storage may be introduced for audit and operational purposes, which may require applying setup or migration steps as part of enabling the feature.
  • No changes are required from Dag Authors: existing Dag definitions continue to work unchanged.

At this stage, migration effort is low and primarily operational, with no expected need for automated migration utilities, as the feature does not modify existing user code or behavior.

Future phases that introduce state-modifying capabilities may require additional migration considerations and would be addressed in separate AIPs.

Other considerations?

In scope

  • Read-only retrieval of Airflow state (e.g., Dags, task instances, logs, selected metadata).
  • Responses that may be grounded in retrieved system state.
  • Optional: responses that do not rely on live system state (for example, general questions about Airflow).

Out of scope

  • Any state-modifying operations (create, update, delete).
  • Triggering or retrying workflows or tasks.
  • Automated remediation or execution.
  • Privilege escalation beyond the authenticated user’s permissions.
  • Autonomous or background actions.

Future phases may explore controlled, permissioned state-modifying capabilities under separate AIPs.
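
One way the Phase 1 scope boundary could be enforced is a default-deny allowlist of read-only operations. The sketch below assumes hypothetical operation names (`get_dag`, `list_task_instances`, `get_task_logs`) and a hypothetical `OperationNotAllowed` exception; none of them is defined by this AIP.

```python
# Hypothetical read-only operations; the actual tool set is not defined by this AIP.
READ_ONLY_OPERATIONS = frozenset({"get_dag", "list_task_instances", "get_task_logs"})


class OperationNotAllowed(Exception):
    """Raised when the assistant requests an operation outside the Phase 1 scope."""


def check_operation(name):
    """Default-deny: anything not explicitly allowlisted is rejected."""
    if name not in READ_ONLY_OPERATIONS:
        raise OperationNotAllowed(f"'{name}' is outside the read-only Phase 1 scope")
    return name
```

Because the check is an allowlist rather than a blocklist, a newly added state-modifying operation stays unreachable until someone deliberately lists it.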

Constraints and guarantees

The following must always hold:

  • The feature is disabled by default.
  • The assistant operates in a strictly read-only mode in Phase 1.
  • All access is enforced via existing RBAC mechanisms.
  • The assistant must never exceed the permissions of the authenticated user.
  • Sensitive data must be redacted before any external processing.
  • All interactions must be auditable, with configurable retention and redaction.
  • Failures must not impact scheduler, executor, or core Airflow functionality.
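
The ordering implied by the RBAC and redaction guarantees above (check the user's own permission, then fetch, then redact before anything leaves Airflow) might look like the following sketch. The permission string format, the `fetch` callable, and the redaction pattern are all assumptions made for illustration; a real implementation would rely on Airflow's own authorization and secret-masking facilities.

```python
import re

# Illustrative pattern only; real redaction would need a broader, configurable rule set.
SECRET_PATTERN = re.compile(
    r"\b(password|token|secret|api_key)\b(\s*[=:]\s*)(\S+)", re.IGNORECASE
)


def redact(text):
    """Mask values that follow secret-looking keys before text leaves Airflow."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}***", text)


def fetch_for_assistant(user_permissions, required_permission, fetch):
    """Check the authenticated user's own permission, then fetch and redact."""
    if required_permission not in user_permissions:
        # Deny rather than fall back to a more privileged identity.
        raise PermissionError(f"missing permission: {required_permission}")
    return redact(fetch())
```

The key design point is that the permission set passed in belongs to the authenticated user, so the assistant can never read more than that user could read directly.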

Behavioral expectations

  • Users interact with the assistant via natural-language queries.
  • Responses may incorporate Airflow state when relevant.
  • The assistant must clearly distinguish between:
    • Responses based on live system state
    • General, non-grounded responses
  • The assistant must not fabricate unavailable data and must communicate limitations clearly.
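
The distinction between grounded and general responses could be carried in the response object itself. The sketch below is one possible shape, assuming a hypothetical `AssistantResponse` type whose `sources` field lists the Airflow records an answer was derived from; the type, field names, and label wording are illustrative and not part of this proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssistantResponse:
    text: str
    # Identifiers of the Airflow records the answer was derived from;
    # empty when no live state was used.
    sources: tuple = ()

    @property
    def grounded(self):
        return len(self.sources) > 0

    def render(self):
        if self.grounded:
            label = "Based on live Airflow state"
        else:
            label = "General guidance (not based on live state)"
        return f"[{label}] {self.text}"
```

Deriving the label from the presence of sources, rather than from a separately set flag, makes it harder for a response to claim grounding it does not have.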

Security considerations

  • All data access must respect Airflow’s RBAC model.
  • The assistant must not introduce privilege escalation.
  • Sensitive information must be protected through redaction and data minimization.
  • Interactions must be logged for auditability and traceability.
  • The assistant must remain operationally isolated from core scheduling and execution components.
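
Audit logging and failure containment can be combined in a thin wrapper around the assistant call. The following is a sketch under stated assumptions: the logger name `assistant.audit`, the record fields, and the `answer_fn` callable are hypothetical, and the audit storage backend is left open by this AIP.

```python
import json
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("assistant.audit")  # hypothetical logger name


def handle_query(user, question, answer_fn):
    """Answer a query, record an audit entry, and contain any failure."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "status": "ok",
    }
    try:
        reply = answer_fn(question)
    except Exception as exc:  # contain assistant failures; never propagate them
        record["status"] = "error"
        record["error"] = type(exc).__name__
        reply = "The assistant is currently unavailable."
    # The question text is deliberately omitted from the record here; whether and
    # how to retain it is a retention/redaction policy decision.
    audit_log.info(json.dumps(record))
    return reply
```

An audit entry is written on both the success and the failure path, and an exception inside `answer_fn` surfaces only as a degraded reply to the user.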

What defines this AIP as "done"?

This AIP is considered complete when the following criteria are met for Phase 1 (read-only assistance):

  • An opt-in assistant capability is available and accessible only to authorized users.
  • Users can submit natural-language queries and receive responses related to Airflow workflows.
  • Responses can incorporate relevant Airflow state where applicable.
  • All interactions strictly respect RBAC and security constraints.
  • The feature is disabled by default and fully configurable by deployment managers.
  • Failures in the assistant do not impact core Airflow functionality.
  • Auditability and data protection requirements are enforced.
  • Documentation is provided for enabling, configuring, and operating the feature.

Once these criteria are satisfied, the feature can be considered ready for adoption in real-world deployments under the defined Phase 1 scope.