Status

State: Accepted
Discussion Thread:
Vote Thread: https://lists.apache.org/thread/k8txh306lsokvz9ok9v29pd20wjb3vs1
Vote Result Thread: https://lists.apache.org/thread/nkt5v6hqxw1xd5sgtlym6hvpblbmdoss
Progress Tracking (PR/GitHub Project/Issue Label): GitHub Project
Umbrella AIP: AIP-38 Modern Web Application
Date Created: 2024-07-31
Version Released:
Authors:

This AIP is part of AIP-38, which aims to modernize the Airflow UI.

Motivation

Flask AppBuilder, or FAB, has been a useful tool in developing the auth and UI of Airflow 1 and 2, but it has started to become more burdensome than helpful.

We’ve relied on FAB for authentication (authn) and authorization (authz), but as of AIP-56 (Extensible User Management), FAB no longer has to be part of core Airflow, since other systems can provide those functions.

However, today FAB does provide the most commonly used authn/authz for Airflow, and there is certainly value in maintaining backwards compatibility for that purpose and for custom plugins. So let’s set the stage for fully removing FAB later, while maintaining a smooth upgrade path for users today.

Goals

  • Have a smooth migration path from Airflow 2 to Airflow 3, in particular for authn/authz
  • Set ourselves up for a true “FAB-free” future (while also making it possible today)
  • If possible, maintain backwards compatibility for plugins

Proposal

The UI components we use from FAB (e.g. CRUD for TIs) will be handled by AIP-38. The rest of this proposal will address the remaining uses.

Authentication / Authorization

The default FAB auth manager and its adjacent entities will be fully moved into the FAB provider. This is mostly done already, but parts of core still expect FAB, so we can do a fresh pass on this (e.g. remove the need for the ApplessAirflowSecurityManager).

This means an Airflow instance could be fully functional without FAB even being installed in the environment, and instead rely on a custom auth manager or another community-provided auth manager.

In order to power local environments, which typically do not require authn or authz, core Airflow will contain a minimal auth backend that does not require any authentication and allows the user to perform all actions.

The FAB auth manager can be installed and used, if desired, for more complex environments.
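The minimal auth backend described above could look roughly like the following. This is only a sketch under stated assumptions: `AllowAllAuthManager`, `AnonymousUser`, and the method names are hypothetical and do not reflect the actual auth manager interface in Airflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnonymousUser:
    """Hypothetical stand-in user for environments without authentication."""

    name: str = "anonymous"


class AllowAllAuthManager:
    """Hypothetical minimal auth manager: no login, every action permitted."""

    def get_user(self) -> AnonymousUser:
        # There is no login flow, so every request maps to the same user.
        return AnonymousUser()

    def is_logged_in(self) -> bool:
        # Nobody ever needs to authenticate.
        return True

    def is_authorized(self, method: str, resource: str) -> bool:
        # No permission model: any method on any resource is allowed.
        return True
```

A backend like this is only appropriate for local or otherwise trusted environments, since it grants every caller full access.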

Views from plugins

While AIP-68 will add support for Airflow 3 native plugins, backwards compatibility for plugins using FAB views (`appbuilder_views`) will be maintained, if possible.

Ideally, they continue working as-is. However, we reserve the right to remove support completely if we hit insurmountable hurdles during implementation - this is a best effort goal ultimately.

Menu items from plugins

A new interface for menu items will be added to plugins, as the current interface (`appbuilder_menu_items`) is specific to FAB. The FAB-specific menu items interface will be deprecated in favor of a React menu item plugin.

Menu items from FAB (either from views in plugins or default menu items like the security tab) will be available in the Airflow 3 UI. We will likely have an API endpoint to expose the FAB menu items, which will allow the React UI to render them appropriately. This maintains backwards compatibility for the plugins, and also allows the security pages to continue functioning.
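As an illustration of what such an endpoint might return, here is a small sketch that serializes a nested menu into JSON for the UI to render. The `MenuItem` shape, the `menu_items` key, and the example URL are assumptions made for this example, not a defined Airflow or FAB contract.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class MenuItem:
    """Hypothetical shape of a menu entry exposed to the React UI."""

    name: str
    href: str = ""
    children: List["MenuItem"] = field(default_factory=list)


def menu_items_payload(items: List[MenuItem]) -> str:
    """Serialize menu items (including nested children) to a JSON string."""
    # asdict() recurses into nested dataclasses, so children are included.
    return json.dumps({"menu_items": [asdict(item) for item in items]})


# Example: a "Security" category with one child link (URL is illustrative).
security = MenuItem("Security", children=[MenuItem("List Users", "/users/list/")])
print(menu_items_payload([security]))
```

The UI would then only need to know this payload shape, not anything about FAB itself.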

How?

If the FAB provider is installed, and either the FAB auth manager is configured or plugins with FAB views are present, the provider will create an appbuilder app during webserver initialization to be used for that purpose. We will do our best to avoid extra work if possible, but we will iron out the specifics during implementation. For example, the loading order of plugins and providers may make it difficult to detect if a plugin defines FAB views when the FAB provider is being loaded.
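The condition above can be sketched as a small predicate. This reads the condition as "provider installed AND (FAB auth manager configured OR FAB views present)", which is an interpretation; the names `needs_appbuilder_app`, `Plugin`, and `FAB_AUTH_MANAGER` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical identifier for the FAB auth manager in configuration.
FAB_AUTH_MANAGER = "fab"


@dataclass
class Plugin:
    """Hypothetical, simplified plugin record."""

    name: str
    appbuilder_views: List[dict] = field(default_factory=list)


def needs_appbuilder_app(
    fab_provider_installed: bool,
    configured_auth_manager: str,
    plugins: List[Plugin],
) -> bool:
    """Decide whether an appbuilder app should be created at webserver startup."""
    if not fab_provider_installed:
        # Without the provider there is nothing that could create the app.
        return False
    uses_fab_auth = configured_auth_manager == FAB_AUTH_MANAGER
    has_fab_views = any(plugin.appbuilder_views for plugin in plugins)
    return uses_fab_auth or has_fab_views
```

Note that this sketch assumes the full plugin list is already known when the check runs; as mentioned above, the real loading order of plugins and providers may not guarantee that.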

When the FAB auth manager is used, there are some db tables that need to be created or migrated. Instead of being tightly integrated into the sqlalchemy models in core, the FAB provider will take complete ownership of its db tables and maintain its own db cli commands (like migrate/reset/etc).

Why?

You may be wondering why this is worth it - can’t we instead just remove FAB altogether? While we could, today nearly all auth uses FAB. And we do not have a well defined replacement (something like Keycloak) scoped out, even if we do have a good interface to do it technically (auth manager). This means that, practically, to get adoption of Airflow 3 we will need backwards compatibility with the existing auth configuration in Airflow 2. A migration plan for migrating from the FAB auth manager to something else can be developed and made available to the Airflow community at a later date, and we can also end support for the FAB provider once we are comfortable that we have a solid replacement and migration path in place. Compatibility for plugins is just a bonus to make the Airflow 3 migration easier - the authn/authz aspect is the main driver for keeping FAB around.

How are users affected?

  • Less complex dependencies - FAB is no longer required for core
  • Minimal migration pain for users from Airflow 2 to 3
  • Using the FAB provider requires more deployment steps (e.g. separate db commands)

What defines this AIP as done?

  • Flask AppBuilder is no longer required to run the webserver and UI
  • The existing FAB auth manager is fully functional from the FAB provider, but not required by core Airflow
  • If possible, FAB plugin compatibility is maintained


18 Comments

  1. I think it's worth adding that removing Flask AppBuilder will also mean a few more things, notably around the Admin UI and other Admin screens. I know it's probably more of an AIP-38 topic, but some reference to that should be made here, especially since:

    • currently the UI and code provided by FAB is handling CRUD operations for all those Admin components out-of-the-box. (it's kinda similar to Django Admin capabilities). 
    • there is a very deep integration of Connection UI and Hook definition. Currently Hooks (providers) basically contain the definition of the connection form in the FAB-driven Connection UI. All that has to be replaced with a declarative way of defining connection parameters per hook - i.e. how it maps extras to the UI. This means that the UI that replaces Connection will have to be sophisticated enough to handle all the variations there. IMHO this is a small project on its own because it will have to implement a path for providers to migrate - i.e. Providers (and Hooks) should provide both Airflow 2 and Airflow 3 compliant ways of exposing that information to the Airflow webserver (or we will have to find alternative approaches). For me (except the connections) there is very little need for providers to be installed in the webserver, and if we are considering separating Airflow into "core" and maybe a "webserver" and introducing various "environments" for those, possibly another way of exposing the UI Connection definition (other than installing providers) should be implemented. Or maybe we should get rid of it. Anyhow - this is a big part of getting rid of FAB

    1. Related to this one. I might have missed the information but what happens to the secrets? I think it is a topic that spans across multiple AIPs (72, 67) but are we planning to keep them as they are today (stored in the DB and views in the UI to manage them) or enforce using a secret backend?

      1. Yes, secrets should still be DB, Env var & external secrets manager imo. We can't expect a new user or smaller groups to use external secrets manager

    2. +1 on the de-coupling of Connections Form (via FAB) & its info from Providers

      1. Jedidiah Cunningham / Brent Bovenzi – I want to make sure we don't forget about decoupling the Connection Forms from FAB. 

  2. Looking nice.

    This might be a naive question, but do we need to have a UI-specific REST API? I understand that this will make development and iteration much easier because we do not have to worry about backward compatibility, public documentation, etc. On the other hand, it means that we now have multiple REST APIs to maintain (most likely with code duplication, and all it implies). Also, if we need 'additional' non-public stuff to build the UI, it means that other people will most likely not be able to easily build their own UI on top of the public REST API, because obviously some things will be missing from the public one. At this point people might be tempted to depend on the internal UI REST API to get the information they need. It also opens the door to weird REST endpoints that are not RESTful and poorly designed, just because the API is not public, so it can become ugly quickly. (This is already kind of the case for certain private endpoints.)


    I think that we can achieve a complete enough public API that exposes all the information we need for the UI and that can be reused by everybody. I think a 'complete' and elegant public API might be more ideal, but I also understand the business constraints and the flexibility that we need for building the UI. That might just be too much work.

    Having a dedicated rest API might be the best approach, I just want us to think about the few cons it implies.


    1. Or maybe we can plan an upgrade path for a UI-specific endpoint to become a public one. This way things are more 'flexible' and less rigid between the public and UI-specific endpoints. Maybe we can have 1 REST API and different types of endpoints, instead of two completely separate APIs. Some endpoints documented and publicly exposed, some others that are UI specific and internal, just thinking out loud.

      1. Maybe we can have 1 REST API and different types of endpoints, instead of two completely separate APIs. Some endpoints documented and publicly exposed, some others that are UI specific and internal, just thinking out loud.

        Yeah I like that, I agree that having 2 different endpoints could lead to some duplication. I definitely think we need endpoints specific to the UI which have no value in being exposed publicly, but we could expose everything under the REST API and document only the public ones as Pierre suggested. We should also encourage using public REST API endpoints as much as possible. If the REST API endpoints we currently have are not suited to be used by the UI, then we should fix/update them

        1. How about we add a "UI" group to the REST API, where we have a disclaimer that those endpoints are subject to change between versions. We reduce the boilerplate of running separate APIs, but also have everything live closer together making it easier to decide to make an endpoint fully public.

          1. I am fine with that, but I personally prefer not exposing these UI-only endpoints in the docs at all. Even with the disclaimer, that could create some friction for users and discourage them from upgrading versions

            1. I was looking into the OpenAPI spec. It is possible to tag endpoints and then have a script that removes tagged endpoints from the generated docs. That doesn't feel robust enough to me. I'm back to the idea that they're separate APIs, but ultimately very similar so it should still encourage us to develop public rest api when possible.

              1. So you expect the UI to use only this API or use the Rest API and the UI API (when the API is not available in Rest API)? I think the latter is better because we do not want to duplicate APIs

                1. Yeah, the UI will use public rest endpoints when they exist/are suitable, but things like the home page or the cluster overview that return data across many "REST" resources are where we are likely to need specialized UI ones, and by not having them be subject to the API version contract means we are free to change them from version to version at our will.

                  1. Exactly, the UI will use both the public and UI-only APIs and only those two APIs.

                    Since this is actually quite different from changing FAB, I decided to spin this off as a separate proposal: AIP-84 UI REST API

                      1. I completely agree. We need custom UI-specific endpoints. I am pretty sure we can specify at a blueprint level or even 'per route' if things should be part of the swagger doc or not. (doc=False, or json_spec=False)... We need to check how to achieve this with connexion or equivalent. Then we can have a dedicated blueprint api/ui that is not part of the public documentation, and not subject to semver.

                        In case we want to move an endpoint from private UI blueprint to the public one, we just need to register it on the appropriate blueprint, and no pain here. (Also all routes are at the same place in sources, and easy to deduplicate/factorise)

  3. I really like the updated incarnation here, specifically because it reduces the migration challenges from Airflow 2 to Airflow 3. 
    I believe that this is a pragmatic approach given the choices.