
Status

State: Accepted
JIRA: AIRFLOW-6925
Created: 2020-02-26


This document captures the design of REST API for Apache Airflow.

Summary of Changes

No. | Version   | Date       | Description
1   | v20200226 | 2020-02-26 | First version
2   | v20200403 | 2020-04-03 | Adaptation of permissions to the current Airflow model

Background

We currently have one experimental API, but despite existing for two years, it has not reached a stable state. There are many critical issues with this implementation, including: no defined schema, a very narrow range of functions, an unsafe default configuration, no HATEOAS, and many others.

In the past, Drewstone began work on a REST API based on OpenAPI, described in AIP-13: OpenAPI 3 based API definition. However, it was not completed, due to lack of interest from the author and problems configuring Kerberos (at that time Breeze was still being developed, so configuring all dependencies, including Kerberos, was difficult). It also covered only a narrow range of the features users expect; e.g. access to XCom data and management of connections were missing.

The Polidea and Google teams, together with the community, want to make another attempt based on our and the community's experience. Airflow deserves a new, stable solution.

Goal and non-goals

This chapter sets out the success criteria and limits of this change. It applies only to this AIP; the project may have additional goals that depend on the API. The API is a complex component, so determining the scope of work is very important.

Goal

Create solid fundamentals

We want to develop a solution that meets the following basic requirements:

  • It should be easy to maintain - it provides unified methods for major technical concerns (e.g. CRUD operations on database objects are carried out in the same way), and the libraries used are widely supported.
  • It should be trustworthy - requests and responses will be validated against the schema file.
  • It should be extensible - the API schema should not change between Airflow versions and should not limit the development of the project.
  • It should be secure - to invoke an API action, the client must have the required permissions.
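
The "trustworthy" requirement above means every request and response is checked against the schema before it is accepted. In practice a framework such as Connexion does this automatically from openapi.yaml; the sketch below only illustrates the idea, and the schema shape and field names are assumptions, not the real specification.

```python
# Minimal sketch of validating a payload against a declared schema.
# The schema below is an illustrative assumption, not Airflow's real one.
CONNECTION_SCHEMA = {
    "required": ["connection_id", "conn_type"],
    "properties": {
        "connection_id": str,
        "conn_type": str,
        "host": str,
        "port": int,
    },
}

def validate(payload: dict, schema: dict) -> list:
    """Return a list of validation errors (an empty list means valid)."""
    errors = []
    for field in schema["required"]:
        if field not in payload:
            errors.append(f"missing required field: {field}")
    for field, value in payload.items():
        expected = schema["properties"].get(field)
        if expected is None:
            errors.append(f"unknown field: {field}")
        elif not isinstance(value, expected):
            errors.append(f"wrong type for {field}")
    return errors
```

A payload such as `{"connection_id": "postgres_default", "conn_type": "postgres"}` passes, while a payload missing `connection_id` or carrying an undeclared field is rejected with a descriptive error.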

Complete and universal

The API will allow you to perform all operations that are available through the Web UI and the experimental API, as well as those CLI commands that are used by typical users. For example, we will not provide an API to change the Airflow configuration (this is possible via the CLI), but we will provide an API to read the current configuration (this is possible via the Web UI).

The API will not provide operations that are completely new, even though some are expected by customers, e.g. isolation of workers from the scheduler.

The API will be intended for use by any third party. It should not be tied to a specific application, e.g. a React UI.

Develop a privilege model that will be usable by Web UI and API

Update the permission model to use the new API.

Update the Web UI and the experimental API to use the new permission model.

Create the extension points for authorization

We want to create extension points that will enable the development of authorization mechanisms, e.g. OpenID, Kubernetes, LDAP in an independent manner. Specific implementations may occur during development, but they will not be discussed in this document.

Support Python and SQLAlchemy objects

The objects in Airflow are divided into two types:

  • SQLAlchemy objects - they always have a known structure and are permanently saved to the database.
  • Python objects, e.g. DAG/BaseOperator - they can be created dynamically and reside only in memory. They have no direct matches in the database; the database stores only simplified equivalents.

We want to build an API that provides information about the objects in the database and, in read-only mode, about the simplified Python objects.
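
The two object families above can be sketched as follows. The class and field names are illustrative assumptions, not Airflow's real classes: the point is that database-backed objects support full CRUD, while in-memory Python objects are exposed only through simplified, read-only views, and both serialize the same way for API responses.

```python
from dataclasses import dataclass, asdict

@dataclass
class VariableRow:
    """Database-backed object: known structure, full CRUD via the API."""
    key: str
    val: str

@dataclass(frozen=True)
class DagSummary:
    """Simplified, read-only view of an in-memory Python DAG object.

    frozen=True makes it immutable, mirroring the read-only contract.
    """
    dag_id: str
    is_paused: bool

def serialize(obj) -> dict:
    """Both families serialize to JSON-compatible dicts for responses."""
    return asdict(obj)
```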

Built with the community

We do not want a solution built by one person; we want to work with all interested people to develop the best solution. After reaching consensus on the mailing list, we will create tickets in JIRA.

Non-goal

API Client

The API will not depend on a specific client implementation. Clients can use any language and technology to send requests to the API server. It will even be possible to use curl/Bash to send API requests.
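
Because the API is plain HTTP and JSON, a client needs nothing Airflow-specific. The sketch below builds an authenticated request using only the Python standard library; the base URL, path, and bearer-token scheme are illustrative assumptions, not the final specification.

```python
import urllib.request

# Illustrative base URL; the real deployment address will differ.
BASE_URL = "http://localhost:8080/api/v1"

def build_request(path: str, token: str) -> urllib.request.Request:
    """Prepare an authenticated GET request (not yet sent)."""
    return urllib.request.Request(
        f"{BASE_URL}/{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )

req = build_request("dags", token="secret")
# urllib.request.urlopen(req) would send it; roughly equivalent to:
#   curl -H "Authorization: Bearer secret" http://localhost:8080/api/v1/dags
```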

Update WebUI or CLI to use the API

While the ultimate goal is for the Web UI and CLI to use the API, updating these components will require a lot of work and is not necessary to achieve the other goals.

Delete the current API

The experimental API has existed for a very long time and has become part of a large number of solutions. For this reason, its deprecation should include a transition period.

Aggregated data

The goal is not to develop an API that is tailored to a specific use case. Instead, we will develop solutions that enable new endpoints to be added independently.

API for optional components 

The goal is not to develop an API for non-fundamental components, although there are expectations for an API that provides additional features, for example one that:

  • allows access to node information when CeleryExecutor is used,
  • enables deeper integration between Airflow and Kubernetes when KubernetesExecutor is used,
  • supports monitoring.

Integration with these components will be covered in other documents.

Create authorization plugins

The goal is only to develop extension points; the plugins themselves will be developed independently. Each user has different requirements, so it is not possible to choose a single best solution; instead, we should support all common mechanisms. There are also users who do not need any authorization at all, and they are the first candidate users.

Technology

We will use HTTP and JSON, as these are the most common technologies. Protobuf is also popular but has compatibility limitations, e.g. it cannot easily be used with curl.

OpenAPI specification

OpenAPI specification is available on Github: 

https://github.com/PolideaInternal/airflow/pull/653

It is not fully complete yet but contains the most important elements. For example, we would like to add HATEOAS, but we can define it after specifying the endpoints.

The collection identifier segments in a resource name use the plural form of the noun used for the resource. (For example, a collection of Connection resources is called connections in the resource name.)
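
The naming convention above can be sketched as a simple rule: lowercase the resource noun and append "s". This naive sketch covers only simple nouns; a real mapping would need explicit overrides for multi-word or irregular resource names, which are not enumerated here.

```python
def collection_path(resource: str) -> str:
    """Derive the collection segment for a simple resource noun,
    e.g. Connection -> /connections, Variable -> /variables.

    Illustrative only: irregular or multi-word names need overrides.
    """
    return "/" + resource.lower() + "s"
```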

Permission model

In this chapter, we'll consider how to limit permissions in the API as well as in Web UI.

Current approach

Airflow currently uses view-based permissions based on Flask App Builder. We have the following permissions:

can_add, can_blocked, can_chart, can_clear, can_code, can_conf, can_dag_details,
can_dag_edit, can_dag_read, can_dag_stats, can_dagrun_clear, can_delete,
can_duration, can_edit, can_gantt, can_get_logs_with_metadata, can_graph,
can_index, can_landing_times, can_list, can_log, can_paused, can_refresh,
can_rendered, can_run, can_show, can_success, can_task, can_task_instances,
can_task_stats, can_tree, can_tries, can_trigger, can_varimport, can_version,
can_xcom, clear, menu_access, muldelete, set_failed, set_running, set_success

New approach

We will keep the integration with Flask App Builder. Permission control will continue to be based on views and permissions.

Access to the API will require an additional privilege - `can_api_access`.

To facilitate permission management, a view will be provided that allows permissions for the API and the Web UI to be assigned in one place. All configuration options currently supported by Airflow will be preserved.
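
The effect of the new model can be sketched as a simple conjunction: an API call requires both the endpoint's view permission and the additional `can_api_access` privilege. This is a standalone illustration of the rule, not Flask App Builder's real API.

```python
def can_call_api(user_permissions: set, view_permission: str) -> bool:
    """API calls require the view permission AND can_api_access."""
    return (
        "can_api_access" in user_permissions
        and view_permission in user_permissions
    )

# A Web UI-only user lacks can_api_access, so API calls are denied
# even for views they can read in the browser.
viewer = {"can_dag_read"}
api_user = {"can_dag_read", "can_api_access"}
```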

Implementation

We will use Connexion. This is the most stable and mature solution. It supports Flask and is therefore compatible with our application.
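
As a rough illustration of how Connexion wires this together: each operation in openapi.yaml names, via operationId, the Python callable that handles it, and Connexion validates requests and responses against the declared schema. The fragment below is a sketch under that assumption; the path, module, and function names are illustrative, not the final specification.

```yaml
paths:
  /connections:
    get:
      summary: List connections
      # Connexion imports and calls this function for matching requests.
      # The module and function names here are illustrative assumptions.
      operationId: myapp.endpoints.connections.list_connections
      responses:
        '200':
          description: A list of connections.
```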

Other alternatives presented in the "Rejected Proposals" section were also considered.

Documentation

We will have the following documentation:

  • OpenAPI specification
  • REST API Reference
  • Guide "How to use API"
  • Migration guide from the experimental API to the REST API
  • Migration guide for new permission model

The API Reference will be generated from the openapi.yaml file. This file will also be used in tests, so the documentation will always be correct and will be easy to keep in good condition. Guides will explain basic operations and facilitate first use of the API.

Contribution

We invite everyone to contribute. Most design discussions will take place on the mailing list: dev@airflow.apache.org

Discussion thread: [PROPOSAL][AIP-32] Airflow REST API

We also have the #sig-api channel to talk about non-key decisions and to coordinate our work.

Registration link: https://apache-airflow-slack.herokuapp.com/

All changes are labelled "area:API" on GitHub.

High-level information is available in the issue.

Rejected proposals

This section summarizes discussions about solutions that were not implemented. It has been moved here to improve the readability of the document.

API generator based on the database model

There are ready-made solutions that allow you to quickly create an API based on a database model. They have the following advantages:

  1. they allow us to create an API quickly with a small amount of code,
  2. they allow flexible filtering,
  3. they have built-in permission control.

However, these are not the features most important to us. Airflow is a very mature product used in complex solutions, including integration platforms. Other systems often integrate via the API, which makes API stability very important; we cannot afford to break backwards compatibility. That would not be possible with this approach: if the API allows arbitrary filters, then when we change the database structure, e.g. by splitting tables or completely changing the storage (Redis vs. SQL), the filters will stop working properly.

However, if the filtering is done on the user's side, this is not a problem. Nor is it a big inconvenience for the user, because users do not expect an immediate response when Airflow is part of a complex platform.

This type of generator is useful if you are building a two-component application (frontend and backend) and can guarantee that both applications are deployed simultaneously. In such cases, the API is rarely used by third parties. However, we want to build an API primarily for use by third parties.

FlaskAppBuilder

FAB's ModelRestApi has the limitation described in the previous section. We could also use its BaseApi, which does not have those problems, but it lacks support for OpenAPI schema verification. Most importantly, it is less popular for building APIs: new contributors would need to learn FAB to make changes, and in practice changes would be made based on a partial understanding of FAB; there are not many FAB REST API experts. Connexion, on the other hand, is a stable, reliable and trustworthy solution, and there is a high probability that contributors already know it from other projects. Connexion also supports Flask and Tornado, which reduces our dependence on one framework and would, in the future, enable running the API server in asynchronous mode. However, that is not part of this AIP.

Disclaimer

This document assumes you are already familiar with the Airflow codebase, and it may change over time based on feedback.

4 Comments

  1. Haven't dug too deep on this but seems quite reasonable at a high level. I'm guessing wrapping DB access in API calls is out-of-scope for this AIP but could be a follow up?

  2. Yes. We want to start by building an API for use by third parties.  Worker/Internal API will have to be more complex.

  3. My main question is regarding the permission model: If we don't use FAB Permissions and implement our own, how do we sync and make it compatible with FAB?

  4. Hi,

    I just wanted to ask, assuming that:

    1. Most of it is going to be implemented from scratch.
    2. There are a lot of IO requests (with DB connections).
    3. Airflow supports python 3.6 and above.

    Did you consider going with other webservers (rather than Flask)? AFAIK Flask doesn't support asyncio. What about httpio/tornado or any other async solution?