Airflow does not currently have built-in Role-Based Access Control (RBAC). It offers only limited authorization through the admin, data_profiler, and user roles, and associating these roles with authenticated identities is not straightforward. This proposal aims to resolve the issue by building RBAC into Airflow and simplifying access management through the Airflow UI.
I propose implementing Role-Based Access Control within Airflow at two levels:
View Level Access Control (VLAC)
The Airflow UI exposes a number of views grouped under categories such as Admin and Data Profiling. To implement VLAC, I propose switching to the following five roles:
Has access to all the functions and views inside Airflow.
Has access to all views except the User view.
Only has access to the views under the Data Profiling category.
Corresponds to a regular user and has access to all non-admin, non-data-profiling views.
Only has read-only access to all the non-admin, non-data-profiling Airflow views.
User-Group mappings can be managed by an Administrator within the Airflow UI when the authentication backend doesn’t provide group information for the authenticated user. If the authentication backend does provide group information, Airflow will record it at user login and write it to the backend database. This information persists until the user’s next successful login, at which point it is rewritten. Rewriting on every login accounts for any changes made to the user-group mapping within the authentication backend since the user’s last login to Airflow.
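The login-time refresh described above can be sketched as follows. This is a minimal illustration, not the proposed implementation: the `user_groups` dictionary is a hypothetical in-memory stand-in for the backend database table, and `record_login` is an assumed helper name.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical in-memory stand-in for the backend database table
# that stores user-group mappings.
user_groups: Dict[str, Set[str]] = {}


def record_login(username: str, backend_groups: Optional[Iterable[str]]) -> Set[str]:
    """Refresh the stored groups for a user after a successful login.

    If the authentication backend supplies groups, they replace whatever
    was stored before. If it supplies none (None), the existing mapping,
    for example one set by an Administrator in the UI, is left untouched.
    """
    if backend_groups is not None:
        user_groups[username] = set(backend_groups)
    return user_groups.setdefault(username, set())
```

Overwriting rather than merging is what lets a group removed in the authentication backend disappear from Airflow at the next login.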
User-Role and Group-Role Mapping
User-Role and Group-Role mappings will be handled inside Airflow. Only users with the Administrator role will be able to manage these mappings.
Roles and permissions are specific to the Airflow application, and their values will be used in decorators within the Airflow source code. Hence, no role will be allowed to modify the Role and Permission models, and the Role-Permission mapping will not be editable from the Airflow UI.
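To illustrate why permission values end up hard-coded in the source, here is a minimal sketch of a decorator-guarded view in plain Python. It does not use Flask-Principal; `requires_permission`, `AccessDenied`, and the permission string `can_view_user_management` are hypothetical names chosen for this example.

```python
from functools import wraps


class AccessDenied(Exception):
    """Raised when the caller lacks the permission a view requires."""


def requires_permission(permission):
    """Guard a view so it only runs if the caller holds `permission`."""
    def decorator(view):
        @wraps(view)
        def wrapper(user_permissions, *args, **kwargs):
            # The permission name is fixed at decoration time, which is why
            # editing permissions from the UI could break the guarded views.
            if permission not in user_permissions:
                raise AccessDenied(f"missing permission: {permission}")
            return view(*args, **kwargs)
        return wrapper
    return decorator


@requires_permission("can_view_user_management")  # hypothetical permission name
def user_management_view():
    return "user management page"
```

Because the string passed to the decorator lives in the code, renaming or deleting a permission at runtime would leave the decorated view referring to a value that no longer exists.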
DAG Level Access Control (DLAC)
Because DAGs are not created through the Airflow UI, the UI lacks the information needed to make DAG-level access decisions. Hence, DAG-level access information needs to be supplied in the DAG file, which Airflow reads when the DAG is imported.
To implement DLAC, I propose the following DAG-level roles.
Can only view the DAG
Can only view, edit, and execute the DAG
Can only view and execute the DAG
Allows the user to view the DAG and the associated attributes.
Allows the user to clear or change the task/DAG status.
Allows the user to run the DAG
Allows the user to refresh the DAG
DAG Role Permission Mapping
READ_DAG, WRITE_DAG, EXECUTE_DAG, REFRESH_DAG
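As a rough sketch, a DAG role-permission mapping could be held in a read-only lookup table. The assignments below are an assumption derived only from the role descriptions above (view, edit, execute); `DAG_Editor` is a hypothetical placeholder for the role that can view, edit, and execute, and REFRESH_DAG is deliberately left unassigned in this sketch.

```python
# Assumed mapping, derived from the role descriptions; not the authoritative table.
DAG_ROLE_PERMISSIONS = {
    "DAG_Viewer": frozenset({"READ_DAG"}),
    # "DAG_Editor" is a hypothetical name for the view/edit/execute role.
    "DAG_Editor": frozenset({"READ_DAG", "WRITE_DAG", "EXECUTE_DAG"}),
    "DAG_Executor": frozenset({"READ_DAG", "EXECUTE_DAG"}),
}


def dag_role_allows(role: str, permission: str) -> bool:
    """Return True if the given DAG role carries the given permission."""
    return permission in DAG_ROLE_PERMISSIONS.get(role, frozenset())
```

Using `frozenset` values reflects the intent that the mapping is fixed in code rather than editable at runtime.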
Presently, a DAG can specify its owner using the ‘owner’ attribute. However, the ‘owner’ attribute doesn’t necessarily denote the creator(s) of the DAG. The Airflow documentation recommends setting the owner attribute to the Unix username under which the DAG needs to run. At this time, we do not have data on how many Airflow deployments deviate from this recommendation.
I want users to be able to adopt DLAC smoothly, and standardizing the use of the owner attribute would be a breaking change. Hence, I’ve decided not to use the owner attribute in DLAC.
Instead, I propose adding a new DAG attribute called “access_control” with the following syntax:
This optional attribute allows the DAG to declare associations between DAG roles and users/groups. If the attribute is absent, the DAG declares no DAG-level access control.
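For illustration only, one possible shape for such an attribute is a dictionary keyed by DAG role. This shape is an assumption made for this example and is not the syntax defined by the proposal; the user and group names and the `dag_roles_for` helper are likewise hypothetical.

```python
# Hypothetical shape: DAG role -> users and groups that hold it.
access_control = {
    "DAG_Viewer": {"users": ["alice"], "groups": ["analysts"]},
    "DAG_Executor": {"users": [], "groups": ["etl_oncall"]},
}


def dag_roles_for(access_control, username, groups):
    """Return the DAG roles a user holds, directly or through a group."""
    if not access_control:
        # Attribute absent: the DAG declares no DAG-level access control.
        return set()
    member_groups = set(groups)
    roles = set()
    for role, members in access_control.items():
        in_users = username in members.get("users", ())
        in_groups = bool(member_groups & set(members.get("groups", ())))
        if in_users or in_groups:
            roles.add(role)
    return roles
```

Returning an empty set when the attribute is missing keeps the "no declaration" case distinct from a declaration that simply does not mention the user.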
For every logged-in user, VLAC is enforced before DLAC. For certain View Level Roles, DLAC is ignored.
The following table shows whether DLAC is ignored or honored depending on the user’s view-level role.
DAG Level Roles (Ignored/Honored)
DAG_Viewer Role is enforced for all the DAGs
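The enforcement order (VLAC first, then DLAC unless it is ignored for the user's view-level role) can be sketched as below. Which view-level roles ignore DLAC is passed in by the caller rather than assumed here; the function and parameter names are hypothetical.

```python
def is_dag_action_allowed(view_access_granted, dlac_ignored, dag_permissions, needed):
    """Apply VLAC before DLAC for a single DAG action.

    view_access_granted: result of the view-level check for this user.
    dlac_ignored: True if the user's view-level role bypasses DLAC.
    dag_permissions: DAG-level permissions the user holds on this DAG.
    needed: the DAG-level permission the action requires, e.g. "READ_DAG".
    """
    # Step 1: VLAC. A user who cannot reach the view is rejected outright.
    if not view_access_granted:
        return False
    # Step 2: some view-level roles skip DAG-level checks entirely.
    if dlac_ignored:
        return True
    # Step 3: DLAC. The user must hold the needed DAG-level permission.
    return needed in dag_permissions
```

Checking the view-level result first means a DAG-level grant can never widen access beyond what the user's view-level role already permits.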
The proposal will be implemented in two phases:
Limit access to views within the Airflow UI based on view-level access control (VLAC)
Add a User Management menu under the Admin category
Models to be created:
- Adding necessary Flask-Principal decorators to the views
Limit access to specific DAGs based on DAG-level roles (DLAC)
Models to be created:
Models to be modified:
DAG: has_DLAC column needs to be added
Addition of a new “access_control” attribute to the DAG.
Should the DAG_Executor role be allowed to refresh the DAG?