Status

State: Completed
Discussion Thread: https://lists.apache.org/thread/kxkctcbh9drfw065dgvr673zl0xyfl3r
Voting Thread

Initial implementation: https://lists.apache.org/thread/mj6jdc18c5zbpmcy2568xz6k9lzn76hf → Accepted

vote on part 2: https://lists.apache.org/thread/4dkbwob1wyl3xjbqdsmbd1mvgzflzp1f → Option B

Motivation

As users of Airflow for our custom workflows, we often use DagRun.conf attributes to control content and flow. The current UI only allows launching a DAG with parameters via the REST API or by entering a JSON structure in the UI. This is technically feasible but not user friendly: a user needs to model, check, and understand the JSON and enter parameters manually, without the option to validate them before triggering.

As in Jenkins or GitHub/Azure pipelines, we would like Airflow to offer a UI option to trigger a DAG through a form in which the parameters are specified.

Considerations

This AIP was created after working for a longer time with a custom UI plugin that we implemented to allow users to "easily" trigger DAGs with parameters. More and more we realize that every use case and demo we make requires a simple entry form to show "how easy" it is (so as not to scare users away). We propose to "productize" the feature from the plugin and merge it into the Airflow core UI, hoping that more people can use this feature, that the entry barrier for "new" users is lowered, and that the gaps between today's standard UI path to trigger a DAG and the way the UI plugin is integrated are closed.

In the past we also had a separate UI (outside of Airflow) just to collect parameters and trigger the Airflow DAG in the backend via the REST API. This fully separated the user from Airflow and, in the end, required a largely redundant UI to be developed and maintained.

What change do you propose to make?

Note

This (technical) proposal is based on the current Flask AppBuilder (FAB) and existing structures. Changes made for AIP-38 Modern Web Application might influence the final implementation. Feedback welcome.

Implementation Part 1) Specifying Trigger UI on DAG Level - DONE

We propose to leverage the params attribute of the DAG class with some additional optional attributes so that a UI (one form per DAG) can optionally be specified in the DAG itself. As the params attribute is built upon the Params class, which defines meta-data for JSON schema validation, this definition already provides 80% of the details needed to render a form. See also https://airflow.apache.org/docs/apache-airflow/stable/concepts/params.html.

The following details to model a form can directly be leveraged from existing Params class definition:

  • The order of fields of the form
  • Name of the field, using the dict key. As this might not be very descriptive the existing JSON schema field title can be used if available for a better "Human readability".
  • Type of the field derived from the JSON schema type and formatting clause
    • Type boolean can render a checkbox or on/off control
    • Elements of JSON format date and date-time can render a date picker
    • Arrays can be rendered as lists of text strings, and object types as a JSON sub-element entry
  • Elements defined with an enum can be used to define drop-down selects for a pre-defined list of options
  • Whether the JSON schema allows the type null can control if a field is optional or required.
  • The JSON schema attribute description can be used to render field help text
  • Some JSON schema attributes for text like maxLength can be used to restrict user input on the form immediately
  • Further attributes can be flexibly defined on the Params class if we want to add form features in the future. However, available options and features cannot be validated inside your editor, as they just extend kwargs. A later switch from FAB to a modern UI can therefore be done w/o changing the structure of the form definition, just by adding a different UI logic.
  • If just an arbitrary dict is used as default params then the existing dict value data types can be used to "guess" a suitable form even if the Params class for JSON schema is not used.
  • We can also use more attributes on the Params class to allow rendering of custom HTML fields
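
The mapping in the list above could be sketched roughly as follows. This is a minimal illustration, not the Airflow implementation: each param is modeled as a plain dict of JSON schema attributes, and the function names and widget names (pick_widget, describe_field, "select", "checkbox", ...) are hypothetical.

```python
from typing import Any


def _type_set(schema: dict[str, Any]) -> set[str]:
    # JSON schema allows a single type or a list such as ["string", "null"].
    types = schema.get("type", "string")
    return {types} if isinstance(types, str) else set(types)


def pick_widget(schema: dict[str, Any]) -> str:
    """Return a (hypothetical) widget name for one param's JSON schema."""
    types = _type_set(schema)
    if "enum" in schema:
        return "select"          # pre-defined list of options
    if "boolean" in types:
        return "checkbox"        # on/off control
    if schema.get("format") in ("date", "date-time"):
        return "date-picker"
    if "array" in types:
        return "text-list"       # list of text strings
    if "object" in types:
        return "json-editor"     # JSON sub-element entry
    return "text"


def describe_field(key: str, schema: dict[str, Any]) -> dict[str, Any]:
    """Collect what a form renderer would need for one field."""
    return {
        "label": schema.get("title", key),        # fall back to the dict key
        "widget": pick_widget(schema),
        "required": "null" not in _type_set(schema),
        "help": schema.get("description", ""),
        "max_length": schema.get("maxLength"),
    }
```

Because dicts keep insertion order, iterating over the params in definition order also yields the order of the form fields.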

Some additional (optional) attributes are proposed to be added to control a form layout:

  • An additional attribute hidden can be used to pass static values without showing them in forms, as not all values should be presented to the user. (Update note: this was finally implemented via the const attribute.)
  • An additional attribute advanced can be used to let the user focus on core elements, separating out additional fields which are "expert options". (Update note: this was finally implemented via the section attribute.)
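
As a rough sketch of how a renderer might use the two attributes from the update notes (const and section), the snippet below splits params into hidden static values and visible fields grouped by section. Params are again modeled as plain dicts, and layout_form is a hypothetical helper, not an Airflow function.

```python
from typing import Any


def layout_form(
    params: dict[str, dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, list[str]]]:
    """Split params into hidden constants and visible fields per section."""
    hidden: dict[str, Any] = {}
    sections: dict[str, list[str]] = {}
    for key, schema in params.items():
        if "const" in schema:
            # Static value: passed along with the trigger, never shown.
            hidden[key] = schema["const"]
            continue
        # Fields without a section land in the unnamed default group "".
        sections.setdefault(schema.get("section", ""), []).append(key)
    return hidden, sections


EXAMPLE_PARAMS = {
    "environment": {"type": "string"},
    "api_version": {"const": "v2"},
    "retries": {"type": "integer", "section": "Expert options"},
}
HIDDEN, SECTIONS = layout_form(EXAMPLE_PARAMS)
```

Here only environment would be shown up front, retries would sit in a separate "Expert options" group, and api_version would be passed along unchanged.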

Nevertheless re-using the existing Params class comes with a set of limitations compared to a separate definition of a user entry form:

  • The initial idea of allowing multiple forms per DAG is complex; re-using Params restricts a DAG to a single entry form (which might be acceptable as it eases definition)
  • Deeply nested, complex JSON schemas (anything other than simple key/value) would be very hard to handle in a generic form, so the proposed implementation is limited to simple key/value structures. Any complex and optional deeply nested JSON structure would otherwise defeat a generic form.
  • Using enum to define pick lists unfortunately prevents the user from adding entries beyond the list, because JSON schema validation will kick in.

With this extension the current behavior is continued and users can specify if a specific UI form is offered for the Trigger DAG option.

Implementation Part 2) UI Changes for Trigger Button - OPEN

Note: During the implementation PR there was a short discussion with Ash Berlin-Taylor and, so as not to block the merge, the implementation was reverted. As there might be multiple opinions on the "Trigger Button" implementation, a vote is being called again on the devlist (link: https://lists.apache.org/thread/4dkbwob1wyl3xjbqdsmbd1mvgzflzp1f)

The function of the trigger DAG button in DAG overview landing ("Home" / templates/airflow/dags.html) as well as DAG detail pages (grid, graph, ... view / templates/airflow/dag.html) is adjusted so that per DAG the available form(s) can be controlled for better user guidance. 

Option A) We leave the Trigger button on the UI as it is today, do not implement part 2 of this AIP, and close it as completed.
Preview of how it would look (no surprise, like today!):


Option B) SELECTED: The Trigger button no longer shows the selection list. Page handling is based on the DAG definition. If the DAG has params defined (which are not const only), the current view "Trigger DAG w/ config" is shown. If the DAG has no params defined (or only const params), the form is skipped and the trigger is executed directly, returning to the origin page. Flowchart like this:
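
The decision described for Option B can be sketched in a few lines. This is a minimal illustration under the assumption that params are available as plain dicts of JSON schema attributes; needs_trigger_form and trigger_action are hypothetical names, not Airflow functions.

```python
from typing import Any


def needs_trigger_form(params: dict[str, dict[str, Any]]) -> bool:
    """True if at least one param is user-editable, i.e. not const-only."""
    return any("const" not in schema for schema in params.values())


def trigger_action(params: dict[str, dict[str, Any]]) -> str:
    """Decide what a click on the Trigger button does for one DAG."""
    if needs_trigger_form(params):
        return "show_form"        # the "Trigger DAG w/ config" view
    return "trigger_directly"     # run at once, then return to the origin page
```

A DAG without params, or with only const params, is triggered directly; any other DAG opens the form.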

Option C) A global option in airflow.cfg is added to define how trigger buttons on all views should behave for all DAGs. This option was initially contained in the PR and is up for discussion here. The parameter is called trigger_button_mode and has the following options:

  • AUTO (Default): Sets the trigger button behavior to what is described in Option B
  • FORM: Always shows the trigger form with parameter selection. Corresponds to the current option of the button named "Trigger DAG w/ config"
  • TRIGGER: Always skips the form and triggers w/o asking. Corresponds to the current option of the button named "Trigger DAG"
  • CUSTOM_URL: Replaces the trigger form with a custom endpoint (potentially external to Airflow) which has any kind of UI. That UI takes care of whatever shall be displayed and potentially uses the REST API to execute the trigger call. The endpoint to direct the user to would be configured as trigger_button_url and is used as the href of the link. This can be a relative or absolute, even external, URL, e.g. http://my-custom-ui.something.com/trigger_something/{dag_id}?origin={origin}, where the tokens dag_id and origin are replaced before the URL is rendered in the UI.
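
The token replacement for CUSTOM_URL could look roughly like this. The sketch only covers the substitution of {dag_id} and {origin}; render_trigger_url is a hypothetical helper, and URL-encoding the values is an assumption of this sketch, since the proposal only states that the tokens are replaced.

```python
from urllib.parse import quote


def render_trigger_url(template: str, dag_id: str, origin: str) -> str:
    """Fill the {dag_id} and {origin} tokens of a trigger_button_url value."""
    # Assumption: values are percent-encoded so they cannot break the URL.
    return (
        template
        .replace("{dag_id}", quote(dag_id, safe=""))
        .replace("{origin}", quote(origin, safe=""))
    )


URL = render_trigger_url(
    "http://my-custom-ui.something.com/trigger_something/{dag_id}?origin={origin}",
    dag_id="example_form_demo",
    origin="/home",
)
```

Plain string replacement is used instead of str.format so that other braces in a configured URL are left untouched.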


Option D) The DAG model is extended so that the trigger button behavior can be adjusted per DAG. This was the initial proposal of AIP-50, but during implementation it was noticed that this option generates high complexity, because the DAG model as well as the DB schema need to be extended. The DB schema needs to be extended so that the DAG option is persisted to the DB and the DAG overview can be displayed w/o de-serializing the DAG.

Proposal is to add a DAG attribute trigger_ui: str | List[str] = ["none", "form"]:

  1. If there is a single Trigger UI specified for the DAG, the button directly opens the form on click

  2. If more than one Trigger UI is defined for the DAG, then a list of UIs is presented, similar to today's drop-down with its two options (with and without parameters).

Menu names for (2) and URLs are determined by the UI class members linked to the DAG. The following options are proposed:

  • "none": Skips any user entry form and triggers the DAG w/o further details using the default parameter values (like today's option "Trigger DAG" or option TRIGGER in Option C)
  • "form": Shows the new form as described above as new feature in AIP-50 (like today's option "Trigger DAG w/ config" or option FORM in Option C)
  • A tuple ("display text", "url to call") can be used to provide a generic text label and a URL to forward the user to a custom plugin or UI as seamless integration if a more complex UI is needed. (Like option CUSTOM_URL in Option C)

DAGs not defining the attribute trigger_ui will use ["none", "form"] as default.
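
To illustrate the proposed trigger_ui attribute of Option D, the following sketch normalizes its value into a list of menu entries. It is hypothetical throughout: build_trigger_menu, opens_directly and the "builtin:..." targets are placeholders invented for this example, and Option D was not the selected option.

```python
from typing import Optional, Union

TriggerUi = Union[str, tuple[str, str]]
MenuEntry = tuple[str, str]  # (display text, target)


def build_trigger_menu(
    trigger_ui: Optional[Union[TriggerUi, list[TriggerUi]]] = None,
) -> list[MenuEntry]:
    """Normalize a trigger_ui value into (label, target) menu entries."""
    if trigger_ui is None:
        trigger_ui = ["none", "form"]      # proposed default
    if isinstance(trigger_ui, (str, tuple)):
        trigger_ui = [trigger_ui]          # a single UI was given
    menu: list[MenuEntry] = []
    for entry in trigger_ui:
        if entry == "none":
            menu.append(("Trigger DAG", "builtin:none"))
        elif entry == "form":
            menu.append(("Trigger DAG w/ config", "builtin:form"))
        elif isinstance(entry, tuple) and len(entry) == 2:
            menu.append(entry)             # custom ("display text", "url")
        else:
            raise ValueError(f"unsupported trigger_ui entry: {entry!r}")
    return menu


def opens_directly(menu: list[MenuEntry]) -> bool:
    """With a single entry the button opens it without showing a list."""
    return len(menu) == 1
```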



Comparison of options:

Options: A) No Change | B) Automatic Pick | C) Global Config | D) DAG Level Config
UI Feature Richness: oo+++
Complexity: +++++--
Config change: oo--o
Model change: ooo--
DB Change: ooo--
Performance Impact: ooo??
Backwards Compatibility: ++-+++

Implementation Part 3) Standard Implementation for Forms (Actually core new feature) - DONE

Implement/contribute a user-definable key/value entry form (general layout following today's JSON entry) which allows the user to easily enter parameters for triggering a DAG.

An example parameter set could look like this:

Example DAG with form
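from airflow import DAG
from airflow.models.param import Param
from airflow.operators.empty import EmptyOperator

with DAG(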
    dag_id="example_form_demo",
    (...)
    params={
        "param1": Param(
            "default value",
            type="string",
            title="Parameter1",
            description="(Optional Description and hints)",
        ),
        "parameter_2": Param(
            "user selection",
            enum=["foo", "bar", "user selection"],
        ),
        "flag3": Param(
            False,
            type="boolean",
            title="Parameter3",
            description="(Optional Description and hints)",
        ),
    },
) as dag:
    run_this_last = EmptyOperator(
        task_id="run_this_last",
    )
    (...)

A form for this example could look like:

   Parameter1: <HTML input box for entering a value>
               (Optional Description and hints)
   
   parameter_2: <HTML Select box of options>
   
   Parameter3: <HTML Checkbox on/off>
               (Optional Description and hints)
   
   <Trigger DAG Button>

The resulting JSON would use the parameter keys and values and render the following DagRun.conf and trigger the DAG:

Example generated DagRun.conf
{
  "param1": "user input",
  "parameter_2": "user selection",
  "flag3": true
}
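
How submitted form values might be turned into the DagRun.conf shown above can be sketched as follows. This is an illustration only: the params are plain-dict stand-ins for the example DAG, keeping the default under a "default" key is an assumption of this sketch, and form_to_conf is a hypothetical helper.

```python
from typing import Any

# Plain-dict stand-ins for the three params of the example DAG.
PARAMS: dict[str, dict[str, Any]] = {
    "param1": {"type": "string", "title": "Parameter1", "default": "default value"},
    "parameter_2": {"enum": ["foo", "bar", "user selection"], "default": "user selection"},
    "flag3": {"type": "boolean", "title": "Parameter3", "default": False},
}


def form_to_conf(
    params: dict[str, dict[str, Any]], form: dict[str, str]
) -> dict[str, Any]:
    """Build a conf dict from the key/value pairs an HTML form submits."""
    conf: dict[str, Any] = {}
    for key, schema in params.items():
        if schema.get("type") == "boolean":
            # HTML checkboxes only submit a value when they are ticked.
            conf[key] = key in form
        else:
            # Fall back to the declared default if the field was not sent.
            conf[key] = form.get(key, schema.get("default"))
    return conf


CONF = form_to_conf(
    PARAMS,
    {"param1": "user input", "parameter_2": "user selection", "flag3": "on"},
)
```

A real implementation would additionally validate the resulting dict against the JSON schema before triggering.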

The number of form values, parameter names, parameter types, options, order and descriptions should be configurable in the DAG definition.


Example view of how today's solution is implemented via a UI plugin integrated into the Airflow UI (but not linked to the standard "Trigger DAG" button):


Note that the advanced parameters and the generated job config are hidden by default and need to be unfolded.

Implementation Part 4) Examples - DONE

Provide 1-2 example DAGs which show how the trigger forms can be used. Adjust existing examples as needed.

Implementation Part 5) Documentation - DONE

Provide the documentation needed to describe the feature and its options. This would include a description of how to add custom forms beyond the standard ones via Airflow Plugins and custom Python code. The proposal is to extend the "Params" section of the docs (https://airflow.apache.org/docs/apache-airflow/stable/concepts/params.html) to describe how to generate a form from the params definitions.

What problem does it solve?

This proposal closes the functional gap by allowing user-friendly trigger UI forms for starting DAGs with parameters (w/o every user having to understand the specific JSON structure required to trigger a DAG). It also adds functionality which users know and expect from e.g. the CI/CD pipelines of GitHub/Azure DevOps, Jenkins (and many more): simple forms can be added to trigger with parameters.

Why is it needed?

  • It adds missing functionality that allows user-friendly triggering of workflows/DAGs w/o specific expertise in DAG conf (learning curve)
  • Such an entry form could also provide an easy path to validate parameters before triggering (compared to triggering with arbitrary JSON and having the DAG hit the wall because of bad parameters passed by a non-expert user)
  • It avoids scaring users when showing them how to run (self-service) workflows
  • It removes the need to add custom UIs in front of Airflow to "hide" Airflow's internal complexity

Are there any downsides to this change?

  • Airflow Codebase will be extended by ~1000 LoC (additional potential maintenance)
  • Extension of UI must be discussed in conjunction with ongoing discussion and implementation of AIP-38 Modern Web Application

Which users are affected by the change?

Standard and power users of Airflow are not affected. The change mainly benefits occasional or entry-level users, allowing them to trigger DAGs w/o much experience and lowering the entry complexity.

How are users affected by the change? (e.g. DB upgrade required?)

With the proposed change no user or integrator is affected. The change is purely an extension, with no breaking changes and no need to change the DB layout.

Other considerations?

The initial idea of implementing forms as structured Python classes was dropped based on the feedback received, in favor of leveraging the existing Params structure. Specific dedicated user entry form modules can of course still be added later. The proposed approach is an 80/20 proposal (covering 80% of use cases with 20% of the complexity).

Current workarounds known and used in multiple places are:

  1. Implementing a custom (additional) Web UI which implements the required forms outside/on top of Airflow. This UI accepts user input and triggers Airflow in the back-end via the REST API. This is flexible but duplicates the effort for operation, deployment, and release, and access control, logging etc. need to be implemented redundantly.

  2. Implementing a custom Airflow Plugin which hosts additional launch/trigger UIs inside Airflow. We are using this, but it is somewhat redundant to the other trigger options and is only 50% user friendly: the standard "Run" buttons point to the JSON trigger dialog and not to the plugin, which is misleading from a user perspective (it cannot be integrated well).
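
For the first workaround, the back-end call of such an external UI might be prepared as sketched below. The sketch assumes the Airflow 2 stable REST API endpoint POST /api/v1/dags/{dag_id}/dagRuns with the parameters in the conf field of the JSON body; the base URL is a placeholder, authentication is deliberately left out, and build_trigger_request is a hypothetical helper.

```python
import json
from typing import Any
from urllib.parse import quote
from urllib.request import Request


def build_trigger_request(base_url: str, dag_id: str, conf: dict[str, Any]) -> Request:
    """Prepare (but do not send) the HTTP request that triggers a DAG run."""
    url = f"{base_url.rstrip('/')}/api/v1/dags/{quote(dag_id, safe='')}/dagRuns"
    body = json.dumps({"conf": conf}).encode("utf-8")
    # Authentication headers depend on the deployment and are omitted here.
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


REQUEST = build_trigger_request(
    "http://localhost:8080/",  # placeholder Airflow webserver address
    "example_form_demo",
    {"param1": "user input", "parameter_2": "user selection", "flag3": True},
)
```

Sending the request, for example with urllib.request.urlopen(REQUEST), is exactly the part such an external UI has to secure, log, and maintain on its own.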

Open Questions needed to be discussed and decided before moving ahead?

  • How does this extension relate to FAB, and does it run into a "potential dead end" if AIP-38 Modern Web Application is implemented in parallel? From the editor's perspective the proposed structure is not bound to FAB and the param structure can also be transferred to a new modern UI.
    • And as a direct follow-up: Will AIP-38 need to be completed BEFORE making this? Will this be made in parallel and then need to be migrated over? Can the described concept be included in AIP-38 from the beginning? (At first, w/o customizations, it would already be 90% there)
  • Shall we make a PoC implementation in form of a PR for the discussion?
  • Might be outdated with Draft v2? → Deployment: "security is super-important" - we need to know what will be the deployment model of those "customisations" - "who will be able to add them, how, what are the risks involved, whether there will be any sandbox the custom code will be executed in? Will it be server-side as well as client-only, or maybe it should only execute in the browser's JavaScript? I think there are a lot of "infrastructural" questions to answer"
    Note/Partial Answer: Currently I would assume "standard" forms come with DAG development and follow the same deployment/security considerations, as there is already the potential to run/inject arbitrary Python code; the same problems apply. "Custom" UI components should follow the same approach as current Airflow Plugins. I'd propose to use FAB endpoints in a plugin, so this has the same "problems" and challenges that exist today. I do not want to add further attack points with this AIP. But this might be related to question #1, if React will/shall be used.
  • General question: "how UI customization should look like in a modern web app environment?" (Note: "modern" meaning "React")
  • "ease of use for the end user is also a critical point (it should remain quite easy to extend the UI)" - How will this be achieved with this implementation?

What defines this AIP as "done"?

  • Airflow delivers the option "out of the box" to define at least one form (per DAG) with simple input fields to trigger a DAG via UI and passing standard scalar parameters, e.g. string values
  • Such forms can also be extended via UI plugins so that custom functionality can be directly linked into the standard UI "trigger DAG" buttons and allow replacing the standard dialogs w/o parameters or JSON dict entry

9 Comments

  1. This sounds like a useful addition to Airflow.

    However my first ask would be to see if we can do this "declaratively" rather than with classes.

    For example, in 2.2 we added typing/schemas to the params object of a DAG https://github.com/apache/airflow/pull/17100

    Those parameters are type checked with JSON Schema which has most/all of the elements you list already. For example:

    etc. I haven't checked the rest, but I think most or all of them would be expressible as JSON schema.

    By having it a) use the existing mechanism, and b) not using custom classes; it means that we already have a serialization plan to make it available to the webserver without executing DAG code.

  2. All for what Ash Berlin-Taylor wrote. We were thinking about that, and we could also use the same concept to convert our Provider's Connection customisations to follow the same route.

  3. This is definitely something a lot of users would like. Would love for you to re-use Params object that uses json-schema wherever possible please

  4. +1 for Ash Berlin-Taylor proposal. It doesn't feel like a good thing to me to include anything that has "UI" in its name at the level of DAG abstraction. On the other hand - using a schema looks like a natural boundary that can be consumed by CLI, Airflow UI and whatever else the users might want to use to control their Airflow deployments.

    Other than that - great idea!

  5. I can imagine some obvious extensions like:

    1. automatically populating the input form with parameters from a selected older DAG run (makes sense e.g. when typically 18 out of 20 parameters are the same every time for some DAG),
    2. displaying parameters of previous executions in a widget generated by proposed UI extension.

    It raises a question, of course, about schema versioning. Shall the schema (or at least some id of the schema) be attached to every DAG run? That would make it easier to figure out if values from an old DAG run fit into current version of the form/widget (that might change when a DAG is changed).

  6. Hi Ash Berlin-Taylor, Jarek Potiuk, Kaxil Naik, Igor Kholopov, Jacek Iżykowski, thanks for the feedback I received so far! I can understand and support the proposal to leverage and re-use the generic Params structure which was defined for JSON schema validation - this already defines 80% of the base needed for a form. I revised the DRAFT to a "v2" in this AIP.

    I tried a bit of coding already and I assume from the updated draft that an implementation is pretty straightforward. Would it help if I propose a PoC to get a feel for how it looks?

    (Sorry, I am not experienced with the discussion process in the Airflow dev team, if I would provide the contribution and a review is done, would this be accepted? Which things do I need to consider?)

  7. Yeah - creating a POC PR (even draft) and dropping information in the thread in the devlist when it's there to get some comments is the best way forward


  8. Thanks again for the feedback, please see https://github.com/apache/airflow/pull/27063 as a POC / Sneak Preview PR. Only 680 LOC for a cool feature added incl. examples.

  9. After positive feedback on the PR, the code is now functionally complete and the pipeline is green. Looking forward to reviews and merge: https://github.com/apache/airflow/pull/27063