I propose enforcing the use of a linter and code formatter for Airflow, so that all code follows the same conventions and becomes more consistent and readable.
Currently the Airflow codebase is written by many different people (which is great!), but also in many different programming styles. There are many inconsistencies in code style which evaluate correctly but make the Airflow code harder to read. I think enforcing a code format would benefit Airflow's readability, because all code would be structured in the same style and contributors would stick to the same conventions. This would also prevent occasional PRs such as https://github.com/apache/incubator-airflow/pull/3714.
I created this AIP because introducing a code formatter will most likely change a large number of lines and result in a few merge conflicts.
Some examples where I think a code formatter would help:
Example #1: imports in /airflow/bin/cli.py
PEP8 convention states to group imports in 3 groups in the following order, separated by a blank line:
- Standard library imports.
- Related third party imports.
- Local application/library specific imports.
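As a minimal sketch of the three-group rule, the snippet below sorts module names into the PEP8 groups and orders them alphabetically within each group. The `STDLIB_ROOTS` and `LOCAL_ROOTS` sets and all helper names are hypothetical and abbreviated for illustration; a real import sorter would detect the standard library itself.

```python
# Hypothetical, abbreviated lookup sets used only for this illustration.
STDLIB_ROOTS = frozenset({"argparse", "json", "logging", "os", "subprocess", "sys"})
LOCAL_ROOTS = frozenset({"airflow"})


def import_group(module):
    """Return 0 for standard library, 1 for third party, 2 for local modules."""
    root = module.split(".")[0]
    if root in STDLIB_ROOTS:
        return 0
    if root in LOCAL_ROOTS:
        return 2
    # Anything not recognised is assumed to be a third-party package.
    return 1


def sort_imports(modules):
    """Order modules by PEP8 group, then alphabetically within each group."""
    return sorted(modules, key=lambda m: (import_group(m), m))


def format_import_block(modules):
    """Render plain `import x` lines with one blank line between groups."""
    lines = []
    previous_group = None
    for module in sort_imports(modules):
        group = import_group(module)
        if previous_group is not None and group != previous_group:
            lines.append("")
        lines.append("import " + module)
        previous_group = group
    return "\n".join(lines)
```

Calling `format_import_block(["airflow.models", "os", "sqlalchemy", "json"])` yields the `json` and `os` imports first, then `sqlalchemy`, then `airflow.models`, with a blank line between the groups.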
The example above, formatted with PyCharm's "Optimize Imports":
Example #1 contained 6 groups of imports, while the formatted result contains 4. All imports are also ordered alphabetically, which IMO makes them much more readable.
Example #2: example_trigger_controller_dag.py
Formatted with Black:
All quotes (single and double) are normalized to a single type (double in this case) for consistency. Black is configured with a maximum line length, and if a collection fits on a single line, it is auto-formatted onto a single line. The .flake8 file sets a max-line-length of 110, but most of the code appears to be formatted with a line length of roughly 90 characters.
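To illustrate the two behaviours described above, here is a hand-written before/after pair. The names `before_args` and `after_args` and the dictionary contents are invented for this sketch, and the "after" form is the expected result based on Black's documented behaviour, not output copied from an actual run.

```python
# Before: mixed quote styles, and a short collection spread over several lines.
before_args = {
    'owner': "airflow",
    'retries': 1
}

# After (expected Black-style result, an assumption): double quotes everywhere,
# and the collection is collapsed because it fits within the line length.
after_args = {"owner": "airflow", "retries": 1}

# Formatting only changes the layout; both literals evaluate to the same value.
assert before_args == after_args
```

The final assertion is the point of the exercise: a formatter changes how the code looks, not what it evaluates to.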
Example #3: airflow/contrib/kubernetes/worker_configuration.py
Formatted with Black:
All bracketed items are indented consistently, which (IMO) makes the code more readable.
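A small sketch of what consistent bracket indentation looks like in practice, assuming Black's default line length of 88. The `make_volume_mount` helper and its arguments are hypothetical and exist only so the example runs; the second call shows the layout Black is expected to produce when a call does not fit on one line.

```python
def make_volume_mount(name, mount_path, sub_path=None, read_only=False):
    """Hypothetical helper, defined here only so the example is runnable."""
    return {
        "name": name,
        "mountPath": mount_path,
        "subPath": sub_path,
        "readOnly": read_only,
    }


# Before: arguments aligned with the opening bracket, so the indentation
# depends on the length of the function and variable names.
aligned = make_volume_mount(name="airflow-dags",
                            mount_path="/usr/local/airflow/dags",
                            sub_path="dags/production",
                            read_only=True)

# After (expected Black-style layout, an assumption): one argument per line,
# a fixed four-space indent, and a trailing comma.
uniform = make_volume_mount(
    name="airflow-dags",
    mount_path="/usr/local/airflow/dags",
    sub_path="dags/production",
    read_only=True,
)

# Both calls are equivalent; only the layout differs.
assert aligned == uniform
```

With the second layout, renaming the function or the variable no longer forces every argument line to be re-indented, which also keeps diffs smaller.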
I know of three code formatters: Black, YAPF, and autopep8.
This blog post compares all 3.
Flake8 serves a similar purpose but only checks for PEP8 compliance, which helps but is nowhere near the "strictness" of the tools above.
- Black is a tool by a Python core developer and is deliberately unconfigurable, offering very few options. An important note is that running Black requires Python 3.6+, although it can format code written for older Python versions too.
- YAPF is a tool by Google and very configurable, in contrast to Black.
- I don't consider autopep8 for this as it doesn't enforce a similar level of "strictness" as Black and YAPF do.
My suggestion is to go with YAPF, since it is configurable and we can define a style that follows most of the current Airflow style. IMO it's important to enforce a style for consistency, but I don't care too much about the actual style itself.