Currently, connections can be stored in the metastore or in environment variables.
As an Airflow cluster maintainer, I want to be able to store connections in other sources, e.g. AWS SSM Parameter Store. This would give me more flexibility in how I manage creds.
Is there anything special to consider about this AIP? Downsides? Difficulty in implementation or rollout, etc.?
Originally there was some uncertainty about whether to call it secrets or creds or something else. In discussions on the dev list, Slack, and the PR, we have coalesced around "secrets".
What change do you propose to make?
Change the `get_connection` functionality in BaseHook to be an implementation of this base class.
Search path: alternative > metastore > env var if an alternative secrets backend is enabled; env var > metastore otherwise. The search path cannot be otherwise configured.
For example, this would make it possible for a dev team to share a single source for creds instead of having to distribute creds to all developers in a text file. This would ensure that all devs are always in sync, using the same creds definitions.
Another use case: on a platform like Astronomer Cloud, a user might not have access to the Airflow CLI. In this case, there's no convenient way to get connections loaded into the metastore, so it would be easier to store them elsewhere.
Another use case is spinning up dev or QA instances. Your dev or QA instances could source creds from a place like SSM Parameter Store, so you don't have to worry about loading them when you create an instance.
I wouldn't say this is necessary, but rather that it is helpful, for the reasons discussed above.
I don't think so.
This is a refactor that adds an abstraction that is used optionally. It does not change current behavior in any way without active user action.
They are not.
There are a few implementation details that I had some uncertainty about. For one, I was uncertain about the name, but the community coalesced around "Secrets Backend".
Additionally, I was initially unsure whether the classes should use only class methods or whether instance methods make sense. Ultimately, I decided instance methods made more sense: config params are passed to `__init__`, and the backend is instantiated on initial import.
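To make the instance-method design concrete, here is a minimal sketch, not the code from the PR. The names `BaseSecretsBackend`, `get_conn_uri`, `InMemoryBackend`, and `resolve_conn_uri` are assumptions chosen for illustration; only the `AIRFLOW_CONN_` environment variable prefix is existing Airflow convention. Each backend receives its config through `__init__`, and a lookup walks the backends in order and returns the first match.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict, List, Optional


class BaseSecretsBackend(ABC):
    """Hypothetical base class for connection sources (name is an assumption)."""

    @abstractmethod
    def get_conn_uri(self, conn_id: str) -> Optional[str]:
        """Return the URI stored for conn_id, or None if this source has none."""


class EnvVarBackend(BaseSecretsBackend):
    """Reads connections from environment variables."""

    def __init__(self, prefix: str = "AIRFLOW_CONN_") -> None:
        # Config params are bound once, when the backend is instantiated.
        self.prefix = prefix

    def get_conn_uri(self, conn_id: str) -> Optional[str]:
        return os.environ.get(self.prefix + conn_id.upper())


class InMemoryBackend(BaseSecretsBackend):
    """Stand-in for an external store such as SSM Parameter Store."""

    def __init__(self, uris: Dict[str, str]) -> None:
        self.uris = uris

    def get_conn_uri(self, conn_id: str) -> Optional[str]:
        return self.uris.get(conn_id)


def resolve_conn_uri(conn_id: str, search_path: List[BaseSecretsBackend]) -> Optional[str]:
    """Try each backend in search-path order and return the first match."""
    for backend in search_path:
        uri = backend.get_conn_uri(conn_id)
        if uri is not None:
            return uri
    return None


# Illustrative data: the same conn_id is defined in both sources.
os.environ["AIRFLOW_CONN_MY_DB"] = "postgres://env-host/db"
alternative = InMemoryBackend({"my_db": "postgres://alt-host/db"})
search_path = [alternative, EnvVarBackend()]

print(resolve_conn_uri("my_db", search_path))
```

Because the alternative backend comes first in the list, its definition of `my_db` is the one returned. With instance methods, two backends of the same class can coexist with different config, which class methods alone would not allow.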
Jarek has suggested considering extending this to support arbitrary secrets and not just connections. Ultimately we deferred consideration of this. I think we need to think about how we intend for that kind of method to be used. Currently the general convention is to retrieve creds through connection objects. And we have the Variable model for things that are not "secrets".
Kaxil suggested removing search path configurability, leaving only the fixed search order given in the proposal above.
This would simplify the config because you could just have two items: `backend` (the alternative backend class) and `backend_kwargs`, which would be any config info needed by the backend.
In the initial PR, since multiple backends could be used simultaneously, we needed to provide a way for them to be configured independently, which meant more detail had to go into the config.
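As a rough sketch of how the two-item config could be consumed, the snippet below loads a backend class from a dotted path and passes the kwargs to `__init__`. It makes several assumptions: `backend_kwargs` is encoded as a JSON object, and `ParameterStoreBackend`, its `prefix` option, the `demo_secrets` module, and `load_backend` are hypothetical names used only for this example.

```python
import importlib
import json
import sys
import types


class ParameterStoreBackend:
    """Hypothetical alternative backend; the `prefix` option is illustrative."""

    def __init__(self, prefix="/airflow/connections"):
        self.prefix = prefix

    def get_conn_uri(self, conn_id):
        # A real backend would query its external store here.
        return None


def load_backend(class_path, kwargs_json):
    """Instantiate the class named by `backend`, passing `backend_kwargs` to __init__."""
    module_name, class_name = class_path.rsplit(".", 1)
    backend_cls = getattr(importlib.import_module(module_name), class_name)
    # Assumption: backend_kwargs is a JSON object; empty means "use defaults".
    kwargs = json.loads(kwargs_json) if kwargs_json else {}
    return backend_cls(**kwargs)


# Stand-in for an installed package, so the dotted path below is importable.
demo_module = types.ModuleType("demo_secrets")
demo_module.ParameterStoreBackend = ParameterStoreBackend
sys.modules["demo_secrets"] = demo_module

# The two config items, as they might be read from the Airflow config.
config = {
    "backend": "demo_secrets.ParameterStoreBackend",
    "backend_kwargs": '{"prefix": "/dev/airflow/connections"}',
}

backend = load_backend(config["backend"], config["backend_kwargs"])
print(type(backend).__name__, backend.prefix)
```

With a single alternative backend there is nothing to configure per backend beyond its own kwargs, which is what keeps the config down to these two items.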
This was completed.
The PR is merged.