This is a working document to clarify proposal option number 2 originally presented in AIP-83 amendment to support classic Airflow authoring style.
We’re on a tight timeline to make a decision and finish implementation for 3.0. PLEASE LEAVE YOUR OPINIONS AS SOON AS POSSIBLE. We should reach a decision on everything before 19th January.
Proposal: restore uniqueness but make logical_date nullable and / or optional
Uncontroversial elements
- restore the uniqueness constraint on logical date
- make logical date nullable
- we will not immediately add timetables that don't use logical date
Questions of controversy
Question 1. For manual runs, should logical date be null by default or only optionally?
1.a. user manually triggers from UI
1.b. triggering run via REST API
1.c. TriggerDagRunOperator (and maybe other Python interface?)
1.d. airflow dags trigger
Question 2. For asset-triggered runs, should logical date be populated?
And what about dags that are asset-or-schedule?
Question 3. How to sort in the UI dag runs with no logical date?
Logical date is the default sort field currently. Start date would be a natural candidate when no logical date. But it is not populated initially.
One proposal is to add a run_after field. This concept already exists in timetables. But it is not persisted to the dag run record.
Alternatively we could do coalesce(logical_date, start_date, updated_at)
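As a rough illustration of the coalesce alternative, here is a minimal Python sketch of the equivalent sort key. The RunRow class and its field values are hypothetical stand-ins, not the real DagRun model, and the real implementation would presumably do this in SQL rather than in Python.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RunRow:
    """Hypothetical stand-in for a dag run record (not the real DagRun model)."""

    run_id: str
    updated_at: datetime
    logical_date: Optional[datetime] = None
    start_date: Optional[datetime] = None


def sort_key(run: RunRow) -> datetime:
    # Mirrors coalesce(logical_date, start_date, updated_at):
    # return the first value that is not None.
    for value in (run.logical_date, run.start_date, run.updated_at):
        if value is not None:
            return value
    raise ValueError("updated_at is assumed to always be set")


def utc(day: int) -> datetime:
    return datetime(2025, 1, day, tzinfo=timezone.utc)


runs = [
    RunRow("queued_manual", updated_at=utc(3)),  # not started, no logical date
    RunRow("scheduled", updated_at=utc(5), logical_date=utc(1), start_date=utc(2)),
    RunRow("started_manual", updated_at=utc(4), start_date=utc(2)),
]
ordered = [run.run_id for run in sorted(runs, key=sort_key)]
print(ordered)
```

Note that a run's position can change once start_date is populated, which is one reason a dedicated, persisted field may be easier to reason about.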
Question 4: Should data_interval_start and data_interval_end be populated?
The question must be answered for both manual runs and asset-triggered.
Question 5: For runs without logical date (or data intervals) what should happen when a user has a template that references them?
Question 6: Implications for run_id
There is still a unique constraint on run_id. Currently it is generated deterministically from logical date plus run type. What should we do for runs with no logical date? How can we support the use case where many runs are triggered in such close succession that perhaps even utcnow() would collide?
Question 7: How do we deal with APIs that select a logical date range?
For example, xcom_pull has include_prior_dates. Taskinstance.clear allows clearing all tis in a date range. Those use logical dates. How do runs with null logical date fit into these?
Resolution of questions
Here will be where I work out the consensus view based on feedback. This will be amended as we go.
Question 1a: manual runs in UI
- In legacy UI, behavior will be unchanged
- In new UI, when triggering a run, the logical date will default to None, but you can select a date with the date picker
Question 1b: triggering through REST API
- legacy connexion API will be removed and will not be changed as part of this
- in new API, when triggering a run, logical date will be a required field, and the user may choose NULL
Question 1c: TriggerDagRunOperator
- TriggerDagRunOperator will have a default of None for logical date, but the user may supply one
Question 1d: CLI dag run trigger
- CLI dag run trigger will be default none, but you can supply a logical date if you like.
Question 2: Asset triggered dag runs
- Asset-triggered dag runs will not have a logical date or a data interval
- Dags which are a schedule + asset triggered will have the same behavior as just asset-triggered (i.e. no logical date or data interval)
- For asset-triggered dags, using logical date in a template will result in key error
Question 3: How to sort runs in UI
- we add a new date field, run_after, which means the earliest time after which this dag run may be scheduled
- Airflow will sort by run_after by default
Question 4a: data intervals for asset triggered
- Asset-triggered dags will have neither logical date nor data_interval_start / data_interval_end
Question 4b: data intervals for manual runs
- For manual runs, if the dag is schedule-driven, then there will be a data interval if logical date is not null
- for manual runs, if the dag is asset-driven, there will be no data interval
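The rules for Questions 4a and 4b can be sketched as a small predicate. This is only an illustration: the function name, the run type strings, and the dag_is_schedule_driven flag are assumptions made for the example and do not correspond to existing Airflow code. Scheduled runs are not covered by the resolution above, so the sketch deliberately rejects them.

```python
from datetime import datetime, timezone
from typing import Optional


def has_data_interval(
    run_type: str,
    dag_is_schedule_driven: bool,
    logical_date: Optional[datetime],
) -> bool:
    """Hypothetical predicate encoding the resolutions for Questions 4a and 4b."""
    if run_type == "asset_triggered":
        # 4a: asset-triggered runs have neither logical date nor data interval.
        return False
    if run_type == "manual":
        # 4b: only manual runs of schedule-driven dags that were given a
        # logical date get a data interval.
        return dag_is_schedule_driven and logical_date is not None
    raise ValueError(f"run type not covered by this sketch: {run_type!r}")


some_date = datetime(2025, 1, 13, tzinfo=timezone.utc)
print(has_data_interval("manual", True, some_date))   # schedule-driven, dated
print(has_data_interval("manual", True, None))        # schedule-driven, undated
print(has_data_interval("manual", False, some_date))  # asset-driven dag
print(has_data_interval("asset_triggered", True, None))
```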
Question 5: for runs with no logical date, what happens if accessed from template / context
- key error
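To make the key error behavior concrete, here is a minimal sketch of a context that simply omits the key when there is no logical date. The build_context helper and the run IDs are hypothetical; the real template context is built elsewhere and contains many more entries.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_context(run_id: str, logical_date: Optional[datetime]) -> dict[str, Any]:
    """Hypothetical helper: build a (tiny) template context for one run."""
    context: dict[str, Any] = {"run_id": run_id}
    if logical_date is not None:
        # Only runs that actually have a logical date expose the key.
        context["logical_date"] = logical_date
    return context


scheduled = build_context(
    "scheduled__example", datetime(2025, 1, 13, tzinfo=timezone.utc)
)
asset_triggered = build_context("asset_triggered__example", None)

print(scheduled["logical_date"].isoformat())
try:
    asset_triggered["logical_date"]
except KeyError as err:
    # Accessing the missing key fails loudly instead of returning None.
    print(f"KeyError: {err}")
```

Omitting the key, rather than storing None, means a template that references the logical date fails at render time instead of silently rendering the string "None".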
Question 6 run_id logic for null logical date:
- when null logical date, use run_after + random string
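A minimal sketch of what "run_after + random string" could look like, including the retry on collision that was suggested in the comments. Everything here is an assumption for illustration: the function names, the four-character lowercase alphanumeric suffix, the attempt limit, and the in-memory set standing in for the unique constraint on run_id.

```python
import secrets
import string
from datetime import datetime, timezone

# Assumed suffix alphabet; the encoding was still under discussion.
ALPHABET = string.ascii_lowercase + string.digits


def make_run_id(run_type: str, run_after: datetime, suffix_len: int = 4) -> str:
    """Hypothetical run_id format: <run type>__<run_after>__<random suffix>."""
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_len))
    return f"{run_type}__{run_after.isoformat()}__{suffix}"


def make_unique_run_id(
    run_type: str,
    run_after: datetime,
    existing: set[str],
    max_attempts: int = 5,
) -> str:
    # `existing` stands in for the database's unique constraint on run_id.
    for _ in range(max_attempts):
        candidate = make_run_id(run_type, run_after)
        if candidate not in existing:
            existing.add(candidate)
            return candidate
    raise RuntimeError("could not generate a unique run_id")


taken: set[str] = set()
run_after = datetime(2025, 1, 13, tzinfo=timezone.utc)
# Several runs triggered with the same run_after still get distinct IDs.
ids = [make_unique_run_id("manual", run_after, taken) for _ in range(3)]
print(ids)
```

In a real database the check-then-insert above would be replaced by attempting the insert and regenerating the suffix on a uniqueness violation, since two processes could otherwise pick the same candidate at the same time.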
Question 7 How do we deal with APIs that select a logical date range?
- For xcom pull, setting `include_prior_dates` to True will have no effect. The reason is that "prior" refers to a prior logical date, and we do not propose to change this.
- For clear, same thing – adding those filters will have no effect, for the same reason.
- When you use a range, runs with null logical date will never be included in the result.
- In other cases, when there are existing filters that apply to logical date, we don't plan to change them.
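The range behavior follows from ordinary SQL semantics: a comparison against NULL is never true, so rows with a null logical date drop out of any range filter. The sketch below shows this with an in-memory SQLite table; the table layout is a simplified assumption and not Airflow's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical table: the real dag_run table has many more columns.
conn.execute("CREATE TABLE dag_run (run_id TEXT PRIMARY KEY, logical_date TEXT)")
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?)",
    [
        ("scheduled__2025-01-12", "2025-01-12T00:00:00"),
        ("scheduled__2025-01-13", "2025-01-13T00:00:00"),
        ("manual__no_logical_date", None),  # stored as NULL
    ],
)

# A logical date range filter: NULL >= x and NULL <= y are never true,
# so the undated run is not returned.
rows = conn.execute(
    "SELECT run_id FROM dag_run "
    "WHERE logical_date >= ? AND logical_date <= ? "
    "ORDER BY run_id",
    ("2025-01-01T00:00:00", "2025-01-31T00:00:00"),
).fetchall()
in_range = [row[0] for row in rows]
print(in_range)
```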
34 Comments
Tzu-ping Chung
For Question 1, I prefer changing the default to None. This is simple enough for the web form and REST API since we are creating a new version entirely anyway.
utcnow()
.now()
if it’s not too difficult.
TriggerDagRunOperator currently accepts an optional logical_date value, and if it’s set to None we default to utcnow(). Fortunately we can change it since it’s now in the standard provider; we only need to bump the major version to signify a breaking change. I’d say it should use None by default, and we should add a separate marker to use the current time (something like {{ utcnow() }} for example).
The CLI is probably the most problematic. Currently airflow dags trigger defaults to utcnow() unless you optionally provide an explicit --exec-date. We can probably still change the behaviour since the CLI is probably not used that often? It will need to be advertised much more prominently though.
Daniel Standish
No objection from me. Jarek Potiuk Jens Scheffler ?
Jens Scheffler
+1 from my side.
The only note I have is that we could make logical date actually optional and, if not provided, default to `NULL`/`None`. But no strong opinion.
Pierre Jeambrun
Makes sense.
Acknowledged for the required FastAPI change.
Jarek Potiuk
No objection from my side.
Vikram Koka
I disagree with making the default None if this includes the API.
This is a backwards-incompatible change that does not seem to be for a compelling reason.
Tzu-ping Chung
By the API, do you mean the REST API or the Python API (basically TriggerDagRunOperator)? The new FastAPI implementation of the REST API is hosted separately from the Connexion one, and both will be available in 3.0. Since the FastAPI one has not been released in a stable version yet, changing the default in it should not be considered backward incompatible. We can choose to have the two API implementations behave differently; the Connexion API can stay unchanged for compatibility, while the FastAPI one defaults logical_date to None.
Jarek Potiuk
We are going to remove the Connexion API altogether - so leaving old implementation as back-compatibility is not an option. But I do agree, we should be able to change the default here - with Airflow 3 default being NONE and ability to bring in the "logical_date" trigger if needed. This will be Airflow 3 API and we already agreed and implemented some backwards-incompatibility, that one does not seem like a difficult one to adapt to.
Tzu-ping Chung
For Question 2, my preference is to make asset-triggered runs all have logical date always set to None.
Daniel Standish
no objections
Daniel Standish
what would you propose the behavior be when user has an asset-triggered dag, thus no logical date, but they reference the logical date in a task template or grab it from the task context?
and what about asset-schedule hybrid dags. when a run is created from the schedule, then we'll have a logical date?
Tzu-ping Chung
Not sure TBH. Maybe set it to a datetime if it’s schedule-triggered, but null if asset-triggered. I think it’s possible to tell which is which in the scheduler.
Or just set it to null for all runs regardless. I imagine the logical date would be pretty much useless if it’s sometimes available and you’re better off relying on something else…? I’m really not sure.
Daniel Standish
what about if user references the logical date.
one option is to have logical date (in template context) be coalesce(logical_date, run_after)
Tzu-ping Chung
What is the log date?
Daniel Standish
sorry i always want to abbreviate logical date to "log date" but in writing it does not work very well
Tzu-ping Chung
Setting logical date in the context to coalesce means the value would be different in the database and the execution context, which I don’t particularly like. (Conceptually confusing.) There’s also context["dagrun"].logical_date, which currently points to the actual logical_date column in the database. We could technically also do a coalesce there but I like that even less. Too much trouble. Just having None or even missing altogether (raising KeyError on access) is probably more realistic.
Daniel Standish
Yeah i am just thinking about the UX when user clicks button and it automatically creates the run with no logical date.
But, in new UI we could certainly change that behavior. When clicking the "play" button, I am talking about.
It could pop up a modal and user could confirm. Then I think it would probably be fine.
Tzu-ping Chung
The new UI already does this, we just need to add the datetime picker.
Daniel Standish
ok, nice
Jarek Potiuk
I think we have no choice but to introduce some friction for the users. I think if we make it clear that logical date remains as is when DAGs are "time triggered" and we tell the users it might or might not be None if it's triggered differently, this is the **right** amount of friction. The users should deal with the ambiguity if they want, or stop using logical date with asset/API triggered runs - the latter being the desired outcome. I think masking it with some coalescing is only going to muddy the water and will "hide" the extent of the change we propose.
IMHO, keeping the "old behaviour" in regular, data-interval bound (via schedule) runs is enough backwards compatibility. And I like what TP proposes there: generally let the "frictionless" behaviour equal "don't use the logical date", while if you insist on using logical date for anything other than scheduled runs, it should be your burden and complexity to carry, not ours.
Jarek Potiuk
No objections
Tzu-ping Chung
Question 3: we currently sort by multiple fields already; it’s defined on each timetable. Currently most timetables sort by ("data_interval_end", "logical_date") and we can probably just add run_after and use COALESCE when needed.
Or maybe we can actually just change the ordering to be based purely on run_after? We actually sneakily changed the ordering when we implemented timetables and data intervals (the ordering was purely based on execution_date prior to that). No-one seems to have noticed the change.
Daniel Standish
Sounds like your vote is at least to add field run_after and add it to the sort list. Possibly make it the default sort. It's fine with me.
Jens Scheffler
+1 for this case to introduce run_date, because the alternative might be a performance killer on the DB.
Jarek Potiuk
+1 on run_after as the sorting key.
Tzu-ping Chung
Assuming we have a None logical_date value (not just in db but in the runtime context and generally exposed to users in interfaces), we need to rethink how run_id is generated. We have basically two choices, plus a wild card:
1. logical_date if not None, run_after otherwise.
2. run_after
There are some potential variants for option 1 too, like if you flip a flag on the DAG (or timetable) we use run_after even if logical_date is not None.
Personally though, I feel we should just go with option 2. This may have compatibility implications (the date in the run ID changes), but that’s only for new runs (existing runs will keep their existing run_id value), and it’s arguably quite an anti-pattern if you rely on the actual string value programmatically anyway (such as trying to reverse-engineer a logical date out of the string). Using different dates conditionally may be confusing because you need to know the context behind to know which date it actually uses: does scheduled__2025-01-13T00:00:00 refer to a run that was executed on the 13th or 14th (assuming a daily DAG)? Always using the same date field gives a definite answer.
We also want to add a hash suffix of some sort in the run ID since the date won’t be guaranteed to be unique (regardless of whether we go with option 1 or 2). I’m thinking something like this should be enough.
scheduled__2025-01-13T00:00:00__a5ay
The suffix is basically
Happy to discuss alternatives too. We might not even need 4 characters. Variants are also possible if you don’t like + or / in Base64, or alternative encodings if you don’t like Base64 itself.
Jarek Potiuk
I like both option 2 and the suffix.
> scheduled__2025-01-13T00:00:00__a5ay
Out of a bit of caution - likely we should also implement conflict handling - if inserting will cause a conflict on that field we should regenerate the random suffix and try again (few times?) - no matter what probability the conflict chance is, the rule of big numbers says that it will - eventually - happen if Airflow keeps on getting more and more popular and run on more and more scale.
Plus the random number generation in a number of cases is less-random than what we believe. But that's just with my "that should never happen but did" hat on. So maybe we can ignore it.
Daniel Standish
I suppose it does not really matter too much, but i think that when there's a logical date, then it would make sense to continue to use it rather than run after which isn't as meaningful. For null logical date, yes, let's use run_after, although, I would be perfectly happy to just use a uuid. Indeed, are we going to change DagRun.id to be uuid anyway? if that's the case there isn't really a need for run_id anymore anyway.
Tzu-ping Chung
I just added a new Question 7 that I noticed when going through the code base.
Jarek Potiuk
> For example, xcom_pull has include_prior_dates. Taskinstance.clear allows clearing all tis in a date range. Those use logical dates. How do runs with null logical date fit into these?
I think they should be skipped (and any case where you involve filtering or selecting the logical date should follow). For me, simply (regardless of the reason why), we should treat the "dated" and "un-dated" (for lack of a better word) dag runs as separate "namespaces".
Daniel Standish
Jarek Potiuk Jens Scheffler Tzu-ping Chung Vikram Koka Shubham Mehta i will plan to call for vote this evening. please review the resolution section and let me know if you think any of the elements needs to be modified.
Jarek Potiuk
Looks good.
Jarek Potiuk
Thanks Daniel Standish for being so diligent and persistent. It's been a great discussion and outcome.
Tzu-ping Chung