I like the idea, and I can see it helping in some use cases. My first concern is that, if I understand correctly, it would allow user code to run in the scheduler. That would have big implications for security and for how our security model works: for instance, AIP-44 assumes the scheduler is a trusted component with direct access to the DB.
If I remember correctly, this is a route that we specifically tried to stay away from.

On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG <xd.d...@apple.com.invalid> wrote:

> Hi folks,
>
> I’m writing to share my thought regarding the possibility of supporting
> “custom TI dependencies”.
>
> Currently we maintain the dependency check rules under
> “airflow.ti_deps.deps”. They cover the dependency checks like if there are
> available pool slot/if the concurrency allows/TI trigger rules/if the state
> is valid, etc., and play essential role in the scheduling process.
>
> One idea was brought up in our team's internal discussion: why shouldn’t
> we support custom TI dependencies?
>
> In details: just like the cluster policies
> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook), if
> we support users add their own dependency checks as custom classes (and
> also put under airflow_local_settings.py), it will allow users to have much
> higher flexibility in the TI scheduling. These custom TI deps should be
> added as additions to the existing default deps (not replacing or removing
> any of them).
>
> For example: similar to check for pool availability/concurrency, the job
> may need to check for user’s infra-specific conditions, like if a GPU is
> available right now (instead of competing with other jobs randomly), or if
> an external system API is ready to be called (otherwise wait a bit). And a
> lot more other possibilities.
>
> Why cluster policies won’t help here? task_instance_mutation_hook is
> executed in a “worker”, not in the DAG file processor, just before the TI
> is executed. What we are trying to gain some control here, though, is in
> the scheduling process (based on custom rules, to decide if the TI state
> should be updated so it can be scheduled for execution).
>
> I would love to know how community finds this idea, before we start to
> implement anything. Any question/suggestion would be greatly appreciated.
> Many thanks!
>
> XD
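
To make the discussion concrete, here is a minimal sketch of the proposal as I read it: custom dependency checks are appended to the default ones, and a TI can only be scheduled when every check passes. This is plain Python and deliberately does not use the real classes under airflow.ti_deps.deps; every name in it (DepStatus, BaseDep, PoolSlotsAvailableDep, GpuAvailableDep, failed_deps, and the dict standing in for a TI) is hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class DepStatus:
    """Result of one dependency check (hypothetical, not an Airflow class)."""
    dep_name: str
    passed: bool
    reason: str


class BaseDep:
    """Hypothetical base class for a dependency check."""
    name = "base"

    def check(self, ti: dict) -> DepStatus:
        raise NotImplementedError


class PoolSlotsAvailableDep(BaseDep):
    """Stand-in for one of the built-in default deps."""
    name = "pool_slots_available"

    def check(self, ti: dict) -> DepStatus:
        free = ti.get("free_pool_slots", 0)
        return DepStatus(self.name, free > 0, f"{free} free pool slot(s)")


class GpuAvailableDep(BaseDep):
    """Stand-in for a user-supplied dep, e.g. from airflow_local_settings.py.

    The probe is arbitrary user code. In the proposal it would be called
    from the scheduling loop, which is where the security concern arises.
    """
    name = "gpu_available"

    def __init__(self, gpu_probe: Callable[[], int]) -> None:
        self._gpu_probe = gpu_probe

    def check(self, ti: dict) -> DepStatus:
        free = self._gpu_probe()
        return DepStatus(self.name, free > 0, f"{free} free GPU(s)")


def failed_deps(
    ti: dict,
    default_deps: Sequence[BaseDep],
    custom_deps: Sequence[BaseDep],
) -> List[DepStatus]:
    """Run default deps plus custom deps and return the ones that failed.

    Custom deps are only added on top of the defaults; they never
    replace or remove a default dep.
    """
    failures = []
    for dep in [*default_deps, *custom_deps]:
        status = dep.check(ti)
        if not status.passed:
            failures.append(status)
    return failures


if __name__ == "__main__":
    ti = {"task_id": "train_model", "free_pool_slots": 2}
    defaults = [PoolSlotsAvailableDep()]
    custom = [GpuAvailableDep(gpu_probe=lambda: 0)]  # pretend no GPU is free
    for status in failed_deps(ti, defaults, custom):
        print(f"{ti['task_id']} blocked by {status.dep_name}: {status.reason}")
```

In this sketch the TI stays unscheduled whenever failed_deps returns a non-empty list. Note that gpu_probe runs wherever failed_deps is called, so if that is the scheduler, user code ends up executing inside the trusted component.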