Thanks both for your inputs!
Hi Pierre, I think the key difference here is: by doing this, we are not allowing Airflow "users" to run their code in the scheduler. We are only allowing Airflow "Admins" to deploy a plugin that runs in the scheduler, just like dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook. So I do not think this violates our current preference in terms of security.

Hi Constance, I thought the triggerer is mainly for deferrable-operator cases? That is quite a different scenario from what I am trying to cover here, IMHO. Did I miss anything? Please let me know.

Thanks again! Looking forward to more questions/comments!

XD

> On Feb 2, 2024, at 13:29, Constance Martineau <consta...@astronomer.io.INVALID> wrote:
>
> Naive question: instead of running the code on the scheduler, could the
> condition check be delegated to the triggerer?
>
> On Fri, Feb 2, 2024 at 2:33 PM Pierre Jeambrun <pierrejb...@gmail.com> wrote:
>
>> But maybe it's time to reconsider that :), curious to see what others
>> think.
>>
>> On Fri 2 Feb 2024 at 20:30, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
>>
>>> I like the idea and I understand that it might help in some use cases.
>>>
>>> The first concern I have is that it would allow user code to run in
>>> the scheduler, if I understand correctly. This would have big implications
>>> for security and for how our security model works. (For instance, the
>>> scheduler is a trusted component with direct access to the DB; that is the
>>> AIP-44 assumption.)
>>>
>>> If I remember correctly, this is a route that we specifically tried to
>>> stay away from.
>>>
>>> On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG <xd.d...@apple.com.invalid> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I'm writing to share my thoughts on the possibility of supporting
>>>> "custom TI dependencies".
>>>>
>>>> Currently we maintain the dependency-check rules under
>>>> "airflow.ti_deps.deps".
>>>> They cover dependency checks such as whether a pool slot is
>>>> available, whether concurrency limits allow the run, TI trigger rules,
>>>> whether the state is valid, etc., and they play an essential role in the
>>>> scheduling process.
>>>>
>>>> One idea was brought up in our team's internal discussion: why shouldn't
>>>> we support custom TI dependencies?
>>>>
>>>> In detail: just like the cluster policies
>>>> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook), if
>>>> we let users add their own dependency checks as custom classes (also
>>>> placed under airflow_local_settings.py), users would gain much higher
>>>> flexibility in TI scheduling. These custom TI deps would be added in
>>>> addition to the existing default deps (not replacing or removing any of
>>>> them).
>>>>
>>>> For example: similar to the checks for pool availability/concurrency, a
>>>> job may need to check infra-specific conditions, such as whether a GPU is
>>>> available right now (instead of competing with other jobs randomly), or
>>>> whether an external system's API is ready to be called (otherwise, wait a
>>>> bit). And many other possibilities.
>>>>
>>>> Why won't cluster policies help here? task_instance_mutation_hook is
>>>> executed in a "worker", not in the DAG file processor, just before the TI
>>>> is executed. What we are trying to gain control over here, though, is the
>>>> scheduling process (deciding, based on custom rules, whether the TI state
>>>> should be updated so it can be scheduled for execution).
>>>>
>>>> I would love to know how the community finds this idea before we start to
>>>> implement anything. Any question/suggestion would be greatly appreciated.
>>>> Many thanks!
>>>>
>>>>
>>>> XD
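For concreteness, here is a rough sketch of what a custom TI dep following the pattern of the existing `airflow.ti_deps.deps` classes might look like. Everything below is illustrative: `TIDepStatus`, the `BaseTIDep` stand-in, `gpu_is_free`, and `GpuAvailabilityDep` are assumed names, not actual Airflow APIs (the real base class lives in `airflow.ti_deps.deps.base_ti_dep`), and the mock base class only mimics its shape so the example runs without Airflow installed.

```python
# Hedged sketch of the proposed "custom TI dep" idea. The base class below is
# a self-contained stand-in for airflow.ti_deps.deps.base_ti_dep.BaseTIDep,
# not the real thing; names and signatures are illustrative assumptions.
from typing import Iterator, NamedTuple, Optional


class TIDepStatus(NamedTuple):
    """Roughly mirrors the (dep name, passed, reason) triple Airflow deps report."""
    dep_name: str
    passed: bool
    reason: str


class BaseTIDep:
    """Stand-in for Airflow's BaseTIDep: subclasses yield pass/fail statuses."""

    NAME = "Base dep"

    def _get_dep_statuses(self, ti, session=None, dep_context=None) -> Iterator[TIDepStatus]:
        raise NotImplementedError

    def get_dep_statuses(self, ti, session=None, dep_context=None) -> Iterator[TIDepStatus]:
        yield from self._get_dep_statuses(ti, session, dep_context)

    def is_met(self, ti, session=None, dep_context=None) -> bool:
        # The scheduler would only consider the TI runnable if every status passes.
        return all(s.passed for s in self.get_dep_statuses(ti, session, dep_context))


def gpu_is_free() -> bool:
    """Hypothetical infra probe; a real dep would query the cluster/API here."""
    return True


class GpuAvailabilityDep(BaseTIDep):
    """Custom dep for the GPU example from the proposal: hold the TI until a GPU is free."""

    NAME = "GPU available"

    def _get_dep_statuses(self, ti, session=None, dep_context=None) -> Iterator[TIDepStatus]:
        if gpu_is_free():
            yield TIDepStatus(self.NAME, True, "A GPU slot is currently free.")
        else:
            yield TIDepStatus(self.NAME, False, "No GPU free; keep the TI queued.")


# Usage sketch: under the proposal, such classes would be registered via
# airflow_local_settings.py and evaluated alongside the default deps.
dep = GpuAvailabilityDep()
print(dep.is_met(ti=None))  # True with the stub probe above
```

The point of the sketch is only the shape: a custom dep is additive, reports a named pass/fail status with a human-readable reason, and the scheduler keeps the TI out of the runnable set until every registered dep is met.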