Thanks both for your inputs!

Hi Pierre,

I think the key difference here is: by doing this, we are not allowing Airflow 
“users” to run their code in the scheduler. We are only allowing Airflow “Admins” 
to deploy a plugin that runs in the scheduler, just the same as 
dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook.
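To make the analogy concrete, this is roughly how an admin already ships such 
code today, by defining a cluster policy in airflow_local_settings.py 
(illustrative sketch only; the queue name and condition are made up, not part 
of this proposal):

    # airflow_local_settings.py -- only an admin can place this file on
    # the deployment; Airflow picks the hook up automatically.
    def task_instance_mutation_hook(task_instance):
        # Illustrative: send retried task instances to a dedicated queue.
        if task_instance.try_number >= 1:
            task_instance.queue = "retry_queue"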

So I do not think this violates our current preference in terms of security.
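And for concreteness, here is a minimal sketch of what such a custom TI dep 
might look like. Nothing below exists yet: GpuAvailableDep, gpu_slot_free() and 
the CUSTOM_TI_DEPS registration are hypothetical names I’m using purely for 
illustration; only BaseTIDep and its _get_dep_statuses / _passing_status / 
_failing_status interface already exist under airflow.ti_deps.deps:

    # airflow_local_settings.py -- sketch only; how such a dep would be
    # registered and picked up by the scheduler (as an *addition* to the
    # default deps) is exactly what this proposal needs to define.
    from airflow.ti_deps.deps.base_ti_dep import BaseTIDep


    def gpu_slot_free() -> bool:
        # Placeholder: replace with a call to your own GPU scheduler /
        # resource manager API.
        return True


    class GpuAvailableDep(BaseTIDep):
        """Only let the TI be scheduled when a GPU slot is free."""

        NAME = "GPU Available"
        IGNORABLE = True

        def _get_dep_statuses(self, ti, session, dep_context):
            if gpu_slot_free():  # infra-specific check, admin-controlled
                yield self._passing_status(reason="A GPU slot is free.")
            else:
                yield self._failing_status(reason="No GPU slot available yet.")


    # Hypothetical registration, analogous to how cluster policies are
    # picked up from airflow_local_settings.py today.
    CUSTOM_TI_DEPS = [GpuAvailableDep()]

The scheduler would evaluate such a dep alongside the built-in ones (pool 
slots, concurrency, trigger rules, ...); the exact registration mechanism is 
what I’d like us to design together.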


Hi Constance,

I thought the triggerer is mainly for deferrable operator cases? That is quite a 
different scenario from what I’m trying to cover here, IMHO. 
Did I miss anything? Please let me know.


Thanks again! Looking forward to more questions/comments!


XD


> On Feb 2, 2024, at 13:29, Constance Martineau 
> <consta...@astronomer.io.INVALID> wrote:
> 
> Naive question: Instead of running the code on the scheduler - could the
> condition check be delegated to the triggerer?
> 
> On Fri, Feb 2, 2024 at 2:33 PM Pierre Jeambrun <pierrejb...@gmail.com>
> wrote:
> 
>> But maybe it’s time to reconsider that :), curious to see what others
>> think.
>> 
>> On Fri 2 Feb 2024 at 20:30, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
>> 
>>> I like the idea and I understand that it might help in some use cases.
>>> 
>>> The first concern that I have is that it would allow user code to run in
>>> the scheduler, if I understand correctly. This would have big implications
>>> in terms of security and how our security model works. (For instance the
>>> scheduler is a trusted component and has direct access to the DB, AIP-44
>>> assumption)
>>> 
>>> If I remember correctly this is a route that we specifically tried to stay
>>> away from.
>>> 
>>> On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG <xd.d...@apple.com.invalid>
>>> wrote:
>>> 
>>>> Hi folks,
>>>> 
>>>> I’m writing to share my thought regarding the possibility of supporting
>>>> “custom TI dependencies”.
>>>> 
>>>> Currently we maintain the dependency check rules under
>>>> "airflow.ti_deps.deps". They cover dependency checks like whether there is
>>>> an available pool slot, whether the concurrency limits allow it, TI trigger
>>>> rules, whether the state is valid, etc., and they play an essential role in
>>>> the scheduling process.
>>>> 
>>>> One idea was brought up in our team's internal discussion: why shouldn’t
>>>> we support custom TI dependencies?
>>>> 
>>>> In detail: just like the cluster policies
>>>> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook), if
>>>> we let users add their own dependency checks as custom classes (also placed
>>>> under airflow_local_settings.py), it will give users much higher flexibility
>>>> in TI scheduling. These custom TI deps should be added as additions to the
>>>> existing default deps (not replacing or removing any of them).
>>>> 
>>>> For example: similar to checking for pool availability/concurrency, a job
>>>> may need to check the user’s infra-specific conditions, like whether a GPU
>>>> is available right now (instead of competing with other jobs randomly), or
>>>> whether an external system’s API is ready to be called (otherwise wait a
>>>> bit). And there are a lot more possibilities.
>>>> 
>>>> Why won’t cluster policies help here? task_instance_mutation_hook is
>>>> executed in a “worker”, not in the DAG file processor, just before the TI
>>>> is executed. What we are trying to gain control over here, though, is the
>>>> scheduling process (based on custom rules, deciding whether the TI state
>>>> should be updated so it can be scheduled for execution).
>>>> 
>>>> I would love to know how the community finds this idea before we start to
>>>> implement anything. Any question/suggestion would be greatly appreciated.
>>>> Many thanks!
>>>> 
>>>> 
>>>> XD
>>>> 
>>>> 
>>>> 
>> 

