Not interesting :) ?
On Thu, Jul 7, 2022 at 10:41 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> Hello everyone,
>
> We have just published a post on our Medium blog -
> https://medium.com/apache-airflow/airflows-magic-loop-ec424b05b629 - written
> by one of our users, Itay Bittan (thanks!), who was inspired by our
> discussion on Slack about how they struggle with delays when loading dynamic
> DAGs on K8S.
>
> The problem they had was that their dynamic DAGs are created in a big loop
> (1000s of DAGs), and that caused ~2 minute delays when starting their tasks
> on K8S, because all the DAGs have to be created by the loop.
>
> What I proposed to try (since the DAGs were connected by the loop but really
> isolated from each other) is to skip the creation of "all other" DAGs in the
> loop when the file is parsed in the worker. That cut the delay to ~200 ms.
>
> This case seems to be general enough to maybe suggest it even as a "general"
> solution - currently it is based on possibly several "non-documented"
> assumptions (that dag_id is passed in a certain way to the worker and that
> you can use it to filter out DAGs in such a loop).
>
> However, maybe it's a good idea to document it and turn it into a "best
> practice" for when you have similar dynamic DAGs.
>
> I can think of several caveats of such an approach - not all DAGs created in
> a loop can be isolated, and sometimes there might be side effects that give
> your DAG a different structure if you skip other DAGs - but I thought that if
> we add some "guidelines", the approach could be easily replicated by other
> users.
>
> WDYT?
>
> J.
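
For anyone skimming the thread, here is a rough sketch of the "skip all other DAGs in the loop" idea from the quoted message. It is not the code from the blog post. It deliberately avoids importing Airflow, and it assumes (this is exactly the kind of non-documented assumption mentioned above) that the worker process is started with a command shaped like `airflow tasks run <dag_id> ...`, so that the dag_id can be read from the command line. The names `requested_dag_id`, `build_dags` and `make_dag` are hypothetical and exist only for this illustration.

```python
import sys
from typing import Callable, Dict, Iterable, List, Optional


def requested_dag_id(argv: List[str]) -> Optional[str]:
    """Return the dag_id if argv looks like `airflow tasks run <dag_id> ...`.

    ASSUMPTION: this command shape is not a documented contract; treat it
    as an illustration of the idea, not as a stable interface.
    """
    if len(argv) > 3 and argv[1:3] == ["tasks", "run"]:
        return argv[3]
    return None


def build_dags(
    dag_ids: Iterable[str],
    make_dag: Callable[[str], object],
    argv: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Create DAGs in a loop, skipping all but the requested one if known."""
    only = requested_dag_id(sys.argv if argv is None else argv)
    dags: Dict[str, object] = {}
    for dag_id in dag_ids:
        # In the worker, only one DAG is needed, so skip the rest.
        # Everywhere else (no dag_id detected), build every DAG as usual.
        if only is not None and dag_id != only:
            continue
        dags[dag_id] = make_dag(dag_id)
    return dags
```

In a real DAG file, `make_dag` would build the actual DAG object and the returned entries would be placed in the module's globals so that Airflow can discover them. The caveat from the quoted message still applies: this only works when the DAGs in the loop are really independent of each other.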