Hi all,

Happy to tell you all that we have completed the first phase of DAG Serialization, i.e. the Webserver is stateless and can now run without access to DAG Files.
The 2 limitations we had in 1.10.7-1.10.9 (https://airflow.apache.org/docs/1.10.7/dag-serialization.html#limitations) have been resolved.

Special thanks to @ash for his continuous guidance and contributions. Also a special mention to Anita Fronczak and Zhou Fang for their contributions along the way.

The next step is to remove the SimpleDag representation in the Scheduler and replace it with the Serialized DAG (WIP PR: https://github.com/apache/airflow/pull/7694).

*Advantages*:

- *Reduction in Webserver startup time* for a large number of DAGs. Without DAG Serialization, all the DAGs are loaded into the DagBag during Webserver startup. With DAG Serialization, an empty DagBag is created and DAGs are loaded from the DB only when needed (i.e. when a particular DAG is clicked on in the home page).
- *No DAG Parsing / Consistency*: The Webserver loads DAGs from the DB and does not need the DAG Files at all when DAG Serialization is turned on. DAGs are parsed, serialized and stored in the DB by the Scheduler.
- Rendered Templates for TaskInstances that have already run now correctly display the value that was used at the time of the run instead of the current value.
- Paves the way for *DAG Versioning* (more details on it when I create a separate AIP / update an existing AIP for it) and *Scheduler HA* (AIP-15 <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103092651>).

I will create new JIRA issues for the further steps with DAG Serialization and DAG Versioning and will discuss them in our next sig-dag-serialization call (later this month).

Regards,
Kaxil
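
P.S. For anyone who wants a picture of the lazy loading described under Advantages, here is a minimal sketch in plain Python. It is not the Airflow implementation: `LazyDagBag`, the `table` dict standing in for the serialized DAG table in the DB, and the JSON layout are all hypothetical names made up for illustration.

```python
import json


class LazyDagBag:
    """Hypothetical stand-in for a DagBag that reads serialized DAGs on demand."""

    def __init__(self, serialized_dag_table):
        # Assumption: a dict (dag_id -> JSON string) plays the role of the DB table.
        self._table = serialized_dag_table
        # Starts empty: nothing is loaded at "Webserver startup".
        self.dags = {}

    def get_dag(self, dag_id):
        # Load from the "DB" only on first request, then reuse the cached copy.
        if dag_id not in self.dags:
            raw = self._table.get(dag_id)
            if raw is None:
                return None
            self.dags[dag_id] = json.loads(raw)
        return self.dags[dag_id]


# "Scheduler" side (illustrative): DAGs are serialized and stored once.
table = {
    "example_dag": json.dumps({"dag_id": "example_dag", "tasks": ["extract", "load"]}),
    "other_dag": json.dumps({"dag_id": "other_dag", "tasks": ["report"]}),
}

# "Webserver" side: the bag is empty until a specific DAG is requested.
dagbag = LazyDagBag(table)
loaded_at_startup = len(dagbag.dags)
dag = dagbag.get_dag("example_dag")
loaded_after_click = sorted(dagbag.dags)
```

In this sketch only `example_dag` is deserialized after the lookup, and `other_dag` stays in the table until someone asks for it. No DAG file is read on the "Webserver" side at any point.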