Hey everyone,

I updated our meeting notes document in the Airflow wiki to capture the notes from our dev call on the 24th of October. Apologies for the delay. The notes are here: <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=308153072#Airflow3Devcall:MeetingNotes-Summary.11>

To everyone who attended the meeting, please check the summary and add anything that I may have missed. For those who could not join, please let us know if you disagree with anything discussed and agreed upon in the meeting. Also, please do ask questions if something is unclear.

I will follow up with the agenda for the next meeting (scheduled for 7th Nov) in a few days. If you would like something to be added to the proposed agenda for that meeting, please add it here <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=308153072#Airflow3Devcall:MeetingNotes-(Proposed)Agenda.1> or let me know.

Best regards,
Vikram

---

Below is the summary from the call on the 24th Oct:

- Follow-up on action items from the last call:
  - Providers follow-up: Vikram Koka commented that he, Jens, Jarek, and others have been collaborating on a proposal document and that it would be shared well before the next dev call. Vikram shared a split version of that document to the dev list on Friday as a pre-read for the providers repackaging document.
  - Performance test plan: Michal Modras shared that one of his team members would be working on it. He said that they currently have an “elastic DAG” which can be used for benchmarking various scenarios. They would work on the benchmark scenarios and share that document by the end of Nov. After that, the initial data collection based on these scenarios would be done and shared in Dec.
- Development updates and presentations:
  - TP Chung shared a recorded video <https://drive.google.com/file/d/1t69Pb9BOUNGE_egnI_kIpno90RVHoO-s/view> detailing the progress of AIP-74 Introducing Data Assets <https://cwiki.apache.org/confluence/display/AIRFLOW/Test+cases+AIP-74+Introducing+Data+Assets>. This showed the current progress on the AIP, which was close to completion from a back-end perspective.
    This included renaming the existing Datasets and adding a name attribute to the asset class, to allow a more user-friendly name to be specified for the asset. He also shared that this name attribute is not currently shown in the UI, but that they would be working with their UI team to reflect these changes in the Airflow UI.
  - Ankit Chaurasia shared an update on AIP-83 Remove Execution Date Unique Constraint from DAG run <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-83+Rename+execution_date+-%3E+logical_date+and+remove+unique+constraint>. He mentioned that this follows up on the work initially started in Airflow 2.2 with the shift towards run_id as a unique identifier and the logical_date as a human-usable, non-unique reference. He walked through the massive (around 280 files touched) draft PR focused on removing the execution_date completely, including from the CLI and the API. This PR has already had several comments from reviewers. The pervasiveness of the “execution date” was reinforced in the discussion. He mentioned that one of the challenges was the dependency on the backfill PR, and that he was waiting for it to be completed and merged. He showcased the test plan for this AIP and asked the group to review it and provide feedback. He concluded that the PR is in good shape logic-wise, but that he was still fixing tests. The discussion after that raised the point that we need to upgrade the API version from v1 to v2 or v3 as part of Airflow 3, possibly named the “stable v3 API”.
  - Ephraim Anierobi shared an update on AIP-79 Remove Flask AppBuilder as Core dependency <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-79%3A+Remove+Flask+AppBuilder+as+Core+dependency>. It has a lot of work streams, including the auth manager. Ephraim mentioned that the part he was working on was the separation of the DB migration from FAB.
    He mentioned that apps could now use an alternative mechanism called external DB managers to run DB migrations as part of the upgrade. He mentioned that this AIP included other work such as the Auth manager and UI. Rajesh added (on behalf of Vincent) that the Auth manager was close to being done. Ash had also done some work on this AIP regarding 5.0 RC, but it was not yet complete with respect to testing. Vikram expressed concern that, while multiple “necessary parts” had clearly been done as part of this AIP, it was not clear whether all the “sufficient parts” were done. Ephraim mentioned that “what is left is for the UI to switch”. This needs to be followed up in the next dev call. Jens suggested that we may need to migrate or reimplement the “Connection form UI”.
  - Ephraim Anierobi also shared an update on AIP-65 Improve DAG history in UI - backend changes <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-65%3A+Improve+DAG+history+in+UI>. He said that he is working on DAG Versioning around the serialized DAG, so that older versions of the DAG runs could be visible through the UI. Ephraim mentioned that the serialized DAG hash is not always consistent today because of inconsistent sorting, which is the first thing he fixed. He also added a user-specified version for the DAG and said that he is close to completion on the DAG history database changes. He said that the next steps were the API and UI work.
  - Kaxil Naik shared an update on AIP-72 Task Execution Interface aka Task SDK <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-72+Task+Execution+Interface+aka+Task+SDK>, specifically focusing on the API server. He showed the addition of the scaffolding for the API server, so that in the future, the Core API and the execution API servers could be run independently if needed for different scaling requirements. He mentioned that he and Ash are working on porting code from Core to the Task SDK.
    Ash Berlin-Taylor then followed up by sharing the main PR for the Task SDK, which covers the DAG object, the TaskGroup, the Base Operator and so on. These would be the heart of airflow.sdk, which is the core Task SDK. To make upgrades easier, DAG import would stay as “airflow.dag”, so people don’t even get any deprecation warnings. Ash mentioned that he was currently thinking of including this “airflow.sdk” in the Scheduler to avoid reimplementation, but was still thinking through the implications. He mentioned that a lot of the classes are now using “attrs” to save on boilerplate code. Ash mentioned that the next step is to start adding endpoints for the API server to start doing simple client-based execution.
  - Brent Bovenzi finished up the call with an update on designs for AIP-38 Modern Web Application <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-38+Modern+Web+Application> by walking through mockups of the redesigned DAG details page, including a list of actions / filters to show only failed tasks, etc. Brent also walked through the mockups for the DAG runs view, covering the Gantt chart view vs. the Graph view, etc. Brent showed how the modal view enables a quick view of the details of a DAG run and lets a user switch between the different areas they may want to drill into. Brent mentioned that the motivation was to enable proper use of space and to zoom between a DAG, a DAG run, and a TaskInstance, without having to show a giant Christmas tree of the grid view.
- Action items for next call:
  - Vikram to update the development milestones doc based on the performance benchmark plan from Michal - DONE
  - Vikram to create a subpage under Airflow 3.0 to track critical changes important for upgrades, including API changes such as those brought up in the Execution date removal discussion. API owners to update this page as they make user-facing changes during development.
  - Ephraim, Vincent, and other interested people to follow up on "What's left for AIP-79" and present at the next dev call.
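
A side note on the AIP-65 item above, where Ephraim mentioned that the serialized DAG hash was not always consistent because of inconsistent sorting. The sketch below is only an illustration of that general problem and is not Airflow's actual serialization or hashing code. The helper names (dag_hash, normalize), the dict layout, and the "tags" field are all assumptions made up for the example. It shows that sorting dictionary keys alone may not be enough if an order-insensitive list is left unsorted.

```python
import hashlib
import json


def dag_hash(serialized: dict) -> str:
    """Hash a serialized-DAG-like dict (hypothetical helper, not Airflow's code)."""
    # sort_keys=True canonicalizes dict key order, but list order is preserved.
    canonical = json.dumps(serialized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def normalize(serialized: dict) -> dict:
    """Sort an order-insensitive collection before hashing (illustrative field name)."""
    out = dict(serialized)
    out["tags"] = sorted(serialized.get("tags", []))
    return out


# Two logically identical "DAGs" whose keys and tags arrive in different orders.
a = {"dag_id": "example", "tags": ["etl", "daily"]}
b = {"tags": ["daily", "etl"], "dag_id": "example"}

print(dag_hash(a) == dag_hash(b))                        # False: list order leaks into the hash
print(dag_hash(normalize(a)) == dag_hash(normalize(b)))  # True: same hash once sorted
```

The point is simply that a version hash is only useful for detecting real changes if every unordered collection is put into a deterministic order before hashing.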