+1 to no more "zombie"

Avi

On Wed, Feb 12, 2025 at 12:38 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> +1 on both: making the change, and doing it in Airflow 3 only. Apart
> from some concerns about the name itself, I never remember what kind
> of tasks are zombies and what triggers that.
>
> I think "zombie" is a bit of an overloaded term, especially in the
> container world, where you already have zombie processes (when your
> init process does not reap zombie processes properly), and that might
> be confusing.
>
> Explicitly naming it, even if the name is longer, might be a bit more
> obvious.
>
> On Wed, Feb 12, 2025, 13:15 kalyan reddy <kalyan.be...@live.com> wrote:
>
> > +1 to the idea and to restrict the change to Airflow 3 only
> > ________________________________
> > From: Wei Lee <weilee...@gmail.com>
> > Sent: 12 February 2025 17:01
> > To: dev@airflow.apache.org <dev@airflow.apache.org>
> > Subject: Re: Updating "zombie task" terminology to "task heartbeat timeout"
> >
> > I like this idea as well, but I'm not sure whether it would affect
> > monitoring. 🤔 If we're to introduce it, we'd better make it
> > Airflow 3 only and make sure we add a migration rule, as we're
> > changing the configuration.
> >
> > Best,
> > Wei
> >
> > > On Feb 12, 2025, at 6:10 AM, Ryan Hatter <ryan.hat...@astronomer.io.invalid> wrote:
> > >
> > > I love it. "heartbeat timeout" is obvious and has meaning in
> > > software beyond Airflow, so it makes sense to stick with this
> > > verbiage and use it to replace "zombie" in docs, configs, logs,
> > > and code IMO.
> > >
> > > On Tue, Feb 11, 2025 at 4:15 PM Karen Braganza <karenbraganz...@gmail.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have been working on this PR
> > >> <https://github.com/apache/airflow/pull/46257> to update our
> > >> documentation on zombie tasks to reflect the terminology used in
> > >> the user-facing event logs in Airflow 2.10+.
> > >> The event logs use the terminology "heartbeat timeout" whereas
> > >> the documentation uses the terminology "zombie tasks". I would
> > >> like to update the documentation to focus on the "heartbeat
> > >> timeout" terminology so that users are able to find and
> > >> understand this documentation easily when they see a "heartbeat
> > >> timeout" in the event logs.
> > >>
> > >> In the same vein, I think other user-facing configurations should
> > >> also be updated to use the same terminology. I am proposing that
> > >> we make the following changes to Airflow configuration variables:
> > >>
> > >> scheduler_zombie_task_threshold --> scheduler_task_heartbeat_timeout_threshold
> > >> zombie_detection_interval --> task_heartbeat_timeout_detection_interval
> > >>
> > >> In addition to this, I propose that we also change the logs
> > >> emitted by the scheduler to use the "task heartbeat timeout"
> > >> terminology.
> > >>
> > >> For example, the below logs
> > >> <https://github.com/apache/airflow/blob/dea2cc9afc61caf49621c3b1923bcf90e96e17e9/airflow/jobs/scheduler_job_runner.py#L2040>:
> > >>
> > >>     self.log.error(
> > >>         "Detected zombie job: %s "
> > >>         "(See https://airflow.apache.org/docs/apache-airflow/"
> > >>         "stable/core-concepts/tasks.html#zombie-tasks)",
> > >>         request,
> > >>     )
> > >>
> > >> should become:
> > >>
> > >>     self.log.error(
> > >>         "Detected task heartbeat timeout: %s "
> > >>         "(See https://airflow.apache.org/docs/apache-airflow/"
> > >>         "stable/core-concepts/tasks.html#zombie-tasks)",
> > >>         request,
> > >>     )
> > >>
> > >> I wanted to start this discussion to get everyone's thoughts on
> > >> my proposal. Do you agree (or disagree) that at least all
> > >> user-facing elements of Airflow should use the "task heartbeat
> > >> timeout" terminology instead of "zombie tasks" for uniformity?
> > >>
> > >> I can add all of these changes to my PR.
> > >>
> > >> Best,
> > >> Karen Braganza
> > >>
> > >> <https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#zombie-detection-interval>
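
To make the "migration rule" idea from the thread a bit more concrete, here is a rough sketch of how the two proposed renames could be applied to a set of option values. The option names are the ones proposed above; the `PROPOSED_RENAMES` table and the `migrate_scheduler_options` helper are hypothetical names made up for illustration and are not Airflow's actual deprecation or migration machinery.

```python
import warnings

# Old option name -> name proposed in this thread.
PROPOSED_RENAMES = {
    "scheduler_zombie_task_threshold": "scheduler_task_heartbeat_timeout_threshold",
    "zombie_detection_interval": "task_heartbeat_timeout_detection_interval",
}


def migrate_scheduler_options(options):
    """Return a copy of `options` with old names replaced by the proposed ones.

    Hypothetical helper for illustration only; not part of Airflow.
    """
    migrated = {}
    for name, value in options.items():
        new_name = PROPOSED_RENAMES.get(name)
        if new_name is None:
            # Not one of the renamed options: copy it through unchanged.
            migrated[name] = value
            continue
        warnings.warn(
            f"Option {name!r} is renamed to {new_name!r}",
            DeprecationWarning,
            stacklevel=2,
        )
        # If the new-style option is also set explicitly, it wins.
        if new_name not in options:
            migrated[new_name] = value
    return migrated
```

In this sketch an explicitly set new-style option takes precedence over the old one, so a config that already uses the new name is left untouched apart from the warning about the stale entry.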