I also agree with the idea that we should go for a name that's more accurate and easier to understand. Also, +1 to starting with Airflow 3.
Tbh, "heartbeat" itself is an overused term/concept in Airflow. I think we already have 6 configurations with "heartbeat" in their names, and they refer to different types of heartbeats.

Anyway, I am against this name change:

    scheduler_zombie_task_threshold --> scheduler_task_heartbeat_timeout_threshold

We already have a scheduler heartbeat, so let's drop the word "scheduler" from this name so that users know this is the Task Instance heartbeat, not the scheduler's.

I also think we should combine "local_task_job_heartbeat_sec" with "scheduler_zombie_task_threshold". That configuration's description says that it already defaults to the zombie task threshold when set to 0. I haven't dug into the code to see why they are different, but I really hope our configuration documentation doesn't read like the below in the future:

"local_task_job_heartbeat_sec: The frequency (in seconds) at which the LocalTaskJob should send heartbeat signals to the scheduler to notify it’s still alive. If this value is set to 0, the heartbeat interval will default to the value of [scheduler] scheduler_task_heartbeat_timeout_threshold."

Thanks
Shubham

On 2025-02-11, 1:15 PM, "Karen Braganza" <karenbraganz...@gmail.com> wrote:

Hi,

I have been working on this PR <https://github.com/apache/airflow/pull/46257> to update our documentation on zombie tasks to reflect the terminology used in the user-facing event logs in Airflow 2.10+.
The event logs use the terminology "heartbeat timeout", whereas the documentation uses the terminology "zombie tasks". I would like to update the documentation to focus on the "heartbeat timeout" terminology so that users are able to find and understand this documentation easily when they see a "heartbeat timeout" in the event logs.

In the same vein, I think other user-facing configurations should also be updated to use the same terminology. I am proposing that we make the following changes to Airflow configuration variables:

    scheduler_zombie_task_threshold --> scheduler_task_heartbeat_timeout_threshold
    zombie_detection_interval --> task_heartbeat_timeout_detection_interval

In addition to this, I propose that we also change the logs emitted by the scheduler to use the "task heartbeat timeout" terminology. For example, the below logs <https://github.com/apache/airflow/blob/dea2cc9afc61caf49621c3b1923bcf90e96e17e9/airflow/jobs/scheduler_job_runner.py#L2040>:

    self.log.error(
        "Detected zombie job: %s "
        "(See https://airflow.apache.org/docs/apache-airflow/"
        "stable/core-concepts/tasks.html#zombie-tasks)",
        request,
    )

should become:

    self.log.error(
        "Detected task heartbeat timeout: %s "
        "(See https://airflow.apache.org/docs/apache-airflow/"
        "stable/core-concepts/tasks.html#zombie-tasks)",
        request,
    )

I wanted to start this discussion to get everyone's thoughts on my proposal. Do you agree (or disagree) that at least all user-facing elements of Airflow should use the "task heartbeat timeout" terminology instead of "zombie tasks" for uniformity? I can add all of these changes to my PR.
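
For illustration, here is a minimal sketch of how the old option names could keep working for a while after the rename. It is not Airflow's actual configuration code: RENAMED_OPTIONS, get_option, and the plain dict of settings are hypothetical names made up for this example, and only the four option names come from the proposal above.

```python
import warnings

# Proposed renames from this email, keyed as new name -> old name.
RENAMED_OPTIONS = {
    "scheduler_task_heartbeat_timeout_threshold": "scheduler_zombie_task_threshold",
    "task_heartbeat_timeout_detection_interval": "zombie_detection_interval",
}


def get_option(settings, name, default=None):
    """Look up `name`; fall back to its pre-rename name with a warning.

    `settings` is a plain dict standing in for a parsed config section
    (an assumption for this sketch, not Airflow's config object).
    """
    if name in settings:
        return settings[name]
    old_name = RENAMED_OPTIONS.get(name)
    if old_name is not None and old_name in settings:
        warnings.warn(
            f"Option '{old_name}' is deprecated; use '{name}' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return settings[old_name]
    return default


# Hypothetical user config that still uses the old name.
user_settings = {"scheduler_zombie_task_threshold": 300}
threshold = get_option(user_settings, "scheduler_task_heartbeat_timeout_threshold")
```

With a fallback like this, a config that only sets the old name would still be honored, while a config that sets both would use the new name.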

Best,
Karen Braganza

<https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#zombie-detection-interval>