dosubot[bot] commented on issue #28445:
URL: https://github.com/apache/superset/issues/28445#issuecomment-2957589667

   Yes, several users still see the Celery worker liveness probe fail 
with "No nodes replied within time constraint" on recent Superset versions, 
including 4.1.1 with Celery 5.4.0. This is a known pain point. The root 
cause appears to be a combination of how Kubernetes probes interact with 
Celery's process model and how Celery responds to `inspect ping` during 
startup, shutdown, or under load.
   
   Key findings and actionable notes:
   
   - The liveness probe is still configured by default to use `celery -A 
superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME`, and this 
command can fail even when the worker is otherwise healthy and processing 
tasks. This is a limitation of Celery's inspect mechanism, not unique to 
Superset. 
[source](https://github.com/apache/superset/blob/86e7139245e7444e749b5e8ad1d5167b4df53ce9/helm/superset/values.yaml)
   - Some users report that even with this error, Celery continues to function, 
and Flower can see the workers. The probe failure is often a false positive. 
[source](https://github.com/apache/superset/discussions/31931)
   - A workaround is to relax the liveness probe (increase 
`initialDelaySeconds`, `timeoutSeconds`, and `failureThreshold`) or to use a 
custom probe script that checks for the Celery process rather than relying on 
`inspect ping`.
   - For graceful shutdown and to avoid zombie workers, add a Kubernetes 
lifecycle `preStop` hook to send a Celery control shutdown signal:  
     ```
     lifecycle:
       preStop:
         exec:
           command:
             - /bin/sh
             - '-c'
             - celery --app=superset.tasks.celery_app:app control shutdown --destination=celery@$HOSTNAME
     ```
     This helps ensure the worker deregisters cleanly and avoids some 
probe-related restarts. 
[source](https://github.com/apache/superset/issues/33513)
   - If you use SQLite as the metadata DB, switch to a production-grade 
database such as PostgreSQL, since SQLite is not recommended for async queries 
and Celery workers. 
[source](https://github.com/apache/superset/discussions/31931)
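
   As a rough illustration of the probe relaxation mentioned in the list above, 
the Helm values override might look like the sketch below. It assumes the chart 
exposes the worker probe under `supersetWorker.livenessProbe` (check your chart 
version's `values.yaml`), and the numbers are illustrative guesses rather than 
tested recommendations:

   ```yaml
   # Hypothetical values override; the key path and all numbers are assumptions.
   supersetWorker:
     livenessProbe:
       exec:
         command:
           - sh
           - -c
           # Same probe command quoted above, unchanged.
           - celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
       initialDelaySeconds: 180  # give the worker time to connect to the broker
       periodSeconds: 60         # probe less often to reduce load on the worker
       timeoutSeconds: 120       # allow slow `inspect ping` replies under load
       failureThreshold: 5       # tolerate several misses before a restart
       successThreshold: 1       # Kubernetes requires 1 for liveness probes
   ```

   With these example numbers, the container is restarted only after about five 
consecutive failed probes, so a single slow reply during startup or heavy load 
no longer triggers a restart.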
   
   There have not been major changes to the liveness probe logic in recent 
Superset releases; the Helm chart and probe command remain the same. The best 
current advice is to tune the probe settings and use the `preStop` hook for 
graceful shutdown. If the worker is otherwise healthy and tasks are running, 
the probe failure is most likely a false positive. Kubernetes will still 
restart the container once `failureThreshold` is reached, though, so relax the 
probe rather than ignore it, and keep an eye on your actual task throughput 
and logs.
   
   If anyone has found a more robust probe or workaround, please share!
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

