dosubot[bot] commented on issue #28445: URL: https://github.com/apache/superset/issues/28445#issuecomment-2957589667
<!-- Answer -->
Yes, several users still see the Celery worker liveness probe fail with "No nodes replied within time constraint" on recent Superset versions, including Superset 4.1.1 with Celery 5.4.0. This is a known pain point, and the root cause appears to be a combination of how Kubernetes probes interact with Celery's process model and how Celery responds to `inspect ping` during startup, shutdown, or under load.

Key findings and actionable notes:

- The liveness probe still defaults to `celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME`, and this command can fail even when the worker is otherwise healthy and processing tasks. This is a limitation of Celery's inspect mechanism, not something unique to Superset. [source](https://github.com/apache/superset/blob/86e7139245e7444e749b5e8ad1d5167b4df53ce9/helm/superset/values.yaml)
- Some users report that even with this error, Celery keeps functioning and Flower can still see the workers; the probe failure is often a false positive. [source](https://github.com/apache/superset/discussions/31931)
- A workaround is to relax the liveness probe (increase `initialDelaySeconds`, `timeoutSeconds`, and `failureThreshold`) or to use a custom probe script that checks for the Celery process rather than relying on `inspect ping`.
- For graceful shutdown and to avoid zombie workers, add a Kubernetes lifecycle `preStop` hook that sends a Celery control shutdown signal:

  ```yaml
  lifecycle:
    preStop:
      exec:
        command:
          - /bin/sh
          - '-c'
          - celery --app=superset.tasks.celery_app:app control shutdown --destination=celery@$HOSTNAME
  ```

  This helps ensure the worker deregisters cleanly and avoids some probe-related restarts. [source](https://github.com/apache/superset/issues/33513)
- If you use SQLite as the metadata database, switch to PostgreSQL: async queries and Celery require it.
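To relax the probe via Helm, the tolerances can be raised in your `values.yaml` override. A minimal sketch, assuming the upstream `helm/superset` chart's `supersetWorker.livenessProbe` key layout (verify the exact keys against your chart version; the timing values here are illustrative, not tuned recommendations):

```yaml
# Sketch of a relaxed worker liveness probe (Helm values override).
# Assumes the stock helm/superset chart's supersetWorker.livenessProbe block.
supersetWorker:
  livenessProbe:
    exec:
      command:
        - sh
        - -c
        - celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
    initialDelaySeconds: 120  # let the worker finish booting before the first probe
    timeoutSeconds: 60        # inspect ping can be slow under heavy task load
    periodSeconds: 60
    failureThreshold: 5       # tolerate several consecutive false positives
```

The idea is to make a single slow or dropped `inspect ping` reply non-fatal, so the kubelet only restarts the worker after a sustained failure rather than a transient one.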
[source](https://github.com/apache/superset/discussions/31931)

There have not been major changes to the liveness probe logic in recent Superset releases; the Helm chart and probe command remain the same. The best current advice is to tune the probe settings and use the `preStop` hook for graceful shutdown. If the worker is otherwise healthy and tasks are running, the probe failure may be safely ignored, but keep an eye on your actual task throughput and logs. If anyone has found a more robust probe or workaround, please share!

<!-- Dosu Comment Footer -->
*To reply, just mention [@dosu](https://go.dosu.dev/dosubot).*

---
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
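For the "custom probe script" route mentioned above, one hedged sketch is to check that a Celery worker process exists at all instead of asking it to answer `inspect ping`. The process pattern below is an assumption; match it to however your worker container actually launches Celery:

```shell
#!/bin/sh
# Hypothetical liveness check: instead of the flaky `celery inspect ping`,
# confirm a Celery worker process is running at all. The [c] bracket trick
# prevents pgrep from matching this probe's own command line.
check_celery_worker() {
  pgrep -f "[c]elery.*worker" > /dev/null
}

# In a real probe script you would `exit 1` when the check fails so the
# kubelet marks the container unhealthy; here we just report the result.
if check_celery_worker; then
  echo "celery worker process found"
else
  echo "no celery worker process found"
fi
```

Wired into the chart as a `livenessProbe.exec` command, this only restarts the pod when the worker process has genuinely died; pair it with a separate readiness or monitoring check (e.g. Flower, or queue-depth metrics) if you also care about the worker actually consuming tasks.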
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
