[ https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lorenzo Nicora updated FLINK-36404:
-----------------------------------
    Description:
*Issue*

A {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not cause the httpclient IOReactor to fail. The exception is swallowed, which prevents the job from failing.

Also, related:

*Solution*

Intercept {{PrometheusSinkWriteException}} further up the httpclient stack by adding to the client an {{IOSessionListener}} that rethrows these exceptions, causing the reactor to actually fail and, consequently, the operator to fail as well.

Note: the httpclient IOReactor failing causes a number of exceptions. To keep the actual root cause evident, the response callback should log at ERROR level when the exception happens.

  was:
*Issue*

A {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not cause the httpclient IOReactor to fail. The exception is swallowed, which prevents the job from failing.

*Solution*

Intercept {{PrometheusSinkWriteException}} further up the httpclient stack by adding to the client an {{IOSessionListener}} that rethrows these exceptions, causing the reactor to actually fail and, consequently, the operator to fail as well.

Note: the httpclient IOReactor failing causes a number of exceptions. To keep the actual root cause evident, the response callback should log at ERROR level when the exception happens.

> PrometheusSinkWriteException thrown by the response callback may not cause job to fail
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-36404
>                 URL: https://issues.apache.org/jira/browse/FLINK-36404
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / Prometheus
>            Reporter: Lorenzo Nicora
>            Priority: Critical
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
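
The solution described in this issue can be illustrated with a minimal, library-free sketch. It is not the connector's code. {{SinkWriteException}}, {{SessionListener}} and {{ToyReactor}} are hypothetical stand-ins for {{PrometheusSinkWriteException}}, the httpclient {{IOSessionListener}} and the IOReactor, and the real listener interface has more callbacks than the single one modelled here. The sketch only shows the mechanism: a reactor that hands exceptions to a listener keeps running unless the listener rethrows.

```java
import java.util.Arrays;
import java.util.List;

public class RethrowingListenerSketch {

    /** Hypothetical stand-in for PrometheusSinkWriteException (assumed unchecked here). */
    static class SinkWriteException extends RuntimeException {
        SinkWriteException(String message) {
            super(message);
        }
    }

    /** Hypothetical, simplified listener with a single exception callback. */
    interface SessionListener {
        void exception(Exception e);
    }

    /** Toy reactor: runs callbacks and passes any exception to the listener. */
    static class ToyReactor {
        private final SessionListener listener;

        ToyReactor(SessionListener listener) {
            this.listener = listener;
        }

        void run(List<Runnable> callbacks) {
            for (Runnable callback : callbacks) {
                try {
                    callback.run();
                } catch (Exception e) {
                    // If the listener returns normally, the exception is swallowed
                    // and the loop carries on with the next callback.
                    listener.exception(e);
                }
            }
        }
    }

    /** Listener that ignores everything: models the behaviour reported in the issue. */
    static final SessionListener SWALLOWING = e -> { };

    /** Listener that rethrows the sink exception: models the proposed fix. */
    static final SessionListener RETHROWING = e -> {
        if (e instanceof SinkWriteException) {
            throw (SinkWriteException) e;
        }
    };

    /** Returns true if the reactor run terminates with a SinkWriteException. */
    static boolean reactorFails(SessionListener listener) {
        Runnable failing = () -> {
            // Log the root cause before throwing, as the note in the issue suggests.
            System.err.println("ERROR: write rejected by remote endpoint");
            throw new SinkWriteException("write rejected");
        };
        Runnable succeeding = () -> { };
        try {
            new ToyReactor(listener).run(Arrays.asList(failing, succeeding));
            return false;
        } catch (SinkWriteException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("swallowing listener, reactor fails: " + reactorFails(SWALLOWING));
        System.out.println("rethrowing listener, reactor fails: " + reactorFails(RETHROWING));
    }
}
```

With the swallowing listener the run completes as if nothing had happened. With the rethrowing listener the exception escapes the reactor loop, which is the point where a surrounding operator could observe the failure. How the listener is registered on the real async client is not shown in this issue and is left out of the sketch.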