Igal Shilman created FLINK-18790:
------------------------------------

             Summary: Set a connection timeout that is lower than the request timeout for remote functions
                 Key: FLINK-18790
                 URL: https://issues.apache.org/jira/browse/FLINK-18790
             Project: Flink
          Issue Type: Improvement
          Components: Stateful Functions
            Reporter: Igal Shilman
             Fix For: statefun-2.2.0

Currently, for remote functions, the connection timeout is identical to the total request timeout. This is a problem when a remote function sits behind a NAT, a load balancer, or anything else that keeps the port open even though the remote function is no longer present or has been relocated. In that case the entire request budget is spent waiting for a connection. This is particularly the case in Kubernetes, when the pods behind a service are all killed ungracefully at once.

To fix this, I propose:
1) By default, use 10% of the total request timeout as the connection timeout.
2) Expose the connection timeout explicitly as a configuration parameter.
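
The proposal can be illustrated with a minimal Java sketch. It is not the Stateful Functions implementation: the class `TimeoutDefaults` and its methods are hypothetical names, and the JDK's `java.net.http.HttpClient` stands in for whatever HTTP client the project actually uses. It assumes that an unset connection timeout is represented as `null` and that a connection timeout larger than the request timeout should be rejected.

```java
import java.net.http.HttpClient;
import java.time.Duration;

// Hypothetical helper, for illustration only.
final class TimeoutDefaults {

    // Proposal 1: default the connection timeout to 10% of the request timeout.
    static Duration defaultConnectTimeout(Duration requestTimeout) {
        return requestTimeout.dividedBy(10);
    }

    // Proposal 2: an explicitly configured value wins over the default.
    // A null `configured` value means "not set" (an assumption of this sketch).
    static Duration resolveConnectTimeout(Duration requestTimeout, Duration configured) {
        Duration connect =
            (configured != null) ? configured : defaultConnectTimeout(requestTimeout);
        if (connect.compareTo(requestTimeout) > 0) {
            throw new IllegalArgumentException(
                "connection timeout " + connect
                    + " must not exceed request timeout " + requestTimeout);
        }
        return connect;
    }

    // Build a client whose connection phase is bounded separately from the
    // overall request budget.
    static HttpClient newClient(Duration requestTimeout, Duration configured) {
        return HttpClient.newBuilder()
            .connectTimeout(resolveConnectTimeout(requestTimeout, configured))
            .build();
    }
}
```

With a one-minute request timeout and no explicit setting, this sketch allows six seconds for the connection attempt, so an unreachable pod behind a service fails fast and leaves most of the budget for a retry.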