2020-05-14 23:30:14 UTC - Addison Higham: hrm, so, we are trying to make 
brokers recover more quickly from a bookie failure in k8s and have found more 
data. What we previously thought was happening: a bookie failed, was removed 
and re-added, and its hostname resolved to the old IP address due to DNS caching.

That doesn't appear to be the case. We see that it is using the new IP when 
trying to look up rack info, but once it actually goes to use the new bookie 
by sending it ops, all the netty channels are still open to the old IP. It 
seems like the netty connection pool for the old bookie never gets closed, so 
it tries to reuse a connection pool for the old IP (with what look like 
connections still in ESTABLISHED... I can't figure out why they wouldn't go 
to half-closed quicker). I am guessing that the netty connection pools are 
keyed off only hostname and not IP. I imagine somewhere we need to check, when 
re-adding a bookie with the same hostname, whether the IP has changed and, if 
so, re-init the netty channels.
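
A minimal sketch of the check guessed at above, assuming pools really are keyed by hostname only. This is not BookKeeper or netty code: `ChannelPool`, `PoolRegistry`, and the injected `resolve` function are hypothetical names used only to illustrate comparing the resolved IP on lookup and dropping the stale pool when it has changed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ChannelPool:
    """Hypothetical stand-in for a per-bookie pool of netty channels."""
    hostname: str
    ip: str
    closed: bool = False

    def close(self) -> None:
        # A real pool would close every open channel here.
        self.closed = True


class PoolRegistry:
    """Pools are keyed by hostname, but each pool remembers the IP it was opened to."""

    def __init__(self, resolve: Callable[[str], str]) -> None:
        # The resolver is injected so the sketch does not depend on real DNS.
        self._resolve = resolve
        self._pools: Dict[str, ChannelPool] = {}

    def get_pool(self, hostname: str) -> ChannelPool:
        current_ip = self._resolve(hostname)
        pool = self._pools.get(hostname)
        if pool is not None and pool.ip != current_ip:
            # Same hostname, different address: the old channels point at the
            # previous pod, so close them instead of reusing them.
            pool.close()
            pool = None
        if pool is None:
            pool = ChannelPool(hostname, current_ip)
            self._pools[hostname] = pool
        return pool


# Usage: the bookie pod is rescheduled and comes back with a new IP.
dns = {"bookie-0": "10.0.0.5"}
registry = PoolRegistry(dns.__getitem__)
old_pool = registry.get_pool("bookie-0")
dns["bookie-0"] = "10.0.0.9"
new_pool = registry.get_pool("bookie-0")  # old_pool is closed, new_pool uses 10.0.0.9
```

Re-resolving on every lookup keeps the sketch short; a real client would more likely do the comparison only when the bookie is re-registered, as suggested above.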
----