On 6/5/19 12:05 PM, Vladimir Sementsov-Ogievskiy wrote: >> By enabling TCP keepalives we are explicitly making the connection >> less reliable by forcing it to be terminated when keepalive >> threshold triggers, instead of waiting longer for TCP to recover. >> >> The rationale s that once a connection has been in a hung state for >> so long that keepalive triggers, its (hopefully) not useful to the >> mgmt app to carry on waiting anyway. >> >> If the connection is terminated by keepalive & the mgmt app then >> spawns a new client to carry on with the work, what are the risks >> involved ? eg Could packets from the stuck, terminated, connection >> suddenly arrive later and trigger I/O with outdated data payload ? > > Hmm, I believe that tcp guarantees isolation between different connections > >> >> I guess this is no different a situation from an app explicitly >> killing the QEMU NBD client process instead & spawning a new one. >> >> I'm still feeling a little uneasy about enabling it unconditionally >> though, since pretty much everything I know which supports keepalives >> has a way to turn them on/off at least, even if you can't tune the >> individual timer settings. > > Hm. So, I can add bool keepalive parameter for nbd format with default to > true. > And if needed, it may be later extended to be qapi 'alternate' of bool or > struct with > three numeric parameters, corresponding to TCP_KEEPCNT, TCP_KEEPIDLE and > TCP_KEEPINTVL . > > Opinions?
Adding a bool that could later turn into a qapi 'alternate' for fine-tuning seems reasonable. Defaulting the bool to true is not backwards-compatible; better would be defaulting it to false and letting users opt-in; introspection will also work to let you know whether the feature is present. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
signature.asc
Description: OpenPGP digital signature