On Wed, Jun 05, 2019 at 09:39:10AM -0500, Eric Blake wrote:
> On 6/5/19 5:09 AM, Vladimir Sementsov-Ogievskiy wrote:
> > Enable keepalive option to track server availablity.
>
> s/availablity/availability/
>
> Do we want this unconditionally, or should it be an option (and hence
> exposed over QMP)?

I guess this is really a question about what our intended connection
reliability policy should be.

By enabling TCP keepalives we are explicitly making the connection
less reliable by forcing it to be terminated when the keepalive
threshold triggers, instead of waiting longer for TCP to recover.

The rationale is that once a connection has been in a hung state for
so long that keepalive triggers, it's (hopefully) not useful to the
mgmt app to carry on waiting anyway.

If the connection is terminated by keepalive & the mgmt app then
spawns a new client to carry on with the work, what are the risks
involved? E.g. could packets from the stuck, terminated connection
suddenly arrive later and trigger I/O with an outdated data payload?

I guess this is no different a situation from an app explicitly
killing the QEMU NBD client process instead & spawning a new one.

I'm still feeling a little uneasy about enabling it unconditionally
though, since pretty much everything I know which supports keepalives
has a way to turn them on/off at least, even if you can't tune the
individual timer settings.

> > Requested-by: Denis V. Lunev <d...@openvz.org>
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
> > ---
> >  block/nbd-client.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/block/nbd-client.c b/block/nbd-client.c
> > index 790ecc1ee1..b57cea8482 100644
> > --- a/block/nbd-client.c
> > +++ b/block/nbd-client.c
> > @@ -1137,6 +1137,7 @@ static int nbd_client_connect(BlockDriverState *bs,
> >
> >      /* NBD handshake */
> >      logout("session init %s\n", export);
> > +    qio_channel_set_keepalive(QIO_CHANNEL(sioc), true, NULL);
> >      qio_channel_set_blocking(QIO_CHANNEL(sioc), true, NULL);
> >
> >      client->info.request_sizes = true;
> >
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
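
PS: to make the on/off point concrete, here is a rough sketch at the
plain sockets level of what "optional, with tunable timers" could look
like. It is not QEMU code: struct keepalive_cfg, apply_keepalive() and
keepalive_is_on() are hypothetical names made up for illustration, and
the TCP_KEEPIDLE / TCP_KEEPINTVL / TCP_KEEPCNT knobs are assumed to be
available (they are on Linux; the code skips them elsewhere).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical settings struct, for illustration only (not a QEMU type). */
struct keepalive_cfg {
    bool enabled;    /* master on/off switch */
    int idle_secs;   /* idle time before the first probe; 0 = system default */
    int intvl_secs;  /* gap between probes; 0 = system default */
    int probes;      /* unanswered probes before giving up; 0 = system default */
};

static int set_int_opt(int fd, int level, int name, int val)
{
    return setsockopt(fd, level, name, &val, sizeof(val));
}

/* Apply the policy to a TCP socket. Returns 0 on success, -1 on error. */
static int apply_keepalive(int fd, const struct keepalive_cfg *cfg)
{
    assert(cfg != NULL);

    if (set_int_opt(fd, SOL_SOCKET, SO_KEEPALIVE, cfg->enabled ? 1 : 0) < 0) {
        return -1;
    }
    if (!cfg->enabled) {
        return 0;   /* nothing to tune when keepalive is off */
    }

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    /* Per-socket timer overrides; only applied when explicitly requested. */
    if (cfg->idle_secs > 0 &&
        set_int_opt(fd, IPPROTO_TCP, TCP_KEEPIDLE, cfg->idle_secs) < 0) {
        return -1;
    }
    if (cfg->intvl_secs > 0 &&
        set_int_opt(fd, IPPROTO_TCP, TCP_KEEPINTVL, cfg->intvl_secs) < 0) {
        return -1;
    }
    if (cfg->probes > 0 &&
        set_int_opt(fd, IPPROTO_TCP, TCP_KEEPCNT, cfg->probes) < 0) {
        return -1;
    }
#endif
    return 0;
}

/* Read the switch back: 1 if on, 0 if off, -1 on error. */
static int keepalive_is_on(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0) {
        return -1;
    }
    return val != 0;
}
```

With, say, idle_secs=30, intvl_secs=5 and probes=3, a dead peer would be
noticed after roughly 30 + 3 * 5 = 45 seconds, which is the kind of
number a mgmt app would presumably want to be able to choose for itself.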