On (Fri) 25 Feb 2011 [14:25:20], Michael Roth wrote: > On 02/24/2011 06:48 AM, Amit Shah wrote: > >On (Wed) 23 Feb 2011 [08:31:52], Michael Roth wrote: > >>On 02/22/2011 10:59 PM, Amit Shah wrote: > >>>On (Tue) 22 Feb 2011 [16:40:55], Michael Roth wrote: > >>>>If something in the guest is attempting to read/write from the > >>>>virtio-serial device, and nothing is connected to virtio-serial's > >>>>host character device (say, a socket) > >>>> > >>>>1. writes will block until something connect()s, at which point the > >>>>write will succeed > >>>> > >>>>2. reads will always return 0 until something connect()s, at which > >>>>point the reads will block until there's data > >>>> > >>>>This makes it difficult (impossible?) to implement the notion of > >>>>connect/disconnect or open/close over virtio-serial without layering > >>>>another protocol on top using hackish things like length-encoded > >>>>payloads or sentinel values to determine the end of one > >>>>RPC/request/response/session and the start of the next. > >>>> > >>>>For instance, if the host side disconnects, then reconnects before > >>>>we read(), we may never get the read()=0, and our FD remains valid. > >>>>Whereas with a tcp/unix socket our FD is no longer valid, and the > >>>>read()=0 is an event we can check for at any point after the other > >>>>end does a close/disconnect. > >>> > >>>There's SIGIO support, so host connect-disconnect notifications can be > >>>caught via the signal. > >> > >>I recall looking into this at some point....but don't we get a SIGIO > >>for read/write-ability in general? > > > >I don't get you -- the virtio_console driver emits the SIGIO signal > >only when the host side connects or disconnects. See > > > >http://www.linux-kvm.org/page/Virtio-serial_API > > > >So whenever you receive a SIGIO, poll() in the signal handler for all > >fds of interest and whichever has POLLIN set is writable. Whichever > >has POLLHUP set is not. If you maintain previous state of the fd > >(before signal), you can figure out if something happened on the host > >side. > > > > I tried this on RHEL6+rhn updates but the O_ASYNC flag doesn't seem > to be supported. Has this been backported?
Yes, it has been. > Either way, it seems we can still lose the disconnect event/poll > state change if the host reconnects before the signal is delivered. > So SIGIO in an application would need to be reserved for absolutely > 2 things: a host connect or disconnect (distinguishing between the 2 > may not be so important, we could treat either as the previous > session having been closed). Which limits the application to only > having 1 O_ASYNC FD open at a time. > > But even if we do that, it seems like there might still be a small > window where the application could read/write data intended for the > previous connection before the signal handler is invoked. Not too > sure on that point though. Assuming this isn't the case...it could > work. But what about windows guests? > > >>So you still need some way > >>differentiate, say, readability from a disconnect/EOF, and the > >>read()=0 that could determine this is still racing with host-side > >>reconnects. > > > >>>Also, nonblocking reads/writes will return -EPIPE if the host-side > >>>connection is not up. > >> > >>But we still essentially need to poll() for a host-side disconnected > >>state, which is still racy since they may reconnect before we've > >>done a read/write that would've generated the -EPIPE. It seems like > >>what we really need is for the FD to be invalid from that point > >>forward. > > > >This would go against (or abuse) a chardev interface. It would > >effectively treat a host-side port close as a hot-unplug event. > > > > Well, not a complete hot-unplug. The port would still be there, we'd > just need to re-open it after a read()=0 > > Personally I'm not necessarily advocating we change the default > behavior, but couldn't we support this as a separate mode? > > -device virtserialport,inv_fd_on_host_close=1 > > or something along that line? Yes, that would be reasonable. Maybe even a monitor command to switch the type? > >>Also, I focused more on the guest-side connect/disconnect detection, > >>but as Anthony mentioned I think the host side shares similar > >>limitations as well. AFAIK once we connect to the chardev that FD > >>remains valid until the connected process closes it, and so races > >>with the guest side on detecting connect/disconnect events in a > >>similar manner. For the host side it looks like virtio-console has > >>guest_close/guest_open callbacks already that we could potentially > >>use...seems like it's just a matter of tying them to the chardev... > >>basically having virtio-serial's guest_close() result in a close() > >>on the corresponding chardev connection's FD. > > > >Yes, this could be used. > > > >However, the problem with that will be that the chardev can't be > >opened again (AFAIR) and a new chardev will have to be used. > > > > Hmm...yeah I was thinking more specifically about the socket > chardev, where we can leave the listen_fd alone but close anything > we've accept()'d prior to a guest-side disconnect. But isn't that > enough? That would restrict this to sockets started in server mode. > Just add this option for chardevs where this actually makes > sense? For instance: > > -chardev socket,inv_fd_on_guest_close=1 > > Although, this wouldn't make sense if we're using the chardev for > anything other than virtio-serial...so that flag makes more sense as > a virtio-serial flag....but that virtio-serial flag only makes sense > for particular chardevs... > > I'm not sure what a good way to do this would be honestly...either > would work it seems...but neither seems very intuitive. Maybe a new char interface: qemu_chr_close_connected(vcon->chr). Don't know though. Will have to think about it. > >So if this is done on both the sides, the race will be eliminated but > >the expectation that a chardev port is just a serial port will be > >broken and we'll try to bake in some connection layer on top of it. > >That wasn't the original idea. We could extend this, but a better way > >to achieve this could be a library on either side to abstract these > >details off. > > As far as implementing a library...I tried layering something on top > with previous applications to provide connect/disconnect detection > over virtio-serial. Basically by proxing data to/from forwarding > sockets on either side of the virtio-serial channel, and packetizing > the virtio-serial stream into fixed-sized or length-encoded data > packets and control packets that induced the kind of > invalidate-fd-on-remote-close semantics we're discussing. > > It seems to work ok in practice, but we run into trouble when the > daemons managing the stream go down...mainly due to similar problems > as those we're trying to address with the aforementioned changes. > (for instance...forwarding daemon in guest sends partial packet, > goes down, starts back up, sends control packet...but packet stream > is out of sync now. only way around i've found around this so far is > reserving bytes in the transport as sentinel values we can use to > re-sync the stream..but then we have to do stuff like base64-encode > binary data, which can be expensive both in terms of channel > bandwidth as well as cpu) > > If that's the kind of approach we'd need to take, I think optional > flags like the ones mentioned above, or some other change to the > transport, are still required. Unless there's a better approach to > handling this outside of the actual transport that I'm missing? Let's go with the flags you suggested above. Amit