On Tue, Feb 04, 2025 at 03:08:02PM +0000, Daniel P. Berrangé wrote: > On Mon, Feb 03, 2025 at 01:20:01PM -0500, Peter Xu wrote: > > On Thu, Jan 30, 2025 at 11:08:29AM +0100, Maciej S. Szmigiero wrote: > > > From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com> > > > > > > Multifd send channels are terminated by calling > > > qio_channel_shutdown(QIO_CHANNEL_SHUTDOWN_BOTH) in > > > multifd_send_terminate_threads(), which in the TLS case essentially > > > calls shutdown(SHUT_RDWR) on the underlying raw socket. > > > > > > Unfortunately, this does not terminate the TLS session properly and > > > the receive side sees this as a GNUTLS_E_PREMATURE_TERMINATION error. > > > > > > The only reason why this wasn't causing migration failures is because > > > the current migration code apparently does not check for migration > > > error being set after the end of the multifd receive process. > > > > > > However, this will change soon so the multifd receive code has to be > > > prepared to not return an error on such premature TLS session EOF. > > > Use the newly introduced QIOChannelTLS method for that. > > > > > > It's worth noting that even if the sender were to be changed to terminate > > > the TLS connection properly the receive side still needs to remain > > > compatible with older QEMU bit stream which does not do this. > > > > If this is an existing bug, we could add a Fixes. > > > > Two pure questions.. > > > > - What is the correct way to terminate the TLS session without this flag? > > > > - Why this is only needed by multifd sessions? > > Graceful TLS termination (via gnutls_bye()) should only be important to > security if the QEMU protocol in question does not know how much data it > is expecting to recieve. ie it cannot otherwise distinguish between an > expected EOF, and a premature EOF triggered by an attacker. > > If the migration protocol has sufficient info to know when a chanel is > expected to see EOF, then we should stop trying to read from the TLS > channel before seeing the underlying EOF. > > Ignoring GNUTLS_E_PREMATURE_TERMINATION would be valid if we know that > migration will still fail corretly in the case of a malicious attack > causing premature termination. > > If there's a risk that migration may succeed, but with incomplete data, > then we would need the full gnutls_bye dance.
IIUC that's not required for migration then, because migration should know exactly how much data to receive, and migration should need to verify that and fail if the received data didn't match the expectation along the way. We also have QEMU_VM_EOF as the end mark of stream. Said that, are we sure any pre-mature termination will only happen after all data read in the receive buffer that was sent? To ask in another way: what happens if the source QEMU sends everything and shutdown()/close() the channel, meanwhile the dest QEMU sees both (1) rest data to read, and (2) a pre-mature terminatino of TLS session in a read() syscall. Would (2) be reported even before (1), or the order guaranteed that read of the residue data in (1) always happen before (2) (considering dest QEMU can be slow sometime on consuming the network buffers)? Thanks, -- Peter Xu