Avihai Horon <avih...@nvidia.com> wrote:
> On 5/16/2022 2:31 PM, Juan Quintela wrote:
>> External email: Use caution opening links or attachments
>>
>>
>> Avihai Horon <avih...@nvidia.com> wrote:
>>> Add new function qemu_file_get_to_fd() that allows reading data from
>>> QEMUFile and writing it straight into a given fd.
>>>
>>> This will be used later in VFIO migration code.
>>>
>>> Signed-off-by: Avihai Horon <avih...@nvidia.com>
>>> ---
>>>  migration/qemu-file.c | 34 ++++++++++++++++++++++++++++++++++
>>>  migration/qemu-file.h |  1 +
>>>  2 files changed, 35 insertions(+)
>>>
>>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>>> index 1479cddad9..cad3d32eb3 100644
>>> --- a/migration/qemu-file.c
>>> +++ b/migration/qemu-file.c
>>> @@ -867,3 +867,37 @@ QIOChannel *qemu_file_get_ioc(QEMUFile *file)
>>>  {
>>>      return file->has_ioc ? QIO_CHANNEL(file->opaque) : NULL;
>>>  }
>>> +
>>> +/*
>>> + * Read size bytes from QEMUFile f and write them to fd.
>>> + */
>>> +int qemu_file_get_to_fd(QEMUFile *f, int fd, size_t size)
>>> +{
>>> +    while (size) {
>>> +        size_t pending = f->buf_size - f->buf_index;
>>> +        ssize_t rc;
>>> +
>>> +        if (!pending) {
>>> +            rc = qemu_fill_buffer(f);
>>> +            if (rc < 0) {
>>> +                return rc;
>>> +            }
>>> +            if (rc == 0) {
>>> +                return -1;
>>> +            }
>>> +            continue;
>>> +        }
>>> +
>>> +        rc = write(fd, f->buf + f->buf_index, MIN(pending, size));
>>> +        if (rc < 0) {
>>> +            return rc;
>>> +        }
>>> +        if (rc == 0) {
>>> +            return -1;
>>> +        }
>>> +        f->buf_index += rc;
>>> +        size -= rc;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>
>> Is there really a performance difference versus just using:
>>
>> uint8_t buffer[size];
>>
>> qemu_get_buffer(f, buffer, size);
>> write(fd, buffer, size);
>>
>> Or, to put it another way, what sizes are we talking about here?
>
> It depends on the device, but it can range from a few MBs to several
> GBs AFAIK.
A few MB is OK. Several GB on the main migration channel without a
single header or anything? That sounds like a recipe for disaster IMHO.

Remember that on the source side we have a single migration thread, and
this patch will just block it. With 10 Gigabit networking (call it
1 GB/second to keep the math easy), each GB of device state adds a
second of downtime, and during that time migration control is basically
handicapped.

And you are sending this data with qemu_put_buffer_async(). That
function just adds the buffer to an iovec; its only existing user puts
4 KB entries (or whatever your page size is) on that iovec, and you are
talking about adding gigabytes here. I don't know how well that is
going to work, but my guess is that migrate_cancel is not going to be
happy.

On the destination side this is even worse, because we receive this
multi-gigabyte chunk in a coroutine, and I am not sure that it will not
block all of QEMU while that is happening (again, for multiple
seconds).

I have to think more about this problem, but I can't see how this is
going to be able to go through the main migration channel.

Later, Juan.
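
P.S. To make the framing point concrete, here is a minimal sketch of
the kind of bounded, headered chunking I would expect for payloads this
size. This is plain POSIX, not QEMU code; the function names and the
1 MB chunk size are made up for illustration. Each chunk is preceded by
a small size header, so the stream stays parseable, memory use stays
flat, and the sender gets a chance to react to migrate_cancel between
chunks instead of after several gigabytes:

#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE (1024 * 1024) /* 1 MB per chunk: illustrative, not tuned */

/* Write exactly len bytes to fd, retrying on short writes. */
static int write_full(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    while (len) {
        ssize_t n = write(fd, p, len);

        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= n;
    }
    return 0;
}

/* Read exactly len bytes from fd; fail on error or early EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len) {
        ssize_t n = read(fd, p, len);

        if (n <= 0) {
            return -1;
        }
        p += n;
        len -= n;
    }
    return 0;
}

/*
 * Copy 'total' bytes of device state from src_fd to dst_fd as a
 * sequence of (size header, payload) chunks instead of one blob.
 */
static int send_in_chunks(int src_fd, int dst_fd, uint64_t total)
{
    uint8_t *buf = malloc(CHUNK_SIZE);

    if (!buf) {
        return -1;
    }
    while (total) {
        uint64_t chunk = total < CHUNK_SIZE ? total : CHUNK_SIZE;

        if (read_full(src_fd, buf, chunk) < 0 ||
            write_full(dst_fd, &chunk, sizeof(chunk)) < 0 ||
            write_full(dst_fd, buf, chunk) < 0) {
            free(buf);
            return -1;
        }
        total -= chunk;
        /* A real sender would check for cancellation here. */
    }
    free(buf);
    return 0;
}

A real protocol would also pick a fixed endianness for the size header
(the migration stream is big-endian), but the point is the bounded
loop: no single operation moves more than CHUNK_SIZE bytes at a time.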