Re: [PATCH] migration/multifd: Document two places for mapped-ram

2024-03-01 Thread Prasad Pandit
Hello Petr, On Fri, 1 Mar 2024 at 14:46, wrote: > + * An explicitly close() on the channel here is normally not explicitly -> explicit > + * required, but can be helpful for "file:" iochannels, where it > + * will include an fdatasync() to make sure the data is flushed t

Re: [PATCH] migration/multifd: Document two places for mapped-ram

2024-03-03 Thread Prasad Pandit
On Mon, 4 Mar 2024 at 06:00, Peter Xu wrote: > Yes I think you're looking at the right path, it's just that the relevant > patches haven't yet landed upstream but is planned. I normally use > "Based-on" tag for such patch that has a dependency outside master, like: > > Based-on: <20240229153017.2

Re: [PATCH] hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations

2024-03-03 Thread Prasad Pandit
On Mon, 4 Mar 2024 at 10:02, Zhao Liu wrote: > diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c > index 25019c91ee36..96533886b14e 100644 > --- a/hw/core/machine-smp.c > +++ b/hw/core/machine-smp.c > @@ -105,8 +105,9 @@ void machine_parse_smp_config(MachineState *ms, > (config->

Re: [PATCH v1 6/6] intel_iommu: Block migration if cap is updated

2024-03-03 Thread Prasad Pandit
On Wed, 28 Feb 2024 at 15:17, Zhenzhong Duan wrote: > When there is VFIO device and vIOMMU cap/ecap is updated based on host * cap/ecap -> capability/extended capability registers are updated ... > IOMMU cap/ecap, migration should be blocked. * It'll help to mention why migration should be bloc

Re: [PATCH] hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations

2024-03-04 Thread Prasad Pandit
Hello Zhao, On Mon, 4 Mar 2024 at 12:19, Zhao Liu wrote: > > unsigned maxcpus = config->has_maxcpus ? config->maxcpus : 0; > > This indicates the default maxcpus is initialized as 0 if user doesn't > specifies it. * 'has_maxcpus' should be set only if maxcpus > 0. If maxcpus == 0, then setting '

Re: [PATCH v1 6/6] intel_iommu: Block migration if cap is updated

2024-03-04 Thread Prasad Pandit
mp;vtd, errp); > if (ret) { > return ret; > } > > Then cap is updated with host cap value tmp_cap. This update only happen > before machine creation done. * After iommufd_device_get_info() ret is != 0. So s->cap won't be updated then. Hope that is intended. * With the above tweaks included: Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH v3 04/26] migration: Always report an error in ram_save_setup()

2024-03-04 Thread Prasad Pandit
On Mon, 4 Mar 2024 at 18:01, Cédric Le Goater wrote: > This will prepare ground for futur changes adding an Error** argument * futur -> future > +ret = qemu_fflush(f); > +if (ret) { * if (ret) -> if (ret < 0) Thank you. --- - Prasad

Re: [PATCH v3 05/26] migration: Add Error** argument to vmstate_save()

2024-03-04 Thread Prasad Pandit
; { > +MigrationState *ms = migrate_get_current(); > +Error *local_err = NULL; > SaveStateEntry *se; > > if (!migration_in_colo_state()) { > @@ -1776,8 +1780,10 @@ int qemu_save_device_state(QEMUFile *f) > if (se->is_ram) { > continue; > } > -ret = vmstate_save(f, se, NULL); > +ret = vmstate_save(f, se, NULL, &local_err); > if (ret) { > +migrate_set_error(ms, local_err); > +error_report_err(local_err); > return ret; > } > } > -- Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH] hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations

2024-03-05 Thread Prasad Pandit
Hi, On Tue, 5 Mar 2024 at 12:59, Zhao Liu wrote: > After simple test, if user sets maxcpus as 0, the has_maxcpus will be > true as well...I think it's related with QAPI code generation logic. * Right. [Maybe we digressed a bit in the discussion, so I snipped much of the details here. Sorry abou

Re: [PATCH] hw/core/machine-smp: Remove deprecated "parameter=0" SMP configurations

2024-03-05 Thread Prasad Pandit
! (Communication is the most difficult skill to master. :)) * If you plan to send a separate patch for above refactoring, then I'd add Reviewed-by for this one. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

[PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-07 Thread Prasad Pandit
From: Prasad Pandit Libaio defines IO_CMD_FDSYNC command to sync all outstanding asynchronous I/O operations, by flushing out file data to the disk storage. Enable linux-aio to submit such aio request. This helps to reduce latency induced via pthread_create calls by thread-pool (aio=threads

Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-07 Thread Prasad Pandit
Hi, On Thu, 7 Mar 2024 at 19:21, Kevin Wolf wrote: > Kernel support for this is "relatively" new, added in 2018. Should we > fall back to the thread pool if we receive -EINVAL? laio_co_submit laio_do_submit ioq_submit io_submit * When an aio operation is not supported by the kernel,

Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-08 Thread Prasad Pandit
Hello Kevin, On Fri, 8 Mar 2024 at 14:35, Kevin Wolf wrote: > Hm... This might make it more challenging because then not only one > specific request fails, but the whole submission batch. * Yes exactly. * I've updated yesterday's patch to return an error (-EINVAL) from ioq_submit to laio_co_subm

Re: [PATCH 03/14] hw/core/machine-smp: Simplify variables' initialization in machine_parse_smp_config()

2024-03-10 Thread Prasad Pandit
Hi, On Fri, 8 Mar 2024 at 20:50, Zhao Liu wrote: > On Fri, Mar 08, 2024 at 02:20:45PM +0100, Thomas Huth wrote: > > Can we always rely on that? ... or is this just by luck due to the current > > implementation? In the latter case, I'd maybe rather drop this patch again. > > Thanks for the correct

[PATCH v2] linux-aio: add IO_CMD_FDSYNC command support

2024-03-10 Thread Prasad Pandit
From: Prasad Pandit Libaio defines IO_CMD_FDSYNC command to sync all outstanding asynchronous I/O operations, by flushing out file data to the disk storage. Enable linux-aio to submit such aio request. This helps to reduce latency induced via pthread_create calls by thread-pool (aio=threads

Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-10 Thread Prasad Pandit
Hello Kevin, On Fri, 8 Mar 2024 at 17:38, Prasad Pandit wrote: > I'm trying to test it against the Fedora-26 kernel, which was < 4.13.0, and > did not support the AIO_FDSYNC call. [PATCH v2] -> https://lists.nongnu.org/archive/html/qemu-devel/2024-03/msg02495.html * I&#

Re: [PATCH v2] linux-aio: add IO_CMD_FDSYNC command support

2024-03-12 Thread Prasad Pandit
Hello, On Tue, 12 Mar 2024 at 15:15, Kevin Wolf wrote: > Am 11.03.2024 um 20:36 hat Stefan Hajnoczi geschrieben: > > > > That can be avoided with a variable that keeps track of whether -EINVAL > > > > was seen before and skips Linux AIO in that > > > > case. > > > > > > > > Fallback should be ve

[PATCH v3] linux-aio: add IO_CMD_FDSYNC command support

2024-03-13 Thread Prasad Pandit
From: Prasad Pandit Libaio defines IO_CMD_FDSYNC command to sync all outstanding asynchronous I/O operations, by flushing out file data to the disk storage. Enable linux-aio to submit such aio request. This helps to reduce latency induced via pthread_create calls by thread-pool (aio=threads

Re: [PATCH 03/14] hw/core/machine-smp: Simplify variables' initialization in machine_parse_smp_config()

2024-03-13 Thread Prasad Pandit
Hello Zhao, > (Communicating with you also helped me to understand the QAPI related parts.) * I'm also visiting QAPI code parts for the first time. Thanks to you. :) On Mon, 11 Mar 2024 at 10:36, Zhao Liu wrote: > SMPConfiguration is created and set in machine_set_smp(). > Firstly, it is create

Re: [PATCH v3] linux-aio: add IO_CMD_FDSYNC command support

2024-03-13 Thread Prasad Pandit
On Wed, 13 Mar 2024 at 20:48, Stefan Hajnoczi wrote: > > +extern bool laio_has_fdsync(int); > Please declare this in include/block/raw-aio.h alongside the other laio APIs. > > FDSYNC support should be probed at open() time and the result should be > stored in a new bool field like s->laio_supports

[PATCH v4] linux-aio: add IO_CMD_FDSYNC command support

2024-03-14 Thread Prasad Pandit
From: Prasad Pandit Libaio defines IO_CMD_FDSYNC command to sync all outstanding asynchronous I/O operations, by flushing out file data to the disk storage. Enable linux-aio to submit such aio request. This helps to reduce latency induced via pthread_create calls by thread-pool (aio=threads

[PATCH] file-posix: rearrange BDRVRawState fields

2024-03-14 Thread Prasad Pandit
From: Prasad Pandit Rearrange BRDVRawState structure fields to avoid memory fragments in its object's memory and save some(~8) bytes per object. Signed-off-by: Prasad Pandit --- block/file-posix.c | 39 +++ 1 file changed, 19 insertions(+), 20 dele

Re: [PATCH 03/14] hw/core/machine-smp: Simplify variables' initialization in machine_parse_smp_config()

2024-03-18 Thread Prasad Pandit
Hello, On Mon, 18 Mar 2024 at 13:23, Zhao Liu wrote: > Indeed, as you say, these items are initialized to 0 by default. > > I think, however, that the initialization is so far away from where the > smp is currently parsed that one can't easily confirm it (thanks for > your confirmation!). > > Fro

Re: [PATCH v2 1/2] net: Provide MemReentrancyGuard * to qemu_new_nic()

2024-04-24 Thread Prasad Pandit
On Wednesday, 24 April, 2024 at 03:36:01 pm IST, Philippe Mathieu-Daudé wrote: >On 1/6/23 05:18, Akihiko Odaki wrote: >> Recently MemReentrancyGuard was added to DeviceState to record that the >> device is engaging in I/O. The network device backend needs to update it >> when delivering a packet to

[PATCH v5] linux-aio: add IO_CMD_FDSYNC command support

2024-04-25 Thread Prasad Pandit
From: Prasad Pandit Libaio defines IO_CMD_FDSYNC command to sync all outstanding asynchronous I/O operations, by flushing out file data to the disk storage. Enable linux-aio to submit such aio request. When using aio=native without fdsync() support, QEMU creates pthreads, and destroying these

Re: [PATCH] vhost: release memory objects in error path

2023-05-25 Thread Prasad Pandit
Hello Peter, all On Thu, 25 May 2023 at 18:33, Peter Xu wrote: > IIRC this bug used to only reproduce on rt kernels, is it still the case? > * Yes, it's a same crash. > Here besides doing correct unregister, does it also mean that even if > event_notifier_init() failed there's totally no erro

[RFC-PATCH v1 1/2] vhost: fail device start if iotlb update fails

2024-08-08 Thread Prasad Pandit
From: Prasad Pandit While starting a vhost device, updating iotlb entries via 'vhost_device_iotlb_miss' may return an error. qemu-kvm: vhost_device_iotlb_miss: 700871,700871: Fail to update device iotlb Fail device start when such an error occurs. Signed-off-by: Prasad Pandi

[RFC-PATCH v1 2/2] vhost-user: add a request-reply lock

2024-08-08 Thread Prasad Pandit
From: Prasad Pandit QEMU threads use vhost_user_write/read calls to send and receive request/reply messages from a vhost-user device. When multiple threads communicate with the same vhost-user device, they can receive each other's messages, resulting in an erroneous state. When fault_t

[RFC-PATCH v1 0/2] Postcopy migration and vhost-user errors

2024-08-08 Thread Prasad Pandit
From: Prasad Pandit Hello, * virsh(1) offers multiple options to initiate Postcopy migration: 1) virsh migrate --postcopy --postcopy-after-precopy 2) virsh migrate --postcopy + virsh migrate-postcopy 3) virsh migrate --postcopy --timeout --timeout-postcopy When Postcopy migration

[RFC-PATCH v2] vhost-user: add a request-reply lock

2024-08-19 Thread Prasad Pandit
From: Prasad Pandit QEMU threads use vhost_user_write/read calls to send and receive request/reply messages from a vhost-user device. When multiple threads communicate with the same vhost-user device, they can receive each other's messages, resulting in an erroneous state. When fault_t

Re: [PATCH v8 3/5] migration: Add migration parameters for QATzip

2024-08-21 Thread Prasad Pandit
;uint64', > @@ -1171,6 +1183,11 @@ > # speed, and 9 means best compression ratio which will consume > # more CPU. Defaults to 1. (Since 5.0) > # > +# @multifd-qatzip-level: Set the compression level to be used in live > +# migration. The level is an integer between 1 and 9, where 1 means > +# the best compression speed, and 9 means the best compression > +# ratio which will consume more CPU. Defaults to 1. (Since 9.2) > +# > # @multifd-zstd-level: Set the compression level to be used in live > # migration, the compression level is an integer between 0 and 20, > # where 0 means no compression, 1 means the best compression > @@ -1241,6 +1258,7 @@ > '*max-cpu-throttle': 'uint8', > '*multifd-compression': 'MultiFDCompression', > '*multifd-zlib-level': 'uint8', > +'*multifd-qatzip-level': 'uint8', > '*multifd-zstd-level': 'uint8', > '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ], > '*x-vcpu-dirty-limit-period': { 'type': 'uint64', > -- > Yichen Wang 'multifd-qatzip-level' related changes look okay. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [External] Re: [PATCH v8 3/5] migration: Add migration parameters for QATzip

2024-08-22 Thread Prasad Pandit
Hi, On Thu, 22 Aug 2024 at 02:13, Yichen Wang wrote: > After discussing with Intel folks, I decided to align to the existing > QPL behavior. In QPL, the code path of compression will always go > through regardless. When acceleration hardware is initialized > properly, use it. If failed, fallback

Re: [PATCH v8 4/5] migration: Introduce 'qatzip' compression method

2024-08-22 Thread Prasad Pandit
Hello, On Tue, 20 Aug 2024 at 22:40, Yichen Wang wrote: > +static int qatzip_send_setup(MultiFDSendParams *p, Error **errp) > +{ > +QatzipData *q; > +QzSessionParamsDeflate_T params; > +const char *err_msg; > +int ret; > + > +q = g_new0(QatzipData, 1); > +p->compress_data

Re: [PATCH v2 1/4] tests/qtest/migration-test: Use regular file file for shared-memory tests

2024-05-31 Thread Prasad Pandit
e the 'tmpfs' variable needs to be renamed to indicate that it uses '/tmp/' or '/var/tmp' directory to create temporary files. And it is not in memory tmpfs(5) used for shared memory '/dev/shm'. Commit message above says 'tmpfs mount is too small' and above c

Re: [PATCH v3 3/4] tests/qtest/migration-test: Enable on ppc64 TCG

2024-05-31 Thread Prasad Pandit
rrupt controller not being migrated or > + * reconstructed post-migration. Disable it until the problem is > resolved. > */ > if (g_str_equal(arch, "s390x") && !has_kvm) { > g_test_message("Skipping tests: s390x host with KVM is required"); > -- > 2.43.0 Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH v3 2/4] tests/qtest/migration-test: Quieten ppc64 QEMU warnings

2024-05-31 Thread Prasad Pandit
On Thu, 30 May 2024 at 13:17, Nicholas Piggin wrote: > > Reviewed-by: Thomas Huth > Signed-off-by: Nicholas Piggin > --- * No commit log message? --- - Prasad

Re: [PATCH 2/2] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-05-24 Thread Prasad Pandit
Hello Hyman, * Is this the same patch series as sent before..? -> https://lists.nongnu.org/archive/html/qemu-devel/2024-04/msg00816.html On Fri, 24 May 2024 at 12:02, Hyman Huang wrote: > For VMs configured with the USB CDROM device: > > -drive file=/path/to/local/file,id=drive-usb-disk0,media

Re: [PATCH v2 01/18] migration: Fix file migration with fdset

2024-05-24 Thread Prasad Pandit
); > +object_unref(OBJECT(fioc)); > +return; > +} > + * Should 'offset' be checked for > zero while ftruncating? Else it'll be same as O_TRUNC. Otherwise it looks fine. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH v2 01/18] migration: Fix file migration with fdset

2024-05-24 Thread Prasad Pandit
On Fri, 24 May 2024 at 18:00, Fabiano Rosas wrote: > That's the point. If offset==0 we truncate all the way, if not, we truncate > to the offset. * Yes, I was wondering if the migration file has some data, but still 'offset' ends up being zero(0). If that's unlikely to happen, then we are good.

Re: [PATCH 2/2] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-05-24 Thread Prasad Pandit
Hi, On Fri, 24 May 2024 at 16:23, Yong Huang wrote: > I'm not testing the latest QEMU version while theoretically it is > reproducible, I'll check it and give a conclusion. * Yes, that'll help. Thank you. > I'm not sure this usage is common but in our production environment, it is > used. * I

Re: [RFC PATCH 2/3] tests/qtest/migration-test: enable on s390x

2024-05-26 Thread Prasad Pandit
Hi, On Sat, 25 May 2024 at 18:44, Nicholas Piggin wrote: > s390x is more stable now. Enable it. > > Signed-off-by: Nicholas Piggin > --- > tests/qtest/migration-test.c | 12 > 1 file changed, 12 deletions(-) > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test

Re: [RFC PATCH 2/3] tests/qtest/migration-test: enable on s390x

2024-05-27 Thread Prasad Pandit
> Yes they should be called "enable for TCG" indeed. > Ack with commit message changes: Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [External] Re: [PATCH v8 4/5] migration: Introduce 'qatzip' compression method

2024-08-26 Thread Prasad Pandit
er thread in both normal and error paths. > So I revised the patch to this behavior, otherwise I will run into > double free in the error path. > * I see, okay, in that case: Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH v4 15/16] migration/multifd: Register nocomp ops dynamically

2024-08-26 Thread Prasad Pandit
end_cleanup, > +.send_prepare = multifd_nocomp_send_prepare, > +.recv_setup = multifd_nocomp_recv_setup, > +.recv_cleanup = multifd_nocomp_recv_cleanup, > +.recv = multifd_nocomp_recv > +}; > + > +static void multifd_nocomp_register(void) > +{ > +multifd_register_ops(MULTIFD_COMPRESSION_NONE, &multifd_nocomp_ops); > +} > + > +migration_init(multifd_nocomp_register); Looks okay. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: Issue with QEMU Live Migration

2024-08-26 Thread Prasad Pandit
On Sun, 25 Aug 2024 at 22:40, Arisetty, Chakri wrote: > > - start the mirror job > > - qmp_migrate > > - once PRE_SWITCHOVER is reached, issue block-job-cancel > > - qmp_migrate_continue > > We use exact same steps to do live migration. I repeated the test now > > Sure, as you suggested to rule ou

Re: [RFC-PATCH v2] vhost-user: add a request-reply lock

2024-08-27 Thread Prasad Pandit
Hello Michael, all Sorry about a late reply, catching up with emails after long PTOs. On Mon, 19 Aug 2024 at 21:20, Michael S. Tsirkin wrote: >> makes sense. >> Acked-by: Michael S. Tsirkin >> But do not post v2 as reply to v1 pls. * Yes, okay. Thank you for the review. I sent in reply to v1

Re: [PATCH v3] vhost-user: Do not wait for reply for not sent VHOST_USER_SET_LOG_BASE

2024-08-27 Thread Prasad Pandit
On Thu, 1 Aug 2024 at 20:32, BillXiang wrote: > > From: "Michael S. Tsirkin" > > How do things work now after 7c211eb078c4 then? > > It does not really work after 7c211eb078c4 and it's my mistake. > I forgot to submit the code to check vq_index in 7c211eb078c4. > * vhost_user_set_log_base() sends

Re: [PATCH v3] vhost-user: Do not wait for reply for not sent VHOST_USER_SET_LOG_BASE

2024-08-27 Thread Prasad Pandit
On Tue, 27 Aug 2024 at 16:50, BillXiang wrote: > it's better to be consistent to use vhost_user_per_device_request for those > per-device messages, right? * ...consistent to use? Could you please elaborate a little? Thank you. --- - Prasad

[PATCH v2 0/2] Postcopy migration and vhost-user errors

2024-08-28 Thread Prasad Pandit
From: Prasad Pandit Hello, * virsh(1) offers multiple options to initiate Postcopy migration: 1) virsh migrate --postcopy --postcopy-after-precopy 2) virsh migrate --postcopy + virsh migrate-postcopy 3) virsh migrate --postcopy --timeout --timeout-postcopy When Postcopy migration

[PATCH v2 2/2] vhost-user: add a request-reply lock

2024-08-28 Thread Prasad Pandit
From: Prasad Pandit QEMU threads use vhost_user_write/read calls to send and receive request/reply messages from a vhost-user device. When multiple threads communicate with the same vhost-user device, they can receive each other's messages, resulting in an erroneous state. When fault_t

[PATCH v2 1/2] vhost: fail device start if iotlb update fails

2024-08-28 Thread Prasad Pandit
From: Prasad Pandit While starting a vhost device, updating iotlb entries via 'vhost_device_iotlb_miss' may return an error. qemu-kvm: vhost_device_iotlb_miss: 700871,700871: Fail to update device iotlb Fail device start when such an error occurs. Signed-off-by: Prasad Pandi

Re: [PATCH v2 2/2] vhost-user: add a request-reply lock

2024-08-28 Thread Prasad Pandit
On Wed, 28 Aug 2024 at 16:45, Michael S. Tsirkin wrote: > > - Place QEMU_LOCK_GUARD near the vhost_user_write() calls, holding > >the lock for longer fails some tests during rpmbuild(8). > > what do you mean fails rpmbuild? that qemu with this patch can not be > compiled? * In V1 of this pa

Re: [PATCH v2 2/2] vhost-user: add a request-reply lock

2024-08-29 Thread Prasad Pandit
Hello Michael, On Thu, 29 Aug 2024 at 13:12, Michael S. Tsirkin wrote: > Weird. Seems to indicate some kind of deadlock? * Such a deadlock should occur across all environments I guess, not sure why it happens selectively. It is strange. > So maybe vhost_user_postcopy_end should take the BQL? =

Re: [PATCH v9 3/5] migration: Add migration parameters for QATzip

2024-09-04 Thread Prasad Pandit
e CPU. Defaults to 1. (Since 9.2) > +# * Should the default compression level be at the median of 1 - 9 => 5 or 6 ? A compression method (QATzip) choosing speed (1) over compression as default seems contradictory. Otherwise: Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH v2 0/2] Postcopy migration and vhost-user errors

2024-09-11 Thread Prasad Pandit
Hello Michael, On Tue, 10 Sept 2024 at 22:40, Michael S. Tsirkin wrote: > So are we going to see a version with BQL? === diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index cdf9af4a4b..96ac0ed93b 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -2079,6 +2079,7 @

[PATCH 0/2] Postcopy migration and vhost-user errors

2024-07-11 Thread Prasad Pandit
From: Prasad Pandit Hello, * virsh(1) offers multiple options to initiate Postcopy migration: 1) virsh migrate --postcopy --postcopy-after-precopy 2) virsh migrate --postcopy + virsh migrate-postcopy 3) virsh migrate --postcopy --timeout --timeout-postcopy When Postcopy migration

[PATCH 1/2] vhost-user: add a write-read lock

2024-07-11 Thread Prasad Pandit
From: Prasad Pandit QEMU threads use vhost_user_write/read calls to send and receive messages from a vhost-user device. When multiple threads communicate with the same vhost-user device, they can receive each other's messages, resulting in an erroneous state. vhost_user_read_h

[PATCH 2/2] vhost: fail device start if iotlb update fails

2024-07-11 Thread Prasad Pandit
From: Prasad Pandit While starting a vhost device, updating iotlb entries via 'vhost_device_iotlb_miss' may return an error. qemu-kvm: vhost_device_iotlb_miss: 700871,700871: Fail to update device iotlb Fail device start when such an error occurs. Signed-off-by: Prasad Pandi

Re: [PATCH 1/2] vhost-user: add a write-read lock

2024-07-15 Thread Prasad Pandit
On Thu, 11 Jul 2024 at 21:12, Peter Xu wrote: > I apologize if I suggested WITH_QEMU_LOCK_GUARD when we talked.. I don't > remember which one I suggested, but in this case IIUC it'll be much easier > to review if you use the other sister function QEMU_LOCK_GUARD() > instead.. That should make the

Re: [PATCH 0/2] Postcopy migration and vhost-user errors

2024-07-15 Thread Prasad Pandit
On Thu, 11 Jul 2024 at 21:08, Peter Xu wrote: > Hmm, I thought it was one of the vcpu threads that invoked > vhost_dev_start(), rather than any migration thread? [QEMU=vhost-user-front-end] <===> [QEMU=vhost-user-front-end] ^

Re: [PATCH 1/2] vhost-user: add a write-read lock

2024-07-15 Thread Prasad Pandit
On Thu, 11 Jul 2024 at 20:11, Michael S. Tsirkin wrote: > Could you supply a Fixes tag here? What commit introduced the race? 'postcopy_end' message was added by: -> https://github.com/qemu/qemu/commit/46343570c06e63b4499f619011df80f91349cd49 Not sure if its race condition also began with it.

Re: [PATCH 0/2] Postcopy migration and vhost-user errors

2024-07-16 Thread Prasad Pandit
Hello Peter, On Mon, 15 Jul 2024 at 19:10, Peter Xu wrote: > IMHO it's better we debug and fix all the issues before merging this one, > otherwise we may overlook something. * Well we don't know where the issue is, not sure where the fix may go in, ex. if the issue turns out to be how virsh(1) i

Re: [PATCH 1/2] vhost-user: add a write-read lock

2024-07-16 Thread Prasad Pandit
On Mon, 15 Jul 2024 at 18:57, Peter Xu wrote: > I think it shouldn't be a major deal in most cases, if the extended cycles > only cover a bunch of instructions. In special case we can still use > WITH_QEMU_LOCK_GUARD, but I'd start with the simple first and only switch > if necessary. * Okay, wil

Re: [PATCH 1/2] vhost-user: add a write-read lock

2024-07-22 Thread Prasad Pandit
On Sun, 21 Jul 2024 at 01:11, Michael S. Tsirkin wrote: > So it's not a rw lock. It's just a mutex. > Lock should be named after what they protect, not > after where they are held. > In this case, this ensures only 1 request is > outstanding at a time. > So vhost_user_request_reply_lock Okay, wil

Re: [PATCH 0/2] Postcopy migration and vhost-user errors

2024-07-22 Thread Prasad Pandit
On Sun, 21 Jul 2024 at 01:11, Michael S. Tsirkin wrote: > > > > On Wed, Jul 17, 2024 at 04:55:52AM -0400, Michael S. Tsirkin wrote: > > > > > I just want to understand how we managed to have two threads > > > > > talking in parallel. BQL is normally enough, which path > > > > > manages to invoke v

Re: [PATCH 0/2] Postcopy migration and vhost-user errors

2024-07-23 Thread Prasad Pandit
Hi, On Tue, 23 Jul 2024 at 10:33, Prasad Pandit wrote: > On Sun, 21 Jul 2024 at 01:11, Michael S. Tsirkin wrote: > > > > > On Wed, Jul 17, 2024 at 04:55:52AM -0400, Michael S. Tsirkin wrote: > > > > > > I just want to understand how we managed to have two thr

Re: [PATCH v3 1/5] tests/migration: Move the guestperf tool to scripts directory

2024-10-21 Thread Prasad Pandit
perf/scenario.py (100%) > rename {tests => scripts}/migration/guestperf/shell.py (100%) > rename {tests => scripts}/migration/guestperf/timings.py (100%) > Looks okay. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [PATCH] migration: Deprecate query-migrationthreads command

2024-10-21 Thread Prasad Pandit
x27;''''''''''''''''''''''''''''''''''''''''''''''' > > diff --git a/qapi/migration.json b/qapi/migration.json > index 3af6aa1740..a71a9f0cd3 100644 > --- a/qapi/migration.json > +++ b/qapi/migration.json > @@ -2284,12 +2284,16 @@ > # > # Returns information of migration threads > # > +# Features: > +# @deprecated: This command is deprecated with no replacement yet. > +# > # Returns: @MigrationThreadInfo > # > # Since: 7.2 > ## > { 'command': 'query-migrationthreads', > - 'returns': ['MigrationThreadInfo'] } > + 'returns': ['MigrationThreadInfo'], > + 'features': ['deprecated'] } > Sounds reasonable. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

[PATCH 1/5] migration/multifd: move macros to multifd header

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit Move MULTIFD_ macros to the header file so that they are accessible from other files. Signed-off-by: Prasad Pandit --- migration/multifd.c | 4 migration/multifd.h | 5 + 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/multifd.c b

[PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit During migration, the precopy and the multifd channels send magic value (4-bytes) to the destination side, for it to identify the channel and properly establish connection. But Postcopy channel did not send such value. Introduce a magic value to be sent on the postcopy

[PATCH 5/5] migration: enable multifd and postcopy together

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks magic value sent on each channel and helps to properly setup multifd and postcopy channels. Idea is to take advantage of the multifd threads to accelerate transfer of large

[PATCH 4/5] migration: refactor ram_save_target_page functions

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it avoids reinitialization of the 'migration_ops' object, when migration moves from multifd to postcopy phase. Signed-off-by: Prasad Pandit --- migration/

[PATCH 3/5] migration: remove multifd check with postcopy

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit Remove multifd capability check with Postcopy mode. This helps to enable both multifd and postcopy together. Update migrate_multifd() to return false when migration reaches Postcopy phase. In Postcopy phase, source guest is paused, so the migration threads on the source stop

[PATCH 0/5] Allow to enable multifd and postcopy migration together

2024-10-29 Thread Prasad Pandit
From: Prasad Pandit Hello, * Currently Multifd and Postcopy migration can not be used together. QEMU shows "Postcopy is not yet compatible with multifd" message. When migrating guests with large (100's GB) RAM, Multifd threads help to accelerate migration, but inability

Re: [PATCH 3/5] migration: remove multifd check with postcopy

2024-11-04 Thread Prasad Pandit
On Fri, 1 Nov 2024 at 20:06, Peter Xu wrote: > > -return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; > > +return s->capabilities[MIGRATION_CAPABILITY_MULTIFD] > > +&& !migration_in_postcopy(); > > } > > We need to keep this as-is.. I'm afraid. > You can always do proper che

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-04 Thread Prasad Pandit
On Fri, 1 Nov 2024 at 20:04, Peter Xu wrote: > As we discussed internally, we can't do this unconditionally. We at least > some compat properties. * ie. we define a new compat property to check if postcopy sends a magic value or not? > Or we need to see whether Fabiano's handshake can > simpli

Re: [PATCH 4/5] migration: refactor ram_save_target_page functions

2024-11-04 Thread Prasad Pandit
On Fri, 1 Nov 2024 at 20:09, Peter Xu wrote: > > +if (migrate_multifd()) { > > +RAMBlock *block = pss->block; > > +/* > > + * While using multifd live migration, we still need to handle zero > > + * page checking on the migration main thread. > > + */ >

Re: [PATCH 5/5] migration: enable multifd and postcopy together

2024-11-05 Thread Prasad Pandit
On Mon, 4 Nov 2024 at 23:19, Peter Xu wrote: > Precopy keeps sending data even during postcopy, that's the background > stream (with/without preempt feature enabled). May need some amendment > when repost here. * Okay. > > +if (channel == CH_POSTCOPY) { > > +return false; > > +}

[PATCH] vhost: fail device start if iotlb update fails

2024-11-04 Thread Prasad Pandit
From: Prasad Pandit While starting a vhost device, updating iotlb entries via 'vhost_device_iotlb_miss' may return an error. qemu-kvm: vhost_device_iotlb_miss: 700871,700871: Fail to update device iotlb Fail device start when such an error occurs. Signed-off-by: Prasad Pandi

Re: [PATCH 3/5] migration: remove multifd check with postcopy

2024-11-05 Thread Prasad Pandit
On Mon, 4 Nov 2024 at 22:23, Peter Xu wrote: > We definitely need a helper like this to simply detect what the user chose > on the feature. > > You can still introduce a new helper, e.g. migrate_multifd_precopy(), if > that simplifies the code. Okay. Thank you. --- - Prasad

Re: [PATCH 4/5] migration: refactor ram_save_target_page functions

2024-11-05 Thread Prasad Pandit
On Mon, 4 Nov 2024 at 22:30, Peter Xu wrote: > Yes, IMHO it's better when merged. > > One more note here, that even with ZERO_PAGE_DETECTION_MULTIFD, qemu will > fallback to use LEGACY in reality when !multifd before. We need to keep > that behavior. * Where does this fallback happen? in ram_sav

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-05 Thread Prasad Pandit
On Mon, 4 Nov 2024 at 22:48, Peter Xu wrote: > Firstly, we'll need a way to tell mgmt that the new qemu binary supports > enablement of both multifd + postcopy feature. That can be done with a > > "features": [ "postcopy-with-multifd-precopy" ] > > Flag attached to the QMP "migrate" command. *

Re: [PATCH] tests/qtest: fix heap-use-after-free

2024-11-11 Thread Prasad Pandit
On Mon, 11 Nov 2024 at 14:37, Dmitry Frolov wrote: > "int main(int argc, char **argv, char** envp)" is non-standart > Microsoft`s extention of the C language and it`s not portable. > In my particular case (Debian 13, clang-16) this raises wild-pointer > dereference with ASAN message "heap-use-afte

Re: [PATCH v1] vhost: fail device start if iotlb update fails

2024-11-11 Thread Prasad Pandit
Hello Jason, On Mon, 11 Nov 2024 at 07:08, Jason Wang wrote: > > While starting a vhost device, updating iotlb entries > > via 'vhost_device_iotlb_miss' may return an error. > > > > qemu-kvm: vhost_device_iotlb_miss: > > 700871,700871: Fail to update device iotlb > > Actually, such updating

Re: [PATCH] tests/qtest: fix heap-use-after-free

2024-11-11 Thread Prasad Pandit
On Mon, 11 Nov 2024 at 17:41, Дмитрий Фролов wrote: > Above loop dereferences the pointer env, which is pointing to > the memory area, which is not allowed to read. * Not allowed to read environment variables? Is it because Debian/clang does not support the '**envp' parameter? Is '**envp' set to

Re: [PATCH] docs/system/bootindex: Make it clear that s390x can also boot from virtio-net

2024-11-11 Thread Prasad Pandit
l try up to > -8 total devices, any number of which may be disks. > +8 total devices, any number of which may be disks or virtio-net devices. Looks okay. Reviewed-by: Prasad Pandit Thank you. --- - Prasad

Re: [sdl-qemu] [PATCH] tests/qtest: fix heap-use-after-free

2024-11-11 Thread Prasad Pandit
Hi, On Mon, 11 Nov 2024 at 22:51, Alexey Khoroshilov wrote: > On 11.11.2024 16:35, Дмитрий Фролов wrote: > Not allowed to read the exact memory area, because it is marked as freed. > > As far as I understand, heap-use-after-free means a situation when code > allocates memory then frees it and th

Re: [PATCH] tests/qtest: fix non portable env varibles access

2024-11-11 Thread Prasad Pandit
On Tue, 12 Nov 2024 at 12:08, Dmitry Frolov wrote: > "int main(int argc, char **argv, char** envp)" is non-standart standart -> standard > Microsoft`s extention of the C language and it`s not portable. * But it looks widely supported. > In my particular case (Debian 13, clang-16) this raises w

Re: [PATCH v2] target/riscv: Add Tenstorrent Ascalon CPU

2024-11-13 Thread Prasad Pandit
On Wed, 13 Nov 2024 at 16:36, Anton Blanchard wrote: > Add a CPU entry for the Tenstorrent Ascalon CPU, a series of 2 wide to > 8 wide RV64 cores. More details can be found at > https://tenstorrent.com/ip/tt-ascalon > > Signed-off-by: Anton Blanchard > --- > target/riscv/cpu-qom.h | 1 + > targ

Re: [PATCH] tests/qtest: fix non portable env varibles access

2024-11-12 Thread Prasad Pandit
Hi, On Tue, 12 Nov 2024 at 14:45, Дмитрий Фролов wrote: > It looks like this is a clang optimization issue. > > When environ is absent (not mentioned in the source code) > The value of envp is also 0x51400040 (reproducible), > but the behavior may be each time different. > Mostly test fails w

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-07 Thread Prasad Pandit
On Wed, 6 Nov 2024 at 18:41, Fabiano Rosas wrote: > What we're thinking is having an initial exchange of information between > src & dst as soon as migration starts and that would sync the > capabilities and parameters between both sides. Which would then be > followed by a channel establishment p

Re: [PATCH] vhost: fail device start if iotlb update fails

2024-11-05 Thread Prasad Pandit
On Tue, 5 Nov 2024 at 16:19, Stefano Garzarella wrote: > VHOST_OPS_DEBUG() is usually used in the error path when calling a > `dev->vhost_ops` callback. In addition vhost_device_iotlb_miss() is > already reporting error through error_report() in the error path, so I > think we can remove this line

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-06 Thread Prasad Pandit
On Tue, 5 Nov 2024 at 18:30, Peter Xu wrote: > https://www.qemu.org/docs/master/devel/qapi-code-gen.html > > Sometimes, the behaviour of QEMU changes compatibly, but without a > change in the QMP syntax (usually by allowing values or operations > that previously resulted in

Re: [PATCH] vhost: fail device start if iotlb update fails

2024-11-06 Thread Prasad Pandit
On Wed, 6 Nov 2024 at 14:21, Stefano Garzarella wrote: > I think we should call that functions in the reverse order, so just add them > in the error path, as we already do for other calls. === diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index a70b7422b5..f168e1f02a 100644 --- a/hw/virtio/v

Re: [PATCH] vhost: fail device start if iotlb update fails

2024-11-06 Thread Prasad Pandit
On Wed, 6 Nov 2024 at 16:05, Stefano Garzarella wrote: > >+fail_iotlb: > >+hdev->vhost_ops->vhost_set_iotlb_callback(hdev, false); > > fail_start: > >+hdev->vhost_ops->vhost_dev_start(hdev, false); > > This should go before the fail_start label, since it should not be > called when vhost_d

[PATCH v1] vhost: fail device start if iotlb update fails

2024-11-07 Thread Prasad Pandit
From: Prasad Pandit While starting a vhost device, updating iotlb entries via 'vhost_device_iotlb_miss' may return an error. qemu-kvm: vhost_device_iotlb_miss: 700871,700871: Fail to update device iotlb Fail device start when such an error occurs. Signed-off-by: Prasad Pandi

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-07 Thread Prasad Pandit
On Wed, 6 Nov 2024 at 21:30, Peter Xu wrote: > IIUC we can't simply change it to this. We can do this with a compat > property and we can start sending a magic in the preempt channel, but then > the code still needs to keep working with old binaries of QEMU, so in all > cases we'll need to keep t

Re: [PATCH] vhost: fail device start if iotlb update fails

2024-11-06 Thread Prasad Pandit
On Wed, 6 Nov 2024 at 17:31, Stefano Garzarella wrote: > For vhost_set_iotlb_callback() that is true because for now we go to > that label only if the callback is defined, but this is not the case for > hdev->vhost_ops->vhost_dev_start(). > > Anyway if in the future we add a new step that need to

Re: [PATCH 2/5] migration/postcopy: magic value for postcopy channel

2024-11-08 Thread Prasad Pandit
Hello Peter, Dan, Fabiano, Thank you so much for the detailed comments, I appreciate it. On Thu, 7 Nov 2024 at 17:41, Fabiano Rosas wrote: > The handshake will be a QEMU-only feature. Libvirt will then only start > the migration on src and QEMU will do the capabilities handling. > On Thu, 7 Nov

[PATCH v2 0/3] Allow to enable multifd and postcopy migration together

2024-11-29 Thread Prasad Pandit
From: Prasad Pandit Hello, * Currently Multifd and Postcopy migration can not be used together. QEMU shows "Postcopy is not yet compatible with multifd" message. When migrating guests with large (100's GB) RAM, Multifd threads help to accelerate migration, but inability

  1   2   3   >