Re: [RFC PATCH] migration/rdma: Remove qemu_rdma_broken_ipv6_kernel

2025-03-26 Thread Michael Galaxy
Excellent find. Thank you very much for checking on the history. Hopefully my comments were not too hard to read. =) FYI: I've since left Akamai last year and now work at Nvidia. Reviewed-by: Michael Galaxy On 3/26/25 04:52, Jack Wang wrote: I hit following error which testing migrati

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-10-23 Thread Michael Galaxy
d. Do share if you have any problems with it, like if it has compatibility issues or if we need a different solution. We're open to change. I'm not familiar with the "current state" of this or how well it would even work. - Michael On Fri, Oct 4, 2024 at 4:06 PM Michael Gal

Re: [PATCH 1/3] scsi: fetch unit attention when creating the request

2024-10-23 Thread Michael Galaxy
, Michael Galaxy wrote: Thanks for your help. - Michael On 10/9/24 11:28, Paolo Bonzini wrote: Yes, it looks like an easy backport. Adding Michael Tokarev and qemu-stable. Paolo On Wed, Oct 9, 2024 at 6:03 PM Michael Galaxy wrote: Hi All, We have stumbled upon this bug in our production

Re: [PATCH 1/3] scsi: fetch unit attention when creating the request

2024-10-09 Thread Michael Galaxy
Yes, it looks like an easy backport. Adding Michael Tokarev and qemu-stable. Paolo On Wed, Oct 9, 2024 at 6:03 PM Michael Galaxy wrote: Hi All, We have stumbled upon this bug in our production systems on QEMU 7.2.x. This is a pretty nasty bug because it has the effect of causing

Re: [PATCH 1/3] scsi: fetch unit attention when creating the request

2024-10-09 Thread Michael Galaxy
we kindly ask to pull this identical patch for 7.2.15? Last year, it just went to master and landed in 8.0.50. We're planning to upgrade, but it will be quite some time before we get around to that, and I suspect others are also running 7.2.x in production. - Michael Galaxy On 7/12/23

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-10-07 Thread Michael Galaxy
you have any problems with it, like if it has compatibility issues or if we need a different solution. We're open to change. I'm not familiar with the "current state" of this or how well it would even work. - Michael On Fri, Oct 4, 2024 at 4:06 PM Michael Galaxy wrote

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-10-04 Thread Michael Galaxy
04:26:27PM -0500, Michael Galaxy wrote: What about the testing solution that I mentioned? Does that satisfy your concerns? Or is there still a gap here that needs to be met? I think such testing framework would be helpful, especially if we can kick it off in CI when preparing pull requests, then

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-10-03 Thread Michael Galaxy
On 9/30/24 14:47, Peter Xu wrote: On Mon, Sep 30, 2024 at 07

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-09-30 Thread Michael Galaxy
29, 2024 at 03:26:58PM -0500, Michael Galaxy wrote: On 9/29/24 13:14, Michael S. Tsirkin wrote:

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-09-29 Thread Michael Galaxy
28, 2024 at 12:52:08PM -0500, Michael Galaxy wrote: A bounce buffer defeats the entire purpose of using RDMA in these cases. When using RDMA for very large transfers like this, the goal here is to map the entire memory region at once and avoid all CPU interactions (except for message management

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-09-28 Thread Michael Galaxy
On 9/27/24 16:45, Sean Hefty wrote: I have met with the tea

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-09-27 Thread Michael Galaxy
, -Original Message- From: Michael Galaxy [mailto:mgal...@akamai.com] Sent: Monday, September 23, 2024 3:29 AM To: Michael S. Tsirkin ; Peter Xu Cc: Gonglei (Arei) ; qemu-devel@nongnu.org; yu.zh...@ionos.com; elmar.ger...@ionos.com; zhengchuan ; berra...@redhat.com; arm...@redhat.com; lizhij

Re: [PATCH 0/6] refactor RDMA live migration based on rsocket API

2024-09-22 Thread Michael Galaxy
Hi All, I have met with the team from IONOS about their testing on actual IB hardware here at KVM Forum today and the requirements are starting to make more sense to me. I didn't say much in our previous thread because I misunderstood the requirements, so let me try to explain and see if we'r

Re: CPR/liveupdate: test results using prior bug fix

2024-05-17 Thread Michael Galaxy
OK, acknowledged. Thanks, All. - Michael On 5/16/24 13:07, Steven Sistare wrote: On 5/16/2024 1:24 PM, Michael Galaxy wrote: On 5/14/24 08:54, Michael Tokarev wrote: On 5/14/24 16:39, Michael Galaxy wrote: Steve, OK, so it does not look like this bugfix you wrote was included in 8.2.4

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-16 Thread Michael Galaxy
These are very compelling results, no? (40gbps cards, right? Are the cards active/active? or active/standby?) - Michael On 5/14/24 10:19, Yu Zhang wrote: Hello Peter and all, I did a comparison of the VM live-migration speeds between RDMA and TCP/IP on our servers and plotted the results to g

Re: CPR/liveupdate: test results using prior bug fix

2024-05-16 Thread Michael Galaxy
On 5/14/24 08:54, Michael Tokarev wrote: On 5/14/24 16:39, Michael Galaxy wrote: Steve, OK, so it does not look like this bugfix you wrote was included in 8.2.4 (which was released yesterday). Unfortunately, that means that anyone using CPR in that release will still (eventually) encounter

Re: CPR/liveupdate: test results using prior bug fix

2024-05-14 Thread Michael Galaxy
rhaps, the relevant commits for a possible 8.2.5 ? - Michael On 5/13/24 20:15, Michael Galaxy wrote: Hi Steve, Thanks for the response. It looks like literally *just today* 8.2.4 was released. I'll go check it out. - Michael On 5/13/24 15:10, Steven Sistare wrote: Hi Michael,   N

Re: CPR/liveupdate: test results using prior bug fix

2024-05-13 Thread Michael Galaxy
y are all symptoms of "the possibility of ram and device state being out of sync" as mentioned in the commit. I am not familiar with the process for maintaining old releases for qemu. Perhaps someone on this list can comment on 8.2.3. - Steve On 5/13/2024 2:22 PM, Michael Galaxy wrote:

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-13 Thread Michael Galaxy
: On Tue, May 07, 2024 at 01:50:43AM +, Gonglei (Arei) wrote: Hello, -Original Message- From: Peter Xu [mailto:pet...@redhat.com] Sent: Monday, May 6, 2024 11:18 PM To: Gonglei (Arei) Cc: Daniel P. Berrangé ; Markus Armbruster ; Michael Galaxy ; Yu Zhang ; Zhijian Li (Fujitsu) ; Jinpu

CPR/liveupdate: test results using prior bug fix

2024-05-13 Thread Michael Galaxy
Hi Steve, We found that this specific change in particular ("migration: stop vm for cpr") fixes a bug that we've identified in testing back-to-back live updates in a lab environment. More specifically, *without* this change (which is not available in 8.2.2, but *is* available in 9.0.0) cause

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-02 Thread Michael Galaxy
Yu Zhang / Jinpu, Any possibility (at your leisure, and within the disclosure rules of your company, IONOS) if you could share any of your performance information to educate the group? NICs have indeed changed, but not everybody has 100ge mellanox cards at their disposal. Some people don't.

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Michael Galaxy
Reviewed-by: Michael Galaxy Thanks Yu Zhang and Peter. - Michael On 4/29/24 15:45, Yu Zhang wrote: Hello Michael and Peter, We are very glad at your quick and kind reply about our plan to take over the maintenance of your code. The message is for presenting our plan and working together. If

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-04-29 Thread Michael Galaxy
Hi All (and Peter), My name is Michael Galaxy (formerly Hines). Yes, I changed my last name (highly irregular for a male) and yes, that's my real last name: https://www.linkedin.com/in/mrgalaxy/ I'm the original author of the RDMA implementation. I've been discussing with

Re: [PATCH V3] migration: simplify notifiers

2023-07-13 Thread Michael Galaxy
the global notifier list in a new function migration_call_notifiers, and make it externally visible so future live update code can call it. Tested-by: Michael Galaxy Reviewed-by: Michael Galaxy No functional change. Signed-off-by: Steve Sistare --- hw/net/virtio-net.c | 6 +++--- hw

Re: [PATCH V4 0/2] migration file URI

2023-07-13 Thread Michael Galaxy
Tested-by: Michael Galaxy Reviewed-by: Michael Galaxy On 6/30/23 09:25, Steve Sistare wrote: Add the migration URI "file:filename[,offset=offset]". Fabiano Rosas has submitted the unit tests in the series migration: Test the new "file:" migration Steve Sistare (2):

Re: [PATCH V4] migration: simplify blockers

2023-07-13 Thread Michael Galaxy
multiple modes. No functional change. Tested-by: Michael Galaxy Reviewed-by: Michael Galaxy Signed-off-by: Steve Sistare --- backends/tpm/tpm_emulator.c | 10 ++ block/parallels.c| 6 ++ block/qcow.c | 6 ++ block/vdi.c | 6

Re: [PATCH V9 00/46] Live Update

2023-07-13 Thread Michael Galaxy
 Good morning, On 7/10/23 10:10, Steven Sistare wrote: On 6/12/2023 10:59 AM, Michael Galaxy wrote: Hi Steve, On 6/7/23 12:37, Steven Sistare wrote: On 6/7/2023 11:55 AM, Michael Galaxy wrote: Another option could be to expose "-migrate-mode-disable" (instead of enable) and just

Re: [PATCH V9 00/46] Live Update

2023-06-12 Thread Michael Galaxy
Hi Steve, On 6/7/23 12:37, Steven Sistare wrote: On 6/7/2023 11:55 AM, Michael Galaxy wrote: Another option could be to expose "-migrate-mode-disable" (instead of enable) and just enable all 3 modes by default, since we are already required to switch from "normal" mo

Re: [PATCH V9 00/46] Live Update

2023-06-07 Thread Michael Galaxy
ve the capability to completely prevent a running QEMU from using these modes before the VM starts up. - Michael On 6/6/23 17:15, Michael Galaxy wrote: Hi Steve, In the current design you have, we have to specify both the command line parameter "-migrate-mode-enable cpr-reboot" *

Re: [PATCH V9 00/46] Live Update

2023-06-06 Thread Michael Galaxy
Hi Steve, In the current design you have, we have to specify both the command line parameter "-migrate-mode-enable cpr-reboot" *and* issue the monitor command "migrate_set_parameter mode cpr-${mode}". Is it possible to opt-in to the CPR mode just once over the monitor instead of having to spe

Re: [PATCH V9 00/46] Live Update

2023-04-14 Thread Michael Galaxy
lity questions and we were able to fix those issues. We will continue our testing throughout the year with more heavily-loaded workloads, but all in all we would very much be interested in seeing further reviews on this patch series from others. Tested-by: Michael Galaxy On 12/7/2

Re: [PATCH V9 00/46] Live Update

2023-04-07 Thread Michael Galaxy
Hey Steven, Have you done any "back-to-back" live update testing before? I am still doing extensive testing on this myself. I am running into a bug that I have not yet diagnosed. It involves the following: 1. Perform a live update (I'm using kexec + PMEM-based live updates). => VM comes bac