Yu,
On 07/03/2024 00:30, Philippe Mathieu-Daudé wrote:
> Cc'ing RDMA migration reviewers/maintainers:
>
> $ ./scripts/get_maintainer.pl -f migration/rdma.c
> Li Zhijian (reviewer:RDMA Migration)
> Peter Xu (maintainer:Migration)
> Fabiano Rosas (maintainer:Migration)
>
> On 5/3/24 22:32, Yu
On 28/03/2024 23:01, Peter Xu wrote:
> On Thu, Mar 28, 2024 at 11:18:04AM -0300, Fabiano Rosas wrote:
>> Philippe Mathieu-Daudé writes:
>>
>>> The whole RDMA subsystem was deprecated in commit e9a54265f5
>>> ("hw/rdma: Deprecate the pvrdma device and the rdma subsystem")
>>> released in v8.2.
>>
Phil,
on 3/29/2024 6:28 PM, Philippe Mathieu-Daudé wrote:
>>
>>
>>> IMHO it's more important to know whether there are still users and
>>> whether
>>> they would still like to see it around.
>>
>> Agree.
>> I didn't immediately express my opinion in V1 because I'm also
>> consulting our
>> custo
On 02/04/2024 12:09, fan wrote:
> On Tue, Apr 02, 2024 at 09:46:46AM +0800, Li Zhijian via wrote:
>> It helps to figure out where the first dvsec register is located. In
>> addition, replace offset and size hardcore with existing macros.
>>
>> Signed-off-by: Li Zhijian
>> ---
>> hw/mem/cxl_typ
On 02/04/2024 17:17, Jonathan Cameron wrote:
> On Tue, 2 Apr 2024 09:46:47 +0800
> Li Zhijian wrote:
>
>> After the kernel commit
>> 0cab68720598 ("cxl/pci: Fix disabling memory if DVSEC CXL Range does not
>> match a CFMWS window")
>
> Fixes tag seems appropriate.
>
>> CXL type3 devices can
On 03/04/2024 11:42, Li Zhijian wrote:
>
>
> On 02/04/2024 17:17, Jonathan Cameron wrote:
>> On Tue, 2 Apr 2024 09:46:47 +0800
>> Li Zhijian wrote:
>>
>>> After the kernel commit
>>> 0cab68720598 ("cxl/pci: Fix disabling memory if DVSEC CXL Range does not
>>> match a CFMWS window")
>>
>> Fix
on 4/10/2024 3:46 AM, Peter Xu wrote:
>> Is there document/link about the unittest/CI for migration tests, Why
>> are those tests missing?
>> Is it hard or very special to set up an environment for that? maybe we
>> can help in this regards.
> See tests/qtest/migration-test.c. We put most of ou
On 08/03/2024 14:55, Peter Xu wrote:
> On Fri, Mar 08, 2024 at 07:27:59AM +0100, Yu Zhang wrote:
>> Hello Zhijian and Peter,
>>
>> Thank you so much for testing and confirming it.
>> I created a patch in the email format, unfortunately got an issue for
>> setting up the
>> "Application-specific P
On 17/04/2024 10:44, Li Zhijian wrote:
> bdrv_activate_all() should not be called from the coroutine context, move
> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
> protected.
>
> The backtrace is as follows:
> #4 0x561af7948362 in bdrv_graph_rdlock_main_loop () a
On 17/04/2024 14:13, Philippe Mathieu-Daudé wrote:
> On 17/4/24 04:56, Li Zhijian via wrote:
>> bdrv_activate_all() should not be called from the coroutine context, move
>> it to the QEMU thread colo_process_incoming_thread() with the bql_lock
>> protected.
>>
>> The backtrace is as follows:
>>
On 24/04/2024 12:52, mii wrote:
>
> On 2024/04/24 10:28, Yong Huang wrote:
>>
>>
>> On Tue, Apr 23, 2024 at 9:35 PM Peter Xu wrote:
>>
>> On Tue, Apr 23, 2024 at 09:13:08AM +, Masato Imai wrote:
>> > When the KVM acceleration parameter is not set, executing
>> calc_dirty_rate
>>
ping
On 11/04/2024 18:18, Jonathan Cameron wrote:
> On Tue, 9 Apr 2024 15:58:46 +0800
> Li Zhijian wrote:
>
>> After the kernel commit
>> 0cab68720598 ("cxl/pci: Fix disabling memory if DVSEC CXL Range does not
>> match a CFMWS window")
>> CXL type3 devices cannot be enabled again after the
on 5/7/2024 10:50 AM, Masato Imai wrote:
> When the KVM acceleration parameter is not set, executing calc_dirty_rate
> with the -r or -b option results in a segmentation fault due to accessing
> a null kvm_state pointer in the kvm_dirty_ring_enabled function. This
> commit adds a null check for k
On 25/05/2023 19:49, Jonathan Cameron via wrote:
> On Fri, 19 May 2023 16:58:02 +0800
> Li Zhijian wrote:
>
>> Using the same style except the 'Topo' abbreviation.
>>
>> Signed-off-by: Li Zhijian
>> ---
>> I'm not a native speaker, feel free to correct me.
>
> I've edited slightly and applied
On 05/06/2024 20:02, Jonathan Cameron wrote:
> On Fri, 31 May 2024 14:13:17 +0800
> Li Zhijian wrote:
>
>> There is no user for this member. All '-M cxl-fmw.N' options have
>> been parsed and saved to CXLState.fixed_windows.
>>
>> Signed-off-by: Li Zhijian
>
> Hi Li,
>
> Applied to my tree w
Hi Fan
Do you have a newer instruction to play with the DCD. It seems that
the instruction in RFC[0] doesn't work for current code.
[0] https://lore.kernel.org/all/20230511175609.2091136-1-fan...@samsung.com/
On 19/04/2024 07:10, nifan@gmail.com wrote:
> A git tree of this series can be f
On 19/04/2024 07:11, nifan@gmail.com wrote:
> +} else if (type == DC_EVENT_ADD_CAPACITY) {
> +if (cxl_extents_overlaps_dpa_range(&dcd->dc.extents, dpa, len)) {
> +error_setg(errp,
> + "cannot add DPA already accessible to the same
On 19/04/2024 07:10, nifan@gmail.com wrote:
> From: Fan Ni
>
> +}
> +
> static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp)
> {
> DeviceState *ds = DEVICE(ct3d);
> @@ -635,6 +676,13 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error
> **errp)
> g_free(p
On 19/04/2024 07:10, nifan@gmail.com wrote:
> +uint64_t dc_size;
> +
> +mr = host_memory_backend_get_memory(ct3d->dc.host_dc);
> +dc_size = memory_region_size(mr);
> +region_len = DIV_ROUND_UP(dc_size, ct3d->dc.num_regions);
> +
> +if (dc_size % (ct3d->dc.num_regions * CXL
On 16/05/2024 03:04, Fabiano Rosas wrote:
> Li Zhijian via writes:
>
>> Currently, it always returns 0, no need to check the return value at all.
>> In addition, enter colo coroutine only if migration_incoming_colo_enabled()
>> is true.
>> Once the destination side enters the COLO* state, the C
98911] RBP: 7ffebec70bd0 R08: R09: 7ffebec70640
[ 127.000879] R10: R11: 0246 R12: 00403840
[ 127.003572] R13: R14: R15:
[ 127.005543]
Thanks
Zhijian
On 17/05/2024 01:12, fan wrote:
> On Tu
On 24/05/2024 01:44, nifan@gmail.com wrote:
> From: Fan Ni
>
> With the change, when setting up memory for type3 memory device, we can
> create DC regions.
> A property 'num-dc-regions' is added to ct3_props to allow users to pass the
> number of DC regions to create. To make it easier, oth
On 06/06/2024 19:31, Leon Romanovsky wrote:
> On Wed, Jun 05, 2024 at 10:00:24AM +, Gonglei (Arei) wrote:
>>
>>
>>> -Original Message-
>>> From: Michael S. Tsirkin [mailto:m...@redhat.com]
>>> Sent: Wednesday, June 5, 2024 3:57 PM
>>> To: Gonglei (Arei)
>>> Cc: qemu-devel@nongnu.org;
On 18/06/2024 02:15, Peter Xu wrote:
> The postcopy thread names on dest QEMU are slightly confusing, partly I'll
> need to blame myself on 36f62f11e4 ("migration: Postcopy preemption
> preparation on channel creation"). E.g., "fault-fast" reads like a fast
> version of "fault-default", but it's
On 21/06/2024 15:03, Philippe Mathieu-Daudé wrote:
> Last use of VMSTATE_ARRAY_TEST() was removed in commit 46baa9007f
> ("migration/i386: Remove old non-softfloat 64bit FP support"), we
> can safely get rid of it.
>
> Signed-off-by: Philippe Mathieu-Daudé
Reviewed-by: Li Zhijian
> ---
>
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Just for consistency with qemu_rdma_write_one() and
> qemu_rdma_write_flush(), and for slightly simpler code.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Just for symmetry with qemu_rdma_post_send_control(). Error messages
> lose detail I consider of no use to users.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 26/09/2023 13:50, Li Zhijian wrote:
>
>
> On 18/09/2023 22:41, Markus Armbruster wrote:
>> Functions that use an Error **errp parameter to return errors should
>> not also report them to the user, because reporting is the caller's
>> job. When the caller does, the error is reported twice.
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:42, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:42, Markus Armbruster wrote:
> When qemu_rdma_wait_comp_channel() receives an event from the
> completion channel, it reports an error "receive cm event while wait
> comp channel,cm event is T", where T is the numeric event type.
> However, the function fails only when T is a dis
On 18/09/2023 22:42, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:42, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:42, Markus Armbruster wrote:
> if (local->nb_blocks != nb_dest_blocks) {
> -fprintf(stderr, "ram blocks mismatch (Number of blocks %d vs %d)
> "
> -"Your QEMU command line parameters are probably "
> -"not identical o
On 18/09/2023 22:42, Markus Armbruster wrote:
> qemu_rdma_dump_id() dumps RDMA device details to stdout.
>
> rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
> and qemu_rdma_resolve_host() to show source device details.
> rdma_start_incoming_migration() arranges its call via
On 26/09/2023 14:41, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> Functions that use an Error **errp parameter to return errors should
>>> not also report them to the user, because repo
o Zhijian can see the patches and review when he still has
> the bandwidth.
Feel free to add my Acked tag. thanks.
Acked-by: Li Zhijian
>
> Cc: Daniel P. Berrangé
> Cc: Juan Quintela
> Cc: Markus Armbruster
> Cc: Zhijian Li (Fujitsu)
> Cc: Fabiano Rosas
> Signed-of
On 18/09/2023 22:42, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 18/09/2023 22:41, Markus Armbruster wrote:
> Several functions return negative errno codes on failure. Callers
> check for specific codes exactly never. For some of the functions,
> callers couldn't check even if they wanted to, because the functions
> also return negative values that aren't
On 18/09/2023 22:41, Markus Armbruster wrote:
> This is just to make the error value more obvious. Callers don't
> mind.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 25/09/2023 15:09, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> All we do with the value of RDMAContext member @error_state is test
>>> whether it's zero. Change to bool
On 26/09/2023 19:52, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" writes:
>
>> On 18/09/2023 22:42, Markus Armbruster wrote:
>>> Functions that use an Error **errp parameter to return errors should
>>> not also report them to the user, because repo
On 15/09/2023 04:20, “William Roche wrote:
> From: William Roche
>
> A memory page poisoned from the hypervisor level is no longer readable.
> Thus, it is now treated as a zero-page for the ram saving migration phase.
>
> The migration of a VM will crash Qemu when it tries to read the
> memory
On 15/09/2023 19:31, William Roche wrote:
> On 9/15/23 05:13, Zhijian Li (Fujitsu) wrote:
>>
>>
>> I'm okay with "RDMA isn't touched".
>> BTW, could you share your reproducing program/hacking to poison the page, so
>> that
>> i am able
On 18/09/2023 22:41, Markus Armbruster wrote:
> We use int instead of uint64_t in a few places. Change them to
> uint64_t.
>
> This cleans up a comparison of signed qemu_rdma_block_for_wrid()
> parameter @wrid_requested with unsigned @wr_id. Harmless, because the
> actual arguments are non-neg
On 15/09/2023 19:31, William Roche wrote:
> On 9/15/23 05:13, Zhijian Li (Fujitsu) wrote:
>>
>>
>> I'm okay with "RDMA isn't touched".
>> BTW, could you share your reproducing program/hacking to poison the page, so
>> that
>> i am able
On 20/09/2023 21:01, Fabiano Rosas wrote:
> Li Zhijian writes:
>
>> From: Li Zhijian
>>
>> Previously, we got a confusion error that complains
>> the RDMAControlHeader.repeat:
>> qemu-system-x86_64: rdma: Too many requests in this message
>> (3638950032).Bailing.
>>
>> Actually, it's caused b
Sorry to all, i forgot to update my email address to lizhij...@fujitsu.com.
Corrected it.
On 20/09/2023 17:04, Li Zhijian wrote:
> From: Li Zhijian
>
> Destination will fail with:
> qemu-system-x86_64: rdma: Too many requests in this message
> (3638950032).Bailing.
>
> migrate with RDMA is d
On 21/09/2023 02:31, Markus Armbruster wrote:
> Local variables shadowing other local variables or parameters make the
> code needlessly hard to understand. Tracked down with -Wshadow=local.
> Clean up: delete inner declarations when they are actually redundant,
> else rename variables.
>
> Sig
Perter,
On 20/09/2023 00:49, Peter Xu wrote:
> On Mon, Sep 18, 2023 at 04:41:14PM +0200, Markus Armbruster wrote:
>> Oh dear, where to start. There's so much wrong, and in pretty obvious
>> ways. This code should never have passed review. I'm refraining from
>> saying more; see the commit mess
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_poll()'s return type is uint64_t, even though it returns 0,
> -1, or @ret, which is int. Its callers assign the return value to int
> variables, then check whether it's negative. Unclean.
>
> Return int instead.
>
> Signed-off-by: Mark
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_data_init() return type is void *. It actually returns
> RDMAContext *, and all its callers assign the value to an
> RDMAContext *. Unclean.
>
> Return RDMAContext * instead.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijia
On 18/09/2023 22:41, Markus Armbruster wrote:
> rdma_delete_block() always returns 0, which its only caller ignores.
> Return void instead.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> wrid_desc[] uses 4001 pointers to map four integer values to strings.
>
> print_wrid() accesses wrid_desc[] out of bounds when passed a negative
> argument. It returns null for values 2..1999 and 2001..3999.
>
> qemu_rdma_poll() and qemu_rdma_blo
On 18/09/2023 22:41, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_accept() returns 0 in some cases even when it didn't
> complete its job due to errors. Impact is not obvious. I figure the
> caller will soon fail again with a misleading error message.
>
> Fix it to return -1 on any failure.
>
> Signe
On 18/09/2023 22:41, Markus Armbruster wrote:
> include/qapi/error.h demands:
>
> * - Functions that use Error to report errors have an Error **errp
> * parameter. It should be the last parameter, except for functions
> * taking variable arguments.
>
> qemu_rdma_connect() does not co
On 18/09/2023 22:41, Markus Armbruster wrote:
> When all we do with an Error we receive into a local variable is
> propagating to somewhere else, we can just as well receive it there
> right away.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
> ---
> migration/rdma.c | 19 +
On 18/09/2023 22:41, Markus Armbruster wrote:
> rdma_add_block() can't fail. Return void, and drop the unreachable
> error handling.
>
> Signed-off-by: Markus Armbruster
> ---
> migration/rdma.c | 30 +-
> 1 file changed, 9 insertions(+), 21 deletions(-)
>
[...]
On 20/09/2023 20:46, Fabiano Rosas wrote:
> Li Zhijian writes:
>
>> From: Li Zhijian
>>
>> Destination will fail with:
>> qemu-system-x86_64: rdma: Too many requests in this message
>> (3638950032).Bailing.
>>
>> migrate with RDMA is different from tcp. RDMA has its own control
>> message, an
On 21/09/2023 19:15, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> rdma_add_block() can't fail. Return void, and drop the unreachable
>>> error handling.
>>>
>>> S
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_search_ram_block() can't fail. Return void, and drop the
> unreachable error handling.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_buffer_mergable() is semantically a predicate. It returns
> int 0 or 1. Return bool instead.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
On 18/09/2023 22:41, Markus Armbruster wrote:
> @error_reported and @received_error are flags. The latter is even
> assigned bool true. Change them from int to bool.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
> ---
> migration/rdma.c | 6 +++---
> 1 file changed, 3 ins
On 18/09/2023 22:41, Markus Armbruster wrote:
> Several error messages include numeric error codes returned by failed
> functions:
>
> * ibv_poll_cq() returns an unspecified negative value. Useless.
>
> * rdma_accept and rmda_get_cm_event() return -1. Useless.
s/rmda_get_cm_event/rdma_get_c
On 18/09/2023 22:41, Markus Armbruster wrote:
> QIOChannelClass methods qio_channel_rdma_readv() and
> qio_channel_rdma_writev() violate their method contract when
> rdma->error_state is non-zero:
>
> 1. They return whatever is in rdma->error_state then. Only -1 will be
> fine. -2 will be
On 18/09/2023 22:41, Markus Armbruster wrote:
> Hiding return statements in macros is a bad idea. Use a function
> instead, and open code the return part.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
> ---
> migration/rdma.c | 43 +++---
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_resolve_host() and qemu_rdma_dest_init() try addresses until
> they find on that works. If none works, they return the first Error
> set by qemu_rdma_broken_ipv6_kernel(), or else return a generic one.
>
> qemu_rdma_broken_ipv6_kernel()
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_get_cm_event_timeout() neglects to set an error when it fails
> because rdma_get_cm_event() fails. Harmless, as its caller
> qemu_rdma_connect() substitutes a generic error then. Fix it anyway.
>
> qemu_rdma_connect() also sets the generic e
On 18/09/2023 22:41, Markus Armbruster wrote:
> The QEMUFileHooks methods don't come with a written contract. Digging
> through the code calling them, we find:
>
> * save_page():
I'm fine with this
>
>Negative values RAM_SAVE_CONTROL_DELAYED and
>RAM_SAVE_CONTROL_NOT_SUPP are special
On 18/09/2023 22:41, Markus Armbruster wrote:
> rdma_getaddrinfo() returns 0 on success. On error, it returns one of
> the EAI_ error codes like getaddrinfo() does,
Good catch, It used to be -1 on error, rdma_getaddrinfo(3) updated it 2021.
or -1 with errno set.
> This is broken by design
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_wait_comp_channel() returns 0 on success, and either -1 or
> rdma->error_state on failure. Callers actually expect a negative
> error value.
I don't see the only one callers expect a negative error code.
migration/rdma.c:1654:re
On 18/09/2023 22:41, Markus Armbruster wrote:
> All we do with the value of RDMAContext member @error_state is test
> whether it's zero. Change to bool and rename to @errored.
>
make sense!
Reviewed-by: Li Zhijian
Can we move this patch ahead "[PATCH 23/52] migration/rdma: Clean up
qemu_rd
On 18/09/2023 22:41, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
> ---
> migration/rdma.c | 35 ++-
> 1 file changed, 10 insertions(+), 25 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 85f
On 18/09/2023 22:41, Markus Armbruster wrote:
> When a function returns 0 on success, negative value on error,
> checking for non-zero suffices, but checking for negative is clearer.
> So do that.
>
This patch is no my favor convention.
@Peter, Juan
I'd like to hear your opinions.
Thanks
Zhi
On 18/09/2023 22:41, Markus Armbruster wrote:
> When migration capability @rdma-pin-all is true, but the server cannot
> honor it, qemu_rdma_connect() calls macro ERROR(), then returns
> success.
>
> ERROR() sets an error. Since qemu_rdma_connect() returns success, its
> caller rdma_start_outgo
On 18/09/2023 22:41, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 25/09/2023 14:36, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> The QEMUFileHooks methods don't come with a written contract. Digging
>>> through the code calling them, we fin
On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_resolve_host() and qemu_rdma_dest_init() iterate over
> addresses to find one that works, holding onto the first Error from
> qemu_rdma_broken_ipv6_kernel() for use when no address works. Issues:
>
> 1. If @errp was &error_abort or &erro
On 18/09/2023 22:41, Markus Armbruster wrote:
> ERROR() has become "error_setg() unless an error has been set
> already". Hiding the conditional in the macro is in the way of
> further work. Replace the macro uses by their expansion, and delete
> the macro.
>
> Signed-off-by: Markus Armbruster
On 18/09/2023 22:41, Markus Armbruster wrote:
> These guards are all redundant now.
>
> Signed-off-by: Markus Armbruster
Reviewed-by: Li Zhijian
> ---
> migration/rdma.c | 164 +++
> 1 file changed, 51 insertions(+), 113 deletions(-)
>
> diff
On 22/09/2023 23:21, Peter Xu wrote:
> On Thu, Sep 21, 2023 at 08:27:24AM +0000, Zhijian Li (Fujitsu) wrote:
>> I'm worried that I may not have enough time, ability, or environment to
>> review/test
>> the RDMA patches. but for this patch set, i will take a look late
On 22/09/2023 23:42, Peter Xu wrote:
> On Wed, Sep 20, 2023 at 05:04:11PM +0800, Li Zhijian wrote:
>> From: Li Zhijian
>>
>> Destination will fail with:
>> qemu-system-x86_64: rdma: Too many requests in this message
>> (3638950032).Bailing.
>>
>> migrate with RDMA is different from tcp. RDMA ha
On 28/09/2023 21:19, Markus Armbruster wrote:
> qio_channel_rdma_readv() assigns the size_t value of qemu_rdma_fill()
> to an int variable before it adds it to @done / subtracts it from
> @want, both size_t. Truncation when qemu_rdma_fill() copies more than
> INT_MAX bytes. Seems vanishingly un
On 28/09/2023 21:19, Markus Armbruster wrote:
> qemu_rdma_exchange_get_response() compares int parameter @expecting
> with uint32_t head->type. Actual arguments are non-negative
> enumeration constants, RDMAControlHeader uint32_t member type, or
> qemu_rdma_exchange_recv() int parameter expectin
+rdma-core
Is global variable *errno* reliable when the documentation only states
"returns 0 on success, or the value of errno on failure (which indicates the
failure reason)."
Someone read it as "assign error code to errno and return it.", I used to think
the same way.
but ibv_post_send() doe
On 28/09/2023 21:20, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job. When the caller does, the error is reported twice. When it
> doesn't (because it recovered from the e
On 28/09/2023 21:20, Markus Armbruster wrote:
> error_report() obeys -msg, reports the current error location if any,
> and reports to the current monitor if any. Reporting to stderr
> directly with fprintf() or perror() is wrong, because it loses all
> this.
>
> Fix the offenders. Bonus: reso
On 28/09/2023 21:20, Markus Armbruster wrote:
> qemu_rdma_dump_id() dumps RDMA device details to stdout.
>
> rdma_start_outgoing_migration() calls it via qemu_rdma_source_init()
> and qemu_rdma_resolve_host() to show source device details.
> rdma_start_incoming_migration() arranges its call via
On 28/09/2023 21:19, Markus Armbruster wrote:
> We use errno after calling Libibverbs functions that are not
> documented to set errno (manual page does not mention errno), or where
> the documentation is unclear ("returns [...] the value of errno on
> failure"). While this could be read as "set
1 - 100 of 149 matches
Mail list logo