On 20/02/2025 23:55, Peter Xu wrote: > On Thu, Feb 20, 2025 at 05:40:38PM +0800, Li Zhijian wrote: >> On 19/02/2025 22:11, Peter Xu wrote: >>>>>>> then >>>>>>> in the test it tries to detect rdma link and fetch the ip only >>>>>> It should work without root permission if we just*detect* and*fetch ip*. >>>>>> >>>>>> Do you also mean we can split new-rdma-link.sh to 2 separate scripts >>>>>> - add-rdma-link.sh # optionally, execute by user before the test >>>>>> (require root permission) >>>>>> - detect-fetch-rdma.sh # execute from the migration-test >>>>> Hmm indeed we still need a script to scan over all the ports.. >>>>> >>>>> If having --rdma is a good idea, maybe we can further make it a parameter >>>>> to --rdma? >>>>> >>>>> $ migration-test --rdma $RDMA_IP >>>>> >>>>> Or: >>>>> >>>>> $ migration-test --rdma-ip $RDMA_IP >>>> I think --rdma only makes sense if it's going to do something >>>> special. The optmimal scenario is that it always runs the test when it >>>> can and sets up/tears down anything it needs. >>>> >>>> If it needs root, I'd prefer the test informs about this and does the >>>> work itself. >>>> >>>> It would also be good to have the add + detect separate so we have more >>>> flexibility, maybe we manage to enable this in CI even. >>>> >>>> So: >>>> >>>> ./add.sh >>>> migration-test >>>> (runs detect.sh + runs rdma test) >>>> (leaves stuff behind) >>>> >>>> migration-test >>>> (skips rdma test with message that it needs root) >>>> >>>> sudo migration-test >>>> (runs add.sh + detect.sh + runs rdma test) >>>> (cleans itself up) >>>> >>>> Does that make sense to you? I hope it's not too much work. >>> Looks good here. We can also keep all the rdma stuff into one file, taking >>> parameters. >>> >>> ./rdma-helper.sh setup >>> ./rdma-helper.sh detect-ip >> >> Hi Peter and Fabiano >> >> Many thanks for your kindly idea and suggestion. >> Please take another look at the changes below. >> - I don't copy script to the build dir, just execute the script like >> misc-tests.c >> - It will automatically create a new RXE if it doesn't exit when running in >> root > > Thanks! This is much better. Comments below. > >> >> [PATCH] migration: Add qtest for migration over RDMA >> >> This qtest requires there is RDMA(RoCE) link in the host. >> Introduce a scripts/rdma-migration-helper.sh to >> - setup a new RXE if it's root >> - detect existing RoCE link >> to make the qtest work smoothly. >> >> Test will be skip if there is no available RoCE link. >> # Start of rdma tests >> # Running /x86_64/migration/precopy/rdma/plain >> ok 1 /x86_64/migration/precopy/rdma/plain # SKIP There is no available >> rdma link in the host. >> Maybe you are not running with the root permission >> # End of rdma tests >> >> Admin is able to remove the RXE by passing 'cleanup' to this script. >> >> Signed-off-by: Li Zhijian <lizhij...@fujitsu.com> >> --- >> scripts/rdma-migration-helper.sh | 40 +++++++++++++++++++ >> tests/qtest/migration/precopy-tests.c | 57 +++++++++++++++++++++++++++ >> 2 files changed, 97 insertions(+) >> create mode 100755 scripts/rdma-migration-helper.sh >> >> diff --git a/scripts/rdma-migration-helper.sh >> b/scripts/rdma-migration-helper.sh >> new file mode 100755 >> index 0000000000..4ef62baf0f >> --- /dev/null >> +++ b/scripts/rdma-migration-helper.sh >> @@ -0,0 +1,40 @@ >> +#!/bin/bash >> + >> +# Copied from blktests >> +get_ipv4_addr() { >> + ip -4 -o addr show dev "$1" | >> + sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p' | >> + tr -d '\n' >> +} >> + >> +has_soft_rdma() { >> + rdma link | grep -q " netdev $1[[:blank:]]*\$" >> +} >> + >> +rdma_rxe_setup_detect() >> +{ >> + ( >> + cd /sys/class/net && >> + for i in *; do >> + [ -e "$i" ] || continue >> + [ "$i" = "lo" ] && continue >> + [ "$(<"$i/addr_len")" = 6 ] || continue >> + [ "$(<"$i/carrier")" = 1 ] || continue >> + >> + has_soft_rdma "$i" && break >> + [ "$operation" = "setup" ] && rdma link add "${i}_rxe" type >> rxe netdev "$i" && break >> + done >> + has_soft_rdma "$i" || return >> + get_ipv4_addr $i >> + ) >> +} >> + >> +operation=${1:-setup} >> + >> +if [ "$operation" == "setup" ] || [ "$operation" == "detect" ]; then >> + rdma_rxe_setup_detect >> +elif [ "$operation" == "cleanup" ]; then >> + modprobe -r rdma_rxe > > Would this affects host when there's existing user of rdma_rxe (or fail > with -EBUSY)?
'modprobe -r ' will fail with EBUSY in this case. > If there's no major side effect of leftover rdma link, we > could also drop the "cleanup" for now and start from simple. In my understanding, there is no any side effect, I'm fine to drop it in this time. > >> +else >> + echo "Usage: $0 [setup | cleanup | detect]" >> +fi >> diff --git a/tests/qtest/migration/precopy-tests.c >> b/tests/qtest/migration/precopy-tests.c >> index ba273d10b9..8c72eb699b 100644 >> --- a/tests/qtest/migration/precopy-tests.c >> +++ b/tests/qtest/migration/precopy-tests.c >> @@ -99,6 +99,59 @@ static void test_precopy_unix_dirty_ring(void) >> test_precopy_common(&args); >> } >> >> +#ifdef CONFIG_RDMA >> + >> +#define RDMA_MIGRATION_HELPER "scripts/rdma-migration-helper.sh" >> +static int new_rdma_link(char *buffer) { > > Nit: newline before "{". Good catch! > >> + const char *argument = (geteuid() == 0) ? "setup" : "detect"; >> + char command[1024]; >> + >> + snprintf(command, sizeof(command), "%s %s", RDMA_MIGRATION_HELPER, >> argument); >> + >> + FILE *pipe = popen(command, "r"); >> + if (pipe == NULL) { >> + perror("Failed to run script"); >> + return -1; >> + } >> + >> + int idx = 0; >> + while (fgets(buffer + idx, 128 - idx, pipe) != NULL) { >> + idx += strlen(buffer); >> + } >> + >> + int status = pclose(pipe); >> + if (status == -1) { >> + perror("Error reported by pclose()"); >> + return -1; >> + } else if (WIFEXITED(status)) { >> + return WEXITSTATUS(status); >> + } >> + >> + return -1; >> +} >> + >> +static void test_precopy_rdma_plain(void) >> +{ >> + char buffer[128] = {}; >> + >> + if (new_rdma_link(buffer)) { >> + g_test_skip("There is no available rdma link in the host.\n" >> + "Maybe you are not running with the root permission"); > > Nit: can be slightly more verbose? > > g_test_skip("\nThere is no available rdma link to run RDMA > migration test. To enable the test:\n" > "(1) Run \'%s setup\' with root and rerun the test\n" > "(2) Run the test with root privilege\n", > RDMA_MIGRATION_HELPER); Yes, it's much better. Thanks Zhijian > > >> + return; >> + } >> + >> + /* FIXME: query a free port instead of hard code. */ >> + g_autofree char *uri = g_strdup_printf("rdma:%s:7777", buffer); >> + >> + MigrateCommon args = { >> + .listen_uri = uri, >> + .connect_uri = uri, >> + }; >> + >> + test_precopy_common(&args); >> +} >> +#endif >> + >> static void test_precopy_tcp_plain(void) >> { >> MigrateCommon args = { >> @@ -1124,6 +1177,10 @@ static void >> migration_test_add_precopy_smoke(MigrationTestEnv *env) >> test_multifd_tcp_uri_none); >> migration_test_add("/migration/multifd/tcp/plain/cancel", >> test_multifd_tcp_cancel); >> +#ifdef CONFIG_RDMA >> + migration_test_add("/migration/precopy/rdma/plain", >> + test_precopy_rdma_plain); >> +#endif >> } >> >> void migration_test_add_precopy(MigrationTestEnv *env) >> -- >> 2.41.0 >> >