On 10/15/24 7:35 PM, Stefan Berger wrote:
On 10/15/24 6:02 PM, Fabiano Rosas wrote:
Stefan Berger <stef...@linux.ibm.com> writes:
Yes, I've been using that method to reproduce live migration race
conditions as well. It's quite effective.
If you don't think you'll be able to find the root cause due to the
unreproducibility on your side, maybe we could at least add an assert
that bcount is not larger than rsp_size. I think that would at least
give an explicit error instead of a buffer overflow.
I can also try to dig deeper into this when I get some time. At the
moment I know nothing about the tpm device emulation.
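For illustration, the assert suggested above could look roughly like the
sketch below. This is a standalone example under stated assumptions, not
the actual TPM emulation code: struct resp_state, resp_remaining() and
resp_read() are hypothetical names, and only bcount and rsp_size are taken
from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the device state; not the real QEMU struct. */
struct resp_state {
    uint8_t buf[64];  /* response buffer */
    size_t rsp_size;  /* number of valid response bytes in buf */
    size_t bcount;    /* number of bytes already handed out */
};

/*
 * Number of response bytes still unread. The asserts turn a corrupted
 * cursor into an explicit failure instead of a size_t underflow followed
 * by an out-of-bounds access.
 */
static size_t resp_remaining(const struct resp_state *s)
{
    assert(s->rsp_size <= sizeof(s->buf));
    assert(s->bcount <= s->rsp_size);
    return s->rsp_size - s->bcount;
}

/* Copy up to len unread bytes into dst; returns the number copied. */
static size_t resp_read(struct resp_state *s, uint8_t *dst, size_t len)
{
    size_t n = resp_remaining(s);

    if (len < n) {
        n = len;
    }
    memcpy(dst, s->buf + s->bcount, n);
    s->bcount += n;
    return n;
}
```

With a check like this, a bcount that ends up past rsp_size (for example
after inconsistent state was restored) aborts at the assert rather than
silently reading beyond the buffer.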
The loop has run 3000 times on its own, so that part is stable. However,
there seems to be some other test case that the loop cannot run in
parallel with. So, yes, there is 'something'. ... ... Just having all
CPUs in the system busy is enough to make it necessary to wait for
migration to complete on the dst_qemu side as well. Can you try it with
this patch:
diff --git a/tests/qtest/tpm-tests.c b/tests/qtest/tpm-tests.c
index fb94496bbd..b52cd44841 100644
--- a/tests/qtest/tpm-tests.c
+++ b/tests/qtest/tpm-tests.c
@@ -115,6 +115,7 @@ void tpm_test_swtpm_migration_test(const char *src_tpm_path,
     tpm_util_migrate(src_qemu, uri);
     tpm_util_wait_for_migration_complete(src_qemu);
+    tpm_util_wait_for_migration_complete(dst_qemu);
     tpm_util_pcrread(dst_qemu, tx, tpm_pcrread_resp,
                      sizeof(tpm_pcrread_resp));
Waiting only on the destination side is also possible:
diff --git a/tests/qtest/tpm-tests.c b/tests/qtest/tpm-tests.c
index fb94496bbd..197714f8d9 100644
--- a/tests/qtest/tpm-tests.c
+++ b/tests/qtest/tpm-tests.c
@@ -114,7 +114,7 @@ void tpm_test_swtpm_migration_test(const char *src_tpm_path,
                      sizeof(tpm_pcrread_resp));
     tpm_util_migrate(src_qemu, uri);
-    tpm_util_wait_for_migration_complete(src_qemu);
+    tpm_util_wait_for_migration_complete(dst_qemu);
     tpm_util_pcrread(dst_qemu, tx, tpm_pcrread_resp,
                      sizeof(tpm_pcrread_resp));
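
To show why the side being waited on matters, here is a minimal standalone
sketch. It assumes the wait helper simply polls one QEMU instance until it
reports a "completed" status; wait_for_completed(), struct fake_vm and
fake_status() are hypothetical names for illustration and not the real
tests/qtest helpers.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical status query: returns the current migration status string. */
typedef const char *(*status_fn)(void *opaque);

/*
 * Poll one side until it reports "completed". Returns false on "failed"
 * or when max_polls is exhausted.
 */
static bool wait_for_completed(status_fn query, void *opaque,
                               unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        const char *status = query(opaque);

        if (strcmp(status, "completed") == 0) {
            return true;
        }
        if (strcmp(status, "failed") == 0) {
            return false;
        }
    }
    return false;
}

/* Fake VM that reports "active" for a configurable number of polls. */
struct fake_vm {
    unsigned polls_until_done;
};

static const char *fake_status(void *opaque)
{
    struct fake_vm *vm = opaque;

    if (vm->polls_until_done == 0) {
        return "completed";
    }
    vm->polls_until_done--;
    return "active";
}
```

In this model the source and the destination each have their own status,
so a source that already reports "completed" says nothing about whether
the destination is ready. Commands sent to the destination should
therefore follow a wait on the destination.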