On 03.09.21 21:14, Peter Xu wrote:
On Fri, Sep 03, 2021 at 12:07:20PM +0200, David Hildenbrand wrote:
On 03.09.21 10:47, David Hildenbrand wrote:
On 03.09.21 00:32, Peter Xu wrote:
On Thu, Sep 02, 2021 at 03:14:30PM +0200, David Hildenbrand wrote:
diff --git a/migration/migration.c b/migration/migration.c
index bb909781b7..ae97c2c461 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -391,7 +391,7 @@ int
migrate_send_rp_message_req_pages(MigrationIncomingState *mis,
int migrate_send_rp_req_pages(MigrationIncomingState *mis,
RAMBlock *rb, ram_addr_t start, uint64_t
haddr)
{
- void *aligned = (void *)(uintptr_t)(haddr & (-qemu_ram_pagesize(rb)));
+ void *aligned = (void *)QEMU_ALIGN_DOWN(haddr, qemu_ram_pagesize(rb));
Is uintptr_t still needed? I thought it would generate a warning otherwise but
not sure.
It doesn't in my setup, but maybe it will on 32bit archs ...
I discussed this with Phil in
https://lkml.kernel.org/r/2c8d80ad-f171-7d5f-3235-92f02fa17...@redhat.com
Maybe
QEMU_ALIGN_PTR_DOWN((void *)haddr, qemu_ram_pagesize(rb)));
Is really what we want.
... but it would suffer the same issue I think. I just ran it trough the
gitlab pipeline, including "i386-fedora-cross-compile" ... and it seems to
compile just fine, which is weird, because I'd also expect
"warning: cast to pointer from integer of different size
[-Wint-to-pointer-cast]"
We most certainly need the "(void *)(uintptr_t)" to convert from u64 to a
pointer.
Let's just do it cleanly:
void *unaligned = (void *)(uintptr_t)haddr;
void *aligned = QEMU_ALIGN_PTR_DOWN(unaligned, qemu_ram_pagesize(rb));
Thoughts?
---8<---
$ cat a.c
#include <stdio.h>
#include <time.h>
#include <assert.h>
#define ROUND_DOWN(n, d) ((n) & -(0 ? (n) : (d)))
#define QEMU_ALIGN_DOWN(n, m) ((n) / (m) * (m))
unsigned long getns(void)
{
struct timespec tp;
clock_gettime(CLOCK_MONOTONIC, &tp);
return tp.tv_sec * 1000000000 + tp.tv_nsec;
}
void main(void)
{
int i;
unsigned long start, end, v1 = 0x1234567890, v2 = 0x1000;
start = getns();
for (i = 0; i < 1000000; i++) {
v1 = ROUND_DOWN(v1, v2);
}
end = getns();
printf("ROUND_DOWN took: \t%ld (us)\n", (end - start) / 1000);
start = getns();
for (i = 0; i < 1000000; i++) {
v1 = QEMU_ALIGN_DOWN(v1, v2);
}
end = getns();
printf("QEMU_ALIGN_DOWN took: \t%ld (us)\n", (end - start) / 1000);
}
$ make a
$ ./a
ROUND_DOWN took: 1445 (us)
QEMU_ALIGN_DOWN took: 9684 (us)
---8<---
So it's ~5 times slower here on the laptop, even if not very stable. Agree
it's not a big deal. :)
Same results for me, especially even if I turn v1 and v2 into global volatiles,
make sure the results won't get optimized out and compile with -03.
It's just that since we know it's still faster, I then second:
(uinptr_t)ROUND_DOWN(...);
Well okay then,
void *aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, qemu_ram_pagesize(rb));
fits precisely into a single line :)
--
Thanks,
David / dhildenb