On 05/05/2020 10:00, Laurent Vivier wrote: > On 04/05/2020 11:46, Philippe Mathieu-Daudé wrote: >> In nvme_create_queue_pair() we create a page list using >> qemu_blockalign(), then map it with qemu_vfio_dma_map(): >> >> q->prp_list_pages = qemu_blockalign0(bs, s->page_size * NVME_QUEUE_SIZE); >> r = qemu_vfio_dma_map(s->vfio, q->prp_list_pages, >> s->page_size * NVME_QUEUE_SIZE, ...); >> >> With: >> >> s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF))); >> >> The qemu_vfio_dma_map() documentation says "The caller need >> to make sure the area is aligned to page size". While we use >> multiple s->page_size as alignment, it might be not sufficient >> on some hosts. Use the qemu_real_host_page_size value to be >> sure the host alignment is respected. >> >> Signed-off-by: Philippe Mathieu-Daudé <phi...@redhat.com> >> --- >> Cc: Cédric Le Goater <c...@kaod.org> >> Cc: David Gibson <da...@gibson.dropbear.id.au> >> Cc: Laurent Vivier <lviv...@redhat.com> >> --- >> block/nvme.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/block/nvme.c b/block/nvme.c >> index 7b7c0cc5d6..bde0d28b39 100644 >> --- a/block/nvme.c >> +++ b/block/nvme.c >> @@ -627,7 +627,7 @@ static int nvme_init(BlockDriverState *bs, const char >> *device, int namespace, >> >> s->page_size = MAX(4096, 1 << (12 + ((cap >> 48) & 0xF))); >> s->doorbell_scale = (4 << (((cap >> 32) & 0xF))) / sizeof(uint32_t); >> - bs->bl.opt_mem_alignment = s->page_size; >> + bs->bl.opt_mem_alignment = MAX(qemu_real_host_page_size, s->page_size); > > Why don't you replace directly the "4096" in s->page_size by > qemu_real_host_page_size? > > I think this was the purpose of MAX(4096, ...) to align to the host page > size.
oh, I see, page_size is used for a lot of things than can be broken, and this is what is done in bdrv_opt_mem_align() for instance, 4096 is the sector size not the host page size. BTW, you need the same change as in nvme_init() in nvme_refresh_limits(), I think. Thanks, Laurent