Re: [Qemu-devel] Regression: block: Add .bdrv_co_pwrite_zeroes()

Eric Blake Tue, 05 Jul 2016 08:02:38 -0700

On 07/05/2016 07:37 AM, Paolo Bonzini wrote:
> 
> 
> On 05/07/2016 03:53, Eric Blake wrote:
>>>> I think we cannot assert that these alignments are a power of 2.
>> Perhaps that means we should just fix our code to round things down to
>> the nearest power of 2 (8MB) for the opt_transfer_len and
>> opt_discard_alignment values.  Can you post a stack-trace of the actual
>> assertion you are hitting?
> 
> It doesn't work for opt_discard_alignment.  Neither 8MB nor 16MB is
> aligned to 15MB for example.


The largest power-of-2 alignment that will align with every 15M page is
1M.  Is there a measurable performance difference between doing lots of
1M accesses (14 out of 15 are unaligned, but none of them cross page
boundaries), vs. doing 8M accesses (14 out of 15 are unaligned, and 7
out of 15 cross page boundaries, but there are fewer accesses overall)?
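
To make the arithmetic above concrete, here is a rough sketch (not code
from block/io.c; the helper names and the MiB macro are made up for
illustration) that derives the 1M figure and counts unaligned and
page-crossing requests, assuming requests are issued back to back from
offset 0:

```c
#include <assert.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)   /* illustrative constant, not a QEMU macro */

/* Largest power of 2 that divides n: isolate the lowest set bit. */
static uint64_t largest_pow2_divisor(uint64_t n)
{
    assert(n != 0);
    return n & (~n + 1);
}

/*
 * Count requests of size req, issued back to back from offset 0 over
 * span bytes, whose first and last byte fall in different pages.
 */
static unsigned count_crossing(uint64_t req, uint64_t page, uint64_t span)
{
    unsigned crossing = 0;

    assert(req != 0 && page != 0);
    for (uint64_t start = 0; start + req <= span; start += req) {
        uint64_t last = start + req - 1;
        if (start / page != last / page) {
            crossing++;
        }
    }
    return crossing;
}

/* Count the same requests whose start is not a multiple of page. */
static unsigned count_unaligned(uint64_t req, uint64_t page, uint64_t span)
{
    unsigned unaligned = 0;

    assert(req != 0 && page != 0);
    for (uint64_t start = 0; start + req <= span; start += req) {
        if (start % page != 0) {
            unaligned++;
        }
    }
    return unaligned;
}
```

With a 15M page, largest_pow2_divisor(15 * MiB) gives 1M.  Over one 15M
page, 1M requests give 14 unaligned and 0 crossing; over 120M (the least
common multiple of 8M and 15M), 8M requests give 14 unaligned and 7
crossing.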

The optimal alignments are advisory - they should be a hint that says
that accesses smaller than the alignment may require RMW and are
therefore slower.  I agree that at a certain point, we will definitely
see slowdowns (if we do all 64k accesses, we could probably notice it),
but I'm having a hard time seeing how hardware that advertises a
non-power-of-2 can behave less efficiently for 1M than it would for 8M,
particularly if the smallest addressable block size is indeed smaller
than 1M on that device.

And yes, we could probably switch to (potentially slower) division,
modulo and multiplication (/ % *) instead of bit operations in
block/io.c to accommodate a non-power-of-2 optimal size, but that would
require a careful audit to make sure no other bit-wise operations are
lurking that assume a power of 2.
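
As a sketch of what such an audit would be looking for (the helper names
here are hypothetical, not the actual macros or functions used in
block/io.c), compare a mask-based round-down, which is only valid for a
power-of-2 alignment, with the division-based form that works for any
nonzero alignment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)   /* illustrative constant */

/* True if n is a nonzero power of 2. */
static bool is_pow2(uint64_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Bit-operation form: correct only when align is a power of 2. */
static uint64_t align_down_mask(uint64_t off, uint64_t align)
{
    return off & ~(align - 1);
}

/* Division form: correct for any nonzero align. */
static uint64_t align_down_div(uint64_t off, uint64_t align)
{
    assert(align != 0);
    return off / align * align;
}

/* Round up; assumes off + align - 1 does not overflow uint64_t. */
static uint64_t align_up_div(uint64_t off, uint64_t align)
{
    assert(align != 0);
    return (off + align - 1) / align * align;
}
```

For an 8M alignment both round-down forms agree (20M rounds down to
16M).  For a 15M alignment the division form gives 15M, while the mask
form silently returns 16M, which is not a multiple of 15M at all.  That
is the kind of silent breakage an audit would have to rule out.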

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
