On 11/22/2016 08:03 AM, Kevin Wolf wrote:
> Am 17.11.2016 um 21:13 hat Eric Blake geschrieben:
>> Discard is advisory, so rounding the requests to alignment
>> boundaries is never semantically wrong from the data that
>> the guest sees. But at least the Dell Equallogic iSCSI SANs
>> has an interesting property that its advertised discard
>> alignment is 15M, yet documents that discarding a sequence
>> of 1M slices will eventually result in the 15M page being
>> marked as discarded, and it is possible to observe which
>> pages have been discarded.
>>
>> Rework the block layer discard algorithm to mirror the write
>> zero algorithm: always peel off any unaligned head or tail
>> and manage that in isolation, then do the bulk of the request
>> on an aligned boundary. The fallback when the driver returns
>> -ENOTSUP for an unaligned request is to silently ignore that
>> portion of the discard request; but for devices that can pass
>> the partial request all the way down to hardware, this can
>> result in the hardware coalescing requests and discarding
>> aligned pages after all.
>
> Hm, from the commit message I expected splitting requests in three
> (head, bulk, tail), but actually we can end up splitting it in five?

Correct; so maybe I need to improve the commit message. The goal of the
multiple splits was to make it easier for drivers: they never have to
worry about re-aligning things themselves, because every request they
see is either sub-sector, sub-page but sector-aligned, or page-aligned.

> Just to check whether I got this right, let me try an example: Let's
> assume request alignment 512, pdiscard alignment 64k, and we get a
> discard request with offset 510, length 130k. This algorithm makes the
> following calls to the driver:
>
> 1. pdiscard offset=510,  len=2           | new count = 130k - 2
> 2. pdiscard offset=512,  len=(64k - 512) | new count = 66k + 510
> 3. pdiscard offset=64k,  len=64k         | new count = 2558
> 4. pdiscard offset=128k, len=2048        | new count = 510
> 5. pdiscard offset=130k, len=510         | new count = 0
>
> Correct?

Yes.

> If so, is this really necessary or even helpful? I see that the iscsi
> driver throws away requests 1 and 5 and needs the split because
> otherwise it would disregard the areas covered by 2 and 4, too. But why
> can't or shouldn't the iscsi driver do this rounding itself when it gets
> combined 1+2 and 4+5 requests?

Because then every driver would have to implement the rounding; I thought
that having the rounding code centralized in the block layer was easier
to maintain overall than having every driver reimplement it.
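For reference, here is a stripped-down, stand-alone sketch of the splitting
loop that produces exactly those five calls. It is not the actual block-layer
code: do_pdiscard() is just a placeholder for the driver callback, and error
handling and max_pdiscard clamping are omitted, but it shows how the head and
tail peeling falls out of the two alignments:

/* Stand-alone sketch, NOT the real bdrv_co_pdiscard(): do_pdiscard() is a
 * placeholder for the driver callback; no error handling, no max_pdiscard
 * clamping.
 */
#include <stdio.h>
#include <inttypes.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

static void do_pdiscard(int64_t offset, int64_t len)
{
    printf("pdiscard offset=%" PRId64 ", len=%" PRId64 "\n", offset, len);
}

static void pdiscard_split(int64_t offset, int64_t count,
                           int64_t request_alignment,
                           int64_t pdiscard_alignment)
{
    /* Align to the larger of the two granularities. */
    int64_t align = pdiscard_alignment > request_alignment
                    ? pdiscard_alignment : request_alignment;
    int64_t head = offset % align;
    int64_t tail = (offset + count) % align;

    while (count > 0) {
        int64_t num = count;

        if (head) {
            /* Small requests until we hit an alignment boundary:
             * first sub-sector, then sector-aligned but sub-page. */
            num = MIN(count, align - head);
            if (num % request_alignment) {
                num %= request_alignment;
            }
            head = (head + num) % align;
        } else if (tail) {
            if (num > align) {
                /* Shorten the bulk request to the last aligned page. */
                num -= tail;
            } else if (tail % request_alignment && tail > request_alignment) {
                /* Peel the sector-aligned part of the tail first. */
                num -= tail % request_alignment;
                tail %= request_alignment;
            }
        }

        do_pdiscard(offset, num);
        offset += num;
        count -= num;
    }
}

int main(void)
{
    /* Kevin's example: request alignment 512, pdiscard alignment 64k,
     * discard at offset 510, length 130k -> five driver calls. */
    pdiscard_split(510, 130 * 1024, 512, 64 * 1024);
    return 0;
}

Compiling and running this prints the same five offset/length pairs as in
your trace above.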
--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org