Looking at the code, is it possible that it is not the guest causing trouble here, but the multiwrite_merge code?
From what I see, the only limit it applies when merging requests is the
number of IOVs. Any thoughts? Mine are:

a) Introducing bs->bl.max_request_size and setting merge = 0 if the
   result would be too big. Default the max request size to 32768
   sectors (see below).

b) Hardcoding the limit in multiwrite_merge for now, limiting the
   merged size to 16MB (32768 sectors). This is the limit we already
   use in bdrv_co_discard and bdrv_co_write_zeroes if we don't know
   better.

A rough sketch of (b) follows below the quoted thread.

Peter

On 02.09.2014 17:28, ronnie sahlberg wrote:
> That is one big request. I assume the device reports "no limit" in
> the VPD page, so we cannot state it is the guest/application going
> beyond the allowed limit?
>
> I am not entirely sure what meaning the target assigns to Protocol
> Error here. It could be that ~100M is way higher than MaxBurstLength?
> What is the MaxBurstLength that was reported by the server during
> login negotiation?
> If so, we should make libiscsi check the MaxBurstLength and fail the
> request early. We would still fail the I/O, so it will not really
> solve much, but at least we would not send the request to the server.
>
> Best would probably be to take the smaller of a non-zero
> Block-Limits.max_transfer_length and iscsi-MaxBurstLength/block-size
> and pass this back to the guest in the emulated Block-Limits VPD.
> At least then you have tried to tell the guest "never do SCSI I/O
> bigger than this".
>
> I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
> no limit to QEMU, QEMU should probably take the iSCSI transport limit
> into account and pass it on to the guest by scaling the emulated
> BlockLimits page to the maximum that MaxBurstLength allows.
>
> Then if BTRFS or SG_IO in the guest ignores the BlockLimits, it is
> clearly a guest problem.
>
> (A different interpretation of Protocol Error could be a mismatch
> between the iSCSI expected data transfer length and the SCSI transfer
> length, but that should result in residuals, not a protocol error.)
>
> Hypothetically there could be targets that support really huge
> MaxBurstLengths > 32MB. For those you probably want to switch to
> WRITE16 when the SCSI transfer length goes > 0xffff:
>
> - if (iscsilun->use_16_for_rw) {
> + if (iscsilun->use_16_for_rw || num_sectors > 0xffff) {
>
> regards
> ronnie sahlberg
>
> On Mon, Sep 1, 2014 at 8:21 AM, Peter Lieven <p...@kamp.de> wrote:
>> On 17.06.2014 13:46, Paolo Bonzini wrote:
>>> On 17/06/2014 13:37, Peter Lieven wrote:
>>>> On 17.06.2014 13:15, Paolo Bonzini wrote:
>>>>> On 17/06/2014 08:14, Peter Lieven wrote:
>>>>>> BTW, while debugging a case with a bigger storage supplier I found
>>>>>> that open-iscsi seems to do exactly this non-deterministic
>>>>>> behaviour. I have a 3TB LUN. If I access sectors < 2TB it uses
>>>>>> READ10/WRITE10, and if I go beyond 2TB it changes to
>>>>>> READ16/WRITE16.
>>>>>
>>>>> Isn't that exactly what your latest patch does for >64K sector
>>>>> writes? :)
>>>>
>>>> Not exactly, we choose the default by checking the LUN size: 10-byte
>>>> CDBs for < 2TB and 16-byte CDBs otherwise.
>>>
>>> Yeah, I meant introducing the non-determinism.
>>>
>>>> My latest patch makes an exception if a request is bigger than 64K
>>>> sectors and switches to 16-byte requests. These would otherwise end
>>>> in an I/O error.
>>>
>>> It could also be split at the block layer, like we do for unmap. I
>>> think there's also a maximum transfer size somewhere in the VPD; we
>>> could do READ16/WRITE16 if it is >64K sectors.
>>
>> It seems that there might be a real-world example where Linux issues
>> >32MB write requests. Maybe someone familiar with btrfs can advise.
>> I see iSCSI Protocol Errors in my logs:
>>
>> Sep 1 10:10:14 libiscsi:0 PDU header: 01 a1 00 00 00 01 00 00 00 00 00 00
>> 00 00 00 00 00 00 00 07 06 8f 30 00 00 00 00 06 00 00 00 0a 2a 00 01 09 9e
>> 50 00 47 98 00 00 00 00 00 00 00 [XXX]
>> Sep 1 10:10:14 qemu-2.0.0: iSCSI: Failed to write10 data to iSCSI lun.
>> Request was rejected with reason: 0x04 (Protocol Error)
>>
>> Looking at the headers, the xferlen in the iSCSI PDU is 110047232
>> bytes, which is 214936 sectors. 214936 % 65536 = 18328, which is
>> exactly the number of blocks in the SCSI WRITE10 CDB.
>>
>> Can someone advise whether this is something that btrfs can cause, or
>> whether I have to blame the customer for issuing very big write
>> requests with Direct I/O?
>>
>> The user sees something like this in the log:
>>
>> [34640.489284] BTRFS: bdev /dev/vda2 errs: wr 8232, rd 0, flush 0, corrupt 0, gen 0
>> [34640.490379] end_request: I/O error, dev vda, sector 17446880
>> [34640.491251] end_request: I/O error, dev vda, sector 5150144
>> [34640.491290] end_request: I/O error, dev vda, sector 17472080
>> [34640.492201] end_request: I/O error, dev vda, sector 17523488
>> [34640.492201] end_request: I/O error, dev vda, sector 17536592
>> [34640.492201] end_request: I/O error, dev vda, sector 17599088
>> [34640.492201] end_request: I/O error, dev vda, sector 17601104
>> [34640.685611] end_request: I/O error, dev vda, sector 15495456
>> [34640.685650] end_request: I/O error, dev vda, sector 7138216
>>
>> Thanks,
>> Peter
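
For reference, decoding the PDU header quoted above (this is just
reading off the fields of a standard iSCSI BHS): bytes 20-23 carry the
Expected Data Transfer Length, 06 8f 30 00 = 0x068f3000 = 110047232
bytes = 214936 sectors of 512 bytes. The CDB starts at byte 32:

  2a 00 01 09 9e 50 00 47 98 00

i.e. WRITE10 with LBA 0x01099e50 and a 16-bit transfer length of
0x4798 = 18328 blocks. 214936 mod 65536 = 18328, so the 16-bit SCSI
transfer length wrapped around while the 32-bit iSCSI expected data
transfer length did not.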
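
And here is the rough sketch of (b) promised above. Untested; it
assumes the merge flag logic in multiwrite_merge() in block.c and
reuses its reqs/outidx/i/merge variables:

  #define MAX_MULTIWRITE_SECTORS 32768  /* 16 MB at 512-byte sectors */

  /* inside the merge loop of multiwrite_merge(), next to the existing
   * IOV_MAX check: refuse to merge if the combined request would get
   * too big.  Requests are sorted by sector, so the merged request
   * spans from reqs[outidx].sector to the larger of the two ends. */
  int64_t merged_end = MAX(reqs[outidx].sector + reqs[outidx].nb_sectors,
                           reqs[i].sector + reqs[i].nb_sectors);
  if (merged_end - reqs[outidx].sector > MAX_MULTIWRITE_SECTORS) {
      merge = 0;
  }

Variant (a) would do the same check against bs->bl.max_request_size
instead of the hardcoded constant, skipping it when the field is 0.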
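
Regarding passing the transport limit back to the guest: something
like this in block/iscsi.c is what I would try. Again an untested
sketch; the iscsi_get_max_burst_length() helper is made up, I would
have to check what libiscsi actually exposes for the negotiated value:

  /* after the Block Limits VPD has been read into iscsilun->bl */
  uint32_t max_xfer = iscsilun->bl.max_xfer_len;    /* 0 == no limit */
  uint32_t burst_sectors =
      iscsi_get_max_burst_length(iscsilun->iscsi) / iscsilun->block_size;

  /* clamp so that the emulated Block Limits VPD never advertises
   * more than the negotiated MaxBurstLength allows */
  if (max_xfer == 0 || (burst_sectors && burst_sectors < max_xfer)) {
      max_xfer = burst_sectors;
  }
  iscsilun->bl.max_xfer_len = max_xfer;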