> On 06.10.2015 at 19:07, John Snow <js...@redhat.com> wrote:
>
>> On 10/06/2015 05:20 AM, Peter Lieven wrote:
>>> On 06.10.2015 at 10:57, Kevin Wolf wrote:
>>> On 05.10.2015 at 23:15, John Snow wrote:
>>>>
>>>> On 09/21/2015 08:25 AM, Peter Lieven wrote:
>>>>> PIO read requests on the ATAPI interface used to be sync blk requests.
>>>>> This has two significant drawbacks. First, the main loop hangs until an
>>>>> I/O request is completed, and second, if the I/O request does not
>>>>> complete (e.g. due to unresponsive storage), QEMU hangs completely.
>>>>>
>>>>> Signed-off-by: Peter Lieven <p...@kamp.de>
>>>>> ---
>>>>>  hw/ide/atapi.c | 69 ++++++++++++++++++++++++++++++++++++----------------------
>>>>>  1 file changed, 43 insertions(+), 26 deletions(-)
>>>>>
>>>>> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
>>>>> index 747f466..9257e1c 100644
>>>>> --- a/hw/ide/atapi.c
>>>>> +++ b/hw/ide/atapi.c
>>>>> @@ -105,31 +105,51 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>>>>>      memset(buf, 0, 288);
>>>>>  }
>>>>>
>>>>> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int sector_size)
>>>>> +static void cd_read_sector_cb(void *opaque, int ret)
>>>>>  {
>>>>> -    int ret;
>>>>> +    IDEState *s = opaque;
>>>>>
>>>>> -    switch(sector_size) {
>>>>> -    case 2048:
>>>>> -        block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>>> -                         4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
>>>>> -        block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>>> -        break;
>>>>> -    case 2352:
>>>>> -        block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>>> -                         4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
>>>>> -        block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>>> -        if (ret < 0)
>>>>> -            return ret;
>>>>> -        cd_data_to_raw(buf, lba);
>>>>> -        break;
>>>>> -    default:
>>>>> -        ret = -EIO;
>>>>> -        break;
>>>>> +    block_acct_done(blk_get_stats(s->blk), &s->acct);
>>>>> +
>>>>> +    if (ret < 0) {
>>>>> +        ide_atapi_io_error(s, ret);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (s->cd_sector_size == 2352) {
>>>>> +        cd_data_to_raw(s->io_buffer, s->lba);
>>>>>      }
>>>>> -    return ret;
>>>>> +
>>>>> +    s->lba++;
>>>>> +    s->io_buffer_index = 0;
>>>>> +    s->status &= ~BUSY_STAT;
>>>>> +
>>>>> +    ide_atapi_cmd_reply_end(s);
>>>>> +}
>>>>> +
>>>>> +static int cd_read_sector(IDEState *s, int lba, void *buf, int sector_size)
>>>>> +{
>>>>> +    if (sector_size != 2048 && sector_size != 2352) {
>>>>> +        return -EINVAL;
>>>>> +    }
>>>>> +
>>>>> +    s->iov.iov_base = buf;
>>>>> +    if (sector_size == 2352) {
>>>>> +        buf += 4;
>>>>> +    }
>>>
>>> This doesn't look quite right, buf is never read after this.
>>>
>>> Also, why += 4 when it was originally buf + 16?
>>
>> You are right. I mixed that up.
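For reference on the buf + 16 question: cd_data_to_raw() assembles the
standard 2352-byte raw mode-1 sector, so the 2048-byte payload has to land
16 bytes in. A minimal sketch of that layout (illustrative only, not code
from the patch):

    #include <stdint.h>

    /* Sketch of a 2352-byte raw mode-1 CD sector as assembled by
     * cd_data_to_raw(). The payload sits at offset 16, which is why the
     * read destination in the 2352 case must be buf + 16, not buf + 4. */
    struct raw_mode1_sector {
        uint8_t sync[12];   /* 00 ff ff ff ff ff ff ff ff ff ff 00 */
        uint8_t msf[3];     /* minute/second/frame address of the LBA */
        uint8_t mode;       /* 1 = mode-1 data sector */
        uint8_t data[2048]; /* payload filled by blk_read()/blk_aio_readv() */
        uint8_t ecc[288];   /* EDC/ECC area, zero-filled by cd_data_to_raw() */
    };                      /* 12 + 3 + 1 + 2048 + 288 = 2352 bytes */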
>>
>>>
>>>>> +
>>>>> +    s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
>>>>> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>>>>> +
>>>>> +    if (blk_aio_readv(s->blk, (int64_t)lba << 2, &s->qiov, 4,
>>>>> +                      cd_read_sector_cb, s) == NULL) {
>>>>> +        return -EIO;
>>>>> +    }
>>>>> +
>>>>> +    block_acct_start(blk_get_stats(s->blk), &s->acct,
>>>>> +                     4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>>>>> +    s->status |= BUSY_STAT;
>>>>> +    return 0;
>>>>>  }
>>>>
>>>> We discussed this off-list a bit, but for upstream synchronization:
>>>>
>>>> Unfortunately, I believe making cd_read_sector here non-blocking makes
>>>> ide_atapi_cmd_reply_end non-blocking, and as a result makes calls to
>>>> s->end_transfer_func() non-blocking, which functions like ide_data_readw
>>>> are not prepared to cope with.
>>>
>>> I don't think that's a problem as long as BSY is set while the
>>> asynchronous command is running and DRQ is cleared. The latter will
>>> protect ide_data_readw(). ide_sector_read() does essentially the same
>>> thing.
>>
>> I was thinking the same. Without the BSY flag it's not working at all.
>>
>>> Or maybe I'm just missing what you're trying to say.
>>>
>>>> My suggestion is to buffer an entire DRQ block of data at once
>>>> (byte_count_limit) to avoid the problem.
>>>
>>> No matter whether there is a problem or not, buffering more data at once
>>> (and therefore doing fewer requests) is better for performance anyway.
>>
>> It's possible to do only one read in the backend and read the whole
>> request into the IO buffer. I'll send a follow-up.
>
> Be cautious: we only have 128K (+4 bytes) to play with in the io_buffer
> and the READ10 cdb can request up to 128MiB! For performance, it might
> be nice to always buffer something like:
>
>     MIN(128K, nb_sectors * sector_size)
Isn't nb_sectors limited to CD_MAX_SECTORS (32)? That would cap a single
read at 32 * 2352 = 75,264 bytes, which already fits in io_buffer.

Peter

> and then, as the guest drains the DRQ block of size byte_count_limit,
> which can be at most 0xFFFE (we can fit at least two of these per
> io_buffer refill), we can just shift the data_ptr and data_end pointers
> to utilize io_buffer like a ring buffer.
>
> Because the guest can at most fetch 0xfffe bytes at a time, it will tend
> to leave at least 4 bytes left over from a 64-block read. Luckily, we've
> got 4 extra bytes in s->io_buffer, so with a ring buffer we can always
> rebuffer *at least* two full DRQ blocks of data at a time.
>
> The routine would basically look like this:
>
> - No DRQ blocks buffered, so read up to 64 blocks, or however many are
>   left for our transfer
> - If we have at least one full DRQ block buffered, start the transfer
>   and send an interrupt
> - If we ran out of DRQ blocks, go back to the top and buffer them.
>
> This would eliminate the need for code stanza #3 in
> ide_atapi_cmd_reply_end, which re-starts a transfer without signaling to
> the guest. We'd only have:
>
>     ide_atapi_cmd_reply_end(...) {
>         if (packet_transfer_size == 0) { end(...); return; }
>         if (blocks_buffered < 1) { async_buffer_blocks(...); return; }
>         ide_transfer_start(...);
>         ide_set_irq(s->bus);
>     }
>
> which is a good deal simpler than what we have now, though I need to
> look into the formatting of raw CD data a little more to make sure my
> numbers make sense... it may not be quite so easy to buffer multiple DRQ
> blocks in some cases, but so it goes -- we should always be able to
> buffer at least one.
>
>> Do you maybe have a pointer to the test tool that John mentioned?
>>
>> Peter
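To make John's pseudocode above concrete, here is a rough sketch of the
simplified reply path as it might look inside hw/ide/atapi.c. The field
blocks_buffered, the size drq_block_size, and the helper
ide_atapi_buffer_blocks() are hypothetical names invented for
illustration; ide_transfer_start(), ide_set_irq() and ide_atapi_cmd_ok()
are the existing hw/ide helpers:

    /* Sketch only: blocks_buffered, drq_block_size and
     * ide_atapi_buffer_blocks() do not exist in QEMU; they stand in for
     * the ring-buffer state and refill routine John describes. */
    void ide_atapi_cmd_reply_end(IDEState *s)
    {
        /* 1. Transfer fully drained by the guest: finish the command. */
        if (s->packet_transfer_size == 0) {
            ide_atapi_cmd_ok(s);
            ide_set_irq(s->bus);
            return;
        }

        /* 2. Ring buffer empty: kick off an asynchronous refill of up to
         *    MIN(128K, bytes remaining); the refill's completion callback
         *    re-enters ide_atapi_cmd_reply_end(). */
        if (s->blocks_buffered < 1) {
            ide_atapi_buffer_blocks(s);
            return;
        }

        /* 3. At least one full DRQ block (at most 0xfffe bytes) is
         *    buffered: hand it to the guest and raise the interrupt. */
        ide_transfer_start(s, s->data_ptr, s->drq_block_size,
                           ide_atapi_cmd_reply_end);
        ide_set_irq(s->bus);
    }

Each refill would then advance data_ptr/data_end around the 128K + 4 byte
io_buffer as described above, so the guest still sees a plain sequence of
DRQ blocks with one interrupt per block.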