BTW, I think Cc the virtio mailing list in the next version is a good idea.
On Wed, Jun 01, 2011 at 05:59:28PM +0200, Paolo Bonzini wrote: > On 06/01/2011 04:36 PM, Michael S. Tsirkin wrote: > >Ah, I think I understand now. Both sense and data have in > >fields that might only be used partially? > >In that case I think I agree: it's best to require the use of separate > >buffers for them, in this way used len will give you useful information > >and you won't need sense_len and data_len: just a flag to > >mark the fact that there*is* a sense buffer following. > > If the device wants a sense buffer to be there always, that's > sensible. No flag needed here, then. Also, sense is always "in", > there is no out. > > But I do not understand how the used len helps me. If I read the > spec correctly, the length will be the number of bytes written, but > this will always point to after the last field. If sense or data > are written partially, this will not be written in the fields---in > fact, virtio_blk does contain both sense_len and residual. Its > sense field is fixed size, which is probably why it doesn't contain > something like datain_size (there is just one variable-size field). > > Strictly speaking I wouldn't need dataout_size too, because I have > only one variable-size read-only field, but I prefer to be > future-proof. > > I think the following is a good compromise: > > 1) keep dataout_size, datain_size and sense_size; > > 2) dataout, datain and sense shall each start a separate buffer, and > sense shall be contained in a single buffer; it is permissible to > put sense and the subsequent fields in the same buffer. This will > make it easy for the QEMU implementation to pick up its iovecs. > > It will also let the device detect mistakes in filling the data > sizes. I am not sure whether the length info in the header is redundant. If so, I think it's better not to duplicate it on the assumption that this will let us detect bugs: the bugs will be in the header as likely as not. If the overlap is not complete, some redundancy is not too bad. > In practice I expect 3 descriptors will be used (one direct > for read-only stuff up to data; one possibly indirect for data; one > direct for sense and other write-only stuff). > > >However some questions: > >1. I think you don't need numdatain/numdataout: each > >buffer can include in and out segments. Just tell device how many > >buffers are there. > > I don't understand. > > Paolo I think I didn't express myself clearly. Follows a somewhat lengthy background to make sure we use the same terms. Feel free to ignore until ---- There seems to be some confusion about the terminology: the term buffers is confusing. I'm guilty of mixing terms too. Let's refer to the virtio spec. There are two kinds of entities there: - descriptor: each descriptor points to a location in memory. It can be in or out. descriptors can be chained together. - head: internally, points to a chain of descriptors, both in and out. * driver makes head available to device, thus adding some memory for out + some memory for in. * device uses the memory, and reports to driver how much in memory was used. The assumption is always that device consumes in memory from the beginning in a contigious way. That's why length is enough. That's also why it's called add_buf: conceptually it is a single buffer. Drivers and devices never operate in terms of descriptors: these are internal to the virtio ring transport. Instead, they always operate in term of heads. virtio interface does let you pass in s/g entries but this is an artifact of linux. At the moment many devices in qemu assume that drivers put some info in specific descriptors. This is not a good idea, they should always operate in terms of offsets from start of descriptor. vhost does this correctly BTW. I posted a patch to fix that previously, need to dust it up and merge. ---- Now to our problem: As far as I can tell there are two input buffers in each request: sense and data. Right? If sense is fixed length, we can simply put it first, have device write sense then data. This does not seem too limiting, if you want a lot of flexibility sense length can be in device config. If we don't want to limit ourselves to fixed length sense, we would have driver use two heads for a request. This is possible but one needs to be careful in the driver to make sure there's enough space for both requests. Maybe add_bufs API to add multiple bufs might be a good idea here. -- MST