On 04/07/2016 04:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 05.04.2016 16:43, Paolo Bonzini wrote:
>>
>> On 05/04/2016 06:05, Kevin Wolf wrote:
>>> The options I can think of is adding a request field "max number of
>>> descriptors" or a flag "only single descriptor" (with the assumption
>>> that clients always want one or unlimited), but maybe you have a better
>>> idea.
>> I think a limit is better.  Even if the client is ultimately going to
>> process the whole file, it may take a very long time and space to
>> retrieve all the descriptors in one go.  Rather than query e.g. 16GB at
>> a time, I think it's simpler to put a limit of 1024 descriptors or so.
>>
>> Paolo
>>
>
> I vote for the limit too. More over, I think, there should be two sides
> limit:
>
> 1. The client can specify the limit, so server should not return more
> extents than requested. Of course, server should chose sequential
> extents from the beginning of requested range.

For the client to request a limit would entail that we enhance the
protocol to allow structured requests (where a wire-sniffer would know
how many bytes to read for the client's additional data, even if it does
not understand the extension's semantics).  Might not be a bad idea to
have this in the long run, but so far I've been reluctant to bite the
bullet.

> 2. Server side limit: if client asked too many extents or not specified
> a limit at all, server should not return all extents, but only 1024 (for
> ex.) from the beginning of the range.

Okay, I'm fairly convinced now that letting the server limit the reply
is a good thing, and that one doesn't require a structured request from
the client.  Since we just recently documented that strings should be no
more than 4096 bytes, and my v2 proposal used 8 bytes per descriptor,
maybe a good way to enforce a similar limit would be:

The server MAY choose to send fewer descriptors than what would describe
the full extent of the client's request, but MUST send at least one
descriptor unless an error is reported.  The server MUST NOT send more
than 512 descriptors, even if that does not completely describe the
client's requested length.

That way, a client in general should never expect more than ~4096 bytes
+ overhead on any server reply except a reply to NBD_CMD_READ, and can
therefore utilize stack allocation for all other replies (if we do this,
maybe we should make a hard rule that all future protocol extensions,
other than NBD_CMD_READ, will guarantee that a reply has a bounded size).

I also think it may be okay to let the server reply with MORE data than
the client requested, but only as long as it does not result in any
extra descriptors (that is, only the last descriptor can result in a
length beyond the client's request).
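
To make the proposed reply rules concrete, here is a rough sketch (in
Python, purely for illustration) of how a server might select the
descriptors for one request.  The function name, the (length, status)
tuple layout, and the status values are hypothetical; only the 512
descriptor cap, the 8 bytes per descriptor, the "at least one
descriptor" rule, and the allowance for the last descriptor to run past
the requested length come from the wording above.

```python
MAX_DESCRIPTORS = 512   # cap proposed above
DESCRIPTOR_SIZE = 8     # bytes per descriptor in the v2 proposal


def build_reply(extents, request_length, max_descriptors=MAX_DESCRIPTORS):
    """Select descriptors for one block-status reply (illustrative only).

    extents: iterable of (length, status) pairs, assumed contiguous and
             starting at the offset the client asked about.
    Returns a list of at least 1 and at most max_descriptors pairs.
    """
    if request_length <= 0:
        raise ValueError("request length must be positive")

    reply = []
    covered = 0
    for length, status in extents:
        # Stop once the request is fully described or the cap is reached.
        if covered >= request_length or len(reply) >= max_descriptors:
            break
        # The descriptor is kept whole, so only the last one sent can
        # extend beyond the client's requested length.
        reply.append((length, status))
        covered += length

    if not reply:
        # The rule is "at least one descriptor unless an error is reported".
        raise ValueError("no extent information; report an error instead")
    return reply
```

With this sketch, a request that would need 1000 descriptors is answered
with the first 512 (a 4096-byte payload before overhead), and the client
is expected to issue a follow-up request starting where the reply ended.
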
For example, if the client asks for block status of 1M of the file, but
the server can conveniently learn via lseek(SEEK_HOLE) or other means
that there are 2M of data before status changes, then there's no reason
to force the server to throw away the information about the 1M beyond
the client's read, and the client might even be able to be more
efficient in later requests.

> 2.1 And/or, why not allow the server use the power of structured reply
> and send several reply chunks? Why did you forbid this? (if I correctly
> understand "This chunk type MUST appear at most once in a structured
> reply.")

If we allow more than one chunk, then either every chunk has to include
an offset (more traffic over the wire), or the chunks have to be sent in
a particular order (we aren't gaining any benefits that NBD_CMD_READ
gains by allowing out-of-order transmission).  It's also more work for
the client to reconstruct if it has to reassemble; with NBD_CMD_READ,
the payload is dominated by the data being read, and you can pwrite()
the data into its final location as the client; but with
NBD_CMD_BLOCK_STATUS, the payload is dominated by the metadata and we
want to keep it minimal; and there is no convenient command for the
client to reassemble the information if received out of order.

Allowing for a short reply seems to be worth doing, but allowing for
multiple reply chunks seems not worth the risk.

I'm also starting to think that it is worth FIRST documenting an
extension for advertising block sizes, so that we can then couch
BLOCK_STATUS in those terms (a server MUST NOT subdivide status into
finer granularity than the advertised block sizes).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org