On 23.07.2014 at 15:17, Luiz Capitulino wrote:
> Management software, such as OpenStack and RHEV's vdsm, wants to be able
> to allocate VM disk space on demand. The basic use case is to start a VM
> with a small disk and then the disk is enlarged when QEMU hits an ENOSPC
> condition.
>
> To this end, the management software has to be notified when QEMU
> encounters ENOSPC. The most straightforward solution is to extend QMP's
> BLOCK_IO_ERROR event with that information.
>
> This series does exactly that. The approach taken is the simplest possible:
> the BLOCK_IO_ERROR event is extended to contain a "nospace" key, which
> will be true whenever the guest runs out of space *and* werror=stop|enospc.
> Here's an example:
>
> { "event": "BLOCK_IO_ERROR",
>   "data": { "device": "ide0-hd1",
>             "operation": "write",
>             "action": "stop",
>             "nospace": true },
>   "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>
> There are three important things to observe:
>
> 1. query-block already supports querying the event by means of the
>    "io-status" key. Actually, "nospace" and "io-status" keys share
>    the same semantics. This is a big advantage of this approach, no
>    further extension of query-block is needed.
>
> 2. The event could also contain an error message key for debugging,
>    but if we add it to the event, should we add it to query-block too?

I don't think it's strictly necessary, but I can imagine that it would be
a very nice feature for debugging if you could check after the fact what
caused the VM stop even if you don't have a QMP log with the event.

> 3. I'm not extending BLOCK_JOB_ERROR. The reason is that it seems
>    that BLOCK_IO_ERROR is also emitted on BLOCK_JOB_ERROR

Hm, I can't see this in the code. Where do I need to look? Or did you get
both a BLOCK_JOB_ERROR and a BLOCK_IO_ERROR because the guest tried to
access the image, too, and caused a separate error?

> Now, this series is an RFC because there's an alternative solution for
> this problem: instead of extending the BLOCK_IO_ERROR event with no-space
> indicator, we could have a stringified errno. This way management apps
> would also be able to distinguish among other errors.

I don't think sending errnos is a good approach (but if we took it, we
should use an enum rather than strings) and prefer exposing the exact
information that is actually needed.

> For example, we could have a "error-details" dict containing a
> "reason" and a "message" key:
>
> { "event": "BLOCK_IO_ERROR",
>   "data": { "device": "ide0-hd1",
>             "operation": "write",
>             "action": "stop",
>             "error-details": { "reason": "eio", "message": "I/O error" } },
>   "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
>
> And then query-block would have to be extended to contain the same
> information.
>
> IMO, this series implementation is good enough for the requirement we
> currently have but I'm open to go complex if needed.

Agreed. I would like to see the human-readable strerror() string added,
but that doesn't make this series any worse as a first step:

Acked-by: Kevin Wolf <kw...@redhat.com>
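
For illustration only, and not part of the series: a minimal Python sketch
of how a management application might consume the extended event. It
assumes the "nospace" key lands exactly as proposed above; the function
name device_out_of_space() is hypothetical, and a QEMU without this series
simply would not send the key, so the code treats a missing key as false.

```python
import json
from typing import Optional


def device_out_of_space(event: dict) -> Optional[str]:
    """Return the device name if the event reports an out-of-space stop.

    Hypothetical helper: it relies only on the "event", "data", "device"
    and "nospace" keys shown in the example event above.
    """
    if event.get("event") != "BLOCK_IO_ERROR":
        return None
    data = event.get("data", {})
    # "nospace" is the key proposed in this series; treat a missing key
    # (QEMU without the series) the same as false.
    if data.get("nospace") is True:
        return data.get("device")
    return None


# The example event from the cover letter, as it would arrive on the wire.
raw = """
{ "event": "BLOCK_IO_ERROR",
  "data": { "device": "ide0-hd1",
            "operation": "write",
            "action": "stop",
            "nospace": true },
  "timestamp": { "seconds": 1265044230, "microseconds": 450486 } }
"""

device = device_out_of_space(json.loads(raw))
if device is not None:
    # A real management application would enlarge the storage here and
    # then resume the guest; this sketch only reports the device.
    print("would extend storage behind", device)
```

Since "nospace" is said to share semantics with "io-status", an
application that missed the event could presumably make the same decision
from query-block output instead.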