---- Eric Blake <ebl...@redhat.com> wrote: > [any way you could convince your mailer to not break threading?] > > On 03/03/2016 09:24 PM, mgre...@cinci.rr.com wrote: > >> > >> The zeros are not part of the compressed data, though, that's why the > >> Compressed Cluster Descriptor indicates a shorter size. Had another > >> compressed cluster been written to the same image, it might have ended > >> up where you are seeing the zero padding now. (The trick with > >> compression is that multiple guest clusters can end up in a single host > >> cluster.) > >> > > > > Thanks, but the given length of 0x5600 is still short by 160(decimal) bytes > > compared to the > > non-zero data (which occupies an additional 133 bytes beyond the expected > > end at > > 0x3DED50) and zero > > padding (an additional 27 bytes beyond that). Could there be an off-by-one > > error > > somewhere? > > The file doesn't even end on a sector boundary let alone a cluster > > boundary. > > Based on an IRC conversation I had when you first asked the question, I > think the spec is indeed weak, and that we DO have some fishy code. > > Look at what the code does: > > uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs, > uint64_t offset, > int compressed_size) > ... > nb_csectors = ((cluster_offset + compressed_size - 1) >> 9) - > (cluster_offset >> 9); > > cluster_offset |= QCOW_OFLAG_COMPRESSED | > ((uint64_t)nb_csectors << s->csize_shift); > > That sure does sound like an off-by-one. cluster_offset does indeed > look like a byte offset (from qcow2_alloc_bytes); so let's consider what > happens if we've already allocated one compressed cluster in the past, > going from 65536 to 66999. So on this call, cluster_offset would be > 67000, and if compressed size is 1025 (just round numbers to make > discussion easy; no idea if gzip would really do this on any particular > data), we are computing ((67000+1025-1)>>9) - (67000>>9) == 2, but 1025 > bytes occupies 3 sectors, not 2. > > But we offset it by another off-by-one: > > int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset) > { > ... > nb_csectors = ((cluster_offset >> s->csize_shift) & > s->csize_mask) + 1; > > Yuck. That is horribly ugly. > > We need to fix the documentation to make it obvious that the sector > count is a _LOWER BOUND_ on the number of sectors occupied, and that you > need to read at least one more cluster's worth of data before decompressing. > > It would also be nice to fix qcow2 code to not have quite so many > off-by-one computations, but be more precise about what data is going where. > > -- > Eric Blake eblake redhat com +1-919-301-3266 > Libvirt virtualization library http://libvirt.org >
Thanks for verifying the behavior for me. This will allow me to add compression support to the code I've already written. I've implemented QCow2 support for a fork of the popular DOSBox emulator. The project (not mine), is called DOSBox-X, and is geared towards emulating Windows9x era hardware with a specific focus on games: http://dosbox-x.com/ If you'd like to give me any feedback on the QCow2 code I've written please feel free to e-mail me directly. The relevant files are qcow2_disk.cpp and qcow_disk.h. If you do, please be kind as my C++ skills are very rusty. Overall I think it is pretty readable. It took me about 2 and a half weeks working in my spare time to go from reading the spec to the final code integrated into DOSBox-X. Thanks, Michael Greger