On Sun, Aug 20, 2017 at 1:47 PM, Ashijeet Acharya <ashijeetacha...@gmail.com> wrote:
> On Fri, May 5, 2017 at 7:29 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> On Thu, Apr 27, 2017 at 01:36:30PM +0530, Ashijeet Acharya wrote:
>> > This series helps to provide chunk size independence for the DMG driver
>> > to prevent denial-of-service in cases where untrusted files are being
>> > accessed by the user.
>>
>> The core of the chunk size dependence problem is these lines:
>>
>>     s->compressed_chunk = qemu_try_blockalign(bs->file->bs,
>>                                               ds.max_compressed_size + 1);
>>     s->uncompressed_chunk = qemu_try_blockalign(bs->file->bs,
>>                                                 512 * ds.max_sectors_per_chunk);
>>
>> The refactoring needs to eliminate these buffers because their size is
>> controlled by the untrusted input file.
>
> Oh okay, I understand now.  But wouldn't I still need to allocate some
> memory for these buffers to be able to use them for the compressed chunks
> case you mentioned below?  Instead of letting DMG images control the size
> of these buffers, maybe I can hard-code their size instead?
>
>> After applying your patches these lines remain unchanged and we still
>> cannot use input files that have a 250 MB chunk size, for example.  So
>> I'm not sure how this series is supposed to work.
>>
>> Here is the approach I would take:
>>
>> In order to achieve this, dmg_read_chunk() needs to be scrapped.  It is
>> designed to read a full chunk.  The new model does not read full chunks
>> anymore.
>>
>> Uncompressed reads or zeroes should operate directly on qiov, not
>> s->uncompressed_chunk.  This code will be dropped:
>>
>>     data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
>>     qemu_iovec_from_buf(qiov, i * 512, data, 512);
>
> I have never worked with qiov before.  Are there any places inside other
> drivers I can refer to, to get an idea of how to use it directly (I am
> searching by myself in the meantime...)?
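[Not part of the original thread: a minimal toy sketch of the gather idea behind QEMU's iovec helpers. The iov_gather() name is made up for illustration; QEMU's real helpers, e.g. qemu_iovec_to_buf()/qemu_iovec_from_buf(), live in util/iov.c and operate on a QEMUIOVector rather than a bare iovec array.]

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Toy gather helper: copy the logical byte stream described by an
 * iovec array into one flat buffer, stopping when either the vector
 * or the destination is exhausted.  Returns the number of bytes copied. */
static size_t iov_gather(const struct iovec *iov, int cnt,
                         char *buf, size_t len)
{
    size_t done = 0;
    for (int i = 0; i < cnt && done < len; i++) {
        size_t n = iov[i].iov_len;
        if (n > len - done) {
            n = len - done;
        }
        memcpy(buf + done, iov[i].iov_base, n);
        done += n;
    }
    return done;
}
```

A driver doing the reverse direction (scatter) walks the same array but copies into each element instead, which is what filling a guest's qiov amounts to.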
A QEMUIOVector is a utility type for struct iovec iov[] processing.  See
util/iov.c.

This is called "vectored" or "scatter-gather" I/O.  Instead of
transferring data to/from a single <buffer, length> tuple, these
functions take an array of <buffer, length> elements.  For example, the
buffer "Hello world" could be split into two elements:

  [{"Hello ", strlen("Hello ")},
   {"world", strlen("world")}]

Vectored I/O is often used because it eliminates memory copies.  Say you
have a network packet header struct and also a data payload array.
Traditionally you would have to allocate a new buffer large enough for
both header and payload, copy the header and payload into the buffer,
and finally give this temporary buffer to the I/O function.  This is
inefficient.  With vectored I/O you can create a vector with two
elements, the header and the payload, and the I/O function can process
them without needing a temporary buffer copy.

> I got clearly what you are trying to say, but I don't know how to
> implement it.  Don't we already do that for the zeroed chunks in DMG in
> dmg_co_preadv()?

Yes, dmg_co_preadv() zeroes the qiov directly.  It doesn't use
s->compressed_chunk/s->uncompressed_chunk.

>> Compressed reads still need buffers.  I suggest the following:
>>
>> 1. compressed_buf - compressed data is read into this buffer from the file
>> 2. uncompressed_buf - a place to discard decompressed data while
>>    simulating a seek operation
>
> Yes, these are the buffers whose size I can hard-code as discussed
> above?  You can suggest the preferred size to me.

Try starting with 256 KB for both buffers.

>> Data is read from compressed chunks by reading a reasonable amount
>> (64k?) into compressed_buf.  If the user wishes to read at an offset
>> into this chunk then a loop decompresses the data we are seeking over
>> into uncompressed_buf (and refills compressed_buf if it becomes empty)
>> until the desired offset is reached.
>> Then decompression can continue directly into the user's qiov, and
>> uncompressed_buf isn't used to decompress the data requested by the
>> user.
>
> Yes, this series does exactly that, but it keeps using the
> "uncompressed" buffer once we reach the desired offset.  Once I
> understand how to use qiov directly, we can do this.  Also, Kevin did
> suggest to me (as I vaguely remember) that in reality we never actually
> get a read request at a particular offset, because the DMG driver is
> generally used with "qemu-img convert", which means all read requests
> come from the top.

For performance it's fine to assume a sequential access pattern.  The
block driver should still support random access I/O, though.
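[Not part of the original thread: a sketch of the seek-then-read loop described above, with the decompressor abstracted behind a callback.  The names read_at_offset(), pull_fn, fake_pull() and the tiny scratch size are hypothetical; real driver code would pull bytes from zlib/bzip2 inflation, refill compressed_buf from the file as needed, and deliver into the guest's qiov instead of a flat buffer.]

```c
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_SIZE 8   /* tiny here for illustration; 256 KB in the proposal */

/* Stand-in for "decompress up to 'max' more bytes of the chunk into dst";
 * returns the number of bytes produced, 0 on end of stream. */
typedef size_t (*pull_fn)(void *opaque, uint8_t *dst, size_t max);

/* Decompress and discard 'offset' bytes into a scratch buffer (the
 * simulated seek), then decompress 'len' bytes straight into the
 * caller's buffer.  Returns the number of bytes delivered. */
static size_t read_at_offset(pull_fn pull, void *opaque,
                             uint64_t offset, uint8_t *out, size_t len)
{
    uint8_t scratch[SCRATCH_SIZE];

    while (offset > 0) {
        size_t want = offset < sizeof(scratch) ? (size_t)offset
                                               : sizeof(scratch);
        size_t got = pull(opaque, scratch, want);  /* seek by discarding */
        if (got == 0) {
            return 0;  /* stream ended before the requested offset */
        }
        offset -= got;
    }

    size_t done = 0;
    while (done < len) {
        size_t got = pull(opaque, out + done, len - done);
        if (got == 0) {
            break;
        }
        done += got;
    }
    return done;
}

/* Test harness only: a fake "decompressed stream" whose byte at
 * position p is (uint8_t)p, so results are easy to check. */
struct fake_stream { uint64_t pos; };

static size_t fake_pull(void *opaque, uint8_t *dst, size_t max)
{
    struct fake_stream *s = opaque;
    for (size_t i = 0; i < max; i++) {
        dst[i] = (uint8_t)(s->pos + i);
    }
    s->pos += max;
    return max;
}
```

For the sequential qemu-img convert pattern the skip loop simply never runs, so the common case pays nothing for retaining random-access support.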