On Wed, Jun 01, 2016 at 11:25:57AM +0200, Kevin Wolf wrote:
> Am 27.05.2016 um 19:33 hat Stefan Hajnoczi geschrieben:
> > On Sat, May 14, 2016 at 03:45:50PM +0300, Denis V. Lunev wrote:
> > > +    qemu_co_mutex_lock(&s->lock);
> > > +    cluster_offset = \
> > > +        qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);
> >
> > The backslash isn't necessary for wrapping lines in C.  This kind of
> > thing is only necessary in languages like Python where the grammar is
> > whitespace sensitive.
> >
> > The C compiler is happy with an arbitrary amount of whitespace
> > (newlines) in the middle of a statement.  The backslash in C is handled
> > by the preprocessor: it joins the line.  That's useful for macro
> > definitions where you need to tell the preprocessor that several lines
> > belong to one macro definition.  But it's not needed for normal C code.
> >
> > > +    if (!cluster_offset) {
> > > +        qemu_co_mutex_unlock(&s->lock);
> > > +        ret = -EIO;
> > > +        goto fail;
> > > +    }
> > > +    cluster_offset &= s->cluster_offset_mask;
> > >
> > > -    BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
> > > -    ret = bdrv_pwrite(bs->file->bs, cluster_offset, out_buf, out_len);
> > > -    if (ret < 0) {
> > > -        goto fail;
> > > -    }
> > > +    ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
> > > +    qemu_co_mutex_unlock(&s->lock);
> > > +    if (ret < 0) {
> > > +        goto fail;
> > >      }
> > >
> > > +    iov = (struct iovec) {
> > > +        .iov_base = out_buf,
> > > +        .iov_len  = out_len,
> > > +    };
> > > +    qemu_iovec_init_external(&hd_qiov, &iov, 1);
> > > +
> > > +    BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
> > > +    ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);
> >
> > There is a race condition here:
> >
> > If the newly allocated cluster is only partially filled by compressed
> > data then qcow2_alloc_compressed_cluster_offset() remembers that more
> > bytes are still available in the cluster.  The
> > qcow2_alloc_compressed_cluster_offset() caller will continue filling the
> > same cluster.
> >
> > Imagine two compressed writes running at the same time.  Write A
> > allocates just a few bytes so write B shares a sector with the first
> > write:
> >
> > Sector 1
> > |AAABBBBBBBBB|
> >
> > The race condition is that bdrv_co_pwritev() uses read-modify-write (a
> > bounce buffer).  If both requests call bdrv_co_pwritev() around the same
> > time then the following could happen:
> >
> > Sector 1
> > |000BBBBBBBBB|
> >
> > or:
> >
> > Sector 1
> > |AAA000000000|
> >
> > It's necessary to hold s->lock around the compressed data write to avoid
> > this race condition.
>
> I don't think this race condition exists.  bdrv_co_pwritev() can indeed
> perform RMW if the lower layer requires alignment, but that's not
> something callers need to care about (which would be an awful API); it
> is fully handled there by marking requests serialising if they involve
> RMW.
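As an aside on the line-wrapping comment above, here is a stand-alone
sketch of the difference.  CHECK_NONZERO and demo are invented names for
illustration only, not anything from the patch or from QEMU:

#include <stdio.h>

/* Only the preprocessor needs the backslash: it tells cpp that the
 * following lines still belong to this macro definition.
 * (CHECK_NONZERO is an invented example, not a QEMU macro.) */
#define CHECK_NONZERO(x)    \
    do {                    \
        if ((x) == 0) {     \
            return -1;      \
        }                   \
    } while (0)

static int demo(unsigned long long cluster_offset)
{
    CHECK_NONZERO(cluster_offset);

    /* An ordinary statement may wrap across as many lines as you like;
     * the compiler ignores the extra whitespace, no backslash needed. */
    printf("cluster offset: %llu\n",
           cluster_offset);
    return 0;
}

int main(void)
{
    return demo(65536) == 0 ? 0 : 1;
}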
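To spell out (mostly for the archives) what the RMW concern above was and
why serialising the overlapping requests is enough, here is a rough
stand-alone model.  It is plain pthreads, nothing QEMU-specific; sector,
rmw_write, and struct req are invented names used only for this sketch:

#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* One 12-byte "sector" shared by two compressed writes (stand-in only). */
static char sector[13] = "............";
static pthread_mutex_t serialise = PTHREAD_MUTEX_INITIALIZER;

struct req {
    int off;     /* offset of this write within the sector */
    int len;     /* number of bytes written */
    char fill;   /* payload byte */
};

/* Read-modify-write through a bounce buffer.  If two overlapping requests
 * were *not* serialised, both could read the old sector contents and the
 * later write-back would wipe out the earlier one (the |000BBBBBBBBB| /
 * |AAA000000000| outcomes above).  Serialising them keeps the combined
 * result |AAABBBBBBBBB|. */
static void *rmw_write(void *opaque)
{
    struct req *r = opaque;
    char bounce[sizeof(sector)];

    pthread_mutex_lock(&serialise);
    memcpy(bounce, sector, sizeof(bounce));      /* read        */
    memset(bounce + r->off, r->fill, r->len);    /* modify      */
    memcpy(sector, bounce, sizeof(bounce));      /* write back  */
    pthread_mutex_unlock(&serialise);
    return NULL;
}

int main(void)
{
    struct req a = { 0, 3, 'A' };    /* write A: first 3 bytes   */
    struct req b = { 3, 9, 'B' };    /* write B: remaining bytes */
    pthread_t ta, tb;

    pthread_create(&ta, NULL, rmw_write, &a);
    pthread_create(&tb, NULL, rmw_write, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    printf("%s\n", sector);          /* AAABBBBBBBBB */
    return 0;
}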
Good point.

Stefan