On 27.05.2016 20:33, Stefan Hajnoczi wrote:
On Sat, May 14, 2016 at 03:45:50PM +0300, Denis V. Lunev wrote:
+ qemu_co_mutex_lock(&s->lock);
+ cluster_offset = \
+ qcow2_alloc_compressed_cluster_offset(bs, sector_num << 9, out_len);
The backslash isn't necessary for wrapping lines in C. This kind of
thing is only necessary in languages like Python where the grammar is
whitespace sensistive.
The C compiler is happy with an arbitrary amount of whitespace
(newlines) in the middle of a statement. The backslash in C is handled
by the preprocessor: it joins the line. That's useful for macro
definitions where you need to tell the preprocessor that several lines
belong to one macro definition. But it's not needed for normal C code.
Thanks for the explanation, but the backslash is used more for the
person as a marker a line break. The current coding style misses this
point, but I can remove the backslash, because I don't think it's
something important :)
+ if (!cluster_offset) {
+ qemu_co_mutex_unlock(&s->lock);
+ ret = -EIO;
+ goto fail;
+ }
+ cluster_offset &= s->cluster_offset_mask;
- BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
- ret = bdrv_pwrite(bs->file->bs, cluster_offset, out_buf, out_len);
- if (ret < 0) {
- goto fail;
- }
+ ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset, out_len);
+ qemu_co_mutex_unlock(&s->lock);
+ if (ret < 0) {
+ goto fail;
}
+ iov = (struct iovec) {
+ .iov_base = out_buf,
+ .iov_len = out_len,
+ };
+ qemu_iovec_init_external(&hd_qiov, &iov, 1);
+
+ BLKDBG_EVENT(bs->file, BLKDBG_WRITE_COMPRESSED);
+ ret = bdrv_co_pwritev(bs->file->bs, cluster_offset, out_len, &hd_qiov, 0);
There is a race condition here:
If the newly allocated cluster is only partially filled by compressed
data then qcow2_alloc_compressed_cluster_offset() remembers that more
bytes are still available in the cluster. The
qcow2_alloc_compressed_cluster_offset() caller will continue filling the
same cluster.
Imagine two compressed writes running at the same time. Write A
allocates just a few bytes so write B shares a sector with the first
write:
Sector 1
|AAABBBBBBBBB|
The race condition is that bdrv_co_pwritev() uses read-modify-write (a
bounce buffer). If both requests call bdrv_co_pwritev() around the same
time then the following could happen:
Sector 1
|000BBBBBBBBB|
or:
Sector 1
|AAA000000000|
It's necessary to hold s->lock around the compressed data write to avoid
this race condition.
I agree, there is really a race.. Thank you, this is a very good point!