On Thu, Feb 16, 2012 at 04:01:58PM +0100, Kevin Wolf wrote:
> Hi Christoph,
>
> I just talked to Stefan about our testing, both regarding the block
> layer and qemu in general, and we came to the conclusion that it would
> probably make sense to merge qemu-iotests into qemu.git.
>
> The immediate
On Mon, Oct 15, 2012 at 12:07:37PM -0600, Eric Blake wrote:
> On 10/15/2012 11:29 AM, Alex Bligh wrote:
> > This patch allows qemu-img rebase to rebase an image to
> > have no backing file, as opposed to merely allowing it to
> > rebase to an existing backing file.
>
> You can already do that by r
On Fri, Jun 22, 2012 at 10:48:56AM -0700, Chris Wedgwood wrote:
> > FITRIM is a mounted filesystem feature to discard (or "trim") blocks which
> > are not in use by the filesystem. This is useful for solid-state drives
> > (SSDs) and thinly-provisioned storage. Provide access to the feature
> > fr
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig
Index: qemu/block/sheepdog.c
===
--- qemu.orig/block/sheepdog.c 2012-06-27 18:02:41.849867899 +0200
+++ qemu/block/sheepdog.c
On Thu, Jun 28, 2012 at 04:06:24PM +0900, MORITA Kazutaka wrote:
>
> 'offset' is the offset of the sheepdog object. I think it should be
> 'done' since we need to pass the number of skip bytes.
Indeed. Odd that mny tests didn't catch this.
>
> > goto done;
> > }
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig
---
block/sheepdog.c | 28 ++--
1 file changed, 18 insertions(+), 10 deletions(-)
Index: qemu/block/sheepdog.c
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig
---
block/sheepdog.c | 37 ++---
1 file changed, 18 insertions(+), 19 deletions(-)
Index: qemu/block/sheepdog.c
On Mon, Jul 09, 2012 at 04:54:08PM +0800, Wenchao Xia wrote:
> Hi, Paolo and folks,
> qemu have good capabilities to access different virtual disks, I want
> to expose its block layer API to let 3rd party program linked in, such
> as management stack or block tools, to access images data directly
Only buffers that map to unallocated blocks need to be zeroed.
Signed-off-by: Christoph Hellwig
---
block/sheepdog.c | 37 ++---
1 file changed, 18 insertions(+), 19 deletions(-)
Index: qemu/block/sheepdog.c
On Tue, Oct 09, 2012 at 05:42:01PM +0800, Chen HanXiao wrote:
> When we use SCSI generic device as disk image, function lseek
> could not get the size of this kind of device.
> So try to use SCSI command Read Capacity(10) when lseek failed to get
> the size of SCSI generic device.
Eww, this is ugl
On Wed, Jul 31, 2013 at 08:19:51AM +0200, Paolo Bonzini wrote:
> Most of the block layer is under the BSD license, thus it is reasonable
> to license block/raw.c the same way. CCed people should ACK by replying
> with a Signed-off-by line.
The coded was intended to be GPLv2.
On Tue, Jul 16, 2013 at 06:29:26PM +0200, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini
This isn't really XFS specific, at least ext4 and ocfs2 can report the same.
On Tue, May 06, 2014 at 09:00:54PM +0200, Max Reitz wrote:
> The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if
> FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even
> compiled in in this case. However, there may be implementations which
> support the latter but not t
On Wed, Dec 21, 2011 at 04:00:36PM +, Stefan Hajnoczi wrote:
> The ability to zero regions of an image file is a useful primitive for
> higher-level features such as image streaming or zero write detection.
>
> Image formats may support an optimized metadata representation instead
> of writing
On Wed, Dec 14, 2011 at 01:40:22PM +0100, Paolo Bonzini wrote:
> If the partitions are aligned, the OS will always issue aligned requests,
> because file system blocks are already 4k.
It won't nessecarily. For example XFS will do a fair amount of sub-blocksize
I/O for metadata and the log. Note
FYI, this causes segfaults when doing large streaming writes when
running against a sheepdog cluster which:
a) has relatively fast SSDs
and
b) uses buffered I/O.
Unfortunately I can't get a useful backtrace out of gdb. When running just
this commit I at least get some debugging messages:
On Mon, Dec 19, 2011 at 11:13:24AM -0600, Anthony Liguori wrote:
> Hi,
>
> I've published a set of tests I wrote over the weekend on qemu.org. My
> motivations were 1) to prevent regressions like the libguestfs one and 2)
> to have an easier way to do development testing as I work on QEMU Object
On Fri, Dec 23, 2011 at 02:38:50PM +0100, Christoph Hellwig wrote:
> FYI, this causes segfaults when doing large streaming writes when
> running against a sheepdog cluster which:
>
> a) has relatively fast SSDs
>
> and
>
> b) uses buffered I/O.
>
> Unfo
On Fri, Dec 30, 2011 at 10:35:01AM +, Stefan Hajnoczi wrote:
> Are you building with gcc 4.5.3 or later? (Earlier versions may
> mis-compile, see https://bugs.launchpad.net/qemu/+bug/902148.)
I'm using "gcc version 4.6.2 (Debian 4.6.2-9)", so that should not
be an issue.
> If you can reprod
Forgot to attach the "go-fast" patch required to reproduce the issue
on slow storage:
Index: sheepdog/sheep/simple_store.c
===
--- sheepdog.orig/sheep/simple_store.c 2011-12-29 20:10:44.171622413 +0100
+++ sheepdog/sheep/simple_store
On Tue, Jan 03, 2012 at 08:16:55AM +, Stefan Hajnoczi wrote:
> On Mon, Jan 02, 2012 at 04:39:59PM +0100, Christoph Hellwig wrote:
> > I've tried to understand how the recursive calling happens, but
> > unfortunately
> > the whole coroutine code lacks any sort of d
> +static int qemu_rbd_snap_remove(BlockDriverState *bs,
> +const char *snapshot_name)
> +{
> +BDRVRBDState *s = bs->opaque;
> +int r;
> +
> +r = rbd_snap_remove(s->image, snapshot_name);
> +r = rbd_snap_rollback(s->image, snapshot_name);
Have these
On Thu, Mar 15, 2012 at 07:59:20AM +, Stefan Hajnoczi wrote:
> On Wed, Mar 14, 2012 at 11:04:24PM +0100, Stefan Weil wrote:
> > What's the reason for "_supported_os Linux" in each test?
> > Because of this statement, the tests only run on Linux today,
> > although they could also run on BSD, w3
On Fri, Mar 09, 2012 at 04:07:43PM +, Stefan Hajnoczi wrote:
> Paolo, your discard improvements in QEMU add FALLOC_FL_PUNCH_HOLE
> support. XFS supports this fallocate() flag in current kernels,
> thereby making the XFS-specific support obsolete.
>
> I'm wondering whether it's worth expanding
On Fri, Mar 09, 2012 at 06:16:54PM +0100, Paolo Bonzini wrote:
> > I'm wondering whether it's worth expanding the SELinux policy if we
> > will have no fstatfs(2) callers in QEMU. Are you planning to drop the
> > XFS code?
>
> Chris Wedgwood said that on XFS you want to do discard even if the fil
On Sat, Mar 10, 2012 at 12:02:40PM -0600, Richard Laager wrote:
> If we could probe for FALLOC_FL_PUNCH_HOLE support, then we could avoid
> advertising discard support based on FALLOC_FL_PUNCH_HOLE when it is not
> going to work. This would side step these problems. You said it wasn't
> possible to
On Wed, Mar 14, 2012 at 01:01:35PM +0100, Kevin Wolf wrote:
> Paolo mentioned a use case as a fast way for guests to write zeros, but
> is it really faster than a normal write when we have to emulate it by a
> bdrv_write with a temporary buffer of zeros? On the other hand we have
> the cases where
On Wed, Mar 14, 2012 at 01:14:18PM +0100, Paolo Bonzini wrote:
>
> Note that the discard granularity is only a hint, so it's really more a
> maximum suggested value than a granularity. Outside of a cluster
> boundary the format would still have to write zeros manually.
>
> Also, Linux for exampl
On Wed, Mar 14, 2012 at 01:49:48PM +0100, Paolo Bonzini wrote:
> It does make the distinction. "I don't care" is UNMAP (or WRITE
> SAME(16) with the UNMAP bit set); "I want to have zeroes" is WRITE
> SAME(10) or WRITE SAME(16) with an all-zero payload.
But once the taget sets the unmap zeroes dat
On Thu, Mar 08, 2012 at 06:15:13PM +0100, Paolo Bonzini wrote:
> Allow discard to fail, and fall back to the write operation. This
> is needed because there's no simple way to probe for availability
> of FALLOC_FL_PUNCH_HOLE.
So you switch on advertising TRIM support in the patch before, and then
On Fri, Mar 09, 2012 at 02:36:50PM -0600, Richard Laager wrote:
> I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the
> discard has made it to stable storage. If they don't, does O_DIRECT or
> O_DSYNC on open() cause them to make any such guarantee? If not, should
> you be calli
On Thu, Mar 08, 2012 at 06:15:17PM +0100, Paolo Bonzini wrote:
> SEEK_DATA and SEEK_HOLE can be used to implement the is_allocated
> callback for raw files. These currently work on btrfs, with an XFS
> implementation also coming soon.
Btw - if you're interested in a bit more kernel hacking it wou
On Sun, Mar 11, 2012 at 04:03:01PM -0500, Leonardo E. Reiter wrote:
> indeed mmap() is used in the code. This is unfortunate that it cannot be
> used. It's a really high performance way to achieve what we want here, and
> very safe for the use-case. Of course the only medium we support in the
>
On Mon, Mar 26, 2012 at 10:44:07AM +0100, Daniel P. Berrange wrote:
> This suggests that there be a new command line param to '-drive' to turn
> discard support on/off, since QEMU can't reliably know if the raw file
> it is given is intended to be fully pre-allocated by the mgmt app.
Yes.
On Mon, Mar 26, 2012 at 02:40:47PM -0500, Richard Laager wrote:
> On Sat, 2012-03-24 at 16:27 +0100, Christoph Hellwig wrote:
> > > has_discard = !fallocate(s->fd, FALLOC_FL_PUNCH_HOLE |
> > > FALLOC_FL_KEEP_SIZE,
> >
> > There is no point in usi
On Tue, Mar 27, 2012 at 04:48:19PM +0100, Stefano Stabellini wrote:
> Anthony,
> please pull this small patch series that allows xen_disk to be used
> correctly with NATIVE_AIO and O_DIRECT.
>
> This series should be backported to the stable branch too.
Any plans to add BLKIF_OP_FLUSH_DISKCACHE s
Why are you even trying this again? As explained very clearly last time you
can't change from a writeback-style to a write-through style I/O from
the monitor without creating massive data integrity problems. See my
patchset that allows changing this from the guest for how it should be
done - I ju
On Mon, May 16, 2011 at 04:10:21PM -0500, Anthony Liguori wrote:
> To further clarify:
>
> Today cache=none|writethrough|writeback does two things. It:
>
> 1) Changes the WCE flag that's visible to the guest
>
> 2) Determines whether the host page cache is used for doing guest I/O
>
> As Christoph
Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache,
but no writeback semantics. All existing callers are changed to also
specify BDRV_O_CACHE_WB to give them writeback semantics.
Signed-off-by: Christoph Hellwig
Index: qemu/block.c
This patchset resurrects my older patches to add TRIM support. The
old approach to pass an action handler through the IDE subsystem doesn't
work any more after the refactoring for the AHCI driver, so not we simple
switch on the type of command passed in.
Make dma_bdrv_io available for drivers, and pass an explicit I/O function
instead of hardcoding bdrv_aio_readv/bdrv_aio_writev. This is required
to implement non-READ/WRITE dma commands in the ide driver, e.g. the
upcoming TRIM support.
Signed-off-by: Christoph Hellwig
Index: qemu/dma
Add support for TRIM sub function of the data set management command,
and wire it up to the qemu discard infrastructure.
Signed-off-by: Christoph Hellwig
Index: qemu/hw/ide/core.c
===
--- qemu.orig/hw/ide/core.c 2011-05-18 20
Replace the is_read flag with a dma_cmd flag to allow the dma and
restart logic to handler other commands like TRIM.
Signed-off-by: Christoph Hellwig
Index: qemu/hw/ide/core.c
===
--- qemu.orig/hw/ide/core.c 2011-05-18 20:24
On Thu, May 19, 2011 at 09:56:37AM +0100, Stefan Hajnoczi wrote:
> Please post the output of "qemu-img info Ubuntu.vmdk". I suspect this
> image file is not being recognized as vmdk and is being treated as a
> raw image, hence the literal copy of its 512-byte sector size
> contents.
It's not. It
Looks good,
Reviewed-by: Christoph Hellwig
t, while the later will free the request data
> structure, thus causing the new read or write request to use a
> freed/stale structure when it completes.
>
> This patch fixes the bug, fixing a crash with scsi-generic & RHEL5.5
> installer.
Looks good,
Reviewed-by: Christoph Hellwig
On Tue, May 17, 2011 at 01:00:49PM +0200, Paolo Bonzini wrote:
> This abstracts calling the command_complete callback, reducing churn
> in the following patches.
Adding the helper sounds fine to me, but the name seems rather badly
chosen. What about scsi_req_get_more_data or similar? Either way
> qbus_create_inplace(&bus->qbus, &scsi_bus_info, host, NULL);
> bus->busnr = next_scsi_bus++;
> bus->tcq = tcq;
> bus->ndev = ndev;
> -bus->complete = complete;
> +bus->ops = *ops;
Normally bus->ops would be a pointer, so you can just assign it to
the address passed in
> --- a/hw/scsi-bus.c
> +++ b/hw/scsi-bus.c
> @@ -136,6 +136,7 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d,
> uint32_t tag, uint32_t l
> SCSIRequest *req;
>
> req = qemu_mallocz(size);
> +req->refcount = 2;
> req->bus = scsi_bus_from_device(d);
> req->dev =
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looksgood,
Reviewed-by: Christoph Hellwig
> -void scsi_req_enqueue(SCSIRequest *req)
> +int32_t scsi_req_enqueue(SCSIRequest *req, uint8_t *buf)
> {
> +int32_t rc;
> assert(!req->enqueued);
> scsi_req_ref(req);
> req->enqueued = true;
> QTAILQ_INSERT_TAIL(&req->dev->requests, req, next);
> +
> +/* Make sure the
Looks good, but the scsi_req_kick name doesn't sound to useful.
Either way some documentation on what it does would be useful.
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
> -typedef struct SCSIDiskReq {
> + typedef struct SCSIDiskReq {
Random whiyespace change.
Otherwise looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
Looks good,
Reviewed-by: Christoph Hellwig
> +static void esp_command_complete(SCSIRequest *req, uint32_t arg)
Shouldn't the "arg" argument to the new ->command_complete be renamed
to something like "sense" or "status"?
> +static void esp_transfer_data(SCSIRequest *req, uint32_t arg)
> +{
> +ESPState *s = DO_UPCAST(ESPState, busdev.qd
On Fri, May 20, 2011 at 05:03:31PM +0200, Paolo Bonzini wrote:
> This is the second part of my SCSI work. The first is still pending
> and this one is incomplete, but I still would like to get opinions
> early enough because this design directly affects the UI.
>
> This series is half of the work
Looks good,
Reviewed-by: Christoph Hellwig
On Mon, May 23, 2011 at 12:34:39PM +0200, Christoph Egger wrote:
>
> This does 2 things:
> - use the correct way to get the size of a disk device or partition
> (from h...@netbsd.org)
> - if given a block device, use the character device instead.
> (from bou...@netbsd.org)
Please split tha
> +if (lstat(filename, &sb) < 0) {
> +fprintf(stderr, "%s: stat failed: %s\n", filename,
> strerror(errno));
> +return -errno;
> +}
> +
> +if (S_ISBLK(sb.st_mode))
> +filename = raw_get_rawdevice(filename);
Please move the lstat and S_ISBLK check into raw_get_r
On Fri, May 20, 2011 at 05:03:33PM +0200, Paolo Bonzini wrote:
> SAM logical unit numbers are complicated beasts that can address
> multiple levels of buses and targets before finally reaching
> logical units. Begin supporting them by correctly parsing vSCSI
> LUNs.
>
> Note that with the current
On Fri, May 20, 2011 at 05:03:34PM +0200, Paolo Bonzini wrote:
> This allows passthrough of devices with LUN != 0, by redirecting them to
> LUN0 in the emulated target.
I'm not quite sure what this code is for. Each /dev/sg device reresents
a LUN. So if we want to suport multiple LUNs in qemu fo
On Fri, May 20, 2011 at 07:37:30PM +0200, Paolo Bonzini wrote:
> On 05/20/2011 06:14 PM, Christoph Hellwig wrote:
>> I don't quite understand what you mean with path here. It doesn't
>> seem to map to any SAM concept, nor does it seem to be related
>> to traditio
> case REPORT_LUNS:
> +assert(!s->lun);
Besides REPORT_LUNS really belonging into the core code as mentioned before
the assert seems dangerous to me. What protects a guest from issuing a
REPORT LUNS for a non-zero LUN and hitting this assert? Note that SPC
explicitly allows sending
t
> allocation.
>
> Luckily, this qcow2_cache_put shouldn't fail anyway because the L2 table is
> only read, so that qcow2_cache_put doesn't even involve I/O.
Looks good,
Reviewed-by: Christoph Hellwig
Thanks, applied.
On Fri, May 27, 2011 at 06:45:03PM +0800, Feiran Zheng wrote:
> Patch on qemu-iotest.
>
> 005, test of creating 5TB images, not practical on raw format, so not run on
> it.
It's perfectly fine on raw, just try it.
On Fri, May 27, 2011 at 06:48:49PM +0800, Feiran Zheng wrote:
> Does this mean one must have that large fs space?
No.
On Wed, May 25, 2011 at 05:20:59PM +0200, Paolo Bonzini wrote:
> I agree. This case of INQUIRY is needed because (for simplicity and
> backwards compatibility) you can hang a scsi-disk or scsi-generic device
> directly off the HBA, without the intermediate pseudo-device that handles
> dispatchi
On Sat, May 28, 2011 at 12:51:04PM +0800, Fam Zheng wrote:
> But it says I can't create a 5000G raw image, this is the output of
> `./check 005` (with qemu-img version 0.14.50)
Somewhat confusingly qemu-img converts any EFBIG return to this
message. For some reason you have a really crappy filesy
ping?
On Thu, May 19, 2011 at 10:57:48AM +0200, Christoph Hellwig wrote:
> This patchset resurrects my older patches to add TRIM support. The
> old approach to pass an action handler through the IDE subsystem doesn't
> work any more after the refactoring for the AHCI driver, so
Thanks, applied.
And sorry, I did it before reading stefan's review, so the tag is missing in
the commit.
On Thu, Jun 09, 2011 at 11:47:24AM +0200, Andreas F?rber wrote:
> Am 09.06.2011 um 11:33 schrieb Christoph Hellwig:
>
>> Thanks, applied.
>>
>> And sorry, I did it before reading stefan's review, so the tag is missing
>> in
>> the commit.
>
>
> gi
On Wed, Jun 15, 2011 at 09:46:10AM -0400, Federico Simoncelli wrote:
> qemu-img currently writes disk images using writeback and filling
> up the cache buffers which are then flushed by the kernel preventing
> other processes from accessing the storage.
> This is particularly bad in cluster environ
Looks good,
Reviewed-by: Christoph Hellwig
Only safe because device hot unplug automatically destroys the
> BlockDriverState. But that's a questionable feature, best not to rely
> on it.
Looks good,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:38PM +0200, Markus Armbruster wrote:
> So we can more easily add device model callbacks.
>
> Signed-off-by: Markus Armbruster
Looks good,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:39PM +0200, Markus Armbruster wrote:
> Multiplexing callbacks complicates matters needlessly.
Looks good,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:40PM +0200, Markus Armbruster wrote:
> It's been disabled since the start (commit 19cb3738, Aug 2006), and
> has been untouched except for spelling fixes and such. I don't feel
> like dragging it along any further.
>
> Signed-off-by: Markus Armbruster
Ok, let's nuk
On Wed, Jul 20, 2011 at 06:23:44PM +0200, Markus Armbruster wrote:
> Drop WIN_SRST, it has same value as WIN_DEVICE_RESET.
>
> CFA_IDLEIMMEDIATE isn't specific to CFATA. ACS-2 shows it as a
> defined command in ATA-1, -2 and -3. Rename to WIN_IDLEIMMEDIATE2.
>
> Turn unused macros into comments
Looks good,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:46PM +0200, Markus Armbruster wrote:
> Must set the ATAPI device signature, see ACS-2 7.36.6 Outputs for
> PACKET feature set devices.
Odd but true, even if it's 7.38.2 in my local copy of the ACS spec.
It defintively should be documented in a comment next to the cod
On Wed, Jul 20, 2011 at 06:23:47PM +0200, Markus Armbruster wrote:
> No functional change.
>
> It would be nice to have handler functions in the table, like commit
> e1a064f9 did for ATAPI. Left for another day.
Looks good,
Reviewed-by: Christoph Hellwig
me for CFA_WRITE_SECT_WO_ERASE, CFA_WRITE_MULTI_WO_ERASE. Restrict
> them to IDE_CFATA, like the other CFA_ commands.
Looks good,
Reviewed-by: Christoph Hellwig
as the bdrv_eject name :)
Looks fine anyway,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:50PM +0200, Markus Armbruster wrote:
> We already track it in BlockDriverState since commit 4be9762a. As
> discussed in that commit's message, we should track it in the device
> device models instead, because it's device state.
Looks good,
Rev
On Wed, Jul 20, 2011 at 06:23:51PM +0200, Markus Armbruster wrote:
>
> Signed-off-by: Markus Armbruster
Looks good, although I would have folded this patch into the previous
one.
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:52PM +0200, Markus Armbruster wrote:
> Use a table to declare which drive kinds accept each command.
>
> Limit READ_CAPACITY, READ_TOC, GET_CONFIGURATION to SCSI_CD, as per
> SPC-4.
READ CAPACITY is defined in SBC, and absolutely required for proper
operation with di
On Wed, Jul 20, 2011 at 06:23:53PM +0200, Markus Armbruster wrote:
>
> Signed-off-by: Markus Armbruster
Looks good,
Reviewed-by: Christoph Hellwig
On Wed, Jul 20, 2011 at 06:23:54PM +0200, Markus Armbruster wrote:
> We already track it in BlockDriverState since commit 4be9762a. As
> discussed in that commit's message, we should track it in the device
> device models instead, because it's device state.
Looks good,
Rev
lf */
Why not comment this near the actual code? The footnote scheme is
pretty odd and I've not seen it anywhere else in the code.
Otherwise looks fine,
Reviewed-by: Christoph Hellwig
1 - 100 of 765 matches
Mail list logo