[Qemu-devel] QEMU graphics API
Good morning, Is there documentation somewhere describing the QEMU graphics API - e.g. I'm looking for some guidance on developing a new QEMU frontend (like the SDL/SDL2 frontend)? Many thanks, Anna
Re: [Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
On 2014-11-15 at 15:50, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Add a test for conversion between different refcount widths and errors specific to certain widths (i.e. snapshots with refcount_width=1). Signed-off-by: Max Reitz --- tests/qemu-iotests/112 | 252 + tests/qemu-iotests/112.out | 131 +++ tests/qemu-iotests/group | 1 + 3 files changed, 384 insertions(+) create mode 100755 tests/qemu-iotests/112 create mode 100644 tests/qemu-iotests/112.out +echo +echo '=== Testing too many references for check ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1" _make_test_img 64M +print_refcount_width + +# This cluster should be created at 0x5 +$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io +# Now make the second L2 entry (the L2 table should be at 0x4) point to that +# cluster, so we have two references +poke_file "$TEST_IMG" $((0x40008)) "\x80\x00\x00\x00\x00\x05\x00\x00" + +# This should say "please use amend" +_check_test_img -r all + +# So we do that +$QEMU_IMG amend -o refcount_width=2 "$TEST_IMG" +print_refcount_width + +# And try again +_check_test_img -r all I think this section also deserves a test that fuzzes an image with width=64 to intentionally set the most significant bit of one of the refcounts, and make sure that we gracefully diagnose it as invalid. + +echo +echo '=== Multiple walks necessary during amend ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1,cluster_size=512" _make_test_img 64k + +# Cluster 0 is the image header, clusters 1 to 4 are used by the L1 table, a +# single L2 table, the reftable and a single refblock. This creates 58 data +# clusters (actually, the L2 table is created here, too), so in total there are +# then 63 used clusters in the image. With a refcount width of 64, one refblock +# describes 64 clusters (512 bytes / 64 bits/entry = 64 entries), so this will +# make the first target refblock have exactly one free entry. +$QEMU_IO -c "write 0 $((58 * 512))" "$TEST_IMG" | _filter_qemu_io + +# Now change the refcount width; since the first target refblock has exactly one +# free entry, that entry will be used to store its own reference. No other +# refblocks are needed, so then the new reftable will be allocated; since the +# first target refblock is completely filled up, this will require a new +# refblock which is why the refcount width changing function will need to run +# through everything one more time until the allocations are stable. +$QEMU_IMG amend -o refcount_width=64 "$TEST_IMG" +print_refcount_width Umm, that sounds backwards from what you document. It's a good test of the _new_ reftable needing a second round of allocations. So keep it with corrected comments. But I think you _intended_ to write a test that starts with a refcount_width=64 image and resize to a refcount_width=1, where the _old_ reftable then suffers a reallocation as part of allocating refblocks for the new table. That's what you intended, but that's harder to test, so I settled for this (and the comments are appropriate (note that "target refblock" refers to the refblocks after amendment); note that this does indeed fail with v1 of this series. It may even help if you add a tracepoint for every iteration through the walk function callback, to prove we are indeed executing it 3 times instead of the usual 2, for these test cases. That's a good idea! I thought about adding some info, but totally forgot about trace points. I'll see whether I can add a test for increasing the size of the original reftable during amendment, too. Max
Re: [Qemu-devel] [PATCH v2 06/21] qcow2: Helper for refcount array reallocation
On 2014-11-15 at 17:50, Eric Blake wrote: On 11/14/2014 06:05 AM, Max Reitz wrote: Add a helper function for reallocating a refcount array, independently s/independently/independent/ of the refcount order. The newly allocated space is zeroed and the function handles failed reallocations gracefully. This patch is doing two things: it is refactoring things into a nice helper function (mentioned), AND it is adding a guarantee that you now always allocate a table on cluster boundaries, even when you aren't using the full table (hinted at elsewhere in the series, but noticeably absent here). I think you want to add more comments to the commit message making that more obvious, since it looks like you rely on that guarantee later. Will do. Signed-off-by: Max Reitz --- block/qcow2-refcount.c | 121 + 1 file changed, 72 insertions(+), 49 deletions(-) + +static int realloc_refcount_array(BDRVQcowState *s, uint16_t **array, + int64_t *size, int64_t new_size) I think this function deserves a comment stating that *array is actually allocated to full cluster size with a 0 tail, so that it can be written straight to disk. OK, will add a comment. +{ +/* Round to clusters so the array can be directly written to disk */ +size_t old_byte_size = ROUND_UP(refcount_array_byte_size(s, *size), +s->cluster_size); +size_t new_byte_size = ROUND_UP(refcount_array_byte_size(s, new_size), +s->cluster_size); +uint16_t *new_ptr; Can old_byte_size ever equal new_byte_size? Or are we guaranteed that this will only be called when we really need to add another cluster to the reftable? [reading further] Yes, it looks like *size and new_size are not necessarily cluster-aligned, so as an example, it is very likely that we might call realloc_refcount_array with the existing size of 20 and a new size of 21, both of which fit within the same byte size when rounded up to cluster boundary. But that means that the realloc is a no-op in that case; might it be worth special-casing rather than wasting time on the g_try_realloc and no-op memset? [at least the code works correctly even without a special case shortcut] Well, it's probably not necessary, but it will look most likely look better to catch that case. + +new_ptr = g_try_realloc(*array, new_byte_size); +if (new_byte_size && !new_ptr) { +return -ENOMEM; +} Is it worth asserting that new_byte_size is non-zero? Why would anyone ever call this to resize down to 0? (But I can see where you DO call it with old_byte_size of zero, when initializing data structures and using this function for the first allocation.) Hm, considering every image that can be opened using the qcow2 driver needs at least one cluster (the header), we can outrule that this is called with new_size == 0 (which would be the only way new_byte_size could ever be 0 either). + +if (new_ptr) { If we assert that new_byte_size is non-zero, then at this point, new_ptr is non-NULL and this condition is pointless. +memset((void *)((uintptr_t)new_ptr + old_byte_size), 0, + new_byte_size - old_byte_size); +} + +*array = new_ptr; +*size = new_size; + +return 0; +} Code looks correct as written, whether or not you also add more comments, asserts, and/or shortcuts for no-op situations. So: Reviewed-by: Eric Blake Thanks! Max
Re: [Qemu-devel] [PATCH v2 07/21] qcow2: Helper function for refcount modification
On 2014-11-15 at 18:02, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Since refcounts do not always have to be a uint16_t, all refcount blocks and arrays in memory should not have a specific type (thus they become pointers to void) and for accessing them, two helper functions are used (a getter and a setter). Those functions are called indirectly through function pointers in the BDRVQcowState so they may later be exchanged for different refcount orders. Signed-off-by: Max Reitz --- block/qcow2-refcount.c | 128 ++--- block/qcow2.h | 8 2 files changed, 87 insertions(+), 49 deletions(-) @@ -1216,7 +1249,7 @@ enum { * error occurred. */ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res, -uint16_t **refcount_table, int64_t *refcount_table_size, int64_t l2_offset, +void **refcount_table, int64_t *refcount_table_size, int64_t l2_offset, int flags) { Might be worth fixing the indentation here in addition to all the other places you adjusted. But that's minor. @@ -1933,17 +1967,13 @@ write_refblocks: goto fail; } -on_disk_refblock = qemu_blockalign0(bs->file, s->cluster_size); -for (i = 0; i < s->refcount_block_size && -refblock_start + i < *nb_clusters; i++) -{ -on_disk_refblock[i] = -cpu_to_be16((*refcount_table)[refblock_start + i]); -} +/* The size of *refcount_table is always cluster-aligned, therefore the + * write operation will not overflow */ +on_disk_refblock = (void *)((uintptr_t)*refcount_table + +(refblock_index << s->refcount_block_bits)); Here is where you are relying on the guarantee that you added in 6/21, which is why I ask for that one to mention it. Nice reduction of a bounce buffer, by the way :) Worth mentioning in the commit message as an intentional part of this commit? Why not. @@ -2087,7 +2117,7 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res, /* Because the old reftable has been exchanged for a new one the * references have to be recalculated */ rebuild = false; -memset(refcount_table, 0, nb_clusters * sizeof(uint16_t)); +memset(refcount_table, 0, nb_clusters * s->refcount_bits / 8); Phew; we're safe that this won't overflow; and good that you do the * first (if you did the /8 first, it would fail for sub-byte refcounts). Thanks for catching this, it is wrong (albeit it does the right thing). It should use refcount_array_byte_size(), which was in this version of the series introduced before this patch, so it's an artifact of swapping patch 6 and 7. Max Reviewed-by: Eric Blake
Re: [Qemu-devel] QEMU graphics API
On Mo, 2014-11-17 at 08:32 +, Anna Fischer wrote: > Good morning, > > Is there documentation somewhere describing the QEMU graphics API - > e.g. I'm looking for some guidance on developing a new QEMU frontend > (like the SDL/SDL2 frontend)? Not really. my kvm forum talk covers some of it: https://www.kraxel.org/slides/qemu-gfx/ cheers, Gerd
Re: [Qemu-devel] [PATCH v2 08/21] qcow2: More helpers for refcount modification
On 2014-11-15 at 18:08, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Add helper functions for getting and setting refcounts in a refcount array for any possible refcount order, and choose the correct one during refcount initialization. Signed-off-by: Max Reitz --- block/qcow2-refcount.c | 146 - 1 file changed, 144 insertions(+), 2 deletions(-) +static void get_refcount_functions(int refcount_order, + Qcow2GetRefcountFunc **get, + Qcow2SetRefcountFunc **set) +{ +switch (refcount_order) { +case 0: +*get = &get_refcount_ro0; +*set = &set_refcount_ro0; +break; Bike-shedding: instead of a switch statement and open-coded assignments, is it worth setting up an array of function pointers where you just grab the correct functions by doing array[refcount_order]? But I don't see any strong reason to change style; what you have works. I thought about it, but it wouldn't get much shorter. But maybe it looks nicer. I ought to think about it again. Max Reviewed-by: Eric Blake
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On 2014/11/17 14:10, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: On 2014/11/5 22:09, Michael S. Tsirkin wrote: On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: Currently IGD drivers always need to access PCH by 1f.0, and PCH vendor/device id is used to identify the card. Signed-off-by: Tiejun Chen --- hw/i386/pc_piix.c | 28 +++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b559181..b19c7a9 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -50,7 +50,7 @@ #include "cpu.h" #include "qemu/error-report.h" #ifdef CONFIG_XEN -# include +#include #endif #define MAX_IDE_BUS 2 @@ -452,6 +452,31 @@ static void pc_init_isa(MachineState *machine) } #ifdef CONFIG_XEN +static void xen_igd_passthrough_isa_bridge_create(PCIBus *bus) +{ +struct PCIDevice *dev; +Error *local_err = NULL; +uint16_t device_id = 0x; + +/* Currently IGD drivers always need to access PCH by 1f.0. */ +dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0), +"xen-igd-passthrough-isa-bridge"); + +/* Identify PCH card with its own real vendor/device ids. + * Here that vendor id is always PCI_VENDOR_ID_INTEL. + */ +if (dev) { +device_id = object_property_get_int(OBJECT(dev), "device-id", +&local_err); +if (!local_err && device_id != 0x) { +pci_config_set_device_id(dev->config, device_id); +return; +} +} + +fprintf(stderr, "xen set xen-igd-passthrough-isa-bridge failed\n"); +} + static void pc_xen_hvm_init(MachineState *machine) { PCIBus *bus; @@ -461,6 +486,7 @@ static void pc_xen_hvm_init(MachineState *machine) bus = pci_find_primary_bus(); if (bus != NULL) { pci_create_simple(bus, -1, "xen-platform"); +xen_igd_passthrough_isa_bridge_create(bus); } } #endif Can't we defer this step until the GPU is added? Sounds great but I can't figure out where we can to do this exactly. This way there won't be need to poke at host device directly, you could get all info from dev->config of the host device. As I understand We have two steps here: #1 At first I have to write something to check if we're registering 00:02.0 & IGD, right? But where? While registering each pci device? In xen_pt_initfn. Just check the device and vendor ID against the table you have. Okay. Please see the follows which is just compiled: diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c index c6466dc..f3ea313 100644 --- a/hw/xen/xen_pt.c +++ b/hw/xen/xen_pt.c @@ -632,6 +632,94 @@ static const MemoryListener xen_pt_io_listener = { .priority = 10, }; +typedef struct { +uint16_t gpu_device_id; +uint16_t pch_device_id; +} XenIGDDeviceIDInfo; + +/* In real world different GPU should have different PCH. But actually + * the different PCH DIDs likely map to different PCH SKUs. We do the + * same thing for the GPU. For PCH, the different SKUs are going to be + * all the same silicon design and implementation, just different + * features turn on and off with fuses. The SW interfaces should be + * consistent across all SKUs in a given family (eg LPT). But just same + * features may not be supported. + * + * Most of these different PCH features probably don't matter to the + * Gfx driver, but obviously any difference in display port connections + * will so it should be fine with any PCH in case of passthrough. + * + * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell) + * scenarios, 0x9cc3 for BDW(Broadwell). + */ +static const XenIGDDeviceIDInfo xen_igd_combo_id_infos[] = { +/* HSW Classic */ +{0x0402, 0x8c4e}, /* HSWGT1D, HSWD_w7 */ +{0x0406, 0x8c4e}, /* HSWGT1M, HSWM_w7 */ +{0x0412, 0x8c4e}, /* HSWGT2D, HSWD_w7 */ +{0x0416, 0x8c4e}, /* HSWGT2M, HSWM_w7 */ +{0x041E, 0x8c4e}, /* HSWGT15D, HSWD_w7 */ +/* HSW ULT */ +{0x0A06, 0x8c4e}, /* HSWGT1UT, HSWM_w7 */ +{0x0A16, 0x8c4e}, /* HSWGT2UT, HSWM_w7 */ +{0x0A26, 0x8c4e}, /* HSWGT3UT, HSWM_w7 */ +{0x0A2E, 0x8c4e}, /* HSWGT3UT28W, HSWM_w7 */ +{0x0A1E, 0x8c4e}, /* HSWGT2UX, HSWM_w7 */ +{0x0A0E, 0x8c4e}, /* HSWGT1ULX, HSWM_w7 */ +/* HSW CRW */ +{0x0D26, 0x8c4e}, /* HSWGT3CW, HSWM_w7 */ +{0x0D22, 0x8c4e}, /* HSWGT3CWDT, HSWD_w7 */ +/* HSW Server */ +{0x041A, 0x8c4e}, /* HSWSVGT2, HSWD_w7 */ +/* HSW SRVR */ +{0x040A, 0x8c4e}, /* HSWSVGT1, HSWD_w7 */ +/* BSW */ +{0x1606, 0x9cc3}, /* BDWULTGT1, BDWM_w7 */ +{0x1616, 0x9cc3}, /* BDWULTGT2, BDWM_w7 */ +{0x1626, 0x9cc3}, /* BDWULTGT3, BDWM_w7 */ +{0x160E, 0x9cc3}, /* BDWULXGT1, BDWM_w7 */ +{0x161E, 0x9cc3}, /* BDWULXGT2, BDWM_w7 */ +{0x1602, 0x9cc3}, /* BDWHALOGT1, BDWM_w7 */ +{0x1612, 0x9cc3}, /* BDWHALOGT2, BDWM_w7 */ +{0x1622, 0x9cc3}, /* BDWHALOGT3, BDWM_w7 */ +{0x1
[Qemu-devel] Possible approaches to limit csw overhead
Hello, I have a rather practical question, is it possible to limit amount of vm-initiated events for single VM? As and example, VM which experienced OOM and effectively stuck dead generates a lot of unnecessary context switches triggering do_raw_spin_lock very often and therefore increasing overall compute workload. This possibly can be done via reactive limitation of the cpu quota via cgroup, but such method is quite impractical because every orchestration solution will need to implement its own piece of code to detect such VM states and act properly. I wonder if there may be a proposal which will do this job better than userspace-implemented perf statistics loop. Thanks!
Re: [Qemu-devel] [PATCH V8 0/3] Virtual Machine Generation ID
- Original Message - > From: "Michael S. Tsirkin" > To: "Gal Hammer" > Cc: pbonz...@redhat.com, qemu-devel@nongnu.org > Sent: Sunday, November 16, 2014 9:49:37 PM > Subject: Re: [Qemu-devel] [PATCH V8 0/3] Virtual Machine Generation ID > > On Sun, Nov 16, 2014 at 12:15:56PM +0200, Gal Hammer wrote: > > Hi, > > > > The patch grow to three parts now. Although it is still add a QEmu > > support for Microsoft's Virtual Machine Generation ID device. > > > > The first is a short device's description, then the ACPI tables > > changes and the actual device and the last patch updates the tests' > > ACPI tables. > > > > Your comment are welcomed. > > > > Thanks, > > > > Gal. > > Looks good, but how about a unit test? > Should be easy: give device a physical address, > read back memory to see that it's been written to. I thought it would make sense to add the tests only after the patch is accepted and the device's implementation is final. > > > > > V8 - Add a device's description file. > >- GUID is stored in fw cfg file and the guest writes the > > physical address to the device (reduces vmexits). > > > > V7 - Move the device's description back to the static SSDT table. > >- The GUID is store in a "hard coded" physical address and not > > in the ACPI table itself. > >- ACPI notification is triggered when the GUID is changed. > > > > V6 - include the pre-compiled ASL file > >- remove an empty line at end of files. > > > > V5 - Move device's description to SSDT table (dynamic). > > > > V4 - Fix a typo in error message string. > >- Move device's description from DSDT back to SSDT table. > > > > V3 - Remove "-uuid" command line parameter. > >- Move device's description from SSDT to DSDT table. > >- Add new "vmgenid" sysbus device. > > > > Gal Hammer (3): > > docs: vm generation id device's description > > i386: Add a Virtual Machine Generation ID device > > tests: update acpi tables after adding the vmgenid device > > > > default-configs/i386-softmmu.mak | 1 + > > default-configs/x86_64-softmmu.mak | 1 + > > docs/specs/vmgenid.txt | 27 > > hw/acpi/core.c | 8 +++ > > hw/acpi/ich9.c | 8 +++ > > hw/acpi/piix4.c | 8 +++ > > hw/i386/acpi-build.c | 26 +++ > > hw/i386/acpi-dsdt.dsl| 4 +- > > hw/i386/acpi-dsdt.hex.generated | 6 +- > > hw/i386/pc.c | 8 +++ > > hw/i386/q35-acpi-dsdt.dsl| 5 +- > > hw/i386/q35-acpi-dsdt.hex.generated | 8 +-- > > hw/i386/ssdt-misc.dsl| 43 > > hw/i386/ssdt-misc.hex.generated | 8 +-- > > hw/isa/lpc_ich9.c| 1 + > > hw/misc/Makefile.objs| 1 + > > hw/misc/vmgenid.c| 131 > > +++ > > include/hw/acpi/acpi.h | 2 + > > include/hw/acpi/acpi_dev_interface.h | 4 ++ > > include/hw/acpi/ich9.h | 2 + > > include/hw/i386/pc.h | 3 + > > include/hw/misc/vmgenid.h| 21 ++ > > tests/acpi-test-data/pc/DSDT | Bin 2807 -> 2820 bytes > > tests/acpi-test-data/pc/SSDT | Bin 3065 -> 3268 bytes > > tests/acpi-test-data/q35/DSDT| Bin 7397 -> 7410 bytes > > tests/acpi-test-data/q35/SSDT| Bin 1346 -> 1549 bytes > > 26 files changed, 313 insertions(+), 13 deletions(-) > > create mode 100644 docs/specs/vmgenid.txt > > create mode 100644 hw/misc/vmgenid.c > > create mode 100644 include/hw/misc/vmgenid.h > > > > -- > > 1.9.3 > >
[Qemu-devel] [Bug 1392504] Re: USB Passthrough is not working anymore
I greatly appriciate your help to find a solution for this issue, but I have used apt-get to installed QEMU already since version 12.10. I never used the QEMU source code for something. Is it possible that you provide an updated executable or library? Regards, Leen -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1392504 Title: USB Passthrough is not working anymore Status in QEMU: New Bug description: After upgrading from Ubuntu 14.04 to Ubuntu 14.10 USB passthrough in QEMU (version is now 2.1.0 - Debian2.1+dfsg-4ubuntu6.1) is not working any more. Already tried to remove and attach the USB devices. I use 1 USB2 HDD + 1 USB3 HDD to a virtual linux machine; 1 USB2 HDD to a virual FNAS machine and a USB 2camera to virtual Win7 machine. All these devices are not working any more. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1392504/+subscriptions
Re: [Qemu-devel] [PATCH v2 0/5] libqos: Virtio MMIO driver
El Sat, 1 Nov 2014 18:02:25 +0100 Marc Marí escribió: > Add virtio-mmio support to libqos and test case for virtio-blk. Ping! It has been two weeks. Is it in somebody's todo list? Thanks Marc
Re: [Qemu-devel] [PATCH V8 0/3] Virtual Machine Generation ID
On Mon, Nov 17, 2014 at 04:03:45AM -0500, Gal Hammer wrote: > > > - Original Message - > > From: "Michael S. Tsirkin" > > To: "Gal Hammer" > > Cc: pbonz...@redhat.com, qemu-devel@nongnu.org > > Sent: Sunday, November 16, 2014 9:49:37 PM > > Subject: Re: [Qemu-devel] [PATCH V8 0/3] Virtual Machine Generation ID > > > > On Sun, Nov 16, 2014 at 12:15:56PM +0200, Gal Hammer wrote: > > > Hi, > > > > > > The patch grow to three parts now. Although it is still add a QEmu > > > support for Microsoft's Virtual Machine Generation ID device. > > > > > > The first is a short device's description, then the ACPI tables > > > changes and the actual device and the last patch updates the tests' > > > ACPI tables. > > > > > > Your comment are welcomed. > > > > > > Thanks, > > > > > > Gal. > > > > Looks good, but how about a unit test? > > Should be easy: give device a physical address, > > read back memory to see that it's been written to. > > I thought it would make sense to add the tests only after the patch is > accepted and the device's implementation is final. OK so please add this as TODO in the cover letter, so we remember to address this before merging. We can't merge new functionality before 2.2 is out, so there's still time. > > > > > > > > > V8 - Add a device's description file. > > >- GUID is stored in fw cfg file and the guest writes the > > > physical address to the device (reduces vmexits). > > > > > > V7 - Move the device's description back to the static SSDT table. > > >- The GUID is store in a "hard coded" physical address and not > > > in the ACPI table itself. > > >- ACPI notification is triggered when the GUID is changed. > > > > > > V6 - include the pre-compiled ASL file > > >- remove an empty line at end of files. > > > > > > V5 - Move device's description to SSDT table (dynamic). > > > > > > V4 - Fix a typo in error message string. > > >- Move device's description from DSDT back to SSDT table. > > > > > > V3 - Remove "-uuid" command line parameter. > > >- Move device's description from SSDT to DSDT table. > > >- Add new "vmgenid" sysbus device. > > > > > > Gal Hammer (3): > > > docs: vm generation id device's description > > > i386: Add a Virtual Machine Generation ID device > > > tests: update acpi tables after adding the vmgenid device > > > > > > default-configs/i386-softmmu.mak | 1 + > > > default-configs/x86_64-softmmu.mak | 1 + > > > docs/specs/vmgenid.txt | 27 > > > hw/acpi/core.c | 8 +++ > > > hw/acpi/ich9.c | 8 +++ > > > hw/acpi/piix4.c | 8 +++ > > > hw/i386/acpi-build.c | 26 +++ > > > hw/i386/acpi-dsdt.dsl| 4 +- > > > hw/i386/acpi-dsdt.hex.generated | 6 +- > > > hw/i386/pc.c | 8 +++ > > > hw/i386/q35-acpi-dsdt.dsl| 5 +- > > > hw/i386/q35-acpi-dsdt.hex.generated | 8 +-- > > > hw/i386/ssdt-misc.dsl| 43 > > > hw/i386/ssdt-misc.hex.generated | 8 +-- > > > hw/isa/lpc_ich9.c| 1 + > > > hw/misc/Makefile.objs| 1 + > > > hw/misc/vmgenid.c| 131 > > > +++ > > > include/hw/acpi/acpi.h | 2 + > > > include/hw/acpi/acpi_dev_interface.h | 4 ++ > > > include/hw/acpi/ich9.h | 2 + > > > include/hw/i386/pc.h | 3 + > > > include/hw/misc/vmgenid.h| 21 ++ > > > tests/acpi-test-data/pc/DSDT | Bin 2807 -> 2820 bytes > > > tests/acpi-test-data/pc/SSDT | Bin 3065 -> 3268 bytes > > > tests/acpi-test-data/q35/DSDT| Bin 7397 -> 7410 bytes > > > tests/acpi-test-data/q35/SSDT| Bin 1346 -> 1549 bytes > > > 26 files changed, 313 insertions(+), 13 deletions(-) > > > create mode 100644 docs/specs/vmgenid.txt > > > create mode 100644 hw/misc/vmgenid.c > > > create mode 100644 include/hw/misc/vmgenid.h > > > > > > -- > > > 1.9.3 > > > >
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On Mon, Nov 17, 2014 at 04:48:32PM +0800, Chen, Tiejun wrote: > On 2014/11/17 14:10, Michael S. Tsirkin wrote: > >On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: > >>On 2014/11/5 22:09, Michael S. Tsirkin wrote: > >>>On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: > Currently IGD drivers always need to access PCH by 1f.0, and > PCH vendor/device id is used to identify the card. > > Signed-off-by: Tiejun Chen > --- > hw/i386/pc_piix.c | 28 +++- > 1 file changed, 27 insertions(+), 1 deletion(-) > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index b559181..b19c7a9 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -50,7 +50,7 @@ > #include "cpu.h" > #include "qemu/error-report.h" > #ifdef CONFIG_XEN > -# include > +#include > #endif > > #define MAX_IDE_BUS 2 > @@ -452,6 +452,31 @@ static void pc_init_isa(MachineState *machine) > } > > #ifdef CONFIG_XEN > +static void xen_igd_passthrough_isa_bridge_create(PCIBus *bus) > +{ > +struct PCIDevice *dev; > +Error *local_err = NULL; > +uint16_t device_id = 0x; > + > +/* Currently IGD drivers always need to access PCH by 1f.0. */ > +dev = pci_create_simple(bus, PCI_DEVFN(0x1f, 0), > +"xen-igd-passthrough-isa-bridge"); > + > +/* Identify PCH card with its own real vendor/device ids. > + * Here that vendor id is always PCI_VENDOR_ID_INTEL. > + */ > +if (dev) { > +device_id = object_property_get_int(OBJECT(dev), "device-id", > +&local_err); > +if (!local_err && device_id != 0x) { > +pci_config_set_device_id(dev->config, device_id); > +return; > +} > +} > + > +fprintf(stderr, "xen set xen-igd-passthrough-isa-bridge failed\n"); > +} > + > static void pc_xen_hvm_init(MachineState *machine) > { > PCIBus *bus; > @@ -461,6 +486,7 @@ static void pc_xen_hvm_init(MachineState *machine) > bus = pci_find_primary_bus(); > if (bus != NULL) { > pci_create_simple(bus, -1, "xen-platform"); > +xen_igd_passthrough_isa_bridge_create(bus); > } > } > #endif > >>> > >>>Can't we defer this step until the GPU is added? > >> > >>Sounds great but I can't figure out where we can to do this exactly. > >> > >>>This way there won't be need to poke at host device > >>>directly, you could get all info from dev->config > >>>of the host device. > >> > >>As I understand We have two steps here: > >> > >>#1 At first I have to write something to check if we're registering 00:02.0 > >>& IGD, right? But where? While registering each pci device? > > > >In xen_pt_initfn. > >Just check the device and vendor ID against the table you have. > > > > Okay. Please see the follows which is just compiled: > > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c > index c6466dc..f3ea313 100644 > --- a/hw/xen/xen_pt.c > +++ b/hw/xen/xen_pt.c > @@ -632,6 +632,94 @@ static const MemoryListener xen_pt_io_listener = { > .priority = 10, > }; > > +typedef struct { > +uint16_t gpu_device_id; > +uint16_t pch_device_id; > +} XenIGDDeviceIDInfo; > + > +/* In real world different GPU should have different PCH. But actually > + * the different PCH DIDs likely map to different PCH SKUs. We do the > + * same thing for the GPU. For PCH, the different SKUs are going to be > + * all the same silicon design and implementation, just different > + * features turn on and off with fuses. The SW interfaces should be > + * consistent across all SKUs in a given family (eg LPT). But just same > + * features may not be supported. > + * > + * Most of these different PCH features probably don't matter to the > + * Gfx driver, but obviously any difference in display port connections > + * will so it should be fine with any PCH in case of passthrough. > + * > + * So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell) > + * scenarios, 0x9cc3 for BDW(Broadwell). > + */ > +static const XenIGDDeviceIDInfo xen_igd_combo_id_infos[] = { > +/* HSW Classic */ > +{0x0402, 0x8c4e}, /* HSWGT1D, HSWD_w7 */ > +{0x0406, 0x8c4e}, /* HSWGT1M, HSWM_w7 */ > +{0x0412, 0x8c4e}, /* HSWGT2D, HSWD_w7 */ > +{0x0416, 0x8c4e}, /* HSWGT2M, HSWM_w7 */ > +{0x041E, 0x8c4e}, /* HSWGT15D, HSWD_w7 */ > +/* HSW ULT */ > +{0x0A06, 0x8c4e}, /* HSWGT1UT, HSWM_w7 */ > +{0x0A16, 0x8c4e}, /* HSWGT2UT, HSWM_w7 */ > +{0x0A26, 0x8c4e}, /* HSWGT3UT, HSWM_w7 */ > +{0x0A2E, 0x8c4e}, /* HSWGT3UT28W, HSWM_w7 */ > +{0x0A1E, 0x8c4e}, /* HSWGT2UX, HSWM_w7 */ > +{0x0A0E, 0x8c4e}, /* HSWGT1ULX, HSWM_w7 */ > +/* HSW CRW */ > +{0x0D26, 0x8c4e}, /* HSWGT3CW, HSWM_w7 */ >
Re: [Qemu-devel] [RFC PATCH v4 11/25] cpu-exec: allow temporary disabling icount
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo > Bonzini > On 07/11/2014 11:32, Pavel Dovgalyuk wrote: > > cpu_restore_state_from_tb(cpu, tb, retaddr); > > +/* tb could be temporary, generated by exec nocache */ > > +tb_phys_invalidate(tb, -1); > > Would you need to do the same sequence as cpu_exec_nocache in this case? Right. > cpu->current_tb = NULL; > tb_phys_invalidate(tb, -1); > tb_free(tb); > > In this case, perhaps something like this would be better: > > diff --git a/cpu-exec.c b/cpu-exec.c > index 10c0f42..faf8041 100644 > --- a/cpu-exec.c > +++ b/cpu-exec.c > @@ -209,7 +209,7 @@ static void cpu_exec_nocache(CPUArchState *env, int > max_cycles, > max_cycles = CF_COUNT_MASK; > > tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags, > - max_cycles); > + max_cycles | CF_NOCACHE); > cpu->current_tb = tb; > /* execute the generated code */ > trace_exec_tb_nocache(tb, tb->pc); > diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h > index 9b67d16..41464f3 100644 > --- a/include/exec/exec-all.h > +++ b/include/exec/exec-all.h > @@ -148,6 +148,7 @@ struct TranslationBlock { > #define CF_COUNT_MASK 0x7fff > #define CF_LAST_IO 0x8000 /* Last insn may be an IO access. */ > #define CF_LAST_IO 0x8000 /* Last insn may be an IO access. */ > +#define CF_NOCACHE 0x4 /* To be freed after execution */ > > void *tc_ptr;/* pointer to the translated code */ > /* next matching tb for physical address. */ > diff --git a/translate-all.c b/translate-all.c > index 1a27df7..6ea4da4 100644 > --- a/translate-all.c > +++ b/translate-all.c > @@ -264,6 +264,12 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) > tb = tb_find_pc(retaddr); > if (tb) { > cpu_restore_state_from_tb(cpu, tb, retaddr); > +if (tb->cflags & CF_NOCACHE) { > +/* one-shot translation, invalidate it immediately */ > +cpu->current_tb = NULL; > +tb_phys_invalidate(tb, -1); > +tb_free(tb); > +} > return true; > } > return false; > > > I pushed it to an icount branch on my github repository, together with > other patches that introduce CF_USE_ICOUNT instead of touching the > use_icount global variable. Can you please take a look and, if it > works, incorporate it in your patch series? Ok, I'll check it. I noticed that you missed this patch in your branch: http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg04034.html Pavel Dovgalyuk
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On 2014/11/17 17:25, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 04:48:32PM +0800, Chen, Tiejun wrote: On 2014/11/17 14:10, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: On 2014/11/5 22:09, Michael S. Tsirkin wrote: On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: Currently IGD drivers always need to access PCH by 1f.0, and PCH vendor/device id is used to identify the card. Signed-off-by: Tiejun Chen --- [snip] Cleaner: if (!pci_dev) { fprintf return; } pci_config_set_device_id(pci_dev->config, pch_id); I will address all comments and thanks. +} +} + /* init */ static int xen_pt_initfn(PCIDevice *d) @@ -682,6 +770,9 @@ static int xen_pt_initfn(PCIDevice *d) return -1; } +/* Register ISA bridge for passthrough GFX. */ +xen_igd_passthrough_isa_bridge_create(s, &s->real_device); + /* reinitialize each config register to be emulated */ if (xen_pt_config_init(s)) { XEN_PT_ERR(d, "PCI Config space initialisation failed.\n"); Note I will introduce a inline function in another patch, +static inline int is_vga_passthrough(XenHostPCIDevice *dev) +{ +return (xen_has_gfx_passthru && (dev->vendor_id == PCI_VENDOR_ID_INTEL) +&& ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA)); +} Thanks Tiejun Why bother with all these conditions? Won't it be enough to check dev->vendor_id == PCI_VENDOR_ID_INTEL? If this is just used for IGD, its always fine without checking vendor_id. So after remove that check, I guess I need to rename that as is_igd_vga_passthrough() to make sense. Thanks Tiejun
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On Mon, Nov 17, 2014 at 05:42:12PM +0800, Chen, Tiejun wrote: > On 2014/11/17 17:25, Michael S. Tsirkin wrote: > >On Mon, Nov 17, 2014 at 04:48:32PM +0800, Chen, Tiejun wrote: > >>On 2014/11/17 14:10, Michael S. Tsirkin wrote: > >>>On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: > On 2014/11/5 22:09, Michael S. Tsirkin wrote: > >On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: > >>Currently IGD drivers always need to access PCH by 1f.0, and > >>PCH vendor/device id is used to identify the card. > >> > >>Signed-off-by: Tiejun Chen > >>--- > > [snip] > > >Cleaner: > > if (!pci_dev) { > > fprintf > > return; > > } > > pci_config_set_device_id(pci_dev->config, pch_id); > > I will address all comments and thanks. > > > > >>+} > >>+} > >>+ > >> /* init */ > >> > >> static int xen_pt_initfn(PCIDevice *d) > >>@@ -682,6 +770,9 @@ static int xen_pt_initfn(PCIDevice *d) > >> return -1; > >> } > >> > >>+/* Register ISA bridge for passthrough GFX. */ > >>+xen_igd_passthrough_isa_bridge_create(s, &s->real_device); > >>+ > >> /* reinitialize each config register to be emulated */ > >> if (xen_pt_config_init(s)) { > >> XEN_PT_ERR(d, "PCI Config space initialisation failed.\n"); > >> > >>Note I will introduce a inline function in another patch, > >> > >>+static inline int is_vga_passthrough(XenHostPCIDevice *dev) > >>+{ > >>+return (xen_has_gfx_passthru && (dev->vendor_id == PCI_VENDOR_ID_INTEL) > >>+&& ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA)); > >>+} > >> > >>Thanks > >>Tiejun > > > >Why bother with all these conditions? > >Won't it be enough to check dev->vendor_id == PCI_VENDOR_ID_INTEL? > > > > If this is just used for IGD, its always fine without checking vendor_id. You need to match device ID to *know* it's IGD. > So after remove that check, I guess I need to rename that as > is_igd_vga_passthrough() to make sense. > > Thanks > Tiejun There is no need to check class code or xen_has_gfx_passthru flag. Device ID + vendor ID identifies each device uniquely. -- MST
[Qemu-devel] [PATCH v3 for-2.2 1/3] raw-posix: Fix comment for raw_co_get_block_status()
Missed in commit 705be72. Signed-off-by: Markus Armbruster Reviewed-by: Paolo Bonzini Reviewed-by: Fam Zheng Reviewed-by: Eric Blake Reviewed-by: Max Reitz --- block/raw-posix.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index e100ae2..706d3c0 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1555,9 +1555,7 @@ static int try_seek_hole(BlockDriverState *bs, off_t start, off_t *data, } /* - * Returns true iff the specified sector is present in the disk image. Drivers - * not implementing the functionality are assumed to not support backing files, - * hence all their sectors are reported as allocated. + * Returns the allocation status of the specified sectors. * * If 'sector_num' is beyond the end of the disk image the return value is 0 * and 'pnum' is set to 0. -- 1.9.3
[Qemu-devel] [PATCH v3 for-2.2 2/3] raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
Commit 5500316 (May 2012) implemented raw_co_is_allocated() as follows: 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() 3. Else pretend there are no holes Later on, raw_co_is_allocated() was generalized to raw_co_get_block_status(). Commit 4f11aa8 (May 2014) changed it to try the three methods in order until success, because "there may be implementations which support [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice versa." Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. Commit 38c4d0a (Sep 2014) added it. Because that's a significant speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. As you see, the obvious use of FIEMAP is wrong, and the correct use is slow. I guess this puts it somewhere between -7 "The obvious use is wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard to Misuse scale[*]. "Fortunately", the FIEMAP code is used only when * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it was introduced for ext4 and btrfs) and 2012. * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails Unlikely. Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for bugs. Worse, bugs hiding there can theoretically bite even on a host that has SEEK_HOLE/SEEK_DATA. I don't want to worry about this crap, not even theoretically. Get rid of it. [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html Signed-off-by: Markus Armbruster Reviewed-by: Max Reitz Reviewed-by: Eric Blake --- block/raw-posix.c | 60 --- 1 file changed, 4 insertions(+), 56 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index 706d3c0..fd80d84 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -60,9 +60,6 @@ #define FS_NOCOW_FL 0x0080 /* Do not cow file */ #endif #endif -#ifdef CONFIG_FIEMAP -#include -#endif #ifdef CONFIG_FALLOCATE_PUNCH_HOLE #include #endif @@ -1481,52 +1478,6 @@ out: return result; } -static int try_fiemap(BlockDriverState *bs, off_t start, off_t *data, - off_t *hole, int nb_sectors) -{ -#ifdef CONFIG_FIEMAP -BDRVRawState *s = bs->opaque; -int ret = 0; -struct { -struct fiemap fm; -struct fiemap_extent fe; -} f; - -if (s->skip_fiemap) { -return -ENOTSUP; -} - -f.fm.fm_start = start; -f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE; -f.fm.fm_flags = FIEMAP_FLAG_SYNC; -f.fm.fm_extent_count = 1; -f.fm.fm_reserved = 0; -if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) { -s->skip_fiemap = true; -return -errno; -} - -if (f.fm.fm_mapped_extents == 0) { -/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length. - * f.fm.fm_start + f.fm.fm_length must be clamped to the file size! - */ -off_t length = lseek(s->fd, 0, SEEK_END); -*hole = f.fm.fm_start; -*data = MIN(f.fm.fm_start + f.fm.fm_length, length); -} else { -*data = f.fe.fe_logical; -*hole = f.fe.fe_logical + f.fe.fe_length; -if (f.fe.fe_flags & FIEMAP_EXTENT_UNWRITTEN) { -ret |= BDRV_BLOCK_ZERO; -} -} - -return ret; -#else -return -ENOTSUP; -#endif -} - static int try_seek_hole(BlockDriverState *bs, off_t start, off_t *data, off_t *hole) { @@ -1593,13 +1544,10 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs, ret = try_seek_hole(bs, start, &data, &hole); if (ret < 0) { -ret = try_fiemap(bs, start, &data, &hole, nb_sectors); -if (ret < 0) { -/* Assume everything is allocated. */ -data = 0; -hole = start + nb_sectors * BDRV_SECTOR_SIZE; -ret = 0; -} +/* Assume everything is allocated. */ +data = 0; +hole = start + nb_sectors * BDRV_SECTOR_SIZE; +ret = 0; } assert(ret >= 0); -- 1.9.3
[Qemu-devel] [PATCH v3 for-2.2 0/3] raw-posix: Get rid of FIEMAP, fix SEEK_HOLE
PATCH 1 is just a comment fix. PATCH 2 drops FIEMAP use and explains why it needs to go. PATCH 3 carefully rewrites the SEEK_HOLE code. Why 2.2? The series fixes bugs, but the bugs are either not terribly severe, or not particularly likely to bite. The reason I want it included is we've already changed the code fundamentally in 2.2, from Incorrect use of FIEMAP if available, else somewhat incorrect use of SEEK_HOLE if available, else pretend no holes to Somewhat incorrect use of SEEK_HOLE if available, else correct but slow-as-molasses use of FIEMAP if available, else pretend no holes Let's finish the job: Correct use of SEEK_HOLE if available, else pretend no holes Less complex, more robust, and no half-broken intermediate version released. v3: * PATCH 1&2 unchanged, except for a commit message typo [Eric] * PATCH 3&4 redone as PATCH 3, very carefully [Eric, Kevin] v2: * PATCH 1 unchanged * PATCH 2 revised and split up [Paolo, Fam, Eric, Max] Markus Armbruster (3): raw-posix: Fix comment for raw_co_get_block_status() raw-posix: SEEK_HOLE suffices, get rid of FIEMAP raw-posix: The SEEK_HOLE code is flawed, rewrite it block/raw-posix.c | 167 -- 1 file changed, 86 insertions(+), 81 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v3 for-2.2 3/3] raw-posix: The SEEK_HOLE code is flawed, rewrite it
On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris, but not Linux), try_seek_hole() reports trailing data instead. Additionally, unlikely lseek() failures are treated badly: * When SEEK_HOLE fails, try_seek_hole() reports trailing data. For -ENXIO, there's in fact a trailing hole. Can happen only when something truncated the file since we opened it. * When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds, then try_seek_hole() reports a trailing hole. This is okay only when SEEK_DATA failed with -ENXIO (which means the non-trailing hole found by SEEK_HOLE has since become trailing somehow). For other failures (unlikely), it's wrong. * When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely), then try_seek_hole() reports bogus data [-1,start), which its caller raw_co_get_block_status() turns into zero sectors of data. Could theoretically lead to infinite loops in code that attempts to scan data vs. hole forward. Rewrite from scratch, with very careful comments. Signed-off-by: Markus Armbruster --- block/raw-posix.c | 111 +- 1 file changed, 85 insertions(+), 26 deletions(-) diff --git a/block/raw-posix.c b/block/raw-posix.c index fd80d84..0b5d5a8 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -1478,28 +1478,86 @@ out: return result; } -static int try_seek_hole(BlockDriverState *bs, off_t start, off_t *data, - off_t *hole) +/* + * Find allocation range in @bs around offset @start. + * May change underlying file descriptor's file offset. + * If @start is not in a hole, store @start in @data, and the + * beginning of the next hole in @hole, and return 0. + * If @start is in a non-trailing hole, store @start in @hole and the + * beginning of the next non-hole in @data, and return 0. + * If @start is in a trailing hole or beyond EOF, return -ENXIO. + * If we can't find out, return a negative errno other than -ENXIO. + */ +static int find_allocation(BlockDriverState *bs, off_t start, + off_t *data, off_t *hole) { #if defined SEEK_HOLE && defined SEEK_DATA BDRVRawState *s = bs->opaque; +off_t offs; -*hole = lseek(s->fd, start, SEEK_HOLE); -if (*hole == -1) { -return -errno; +/* + * SEEK_DATA cases: + * D1. offs == start: start is in data + * D2. offs > start: start is in a hole, next data at offs + * D3. offs < 0, errno = ENXIO: either start is in a trailing hole + * or start is beyond EOF + * If the latter happens, the file has been truncated behind + * our back since we opened it. All bets are off then. + * Treating like a trailing hole is simplest. + * D4. offs < 0, errno != ENXIO: we learned nothing + */ +offs = lseek(s->fd, start, SEEK_DATA); +if (offs < 0) { +return -errno; /* D3 or D4 */ } +assert(offs >= start); -if (*hole > start) { +if (offs > start) { +/* D2: in hole, next data at offs */ +*hole = start; +*data = offs; +return 0; +} + +/* D1: in data, end not yet known */ + +/* + * SEEK_HOLE cases: + * H1. offs == start: start is in a hole + * If this happens here, a hole has been dug behind our back + * since the previous lseek(). + * H2. offs > start: either start is in data, next hole at offs, + * or start is in trailing hole, EOF at offs + * Linux treats trailing holes like any other hole: offs == + * start. Solaris seeks to EOF instead: offs > start (blech). + * If that happens here, a hole has been dug behind our back + * since the previous lseek(). + * H3. offs < 0, errno = ENXIO: start is beyond EOF + * If this happens, the file has been truncated behind our + * back since we opened it. Treat it like a trailing hole. + * H4. offs < 0, errno != ENXIO: we learned nothing + * Pretend we know nothing at all, i.e. "forget" about D1. + */ +offs = lseek(s->fd, start, SEEK_HOLE); +if (offs < 0) { +return -errno; /* D1 and (H3 or H4) */ +} +assert(offs >= start); + +if (offs > start) { +/* + * D1 and H2: either in data, next hole at offs, or it was in + * data but is now in a trailing hole. In the latter case, + * all bets are off. Treating it as if it there was data all + * the way to EOF is safe, so simply do that. + */ *data = start; -} else { -/* On a hole. We need another syscall to find its end. */ -*data = lseek(s->fd, start, SEEK_DATA); -if (*data == -1) { -*data = lseek(s->fd, 0, SEEK_END); -} +*hole = offs; +return 0; } -return 0; +/* D1 and H1 */ +return -EBUSY; #else return -ENOTSUP; #endif @@ -1542,25
Re: [Qemu-devel] [PATCH v3 for-2.2 2/3] raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
On 2014-11-17 at 11:18, Markus Armbruster wrote: Commit 5500316 (May 2012) implemented raw_co_is_allocated() as follows: 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() 3. Else pretend there are no holes Later on, raw_co_is_allocated() was generalized to raw_co_get_block_status(). Commit 4f11aa8 (May 2014) changed it to try the three methods in order until success, because "there may be implementations which support [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice versa." Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. Commit 38c4d0a (Sep 2014) added it. Because that's a significant speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. As you see, the obvious use of FIEMAP is wrong, and the correct use is slow. I guess this puts it somewhere between -7 "The obvious use is wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard to Misuse scale[*]. "Fortunately", the FIEMAP code is used only when * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it was introduced for ext4 and btrfs) and 2012. * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails Unlikely. Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for bugs. Worse, bugs hiding there can theoretically bite even on a host that has SEEK_HOLE/SEEK_DATA. I don't want to worry about this crap, not even theoretically. Get rid of it. [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html Signed-off-by: Markus Armbruster Reviewed-by: Max Reitz Reviewed-by: Eric Blake --- block/raw-posix.c | 60 --- 1 file changed, 4 insertions(+), 56 deletions(-) I only just now realized that you're not getting rid of FIEMAP completely; there's still the skip_fiemap flag in BDRVRawState. I don't care for now, though, thus my R-b stands. Max
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > This patchset fixes CVE-2014-7840: invalid > > migration stream can cause arbitrary qemu memory > > overwrite. > > First patch includes the minimal fix for the issue. > > Follow-up patches on top add extra checking to reduce the > > chance this kind of bug recurs. > > > > Note: these are already (tentatively-pending review) > > queued in my tree, so only review/ack > > is necessary. > > Why not let this go in via the migration tree? > > > Amit Well I Cc'd Juan and David, so if they had a problem with this, I expect they'd complain. David acked so I assume it's ok. Since I wasted time testing this and have it on my tree already, might as well just merge. Which reminds me: we really should have someone in MAINTAINERS for migration-related files. -- MST
Re: [Qemu-devel] [PATCH v3 for-2.2 3/3] raw-posix: The SEEK_HOLE code is flawed, rewrite it
On 2014-11-17 at 11:18, Markus Armbruster wrote: On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris, but not Linux), try_seek_hole() reports trailing data instead. Additionally, unlikely lseek() failures are treated badly: * When SEEK_HOLE fails, try_seek_hole() reports trailing data. For -ENXIO, there's in fact a trailing hole. Can happen only when something truncated the file since we opened it. * When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds, then try_seek_hole() reports a trailing hole. This is okay only when SEEK_DATA failed with -ENXIO (which means the non-trailing hole found by SEEK_HOLE has since become trailing somehow). For other failures (unlikely), it's wrong. * When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely), then try_seek_hole() reports bogus data [-1,start), which its caller raw_co_get_block_status() turns into zero sectors of data. Could theoretically lead to infinite loops in code that attempts to scan data vs. hole forward. Rewrite from scratch, with very careful comments. Signed-off-by: Markus Armbruster --- block/raw-posix.c | 111 +- 1 file changed, 85 insertions(+), 26 deletions(-) Reviewed-by: Max Reitz
Re: [Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
On 2014-11-17 at 09:34, Max Reitz wrote: On 2014-11-15 at 15:50, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Add a test for conversion between different refcount widths and errors specific to certain widths (i.e. snapshots with refcount_width=1). Signed-off-by: Max Reitz --- tests/qemu-iotests/112 | 252 + tests/qemu-iotests/112.out | 131 +++ tests/qemu-iotests/group | 1 + 3 files changed, 384 insertions(+) create mode 100755 tests/qemu-iotests/112 create mode 100644 tests/qemu-iotests/112.out +echo +echo '=== Testing too many references for check ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1" _make_test_img 64M +print_refcount_width + +# This cluster should be created at 0x5 +$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io +# Now make the second L2 entry (the L2 table should be at 0x4) point to that +# cluster, so we have two references +poke_file "$TEST_IMG" $((0x40008)) "\x80\x00\x00\x00\x00\x05\x00\x00" + +# This should say "please use amend" +_check_test_img -r all + +# So we do that +$QEMU_IMG amend -o refcount_width=2 "$TEST_IMG" +print_refcount_width + +# And try again +_check_test_img -r all I think this section also deserves a test that fuzzes an image with width=64 to intentionally set the most significant bit of one of the refcounts, and make sure that we gracefully diagnose it as invalid. + +echo +echo '=== Multiple walks necessary during amend ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1,cluster_size=512" _make_test_img 64k + +# Cluster 0 is the image header, clusters 1 to 4 are used by the L1 table, a +# single L2 table, the reftable and a single refblock. This creates 58 data +# clusters (actually, the L2 table is created here, too), so in total there are +# then 63 used clusters in the image. With a refcount width of 64, one refblock +# describes 64 clusters (512 bytes / 64 bits/entry = 64 entries), so this will +# make the first target refblock have exactly one free entry. +$QEMU_IO -c "write 0 $((58 * 512))" "$TEST_IMG" | _filter_qemu_io + +# Now change the refcount width; since the first target refblock has exactly one +# free entry, that entry will be used to store its own reference. No other +# refblocks are needed, so then the new reftable will be allocated; since the +# first target refblock is completely filled up, this will require a new +# refblock which is why the refcount width changing function will need to run +# through everything one more time until the allocations are stable. +$QEMU_IMG amend -o refcount_width=64 "$TEST_IMG" +print_refcount_width Umm, that sounds backwards from what you document. It's a good test of the _new_ reftable needing a second round of allocations. So keep it with corrected comments. But I think you _intended_ to write a test that starts with a refcount_width=64 image and resize to a refcount_width=1, where the _old_ reftable then suffers a reallocation as part of allocating refblocks for the new table. That's what you intended, but that's harder to test, so I settled for this (and the comments are appropriate (note that "target refblock" refers to the refblocks after amendment); note that this does indeed fail with v1 of this series. It may even help if you add a tracepoint for every iteration through the walk function callback, to prove we are indeed executing it 3 times instead of the usual 2, for these test cases. That's a good idea! I thought about adding some info, but totally forgot about trace points. ...On second thought, trace doesn't work so well with qemu-img. My best bet would be blkdebug, but that seems kind of ugly to me... Max I'll see whether I can add a test for increasing the size of the original reftable during amendment, too. Max
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > This patchset fixes CVE-2014-7840: invalid > > > migration stream can cause arbitrary qemu memory > > > overwrite. > > > First patch includes the minimal fix for the issue. > > > Follow-up patches on top add extra checking to reduce the > > > chance this kind of bug recurs. > > > > > > Note: these are already (tentatively-pending review) > > > queued in my tree, so only review/ack > > > is necessary. > > > > Why not let this go in via the migration tree? > > Well I Cc'd Juan and David, so if they had a problem with this, I expect > they'd complain. David acked so I assume it's ok. Since I wasted time > testing this and have it on my tree already, might as well just merge. IMO asking as a courtesy would've been better than just stating it. > Which reminds me: we really should have someone in MAINTAINERS > for migration-related files. There is, since last week. Amit
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
On Mon, Nov 17, 2014 at 01:11:07PM +0800, zhanghailiang wrote: > Hi, > > Patch 1 and 2 mainly fix bug about balloon not working correctly when we do > hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong > after we hotplug memory. > > This bug exists since we begin to support hotplug memory, and it is better > to fix it. > > Patch 3 add some trace events, it helps debugging balloon. If it is > unnecessary, > pls feel free to remove it. > > Thanks, > zhanghailiang What about other users of ram_size? Are they all incorrect? > v2: > - fix compiling break for other targets that don't support pc-dimm > > zhanghailiang (3): > pc-dimm: add a function to calculate VM's current RAM size > virtio-balloon: Fix balloon not working correctly when hotplug memory > virtio-balloon: Add some trace events > > hw/mem/pc-dimm.c| 26 ++ > hw/virtio/virtio-balloon.c | 21 +++-- > include/exec/cpu-common.h | 1 + > stubs/qmp_pc_dimm_device_list.c | 5 + > trace-events| 4 > 5 files changed, 51 insertions(+), 6 deletions(-) > > -- > 1.7.12.4 >
Re: [Qemu-devel] [PATCH 6/9] acl: fix memory leak
On 15/11/2014 11:06, arei.gong...@huawei.com wrote: > From: Gonglei > > If 'i != index' for all acl->entries, variable > entry leaks the storage it points to. > > Signed-off-by: Gonglei > --- > util/acl.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/util/acl.c b/util/acl.c > index 938b7ae..571d686 100644 > --- a/util/acl.c > +++ b/util/acl.c > @@ -132,7 +132,6 @@ int qemu_acl_insert(qemu_acl *acl, > const char *match, > int index) > { > -qemu_acl_entry *entry; > qemu_acl_entry *tmp; > int i = 0; > > @@ -142,13 +141,14 @@ int qemu_acl_insert(qemu_acl *acl, > return qemu_acl_append(acl, deny, match); > } > > -entry = g_malloc(sizeof(*entry)); > -entry->match = g_strdup(match); > -entry->deny = deny; > - > QTAILQ_FOREACH(tmp, &acl->entries, next) { > i++; > if (i == index) { > +qemu_acl_entry *entry; > +entry = g_malloc(sizeof(*entry)); > +entry->match = g_strdup(match); > +entry->deny = deny; > + > QTAILQ_INSERT_BEFORE(tmp, entry, next); > acl->nentries++; > break; > This should never happen, but the patch doesn't hurt either. Paolo
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On Mon, Nov 17, 2014 at 04:08:58PM +0530, Amit Shah wrote: > On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > > This patchset fixes CVE-2014-7840: invalid > > > > migration stream can cause arbitrary qemu memory > > > > overwrite. > > > > First patch includes the minimal fix for the issue. > > > > Follow-up patches on top add extra checking to reduce the > > > > chance this kind of bug recurs. > > > > > > > > Note: these are already (tentatively-pending review) > > > > queued in my tree, so only review/ack > > > > is necessary. > > > > > > Why not let this go in via the migration tree? > > > > Well I Cc'd Juan and David, so if they had a problem with this, I expect > > they'd complain. David acked so I assume it's ok. Since I wasted time > > testing this and have it on my tree already, might as well just merge. > > IMO asking as a courtesy would've been better than just stating it. Right, thanks for reminding me. BTW, there is actually a good reason to special-case it: it's a CVE fix, which I handle. So they stay on my private queue and are passed to vendors so vendors can fix downstreams, until making fix public is cleared with all reporters and vendors. After reporting is cleared, I try to collect acks but don't normally route patches through separate queues - that would make it harder to track the status which we need for CVEs. I guess this specific one actually is well contained, so it could go in through a specific tree if it had to. In fact, it is still possible if Juan says he prefers it so: I only expect to send pull request around tomorrow or the day after that. > > Which reminds me: we really should have someone in MAINTAINERS > > for migration-related files. > > There is, since last week. > > > Amit That's good. I see Juan is listed there now, so all's well. -- MST
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
On 2014/11/17 18:39, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 01:11:07PM +0800, zhanghailiang wrote: Hi, Patch 1 and 2 mainly fix bug about balloon not working correctly when we do hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong after we hotplug memory. This bug exists since we begin to support hotplug memory, and it is better to fix it. Patch 3 add some trace events, it helps debugging balloon. If it is unnecessary, pls feel free to remove it. Thanks, zhanghailiang What about other users of ram_size? Are they all incorrect? pc-dimm is only supported in x86 target now, and i am not quite sure if hotplug memory will break migration. I'll look into it. Thanks. v2: - fix compiling break for other targets that don't support pc-dimm zhanghailiang (3): pc-dimm: add a function to calculate VM's current RAM size virtio-balloon: Fix balloon not working correctly when hotplug memory virtio-balloon: Add some trace events hw/mem/pc-dimm.c| 26 ++ hw/virtio/virtio-balloon.c | 21 +++-- include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + trace-events| 4 5 files changed, 51 insertions(+), 6 deletions(-) -- 1.7.12.4 .
Re: [Qemu-devel] [PATCH 9/9] hcd-musb: fix dereference null return value
On 15/11/2014 11:06, arei.gong...@huawei.com wrote: > From: Gonglei > > Signed-off-by: Gonglei > --- > hw/usb/hcd-musb.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c > index 66bc61a..f2cb73c 100644 > --- a/hw/usb/hcd-musb.c > +++ b/hw/usb/hcd-musb.c > @@ -624,6 +624,10 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, > > /* A wild guess on the FADDR semantics... */ > dev = usb_find_device(&s->port, ep->faddr[idx]); > +if (!dev) { > +TRACE("Do not find an usb device"); > +return; > +} > uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); > usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, > (dev->addr << 16) | (uep->nr << 8) | pid, false, true); > I think this patch is not the real fix. usb_ep_get and usb_handle_packet can deal with a NULL device, but we have to avoid dereferencing NULL pointers when building the id. Paolo diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c index 66bc61a..40809f6 100644 --- a/hw/usb/hcd-musb.c +++ b/hw/usb/hcd-musb.c @@ -608,6 +608,7 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, USBDevice *dev; USBEndpoint *uep; int idx = epnum && dir; +int id; int ttype; /* ep->type[0,1] contains: @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, /* A wild guess on the FADDR semantics... */ dev = usb_find_device(&s->port, ep->faddr[idx]); uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, - (dev->addr << 16) | (uep->nr << 8) | pid, false, true); +id = pid; +if (uep) { +id |= (dev->addr << 16) | (uep->nr << 8); +} +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); ep->packey[dir].ep = ep; ep->packey[dir].dir = dir;
Re: [Qemu-devel] [PATCH v3 for-2.2 2/3] raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
Max Reitz writes: > On 2014-11-17 at 11:18, Markus Armbruster wrote: >> Commit 5500316 (May 2012) implemented raw_co_is_allocated() as >> follows: >> >> 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl >> >> 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() >> >> 3. Else pretend there are no holes >> >> Later on, raw_co_is_allocated() was generalized to >> raw_co_get_block_status(). >> >> Commit 4f11aa8 (May 2014) changed it to try the three methods in order >> until success, because "there may be implementations which support >> [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice >> versa." >> >> Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. >> Commit 38c4d0a (Sep 2014) added it. Because that's a significant >> speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. >> >> As you see, the obvious use of FIEMAP is wrong, and the correct use is >> slow. I guess this puts it somewhere between -7 "The obvious use is >> wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard >> to Misuse scale[*]. >> >> "Fortunately", the FIEMAP code is used only when >> >> * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is >> >>Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it >>was introduced for ext4 and btrfs) and 2012. >> >> * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails >> >>Unlikely. >> >> Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for >> bugs. Worse, bugs hiding there can theoretically bite even on a host >> that has SEEK_HOLE/SEEK_DATA. >> >> I don't want to worry about this crap, not even theoretically. Get >> rid of it. >> >> [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html >> >> Signed-off-by: Markus Armbruster >> Reviewed-by: Max Reitz >> Reviewed-by: Eric Blake >> --- >> block/raw-posix.c | 60 >> --- >> 1 file changed, 4 insertions(+), 56 deletions(-) > > I only just now realized that you're not getting rid of FIEMAP > completely; there's still the skip_fiemap flag in BDRVRawState. I > don't care for now, though, thus my R-b stands. You're right! The appended patch should be squashed in, either on commit, or in a respin. diff --git a/block/raw-posix.c b/block/raw-posix.c index 0b5d5a8..414e6d1 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -148,9 +148,6 @@ typedef struct BDRVRawState { bool has_write_zeroes:1; bool discard_zeroes:1; bool needs_alignment; -#ifdef CONFIG_FIEMAP -bool skip_fiemap; -#endif } BDRVRawState; typedef struct BDRVRawReopenState {
Re: [Qemu-devel] [PATCH 2/4] exec: add wrapper for host pointer access
* Michael S. Tsirkin (m...@redhat.com) wrote: > host pointer accesses force pointer math, let's > add a wrapper to make them safer. > > Signed-off-by: Michael S. Tsirkin > --- > include/exec/cpu-all.h | 5 + > exec.c | 10 +- > 2 files changed, 10 insertions(+), 5 deletions(-) > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > index c085804..9d8d408 100644 > --- a/include/exec/cpu-all.h > +++ b/include/exec/cpu-all.h > @@ -313,6 +313,11 @@ typedef struct RAMBlock { > int fd; > } RAMBlock; > > +static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset) > +{ > +return (char *)block->host + offset; > +} I'm a bit surprised you don't need to pass a length to this to be able to tell how much you can access. > typedef struct RAMList { > QemuMutex mutex; > /* Protected by the iothread lock. */ > diff --git a/exec.c b/exec.c > index ad5cf12..9648669 100644 > --- a/exec.c > +++ b/exec.c > @@ -840,7 +840,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, > ram_addr_t length) > > block = qemu_get_ram_block(start); > assert(block == qemu_get_ram_block(end - 1)); > -start1 = (uintptr_t)block->host + (start - block->offset); > +start1 = (uintptr_t)ramblock_ptr(block, start - block->offset); > cpu_tlb_reset_dirty_all(start1, length); > } > > @@ -1500,7 +1500,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length) > QTAILQ_FOREACH(block, &ram_list.blocks, next) { > offset = addr - block->offset; > if (offset < block->length) { > -vaddr = block->host + offset; > +vaddr = ramblock_ptr(block, offset); > if (block->flags & RAM_PREALLOC) { > ; > } else if (xen_enabled()) { > @@ -1551,7 +1551,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr) > { > RAMBlock *block = qemu_get_ram_block(addr); > > -return block->host; > +return ramblock_ptr(block, 0); > } > > /* Return a host pointer to ram allocated with qemu_ram_alloc. > @@ -1578,7 +1578,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr) > xen_map_cache(block->offset, block->length, 1); > } > } > -return block->host + (addr - block->offset); > +return ramblock_ptr(block, addr - block->offset); > } which then makes me wonder if all the uses of this are safe near the end of the block. > /* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr > @@ -1597,7 +1597,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, > hwaddr *size) > if (addr - block->offset < block->length) { > if (addr - block->offset + *size > block->length) > *size = block->length - addr + block->offset; > -return block->host + (addr - block->offset); > +return ramblock_ptr(block, addr - block->offset); > } but then this sounds like it's going to have partial duplication, it already looks like it's only going to succeed if it finds itself a block that the access fits in. Dave > } > > -- > MST > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] target-cris/translate.c: fix out of bounds read
On 17/11/2014 06:57, zhanghailiang wrote: > In function t_gen_mov_TN_preg and t_gen_mov_preg_TN, The begin check about the > validity of in-parameter 'r' is useless. We still access cpu_PR[r] in the > follow code if it is invalid. Which will be an out-of-bounds read error. > > Fix it by using assert() to ensure it is valid before using it. > > Signed-off-by: zhanghailiang > --- > target-cris/translate.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/target-cris/translate.c b/target-cris/translate.c > index e37b04e..76406af 100644 > --- a/target-cris/translate.c > +++ b/target-cris/translate.c > @@ -169,9 +169,7 @@ static int preg_sizes[] = { > > static inline void t_gen_mov_TN_preg(TCGv tn, int r) > { > -if (r < 0 || r > 15) { > -fprintf(stderr, "wrong register read $p%d\n", r); > -} > +assert(r >= 0 && r <= 15); > if (r == PR_BZ || r == PR_WZ || r == PR_DZ) { > tcg_gen_mov_tl(tn, tcg_const_tl(0)); > } else if (r == PR_VR) { > @@ -182,9 +180,7 @@ static inline void t_gen_mov_TN_preg(TCGv tn, int r) > } > static inline void t_gen_mov_preg_TN(DisasContext *dc, int r, TCGv tn) > { > -if (r < 0 || r > 15) { > -fprintf(stderr, "wrong register write $p%d\n", r); > -} > +assert(r >= 0 && r <= 15); > if (r == PR_BZ || r == PR_WZ || r == PR_DZ) { > return; > } else if (r == PR_SRS) { > Applied, thanks. Paolo
Re: [Qemu-devel] [PATCH v3 for-2.2 2/3] raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
On 2014-11-17 at 11:58, Markus Armbruster wrote: Max Reitz writes: On 2014-11-17 at 11:18, Markus Armbruster wrote: Commit 5500316 (May 2012) implemented raw_co_is_allocated() as follows: 1. If defined(CONFIG_FIEMAP), use the FS_IOC_FIEMAP ioctl 2. Else if defined(SEEK_HOLE) && defined(SEEK_DATA), use lseek() 3. Else pretend there are no holes Later on, raw_co_is_allocated() was generalized to raw_co_get_block_status(). Commit 4f11aa8 (May 2014) changed it to try the three methods in order until success, because "there may be implementations which support [SEEK_HOLE/SEEK_DATA] but not [FIEMAP] (e.g., NFSv4.2) as well as vice versa." Unfortunately, we used FIEMAP incorrectly: we lacked FIEMAP_FLAG_SYNC. Commit 38c4d0a (Sep 2014) added it. Because that's a significant speed hit, the next commit 7c159037 put SEEK_HOLE/SEEK_DATA first. As you see, the obvious use of FIEMAP is wrong, and the correct use is slow. I guess this puts it somewhere between -7 "The obvious use is wrong" and -10 "It's impossible to get right" on Rusty Russel's Hard to Misuse scale[*]. "Fortunately", the FIEMAP code is used only when * SEEK_HOLE/SEEK_DATA aren't defined, but CONFIG_FIEMAP is Uncommon. SEEK_HOLE had no XFS implementation between 2011 (when it was introduced for ext4 and btrfs) and 2012. * SEEK_HOLE/SEEK_DATA and CONFIG_FIEMAP are defined, but lseek() fails Unlikely. Thus, the FIEMAP code executes rarely. Makes it a nice hidey-hole for bugs. Worse, bugs hiding there can theoretically bite even on a host that has SEEK_HOLE/SEEK_DATA. I don't want to worry about this crap, not even theoretically. Get rid of it. [*] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html Signed-off-by: Markus Armbruster Reviewed-by: Max Reitz Reviewed-by: Eric Blake --- block/raw-posix.c | 60 --- 1 file changed, 4 insertions(+), 56 deletions(-) I only just now realized that you're not getting rid of FIEMAP completely; there's still the skip_fiemap flag in BDRVRawState. I don't care for now, though, thus my R-b stands. You're right! The appended patch should be squashed in, either on commit, or in a respin. diff --git a/block/raw-posix.c b/block/raw-posix.c index 0b5d5a8..414e6d1 100644 --- a/block/raw-posix.c +++ b/block/raw-posix.c @@ -148,9 +148,6 @@ typedef struct BDRVRawState { bool has_write_zeroes:1; bool discard_zeroes:1; bool needs_alignment; -#ifdef CONFIG_FIEMAP -bool skip_fiemap; -#endif } BDRVRawState; typedef struct BDRVRawReopenState { With or without thank hunk (better with): Reviewed-by: Max Reitz
Re: [Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
On 2014-11-17 at 11:38, Max Reitz wrote: On 2014-11-17 at 09:34, Max Reitz wrote: On 2014-11-15 at 15:50, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Add a test for conversion between different refcount widths and errors specific to certain widths (i.e. snapshots with refcount_width=1). Signed-off-by: Max Reitz --- tests/qemu-iotests/112 | 252 + tests/qemu-iotests/112.out | 131 +++ tests/qemu-iotests/group | 1 + 3 files changed, 384 insertions(+) create mode 100755 tests/qemu-iotests/112 create mode 100644 tests/qemu-iotests/112.out +echo +echo '=== Testing too many references for check ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1" _make_test_img 64M +print_refcount_width + +# This cluster should be created at 0x5 +$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io +# Now make the second L2 entry (the L2 table should be at 0x4) point to that +# cluster, so we have two references +poke_file "$TEST_IMG" $((0x40008)) "\x80\x00\x00\x00\x00\x05\x00\x00" + +# This should say "please use amend" +_check_test_img -r all + +# So we do that +$QEMU_IMG amend -o refcount_width=2 "$TEST_IMG" +print_refcount_width + +# And try again +_check_test_img -r all I think this section also deserves a test that fuzzes an image with width=64 to intentionally set the most significant bit of one of the refcounts, and make sure that we gracefully diagnose it as invalid. + +echo +echo '=== Multiple walks necessary during amend ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1,cluster_size=512" _make_test_img 64k + +# Cluster 0 is the image header, clusters 1 to 4 are used by the L1 table, a +# single L2 table, the reftable and a single refblock. This creates 58 data +# clusters (actually, the L2 table is created here, too), so in total there are +# then 63 used clusters in the image. With a refcount width of 64, one refblock +# describes 64 clusters (512 bytes / 64 bits/entry = 64 entries), so this will +# make the first target refblock have exactly one free entry. +$QEMU_IO -c "write 0 $((58 * 512))" "$TEST_IMG" | _filter_qemu_io + +# Now change the refcount width; since the first target refblock has exactly one +# free entry, that entry will be used to store its own reference. No other +# refblocks are needed, so then the new reftable will be allocated; since the +# first target refblock is completely filled up, this will require a new +# refblock which is why the refcount width changing function will need to run +# through everything one more time until the allocations are stable. +$QEMU_IMG amend -o refcount_width=64 "$TEST_IMG" +print_refcount_width Umm, that sounds backwards from what you document. It's a good test of the _new_ reftable needing a second round of allocations. So keep it with corrected comments. But I think you _intended_ to write a test that starts with a refcount_width=64 image and resize to a refcount_width=1, where the _old_ reftable then suffers a reallocation as part of allocating refblocks for the new table. That's what you intended, but that's harder to test, so I settled for this (and the comments are appropriate (note that "target refblock" refers to the refblocks after amendment); note that this does indeed fail with v1 of this series. It may even help if you add a tracepoint for every iteration through the walk function callback, to prove we are indeed executing it 3 times instead of the usual 2, for these test cases. That's a good idea! I thought about adding some info, but totally forgot about trace points. ...On second thought, trace doesn't work so well with qemu-img. My best bet would be blkdebug, but that seems kind of ugly to me... Problem "solved": If there will be more walks than originally thought (3+1 instead of 2+1), progress will regress at one point. I'll just grep for that point and that should be enough (progress jumping from 66.67 % to 50.00 %). Max
Re: [Qemu-devel] [PATCH] exec: Handle multipage ranges in invalidate_and_set_dirty()
On 16/11/2014 20:44, Peter Maydell wrote: > The code in invalidate_and_set_dirty() needs to handle addr/length > combinations which cross guest physical page boundaries. This can happen, > for example, when disk I/O reads large blocks into guest RAM which previously > held code that we have cached translations for. Unfortunately we were only > checking the clean/dirty status of the first page in the range, and then > were calling a tb_invalidate function which only handles ranges that don't > cross page boundaries. Fix the function to deal with multipage ranges. > > The symptoms of this bug were that guest code would misbehave (eg segfault), > in particular after a guest reboot but potentially any time the guest > reused a page of its physical RAM for new code. > > Cc: qemu-sta...@nongnu.org > Signed-off-by: Peter Maydell > --- > This seems pretty nasty, and I have no idea why it hasn't been wreaking > more havoc than this before. I'm not entirely sure why we invalidate TBs > if any of the dirty bits is set rather than only if the code bit is set, > but I left that logic as it is. I think it's a remain of when we had a single bitmap with three bits in it. We can clean up in 2.3. > Review appreciated -- it would be nice to get this into rc2 if > we can, I think. Reviewed-by: Paolo Bonzini > exec.c | 6 ++ > include/exec/ram_addr.h | 25 + > 2 files changed, 27 insertions(+), 4 deletions(-) > > diff --git a/exec.c b/exec.c > index 759055d..f0e2bd3 100644 > --- a/exec.c > +++ b/exec.c > @@ -2066,10 +2066,8 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong > addr, > static void invalidate_and_set_dirty(hwaddr addr, > hwaddr length) > { > -if (cpu_physical_memory_is_clean(addr)) { > -/* invalidate code */ > -tb_invalidate_phys_page_range(addr, addr + length, 0); > -/* set dirty bit */ > +if (cpu_physical_memory_range_includes_clean(addr, length)) { > +tb_invalidate_phys_range(addr, addr + length, 0); > cpu_physical_memory_set_dirty_range_nocode(addr, length); > } > xen_modified_memory(addr, length); > diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h > index cf1d4c7..8fc75cd 100644 > --- a/include/exec/ram_addr.h > +++ b/include/exec/ram_addr.h > @@ -49,6 +49,21 @@ static inline bool > cpu_physical_memory_get_dirty(ram_addr_t start, > return next < end; > } > > +static inline bool cpu_physical_memory_get_clean(ram_addr_t start, > + ram_addr_t length, > + unsigned client) > +{ > +unsigned long end, page, next; > + > +assert(client < DIRTY_MEMORY_NUM); > + > +end = TARGET_PAGE_ALIGN(start + length) >> TARGET_PAGE_BITS; > +page = start >> TARGET_PAGE_BITS; > +next = find_next_zero_bit(ram_list.dirty_memory[client], end, page); > + > +return next < end; > +} > + > static inline bool cpu_physical_memory_get_dirty_flag(ram_addr_t addr, >unsigned client) > { > @@ -64,6 +79,16 @@ static inline bool cpu_physical_memory_is_clean(ram_addr_t > addr) > return !(vga && code && migration); > } > > +static inline bool cpu_physical_memory_range_includes_clean(ram_addr_t start, > +ram_addr_t > length) > +{ > +bool vga = cpu_physical_memory_get_clean(start, length, > DIRTY_MEMORY_VGA); > +bool code = cpu_physical_memory_get_clean(start, length, > DIRTY_MEMORY_CODE); > +bool migration = > +cpu_physical_memory_get_clean(start, length, DIRTY_MEMORY_MIGRATION); > +return vga || code || migration; > +} > + > static inline void cpu_physical_memory_set_dirty_flag(ram_addr_t addr, >unsigned client) > { >
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On (Mon) 17 Nov 2014 [12:52:59], Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 04:08:58PM +0530, Amit Shah wrote: > > On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > > > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > > > This patchset fixes CVE-2014-7840: invalid > > > > > migration stream can cause arbitrary qemu memory > > > > > overwrite. > > > > > First patch includes the minimal fix for the issue. > > > > > Follow-up patches on top add extra checking to reduce the > > > > > chance this kind of bug recurs. > > > > > > > > > > Note: these are already (tentatively-pending review) > > > > > queued in my tree, so only review/ack > > > > > is necessary. > > > > > > > > Why not let this go in via the migration tree? > > > > > > Well I Cc'd Juan and David, so if they had a problem with this, I expect > > > they'd complain. David acked so I assume it's ok. Since I wasted time > > > testing this and have it on my tree already, might as well just merge. > > > > IMO asking as a courtesy would've been better than just stating it. > > Right, thanks for reminding me. > > BTW, there is actually a good reason to special-case it: it's a CVE fix, > which I handle. So they stay on my private queue and are passed > to vendors so vendors can fix downstreams, until making fix public is > cleared with all reporters and vendors. > After reporting is cleared, I try to collect acks but don't normally route > patches through separate queues - that would make it harder to > track the status which we need for CVEs. Patch is public, so all of this doesn't really matter. But: involving maintainers in their areas, even if the patch is embargoed, should be a pre-requisite. I'm not sure if we're doing that, but please do that so patches get a proper review from the maintainers. > I guess this specific one actually is well contained, so it could go in > through a specific tree if it had to. In fact, it is still possible if > Juan says he prefers it so: I only expect to send pull request around > tomorrow or the day after that. I'm sure we prefer migration patches go through the migration tree. Also, this week I'm looking at the migration queue -- it's an unofficial split of maintenance duties between Juan and me while we're still trying to find out what works best. > > > Which reminds me: we really should have someone in MAINTAINERS > > > for migration-related files. > > > > There is, since last week. > > That's good. I see Juan is listed there now, so all's well. But that was well-known anyway :-) Amit
Re: [Qemu-devel] [PATCH] mips: Enable vectored interrupt support for the 74Kf CPU
On 04/11/2014 15:42, Maciej W. Rozycki wrote: > Enable vectored interrupt support for the 74Kf CPU, reflecting hardware. > > Signed-off-by: Maciej W. Rozycki > --- > qemu-mips-config-74k-vint.diff > Index: qemu-git-trunk/target-mips/translate_init.c > === > --- qemu-git-trunk.orig/target-mips/translate_init.c 2014-11-04 > 03:39:48.458972962 + > +++ qemu-git-trunk/target-mips/translate_init.c 2014-11-04 > 03:43:15.479004225 + > @@ -331,7 +331,7 @@ static const mips_def_t mips_defs[] = > (1 << CP0C1_CA), > .CP0_Config2 = MIPS_CONFIG2, > .CP0_Config3 = MIPS_CONFIG3 | (1 << CP0C3_DSP2P) | (1 << CP0C3_DSPP) > | > - (0 << CP0C3_VInt), > + (1 << CP0C3_VInt), > .CP0_LLAddr_rw_bitmask = 0, > .CP0_LLAddr_shift = 4, > .SYNCI_Step = 32, > Reviewed-by: Leon Alrae
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On 2014/11/17 18:13, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 05:42:12PM +0800, Chen, Tiejun wrote: On 2014/11/17 17:25, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 04:48:32PM +0800, Chen, Tiejun wrote: On 2014/11/17 14:10, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: On 2014/11/5 22:09, Michael S. Tsirkin wrote: On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: Currently IGD drivers always need to access PCH by 1f.0, and PCH vendor/device id is used to identify the card. Signed-off-by: Tiejun Chen --- [snip] Cleaner: if (!pci_dev) { fprintf return; } pci_config_set_device_id(pci_dev->config, pch_id); I will address all comments and thanks. +} +} + /* init */ static int xen_pt_initfn(PCIDevice *d) @@ -682,6 +770,9 @@ static int xen_pt_initfn(PCIDevice *d) return -1; } +/* Register ISA bridge for passthrough GFX. */ +xen_igd_passthrough_isa_bridge_create(s, &s->real_device); + /* reinitialize each config register to be emulated */ if (xen_pt_config_init(s)) { XEN_PT_ERR(d, "PCI Config space initialisation failed.\n"); Note I will introduce a inline function in another patch, +static inline int is_vga_passthrough(XenHostPCIDevice *dev) +{ +return (xen_has_gfx_passthru && (dev->vendor_id == PCI_VENDOR_ID_INTEL) +&& ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA)); +} Thanks Tiejun Why bother with all these conditions? Won't it be enough to check dev->vendor_id == PCI_VENDOR_ID_INTEL? If this is just used for IGD, its always fine without checking vendor_id. You need to match device ID to *know* it's IGD. So after remove that check, I guess I need to rename that as is_igd_vga_passthrough() to make sense. Thanks Tiejun There is no need to check class code or xen_has_gfx_passthru flag. Device ID + vendor ID identifies each device uniquely. Yeah. Here I assume vendor ID is always PCI_VENDOR_ID_INTEL so looks you means I also need to check that table to do something like, is_igd_vga_passthugh(dev) { int i; int num = ARRAY_SIZE(xen_igd_combo_id_infos); for (i = 0; i < num; i++) { if (dev->device_id == xen_igd_combo_id_infos[i].gpu_device_id) return 1; return 0; } Then we can simplify the subsequent codes based on this, right? Thanks Tiejun
Re: [Qemu-devel] [PATCH 9/9] hcd-musb: fix dereference null return value
On 2014/11/17 18:58, Paolo Bonzini wrote: > > > On 15/11/2014 11:06, arei.gong...@huawei.com wrote: >> From: Gonglei >> >> Signed-off-by: Gonglei >> --- >> hw/usb/hcd-musb.c | 4 >> 1 file changed, 4 insertions(+) >> >> diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c >> index 66bc61a..f2cb73c 100644 >> --- a/hw/usb/hcd-musb.c >> +++ b/hw/usb/hcd-musb.c >> @@ -624,6 +624,10 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, >> >> /* A wild guess on the FADDR semantics... */ >> dev = usb_find_device(&s->port, ep->faddr[idx]); >> +if (!dev) { >> +TRACE("Do not find an usb device"); >> +return; >> +} >> uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); >> usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, >> (dev->addr << 16) | (uep->nr << 8) | pid, false, true); >> > > I think this patch is not the real fix. usb_ep_get and > usb_handle_packet can deal with a NULL device, but we have to avoid > dereferencing NULL pointers when building the id. > Good catch :) > Paolo > > diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c > index 66bc61a..40809f6 100644 > --- a/hw/usb/hcd-musb.c > +++ b/hw/usb/hcd-musb.c > @@ -608,6 +608,7 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, > USBDevice *dev; > USBEndpoint *uep; > int idx = epnum && dir; > +int id; > int ttype; > > /* ep->type[0,1] contains: > @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, > /* A wild guess on the FADDR semantics... */ > dev = usb_find_device(&s->port, ep->faddr[idx]); > uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); > -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, > - (dev->addr << 16) | (uep->nr << 8) | pid, false, true); > +id = pid; > +if (uep) { > +id |= (dev->addr << 16) | (uep->nr << 8); > +} > +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); > usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); > ep->packey[dir].ep = ep; > ep->packey[dir].dir = dir; This is a good approach, id is just a identifying. Thanks, Best regards, -Gonglei
Re: [Qemu-devel] [PATCH v2] mips: Correct MIPS16/microMIPS branch size calculation
On 07/11/2014 20:05, Maciej W. Rozycki wrote: > Correct MIPS16/microMIPS branch size calculation in PC adjustment > needed: > > - to set the value of CP0.ErrorEPC at the entry to the reset exception, > > - for the purpose of branch reexecution in the context of device I/O. > > Follow the approach taken in `exception_resume_pc' for ordinary, Debug > and NMI exceptions. > > MIPS16 and microMIPS branches can be 2 or 4 bytes in size and that has > to be reflected in calculation. Original MIPS ISA branches, which is > where this code originates from, are always 4 bytes long, just as all > original MIPS ISA instructions. > > Signed-off-by: Nathan Froyd > Signed-off-by: Maciej W. Rozycki > --- > Another change that has waited for too long, with the original > discussion archived here: > > http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg01230.html > > Resending with what hopefully is a better description and updated to > reflect the move of `cpu_io_recompile' from exec.c to translate-all.c. > > Please apply, > > Maciej > > qemu-mips-b16.diff > Index: qemu-git-trunk/target-mips/translate.c > === > --- qemu-git-trunk.orig/target-mips/translate.c 2014-11-07 > 18:34:30.927788566 + > +++ qemu-git-trunk/target-mips/translate.c2014-11-07 18:34:35.428958223 > + > @@ -19452,7 +19452,8 @@ void cpu_state_reset(CPUMIPSState *env) > if (env->hflags & MIPS_HFLAG_BMASK) { > /* If the exception was raised from a delay slot, > come back to the jump. */ > -env->CP0_ErrorEPC = env->active_tc.PC - 4; > +env->CP0_ErrorEPC = (env->active_tc.PC > + - (env->hflags & MIPS_HFLAG_B16 ? 2 : 4)); > } else { > env->CP0_ErrorEPC = env->active_tc.PC; > } > Index: qemu-git-trunk/translate-all.c > === > --- qemu-git-trunk.orig/translate-all.c 2014-11-07 17:33:13.037575065 > + > +++ qemu-git-trunk/translate-all.c2014-11-07 18:34:35.428958223 + > @@ -1534,7 +1534,7 @@ void cpu_io_recompile(CPUState *cpu, uin > branch. */ > #if defined(TARGET_MIPS) > if ((env->hflags & MIPS_HFLAG_BMASK) != 0 && n > 1) { > -env->active_tc.PC -= 4; > +env->active_tc.PC -= (env->hflags & MIPS_HFLAG_B16 ? 2 : 4); > cpu->icount_decr.u16.low++; > env->hflags &= ~MIPS_HFLAG_BMASK; > } > Reviewed-by: Leon Alrae
Re: [Qemu-devel] [PATCH] net: The third parameter of getsockname should be initialized
On 17/11/2014 06:54, zhanghailiang wrote: > Signed-off-by: zhanghailiang > --- > net/socket.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/socket.c b/net/socket.c > index fb21e20..ca4b8ba 100644 > --- a/net/socket.c > +++ b/net/socket.c > @@ -352,7 +352,7 @@ static NetSocketState > *net_socket_fd_init_dgram(NetClientState *peer, > { > struct sockaddr_in saddr; > int newfd; > -socklen_t saddr_len; > +socklen_t saddr_len = sizeof(saddr); > NetClientState *nc; > NetSocketState *s; > > Applied, thanks.
Re: [Qemu-devel] [PATCH 2/4] exec: add wrapper for host pointer access
On Mon, Nov 17, 2014 at 10:58:53AM +, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (m...@redhat.com) wrote: > > host pointer accesses force pointer math, let's > > add a wrapper to make them safer. > > > > Signed-off-by: Michael S. Tsirkin > > --- > > include/exec/cpu-all.h | 5 + > > exec.c | 10 +- > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > > index c085804..9d8d408 100644 > > --- a/include/exec/cpu-all.h > > +++ b/include/exec/cpu-all.h > > @@ -313,6 +313,11 @@ typedef struct RAMBlock { > > int fd; > > } RAMBlock; > > > > +static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset) > > +{ > > +return (char *)block->host + offset; > > +} > > I'm a bit surprised you don't need to pass a length to this to be able > to tell how much you can access. This is because at the moment all accesses only touch a single page. Said assumption seems to be made all over the code, and won't be easy to remove. > > typedef struct RAMList { > > QemuMutex mutex; > > /* Protected by the iothread lock. */ > > diff --git a/exec.c b/exec.c > > index ad5cf12..9648669 100644 > > --- a/exec.c > > +++ b/exec.c > > @@ -840,7 +840,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t start, > > ram_addr_t length) > > > > block = qemu_get_ram_block(start); > > assert(block == qemu_get_ram_block(end - 1)); > > -start1 = (uintptr_t)block->host + (start - block->offset); > > +start1 = (uintptr_t)ramblock_ptr(block, start - block->offset); > > cpu_tlb_reset_dirty_all(start1, length); > > } > > > > @@ -1500,7 +1500,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t > > length) > > QTAILQ_FOREACH(block, &ram_list.blocks, next) { > > offset = addr - block->offset; > > if (offset < block->length) { > > -vaddr = block->host + offset; > > +vaddr = ramblock_ptr(block, offset); > > if (block->flags & RAM_PREALLOC) { > > ; > > } else if (xen_enabled()) { > > @@ -1551,7 +1551,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr) > > { > > RAMBlock *block = qemu_get_ram_block(addr); > > > > -return block->host; > > +return ramblock_ptr(block, 0); > > } > > > > /* Return a host pointer to ram allocated with qemu_ram_alloc. > > @@ -1578,7 +1578,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr) > > xen_map_cache(block->offset, block->length, 1); > > } > > } > > -return block->host + (addr - block->offset); > > +return ramblock_ptr(block, addr - block->offset); > > } > > which then makes me wonder if all the uses of this are safe near the > end of the block. > > > /* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr > > @@ -1597,7 +1597,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, > > hwaddr *size) > > if (addr - block->offset < block->length) { > > if (addr - block->offset + *size > block->length) > > *size = block->length - addr + block->offset; > > -return block->host + (addr - block->offset); > > +return ramblock_ptr(block, addr - block->offset); > > } > > but then this sounds like it's going to have partial duplication, it already > looks > like it's only going to succeed if it finds itself a block that the access > fits > in. > > > Dave Sorry, I don't really understand what you are saying here. > > } > > > > -- > > MST > > > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On Mon, Nov 17, 2014 at 04:37:50PM +0530, Amit Shah wrote: > On (Mon) 17 Nov 2014 [12:52:59], Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 04:08:58PM +0530, Amit Shah wrote: > > > On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > > > > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > > > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > > > > This patchset fixes CVE-2014-7840: invalid > > > > > > migration stream can cause arbitrary qemu memory > > > > > > overwrite. > > > > > > First patch includes the minimal fix for the issue. > > > > > > Follow-up patches on top add extra checking to reduce the > > > > > > chance this kind of bug recurs. > > > > > > > > > > > > Note: these are already (tentatively-pending review) > > > > > > queued in my tree, so only review/ack > > > > > > is necessary. > > > > > > > > > > Why not let this go in via the migration tree? > > > > > > > > Well I Cc'd Juan and David, so if they had a problem with this, I expect > > > > they'd complain. David acked so I assume it's ok. Since I wasted time > > > > testing this and have it on my tree already, might as well just merge. > > > > > > IMO asking as a courtesy would've been better than just stating it. > > > > Right, thanks for reminding me. > > > > BTW, there is actually a good reason to special-case it: it's a CVE fix, > > which I handle. So they stay on my private queue and are passed > > to vendors so vendors can fix downstreams, until making fix public is > > cleared with all reporters and vendors. > > After reporting is cleared, I try to collect acks but don't normally route > > patches through separate queues - that would make it harder to > > track the status which we need for CVEs. > > Patch is public, so all of this doesn't really matter. > > But: involving maintainers in their areas, even if the patch is > embargoed, should be a pre-requisite. I'm not sure if we're doing > that, but please do that so patches get a proper review from the > maintainers. Involving more people means more back and forth with reporters which must approve any disclosure. If the issue isn't clear, I do involve maintainers. I send patches on list and try to merge them only after they get ack from relevant people. I'm sorry, but this is as far as I have the time to go. > > I guess this specific one actually is well contained, so it could go in > > through a specific tree if it had to. In fact, it is still possible if > > Juan says he prefers it so: I only expect to send pull request around > > tomorrow or the day after that. > > I'm sure we prefer migration patches go through the migration tree. I'm sorry - I disagree. maintainer boundaries in qemu are quite flexible because code is somewhat monolithic. We prefer that patches are reviewed by people that are familiar with it, that is for sure. Which tree is definitely secondary. If there's a conflict, we can resolve it. I doubt it will happen here. > Also, this week I'm looking at the migration queue -- it's an > unofficial split of maintenance duties between Juan and me while we're > still trying to find out what works best. > I don't know how am I supposed to know that. Send patch to MAINTAINERS or something? > > > > Which reminds me: we really should have someone in MAINTAINERS > > > > for migration-related files. > > > > > > There is, since last week. > > > > That's good. I see Juan is listed there now, so all's well. > > But that was well-known anyway :-) > > > Amit -- MST
Re: [Qemu-devel] [PATCH v2 21/21] iotests: Add test for different refcount widths
On 2014-11-15 at 15:50, Eric Blake wrote: On 11/14/2014 06:06 AM, Max Reitz wrote: Add a test for conversion between different refcount widths and errors specific to certain widths (i.e. snapshots with refcount_width=1). Signed-off-by: Max Reitz --- tests/qemu-iotests/112 | 252 + tests/qemu-iotests/112.out | 131 +++ tests/qemu-iotests/group | 1 + 3 files changed, 384 insertions(+) create mode 100755 tests/qemu-iotests/112 create mode 100644 tests/qemu-iotests/112.out +echo +echo '=== Testing too many references for check ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1" _make_test_img 64M +print_refcount_width + +# This cluster should be created at 0x5 +$QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io +# Now make the second L2 entry (the L2 table should be at 0x4) point to that +# cluster, so we have two references +poke_file "$TEST_IMG" $((0x40008)) "\x80\x00\x00\x00\x00\x05\x00\x00" + +# This should say "please use amend" +_check_test_img -r all + +# So we do that +$QEMU_IMG amend -o refcount_width=2 "$TEST_IMG" +print_refcount_width + +# And try again +_check_test_img -r all I think this section also deserves a test that fuzzes an image with width=64 to intentionally set the most significant bit of one of the refcounts, and make sure that we gracefully diagnose it as invalid. + +echo +echo '=== Multiple walks necessary during amend ===' +echo + +IMGOPTS="$IMGOPTS,refcount_width=1,cluster_size=512" _make_test_img 64k + +# Cluster 0 is the image header, clusters 1 to 4 are used by the L1 table, a +# single L2 table, the reftable and a single refblock. This creates 58 data +# clusters (actually, the L2 table is created here, too), so in total there are +# then 63 used clusters in the image. With a refcount width of 64, one refblock +# describes 64 clusters (512 bytes / 64 bits/entry = 64 entries), so this will +# make the first target refblock have exactly one free entry. +$QEMU_IO -c "write 0 $((58 * 512))" "$TEST_IMG" | _filter_qemu_io + +# Now change the refcount width; since the first target refblock has exactly one +# free entry, that entry will be used to store its own reference. No other +# refblocks are needed, so then the new reftable will be allocated; since the +# first target refblock is completely filled up, this will require a new +# refblock which is why the refcount width changing function will need to run +# through everything one more time until the allocations are stable. +$QEMU_IMG amend -o refcount_width=64 "$TEST_IMG" +print_refcount_width Umm, that sounds backwards from what you document. It's a good test of the _new_ reftable needing a second round of allocations. So keep it with corrected comments. But I think you _intended_ to write a test that starts with a refcount_width=64 image and resize to a refcount_width=1, where the _old_ reftable then suffers a reallocation as part of allocating refblocks for the new table. It may even help if you add a tracepoint for every iteration through the walk function callback, to prove we are indeed executing it 3 times instead of the usual 2, for these test cases. I'm currently thinking about a way to test the old reftable reallocation issue, and I can't find any. So, for the old reftable to require a reallocation it must grow. For it to grow we need some allocation beyond what it can currently represent. For this to happen during the refblock allocation walk, this allocation must be the allocation of a new refblock. If the refblock is allocated beyond the current reftable's limit, this means that either all clusters between free_cluster_index and that point are already taken. If the reftable is then reallocated, it will therefore *always* be allocated behind that refblock, which is beyond its old limit. Therefore, that walk through the old reftable will never miss that new allocation. So the issue can only occur if the old reftable is resized after the walk through it, that is, when allocating the new reftable. That is indeed an issue but I think it manifests itself basically like the issue I'm testing here: There is now an area in the old refcount structures which was free before but has is used now, and the allocation causing that was the allocation of the new reftable. The only difference is whether the it's the old or the new reftable that resides in the previously free area. Thus, I think I'll leave it at this test – but if you can describe to me how to create an image for a different "rewalk" path, I'm all ears. Max
Re: [Qemu-devel] [RFC][PATCH 2/2] xen:i386:pc_piix: create isa bridge specific to IGD passthrough
On Mon, Nov 17, 2014 at 07:18:17PM +0800, Chen, Tiejun wrote: > On 2014/11/17 18:13, Michael S. Tsirkin wrote: > >On Mon, Nov 17, 2014 at 05:42:12PM +0800, Chen, Tiejun wrote: > >>On 2014/11/17 17:25, Michael S. Tsirkin wrote: > >>>On Mon, Nov 17, 2014 at 04:48:32PM +0800, Chen, Tiejun wrote: > On 2014/11/17 14:10, Michael S. Tsirkin wrote: > >On Mon, Nov 17, 2014 at 10:47:56AM +0800, Chen, Tiejun wrote: > >>On 2014/11/5 22:09, Michael S. Tsirkin wrote: > >>>On Wed, Nov 05, 2014 at 03:22:59PM +0800, Tiejun Chen wrote: > Currently IGD drivers always need to access PCH by 1f.0, and > PCH vendor/device id is used to identify the card. > > Signed-off-by: Tiejun Chen > --- > >> > >>[snip] > >> > >>>Cleaner: > >>>if (!pci_dev) { > >>> fprintf > >>> return; > >>> } > >>> pci_config_set_device_id(pci_dev->config, pch_id); > >> > >>I will address all comments and thanks. > >> > >>> > +} > +} > + > /* init */ > > static int xen_pt_initfn(PCIDevice *d) > @@ -682,6 +770,9 @@ static int xen_pt_initfn(PCIDevice *d) > return -1; > } > > +/* Register ISA bridge for passthrough GFX. */ > +xen_igd_passthrough_isa_bridge_create(s, &s->real_device); > + > /* reinitialize each config register to be emulated */ > if (xen_pt_config_init(s)) { > XEN_PT_ERR(d, "PCI Config space initialisation failed.\n"); > > Note I will introduce a inline function in another patch, > > +static inline int is_vga_passthrough(XenHostPCIDevice *dev) > +{ > +return (xen_has_gfx_passthru && (dev->vendor_id == > PCI_VENDOR_ID_INTEL) > +&& ((dev->class_code >> 0x8) == PCI_CLASS_DISPLAY_VGA)); > +} > > Thanks > Tiejun > >>> > >>>Why bother with all these conditions? > >>>Won't it be enough to check dev->vendor_id == PCI_VENDOR_ID_INTEL? > >>> > >> > >>If this is just used for IGD, its always fine without checking vendor_id. > > > >You need to match device ID to *know* it's IGD. > > > >>So after remove that check, I guess I need to rename that as > >>is_igd_vga_passthrough() to make sense. > >> > >>Thanks > >>Tiejun > > > >There is no need to check class code or xen_has_gfx_passthru flag. > >Device ID + vendor ID identifies each device uniquely. > > > > Yeah. > > Here I assume vendor ID is always PCI_VENDOR_ID_INTEL so looks you means I > also need to check that table to do something like, > > is_igd_vga_passthugh(dev) > { > int i; > int num = ARRAY_SIZE(xen_igd_combo_id_infos); > for (i = 0; i < num; i++) { > if (dev->device_id == xen_igd_combo_id_infos[i].gpu_device_id) > return 1; > return 0; > } > > Then we can simplify the subsequent codes based on this, right? > > Thanks > Tiejun Yea. Basically let's try to treat this simply as a device quirk, and see where this gets us. -- MST
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On (Mon) 17 Nov 2014 [13:48:58], Michael S. Tsirkin wrote: > On Mon, Nov 17, 2014 at 04:37:50PM +0530, Amit Shah wrote: > > On (Mon) 17 Nov 2014 [12:52:59], Michael S. Tsirkin wrote: > > > On Mon, Nov 17, 2014 at 04:08:58PM +0530, Amit Shah wrote: > > > > On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > > > > > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > > > > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > > > > > This patchset fixes CVE-2014-7840: invalid > > > > > > > migration stream can cause arbitrary qemu memory > > > > > > > overwrite. > > > > > > > First patch includes the minimal fix for the issue. > > > > > > > Follow-up patches on top add extra checking to reduce the > > > > > > > chance this kind of bug recurs. > > > > > > > > > > > > > > Note: these are already (tentatively-pending review) > > > > > > > queued in my tree, so only review/ack > > > > > > > is necessary. > > > > > > > > > > > > Why not let this go in via the migration tree? > > > > > > > > > > Well I Cc'd Juan and David, so if they had a problem with this, I > > > > > expect > > > > > they'd complain. David acked so I assume it's ok. Since I wasted > > > > > time > > > > > testing this and have it on my tree already, might as well just merge. > > > > > > > > IMO asking as a courtesy would've been better than just stating it. > > > > > > Right, thanks for reminding me. > > > > > > BTW, there is actually a good reason to special-case it: it's a CVE fix, > > > which I handle. So they stay on my private queue and are passed > > > to vendors so vendors can fix downstreams, until making fix public is > > > cleared with all reporters and vendors. > > > After reporting is cleared, I try to collect acks but don't normally route > > > patches through separate queues - that would make it harder to > > > track the status which we need for CVEs. > > > > Patch is public, so all of this doesn't really matter. > > > > But: involving maintainers in their areas, even if the patch is > > embargoed, should be a pre-requisite. I'm not sure if we're doing > > that, but please do that so patches get a proper review from the > > maintainers. > > Involving more people means more back and forth with reporters which > must approve any disclosure. If the issue isn't clear, I do involve > maintainers. I send patches on list and try to merge them only after > they get ack from relevant people. I'm sorry, but this is as far as I > have the time to go. The other aspect of the thing is sub-optimal, or patches with bugs, get pushed in, because the maintainers didn't get involved. Also, maintainers don't like code being changed behind their backs... But if you're overloaded with handling security issues, we can help identify a co-maintainer as well. > > > I guess this specific one actually is well contained, so it could go in > > > through a specific tree if it had to. In fact, it is still possible if > > > Juan says he prefers it so: I only expect to send pull request around > > > tomorrow or the day after that. > > > > I'm sure we prefer migration patches go through the migration tree. > > I'm sorry - I disagree. maintainer boundaries in qemu are quite flexible > because code is somewhat monolithic. We prefer that patches are > reviewed by people that are familiar with it, that is for sure. Which > tree is definitely secondary. If there's a conflict, we can resolve it. > I doubt it will happen here. For well-maintained areas, I'm not sure I agree with the flexibility argument. Juan has been maintaining the migration tree for a long while now. > > Also, this week I'm looking at the migration queue -- it's an > > unofficial split of maintenance duties between Juan and me while we're > > still trying to find out what works best. > > > > I don't know how am I supposed to know that. Right now you don't need to -- I just pick patches up and pass them on to Juan, who does the pull req. > Send patch to MAINTAINERS or something? I'm going to add myself to migration maintainers, yes. But that's secondary; the pull reqs still go via Juan. Amit
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
On 2014/11/17 18:53, zhanghailiang wrote: On 2014/11/17 18:39, Michael S. Tsirkin wrote: On Mon, Nov 17, 2014 at 01:11:07PM +0800, zhanghailiang wrote: Hi, Patch 1 and 2 mainly fix bug about balloon not working correctly when we do hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong after we hotplug memory. This bug exists since we begin to support hotplug memory, and it is better to fix it. Patch 3 add some trace events, it helps debugging balloon. If it is unnecessary, pls feel free to remove it. Thanks, zhanghailiang What about other users of ram_size? Are they all incorrect? pc-dimm is only supported in x86 target now, and i am not quite sure if hotplug memory will break migration. I'll look into it. Thanks. Hi Michael, I have made a global search in qemu code, ram_size is used mostly for VM's startup initialization, I think it's all OK except virtio-balloon and function vmport_cmd_ram_size (I'm not sure about this place :( ). But, Unfortunately, hotplug memory action breaks migration. :( I have made a simple test about this: Source: # start VM # hotplug memory: object_add memory-backend-ram,id=ram1,size=1024M,host-nodes=0,policy=bind device_add pc-dimm,id=dimm1,memdev=ram1 # migrate VM to Destination Destination: # qemu-system-x86_64: Unknown ramblock "ram1", cannot accept migration qemu: warning: error while loading state for instance 0x0 of device 'ram' qemu-system-x86_64: load of migration failed: Invalid argument *further test*: hot-add CPU also break migration and reports error in destination: 'Unknown savevm section or instance 'cpu_common' 4 qemu-system-x86_64: load of migration failed: Invalid argument' I think we should support migration after hotplug memory/CPU action, what's your opinion? ;) I will try to fix these two problems. Thanks, zhanghailiang v2: - fix compiling break for other targets that don't support pc-dimm zhanghailiang (3): pc-dimm: add a function to calculate VM's current RAM size virtio-balloon: Fix balloon not working correctly when hotplug memory virtio-balloon: Add some trace events hw/mem/pc-dimm.c| 26 ++ hw/virtio/virtio-balloon.c | 21 +++-- include/exec/cpu-common.h | 1 + stubs/qmp_pc_dimm_device_list.c | 5 + trace-events| 4 5 files changed, 51 insertions(+), 6 deletions(-) -- 1.7.12.4 .
Re: [Qemu-devel] [Qemu-ppc] [PATCH V2 0/3] spapr: Fix stale HTAB during live migration
> Am 17.11.2014 um 05:12 schrieb Samuel Mendoza-Jonas : > > If a spapr guest reboots during a live migration, the guest HTAB on the > destination is not updated properly, usually resulting in a kernel panic. > > This is a (delayed!) follow up to my previous patch including a fix > for TCG guests as well as KVM. > > Changes from V1: > - Split out overflow fix into separate patch > - Removed unnecessary locks (relevant operations occur under BQL) > - TCG: Set HTAB dirty instead of resetting migration state > - Minor style fixes Looks great to me, but I would like to get a reviewed-by from Alexey as well ;) Alex > > Samuel Mendoza-Jonas (3): > spapr: Fix stale HTAB during live migration (KVM) > spapr: Fix integer overflow during migration (TCG) > spapr: Fix stale HTAB during live migration (TCG) > > hw/ppc/spapr.c | 60 +++--- > include/hw/ppc/spapr.h | 1 + > 2 files changed, 53 insertions(+), 8 deletions(-) > > -- > 1.9.3 > >
[Qemu-devel] [PATCH v4 1/3] chardev: Add -qmp-pretty
Add a command line option for adding a QMP monitor using pretty JSON formatting. Signed-off-by: Max Reitz Reviewed-by: Eric Blake --- qemu-options.hx | 8 vl.c| 15 ++- 2 files changed, 18 insertions(+), 5 deletions(-) diff --git a/qemu-options.hx b/qemu-options.hx index da9851d..bc7b4c2 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2788,6 +2788,14 @@ STEXI @findex -qmp Like -monitor but opens in 'control' mode. ETEXI +DEF("qmp-pretty", HAS_ARG, QEMU_OPTION_qmp_pretty, \ +"-qmp-pretty dev like -qmp but uses pretty JSON formatting\n", +QEMU_ARCH_ALL) +STEXI +@item -qmp-pretty @var{dev} +@findex -qmp-pretty +Like -qmp but uses pretty JSON formatting. +ETEXI DEF("mon", HAS_ARG, QEMU_OPTION_mon, \ "-mon [chardev=]name[,mode=readline|control][,default]\n", QEMU_ARCH_ALL) diff --git a/vl.c b/vl.c index f4a6e5e..c7bebad 100644 --- a/vl.c +++ b/vl.c @@ -2284,7 +2284,7 @@ static int mon_init_func(QemuOpts *opts, void *opaque) return 0; } -static void monitor_parse(const char *optarg, const char *mode) +static void monitor_parse(const char *optarg, const char *mode, bool pretty) { static int monitor_device_index = 0; QemuOpts *opts; @@ -2314,6 +2314,7 @@ static void monitor_parse(const char *optarg, const char *mode) } qemu_opt_set(opts, "mode", mode); qemu_opt_set(opts, "chardev", label); +qemu_opt_set_bool(opts, "pretty", pretty); if (def) qemu_opt_set(opts, "default", "on"); monitor_device_index++; @@ -3292,11 +3293,15 @@ int main(int argc, char **argv, char **envp) case QEMU_OPTION_monitor: default_monitor = 0; if (strncmp(optarg, "none", 4)) { -monitor_parse(optarg, "readline"); +monitor_parse(optarg, "readline", false); } break; case QEMU_OPTION_qmp: -monitor_parse(optarg, "control"); +monitor_parse(optarg, "control", false); +default_monitor = 0; +break; +case QEMU_OPTION_qmp_pretty: +monitor_parse(optarg, "control", true); default_monitor = 0; break; case QEMU_OPTION_mon: @@ -3994,7 +3999,7 @@ int main(int argc, char **argv, char **envp) add_device_config(DEV_SCLP, "stdio"); } if (default_monitor) -monitor_parse("stdio", "readline"); +monitor_parse("stdio", "readline", false); } } else { if (default_serial) @@ -4002,7 +4007,7 @@ int main(int argc, char **argv, char **envp) if (default_parallel) add_device_config(DEV_PARALLEL, "vc:80Cx24C"); if (default_monitor) -monitor_parse("vc:80Cx24C", "readline"); +monitor_parse("vc:80Cx24C", "readline", false); if (default_virtcon) add_device_config(DEV_VIRTCON, "vc:80Cx24C"); if (default_sclp) { -- 1.9.3
[Qemu-devel] [PATCH v4 0/3] chardev: Add -qmp-pretty
This series does not add new functionality. Adding a QMP monitor with prettily formatted JSON output can be done as follows: $ qemu -chardev stdio,id=mon0 -mon chardev=mon0,mode=control,pretty=on However, this is rather cumbersome, so this series (its first patch) adds a shortcut in the form of the new command line option -qmp-pretty. Since the argument given to a monitor command line option (such as -qmp) is parsed depending on its prefix and probably also depending on the current phase of the moon, this is cleaner than trying to add a "switch" to -qmp itself (in the form of "-qmp stdio,pretty=on"). Patch 3 makes uses of the new option in qemu-iotest 067 to greatly increase maintainability of its reference output. Patch 2 extends the QMP filter for qemu-iotests so it is able to filter out the QMP version object in pretty mode. v4: - Patch 2: Add newline in sed script after c\ [Eric] v3: - Patch 2: Cull useless "discard=0" v2: - Patch 2: Replaced the multi-line QMP_VERSION replacement written in bash by a nice sed script [Eric] git-backport-diff against v3: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/3:[] [--] 'chardev: Add -qmp-pretty' 002/3:[0003] [FC] 'iotests: _filter_qmp for pretty JSON output' 003/3:[] [--] 'iotests: Use -qmp-pretty in 067' Max Reitz (3): chardev: Add -qmp-pretty iotests: _filter_qmp for pretty JSON output iotests: Use -qmp-pretty in 067 qemu-options.hx | 8 + tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 --- tests/qemu-iotests/common.filter | 4 +- vl.c | 15 +- 5 files changed, 744 insertions(+), 64 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH v4 2/3] iotests: _filter_qmp for pretty JSON output
_filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz --- tests/qemu-iotests/common.filter | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..dfb9726 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -167,7 +167,9 @@ _filter_qmp() { _filter_win32 | \ sed -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{"QMP":.*}$#QMP_VERSION#' +-e 's#^{"QMP":.*}$#QMP_VERSION#' \ +-e '/^"QMP": {\s*$/, /^}\s*$/ c\' \ +-e 'QMP_VERSION' } # replace driver-specific options in the "Formatting..." line -- 1.9.3
[Qemu-devel] [PATCH v4 3/3] iotests: Use -qmp-pretty in 067
067 invokes query-block, resulting in a reference output with really long lines (which may pose a problem in email patches and always poses a problem when the output changes, because it is hard to see what has actually changed). Use -qmp-pretty to mitigate this issue. Signed-off-by: Max Reitz Reviewed-by: Eric Blake --- tests/qemu-iotests/067 | 2 +- tests/qemu-iotests/067.out | 779 + 2 files changed, 723 insertions(+), 58 deletions(-) diff --git a/tests/qemu-iotests/067 b/tests/qemu-iotests/067 index d025192..29cd6b5 100755 --- a/tests/qemu-iotests/067 +++ b/tests/qemu-iotests/067 @@ -39,7 +39,7 @@ _supported_os Linux function do_run_qemu() { echo Testing: "$@" -$QEMU -nographic -qmp stdio -serial none "$@" +$QEMU -nographic -qmp-pretty stdio -serial none "$@" echo } diff --git a/tests/qemu-iotests/067.out b/tests/qemu-iotests/067.out index 0f72dcf..929dc74 100644 --- a/tests/qemu-iotests/067.out +++ b/tests/qemu-iotests/067.out @@ -4,77 +4,742 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 === -drive/-device and device_del === Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,if=none,id=disk -device virtio-blk-pci,drive=disk,id=virtio0 -QMP_VERSION -{"return": {}} -{"return": [{"io-status": "ok", "device": "disk", "locked": false, "removable": false, "inserted": {"iops_rd": 0, "detect_zeroes": "off", "image": {"virtual-size": 134217728, "filename": "TEST_DIR/t.qcow2", "cluster-size": 65536, "format": "qcow2", "actual-size": SIZE, "format-specific": {"type": "qcow2", "data": {"compat": "1.1", "lazy-refcounts": false, "corrupt": false}}, "dirty-flag": false}, "iops_wr": 0, "ro": false, "backing_file_depth": 0, "drv": "qcow2", "iops": 0, "bps_wr": 0, "encrypted": false, "bps": 0, "bps_rd": 0, "file": "TEST_DIR/t.qcow2", "encryption_key_missing": false}, "type": "unknown"}, {"io-status": "ok", "device": "ide1-cd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "floppy0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "sd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}]} -{"return": {}} -{"return": {}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio0/virtio-backend"}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "DEVICE_DELETED", "data": {"device": "virtio0", "path": "/machine/peripheral/virtio0"}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "RESET"} -{"return": [{"io-status": "ok", "device": "ide1-cd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "floppy0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}, {"device": "sd0", "locked": false, "removable": true, "tray_open": false, "type": "unknown"}]} -{"return": {}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN"} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "ide1-cd0", "tray-open": true}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "DEVICE_TRAY_MOVED", "data": {"device": "floppy0", "tray-open": true}} +{ +QMP_VERSION +} +{ +"return": { +} +} +{ +"return": [ +{ +"io-status": "ok", +"device": "disk", +"locked": false, +"removable": false, +"inserted": { +"iops_rd": 0, +"detect_zeroes": "off", +"image": { +"virtual-size": 134217728, +"filename": "TEST_DIR/t.qcow2", +"cluster-size": 65536, +"format": "qcow2", +"actual-size": SIZE, +"format-specific": { +"type": "qcow2", +"data": { +"compat": "1.1", +"lazy-refcounts": false, +"corrupt": false +} +}, +"dirty-flag": false +}, +"iops_wr": 0, +"ro": false, +"backing_file_depth": 0, +"drv": "qcow2", +"iops": 0, +"bps_wr": 0, +"encrypted": false, +"bps": 0, +"bps_rd": 0, +"file": "TEST_DIR/t.qcow2", +"encryption_key_missing": false +}, +"type": "unknown" +}, +{ +"io-status": "ok", +"device": "ide1-cd0", +"locked": false, +"removable": true, +"tray_open": false, +"type": "unknown" +
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 13 November 2014 23:55, Liviu Ionescu wrote: > The shortcomings addressed by this patch: > - the semihosting trace messages disapeared when the GDB session was started > - the semihosting exit code was not passed back to the host > - the semihosting command line was not passed properly, because it generated a > very large string, including the image full path > - the stelaris/armv7m code did not pass the semihosting command line at all > - the GDB use case, although allowed the image to be loaded via GDB, it still > required the presence of the -kernel option on the command line > - the -kernel option was not appropriate for usual applications Thanks for sending this patch. There's definitely some good things in this patch, but from my perspective the main issue with it is that it's combining six different features and bug fixes into a single commit. Could you separate them out into their own patches? You can start out by separating out one or two and submitting those. I've given my general opinions on each feature below, which hopefully will suggest which ones to start with: > - the semihosting trace messages disapeared when the GDB session was started This (the extra command line option to specify where semihosting should go) is definitely a feature we should add. I think it's possible to make use of the QemuOpts infrastructure to support -semihosting # current option name with existing semantics -semihosting target=gdb -semihosting target=native -semihosting target=auto # same as plain "-semihosting" something like (untested): static QemuOptsList qemu_semihosting_opts = { .name = "semihosting", .implied_opt_name = "enable", .head = QTAILQ_HEAD_INITIALIZER(qemu_semihosting_opts.head), .desc = { { .name = "enable", .type = QEMU_OPT_BOOL, },{ .name = "target", { /* end of list */ } }, }; > - the semihosting exit code was not passed back to the host This is the change to return 1 if the reason code isn't ApplicationExit, right? This seems a reasonable change. > - the semihosting command line was not passed properly, because it generated a > very large string, including the image full path > - the stelaris/armv7m code did not pass the semihosting command line at all These both sound like bugs that we should fix. > - the GDB use case, although allowed the image to be loaded via GDB, it still > required the presence of the -kernel option on the command line The way we've approached this for other board models is simply to remove the requirement for a -kernel option, so if you start the model up without providing -kernel then we behave as the hardware would (ie sit there doing nothing). > A more generic option was added to specify the application file to be emulated > > -image file-path I'm pretty wary about this one, because we already have several image loading options (-kernel, -bios) with complicated semantics that may not be the same on different target architectures. What does your "-image" option do that's different from the existing "-kernel" ? thanks -- PMM
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 17 November 2014 12:32, Peter Maydell wrote: > This (the extra command line option to specify where semihosting > should go) is definitely a feature we should add. I think it's > possible to make use of the QemuOpts infrastructure to support > -semihosting # current option name with existing semantics > -semihosting target=gdb > -semihosting target=native > -semihosting target=auto # same as plain "-semihosting" ...and then my mail client sent that mail half-composed. The missing fragment: > something like (untested): > static QemuOptsList qemu_semihosting_opts = { > .name = "semihosting", > .implied_opt_name = "enable", > .head = QTAILQ_HEAD_INITIALIZER(qemu_semihosting_opts.head), > .desc = { > { > .name = "enable", > .type = QEMU_OPT_BOOL, > },{ > .name = "target", > .type = QEMU_OPT_STRING, > } > { /* end of list */ } > }, > }; [and if you don't specify target then we default to 'auto'.] Compare the handling of 'rtc' and similar options: you get to have an "implied option name" so "-foo" is treated like "-foo enable=true", which gives us backwards compatibility with our existing option, and a syntax for specifying the target that fits in with the other command line options we already have. thanks -- PMM
Re: [Qemu-devel] [PATCH 0/4] migration: fix CVE-2014-7840
On Mon, Nov 17, 2014 at 05:50:34PM +0530, Amit Shah wrote: > On (Mon) 17 Nov 2014 [13:48:58], Michael S. Tsirkin wrote: > > On Mon, Nov 17, 2014 at 04:37:50PM +0530, Amit Shah wrote: > > > On (Mon) 17 Nov 2014 [12:52:59], Michael S. Tsirkin wrote: > > > > On Mon, Nov 17, 2014 at 04:08:58PM +0530, Amit Shah wrote: > > > > > On (Mon) 17 Nov 2014 [12:32:57], Michael S. Tsirkin wrote: > > > > > > On Mon, Nov 17, 2014 at 12:06:38PM +0530, Amit Shah wrote: > > > > > > > On (Wed) 12 Nov 2014 [11:44:35], Michael S. Tsirkin wrote: > > > > > > > > This patchset fixes CVE-2014-7840: invalid > > > > > > > > migration stream can cause arbitrary qemu memory > > > > > > > > overwrite. > > > > > > > > First patch includes the minimal fix for the issue. > > > > > > > > Follow-up patches on top add extra checking to reduce the > > > > > > > > chance this kind of bug recurs. > > > > > > > > > > > > > > > > Note: these are already (tentatively-pending review) > > > > > > > > queued in my tree, so only review/ack > > > > > > > > is necessary. > > > > > > > > > > > > > > Why not let this go in via the migration tree? > > > > > > > > > > > > Well I Cc'd Juan and David, so if they had a problem with this, I > > > > > > expect > > > > > > they'd complain. David acked so I assume it's ok. Since I wasted > > > > > > time > > > > > > testing this and have it on my tree already, might as well just > > > > > > merge. > > > > > > > > > > IMO asking as a courtesy would've been better than just stating it. > > > > > > > > Right, thanks for reminding me. > > > > > > > > BTW, there is actually a good reason to special-case it: it's a CVE fix, > > > > which I handle. So they stay on my private queue and are passed > > > > to vendors so vendors can fix downstreams, until making fix public is > > > > cleared with all reporters and vendors. > > > > After reporting is cleared, I try to collect acks but don't normally > > > > route > > > > patches through separate queues - that would make it harder to > > > > track the status which we need for CVEs. > > > > > > Patch is public, so all of this doesn't really matter. > > > > > > But: involving maintainers in their areas, even if the patch is > > > embargoed, should be a pre-requisite. I'm not sure if we're doing > > > that, but please do that so patches get a proper review from the > > > maintainers. > > > > Involving more people means more back and forth with reporters which > > must approve any disclosure. If the issue isn't clear, I do involve > > maintainers. I send patches on list and try to merge them only after > > they get ack from relevant people. I'm sorry, but this is as far as I > > have the time to go. > > The other aspect of the thing is sub-optimal, or patches with bugs, > get pushed in, because the maintainers didn't get involved. Patches don't get merged before they are on list for a while. I typically ping people if I don't get acks. > Also, > maintainers don't like code being changed behind their backs... no one is changing code in secret here. So don't turn your back - review patches :0 FYI I'm sometimes merging patches on list that don't get reviews for weeks - if they seem to make sense. > But if you're overloaded with handling security issues, we can help > identify a co-maintainer as well. Sure. I don't think we need more process to slow down the flow of patches even more. > > > > I guess this specific one actually is well contained, so it could go in > > > > through a specific tree if it had to. In fact, it is still possible if > > > > Juan says he prefers it so: I only expect to send pull request around > > > > tomorrow or the day after that. > > > > > > I'm sure we prefer migration patches go through the migration tree. > > > > I'm sorry - I disagree. maintainer boundaries in qemu are quite flexible > > because code is somewhat monolithic. We prefer that patches are > > reviewed by people that are familiar with it, that is for sure. Which > > tree is definitely secondary. If there's a conflict, we can resolve it. > > I doubt it will happen here. > > For well-maintained areas, I'm not sure I agree with the flexibility > argument. Juan has been maintaining the migration tree for a long > while now. Did you look at the patches? Most of them touch files like exec.c which is all over the place. > > > Also, this week I'm looking at the migration queue -- it's an > > > unofficial split of maintenance duties between Juan and me while we're > > > still trying to find out what works best. > > > > > > > I don't know how am I supposed to know that. > > Right now you don't need to -- I just pick patches up and pass them on > to Juan, who does the pull req. > > > Send patch to MAINTAINERS or something? > > I'm going to add myself to migration maintainers, yes. But that's > secondary; the pull reqs still go via Juan. > > > Amit
Re: [Qemu-devel] [PATCH v2 0/3] fix bug about balloon working incorrectly when hotplug memeory
On Mon, Nov 17, 2014 at 08:25:14PM +0800, zhanghailiang wrote: > On 2014/11/17 18:53, zhanghailiang wrote: > >On 2014/11/17 18:39, Michael S. Tsirkin wrote: > >>On Mon, Nov 17, 2014 at 01:11:07PM +0800, zhanghailiang wrote: > >>>Hi, > >>> > >>>Patch 1 and 2 mainly fix bug about balloon not working correctly when we do > >>>hotplug memory. It takes 'ram_size' as VM's real RAM size which is wrong > >>>after we hotplug memory. > >>> > >>>This bug exists since we begin to support hotplug memory, and it is better > >>>to fix it. > >>> > >>>Patch 3 add some trace events, it helps debugging balloon. If it is > >>>unnecessary, > >>>pls feel free to remove it. > >>> > >>>Thanks, > >>>zhanghailiang > >> > >>What about other users of ram_size? > >>Are they all incorrect? > >> > > > >pc-dimm is only supported in x86 target now, and i am not quite sure if > >hotplug > >memory will break migration. I'll look into it. Thanks. > > > > Hi Michael, > > I have made a global search in qemu code, ram_size is used mostly for VM's > startup initialization, I think it's all OK except virtio-balloon and > function vmport_cmd_ram_size (I'm not sure about this place :( ). Comment out ram_size in header, and see what breaks. > But, Unfortunately, hotplug memory action breaks migration. :( > I have made a simple test about this: > > Source: > # start VM > # hotplug memory: > object_add memory-backend-ram,id=ram1,size=1024M,host-nodes=0,policy=bind > device_add pc-dimm,id=dimm1,memdev=ram1 > # migrate VM to Destination > > Destination: > # qemu-system-x86_64: Unknown ramblock "ram1", cannot accept migration > qemu: warning: error while loading state for instance 0x0 of device 'ram' > qemu-system-x86_64: load of migration failed: Invalid argument > > *further test*: > hot-add CPU also break migration and reports error in destination: > 'Unknown savevm section or instance 'cpu_common' 4 > qemu-system-x86_64: load of migration failed: Invalid argument' > > I think we should support migration after hotplug memory/CPU action, > what's your opinion? ;) You must configure identical hardware on source and destination. This means that after adding memory on source, you must specify it (on command line) for destination. > I will try to fix these two problems. > > Thanks, > zhanghailiang > >>>v2: > >>>- fix compiling break for other targets that don't support pc-dimm > >>> > >>>zhanghailiang (3): > >>> pc-dimm: add a function to calculate VM's current RAM size > >>> virtio-balloon: Fix balloon not working correctly when hotplug memory > >>> virtio-balloon: Add some trace events > >>> > >>> hw/mem/pc-dimm.c| 26 ++ > >>> hw/virtio/virtio-balloon.c | 21 +++-- > >>> include/exec/cpu-common.h | 1 + > >>> stubs/qmp_pc_dimm_device_list.c | 5 + > >>> trace-events| 4 > >>> 5 files changed, 51 insertions(+), 6 deletions(-) > >>> > >>>-- > >>>1.7.12.4 > >>> > >> > >>. > >> > > > > > > > > > > >
Re: [Qemu-devel] [RFC PATCH] spapr-pci: Enable huge BARs
> Am 17.11.2014 um 04:54 schrieb Alexey Kardashevskiy : > > At the moment sPAPR only supports 512MB window for MMIO BARs. However > modern devices might want bigger 64bit BARs. > > This adds another 64bit MMIO window per PHB and advertises it via > the PHB's "ranges" property in the device tree. The new window is 1TB long > and starts from 1TB offset on a PCI address space. > > Older (<3.12) kernels expect BARs to have the same offset on both bus > and memory address spaces. Since we are getting now another MMIO region, > we either have to add (0xA000. - 0x8000.) offset to > its bus offset OR simply put MMIO range at the same offset as on the bus. > This puts 32bit MMIO space at 0x8000. offset in RAM and IO space > at 0xA000. (used to be vice versa). > > While we are here, let's increase PHB address spacing from 64GB to 16TB > in order not to touch it again any time soon. > > Signed-off-by: Alexey Kardashevskiy I'll have to digthrough the compatibility matrix details myself still, but please make sure that you don't change address space layouts for pseries-2.2. Alex
Re: [Qemu-devel] [PATCH v6 04/10] hbitmap: Add hbitmap_copy
+ +HBitmap *hbitmap_copy(const HBitmap *bitmap) +{ +int i; +int64_t size; +HBitmap *hb = g_memdup(bitmap, sizeof(struct HBitmap)); + +size = bitmap->size; +for (i = HBITMAP_LEVELS; i-- > 0; ) { +size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1); +hb->levels[i] = g_memdup(bitmap->levels[i], + size * sizeof(unsigned long)); +} + +return hb; +} "(size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL" - will be zero iff size == 0. Is it really possible in qemu? If not, we doesn't need MAX(..., 1). There is similar construction in older "hbitmap_alloc" function. -- Best regards, Vladimir
Re: [Qemu-devel] [PATCH 9/9] hcd-musb: fix dereference null return value
On 2014/11/17 19:18, Gonglei wrote: > On 2014/11/17 18:58, Paolo Bonzini wrote: > >> >> >> On 15/11/2014 11:06, arei.gong...@huawei.com wrote: >>> From: Gonglei >>> >>> Signed-off-by: Gonglei >>> --- >>> hw/usb/hcd-musb.c | 4 >>> 1 file changed, 4 insertions(+) >>> >>> diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c >>> index 66bc61a..f2cb73c 100644 >>> --- a/hw/usb/hcd-musb.c >>> +++ b/hw/usb/hcd-musb.c >>> @@ -624,6 +624,10 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, >>> >>> /* A wild guess on the FADDR semantics... */ >>> dev = usb_find_device(&s->port, ep->faddr[idx]); >>> +if (!dev) { >>> +TRACE("Do not find an usb device"); >>> +return; >>> +} >>> uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); >>> usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, >>> (dev->addr << 16) | (uep->nr << 8) | pid, false, >>> true); >>> >> >> I think this patch is not the real fix. usb_ep_get and >> usb_handle_packet can deal with a NULL device, but we have to avoid >> dereferencing NULL pointers when building the id. >> > > Good catch :) > >> Paolo >> >> diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c >> index 66bc61a..40809f6 100644 >> --- a/hw/usb/hcd-musb.c >> +++ b/hw/usb/hcd-musb.c >> @@ -608,6 +608,7 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, >> USBDevice *dev; >> USBEndpoint *uep; >> int idx = epnum && dir; >> +int id; >> int ttype; >> >> /* ep->type[0,1] contains: >> @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, >> /* A wild guess on the FADDR semantics... */ >> dev = usb_find_device(&s->port, ep->faddr[idx]); >> uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); >> -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, >> - (dev->addr << 16) | (uep->nr << 8) | pid, false, true); >> +id = pid; >> +if (uep) { >> +id |= (dev->addr << 16) | (uep->nr << 8); >> +} >> +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); >> usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); >> ep->packey[dir].ep = ep; >> ep->packey[dir].dir = dir; > > This is a good approach, id is just a identifying. Thanks, > Let me split the patch from this series as a separate patch and add Paolo's signed-off-by. Asking for Gerd's reviewing, Thanks. Best regards, -Gonglei
[Qemu-devel] [PATCH] hcd-musb: fix dereference null return value
From: Gonglei usb_ep_get and usb_handle_packet can deal with a NULL device, but we have to avoid dereferencing NULL pointers when building the id. Signed-off-by: Paolo Bonzini Signed-off-by: Gonglei --- hw/usb/hcd-musb.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/usb/hcd-musb.c b/hw/usb/hcd-musb.c index 66bc61a..72ddc7e 100644 --- a/hw/usb/hcd-musb.c +++ b/hw/usb/hcd-musb.c @@ -609,6 +609,7 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, USBEndpoint *uep; int idx = epnum && dir; int ttype; +uint64_t id; /* ep->type[0,1] contains: * in bits 7:6 the speed (0 - invalid, 1 - high, 2 - full, 3 - slow) @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, /* A wild guess on the FADDR semantics... */ dev = usb_find_device(&s->port, ep->faddr[idx]); uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, - (dev->addr << 16) | (uep->nr << 8) | pid, false, true); +id = (uint64_t)pid; +if (uep) { +id |= (dev->addr << 16) | (uep->nr << 8); +} +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); ep->packey[dir].ep = ep; ep->packey[dir].dir = dir; -- 1.7.12.4
Re: [Qemu-devel] [PATCH 2/4] exec: add wrapper for host pointer access
* Michael S. Tsirkin (m...@redhat.com) wrote: > On Mon, Nov 17, 2014 at 10:58:53AM +, Dr. David Alan Gilbert wrote: > > * Michael S. Tsirkin (m...@redhat.com) wrote: > > > host pointer accesses force pointer math, let's > > > add a wrapper to make them safer. > > > > > > Signed-off-by: Michael S. Tsirkin > > > --- > > > include/exec/cpu-all.h | 5 + > > > exec.c | 10 +- > > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > > > index c085804..9d8d408 100644 > > > --- a/include/exec/cpu-all.h > > > +++ b/include/exec/cpu-all.h > > > @@ -313,6 +313,11 @@ typedef struct RAMBlock { > > > int fd; > > > } RAMBlock; > > > > > > +static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset) > > > +{ > > > +return (char *)block->host + offset; > > > +} > > > > I'm a bit surprised you don't need to pass a length to this to be able > > to tell how much you can access. > > This is because at the moment all accesses only touch a single page. > Said assumption seems to be made all over the code, and won't > be easy to remove. > > > > typedef struct RAMList { > > > QemuMutex mutex; > > > /* Protected by the iothread lock. */ > > > diff --git a/exec.c b/exec.c > > > index ad5cf12..9648669 100644 > > > --- a/exec.c > > > +++ b/exec.c > > > @@ -840,7 +840,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t > > > start, ram_addr_t length) > > > > > > block = qemu_get_ram_block(start); > > > assert(block == qemu_get_ram_block(end - 1)); > > > -start1 = (uintptr_t)block->host + (start - block->offset); > > > +start1 = (uintptr_t)ramblock_ptr(block, start - block->offset); > > > cpu_tlb_reset_dirty_all(start1, length); > > > } > > > > > > @@ -1500,7 +1500,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t > > > length) > > > QTAILQ_FOREACH(block, &ram_list.blocks, next) { > > > offset = addr - block->offset; > > > if (offset < block->length) { > > > -vaddr = block->host + offset; > > > +vaddr = ramblock_ptr(block, offset); > > > if (block->flags & RAM_PREALLOC) { > > > ; > > > } else if (xen_enabled()) { > > > @@ -1551,7 +1551,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr) > > > { > > > RAMBlock *block = qemu_get_ram_block(addr); > > > > > > -return block->host; > > > +return ramblock_ptr(block, 0); > > > } > > > > > > /* Return a host pointer to ram allocated with qemu_ram_alloc. > > > @@ -1578,7 +1578,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr) > > > xen_map_cache(block->offset, block->length, 1); > > > } > > > } > > > -return block->host + (addr - block->offset); > > > +return ramblock_ptr(block, addr - block->offset); > > > } > > > > which then makes me wonder if all the uses of this are safe near the > > end of the block. > > > > > /* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr > > > @@ -1597,7 +1597,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, > > > hwaddr *size) > > > if (addr - block->offset < block->length) { > > > if (addr - block->offset + *size > block->length) > > > *size = block->length - addr + block->offset; > > > -return block->host + (addr - block->offset); > > > +return ramblock_ptr(block, addr - block->offset); > > > } > > > > but then this sounds like it's going to have partial duplication, it > > already looks > > like it's only going to succeed if it finds itself a block that the access > > fits > > in. > > > > > > Dave > > Sorry, I don't really understand what you are saying here. qemu_ram_ptr_length already does some checks, so using ramblock_ptr is duplicating some of that; not a big issue. Dave > > > > } > > > > > > -- > > > MST > > > > > -- > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] mips: Correct the handling of writes to CP0.Status for MIPSr6
On 10/11/2014 13:45, Maciej W. Rozycki wrote: > Correct these issues with the handling of CP0.Status for MIPSr6: > > * only ignore the bit pattern of 0b11 on writes to CP0.Status.KSU, that > is for processors that do implement Supervisor Mode, let the bit > pattern be written to CP0.Status.UM:R0 freely (of course the value > written to read-only CP0.Status.R0 will be discarded anyway); this is > in accordance to the relevant architecture specification[1], > > * check the newly written pattern rather than the current contents of > CP0.Status for the KSU bits being 0b11, > > * use meaningful macro names to refer to CP0.Status bits rather than > magic numbers. > > References: > > [1] "MIPS Architecture For Programmers, Volume III: MIPS64 / microMIPS64 > Privileged Resource Architecture", MIPS Technologies, Inc., Document > Number: MD00091, Revision 6.00, March 31, 2014, Table 9.45 "Status > Register Field Descriptions", pp. 210-211. > > Signed-off-by: Maciej W. Rozycki > --- > Leon, > > Noticed in porting the next change I'm going to post. NB I have no > reasonable way to do run-time checks of an r6 configuration, so please > double check this works for you even though I believe it is obviously > correct; I did check CP0.Status.KSU rejects the pattern of 0b11 via GDB > on MIPS64R6-generic with the aid of the next change. > > I suggest making it a policy not to accept new code using magic > numbers. Then the existing places can be gradually identified and > corrected. > > Please apply. > > Thanks, > > Maciej > > qemu-mips-r6-status-r0.diff > Index: qemu-git-trunk/target-mips/op_helper.c > === > --- qemu-git-trunk.orig/target-mips/op_helper.c 2014-11-09 > 23:44:45.467759913 + > +++ qemu-git-trunk/target-mips/op_helper.c2014-11-09 23:45:27.977531070 > + > @@ -1423,10 +1423,12 @@ void helper_mtc0_status(CPUMIPSState *en > uint32_t mask = env->CP0_Status_rw_bitmask; > > if (env->insn_flags & ISA_MIPS32R6) { > -if (extract32(env->CP0_Status, CP0St_KSU, 2) == 0x3) { > +bool has_supervisor = extract32(mask, CP0St_KSU, 2) == 0x3; > + > +if (has_supervisor && extract32(arg1, CP0St_KSU, 2) == 0x3) { > mask &= ~(3 << CP0St_KSU); > } > -mask &= ~(0x0018 & arg1); > +mask &= ~(((1 << CP0St_SR) | (1 << CP0St_NMI)) & arg1); > } > > val = arg1 & mask; > Thanks for fixing and cleaning this up. Reviewed-by: Leon Alrae
Re: [Qemu-devel] [PATCH 9/9] hcd-musb: fix dereference null return value
> >> @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint > >> *ep, > >> /* A wild guess on the FADDR semantics... */ > >> dev = usb_find_device(&s->port, ep->faddr[idx]); > >> uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); > >> -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, > >> - (dev->addr << 16) | (uep->nr << 8) | pid, false, > >> true); > >> +id = pid; > >> +if (uep) { > >> +id |= (dev->addr << 16) | (uep->nr << 8); > >> +} > >> +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); > >> usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); > >> ep->packey[dir].ep = ep; > >> ep->packey[dir].dir = dir; > > > > This is a good approach, id is just a identifying. Thanks, > > > > Let me split the patch from this series as a separate patch > and add Paolo's signed-off-by. > > Asking for Gerd's reviewing, Thanks. Looks good to me. cheers, Gerd
Re: [Qemu-devel] [PATCH 9/9] hcd-musb: fix dereference null return value
17/11/2014 14:36, Gerd Hoffmann wrote: @@ -625,8 +626,11 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep, /* A wild guess on the FADDR semantics... */ dev = usb_find_device(&s->port, ep->faddr[idx]); uep = usb_ep_get(dev, pid, ep->type[idx] & 0xf); -usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, - (dev->addr << 16) | (uep->nr << 8) | pid, false, true); +id = pid; +if (uep) { +id |= (dev->addr << 16) | (uep->nr << 8); +} +usb_packet_setup(&ep->packey[dir].p, pid, uep, 0, id, false, true); usb_packet_addbuf(&ep->packey[dir].p, ep->buf[idx], len); ep->packey[dir].ep = ep; ep->packey[dir].dir = dir; >>> >>> This is a good approach, id is just a identifying. Thanks, >>> >> >> Let me split the patch from this series as a separate patch >> and add Paolo's signed-off-by. >> >> Asking for Gerd's reviewing, Thanks. > > Looks good to me. Thanks, then I'll pick it up. Paolo
Re: [Qemu-devel] [PATCH] mips: Correct the writes to CP0 Status and Cause registers via gdbstub
Hi Maciej, On 10/11/2014 13:46, Maciej W. Rozycki wrote: > qemu-mips-status.diff > Index: qemu-git-trunk/target-mips/cpu.h > === > --- qemu-git-trunk.orig/target-mips/cpu.h 2014-11-09 23:44:32.0 > + > +++ qemu-git-trunk/target-mips/cpu.h 2014-11-09 23:49:54.997846651 + Could you please include diffstat when you generate patches? (git format-patch adds it by default). It makes review much easier - especially of longer patches - as it immediately tells which files and how many lines have been modified before starting looking at the actual changes. Thanks, Leon
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 17 Nov 2014, at 14:32, Peter Maydell wrote: > that it's combining six different features and bug fixes into a > single commit. Could you separate them out into their own patches? sure. in practical terms, this requires separate branches and each be applied to master, right? my experience with git is not as good as I would like (but improving), and, considering all changes are now in a single branch, could you suggest a simple way to do this? > should go) is definitely a feature we should add. I think it's > possible to make use of the QemuOpts infrastructure to support > -semihosting # current option name with existing semantics > -semihosting target=gdb > -semihosting target=native > -semihosting target=auto # same as plain "-semihosting" > ... will try this >> A more generic option was added to specify the application file to be >> emulated >> >>-image file-path > > I'm pretty wary about this one, because we already have several image > loading options (-kernel, -bios) with complicated semantics that may > not be the same on different target architectures. What does your > "-image" option do that's different from the existing "-kernel" ? there are two issues here: - try to completely ignore your use case of qemu; you have a simple embedded application, or a unit test, that you want to run under qemu, in a similar way you run it on the physical board; you read the qemu manual and you find out that you need to load a linux kernel; this makes absolutely no sense, the small elf you want to run has absolutely nothing to do with any kernel. the -kernel option is also accompanied by several other options, like -initrd, etc that also make no sense for regular embedded applications. - when using the -kernel option, there is also an -append option, to tell the kernel to start with a command line constructed by concatenating the kernel full path and the given options; this does not apply to non-linux images, for those the options are passed via semihosting entirely, including the argv[0]. hence the -semihosting-cmdline, that pairs with -image, not with -kernel, which pairs with -append, but with different functionality. so, -image is more general than -kernel, and is a better match for general use, while -kernel is specific to unix/linux emulations. regards, Liviu
Re: [Qemu-devel] State of ARM FIQ in Qemu
Hi Greg Am Freitag, 14. November 2014, 10:50:40 schrieb Greg Bellows: > On 14 November 2014 09:34, Tim Sander wrote: > > > > > 0xbfffe000? You where talking about the fact that the security > > > > > extensions > > > > > where not implemented. I was not aware that the different vbar's > > > > where > > > > > > > already part of the security stuff? > > > > > > > > MVBAR is part of the Security extensions. HVBAR is part of the > > > > Virtualization extensions. In mainline QEMU we implement neither > > > > of those extensions, and so don't implement the associated > > > > registers. (Strictly speaking, VBAR is also only in the > > > > Security extensions, but we provide it as a workaround for > > > > guests that assume our CPUs should implement it.) > > > > > > Peter beat me to it. None of the VBAR registers should matter in your > > > > case > > > > > which coincides with the use of hivecs. > > > > While writing this mail i found out that the integrated debugger is > > causing > > harm in combination with the fiq. So everything below the braces seems to > > be related to the this problem. But i still wanted to keep the data points > > for > > reference: > > > > { > > Ok, so qemu only implements the SCTLR.V bit to control the memory address > > of > > the interrupt vector. So its either 0 or 0x. That is fine with me. > > Currently i have the problem that a call to set_fiq_handler does not place > > the > > binary stuff loaded at the address where qemu is jumping to which is > > presumably > > 0x1240. I have checked that SCTLR.V =1 under linux which is fine. > > > > The background info to set_fiq_handler from my understanding is that it > > copies > > the given stuff directly at the address where the FIQ vector is located. > > This > > works as the FIQ is the last entry and thus there is some memory space for > > a > > short interrupt handler. I checked the memory when entering the FIQ with > > the > > integrated gdb: > > (gdb) info reg > > r0 0x0 0 > > r1 0x0 0 > > r2 0x1 1 > > r3 0x76eb34c8 1995125960 > > r4 0x76eb34c8 1995125960 > > r5 0x76f633b8 1995846584 > > r6 0x2a 42 > > r7 0x76f4c28c 1995752076 > > r8 0xf8200100 -132120320 > > r9 0xe004 -536608768 > > r100x60004059 1610629209 > > r110x0 0 > > r120x0 0 > > sp 0x908be000 0x908be000 > > lr 0x76dfc108 1994375432 > > pc 0x1240 0x1240 > > cpsr 0x600f01d1 1611596241 > > (gdb) x 0x1240 > > 0xe599b00c > > > > But my firq_fiq_handler starts with 0xee12af10? I know that this works on > > real > > hardware so i suspect that this an error within qemu? Or at least that > > there > > is something amiss in the way the memory is initialized or handled. > > > > Is there a way to instrument the memory below the vector table to get > > debug > > logs if the memory is modified? > > } > > > > > It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi() > > > > just > > > > > to see where it goes. If CONFIG_ARM_GIC is enabled it should take you > > > to > > > your handler I suspect. Plus, if you get there then we have likely > > > > proven > > > > > that QEMU is getting the kernel to the right place. I set a BP in this > > > routine on my A9 run and appear to be hitting it correctly. > > > > So you are talking about the linux kernel, right? CONFIG_ARM_GIC=y check > > but > > i can't find handle_fiq_as_nmi? Even a fuzzier "rgrep nmi * |grep fiq" > > does not > > find anything. > > Maybe we are working off different versions of the kernel sources. I'm > using a kernel variant of v3.18-rc1. I took a look at my 3.15 kernel and > it does not have the routine, so perhaps yours is an earlier version as > well. I am on 3.14 as i am working with rt-preempt kernels right now. > I don't spend much time working in or tracking the Linux kernel, so I am > not sure when the difference was introduced. I just found it to be a > convenient function to set a BP for early FIQ debugging, you may have > something different. > > Interestingly, as I researched the Linux FIQ support I found this mail > thread... > > http://www.spinics.net/lists/arm-kernel/msg14960.html > > As I don't have access to your code, I could not verify that the SVC SPSR > was being preserved, but it may be worth you looking into it as I can see > this potentially resulting in all kinds of random behavior. More > interestingly, this comment and code appears to have been changed in later > versions of the FIQ code, so perhaps this has been fixed or improved (My > 3.18 kernel does not have the comment). I have not following the 3.18 kernel concering the FIQ but i will take a look. But regarding the above link i think preserving SPSR is only needed if mode switc
Re: [Qemu-devel] [PATCH v10 01/26] target-arm: extend async excp masking
On 6 November 2014 15:50, Greg Bellows wrote: > This patch extends arm_excp_unmasked() to use lookup tables for determining > whether IRQ and FIQ exceptions are masked. The lookup tables are based on the > ARMv8 and ARMv7 specification physical interrupt masking tables. > > If EL3 is using AArch64 IRQ/FIQ masking is ignored in all exception levels > other than EL3 if SCR.{FIQ|IRQ} is set to 1 (routed to EL3). > > Signed-off-by: Greg Bellows > > --- > > v8 -> v9 > - Undo the use of tables for exception masking and instead go with simplified > logic based on the target EL lookup. > - Remove the masking tables > > v7 -> v8 > - Add IRQ and FIQ exeception masking lookup tables. > - Rewrite patch to use lookup tables for determining whether an excpetion is > masked or not. > > v5 -> v6 > - Globally change Aarch# to AArch# > - Fixed comment termination > > v4 -> v5 > - Merge with v4 patch 10 > --- > target-arm/cpu.h | 66 > > 1 file changed, 52 insertions(+), 14 deletions(-) > > diff --git a/target-arm/cpu.h b/target-arm/cpu.h > index 7f80090..cf30b2a 100644 > --- a/target-arm/cpu.h > +++ b/target-arm/cpu.h > @@ -1247,27 +1247,51 @@ static inline bool arm_excp_unmasked(CPUState *cs, > unsigned int excp_idx) > CPUARMState *env = cs->env_ptr; > unsigned int cur_el = arm_current_el(env); > unsigned int target_el = arm_excp_target_el(cs, excp_idx); > -/* FIXME: Use actual secure state. */ > -bool secure = false; > -/* If in EL1/0, Physical IRQ routing to EL2 only happens from NS state. > */ > -bool irq_can_hyp = !secure && cur_el < 2 && target_el == 2; > - > -/* Don't take exceptions if they target a lower EL. */ > +bool secure = arm_is_secure(env); > +uint32_t scr; > +uint32_t hcr; > +bool pstate_unmasked; > +int8_t unmasked = 0; > +bool is_aa64 = arm_el_is_aa64(env, 3); If you keep this variable, call it el3_is_aa64 (but see comments below). > + > +/* Don't take exceptions if they target a lower EL. > + * This check should catch any exceptions that would not be taken but > left > + * pending. > + */ > if (cur_el > target_el) { > return false; > } > > switch (excp_idx) { > case EXCP_FIQ: > -if (irq_can_hyp && (env->cp15.hcr_el2 & HCR_FMO)) { > -return true; > -} > -return !(env->daif & PSTATE_F); > +/* If FIQs are routed to EL3 or EL2 then there are cases where we > + * override the CPSR.F in determining if the exception is masked or > + * not. If neither of these are set then we fall back to the CPSR.F > + * setting otherwise we further assess the state below. > + */ > +hcr = (env->cp15.hcr_el2 & HCR_FMO); > +scr = (env->cp15.scr_el3 & SCR_FIQ); > + > +/* When EL3 is 32-bit, the SCR.FW bit controls whether the CPSR.F bit > + * masks FIQ interrupts when taken in non-secure state. If SCR.FW is > + * set then FIQs can be masked by CPSR.F when non-secure but only > + * when FIQs are only routed to EL3. > + */ > +scr &= is_aa64 || !((env->cp15.scr_el3 & SCR_FW) && !hcr); > +pstate_unmasked = !(env->daif & PSTATE_F); > +break; > + > case EXCP_IRQ: > -if (irq_can_hyp && (env->cp15.hcr_el2 & HCR_IMO)) { > -return true; > -} > -return !(env->daif & PSTATE_I); > +/* When EL3 execution state is 32-bit, if HCR.IMO is set then we may > + * override the CPSR.I masking when in non-secure state. The SCR.IRQ > + * setting has already been taken into consideration when setting the > + * target EL, so it does not have a further affect here. > + */ > +hcr = is_aa64 || (env->cp15.hcr_el2 & HCR_IMO); > +scr = false; > +pstate_unmasked = !(env->daif & PSTATE_I); > +break; > + > case EXCP_VFIQ: > if (secure || !(env->cp15.hcr_el2 & HCR_FMO)) { > /* VFIQs are only taken when hypervized and non-secure. */ > @@ -1283,6 +1307,20 @@ static inline bool arm_excp_unmasked(CPUState *cs, > unsigned int excp_idx) > default: > g_assert_not_reached(); > } > + > +/* Use the target EL, current execution state and SCR/HCR settings to > + * determine whether the corresponding CPSR bit is used to mask the > + * interrupt. > + */ > +if ((target_el > cur_el) && (target_el != 1) && (scr || hcr) && > +(is_aa64 || !secure)) { > +unmasked = 1; > +} I think this logic is correct but I find it a little confusing. In particular it is not obvious that most of this doesn't apply for 64-bit EL3, because there is an "is_aa64 || " hidden in the settings of scr and hcr, as well as in the check on !secure. I would suggest pulling that out so: if ((target_el > cur_el) && (target_el != 1) { /* With 64-bit EL3 the logic is simple: w
Re: [Qemu-devel] [PATCH v10 06/26] target-arm: add secure state bit to CPREG hash
On 6 November 2014 15:50, Greg Bellows wrote: > Added additional NS-bit to CPREG hash encoding. Updated hash lookup > locations to specify hash bit currently set to non-secure. > > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
[Qemu-devel] [Bug 1393440] [NEW] pcie.c:148: possible error in OR expression ?
Public bug reported: [qemu/hw/pci/pcie.c:148] -> [qemu/hw/pci/pcie.c:148]: (style) Same expression on both sides of '|'. pci_long_test_and_set_mask(dev->w1cmask + pos + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_URD | PCI_EXP_DEVSTA_URD); I am guessing that something like pci_long_test_and_set_mask(dev->w1cmask + pos + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD); was intended. I used static analyser cppcheck to find this bug. ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1393440 Title: pcie.c:148: possible error in OR expression ? Status in QEMU: New Bug description: [qemu/hw/pci/pcie.c:148] -> [qemu/hw/pci/pcie.c:148]: (style) Same expression on both sides of '|'. pci_long_test_and_set_mask(dev->w1cmask + pos + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_URD | PCI_EXP_DEVSTA_URD); I am guessing that something like pci_long_test_and_set_mask(dev->w1cmask + pos + PCI_EXP_DEVSTA, PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED | PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD); was intended. I used static analyser cppcheck to find this bug. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1393440/+subscriptions
Re: [Qemu-devel] [PATCH v10 07/26] target-arm: insert AArch32 cpregs twice into hashtable
On 6 November 2014 15:50, Greg Bellows wrote: > From: Fabian Aggeler > > Prepare for cp register banking by inserting every cp register twice, > once for secure world and once for non-secure world. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows > +if (state == ARM_CP_STATE_AA32) { > +/* Under AArch32 CP registers can be common > + * (same for secure and non-secure world) or banked. > + */ > + switch (r->secure) { > + case ARM_CP_SECSTATE_S: > + case ARM_CP_SECSTATE_NS: > +add_cpreg_to_hashtable(cpu, r, opaque, state, > + r->secure, crm, opc1, > opc2); > +break; Looks like you might have some 3-space indent going on here? Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v10 14/26] target-arm: respect SCR.FW, SCR.AW and SCTLR.NMFI
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > Add checks of SCR AW/FW bits when performing writes of CPSR. These SCR bits > are used to control whether the CPSR masking bits can be adjusted from > non-secure state. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows > > --- > > v8 -> v9 > - Move cpsr_write mask filtering above mode switch. > - Replace conditional checks removed in v8. > > v7 -> v8 > - Fixed incorrect use of env->uncached_cpsr A/I/F to use env->daif instead. > - Removed incorrect statement about SPSR to CPSR copies being affected by > SCR.AW/FW. > - Fix typo in comment. > - Simpified cpsr_write logic > > v3 -> v4 > - Fixed up conditions for ignoring CPSR.A/F updates by isolating to v7 and > checking for the existence of EL3 and non-existence of EL2. > --- > target-arm/helper.c | 59 > +++-- > 1 file changed, 57 insertions(+), 2 deletions(-) > > diff --git a/target-arm/helper.c b/target-arm/helper.c > index 948192b..9186fc7 100644 > --- a/target-arm/helper.c > +++ b/target-arm/helper.c > @@ -3644,6 +3644,8 @@ uint32_t cpsr_read(CPUARMState *env) > > void cpsr_write(CPUARMState *env, uint32_t val, uint32_t mask) > { > +uint32_t changed_daif; > + > if (mask & CPSR_NZCV) { > env->ZF = (~val) & CPSR_Z; > env->NF = val; > @@ -3666,8 +3668,57 @@ void cpsr_write(CPUARMState *env, uint32_t val, > uint32_t mask) > env->GE = (val >> 16) & 0xf; > } > > -env->daif &= ~(CPSR_AIF & mask); > -env->daif |= val & CPSR_AIF & mask; > +/* In a V7 implementation that includes the security extensions but does > + * not include Virtualization Extensions the SCR.FW and SCR.AW bits > control > + * whether non-secure software is allowed to change the CPSR_F and CPSR_A > + * bits respectively. > + * > + * In a V8 implementation, it is permitted for privileged software to > + * change the CPSR A/F bits regardless of the SCR.AW/FW bits. > + */ > +if (!arm_feature(env, ARM_FEATURE_V8) && > +arm_feature(env, ARM_FEATURE_EL3) && > +!arm_feature(env, ARM_FEATURE_EL2) && > +!arm_is_secure(env)) { > + > +changed_daif = (env->daif ^ val) & mask; > + > +if (changed_daif & CPSR_A) { > +/* Check to see if we are allowed to change the masking of async > + * abort exceptions from a non-secure state. > + */ > +if (!(env->cp15.scr_el3 & SCR_AW)) { > +qemu_log_mask(LOG_GUEST_ERROR, > + "Ignoring attempt to switch CPSR_A flag from " > + "non-secure world with SCR.AW bit clear\n"); > +mask &= ~CPSR_A; > +} > +} > + > +if (changed_daif & CPSR_F) { > +/* Check to see if we are allowed to change the masking of FIQ > + * exceptions from a non-secure state. > + */ > +if (!(env->cp15.scr_el3 & SCR_FW)) { > +qemu_log_mask(LOG_GUEST_ERROR, > + "Ignoring attempt to switch CPSR_F flag from " > + "non-secure world with SCR.FW bit clear\n"); > +mask &= ~CPSR_F; > +} > + > +/* Check whether non-maskable FIQ (NMFI) support is enabled. > + * If this bit is set software is not allowed to mask > + * FIQs, but is allowed to set CPSR_F to 0. > + */ > +if ((A32_BANKED_CURRENT_REG_GET(env, sctlr) & SCTLR_NMFI) && > +(val & CPSR_F)) { > +qemu_log_mask(LOG_GUEST_ERROR, > + "Ignoring attempt to enable CPSR_F flag " > + "(non-maskable FIQ [NMFI] support enabled)\n"); > +mask &= ~CPSR_F; > +} > +} > +} > > if ((env->uncached_cpsr ^ val) & mask & CPSR_M) { > if (bad_mode_switch(env, val & CPSR_M)) { > @@ -3680,6 +3731,10 @@ void cpsr_write(CPUARMState *env, uint32_t val, > uint32_t mask) > switch_mode(env, val & CPSR_M); > } > } > + > +env->daif &= ~(CPSR_AIF & mask); > +env->daif |= val & CPSR_AIF & mask; > + Was moving this bit below the setting of the mode flags deliberate? I don't see any reason why it has to go here. If you move it back where it was then you can add my r-b tag. > mask &= ~CACHED_CPSR_BITS; > env->uncached_cpsr = (env->uncached_cpsr & ~mask) | (val & mask); > } thanks -- PMM
Re: [Qemu-devel] [PATCH v10 13/26] target-arm: add SCTLR_EL3 and make SCTLR banked
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > Implements SCTLR_EL3 and uses secure/non-secure instance when > needed. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v10 19/26] target-arm: make IFSR banked
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > When EL3 is running in AArch32 (or ARMv7 with Security Extensions) > IFSR has a secure and a non-secure instance. Adds IFSR32_EL2 definition and > storage. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v2 1/5] libqos: Change use of pointers to uint64_t in virtio
On Sat, Nov 01, 2014 at 06:02:26PM +0100, Marc Marí wrote: > Convert use of pointers in functions of virtio to uint64_t in order to make it > platform-independent. > > Add casting from pointers (in PCI functions) to uint64_t and vice versa > through > uintptr_t. > > Signed-off-by: Marc Marí > --- > tests/libqos/virtio-pci.c | 20 +++- > tests/libqos/virtio.c |8 > tests/libqos/virtio.h | 16 > tests/virtio-blk-test.c | 21 ++--- > 4 files changed, 37 insertions(+), 28 deletions(-) This makes sense. To fully abolish pointers the PCI code also needs to be converted. Do plan plan to do that? Stefan pgp2hyApzBs_0.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v2 1/5] libqos: Change use of pointers to uint64_t in virtio
El Mon, 17 Nov 2014 15:16:21 + Stefan Hajnoczi escribió: > On Sat, Nov 01, 2014 at 06:02:26PM +0100, Marc Marí wrote: > > Convert use of pointers in functions of virtio to uint64_t in order > > to make it platform-independent. > > > > Add casting from pointers (in PCI functions) to uint64_t and vice > > versa through uintptr_t. > > > > Signed-off-by: Marc Marí > > --- > > tests/libqos/virtio-pci.c | 20 +++- > > tests/libqos/virtio.c |8 > > tests/libqos/virtio.h | 16 > > tests/virtio-blk-test.c | 21 ++--- > > 4 files changed, 37 insertions(+), 28 deletions(-) > > This makes sense. To fully abolish pointers the PCI code also needs > to be converted. Do plan plan to do that? > > Stefan Not planned, but if asked, I may do it. Marc
Re: [Qemu-devel] [PATCH v10 18/26] target-arm: make DACR banked
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > When EL3 is running in AArch32 (or ARMv7 with Security Extensions) > DACR has a secure and a non-secure instance. Adds definition for DACR32_EL2. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v2 2/5] tests: Prepare virtio-blk-test for multi-arch implementation
On Sat, Nov 01, 2014 at 06:02:27PM +0100, Marc Marí wrote: > Modularize functions in virtio-blk-test and add PCI suffix for PCI specific > components. > > Signed-off-by: Marc Marí > --- > tests/virtio-blk-test.c | 57 > +++ > 1 file changed, 38 insertions(+), 19 deletions(-) > > diff --git a/tests/virtio-blk-test.c b/tests/virtio-blk-test.c > index 33e8094..6f07d5a 100644 > --- a/tests/virtio-blk-test.c > +++ b/tests/virtio-blk-test.c > @@ -55,11 +55,11 @@ typedef struct QVirtioBlkReq { > uint8_t status; > } QVirtioBlkReq; > > -static QPCIBus *test_start(void) > +static char *drive_create(void) > { > -char *cmdline; > -char tmp_path[] = "/tmp/qtest.XX"; > int fd, ret; > +char *tmp_path = malloc(18); > +strncpy(tmp_path, "/tmp/qtest.XX", 18); Please use g_strdup(): https://developer.gnome.org/glib/stable/glib-String-Utility-Functions.html#g-strdup QEMU does not use malloc(3)/free(3) directly. Always use glib memory allocation (in this case g_strdup() allocates memory for you). > > /* Create a temporary raw image */ > fd = mkstemp(tmp_path); > @@ -68,6 +68,16 @@ static QPCIBus *test_start(void) > g_assert_cmpint(ret, ==, 0); > close(fd); > > +return tmp_path; > +} > + > +static QPCIBus *pci_test_start(void) > +{ > +char *cmdline; > +char *tmp_path; > + > +tmp_path = drive_create(); It seems tmp_path is leaked. You must free dynamically allocated memory! pgpnRFgXfwmzk.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v10 16/26] target-arm: make TTBR0/1 banked
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > Adds secure and non-secure bank register suport for TTBR0 and TTBR1. > Changes include adding secure and non-secure instances of ttbr0 and ttbr1 as > well as a CP register definition for TTBR0_EL3. Added a union containing > both EL based array fields and secure and non-secure fields mapped to them. > Updated accesses to use A32_BANKED_CURRENT_REG_GET macro. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v10 24/26] target-arm: make c13 cp regs banked (FCSEIDR, ...)
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > When EL3 is running in AArch32 (or ARMv7 with Security Extensions) > FCSEIDR, CONTEXTIDR, TPIDRURW, TPIDRURO and TPIDRPRW have a secure > and a non-secure instance. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v2 1/5] libqos: Change use of pointers to uint64_t in virtio
Am 17.11.2014 um 16:19 schrieb Marc Marí: > El Mon, 17 Nov 2014 15:16:21 + > Stefan Hajnoczi escribió: >> On Sat, Nov 01, 2014 at 06:02:26PM +0100, Marc Marí wrote: >>> Convert use of pointers in functions of virtio to uint64_t in order >>> to make it platform-independent. >>> >>> Add casting from pointers (in PCI functions) to uint64_t and vice >>> versa through uintptr_t. >>> >>> Signed-off-by: Marc Marí >>> --- >>> tests/libqos/virtio-pci.c | 20 +++- >>> tests/libqos/virtio.c |8 >>> tests/libqos/virtio.h | 16 >>> tests/virtio-blk-test.c | 21 ++--- >>> 4 files changed, 37 insertions(+), 28 deletions(-) >> >> This makes sense. To fully abolish pointers the PCI code also needs >> to be converted. Do plan plan to do that? >> >> Stefan > > Not planned, but if asked, I may do it. Pretty please. :) I was wondering the same thing when reading the patch. Regards, Andreas -- SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 21284 AG Nürnberg
Re: [Qemu-devel] [PATCH v2 4/5] libqos: Add malloc generic
On Sat, Nov 01, 2014 at 06:02:29PM +0100, Marc Marí wrote: > This malloc is a basic interface implementation that works for any platform. > It should be replaced in the future for a real malloc implementation for each > of the platforms. > > Signed-off-by: Marc Marí > --- > tests/libqos/malloc-generic.c | 50 > + > tests/libqos/malloc-generic.h | 21 + > 2 files changed, 71 insertions(+) > create mode 100644 tests/libqos/malloc-generic.c > create mode 100644 tests/libqos/malloc-generic.h > > diff --git a/tests/libqos/malloc-generic.c b/tests/libqos/malloc-generic.c > new file mode 100644 > index 000..0049424 > --- /dev/null > +++ b/tests/libqos/malloc-generic.c > @@ -0,0 +1,50 @@ > +/* > + * Basic libqos generic malloc support > + * > + * Copyright (c) 2014 Marc Marí > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + */ > + > +#include "libqos/malloc-generic.h" > +#include "libqos/malloc.h" > +#include Please always include system headers (<>) before user headers (""). This ensures that QEMU's headers do not pollute the namespace or affect system headers in any way. pgp0BXpxeo7kI.pgp Description: PGP signature
[Qemu-devel] [PATCH 3/5] block: Add blk_add_close_notifier() for BB
Adding something like a "delete notifier" to a BlockBackend would not make much sense, because whoever is interested in registering there will probably hold a reference to that BlockBackend; therefore, the notifier will never be called (or only when the notifiee already relinquished its reference and thus most probably is no longer interested in that notification). Therefore, this patch just passes through the close notifier interface of the root BDS. This will be called when the device is ejected, for instance, and therefore does make sense. Signed-off-by: Max Reitz --- block/block-backend.c | 5 + include/sysemu/block-backend.h | 1 + 2 files changed, 6 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index 7a7f690..ef16d73 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -642,6 +642,11 @@ void blk_remove_aio_context_notifier(BlockBackend *blk, detach_aio_context, opaque); } +void blk_add_close_notifier(BlockBackend *blk, Notifier *notify) +{ +bdrv_add_close_notifier(blk->bs, notify); +} + void blk_io_plug(BlockBackend *blk) { bdrv_io_plug(blk->bs); diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index d9c1337..8871a02 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -143,6 +143,7 @@ void blk_remove_aio_context_notifier(BlockBackend *blk, void *), void (*detach_aio_context)(void *), void *opaque); +void blk_add_close_notifier(BlockBackend *blk, Notifier *notify); void blk_io_plug(BlockBackend *blk); void blk_io_unplug(BlockBackend *blk); BlockAcctStats *blk_get_stats(BlockBackend *blk); -- 1.9.3
[Qemu-devel] [PATCH 1/5] block: Lift more functions into BlockBackend
There are already some blk_aio_* functions, so we might as well have blk_co_* functions (as far as we need them). This patch adds blk_co_flush(), blk_co_discard(), and also blk_invalidate_cache() (which is not a blk_co_* function but is needed nonetheless). Signed-off-by: Max Reitz --- block/block-backend.c | 15 +++ include/sysemu/block-backend.h | 3 +++ 2 files changed, 18 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index d0692b1..89f69b7 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -497,6 +497,16 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf, return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque); } +int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors) +{ +return bdrv_co_discard(blk->bs, sector_num, nb_sectors); +} + +int blk_co_flush(BlockBackend *blk) +{ +return bdrv_co_flush(blk->bs); +} + int blk_flush(BlockBackend *blk) { return bdrv_flush(blk->bs); @@ -549,6 +559,11 @@ void blk_set_enable_write_cache(BlockBackend *blk, bool wce) bdrv_set_enable_write_cache(blk->bs, wce); } +void blk_invalidate_cache(BlockBackend *blk, Error **errp) +{ +bdrv_invalidate_cache(blk->bs, errp); +} + int blk_is_inserted(BlockBackend *blk) { return bdrv_is_inserted(blk->bs); diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 52d13c1..0c46b82 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -108,6 +108,8 @@ int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs); int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf); BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf, BlockCompletionFunc *cb, void *opaque); +int blk_co_discard(BlockBackend *blk, int64_t sector_num, int nb_sectors); +int blk_co_flush(BlockBackend *blk); int blk_flush(BlockBackend *blk); int blk_flush_all(void); void blk_drain_all(void); @@ -120,6 +122,7 @@ int blk_is_read_only(BlockBackend *blk); int blk_is_sg(BlockBackend *blk); int blk_enable_write_cache(BlockBackend *blk); void blk_set_enable_write_cache(BlockBackend *blk, bool wce); +void blk_invalidate_cache(BlockBackend *blk, Error **errp); int blk_is_inserted(BlockBackend *blk); void blk_lock_medium(BlockBackend *blk, bool locked); void blk_eject(BlockBackend *blk, bool eject_flag); -- 1.9.3
[Qemu-devel] [PATCH 2/5] block: Add AioContextNotifier functions to BB
Because all BlockDriverStates behind a single BlockBackend reside in a single AioContext, it is fine to just pass these functions (blk_add_aio_context_notifier() and blk_remove_aio_context_notifier()) through to the root BlockDriverState. Signed-off-by: Max Reitz --- block/block-backend.c | 18 ++ include/sysemu/block-backend.h | 8 2 files changed, 26 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index 89f69b7..7a7f690 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -624,6 +624,24 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context) bdrv_set_aio_context(blk->bs, new_context); } +void blk_add_aio_context_notifier(BlockBackend *blk, +void (*attached_aio_context)(AioContext *new_context, void *opaque), +void (*detach_aio_context)(void *opaque), void *opaque) +{ +bdrv_add_aio_context_notifier(blk->bs, attached_aio_context, + detach_aio_context, opaque); +} + +void blk_remove_aio_context_notifier(BlockBackend *blk, + void (*attached_aio_context)(AioContext *, + void *), + void (*detach_aio_context)(void *), + void *opaque) +{ +bdrv_remove_aio_context_notifier(blk->bs, attached_aio_context, + detach_aio_context, opaque); +} + void blk_io_plug(BlockBackend *blk) { bdrv_io_plug(blk->bs); diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h index 0c46b82..d9c1337 100644 --- a/include/sysemu/block-backend.h +++ b/include/sysemu/block-backend.h @@ -135,6 +135,14 @@ void blk_op_block_all(BlockBackend *blk, Error *reason); void blk_op_unblock_all(BlockBackend *blk, Error *reason); AioContext *blk_get_aio_context(BlockBackend *blk); void blk_set_aio_context(BlockBackend *blk, AioContext *new_context); +void blk_add_aio_context_notifier(BlockBackend *blk, +void (*attached_aio_context)(AioContext *new_context, void *opaque), +void (*detach_aio_context)(void *opaque), void *opaque); +void blk_remove_aio_context_notifier(BlockBackend *blk, + void (*attached_aio_context)(AioContext *, + void *), + void (*detach_aio_context)(void *), + void *opaque); void blk_io_plug(BlockBackend *blk); void blk_io_unplug(BlockBackend *blk); BlockAcctStats *blk_get_stats(BlockBackend *blk); -- 1.9.3
[Qemu-devel] [PATCH 5/5] nbd: Use BlockBackend internally
With all externally visible functions changed to use BlockBackend, this patch makes nbd use BlockBackend for everything internally as well. While touching them, substitute 512 by BDRV_SECTOR_SIZE in the calls to blk_read(), blk_write() and blk_co_discard(). Signed-off-by: Max Reitz --- nbd.c | 56 1 file changed, 28 insertions(+), 28 deletions(-) diff --git a/nbd.c b/nbd.c index 3fd5743..53cf82b 100644 --- a/nbd.c +++ b/nbd.c @@ -17,8 +17,6 @@ */ #include "block/nbd.h" -#include "block/block.h" -#include "block/block_int.h" #include "sysemu/block-backend.h" #include "block/coroutine.h" @@ -102,7 +100,7 @@ struct NBDExport { int refcount; void (*close)(NBDExport *exp); -BlockDriverState *bs; +BlockBackend *blk; char *name; off_t dev_offset; off_t size; @@ -930,7 +928,7 @@ static void nbd_request_put(NBDRequest *req) nbd_client_put(client); } -static void bs_aio_attached(AioContext *ctx, void *opaque) +static void blk_aio_attached(AioContext *ctx, void *opaque) { NBDExport *exp = opaque; NBDClient *client; @@ -944,7 +942,7 @@ static void bs_aio_attached(AioContext *ctx, void *opaque) } } -static void bs_aio_detach(void *opaque) +static void blk_aio_detach(void *opaque) { NBDExport *exp = opaque; NBDClient *client; @@ -961,24 +959,23 @@ static void bs_aio_detach(void *opaque) NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size, uint32_t nbdflags, void (*close)(NBDExport *)) { -BlockDriverState *bs = blk_bs(blk); NBDExport *exp = g_malloc0(sizeof(NBDExport)); exp->refcount = 1; QTAILQ_INIT(&exp->clients); -exp->bs = bs; +exp->blk = blk; exp->dev_offset = dev_offset; exp->nbdflags = nbdflags; -exp->size = size == -1 ? bdrv_getlength(bs) : size; +exp->size = size == -1 ? blk_getlength(blk) : size; exp->close = close; -exp->ctx = bdrv_get_aio_context(bs); -bdrv_ref(bs); -bdrv_add_aio_context_notifier(bs, bs_aio_attached, bs_aio_detach, exp); +exp->ctx = blk_get_aio_context(blk); +blk_ref(blk); +blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp); /* * NBD exports are used for non-shared storage migration. Make sure * that BDRV_O_INCOMING is cleared and the image is ready for write * access since the export could be available before migration handover. */ -bdrv_invalidate_cache(bs, NULL); +blk_invalidate_cache(blk, NULL); return exp; } @@ -1025,11 +1022,11 @@ void nbd_export_close(NBDExport *exp) } nbd_export_set_name(exp, NULL); nbd_export_put(exp); -if (exp->bs) { -bdrv_remove_aio_context_notifier(exp->bs, bs_aio_attached, - bs_aio_detach, exp); -bdrv_unref(exp->bs); -exp->bs = NULL; +if (exp->blk) { +blk_remove_aio_context_notifier(exp->blk, blk_aio_attached, +blk_aio_detach, exp); +blk_unref(exp->blk); +exp->blk = NULL; } } @@ -1059,7 +1056,7 @@ void nbd_export_put(NBDExport *exp) BlockBackend *nbd_export_get_blockdev(NBDExport *exp) { -return exp->bs->blk; +return exp->blk; } void nbd_export_close_all(void) @@ -1138,7 +1135,7 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque command = request->type & NBD_CMD_MASK_COMMAND; if (command == NBD_CMD_READ || command == NBD_CMD_WRITE) { -req->data = qemu_blockalign(client->exp->bs, request->len); +req->data = blk_blockalign(client->exp->blk, request->len); } if (command == NBD_CMD_WRITE) { TRACE("Reading %u byte(s)", request->len); @@ -1204,7 +1201,7 @@ static void nbd_trip(void *opaque) TRACE("Request type is READ"); if (request.type & NBD_CMD_FLAG_FUA) { -ret = bdrv_co_flush(exp->bs); +ret = blk_co_flush(exp->blk); if (ret < 0) { LOG("flush failed"); reply.error = -ret; @@ -1212,8 +1209,9 @@ static void nbd_trip(void *opaque) } } -ret = bdrv_read(exp->bs, (request.from + exp->dev_offset) / 512, -req->data, request.len / 512); +ret = blk_read(exp->blk, + (request.from + exp->dev_offset) / BDRV_SECTOR_SIZE, + req->data, request.len / BDRV_SECTOR_SIZE); if (ret < 0) { LOG("reading from file failed"); reply.error = -ret; @@ -1235,8 +1233,9 @@ static void nbd_trip(void *opaque) TRACE("Writing to device"); -ret = bdrv_write(exp->bs, (request.from + exp->dev_offset) / 512, - req->data, request.len / 512); +ret = blk_write(exp->blk, +(request.from + exp->dev_offset) / BDRV_SECTOR_SIZE,
[Qemu-devel] [PATCH 0/5] nbd: Use BlockBackend
>From the block layer's perspective, the nbd server is pretty similar to a guest device. Therefore, it should use BlockBackend to access block devices, just like any other guest device does. This series consequently makes the nbd server use BlockBackend for referencing block devices. Max Reitz (5): block: Lift more functions into BlockBackend block: Add AioContextNotifier functions to BB block: Add blk_add_close_notifier() for BB nbd: Change external interface to BlockBackend nbd: Use BlockBackend internally block/block-backend.c | 38 + blockdev-nbd.c | 15 +- include/block/nbd.h| 7 ++--- include/sysemu/block-backend.h | 12 nbd.c | 63 +- qemu-nbd.c | 2 +- 6 files changed, 94 insertions(+), 43 deletions(-) -- 1.9.3
[Qemu-devel] [PATCH 4/5] nbd: Change external interface to BlockBackend
Substitute BlockDriverState by BlockBackend in every globally visible function provided by nbd. Signed-off-by: Max Reitz --- blockdev-nbd.c | 15 --- include/block/nbd.h | 7 +++ nbd.c | 11 ++- qemu-nbd.c | 2 +- 4 files changed, 18 insertions(+), 17 deletions(-) diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 06f901e..22e95d1 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -10,6 +10,7 @@ */ #include "sysemu/blockdev.h" +#include "sysemu/block-backend.h" #include "hw/block/block.h" #include "monitor/monitor.h" #include "qapi/qmp/qerror.h" @@ -73,7 +74,7 @@ static void nbd_close_notifier(Notifier *n, void *data) void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, Error **errp) { -BlockDriverState *bs; +BlockBackend *blk; NBDExport *exp; NBDCloseNotifier *n; @@ -87,12 +88,12 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, return; } -bs = bdrv_find(device); -if (!bs) { +blk = blk_by_name(device); +if (!blk) { error_set(errp, QERR_DEVICE_NOT_FOUND, device); return; } -if (!bdrv_is_inserted(bs)) { +if (!blk_is_inserted(blk)) { error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); return; } @@ -100,18 +101,18 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, if (!has_writable) { writable = false; } -if (bdrv_is_read_only(bs)) { +if (blk_is_read_only(blk)) { writable = false; } -exp = nbd_export_new(bs, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL); +exp = nbd_export_new(blk, 0, -1, writable ? 0 : NBD_FLAG_READ_ONLY, NULL); nbd_export_set_name(exp, device); n = g_new0(NBDCloseNotifier, 1); n->n.notify = nbd_close_notifier; n->exp = exp; -bdrv_add_close_notifier(bs, &n->n); +blk_add_close_notifier(blk, &n->n); QTAILQ_INSERT_TAIL(&close_notifiers, n, next); } diff --git a/include/block/nbd.h b/include/block/nbd.h index 9e835d2..348302c 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -85,14 +85,13 @@ int nbd_disconnect(int fd); typedef struct NBDExport NBDExport; typedef struct NBDClient NBDClient; -NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, - off_t size, uint32_t nbdflags, - void (*close)(NBDExport *)); +NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size, + uint32_t nbdflags, void (*close)(NBDExport *)); void nbd_export_close(NBDExport *exp); void nbd_export_get(NBDExport *exp); void nbd_export_put(NBDExport *exp); -BlockDriverState *nbd_export_get_blockdev(NBDExport *exp); +BlockBackend *nbd_export_get_blockdev(NBDExport *exp); NBDExport *nbd_export_find(const char *name); void nbd_export_set_name(NBDExport *exp, const char *name); diff --git a/nbd.c b/nbd.c index a7bce45..3fd5743 100644 --- a/nbd.c +++ b/nbd.c @@ -19,6 +19,7 @@ #include "block/nbd.h" #include "block/block.h" #include "block/block_int.h" +#include "sysemu/block-backend.h" #include "block/coroutine.h" @@ -957,10 +958,10 @@ static void bs_aio_detach(void *opaque) exp->ctx = NULL; } -NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, - off_t size, uint32_t nbdflags, - void (*close)(NBDExport *)) +NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size, + uint32_t nbdflags, void (*close)(NBDExport *)) { +BlockDriverState *bs = blk_bs(blk); NBDExport *exp = g_malloc0(sizeof(NBDExport)); exp->refcount = 1; QTAILQ_INIT(&exp->clients); @@ -1056,9 +1057,9 @@ void nbd_export_put(NBDExport *exp) } } -BlockDriverState *nbd_export_get_blockdev(NBDExport *exp) +BlockBackend *nbd_export_get_blockdev(NBDExport *exp) { -return exp->bs; +return exp->bs->blk; } void nbd_export_close_all(void) diff --git a/qemu-nbd.c b/qemu-nbd.c index 5cd6c6d..60ce50f 100644 --- a/qemu-nbd.c +++ b/qemu-nbd.c @@ -730,7 +730,7 @@ int main(int argc, char **argv) } } -exp = nbd_export_new(bs, dev_offset, fd_size, nbdflags, nbd_export_closed); +exp = nbd_export_new(blk, dev_offset, fd_size, nbdflags, nbd_export_closed); if (sockpath) { fd = unix_socket_incoming(sockpath); -- 1.9.3
Re: [Qemu-devel] [PATCH v10 17/26] target-arm: make TTBCR banked
On 6 November 2014 15:51, Greg Bellows wrote: > From: Fabian Aggeler > > Adds secure and non-secure bank register suport for TTBCR. > Added new struct to compartmentalize the TCR data and masks. Removed old > tcr/ttbcr data and added a 4 element array of the new structs in cp15. This > allows for one entry per EL. Added a CP register definition for TCR_EL3. > > Signed-off-by: Fabian Aggeler > Signed-off-by: Greg Bellows > +static void *raw_ptr(CPUARMState *env, const ARMCPRegInfo *ri) > +{ > +return (char *)(env) + (ri)->fieldoffset; > +} You don't need the brackets round "ri" here, since this isn't a macro. > +/* Get the TCR bank based on our security state*/ Missing space before "*/". Otherwise Reviewed-by: Peter Maydell thanks -- PMM
Re: [Qemu-devel] [PATCH v2 5/5] libqos: Add virtio MMIO support
On Sat, Nov 01, 2014 at 06:02:30PM +0100, Marc Marí wrote: > Add virtio MMIO support. > Add virtio-blk-test MMIO test case. > > Signed-off-by: Marc Marí > --- > tests/Makefile |4 +- > tests/libqos/virtio-mmio.c | 190 > > tests/libqos/virtio-mmio.h | 46 +++ > tests/virtio-blk-test.c| 143 +++-- > 4 files changed, 375 insertions(+), 8 deletions(-) > create mode 100644 tests/libqos/virtio-mmio.c > create mode 100644 tests/libqos/virtio-mmio.h > > diff --git a/tests/Makefile b/tests/Makefile > index 4ae0ca4..f193b03 100644 > --- a/tests/Makefile > +++ b/tests/Makefile > @@ -183,6 +183,8 @@ gcov-files-sparc-y += hw/timer/m48t59.c > gcov-files-sparc64-y += hw/timer/m48t59.c > check-qtest-arm-y = tests/tmp105-test$(EXESUF) > gcov-files-arm-y += hw/misc/tmp105.c > +check-qtest-arm-y += tests/virtio-blk-test$(EXESUF) > +gcov-files-arm-y += arm-softmmu/hw/block/virtio-blk.c > check-qtest-ppc-y += tests/boot-order-test$(EXESUF) > check-qtest-ppc64-y += tests/boot-order-test$(EXESUF) > check-qtest-ppc64-y += tests/spapr-phb-test$(EXESUF) > @@ -300,8 +302,8 @@ libqos-obj-y += tests/libqos/i2c.o > libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o > libqos-pc-obj-y += tests/libqos/malloc-pc.o > libqos-omap-obj-y = $(libqos-obj-y) tests/libqos/i2c-omap.o > -libqos-virtio-obj-y = $(libqos-obj-y) $(libqos-pc-obj-y) > tests/libqos/virtio.o tests/libqos/virtio-pci.o > libqos-usb-obj-y = $(libqos-pc-obj-y) tests/libqos/usb.o > +libqos-virtio-obj-y = $(libqos-pc-obj-y) tests/libqos/virtio.o > tests/libqos/virtio-pci.o tests/libqos/virtio-mmio.o > tests/libqos/malloc-generic.o > > tests/rtc-test$(EXESUF): tests/rtc-test.o > tests/m48t59-test$(EXESUF): tests/m48t59-test.o > diff --git a/tests/libqos/virtio-mmio.c b/tests/libqos/virtio-mmio.c > new file mode 100644 > index 000..6896f7b > --- /dev/null > +++ b/tests/libqos/virtio-mmio.c > @@ -0,0 +1,190 @@ > +/* > + * libqos virtio MMIO driver > + * > + * Copyright (c) 2014 Marc Marí > + * > + * This work is licensed under the terms of the GNU GPL, version 2 or later. > + * See the COPYING file in the top-level directory. > + */ > + > +#include > +#include "libqtest.h" > +#include "libqos/virtio.h" > +#include "libqos/virtio-mmio.h" > +#include "libqos/malloc.h" > +#include "libqos/malloc-generic.h" > +#include Please include system headers before user headers. > +QVirtioMMIODevice *qvirtio_mmio_init_device(uint64_t addr, uint32_t > page_size) > +{ > +QVirtioMMIODevice *dev; > +uint32_t magic; > +dev = g_malloc0(sizeof(*dev)); > + > +magic = readl(addr + QVIRTIO_MMIO_MAGIC_VALUE); > +g_assert(magic == ('v' | 'i' << 8 | 'r' << 16 | 't' << 24)); The magic value is little-endian according to the virtio 1.0 specification and the Linux kernel's virtio_mmio implementation. QEMU's virtio-mmio is native endian (i.e. host endian) and I think this is a bug. qtest's readl() is host-endian so it matches what QEMU does. I suppose you can leave this for now but it may need to be changed in the future. > +static void mmio_basic(void) > +{ > +QVirtioMMIODevice *dev; > +QVirtQueue *vq; > +QGuestAllocator *alloc; > +QVirtioBlkReq req; > +int n_size = TEST_IMAGE_SIZE / 2; > +uint64_t req_addr; > +uint64_t capacity; > +uint32_t features; > +uint32_t free_head; > +uint8_t status; > +char *data; > + > +arm_test_start(); > + > +dev = qvirtio_mmio_init_device(MMIO_DEV_BASE_ADDR, MMIO_PAGE_SIZE); > +g_assert(dev != NULL); > +g_assert_cmphex(dev->vdev.device_type, ==, QVIRTIO_BLK_DEVICE_ID); > + > +qvirtio_reset(&qvirtio_mmio, &dev->vdev); > +qvirtio_set_acknowledge(&qvirtio_mmio, &dev->vdev); > +qvirtio_set_driver(&qvirtio_mmio, &dev->vdev); > + > +capacity = qvirtio_config_readq(&qvirtio_mmio, &dev->vdev, > + > QVIRTIO_MMIO_DEVICE_SPECIFIC); > +g_assert_cmpint(capacity, ==, TEST_IMAGE_SIZE / 512); > + > +features = qvirtio_get_features(&qvirtio_mmio, &dev->vdev); > +features = features & ~(QVIRTIO_F_RING_INDIRECT_DESC | > +QVIRTIO_F_RING_EVENT_IDX | > QVIRTIO_BLK_F_SCSI); > +qvirtio_set_features(&qvirtio_mmio, &dev->vdev, features); > + > +alloc = generic_alloc_init(MMIO_RAM_ADDR, MMIO_RAM_SIZE, MMIO_PAGE_SIZE); > +vq = qvirtqueue_setup(&qvirtio_mmio, &dev->vdev, alloc, 0); > + > +qvirtio_set_driver_ok(&qvirtio_mmio, &dev->vdev); > + > +qmp("{ 'execute': 'block_resize', 'arguments': { 'device': 'drive0', " > +" 'size': %d } }", > n_size); > + > +qvirtio_wait_queue_isr(&qvirtio_mmio, &dev->vdev, vq, > + QVIRTIO_BLK_TIMEOUT_US); > + > +capacity = qvirtio_config_readq(&qvirtio_mmio, &dev->vdev, > + > QVIRTIO_MMIO_DEVICE_S
[Qemu-devel] [PATCH] spice: remove spice-experimental.h include
Nothing seems to be using functions from spice-experimental.h (better that way). Let's remove its inclusion. Signed-off-by: Marc-André Lureau --- spice-qemu-char.c | 1 - ui/spice-core.c | 1 - 2 files changed, 2 deletions(-) diff --git a/spice-qemu-char.c b/spice-qemu-char.c index 8106e06..7e0d300 100644 --- a/spice-qemu-char.c +++ b/spice-qemu-char.c @@ -3,7 +3,6 @@ #include "ui/qemu-spice.h" #include "sysemu/char.h" #include -#include #include #include "qemu/osdep.h" diff --git a/ui/spice-core.c b/ui/spice-core.c index 6467fa4..2eae2bd 100644 --- a/ui/spice-core.c +++ b/ui/spice-core.c @@ -16,7 +16,6 @@ */ #include -#include #include #include "sysemu/sysemu.h" -- 2.1.0
Re: [Qemu-devel] [PATCH v4 2/3] iotests: _filter_qmp for pretty JSON output
On 11/17/2014 05:31 AM, Max Reitz wrote: > _filter_qmp should be able to correctly filter out the QMP version > object for pretty JSON output. > > Signed-off-by: Max Reitz > --- > tests/qemu-iotests/common.filter | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/tests/qemu-iotests/common.filter > b/tests/qemu-iotests/common.filter > index 3acdb30..dfb9726 100644 > --- a/tests/qemu-iotests/common.filter > +++ b/tests/qemu-iotests/common.filter > @@ -167,7 +167,9 @@ _filter_qmp() > { > _filter_win32 | \ > sed -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \ > --e 's#^{"QMP":.*}$#QMP_VERSION#' > +-e 's#^{"QMP":.*}$#QMP_VERSION#' \ > +-e '/^"QMP": {\s*$/, /^}\s*$/ c\' \ \s is a GNU sed extension. But we don't really need to care about whitespace to the end of the line; I think that it is sufficient to just match the following regex: -e '/^"QMP": {/, /^}/ c\' \ > +-e 'QMP_VERSION' Either way, it's not the first time we've used a GNU sed extension, and since other series are now depending on this one, I can live with: Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v4 2/3] iotests: _filter_qmp for pretty JSON output
On 2014-11-17 at 17:04, Eric Blake wrote: On 11/17/2014 05:31 AM, Max Reitz wrote: _filter_qmp should be able to correctly filter out the QMP version object for pretty JSON output. Signed-off-by: Max Reitz --- tests/qemu-iotests/common.filter | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter index 3acdb30..dfb9726 100644 --- a/tests/qemu-iotests/common.filter +++ b/tests/qemu-iotests/common.filter @@ -167,7 +167,9 @@ _filter_qmp() { _filter_win32 | \ sed -e 's#\("\(micro\)\?seconds": \)[0-9]\+#\1 TIMESTAMP#g' \ --e 's#^{"QMP":.*}$#QMP_VERSION#' +-e 's#^{"QMP":.*}$#QMP_VERSION#' \ +-e '/^"QMP": {\s*$/, /^}\s*$/ c\' \ \s is a GNU sed extension. But we don't really need to care about whitespace to the end of the line; I think that it is sufficient to just match the following regex: -e '/^"QMP": {/, /^}/ c\' \ That doesn't match, however. QEMU's JSON formatter sometimes outputs whitespace at the end of line. +-e 'QMP_VERSION' Either way, it's not the first time we've used a GNU sed extension, and since other series are now depending on this one, I can live with: Ooooh, you mean version 3 was actually fine? (which used 'c\' without a line break) ;-) Reviewed-by: Eric Blake Thank you! Max
Re: [Qemu-devel] [PATCH] functional ARM semihosting under GDB
On 17 November 2014 14:32, Liviu Ionescu wrote: > On 17 Nov 2014, at 14:32, Peter Maydell wrote: > >> that it's combining six different features and bug fixes into a >> single commit. Could you separate them out into their own patches? > > sure. in practical terms, this requires separate branches and each be > applied to master, right? Not necessarily. You could have one branch with six commits on it, rebase that branch on master, and just send the patch or two at the "bottom" of the stack to the mailing list. > my experience with git is not as good as I would like (but improving), > and, considering all changes are now in a single branch, could you > suggest a simple way to do this? Personally I use "stgit" as my tool for patch-management, but plain git will also work fine here. You want to do an interactive rebase, and you can then use the "edit" action in the rebase for splitting the commit. This looks like a decent explanation of how to do it: http://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#Splitting-a-Commit Note that 'git add' has options for adding only some patch hunks to a commit (rather than the default of adding all the changes made to a file into the commit); see the man page's section on "interactive mode" and the "--patch" option. >>> A more generic option was added to specify the application file to be >>> emulated >>> >>>-image file-path >> >> I'm pretty wary about this one, because we already have several image >> loading options (-kernel, -bios) with complicated semantics that may >> not be the same on different target architectures. What does your >> "-image" option do that's different from the existing "-kernel" ? > > there are two issues here: > > - try to completely ignore your use case of qemu; you have a simple embedded > application, or a unit test, that you want to run under qemu, in a similar > way you run it on the physical board; you read the qemu manual and you find > out that you need to load a linux kernel; this makes absolutely no sense, the > small elf you want to run has absolutely nothing to do with any kernel. the > -kernel option is also accompanied by several other options, like -initrd, > etc that also make no sense for regular embedded applications. This is basically saying "the -kernel option is not a great name for that switch", right? If you pass it an ELF file it will do the right thing, it's just not obvious (and possibly not documented) that it will. I'm sympathetic to the desire to clean up our argument handling here, but we already have a confusing mess of options which have different semantics in different situations, and it's going to need real care to avoid just making the situation worse by adding yet another option to the mess. thanks -- PMM
Re: [Qemu-devel] [PATCH 2/4] exec: add wrapper for host pointer access
On Mon, Nov 17, 2014 at 12:59:44PM +, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (m...@redhat.com) wrote: > > On Mon, Nov 17, 2014 at 10:58:53AM +, Dr. David Alan Gilbert wrote: > > > * Michael S. Tsirkin (m...@redhat.com) wrote: > > > > host pointer accesses force pointer math, let's > > > > add a wrapper to make them safer. > > > > > > > > Signed-off-by: Michael S. Tsirkin > > > > --- > > > > include/exec/cpu-all.h | 5 + > > > > exec.c | 10 +- > > > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > > > > index c085804..9d8d408 100644 > > > > --- a/include/exec/cpu-all.h > > > > +++ b/include/exec/cpu-all.h > > > > @@ -313,6 +313,11 @@ typedef struct RAMBlock { > > > > int fd; > > > > } RAMBlock; > > > > > > > > +static inline void *ramblock_ptr(RAMBlock *block, ram_addr_t offset) > > > > +{ > > > > +return (char *)block->host + offset; > > > > +} > > > > > > I'm a bit surprised you don't need to pass a length to this to be able > > > to tell how much you can access. > > > > This is because at the moment all accesses only touch a single page. > > Said assumption seems to be made all over the code, and won't > > be easy to remove. > > > > > > typedef struct RAMList { > > > > QemuMutex mutex; > > > > /* Protected by the iothread lock. */ > > > > diff --git a/exec.c b/exec.c > > > > index ad5cf12..9648669 100644 > > > > --- a/exec.c > > > > +++ b/exec.c > > > > @@ -840,7 +840,7 @@ static void tlb_reset_dirty_range_all(ram_addr_t > > > > start, ram_addr_t length) > > > > > > > > block = qemu_get_ram_block(start); > > > > assert(block == qemu_get_ram_block(end - 1)); > > > > -start1 = (uintptr_t)block->host + (start - block->offset); > > > > +start1 = (uintptr_t)ramblock_ptr(block, start - block->offset); > > > > cpu_tlb_reset_dirty_all(start1, length); > > > > } > > > > > > > > @@ -1500,7 +1500,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t > > > > length) > > > > QTAILQ_FOREACH(block, &ram_list.blocks, next) { > > > > offset = addr - block->offset; > > > > if (offset < block->length) { > > > > -vaddr = block->host + offset; > > > > +vaddr = ramblock_ptr(block, offset); > > > > if (block->flags & RAM_PREALLOC) { > > > > ; > > > > } else if (xen_enabled()) { > > > > @@ -1551,7 +1551,7 @@ void *qemu_get_ram_block_host_ptr(ram_addr_t addr) > > > > { > > > > RAMBlock *block = qemu_get_ram_block(addr); > > > > > > > > -return block->host; > > > > +return ramblock_ptr(block, 0); > > > > } > > > > > > > > /* Return a host pointer to ram allocated with qemu_ram_alloc. > > > > @@ -1578,7 +1578,7 @@ void *qemu_get_ram_ptr(ram_addr_t addr) > > > > xen_map_cache(block->offset, block->length, 1); > > > > } > > > > } > > > > -return block->host + (addr - block->offset); > > > > +return ramblock_ptr(block, addr - block->offset); > > > > } > > > > > > which then makes me wonder if all the uses of this are safe near the > > > end of the block. > > > > > > > /* Return a host pointer to guest's ram. Similar to qemu_get_ram_ptr > > > > @@ -1597,7 +1597,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, > > > > hwaddr *size) > > > > if (addr - block->offset < block->length) { > > > > if (addr - block->offset + *size > block->length) > > > > *size = block->length - addr + block->offset; > > > > -return block->host + (addr - block->offset); > > > > +return ramblock_ptr(block, addr - block->offset); > > > > } > > > > > > but then this sounds like it's going to have partial duplication, it > > > already looks > > > like it's only going to succeed if it finds itself a block that the > > > access fits > > > in. > > > > > > > > > Dave > > > > Sorry, I don't really understand what you are saying here. > > qemu_ram_ptr_length already does some checks, so using ramblock_ptr is > duplicating > some of that; not a big issue. > > Dave Yep. Since the point is hardening, it's probably a good idea to keep it simple - and data path shouldn't use ram_addr_t. > > > > > > } > > > > > > > > -- > > > > MST > > > > > > > -- > > > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [Qemu-devel] [PATCH] Tracing: Fix simpletrace.py error on tcg enabled binary traces
On Sun, Nov 02, 2014 at 10:37:59PM +0100, christoph.seif...@posteo.de wrote: > From: Christoph Seifert > > simpletrace.py does not recognize the tcg option while reading trace-events > file. In result simpletrace does not work on binary traces and tcg enabled > events. Moved transformation of tcg enabled events to _read_events() which is > used by simpletrace. > > Signed-off-by: Christoph Seifert > --- > scripts/tracetool/__init__.py | 67 > +-- > 1 file changed, 33 insertions(+), 34 deletions(-) Looks good to me. Lluís: Any comments? > diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py > index 3d5743f..181675f 100644 > --- a/scripts/tracetool/__init__.py > +++ b/scripts/tracetool/__init__.py > @@ -253,14 +253,44 @@ class Event(object): > > > def _read_events(fobj): > -res = [] > +events = [] > for line in fobj: > if not line.strip(): > continue > if line.lstrip().startswith('#'): > continue > -res.append(Event.build(line)) > -return res > + > +event = Event.build(line) > + > +# transform TCG-enabled events > +if "tcg" not in event.properties: > +events.append(event) > +else: > +event_trans = event.copy() > +event_trans.name += "_trans" > +event_trans.properties += ["tcg-trans"] > +event_trans.fmt = event.fmt[0] > +args_trans = [] > +for atrans, aorig in zip( > + > event_trans.transform(tracetool.transform.TCG_2_HOST).args, > +event.args): > +if atrans == aorig: > +args_trans.append(atrans) > +event_trans.args = Arguments(args_trans) > +event_trans = event_trans.copy() > + > +event_exec = event.copy() > +event_exec.name += "_exec" > +event_exec.properties += ["tcg-exec"] > +event_exec.fmt = event.fmt[1] > +event_exec = event_exec.transform(tracetool.transform.TCG_2_HOST) > + > +new_event = [event_trans, event_exec] > +event.event_trans, event.event_exec = new_event > + > +events.extend(new_event) > + > +return events > > > class TracetoolError (Exception): > @@ -333,35 +363,4 @@ def generate(fevents, format, backends, > > events = _read_events(fevents) > > -# transform TCG-enabled events > -new_events = [] > -for event in events: > -if "tcg" not in event.properties: > -new_events.append(event) > -else: > -event_trans = event.copy() > -event_trans.name += "_trans" > -event_trans.properties += ["tcg-trans"] > -event_trans.fmt = event.fmt[0] > -args_trans = [] > -for atrans, aorig in zip( > - > event_trans.transform(tracetool.transform.TCG_2_HOST).args, > -event.args): > -if atrans == aorig: > -args_trans.append(atrans) > -event_trans.args = Arguments(args_trans) > -event_trans = event_trans.copy() > - > -event_exec = event.copy() > -event_exec.name += "_exec" > -event_exec.properties += ["tcg-exec"] > -event_exec.fmt = event.fmt[1] > -event_exec = event_exec.transform(tracetool.transform.TCG_2_HOST) > - > -new_event = [event_trans, event_exec] > -event.event_trans, event.event_exec = new_event > - > -new_events.extend(new_event) > -events = new_events > - > tracetool.format.generate(events, format, backend) > -- > 2.1.3 > > pgpVcJQLV0gqx.pgp Description: PGP signature
Re: [Qemu-devel] QEMU trunk now in hardfreeze
On Fri, Nov 07, 2014 at 08:42:57AM +0800, Gonglei wrote: > On 2014/11/7 1:26, Paolo Bonzini wrote: > > > On 06/11/2014 17:49, Stefan Hajnoczi wrote: > -Boot Devices Supporting dynamically modify boot order of > guest, and assuring taking effect after the guest rebooting. > >> "Please add this" isn't enough. What is the justification to add > >> it during hard freeze where only bug fixes are allowed? > > > > He meant to the changelog, it's one of his contributions to 2.2. :) > > > > Yeah. Thanks. ;) Sorry for the misunderstanding :). Please let me know if you need a wiki account and I'll email you login details. Stefan pgpi7RjedKwi3.pgp Description: PGP signature
Re: [Qemu-devel] [PATCH v3 for-2.2 2/3] raw-posix: SEEK_HOLE suffices, get rid of FIEMAP
On 11/17/2014 03:58 AM, Markus Armbruster wrote: > Max Reitz writes: > >> On 2014-11-17 at 11:18, Markus Armbruster wrote: >>> Commit 5500316 (May 2012) implemented raw_co_is_allocated() as >>> follows: >>> >>> Signed-off-by: Markus Armbruster >>> Reviewed-by: Max Reitz >>> Reviewed-by: Eric Blake >>> --- >>> block/raw-posix.c | 60 >>> --- >>> 1 file changed, 4 insertions(+), 56 deletions(-) >> >> I only just now realized that you're not getting rid of FIEMAP >> completely; there's still the skip_fiemap flag in BDRVRawState. I >> don't care for now, though, thus my R-b stands. > > You're right! The appended patch should be squashed in, either on > commit, or in a respin. My R-b stands whether or not you squash this in; best situation would be squashing this in during commit. > > > diff --git a/block/raw-posix.c b/block/raw-posix.c > index 0b5d5a8..414e6d1 100644 > --- a/block/raw-posix.c > +++ b/block/raw-posix.c > @@ -148,9 +148,6 @@ typedef struct BDRVRawState { > bool has_write_zeroes:1; > bool discard_zeroes:1; > bool needs_alignment; > -#ifdef CONFIG_FIEMAP > -bool skip_fiemap; > -#endif > } BDRVRawState; > > typedef struct BDRVRawReopenState { > > > -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v3 for-2.2 3/3] raw-posix: The SEEK_HOLE code is flawed, rewrite it
On 11/17/2014 03:18 AM, Markus Armbruster wrote: > On systems where SEEK_HOLE in a trailing hole seeks to EOF (Solaris, > but not Linux), try_seek_hole() reports trailing data instead. Maybe worth a comment that this is not fatal, but also not optimal. > > Additionally, unlikely lseek() failures are treated badly: > > * When SEEK_HOLE fails, try_seek_hole() reports trailing data. For > -ENXIO, there's in fact a trailing hole. Can happen only when > something truncated the file since we opened it. > > * When SEEK_HOLE succeeds, SEEK_DATA fails, and SEEK_END succeeds, > then try_seek_hole() reports a trailing hole. This is okay only > when SEEK_DATA failed with -ENXIO (which means the non-trailing hole > found by SEEK_HOLE has since become trailing somehow). For other > failures (unlikely), it's wrong. > > * When SEEK_HOLE succeeds, SEEK_DATA fails, SEEK_END fails (unlikely), > then try_seek_hole() reports bogus data [-1,start), which its caller > raw_co_get_block_status() turns into zero sectors of data. Could > theoretically lead to infinite loops in code that attempts to scan > data vs. hole forward. > > Rewrite from scratch, with very careful comments. Thanks for the careful commit message as well as the careful comments :) > > Signed-off-by: Markus Armbruster > --- > block/raw-posix.c | 111 > +- > 1 file changed, 85 insertions(+), 26 deletions(-) > > @@ -1542,25 +1600,26 @@ static int64_t coroutine_fn > raw_co_get_block_status(BlockDriverState *bs, > nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE); > } > > +} else if (data == start) { > /* On a data extent, compute sectors to the end of the extent. */ > *pnum = MIN(nb_sectors, (hole - start) / BDRV_SECTOR_SIZE); I think we are safe for now that no file system supports holes smaller than 512 bytes, so (hole - start) should always be a non-zero multiple of sectors. Similarly for the hole case of (data - start). Maybe it's worth assert(*pnum > 0) to ensure that we never hit a situation where we go into an infinite loop where we aren't progressing because pnum is never advancing to the next sector? But that would be okay as a separate patch, and I don't want to delay getting _this_ patch into 2.2. Reviewed-by: Eric Blake -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature