[Bug 1793608] Re: qemu doesn't seem to support lxvwsx for POWER9 target
A patch has been posted here: https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg02218.html ("ppc/translate: Implement lxvwsx opcode") ** Changed in: qemu Status: New => In Progress -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1793608 Title: qemu doesn't seem to support lxvwsx for POWER9 target Status in QEMU: In Progress Bug description: When running a simple program built for POWER9 on QEMU 3.0.0 in linux- user mode, it crashes with a message: "illegal instruction". It turns out that lxvwsx instruction "Load Word and Splat Indexed" is not supported. If workaround is implemented by issuing two separate instructions (first load then splat) then all tests pass correctly. Operating system: Ubuntu Mate 16.04.2 64-bit (or Linux Mint 18 64-bit). Cross-compiler for gcc-powerpc64le-linux-gnu is installed (gcc-5 series). QEMU 3.0.0 is built from source and installed with: sudo make install The program in question: https://github.com/VectorChief/UniSIMD-assembler Turn off the workaround: RT_ELEM_COMPAT_PW9 should be set to 1 in the following file: https://github.com/VectorChief/UniSIMD-assembler/blob/master/core/config/rtarch_p32_128x1v2.h Change to the "test" directory and type "make -f simd_make_p64.mk". powerpc64le-linux-gnu-objdump -d simd_test.p64_32Lp9 > simd_test_p64_32Lp9.txt Open newly created text file simd_test_p64_32Lp9.txt and search for lxvwsx (in s_test01, ...) The instruction shows up in objdump correctly. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1793608/+subscriptions
Re: [PATCH v2 00/44] Make qdev static property API usable by any QOM type
On 11/11/20 19:39, Eduardo Habkost wrote: I will submit v3 of this series with both object_class_property_add_field() and object_class_add_field_properties() as internal QOM APIs. object_class_add_field_properties() will be used to implement device_class_set_props(). I have no problem making both of them public APIs. If an object can use only a single array of static^Wfield properties that's totally fine; I'm just not sure about splitting properties between class_init and static arrays, which is the less consistent case. Paolo
Re: [PATCH 1/2] char: Skip CLI aliases in query-chardev-backends
Kevin Wolf writes: > The aliases "tty" and "parport" are only valid on the command line, QMP > commands like chardev-add don't know them. query-chardev-backends should > describe QMP and therefore not include them in the list of available > backends. > > Signed-off-by: Kevin Wolf I'd call that a bug. In the light of PATCH 2, I propose to put that one first (with the help_string_append() hunk dropped), then remove the aliases from CLI help, too, like this: diff --git a/chardev/char.c b/chardev/char.c index aa4282164a..8b6e78a122 100644 --- a/chardev/char.c +++ b/chardev/char.c @@ -568,13 +568,8 @@ static void chardev_name_foreach(void (*fn)(const char *name, void *opaque), void *opaque) { ChadevClassFE fe = { .fn = fn, .opaque = opaque }; -int i; object_class_foreach(chardev_class_foreach, TYPE_CHARDEV, false, &fe); - -for (i = 0; i < (int)ARRAY_SIZE(chardev_alias_table); i++) { -fn(chardev_alias_table[i].alias, opaque); -} } static void
Re: [PATCH v2] ACPI: Avoid infinite recursion when dump-vmstate
On Thu, Nov 12, 2020 at 10:06:38AM +0800, Peng Liang wrote: > There is a field with vmstate_ghes_state as vmsd in vmstate_ghes_state, > which will lead to infinite recursion in dump_vmstate_vmsd. > > Fixes: a08a64627b ("ACPI: Record the Generic Error Status Block address") > Reported-by: Euler Robot > Signed-off-by: Peng Liang > Acked-by: Igor Mammedov Reviewed-by: Michael S. Tsirkin > --- > hw/acpi/generic_event_device.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c > index 6df400e1ee16..5454be67d5f0 100644 > --- a/hw/acpi/generic_event_device.c > +++ b/hw/acpi/generic_event_device.c > @@ -322,6 +322,16 @@ static const VMStateDescription vmstate_ged_state = { > } > }; > > +static const VMStateDescription vmstate_ghes = { > +.name = "acpi-ghes", > +.version_id = 1, > +.minimum_version_id = 1, > +.fields = (VMStateField[]) { > +VMSTATE_UINT64(ghes_addr_le, AcpiGhesState), > +VMSTATE_END_OF_LIST() > +}, > +}; > + > static bool ghes_needed(void *opaque) > { > AcpiGedState *s = opaque; > @@ -335,7 +345,7 @@ static const VMStateDescription vmstate_ghes_state = { > .needed = ghes_needed, > .fields = (VMStateField[]) { > VMSTATE_STRUCT(ghes_state, AcpiGedState, 1, > - vmstate_ghes_state, AcpiGhesState), > + vmstate_ghes, AcpiGhesState), > VMSTATE_END_OF_LIST() > } > }; > -- > 2.26.2
Re: [PATCH-for-5.2 v2] hw/i386/acpi-build: Fix maybe-uninitialized error when ACPI hotplug off
On Thu, Nov 12, 2020 at 09:21:47AM +0530, Ani Sinha wrote: > > > On Sun, Nov 8, 2020 at 22:40 Philippe Mathieu-Daudé > wrote: > > On 11/8/20 4:58 AM, Ani Sinha wrote: > > On Sun, Nov 8, 2020 at 1:10 AM Philippe Mathieu-Daudé > > wrote: > >> > >> GCC 9.3.0 thinks that 'method' can be left uninitialized. This code > >> is already in the "if (bsel || pcihp_bridge_en)" block statement, > >> but it isn't smart enough to figure it out. > >> > >> Restrict the code to be used only in the "if (bsel || pcihp_bridge_en)" > >> block statement to fix (on Ubuntu): > >> > >>  ../hw/i386/acpi-build.c: In function > 'build_append_pci_bus_devices': > >>  ../hw/i386/acpi-build.c:496:9: error: 'method' may be used > uninitialized > >>  in this function [-Werror=maybe-uninitialized] > >>   496 |     aml_append(parent_scope, method); > >>     |     ^~~~ > >>  cc1: all warnings being treated as errors > > > > OK I looked at the patch closely and it makes sense. Can you please > > run a "make check" to make sure we have not broken anything? > > Yes I did... > > > Has this been queued Michael? tagged, thanks! > > > >
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
On Fri, Nov 6, 2020 at 11:35 PM Vivek Goyal wrote: > > On Fri, Nov 06, 2020 at 08:33:50PM +, Venegas Munoz, Jose Carlos wrote: > > Hi Vivek, > > > > I have tested with Kata 1.12-apha0, the results seems that are better for > > the use fio config I am tracking. > > > > The fio config does randrw: > > > > fio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio > > --bs=4k --iodepth=64 --size=200M --readwrite=randrw --rwmixread=75 > > > > Hi Carlos, > > Thanks for the testing. > > So basically two conclusions from your tests. > > - for virtiofs, --thread-pool-size=0 is performing better as comapred > to --thread-pool-size=1 as well as --thread-pool-size=64. Approximately > 35-40% better. > > - virtio-9p is still approximately 30% better than virtiofs > --thread-pool-size=0. > > As I had done the analysis that this particular workload (mixed read and > write) is bad with virtiofs because after every write we are invalidating > attrs and cache so next read ends up fetching attrs again. I had posted > patches to gain some of the performance. > > https://lore.kernel.org/linux-fsdevel/20200929185015.gg220...@redhat.com/ > > But I got the feedback to look into implementing file leases instead. Hmm, the FUSE_AUTO_INVAL_DATA feature is buggy, how about turning it off for now? 9p doesn't have it, so no point in enabling it for virtiofs by default. Also I think some confusion comes from cache=auto being the default for virtiofs.Not sure what the default is for 9p, but comparing default to default will definitely not be apples to apples since this mode is nonexistent in 9p. 9p:cache=none <-> virtiofs:cache=none 9p:cache=loose <-> virtiofs:cache=always "9p:cache=mmap" and "virtiofs:cache=auto" have no match. Untested patch attached. Thanks, Miklos diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index ec1008bceba8..d474c553bb5c 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -618,6 +618,9 @@ static void lo_init(void *userdata, struct fuse_conn_info *conn) lo->announce_submounts = false; } #endif + +/* This is currently buggy with mixed read-write load, so disable */ +conn->want &= ~FUSE_CAP_AUTO_INVAL_DATA; } static void lo_getattr(fuse_req_t req, fuse_ino_t ino, @@ -3444,7 +3447,7 @@ int main(int argc, char *argv[]) lo.inodes = g_hash_table_new(lo_key_hash, lo_key_equal); lo.root.fd = -1; lo.root.fuse_ino = FUSE_ROOT_ID; -lo.cache = CACHE_AUTO; +lo.cache = CACHE_ALWAYS; /* * Set up the ino map like this:
Re: [PATCH] ast2600: SRAM is 89KB
On 11/12/20 2:21 AM, Joel Stanley wrote: > On the AST2600A1, the SRAM size was increased to 89KB. > > Fixes: 7582591ae745 ("aspeed: Support AST2600A1 silicon revision") > Signed-off-by: Joel Stanley Reviewed-by: Cédric Le Goater > --- > hw/arm/aspeed_ast2600.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c > index 1450bde7cf26..12e4a16d3765 100644 > --- a/hw/arm/aspeed_ast2600.c > +++ b/hw/arm/aspeed_ast2600.c > @@ -481,7 +481,7 @@ static void aspeed_soc_ast2600_class_init(ObjectClass > *oc, void *data) > sc->name = "ast2600-a1"; > sc->cpu_type = ARM_CPU_TYPE_NAME("cortex-a7"); > sc->silicon_rev = AST2600_A1_SILICON_REV; > -sc->sram_size= 0x1; > +sc->sram_size= 0x16400; > sc->spis_num = 2; > sc->ehcis_num= 2; > sc->wdts_num = 4; >
[Bug 786208] Re: Missing checks for non-existent device in ide_exec_cmd
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Assignee: John Snow (jnsnow) => (unassigned) ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/786208 Title: Missing checks for non-existent device in ide_exec_cmd Status in QEMU: Incomplete Bug description: Several calls in the ide_exec_cmd handler are missing checks for (!s->bs) or similar, resulting in NULL pointer dereferences, divide- by-zero, or possibly other badness if the guest performs operations on a non-existent IDE master. For example, the WIN_READ_NATIVE_MAX command does a 'ide_set_sector(s, s->nb_sectors - 1);', which does 'cyl = sector_num / (s->heads * s->sectors);', which will fail with a divide-by-zero if heads = sectors = 0. And WIN_MULTREAD also does not check for s->bs, but does a 'ide_sector_read(s);', which will do 'bdrv_read(s->bs, sector_num, s->io_buffer, n);' on a NULL s->bs, leading to a segfault. I do not *believe* that a malicious guest can do anything more than cause a crash with these bugs. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/786208/+subscriptions
[Bug 670769] Re: CDROM size not updated when changing image files
Looking through old bug tickets... can you still reproduce this issue with the latest version of QEMU? Or could we close this ticket nowadays? ** Changed in: qemu Status: New => Incomplete ** Changed in: qemu Assignee: John Snow (jnsnow) => (unassigned) -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/670769 Title: CDROM size not updated when changing image files Status in QEMU: Incomplete Bug description: I'm using qemu 13.0 with a plain Linux kernel using the ata_piix driver as the guest, and an initrd that starts a shell. When changing the image in the monitor and reading from the CDROM in the guest, the size is not updated. I'm using LInux 2.6.32.24 as the host and I've tested 2.6.32.24, 2.6.35, and 2.6.36 as guests. Both host and guest are 64-bit. Here is the command used to start the guest using the initrd: ./x86_64-softmmu/qemu-system-x86_64 -cdrom /spare2/cd1.img -kernel /sources/linux-2.6.32.24-test/arch/x86/boot/bzImage -initrd /spare2/initrd.img -append 'root=/dev/ram0 rw' -cpu core2duo Additional info on this bug can be found here: http://marc.info/?l=kvm&m=128746013906820&w=2. Note: this is how I discovered the bug, using 32-bit Slackware install CDs. I'm attaching the initrd I used in my tests: I created two different-sized fake CDROM images by dd'ing from /dev/zero. In my tests, cd1.img is smaller that cd2.img. In the monitor I executed 'change ide1-cd0 /spare2/cd2.img' to load the new image. I checked the size by cat'ing /sys/block/sr0/size in the guest after reading the CDROM. Reading the CDROM was done by typing 'dd if=/dev/sr0 of=/dev/null bs=512 count=3' To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/670769/+subscriptions
[Bug 761469] Re: multicast VPN breaks IPv6 Duplicate Address Detection
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/761469 Title: multicast VPN breaks IPv6 Duplicate Address Detection Status in QEMU: Incomplete Status in Fedora: Won't Fix Bug description: The multicast VPN facility in Qemu uses Multicast Loopback to make sure that other Qemu processes on the Host server receive the transmission. The side effect of this is that the process sending the packet also gets the packet back on its receive channel and currently this is not filtered but passed directly to the Virtual Machine. You can see this effect by running tcpdump in the virtual machine. Every packet sent appears to be duplicated. For IPv4 it doesn't appear to cause much of a problem, however with IPv6 the duplicate neighbor solicitation packet is precisely what the system uses to detect duplicate addresses. So IPv6 addresses fail to establish. If you run 'ip addr' on a virtual Linux machine with IPv6 enabled you will see 'dadfailed' next to the link local address. 'ping6' will then not work. Checked against 0.12.1. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/761469/+subscriptions
Re: [PATCH v2-for-5.2] macio: set user_creatable to false in macio_class_init()
On 10/11/2020 10:31, Mark Cave-Ayland wrote: Commit 348b8d1a76 "macio: don't reference serial_hd() directly within the device" removed the setting of user_creatable to false on the basis that the restriction was due to the use of serial_hd() in macio_instance_init(). Unfortunately this isn't the full story since the PIC object property links must still be set before the device is realized. Whilst it is possible to update the macio device and Mac machines to resolve this, the fix is too invasive at this point in the release cycle. For now simply set user_creatable back to false in macio_class_init() to prevent QEMU from segfaulting in anticipation of the proper fix arriving in QEMU 6.0. Reported-by: Thomas Huth Signed-off-by: Mark Cave-Ayland Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Thomas Huth --- hw/misc/macio/macio.c | 2 ++ 1 file changed, 2 insertions(+) v2: - Rebase onto master - Add for-5.2 into subject prefix - Add R-B tags from Philippe and Thomas diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c index 51368884d0..bb601f782c 100644 --- a/hw/misc/macio/macio.c +++ b/hw/misc/macio/macio.c @@ -456,6 +456,8 @@ static void macio_class_init(ObjectClass *klass, void *data) k->class_id = PCI_CLASS_OTHERS << 8; device_class_set_props(dc, macio_properties); set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); +/* Reason: requires PIC property links to be set in macio_*_realize() */ +dc->user_creatable = false; } static const TypeInfo macio_bus_info = { Applied to my qemu-macppc branch. ATB, Mark.
[PATCH] virtio-net: purge queued rx packets on queue deletion
https://bugzilla.redhat.com/show_bug.cgi?id=1829272 When deleting queue pair, purge pending RX packets if any. Example of problematic flow: 1. Bring up q35 VM with tap (vhost off) and virtio-net or e1000e 2. Run ping flood to the VM NIC ( 1 ms interval) 3. Hot unplug the NIC device (device_del) During unplug process one or more packets come, the NIC can't receive, tap disables read_poll 4. Hot plug the device (device_add) with the same netdev The tap stays with read_poll disabled and does not receive any packets anymore (tap_send never triggered) Signed-off-by: Yuri Benditovich --- net/net.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/net/net.c b/net/net.c index 7a2a0fb5ac..a95b417300 100644 --- a/net/net.c +++ b/net/net.c @@ -411,10 +411,14 @@ void qemu_del_nic(NICState *nic) qemu_macaddr_set_free(&nic->conf->macaddr); -/* If this is a peer NIC and peer has already been deleted, free it now. */ -if (nic->peer_deleted) { -for (i = 0; i < queues; i++) { -qemu_free_net_client(qemu_get_subqueue(nic, i)->peer); +for (i = 0; i < queues; i++) { +NetClientState *nc = qemu_get_subqueue(nic, i); +/* If this is a peer NIC and peer has already been deleted, free it now. */ +if (nic->peer_deleted) { +qemu_free_net_client(nc->peer); +} else if (nc->peer) { +/* if there are RX packets pending, complete them */ +qemu_purge_queued_packets(nc->peer); } } -- 2.17.1
[PATCH] scsi-disk: convert more errno values back to SCSI statuses
Linux has some OS-specific (and sometimes weird) mappings for various SCSI statuses and sense codes. The most important is probably RESERVATION CONFLICT. Add them so that they can be reported back to the guest kernel. Cc: Hannes Reinecke Signed-off-by: Paolo Bonzini --- hw/scsi/scsi-disk.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 424bc192b7..fa14d1527a 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -461,6 +461,25 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed) } error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense)); break; +#ifdef CONFIG_LINUX +/* These errno mapping are specific to Linux. For more information: + * - scsi_decide_disposition in drivers/scsi/scsi_error.c + * - scsi_result_to_blk_status in drivers/scsi/scsi_lib.c + * - blk_errors[] in block/blk-core.c + */ +case EBADE: +/* DID_NEXUS_FAILURE -> BLK_STS_NEXUS. */ +scsi_req_complete(&r->req, RESERVATION_CONFLICT); +break; +case ENODATA: +/* DID_MEDIUM_ERROR -> BLK_STS_MEDIUM. */ +scsi_check_condition(r, SENSE_CODE(READ_ERROR)); +break; +case EREMOTEIO: +/* DID_TARGET_FAILURE -> BLK_STS_TARGET. */ +scsi_req_complete(&r->req, HARDWARE_ERROR); +break; +#endif case ENOMEDIUM: scsi_check_condition(r, SENSE_CODE(NO_MEDIUM)); break; -- 2.28.0
Re: [PATCH v5 1/1] tricore: added triboard with tc27x_soc
On Mon, Nov 09, 2020 at 05:50:55PM +0100, David Brenken wrote: > From: Andreas Konopik > > Signed-off-by: Andreas Konopik > Signed-off-by: David Brenken > Signed-off-by: Georg Hofstetter > Signed-off-by: Robert Rasche > Signed-off-by: Lars Biermanski > --- > default-configs/devices/tricore-softmmu.mak | 2 +- > hw/tricore/Kconfig | 8 + > hw/tricore/meson.build | 2 + > hw/tricore/tc27x_soc.c | 246 > hw/tricore/triboard.c | 98 > include/hw/tricore/tc27x_soc.h | 129 ++ > include/hw/tricore/triboard.h | 50 > 7 files changed, 534 insertions(+), 1 deletion(-) > create mode 100644 hw/tricore/tc27x_soc.c > create mode 100644 hw/tricore/triboard.c > create mode 100644 include/hw/tricore/tc27x_soc.h > create mode 100644 include/hw/tricore/triboard.h Reviewed-by: Bastian Koppelmann I applied this to my tricore.next tree. Cheers, Bastian
[PULL 1/1] macio: set user_creatable to false in macio_class_init()
Commit 348b8d1a76 "macio: don't reference serial_hd() directly within the device" removed the setting of user_creatable to false on the basis that the restriction was due to the use of serial_hd() in macio_instance_init(). Unfortunately this isn't the full story since the PIC object property links must still be set before the device is realized. Whilst it is possible to update the macio device and Mac machines to resolve this, the fix is too invasive at this point in the release cycle. For now simply set user_creatable back to false in macio_class_init() to prevent QEMU from segfaulting in anticipation of the proper fix arriving in QEMU 6.0. Reported-by: Thomas Huth Signed-off-by: Mark Cave-Ayland Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Thomas Huth Message-Id: <20201110103111.18395-1-mark.cave-ayl...@ilande.co.uk> Signed-off-by: Mark Cave-Ayland --- hw/misc/macio/macio.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c index 51368884d0..bb601f782c 100644 --- a/hw/misc/macio/macio.c +++ b/hw/misc/macio/macio.c @@ -456,6 +456,8 @@ static void macio_class_init(ObjectClass *klass, void *data) k->class_id = PCI_CLASS_OTHERS << 8; device_class_set_props(dc, macio_properties); set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); +/* Reason: requires PIC property links to be set in macio_*_realize() */ +dc->user_creatable = false; } static const TypeInfo macio_bus_info = { -- 2.20.1
[PULL 0/1] qemu-macppc queue 20201112
The following changes since commit a4c141dca466ed3e9451f147efe6304b1b659ff5: Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/tracing-pull-request' into staging (2020-11-11 19:53:15 +) are available in the Git repository at: git://github.com/mcayland/qemu.git tags/qemu-macppc-20201112 for you to fetch changes up to 6bfa035ec31f4f5a14499f17e08f62e8f14760cc: macio: set user_creatable to false in macio_class_init() (2020-11-12 09:26:20 +) qemu-macppc fix for 5.2 Mark Cave-Ayland (1): macio: set user_creatable to false in macio_class_init() hw/misc/macio/macio.c | 2 ++ 1 file changed, 2 insertions(+)
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
On Thu, Nov 12, 2020 at 10:06 AM Miklos Szeredi wrote: > > On Fri, Nov 6, 2020 at 11:35 PM Vivek Goyal wrote: > > > > On Fri, Nov 06, 2020 at 08:33:50PM +, Venegas Munoz, Jose Carlos wrote: > > > Hi Vivek, > > > > > > I have tested with Kata 1.12-apha0, the results seems that are better for > > > the use fio config I am tracking. > > > > > > The fio config does randrw: > > > > > > fio --direct=1 --gtod_reduce=1 --name=test > > > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=200M > > > --readwrite=randrw --rwmixread=75 > > > > > > > Hi Carlos, > > > > Thanks for the testing. > > > > So basically two conclusions from your tests. > > > > - for virtiofs, --thread-pool-size=0 is performing better as comapred > > to --thread-pool-size=1 as well as --thread-pool-size=64. Approximately > > 35-40% better. > > > > - virtio-9p is still approximately 30% better than virtiofs > > --thread-pool-size=0. > > > > As I had done the analysis that this particular workload (mixed read and > > write) is bad with virtiofs because after every write we are invalidating > > attrs and cache so next read ends up fetching attrs again. I had posted > > patches to gain some of the performance. > > > > https://lore.kernel.org/linux-fsdevel/20200929185015.gg220...@redhat.com/ > > > > But I got the feedback to look into implementing file leases instead. > > Hmm, the FUSE_AUTO_INVAL_DATA feature is buggy, how about turning it > off for now? 9p doesn't have it, so no point in enabling it for > virtiofs by default. > > Also I think some confusion comes from cache=auto being the default > for virtiofs.Not sure what the default is for 9p, but comparing > default to default will definitely not be apples to apples since this > mode is nonexistent in 9p. > > 9p:cache=none <-> virtiofs:cache=none > 9p:cache=loose <-> virtiofs:cache=always > > "9p:cache=mmap" and "virtiofs:cache=auto" have no match. > > Untested patch attached. Another performance issue with virtiofs could be due to the strict page writeback rules in fuse that are meant to prevent misuse of kernel memory by unprivileged processes. Since virtiofs isn't subject to that limitation, these could be relaxed. Attaching a patch that does one half of this. The other half is getting rid of the page copying, that's a bit more involved, but shouldn't be too difficult. Just need to duplicate the cached writeback callbacks for virtiofs and do away with the complex temp page stuff. Thanks, Miklos diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index d414c787e362..92c92c482c57 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -502,6 +502,7 @@ struct fuse_fs_context { bool no_force_umount:1; bool legacy_opts_show:1; bool dax:1; + bool relax_writeback:1; unsigned int max_read; unsigned int blksize; const char *subtype; diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 36ab05315828..029325ebd1b3 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1130,7 +1130,8 @@ void fuse_free_conn(struct fuse_conn *fc) } EXPORT_SYMBOL_GPL(fuse_free_conn); -static int fuse_bdi_init(struct fuse_conn *fc, struct super_block *sb) +static int fuse_bdi_init(struct fuse_conn *fc, struct super_block *sb, + struct fuse_fs_context *ctx) { int err; char *suffix = ""; @@ -1151,21 +1152,24 @@ static int fuse_bdi_init(struct fuse_conn *fc, struct super_block *sb) /* fuse does it's own writeback accounting */ sb->s_bdi->capabilities &= ~BDI_CAP_WRITEBACK_ACCT; - sb->s_bdi->capabilities |= BDI_CAP_STRICTLIMIT; - /* - * For a single fuse filesystem use max 1% of dirty + - * writeback threshold. - * - * This gives about 1M of write buffer for memory maps on a - * machine with 1G and 10% dirty_ratio, which should be more - * than enough. - * - * Privileged users can raise it by writing to - * - */sys/class/bdi//max_ratio - */ - bdi_set_max_ratio(sb->s_bdi, 1); + if (!ctx->relax_writeback) { + sb->s_bdi->capabilities |= BDI_CAP_STRICTLIMIT; + + /* + * For a single fuse filesystem use max 1% of dirty + + * writeback threshold. + * + * This gives about 1M of write buffer for memory maps on a + * machine with 1G and 10% dirty_ratio, which should be more + * than enough. + * + * Privileged users can raise it by writing to + * + */sys/class/bdi//max_ratio + */ + bdi_set_max_ratio(sb->s_bdi, 1); + } return 0; } @@ -1354,7 +1358,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx) fc->dev = sb->s_dev; fm->sb = sb; - err = fuse_bdi_init(fc, sb); + err = fuse_bdi_init(fc, sb, ctx); if (err) goto err_dev_free; diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 8868ac31a3c0..efbe1697612e 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -1288,6 +1288,7 @@ static inline void virtio_fs_ctx_set_defaults(struct fuse_fs_context *ctx) ctx->destroy = true; ctx->no_control = true; ctx->no_force_umount = true; + ctx->relax_writeback = true; }
Re: [PATCH PROTOTYPE 3/6] vfio: Implement support for sparse RAM memory regions
On 20.10.20 22:44, Peter Xu wrote: > On Tue, Oct 20, 2020 at 10:01:12PM +0200, David Hildenbrand wrote: >> Thanks ... but I have an AMD system. Will try to find out how to get >> that running with AMD :) I just did some more testing with the oldish GPU I have for that purpose. Seems to work, at least I get video output that keeps on working - did not try advanced things yet. I use -device vfio-pci,host=05:00.0,x-vga=on -device vfio-pci,host=05:00.1 when adding "-device intel-iommu", I got "qemu-system-x86_64: -device vfio-pci,host=05:00.1: vfio :05:00.1: group 23 used in multiple address spaces" ... so I poked around the internet a bit and got it running with -device intel-iommu,caching-mode=on \ -device pcie-pci-bridge,addr=1e.0,id=pci.1 \ -device vfio-pci,host=05:00.0,xvga=on,bus=pci.1,addr=1.0,multifunction=on \ -device vfio-pci,host=05:00.1,bus=pci.1,addr=1.1 \ Things still seem to be working, so I assume it works (I guess ?!). -- Thanks, David / dhildenb
Re: [PATCH] net/e1000e_core: make sure RDH never exceeds RDT in e1000e_ring_advance()
On Wed, Nov 11, 2020 at 1:48 PM Jason Wang wrote: > > > On 2020/11/11 下午4:54, Jason Wang wrote: > > > > On 2020/11/10 下午5:06, Mauro Matteo Cascella wrote: > >> On Mon, Nov 9, 2020 at 3:38 AM Jason Wang wrote: > >>> > >>> On 2020/11/5 下午6:56, Mauro Matteo Cascella wrote: > The e1000e_write_packet_to_guest() function iterates over a set of > receive descriptors by advancing rx descriptor head register (RDH) > from > its initial value to rx descriptor tail register (RDT). The check in > e1000e_ring_empty() is responsible for detecting whether RDH has > reached > RDT, terminating the loop if that's the case. Additional checks have > been added in the past to deal with bogus values submitted by the > guest > to prevent possible infinite loop. This is done by "wrapping > around" RDH > at some point and detecting whether it assumes the original value > during > the loop. > > However, when e1000e is configured to use the packet split feature, > RDH is > incremented by two instead of one, as the packet split descriptors are > 32 bytes while regular descriptors are 16 bytes. A malicious or buggy > guest may set RDT to an odd value and transmit only null RX > descriptors. > This corner case would prevent RDH from ever matching RDT, leading > to an > infinite loop. This patch adds a check in e1000e_ring_advance() to > make > sure RDH does never exceed RDT. > > This issue was independently reported by Gaoning Pan (Zhejiang > University) > and Cheolwoo Myung. > > Signed-off-by: Mauro Matteo Cascella > Reported-by: Gaoning Pan > Reported-by: Cheolwoo Myung <330cj...@gmail.com> > --- > References: > https://git.qemu.org/?p=qemu.git;a=commit;h=dd793a74882477ca38d49e191110c17dfe > > https://git.qemu.org/?p=qemu.git;a=commit;h=4154c7e03fa55b4cf52509a83d50d6c09d743b7 > > http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf > > > hw/net/e1000e_core.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c > index bcd186cac5..4c4d14b6ed 100644 > --- a/hw/net/e1000e_core.c > +++ b/hw/net/e1000e_core.c > @@ -831,6 +831,10 @@ e1000e_ring_advance(E1000ECore *core, const > E1000E_RingInfo *r, uint32_t count) > { > core->mac[r->dh] += count; > > +if (core->mac[r->dh] > core->mac[r->dt]) { > +core->mac[r->dh] = core->mac[r->dt]; > +} > + > if (core->mac[r->dh] * E1000_RING_DESC_LEN >= > core->mac[r->dlen]) { > core->mac[r->dh] = 0; > } > >> Hi Jason, > >> > >>> A question here. > >>> > >>> When count > 1, is this correct to reset dh here? > >>> > >>> Thanks > >>> > >> My understanding is that wrapping at (or above) RDLEN is the correct > >> behavior regardless of count. I don't see why count > 1 should modify > >> this behavior. I'm not sure, though. Anyway, this patch fixes the > >> above reproducer, so I'm adding a Tested-by line here. > >> > >> Tested-by: Mauro Matteo Cascella > >> > >> Thank you, > >> -- > >> Mauro Matteo Cascella > >> Red Hat Product Security > >> PGP-Key ID: BB3410B0 > >> > > > > Right. > > > > Applied. > > > > Thanks > > > I had to drop this since it breaks e1000e PXE test. > > Thanks > By debugging the failing qtest (/x86_64/pxe/ipv4/q35/e1000e) I noticed several cases where RDH > RDT in e1000e_ring_advance(). Given the RX/TX descriptor ring structure, I guess this is a possible scenario when the tail pointer wraps back to base when maximum descriptors have been processed. I will send a new version to only cover cases where RDH < RDT and the increment would exceed RDT. This should be enough to fix the infinite loop issue while making the e1000e PXE test pass. Thank you, -- Mauro Matteo Cascella Red Hat Product Security PGP-Key ID: BB3410B0
[PATCH] xhci: fix nec-usb-xhci properties
Storing properties directly in XHCIPciState.xhci doesn't work, the object_initialize_child() call in xhci_instance_init() will overwrite them. This changes the defaults for some properties, which in turn breaks live migration and possibly other things as well. So add XHCINecState, store properties there, copy them over on instance init. Fixes: 8ddab8dd3d81 ("usb/hcd-xhci: Split pci wrapper for xhci base model") Reported-by: Dr. David Alan Gilbert Signed-off-by: Gerd Hoffmann --- hw/usb/hcd-xhci-nec.c | 31 ++- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git a/hw/usb/hcd-xhci-nec.c b/hw/usb/hcd-xhci-nec.c index 5707b2cabd16..13a125afe2f7 100644 --- a/hw/usb/hcd-xhci-nec.c +++ b/hw/usb/hcd-xhci-nec.c @@ -27,18 +27,37 @@ #include "hcd-xhci-pci.h" +typedef struct XHCINecState { +/*< private >*/ +XHCIPciState parent_obj; +/*< public >*/ +uint32_t flags; +uint32_t intrs; +uint32_t slots; +} XHCINecState; + static Property nec_xhci_properties[] = { DEFINE_PROP_ON_OFF_AUTO("msi", XHCIPciState, msi, ON_OFF_AUTO_AUTO), DEFINE_PROP_ON_OFF_AUTO("msix", XHCIPciState, msix, ON_OFF_AUTO_AUTO), -DEFINE_PROP_BIT("superspeed-ports-first", XHCIPciState, -xhci.flags, XHCI_FLAG_SS_FIRST, true), -DEFINE_PROP_BIT("force-pcie-endcap", XHCIPciState, xhci.flags, +DEFINE_PROP_BIT("superspeed-ports-first", XHCINecState, flags, +XHCI_FLAG_SS_FIRST, true), +DEFINE_PROP_BIT("force-pcie-endcap", XHCINecState, flags, XHCI_FLAG_FORCE_PCIE_ENDCAP, false), -DEFINE_PROP_UINT32("intrs", XHCIPciState, xhci.numintrs, XHCI_MAXINTRS), -DEFINE_PROP_UINT32("slots", XHCIPciState, xhci.numslots, XHCI_MAXSLOTS), +DEFINE_PROP_UINT32("intrs", XHCINecState, intrs, XHCI_MAXINTRS), +DEFINE_PROP_UINT32("slots", XHCINecState, slots, XHCI_MAXSLOTS), DEFINE_PROP_END_OF_LIST(), }; +static void nec_xhci_instance_init(Object *obj) +{ +XHCIPciState *pci = XHCI_PCI(obj); +XHCINecState *nec = container_of(pci, XHCINecState, parent_obj); + +pci->xhci.flags= nec->flags; +pci->xhci.numintrs = nec->intrs; +pci->xhci.numslots = nec->slots; +} + static void nec_xhci_class_init(ObjectClass *klass, void *data) { PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); @@ -53,6 +72,8 @@ static void nec_xhci_class_init(ObjectClass *klass, void *data) static const TypeInfo nec_xhci_info = { .name = TYPE_NEC_XHCI, .parent= TYPE_XHCI_PCI, +.instance_size = sizeof(XHCINecState), +.instance_init = nec_xhci_instance_init, .class_init= nec_xhci_class_init, }; -- 2.27.0
Re: [PATCH] migration: handle CANCELLING state in migration_completion()
* Longpeng(Mike) (longpe...@huawei.com) wrote: > The following sequence may cause the VM abort during migration: > > 1. RUN_STATE_RUNNING,MIGRATION_STATUS_ACTIVE > > 2. before call migration_completion(), we send migrate_cancel >QMP command, the state machine is changed to: > RUN_STATE_RUNNING,MIGRATION_STATUS_CANCELLING > > 3. call migration_completion(), and the state machine is >switch to: RUN_STATE_RUNNING,MIGRATION_STATUS_COMPLETED > > 4. call migration_iteration_finish(), because the migration >status is COMPLETED, so it will try to set the runstate >to POSTMIGRATE, but RUNNING-->POSTMIGRATE is an invalid >transition, so abort(). > > The migration_completion() should not change the migration state > to COMPLETED if it is already changed to CANCELLING. > > Signed-off-by: Longpeng(Mike) Yes I think so; the only downside I see is I think this ends up going from CACELLNG->FAILED. Reviewed-by: Dr. David Alan Gilbert > --- > migration/migration.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/migration/migration.c b/migration/migration.c > index 3263aa5..b11a2bd 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -3061,6 +3061,8 @@ static void migration_completion(MigrationState *s) > > qemu_savevm_state_complete_postcopy(s->to_dst_file); > trace_migration_completion_postcopy_end_after_complete(); > +} else if (s->state == MIGRATION_STATUS_CANCELLING) { > +goto fail; > } > > /* > -- > 1.8.3.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 1/2] char: Skip CLI aliases in query-chardev-backends
Am 12.11.2020 um 09:22 hat Markus Armbruster geschrieben: > Kevin Wolf writes: > > > The aliases "tty" and "parport" are only valid on the command line, QMP > > commands like chardev-add don't know them. query-chardev-backends should > > describe QMP and therefore not include them in the list of available > > backends. > > > > Signed-off-by: Kevin Wolf > > I'd call that a bug. Which is why I'm fixing it, separate from the deprecation. > In the light of PATCH 2, I propose to put that one first (with the > help_string_append() hunk dropped), then remove the aliases from CLI > help, too, like this: [...] Going one step back without thinking in solutions immediately, what you're suggesting is that deprecated options should become undocumented instead of just annotated as deprecated? Kevin
Re: [PATCH 1/3] meson: move vhost_user_blk_server to meson.build
On Wed, Nov 11, 2020 at 12:54:38PM +0100, Philippe Mathieu-Daudé wrote: > On 11/11/20 12:44 PM, Philippe Mathieu-Daudé wrote: > > On 11/11/20 10:41 AM, Philippe Mathieu-Daudé wrote: > >> On 11/10/20 6:11 PM, Stefan Hajnoczi wrote: > >>> The --enable/disable-vhost-user-blk-server options were implemented in > >>> ./configure. There has been confusion about them and part of the problem > >>> is that the shell syntax used for setting the default value is not easy > >>> to read. Move the option over to meson where the conditions are easier > >>> to understand: > >>> > >>> have_vhost_user_blk_server = (targetos == 'linux') > >>> > >>> if get_option('vhost_user_blk_server').enabled() > >>> if targetos != 'linux' > >>> error('vhost_user_blk_server requires linux') > >>> endif > >>> elif get_option('vhost_user_blk_server').disabled() or not have_system > >>> have_vhost_user_blk_server = false > >>> endif > >> > >> Something is odd: > >> > >> $ ../configure --disable-system --enable-vhost-user-blk-server > > > > I failed when pasting, this misses '--disable-tools' to make sense. > > > > We define in meson.build: > > > > have_block = have_system or have_tools > > > > Maybe this is the one you want instead of have_system? > > This snippet seems to fix: > > -- >8 -- > --- a/meson.build > +++ b/meson.build > @@ -751,6 +751,10 @@ > > has_statx = cc.links(statx_test) > > +if 'CONFIG_VHOST_USER' in config_host and not (have_system or have_tools) > +error('vhost-user does not make sense without system or tools > support enabled') > +endif > + > have_vhost_user_blk_server = (targetos == 'linux' and > 'CONFIG_VHOST_USER' in config_host) > > --- > > $ ../configure --disable-system --enable-vhost-user-blk-server > ../source/qemu/meson.build:755:4: ERROR: Problem encountered: vhost-user > does not make sense without system or tools support enabled > > I'll send a patch. This patch was discussed in "[PATCH-for-5.2 v2 0/4] vhost-user: Fix ./configure confusion". We agreed to drop it for now because it breaks Linux ./configure --disable-system --disable-tools. This patch series is fine as it is. Stefan signature.asc Description: PGP signature
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
On Donnerstag, 12. November 2020 10:06:37 CET Miklos Szeredi wrote: > On Fri, Nov 6, 2020 at 11:35 PM Vivek Goyal wrote: > > On Fri, Nov 06, 2020 at 08:33:50PM +, Venegas Munoz, Jose Carlos wrote: > > > Hi Vivek, > > > > > > I have tested with Kata 1.12-apha0, the results seems that are better > > > for the use fio config I am tracking. > > > > > > The fio config does randrw: > > > > > > fio --direct=1 --gtod_reduce=1 --name=test > > > --filename=random_read_write.fio --bs=4k --iodepth=64 --size=200M > > > --readwrite=randrw --rwmixread=75> > > Hi Carlos, > > > > Thanks for the testing. > > > > So basically two conclusions from your tests. > > > > - for virtiofs, --thread-pool-size=0 is performing better as comapred > > > > to --thread-pool-size=1 as well as --thread-pool-size=64. Approximately > > 35-40% better. > > > > - virtio-9p is still approximately 30% better than virtiofs > > > > --thread-pool-size=0. > > > > As I had done the analysis that this particular workload (mixed read and > > write) is bad with virtiofs because after every write we are invalidating > > attrs and cache so next read ends up fetching attrs again. I had posted > > patches to gain some of the performance. > > > > https://lore.kernel.org/linux-fsdevel/20200929185015.gg220...@redhat.com/ > > > > But I got the feedback to look into implementing file leases instead. > > Hmm, the FUSE_AUTO_INVAL_DATA feature is buggy, how about turning it > off for now? 9p doesn't have it, so no point in enabling it for > virtiofs by default. > > Also I think some confusion comes from cache=auto being the default > for virtiofs.Not sure what the default is for 9p, but comparing > default to default will definitely not be apples to apples since this > mode is nonexistent in 9p. The default for 9p is cache=none. That should be changed to cache=mmap being default IMO, because if users stick with the default setting 'none', it breaks software relying on mmap() calls. > > 9p:cache=none <-> virtiofs:cache=none > 9p:cache=loose <-> virtiofs:cache=always > > "9p:cache=mmap" and "virtiofs:cache=auto" have no match. What does 'auto' do exactly? > > Untested patch attached. > > Thanks, > Miklos Best regards, Christian Schoenebeck
Re: [PATCH v2 2/4] ads7846: put it into the 'display' category
On Thu, 12 Nov 2020 at 07:25, Gan Qixin wrote: > > The category of the ads7846 device is not set, put it into the 'display' > category. > > Signed-off-by: Gan Qixin > --- > hw/display/ads7846.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/hw/display/ads7846.c b/hw/display/ads7846.c > index 023165b2a3..b455cb5acd 100644 > --- a/hw/display/ads7846.c > +++ b/hw/display/ads7846.c > @@ -163,10 +163,12 @@ static void ads7846_realize(SSISlave *d, Error **errp) > > static void ads7846_class_init(ObjectClass *klass, void *data) > { > +DeviceClass *dc = DEVICE_CLASS(klass); > SSISlaveClass *k = SSI_SLAVE_CLASS(klass); > > k->realize = ads7846_realize; > k->transfer = ads7846_transfer; > +set_bit(DEVICE_CATEGORY_DISPLAY, dc->categories); > } This isn't a display, despite being in hw/display. It's a touch-screen controller, so it ought to be in hw/input and be DEVICE_CATEGORY_INPUT. thanks -- PMM
Re: [PATCH 0/2] RFC: Issue with discards on raw block device without O_DIRECT
[added some relevant people and lists to CC] On Wed 11-11-20 17:44:05, Maxim Levitsky wrote: > On Wed, 2020-11-11 at 17:39 +0200, Maxim Levitsky wrote: > > clone of "starship_production" > > The git-publish destroyed the cover letter: > > For the reference this is for bz #1872633 > > The issue is that current kernel code that implements 'fallocate' > on kernel block devices roughly works like that: > > 1. Flush the page cache on the range that is about to be discarded. > 2. Issue the discard and wait for it to finish. >(as far as I can see the discard doesn't go through the >page cache). > > 3. Check if the page cache is dirty for this range, >if it is dirty (meaning that someone wrote to it meanwhile) >return -EBUSY. > > This means that if qemu (or qemu-img) issues a write, and then > discard to the area that shares a page, -EBUSY can be returned by > the kernel. Indeed, if you don't submit PAGE_SIZE aligned discards, you can get back EBUSY which seems wrong to me. IMO we should handle this gracefully in the kernel so we need to fix this. > On the other hand, for example, the ext4 implementation of discard > doesn't seem to be affected. It does take a lock on the inode to avoid > concurrent IO and flushes O_DIRECT writers prior to doing discard thought. Well, filesystem hole punching is somewhat different beast than block device discard (at least implementation wise). > Doing fsync and retrying is seems to resolve this issue, but it might be > a too big hammer. Just retrying doesn't work, indicating that maybe the > code that flushes the page cache in (1) doesn't do this correctly ? > > It also can be racy unless special means are done to block IO from happening > from qemu during this fsync. > > This patch series contains two patches: > > First patch just lets the file-posix ignore the -EBUSY errors, which is > technically enough to fail back to plain write in this case, but seems wrong. > > And the second patch adds an optimization to qemu-img to avoid such a > fragmented write/discard in the first place. > > Both patches make the reproducer work for this particular bugzilla, > but I don't think they are enough. > > What do you think? So if the EBUSY error happens because something happened to the page cache outside of discarded range (like you describe above), that is a kernel bug than needs to get fixed. EBUSY should really mean - someone wrote to the discarded range while discard was running and userspace app has to deal with that depending on what it aims to do... Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH] vl, qemu-config: remove -set
Hi, > IOW, it looks like it is valid to use -set, even if you're not using > -readconfig. Yes, that is quite useful for setting device properties which are not (yet) supported by libvirt, like this: Grepping through my libvirt domain config files I see 90% is indeed "-set device". But I've also found netdev (set tftp+bootfile for ""). take care, Gerd
Re: [PULL 0/2] Linux user for 5.2 patches
On Wed, 11 Nov 2020 at 21:42, Laurent Vivier wrote: > > The following changes since commit c6f28ed5075df79fef39c500362a3f4089256c9c: > > Update version for v5.2.0-rc1 release (2020-11-10 22:29:57 +) > > are available in the Git repository at: > > git://github.com/vivier/qemu.git tags/linux-user-for-5.2-pull-request > > for you to fetch changes up to c7811022ebfcaae64e06383ff734f3b3651bf892: > > linux-user: Prevent crash in epoll_ctl (2020-11-11 11:01:08 +0100) > > > Fixes for epoll_ctl and stack_t > > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2 for any user-visible changes. -- PMM
[PATCH] linux-user: Implement copy_file_range
Signed-off-by: Andreas Schwab --- linux-user/syscall.c | 40 1 file changed, 40 insertions(+) diff --git a/linux-user/syscall.c b/linux-user/syscall.c index 3160a9ba06..c3373af4c7 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -813,6 +813,12 @@ safe_syscall5(int, mq_timedsend, int, mqdes, const char *, msg_ptr, safe_syscall5(int, mq_timedreceive, int, mqdes, char *, msg_ptr, size_t, len, unsigned *, prio, const struct timespec *, timeout) #endif +#if defined(TARGET_NR_copy_file_range) && defined(__NR_copy_file_range) +safe_syscall6(ssize_t, copy_file_range, int, infd, loff_t *, pinoff, + int, outfd, loff_t *, poutoff, size_t, length, + unsigned int, flags) +#endif + /* We do ioctl like this rather than via safe_syscall3 to preserve the * "third argument might be integer or pointer or not present" behaviour of * the libc function. @@ -13057,6 +13063,40 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1, return get_errno(membarrier(arg1, arg2)); #endif +#if defined(TARGET_NR_copy_file_range) && defined(__NR_copy_file_range) +case TARGET_NR_copy_file_range: +{ +loff_t inoff, outoff; +loff_t *pinoff = NULL, *poutoff = NULL; + +if (arg2) { +if (get_user_u64(inoff, arg2)) { +return -TARGET_EFAULT; +} +pinoff = &inoff; +} +if (arg4) { +if (get_user_u64(outoff, arg4)) { +return -TARGET_EFAULT; +} +poutoff = &outoff; +} +ret = get_errno(safe_copy_file_range(arg1, pinoff, arg3, poutoff, + arg5, arg6)); +if (arg2) { +if (put_user_u64(inoff, arg2)) { +return -TARGET_EFAULT; +} +} +if (arg4) { +if (put_user_u64(outoff, arg4)) { +return -TARGET_EFAULT; +} +} +} +return ret; +#endif + default: qemu_log_mask(LOG_UNIMP, "Unsupported syscall: %d\n", num); return -TARGET_ENOSYS; -- 2.29.0 -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: [PATCH] target/i386: seg_helper: Correct segement selector nullification in the RET/IRET helper
On 22/10/20 12:16, Bin Meng wrote: From: Bin Meng Per the SDM, when returning to outer privilege level, for segment registers (ES, FS, GS, and DS) if the check fails, the segment selector becomes null, but QEMU clears the base/limit/flags as well as nullifying the segment selector, which should be a spec violation. Real hardware seems to be compliant with the spec, at least on one Coffee Lake board I tested. This is all quite messy in QEMU; for performance reasons, it never even checks the flags on memory accesses, only on selector loads. One way to fix it would be to define five extra hflags bits that copy the P bit of DS/ES/SS/FS/GS. gen_lea_v_seg checks if the hflag is clear, and if so it generates a #GP or #SS. Regarding your patch, I think at least the segment should be made "unusable". On Intel processors there is an internal "unusable" flag, on AMD and in QEMU, equivalently, the P bit would be cleared in the flags. As far as I know the difference is only visible with VMX. That is, you'd need something like this: cpu_x86_load_seg_cache(env, seg_reg, 0, env->segs[seg_reg].base, env->segs[seg_reg].limit, env->segs[seg_reg].flags & ~DESC_P_MASK); Thanks, Paolo Signed-off-by: Bin Meng --- target/i386/seg_helper.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/target/i386/seg_helper.c b/target/i386/seg_helper.c index be88938..d8766d8 100644 --- a/target/i386/seg_helper.c +++ b/target/i386/seg_helper.c @@ -2108,7 +2108,10 @@ static inline void validate_seg(CPUX86State *env, int seg_reg, int cpl) if (!(e2 & DESC_CS_MASK) || !(e2 & DESC_C_MASK)) { /* data or non conforming code segment */ if (dpl < cpl) { -cpu_x86_load_seg_cache(env, seg_reg, 0, 0, 0, 0); +cpu_x86_load_seg_cache(env, seg_reg, 0, + env->segs[seg_reg].base, + env->segs[seg_reg].limit, + env->segs[seg_reg].flags); } } }
RE: [PATCH v2 2/4] ads7846: put it into the 'display' category
> -Original Message- > From: Peter Maydell [mailto:peter.mayd...@linaro.org] > Sent: Thursday, November 12, 2020 7:20 PM > To: ganqixin > Cc: QEMU Developers ; QEMU Trivial > ; Thomas Huth ; > Zhanghailiang ; Michael S. Tsirkin > ; Laurent Vivier ; Markus Armbruster > ; Chenqun (kuhn) ; > Philippe Mathieu-Daudé > Subject: Re: [PATCH v2 2/4] ads7846: put it into the 'display' category > > On Thu, 12 Nov 2020 at 07:25, Gan Qixin wrote: > > > > The category of the ads7846 device is not set, put it into the 'display' > > category. > > > > Signed-off-by: Gan Qixin > > --- > > hw/display/ads7846.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/hw/display/ads7846.c b/hw/display/ads7846.c index > > 023165b2a3..b455cb5acd 100644 > > --- a/hw/display/ads7846.c > > +++ b/hw/display/ads7846.c > > @@ -163,10 +163,12 @@ static void ads7846_realize(SSISlave *d, Error > > **errp) > > > > static void ads7846_class_init(ObjectClass *klass, void *data) { > > +DeviceClass *dc = DEVICE_CLASS(klass); > > SSISlaveClass *k = SSI_SLAVE_CLASS(klass); > > > > k->realize = ads7846_realize; > > k->transfer = ads7846_transfer; > > +set_bit(DEVICE_CATEGORY_DISPLAY, dc->categories); > > } > > This isn't a display, despite being in hw/display. It's a touch-screen > controller, > so it ought to be in hw/input and be DEVICE_CATEGORY_INPUT. > Thanks for your reply, I will recategorize ads7846. Gan Qixin
Re: [PATCH 0/2] RFC: Issue with discards on raw block device without O_DIRECT
On Thu 12-11-20 12:19:51, Jan Kara wrote: > [added some relevant people and lists to CC] > > On Wed 11-11-20 17:44:05, Maxim Levitsky wrote: > > On Wed, 2020-11-11 at 17:39 +0200, Maxim Levitsky wrote: > > > clone of "starship_production" > > > > The git-publish destroyed the cover letter: > > > > For the reference this is for bz #1872633 > > > > The issue is that current kernel code that implements 'fallocate' > > on kernel block devices roughly works like that: > > > > 1. Flush the page cache on the range that is about to be discarded. > > 2. Issue the discard and wait for it to finish. > >(as far as I can see the discard doesn't go through the > >page cache). > > > > 3. Check if the page cache is dirty for this range, > >if it is dirty (meaning that someone wrote to it meanwhile) > >return -EBUSY. > > > > This means that if qemu (or qemu-img) issues a write, and then > > discard to the area that shares a page, -EBUSY can be returned by > > the kernel. > > Indeed, if you don't submit PAGE_SIZE aligned discards, you can get back > EBUSY which seems wrong to me. IMO we should handle this gracefully in the > kernel so we need to fix this. > > > On the other hand, for example, the ext4 implementation of discard > > doesn't seem to be affected. It does take a lock on the inode to avoid > > concurrent IO and flushes O_DIRECT writers prior to doing discard thought. > > Well, filesystem hole punching is somewhat different beast than block device > discard (at least implementation wise). > > > Doing fsync and retrying is seems to resolve this issue, but it might be > > a too big hammer. Just retrying doesn't work, indicating that maybe the > > code that flushes the page cache in (1) doesn't do this correctly ? > > > > It also can be racy unless special means are done to block IO from happening > > from qemu during this fsync. > > > > This patch series contains two patches: > > > > First patch just lets the file-posix ignore the -EBUSY errors, which is > > technically enough to fail back to plain write in this case, but seems > > wrong. > > > > And the second patch adds an optimization to qemu-img to avoid such a > > fragmented write/discard in the first place. > > > > Both patches make the reproducer work for this particular bugzilla, > > but I don't think they are enough. > > > > What do you think? > > So if the EBUSY error happens because something happened to the page cache > outside of discarded range (like you describe above), that is a kernel bug > than needs to get fixed. EBUSY should really mean - someone wrote to the > discarded range while discard was running and userspace app has to deal > with that depending on what it aims to do... So I was looking what it would take to fix this inside the kernel. The problem is that invalidate_inode_pages2_range() is working on page granularity and it is non-trivial to extend it to work on byte granularity since we don't support something like "try to reclaim part of a page". But I'm also somewhat wondering why we use invalidate_inode_pages2_range() here instead of truncate_inode_pages_range() again? I mean the EBUSY detection cannot be reliable anyway and userspace has no way of knowing whether a write happened before discard or after it so just discarding data is fine from this point of view. Darrick? Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH 2/2] qemu-img: align next status sector on destination alignment.
Am 11.11.20 um 16:39 schrieb Maxim Levitsky: > This helps avoid unneeded writes and discards. > > Signed-off-by: Maxim Levitsky > --- > qemu-img.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/qemu-img.c b/qemu-img.c > index c2c56fc797..7e9b0f659f 100644 > --- a/qemu-img.c > +++ b/qemu-img.c > @@ -1722,7 +1722,7 @@ static void convert_select_part(ImgConvertState *s, > int64_t sector_num, > static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num) > { > int64_t src_cur_offset; > -int ret, n, src_cur; > +int ret, n, src_cur, alignment; > bool post_backing_zero = false; > > convert_select_part(s, sector_num, &src_cur, &src_cur_offset); > @@ -1785,11 +1785,14 @@ static int convert_iteration_sectors(ImgConvertState > *s, int64_t sector_num) > n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE); > > /* > - * Avoid that s->sector_next_status becomes unaligned to the source > - * request alignment and/or cluster size to avoid unnecessary read > - * cycles. > + * Avoid that s->sector_next_status becomes unaligned to the > + * source/destination request alignment and/or cluster size to avoid > + * unnecessary read/write cycles. > */ > -tail = (sector_num - src_cur_offset + n) % s->src_alignment[src_cur]; > +alignment = MAX(s->src_alignment[src_cur], s->alignment); > +assert(is_power_of_2(alignment)); > + > +tail = (sector_num - src_cur_offset + n) % alignment; > if (n > tail) { > n -= tail; > } I was also considering including the s->alignment when adding this chance. However, you need the least common multiple of both alignments, not the maximum, otherwise you might get misaligned to either source or destination. Why exactly do you need the power of two requirement? Peter
Re: [Virtio-fs] [PATCH] virtiofsd: Use --thread-pool-size=0 to mean no thread pool
On Thu, Nov 12, 2020 at 12:34 PM Christian Schoenebeck wrote: > > On Donnerstag, 12. November 2020 10:06:37 CET Miklos Szeredi wrote: > > > > 9p:cache=none <-> virtiofs:cache=none > > 9p:cache=loose <-> virtiofs:cache=always > > > > "9p:cache=mmap" and "virtiofs:cache=auto" have no match. > > What does 'auto' do exactly? It has a configurable timeout (default is 1s) for the attribute and lookup cache and close-to-open consistency for data (flush on close, invalidate on open). This is similar to, but less refined as NFS. It's something that could easily be improved if there's a need for it. Thanks, Miklos
[Bug 1628971] Re: -netdev user: guestfwd doesn't work
** Changed in: qemu Status: New => Invalid -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1628971 Title: -netdev user: guestfwd doesn't work Status in QEMU: Invalid Bug description: Hello! QEMU emulator version 2.6.1 (Debian 1:2.6.1+dfsg-0ubuntu4), Copyright (c) 2003-2008 Fabrice Bellard The IP address 192.168.1.46 is assigned to eth0. qemu-system-x86_64 \ -no-hpet \ -nodefconfig \ -machine accel=kvm \ -cpu host \ -smp 2 \ -drive if=virtio,file=yakkety-server-cloudimg-amd64.img \ -device virtio-net-pci,netdev=net0 \ -netdev 'user,id=net0,hostfwd=tcp::-:22,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.46 8842' \ -m 1024 \ -initrd yakkety-server-cloudimg-amd64-initrd-generic \ -kernel yakkety-server-cloudimg-amd64-vmlinuz-generic \ -append 'root=/dev/vda1 modprobe.blacklist=floppy systemd.log_level=debug systemd.journald.forward_to_console=1' Without the guestfwd=... part everything works nicely. With it I get the following message. qemu-system-x86_64: -netdev user,id=net0,hostfwd=tcp::-:22,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.46 8842: conflicting/invalid host:port in guest forwarding rule 'tcp:1.2.3.4:1234-cmd:nc 192.168.1.46 8842' qemu-system-x86_64: -netdev user,id=net0,hostfwd=tcp::-:22,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.46 8842: Device 'user' could not be initialized But I just compiled c640f2849ee8775fe1bbd7a2772610aa77816f9f, and I get the same behavior. pas@strange:~/qemu/x86_64-softmmu$ ./qemu-system-x86_64 -net 'user,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.48 80' qemu-system-x86_64: -net user,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.48 80: conflicting/invalid host:port in guest forwarding rule 'tcp:1.2.3.4:1234-cmd:nc 192.168.1.48 80' qemu-system-x86_64: -net user,guestfwd=tcp:1.2.3.4:1234-cmd:nc 192.168.1.48 80: Device 'user' could not be initialized After poking a bit around it seems that this check fails in slirp/slirp.c: (around line 1074) if ((guest_addr->s_addr & slirp->vnetwork_mask.s_addr) != slirp->vnetwork_addr.s_addr || guest_addr->s_addr == slirp->vhost_addr.s_addr || guest_addr->s_addr == slirp->vnameserver_addr.s_addr) { return -1; } Because guest_addr, and slirp has equivalent s_addr values. x86_64-softmmu/qemu-system-x86_64 -net 'user,net=10.0.2.0/24,host=10.0.2.2,guestfwd=tcp:12.0.0.2:80-cmd:echo ok' guest_addr: 12.0.0.2 vnetwork_mask: 12.0.0.2 vhost_addr: 12.0.0.2 vnameserver_addr: 12.0.0.2 guest_addr & mask: 12.0.0.2 Thanks in advance for looking into this! To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1628971/+subscriptions
[Bug 1626207] Re: -device usb-host failing with usbip_vudc-vhdi_hcd gadget
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1626207 Title: -device usb-host failing with usbip_vudc-vhdi_hcd gadget Status in QEMU: Incomplete Bug description: I must admit that I do not know if this is a Qemu or a Kernel issue, but I try reporting here: When I try to forward a copy of my USB mouse that I made with the new virtual USB device controller in kernel 4.7 I get the following in my log: [ +0.703256] usb 1-4: reset full-speed USB device number 3 using xhci_hcd [ +1.020776] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.855013] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.859052] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.857000] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.000141] usb 7-3: USB disconnect, device number 10 [ +0.153056] usb 1-4: reset full-speed USB device number 3 using xhci_hcd [ +0.703746] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.855001] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.855006] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.855005] usb usb7-port3: Cannot enable. Maybe the USB cable is bad? [ +0.09] usb usb7-port3: unable to enumerate USB device the commands I use for makeing the virtual device are(after making a copy of the report description of my real mouse in /root/usb/kingston_report_desc): mkdir /sys/kernel/config/usb_gadget/winmouse cd /sys/kernel/config/usb_gadget/winmouse echo "0x047d" > idVendor echo "0x1020" > idProduct mkdir strings/0x409 echo 2016 > strings/0x409/serialnumber echo Kensington > strings/0x409/manufacturer echo "Kensington Expert Mouse" > strings/0x409/product mkdir configs/c.1 mkdir configs/c.1/strings/0x409 mkdir functions/hid.usb2 echo 2 > functions/hid.usb2/protocol echo 1 > functions/hid.usb2/subclass echo 4 > functions/hid.usb2/report_length cat /root/usb/kingston_report_desc > functions/hid.usb2/report_desc echo 0xa0 > configs/c.1/bmAttributes echo 100 > configs/c.1/MaxPower ln -s functions/hid.usb2 configs/c.1 echo usbip-vudc.2 > UDC usbip attach -r localhost -d usbip-vudc.2 The virtual mouse then shows up as Bus7,Dev10 and I use the option -device usb-host,hostbus=7,hostaddr=10,id=hostdev6,bus=usb.0,port=2 in Qemu to attach it. This is Qemu 2.7.0 on kernel 4.7.4 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1626207/+subscriptions
[Bug 1626596] Re: Lockup with vhost network
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1626596 Title: Lockup with vhost network Status in QEMU: Incomplete Bug description: After using Qemu in this configuration successfully for quite a while, I changed two things: - moved the VM from a 8-core 4GHz host to a slower 2-core 1.6 Ghz machine - upgraded qemu from 2.1 to 2.5 and almost immediately (in a couple hours) got hit with a vhost- related lockup. QEMU command line is: qemu-system-x86_64 -enable-kvm -daemonize -monitor unix:./monitor,server,nowait -cpu host -M q35 -balloon virtio -device virtio-scsi-pci,id=scsi -device scsi-hd,drive=hd -drive if=none,id=hd,cache=writeback,aio=native,format=raw,file=.img,discard=unmap ,detect-zeroes=unmap -device virtio-net-pci,netdev=net0,mac= -netdev tap,vhost=on,id=net0,script=.sh -usbdevice tablet -smp 2 -m 512 -vnc :yz VM was running fine, except no network traffic was passed from/to it. Shutting down the VM, it hung at "Will now halt." The QEMU process was unkillable, so the only choice was to sysrq-b the entire box. dmesg with sysrq-w attached. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1626596/+subscriptions
[Bug 1668273] Re: DoS possible on - a QEMU process using userspace SLIRP?
Slirp has been moved to an external project now. If this is still an issue, please report the problem there instead: https://gitlab.freedesktop.org/slirp/libslirp ** Changed in: qemu Status: New => Won't Fix -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1668273 Title: DoS possible on - a QEMU process using userspace SLIRP? Status in QEMU: Won't Fix Bug description: Steps to reproduce: - Launch a VM using QEMU (2.8.0): $ qemu-system-x86_64 \ -machine accel=kvm \ -hda Fedora-Cloud-Base-25-1.3.x86_64.qcow2 \ -m 2G \ -smp 2 \ -vnc :8 \ -boot dc \ -vga std \ -cpu host \ -net nic,vlan=0 \ -net user,vlan=0,hostfwd=tcp::10024-:22,hostfwd=tcp::8082-:80 - SSH into the VM, install httpd, start httpd $ ssh -p 10024 root@localhost 'dnf install -y httpd && systemctl start httpd' - Compile and run the following Java program (on the host): $ cat < URLConnectionReader.java import java.net.*; import java.io.*; public class URLConnectionReader { public static void main(String[] args) throws Exception { int i = 0; while (i < 1024) { URL this_is_404 = new URL("http://localhost:8082/blah";); URLConnection yc = this_is_404.openConnection(); try { BufferedReader in = new BufferedReader(new InputStreamReader( yc.getInputStream())); String inputLine; while ((inputLine = in.readLine()) != null) System.out.println(inputLine); in.close(); } catch (Exception e) { //HttpURLConnection urlConnection = (HttpURLConnection) yc; //urlConnection.disconnect(); } i++; } Thread.sleep(10); } } $ javac URLConnectionReader.java $ java URLConnectionReader & The java program tries to open a lot of HTTP connections, but never calls disconnect() on any. - Take a look at the list of open FDs of the qemu process: $ ls -tl /proc/${qemu-pid}/fd $ lsof -p ${qemu-pid} All of the TCP connections will be stuck at FIN_WAIT2 The VM becomes unresponsive. Neither SSH or VNC works after this; even after tcp_fin_timeout expires. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1668273/+subscriptions
Re: [PULL 0/1] qemu-macppc queue 20201112
On Thu, 12 Nov 2020 at 09:56, Mark Cave-Ayland wrote: > > The following changes since commit a4c141dca466ed3e9451f147efe6304b1b659ff5: > > Merge remote-tracking branch > 'remotes/stefanha-gitlab/tags/tracing-pull-request' into staging (2020-11-11 > 19:53:15 +) > > are available in the Git repository at: > > git://github.com/mcayland/qemu.git tags/qemu-macppc-20201112 > > for you to fetch changes up to 6bfa035ec31f4f5a14499f17e08f62e8f14760cc: > > macio: set user_creatable to false in macio_class_init() (2020-11-12 > 09:26:20 +) > > > qemu-macppc fix for 5.2 > > Applied, thanks. Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2 for any user-visible changes. -- PMM
Re: [PATCH 1/1] virtio-blk-ccw: tweak the default for num_queues
On Wed, 11 Nov 2020 13:49:08 +0100 Michael Mueller wrote: > Halil, > > still I would like to know what the exact memory consumption per queue > is that you are talking about. Have you made a calculation? Thanks. Hi! The default size for virtio-blk seems to be 256 ring entries, which translates to 6668 bytes for the split ring(s). The queue size is user configurable, and guest, in theory, can decide to have a smaller queue. The indirect descriptors are allocated separately, and bounced via swiotlb in case of secure guests. Regards, Halil
Re: [PATCH 2/2] qemu-img: align next status sector on destination alignment.
On 11/12/20 6:40 AM, Peter Lieven wrote: >> /* >> - * Avoid that s->sector_next_status becomes unaligned to the source >> - * request alignment and/or cluster size to avoid unnecessary read >> - * cycles. >> + * Avoid that s->sector_next_status becomes unaligned to the >> + * source/destination request alignment and/or cluster size to avoid >> + * unnecessary read/write cycles. >> */ >> -tail = (sector_num - src_cur_offset + n) % >> s->src_alignment[src_cur]; >> +alignment = MAX(s->src_alignment[src_cur], s->alignment); >> +assert(is_power_of_2(alignment)); >> + >> +tail = (sector_num - src_cur_offset + n) % alignment; >> if (n > tail) { >> n -= tail; >> } > > > I was also considering including the s->alignment when adding this chance. > However, you need the least common multiple of both alignments, not the > maximum, otherwise > > you might get misaligned to either source or destination. > > > Why exactly do you need the power of two requirement? The power of two requirement ensures that you h ave the least common multiple of both alignments ;) However, there ARE devices in practice that have advertised a non-power-of-2 discard granularity, such as 15MiB (see commits 430b26a8, 63188c24). Which means you probably don't want to assert power-of-2, and in turn need to worry about least common multiple. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Re: [PATCH] vl, qemu-config: remove -set
On 12/11/20 12:26, Gerd Hoffmann wrote: Yes, that is quite useful for setting device properties which are not (yet) supported by libvirt, like this: Grepping through my libvirt domain config files I see 90% is indeed "-set device". But I've also found netdev (set tftp+bootfile for ""). Hmm... Looks like I will move -set from config-file to vl.c and handle both QemuOpts and keyval-based options. Paolo
Re: [PATCH-for-5.2 4/4] migration/ram: Fix hexadecimal format string specifier
* Dr. David Alan Gilbert (dgilb...@redhat.com) wrote: > * Philippe Mathieu-Daud̩ (phi...@redhat.com) wrote: > > Hi David, Juan. > > > > On 11/3/20 1:46 PM, Dr. David Alan Gilbert wrote: > > > * Philippe Mathieu-Daud̮̩ (phi...@redhat.com) wrote: > > >> The '%u' conversion specifier is for decimal notation. > > >> When prefixing a format with '0x', we want the hexadecimal > > >> specifier ('%x'). > > >> > > >> Inspired-by: Dov Murik > > >> Signed-off-by: Philippe Mathieu-Daud̮̩ > > > > > > Reviewed-by: Dr. David Alan Gilbert > > > > As there is no qemu-trivial@ pull request during freeze/rc, > > can you queue this patch via your migration tree? > > Yep, will do. Queued > Dave > > > Thanks, > > > > Phil. > > > > > > > >> --- > > >> migration/ram.c | 2 +- > > >> 1 file changed, 1 insertion(+), 1 deletion(-) > > >> > > >> diff --git a/migration/ram.c b/migration/ram.c > > >> index 2da2b622ab2..23dcfb3ac38 100644 > > >> --- a/migration/ram.c > > >> +++ b/migration/ram.c > > >> @@ -3729,7 +3729,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, > > >> RAMBlock *block) > > >> } > > >> > > >> if (end_mark != RAMBLOCK_RECV_BITMAP_ENDING) { > > >> -error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIu64, > > >> +error_report("%s: ramblock '%s' end mark incorrect: 0x%"PRIx64, > > >> __func__, block->idstr, end_mark); > > >> ret = -EINVAL; > > >> goto out; > > >> -- > > >> 2.26.2 > > >> > > > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: Command line QAPIfication and -readconfig
On Wed, 2020-11-11 at 11:35 +0100, Kevin Wolf wrote: > Am 11.11.2020 um 11:14 hat Daniel P. Berrangé geschrieben: > > Normally we would not mark something as deprecated unless its replacement > > is ready, because telling people "stop using this but the replacement > > doesn't exist yet" is not a nice message as there is no action users can > > take to deal with the deprecation. > > But there is a replacement: Put everything back into the command line > and keep it in a shell script. Config files with -readconfig were never > complete enough to fully describe a VM, so it's not too unlikely that > you'll already have that script anyway. This is correct: in fact... > > We might question whether -readconfig has any users but I would note > > that our own documentation illustrates its usage, and provides several > > example configs > > [...] > > config/mach-virt-graphical.cfg:# -readconfig mach-virt-graphical.cfg \ > > config/mach-virt-serial.cfg:# -readconfig mach-virt-serial.cfg \ > > config/q35-emulated.cfg:# -readconfig q35-emulated.cfg > > config/q35-virtio-graphical.cfg:# -readconfig q35-virtio-graphical.cfg > > config/q35-virtio-serial.cfg:# -readconfig q35-virtio-serial.cfg \ [...] ... these configuration files, which I contributed some time ago, all start with something along the lines of # Usage: # # $ qemu-system-aarch64 \ # -nodefaults \ # -readconfig mach-virt-serial.cfg \ # -display none -serial mon:stdio \ # -cpu host because it was simply impossible to provide QEMU with all the settings necessary to obtain the desired virtual hardware using the configuration file only. -- Andrea Bolognani / Red Hat / Virtualization
Re: [PATCH v2] ACPI: Avoid infinite recursion when dump-vmstate
* Peng Liang (liangpen...@huawei.com) wrote: > There is a field with vmstate_ghes_state as vmsd in vmstate_ghes_state, > which will lead to infinite recursion in dump_vmstate_vmsd. > > Fixes: a08a64627b ("ACPI: Record the Generic Error Status Block address") > Reported-by: Euler Robot > Signed-off-by: Peng Liang > Acked-by: Igor Mammedov Queued for migration > --- > hw/acpi/generic_event_device.c | 12 +++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c > index 6df400e1ee16..5454be67d5f0 100644 > --- a/hw/acpi/generic_event_device.c > +++ b/hw/acpi/generic_event_device.c > @@ -322,6 +322,16 @@ static const VMStateDescription vmstate_ged_state = { > } > }; > > +static const VMStateDescription vmstate_ghes = { > +.name = "acpi-ghes", > +.version_id = 1, > +.minimum_version_id = 1, > +.fields = (VMStateField[]) { > +VMSTATE_UINT64(ghes_addr_le, AcpiGhesState), > +VMSTATE_END_OF_LIST() > +}, > +}; > + > static bool ghes_needed(void *opaque) > { > AcpiGedState *s = opaque; > @@ -335,7 +345,7 @@ static const VMStateDescription vmstate_ghes_state = { > .needed = ghes_needed, > .fields = (VMStateField[]) { > VMSTATE_STRUCT(ghes_state, AcpiGedState, 1, > - vmstate_ghes_state, AcpiGhesState), > + vmstate_ghes, AcpiGhesState), > VMSTATE_END_OF_LIST() > } > }; > -- > 2.26.2 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Bug 1778966] Re: Windows 1803 and later crashes on KVM
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1778966 Title: Windows 1803 and later crashes on KVM Status in QEMU: Incomplete Bug description: For a bionic host, using the current public kvm modules, KVM is not capable of booting a WindowsInsider or msdn Windows 1803 Windows Server iso. In snstalling from an ISO from a started windows 2016 guest results in an unbootable and unrepairable guest. The hardware is a threadripper 1920x with 32GB of main memory, disk mydigital BPX SSD and WD based 4 column RAID 5 via mdadm. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1778966/+subscriptions
[Bug 1779120] Re: disk missing in the guest contingently when hotplug several virtio scsi disks consecutively
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1779120 Title: disk missing in the guest contingently when hotplug several virtio scsi disks consecutively Status in QEMU: Incomplete Bug description: Hi, I found a bug that disk missing (not all disks missing ) in the guest contingently when hotplug several virtio scsi disks consecutively. After rebooting the guest,the missing disks appear again. The guest is centos7.3 running on a centos7.3 host and the scsi controllers are configed with iothread. The scsi controller xml is below: If the scsi controllers are configed without iothread, disks are all can be seen in the guest when hotplug several virtio scsi disks consecutively. I think the biggest difference between them is that scsi controllers with iothread call virtio_notify_irqfd to notify guest and scsi controllers without iothread call virtio_notify instead. What make it difference? Will interrupts are lost when call virtio_notify_irqfd due to race condition for some unknow reasons? Maybe guys more familiar with scsi dataplane can help. Thanks for your reply! To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1779120/+subscriptions
Re: [PATCH for-5.2 04/10] vhost-user-blk-test: rename destroy_drive() to destroy_file()
On 11.11.20 13:43, Stefan Hajnoczi wrote: The function is used not just for image files but also for UNIX domain sockets (QMP monitor and vhost-user-blk). Reflect that in the name. Signed-off-by: Stefan Hajnoczi --- tests/qtest/vhost-user-blk-test.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) [...] @@ -731,7 +732,7 @@ static char *start_vhost_user_blk(GString *cmd_line, int vus_instances, sock_path = g_strdup(sock_path_tempate); fd = mkstemp(sock_path); g_assert_cmpint(fd, >=, 0); -g_test_queue_destroy(drive_destroy, sock_path); +g_test_queue_destroy(drive_file, sock_path); s/drive_file/destroy_file/, I think :) /* create image file */ img_path = drive_create(); g_string_append_printf(storage_daemon_command,
Re: [PATCH] migration/multifd: fix hangup with TLS-Multifd due to blocking handshake
* Chuan Zheng (zhengch...@huawei.com) wrote: > The qemu main loop could hang up forever when we enable TLS+Multifd. > The Src multifd_send_0 invokes tls handshake, it sends hello to sever > and wait response. > However, the Dst main qemu loop has been waiting recvmsg() for multifd_recv_1. > Both of Src and Dst main qemu loop are blocking and waiting for reponse which > results in hanging up forever. > > Src: (multifd_send_0) Dst: > (multifd_recv_1) > multifd_channel_connect > migration_channel_process_incoming > multifd_tls_channel_connect > migration_tls_channel_process_incoming > multifd_tls_channel_connect > qio_channel_tls_handshake_task >qio_channel_tls_handshake > gnutls_handshake > qio_channel_tls_handshake_task > ... > qcrypto_tls_session_handshake > ... > gnutls_handshake > ... >... > ... > recvmsg (Blocking I/O waiting for response) > recvmsg (Blocking I/O waiting for response) > > Fix this by offloadinig handshake work to a background thread. > > Reported-by: Yan Jin > Suggested-by: Daniel P. Berrangé > Signed-off-by: Chuan Zheng Queued > --- > migration/multifd.c | 23 +-- > 1 file changed, 17 insertions(+), 6 deletions(-) > > diff --git a/migration/multifd.c b/migration/multifd.c > index 68b171f..88486b9 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -739,6 +739,19 @@ static void multifd_tls_outgoing_handshake(QIOTask *task, > multifd_channel_connect(p, ioc, err); > } > > +static void *multifd_tls_handshake_thread(void *opaque) > +{ > +MultiFDSendParams *p = opaque; > +QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c); > + > +qio_channel_tls_handshake(tioc, > + multifd_tls_outgoing_handshake, > + p, > + NULL, > + NULL); > +return NULL; > +} > + > static void multifd_tls_channel_connect(MultiFDSendParams *p, > QIOChannel *ioc, > Error **errp) > @@ -754,12 +767,10 @@ static void > multifd_tls_channel_connect(MultiFDSendParams *p, > > trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname); > qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing"); > -qio_channel_tls_handshake(tioc, > - multifd_tls_outgoing_handshake, > - p, > - NULL, > - NULL); > - > +p->c = QIO_CHANNEL(tioc); > +qemu_thread_create(&p->thread, "multifd-tls-handshake-worker", > + multifd_tls_handshake_thread, p, > + QEMU_THREAD_JOINABLE); > } > > static bool multifd_channel_connect(MultiFDSendParams *p, > -- > 1.8.3.1 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[for-5.2 0/9] docs: Move orphan top-level .rst files into manuals
Currently we have a handful of rST documents that are sat in the top level docs/ directory and do not get built into the manuals. These are a legacy from the period after we'd decided we wanted rST format documentation but before we'd set up the manual structure. This patchset moves them all into at least plausibly suitable places in the manual set: * virtio-net-failover, cpu-hotplug, virtio-pmem all go into the system manual * microvm goes into the system manual, but first we have to create a structure in target-i386.rst that lets us have a list of multiple machine types (along the pattern that target-arm.rst does) * pr-manager.rst goes into the system manual, but the part of it documenting the qemu-pr-helper executable needs to go into the tools manual If anybody who cares about the x86 machine models would like to create some documentation of the others ("q35", "isapc", "xenpv", "xenfv") you now have a place for it to live :-) Since there's no good way to cross-reference between different manuals there is no direct link between the pr-manager.rst and the qemu-pr-helper.rst; my proposal for fixing that is the recent "build a single manual, not five" RFC. thanks -- PMM Peter Maydell (9): docs: Move virtio-net-failover.rst into the system manual docs: Move cpu-hotplug.rst into the system manual docs: Move virtio-pmem.rst into the system manual docs/system/virtio-pmem.rst: Fix minor style issues docs: Split out 'pc' machine model docs into their own file docs: Move microvm.rst into the system manual docs: Move pr-manager.rst into the system manual docs: Split qemu-pr-helper documentation into tools manual docs/system/pr-manager.rst: Fix minor docs nits docs/meson.build | 1 + docs/{ => system}/cpu-hotplug.rst | 0 docs/{ => system/i386}/microvm.rst| 5 +- docs/system/i386/pc.rst | 7 ++ docs/system/index.rst | 4 + docs/{ => system}/pr-manager.rst | 44 ++- docs/system/target-i386.rst | 19 +++-- docs/{ => system}/virtio-net-failover.rst | 0 docs/system/virtio-pmem.rst | 75 +++ docs/tools/conf.py| 2 + docs/tools/index.rst | 1 + docs/tools/qemu-pr-helper.rst | 90 +++ docs/virtio-pmem.rst | 76 --- 13 files changed, 204 insertions(+), 120 deletions(-) rename docs/{ => system}/cpu-hotplug.rst (100%) rename docs/{ => system/i386}/microvm.rst (98%) create mode 100644 docs/system/i386/pc.rst rename docs/{ => system}/pr-manager.rst (68%) rename docs/{ => system}/virtio-net-failover.rst (100%) create mode 100644 docs/system/virtio-pmem.rst create mode 100644 docs/tools/qemu-pr-helper.rst delete mode 100644 docs/virtio-pmem.rst -- 2.20.1
[for-5.2 3/9] docs: Move virtio-pmem.rst into the system manual
Signed-off-by: Peter Maydell --- docs/system/index.rst | 1 + docs/{ => system}/virtio-pmem.rst | 0 2 files changed, 1 insertion(+) rename docs/{ => system}/virtio-pmem.rst (100%) diff --git a/docs/system/index.rst b/docs/system/index.rst index 0f0f6d2e99d..2a5155c67dc 100644 --- a/docs/system/index.rst +++ b/docs/system/index.rst @@ -30,6 +30,7 @@ Contents: gdb managed-startup cpu-hotplug + virtio-pmem targets security deprecated diff --git a/docs/virtio-pmem.rst b/docs/system/virtio-pmem.rst similarity index 100% rename from docs/virtio-pmem.rst rename to docs/system/virtio-pmem.rst -- 2.20.1
[for-5.2 4/9] docs/system/virtio-pmem.rst: Fix minor style issues
The virtio-pmem documentation has some minor style issues we hadn't noticed since we weren't rendering it in our docs: * Sphinx doesn't complain about overlong title-underlining the way it complains about too-short underlining, but it looks odd; make the underlines of the top level title the right length * Indent of paragraphs makes them render as blockquotes; remove the indent so they just render as normal text * Leading 'o' isn't rst markup, so it just renders as a literal "o"; reformat as a subsection heading instead * "QEMU" in the document title and section headings are a bit odd and unnecessary since this is the QEMU manual; delete or rephrase them * There's no need to specify what QEMU version the device first appeared in. Signed-off-by: Peter Maydell --- docs/system/virtio-pmem.rst | 55 ++--- 1 file changed, 27 insertions(+), 28 deletions(-) diff --git a/docs/system/virtio-pmem.rst b/docs/system/virtio-pmem.rst index 4bf5d004432..e5f91eff1c2 100644 --- a/docs/system/virtio-pmem.rst +++ b/docs/system/virtio-pmem.rst @@ -1,38 +1,37 @@ - -QEMU virtio pmem - +=== +virtio pmem +=== - This document explains the setup and usage of the virtio pmem device - which is available since QEMU v4.1.0. - - The virtio pmem device is a paravirtualized persistent memory device - on regular (i.e non-NVDIMM) storage. +This document explains the setup and usage of the virtio pmem device. +The virtio pmem device is a paravirtualized persistent memory device +on regular (i.e non-NVDIMM) storage. Usecase - Virtio pmem allows to bypass the guest page cache and directly use - host page cache. This reduces guest memory footprint as the host can - make efficient memory reclaim decisions under memory pressure. +Virtio pmem allows to bypass the guest page cache and directly use +host page cache. This reduces guest memory footprint as the host can +make efficient memory reclaim decisions under memory pressure. -o How does virtio-pmem compare to the nvdimm emulation supported by QEMU? +How does virtio-pmem compare to the nvdimm emulation? +- - NVDIMM emulation on regular (i.e. non-NVDIMM) host storage does not - persist the guest writes as there are no defined semantics in the device - specification. The virtio pmem device provides guest write persistence - on non-NVDIMM host storage. +NVDIMM emulation on regular (i.e. non-NVDIMM) host storage does not +persist the guest writes as there are no defined semantics in the device +specification. The virtio pmem device provides guest write persistence +on non-NVDIMM host storage. virtio pmem usage - - A virtio pmem device backed by a memory-backend-file can be created on - the QEMU command line as in the following example:: +A virtio pmem device backed by a memory-backend-file can be created on +the QEMU command line as in the following example:: -object memory-backend-file,id=mem1,share,mem-path=./virtio_pmem.img,size=4G -device virtio-pmem-pci,memdev=mem1,id=nv1 - where: +where: - "object memory-backend-file,id=mem1,share,mem-path=, size=" creates a backend file with the specified size. @@ -40,8 +39,8 @@ virtio pmem usage - "device virtio-pmem-pci,id=nvdimm1,memdev=mem1" creates a virtio pmem pci device whose storage is provided by above memory backend device. - Multiple virtio pmem devices can be created if multiple pairs of "-object" - and "-device" are provided. +Multiple virtio pmem devices can be created if multiple pairs of "-object" +and "-device" are provided. Hotplug --- @@ -59,14 +58,14 @@ the guest:: Guest Data Persistence -- - Guest data persistence on non-NVDIMM requires guest userspace applications - to perform fsync/msync. This is different from a real nvdimm backend where - no additional fsync/msync is required. This is to persist guest writes in - host backing file which otherwise remains in host page cache and there is - risk of losing the data in case of power failure. +Guest data persistence on non-NVDIMM requires guest userspace applications +to perform fsync/msync. This is different from a real nvdimm backend where +no additional fsync/msync is required. This is to persist guest writes in +host backing file which otherwise remains in host page cache and there is +risk of losing the data in case of power failure. - With virtio pmem device, MAP_SYNC mmap flag is not supported. This provides - a hint to application to perform fsync for write persistence. +With virtio pmem device, MAP_SYNC mmap flag is not supported. This provides +a hint to application to perform fsync for write persistence. Limitations -- 2.20.1
[for-5.2 5/9] docs: Split out 'pc' machine model docs into their own file
Currently target-i386.rst includes the documentation of the 'pc' machine model inline. Split it out into its own file, in a similar way to target-i386.rst; this gives us a place to put documentation of other i386 machine models, such as 'microvm'. Signed-off-by: Peter Maydell --- docs/system/i386/pc.rst | 7 +++ docs/system/target-i386.rst | 18 +- 2 files changed, 20 insertions(+), 5 deletions(-) create mode 100644 docs/system/i386/pc.rst diff --git a/docs/system/i386/pc.rst b/docs/system/i386/pc.rst new file mode 100644 index 000..d543c11a5cd --- /dev/null +++ b/docs/system/i386/pc.rst @@ -0,0 +1,7 @@ +i440fx PC (``pc-i440fx``, ``pc``) += + +Peripherals +~~~ + +.. include:: ../target-i386-desc.rst.inc diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst index 51be03d881f..1612ddba907 100644 --- a/docs/system/target-i386.rst +++ b/docs/system/target-i386.rst @@ -1,14 +1,22 @@ .. _QEMU-PC-System-emulator: -x86 (PC) System emulator - +x86 System emulator +--- .. _pcsys_005fdevices: -Peripherals -~~~ +Board-specific documentation + -.. include:: target-i386-desc.rst.inc +.. + This table of contents should be kept sorted alphabetically + by the title text of each file, which isn't the same ordering + as an alphabetical sort by filename. + +.. toctree:: + :maxdepth: 1 + + i386/pc .. include:: cpu-models-x86.rst.inc -- 2.20.1
[for-5.2 1/9] docs: Move virtio-net-failover.rst into the system manual
The virtio-net-failover documentation is currently orphan and not included in any manual; move it into the system manual, immediately following the general network emulation section. Signed-off-by: Peter Maydell --- docs/system/index.rst | 1 + docs/{ => system}/virtio-net-failover.rst | 0 2 files changed, 1 insertion(+) rename docs/{ => system}/virtio-net-failover.rst (100%) diff --git a/docs/system/index.rst b/docs/system/index.rst index c0f685b818e..d0613cd5f72 100644 --- a/docs/system/index.rst +++ b/docs/system/index.rst @@ -21,6 +21,7 @@ Contents: monitor images net + virtio-net-failover usb ivshmem linuxboot diff --git a/docs/virtio-net-failover.rst b/docs/system/virtio-net-failover.rst similarity index 100% rename from docs/virtio-net-failover.rst rename to docs/system/virtio-net-failover.rst -- 2.20.1
[for-5.2 6/9] docs: Move microvm.rst into the system manual
Now that target-i386.rst has a place to list documentation of machines other than the 'pc' machine, we have a place we can move the microvm documentation to. Signed-off-by: Peter Maydell --- docs/{ => system/i386}/microvm.rst | 5 ++--- docs/system/target-i386.rst| 1 + 2 files changed, 3 insertions(+), 3 deletions(-) rename docs/{ => system/i386}/microvm.rst (98%) diff --git a/docs/microvm.rst b/docs/system/i386/microvm.rst similarity index 98% rename from docs/microvm.rst rename to docs/system/i386/microvm.rst index fcf41fc1f6f..1675e37d3e7 100644 --- a/docs/microvm.rst +++ b/docs/system/i386/microvm.rst @@ -1,6 +1,5 @@ - -microvm Machine Type - +'microvm' virtual platform (``microvm``) + ``microvm`` is a machine type inspired by ``Firecracker`` and constructed after its machine model. diff --git a/docs/system/target-i386.rst b/docs/system/target-i386.rst index 1612ddba907..22ba5ce2c0f 100644 --- a/docs/system/target-i386.rst +++ b/docs/system/target-i386.rst @@ -16,6 +16,7 @@ Board-specific documentation .. toctree:: :maxdepth: 1 + i386/microvm i386/pc .. include:: cpu-models-x86.rst.inc -- 2.20.1
[for-5.2 2/9] docs: Move cpu-hotplug.rst into the system manual
The cpu-hotplug.rst documentation is currently orphan and not included in any manual; move it into the system manual. Signed-off-by: Peter Maydell --- docs/{ => system}/cpu-hotplug.rst | 0 docs/system/index.rst | 1 + 2 files changed, 1 insertion(+) rename docs/{ => system}/cpu-hotplug.rst (100%) diff --git a/docs/cpu-hotplug.rst b/docs/system/cpu-hotplug.rst similarity index 100% rename from docs/cpu-hotplug.rst rename to docs/system/cpu-hotplug.rst diff --git a/docs/system/index.rst b/docs/system/index.rst index d0613cd5f72..0f0f6d2e99d 100644 --- a/docs/system/index.rst +++ b/docs/system/index.rst @@ -29,6 +29,7 @@ Contents: tls gdb managed-startup + cpu-hotplug targets security deprecated -- 2.20.1
[for-5.2 9/9] docs/system/pr-manager.rst: Fix minor docs nits
Fix a couple of nits in pr-manager.rst: * the title marker for the top level heading is overlength * stray capital 'R' in the middle of a sentence Signed-off-by: Peter Maydell --- docs/system/pr-manager.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/system/pr-manager.rst b/docs/system/pr-manager.rst index 3f5b9f94dcd..b19a0c15e66 100644 --- a/docs/system/pr-manager.rst +++ b/docs/system/pr-manager.rst @@ -1,8 +1,8 @@ -== +=== Persistent reservation managers -== +=== -SCSI persistent Reservations allow restricting access to block devices +SCSI persistent reservations allow restricting access to block devices to specific initiators in a shared storage setup. When implementing clustering of virtual machines, it is a common requirement for virtual machines to send persistent reservation SCSI commands. However, -- 2.20.1
[for-5.2 7/9] docs: Move pr-manager.rst into the system manual
Move the pr-manager documentation into the system manual. Some of it (the documentation of the pr-manager-helper tool) should be in tools, but we will split it up after moving it. Signed-off-by: Peter Maydell --- docs/system/index.rst| 1 + docs/{ => system}/pr-manager.rst | 0 2 files changed, 1 insertion(+) rename docs/{ => system}/pr-manager.rst (100%) diff --git a/docs/system/index.rst b/docs/system/index.rst index 2a5155c67dc..e5a35817a24 100644 --- a/docs/system/index.rst +++ b/docs/system/index.rst @@ -31,6 +31,7 @@ Contents: managed-startup cpu-hotplug virtio-pmem + pr-manager targets security deprecated diff --git a/docs/pr-manager.rst b/docs/system/pr-manager.rst similarity index 100% rename from docs/pr-manager.rst rename to docs/system/pr-manager.rst -- 2.20.1
[for-5.2 8/9] docs: Split qemu-pr-helper documentation into tools manual
Split the documentation of the qemu-pr-helper binary into the tools manual, and give it a manpage like our other standalone executables. Signed-off-by: Peter Maydell --- docs/meson.build | 1 + docs/system/pr-manager.rst| 38 ++- docs/tools/conf.py| 2 + docs/tools/index.rst | 1 + docs/tools/qemu-pr-helper.rst | 90 +++ 5 files changed, 99 insertions(+), 33 deletions(-) create mode 100644 docs/tools/qemu-pr-helper.rst diff --git a/docs/meson.build b/docs/meson.build index bf8204a08fa..ebd85d59f98 100644 --- a/docs/meson.build +++ b/docs/meson.build @@ -60,6 +60,7 @@ if build_docs 'tools': { 'qemu-img.1': (have_tools ? 'man1' : ''), 'qemu-nbd.8': (have_tools ? 'man8' : ''), +'qemu-pr-helper.8': (have_tools ? 'man8' : ''), 'qemu-trace-stap.1': (config_host.has_key('CONFIG_TRACE_SYSTEMTAP') ? 'man1' : ''), 'virtfs-proxy-helper.1': (have_virtfs_proxy_helper ? 'man1' : ''), 'virtiofsd.1': (have_virtiofsd ? 'man1' : ''), diff --git a/docs/system/pr-manager.rst b/docs/system/pr-manager.rst index 9b1de198b1b..3f5b9f94dcd 100644 --- a/docs/system/pr-manager.rst +++ b/docs/system/pr-manager.rst @@ -50,39 +50,11 @@ Alternatively, using ``-blockdev``:: -blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0 -device scsi-block,drive=hd --- -Invoking :program:`qemu-pr-helper` --- - -QEMU provides an implementation of the persistent reservation helper, -called :program:`qemu-pr-helper`. The helper should be started as a -system service and supports the following option: - --d, --daemon run in the background --q, --quiet decrease verbosity --v, --verbose increase verbosity --f, --pidfile=pathPID file when running as a daemon --k, --socket=path path to the socket --T, --trace=trace-optstracing options - -By default, the socket and PID file are placed in the runtime state -directory, for example :file:`/var/run/qemu-pr-helper.sock` and -:file:`/var/run/qemu-pr-helper.pid`. The PID file is not created -unless :option:`-d` is passed too. - -:program:`qemu-pr-helper` can also use the systemd socket activation -protocol. In this case, the systemd socket unit should specify a -Unix stream socket, like this:: - -[Socket] -ListenStream=/var/run/qemu-pr-helper.sock - -After connecting to the socket, :program:`qemu-pr-helper`` can optionally drop -root privileges, except for those capabilities that are needed for -its operation. To do this, add the following options: - --u, --user=user user to drop privileges to --g, --group=group group to drop privileges to +You will also need to ensure that the helper program +:command:`qemu-pr-helper` is running, and that it has been +set up to use the same socket filename as your QEMU commandline +specifies. See the qemu-pr-helper documentation or manpage for +further details. - Multipath devices and persistent reservations diff --git a/docs/tools/conf.py b/docs/tools/conf.py index 9052d17d6d4..4760d36ff2a 100644 --- a/docs/tools/conf.py +++ b/docs/tools/conf.py @@ -22,6 +22,8 @@ man_pages = [ ['Fabrice Bellard'], 1), ('qemu-nbd', 'qemu-nbd', u'QEMU Disk Network Block Device Server', ['Anthony Liguori '], 8), +('qemu-pr-helper', 'qemu-pr-helper', 'QEMU persistent reservation helper', + [], 8), ('qemu-trace-stap', 'qemu-trace-stap', u'QEMU SystemTap trace tool', [], 1), ('virtfs-proxy-helper', 'virtfs-proxy-helper', diff --git a/docs/tools/index.rst b/docs/tools/index.rst index 232ce9f3e46..b99f86c7c66 100644 --- a/docs/tools/index.rst +++ b/docs/tools/index.rst @@ -12,6 +12,7 @@ Contents: qemu-img qemu-nbd + qemu-pr-helper qemu-trace-stap virtfs-proxy-helper virtiofsd diff --git a/docs/tools/qemu-pr-helper.rst b/docs/tools/qemu-pr-helper.rst new file mode 100644 index 000..ac036180ac1 --- /dev/null +++ b/docs/tools/qemu-pr-helper.rst @@ -0,0 +1,90 @@ +QEMU persistent reservation helper +== + +Synopsis + + +**qemu-pr-helper** [*OPTION*] + +Description +--- + +Implements the persistent reservation helper for QEMU. + +SCSI persistent reservations allow restricting access to block devices +to specific initiators in a shared storage setup. When implementing +clustering of virtual machines, it is a common requirement for virtual +machines to send persistent reservation SCSI commands. However, +the operating system restricts sending these commands to unprivileged +programs because incorrect usage can disrupt regular operation of the +storage fabric. QEMU's SCSI passthrough devices ``scsi-block`` +and ``scsi-generic`` support passing guest persistent reservation +re
Re: [PATCH v2 5/5] migration: fix uninitialized variable warning in migrate_send_rp_req_pages()
* Chen Qun (kuhn.chen...@huawei.com) wrote: > After the WITH_QEMU_LOCK_GUARD macro is added, the compiler cannot identify > that the statements in the macro must be executed. As a result, some > variables > assignment statements in the macro may be considered as unexecuted by the > compiler. > > When the -Wmaybe-uninitialized capability is enabled on GCC9,the compiler > showed warning: > migration/migration.c: In function ‘migrate_send_rp_req_pages’: > migration/migration.c:384:8: warning: ‘received’ may be used uninitialized in > this function [-Wmaybe-uninitialized] > 384 | if (received) { > |^ > > Add a default value for 'received' to prevented the warning. > > Reported-by: Euler Robot > Signed-off-by: Chen Qun > Reviewed-by: Philippe Mathieu-Daudé Queuing this one via migration > --- > Cc: Juan Quintela > Cc: "Dr. David Alan Gilbert" > --- > migration/migration.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/migration/migration.c b/migration/migration.c > index 9bb4fee5ac..de90486a61 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -361,7 +361,7 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, >RAMBlock *rb, ram_addr_t start, uint64_t haddr) > { > void *aligned = (void *)(uintptr_t)(haddr & (-qemu_ram_pagesize(rb))); > -bool received; > +bool received = false; > > WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) { > received = ramblock_recv_bitmap_test_byte_offset(rb, start); > -- > 2.27.0 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH v2 00/44] Make qdev static property API usable by any QOM type
On Thu, Nov 12, 2020 at 09:11:48AM +0100, Paolo Bonzini wrote: > On 11/11/20 19:39, Eduardo Habkost wrote: > > I will submit v3 of this series with both > > object_class_property_add_field() and > > object_class_add_field_properties() as internal QOM APIs. > > object_class_add_field_properties() will be used to implement > > device_class_set_props(). > > I have no problem making both of them public APIs. If an object can use > only a single array of static^Wfield properties that's totally fine; I'm > just not sure about splitting properties between class_init and static > arrays, which is the less consistent case. I agree that using a static array for a couple of properties and object_class_property_add*() for all the rest isn't desirable. Making both APIs public sounds like a good plan. I'd like us to make almost every object use only an array (just like almost every device already use only an array, today), but maybe we'll hit too many obstacles trying to do that. We'll see. -- Eduardo
Re: [PATCH] migration/dirtyrate: simplify inlcudes in dirtyrate.c
* Chuan Zheng (zhengch...@huawei.com) wrote: > Remove redundant blank line which is left by Commit 662770af7c6e8c, > also take this opportunity to remove redundant includes in dirtyrate.c. > > Signed-off-by: Chuan Zheng and queued Reviewed-by: Dr. David Alan Gilbert > --- > migration/dirtyrate.c | 5 - > 1 file changed, 5 deletions(-) > > diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c > index 8f728d2..ccb9814 100644 > --- a/migration/dirtyrate.c > +++ b/migration/dirtyrate.c > @@ -11,17 +11,12 @@ > */ > > #include "qemu/osdep.h" > - > #include > #include "qapi/error.h" > #include "cpu.h" > -#include "qemu/config-file.h" > -#include "exec/memory.h" > #include "exec/ramblock.h" > -#include "exec/target_page.h" > #include "qemu/rcu_queue.h" > #include "qapi/qapi-commands-migration.h" > -#include "migration.h" > #include "ram.h" > #include "trace.h" > #include "dirtyrate.h" > -- > 1.8.3.1 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 2/2] qemu-img: align next status sector on destination alignment.
On Thu, 2020-11-12 at 07:45 -0600, Eric Blake wrote: > On 11/12/20 6:40 AM, Peter Lieven wrote: > > > > /* > > > - * Avoid that s->sector_next_status becomes unaligned to the > > > source > > > - * request alignment and/or cluster size to avoid unnecessary > > > read > > > - * cycles. > > > + * Avoid that s->sector_next_status becomes unaligned to the > > > + * source/destination request alignment and/or cluster size to > > > avoid > > > + * unnecessary read/write cycles. > > > */ > > > -tail = (sector_num - src_cur_offset + n) % > > > s->src_alignment[src_cur]; > > > +alignment = MAX(s->src_alignment[src_cur], s->alignment); > > > +assert(is_power_of_2(alignment)); > > > + > > > +tail = (sector_num - src_cur_offset + n) % alignment; > > > if (n > tail) { > > > n -= tail; > > > } > > > > I was also considering including the s->alignment when adding this chance. > > However, you need the least common multiple of both alignments, not the > > maximum, otherwise > > > > you might get misaligned to either source or destination. > > > > > > Why exactly do you need the power of two requirement? > > The power of two requirement ensures that you h ave the least common > multiple of both alignments ;) - Yes, and in addition to that both alignments are already power of two because we align them to it. The assert I added is just in case. This is how we calculate the destination alignment: s.alignment = MAX(pow2floor(s.min_sparse), DIV_ROUND_UP(out_bs->bl.request_alignment, BDRV_SECTOR_SIZE)); This is how we calculate the source alignments (it is possible to have several source files) s.src_alignment[bs_i] = DIV_ROUND_UP(src_bs->bl.request_alignment, BDRV_SECTOR_SIZE); if (!bdrv_get_info(src_bs, &bdi)) { s.src_alignment[bs_i] = MAX(s.src_alignment[bs_i], bdi.cluster_size / BDRV_SECTOR_SIZE); } The bl.request_alignment comment mentions that it must be power of 2, and cluster sizes are also very likely to be power of 2 in all drivers we support. An assert for s.src_alignment won't hurt though. Note though that we don't really read the discard alignment. We have max_pdiscard, and pdiscard_alignment, but max_pdiscard is only used to split 'too large' discard requests, and pdiscard_alignment is advisory and used only in a couple of places. Neither are set by file-posix. We do have request_alignment, which file-posix tries to align to the logical block size which is still often 512 for backward compatibility on many devices (even nvme) Now both source and destination alignments in qemu-img convert are based on request_aligment and on min_sparse (which is by default 4K, controlled by qemu-img -S parameter (which otherwise controls the minimum number of blocks to be zero, to consider discarding them in convert) This means that there is no guarantee to have 4K alignment, and besides, with sufficiently fragmented source file, the bdrv_block_status_above can return arbitrary short and unaligned extents which can't be aligned, thus as I said this patch alone can't guarantee that we won't have write and discard sharing the same page. Another thing that can be discussed is why is_allocated_sectors was patched to convert short discards to writes. Probably because underlying hardware ignores them or something? In theory we should detect this and fail back to regular zeroing in this case. Again though, while in case of converting an empty file, this is the only source of writes, and eliminating it, also 'fixes' this case, with sufficiently fragmented source file we can even without this code get a write and discard sharing a page. Best regards, Maxim Levitsky > > However, there ARE devices in practice that have advertised a > non-power-of-2 discard granularity, such as 15MiB (see commits 430b26a8, > 63188c24). Which means you probably don't want to assert power-of-2, > and in turn need to worry about least common multiple. >
Re: [PATCH for-5.2 08/10] block/export: port virtio-blk discard/write zeroes input validation
On 11.11.20 13:43, Stefan Hajnoczi wrote: Validate discard/write zeroes the same way we do for virtio-blk. Some of these checks are mandated by the VIRTIO specification, others are internal to QEMU. Signed-off-by: Stefan Hajnoczi --- block/export/vhost-user-blk-server.c | 115 +-- 1 file changed, 92 insertions(+), 23 deletions(-) diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c index 62672d1cb9..3295d3c951 100644 --- a/block/export/vhost-user-blk-server.c +++ b/block/export/vhost-user-blk-server.c [...] @@ -58,30 +60,101 @@ static void vu_blk_req_complete(VuBlkReq *req) free(req); } +static bool vu_blk_sect_range_ok(VuBlkExport *vexp, uint64_t sector, + size_t size) +{ +uint64_t nb_sectors = size >> BDRV_SECTOR_BITS; +uint64_t total_sectors; + +if (nb_sectors > BDRV_REQUEST_MAX_SECTORS) { I wonder why you pass a byte length, then shift it down to sectors, when it’s kind of unsafe on the caller’s side to even calculate that byte length (because the caller has to verify that shifting the sector count up to be a byte length is safe). Wouldn’t it be simpler to pass nb_sectors (perhaps even as uint64_t, because why not) and then compare that against BDRV_REQUEST_MAX_BYTES here? (With how the code is written, it also has the problem of rounding down @size. Which isn’t a problem in practice because the caller effectively guarantees that @size is aligned to sectors, but it still means that the code looks a bit strange. As in, “Why is it safe to round down? Ah, because the caller always produces an aligned value. But why does the caller even pass a byte count, then? Especially given that the offset is passed as a sector index, too.”) +return false; +} +if ((sector << BDRV_SECTOR_BITS) % vexp->blk_size) { This made me wonder why the discard/write-zeroes sector granularity would be BDRV_SECTOR_BITS and not blk_size, which is used for IN/OUT (read/write) requests. I still don’t know, but judging from the only reference I could find quickly (contrib/vhost-user-blk/vhost-user-blk.c), it seems correct. OTOH, I wonder whether BDRV_SECTOR_BITS/BDRV_SECTOR_SIZE are the correct constants. Isn’t it just 9/512 as per some specification, i.e., shouldn’t it be independent of qemu’s block layer’s sector size? +return false; +} +blk_get_geometry(vexp->export.blk, &total_sectors); +if (sector > total_sectors || nb_sectors > total_sectors - sector) { +return false; +} +return true; +} + [...] @@ -170,19 +243,13 @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque) } case VIRTIO_BLK_T_DISCARD: case VIRTIO_BLK_T_WRITE_ZEROES: { -int rc; - if (!vexp->writable) { req->in->status = VIRTIO_BLK_S_IOERR; break; } -rc = vu_blk_discard_write_zeroes(blk, &elem->out_sg[1], out_num, type); -if (rc == 0) { -req->in->status = VIRTIO_BLK_S_OK; -} else { -req->in->status = VIRTIO_BLK_S_IOERR; -} +req->in->status = vu_blk_discard_write_zeroes(vexp, elem->out_sg, + out_num, type); elem->out_sg is different from &elem->out_sg[1], but from what I can tell vu_blk_discard_write_zeroes() doesn’t really change in how it treats @iov. I’m confused. Is that a bug fix? (If so, it isn’t mentioned in the commit message) Apart from this, the patch looks good to me (although there are the two things mentioned above that I find a bit strange, but definitely not wrong). Max break; } default:
Re: [RFC v3] VFIO Migration
On Wed, 11 Nov 2020 15:48:50 + Daniel P. Berrangé wrote: > In terms of validation I can't help but feel the whole proposal is > really very complicated. > > In validating QEMU migration compatibility we merely compare the > versioned machine type. > > IIUC, in this proposal, it would be more like exploding the machine > type into all its 100's of properties and then comparing each one > individually. > > I really prefer the simpler model of QEMU versioned machine types > where compatibility is a simple string comparison, hiding the > 100's of individual config parameters. > > Of course there are scenarios where this will lead a mgmt app to > refuse a migration, when it could in fact have permitted it. > > eg consider pc-i440fx-4.0 and pc-i440fx-5.0 machine types, > which only differ in the value "foo=7" and "foo=8" respectively. > > Now if the target only supported machine type pc-i440fx-5.0, then > with a basic string comparison of machine type versin, we can't > migrate from a host uing pc-i440fx-4.0 > > If we exploded the machine type into its params, we could see that > we can migrate from pc-i440fx-4.0 to pc-i440fx-5.0, simply by > overriding the value of "foo". > > So, yes, dealing with individual params is more flexible, but it > comes at an enourmous cost in complexity to process all the > parameters. I'm not convinced this is a good tradeoff. For mdev devices, we could have something similar to versioned machine types by introducing versioned mdev types. (Which would fit well with mdev types having to match strictly for migration to be possible.) For other use cases, we would need to introduce a new construct.
Re: [PATCH] scsi-disk: convert more errno values back to SCSI statuses
On 11/12/20 10:52 AM, Paolo Bonzini wrote: Linux has some OS-specific (and sometimes weird) mappings for various SCSI statuses and sense codes. The most important is probably RESERVATION CONFLICT. Add them so that they can be reported back to the guest kernel. Cc: Hannes Reinecke Signed-off-by: Paolo Bonzini --- hw/scsi/scsi-disk.c | 19 +++ 1 file changed, 19 insertions(+) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 424bc192b7..fa14d1527a 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -461,6 +461,25 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int error, bool acct_failed) } error = scsi_sense_buf_to_errno(r->req.sense, sizeof(r->req.sense)); break; +#ifdef CONFIG_LINUX +/* These errno mapping are specific to Linux. For more information: + * - scsi_decide_disposition in drivers/scsi/scsi_error.c + * - scsi_result_to_blk_status in drivers/scsi/scsi_lib.c + * - blk_errors[] in block/blk-core.c + */ +case EBADE: +/* DID_NEXUS_FAILURE -> BLK_STS_NEXUS. */ +scsi_req_complete(&r->req, RESERVATION_CONFLICT); +break; +case ENODATA: +/* DID_MEDIUM_ERROR -> BLK_STS_MEDIUM. */ +scsi_check_condition(r, SENSE_CODE(READ_ERROR)); +break; +case EREMOTEIO: +/* DID_TARGET_FAILURE -> BLK_STS_TARGET. */ +scsi_req_complete(&r->req, HARDWARE_ERROR); +break; +#endif case ENOMEDIUM: scsi_check_condition(r, SENSE_CODE(NO_MEDIUM)); break; Well, ironically I'm currently debugging a customer escalation which touches exactly this area. It revolves more around the SG_IO handling; qemu ignores the host_status setting completely, leading to data corruption if the host has to abort commands. (Not that the host _does_ abort commands, as qemu SG_IO sets an infinite timeout. But that's another story). And part of the patchset is to enable passing of the host_status code back to the drivers. In particular virtio_scsi has a 'response' code which matches pretty closely to the linux SCSI DID_XXX codes. So my idea is to pass the host_status directly down to the drivers, allowing virtio-scsi to do a mapping between DID_XX and virtio response codes. That way we can get rid of the weird mapping in utils/scsi.c where we try to map the DID_XXX codes onto Sense or SCSI status. And with that we could map the error codes above onto DID_XXX codes, too, allowing us to have a more streamlined mapping. Thoughts? Cheers, Hannes -- Dr. Hannes ReineckeKernel Storage Architect h...@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
Re: [PATCH] migration: handle CANCELLING state in migration_completion()
* Longpeng(Mike) (longpe...@huawei.com) wrote: > The following sequence may cause the VM abort during migration: > > 1. RUN_STATE_RUNNING,MIGRATION_STATUS_ACTIVE > > 2. before call migration_completion(), we send migrate_cancel >QMP command, the state machine is changed to: > RUN_STATE_RUNNING,MIGRATION_STATUS_CANCELLING > > 3. call migration_completion(), and the state machine is >switch to: RUN_STATE_RUNNING,MIGRATION_STATUS_COMPLETED > > 4. call migration_iteration_finish(), because the migration >status is COMPLETED, so it will try to set the runstate >to POSTMIGRATE, but RUNNING-->POSTMIGRATE is an invalid >transition, so abort(). > > The migration_completion() should not change the migration state > to COMPLETED if it is already changed to CANCELLING. > > Signed-off-by: Longpeng(Mike) Queued > --- > migration/migration.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/migration/migration.c b/migration/migration.c > index 3263aa5..b11a2bd 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -3061,6 +3061,8 @@ static void migration_completion(MigrationState *s) > > qemu_savevm_state_complete_postcopy(s->to_dst_file); > trace_migration_completion_postcopy_end_after_complete(); > +} else if (s->state == MIGRATION_STATUS_CANCELLING) { > +goto fail; > } > > /* > -- > 1.8.3.1 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH] hw/arm/virt: ARM_VIRT must select ARM_GIC
On Wed, 11 Nov 2020 at 14:34, Andrew Jones wrote: > > The removal of the selection of A15MPCORE from ARM_VIRT also > removed what A15MPCORE selects, ARM_GIC. We still need ARM_GIC. > > Fixes: bec3c97e0cf9 ("hw/arm/virt: Remove dependency on Cortex-A15 MPCore > peripherals") > Reported-by: Miroslav Rezanina > Signed-off-by: Andrew Jones Applied to target-arm.next, thanks. -- PMM
Re: [PATCH] exynos: Fix bad printf format specifiers
On Wed, 11 Nov 2020 at 07:48, Alex Chen wrote: > > We should use printf format specifier "%u" instead of "%d" for > argument of type "unsigned int". > > Reported-by: Euler Robot > Signed-off-by: Alex Chen > --- > hw/timer/exynos4210_mct.c | 4 ++-- > hw/timer/exynos4210_pwm.c | 8 > 2 files changed, 6 insertions(+), 6 deletions(-) Applied to target-arm.next, thanks. -- PMM
Re: [PATCH] hw/input/ps2.c: Remove remnants of printf debug
On Sun, 1 Nov 2020 at 13:33, Peter Maydell wrote: > > In commit 5edab03d4040 we added tracepoints to the ps2 keyboard > and mouse emulation. However we didn't remove all the debug-by-printf > support. In fact there is only one printf() remaining, and it is > redundant with the trace_ps2_write_mouse() event next to it. > Remove the printf() and the now-unused DEBUG* macros. > > Signed-off-by: Peter Maydell
Re: [PATCH] target/openrisc: Remove dead code attempting to check "is timer disabled"
On Wed, 4 Nov 2020 at 11:44, Stafford Horne wrote: > As for the patch. > > Acked-by: Stafford Horne Thanks; I'll take this via target-arm.next since I'm putting together a pullreq at the moment. -- PMM
Re: [PATCH 0/2] RFC: Issue with discards on raw block device without O_DIRECT
On Thu, 2020-11-12 at 12:19 +0100, Jan Kara wrote: > [added some relevant people and lists to CC] > > On Wed 11-11-20 17:44:05, Maxim Levitsky wrote: > > On Wed, 2020-11-11 at 17:39 +0200, Maxim Levitsky wrote: > > > clone of "starship_production" > > > > The git-publish destroyed the cover letter: > > > > For the reference this is for bz #1872633 > > > > The issue is that current kernel code that implements 'fallocate' > > on kernel block devices roughly works like that: > > > > 1. Flush the page cache on the range that is about to be discarded. > > 2. Issue the discard and wait for it to finish. > >(as far as I can see the discard doesn't go through the > >page cache). > > > > 3. Check if the page cache is dirty for this range, > >if it is dirty (meaning that someone wrote to it meanwhile) > >return -EBUSY. > > > > This means that if qemu (or qemu-img) issues a write, and then > > discard to the area that shares a page, -EBUSY can be returned by > > the kernel. > > Indeed, if you don't submit PAGE_SIZE aligned discards, you can get back > EBUSY which seems wrong to me. IMO we should handle this gracefully in the > kernel so we need to fix this. > > > On the other hand, for example, the ext4 implementation of discard > > doesn't seem to be affected. It does take a lock on the inode to avoid > > concurrent IO and flushes O_DIRECT writers prior to doing discard thought. > > Well, filesystem hole punching is somewhat different beast than block device > discard (at least implementation wise). > > > Doing fsync and retrying is seems to resolve this issue, but it might be > > a too big hammer. Just retrying doesn't work, indicating that maybe the > > code that flushes the page cache in (1) doesn't do this correctly ? > > > > It also can be racy unless special means are done to block IO from happening > > from qemu during this fsync. > > > > This patch series contains two patches: > > > > First patch just lets the file-posix ignore the -EBUSY errors, which is > > technically enough to fail back to plain write in this case, but seems > > wrong. > > > > And the second patch adds an optimization to qemu-img to avoid such a > > fragmented write/discard in the first place. > > > > Both patches make the reproducer work for this particular bugzilla, > > but I don't think they are enough. > > > > What do you think? > > So if the EBUSY error happens because something happened to the page cache > outside of discarded range (like you describe above), that is a kernel bug > than needs to get fixed. EBUSY should really mean - someone wrote to the > discarded range while discard was running and userspace app has to deal > with that depending on what it aims to do... I double checked this, those are the writes/discards according to my debug prints (I print start and then start+len-1 for each request) I have attached the patch for this for reference. ZERO: 0x7fe0 7fffefff (len:0x1ff000) fallocate 7fe0 7fffefff WRITE: 0x7000 7dff (len:0xe00) write 7000 7dff ZERO: 0x7e00 801fefff (len:0x1ff200) fallocate 7e00 801fefff FALLOCATE failed with error 16 qemu-img: error while writing at byte 2147483136: Device or resource busy Best regards, Maxim Levitsky > > Honza commit ce897250babe3527f451c1c54c86b62659a2c29e Author: Maxim Levitsky Date: Thu Oct 15 17:02:58 2020 +0300 hacks diff --git a/block/file-posix.c b/block/file-posix.c index c63926d592..91ca690505 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1440,6 +1440,12 @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf) while (offset < aiocb->aio_nbytes) { if (aiocb->aio_type & QEMU_AIO_WRITE) { + +long unsigned int size = aiocb->aio_nbytes - offset; +long unsigned int off = aiocb->aio_offset + offset; + +printf(" write %012lx %012lx\n", off, off+size-1); + len = pwrite(aiocb->aio_fildes, (const char *)buf + offset, aiocb->aio_nbytes - offset, @@ -1581,10 +1587,15 @@ static int translate_err(int err) static int do_fallocate(int fd, int mode, off_t offset, off_t len) { do { + +printf(" fallocate %012lx %012lx\n", offset, offset+len-1); + if (fallocate(fd, mode, offset, len) == 0) { return 0; } } while (errno == EINTR); + +printf("FALLOCATE failed with error %d\n", errno); return translate_err(-errno); } #endif diff --git a/qemu-img.c b/qemu-img.c index c2c56fc797..64d3b84728 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -1803,6 +1803,7 @@ static int convert_iteration_sectors(ImgConvertState *s, int64_t sector_num) } s->sector_next_status = sector_num + n; +printf("NEXT_STATUS: %lx\n", s->sector_next_status << 9); }
Re: [PATCH for-5.2 09/10] vhost-user-blk-test: test discard/write zeroes invalid inputs
On 11.11.20 13:43, Stefan Hajnoczi wrote: Exercise input validation code paths in block/export/vhost-user-blk-server.c. Signed-off-by: Stefan Hajnoczi --- block/export/vhost-user-blk-server.c | 4 +- tests/qtest/vhost-user-blk-test.c| 124 +++ 2 files changed, 126 insertions(+), 2 deletions(-) diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c index 3295d3c951..d88e41714d 100644 --- a/block/export/vhost-user-blk-server.c +++ b/block/export/vhost-user-blk-server.c @@ -248,8 +248,8 @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque) break; } -req->in->status = vu_blk_discard_write_zeroes(vexp, elem->out_sg, - out_num, type); +req->in->status = vu_blk_discard_write_zeroes(vexp, out_iov, out_num, + type); break; } default: Looks like this hunk should be squashed into the previous patch. I think that would lift my confusion. Max
[Bug 1779649] Re: Suspending a domain works with a 3.16 gues kernel but not with a 4.16 one
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1779649 Title: Suspending a domain works with a 3.16 gues kernel but not with a 4.16 one Status in QEMU: Incomplete Bug description: Suspending a domain with `systemctl suspend` works with a guest 3.16 kernel (jessie), the domain goes into `pmsuspend` in libvirt but doesn't work anymore with a 4.16 one (sid) where: - With a QXL card: the spice display just goes black and the domain stays `running` in libvirt but is "dead" - With a VGA card: the screen goes black and comes back immediately, the domain stays fine Qemu: Master, 281bd281222776229d5dbf84d1a5c6d8d9d2a34b To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1779649/+subscriptions
Re: [PATCH for-5.2] virtiofsd: Announce submounts even without statx()
* Max Reitz (mre...@redhat.com) wrote: > Contrary to what the check (and warning) in lo_init() claims, we can > announce submounts just fine even without statx() -- the check is based > on comparing both the mount ID and st_dev of parent and child. Without > statx(), we will not have the mount ID; but we always have st_dev. > > The only problems we have (without statx() and its mount ID) are: > > (1) Mounting the same device twice may lead to both trees being treated > as exactly the same tree by virtiofsd. But that is a problem that > is completely independent of mirroring host submounts in the guest. > Both submount roots will still show the FUSE_SUBMOUNT flag, because > their st_dev still differs from their respective parent. > > (2) There is only one exception to (1), and that is if you mount a > device inside a mount of itself: Then, its st_dev will be the same > as that of its parent, and so without a mount ID, virtiofsd will not > be able to recognize the nested mount's root as a submount. > However, thanks to virtiofsd then treating both trees as exactly the > same tree, it will be caught up in a loop when the guest tries to > examine the nested submount, so the guest will always see nothing > but an ELOOP there. Therefore, this case is just fully broken > without statx(), whether we check for submounts (based on st_dev) or > not. > > All in all, checking for submounts works well even without comparing the > mount ID (i.e., without statx()). The only concern is an edge case > that, without statx() mount IDs, is utterly broken anyway. > > Thus, drop said check in lo_init(). > > Reported-by: Miklos Szeredi > Signed-off-by: Max Reitz Queued > --- > tools/virtiofsd/passthrough_ll.c | 8 > 1 file changed, 8 deletions(-) > > diff --git a/tools/virtiofsd/passthrough_ll.c > b/tools/virtiofsd/passthrough_ll.c > index ec1008bceb..6c64b03f1a 100644 > --- a/tools/virtiofsd/passthrough_ll.c > +++ b/tools/virtiofsd/passthrough_ll.c > @@ -610,14 +610,6 @@ static void lo_init(void *userdata, struct > fuse_conn_info *conn) > "does not support it\n"); > lo->announce_submounts = false; > } > - > -#ifndef CONFIG_STATX > -if (lo->announce_submounts) { > -fuse_log(FUSE_LOG_WARNING, "lo_init: Cannot announce submounts, > there " > - "is no statx()\n"); > -lo->announce_submounts = false; > -} > -#endif > } > > static void lo_getattr(fuse_req_t req, fuse_ino_t ino, > -- > 2.26.2 > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Bug 1781515] Re: Resolution switch leads to the screen/image being corrupted
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1781515 Title: Resolution switch leads to the screen/image being corrupted Status in QEMU: Incomplete Bug description: I am currently using QEMU on a Arch Linux host, the guest OS is also Arch Linux. The QEMU version is currently 2.12.0-2 packaged by Arch Linux, the command line I'm using to fire an Arch VM is: $ qemu-system-x86_64 -enable-kvm -hda archlinux.qcow2 -m 4G -smp 4 The problem I'm currently having is, after firing the VM and running startx I want to change the resolution to the native resolution, which is 1366x768 on my ThinkPad T450, however, after changing the resolution the image on the guest gets corrupted and it's impossible to see anything. At this point I can only turn off the VM. The only workaround I found is to start the VM with -vga virtio. The problem in this case occurs with -vga std which is the default video driver. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1781515/+subscriptions
[Bug 1782300] Re: COLO unable to failover to secondary VM
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1782300 Title: COLO unable to failover to secondary VM Status in QEMU: Incomplete Bug description: I test COLO feature on my host following docs/COLO-FT.txt in qemu folder, but fail to failover to secondary VM. Is there any mistake in my execution steps? Execution environment: QEMU v2.12.0-rc4 OS: Ubuntu 16.04.3 LTS Kernel: Linux 4.4.35 Secondary VM IP: noted as "a.b.c.d" Execution steps: # Primary ${COLO_PATH}/x86_64-softmmu/qemu-system-x86_64 \ -enable-kvm \ -m 512M \ -smp 2 \ -qmp stdio \ -vnc :7 \ -name primary \ -device piix3-usb-uhci \ -device usb-tablet \ -netdev tap,id=tap0,vhost=off \ -device virtio-net-pci,id=net-pci0,netdev=tap0 \ -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\ children.0.file.filename=${IMG_PATH},\ children.0.driver=raw -S # Secondary ${COLO_PATH}/x86_64-softmmu/qemu-system-x86_64 \ -enable-kvm \ -m 512M \ -smp 2 \ -qmp stdio \ -vnc :8 \ -name secondary \ -device piix3-usb-uhci \ -device usb-tablet \ -netdev tap,id=tap1,vhost=off \ -device virtio-net-pci,id=net-pci0,netdev=tap1 \ -drive if=none,id=secondary-disk0,file.filename=${IMG_PATH},driver=raw,node-name=node0 \ -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\ file.driver=qcow2,top-id=active-disk0,\ file.file.filename=$ACTIVE_DISK,\ file.backing.driver=qcow2,\ file.backing.file.filename=$HIDDEN_DISK,\ file.backing.backing=secondary-disk0 \ -incoming tcp:0: # Enter into Secondary: {'execute':'qmp_capabilities'} { 'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': 'a.b.c.d', 'port': '8889'} } } } {'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } } # Enter into Primary: {'execute':'qmp_capabilities'} {'execute': 'human-monitor-command', 'arguments': { 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=a.b.c.d,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0' } } { 'execute':'x-blockdev-change', 'arguments':{'parent': 'primary-disk0', 'node': 'nbd_client0' } } { 'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } } { 'execute': 'migrate', 'arguments': {'uri': 'tcp:a.b.c.d:' } } # To test failover Primary { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'primary-disk0', 'child': 'children.1'}} { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del nbd_client0'}} Secondary { 'execute': 'nbd-server-stop' } Stop Primary Send ^C signal to terminate PVM. Secondary { "execute": "x-colo-lost-heartbeat" } # Result: Primary (Use ^C to terminate) qemu-system-x86_64: Can't receive COLO message: Input/output error qemu-system-x86_64: terminating on signal 2 {"timestamp": {"seconds": 1531815575, "microseconds": 997696}, "event": "SHUTDOWN", "data": {"guest":false}} Secondary { 'execute': 'nbd-server-stop' } {"return": {}} { "execute": "x-colo-lost-heartbeat" } {"return": {}} qemu-system-x86_64: Can't receive COLO message: Input/output error Segmentation fault To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1782300/+subscriptions
[Bug 1782107] Re: Random errors when emulating armv7 and using dh
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1782107 Title: Random errors when emulating armv7 and using dh Status in QEMU: Incomplete Bug description: Howdy, I'm encountering random errors when using qemu to cross-package my project using dh. In previous iterations of my project it would only fail once every two attempts. Now it fails every time. Example error included. If you'd like to try and replicate this error, a version of my project is publicly available with simple instructions on how to package it (using qemu) here: https://github.com/Nadav-Ruskin/configsite To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1782107/+subscriptions
[Bug 1780815] Re: SDL Doesn't Rescale Image on Resolution Change
The QEMU project is currently considering to move its bug tracking to another system. For this we need to know which bugs are still valid and which could be closed already. Thus we are setting older bugs to "Incomplete" now. If you still think this bug report here is valid, then please switch the state back to "New" within the next 60 days, otherwise this report will be marked as "Expired". Or mark it as "Fix Released" if the problem has been solved with a newer version of QEMU already. Thank you and sorry for the inconvenience. ** Changed in: qemu Status: New => Incomplete -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1780815 Title: SDL Doesn't Rescale Image on Resolution Change Status in QEMU: Incomplete Bug description: Whilst in fullscreen mode, if the guest changes resolution for whatever reason, the screen doesn't update the scaling so you end up looking at only part of the screen, e.g. if the guest changes from 640x480 to 1024x768, the image will still be fullscreen, but what you're actually looking at will be the top-left most 640x480 segment of the screen stretched out. Manually going out of fullscreen mode and then back in fixes the scaling, but you have to do that every time the guest changes resolution. QEmu 2.12.0 using qemu-system-x86_64 with Arch Linux host. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1780815/+subscriptions
Re: [PATCH V2] vhost-user-blk/scsi: Fix broken error handling for socket call
Ping? This is a coverity issue fix which has been reviewed, whose tree should it go via? Adding mst to cc list as the listed maintainer. thanks -- PMM On Thu, 29 Oct 2020 at 06:05, AlexChen wrote: > > When socket() fails, it returns -1, 0 is the normal return value and should > not return error. > > Reported-by: Euler Robot > Signed-off-by: AlexChen > --- > contrib/vhost-user-blk/vhost-user-blk.c | 2 +- > contrib/vhost-user-scsi/vhost-user-scsi.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/contrib/vhost-user-blk/vhost-user-blk.c > b/contrib/vhost-user-blk/vhost-user-blk.c > index 25eccd02b5..40a2dfc544 100644 > --- a/contrib/vhost-user-blk/vhost-user-blk.c > +++ b/contrib/vhost-user-blk/vhost-user-blk.c > @@ -474,7 +474,7 @@ static int unix_sock_new(char *unix_fn) > assert(unix_fn); > > sock = socket(AF_UNIX, SOCK_STREAM, 0); > -if (sock <= 0) { > +if (sock < 0) { > perror("socket"); > return -1; > } > diff --git a/contrib/vhost-user-scsi/vhost-user-scsi.c > b/contrib/vhost-user-scsi/vhost-user-scsi.c > index 3c912384e9..0f9ba4b2a2 100644 > --- a/contrib/vhost-user-scsi/vhost-user-scsi.c > +++ b/contrib/vhost-user-scsi/vhost-user-scsi.c > @@ -320,7 +320,7 @@ static int unix_sock_new(char *unix_fn) > assert(unix_fn); > > sock = socket(AF_UNIX, SOCK_STREAM, 0); > -if (sock <= 0) { > +if (sock < 0) { > perror("socket"); > return -1; > } > -- > 2.19.1
Re: [PATCH v2 1/1] register: Remove unnecessary NULL check
On Fri, 2 Oct 2020 at 17:04, Alistair Francis wrote: > > This patch fixes CID 1432800 by removing an unnecessary check. > > Signed-off-by: Alistair Francis > --- > hw/core/register.c | 4 > 1 file changed, 4 deletions(-) > > diff --git a/hw/core/register.c b/hw/core/register.c > index 31038bd7cc..3600ef5bde 100644 > --- a/hw/core/register.c > +++ b/hw/core/register.c > @@ -258,10 +258,6 @@ static RegisterInfoArray > *register_init_block(DeviceState *owner, > int index = rae[i].addr / data_size; > RegisterInfo *r = &ri[index]; > > -if (data + data_size * index == 0 || !&rae[i]) { > -continue; > -} > - > /* Init the register, this will zero it. */ > object_initialize((void *)r, sizeof(*r), TYPE_REGISTER); > > -- > 2.28.0 Applied to target-arm.next, thanks. -- PMM
Re: [PATCH] target/i386: avoid theoretical leak on MCE injection
On Tue, 6 Oct 2020 at 08:54, Paolo Bonzini wrote: > > g_strdup_printf is used twice to write to the same variable, which > can theoretically cause a leak. In practice, it is extremely > unlikely that a guest is seeing a recursive MCE and has disabled > CR4.MCE between the first and the second error, but we can fix it > and we can also make a slight improvement on the logic: CR4.MCE=0 > causes a triple fault even for a non-recursive machine check, so > let's place its test first. > > Signed-off-by: Paolo Bonzini > --- > target/i386/helper.c | 10 -- > 1 file changed, 4 insertions(+), 6 deletions(-) > > diff --git a/target/i386/helper.c b/target/i386/helper.c > index 32fa21a7bb..f64379367d 100644 > --- a/target/i386/helper.c > +++ b/target/i386/helper.c > @@ -908,16 +908,14 @@ static void do_inject_x86_mce(CPUState *cs, > run_on_cpu_data data) > return; > } > > -if (recursive) { > -need_reset = true; > -msg = g_strdup_printf("CPU %d: Previous MCE still in progress, " > - "raising triple fault", cs->cpu_index); > -} > - > if (!(cenv->cr[4] & CR4_MCE_MASK)) { > need_reset = true; > msg = g_strdup_printf("CPU %d: MCE capability is not enabled, " >"raising triple fault", cs->cpu_index); > +} else if (recursive) { > +need_reset = true; > +msg = g_strdup_printf("CPU %d: Previous MCE still in progress, " > + "raising triple fault", cs->cpu_index); > } > > if (need_reset) { Reviewed-by: Peter Maydell Might be nice to have this in 5.2, given it is a coverity issue fix? thanks -- PMM
Re: [PATCH for-5.2 10/10] block/export: port virtio-blk read/write range check
On 11.11.20 13:43, Stefan Hajnoczi wrote: Check that the sector number and byte count are valid. Signed-off-by: Stefan Hajnoczi --- block/export/vhost-user-blk-server.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c index d88e41714d..6d7fd0fec3 100644 --- a/block/export/vhost-user-blk-server.c +++ b/block/export/vhost-user-blk-server.c @@ -214,9 +214,23 @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque) QEMUIOVector qiov; if (is_write) { qemu_iovec_init_external(&qiov, out_iov, out_num); + +if (unlikely(!vu_blk_sect_range_ok(vexp, req->sector_num, + qiov.size))) { +req->in->status = VIRTIO_BLK_S_IOERR; +break; +} + ret = blk_co_pwritev(blk, offset, qiov.size, &qiov, 0); } else { qemu_iovec_init_external(&qiov, in_iov, in_num); + +if (unlikely(!vu_blk_sect_range_ok(vexp, req->sector_num, + qiov.size))) { +req->in->status = VIRTIO_BLK_S_IOERR; +break; +} + ret = blk_co_preadv(blk, offset, qiov.size, &qiov, 0); } if (ret >= 0) { req->sector_num is not a block layer sector, though (i.e. not a 512-byte sector); it references sectors of size vexp->blk_size (which I presume aren’t necessarily 512 bytes in length). Second, I now understand why vu_blk_sect_range_ok() takes a byte length; but with an arbitrary length as given here, it must also round that down when converting that length to block layer sectors. (Or just compare the byte length against the result of bdrv_getlength().) Max
Re: [PATCH-for-5.2 v3] util/cutils: Fix Coverity array overrun in freq_to_str()
On Sun, 1 Nov 2020 at 21:57, Philippe Mathieu-Daudé wrote: > > Fix Coverity CID 1435957: Memory - illegal accesses (OVERRUN): > > >>> Overrunning array "suffixes" of 7 8-byte elements at element > index 7 (byte offset 63) using index "idx" (which evaluates to 7). > > Note, the biggest input value freq_to_str() can accept is UINT64_MAX, > which is ~18.446 EHz, less than 1000 EHz. > > Reported-by: Eduardo Habkost > Suggested-by: Peter Maydell > Signed-off-by: Philippe Mathieu-Daudé > --- > v3: Follow Peter's suggestion Since nobody else has picked this up I'll take it via target-arm.next. thanks -- PMM
Re: [PATCH v2 1/1] Fix use after free in vfio_migration_probe
On Fri, 6 Nov 2020 at 18:35, Kirti Wankhede wrote: > > Fixes Coverity issue: > CID 1436126: Memory - illegal accesses (USE_AFTER_FREE) > > Fixes: a9e271ec9b36 ("vfio: Add migration region initialization and finalize > function") > > Signed-off-by: Kirti Wankhede > Reviewed-by: David Edmondson > Reviewed-by: Alex Bennée > Reviewed-by: Philippe Mathieu-Daudé > --- Hi Alex -- this is a fix for a Coverity issue, are you planning a pullreq with it in? It would be nice to have it in rc2 next week. thanks -- PMM
Re: [PATCH v3 0/3] virtiofsd: fix some accessing NULL pointer problem
* Haotian Li (lihaoti...@huawei.com) wrote: > Hi, > We find some potential NULL pointer bugs on tools/virtiofsd. > Three patches are made to fix them Queued > Haotian Li (3): > tools/virtiofsd/buffer.c: check whether buf is NULL in > fuse_bufvec_advance func > virtiofsd: check whether lo_map_reserve returns NULL in main func > virtiofsd: check whether strdup lo.source return NULL in main func. > > tools/virtiofsd/buffer.c | 4 > tools/virtiofsd/passthrough_ll.c | 16 +++- > 2 files changed, 19 insertions(+), 1 deletion(-) > > -- > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH v5 0/7] iOS and Apple Silicon host support
On Sun, Nov 08, 2020 at 03:24:17PM -0800, Joelle van Dyne wrote: > Based-on: 20201106032921.600200-1-richard.hender...@linaro.org > ([PATCH v3 00/41] Mirror map JIT memory for TCG) > > These set of changes brings QEMU TCG to iOS devices and future Apple Silicon > devices. They were originally developed last year and have been working in the > UTM app. Recently, we ported the changes to master, re-wrote a lot of the > build > script changes for meson, and broke up the patches into more distinct units. > > A summary of the changes: > > * `CONFIG_IOS` defined when building for iOS and iOS specific changes (as well > as unsupported code) are gated behind it. > * A new dependency, libucontext is added since iOS does not have native > ucontext > and broken support for sigaltstack. libucontext is available as a new option > for coroutine backend. > * For (recent) jailbroken iOS devices as well as upcoming Apple Silicon > devices, > there are new rules for applications supporting JIT (with the proper > entitlement). These rules are implemented as well. > > Since v5: > > * Fixed some more instances of QAPI define of CONFIG_HOST_BLOCK_DEVICE > * Fixed libucontext build on newer version of GCC I have pinged Software Freedom Conservancy about an opinion regarding merging the reverse-engineered part. Hope to get a reply soon. You are on CC so you'll see it. QEMU is currently in freeze (fixes only, no new features) for the upcoming 5.2 release. The development tree will open again at the start of December: https://wiki.qemu.org/Planning/5.2 The code looks good to me. Thank you! Reviewed-by: Stefan Hajnoczi signature.asc Description: PGP signature
[PATCH for-5.2?] configure: Make "does libgio work" test pull in some actual functions
In commit 76346b6264a9b01979 we tried to add a configure check that the libgio pkg-config data was correct, which builds an executable linked against it. Unfortunately this doesn't catch the problem (missing static library dependency info), because a "do nothing" test source file doesn't have any symbol references that cause the linker to pull in .o files from libgio.a, and so we don't see the "missing symbols from libmount" error that a full QEMU link triggers. (The ineffective test went unnoticed because of a typo that effectively disabled libgio unconditionally, but after commit 3569a5dfc11f2 fixed that, a static link of the system emulator on Ubuntu stopped working again.) Improve the gio test by having the test source fragment reference a g_dbus function (which is what is indirectly causing us to end up wanting functions from libmount). Signed-off-by: Peter Maydell --- configure | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/configure b/configure index 4cef321d9dc..2717cf1db0a 100755 --- a/configure +++ b/configure @@ -3512,8 +3512,15 @@ if $pkg_config --atleast-version=$glib_req_ver gio-2.0; then # Check that the libraries actually work -- Ubuntu 18.04 ships # with pkg-config --static --libs data for gio-2.0 that is missing # -lblkid and will give a link error. -write_c_skeleton -if compile_prog "" "$gio_libs" ; then +cat > $TMPC < +int main(void) +{ +g_dbus_proxy_new_sync(0, 0, 0, 0, 0, 0, 0, 0); +return 0; +} +EOF +if compile_prog "$gio_cflags" "$gio_libs" ; then gio=yes else gio=no -- 2.20.1
Re: [PATCH v2 1/1] Fix use after free in vfio_migration_probe
On Thu, 12 Nov 2020 15:57:46 + Peter Maydell wrote: > On Fri, 6 Nov 2020 at 18:35, Kirti Wankhede wrote: > > > > Fixes Coverity issue: > > CID 1436126: Memory - illegal accesses (USE_AFTER_FREE) > > > > Fixes: a9e271ec9b36 ("vfio: Add migration region initialization and finalize > > function") > > > > Signed-off-by: Kirti Wankhede > > Reviewed-by: David Edmondson > > Reviewed-by: Alex Bennée > > Reviewed-by: Philippe Mathieu-Daudé > > --- > > Hi Alex -- this is a fix for a Coverity issue, are you planning > a pullreq with it in? It would be nice to have it in rc2 next week. Hi Peter, Yes, I was planning to send a pull request. It's a trivial and obvious fix, so if by chance you'd like to grab it separately, you're also more than welcome. Acked-by: Alex Williamson Reviewed-by: Alex Williamson Thanks, Alex
Re: [PATCH for-5.2 04/10] vhost-user-blk-test: rename destroy_drive() to destroy_file()
On Thu, Nov 12, 2020 at 03:32:58PM +0100, Max Reitz wrote: > On 11.11.20 13:43, Stefan Hajnoczi wrote: > > The function is used not just for image files but also for UNIX domain > > sockets (QMP monitor and vhost-user-blk). Reflect that in the name. > > > > Signed-off-by: Stefan Hajnoczi > > --- > > tests/qtest/vhost-user-blk-test.c | 9 + > > 1 file changed, 5 insertions(+), 4 deletions(-) > > [...] > > > @@ -731,7 +732,7 @@ static char *start_vhost_user_blk(GString *cmd_line, > > int vus_instances, > > sock_path = g_strdup(sock_path_tempate); > > fd = mkstemp(sock_path); > > g_assert_cmpint(fd, >=, 0); > > -g_test_queue_destroy(drive_destroy, sock_path); > > +g_test_queue_destroy(drive_file, sock_path); > > s/drive_file/destroy_file/, I think :) Oops! The following commit removes this line so I didn't notice this issue: "vhost-user-blk-test: fix races by using fd passing". Michael: If you want this intermediate commit fixed for bisectability I will respin. Otherwise the patches in your tree are fine. signature.asc Description: PGP signature
Re: [PATCH-for-5.2? v5 0/2] ci: Move trace backend tests across to gitlab
On Wed, Nov 11, 2020 at 01:12:32PM +0100, Philippe Mathieu-Daudé wrote: > Extracted from "ci: Move various jobs from Travis to GitLab CI": > https://www.mail-archive.com/qemu-devel@nongnu.org/msg758314.html > > v5: > - addressed Wainer's review comment > > Philippe Mathieu-Daudé (2): > tests/docker: Install liblttng-ust-dev package in Ubuntu 20.04 image > gitlab-ci: Move trace backend tests across to gitlab > > .gitlab-ci.yml | 18 ++ > .travis.yml| 19 --- > tests/docker/dockerfiles/ubuntu2004.docker | 1 + > 3 files changed, 19 insertions(+), 19 deletions(-) > > -- > 2.26.2 > > Reviewed-by: Stefan Hajnoczi signature.asc Description: PGP signature
[PATCH 2/6] qapi: Remember alias definitions in qobject-input-visitor
This makes qobject-input-visitor remember the currently valid aliases in each StackObject. It doesn't actually allow using the aliases yet. Signed-off-by: Kevin Wolf --- qapi/qobject-input-visitor.c | 115 +++ 1 file changed, 115 insertions(+) diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index 23843b242e..a00ac32682 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -29,6 +29,29 @@ #include "qemu/cutils.h" #include "qemu/option.h" +typedef struct InputVisitorAlias { + /* Alias name. NULL for any (wildcard alias). */ +const char *alias; + +/* + * NULL terminated array representing a path. + * Must contain at least one non-NULL element if alias is not NULL. + */ +const char **src; + +/* StackObject in which the alias should be looked for */ +struct StackObject *alias_so; + +/* + * The alias remains valid as long as the containing StackObject has + * StackObject.alias_scope_nesting >= InputVisitorAlias.scope_nesting + * or until the whole StackObject is removed. + */ +int scope_nesting; + +QSLIST_ENTRY(InputVisitorAlias) next; +} InputVisitorAlias; + typedef struct StackObject { const char *name;/* Name of @obj in its parent, if any */ QObject *obj;/* QDict or QList being visited */ @@ -38,6 +61,9 @@ typedef struct StackObject { const QListEntry *entry;/* If @obj is QList: unvisited tail */ unsigned index; /* If @obj is QList: list index of @entry */ +QSLIST_HEAD(, InputVisitorAlias) aliases; +int alias_scope_nesting;/* Increase on scope start, decrease on end */ + QSLIST_ENTRY(StackObject) node; /* parent */ } StackObject; @@ -203,6 +229,38 @@ static const char *qobject_input_get_keyval(QObjectInputVisitor *qiv, return qstring_get_str(qstr); } +/* + * Propagates aliases from the parent StackObject @src to its direct + * child StackObject @dst, which is representing the child struct @name. + * + * Every alias whose source path begins with @name and which still + * applies in @dst (i.e. it is either a wildcard alias or has at least + * one more source path element) is propagated to @dst with the first + * element (i.e. @name) removed from the source path. + */ +static void propagate_aliases(StackObject *dst, StackObject *src, + const char *name) +{ +InputVisitorAlias *a; + +QSLIST_FOREACH(a, &src->aliases, next) { +if (!a->src[0] || strcmp(a->src[0], name)) { +continue; +} +if (a->src[1] || !a->alias) { +InputVisitorAlias *alias = g_new(InputVisitorAlias, 1); + +*alias = (InputVisitorAlias) { +.alias = a->alias, +.alias_so = a->alias_so, +.src= &a->src[1], +}; + +QSLIST_INSERT_HEAD(&dst->aliases, alias, next); +} +} +} + static const QListEntry *qobject_input_push(QObjectInputVisitor *qiv, const char *name, QObject *obj, void *qapi) @@ -226,6 +284,9 @@ static const QListEntry *qobject_input_push(QObjectInputVisitor *qiv, g_hash_table_insert(h, (void *)qdict_entry_key(entry), NULL); } tos->h = h; +if (!QSLIST_EMPTY(&qiv->stack)) { +propagate_aliases(tos, QSLIST_FIRST(&qiv->stack), name); +} } else { assert(qlist); tos->entry = qlist_first(qlist); @@ -257,10 +318,17 @@ static bool qobject_input_check_struct(Visitor *v, Error **errp) static void qobject_input_stack_object_free(StackObject *tos) { +InputVisitorAlias *a; + if (tos->h) { g_hash_table_unref(tos->h); } +while ((a = QSLIST_FIRST(&tos->aliases))) { +QSLIST_REMOVE_HEAD(&tos->aliases, next); +g_free(a); +} + g_free(tos); } @@ -274,6 +342,50 @@ static void qobject_input_pop(Visitor *v, void **obj) qobject_input_stack_object_free(tos); } +static void qobject_input_start_alias_scope(Visitor *v) +{ +QObjectInputVisitor *qiv = to_qiv(v); +StackObject *tos = QSLIST_FIRST(&qiv->stack); + +tos->alias_scope_nesting++; +} + +static void qobject_input_end_alias_scope(Visitor *v) +{ +QObjectInputVisitor *qiv = to_qiv(v); +StackObject *tos = QSLIST_FIRST(&qiv->stack); +InputVisitorAlias *a, *next; + +assert(tos->alias_scope_nesting > 0); +tos->alias_scope_nesting--; + +QSLIST_FOREACH_SAFE(a, &tos->aliases, next, next) { +if (a->scope_nesting > tos->alias_scope_nesting) { +QSLIST_REMOVE(&tos->aliases, a, InputVisitorAlias, next); +g_free(a); +} +} +} + +static void qobject_input_define_alias(Visitor *v, const char *alias_name, + const char **source) +{ +QObjectInputVi
[PATCH 0/6] qapi: Add support for aliases
This series introduces alias definitions for QAPI object types (structs and unions). This allows using the same QAPI type and visitor even when the syntax has some variations between different external interfaces such as QMP and the command line. It also provides a new tool for evolving the schema while maintaining backwards compatibility (possibly during a deprecation period). The first user is intended to be a QAPIfied -chardev command line option, for which I'll send a separate series. A git tag is available that contains both this series and the chardev changes that make use of it: https://repo.or.cz/qemu/kevin.git qapi-alias-chardev-v1 Kevin Wolf (6): qapi: Add interfaces for alias support to Visitor qapi: Remember alias definitions in qobject-input-visitor qapi: Simplify full_name_nth() in qobject-input-visitor qapi: Apply aliases in qobject-input-visitor qapi: Add support for aliases tests/qapi-schema: Test cases for aliases docs/devel/qapi-code-gen.txt | 37 +- docs/sphinx/qapidoc.py| 2 +- include/qapi/visitor-impl.h | 12 + include/qapi/visitor.h| 37 ++ qapi/qapi-visit-core.c| 21 ++ qapi/qobject-input-visitor.c | 322 -- scripts/qapi/expr.py | 34 +- scripts/qapi/schema.py| 27 +- scripts/qapi/types.py | 4 +- scripts/qapi/visit.py | 33 +- tests/qapi-schema/test-qapi.py| 7 +- tests/qapi-schema/alias-bad-type.err | 2 + tests/qapi-schema/alias-bad-type.json | 3 + tests/qapi-schema/alias-bad-type.out | 0 tests/qapi-schema/alias-missing-source.err| 2 + tests/qapi-schema/alias-missing-source.json | 3 + tests/qapi-schema/alias-missing-source.out| 0 tests/qapi-schema/alias-name-bad-type.err | 2 + tests/qapi-schema/alias-name-bad-type.json| 3 + tests/qapi-schema/alias-name-bad-type.out | 0 tests/qapi-schema/alias-source-bad-type.err | 2 + tests/qapi-schema/alias-source-bad-type.json | 3 + tests/qapi-schema/alias-source-bad-type.out | 0 .../alias-source-elem-bad-type.err| 2 + .../alias-source-elem-bad-type.json | 3 + .../alias-source-elem-bad-type.out| 0 tests/qapi-schema/alias-source-empty.err | 2 + tests/qapi-schema/alias-source-empty.json | 3 + tests/qapi-schema/alias-source-empty.out | 0 tests/qapi-schema/alias-unknown-key.err | 3 + tests/qapi-schema/alias-unknown-key.json | 3 + tests/qapi-schema/alias-unknown-key.out | 0 tests/qapi-schema/aliases-bad-type.err| 2 + tests/qapi-schema/aliases-bad-type.json | 3 + tests/qapi-schema/aliases-bad-type.out| 0 tests/qapi-schema/double-type.err | 2 +- tests/qapi-schema/meson.build | 8 + tests/qapi-schema/qapi-schema-test.json | 24 ++ tests/qapi-schema/qapi-schema-test.out| 29 ++ tests/qapi-schema/unknown-expr-key.err| 2 +- 40 files changed, 596 insertions(+), 46 deletions(-) create mode 100644 tests/qapi-schema/alias-bad-type.err create mode 100644 tests/qapi-schema/alias-bad-type.json create mode 100644 tests/qapi-schema/alias-bad-type.out create mode 100644 tests/qapi-schema/alias-missing-source.err create mode 100644 tests/qapi-schema/alias-missing-source.json create mode 100644 tests/qapi-schema/alias-missing-source.out create mode 100644 tests/qapi-schema/alias-name-bad-type.err create mode 100644 tests/qapi-schema/alias-name-bad-type.json create mode 100644 tests/qapi-schema/alias-name-bad-type.out create mode 100644 tests/qapi-schema/alias-source-bad-type.err create mode 100644 tests/qapi-schema/alias-source-bad-type.json create mode 100644 tests/qapi-schema/alias-source-bad-type.out create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.err create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.json create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.out create mode 100644 tests/qapi-schema/alias-source-empty.err create mode 100644 tests/qapi-schema/alias-source-empty.json create mode 100644 tests/qapi-schema/alias-source-empty.out create mode 100644 tests/qapi-schema/alias-unknown-key.err create mode 100644 tests/qapi-schema/alias-unknown-key.json create mode 100644 tests/qapi-schema/alias-unknown-key.out create mode 100644 tests/qapi-schema/aliases-bad-type.err create mode 100644 tests/qapi-schema/aliases-bad-type.json create mode 100644 tests/qapi-schema/aliases-bad-type.out -- 2.28.0
[PATCH 4/6] qapi: Apply aliases in qobject-input-visitor
When looking for an object in a struct in the external representation, check not only the currently visited struct, but also whether an alias in the current StackObject matches and try to fetch the value from the alias then. Providing two values for the same object through different aliases is an error. Signed-off-by: Kevin Wolf --- qapi/qobject-input-visitor.c | 169 +-- 1 file changed, 160 insertions(+), 9 deletions(-) diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index 1415561828..faca5b6b55 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -74,6 +74,8 @@ struct QObjectInputVisitor { QObject *root; bool keyval;/* Assume @root made with keyval_parse() */ +QDict *empty_qdict; /* Used for implicit objects */ + /* Stack of objects being visited (all entries will be either * QDict or QList). */ QSLIST_HEAD(, StackObject) stack; @@ -141,9 +143,139 @@ static const char *full_name(QObjectInputVisitor *qiv, const char *name) return full_name_so(qiv, name, tos); } +static bool find_object_member(QObjectInputVisitor *qiv, + StackObject **so, const char **name, + bool *implicit_object, Error **errp); + +static bool alias_present(QObjectInputVisitor *qiv, + InputVisitorAlias *a, const char *name) +{ +StackObject *so = a->alias_so; + +/* + * The passed source @name is only relevant for wildcard aliases which + * don't have a separate name, otherwise we use the alias name. + */ +if (a->alias) { +name = a->alias; +} + +if (!find_object_member(qiv, &so, &name, NULL, NULL)) { +return false; +} + +/* + * Every source can be used only once. If a value in the input would end up + * being used twice through aliases, we'll fail the second access. + */ +if (!g_hash_table_contains(so->h, name)) { +return false; +} + +return true; +} + +static bool alias_source_matches(QObjectInputVisitor *qiv, + StackObject *so, InputVisitorAlias *a, + const char *name, bool *implicit_object) +{ +if (a->src[0] == NULL) { +assert(a->alias == NULL); +return true; +} + +if (!strcmp(a->src[0], name)) { +if (a->alias && a->src[1] == NULL) { +/* + * We're matching an exact member, the source for this alias is + * immediately in @so. + */ +return true; +} else if (implicit_object) { +/* + * We're only looking at a prefix of the source path for the alias. + * If the input contains no object of the requested name, we will + * implicitly create an empty one so that the alias can still be + * used. + * + * We want to create the implicit object only if the alias is + * actually used, but we can't tell here for wildcard aliases (only + * a later visitor call will determine this). This means that + * wildcard aliases must never have optional keys in their source + * path. + */ +if (!a->alias || alias_present(qiv, a, a->alias)) { +*implicit_object = true; +} +} +} + +return false; +} + +static bool find_object_member(QObjectInputVisitor *qiv, + StackObject **so, const char **name, + bool *implicit_object, Error **errp) +{ +StackObject *cur_so = *so; +QDict *qdict = qobject_to(QDict, cur_so->obj); +const char *found = NULL; +bool found_is_wildcard = false; +InputVisitorAlias *a; + +if (implicit_object) { +*implicit_object = false; +} + +/* Directly present in the container */ +if (qdict_haskey(qdict, *name)) { +found = *name; +} + +/* + * Find aliases whose source path matches @name in this StackObject. We can + * then get the value with the key a->alias from a->alias_so. + */ +QSLIST_FOREACH(a, &cur_so->aliases, next) { +if (a->alias == NULL && found) { +/* + * Skip wildcard aliases if we already have a match. This is + * not a conflict that should result in an error. + */ +continue; +} + +if (!alias_source_matches(qiv, cur_so, a, *name, implicit_object)) { +continue; +} + +if (!alias_present(qiv, a, *name)) { +continue; +} + +if (found && !found_is_wildcard) { +error_setg(errp, "Value for parameter %s was already given " + "through an alias", full_name_so(qiv, *name, *so)); +return false; +} else { +found = a->alias ?: *name; +
[PATCH 1/6] qapi: Add interfaces for alias support to Visitor
This adds functions to the Visitor interface that can be used to define aliases and alias scopes. Signed-off-by: Kevin Wolf --- include/qapi/visitor-impl.h | 12 include/qapi/visitor.h | 37 + qapi/qapi-visit-core.c | 21 + 3 files changed, 70 insertions(+) diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h index 7362c043be..e30da2599c 100644 --- a/include/qapi/visitor-impl.h +++ b/include/qapi/visitor-impl.h @@ -113,6 +113,18 @@ struct Visitor The core takes care of the return type in the public interface. */ void (*optional)(Visitor *v, const char *name, bool *present); +/* + * Optional; intended for input visitors. If not given, aliases are + * ignored. + */ +void (*define_alias)(Visitor *v, const char *alias, const char **source); + +/* Must be set if define_alias is set */ +void (*start_alias_scope)(Visitor *v); + +/* Must be set if define_alias is set */ +void (*end_alias_scope)(Visitor *v); + /* Must be set */ VisitorType type; diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h index ebc19ede7f..9bdc0ee03d 100644 --- a/include/qapi/visitor.h +++ b/include/qapi/visitor.h @@ -459,6 +459,43 @@ void visit_end_alternate(Visitor *v, void **obj); */ bool visit_optional(Visitor *v, const char *name, bool *present); +/* + * Defines a new alias rule. + * + * If @alias is non-NULL, the member named @alias in the external + * representation of the current struct is defined as an alias for the + * member described by @source. + * + * If @alias is NULL, all members of the struct described by @source are + * considered to have alias members with the same key in the current + * struct. + * + * @source is a NULL-terminated array of names that describe the path to + * a member, starting from the currently visited struct. + * + * The alias stays valid until the current alias scope ends. + * visit_start/end_struct() implicitly start/end an alias scope. + * Additionally, visit_start/end_alias_scope() can be used to explicitly + * create a nested alias scope. + */ +void visit_define_alias(Visitor *v, const char *alias, const char **source); + +/* + * Begins an explicit alias scope. + * + * Alias definitions after here will only stay valid until the + * corresponding visit_end_alias_scope() is called. + */ +void visit_start_alias_scope(Visitor *v); + +/* + * Ends an explicit alias scope. + * + * Alias definitions between the correspoding visit_start_alias_scope() + * call and here go out of scope and won't apply in later code any more. + */ +void visit_end_alias_scope(Visitor *v); + /* * Visit an enum value. * diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c index 7e5f40e7f0..719a9f5da2 100644 --- a/qapi/qapi-visit-core.c +++ b/qapi/qapi-visit-core.c @@ -135,6 +135,27 @@ bool visit_optional(Visitor *v, const char *name, bool *present) return *present; } +void visit_define_alias(Visitor *v, const char *alias, const char **source) +{ +if (v->define_alias) { +v->define_alias(v, alias, source); +} +} + +void visit_start_alias_scope(Visitor *v) +{ +if (v->start_alias_scope) { +v->start_alias_scope(v); +} +} + +void visit_end_alias_scope(Visitor *v) +{ +if (v->end_alias_scope) { +v->end_alias_scope(v); +} +} + bool visit_is_input(Visitor *v) { return v->type == VISITOR_INPUT; -- 2.28.0
[PATCH 3/6] qapi: Simplify full_name_nth() in qobject-input-visitor
Instead of counting how many elements from the top of the stack we need to ignore until we find the thing we're interested in, we can just directly pass the StackObject pointer because all callers already know it. We only need a different way now to tell if we want to know the name of something contained in the given StackObject or of the StackObject itself. Passing name = NULL is the obvious way to request the latter. This simplifies the interface and makes it easier to use in cases where we have the StackObject, but don't know how many steps down the stack it is. Signed-off-by: Kevin Wolf --- qapi/qobject-input-visitor.c | 38 ++-- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/qapi/qobject-input-visitor.c b/qapi/qobject-input-visitor.c index a00ac32682..1415561828 100644 --- a/qapi/qobject-input-visitor.c +++ b/qapi/qobject-input-visitor.c @@ -87,20 +87,16 @@ static QObjectInputVisitor *to_qiv(Visitor *v) } /* - * Find the full name of something @qiv is currently visiting. - * @qiv is visiting something named @name in the stack of containers - * @qiv->stack. - * If @n is zero, return its full name. - * If @n is positive, return the full name of the @n-th container - * counting from the top. The stack of containers must have at least - * @n elements. - * The returned string is valid until the next full_name_nth(@v) or - * destruction of @v. + * Find the full name of something named @name in @so which @qiv is + * currently visiting. If @name is NULL, find the full name of @so + * itself. + * + * The returned string is valid until the next full_name_so(@qiv) or + * destruction of @qiv. */ -static const char *full_name_nth(QObjectInputVisitor *qiv, const char *name, - int n) +static const char *full_name_so(QObjectInputVisitor *qiv, const char *name, +StackObject *so) { -StackObject *so; char buf[32]; if (qiv->errname) { @@ -109,10 +105,13 @@ static const char *full_name_nth(QObjectInputVisitor *qiv, const char *name, qiv->errname = g_string_new(""); } -QSLIST_FOREACH(so , &qiv->stack, node) { -if (n) { -n--; -} else if (qobject_type(so->obj) == QTYPE_QDICT) { +if (!name && so) { +name = so->name; +so = QSLIST_NEXT(so, node); +} + +for (; so; so = QSLIST_NEXT(so, node)) { +if (qobject_type(so->obj) == QTYPE_QDICT) { g_string_prepend(qiv->errname, name ?: ""); g_string_prepend_c(qiv->errname, '.'); } else { @@ -123,7 +122,6 @@ static const char *full_name_nth(QObjectInputVisitor *qiv, const char *name, } name = so->name; } -assert(!n); if (name) { g_string_prepend(qiv->errname, name); @@ -138,7 +136,9 @@ static const char *full_name_nth(QObjectInputVisitor *qiv, const char *name, static const char *full_name(QObjectInputVisitor *qiv, const char *name) { -return full_name_nth(qiv, name, 0); +StackObject *tos = QSLIST_FIRST(&qiv->stack); + +return full_name_so(qiv, name, tos); } static QObject *qobject_input_try_get_object(QObjectInputVisitor *qiv, @@ -473,7 +473,7 @@ static bool qobject_input_check_list(Visitor *v, Error **errp) if (tos->entry) { error_setg(errp, "Only %u list elements expected in %s", - tos->index + 1, full_name_nth(qiv, NULL, 1)); + tos->index + 1, full_name_so(qiv, NULL, tos)); return false; } return true; -- 2.28.0
[PATCH 5/6] qapi: Add support for aliases
Introduce alias definitions for object types (structs and unions). This allows using the same QAPI type and visitor for many syntax variations that exist in the external representation, like between QMP and the command line. It also provides a new tool for evolving the schema while maintaining backwards compatibility during a deprecation period. Signed-off-by: Kevin Wolf --- docs/devel/qapi-code-gen.txt | 37 +++--- docs/sphinx/qapidoc.py | 2 +- scripts/qapi/expr.py | 34 +-- scripts/qapi/schema.py | 27 +++ scripts/qapi/types.py | 4 ++- scripts/qapi/visit.py | 33 --- tests/qapi-schema/test-qapi.py | 7 - tests/qapi-schema/double-type.err | 2 +- tests/qapi-schema/unknown-expr-key.err | 2 +- 9 files changed, 130 insertions(+), 18 deletions(-) diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt index 6906a06ad2..6da14d5275 100644 --- a/docs/devel/qapi-code-gen.txt +++ b/docs/devel/qapi-code-gen.txt @@ -231,7 +231,8 @@ Syntax: 'data': MEMBERS, '*base': STRING, '*if': COND, - '*features': FEATURES } + '*features': FEATURES, + '*aliases': ALIASES } MEMBERS = { MEMBER, ... } MEMBER = STRING : TYPE-REF | STRING : { 'type': TYPE-REF, @@ -286,13 +287,15 @@ Syntax: UNION = { 'union': STRING, 'data': BRANCHES, '*if': COND, - '*features': FEATURES } + '*features': FEATURES, + '*aliases': ALIASES } | { 'union': STRING, 'data': BRANCHES, 'base': ( MEMBERS | STRING ), 'discriminator': STRING, '*if': COND, - '*features': FEATURES } + '*features': FEATURES, + '*aliases': ALIASES } BRANCHES = { BRANCH, ... } BRANCH = STRING : TYPE-REF | STRING : { 'type': TYPE-REF, '*if': COND } @@ -837,6 +840,34 @@ shows a conditional entity only when the condition is satisfied in this particular build. +=== Aliases === + +Object types, including structs and unions, can contain alias +definitions. + +Aliases define alternative member names that may be used in the +external representation to provide a value for a member in the same +object or in a nested object. + +Syntax: +ALIAS = { '*alias': STRING, + 'source': [ STRING, ... ] } + +'source' is a list of member names representing the path to an object +member, starting from the type where the alias definition is +specified. It may refer to another alias name. It is allowed to use +a path that doesn't necessarily match an existing member in every +variant or even at all; in this case, the alias remains unused. + +If 'alias' is present, then the single member referred to by 'source' +is made accessible with the name given in 'alias' in the type where +the alias definition is specified. + +If 'alias' is not present, then all members in the object referred to +by 'source' are made accessible in the type where the alias definition +is specified with the same name as they have in 'source'. + + === Documentation comments === A multi-line comment that starts and ends with a '##' line is a diff --git a/docs/sphinx/qapidoc.py b/docs/sphinx/qapidoc.py index e03abcbb95..6c94c01148 100644 --- a/docs/sphinx/qapidoc.py +++ b/docs/sphinx/qapidoc.py @@ -310,7 +310,7 @@ class QAPISchemaGenRSTVisitor(QAPISchemaVisitor): + self._nodes_for_if_section(ifcond)) def visit_object_type(self, name, info, ifcond, features, - base, members, variants): + base, members, variants, aliases): doc = self._cur_doc if base and base.is_implicit(): base = None diff --git a/scripts/qapi/expr.py b/scripts/qapi/expr.py index 2fcaaa2497..21fded529b 100644 --- a/scripts/qapi/expr.py +++ b/scripts/qapi/expr.py @@ -198,6 +198,32 @@ def check_features(features, info): check_if(f, info, source) +def check_aliases(aliases, info): +if aliases is None: +return +if not isinstance(aliases, list): +raise QAPISemError(info, "'aliases' must be an array") +for a in aliases: +if not isinstance(a, dict): +raise QAPISemError(info, "'aliases' entries must be objects") +check_keys(a, info, "alias", ['source'], ['alias']) + +if 'alias' in a: +source = "alias member 'alias'" +check_name_is_str(a['alias'], info, source) +check_name_str(a['alias'], info, source) + +if not isinstance(a['source'], list): +raise QAPISemError(info, "'source' must be an array") +if not a['source']: +raise QAPISemError(info, "'source' must not be empty") + +
[PATCH 6/6] tests/qapi-schema: Test cases for aliases
Signed-off-by: Kevin Wolf --- tests/qapi-schema/alias-bad-type.err | 2 ++ tests/qapi-schema/alias-bad-type.json | 3 ++ tests/qapi-schema/alias-bad-type.out | 0 tests/qapi-schema/alias-missing-source.err| 2 ++ tests/qapi-schema/alias-missing-source.json | 3 ++ tests/qapi-schema/alias-missing-source.out| 0 tests/qapi-schema/alias-name-bad-type.err | 2 ++ tests/qapi-schema/alias-name-bad-type.json| 3 ++ tests/qapi-schema/alias-name-bad-type.out | 0 tests/qapi-schema/alias-source-bad-type.err | 2 ++ tests/qapi-schema/alias-source-bad-type.json | 3 ++ tests/qapi-schema/alias-source-bad-type.out | 0 .../alias-source-elem-bad-type.err| 2 ++ .../alias-source-elem-bad-type.json | 3 ++ .../alias-source-elem-bad-type.out| 0 tests/qapi-schema/alias-source-empty.err | 2 ++ tests/qapi-schema/alias-source-empty.json | 3 ++ tests/qapi-schema/alias-source-empty.out | 0 tests/qapi-schema/alias-unknown-key.err | 3 ++ tests/qapi-schema/alias-unknown-key.json | 3 ++ tests/qapi-schema/alias-unknown-key.out | 0 tests/qapi-schema/aliases-bad-type.err| 2 ++ tests/qapi-schema/aliases-bad-type.json | 3 ++ tests/qapi-schema/aliases-bad-type.out| 0 tests/qapi-schema/meson.build | 8 + tests/qapi-schema/qapi-schema-test.json | 24 +++ tests/qapi-schema/qapi-schema-test.out| 29 +++ 27 files changed, 102 insertions(+) create mode 100644 tests/qapi-schema/alias-bad-type.err create mode 100644 tests/qapi-schema/alias-bad-type.json create mode 100644 tests/qapi-schema/alias-bad-type.out create mode 100644 tests/qapi-schema/alias-missing-source.err create mode 100644 tests/qapi-schema/alias-missing-source.json create mode 100644 tests/qapi-schema/alias-missing-source.out create mode 100644 tests/qapi-schema/alias-name-bad-type.err create mode 100644 tests/qapi-schema/alias-name-bad-type.json create mode 100644 tests/qapi-schema/alias-name-bad-type.out create mode 100644 tests/qapi-schema/alias-source-bad-type.err create mode 100644 tests/qapi-schema/alias-source-bad-type.json create mode 100644 tests/qapi-schema/alias-source-bad-type.out create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.err create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.json create mode 100644 tests/qapi-schema/alias-source-elem-bad-type.out create mode 100644 tests/qapi-schema/alias-source-empty.err create mode 100644 tests/qapi-schema/alias-source-empty.json create mode 100644 tests/qapi-schema/alias-source-empty.out create mode 100644 tests/qapi-schema/alias-unknown-key.err create mode 100644 tests/qapi-schema/alias-unknown-key.json create mode 100644 tests/qapi-schema/alias-unknown-key.out create mode 100644 tests/qapi-schema/aliases-bad-type.err create mode 100644 tests/qapi-schema/aliases-bad-type.json create mode 100644 tests/qapi-schema/aliases-bad-type.out diff --git a/tests/qapi-schema/alias-bad-type.err b/tests/qapi-schema/alias-bad-type.err new file mode 100644 index 00..a982d380b8 --- /dev/null +++ b/tests/qapi-schema/alias-bad-type.err @@ -0,0 +1,2 @@ +alias-bad-type.json: In struct 'AliasStruct0': +alias-bad-type.json:1: 'aliases' entries must be objects diff --git a/tests/qapi-schema/alias-bad-type.json b/tests/qapi-schema/alias-bad-type.json new file mode 100644 index 00..0aa5d206fe --- /dev/null +++ b/tests/qapi-schema/alias-bad-type.json @@ -0,0 +1,3 @@ +{ 'struct': 'AliasStruct0', + 'data': { 'foo': 'int' }, + 'aliases': [ 'must be an object' ] } diff --git a/tests/qapi-schema/alias-bad-type.out b/tests/qapi-schema/alias-bad-type.out new file mode 100644 index 00..e69de29bb2 diff --git a/tests/qapi-schema/alias-missing-source.err b/tests/qapi-schema/alias-missing-source.err new file mode 100644 index 00..1cfe8a6aa5 --- /dev/null +++ b/tests/qapi-schema/alias-missing-source.err @@ -0,0 +1,2 @@ +alias-missing-source.json: In struct 'AliasStruct0': +alias-missing-source.json:1: alias misses key 'source' diff --git a/tests/qapi-schema/alias-missing-source.json b/tests/qapi-schema/alias-missing-source.json new file mode 100644 index 00..4ae22c2221 --- /dev/null +++ b/tests/qapi-schema/alias-missing-source.json @@ -0,0 +1,3 @@ +{ 'struct': 'AliasStruct0', + 'data': { 'foo': 'int' }, + 'aliases': [ { 'alias': 'bar' } ] } diff --git a/tests/qapi-schema/alias-missing-source.out b/tests/qapi-schema/alias-missing-source.out new file mode 100644 index 00..e69de29bb2 diff --git a/tests/qapi-schema/alias-name-bad-type.err b/tests/qapi-schema/alias-name-bad-type.err new file mode 100644 index 00..83b9fe0b65 --- /dev/null +++ b/tests/qapi-schema/alias-name-bad-type.err @@ -0,0 +1,2 @@ +alias-name-bad-type.json: In struct 'AliasStruct0': +alias-name-bad-type.json:1: alias member 'alias' requires a stri
[Bug 1809291] Re: SD Card not working in Ubuntu 18.10 (CMD 2, 3 timeout). The device worked fine in Ubuntu 18.04 and earlier versions but not in Ubuntu 18.10
The new code in Qemu is correct, the real problem is that the code [1] is trying to negotiate an invalid working voltage with CMD41. The SD specification marks the first 15 bits as reserved (except for the 7th, that's the dual-voltage flag) meaning that compliant cards will timeout as well. If you look closer at the source code you can see that this problem's been patched by replacing the invalid argument 0x1 with a more reasonable 0x, barely enough to work in the 2.7V range. [1] https://eecs.wsu.edu/~cs460/samples/LAB5pre/step3/sdc.c -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1809291 Title: SD Card not working in Ubuntu 18.10 (CMD 2,3 timeout). The device worked fine in Ubuntu 18.04 and earlier versions but not in Ubuntu 18.10 Status in QEMU: Confirmed Bug description: ARM PL181 MMC card no longer working in qemu-system-arm in Ubuntu 18.10 The MMC driver code worked fine in Ubuntu 15.10 to 18.04. The command to run qemu-system-arm is qemu-system-arm -M versatilepb -m 256M -sd sdimage -kernel t.bin -serial mon:stdio During SDC initialization, SDC commands 2, 3, 9, 13, 7, 16 all timeout, which cause subsequent read/write commands 17/24 to fail also. Tried both ARM versatilepb and realview-pb-a8, realview-pbx-a9 boards: all the same. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1809291/+subscriptions
Re: [RFC PATCH 12/25] acpi/pci: Consolidate host bridge setup
On 20-11-10 21:47:11, Ben Widawsky wrote: > This cleanup will make it easier to add support for CXL to the mix. This patch needs bios table updates for qtest... I am fixing it... > > Signed-off-by: Ben Widawsky > --- > hw/i386/acpi-build.c | 31 +-- > 1 file changed, 17 insertions(+), 14 deletions(-) > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index 4f66642d88..99b3088c9e 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -1486,6 +1486,20 @@ static void build_smb0(Aml *table, I2CBus *smbus, int > devnr, int func) > aml_append(table, scope); > } > > +enum { PCI, PCIE }; > +static void init_pci_acpi(Aml *dev, int uid, int type) > +{ > +if (type == PCI) { > +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); > +aml_append(dev, aml_name_decl("_UID", aml_int(uid))); > +} else { > +aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08"))); > +aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03"))); > +aml_append(dev, aml_name_decl("_UID", aml_int(uid))); > +aml_append(dev, build_q35_osc_method()); > +} > +} > + > static void > build_dsdt(GArray *table_data, BIOSLinker *linker, > AcpiPmInfo *pm, AcpiMiscInfo *misc, > @@ -1514,9 +1528,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > if (misc->is_piix4) { > sb_scope = aml_scope("_SB"); > dev = aml_device("PCI0"); > -aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03"))); > +init_pci_acpi(dev, 0, PCI); > aml_append(dev, aml_name_decl("_ADR", aml_int(0))); > -aml_append(dev, aml_name_decl("_UID", aml_int(0))); > aml_append(sb_scope, dev); > aml_append(dsdt, sb_scope); > > @@ -1530,11 +1543,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > } else { > sb_scope = aml_scope("_SB"); > dev = aml_device("PCI0"); > -aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08"))); > -aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03"))); > +init_pci_acpi(dev, 0, PCIE); > aml_append(dev, aml_name_decl("_ADR", aml_int(0))); > -aml_append(dev, aml_name_decl("_UID", aml_int(0))); > -aml_append(dev, build_q35_osc_method()); > aml_append(sb_scope, dev); > > if (pm->smi_on_cpuhp) { > @@ -1636,15 +1646,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, > > scope = aml_scope("\\_SB"); > dev = aml_device("PC%.02X", bus_num); > -aml_append(dev, aml_name_decl("_UID", aml_int(bus_num))); > aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num))); > -if (pci_bus_is_express(bus)) { > -aml_append(dev, aml_name_decl("_HID", > aml_eisaid("PNP0A08"))); > -aml_append(dev, aml_name_decl("_CID", > aml_eisaid("PNP0A03"))); > -aml_append(dev, build_q35_osc_method()); > -} else { > -aml_append(dev, aml_name_decl("_HID", > aml_eisaid("PNP0A03"))); > -} > +init_pci_acpi(dev, bus_num, pci_bus_is_express(bus) ? PCIE : > PCI); > > if (numa_node != NUMA_NODE_UNASSIGNED) { > aml_append(dev, aml_name_decl("_PXM", aml_int(numa_node))); > -- > 2.29.2 > >