Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices
On Wed, Nov 30, 2022 at 04:03:28PM -0500, Stefan Hajnoczi wrote: On Fri, Nov 25, 2022 at 09:12:43AM +0100, Stefano Garzarella wrote: On Thu, Nov 24, 2022 at 01:36:29PM -0500, Stefan Hajnoczi wrote: > On Wed, Nov 23, 2022 at 02:16:30PM +0100, Stefano Garzarella wrote: > > Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features") > > properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user > > backend, but we forgot to enable vrings as specified in > > docs/interop/vhost-user.rst: > > > > If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the > > ring starts directly in the enabled state. > > > > If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is > > initialized in a disabled state and is enabled by > > ``VHOST_USER_SET_VRING_ENABLE`` with parameter 1. > > > > Some vhost-user front-ends already did this by calling > > vhost_ops.vhost_set_vring_enable() directly: > > - backends/cryptodev-vhost.c > > - hw/net/virtio-net.c > > - hw/virtio/vhost-user-gpio.c > > > > But most didn't do that, so we would leave the vrings disabled and some > > backends would not work. We observed this issue with the rust version of > > virtiofsd [1], which uses the event loop [2] provided by the > > vhost-user-backend crate where requests are not processed if vring is > > not enabled. > > > > Let's fix this issue by enabling the vrings in vhost_dev_start() for > > vhost-user front-ends that don't already do this directly. Same thing > > also in vhost_dev_stop() where we disable vrings. > > > > [1] https://gitlab.com/virtio-fs/virtiofsd > > [2] https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217 > > > > Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features") > > Reported-by: German Maglione > > Tested-by: German Maglione > > Signed-off-by: Stefano Garzarella > > --- > > include/hw/virtio/vhost.h | 6 +++-- > > backends/cryptodev-vhost.c | 4 ++-- > > backends/vhost-user.c | 4 ++-- > > hw/block/vhost-user-blk.c | 4 ++-- > > hw/net/vhost_net.c | 8 +++ > > hw/scsi/vhost-scsi-common.c| 4 ++-- > > hw/virtio/vhost-user-fs.c | 4 ++-- > > hw/virtio/vhost-user-gpio.c| 4 ++-- > > hw/virtio/vhost-user-i2c.c | 4 ++-- > > hw/virtio/vhost-user-rng.c | 4 ++-- > > hw/virtio/vhost-vsock-common.c | 4 ++-- > > hw/virtio/vhost.c | 44 ++ > > hw/virtio/trace-events | 4 ++-- > > 13 files changed, 67 insertions(+), 31 deletions(-) > > > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > > index 353252ac3e..67a6807fac 100644 > > --- a/include/hw/virtio/vhost.h > > +++ b/include/hw/virtio/vhost.h > > @@ -184,24 +184,26 @@ static inline bool vhost_dev_is_started(struct vhost_dev *hdev) > > * vhost_dev_start() - start the vhost device > > * @hdev: common vhost_dev structure > > * @vdev: the VirtIODevice structure > > + * @vrings: true to have vrings enabled in this call > > * > > * Starts the vhost device. From this point VirtIO feature negotiation > > * can start and the device can start processing VirtIO transactions. > > * > > * Return: 0 on success, < 0 on error. > > */ > > -int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev); > > +int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings); > > > > /** > > * vhost_dev_stop() - stop the vhost device > > * @hdev: common vhost_dev structure > > * @vdev: the VirtIODevice structure > > + * @vrings: true to have vrings disabled in this call > > * > > * Stop the vhost device. After the device is stopped the notifiers > > * can be disabled (@vhost_dev_disable_notifiers) and the device can > > * be torn down (@vhost_dev_cleanup). > > */ > > -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev); > > +void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings); > > > > /** > > * DOC: vhost device configuration handling > > diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c > > index bc13e466b4..572f87b3be 100644 > > --- a/backends/cryptodev-vhost.c > > +++ b/backends/cryptodev-vhost.c > > @@ -94,7 +94,7 @@ cryptodev_vhost_start_one(CryptoDevBackendVhost *crypto, > > goto fail_notifiers; > > } > > > > -r = vhost_dev_start(&crypto->dev, dev); > > +r = vhost_dev_start(&crypto->dev, dev, false); > > if (r < 0) { > > goto fail_start; > > } > > @@ -111,7 +111,7 @@ static void > > cryptodev_vhost_stop_one(CryptoDevBackendVhost *crypto, > > VirtIODevice *dev) > > { > > -vhost_dev_stop(&crypto->dev, dev); > > +vhost_dev_stop(&crypto->dev, dev, false); > > vhost_dev_disable_notifiers(&crypto->dev, dev); > > } > > > > diff --git a/backends/vhost-user.c b/backends/vhost-user.c > > index 5dedb2d987..7bfcaef976 100644 > > -
Re: regression: insmod module failed in VM with nvdimm on
On Thu, 1 Dec 2022 at 08:15, chenxiang (M) wrote: > > Hi Ard, > > > 在 2022/11/30 16:18, Ard Biesheuvel 写道: > > On Wed, 30 Nov 2022 at 08:53, Marc Zyngier wrote: > >> On Wed, 30 Nov 2022 02:52:35 +, > >> "chenxiang (M)" wrote: > >>> Hi, > >>> > >>> We boot the VM using following commands (with nvdimm on) (qemu > >>> version 6.1.50, kernel 6.0-r4): > >> How relevant is the presence of the nvdimm? Do you observe the failure > >> without this? > >> > >>> qemu-system-aarch64 -machine > >>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel > >>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios > >>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m > >>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0 > >>> ealycon=pl0ll,0x9000 pcie_ports=native pciehp.pciehp_debug=1' > >>> -object memory-backend-ram,id=ram1,size=10G -device > >>> nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1 > >>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1 > >>> > >>> Then in VM we insmod a module, vmalloc error occurs as follows (kernel > >>> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4): > >>> > >>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko > >>> [8.186563] vmap allocation for size 20480 failed: use > >>> vmalloc= to increase size > >> Have you tried increasing the vmalloc size to check that this is > >> indeed the problem? > >> > >> [...] > >> > >>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr: > >>> defer initialization to initcall where permitted"). > >> I guess you mean commit fc5a89f75d2a instead, right? > >> > >>> Do you have any idea about the issue? > >> I sort of suspect that the nvdimm gets vmap-ed and consumes a large > >> portion of the vmalloc space, but you give very little information > >> that could help here... > >> > > Ouch. I suspect what's going on here: that patch defers the > > randomization of the module region, so that we can decouple it from > > the very early init code. > > > > Obviously, it is happening too late now, and the randomized module > > region is overlapping with a vmalloc region that is in use by the time > > the randomization occurs. > > > > Does the below fix the issue? > > The issue still occurs, but it seems decrease the probability, before it > occured almost every time, after the change, i tried 2-3 times, and it > occurs. > But i change back "subsys_initcall" to "core_initcall", and i test more > than 20 times, and it is still ok. > Thank you for confirming. I will send out a patch today. > > > > diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c > > index 37a9deed2aec..71fb18b2f304 100644 > > --- a/arch/arm64/kernel/kaslr.c > > +++ b/arch/arm64/kernel/kaslr.c > > @@ -90,4 +90,4 @@ static int __init kaslr_init(void) > > > > return 0; > > } > > -subsys_initcall(kaslr_init) > > +arch_initcall(kaslr_init) > > . > > >
Re: [PATCH 2/9] ui: Fix silent truncation of numeric keys in HMP sendkey
On Thu, Dec 01, 2022 at 07:13:04AM +0100, Markus Armbruster wrote: > Keys are int. HMP sendkey assigns them from the value strtoul(), > silently truncating values greater than INT_MAX. Fix to reject them. > > While there, use qemu_strtoul() instead of strtoul() so checkpatch.pl > won't complain. > > Signed-off-by: Markus Armbruster > --- > monitor/hmp-cmds.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 1/9] ui: Check numeric part of expire_password argument @time properly
On Thu, Dec 01, 2022 at 07:13:03AM +0100, Markus Armbruster wrote: > When argument @time isn't 'now' or 'never', we parse it as an integer, > optionally prefixed with '+'. If parsing fails, we silently assume > zero. Report an error and fail instead. > > While there, use qemu_strtou64() instead of strtoull() so > checkpatch.pl won't complain. > > Aside: encoding numbers in strings is bad QMP practice. > > Signed-off-by: Markus Armbruster > --- > monitor/qmp-cmds.c | 17 +++-- > 1 file changed, 15 insertions(+), 2 deletions(-) Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 0/3] Fix UNMAP notifier for intel-iommu
On Thu, Dec 1, 2022 at 12:38 AM Michael S. Tsirkin wrote: > > On Tue, Nov 29, 2022 at 04:10:34PM +0800, Jason Wang wrote: > > Hi All: > > > > According to ATS, device should work if ATS is disabled. This is not > > correctly implemented in the current intel-iommu since it doesn't > > handle the UNMAP notifier correctly. This breaks the vhost-net + > > vIOMMU without dt. > > > > The root casue is that the when there's a device IOTLB miss (note that > > it's not specific to PCI so it can work without ATS), Qemu doesn't > > build the IOVA tree, so when guest start an IOTLB invalidation, Qemu > > won't trigger the UNMAP notifier. > > > > Fixing by build IOVA tree during IOMMU translsation. > > > > Thanks > > Any changes of Fixes tags? this is 8.0 yes? Yes, it's for 8.0. Thanks > > > Jason Wang (3): > > intel-iommu: fail MAP notifier without caching mode > > intel-iommu: fail DEVIOTLB_UNMAP without dt mode > > intel-iommu: build iova tree during IOMMU translation > > > > hw/i386/intel_iommu.c | 58 --- > > 1 file changed, 33 insertions(+), 25 deletions(-) > > > > -- > > 2.25.1 >
Re: [PATCH 3/3] intel-iommu: build iova tree during IOMMU translation
On Wed, Nov 30, 2022 at 11:17 PM Peter Xu wrote: > > On Wed, Nov 30, 2022 at 02:33:51PM +0800, Jason Wang wrote: > > On Tue, Nov 29, 2022 at 11:57 PM Peter Xu wrote: > > > > > > On Tue, Nov 29, 2022 at 04:10:37PM +0800, Jason Wang wrote: > > > > The IOVA tree is only built during page walk this breaks the device > > > > that tries to use UNMAP notifier only. One example is vhost-net, it > > > > tries to use UNMAP notifier when vIOMMU doesn't support DEVIOTLB_UNMAP > > > > notifier (e.g when dt mode is not enabled). The interesting part is > > > > that it doesn't use MAP since it can query the IOMMU translation by > > > > itself upon a IOTLB miss. > > > > > > > > This doesn't work since Qemu doesn't build IOVA tree in IOMMU > > > > translation which means the UNMAP notifier won't be triggered during > > > > the page walk since Qemu think it is never mapped. This could be > > > > noticed when vIOMMU is used with vhost_net but dt is disabled. > > > > > > > > Fixing this by build the iova tree during IOMMU translation, this > > > > makes sure the UNMAP notifier event could be identified during page > > > > walk. And we need to walk page table not only for UNMAP notifier but > > > > for MAP notifier during PSI. > > > > > > > > Signed-off-by: Jason Wang > > > > --- > > > > hw/i386/intel_iommu.c | 43 ++- > > > > 1 file changed, 18 insertions(+), 25 deletions(-) > > > > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > > > index d025ef2873..edeb62f4b2 100644 > > > > --- a/hw/i386/intel_iommu.c > > > > +++ b/hw/i386/intel_iommu.c > > > > @@ -1834,6 +1834,8 @@ static bool > > > > vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus, > > > > uint8_t access_flags; > > > > bool rid2pasid = (pasid == PCI_NO_PASID) && s->root_scalable; > > > > VTDIOTLBEntry *iotlb_entry; > > > > +const DMAMap *mapped; > > > > +DMAMap target; > > > > > > > > /* > > > > * We have standalone memory region for interrupt addresses, we > > > > @@ -1954,6 +1956,21 @@ out: > > > > entry->translated_addr = vtd_get_slpte_addr(slpte, s->aw_bits) & > > > > page_mask; > > > > entry->addr_mask = ~page_mask; > > > > entry->perm = access_flags; > > > > + > > > > +target.iova = entry->iova; > > > > +target.size = entry->addr_mask; > > > > +target.translated_addr = entry->translated_addr; > > > > +target.perm = entry->perm; > > > > + > > > > +mapped = iova_tree_find(vtd_as->iova_tree, &target); > > > > +if (!mapped) { > > > > +/* To make UNMAP notifier work, we need build iova tree here > > > > + * in order to have the UNMAP iommu notifier to be triggered > > > > + * during the page walk. > > > > + */ > > > > +iova_tree_insert(vtd_as->iova_tree, &target); > > > > +} > > > > + > > > > return true; > > > > > > > > error: > > > > @@ -2161,31 +2178,7 @@ static void > > > > vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s, > > > > ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), > > > > vtd_as->devfn, &ce); > > > > if (!ret && domain_id == vtd_get_domain_id(s, &ce, > > > > vtd_as->pasid)) { > > > > -if (vtd_as_has_map_notifier(vtd_as)) { > > > > -/* > > > > - * As long as we have MAP notifications registered in > > > > - * any of our IOMMU notifiers, we need to sync the > > > > - * shadow page table. > > > > - */ > > > > -vtd_sync_shadow_page_table_range(vtd_as, &ce, addr, > > > > size); > > > > -} else { > > > > -/* > > > > - * For UNMAP-only notifiers, we don't need to walk the > > > > - * page tables. We just deliver the PSI down to > > > > - * invalidate caches. > > > > - */ > > > > -IOMMUTLBEvent event = { > > > > -.type = IOMMU_NOTIFIER_UNMAP, > > > > -.entry = { > > > > -.target_as = &address_space_memory, > > > > -.iova = addr, > > > > -.translated_addr = 0, > > > > -.addr_mask = size - 1, > > > > -.perm = IOMMU_NONE, > > > > -}, > > > > -}; > > > > -memory_region_notify_iommu(&vtd_as->iommu, 0, event); > > > > > > Isn't this path the one that will be responsible for pass-through the > > > UNMAP > > > events from guest to vhost when there's no MAP notifier requested? > > > > Yes, but it doesn't do the iova tree removing. More below. > > > > > > > > At least that's what I expected when introducing the iova tree, because > > > for > > > unmap-only device hierachy I thought we didn't need the tree at all. > > > > Then the problem is the UNMAP notifier won't be trig
Re: [PATCH v2 3/4] vdpa: handle VIRTIO_NET_CTRL_ANNOUNCE in vhost_vdpa_net_handle_ctrl_avail
On Wed, Nov 30, 2022 at 3:07 PM Eugenio Perez Martin wrote: > > On Wed, Nov 30, 2022 at 8:02 AM Jason Wang wrote: > > > > On Fri, Nov 25, 2022 at 1:33 AM Eugenio Pérez wrote: > > > > > > Since this capability is emulated by qemu shadowed CVQ cannot forward it > > > to the device. Process all that command within qemu. > > > > > > Signed-off-by: Eugenio Pérez > > > --- > > > net/vhost-vdpa.c | 15 --- > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > > > index 2b4b85d8f8..8172aa8449 100644 > > > --- a/net/vhost-vdpa.c > > > +++ b/net/vhost-vdpa.c > > > @@ -489,9 +489,18 @@ static int > > > vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, > > > out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, > > > s->cvq_cmd_out_buffer, > > > vhost_vdpa_net_cvq_cmd_len()); > > > -dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status)); > > > -if (unlikely(dev_written < 0)) { > > > -goto out; > > > +if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) { > > > +/* > > > + * Guest announce capability is emulated by qemu, so dont > > > forward to > > > > s/dont/don't/ > > > > I'll correct it, thanks! > > > > + * the device. > > > + */ > > > +dev_written = sizeof(status); > > > +*s->status = VIRTIO_NET_OK; > > > > I wonder if we should avoid negotiating ANNOUNCE with vDPA parents if > > we do this? > > > > I can re-check, but the next patch should avoid it. Kind of, it makes sure guest can always see _F_ANNOUNCE. But does it prevent _F_ANNOUNCE from being negotiated? > Even if > negotiated, the parent should never set the announce status bit, since > we never tell the device is a destination device. That's the point, do we have such a guarantee? Or I wonder if there's any parent that supports _F_ANNOUNCE if yes, how it is supposed to work? > > But it's better to be on the safe side, I'll recheck. Exactly. Thanks > > Thanks! > > > Thanks > > > > > +} else { > > > +dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, > > > sizeof(status)); > > > +if (unlikely(dev_written < 0)) { > > > +goto out; > > > +} > > > } > > > > > > if (unlikely(dev_written < sizeof(status))) { > > > -- > > > 2.31.1 > > > > > >
Re: [PATCH for 8.0 v8 06/12] vdpa: extract vhost_vdpa_svq_allocate_iova_tree
On Wed, Nov 30, 2022 at 3:40 PM Eugenio Perez Martin wrote: > > On Wed, Nov 30, 2022 at 7:43 AM Jason Wang wrote: > > > > On Thu, Nov 24, 2022 at 11:52 PM Eugenio Pérez wrote: > > > > > > It can be allocated either if all virtqueues must be shadowed or if > > > vdpa-net detects it can shadow only cvq. > > > > > > Extract in its own function so we can reuse it. > > > > > > Signed-off-by: Eugenio Pérez > > > --- > > > net/vhost-vdpa.c | 29 + > > > 1 file changed, 17 insertions(+), 12 deletions(-) > > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > > > index 88e0eec5fa..9ee3bc4cd3 100644 > > > --- a/net/vhost-vdpa.c > > > +++ b/net/vhost-vdpa.c > > > @@ -240,6 +240,22 @@ static NetClientInfo net_vhost_vdpa_info = { > > > .check_peer_type = vhost_vdpa_check_peer_type, > > > }; > > > > > > +static int vhost_vdpa_get_iova_range(int fd, > > > + struct vhost_vdpa_iova_range > > > *iova_range) > > > +{ > > > +int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range); > > > + > > > +return ret < 0 ? -errno : 0; > > > +} > > > > I don't get why this needs to be moved to net specific code. > > > > It was already in net, this code just extracted it in its own function. Ok, there's similar function that in vhost-vdpa.c: static void vhost_vdpa_get_iova_range(struct vhost_vdpa *v) { int ret = vhost_vdpa_call(v->dev, VHOST_VDPA_GET_IOVA_RANGE, &v->iova_range); if (ret != 0) { v->iova_range.first = 0; v->iova_range.last = UINT64_MAX; } trace_vhost_vdpa_get_iova_range(v->dev, v->iova_range.first, v->iova_range.last); } I think we can reuse that. Thanks > > It's done in net because iova_tree must be the same for all queuepair > vhost, so we need to allocate before them. > > Thanks! > > > Thanks > > > > > + > > > +static VhostIOVATree *vhost_vdpa_svq_allocate_iova_tree(int > > > vdpa_device_fd) > > > +{ > > > +struct vhost_vdpa_iova_range iova_range; > > > + > > > +vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range); > > > +return vhost_iova_tree_new(iova_range.first, iova_range.last); > > > +} > > > + > > > static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr) > > > { > > > VhostIOVATree *tree = v->iova_tree; > > > @@ -587,14 +603,6 @@ static NetClientState > > > *net_vhost_vdpa_init(NetClientState *peer, > > > return nc; > > > } > > > > > > -static int vhost_vdpa_get_iova_range(int fd, > > > - struct vhost_vdpa_iova_range > > > *iova_range) > > > -{ > > > -int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range); > > > - > > > -return ret < 0 ? -errno : 0; > > > -} > > > - > > > static int vhost_vdpa_get_features(int fd, uint64_t *features, Error > > > **errp) > > > { > > > int ret = ioctl(fd, VHOST_GET_FEATURES, features); > > > @@ -690,14 +698,11 @@ int net_init_vhost_vdpa(const Netdev *netdev, const > > > char *name, > > > } > > > > > > if (opts->x_svq) { > > > -struct vhost_vdpa_iova_range iova_range; > > > - > > > if (!vhost_vdpa_net_valid_svq_features(features, errp)) { > > > goto err_svq; > > > } > > > > > > -vhost_vdpa_get_iova_range(vdpa_device_fd, &iova_range); > > > -iova_tree = vhost_iova_tree_new(iova_range.first, > > > iova_range.last); > > > +iova_tree = vhost_vdpa_svq_allocate_iova_tree(vdpa_device_fd); > > > } > > > > > > ncs = g_malloc0(sizeof(*ncs) * queue_pairs); > > > -- > > > 2.31.1 > > > > > >
Re: [PATCH v12 0/7] s390x: CPU Topology
Hello Pierre On 11/29/22 18:41, Pierre Morel wrote: Hi, The implementation of the CPU Topology in QEMU has been modified since the last patch series. - The two preliminary patches have been accepted and are no longer part of this series. - The topology machine property has been abandoned - the topology_capable QEMU capability has been abandoned - both where replaced with a new CPU feature, topology-disable to fence per default the ctop topology information feature. To use the QEMU patches, you will need Linux V6-rc1 or newer, or use the following Linux mainline patches: f5ecfee94493 2022-07-20 KVM: s390: resetting the Topology-Change-Report 24fe0195bc19 2022-07-20 KVM: s390: guest support for topology function 0130337ec45b 2022-07-20 KVM: s390: Cleanup ipte lock access and SIIF fac.. Currently this code is for KVM only, I have no idea if it is interesting to provide a TCG patch. If ever it will be done in another series. To have a better understanding of the S390x CPU Topology and its implementation in QEMU you can have a look at the documentation in the last patch of this series. The admin will want to match the host and the guest topology, taking into account that the guest does not recognize multithreading. Consequently, two vCPU assigned to threads of the same real CPU should preferably be assigned to the same socket of the guest machine. Please make sure the patchset compiles on non-s390x platforms and check that the documentation generates correctly. You will need to install : python3-sphinx python3-sphinx_rtd_theme 'configure' should then enable doc generation. Thanks, C.
Re: [PATCH 1/3] intel-iommu: fail MAP notifier without caching mode
On Wed, Nov 30, 2022 at 11:06 PM Peter Xu wrote: > > On Wed, Nov 30, 2022 at 02:23:06PM +0800, Jason Wang wrote: > > On Tue, Nov 29, 2022 at 11:35 PM Peter Xu wrote: > > > > > > On Tue, Nov 29, 2022 at 04:10:35PM +0800, Jason Wang wrote: > > > > Without caching mode, MAP notifier won't work correctly since guest > > > > won't send IOTLB update event when it establishes new mappings in the > > > > I/O page tables. Let's fail the IOMMU notifiers early instead of > > > > misbehaving silently. > > > > > > > > Signed-off-by: Jason Wang > > > > --- > > > > hw/i386/intel_iommu.c | 7 +++ > > > > 1 file changed, 7 insertions(+) > > > > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > > > index a08ee85edf..9143376677 100644 > > > > --- a/hw/i386/intel_iommu.c > > > > +++ b/hw/i386/intel_iommu.c > > > > @@ -3186,6 +3186,13 @@ static int > > > > vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu, > > > > "Snoop Control with vhost or VFIO is not > > > > supported"); > > > > return -ENOTSUP; > > > > } > > > > +if (!s->caching_mode && (new & IOMMU_NOTIFIER_MAP)) { > > > > +error_setg_errno(errp, ENOTSUP, > > > > + "device %02x.%02x.%x requires caching mode", > > > > + pci_bus_num(vtd_as->bus), > > > > PCI_SLOT(vtd_as->devfn), > > > > + PCI_FUNC(vtd_as->devfn)); > > > > +return -ENOTSUP; > > > > +} > > > > > > We used to have that but got reverted because it's too late to fail, so we > > > moved it over even though not as clean.. > > > > > > https://lore.kernel.org/all/20190916080718.3299-5-pet...@redhat.com/ > > > > One of the difference is that the patch doesn't do exit() here. I > > think it's better to fail instead of misbehving silently, this is what > > other vIOMMU did: > > > > E.g in smmu we had: > > > > if (new & IOMMU_NOTIFIER_MAP) { > > error_setg(errp, > >"device %02x.%02x.%x requires iommu MAP notifier which > > is " > >"not currently supported", pci_bus_num(sdev->bus), > >PCI_SLOT(sdev->devfn), PCI_FUNC(sdev->devfn)); > > return -EINVAL; > > } > > > > So did for amd iommu. > > Yeah that's fine. Would you mind add the explanation (and also above link, > which might be helpful to pick up the history..) to the commit message? Will do. > With that: > > Reviewed-by: Peter Xu > > Thanks, Thanks > > -- > Peter Xu >
Re: [PATCH for 7.2? V2] vhost: fix vq dirty bitmap syncing when vIOMMU is enabled
On Wed, Nov 30, 2022 at 12:08 AM Eric Auger wrote: > > Hi Michael, > > On 11/29/22 16:44, Michael S. Tsirkin wrote: > > On Tue, Nov 29, 2022 at 10:52:29AM +0100, Eric Auger wrote: > >> Hi Jason, > >> > >> On 11/29/22 05:02, Jason Wang wrote: > >>> When vIOMMU is enabled, the vq->used_phys is actually the IOVA not > >>> GPA. So we need to translate it to GPA before the syncing otherwise we > >>> may hit the following crash since IOVA could be out of the scope of > >>> the GPA log size. This could be noted when using virtio-IOMMU with > >>> vhost using 1G memory. > >>> > >>> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > >>> Cc: qemu-sta...@nongnu.org > >>> Tested-by: Lei Yang > >>> Reported-by: Yalan Zhang > >>> Signed-off-by: Jason Wang > >>> --- > >>> Changes since V1: > >>> - Fix the address calculation when used ring is not page aligned > >>> - Fix the length for each round of dirty bitmap syncing > >>> - Use LOG_GUEST_ERROR to log wrong used adddress > >>> - Various other tweaks > >>> --- > >>> hw/virtio/vhost.c | 76 ++- > >>> 1 file changed, 56 insertions(+), 20 deletions(-) > >>> > >>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > >>> index d1c4c20b8c..0cd5f25fcb 100644 > >>> --- a/hw/virtio/vhost.c > >>> +++ b/hw/virtio/vhost.c > >>> @@ -20,6 +20,7 @@ > >>> #include "qemu/range.h" > >>> #include "qemu/error-report.h" > >>> #include "qemu/memfd.h" > >>> +#include "qemu/log.h" > >>> #include "standard-headers/linux/vhost_types.h" > >>> #include "hw/virtio/virtio-bus.h" > >>> #include "hw/virtio/virtio-access.h" > >>> @@ -106,6 +107,24 @@ static void vhost_dev_sync_region(struct vhost_dev > >>> *dev, > >>> } > >>> } > >>> > >>> +static bool vhost_dev_has_iommu(struct vhost_dev *dev) > >>> +{ > >>> +VirtIODevice *vdev = dev->vdev; > >>> + > >>> +/* > >>> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > >>> + * incremental memory mapping API via IOTLB API. For platform that > >>> + * does not have IOMMU, there's no need to enable this feature > >>> + * which may cause unnecessary IOTLB miss/update transactions. > >>> + */ > >>> +if (vdev) { > >>> +return virtio_bus_device_iommu_enabled(vdev) && > >>> +virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > >>> +} else { > >>> +return false; > >>> +} > >>> +} > >>> + > >>> static int vhost_sync_dirty_bitmap(struct vhost_dev *dev, > >>> MemoryRegionSection *section, > >>> hwaddr first, > >>> @@ -137,8 +156,43 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev > >>> *dev, > >>> continue; > >>> } > >>> > >>> -vhost_dev_sync_region(dev, section, start_addr, end_addr, > >>> vq->used_phys, > >>> - range_get_last(vq->used_phys, > >>> vq->used_size)); > >>> +if (vhost_dev_has_iommu(dev)) { > >>> +IOMMUTLBEntry iotlb; > >>> +hwaddr used_phys = vq->used_phys, used_size = vq->used_size; > >>> +hwaddr phys, s; > >>> + > >>> +while (used_size) { > >>> +rcu_read_lock(); > >>> +iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as, > >>> + used_phys, > >>> + true, > >>> MEMTXATTRS_UNSPECIFIED); > >>> +rcu_read_unlock(); > >>> + > >>> +if (!iotlb.target_as) { > >>> +qemu_log_mask(LOG_GUEST_ERROR, "translation " > >>> + "failure for used_phys %"PRIx64"\n", > >>> used_phys); > >> looks weird to see translation of "used_phys" whereas it is an iova. At > >> least I would reword the msg Let me tweak this in the next version. > >>> +return -EINVAL; > >>> +} > >>> + > >>> +phys = iotlb.translated_addr + (used_phys & > >>> iotlb.addr_mask); > >> you may use a local variable storing this offset = > >> > >> used_phys & iotlb.addr_mask Ok. > >> > >>> + > >>> +/* Distance from start of used ring until last byte of > >>> + IOMMU page */ > >> you can avoid checkpatch warnings here > >>> +s = iotlb.addr_mask - (used_phys & iotlb.addr_mask); > >>> +/* Size of used ring, or of the part of it until end > >>> + of IOMMU page */ > >> and here Will fix. > >> > >> I would suggest to rewrite this into > >> s =iotlb.addr_mask - (used_phys & iotlb.addr_mask) + 1 > >> s = MIN(s, used_size); > > This does not work - if iotlb.addr_mask - (used_phys & iotlb.addr_mask) > > is all-ones then + 1 gives you 0 and MIN gives you 0. > > Theoretical but worth being safe here IMHO. > Ah OK, I should have read your previous discussion more thoroughly ... > Maybe just add a short comment th
Re: [PATCH 3/9] ui: Drop disabled code for SPICE_CHANNEL_WEBDAV
On Thu, Dec 01, 2022 at 07:13:05AM +0100, Markus Armbruster wrote: > HMP "info spice" has a bit of code to show channel type > SPICE_CHANNEL_WEBDAV as "webdav", disabled since commit 7c6044a94e > "hmp: info spice: take out webdav" (v2.3.0), because it compiles only > with Spice versions 0.12.7 and later. Our minimum version is 0.12.5. Last version bump was 4 years ago commit 1b63665c2c0e0d52735e0dd5217f109fe0dd2322 Author: Marc-André Lureau Date: Wed Nov 28 19:59:32 2018 +0400 configure: bump spice-server required version to 0.12.5 ...snip According to repology, all the distros that are build target platforms for QEMU include it: RHEL-7: 0.14.0 Debian (Stretch): 0.12.8 Debian (Jessie): 0.12.5 FreeBSD (ports): 0.14.0 OpenSUSE Leap 15: 0.14.0 Ubuntu (Xenial): 0.12.6 We moved on from Debian and RHEL since then Debian 11: 0.14.3 RHEL-8: 0.14.2 FreeBSD (ports): 0.14.4 Fedora 35: 0.14.0 Ubuntu 20.04: 0.14.0 OpenSUSE Leap 15.3: 0.14.3 IOW, we can bump to 0.14.0, and then revert the webdav conditional commit. > > Looks like nobody minded in more than seven years. Drop it, so that > checkpatch.pl won't complain when I move the code. > > Signed-off-by: Markus Armbruster > --- > monitor/hmp-cmds.c | 6 -- > 1 file changed, 6 deletions(-) > > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > index a7c9ae2520..86dd961462 100644 > --- a/monitor/hmp-cmds.c > +++ b/monitor/hmp-cmds.c > @@ -626,12 +626,6 @@ void hmp_info_spice(Monitor *mon, const QDict *qdict) > [SPICE_CHANNEL_SMARTCARD] = "smartcard", > [SPICE_CHANNEL_USBREDIR] = "usbredir", > [SPICE_CHANNEL_PORT] = "port", > -#if 0 > -/* minimum spice-protocol is 0.12.3, webdav was added in 0.12.7, > - * no easy way to #ifdef (SPICE_CHANNEL_* is a enum). Disable > - * as quick fix for build failures with older versions. */ > -[SPICE_CHANNEL_WEBDAV] = "webdav", > -#endif > }; > > info = qmp_query_spice(NULL); > -- > 2.37.3 > > With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] vhost: Not return fail while the device does not support send_iotlb_msg
On Wed, Nov 30, 2022 at 4:11 PM Cindy Lu wrote: > > Some device does not support vhost_send_device_iotlb_msg() > such as vDPA device, which is as expected. So we should not > return fail here. Please explain in which case you may hit the -ENODEV and what's the side effect of this. Thanks > > Signed-off-by: Cindy Lu > --- > hw/virtio/vhost-backend.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c > index 8e581575c9..9321ed9031 100644 > --- a/hw/virtio/vhost-backend.c > +++ b/hw/virtio/vhost-backend.c > @@ -360,7 +360,7 @@ int vhost_backend_update_device_iotlb(struct vhost_dev > *dev, > if (dev->vhost_ops && dev->vhost_ops->vhost_send_device_iotlb_msg) > return dev->vhost_ops->vhost_send_device_iotlb_msg(dev, &imsg); > > -return -ENODEV; > +return 0; > } > > int vhost_backend_invalidate_device_iotlb(struct vhost_dev *dev, > @@ -375,7 +375,7 @@ int vhost_backend_invalidate_device_iotlb(struct > vhost_dev *dev, > if (dev->vhost_ops && dev->vhost_ops->vhost_send_device_iotlb_msg) > return dev->vhost_ops->vhost_send_device_iotlb_msg(dev, &imsg); > > -return -ENODEV; > +return 0; > } > > int vhost_backend_handle_iotlb_msg(struct vhost_dev *dev, > -- > 2.34.3 >
Re: [PATCH 4/9] ui: Clean up a few things checkpatch.pl would flag later on
On Thu, Dec 01, 2022 at 07:13:06AM +0100, Markus Armbruster wrote: > Fix a few style violations so that checkpatch.pl won't complain when I > move this code. > > Signed-off-by: Markus Armbruster > --- > monitor/hmp-cmds.c | 7 --- > monitor/qmp-cmds.c | 21 +++-- > 2 files changed, 15 insertions(+), 13 deletions(-) Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 5/9] ui: Move QMP commands from monitor to new ui/ui-qmp-cmds.c
On Thu, Dec 01, 2022 at 07:13:07AM +0100, Markus Armbruster wrote: > This moves these commands from MAINTAINERS section "Human > Monitor (HMP)" to "Graphics". > > Command add-client applies to socket character devices in addition to > display devices. Move it anyway. Aside: the way @protocol character > device IDs and display types is bad design. > > Signed-off-by: Markus Armbruster > --- > monitor/qmp-cmds.c | 177 - > ui/ui-qmp-cmds.c | 194 + > ui/meson.build | 1 + > 3 files changed, 195 insertions(+), 177 deletions(-) > create mode 100644 ui/ui-qmp-cmds.c Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 6/9] ui: Move HMP commands from monitor to new ui/ui-hmp-cmds.c
On Thu, Dec 01, 2022 at 07:13:08AM +0100, Markus Armbruster wrote: > This moves these commands from MAINTAINERS section "Human > Monitor (HMP)" to "Graphics". > > Signed-off-by: Markus Armbruster > --- > monitor/hmp-cmds.c | 342 -- > ui/ui-hmp-cmds.c | 360 + > ui/meson.build | 1 + > 3 files changed, 361 insertions(+), 342 deletions(-) > create mode 100644 ui/ui-hmp-cmds.c Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH for-7.2] target/i386: Always completely initialize TranslateFault
Queued, thanks. Paolo
[PULL 2/2] target/i386: Always completely initialize TranslateFault
From: Richard Henderson In get_physical_address, the canonical address check failed to set TranslateFault.stage2, which resulted in an uninitialized read from the struct when reporting the fault in x86_cpu_tlb_fill. Adjust all error paths to use structure assignment so that the entire struct is always initialized. Reported-by: Daniel Hoffman Fixes: 9bbcf372193a ("target/i386: Reorg GET_HPHYS") Signed-off-by: Richard Henderson Message-Id: <20221201074522.178498-1-richard.hender...@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1324 Signed-off-by: Paolo Bonzini --- target/i386/tcg/sysemu/excp_helper.c | 34 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index 405a5d414a14..55bd1194d31b 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -71,10 +71,11 @@ static bool ptw_translate(PTETranslate *inout, hwaddr addr) TranslateFault *err = inout->err; assert(inout->ptw_idx == MMU_NESTED_IDX); -err->exception_index = 0; /* unused */ -err->error_code = inout->env->error_code; -err->cr2 = addr; -err->stage2 = S2_GPT; +*err = (TranslateFault){ +.error_code = inout->env->error_code, +.cr2 = addr, +.stage2 = S2_GPT, +}; return false; } return true; @@ -431,10 +432,11 @@ do_check_protect_pse36: MMU_NESTED_IDX, true, &pte_trans.haddr, &full, 0); if (unlikely(flags & TLB_INVALID_MASK)) { -err->exception_index = 0; /* unused */ -err->error_code = env->error_code; -err->cr2 = paddr; -err->stage2 = S2_GPA; +*err = (TranslateFault){ +.error_code = env->error_code, +.cr2 = paddr, +.stage2 = S2_GPA, +}; return false; } @@ -494,10 +496,11 @@ do_check_protect_pse36: } break; } -err->exception_index = EXCP0E_PAGE; -err->error_code = error_code; -err->cr2 = addr; -err->stage2 = S2_NONE; +*err = (TranslateFault){ +.exception_index = EXCP0E_PAGE, +.error_code = error_code, +.cr2 = addr, +}; return false; } @@ -564,9 +567,10 @@ static bool get_physical_address(CPUX86State *env, vaddr addr, int shift = in.pg_mode & PG_MODE_LA57 ? 56 : 47; int64_t sext = (int64_t)addr >> shift; if (sext != 0 && sext != -1) { -err->exception_index = EXCP0D_GPF; -err->error_code = 0; -err->cr2 = addr; +*err = (TranslateFault){ +.exception_index = EXCP0D_GPF, +.cr2 = addr, +}; return false; } } -- 2.38.1
[PULL 1/2] target/i386: allow MMX instructions with CR4.OSFXSR=0
MMX state is saved/restored by FSAVE/FRSTOR so the instructions are not illegal opcodes even if CR4.OSFXSR=0. Make sure that validate_vex takes into account the prefix and only checks HF_OSFXSR_MASK in the presence of an SSE instruction. Fixes: 20581aadec5e ("target/i386: validate VEX prefixes via the instructions' exception classes", 2022-10-18) Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1350 Reported-by: Helge Konetzka (@hejko on gitlab.com) Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/decode-new.c.inc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc index e4878b967f0e..80c579164ff2 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -1488,7 +1488,8 @@ static bool validate_vex(DisasContext *s, X86DecodedInsn *decode) if (!(s->flags & HF_AVX_EN_MASK)) { goto illegal; } -} else { +} else if (e->special != X86_SPECIAL_MMX || + (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA))) { if (!(s->flags & HF_OSFXSR_MASK)) { goto illegal; } -- 2.38.1
Re: [PATCH 7/9] ui: Improve "change vnc" error reporting
On Thu, Dec 01, 2022 at 07:13:09AM +0100, Markus Armbruster wrote: > Switch from monitor_printf() to error_setg() and hmp_handle_error(). > This makes "this is an error" more obvious both in the source and in > the monitor, where hmp_handle_error() prefixes the message with > "Error: ". > > Signed-off-by: Markus Armbruster > --- > monitor/hmp-cmds.c | 8 > ui/ui-hmp-cmds.c | 22 ++ > 2 files changed, 26 insertions(+), 4 deletions(-) > > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > index f0f7b74fb3..8542eee3d4 100644 > --- a/monitor/hmp-cmds.c > +++ b/monitor/hmp-cmds.c > @@ -1209,9 +1209,8 @@ void hmp_change(Monitor *mon, const QDict *qdict) > #ifdef CONFIG_VNC > if (strcmp(device, "vnc") == 0) { > if (read_only) { > -monitor_printf(mon, > - "Parameter 'read-only-mode' is invalid for > VNC\n"); > -return; > +error_setg(&err, "Parameter 'read-only-mode' is invalid for > VNC"); > +goto end; > } > if (strcmp(target, "passwd") == 0 || > strcmp(target, "password") == 0) { > @@ -1223,7 +1222,8 @@ void hmp_change(Monitor *mon, const QDict *qdict) > qmp_change_vnc_password(arg, &err); > } > } else { > -monitor_printf(mon, "Expected 'password' after 'vnc'\n"); > +error_setg(&err, "Expected 'password' after 'vnc'"); > +goto end; > } > } else > #endif Upto this point Reviewed-by: Daniel P. Berrangé > diff --git a/ui/ui-hmp-cmds.c b/ui/ui-hmp-cmds.c > index af290da9e1..90a4f86f25 100644 > --- a/ui/ui-hmp-cmds.c > +++ b/ui/ui-hmp-cmds.c > @@ -270,6 +270,28 @@ out: > hmp_handle_error(mon, err); > } > > +void hmp_change_vnc(Monitor *mon, const char *device, const char *target, > +const char *arg, const char *read_only, bool force, > +Error **errp) > +{ > +if (read_only) { > +error_setg(mon, "Parameter 'read-only-mode' is invalid for VNC"); > +return; > +} > +if (strcmp(target, "passwd") == 0 || > +strcmp(target, "password") == 0) { > +if (!arg) { > +MonitorHMP *hmp_mon = container_of(mon, MonitorHMP, common); > +monitor_read_password(hmp_mon, hmp_change_read_arg, NULL); > +return; > +} else { > +qmp_change_vnc_password(arg, &err); > +} > +} else { > +monitor_printf(mon, "Expected 'password' after 'vnc'\n"); > +} > +} > + This chunk ought to be in the next patch IIUC > void hmp_sendkey(Monitor *mon, const QDict *qdict) > { > const char *keys = qdict_get_str(qdict, "keys"); > -- > 2.37.3 > > With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 9/9] ui: Reduce nesting in hmp_change_vnc() slightly
On Thu, Dec 01, 2022 at 07:13:11AM +0100, Markus Armbruster wrote: > Transform > > if (good) { > do stuff > } else { > handle error > } > > to > > if (!good) { > handle error > return; > } > do stuff > > Signed-off-by: Markus Armbruster > --- > ui/ui-hmp-cmds.c | 17 +++-- > 1 file changed, 7 insertions(+), 10 deletions(-) Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[PULL for-7.2 0/2] TCG/i386 fixes for QEMU 7.2-rc4
The following changes since commit 7c09a7f6ae1770d15535980d15dffdb23f4d9786: Update VERSION for v7.2.0-rc2 (2022-11-22 18:59:56 -0500) are available in the Git repository at: https://gitlab.com/bonzini/qemu.git tags/for-upstream for you to fetch changes up to 8218c048be1567db9dfd3cf1e19fbff76bce8cfd: target/i386: Always completely initialize TranslateFault (2022-12-01 09:53:24 +0100) * Fix MMX instructions for system emulators * Fix uninitialized TranslateFault after canonical address checks Paolo Bonzini (1): target/i386: allow MMX instructions with CR4.OSFXSR=0 Richard Henderson (1): target/i386: Always completely initialize TranslateFault target/i386/tcg/decode-new.c.inc | 3 ++- target/i386/tcg/sysemu/excp_helper.c | 34 +++--- 2 files changed, 21 insertions(+), 16 deletions(-) -- 2.38.1
Re: [PATCH 8/9] ui: Factor out hmp_change_vnc(), and move to ui/ui-hmp-cmds.c
On Thu, Dec 01, 2022 at 07:13:10AM +0100, Markus Armbruster wrote: > Signed-off-by: Markus Armbruster > --- > include/monitor/hmp.h | 5 + > monitor/hmp-cmds.c| 28 +--- > ui/ui-hmp-cmds.c | 19 +++ > 3 files changed, 21 insertions(+), 31 deletions(-) If you pull in the hmp_change_vnc impl from the previous patch Reviewed-by: Daniel P. Berrangé With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v12 1/7] s390x/cpu topology: Creating CPU topology device
On 29/11/2022 18.42, Pierre Morel wrote: We will need a Topology device to transfer the topology during migration and to implement machine reset. The device creation is fenced by s390_has_topology(). Signed-off-by: Pierre Morel --- ... diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 2e64ffab45..973bbdd36e 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -44,6 +44,7 @@ #include "hw/s390x/pv.h" #include "migration/blocker.h" #include "qapi/visitor.h" +#include "hw/s390x/cpu-topology.h" static Error *pv_mig_blocker; @@ -102,6 +103,24 @@ static void s390_init_cpus(MachineState *machine) } } +static DeviceState *s390_init_topology(MachineState *machine, Error **errp) +{ +DeviceState *dev; + +dev = qdev_new(TYPE_S390_CPU_TOPOLOGY); + +object_property_add_child(&machine->parent_obj, + TYPE_S390_CPU_TOPOLOGY, OBJECT(dev)); +object_property_set_int(OBJECT(dev), "num-cores", +machine->smp.cores * machine->smp.threads, errp); I wonder what will happen if we ever support multithreading on s390x later? ... won't this cause some oddities when migrating older machines types with smp.threads > 1 later? Maybe we should prohibit to enable the CPU topology instead if a user tried to use threads > 1 with an older machine type? Thomas +object_property_set_int(OBJECT(dev), "num-sockets", +machine->smp.sockets, errp); + +sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), errp); + +return dev; +}
Re: Monitor commands related to display server passwords
On Wed, Nov 30, 2022 at 09:03:03AM +, Daniel P. Berrangé wrote: > On Wed, Nov 30, 2022 at 09:02:56AM +0100, Markus Armbruster wrote: > > Related: QCryptoSecret objects. > > snip > > > Currently used by various block backends and the tls-creds-x509 object. > > > > Would it make sense with display servers, too? > > In 6.0 I introduced support for 'password-secret' to SPICE and VNC > command line. > > I don't know why, but I only deprecated 'password' in SPICE and > not in VNC. The 'password' option in VNC isn't actually setting a password, it is more like saying 'auth=password'. The actualpassword had to be set via the 'change' command, we never allowed it on the CLI before. So there was nothing to deprecate for VNC, only SPICE. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH v2 3/4] vdpa: handle VIRTIO_NET_CTRL_ANNOUNCE in vhost_vdpa_net_handle_ctrl_avail
On Thu, Dec 1, 2022 at 9:39 AM Jason Wang wrote: > > On Wed, Nov 30, 2022 at 3:07 PM Eugenio Perez Martin > wrote: > > > > On Wed, Nov 30, 2022 at 8:02 AM Jason Wang wrote: > > > > > > On Fri, Nov 25, 2022 at 1:33 AM Eugenio Pérez wrote: > > > > > > > > Since this capability is emulated by qemu shadowed CVQ cannot forward it > > > > to the device. Process all that command within qemu. > > > > > > > > Signed-off-by: Eugenio Pérez > > > > --- > > > > net/vhost-vdpa.c | 15 --- > > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > > > > index 2b4b85d8f8..8172aa8449 100644 > > > > --- a/net/vhost-vdpa.c > > > > +++ b/net/vhost-vdpa.c > > > > @@ -489,9 +489,18 @@ static int > > > > vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq, > > > > out.iov_len = iov_to_buf(elem->out_sg, elem->out_num, 0, > > > > s->cvq_cmd_out_buffer, > > > > vhost_vdpa_net_cvq_cmd_len()); > > > > -dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, > > > > sizeof(status)); > > > > -if (unlikely(dev_written < 0)) { > > > > -goto out; > > > > +if (*(uint8_t *)s->cvq_cmd_out_buffer == VIRTIO_NET_CTRL_ANNOUNCE) > > > > { > > > > +/* > > > > + * Guest announce capability is emulated by qemu, so dont > > > > forward to > > > > > > s/dont/don't/ > > > > > > > I'll correct it, thanks! > > > > > > + * the device. > > > > + */ > > > > +dev_written = sizeof(status); > > > > +*s->status = VIRTIO_NET_OK; > > > > > > I wonder if we should avoid negotiating ANNOUNCE with vDPA parents if > > > we do this? > > > > > > > I can re-check, but the next patch should avoid it. > > Kind of, it makes sure guest can always see _F_ANNOUNCE. But does it > prevent _F_ANNOUNCE from being negotiated? > It should go like: * vhost_net_ack_features calls vhost_ack_features with feature_bits = vdpa_feature_bits and features = guest acked features. vhost_ack_features stores in hdev->acked_features only the features that met features & bit_mask, so it will not store _F_ANNOUNCE. * vhost_vdpa_set_features is called from vhost_dev_set_features with features = dev->acked_features. Both functions can add features by themselves (VHOST_F_LOG_ALL, VIRTIO_F_IOMMU_PLATFORM), but no _F_ANNOUNCE. Still untested. > > Even if > > negotiated, the parent should never set the announce status bit, since > > we never tell the device is a destination device. > > That's the point, do we have such a guarantee? Or I wonder if there's > any parent that supports _F_ANNOUNCE if yes, how it is supposed to > work? > At the moment it is impossible to work since there is no support for config interrupt from the device. Even with config interrupt, something external from qemu should make the device enable the status bit, since the current migration protocol makes no difference between to be a migration destination and to start the device from scratch. Unless it enables the bit maliciously or by mistake. Just for completion, the current method works with no need of vdpa device config interrupt support thanks to being 100% emulated in qemu, which has the support of injecting config interrupts. Thanks!
Re: [PATCH] MAINTAINERS: Inherit from nanoMIPS
Hi, Stefan is no longer working with us, but I will be more than happy to take maintaining the nanoMIPS ISA on me! Regards, Milica Any comments on this?
[PATCH v3] riscv: Allow user to set the satp mode
RISC-V specifies multiple sizes for addressable memory and Linux probes for the machine's support at startup via the satp CSR register (done in csr.c:validate_vm). As per the specification, sv64 must support sv57, which in turn must support sv48...etc. So we can restrict machine support by simply setting the "highest" supported mode and the bare mode is always supported. You can set the satp mode using the new properties "mbare", "sv32", "sv39", "sv48", "sv57" and "sv64" as follows: -cpu rv64,sv57=on # Linux will boot using sv57 scheme -cpu rv64,sv39=on # Linux will boot using sv39 scheme We take the highest level set by the user: -cpu rv64,sv48=on,sv57=on # Linux will boot using sv57 scheme We make sure that invalid configurations are rejected: -cpu rv64,sv32=on # Can't enable 32-bit satp mode in 64-bit -cpu rv64,sv39=off,sv48=on # sv39 must be supported if higher modes are # enabled We accept "redundant" configurations: -cpu rv64,sv48=on,sv57=off # sv39 must be supported if higher modes are In addition, we now correctly set the device-tree entry 'mmu-type' using those new properties. Co-Developed-by: Ludovic Henry Signed-off-by: Ludovic Henry Signed-off-by: Alexandre Ghiti --- v3: - Free sv_name as pointed by Bin - Replace satp-mode with boolean properties as suggested by Andrew - Removed RB from Atish as the patch considerably changed v2: - Use error_setg + return as suggested by Alistair - Add RB from Atish - Fixed checkpatch issues missed in v1 - Replaced Ludovic email address with the rivos one hw/riscv/virt.c | 16 ++-- target/riscv/cpu.c | 164 target/riscv/cpu.h | 8 ++ target/riscv/cpu_bits.h | 1 + target/riscv/csr.c | 8 +- 5 files changed, 186 insertions(+), 11 deletions(-) diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c index a5bc7353b4..bb7c739a74 100644 --- a/hw/riscv/virt.c +++ b/hw/riscv/virt.c @@ -228,7 +228,7 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int socket, int cpu; uint32_t cpu_phandle; MachineState *mc = MACHINE(s); -char *name, *cpu_name, *core_name, *intc_name; +char *name, *cpu_name, *core_name, *intc_name, *sv_name; for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) { cpu_phandle = (*phandle)++; @@ -236,14 +236,12 @@ static void create_fdt_socket_cpus(RISCVVirtState *s, int socket, cpu_name = g_strdup_printf("/cpus/cpu@%d", s->soc[socket].hartid_base + cpu); qemu_fdt_add_subnode(mc->fdt, cpu_name); -if (riscv_feature(&s->soc[socket].harts[cpu].env, - RISCV_FEATURE_MMU)) { -qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type", -(is_32_bit) ? "riscv,sv32" : "riscv,sv48"); -} else { -qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type", -"riscv,none"); -} + +sv_name = g_strdup_printf("riscv,%s", + s->soc[socket].harts[cpu].cfg.satp_mode_str); +qemu_fdt_setprop_string(mc->fdt, cpu_name, "mmu-type", sv_name); +g_free(sv_name); + name = riscv_isa_string(&s->soc[socket].harts[cpu]); qemu_fdt_setprop_string(mc->fdt, cpu_name, "riscv,isa", name); g_free(name); diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index d14e95c9dc..51c06ed057 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -907,6 +907,66 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp) } #endif +/* + * Either a cpu sets its supported satp_mode in XXX_cpu_init + * or the user sets this value using satp_mode property. + */ +bool rv32 = riscv_cpu_mxl(&cpu->env) == MXL_RV32; + +cpu->cfg.satp_mode = VM_1_10_UNDEF; + +if (rv32) { +if (cpu->cfg.sv32 == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("sv32"); +cpu->cfg.satp_mode = VM_1_10_SV32; +} else if (cpu->cfg.mbare == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("none"); +cpu->cfg.satp_mode = VM_1_10_MBARE; +} +} else { +if (cpu->cfg.sv64 == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("sv64"); +cpu->cfg.satp_mode = VM_1_10_SV64; +} else if (cpu->cfg.sv57 == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("sv57"); +cpu->cfg.satp_mode = VM_1_10_SV57; +} else if (cpu->cfg.sv48 == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("sv48"); +cpu->cfg.satp_mode = VM_1_10_SV48; +} else if (cpu->cfg.sv39 == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("sv39"); +cpu->cfg.satp_mode = VM_1_10_SV39; +} else if (cpu->cfg.mbare == ON_OFF_AUTO_ON) { +cpu->cfg.satp_mode_str = g_strdup("none"); +cpu->cfg.satp_mode = VM_1_10_MBAR
Re: [PATCH v12 1/7] s390x/cpu topology: Creating CPU topology device
On 12/1/22 10:08, Thomas Huth wrote: On 29/11/2022 18.42, Pierre Morel wrote: We will need a Topology device to transfer the topology during migration and to implement machine reset. The device creation is fenced by s390_has_topology(). Signed-off-by: Pierre Morel --- ... diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c index 2e64ffab45..973bbdd36e 100644 --- a/hw/s390x/s390-virtio-ccw.c +++ b/hw/s390x/s390-virtio-ccw.c @@ -44,6 +44,7 @@ #include "hw/s390x/pv.h" #include "migration/blocker.h" #include "qapi/visitor.h" +#include "hw/s390x/cpu-topology.h" static Error *pv_mig_blocker; @@ -102,6 +103,24 @@ static void s390_init_cpus(MachineState *machine) } } +static DeviceState *s390_init_topology(MachineState *machine, Error **errp) +{ + DeviceState *dev; + + dev = qdev_new(TYPE_S390_CPU_TOPOLOGY); + + object_property_add_child(&machine->parent_obj, + TYPE_S390_CPU_TOPOLOGY, OBJECT(dev)); + object_property_set_int(OBJECT(dev), "num-cores", + machine->smp.cores * machine->smp.threads, errp); I wonder what will happen if we ever support multithreading on s390x later? ... won't this cause some oddities when migrating older machines types with smp.threads > 1 later? Maybe we should prohibit to enable the CPU topology instead if a user tried to use threads > 1 with an older machine type? Yes, right, I forgot to change this back. Anyway it has no sens for new machine which prohibit smp.threads > 1 neither. I change this by returning an error in case we have smp.threads > 1 Thanks. Pierre Thomas + object_property_set_int(OBJECT(dev), "num-sockets", + machine->smp.sockets, errp); + + sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), errp); + + return dev; +} -- Pierre Morel IBM Lab Boeblingen
Re: [Virtio-fs] [PATCH] virtiofsd: Add `sigreturn` to the seccomp whitelist
German Maglione writes: > On Mon, Nov 28, 2022 at 10:00 AM Marc Hartmayer > wrote: >> >> German Maglione writes: >> >> > On Fri, Nov 25, 2022 at 3:40 PM Marc Hartmayer >> > wrote: >> >> >> >> The virtiofsd currently crashes on s390x. This is because of a >> >> `sigreturn` system call. See audit log below: >> >> >> >> type=SECCOMP msg=audit(1669382477.611:459): auid=4294967295 uid=0 gid=0 >> >> ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 pid=6649 >> >> comm="virtiofsd" exe="/usr/libexec/virtiofsd" sig=31 arch=8016 >> >> syscall=119 compat=0 ip=0x3fff15f748a code=0x8000AUID="unset" >> >> UID="root" GID="root" ARCH=s390x SYSCALL=sigreturn >> >> >> >> Signed-off-by: Marc Hartmayer >> >> --- >> >> tools/virtiofsd/passthrough_seccomp.c | 1 + >> >> 1 file changed, 1 insertion(+) >> >> >> >> diff --git a/tools/virtiofsd/passthrough_seccomp.c >> >> b/tools/virtiofsd/passthrough_seccomp.c >> >> index 888295c073de..0033dab4939e 100644 >> >> --- a/tools/virtiofsd/passthrough_seccomp.c >> >> +++ b/tools/virtiofsd/passthrough_seccomp.c >> >> @@ -110,6 +110,7 @@ static const int syscall_allowlist[] = { >> >> #endif >> >> SCMP_SYS(set_robust_list), >> >> SCMP_SYS(setxattr), >> >> +SCMP_SYS(sigreturn), >> >> SCMP_SYS(symlinkat), >> >> SCMP_SYS(syncfs), >> >> SCMP_SYS(time), /* Rarely needed, except on static builds */ >> >> -- >> >> 2.34.1 >> >> >> >> ___ >> >> Virtio-fs mailing list >> >> virtio...@redhat.com >> >> https://listman.redhat.com/mailman/listinfo/virtio-fs >> >> >> > >> > Reviewed-by: German Maglione >> >> Thanks. >> >> > >> > Should we add this also in the rust version?, I see we don't have it >> > enabled either. >> >> Yep - thanks. > > Could you test this MR to see if it is ok? > https://gitlab.com/virtio-fs/virtiofsd/-/merge_requests/144 I couldn’t reproduce the problem using the Rust version - even without your patch. With your patch applied it’s (of course) still working. > > Thanks, > > -- > German > -- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Gregor Pillen Geschäftsführung: David Faller Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
Re: [PATCH for 8.0 v8 06/12] vdpa: extract vhost_vdpa_svq_allocate_iova_tree
On Thu, Dec 1, 2022 at 9:45 AM Jason Wang wrote: > > On Wed, Nov 30, 2022 at 3:40 PM Eugenio Perez Martin > wrote: > > > > On Wed, Nov 30, 2022 at 7:43 AM Jason Wang wrote: > > > > > > On Thu, Nov 24, 2022 at 11:52 PM Eugenio Pérez > > > wrote: > > > > > > > > It can be allocated either if all virtqueues must be shadowed or if > > > > vdpa-net detects it can shadow only cvq. > > > > > > > > Extract in its own function so we can reuse it. > > > > > > > > Signed-off-by: Eugenio Pérez > > > > --- > > > > net/vhost-vdpa.c | 29 + > > > > 1 file changed, 17 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c > > > > index 88e0eec5fa..9ee3bc4cd3 100644 > > > > --- a/net/vhost-vdpa.c > > > > +++ b/net/vhost-vdpa.c > > > > @@ -240,6 +240,22 @@ static NetClientInfo net_vhost_vdpa_info = { > > > > .check_peer_type = vhost_vdpa_check_peer_type, > > > > }; > > > > > > > > +static int vhost_vdpa_get_iova_range(int fd, > > > > + struct vhost_vdpa_iova_range > > > > *iova_range) > > > > +{ > > > > +int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range); > > > > + > > > > +return ret < 0 ? -errno : 0; > > > > +} > > > > > > I don't get why this needs to be moved to net specific code. > > > > > > > It was already in net, this code just extracted it in its own function. > > Ok, there's similar function that in vhost-vdpa.c: > > static void vhost_vdpa_get_iova_range(struct vhost_vdpa *v) > { > int ret = vhost_vdpa_call(v->dev, VHOST_VDPA_GET_IOVA_RANGE, > &v->iova_range); > if (ret != 0) { > v->iova_range.first = 0; > v->iova_range.last = UINT64_MAX; > } > > trace_vhost_vdpa_get_iova_range(v->dev, v->iova_range.first, > v->iova_range.last); > } > > I think we can reuse that. > That's right, but I'd do the reverse: I would store iova_min, iova_max in VhostVDPAState and would set it to vhost_vdpa at net_vhost_vdpa_init. That way, we only have one ioctl call at the beginning instead of having (#vq pairs + cvq) calls each time the device starts. I can send it in a new change if you see it ok. There are a few functions like that we can reuse in net/. To get the features and the backend features are two other examples. Even if we don't cache them since device initialization mandates the read, we could reduce code duplication that way. However, they use vhost_dev or vhost_vdpa instead of directly the file descriptor. Not a big deal but it's an extra step. What do you think? Thanks!
[PATCH v3] e1000: Configure ResettableClass
This is part of recent efforts of refactoring e1000 and e1000e. DeviceClass's reset member is deprecated so migrate to ResettableClass. There is no behavioral difference. Signed-off-by: Akihiko Odaki Reviewed-by: Peter Maydell Reviewed-by: Philippe Mathieu-Daudé --- hw/net/e1000.c | 13 - 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/hw/net/e1000.c b/hw/net/e1000.c index e26e0a64c1..025aba726b 100644 --- a/hw/net/e1000.c +++ b/hw/net/e1000.c @@ -373,9 +373,9 @@ static bool e1000_vet_init_need(void *opaque) return chkflag(VET); } -static void e1000_reset(void *opaque) +static void e1000_reset_hold(Object *obj) { -E1000State *d = opaque; +E1000State *d = E1000(obj); E1000BaseClass *edc = E1000_GET_CLASS(d); uint8_t *macaddr = d->conf.macaddr.a; @@ -1746,12 +1746,6 @@ static void pci_e1000_realize(PCIDevice *pci_dev, Error **errp) e1000_flush_queue_timer, d); } -static void qdev_e1000_reset(DeviceState *dev) -{ -E1000State *d = E1000(dev); -e1000_reset(d); -} - static Property e1000_properties[] = { DEFINE_NIC_PROPERTIES(E1000State, conf), DEFINE_PROP_BIT("autonegotiation", E1000State, @@ -1777,6 +1771,7 @@ typedef struct E1000Info { static void e1000_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); +ResettableClass *rc = RESETTABLE_CLASS(klass); PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); E1000BaseClass *e = E1000_CLASS(klass); const E1000Info *info = data; @@ -1789,9 +1784,9 @@ static void e1000_class_init(ObjectClass *klass, void *data) k->revision = info->revision; e->phy_id2 = info->phy_id2; k->class_id = PCI_CLASS_NETWORK_ETHERNET; +rc->phases.hold = e1000_reset_hold; set_bit(DEVICE_CATEGORY_NETWORK, dc->categories); dc->desc = "Intel Gigabit Ethernet"; -dc->reset = qdev_e1000_reset; dc->vmsd = &vmstate_e1000; device_class_set_props(dc, e1000_properties); } -- 2.38.1
[PATCH v3] e1000e: Configure ResettableClass
This is part of recent efforts of refactoring e1000 and e1000e. DeviceClass's reset member is deprecated so migrate to ResettableClass. There is no behavioral difference. Signed-off-by: Akihiko Odaki Reviewed-by: Peter Maydell Reviewed-by: Philippe Mathieu-Daudé --- hw/net/e1000e.c | 10 ++ hw/net/trace-events | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c index 7523e9f5d2..923ad2fc89 100644 --- a/hw/net/e1000e.c +++ b/hw/net/e1000e.c @@ -512,11 +512,11 @@ static void e1000e_pci_uninit(PCIDevice *pci_dev) msi_uninit(pci_dev); } -static void e1000e_qdev_reset(DeviceState *dev) +static void e1000e_qdev_reset_hold(Object *obj) { -E1000EState *s = E1000E(dev); +E1000EState *s = E1000E(obj); -trace_e1000e_cb_qdev_reset(); +trace_e1000e_cb_qdev_reset_hold(); e1000e_core_reset(&s->core); @@ -669,6 +669,7 @@ static Property e1000e_properties[] = { static void e1000e_class_init(ObjectClass *class, void *data) { DeviceClass *dc = DEVICE_CLASS(class); +ResettableClass *rc = RESETTABLE_CLASS(class); PCIDeviceClass *c = PCI_DEVICE_CLASS(class); c->realize = e1000e_pci_realize; @@ -679,8 +680,9 @@ static void e1000e_class_init(ObjectClass *class, void *data) c->romfile = "efi-e1000e.rom"; c->class_id = PCI_CLASS_NETWORK_ETHERNET; +rc->phases.hold = e1000e_qdev_reset_hold; + dc->desc = "Intel 82574L GbE Controller"; -dc->reset = e1000e_qdev_reset; dc->vmsd = &e1000e_vmstate; e1000e_prop_disable_vnet = qdev_prop_uint8; diff --git a/hw/net/trace-events b/hw/net/trace-events index 4c0ec3fda1..2bdf437c7c 100644 --- a/hw/net/trace-events +++ b/hw/net/trace-events @@ -253,7 +253,7 @@ e1000e_vm_state_stopped(void) "VM state is stopped" # e1000e.c e1000e_cb_pci_realize(void) "E1000E PCI realize entry" e1000e_cb_pci_uninit(void) "E1000E PCI unit entry" -e1000e_cb_qdev_reset(void) "E1000E qdev reset entry" +e1000e_cb_qdev_reset_hold(void) "E1000E qdev reset hold" e1000e_cb_pre_save(void) "E1000E pre save entry" e1000e_cb_post_load(void) "E1000E post load entry" -- 2.38.1
Re: [PATCH v12 6/7] s390x/cpu_topology: activating CPU topology
On 29/11/2022 18.42, Pierre Morel wrote: The KVM capability, KVM_CAP_S390_CPU_TOPOLOGY is used to activate the S390_FEAT_CONFIGURATION_TOPOLOGY feature and the topology facility for the guest in the case the topology is available in QEMU and in KVM. The feature is fenced for SE (secure execution). Out of curiosity: Why does it not work yet? To allow smooth migration with old QEMU the feature is disabled by default using the CPU flag -disable-topology. I stared at this code for a while now, but I have to admit that I don't quite get it. Why do we need a new "disable" feature flag here? I think it is pretty much impossible to set "ctop=on" with an older version of QEMU, since it would require the QEMU to enable KVM_CAP_S390_CPU_TOPOLOGY in the kernel for this feature bit - and older versions of QEMU don't set this capability yet. Which scenario would fail without this disable-topology feature bit? What do I miss? Making the S390_FEAT_CONFIGURATION_TOPOLOGY belonging to the default features makes the -ctop CPU flag is no more necessary, too many verbs in this sentence ;-) turning the topology feature on is done with -disable-topology only. ... Thomas
Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices
Stefano Garzarella writes: > On Wed, Nov 30, 2022 at 04:03:28PM -0500, Stefan Hajnoczi wrote: >>On Fri, Nov 25, 2022 at 09:12:43AM +0100, Stefano Garzarella wrote: >>> On Thu, Nov 24, 2022 at 01:36:29PM -0500, Stefan Hajnoczi wrote: >>> > On Wed, Nov 23, 2022 at 02:16:30PM +0100, Stefano Garzarella wrote: >>> > > Commit 02b61f38d3 ("hw/virtio: incorporate backend features in >>> > > features") >>> > > properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user >>> > > backend, but we forgot to enable vrings as specified in >>> > > docs/interop/vhost-user.rst: >>> > > >>> > > If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the >>> > > ring starts directly in the enabled state. >>> > > >>> > > If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring >>> > > is >>> > > initialized in a disabled state and is enabled by >>> > > ``VHOST_USER_SET_VRING_ENABLE`` with parameter 1. >>> > > >>> > > Some vhost-user front-ends already did this by calling >>> > > vhost_ops.vhost_set_vring_enable() directly: >>> > > - backends/cryptodev-vhost.c >>> > > - hw/net/virtio-net.c >>> > > - hw/virtio/vhost-user-gpio.c >>> > > >>> > > But most didn't do that, so we would leave the vrings disabled and some >>> > > backends would not work. We observed this issue with the rust version of >>> > > virtiofsd [1], which uses the event loop [2] provided by the >>> > > vhost-user-backend crate where requests are not processed if vring is >>> > > not enabled. >>> > > >>> > > Let's fix this issue by enabling the vrings in vhost_dev_start() for >>> > > vhost-user front-ends that don't already do this directly. Same thing >>> > > also in vhost_dev_stop() where we disable vrings. >>> > > >>> > > [1] https://gitlab.com/virtio-fs/virtiofsd >>> > > [2] > https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217 >>> > > >>> > > Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in >>> > > features") >>> > > Reported-by: German Maglione >>> > > Tested-by: German Maglione >>> > > Signed-off-by: Stefano Garzarella >>> > > --- >>> > > include/hw/virtio/vhost.h | 6 +++-- >>> > > backends/cryptodev-vhost.c | 4 ++-- >>> > > backends/vhost-user.c | 4 ++-- >>> > > hw/block/vhost-user-blk.c | 4 ++-- >>> > > hw/net/vhost_net.c | 8 +++ >>> > > hw/scsi/vhost-scsi-common.c| 4 ++-- >>> > > hw/virtio/vhost-user-fs.c | 4 ++-- >>> > > hw/virtio/vhost-user-gpio.c| 4 ++-- >>> > > hw/virtio/vhost-user-i2c.c | 4 ++-- >>> > > hw/virtio/vhost-user-rng.c | 4 ++-- >>> > > hw/virtio/vhost-vsock-common.c | 4 ++-- >>> > > hw/virtio/vhost.c | 44 ++ >>> > > hw/virtio/trace-events | 4 ++-- >>> > > 13 files changed, 67 insertions(+), 31 deletions(-) >>> > > >>> > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h >>> > > index 353252ac3e..67a6807fac 100644 >>> > > --- a/include/hw/virtio/vhost.h >>> > > +++ b/include/hw/virtio/vhost.h >>> > > @@ -184,24 +184,26 @@ static inline bool vhost_dev_is_started(struct >>> > > vhost_dev *hdev) >>> > > * vhost_dev_start() - start the vhost device >>> > > * @hdev: common vhost_dev structure >>> > > * @vdev: the VirtIODevice structure >>> > > + * @vrings: true to have vrings enabled in this call >>> > > * >>> > > * Starts the vhost device. From this point VirtIO feature negotiation >>> > > * can start and the device can start processing VirtIO transactions. >>> > > * >>> > > * Return: 0 on success, < 0 on error. >>> > > */ >>> > > -int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev); >>> > > +int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool >>> > > vrings); >>> > > >>> > > /** >>> > > * vhost_dev_stop() - stop the vhost device >>> > > * @hdev: common vhost_dev structure >>> > > * @vdev: the VirtIODevice structure >>> > > + * @vrings: true to have vrings disabled in this call >>> > > * >>> > > * Stop the vhost device. After the device is stopped the notifiers >>> > > * can be disabled (@vhost_dev_disable_notifiers) and the device can >>> > > * be torn down (@vhost_dev_cleanup). >>> > > */ >>> > > -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev); >>> > > +void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool >>> > > vrings); >>> > > >>> > > /** >>> > > * DOC: vhost device configuration handling >>> > > diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c >>> > > index bc13e466b4..572f87b3be 100644 >>> > > --- a/backends/cryptodev-vhost.c >>> > > +++ b/backends/cryptodev-vhost.c >>> > > @@ -94,7 +94,7 @@ cryptodev_vhost_start_one(CryptoDevBackendVhost >>> > > *crypto, >>> > > goto fail_notifiers; >>> > > } >>> > > >>> > > -r = vhost_dev_start(&crypto->dev, dev); >>> > > +r = vhost_dev_start(&crypto->dev, dev, false); >>> > > if (r
[PATCH 1/3] block: mention 'password-secret' option for -iscsi
The 'password-secret' option was added commit b189346eb1784df95ed6fed610411dbf23d19e1f Author: Daniel P. Berrangé Date: Thu Jan 21 14:19:21 2016 + iscsi: add support for getting CHAP password via QCryptoSecret API but was not mentioned in the command line docs Signed-off-by: Daniel P. Berrangé --- qemu-options.hx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qemu-options.hx b/qemu-options.hx index 7f99d15b23..055df73306 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1889,7 +1889,7 @@ SRST ERST DEF("iscsi", HAS_ARG, QEMU_OPTION_iscsi, -"-iscsi [user=user][,password=password]\n" +"-iscsi [user=user][,password=password],password-secret=secret-id]\n" " [,header-digest=CRC32C|CR32C-NONE|NONE-CRC32C|NONE\n" " [,initiator-name=initiator-iqn][,id=target-iqn]\n" " [,timeout=timeout]\n" -- 2.38.1
[PATCH 0/3] More work on deprecation/removal of clear text passwords
This deprecates the -iscsi clear text 'password' option and deletes the previously deprecated -spice 'password' option. Daniel P. Berrangé (3): block: mention 'password-secret' option for -iscsi block: deprecate iSCSI 'password' in favour of 'password-secret' ui: remove deprecated 'password' option for SPICE block/iscsi.c | 3 +++ docs/about/deprecated.rst | 19 +++ docs/about/removed-features.rst | 7 +++ qemu-options.hx | 11 ++- ui/spice-core.c | 15 --- 5 files changed, 23 insertions(+), 32 deletions(-) -- 2.38.1
[PATCH 3/3] ui: remove deprecated 'password' option for SPICE
This has been replaced by the 'password-secret' option, which references a 'secret' object instance. Signed-off-by: Daniel P. Berrangé --- docs/about/deprecated.rst | 8 docs/about/removed-features.rst | 7 +++ qemu-options.hx | 9 + ui/spice-core.c | 15 --- 4 files changed, 8 insertions(+), 31 deletions(-) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 2cc8924fe9..ee4301f96d 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -73,14 +73,6 @@ Input parameters that take a size value should only use a size suffix the value is hexadecimal. That is, '0x20M' is deprecated, and should be written either as '32M' or as '0x200'. -``-spice password=string`` (since 6.0) -'' - -This option is insecure because the SPICE password remains visible in -the process listing. This is replaced by the new ``password-secret`` -option which lets the password be securely provided on the command -line using a ``secret`` object instance. - ``-smp`` ("parameter=0" SMP configurations) (since 6.2) ''' diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst index 63df9848fd..e04e095320 100644 --- a/docs/about/removed-features.rst +++ b/docs/about/removed-features.rst @@ -408,6 +408,13 @@ pcspk-audiodev=``. Use ``-device`` instead. +``-spice password=string`` (removed in 8.0) +''' + +This option is insecure because the SPICE password remains visible in +the process listing. This is replaced by the new ``password-secret`` +option which lets the password be securely provided on the command +line using a ``secret`` object instance. QEMU Machine Protocol (QMP) commands diff --git a/qemu-options.hx b/qemu-options.hx index 055df73306..8a326f4dbb 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -2132,7 +2132,7 @@ DEF("spice", HAS_ARG, QEMU_OPTION_spice, " [,tls-channel=[main|display|cursor|inputs|record|playback]]\n" " [,plaintext-channel=[main|display|cursor|inputs|record|playback]]\n" " [,sasl=on|off][,disable-ticketing=on|off]\n" -" [,password=][,password-secret=]\n" +" [,password-secret=]\n" " [,image-compression=[auto_glz|auto_lz|quic|glz|lz|off]]\n" " [,jpeg-wan-compression=[auto|never|always]]\n" " [,zlib-glz-wan-compression=[auto|never|always]]\n" @@ -2158,13 +2158,6 @@ SRST ``ipv4=on|off``; \ ``ipv6=on|off``; \ ``unix=on|off`` Force using the specified IP version. -``password=`` -Set the password you need to authenticate. - -This option is deprecated and insecure because it leaves the -password visible in the process listing. Use ``password-secret`` -instead. - ``password-secret=`` Set the ID of the ``secret`` object containing the password you need to authenticate. diff --git a/ui/spice-core.c b/ui/spice-core.c index c3ac20ad43..15fba68e31 100644 --- a/ui/spice-core.c +++ b/ui/spice-core.c @@ -413,9 +413,6 @@ static QemuOptsList qemu_spice_opts = { .name = "unix", .type = QEMU_OPT_BOOL, #endif -},{ -.name = "password", -.type = QEMU_OPT_STRING, },{ .name = "password-secret", .type = QEMU_OPT_STRING, @@ -671,20 +668,8 @@ static void qemu_spice_init(void) } passwordSecret = qemu_opt_get(opts, "password-secret"); if (passwordSecret) { -if (qemu_opt_get(opts, "password")) { -error_report("'password' option is mutually exclusive with " - "'password-secret'"); -exit(1); -} password = qcrypto_secret_lookup_as_utf8(passwordSecret, &error_fatal); -} else { -str = qemu_opt_get(opts, "password"); -if (str) { -warn_report("'password' option is deprecated and insecure, " -"use 'password-secret' instead"); -password = g_strdup(str); -} } if (tls_port) { -- 2.38.1
[PATCH 2/3] block: deprecate iSCSI 'password' in favour of 'password-secret'
Support for referencing secret objects was added in commit b189346eb1784df95ed6fed610411dbf23d19e1f Author: Daniel P. Berrangé Date: Thu Jan 21 14:19:21 2016 + iscsi: add support for getting CHAP password via QCryptoSecret API The existing 'password' option is overdue for deprecation and subsequent removal. Signed-off-by: Daniel P. Berrangé --- block/iscsi.c | 3 +++ docs/about/deprecated.rst | 11 +++ 2 files changed, 14 insertions(+) diff --git a/block/iscsi.c b/block/iscsi.c index a316d46d96..58c0623052 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -1352,6 +1352,9 @@ static void apply_chap(struct iscsi_context *iscsi, QemuOpts *opts, } else if (!password) { error_setg(errp, "CHAP username specified but no password was given"); return; +} else { +warn_report("iSCSI block driver 'password' option is deprecated, " +"use 'password-secret' instead"); } if (iscsi_set_initiator_username_pwd(iscsi, user, password)) { diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 93affe3669..2cc8924fe9 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -267,6 +267,17 @@ Options are: - move backing file to NVDIMM storage and keep ``pmem=on`` (to have NVDIMM with persistence guaranties). +Block driver options + + +``iscsi,password=xxx`` (since 8.0) +'' + +Specifying the iSCSI password in plain text on the command line using the +``password`` option is insecure. The ``password-secret`` option should be +used instead, to refer to a ``--object secret...`` instance that provides +a password via a file, or encrypted. + Device options -- -- 2.38.1
Re: [PATCH] MAINTAINERS: Inherit from nanoMIPS
Milica Lazarevic writes: > Hi, > > Stefan is no longer working with us, but I will be more than happy to take > maintaining the nanoMIPS ISA on me! > > Regards, > Milica > Any comments on this? Suggest you post a patch to update MAINTAINERS, with a suitable cc:.
Re: [PATCH 3/9] ui: Drop disabled code for SPICE_CHANNEL_WEBDAV
Daniel P. Berrangé writes: > On Thu, Dec 01, 2022 at 07:13:05AM +0100, Markus Armbruster wrote: >> HMP "info spice" has a bit of code to show channel type >> SPICE_CHANNEL_WEBDAV as "webdav", disabled since commit 7c6044a94e >> "hmp: info spice: take out webdav" (v2.3.0), because it compiles only >> with Spice versions 0.12.7 and later. Our minimum version is 0.12.5. > > Last version bump was 4 years ago > > commit 1b63665c2c0e0d52735e0dd5217f109fe0dd2322 > Author: Marc-André Lureau > Date: Wed Nov 28 19:59:32 2018 +0400 > > configure: bump spice-server required version to 0.12.5 > > ...snip > > According to repology, all the distros that are build target platforms > for QEMU include it: > > RHEL-7: 0.14.0 > Debian (Stretch): 0.12.8 > Debian (Jessie): 0.12.5 > FreeBSD (ports): 0.14.0 > OpenSUSE Leap 15: 0.14.0 > Ubuntu (Xenial): 0.12.6 > > We moved on from Debian and RHEL since then > >Debian 11: 0.14.3 >RHEL-8: 0.14.2 >FreeBSD (ports): 0.14.4 >Fedora 35: 0.14.0 >Ubuntu 20.04: 0.14.0 >OpenSUSE Leap 15.3: 0.14.3 > > IOW, we can bump to 0.14.0, and then revert the > webdav conditional commit. Will do in v2. Thanks!
Re: [PATCH 6/9] ui: Move HMP commands from monitor to new ui/ui-hmp-cmds.c
Philippe Mathieu-Daudé writes: > On 1/12/22 07:13, Markus Armbruster wrote: >> This moves these commands from MAINTAINERS section "Human >> Monitor (HMP)" to "Graphics". >> Signed-off-by: Markus Armbruster >> --- >> monitor/hmp-cmds.c | 342 -- >> ui/ui-hmp-cmds.c | 360 + >> ui/meson.build | 1 + >> 3 files changed, 361 insertions(+), 342 deletions(-) >> create mode 100644 ui/ui-hmp-cmds.c > > >> +#ifdef CONFIG_SPICE >> +void hmp_info_spice(Monitor *mon, const QDict *qdict) >> +{ >> +SpiceChannelList *chan; >> +SpiceInfo *info; >> +const char *channel_name; >> +const char * const channel_names[] = { > > Can be static (pre-existing). Separate patch, to keep the code motion pure. >> +[SPICE_CHANNEL_MAIN] = "main", >> +[SPICE_CHANNEL_DISPLAY] = "display", >> +[SPICE_CHANNEL_INPUTS] = "inputs", >> +[SPICE_CHANNEL_CURSOR] = "cursor", >> +[SPICE_CHANNEL_PLAYBACK] = "playback", >> +[SPICE_CHANNEL_RECORD] = "record", >> +[SPICE_CHANNEL_TUNNEL] = "tunnel", >> +[SPICE_CHANNEL_SMARTCARD] = "smartcard", >> +[SPICE_CHANNEL_USBREDIR] = "usbredir", >> +[SPICE_CHANNEL_PORT] = "port", >> +}; > Reviewed-by: Philippe Mathieu-Daudé Thanks!
Re: [PATCH 7/9] ui: Improve "change vnc" error reporting
Daniel P. Berrangé writes: > On Thu, Dec 01, 2022 at 07:13:09AM +0100, Markus Armbruster wrote: >> Switch from monitor_printf() to error_setg() and hmp_handle_error(). >> This makes "this is an error" more obvious both in the source and in >> the monitor, where hmp_handle_error() prefixes the message with >> "Error: ". >> >> Signed-off-by: Markus Armbruster >> --- >> monitor/hmp-cmds.c | 8 >> ui/ui-hmp-cmds.c | 22 ++ >> 2 files changed, 26 insertions(+), 4 deletions(-) >> >> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c >> index f0f7b74fb3..8542eee3d4 100644 >> --- a/monitor/hmp-cmds.c >> +++ b/monitor/hmp-cmds.c >> @@ -1209,9 +1209,8 @@ void hmp_change(Monitor *mon, const QDict *qdict) >> #ifdef CONFIG_VNC >> if (strcmp(device, "vnc") == 0) { >> if (read_only) { >> -monitor_printf(mon, >> - "Parameter 'read-only-mode' is invalid for >> VNC\n"); >> -return; >> +error_setg(&err, "Parameter 'read-only-mode' is invalid for >> VNC"); >> +goto end; >> } >> if (strcmp(target, "passwd") == 0 || >> strcmp(target, "password") == 0) { >> @@ -1223,7 +1222,8 @@ void hmp_change(Monitor *mon, const QDict *qdict) >> qmp_change_vnc_password(arg, &err); >> } >> } else { >> -monitor_printf(mon, "Expected 'password' after 'vnc'\n"); >> +error_setg(&err, "Expected 'password' after 'vnc'"); >> +goto end; >> } >> } else >> #endif > > Upto this point > > Reviewed-by: Daniel P. Berrangé > > >> diff --git a/ui/ui-hmp-cmds.c b/ui/ui-hmp-cmds.c >> index af290da9e1..90a4f86f25 100644 >> --- a/ui/ui-hmp-cmds.c >> +++ b/ui/ui-hmp-cmds.c >> @@ -270,6 +270,28 @@ out: >> hmp_handle_error(mon, err); >> } >> >> +void hmp_change_vnc(Monitor *mon, const char *device, const char *target, >> +const char *arg, const char *read_only, bool force, >> +Error **errp) >> +{ >> +if (read_only) { >> +error_setg(mon, "Parameter 'read-only-mode' is invalid for VNC"); >> +return; >> +} >> +if (strcmp(target, "passwd") == 0 || >> +strcmp(target, "password") == 0) { >> +if (!arg) { >> +MonitorHMP *hmp_mon = container_of(mon, MonitorHMP, common); >> +monitor_read_password(hmp_mon, hmp_change_read_arg, NULL); >> +return; >> +} else { >> +qmp_change_vnc_password(arg, &err); >> +} >> +} else { >> +monitor_printf(mon, "Expected 'password' after 'vnc'\n"); >> +} >> +} >> + > > This chunk ought to be in the next patch IIUC Oops! Thank you! >> void hmp_sendkey(Monitor *mon, const QDict *qdict) >> { >> const char *keys = qdict_get_str(qdict, "keys"); >> -- >> 2.37.3 >> >> > > With regards, > Daniel
[PATCH] accel/kvm/kvm-all: Handle register access errors
A register access error typically means something seriously wrong happened so that anything bad can happen after that and recovery is impossible. Even failing one register access is catastorophic as architecture-specific code are not written so that it torelates such failures. Make sure the VM stop and nothing worse happens if such an error occurs. Signed-off-by: Akihiko Odaki --- accel/kvm/kvm-all.c | 32 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index f99b0becd8..9e848f750e 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -2709,7 +2709,13 @@ bool kvm_cpu_check_are_resettable(void) static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg) { if (!cpu->vcpu_dirty) { -kvm_arch_get_registers(cpu); +int ret = kvm_arch_get_registers(cpu); +if (ret) { +error_report("Failed to get registers: %s", strerror(-ret)); +cpu_dump_state(cpu, stderr, CPU_DUMP_CODE); +vm_stop(RUN_STATE_INTERNAL_ERROR); +} + cpu->vcpu_dirty = true; } } @@ -2723,7 +2729,13 @@ void kvm_cpu_synchronize_state(CPUState *cpu) static void do_kvm_cpu_synchronize_post_reset(CPUState *cpu, run_on_cpu_data arg) { -kvm_arch_put_registers(cpu, KVM_PUT_RESET_STATE); +int ret = kvm_arch_put_registers(cpu, KVM_PUT_RESET_STATE); +if (ret) { +error_report("Failed to put registers after reset: %s", strerror(-ret)); +cpu_dump_state(cpu, stderr, CPU_DUMP_CODE); +vm_stop(RUN_STATE_INTERNAL_ERROR); +} + cpu->vcpu_dirty = false; } @@ -2734,7 +2746,12 @@ void kvm_cpu_synchronize_post_reset(CPUState *cpu) static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg) { -kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE); +int ret = kvm_arch_put_registers(cpu, KVM_PUT_FULL_STATE); +if (ret) { +error_report("Failed to put registers after init: %s", strerror(-ret)); +exit(1); +} + cpu->vcpu_dirty = false; } @@ -2827,7 +2844,14 @@ int kvm_cpu_exec(CPUState *cpu) MemTxAttrs attrs; if (cpu->vcpu_dirty) { -kvm_arch_put_registers(cpu, KVM_PUT_RUNTIME_STATE); +ret = kvm_arch_put_registers(cpu, KVM_PUT_RUNTIME_STATE); +if (ret) { +error_report("Failed to put registers after init: %s", + strerror(-ret)); +ret = -1; +break; +} + cpu->vcpu_dirty = false; } -- 2.38.1
Re: [PATCH qemu.git 0/1] hw/arm/virt: add 2x sp804 timer
On Wed, 30 Nov 2022 at 18:56, ~axelheider wrote: > > This patch adds timer peripherals to the arm-virt machine. The > use case is, that this machine is quite useful for testing purposes > when it comes to non-Linux operating system (seL4 in our case). > However, is currently lacks a dedicates timer peripheral, so some > scenarios cannot be tested easily with QEMU. The RTC cannot be > used, because he resolution is too low. Since the sp804 supposed > already exists in QEMU, adding these peripherals seems easy and > it does not appear to break any existing use cases. Is there a reason you can't use the CPU's built-in generic timer device ? That is what typical guest code does on this system. I'm a bit reluctant to add more devices to the virt board because over time it gradually gets increasingly complicated, and every new device model we expose to the guest is another thing that's part of the security attack surface for guest code trying to escape from a KVM VM. thanks -- PMM
[PATCH] target/arm: Propagate errno when writing list
Before this change, write_kvmstate_to_list() and write_list_to_kvmstate() tolerated even if it failed to access some register, and returned a bool indicating whether one of the register accesses failed. However, it does not make sen not to fail early as the the callers check the returned value and fail early anyway. So let write_kvmstate_to_list() and write_list_to_kvmstate() fail early too. This will allow to propagate errno to the callers and log it if appropriate. Signed-off-by: Akihiko Odaki --- target/arm/kvm-stub.c | 4 ++-- target/arm/kvm.c | 27 +++ target/arm/kvm64.c| 10 ++ target/arm/kvm_arm.h | 15 --- target/arm/machine.c | 13 - 5 files changed, 31 insertions(+), 38 deletions(-) diff --git a/target/arm/kvm-stub.c b/target/arm/kvm-stub.c index 965a486b32..f659afd7b8 100644 --- a/target/arm/kvm-stub.c +++ b/target/arm/kvm-stub.c @@ -13,12 +13,12 @@ #include "cpu.h" #include "kvm_arm.h" -bool write_kvmstate_to_list(ARMCPU *cpu) +int write_kvmstate_to_list(ARMCPU *cpu) { g_assert_not_reached(); } -bool write_list_to_kvmstate(ARMCPU *cpu, int level) +int write_list_to_kvmstate(ARMCPU *cpu, int level) { g_assert_not_reached(); } diff --git a/target/arm/kvm.c b/target/arm/kvm.c index f022c644d2..4cef0efc96 100644 --- a/target/arm/kvm.c +++ b/target/arm/kvm.c @@ -501,12 +501,12 @@ int kvm_arm_init_cpreg_list(ARMCPU *cpu) } assert(cpu->cpreg_array_len == arraylen); -if (!write_kvmstate_to_list(cpu)) { +ret = write_kvmstate_to_list(cpu); +if (ret) { /* Shouldn't happen unless kernel is inconsistent about * what registers exist. */ fprintf(stderr, "Initial read of kernel register state failed\n"); -ret = -EINVAL; goto out; } @@ -515,11 +515,10 @@ out: return ret; } -bool write_kvmstate_to_list(ARMCPU *cpu) +int write_kvmstate_to_list(ARMCPU *cpu) { CPUState *cs = CPU(cpu); int i; -bool ok = true; for (i = 0; i < cpu->cpreg_array_len; i++) { struct kvm_one_reg r; @@ -545,17 +544,16 @@ bool write_kvmstate_to_list(ARMCPU *cpu) g_assert_not_reached(); } if (ret) { -ok = false; +return ret; } } -return ok; +return 0; } -bool write_list_to_kvmstate(ARMCPU *cpu, int level) +int write_list_to_kvmstate(ARMCPU *cpu, int level) { CPUState *cs = CPU(cpu); int i; -bool ok = true; for (i = 0; i < cpu->cpreg_array_len; i++) { struct kvm_one_reg r; @@ -581,14 +579,10 @@ bool write_list_to_kvmstate(ARMCPU *cpu, int level) } ret = kvm_vcpu_ioctl(cs, KVM_SET_ONE_REG, &r); if (ret) { -/* We might fail for "unknown register" and also for - * "you tried to set a register which is constant with - * a different value from what it actually contains". - */ -ok = false; +return ret; } } -return ok; +return 0; } void kvm_arm_cpu_pre_save(ARMCPU *cpu) @@ -620,8 +614,9 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu) fprintf(stderr, "kvm_arm_vcpu_init failed: %s\n", strerror(-ret)); abort(); } -if (!write_kvmstate_to_list(cpu)) { -fprintf(stderr, "write_kvmstate_to_list failed\n"); +ret = write_kvmstate_to_list(cpu); +if (ret < 0) { +fprintf(stderr, "write_kvmstate_to_list failed: %s\n", strerror(-ret)); abort(); } /* diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index 1197253d12..f7879194f9 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -1202,8 +1202,9 @@ int kvm_arch_put_registers(CPUState *cs, int level) write_cpustate_to_list(cpu, true); -if (!write_list_to_kvmstate(cpu, level)) { -return -EINVAL; +ret = write_list_to_kvmstate(cpu, level); +if (ret) { +return ret; } /* @@ -1418,8 +1419,9 @@ int kvm_arch_get_registers(CPUState *cs) return ret; } -if (!write_kvmstate_to_list(cpu)) { -return -EINVAL; +ret = write_kvmstate_to_list(cpu); +if (ret) { +return ret; } /* Note that it's OK to have registers which aren't in CPUState, * so we can ignore a failure return here. diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h index 99017b635c..05fce74baf 100644 --- a/target/arm/kvm_arm.h +++ b/target/arm/kvm_arm.h @@ -106,13 +106,9 @@ int kvm_arm_cpreg_level(uint64_t regidx); * This updates KVM's working data structures from TCG data or * from incoming migration state. * - * Returns: true if all register values were updated correctly, - * false if some register was unknown to the kernel or could not - * be written (eg constant register with the wrong value). - * Note that we do not stop early on failure -- we will attempt - * writing all registers in the list. + * Returns: 0 if success, else < 0 er
Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices
On Thu, Dec 01, 2022 at 10:14:39AM +, Alex Bennée wrote: > Do you think rust-vmm's vhost crates have enough of the state > management to manage vhost and vhost-user backends? Maybe it would be a > good experiment in replacing a (small well defined) piece of > functionality with rust? > > That said there is a lot of deep magic in the vhost-net stuff which I > think is down to the interaction with things like vdpk and other network > optimisations that might be missing. For the rest of the devices most of > the code is basically boiler plate which has grown variations due to > code motion and change. This is the sort of thing that generics solves > well. Not sure what you want to replace with what though, libvhost-user or vhost-user bits in qemu? -- MST
Re: [PATCH] accel/kvm/kvm-all: Handle register access errors
On Thu, 1 Dec 2022 at 10:27, Akihiko Odaki wrote: > > A register access error typically means something seriously wrong > happened so that anything bad can happen after that and recovery is > impossible. > Even failing one register access is catastorophic as > architecture-specific code are not written so that it torelates such > failures. > > Make sure the VM stop and nothing worse happens if such an error occurs. > > Signed-off-by: Akihiko Odaki In a similar vein there was also https://lore.kernel.org/all/20220617144857.34189-1-pet...@redhat.com/ back in June, which on the one hand was less comprehensive but on the other does the plumbing to pass the error upwards rather than reporting it immediately at point of failure. I'm in principle in favour but suspect we'll run into some corner cases where we were happily ignoring not-very-important failures (eg if you're running Linux as the host OS on a Mac M1 and your host kernel doesn't have this fix: https://lore.kernel.org/all/ynhz6cw5onr2e...@google.com/T/ then QEMU will go from "works by sheer luck" to "consistently hits this error check"). So we should aim to land this extra error checking early in the release cycle so we have plenty of time to deal with any bug reports we get about it. thanks -- PMM
Re: [PATCH v2 1/1] tcg: convert tcg/README to rst
On 30/11/2022 18:52, Fabiano Rosas wrote: Mark Cave-Ayland writes: Convert tcg/README to rst and move it to docs/devel as a new "TCG Intermediate Representation" page. There are a few minor changes to improve the aesthetic of the final output which are as follows: - Rename the title from "Tiny Code Generator - Fabrice Bellard" to "TCG Intermediate Representation" - Remove the section numbering - Add the missing parameters to the ssadd_vec operations in the "Host vector operations" section - Change the path to the Atomic Operations document to use a proper reference - Replace tcg/README in tcg.rst with a proper reference to the new document Signed-off-by: Mark Cave-Ayland Reviewed-by: Fabiano Rosas Thanks! I have a couple of suggestions for a small restructuring. But we could do that after this patch gets in, no worries. The index now looks like: Translator Internals <--- mentions what TCG stands for CPU state optimisations<--- references TBs and targets Direct block chaining Self-modifying code and... Exception support MMU emulation TCG Intermediate Representation <--- duplicate section name (see below) Introduction <--- 2nd time we mention what TCG stands for Definitions <--- defines TBs and targets Intermediate representation <--- duplicate section name Instruction Reference Backend Recommended coding rules for best performance I think it would be nicer to merge the text from "Introduction" with the paragraph that mentions TCG in "Translator Internals" instead of linking; and moving the whole "Definitions" section to before "CPU state optimizations". That way we keep the definitions before the text that mentions the terms and remove the duplicate "intermediate representation" in the index. I don't have a great deal of time in the short term, but I agree the documentation can be further improved once the patch has been merged. At least once the conversion is done, it allows for others to come up with subsequent patches as time allows :) ATB, Mark.
Re: [PATCH v2 1/1] tcg: convert tcg/README to rst
On 30/11/2022 22:12, Richard Henderson wrote: On 11/30/22 02:04, Mark Cave-Ayland wrote: + * - eqv_i32/i64 *t0*, *t1*, *t2* + + - | *t0* = ~(*t1* ^ *t2*), or equivalently, *t0* = *t1* & ~\ *t2* t1 ^ ~t2 The only typo I saw, fixed while queuing. Ooops! Thanks for spotting and fixing. ATB, Mark.
Re: [PATCH] accel/kvm/kvm-all: Handle register access errors
On 2022/12/01 19:40, Peter Maydell wrote: On Thu, 1 Dec 2022 at 10:27, Akihiko Odaki wrote: A register access error typically means something seriously wrong happened so that anything bad can happen after that and recovery is impossible. Even failing one register access is catastorophic as architecture-specific code are not written so that it torelates such failures. Make sure the VM stop and nothing worse happens if such an error occurs. Signed-off-by: Akihiko Odaki In a similar vein there was also https://lore.kernel.org/all/20220617144857.34189-1-pet...@redhat.com/ back in June, which on the one hand was less comprehensive but on the other does the plumbing to pass the error upwards rather than reporting it immediately at point of failure. I'm in principle in favour but suspect we'll run into some corner cases where we were happily ignoring not-very-important failures (eg if you're running Linux as the host OS on a Mac M1 and your host kernel doesn't have this fix: https://lore.kernel.org/all/ynhz6cw5onr2e...@google.com/T/ then QEMU will go from "works by sheer luck" to "consistently hits this error check"). So we should aim to land this extra error checking early in the release cycle so we have plenty of time to deal with any bug reports we get about it. thanks -- PMM Actually I found this problem when I tried to run QEMU with KVM on M2 MacBook Air and encountered a failure described and fixed at: https://lore.kernel.org/all/20221201104914.28944-2-akihiko.od...@daynix.com/ Although the affected register was not really important, QEMU couldn't run the guest well enough because kvm_arch_put_registers for ARM64 is written in a way that it fails early. I guess the situation is not so different for other architectures as well. I still agree that this should be postponed until a new release cycle starts as register saving/restoring is too important to fail. Regards, Akihiko Odaki
Re: [PATCH] tests/qtest/vhost-user-blk-test: don't abort all qtests on missing envar
On 25/11/2022 16.58, Christian Schoenebeck wrote: This test requires environment variable QTEST_QEMU_STORAGE_DAEMON_BINARY to be defined for running. If not, it would immediately abort all qtests and prevent other, unrelated tests from running. To fix that, just skip vhost-user-blk-test instead and log a message about missing environment variable. Signed-off-by: Christian Schoenebeck --- I also tried g_test_skip(errmsg) from the setup handlers instead, but it always caused the tests to abort with an error: ../tests/qtest/libqtest.c:179: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0) I haven't further investigated. tests/qtest/vhost-user-blk-test.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/tests/qtest/vhost-user-blk-test.c b/tests/qtest/vhost-user-blk-test.c index 07a4c2d500..dc37f5af4d 100644 --- a/tests/qtest/vhost-user-blk-test.c +++ b/tests/qtest/vhost-user-blk-test.c @@ -983,6 +983,12 @@ static void register_vhost_user_blk_test(void) .before = vhost_user_blk_test_setup, }; +if (!getenv("QTEST_QEMU_STORAGE_DAEMON_BINARY")) { +g_test_message("QTEST_QEMU_STORAGE_DAEMON_BINARY not defined, " + "skipping vhost-user-blk-test"); +return; +} Could we use g_test_skip() here? Thomas
Re: regression: insmod module failed in VM with nvdimm on
On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel wrote: > > On Thu, 1 Dec 2022 at 08:15, chenxiang (M) wrote: > > > > Hi Ard, > > > > > > 在 2022/11/30 16:18, Ard Biesheuvel 写道: > > > On Wed, 30 Nov 2022 at 08:53, Marc Zyngier wrote: > > >> On Wed, 30 Nov 2022 02:52:35 +, > > >> "chenxiang (M)" wrote: > > >>> Hi, > > >>> > > >>> We boot the VM using following commands (with nvdimm on) (qemu > > >>> version 6.1.50, kernel 6.0-r4): > > >> How relevant is the presence of the nvdimm? Do you observe the failure > > >> without this? > > >> > > >>> qemu-system-aarch64 -machine > > >>> virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel > > >>> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios > > >>> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m > > >>> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0 > > >>> ealycon=pl0ll,0x9000 pcie_ports=native pciehp.pciehp_debug=1' > > >>> -object memory-backend-ram,id=ram1,size=10G -device > > >>> nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1 > > >>> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1 > > >>> > > >>> Then in VM we insmod a module, vmalloc error occurs as follows (kernel > > >>> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4): > > >>> > > >>> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko > > >>> [8.186563] vmap allocation for size 20480 failed: use > > >>> vmalloc= to increase size > > >> Have you tried increasing the vmalloc size to check that this is > > >> indeed the problem? > > >> > > >> [...] > > >> > > >>> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr: > > >>> defer initialization to initcall where permitted"). > > >> I guess you mean commit fc5a89f75d2a instead, right? > > >> > > >>> Do you have any idea about the issue? > > >> I sort of suspect that the nvdimm gets vmap-ed and consumes a large > > >> portion of the vmalloc space, but you give very little information > > >> that could help here... > > >> > > > Ouch. I suspect what's going on here: that patch defers the > > > randomization of the module region, so that we can decouple it from > > > the very early init code. > > > > > > Obviously, it is happening too late now, and the randomized module > > > region is overlapping with a vmalloc region that is in use by the time > > > the randomization occurs. > > > > > > Does the below fix the issue? > > > > The issue still occurs, but it seems decrease the probability, before it > > occured almost every time, after the change, i tried 2-3 times, and it > > occurs. > > But i change back "subsys_initcall" to "core_initcall", and i test more > > than 20 times, and it is still ok. > > > > Thank you for confirming. I will send out a patch today. > ...but before I do that, could you please check whether the change below fixes your issue as well? diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 6ccc7ef600e7c1e1..c8c205b630da1951 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -20,7 +20,11 @@ #include #include -u64 __ro_after_init module_alloc_base; +/* + * Set a reasonable default for module_alloc_base in case + * we end up running with module randomization disabled. + */ +u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE; u16 __initdata memstart_offset_seed; struct arm64_ftr_override kaslr_feature_override __initdata; @@ -30,12 +34,6 @@ static int __init kaslr_init(void) u64 module_range; u32 seed; - /* -* Set a reasonable default for module_alloc_base in case -* we end up running with module randomization disabled. -*/ - module_alloc_base = (u64)_etext - MODULES_VSIZE; - if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) { pr_info("KASLR disabled on command line\n"); return 0;
Re: [PATCH] accel/kvm/kvm-all: Handle register access errors
On Thu, 1 Dec 2022 at 11:00, Akihiko Odaki wrote: > > On 2022/12/01 19:40, Peter Maydell wrote: > > On Thu, 1 Dec 2022 at 10:27, Akihiko Odaki wrote: > >> > >> A register access error typically means something seriously wrong > >> happened so that anything bad can happen after that and recovery is > >> impossible. > >> Even failing one register access is catastorophic as > >> architecture-specific code are not written so that it torelates such > >> failures. > >> > >> Make sure the VM stop and nothing worse happens if such an error occurs. > >> > >> Signed-off-by: Akihiko Odaki > > > > In a similar vein there was also > > https://lore.kernel.org/all/20220617144857.34189-1-pet...@redhat.com/ > > back in June, which on the one hand was less comprehensive but on > > the other does the plumbing to pass the error upwards rather than > > reporting it immediately at point of failure. > > > > I'm in principle in favour but suspect we'll run into some corner > > cases where we were happily ignoring not-very-important failures > > (eg if you're running Linux as the host OS on a Mac M1 and your > > host kernel doesn't have this fix: > > https://lore.kernel.org/all/ynhz6cw5onr2e...@google.com/T/ > > then QEMU will go from "works by sheer luck" to "consistently > > hits this error check"). So we should aim to land this extra > > error checking early in the release cycle so we have plenty of > > time to deal with any bug reports we get about it. > Actually I found this problem when I tried to run QEMU with KVM on M2 > MacBook Air and encountered a failure described and fixed at: > https://lore.kernel.org/all/20221201104914.28944-2-akihiko.od...@daynix.com/ Ah, yeah, you're trying to run QEMU+KVM on a heterogenous cluster. You need to force all the vCPUs to run on only a single host CPU type. It's a shame the error-reporting for this situation is not very good, but there's not really any way to tell in advance, the best you get is an error at the point where a vCPU happens to migrate over to a different host CPU. > Although the affected register was not really important, QEMU couldn't > run the guest well enough because kvm_arch_put_registers for ARM64 is > written in a way that it fails early. I guess the situation is not so > different for other architectures as well. I think Arm is the only one that does this kind of "leave the handling of the system registers up to the host kernel and treat them as mostly black-box values to be passed around on migration" approach. Most other architectures have QEMU know about specific system registers in the vCPU and only ask the kernel about those, I think. -- PMM
Re: [PATCH] linux-user,bsd-user: re-exec with G_SLICE=always-malloc
Emilio Cota writes: > On Tue, Oct 04, 2022 at 13:00:47 +0100, Daniel P. Berrangé wrote: > (snip) >> Can't say I especially like this but I'm out of other ideas for how >> to guarantee a solution. Users can't set env vars prior to launching >> QEMU user emulators when using binfmt. > > An alternative is to not use GSlice between fork/exec. I'm > not sure if within that region there are other users besides > GTree (GArray perhaps?), but if there aren't, then just using > a different binary tree implementation should do. Hmm my distros version of GArray certainly does and that is used quite heavily across gdbstub and plugins. > > Untested patches using ccan's AVL tree: > https://github.com/cota/qemu/commits/avl > > Would that be more palatable? I think generally we wouldn't want to have multiple implementations unless there was a definite benefit (c.f. QHT). That said I think Richard's latest optimisation work: Subject: [PATCH v2 0/7] accel/tcg: Rewrite user-only vma tracking Date: Thu, 27 Oct 2022 22:12:51 +1100 Message-Id: <20221027111258.348196-1-richard.hender...@linaro.org> brings in the kernel's interval tree (with unit tests). I wonder if the page_collection use of GTree could be converted to that? I don't know how you would defend against re-introducing it into linux-user though aside from commentary. > > Thanks, > Emilio -- Alex Bennée
Re: [PATCH v12 6/7] s390x/cpu_topology: activating CPU topology
On 12/1/22 11:15, Thomas Huth wrote: On 29/11/2022 18.42, Pierre Morel wrote: The KVM capability, KVM_CAP_S390_CPU_TOPOLOGY is used to activate the S390_FEAT_CONFIGURATION_TOPOLOGY feature and the topology facility for the guest in the case the topology is available in QEMU and in KVM. The feature is fenced for SE (secure execution). Out of curiosity: Why does it not work yet? To allow smooth migration with old QEMU the feature is disabled by default using the CPU flag -disable-topology. I stared at this code for a while now, but I have to admit that I don't quite get it. Why do we need a new "disable" feature flag here? I think it is pretty much impossible to set "ctop=on" with an older version of QEMU, since it would require the QEMU to enable KVM_CAP_S390_CPU_TOPOLOGY in the kernel for this feature bit - and older versions of QEMU don't set this capability yet. Which scenario would fail without this disable-topology feature bit? What do I miss? The only scenario it provides is that ctop is then disabled by default on newer QEMU allowing migration between old and new QEMU for older machine without changing the CPU flags. Otherwise, we would need -ctop=off on newer QEMU to disable the topology. Making the S390_FEAT_CONFIGURATION_TOPOLOGY belonging to the default features makes the -ctop CPU flag is no more necessary, too many verbs in this sentence ;-) definitively :) turning the topology feature on is done with -disable-topology only. ... Thomas -- Pierre Morel IBM Lab Boeblingen
Re: [PATCH for-8.0] hw/misc: Move some arm-related files from specific_ss into softmmu_ss
On Wed, 30 Nov 2022 at 11:16, Thomas Huth wrote: > > By removing #include "kvm-consts.h" from arm-powerctl.h (seems not to > be required there) and adjusting the header includes in some files, we > can move them from specific_ss into softmmu_ss, so that they only need > to be compiled once and not for qemu-system-arm and qemu-system-aarch64 > individually. > --- a/target/arm/arm-powerctl.h > +++ b/target/arm/arm-powerctl.h > @@ -11,8 +11,6 @@ > #ifndef QEMU_ARM_POWERCTL_H > #define QEMU_ARM_POWERCTL_H > > -#include "kvm-consts.h" > - > #define QEMU_ARM_POWERCTL_RET_SUCCESS QEMU_PSCI_RET_SUCCESS > #define QEMU_ARM_POWERCTL_INVALID_PARAM QEMU_PSCI_RET_INVALID_PARAMS > #define QEMU_ARM_POWERCTL_ALREADY_ON QEMU_PSCI_RET_ALREADY_ON kvm-consts.h is where QEMU_PSCI_RET_SUCCESS etc are defined. So while the #include isn't strictly needed for compilation to work because arm-powerctl.h only creates the #define and doesn't use it, it does mean that any source file that uses the QEMU_ARM_POWERCTL_* now needs to include kvm-consts.h somehow itself. (Usually this is going to happen implicitly via target/arm/cpu.h, I think.) I guess this is worth living with for the benefit of not compiling things twice. It could probably be untangled a little by e.g. moving the PSCI constants into their own header rather than defining them in kvm-consts.h, but I'm not sure if it's worth the effort right now. > diff --git a/hw/misc/meson.build b/hw/misc/meson.build > index 95268eddc0..9ca6bf1d17 100644 > --- a/hw/misc/meson.build > +++ b/hw/misc/meson.build > @@ -84,8 +84,8 @@ softmmu_ss.add(when: 'CONFIG_RASPI', if_true: files( > )) > softmmu_ss.add(when: 'CONFIG_SLAVIO', if_true: files('slavio_misc.c')) > softmmu_ss.add(when: 'CONFIG_ZYNQ', if_true: files('zynq_slcr.c')) > -specific_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: > files('xlnx-zynqmp-crf.c')) > -specific_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: > files('xlnx-zynqmp-apu-ctrl.c')) > +softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: > files('xlnx-zynqmp-crf.c')) > +softmmu_ss.add(when: 'CONFIG_XLNX_ZYNQMP_ARM', if_true: > files('xlnx-zynqmp-apu-ctrl.c')) > specific_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: > files('xlnx-versal-crl.c')) > softmmu_ss.add(when: 'CONFIG_XLNX_VERSAL', if_true: files( >'xlnx-versal-xramc.c', > @@ -126,8 +126,8 @@ softmmu_ss.add(when: 'CONFIG_GRLIB', if_true: > files('grlib_ahb_apb_pnp.c')) > > specific_ss.add(when: 'CONFIG_AVR_POWER', if_true: files('avr_power.c')) > > -specific_ss.add(when: 'CONFIG_IMX', if_true: files('imx6_src.c')) > -specific_ss.add(when: 'CONFIG_IOTKIT_SYSCTL', if_true: > files('iotkit-sysctl.c')) > +softmmu_ss.add(when: 'CONFIG_IMX', if_true: files('imx6_src.c')) This file could now be listed in the "softmmu_ss.add(when: 'CONFIG_IMX', if_true: files(...)" list earlier in the file. > +softmmu_ss.add(when: 'CONFIG_IOTKIT_SYSCTL', if_true: > files('iotkit-sysctl.c')) This line could be moved up to next to the other CONFIG_IOTKIT_* lines. > specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c')) thanks -- PMM
Re: regression: insmod module failed in VM with nvdimm on
在 2022/12/1 19:07, Ard Biesheuvel 写道: On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel wrote: On Thu, 1 Dec 2022 at 08:15, chenxiang (M) wrote: Hi Ard, 在 2022/11/30 16:18, Ard Biesheuvel 写道: On Wed, 30 Nov 2022 at 08:53, Marc Zyngier wrote: On Wed, 30 Nov 2022 02:52:35 +, "chenxiang (M)" wrote: Hi, We boot the VM using following commands (with nvdimm on) (qemu version 6.1.50, kernel 6.0-r4): How relevant is the presence of the nvdimm? Do you observe the failure without this? qemu-system-aarch64 -machine virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0 ealycon=pl0ll,0x9000 pcie_ports=native pciehp.pciehp_debug=1' -object memory-backend-ram,id=ram1,size=10G -device nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1 -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1 Then in VM we insmod a module, vmalloc error occurs as follows (kernel 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4): estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko [8.186563] vmap allocation for size 20480 failed: use vmalloc= to increase size Have you tried increasing the vmalloc size to check that this is indeed the problem? [...] We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr: defer initialization to initcall where permitted"). I guess you mean commit fc5a89f75d2a instead, right? Do you have any idea about the issue? I sort of suspect that the nvdimm gets vmap-ed and consumes a large portion of the vmalloc space, but you give very little information that could help here... Ouch. I suspect what's going on here: that patch defers the randomization of the module region, so that we can decouple it from the very early init code. Obviously, it is happening too late now, and the randomized module region is overlapping with a vmalloc region that is in use by the time the randomization occurs. Does the below fix the issue? The issue still occurs, but it seems decrease the probability, before it occured almost every time, after the change, i tried 2-3 times, and it occurs. But i change back "subsys_initcall" to "core_initcall", and i test more than 20 times, and it is still ok. Thank you for confirming. I will send out a patch today. ...but before I do that, could you please check whether the change below fixes your issue as well? Yes, but i can only reply to you tomorrow as other guy is testing on the only environment today. diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c index 6ccc7ef600e7c1e1..c8c205b630da1951 100644 --- a/arch/arm64/kernel/kaslr.c +++ b/arch/arm64/kernel/kaslr.c @@ -20,7 +20,11 @@ #include #include -u64 __ro_after_init module_alloc_base; +/* + * Set a reasonable default for module_alloc_base in case + * we end up running with module randomization disabled. + */ +u64 __ro_after_init module_alloc_base = (u64)_etext - MODULES_VSIZE; u16 __initdata memstart_offset_seed; struct arm64_ftr_override kaslr_feature_override __initdata; @@ -30,12 +34,6 @@ static int __init kaslr_init(void) u64 module_range; u32 seed; - /* -* Set a reasonable default for module_alloc_base in case -* we end up running with module randomization disabled. -*/ - module_alloc_base = (u64)_etext - MODULES_VSIZE; - if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) { pr_info("KASLR disabled on command line\n"); return 0; .
[PATCH v2 04/13] pci: Make query-pci stub consistent with the real one
QMP query-pci and HMP info pci can behave differently when there are no PCI devices. They can report nothing, like this: qemu-system-aarch64 -S -M spitz -display none -monitor stdio QEMU 7.1.91 monitor - type 'help' for more information (qemu) info pci Or they can fail, like this: qemu-system-microblaze -M petalogix-s3adsp1800 -display none -monitor stdio QEMU 7.1.91 monitor - type 'help' for more information (qemu) info pci PCI devices not supported They fail when none of the target's machines supports PCI, i.e. when we're using qmp_query_pci() from hw/pci/pci-stub.c. The error is not useful, and reporting nothing makes sense, so do that in pci-stub.c, too. Now qmp_query_pci() can't fail anymore. Drop the dead error handling from hmp_info_pci(). Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Dr. David Alan Gilbert --- hw/pci/pci-hmp-cmds.c | 8 +--- hw/pci/pci-stub.c | 3 --- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index 5adfe4f57f..e915fb9fe7 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -105,14 +105,8 @@ static void hmp_info_pci_device(Monitor *mon, const PciDeviceInfo *dev) void hmp_info_pci(Monitor *mon, const QDict *qdict) { PciInfoList *info_list, *info; -Error *err = NULL; -info_list = qmp_query_pci(&err); -if (err) { -monitor_printf(mon, "PCI devices not supported\n"); -error_free(err); -return; -} +info_list = qmp_query_pci(&error_abort); for (info = info_list; info; info = info->next) { PciDeviceInfoList *dev; diff --git a/hw/pci/pci-stub.c b/hw/pci/pci-stub.c index 3a027c42e4..f29ecc999e 100644 --- a/hw/pci/pci-stub.c +++ b/hw/pci/pci-stub.c @@ -21,9 +21,7 @@ #include "qemu/osdep.h" #include "sysemu/sysemu.h" #include "monitor/monitor.h" -#include "qapi/error.h" #include "qapi/qapi-commands-pci.h" -#include "qapi/qmp/qerror.h" #include "hw/pci/pci.h" #include "hw/pci/msi.h" #include "hw/pci/msix.h" @@ -33,7 +31,6 @@ bool pci_available; PciInfoList *qmp_query_pci(Error **errp) { -error_setg(errp, QERR_UNSUPPORTED); return NULL; } -- 2.37.3
[PATCH v2 01/13] pci: Clean up a few things checkpatch.pl would flag later on
Fix a few style violations so that checkpatch.pl won't complain when I move this code. Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé --- hw/pci/pci.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 2f450f6a72..53ed447115 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -1851,10 +1851,12 @@ static PciBridgeInfo *qmp_query_pci_bridge(PCIDevice *dev, PCIBus *bus, range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); if (dev->config[PCI_SECONDARY_BUS] != 0) { -PCIBus *child_bus = pci_find_bus_nr(bus, dev->config[PCI_SECONDARY_BUS]); +PCIBus *child_bus = pci_find_bus_nr(bus, +dev->config[PCI_SECONDARY_BUS]); if (child_bus) { info->has_devices = true; -info->devices = qmp_query_pci_devices(child_bus, dev->config[PCI_SECONDARY_BUS]); +info->devices = qmp_query_pci_devices(child_bus, +dev->config[PCI_SECONDARY_BUS]); } } @@ -2612,8 +2614,9 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) pci_get_word(d->config + PCI_SUBSYSTEM_ID)); for (i = 0; i < PCI_NUM_REGIONS; i++) { r = &d->io_regions[i]; -if (!r->size) +if (!r->size) { continue; +} monitor_printf(mon, "%*sbar %d: %s at 0x%"FMT_PCIBUS " [0x%"FMT_PCIBUS"]\n", indent, "", -- 2.37.3
[PATCH v2 00/13] pci: Move and clean up monitor command code
This is mainly about splitting off monitor-related code. There's also a few UI fixes to HMP command pcie_aer_inject_error. v2: * PATCH 08: Use qemu_strtoui() [David], commit message corrected * PATCH 13: New Markus Armbruster (13): pci: Clean up a few things checkpatch.pl would flag later on pci: Move QMP commands to new hw/pci/pci-qmp-cmds.c pci: Move HMP commands from monitor/ to new hw/pci/pci-hmp-cmds.c pci: Make query-pci stub consistent with the real one pci: Build hw/pci/pci-hmp-cmds.c only when CONFIG_PCI pci: Deduplicate get_class_desc() pci: Move pcibus_dev_print() to pci-hmp-cmds.c pci: Fix silent truncation of pcie_aer_inject_error argument pci: Move HMP command from hw/pci/pcie_aer.c to pci-hmp-cmds.c pci: Inline do_pcie_aer_inject_error() into its only caller pci: Rename hmp_pcie_aer_inject_error()'s local variable @err pci: Improve do_pcie_aer_inject_error()'s error messages pci: Reject pcie_aer_inject_error -c with symbolic error status hw/pci/pci-internal.h | 25 + include/monitor/hmp.h | 1 + include/sysemu/sysemu.h | 3 - hw/pci/pci-hmp-cmds.c | 238 hw/pci/pci-qmp-cmds.c | 201 + hw/pci/pci-stub.c | 9 +- hw/pci/pci.c| 226 +- hw/pci/pcie_aer.c | 113 +-- monitor/hmp-cmds.c | 107 -- hw/pci/meson.build | 2 + 10 files changed, 480 insertions(+), 445 deletions(-) create mode 100644 hw/pci/pci-internal.h create mode 100644 hw/pci/pci-hmp-cmds.c create mode 100644 hw/pci/pci-qmp-cmds.c -- 2.37.3
[PATCH v2 11/13] pci: Rename hmp_pcie_aer_inject_error()'s local variable @err
I'd like to use @err for an Error *err. Rename PCIEAERErr err to aer_err. Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé --- hw/pci/pci-hmp-cmds.c | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index a9a5bbb930..2dd65ca6ee 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -167,7 +167,7 @@ void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) unsigned int num; bool correctable; PCIDevice *dev; -PCIEAERErr err; +PCIEAERErr aer_err; int ret; ret = pci_qdev_find_device(id, &dev); @@ -193,34 +193,34 @@ void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) error_status = num; correctable = qdict_get_try_bool(qdict, "correctable", false); } -err.status = error_status; -err.source_id = pci_requester_id(dev); +aer_err.status = error_status; +aer_err.source_id = pci_requester_id(dev); -err.flags = 0; +aer_err.flags = 0; if (correctable) { -err.flags |= PCIE_AER_ERR_IS_CORRECTABLE; +aer_err.flags |= PCIE_AER_ERR_IS_CORRECTABLE; } if (qdict_get_try_bool(qdict, "advisory_non_fatal", false)) { -err.flags |= PCIE_AER_ERR_MAYBE_ADVISORY; +aer_err.flags |= PCIE_AER_ERR_MAYBE_ADVISORY; } if (qdict_haskey(qdict, "header0")) { -err.flags |= PCIE_AER_ERR_HEADER_VALID; +aer_err.flags |= PCIE_AER_ERR_HEADER_VALID; } if (qdict_haskey(qdict, "prefix0")) { -err.flags |= PCIE_AER_ERR_TLP_PREFIX_PRESENT; +aer_err.flags |= PCIE_AER_ERR_TLP_PREFIX_PRESENT; } -err.header[0] = qdict_get_try_int(qdict, "header0", 0); -err.header[1] = qdict_get_try_int(qdict, "header1", 0); -err.header[2] = qdict_get_try_int(qdict, "header2", 0); -err.header[3] = qdict_get_try_int(qdict, "header3", 0); +aer_err.header[0] = qdict_get_try_int(qdict, "header0", 0); +aer_err.header[1] = qdict_get_try_int(qdict, "header1", 0); +aer_err.header[2] = qdict_get_try_int(qdict, "header2", 0); +aer_err.header[3] = qdict_get_try_int(qdict, "header3", 0); -err.prefix[0] = qdict_get_try_int(qdict, "prefix0", 0); -err.prefix[1] = qdict_get_try_int(qdict, "prefix1", 0); -err.prefix[2] = qdict_get_try_int(qdict, "prefix2", 0); -err.prefix[3] = qdict_get_try_int(qdict, "prefix3", 0); +aer_err.prefix[0] = qdict_get_try_int(qdict, "prefix0", 0); +aer_err.prefix[1] = qdict_get_try_int(qdict, "prefix1", 0); +aer_err.prefix[2] = qdict_get_try_int(qdict, "prefix2", 0); +aer_err.prefix[3] = qdict_get_try_int(qdict, "prefix3", 0); -ret = pcie_aer_inject_error(dev, &err); +ret = pcie_aer_inject_error(dev, &aer_err); if (ret < 0) { monitor_printf(mon, "failed to inject error: %s\n", strerror(-ret)); -- 2.37.3
[PATCH v2 07/13] pci: Move pcibus_dev_print() to pci-hmp-cmds.c
This method is for HMP command "info qtree". Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé --- hw/pci/pci-internal.h | 1 + hw/pci/pci-hmp-cmds.c | 38 ++ hw/pci/pci.c | 38 -- 3 files changed, 39 insertions(+), 38 deletions(-) diff --git a/hw/pci/pci-internal.h b/hw/pci/pci-internal.h index 4903a26cbf..3199615e50 100644 --- a/hw/pci/pci-internal.h +++ b/hw/pci/pci-internal.h @@ -16,5 +16,6 @@ extern PCIHostStateList pci_host_bridges; const pci_class_desc *get_class_desc(int class); PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); +void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); #endif diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index e915fb9fe7..417f1ca607 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -14,8 +14,10 @@ */ #include "qemu/osdep.h" +#include "hw/pci/pci.h" #include "monitor/hmp.h" #include "monitor/monitor.h" +#include "pci-internal.h" #include "qapi/error.h" #include "qapi/qapi-commands-pci.h" @@ -118,3 +120,39 @@ void hmp_info_pci(Monitor *mon, const QDict *qdict) qapi_free_PciInfoList(info_list); } + +void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) +{ +PCIDevice *d = (PCIDevice *)dev; +int class = pci_get_word(d->config + PCI_CLASS_DEVICE); +const pci_class_desc *desc = get_class_desc(class); +char ctxt[64]; +PCIIORegion *r; +int i; + +if (desc->desc) { +snprintf(ctxt, sizeof(ctxt), "%s", desc->desc); +} else { +snprintf(ctxt, sizeof(ctxt), "Class %04x", class); +} + +monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, " + "pci id %04x:%04x (sub %04x:%04x)\n", + indent, "", ctxt, pci_dev_bus_num(d), + PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), + pci_get_word(d->config + PCI_VENDOR_ID), + pci_get_word(d->config + PCI_DEVICE_ID), + pci_get_word(d->config + PCI_SUBSYSTEM_VENDOR_ID), + pci_get_word(d->config + PCI_SUBSYSTEM_ID)); +for (i = 0; i < PCI_NUM_REGIONS; i++) { +r = &d->io_regions[i]; +if (!r->size) { +continue; +} +monitor_printf(mon, "%*sbar %d: %s at 0x%"FMT_PCIBUS + " [0x%"FMT_PCIBUS"]\n", + indent, "", + i, r->type & PCI_BASE_ADDRESS_SPACE_IO ? "i/o" : "mem", + r->addr, r->addr + r->size - 1); +} +} diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 6711a75098..d654045fe9 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -34,7 +34,6 @@ #include "hw/qdev-properties-system.h" #include "migration/qemu-file-types.h" #include "migration/vmstate.h" -#include "monitor/monitor.h" #include "net/net.h" #include "sysemu/numa.h" #include "sysemu/sysemu.h" @@ -59,7 +58,6 @@ bool pci_available = true; -static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); static char *pcibus_get_dev_path(DeviceState *dev); static char *pcibus_get_fw_dev_path(DeviceState *dev); static void pcibus_reset(BusState *qbus); @@ -2406,42 +2404,6 @@ uint8_t pci_find_capability(PCIDevice *pdev, uint8_t cap_id) return pci_find_capability_list(pdev, cap_id, NULL); } -static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) -{ -PCIDevice *d = (PCIDevice *)dev; -int class = pci_get_word(d->config + PCI_CLASS_DEVICE); -const pci_class_desc *desc = get_class_desc(class); -char ctxt[64]; -PCIIORegion *r; -int i; - -if (desc->desc) { -snprintf(ctxt, sizeof(ctxt), "%s", desc->desc); -} else { -snprintf(ctxt, sizeof(ctxt), "Class %04x", class); -} - -monitor_printf(mon, "%*sclass %s, addr %02x:%02x.%x, " - "pci id %04x:%04x (sub %04x:%04x)\n", - indent, "", ctxt, pci_dev_bus_num(d), - PCI_SLOT(d->devfn), PCI_FUNC(d->devfn), - pci_get_word(d->config + PCI_VENDOR_ID), - pci_get_word(d->config + PCI_DEVICE_ID), - pci_get_word(d->config + PCI_SUBSYSTEM_VENDOR_ID), - pci_get_word(d->config + PCI_SUBSYSTEM_ID)); -for (i = 0; i < PCI_NUM_REGIONS; i++) { -r = &d->io_regions[i]; -if (!r->size) { -continue; -} -monitor_printf(mon, "%*sbar %d: %s at 0x%"FMT_PCIBUS - " [0x%"FMT_PCIBUS"]\n", - indent, "", - i, r->type & PCI_BASE_ADDRESS_SPACE_IO ? "i/o" : "mem", - r->addr, r->addr + r->size - 1); -} -} - static char *pci_dev_fw_name(DeviceState *dev, char *buf, int len) { PCIDevice *d = (PCIDevice *)dev; -- 2.37.3
[PATCH v2 10/13] pci: Inline do_pcie_aer_inject_error() into its only caller
Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Dr. David Alan Gilbert --- hw/pci/pci-hmp-cmds.c | 41 ++--- 1 file changed, 6 insertions(+), 35 deletions(-) diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index ae75b920aa..a9a5bbb930 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -159,21 +159,7 @@ void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) } } -typedef struct PCIEErrorDetails { -const char *id; -const char *root_bus; -int bus; -int devfn; -} PCIEErrorDetails; - -/* - * Inject an error described by @qdict. - * On success, set @details to show where error was sent. - * Return negative errno if injection failed and a message was emitted. - */ -static int do_pcie_aer_inject_error(Monitor *mon, -const QDict *qdict, -PCIEErrorDetails *details) +void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) { const char *id = qdict_get_str(qdict, "id"); const char *error_name; @@ -189,12 +175,12 @@ static int do_pcie_aer_inject_error(Monitor *mon, monitor_printf(mon, "id or pci device path is invalid or device not " "found. %s\n", id); -return ret; +return; } if (!pci_is_express(dev)) { monitor_printf(mon, "the device doesn't support pci express. %s\n", id); -return -ENOSYS; +return; } error_name = qdict_get_str(qdict, "error_status"); @@ -202,7 +188,7 @@ static int do_pcie_aer_inject_error(Monitor *mon, if (qemu_strtoui(error_name, NULL, 0, &num) < 0) { monitor_printf(mon, "invalid error status value. \"%s\"", error_name); -return -EINVAL; +return; } error_status = num; correctable = qdict_get_try_bool(qdict, "correctable", false); @@ -238,25 +224,10 @@ static int do_pcie_aer_inject_error(Monitor *mon, if (ret < 0) { monitor_printf(mon, "failed to inject error: %s\n", strerror(-ret)); -return ret; -} -details->id = id; -details->root_bus = pci_root_bus_path(dev); -details->bus = pci_dev_bus_num(dev); -details->devfn = dev->devfn; - -return 0; -} - -void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) -{ -PCIEErrorDetails data; - -if (do_pcie_aer_inject_error(mon, qdict, &data) < 0) { return; } monitor_printf(mon, "OK id: %s root bus: %s, bus: %x devfn: %x.%x\n", - data.id, data.root_bus, data.bus, - PCI_SLOT(data.devfn), PCI_FUNC(data.devfn)); + id, pci_root_bus_path(dev), pci_dev_bus_num(dev), + PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn)); } -- 2.37.3
[PATCH v2 08/13] pci: Fix silent truncation of pcie_aer_inject_error argument
PCI AER error status is 32 bit. The HMP command supports both symbolic and numeric error status: anything that isn't a known symbolic value is parsed as number with strtol(). Issues: * Empty argument yields value zero. * Range errors from strtol() are ignored, value is UINT32_MAX. * Values not representable in uint32_t are silently truncated. Fix to reject such input by switching to strtoui(). Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin --- hw/pci/pcie_aer.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c index eff62f3945..58d20816d6 100644 --- a/hw/pci/pcie_aer.c +++ b/hw/pci/pcie_aer.c @@ -30,6 +30,7 @@ #include "hw/pci/pci_bus.h" #include "hw/pci/pcie_regs.h" #include "qapi/error.h" +#include "qemu/cutils.h" //#define DEBUG_PCIE #ifdef DEBUG_PCIE @@ -963,6 +964,7 @@ static int do_pcie_aer_inject_error(Monitor *mon, const char *id = qdict_get_str(qdict, "id"); const char *error_name; uint32_t error_status; +unsigned int num; bool correctable; PCIDevice *dev; PCIEAERErr err; @@ -983,14 +985,13 @@ static int do_pcie_aer_inject_error(Monitor *mon, error_name = qdict_get_str(qdict, "error_status"); if (pcie_aer_parse_error_string(error_name, &error_status, &correctable)) { -char *e = NULL; -error_status = strtoul(error_name, &e, 0); -correctable = qdict_get_try_bool(qdict, "correctable", false); -if (!e || *e != '\0') { +if (qemu_strtoui(error_name, NULL, 0, &num) < 0) { monitor_printf(mon, "invalid error status value. \"%s\"", error_name); return -EINVAL; } +error_status = num; +correctable = qdict_get_try_bool(qdict, "correctable", false); } err.status = error_status; err.source_id = pci_requester_id(dev); -- 2.37.3
[PATCH v2 03/13] pci: Move HMP commands from monitor/ to new hw/pci/pci-hmp-cmds.c
This moves these commands from MAINTAINERS section "Human Monitor (HMP)" to "PCI". Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Dr. David Alan Gilbert --- hw/pci/pci-hmp-cmds.c | 126 ++ monitor/hmp-cmds.c| 107 --- hw/pci/meson.build| 1 + 3 files changed, 127 insertions(+), 107 deletions(-) create mode 100644 hw/pci/pci-hmp-cmds.c diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c new file mode 100644 index 00..5adfe4f57f --- /dev/null +++ b/hw/pci/pci-hmp-cmds.c @@ -0,0 +1,126 @@ +/* + * HMP commands related to PCI + * + * Copyright IBM, Corp. 2011 + * + * Authors: + * Anthony Liguori + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Contributions after 2012-01-13 are licensed under the terms of the + * GNU GPL, version 2 or (at your option) any later version. + */ + +#include "qemu/osdep.h" +#include "monitor/hmp.h" +#include "monitor/monitor.h" +#include "qapi/error.h" +#include "qapi/qapi-commands-pci.h" + +static void hmp_info_pci_device(Monitor *mon, const PciDeviceInfo *dev) +{ +PciMemoryRegionList *region; + +monitor_printf(mon, " Bus %2" PRId64 ", ", dev->bus); +monitor_printf(mon, "device %3" PRId64 ", function %" PRId64 ":\n", + dev->slot, dev->function); +monitor_printf(mon, ""); + +if (dev->class_info->has_desc) { +monitor_puts(mon, dev->class_info->desc); +} else { +monitor_printf(mon, "Class %04" PRId64, dev->class_info->q_class); +} + +monitor_printf(mon, ": PCI device %04" PRIx64 ":%04" PRIx64 "\n", + dev->id->vendor, dev->id->device); +if (dev->id->has_subsystem_vendor && dev->id->has_subsystem) { +monitor_printf(mon, " PCI subsystem %04" PRIx64 ":%04" PRIx64 "\n", + dev->id->subsystem_vendor, dev->id->subsystem); +} + +if (dev->has_irq) { +monitor_printf(mon, " IRQ %" PRId64 ", pin %c\n", + dev->irq, (char)('A' + dev->irq_pin - 1)); +} + +if (dev->has_pci_bridge) { +monitor_printf(mon, " BUS %" PRId64 ".\n", + dev->pci_bridge->bus->number); +monitor_printf(mon, " secondary bus %" PRId64 ".\n", + dev->pci_bridge->bus->secondary); +monitor_printf(mon, " subordinate bus %" PRId64 ".\n", + dev->pci_bridge->bus->subordinate); + +monitor_printf(mon, " IO range [0x%04"PRIx64", 0x%04"PRIx64"]\n", + dev->pci_bridge->bus->io_range->base, + dev->pci_bridge->bus->io_range->limit); + +monitor_printf(mon, + " memory range [0x%08"PRIx64", 0x%08"PRIx64"]\n", + dev->pci_bridge->bus->memory_range->base, + dev->pci_bridge->bus->memory_range->limit); + +monitor_printf(mon, " prefetchable memory range " + "[0x%08"PRIx64", 0x%08"PRIx64"]\n", + dev->pci_bridge->bus->prefetchable_range->base, + dev->pci_bridge->bus->prefetchable_range->limit); +} + +for (region = dev->regions; region; region = region->next) { +uint64_t addr, size; + +addr = region->value->address; +size = region->value->size; + +monitor_printf(mon, " BAR%" PRId64 ": ", region->value->bar); + +if (!strcmp(region->value->type, "io")) { +monitor_printf(mon, "I/O at 0x%04" PRIx64 +" [0x%04" PRIx64 "].\n", + addr, addr + size - 1); +} else { +monitor_printf(mon, "%d bit%s memory at 0x%08" PRIx64 + " [0x%08" PRIx64 "].\n", + region->value->mem_type_64 ? 64 : 32, + region->value->prefetch ? " prefetchable" : "", + addr, addr + size - 1); +} +} + +monitor_printf(mon, " id \"%s\"\n", dev->qdev_id); + +if (dev->has_pci_bridge) { +if (dev->pci_bridge->has_devices) { +PciDeviceInfoList *cdev; +for (cdev = dev->pci_bridge->devices; cdev; cdev = cdev->next) { +hmp_info_pci_device(mon, cdev->value); +} +} +} +} + +void hmp_info_pci(Monitor *mon, const QDict *qdict) +{ +PciInfoList *info_list, *info; +Error *err = NULL; + +info_list = qmp_query_pci(&err); +if (err) { +monitor_printf(mon, "PCI devices not supported\n"); +error_free(err); +return; +} + +for (info = info_list; info; info = info->next) { +PciDeviceInfoList *dev; + +for (dev = info->value->devices; dev; dev = dev->next) { +hmp_info
[PATCH v2 02/13] pci: Move QMP commands to new hw/pci/pci-qmp-cmds.c
Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin --- hw/pci/pci-internal.h | 20 + hw/pci/pci-qmp-cmds.c | 201 ++ hw/pci/pci.c | 188 +-- hw/pci/meson.build| 1 + 4 files changed, 226 insertions(+), 184 deletions(-) create mode 100644 hw/pci/pci-internal.h create mode 100644 hw/pci/pci-qmp-cmds.c diff --git a/hw/pci/pci-internal.h b/hw/pci/pci-internal.h new file mode 100644 index 00..4903a26cbf --- /dev/null +++ b/hw/pci/pci-internal.h @@ -0,0 +1,20 @@ +#ifndef HW_PCI_PCI_INTERNAL_H +#define HW_PCI_PCI_INTERNAL_H + +#include "qemu/queue.h" + +typedef struct { +uint16_t class; +const char *desc; +const char *fw_name; +uint16_t fw_ign_bits; +} pci_class_desc; + +typedef QLIST_HEAD(, PCIHostState) PCIHostStateList; + +extern PCIHostStateList pci_host_bridges; + +const pci_class_desc *get_class_desc(int class); +PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); + +#endif diff --git a/hw/pci/pci-qmp-cmds.c b/hw/pci/pci-qmp-cmds.c new file mode 100644 index 00..4b434dcfda --- /dev/null +++ b/hw/pci/pci-qmp-cmds.c @@ -0,0 +1,201 @@ +/* + * QMP commands related to PCI + * + * Copyright (c) 2004 Fabrice Bellard + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include "qemu/osdep.h" +#include "hw/pci/pci.h" +#include "hw/pci/pci_bridge.h" +#include "pci-internal.h" +#include "qapi/qapi-commands-pci.h" + +static PciDeviceInfoList *qmp_query_pci_devices(PCIBus *bus, int bus_num); + +static PciMemoryRegionList *qmp_query_pci_regions(const PCIDevice *dev) +{ +PciMemoryRegionList *head = NULL, **tail = &head; +int i; + +for (i = 0; i < PCI_NUM_REGIONS; i++) { +const PCIIORegion *r = &dev->io_regions[i]; +PciMemoryRegion *region; + +if (!r->size) { +continue; +} + +region = g_malloc0(sizeof(*region)); + +if (r->type & PCI_BASE_ADDRESS_SPACE_IO) { +region->type = g_strdup("io"); +} else { +region->type = g_strdup("memory"); +region->has_prefetch = true; +region->prefetch = !!(r->type & PCI_BASE_ADDRESS_MEM_PREFETCH); +region->has_mem_type_64 = true; +region->mem_type_64 = !!(r->type & PCI_BASE_ADDRESS_MEM_TYPE_64); +} + +region->bar = i; +region->address = r->addr; +region->size = r->size; + +QAPI_LIST_APPEND(tail, region); +} + +return head; +} + +static PciBridgeInfo *qmp_query_pci_bridge(PCIDevice *dev, PCIBus *bus, + int bus_num) +{ +PciBridgeInfo *info; +PciMemoryRange *range; + +info = g_new0(PciBridgeInfo, 1); + +info->bus = g_new0(PciBusInfo, 1); +info->bus->number = dev->config[PCI_PRIMARY_BUS]; +info->bus->secondary = dev->config[PCI_SECONDARY_BUS]; +info->bus->subordinate = dev->config[PCI_SUBORDINATE_BUS]; + +range = info->bus->io_range = g_new0(PciMemoryRange, 1); +range->base = pci_bridge_get_base(dev, PCI_BASE_ADDRESS_SPACE_IO); +range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_IO); + +range = info->bus->memory_range = g_new0(PciMemoryRange, 1); +range->base = pci_bridge_get_base(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); +range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_SPACE_MEMORY); + +range = info->bus->prefetchable_range = g_new0(PciMemoryRange, 1); +range->base = pci_bridge_get_base(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); +range->limit = pci_bridge_get_limit(dev, PCI_BASE_ADDRESS_MEM_PREFETCH); + +if (dev->config[PCI_SECONDARY_BUS] != 0) { +PCIBus *child_bus = pci_find_bus_nr(bus, +dev->config[PCI_SECONDARY_BUS]); +if (child_bus) { +info->has_devices = true; +info->devices = qmp_query_pci_devices(child_bus, +
[PATCH v2 12/13] pci: Improve do_pcie_aer_inject_error()'s error messages
Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Dr. David Alan Gilbert --- hw/pci/pci-hmp-cmds.c | 30 +++--- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index 2dd65ca6ee..7a3175ab4b 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -161,6 +161,7 @@ void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) { +Error *err = NULL; const char *id = qdict_get_str(qdict, "id"); const char *error_name; uint32_t error_status; @@ -171,24 +172,20 @@ void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) int ret; ret = pci_qdev_find_device(id, &dev); -if (ret < 0) { -monitor_printf(mon, - "id or pci device path is invalid or device not " - "found. %s\n", id); -return; +if (ret == -ENODEV) { +error_setg(&err, "device '%s' not found", id); +goto out; } -if (!pci_is_express(dev)) { -monitor_printf(mon, "the device doesn't support pci express. %s\n", - id); -return; +if (ret < 0 || !pci_is_express(dev)) { +error_setg(&err, "device '%s' is not a PCIe device", id); +goto out; } error_name = qdict_get_str(qdict, "error_status"); if (pcie_aer_parse_error_string(error_name, &error_status, &correctable)) { if (qemu_strtoui(error_name, NULL, 0, &num) < 0) { -monitor_printf(mon, "invalid error status value. \"%s\"", - error_name); -return; +error_setg(&err, "invalid error status value '%s'", error_name); +goto out; } error_status = num; correctable = qdict_get_try_bool(qdict, "correctable", false); @@ -222,12 +219,15 @@ void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) ret = pcie_aer_inject_error(dev, &aer_err); if (ret < 0) { -monitor_printf(mon, "failed to inject error: %s\n", - strerror(-ret)); -return; +error_setg_errno(&err, -ret, "failed to inject error"); +goto out; } + monitor_printf(mon, "OK id: %s root bus: %s, bus: %x devfn: %x.%x\n", id, pci_root_bus_path(dev), pci_dev_bus_num(dev), PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn)); + +out: +hmp_handle_error(mon, err); } -- 2.37.3
[PATCH v2 13/13] pci: Reject pcie_aer_inject_error -c with symbolic error status
When argument @error_status is symbolic, flag -c is ignored. Reject it instead. Signed-off-by: Markus Armbruster --- hw/pci/pci-hmp-cmds.c | 5 + 1 file changed, 5 insertions(+) diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index 7a3175ab4b..043b0a601d 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -189,6 +189,11 @@ void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) } error_status = num; correctable = qdict_get_try_bool(qdict, "correctable", false); +} else { +if (qdict_haskey(qdict, "correctable")) { +error_setg(&err, "-c is only valid with numeric error status"); +goto out; +} } aer_err.status = error_status; aer_err.source_id = pci_requester_id(dev); -- 2.37.3
[PATCH v2 05/13] pci: Build hw/pci/pci-hmp-cmds.c only when CONFIG_PCI
We compile pci-hmp-cmds.c always, but pci-qmp-cmds.c only when CONFIG_PCI. hw/pci/pci-stub.c keeps the linker happy when !CONFIG_PCI. Build pci-hmp-cmds.c that way, too. Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin --- hw/pci/pci-stub.c | 5 + hw/pci/meson.build | 2 +- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/hw/pci/pci-stub.c b/hw/pci/pci-stub.c index f29ecc999e..01d20a2f67 100644 --- a/hw/pci/pci-stub.c +++ b/hw/pci/pci-stub.c @@ -21,6 +21,7 @@ #include "qemu/osdep.h" #include "sysemu/sysemu.h" #include "monitor/monitor.h" +#include "monitor/hmp.h" #include "qapi/qapi-commands-pci.h" #include "hw/pci/pci.h" #include "hw/pci/msi.h" @@ -34,6 +35,10 @@ PciInfoList *qmp_query_pci(Error **errp) return NULL; } +void hmp_info_pci(Monitor *mon, const QDict *qdict) +{ +} + void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict) { monitor_printf(mon, "PCI devices not supported\n"); diff --git a/hw/pci/meson.build b/hw/pci/meson.build index e42a133f3a..4fcd888b27 100644 --- a/hw/pci/meson.build +++ b/hw/pci/meson.build @@ -5,6 +5,7 @@ pci_ss.add(files( 'pci.c', 'pci_bridge.c', 'pci_host.c', + 'pci-hmp-cmds.c', 'pci-qmp-cmds.c', 'pcie_sriov.c', 'shpc.c', @@ -20,4 +21,3 @@ softmmu_ss.add_all(when: 'CONFIG_PCI', if_true: pci_ss) softmmu_ss.add(when: 'CONFIG_PCI', if_false: files('pci-stub.c')) softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('pci-stub.c')) -softmmu_ss.add(files('pci-hmp-cmds.c')) -- 2.37.3
[PATCH v2 09/13] pci: Move HMP command from hw/pci/pcie_aer.c to pci-hmp-cmds.c
Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin Reviewed-by: Philippe Mathieu-Daudé --- hw/pci/pci-internal.h | 4 ++ include/monitor/hmp.h | 1 + include/sysemu/sysemu.h | 3 -- hw/pci/pci-hmp-cmds.c | 104 hw/pci/pci-stub.c | 1 - hw/pci/pcie_aer.c | 114 ++-- 6 files changed, 113 insertions(+), 114 deletions(-) diff --git a/hw/pci/pci-internal.h b/hw/pci/pci-internal.h index 3199615e50..2ea356bdf5 100644 --- a/hw/pci/pci-internal.h +++ b/hw/pci/pci-internal.h @@ -18,4 +18,8 @@ const pci_class_desc *get_class_desc(int class); PCIBus *pci_find_bus_nr(PCIBus *bus, int bus_num); void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); +int pcie_aer_parse_error_string(const char *error_name, +uint32_t *status, bool *correctable); +int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err); + #endif diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h index dfbc0c9a2f..27f86399f7 100644 --- a/include/monitor/hmp.h +++ b/include/monitor/hmp.h @@ -143,5 +143,6 @@ void hmp_info_vcpu_dirty_limit(Monitor *mon, const QDict *qdict); void hmp_human_readable_text_helper(Monitor *mon, HumanReadableText *(*qmp_handler)(Error **)); void hmp_info_stats(Monitor *mon, const QDict *qdict); +void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict); #endif diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h index 6a7a31e64d..25be2a692e 100644 --- a/include/sysemu/sysemu.h +++ b/include/sysemu/sysemu.h @@ -61,9 +61,6 @@ extern int nb_option_roms; extern const char *prom_envs[MAX_PROM_ENVS]; extern unsigned int nb_prom_envs; -/* pcie aer error injection */ -void hmp_pcie_aer_inject_error(Monitor *mon, const QDict *qdict); - /* serial ports */ /* Return the Chardev for serial port i, or NULL if none */ diff --git a/hw/pci/pci-hmp-cmds.c b/hw/pci/pci-hmp-cmds.c index 417f1ca607..ae75b920aa 100644 --- a/hw/pci/pci-hmp-cmds.c +++ b/hw/pci/pci-hmp-cmds.c @@ -19,7 +19,9 @@ #include "monitor/monitor.h" #include "pci-internal.h" #include "qapi/error.h" +#include "qapi/qmp/qdict.h" #include "qapi/qapi-commands-pci.h" +#include "qemu/cutils.h" static void hmp_info_pci_device(Monitor *mon, const PciDeviceInfo *dev) { @@ -156,3 +158,105 @@ void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) r->addr, r->addr + r->size - 1); } } + +typedef struct PCIEErrorDetails { +const char *id; +const char *root_bus; +int bus; +int devfn; +} PCIEErrorDetails; + +/* + * Inject an error described by @qdict. + * On success, set @details to show where error was sent. + * Return negative errno if injection failed and a message was emitted. + */ +static int do_pcie_aer_inject_error(Monitor *mon, +const QDict *qdict, +PCIEErrorDetails *details) +{ +const char *id = qdict_get_str(qdict, "id"); +const char *error_name; +uint32_t error_status; +unsigned int num; +bool correctable; +PCIDevice *dev; +PCIEAERErr err; +int ret; + +ret = pci_qdev_find_device(id, &dev); +if (ret < 0) { +monitor_printf(mon, + "id or pci device path is invalid or device not " + "found. %s\n", id); +return ret; +} +if (!pci_is_express(dev)) { +monitor_printf(mon, "the device doesn't support pci express. %s\n", + id); +return -ENOSYS; +} + +error_name = qdict_get_str(qdict, "error_status"); +if (pcie_aer_parse_error_string(error_name, &error_status, &correctable)) { +if (qemu_strtoui(error_name, NULL, 0, &num) < 0) { +monitor_printf(mon, "invalid error status value. \"%s\"", + error_name); +return -EINVAL; +} +error_status = num; +correctable = qdict_get_try_bool(qdict, "correctable", false); +} +err.status = error_status; +err.source_id = pci_requester_id(dev); + +err.flags = 0; +if (correctable) { +err.flags |= PCIE_AER_ERR_IS_CORRECTABLE; +} +if (qdict_get_try_bool(qdict, "advisory_non_fatal", false)) { +err.flags |= PCIE_AER_ERR_MAYBE_ADVISORY; +} +if (qdict_haskey(qdict, "header0")) { +err.flags |= PCIE_AER_ERR_HEADER_VALID; +} +if (qdict_haskey(qdict, "prefix0")) { +err.flags |= PCIE_AER_ERR_TLP_PREFIX_PRESENT; +} + +err.header[0] = qdict_get_try_int(qdict, "header0", 0); +err.header[1] = qdict_get_try_int(qdict, "header1", 0); +err.header[2] = qdict_get_try_int(qdict, "header2", 0); +err.header[3] = qdict_get_try_int(qdict, "header3", 0); + +err.prefix[0] = qdict_get_try_int(qdict, "prefix0", 0); +err.prefix[1] = qdict_get_try_int(qdict, "prefix1", 0); +err.p
[PATCH v2 06/13] pci: Deduplicate get_class_desc()
pcibus_dev_print() contains a copy of get_class_desc(). Call the function instead. Signed-off-by: Markus Armbruster Reviewed-by: Michael S. Tsirkin --- hw/pci/pci.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 81ffc74925..6711a75098 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2409,15 +2409,12 @@ uint8_t pci_find_capability(PCIDevice *pdev, uint8_t cap_id) static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) { PCIDevice *d = (PCIDevice *)dev; -const pci_class_desc *desc; +int class = pci_get_word(d->config + PCI_CLASS_DEVICE); +const pci_class_desc *desc = get_class_desc(class); char ctxt[64]; PCIIORegion *r; -int i, class; +int i; -class = pci_get_word(d->config + PCI_CLASS_DEVICE); -desc = pci_class_descriptions; -while (desc->desc && class != desc->class) -desc++; if (desc->desc) { snprintf(ctxt, sizeof(ctxt), "%s", desc->desc); } else { -- 2.37.3
Re: [PATCH 3/3] ui: remove deprecated 'password' option for SPICE
Daniel P. Berrangé writes: > This has been replaced by the 'password-secret' option, > which references a 'secret' object instance. > > Signed-off-by: Daniel P. Berrangé > --- > docs/about/deprecated.rst | 8 > docs/about/removed-features.rst | 7 +++ > qemu-options.hx | 9 + > ui/spice-core.c | 15 --- > 4 files changed, 8 insertions(+), 31 deletions(-) > > diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst > index 2cc8924fe9..ee4301f96d 100644 > --- a/docs/about/deprecated.rst > +++ b/docs/about/deprecated.rst > @@ -73,14 +73,6 @@ Input parameters that take a size value should only use a > size suffix > the value is hexadecimal. That is, '0x20M' is deprecated, and should > be written either as '32M' or as '0x200'. > > -``-spice password=string`` (since 6.0) > -'' > - > -This option is insecure because the SPICE password remains visible in > -the process listing. This is replaced by the new ``password-secret`` > -option which lets the password be securely provided on the command > -line using a ``secret`` object instance. > - > ``-smp`` ("parameter=0" SMP configurations) (since 6.2) > ''' > > diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst > index 63df9848fd..e04e095320 100644 > --- a/docs/about/removed-features.rst > +++ b/docs/about/removed-features.rst > @@ -408,6 +408,13 @@ pcspk-audiodev=``. > > Use ``-device`` instead. > > +``-spice password=string`` (removed in 8.0) > +''' > + > +This option is insecure because the SPICE password remains visible in Nitpicking... since the option doesn't exist anymore, it can't *be* insecure. It sure *was* insecure. > +the process listing. This is replaced by the new ``password-secret`` > +option which lets the password be securely provided on the command > +line using a ``secret`` object instance. > > QEMU Machine Protocol (QMP) commands > [...] Reviewed-by: Markus Armbruster
Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices
"Michael S. Tsirkin" writes: > On Thu, Dec 01, 2022 at 10:14:39AM +, Alex Bennée wrote: >> Do you think rust-vmm's vhost crates have enough of the state >> management to manage vhost and vhost-user backends? Maybe it would be a >> good experiment in replacing a (small well defined) piece of >> functionality with rust? >> >> That said there is a lot of deep magic in the vhost-net stuff which I >> think is down to the interaction with things like vdpk and other network >> optimisations that might be missing. For the rest of the devices most of >> the code is basically boiler plate which has grown variations due to >> code motion and change. This is the sort of thing that generics solves >> well. > > Not sure what you want to replace with what though, libvhost-user or > vhost-user bits in qemu? The vhost-user bits in the main QEMU binary. We already don't use libvhost-user for most of our backends anyway ;-) -- Alex Bennée
Re: [PATCH 2/3] block: deprecate iSCSI 'password' in favour of 'password-secret'
Daniel P. Berrangé writes: > Support for referencing secret objects was added in > > commit b189346eb1784df95ed6fed610411dbf23d19e1f > Author: Daniel P. Berrangé > Date: Thu Jan 21 14:19:21 2016 + > > iscsi: add support for getting CHAP password via QCryptoSecret API > > The existing 'password' option is overdue for deprecation and > subsequent removal. > > Signed-off-by: Daniel P. Berrangé > --- > block/iscsi.c | 3 +++ > docs/about/deprecated.rst | 11 +++ > 2 files changed, 14 insertions(+) > > diff --git a/block/iscsi.c b/block/iscsi.c > index a316d46d96..58c0623052 100644 > --- a/block/iscsi.c > +++ b/block/iscsi.c > @@ -1352,6 +1352,9 @@ static void apply_chap(struct iscsi_context *iscsi, > QemuOpts *opts, > } else if (!password) { > error_setg(errp, "CHAP username specified but no password was > given"); > return; > +} else { > +warn_report("iSCSI block driver 'password' option is deprecated, " > +"use 'password-secret' instead"); > } > > if (iscsi_set_initiator_username_pwd(iscsi, user, password)) { > diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst > index 93affe3669..2cc8924fe9 100644 > --- a/docs/about/deprecated.rst > +++ b/docs/about/deprecated.rst > @@ -267,6 +267,17 @@ Options are: > - move backing file to NVDIMM storage and keep ``pmem=on`` >(to have NVDIMM with persistence guaranties). > > +Block driver options > + I'm not sure about this headline. For what it's worth, -help shows -iscsi under "Block device options". > + > +``iscsi,password=xxx`` (since 8.0) > +'' > + > +Specifying the iSCSI password in plain text on the command line using the > +``password`` option is insecure. The ``password-secret`` option should be > +used instead, to refer to a ``--object secret...`` instance that provides > +a password via a file, or encrypted. > + > Device options > --
Re: [PATCH-for-8.0 1/2] target/s390x: Replace TCGv by TCGv_i64 in op_mov2e()
On Wed, 2022-11-30 at 17:34 +0100, Philippe Mathieu-Daudé wrote: > Although TCGv is defined as TCGv_i64 on s390x, > make it clear tcg_temp_new_i64() returns a TCGv_i64. > > Signed-off-by: Philippe Mathieu-Daudé > --- > target/s390x/tcg/translate.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/s390x/tcg/translate.c > b/target/s390x/tcg/translate.c > index 1e599ac259..a77039b863 100644 > --- a/target/s390x/tcg/translate.c > +++ b/target/s390x/tcg/translate.c > @@ -3335,7 +3335,7 @@ static DisasJumpType op_mov2(DisasContext *s, > DisasOps *o) > static DisasJumpType op_mov2e(DisasContext *s, DisasOps *o) > { > int b2 = get_field(s, b2); > - TCGv ar1 = tcg_temp_new_i64(); > + TCGv_i64 ar1 = tcg_temp_new_i64(); > > o->out = o->in2; > o->g_out = o->g_in2; Reviewed-by: Ilya Leoshkevich It looks as if besides sparc and s390x there is one occurrence of this in alpha? $ git grep -w TCGv | grep -w tcg_temp_new_i64 target/alpha/translate.c:TCGv tmp = tcg_temp_new_i64();
Re: [PATCH 3/9] ui: Drop disabled code for SPICE_CHANNEL_WEBDAV
Daniel P. Berrangé writes: > On Thu, Dec 01, 2022 at 07:13:05AM +0100, Markus Armbruster wrote: >> HMP "info spice" has a bit of code to show channel type >> SPICE_CHANNEL_WEBDAV as "webdav", disabled since commit 7c6044a94e >> "hmp: info spice: take out webdav" (v2.3.0), because it compiles only >> with Spice versions 0.12.7 and later. Our minimum version is 0.12.5. > > Last version bump was 4 years ago > > commit 1b63665c2c0e0d52735e0dd5217f109fe0dd2322 > Author: Marc-André Lureau > Date: Wed Nov 28 19:59:32 2018 +0400 > > configure: bump spice-server required version to 0.12.5 > > ...snip > > According to repology, all the distros that are build target platforms > for QEMU include it: > > RHEL-7: 0.14.0 > Debian (Stretch): 0.12.8 > Debian (Jessie): 0.12.5 > FreeBSD (ports): 0.14.0 > OpenSUSE Leap 15: 0.14.0 > Ubuntu (Xenial): 0.12.6 > > We moved on from Debian and RHEL since then > >Debian 11: 0.14.3 >RHEL-8: 0.14.2 >FreeBSD (ports): 0.14.4 >Fedora 35: 0.14.0 >Ubuntu 20.04: 0.14.0 >OpenSUSE Leap 15.3: 0.14.3 > > IOW, we can bump to 0.14.0, and then revert the > webdav conditional commit. We need to bump spice-protocol, actually. Would you like me to bump spice-server as well? To which version? [...]
Re: [PATCH] MAINTAINERS: Inherit from nanoMIPS
On 1/12/22 11:24, Markus Armbruster wrote: Milica Lazarevic writes: Hi, Stefan is no longer working with us, but I will be more than happy to take maintaining the nanoMIPS ISA on me! Regards, Milica Any comments on this? Suggest you post a patch to update MAINTAINERS, with a suitable cc:. nanoMIPS depends on the generic MIPS infra, reviewing overall MIPS patches also helps.
Re: regression: insmod module failed in VM with nvdimm on
On Thu, 1 Dec 2022 at 13:07, chenxiang (M) wrote: > > > > 在 2022/12/1 19:07, Ard Biesheuvel 写道: > > On Thu, 1 Dec 2022 at 09:07, Ard Biesheuvel wrote: > >> On Thu, 1 Dec 2022 at 08:15, chenxiang (M) > >> wrote: > >>> Hi Ard, > >>> > >>> > >>> 在 2022/11/30 16:18, Ard Biesheuvel 写道: > On Wed, 30 Nov 2022 at 08:53, Marc Zyngier wrote: > > On Wed, 30 Nov 2022 02:52:35 +, > > "chenxiang (M)" wrote: > >> Hi, > >> > >> We boot the VM using following commands (with nvdimm on) (qemu > >> version 6.1.50, kernel 6.0-r4): > > How relevant is the presence of the nvdimm? Do you observe the failure > > without this? > > > >> qemu-system-aarch64 -machine > >> virt,kernel_irqchip=on,gic-version=3,nvdimm=on -kernel > >> /home/kernel/Image -initrd /home/mini-rootfs/rootfs.cpio.gz -bios > >> /root/QEMU_EFI.FD -cpu host -enable-kvm -net none -nographic -m > >> 2G,maxmem=64G,slots=3 -smp 4 -append 'rdinit=init console=ttyAMA0 > >> ealycon=pl0ll,0x9000 pcie_ports=native pciehp.pciehp_debug=1' > >> -object memory-backend-ram,id=ram1,size=10G -device > >> nvdimm,id=dimm1,memdev=ram1 -device ioh3420,id=root_port1,chassis=1 > >> -device vfio-pci,host=7d:01.0,id=net0,bus=root_port1 > >> > >> Then in VM we insmod a module, vmalloc error occurs as follows (kernel > >> 5.19-rc4 is normal, and the issue is still on kernel 6.1-rc4): > >> > >> estuary:/$ insmod /lib/modules/$(uname -r)/hnae3.ko > >> [8.186563] vmap allocation for size 20480 failed: use > >> vmalloc= to increase size > > Have you tried increasing the vmalloc size to check that this is > > indeed the problem? > > > > [...] > > > >> We git bisect the code, and find the patch c5a89f75d2a ("arm64: kaslr: > >> defer initialization to initcall where permitted"). > > I guess you mean commit fc5a89f75d2a instead, right? > > > >> Do you have any idea about the issue? > > I sort of suspect that the nvdimm gets vmap-ed and consumes a large > > portion of the vmalloc space, but you give very little information > > that could help here... > > > Ouch. I suspect what's going on here: that patch defers the > randomization of the module region, so that we can decouple it from > the very early init code. > > Obviously, it is happening too late now, and the randomized module > region is overlapping with a vmalloc region that is in use by the time > the randomization occurs. > > Does the below fix the issue? > >>> The issue still occurs, but it seems decrease the probability, before it > >>> occured almost every time, after the change, i tried 2-3 times, and it > >>> occurs. > >>> But i change back "subsys_initcall" to "core_initcall", and i test more > >>> than 20 times, and it is still ok. > >>> > >> Thank you for confirming. I will send out a patch today. > >> > > ...but before I do that, could you please check whether the change > > below fixes your issue as well? > > Yes, but i can only reply to you tomorrow as other guy is testing on the > only environment today. > That is fine, thanks.
Re: [PATCH 1/3] block: mention 'password-secret' option for -iscsi
Daniel P. Berrangé writes: > The 'password-secret' option was added > > commit b189346eb1784df95ed6fed610411dbf23d19e1f > Author: Daniel P. Berrangé > Date: Thu Jan 21 14:19:21 2016 + > > iscsi: add support for getting CHAP password via QCryptoSecret API > > but was not mentioned in the command line docs > > Signed-off-by: Daniel P. Berrangé > --- > qemu-options.hx | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 7f99d15b23..055df73306 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -1889,7 +1889,7 @@ SRST > ERST > > DEF("iscsi", HAS_ARG, QEMU_OPTION_iscsi, > -"-iscsi [user=user][,password=password]\n" > +"-iscsi [user=user][,password=password],password-secret=secret-id]\n" Loos like you're missing a bracket before the comma. The line below also lacks one at the end. > " [,header-digest=CRC32C|CR32C-NONE|NONE-CRC32C|NONE\n" > " [,initiator-name=initiator-iqn][,id=target-iqn]\n" > " [,timeout=timeout]\n"
Re: [PATCH v12 0/7] s390x: CPU Topology
On 12/1/22 09:45, Cédric Le Goater wrote: Hello Pierre On 11/29/22 18:41, Pierre Morel wrote: Hi, The implementation of the CPU Topology in QEMU has been modified since the last patch series. - The two preliminary patches have been accepted and are no longer part of this series. - The topology machine property has been abandoned - the topology_capable QEMU capability has been abandoned - both where replaced with a new CPU feature, topology-disable to fence per default the ctop topology information feature. To use the QEMU patches, you will need Linux V6-rc1 or newer, or use the following Linux mainline patches: f5ecfee94493 2022-07-20 KVM: s390: resetting the Topology-Change-Report 24fe0195bc19 2022-07-20 KVM: s390: guest support for topology function 0130337ec45b 2022-07-20 KVM: s390: Cleanup ipte lock access and SIIF fac.. Currently this code is for KVM only, I have no idea if it is interesting to provide a TCG patch. If ever it will be done in another series. To have a better understanding of the S390x CPU Topology and its implementation in QEMU you can have a look at the documentation in the last patch of this series. The admin will want to match the host and the guest topology, taking into account that the guest does not recognize multithreading. Consequently, two vCPU assigned to threads of the same real CPU should preferably be assigned to the same socket of the guest machine. Hello Cedric, Please make sure the patchset compiles on non-s390x platforms and check Yes, that the documentation generates correctly. You will need to install : python3-sphinx python3-sphinx_rtd_theme 'configure' should then enable doc generation. Yes, thanks, Pierre Thanks, C. -- Pierre Morel IBM Lab Boeblingen
Re: cortex-a55/a75/a76 MPIDR_EL1 specifica
On Thu, 1 Dec 2022 at 10:55, Timofey Kutergin wrote: > After submitting a patch for cortex-a55 support and trying > to run SMP on odroid-c4 I have found concern about mpidr_el1 register. > > cortex-a55 (and also a75/76/78) seem to be considered SMT CPUs > with single SMT thread. As a result, they publish core numbers > in AFF1 rather than in AFF0 and AFF0 is always zero. In the > cortex-a55 TRM: > > Aff1, [11:8] Part of Affinity level 1. Third highest level affinity field. > CPUID. Identification number for each CPU in the Cortex-A55 cluster: > 0x0 MP1: CPUID: 0. to > 0x7 MP8: CPUID: 7. > Aff0, [7:0] > Affinity level 0. The level identifies individual threads within a > multi-threaded core. The > Cortex-A55 core is single-threaded, so this field has the value 0x00. > > Plus MT (bit 24) is set to 1. > > But it seems that qemu does not take it into account. Is this > intentional? Does it make sense to change this behaviour to > something closer to real CPU ? I don't think it's intentional, it's just that when we wrote the code to handle the MPIDR all the CPUs we emulated were MT==0 and put their CPU number in the Aff0 field, so that's all that the mpidr_read_val() function handles. Of the CPUs we currently emulate, these ones set MT==1: * Cortex-A55 * Cortex-A76 * Neoverse-N1 (I don't know about the a64fx but would guess it is MT==0.) Since we already have a couple of CPU models which we don't get this right for, I don't think this needs to block the A55 support, but it would I think be nice to implement it correctly. I guess I would do this via a bool ARMCPU::cpuid_in_aff1 similar to the existing ARMCPU::mp_is_up, and making mpidr_read_val() act appropriately. The awkward part is that this would mean that across migration from an older QEMU version the guest would see the MPIDR value change, which is probably bad. So I guess we need a CPU property to suppress that and to set that in the hw_compat_7_2 array (which will appear when the 8.0 board types get added after 7.2 releases). Did you see any guest-visible behaviour issues because of this value being wrong ? thanks -- PMM
[PATCH] .gitlab/issue_templates: Move suggestions into comments
Many users forget to remove the suggestions from the bug template when creating a new issue. So when searching for strings like "s390x" or "Windows", you get a lot of unrelated issues in the results. Thus let's move the suggestions into HTML comments - so they will still show up in the markdown when editing the bug, while being hidden/ignored in the final text or in the search queries. Signed-off-by: Thomas Huth --- .gitlab/issue_templates/bug.md | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/.gitlab/issue_templates/bug.md b/.gitlab/issue_templates/bug.md index e910f7b1c2..53a79f5828 100644 --- a/.gitlab/issue_templates/bug.md +++ b/.gitlab/issue_templates/bug.md @@ -18,11 +18,11 @@ https://www.qemu.org/contribute/security-process/ --> ## Host environment - - Operating system: (Windows 10 21H1, Fedora 34, etc.) - - OS/kernel version: (For POSIX hosts, use `uname -a`) - - Architecture: (x86, ARM, s390x, etc.) - - QEMU flavor: (qemu-system-x86_64, qemu-aarch64, qemu-img, etc.) - - QEMU version: (e.g. `qemu-system-x86_64 --version`) + - Operating system: + - OS/kernel version: + - Architecture: + - QEMU flavor: + - QEMU version: - QEMU command line: + - OS/kernel version: + - Architecture: ## Description of problem -- 2.31.1
[PATCH] blockdev: add 'media=cdrom' argument to support usb cdrom emulated as cdrom
From: zhipeng Lu The drive interface supports media=cdrom so that the usb cdrom can be emulated as cdrom in qemu, but libvirt deprived the drive interface, so media=cdrom is added to the blockdev interface to support usb cdrom emulated as cdrom Signed-off-by: zhipeng Lu --- block.c | 11 ++- block/block-backend.c | 18 ++ hw/core/qdev-properties-system.c| 3 +++ hw/scsi/scsi-bus.c | 4 +++- include/block/block_int-common.h| 1 + include/sysemu/block-backend-global-state.h | 2 ++ 6 files changed, 37 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index a18f052374..d194e3039e 100644 --- a/block.c +++ b/block.c @@ -3787,6 +3787,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename, Error *local_err = NULL; QDict *snapshot_options = NULL; int snapshot_flags = 0; +int media = 0; assert(!child_class || !flags); assert(!child_class == !parent); @@ -3906,6 +3907,12 @@ static BlockDriverState *bdrv_open_inherit(const char *filename, qdict_del(bs->options, "backing"); qdict_del(options, "backing"); } +if (!g_strcmp0(qdict_get_try_str(options, "media"), "cdrom")) { +media = 1; +qdict_del(bs->explicit_options, "media"); +qdict_del(bs->options, "media"); +qdict_del(options, "media"); +} /* Open image file without format layer. This BlockBackend is only used for * probing, the block drivers will do their own bdrv_open_child() for the @@ -4033,7 +4040,9 @@ static BlockDriverState *bdrv_open_inherit(const char *filename, bdrv_unref(bs); bs = snapshot_bs; } - +if (bs && media) { +bs->media_cd = true; +} return bs; fail: diff --git a/block/block-backend.c b/block/block-backend.c index d98a96ff37..1760079a67 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -72,6 +72,7 @@ struct BlockBackend { uint64_t perm; uint64_t shared_perm; bool disable_perm; +bool media_cd; bool allow_aio_context_change; bool allow_write_beyond_eof; @@ -2634,3 +2635,20 @@ int blk_make_empty(BlockBackend *blk, Error **errp) return bdrv_make_empty(blk->root, errp); } + +bool blk_is_cdrom(BlockBackend *blk) +{ +if (!blk) { +return false; +} +return blk->media_cd; + +} + +void blk_set_cdrom(BlockBackend *blk) +{ +if (!blk) { +return ; +} +blk->media_cd = true; +} diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c index a91f60567a..99df29ccb8 100644 --- a/hw/core/qdev-properties-system.c +++ b/hw/core/qdev-properties-system.c @@ -149,6 +149,9 @@ static void set_drive_helper(Object *obj, Visitor *v, const char *name, if (ret < 0) { goto fail; } +if (bs->media_cd) { +blk_set_cdrom(blk); +} } } if (!blk) { diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index ceceafb2cd..bc2bbdc823 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -321,7 +321,9 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, BlockBackend *blk, driver = "scsi-generic"; } else { dinfo = blk_legacy_dinfo(blk); -if (dinfo && dinfo->media_cd) { +if ((dinfo && dinfo->media_cd)) { +driver = "scsi-cd"; +} else if (blk_is_cdrom(blk)) { driver = "scsi-cd"; } else { driver = "scsi-hd"; diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h index 31ae91e56e..04a814ab1b 100644 --- a/include/block/block_int-common.h +++ b/include/block/block_int-common.h @@ -1023,6 +1023,7 @@ struct BlockDriverState { bool probed;/* if true, format was probed rather than specified */ bool force_share; /* if true, always allow all shared permissions */ bool implicit; /* if true, this filter node was automatically inserted */ +bool media_cd; BlockDriver *drv; /* NULL means no media */ void *opaque; diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h index 6858e39cb6..108dfad283 100644 --- a/include/sysemu/block-backend-global-state.h +++ b/include/sysemu/block-backend-global-state.h @@ -71,6 +71,8 @@ void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error, BlockdevOnError on_write_error); bool blk_supports_write_perm(BlockBackend *blk); bool blk_is_sg(BlockBackend *blk); +bool blk_is_cdrom(BlockBackend *blk); +void blk_set_cdrom(BlockBackend *blk); void blk_set_enable_write_cache(BlockBackend *blk, bool wce); int blk_get_flags(BlockBackend *blk); bool blk_op_is_blocked(BlockBackend *blk, BlockOpType op, Error **errp); -- 2.31.1
Re: cortex-a55/a75/a76 MPIDR_EL1 specifica
More like the opposite - proprietary software which does not yet have proper support for MT CPUs worked fine on QEMU but failed on hardware - so discrepancy was found. This does not block anything right now, the question is more how closely qemu should follow hardware which it emulates... BR Timofey On Thu, Dec 1, 2022 at 4:38 PM Peter Maydell wrote: > On Thu, 1 Dec 2022 at 10:55, Timofey Kutergin wrote: > > After submitting a patch for cortex-a55 support and trying > > to run SMP on odroid-c4 I have found concern about mpidr_el1 register. > > > > cortex-a55 (and also a75/76/78) seem to be considered SMT CPUs > > with single SMT thread. As a result, they publish core numbers > > in AFF1 rather than in AFF0 and AFF0 is always zero. In the > > cortex-a55 TRM: > > > > Aff1, [11:8] Part of Affinity level 1. Third highest level affinity > field. > > CPUID. Identification number for each CPU in the Cortex-A55 cluster: > > 0x0 MP1: CPUID: 0. to > > 0x7 MP8: CPUID: 7. > > Aff0, [7:0] > > Affinity level 0. The level identifies individual threads within a > multi-threaded core. The > > Cortex-A55 core is single-threaded, so this field has the value 0x00. > > > > Plus MT (bit 24) is set to 1. > > > > But it seems that qemu does not take it into account. Is this > > intentional? Does it make sense to change this behaviour to > > something closer to real CPU ? > > I don't think it's intentional, it's just that when we wrote > the code to handle the MPIDR all the CPUs we emulated were > MT==0 and put their CPU number in the Aff0 field, so that's all > that the mpidr_read_val() function handles. > > Of the CPUs we currently emulate, these ones set MT==1: > * Cortex-A55 > * Cortex-A76 > * Neoverse-N1 > (I don't know about the a64fx but would guess it is MT==0.) > > Since we already have a couple of CPU models which we don't > get this right for, I don't think this needs to block the A55 > support, but it would I think be nice to implement it correctly. > I guess I would do this via a bool ARMCPU::cpuid_in_aff1 similar to > the existing ARMCPU::mp_is_up, and making mpidr_read_val() > act appropriately. The awkward part is that this would mean > that across migration from an older QEMU version the guest > would see the MPIDR value change, which is probably bad. > So I guess we need a CPU property to suppress that and to > set that in the hw_compat_7_2 array (which will appear when > the 8.0 board types get added after 7.2 releases). > > Did you see any guest-visible behaviour issues because of this > value being wrong ? > > thanks > -- PMM >
Re: [PATCH v7 00/14] Still more coroutine and various fixes in block layer
Am 28.11.2022 um 15:23 hat Emanuele Giuseppe Esposito geschrieben: > This is a dump of all minor coroutine-related fixes found while looking > around and testing various things in the QEMU block layer. > > Patches aim to: > - add missing coroutine_fn annotation to the functions > - simplify to avoid the typical "if in coroutine: fn() > // else create_coroutine(fn)" already present in generated_co_wraper > functions. > - make sure that if a BlockDriver callback is defined as coroutine_fn, then > it is always running in a coroutine. > > This serie is based on Kevin Wolf's series "block: Simplify drain". > > Based-on: <20221108123738.530873-1-kw...@redhat.com> > > Emanuele Thanks, applied to block-next. Kevin
[PATCH 15/15] hw/intc: sifive_plic: Fix the pending register range check
The pending register upper limit is currently set to plic->num_sources >> 3, which is wrong, e.g.: considering plic->num_sources is 7, the upper limit becomes 0 which fails the range check if reading the pending register at pending_base. Fixes: 1e24429e40df ("SiFive RISC-V PLIC Block") Signed-off-by: Bin Meng --- hw/intc/sifive_plic.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c index 7a6a358c57..a3fc8222c7 100644 --- a/hw/intc/sifive_plic.c +++ b/hw/intc/sifive_plic.c @@ -143,7 +143,8 @@ static uint64_t sifive_plic_read(void *opaque, hwaddr addr, unsigned size) uint32_t irq = (addr - plic->priority_base) >> 2; return plic->source_priority[irq]; -} else if (addr_between(addr, plic->pending_base, plic->num_sources >> 3)) { +} else if (addr_between(addr, plic->pending_base, +(plic->num_sources + 31) >> 3)) { uint32_t word = (addr - plic->pending_base) >> 2; return plic->pending[word]; @@ -202,7 +203,7 @@ static void sifive_plic_write(void *opaque, hwaddr addr, uint64_t value, sifive_plic_update(plic); } } else if (addr_between(addr, plic->pending_base, -plic->num_sources >> 3)) { +(plic->num_sources + 31) >> 3)) { qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid pending write: 0x%" HWADDR_PRIx "", __func__, addr); -- 2.34.1
[PATCH 14/15] hw/riscv: opentitan: Drop "hartid-base" and "priority-base" initialization
"hartid-base" and "priority-base" are zero by default. There is no need to initialize them to zero again. Signed-off-by: Bin Meng --- hw/riscv/opentitan.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/hw/riscv/opentitan.c b/hw/riscv/opentitan.c index be7ff1eea0..da73aa51f5 100644 --- a/hw/riscv/opentitan.c +++ b/hw/riscv/opentitan.c @@ -165,10 +165,8 @@ static void lowrisc_ibex_soc_realize(DeviceState *dev_soc, Error **errp) /* PLIC */ qdev_prop_set_string(DEVICE(&s->plic), "hart-config", "M"); -qdev_prop_set_uint32(DEVICE(&s->plic), "hartid-base", 0); qdev_prop_set_uint32(DEVICE(&s->plic), "num-sources", 180); qdev_prop_set_uint32(DEVICE(&s->plic), "num-priorities", 3); -qdev_prop_set_uint32(DEVICE(&s->plic), "priority-base", 0x00); qdev_prop_set_uint32(DEVICE(&s->plic), "pending-base", 0x1000); qdev_prop_set_uint32(DEVICE(&s->plic), "enable-base", 0x2000); qdev_prop_set_uint32(DEVICE(&s->plic), "enable-stride", 32); -- 2.34.1
[PATCH 09/15] hw/riscv: microchip_pfsoc: Fix the number of interrupt sources of PLIC
Per chapter 6.5.2 in [1], the number of interupt sources including interrupt source 0 should be 187. [1] PolarFire SoC MSS TRM: https://ww1.microchip.com/downloads/aemDocuments/documents/FPGA/ProductDocuments/ReferenceManuals/PolarFire_SoC_FPGA_MSS_Technical_Reference_Manual_VC.pdf Fixes: 56f6e31e7b7e ("hw/riscv: Initial support for Microchip PolarFire SoC Icicle Kit board") Signed-off-by: Bin Meng --- include/hw/riscv/microchip_pfsoc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/hw/riscv/microchip_pfsoc.h b/include/hw/riscv/microchip_pfsoc.h index a757b240e0..9720bac2d5 100644 --- a/include/hw/riscv/microchip_pfsoc.h +++ b/include/hw/riscv/microchip_pfsoc.h @@ -150,7 +150,7 @@ enum { #define MICROCHIP_PFSOC_MANAGEMENT_CPU_COUNT1 #define MICROCHIP_PFSOC_COMPUTE_CPU_COUNT 4 -#define MICROCHIP_PFSOC_PLIC_NUM_SOURCES185 +#define MICROCHIP_PFSOC_PLIC_NUM_SOURCES187 #define MICROCHIP_PFSOC_PLIC_NUM_PRIORITIES 7 #define MICROCHIP_PFSOC_PLIC_PRIORITY_BASE 0x04 #define MICROCHIP_PFSOC_PLIC_PENDING_BASE 0x1000 -- 2.34.1
[PATCH 08/15] hw/intc: sifive_plic: Update "num-sources" property default value
At present the default value of "num-sources" property is zero, which does not make a lot of sense, as in sifive_plic_realize() we see s->bitfield_words is calculated by: s->bitfield_words = (s->num_sources + 31) >> 5; if the we don't configure "num-sources" property its default value zero makes s->bitfield_words zero too, which isn't true because interrupt source 0 still occupies one word. Let's change the default value to 1 meaning that only interrupt source 0 is supported by default and a sanity check in realize(). While we are here, add a comment to describe the exact meaning of this property that the number should include interrupt source 0. A wrong multi-line comment format is corrected too. Signed-off-by: Bin Meng --- hw/intc/sifive_plic.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c index 5fd9a53569..2bd292410d 100644 --- a/hw/intc/sifive_plic.c +++ b/hw/intc/sifive_plic.c @@ -363,6 +363,11 @@ static void sifive_plic_realize(DeviceState *dev, Error **errp) parse_hart_config(s); +if (!s->num_sources) { +error_report("plic: invalid number of interrupt sources"); +exit(1); +} + s->bitfield_words = (s->num_sources + 31) >> 5; s->num_enables = s->bitfield_words * s->num_addrs; s->source_priority = g_new0(uint32_t, s->num_sources); @@ -379,7 +384,8 @@ static void sifive_plic_realize(DeviceState *dev, Error **errp) s->m_external_irqs = g_malloc(sizeof(qemu_irq) * s->num_harts); qdev_init_gpio_out(dev, s->m_external_irqs, s->num_harts); -/* We can't allow the supervisor to control SEIP as this would allow the +/* + * We can't allow the supervisor to control SEIP as this would allow the * supervisor to clear a pending external interrupt which will result in * lost a interrupt in the case a PLIC is attached. The SEIP bit must be * hardware controlled when a PLIC is attached. @@ -419,7 +425,8 @@ static const VMStateDescription vmstate_sifive_plic = { static Property sifive_plic_properties[] = { DEFINE_PROP_STRING("hart-config", SiFivePLICState, hart_config), DEFINE_PROP_UINT32("hartid-base", SiFivePLICState, hartid_base, 0), -DEFINE_PROP_UINT32("num-sources", SiFivePLICState, num_sources, 0), +/* number of interrupt sources including interrupt source 0 */ +DEFINE_PROP_UINT32("num-sources", SiFivePLICState, num_sources, 1), DEFINE_PROP_UINT32("num-priorities", SiFivePLICState, num_priorities, 0), DEFINE_PROP_UINT32("priority-base", SiFivePLICState, priority_base, 0), DEFINE_PROP_UINT32("pending-base", SiFivePLICState, pending_base, 0), -- 2.34.1
[PATCH 12/15] hw/riscv: virt: Fix the value of "riscv, ndev" in the dtb
Commit 28d8c281200f ("hw/riscv: virt: Add optional AIA IMSIC support to virt machine") changed the value of VIRT_IRQCHIP_NUM_SOURCES from 127 to 53, which is VIRTIO_NDEV and also used as the value of "riscv,ndev" property in the dtb. Unfortunately this is wrong as VIRT_IRQCHIP_NUM_SOURCES should include interrupt source 0 but "riscv,ndev" does not. While we are here, we also fix the comments of platform bus irq range which is now "64 to 96", but should be "64 to 95", introduced since commit 1832b7cb3f64 ("hw/riscv: virt: Create a platform bus"). Fixes: 28d8c281200f ("hw/riscv: virt: Add optional AIA IMSIC support to virt machine") Signed-off-by: Bin Meng --- include/hw/riscv/virt.h | 5 ++--- hw/riscv/virt.c | 3 ++- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h index be4ab8fe7f..7c23aea4a0 100644 --- a/include/hw/riscv/virt.h +++ b/include/hw/riscv/virt.h @@ -87,15 +87,14 @@ enum { VIRTIO_IRQ = 1, /* 1 to 8 */ VIRTIO_COUNT = 8, PCIE_IRQ = 0x20, /* 32 to 35 */ -VIRT_PLATFORM_BUS_IRQ = 64, /* 64 to 96 */ -VIRTIO_NDEV = 96 /* Arbitrary maximum number of interrupts */ +VIRT_PLATFORM_BUS_IRQ = 64, /* 64 to 95 */ }; #define VIRT_PLATFORM_BUS_NUM_IRQS 32 #define VIRT_IRQCHIP_IPI_MSI 1 #define VIRT_IRQCHIP_NUM_MSIS 255 -#define VIRT_IRQCHIP_NUM_SOURCES VIRTIO_NDEV +#define VIRT_IRQCHIP_NUM_SOURCES 96 #define VIRT_IRQCHIP_NUM_PRIO_BITS 3 #define VIRT_IRQCHIP_MAX_GUESTS_BITS 3 #define VIRT_IRQCHIP_MAX_GUESTS ((1U << VIRT_IRQCHIP_MAX_GUESTS_BITS) - 1U) diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c index a5bc7353b4..c4ee489a80 100644 --- a/hw/riscv/virt.c +++ b/hw/riscv/virt.c @@ -468,7 +468,8 @@ static void create_fdt_socket_plic(RISCVVirtState *s, plic_cells, s->soc[socket].num_harts * sizeof(uint32_t) * 4); qemu_fdt_setprop_cells(mc->fdt, plic_name, "reg", 0x0, plic_addr, 0x0, memmap[VIRT_PLIC].size); -qemu_fdt_setprop_cell(mc->fdt, plic_name, "riscv,ndev", VIRTIO_NDEV); +qemu_fdt_setprop_cell(mc->fdt, plic_name, "riscv,ndev", + VIRT_IRQCHIP_NUM_SOURCES - 1); riscv_socket_fdt_write_id(mc, mc->fdt, plic_name, socket); qemu_fdt_setprop_cell(mc->fdt, plic_name, "phandle", plic_phandles[socket]); -- 2.34.1
[PATCH 10/15] hw/riscv: sifive_e: Fix the number of interrupt sources of PLIC
Per chapter 10 in Freedom E310 manuals [1][2][3], E310 G002 and G003 supports 52 interrupt sources while G000 supports 51 interrupt sources. We use the value of G002 and G003, so it is 53 (including source 0). [1] G000 manual: https://sifive.cdn.prismic.io/sifive/4faf3e34-4a42-4c2f-be9e-c77baa4928c7_fe310-g000-manual-v3p2.pdf [2] G002 manual: https://sifive.cdn.prismic.io/sifive/034760b5-ac6a-4b1c-911c-f4148bb2c4a5_fe310-g002-v1p5.pdf [3] G003 manual: https://sifive.cdn.prismic.io/sifive/3af39c59-6498-471e-9dab-5355a0d539eb_fe310-g003-manual.pdf Fixes: eb637edb1241 ("SiFive Freedom E Series RISC-V Machine") Signed-off-by: Bin Meng --- include/hw/riscv/sifive_e.h | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/include/hw/riscv/sifive_e.h b/include/hw/riscv/sifive_e.h index d738745925..9e58247fd8 100644 --- a/include/hw/riscv/sifive_e.h +++ b/include/hw/riscv/sifive_e.h @@ -82,7 +82,12 @@ enum { }; #define SIFIVE_E_PLIC_HART_CONFIG "M" -#define SIFIVE_E_PLIC_NUM_SOURCES 127 +/* + * Freedom E310 G002 and G003 supports 52 interrupt sources while + * Freedom E310 G000 supports 51 interrupt sources. We use the value + * of G002 and G003, so it is 53 (including interrupt source 0). + */ +#define SIFIVE_E_PLIC_NUM_SOURCES 53 #define SIFIVE_E_PLIC_NUM_PRIORITIES 7 #define SIFIVE_E_PLIC_PRIORITY_BASE 0x04 #define SIFIVE_E_PLIC_PENDING_BASE 0x1000 -- 2.34.1
[PATCH 03/15] hw/riscv: Fix opentitan dependency to SIFIVE_PLIC
Since commit ef6310064820 ("hw/riscv: opentitan: Update to the latest build") the IBEX PLIC model was replaced with the SiFive PLIC model in the 'opentitan' machine but we forgot the add the dependency there. Signed-off-by: Bin Meng --- hw/riscv/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig index 167dc4cca6..1e4b58024f 100644 --- a/hw/riscv/Kconfig +++ b/hw/riscv/Kconfig @@ -19,6 +19,7 @@ config MICROCHIP_PFSOC config OPENTITAN bool select IBEX +select SIFIVE_PLIC select UNIMP config SHAKTI_C -- 2.34.1
[PATCH 11/15] hw/riscv: sifive_u: Avoid using magic number for "riscv, ndev"
At present magic number is used to create "riscv,ndev" property in the dtb. Let's use the macro SIFIVE_U_PLIC_NUM_SOURCES that is used to instantiate the PLIC model instead. Signed-off-by: Bin Meng --- hw/riscv/sifive_u.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c index b139824aab..b40a4767e2 100644 --- a/hw/riscv/sifive_u.c +++ b/hw/riscv/sifive_u.c @@ -287,7 +287,8 @@ static void create_fdt(SiFiveUState *s, const MemMapEntry *memmap, qemu_fdt_setprop_cells(fdt, nodename, "reg", 0x0, memmap[SIFIVE_U_DEV_PLIC].base, 0x0, memmap[SIFIVE_U_DEV_PLIC].size); -qemu_fdt_setprop_cell(fdt, nodename, "riscv,ndev", 0x35); +qemu_fdt_setprop_cell(fdt, nodename, "riscv,ndev", + SIFIVE_U_PLIC_NUM_SOURCES - 1); qemu_fdt_setprop_cell(fdt, nodename, "phandle", plic_phandle); plic_phandle = qemu_fdt_get_phandle(fdt, nodename); g_free(cells); -- 2.34.1
[PATCH 02/15] hw/intc: Select MSI_NONBROKEN in RISC-V AIA interrupt controllers
hw/pci/Kconfig says MSI_NONBROKEN should be selected by interrupt controllers regardless of how MSI is implemented. msi_nonbroken is initialized to true in both riscv_aplic_realize() and riscv_imsic_realize(). Select MSI_NONBROKEN in RISCV_APLIC and RISCV_IMSIC. Signed-off-by: Bin Meng --- hw/intc/Kconfig | 2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index 1d4573e803..21441d0a0c 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -72,9 +72,11 @@ config RISCV_ACLINT config RISCV_APLIC bool +select MSI_NONBROKEN config RISCV_IMSIC bool +select MSI_NONBROKEN config SIFIVE_PLIC bool -- 2.34.1
[PATCH 01/15] hw/riscv: Select MSI_NONBROKEN in SIFIVE_PLIC
hw/pci/Kconfig says MSI_NONBROKEN should be selected by interrupt controllers regardless of how MSI is implemented. msi_nonbroken is initialized to true in sifive_plic_realize(). Let SIFIVE_PLIC select MSI_NONBROKEN and drop the selection from RISC-V machines. Signed-off-by: Bin Meng --- hw/intc/Kconfig | 1 + hw/riscv/Kconfig | 5 - 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/hw/intc/Kconfig b/hw/intc/Kconfig index ecd2883ceb..1d4573e803 100644 --- a/hw/intc/Kconfig +++ b/hw/intc/Kconfig @@ -78,6 +78,7 @@ config RISCV_IMSIC config SIFIVE_PLIC bool +select MSI_NONBROKEN config GOLDFISH_PIC bool diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig index 79ff61c464..167dc4cca6 100644 --- a/hw/riscv/Kconfig +++ b/hw/riscv/Kconfig @@ -11,7 +11,6 @@ config MICROCHIP_PFSOC select MCHP_PFSOC_IOSCB select MCHP_PFSOC_MMUART select MCHP_PFSOC_SYSREG -select MSI_NONBROKEN select RISCV_ACLINT select SIFIVE_PDMA select SIFIVE_PLIC @@ -37,7 +36,6 @@ config RISCV_VIRT imply TPM_TIS_SYSBUS select RISCV_NUMA select GOLDFISH_RTC -select MSI_NONBROKEN select PCI select PCI_EXPRESS_GENERIC_BRIDGE select PFLASH_CFI01 @@ -53,7 +51,6 @@ config RISCV_VIRT config SIFIVE_E bool -select MSI_NONBROKEN select RISCV_ACLINT select SIFIVE_GPIO select SIFIVE_PLIC @@ -64,7 +61,6 @@ config SIFIVE_E config SIFIVE_U bool select CADENCE -select MSI_NONBROKEN select RISCV_ACLINT select SIFIVE_GPIO select SIFIVE_PDMA @@ -82,6 +78,5 @@ config SPIKE bool select RISCV_NUMA select HTIF -select MSI_NONBROKEN select RISCV_ACLINT select SIFIVE_PLIC -- 2.34.1
[PATCH 13/15] hw/intc: sifive_plic: Change "priority-base" to start from interrupt source 0
At present the SiFive PLIC model "priority-base" expects interrupt priority register base starting from source 1 instead source 0, that's why on most platforms "priority-base" is set to 0x04 except 'opentitan' machine. 'opentitan' should have set "priority-base" to 0x04 too. Note the irq number calculation in sifive_plic_{read,write} is correct as the codes make up for the irq number by adding 1. Let's simply update "priority-base" to start from interrupt source 0 and add a comment to make it crystal clear. Signed-off-by: Bin Meng --- include/hw/riscv/microchip_pfsoc.h | 2 +- include/hw/riscv/shakti_c.h| 2 +- include/hw/riscv/sifive_e.h| 2 +- include/hw/riscv/sifive_u.h| 2 +- include/hw/riscv/virt.h| 2 +- hw/intc/sifive_plic.c | 5 +++-- 6 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/hw/riscv/microchip_pfsoc.h b/include/hw/riscv/microchip_pfsoc.h index 9720bac2d5..c10d87a601 100644 --- a/include/hw/riscv/microchip_pfsoc.h +++ b/include/hw/riscv/microchip_pfsoc.h @@ -152,7 +152,7 @@ enum { #define MICROCHIP_PFSOC_PLIC_NUM_SOURCES187 #define MICROCHIP_PFSOC_PLIC_NUM_PRIORITIES 7 -#define MICROCHIP_PFSOC_PLIC_PRIORITY_BASE 0x04 +#define MICROCHIP_PFSOC_PLIC_PRIORITY_BASE 0x00 #define MICROCHIP_PFSOC_PLIC_PENDING_BASE 0x1000 #define MICROCHIP_PFSOC_PLIC_ENABLE_BASE0x2000 #define MICROCHIP_PFSOC_PLIC_ENABLE_STRIDE 0x80 diff --git a/include/hw/riscv/shakti_c.h b/include/hw/riscv/shakti_c.h index daf0aae13f..539fe1156d 100644 --- a/include/hw/riscv/shakti_c.h +++ b/include/hw/riscv/shakti_c.h @@ -65,7 +65,7 @@ enum { #define SHAKTI_C_PLIC_NUM_SOURCES 28 /* Excluding Priority 0 */ #define SHAKTI_C_PLIC_NUM_PRIORITIES 2 -#define SHAKTI_C_PLIC_PRIORITY_BASE 0x04 +#define SHAKTI_C_PLIC_PRIORITY_BASE 0x00 #define SHAKTI_C_PLIC_PENDING_BASE 0x1000 #define SHAKTI_C_PLIC_ENABLE_BASE 0x2000 #define SHAKTI_C_PLIC_ENABLE_STRIDE 0x80 diff --git a/include/hw/riscv/sifive_e.h b/include/hw/riscv/sifive_e.h index 9e58247fd8..b824a79e2d 100644 --- a/include/hw/riscv/sifive_e.h +++ b/include/hw/riscv/sifive_e.h @@ -89,7 +89,7 @@ enum { */ #define SIFIVE_E_PLIC_NUM_SOURCES 53 #define SIFIVE_E_PLIC_NUM_PRIORITIES 7 -#define SIFIVE_E_PLIC_PRIORITY_BASE 0x04 +#define SIFIVE_E_PLIC_PRIORITY_BASE 0x00 #define SIFIVE_E_PLIC_PENDING_BASE 0x1000 #define SIFIVE_E_PLIC_ENABLE_BASE 0x2000 #define SIFIVE_E_PLIC_ENABLE_STRIDE 0x80 diff --git a/include/hw/riscv/sifive_u.h b/include/hw/riscv/sifive_u.h index 8f63a183c4..e680d61ece 100644 --- a/include/hw/riscv/sifive_u.h +++ b/include/hw/riscv/sifive_u.h @@ -158,7 +158,7 @@ enum { #define SIFIVE_U_PLIC_NUM_SOURCES 54 #define SIFIVE_U_PLIC_NUM_PRIORITIES 7 -#define SIFIVE_U_PLIC_PRIORITY_BASE 0x04 +#define SIFIVE_U_PLIC_PRIORITY_BASE 0x00 #define SIFIVE_U_PLIC_PENDING_BASE 0x1000 #define SIFIVE_U_PLIC_ENABLE_BASE 0x2000 #define SIFIVE_U_PLIC_ENABLE_STRIDE 0x80 diff --git a/include/hw/riscv/virt.h b/include/hw/riscv/virt.h index 7c23aea4a0..37819c168c 100644 --- a/include/hw/riscv/virt.h +++ b/include/hw/riscv/virt.h @@ -99,7 +99,7 @@ enum { #define VIRT_IRQCHIP_MAX_GUESTS_BITS 3 #define VIRT_IRQCHIP_MAX_GUESTS ((1U << VIRT_IRQCHIP_MAX_GUESTS_BITS) - 1U) -#define VIRT_PLIC_PRIORITY_BASE 0x04 +#define VIRT_PLIC_PRIORITY_BASE 0x00 #define VIRT_PLIC_PENDING_BASE 0x1000 #define VIRT_PLIC_ENABLE_BASE 0x2000 #define VIRT_PLIC_ENABLE_STRIDE 0x80 diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c index 2bd292410d..7a6a358c57 100644 --- a/hw/intc/sifive_plic.c +++ b/hw/intc/sifive_plic.c @@ -140,7 +140,7 @@ static uint64_t sifive_plic_read(void *opaque, hwaddr addr, unsigned size) SiFivePLICState *plic = opaque; if (addr_between(addr, plic->priority_base, plic->num_sources << 2)) { -uint32_t irq = ((addr - plic->priority_base) >> 2) + 1; +uint32_t irq = (addr - plic->priority_base) >> 2; return plic->source_priority[irq]; } else if (addr_between(addr, plic->pending_base, plic->num_sources >> 3)) { @@ -187,7 +187,7 @@ static void sifive_plic_write(void *opaque, hwaddr addr, uint64_t value, SiFivePLICState *plic = opaque; if (addr_between(addr, plic->priority_base, plic->num_sources << 2)) { -uint32_t irq = ((addr - plic->priority_base) >> 2) + 1; +uint32_t irq = (addr - plic->priority_base) >> 2; if (((plic->num_priorities + 1) & plic->num_priorities) == 0) { /* @@ -428,6 +428,7 @@ static Property sifive_plic_properties[] = { /* number of interrupt sources including interrupt source 0 */ DEFINE_PROP_UINT32("num-sources", SiFivePLICState, num_sources, 1), DEFINE_PROP_UINT32("num-priorities", SiFivePLICState, num_priorities, 0), +/* interrupt priority register base starting from source 0 */ DEFINE_PROP_UINT32("priority-base", SiFivePLICState, priority_base, 0), DEFINE_PROP_UINT32("pending-base", SiFivePLICState, pending
[PATCH 05/15] hw/riscv: spike: Remove misleading comments
PLIC is not included in the 'spike' machine. Signed-off-by: Bin Meng --- hw/riscv/spike.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c index 1e1d752c00..13946acf0d 100644 --- a/hw/riscv/spike.c +++ b/hw/riscv/spike.c @@ -8,7 +8,6 @@ * * 0) HTIF Console and Poweroff * 1) CLINT (Timer and IPI) - * 2) PLIC (Platform Level Interrupt Controller) * * This program is free software; you can redistribute it and/or modify it * under the terms and conditions of the GNU General Public License, -- 2.34.1
[PATCH 06/15] hw/intc: sifive_plic: Drop PLICMode_H
H-mode has been removed since priv spec 1.10. Drop it. Signed-off-by: Bin Meng --- include/hw/intc/sifive_plic.h | 1 - hw/intc/sifive_plic.c | 1 - 2 files changed, 2 deletions(-) diff --git a/include/hw/intc/sifive_plic.h b/include/hw/intc/sifive_plic.h index 134cf39a96..d3f45ec248 100644 --- a/include/hw/intc/sifive_plic.h +++ b/include/hw/intc/sifive_plic.h @@ -33,7 +33,6 @@ DECLARE_INSTANCE_CHECKER(SiFivePLICState, SIFIVE_PLIC, typedef enum PLICMode { PLICMode_U, PLICMode_S, -PLICMode_H, PLICMode_M } PLICMode; diff --git a/hw/intc/sifive_plic.c b/hw/intc/sifive_plic.c index 1cf156cf85..3f6ffb1d70 100644 --- a/hw/intc/sifive_plic.c +++ b/hw/intc/sifive_plic.c @@ -42,7 +42,6 @@ static PLICMode char_to_mode(char c) switch (c) { case 'U': return PLICMode_U; case 'S': return PLICMode_S; -case 'H': return PLICMode_H; case 'M': return PLICMode_M; default: error_report("plic: invalid mode '%c'", c); -- 2.34.1
[PATCH 04/15] hw/riscv: Sort machines Kconfig options in alphabetical order
SHAKTI_C machine Kconfig option was inserted in disorder. Fix it. Signed-off-by: Bin Meng --- hw/riscv/Kconfig | 16 +--- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig index 1e4b58024f..4550b3b938 100644 --- a/hw/riscv/Kconfig +++ b/hw/riscv/Kconfig @@ -4,6 +4,8 @@ config RISCV_NUMA config IBEX bool +# RISC-V machines in alphabetical order + config MICROCHIP_PFSOC bool select CADENCE_SDHCI @@ -22,13 +24,6 @@ config OPENTITAN select SIFIVE_PLIC select UNIMP -config SHAKTI_C -bool -select UNIMP -select SHAKTI_UART -select RISCV_ACLINT -select SIFIVE_PLIC - config RISCV_VIRT bool imply PCI_DEVICES @@ -50,6 +45,13 @@ config RISCV_VIRT select FW_CFG_DMA select PLATFORM_BUS +config SHAKTI_C +bool +select RISCV_ACLINT +select SHAKTI_UART +select SIFIVE_PLIC +select UNIMP + config SIFIVE_E bool select RISCV_ACLINT -- 2.34.1