Re: [PATCH v2 09/20] drm/i915: Remove references to struct drm_device.pdev
On Tue, 08 Dec 2020, Thomas Zimmermann wrote:
> ping for a review of the i915 patches

What did you have in mind regarding merging the series? Should we just pick the patches up?

BR,
Jani.

--
Jani Nikula, Intel Open Source Graphics Center
Re: [PATCH v1] vdpa/mlx5: Use write memory barrier after updating CQ index
On 2020/12/9 10:00 PM, Eli Cohen wrote:

Make sure to put a DMA write memory barrier after updating the CQ consumer index so the hardware knows that there are available CQE slots in the queue.

Failure to do this can cause the RX doorbell record to be updated before the CQ consumer index, resulting in CQ overrun.

Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen
---
V0 -> V1: Use dma_wmb() instead of wmb()

Acked-by: Jason Wang

 drivers/vdpa/mlx5/net/mlx5_vnet.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index db87abc3cb60..43b0069ff8b1 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -479,6 +479,11 @@ static int mlx5_vdpa_poll_one(struct mlx5_vdpa_cq *vcq)
 static void mlx5_vdpa_handle_completions(struct mlx5_vdpa_virtqueue *mvq, int num)
 {
 	mlx5_cq_set_ci(&mvq->cq.mcq);
+
+	/* make sure CQ consumer update is visible to the hardware before
+	 * updating RX doorbell record.
+	 */
+	dma_wmb();
 	rx_post(&mvq->vqqp, num);
 	if (mvq->event_cb.callback)
 		mvq->event_cb.callback(mvq->event_cb.private);
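[Editor's note: this is the classic "publish, then doorbell" ordering problem. A minimal sketch of the rule the patch enforces, using hypothetical structure and field names rather than the mlx5 code: two writes to coherent DMA memory may become visible to the device out of order unless a barrier separates them.]

#include <linux/types.h>
#include <asm/byteorder.h>	/* cpu_to_be32() */
#include <asm/barrier.h>	/* dma_wmb() */

/* Hypothetical ring with two device-visible words in coherent memory. */
struct hyp_ring {
	__be32 *cq_db_rec;	/* CQ consumer index record */
	__be32 *rx_db_rec;	/* RX doorbell record */
	u32 ci, pi;		/* consumer/producer indices */
};

static void hyp_complete_and_repost(struct hyp_ring *r)
{
	/* 1. Tell the device which CQEs were consumed (free CQE slots). */
	*r->cq_db_rec = cpu_to_be32(r->ci);

	/*
	 * 2. Order write 1 before write 3. dma_wmb() is sufficient when
	 * only coherent-memory writes need ordering; the V0 -> V1 change
	 * above relaxed the stronger wmb() for exactly this reason.
	 */
	dma_wmb();

	/* 3. Publish new RX buffers; the CQEs the device generates for
	 * them now provably fit into the freed slots. */
	*r->rx_db_rec = cpu_to_be32(r->pi);
}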
Re: [RFC PATCH 00/27] vDPA software assisted live migration
On 2020/12/9 11:57 PM, Stefan Hajnoczi wrote:

On Wed, Dec 09, 2020 at 04:26:50AM -0500, Jason Wang wrote:

----- Original Message -----

On Fri, Nov 20, 2020 at 07:50:38PM +0100, Eugenio Pérez wrote:

This series enables vDPA software assisted live migration for vhost-net devices. This is a new method of vhost device migration: instead of relying on the vDPA device's dirty logging capability, SW assisted LM intercepts the dataplane, forwarding the descriptors between VM and device.

Pros:
+ vhost/vDPA devices don't need to implement dirty memory logging
+ Obsoletes ioctl(VHOST_SET_LOG_BASE) and friends

Cons:
- Not generic, relies on vhost-net-specific ioctls
- Doesn't support VIRTIO Shared Memory Regions
  https://github.com/oasis-tcs/virtio-spec/blob/master/shared-mem.tex

I may miss something, but my understanding is that it's the responsibility of the device to migrate this part?

Good point. You're right.

- Performance (see below)

I think performance will be significantly lower when the shadow vq is enabled. Imagine a vDPA device with hardware vq doorbell registers mapped into the guest so the guest driver can directly kick the device. When the shadow vq is enabled, a vmexit is needed to write to the shadow vq ioeventfd, then the host kernel scheduler switches to a QEMU thread to read the ioeventfd, the descriptors are translated, QEMU writes to the vhost hdev kick fd, the host kernel scheduler switches to the vhost worker thread, vhost/vDPA notifies the virtqueue, and finally the vDPA driver writes to the hardware vq doorbell register. That is a lot of overhead compared to writing to an exitless MMIO register!

I think it's a balance. E.g. we can poll the virtqueue to have an exitless doorbell.

If the shadow vq were implemented in drivers/vhost/ and QEMU used the existing ioctl(VHOST_SET_LOG_BASE) approach, then the overhead would be reduced to just one set of ioeventfd/irqfd. In other words, the QEMU dirty memory logging happens asynchronously and isn't in the dataplane. In addition, hardware that supports dirty memory logging, as well as software vDPA devices, could completely eliminate the shadow vq for even better performance.

Yes. That's our plan. But the interface might require more thought. E.g. is the bitmap a good approach? To me, reporting dirty pages via virtqueue is better since it has a smaller footprint and is self-throttled. And we need an address space other than the one used by the guest, for either the bitmap or the virtqueue.

But performance is a question of "is it good enough?". Maybe this approach is okay and users don't expect good performance while dirty memory logging is enabled.

Yes, and actually such a slowdown may help the migration converge. Note that the whole idea is to try to have a generic solution for all types of devices. It's good to consider performance, but for the first stage it should be sufficient to make it work and then optimize on top.

Moving the shadow vq to the kernel later would be quite a big change requiring rewriting much of the code. That's why I mentioned this now, before a lot of effort is invested in a QEMU implementation.

Right.

I just wanted to share the idea of moving the shadow vq into the kernel in case you like that approach better.

My understanding is to keep the kernel as simple as possible and leave the policies to userspace as much as possible. E.g. it requires us to disable doorbell mapping and irq offloading, all of which were under the control of userspace.

If the performance is acceptable with the QEMU approach then I think that's the best place to implement it. It looks high-overhead, though, so maybe one of the first things to do is to run benchmarks to collect data on how it performs?

Yes, I agree.

Thanks

Stefan
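[Editor's note: to make the overhead enumeration above concrete, here is a minimal userspace sketch of the per-kick relay QEMU would perform with a shadow vq. The fd names are hypothetical and error handling is trimmed; this is not the series' code, just the ioeventfd-to-kick-fd hop being discussed.]

#include <stdint.h>
#include <unistd.h>

/* One guest kick, relayed: the vmexit writes the shadow vq ioeventfd,
 * a QEMU thread wakes up here, and the vhost worker then wakes up on
 * the kick fd. Each hop is a wakeup/context switch that a guest-mapped
 * hardware doorbell would avoid entirely. */
static void relay_one_kick(int shadow_ioeventfd, int hdev_kick_fd)
{
    uint64_t cnt = 0;

    /* Wake-up caused by the guest's doorbell write. */
    if (read(shadow_ioeventfd, &cnt, sizeof(cnt)) != sizeof(cnt)) {
        return;
    }

    /* Translate/copy descriptors from the guest vring to the vring
     * exposed to the device (elided). */

    /* Kick the vhost worker, which finally rings the real doorbell. */
    cnt = 1;
    (void)write(hdev_kick_fd, &cnt, sizeof(cnt));
}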
Re: [PATCH v2 05/12] x86: rework arch_local_irq_restore() to not use popf
On Wed, Dec 09, 2020 at 07:54:26PM +0100, Thomas Gleixner wrote:
> On Wed, Dec 09 2020 at 18:15, Mark Rutland wrote:
> > In arch/x86/kernel/apic/io_apic.c's timer_irq_works() we do:
> >
> > 	local_irq_save(flags);
> > 	local_irq_enable();
> >
> > 	[ trigger an IRQ here ]
> >
> > 	local_irq_restore(flags);
> >
> > ... and in check_timer() we call that a number of times after either a
> > local_irq_save() or local_irq_disable(), eventually trailing with a
> > local_irq_disable() that will balance things up before calling
> > local_irq_restore().

I gave the patchlet below a spin with my debug patch, and it boots cleanly for me under QEMU. If you spin it as a real patch, feel free to add:

Tested-by: Mark Rutland

Mark.

> ---
>  arch/x86/kernel/apic/io_apic.c | 22 ++++++----------------
>  1 file changed, 6 insertions(+), 16 deletions(-)
>
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -1618,21 +1618,16 @@ static void __init delay_without_tsc(voi
>  static int __init timer_irq_works(void)
>  {
>  	unsigned long t1 = jiffies;
> -	unsigned long flags;
>
>  	if (no_timer_check)
>  		return 1;
>
> -	local_save_flags(flags);
>  	local_irq_enable();
> -
>  	if (boot_cpu_has(X86_FEATURE_TSC))
>  		delay_with_tsc();
>  	else
>  		delay_without_tsc();
>
> -	local_irq_restore(flags);
> -
>  	/*
>  	 * Expect a few ticks at least, to be sure some possible
>  	 * glue logic does not lock up after one or two first
> @@ -1641,10 +1636,10 @@ static int __init timer_irq_works(void)
>  	 * least one tick may be lost due to delays.
>  	 */
>
> -	/* jiffies wrap? */
> -	if (time_after(jiffies, t1 + 4))
> -		return 1;
> -	return 0;
> +	local_irq_disable();
> +
> +	/* Did jiffies advance? */
> +	return time_after(jiffies, t1 + 4);
>  }
>
>  /*
> @@ -2117,13 +2112,12 @@ static inline void __init check_timer(vo
>  	struct irq_cfg *cfg = irqd_cfg(irq_data);
>  	int node = cpu_to_node(0);
>  	int apic1, pin1, apic2, pin2;
> -	unsigned long flags;
>  	int no_pin1 = 0;
>
>  	if (!global_clock_event)
>  		return;
>
> -	local_irq_save(flags);
> +	local_irq_disable();
>
>  	/*
>  	 * get/set the timer IRQ vector:
> @@ -2191,7 +2185,6 @@ static inline void __init check_timer(vo
>  			goto out;
>  		}
>  		panic_if_irq_remap("timer doesn't work through Interrupt-remapped IO-APIC");
> -		local_irq_disable();
>  		clear_IO_APIC_pin(apic1, pin1);
>  		if (!no_pin1)
>  			apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
> @@ -2215,7 +2208,6 @@ static inline void __init check_timer(vo
>  		/*
>  		 * Cleanup, just in case ...
>  		 */
> -		local_irq_disable();
>  		legacy_pic->mask(0);
>  		clear_IO_APIC_pin(apic2, pin2);
>  		apic_printk(APIC_QUIET, KERN_INFO "....... failed.\n");
> @@ -2232,7 +2224,6 @@ static inline void __init check_timer(vo
>  		apic_printk(APIC_QUIET, KERN_INFO "..... works.\n");
>  		goto out;
>  	}
> -	local_irq_disable();
>  	legacy_pic->mask(0);
>  	apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | cfg->vector);
>  	apic_printk(APIC_QUIET, KERN_INFO "..... failed.\n");
> @@ -2251,7 +2242,6 @@ static inline void __init check_timer(vo
>  		apic_printk(APIC_QUIET, KERN_INFO "..... works.\n");
>  		goto out;
>  	}
> -	local_irq_disable();
>  	apic_printk(APIC_QUIET, KERN_INFO "..... failed :(.\n");
>  	if (apic_is_x2apic_enabled())
>  		apic_printk(APIC_QUIET, KERN_INFO
> @@ -2260,7 +2250,7 @@ static inline void __init check_timer(vo
>  	panic("IO-APIC + timer doesn't work! Boot with apic=debug and send a "
>  	      "report. Then try booting with the 'noapic' option.\n");
>  out:
> -	local_irq_restore(flags);
> +	local_irq_enable();
>  }
>
>  /*
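[Editor's note: for context, the rework under review in this thread replaces the POPF-based restore with, in essence, a conditional enable. This is a paraphrased sketch of that direction (the exact upstream code may differ); it shows why a save/enable/restore dance stops working as an implicit disable, which is what the consistency check trips on and what the patchlet above removes.]

static __always_inline void arch_local_irq_restore(unsigned long flags)
{
	/*
	 * Only the enable direction is honored: if the saved flags had
	 * IRQs enabled, enable them; if they had IRQs disabled, do
	 * nothing. A caller that does
	 *
	 *	local_irq_save(flags);		// IRQs were disabled
	 *	local_irq_enable();
	 *	local_irq_restore(flags);	// no longer disables!
	 *
	 * relies on restore to disable again - exactly the pattern
	 * removed from timer_irq_works()/check_timer() above.
	 */
	if (!arch_irqs_disabled_flags(flags))
		arch_local_irq_enable();
}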
Re: [RFC PATCH 05/27] vhost: Add hdev->dev.sw_lm_vq_handler
On Wed, Dec 09, 2020 at 04:02:56PM +0100, Eugenio Perez Martin wrote:
> On Mon, Dec 7, 2020 at 5:52 PM Stefan Hajnoczi wrote:
> > On Fri, Nov 20, 2020 at 07:50:43PM +0100, Eugenio Pérez wrote:
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index 9179013ac4..9a69ae3598 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -2628,24 +2628,32 @@ static void virtio_net_tx_bh(void *opaque)
> > >      }
> > >  }
> > >
> > > -static void virtio_net_add_queue(VirtIONet *n, int index)
> > > +static void virtio_net_add_queue(VirtIONet *n, int index,
> > > +                                 VirtIOHandleOutput custom_handler)
> > >  {
> >
> > We talked about the possibility of moving this into the generic vhost
> > code so that devices don't need to be modified. It would be nice to
> > hide this feature inside vhost.
>
> I'm thinking of tying it to VirtQueue, allowing the caller to override
> the handler, knowing it is not going to be called (I mean, not offering
> race condition protection, like before starting to process
> notifications in qemu by calling vhost_dev_disable_notifiers).

Yes, I can see how at least part of this belongs to VirtQueue.

Stefan
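[Editor's note: a rough sketch of the VirtQueue-level override discussed above, with a hypothetical setter name. VirtQueue's handler field is private to hw/virtio/virtio.c, so a helper like this would have to live there; it is a sketch of the idea, not the series' code.]

/* Hypothetical: swap a VirtQueue's guest-kick handler. No locking is
 * provided, so the caller must guarantee the old handler cannot run
 * concurrently, e.g. by doing this before QEMU starts processing
 * notifications via vhost_dev_disable_notifiers(). */
void virtio_queue_set_output_handler(VirtQueue *vq, VirtIOHandleOutput handler)
{
    vq->handle_output = handler;
}

With something like this, virtio-net would not need the extra custom_handler parameter added in the hunk above.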
Re: [PATCH v2 09/20] drm/i915: Remove references to struct drm_device.pdev
Hi

On 10.12.20 10:03, Jani Nikula wrote:
> On Tue, 08 Dec 2020, Thomas Zimmermann wrote:
> > ping for a review of the i915 patches
>
> What did you have in mind regarding merging the series? Should we just
> pick the patches up?

Originally I thought that individual trees would merge their respective patches. I guess the larger drivers could get some minor conflicts. The amdgpu/radeon devs suggested putting everything into drm-misc-next at once. I'd prefer that, but ultimately it's up to you.

Best regards
Thomas

> BR,
> Jani.

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Re: [PATCH v2 13/20] drm/nouveau: Remove references to struct drm_device.pdev
Hi

On 09.12.20 19:04, Jeremy Cline wrote:
> Hi,
>
> On Tue, Dec 01, 2020 at 11:35:35AM +0100, Thomas Zimmermann wrote:
> > Using struct drm_device.pdev is deprecated. Convert nouveau to struct
> > drm_device.dev. No functional changes.
> >
> > Signed-off-by: Thomas Zimmermann
> > Cc: Ben Skeggs
> > ---
> >  drivers/gpu/drm/nouveau/dispnv04/arb.c      | 12 +++-
> >  drivers/gpu/drm/nouveau/dispnv04/disp.h     | 14 --
> >  drivers/gpu/drm/nouveau/dispnv04/hw.c       | 10 ++
> >  drivers/gpu/drm/nouveau/nouveau_abi16.c     |  7 ---
> >  drivers/gpu/drm/nouveau/nouveau_acpi.c      |  2 +-
> >  drivers/gpu/drm/nouveau/nouveau_bios.c      | 11 ---
> >  drivers/gpu/drm/nouveau/nouveau_connector.c | 10 ++
> >  drivers/gpu/drm/nouveau/nouveau_drm.c       |  5 ++---
> >  drivers/gpu/drm/nouveau/nouveau_fbcon.c     |  6 --
> >  drivers/gpu/drm/nouveau/nouveau_vga.c       | 20
> >  10 files changed, 58 insertions(+), 39 deletions(-)
>
> I believe there's a use of drm_device.pdev in
> drivers/gpu/drm/nouveau/dispnv04/dfp.c in the
> nv04_dfp_update_backlight() function.

Oh, I see! It's PowerPC-only.

> Other than that, this looks good to me.

With the forgotten pdev fixes, can I count this as an Acked-by?

Best regards
Thomas

> > diff --git a/drivers/gpu/drm/nouveau/dispnv04/arb.c b/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > index 9d4a2d97507e..1d3542d6006b 100644
> > --- a/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > +++ b/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > @@ -200,16 +200,17 @@ nv04_update_arb(struct drm_device *dev, int VClk, int bpp,
> >  	int MClk = nouveau_hw_get_clock(dev, PLL_MEMORY);
> >  	int NVClk = nouveau_hw_get_clock(dev, PLL_CORE);
> >  	uint32_t cfg1 = nvif_rd32(device, NV04_PFB_CFG1);
> > +	struct pci_dev *pdev = to_pci_dev(dev->dev);
> >
> >  	sim_data.pclk_khz = VClk;
> >  	sim_data.mclk_khz = MClk;
> >  	sim_data.nvclk_khz = NVClk;
> >  	sim_data.bpp = bpp;
> >  	sim_data.two_heads = nv_two_heads(dev);
> > -	if ((dev->pdev->device & 0xffff) == 0x01a0 /*CHIPSET_NFORCE*/ ||
> > -	    (dev->pdev->device & 0xffff) == 0x01f0 /*CHIPSET_NFORCE2*/) {
> > +	if ((pdev->device & 0xffff) == 0x01a0 /*CHIPSET_NFORCE*/ ||
> > +	    (pdev->device & 0xffff) == 0x01f0 /*CHIPSET_NFORCE2*/) {
> >  		uint32_t type;
> > -		int domain = pci_domain_nr(dev->pdev->bus);
> > +		int domain = pci_domain_nr(pdev->bus);
> >
> >  		pci_read_config_dword(pci_get_domain_bus_and_slot(domain, 0, 1),
> >  				      0x7c, &type);
> > @@ -251,11 +252,12 @@ void
> >  nouveau_calc_arb(struct drm_device *dev, int vclk, int bpp, int *burst, int *lwm)
> >  {
> >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > +	struct pci_dev *pdev = to_pci_dev(dev->dev);
> >
> >  	if (drm->client.device.info.family < NV_DEVICE_INFO_V0_KELVIN)
> >  		nv04_update_arb(dev, vclk, bpp, burst, lwm);
> > -	else if ((dev->pdev->device & 0xfff0) == 0x0240 /*CHIPSET_C51*/ ||
> > -		 (dev->pdev->device & 0xfff0) == 0x03d0 /*CHIPSET_C512*/) {
> > +	else if ((pdev->device & 0xfff0) == 0x0240 /*CHIPSET_C51*/ ||
> > +		 (pdev->device & 0xfff0) == 0x03d0 /*CHIPSET_C512*/) {
> >  		*burst = 128;
> >  		*lwm = 0x0480;
> >  	} else
> > diff --git a/drivers/gpu/drm/nouveau/dispnv04/disp.h b/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > index 5ace5e906949..f0a24126641a 100644
> > --- a/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > +++ b/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > @@ -130,7 +130,7 @@ static inline bool
> >  nv_two_heads(struct drm_device *dev)
> >  {
> >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > -	const int impl = dev->pdev->device & 0x0ff0;
> > +	const int impl = to_pci_dev(dev->dev)->device & 0x0ff0;
> >
> >  	if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_CELSIUS && impl != 0x0100 &&
> >  	    impl != 0x0150 && impl != 0x01a0 && impl != 0x0200)
> > @@ -142,14 +142,14 @@ nv_two_heads(struct drm_device *dev)
> >  static inline bool
> >  nv_gf4_disp_arch(struct drm_device *dev)
> >  {
> > -	return nv_two_heads(dev) && (dev->pdev->device & 0x0ff0) != 0x0110;
> > +	return nv_two_heads(dev) && (to_pci_dev(dev->dev)->device & 0x0ff0) != 0x0110;
> >  }
> >
> >  static inline bool
> >  nv_two_reg_pll(struct drm_device *dev)
> >  {
> >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > -	const int impl = dev->pdev->device & 0x0ff0;
> > +	const int impl = to_pci_dev(dev->dev)->device & 0x0ff0;
> >
> >  	if (impl == 0x0310 || impl == 0x0340 || drm->client.device.info.family >= NV_DEVICE_INFO_V0_CURIE)
> >  		return true;
> > @@ -160,9 +160,11 @@ static inline bool
> >  nv_match_device(struct drm_device *dev, unsigned device,
> >  		unsigned sub_vendor, unsigned sub_device)
> >  {
> > -	return dev->pdev->device == device &&
> > -	dev->pdev->subsystem_vendor == sub_vendor &&
> > -	dev->pdev->subsystem_device == sub_device;
> > +	struct pci_dev *pdev = to_pci_dev(dev->dev);
> > +
> > +	return pdev->device == device &&
> > +	pdev->subsystem
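[Editor's note: the mechanical pattern in this patch works because a PCI device embeds its generic struct device, so drm_device.dev can be converted back with the kernel's to_pci_dev() helper (a container_of() wrapper) instead of keeping the cached drm_device.pdev pointer. A small illustrative helper; the function itself is hypothetical.]

#include <linux/pci.h>
#include <drm/drm_device.h>

/* Illustrative only: read the PCI device ID through drm_device.dev.
 * Safe only on code paths that know dev->dev really is a PCI device,
 * as the nouveau code above does. */
static u16 hyp_pci_device_id(struct drm_device *dev)
{
	struct pci_dev *pdev = to_pci_dev(dev->dev);

	return pdev->device;
}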
Re: [RFC PATCH 07/27] vhost: Route guest->host notification through qemu
On Wed, Dec 09, 2020 at 06:08:14PM +0100, Eugenio Perez Martin wrote:
> On Mon, Dec 7, 2020 at 6:42 PM Stefan Hajnoczi wrote:
> > On Fri, Nov 20, 2020 at 07:50:45PM +0100, Eugenio Pérez wrote:
> > > +{
> > > +    struct vhost_vring_file file = {
> > > +        .index = idx
> > > +    };
> > > +    VirtQueue *vq = virtio_get_queue(dev->vdev, idx);
> > > +    VhostShadowVirtqueue *svq;
> > > +    int r;
> > > +
> > > +    svq = g_new0(VhostShadowVirtqueue, 1);
> > > +    svq->vq = vq;
> > > +
> > > +    r = event_notifier_init(&svq->hdev_notifier, 0);
> > > +    assert(r == 0);
> > > +
> > > +    file.fd = event_notifier_get_fd(&svq->hdev_notifier);
> > > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &file);
> > > +    assert(r == 0);
> > > +
> > > +    return svq;
> > > +}
> >
> > I guess there are assumptions about the status of the device? Does the
> > virtqueue need to be disabled when this function is called?
>
> Yes. Maybe an assertion checking the notification state?

Sounds good.

> > > +
> > > +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> > > +{
> > > +    int idx;
> > > +
> > > +    vhost_dev_enable_notifiers(dev, dev->vdev);
> > > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > > +        vhost_sw_lm_shadow_vq_free(dev->sw_lm_shadow_vq[idx]);
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > > +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> > > +{
> > > +    int idx;
> > > +
> > > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > > +        dev->sw_lm_shadow_vq[idx] = vhost_sw_lm_shadow_vq(dev, idx);
> > > +    }
> > > +
> > > +    vhost_dev_disable_notifiers(dev, dev->vdev);
> >
> > There is a race condition if the guest kicks the vq while this is
> > happening. The shadow vq hdev_notifier needs to be set so the vhost
> > device checks the virtqueue for requests that slipped in during the
> > race window.
>
> I'm not sure if I follow you. If I understand correctly,
> vhost_dev_disable_notifiers calls virtio_bus_cleanup_host_notifier,
> and the latter calls virtio_queue_host_notifier_read. That's why the
> documentation says "This might actually run the qemu handlers right
> away, so virtio in qemu must be completely setup when this is
> called.". Am I missing something?

There are at least two cases:

1. Virtqueue kicks that come before vhost_dev_disable_notifiers().
   vhost_dev_disable_notifiers() notices that and calls
   virtio_queue_notify_vq(). Will handle_sw_lm_vq() be invoked or is the
   device's vq handler function invoked?

2. Virtqueue kicks that come in after vhost_dev_disable_notifiers()
   returns. We hold the QEMU global mutex so the vCPU thread cannot
   perform MMIO/PIO dispatch immediately. The vCPU thread's
   ioctl(KVM_RUN) has already returned and will dispatch the MMIO/PIO
   access inside QEMU as soon as the global mutex is released. In other
   words, we're not taking the kvm.ko ioeventfd path, but
   memory_region_dispatch_write_eventfds() should signal the ioeventfd
   that is registered at the time the dispatch occurs. Is that eventfd
   handled by handle_sw_lm_vq()?

Neither of these cases is obvious from the code. At least comments would help, but I suspect restructuring the code so the critical ioeventfd state changes happen in a sequence would make it even clearer.
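[Editor's note: one hedged sketch of the ordering being asked for above, assuming the hypothetical sequencing below rather than the series' actual code: point the device's kick fd at the shadow notifier first, then disable the guest notifiers, then explicitly check the queue once so kicks from the race window are processed by the shadow handler.]

/* Sketch: handover from device handler to shadow handler, assuming the
 * caller holds the QEMU global mutex. */
static void start_shadow_vq(struct vhost_dev *dev,
                            VhostShadowVirtqueue *svq, int idx)
{
    struct vhost_vring_file file = {
        .index = idx,
        .fd = event_notifier_get_fd(&svq->hdev_notifier),
    };

    /* 1. Device kicks now come from the shadow notifier. */
    dev->vhost_ops->vhost_set_vring_kick(dev, &file);

    /* 2. Stop routing the guest's ioeventfd directly to the device. */
    vhost_dev_disable_notifiers(dev, dev->vdev);

    /* 3. Drain anything that slipped in during the window so no guest
     *    kick is lost; this must land in handle_sw_lm_vq(). */
    virtio_queue_notify(dev->vdev, idx);
}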
Re: [RFC PATCH 13/27] vhost: Send buffers to device
On Wed, Dec 09, 2020 at 07:41:23PM +0100, Eugenio Perez Martin wrote:
> On Tue, Dec 8, 2020 at 9:16 AM Stefan Hajnoczi wrote:
> > On Fri, Nov 20, 2020 at 07:50:51PM +0100, Eugenio Pérez wrote:
> > > +        while (true) {
> > > +            int r;
> > > +            if (virtio_queue_full(vq)) {
> > > +                break;
> > > +            }
> >
> > Why is this check necessary? The guest cannot provide more descriptors
> > than there is ring space. If that happens somehow then it's a driver
> > error that is already reported in virtqueue_pop() below.
>
> It's just checked because virtqueue_pop prints an error in that case,
> and there is no way to tell the difference between a regular error and
> one caused by other reasons. Maybe the right thing to do is just not to
> print that error? The caller should do the error printing in that case.
> Should we return an error code?

The reason an error is printed today is because it's a guest error that never happens with correct guest drivers. Something is broken and the user should know about it.

Why is "virtio_queue_full" (I already forgot what that actually means, it's not clear whether this is referring to avail elements or used elements) a condition that should be silently ignored in shadow vqs?

Stefan
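[Editor's note: to pin down the avail-vs-used question, here is a sketch of one plausible reading, under the assumption that the series' virtio_queue_full() mirrors the "inuse" condition that virtqueue_pop() otherwise reports as a guest error ("Virtqueue size exceeded"); the field names follow QEMU's virtio.c, but the helper itself is illustrative.]

/* "Full" in this reading means every ring entry is outstanding: popped
 * but not yet returned via virtqueue_push()/virtqueue_fill(). It says
 * nothing about whether the avail ring holds new entries. */
static bool shadow_vq_exhausted(const VirtQueue *vq)
{
    return vq->inuse >= vq->vring.num;
}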
Re: [RFC PATCH 16/27] virtio: Expose virtqueue_alloc_element
On Wed, Dec 09, 2020 at 07:46:49PM +0100, Eugenio Perez Martin wrote:
> On Tue, Dec 8, 2020 at 9:26 AM Stefan Hajnoczi wrote:
> > On Fri, Nov 20, 2020 at 07:50:54PM +0100, Eugenio Pérez wrote:
> > > Specifying VirtQueueElement * as the return type does no harm at
> > > this moment.
> >
> > The reason for the void * return type is that C implicitly converts
> > void pointers to pointers of any type. The function takes a size_t sz
> > argument so it can allocate an object of user-defined size. The idea
> > is that the user's struct embeds a VirtQueueElement field. Changing
> > the return type to VirtQueueElement * means that callers may need to
> > explicitly cast to the user's struct type.
> >
> > It's a question of coding style, but I think the void * return type
> > communicates what is going on better than VirtQueueElement *.
>
> Right. What I meant with that is that nobody uses that feature, but I
> just re-checked and I saw that contrib/vhost-user-blk actually uses it
> (not checked for more uses). I think it is better just to drop this
> commit.

contrib/vhost-user-blk doesn't use hw/virtio/virtio.c. The code is similar and copy-pasted, but you are free to change this file without affecting contrib/vhost-user-blk :).

I still think it's clearer to make it obvious that this function allocates an object of generic type, or at least that the change is purely a question of style and probably not worth making.

Stefan
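[Editor's note: for illustration, the embedding idiom Stefan describes, modeled on how QEMU devices use virtqueue_pop(); the MyRequest struct and my_pop() helper are hypothetical.]

typedef struct MyRequest {
    VirtQueueElement elem;  /* must be the first member */
    int status;             /* device-specific per-request state */
} MyRequest;

static MyRequest *my_pop(VirtQueue *vq)
{
    /* sz covers the trailing fields; the void * return value converts
     * implicitly. With a VirtQueueElement * return type, this line
     * would need an explicit (MyRequest *) cast. */
    MyRequest *req = virtqueue_pop(vq, sizeof(MyRequest));

    if (req) {
        req->status = 0;
    }
    return req;
}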
Re: [PATCH v2 13/20] drm/nouveau: Remove references to struct drm_device.pdev
On Thu, Dec 10, 2020 at 08:56:08AM +0100, Thomas Zimmermann wrote:
> Hi
>
> On 09.12.20 19:04, Jeremy Cline wrote:
> > Hi,
> >
> > On Tue, Dec 01, 2020 at 11:35:35AM +0100, Thomas Zimmermann wrote:
> > > Using struct drm_device.pdev is deprecated. Convert nouveau to struct
> > > drm_device.dev. No functional changes.
> > >
> > > Signed-off-by: Thomas Zimmermann
> > > Cc: Ben Skeggs
> > > ---
> > >  drivers/gpu/drm/nouveau/dispnv04/arb.c      | 12 +++-
> > >  drivers/gpu/drm/nouveau/dispnv04/disp.h     | 14 --
> > >  drivers/gpu/drm/nouveau/dispnv04/hw.c       | 10 ++
> > >  drivers/gpu/drm/nouveau/nouveau_abi16.c     |  7 ---
> > >  drivers/gpu/drm/nouveau/nouveau_acpi.c      |  2 +-
> > >  drivers/gpu/drm/nouveau/nouveau_bios.c      | 11 ---
> > >  drivers/gpu/drm/nouveau/nouveau_connector.c | 10 ++
> > >  drivers/gpu/drm/nouveau/nouveau_drm.c       |  5 ++---
> > >  drivers/gpu/drm/nouveau/nouveau_fbcon.c     |  6 --
> > >  drivers/gpu/drm/nouveau/nouveau_vga.c       | 20
> > >  10 files changed, 58 insertions(+), 39 deletions(-)
> >
> > I believe there's a use of drm_device.pdev in
> > drivers/gpu/drm/nouveau/dispnv04/dfp.c in the
> > nv04_dfp_update_backlight() function.
>
> Oh, I see! It's PowerPC-only.
>
> > Other than that, this looks good to me.
>
> With the forgotten pdev fixes, can I count this as an Acked-by?

Yeah, with that fix:

Reviewed-by: Jeremy Cline

> Best regards
> Thomas
>
> > > diff --git a/drivers/gpu/drm/nouveau/dispnv04/arb.c b/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > > index 9d4a2d97507e..1d3542d6006b 100644
> > > --- a/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > > +++ b/drivers/gpu/drm/nouveau/dispnv04/arb.c
> > > @@ -200,16 +200,17 @@ nv04_update_arb(struct drm_device *dev, int VClk, int bpp,
> > >  	int MClk = nouveau_hw_get_clock(dev, PLL_MEMORY);
> > >  	int NVClk = nouveau_hw_get_clock(dev, PLL_CORE);
> > >  	uint32_t cfg1 = nvif_rd32(device, NV04_PFB_CFG1);
> > > +	struct pci_dev *pdev = to_pci_dev(dev->dev);
> > >
> > >  	sim_data.pclk_khz = VClk;
> > >  	sim_data.mclk_khz = MClk;
> > >  	sim_data.nvclk_khz = NVClk;
> > >  	sim_data.bpp = bpp;
> > >  	sim_data.two_heads = nv_two_heads(dev);
> > > -	if ((dev->pdev->device & 0xffff) == 0x01a0 /*CHIPSET_NFORCE*/ ||
> > > -	    (dev->pdev->device & 0xffff) == 0x01f0 /*CHIPSET_NFORCE2*/) {
> > > +	if ((pdev->device & 0xffff) == 0x01a0 /*CHIPSET_NFORCE*/ ||
> > > +	    (pdev->device & 0xffff) == 0x01f0 /*CHIPSET_NFORCE2*/) {
> > >  		uint32_t type;
> > > -		int domain = pci_domain_nr(dev->pdev->bus);
> > > +		int domain = pci_domain_nr(pdev->bus);
> > >
> > >  		pci_read_config_dword(pci_get_domain_bus_and_slot(domain, 0, 1),
> > >  				      0x7c, &type);
> > > @@ -251,11 +252,12 @@ void
> > >  nouveau_calc_arb(struct drm_device *dev, int vclk, int bpp, int *burst, int *lwm)
> > >  {
> > >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > > +	struct pci_dev *pdev = to_pci_dev(dev->dev);
> > >
> > >  	if (drm->client.device.info.family < NV_DEVICE_INFO_V0_KELVIN)
> > >  		nv04_update_arb(dev, vclk, bpp, burst, lwm);
> > > -	else if ((dev->pdev->device & 0xfff0) == 0x0240 /*CHIPSET_C51*/ ||
> > > -		 (dev->pdev->device & 0xfff0) == 0x03d0 /*CHIPSET_C512*/) {
> > > +	else if ((pdev->device & 0xfff0) == 0x0240 /*CHIPSET_C51*/ ||
> > > +		 (pdev->device & 0xfff0) == 0x03d0 /*CHIPSET_C512*/) {
> > >  		*burst = 128;
> > >  		*lwm = 0x0480;
> > >  	} else
> > > diff --git a/drivers/gpu/drm/nouveau/dispnv04/disp.h b/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > > index 5ace5e906949..f0a24126641a 100644
> > > --- a/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > > +++ b/drivers/gpu/drm/nouveau/dispnv04/disp.h
> > > @@ -130,7 +130,7 @@ static inline bool
> > >  nv_two_heads(struct drm_device *dev)
> > >  {
> > >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > > -	const int impl = dev->pdev->device & 0x0ff0;
> > > +	const int impl = to_pci_dev(dev->dev)->device & 0x0ff0;
> > >
> > >  	if (drm->client.device.info.family >= NV_DEVICE_INFO_V0_CELSIUS
> > >  	    && impl != 0x0100 &&
> > >  	    impl != 0x0150 && impl != 0x01a0 && impl != 0x0200)
> > > @@ -142,14 +142,14 @@ nv_two_heads(struct drm_device *dev)
> > >  static inline bool
> > >  nv_gf4_disp_arch(struct drm_device *dev)
> > >  {
> > > -	return nv_two_heads(dev) && (dev->pdev->device & 0x0ff0) != 0x0110;
> > > +	return nv_two_heads(dev) && (to_pci_dev(dev->dev)->device & 0x0ff0) != 0x0110;
> > >  }
> > >
> > >  static inline bool
> > >  nv_two_reg_pll(struct drm_device *dev)
> > >  {
> > >  	struct nouveau_drm *drm = nouveau_drm(dev);
> > > -	const int impl = dev->pdev->device & 0x0ff0;
> > > +	const
x86/ioapic: Cleanup the timer_works() irqflags mess
Mark tripped over the creative irqflags handling in the IO-APIC timer delivery check, which ends up doing:

	local_irq_save(flags);
	local_irq_enable();
	local_irq_restore(flags);

which triggered a new consistency check he's working on, required for replacing the POPF-based restore with a conditional STI.

That code is a historical mess and none of this is needed. Make it straightforward and use local_irq_disable()/enable(), as that's all that is required. It is invoked from interrupt-enabled code nowadays.

Reported-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Tested-by: Mark Rutland
---
 arch/x86/kernel/apic/io_apic.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1618,21 +1618,16 @@ static void __init delay_without_tsc(voi
 static int __init timer_irq_works(void)
 {
 	unsigned long t1 = jiffies;
-	unsigned long flags;

 	if (no_timer_check)
 		return 1;

-	local_save_flags(flags);
 	local_irq_enable();
-
 	if (boot_cpu_has(X86_FEATURE_TSC))
 		delay_with_tsc();
 	else
 		delay_without_tsc();

-	local_irq_restore(flags);
-
 	/*
 	 * Expect a few ticks at least, to be sure some possible
 	 * glue logic does not lock up after one or two first
@@ -1641,10 +1636,10 @@ static int __init timer_irq_works(void)
 	 * least one tick may be lost due to delays.
 	 */

-	/* jiffies wrap? */
-	if (time_after(jiffies, t1 + 4))
-		return 1;
-	return 0;
+	local_irq_disable();
+
+	/* Did jiffies advance? */
+	return time_after(jiffies, t1 + 4);
 }

 /*
@@ -2117,13 +2112,12 @@ static inline void __init check_timer(vo
 	struct irq_cfg *cfg = irqd_cfg(irq_data);
 	int node = cpu_to_node(0);
 	int apic1, pin1, apic2, pin2;
-	unsigned long flags;
 	int no_pin1 = 0;

 	if (!global_clock_event)
 		return;

-	local_irq_save(flags);
+	local_irq_disable();

 	/*
 	 * get/set the timer IRQ vector:
@@ -2191,7 +2185,6 @@ static inline void __init check_timer(vo
 			goto out;
 		}
 		panic_if_irq_remap("timer doesn't work through Interrupt-remapped IO-APIC");
-		local_irq_disable();
 		clear_IO_APIC_pin(apic1, pin1);
 		if (!no_pin1)
 			apic_printk(APIC_QUIET, KERN_ERR "..MP-BIOS bug: "
@@ -2215,7 +2208,6 @@ static inline void __init check_timer(vo
 		/*
 		 * Cleanup, just in case ...
 		 */
-		local_irq_disable();
 		legacy_pic->mask(0);
 		clear_IO_APIC_pin(apic2, pin2);
 		apic_printk(APIC_QUIET, KERN_INFO "....... failed.\n");
@@ -2232,7 +2224,6 @@ static inline void __init check_timer(vo
 		apic_printk(APIC_QUIET, KERN_INFO "..... works.\n");
 		goto out;
 	}
-	local_irq_disable();
 	legacy_pic->mask(0);
 	apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_FIXED | cfg->vector);
 	apic_printk(APIC_QUIET, KERN_INFO "..... failed.\n");
@@ -2251,7 +2242,6 @@ static inline void __init check_timer(vo
 		apic_printk(APIC_QUIET, KERN_INFO "..... works.\n");
 		goto out;
 	}
-	local_irq_disable();
 	apic_printk(APIC_QUIET, KERN_INFO "..... failed :(.\n");
 	if (apic_is_x2apic_enabled())
 		apic_printk(APIC_QUIET, KERN_INFO
@@ -2260,7 +2250,7 @@ static inline void __init check_timer(vo
 	panic("IO-APIC + timer doesn't work! Boot with apic=debug and send a "
 	      "report. Then try booting with the 'noapic' option.\n");
 out:
-	local_irq_restore(flags);
+	local_irq_enable();
 }

 /*
Re: [PATCH net] vhost_net: fix high cpu load when sendmsg fails
On 2020/12/9 9:27 PM, wangyunjian wrote:

-----Original Message-----
From: Michael S. Tsirkin [mailto:m...@redhat.com]
Sent: Wednesday, December 9, 2020 8:50 PM
To: wangyunjian
Cc: jasow...@redhat.com; virtualization@lists.linux-foundation.org; net...@vger.kernel.org; Lilijun (Jerry); chenchanghu; xudingke
Subject: Re: [PATCH net] vhost_net: fix high cpu load when sendmsg fails

On Wed, Dec 09, 2020 at 07:48:24PM +0800, wangyunjian wrote:

From: Yunjian Wang

Currently we break the loop and wake up the vhost_worker when sendmsg fails. When the worker wakes up again, we'll meet the same error. This will cause high CPU load. To fix this issue, we can skip this descriptor by ignoring the error.

Signed-off-by: Yunjian Wang
---
 drivers/vhost/net.c | 24 +++++-------------------
 1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 531a00d703cd..ac950b1120f5 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -829,14 +829,8 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
-		if (unlikely(err < 0)) {
-			vhost_discard_vq_desc(vq, 1);
-			vhost_net_enable_vq(net, vq);
-			break;
-		}
-		if (err != len)
-			pr_debug("Truncated TX packet: len %d != %zd\n",
-				 err, len);
+		if (unlikely(err < 0 || err != len))
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
+			       err, len);
 done:
 		vq->heads[nvq->done_idx].id = cpu_to_vhost32(vq, head);
 		vq->heads[nvq->done_idx].len = 0;

One of the reasons for sendmsg to fail is ENOBUFS. In that case for sure we don't want to drop the packet.

Now the functions tap_sendmsg()/tun_sendmsg() don't return ENOBUFS.

I think not, it can happen if we exceed sndbuf. E.g. see tun_alloc_skb().

Thanks

There could be other transient errors. Which error did you encounter, specifically?

Currently a guest VM sends an skb whose length is more than 64k. If the virtio hdr is wrong, the problem will also be triggered.

Thanks

@@ -925,19 +919,11 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 		/* TODO: Check specific error and bomb out unless ENOBUFS? */
 		err = sock->ops->sendmsg(sock, &msg, len);
-		if (unlikely(err < 0)) {
-			if (zcopy_used) {
+		if (unlikely(err < 0 || err != len)) {
+			if (zcopy_used && err < 0)
 				vhost_net_ubuf_put(ubufs);
-				nvq->upend_idx = ((unsigned)nvq->upend_idx - 1)
-					% UIO_MAXIOV;
-			}
-			vhost_discard_vq_desc(vq, 1);
-			vhost_net_enable_vq(net, vq);
-			break;
+			vq_err(vq, "Fail to sending packets err : %d, len : %zd\n",
+			       err, len);
 		}
-		if (err != len)
-			pr_debug("Truncated TX packet: "
-				 " len %d != %zd\n", err, len);
 		if (!zcopy_used)
 			vhost_add_used_and_signal(&net->dev, vq, head, 0);
 		else
--
2.23.0
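[Editor's note: the open question in this thread is which sendmsg() errors are transient. A hedged sketch of the split being argued for, illustrative only and not the merged fix; the helper name and the exact error set are assumptions.]

/* Return true if the TX loop should requeue the descriptor and stop
 * (transient error, e.g. a full sndbuf), false if it should drop the
 * descriptor and continue (persistent per-packet error) - avoiding both
 * packet loss and the wakeup busy-loop this patch targets. */
static bool tx_err_is_transient(struct vhost_net *net,
				struct vhost_virtqueue *vq, int err)
{
	if (err == -ENOBUFS) {
		vhost_discard_vq_desc(vq, 1);	/* give the buffer back */
		vhost_net_enable_vq(net, vq);	/* retry on the next kick */
		return true;
	}
	vq_err(vq, "sendmsg failed: %d, dropping packet\n", err);
	return false;
}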
Re: x86/ioapic: Cleanup the timer_works() irqflags mess
On 10.12.20 21:15, Thomas Gleixner wrote:
> Mark tripped over the creative irqflags handling in the IO-APIC timer
> delivery check, which ends up doing:
>
>	local_irq_save(flags);
>	local_irq_enable();
>	local_irq_restore(flags);
>
> which triggered a new consistency check he's working on, required for
> replacing the POPF-based restore with a conditional STI.
>
> That code is a historical mess and none of this is needed. Make it
> straightforward and use local_irq_disable()/enable(), as that's all
> that is required. It is invoked from interrupt-enabled code nowadays.
>
> Reported-by: Mark Rutland
> Signed-off-by: Thomas Gleixner
> Tested-by: Mark Rutland

Reviewed-by: Juergen Gross

Juergen