[PATCH] drm/amdgpu: nuke the ih reentrant lock
Interrupts on are non-reentrant on linux. This is just an ancient leftover from radeon where irq processing was kicked of from different places. Signed-off-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - 3 files changed, 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a15f1b604733..886625fb464b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* mutex initialization are all done here so we * can recall function without having locking issues */ - atomic_set(&adev->irq.ih.lock, 0); mutex_init(&adev->firmware.mutex); mutex_init(&adev->pm.mutex); mutex_init(&adev->gfx.gpu_clock_mutex); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index 1024065f1f03..faaa6aa2faaf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) wptr = amdgpu_ih_get_wptr(adev, ih); restart_ih: - /* is somebody else already processing irqs? */ - if (atomic_xchg(&ih->lock, 1)) - return IRQ_NONE; - DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); /* Order reading of wptr vs. reading of IH ring data */ @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) amdgpu_ih_set_rptr(adev, ih); wake_up_all(&ih->wait_process); - atomic_set(&ih->lock, 0); /* make sure wptr hasn't changed while processing */ wptr = amdgpu_ih_get_wptr(adev, ih); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 87ec6d20dbe0..0649b59830a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { boolenabled; unsignedrptr; - atomic_tlock; struct amdgpu_ih_regs ih_regs; /* For waiting on IH processing at checkpoint. */ -- 2.25.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock
On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote: > Interrupts on are non-reentrant on linux. This is just an ancient > leftover from radeon where irq processing was kicked of from different > places. > > Signed-off-by: Christian König Man you tricked me into grepping this on radeon and it looks horrible. atomic_t is unordered in linux, so whatever was built there for radeon does not wokr like a lock. It's missing all the barriers afiui. Good riddance at least for amdgpu. Reviewed-by: Daniel Vetter > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - > 3 files changed, 7 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index a15f1b604733..886625fb464b 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, > > /* mutex initialization are all done here so we >* can recall function without having locking issues */ > - atomic_set(&adev->irq.ih.lock, 0); > mutex_init(&adev->firmware.mutex); > mutex_init(&adev->pm.mutex); > mutex_init(&adev->gfx.gpu_clock_mutex); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > index 1024065f1f03..faaa6aa2faaf 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct > amdgpu_ih_ring *ih) > wptr = amdgpu_ih_get_wptr(adev, ih); > > restart_ih: > - /* is somebody else already processing irqs? */ > - if (atomic_xchg(&ih->lock, 1)) > - return IRQ_NONE; > - > DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); > > /* Order reading of wptr vs. reading of IH ring data */ > @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct > amdgpu_ih_ring *ih) > > amdgpu_ih_set_rptr(adev, ih); > wake_up_all(&ih->wait_process); > - atomic_set(&ih->lock, 0); > > /* make sure wptr hasn't changed while processing */ > wptr = amdgpu_ih_get_wptr(adev, ih); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > index 87ec6d20dbe0..0649b59830a5 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { > > boolenabled; > unsignedrptr; > - atomic_tlock; > struct amdgpu_ih_regs ih_regs; > > /* For waiting on IH processing at checkpoint. */ > -- > 2.25.1 > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] dma-fence: Document recoverable page fault implications
On Wed, Feb 03, 2021 at 04:40:38PM +0100, Christian König wrote: > Am 03.02.21 um 16:29 schrieb Daniel Vetter: > > Recently there was a fairly long thread about recoreable hardware page > > faults, how they can deadlock, and what to do about that. > > > > While the discussion is still fresh I figured good time to try and > > document the conclusions a bit. This documentation section explains > > what's the potential problem, and the remedies we've discussed, > > roughly ordered from best to worst. > > > > v2: Linus -> Linux typoe (Dave) > > > > v3: > > - Make it clear drivers only need to implement one option (Christian) > > - Make it clearer that implicit sync is out the window with exclusive > >fences (Christian) > > - Add the fairly theoretical option of segementing the memory (either > >statically or through dynamic checks at runtime for which piece of > >memory is managed how) and explain why it's not a great idea (Felix) > > > > References: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20210107030127.20393-1-Felix.Kuehling%40amd.com%2F&data=04%7C01%7Cchristian.koenig%40amd.com%7C767e1096b9554ab5b6dd08d8c8587f0f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637479629728871138%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Fth2y8c3LuNbweQGrsS7VjYESGlshu6DfQyikiWBwyQ%3D&reserved=0 > > Cc: Dave Airlie > > Cc: Maarten Lankhorst > > Cc: Thomas Hellström > > Cc: "Christian König" > > Cc: Jerome Glisse > > Cc: Felix Kuehling > > Signed-off-by: Daniel Vetter > > Cc: Sumit Semwal > > Cc: linux-me...@vger.kernel.org > > Cc: linaro-mm-...@lists.linaro.org > > Reviewed-by: Christian König Pulled this in, thanks everyone who helped clarify the situation here. Cheers, Daniel > I still haven't fully given up on supporting implicit sync with user fences, > but it is really an eeek, let's try very hard to avoid that, problem. > > Christian > > > --- > > Documentation/driver-api/dma-buf.rst | 76 > > 1 file changed, 76 insertions(+) > > > > diff --git a/Documentation/driver-api/dma-buf.rst > > b/Documentation/driver-api/dma-buf.rst > > index a2133d69872c..7f37ec30d9fd 100644 > > --- a/Documentation/driver-api/dma-buf.rst > > +++ b/Documentation/driver-api/dma-buf.rst > > @@ -257,3 +257,79 @@ fences in the kernel. This means: > > userspace is allowed to use userspace fencing or long running compute > > workloads. This also means no implicit fencing for shared buffers in > > these > > cases. > > + > > +Recoverable Hardware Page Faults Implications > > +~ > > + > > +Modern hardware supports recoverable page faults, which has a lot of > > +implications for DMA fences. > > + > > +First, a pending page fault obviously holds up the work that's running on > > the > > +accelerator and a memory allocation is usually required to resolve the > > fault. > > +But memory allocations are not allowed to gate completion of DMA fences, > > which > > +means any workload using recoverable page faults cannot use DMA fences for > > +synchronization. Synchronization fences controlled by userspace must be > > used > > +instead. > > + > > +On GPUs this poses a problem, because current desktop compositor protocols > > on > > +Linux rely on DMA fences, which means without an entirely new userspace > > stack > > +built on top of userspace fences, they cannot benefit from recoverable page > > +faults. Specifically this means implicit synchronization will not be > > possible. > > +The exception is when page faults are only used as migration hints and > > never to > > +on-demand fill a memory request. For now this means recoverable page > > +faults on GPUs are limited to pure compute workloads. > > + > > +Furthermore GPUs usually have shared resources between the 3D rendering and > > +compute side, like compute units or command submission engines. If both a > > 3D > > +job with a DMA fence and a compute workload using recoverable page faults > > are > > +pending they could deadlock: > > + > > +- The 3D workload might need to wait for the compute job to finish and > > release > > + hardware resources first. > > + > > +- The compute workload might be stuck in a page fault, because the memory > > + allocation is waiting for the DMA fence of the 3D workload to complete. > > + > > +There are a few options to prevent this problem, one of which drivers need > > to > > +ensure: > > + > > +- Compute workloads can always be preempted, even when a page fault is > > pending > > + and not yet repaired. Not all hardware supports this. > > + > > +- DMA fence workloads and workloads which need page fault handling have > > + independent hardware resources to guarantee forward progress. This could > > be > > + achieved through e.g. through dedicated engines and minimal compute unit > > + reservations for DMA fence workl
Re: [PATCH v2] fb_defio: Remove custom address_space_operations
On Wed, Mar 10, 2021 at 06:55:30PM +, Matthew Wilcox (Oracle) wrote: > There's no need to give the page an address_space. Leaving the > page->mapping as NULL will cause the VM to handle set_page_dirty() > the same way that it's handled now, and that was the only reason to > set the address_space in the first place. > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: Christoph Hellwig > Reviewed-by: William Kucharski Thanks for your patch, merged to drm-misc-next for 5.13. While I have an expert here, does this mean that for a VM_PFNMAP we could pull of the same trick without any struct page backing, assuming we pulle the per-page dirty state into some tracking of our own? I'm asking since for DRM drivers we currently have a fairly awkward situation with a bounce buffer in system memory going on that we copy out of, because we can't directly use the gpu buffers. If we can track directly in the gpu buffers, maybe even as some kind of overlay over the vma, we could avoid that copy. Otoh no one cares about fbcon performance, so *shrug*. Cheers, Daniel > --- > v2: Delete local variable definitions > drivers/video/fbdev/core/fb_defio.c | 35 - > drivers/video/fbdev/core/fbmem.c| 4 > include/linux/fb.h | 3 --- > 3 files changed, 42 deletions(-) > > diff --git a/drivers/video/fbdev/core/fb_defio.c > b/drivers/video/fbdev/core/fb_defio.c > index a591d291b231..b292887a2481 100644 > --- a/drivers/video/fbdev/core/fb_defio.c > +++ b/drivers/video/fbdev/core/fb_defio.c > @@ -52,13 +52,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault > *vmf) > return VM_FAULT_SIGBUS; > > get_page(page); > - > - if (vmf->vma->vm_file) > - page->mapping = vmf->vma->vm_file->f_mapping; > - else > - printk(KERN_ERR "no mapping available\n"); > - > - BUG_ON(!page->mapping); > page->index = vmf->pgoff; > > vmf->page = page; > @@ -151,17 +144,6 @@ static const struct vm_operations_struct > fb_deferred_io_vm_ops = { > .page_mkwrite = fb_deferred_io_mkwrite, > }; > > -static int fb_deferred_io_set_page_dirty(struct page *page) > -{ > - if (!PageDirty(page)) > - SetPageDirty(page); > - return 0; > -} > - > -static const struct address_space_operations fb_deferred_io_aops = { > - .set_page_dirty = fb_deferred_io_set_page_dirty, > -}; > - > int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma) > { > vma->vm_ops = &fb_deferred_io_vm_ops; > @@ -212,29 +194,12 @@ void fb_deferred_io_init(struct fb_info *info) > } > EXPORT_SYMBOL_GPL(fb_deferred_io_init); > > -void fb_deferred_io_open(struct fb_info *info, > - struct inode *inode, > - struct file *file) > -{ > - file->f_mapping->a_ops = &fb_deferred_io_aops; > -} > -EXPORT_SYMBOL_GPL(fb_deferred_io_open); > - > void fb_deferred_io_cleanup(struct fb_info *info) > { > struct fb_deferred_io *fbdefio = info->fbdefio; > - struct page *page; > - int i; > > BUG_ON(!fbdefio); > cancel_delayed_work_sync(&info->deferred_work); > - > - /* clear out the mapping that we setup */ > - for (i = 0 ; i < info->fix.smem_len; i += PAGE_SIZE) { > - page = fb_deferred_io_page(info, i); > - page->mapping = NULL; > - } > - > mutex_destroy(&fbdefio->lock); > } > EXPORT_SYMBOL_GPL(fb_deferred_io_cleanup); > diff --git a/drivers/video/fbdev/core/fbmem.c > b/drivers/video/fbdev/core/fbmem.c > index 06f5805de2de..372b52a2befa 100644 > --- a/drivers/video/fbdev/core/fbmem.c > +++ b/drivers/video/fbdev/core/fbmem.c > @@ -1415,10 +1415,6 @@ __releases(&info->lock) > if (res) > module_put(info->fbops->owner); > } > -#ifdef CONFIG_FB_DEFERRED_IO > - if (info->fbdefio) > - fb_deferred_io_open(info, inode, file); > -#endif > out: > unlock_fb_info(info); > if (res) > diff --git a/include/linux/fb.h b/include/linux/fb.h > index ecfbcc0553a5..a8dccd23c249 100644 > --- a/include/linux/fb.h > +++ b/include/linux/fb.h > @@ -659,9 +659,6 @@ static inline void __fb_pad_aligned_buffer(u8 *dst, u32 > d_pitch, > /* drivers/video/fb_defio.c */ > int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma); > extern void fb_deferred_io_init(struct fb_info *info); > -extern void fb_deferred_io_open(struct fb_info *info, > - struct inode *inode, > - struct file *file); > extern void fb_deferred_io_cleanup(struct fb_info *info); > extern int fb_deferred_io_fsync(struct file *file, loff_t start, > loff_t end, int datasync); > -- > 2.30.0 > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software
Re: [Intel-gfx] [PATCH] i915: Drop legacy execbuffer support
On Thu, Mar 11, 2021 at 10:31:33PM -0600, Jason Ekstrand wrote: > > On March 11, 2021 20:26:06 "Dixit, Ashutosh" wrote: > > > On Wed, 10 Mar 2021 13:00:49 -0800, Jason Ekstrand wrote: > > > > > > libdrm has supported the newer execbuffer2 ioctl and using it by default > > > when it exists since libdrm commit > > > b50964027bef249a0cc3d511de05c2464e0a1e22 > > > which landed Mar 2, 2010. The i915 and i965 drivers in Mesa at the time > > > both used libdrm and so did the Intel X11 back-end. The SNA back-end > > > for X11 has always used execbuffer2. > > > > > > Signed-off-by: Jason Ekstrand > > > --- > > > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 100 -- > > > drivers/gpu/drm/i915/gem/i915_gem_ioctls.h| 2 - > > > drivers/gpu/drm/i915/i915_drv.c | 2 +- > > > 3 files changed, 1 insertion(+), 103 deletions(-) > > > > Don't we want to clean up references to legacy execbuffer in > > include/uapi/drm/i915_drm.h too? > > I thought about that but Daniel said we should leave them. Maybe a comment > is in order? These headers are copied unchanged to userspace for building. We don't use kernel-headers packages directly in any of our userspace (I hope at least), but still better safe than sorry and avoid compilation failures simply due to updated uapi headers that lost a few old things. Also we need at least the struct size because that's encoded in the ioctl number, and at that point might as well keep the entire thing. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl
On Thu, 11 Mar 2021 12:16:33 + Steven Price wrote: > Also the current code completely ignores PANFROST_BO_REF_READ. So either > that should be defined as 0, or even better we support 3 modes: > > * Exclusive ('write' access) > * Shared ('read' access) > * No fence - ensures the BO is mapped, but doesn't add any implicit > fences. I looked at other drivers and they seem to have this NO_FENCE/NO_IMPLICIT flag at the job/submit level and conditioned on the presence of a in_sync_fd file. I can see per-BO NO_FENCE being useful for sub-buffer accesses, assuming we don't want to deal with other deps explicitly, but we could also force the NO_IMPLICIT behavior for the whole job and let the user deal with all deps explicitly. TBH, it's a bit unclear to me how that feature would be used by the vulkan driver, so I'd rather leave that NO_FENCE use case for later. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] i915: Drop relocation support on all new hardware (v3)
On Thu, Mar 11, 2021 at 12:57:25PM -0600, Jason Ekstrand wrote: > On Thu, Mar 11, 2021 at 12:20 PM Zbigniew Kempczyński > wrote: > > > > On Thu, Mar 11, 2021 at 11:18:11AM -0600, Jason Ekstrand wrote: > > > On Thu, Mar 11, 2021 at 10:51 AM Zbigniew Kempczyński > > > wrote: > > > > > > > > On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote: > > > > > On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter wrote: > > > > > > > > > > > > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand > > > > > > wrote: > > > > > > > > > > > > > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński > > > > > > > wrote: > > > > > > > > > > > > > > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote: > > > > > > > > > The Vulkan driver in Mesa for Intel hardware never uses > > > > > > > > > relocations if > > > > > > > > > it's running on a version of i915 that supports at least > > > > > > > > > softpin which > > > > > > > > > all versions of i915 supporting Gen12 do. On the OpenGL > > > > > > > > > side, Gen12+ is > > > > > > > > > only supported by iris which never uses relocations. The > > > > > > > > > older i965 > > > > > > > > > driver in Mesa does use relocations but it only supports > > > > > > > > > Intel hardware > > > > > > > > > through Gen11 and has been deprecated for all hardware Gen9+. > > > > > > > > > The > > > > > > > > > compute driver also never uses relocations. This only leaves > > > > > > > > > the media > > > > > > > > > driver which is supposed to be switching to softpin going > > > > > > > > > forward. > > > > > > > > > Making softpin a requirement for all future hardware seems > > > > > > > > > reasonable. > > > > > > > > > > > > > > > > > > Rejecting relocations starting with Gen12 has the benefit > > > > > > > > > that we don't > > > > > > > > > have to bother supporting it on platforms with local memory. > > > > > > > > > Given how > > > > > > > > > much CPU touching of memory is required for relocations, not > > > > > > > > > having to > > > > > > > > > do so on platforms where not all memory is directly > > > > > > > > > CPU-accessible > > > > > > > > > carries significant advantages. > > > > > > > > > > > > > > > > > > v2 (Jason Ekstrand): > > > > > > > > > - Allow TGL-LP platforms as they've already shipped > > > > > > > > > > > > > > > > > > v3 (Jason Ekstrand): > > > > > > > > > - WARN_ON platforms with LMEM support in case the check is > > > > > > > > > wrong > > > > > > > > > > > > > > > > I was asked to review of this patch. It works along with > > > > > > > > expected > > > > > > > > IGT check > > > > > > > > https://patchwork.freedesktop.org/patch/423361/?series=82954&rev=25 > > > > > > > > > > > > > > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() > > > > > > > > better place > > > > > > > > to do for loop just after copy_from_user() and check > > > > > > > > relocation_count? > > > > > > > > We have an access to exec2_list there, we know the gen so we're > > > > > > > > able to say > > > > > > > > relocations are not supported immediate, without entering > > > > > > > > i915_gem_do_execbuffer(). > > > > > > > > > > > > > > I considered that but it adds an extra object list walk for a case > > > > > > > which we expect to not happen. I'm not sure how expensive the > > > > > > > list > > > > > > > walk would be if all we do is check the number of relocations on > > > > > > > each > > > > > > > object. I guess, if it comes right after a copy_from_user, it's > > > > > > > all > > > > > > > hot in the cache so it shouldn't matter. Ok. I've convinced > > > > > > > myself. > > > > > > > I'll move it. > > > > > > > > > > > > I really wouldn't move it if it's another list walk. Execbuf has a > > > > > > lot > > > > > > of fast-paths going on, and we have extensive tests to make sure it > > > > > > unwinds correctly in all cases. It's not very intuitive, but execbuf > > > > > > code isn't scoring very high on that. > > > > > > > > > > And here I'd just finished doing the typing to move it. Good thing I > > > > > hadn't closed vim yet and it was still in my undo buffer. :-) > > > > > > > > Before entering "slower" path from my perspective I would just check > > > > batch object at that place. We still would have single list walkthrough > > > > and quick check on the very beginning. > > > > > > Can you be more specific about what exactly you think we can check at > > > the beginning? Either we add a flag that we can O(1) check at the > > > beginning or we have to check everything in exec2_list for > > > exec2_list[n].relocation_count == 0. That's a list walk. I'm not > > > seeing what up-front check you're thinking we can do without the list > > > walk. > > > > I expect that last (default) or first (I915_EXEC_BATCH_FIRST) execobj > > (batch) will likely has relocations. So we can check that single > > object without entering i915_gem_do_execbuffer(). If that check > > is missed (relocation_count = 0) you'll catch r
Re: [syzbot] upstream boot error: WARNING in vkms_vblank_simulate
On Fri, Mar 12, 2021 at 11:46:27AM +0100, Dmitry Vyukov wrote: > On Fri, Mar 12, 2021 at 11:26 AM syzbot > wrote: > > > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit:f78d76e7 Merge tag 'drm-fixes-2021-03-12-1' of git://anong.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=11c16ba2d0 > > kernel config: https://syzkaller.appspot.com/x/.config?x=dc02c6afcb046874 > > dashboard link: https://syzkaller.appspot.com/bug?extid=333bd014262fd5d0a418 > > userspace arch: arm > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+333bd014262fd5d0a...@syzkaller.appspotmail.com > > This WARNING seems to be happening just randomly. > It was already reported as: > > #syz dup: WARNING in vkms_vblank_simulate (2) > https://syzkaller.appspot.com/bug?id=9b10491371879700d6a21c15684c2232ff015084 > > It now has whooping 48589 crashes and also crashes slow qemu tcg instances. Yeah your box is too slow. We're trying to simulate hw here, which means if we can process less than 1 hrtimer per vblank (standard every 16ms) then we scream, because things go very wrong with the simulated hw. And the hrtimer is really not that big, all the expensive processing is pushed to worker, where we have code to handle if it falls back too much. So either patch this out or make the code robust against a kernel that somehow can't process a single hrtimer every 16ms. -Daniel > > > > > [ cut here ] > > WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/vkms/vkms_crtc.c:21 > > vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41 > > Modules linked in: > > CPU: 0 PID: 0 Comm: swapper/0 Not tainted > > 5.12.0-rc2-syzkaller-00338-gf78d76e72a46 #0 > > Hardware name: linux,dummy-virt (DT) > > pstate: 2085 (nzCv daIf -PAN -UAO -TCO BTYPE=--) > > pc : vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:21 > > lr : hrtimer_forward_now include/linux/hrtimer.h:510 [inline] > > lr : vkms_vblank_simulate+0x90/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:19 > > sp : 6a693cd0 > > x29: 6a693cd0 x28: 0c9d1e58 > > x27: dfff8000 x26: 6a67f540 > > x25: 1fffed4cfeb1 x24: 1fffed4cfeaa > > x23: 0c9d0d30 x22: 00fe4c00 > > x21: 6a67f540 x20: 0c9d0e58 > > x19: 0c9d1e58 x18: 6a6a1b48 > > x17: 1fffe1952345 x16: > > x15: 8000197bf810 x14: 1fffed4d2750 > > x13: 0001 x12: 0033 > > x11: 12fb4936 x10: 0007 > > x9 : 12fb4943 x8 : 800017d14c00 > > x7 : f1f1f1f1 x6 : dfff8000 > > x5 : 7fff x4 : 0008e44f6b90 > > x3 : 0008e54db790 x2 : 0008e44f6b90 > > x1 : 0008e54db790 x0 : 0002 > > Call trace: > > vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41 > > __run_hrtimer kernel/time/hrtimer.c:1519 [inline] > > __hrtimer_run_queues+0x590/0xe40 kernel/time/hrtimer.c:1583 > > hrtimer_interrupt+0x2d4/0x810 kernel/time/hrtimer.c:1645 > > timer_handler drivers/clocksource/arm_arch_timer.c:647 [inline] > > arch_timer_handler_phys+0x4c/0x70 drivers/clocksource/arm_arch_timer.c:665 > > handle_percpu_devid_irq+0x19c/0x330 kernel/irq/chip.c:930 > > generic_handle_irq_desc include/linux/irqdesc.h:158 [inline] > > generic_handle_irq kernel/irq/irqdesc.c:652 [inline] > > __handle_domain_irq+0x11c/0x1f0 kernel/irq/irqdesc.c:689 > > handle_domain_irq include/linux/irqdesc.h:176 [inline] > > gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370 > > el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669 > > arch_local_irq_restore arch/arm64/include/asm/irqflags.h:124 [inline] > > queue_work_on+0x74/0x110 kernel/workqueue.c:1528 > > queue_work include/linux/workqueue.h:507 [inline] > > cursor_timer_handler+0x64/0x100 drivers/video/fbdev/core/fbcon.c:397 > > call_timer_fn+0x1d4/0x9c4 kernel/time/timer.c:1431 > > expire_timers kernel/time/timer.c:1476 [inline] > > __run_timers.part.0+0x530/0xa00 kernel/time/timer.c:1745 > > __run_timers kernel/time/timer.c:1726 [inline] > > run_timer_softirq+0xa4/0x1a0 kernel/time/timer.c:1758 > > _stext+0x2b4/0x1084 > > do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] > > invoke_softirq kernel/softirq.c:228 [inline] > > __irq_exit_rcu+0x46c/0x510 kernel/softirq.c:422 > > irq_exit+0x14/0x84 kernel/softirq.c:446 > > __handle_domain_irq+0x120/0x1f0 kernel/irq/irqdesc.c:692 > > handle_domain_irq include/linux/irqdesc.h:176 [inline] > > gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370 > > el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669 > > arch_local_irq_enable+0xc/0x14 arch/arm64/include/asm/irqflags.h:37 > > default_idle_call+0x64/0xf4 kernel/sched/idle.c:112 > > cpuidle_idle_call kernel/sched/idle.c:194 [inline] > > do_idle+0x38c/0x4ec kernel/sched/idle.c:300 > > cpu_startup_entry+0x28/
Re: [PATCH] drm/exynos: move to use request_irq by IRQF_NO_AUTOEN flag
On Fri, Mar 12, 2021 at 07:43:05PM +0800, Tian Tao wrote: > After this patch cbe16f35bee68 genirq: Add IRQF_NO_AUTOEN for > request_irq/nmi() is merged. request_irq() after setting > IRQ_NOAUTOEN as below > > irq_set_status_flags(irq, IRQ_NOAUTOEN); > request_irq(dev, irq...); > can be replaced by request_irq() with IRQF_NO_AUTOEN flag. > > Signed-off-by: Tian Tao To get all your nice cleanups land faster I strongly recommend you review other people's small patches. And then ask them to review yours in return. You have already drm-misc commit rights, so should be all ready to go and do lots of great stuff! Cheers, Daniel > --- > drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 4 ++-- > drivers/gpu/drm/exynos/exynos_drm_dsi.c | 7 +++ > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c > b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c > index 1f79bc2..f530aff 100644 > --- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c > +++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c > @@ -775,8 +775,8 @@ static int decon_conf_irq(struct decon_context *ctx, > const char *name, > return irq; > } > } > - irq_set_status_flags(irq, IRQ_NOAUTOEN); > - ret = devm_request_irq(ctx->dev, irq, handler, flags, "drm_decon", ctx); > + ret = devm_request_irq(ctx->dev, irq, handler, > +flags | IRQ_NOAUTOEN, "drm_decon", ctx); > if (ret < 0) { > dev_err(ctx->dev, "IRQ %s request failed\n", name); > return ret; > diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c > b/drivers/gpu/drm/exynos/exynos_drm_dsi.c > index 83ab6b3..fd9b133b 100644 > --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c > +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c > @@ -1352,10 +1352,9 @@ static int exynos_dsi_register_te_irq(struct > exynos_dsi *dsi, > } > > te_gpio_irq = gpio_to_irq(dsi->te_gpio); > - irq_set_status_flags(te_gpio_irq, IRQ_NOAUTOEN); > > ret = request_threaded_irq(te_gpio_irq, exynos_dsi_te_irq_handler, NULL, > - IRQF_TRIGGER_RISING, "TE", dsi); > +IRQF_TRIGGER_RISING | IRQ_NOAUTOEN, "TE", > dsi); > if (ret) { > dev_err(dsi->dev, "request interrupt failed with %d\n", ret); > gpio_free(dsi->te_gpio); > @@ -1802,9 +1801,9 @@ static int exynos_dsi_probe(struct platform_device > *pdev) > if (dsi->irq < 0) > return dsi->irq; > > - irq_set_status_flags(dsi->irq, IRQ_NOAUTOEN); > ret = devm_request_threaded_irq(dev, dsi->irq, NULL, > - exynos_dsi_irq, IRQF_ONESHOT, > + exynos_dsi_irq, > + IRQF_ONESHOT | IRQ_NOAUTOEN, > dev_name(dev), dsi); > if (ret) { > dev_err(dev, "failed to request dsi irq\n"); > -- > 2.7.4 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock
Am 12.03.21 um 15:04 schrieb Daniel Vetter: On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote: Interrupts on are non-reentrant on linux. This is just an ancient leftover from radeon where irq processing was kicked of from different places. Signed-off-by: Christian König Man you tricked me into grepping this on radeon and it looks horrible. atomic_t is unordered in linux, so whatever was built there for radeon does not wokr like a lock. It's missing all the barriers afiui. Good riddance at least for amdgpu. Hui? atomic_xchg() is perfectly ordered as far as I know. IIRC Alex added this for R600 because we had boards where interrupts where sporadically swallowed during driver startup and we had to kick of ring buffer processing manually. Anyway we have a fence processing fallback timer in amdgpu instead and that stuff is probably unused on any modern hardware. Reviewed-by: Daniel Vetter Thanks, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - 3 files changed, 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a15f1b604733..886625fb464b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* mutex initialization are all done here so we * can recall function without having locking issues */ - atomic_set(&adev->irq.ih.lock, 0); mutex_init(&adev->firmware.mutex); mutex_init(&adev->pm.mutex); mutex_init(&adev->gfx.gpu_clock_mutex); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index 1024065f1f03..faaa6aa2faaf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) wptr = amdgpu_ih_get_wptr(adev, ih); restart_ih: - /* is somebody else already processing irqs? */ - if (atomic_xchg(&ih->lock, 1)) - return IRQ_NONE; - DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); /* Order reading of wptr vs. reading of IH ring data */ @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) amdgpu_ih_set_rptr(adev, ih); wake_up_all(&ih->wait_process); - atomic_set(&ih->lock, 0); /* make sure wptr hasn't changed while processing */ wptr = amdgpu_ih_get_wptr(adev, ih); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 87ec6d20dbe0..0649b59830a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { boolenabled; unsignedrptr; - atomic_tlock; struct amdgpu_ih_regs ih_regs; /* For waiting on IH processing at checkpoint. */ -- 2.25.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock
On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote: > > > Am 12.03.21 um 15:04 schrieb Daniel Vetter: > > On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote: > > > Interrupts on are non-reentrant on linux. This is just an ancient > > > leftover from radeon where irq processing was kicked of from different > > > places. > > > > > > Signed-off-by: Christian König > > Man you tricked me into grepping this on radeon and it looks horrible. > > atomic_t is unordered in linux, so whatever was built there for radeon > > does not wokr like a lock. It's missing all the barriers afiui. Good > > riddance at least for amdgpu. > > Hui? atomic_xchg() is perfectly ordered as far as I know. Oh right, if atomic_ returns a value it's fully ordered. Nice tour into memory-barriers.txt. It's the atomic_t operations that don't return anything which are fully unordered, and I mixed that up. > IIRC Alex added this for R600 because we had boards where interrupts where > sporadically swallowed during driver startup and we had to kick of ring > buffer processing manually. > > Anyway we have a fence processing fallback timer in amdgpu instead and that > stuff is probably unused on any modern hardware. Ah yes that stuff. Kinda annoying we have these old dma_fence around where dma_fence_wait doesn't really work that well, but oh well. -Daniel > > > Reviewed-by: Daniel Vetter > > Thanks, > Christian. > > > > > > --- > > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - > > > 3 files changed, 7 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > index a15f1b604733..886625fb464b 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, > > > /* mutex initialization are all done here so we > > >* can recall function without having locking issues */ > > > - atomic_set(&adev->irq.ih.lock, 0); > > > mutex_init(&adev->firmware.mutex); > > > mutex_init(&adev->pm.mutex); > > > mutex_init(&adev->gfx.gpu_clock_mutex); > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > index 1024065f1f03..faaa6aa2faaf 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, > > > struct amdgpu_ih_ring *ih) > > > wptr = amdgpu_ih_get_wptr(adev, ih); > > > restart_ih: > > > - /* is somebody else already processing irqs? */ > > > - if (atomic_xchg(&ih->lock, 1)) > > > - return IRQ_NONE; > > > - > > > DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); > > > /* Order reading of wptr vs. reading of IH ring data */ > > > @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, > > > struct amdgpu_ih_ring *ih) > > > amdgpu_ih_set_rptr(adev, ih); > > > wake_up_all(&ih->wait_process); > > > - atomic_set(&ih->lock, 0); > > > /* make sure wptr hasn't changed while processing */ > > > wptr = amdgpu_ih_get_wptr(adev, ih); > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > index 87ec6d20dbe0..0649b59830a5 100644 > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { > > > boolenabled; > > > unsignedrptr; > > > - atomic_tlock; > > > struct amdgpu_ih_regs ih_regs; > > > /* For waiting on IH processing at checkpoint. */ > > > -- > > > 2.25.1 > > > > > > ___ > > > dri-devel mailing list > > > dri-devel@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock
On Fri, Mar 12, 2021 at 03:35:50PM +0100, Daniel Vetter wrote: > On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote: > > > > > > Am 12.03.21 um 15:04 schrieb Daniel Vetter: > > > On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote: > > > > Interrupts on are non-reentrant on linux. This is just an ancient > > > > leftover from radeon where irq processing was kicked of from different > > > > places. > > > > > > > > Signed-off-by: Christian König > > > Man you tricked me into grepping this on radeon and it looks horrible. > > > atomic_t is unordered in linux, so whatever was built there for radeon > > > does not wokr like a lock. It's missing all the barriers afiui. Good > > > riddance at least for amdgpu. > > > > Hui? atomic_xchg() is perfectly ordered as far as I know. > > Oh right, if atomic_ returns a value it's fully ordered. Nice tour into > memory-barriers.txt. It's the atomic_t operations that don't return > anything which are fully unordered, and I mixed that up. > > > IIRC Alex added this for R600 because we had boards where interrupts where > > sporadically swallowed during driver startup and we had to kick of ring > > buffer processing manually. > > > > Anyway we have a fence processing fallback timer in amdgpu instead and that > > stuff is probably unused on any modern hardware. > > Ah yes that stuff. Kinda annoying we have these old dma_fence around where > dma_fence_wait doesn't really work that well, but oh well. Argh I'm not awake. dma_fence_wait() works well on these, but the generic cb stuff (which is used in ever more place) doesn't, because it misses the fallback timer the ->wait does. -Daniel > > > Reviewed-by: Daniel Vetter > > > > Thanks, > > Christian. > > > > > > > > > --- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - > > > > 3 files changed, 7 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > index a15f1b604733..886625fb464b 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, > > > > /* mutex initialization are all done here so we > > > > * can recall function without having locking issues */ > > > > - atomic_set(&adev->irq.ih.lock, 0); > > > > mutex_init(&adev->firmware.mutex); > > > > mutex_init(&adev->pm.mutex); > > > > mutex_init(&adev->gfx.gpu_clock_mutex); > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > > index 1024065f1f03..faaa6aa2faaf 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c > > > > @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, > > > > struct amdgpu_ih_ring *ih) > > > > wptr = amdgpu_ih_get_wptr(adev, ih); > > > > restart_ih: > > > > - /* is somebody else already processing irqs? */ > > > > - if (atomic_xchg(&ih->lock, 1)) > > > > - return IRQ_NONE; > > > > - > > > > DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); > > > > /* Order reading of wptr vs. reading of IH ring data */ > > > > @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, > > > > struct amdgpu_ih_ring *ih) > > > > amdgpu_ih_set_rptr(adev, ih); > > > > wake_up_all(&ih->wait_process); > > > > - atomic_set(&ih->lock, 0); > > > > /* make sure wptr hasn't changed while processing */ > > > > wptr = amdgpu_ih_get_wptr(adev, ih); > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > > index 87ec6d20dbe0..0649b59830a5 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h > > > > @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { > > > > boolenabled; > > > > unsignedrptr; > > > > - atomic_tlock; > > > > struct amdgpu_ih_regs ih_regs; > > > > /* For waiting on IH processing at checkpoint. */ > > > > -- > > > > 2.25.1 > > > > > > > > ___ > > > > dri-devel mailing list > > > > dri-devel@lists.freedesktop.org > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock
Am 12.03.21 um 15:36 schrieb Daniel Vetter: On Fri, Mar 12, 2021 at 03:35:50PM +0100, Daniel Vetter wrote: On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote: Am 12.03.21 um 15:04 schrieb Daniel Vetter: On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote: Interrupts on are non-reentrant on linux. This is just an ancient leftover from radeon where irq processing was kicked of from different places. Signed-off-by: Christian König Man you tricked me into grepping this on radeon and it looks horrible. atomic_t is unordered in linux, so whatever was built there for radeon does not wokr like a lock. It's missing all the barriers afiui. Good riddance at least for amdgpu. Hui? atomic_xchg() is perfectly ordered as far as I know. Oh right, if atomic_ returns a value it's fully ordered. Nice tour into memory-barriers.txt. It's the atomic_t operations that don't return anything which are fully unordered, and I mixed that up. IIRC Alex added this for R600 because we had boards where interrupts where sporadically swallowed during driver startup and we had to kick of ring buffer processing manually. Anyway we have a fence processing fallback timer in amdgpu instead and that stuff is probably unused on any modern hardware. Ah yes that stuff. Kinda annoying we have these old dma_fence around where dma_fence_wait doesn't really work that well, but oh well. Argh I'm not awake. dma_fence_wait() works well on these, but the generic cb stuff (which is used in ever more place) doesn't, because it misses the fallback timer the ->wait does. That's not what I meant. See amdgpu_fence_schedule_fallback(). As soon as somebody requested a dma_fence to be signaled we start a 500ms timeout for fence processing which is cleared as soon as we see the first interrupt. That is necessary because some motherboards have the ugly behavior of swallowing approx ~0.1% if all MSI interrupts after some idle time. No idea where exactly that comes from, but I have the feeling that this was the same bug Alex tried to handle on R6xx over 10 years ago with kicking of interrupt processing manually after enabling interrupts for the first time. Regards, Christian. -Daniel Reviewed-by: Daniel Vetter Thanks, Christian. --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 - drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 - 3 files changed, 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index a15f1b604733..886625fb464b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev, /* mutex initialization are all done here so we * can recall function without having locking issues */ - atomic_set(&adev->irq.ih.lock, 0); mutex_init(&adev->firmware.mutex); mutex_init(&adev->pm.mutex); mutex_init(&adev->gfx.gpu_clock_mutex); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c index 1024065f1f03..faaa6aa2faaf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) wptr = amdgpu_ih_get_wptr(adev, ih); restart_ih: - /* is somebody else already processing irqs? */ - if (atomic_xchg(&ih->lock, 1)) - return IRQ_NONE; - DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr); /* Order reading of wptr vs. reading of IH ring data */ @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct amdgpu_ih_ring *ih) amdgpu_ih_set_rptr(adev, ih); wake_up_all(&ih->wait_process); - atomic_set(&ih->lock, 0); /* make sure wptr hasn't changed while processing */ wptr = amdgpu_ih_get_wptr(adev, ih); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h index 87ec6d20dbe0..0649b59830a5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h @@ -64,7 +64,6 @@ struct amdgpu_ih_ring { boolenabled; unsignedrptr; - atomic_tlock; struct amdgpu_ih_regs ih_regs; /* For waiting on IH processing at checkpoint. */ -- 2.25.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2] drm/scheduler re-insert Bailing job to avoid memleak
On 2021-03-12 1:50 a.m., Jack Zhang wrote: re-insert Bailing jobs to avoid memory leak. Usually we put a v2:"Blha blha blha" here to explain what was modified in v2 Also - since you make changes to another driver you should add their maintainer and mailing list probably (use ./scripts/get_maintainer.pl) for this Signed-off-by: Jack Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++-- drivers/gpu/drm/panfrost/panfrost_job.c| 2 +- drivers/gpu/drm/scheduler/sched_main.c | 8 +++- include/drm/gpu_scheduler.h| 1 + 5 files changed, 18 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 79b9cc73763f..86463b0f936e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, job ? job->base.id : -1); /* even we skipped this reset, still need to set the job to guilty */ - if (job) + if (job) { drm_sched_increase_karma(&job->base); + r = DRM_GPU_SCHED_STAT_BAILING; + } goto skip_recovery; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 759b34799221..41390bdacd9e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) struct amdgpu_job *job = to_amdgpu_job(s_job); struct amdgpu_task_info ti; struct amdgpu_device *adev = ring->adev; + int ret; memset(&ti, 0, sizeof(struct amdgpu_task_info)); @@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) ti.process_name, ti.tgid, ti.task_name, ti.pid); if (amdgpu_device_should_recover_gpu(ring->adev)) { - amdgpu_device_gpu_recover(ring->adev, job); - return DRM_GPU_SCHED_STAT_NOMINAL; + ret = amdgpu_device_gpu_recover(ring->adev, job); + if (ret == DRM_GPU_SCHED_STAT_BAILING) + return DRM_GPU_SCHED_STAT_BAILING; + else + return DRM_GPU_SCHED_STAT_NOMINAL; } else { drm_sched_suspend_timeout(&ring->sched); if (amdgpu_sriov_vf(adev)) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 6003cfeb1322..c372f4a38736 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -456,7 +456,7 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct drm_sched_job /* Scheduler is already stopped, nothing to do. */ if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job)) - return DRM_GPU_SCHED_STAT_NOMINAL; + return DRM_GPU_SCHED_STAT_BAILING; Note that there is another early termination in panfrost at https://elixir.bootlin.com/linux/v5.11.1/source/drivers/gpu/drm/panfrost/panfrost_job.c#L445 So probably should also add there. /* Schedule a reset if there's no reset in progress. */ if (!atomic_xchg(&pfdev->reset.pending, 1)) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 92d8de24d0a1..a44f621fb5c4 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work) { struct drm_gpu_scheduler *sched; struct drm_sched_job *job; + int ret; sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); @@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct *work) list_del_init(&job->list); spin_unlock(&sched->job_list_lock); - job->sched->ops->timedout_job(job); + ret = job->sched->ops->timedout_job(job); + if (ret == DRM_GPU_SCHED_STAT_BAILING) { + spin_lock(&sched->job_list_lock); + list_add(&job->node, &sched->ring_mirror_list); + spin_unlock(&sched->job_list_lock); + } Just reiterating my comment from v1 here since u missed it - Problem here that since you already dropped the reset locks you are racing here now against other recovery threads as they process the same mirror list, and yet,I think this solution makes things better then they are now with the leak but still, it's only temporary band-aid until the full solution to be implemented as described earlier by Christian. Probably then worth mentioning here with a comment this it's a tem
Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl
On Fri, Mar 12, 2021 at 1:31 AM Boris Brezillon wrote: > > On Thu, 11 Mar 2021 12:11:48 -0600 > Jason Ekstrand wrote: > > > > > > > > 2/ Queued jobs might be executed out-of-order (unless they have > > > > > > > explicit/implicit deps between them), and Vulkan asks that > > > > > > > the out > > > > > > > fence be signaled when all jobs are done. Timeline syncobjs > > > > > > > are a > > > > > > > good match for that use case. All we need to do is pass the > > > > > > > same > > > > > > > fence syncobj to all jobs being attached to a single > > > > > > > QueueSubmit > > > > > > > request, but a different point on the timeline. The syncobj > > > > > > > timeline wait does the rest and guarantees that we've reached > > > > > > > a > > > > > > > given timeline point (IOW, all jobs before that point are > > > > > > > done) > > > > > > > before declaring the fence as signaled. > > > > > > > One alternative would be to have dummy 'synchronization' jobs > > > > > > > that > > > > > > > don't actually execute anything on the GPU but declare a > > > > > > > dependency > > > > > > > on all other jobs that are part of the QueueSubmit request, > > > > > > > and > > > > > > > signal the out fence (the scheduler would do most of the work > > > > > > > for > > > > > > > us, all we have to do is support NULL job heads and signal the > > > > > > > fence directly when that happens instead of queueing the job). > > > > > > > > > > > > I have to admit to being rather hazy on the details of timeline > > > > > > syncobjs, but I thought there was a requirement that the timeline > > > > > > moves > > > > > > monotonically. I.e. if you have multiple jobs signalling the same > > > > > > syncobj just with different points, then AFAIU the API requires > > > > > > that the > > > > > > points are triggered in order. > > > > > > > > > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so > > > > > I > > > > > might be wrong, but my understanding is that queuing fences (addition > > > > > of new points in the timeline) should happen in order, but signaling > > > > > can happen in any order. When checking for a signaled fence the > > > > > fence-chain logic starts from the last point (or from an explicit > > > > > point > > > > > if you use the timeline wait flavor) and goes backward, stopping at > > > > > the > > > > > first un-signaled node. If I'm correct, that means that fences that > > > > > are part of a chain can be signaled in any order. > > > > > > > > You don't even need a timeline for this. Just have a single syncobj > > > > per-queue and make each submit wait on it and then signal it. > > > > Alternatively, you can just always hang on to the out-fence from the > > > > previous submit and make the next one wait on that. > > > > > > That's what I have right now, but it forces the serialization of all > > > jobs that are pushed during a submit (and there can be more than one > > > per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd > > > be better to not force this serialization if we can avoid it. > > > > I'm not familiar with panfrost's needs and I don't work on a tiler and > > I know there are different issues there. But... > > > > The Vulkan spec requires that everything that all the submits that > > happen on a given vkQueue happen in-order. Search the spec for > > "Submission order" for more details. > > Duh, looks like I completely occulted the "Submission order" > guarantees. This being said, even after reading this chapter multiple > times I'm not sure what kind of guarantee this gives us, given the > execution itself can be out-of-order. My understanding is that > submission order matters for implicit deps, say you have 2 distinct > VkSubmitInfo, the first one (in submission order) writing to a buffer > and the second one reading from it, you really want the first one to > be submitted first and the second one to wait on the implicit BO fence > created by the first one. If we were to submit out-of-order, this > guarantee wouldn't be met. OTOH, if we have 2 completely independent > submits, I don't really see what submission order gives us if execution > is out-of-order. Right, this is where things get sticky. What's guaranteed there is submission order not execution order. But, sadly, there's more hidden in there than you might think. Before we can go there, though, we need to establish a few details of Mali hardware works. My understanding (feel free to correct me if I'm wrong) is that, roughly, Mali has two engines which have to work together to render: One engine for compute/vertex/geometry and one for binning and fragment. When you go to submit a draw, you fire off any geometry work on the compute engine and the fragment work on the binning engine. At a tile flush boundary (or renderpass in Vulkan), you insert a dependency between the geometry work you put on the compute engine and the binning/fr
[RFC 0/6] Default request/fence expiry + watchdog
From: Tvrtko Ursulin "Watchdog" aka "restoring hangcheck" aka default request/fence expiry - first post of a somewhat controversial feature so may be somewhat rough in commit messages, commentary and implementation. So only RFC for now. I parenthesise the "watchdog" becuase in classical sense watchdog would allow userspace to ping it and so remain alive. I parenthesise "restoring hangcheck" because this series, contrary to the old hangcheck, is not looking at whether the workload is making any progress from the kernel side either. (Althoguh disclaimer my memory may be leaky - Daniel suspects old hangcheck had some stricter, more indiscriminatory, angles to it. But apart from being prone to both false negatives and false positives I can't remember that myself.) Short version - ask is to fail any user submissions after a set time period. In this RFC that time is ten seconds. Time counts from the moment user submission is "runnable" (implicit and explicit dependencies have been cleared) and keeps counting regardless of the GPU contetion caused by other users of the system. So semantics are really a bit weak but again, I understand this is really wanted by the DRM core. As an attempt to compensate for this brutish nature, I proposed adding extendable configurability via a context param as part of the series. That could allow userspace to pick different semantics (always going more restrictive than the system default) and so implement interesting things like long desired media watchdog. Module trickyness of the implementation there. Test-with: 20210312093329.1639502-1-tvrtko.ursu...@linux.intel.com Cc: Daniel Vetter Chris Wilson (1): drm/i915: Individual request cancellation Tvrtko Ursulin (5): drm/i915: Restrict sentinel requests further drm/i915: Request watchdog infrastructure drm/i915: Allow userspace to configure the watchdog drm/i915: Fail too long user submissions by default drm/i915: Allow configuring default request expiry via modparam drivers/gpu/drm/i915/Kconfig.profile | 8 + drivers/gpu/drm/i915/gem/i915_gem_context.c | 92 ++ .../gpu/drm/i915/gem/i915_gem_context_types.h | 4 + drivers/gpu/drm/i915/gt/intel_context_param.h | 11 +- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 + .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 1 + .../drm/i915/gt/intel_execlists_submission.c | 11 +- .../drm/i915/gt/intel_execlists_submission.h | 2 + drivers/gpu/drm/i915/gt/intel_gt.c| 3 + drivers/gpu/drm/i915/gt/intel_gt.h| 2 + drivers/gpu/drm/i915/gt/intel_gt_requests.c | 21 ++ drivers/gpu/drm/i915/gt/intel_gt_types.h | 7 + drivers/gpu/drm/i915/i915_params.c| 5 + drivers/gpu/drm/i915/i915_params.h| 1 + drivers/gpu/drm/i915/i915_request.c | 129 +++- drivers/gpu/drm/i915/i915_request.h | 12 +- drivers/gpu/drm/i915/selftests/i915_request.c | 275 ++ include/uapi/drm/i915_drm.h | 5 +- 18 files changed, 584 insertions(+), 9 deletions(-) -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFC 1/6] drm/i915: Individual request cancellation
From: Chris Wilson Currently, we cancel outstanding requests within a context when the context is closed. We may also want to cancel individual requests using the same graceful preemption mechanism. v2 (Tvrtko): * Cancel waiters carefully considering no timeline lock and RCU. * Fixed selftests. Signed-off-by: Chris Wilson Signed-off-by: Tvrtko Ursulin --- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 1 + .../drm/i915/gt/intel_execlists_submission.c | 9 +- drivers/gpu/drm/i915/i915_request.c | 77 - drivers/gpu/drm/i915/i915_request.h | 4 +- drivers/gpu/drm/i915/selftests/i915_request.c | 275 ++ 5 files changed, 360 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 0b062fad1837..e2fb3ae2aaf3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -314,6 +314,7 @@ int intel_engine_pulse(struct intel_engine_cs *engine) mutex_unlock(&ce->timeline->mutex); } + intel_engine_flush_scheduler(engine); intel_engine_pm_put(engine); return err; } diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 85ff5fe861b4..4c2acb5a6c0a 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -421,6 +421,11 @@ static void reset_active(struct i915_request *rq, ce->lrc.lrca = lrc_update_regs(ce, engine, head); } +static bool bad_request(const struct i915_request *rq) +{ + return rq->fence.error && i915_request_started(rq); +} + static struct intel_engine_cs * __execlists_schedule_in(struct i915_request *rq) { @@ -433,7 +438,7 @@ __execlists_schedule_in(struct i915_request *rq) !intel_engine_has_heartbeat(engine))) intel_context_set_banned(ce); - if (unlikely(intel_context_is_banned(ce))) + if (unlikely(intel_context_is_banned(ce) || bad_request(rq))) reset_active(rq, engine); if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) @@ -1112,7 +1117,7 @@ static unsigned long active_preempt_timeout(struct intel_engine_cs *engine, return 0; /* Force a fast reset for terminated contexts (ignoring sysfs!) */ - if (unlikely(intel_context_is_banned(rq->context))) + if (unlikely(intel_context_is_banned(rq->context) || bad_request(rq))) return 1; return READ_ONCE(engine->props.preempt_timeout_ms); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index e7b4c4bc41a6..fb9c5bb1fe41 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -33,7 +33,10 @@ #include "gem/i915_gem_context.h" #include "gt/intel_breadcrumbs.h" #include "gt/intel_context.h" +#include "gt/intel_engine.h" +#include "gt/intel_engine_heartbeat.h" #include "gt/intel_gpu_commands.h" +#include "gt/intel_reset.h" #include "gt/intel_ring.h" #include "gt/intel_rps.h" @@ -429,20 +432,22 @@ void __i915_request_skip(struct i915_request *rq) rq->infix = rq->postfix; } -void i915_request_set_error_once(struct i915_request *rq, int error) +bool i915_request_set_error_once(struct i915_request *rq, int error) { int old; GEM_BUG_ON(!IS_ERR_VALUE((long)error)); if (i915_request_signaled(rq)) - return; + return false; old = READ_ONCE(rq->fence.error); do { if (fatal_error(old)) - return; + return false; } while (!try_cmpxchg(&rq->fence.error, &old, error)); + + return true; } struct i915_request *i915_request_mark_eio(struct i915_request *rq) @@ -609,6 +614,72 @@ void i915_request_unsubmit(struct i915_request *request) spin_unlock_irqrestore(&se->lock, flags); } +static struct intel_engine_cs *active_engine(struct i915_request *rq) +{ + struct intel_engine_cs *engine, *locked; + + locked = READ_ONCE(rq->engine); + spin_lock_irq(&locked->sched.lock); + while (unlikely(locked != (engine = READ_ONCE(rq->engine { + spin_unlock(&locked->sched.lock); + locked = engine; + spin_lock(&locked->sched.lock); + } + + engine = NULL; + if (i915_request_is_active(rq) && !__i915_request_is_complete(rq)) + engine = locked; + + spin_unlock_irq(&locked->sched.lock); + + return engine; +} + +static void __cancel_request(struct i915_request *rq) +{ + struct intel_engine_cs *engine = active_engine(rq); + + if (engine && intel_engine_pulse(engine)) + intel_gt_handle_error(engine->gt, engine->mask, 0, + "request cancellation by
[RFC 3/6] drm/i915: Request watchdog infrastructure
From: Tvrtko Ursulin Prepares the plumbing for setting request/fence expiration time. All code is put in place but is never activeted due yet missing ability to actually configure the timer. Outline of the basic operation: A timer is started when request is ready for execution. If the request completes (retires) before the timer fires, timer is cancelled and nothing further happens. If the timer fires request is added to a lockless list and worker queued. Purpose of this is twofold: a) It allows request cancellation from a more friendly context and b) coalesces multiple expirations into a single event of consuming the list. Worker locklessly consumes the list of expired requests and cancels them all using previous added i915_request_cancel(). Associated timeout value is stored in rq->context.watchdog.timeout_us. Signed-off-by: Tvrtko Ursulin Cc: Daniel Vetter --- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 ++ .../drm/i915/gt/intel_execlists_submission.h | 2 + drivers/gpu/drm/i915/gt/intel_gt.c| 3 ++ drivers/gpu/drm/i915/gt/intel_gt.h| 2 + drivers/gpu/drm/i915/gt/intel_gt_requests.c | 21 drivers/gpu/drm/i915/gt/intel_gt_types.h | 7 +++ drivers/gpu/drm/i915/i915_request.c | 52 +++ drivers/gpu/drm/i915/i915_request.h | 8 +++ 8 files changed, 99 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 0ea18c9e2aca..65a5730a4f5b 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -99,6 +99,10 @@ struct intel_context { #define CONTEXT_FORCE_SINGLE_SUBMISSION7 #define CONTEXT_NOPREEMPT 8 + struct { + u64 timeout_us; + } watchdog; + u32 *lrc_reg_state; union { struct { diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h index f7bd3fccfee8..4ca9b475e252 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h @@ -6,6 +6,7 @@ #ifndef __INTEL_EXECLISTS_SUBMISSION_H__ #define __INTEL_EXECLISTS_SUBMISSION_H__ +#include #include struct drm_printer; @@ -13,6 +14,7 @@ struct drm_printer; struct i915_request; struct intel_context; struct intel_engine_cs; +struct intel_gt; enum { INTEL_CONTEXT_SCHEDULE_IN = 0, diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index ca76f93bc03d..8d77dcbad059 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -31,6 +31,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915) INIT_LIST_HEAD(>->closed_vma); spin_lock_init(>->closed_lock); + init_llist_head(>->watchdog.list); + INIT_WORK(>->watchdog.work, intel_gt_watchdog_work); + intel_gt_init_buffer_pool(gt); intel_gt_init_reset(gt); intel_gt_init_requests(gt); diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index a17bd8b3195f..7ec395cace69 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -78,4 +78,6 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt) void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p); +void intel_gt_watchdog_work(struct work_struct *work); + #endif /* __INTEL_GT_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c index 36ec97f79174..df12fc7f0aa7 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c @@ -8,6 +8,7 @@ #include "i915_drv.h" /* for_each_engine() */ #include "i915_request.h" #include "intel_engine_heartbeat.h" +#include "intel_execlists_submission.h" #include "intel_gt.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" @@ -242,4 +243,24 @@ void intel_gt_fini_requests(struct intel_gt *gt) { /* Wait until the work is marked as finished before unloading! */ cancel_delayed_work_sync(>->requests.retire_work); + + flush_work(>->watchdog.work); +} + +void intel_gt_watchdog_work(struct work_struct *work) +{ + struct intel_gt *gt = + container_of(work, typeof(*gt), watchdog.work); + struct i915_request *rq, *rn; + struct llist_node *first; + + first = llist_del_all(>->watchdog.list); + if (!first) + return; + + llist_for_each_entry_safe(rq, rn, first, watchdog.link) { + if (!i915_request_completed(rq)) + i915_request_cancel(rq, -EINTR); + i915_request_put(rq); + } } diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index 626
[RFC 2/6] drm/i915: Restrict sentinel requests further
From: Tvrtko Ursulin Disallow sentinel requests follow previous sentinels to make request cancellation work better when faced with a chain of requests which have all been marked as in error. Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 4c2acb5a6c0a..4b870eca9693 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -896,7 +896,7 @@ static bool can_merge_rq(const struct i915_request *prev, if (__i915_request_is_complete(next)) return true; - if (unlikely((i915_request_flags(prev) ^ i915_request_flags(next)) & + if (unlikely((i915_request_flags(prev) | i915_request_flags(next)) & (BIT(I915_FENCE_FLAG_NOPREEMPT) | BIT(I915_FENCE_FLAG_SENTINEL return false; -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFC 4/6] drm/i915: Allow userspace to configure the watchdog
From: Tvrtko Ursulin Idea here is to make the watchdog mechanism more useful than for just default request/fence expiry. To this effect a new context param I915_CONTEXT_PARAM_WATCHDOG is added where the value fields allows passing in a timeout in micro-seconds. This allows userspace to set a limit to how long they expect their batches to take, or otherwise they will be cancelled, and userspace notified via one of the available mechanisms. Main attractiveness of adding uapi here is perhaps to extend the proposal by passing in a structure instead of a single value, like for illustration only: struct drm_i915_gem_context_watchdog { __u64 flags; #define I915_CONTEXT_WATCHDOG_WALL_TIMEBIT(0) #define I915_CONTEXT_WATCHDOG_GPU_TIME BIT(1) #define I915_CONTEXT_WATCHDOG_FROM_SUBMIT BIT(2) #define I915_CONTEXT_WATCHDOG_FROM_RUNNABLEBIT(3) __64 timeout_us; }; Point being to prepare the uapi for different semantics from the start. Given how not a single one makes complete sense for all use cases. And also perhaps satisfy the long wanted media watchdog feature request. Signed-off-by: Tvrtko Ursulin Cc: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 57 +++ .../gpu/drm/i915/gem/i915_gem_context_types.h | 4 ++ drivers/gpu/drm/i915/gt/intel_context_param.h | 11 +++- include/uapi/drm/i915_drm.h | 5 +- 4 files changed, 75 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index ca37d93ef5e7..32b05af4fc8f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -233,6 +233,8 @@ static void intel_context_set_gem(struct intel_context *ce, if (ctx->sched.priority >= I915_PRIORITY_NORMAL && intel_engine_has_timeslices(ce->engine)) __set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags); + + intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us); } static void __free_engines(struct i915_gem_engines *e, unsigned int count) @@ -1397,6 +1399,28 @@ static int set_ringsize(struct i915_gem_context *ctx, __intel_context_ring_size(args->value)); } +static int __apply_watchdog(struct intel_context *ce, void *timeout_us) +{ + return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us); +} + +static int set_watchdog(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + int ret; + + if (args->size) + return -EINVAL; + + ret = context_apply_all(ctx, __apply_watchdog, + (void *)(uintptr_t)args->value); + + if (!ret) + ctx->watchdog.timeout_us = args->value; + + return ret; +} + static int __get_ringsize(struct intel_context *ce, void *arg) { long sz; @@ -1426,6 +1450,17 @@ static int get_ringsize(struct i915_gem_context *ctx, return 0; } +static int get_watchdog(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + if (args->size) + return -EINVAL; + + args->value = ctx->watchdog.timeout_us; + + return 0; +} + int i915_gem_user_to_context_sseu(struct intel_gt *gt, const struct drm_i915_gem_context_param_sseu *user, @@ -2075,6 +2110,10 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv, ret = set_ringsize(ctx, args); break; + case I915_CONTEXT_PARAM_WATCHDOG: + ret = set_watchdog(ctx, args); + break; + case I915_CONTEXT_PARAM_BAN_PERIOD: default: ret = -EINVAL; @@ -2196,6 +2235,19 @@ static int clone_schedattr(struct i915_gem_context *dst, return 0; } +static int clone_watchdog(struct i915_gem_context *dst, + struct i915_gem_context *src) +{ + int ret; + + ret = context_apply_all(dst, __apply_watchdog, + (void *)(uintptr_t)src->watchdog.timeout_us); + if (!ret) + dst->watchdog = src->watchdog; + + return ret; +} + static int clone_sseu(struct i915_gem_context *dst, struct i915_gem_context *src) { @@ -2279,6 +2331,7 @@ static int create_clone(struct i915_user_extension __user *ext, void *data) MAP(SSEU, clone_sseu), MAP(TIMELINE, clone_timeline), MAP(VM, clone_vm), + MAP(WATCHDOG, clone_watchdog), #undef MAP }; struct drm_i915_gem_context_create_ext_clone local; @@ -2532,6 +2585,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, ret = get_ringsize(ctx, args); break; + case I915_CONTEXT_PARAM_WATCHDOG: + ret = get_watchdog(ctx, args); + break; +
[RFC 5/6] drm/i915: Fail too long user submissions by default
From: Tvrtko Ursulin A new Kconfig option CONFIG_DRM_I915_REQUEST_TIMEOUT is added, defaulting to 10s, and this timeout is applied to _all_ contexts using the previously added watchdog facility. Result of this is that any user submission will simply fail after this time, either causing a reset (for non-preemptable) or incomplete results. This can have an effect that workloads which used to work fine will suddenly start failing. When the default expiry is active userspace will not be allowed to decrease the timeout using the context param setting. Signed-off-by: Tvrtko Ursulin Cc: Daniel Vetter --- drivers/gpu/drm/i915/Kconfig.profile| 8 drivers/gpu/drm/i915/gem/i915_gem_context.c | 47 ++--- 2 files changed, 48 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/Kconfig.profile b/drivers/gpu/drm/i915/Kconfig.profile index 35bbe2b80596..55e15773 100644 --- a/drivers/gpu/drm/i915/Kconfig.profile +++ b/drivers/gpu/drm/i915/Kconfig.profile @@ -1,3 +1,11 @@ +config DRM_I915_REQUEST_TIMEOUT + int "Default timeout for requests (ms)" + default 1 # milliseconds + help + ... + + May be 0 to disable the timeout. + config DRM_I915_FENCE_TIMEOUT int "Timeout for unsignaled foreign fences (ms, jiffy granularity)" default 1 # milliseconds diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 32b05af4fc8f..21c0176e27a0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -854,6 +854,25 @@ static void __assign_timeline(struct i915_gem_context *ctx, context_apply_all(ctx, __apply_timeline, timeline); } +static int +__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us); + +static void __set_default_fence_expiry(struct i915_gem_context *ctx) +{ + struct drm_i915_private *i915 = ctx->i915; + int ret; + + if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT)) + return; + + /* Default expiry for user fences. */ + ret = __set_watchdog(ctx, CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000); + if (ret) + drm_notice(&i915->drm, + "Failed to configure default fence expiry! (%d)", + ret); +} + static struct i915_gem_context * i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags) { @@ -898,6 +917,8 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags) intel_timeline_put(timeline); } + __set_default_fence_expiry(ctx); + trace_i915_context_create(ctx); return ctx; @@ -1404,23 +1425,35 @@ static int __apply_watchdog(struct intel_context *ce, void *timeout_us) return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us); } -static int set_watchdog(struct i915_gem_context *ctx, - struct drm_i915_gem_context_param *args) +static int +__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us) { int ret; - if (args->size) - return -EINVAL; - ret = context_apply_all(ctx, __apply_watchdog, - (void *)(uintptr_t)args->value); + (void *)(uintptr_t)timeout_us); if (!ret) - ctx->watchdog.timeout_us = args->value; + ctx->watchdog.timeout_us = timeout_us; return ret; } +static int set_watchdog(struct i915_gem_context *ctx, + struct drm_i915_gem_context_param *args) +{ + if (args->size) + return -EINVAL; + + /* Disallow disabling or configuring longer watchdog than default. */ + if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) && + (!args->value || +args->value > CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000)) + return -EPERM; + + return __set_watchdog(ctx, args->value); +} + static int __get_ringsize(struct intel_context *ce, void *arg) { long sz; -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFC 6/6] drm/i915: Allow configuring default request expiry via modparam
From: Tvrtko Ursulin Module parameter is added (request_timeout_ms) to allow configuring the default request/fence expiry. Default value is inherited from CONFIG_DRM_I915_REQUEST_TIMEOUT. Signed-off-by: Tvrtko Ursulin Cc: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 +--- drivers/gpu/drm/i915/i915_params.c | 5 + drivers/gpu/drm/i915/i915_params.h | 1 + 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 21c0176e27a0..1dae5e2514a9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -866,7 +866,7 @@ static void __set_default_fence_expiry(struct i915_gem_context *ctx) return; /* Default expiry for user fences. */ - ret = __set_watchdog(ctx, CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000); + ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000); if (ret) drm_notice(&i915->drm, "Failed to configure default fence expiry! (%d)", @@ -1442,13 +1442,15 @@ __set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us) static int set_watchdog(struct i915_gem_context *ctx, struct drm_i915_gem_context_param *args) { + struct drm_i915_private *i915 = ctx->i915; + if (args->size) return -EINVAL; /* Disallow disabling or configuring longer watchdog than default. */ - if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) && + if (i915->params.request_timeout_ms && (!args->value || -args->value > CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000)) +args->value > i915->params.request_timeout_ms * 1000)) return -EPERM; return __set_watchdog(ctx, args->value); diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index 6939634e56ed..0320878d96b0 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -197,6 +197,11 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400, "Fake LMEM start offset (default: 0)"); #endif +#if CONFIG_DRM_I915_REQUEST_TIMEOUT +i915_param_named_unsafe(request_timeout_ms, uint, 0600, + "Default request/fence/batch buffer expiration timeout."); +#endif + static __always_inline void _print_param(struct drm_printer *p, const char *name, const char *type, diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h index 48f47e44e848..34ebb0662547 100644 --- a/drivers/gpu/drm/i915/i915_params.h +++ b/drivers/gpu/drm/i915/i915_params.h @@ -72,6 +72,7 @@ struct drm_printer; param(int, enable_dpcd_backlight, -1, 0600) \ param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \ param(unsigned long, fake_lmem_start, 0, 0400) \ + param(unsigned int, request_timeout_ms, CONFIG_DRM_I915_REQUEST_TIMEOUT, 0600) \ /* leave bools at the end to not create holes */ \ param(bool, enable_hangcheck, true, 0600) \ param(bool, load_detect_test, false, 0600) \ -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 210321] /display/dc/dcn20/dcn20_resource.c:3240 dcn20_validate_bandwidth_fp+0x8b/0xd0 [amdgpu]
https://bugzilla.kernel.org/show_bug.cgi?id=210321 --- Comment #5 from Tristen Hayfield (tristen.hayfi...@gmail.com) --- I did some more digging into this. I put some logging inside the if block to see if that branch is ever taken: if (voltage_supported && dummy_pstate_supported) { context->bw_ctx.bw.dcn.clk.p_state_change_support = false; goto restore_dml_state; } in order to log when or if the fallback worked. The logs confirmed that the fallback is often used and generally works. Upon starting up the system and starting up Xorg I get about a dozen log messages indicating that it entered the if block. The only exception seems to be as Florian describes above, that when the display shuts off due to power-saving it triggers the assertion. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk
On Thu, 11 Mar 2021 at 21:00, Daniel Gomez wrote: > On Thu, 11 Mar 2021 at 17:10, Alex Deucher wrote: > > > > On Thu, Mar 11, 2021 at 10:02 AM Alexandre Desnoyers > wrote: > > > > > > On Thu, Mar 11, 2021 at 2:49 PM Daniel Gomez wrote: > > > > > > > > On Thu, 11 Mar 2021 at 10:09, Daniel Gomez wrote: > > > > > > > > > > On Wed, 10 Mar 2021 at 18:06, Alex Deucher > wrote: > > > > > > > > > > > > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez > wrote: > > > > > > > > > > > > > > Disabling GFXOFF via the quirk list fixes a hardware lockup in > > > > > > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83. > > > > > > > > > > > > > > Signed-off-by: Daniel Gomez > > > > > > > --- > > > > > > > > > > > > > > This patch is a continuation of the work here: > > > > > > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup > was discussed and > > > > > > > a dma_fence deadlock was provoke as a side effect. To > reproduce the issue > > > > > > > please refer to the above link. > > > > > > > > > > > > > > The hardware lockup was introduced in 5.6-rc1 for our > particular revision as it > > > > > > > wasn't part of the new blacklist. Before that, in kernel v5.5, > this hardware was > > > > > > > working fine without any hardware lock because the GFXOFF was > actually disabled > > > > > > > by the if condition for the CHIP_RAVEN case. So this patch, > adds the 'Radeon > > > > > > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to > disable the GFXOFF. > > > > > > > > > > > > > > But besides the fix, I'd like to ask from where this revision > comes from. Is it > > > > > > > an ASIC revision or is it hardcoded in the VBIOS from our > vendor? From what I > > > > > > > can see, it comes from the ASIC and I wonder if somehow we can > get an APU in the > > > > > > > future, 'not blacklisted', with the same problem. Then, should > this table only > > > > > > > filter for the vendor and device and not the revision? Do you > know if there are > > > > > > > any revisions for the 1002:15dd validated, tested and > functional? > > > > > > > > > > > > The pci revision id (RID) is used to specify the specific SKU > within a > > > > > > family. GFXOFF is supposed to be working on all raven > variants. It > > > > > > was tested and functional on all reference platforms and any OEM > > > > > > platforms that launched with Linux support. There are a lot of > > > > > > dependencies on sbios in the early raven variants (0x15dd), so > it's > > > > > > likely more of a specific platform issue, but there is not a > good way > > > > > > to detect this so we use the DID/SSID/RID as a proxy. The newer > raven > > > > > > variants (0x15d8) have much better GFXOFF support since they all > > > > > > shipped with newer firmware and sbios. > > > > > > > > > > We took one of the first reference platform boards to design our > > > > > custom board based on the V1605B and I assume it has one of the > early 'unstable' > > > > > raven variants with RID 0x83. Also, as OEM we are in control of > the bios > > > > > (provided by insyde) but I wasn't sure about the RID so, thanks > for the > > > > > clarification. Is there anything we can do with the bios to have > the GFXOFF > > > > > enabled and 'stable' for this particular revision? Otherwise we'd > need to add > > > > > the 0x83 RID to the table. Also, there is an extra ']' in the patch > > > > > subject. Sorry > > > > > for that. Would you need a new patch in case you accept it with > the ']' removed? > > > > > > > > > > Good to hear that the newer raven versions have better GFXOFF > support. > > > > > > > > Adding Alex Desnoyer to the loop as he is the electronic/hardware and > > > > bios responsible so, he can > > > > provide more information about this. > > > > > > Hello everyone, > > > > > > We, Qtechnology, are the OEM of the hardware platform where we > > > originally discovered the bug. Our platform is based on the AMD > > > Dibbler V-1000 reference design, with the latest Insyde BIOS release > > > available for the (now unsupported) Dibbler platform. We have the > > > Insyde BIOS source code internally, so we can make some modifications > > > as needed. > > > > > > The last test that Daniel and myself performed was on a standard > > > Dibbler PCB rev.B1 motherboard (NOT our platform), and using the > > > corresponding latest AMD released BIOS "RDB1109GA". As Daniel wrote, > > > the hardware lockup can be reproduced on the Dibbler, even if it has a > > > different RID that our V1605B APU. > > > > > > We also have a Neousys Technology POC-515 embedded computer (V-1000, > > > V1605B) in our office. The Neousys PC also uses Insyde BIOS. This > > > computer is also locking-up in the test. > > > > https://www.neousys-tech.com/en/product/application/rugged-embedded/poc-500-amd-ryzen-ultra-compact-embedded-computer > > > > > > > > > Digging into the BIOS source code, the only reference to GFXOFF is in > > > the SMU and PSP firmware release notes, where some bug fixes have been
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-11 08:26, Christoph Hellwig wrote: On Wed, Mar 10, 2021 at 06:39:57PM +, Robin Murphy wrote: Actually... Just mirroring the iommu_dma_strict value into struct iommu_domain should solve all of that with very little boilerplate code. Yes, my initial thought was to directly replace the attribute with a common flag at iommu_domain level, but since in all cases the behaviour is effectively global rather than actually per-domain, it seemed reasonable to take it a step further. This passes compile-testing for arm64 and x86, what do you think? It seems to miss a few bits, and also generally seems to be not actually apply to recent mainline or something like it due to different empty lines in a few places. Yeah, that was sketched out on top of some other development patches, and in being so focused on not breaking any of the x86 behaviours I did indeed overlook fully converting the SMMU drivers... oops! (my thought was to do the conversion for its own sake, then clean up the redundant attribute separately, but I guess it's fine either way) Let me know what you think of the version here: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup I'll happily switch the patch to you as the author if you're fine with that as well. I still have reservations about removing the attribute API entirely and pretending that io_pgtable_cfg is anything other than a SoC-specific private interface, but the reworked patch on its own looks reasonable to me, thanks! (I wasn't too convinced about the iommu_cmd_line wrappers either...) Just iommu_get_dma_strict() needs an export since the SMMU drivers can be modular - I consciously didn't add that myself since I was mistakenly thinking only iommu-dma would call it. Robin. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v5 0/7] drm: add simpledrm driver
On Friday, March 12, 2021 3:03:46 AM EST Thomas Zimmermann wrote: > Hi > > Am 12.03.21 um 04:49 schrieb nerdopolis: > > On Wednesday, March 10, 2021 4:10:35 AM EST Thomas Zimmermann wrote: > >> Hi > >> > >> Am 10.03.21 um 03:50 schrieb nerdopolis: > >>> On Friday, September 2, 2016 4:22:38 AM EST David Herrmann wrote: > Hey > > On request of Noralf, I picked up the patches and prepared v5. Works > fine with > Xorg, if configured according to: > > https://lists.freedesktop.org/archives/dri-devel/2014-January/052777.html > If anyone knows how to make Xorg pick it up dynamically without such a > static > configuration, please let me know. > > > > >>> Hi > >>> > >>> I am kind of curious as I do have interest in seeing this merged as well. > >> > >> Please take a look at [1]. It's not the same driver, but something to > >> the same effect. I know it's been almost a year, but I do work on this > >> and intend to come back with a new version during 2021. > >> > >> I currently work on fastboot support for the new driver. But it's a > >> complicated matter and takes time. If there's interest, we could talk > >> about merging what's already there. > >> > >> Best regards > >> Thomas > >> > >> [1] > >> https://lore.kernel.org/dri-devel/20200625120011.16168-1-tzimmerm...@suse.de/ > >> > >>> > >>> There is an email in this thread from 2018, but when I tried to import an > >>> mbox > >>> file from the whole month for August 2018, for some reason, kmail doesn't > >>> see > >>> the sender and mailing list recipient in that one, so I will reply to > >>> this one, > >>> because I was able to import this into my mail client. > >>> https://www.spinics.net/lists/dri-devel/msg185519.html > >>> > >>> I was able to get this to build against Linux 4.8, but not against a newer > >>> version, some headers seem to have been split, and some things are off by > >>> 8 > >>> and other things. I could NOT find a git repo, but I was able to find the > >>> newest patches I could find, and import those with git am against 4.8 with > >>> some tweaks. If that is needed, I can link it, but only if you want. > >>> > >>> However in QEMU I wasn't able to figure out how to make it create a > >>> /dev/dri/card0 device, even after blacklisting the other modules for qxl, > >>> cirrus, etc, and then modprobe-ing simpledrm > >>> > >>> In my view something like this is would be useful. There still could be > >>> hardware devices that don't have modesetting support (like vmvga in > >>> qemu/virt-manager as an example). And most wayland servers need a > >>> /dev/dri/card0 device as well as a potential user-mode TTY replacement > >>> would > >>> also need /dev/dri/card0 > >>> > >>> I will admit I unfortunately failed to get it to build against master. I > >>> couldn't figure out some of the changes, where some new structs were off > >>> by > >>> a factor of 8. > >>> > >>> > >>> Thanks > >>> > >>> > >>> > >>> ___ > >>> dri-devel mailing list > >>> dri-devel@lists.freedesktop.org > >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel > >>> > >> > >> > > Hi > > > > I tried simplekms against v5.9, and it built, and it runs, and is pretty > > neat. > > > > I tried using the qxl, cirrus, and vmware card in QEMU. Weston starts on all > > of them. And I did ensure that the simplekms driver was being used > > Cool! Thanks a lot. When I submit the next patchset can I add > > Tested-by: nerdopolis > > to the tags? > Sure! > > > > That is, it works after adding GRUB_GFXPAYLOAD_LINUX=keep , to avoid having > > to > > set a VGA option. (although not sure the equivalent in syslinux yet) > > Yeah, it's a known 'limitation.' (1) But it's usually something that > Linux distributions take care of. > > The rsp kernel feature needs a set up from the firmware/bootloader/etc. > Once the driver has been merged, added other generic drivers (EFI, VESA, > etc) should be a lot easier. Those would maybe not require the firmware > setup. > > Best regards > Thomas > > (1) Well, it's the way it's supposed to work. > > > > > > > Thanks. > > > > > > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212255] New: WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 Bug ID: 212255 Summary: WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask Product: Drivers Version: 2.5 Kernel Version: 5.11.3 Hardware: x86-64 OS: Linux Tree: Mainline Status: NEW Severity: normal Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: i...@felicetufo.com Regression: No Created attachment 295819 --> https://bugzilla.kernel.org/attachment.cgi?id=295819&action=edit dmesg Hello, i get two warnings booting kernel 5.11.3. The warning is present on later 5.11.x kernels (included 5.11.6) too, but does not show on 5.11.2. [ cut here ] [6.099356] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa3/0xb0 ---[ end trace 4cbc711ff0b0578b ]--- [6.102354] [ cut here ] [6.102354] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:155 kernel_fpu_end+0x1e/0x30 -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 Alex Deucher (alexdeuc...@gmail.com) changed: What|Removed |Added CC||alexdeuc...@gmail.com --- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) --- Possibly fixed with these patches? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15e8b95d5f7509e0b09289be8c422c459c9f0412 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=680174cfd1e1cea70a8f30ccb44d8fbdf996018e -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v5] Add power/gpu_frequency tracepoint.
On Wed 2021-03-10 13:56:29, Peiyong Lin wrote: > On Mon, Dec 21, 2020 at 12:10 PM Peiyong Lin wrote: > > > > Historically there is no common trace event for GPU frequency, in > > downstream Android each different hardware vendor implements their own > > way to expose GPU frequency, for example as a debugfs node. This patch > > standardize it as a common trace event in upstream linux kernel to help > > the ecosystem have a common implementation across hardware vendors. > > Toolings in the Linux ecosystem will benefit from this especially in the > > downstream Android, where this information is critical to graphics > > developers. ... > > /* This part must be outside protection */ > > -- > > 2.29.2.684.gfbc64c5ab5-goog > > > > Hi there, > > Could you please take a look at this patch? Please trim email quotes when replying. Also, you might want to state _name_ of person who should apply this, so there's no ambiguity. You cc-ed many people... Pavel -- http://www.livejournal.com/~pavelmachek signature.asc Description: Digital signature ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl
On Fri, 12 Mar 2021 09:37:49 -0600 Jason Ekstrand wrote: > On Fri, Mar 12, 2021 at 1:31 AM Boris Brezillon > wrote: > > > > On Thu, 11 Mar 2021 12:11:48 -0600 > > Jason Ekstrand wrote: > > > > > > > > > > 2/ Queued jobs might be executed out-of-order (unless they have > > > > > > > > explicit/implicit deps between them), and Vulkan asks that > > > > > > > > the out > > > > > > > > fence be signaled when all jobs are done. Timeline syncobjs > > > > > > > > are a > > > > > > > > good match for that use case. All we need to do is pass the > > > > > > > > same > > > > > > > > fence syncobj to all jobs being attached to a single > > > > > > > > QueueSubmit > > > > > > > > request, but a different point on the timeline. The syncobj > > > > > > > > timeline wait does the rest and guarantees that we've > > > > > > > > reached a > > > > > > > > given timeline point (IOW, all jobs before that point are > > > > > > > > done) > > > > > > > > before declaring the fence as signaled. > > > > > > > > One alternative would be to have dummy 'synchronization' > > > > > > > > jobs that > > > > > > > > don't actually execute anything on the GPU but declare a > > > > > > > > dependency > > > > > > > > on all other jobs that are part of the QueueSubmit request, > > > > > > > > and > > > > > > > > signal the out fence (the scheduler would do most of the > > > > > > > > work for > > > > > > > > us, all we have to do is support NULL job heads and signal > > > > > > > > the > > > > > > > > fence directly when that happens instead of queueing the > > > > > > > > job). > > > > > > > > > > > > > > I have to admit to being rather hazy on the details of timeline > > > > > > > syncobjs, but I thought there was a requirement that the timeline > > > > > > > moves > > > > > > > monotonically. I.e. if you have multiple jobs signalling the same > > > > > > > syncobj just with different points, then AFAIU the API requires > > > > > > > that the > > > > > > > points are triggered in order. > > > > > > > > > > > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, > > > > > > so I > > > > > > might be wrong, but my understanding is that queuing fences > > > > > > (addition > > > > > > of new points in the timeline) should happen in order, but signaling > > > > > > can happen in any order. When checking for a signaled fence the > > > > > > fence-chain logic starts from the last point (or from an explicit > > > > > > point > > > > > > if you use the timeline wait flavor) and goes backward, stopping at > > > > > > the > > > > > > first un-signaled node. If I'm correct, that means that fences that > > > > > > are part of a chain can be signaled in any order. > > > > > > > > > > You don't even need a timeline for this. Just have a single syncobj > > > > > per-queue and make each submit wait on it and then signal it. > > > > > Alternatively, you can just always hang on to the out-fence from the > > > > > previous submit and make the next one wait on that. > > > > > > > > That's what I have right now, but it forces the serialization of all > > > > jobs that are pushed during a submit (and there can be more than one > > > > per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd > > > > be better to not force this serialization if we can avoid it. > > > > > > I'm not familiar with panfrost's needs and I don't work on a tiler and > > > I know there are different issues there. But... > > > > > > The Vulkan spec requires that everything that all the submits that > > > happen on a given vkQueue happen in-order. Search the spec for > > > "Submission order" for more details. > > > > Duh, looks like I completely occulted the "Submission order" > > guarantees. This being said, even after reading this chapter multiple > > times I'm not sure what kind of guarantee this gives us, given the > > execution itself can be out-of-order. My understanding is that > > submission order matters for implicit deps, say you have 2 distinct > > VkSubmitInfo, the first one (in submission order) writing to a buffer > > and the second one reading from it, you really want the first one to > > be submitted first and the second one to wait on the implicit BO fence > > created by the first one. If we were to submit out-of-order, this > > guarantee wouldn't be met. OTOH, if we have 2 completely independent > > submits, I don't really see what submission order gives us if execution > > is out-of-order. > > Right, this is where things get sticky. What's guaranteed there is > submission order not execution order. But, sadly, there's more hidden > in there than you might think. Before we can go there, though, we > need to establish a few details of Mali hardware works. > > My understanding (feel free to correct me if I'm wrong) is that, > roughly, Mali has two engines which have to work together to render: > One engine for compute/vertex/geometry and one for
[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 --- Comment #2 from Felice Tufo (i...@felicetufo.com) --- Thanks Alex, it seems that Linus merged those patches (just today) for the next -rc release, am I right? If so, I'll do a quick test and let you know as soon as Ubuntu team will release the next mainline kernel (usually just 1 or 2 days after -rc is out). -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask
https://bugzilla.kernel.org/show_bug.cgi?id=212255 --- Comment #3 from Alex Deucher (alexdeuc...@gmail.com) --- (In reply to Felice Tufo from comment #2) > Thanks Alex, > it seems that Linus merged those patches (just today) for the next -rc > release, am I right? correct. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl
On Fri, 12 Mar 2021 19:25:13 +0100 Boris Brezillon wrote: > > So where does this leave us? Well, it depends on your submit model > > and exactly how you handle pipeline barriers that sync between > > engines. If you're taking option 3 above and doing two command > > buffers for each VkCommandBuffer, then you probably want two > > serialized timelines, one for each engine, and some mechanism to tell > > the kernel driver "these two command buffers have to run in parallel" > > so that your ping-pong works. If you're doing 1 or 2 above, I think > > you probably still want two simple syncobjs, one for each engine. You > > don't really have any need to go all that far back in history. All > > you really need to describe is "command buffer X depends on previous > > compute work" or "command buffer X depends on previous binning work". > > Okay, so this will effectively force in-order execution. Let's take your > previous example and add 2 more jobs at the end that have no deps on > previous commands: > > vkBeginRenderPass() /* Writes to ImageA */ > vkCmdDraw() > vkCmdDraw() > ... > vkEndRenderPass() > vkPipelineBarrier(imageA /* fragment -> compute */) > vkCmdDispatch() /* reads imageA, writes BufferB */ > vkBeginRenderPass() /* Writes to ImageC */ > vkCmdBindVertexBuffers(bufferB) > vkCmdDraw(); > ... > vkEndRenderPass() > vkBeginRenderPass() /* Writes to ImageD */ > vkCmdDraw() > ... > vkEndRenderPass() > > A: Vertex for the first draw on the compute engine > B: Vertex for the first draw on the compute engine > C: Fragment for the first draw on the binning engine; depends on A > D: Fragment for the second draw on the binning engine; depends on B > E: Compute on the compute engine; depends on C and D > F: Vertex for the third draw on the compute engine; depends on E > G: Fragment for the third draw on the binning engine; depends on F > H: Vertex for the fourth draw on the compute engine > I: Fragment for the fourth draw on the binning engine > > When we reach E, we might be waiting for D to finish before scheduling > the job, and because of the implicit serialization we have on the > compute queue (F implicitly depends on E, and H on F) we can't schedule > H either, which could, in theory be started. I guess that's where the > term submission order is a bit unclear to me. The action of starting a > job sounds like execution order to me (the order you starts jobs > determines the execution order since we only have one HW queue per job > type). All implicit deps have been calculated when we queued the job to > the SW queue, and I thought that would be enough to meet the submission > order requirements, but I might be wrong. > > The PoC I have was trying to get rid of this explicit serialization on > the compute and fragment queues by having one syncobj timeline > (queue()) and synchronization points (Sx). > > S0: in-fences=, out-fences= #waitSemaphore > sync point > A: in-fences=, out-fences= > B: in-fences=, out-fences= > C: in-fences=, out-fence= #implicit dep on A through > the tiler context > D: in-fences=, out-fence= #implicit dep on B through > the tiler context > E: in-fences=, out-fence= #implicit dep on D through > imageA > F: in-fences=, out-fence= #implicit dep on E through > buffer B > G: in-fences=, out-fence= #implicit dep on F through > the tiler context > H: in-fences=, out-fence= > I: in-fences=, out-fence= #implicit dep on H through > the tiler buffer > S1: in-fences=, out-fences= > #signalSemaphore,fence sync point > # QueueWaitIdle is implemented with a wait(queue(0)), AKA wait on the last > point > > With this solution H can be started before E if the compute slot > is empty and E's implicit deps are not done. It's probably overkill, > but I thought maximizing GPU utilization was important. Nevermind, I forgot the drm scheduler was dequeuing jobs in order, so 2 syncobjs (one per queue type) is indeed the right approach. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl
> > 3. Each VkCommandBuffer is two command buffers: one for compute and > > one for binning, and you use some sort of HW synchronization mechanism > > to handle the dependencies as you ping-pong between them. > > I didn't consider that option. We have a DOORBELL instruction on Bifrost > to wake up the CPU when the GPU wants to report something (not even > sure we have something equivalent on Midgard), but there's no > inter-queue sync mechanism AFAICT. I was about to say that we don't have hardware support. Even considering DOORBELL (which AFAIK isn't piped through to anything on kbase at least), I'm inclined to say we don't have hardware support. Option 2 (what kbase does? dunno how DDK vulkan works though) is probably our best bet, tbh. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/amdgpu: Mark mmhub_v1_7_setup_vm_pt_regs() as static
Kernel test robot throws below warning -> drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c:56:6: warning: no previous prototype for 'mmhub_v1_7_setup_vm_pt_regs' [-Wmissing-prototypes] Mark mmhub_v1_7_setup_vm_pt_regs() as static. Reported-by: kernel test robot Signed-off-by: Souptick Joarder --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c index 4df0b73..ae7d8a1 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c @@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct amdgpu_device *adev) return base; } -void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid, +static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid, uint64_t page_table_base) { struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0]; -- 1.9.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/amdgpu: Mark mmhub_v1_7_setup_vm_pt_regs() as static
[AMD Official Use Only - Internal Distribution Only] Thank you Joarder for the fix. But this has already been fixed in our Alex's drm-next branch. Regards, Oak On 2021-03-12, 5:19 PM, "Souptick Joarder" wrote: Kernel test robot throws below warning -> drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c:56:6: warning: no previous prototype for 'mmhub_v1_7_setup_vm_pt_regs' [-Wmissing-prototypes] Mark mmhub_v1_7_setup_vm_pt_regs() as static. Reported-by: kernel test robot Signed-off-by: Souptick Joarder --- drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c index 4df0b73..ae7d8a1 100644 --- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c +++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c @@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct amdgpu_device *adev) return base; } -void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid, +static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid, uint64_t page_table_base) { struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0]; -- 1.9.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
On Fri, Mar 12, 2021 at 09:36:42AM +0100, Greg Kroah-Hartman wrote: > On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote: > > add null-check on function pointer before dereference on ops->cursor > > > > Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com > > Signed-off-by: Du Cheng > > --- > > drivers/video/fbdev/core/fbcon.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/video/fbdev/core/fbcon.c > > b/drivers/video/fbdev/core/fbcon.c > > index 44a5cd2f54cc..3406067985b1 100644 > > --- a/drivers/video/fbdev/core/fbcon.c > > +++ b/drivers/video/fbdev/core/fbcon.c > > @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode) > > > > ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1; > > > > + if (!ops->cursor) > > + return; > > + > > ops->cursor(vc, info, mode, get_color(vc, info, c, 1), > > get_color(vc, info, c, 0)); > > } > > -- > > 2.27.0 > > > > Is this the same issue reported here: > > https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com > > And has syzbot testing shown that this fix does solve the issue? > > thanks, > > greg k-h Hi Greg, After syzbot testing at https://syzkaller.appspot.com/bug?id=26567b12e74b8791e1db50da6039ee1705e5a7ed the results showed that shahbazi's patch did not solve this bug, but my patch passed the syzbot test. Regards, Du Cheng ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] gpu: host1x: mipi: Simplify with dev_err_probe()
Common pattern of handling deferred probe can be simplified with dev_err_probe(). Less code and also it prints the error value. Signed-off-by: Tian Tao --- drivers/gpu/host1x/mipi.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/host1x/mipi.c b/drivers/gpu/host1x/mipi.c index 2efe12d..4ba022f 100644 --- a/drivers/gpu/host1x/mipi.c +++ b/drivers/gpu/host1x/mipi.c @@ -523,10 +523,9 @@ static int tegra_mipi_probe(struct platform_device *pdev) mutex_init(&mipi->lock); mipi->clk = devm_clk_get(&pdev->dev, NULL); - if (IS_ERR(mipi->clk)) { - dev_err(&pdev->dev, "failed to get clock\n"); - return PTR_ERR(mipi->clk); - } + if (IS_ERR(mipi->clk)) + return dev_err_probe(&pdev->dev, PTR_ERR(mipi->clk), +"failed to get clock\n"); err = clk_prepare(mipi->clk); if (err < 0) -- 2.7.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/kmb: remove unneeded semicolon
From: Junlin Yang Fixes coccicheck warnings: ./drivers/gpu/drm/kmb/kmb_dsi.c:284:3-4: Unneeded semicolon ./drivers/gpu/drm/kmb/kmb_dsi.c:304:3-4: Unneeded semicolon ./drivers/gpu/drm/kmb/kmb_dsi.c:321:3-4: Unneeded semicolon ./drivers/gpu/drm/kmb/kmb_dsi.c:340:3-4: Unneeded semicolon ./drivers/gpu/drm/kmb/kmb_dsi.c:364:2-3: Unneeded semicolon Signed-off-by: Junlin Yang --- drivers/gpu/drm/kmb/kmb_dsi.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/kmb/kmb_dsi.c b/drivers/gpu/drm/kmb/kmb_dsi.c index 4b5d82a..231041b 100644 --- a/drivers/gpu/drm/kmb/kmb_dsi.c +++ b/drivers/gpu/drm/kmb/kmb_dsi.c @@ -281,7 +281,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 data_mode, default: DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode); return -EINVAL; - }; + } break; case DSI_LP_DT_PPS_YCBCR422_16B: data_type_param.size_constraint_pixels = 2; @@ -301,7 +301,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 data_mode, default: DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode); return -EINVAL; - }; + } break; case DSI_LP_DT_LPPS_YCBCR422_20B: case DSI_LP_DT_PPS_YCBCR422_24B: @@ -318,7 +318,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 data_mode, default: DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode); return -EINVAL; - }; + } break; case DSI_LP_DT_PPS_RGB565_16B: data_type_param.size_constraint_pixels = 1; @@ -337,7 +337,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 data_mode, default: DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode); return -EINVAL; - }; + } break; case DSI_LP_DT_PPS_RGB666_18B: data_type_param.size_constraint_pixels = 4; @@ -361,7 +361,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 data_mode, default: DRM_ERROR("DSI: Invalid data_type %d\n", data_type); return -EINVAL; - }; + } *params = data_type_param; return 0; -- 1.9.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212259] New: Entire graphics stack locks up when running SteamVR and sometimes Sway; is sometimes unrecoverable
https://bugzilla.kernel.org/show_bug.cgi?id=212259 Bug ID: 212259 Summary: Entire graphics stack locks up when running SteamVR and sometimes Sway; is sometimes unrecoverable Product: Drivers Version: 2.5 Kernel Version: 5.11.0 Hardware: Intel OS: Linux Tree: Mainline Status: NEW Severity: blocking Priority: P1 Component: Video(DRI - non Intel) Assignee: drivers_video-...@kernel-bugs.osdl.org Reporter: happysmas...@protonmail.com Regression: No Created attachment 295831 --> https://bugzilla.kernel.org/attachment.cgi?id=295831&action=edit Last 406 lines of dmesg at the time of error This bug is driving me nuts! First time, it frequently happened when I ran Sway and LXDE at the same time, where my entire graphics stack would freeze and I would be unable to switch virtual TTYs or even kill X, even with kill -9 on everything. Now, I am experiencing this bug again, this time with SteamVR, and at a MUCH higher frequency. It seems to happen, fairly randomly, whenever I start SteamVR, but seems to be more likely when my system has been running for a long time, or when Waterfox has been running for a long time. A couple weeks ago I found a way to reset my GPU as a workaround to the bug (https://www.reddit.com/r/linuxquestions/comments/lpiwkg/how_to_reset_graphics_stack_when_x_stops/), but today, this did not work. My screen went black, but I was still unable to kill -9 X, even after trying the reset many times. In dmesg, there were messages about the reset still being in-progress, but I did not see anything seem to change. Eventually, I decided to give up on gracefully fixing this bug with no downtime to my website and poweroff, then hold the power button of my computer after the HDD activity light went off as even powering off doesn't seem to work properly with this bug. Only then, could I reboot and recover. I have attached the end of the most recent dmesg where this failed. I also have more dmesgs that may or may not be the same bug, but the attachment field only allows for one. -- You may reply to this email to add a comment. You are receiving this mail because: You are watching the assignee of the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] gpu: drm: mediatek: delete redundant printing of return value
platform_get_irq() has already checked and printed the return value, the printing here is nothing special, it is not necessary at all. Signed-off-by: Wang Qing --- drivers/gpu/drm/mediatek/mtk_dpi.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c index b05f900..0ac4962 --- a/drivers/gpu/drm/mediatek/mtk_dpi.c +++ b/drivers/gpu/drm/mediatek/mtk_dpi.c @@ -751,10 +751,8 @@ static int mtk_dpi_probe(struct platform_device *pdev) } dpi->irq = platform_get_irq(pdev, 0); - if (dpi->irq <= 0) { - dev_err(dev, "Failed to get irq: %d\n", dpi->irq); + if (dpi->irq <= 0) return -EINVAL; - } ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0, NULL, &dpi->next_bridge); -- 2.7.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] MIPS: ingenic: gcw0: SPI panel does not require active-high CS
On Sun, Mar 07, 2021 at 07:28:30PM +, Paul Cercueil wrote: > The NT39016 panel is a fun beast, even though the documentation states > that the CS line is active-low, it will work just fine if the CS line is > configured as active-high, but it won't work if the CS line is forced > low or forced high. > > Since it did actually work with the spi-cs-high property, this is not a > bugfix, but we should nonetheless remove that property to match the > documentation. > > Signed-off-by: Paul Cercueil > --- > arch/mips/boot/dts/ingenic/gcw0.dts | 1 - > 1 file changed, 1 deletion(-) applied to mips-next. Thomas. -- Crap can work. Given enough thrust pigs will fly, but it's not necessarily a good idea.[ RFC1925, 2.3 ] ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/2] dt-bindings: panel/kingdisplay,kd035g6-54nt: Remove spi-cs-high
On Sun, Mar 07, 2021 at 07:28:29PM +, Paul Cercueil wrote: > The NT39016 panel is a fun beast, even though the documentation states > that the CS line is active-low, it will work just fine if the CS line is > configured as active-high, but it won't work if the CS line is forced > low or forced high. > > Since it did actually work with the spi-cs-high property, this is not a > bugfix, but we should nonetheless remove that property from the example > to match the documentation. > > Signed-off-by: Paul Cercueil > --- > .../bindings/display/panel/kingdisplay,kd035g6-54nt.yaml | 1 - > 1 file changed, 1 deletion(-) applied to mips-next. Thomas. -- Crap can work. Given enough thrust pigs will fly, but it's not necessarily a good idea.[ RFC1925, 2.3 ] ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v5 0/7] drm: add simpledrm driver
Hi Am 12.03.21 um 04:49 schrieb nerdopolis: On Wednesday, March 10, 2021 4:10:35 AM EST Thomas Zimmermann wrote: Hi Am 10.03.21 um 03:50 schrieb nerdopolis: On Friday, September 2, 2016 4:22:38 AM EST David Herrmann wrote: Hey On request of Noralf, I picked up the patches and prepared v5. Works fine with Xorg, if configured according to: https://lists.freedesktop.org/archives/dri-devel/2014-January/052777.html If anyone knows how to make Xorg pick it up dynamically without such a static configuration, please let me know. Hi I am kind of curious as I do have interest in seeing this merged as well. Please take a look at [1]. It's not the same driver, but something to the same effect. I know it's been almost a year, but I do work on this and intend to come back with a new version during 2021. I currently work on fastboot support for the new driver. But it's a complicated matter and takes time. If there's interest, we could talk about merging what's already there. Best regards Thomas [1] https://lore.kernel.org/dri-devel/20200625120011.16168-1-tzimmerm...@suse.de/ There is an email in this thread from 2018, but when I tried to import an mbox file from the whole month for August 2018, for some reason, kmail doesn't see the sender and mailing list recipient in that one, so I will reply to this one, because I was able to import this into my mail client. https://www.spinics.net/lists/dri-devel/msg185519.html I was able to get this to build against Linux 4.8, but not against a newer version, some headers seem to have been split, and some things are off by 8 and other things. I could NOT find a git repo, but I was able to find the newest patches I could find, and import those with git am against 4.8 with some tweaks. If that is needed, I can link it, but only if you want. However in QEMU I wasn't able to figure out how to make it create a /dev/dri/card0 device, even after blacklisting the other modules for qxl, cirrus, etc, and then modprobe-ing simpledrm In my view something like this is would be useful. There still could be hardware devices that don't have modesetting support (like vmvga in qemu/virt-manager as an example). And most wayland servers need a /dev/dri/card0 device as well as a potential user-mode TTY replacement would also need /dev/dri/card0 I will admit I unfortunately failed to get it to build against master. I couldn't figure out some of the changes, where some new structs were off by a factor of 8. Thanks ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel Hi I tried simplekms against v5.9, and it built, and it runs, and is pretty neat. I tried using the qxl, cirrus, and vmware card in QEMU. Weston starts on all of them. And I did ensure that the simplekms driver was being used Cool! Thanks a lot. When I submit the next patchset can I add Tested-by: nerdopolis to the tags? That is, it works after adding GRUB_GFXPAYLOAD_LINUX=keep , to avoid having to set a VGA option. (although not sure the equivalent in syslinux yet) Yeah, it's a known 'limitation.' (1) But it's usually something that Linux distributions take care of. The rsp kernel feature needs a set up from the firmware/bootloader/etc. Once the driver has been merged, added other generic drivers (EFI, VESA, etc) should be a lot easier. Those would maybe not require the firmware setup. Best regards Thomas (1) Well, it's the way it's supposed to work. Thanks. -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer OpenPGP_signature Description: OpenPGP digital signature ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: vmwgfx leaking bo pins?
Am 12.03.21 um 00:02 schrieb Zack Rusin: On Mar 11, 2021, at 17:35, Thomas Hellström (Intel) wrote: Hi, Zack On 3/11/21 10:07 PM, Zack Rusin wrote: On Mar 11, 2021, at 05:46, Thomas Hellström (Intel) wrote: Hi, I tried latest drm-fixes today and saw a lot of these: Fallout from ttm rework? Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12. Mhm, crap. I hoped that you would have the same issue as radeon somehow and could help with debugging. Please make sure the patch is pushed to drm-misc-fixes so that it ends up fixed in 5.12. z Hmm, yes but doesn't that fix trip the ttm_bo_unpin() dma_resv_assert_held(bo->base.resv)? No, doesn’t seem to. TBH I’m not sure why myself, but it seems to be working fine. Taking the reservation to unpin at TTM bo free has always been awkward and that's why vmwgfx and I guess other TTM drivers have been sloppy doing that as TTM never cared. Perhaps TTM could change the pin_count to an atomic and allow unlocked unpinning? still requiring the reservation lock for pin_count transition 0->1, though. Yea, that’d probably make sense. I think in general just making sure the requirements are consistent and well documented would be great. Also, pinning at bo creation in vmwgfx has been to do the equivalent of ttm_bo_init_reserved() (which api was added later). Creating pinned would make the object isolated and allowing the reserve trylock that followed to always succeed. With the introduction of the TTM pin_count, it seems ttm_bo_init_reserved() is used to enable pinned creation which is used to emulate ttm_bo_init_reserved() :) Yea, we should probably port the vmwgfx code to ttm_bo_init_reserved just to be match the newly established semantics. Yeah, I stumbled over that at well during one of the cleanups and considered changing the implementation. But then it got lost in all the rework. Christian. z ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
add null-check on function pointer before dereference on ops->cursor Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com Signed-off-by: Du Cheng --- drivers/video/fbdev/core/fbcon.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c index 44a5cd2f54cc..3406067985b1 100644 --- a/drivers/video/fbdev/core/fbcon.c +++ b/drivers/video/fbdev/core/fbcon.c @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode) ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1; + if (!ops->cursor) + return; + ops->cursor(vc, info, mode, get_color(vc, info, c, 1), get_color(vc, info, c, 0)); } -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote: > add null-check on function pointer before dereference on ops->cursor > > Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com > Signed-off-by: Du Cheng > --- > drivers/video/fbdev/core/fbcon.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/video/fbdev/core/fbcon.c > b/drivers/video/fbdev/core/fbcon.c > index 44a5cd2f54cc..3406067985b1 100644 > --- a/drivers/video/fbdev/core/fbcon.c > +++ b/drivers/video/fbdev/core/fbcon.c > @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode) > > ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1; > > + if (!ops->cursor) > + return; > + > ops->cursor(vc, info, mode, get_color(vc, info, c, 1), > get_color(vc, info, c, 0)); > } > -- > 2.27.0 > Is this the same issue reported here: https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com And has syzbot testing shown that this fix does solve the issue? thanks, greg k-h ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v6 0/8] Add support for SVM atomics in Nouveau
This is the sixth version of a series to add support to Nouveau for atomic memory operations on OpenCL shared virtual memory (SVM) regions. There are no significant changes for version six other than correcting a minor s390 build and bisectability issue and removing a redundant call to compound_page() when checking for PageLocked in patch 1. Exclusive device access is implemented by adding a new swap entry type (SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main difference is that on fault the original entry is immediately restored by the fault handler instead of waiting. Restoring the entry triggers calls to MMU notifers which allows a device driver to revoke the atomic access permission from the GPU prior to the CPU finalising the entry. Patches 1 & 2 refactor existing migration and device private entry functions. Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated functionality into separate functions - try_to_migrate_one() and try_to_munlock_one(). These should not change any functionality, but any help testing would be much appreciated as I have not been able to test every usage of try_to_unmap_one(). Patch 5 contains the bulk of the implementation for device exclusive memory. Patch 6 contains some additions to the HMM selftests to ensure everything works as expected. Patch 7 is a cleanup for the Nouveau SVM implementation. Patch 8 contains the implementation of atomic access for the Nouveau driver. This has been tested using the latest upstream Mesa userspace with a simple OpenCL test program which checks the results of atomic GPU operations on a SVM buffer whilst also writing to the same buffer from the CPU. Alistair Popple (8): mm: Remove special swap entry functions mm/swapops: Rework swap entry manipulation code mm/rmap: Split try_to_munlock from try_to_unmap mm/rmap: Split migration into its own function mm: Device exclusive memory access mm: Selftests for exclusive device memory nouveau/svm: Refactor nouveau_range_fault nouveau/svm: Implement atomic SVM access Documentation/vm/hmm.rst | 19 +- arch/s390/mm/pgtable.c| 2 +- drivers/gpu/drm/nouveau/include/nvif/if000c.h | 1 + drivers/gpu/drm/nouveau/nouveau_svm.c | 130 +++- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 + .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c| 6 + fs/proc/task_mmu.c| 23 +- include/linux/mmu_notifier.h | 25 +- include/linux/rmap.h | 9 +- include/linux/swap.h | 8 +- include/linux/swapops.h | 123 ++-- lib/test_hmm.c| 126 +++- lib/test_hmm_uapi.h | 2 + mm/debug_vm_pgtable.c | 12 +- mm/hmm.c | 12 +- mm/huge_memory.c | 45 +- mm/hugetlb.c | 10 +- mm/memcontrol.c | 2 +- mm/memory.c | 127 +++- mm/migrate.c | 41 +- mm/mprotect.c | 18 +- mm/page_vma_mapped.c | 15 +- mm/rmap.c | 597 +++--- tools/testing/selftests/vm/hmm-tests.c| 219 +++ 24 files changed, 1313 insertions(+), 260 deletions(-) -- 2.20.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v6 1/8] mm: Remove special swap entry functions
Remove the migration and device private entry_to_page() and entry_to_pfn() inline functions and instead open code them directly. This results in shorter code which is easier to understand. Signed-off-by: Alistair Popple Reviewed-by: Ralph Campbell --- v6: * Removed redundant compound_page() call from inside PageLocked() * Fixed a minor build issue for s390 reported by kernel test bot v4: * Added pfn_swap_entry_to_page() * Reinstated check that migration entries point to locked pages * Removed #define swapcache_prepare which isn't needed for CONFIG_SWAP=0 builds --- arch/s390/mm/pgtable.c | 2 +- fs/proc/task_mmu.c | 23 +- include/linux/swap.h| 4 +-- include/linux/swapops.h | 69 ++--- mm/hmm.c| 5 ++- mm/huge_memory.c| 4 +-- mm/memcontrol.c | 2 +- mm/memory.c | 10 +++--- mm/migrate.c| 6 ++-- mm/page_vma_mapped.c| 6 ++-- 10 files changed, 50 insertions(+), 81 deletions(-) diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c index 18205f851c24..eec3a9d7176e 100644 --- a/arch/s390/mm/pgtable.c +++ b/arch/s390/mm/pgtable.c @@ -691,7 +691,7 @@ static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry) if (!non_swap_entry(entry)) dec_mm_counter(mm, MM_SWAPENTS); else if (is_migration_entry(entry)) { - struct page *page = migration_entry_to_page(entry); + struct page *page = pfn_swap_entry_to_page(entry); dec_mm_counter(mm, mm_counter(page)); } diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 3cec6fbef725..08ee59d945c0 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -514,10 +514,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr, } else { mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT; } - } else if (is_migration_entry(swpent)) - page = migration_entry_to_page(swpent); - else if (is_device_private_entry(swpent)) - page = device_private_entry_to_page(swpent); + } else if (is_pfn_swap_entry(swpent)) + page = pfn_swap_entry_to_page(swpent); } else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap && pte_none(*pte))) { page = xa_load(&vma->vm_file->f_mapping->i_pages, @@ -549,7 +547,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr, swp_entry_t entry = pmd_to_swp_entry(*pmd); if (is_migration_entry(entry)) - page = migration_entry_to_page(entry); + page = pfn_swap_entry_to_page(entry); } if (IS_ERR_OR_NULL(page)) return; @@ -691,10 +689,8 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, } else if (is_swap_pte(*pte)) { swp_entry_t swpent = pte_to_swp_entry(*pte); - if (is_migration_entry(swpent)) - page = migration_entry_to_page(swpent); - else if (is_device_private_entry(swpent)) - page = device_private_entry_to_page(swpent); + if (is_pfn_swap_entry(swpent)) + page = pfn_swap_entry_to_page(swpent); } if (page) { int mapcount = page_mapcount(page); @@ -1383,11 +1379,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm, frame = swp_type(entry) | (swp_offset(entry) << MAX_SWAPFILES_SHIFT); flags |= PM_SWAP; - if (is_migration_entry(entry)) - page = migration_entry_to_page(entry); - - if (is_device_private_entry(entry)) - page = device_private_entry_to_page(entry); + if (is_pfn_swap_entry(entry)) + page = pfn_swap_entry_to_page(entry); } if (page && !PageAnon(page)) @@ -1444,7 +1437,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end, if (pmd_swp_soft_dirty(pmd)) flags |= PM_SOFT_DIRTY; VM_BUG_ON(!is_pmd_migration_entry(pmd)); - page = migration_entry_to_page(entry); + page = pfn_swap_entry_to_page(entry); } #endif diff --git a/include/linux/swap.h b/include/linux/swap.h index 4cc6ec3bf0ab..516104b9334b 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -523,8 +523,8 @@ static inline void show_swap_cache_info(void) { } -#define free_swap_and_cache(e) ({(is_migration_entry(e) || is_device_private_entry(e));}) -#define swapcache_prepare(e) ({(is_migration_entry
[PATCH v6 2/8] mm/swapops: Rework swap entry manipulation code
Both migration and device private pages use special swap entries that are manipluated by a range of inline functions. The arguments to these are somewhat inconsitent so rework them to remove flag type arguments and to make the arguments similar for both read and write entry creation. Signed-off-by: Alistair Popple Reviewed-by: Christoph Hellwig Reviewed-by: Jason Gunthorpe Reviewed-by: Ralph Campbell --- include/linux/swapops.h | 56 ++--- mm/debug_vm_pgtable.c | 12 - mm/hmm.c| 2 +- mm/huge_memory.c| 26 +-- mm/hugetlb.c| 10 +--- mm/memory.c | 10 +--- mm/migrate.c| 26 ++- mm/mprotect.c | 10 +--- mm/rmap.c | 10 +--- 9 files changed, 100 insertions(+), 62 deletions(-) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 139be8235ad2..4dfd807ae52a 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -100,35 +100,35 @@ static inline void *swp_to_radix_entry(swp_entry_t entry) } #if IS_ENABLED(CONFIG_DEVICE_PRIVATE) -static inline swp_entry_t make_device_private_entry(struct page *page, bool write) +static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset) { - return swp_entry(write ? SWP_DEVICE_WRITE : SWP_DEVICE_READ, -page_to_pfn(page)); + return swp_entry(SWP_DEVICE_READ, offset); } -static inline bool is_device_private_entry(swp_entry_t entry) +static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset) { - int type = swp_type(entry); - return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE; + return swp_entry(SWP_DEVICE_WRITE, offset); } -static inline void make_device_private_entry_read(swp_entry_t *entry) +static inline bool is_device_private_entry(swp_entry_t entry) { - *entry = swp_entry(SWP_DEVICE_READ, swp_offset(*entry)); + int type = swp_type(entry); + return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE; } -static inline bool is_write_device_private_entry(swp_entry_t entry) +static inline bool is_writable_device_private_entry(swp_entry_t entry) { return unlikely(swp_type(entry) == SWP_DEVICE_WRITE); } #else /* CONFIG_DEVICE_PRIVATE */ -static inline swp_entry_t make_device_private_entry(struct page *page, bool write) +static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset) { return swp_entry(0, 0); } -static inline void make_device_private_entry_read(swp_entry_t *entry) +static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset) { + return swp_entry(0, 0); } static inline bool is_device_private_entry(swp_entry_t entry) @@ -136,35 +136,32 @@ static inline bool is_device_private_entry(swp_entry_t entry) return false; } -static inline bool is_write_device_private_entry(swp_entry_t entry) +static inline bool is_writable_device_private_entry(swp_entry_t entry) { return false; } #endif /* CONFIG_DEVICE_PRIVATE */ #ifdef CONFIG_MIGRATION -static inline swp_entry_t make_migration_entry(struct page *page, int write) -{ - BUG_ON(!PageLocked(compound_head(page))); - - return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ, - page_to_pfn(page)); -} - static inline int is_migration_entry(swp_entry_t entry) { return unlikely(swp_type(entry) == SWP_MIGRATION_READ || swp_type(entry) == SWP_MIGRATION_WRITE); } -static inline int is_write_migration_entry(swp_entry_t entry) +static inline int is_writable_migration_entry(swp_entry_t entry) { return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE); } -static inline void make_migration_entry_read(swp_entry_t *entry) +static inline swp_entry_t make_readable_migration_entry(pgoff_t offset) { - *entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry)); + return swp_entry(SWP_MIGRATION_READ, offset); +} + +static inline swp_entry_t make_writable_migration_entry(pgoff_t offset) +{ + return swp_entry(SWP_MIGRATION_WRITE, offset); } extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep, @@ -174,21 +171,28 @@ extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, extern void migration_entry_wait_huge(struct vm_area_struct *vma, struct mm_struct *mm, pte_t *pte); #else +static inline swp_entry_t make_readable_migration_entry(pgoff_t offset) +{ + return swp_entry(0, 0); +} + +static inline swp_entry_t make_writable_migration_entry(pgoff_t offset) +{ + return swp_entry(0, 0); +} -#define make_migration_entry(page, write) swp_entry(0, 0) static inline int is_migration_entry(swp_entry_t swp) { return 0; } -static inline void make_migration_entry_read(swp_entry_t *entryp) { } static inline void __migration_entry_wait(struct mm_struct *mm, pt
[PATCH v6 3/8] mm/rmap: Split try_to_munlock from try_to_unmap
The behaviour of try_to_unmap_one() is difficult to follow because it performs different operations based on a fairly large set of flags used in different combinations. TTU_MUNLOCK is one such flag. However it is exclusively used by try_to_munlock() which specifies no other flags. Therefore rather than overload try_to_unmap_one() with unrelated behaviour split this out into it's own function and remove the flag. Signed-off-by: Alistair Popple Reviewed-by: Ralph Campbell --- Christoph - I didn't add your Reviewed-by from v3 because removal of the extra VM_LOCKED check in v4 changed things slightly. Let me know if you're still ok for me to add it. Thanks. v4: * Removed redundant check for VM_LOCKED --- include/linux/rmap.h | 1 - mm/rmap.c| 40 2 files changed, 32 insertions(+), 9 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index def5c62c93b3..e26ac2d71346 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -87,7 +87,6 @@ struct anon_vma_chain { enum ttu_flags { TTU_MIGRATION = 0x1, /* migration mode */ - TTU_MUNLOCK = 0x2, /* munlock mode */ TTU_SPLIT_HUGE_PMD = 0x4, /* split huge PMD if any */ TTU_IGNORE_MLOCK= 0x8, /* ignore mlock */ diff --git a/mm/rmap.c b/mm/rmap.c index 977e70803ed8..d02bade5245b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1405,10 +1405,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, struct mmu_notifier_range range; enum ttu_flags flags = (enum ttu_flags)(long)arg; - /* munlock has nothing to gain from examining un-locked vmas */ - if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED)) - return true; - if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) && is_zone_device_page(page) && !is_device_private_page(page)) return true; @@ -1469,8 +1465,6 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma, page_vma_mapped_walk_done(&pvmw); break; } - if (flags & TTU_MUNLOCK) - continue; } /* Unexpected PMD-mapped THP? */ @@ -1784,6 +1778,37 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags) return !page_mapcount(page) ? true : false; } +static bool try_to_munlock_one(struct page *page, struct vm_area_struct *vma, +unsigned long address, void *arg) +{ + struct page_vma_mapped_walk pvmw = { + .page = page, + .vma = vma, + .address = address, + }; + + /* munlock has nothing to gain from examining un-locked vmas */ + if (!(vma->vm_flags & VM_LOCKED)) + return true; + + while (page_vma_mapped_walk(&pvmw)) { + /* PTE-mapped THP are never mlocked */ + if (!PageTransCompound(page)) { + /* +* Holding pte lock, we do *not* need +* mmap_lock here +*/ + mlock_vma_page(page); + } + page_vma_mapped_walk_done(&pvmw); + + /* found a mlocked page, no point continuing munlock check */ + return false; + } + + return true; +} + /** * try_to_munlock - try to munlock a page * @page: the page to be munlocked @@ -1796,8 +1821,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags) void try_to_munlock(struct page *page) { struct rmap_walk_control rwc = { - .rmap_one = try_to_unmap_one, - .arg = (void *)TTU_MUNLOCK, + .rmap_one = try_to_munlock_one, .done = page_not_mapped, .anon_lock = page_lock_anon_vma_read, -- 2.20.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v6 4/8] mm/rmap: Split migration into its own function
Migration is currently implemented as a mode of operation for try_to_unmap_one() generally specified by passing the TTU_MIGRATION flag or in the case of splitting a huge anonymous page TTU_SPLIT_FREEZE. However it does not have much in common with the rest of the unmap functionality of try_to_unmap_one() and thus splitting it into a separate function reduces the complexity of try_to_unmap_one() making it more readable. Several simplifications can also be made in try_to_migrate_one() based on the following observations: - All users of TTU_MIGRATION also set TTU_IGNORE_MLOCK. - No users of TTU_MIGRATION ever set TTU_IGNORE_HWPOISON. - No users of TTU_MIGRATION ever set TTU_BATCH_FLUSH. TTU_SPLIT_FREEZE is a special case of migration used when splitting an anonymous page. This is most easily dealt with by calling the correct function from unmap_page() in mm/huge_memory.c - either try_to_migrate() for PageAnon or try_to_unmap(). Signed-off-by: Alistair Popple Reviewed-by: Christoph Hellwig Reviewed-by: Ralph Campbell --- v5: * Added comments about how PMD splitting works for migration vs. unmapping * Tightened up the flag check in try_to_migrate() to be explicit about which TTU_XXX flags are supported. --- include/linux/rmap.h | 4 +- mm/huge_memory.c | 15 +- mm/migrate.c | 9 +- mm/rmap.c| 358 --- 4 files changed, 280 insertions(+), 106 deletions(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index e26ac2d71346..6062e0cfca2d 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -86,8 +86,6 @@ struct anon_vma_chain { }; enum ttu_flags { - TTU_MIGRATION = 0x1, /* migration mode */ - TTU_SPLIT_HUGE_PMD = 0x4, /* split huge PMD if any */ TTU_IGNORE_MLOCK= 0x8, /* ignore mlock */ TTU_IGNORE_HWPOISON = 0x20, /* corrupted page is recoverable */ @@ -96,7 +94,6 @@ enum ttu_flags { * do a final flush if necessary */ TTU_RMAP_LOCKED = 0x80, /* do not grab rmap lock: * caller holds it */ - TTU_SPLIT_FREEZE= 0x100,/* freeze pte under splitting thp */ }; #ifdef CONFIG_MMU @@ -193,6 +190,7 @@ static inline void page_dup_rmap(struct page *page, bool compound) int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, unsigned long *vm_flags); +bool try_to_migrate(struct page *page, enum ttu_flags flags); bool try_to_unmap(struct page *, enum ttu_flags flags); /* Avoid racy checks */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 89af065cea5b..eab004331b97 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2357,16 +2357,21 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma, static void unmap_page(struct page *page) { - enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK | - TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD; + enum ttu_flags ttu_flags = TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD; bool unmap_success; VM_BUG_ON_PAGE(!PageHead(page), page); if (PageAnon(page)) - ttu_flags |= TTU_SPLIT_FREEZE; - - unmap_success = try_to_unmap(page, ttu_flags); + unmap_success = try_to_migrate(page, ttu_flags); + else + /* +* Don't install migration entries for file backed pages. This +* helps handle cases when i_size is in the middle of the page +* as there is no need to unmap pages beyond i_size manually. +*/ + unmap_success = try_to_unmap(page, ttu_flags | + TTU_IGNORE_MLOCK); VM_BUG_ON_PAGE(!unmap_success, page); } diff --git a/mm/migrate.c b/mm/migrate.c index b752543adb64..cc4612e2a246 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1130,7 +1130,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage, /* Establish migration ptes */ VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma, page); - try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK); + try_to_migrate(page, 0); page_was_mapped = 1; } @@ -1332,7 +1332,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, if (page_mapped(hpage)) { bool mapping_locked = false; - enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK; + enum ttu_flags ttu = 0; if (!PageAnon(hpage)) { /* @@ -1349,7 +1349,7 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, ttu |= TTU_RMAP_LOCKED; } - try_to_unmap(hpage, ttu); + try_to_migrate(hpage, ttu); page_was_mapped
[PATCH v6 6/8] mm: Selftests for exclusive device memory
Adds some selftests for exclusive device memory. Signed-off-by: Alistair Popple Acked-by: Jason Gunthorpe Tested-by: Ralph Campbell Reviewed-by: Ralph Campbell --- lib/test_hmm.c | 124 ++ lib/test_hmm_uapi.h| 2 + tools/testing/selftests/vm/hmm-tests.c | 219 + 3 files changed, 345 insertions(+) diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 5c9f5a020c1d..305a9d9e2b4c 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -25,6 +25,7 @@ #include #include #include +#include #include "test_hmm_uapi.h" @@ -46,6 +47,7 @@ struct dmirror_bounce { unsigned long cpages; }; +#define DPT_XA_TAG_ATOMIC 1UL #define DPT_XA_TAG_WRITE 3UL /* @@ -619,6 +621,54 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args, } } +static int dmirror_check_atomic(struct dmirror *dmirror, unsigned long start, +unsigned long end) +{ + unsigned long pfn; + + for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) { + void *entry; + struct page *page; + + entry = xa_load(&dmirror->pt, pfn); + page = xa_untag_pointer(entry); + if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC) + return -EPERM; + } + + return 0; +} + +static int dmirror_atomic_map(unsigned long start, unsigned long end, + struct page **pages, struct dmirror *dmirror) +{ + unsigned long pfn, mapped = 0; + int i; + + /* Map the migrated pages into the device's page tables. */ + mutex_lock(&dmirror->mutex); + + for (i = 0, pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++, i++) { + void *entry; + + if (!pages[i]) + continue; + + entry = pages[i]; + entry = xa_tag_pointer(entry, DPT_XA_TAG_ATOMIC); + entry = xa_store(&dmirror->pt, pfn, entry, GFP_ATOMIC); + if (xa_is_err(entry)) { + mutex_unlock(&dmirror->mutex); + return xa_err(entry); + } + + mapped++; + } + + mutex_unlock(&dmirror->mutex); + return mapped; +} + static int dmirror_migrate_finalize_and_map(struct migrate_vma *args, struct dmirror *dmirror) { @@ -661,6 +711,71 @@ static int dmirror_migrate_finalize_and_map(struct migrate_vma *args, return 0; } +static int dmirror_exclusive(struct dmirror *dmirror, +struct hmm_dmirror_cmd *cmd) +{ + unsigned long start, end, addr; + unsigned long size = cmd->npages << PAGE_SHIFT; + struct mm_struct *mm = dmirror->notifier.mm; + struct page *pages[64]; + struct dmirror_bounce bounce; + unsigned long next; + int ret; + + start = cmd->addr; + end = start + size; + if (end < start) + return -EINVAL; + + /* Since the mm is for the mirrored process, get a reference first. */ + if (!mmget_not_zero(mm)) + return -EINVAL; + + mmap_read_lock(mm); + for (addr = start; addr < end; addr = next) { + int i, mapped; + + if (end < addr + (ARRAY_SIZE(pages) << PAGE_SHIFT)) + next = end; + else + next = addr + (ARRAY_SIZE(pages) << PAGE_SHIFT); + + ret = make_device_exclusive_range(mm, addr, next, pages, NULL); + mapped = dmirror_atomic_map(addr, next, pages, dmirror); + for (i = 0; i < ret; i++) { + if (pages[i]) { + unlock_page(pages[i]); + put_page(pages[i]); + } + } + + if (addr + (mapped << PAGE_SHIFT) < next) { + mmap_read_unlock(mm); + mmput(mm); + return -EBUSY; + } + } + mmap_read_unlock(mm); + mmput(mm); + + /* Return the migrated data for verification. */ + ret = dmirror_bounce_init(&bounce, start, size); + if (ret) + return ret; + mutex_lock(&dmirror->mutex); + ret = dmirror_do_read(dmirror, start, end, &bounce); + mutex_unlock(&dmirror->mutex); + if (ret == 0) { + if (copy_to_user(u64_to_user_ptr(cmd->ptr), bounce.ptr, +bounce.size)) + ret = -EFAULT; + } + + cmd->cpages = bounce.cpages; + dmirror_bounce_fini(&bounce); + return ret; +} + static int dmirror_migrate(struct dmirror *dmirror, struct hmm_dmirror_cmd *cmd) { @@ -949,6 +1064,15 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
[PATCH v6 7/8] nouveau/svm: Refactor nouveau_range_fault
Call mmu_interval_notifier_insert() as part of nouveau_range_fault(). This doesn't introduce any functional change but makes it easier for a subsequent patch to alter the behaviour of nouveau_range_fault() to support GPU atomic operations. Signed-off-by: Alistair Popple --- drivers/gpu/drm/nouveau/nouveau_svm.c | 34 --- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index 94f841026c3b..a195e48c9aee 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -567,18 +567,27 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, unsigned long hmm_pfns[1]; struct hmm_range range = { .notifier = ¬ifier->notifier, - .start = notifier->notifier.interval_tree.start, - .end = notifier->notifier.interval_tree.last + 1, .default_flags = hmm_flags, .hmm_pfns = hmm_pfns, .dev_private_owner = drm->dev, }; - struct mm_struct *mm = notifier->notifier.mm; + struct mm_struct *mm = svmm->notifier.mm; int ret; + ret = mmu_interval_notifier_insert(¬ifier->notifier, mm, + args->p.addr, args->p.size, + &nouveau_svm_mni_ops); + if (ret) + return ret; + + range.start = notifier->notifier.interval_tree.start; + range.end = notifier->notifier.interval_tree.last + 1; + while (true) { - if (time_after(jiffies, timeout)) - return -EBUSY; + if (time_after(jiffies, timeout)) { + ret = -EBUSY; + goto out; + } range.notifier_seq = mmu_interval_read_begin(range.notifier); mmap_read_lock(mm); @@ -587,7 +596,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, if (ret) { if (ret == -EBUSY) continue; - return ret; + goto out; } mutex_lock(&svmm->mutex); @@ -606,6 +615,9 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm, svmm->vmm->vmm.object.client->super = false; mutex_unlock(&svmm->mutex); +out: + mmu_interval_notifier_remove(¬ifier->notifier); + return ret; } @@ -727,14 +739,8 @@ nouveau_svm_fault(struct nvif_notify *notify) } notifier.svmm = svmm; - ret = mmu_interval_notifier_insert(¬ifier.notifier, mm, - args.i.p.addr, args.i.p.size, - &nouveau_svm_mni_ops); - if (!ret) { - ret = nouveau_range_fault(svmm, svm->drm, &args.i, - sizeof(args), hmm_flags, ¬ifier); - mmu_interval_notifier_remove(¬ifier.notifier); - } + ret = nouveau_range_fault(svmm, svm->drm, &args.i, + sizeof(args), hmm_flags, ¬ifier); mmput(mm); limit = args.i.p.addr + args.i.p.size; -- 2.20.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v6 5/8] mm: Device exclusive memory access
Some devices require exclusive write access to shared virtual memory (SVM) ranges to perform atomic operations on that memory. This requires CPU page tables to be updated to deny access whilst atomic operations are occurring. In order to do this introduce a new swap entry type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for exclusive access by a device all page table mappings for the particular range are replaced with device exclusive swap entries. This causes any CPU access to the page to result in a fault. Faults are resovled by replacing the faulting entry with the original mapping. This results in MMU notifiers being called which a driver uses to update access permissions such as revoking atomic access. After notifiers have been called the device will no longer have exclusive access to the region. Signed-off-by: Alistair Popple --- v6: * Fixed a bisectablity issue due to incorrectly applying the rename of migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test. v5: * Renamed range->migrate_pgmap_owner to range->owner. * Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which allows notifiers called as a result of make_device_exclusive_range() to be ignored. * Added a check to try_to_protect_one() to detect if the pages originally returned from get_user_pages() have been unmapped or not. * Removed check_device_exclusive_range() as it is no longer required with the other changes. * Documentation update. v4: * Add function to check that mappings are still valid and exclusive. * s/long/unsigned long/ in make_device_exclusive_entry(). --- Documentation/vm/hmm.rst | 19 ++- drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +- include/linux/mmu_notifier.h | 25 +++- include/linux/rmap.h | 4 + include/linux/swap.h | 4 +- include/linux/swapops.h | 44 +- lib/test_hmm.c| 2 +- mm/hmm.c | 5 + mm/memory.c | 107 +- mm/mprotect.c | 8 + mm/page_vma_mapped.c | 9 +- mm/rmap.c | 203 ++ 12 files changed, 419 insertions(+), 13 deletions(-) diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index 09e28507f5b2..a5fdee82c037 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -332,7 +332,7 @@ between device driver specific code and shared common code: walks to fill in the ``args->src`` array with PFNs to be migrated. The ``invalidate_range_start()`` callback is passed a ``struct mmu_notifier_range`` with the ``event`` field set to - ``MMU_NOTIFY_MIGRATE`` and the ``migrate_pgmap_owner`` field set to + ``MMU_NOTIFY_MIGRATE`` and the ``owner`` field set to the ``args->pgmap_owner`` field passed to migrate_vma_setup(). This is allows the device driver to skip the invalidation callback and only invalidate device private MMU mappings that are actually migrating. @@ -405,6 +405,23 @@ between device driver specific code and shared common code: The lock can now be released. +Exclusive access memory +=== + +Not all devices support atomic access to system memory. To support atomic +operations to a shared virtual memory page such a device needs access to that +page which is exclusive of any userspace access from the CPU. The +``make_device_exclusive_range()`` function can be used to make a memory range +inaccessible from userspace. + +This replaces all mappings for pages in the given range with special swap +entries. Any attempt to access the swap entry results in a fault which is +resovled by replacing the entry with the original mapping. A driver gets +notified that the mapping has been changed by MMU notifiers, after which point +it will no longer have exclusive access to the page. Exclusive access is +guranteed to last until the driver drops the page lock and page reference, at +which point any CPU faults on the page may proceed as described. + Memory cgroup (memcg) and rss accounting diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index f18bd53da052..94f841026c3b 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -265,7 +265,7 @@ nouveau_svmm_invalidate_range_start(struct mmu_notifier *mn, * the invalidation is handled as part of the migration process. */ if (update->event == MMU_NOTIFY_MIGRATE && - update->migrate_pgmap_owner == svmm->vmm->cli->drm->dev) + update->owner == svmm->vmm->cli->drm->dev) goto out; if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) { diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h index b8200782dede..455e269bf825 100644 --- a/include/linux/mm
[PATCH v6 8/8] nouveau/svm: Implement atomic SVM access
Some NVIDIA GPUs do not support direct atomic access to system memory via PCIe. Instead this must be emulated by granting the GPU exclusive access to the memory. This is achieved by replacing CPU page table entries with special swap entries that fault on userspace access. The driver then grants the GPU permission to update the page undergoing atomic access via the GPU page tables. When CPU access to the page is required a CPU fault is raised which calls into the device driver via MMU notifiers to revoke the atomic access. The original page table entries are then restored allowing CPU access to proceed. Signed-off-by: Alistair Popple --- v4: * Check that page table entries haven't changed before mapping on the device --- drivers/gpu/drm/nouveau/include/nvif/if000c.h | 1 + drivers/gpu/drm/nouveau/nouveau_svm.c | 100 -- drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h | 1 + .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c| 6 ++ 4 files changed, 100 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h b/drivers/gpu/drm/nouveau/include/nvif/if000c.h index d6dd40f21eed..9c7ff56831c5 100644 --- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h +++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h @@ -77,6 +77,7 @@ struct nvif_vmm_pfnmap_v0 { #define NVIF_VMM_PFNMAP_V0_APER 0x00f0ULL #define NVIF_VMM_PFNMAP_V0_HOST 0xULL #define NVIF_VMM_PFNMAP_V0_VRAM 0x0010ULL +#define NVIF_VMM_PFNMAP_V0_A 0x0004ULL #define NVIF_VMM_PFNMAP_V0_W 0x0002ULL #define NVIF_VMM_PFNMAP_V0_V 0x0001ULL #define NVIF_VMM_PFNMAP_V0_NONE 0xULL diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c index a195e48c9aee..16b07d7589d2 100644 --- a/drivers/gpu/drm/nouveau/nouveau_svm.c +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c @@ -35,6 +35,7 @@ #include #include #include +#include struct nouveau_svm { struct nouveau_drm *drm; @@ -421,9 +422,9 @@ nouveau_svm_fault_cmp(const void *a, const void *b) return ret; if ((ret = (s64)fa->addr - fb->addr)) return ret; - /*XXX: atomic? */ - return (fa->access == 0 || fa->access == 3) - - (fb->access == 0 || fb->access == 3); + /* Atomic access (2) has highest priority */ + return (-1*(fa->access == 2) + (fa->access == 0 || fa->access == 3)) - + (-1*(fb->access == 2) + (fb->access == 0 || fb->access == 3)); } static void @@ -487,6 +488,10 @@ static bool nouveau_svm_range_invalidate(struct mmu_interval_notifier *mni, struct svm_notifier *sn = container_of(mni, struct svm_notifier, notifier); + if (range->event == MMU_NOTIFY_EXCLUSIVE && + range->owner == sn->svmm->vmm->cli->drm->dev) + return true; + /* * serializes the update to mni->invalidate_seq done by caller and * prevents invalidation of the PTE from progressing while HW is being @@ -555,6 +560,73 @@ static void nouveau_hmm_convert_pfn(struct nouveau_drm *drm, args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W; } +static int nouveau_atomic_range_fault(struct nouveau_svmm *svmm, + struct nouveau_drm *drm, + struct nouveau_pfnmap_args *args, u32 size, + struct svm_notifier *notifier) +{ + unsigned long timeout = + jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); + struct mm_struct *mm = svmm->notifier.mm; + struct page *page; + unsigned long start = args->p.addr; + unsigned long notifier_seq; + int ret = 0; + + ret = mmu_interval_notifier_insert(¬ifier->notifier, mm, + args->p.addr, args->p.size, + &nouveau_svm_mni_ops); + if (ret) + return ret; + + while (true) { + if (time_after(jiffies, timeout)) { + ret = -EBUSY; + goto out; + } + + notifier_seq = mmu_interval_read_begin(¬ifier->notifier); + mmap_read_lock(mm); + make_device_exclusive_range(mm, start, start + PAGE_SIZE, + &page, drm->dev); + mmap_read_unlock(mm); + if (!page) { + ret = -EINVAL; + goto out; + } + + mutex_lock(&svmm->mutex); + if (mmu_interval_read_retry(¬ifier->notifier, + notifier_seq)) { + mutex_unlock(&svmm->mutex); +
[PATCH v2] dt-bindings: display: sitronix, st7789v-dbi: Add Waveshare 2inch LCD module
From: "Carlis" Document support for the Waveshare 2inch LCD module display, which is a 240x320 2" TFT display driven by a Sitronix ST7789V TFT Controller. Signed-off-by: Carlis --- v2:change compatible name. --- .../display/sitronix,st7789v-dbi.yaml | 72 +++ 1 file changed, 72 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml diff --git a/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml new file mode 100644 index ..6abf82966230 --- /dev/null +++ b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml @@ -0,0 +1,72 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/sitronix,st7789v-dbi.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Sitronix ST7789V Display Panels Device Tree Bindings + +maintainers: + - Carlis + +description: + This binding is for display panels using a Sitronix ST7789V + controller in SPI mode. + +allOf: + - $ref: panel/panel-common.yaml# + +properties: + compatible: +oneOf: + - description: + Waveshare 2" 240x320 Color TFT LCD +items: + - enum: + - waveshare,ws2inch + - const: sitronix,st7789v-dbi + + spi-max-frequency: +maximum: 3200 + + dc-gpios: +maxItems: 1 +description: Display data/command selection (D/CX) + + backlight: true + reg: true + reset-gpios: true + rotation: true + +required: + - compatible + - reg + - dc-gpios + - reset-gpios + +additionalProperties: false + +examples: + - | +#include + +backlight: backlight { +compatible = "gpio-backlight"; +gpios = <&gpio 18 GPIO_ACTIVE_HIGH>; +}; + +spi { +#address-cells = <1>; +#size-cells = <0>; + +display@0{ +compatible = "waveshare,ws2inch", "sitronix,st7789v-dbi"; +reg = <0>; +spi-max-frequency = <3200>; +dc-gpios = <&gpio 25 GPIO_ACTIVE_HIGH>; +reset-gpios = <&gpio 27 GPIO_ACTIVE_HIGH>; +rotation = <270>; +}; +}; + +... -- 2.25.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm: bridge: adv7511: Support I2S IEC958 encoded PCM format
Support IEC958 encoded PCM format for ADV7511 so that ADV7511 HDMI audio driver can accept the IEC958 data from the I2S input. Signed-off-by: Sia Jee Heng --- drivers/gpu/drm/bridge/adv7511/adv7511.h | 1 + drivers/gpu/drm/bridge/adv7511/adv7511_audio.c | 6 ++ 2 files changed, 7 insertions(+) diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511.h b/drivers/gpu/drm/bridge/adv7511/adv7511.h index a9bb734366ae..05e3abb5a0c9 100644 --- a/drivers/gpu/drm/bridge/adv7511/adv7511.h +++ b/drivers/gpu/drm/bridge/adv7511/adv7511.h @@ -191,6 +191,7 @@ #define ADV7511_I2S_FORMAT_I2S 0 #define ADV7511_I2S_FORMAT_RIGHT_J 1 #define ADV7511_I2S_FORMAT_LEFT_J 2 +#define ADV7511_I2S_IEC958_DIRECT 3 #define ADV7511_PACKET(p, x) ((p) * 0x20 + (x)) #define ADV7511_PACKET_SDP(x) ADV7511_PACKET(0, x) diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c index 45838bd08d37..61f4a38e7d2b 100644 --- a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c +++ b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c @@ -101,6 +101,10 @@ static int adv7511_hdmi_hw_params(struct device *dev, void *data, case 20: len = ADV7511_I2S_SAMPLE_LEN_20; break; + case 32: + if (fmt->bit_fmt != SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE) + return -EINVAL; + fallthrough; case 24: len = ADV7511_I2S_SAMPLE_LEN_24; break; @@ -112,6 +116,8 @@ static int adv7511_hdmi_hw_params(struct device *dev, void *data, case HDMI_I2S: audio_source = ADV7511_AUDIO_SOURCE_I2S; i2s_format = ADV7511_I2S_FORMAT_I2S; + if (fmt->bit_fmt == SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE) + i2s_format = ADV7511_I2S_IEC958_DIRECT; break; case HDMI_RIGHT_J: audio_source = ADV7511_AUDIO_SOURCE_I2S; base-commit: de066e116306baf3a6a62691ac63cfc0b1dabddb -- 2.18.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v2] drm/tiny: add support for Waveshare 2inch LCD module
From: "Carlis" This adds a new module for the ST7789V controller with parameters for the Waveshare 2inch LCD module. Signed-off-by: Carlis --- v2:change compatible value. --- MAINTAINERS| 7 + drivers/gpu/drm/tiny/Kconfig | 14 ++ drivers/gpu/drm/tiny/Makefile | 1 + drivers/gpu/drm/tiny/st7789v.c | 269 + 4 files changed, 291 insertions(+) create mode 100644 drivers/gpu/drm/tiny/st7789v.c diff --git a/MAINTAINERS b/MAINTAINERS index d92f85ca831d..25c4ae12cecf 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5769,6 +5769,13 @@ T: git git://anongit.freedesktop.org/drm/drm-misc F: Documentation/devicetree/bindings/display/sitronix,st7735r.yaml F: drivers/gpu/drm/tiny/st7735r.c +DRM DRIVER FOR SITRONIX ST7789V PANELS +M: Carlis +S: Maintained +T: git git://anongit.freedesktop.org/drm/drm-misc +F: Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml +F: drivers/gpu/drm/tiny/st7789v.c + DRM DRIVER FOR SONY ACX424AKP PANELS M: Linus Walleij S: Maintained diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig index 2b6414f0fa75..ac2c7fb702f0 100644 --- a/drivers/gpu/drm/tiny/Kconfig +++ b/drivers/gpu/drm/tiny/Kconfig @@ -131,3 +131,17 @@ config TINYDRM_ST7735R * Okaya RH128128T 1.44" 128x128 TFT If M is selected the module will be called st7735r. + +config TINYDRM_ST7789V + tristate "DRM support for Sitronix ST7789V display panels" + depends on DRM && SPI + select DRM_KMS_HELPER + select DRM_KMS_CMA_HELPER + select DRM_MIPI_DBI + select BACKLIGHT_CLASS_DEVICE + help + DRM driver for Sitronix ST7789V with one of the following + LCDs: + * Waveshare 2inch lcd module 240x320 TFT + + If M is selected the module will be called st7789v. diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile index 6ae4e9e5a35f..aa0caa2b6c16 100644 --- a/drivers/gpu/drm/tiny/Makefile +++ b/drivers/gpu/drm/tiny/Makefile @@ -10,3 +10,4 @@ obj-$(CONFIG_TINYDRM_MI0283QT)+= mi0283qt.o obj-$(CONFIG_TINYDRM_REPAPER) += repaper.o obj-$(CONFIG_TINYDRM_ST7586) += st7586.o obj-$(CONFIG_TINYDRM_ST7735R) += st7735r.o +obj-$(CONFIG_TINYDRM_ST7789V) += st7789v.o diff --git a/drivers/gpu/drm/tiny/st7789v.c b/drivers/gpu/drm/tiny/st7789v.c new file mode 100644 index ..9b4bb9edba40 --- /dev/null +++ b/drivers/gpu/drm/tiny/st7789v.c @@ -0,0 +1,269 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * DRM driver for display panels connected to a Sitronix ST7789V + * display controller in SPI mode. + * + * Copyright 2017 David Lechner + * Copyright (C) 2019 Glider bvba + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#define ST7789V_PORCTRL 0xb2 +#define ST7789V_GCTRL 0xb7 +#define ST7789V_VCOMS 0xbb +#define ST7789V_LCMCTRL 0xc0 +#define ST7789V_VDVVRHEN0xc2 +#define ST7789V_VRHS0xc3 +#define ST7789V_VDVS0xc4 +#define ST7789V_FRCTRL2 0xc6 +#define ST7789V_PWCTRL1 0xd0 +#define ST7789V_PVGAMCTRL 0xe0 +#define ST7789V_NVGAMCTRL 0xe1 + +#define ST7789V_MY BIT(7) +#define ST7789V_MX BIT(6) +#define ST7789V_MV BIT(5) +#define ST7789V_RGBBIT(3) + +struct st7789v_cfg { + const struct drm_display_mode mode; + unsigned int left_offset; + unsigned int top_offset; + unsigned int write_only:1; + unsigned int rgb:1; /* RGB (vs. BGR) */ +}; + +struct st7789v_priv { + struct mipi_dbi_dev dbidev; /* Must be first for .release() */ + const struct st7789v_cfg *cfg; +}; + +static void st7789v_pipe_enable(struct drm_simple_display_pipe *pipe, + struct drm_crtc_state *crtc_state, + struct drm_plane_state *plane_state) +{ + struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(pipe->crtc.dev); + struct st7789v_priv *priv = container_of(dbidev, struct st7789v_priv, +dbidev); + struct mipi_dbi *dbi = &dbidev->dbi; + int ret, idx; + u8 addr_mode; + + if (!drm_dev_enter(pipe->crtc.dev, &idx)) + return; + + DRM_DEBUG_KMS("\n"); + + ret = mipi_dbi_poweron_reset(dbidev); + if (ret) + goto out_exit; + + msleep(150); + + mipi_dbi_command(dbi, MIPI_DCS_EXIT_SLEEP_MODE); + msleep(100); + + + switch (dbidev->rotation) { + default: + addr_mode = 0; + break; + case 90: + addr_mode = ST7789V_MY | ST7789V_MV; + break; + case 180: + addr_mode = ST7789V_MX | ST7789V_MY; + break; + case 270: + addr_mo
Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()
On Fri, Mar 12, 2021 at 09:36:42AM +0100, Greg Kroah-Hartman wrote: > On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote: > > add null-check on function pointer before dereference on ops->cursor > > > > Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com > > Signed-off-by: Du Cheng > > --- > > drivers/video/fbdev/core/fbcon.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/video/fbdev/core/fbcon.c > > b/drivers/video/fbdev/core/fbcon.c > > index 44a5cd2f54cc..3406067985b1 100644 > > --- a/drivers/video/fbdev/core/fbcon.c > > +++ b/drivers/video/fbdev/core/fbcon.c > > @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode) > > > > ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1; > > > > + if (!ops->cursor) > > + return; > > + > > ops->cursor(vc, info, mode, get_color(vc, info, c, 1), > > get_color(vc, info, c, 0)); > > } > > -- > > 2.27.0 > > > > Is this the same issue reported here: > > https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com > > And has syzbot testing shown that this fix does solve the issue? > > thanks, > > greg k-h Hi Greg, I sent both my patch and that of shahbazi to syzbot to see if they resolve the bug by the id b67aaae8d3a927f68d20. I will keep you posted of the outcomes. Regards, Du Cheng ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/ttm: make ttm_bo_unpin more defensive
We seem to have some more driver bugs than thought. Signed-off-by: Christian König --- include/drm/ttm/ttm_bo_api.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h index 4fb523dfab32..df9fe596e7c5 100644 --- a/include/drm/ttm/ttm_bo_api.h +++ b/include/drm/ttm/ttm_bo_api.h @@ -603,9 +603,11 @@ static inline void ttm_bo_pin(struct ttm_buffer_object *bo) static inline void ttm_bo_unpin(struct ttm_buffer_object *bo) { dma_resv_assert_held(bo->base.resv); - WARN_ON_ONCE(!bo->pin_count); WARN_ON_ONCE(!kref_read(&bo->kref)); - --bo->pin_count; + if (bo->pin_count) + --bo->pin_count; + else + WARN_ON_ONCE(true); } int ttm_mem_evict_first(struct ttm_device *bdev, -- 2.25.1 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: don't base trees on 5.12-rc1
On Thu, Mar 11, 2021 at 05:11:36AM +1000, Dave Airlie wrote: > On Wed, 10 Mar 2021 at 17:48, Maxime Ripard wrote: > > > > Hi Dave, > > > > On Wed, Mar 10, 2021 at 09:50:29AM +1000, Dave Airlie wrote: > > > I'm mostly sending this to the -misc maintainers because > > > drm-misc-fixes is based on rc1 at present. > > > > > > This needs to be *rebased* not merged up to 5.12-rc2. Merging will > > > still have the bad landmine commits in the bisect history. This is a > > > very special case. > > > > I'm sorry, I'm not entirely sure I get this. -rc1 is still in the -rc2 > > history, so how would that change anything in the bisect history? > > > > We can't get rid of the bad commit range, we can reduce the amount of > times someone accidentally bisects into it, by not using it as a base > commit for future changes. > > If in the future a bisect happens to want to test one of the patches > in drm-misc-fixes that is based on rc1, it will land the user with an > rc1 test kernel and could eat their swapfile/disk. We can avoid that > problem by not using rc1 as a base for drm-misc-fixes. > > We can't avoid them bisecting into the broken commits between when > this landed and was fixed, but rebasing trees can minimise the chances > of this when bisecting other changesets. Ok, yeah, that makes sense. Thanks! Maxime ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: vmwgfx leaking bo pins?
On 3/12/21 12:02 AM, Zack Rusin wrote: On Mar 11, 2021, at 17:35, Thomas Hellström (Intel) wrote: Hi, Zack On 3/11/21 10:07 PM, Zack Rusin wrote: On Mar 11, 2021, at 05:46, Thomas Hellström (Intel) wrote: Hi, I tried latest drm-fixes today and saw a lot of these: Fallout from ttm rework? Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12. z Hmm, yes but doesn't that fix trip the ttm_bo_unpin() dma_resv_assert_held(bo->base.resv)? No, doesn’t seem to. TBH I’m not sure why myself, but it seems to be working fine. With CONFIG_PROVE_LOCKING=y I see this: [ 7.117145] [drm] FIFO at 0xfe00 size is 8192 kiB [ 7.117284] [drm] VRAM at 0xe800 size is 131072 kiB [ 7.117291] INFO: trying to register non-static key. [ 7.117295] the code is fine but needs lockdep annotation. [ 7.117298] turning off the locking correctness validator Which will probably mask that dma_resv_assert_held(bo->base.resv) /Thomas ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH][next] drm/amd/pm: Fix spelling mistake "disble" -> "disable"
From: Colin Ian King There is a spelling mistake in an assert message. Fix it. Signed-off-by: Colin Ian King --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index 0d725b66fb78..7edafef095a3 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -1300,7 +1300,7 @@ static int smu7_start_dpm(struct pp_hwmgr *hwmgr) (0 == smum_send_msg_to_smc(hwmgr, PPSMC_MSG_PCIeDPM_Disable, NULL)), - "Failed to disble pcie DPM during DPM Start Function!", + "Failed to disable pcie DPM during DPM Start Function!", return -EINVAL); } -- 2.30.2 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/ttm: make ttm_bo_unpin more defensive
On Fri, 12 Mar 2021 at 09:38, Christian König wrote: > > We seem to have some more driver bugs than thought. > > Signed-off-by: Christian König Acked-by: Matthew Auld ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] drm/komeda: Fix off-by-1 when with readback conn due to rounding
(Adding back James again - did you use get_maintainer.pl?) On Thu, Mar 11, 2021 at 12:08:46PM +, carsten.haitz...@foss.arm.com wrote: > From: Carsten Haitzler > > When setting up a readback connector that writes data back to memory > rather than to an actual output device (HDMI etc.), rounding was set > to round. As the DPU uses a higher internal number of bits when generating > a color value, this round-down back to 8bit ended up with everything > being off-by one. e.g. #fefefe became #ff. This sets Perhaps overly pedantic, but now we've tracked down what was actually happening I think we can be more precise here. Not _everything_ is off-by-one, it's just rounding in the standard sense - if the most significant bit-to-be-discarded is set, the value is rounded up to minimise the absolute error introduced by bit-depth reduction. > rounding to "round-down" so things end up correct by turning on the LW_TRC > round down flag. Can we call it "truncate" rather than round down? I think it makes "TRC" a bit more understandable. > > Signed-off-by: Carsten Haitzler > --- > drivers/gpu/drm/arm/display/komeda/d71/d71_component.c | 7 ++- > drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h | 1 + > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c > b/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c > index 8a02ade369db..e97acc5519d1 100644 > --- a/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c > +++ b/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c > @@ -468,7 +468,12 @@ static void d71_wb_layer_update(struct komeda_component > *c, > struct komeda_layer_state *st = to_layer_st(state); > struct drm_connector_state *conn_st = state->wb_conn->state; > struct komeda_fb *kfb = to_kfb(conn_st->writeback_job->fb); > - u32 ctrl = L_EN | LW_OFM, mask = L_EN | LW_OFM | LW_TBU_EN; > + /* LW_TRC sets rounding to truncate not round which is needed for > + * the output of writeback to match the input in the most common > + * use cases like RGB888 -> RGB888, so set this bit by default > + */ Hm, not sure why this file uses "net/" style comments, but as you said, this is in-keeping with the rest of the file, so meh :-) > + u32 ctrl = LW_TRC | L_EN | LW_OFM; > + u32 mask = LW_TRC | L_EN | LW_OFM | LW_TBU_EN; If you were aiming for matching register order, this should be: L_EN | LW_TRC | LW_OFM | LW_TBU_EN I think it'd be nice to have the exact behaviour in the commit message, but either way this seems OK as a pragmatic fix so: Reviewed-by: Brian Starkey Thanks, -Brian > u32 __iomem *reg = c->reg; > > d71_layer_update_fb(c, kfb, st->addr); > diff --git a/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h > b/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h > index e80172a0b320..a8036689d721 100644 > --- a/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h > +++ b/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h > @@ -321,6 +321,7 @@ > #define LAYER_WR_FORMAT 0x0D8 > > /* Layer_WR control bits */ > +#define LW_TRC BIT(1) > #define LW_OFM BIT(4) > #define LW_LALPHA(x) (((x) & 0xFF) << 8) > #define LW_A_WCACHE(x) (((x) & 0xF) << 28) > -- > 2.30.0 > > ___ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[syzbot] upstream boot error: WARNING in vkms_vblank_simulate
Hello, syzbot found the following issue on: HEAD commit:f78d76e7 Merge tag 'drm-fixes-2021-03-12-1' of git://anong.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=11c16ba2d0 kernel config: https://syzkaller.appspot.com/x/.config?x=dc02c6afcb046874 dashboard link: https://syzkaller.appspot.com/bug?extid=333bd014262fd5d0a418 userspace arch: arm IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+333bd014262fd5d0a...@syzkaller.appspotmail.com [ cut here ] WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/vkms/vkms_crtc.c:21 vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41 Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc2-syzkaller-00338-gf78d76e72a46 #0 Hardware name: linux,dummy-virt (DT) pstate: 2085 (nzCv daIf -PAN -UAO -TCO BTYPE=--) pc : vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:21 lr : hrtimer_forward_now include/linux/hrtimer.h:510 [inline] lr : vkms_vblank_simulate+0x90/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:19 sp : 6a693cd0 x29: 6a693cd0 x28: 0c9d1e58 x27: dfff8000 x26: 6a67f540 x25: 1fffed4cfeb1 x24: 1fffed4cfeaa x23: 0c9d0d30 x22: 00fe4c00 x21: 6a67f540 x20: 0c9d0e58 x19: 0c9d1e58 x18: 6a6a1b48 x17: 1fffe1952345 x16: x15: 8000197bf810 x14: 1fffed4d2750 x13: 0001 x12: 0033 x11: 12fb4936 x10: 0007 x9 : 12fb4943 x8 : 800017d14c00 x7 : f1f1f1f1 x6 : dfff8000 x5 : 7fff x4 : 0008e44f6b90 x3 : 0008e54db790 x2 : 0008e44f6b90 x1 : 0008e54db790 x0 : 0002 Call trace: vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41 __run_hrtimer kernel/time/hrtimer.c:1519 [inline] __hrtimer_run_queues+0x590/0xe40 kernel/time/hrtimer.c:1583 hrtimer_interrupt+0x2d4/0x810 kernel/time/hrtimer.c:1645 timer_handler drivers/clocksource/arm_arch_timer.c:647 [inline] arch_timer_handler_phys+0x4c/0x70 drivers/clocksource/arm_arch_timer.c:665 handle_percpu_devid_irq+0x19c/0x330 kernel/irq/chip.c:930 generic_handle_irq_desc include/linux/irqdesc.h:158 [inline] generic_handle_irq kernel/irq/irqdesc.c:652 [inline] __handle_domain_irq+0x11c/0x1f0 kernel/irq/irqdesc.c:689 handle_domain_irq include/linux/irqdesc.h:176 [inline] gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370 el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669 arch_local_irq_restore arch/arm64/include/asm/irqflags.h:124 [inline] queue_work_on+0x74/0x110 kernel/workqueue.c:1528 queue_work include/linux/workqueue.h:507 [inline] cursor_timer_handler+0x64/0x100 drivers/video/fbdev/core/fbcon.c:397 call_timer_fn+0x1d4/0x9c4 kernel/time/timer.c:1431 expire_timers kernel/time/timer.c:1476 [inline] __run_timers.part.0+0x530/0xa00 kernel/time/timer.c:1745 __run_timers kernel/time/timer.c:1726 [inline] run_timer_softirq+0xa4/0x1a0 kernel/time/timer.c:1758 _stext+0x2b4/0x1084 do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline] invoke_softirq kernel/softirq.c:228 [inline] __irq_exit_rcu+0x46c/0x510 kernel/softirq.c:422 irq_exit+0x14/0x84 kernel/softirq.c:446 __handle_domain_irq+0x120/0x1f0 kernel/irq/irqdesc.c:692 handle_domain_irq include/linux/irqdesc.h:176 [inline] gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370 el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669 arch_local_irq_enable+0xc/0x14 arch/arm64/include/asm/irqflags.h:37 default_idle_call+0x64/0xf4 kernel/sched/idle.c:112 cpuidle_idle_call kernel/sched/idle.c:194 [inline] do_idle+0x38c/0x4ec kernel/sched/idle.c:300 cpu_startup_entry+0x28/0x80 kernel/sched/idle.c:397 rest_init+0x1d0/0x2cc init/main.c:721 arch_call_rest_init+0x10/0x1c start_kernel+0x3b0/0x3e8 init/main.c:1064 0x0 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 01/14] opp: Add devres wrapper for dev_pm_opp_set_clkname
On Fri, 12 Mar 2021 at 06:33, Viresh Kumar wrote: > > On 11-03-21, 22:20, Dmitry Osipenko wrote: > > +struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char > > *name) > > +{ > > + struct opp_table *opp_table; > > + int err; > > + > > + opp_table = dev_pm_opp_set_clkname(dev, name); > > + if (IS_ERR(opp_table)) > > + return opp_table; > > + > > + err = devm_add_action_or_reset(dev, devm_pm_opp_clkname_release, > > opp_table); > > + if (err) > > + opp_table = ERR_PTR(err); > > + > > + return opp_table; > > +} > > I wonder if we still need to return opp_table from here, or a simple > integer is fine.. The callers shouldn't be required to use the OPP > table directly anymore I believe and so better simplify the return > part of this and all other routines you are adding here.. Yes, please. I was thinking along the same lines, when I reviewed the mmc patch (patch9). > > If there is a user which needs the opp_table, let it use the regular > non-devm variant. Kind regards Uffe ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH] drm/exynos: move to use request_irq by IRQF_NO_AUTOEN flag
After this patch cbe16f35bee68 genirq: Add IRQF_NO_AUTOEN for request_irq/nmi() is merged. request_irq() after setting IRQ_NOAUTOEN as below irq_set_status_flags(irq, IRQ_NOAUTOEN); request_irq(dev, irq...); can be replaced by request_irq() with IRQF_NO_AUTOEN flag. Signed-off-by: Tian Tao --- drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 4 ++-- drivers/gpu/drm/exynos/exynos_drm_dsi.c | 7 +++ 2 files changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c index 1f79bc2..f530aff 100644 --- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c +++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c @@ -775,8 +775,8 @@ static int decon_conf_irq(struct decon_context *ctx, const char *name, return irq; } } - irq_set_status_flags(irq, IRQ_NOAUTOEN); - ret = devm_request_irq(ctx->dev, irq, handler, flags, "drm_decon", ctx); + ret = devm_request_irq(ctx->dev, irq, handler, + flags | IRQ_NOAUTOEN, "drm_decon", ctx); if (ret < 0) { dev_err(ctx->dev, "IRQ %s request failed\n", name); return ret; diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c index 83ab6b3..fd9b133b 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c @@ -1352,10 +1352,9 @@ static int exynos_dsi_register_te_irq(struct exynos_dsi *dsi, } te_gpio_irq = gpio_to_irq(dsi->te_gpio); - irq_set_status_flags(te_gpio_irq, IRQ_NOAUTOEN); ret = request_threaded_irq(te_gpio_irq, exynos_dsi_te_irq_handler, NULL, - IRQF_TRIGGER_RISING, "TE", dsi); + IRQF_TRIGGER_RISING | IRQ_NOAUTOEN, "TE", dsi); if (ret) { dev_err(dsi->dev, "request interrupt failed with %d\n", ret); gpio_free(dsi->te_gpio); @@ -1802,9 +1801,9 @@ static int exynos_dsi_probe(struct platform_device *pdev) if (dsi->irq < 0) return dsi->irq; - irq_set_status_flags(dsi->irq, IRQ_NOAUTOEN); ret = devm_request_threaded_irq(dev, dsi->irq, NULL, - exynos_dsi_irq, IRQF_ONESHOT, + exynos_dsi_irq, + IRQF_ONESHOT | IRQ_NOAUTOEN, dev_name(dev), dsi); if (ret) { dev_err(dev, "failed to request dsi irq\n"); -- 2.7.4 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 4/5] drm: Add and export function drm_gem_cma_sync_data
Le jeu. 11 mars 2021 à 12:28, Christoph Hellwig a écrit : On Sun, Mar 07, 2021 at 08:28:34PM +, Paul Cercueil wrote: + drm_atomic_for_each_plane_damage(&iter, &clip) { + for (i = 0; i < finfo->num_planes; i++) { + daddr = drm_fb_cma_get_gem_addr(state->fb, state, i); + + /* Ignore x1/x2 values, invalidate complete lines */ + offset = clip.y1 * state->fb->pitches[i]; + + dma_sync_single_for_device(dev, daddr + offset, + (clip.y2 - clip.y1) * state->fb->pitches[i], + DMA_TO_DEVICE); Are these helpers only ever used to transfer data to the device and never from it? If so please clearly document that. Yes. In the DRM world, are there cases where we transfer data from the device? I assume these cases are handled by v4l2 instead. -Paul ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 5/5] drm/ingenic: Add option to alloc cached GEM buffers
Le jeu. 11 mars 2021 à 12:30, Christoph Hellwig a écrit : On Sun, Mar 07, 2021 at 08:28:35PM +, Paul Cercueil wrote: With the module parameter ingenic-drm.cached_gem_buffers, it is possible to specify that we want GEM buffers backed by non-coherent memory. Shouldn't there be a way to discover this through a DT property? Good question. My original way of thinking was that as this feature speeds up only software rendering, this is really application-dependent: a modern desktop where everything is rendered via the GPU wouldn't benefit much from it. With that in mind, it is fine as a module option. On the other hand... the "software rendering is faster with non-coherent buffers" really is a SoC property, since it is only true for some generations of Ingenic SoCs and not others. So it would make sense to have a DT property for it. -Paul ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 3/5] drm: Add and export function drm_gem_cma_mmap_noncoherent
Le jeu. 11 mars 2021 à 12:36, Christoph Hellwig a écrit : On Thu, Mar 11, 2021 at 12:32:27PM +, Paul Cercueil wrote: > dma_to_phys must not be used by drivers. > > I have a proper helper for this waiting for users: > > http://git.infradead.org/users/hch/misc.git/commitdiff/96a546e7229ec53aadbdb7936d1e5e6cb5958952 > > If you can confirm the helpers works for you I can try to still sneak > it to Linus for 5.12 to ease the merge pain. I can try. How do I get a page pointer from a dma_addr_t? You don't - you get it from using virt_to_page on the pointer returned from dma_alloc_noncoherent. That beind said to keep the API sane I should probably add a wrapper that does that for you. I tested using: ret = dma_mmap_pages(cma_obj->base.dev->dev, vma, vma->vm_end - vma->vm_start, virt_to_page(cma_obj->vaddr)); It works fine. I think I can use remap_pfn_range() for now, and switch to your new API once it's available in drm-misc-next. Cheers, -Paul ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v7 3/3] drm: Add GUD USB Display driver
Den 12.03.2021 05.32, skrev Peter Stuge: > Ilia Mirkin wrote: >> XRGB means that the memory layout should match a 32-bit integer, >> stored as LE, with the low bits being B, next bits being G, etc. This >> translates to byte 0 = B, byte 1 = G, etc. If you're on a BE system, >> and you're handed a XRGB buffer, it still expects that byte 0 = B, >> etc (except as I outlined, some drivers which are from before these >> formats were a thing, sort of do their own thing). Thankfully this is >> equivalent to BGRX (big-endian packing), so you can just munge the >> format. > > I understand! Thanks a lot for clarifying. > > It makes much more sense to me that the format indeed describes > what is in memory rather than how pixels look to software. > > I'm not sure why you guys were talking about BE in the first place, >>> >>> I was worried that the translation didn't consider endianess. >> >> The translation in gud_xrgb_to_color definitely seems suspect. > > So to me this means that the gud_pipe translations from XRGB to the > 1-bit formats *do* have to adjust for the reversed order on BE. > > >> There's also a gud_is_big_endian, but I'm guessing this applies to the >> downstream device rather than the host system. > > gud_is_big_endian() is a static bool wrapper around defined(__BIG_ENDIAN) > so yes, it applies to the host. > > With memory layout being constant I again think gud_xrgb_to_color() > needs to take further steps to work correctly also on BE hosts. (Maybe > that's le32_to_cpu(*pix32), maybe drm_fb_swab(), maybe something else?) > > >> I didn't check if dev->mode_config.quirk_addfb_prefer_host_byte_order >> is set > > I can't tell if that's helpful, probably Noralf can. > I skimmed a discussion a while back around BE and all it's problems through the whole graphics stack and my take away was that I needed a BE machine so I could test the whole stack to make sure it works. Judging from what Ilia says I've been wrong in my assumptions so the driver is probably broken on BE (along with my SPI display drivers). I'm not going to try and fix this now, I need someone with a BE machine that can test the whole stack and make sure it actually works. Me doing more guesswork is not going to work out well I think :) I'll add a FIXME in gud_pipe.c that BE is probably broken and link to this discussion. It seems Gerd did some work to fix BE in the kernel: drm: byteorder fixes https://patchwork.freedesktop.org/series/49073/ But that seems to deal with the format value itself and not the buffer contents. Noralf. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 212229] STM32F469: vblank wait timed out on output to /sys/class/graphics/fb0/pan
https://bugzilla.kernel.org/show_bug.cgi?id=212229 --- Comment #3 from Yauheni Saldatsenka (eugen...@gmail.com) --- Created attachment 295809 --> https://bugzilla.kernel.org/attachment.cgi?id=295809&action=edit /sys/kernel/debug/dri/0/state -- You may reply to this email to add a comment. You are receiving this mail because: You are on the CC list for the bug. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[Bug 210321] /display/dc/dcn20/dcn20_resource.c:3240 dcn20_validate_bandwidth_fp+0x8b/0xd0 [amdgpu]
https://bugzilla.kernel.org/show_bug.cgi?id=210321 Tristen Hayfield (tristen.hayfi...@gmail.com) changed: What|Removed |Added CC||tristen.hayfi...@gmail.com --- Comment #4 from Tristen Hayfield (tristen.hayfi...@gmail.com) --- I'm seeing this on the 5.10.* series as well. Currently 5.10.23, Gentoo. Radeon RX 5500 XT. Looking at the offending section of code, it seems an assertion is being triggered: // Fallback: Try to only support G6 temperature read latency context->bw_ctx.dml.soc.dram_clock_change_latency_us = context->bw_ctx.dml.soc.dummy_pstate_latency_us; voltage_supported = dcn20_validate_bandwidth_internal(dc, context, false); dummy_pstate_supported = context->bw_ctx.bw.dcn.clk.p_state_change_support; if (voltage_supported && dummy_pstate_supported) { context->bw_ctx.bw.dcn.clk.p_state_change_support = false; goto restore_dml_state; } // ERROR: fallback is supposed to always work. ASSERT(false); So one of (or both) voltage_supported and dummy_pstate_supported are evaluating to false here and falling through to the assertions. Stack trace attached for completeness' sake. Hopefully a dev that understands the hardware will take a look at this one day and find it helpful. [ 642.193449] WARNING: CPU: 22 PID: 3546 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_resource.c:3242 dcn20_validate_bandwidth_fp+0xd3/0xf0 [amdgpu] [ 642.193450] Modules linked in: fuse nfs lockd grace nfs_ssc sunrpc k10temp amdgpu backlight gpu_sched snd_hda_codec_hdmi ttm iwlmvm iwlwifi acpi_cpufreq efivarfs [ 642.193457] CPU: 22 PID: 3546 Comm: X Not tainted 5.10.23-gentoo #1 [ 642.193457] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS (WI-FI), BIOS 3402 01/13/2021 [ 642.193487] RIP: 0010:dcn20_validate_bandwidth_fp+0xd3/0xf0 [amdgpu] [ 642.193488] Code: 5d 41 5c c3 5b 48 89 ee 4c 89 e7 5d ba 01 00 00 00 41 5c e9 2f f6 ff ff 41 0f b6 f4 48 c7 c7 a0 a8 8c c0 31 c0 e8 8d 09 14 d9 <0f> 0b 48 89 9d 50 26 00 00 44 89 e0 5b 5d 41 5c c3 0f 0b e9 53 ff [ 642.193489] RSP: 0018:c18284b37b40 EFLAGS: 00010246 [ 642.193490] RAX: RBX: 40794000 RCX: [ 642.193490] RDX: RSI: 9ea22f197380 RDI: 9ea22f197380 [ 642.193491] RBP: 9e93ab0e R08: R09: c18284b37910 [ 642.193492] R10: c18284b37908 R11: 9a78 R12: 0001 [ 642.193492] R13: R14: 9e93ab0e R15: 9e9344e5b560 [ 642.193493] FS: 7fc5f22978c0() GS:9ea22f18() knlGS: [ 642.193494] CS: 0010 DS: ES: CR0: 80050033 [ 642.193494] CR2: 55fa36a96628 CR3: 0001155b2000 CR4: 00350ee0 [ 642.193495] Call Trace: [ 642.193526] dcn20_validate_bandwidth+0x24/0x40 [amdgpu] [ 642.193548] dc_validate_global_state+0x284/0x300 [amdgpu] [ 642.193580] amdgpu_dm_atomic_check+0xb09/0xc00 [amdgpu] [ 642.193584] drm_atomic_check_only+0x555/0x7d0 [ 642.193585] drm_atomic_commit+0xe/0x50 [ 642.193586] drm_atomic_connector_commit_dpms+0xd5/0xf0 [ 642.193588] drm_mode_obj_set_property_ioctl+0x184/0x3a0 [ 642.193589] ? drm_connector_set_obj_prop+0x80/0x80 [ 642.193590] drm_connector_property_set_ioctl+0x32/0x50 [ 642.193592] drm_ioctl_kernel+0xa5/0xf0 [ 642.193593] drm_ioctl+0x20a/0x3a0 [ 642.193594] ? drm_connector_set_obj_prop+0x80/0x80 [ 642.193614] amdgpu_drm_ioctl+0x44/0x80 [amdgpu] [ 642.193616] __x64_sys_ioctl+0x81/0xa0 [ 642.193618] do_syscall_64+0x33/0x80 [ 642.193620] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 642.193621] RIP: 0033:0x7fc5f24cb227 [ 642.193622] Code: 1f 40 00 48 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b1 e8 0c ff ff ff 85 c0 78 b6 5b 4c 89 e0 5d 41 5c c3 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 11 6c 0c 00 f7 d8 64 89 01 48 [ 642.193622] RSP: 002b:7fff122b98c8 EFLAGS: 0246 ORIG_RAX: 0010 [ 642.193623] RAX: ffda RBX: 7fff122b9900 RCX: 7fc5f24cb227 [ 642.193624] RDX: 7fff122b9900 RSI: c01064ab RDI: 000c [ 642.193624] RBP: c01064ab R08: R09: 7fc5f2b97d10 [ 642.193625] R10: 7fc5f2b97d20 R11: 0246 R12: 55fa38755350 [ 642.193625] R13: 000c R14: R15: 55fa36abf540 [ 642.193626] ---[ end trace b1edc8bf2eac897c ]--- I added the following line before the assertion and recompiled the kernel: DC_LOG_ERROR("voltage_supported: %d, dummy_pstate_supported: %d\n", voltage_supported, dummy_pstate_supported); When the issue triggered again, it logged: [drm:dcn20_validate_bandwidth_fp [amdgpu]] *ERROR* voltage_supported: 1, dummy_pstate_supported: 0 So in my case the assertion is being tri
Re: [PATCH v2 05/14] opp: Add devres wrapper for dev_pm_opp_register_notifier
12.03.2021 08:26, Viresh Kumar пишет: > On 11-03-21, 22:20, Dmitry Osipenko wrote: >> From: Yangtao Li >> >> Add devres wrapper for dev_pm_opp_register_notifier() to simplify driver >> code. >> >> Signed-off-by: Yangtao Li >> Signed-off-by: Dmitry Osipenko >> --- >> drivers/opp/core.c | 38 ++ >> include/linux/pm_opp.h | 6 ++ >> 2 files changed, 44 insertions(+) > > As I said in the previous version, I am not sure if we need this patch > at all. This has only one user. > I'll drop this patch in v3, thanks. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 01/14] opp: Add devres wrapper for dev_pm_opp_set_clkname
12.03.2021 13:36, Ulf Hansson пишет: > On Fri, 12 Mar 2021 at 06:33, Viresh Kumar wrote: >> >> On 11-03-21, 22:20, Dmitry Osipenko wrote: >>> +struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char >>> *name) >>> +{ >>> + struct opp_table *opp_table; >>> + int err; >>> + >>> + opp_table = dev_pm_opp_set_clkname(dev, name); >>> + if (IS_ERR(opp_table)) >>> + return opp_table; >>> + >>> + err = devm_add_action_or_reset(dev, devm_pm_opp_clkname_release, >>> opp_table); >>> + if (err) >>> + opp_table = ERR_PTR(err); >>> + >>> + return opp_table; >>> +} >> >> I wonder if we still need to return opp_table from here, or a simple >> integer is fine.. The callers shouldn't be required to use the OPP >> table directly anymore I believe and so better simplify the return >> part of this and all other routines you are adding here.. > > Yes, please. I was thinking along the same lines, when I reviewed the > mmc patch (patch9). > >> >> If there is a user which needs the opp_table, let it use the regular >> non-devm variant. Indeed, that's a very good suggestion! The opp_table isn't needed by the devm users, I'll change it in v3, thanks! ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Patch "drm/fb-helper: only unmap if buffer not null" has been added to the 5.11-stable tree
This is a note to let you know that I've just added the patch titled drm/fb-helper: only unmap if buffer not null to the 5.11-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: drm-fb-helper-only-unmap-if-buffer-not-null.patch and it can be found in the queue-5.11 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let know about it. >From 874a52f9b693ed8bf7a92b3592a547ce8a684e6f Mon Sep 17 00:00:00 2001 From: Tong Zhang Date: Sat, 27 Feb 2021 23:46:25 -0500 Subject: drm/fb-helper: only unmap if buffer not null From: Tong Zhang commit 874a52f9b693ed8bf7a92b3592a547ce8a684e6f upstream. drm_fbdev_cleanup() can be called when fb_helper->buffer is null, hence fb_helper->buffer should be checked before calling drm_client_buffer_vunmap(). This buffer is also checked in drm_client_framebuffer_delete(), so we should also do the same thing for drm_client_buffer_vunmap(). [ 199.128742] RIP: 0010:drm_client_buffer_vunmap+0xd/0x20 [ 199.129031] Code: 43 18 48 8b 53 20 49 89 45 00 49 89 55 08 5b 44 89 e0 41 5c 41 5d 41 5e 5d c3 0f 1f 00 53 48 89 fb 48 8d 7f 10 e8 73 7d a1 ff <48> 8b 7b 10 48 8d 73 18 5b e9 75 53 fc ff 0 f 1f 44 00 00 48 b8 00 [ 199.130041] RSP: 0018:888103f3fc88 EFLAGS: 00010282 [ 199.130329] RAX: 0001 RBX: RCX: 8214d46d [ 199.130733] RDX: 1079c6b9 RSI: 0246 RDI: 83ce35c8 [ 199.131119] RBP: 888103d25458 R08: 0001 R09: fbfff0791761 [ 199.131505] R10: 83c8bb07 R11: fbfff0791760 R12: [ 199.131891] R13: 888103d25468 R14: 888103d25418 R15: 888103f18120 [ 199.132277] FS: 7f36fdcbb6a0() GS:88815b40() knlGS: [ 199.132721] CS: 0010 DS: ES: CR0: 80050033 [ 199.133033] CR2: 0010 CR3: 000103d26000 CR4: 06f0 [ 199.133420] DR0: DR1: DR2: [ 199.133807] DR3: DR6: fffe0ff0 DR7: 0400 [ 199.134195] Call Trace: [ 199.134333] drm_fbdev_cleanup+0x179/0x1a0 [ 199.134562] drm_fbdev_client_unregister+0x2b/0x40 [ 199.134828] drm_client_dev_unregister+0xa8/0x180 [ 199.135088] drm_dev_unregister+0x61/0x110 [ 199.135315] mgag200_pci_remove+0x38/0x52 [mgag200] [ 199.135586] pci_device_remove+0x62/0xe0 [ 199.135806] device_release_driver_internal+0x148/0x270 [ 199.136094] driver_detach+0x76/0xe0 [ 199.136294] bus_remove_driver+0x7e/0x100 [ 199.136521] pci_unregister_driver+0x28/0xf0 [ 199.136759] __x64_sys_delete_module+0x268/0x300 [ 199.137016] ? __ia32_sys_delete_module+0x300/0x300 [ 199.137285] ? call_rcu+0x3e4/0x580 [ 199.137481] ? fpregs_assert_state_consistent+0x4d/0x60 [ 199.137767] ? exit_to_user_mode_prepare+0x2f/0x130 [ 199.138037] do_syscall_64+0x33/0x40 [ 199.138237] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 199.138517] RIP: 0033:0x7f36fdc3dcf7 Signed-off-by: Tong Zhang Fixes: 763aea17bf57 ("drm/fb-helper: Unmap client buffer during shutdown") Cc: Thomas Zimmermann Cc: Sam Ravnborg Cc: Maxime Ripard Cc: Maarten Lankhorst Cc: David Airlie Cc: Daniel Vetter Cc: dri-devel@lists.freedesktop.org Cc: # v5.11+ Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20210228044625.171151-1-ztong0...@gmail.com Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman --- drivers/gpu/drm/drm_fb_helper.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -2043,7 +2043,7 @@ static void drm_fbdev_cleanup(struct drm if (shadow) vfree(shadow); - else + else if (fb_helper->buffer) drm_client_buffer_vunmap(fb_helper->buffer); drm_client_framebuffer_delete(fb_helper->buffer); Patches currently in stable-queue which might be from ztong0...@gmail.com are queue-5.11/drm-fb-helper-only-unmap-if-buffer-not-null.patch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel