date:20210312

Interrupts on are non-reentrant on linux. This is just an ancient
leftover from radeon where irq processing was kicked of from different
places.

Signed-off-by: Christian König 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
 3 files changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a15f1b604733..886625fb464b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 
/* mutex initialization are all done here so we
 * can recall function without having locking issues */
-   atomic_set(&adev->irq.ih.lock, 0);
mutex_init(&adev->firmware.mutex);
mutex_init(&adev->pm.mutex);
mutex_init(&adev->gfx.gpu_clock_mutex);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 1024065f1f03..faaa6aa2faaf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
wptr = amdgpu_ih_get_wptr(adev, ih);
 
 restart_ih:
-   /* is somebody else already processing irqs? */
-   if (atomic_xchg(&ih->lock, 1))
-   return IRQ_NONE;
-
DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
 
/* Order reading of wptr vs. reading of IH ring data */
@@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
 
amdgpu_ih_set_rptr(adev, ih);
wake_up_all(&ih->wait_process);
-   atomic_set(&ih->lock, 0);
 
/* make sure wptr hasn't changed while processing */
wptr = amdgpu_ih_get_wptr(adev, ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 87ec6d20dbe0..0649b59830a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
 
boolenabled;
unsignedrptr;
-   atomic_tlock;
struct amdgpu_ih_regs   ih_regs;
 
/* For waiting on IH processing at checkpoint. */
-- 
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock

On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote:
> Interrupts on are non-reentrant on linux. This is just an ancient
> leftover from radeon where irq processing was kicked of from different
> places.
> 
> Signed-off-by: Christian König 

Man you tricked me into grepping this on radeon and it looks horrible.
atomic_t is unordered in linux, so whatever was built there for radeon
does not wokr like a lock. It's missing all the barriers afiui. Good
riddance at least for amdgpu.

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
>  3 files changed, 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index a15f1b604733..886625fb464b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  
>   /* mutex initialization are all done here so we
>* can recall function without having locking issues */
> - atomic_set(&adev->irq.ih.lock, 0);
>   mutex_init(&adev->firmware.mutex);
>   mutex_init(&adev->pm.mutex);
>   mutex_init(&adev->gfx.gpu_clock_mutex);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> index 1024065f1f03..faaa6aa2faaf 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
> amdgpu_ih_ring *ih)
>   wptr = amdgpu_ih_get_wptr(adev, ih);
>  
>  restart_ih:
> - /* is somebody else already processing irqs? */
> - if (atomic_xchg(&ih->lock, 1))
> - return IRQ_NONE;
> -
>   DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
>  
>   /* Order reading of wptr vs. reading of IH ring data */
> @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
> amdgpu_ih_ring *ih)
>  
>   amdgpu_ih_set_rptr(adev, ih);
>   wake_up_all(&ih->wait_process);
> - atomic_set(&ih->lock, 0);
>  
>   /* make sure wptr hasn't changed while processing */
>   wptr = amdgpu_ih_get_wptr(adev, ih);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> index 87ec6d20dbe0..0649b59830a5 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> @@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
>  
>   boolenabled;
>   unsignedrptr;
> - atomic_tlock;
>   struct amdgpu_ih_regs   ih_regs;
>  
>   /* For waiting on IH processing at checkpoint. */
> -- 
> 2.25.1
> 
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] dma-fence: Document recoverable page fault implications

On Wed, Feb 03, 2021 at 04:40:38PM +0100, Christian König wrote:
> Am 03.02.21 um 16:29 schrieb Daniel Vetter:
> > Recently there was a fairly long thread about recoreable hardware page
> > faults, how they can deadlock, and what to do about that.
> > 
> > While the discussion is still fresh I figured good time to try and
> > document the conclusions a bit. This documentation section explains
> > what's the potential problem, and the remedies we've discussed,
> > roughly ordered from best to worst.
> > 
> > v2: Linus -> Linux typoe (Dave)
> > 
> > v3:
> > - Make it clear drivers only need to implement one option (Christian)
> > - Make it clearer that implicit sync is out the window with exclusive
> >fences (Christian)
> > - Add the fairly theoretical option of segementing the memory (either
> >statically or through dynamic checks at runtime for which piece of
> >memory is managed how) and explain why it's not a great idea (Felix)
> > 
> > References: 
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fdri-devel%2F20210107030127.20393-1-Felix.Kuehling%40amd.com%2F&data=04%7C01%7Cchristian.koenig%40amd.com%7C767e1096b9554ab5b6dd08d8c8587f0f%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637479629728871138%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=Fth2y8c3LuNbweQGrsS7VjYESGlshu6DfQyikiWBwyQ%3D&reserved=0
> > Cc: Dave Airlie 
> > Cc: Maarten Lankhorst 
> > Cc: Thomas Hellström 
> > Cc: "Christian König" 
> > Cc: Jerome Glisse 
> > Cc: Felix Kuehling 
> > Signed-off-by: Daniel Vetter 
> > Cc: Sumit Semwal 
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
> 
> Reviewed-by: Christian König 

Pulled this in, thanks everyone who helped clarify the situation here.

Cheers, Daniel

> I still haven't fully given up on supporting implicit sync with user fences,
> but it is really an eeek, let's try very hard to avoid that, problem.
> 
> Christian
> 
> > ---
> >   Documentation/driver-api/dma-buf.rst | 76 
> >   1 file changed, 76 insertions(+)
> > 
> > diff --git a/Documentation/driver-api/dma-buf.rst 
> > b/Documentation/driver-api/dma-buf.rst
> > index a2133d69872c..7f37ec30d9fd 100644
> > --- a/Documentation/driver-api/dma-buf.rst
> > +++ b/Documentation/driver-api/dma-buf.rst
> > @@ -257,3 +257,79 @@ fences in the kernel. This means:
> > userspace is allowed to use userspace fencing or long running compute
> > workloads. This also means no implicit fencing for shared buffers in 
> > these
> > cases.
> > +
> > +Recoverable Hardware Page Faults Implications
> > +~
> > +
> > +Modern hardware supports recoverable page faults, which has a lot of
> > +implications for DMA fences.
> > +
> > +First, a pending page fault obviously holds up the work that's running on 
> > the
> > +accelerator and a memory allocation is usually required to resolve the 
> > fault.
> > +But memory allocations are not allowed to gate completion of DMA fences, 
> > which
> > +means any workload using recoverable page faults cannot use DMA fences for
> > +synchronization. Synchronization fences controlled by userspace must be 
> > used
> > +instead.
> > +
> > +On GPUs this poses a problem, because current desktop compositor protocols 
> > on
> > +Linux rely on DMA fences, which means without an entirely new userspace 
> > stack
> > +built on top of userspace fences, they cannot benefit from recoverable page
> > +faults. Specifically this means implicit synchronization will not be 
> > possible.
> > +The exception is when page faults are only used as migration hints and 
> > never to
> > +on-demand fill a memory request. For now this means recoverable page
> > +faults on GPUs are limited to pure compute workloads.
> > +
> > +Furthermore GPUs usually have shared resources between the 3D rendering and
> > +compute side, like compute units or command submission engines. If both a 
> > 3D
> > +job with a DMA fence and a compute workload using recoverable page faults 
> > are
> > +pending they could deadlock:
> > +
> > +- The 3D workload might need to wait for the compute job to finish and 
> > release
> > +  hardware resources first.
> > +
> > +- The compute workload might be stuck in a page fault, because the memory
> > +  allocation is waiting for the DMA fence of the 3D workload to complete.
> > +
> > +There are a few options to prevent this problem, one of which drivers need 
> > to
> > +ensure:
> > +
> > +- Compute workloads can always be preempted, even when a page fault is 
> > pending
> > +  and not yet repaired. Not all hardware supports this.
> > +
> > +- DMA fence workloads and workloads which need page fault handling have
> > +  independent hardware resources to guarantee forward progress. This could 
> > be
> > +  achieved through e.g. through dedicated engines and minimal compute unit
> > +  reservations for DMA fence workl

Re: [PATCH v2] fb_defio: Remove custom address_space_operations

On Wed, Mar 10, 2021 at 06:55:30PM +, Matthew Wilcox (Oracle) wrote:
> There's no need to give the page an address_space.  Leaving the
> page->mapping as NULL will cause the VM to handle set_page_dirty()
> the same way that it's handled now, and that was the only reason to
> set the address_space in the first place.
> 
> Signed-off-by: Matthew Wilcox (Oracle) 
> Reviewed-by: Christoph Hellwig 
> Reviewed-by: William Kucharski 

Thanks for your patch, merged to drm-misc-next for 5.13.

While I have an expert here, does this mean that for a VM_PFNMAP we could
pull of the same trick without any struct page backing, assuming we pulle
the per-page dirty state into some tracking of our own?

I'm asking since for DRM drivers we currently have a fairly awkward
situation with a bounce buffer in system memory going on that we copy out
of, because we can't directly use the gpu buffers. If we can track
directly in the gpu buffers, maybe even as some kind of overlay over the
vma, we could avoid that copy.

Otoh no one cares about fbcon performance, so *shrug*.

Cheers, Daniel

> ---
> v2: Delete local variable definitions
>  drivers/video/fbdev/core/fb_defio.c | 35 -
>  drivers/video/fbdev/core/fbmem.c|  4 
>  include/linux/fb.h  |  3 ---
>  3 files changed, 42 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/fb_defio.c 
> b/drivers/video/fbdev/core/fb_defio.c
> index a591d291b231..b292887a2481 100644
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -52,13 +52,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault 
> *vmf)
>   return VM_FAULT_SIGBUS;
>  
>   get_page(page);
> -
> - if (vmf->vma->vm_file)
> - page->mapping = vmf->vma->vm_file->f_mapping;
> - else
> - printk(KERN_ERR "no mapping available\n");
> -
> - BUG_ON(!page->mapping);
>   page->index = vmf->pgoff;
>  
>   vmf->page = page;
> @@ -151,17 +144,6 @@ static const struct vm_operations_struct 
> fb_deferred_io_vm_ops = {
>   .page_mkwrite   = fb_deferred_io_mkwrite,
>  };
>  
> -static int fb_deferred_io_set_page_dirty(struct page *page)
> -{
> - if (!PageDirty(page))
> - SetPageDirty(page);
> - return 0;
> -}
> -
> -static const struct address_space_operations fb_deferred_io_aops = {
> - .set_page_dirty = fb_deferred_io_set_page_dirty,
> -};
> -
>  int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma)
>  {
>   vma->vm_ops = &fb_deferred_io_vm_ops;
> @@ -212,29 +194,12 @@ void fb_deferred_io_init(struct fb_info *info)
>  }
>  EXPORT_SYMBOL_GPL(fb_deferred_io_init);
>  
> -void fb_deferred_io_open(struct fb_info *info,
> -  struct inode *inode,
> -  struct file *file)
> -{
> - file->f_mapping->a_ops = &fb_deferred_io_aops;
> -}
> -EXPORT_SYMBOL_GPL(fb_deferred_io_open);
> -
>  void fb_deferred_io_cleanup(struct fb_info *info)
>  {
>   struct fb_deferred_io *fbdefio = info->fbdefio;
> - struct page *page;
> - int i;
>  
>   BUG_ON(!fbdefio);
>   cancel_delayed_work_sync(&info->deferred_work);
> -
> - /* clear out the mapping that we setup */
> - for (i = 0 ; i < info->fix.smem_len; i += PAGE_SIZE) {
> - page = fb_deferred_io_page(info, i);
> - page->mapping = NULL;
> - }
> -
>   mutex_destroy(&fbdefio->lock);
>  }
>  EXPORT_SYMBOL_GPL(fb_deferred_io_cleanup);
> diff --git a/drivers/video/fbdev/core/fbmem.c 
> b/drivers/video/fbdev/core/fbmem.c
> index 06f5805de2de..372b52a2befa 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1415,10 +1415,6 @@ __releases(&info->lock)
>   if (res)
>   module_put(info->fbops->owner);
>   }
> -#ifdef CONFIG_FB_DEFERRED_IO
> - if (info->fbdefio)
> - fb_deferred_io_open(info, inode, file);
> -#endif
>  out:
>   unlock_fb_info(info);
>   if (res)
> diff --git a/include/linux/fb.h b/include/linux/fb.h
> index ecfbcc0553a5..a8dccd23c249 100644
> --- a/include/linux/fb.h
> +++ b/include/linux/fb.h
> @@ -659,9 +659,6 @@ static inline void __fb_pad_aligned_buffer(u8 *dst, u32 
> d_pitch,
>  /* drivers/video/fb_defio.c */
>  int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma);
>  extern void fb_deferred_io_init(struct fb_info *info);
> -extern void fb_deferred_io_open(struct fb_info *info,
> - struct inode *inode,
> - struct file *file);
>  extern void fb_deferred_io_cleanup(struct fb_info *info);
>  extern int fb_deferred_io_fsync(struct file *file, loff_t start,
>   loff_t end, int datasync);
> -- 
> 2.30.0
> 
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software

Re: [Intel-gfx] [PATCH] i915: Drop legacy execbuffer support

On Thu, Mar 11, 2021 at 10:31:33PM -0600, Jason Ekstrand wrote:
> 
> On March 11, 2021 20:26:06 "Dixit, Ashutosh"  wrote:
> 
> > On Wed, 10 Mar 2021 13:00:49 -0800, Jason Ekstrand wrote:
> > > 
> > > libdrm has supported the newer execbuffer2 ioctl and using it by default
> > > when it exists since libdrm commit 
> > > b50964027bef249a0cc3d511de05c2464e0a1e22
> > > which landed Mar 2, 2010.  The i915 and i965 drivers in Mesa at the time
> > > both used libdrm and so did the Intel X11 back-end.  The SNA back-end
> > > for X11 has always used execbuffer2.
> > > 
> > > Signed-off-by: Jason Ekstrand 
> > > ---
> > > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 100 --
> > > drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 -
> > > drivers/gpu/drm/i915/i915_drv.c   |   2 +-
> > > 3 files changed, 1 insertion(+), 103 deletions(-)
> > 
> > Don't we want to clean up references to legacy execbuffer in
> > include/uapi/drm/i915_drm.h too?
> 
> I thought about that but Daniel said we should leave them. Maybe a comment
> is in order?

These headers are copied unchanged to userspace for building. We don't use
kernel-headers packages directly in any of our userspace (I hope at
least), but still better safe than sorry and avoid compilation failures
simply due to updated uapi headers that lost a few old things.

Also we need at least the struct size because that's encoded in the ioctl
number, and at that point might as well keep the entire thing.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-12 Thread Boris Brezillon

On Thu, 11 Mar 2021 12:16:33 +
Steven Price  wrote:

> Also the current code completely ignores PANFROST_BO_REF_READ. So either 
> that should be defined as 0, or even better we support 3 modes:
> 
>   * Exclusive ('write' access)
>   * Shared ('read' access)
>   * No fence - ensures the BO is mapped, but doesn't add any implicit 
> fences.

I looked at other drivers and they seem to have this
NO_FENCE/NO_IMPLICIT flag at the job/submit level and conditioned on
the presence of a in_sync_fd file. I can see per-BO NO_FENCE being
useful for sub-buffer accesses, assuming we don't want to deal with
other deps explicitly, but we could also force the NO_IMPLICIT behavior
for the whole job and let the user deal with all deps explicitly. TBH,
it's a bit unclear to me how that feature would be used by the vulkan
driver, so I'd rather leave that NO_FENCE use case for later.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] i915: Drop relocation support on all new hardware (v3)

On Thu, Mar 11, 2021 at 12:57:25PM -0600, Jason Ekstrand wrote:
> On Thu, Mar 11, 2021 at 12:20 PM Zbigniew Kempczyński
>  wrote:
> >
> > On Thu, Mar 11, 2021 at 11:18:11AM -0600, Jason Ekstrand wrote:
> > > On Thu, Mar 11, 2021 at 10:51 AM Zbigniew Kempczyński
> > >  wrote:
> > > >
> > > > On Thu, Mar 11, 2021 at 10:24:38AM -0600, Jason Ekstrand wrote:
> > > > > On Thu, Mar 11, 2021 at 9:57 AM Daniel Vetter  wrote:
> > > > > >
> > > > > > On Thu, Mar 11, 2021 at 4:50 PM Jason Ekstrand 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Thu, Mar 11, 2021 at 5:44 AM Zbigniew Kempczyński
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On Wed, Mar 10, 2021 at 03:50:07PM -0600, Jason Ekstrand wrote:
> > > > > > > > > The Vulkan driver in Mesa for Intel hardware never uses 
> > > > > > > > > relocations if
> > > > > > > > > it's running on a version of i915 that supports at least 
> > > > > > > > > softpin which
> > > > > > > > > all versions of i915 supporting Gen12 do.  On the OpenGL 
> > > > > > > > > side, Gen12+ is
> > > > > > > > > only supported by iris which never uses relocations.  The 
> > > > > > > > > older i965
> > > > > > > > > driver in Mesa does use relocations but it only supports 
> > > > > > > > > Intel hardware
> > > > > > > > > through Gen11 and has been deprecated for all hardware Gen9+. 
> > > > > > > > >  The
> > > > > > > > > compute driver also never uses relocations.  This only leaves 
> > > > > > > > > the media
> > > > > > > > > driver which is supposed to be switching to softpin going 
> > > > > > > > > forward.
> > > > > > > > > Making softpin a requirement for all future hardware seems 
> > > > > > > > > reasonable.
> > > > > > > > >
> > > > > > > > > Rejecting relocations starting with Gen12 has the benefit 
> > > > > > > > > that we don't
> > > > > > > > > have to bother supporting it on platforms with local memory.  
> > > > > > > > > Given how
> > > > > > > > > much CPU touching of memory is required for relocations, not 
> > > > > > > > > having to
> > > > > > > > > do so on platforms where not all memory is directly 
> > > > > > > > > CPU-accessible
> > > > > > > > > carries significant advantages.
> > > > > > > > >
> > > > > > > > > v2 (Jason Ekstrand):
> > > > > > > > >  - Allow TGL-LP platforms as they've already shipped
> > > > > > > > >
> > > > > > > > > v3 (Jason Ekstrand):
> > > > > > > > >  - WARN_ON platforms with LMEM support in case the check is 
> > > > > > > > > wrong
> > > > > > > >
> > > > > > > > I was asked to review of this patch. It works along with 
> > > > > > > > expected
> > > > > > > > IGT check 
> > > > > > > > https://patchwork.freedesktop.org/patch/423361/?series=82954&rev=25
> > > > > > > >
> > > > > > > > Before I'll give you r-b - isn't i915_gem_execbuffer2_ioctl() 
> > > > > > > > better place
> > > > > > > > to do for loop just after copy_from_user() and check 
> > > > > > > > relocation_count?
> > > > > > > > We have an access to exec2_list there, we know the gen so we're 
> > > > > > > > able to say
> > > > > > > > relocations are not supported immediate, without entering 
> > > > > > > > i915_gem_do_execbuffer().
> > > > > > >
> > > > > > > I considered that but it adds an extra object list walk for a case
> > > > > > > which we expect to not happen.  I'm not sure how expensive the 
> > > > > > > list
> > > > > > > walk would be if all we do is check the number of relocations on 
> > > > > > > each
> > > > > > > object.  I guess, if it comes right after a copy_from_user, it's 
> > > > > > > all
> > > > > > > hot in the cache so it shouldn't matter.  Ok.  I've convinced 
> > > > > > > myself.
> > > > > > > I'll move it.
> > > > > >
> > > > > > I really wouldn't move it if it's another list walk. Execbuf has a 
> > > > > > lot
> > > > > > of fast-paths going on, and we have extensive tests to make sure it
> > > > > > unwinds correctly in all cases. It's not very intuitive, but execbuf
> > > > > > code isn't scoring very high on that.
> > > > >
> > > > > And here I'd just finished doing the typing to move it.  Good thing I
> > > > > hadn't closed vim yet and it was still in my undo buffer. :-)
> > > >
> > > > Before entering "slower" path from my perspective I would just check
> > > > batch object at that place. We still would have single list walkthrough
> > > > and quick check on the very beginning.
> > >
> > > Can you be more specific about what exactly you think we can check at
> > > the beginning?  Either we add a flag that we can O(1) check at the
> > > beginning or we have to check everything in exec2_list for
> > > exec2_list[n].relocation_count == 0.  That's a list walk.  I'm not
> > > seeing what up-front check you're thinking we can do without the list
> > > walk.
> >
> > I expect that last (default) or first (I915_EXEC_BATCH_FIRST) execobj
> > (batch) will likely has relocations. So we can check that single
> > object without entering i915_gem_do_execbuffer(). If that check
> > is missed (relocation_count = 0) you'll catch r

Re: [syzbot] upstream boot error: WARNING in vkms_vblank_simulate

On Fri, Mar 12, 2021 at 11:46:27AM +0100, Dmitry Vyukov wrote:
> On Fri, Mar 12, 2021 at 11:26 AM syzbot
>  wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit:f78d76e7 Merge tag 'drm-fixes-2021-03-12-1' of git://anong..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=11c16ba2d0
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=dc02c6afcb046874
> > dashboard link: https://syzkaller.appspot.com/bug?extid=333bd014262fd5d0a418
> > userspace arch: arm
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+333bd014262fd5d0a...@syzkaller.appspotmail.com
> 
> This WARNING seems to be happening just randomly.
> It was already reported as:
> 
> #syz dup: WARNING in vkms_vblank_simulate (2)
> https://syzkaller.appspot.com/bug?id=9b10491371879700d6a21c15684c2232ff015084
> 
> It now has whooping 48589 crashes and also crashes slow qemu tcg instances.

Yeah your box is too slow. We're trying to simulate hw here, which means
if we can process less than 1 hrtimer per vblank (standard every 16ms)
then we scream, because things go very wrong with the simulated hw. And
the hrtimer is really not that big, all the expensive processing is pushed
to worker, where we have code to handle if it falls back too much.

So either patch this out or make the code robust against a kernel that
somehow can't process a single hrtimer every 16ms.
-Daniel

> 
> 
> 
> > [ cut here ]
> > WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/vkms/vkms_crtc.c:21 
> > vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41
> > Modules linked in:
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> > 5.12.0-rc2-syzkaller-00338-gf78d76e72a46 #0
> > Hardware name: linux,dummy-virt (DT)
> > pstate: 2085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
> > pc : vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:21
> > lr : hrtimer_forward_now include/linux/hrtimer.h:510 [inline]
> > lr : vkms_vblank_simulate+0x90/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:19
> > sp : 6a693cd0
> > x29: 6a693cd0 x28: 0c9d1e58
> > x27: dfff8000 x26: 6a67f540
> > x25: 1fffed4cfeb1 x24: 1fffed4cfeaa
> > x23: 0c9d0d30 x22: 00fe4c00
> > x21: 6a67f540 x20: 0c9d0e58
> > x19: 0c9d1e58 x18: 6a6a1b48
> > x17: 1fffe1952345 x16: 
> > x15: 8000197bf810 x14: 1fffed4d2750
> > x13: 0001 x12: 0033
> > x11: 12fb4936 x10: 0007
> > x9 : 12fb4943 x8 : 800017d14c00
> > x7 : f1f1f1f1 x6 : dfff8000
> > x5 : 7fff x4 : 0008e44f6b90
> > x3 : 0008e54db790 x2 : 0008e44f6b90
> > x1 : 0008e54db790 x0 : 0002
> > Call trace:
> >  vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41
> >  __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
> >  __hrtimer_run_queues+0x590/0xe40 kernel/time/hrtimer.c:1583
> >  hrtimer_interrupt+0x2d4/0x810 kernel/time/hrtimer.c:1645
> >  timer_handler drivers/clocksource/arm_arch_timer.c:647 [inline]
> >  arch_timer_handler_phys+0x4c/0x70 drivers/clocksource/arm_arch_timer.c:665
> >  handle_percpu_devid_irq+0x19c/0x330 kernel/irq/chip.c:930
> >  generic_handle_irq_desc include/linux/irqdesc.h:158 [inline]
> >  generic_handle_irq kernel/irq/irqdesc.c:652 [inline]
> >  __handle_domain_irq+0x11c/0x1f0 kernel/irq/irqdesc.c:689
> >  handle_domain_irq include/linux/irqdesc.h:176 [inline]
> >  gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370
> >  el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669
> >  arch_local_irq_restore arch/arm64/include/asm/irqflags.h:124 [inline]
> >  queue_work_on+0x74/0x110 kernel/workqueue.c:1528
> >  queue_work include/linux/workqueue.h:507 [inline]
> >  cursor_timer_handler+0x64/0x100 drivers/video/fbdev/core/fbcon.c:397
> >  call_timer_fn+0x1d4/0x9c4 kernel/time/timer.c:1431
> >  expire_timers kernel/time/timer.c:1476 [inline]
> >  __run_timers.part.0+0x530/0xa00 kernel/time/timer.c:1745
> >  __run_timers kernel/time/timer.c:1726 [inline]
> >  run_timer_softirq+0xa4/0x1a0 kernel/time/timer.c:1758
> >  _stext+0x2b4/0x1084
> >  do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
> >  invoke_softirq kernel/softirq.c:228 [inline]
> >  __irq_exit_rcu+0x46c/0x510 kernel/softirq.c:422
> >  irq_exit+0x14/0x84 kernel/softirq.c:446
> >  __handle_domain_irq+0x120/0x1f0 kernel/irq/irqdesc.c:692
> >  handle_domain_irq include/linux/irqdesc.h:176 [inline]
> >  gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370
> >  el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669
> >  arch_local_irq_enable+0xc/0x14 arch/arm64/include/asm/irqflags.h:37
> >  default_idle_call+0x64/0xf4 kernel/sched/idle.c:112
> >  cpuidle_idle_call kernel/sched/idle.c:194 [inline]
> >  do_idle+0x38c/0x4ec kernel/sched/idle.c:300
> >  cpu_startup_entry+0x28/

Re: [PATCH] drm/exynos: move to use request_irq by IRQF_NO_AUTOEN flag

On Fri, Mar 12, 2021 at 07:43:05PM +0800, Tian Tao wrote:
> After this patch cbe16f35bee68 genirq: Add IRQF_NO_AUTOEN for
> request_irq/nmi() is merged. request_irq() after setting
> IRQ_NOAUTOEN as below
> 
> irq_set_status_flags(irq, IRQ_NOAUTOEN);
> request_irq(dev, irq...);
> can be replaced by request_irq() with IRQF_NO_AUTOEN flag.
> 
> Signed-off-by: Tian Tao 

To get all your nice cleanups land faster I strongly recommend you review
other people's small patches. And then ask them to review yours in return.
You have already drm-misc commit rights, so should be all ready to go and
do lots of great stuff!

Cheers, Daniel

> ---
>  drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 4 ++--
>  drivers/gpu/drm/exynos/exynos_drm_dsi.c   | 7 +++
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c 
> b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> index 1f79bc2..f530aff 100644
> --- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> +++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> @@ -775,8 +775,8 @@ static int decon_conf_irq(struct decon_context *ctx, 
> const char *name,
>   return irq;
>   }
>   }
> - irq_set_status_flags(irq, IRQ_NOAUTOEN);
> - ret = devm_request_irq(ctx->dev, irq, handler, flags, "drm_decon", ctx);
> + ret = devm_request_irq(ctx->dev, irq, handler,
> +flags | IRQ_NOAUTOEN, "drm_decon", ctx);
>   if (ret < 0) {
>   dev_err(ctx->dev, "IRQ %s request failed\n", name);
>   return ret;
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
> b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> index 83ab6b3..fd9b133b 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
> @@ -1352,10 +1352,9 @@ static int exynos_dsi_register_te_irq(struct 
> exynos_dsi *dsi,
>   }
>  
>   te_gpio_irq = gpio_to_irq(dsi->te_gpio);
> - irq_set_status_flags(te_gpio_irq, IRQ_NOAUTOEN);
>  
>   ret = request_threaded_irq(te_gpio_irq, exynos_dsi_te_irq_handler, NULL,
> - IRQF_TRIGGER_RISING, "TE", dsi);
> +IRQF_TRIGGER_RISING | IRQ_NOAUTOEN, "TE", 
> dsi);
>   if (ret) {
>   dev_err(dsi->dev, "request interrupt failed with %d\n", ret);
>   gpio_free(dsi->te_gpio);
> @@ -1802,9 +1801,9 @@ static int exynos_dsi_probe(struct platform_device 
> *pdev)
>   if (dsi->irq < 0)
>   return dsi->irq;
>  
> - irq_set_status_flags(dsi->irq, IRQ_NOAUTOEN);
>   ret = devm_request_threaded_irq(dev, dsi->irq, NULL,
> - exynos_dsi_irq, IRQF_ONESHOT,
> + exynos_dsi_irq,
> + IRQF_ONESHOT | IRQ_NOAUTOEN,
>   dev_name(dev), dsi);
>   if (ret) {
>   dev_err(dev, "failed to request dsi irq\n");
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock




Am 12.03.21 um 15:04 schrieb Daniel Vetter:

On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote:

Interrupts on are non-reentrant on linux. This is just an ancient
leftover from radeon where irq processing was kicked of from different
places.

Signed-off-by: Christian König 

Man you tricked me into grepping this on radeon and it looks horrible.
atomic_t is unordered in linux, so whatever was built there for radeon
does not wokr like a lock. It's missing all the barriers afiui. Good
riddance at least for amdgpu.


Hui? atomic_xchg() is perfectly ordered as far as I know.

IIRC Alex added this for R600 because we had boards where interrupts 
where sporadically swallowed during driver startup and we had to kick of 
ring buffer processing manually.


Anyway we have a fence processing fallback timer in amdgpu instead and 
that stuff is probably unused on any modern hardware.



Reviewed-by: Daniel Vetter 


Thanks,
Christian.




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
  drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
  3 files changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a15f1b604733..886625fb464b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
  
  	/* mutex initialization are all done here so we

 * can recall function without having locking issues */
-   atomic_set(&adev->irq.ih.lock, 0);
mutex_init(&adev->firmware.mutex);
mutex_init(&adev->pm.mutex);
mutex_init(&adev->gfx.gpu_clock_mutex);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 1024065f1f03..faaa6aa2faaf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
wptr = amdgpu_ih_get_wptr(adev, ih);
  
  restart_ih:

-   /* is somebody else already processing irqs? */
-   if (atomic_xchg(&ih->lock, 1))
-   return IRQ_NONE;
-
DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
  
  	/* Order reading of wptr vs. reading of IH ring data */

@@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
  
  	amdgpu_ih_set_rptr(adev, ih);

wake_up_all(&ih->wait_process);
-   atomic_set(&ih->lock, 0);
  
  	/* make sure wptr hasn't changed while processing */

wptr = amdgpu_ih_get_wptr(adev, ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 87ec6d20dbe0..0649b59830a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
  
  	boolenabled;

unsignedrptr;
-   atomic_tlock;
struct amdgpu_ih_regs   ih_regs;
  
  	/* For waiting on IH processing at checkpoint. */

--
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock

On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote:
> 
> 
> Am 12.03.21 um 15:04 schrieb Daniel Vetter:
> > On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote:
> > > Interrupts on are non-reentrant on linux. This is just an ancient
> > > leftover from radeon where irq processing was kicked of from different
> > > places.
> > > 
> > > Signed-off-by: Christian König 
> > Man you tricked me into grepping this on radeon and it looks horrible.
> > atomic_t is unordered in linux, so whatever was built there for radeon
> > does not wokr like a lock. It's missing all the barriers afiui. Good
> > riddance at least for amdgpu.
> 
> Hui? atomic_xchg() is perfectly ordered as far as I know.

Oh right, if atomic_ returns a value it's fully ordered. Nice tour into
memory-barriers.txt. It's the atomic_t operations that don't return
anything which are fully unordered, and I mixed that up.

> IIRC Alex added this for R600 because we had boards where interrupts where
> sporadically swallowed during driver startup and we had to kick of ring
> buffer processing manually.
> 
> Anyway we have a fence processing fallback timer in amdgpu instead and that
> stuff is probably unused on any modern hardware.

Ah yes that stuff. Kinda annoying we have these old dma_fence around where
dma_fence_wait doesn't really work that well, but oh well.
-Daniel

> 
> > Reviewed-by: Daniel Vetter 
> 
> Thanks,
> Christian.
> 
> > 
> > > ---
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
> > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
> > >   3 files changed, 7 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > index a15f1b604733..886625fb464b 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> > >   /* mutex initialization are all done here so we
> > >* can recall function without having locking issues */
> > > - atomic_set(&adev->irq.ih.lock, 0);
> > >   mutex_init(&adev->firmware.mutex);
> > >   mutex_init(&adev->pm.mutex);
> > >   mutex_init(&adev->gfx.gpu_clock_mutex);
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > index 1024065f1f03..faaa6aa2faaf 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, 
> > > struct amdgpu_ih_ring *ih)
> > >   wptr = amdgpu_ih_get_wptr(adev, ih);
> > >   restart_ih:
> > > - /* is somebody else already processing irqs? */
> > > - if (atomic_xchg(&ih->lock, 1))
> > > - return IRQ_NONE;
> > > -
> > >   DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
> > >   /* Order reading of wptr vs. reading of IH ring data */
> > > @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, 
> > > struct amdgpu_ih_ring *ih)
> > >   amdgpu_ih_set_rptr(adev, ih);
> > >   wake_up_all(&ih->wait_process);
> > > - atomic_set(&ih->lock, 0);
> > >   /* make sure wptr hasn't changed while processing */
> > >   wptr = amdgpu_ih_get_wptr(adev, ih);
> > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
> > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > index 87ec6d20dbe0..0649b59830a5 100644
> > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > @@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
> > >   boolenabled;
> > >   unsignedrptr;
> > > - atomic_tlock;
> > >   struct amdgpu_ih_regs   ih_regs;
> > >   /* For waiting on IH processing at checkpoint. */
> > > -- 
> > > 2.25.1
> > > 
> > > ___
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock

On Fri, Mar 12, 2021 at 03:35:50PM +0100, Daniel Vetter wrote:
> On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote:
> > 
> > 
> > Am 12.03.21 um 15:04 schrieb Daniel Vetter:
> > > On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote:
> > > > Interrupts on are non-reentrant on linux. This is just an ancient
> > > > leftover from radeon where irq processing was kicked of from different
> > > > places.
> > > > 
> > > > Signed-off-by: Christian König 
> > > Man you tricked me into grepping this on radeon and it looks horrible.
> > > atomic_t is unordered in linux, so whatever was built there for radeon
> > > does not wokr like a lock. It's missing all the barriers afiui. Good
> > > riddance at least for amdgpu.
> > 
> > Hui? atomic_xchg() is perfectly ordered as far as I know.
> 
> Oh right, if atomic_ returns a value it's fully ordered. Nice tour into
> memory-barriers.txt. It's the atomic_t operations that don't return
> anything which are fully unordered, and I mixed that up.
> 
> > IIRC Alex added this for R600 because we had boards where interrupts where
> > sporadically swallowed during driver startup and we had to kick of ring
> > buffer processing manually.
> > 
> > Anyway we have a fence processing fallback timer in amdgpu instead and that
> > stuff is probably unused on any modern hardware.
> 
> Ah yes that stuff. Kinda annoying we have these old dma_fence around where
> dma_fence_wait doesn't really work that well, but oh well.

Argh I'm not awake. dma_fence_wait() works well on these, but the generic
cb stuff (which is used in ever more place) doesn't, because it misses the
fallback timer the ->wait does.
-Daniel

> > > Reviewed-by: Daniel Vetter 
> > 
> > Thanks,
> > Christian.
> > 
> > > 
> > > > ---
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
> > > >   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
> > > >   3 files changed, 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > index a15f1b604733..886625fb464b 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > > > @@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> > > > /* mutex initialization are all done here so we
> > > >  * can recall function without having locking issues */
> > > > -   atomic_set(&adev->irq.ih.lock, 0);
> > > > mutex_init(&adev->firmware.mutex);
> > > > mutex_init(&adev->pm.mutex);
> > > > mutex_init(&adev->gfx.gpu_clock_mutex);
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > > index 1024065f1f03..faaa6aa2faaf 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
> > > > @@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, 
> > > > struct amdgpu_ih_ring *ih)
> > > > wptr = amdgpu_ih_get_wptr(adev, ih);
> > > >   restart_ih:
> > > > -   /* is somebody else already processing irqs? */
> > > > -   if (atomic_xchg(&ih->lock, 1))
> > > > -   return IRQ_NONE;
> > > > -
> > > > DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
> > > > /* Order reading of wptr vs. reading of IH ring data */
> > > > @@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, 
> > > > struct amdgpu_ih_ring *ih)
> > > > amdgpu_ih_set_rptr(adev, ih);
> > > > wake_up_all(&ih->wait_process);
> > > > -   atomic_set(&ih->lock, 0);
> > > > /* make sure wptr hasn't changed while processing */
> > > > wptr = amdgpu_ih_get_wptr(adev, ih);
> > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
> > > > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > > index 87ec6d20dbe0..0649b59830a5 100644
> > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
> > > > @@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
> > > > boolenabled;
> > > > unsignedrptr;
> > > > -   atomic_tlock;
> > > > struct amdgpu_ih_regs   ih_regs;
> > > > /* For waiting on IH processing at checkpoint. */
> > > > -- 
> > > > 2.25.1
> > > > 
> > > > ___
> > > > dri-devel mailing list
> > > > dri-devel@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: nuke the ih reentrant lock


Am 12.03.21 um 15:36 schrieb Daniel Vetter:

On Fri, Mar 12, 2021 at 03:35:50PM +0100, Daniel Vetter wrote:

On Fri, Mar 12, 2021 at 03:27:58PM +0100, Christian König wrote:


Am 12.03.21 um 15:04 schrieb Daniel Vetter:

On Fri, Mar 12, 2021 at 02:59:06PM +0100, Christian König wrote:

Interrupts on are non-reentrant on linux. This is just an ancient
leftover from radeon where irq processing was kicked of from different
places.

Signed-off-by: Christian König 

Man you tricked me into grepping this on radeon and it looks horrible.
atomic_t is unordered in linux, so whatever was built there for radeon
does not wokr like a lock. It's missing all the barriers afiui. Good
riddance at least for amdgpu.

Hui? atomic_xchg() is perfectly ordered as far as I know.

Oh right, if atomic_ returns a value it's fully ordered. Nice tour into
memory-barriers.txt. It's the atomic_t operations that don't return
anything which are fully unordered, and I mixed that up.


IIRC Alex added this for R600 because we had boards where interrupts where
sporadically swallowed during driver startup and we had to kick of ring
buffer processing manually.

Anyway we have a fence processing fallback timer in amdgpu instead and that
stuff is probably unused on any modern hardware.

Ah yes that stuff. Kinda annoying we have these old dma_fence around where
dma_fence_wait doesn't really work that well, but oh well.

Argh I'm not awake. dma_fence_wait() works well on these, but the generic
cb stuff (which is used in ever more place) doesn't, because it misses the
fallback timer the ->wait does.


That's not what I meant. See amdgpu_fence_schedule_fallback().

As soon as somebody requested a dma_fence to be signaled we start a 
500ms timeout for fence processing which is cleared as soon as we see 
the first interrupt.


That is necessary because some motherboards have the ugly behavior of 
swallowing approx ~0.1% if all MSI interrupts after some idle time.


No idea where exactly that comes from, but I have the feeling that this 
was the same bug Alex tried to handle on R6xx over 10 years ago with 
kicking of interrupt processing manually after enabling interrupts for 
the first time.


Regards,
Christian.


-Daniel


Reviewed-by: Daniel Vetter 

Thanks,
Christian.


---
   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 -
   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 5 -
   drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 -
   3 files changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index a15f1b604733..886625fb464b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3284,7 +3284,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
/* mutex initialization are all done here so we
 * can recall function without having locking issues */
-   atomic_set(&adev->irq.ih.lock, 0);
mutex_init(&adev->firmware.mutex);
mutex_init(&adev->pm.mutex);
mutex_init(&adev->gfx.gpu_clock_mutex);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
index 1024065f1f03..faaa6aa2faaf 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c
@@ -228,10 +228,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
wptr = amdgpu_ih_get_wptr(adev, ih);
   restart_ih:
-   /* is somebody else already processing irqs? */
-   if (atomic_xchg(&ih->lock, 1))
-   return IRQ_NONE;
-
DRM_DEBUG("%s: rptr %d, wptr %d\n", __func__, ih->rptr, wptr);
/* Order reading of wptr vs. reading of IH ring data */
@@ -244,7 +240,6 @@ int amdgpu_ih_process(struct amdgpu_device *adev, struct 
amdgpu_ih_ring *ih)
amdgpu_ih_set_rptr(adev, ih);
wake_up_all(&ih->wait_process);
-   atomic_set(&ih->lock, 0);
/* make sure wptr hasn't changed while processing */
wptr = amdgpu_ih_get_wptr(adev, ih);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
index 87ec6d20dbe0..0649b59830a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
@@ -64,7 +64,6 @@ struct amdgpu_ih_ring {
boolenabled;
unsignedrptr;
-   atomic_tlock;
struct amdgpu_ih_regs   ih_regs;
/* For waiting on IH processing at checkpoint. */
--
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2] drm/scheduler re-insert Bailing job to avoid memleak

2021-03-12 Thread Andrey Grodzovsky





On 2021-03-12 1:50 a.m., Jack Zhang wrote:

re-insert Bailing jobs to avoid memory leak.


Usually we put a v2:"Blha blha blha" here to explain
what was modified in v2

Also - since you make changes to another driver you should
add their maintainer and mailing list probably
(use ./scripts/get_maintainer.pl) for this



Signed-off-by: Jack Zhang 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +++-
  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c| 8 ++--
  drivers/gpu/drm/panfrost/panfrost_job.c| 2 +-
  drivers/gpu/drm/scheduler/sched_main.c | 8 +++-
  include/drm/gpu_scheduler.h| 1 +
  5 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 79b9cc73763f..86463b0f936e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4815,8 +4815,10 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev,
job ? job->base.id : -1);
  
  		/* even we skipped this reset, still need to set the job to guilty */

-   if (job)
+   if (job) {
drm_sched_increase_karma(&job->base);
+   r = DRM_GPU_SCHED_STAT_BAILING;
+   }
goto skip_recovery;
}
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c

index 759b34799221..41390bdacd9e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -34,6 +34,7 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct 
drm_sched_job *s_job)
struct amdgpu_job *job = to_amdgpu_job(s_job);
struct amdgpu_task_info ti;
struct amdgpu_device *adev = ring->adev;
+   int ret;
  
  	memset(&ti, 0, sizeof(struct amdgpu_task_info));
  
@@ -52,8 +53,11 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job)

  ti.process_name, ti.tgid, ti.task_name, ti.pid);
  
  	if (amdgpu_device_should_recover_gpu(ring->adev)) {

-   amdgpu_device_gpu_recover(ring->adev, job);
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   ret = amdgpu_device_gpu_recover(ring->adev, job);
+   if (ret == DRM_GPU_SCHED_STAT_BAILING)
+   return DRM_GPU_SCHED_STAT_BAILING;
+   else
+   return DRM_GPU_SCHED_STAT_NOMINAL;
} else {
drm_sched_suspend_timeout(&ring->sched);
if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 6003cfeb1322..c372f4a38736 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -456,7 +456,7 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct 
drm_sched_job
  
  	/* Scheduler is already stopped, nothing to do. */

if (!panfrost_scheduler_stop(&pfdev->js->queue[js], sched_job))
-   return DRM_GPU_SCHED_STAT_NOMINAL;
+   return DRM_GPU_SCHED_STAT_BAILING;


Note that there is another early termination in panfrost
at 
https://elixir.bootlin.com/linux/v5.11.1/source/drivers/gpu/drm/panfrost/panfrost_job.c#L445

So probably should also add there.

  
  	/* Schedule a reset if there's no reset in progress. */

if (!atomic_xchg(&pfdev->reset.pending, 1))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 92d8de24d0a1..a44f621fb5c4 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
  {
struct drm_gpu_scheduler *sched;
struct drm_sched_job *job;
+   int ret;
  
  	sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
  
@@ -331,8 +332,13 @@ static void drm_sched_job_timedout(struct work_struct *work)

list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
  
-		job->sched->ops->timedout_job(job);

+   ret = job->sched->ops->timedout_job(job);
  
+		if (ret == DRM_GPU_SCHED_STAT_BAILING) {

+   spin_lock(&sched->job_list_lock);
+   list_add(&job->node, &sched->ring_mirror_list);
+   spin_unlock(&sched->job_list_lock);
+   }


Just reiterating my comment from v1 here since u missed it -
Problem here that since you already dropped the reset locks you are
racing here now against other recovery threads as they process the same
mirror list, and yet,I think this solution makes things better then
they are now with the leak but still, it's only temporary band-aid until
the full solution to be implemented as described earlier by Christian.
Probably then worth mentioning here with a comment this it's a tem

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-12 Thread Jason Ekstrand

On Fri, Mar 12, 2021 at 1:31 AM Boris Brezillon
 wrote:
>
> On Thu, 11 Mar 2021 12:11:48 -0600
> Jason Ekstrand  wrote:
>
> > > > > > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > > > > > explicit/implicit deps between them), and Vulkan asks that 
> > > > > > > the out
> > > > > > > fence be signaled when all jobs are done. Timeline syncobjs 
> > > > > > > are a
> > > > > > > good match for that use case. All we need to do is pass the 
> > > > > > > same
> > > > > > > fence syncobj to all jobs being attached to a single 
> > > > > > > QueueSubmit
> > > > > > > request, but a different point on the timeline. The syncobj
> > > > > > > timeline wait does the rest and guarantees that we've reached 
> > > > > > > a
> > > > > > > given timeline point (IOW, all jobs before that point are 
> > > > > > > done)
> > > > > > > before declaring the fence as signaled.
> > > > > > > One alternative would be to have dummy 'synchronization' jobs 
> > > > > > > that
> > > > > > > don't actually execute anything on the GPU but declare a 
> > > > > > > dependency
> > > > > > > on all other jobs that are part of the QueueSubmit request, 
> > > > > > > and
> > > > > > > signal the out fence (the scheduler would do most of the work 
> > > > > > > for
> > > > > > > us, all we have to do is support NULL job heads and signal the
> > > > > > > fence directly when that happens instead of queueing the job).
> > > > > >
> > > > > > I have to admit to being rather hazy on the details of timeline
> > > > > > syncobjs, but I thought there was a requirement that the timeline 
> > > > > > moves
> > > > > > monotonically. I.e. if you have multiple jobs signalling the same
> > > > > > syncobj just with different points, then AFAIU the API requires 
> > > > > > that the
> > > > > > points are triggered in order.
> > > > >
> > > > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, so 
> > > > > I
> > > > > might be wrong, but my understanding is that queuing fences (addition
> > > > > of new points in the timeline) should happen in order, but signaling
> > > > > can happen in any order. When checking for a signaled fence the
> > > > > fence-chain logic starts from the last point (or from an explicit 
> > > > > point
> > > > > if you use the timeline wait flavor) and goes backward, stopping at 
> > > > > the
> > > > > first un-signaled node. If I'm correct, that means that fences that
> > > > > are part of a chain can be signaled in any order.
> > > >
> > > > You don't even need a timeline for this.  Just have a single syncobj
> > > > per-queue and make each submit wait on it and then signal it.
> > > > Alternatively, you can just always hang on to the out-fence from the
> > > > previous submit and make the next one wait on that.
> > >
> > > That's what I have right now, but it forces the serialization of all
> > > jobs that are pushed during a submit (and there can be more than one
> > > per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd
> > > be better to not force this serialization if we can avoid it.
> >
> > I'm not familiar with panfrost's needs and I don't work on a tiler and
> > I know there are different issues there.  But...
> >
> > The Vulkan spec requires that everything that all the submits that
> > happen on a given vkQueue happen in-order.  Search the spec for
> > "Submission order" for more details.
>
> Duh, looks like I completely occulted the "Submission order"
> guarantees. This being said, even after reading this chapter multiple
> times I'm not sure what kind of guarantee this gives us, given the
> execution itself can be out-of-order. My understanding is that
> submission order matters for implicit deps, say you have 2 distinct
> VkSubmitInfo, the first one (in submission order) writing to a buffer
> and the second one reading from it, you really want the first one to
> be submitted first and the second one to wait on the implicit BO fence
> created by the first one. If we were to submit out-of-order, this
> guarantee wouldn't be met. OTOH, if we have 2 completely independent
> submits, I don't really see what submission order gives us if execution
> is out-of-order.

Right, this is where things get sticky.  What's guaranteed there is
submission order not execution order.  But, sadly, there's more hidden
in there than you might think.  Before we can go there, though, we
need to establish a few details of Mali hardware works.

My understanding (feel free to correct me if I'm wrong) is that,
roughly, Mali has two engines which have to work together to render:
One engine for compute/vertex/geometry and one for binning and
fragment.  When you go to submit a draw, you fire off any geometry
work on the compute engine and the fragment work on the binning
engine.  At a tile flush boundary (or renderpass in Vulkan), you
insert a dependency between the geometry work you put on the compute
engine and the binning/fr

[RFC 0/6] Default request/fence expiry + watchdog

From: Tvrtko Ursulin 

"Watchdog" aka "restoring hangcheck" aka default request/fence expiry - first
post of a somewhat controversial feature so may be somewhat rough in commit
messages, commentary and implementation. So only RFC for now.

I parenthesise the "watchdog" becuase in classical sense watchdog would allow
userspace to ping it and so remain alive.

I parenthesise "restoring hangcheck" because this series, contrary to the old
hangcheck, is not looking at whether the workload is making any progress from
the kernel side either. (Althoguh disclaimer my memory may be leaky - Daniel
suspects old hangcheck had some stricter, more indiscriminatory, angles to it.
But apart from being prone to both false negatives and false positives I can't
remember that myself.)

Short version - ask is to fail any user submissions after a set time period. In
this RFC that time is ten seconds.

Time counts from the moment user submission is "runnable" (implicit and explicit
dependencies have been cleared) and keeps counting regardless of the GPU
contetion caused by other users of the system. So semantics are really a bit
weak but again, I understand this is really wanted by the DRM core.

As an attempt to compensate for this brutish nature, I proposed adding
extendable configurability via a context param as part of the series. That could
allow userspace to pick different semantics (always going more restrictive than
the system default) and so implement interesting things like long desired media
watchdog. Module trickyness of the implementation there.

Test-with: 20210312093329.1639502-1-tvrtko.ursu...@linux.intel.com
Cc: Daniel Vetter 

Chris Wilson (1):
  drm/i915: Individual request cancellation

Tvrtko Ursulin (5):
  drm/i915: Restrict sentinel requests further
  drm/i915: Request watchdog infrastructure
  drm/i915: Allow userspace to configure the watchdog
  drm/i915: Fail too long user submissions by default
  drm/i915: Allow configuring default request expiry via modparam

 drivers/gpu/drm/i915/Kconfig.profile  |   8 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  92 ++
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   4 +
 drivers/gpu/drm/i915/gt/intel_context_param.h |  11 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   4 +
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   1 +
 .../drm/i915/gt/intel_execlists_submission.c  |  11 +-
 .../drm/i915/gt/intel_execlists_submission.h  |   2 +
 drivers/gpu/drm/i915/gt/intel_gt.c|   3 +
 drivers/gpu/drm/i915/gt/intel_gt.h|   2 +
 drivers/gpu/drm/i915/gt/intel_gt_requests.c   |  21 ++
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |   7 +
 drivers/gpu/drm/i915/i915_params.c|   5 +
 drivers/gpu/drm/i915/i915_params.h|   1 +
 drivers/gpu/drm/i915/i915_request.c   | 129 +++-
 drivers/gpu/drm/i915/i915_request.h   |  12 +-
 drivers/gpu/drm/i915/selftests/i915_request.c | 275 ++
 include/uapi/drm/i915_drm.h   |   5 +-
 18 files changed, 584 insertions(+), 9 deletions(-)

-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[RFC 1/6] drm/i915: Individual request cancellation

From: Chris Wilson 

Currently, we cancel outstanding requests within a context when the
context is closed. We may also want to cancel individual requests using
the same graceful preemption mechanism.

v2 (Tvrtko):
 * Cancel waiters carefully considering no timeline lock and RCU.
 * Fixed selftests.

Signed-off-by: Chris Wilson 
Signed-off-by: Tvrtko Ursulin 
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   1 +
 .../drm/i915/gt/intel_execlists_submission.c  |   9 +-
 drivers/gpu/drm/i915/i915_request.c   |  77 -
 drivers/gpu/drm/i915/i915_request.h   |   4 +-
 drivers/gpu/drm/i915/selftests/i915_request.c | 275 ++
 5 files changed, 360 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 0b062fad1837..e2fb3ae2aaf3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -314,6 +314,7 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
mutex_unlock(&ce->timeline->mutex);
}
 
+   intel_engine_flush_scheduler(engine);
intel_engine_pm_put(engine);
return err;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 85ff5fe861b4..4c2acb5a6c0a 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -421,6 +421,11 @@ static void reset_active(struct i915_request *rq,
ce->lrc.lrca = lrc_update_regs(ce, engine, head);
 }
 
+static bool bad_request(const struct i915_request *rq)
+{
+   return rq->fence.error && i915_request_started(rq);
+}
+
 static struct intel_engine_cs *
 __execlists_schedule_in(struct i915_request *rq)
 {
@@ -433,7 +438,7 @@ __execlists_schedule_in(struct i915_request *rq)
 !intel_engine_has_heartbeat(engine)))
intel_context_set_banned(ce);
 
-   if (unlikely(intel_context_is_banned(ce)))
+   if (unlikely(intel_context_is_banned(ce) || bad_request(rq)))
reset_active(rq, engine);
 
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
@@ -1112,7 +1117,7 @@ static unsigned long active_preempt_timeout(struct 
intel_engine_cs *engine,
return 0;
 
/* Force a fast reset for terminated contexts (ignoring sysfs!) */
-   if (unlikely(intel_context_is_banned(rq->context)))
+   if (unlikely(intel_context_is_banned(rq->context) || bad_request(rq)))
return 1;
 
return READ_ONCE(engine->props.preempt_timeout_ms);
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index e7b4c4bc41a6..fb9c5bb1fe41 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -33,7 +33,10 @@
 #include "gem/i915_gem_context.h"
 #include "gt/intel_breadcrumbs.h"
 #include "gt/intel_context.h"
+#include "gt/intel_engine.h"
+#include "gt/intel_engine_heartbeat.h"
 #include "gt/intel_gpu_commands.h"
+#include "gt/intel_reset.h"
 #include "gt/intel_ring.h"
 #include "gt/intel_rps.h"
 
@@ -429,20 +432,22 @@ void __i915_request_skip(struct i915_request *rq)
rq->infix = rq->postfix;
 }
 
-void i915_request_set_error_once(struct i915_request *rq, int error)
+bool i915_request_set_error_once(struct i915_request *rq, int error)
 {
int old;
 
GEM_BUG_ON(!IS_ERR_VALUE((long)error));
 
if (i915_request_signaled(rq))
-   return;
+   return false;
 
old = READ_ONCE(rq->fence.error);
do {
if (fatal_error(old))
-   return;
+   return false;
} while (!try_cmpxchg(&rq->fence.error, &old, error));
+
+   return true;
 }
 
 struct i915_request *i915_request_mark_eio(struct i915_request *rq)
@@ -609,6 +614,72 @@ void i915_request_unsubmit(struct i915_request *request)
spin_unlock_irqrestore(&se->lock, flags);
 }
 
+static struct intel_engine_cs *active_engine(struct i915_request *rq)
+{
+   struct intel_engine_cs *engine, *locked;
+
+   locked = READ_ONCE(rq->engine);
+   spin_lock_irq(&locked->sched.lock);
+   while (unlikely(locked != (engine = READ_ONCE(rq->engine {
+   spin_unlock(&locked->sched.lock);
+   locked = engine;
+   spin_lock(&locked->sched.lock);
+   }
+
+   engine = NULL;
+   if (i915_request_is_active(rq) && !__i915_request_is_complete(rq))
+   engine = locked;
+
+   spin_unlock_irq(&locked->sched.lock);
+
+   return engine;
+}
+
+static void __cancel_request(struct i915_request *rq)
+{
+   struct intel_engine_cs *engine = active_engine(rq);
+
+   if (engine && intel_engine_pulse(engine))
+   intel_gt_handle_error(engine->gt, engine->mask, 0,
+ "request cancellation by

[RFC 3/6] drm/i915: Request watchdog infrastructure

From: Tvrtko Ursulin 

Prepares the plumbing for setting request/fence expiration time. All code
is put in place but is never activeted due yet missing ability to actually
configure the timer.

Outline of the basic operation:

A timer is started when request is ready for execution. If the request
completes (retires) before the timer fires, timer is cancelled and nothing
further happens.

If the timer fires request is added to a lockless list and worker queued.
Purpose of this is twofold: a) It allows request cancellation from a more
friendly context and b) coalesces multiple expirations into a single event
of consuming the list.

Worker locklessly consumes the list of expired requests and cancels them
all using previous added i915_request_cancel().

Associated timeout value is stored in rq->context.watchdog.timeout_us.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  4 ++
 .../drm/i915/gt/intel_execlists_submission.h  |  2 +
 drivers/gpu/drm/i915/gt/intel_gt.c|  3 ++
 drivers/gpu/drm/i915/gt/intel_gt.h|  2 +
 drivers/gpu/drm/i915/gt/intel_gt_requests.c   | 21 
 drivers/gpu/drm/i915/gt/intel_gt_types.h  |  7 +++
 drivers/gpu/drm/i915/i915_request.c   | 52 +++
 drivers/gpu/drm/i915/i915_request.h   |  8 +++
 8 files changed, 99 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 0ea18c9e2aca..65a5730a4f5b 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -99,6 +99,10 @@ struct intel_context {
 #define CONTEXT_FORCE_SINGLE_SUBMISSION7
 #define CONTEXT_NOPREEMPT  8
 
+   struct {
+   u64 timeout_us;
+   } watchdog;
+
u32 *lrc_reg_state;
union {
struct {
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
index f7bd3fccfee8..4ca9b475e252 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.h
@@ -6,6 +6,7 @@
 #ifndef __INTEL_EXECLISTS_SUBMISSION_H__
 #define __INTEL_EXECLISTS_SUBMISSION_H__
 
+#include 
 #include 
 
 struct drm_printer;
@@ -13,6 +14,7 @@ struct drm_printer;
 struct i915_request;
 struct intel_context;
 struct intel_engine_cs;
+struct intel_gt;
 
 enum {
INTEL_CONTEXT_SCHEDULE_IN = 0,
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index ca76f93bc03d..8d77dcbad059 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -31,6 +31,9 @@ void intel_gt_init_early(struct intel_gt *gt, struct 
drm_i915_private *i915)
INIT_LIST_HEAD(>->closed_vma);
spin_lock_init(>->closed_lock);
 
+   init_llist_head(>->watchdog.list);
+   INIT_WORK(>->watchdog.work, intel_gt_watchdog_work);
+
intel_gt_init_buffer_pool(gt);
intel_gt_init_reset(gt);
intel_gt_init_requests(gt);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
b/drivers/gpu/drm/i915/gt/intel_gt.h
index a17bd8b3195f..7ec395cace69 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -78,4 +78,6 @@ static inline bool intel_gt_is_wedged(const struct intel_gt 
*gt)
 void intel_gt_info_print(const struct intel_gt_info *info,
 struct drm_printer *p);
 
+void intel_gt_watchdog_work(struct work_struct *work);
+
 #endif /* __INTEL_GT_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c 
b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 36ec97f79174..df12fc7f0aa7 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -8,6 +8,7 @@
 #include "i915_drv.h" /* for_each_engine() */
 #include "i915_request.h"
 #include "intel_engine_heartbeat.h"
+#include "intel_execlists_submission.h"
 #include "intel_gt.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
@@ -242,4 +243,24 @@ void intel_gt_fini_requests(struct intel_gt *gt)
 {
/* Wait until the work is marked as finished before unloading! */
cancel_delayed_work_sync(>->requests.retire_work);
+
+   flush_work(>->watchdog.work);
+}
+
+void intel_gt_watchdog_work(struct work_struct *work)
+{
+   struct intel_gt *gt =
+   container_of(work, typeof(*gt), watchdog.work);
+   struct i915_request *rq, *rn;
+   struct llist_node *first;
+
+   first = llist_del_all(>->watchdog.list);
+   if (!first)
+   return;
+
+   llist_for_each_entry_safe(rq, rn, first, watchdog.link) {
+   if (!i915_request_completed(rq))
+   i915_request_cancel(rq, -EINTR);
+   i915_request_put(rq);
+   }
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index 626

[RFC 2/6] drm/i915: Restrict sentinel requests further

From: Tvrtko Ursulin 

Disallow sentinel requests follow previous sentinels to make request
cancellation work better when faced with a chain of requests which have
all been marked as in error.

Signed-off-by: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 4c2acb5a6c0a..4b870eca9693 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -896,7 +896,7 @@ static bool can_merge_rq(const struct i915_request *prev,
if (__i915_request_is_complete(next))
return true;
 
-   if (unlikely((i915_request_flags(prev) ^ i915_request_flags(next)) &
+   if (unlikely((i915_request_flags(prev) | i915_request_flags(next)) &
 (BIT(I915_FENCE_FLAG_NOPREEMPT) |
  BIT(I915_FENCE_FLAG_SENTINEL
return false;
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[RFC 4/6] drm/i915: Allow userspace to configure the watchdog

From: Tvrtko Ursulin 

Idea here is to make the watchdog mechanism more useful than for just
default request/fence expiry.

To this effect a new context param I915_CONTEXT_PARAM_WATCHDOG is added
where the value fields allows passing in a timeout in micro-seconds.

This allows userspace to set a limit to how long they expect their batches
to take, or otherwise they will be cancelled, and userspace notified via
one of the available mechanisms.

Main attractiveness of adding uapi here is perhaps to extend the proposal
by passing in a structure instead of a single value, like for illustration
only:

struct drm_i915_gem_context_watchdog {
__u64 flags;
 #define I915_CONTEXT_WATCHDOG_WALL_TIMEBIT(0)
 #define I915_CONTEXT_WATCHDOG_GPU_TIME BIT(1)
 #define I915_CONTEXT_WATCHDOG_FROM_SUBMIT  BIT(2)
 #define I915_CONTEXT_WATCHDOG_FROM_RUNNABLEBIT(3)
__64 timeout_us;
};

Point being to prepare the uapi for different semantics from the start.
Given how not a single one makes complete sense for all use cases. And
also perhaps satisfy the long wanted media watchdog feature request.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 57 +++
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  4 ++
 drivers/gpu/drm/i915/gt/intel_context_param.h | 11 +++-
 include/uapi/drm/i915_drm.h   |  5 +-
 4 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index ca37d93ef5e7..32b05af4fc8f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -233,6 +233,8 @@ static void intel_context_set_gem(struct intel_context *ce,
if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
intel_engine_has_timeslices(ce->engine))
__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
+
+   intel_context_set_watchdog_us(ce, ctx->watchdog.timeout_us);
 }
 
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
@@ -1397,6 +1399,28 @@ static int set_ringsize(struct i915_gem_context *ctx,
 __intel_context_ring_size(args->value));
 }
 
+static int __apply_watchdog(struct intel_context *ce, void *timeout_us)
+{
+   return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
+}
+
+static int set_watchdog(struct i915_gem_context *ctx,
+   struct drm_i915_gem_context_param *args)
+{
+   int ret;
+
+   if (args->size)
+   return -EINVAL;
+
+   ret = context_apply_all(ctx, __apply_watchdog,
+   (void *)(uintptr_t)args->value);
+
+   if (!ret)
+   ctx->watchdog.timeout_us = args->value;
+
+   return ret;
+}
+
 static int __get_ringsize(struct intel_context *ce, void *arg)
 {
long sz;
@@ -1426,6 +1450,17 @@ static int get_ringsize(struct i915_gem_context *ctx,
return 0;
 }
 
+static int get_watchdog(struct i915_gem_context *ctx,
+   struct drm_i915_gem_context_param *args)
+{
+   if (args->size)
+   return -EINVAL;
+
+   args->value = ctx->watchdog.timeout_us;
+
+   return 0;
+}
+
 int
 i915_gem_user_to_context_sseu(struct intel_gt *gt,
  const struct drm_i915_gem_context_param_sseu 
*user,
@@ -2075,6 +2110,10 @@ static int ctx_setparam(struct drm_i915_file_private 
*fpriv,
ret = set_ringsize(ctx, args);
break;
 
+   case I915_CONTEXT_PARAM_WATCHDOG:
+   ret = set_watchdog(ctx, args);
+   break;
+
case I915_CONTEXT_PARAM_BAN_PERIOD:
default:
ret = -EINVAL;
@@ -2196,6 +2235,19 @@ static int clone_schedattr(struct i915_gem_context *dst,
return 0;
 }
 
+static int clone_watchdog(struct i915_gem_context *dst,
+ struct i915_gem_context *src)
+{
+   int ret;
+
+   ret = context_apply_all(dst, __apply_watchdog,
+   (void *)(uintptr_t)src->watchdog.timeout_us);
+   if (!ret)
+   dst->watchdog = src->watchdog;
+
+   return ret;
+}
+
 static int clone_sseu(struct i915_gem_context *dst,
  struct i915_gem_context *src)
 {
@@ -2279,6 +2331,7 @@ static int create_clone(struct i915_user_extension __user 
*ext, void *data)
MAP(SSEU, clone_sseu),
MAP(TIMELINE, clone_timeline),
MAP(VM, clone_vm),
+   MAP(WATCHDOG, clone_watchdog),
 #undef MAP
};
struct drm_i915_gem_context_create_ext_clone local;
@@ -2532,6 +2585,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
ret = get_ringsize(ctx, args);
break;
 
+   case I915_CONTEXT_PARAM_WATCHDOG:
+   ret = get_watchdog(ctx, args);
+   break;
+

[RFC 5/6] drm/i915: Fail too long user submissions by default

From: Tvrtko Ursulin 

A new Kconfig option CONFIG_DRM_I915_REQUEST_TIMEOUT is added, defaulting
to 10s, and this timeout is applied to _all_ contexts using the previously
added watchdog facility.

Result of this is that any user submission will simply fail after this
time, either causing a reset (for non-preemptable) or incomplete results.

This can have an effect that workloads which used to work fine will
suddenly start failing.

When the default expiry is active userspace will not be allowed to
decrease the timeout using the context param setting.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/Kconfig.profile|  8 
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 47 ++---
 2 files changed, 48 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig.profile 
b/drivers/gpu/drm/i915/Kconfig.profile
index 35bbe2b80596..55e15773 100644
--- a/drivers/gpu/drm/i915/Kconfig.profile
+++ b/drivers/gpu/drm/i915/Kconfig.profile
@@ -1,3 +1,11 @@
+config DRM_I915_REQUEST_TIMEOUT
+   int "Default timeout for requests (ms)"
+   default 1 # milliseconds
+   help
+ ...
+
+ May be 0 to disable the timeout.
+
 config DRM_I915_FENCE_TIMEOUT
int "Timeout for unsignaled foreign fences (ms, jiffy granularity)"
default 1 # milliseconds
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 32b05af4fc8f..21c0176e27a0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -854,6 +854,25 @@ static void __assign_timeline(struct i915_gem_context *ctx,
context_apply_all(ctx, __apply_timeline, timeline);
 }
 
+static int
+__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us);
+
+static void __set_default_fence_expiry(struct i915_gem_context *ctx)
+{
+   struct drm_i915_private *i915 = ctx->i915;
+   int ret;
+
+   if (!IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT))
+   return;
+
+   /* Default expiry for user fences. */
+   ret = __set_watchdog(ctx, CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000);
+   if (ret)
+   drm_notice(&i915->drm,
+  "Failed to configure default fence expiry! (%d)",
+  ret);
+}
+
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 {
@@ -898,6 +917,8 @@ i915_gem_create_context(struct drm_i915_private *i915, 
unsigned int flags)
intel_timeline_put(timeline);
}
 
+   __set_default_fence_expiry(ctx);
+
trace_i915_context_create(ctx);
 
return ctx;
@@ -1404,23 +1425,35 @@ static int __apply_watchdog(struct intel_context *ce, 
void *timeout_us)
return intel_context_set_watchdog_us(ce, (uintptr_t)timeout_us);
 }
 
-static int set_watchdog(struct i915_gem_context *ctx,
-   struct drm_i915_gem_context_param *args)
+static int
+__set_watchdog(struct i915_gem_context *ctx, unsigned long timeout_us)
 {
int ret;
 
-   if (args->size)
-   return -EINVAL;
-
ret = context_apply_all(ctx, __apply_watchdog,
-   (void *)(uintptr_t)args->value);
+   (void *)(uintptr_t)timeout_us);
 
if (!ret)
-   ctx->watchdog.timeout_us = args->value;
+   ctx->watchdog.timeout_us = timeout_us;
 
return ret;
 }
 
+static int set_watchdog(struct i915_gem_context *ctx,
+   struct drm_i915_gem_context_param *args)
+{
+   if (args->size)
+   return -EINVAL;
+
+   /* Disallow disabling or configuring longer watchdog than default. */
+   if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
+   (!args->value ||
+args->value > CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000))
+   return -EPERM;
+
+   return __set_watchdog(ctx, args->value);
+}
+
 static int __get_ringsize(struct intel_context *ce, void *arg)
 {
long sz;
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[RFC 6/6] drm/i915: Allow configuring default request expiry via modparam

From: Tvrtko Ursulin 

Module parameter is added (request_timeout_ms) to allow configuring the
default request/fence expiry.

Default value is inherited from CONFIG_DRM_I915_REQUEST_TIMEOUT.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 +---
 drivers/gpu/drm/i915/i915_params.c  | 5 +
 drivers/gpu/drm/i915/i915_params.h  | 1 +
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 21c0176e27a0..1dae5e2514a9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -866,7 +866,7 @@ static void __set_default_fence_expiry(struct 
i915_gem_context *ctx)
return;
 
/* Default expiry for user fences. */
-   ret = __set_watchdog(ctx, CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000);
+   ret = __set_watchdog(ctx, i915->params.request_timeout_ms * 1000);
if (ret)
drm_notice(&i915->drm,
   "Failed to configure default fence expiry! (%d)",
@@ -1442,13 +1442,15 @@ __set_watchdog(struct i915_gem_context *ctx, unsigned 
long timeout_us)
 static int set_watchdog(struct i915_gem_context *ctx,
struct drm_i915_gem_context_param *args)
 {
+   struct drm_i915_private *i915 = ctx->i915;
+
if (args->size)
return -EINVAL;
 
/* Disallow disabling or configuring longer watchdog than default. */
-   if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
+   if (i915->params.request_timeout_ms &&
(!args->value ||
-args->value > CONFIG_DRM_I915_REQUEST_TIMEOUT * 1000))
+args->value > i915->params.request_timeout_ms * 1000))
return -EPERM;
 
return __set_watchdog(ctx, args->value);
diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index 6939634e56ed..0320878d96b0 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -197,6 +197,11 @@ i915_param_named_unsafe(fake_lmem_start, ulong, 0400,
"Fake LMEM start offset (default: 0)");
 #endif
 
+#if CONFIG_DRM_I915_REQUEST_TIMEOUT
+i915_param_named_unsafe(request_timeout_ms, uint, 0600,
+   "Default request/fence/batch buffer expiration 
timeout.");
+#endif
+
 static __always_inline void _print_param(struct drm_printer *p,
 const char *name,
 const char *type,
diff --git a/drivers/gpu/drm/i915/i915_params.h 
b/drivers/gpu/drm/i915/i915_params.h
index 48f47e44e848..34ebb0662547 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -72,6 +72,7 @@ struct drm_printer;
param(int, enable_dpcd_backlight, -1, 0600) \
param(char *, force_probe, CONFIG_DRM_I915_FORCE_PROBE, 0400) \
param(unsigned long, fake_lmem_start, 0, 0400) \
+   param(unsigned int, request_timeout_ms, 
CONFIG_DRM_I915_REQUEST_TIMEOUT, 0600) \
/* leave bools at the end to not create holes */ \
param(bool, enable_hangcheck, true, 0600) \
param(bool, load_detect_test, false, 0600) \
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 210321] /display/dc/dcn20/dcn20_resource.c:3240 dcn20_validate_bandwidth_fp+0x8b/0xd0 [amdgpu]

https://bugzilla.kernel.org/show_bug.cgi?id=210321

--- Comment #5 from Tristen Hayfield (tristen.hayfi...@gmail.com) ---
I did some more digging into this. I put some logging inside the if block to
see if that branch is ever taken:

if (voltage_supported && dummy_pstate_supported) {
context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
goto restore_dml_state;
}

in order to log when or if the fallback worked. The logs confirmed that the
fallback is often used and generally works. Upon starting up the system and
starting up Xorg I get about a dozen log messages indicating that it entered
the if block. The only exception seems to be as Florian describes above, that
when the display shuts off due to power-saving it triggers the assertion.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH]] drm/amdgpu/gfx9: add gfxoff quirk

2021-03-12 Thread Daniel Gomez

On Thu, 11 Mar 2021 at 21:00, Daniel Gomez  wrote:

> On Thu, 11 Mar 2021 at 17:10, Alex Deucher  wrote:
> >
> > On Thu, Mar 11, 2021 at 10:02 AM Alexandre Desnoyers 
> wrote:
> > >
> > > On Thu, Mar 11, 2021 at 2:49 PM Daniel Gomez  wrote:
> > > >
> > > > On Thu, 11 Mar 2021 at 10:09, Daniel Gomez  wrote:
> > > > >
> > > > > On Wed, 10 Mar 2021 at 18:06, Alex Deucher 
> wrote:
> > > > > >
> > > > > > On Wed, Mar 10, 2021 at 11:37 AM Daniel Gomez 
> wrote:
> > > > > > >
> > > > > > > Disabling GFXOFF via the quirk list fixes a hardware lockup in
> > > > > > > Ryzen V1605B, RAVEN 0x1002:0x15DD rev 0x83.
> > > > > > >
> > > > > > > Signed-off-by: Daniel Gomez 
> > > > > > > ---
> > > > > > >
> > > > > > > This patch is a continuation of the work here:
> > > > > > > https://lkml.org/lkml/2021/2/3/122 where a hardware lockup
> was discussed and
> > > > > > > a dma_fence deadlock was provoke as a side effect. To
> reproduce the issue
> > > > > > > please refer to the above link.
> > > > > > >
> > > > > > > The hardware lockup was introduced in 5.6-rc1 for our
> particular revision as it
> > > > > > > wasn't part of the new blacklist. Before that, in kernel v5.5,
> this hardware was
> > > > > > > working fine without any hardware lock because the GFXOFF was
> actually disabled
> > > > > > > by the if condition for the CHIP_RAVEN case. So this patch,
> adds the 'Radeon
> > > > > > > Vega Mobile Series [1002:15dd] (rev 83)' to the blacklist to
> disable the GFXOFF.
> > > > > > >
> > > > > > > But besides the fix, I'd like to ask from where this revision
> comes from. Is it
> > > > > > > an ASIC revision or is it hardcoded in the VBIOS from our
> vendor? From what I
> > > > > > > can see, it comes from the ASIC and I wonder if somehow we can
> get an APU in the
> > > > > > > future, 'not blacklisted', with the same problem. Then, should
> this table only
> > > > > > > filter for the vendor and device and not the revision? Do you
> know if there are
> > > > > > > any revisions for the 1002:15dd validated, tested and
> functional?
> > > > > >
> > > > > > The pci revision id (RID) is used to specify the specific SKU
> within a
> > > > > > family.  GFXOFF is supposed to be working on all raven
> variants.  It
> > > > > > was tested and functional on all reference platforms and any OEM
> > > > > > platforms that launched with Linux support.  There are a lot of
> > > > > > dependencies on sbios in the early raven variants (0x15dd), so
> it's
> > > > > > likely more of a specific platform issue, but there is not a
> good way
> > > > > > to detect this so we use the DID/SSID/RID as a proxy.  The newer
> raven
> > > > > > variants (0x15d8) have much better GFXOFF support since they all
> > > > > > shipped with newer firmware and sbios.
> > > > >
> > > > > We took one of the first reference platform boards to design our
> > > > > custom board based on the V1605B and I assume it has one of the
> early 'unstable'
> > > > > raven variants with RID 0x83. Also, as OEM we are in control of
> the bios
> > > > > (provided by insyde) but I wasn't sure about the RID so, thanks
> for the
> > > > > clarification. Is there anything we can do with the bios to have
> the GFXOFF
> > > > > enabled and 'stable' for this particular revision? Otherwise we'd
> need to add
> > > > > the 0x83 RID to the table. Also, there is an extra ']' in the patch
> > > > > subject. Sorry
> > > > > for that. Would you need a new patch in case you accept it with
> the ']' removed?
> > > > >
> > > > > Good to hear that the newer raven versions have better GFXOFF
> support.
> > > >
> > > > Adding Alex Desnoyer to the loop as he is the electronic/hardware and
> > > > bios responsible so, he can
> > > > provide more information about this.
> > >
> > > Hello everyone,
> > >
> > > We, Qtechnology, are the OEM of the hardware platform where we
> > > originally discovered the bug.  Our platform is based on the AMD
> > > Dibbler V-1000 reference design, with the latest Insyde BIOS release
> > > available for the (now unsupported) Dibbler platform.  We have the
> > > Insyde BIOS source code internally, so we can make some modifications
> > > as needed.
> > >
> > > The last test that Daniel and myself performed was on a standard
> > > Dibbler PCB rev.B1 motherboard (NOT our platform), and using the
> > > corresponding latest AMD released BIOS "RDB1109GA".  As Daniel wrote,
> > > the hardware lockup can be reproduced on the Dibbler, even if it has a
> > > different RID that our V1605B APU.
> > >
> > > We also have a Neousys Technology POC-515 embedded computer (V-1000,
> > > V1605B) in our office.  The Neousys PC also uses Insyde BIOS.  This
> > > computer is also locking-up in the test.
> > >
> https://www.neousys-tech.com/en/product/application/rugged-embedded/poc-500-amd-ryzen-ultra-compact-embedded-computer
> > >
> > >
> > > Digging into the BIOS source code, the only reference to GFXOFF is in
> > > the SMU and PSP firmware release notes, where some bug fixes have been

Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE

2021-03-12 Thread Robin Murphy


On 2021-03-11 08:26, Christoph Hellwig wrote:

On Wed, Mar 10, 2021 at 06:39:57PM +, Robin Murphy wrote:

Actually... Just mirroring the iommu_dma_strict value into
struct iommu_domain should solve all of that with very little
boilerplate code.


Yes, my initial thought was to directly replace the attribute with a
common flag at iommu_domain level, but since in all cases the behaviour
is effectively global rather than actually per-domain, it seemed
reasonable to take it a step further. This passes compile-testing for
arm64 and x86, what do you think?


It seems to miss a few bits, and also generally seems to be not actually
apply to recent mainline or something like it due to different empty
lines in a few places.


Yeah, that was sketched out on top of some other development patches, 
and in being so focused on not breaking any of the x86 behaviours I did 
indeed overlook fully converting the SMMU drivers... oops!


(my thought was to do the conversion for its own sake, then clean up the 
redundant attribute separately, but I guess it's fine either way)



Let me know what you think of the version here:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup

I'll happily switch the patch to you as the author if you're fine with
that as well.


I still have reservations about removing the attribute API entirely and 
pretending that io_pgtable_cfg is anything other than a SoC-specific 
private interface, but the reworked patch on its own looks reasonable to 
me, thanks! (I wasn't too convinced about the iommu_cmd_line wrappers 
either...) Just iommu_get_dma_strict() needs an export since the SMMU 
drivers can be modular - I consciously didn't add that myself since I 
was mistakenly thinking only iommu-dma would call it.


Robin.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v5 0/7] drm: add simpledrm driver

2021-03-12 Thread nerdopolis

On Friday, March 12, 2021 3:03:46 AM EST Thomas Zimmermann wrote:
> Hi
> 
> Am 12.03.21 um 04:49 schrieb nerdopolis:
> > On Wednesday, March 10, 2021 4:10:35 AM EST Thomas Zimmermann wrote:
> >> Hi
> >>
> >> Am 10.03.21 um 03:50 schrieb nerdopolis:
> >>> On Friday, September 2, 2016 4:22:38 AM EST David Herrmann wrote:
>  Hey
> 
>  On request of Noralf, I picked up the patches and prepared v5. Works 
>  fine with
>  Xorg, if configured according to:
>    
>  https://lists.freedesktop.org/archives/dri-devel/2014-January/052777.html
>  If anyone knows how to make Xorg pick it up dynamically without such a 
>  static
>  configuration, please let me know.
> 
> 
> 
> >>> Hi
> >>>
> >>> I am kind of curious as I do have interest in seeing this merged as well.
> >>
> >> Please take a look at [1]. It's not the same driver, but something to
> >> the same effect. I know it's been almost a year, but I do work on this
> >> and intend to come back with a new version during 2021.
> >>
> >> I currently work on fastboot support for the new driver. But it's a
> >> complicated matter and takes time. If there's interest, we could talk
> >> about merging what's already there.
> >>
> >> Best regards
> >> Thomas
> >>
> >> [1]
> >> https://lore.kernel.org/dri-devel/20200625120011.16168-1-tzimmerm...@suse.de/
> >>
> >>>
> >>> There is an email in this thread from 2018, but when I tried to import an 
> >>> mbox
> >>> file from the whole month for August 2018, for some reason, kmail doesn't 
> >>> see
> >>> the sender and mailing list recipient in that one, so I will reply to 
> >>> this one,
> >>> because I was able to import this into my mail client.
> >>> https://www.spinics.net/lists/dri-devel/msg185519.html
> >>>
> >>> I was able to get this to build against Linux 4.8, but not against a newer
> >>> version, some headers seem to have been split, and some things are off by 
> >>> 8
> >>> and other things. I could NOT find a git repo, but I was able to find the
> >>> newest patches I could find, and import those with git am against 4.8 with
> >>> some tweaks. If that is needed, I can link it, but only if you want.
> >>>
> >>> However in QEMU I wasn't able to figure out how to make it create a
> >>> /dev/dri/card0 device, even after blacklisting the other modules for qxl,
> >>> cirrus, etc, and then modprobe-ing simpledrm
> >>>
> >>> In my view something like this is would be useful. There still could be
> >>> hardware devices that don't have modesetting support (like vmvga in
> >>> qemu/virt-manager as an example). And most wayland servers need a
> >>> /dev/dri/card0 device as well as a potential user-mode TTY replacement 
> >>> would
> >>> also need /dev/dri/card0
> >>>
> >>> I will admit I unfortunately failed to get it to build against master. I
> >>> couldn't figure out some of the changes, where some new structs were off 
> >>> by
> >>> a factor of 8.
> >>>
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>> ___
> >>> dri-devel mailing list
> >>> dri-devel@lists.freedesktop.org
> >>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>>
> >>
> >>
> > Hi
> > 
> > I tried simplekms against v5.9, and it built, and it runs, and is pretty 
> > neat.
> > 
> > I tried using the qxl, cirrus, and vmware card in QEMU. Weston starts on all
> > of them. And I did ensure that the simplekms driver was being used
> 
> Cool! Thanks a lot. When I submit the next patchset can I add
> 
> Tested-by: nerdopolis 
> 
> to the tags?
> 
Sure!
> > 
> > That is, it works after adding GRUB_GFXPAYLOAD_LINUX=keep , to avoid having 
> > to
> > set a VGA option. (although not sure the equivalent in syslinux yet)
> 
> Yeah, it's a known 'limitation.' (1) But it's usually something that 
> Linux distributions take care of.
> 
> The rsp kernel feature needs a set up from the firmware/bootloader/etc. 
> Once the driver has been merged, added other generic drivers (EFI, VESA, 
> etc) should be a lot easier. Those would maybe not require the firmware 
> setup.
> 
> Best regards
> Thomas
> 
> (1) Well, it's the way it's supposed to work.
> 
> >   
> > 
> > Thanks.
> > 
> > 
> 
> 




___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 212255] New: WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask

https://bugzilla.kernel.org/show_bug.cgi?id=212255

Bug ID: 212255
   Summary: WARNING: at arch/x86/kernel/fpu/core.c:129
kernel_fpu_begin_mask
   Product: Drivers
   Version: 2.5
Kernel Version: 5.11.3
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: i...@felicetufo.com
Regression: No

Created attachment 295819
  --> https://bugzilla.kernel.org/attachment.cgi?id=295819&action=edit
dmesg

Hello,
i get two warnings booting kernel 5.11.3. The warning is present on later
5.11.x kernels (included 5.11.6) too, but does not show on 5.11.2.

[ cut here ]
[6.099356] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:129
kernel_fpu_begin_mask+0xa3/0xb0
 ---[ end trace 4cbc711ff0b0578b ]---

[6.102354] [ cut here ]
[6.102354] WARNING: CPU: 6 PID: 366 at arch/x86/kernel/fpu/core.c:155
kernel_fpu_end+0x1e/0x30

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask

https://bugzilla.kernel.org/show_bug.cgi?id=212255

Alex Deucher (alexdeuc...@gmail.com) changed:

   What|Removed |Added

 CC||alexdeuc...@gmail.com

--- Comment #1 from Alex Deucher (alexdeuc...@gmail.com) ---
Possibly fixed with these patches?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=15e8b95d5f7509e0b09289be8c422c459c9f0412
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=680174cfd1e1cea70a8f30ccb44d8fbdf996018e

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v5] Add power/gpu_frequency tracepoint.

2021-03-12 Thread Pavel Machek

On Wed 2021-03-10 13:56:29, Peiyong Lin wrote:
> On Mon, Dec 21, 2020 at 12:10 PM Peiyong Lin  wrote:
> >
> > Historically there is no common trace event for GPU frequency, in
> > downstream Android each different hardware vendor implements their own
> > way to expose GPU frequency, for example as a debugfs node.  This patch
> > standardize it as a common trace event in upstream linux kernel to help
> > the ecosystem have a common implementation across hardware vendors.
> > Toolings in the Linux ecosystem will benefit from this especially in the
> > downstream Android, where this information is critical to graphics
> > developers.
...
> >  /* This part must be outside protection */
> > --
> > 2.29.2.684.gfbc64c5ab5-goog
> >
> 
> Hi there,
> 
> Could you please take a look at this patch?

Please trim email quotes when replying.

Also, you might want to state _name_ of person who should apply this,
so there's no ambiguity. You cc-ed many people...
Pavel
-- 
http://www.livejournal.com/~pavelmachek


signature.asc
Description: Digital signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-12 Thread Boris Brezillon

On Fri, 12 Mar 2021 09:37:49 -0600
Jason Ekstrand  wrote:

> On Fri, Mar 12, 2021 at 1:31 AM Boris Brezillon
>  wrote:
> >
> > On Thu, 11 Mar 2021 12:11:48 -0600
> > Jason Ekstrand  wrote:
> >  
> > > > > > > > 2/ Queued jobs might be executed out-of-order (unless they have
> > > > > > > > explicit/implicit deps between them), and Vulkan asks that 
> > > > > > > > the out
> > > > > > > > fence be signaled when all jobs are done. Timeline syncobjs 
> > > > > > > > are a
> > > > > > > > good match for that use case. All we need to do is pass the 
> > > > > > > > same
> > > > > > > > fence syncobj to all jobs being attached to a single 
> > > > > > > > QueueSubmit
> > > > > > > > request, but a different point on the timeline. The syncobj
> > > > > > > > timeline wait does the rest and guarantees that we've 
> > > > > > > > reached a
> > > > > > > > given timeline point (IOW, all jobs before that point are 
> > > > > > > > done)
> > > > > > > > before declaring the fence as signaled.
> > > > > > > > One alternative would be to have dummy 'synchronization' 
> > > > > > > > jobs that
> > > > > > > > don't actually execute anything on the GPU but declare a 
> > > > > > > > dependency
> > > > > > > > on all other jobs that are part of the QueueSubmit request, 
> > > > > > > > and
> > > > > > > > signal the out fence (the scheduler would do most of the 
> > > > > > > > work for
> > > > > > > > us, all we have to do is support NULL job heads and signal 
> > > > > > > > the
> > > > > > > > fence directly when that happens instead of queueing the 
> > > > > > > > job).  
> > > > > > >
> > > > > > > I have to admit to being rather hazy on the details of timeline
> > > > > > > syncobjs, but I thought there was a requirement that the timeline 
> > > > > > > moves
> > > > > > > monotonically. I.e. if you have multiple jobs signalling the same
> > > > > > > syncobj just with different points, then AFAIU the API requires 
> > > > > > > that the
> > > > > > > points are triggered in order.  
> > > > > >
> > > > > > I only started looking at the SYNCOBJ_TIMELINE API a few days ago, 
> > > > > > so I
> > > > > > might be wrong, but my understanding is that queuing fences 
> > > > > > (addition
> > > > > > of new points in the timeline) should happen in order, but signaling
> > > > > > can happen in any order. When checking for a signaled fence the
> > > > > > fence-chain logic starts from the last point (or from an explicit 
> > > > > > point
> > > > > > if you use the timeline wait flavor) and goes backward, stopping at 
> > > > > > the
> > > > > > first un-signaled node. If I'm correct, that means that fences that
> > > > > > are part of a chain can be signaled in any order.  
> > > > >
> > > > > You don't even need a timeline for this.  Just have a single syncobj
> > > > > per-queue and make each submit wait on it and then signal it.
> > > > > Alternatively, you can just always hang on to the out-fence from the
> > > > > previous submit and make the next one wait on that.  
> > > >
> > > > That's what I have right now, but it forces the serialization of all
> > > > jobs that are pushed during a submit (and there can be more than one
> > > > per command buffer on panfrost :-/). Maybe I'm wrong, but I thought it'd
> > > > be better to not force this serialization if we can avoid it.  
> > >
> > > I'm not familiar with panfrost's needs and I don't work on a tiler and
> > > I know there are different issues there.  But...
> > >
> > > The Vulkan spec requires that everything that all the submits that
> > > happen on a given vkQueue happen in-order.  Search the spec for
> > > "Submission order" for more details.  
> >
> > Duh, looks like I completely occulted the "Submission order"
> > guarantees. This being said, even after reading this chapter multiple
> > times I'm not sure what kind of guarantee this gives us, given the
> > execution itself can be out-of-order. My understanding is that
> > submission order matters for implicit deps, say you have 2 distinct
> > VkSubmitInfo, the first one (in submission order) writing to a buffer
> > and the second one reading from it, you really want the first one to
> > be submitted first and the second one to wait on the implicit BO fence
> > created by the first one. If we were to submit out-of-order, this
> > guarantee wouldn't be met. OTOH, if we have 2 completely independent
> > submits, I don't really see what submission order gives us if execution
> > is out-of-order.  
> 
> Right, this is where things get sticky.  What's guaranteed there is
> submission order not execution order.  But, sadly, there's more hidden
> in there than you might think.  Before we can go there, though, we
> need to establish a few details of Mali hardware works.
> 
> My understanding (feel free to correct me if I'm wrong) is that,
> roughly, Mali has two engines which have to work together to render:
> One engine for compute/vertex/geometry and one for

[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask

https://bugzilla.kernel.org/show_bug.cgi?id=212255

--- Comment #2 from Felice Tufo (i...@felicetufo.com) ---
Thanks Alex,
it seems that Linus merged those patches (just today) for the next -rc release,
am I right?
If so, I'll do a quick test and let you know as soon as Ubuntu team will
release the next mainline kernel (usually just 1 or 2 days after -rc is out).

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 212255] WARNING: at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask

https://bugzilla.kernel.org/show_bug.cgi?id=212255

--- Comment #3 from Alex Deucher (alexdeuc...@gmail.com) ---
(In reply to Felice Tufo from comment #2)
> Thanks Alex,
> it seems that Linus merged those patches (just today) for the next -rc
> release, am I right?

correct.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-12 Thread Boris Brezillon

On Fri, 12 Mar 2021 19:25:13 +0100
Boris Brezillon  wrote:

> > So where does this leave us?  Well, it depends on your submit model
> > and exactly how you handle pipeline barriers that sync between
> > engines.  If you're taking option 3 above and doing two command
> > buffers for each VkCommandBuffer, then you probably want two
> > serialized timelines, one for each engine, and some mechanism to tell
> > the kernel driver "these two command buffers have to run in parallel"
> > so that your ping-pong works.  If you're doing 1 or 2 above, I think
> > you probably still want two simple syncobjs, one for each engine.  You
> > don't really have any need to go all that far back in history.  All
> > you really need to describe is "command buffer X depends on previous
> > compute work" or "command buffer X depends on previous binning work".  
> 
> Okay, so this will effectively force in-order execution. Let's take your
> previous example and add 2 more jobs at the end that have no deps on
> previous commands:
> 
> vkBeginRenderPass() /* Writes to ImageA */
> vkCmdDraw()
> vkCmdDraw()
> ...
> vkEndRenderPass()
> vkPipelineBarrier(imageA /* fragment -> compute */)
> vkCmdDispatch() /* reads imageA, writes BufferB */
> vkBeginRenderPass() /* Writes to ImageC */
> vkCmdBindVertexBuffers(bufferB)
> vkCmdDraw();
> ...
> vkEndRenderPass()
> vkBeginRenderPass() /* Writes to ImageD */
> vkCmdDraw()
> ...
> vkEndRenderPass()
> 
> A: Vertex for the first draw on the compute engine
> B: Vertex for the first draw on the compute engine
> C: Fragment for the first draw on the binning engine; depends on A
> D: Fragment for the second draw on the binning engine; depends on B
> E: Compute on the compute engine; depends on C and D
> F: Vertex for the third draw on the compute engine; depends on E
> G: Fragment for the third draw on the binning engine; depends on F
> H: Vertex for the fourth draw on the compute engine
> I: Fragment for the fourth draw on the binning engine
> 
> When we reach E, we might be waiting for D to finish before scheduling
> the job, and because of the implicit serialization we have on the
> compute queue (F implicitly depends on E, and H on F) we can't schedule
> H either, which could, in theory be started. I guess that's where the
> term submission order is a bit unclear to me. The action of starting a
> job sounds like execution order to me (the order you starts jobs
> determines the execution order since we only have one HW queue per job
> type). All implicit deps have been calculated when we queued the job to
> the SW queue, and I thought that would be enough to meet the submission
> order requirements, but I might be wrong.
> 
> The PoC I have was trying to get rid of this explicit serialization on
> the compute and fragment queues by having one syncobj timeline
> (queue()) and synchronization points (Sx).
> 
> S0: in-fences=, out-fences= #waitSemaphore 
> sync point
> A: in-fences=, out-fences=
> B: in-fences=, out-fences=
> C: in-fences=, out-fence= #implicit dep on A through 
> the tiler context
> D: in-fences=, out-fence= #implicit dep on B through 
> the tiler context
> E: in-fences=, out-fence= #implicit dep on D through 
> imageA
> F: in-fences=, out-fence= #implicit dep on E through 
> buffer B
> G: in-fences=, out-fence= #implicit dep on F through 
> the tiler context
> H: in-fences=, out-fence=
> I: in-fences=, out-fence= #implicit dep on H through 
> the tiler buffer
> S1: in-fences=, out-fences= 
> #signalSemaphore,fence sync point
> # QueueWaitIdle is implemented with a wait(queue(0)), AKA wait on the last 
> point
> 
> With this solution H can be started before E if the compute slot
> is empty and E's implicit deps are not done. It's probably overkill,
> but I thought maximizing GPU utilization was important.

Nevermind, I forgot the drm scheduler was dequeuing jobs in order, so 2
syncobjs (one per queue type) is indeed the right approach.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC PATCH 0/7] drm/panfrost: Add a new submit ioctl

2021-03-12 Thread Alyssa Rosenzweig

> >  3. Each VkCommandBuffer is two command buffers: one for compute and
> > one for binning, and you use some sort of HW synchronization mechanism
> > to handle the dependencies as you ping-pong between them.
> 
> I didn't consider that option. We have a DOORBELL instruction on Bifrost
> to wake up the CPU when the GPU wants to report something (not even
> sure we have something equivalent on Midgard), but there's no
> inter-queue sync mechanism AFAICT.

I was about to say that we don't have hardware support. Even considering
DOORBELL (which AFAIK isn't piped through to anything on kbase at
least), I'm inclined to say we don't have hardware support. Option 2
(what kbase does? dunno how DDK vulkan works though) is probably our
best bet, tbh.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm/amdgpu: Mark mmhub_v1_7_setup_vm_pt_regs() as static

2021-03-12 Thread Souptick Joarder

Kernel test robot throws below warning ->

drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c:56:6: warning: no previous
prototype for 'mmhub_v1_7_setup_vm_pt_regs' [-Wmissing-prototypes]

Mark mmhub_v1_7_setup_vm_pt_regs() as static.

Reported-by: kernel test robot 
Signed-off-by: Souptick Joarder 
---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
index 4df0b73..ae7d8a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
@@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct amdgpu_device 
*adev)
return base;
 }
 
-void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
+static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t 
vmid,
uint64_t page_table_base)
 {
struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
-- 
1.9.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/amdgpu: Mark mmhub_v1_7_setup_vm_pt_regs() as static

2021-03-12 Thread Zeng, Oak

[AMD Official Use Only - Internal Distribution Only]

Thank you Joarder for the fix. But this has already been fixed in our Alex's 
drm-next branch.

Regards,
Oak



On 2021-03-12, 5:19 PM, "Souptick Joarder"  wrote:

Kernel test robot throws below warning ->

drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c:56:6: warning: no previous
prototype for 'mmhub_v1_7_setup_vm_pt_regs' [-Wmissing-prototypes]

Mark mmhub_v1_7_setup_vm_pt_regs() as static.

Reported-by: kernel test robot 
Signed-off-by: Souptick Joarder 
---
 drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c 
b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
index 4df0b73..ae7d8a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_7.c
@@ -53,7 +53,7 @@ static u64 mmhub_v1_7_get_fb_location(struct 
amdgpu_device *adev)
 return base;
 }

-void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, uint32_t vmid,
+static void mmhub_v1_7_setup_vm_pt_regs(struct amdgpu_device *adev, 
uint32_t vmid,
 uint64_t page_table_base)
 {
 struct amdgpu_vmhub *hub = &adev->vmhub[AMDGPU_MMHUB_0];
--
1.9.1


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

2021-03-12 Thread Du Cheng

On Fri, Mar 12, 2021 at 09:36:42AM +0100, Greg Kroah-Hartman wrote:
> On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote:
> > add null-check on function pointer before dereference on ops->cursor
> > 
> > Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com
> > Signed-off-by: Du Cheng 
> > ---
> >  drivers/video/fbdev/core/fbcon.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/video/fbdev/core/fbcon.c 
> > b/drivers/video/fbdev/core/fbcon.c
> > index 44a5cd2f54cc..3406067985b1 100644
> > --- a/drivers/video/fbdev/core/fbcon.c
> > +++ b/drivers/video/fbdev/core/fbcon.c
> > @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
> >  
> > ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1;
> >  
> > +   if (!ops->cursor)
> > +   return;
> > +
> > ops->cursor(vc, info, mode, get_color(vc, info, c, 1),
> > get_color(vc, info, c, 0));
> >  }
> > -- 
> > 2.27.0
> > 
> 
> Is this the same issue reported here:
>   
> https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com
> 
> And has syzbot testing shown that this fix does solve the issue?
> 
> thanks,
> 
> greg k-h

Hi Greg,

After syzbot testing at
https://syzkaller.appspot.com/bug?id=26567b12e74b8791e1db50da6039ee1705e5a7ed
the results showed that shahbazi's patch did not solve this bug, but my patch
passed the syzbot test.

Regards,
Du Cheng





___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] gpu: host1x: mipi: Simplify with dev_err_probe()

2021-03-12 Thread Tian Tao

Common pattern of handling deferred probe can be simplified with
dev_err_probe().  Less code and also it prints the error value.

Signed-off-by: Tian Tao 
---
 drivers/gpu/host1x/mipi.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/mipi.c b/drivers/gpu/host1x/mipi.c
index 2efe12d..4ba022f 100644
--- a/drivers/gpu/host1x/mipi.c
+++ b/drivers/gpu/host1x/mipi.c
@@ -523,10 +523,9 @@ static int tegra_mipi_probe(struct platform_device *pdev)
mutex_init(&mipi->lock);
 
mipi->clk = devm_clk_get(&pdev->dev, NULL);
-   if (IS_ERR(mipi->clk)) {
-   dev_err(&pdev->dev, "failed to get clock\n");
-   return PTR_ERR(mipi->clk);
-   }
+   if (IS_ERR(mipi->clk))
+   return dev_err_probe(&pdev->dev, PTR_ERR(mipi->clk),
+"failed to get clock\n");
 
err = clk_prepare(mipi->clk);
if (err < 0)
-- 
2.7.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm/kmb: remove unneeded semicolon

2021-03-12 Thread angkery

From: Junlin Yang 

Fixes coccicheck warnings:
./drivers/gpu/drm/kmb/kmb_dsi.c:284:3-4: Unneeded semicolon
./drivers/gpu/drm/kmb/kmb_dsi.c:304:3-4: Unneeded semicolon
./drivers/gpu/drm/kmb/kmb_dsi.c:321:3-4: Unneeded semicolon
./drivers/gpu/drm/kmb/kmb_dsi.c:340:3-4: Unneeded semicolon
./drivers/gpu/drm/kmb/kmb_dsi.c:364:2-3: Unneeded semicolon

Signed-off-by: Junlin Yang 
---
 drivers/gpu/drm/kmb/kmb_dsi.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/kmb/kmb_dsi.c b/drivers/gpu/drm/kmb/kmb_dsi.c
index 4b5d82a..231041b 100644
--- a/drivers/gpu/drm/kmb/kmb_dsi.c
+++ b/drivers/gpu/drm/kmb/kmb_dsi.c
@@ -281,7 +281,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 
data_mode,
default:
DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode);
return -EINVAL;
-   };
+   }
break;
case DSI_LP_DT_PPS_YCBCR422_16B:
data_type_param.size_constraint_pixels = 2;
@@ -301,7 +301,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 
data_mode,
default:
DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode);
return -EINVAL;
-   };
+   }
break;
case DSI_LP_DT_LPPS_YCBCR422_20B:
case DSI_LP_DT_PPS_YCBCR422_24B:
@@ -318,7 +318,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 
data_mode,
default:
DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode);
return -EINVAL;
-   };
+   }
break;
case DSI_LP_DT_PPS_RGB565_16B:
data_type_param.size_constraint_pixels = 1;
@@ -337,7 +337,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 
data_mode,
default:
DRM_ERROR("DSI: Invalid data_mode %d\n", data_mode);
return -EINVAL;
-   };
+   }
break;
case DSI_LP_DT_PPS_RGB666_18B:
data_type_param.size_constraint_pixels = 4;
@@ -361,7 +361,7 @@ static u32 mipi_get_datatype_params(u32 data_type, u32 
data_mode,
default:
DRM_ERROR("DSI: Invalid data_type %d\n", data_type);
return -EINVAL;
-   };
+   }
 
*params = data_type_param;
return 0;
-- 
1.9.1


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 212259] New: Entire graphics stack locks up when running SteamVR and sometimes Sway; is sometimes unrecoverable

https://bugzilla.kernel.org/show_bug.cgi?id=212259

Bug ID: 212259
Summary: Entire graphics stack locks up when running SteamVR
and sometimes Sway; is sometimes unrecoverable
Product: Drivers
Version: 2.5
Kernel Version: 5.11.0
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: blocking
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-...@kernel-bugs.osdl.org
Reporter: happysmas...@protonmail.com
Regression: No

Created attachment 295831
--> https://bugzilla.kernel.org/attachment.cgi?id=295831&action=edit
Last 406 lines of dmesg at the time of error

This bug is driving me nuts!

First time, it frequently happened when I ran Sway and LXDE at the same time,
where my entire graphics stack would freeze and I would be unable to switch
virtual TTYs or even kill X, even with kill -9 on everything.

Now, I am experiencing this bug again, this time with SteamVR, and at a MUCH
higher frequency. It seems to happen, fairly randomly, whenever I start
SteamVR, but seems to be more likely when my system has been running for a long
time, or when Waterfox has been running for a long time.

A couple weeks ago I found a way to reset my GPU as a workaround to the bug
(https://www.reddit.com/r/linuxquestions/comments/lpiwkg/how_to_reset_graphics_stack_when_x_stops/),
but today, this did not work. My screen went black, but I was still unable to
kill -9 X, even after trying the reset many times. In dmesg, there were
messages about the reset still being in-progress, but I did not see anything
seem to change. Eventually, I decided to give up on gracefully fixing this bug
with no downtime to my website and poweroff, then hold the power button of my
computer after the HDD activity light went off as even powering off doesn't
seem to work properly with this bug. Only then, could I reboot and recover.

I have attached the end of the most recent dmesg where this failed. I also have
more dmesgs that may or may not be the same bug, but the attachment field only
allows for one.

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] gpu: drm: mediatek: delete redundant printing of return value

2021-03-12 Thread Wang Qing

platform_get_irq() has already checked and printed the return value, 
the printing here is nothing special, it is not necessary at all.

Signed-off-by: Wang Qing 
---
 drivers/gpu/drm/mediatek/mtk_dpi.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c 
b/drivers/gpu/drm/mediatek/mtk_dpi.c
index b05f900..0ac4962
--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -751,10 +751,8 @@ static int mtk_dpi_probe(struct platform_device *pdev)
}
 
dpi->irq = platform_get_irq(pdev, 0);
-   if (dpi->irq <= 0) {
-   dev_err(dev, "Failed to get irq: %d\n", dpi->irq);
+   if (dpi->irq <= 0)
return -EINVAL;
-   }
 
ret = drm_of_find_panel_or_bridge(dev->of_node, 0, 0,
  NULL, &dpi->next_bridge);
-- 
2.7.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 2/2] MIPS: ingenic: gcw0: SPI panel does not require active-high CS

2021-03-12 Thread Thomas Bogendoerfer

On Sun, Mar 07, 2021 at 07:28:30PM +, Paul Cercueil wrote:
> The NT39016 panel is a fun beast, even though the documentation states
> that the CS line is active-low, it will work just fine if the CS line is
> configured as active-high, but it won't work if the CS line is forced
> low or forced high.
> 
> Since it did actually work with the spi-cs-high property, this is not a
> bugfix, but we should nonetheless remove that property to match the
> documentation.
> 
> Signed-off-by: Paul Cercueil 
> ---
>  arch/mips/boot/dts/ingenic/gcw0.dts | 1 -
>  1 file changed, 1 deletion(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 1/2] dt-bindings: panel/kingdisplay,kd035g6-54nt: Remove spi-cs-high

2021-03-12 Thread Thomas Bogendoerfer

On Sun, Mar 07, 2021 at 07:28:29PM +, Paul Cercueil wrote:
> The NT39016 panel is a fun beast, even though the documentation states
> that the CS line is active-low, it will work just fine if the CS line is
> configured as active-high, but it won't work if the CS line is forced
> low or forced high.
> 
> Since it did actually work with the spi-cs-high property, this is not a
> bugfix, but we should nonetheless remove that property from the example
> to match the documentation.
> 
> Signed-off-by: Paul Cercueil 
> ---
>  .../bindings/display/panel/kingdisplay,kd035g6-54nt.yaml | 1 -
>  1 file changed, 1 deletion(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v5 0/7] drm: add simpledrm driver

2021-03-12 Thread Thomas Zimmermann


Hi

Am 12.03.21 um 04:49 schrieb nerdopolis:

On Wednesday, March 10, 2021 4:10:35 AM EST Thomas Zimmermann wrote:

Hi

Am 10.03.21 um 03:50 schrieb nerdopolis:

On Friday, September 2, 2016 4:22:38 AM EST David Herrmann wrote:

Hey

On request of Noralf, I picked up the patches and prepared v5. Works fine with
Xorg, if configured according to:
  https://lists.freedesktop.org/archives/dri-devel/2014-January/052777.html
If anyone knows how to make Xorg pick it up dynamically without such a static
configuration, please let me know.




Hi

I am kind of curious as I do have interest in seeing this merged as well.


Please take a look at [1]. It's not the same driver, but something to
the same effect. I know it's been almost a year, but I do work on this
and intend to come back with a new version during 2021.

I currently work on fastboot support for the new driver. But it's a
complicated matter and takes time. If there's interest, we could talk
about merging what's already there.

Best regards
Thomas

[1]
https://lore.kernel.org/dri-devel/20200625120011.16168-1-tzimmerm...@suse.de/



There is an email in this thread from 2018, but when I tried to import an mbox
file from the whole month for August 2018, for some reason, kmail doesn't see
the sender and mailing list recipient in that one, so I will reply to this one,
because I was able to import this into my mail client.
https://www.spinics.net/lists/dri-devel/msg185519.html

I was able to get this to build against Linux 4.8, but not against a newer
version, some headers seem to have been split, and some things are off by 8
and other things. I could NOT find a git repo, but I was able to find the
newest patches I could find, and import those with git am against 4.8 with
some tweaks. If that is needed, I can link it, but only if you want.

However in QEMU I wasn't able to figure out how to make it create a
/dev/dri/card0 device, even after blacklisting the other modules for qxl,
cirrus, etc, and then modprobe-ing simpledrm

In my view something like this is would be useful. There still could be
hardware devices that don't have modesetting support (like vmvga in
qemu/virt-manager as an example). And most wayland servers need a
/dev/dri/card0 device as well as a potential user-mode TTY replacement would
also need /dev/dri/card0

I will admit I unfortunately failed to get it to build against master. I
couldn't figure out some of the changes, where some new structs were off by
a factor of 8.


Thanks



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel





Hi

I tried simplekms against v5.9, and it built, and it runs, and is pretty neat.

I tried using the qxl, cirrus, and vmware card in QEMU. Weston starts on all
of them. And I did ensure that the simplekms driver was being used


Cool! Thanks a lot. When I submit the next patchset can I add

Tested-by: nerdopolis 

to the tags?



That is, it works after adding GRUB_GFXPAYLOAD_LINUX=keep , to avoid having to
set a VGA option. (although not sure the equivalent in syslinux yet)


Yeah, it's a known 'limitation.' (1) But it's usually something that 
Linux distributions take care of.


The rsp kernel feature needs a set up from the firmware/bootloader/etc. 
Once the driver has been merged, added other generic drivers (EFI, VESA, 
etc) should be a lot easier. Those would maybe not require the firmware 
setup.


Best regards
Thomas

(1) Well, it's the way it's supposed to work.

  


Thanks.




--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



OpenPGP_signature
Description: OpenPGP digital signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: vmwgfx leaking bo pins?




Am 12.03.21 um 00:02 schrieb Zack Rusin:



On Mar 11, 2021, at 17:35, Thomas Hellström (Intel)  
wrote:

Hi, Zack

On 3/11/21 10:07 PM, Zack Rusin wrote:

On Mar 11, 2021, at 05:46, Thomas Hellström (Intel)  
wrote:

Hi,

I tried latest drm-fixes today and saw a lot of these: Fallout from ttm rework?

Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in 
drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12.


Mhm, crap. I hoped that you would have the same issue as radeon somehow 
and could help with debugging.


Please make sure the patch is pushed to drm-misc-fixes so that it ends 
up fixed in 5.12.




z


Hmm, yes but doesn't that fix trip the ttm_bo_unpin() 
dma_resv_assert_held(bo->base.resv)?

No, doesn’t seem to. TBH I’m not sure why myself, but it seems to be working 
fine.


Taking the reservation to unpin at TTM bo free has always been awkward and that's 
why vmwgfx and I guess other TTM drivers have been sloppy doing that as TTM never 
cared. Perhaps TTM could change the pin_count to an atomic and allow unlocked 
unpinning? still requiring the reservation lock for pin_count transition 0->1, 
though.

Yea, that’d probably make sense. I think in general just making sure the 
requirements are consistent and well documented would be great.


Also, pinning at bo creation in vmwgfx has been to do the equivalent of 
ttm_bo_init_reserved() (which api was added later). Creating pinned would make 
the object isolated and allowing the reserve trylock that followed to always 
succeed. With the introduction of the TTM pin_count, it seems 
ttm_bo_init_reserved() is used to enable pinned creation which is used to 
emulate ttm_bo_init_reserved() :)

Yea, we should probably port the vmwgfx code to ttm_bo_init_reserved just to be 
match the newly established semantics.


Yeah, I stumbled over that at well during one of the cleanups and 
considered changing the implementation. But then it got lost in all the 
rework.


Christian.


z


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

2021-03-12 Thread Du Cheng

add null-check on function pointer before dereference on ops->cursor

Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com
Signed-off-by: Du Cheng 
---
 drivers/video/fbdev/core/fbcon.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 44a5cd2f54cc..3406067985b1 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
 
ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1;
 
+   if (!ops->cursor)
+   return;
+
ops->cursor(vc, info, mode, get_color(vc, info, c, 1),
get_color(vc, info, c, 0));
 }
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

2021-03-12 Thread Greg Kroah-Hartman

On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote:
> add null-check on function pointer before dereference on ops->cursor
> 
> Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com
> Signed-off-by: Du Cheng 
> ---
>  drivers/video/fbdev/core/fbcon.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/video/fbdev/core/fbcon.c 
> b/drivers/video/fbdev/core/fbcon.c
> index 44a5cd2f54cc..3406067985b1 100644
> --- a/drivers/video/fbdev/core/fbcon.c
> +++ b/drivers/video/fbdev/core/fbcon.c
> @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
>  
>   ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1;
>  
> + if (!ops->cursor)
> + return;
> +
>   ops->cursor(vc, info, mode, get_color(vc, info, c, 1),
>   get_color(vc, info, c, 0));
>  }
> -- 
> 2.27.0
> 

Is this the same issue reported here:

https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com

And has syzbot testing shown that this fix does solve the issue?

thanks,

greg k-h
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v6 0/8] Add support for SVM atomics in Nouveau

This is the sixth version of a series to add support to Nouveau for atomic
memory operations on OpenCL shared virtual memory (SVM) regions.

There are no significant changes for version six other than correcting a
minor s390 build and bisectability issue and removing a redundant call to
compound_page() when checking for PageLocked in patch 1.

Exclusive device access is implemented by adding a new swap entry type
(SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
difference is that on fault the original entry is immediately restored by
the fault handler instead of waiting.

Restoring the entry triggers calls to MMU notifers which allows a device
driver to revoke the atomic access permission from the GPU prior to the CPU
finalising the entry.

Patches 1 & 2 refactor existing migration and device private entry
functions.

Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
functionality into separate functions - try_to_migrate_one() and
try_to_munlock_one(). These should not change any functionality, but any
help testing would be much appreciated as I have not been able to test
every usage of try_to_unmap_one().

Patch 5 contains the bulk of the implementation for device exclusive
memory.

Patch 6 contains some additions to the HMM selftests to ensure everything
works as expected.

Patch 7 is a cleanup for the Nouveau SVM implementation.

Patch 8 contains the implementation of atomic access for the Nouveau
driver.

This has been tested using the latest upstream Mesa userspace with a simple
OpenCL test program which checks the results of atomic GPU operations on a
SVM buffer whilst also writing to the same buffer from the CPU.

Alistair Popple (8):
  mm: Remove special swap entry functions
  mm/swapops: Rework swap entry manipulation code
  mm/rmap: Split try_to_munlock from try_to_unmap
  mm/rmap: Split migration into its own function
  mm: Device exclusive memory access
  mm: Selftests for exclusive device memory
  nouveau/svm: Refactor nouveau_range_fault
  nouveau/svm: Implement atomic SVM access

 Documentation/vm/hmm.rst  |  19 +-
 arch/s390/mm/pgtable.c|   2 +-
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c | 130 +++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c|   6 +
 fs/proc/task_mmu.c|  23 +-
 include/linux/mmu_notifier.h  |  25 +-
 include/linux/rmap.h  |   9 +-
 include/linux/swap.h  |   8 +-
 include/linux/swapops.h   | 123 ++--
 lib/test_hmm.c| 126 +++-
 lib/test_hmm_uapi.h   |   2 +
 mm/debug_vm_pgtable.c |  12 +-
 mm/hmm.c  |  12 +-
 mm/huge_memory.c  |  45 +-
 mm/hugetlb.c  |  10 +-
 mm/memcontrol.c   |   2 +-
 mm/memory.c   | 127 +++-
 mm/migrate.c  |  41 +-
 mm/mprotect.c |  18 +-
 mm/page_vma_mapped.c  |  15 +-
 mm/rmap.c | 597 +++---
 tools/testing/selftests/vm/hmm-tests.c| 219 +++
 24 files changed, 1313 insertions(+), 260 deletions(-)

-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v6 1/8] mm: Remove special swap entry functions

Remove the migration and device private entry_to_page() and
entry_to_pfn() inline functions and instead open code them directly.
This results in shorter code which is easier to understand.

Signed-off-by: Alistair Popple 
Reviewed-by: Ralph Campbell 

---

v6:
* Removed redundant compound_page() call from inside PageLocked()
* Fixed a minor build issue for s390 reported by kernel test bot

v4:
* Added pfn_swap_entry_to_page()
* Reinstated check that migration entries point to locked pages
* Removed #define swapcache_prepare which isn't needed for CONFIG_SWAP=0
  builds
---
 arch/s390/mm/pgtable.c  |  2 +-
 fs/proc/task_mmu.c  | 23 +-
 include/linux/swap.h|  4 +--
 include/linux/swapops.h | 69 ++---
 mm/hmm.c|  5 ++-
 mm/huge_memory.c|  4 +--
 mm/memcontrol.c |  2 +-
 mm/memory.c | 10 +++---
 mm/migrate.c|  6 ++--
 mm/page_vma_mapped.c|  6 ++--
 10 files changed, 50 insertions(+), 81 deletions(-)

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 18205f851c24..eec3a9d7176e 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -691,7 +691,7 @@ static void ptep_zap_swap_entry(struct mm_struct *mm, 
swp_entry_t entry)
if (!non_swap_entry(entry))
dec_mm_counter(mm, MM_SWAPENTS);
else if (is_migration_entry(entry)) {
-   struct page *page = migration_entry_to_page(entry);
+   struct page *page = pfn_swap_entry_to_page(entry);
 
dec_mm_counter(mm, mm_counter(page));
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..08ee59d945c0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -514,10 +514,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
} else {
mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
}
-   } else if (is_migration_entry(swpent))
-   page = migration_entry_to_page(swpent);
-   else if (is_device_private_entry(swpent))
-   page = device_private_entry_to_page(swpent);
+   } else if (is_pfn_swap_entry(swpent))
+   page = pfn_swap_entry_to_page(swpent);
} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
&& pte_none(*pte))) {
page = xa_load(&vma->vm_file->f_mapping->i_pages,
@@ -549,7 +547,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
swp_entry_t entry = pmd_to_swp_entry(*pmd);
 
if (is_migration_entry(entry))
-   page = migration_entry_to_page(entry);
+   page = pfn_swap_entry_to_page(entry);
}
if (IS_ERR_OR_NULL(page))
return;
@@ -691,10 +689,8 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long 
hmask,
} else if (is_swap_pte(*pte)) {
swp_entry_t swpent = pte_to_swp_entry(*pte);
 
-   if (is_migration_entry(swpent))
-   page = migration_entry_to_page(swpent);
-   else if (is_device_private_entry(swpent))
-   page = device_private_entry_to_page(swpent);
+   if (is_pfn_swap_entry(swpent))
+   page = pfn_swap_entry_to_page(swpent);
}
if (page) {
int mapcount = page_mapcount(page);
@@ -1383,11 +1379,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct 
pagemapread *pm,
frame = swp_type(entry) |
(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
flags |= PM_SWAP;
-   if (is_migration_entry(entry))
-   page = migration_entry_to_page(entry);
-
-   if (is_device_private_entry(entry))
-   page = device_private_entry_to_page(entry);
+   if (is_pfn_swap_entry(entry))
+   page = pfn_swap_entry_to_page(entry);
}
 
if (page && !PageAnon(page))
@@ -1444,7 +1437,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long 
addr, unsigned long end,
if (pmd_swp_soft_dirty(pmd))
flags |= PM_SOFT_DIRTY;
VM_BUG_ON(!is_pmd_migration_entry(pmd));
-   page = migration_entry_to_page(entry);
+   page = pfn_swap_entry_to_page(entry);
}
 #endif
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4cc6ec3bf0ab..516104b9334b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -523,8 +523,8 @@ static inline void show_swap_cache_info(void)
 {
 }
 
-#define free_swap_and_cache(e) ({(is_migration_entry(e) || 
is_device_private_entry(e));})
-#define swapcache_prepare(e) ({(is_migration_entry

[PATCH v6 2/8] mm/swapops: Rework swap entry manipulation code

Both migration and device private pages use special swap entries that
are manipluated by a range of inline functions. The arguments to these
are somewhat inconsitent so rework them to remove flag type arguments
and to make the arguments similar for both read and write entry
creation.

Signed-off-by: Alistair Popple 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Ralph Campbell 
---
 include/linux/swapops.h | 56 ++---
 mm/debug_vm_pgtable.c   | 12 -
 mm/hmm.c|  2 +-
 mm/huge_memory.c| 26 +--
 mm/hugetlb.c| 10 +---
 mm/memory.c | 10 +---
 mm/migrate.c| 26 ++-
 mm/mprotect.c   | 10 +---
 mm/rmap.c   | 10 +---
 9 files changed, 100 insertions(+), 62 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 139be8235ad2..4dfd807ae52a 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -100,35 +100,35 @@ static inline void *swp_to_radix_entry(swp_entry_t entry)
 }
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
-static inline swp_entry_t make_device_private_entry(struct page *page, bool 
write)
+static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
-   return swp_entry(write ? SWP_DEVICE_WRITE : SWP_DEVICE_READ,
-page_to_pfn(page));
+   return swp_entry(SWP_DEVICE_READ, offset);
 }
 
-static inline bool is_device_private_entry(swp_entry_t entry)
+static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
-   int type = swp_type(entry);
-   return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE;
+   return swp_entry(SWP_DEVICE_WRITE, offset);
 }
 
-static inline void make_device_private_entry_read(swp_entry_t *entry)
+static inline bool is_device_private_entry(swp_entry_t entry)
 {
-   *entry = swp_entry(SWP_DEVICE_READ, swp_offset(*entry));
+   int type = swp_type(entry);
+   return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE;
 }
 
-static inline bool is_write_device_private_entry(swp_entry_t entry)
+static inline bool is_writable_device_private_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_DEVICE_WRITE);
 }
 #else /* CONFIG_DEVICE_PRIVATE */
-static inline swp_entry_t make_device_private_entry(struct page *page, bool 
write)
+static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
return swp_entry(0, 0);
 }
 
-static inline void make_device_private_entry_read(swp_entry_t *entry)
+static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
+   return swp_entry(0, 0);
 }
 
 static inline bool is_device_private_entry(swp_entry_t entry)
@@ -136,35 +136,32 @@ static inline bool is_device_private_entry(swp_entry_t 
entry)
return false;
 }
 
-static inline bool is_write_device_private_entry(swp_entry_t entry)
+static inline bool is_writable_device_private_entry(swp_entry_t entry)
 {
return false;
 }
 #endif /* CONFIG_DEVICE_PRIVATE */
 
 #ifdef CONFIG_MIGRATION
-static inline swp_entry_t make_migration_entry(struct page *page, int write)
-{
-   BUG_ON(!PageLocked(compound_head(page)));
-
-   return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
-   page_to_pfn(page));
-}
-
 static inline int is_migration_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
-static inline int is_write_migration_entry(swp_entry_t entry)
+static inline int is_writable_migration_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
-static inline void make_migration_entry_read(swp_entry_t *entry)
+static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 {
-   *entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
+   return swp_entry(SWP_MIGRATION_READ, offset);
+}
+
+static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(SWP_MIGRATION_WRITE, offset);
 }
 
 extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
@@ -174,21 +171,28 @@ extern void migration_entry_wait(struct mm_struct *mm, 
pmd_t *pmd,
 extern void migration_entry_wait_huge(struct vm_area_struct *vma,
struct mm_struct *mm, pte_t *pte);
 #else
+static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(0, 0);
+}
+
+static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(0, 0);
+}
 
-#define make_migration_entry(page, write) swp_entry(0, 0)
 static inline int is_migration_entry(swp_entry_t swp)
 {
return 0;
 }
 
-static inline void make_migration_entry_read(swp_entry_t *entryp) { }
 static inline void __migration_entry_wait(struct mm_struct *mm, pt

[PATCH v6 3/8] mm/rmap: Split try_to_munlock from try_to_unmap

The behaviour of try_to_unmap_one() is difficult to follow because it
performs different operations based on a fairly large set of flags used
in different combinations.

TTU_MUNLOCK is one such flag. However it is exclusively used by
try_to_munlock() which specifies no other flags. Therefore rather than
overload try_to_unmap_one() with unrelated behaviour split this out into
it's own function and remove the flag.

Signed-off-by: Alistair Popple 
Reviewed-by: Ralph Campbell 

---

Christoph - I didn't add your Reviewed-by from v3 because removal of the
extra VM_LOCKED check in v4 changed things slightly. Let me know if
you're still ok for me to add it. Thanks.

v4:
* Removed redundant check for VM_LOCKED
---
 include/linux/rmap.h |  1 -
 mm/rmap.c| 40 
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index def5c62c93b3..e26ac2d71346 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -87,7 +87,6 @@ struct anon_vma_chain {
 
 enum ttu_flags {
TTU_MIGRATION   = 0x1,  /* migration mode */
-   TTU_MUNLOCK = 0x2,  /* munlock mode */
 
TTU_SPLIT_HUGE_PMD  = 0x4,  /* split huge PMD if any */
TTU_IGNORE_MLOCK= 0x8,  /* ignore mlock */
diff --git a/mm/rmap.c b/mm/rmap.c
index 977e70803ed8..d02bade5245b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1405,10 +1405,6 @@ static bool try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
struct mmu_notifier_range range;
enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
-   /* munlock has nothing to gain from examining un-locked vmas */
-   if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED))
-   return true;
-
if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) &&
is_zone_device_page(page) && !is_device_private_page(page))
return true;
@@ -1469,8 +1465,6 @@ static bool try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
page_vma_mapped_walk_done(&pvmw);
break;
}
-   if (flags & TTU_MUNLOCK)
-   continue;
}
 
/* Unexpected PMD-mapped THP? */
@@ -1784,6 +1778,37 @@ bool try_to_unmap(struct page *page, enum ttu_flags 
flags)
return !page_mapcount(page) ? true : false;
 }
 
+static bool try_to_munlock_one(struct page *page, struct vm_area_struct *vma,
+unsigned long address, void *arg)
+{
+   struct page_vma_mapped_walk pvmw = {
+   .page = page,
+   .vma = vma,
+   .address = address,
+   };
+
+   /* munlock has nothing to gain from examining un-locked vmas */
+   if (!(vma->vm_flags & VM_LOCKED))
+   return true;
+
+   while (page_vma_mapped_walk(&pvmw)) {
+   /* PTE-mapped THP are never mlocked */
+   if (!PageTransCompound(page)) {
+   /*
+* Holding pte lock, we do *not* need
+* mmap_lock here
+*/
+   mlock_vma_page(page);
+   }
+   page_vma_mapped_walk_done(&pvmw);
+
+   /* found a mlocked page, no point continuing munlock check */
+   return false;
+   }
+
+   return true;
+}
+
 /**
  * try_to_munlock - try to munlock a page
  * @page: the page to be munlocked
@@ -1796,8 +1821,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
 void try_to_munlock(struct page *page)
 {
struct rmap_walk_control rwc = {
-   .rmap_one = try_to_unmap_one,
-   .arg = (void *)TTU_MUNLOCK,
+   .rmap_one = try_to_munlock_one,
.done = page_not_mapped,
.anon_lock = page_lock_anon_vma_read,
 
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v6 4/8] mm/rmap: Split migration into its own function

Migration is currently implemented as a mode of operation for
try_to_unmap_one() generally specified by passing the TTU_MIGRATION flag
or in the case of splitting a huge anonymous page TTU_SPLIT_FREEZE.

However it does not have much in common with the rest of the unmap
functionality of try_to_unmap_one() and thus splitting it into a
separate function reduces the complexity of try_to_unmap_one() making it
more readable.

Several simplifications can also be made in try_to_migrate_one() based
on the following observations:

 - All users of TTU_MIGRATION also set TTU_IGNORE_MLOCK.
 - No users of TTU_MIGRATION ever set TTU_IGNORE_HWPOISON.
 - No users of TTU_MIGRATION ever set TTU_BATCH_FLUSH.

TTU_SPLIT_FREEZE is a special case of migration used when splitting an
anonymous page. This is most easily dealt with by calling the correct
function from unmap_page() in mm/huge_memory.c  - either
try_to_migrate() for PageAnon or try_to_unmap().

Signed-off-by: Alistair Popple 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Ralph Campbell 

---

v5:
* Added comments about how PMD splitting works for migration vs.
  unmapping
* Tightened up the flag check in try_to_migrate() to be explicit about
  which TTU_XXX flags are supported.
---
 include/linux/rmap.h |   4 +-
 mm/huge_memory.c |  15 +-
 mm/migrate.c |   9 +-
 mm/rmap.c| 358 ---
 4 files changed, 280 insertions(+), 106 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index e26ac2d71346..6062e0cfca2d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -86,8 +86,6 @@ struct anon_vma_chain {
 };
 
 enum ttu_flags {
-   TTU_MIGRATION   = 0x1,  /* migration mode */
-
TTU_SPLIT_HUGE_PMD  = 0x4,  /* split huge PMD if any */
TTU_IGNORE_MLOCK= 0x8,  /* ignore mlock */
TTU_IGNORE_HWPOISON = 0x20, /* corrupted page is recoverable */
@@ -96,7 +94,6 @@ enum ttu_flags {
 * do a final flush if necessary */
TTU_RMAP_LOCKED = 0x80, /* do not grab rmap lock:
 * caller holds it */
-   TTU_SPLIT_FREEZE= 0x100,/* freeze pte under 
splitting thp */
 };
 
 #ifdef CONFIG_MMU
@@ -193,6 +190,7 @@ static inline void page_dup_rmap(struct page *page, bool 
compound)
 int page_referenced(struct page *, int is_locked,
struct mem_cgroup *memcg, unsigned long *vm_flags);
 
+bool try_to_migrate(struct page *page, enum ttu_flags flags);
 bool try_to_unmap(struct page *, enum ttu_flags flags);
 
 /* Avoid racy checks */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 89af065cea5b..eab004331b97 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2357,16 +2357,21 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma,
 
 static void unmap_page(struct page *page)
 {
-   enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK |
-   TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
+   enum ttu_flags ttu_flags = TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
bool unmap_success;
 
VM_BUG_ON_PAGE(!PageHead(page), page);
 
if (PageAnon(page))
-   ttu_flags |= TTU_SPLIT_FREEZE;
-
-   unmap_success = try_to_unmap(page, ttu_flags);
+   unmap_success = try_to_migrate(page, ttu_flags);
+   else
+   /*
+* Don't install migration entries for file backed pages. This
+* helps handle cases when i_size is in the middle of the page
+* as there is no need to unmap pages beyond i_size manually.
+*/
+   unmap_success = try_to_unmap(page, ttu_flags |
+   TTU_IGNORE_MLOCK);
VM_BUG_ON_PAGE(!unmap_success, page);
 }
 
diff --git a/mm/migrate.c b/mm/migrate.c
index b752543adb64..cc4612e2a246 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1130,7 +1130,7 @@ static int __unmap_and_move(struct page *page, struct 
page *newpage,
/* Establish migration ptes */
VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
page);
-   try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK);
+   try_to_migrate(page, 0);
page_was_mapped = 1;
}
 
@@ -1332,7 +1332,7 @@ static int unmap_and_move_huge_page(new_page_t 
get_new_page,
 
if (page_mapped(hpage)) {
bool mapping_locked = false;
-   enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK;
+   enum ttu_flags ttu = 0;
 
if (!PageAnon(hpage)) {
/*
@@ -1349,7 +1349,7 @@ static int unmap_and_move_huge_page(new_page_t 
get_new_page,
ttu |= TTU_RMAP_LOCKED;
}
 
-   try_to_unmap(hpage, ttu);
+   try_to_migrate(hpage, ttu);
page_was_mapped

[PATCH v6 6/8] mm: Selftests for exclusive device memory

Adds some selftests for exclusive device memory.

Signed-off-by: Alistair Popple 
Acked-by: Jason Gunthorpe 
Tested-by: Ralph Campbell 
Reviewed-by: Ralph Campbell 
---
 lib/test_hmm.c | 124 ++
 lib/test_hmm_uapi.h|   2 +
 tools/testing/selftests/vm/hmm-tests.c | 219 +
 3 files changed, 345 insertions(+)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 5c9f5a020c1d..305a9d9e2b4c 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "test_hmm_uapi.h"
 
@@ -46,6 +47,7 @@ struct dmirror_bounce {
unsigned long   cpages;
 };
 
+#define DPT_XA_TAG_ATOMIC 1UL
 #define DPT_XA_TAG_WRITE 3UL
 
 /*
@@ -619,6 +621,54 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
}
 }
 
+static int dmirror_check_atomic(struct dmirror *dmirror, unsigned long start,
+unsigned long end)
+{
+   unsigned long pfn;
+
+   for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) {
+   void *entry;
+   struct page *page;
+
+   entry = xa_load(&dmirror->pt, pfn);
+   page = xa_untag_pointer(entry);
+   if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC)
+   return -EPERM;
+   }
+
+   return 0;
+}
+
+static int dmirror_atomic_map(unsigned long start, unsigned long end,
+ struct page **pages, struct dmirror *dmirror)
+{
+   unsigned long pfn, mapped = 0;
+   int i;
+
+   /* Map the migrated pages into the device's page tables. */
+   mutex_lock(&dmirror->mutex);
+
+   for (i = 0, pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); 
pfn++, i++) {
+   void *entry;
+
+   if (!pages[i])
+   continue;
+
+   entry = pages[i];
+   entry = xa_tag_pointer(entry, DPT_XA_TAG_ATOMIC);
+   entry = xa_store(&dmirror->pt, pfn, entry, GFP_ATOMIC);
+   if (xa_is_err(entry)) {
+   mutex_unlock(&dmirror->mutex);
+   return xa_err(entry);
+   }
+
+   mapped++;
+   }
+
+   mutex_unlock(&dmirror->mutex);
+   return mapped;
+}
+
 static int dmirror_migrate_finalize_and_map(struct migrate_vma *args,
struct dmirror *dmirror)
 {
@@ -661,6 +711,71 @@ static int dmirror_migrate_finalize_and_map(struct 
migrate_vma *args,
return 0;
 }
 
+static int dmirror_exclusive(struct dmirror *dmirror,
+struct hmm_dmirror_cmd *cmd)
+{
+   unsigned long start, end, addr;
+   unsigned long size = cmd->npages << PAGE_SHIFT;
+   struct mm_struct *mm = dmirror->notifier.mm;
+   struct page *pages[64];
+   struct dmirror_bounce bounce;
+   unsigned long next;
+   int ret;
+
+   start = cmd->addr;
+   end = start + size;
+   if (end < start)
+   return -EINVAL;
+
+   /* Since the mm is for the mirrored process, get a reference first. */
+   if (!mmget_not_zero(mm))
+   return -EINVAL;
+
+   mmap_read_lock(mm);
+   for (addr = start; addr < end; addr = next) {
+   int i, mapped;
+
+   if (end < addr + (ARRAY_SIZE(pages) << PAGE_SHIFT))
+   next = end;
+   else
+   next = addr + (ARRAY_SIZE(pages) << PAGE_SHIFT);
+
+   ret = make_device_exclusive_range(mm, addr, next, pages, NULL);
+   mapped = dmirror_atomic_map(addr, next, pages, dmirror);
+   for (i = 0; i < ret; i++) {
+   if (pages[i]) {
+   unlock_page(pages[i]);
+   put_page(pages[i]);
+   }
+   }
+
+   if (addr + (mapped << PAGE_SHIFT) < next) {
+   mmap_read_unlock(mm);
+   mmput(mm);
+   return -EBUSY;
+   }
+   }
+   mmap_read_unlock(mm);
+   mmput(mm);
+
+   /* Return the migrated data for verification. */
+   ret = dmirror_bounce_init(&bounce, start, size);
+   if (ret)
+   return ret;
+   mutex_lock(&dmirror->mutex);
+   ret = dmirror_do_read(dmirror, start, end, &bounce);
+   mutex_unlock(&dmirror->mutex);
+   if (ret == 0) {
+   if (copy_to_user(u64_to_user_ptr(cmd->ptr), bounce.ptr,
+bounce.size))
+   ret = -EFAULT;
+   }
+
+   cmd->cpages = bounce.cpages;
+   dmirror_bounce_fini(&bounce);
+   return ret;
+}
+
 static int dmirror_migrate(struct dmirror *dmirror,
   struct hmm_dmirror_cmd *cmd)
 {
@@ -949,6 +1064,15 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,

[PATCH v6 7/8] nouveau/svm: Refactor nouveau_range_fault

Call mmu_interval_notifier_insert() as part of nouveau_range_fault().
This doesn't introduce any functional change but makes it easier for a
subsequent patch to alter the behaviour of nouveau_range_fault() to
support GPU atomic operations.

Signed-off-by: Alistair Popple 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 34 ---
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 94f841026c3b..a195e48c9aee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -567,18 +567,27 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
unsigned long hmm_pfns[1];
struct hmm_range range = {
.notifier = ¬ifier->notifier,
-   .start = notifier->notifier.interval_tree.start,
-   .end = notifier->notifier.interval_tree.last + 1,
.default_flags = hmm_flags,
.hmm_pfns = hmm_pfns,
.dev_private_owner = drm->dev,
};
-   struct mm_struct *mm = notifier->notifier.mm;
+   struct mm_struct *mm = svmm->notifier.mm;
int ret;
 
+   ret = mmu_interval_notifier_insert(¬ifier->notifier, mm,
+   args->p.addr, args->p.size,
+   &nouveau_svm_mni_ops);
+   if (ret)
+   return ret;
+
+   range.start = notifier->notifier.interval_tree.start;
+   range.end = notifier->notifier.interval_tree.last + 1;
+
while (true) {
-   if (time_after(jiffies, timeout))
-   return -EBUSY;
+   if (time_after(jiffies, timeout)) {
+   ret = -EBUSY;
+   goto out;
+   }
 
range.notifier_seq = mmu_interval_read_begin(range.notifier);
mmap_read_lock(mm);
@@ -587,7 +596,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
if (ret) {
if (ret == -EBUSY)
continue;
-   return ret;
+   goto out;
}
 
mutex_lock(&svmm->mutex);
@@ -606,6 +615,9 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
svmm->vmm->vmm.object.client->super = false;
mutex_unlock(&svmm->mutex);
 
+out:
+   mmu_interval_notifier_remove(¬ifier->notifier);
+
return ret;
 }
 
@@ -727,14 +739,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
}
 
notifier.svmm = svmm;
-   ret = mmu_interval_notifier_insert(¬ifier.notifier, mm,
-  args.i.p.addr, args.i.p.size,
-  &nouveau_svm_mni_ops);
-   if (!ret) {
-   ret = nouveau_range_fault(svmm, svm->drm, &args.i,
-   sizeof(args), hmm_flags, ¬ifier);
-   mmu_interval_notifier_remove(¬ifier.notifier);
-   }
+   ret = nouveau_range_fault(svmm, svm->drm, &args.i,
+   sizeof(args), hmm_flags, ¬ifier);
mmput(mm);
 
limit = args.i.p.addr + args.i.p.size;
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v6 5/8] mm: Device exclusive memory access

Some devices require exclusive write access to shared virtual
memory (SVM) ranges to perform atomic operations on that memory. This
requires CPU page tables to be updated to deny access whilst atomic
operations are occurring.

In order to do this introduce a new swap entry
type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
exclusive access by a device all page table mappings for the particular
range are replaced with device exclusive swap entries. This causes any
CPU access to the page to result in a fault.

Faults are resovled by replacing the faulting entry with the original
mapping. This results in MMU notifiers being called which a driver uses
to update access permissions such as revoking atomic access. After
notifiers have been called the device will no longer have exclusive
access to the region.

Signed-off-by: Alistair Popple 

---

v6:
* Fixed a bisectablity issue due to incorrectly applying the rename of
  migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.

v5:
* Renamed range->migrate_pgmap_owner to range->owner.
* Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
  allows notifiers called as a result of make_device_exclusive_range() to
  be ignored.
* Added a check to try_to_protect_one() to detect if the pages originally
  returned from get_user_pages() have been unmapped or not.
* Removed check_device_exclusive_range() as it is no longer required with
  the other changes.
* Documentation update.

v4:
* Add function to check that mappings are still valid and exclusive.
* s/long/unsigned long/ in make_device_exclusive_entry().
---
 Documentation/vm/hmm.rst  |  19 ++-
 drivers/gpu/drm/nouveau/nouveau_svm.c |   2 +-
 include/linux/mmu_notifier.h  |  25 +++-
 include/linux/rmap.h  |   4 +
 include/linux/swap.h  |   4 +-
 include/linux/swapops.h   |  44 +-
 lib/test_hmm.c|   2 +-
 mm/hmm.c  |   5 +
 mm/memory.c   | 107 +-
 mm/mprotect.c |   8 +
 mm/page_vma_mapped.c  |   9 +-
 mm/rmap.c | 203 ++
 12 files changed, 419 insertions(+), 13 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 09e28507f5b2..a5fdee82c037 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -332,7 +332,7 @@ between device driver specific code and shared common code:
walks to fill in the ``args->src`` array with PFNs to be migrated.
The ``invalidate_range_start()`` callback is passed a
``struct mmu_notifier_range`` with the ``event`` field set to
-   ``MMU_NOTIFY_MIGRATE`` and the ``migrate_pgmap_owner`` field set to
+   ``MMU_NOTIFY_MIGRATE`` and the ``owner`` field set to
the ``args->pgmap_owner`` field passed to migrate_vma_setup(). This is
allows the device driver to skip the invalidation callback and only
invalidate device private MMU mappings that are actually migrating.
@@ -405,6 +405,23 @@ between device driver specific code and shared common code:
 
The lock can now be released.
 
+Exclusive access memory
+===
+
+Not all devices support atomic access to system memory. To support atomic
+operations to a shared virtual memory page such a device needs access to that
+page which is exclusive of any userspace access from the CPU. The
+``make_device_exclusive_range()`` function can be used to make a memory range
+inaccessible from userspace.
+
+This replaces all mappings for pages in the given range with special swap
+entries. Any attempt to access the swap entry results in a fault which is
+resovled by replacing the entry with the original mapping. A driver gets
+notified that the mapping has been changed by MMU notifiers, after which point
+it will no longer have exclusive access to the page. Exclusive access is
+guranteed to last until the driver drops the page lock and page reference, at
+which point any CPU faults on the page may proceed as described.
+
 Memory cgroup (memcg) and rss accounting
 
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index f18bd53da052..94f841026c3b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -265,7 +265,7 @@ nouveau_svmm_invalidate_range_start(struct mmu_notifier *mn,
 * the invalidation is handled as part of the migration process.
 */
if (update->event == MMU_NOTIFY_MIGRATE &&
-   update->migrate_pgmap_owner == svmm->vmm->cli->drm->dev)
+   update->owner == svmm->vmm->cli->drm->dev)
goto out;
 
if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) {
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b8200782dede..455e269bf825 100644
--- a/include/linux/mm

[PATCH v6 8/8] nouveau/svm: Implement atomic SVM access

Some NVIDIA GPUs do not support direct atomic access to system memory
via PCIe. Instead this must be emulated by granting the GPU exclusive
access to the memory. This is achieved by replacing CPU page table
entries with special swap entries that fault on userspace access.

The driver then grants the GPU permission to update the page undergoing
atomic access via the GPU page tables. When CPU access to the page is
required a CPU fault is raised which calls into the device driver via
MMU notifiers to revoke the atomic access. The original page table
entries are then restored allowing CPU access to proceed.

Signed-off-by: Alistair Popple 

---

v4:
* Check that page table entries haven't changed before mapping on the
  device
---
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c | 100 --
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c|   6 ++
 4 files changed, 100 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h 
b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
index d6dd40f21eed..9c7ff56831c5 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
@@ -77,6 +77,7 @@ struct nvif_vmm_pfnmap_v0 {
 #define NVIF_VMM_PFNMAP_V0_APER   0x00f0ULL
 #define NVIF_VMM_PFNMAP_V0_HOST   0xULL
 #define NVIF_VMM_PFNMAP_V0_VRAM   0x0010ULL
+#define NVIF_VMM_PFNMAP_V0_A 0x0004ULL
 #define NVIF_VMM_PFNMAP_V0_W  0x0002ULL
 #define NVIF_VMM_PFNMAP_V0_V  0x0001ULL
 #define NVIF_VMM_PFNMAP_V0_NONE   0xULL
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index a195e48c9aee..16b07d7589d2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct nouveau_svm {
struct nouveau_drm *drm;
@@ -421,9 +422,9 @@ nouveau_svm_fault_cmp(const void *a, const void *b)
return ret;
if ((ret = (s64)fa->addr - fb->addr))
return ret;
-   /*XXX: atomic? */
-   return (fa->access == 0 || fa->access == 3) -
-  (fb->access == 0 || fb->access == 3);
+   /* Atomic access (2) has highest priority */
+   return (-1*(fa->access == 2) + (fa->access == 0 || fa->access == 3)) -
+  (-1*(fb->access == 2) + (fb->access == 0 || fb->access == 3));
 }
 
 static void
@@ -487,6 +488,10 @@ static bool nouveau_svm_range_invalidate(struct 
mmu_interval_notifier *mni,
struct svm_notifier *sn =
container_of(mni, struct svm_notifier, notifier);
 
+   if (range->event == MMU_NOTIFY_EXCLUSIVE &&
+   range->owner == sn->svmm->vmm->cli->drm->dev)
+   return true;
+
/*
 * serializes the update to mni->invalidate_seq done by caller and
 * prevents invalidation of the PTE from progressing while HW is being
@@ -555,6 +560,73 @@ static void nouveau_hmm_convert_pfn(struct nouveau_drm 
*drm,
args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W;
 }
 
+static int nouveau_atomic_range_fault(struct nouveau_svmm *svmm,
+  struct nouveau_drm *drm,
+  struct nouveau_pfnmap_args *args, u32 size,
+  struct svm_notifier *notifier)
+{
+   unsigned long timeout =
+   jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+   struct mm_struct *mm = svmm->notifier.mm;
+   struct page *page;
+   unsigned long start = args->p.addr;
+   unsigned long notifier_seq;
+   int ret = 0;
+
+   ret = mmu_interval_notifier_insert(¬ifier->notifier, mm,
+   args->p.addr, args->p.size,
+   &nouveau_svm_mni_ops);
+   if (ret)
+   return ret;
+
+   while (true) {
+   if (time_after(jiffies, timeout)) {
+   ret = -EBUSY;
+   goto out;
+   }
+
+   notifier_seq = mmu_interval_read_begin(¬ifier->notifier);
+   mmap_read_lock(mm);
+   make_device_exclusive_range(mm, start, start + PAGE_SIZE,
+   &page, drm->dev);
+   mmap_read_unlock(mm);
+   if (!page) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   mutex_lock(&svmm->mutex);
+   if (mmu_interval_read_retry(¬ifier->notifier,
+   notifier_seq)) {
+   mutex_unlock(&svmm->mutex);
+

[PATCH v2] dt-bindings: display: sitronix, st7789v-dbi: Add Waveshare 2inch LCD module

2021-03-12 Thread Carlis

From: "Carlis" 

Document support for the Waveshare 2inch LCD module display, which is a
240x320 2" TFT display driven by a Sitronix ST7789V TFT Controller.

Signed-off-by: Carlis 
---
v2:change compatible name.
---
 .../display/sitronix,st7789v-dbi.yaml | 72 +++
 1 file changed, 72 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml

diff --git 
a/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml 
b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml
new file mode 100644
index ..6abf82966230
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/sitronix,st7789v-dbi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Sitronix ST7789V Display Panels Device Tree Bindings
+
+maintainers:
+  - Carlis 
+
+description:
+  This binding is for display panels using a Sitronix ST7789V
+  controller in SPI mode.
+
+allOf:
+  - $ref: panel/panel-common.yaml#
+
+properties:
+  compatible:
+oneOf:
+  - description:
+  Waveshare 2" 240x320 Color TFT LCD
+items:
+  - enum:
+  - waveshare,ws2inch
+  - const: sitronix,st7789v-dbi
+
+  spi-max-frequency:
+maximum: 3200
+
+  dc-gpios:
+maxItems: 1
+description: Display data/command selection (D/CX)
+
+  backlight: true
+  reg: true
+  reset-gpios: true
+  rotation: true
+
+required:
+  - compatible
+  - reg
+  - dc-gpios
+  - reset-gpios
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+
+backlight: backlight {
+compatible = "gpio-backlight";
+gpios = <&gpio 18 GPIO_ACTIVE_HIGH>;
+};
+
+spi {
+#address-cells = <1>;
+#size-cells = <0>;
+
+display@0{
+compatible = "waveshare,ws2inch", "sitronix,st7789v-dbi";
+reg = <0>;
+spi-max-frequency = <3200>;
+dc-gpios = <&gpio 25 GPIO_ACTIVE_HIGH>;
+reset-gpios = <&gpio 27 GPIO_ACTIVE_HIGH>;
+rotation = <270>;
+};
+};
+
+...
-- 
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm: bridge: adv7511: Support I2S IEC958 encoded PCM format

2021-03-12 Thread Sia Jee Heng

Support IEC958 encoded PCM format for ADV7511 so that ADV7511 HDMI
audio driver can accept the IEC958 data from the I2S input.

Signed-off-by: Sia Jee Heng 
---
 drivers/gpu/drm/bridge/adv7511/adv7511.h   | 1 +
 drivers/gpu/drm/bridge/adv7511/adv7511_audio.c | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511.h 
b/drivers/gpu/drm/bridge/adv7511/adv7511.h
index a9bb734366ae..05e3abb5a0c9 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511.h
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511.h
@@ -191,6 +191,7 @@
 #define ADV7511_I2S_FORMAT_I2S 0
 #define ADV7511_I2S_FORMAT_RIGHT_J 1
 #define ADV7511_I2S_FORMAT_LEFT_J  2
+#define ADV7511_I2S_IEC958_DIRECT  3
 
 #define ADV7511_PACKET(p, x)   ((p) * 0x20 + (x))
 #define ADV7511_PACKET_SDP(x)  ADV7511_PACKET(0, x)
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c 
b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c
index 45838bd08d37..61f4a38e7d2b 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_audio.c
@@ -101,6 +101,10 @@ static int adv7511_hdmi_hw_params(struct device *dev, void 
*data,
case 20:
len = ADV7511_I2S_SAMPLE_LEN_20;
break;
+   case 32:
+   if (fmt->bit_fmt != SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE)
+   return -EINVAL;
+   fallthrough;
case 24:
len = ADV7511_I2S_SAMPLE_LEN_24;
break;
@@ -112,6 +116,8 @@ static int adv7511_hdmi_hw_params(struct device *dev, void 
*data,
case HDMI_I2S:
audio_source = ADV7511_AUDIO_SOURCE_I2S;
i2s_format = ADV7511_I2S_FORMAT_I2S;
+   if (fmt->bit_fmt == SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE)
+   i2s_format = ADV7511_I2S_IEC958_DIRECT;
break;
case HDMI_RIGHT_J:
audio_source = ADV7511_AUDIO_SOURCE_I2S;

base-commit: de066e116306baf3a6a62691ac63cfc0b1dabddb
-- 
2.18.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH v2] drm/tiny: add support for Waveshare 2inch LCD module

2021-03-12 Thread Carlis

From: "Carlis" 

This adds a new module for the ST7789V controller with parameters for
the Waveshare 2inch LCD module.

Signed-off-by: Carlis 
---
v2:change compatible value.
---
 MAINTAINERS|   7 +
 drivers/gpu/drm/tiny/Kconfig   |  14 ++
 drivers/gpu/drm/tiny/Makefile  |   1 +
 drivers/gpu/drm/tiny/st7789v.c | 269 +
 4 files changed, 291 insertions(+)
 create mode 100644 drivers/gpu/drm/tiny/st7789v.c

diff --git a/MAINTAINERS b/MAINTAINERS
index d92f85ca831d..25c4ae12cecf 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5769,6 +5769,13 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
 F: Documentation/devicetree/bindings/display/sitronix,st7735r.yaml
 F: drivers/gpu/drm/tiny/st7735r.c
 
+DRM DRIVER FOR SITRONIX ST7789V PANELS
+M: Carlis 
+S: Maintained
+T: git git://anongit.freedesktop.org/drm/drm-misc
+F: Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml
+F: drivers/gpu/drm/tiny/st7789v.c
+
 DRM DRIVER FOR SONY ACX424AKP PANELS
 M: Linus Walleij 
 S: Maintained
diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index 2b6414f0fa75..ac2c7fb702f0 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -131,3 +131,17 @@ config TINYDRM_ST7735R
  * Okaya RH128128T 1.44" 128x128 TFT
 
  If M is selected the module will be called st7735r.
+
+config TINYDRM_ST7789V
+   tristate "DRM support for Sitronix ST7789V display panels"
+   depends on DRM && SPI
+   select DRM_KMS_HELPER
+   select DRM_KMS_CMA_HELPER
+   select DRM_MIPI_DBI
+   select BACKLIGHT_CLASS_DEVICE
+   help
+ DRM driver for Sitronix ST7789V with one of the following
+ LCDs:
+ * Waveshare 2inch lcd module 240x320 TFT
+
+ If M is selected the module will be called st7789v.
diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile
index 6ae4e9e5a35f..aa0caa2b6c16 100644
--- a/drivers/gpu/drm/tiny/Makefile
+++ b/drivers/gpu/drm/tiny/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_TINYDRM_MI0283QT)+= mi0283qt.o
 obj-$(CONFIG_TINYDRM_REPAPER)  += repaper.o
 obj-$(CONFIG_TINYDRM_ST7586)   += st7586.o
 obj-$(CONFIG_TINYDRM_ST7735R)  += st7735r.o
+obj-$(CONFIG_TINYDRM_ST7789V)  += st7789v.o
diff --git a/drivers/gpu/drm/tiny/st7789v.c b/drivers/gpu/drm/tiny/st7789v.c
new file mode 100644
index ..9b4bb9edba40
--- /dev/null
+++ b/drivers/gpu/drm/tiny/st7789v.c
@@ -0,0 +1,269 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * DRM driver for display panels connected to a Sitronix ST7789V
+ * display controller in SPI mode.
+ *
+ * Copyright 2017 David Lechner 
+ * Copyright (C) 2019 Glider bvba
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define ST7789V_PORCTRL 0xb2
+#define ST7789V_GCTRL   0xb7
+#define ST7789V_VCOMS   0xbb
+#define ST7789V_LCMCTRL 0xc0
+#define ST7789V_VDVVRHEN0xc2
+#define ST7789V_VRHS0xc3
+#define ST7789V_VDVS0xc4
+#define ST7789V_FRCTRL2 0xc6
+#define ST7789V_PWCTRL1 0xd0
+#define ST7789V_PVGAMCTRL   0xe0
+#define ST7789V_NVGAMCTRL   0xe1
+
+#define ST7789V_MY BIT(7)
+#define ST7789V_MX BIT(6)
+#define ST7789V_MV BIT(5)
+#define ST7789V_RGBBIT(3)
+
+struct st7789v_cfg {
+   const struct drm_display_mode mode;
+   unsigned int left_offset;
+   unsigned int top_offset;
+   unsigned int write_only:1;
+   unsigned int rgb:1; /* RGB (vs. BGR) */
+};
+
+struct st7789v_priv {
+   struct mipi_dbi_dev dbidev; /* Must be first for .release() */
+   const struct st7789v_cfg *cfg;
+};
+
+static void st7789v_pipe_enable(struct drm_simple_display_pipe *pipe,
+   struct drm_crtc_state *crtc_state,
+   struct drm_plane_state *plane_state)
+{
+   struct mipi_dbi_dev *dbidev = drm_to_mipi_dbi_dev(pipe->crtc.dev);
+   struct st7789v_priv *priv = container_of(dbidev, struct st7789v_priv,
+dbidev);
+   struct mipi_dbi *dbi = &dbidev->dbi;
+   int ret, idx;
+   u8 addr_mode;
+
+   if (!drm_dev_enter(pipe->crtc.dev, &idx))
+   return;
+
+   DRM_DEBUG_KMS("\n");
+
+   ret = mipi_dbi_poweron_reset(dbidev);
+   if (ret)
+   goto out_exit;
+
+   msleep(150);
+
+   mipi_dbi_command(dbi, MIPI_DCS_EXIT_SLEEP_MODE);
+   msleep(100);
+
+
+   switch (dbidev->rotation) {
+   default:
+   addr_mode = 0;
+   break;
+   case 90:
+   addr_mode = ST7789V_MY | ST7789V_MV;
+   break;
+   case 180:
+   addr_mode = ST7789V_MX | ST7789V_MY;
+   break;
+   case 270:
+   addr_mo

Re: [PATCH] drivers: video: fbcon: fix NULL dereference in fbcon_cursor()

2021-03-12 Thread Du Cheng

On Fri, Mar 12, 2021 at 09:36:42AM +0100, Greg Kroah-Hartman wrote:
> On Fri, Mar 12, 2021 at 04:14:21PM +0800, Du Cheng wrote:
> > add null-check on function pointer before dereference on ops->cursor
> > 
> > Reported-by: syzbot+b67aaae8d3a927f68...@syzkaller.appspotmail.com
> > Signed-off-by: Du Cheng 
> > ---
> >  drivers/video/fbdev/core/fbcon.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/video/fbdev/core/fbcon.c 
> > b/drivers/video/fbdev/core/fbcon.c
> > index 44a5cd2f54cc..3406067985b1 100644
> > --- a/drivers/video/fbdev/core/fbcon.c
> > +++ b/drivers/video/fbdev/core/fbcon.c
> > @@ -1333,6 +1333,9 @@ static void fbcon_cursor(struct vc_data *vc, int mode)
> >  
> > ops->cursor_flash = (mode == CM_ERASE) ? 0 : 1;
> >  
> > +   if (!ops->cursor)
> > +   return;
> > +
> > ops->cursor(vc, info, mode, get_color(vc, info, c, 1),
> > get_color(vc, info, c, 0));
> >  }
> > -- 
> > 2.27.0
> > 
> 
> Is this the same issue reported here:
>   
> https://lore.kernel.org/r/20210307105642.112572-1-h.shahbazi@gmail.com
> 
> And has syzbot testing shown that this fix does solve the issue?
> 
> thanks,
> 
> greg k-h

Hi Greg,

I sent both my patch and that of shahbazi to syzbot to see if they resolve the
bug by the id b67aaae8d3a927f68d20.

I will keep you posted of the outcomes.

Regards,
Du Cheng
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm/ttm: make ttm_bo_unpin more defensive

We seem to have some more driver bugs than thought.

Signed-off-by: Christian König 
---
 include/drm/ttm/ttm_bo_api.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 4fb523dfab32..df9fe596e7c5 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -603,9 +603,11 @@ static inline void ttm_bo_pin(struct ttm_buffer_object *bo)
 static inline void ttm_bo_unpin(struct ttm_buffer_object *bo)
 {
dma_resv_assert_held(bo->base.resv);
-   WARN_ON_ONCE(!bo->pin_count);
WARN_ON_ONCE(!kref_read(&bo->kref));
-   --bo->pin_count;
+   if (bo->pin_count)
+   --bo->pin_count;
+   else
+   WARN_ON_ONCE(true);
 }
 
 int ttm_mem_evict_first(struct ttm_device *bdev,
-- 
2.25.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: don't base trees on 5.12-rc1

2021-03-12 Thread Maxime Ripard

On Thu, Mar 11, 2021 at 05:11:36AM +1000, Dave Airlie wrote:
> On Wed, 10 Mar 2021 at 17:48, Maxime Ripard  wrote:
> >
> > Hi Dave,
> >
> > On Wed, Mar 10, 2021 at 09:50:29AM +1000, Dave Airlie wrote:
> > > I'm mostly sending this to the -misc maintainers because
> > > drm-misc-fixes is based on rc1 at present.
> > >
> > > This needs to be *rebased* not merged up to 5.12-rc2. Merging will
> > > still have the bad landmine commits in the bisect history. This is a
> > > very special case.
> >
> > I'm sorry, I'm not entirely sure I get this. -rc1 is still in the -rc2
> > history, so how would that change anything in the bisect history?
> >
> 
> We can't get rid of the bad commit range, we can reduce the amount of
> times someone accidentally bisects into it, by not using it as a base
> commit for future changes.
> 
> If in the future a bisect happens to want to test one of the patches
> in drm-misc-fixes that is based on rc1, it will land the user with an
> rc1 test kernel and could eat their swapfile/disk. We can avoid that
> problem by not using rc1 as a base for drm-misc-fixes.
> 
> We can't avoid them bisecting into the broken commits between when
> this landed and was fixed, but rebasing trees can minimise the chances
> of this when bisecting other changesets.

Ok, yeah, that makes sense. Thanks!

Maxime
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: vmwgfx leaking bo pins?

2021-03-12 Thread Intel



On 3/12/21 12:02 AM, Zack Rusin wrote:



On Mar 11, 2021, at 17:35, Thomas Hellström (Intel)  
wrote:

Hi, Zack

On 3/11/21 10:07 PM, Zack Rusin wrote:

On Mar 11, 2021, at 05:46, Thomas Hellström (Intel)  
wrote:

Hi,

I tried latest drm-fixes today and saw a lot of these: Fallout from ttm rework?

Yes, I fixed this in d1a73c641afd2617bd80bce8b71a096fc5b74b7e it was in 
drm-misc-next in the drm-misc tree for a while but hasn’t been merged for 5.12.

z


Hmm, yes but doesn't that fix trip the ttm_bo_unpin() 
dma_resv_assert_held(bo->base.resv)?

No, doesn’t seem to. TBH I’m not sure why myself, but it seems to be working 
fine.



With CONFIG_PROVE_LOCKING=y I see this:

[    7.117145] [drm] FIFO at 0xfe00 size is 8192 kiB
[    7.117284] [drm] VRAM at 0xe800 size is 131072 kiB
[    7.117291] INFO: trying to register non-static key.
[    7.117295] the code is fine but needs lockdep annotation.
[    7.117298] turning off the locking correctness validator

Which will probably mask that dma_resv_assert_held(bo->base.resv)

/Thomas


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH][next] drm/amd/pm: Fix spelling mistake "disble" -> "disable"

2021-03-12 Thread Colin King

From: Colin Ian King 

There is a spelling mistake in an assert message. Fix it.

Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
index 0d725b66fb78..7edafef095a3 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
@@ -1300,7 +1300,7 @@ static int smu7_start_dpm(struct pp_hwmgr *hwmgr)
(0 == smum_send_msg_to_smc(hwmgr,
PPSMC_MSG_PCIeDPM_Disable,
NULL)),
-   "Failed to disble pcie DPM during DPM Start 
Function!",
+   "Failed to disable pcie DPM during DPM Start 
Function!",
return -EINVAL);
}
 
-- 
2.30.2

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/ttm: make ttm_bo_unpin more defensive

2021-03-12 Thread Matthew Auld

On Fri, 12 Mar 2021 at 09:38, Christian König
 wrote:
>
> We seem to have some more driver bugs than thought.
>
> Signed-off-by: Christian König 
Acked-by: Matthew Auld 
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH] drm/komeda: Fix off-by-1 when with readback conn due to rounding

2021-03-12 Thread Brian Starkey

(Adding back James again - did you use get_maintainer.pl?)

On Thu, Mar 11, 2021 at 12:08:46PM +, carsten.haitz...@foss.arm.com wrote:
> From: Carsten Haitzler 
> 
> When setting up a readback connector that writes data back to memory
> rather than to an actual output device (HDMI etc.), rounding was set
> to round. As the DPU uses a higher internal number of bits when generating
> a color value, this round-down back to 8bit ended up with everything
> being off-by one. e.g. #fefefe became #ff. This sets

Perhaps overly pedantic, but now we've tracked down what was actually
happening I think we can be more precise here. Not _everything_ is
off-by-one, it's just rounding in the standard sense - if the most
significant bit-to-be-discarded is set, the value is rounded up to
minimise the absolute error introduced by bit-depth reduction.

> rounding to "round-down" so things end up correct by turning on the LW_TRC
> round down flag.

Can we call it "truncate" rather than round down? I think it makes
"TRC" a bit more understandable.

> 
> Signed-off-by: Carsten Haitzler 
> ---
>  drivers/gpu/drm/arm/display/komeda/d71/d71_component.c | 7 ++-
>  drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h  | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c 
> b/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c
> index 8a02ade369db..e97acc5519d1 100644
> --- a/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c
> +++ b/drivers/gpu/drm/arm/display/komeda/d71/d71_component.c
> @@ -468,7 +468,12 @@ static void d71_wb_layer_update(struct komeda_component 
> *c,
>   struct komeda_layer_state *st = to_layer_st(state);
>   struct drm_connector_state *conn_st = state->wb_conn->state;
>   struct komeda_fb *kfb = to_kfb(conn_st->writeback_job->fb);
> - u32 ctrl = L_EN | LW_OFM, mask = L_EN | LW_OFM | LW_TBU_EN;
> + /* LW_TRC sets rounding to truncate not round which is needed for
> +  * the output of writeback to match the input in the most common
> +  * use cases like RGB888 -> RGB888, so set this bit by default
> +  */

Hm, not sure why this file uses "net/" style comments, but as you
said, this is in-keeping with the rest of the file, so meh :-)

> + u32 ctrl = LW_TRC | L_EN | LW_OFM;
> + u32 mask = LW_TRC | L_EN | LW_OFM | LW_TBU_EN;

If you were aiming for matching register order, this should be:

L_EN | LW_TRC | LW_OFM | LW_TBU_EN


I think it'd be nice to have the exact behaviour in the commit
message, but either way this seems OK as a pragmatic fix so:

Reviewed-by: Brian Starkey 

Thanks,
-Brian

>   u32 __iomem *reg = c->reg;
>  
>   d71_layer_update_fb(c, kfb, st->addr);
> diff --git a/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h 
> b/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h
> index e80172a0b320..a8036689d721 100644
> --- a/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h
> +++ b/drivers/gpu/drm/arm/display/komeda/d71/d71_regs.h
> @@ -321,6 +321,7 @@
>  #define LAYER_WR_FORMAT  0x0D8
>  
>  /* Layer_WR control bits */
> +#define LW_TRC   BIT(1)
>  #define LW_OFM   BIT(4)
>  #define LW_LALPHA(x) (((x) & 0xFF) << 8)
>  #define LW_A_WCACHE(x)   (((x) & 0xF) << 28)
> -- 
> 2.30.0
> 
> ___
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[syzbot] upstream boot error: WARNING in vkms_vblank_simulate

2021-03-12 Thread syzbot

Hello,

syzbot found the following issue on:

HEAD commit:f78d76e7 Merge tag 'drm-fixes-2021-03-12-1' of git://anong..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11c16ba2d0
kernel config:  https://syzkaller.appspot.com/x/.config?x=dc02c6afcb046874
dashboard link: https://syzkaller.appspot.com/bug?extid=333bd014262fd5d0a418
userspace arch: arm

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+333bd014262fd5d0a...@syzkaller.appspotmail.com

[ cut here ]
WARNING: CPU: 0 PID: 0 at drivers/gpu/drm/vkms/vkms_crtc.c:21 
vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.12.0-rc2-syzkaller-00338-gf78d76e72a46 #0
Hardware name: linux,dummy-virt (DT)
pstate: 2085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
pc : vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:21
lr : hrtimer_forward_now include/linux/hrtimer.h:510 [inline]
lr : vkms_vblank_simulate+0x90/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:19
sp : 6a693cd0
x29: 6a693cd0 x28: 0c9d1e58 
x27: dfff8000 x26: 6a67f540 
x25: 1fffed4cfeb1 x24: 1fffed4cfeaa 
x23: 0c9d0d30 x22: 00fe4c00 
x21: 6a67f540 x20: 0c9d0e58 
x19: 0c9d1e58 x18: 6a6a1b48 
x17: 1fffe1952345 x16:  
x15: 8000197bf810 x14: 1fffed4d2750 
x13: 0001 x12: 0033 
x11: 12fb4936 x10: 0007 
x9 : 12fb4943 x8 : 800017d14c00 
x7 : f1f1f1f1 x6 : dfff8000 
x5 : 7fff x4 : 0008e44f6b90 
x3 : 0008e54db790 x2 : 0008e44f6b90 
x1 : 0008e54db790 x0 : 0002 
Call trace:
 vkms_vblank_simulate+0x26c/0x2f4 drivers/gpu/drm/vkms/vkms_crtc.c:41
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x590/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_interrupt+0x2d4/0x810 kernel/time/hrtimer.c:1645
 timer_handler drivers/clocksource/arm_arch_timer.c:647 [inline]
 arch_timer_handler_phys+0x4c/0x70 drivers/clocksource/arm_arch_timer.c:665
 handle_percpu_devid_irq+0x19c/0x330 kernel/irq/chip.c:930
 generic_handle_irq_desc include/linux/irqdesc.h:158 [inline]
 generic_handle_irq kernel/irq/irqdesc.c:652 [inline]
 __handle_domain_irq+0x11c/0x1f0 kernel/irq/irqdesc.c:689
 handle_domain_irq include/linux/irqdesc.h:176 [inline]
 gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370
 el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669
 arch_local_irq_restore arch/arm64/include/asm/irqflags.h:124 [inline]
 queue_work_on+0x74/0x110 kernel/workqueue.c:1528
 queue_work include/linux/workqueue.h:507 [inline]
 cursor_timer_handler+0x64/0x100 drivers/video/fbdev/core/fbcon.c:397
 call_timer_fn+0x1d4/0x9c4 kernel/time/timer.c:1431
 expire_timers kernel/time/timer.c:1476 [inline]
 __run_timers.part.0+0x530/0xa00 kernel/time/timer.c:1745
 __run_timers kernel/time/timer.c:1726 [inline]
 run_timer_softirq+0xa4/0x1a0 kernel/time/timer.c:1758
 _stext+0x2b4/0x1084
 do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
 invoke_softirq kernel/softirq.c:228 [inline]
 __irq_exit_rcu+0x46c/0x510 kernel/softirq.c:422
 irq_exit+0x14/0x84 kernel/softirq.c:446
 __handle_domain_irq+0x120/0x1f0 kernel/irq/irqdesc.c:692
 handle_domain_irq include/linux/irqdesc.h:176 [inline]
 gic_handle_irq+0x5c/0x1b0 drivers/irqchip/irq-gic.c:370
 el1_irq+0xb4/0x180 arch/arm64/kernel/entry.S:669
 arch_local_irq_enable+0xc/0x14 arch/arm64/include/asm/irqflags.h:37
 default_idle_call+0x64/0xf4 kernel/sched/idle.c:112
 cpuidle_idle_call kernel/sched/idle.c:194 [inline]
 do_idle+0x38c/0x4ec kernel/sched/idle.c:300
 cpu_startup_entry+0x28/0x80 kernel/sched/idle.c:397
 rest_init+0x1d0/0x2cc init/main.c:721
 arch_call_rest_init+0x10/0x1c
 start_kernel+0x3b0/0x3e8 init/main.c:1064
 0x0


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 01/14] opp: Add devres wrapper for dev_pm_opp_set_clkname

2021-03-12 Thread Ulf Hansson

On Fri, 12 Mar 2021 at 06:33, Viresh Kumar  wrote:
>
> On 11-03-21, 22:20, Dmitry Osipenko wrote:
> > +struct opp_table *devm_pm_opp_set_clkname(struct device *dev, const char 
> > *name)
> > +{
> > + struct opp_table *opp_table;
> > + int err;
> > +
> > + opp_table = dev_pm_opp_set_clkname(dev, name);
> > + if (IS_ERR(opp_table))
> > + return opp_table;
> > +
> > + err = devm_add_action_or_reset(dev, devm_pm_opp_clkname_release, 
> > opp_table);
> > + if (err)
> > + opp_table = ERR_PTR(err);
> > +
> > + return opp_table;
> > +}
>
> I wonder if we still need to return opp_table from here, or a simple
> integer is fine.. The callers shouldn't be required to use the OPP
> table directly anymore I believe and so better simplify the return
> part of this and all other routines you are adding here..

Yes, please. I was thinking along the same lines, when I reviewed the
mmc patch (patch9).

>
> If there is a user which needs the opp_table, let it use the regular
> non-devm variant.

Kind regards
Uffe
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm/exynos: move to use request_irq by IRQF_NO_AUTOEN flag

2021-03-12 Thread Tian Tao

After this patch cbe16f35bee68 genirq: Add IRQF_NO_AUTOEN for
request_irq/nmi() is merged. request_irq() after setting
IRQ_NOAUTOEN as below

irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_irq(dev, irq...);
can be replaced by request_irq() with IRQF_NO_AUTOEN flag.

Signed-off-by: Tian Tao 
---
 drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 4 ++--
 drivers/gpu/drm/exynos/exynos_drm_dsi.c   | 7 +++
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c 
b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
index 1f79bc2..f530aff 100644
--- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
@@ -775,8 +775,8 @@ static int decon_conf_irq(struct decon_context *ctx, const 
char *name,
return irq;
}
}
-   irq_set_status_flags(irq, IRQ_NOAUTOEN);
-   ret = devm_request_irq(ctx->dev, irq, handler, flags, "drm_decon", ctx);
+   ret = devm_request_irq(ctx->dev, irq, handler,
+  flags | IRQ_NOAUTOEN, "drm_decon", ctx);
if (ret < 0) {
dev_err(ctx->dev, "IRQ %s request failed\n", name);
return ret;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c 
b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index 83ab6b3..fd9b133b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -1352,10 +1352,9 @@ static int exynos_dsi_register_te_irq(struct exynos_dsi 
*dsi,
}
 
te_gpio_irq = gpio_to_irq(dsi->te_gpio);
-   irq_set_status_flags(te_gpio_irq, IRQ_NOAUTOEN);
 
ret = request_threaded_irq(te_gpio_irq, exynos_dsi_te_irq_handler, NULL,
-   IRQF_TRIGGER_RISING, "TE", dsi);
+  IRQF_TRIGGER_RISING | IRQ_NOAUTOEN, "TE", 
dsi);
if (ret) {
dev_err(dsi->dev, "request interrupt failed with %d\n", ret);
gpio_free(dsi->te_gpio);
@@ -1802,9 +1801,9 @@ static int exynos_dsi_probe(struct platform_device *pdev)
if (dsi->irq < 0)
return dsi->irq;
 
-   irq_set_status_flags(dsi->irq, IRQ_NOAUTOEN);
ret = devm_request_threaded_irq(dev, dsi->irq, NULL,
-   exynos_dsi_irq, IRQF_ONESHOT,
+   exynos_dsi_irq,
+   IRQF_ONESHOT | IRQ_NOAUTOEN,
dev_name(dev), dsi);
if (ret) {
dev_err(dev, "failed to request dsi irq\n");
-- 
2.7.4

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 4/5] drm: Add and export function drm_gem_cma_sync_data

2021-03-12 Thread Paul Cercueil





Le jeu. 11 mars 2021 à 12:28, Christoph Hellwig  a 
écrit :

On Sun, Mar 07, 2021 at 08:28:34PM +, Paul Cercueil wrote:

 +  drm_atomic_for_each_plane_damage(&iter, &clip) {
 +  for (i = 0; i < finfo->num_planes; i++) {
 +  daddr = drm_fb_cma_get_gem_addr(state->fb, state, i);
 +
 +  /* Ignore x1/x2 values, invalidate complete lines */
 +  offset = clip.y1 * state->fb->pitches[i];
 +
 +  dma_sync_single_for_device(dev, daddr + offset,
 + (clip.y2 - clip.y1) * 
state->fb->pitches[i],
 + DMA_TO_DEVICE);


Are these helpers only ever used to transfer data to the device and
never from it?  If so please clearly document that.


Yes. In the DRM world, are there cases where we transfer data from the 
device? I assume these cases are handled by v4l2 instead.


-Paul


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 5/5] drm/ingenic: Add option to alloc cached GEM buffers

2021-03-12 Thread Paul Cercueil





Le jeu. 11 mars 2021 à 12:30, Christoph Hellwig  a 
écrit :

On Sun, Mar 07, 2021 at 08:28:35PM +, Paul Cercueil wrote:
 With the module parameter ingenic-drm.cached_gem_buffers, it is 
possible

 to specify that we want GEM buffers backed by non-coherent memory.


Shouldn't there be a way to discover this through a DT property?


Good question. My original way of thinking was that as this feature 
speeds up only software rendering, this is really 
application-dependent: a modern desktop where everything is rendered 
via the GPU wouldn't benefit much from it. With that in mind, it is 
fine as a module option.


On the other hand... the "software rendering is faster with 
non-coherent buffers" really is a SoC property, since it is only true 
for some generations of Ingenic SoCs and not others. So it would make 
sense to have a DT property for it.


-Paul


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 3/5] drm: Add and export function drm_gem_cma_mmap_noncoherent

2021-03-12 Thread Paul Cercueil

Le jeu. 11 mars 2021 à 12:36, Christoph Hellwig  a 
écrit :

On Thu, Mar 11, 2021 at 12:32:27PM +, Paul Cercueil wrote:

 > dma_to_phys must not be used by drivers.
 >
 > I have a proper helper for this waiting for users:
 >
 > 
http://git.infradead.org/users/hch/misc.git/commitdiff/96a546e7229ec53aadbdb7936d1e5e6cb5958952

 >
 > If you can confirm the helpers works for you I can try to still 
sneak

 > it to Linus for 5.12 to ease the merge pain.

 I can try. How do I get a page pointer from a dma_addr_t?

You don't - you get it from using virt_to_page on the pointer returned
from dma_alloc_noncoherent.  That beind said to keep the API sane I
should probably add a wrapper that does that for you.

I tested using:

ret = dma_mmap_pages(cma_obj->base.dev->dev,
vma, vma->vm_end - vma->vm_start,
virt_to_page(cma_obj->vaddr));

It works fine.

I think I can use remap_pfn_range() for now, and switch to your new API 
once it's available in drm-misc-next.

Cheers,
-Paul

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v7 3/3] drm: Add GUD USB Display driver

2021-03-12 Thread Noralf Trønnes

Den 12.03.2021 05.32, skrev Peter Stuge:
> Ilia Mirkin wrote:
>> XRGB means that the memory layout should match a 32-bit integer,
>> stored as LE, with the low bits being B, next bits being G, etc. This
>> translates to byte 0 = B, byte 1 = G, etc. If you're on a BE system,
>> and you're handed a XRGB buffer, it still expects that byte 0 = B,
>> etc (except as I outlined, some drivers which are from before these
>> formats were a thing, sort of do their own thing). Thankfully this is
>> equivalent to BGRX (big-endian packing), so you can just munge the
>> format.
> 
> I understand! Thanks a lot for clarifying.
> 
> It makes much more sense to me that the format indeed describes
> what is in memory rather than how pixels look to software.
> 
> 
 I'm not sure why you guys were talking about BE in the first place,
>>>
>>> I was worried that the translation didn't consider endianess.
>>
>> The translation in gud_xrgb_to_color definitely seems suspect.
> 
> So to me this means that the gud_pipe translations from XRGB to the
> 1-bit formats *do* have to adjust for the reversed order on BE.
> 
> 
>> There's also a gud_is_big_endian, but I'm guessing this applies to the
>> downstream device rather than the host system.
> 
> gud_is_big_endian() is a static bool wrapper around defined(__BIG_ENDIAN)
> so yes, it applies to the host.
> 
> With memory layout being constant I again think gud_xrgb_to_color()
> needs to take further steps to work correctly also on BE hosts. (Maybe
> that's le32_to_cpu(*pix32), maybe drm_fb_swab(), maybe something else?)
> 
> 
>> I didn't check if dev->mode_config.quirk_addfb_prefer_host_byte_order
>> is set
> 
> I can't tell if that's helpful, probably Noralf can.
> 

I skimmed a discussion a while back around BE and all it's problems
through the whole graphics stack and my take away was that I needed a BE
machine so I could test the whole stack to make sure it works.
Judging from what Ilia says I've been wrong in my assumptions so the
driver is probably broken on BE (along with my SPI display drivers).
I'm not going to try and fix this now, I need someone with a BE machine
that can test the whole stack and make sure it actually works. Me doing
more guesswork is not going to work out well I think :)

I'll add a FIXME in gud_pipe.c that BE is probably broken and link to
this discussion.

It seems Gerd did some work to fix BE in the kernel:
drm: byteorder fixes
https://patchwork.freedesktop.org/series/49073/

But that seems to deal with the format value itself and not the buffer
contents.

Noralf.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 212229] STM32F469: vblank wait timed out on output to /sys/class/graphics/fb0/pan

https://bugzilla.kernel.org/show_bug.cgi?id=212229

--- Comment #3 from Yauheni Saldatsenka (eugen...@gmail.com) ---
Created attachment 295809
  --> https://bugzilla.kernel.org/attachment.cgi?id=295809&action=edit
/sys/kernel/debug/dri/0/state

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are on the CC list for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 210321] /display/dc/dcn20/dcn20_resource.c:3240 dcn20_validate_bandwidth_fp+0x8b/0xd0 [amdgpu]