Re: [RFC PATCH 5/7] drm/ttm: add range busy check for range manager

2022-03-17 Thread Christian König

On 16.03.22 at 16:28, Robert Beckett wrote:



On 16/03/2022 14:39, Christian König wrote:

On 16.03.22 at 15:26, Robert Beckett wrote:


[SNIP]
this is where I replace an existing range check via drm_mm with the 
range check I added in this patch.


Mhm, I still don't get the use case from the code, but I don't think 
it matters any more.


I suppose we could add another drm_mm range tracker just for 
testing and shadow track each allocation in the range, but that 
seemed like a lot of extra infrastructure for no general runtime use.


I have no idea what you mean with that.


I meant as a potential solution to tracking allocations without a 
range check, we would need to add something external. e.g. adding a 
shadow drm_mm range tracker, or a bitmask across the range, or stick 
objects in a list etc.


Ah! So you are trying to get access to the drm_mm inside the 
ttm_range_manager and not add some additional range check function! 
Now I got your use case.


well, specifically I was trying to avoid having to get access to the 
drm_mm.
I wanted to maintain an abstract interface at the resource manager 
level, hence the rfc to ask if we could add a range check to 
ttm_resource_manager_func.


I don't like the idea of code external to ttm having to poke into the 
implementation details of the manager to get its underlying drm_mm.


The purpose of the ttm_range_manager is to implement a base class which 
is then extended by the drivers with more explicit functionality.


I have it on my TODO list to properly export the ttm_range_manager 
functions and use them to simplify the amdgpu_gtt_mgr.c implementation.


So accessing the drm_mm for a test case sounds perfectly fine to me as 
long as you document what is happening. E.g. maybe add a wrapper 
function to get a pointer to the drm_mm.
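
A minimal sketch of what such a wrapper could look like. The struct definitions below are simplified stand-ins for the kernel types so the example compiles on its own, and the accessor name ttm_range_man_get_mm() is invented for illustration; it is not an existing TTM function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption: only the
 * fields needed for this sketch are modelled). */
struct drm_mm { unsigned long dummy; };
struct ttm_resource_manager { int use_type; };

struct ttm_range_manager {
	struct ttm_resource_manager manager;
	struct drm_mm mm;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct ttm_range_manager *
to_range_manager(struct ttm_resource_manager *man)
{
	return container_of(man, struct ttm_range_manager, manager);
}

/*
 * Hypothetical accessor: return the drm_mm backing a range manager.
 * Only valid if @man is embedded in a ttm_range_manager; meant for
 * test code, and the caller is still responsible for locking.
 */
static struct drm_mm *ttm_range_man_get_mm(struct ttm_resource_manager *man)
{
	assert(man != NULL);
	return &to_range_manager(man)->mm;
}
```

A test would call the accessor on its manager and then run the usual drm_mm queries on the returned pointer, with a comment documenting why it reaches into the manager.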






would you mind explaining the rationale for removing range checks? 
It seems to me like a natural fit for a memory manager


TTM manages buffer objects and resources, not address space. The 
lpfn/fpfn parameters for the resource allocators are actually used 
as just two independent parameters and do not define any range. We 
just keep the names for historical reasons.


The only places we still use and compare them as ranges are 
ttm_resource_compat() and ttm_bo_eviction_valuable() and I already 
have patches to clean up those and move them into the backend 
resource handling.


except the ttm_range_manager seems to still use them as a range 
specifier.


Yeah, because the range manager is the backend which handles ranges 
using the drm_mm :)


If the general design going forward is to not consider ranges, how 
would you recommend constructing buffers around pre-allocated 
regions, e.g. UEFI frame buffers whose range is dictated externally?


Call ttm_bo_mem_space() with the fpfn/lpfn filled in as required. See 
function amdgpu_bo_create_kernel_at() for an example.


ah, I see, thanks.

To allow similar code to before, which was conceptually just trying to 
see if a range was currently free, would you be okay with a new 
ttm_bo_mem_try_space, which does not do the force to evict, but 
instead returns -EBUSY?


You can already do that by setting the num_busy_placement to zero. That 
should prevent any eviction.
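
As a rough sketch of that setup: the structs below are simplified stand-ins that mirror the fields of struct ttm_place and struct ttm_placement as they looked around this kernel version, and probe_placement_init() is a hypothetical helper name. The error code returned by the real ttm_bo_mem_space() in this case is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins mirroring the TTM placement structs (assumption: field
 * names as in include/drm/ttm/ttm_placement.h at the time). */
struct ttm_place {
	unsigned fpfn;
	unsigned lpfn;
	unsigned mem_type;
	unsigned flags;
};

struct ttm_placement {
	unsigned num_placement;
	const struct ttm_place *placement;
	unsigned num_busy_placement;
	const struct ttm_place *busy_placement;
};

/*
 * Hypothetical helper: describe one placement restricted to
 * [fpfn, lpfn] and provide no busy placements, so an allocation
 * attempt has no fallback that would evict other buffers.
 */
static void probe_placement_init(struct ttm_placement *placement,
				 struct ttm_place *place,
				 unsigned mem_type,
				 unsigned fpfn, unsigned lpfn)
{
	assert(placement != NULL && place != NULL);

	place->fpfn = fpfn;
	place->lpfn = lpfn;
	place->mem_type = mem_type;
	place->flags = 0;

	placement->num_placement = 1;
	placement->placement = place;
	placement->num_busy_placement = 0;	/* no eviction fallback */
	placement->busy_placement = NULL;
}
```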


Regards,
Christian.




If so, the test can try to alloc, and immediately free if successful 
which would imply it was free.







Signed-off-by: Robert Beckett 
---
  drivers/gpu/drm/ttm/ttm_range_manager.c | 21 +++++++++++++++++++++
  include/drm/ttm/ttm_range_manager.h     |  3 +++
  2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 8cd4f3fb9f79..5662627bb933 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -206,3 +206,24 @@ int ttm_range_man_fini_nocheck(struct ttm_device *bdev,
 	return 0;
 }
 EXPORT_SYMBOL(ttm_range_man_fini_nocheck);
+
+/**
+ * ttm_range_man_range_busy - Check whether anything is allocated within a range
+ *
+ * @man: memory manager to check
+ * @fpfn: first page number to check
+ * @lpfn: last page number to check
+ *
+ * Return: true if anything is allocated within the range, false otherwise.
+ */
+bool ttm_range_man_range_busy(struct ttm_resource_manager *man,
+			      unsigned fpfn, unsigned lpfn)
+{
+	struct ttm_range_manager *rman = to_range_manager(man);
+	struct drm_mm *mm = &rman->mm;
+
+	if (__drm_mm_interval_first(mm, PFN_PHYS(fpfn), PFN_PHYS(lpfn + 1) - 1))
+		return true;
+	return false;
+}
+EXPORT_SYMBOL(ttm_range_man_range_busy);
diff --git a/include/drm/ttm/ttm_range_manager.h b/include/drm/ttm/ttm_range_manager.h
index 7963b957e9ef..86794a3f9101 100644
--- a/include/drm/ttm/ttm_range_manager.h
+++ b/include/drm/ttm/ttm_range_manager.h
@@ -53,4 +53,7 @@ static __always_inline int ttm_range_man_fini(struct ttm_device *bdev,
  

Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Christian König

On 16.03.22 at 16:36, Rob Clark wrote:

[SNIP]
just one point of clarification.. in the msm and i915 case it is
purely for debugging and telemetry (ie. sending crash logs back to
distro for analysis if user has crash reporting enabled).. it isn't
used for triggering any action like killing app or compositor.


By the way, how does msm do its memory management for the devcoredumps?

I mean, it is strictly forbidden to allocate any memory in the GPU reset 
path.



I would however *strongly* recommend devcoredump support in other GPU
drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
to debug and fix a couple obscure issues that I was not able to
reproduce by myself.


Yes, completely agree as well.

Thanks,
Christian.



BR,
-R




RE: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)

2022-03-17 Thread Kasireddy, Vivek
Hi Tvrtko,

> 
> On 16/03/2022 07:37, Kasireddy, Vivek wrote:
> > Hi Tvrtko,
> >
> >>
> >> On 15/03/2022 07:28, Kasireddy, Vivek wrote:
> >>> Hi Tvrtko, Daniel,
> >>>
> 
>  On 11/03/2022 09:39, Daniel Vetter wrote:
> > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy wrote:
> >>
> >> On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
> >> more framebuffers/scanout buffers results in only one that is mappable/
> >> fenceable. Therefore, pageflipping between these 2 FBs where only one
> >> is mappable/fenceable creates latencies large enough to miss alternate
> >> vblanks thereby producing less optimal framerate.
> >>
> >> This mainly happens because when i915_gem_object_pin_to_display_plane()
> >> is called to pin one of the FB objs, the associated vma is identified
> >> as misplaced and therefore i915_vma_unbind() is called which unbinds
> >> and evicts it. This misplaced vma gets subsequently pinned only when
> >> i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
> >> results in a latency of ~10ms and happens every other vblank/repaint
> >> cycle. Therefore, to fix this issue, we try to see if there is space
> >> to map at least two objects of a given size and return early if there
> >> isn't. This would ensure that we do not try with PIN_MAPPABLE for any
> >> objects that are too big to map, thereby preventing unnecessary unbinds.
> >>
> >> Testcase:
> >> Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
> >> with an 8K@60 mode results in only ~40 FPS. Since upstream Weston
> >> submits a frame ~7ms before the next vblank, the latencies seen between
> >> atomic commit and flip event are 7, 24 (7 + 16.66), 7, 24, suggesting
> >> that it misses the vblank every other frame.
> >>
> >> Here is the ftrace snippet that shows the source of the ~10ms latency:
> >>                i915_gem_object_pin_to_display_plane() {
> >> 0.102 us     |   i915_gem_object_set_cache_level();
> >>                  i915_gem_object_ggtt_pin_ww() {
> >> 0.390 us     |     i915_vma_instance();
> >> 0.178 us     |     i915_vma_misplaced();
> >>                    i915_vma_unbind() {
> >>                      __i915_active_wait() {
> >> 0.082 us     |         i915_active_acquire_if_busy();
> >> 0.475 us     |       }
> >>                      intel_runtime_pm_get() {
> >> 0.087 us     |         intel_runtime_pm_acquire();
> >> 0.259 us     |       }
> >>                      __i915_active_wait() {
> >> 0.085 us     |         i915_active_acquire_if_busy();
> >> 0.240 us     |       }
> >>                      __i915_vma_evict() {
> >>                        ggtt_unbind_vma() {
> >>                          gen8_ggtt_clear_range() {
> >> 10507.255 us |           }
> >> 10507.689 us |         }
> >> 10508.516 us |       }
> >>
> >> v2: Instead of using bigjoiner checks, determine whether a scanout
> >>buffer is too big by checking to see if it is possible to map
> >>two of them into the ggtt.
> >>
> >> v3 (Ville):
> >> - Count how many fb objects can be fit into the available holes
> >>  instead of checking for a hole twice the object size.
> >> - Take alignment constraints into account.
> >> - Limit this large scanout buffer check to >= Gen 11 platforms.
> >>
> >> v4:
> >> - Remove existing heuristic that checks just for size. (Ville)
> >> - Return early if we find space to map at-least two objects. (Tvrtko)
> >> - Slightly update the commit message.
> >>
> >> v5: (Tvrtko)
> >> - Rename the function to indicate that the object may be too big to
> >>  map into the aperture.
> >> - Account for guard pages while calculating the total size required
> >>  for the object.
> >> - Do not subject all objects to the heuristic check and instead
> >>  consider objects only of a certain size.
> >> - Do the hole walk using the rbtree.
> >> - Preserve the existing PIN_NONBLOCK logic.
> >> - Drop the PIN_MAPPABLE check while pinning the VMA.
> >>
> >> v6: (Tvrtko)
> >> - Return 0 on success and the specific error code on failure to
> >>  preserve the existing behavior.
> >>
> >> v7: (Ville)
> >> - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and
> >>  size < ggtt->mappable_end / 4 checks.
> >> - Drop the redundant check that is based on previous heuristic.
> >>
> >> v8:
> >> - Make sure that we are holding the mutex associated with ggtt vm
> >>  as we traverse the hole nodes.
> >>
> >> v9: (Tvrtko)
> >> - Use mutex_lock_interruptible_nested() instead of mutex_lock().
> >>
> >> Cc: Ville Syrjälä 
> >> Cc: Maarten Lankhorst 
> >> Cc: Tvrtko Ursulin 
> >> Cc: M

Re: [PATCH] drm/ttm: fix uninit ptr deref in range manager alloc error path

2022-03-17 Thread Christian König

On 16.03.22 at 20:50, Robert Beckett wrote:

ttm_range_man_alloc would try to ttm_resource_fini the res pointer
before it is allocated.

Fixes: de3688e469b0 ("drm/ttm: add ttm_resource_fini v2")

Signed-off-by: Robert Beckett 


Reviewed-by: Christian König 

Good catch, going to push that to drm-misc-fixes.


---
  drivers/gpu/drm/ttm/ttm_range_manager.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 5662627bb933..1b4d8ca52f68 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -89,7 +89,7 @@ static int ttm_range_man_alloc(struct ttm_resource_manager *man,
 	spin_unlock(&rman->lock);
 
 	if (unlikely(ret)) {
-		ttm_resource_fini(man, *res);
+		ttm_resource_fini(man, &node->base);
 		kfree(node);
 		return ret;
 	}
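
The bug class fixed above can be reproduced in a few lines of userspace C. This is only an illustrative sketch: the types and the functions resource_fini() and range_alloc() are invented stand-ins, not the TTM code. The point is that the caller's out-pointer is assigned only on success, so the error path has to clean up through the local node.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented stand-ins for ttm_resource and ttm_range_mgr_node. */
struct resource { int initialized; };
struct range_node { struct resource base; };

static int fini_calls;

/* Stand-in for ttm_resource_fini(): tear down an initialized resource. */
static void resource_fini(struct resource *res)
{
	assert(res != NULL && res->initialized);
	res->initialized = 0;
	fini_calls++;
}

/*
 * Same shape as the fixed error path: *res is written only on success,
 * so on failure the cleanup must use &node->base, never *res.
 */
static int range_alloc(int simulate_failure, struct resource **res)
{
	struct range_node *node = calloc(1, sizeof(*node));

	if (!node)
		return -1;
	node->base.initialized = 1;

	if (simulate_failure) {
		resource_fini(&node->base);	/* *res is still unset here */
		free(node);
		return -1;
	}

	*res = &node->base;
	return 0;
}
```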




Re: [PATCH V9 1/5] dt-bindings: display: mediatek: add aal binding for MT8183

2022-03-17 Thread Krzysztof Kozlowski
On 17/03/2022 06:18, Rex-BC Chen wrote:
> Add aal binding for MT8183.
> 
> Signed-off-by: Rex-BC Chen 
> ---
>  .../devicetree/bindings/display/mediatek/mediatek,aal.yaml   | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 


Reviewed-by: Krzysztof Kozlowski 


Best regards,
Krzysztof


Re: [PATCH v8 22/24] drm: rockchip: Add VOP2 driver

2022-03-17 Thread Andy Yan

Hi Sascha:

On 3/16/22 20:22, Andy Yan wrote:


Hi Sascha and Daniel:

On 3/16/22 15:40, Sascha Hauer wrote:

On Tue, Mar 15, 2022 at 02:46:35PM +0800, Andy Yan wrote:

Hi Sascha:

On 3/11/22 16:33, Sascha Hauer wrote:

From: Andy Yan

The VOP2 unit is found on Rockchip SoCs beginning with rk3566/rk3568.
It replaces the VOP unit found in the older Rockchip SoCs.

This driver has been derived from the downstream Rockchip Kernel and
heavily modified:

- All nonstandard DRM properties have been removed
- dropped struct vop2_plane_state and pass around less data between
functions
- Dropped all DRM_FORMAT_* not known on upstream
- rework register access to get rid of excessively used macros
- Drop all waiting for framesyncs

The driver is tested with HDMI and MIPI-DSI display on a RK3568-EVB
board. Overlay support is tested with the modetest utility. AFBC support
on the cluster windows is tested with weston-simple-dmabuf-egl on
weston using the (yet to be upstreamed) panfrost driver support.

Do we need some modification to test AFBC with weston-simple-dmabuf-egl?

By default weston-simple-dmabuf-egl uses DRM_FORMAT_XRGB8888, which in the
panfrost driver ends up as PIPE_FORMAT_B8G8R8_UNORM, and
panfrost_afbc_format() returns PIPE_FORMAT_NONE for that. Change the
format to DRM_FORMAT_ABGR8888 using weston-simple-dmabuf-egl -f 0x34324241.
This ends up as PIPE_FORMAT_R8G8B8A8_UNORM in panfrost_afbc_format(),
which is a supported format.


I also tried the weston-simple-dmabuf-egl -f 0x34324241 command,

but I got this output log from weston[0]:

Layer 5 (pos 0x5000):
View 0 (role xdg_toplevel, PID 375, surface ID 3, top-level window 'simple-dmabuf-egl' of org.freedesktop.weston.simple-dmabuf-egl, 0xd08275e0):
position: (871, 174) -> (1127, 430)
[not opaque]
outputs: 0 (HDMI-A-1) (primary)
dmabuf buffer
format: 0x34324241 ABGR
modifier: ARM_BLOCK_SIZE=16x16,MODE=YTR|SPARSE (0x851)
Layer 6 (pos 0x2):
View 0 (role (null), PID 372, surface ID 18, background for output HDMI-A-1, 0xd0863520):
position: (0, 0) -> (1920, 1080)
[fully opaque]
outputs: 0 (HDMI-A-1) (primary)
[buffer not available]
[repaint] preparing state for output HDMI-A-1 (0)
[repaint] trying planes-only build state
[view] evaluating view 0xd083b0f0 for output HDMI-A-1 (0)
[view] not assigning view 0xd083b0f0 to plane (no buffer available)
[view] failing state generation: placing view 0xd083b0f0 to renderer not allowed
[repaint] could not build planes-only state, trying mixed
[state] using renderer FB ID 73 for mixed mode for output HDMI-A-1 (0)
[state] scanout will use for zpos 0
[view] evaluating view 0xd083b0f0 for output HDMI-A-1 (0)
[view] not assigning view 0xd083b0f0 to plane (no buffer available)
[view] view 0xd083b0f0 will be placed on the renderer
[view] evaluating view 0xd08275e0 for output HDMI-A-1 (0)
[plane] started with zpos 18446744073709551615
[view] view 0xd08275e0 will be placed on the renderer
[view] evaluating view 0xd0863520 for output HDMI-A-1 (0)
[view] not assigning view 0xd0863520 to plane (no buffer available)
[view] not assigning view 0xd0863520 to plane (occluded by renderer views)
[view] view 0xd0863520 will be placed on the renderer


From the log we can see that Layer 5 View 0 (0xd08275e0) is the 
AFBC view rendered by Panfrost.

But in the end it is placed on the renderer rather than on an AFBC 
window of the VOP: "[view] view 0xd083b0f0 will be placed on the renderer".

The output of sys/kernel/debug/dri/state also shows that only the 
non-AFBC window smart0-win0 is used.

It seems that it fails in Weston's drm_output_prepare_plane_view.

Maybe I need to dig deeper.



After digging deeper, I found that it fails in

drm_fb_get_from_dmabuf {


...

/* XXX: TODO:
 *
 * Currently the buffer is rejected if any dmabuf attribute
 * flag is set.  This keeps us from passing an inverted /
 * interlaced / bottom-first buffer (or any other type that may
 * be added in the future) through to an overlay. Ultimately,
 * these types of buffers should be handled through buffer
 * transforms and not as spot-checks requiring specific
 * knowledge. */
    if (dmabuf->attributes.flags) {
        drm_debug(backend, "\t\t\t\t invalid flag 0x%x\n",
                  dmabuf->attributes.flags);
        return NULL;
    }

}

After some grepping, I found that the flag is set in create_dmabuf_buffer 
by weston-simple-dmabuf-egl itself.

So I ran this test with -g: weston-simple-dmabuf-egl -f 0x34324241 -g

From the log I can see that this view now goes to an overlay plane, but 
it doesn't appear on the screen.

Looking at the dri state, I can see that the AFBC window Cluster1-win0 is 
enabled.

So I guess there is something wrong with the vop2 configuration.

I dumped the registers of OVERLAY, Cluster1-win0 and Smart0-win0 (primary 
plane).

I found an obvious error in the 0x604 (OVERLAY_LAYER_SEL) register; the 
configuration value


is 0x5476

Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-17 Thread Geert Uytterhoeven
Hi Chuansheng,

On Thu, Mar 17, 2022 at 7:17 AM Chuansheng Liu  wrote:
> Easily hit the below list corruption:
> ==
> list_add corruption. prev->next should be next (c0ceb090), but
> was ec604507edc8. (prev=ec604507edc8).
> WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
> __list_add_valid+0x53/0x80
> CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
> RIP: 0010:__list_add_valid+0x53/0x80
> Call Trace:
>  
>  fb_deferred_io_mkwrite+0xea/0x150
>  do_page_mkwrite+0x57/0xc0
>  do_wp_page+0x278/0x2f0
>  __handle_mm_fault+0xdc2/0x1590
>  handle_mm_fault+0xdd/0x2c0
>  do_user_addr_fault+0x1d3/0x650
>  exc_page_fault+0x77/0x180
>  ? asm_exc_page_fault+0x8/0x30
>  asm_exc_page_fault+0x1e/0x30
> RIP: 0033:0x7fd98fc8fad1
> ==
>
> The race happens when one process is adding &page->lru to the
> pagelist tail in fb_deferred_io_mkwrite() while another process is
> re-initializing the same &page->lru in fb_deferred_io_fault(), which is
> not protected by the lock.
>
> The fix is to initialize all the page lists once during initialization.
> It not only fixes the list corruption, but also avoids redundant
> INIT_LIST_HEAD() calls.
>
> Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
> Cc: Thomas Zimmermann 
> Signed-off-by: Chuansheng Liu 

Thanks for your patch!

> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
>  void fb_deferred_io_init(struct fb_info *info)
>  {
> struct fb_deferred_io *fbdefio = info->fbdefio;
> +   struct page *page;
> +   int i;

unsigned int i;

> BUG_ON(!fbdefio);
> mutex_init(&fbdefio->lock);
> @@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
> INIT_LIST_HEAD(&fbdefio->pagelist);
> if (fbdefio->delay == 0) /* set a default of 1 s */
> fbdefio->delay = HZ;
> +
> +   /* initialize all the page lists one time */
> +   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
> +   page = fb_deferred_io_page(info, i);
> +   INIT_LIST_HEAD(&page->lru);
> +   }
>  }
>  EXPORT_SYMBOL_GPL(fb_deferred_io_init);
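
To make the failure mode concrete, here is a small userspace sketch of the corruption. It uses a stripped-down list_head and a check modelled on what __list_add_valid() reports; the function names are simplified and the real race is concurrent, while this sketch simply replays the two steps in the problematic order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Consistency check in the spirit of __list_add_valid(). */
static bool list_add_valid(struct list_head *prev, struct list_head *next)
{
	return prev->next == next && next->prev == prev;
}

/* Add @new before @head; refuse if the neighbours are inconsistent. */
static bool list_add_tail_checked(struct list_head *new,
				  struct list_head *head)
{
	struct list_head *prev = head->prev;

	assert(new != NULL && head != NULL);
	if (!list_add_valid(prev, head))
		return false;

	new->next = head;
	new->prev = prev;
	prev->next = new;
	head->prev = new;
	return true;
}
```

If a node that is already on the pagelist is re-initialized, it points at itself while the list head still points at it. The next tail insertion then finds prev->next == prev instead of the list head, which matches the "prev->next should be next ... (prev=...)" pattern in the warning above.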

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PULL] drm-intel-next-fixes

2022-03-17 Thread Joonas Lahtinen
Hi Dave & Daniel,

Fix for vm_access() out-of-bounds access and PSR not staying disabled
during fastset once determined not reliable.

Then a naming fix to avoid conflicts for potential future fixes.

Regards, Joonas

***

drm-intel-next-fixes-2022-03-17:

- Do not re-enable PSR after it was marked as not reliable (Jose)
- Add missing boundary check in vm_access to avoid out-of-bounds access (Mastan)
- Naming fix for HPD short pulse handling for eDP (Jose)

The following changes since commit 5e7f44b5c2c035fe2e5458193c2bbee56db6a090:

  drm/i915/gtt: reduce overzealous alignment constraints for GGTT (2022-03-09 08:34:55 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-next-fixes-2022-03-17

for you to fetch changes up to 278da06c03655c2bb9bc36ebdf45b90a079b3bfd:

  drm/i915/display: Do not re-enable PSR after it was marked as not reliable (2022-03-16 08:17:40 +0200)


- Do not re-enable PSR after it was marked as not reliable (Jose)
- Add missing boundary check in vm_access to avoid out-of-bounds access (Mastan)
- Naming fix for HPD short pulse handling for eDP (Jose)


José Roberto de Souza (2):
  drm/i915/display: Fix HPD short pulse handling for eDP
  drm/i915/display: Do not re-enable PSR after it was marked as not reliable

Mastan Katragadda (1):
  drm/i915/gem: add missing boundary check in vm_access

 drivers/gpu/drm/i915/display/intel_dp.c  | 2 +-
 drivers/gpu/drm/i915/display/intel_pps.c | 6 +++---
 drivers/gpu/drm/i915/display/intel_pps.h | 2 +-
 drivers/gpu/drm/i915/display/intel_psr.c | 4 
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +-
 5 files changed, 10 insertions(+), 6 deletions(-)


Re: [PATCH 00/25] drm/msm/dpu: wide planes support

2022-03-17 Thread Dmitry Baryshkov

On 17/03/2022 04:10, Abhinav Kumar wrote:

Hi Dmitry

I have reviewed the series: some patches completely, while others, 
especially the plane to SSPP mapping, I still need to check.


But I had one question on the design.

I thought we were going to have a boot param to control whether the 
driver will internally use both rectangles for the layer, so that in the 
future, if compositors can do this splitting, we can use that instead of 
the driver doing it (keep the boot param disabled?).


No need for that in this patch series. If your composer allocates smaller 
planes, then the driver won't do a thing. For proper multirect there 
will be a boot param (at least initially), and then you can work on the 
custom properties, etc.




Thanks

Abhinav

On 2/9/2022 9:24 AM, Dmitry Baryshkov wrote:

It took me way longer to finish than I expected, and more patches than
I initially hoped. This patchset brings in multirect usage to support
using two SSPP rectangles for a single plane. Virtual planes support is
omitted from this pull request; it will come later.

Dmitry Baryshkov (25):
   drm/msm/dpu: rip out master planes support
   drm/msm/dpu: do not limit the zpos property
   drm/msm/dpu: add support for SSPP allocation to RM
   drm/msm/dpu: move SSPP debugfs creation to dpu_kms.c
   drm/msm/dpu: move pipe_hw to dpu_plane_state
   drm/msm/dpu: inline dpu_plane_get_ctl_flush
   drm/msm/dpu: drop dpu_plane_pipe function
   drm/msm/dpu: get rid of cached flush_mask
   drm/msm/dpu: dpu_crtc_blend_setup: split mixer and ctl logic
   drm/msm/dpu: introduce struct dpu_sw_pipe
   drm/msm/dpu: use dpu_sw_pipe for dpu_hw_sspp callbacks
   drm/msm/dpu: inline _dpu_plane_set_scanout
   drm/msm/dpu: pass dpu_format to _dpu_hw_sspp_setup_scaler3()
   drm/msm/dpu: move stride programming to
 dpu_hw_sspp_setup_sourceaddress
   drm/msm/dpu: remove dpu_hw_fmt_layout from struct dpu_hw_pipe_cfg
   drm/msm/dpu: drop EAGAIN check from dpu_format_populate_layout
   drm/msm/dpu: drop src_split and multirect check from
 dpu_crtc_atomic_check
   drm/msm/dpu: move the rest of plane checks to dpu_plane_atomic_check()
   drm/msm/dpu: don't use unsupported blend stages
   drm/msm/dpu: add dpu_hw_pipe_cfg to dpu_plane_state
   drm/msm/dpu: simplify dpu_plane_validate_src()
   drm/msm/dpu: rewrite plane's QoS-related functions to take dpu_sw_pipe
 and dpu_format
   drm/msm/dpu: rework dpu_plane_atomic_check() and
 dpu_plane_sspp_atomic_update()
   drm/msm/dpu: populate SmartDMA features in hw catalog
   drm/msm/dpu: add support for wide planes

  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  | 355 +++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h  |   1 -
  drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c   |   4 -
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c    |  10 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c    |  78 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h    |  35 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c   | 136 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h   |  88 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c   |  21 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h   |   1 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 813 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h |  42 +-
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c    |  81 ++
  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h    |   6 +
  drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h |  19 +-
  15 files changed, 827 insertions(+), 863 deletions(-)




--
With best wishes
Dmitry


Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling

2022-03-17 Thread David Hildenbrand
On 17.03.22 03:54, Alistair Popple wrote:
> Felix Kuehling  writes:
> 
>> On 2022-03-11 04:16, David Hildenbrand wrote:
>>> On 10.03.22 18:26, Alex Sierra wrote:
>>>> DEVICE_COHERENT pages introduce a subtle distinction in the way
>>>> "normal" pages can be used by various callers throughout the kernel.
>>>> They behave like normal pages for purposes of mapping in CPU page
>>>> tables, and for COW. But they do not support LRU lists, NUMA
>>>> migration or THP. Therefore we split vm_normal_page into two
>>>> functions vm_normal_any_page and vm_normal_lru_page. The latter will
>>>> only return pages that can be put on an LRU list and that support
>>>> NUMA migration, KSM and THP.
>>>>
>>>> We also introduced a FOLL_LRU flag that adds the same behaviour to
>>>> follow_page and related APIs, to allow callers to specify that they
>>>> expect to put pages on an LRU list.

>>> I still don't see the need for s/vm_normal_page/vm_normal_any_page/. And
>>> as this patch is dominated by that change, I'd suggest (again) to just
>>> drop it as I don't see any value of that renaming. No specifier implies any.
>>
>> OK. If nobody objects, we can adopt that naming convention.
> 
> I'd prefer we avoid the churn too, but I don't think we should make
> vm_normal_page() the equivalent of vm_normal_any_page(). It would mean
> vm_normal_page() would return non-LRU device coherent pages, but to me at
> least device coherent pages seem special and not what I'd expect from a
> function with "normal" in the name.
> 
> So I think it would be better to s/vm_normal_lru_page/vm_normal_page/ and keep
> vm_normal_any_page() (or perhaps call it vm_any_page?). This is basically what
> the previous incarnation of this feature did:
> 
> struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> pte_t pte, bool with_public_device);
> #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
> 
> Except we should add:
> 
> #define vm_normal_any_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, true)
> 

"normal" simply tells us that this is not a special mapping -- IOW, we
want the VM to take a look at the memmap and not treat it like a PFN
map. What we're changing is that we're now also returning non-lru pages.
Fair enough, that's why we introduce vm_normal_lru_page() as a
replacement where we really can only deal with lru pages.

vm_normal_page vs vm_normal_lru_page is good enough. "lru" further
limits what we get via vm_normal_page, that's even how it's implemented.

vm_normal_page vs vm_normal_any_page is confusing IMHO.
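
A toy model of the layering argued for here, to show what "lru further limits what we get via vm_normal_page" means in code. Everything below is invented for illustration: the real functions take (vma, addr, pte), and struct toy_page with its two flags only stands in for the checks the kernel actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in: two flags instead of real page/pte state. */
struct toy_page {
	bool special;		/* special mapping, e.g. a PFN map */
	bool device_coherent;	/* has a struct page but is not on the LRU */
};

/* "normal": anything that is not a special mapping. */
static struct toy_page *toy_vm_normal_page(struct toy_page *page)
{
	if (!page || page->special)
		return NULL;
	return page;
}

/* "normal lru": the same lookup, further limited to LRU-capable pages. */
static struct toy_page *toy_vm_normal_lru_page(struct toy_page *page)
{
	page = toy_vm_normal_page(page);
	if (page && page->device_coherent)
		return NULL;
	return page;
}
```

In this model a device coherent page is returned by the plain lookup but filtered out by the LRU variant, so callers that can only handle LRU pages switch to the narrower function and everyone else keeps the existing name.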


-- 
Thanks,

David / dhildenb



[PATCH v2 4/5] drm/ssd130x: Reduce temporary buffer sizes

2022-03-17 Thread Geert Uytterhoeven
ssd130x_clear_screen() allocates a temporary buffer sized to hold one
byte per pixel, while it only needs to hold one bit per pixel.

ssd130x_fb_blit_rect() allocates a temporary buffer sized to hold one
byte per pixel for the whole frame buffer, while it only needs to hold
one bit per pixel for the part that is to be updated.
Pass dst_pitch to drm_fb_xrgb8888_to_mono(), as we have already
calculated it anyway.

Fixes: a61732e808672cfa ("drm: Add driver for Solomon SSD130x OLED displays")
Signed-off-by: Geert Uytterhoeven 
Acked-by: Javier Martinez Canillas 
---
v2:
  - Add Acked-by,
  - s/drm_fb_xrgb8888_to_mono_reversed/drm_fb_xrgb8888_to_mono/ in
    description.
---
 drivers/gpu/drm/solomon/ssd130x.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/solomon/ssd130x.c b/drivers/gpu/drm/solomon/ssd130x.c
index 7c99af4ce9dd4e5c..38b6c2c14f53644b 100644
--- a/drivers/gpu/drm/solomon/ssd130x.c
+++ b/drivers/gpu/drm/solomon/ssd130x.c
@@ -440,7 +440,8 @@ static void ssd130x_clear_screen(struct ssd130x_device *ssd130x)
 		.y2 = ssd130x->height,
 	};
 
-	buf = kcalloc(ssd130x->width, ssd130x->height, GFP_KERNEL);
+	buf = kcalloc(DIV_ROUND_UP(ssd130x->width, 8), ssd130x->height,
+		      GFP_KERNEL);
 	if (!buf)
 		return;
 
@@ -454,6 +455,7 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 {
 	struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
 	void *vmap = map->vaddr; /* TODO: Use mapping abstraction properly */
+	unsigned int dst_pitch;
 	int ret = 0;
 	u8 *buf = NULL;
 
@@ -461,11 +463,12 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 	rect->y1 = round_down(rect->y1, 8);
 	rect->y2 = min_t(unsigned int, round_up(rect->y2, 8), ssd130x->height);
 
-	buf = kcalloc(fb->width, fb->height, GFP_KERNEL);
+	dst_pitch = DIV_ROUND_UP(drm_rect_width(rect), 8);
+	buf = kcalloc(dst_pitch, drm_rect_height(rect), GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
 
-	drm_fb_xrgb8888_to_mono(buf, 0, vmap, fb, rect);
+	drm_fb_xrgb8888_to_mono(buf, dst_pitch, vmap, fb, rect);
 
 	ssd130x_update_rect(ssd130x, buf, rect);
 
 
-- 
2.25.1
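
The size reduction in this patch is simple arithmetic. As a sketch, with helper names invented for the example and DIV_ROUND_UP redefined locally so it builds outside the kernel:

```c
#include <assert.h>
#include <stddef.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* One byte per pixel: what kcalloc(width, height) used to reserve. */
static size_t bytes_one_byte_per_pixel(unsigned int width,
				       unsigned int height)
{
	return (size_t)width * height;
}

/* One bit per pixel, each line padded to a whole byte (the pitch). */
static size_t bytes_one_bit_per_pixel(unsigned int width,
				      unsigned int height)
{
	size_t pitch = DIV_ROUND_UP(width, 8);

	assert(width > 0 && height > 0);
	return pitch * height;
}
```

For a 128x64 rectangle (dimensions picked only as an example) that is 8192 bytes before and 16 * 64 = 1024 bytes after, an eightfold reduction whenever the width is a multiple of 8.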



[PATCH v2 3/5] drm/ssd130x: Fix rectangle updates

2022-03-17 Thread Geert Uytterhoeven
The rectangle update functions ssd130x_fb_blit_rect() and
ssd130x_update_rect() do not behave correctly when x1 != 0 or y1 !=
0, or when y1 or y2 are not aligned to display page boundaries.
E.g. when used as a text console, only the first line of text is shown
on the display.

  1. The buffer passed by ssd130x_fb_blit_rect() points to the first
 byte of monochrome bitmap data, and thus has its origin at (x1,
 y1), while ssd130x_update_rect() assumes it is at (0, 0).
 Fix ssd130x_update_rect() by changing the vertical and horizontal
 loop ranges, and adding the offsets only when needed.

  2. In ssd130x_fb_blit_rect(), align y1 and y2 to the display page
 boundaries before doing the color conversion, so the full page
 is converted and updated.
 Remove the correction for an unaligned y1 from
 ssd130x_update_rect(), and add a check to make sure y1 is aligned.

Fixes: a61732e808672cfa ("drm: Add driver for Solomon SSD130x OLED displays")
Signed-off-by: Geert Uytterhoeven 
Acked-by: Javier Martinez Canillas 
---
v2:
  - Add Acked-by.

Note that instead of calling drm_fb_xrgb8888_to_mono() and transposing
the bitmap, the image data could be converted to the transposed format
directly.  However, that would preclude exposing a monochrome format to
userspace when a fourcc for such a monochrome format is introduced.
---
 drivers/gpu/drm/solomon/ssd130x.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/solomon/ssd130x.c b/drivers/gpu/drm/solomon/ssd130x.c
index caee851efd5726e7..7c99af4ce9dd4e5c 100644
--- a/drivers/gpu/drm/solomon/ssd130x.c
+++ b/drivers/gpu/drm/solomon/ssd130x.c
@@ -355,11 +355,14 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
 	unsigned int width = drm_rect_width(rect);
 	unsigned int height = drm_rect_height(rect);
 	unsigned int line_length = DIV_ROUND_UP(width, 8);
-	unsigned int pages = DIV_ROUND_UP(y % 8 + height, 8);
+	unsigned int pages = DIV_ROUND_UP(height, 8);
+	struct drm_device *drm = &ssd130x->drm;
 	u32 array_idx = 0;
 	int ret, i, j, k;
 	u8 *data_array = NULL;
 
+	drm_WARN_ONCE(drm, y % 8 != 0, "y must be aligned to screen page\n");
+
 	data_array = kcalloc(width, pages, GFP_KERNEL);
 	if (!data_array)
 		return -ENOMEM;
@@ -401,13 +404,13 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
 	if (ret < 0)
 		goto out_free;
 
-	for (i = y / 8; i < y / 8 + pages; i++) {
+	for (i = 0; i < pages; i++) {
 		int m = 8;
 
 		/* Last page may be partial */
-		if (8 * (i + 1) > ssd130x->height)
+		if (8 * (y / 8 + i + 1) > ssd130x->height)
 			m = ssd130x->height % 8;
-		for (j = x; j < x + width; j++) {
+		for (j = 0; j < width; j++) {
 			u8 data = 0;
 
 			for (k = 0; k < m; k++) {
@@ -454,6 +457,10 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 	int ret = 0;
 	u8 *buf = NULL;
 
+	/* Align y to display page boundaries */
+	rect->y1 = round_down(rect->y1, 8);
+	rect->y2 = min_t(unsigned int, round_up(rect->y2, 8), ssd130x->height);
+
 	buf = kcalloc(fb->width, fb->height, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
-- 
2.25.1



[PATCH v2 5/5] drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()

2022-03-17 Thread Geert Uytterhoeven
As the temporary buffer is no longer used to store 8-bit grayscale data,
its size can be reduced to the size needed to store the monochrome
bitmap data.

Fixes: 24c6bedefbe71de9 ("drm/repaper: Use format helper for xrgb to monochrome conversion")
Signed-off-by: Geert Uytterhoeven 
Reviewed-by: Javier Martinez Canillas 
---
v2:
  - Add Reviewed-by.

Untested due to lack of hardware.

I replaced kmalloc_array() by kmalloc() to match size calculations in
other locations in this driver.  There is no point in handling a
possible multiplication overflow only here.
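
For illustration only, the size reduction amounts to the following
arithmetic. This is a userspace sketch; the helper names are hypothetical
and the 264x176 resolution mentioned below is just an example value.

```c
#include <assert.h>
#include <stddef.h>

/* 8-bit grayscale: one byte per pixel (the old allocation size). */
static size_t gray8_buf_size(size_t width, size_t height)
{
	return width * height;
}

/* 1 bpp monochrome: eight pixels per byte (the new allocation size). */
static size_t mono_buf_size(size_t width, size_t height)
{
	/* The plain division is only exact for a multiple of 8 pixels. */
	assert((width * height) % 8 == 0);

	return width * height / 8;
}
```

For a 264x176 framebuffer this gives 5808 bytes instead of 46464.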
---
 drivers/gpu/drm/tiny/repaper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
index a096fb8b83e99dc8..7738b87f370ad147 100644
--- a/drivers/gpu/drm/tiny/repaper.c
+++ b/drivers/gpu/drm/tiny/repaper.c
@@ -530,7 +530,7 @@ static int repaper_fb_dirty(struct drm_framebuffer *fb)
DRM_DEBUG("Flushing [FB:%d] st=%ums\n", fb->base.id,
  epd->factored_stage_time);
 
-   buf = kmalloc_array(fb->width, fb->height, GFP_KERNEL);
+   buf = kmalloc(fb->width * fb->height / 8, GFP_KERNEL);
if (!buf) {
ret = -ENOMEM;
goto out_exit;
-- 
2.25.1



[PATCH v2 0/5] drm: Fix monochrome conversion for ssd130x

2022-03-17 Thread Geert Uytterhoeven
Hi all,

This patch series contains fixes and improvements for the XRGB8888 to
monochrome conversion in the DRM core, and for its users.

This has been tested on an Adafruit FeatherWing 128x32 OLED, connected
to an OrangeCrab ECP5 FPGA board running a 64 MHz VexRiscv RISC-V
softcore, using a text console with 4x6, 7x14 and 8x8 fonts.

Thanks!

Geert Uytterhoeven (5):
  drm/format-helper: Rename drm_fb_xrgb_to_mono_reversed()
  drm/format-helper: Fix XRGB888 to monochrome conversion
  drm/ssd130x: Fix rectangle updates
  drm/ssd130x: Reduce temporary buffer sizes
  drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()

 drivers/gpu/drm/drm_format_helper.c | 74 +++--
 drivers/gpu/drm/solomon/ssd130x.c   | 24 +++---
 drivers/gpu/drm/tiny/repaper.c  |  4 +-
 include/drm/drm_format_helper.h |  5 +-
 4 files changed, 48 insertions(+), 59 deletions(-)

-- 
2.25.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH v2 1/5] drm/format-helper: Rename drm_fb_xrgb8888_to_mono_reversed()

2022-03-17 Thread Geert Uytterhoeven
There is no "reversed" handling in drm_fb_xrgb8888_to_mono_reversed():
the function just converts from color to grayscale, and reduces the
number of grayscale levels from 256 to 2 (i.e. brightness 0-127 is
mapped to 0, 128-255 to 1).  All "reversed" handling is done in the
repaper driver, where this function originated.

Hence make this clear by renaming drm_fb_xrgb8888_to_mono_reversed() to
drm_fb_xrgb8888_to_mono(), and documenting the black/white pixel
mapping.

Fixes: bcf8b616deb87941 ("drm/format-helper: Add drm_fb_xrgb8888_to_mono_reversed()")
Signed-off-by: Geert Uytterhoeven 
Acked-by: Javier Martinez Canillas 
Reviewed-by: Andy Shevchenko 
---
v2:
  - Add Acked-by, Reviewed-by,
  - Join 2 lines.
---
 drivers/gpu/drm/drm_format_helper.c | 31 ++---
 drivers/gpu/drm/solomon/ssd130x.c   |  2 +-
 drivers/gpu/drm/tiny/repaper.c  |  2 +-
 include/drm/drm_format_helper.h |  5 ++---
 4 files changed, 19 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c
index bc0f49773868a9b0..5d9d0c695845f575 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -594,8 +594,8 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
 }
 EXPORT_SYMBOL(drm_fb_blit_toio);
 
-static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, 
unsigned int pixels,
-  unsigned int start_offset, 
unsigned int end_len)
+static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int 
pixels,
+ unsigned int start_offset, unsigned int 
end_len)
 {
unsigned int xb, i;
 
@@ -621,8 +621,8 @@ static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, 
const u8 *src, unsigned
 }
 
 /**
- * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome
- * @dst: reversed monochrome destination buffer
+ * drm_fb_xrgb_to_mono - Convert XRGB to monochrome
+ * @dst: monochrome destination buffer (0=black, 1=white)
  * @dst_pitch: Number of bytes between two consecutive scanlines within dst
  * @src: XRGB source buffer
  * @fb: DRM framebuffer
@@ -633,10 +633,10 @@ static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, 
const u8 *src, unsigned
  * and use this function to convert to the native format.
  *
  * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and
- * then the result is converted from grayscale to reversed monohrome.
+ * then the result is converted from grayscale to monochrome.
  */
-void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const 
void *vaddr,
- const struct drm_framebuffer *fb, const 
struct drm_rect *clip)
+void drm_fb_xrgb_to_mono(void *dst, unsigned int dst_pitch, const void 
*vaddr,
+const struct drm_framebuffer *fb, const struct 
drm_rect *clip)
 {
unsigned int linepixels = drm_rect_width(clip);
unsigned int lines = clip->y2 - clip->y1;
@@ -652,8 +652,8 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned 
int dst_pitch, const v
return;
 
/*
-* The reversed mono destination buffer contains 1 bit per pixel
-* and destination scanlines have to be in multiple of 8 pixels.
+* The mono destination buffer contains 1 bit per pixel and
+* destination scanlines have to be in multiple of 8 pixels.
 */
if (!dst_pitch)
dst_pitch = DIV_ROUND_UP(linepixels, 8);
@@ -664,9 +664,9 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned 
int dst_pitch, const v
 * The cma memory is write-combined so reads are uncached.
 * Speed up by fetching one line at a time.
 *
-* Also, format conversion from XR24 to reversed monochrome
-* are done line-by-line but are converted to 8-bit grayscale
-* as an intermediate step.
+* Also, format conversion from XR24 to monochrome are done
+* line-by-line but are converted to 8-bit grayscale as an
+* intermediate step.
 *
 * Allocate a buffer to be used for both copying from the cma
 * memory and to store the intermediate grayscale line pixels.
@@ -683,7 +683,7 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned 
int dst_pitch, const v
 * are not aligned to multiple of 8.
 *
 * Calculate if the start and end pixels are not aligned and set the
-* offsets for the reversed mono line conversion function to adjust.
+* offsets for the mono line conversion function to adjust.
 */
start_offset = clip->x1 % 8;
end_len = clip->x2 % 8;
@@ -692,12 +692,11 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned 
int dst_pitch, const v
for (y = 0; y < lines; y++) {
src32 = memcpy(src32, vaddr, len_src32);
 

[PATCH v2 2/5] drm/format-helper: Fix XRGB888 to monochrome conversion

2022-03-17 Thread Geert Uytterhoeven
The conversion functions drm_fb_xrgb8888_to_mono() and
drm_fb_gray8_to_mono_line() do not behave correctly when the
horizontal boundaries of the clip rectangle are not multiples of 8:
  a. When x1 % 8 != 0, the calculated pitch is not correct,
  b. When x2 % 8 != 0, the pixel data for the last byte is wrong.

Simplify the code and fix (a) by:
  1. Removing start_offset, and always storing the first pixel in the
 first bit of the monochrome destination buffer.
 Drivers that require the first pixel in a byte to be located at an
 x-coordinate that is a multiple of 8 can always align the clip
     rectangle before calling drm_fb_xrgb8888_to_mono().
 Note that:
   - The ssd130x driver does not need the alignment, as the
 monochrome buffer is a temporary format,
   - The repaper driver always updates the full screen, so the clip
 rectangle is always aligned.
  2. Passing the number of pixels to drm_fb_gray8_to_mono_line(),
 instead of the number of bytes, and the number of pixels in the
 last byte.

Fix (b) by explicitly setting the target bit, instead of always setting
bit 7 and shifting the value in each loop iteration.

Remove the bogus pitch check, which operates on bytes instead of pixels,
and triggers when e.g. flashing the cursor on a text console with a font
that is 8 pixels wide.

Drop the confusing comment about scanlines, as a pitch in bytes always
contains a multiple of 8 pixels.

While at it, use the drm_rect_height() helper instead of open-coding the
same operation.

Update the comments accordingly.

Fixes: bcf8b616deb87941 ("drm/format-helper: Add drm_fb_xrgb8888_to_mono_reversed()")
Signed-off-by: Geert Uytterhoeven 
Acked-by: Javier Martinez Canillas 
Reviewed-by: Andy Shevchenko 
---
v2:
  - Add Acked-by, Reviewed-by,
  - Use ">= 128" instead of "& BIT(7)" to increase readability.

I tried hard to fix this in small steps, but everything was so
intertwined that this turned out to be infeasible.

Note that making these changes does not introduce regressions in the
ssd130x driver, as the latter is broken for x1 != 0 or y1 != 0 anyway.
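
For illustration only, the new line conversion can be exercised in
userspace with the sketch below. It mirrors the loop in the diff, with
uint8_t in place of u8 and an open-coded min(); mono_pitch() is a
hypothetical helper that corresponds to DIV_ROUND_UP(linepixels, 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert one line of 8-bit grayscale to 1 bpp monochrome.
 * The first pixel lands in bit 0 (LSB) of the first destination byte,
 * and gray values >= 128 become 1 (white).
 */
static void gray8_to_mono_line(uint8_t *dst, const uint8_t *src,
			       unsigned int pixels)
{
	assert(dst && src);

	while (pixels) {
		unsigned int i, bits = pixels < 8 ? pixels : 8;
		uint8_t byte = 0;

		/* Set each target bit explicitly; unused bits stay 0. */
		for (i = 0; i < bits; i++, pixels--) {
			if (*src++ >= 128)
				byte |= (uint8_t)(1u << i);
		}
		*dst++ = byte;
	}
}

/* Bytes needed to hold one monochrome line of the given width. */
static size_t mono_pitch(unsigned int pixels)
{
	return (pixels + 7) / 8;
}
```

A 10-pixel line therefore needs 2 bytes, and only the two low bits of
the second byte are ever set.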
---
 drivers/gpu/drm/drm_format_helper.c | 55 ++---
 1 file changed, 18 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c
index 5d9d0c695845f575..e085f855a199013f 100644
--- a/drivers/gpu/drm/drm_format_helper.c
+++ b/drivers/gpu/drm/drm_format_helper.c
@@ -594,27 +594,16 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int 
dst_pitch, uint32_t dst_for
 }
 EXPORT_SYMBOL(drm_fb_blit_toio);
 
-static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int 
pixels,
- unsigned int start_offset, unsigned int 
end_len)
-{
-   unsigned int xb, i;
-
-   for (xb = 0; xb < pixels; xb++) {
-   unsigned int start = 0, end = 8;
-   u8 byte = 0x00;
-
-   if (xb == 0 && start_offset)
-   start = start_offset;
 
-   if (xb == pixels - 1 && end_len)
-   end = end_len;
-
-   for (i = start; i < end; i++) {
-   unsigned int x = xb * 8 + i;
+static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int 
pixels)
+{
+   while (pixels) {
+   unsigned int i, bits = min(pixels, 8U);
+   u8 byte = 0;
 
-   byte >>= 1;
-   if (src[x] >> 7)
-   byte |= BIT(7);
+   for (i = 0; i < bits; i++, pixels--) {
+   if (*src++ >= 128)
+   byte |= BIT(i);
}
*dst++ = byte;
}
@@ -634,16 +623,22 @@ static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 
*src, unsigned int pixel
  *
  * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and
  * then the result is converted from grayscale to monochrome.
+ *
+ * The first pixel (upper left corner of the clip rectangle) will be converted
+ * and copied to the first bit (LSB) in the first byte of the monochrome
+ * destination buffer.
+ * If the caller requires that the first pixel in a byte must be located at an
+ * x-coordinate that is a multiple of 8, then the caller must take care itself
+ * of supplying a suitable clip rectangle.
  */
 void drm_fb_xrgb_to_mono(void *dst, unsigned int dst_pitch, const void 
*vaddr,
 const struct drm_framebuffer *fb, const struct 
drm_rect *clip)
 {
unsigned int linepixels = drm_rect_width(clip);
-   unsigned int lines = clip->y2 - clip->y1;
+   unsigned int lines = drm_rect_height(clip);
unsigned int cpp = fb->format->cpp[0];
unsigned int len_src32 = linepixels * cpp;
struct drm_device *dev = fb->dev;
-   unsigned int start_offset, end_len;
unsigned int y;
u8 *mono = dst, *gray8;
u32 *s

Re: [RFC PATCH 1/4] drm/amdkfd: Improve amdgpu_vm_handle_moved

2022-03-17 Thread Christian König

On 17.03.22 at 01:20, Felix Kuehling wrote:

Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by
the caller. This will be useful for handling extra BO VA mappings in
KFD VMs that are managed through the render node API.


Yes, that change has been on my TODO list for quite a while as well.


TODO: This may also allow simplification of amdgpu_cs_vm_handling. See
the TODO comment in the code.


No, that won't work just yet.

We need to change the TLB flush detection for that, but I'm already 
working on that as well.



Signed-off-by: Felix Kuehling 


Please update the TODO, with that done: Reviewed-by: Christian König 
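
For illustration only, the three-way decision the patch adds to
amdgpu_vm_handle_moved() can be modelled in userspace as below. This is a
sketch: struct update_plan and plan_bo_update() are hypothetical names,
trylock_ok stands for the result dma_resv_trylock() would give, and
lock_ctx stands for what dma_resv_locking_ctx() would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What to do with one BO on the moved list. */
struct update_plan {
	bool clear;	/* clear the PTEs instead of updating them */
	bool unlock;	/* we took the reservation lock and must drop it */
};

static struct update_plan plan_bo_update(bool vm_debug, bool trylock_ok,
					 const void *lock_ctx,
					 const void *ticket)
{
	struct update_plan p;

	/* A trylock cannot succeed while some context holds the lock. */
	assert(!(trylock_ok && lock_ctx));

	if (!vm_debug && trylock_ok) {
		/* We reserved the BO ourselves. */
		p.clear = false;
		p.unlock = true;
	} else if (ticket && lock_ctx == ticket) {
		/* The caller is already holding the reservation lock. */
		p.clear = false;
		p.unlock = false;
	} else {
		/* Somebody else is using the BO right now. */
		p.clear = true;
		p.unlock = false;
	}

	return p;
}
```

The separate unlock flag matters because "!clear" no longer implies that
the function itself took the lock.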




---
  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 18 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  3 ++-
  4 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index d162243d8e78..10941f0d8dde 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -826,6 +826,10 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser 
*p)
return r;
}
  
+	/* TODO: Is this loop still needed, or could this be handled by
+	 * amdgpu_vm_handle_moved, now that it can handle all BOs that are
+	 * reserved under p->ticket?
+	 */
amdgpu_bo_list_for_each_entry(e, p->bo_list) {
/* ignore duplicates */
bo = ttm_to_amdgpu_bo(e->tv.bo);
@@ -845,7 +849,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
return r;
}
  
-	r = amdgpu_vm_handle_moved(adev, vm);
+	r = amdgpu_vm_handle_moved(adev, vm, &p->ticket);
if (r)
return r;
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 579adfafe4d0..50805613c38c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -414,7 +414,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment 
*attach)
  
  		r = amdgpu_vm_clear_freed(adev, vm, NULL);
if (!r)
-   r = amdgpu_vm_handle_moved(adev, vm);
+   r = amdgpu_vm_handle_moved(adev, vm, ticket);
  
  		if (r && r != -EBUSY)
DRM_ERROR("Failed to invalidate VM page tables (%d))\n",
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index fc4563cf2828..726b42c6d606 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -2190,11 +2190,12 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
   * PTs have to be reserved!
   */
  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
-  struct amdgpu_vm *vm)
+  struct amdgpu_vm *vm,
+  struct ww_acquire_ctx *ticket)
  {
struct amdgpu_bo_va *bo_va, *tmp;
struct dma_resv *resv;
-   bool clear;
+   bool clear, unlock;
int r;
  
  	list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
@@ -2212,17 +2213,24 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
spin_unlock(&vm->invalidated_lock);
  
  		/* Try to reserve the BO to avoid clearing its ptes */
-   if (!amdgpu_vm_debug && dma_resv_trylock(resv))
+   if (!amdgpu_vm_debug && dma_resv_trylock(resv)) {
clear = false;
+   unlock = true;
+   /* The caller is already holding the reservation lock */
+   } else if (ticket && dma_resv_locking_ctx(resv) == ticket) {
+   clear = false;
+   unlock = false;
/* Somebody else is using the BO right now */
-   else
+   } else {
clear = true;
+   unlock = false;
+   }
  
  		r = amdgpu_vm_bo_update(adev, bo_va, clear, NULL);
if (r)
return r;
  
-		if (!clear)
+   if (unlock)
dma_resv_unlock(resv);
spin_lock(&vm->invalidated_lock);
}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index a40a6a993bb0..120a76aaae75 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -396,7 +396,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
  struct amdgpu_vm *vm,
  struct dma_fence **fence);
  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
-  struct amdgpu_vm *vm);
+  struct amdgpu_vm *vm,
+  

Re: [PATCH 3/3] drm/msm: Add a way to override processes comm/cmdline

2022-03-17 Thread Dan Carpenter
On Wed, Mar 16, 2022 at 05:29:45PM -0700, Rob Clark wrote:
>   switch (param) {
> + case MSM_PARAM_COMM:
> + case MSM_PARAM_CMDLINE: {
> + char *str, **paramp;
> +
> + str = kmalloc(len + 1, GFP_KERNEL);

if (!str)
return -ENOMEM;

> + if (copy_from_user(str, u64_to_user_ptr(value), len)) {
> + kfree(str);
> + return -EFAULT;
> + }
> +
> + /* Ensure string is null terminated: */
> + str[len] = '\0';
> +
> + if (param == MSM_PARAM_COMM) {
> + paramp = &ctx->comm;
> + } else {
> + paramp = &ctx->cmdline;
> + }
> +
> + kfree(*paramp);
> + *paramp = str;
> +
> + return 0;
> + }
>   case MSM_PARAM_SYSPROF:
>   if (!capable(CAP_SYS_ADMIN))
>   return -EPERM;
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index 4ec62b601adc..68f3f8ade76d 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -364,14 +364,21 @@ static void retire_submits(struct msm_gpu *gpu);
>  
>  static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, 
> char **cmd)
>  {
> + struct msm_file_private *ctx = submit->queue->ctx;
>   struct task_struct *task;
>  
> + *comm = kstrdup(ctx->comm, GFP_KERNEL);
> + *cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> +
>   task = get_pid_task(submit->pid, PIDTYPE_PID);
>   if (!task)
>   return;
>  
> - *comm = kstrdup(task->comm, GFP_KERNEL);
> - *cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
> + if (!*comm)
> + *comm = kstrdup(task->comm, GFP_KERNEL);

What?

If the first allocation failed, then this one is going to fail as well.
Just return -ENOMEM.  Or maybe this is meant to be checking for an empty
string?
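
For illustration only: a NULL result from the first kstrdup() has two
possible causes, since kstrdup() also returns NULL when its argument is
NULL. The userspace sketch below models the reading where the check is a
fallback for "no override set". dup_or_null() and pick_comm() are
hypothetical stand-ins, and malloc() replaces the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for kstrdup(): NULL in gives NULL out. */
static char *dup_or_null(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;

	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);

	return copy;
}

/* Prefer the override when one is set, else fall back to the task name. */
static char *pick_comm(const char *override, const char *task_comm)
{
	char *comm;

	assert(task_comm);

	comm = dup_or_null(override);
	if (!comm)	/* override unset, or the allocation failed */
		comm = dup_or_null(task_comm);

	return comm;
}
```

Note that the sketch, like the quoted hunk, cannot tell an unset override
apart from a failed allocation; that ambiguity is what the question above
is about.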

> +
> + if (!*cmd)
> + *cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);

Same.

>  
>   put_task_struct(task);
>  }

regards,
dan carpenter



Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Sharma, Shashank




On 3/16/2022 10:50 PM, Rob Clark wrote:

On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
 wrote:


From: Shashank Sharma 

This patch adds a new sysfs event, which notifies
userland about a GPU reset, and can also provide
some information like:
- process ID of the process involved with the GPU reset
- process name of the involved process
- the GPU status info (using flags)

This patch also introduces the first flag of the flags
bitmap, which can be appended as and when required.

V2: Addressed review comments from Christian and Amar
- move the reset information structure to DRM layer
- drop _ctx from struct name
- make pid 32 bit (rather than 64)
- set flag when VRAM invalid (rather than valid)
- add process name as well (Amar)

Cc: Alexandar Deucher 
Cc: Christian Koenig 
Cc: Amaranath Somalapuram 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/drm_sysfs.c | 31 +++
  include/drm/drm_sysfs.h | 10 ++
  2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 430e00b16eec..840994810910 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -409,6 +409,37 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
  }
  EXPORT_SYMBOL(drm_sysfs_hotplug_event);

+/**
+ * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
+ * @dev: DRM device
+ * @reset_info: The contextual information about the reset (like PID, flags)
+ *
+ * Send a uevent for the DRM device specified by @dev. This informs
+ * user that a GPU reset has occurred, so that an interested client
+ * can take any recovery or profiling measure.
+ */
+void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event 
*reset_info)
+{
+   unsigned char pid_str[13];
+   unsigned char flags_str[15];
+   unsigned char pname_str[TASK_COMM_LEN + 6];
+   unsigned char reset_str[] = "RESET=1";
+   char *envp[] = { reset_str, pid_str, pname_str, flags_str, NULL };
+
+   if (!reset_info) {
+   DRM_WARN("No reset info, not sending the event\n");
+   return;
+   }
+
+   DRM_DEBUG("generating reset event\n");
+
+   snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%u", reset_info->pid);
+   snprintf(pname_str, ARRAY_SIZE(pname_str), "NAME=%s", 
reset_info->pname);
+   snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", 
reset_info->flags);
+   kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_sysfs_reset_event);
+
  /**
   * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
   * change
diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
index 6273cac44e47..5ba11c760619 100644
--- a/include/drm/drm_sysfs.h
+++ b/include/drm/drm_sysfs.h
@@ -1,16 +1,26 @@
  /* SPDX-License-Identifier: GPL-2.0 */
  #ifndef _DRM_SYSFS_H_
  #define _DRM_SYSFS_H_
+#include 
+
+#define DRM_GPU_RESET_FLAG_VRAM_INVALID (1 << 0)

  struct drm_device;
  struct device;
  struct drm_connector;
  struct drm_property;

+struct drm_reset_event {
+   uint32_t pid;


One side note, unrelated to devcoredump vs this..

AFAIU you probably want to be passing around a `struct pid *`, and
then somehow use pid_vnr() in the context of the process reading the
event to get the numeric pid.  Otherwise things will not do what you
expect if the process triggering the crash is in a different pid
namespace from the compositor.



I am not sure it is a good idea to add the pid extraction complexity 
here; it is left up to the driver to extract this information and pass 
it to the work queue. In the case of AMDGPU, it's extracted from the GPU VM. 
That would also be more flexible for the drivers.


- Shashank


BR,
-R


+   uint32_t flags;
+   char pname[TASK_COMM_LEN];
+};
+
  int drm_class_device_register(struct device *dev);
  void drm_class_device_unregister(struct device *dev);

  void drm_sysfs_hotplug_event(struct drm_device *dev);
+void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event 
*reset_info);
  void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
  void drm_sysfs_connector_status_event(struct drm_connector *connector,
   struct drm_property *property);
--
2.32.0



Re: [PATCH v3 1/3] drm: allow real encoder to be passed for drm_writeback_connector

2022-03-17 Thread Laurent Pinchart
Hi Abhinav,

Thank you for the patch.

On Wed, Mar 16, 2022 at 11:48:16AM -0700, Abhinav Kumar wrote:
> For some vendor driver implementations, display hardware can
> be shared between the encoder used for writeback and the physical
> display.
> 
> In addition resources such as clocks and interrupts can
> also be shared between writeback and the real encoder.
> 
> To accommodate such vendor drivers and hardware, allow
> real encoder to be passed for drm_writeback_connector using a new
> drm_writeback_connector_init_with_encoder() API.

The commit message doesn't match the commit.

> In addition, to preserve the same call flows for the existing
> users of drm_writeback_connector_init(), also allow passing
> possible_crtcs as a parameter so that encoder can be initialized
> with it.
> 
> changes in v3:
>   - allow passing possible_crtcs for existing users of
> drm_writeback_connector_init()
>   - squash the vendor changes into the same commit so
> that each patch in the series can compile individually
> 
> Co-developed-by: Kandpal Suraj 
> Signed-off-by: Abhinav Kumar 
> ---
>  .../drm/arm/display/komeda/komeda_wb_connector.c   |   3 +-
>  drivers/gpu/drm/arm/malidp_mw.c|   5 +-
>  drivers/gpu/drm/drm_writeback.c| 103 
> +
>  drivers/gpu/drm/rcar-du/rcar_du_writeback.c|   5 +-
>  drivers/gpu/drm/vc4/vc4_txp.c  |  19 ++--
>  drivers/gpu/drm/vkms/vkms_writeback.c  |   3 +-
>  include/drm/drm_writeback.h|  22 -
>  7 files changed, 103 insertions(+), 57 deletions(-)
> 
> diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c 
> b/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c
> index e465cc4..40774e6 100644
> --- a/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c
> +++ b/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c
> @@ -155,7 +155,6 @@ static int komeda_wb_connector_add(struct komeda_kms_dev 
> *kms,
>   kwb_conn->wb_layer = kcrtc->master->wb_layer;
>  
>   wb_conn = &kwb_conn->base;
> - wb_conn->encoder.possible_crtcs = BIT(drm_crtc_index(&kcrtc->base));
>  
>   formats = komeda_get_layer_fourcc_list(&mdev->fmt_tbl,
>  kwb_conn->wb_layer->layer_type,
> @@ -164,7 +163,7 @@ static int komeda_wb_connector_add(struct komeda_kms_dev 
> *kms,
>   err = drm_writeback_connector_init(&kms->base, wb_conn,
>  &komeda_wb_connector_funcs,
>  &komeda_wb_encoder_helper_funcs,
> -formats, n_formats);
> +formats, n_formats, 
> BIT(drm_crtc_index(&kcrtc->base)));
>   komeda_put_fourcc_list(formats);
>   if (err) {
>   kfree(kwb_conn);
> diff --git a/drivers/gpu/drm/arm/malidp_mw.c b/drivers/gpu/drm/arm/malidp_mw.c
> index f5847a7..b882066 100644
> --- a/drivers/gpu/drm/arm/malidp_mw.c
> +++ b/drivers/gpu/drm/arm/malidp_mw.c
> @@ -208,11 +208,12 @@ int malidp_mw_connector_init(struct drm_device *drm)
>   struct malidp_drm *malidp = drm->dev_private;
>   u32 *formats;
>   int ret, n_formats;
> + uint32_t possible_crtcs;
>  
>   if (!malidp->dev->hw->enable_memwrite)
>   return 0;
>  
> - malidp->mw_connector.encoder.possible_crtcs = 1 << 
> drm_crtc_index(&malidp->crtc);
> + possible_crtcs = 1 << drm_crtc_index(&malidp->crtc);
>   drm_connector_helper_add(&malidp->mw_connector.base,
>&malidp_mw_connector_helper_funcs);
>  
> @@ -223,7 +224,7 @@ int malidp_mw_connector_init(struct drm_device *drm)
>   ret = drm_writeback_connector_init(drm, &malidp->mw_connector,
>  &malidp_mw_connector_funcs,
>  &malidp_mw_encoder_helper_funcs,
> -formats, n_formats);
> +formats, n_formats, possible_crtcs);

Do you need the local variable ?

>   kfree(formats);
>   if (ret)
>   return ret;
> diff --git a/drivers/gpu/drm/drm_writeback.c b/drivers/gpu/drm/drm_writeback.c
> index dccf4504..17c1471 100644
> --- a/drivers/gpu/drm/drm_writeback.c
> +++ b/drivers/gpu/drm/drm_writeback.c
> @@ -149,36 +149,15 @@ static const struct drm_encoder_funcs 
> drm_writeback_encoder_funcs = {
>   .destroy = drm_encoder_cleanup,
>  };
>  
> -/**
> - * drm_writeback_connector_init - Initialize a writeback connector and its 
> properties
> - * @dev: DRM device
> - * @wb_connector: Writeback connector to initialize
> - * @con_funcs: Connector funcs vtable
> - * @enc_helper_funcs: Encoder helper funcs vtable to be used by the internal 
> encoder
> - * @formats: Array of supported pixel formats for the writeback engine
> - * @n_formats: Length of the formats array
> - *
> - * Th

[PATCH 1/3] dt-bindings: display: bridge: it66121: Add audio support

2022-03-17 Thread Nicolas Belin
Update the ITE bridge HDMI it66121 bindings in order to
support audio.

Signed-off-by: Nicolas Belin 
---
 .../devicetree/bindings/display/bridge/ite,it66121.yaml| 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml b/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
index 6ec1d5fbb8bc..c6e81f532215 100644
--- a/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
@@ -38,6 +38,9 @@ properties:
   interrupts:
 maxItems: 1
 
+  "#sound-dai-cells":
+const: 0
+
   ports:
 $ref: /schemas/graph.yaml#/properties/ports
 
-- 
2.25.1



[PATCH 0/3] drm: bridge: it66121: Add audio support

2022-03-17 Thread Nicolas Belin
This patch series adds audio support to the it66121 HDMI bridge.

Patch 1 updates the ITE 66121 HDMI bridge bindings in order to support
audio.

Patch 2 sets the register page length or window length of the ITE 66121
HDMI bridge to 0x100 according to the documentation.

Patch 3 contains the actual driver modifications in order to add the
audio support on the ITE 66121 HDMI bridge.

Nicolas Belin (3):
  dt-bindings: display: bridge: it66121: Add audio support
  drm: bridge: it66121: Fix the register page length
  drm: bridge: it66121: Add audio support

 .../bindings/display/bridge/ite,it66121.yaml  |   3 +
 drivers/gpu/drm/bridge/ite-it66121.c  | 629 +-
 2 files changed, 631 insertions(+), 1 deletion(-)

-- 
2.25.1



[PATCH 2/3] drm: bridge: it66121: Fix the register page length

2022-03-17 Thread Nicolas Belin
Set the register page length or window length to
0x100 according to the documentation.

Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver")
Signed-off-by: Nicolas Belin 
---
 drivers/gpu/drm/bridge/ite-it66121.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c
index 06b59b422c69..64912b770086 100644
--- a/drivers/gpu/drm/bridge/ite-it66121.c
+++ b/drivers/gpu/drm/bridge/ite-it66121.c
@@ -227,7 +227,7 @@ static const struct regmap_range_cfg it66121_regmap_banks[] 
= {
.selector_mask = 0x1,
.selector_shift = 0,
.window_start = 0x00,
-   .window_len = 0x130,
+   .window_len = 0x100,
},
 };
 
-- 
2.25.1
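
For illustration only, here is why the window length matters. The sketch
assumes that a paged register map splits a virtual register address into a
page number and an in-window offset by dividing by the window length;
struct paged_reg and split_reg() are hypothetical names, not regmap API.

```c
#include <assert.h>

/* Result of translating a virtual register address. */
struct paged_reg {
	unsigned int page;	/* value for the bank selector */
	unsigned int offset;	/* address inside the window */
};

static struct paged_reg split_reg(unsigned int reg, unsigned int range_min,
				  unsigned int window_start,
				  unsigned int window_len)
{
	struct paged_reg p;

	assert(window_len > 0 && reg >= range_min);

	p.page = (reg - range_min) / window_len;
	p.offset = window_start + (reg - range_min) % window_len;

	return p;
}
```

Under that assumption, a register at virtual address 0x130 resolves to
page 1, offset 0x30 with a window length of 0x100, but to page 1, offset
0x00 with the old value of 0x130.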



[PATCH 3/3] drm: bridge: it66121: Add audio support

2022-03-17 Thread Nicolas Belin
Add audio support to the HDMI bridge, for I2S only.

Signed-off-by: Nicolas Belin 
Signed-off-by: Andy.Hsieh 
---
 drivers/gpu/drm/bridge/ite-it66121.c | 627 +++
 1 file changed, 627 insertions(+)

diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c
index 64912b770086..514989676d07 100644
--- a/drivers/gpu/drm/bridge/ite-it66121.c
+++ b/drivers/gpu/drm/bridge/ite-it66121.c
@@ -27,6 +27,8 @@
 #include 
 #include 
 
+#include 
+
 #define IT66121_VENDOR_ID0_REG 0x00
 #define IT66121_VENDOR_ID1_REG 0x01
 #define IT66121_DEVICE_ID0_REG 0x02
@@ -155,6 +157,9 @@
 #define IT66121_AV_MUTE_ON BIT(0)
 #define IT66121_AV_MUTE_BLUESCRBIT(1)
 
+#define IT66121_PKT_CTS_CTRL_REG   0xC5
+#define IT66121_PKT_CTS_CTRL_SEL   BIT(1)
+
 #define IT66121_PKT_GEN_CTRL_REG   0xC6
 #define IT66121_PKT_GEN_CTRL_ONBIT(0)
 #define IT66121_PKT_GEN_CTRL_RPT   BIT(1)
@@ -202,6 +207,89 @@
 #define IT66121_EDID_SLEEP_US  2
 #define IT66121_EDID_TIMEOUT_US20
 #define IT66121_EDID_FIFO_SIZE 32
+
+#define IT66121_CLK_CTRL0_REG  0x58
+#define IT66121_CLK_CTRL0_AUTO_OVER_SAMPLING   BIT(4)
+#define IT66121_CLK_CTRL0_EXT_MCLK_MASKGENMASK(3, 2)
+#define IT66121_CLK_CTRL0_EXT_MCLK_128FS   (0 << 2)
+#define IT66121_CLK_CTRL0_EXT_MCLK_256FS   BIT(2)
+#define IT66121_CLK_CTRL0_EXT_MCLK_512FS   (2 << 2)
+#define IT66121_CLK_CTRL0_EXT_MCLK_1024FS  (3 << 2)
+#define IT66121_CLK_CTRL0_AUTO_IPCLK   BIT(0)
+#define IT66121_CLK_STATUS1_REG0x5E
+#define IT66121_CLK_STATUS2_REG0x5F
+
+#define IT66121_AUD_CTRL0_REG  0xE0
+#define IT66121_AUD_SWL(3 << 6)
+#define IT66121_AUD_16BIT  (0 << 6)
+#define IT66121_AUD_18BIT  BIT(6)
+#define IT66121_AUD_20BIT  (2 << 6)
+#define IT66121_AUD_24BIT  (3 << 6)
+#define IT66121_AUD_SPDIFTCBIT(5)
+#define IT66121_AUD_SPDIF  BIT(4)
+#define IT66121_AUD_I2S(0 << 4)
+#define IT66121_AUD_EN_I2S3BIT(3)
+#define IT66121_AUD_EN_I2S2BIT(2)
+#define IT66121_AUD_EN_I2S1BIT(1)
+#define IT66121_AUD_EN_I2S0BIT(0)
+#define IT66121_AUD_CTRL0_AUD_SEL  BIT(4)
+
+#define IT66121_AUD_CTRL1_REG  0xE1
+#define IT66121_AUD_FIFOMAP_REG0xE2
+#define IT66121_AUD_CTRL3_REG  0xE3
+#define IT66121_AUD_SRCVALID_FLAT_REG  0xE4
+#define IT66121_AUD_FLAT_SRC0  BIT(4)
+#define IT66121_AUD_FLAT_SRC1  BIT(5)
+#define IT66121_AUD_FLAT_SRC2  BIT(6)
+#define IT66121_AUD_FLAT_SRC3  BIT(7)
+#define IT66121_AUD_HDAUDIO_REG0xE5
+
+#define IT66121_AUD_PKT_CTS0_REG   0x130
+#define IT66121_AUD_PKT_CTS1_REG   0x131
+#define IT66121_AUD_PKT_CTS2_REG   0x132
+#define IT66121_AUD_PKT_N0_REG 0x133
+#define IT66121_AUD_PKT_N1_REG 0x134
+#define IT66121_AUD_PKT_N2_REG 0x135
+
+#define IT66121_AUD_CHST_MODE_REG  0x191
+#define IT66121_AUD_CHST_CAT_REG   0x192
+#define IT66121_AUD_CHST_SRCNUM_REG  0x193
+#define IT66121_AUD_CHST_CHTNUM_REG  0x194
+#define IT66121_AUD_CHST_CA_FS_REG 0x198
+#define IT66121_AUD_CHST_OFS_WL_REG  0x199
+
+#define IT66121_AUD_PKT_CTS_CNT0_REG   0x1A0
+#define IT66121_AUD_PKT_CTS_CNT1_REG   0x1A1
+#define IT66121_AUD_PKT_CTS_CNT2_REG   0x1A2
+
+#define IT66121_AUD_FS_22P05K  0x4
+#define IT66121_AUD_FS_44P1K   0x0
+#define IT66121_AUD_FS_88P2K   0x8
+#define IT66121_AUD_FS_176P4K  0xC
+#define IT66121_AUD_FS_24K 0x6
+#define IT66121_AUD_FS_48K 0x2
+#define IT66121_AUD_FS_96K 0xA
+#define IT66121_AUD_FS_192K  0xE
+#define IT66121_AUD_FS_768K  0x9
+#define IT66121_AUD_FS_32K 0x3
+#define IT66121_AUD_FS_OTHER   0x1
+
+#define IT66121_AUD_SWL_21BIT  0xD
+#define IT66121_AUD_SWL_24BIT  0xB
+#define IT66121_AUD_SWL_23BIT  0x9
+#define IT66121_AUD_SWL_22BIT  0x5
+#define IT66121_AUD_SWL_20BIT  0x3
+#define IT66121_AUD_SWL_17BIT  0xC
+#define IT66121_AUD_SWL_19BIT  0x8
+#define IT66121_AUD_SWL_18BIT  0x4
+#define IT66121_AUD_SWL_16BIT   

Re: [PATCH,v2] drm/panel: Fix return value check in nt35950_probe()

2022-03-17 Thread AngeloGioacchino Del Regno

On 17/03/22 09:37, Lu Wei wrote:

In function nt35950_probe(), mipi_dsi_device_register_full() is called
to create a MIPI DSI device. If it fails, a pointer encoded with an error
will be returned, so use IS_ERR() to check the return value. Besides, use
PTR_ERR to return the actual errno.

Fixes: 623a3531e9cf ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels")
Signed-off-by: Lu Wei 


Reviewed-by: AngeloGioacchino Del Regno 


Thanks!


---
  drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-novatek-nt35950.c b/drivers/gpu/drm/panel/panel-novatek-nt35950.c
index 288c7fa83ecc..d252e5e56228 100644
--- a/drivers/gpu/drm/panel/panel-novatek-nt35950.c
+++ b/drivers/gpu/drm/panel/panel-novatek-nt35950.c
@@ -579,9 +579,9 @@ static int nt35950_probe(struct mipi_dsi_device *dsi)
 		}
 
 		nt->dsi[1] = mipi_dsi_device_register_full(dsi_r_host, info);
-		if (!nt->dsi[1]) {
+		if (IS_ERR(nt->dsi[1])) {
 			dev_err(dev, "Cannot get secondary DSI node\n");
-			return -ENODEV;
+			return PTR_ERR(nt->dsi[1]);
 		}
 		num_dsis++;
 	}
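
To make the difference between the two checks concrete, here is a minimal userspace sketch. It assumes stand-in definitions of ERR_PTR(), IS_ERR() and PTR_ERR() modelled on the kernel helpers, and a hypothetical register_device() that, like mipi_dsi_device_register_full(), reports failure with an encoded pointer rather than NULL. None of these names are taken from the driver itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Only small negative errno values can be encoded in a pointer. */
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical registration helper: never returns NULL, only ERR_PTR(). */
static int fake_device;

static void *register_device(int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : (void *)&fake_device;
}

/* Old check: a NULL test never fires for an ERR_PTR() return value. */
static int probe_null_check(int fail)
{
	void *dev = register_device(fail);

	if (!dev)
		return -ENODEV;
	return 0; /* a failed registration slips through as success */
}

/* Fixed check: detect the encoded error and propagate the real errno. */
static int probe_is_err_check(int fail)
{
	void *dev = register_device(fail);

	if (IS_ERR(dev))
		return (int)PTR_ERR(dev);
	return 0;
}
```

With a failing registration, probe_null_check() reports success while probe_is_err_check() returns the original -ENOMEM, which is the behaviour the patch is after.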




Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Christian König

On 17.03.22 at 09:42, Sharma, Shashank wrote:

On 3/16/2022 10:50 PM, Rob Clark wrote:

On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma
 wrote:


From: Shashank Sharma 

This patch adds a new sysfs event, which will indicate
the userland about a GPU reset, and can also provide
some information like:
- process ID of the process involved with the GPU reset
- process name of the involved process
- the GPU status info (using flags)

This patch also introduces the first flag of the flags
bitmap, which can be appended as and when required.

V2: Addressed review comments from Christian and Amar
    - move the reset information structure to DRM layer
    - drop _ctx from struct name
    - make pid 32 bit(than 64)
    - set flag when VRAM invalid (than valid)
    - add process name as well (Amar)

Cc: Alexandar Deucher 
Cc: Christian Koenig 
Cc: Amaranath Somalapuram 
Signed-off-by: Shashank Sharma 
---
  drivers/gpu/drm/drm_sysfs.c | 31 +++
  include/drm/drm_sysfs.h | 10 ++
  2 files changed, 41 insertions(+)

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 430e00b16eec..840994810910 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -409,6 +409,37 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
  }
  EXPORT_SYMBOL(drm_sysfs_hotplug_event);

+/**
+ * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
+ * @dev: DRM device
+ * @reset_info: The contextual information about the reset (like PID, flags)
+ *
+ * Send a uevent for the DRM device specified by @dev. This informs
+ * user that a GPU reset has occurred, so that an interested client
+ * can take any recovery or profiling measure.
+ */
+void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info)
+{
+   unsigned char pid_str[13];
+   unsigned char flags_str[15];
+   unsigned char pname_str[TASK_COMM_LEN + 6];
+   unsigned char reset_str[] = "RESET=1";
+   char *envp[] = { reset_str, pid_str, pname_str, flags_str, NULL };
+
+   if (!reset_info) {
+       DRM_WARN("No reset info, not sending the event\n");
+       return;
+   }
+
+   DRM_DEBUG("generating reset event\n");
+
+   snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%u", reset_info->pid);
+   snprintf(pname_str, ARRAY_SIZE(pname_str), "NAME=%s", reset_info->pname);
+   snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", reset_info->flags);
+   kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
+}
+EXPORT_SYMBOL(drm_sysfs_reset_event);
+
  /**
   * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
   * change
diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
index 6273cac44e47..5ba11c760619 100644
--- a/include/drm/drm_sysfs.h
+++ b/include/drm/drm_sysfs.h
@@ -1,16 +1,26 @@
  /* SPDX-License-Identifier: GPL-2.0 */
  #ifndef _DRM_SYSFS_H_
  #define _DRM_SYSFS_H_
+#include 
+
+#define DRM_GPU_RESET_FLAG_VRAM_INVALID (1 << 0)

  struct drm_device;
  struct device;
  struct drm_connector;
  struct drm_property;

+struct drm_reset_event {
+   uint32_t pid;


One side note, unrelated to devcoredump vs this..

AFAIU you probably want to be passing around a `struct pid *`, and
then somehow use pid_vnr() in the context of the process reading the
event to get the numeric pid.  Otherwise things will not do what you
expect if the process triggering the crash is in a different pid
namespace from the compositor.



I am not sure it is a good idea to add the pid extraction complexity 
here; it is left up to the driver to extract this information and pass 
it to the work queue. In the case of AMDGPU, it's extracted from the 
GPU VM. That would also be more flexible for the drivers.


Yeah, but that is just used for debugging.

If we want to use the pid for housekeeping, like for a daemon which 
kills/restarts processes, we absolutely need that; otherwise it won't 
work with containers.


Regards,
Christian.



- Shashank


BR,
-R


+   uint32_t flags;
+   char pname[TASK_COMM_LEN];
+};
+
  int drm_class_device_register(struct device *dev);
  void drm_class_device_unregister(struct device *dev);

  void drm_sysfs_hotplug_event(struct drm_device *dev);
+void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info);
  void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
  void drm_sysfs_connector_status_event(struct drm_connector *connector,
   struct drm_property *property);
--
2.32.0
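
As a side note on the quoted drm_sysfs_reset_event(): it formats each uevent variable with snprintf() into a fixed-size buffer. The following userspace sketch shows one way to size such buffers for the worst case and to detect truncation. The struct reset_env type, the U32_ENV_LEN() macro and format_reset_env() are hypothetical names introduced here for illustration; they are not part of the patch. A 32-bit value can need ten decimal digits, so "PID=" plus digits plus the terminating NUL takes up to 15 bytes, whereas the 13-byte pid_str in the patch holds at most eight digits (real PIDs are small enough in practice, and snprintf() truncates safely either way).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16 /* same value as the kernel's; redefined for userspace */

/* Worst case for "KEY=%u": key, '=', 10 digits of a 32-bit value, NUL. */
#define U32_ENV_LEN(key) (sizeof(key) - 1 + 1 + 10 + 1)

/* Hypothetical container for the three formatted environment strings. */
struct reset_env {
	char pid[U32_ENV_LEN("PID")];
	char name[sizeof("NAME=") - 1 + TASK_COMM_LEN];
	char flags[U32_ENV_LEN("FLAGS")];
};

/* Returns 0 on success, -1 if any field would have been truncated. */
static int format_reset_env(struct reset_env *env, uint32_t pid,
			    const char *pname, uint32_t flags)
{
	int n;

	assert(env && pname);

	n = snprintf(env->pid, sizeof(env->pid), "PID=%u", (unsigned int)pid);
	if (n < 0 || (size_t)n >= sizeof(env->pid))
		return -1;

	n = snprintf(env->name, sizeof(env->name), "NAME=%s", pname);
	if (n < 0 || (size_t)n >= sizeof(env->name))
		return -1;

	n = snprintf(env->flags, sizeof(env->flags), "FLAGS=%u",
		     (unsigned int)flags);
	if (n < 0 || (size_t)n >= sizeof(env->flags))
		return -1;

	return 0;
}
```

Checking the snprintf() return value against the buffer size is what distinguishes a complete string from a silently shortened one; the kernel function in the patch ignores that return value.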





Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Daniel Vetter
On Mon, Mar 14, 2022 at 10:23:27AM -0400, Alex Deucher wrote:
> On Fri, Mar 11, 2022 at 3:30 AM Pekka Paalanen  wrote:
> >
> > On Thu, 10 Mar 2022 11:56:41 -0800
> > Rob Clark  wrote:
> >
> > > For something like just notifying a compositor that a gpu crash
> > > happened, perhaps drm_event is more suitable.  See
> > > virtio_gpu_fence_event_create() for an example of adding new event
> > > types.  Although maybe you want it to be an event which is not device
> > > specific.  This isn't so much of a debugging use-case as simply
> > > notification.
> >
> > Hi,
> >
> > for this particular use case, are we now talking about the display
> > device (KMS) crashing or the rendering device (OpenGL/Vulkan) crashing?
> >
> > If the former, I wasn't aware that display device crashes are a thing.
> > How should a userspace display server react to those?
> >
> > If the latter, don't we have EGL extensions or Vulkan API already to
> > deliver that?
> >
> > The above would be about device crashes that directly affect the
> > display server. Is that the use case in mind here, or is it instead
> > about notifying the display server that some application has caused a
> > driver/hardware crash? If the latter, how should a display server react
> > to that? Disconnect the application?
> >
> > Shashank, what is the actual use case you are developing this for?
> >
> > I've read all the emails here so far, and I don't recall seeing it
> > explained.
> >
> 
> The idea is that a support daemon or compositor would listen for GPU
> reset notifications and do something useful with them (kill the guilty
> app, restart the desktop environment, etc.).  Today when the GPU
> resets, most applications just continue assuming nothing is wrong,
> meanwhile the GPU has stopped accepting work until the apps re-init
> their context so all of their command submissions just get rejected.
> 
> > Btw. somewhat relatedly, there has been work aiming to allow
> > graceful hot-unplug of DRM devices. There is a kernel doc outlining how
> > the various APIs should react towards userspace when a DRM device
> > suddenly disappears. That seems to have some overlap here IMO.
> >
> > See 
> > https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#device-hot-unplug
> > which also has a couple pointers to EGL and Vulkan APIs.
> 
> The problem is most applications don't use the GL or VK robustness
> APIs.  You could use something like that in the compositor, but those
> APIs tend to be focused more on the application itself rather than the
> GPU in general.  E.g., Is my context lost.  Which is fine for
> restarting your context, but doesn't really help if you want to try
> and do something with another application (i.e., the likely guilty
> app).  Also, on dGPU at least, when you reset the GPU, vram is usually
> lost (either due to the memory controller being reset, or vram being
> zero'd on init due to ECC support), so even if you are not the guilty
> process, in that case you'd need to re-init your context anyway.

Isn't that what arb robustness and all that stuff is for? Doing that
through sysfs event sounds very wrong, since in general apps just don't
have access to that. Also vk equivalent is VK_ERROR_DEVICE_LOST. Iirc both
have information like whether the app was the guilty one causing the hang,
or whether it was just victimized because the gpu can't do anything else
than a full gpu reset which nukes everything (like amdgpu currently has,
aside from the thread unblock trick in the first attempt).

And if your app/compositor doesn't use robust contexts then the userspace
driver gets to do a best effort attempt at recovery, or exit(). Whatever
you can do really.

Also note that you don't actually want an event, but a query ioctl (plus
maybe a specific errno on your CS ioctl). Neither of the above flows
supports events for gpu resets. RESET_STATS ioctl is the i915
implementation of this stuff.

For the core dump aspect yes pls devcoredump and not reinvented wheels
(and i915 is a bad example here, but in defence the i915 sysfs hang event
predates devcoredump).

Cheers, Daniel


> 
> Alex
> 
> >
> >
> > Thanks,
> > pq

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 1/4] drm/gma500: Remove unused declarations and other cruft

2022-03-17 Thread Patrik Jakobsson
Most of these are old leftovers from one of the driver merges. This is
all dead code.

Signed-off-by: Patrik Jakobsson 
---
 drivers/gpu/drm/gma500/psb_drv.h | 75 +---
 1 file changed, 1 insertion(+), 74 deletions(-)

diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 553d03190ce1..66f61909a8c8 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -36,12 +36,6 @@
 /* Append new drm mode definition here, align with libdrm definition */
 #define DRM_MODE_SCALE_NO_SCALE  2
 
-enum {
-   CHIP_PSB_8108 = 0,  /* Poulsbo */
-   CHIP_PSB_8109 = 1,  /* Poulsbo */
-   CHIP_MRST_4100 = 2, /* Moorestown/Oaktrail */
-};
-
 #define IS_PSB(drm) ((to_pci_dev((drm)->dev)->device & 0xfffe) == 0x8108)
 #define IS_MRST(drm) ((to_pci_dev((drm)->dev)->device & 0xfff0) == 0x4100)
 #define IS_CDV(drm) ((to_pci_dev((drm)->dev)->device & 0xfff0) == 0x0be0)
@@ -617,15 +611,7 @@ struct psb_ops {
int i2c_bus;/* I2C bus identifier for Moorestown */
 };
 
-
-
-extern int drm_crtc_probe_output_modes(struct drm_device *dev, int, int);
-extern int drm_pick_crtcs(struct drm_device *dev);
-
 /* psb_irq.c */
-extern void psb_irq_uninstall_islands(struct drm_device *dev, int hw_islands);
-extern int psb_vblank_wait2(struct drm_device *dev, unsigned int *sequence);
-extern int psb_vblank_wait(struct drm_device *dev, unsigned int *sequence);
 extern int psb_enable_vblank(struct drm_crtc *crtc);
 extern void psb_disable_vblank(struct drm_crtc *crtc);
 void
@@ -636,17 +622,9 @@ psb_disable_pipestat(struct drm_psb_private *dev_priv, int pipe, u32 mask);
 
 extern u32 psb_get_vblank_counter(struct drm_crtc *crtc);
 
-/* framebuffer.c */
-extern int psbfb_probed(struct drm_device *dev);
-extern int psbfb_remove(struct drm_device *dev,
-   struct drm_framebuffer *fb);
-/* psb_drv.c */
-extern void psb_spank(struct drm_psb_private *dev_priv);
-
-/* psb_reset.c */
+/* psb_lid.c */
 extern void psb_lid_timer_init(struct drm_psb_private *dev_priv);
 extern void psb_lid_timer_takedown(struct drm_psb_private *dev_priv);
-extern void psb_print_pagefault(struct drm_psb_private *dev_priv);
 
 /* modesetting */
 extern void psb_modeset_init(struct drm_device *dev);
@@ -689,43 +667,7 @@ extern const struct psb_ops oaktrail_chip_ops;
 /* cdv_device.c */
 extern const struct psb_ops cdv_chip_ops;
 
-/* Debug print bits setting */
-#define PSB_D_GENERAL (1 << 0)
-#define PSB_D_INIT  (1 << 1)
-#define PSB_D_IRQ (1 << 2)
-#define PSB_D_ENTRY   (1 << 3)
-/* debug the get H/V BP/FP count */
-#define PSB_D_HV  (1 << 4)
-#define PSB_D_DBI_BF  (1 << 5)
-#define PSB_D_PM  (1 << 6)
-#define PSB_D_RENDER  (1 << 7)
-#define PSB_D_REG (1 << 8)
-#define PSB_D_MSVDX   (1 << 9)
-#define PSB_D_TOPAZ   (1 << 10)
-
-extern int drm_idle_check_interval;
-
 /* Utilities */
-static inline u32 MRST_MSG_READ32(int domain, uint port, uint offset)
-{
-   int mcr = (0xD0<<24) | (port << 16) | (offset << 8);
-   uint32_t ret_val = 0;
-   struct pci_dev *pci_root = pci_get_domain_bus_and_slot(domain, 0, 0);
-   pci_write_config_dword(pci_root, 0xD0, mcr);
-   pci_read_config_dword(pci_root, 0xD4, &ret_val);
-   pci_dev_put(pci_root);
-   return ret_val;
-}
-static inline void MRST_MSG_WRITE32(int domain, uint port, uint offset,
-   u32 value)
-{
-   int mcr = (0xE0<<24) | (port << 16) | (offset << 8) | 0xF0;
-   struct pci_dev *pci_root = pci_get_domain_bus_and_slot(domain, 0, 0);
-   pci_write_config_dword(pci_root, 0xD4, value);
-   pci_write_config_dword(pci_root, 0xD0, mcr);
-   pci_dev_put(pci_root);
-}
-
 static inline uint32_t REGISTER_READ(struct drm_device *dev, uint32_t reg)
 {
struct drm_psb_private *dev_priv = to_drm_psb_private(dev);
@@ -806,24 +748,9 @@ static inline void REGISTER_WRITE8(struct drm_device *dev,
 #define PSB_WVDC32(_val, _offs)  iowrite32(_val, dev_priv->vdc_reg + (_offs))
 #define PSB_RVDC32(_offs)  ioread32(dev_priv->vdc_reg + (_offs))
 
-/* #define TRAP_SGX_PM_FAULT 1 */
-#ifdef TRAP_SGX_PM_FAULT
-#define PSB_RSGX32(_offs)  \
-({ \
-   if (inl(dev_priv->apm_base + PSB_APM_STS) & 0x3) {  \
-   pr_err("access sgx when it's off!! (READ) %s, %d\n",\
-  __FILE__, __LINE__); \
-   melay(1000);\
-   }   \
-   ioread32(dev_priv->sgx_reg + (_offs));  \
-})
-#else
 #define PSB_RSGX32(_offs)  ioread32(dev_priv->sgx_reg + (_offs))
-#endif
 #define PSB_WSGX32(_val, _offs)iowrite32(_val, 
dev_p

[PATCH 2/4] drm/gma500: Move gma_intel_crtc_funcs into gma_display.c

2022-03-17 Thread Patrik Jakobsson
All functions live in gma_display.c already so move the vtable. Also
shorten the name to gma_crtc_funcs.

Signed-off-by: Patrik Jakobsson 
---
 drivers/gpu/drm/gma500/cdv_device.c|  2 +-
 drivers/gpu/drm/gma500/gma_display.c   | 12 
 drivers/gpu/drm/gma500/gma_display.h   | 10 ++
 drivers/gpu/drm/gma500/oaktrail_device.c   |  2 +-
 drivers/gpu/drm/gma500/psb_device.c|  2 +-
 drivers/gpu/drm/gma500/psb_drv.h   |  2 --
 drivers/gpu/drm/gma500/psb_intel_display.c | 12 
 7 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c
index d7c6cca23e94..887c157d75f4 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -603,7 +603,7 @@ const struct psb_ops cdv_chip_ops = {
.errata = cdv_errata,
 
.crtc_helper = &cdv_intel_helper_funcs,
-   .crtc_funcs = &gma_intel_crtc_funcs,
+   .crtc_funcs = &gma_crtc_funcs,
.clock_funcs = &cdv_clock_funcs,
 
.output_init = cdv_output_init,
diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c
index dd801404cf99..931ffb192fc4 100644
--- a/drivers/gpu/drm/gma500/gma_display.c
+++ b/drivers/gpu/drm/gma500/gma_display.c
@@ -565,6 +565,18 @@ int gma_crtc_set_config(struct drm_mode_set *set,
return ret;
 }
 
+const struct drm_crtc_funcs gma_crtc_funcs = {
+   .cursor_set = gma_crtc_cursor_set,
+   .cursor_move = gma_crtc_cursor_move,
+   .gamma_set = gma_crtc_gamma_set,
+   .set_config = gma_crtc_set_config,
+   .destroy = gma_crtc_destroy,
+   .page_flip = gma_crtc_page_flip,
+   .enable_vblank = psb_enable_vblank,
+   .disable_vblank = psb_disable_vblank,
+   .get_vblank_counter = psb_get_vblank_counter,
+};
+
 /*
  * Save HW states of given crtc
  */
diff --git a/drivers/gpu/drm/gma500/gma_display.h b/drivers/gpu/drm/gma500/gma_display.h
index 7bd6c1ee8b21..113cf048105e 100644
--- a/drivers/gpu/drm/gma500/gma_display.h
+++ b/drivers/gpu/drm/gma500/gma_display.h
@@ -58,15 +58,7 @@ extern bool gma_pipe_has_type(struct drm_crtc *crtc, int type);
 extern void gma_wait_for_vblank(struct drm_device *dev);
 extern int gma_pipe_set_base(struct drm_crtc *crtc, int x, int y,
 struct drm_framebuffer *old_fb);
-extern int gma_crtc_cursor_set(struct drm_crtc *crtc,
-  struct drm_file *file_priv,
-  uint32_t handle,
-  uint32_t width, uint32_t height);
-extern int gma_crtc_cursor_move(struct drm_crtc *crtc, int x, int y);
 extern void gma_crtc_load_lut(struct drm_crtc *crtc);
-extern int gma_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
- u16 *blue, u32 size,
- struct drm_modeset_acquire_ctx *ctx);
 extern void gma_crtc_dpms(struct drm_crtc *crtc, int mode);
 extern void gma_crtc_prepare(struct drm_crtc *crtc);
 extern void gma_crtc_commit(struct drm_crtc *crtc);
@@ -83,6 +75,8 @@ extern int gma_crtc_set_config(struct drm_mode_set *set,
 extern void gma_crtc_save(struct drm_crtc *crtc);
 extern void gma_crtc_restore(struct drm_crtc *crtc);
 
+extern const struct drm_crtc_funcs gma_crtc_funcs;
+
 extern void gma_encoder_prepare(struct drm_encoder *encoder);
 extern void gma_encoder_commit(struct drm_encoder *encoder);
 extern void gma_encoder_destroy(struct drm_encoder *encoder);
diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c
index 5c75eae630b5..40f1bc736125 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -545,7 +545,7 @@ const struct psb_ops oaktrail_chip_ops = {
.chip_setup = oaktrail_chip_setup,
.chip_teardown = oaktrail_teardown,
.crtc_helper = &oaktrail_helper_funcs,
-   .crtc_funcs = &gma_intel_crtc_funcs,
+   .crtc_funcs = &gma_crtc_funcs,
 
.output_init = oaktrail_output_init,
 
diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c
index 3030f18ba022..e93e4191c0ca 100644
--- a/drivers/gpu/drm/gma500/psb_device.c
+++ b/drivers/gpu/drm/gma500/psb_device.c
@@ -329,7 +329,7 @@ const struct psb_ops psb_chip_ops = {
.chip_teardown = psb_chip_teardown,
 
.crtc_helper = &psb_intel_helper_funcs,
-   .crtc_funcs = &gma_intel_crtc_funcs,
+   .crtc_funcs = &gma_crtc_funcs,
.clock_funcs = &psb_clock_funcs,
 
.output_init = psb_output_init,
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 66f61909a8c8..88f44dbbc4eb 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -13,7 +13,6 @@
 
 #include 
 
-#include "gma_display.h"
 #include "gtt.h"
 #include "intel_bios.h"
 #include "mmu.h"
@@ -647,7 +646,6 @@ extern void oaktrail_lvds_init(st

[PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code

2022-03-17 Thread Patrik Jakobsson
Use the gma_ prefix instead of psb_ since the code is common for all
chips. Various coding style fixes. Removal of unused code. Removal of
duplicate function declarations.

Signed-off-by: Patrik Jakobsson 
---
 drivers/gpu/drm/gma500/gma_display.c |  8 +--
 drivers/gpu/drm/gma500/opregion.c|  5 +-
 drivers/gpu/drm/gma500/power.c   | 10 +--
 drivers/gpu/drm/gma500/psb_drv.c |  2 +-
 drivers/gpu/drm/gma500/psb_drv.h | 11 
 drivers/gpu/drm/gma500/psb_irq.c | 94 +++-
 drivers/gpu/drm/gma500/psb_irq.h | 19 +++---
 7 files changed, 57 insertions(+), 92 deletions(-)

diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c
index 931ffb192fc4..1d7964c339f4 100644
--- a/drivers/gpu/drm/gma500/gma_display.c
+++ b/drivers/gpu/drm/gma500/gma_display.c
@@ -17,7 +17,7 @@
 #include "framebuffer.h"
 #include "gem.h"
 #include "gma_display.h"
-#include "psb_drv.h"
+#include "psb_irq.h"
 #include "psb_intel_drv.h"
 #include "psb_intel_reg.h"
 
@@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = {
.set_config = gma_crtc_set_config,
.destroy = gma_crtc_destroy,
.page_flip = gma_crtc_page_flip,
-   .enable_vblank = psb_enable_vblank,
-   .disable_vblank = psb_disable_vblank,
-   .get_vblank_counter = psb_get_vblank_counter,
+   .enable_vblank = gma_enable_vblank,
+   .disable_vblank = gma_disable_vblank,
+   .get_vblank_counter = gma_get_vblank_counter,
 };
 
 /*
diff --git a/drivers/gpu/drm/gma500/opregion.c b/drivers/gpu/drm/gma500/opregion.c
index fef04ff8c3a9..dc494df71a48 100644
--- a/drivers/gpu/drm/gma500/opregion.c
+++ b/drivers/gpu/drm/gma500/opregion.c
@@ -23,6 +23,7 @@
  */
 #include 
 #include "psb_drv.h"
+#include "psb_irq.h"
 #include "psb_intel_reg.h"
 
 #define PCI_ASLE 0xe4
@@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device *dev)
if (asle && system_opregion ) {
/* Don't do this on Medfield or other non PC like devices, they
   use the bit for something different altogether */
-   psb_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE);
-   psb_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE);
+   gma_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE);
+   gma_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE);
 
asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN
| ASLE_PFMB_EN;
diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c
index 6f917cfef65b..b91de6d36e41 100644
--- a/drivers/gpu/drm/gma500/power.c
+++ b/drivers/gpu/drm/gma500/power.c
@@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev)
dev_err(dev->dev, "GPU hardware busy, cannot suspend\n");
return -EBUSY;
}
-   psb_irq_uninstall(dev);
+   gma_irq_uninstall(dev);
gma_suspend_display(dev);
gma_suspend_pci(pdev);
}
@@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev)
mutex_lock(&power_mutex);
gma_resume_pci(pdev);
gma_resume_display(pdev);
-   psb_irq_preinstall(dev);
-   psb_irq_postinstall(dev);
+   gma_irq_preinstall(dev);
+   gma_irq_postinstall(dev);
mutex_unlock(&power_mutex);
return 0;
 }
@@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool force_on)
/* Ok power up needed */
ret = gma_resume_pci(pdev);
if (ret == 0) {
-   psb_irq_preinstall(dev);
-   psb_irq_postinstall(dev);
+   gma_irq_preinstall(dev);
+   gma_irq_postinstall(dev);
pm_runtime_get(dev->dev);
dev_priv->display_count++;
spin_unlock_irqrestore(&power_ctrl_lock, flags);
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index e30b58184156..82d51e9821ad 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -380,7 +380,7 @@ static int psb_driver_load(struct drm_device *dev, unsigned long flags)
PSB_WVDC32(0x, PSB_INT_MASK_R);
spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags);
 
-   psb_irq_install(dev, pdev->irq);
+   gma_irq_install(dev, pdev->irq);
 
dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */
 
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index aed167af13c5..0ddfec1a0851 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -609,17 +609,6 @@ struct psb_ops {
int i2c_bus;/* I2C bus identifier for Moorestown */
 };
 
-/* psb_irq.c */
-extern int psb_enable_vblank(struct drm_crtc *crtc);
-extern void psb_disable_vblank(struct drm_crtc *c

[PATCH 3/4] drm/gma500: Don't store crtc_funcs in psb_ops

2022-03-17 Thread Patrik Jakobsson
The drm_crtc_funcs are all generic and no chip specific functions are
necessary. We can therefore directly put gma_crtc_funcs into the
drm_crtc.

Signed-off-by: Patrik Jakobsson 
---
 drivers/gpu/drm/gma500/cdv_device.c| 1 -
 drivers/gpu/drm/gma500/oaktrail_device.c   | 1 -
 drivers/gpu/drm/gma500/psb_device.c| 1 -
 drivers/gpu/drm/gma500/psb_drv.h   | 1 -
 drivers/gpu/drm/gma500/psb_intel_display.c | 3 +--
 5 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c
index 887c157d75f4..f854f58bcbb3 100644
--- a/drivers/gpu/drm/gma500/cdv_device.c
+++ b/drivers/gpu/drm/gma500/cdv_device.c
@@ -603,7 +603,6 @@ const struct psb_ops cdv_chip_ops = {
.errata = cdv_errata,
 
.crtc_helper = &cdv_intel_helper_funcs,
-   .crtc_funcs = &gma_crtc_funcs,
.clock_funcs = &cdv_clock_funcs,
 
.output_init = cdv_output_init,
diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c
index 40f1bc736125..5923a9c89312 100644
--- a/drivers/gpu/drm/gma500/oaktrail_device.c
+++ b/drivers/gpu/drm/gma500/oaktrail_device.c
@@ -545,7 +545,6 @@ const struct psb_ops oaktrail_chip_ops = {
.chip_setup = oaktrail_chip_setup,
.chip_teardown = oaktrail_teardown,
.crtc_helper = &oaktrail_helper_funcs,
-   .crtc_funcs = &gma_crtc_funcs,
 
.output_init = oaktrail_output_init,
 
diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c
index e93e4191c0ca..59f325165667 100644
--- a/drivers/gpu/drm/gma500/psb_device.c
+++ b/drivers/gpu/drm/gma500/psb_device.c
@@ -329,7 +329,6 @@ const struct psb_ops psb_chip_ops = {
.chip_teardown = psb_chip_teardown,
 
.crtc_helper = &psb_intel_helper_funcs,
-   .crtc_funcs = &gma_crtc_funcs,
.clock_funcs = &psb_clock_funcs,
 
.output_init = psb_output_init,
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 88f44dbbc4eb..aed167af13c5 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -578,7 +578,6 @@ struct psb_ops {
 
/* Sub functions */
struct drm_crtc_helper_funcs const *crtc_helper;
-   struct drm_crtc_funcs const *crtc_funcs;
const struct gma_clock_funcs *clock_funcs;
 
/* Setup hooks */
diff --git a/drivers/gpu/drm/gma500/psb_intel_display.c b/drivers/gpu/drm/gma500/psb_intel_display.c
index 6df62fe7c1e0..a99859b5b13a 100644
--- a/drivers/gpu/drm/gma500/psb_intel_display.c
+++ b/drivers/gpu/drm/gma500/psb_intel_display.c
@@ -488,8 +488,7 @@ void psb_intel_crtc_init(struct drm_device *dev, int pipe,
return;
}
 
-   /* Set the CRTC operations from the chip specific data */
-   drm_crtc_init(dev, &gma_crtc->base, dev_priv->ops->crtc_funcs);
+   drm_crtc_init(dev, &gma_crtc->base, &gma_crtc_funcs);
 
/* Set the CRTC clock functions from chip specific data */
gma_crtc->clock_funcs = dev_priv->ops->clock_funcs;
-- 
2.35.1



Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Daniel Vetter
On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> On 16.03.22 at 16:36, Rob Clark wrote:
> > [SNIP]
> > just one point of clarification.. in the msm and i915 case it is
> > purely for debugging and telemetry (ie. sending crash logs back to
> > distro for analysis if user has crash reporting enabled).. it isn't
> > used for triggering any action like killing app or compositor.
> 
> By the way, how does msm do its memory management for the devcoredumps?

GFP_NORECLAIM all the way. It's purely best effort.

Note that the fancy new plan for i915 discrete gpu is to only support gpu
crash dumps on non-recoverable gpu contexts, i.e. those that do not
continue to the next batch when something bad happens. This is what vk
wants and also what iris now uses (we do context recovery in userspace in
all cases), and non-recoverable contexts greatly simplify the crash dump
gather: Only thing you need to gather is the register state from hw
(before you reset it), all the batchbuffer bo and indirect state bo (in
i915 you can mark which bo to capture in the CS ioctl) can be captured in
a worker later on. Which for non-recoverable context is no issue, since
subsequent batchbuffers won't trample over any of these things.

And that way you can record the crashdump (or at least the big pieces like
all the indirect state stuff) with GFP_KERNEL.
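
The two-phase capture described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct bo, struct crash_dump, capture_registers() and capture_bo() are hypothetical names, not i915 code, and malloc() stands in for a sleeping GFP_KERNEL allocation done from a worker.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NUM_REGS 4

/* Hypothetical buffer object that was marked for capture at submission. */
struct bo {
	const uint8_t *data;
	size_t size;
};

struct crash_dump {
	uint32_t regs[NUM_REGS]; /* preallocated: filled in the reset path */
	uint8_t *bo_copy;        /* allocated later, in the worker */
	size_t bo_size;
};

/* Phase 1: runs before the hardware reset and must not allocate memory. */
static void capture_registers(struct crash_dump *dump, const uint32_t *hw_regs)
{
	assert(dump && hw_regs);
	memcpy(dump->regs, hw_regs, sizeof(dump->regs));
	dump->bo_copy = NULL;
	dump->bo_size = 0;
}

/*
 * Phase 2: runs later from a worker. The context is non-recoverable, so no
 * later batch overwrites the buffer in the meantime and an allocation that
 * may sleep is acceptable here.
 */
static int capture_bo(struct crash_dump *dump, const struct bo *bo)
{
	assert(dump && bo);
	dump->bo_copy = malloc(bo->size);
	if (!dump->bo_copy)
		return -1;
	memcpy(dump->bo_copy, bo->data, bo->size);
	dump->bo_size = bo->size;
	return 0;
}
```

The point of the split is that only the small, fixed-size register snapshot has to be taken in the reset path; everything large is copied afterwards, when allocating is allowed.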

msm probably gets it wrong since embedded drivers have much less shrinker
and generally no mmu notifiers going on :-)

> I mean it is strictly forbidden to allocate any memory in the GPU reset
> path.
> 
> > I would however *strongly* recommend devcoredump support in other GPU
> > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > to debug and fix a couple obscure issues that I was not able to
> > reproduce by myself.
> 
> Yes, completely agree as well.

+1

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RESEND PATCH] drm/doc: Clarify what ioctls can be used on render nodes

2022-03-17 Thread Daniel Vetter
On Mon, Mar 07, 2022 at 08:32:36AM -0700, Jeffrey Hugo wrote:
> The documentation for render nodes indicates that only "PRIME-related"
> ioctls are valid on render nodes, but the documentation does not clarify
> what that means.  If the reader is not familiar with PRIME, they may
> believe this to be only the ioctls with "PRIME" in the name and not other
> ioctls such as the set of syncobj ioctls.  Clarify the situation for the
> reader by referencing where the reader will find a current list of valid
> ioctls.
> 
> Signed-off-by: Jeffrey Hugo 
> Acked-by: Pekka Paalanen 

Applied to drm-misc-next, thanks for the patch.
-Daniel

> ---
> 
> I was confused by this when reading the documentation.  Now that I have
> figured out what the documentation means, I would like to add a clarification
> for the next reader which would have helped me.
> 
>  Documentation/gpu/drm-uapi.rst | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 199afb5..ce47b42 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -148,7 +148,9 @@ clients together with the legacy drmAuth authentication procedure.
>  If a driver advertises render node support, DRM core will create a
>  separate render node called renderD. There will be one render node
>  per device. No ioctls except PRIME-related ioctls will be allowed on
> -this node. Especially GEM_OPEN will be explicitly prohibited. Render
> +this node. Especially GEM_OPEN will be explicitly prohibited. For a
> +complete list of driver-independent ioctls that can be used on render
> +nodes, see the ioctls marked DRM_RENDER_ALLOW in drm_ioctl.c  Render
>  nodes are designed to avoid the buffer-leaks, which occur if clients
>  guess the flink names or mmap offsets on the legacy interface.
>  Additionally to this basic interface, drivers must mark their
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
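
Editor's sketch of the rule the patch documents: an ioctl is usable on a render node only if its table entry carries the render-allow flag. This is a userspace model, not the kernel code. All sketch_ and SKETCH_ names and the flag values are hypothetical, and the three table entries are only examples drawn from the thread; the authoritative list is the set of entries marked DRM_RENDER_ALLOW in drm_ioctl.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits; the real ones are defined by the DRM core. */
#define SKETCH_AUTH         (1u << 0)
#define SKETCH_RENDER_ALLOW (1u << 1)

struct sketch_ioctl_desc {
	const char *name;
	unsigned int flags;
};

/* Illustrative entries only, chosen to match what the thread states. */
static const struct sketch_ioctl_desc sketch_ioctls[] = {
	{ "GEM_OPEN",           SKETCH_AUTH },         /* prohibited on render nodes */
	{ "PRIME_HANDLE_TO_FD", SKETCH_RENDER_ALLOW }, /* PRIME-related */
	{ "SYNCOBJ_CREATE",     SKETCH_RENDER_ALLOW }, /* no "PRIME" in the name */
};

/* Returns true if the named ioctl carries the render-allow flag. */
bool sketch_allowed_on_render_node(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_ioctls) / sizeof(sketch_ioctls[0]); i++) {
		if (strcmp(sketch_ioctls[i].name, name) == 0)
			return (sketch_ioctls[i].flags & SKETCH_RENDER_ALLOW) != 0;
	}
	return false; /* unknown ioctls are rejected in this model */
}
```

The point of the model is that the name of an ioctl says nothing about its availability; only the flag in the table does, which is why the patch points readers at the table.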


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Christian König

Am 17.03.22 um 10:29 schrieb Daniel Vetter:

On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:

Am 16.03.22 um 16:36 schrieb Rob Clark:

[SNIP]
just one point of clarification.. in the msm and i915 case it is
purely for debugging and telemetry (ie. sending crash logs back to
distro for analysis if user has crash reporting enabled).. it isn't
used for triggering any action like killing app or compositor.

By the way, how does msm do its memory management for the devcoredumps?

GFP_NORECLAIM all the way. It's purely best effort.


Ok, good to know that it's as simple as that.


Note that the fancy new plan for i915 discrete gpu is to only support gpu
crash dumps on non-recoverable gpu contexts, i.e. those that do not
continue to the next batch when something bad happens.



This is what vk wants


That's exactly what I've been telling an internal team for a couple of years 
now as well. Good to know that this is not totally crazy.



  and also what iris now uses (we do context recovery in userspace in
all cases), and non-recoverable contexts greatly simplify the crash dump
gather: Only thing you need to gather is the register state from hw
(before you reset it), all the batchbuffer bo and indirect state bo (in
i915 you can mark which bo to capture in the CS ioctl) can be captured in
a worker later on. Which for non-recoverable context is no issue, since
subsequent batchbuffers won't trample over any of these things.

And that way you can record the crashdump (or at least the big pieces like
all the indirect state stuff) with GFP_KERNEL.


Interesting idea, so basically we initially capture only the state that the 
reset would destroy, and grab a reference on the killed application to gather 
the rest before we clean it up.


Going to keep that in mind as well.

Thanks,
Christian.



msm probably gets it wrong since embedded drivers have much less shrinker
and generally no mmu notifiers going on :-)


I mean it is strictly forbidden to allocate any memory in the GPU reset
path.


I would however *strongly* recommend devcoredump support in other GPU
drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
to debug and fix a couple obscure issues that I was not able to
reproduce by myself.

Yes, completely agree as well.

+1

Cheers, Daniel




Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)

2022-03-17 Thread Daniel Vetter
On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote:
> 
> On 15/03/2022 07:28, Kasireddy, Vivek wrote:
> > Hi Tvrtko, Daniel,
> > 
> > > 
> > > On 11/03/2022 09:39, Daniel Vetter wrote:
> > > > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy 
> > > >  wrote:
> > > > > 
> > > > > On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
> > > > > more framebuffers/scanout buffers results in only one that is 
> > > > > mappable/
> > > > > fenceable. Therefore, pageflipping between these 2 FBs where only one
> > > > > is mappable/fenceable creates latencies large enough to miss alternate
> > > > > vblanks thereby producing less optimal framerate.
> > > > > 
> > > > > This mainly happens because when 
> > > > > i915_gem_object_pin_to_display_plane()
> > > > > is called to pin one of the FB objs, the associated vma is identified
> > > > > as misplaced and therefore i915_vma_unbind() is called which unbinds 
> > > > > and
> > > > > > evicts it. This misplaced vma gets subsequently pinned only when
> > > > > i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
> > > > > results in a latency of ~10ms and happens every other vblank/repaint 
> > > > > cycle.
> > > > > Therefore, to fix this issue, we try to see if there is space to map
> > > > > at-least two objects of a given size and return early if there isn't. 
> > > > > This
> > > > > would ensure that we do not try with PIN_MAPPABLE for any objects that
> > > > > > are too big to map thereby preventing unnecessary unbind.
> > > > > 
> > > > > Testcase:
> > > > > Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
> > > > > with a 8K@60 mode results in only ~40 FPS. Since upstream Weston 
> > > > > submits
> > > > > a frame ~7ms before the next vblank, the latencies seen between atomic
> > > > > commit and flip event are 7, 24 (7 + 16.66), 7, 24. suggesting 
> > > > > that
> > > > > it misses the vblank every other frame.
> > > > > 
> > > > > Here is the ftrace snippet that shows the source of the ~10ms latency:
> > > > > i915_gem_object_pin_to_display_plane() {
> > > > > 0.102 us   |i915_gem_object_set_cache_level();
> > > > >   i915_gem_object_ggtt_pin_ww() {
> > > > > 0.390 us   |  i915_vma_instance();
> > > > > 0.178 us   |  i915_vma_misplaced();
> > > > > i915_vma_unbind() {
> > > > > __i915_active_wait() {
> > > > > 0.082 us   |i915_active_acquire_if_busy();
> > > > > 0.475 us   |  }
> > > > > intel_runtime_pm_get() {
> > > > > 0.087 us   |intel_runtime_pm_acquire();
> > > > > 0.259 us   |  }
> > > > > __i915_active_wait() {
> > > > > 0.085 us   |i915_active_acquire_if_busy();
> > > > > 0.240 us   |  }
> > > > > __i915_vma_evict() {
> > > > >   ggtt_unbind_vma() {
> > > > > gen8_ggtt_clear_range() {
> > > > > 10507.255 us |}
> > > > > 10507.689 us |  }
> > > > > 10508.516 us |   }
> > > > > 
> > > > > v2: Instead of using bigjoiner checks, determine whether a scanout
> > > > >   buffer is too big by checking to see if it is possible to map
> > > > >   two of them into the ggtt.
> > > > > 
> > > > > v3 (Ville):
> > > > > - Count how many fb objects can be fit into the available holes
> > > > > instead of checking for a hole twice the object size.
> > > > > - Take alignment constraints into account.
> > > > > - Limit this large scanout buffer check to >= Gen 11 platforms.
> > > > > 
> > > > > v4:
> > > > > - Remove existing heuristic that checks just for size. (Ville)
> > > > > - Return early if we find space to map at-least two objects. (Tvrtko)
> > > > > - Slightly update the commit message.
> > > > > 
> > > > > v5: (Tvrtko)
> > > > > - Rename the function to indicate that the object may be too big to
> > > > > map into the aperture.
> > > > > - Account for guard pages while calculating the total size required
> > > > > for the object.
> > > > > - Do not subject all objects to the heuristic check and instead
> > > > > consider objects only of a certain size.
> > > > > - Do the hole walk using the rbtree.
> > > > > - Preserve the existing PIN_NONBLOCK logic.
> > > > > - Drop the PIN_MAPPABLE check while pinning the VMA.
> > > > > 
> > > > > v6: (Tvrtko)
> > > > > - Return 0 on success and the specific error code on failure to
> > > > > preserve the existing behavior.
> > > > > 
> > > > > v7: (Ville)
> > > > > - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and
> > > > > size < ggtt->mappable_end / 4 checks.
> > > > > - Drop the redundant check that is based on previous heuristic.
> > > > > 
> > > > > v8:
> > > > > - Make sure that we are holding the mutex associated with ggtt vm
> > > > > as we traverse the hole nodes.
> > > > > 
> > > > > v9: (Tvrtko)
> > > > > - Use mutex_lock_interruptible_nested() instead of mutex_lock().
> > > > > 

Re: [PATCH v1] drm/shmem-helper: Correct doc-comment of drm_gem_shmem_get_sg_table()

2022-03-17 Thread Daniel Vetter
On Tue, Mar 08, 2022 at 04:34:01PM +0300, Dmitry Osipenko wrote:
> drm_gem_shmem_get_sg_table() never returns NULL on error, but an ERR_PTR.
> Correct the doc comment which says that it returns NULL on error.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 8ad0e02991ca..37009418cd28 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -662,7 +662,7 @@ EXPORT_SYMBOL(drm_gem_shmem_print_info);
>   * drm_gem_shmem_get_pages_sgt() instead.
>   *
>   * Returns:
> - * A pointer to the scatter/gather table of pinned pages or NULL on failure.
> + * A pointer to the scatter/gather table of pinned pages or errno on failure.

Hm usually we write "negative errno" for these, since the error numbers
are defined as positive numbers. Care to respin?

Thanks a lot, Daniel

>   */
>  struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)
>  {
> -- 
> 2.35.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
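
Editor's sketch of why the doc comment matters: a caller that only tests for NULL will miss an error encoded in the pointer, and the encoded value is a negative errno. This is a userspace imitation of the kernel's ERR_PTR convention, not the kernel header itself; every sketch_ name is hypothetical, and sketch_get_table() merely stands in for a function with the same return contract.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes occupy the top SKETCH_MAX_ERRNO values of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno value from an error pointer. */
long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for error codes. */
bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Stand-in for a function that returns a valid pointer or an error pointer. */
void *sketch_get_table(bool fail)
{
	static int table;

	if (fail)
		return sketch_err_ptr(-ENOMEM);
	return &table;
}
```

Note that the failing call returns a non-NULL pointer, so a plain NULL check would treat it as success; the value recovered from it is negative, which is the "negative errno" wording asked for above.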


Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Daniel Vetter
On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote:
> From: Rob Clark 
> 
> In the system suspend path, we don't want to be racing with the
> scheduler kthreads pushing additional queued up jobs to the hw
> queue (ringbuffer).  So park them first.  While we are at it,
> move the wait for active jobs to complete into the new system-
> suspend path.
> 
> Signed-off-by: Rob Clark 
> ---
>  drivers/gpu/drm/msm/adreno/adreno_device.c | 68 --
>  1 file changed, 64 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
> index 8859834b51b8..0440a98988fc 100644
> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
>  static int adreno_runtime_suspend(struct device *dev)
>  {
>   struct msm_gpu *gpu = dev_to_gpu(dev);
> - int remaining;
> +
> + /*
> +  * We should be holding a runpm ref, which will prevent
> +  * runtime suspend.  In the system suspend path, we've
> +  * already waited for active jobs to complete.
> +  */
> + WARN_ON_ONCE(gpu->active_submits);
> +
> + return gpu->funcs->pm_suspend(gpu);
> +}
> +
> +static void suspend_scheduler(struct msm_gpu *gpu)
> +{
> + int i;
> +
> + /*
> +  * Shut down the scheduler before we force suspend, so that
> +  * suspend isn't racing with scheduler kthread feeding us
> +  * more work.
> +  *
> +  * Note, we just want to park the thread, and let any jobs
> +  * that are already on the hw queue complete normally, as
> +  * opposed to the drm_sched_stop() path used for handling
> +  * faulting/timed-out jobs.  We can't really cancel any jobs
> +  * already on the hw queue without racing with the GPU.
> +  */
> + for (i = 0; i < gpu->nr_rings; i++) {
> + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> + kthread_park(sched->thread);

Shouldn't we have some proper interfaces for this? Also I'm kinda
wondering how other drivers do this, feels like we should have a standard
way.

Finally not flushing out all in-flight requests sounds a bit like a bad
idea for system suspend/resume since that's also the hibernation path, and
that would mean your shrinker/page reclaim stops working. At least in full
generality. Which ain't good for hibernation.

Adding Christian and Andrey.
-Daniel

> + }
> +}
> +
> +static void resume_scheduler(struct msm_gpu *gpu)
> +{
> + int i;
> +
> + for (i = 0; i < gpu->nr_rings; i++) {
> + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> + kthread_unpark(sched->thread);
> + }
> +}
> +
> +static int adreno_system_suspend(struct device *dev)
> +{
> + struct msm_gpu *gpu = dev_to_gpu(dev);
> + int remaining, ret;
> +
> + suspend_scheduler(gpu);
>  
>   remaining = wait_event_timeout(gpu->retire_event,
>  active_submits(gpu) == 0,
>  msecs_to_jiffies(1000));
>   if (remaining == 0) {
>   dev_err(dev, "Timeout waiting for GPU to suspend\n");
> - return -EBUSY;
> + ret = -EBUSY;
> + goto out;
>   }
>  
> - return gpu->funcs->pm_suspend(gpu);
> + ret = pm_runtime_force_suspend(dev);
> +out:
> + if (ret)
> + resume_scheduler(gpu);
> +
> + return ret;
>  }
> +
> +static int adreno_system_resume(struct device *dev)
> +{
> + resume_scheduler(dev_to_gpu(dev));
> + return pm_runtime_force_resume(dev);
> +}
> +
>  #endif
>  
>  static const struct dev_pm_ops adreno_pm_ops = {
> - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
> + SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume)
>   SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL)
>  };
>  
> -- 
> 2.35.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
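
Editor's sketch of the control flow in the quoted adreno_system_suspend(): park every ring's scheduler, wait for in-flight jobs, and unpark again if anything fails. This is a userspace model under stated assumptions. The sketch_ names are hypothetical, the bool parameter stands in for the outcome of the wait_event_timeout() call, and the device power-down step is omitted. It models only the ordering and the unwind, not the flushing concern raised above.

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_MAX_RINGS 4

struct sketch_gpu {
	int nr_rings;
	bool parked[SKETCH_MAX_RINGS]; /* one scheduler thread per ring */
};

static void sketch_suspend_scheduler(struct sketch_gpu *gpu)
{
	int i;

	for (i = 0; i < gpu->nr_rings; i++)
		gpu->parked[i] = true;
}

static void sketch_resume_scheduler(struct sketch_gpu *gpu)
{
	int i;

	for (i = 0; i < gpu->nr_rings; i++)
		gpu->parked[i] = false;
}

/*
 * jobs_drained models whether the jobs already on the hw queue
 * completed before the timeout expired.
 */
int sketch_system_suspend(struct sketch_gpu *gpu, bool jobs_drained)
{
	int ret = 0;

	/* Stop new work from reaching the hardware first. */
	sketch_suspend_scheduler(gpu);

	if (!jobs_drained)
		ret = -EBUSY;

	/* On any failure, undo the parking so the device keeps working. */
	if (ret)
		sketch_resume_scheduler(gpu);

	return ret;
}

void sketch_system_resume(struct sketch_gpu *gpu)
{
	sketch_resume_scheduler(gpu);
}
```

The design choice visible here is that a failed suspend leaves the schedulers running, so a timeout does not strand queued jobs behind parked threads.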


Re: [PATCH 1/6] drm: allow real encoder to be passed for drm_writeback_connector

2022-03-17 Thread Daniel Vetter
On Fri, Mar 11, 2022 at 10:05:53AM +0200, Laurent Pinchart wrote:
> On Fri, Mar 11, 2022 at 10:46:13AM +0300, Dmitry Baryshkov wrote:
> > On Fri, 11 Mar 2022 at 04:50, Abhinav Kumar wrote:
> > >
> > > For some vendor driver implementations, display hardware can
> > > be shared between the encoder used for writeback and the physical
> > > display.
> > >
> > > In addition resources such as clocks and interrupts can
> > > also be shared between writeback and the real encoder.
> > >
> > > To accommodate such vendor drivers and hardware, allow
> > > real encoder to be passed for drm_writeback_connector.
> > >
> > > Co-developed-by: Kandpal Suraj 
> > > Signed-off-by: Abhinav Kumar 
> > > ---
> > >  drivers/gpu/drm/drm_writeback.c |  8 
> > >  include/drm/drm_writeback.h | 13 +++--
> > >  2 files changed, 15 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_writeback.c b/drivers/gpu/drm/drm_writeback.c
> > > index dccf4504..4dad687 100644
> > > --- a/drivers/gpu/drm/drm_writeback.c
> > > +++ b/drivers/gpu/drm/drm_writeback.c
> > > @@ -189,8 +189,8 @@ int drm_writeback_connector_init(struct drm_device *dev,
> > > if (IS_ERR(blob))
> > > return PTR_ERR(blob);
> > >
> > > -   drm_encoder_helper_add(&wb_connector->encoder, enc_helper_funcs);
> > > -   ret = drm_encoder_init(dev, &wb_connector->encoder,
> > > +   drm_encoder_helper_add(wb_connector->encoder, enc_helper_funcs);
> > > +   ret = drm_encoder_init(dev, wb_connector->encoder,
> > >&drm_writeback_encoder_funcs,
> > >DRM_MODE_ENCODER_VIRTUAL, NULL);
> > 
> > If the encoder is provided by a separate driver, it might use a
> > different set of encoder funcs.
> 
> More than that, if the encoder is provided externally but doesn't have
> custom operations, I don't really see the point of having an external
> encoder in the first place.
> 
> Has this series been tested with a driver that needs to provide an
> encoder, to make sure it fits the purpose ?

Also, can we not force all drivers to do this setup that don't need it? We
have a ton of kms drivers, forcing unnecessary busywork on drivers is
really not good.
-Daniel

> 
> > I'd suggest checking whether the wb_connector->encoder is NULL here.
> > If it is, allocate one using drmm_kzalloc and init it.
> > If it is not NULL, assume that it has been initialized already, so
> > skip the drm_encoder_init() and just call the drm_encoder_helper_add()
> > 
> > > if (ret)
> > > @@ -204,7 +204,7 @@ int drm_writeback_connector_init(struct drm_device *dev,
> > > goto connector_fail;
> > >
> > > ret = drm_connector_attach_encoder(connector,
> > > -   &wb_connector->encoder);
> > > +   wb_connector->encoder);
> > > if (ret)
> > > goto attach_fail;
> > >
> > > @@ -233,7 +233,7 @@ int drm_writeback_connector_init(struct drm_device *dev,
> > >  attach_fail:
> > > drm_connector_cleanup(connector);
> > >  connector_fail:
> > > -   drm_encoder_cleanup(&wb_connector->encoder);
> > > +   drm_encoder_cleanup(wb_connector->encoder);
> > >  fail:
> > > drm_property_blob_put(blob);
> > > return ret;
> > > diff --git a/include/drm/drm_writeback.h b/include/drm/drm_writeback.h
> > > index 9697d27..0ba266e 100644
> > > --- a/include/drm/drm_writeback.h
> > > +++ b/include/drm/drm_writeback.h
> > > @@ -25,13 +25,22 @@ struct drm_writeback_connector {
> > > struct drm_connector base;
> > >
> > > /**
> > > -* @encoder: Internal encoder used by the connector to fulfill
> > > +* @encoder: handle to drm_encoder used by the connector to fulfill
> > >  * the DRM framework requirements. The users of the
> > >  * @drm_writeback_connector control the behaviour of the @encoder
> > >  * by passing the @enc_funcs parameter to drm_writeback_connector_init()
> > >  * function.
> > > +*
> > > +* For some vendor drivers, the hardware resources are shared between
> > > +* writeback encoder and rest of the display pipeline.
> > > +* To accommodate such cases, encoder is a handle to the real encoder
> > > +* hardware.
> > > +*
> > > +* For current existing writeback users, this shall continue to be the
> > > +* embedded encoder for the writeback connector.
> > > +*
> > >  */
> > > -   struct drm_encoder encoder;
> > > +   struct drm_encoder *encoder;
> > >
> > > /**
> > >  * @pixel_formats_blob_ptr:
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)

2022-03-17 Thread Tvrtko Ursulin



On 17/03/2022 09:47, Daniel Vetter wrote:

On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote:


On 15/03/2022 07:28, Kasireddy, Vivek wrote:

Hi Tvrtko, Daniel,



On 11/03/2022 09:39, Daniel Vetter wrote:

On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy  wrote:


On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
more framebuffers/scanout buffers results in only one that is mappable/
fenceable. Therefore, pageflipping between these 2 FBs where only one
is mappable/fenceable creates latencies large enough to miss alternate
vblanks thereby producing less optimal framerate.

This mainly happens because when i915_gem_object_pin_to_display_plane()
is called to pin one of the FB objs, the associated vma is identified
as misplaced and therefore i915_vma_unbind() is called which unbinds and
evicts it. This misplaced vma gets subsequently pinned only when
i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
results in a latency of ~10ms and happens every other vblank/repaint cycle.
Therefore, to fix this issue, we try to see if there is space to map
at-least two objects of a given size and return early if there isn't. This
would ensure that we do not try with PIN_MAPPABLE for any objects that
are too big to map thereby preventing unnecessary unbind.

Testcase:
Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
with a 8K@60 mode results in only ~40 FPS. Since upstream Weston submits
a frame ~7ms before the next vblank, the latencies seen between atomic
commit and flip event are 7, 24 (7 + 16.66), 7, 24. suggesting that
it misses the vblank every other frame.

Here is the ftrace snippet that shows the source of the ~10ms latency:
 i915_gem_object_pin_to_display_plane() {
0.102 us   |i915_gem_object_set_cache_level();
   i915_gem_object_ggtt_pin_ww() {
0.390 us   |  i915_vma_instance();
0.178 us   |  i915_vma_misplaced();
 i915_vma_unbind() {
 __i915_active_wait() {
0.082 us   |i915_active_acquire_if_busy();
0.475 us   |  }
 intel_runtime_pm_get() {
0.087 us   |intel_runtime_pm_acquire();
0.259 us   |  }
 __i915_active_wait() {
0.085 us   |i915_active_acquire_if_busy();
0.240 us   |  }
 __i915_vma_evict() {
   ggtt_unbind_vma() {
 gen8_ggtt_clear_range() {
10507.255 us |}
10507.689 us |  }
10508.516 us |   }

v2: Instead of using bigjoiner checks, determine whether a scanout
   buffer is too big by checking to see if it is possible to map
   two of them into the ggtt.

v3 (Ville):
- Count how many fb objects can be fit into the available holes
 instead of checking for a hole twice the object size.
- Take alignment constraints into account.
- Limit this large scanout buffer check to >= Gen 11 platforms.

v4:
- Remove existing heuristic that checks just for size. (Ville)
- Return early if we find space to map at-least two objects. (Tvrtko)
- Slightly update the commit message.

v5: (Tvrtko)
- Rename the function to indicate that the object may be too big to
 map into the aperture.
- Account for guard pages while calculating the total size required
 for the object.
- Do not subject all objects to the heuristic check and instead
 consider objects only of a certain size.
- Do the hole walk using the rbtree.
- Preserve the existing PIN_NONBLOCK logic.
- Drop the PIN_MAPPABLE check while pinning the VMA.

v6: (Tvrtko)
- Return 0 on success and the specific error code on failure to
 preserve the existing behavior.

v7: (Ville)
- Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and
 size < ggtt->mappable_end / 4 checks.
- Drop the redundant check that is based on previous heuristic.

v8:
- Make sure that we are holding the mutex associated with ggtt vm
 as we traverse the hole nodes.

v9: (Tvrtko)
- Use mutex_lock_interruptible_nested() instead of mutex_lock().

Cc: Ville Syrjälä 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Manasi Navare 
Reviewed-by: Tvrtko Ursulin 
Signed-off-by: Vivek Kasireddy 
---
drivers/gpu/drm/i915/i915_gem.c | 128 +++-
1 file changed, 94 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9747924cc57b..e0d731b3f215 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -49,6 +49,7 @@
#include "gem/i915_gem_pm.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_userptr.h"
+#include "gem/i915_gem_tiling.h"
#include "gt/intel_engine_user.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_pm.h"
@@ -882,6 +883,96 @@ static void discard_ggtt_vma(struct i915_vma *vma)
   spin_unlock(&obj->vma.lock);
}

+static int
+i915_gem_object_fits_in_aperture(struct drm_i915_gem_object *obj,
+

Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Christian König

Am 17.03.22 um 10:59 schrieb Daniel Vetter:

On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote:

From: Rob Clark 

In the system suspend path, we don't want to be racing with the
scheduler kthreads pushing additional queued up jobs to the hw
queue (ringbuffer).  So park them first.  While we are at it,
move the wait for active jobs to complete into the new system-
suspend path.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/msm/adreno/adreno_device.c | 68 --
  1 file changed, 64 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 8859834b51b8..0440a98988fc 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
  static int adreno_runtime_suspend(struct device *dev)
  {
struct msm_gpu *gpu = dev_to_gpu(dev);
-   int remaining;
+
+   /*
+* We should be holding a runpm ref, which will prevent
+* runtime suspend.  In the system suspend path, we've
+* already waited for active jobs to complete.
+*/
+   WARN_ON_ONCE(gpu->active_submits);
+
+   return gpu->funcs->pm_suspend(gpu);
+}
+
+static void suspend_scheduler(struct msm_gpu *gpu)
+{
+   int i;
+
+   /*
+* Shut down the scheduler before we force suspend, so that
+* suspend isn't racing with scheduler kthread feeding us
+* more work.
+*
+* Note, we just want to park the thread, and let any jobs
+* that are already on the hw queue complete normally, as
+* opposed to the drm_sched_stop() path used for handling
+* faulting/timed-out jobs.  We can't really cancel any jobs
+* already on the hw queue without racing with the GPU.
+*/
+   for (i = 0; i < gpu->nr_rings; i++) {
+   struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
+   kthread_park(sched->thread);

Shouldn't we have some proper interfaces for this?


If I'm not completely mistaken we already should have one, yes.


Also I'm kinda wondering how other drivers do this, feels like we should have a
standard way.

Finally not flushing out all in-flight requests sounds a bit like a bad
idea for system suspend/resume since that's also the hibernation path, and
that would mean your shrinker/page reclaim stops working. At least in full
generality. Which ain't good for hibernation.


Completely agree, that looks like an incorrect workaround to me.

During suspend all userspace applications should be frozen and all of 
their hardware activity flushed out and waited for completion.


I do remember that our internal guys came up with pretty much the same 
idea and it sounded broken to me back then as well.


Regards,
Christian.



Adding Christian and Andrey.
-Daniel


+   }
+}
+
+static void resume_scheduler(struct msm_gpu *gpu)
+{
+   int i;
+
+   for (i = 0; i < gpu->nr_rings; i++) {
+   struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
+   kthread_unpark(sched->thread);
+   }
+}
+
+static int adreno_system_suspend(struct device *dev)
+{
+   struct msm_gpu *gpu = dev_to_gpu(dev);
+   int remaining, ret;
+
+   suspend_scheduler(gpu);
  
  	remaining = wait_event_timeout(gpu->retire_event,

   active_submits(gpu) == 0,
   msecs_to_jiffies(1000));
if (remaining == 0) {
dev_err(dev, "Timeout waiting for GPU to suspend\n");
-   return -EBUSY;
+   ret = -EBUSY;
+   goto out;
}
  
-	return gpu->funcs->pm_suspend(gpu);

+   ret = pm_runtime_force_suspend(dev);
+out:
+   if (ret)
+   resume_scheduler(gpu);
+
+   return ret;
  }
+
+static int adreno_system_resume(struct device *dev)
+{
+   resume_scheduler(dev_to_gpu(dev));
+   return pm_runtime_force_resume(dev);
+}
+
  #endif
  
  static const struct dev_pm_ops adreno_pm_ops = {

-   SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume)
+   SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume)
SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL)
  };
  
--

2.35.1





Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)

2022-03-17 Thread Daniel Vetter
On Thu, Mar 17, 2022 at 10:04:36AM +, Tvrtko Ursulin wrote:
> 
> On 17/03/2022 09:47, Daniel Vetter wrote:
> > On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote:
> > > 
> > > On 15/03/2022 07:28, Kasireddy, Vivek wrote:
> > > > Hi Tvrtko, Daniel,
> > > > 
> > > > > 
> > > > > On 11/03/2022 09:39, Daniel Vetter wrote:
> > > > > > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy 
> > > > > >  wrote:
> > > > > > > 
> > > > > > > On platforms capable of allowing 8K (7680 x 4320) modes, pinning 
> > > > > > > 2 or
> > > > > > > more framebuffers/scanout buffers results in only one that is 
> > > > > > > mappable/
> > > > > > > fenceable. Therefore, pageflipping between these 2 FBs where only 
> > > > > > > one
> > > > > > > is mappable/fenceable creates latencies large enough to miss 
> > > > > > > alternate
> > > > > > > vblanks thereby producing less optimal framerate.
> > > > > > > 
> > > > > > > This mainly happens because when 
> > > > > > > i915_gem_object_pin_to_display_plane()
> > > > > > > is called to pin one of the FB objs, the associated vma is 
> > > > > > > identified
> > > > > > > as misplaced and therefore i915_vma_unbind() is called which 
> > > > > > > unbinds and
> > > > > > > > evicts it. This misplaced vma gets subsequently pinned only when
> > > > > > > i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
> > > > > > > results in a latency of ~10ms and happens every other 
> > > > > > > vblank/repaint cycle.
> > > > > > > Therefore, to fix this issue, we try to see if there is space to 
> > > > > > > map
> > > > > > > at-least two objects of a given size and return early if there 
> > > > > > > isn't. This
> > > > > > > would ensure that we do not try with PIN_MAPPABLE for any objects 
> > > > > > > that
> > > > > > > > are too big to map thereby preventing unnecessary unbind.
> > > > > > > 
> > > > > > > Testcase:
> > > > > > > Running Weston and weston-simple-egl on an Alderlake_S (ADLS) 
> > > > > > > platform
> > > > > > > with a 8K@60 mode results in only ~40 FPS. Since upstream Weston 
> > > > > > > submits
> > > > > > > a frame ~7ms before the next vblank, the latencies seen between 
> > > > > > > atomic
> > > > > > > commit and flip event are 7, 24 (7 + 16.66), 7, 24. 
> > > > > > > suggesting that
> > > > > > > it misses the vblank every other frame.
> > > > > > > 
> > > > > > > Here is the ftrace snippet that shows the source of the ~10ms 
> > > > > > > latency:
> > > > > > >  i915_gem_object_pin_to_display_plane() {
> > > > > > > 0.102 us   |i915_gem_object_set_cache_level();
> > > > > > >i915_gem_object_ggtt_pin_ww() {
> > > > > > > 0.390 us   |  i915_vma_instance();
> > > > > > > 0.178 us   |  i915_vma_misplaced();
> > > > > > >  i915_vma_unbind() {
> > > > > > >  __i915_active_wait() {
> > > > > > > 0.082 us   |i915_active_acquire_if_busy();
> > > > > > > 0.475 us   |  }
> > > > > > >  intel_runtime_pm_get() {
> > > > > > > 0.087 us   |intel_runtime_pm_acquire();
> > > > > > > 0.259 us   |  }
> > > > > > >  __i915_active_wait() {
> > > > > > > 0.085 us   |i915_active_acquire_if_busy();
> > > > > > > 0.240 us   |  }
> > > > > > >  __i915_vma_evict() {
> > > > > > >ggtt_unbind_vma() {
> > > > > > >  gen8_ggtt_clear_range() {
> > > > > > > 10507.255 us |}
> > > > > > > 10507.689 us |  }
> > > > > > > 10508.516 us |   }
> > > > > > > 
> > > > > > > v2: Instead of using bigjoiner checks, determine whether a scanout
> > > > > > >buffer is too big by checking to see if it is possible to 
> > > > > > > map
> > > > > > >two of them into the ggtt.
> > > > > > > 
> > > > > > > v3 (Ville):
> > > > > > > - Count how many fb objects can be fit into the available holes
> > > > > > >  instead of checking for a hole twice the object size.
> > > > > > > - Take alignment constraints into account.
> > > > > > > - Limit this large scanout buffer check to >= Gen 11 platforms.
> > > > > > > 
> > > > > > > v4:
> > > > > > > - Remove existing heuristic that checks just for size. (Ville)
> > > > > > > - Return early if we find space to map at-least two objects. 
> > > > > > > (Tvrtko)
> > > > > > > - Slightly update the commit message.
> > > > > > > 
> > > > > > > v5: (Tvrtko)
> > > > > > > - Rename the function to indicate that the object may be too big 
> > > > > > > to
> > > > > > >  map into the aperture.
> > > > > > > - Account for guard pages while calculating the total size 
> > > > > > > required
> > > > > > >  for the object.
> > > > > > > - Do not subject all objects to the heuristic check and instead
> > > > > > >  consider objects only of a certain size.
> > > > > > > - Do the hole walk using the rbtree.
> > > > > > > - Preserve the existing PIN_NONBLOCK logic.
> > > > > > > - Drop the PIN_MAPP

[PULL] drm-misc-fixes

2022-03-17 Thread Thomas Zimmermann
Hi Dave and Daniel,

here's the PR for drm-misc-fixes for this week. Besides the fixes, it
contains a backmerge of drm/drm-fixes to get required Kconfig changes
from upstream.

Best regards
Thomas

drm-misc-fixes-2022-03-17:
 * drm/imx: Don't test bus flags in atomic check
 * drm/mgag200: Fix PLL setup on some models
 * drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
   Kconfig dependencies
The following changes since commit 09688c0166e76ce2fb85e86b9d99be8b0084cdf9:

  Linux 5.17-rc8 (2022-03-13 13:23:37 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2022-03-17

for you to fetch changes up to 3c3384050d68570f9de0fec9e58824decfefba7a:

  drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS (2022-03-17 11:07:57 +0100)


 * drm/imx: Don't test bus flags in atomic check
 * drm/mgag200: Fix PLL setup on some models
 * drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
   Kconfig dependencies


Christoph Niedermaier (1):
  drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()

Jocelyn Falempe (1):
  drm/mgag200: Fix PLL setup for g200wb and g200ew

Marek Vasut (1):
  drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings

Thomas Zimmermann (2):
  Merge drm/drm-fixes into drm-misc-fixes
  drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS

 drivers/gpu/drm/bridge/Kconfig | 2 +-
 drivers/gpu/drm/imx/parallel-display.c | 8 
 drivers/gpu/drm/mgag200/mgag200_pll.c  | 6 +++---
 drivers/gpu/drm/panel/Kconfig  | 1 +
 drivers/gpu/drm/panel/panel-simple.c   | 2 +-
 5 files changed, 6 insertions(+), 13 deletions(-)

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Daniel Stone
Hi,

On Thu, 17 Mar 2022 at 09:21, Christian König  wrote:
> Am 17.03.22 um 09:42 schrieb Sharma, Shashank:
> >> AFAIU you probably want to be passing around a `struct pid *`, and
> >> then somehow use pid_vnr() in the context of the process reading the
> >> event to get the numeric pid.  Otherwise things will not do what you
> >> expect if the process triggering the crash is in a different pid
> >> namespace from the compositor.
> >
> > I am not sure if it is a good idea to add the pid extraction
> > complexity in here, it is left up to the driver to extract this
> > information and pass it to the work queue. In case of AMDGPU, it's
> > extracted from GPU VM. It would be then more flexible for the drivers
> > as well.
>
> Yeah, but that is just used for debugging.
>
> If we want to use the pid for housekeeping, like for a daemon which
> kills/restarts processes, we absolutely need that or otherwise won't be
> able to work with containers.

100% this.

Pushing back to the compositor is a red herring. The compositor is
just a service which tries to handle window management and input. If
you're looking to kill the offending process or whatever, then that
should go through the session manager - be it systemd or something
container-centric or whatever. At least that way it can deal with
cgroups at the same time, unlike the compositor which is not really
aware of what the thing on the other end of the socket is doing. This
ties in with the support they already have for things like coredump
analysis, and would also be useful for other devices.

Some environments combine compositor and session manager, and a lot of
them have them strongly related, but they're very definitely not the
same thing ...

Cheers,
Daniel


Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)

2022-03-17 Thread Tvrtko Ursulin



On 17/03/2022 07:08, Kasireddy, Vivek wrote:
Hi Tvrtko,

On 16/03/2022 07:37, Kasireddy, Vivek wrote:
Hi Tvrtko,

On 15/03/2022 07:28, Kasireddy, Vivek wrote:
Hi Tvrtko, Daniel,

On 11/03/2022 09:39, Daniel Vetter wrote:
On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy wrote:


On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
more framebuffers/scanout buffers results in only one that is mappable/
fenceable. Therefore, pageflipping between these 2 FBs where only one
is mappable/fenceable creates latencies large enough to miss alternate
vblanks thereby producing less optimal framerate.

This mainly happens because when i915_gem_object_pin_to_display_plane()
is called to pin one of the FB objs, the associated vma is identified
as misplaced and therefore i915_vma_unbind() is called which unbinds and
evicts it. This misplaced vma gets subsequently pinned only when
i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
results in a latency of ~10ms and happens every other vblank/repaint cycle.
Therefore, to fix this issue, we try to see if there is space to map
at least two objects of a given size and return early if there isn't. This
would ensure that we do not try with PIN_MAPPABLE for any objects that
are too big to map thereby preventing unnecessary unbind.

Testcase:
Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
with an 8K@60 mode results in only ~40 FPS. Since upstream Weston submits
a frame ~7ms before the next vblank, the latencies seen between atomic
commit and flip event are 7, 24 (7 + 16.66), 7, 24 ms, suggesting that
it misses the vblank every other frame.
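
A quick sanity check of these numbers, as a sketch only: assuming a 60 Hz
refresh (about 16.66 ms between vblanks) and a flip that lands on time only
every other frame, two frames consume three vblank periods, which gives
60 * 2 / 3 = 40 FPS. The helper name below is made up for illustration and
is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the patch): effective frame rate when
 * `frames` flips complete over `vblanks` vblank periods at `refresh_hz`.
 */
static double effective_fps(double refresh_hz, unsigned int frames,
			    unsigned int vblanks)
{
	assert(vblanks > 0);
	return refresh_hz * frames / vblanks;
}
```

With refresh_hz = 60, frames = 2 and vblanks = 3 this evaluates to 40, in
line with the ~40 FPS reported for the alternating 7 ms / 24 ms pattern.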

Here is the ftrace snippet that shows the source of the ~10ms latency:
                 i915_gem_object_pin_to_display_plane() {
    0.102 us   |   i915_gem_object_set_cache_level();
                   i915_gem_object_ggtt_pin_ww() {
    0.390 us   |     i915_vma_instance();
    0.178 us   |     i915_vma_misplaced();
                     i915_vma_unbind() {
                       __i915_active_wait() {
    0.082 us   |         i915_active_acquire_if_busy();
    0.475 us   |       }
                       intel_runtime_pm_get() {
    0.087 us   |         intel_runtime_pm_acquire();
    0.259 us   |       }
                       __i915_active_wait() {
    0.085 us   |         i915_active_acquire_if_busy();
    0.240 us   |       }
                       __i915_vma_evict() {
                         ggtt_unbind_vma() {
                           gen8_ggtt_clear_range() {
10507.255 us   |           }
10507.689 us   |         }
10508.516 us   |       }

v2: Instead of using bigjoiner checks, determine whether a scanout
buffer is too big by checking to see if it is possible to map
two of them into the ggtt.

v3 (Ville):
- Count how many fb objects can be fit into the available holes
  instead of checking for a hole twice the object size.
- Take alignment constraints into account.
- Limit this large scanout buffer check to >= Gen 11 platforms.

v4:
- Remove existing heuristic that checks just for size. (Ville)
- Return early if we find space to map at-least two objects. (Tvrtko)
- Slightly update the commit message.

v5: (Tvrtko)
- Rename the function to indicate that the object may be too big to
  map into the aperture.
- Account for guard pages while calculating the total size required
  for the object.
- Do not subject all objects to the heuristic check and instead
  consider objects only of a certain size.
- Do the hole walk using the rbtree.
- Preserve the existing PIN_NONBLOCK logic.
- Drop the PIN_MAPPABLE check while pinning the VMA.

v6: (Tvrtko)
- Return 0 on success and the specific error code on failure to
  preserve the existing behavior.

v7: (Ville)
- Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and
  size < ggtt->mappable_end / 4 checks.
- Drop the redundant check that is based on previous heuristic.

v8:
- Make sure that we are holding the mutex associated with ggtt vm
  as we traverse the hole nodes.

v9: (Tvrtko)
- Use mutex_lock_interruptible_nested() instead of mutex_lock().
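
The hole-counting idea from the v3 and v4 notes can be sketched in plain
userspace C. This is only an illustration under stated assumptions: the
struct and function names are hypothetical, holes are given as a flat array
rather than walked through the drm_mm rbtree, `size` is assumed to already
include any guard pages, and no locking is shown. It is not the function
added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A free range [start, end) in the address space; illustrative only. */
struct hole {
	uint64_t start;
	uint64_t end;
};

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return true as soon as `want` objects of `size` bytes fit into the
 * holes at `align`-aligned addresses. Overflow handling is omitted.
 */
static bool holes_fit_objects(const struct hole *holes, size_t nholes,
			      uint64_t size, uint64_t align,
			      unsigned int want)
{
	unsigned int count = 0;
	size_t i;

	assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

	if (want == 0)
		return true;

	for (i = 0; i < nholes; i++) {
		uint64_t addr = align_up(holes[i].start, align);

		/* Place as many aligned objects as this hole can take. */
		while (addr <= holes[i].end && holes[i].end - addr >= size) {
			if (++count >= want)
				return true;	/* early exit once enough fit */
			addr = align_up(addr + size, align);
		}
	}
	return false;
}
```

Counting per hole, instead of looking for a single hole of twice the object
size, also accepts the case where two objects fit in two separate holes.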

Cc: Ville Syrjälä 
Cc: Maarten Lankhorst 
Cc: Tvrtko Ursulin 
Cc: Manasi Navare 
Reviewed-by: Tvrtko Ursulin 
Signed-off-by: Vivek Kasireddy 
---
 drivers/gpu/drm/i915/i915_gem.c | 128 +++-
 1 file changed, 94 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9747924cc57b..e0d731b3f215 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -49,6 +49,7 @@
 #include "gem/i915_gem_pm.h"
 #include "gem/i915_gem_region.h"
 #include "gem/i915_gem_userptr.h"
+#include "gem/i915_gem_tiling.h"
 #include "gt/intel_engine_user.h"
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_pm.h"
@@ -882,6 +883,96 @@ static void discard_ggtt_vma(struct i915_vma *vma)
spin_unlock(&obj->vma.lock);
 }

+static int
+i915_gem_object_fits_in_aperture(struct drm_

Re: [PATCH] drm: drm_bufs: Error out if 'dev->agp' is a null pointer

2022-03-17 Thread Daniel Vetter
On Fri, Mar 11, 2022 at 07:23:02AM +, Zheyu Ma wrote:
> The user program can control the 'drm_buf_desc::flags' via ioctl system
> call and enter the function drm_legacy_addbufs_agp(). If the driver
> doesn't initialize the agp resources, the driver will cause a null
> pointer dereference.
> 
> The following log reveals it:
> general protection fault, probably for non-canonical address
> 0xdc0f:  [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0078-0x007f]
> Call Trace:
>  
>  drm_ioctl_kernel+0x342/0x450 drivers/gpu/drm/drm_ioctl.c:785
>  drm_ioctl+0x592/0x940 drivers/gpu/drm/drm_ioctl.c:885
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:874 [inline]
>  __se_sys_ioctl+0xaa/0xf0 fs/ioctl.c:860
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x43/0x90 arch/x86/entry/common.c:80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Fix this bug by adding a check.
> 
> Signed-off-by: Zheyu Ma 

You can only hit this if you enabled a DRIVER_LEGACY drm driver, which
opens you up to tons of other CVEs and issues. What's your .config?
-Daniel

> ---
>  drivers/gpu/drm/drm_bufs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
> index fcca21e8efac..4fe2363b1e34 100644
> --- a/drivers/gpu/drm/drm_bufs.c
> +++ b/drivers/gpu/drm/drm_bufs.c
> @@ -734,7 +734,7 @@ int drm_legacy_addbufs_agp(struct drm_device *dev,
>   int i, valid;
>   struct drm_buf **temp_buflist;
>  
> - if (!dma)
> + if (!dma || !dev->agp)
>   return -EINVAL;
>  
>   count = request->count;
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v1] drm/shmem-helper: Correct doc-comment of drm_gem_shmem_get_sg_table()

2022-03-17 Thread Dmitry Osipenko


On 3/17/22 12:52, Daniel Vetter wrote:
> On Tue, Mar 08, 2022 at 04:34:01PM +0300, Dmitry Osipenko wrote:
>> drm_gem_shmem_get_sg_table() never returns NULL on error, but an ERR_PTR.
>> Correct the doc comment which says that it returns NULL on error.
>>
>> Signed-off-by: Dmitry Osipenko 
>> ---
>>  drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
>> b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> index 8ad0e02991ca..37009418cd28 100644
>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> @@ -662,7 +662,7 @@ EXPORT_SYMBOL(drm_gem_shmem_print_info);
>>   * drm_gem_shmem_get_pages_sgt() instead.
>>   *
>>   * Returns:
>> - * A pointer to the scatter/gather table of pinned pages or NULL on failure.
>> + * A pointer to the scatter/gather table of pinned pages or errno on 
>> failure.
> 
> Hm usually we write "negative errno" for these, since the error numbers
> are defined as positive numbers. Care to respin?

It's actually ERR_PTR that is returned here, "errno" was borrowed from
some other similar DRM comment. I added this patch to v2 of virtio
patchset [1] and will improve the comment in v3, thanks.
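
To illustrate why the distinction matters for callers, here is a minimal
userspace imitation of the ERR_PTR convention. The three helpers mirror the
kernel's include/linux/err.h in simplified form; get_table() is a
hypothetical stand-in and not the real drm_gem_shmem_get_sg_table().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace imitation of the kernel's include/linux/err.h. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers live in the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical function following the "ERR_PTR on failure" contract. */
static void *get_table(int fail)
{
	static int table;

	return fail ? ERR_PTR(-ENOMEM) : &table;
}
```

A caller that only tests the result against NULL would treat the error
pointer as a valid table, so the check has to be IS_ERR() followed by
PTR_ERR() to recover the negative errno.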

[1]
https://lore.kernel.org/dri-devel/20220314224253.236359-1-dmitry.osipe...@collabora.com/T/#t


Re: [PATCH 2/2] fbdev: Fix cfb_imageblit() for arbitrary image widths

2022-03-17 Thread Daniel Vetter
On Sun, Mar 13, 2022 at 08:29:52PM +0100, Thomas Zimmermann wrote:
> Commit 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()")
> broke cfb_imageblit() for image widths that are not aligned to 8-bit
> boundaries. Fix this by handling the trailing pixels on each line
> separately. The performance improvements in the original commit do not
> regress by this change.
> 
> Signed-off-by: Thomas Zimmermann 
> Fixes: 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()")
> Reported-by: Marek Szyprowski 
> Cc: Thomas Zimmermann 
> Cc: Javier Martinez Canillas 
> Cc: Sam Ravnborg 

On both patches:

Acked-by: Daniel Vetter 

> ---
>  drivers/video/fbdev/core/cfbimgblt.c | 28 
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/video/fbdev/core/cfbimgblt.c 
> b/drivers/video/fbdev/core/cfbimgblt.c
> index 7361cfabdd85..9ebda4e0dc7a 100644
> --- a/drivers/video/fbdev/core/cfbimgblt.c
> +++ b/drivers/video/fbdev/core/cfbimgblt.c
> @@ -218,7 +218,7 @@ static inline void fast_imageblit(const struct fb_image 
> *image, struct fb_info *
>  {
>   u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel;
>   u32 ppw = 32/bpp, spitch = (image->width + 7)/8;
> - u32 bit_mask, eorx;
> + u32 bit_mask, eorx, shift;
>   const char *s = image->data, *src;
>   u32 __iomem *dst;
>   const u32 *tab = NULL;
> @@ -259,17 +259,23 @@ static inline void fast_imageblit(const struct fb_image 
> *image, struct fb_info *
>  
>   for (i = image->height; i--; ) {
>   dst = (u32 __iomem *)dst1;
> + shift = 8;
>   src = s;
>  
> + /*
> +  * Manually unroll the per-line copying loop for better
> +  * performance. This works until we processed the last
> +  * completely filled source byte (inclusive).
> +  */
>   switch (ppw) {
>   case 4: /* 8 bpp */
> - for (j = k; j; j -= 2, ++src) {
> + for (j = k; j >= 2; j -= 2, ++src) {
>   FB_WRITEL(colortab[(*src >> 4) & bit_mask], 
> dst++);
>   FB_WRITEL(colortab[(*src >> 0) & bit_mask], 
> dst++);
>   }
>   break;
>   case 2: /* 16 bpp */
> - for (j = k; j; j -= 4, ++src) {
> + for (j = k; j >= 4; j -= 4, ++src) {
>   FB_WRITEL(colortab[(*src >> 6) & bit_mask], 
> dst++);
>   FB_WRITEL(colortab[(*src >> 4) & bit_mask], 
> dst++);
>   FB_WRITEL(colortab[(*src >> 2) & bit_mask], 
> dst++);
> @@ -277,7 +283,7 @@ static inline void fast_imageblit(const struct fb_image 
> *image, struct fb_info *
>   }
>   break;
>   case 1: /* 32 bpp */
> - for (j = k; j; j -= 8, ++src) {
> + for (j = k; j >= 8; j -= 8, ++src) {
>   FB_WRITEL(colortab[(*src >> 7) & bit_mask], 
> dst++);
>   FB_WRITEL(colortab[(*src >> 6) & bit_mask], 
> dst++);
>   FB_WRITEL(colortab[(*src >> 5) & bit_mask], 
> dst++);
> @@ -290,6 +296,20 @@ static inline void fast_imageblit(const struct fb_image 
> *image, struct fb_info *
>   break;
>   }
>  
> + /*
> +  * For image widths that are not a multiple of 8, there
> +  * are trailing pixels left on the current line. Print
> +  * them as well.
> +  */
> + for (; j--; ) {
> + shift -= ppw;
> + FB_WRITEL(colortab[(*src >> shift) & bit_mask], dst++);
> + if (!shift) {
> + shift = 8;
> + ++src;
> + }
> + }
> +
>   dst1 += p->fix.line_length;
>   s += spitch;
>   }
> -- 
> 2.35.1
> 
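
For readers who want to see the shape of the fix outside the driver, here
is a userspace sketch of the same two-phase approach, simplified to the
32 bpp case (one destination word per source bit). The function name and
the two-entry colour table are assumptions for illustration; the real code
uses FB_WRITEL() and lookup tables indexed by several bits at once.

```c
#include <stdint.h>

/*
 * Expand one row of a 1-bit-per-pixel bitmap (MSB first) into 32-bit
 * pixels. Simplified to 32 bpp; names are illustrative only.
 */
static void expand_mono_row(const uint8_t *src, uint32_t *dst,
			    unsigned int width, uint32_t fg, uint32_t bg)
{
	const uint32_t tab[2] = { bg, fg };
	unsigned int j = width;
	unsigned int shift = 8;

	/* Fast path: completely filled source bytes, manually unrolled. */
	for (; j >= 8; j -= 8, ++src) {
		*dst++ = tab[(*src >> 7) & 1];
		*dst++ = tab[(*src >> 6) & 1];
		*dst++ = tab[(*src >> 5) & 1];
		*dst++ = tab[(*src >> 4) & 1];
		*dst++ = tab[(*src >> 3) & 1];
		*dst++ = tab[(*src >> 2) & 1];
		*dst++ = tab[(*src >> 1) & 1];
		*dst++ = tab[(*src >> 0) & 1];
	}

	/* Trailing pixels: fewer than 8 bits remain in the last byte. */
	while (j--) {
		shift -= 1;
		*dst++ = tab[(*src >> shift) & 1];
	}
}
```

Changing the fast-path condition from "j != 0" to "j >= 8" is what keeps it
from running past the last full byte; the second loop then picks up the
remaining bits, and never touches src when the width is a multiple of 8.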

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code

2022-03-17 Thread Daniel Vetter
On Thu, Mar 17, 2022 at 10:25:55AM +0100, Patrik Jakobsson wrote:
> Use the gma_ prefix instead of psb_ since the code is common for all
> chips. Various coding style fixes. Removal of unused code. Removal of
> duplicate function declarations.

I didn't really find the above removal things, was that from an old commit
message before you split those changes out?

Aside from that nit on the commit message on all 4 patches (btw your
threading is somehow broken in this series):

Acked-by: Daniel Vetter 
> 
> Signed-off-by: Patrik Jakobsson 
> ---
>  drivers/gpu/drm/gma500/gma_display.c |  8 +--
>  drivers/gpu/drm/gma500/opregion.c|  5 +-
>  drivers/gpu/drm/gma500/power.c   | 10 +--
>  drivers/gpu/drm/gma500/psb_drv.c |  2 +-
>  drivers/gpu/drm/gma500/psb_drv.h | 11 
>  drivers/gpu/drm/gma500/psb_irq.c | 94 +++-
>  drivers/gpu/drm/gma500/psb_irq.h | 19 +++---
>  7 files changed, 57 insertions(+), 92 deletions(-)
> 
> diff --git a/drivers/gpu/drm/gma500/gma_display.c 
> b/drivers/gpu/drm/gma500/gma_display.c
> index 931ffb192fc4..1d7964c339f4 100644
> --- a/drivers/gpu/drm/gma500/gma_display.c
> +++ b/drivers/gpu/drm/gma500/gma_display.c
> @@ -17,7 +17,7 @@
>  #include "framebuffer.h"
>  #include "gem.h"
>  #include "gma_display.h"
> -#include "psb_drv.h"
> +#include "psb_irq.h"
>  #include "psb_intel_drv.h"
>  #include "psb_intel_reg.h"
>  
> @@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = {
>   .set_config = gma_crtc_set_config,
>   .destroy = gma_crtc_destroy,
>   .page_flip = gma_crtc_page_flip,
> - .enable_vblank = psb_enable_vblank,
> - .disable_vblank = psb_disable_vblank,
> - .get_vblank_counter = psb_get_vblank_counter,
> + .enable_vblank = gma_enable_vblank,
> + .disable_vblank = gma_disable_vblank,
> + .get_vblank_counter = gma_get_vblank_counter,
>  };
>  
>  /*
> diff --git a/drivers/gpu/drm/gma500/opregion.c 
> b/drivers/gpu/drm/gma500/opregion.c
> index fef04ff8c3a9..dc494df71a48 100644
> --- a/drivers/gpu/drm/gma500/opregion.c
> +++ b/drivers/gpu/drm/gma500/opregion.c
> @@ -23,6 +23,7 @@
>   */
>  #include 
>  #include "psb_drv.h"
> +#include "psb_irq.h"
>  #include "psb_intel_reg.h"
>  
>  #define PCI_ASLE 0xe4
> @@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device 
> *dev)
>   if (asle && system_opregion ) {
>   /* Don't do this on Medfield or other non PC like devices, they
>  use the bit for something different altogether */
> - psb_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE);
> - psb_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE);
> + gma_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE);
> + gma_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE);
>  
>   asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN
>   | ASLE_PFMB_EN;
> diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c
> index 6f917cfef65b..b91de6d36e41 100644
> --- a/drivers/gpu/drm/gma500/power.c
> +++ b/drivers/gpu/drm/gma500/power.c
> @@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev)
>   dev_err(dev->dev, "GPU hardware busy, cannot 
> suspend\n");
>   return -EBUSY;
>   }
> - psb_irq_uninstall(dev);
> + gma_irq_uninstall(dev);
>   gma_suspend_display(dev);
>   gma_suspend_pci(pdev);
>   }
> @@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev)
>   mutex_lock(&power_mutex);
>   gma_resume_pci(pdev);
>   gma_resume_display(pdev);
> - psb_irq_preinstall(dev);
> - psb_irq_postinstall(dev);
> + gma_irq_preinstall(dev);
> + gma_irq_postinstall(dev);
>   mutex_unlock(&power_mutex);
>   return 0;
>  }
> @@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool 
> force_on)
>   /* Ok power up needed */
>   ret = gma_resume_pci(pdev);
>   if (ret == 0) {
> - psb_irq_preinstall(dev);
> - psb_irq_postinstall(dev);
> + gma_irq_preinstall(dev);
> + gma_irq_postinstall(dev);
>   pm_runtime_get(dev->dev);
>   dev_priv->display_count++;
>   spin_unlock_irqrestore(&power_ctrl_lock, flags);
> diff --git a/drivers/gpu/drm/gma500/psb_drv.c 
> b/drivers/gpu/drm/gma500/psb_drv.c
> index e30b58184156..82d51e9821ad 100644
> --- a/drivers/gpu/drm/gma500/psb_drv.c
> +++ b/drivers/gpu/drm/gma500/psb_drv.c
> @@ -380,7 +380,7 @@ static int psb_driver_load(struct drm_device *dev, 
> unsigned long flags)
>   PSB_WVDC32(0x, PSB_INT_MASK_R);
>   spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags);
>  
> - psb_irq_install(dev, pdev->irq);
> + gma_irq_install(dev, pdev->irq);
>  
>   dev->max_vblank_count = 0xff; /* o

Re: [PATCH 1/2] fbdev: Fix sys_imageblit() for arbitrary image widths

2022-03-17 Thread Javier Martinez Canillas
Hello Thomas,

On 3/13/22 20:29, Thomas Zimmermann wrote:
> Commit 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()")
> broke sys_imageblit() for image widths that are not aligned to 8-bit
> boundaries. Fix this by handling the trailing pixels on each line
> separately. The performance improvements in the original commit do not
> regress by this change.
> 
> Signed-off-by: Thomas Zimmermann 
> Fixes: 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()")
> Cc: Thomas Zimmermann 
> Cc: Javier Martinez Canillas 
> Cc: Sam Ravnborg 
> ---

Looks good to me. Also, Marek and Geert mentioned that it fixes the issue
they were seeing.

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH 2/2] fbdev: Fix cfb_imageblit() for arbitrary image widths

2022-03-17 Thread Javier Martinez Canillas
On 3/13/22 20:29, Thomas Zimmermann wrote:
> Commit 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()")
> broke cfb_imageblit() for image widths that are not aligned to 8-bit
> boundaries. Fix this by handling the trailing pixels on each line
> separately. The performance improvements in the original commit do not
> regress by this change.
> 
> Signed-off-by: Thomas Zimmermann 
> Fixes: 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()")
> Reported-by: Marek Szyprowski 
> Cc: Thomas Zimmermann 
> Cc: Javier Martinez Canillas 
> Cc: Sam Ravnborg 
> ---

Reviewed-by: Javier Martinez Canillas 

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-17 Thread Javier Martinez Canillas
Hello Chuansheng,

On 3/17/22 06:46, Chuansheng Liu wrote:
> Easily hit the below list corruption:
> ==
> list_add corruption. prev->next should be next (c0ceb090), but
> was ec604507edc8. (prev=ec604507edc8).
> WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
> __list_add_valid+0x53/0x80
> CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
> RIP: 0010:__list_add_valid+0x53/0x80
> Call Trace:
>  
>  fb_deferred_io_mkwrite+0xea/0x150
>  do_page_mkwrite+0x57/0xc0
>  do_wp_page+0x278/0x2f0
>  __handle_mm_fault+0xdc2/0x1590
>  handle_mm_fault+0xdd/0x2c0
>  do_user_addr_fault+0x1d3/0x650
>  exc_page_fault+0x77/0x180
>  ? asm_exc_page_fault+0x8/0x30
>  asm_exc_page_fault+0x1e/0x30
> RIP: 0033:0x7fd98fc8fad1
> ==
> 
> The race happens when one process is adding &page->lru to the pagelist
> tail in fb_deferred_io_mkwrite() while another process is
> re-initializing the same &page->lru in fb_deferred_io_fault(), which is
> not protected by the lock.
> 
> Fix this by initializing all the page lists once during initialization.
> This not only fixes the list corruption, but also avoids redundant
> INIT_LIST_HEAD() calls.
> 
> Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
> Cc: Thomas Zimmermann 
> Signed-off-by: Chuansheng Liu 
> ---

This makes sense to me. If you address Geert's comment and post a v2,
feel free to add:

Reviewed-by: Javier Martinez Canillas 
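
For reference, the failure mode can be reproduced in userspace with a
minimal imitation of the kernel's circular list. The names below are
lower-cased stand-ins, the validity check covers only a subset of what
lib/list_debug.c verifies, and no locking is modelled; treat it as a sketch
of the mechanism, not of the defio code.

```c
#include <stdbool.h>

/* Minimal userspace imitation of the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Subset of the invariants checked before inserting between prev and next. */
static bool list_add_valid(const struct list_head *prev,
			   const struct list_head *next)
{
	return next->prev == prev && prev->next == next;
}

/* Append entry before head; refuse instead of corrupting the list. */
static bool list_add_tail_checked(struct list_head *entry,
				  struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!list_add_valid(prev, head))
		return false;

	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
	head->prev = entry;
	return true;
}
```

If a node that is already on the list gets re-initialized, its next pointer
points back at itself, so the following tail insert sees prev->next == prev
rather than the list head. That is the same pattern as in the warning
quoted above, where the reported "was" value equals prev.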

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



Re: [PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code

2022-03-17 Thread Patrik Jakobsson
On Thu, Mar 17, 2022 at 12:02 PM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 10:25:55AM +0100, Patrik Jakobsson wrote:
> > Use the gma_ prefix instead of psb_ since the code is common for all
> > chips. Various coding style fixes. Removal of unused code. Removal of
> > duplicate function declarations.
>
> I didn't really find the above removal things, was that from an old commit
> message before you split those changes out?

I was thinking about the removal of mid_pipe_vsync() (unused code) and
the psb_irq declarations in psb_drv.h (duplicate function
declarations). Perhaps I should've split this up in several patches.

>
> Aside from that nit on the commit message on all 4 patches (btw your
> threading is somehow broken in this series):

I have a new gitconfig on this machine. It's likely misconfigured with
--no-thread or something like that.

Thanks for the review.

>
> Acked-by: Daniel Vetter 
> >
> > Signed-off-by: Patrik Jakobsson 
> > ---
> >  drivers/gpu/drm/gma500/gma_display.c |  8 +--
> >  drivers/gpu/drm/gma500/opregion.c|  5 +-
> >  drivers/gpu/drm/gma500/power.c   | 10 +--
> >  drivers/gpu/drm/gma500/psb_drv.c |  2 +-
> >  drivers/gpu/drm/gma500/psb_drv.h | 11 
> >  drivers/gpu/drm/gma500/psb_irq.c | 94 +++-
> >  drivers/gpu/drm/gma500/psb_irq.h | 19 +++---
> >  7 files changed, 57 insertions(+), 92 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/gma500/gma_display.c 
> > b/drivers/gpu/drm/gma500/gma_display.c
> > index 931ffb192fc4..1d7964c339f4 100644
> > --- a/drivers/gpu/drm/gma500/gma_display.c
> > +++ b/drivers/gpu/drm/gma500/gma_display.c
> > @@ -17,7 +17,7 @@
> >  #include "framebuffer.h"
> >  #include "gem.h"
> >  #include "gma_display.h"
> > -#include "psb_drv.h"
> > +#include "psb_irq.h"
> >  #include "psb_intel_drv.h"
> >  #include "psb_intel_reg.h"
> >
> > @@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = {
> >   .set_config = gma_crtc_set_config,
> >   .destroy = gma_crtc_destroy,
> >   .page_flip = gma_crtc_page_flip,
> > - .enable_vblank = psb_enable_vblank,
> > - .disable_vblank = psb_disable_vblank,
> > - .get_vblank_counter = psb_get_vblank_counter,
> > + .enable_vblank = gma_enable_vblank,
> > + .disable_vblank = gma_disable_vblank,
> > + .get_vblank_counter = gma_get_vblank_counter,
> >  };
> >
> >  /*
> > diff --git a/drivers/gpu/drm/gma500/opregion.c 
> > b/drivers/gpu/drm/gma500/opregion.c
> > index fef04ff8c3a9..dc494df71a48 100644
> > --- a/drivers/gpu/drm/gma500/opregion.c
> > +++ b/drivers/gpu/drm/gma500/opregion.c
> > @@ -23,6 +23,7 @@
> >   */
> >  #include 
> >  #include "psb_drv.h"
> > +#include "psb_irq.h"
> >  #include "psb_intel_reg.h"
> >
> >  #define PCI_ASLE 0xe4
> > @@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device 
> > *dev)
> >   if (asle && system_opregion ) {
> >   /* Don't do this on Medfield or other non PC like devices, 
> > they
> >  use the bit for something different altogether */
> > - psb_enable_pipestat(dev_priv, 0, 
> > PIPE_LEGACY_BLC_EVENT_ENABLE);
> > - psb_enable_pipestat(dev_priv, 1, 
> > PIPE_LEGACY_BLC_EVENT_ENABLE);
> > + gma_enable_pipestat(dev_priv, 0, 
> > PIPE_LEGACY_BLC_EVENT_ENABLE);
> > + gma_enable_pipestat(dev_priv, 1, 
> > PIPE_LEGACY_BLC_EVENT_ENABLE);
> >
> >   asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN
> >   | 
> > ASLE_PFMB_EN;
> > diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c
> > index 6f917cfef65b..b91de6d36e41 100644
> > --- a/drivers/gpu/drm/gma500/power.c
> > +++ b/drivers/gpu/drm/gma500/power.c
> > @@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev)
> >   dev_err(dev->dev, "GPU hardware busy, cannot 
> > suspend\n");
> >   return -EBUSY;
> >   }
> > - psb_irq_uninstall(dev);
> > + gma_irq_uninstall(dev);
> >   gma_suspend_display(dev);
> >   gma_suspend_pci(pdev);
> >   }
> > @@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev)
> >   mutex_lock(&power_mutex);
> >   gma_resume_pci(pdev);
> >   gma_resume_display(pdev);
> > - psb_irq_preinstall(dev);
> > - psb_irq_postinstall(dev);
> > + gma_irq_preinstall(dev);
> > + gma_irq_postinstall(dev);
> >   mutex_unlock(&power_mutex);
> >   return 0;
> >  }
> > @@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool 
> > force_on)
> >   /* Ok power up needed */
> >   ret = gma_resume_pci(pdev);
> >   if (ret == 0) {
> > - psb_irq_preinstall(dev);
> > - psb_irq_postinstall(dev);
> > + gma_irq_preinstall(dev);
> > + gma_irq_postinstall(dev);
> >   pm_runtime_get(dev->dev);
> >   dev_priv->display_c

Re: [PATCH v2 0/5] drm: Fix monochrome conversion for sdd130x

2022-03-17 Thread Javier Martinez Canillas
Hello Geert,

On 3/17/22 09:18, Geert Uytterhoeven wrote:
> Hi all,
> 
> This patch series contains fixes and improvements for the XRGB8888 to
> monochrome conversion in the DRM core, and for its users.
> 
> This has been tested on an Adafruit FeatherWing 128x32 OLED, connected
> to an OrangeCrab ECP5 FPGA board running a 64 MHz VexRiscv RISC-V
> softcore, using a text console with 4x6, 7x14 and 8x8 fonts.
> 
> Thanks!
> 
> Geert Uytterhoeven (5):
>   drm/format-helper: Rename drm_fb_xrgb_to_mono_reversed()
>   drm/format-helper: Fix XRGB888 to monochrome conversion
>   drm/ssd130x: Fix rectangle updates
>   drm/ssd130x: Reduce temporary buffer sizes
>   drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()
>

Thanks for re-spinning this series and again for fixing my bugs!

I pushed patches 1-4 to drm-misc (drm-misc-next) but left patch 5 since I
would like to give Noralf the opportunity to review/test before pushing.

By the way, you should probably request commit access to the drm-misc tree:

https://drm.pages.freedesktop.org/maintainer-tools/commit-access.html

-- 
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat



[v8 0/5] enhanced edid driver compatibility

2022-03-17 Thread Lee Shawn C
Add support for parsing multiple CEA extension blocks and HF-EEODB to
extend the DRM EDID driver's capability.

v4: add one more patch to support HF-SCDB
v5: HF-SCDB and HF-VSDBS carry the same SCDS data. Reuse
drm_parse_hdmi_forum_vsdb() to parse this packet.
v6: save proper extension block index if CTA data information
was found in DisplayID block.
v7: use different parameters to store the CEA and DisplayID block index.
Configure the DisplayID extension block index before searching for an
available DisplayID block.
v8: revert patch [v7 2/5] change.
And check the cea pointer returned from drm_find_cea_extension().
If the driver got the same cea pointer then exit this routine.

Lee Shawn C (5):
  drm/edid: seek for available CEA block from specific EDID block index
  drm/edid: parse multiple CEA extension block
  drm/edid: read HF-EEODB ext block
  drm/edid: parse HF-EEODB CEA extension block
  drm/edid: check for HF-SCDB block

 drivers/gpu/drm/drm_connector.c |   8 +-
 drivers/gpu/drm/drm_displayid.c |   5 +-
 drivers/gpu/drm/drm_edid.c  | 174 
 include/drm/drm_edid.h  |   4 +-
 4 files changed, 144 insertions(+), 47 deletions(-)

-- 
2.17.1



[v8 1/5] drm/edid: seek for available CEA block from specific EDID block index

2022-03-17 Thread Lee Shawn C
drm_find_cea_extension() always looks for a top-level CEA block. Pass
ext_index in from the caller so that this function can search for the next
available CEA ext block starting from a specific EDID block index.
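
The resumable search can be sketched in userspace as follows. This is an
assumption-laden illustration: find_ext_block() is a hypothetical name, the
EDID is treated as a raw byte array of 128-byte blocks with the extension
count at offset 126 of the base block, and the DisplayID fallback of the
real function is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define EDID_BLOCK_LEN	128
#define EDID_EXT_COUNT	126	/* offset of the extension count byte */
#define CEA_EXT_TAG	0x02

/*
 * Find the next extension block tagged `ext_id`, starting at extension
 * number *ext_index. On success *ext_index is advanced past the match,
 * so the next call resumes the search from there.
 */
static const uint8_t *find_ext_block(const uint8_t *edid, uint8_t ext_id,
				     int *ext_index)
{
	int count = edid[EDID_EXT_COUNT];
	int i;

	for (i = *ext_index; i < count; i++) {
		const uint8_t *ext = edid + EDID_BLOCK_LEN * (i + 1);

		if (ext[0] == ext_id) {
			*ext_index = i + 1;
			return ext;
		}
	}
	return NULL;
}
```

A caller that starts with ext_index = 0 and keeps calling until NULL comes
back visits every CEA block once, which is the iteration pattern the later
patches in this series build on.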

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Cc: intel-gfx 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_edid.c | 42 ++
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 561f53831e29..1251226d9284 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3353,16 +3353,14 @@ const u8 *drm_find_edid_extension(const struct edid 
*edid,
return edid_ext;
 }
 
-static const u8 *drm_find_cea_extension(const struct edid *edid)
+static const u8 *drm_find_cea_extension(const struct edid *edid, int 
*ext_index)
 {
const struct displayid_block *block;
struct displayid_iter iter;
const u8 *cea;
-   int ext_index = 0;
 
-   /* Look for a top level CEA extension block */
-   /* FIXME: make callers iterate through multiple CEA ext blocks? */
-   cea = drm_find_edid_extension(edid, CEA_EXT, &ext_index);
+   /* Look for a CEA extension block from ext_index */
+   cea = drm_find_edid_extension(edid, CEA_EXT, ext_index);
if (cea)
return cea;
 
@@ -3643,10 +3641,10 @@ add_alternate_cea_modes(struct drm_connector 
*connector, struct edid *edid)
struct drm_device *dev = connector->dev;
struct drm_display_mode *mode, *tmp;
LIST_HEAD(list);
-   int modes = 0;
+   int modes = 0, ext_index = 0;
 
/* Don't add CEA modes if the CEA extension block is missing */
-   if (!drm_find_cea_extension(edid))
+   if (!drm_find_cea_extension(edid, &ext_index))
return 0;
 
/*
@@ -4321,11 +4319,11 @@ static void drm_parse_y420cmdb_bitmap(struct 
drm_connector *connector,
 static int
 add_cea_modes(struct drm_connector *connector, struct edid *edid)
 {
-   const u8 *cea = drm_find_cea_extension(edid);
-   const u8 *db, *hdmi = NULL, *video = NULL;
+   const u8 *cea, *db, *hdmi = NULL, *video = NULL;
u8 dbl, hdmi_len, video_len = 0;
-   int modes = 0;
+   int modes = 0, ext_index = 0;
 
+   cea = drm_find_cea_extension(edid, &ext_index);
if (cea && cea_revision(cea) >= 3) {
int i, start, end;
 
@@ -4562,7 +4560,7 @@ static void drm_edid_to_eld(struct drm_connector 
*connector, struct edid *edid)
uint8_t *eld = connector->eld;
const u8 *cea;
const u8 *db;
-   int total_sad_count = 0;
+   int total_sad_count = 0, ext_index = 0;
int mnl;
int dbl;
 
@@ -4571,7 +4569,7 @@ static void drm_edid_to_eld(struct drm_connector 
*connector, struct edid *edid)
if (!edid)
return;
 
-   cea = drm_find_cea_extension(edid);
+   cea = drm_find_cea_extension(edid, &ext_index);
if (!cea) {
DRM_DEBUG_KMS("ELD: no CEA Extension found\n");
return;
@@ -4655,11 +4653,11 @@ static void drm_edid_to_eld(struct drm_connector 
*connector, struct edid *edid)
  */
 int drm_edid_to_sad(struct edid *edid, struct cea_sad **sads)
 {
-   int count = 0;
+   int count = 0, ext_index = 0;
int i, start, end, dbl;
const u8 *cea;
 
-   cea = drm_find_cea_extension(edid);
+   cea = drm_find_cea_extension(edid, &ext_index);
if (!cea) {
DRM_DEBUG_KMS("SAD: no CEA Extension found\n");
return 0;
@@ -4717,11 +4715,11 @@ EXPORT_SYMBOL(drm_edid_to_sad);
  */
 int drm_edid_to_speaker_allocation(struct edid *edid, u8 **sadb)
 {
-   int count = 0;
+   int count = 0, ext_index = 0;
int i, start, end, dbl;
const u8 *cea;
 
-   cea = drm_find_cea_extension(edid);
+   cea = drm_find_cea_extension(edid, &ext_index);
if (!cea) {
DRM_DEBUG_KMS("SAD: no CEA Extension found\n");
return 0;
@@ -4814,9 +4812,9 @@ bool drm_detect_hdmi_monitor(struct edid *edid)
 {
const u8 *edid_ext;
int i;
-   int start_offset, end_offset;
+   int start_offset, end_offset, ext_index = 0;
 
-   edid_ext = drm_find_cea_extension(edid);
+   edid_ext = drm_find_cea_extension(edid, &ext_index);
if (!edid_ext)
return false;
 
@@ -4853,9 +4851,9 @@ bool drm_detect_monitor_audio(struct edid *edid)
const u8 *edid_ext;
int i, j;
bool has_audio = false;
-   int start_offset, end_offset;
+   int start_offset, end_offset, ext_index = 0;
 
-   edid_ext = drm_find_cea_extension(edid);
+   edid_ext = drm_find_cea_extension(edid, &ext_index);
if (!edid_ext)
goto end;
 
@@ -5177,9 +5175,9 @@ static void drm_parse_cea_ext(struct drm_connector 
*connector,
 {
struct drm_display_info *info = &connector->display_info;
const u8

[v8 2/5] drm/edid: parse multiple CEA extension block

2022-03-17 Thread Lee Shawn C
Try to find and parse more CEA ext blocks if edid->extensions
is greater than one.

v2: split previous patch into two, and do CEA block parsing
in this one.
v3: simplify this patch based on previous change.
v4: refine patch v3.
v5: revert previous change.
Also check the cea pointer returned from drm_find_cea_extension().
If the driver gets the same cea pointer again, exit this routine.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Cc: Drew Davenport 
Cc: intel-gfx 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_edid.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 1251226d9284..ef65dd97d700 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4319,16 +4319,24 @@ static void drm_parse_y420cmdb_bitmap(struct 
drm_connector *connector,
 static int
 add_cea_modes(struct drm_connector *connector, struct edid *edid)
 {
-   const u8 *cea, *db, *hdmi = NULL, *video = NULL;
-   u8 dbl, hdmi_len, video_len = 0;
int modes = 0, ext_index = 0;
+   const u8 *cur_cea = NULL;
 
-   cea = drm_find_cea_extension(edid, &ext_index);
-   if (cea && cea_revision(cea) >= 3) {
+   for (;;) {
+   const u8 *cea, *db, *hdmi = NULL, *video = NULL;
+   u8 dbl, hdmi_len = 0, video_len = 0;
int i, start, end;
 
+   cea = drm_find_cea_extension(edid, &ext_index);
+   if (!cea || cea == cur_cea)
+   break;
+   cur_cea = cea;
+
+   if (cea_revision(cea) < 3)
+   continue;
+
if (cea_db_offsets(cea, &start, &end))
-   return 0;
+   continue;
 
for_each_cea_db(cea, i, start, end) {
db = &cea[i];
@@ -4350,15 +4358,15 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
  dbl - 1);
}
}
-   }
 
-   /*
-* We parse the HDMI VSDB after having added the cea modes as we will
-* be patching their flags when the sink supports stereo 3D.
-*/
-   if (hdmi)
-   modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video,
-   video_len);
+   /*
+* We parse the HDMI VSDB after having added the cea modes as we will
+* be patching their flags when the sink supports stereo 3D.
+*/
+   if (hdmi)
+   modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video,
+   video_len);
+   }
 
return modes;
 }
-- 
2.17.1
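
A minimal user-space sketch of the iteration pattern used in the patch above, assuming a flat EDID buffer of 128-byte blocks. The names find_ext() and count_cea_blocks() are hypothetical and not part of drm_edid.c. The sketch omits the DisplayID fallback of drm_find_cea_extension(), which is why it does not need the "cea == cur_cea" check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_LENGTH 128
#define CEA_EXT     0x02

/*
 * Hypothetical helper: return the first extension block tagged ext_id at
 * or after *ext_index, and advance *ext_index past it so that the next
 * call resumes the search. raw points at the base block; ext_count is the
 * number of extension blocks that follow it.
 */
static const uint8_t *find_ext(const uint8_t *raw, int ext_count,
                               uint8_t ext_id, int *ext_index)
{
    assert(raw != NULL && ext_index != NULL);

    for (int i = *ext_index; i < ext_count; i++) {
        const uint8_t *ext = raw + EDID_LENGTH * (i + 1);

        if (ext[0] == ext_id) {
            *ext_index = i + 1;
            return ext;
        }
    }

    return NULL;
}

/* Visit every CEA extension block by calling find_ext() until it fails. */
static int count_cea_blocks(const uint8_t *raw, int ext_count)
{
    int ext_index = 0, n = 0;

    while (find_ext(raw, ext_count, CEA_EXT, &ext_index))
        n++;

    return n;
}
```

Keeping the cursor in the caller means the search function itself stays stateless, which is the same reason the patch threads ext_index through every call site.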



[v8 3/5] drm/edid: read HF-EEODB ext block

2022-03-17 Thread Lee Shawn C
According to the HDMI 2.1 spec:

"The HDMI Forum EDID Extension Override Data Block (HF-EEODB)
is utilized by Sink Devices to provide an alternate method to
indicate an EDID Extension Block count larger than 1, while
avoiding the need to present a VESA Block Map in the first
E-EDID Extension Block."

It is mandatory for HDMI 2.1 protocol compliance as well.
This patch determines how many extension blocks the sink reports
through HF-EEODB and reads all of those blocks back.

v2: support to find CEA block, check EEODB block format, and return
available block number in drm_edid_read_hf_eeodb_blk_count().
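
A rough sketch of the HF-EEODB check and the resulting EDID size, for illustration only. It assumes the usual CTA data block header (tag in the top three bits of byte 0, payload length in the low five bits), the extended tag in byte 1, and the extension block count in byte 2. The byte-2 position is an assumption here. The helper names hf_eeodb_count() and edid_size() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_LENGTH       128
#define USE_EXTENDED_TAG  0x07
#define HF_EEODB_EXT_TAG  0x78
#define HF_EEODB_LENGTH   2

/*
 * Hypothetical helper: return the extension block count carried by an
 * HF-EEODB data block, or 0 if db is not an HF-EEODB.
 */
static size_t hf_eeodb_count(const uint8_t *db)
{
    assert(db != NULL);

    if ((db[0] >> 5) != USE_EXTENDED_TAG)
        return 0;
    if ((db[0] & 0x1f) != HF_EEODB_LENGTH)
        return 0;
    if (db[1] != HF_EEODB_EXT_TAG)
        return 0;

    return db[2]; /* assumed position of the block count */
}

/*
 * Total EDID size in bytes: the HF-EEODB count, when present, overrides
 * the extension count from the base block.
 */
static size_t edid_size(uint8_t base_ext_count, size_t eeodb_count)
{
    size_t blocks = eeodb_count ? eeodb_count : base_ext_count;

    return EDID_LENGTH * (1 + blocks);
}
```

With a base block that advertises one extension and an HF-EEODB that advertises three, edid_size() yields four blocks in total, which mirrors the size computation in drm_connector_update_edid_property() in this patch.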

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Cc: intel-gfx 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_connector.c |  8 +++-
 drivers/gpu/drm/drm_edid.c  | 71 +++--
 include/drm/drm_edid.h  |  2 +-
 3 files changed, 74 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index a50c82bc2b2f..16011023c12e 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2129,7 +2129,7 @@ int drm_connector_update_edid_property(struct 
drm_connector *connector,
   const struct edid *edid)
 {
struct drm_device *dev = connector->dev;
-   size_t size = 0;
+   size_t size = 0, hf_eeodb_blk_count;
int ret;
const struct edid *old_edid;
 
@@ -2137,8 +2137,12 @@ int drm_connector_update_edid_property(struct 
drm_connector *connector,
if (connector->override_edid)
return 0;
 
-   if (edid)
+   if (edid) {
size = EDID_LENGTH * (1 + edid->extensions);
+   hf_eeodb_blk_count = drm_edid_read_hf_eeodb_blk_count(edid);
+   if (hf_eeodb_blk_count)
+   size = EDID_LENGTH * (1 + hf_eeodb_blk_count);
+   }
 
/* Set the display info, using edid if available, otherwise
 * resetting the values to defaults. This duplicates the work
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index ef65dd97d700..890038758660 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1992,6 +1992,7 @@ struct edid *drm_do_get_edid(struct drm_connector 
*connector,
 {
int i, j = 0, valid_extensions = 0;
u8 *edid, *new;
+   size_t hf_eeodb_blk_count;
struct edid *override;
 
override = drm_get_override_edid(connector);
@@ -2051,7 +2052,35 @@ struct edid *drm_do_get_edid(struct drm_connector 
*connector,
}
 
kfree(edid);
+   return (struct edid *)new;
+   }
+
+   hf_eeodb_blk_count = drm_edid_read_hf_eeodb_blk_count((struct edid *)edid);
+   if (hf_eeodb_blk_count >= 2) {
+   new = krealloc(edid, (hf_eeodb_blk_count + 1) * EDID_LENGTH, GFP_KERNEL);
+   if (!new)
+   goto out;
edid = new;
+
+   valid_extensions = hf_eeodb_blk_count - 1;
+   for (j = 2; j <= hf_eeodb_blk_count; j++) {
+   u8 *block = edid + j * EDID_LENGTH;
+
+   for (i = 0; i < 4; i++) {
+   if (get_edid_block(data, block, j, EDID_LENGTH))
+   goto out;
+   if (drm_edid_block_valid(block, j, false, NULL))
+   break;
+   }
+
+   if (i == 4)
+   valid_extensions--;
+   }
+
+   if (valid_extensions != hf_eeodb_blk_count - 1) {
+   DRM_ERROR("Not able to retrieve proper EDID contain 
HF-EEODB data.\n");
+   goto out;
+   }
}
 
return (struct edid *)edid;
@@ -3315,15 +3344,17 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
 #define VIDEO_BLOCK 0x02
 #define VENDOR_BLOCK0x03
 #define SPEAKER_BLOCK  0x04
-#define HDR_STATIC_METADATA_BLOCK  0x6
-#define USE_EXTENDED_TAG 0x07
-#define EXT_VIDEO_CAPABILITY_BLOCK 0x00
+#define EXT_VIDEO_CAPABILITY_BLOCK 0x00
+#define HDR_STATIC_METADATA_BLOCK  0x06
+#define USE_EXTENDED_TAG   0x07
 #define EXT_VIDEO_DATA_BLOCK_420   0x0E
-#define EXT_VIDEO_CAP_BLOCK_Y420CMDB 0x0F
+#define EXT_VIDEO_CAP_BLOCK_Y420CMDB   0x0F
+#define EXT_VIDEO_HF_EEODB_DATA_BLOCK  0x78
 #define EDID_BASIC_AUDIO   (1 << 6)
 #define EDID_CEA_YCRCB444  (1 << 5)
 #define EDID_CEA_YCRCB422  (1 << 4)
 #define EDID_CEA_VCDB_QS   (1 << 6)
+#define HF_EEODB_LENGTH2
 
 /*
  * Search EDID for CEA extension block.
@@ -4273,9 +4304,41 @@ static bool cea_db_is_y420vdb(const u8 *db)
return true;
 }
 
+static bool cea_db_is_hdmi_forum_eeodb(const u8 *db)
+{
+   if (cea_db_tag(db) != USE_EXTENDED_TAG)
+   return false;
+
+   if (cea_db_payload_len

[v8 4/5] drm/edid: parse HF-EEODB CEA extension block

2022-03-17 Thread Lee Shawn C
While adding CEA modes, try to get the available EEODB block
count. Then use it to decide how many extension blocks to parse,
retrieve the CEA information and add more CEA modes.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Cc: intel-gfx 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_displayid.c |  5 -
 drivers/gpu/drm/drm_edid.c  | 35 +++--
 include/drm/drm_edid.h  |  2 +-
 3 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_displayid.c b/drivers/gpu/drm/drm_displayid.c
index 32da557b960f..dc649a9efaa2 100644
--- a/drivers/gpu/drm/drm_displayid.c
+++ b/drivers/gpu/drm/drm_displayid.c
@@ -37,7 +37,10 @@ static const u8 *drm_find_displayid_extension(const struct 
edid *edid,
  int *length, int *idx,
  int *ext_index)
 {
-   const u8 *displayid = drm_find_edid_extension(edid, DISPLAYID_EXT, 
ext_index);
+   const u8 *displayid = drm_find_edid_extension(edid,
+ DISPLAYID_EXT,
+ ext_index,
+ edid->extensions);
const struct displayid_header *base;
int ret;
 
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 890038758660..40c192587f0a 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3360,23 +3360,23 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
  * Search EDID for CEA extension block.
  */
 const u8 *drm_find_edid_extension(const struct edid *edid,
- int ext_id, int *ext_index)
+ int ext_id, int *ext_index, int ext_blk_num)
 {
const u8 *edid_ext = NULL;
int i;
 
/* No EDID or EDID extensions */
-   if (edid == NULL || edid->extensions == 0)
+   if (edid == NULL || edid->extensions == 0 || *ext_index >= ext_blk_num)
return NULL;
 
/* Find CEA extension */
-   for (i = *ext_index; i < edid->extensions; i++) {
+   for (i = *ext_index; i < ext_blk_num; i++) {
edid_ext = (const u8 *)edid + EDID_LENGTH * (i + 1);
if (edid_ext[0] == ext_id)
break;
}
 
-   if (i >= edid->extensions)
+   if (i >= ext_blk_num)
return NULL;
 
*ext_index = i + 1;
@@ -3384,14 +3384,15 @@ const u8 *drm_find_edid_extension(const struct edid 
*edid,
return edid_ext;
 }
 
-static const u8 *drm_find_cea_extension(const struct edid *edid, int 
*ext_index)
+static const u8 *drm_find_cea_extension(const struct edid *edid,
+   int *ext_index, int ext_blk_num)
 {
const struct displayid_block *block;
struct displayid_iter iter;
const u8 *cea;
 
/* Look for a CEA extension block from ext_index */
-   cea = drm_find_edid_extension(edid, CEA_EXT, ext_index);
+   cea = drm_find_edid_extension(edid, CEA_EXT, ext_index, ext_blk_num);
if (cea)
return cea;
 
@@ -3675,7 +3676,7 @@ add_alternate_cea_modes(struct drm_connector *connector, 
struct edid *edid)
int modes = 0, ext_index = 0;
 
/* Don't add CEA modes if the CEA extension block is missing */
-   if (!drm_find_cea_extension(edid, &ext_index))
+   if (!drm_find_cea_extension(edid, &ext_index, edid->extensions))
return 0;
 
/*
@@ -4327,7 +4328,7 @@ size_t drm_edid_read_hf_eeodb_blk_count(const struct edid 
*edid)
int i, start, end, ext_index = 0;
 
if (edid->extensions) {
-   cea = drm_find_cea_extension(edid, &ext_index);
+   cea = drm_find_cea_extension(edid, &ext_index, 
edid->extensions);
 
if (cea && !cea_db_offsets(cea, &start, &end))
for_each_cea_db(cea, i, start, end)
@@ -4384,13 +4385,17 @@ add_cea_modes(struct drm_connector *connector, struct 
edid *edid)
 {
int modes = 0, ext_index = 0;
const u8 *cur_cea = NULL;
+   int ext_blk_num = drm_edid_read_hf_eeodb_blk_count(edid);
+
+   if (!ext_blk_num)
+   ext_blk_num = edid->extensions;
 
for (;;) {
const u8 *cea, *db, *hdmi = NULL, *video = NULL;
u8 dbl, hdmi_len = 0, video_len = 0;
int i, start, end;
 
-   cea = drm_find_cea_extension(edid, &ext_index);
+   cea = drm_find_cea_extension(edid, &ext_index, ext_blk_num);
if (!cea || cea == cur_cea)
break;
cur_cea = cea;
@@ -4640,7 +4645,7 @@ static void drm_edid_to_eld(struct drm_connector 
*connector, struct edid *edid)
if (!edid)
return;
 
-   cea = drm_find_cea_extension(edid, &ext_index);
+   cea = drm_find_cea_extension(edi

[v8 5/5] drm/edid: check for HF-SCDB block

2022-03-17 Thread Lee Shawn C
Find HF-SCDB information in the CEA extension block and retrieve
the Max_TMDS_Character_Rate supported by the sink device.

v2: HF-SCDB and HF-VSDB carry the same SCDS data. Reuse
drm_parse_hdmi_forum_vsdb() to parse this packet.

Cc: Jani Nikula 
Cc: Ville Syrjala 
Cc: Ankit Nautiyal 
Cc: intel-gfx 
Signed-off-by: Lee Shawn C 
---
 drivers/gpu/drm/drm_edid.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 40c192587f0a..64d13ba0f701 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3350,6 +3350,7 @@ add_detailed_modes(struct drm_connector *connector, 
struct edid *edid,
 #define EXT_VIDEO_DATA_BLOCK_420   0x0E
 #define EXT_VIDEO_CAP_BLOCK_Y420CMDB   0x0F
 #define EXT_VIDEO_HF_EEODB_DATA_BLOCK  0x78
+#define EXT_VIDEO_HF_SCDB_DATA_BLOCK   0x79
 #define EDID_BASIC_AUDIO   (1 << 6)
 #define EDID_CEA_YCRCB444  (1 << 5)
 #define EDID_CEA_YCRCB422  (1 << 4)
@@ -4277,6 +4278,20 @@ static bool cea_db_is_vcdb(const u8 *db)
return true;
 }
 
+static bool cea_db_is_hdmi_forum_scdb(const u8 *db)
+{
+   if (cea_db_tag(db) != USE_EXTENDED_TAG)
+   return false;
+
+   if (cea_db_payload_len(db) < 7)
+   return false;
+
+   if (cea_db_extended_tag(db) != EXT_VIDEO_HF_SCDB_DATA_BLOCK)
+   return false;
+
+   return true;
+}
+
 static bool cea_db_is_y420cmdb(const u8 *db)
 {
if (cea_db_tag(db) != USE_EXTENDED_TAG)
@@ -5274,7 +5289,8 @@ static void drm_parse_cea_ext(struct drm_connector 
*connector,
 
if (cea_db_is_hdmi_vsdb(db))
drm_parse_hdmi_vsdb_video(connector, db);
-   if (cea_db_is_hdmi_forum_vsdb(db))
+   if (cea_db_is_hdmi_forum_vsdb(db) ||
+   cea_db_is_hdmi_forum_scdb(db))
drm_parse_hdmi_forum_vsdb(connector, db);
if (cea_db_is_microsoft_vsdb(db))
drm_parse_microsoft_vsdb(connector, db);
-- 
2.17.1



Re: [Freedreno] [PATCH v3 5/5] drm/msm: allow compile time selection of driver components

2022-03-17 Thread Dmitry Baryshkov

On 16/03/2022 20:26, Abhinav Kumar wrote:



On 3/16/2022 12:31 AM, Dmitry Baryshkov wrote:

On 16/03/2022 03:28, Abhinav Kumar wrote:



On 3/3/2022 7:21 PM, Dmitry Baryshkov wrote:

MSM DRM driver already allows one to compile out the DP or DSI support.
Add support for disabling other features like MDP4/MDP5/DPU drivers or
direct HDMI output support.

Suggested-by: Stephen Boyd 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/Kconfig    | 50 --
  drivers/gpu/drm/msm/Makefile   | 18 ++--
  drivers/gpu/drm/msm/msm_drv.h  | 33 ++
  drivers/gpu/drm/msm/msm_mdss.c | 13 +++--
  4 files changed, 106 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 9b019598e042..3735fd41eb3b 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO
    Only use this if you are a driver developer.  This should *not*
    be enabled for production kernels.  If unsure, say N.
-config DRM_MSM_HDMI_HDCP
-    bool "Enable HDMI HDCP support in MSM DRM driver"
+config DRM_MSM_MDSS
+    bool
+    depends on DRM_MSM
+    default n

Shouldn't DRM_MSM_MDSS default to y?


No, it will be selected either by MDP5 or by DPU1. It is not used if 
DRM_MSM is compiled with just MDP4 or headless support in mind.

Ok got it.




Another question is the compilation validation of the combinations of 
these.


So we need to try:

1) DRM_MSM_MDSS + DRM_MSM_MDP4
2) DRM_MSM_MDSS + DRM_MSM_MDP5
3) DRM_MSM_MDSS + DRM_MSM_DPU

Earlier, since all of them were compiled together, any 
inter-dependencies would not show up. Now that we are separating them 
out, I just wanted to make sure each of the combos compiles.


I think you meant:
- headless
- MDP4
- MDP5
- DPU1
- MDP4 + MDP5
- MDP4 + DPU1
- MDP5 + DPU1
- all three drivers


Yes, each of these combinations.


Each of them was tested.




+
+config DRM_MSM_MDP4
+    bool "Enable MDP4 support in MSM DRM driver"
  depends on DRM_MSM
  default y
  help
-  Choose this option to enable HDCP state machine
+  Compile in support for the Mobile Display Processor v4 (MDP4) in
+  the MSM DRM driver. It is the older display controller found in
+  devices using APQ8064/MSM8960/MSM8x60 platforms.
+
+config DRM_MSM_MDP5
+    bool "Enable MDP5 support in MSM DRM driver"
+    depends on DRM_MSM
+    select DRM_MSM_MDSS
+    default y
+    help
+  Compile in support for the Mobile Display Processor v5 (MDP5) in
+  the MSM DRM driver. It is the display controller found in devices
+  using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms.
+
+config DRM_MSM_DPU
+    bool "Enable DPU support in MSM DRM driver"
+    depends on DRM_MSM
+    select DRM_MSM_MDSS
+    default y
+    help
+  Compile in support for the Display Processing Unit in
+  the MSM DRM driver. It is the display controller found in devices
+  using e.g. SDM845 and newer platforms.
  config DRM_MSM_DP
  bool "Enable DisplayPort support in MSM DRM driver"
@@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY
  help
    Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on
    the platform.
+
+config DRM_MSM_HDMI
+    bool "Enable HDMI support in MSM DRM driver"
+    depends on DRM_MSM
+    default y
+    help
+  Compile in support for the HDMI output in the MSM DRM driver. It can
+  be a primary or a secondary display on the device. Note that this is used
+  only for the direct HDMI output. If the device outputs HDMI data
+  through some kind of DSI-to-HDMI bridge, this option can be disabled.
+
+config DRM_MSM_HDMI_HDCP
+    bool "Enable HDMI HDCP support in MSM DRM driver"
+    depends on DRM_MSM && DRM_MSM_HDMI
+    default y
+    help
+  Choose this option to enable HDCP state machine
diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index e76927b42033..5fe9c20ab9ee 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -16,6 +16,8 @@ msm-y := \
  adreno/a6xx_gpu.o \
  adreno/a6xx_gmu.o \
  adreno/a6xx_hfi.o \
+
+msm-$(CONFIG_DRM_MSM_HDMI) += \
  hdmi/hdmi.o \
  hdmi/hdmi_audio.o \
  hdmi/hdmi_bridge.o \
@@ -27,8 +29,8 @@ msm-y := \
  hdmi/hdmi_phy_8x60.o \
  hdmi/hdmi_phy_8x74.o \
  hdmi/hdmi_pll_8960.o \
-    disp/mdp_format.o \
-    disp/mdp_kms.o \
+
+msm-$(CONFIG_DRM_MSM_MDP4) += \
  disp/mdp4/mdp4_crtc.o \
  disp/mdp4/mdp4_dtv_encoder.o \
  disp/mdp4/mdp4_lcdc_encoder.o \
@@ -37,6 +39,8 @@ msm-y := \
  disp/mdp4/mdp4_irq.o \
  disp/mdp4/mdp4_kms.o \
  disp/mdp4/mdp4_plane.o \
+
+msm-$(CONFIG_DRM_MSM_MDP5) += \
  disp/mdp5/mdp5_cfg.o \
  disp/mdp5/mdp5_ctl.o \
  disp/mdp5/mdp5_crtc.o \
@@ -47,6 +51,8 @@ msm-y := \
  disp/mdp5/mdp5_mixer.o \
  disp/mdp5/mdp5_plane.o \
  disp/mdp5/mdp5_smp.o \
+
+msm-$(CONFIG_DRM_MSM_DPU) +=

[PATCH 5.4 38/43] drm/vrr: Set VRR capable prop only if it is attached to connector

2022-03-17 Thread Greg Kroah-Hartman
From: Manasi Navare 

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

The VRR capable property is not attached to the connector by default.
It is attached only if VRR is supported.
So if the driver calls the drm core set-property function without
the property being attached, it causes a NULL dereference.

Cc: Jani Nikula 
Cc: Ville Syrjälä 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare 
Reviewed-by: Ville Syrjälä 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2337b3827e6a..11a81e8ba963 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1984,6 +1984,9 @@ EXPORT_SYMBOL(drm_connector_attach_max_bpc_property);
 void drm_connector_set_vrr_capable_property(
struct drm_connector *connector, bool capable)
 {
+   if (!connector->vrr_capable_property)
+   return;
+
drm_object_property_set_value(&connector->base,
  connector->vrr_capable_property,
  capable);
-- 
2.34.1
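
The guard added by this patch can be modelled in user space as follows. This is only a sketch: struct property, struct connector and set_vrr_capable() are hypothetical stand-ins for the DRM types, and the function returns a bool purely so that the early-return path is observable, whereas the kernel function returns void.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DRM property and connector objects. */
struct property {
    uint64_t value;
};

struct connector {
    struct property *vrr_capable_property; /* NULL until attached */
};

/*
 * Set the VRR capable value only if the property was attached.
 * Returns false when the call was skipped because nothing is attached.
 */
static bool set_vrr_capable(struct connector *connector, bool capable)
{
    assert(connector != NULL);

    if (!connector->vrr_capable_property)
        return false;

    connector->vrr_capable_property->value = capable;
    return true;
}
```

Callers that never attached the property can then call the setter unconditionally, which is the behaviour the fix restores.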





[PATCH 5.10 16/23] drm/vrr: Set VRR capable prop only if it is attached to connector

2022-03-17 Thread Greg Kroah-Hartman
From: Manasi Navare 

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

The VRR capable property is not attached to the connector by default.
It is attached only if VRR is supported.
So if the driver calls the drm core set-property function without
the property being attached, it causes a NULL dereference.

Cc: Jani Nikula 
Cc: Ville Syrjälä 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare 
Reviewed-by: Ville Syrjälä 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 717c4e7271b0..5163433ac561 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2155,6 +2155,9 @@ EXPORT_SYMBOL(drm_connector_attach_max_bpc_property);
 void drm_connector_set_vrr_capable_property(
struct drm_connector *connector, bool capable)
 {
+   if (!connector->vrr_capable_property)
+   return;
+
drm_object_property_set_value(&connector->base,
  connector->vrr_capable_property,
  capable);
-- 
2.34.1





[PATCH 5.15 18/25] drm/vrr: Set VRR capable prop only if it is attached to connector

2022-03-17 Thread Greg Kroah-Hartman
From: Manasi Navare 

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

The VRR capable property is not attached to the connector by default.
It is attached only if VRR is supported.
So if the driver calls the drm core set-property function without
the property being attached, it causes a NULL dereference.

Cc: Jani Nikula 
Cc: Ville Syrjälä 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare 
Reviewed-by: Ville Syrjälä 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2ba257b1ae20..e9b7926d9b66 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2233,6 +2233,9 @@ EXPORT_SYMBOL(drm_connector_atomic_hdr_metadata_equal);
 void drm_connector_set_vrr_capable_property(
struct drm_connector *connector, bool capable)
 {
+   if (!connector->vrr_capable_property)
+   return;
+
drm_object_property_set_value(&connector->base,
  connector->vrr_capable_property,
  capable);
-- 
2.34.1





[PATCH 5.16 22/28] drm/vrr: Set VRR capable prop only if it is attached to connector

2022-03-17 Thread Greg Kroah-Hartman
From: Manasi Navare 

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

The VRR capable property is not attached to the connector by default.
It is attached only if VRR is supported.
So if the driver calls the drm core set-property function without
the property being attached, it causes a NULL dereference.

Cc: Jani Nikula 
Cc: Ville Syrjälä 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare 
Reviewed-by: Ville Syrjälä 
Link: 
https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin 
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 52e20c68813b..6ae26e7d3dec 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2275,6 +2275,9 @@ EXPORT_SYMBOL(drm_connector_atomic_hdr_metadata_equal);
 void drm_connector_set_vrr_capable_property(
struct drm_connector *connector, bool capable)
 {
+   if (!connector->vrr_capable_property)
+   return;
+
drm_object_property_set_value(&connector->base,
  connector->vrr_capable_property,
  capable);
-- 
2.34.1





[PATCH v4 0/3] drm/bridge: ti-sn65dsi86: Support non-eDP DisplayPort connectors

2022-03-17 Thread Kieran Bingham
Implement support for non eDP connectors on the TI-SN65DSI86 bridge, and
provide IRQ based hotplug detect to identify when the connector is
present.

no-hpd is extended to be the default behaviour for non DisplayPort
connectors.

This series is based upon Sam Ravnborg's and Rob Clark's series [0], which
adds DRM_BRIDGE_STATE_OPS and NO_CONNECTOR support to the SN65DSI86.
Some extra modifications have been made on top of Sam's
series to fix compile breakage and the NO_CONNECTOR support.

A full branch with these changes is available at [1]

As in v3, I have not taken ownership of the patches at [0], so it would
be good to hear if Sam has any plans to revive or push this series.
These patches are not expected to be integrated without [0].

[0] https://lore.kernel.org/all/20220206154405.124-1-...@ravnborg.org/
[1] git://git.kernel.org/pub/scm/linux/kernel/git/kbingham/rcar.git
branch: kbingham/drm-misc/next/sn65dsi86/hpd

Kieran Bingham (1):
  drm/bridge: ti-sn65dsi86: Support hotplug detection

Laurent Pinchart (2):
  drm/bridge: ti-sn65dsi86: Support DisplayPort (non-eDP) mode
  drm/bridge: ti-sn65dsi86: Implement bridge connector operations

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 191 --
 1 file changed, 176 insertions(+), 15 deletions(-)

-- 
2.32.0



[PATCH v4 1/3] drm/bridge: ti-sn65dsi86: Support DisplayPort (non-eDP) mode

2022-03-17 Thread Kieran Bingham
From: Laurent Pinchart 

Despite the SN65DSI86 being an eDP bridge, on some systems its output is
routed to a DisplayPort connector. Enable DisplayPort mode when the next
component in the display pipeline is detected as a DisplayPort
connector, and disable eDP features in that case.

Signed-off-by: Laurent Pinchart 
Reworked to set bridge type based on the next bridge/connector.
Signed-off-by: Kieran Bingham 
Reviewed-by: Laurent Pinchart 
Reviewed-by: Douglas Anderson 
--
Changes since v1/RFC:
 - Rebased on top of "drm/bridge: ti-sn65dsi86: switch to
   devm_drm_of_get_bridge"
 - eDP/DP mode determined from the next bridge connector type.

Changes since v2:
 - Remove setting of Standard DP Scrambler Seed. (It's read-only).
 - Prevent setting DP_EDP_CONFIGURATION_SET in
   ti_sn_bridge_atomic_enable()
 - Use Doug's suggested text for disabling ASSR on DP mode.

Changes since v3:
 - Remove ASSR_CONTROL definition

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index c892ecba91c7..c5f020a2d0d3 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -93,6 +93,8 @@
 #define SN_DATARATE_CONFIG_REG 0x94
 #define  DP_DATARATE_MASK  GENMASK(7, 5)
 #define  DP_DATARATE(x)((x) << 5)
+#define SN_TRAINING_SETTING_REG0x95
+#define  SCRAMBLE_DISABLE  BIT(4)
 #define SN_ML_TX_MODE_REG  0x96
 #define  ML_TX_MAIN_LINK_OFF   0
 #define  ML_TX_NORMAL_MODE BIT(0)
@@ -982,6 +984,17 @@ static int ti_sn_link_training(struct ti_sn65dsi86 *pdata, 
int dp_rate_idx,
goto exit;
}
 
+   /*
+* eDP panels use an Alternate Scrambler Seed compared to displays
+* hooked up via a full DisplayPort connector. SN65DSI86 only supports
+* the alternate scrambler seed, not the normal one, so the only way we
+* can support full DisplayPort displays is by fully turning off the
+* scrambler.
+*/
+   if (pdata->bridge.type == DRM_MODE_CONNECTOR_DisplayPort)
+   regmap_update_bits(pdata->regmap, SN_TRAINING_SETTING_REG,
+  SCRAMBLE_DISABLE, SCRAMBLE_DISABLE);
+
/*
 * We'll try to link train several times.  As part of link training
 * the bridge chip will write DP_SET_POWER_D0 to DP_SET_POWER.  If
@@ -1046,12 +1059,13 @@ static void ti_sn_bridge_atomic_enable(struct 
drm_bridge *bridge,
 
/*
 * The SN65DSI86 only supports ASSR Display Authentication method and
-* this method is enabled by default. An eDP panel must support this
+* this method is enabled for eDP panels. An eDP panel must support this
 * authentication method. We need to enable this method in the eDP panel
 * at DisplayPort address 0x0010A prior to link training.
 */
-   drm_dp_dpcd_writeb(&pdata->aux, DP_EDP_CONFIGURATION_SET,
-  DP_ALTERNATE_SCRAMBLER_RESET_ENABLE);
+   if (pdata->bridge.type == DRM_MODE_CONNECTOR_eDP)
+   drm_dp_dpcd_writeb(&pdata->aux, DP_EDP_CONFIGURATION_SET,
+  DP_ALTERNATE_SCRAMBLER_RESET_ENABLE);
 
/* Set the DP output format (18 bpp or 24 bpp) */
val = (ti_sn_bridge_get_bpp(old_bridge_state) == 18) ? BPP_18_RGB : 0;
@@ -1215,6 +1229,8 @@ static int ti_sn_bridge_probe(struct auxiliary_device 
*adev,
 
pdata->bridge.funcs = &ti_sn_bridge_funcs;
pdata->bridge.of_node = np;
+   pdata->bridge.type = pdata->next_bridge->type == DRM_MODE_CONNECTOR_DisplayPort
+  ? DRM_MODE_CONNECTOR_DisplayPort : DRM_MODE_CONNECTOR_eDP;
 
drm_bridge_add(&pdata->bridge);
 
-- 
2.32.0
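
The two decisions made by the patch above (which bridge type to report, and whether to turn the scrambler off) can be sketched without any kernel dependencies. The enum, pick_bridge_type(), update_bits() and training_setting() are hypothetical names. update_bits() models the read-modify-write that regmap_update_bits() performs on a register value.

```c
#include <assert.h>
#include <stdint.h>

#define SCRAMBLE_DISABLE (1u << 4)

/* Hypothetical stand-ins for the DRM_MODE_CONNECTOR_* constants. */
enum connector_type {
    CONNECTOR_EDP,
    CONNECTOR_DISPLAYPORT,
    CONNECTOR_HDMI,
};

/* Report DisplayPort only when the next bridge is one, else stay eDP. */
static enum connector_type pick_bridge_type(enum connector_type next)
{
    return next == CONNECTOR_DISPLAYPORT ? CONNECTOR_DISPLAYPORT
                                         : CONNECTOR_EDP;
}

/* Read-modify-write: replace only the bits selected by mask. */
static uint8_t update_bits(uint8_t reg, uint8_t mask, uint8_t val)
{
    return (uint8_t)((reg & ~mask) | (val & mask));
}

/* Disable the scrambler for full DisplayPort, leave eDP untouched. */
static uint8_t training_setting(uint8_t reg, enum connector_type type)
{
    if (type == CONNECTOR_DISPLAYPORT)
        reg = update_bits(reg, SCRAMBLE_DISABLE, SCRAMBLE_DISABLE);

    return reg;
}
```

Anything that is not explicitly a DisplayPort connector keeps the existing eDP behaviour, so panels that worked before the patch are not affected.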



[PATCH v4 2/3] drm/bridge: ti-sn65dsi86: Implement bridge connector operations

2022-03-17 Thread Kieran Bingham
From: Laurent Pinchart 

Implement the bridge connector-related .get_edid() operation, and report
the related bridge capabilities and type.

Signed-off-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 
Reviewed-by: Laurent Pinchart 
---
Changes since v1:

- The connector .get_modes() operation doesn't rely on EDID anymore,
  __ti_sn_bridge_get_edid() and ti_sn_bridge_get_edid() got merged
  together
 - Fix on top of Sam Ravnborg's DRM_BRIDGE_STATE_OPS

Changes since v2: [Kieran]
 - Only support EDID on DRM_MODE_CONNECTOR_DisplayPort modes.

Changes since v3: [Kieran]
 - Remove PM calls in ti_sn_bridge_get_edid() and simplify

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index c5f020a2d0d3..910bf3d41d2f 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -1134,10 +1134,19 @@ static void ti_sn_bridge_atomic_post_disable(struct 
drm_bridge *bridge,
pm_runtime_put_sync(pdata->dev);
 }
 
+static struct edid *ti_sn_bridge_get_edid(struct drm_bridge *bridge,
+ struct drm_connector *connector)
+{
+   struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge);
+
+   return drm_get_edid(connector, &pdata->aux.ddc);
+}
+
 static const struct drm_bridge_funcs ti_sn_bridge_funcs = {
.attach = ti_sn_bridge_attach,
.detach = ti_sn_bridge_detach,
.mode_valid = ti_sn_bridge_mode_valid,
+   .get_edid = ti_sn_bridge_get_edid,
.atomic_pre_enable = ti_sn_bridge_atomic_pre_enable,
.atomic_enable = ti_sn_bridge_atomic_enable,
.atomic_disable = ti_sn_bridge_atomic_disable,
@@ -1232,6 +1241,9 @@ static int ti_sn_bridge_probe(struct auxiliary_device 
*adev,
pdata->bridge.type = pdata->next_bridge->type == DRM_MODE_CONNECTOR_DisplayPort
   ? DRM_MODE_CONNECTOR_DisplayPort : DRM_MODE_CONNECTOR_eDP;
 
+   if (pdata->bridge.type == DRM_MODE_CONNECTOR_DisplayPort)
+   pdata->bridge.ops = DRM_BRIDGE_OP_EDID;
+
drm_bridge_add(&pdata->bridge);
 
ret = ti_sn_attach_host(pdata);
-- 
2.32.0



[PATCH v4 3/3] drm/bridge: ti-sn65dsi86: Support hotplug detection

2022-03-17 Thread Kieran Bingham
When the SN65DSI86 is used in DisplayPort mode, its output is likely
routed to a DisplayPort connector, which can benefit from hotplug
detection. Support it in such cases, with both polling mode and IRQ
based detection.

The implementation is limited to the bridge operations, as the connector
operations are legacy and new users should use
DRM_BRIDGE_ATTACH_NO_CONNECTOR.
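
As a rough illustration of the polling path, the connection state can be derived from the debounced HPD bit in SN_HPD_DISABLE_REG. This sketch assumes that a set HPD_DEBOUNCED_STATE bit means a sink is present. The enum and hpd_decode() are hypothetical and operate on a register value that has already been read.

```c
#include <assert.h>
#include <stdint.h>

#define HPD_DISABLE          (1u << 0)
#define HPD_DEBOUNCED_STATE  (1u << 4)

/* Hypothetical stand-in for enum drm_connector_status. */
enum hpd_status {
    HPD_DISCONNECTED,
    HPD_CONNECTED,
};

/* Decode a value read from SN_HPD_DISABLE_REG into a connection state. */
static enum hpd_status hpd_decode(uint8_t hpd_reg)
{
    return (hpd_reg & HPD_DEBOUNCED_STATE) ? HPD_CONNECTED
                                           : HPD_DISCONNECTED;
}
```
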

Signed-off-by: Laurent Pinchart 
Signed-off-by: Kieran Bingham 
---
Changes since v1:

- Document the no_hpd field
- Rely on the SN_HPD_DISABLE_REG default value in the HPD case
- Add a TODO comment regarding IRQ support
[Kieran]
- Fix spelling s/assrted/asserted/
- Only enable HPD on DisplayPort connector.
- Support IRQ based hotplug detect

Changes since v2: [Kieran]
 - Use unsigned int for values read by regmap
 - Update HPD support warning message
 - Only enable OP_HPD if IRQ support enabled.
 - Only register IRQ handler during ti_sn_bridge_probe()
 - Store IRQ in the struct ti_sn65dsi86
 - Register IRQ only when !no-hpd
 - Refactor DRM_BRIDGE_OP_DETECT and DRM_BRIDGE_OP_HPD handling

Since v3:
 - Fix commit message
 - Remove stray debug print
 - initialise val in case of regmap read error in ti_sn_bridge_detect
 - Ensure pm-runtime reference held for ti_sn_bridge_detect
 - Reset status immediately after reading to reduce risk of lost
   interrupts during ti_sn65dsi86_irq_handler()
 - Reset only the IRQ bits set during ti_sn65dsi86_irq_handler()
 - Enable / disable IRQ during hpd_{enable,disable}
   This ensures the handler completes before it is disabled.
 - Extra comments to detail the notification process in
   ti_sn65dsi86_irq_handler()
 - Move SN_IRQ_EN_REG handling to hpd_{enable,disable} calls.

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 159 +++---
 1 file changed, 146 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c 
b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
index 910bf3d41d2f..0cc0409dcdd4 100644
--- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c
+++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c
@@ -69,6 +69,7 @@
 #define  BPP_18_RGB   BIT(0)
 #define SN_HPD_DISABLE_REG 0x5C
 #define  HPD_DISABLE   BIT(0)
+#define  HPD_DEBOUNCED_STATE   BIT(4)
 #define SN_GPIO_IO_REG 0x5E
 #define  SN_GPIO_INPUT_SHIFT   4
 #define  SN_GPIO_OUTPUT_SHIFT  0
@@ -105,10 +106,24 @@
 #define SN_PWM_EN_INV_REG  0xA5
 #define  SN_PWM_INV_MASK   BIT(0)
 #define  SN_PWM_EN_MASK   BIT(1)
+#define SN_IRQ_EN_REG  0xE0
+#define  IRQ_EN   BIT(0)
+#define SN_IRQ_HPD_REG 0xE6
+#define  IRQ_HPD_EN   BIT(0)
+#define  IRQ_HPD_INSERTION_EN  BIT(1)
+#define  IRQ_HPD_REMOVAL_EN   BIT(2)
+#define  IRQ_HPD_REPLUG_EN BIT(3)
+#define  IRQ_HPD_PLL_UNLOCK_EN BIT(5)
 #define SN_AUX_CMD_STATUS_REG  0xF4
 #define  AUX_IRQ_STATUS_AUX_RPLY_TOUT  BIT(3)
 #define  AUX_IRQ_STATUS_AUX_SHORT  BIT(5)
 #define  AUX_IRQ_STATUS_NAT_I2C_FAIL   BIT(6)
+#define SN_IRQ_HPD_STATUS_REG  0xF5
+#define  IRQ_HPD_STATUS   BIT(0)
+#define  IRQ_HPD_INSERTION_STATUS  BIT(1)
+#define  IRQ_HPD_REMOVAL_STATUS   BIT(2)
+#define  IRQ_HPD_REPLUG_STATUS BIT(3)
+#define  IRQ_PLL_UNLOCK   BIT(5)
 
 #define MIN_DSI_CLK_FREQ_MHZ   40
 
@@ -167,6 +182,12 @@
  * @pwm_enabled:  Used to track if the PWM signal is currently enabled.
  * @pwm_pin_busy: Track if GPIO4 is currently requested for GPIO or PWM.
  * @pwm_refclk_freq: Cache for the reference clock input to the PWM.
+ *
+ * @no_hpd:   Disable hot-plug detection as instructed by device tree (used
+ *            for instance for eDP panels whose HPD signal won't be asserted
+ *            until the panel is turned on, and is thus not usable for
+ *            downstream device detection).
+ * @irq:  IRQ number for the device.
  */
 struct ti_sn65dsi86 {
struct auxiliary_device bridge_aux;
@@ -201,6 +222,9 @@ struct ti_sn65dsi86 {
atomic_t pwm_pin_busy;
 #endif
unsigned int pwm_refclk_freq;
+
+   bool no_hpd;
+   int irq;
 };
 
 static const struct regmap_range ti_sn65dsi86_volatile_ranges[] = {
@@ -315,23 +339,25 @@ static void ti_sn65dsi86_enable_comms(struct ti_sn65dsi86 
*pdata)
ti_sn_bridge_set_refclk_freq(pdata);
 
/*
-* HPD on this bridge chip is a bit useless.  This is an eDP bridge
-* so the HPD is an internal signal that's only there to signal that
-* the panel is done 
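
Illustrative sketch, not part of the patch: the changelog above mentions
initialising val before the register read in ti_sn_bridge_detect(),
presumably so that a failed read still yields a defined result. The
userspace C below shows that pattern with the SN_HPD_DISABLE_REG and
HPD_DEBOUNCED_STATE values from the patch. All sketch_* names and the
reader callbacks are hypothetical stand-ins for regmap; the real driver
code may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register address and bit position taken from the patch. */
#define SN_HPD_DISABLE_REG   0x5C
#define HPD_DEBOUNCED_STATE  (1u << 4)

/* Hypothetical stand-in for regmap_read(): returns 0 on success. */
typedef int (*sketch_reg_read_fn)(uint8_t reg, unsigned int *val);

/* Mock reader: a sink is attached, so the debounced bit is set. */
static int sketch_read_connected(uint8_t reg, unsigned int *val)
{
    (void)reg;
    *val = HPD_DEBOUNCED_STATE;
    return 0;
}

/* Mock reader: the transfer fails and *val is left untouched. */
static int sketch_read_error(uint8_t reg, unsigned int *val)
{
    (void)reg;
    (void)val;
    return -1;
}

/* True when the debounced HPD state reports a connected sink. */
static bool sketch_hpd_connected(sketch_reg_read_fn read_reg)
{
    unsigned int val = 0; /* defined result even if the read fails */

    assert(read_reg != NULL);
    read_reg(SN_HPD_DISABLE_REG, &val);
    return (val & HPD_DEBOUNCED_STATE) != 0;
}
```

With val preset to 0, the failing reader is reported as "not connected"
instead of returning whatever happened to be on the stack.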

[PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Lee Jones
Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones 
---
 drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
spin_unlock(&dev->smi_lock);
 
synchronize_rcu();
+
+   spin_lock(&client->lock);
kfifo_free(&client->fifo);
kfree(client);
+   spin_unlock(&client->lock);
 
return 0;
 }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
return ret;
}
 
+   spin_lock(&client->lock);
ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
   O_RDWR);
if (ret < 0) {
kfifo_free(&client->fifo);
kfree(client);
+   spin_unlock(&client->lock);
return ret;
}
*fd = ret;
@@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
spin_lock(&dev->smi_lock);
list_add_rcu(&client->list, &dev->smi_clients);
spin_unlock(&dev->smi_lock);
+   spin_unlock(&client->lock);
 
return 0;
 }
-- 
2.35.1.894.gb6a874cedc-goog



Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling

2022-03-17 Thread Jason Gunthorpe
On Thu, Mar 17, 2022 at 09:13:50AM +0100, David Hildenbrand wrote:
> On 17.03.22 03:54, Alistair Popple wrote:
> > Felix Kuehling  writes:
> > 
> >> On 2022-03-11 04:16, David Hildenbrand wrote:
> >>> On 10.03.22 18:26, Alex Sierra wrote:
> >>>> DEVICE_COHERENT pages introduce a subtle distinction in the way
> >>>> "normal" pages can be used by various callers throughout the kernel.
> >>>> They behave like normal pages for purposes of mapping in CPU page
> >>>> tables, and for COW. But they do not support LRU lists, NUMA
> >>>> migration or THP. Therefore we split vm_normal_page into two
> >>>> functions vm_normal_any_page and vm_normal_lru_page. The latter will
> >>>> only return pages that can be put on an LRU list and that support
> >>>> NUMA migration, KSM and THP.
> >>>>
> >>>> We also introduced a FOLL_LRU flag that adds the same behaviour to
> >>>> follow_page and related APIs, to allow callers to specify that they
> >>>> expect to put pages on an LRU list.
> >>>>
> >>> I still don't see the need for s/vm_normal_page/vm_normal_any_page/. And
> >>> as this patch is dominated by that change, I'd suggest (again) to just
> >>> drop it as I don't see any value of that renaming. No specifier implies
> >>> any.
> >>
> >> OK. If nobody objects, we can adopt that naming convention.
> > 
> > I'd prefer we avoid the churn too, but I don't think we should make
> > vm_normal_page() the equivalent of vm_normal_any_page(). It would mean
> > vm_normal_page() would return non-LRU device coherent pages, but to me at
> > least device coherent pages seem special and not what I'd expect from a
> > function with "normal" in the name.
> > 
> > So I think it would be better to s/vm_normal_lru_page/vm_normal_page/ and
> > keep vm_normal_any_page() (or perhaps call it vm_any_page?). This is
> > basically what the previous incarnation of this feature did:
> > 
> > struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> >                              pte_t pte, bool with_public_device);
> > #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
> > 
> > Except we should add:
> > 
> > #define vm_normal_any_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, true)
> > 
> 
> "normal" simply tells us that this is not a special mapping -- IOW, we
> want the VM to take a look at the memmap and not treat it like a PFN
> map. What we're changing is that we're now also returning non-lru pages.
> Fair enough, that's why we introduce vm_normal_lru_page() as a
> replacement where we really can only deal with lru pages.
> 
> vm_normal_page vs vm_normal_lru_page is good enough. "lru" further
> limits what we get via vm_normal_page, that's even how it's implemented.

This naming makes sense to me.

Jason
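
Illustrative sketch of the layering agreed on above, where the "lru"
variant only narrows what the plain variant returns. This is userspace C
with hypothetical sketch_* names; the real functions take a vma, an
address and a pte rather than a page, and the two flags are simplified
stand-ins for the kernel's own checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the sketch needs. */
struct sketch_page {
    bool special_mapping;  /* PFN-map style mapping, no memmap lookup */
    bool device_coherent;  /* mappable page that cannot go on an LRU list */
};

/* "normal": not a special mapping. May return non-LRU pages. */
static struct sketch_page *sketch_vm_normal_page(struct sketch_page *page)
{
    if (page == NULL || page->special_mapping)
        return NULL;
    return page;
}

/* "lru": built on the function above and filters its result further. */
static struct sketch_page *sketch_vm_normal_lru_page(struct sketch_page *page)
{
    page = sketch_vm_normal_page(page);
    if (page != NULL && page->device_coherent)
        return NULL;
    return page;
}

/* Exercises the three cases discussed in the thread. */
static void sketch_check_layering(void)
{
    struct sketch_page ram = { false, false };
    struct sketch_page coherent = { false, true };
    struct sketch_page pfnmap = { true, false };

    assert(sketch_vm_normal_page(&ram) == &ram);
    assert(sketch_vm_normal_lru_page(&ram) == &ram);
    assert(sketch_vm_normal_page(&coherent) == &coherent);
    assert(sketch_vm_normal_lru_page(&coherent) == NULL);
    assert(sketch_vm_normal_page(&pfnmap) == NULL);
    assert(sketch_vm_normal_lru_page(&pfnmap) == NULL);
}
```

Callers that can handle any mapped page use the plain variant; callers
that will put the page on an LRU list use the narrower one.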


Re: [PATCH] fbdev: defio: fix the pagelist corruption

2022-03-17 Thread Thomas Zimmermann

Hi

On 17.03.22 at 06:46, Chuansheng Liu wrote:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but
was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
__list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
  
  fb_deferred_io_mkwrite+0xea/0x150
  do_page_mkwrite+0x57/0xc0
  do_wp_page+0x278/0x2f0
  __handle_mm_fault+0xdc2/0x1590
  handle_mm_fault+0xdd/0x2c0
  do_user_addr_fault+0x1d3/0x650
  exc_page_fault+0x77/0x180
  ? asm_exc_page_fault+0x8/0x30
  asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the tail of
the pagelist in fb_deferred_io_mkwrite() while another process is
re-initializing the same &page->lru in fb_deferred_io_fault(), which is
not protected by the lock.

The fix is to initialize all the page lists once during initialization.
This not only fixes the list corruption but also avoids redundant
INIT_LIST_HEAD() calls.

Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
Cc: Thomas Zimmermann 
Signed-off-by: Chuansheng Liu 


If you fix Geert's comment, feel free to add

Reviewed-by: Thomas Zimmermann 

Best regards
Thomas


---
  drivers/video/fbdev/core/fb_defio.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c 
b/drivers/video/fbdev/core/fb_defio.c
index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");
  
  	BUG_ON(!page->mapping);

-   INIT_LIST_HEAD(&page->lru);
page->index = vmf->pgoff;
  
  	vmf->page = page;

@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
  void fb_deferred_io_init(struct fb_info *info)
  {
struct fb_deferred_io *fbdefio = info->fbdefio;
+   struct page *page;
+   int i;
  
  	BUG_ON(!fbdefio);

mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
INIT_LIST_HEAD(&fbdefio->pagelist);
if (fbdefio->delay == 0) /* set a default of 1 s */
fbdefio->delay = HZ;
+
+   /* initialize all the page lists one time */
+   for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+   page = fb_deferred_io_page(info, i);
+   INIT_LIST_HEAD(&page->lru);
+   }
  }
  EXPORT_SYMBOL_GPL(fb_deferred_io_init);
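
Illustrative sketch of why re-initializing an enlisted node trips the
check quoted in the warning (there, prev->next was equal to prev itself).
This is a single-threaded userspace replay with a minimal list type; the
sketch_* names are hypothetical and only mirror the shape of the kernel's
list_head helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, shaped like the kernel's list_head. */
struct sketch_list_head {
    struct sketch_list_head *next;
    struct sketch_list_head *prev;
};

/* An initialized node points at itself. */
static void sketch_init_list_head(struct sketch_list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Adds at the tail, but refuses when prev->next is not the head. */
static bool sketch_list_add_tail(struct sketch_list_head *new_node,
                                 struct sketch_list_head *head)
{
    struct sketch_list_head *prev = head->prev;

    if (prev->next != head)
        return false; /* the list is already corrupted */

    new_node->next = head;
    new_node->prev = prev;
    prev->next = new_node;
    head->prev = new_node;
    return true;
}

/* Replays the outcome of the race in one thread. */
static void sketch_replay_corruption(void)
{
    struct sketch_list_head pagelist, page_a, page_b;

    sketch_init_list_head(&pagelist);
    sketch_init_list_head(&page_a);
    sketch_init_list_head(&page_b);

    /* mkwrite path: page_a is put on the pagelist. */
    assert(sketch_list_add_tail(&page_a, &pagelist));

    /* fault path before the fix: the enlisted node is re-initialized. */
    sketch_init_list_head(&page_a);

    /* Next mkwrite: the tail node now points at itself, not at the head. */
    assert(pagelist.prev == &page_a && page_a.next == &page_a);
    assert(!sketch_list_add_tail(&page_b, &pagelist));
}
```

Initializing every node once, before any of them can be enlisted, removes
the re-initialization step and with it the broken tail pointer.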
  


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: [PATCH 1/2] drm/i915: Fix renamed struct field

2022-03-17 Thread Souza, Jose
On Wed, 2022-03-16 at 16:45 -0700, Lucas De Marchi wrote:
> Earlier versions of commit a5b7ef27da60 ("drm/i915: Add struct to hold
> IP version") named "ver" as "arch" and then when it was renamed it
> missed the rename on MEDIA_VER_FULL() since it's currently not used.

Reviewed-by: José Roberto de Souza 

> 
> Fixes: a5b7ef27da60 ("drm/i915: Add struct to hold IP version")
> Cc: José Roberto de Souza 
> Cc: Matt Roper 
> Signed-off-by: Lucas De Marchi 
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 26df561a4e94..7458b107a1d6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -922,7 +922,7 @@ static inline struct intel_gt *to_gt(struct 
> drm_i915_private *i915)
>   (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
>  
>  #define MEDIA_VER(i915)  (INTEL_INFO(i915)->media.ver)
> -#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media.arch, \
> +#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media.ver, \
>  INTEL_INFO(i915)->media.rel)
>  #define IS_MEDIA_VER(i915, from, until) \
>   (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
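
Illustrative sketch of why the stale field name went unnoticed: a macro
body is only compiled where the macro is expanded, so a macro naming a
missing member builds fine until it gets its first user. The sketch_*
names are hypothetical, and SKETCH_IP_VER is assumed to pack version and
release the same way the kernel's IP_VER() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IP version struct: version and release. */
struct sketch_ip_version {
    uint8_t ver;
    uint8_t rel;
};

/* Assumption: version in the high byte, release in the low byte. */
#define SKETCH_IP_VER(ver, rel) ((ver) << 8 | (rel))

/* Names a member that does not exist. Harmless until something expands it. */
#define SKETCH_VER_FULL_STALE(ip) SKETCH_IP_VER((ip)->arch, (ip)->rel)

/* Fixed variant using the member that actually exists. */
#define SKETCH_VER_FULL(ip) SKETCH_IP_VER((ip)->ver, (ip)->rel)

/* First real user of the fixed macro. */
static unsigned int sketch_media_ver_full(const struct sketch_ip_version *ip)
{
    assert(ip != NULL);
    return (unsigned int)SKETCH_VER_FULL(ip);
}
```

Swapping SKETCH_VER_FULL for SKETCH_VER_FULL_STALE inside the function is
what finally produces the compile error.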



Re: [PATCH 2/2] drm/i915: Add logical mapping for video decode engines

2022-03-17 Thread Souza, Jose
On Wed, 2022-03-16 at 16:45 -0700, Lucas De Marchi wrote:
> From: Matthew Brost 
> 
> Add logical mapping for VDBOXs. This mapping is required for
> split-frame workloads, which otherwise fail with
> 
>   -F8C53528: [GUC] 0441-INVALID_ENGINE_SUBMIT_MASK
> 
> ... if the application is using the logical id to reorder the engines and
> then using it for the batch buffer submission. It's not a big problem on
> media version 11 and 12 as they have only 2 instances of VCS and the
> logical to physical mapping is monotonically increasing - if the
> application is not using the logical id.
> 
> Changing it for the previous platforms allows the media driver
> implementation for the next ones (12.50 and above) to be the same,
> checking the logical id. It should also not introduce any bug for the
> old versions of userspace not checking the id.
> 
> The mapping added here is the complete map needed by XEHPSDV. Previous
> platforms with only 2 instances will just use a partial map and should
> still work.
> 
> Cc: Matt Roper 
> Signed-off-by: Matthew Brost 
> [ Extend the mapping to media versions 11 and 12 and give proper
>   justification in the commit message why ]
> Signed-off-by: Lucas De Marchi 
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 8080479f27aa..afa2e61cf729 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -731,12 +731,24 @@ static void populate_logical_ids(struct intel_gt *gt, 
> u8 *logical_ids,
>  
>  static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
>  {
> - int i;
> - u8 map[MAX_ENGINE_INSTANCE + 1];
> + /*
> +  * Logical to physical mapping is needed for proper support
> +  * to split-frame feature.
> +  */
> + if (MEDIA_VER(gt->i915) >= 11 && class == VIDEO_DECODE_CLASS) {
> + static const u8 map[] = { 0, 2, 4, 6, 1, 3, 5, 7 };

You can drop the static.

Other than that LGTM.
Reviewed-by: José Roberto de Souza 

>  
> - for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> - map[i] = i;
> - populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> + populate_logical_ids(gt, logical_ids, class,
> +  map, ARRAY_SIZE(map));
> + } else {
> + int i;
> + u8 map[MAX_ENGINE_INSTANCE + 1];
> +
> + for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> + map[i] = i;
> + populate_logical_ids(gt, logical_ids, class,
> +  map, ARRAY_SIZE(map));
> + }
>  }
>  
>  /**
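
Illustrative sketch of how a map like { 0, 2, 4, 6, 1, 3, 5, 7 } could
translate into logical ids. It assumes ids are handed out in map order
and that instances missing from the hardware are skipped; that is an
assumption about populate_logical_ids(), whose body is not shown in this
thread. The sketch_* names and the presence bitmask are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_INSTANCES 8

/*
 * Walks the map in order and gives each present physical instance the
 * next free logical id. present_mask has bit N set if instance N exists.
 */
static void sketch_assign_logical_ids(uint8_t present_mask,
                                      const uint8_t *map, size_t map_len,
                                      uint8_t *logical_ids)
{
    uint8_t next_id = 0;
    size_t j;

    assert(map_len <= SKETCH_MAX_INSTANCES);

    for (j = 0; j < map_len; ++j) {
        uint8_t instance = map[j];

        if (instance >= SKETCH_MAX_INSTANCES)
            continue; /* ignore entries outside the table */
        if (present_mask & (1u << instance))
            logical_ids[instance] = next_id++;
    }
}
```

Under these assumptions, a part with instances 0 to 3 present ends up
with logical ids 0, 2, 1, 3 for physical instances 0, 1, 2, 3, while a
part with only instances 0 and 2 keeps an increasing 0, 1 assignment.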



Re: (subset) [PATCH] drm/vc4: add tracepoints for CL submissions

2022-03-17 Thread Maxime Ripard
On Tue, 1 Feb 2022 20:26:51 -0100, Melissa Wen wrote:
> Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> jobs execution. It might be helpful to get a rendering timeline and
> track job throttling.
> 
> 

Applied to drm/drm-misc (drm-misc-next).

Thanks!
Maxime


Re: [PATCH] drm/vc4: add tracepoints for CL submissions

2022-03-17 Thread Maxime Ripard
On Thu, Mar 10, 2022 at 12:54:32PM +0100, Chema Casanova wrote:
> On 10/3/22 at 12:12, Maxime Ripard wrote:
> > On Tue, Mar 01, 2022 at 01:58:26PM -0100, Melissa Wen wrote:
> > > On 02/25, Maxime Ripard wrote:
> > > > Hi Melissa,
> > > > 
> > > > On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> > > > > > Trace submit_cl_ioctl and related IRQs for CL submission and
> > > > > > bin/render jobs execution. It might be helpful to get a rendering
> > > > > > timeline and track job throttling.
> > > > > 
> > > > > Signed-off-by: Melissa Wen 
> > > > I'm not really sure what to do about this patch to be honest.
> > > > 
> > > > My understanding is that tracepoints are considered as userspace ABI,
> > > > but I can't really judge whether your traces are good enough or if it's
> > > > > something that will bite us some time down the road.
> > > Thanks for taking a look at this patch.
> > > 
> > > So, I followed the same path of tracing CL submissions on v3d. I mean,
> > > > tracking submit_cl ioctl, points when a job (bin/render) starts its
> > > > execution, and irqs of completion (bin/render job). We used it to
> > > > examine job throttling when running Chromium and, therefore, in addition
> > > > to having the timeline of jobs execution, I show some data submitted in
> > > the ioctl to make connections. I think these tracers might be useful for
> > > some investigation in the future, but I'm also not sure if all data are
> > > necessary to be displayed.
> > Yeah, I'm sure that it's useful :)
> > 
> > I don't see anything wrong with that patch, really. What I meant is that
> > I don't really have the experience to judge if there's anything wrong in
> > the first place :)
> > 
> > If you can get someone with more experience with the v3d driver (Emma,
> > > Iago maybe?) I'll definitely be ok merging that patch
> 
> I've checked this patch and I've been using these tracepoints.
> They have been working properly.
> 
> Reviewed-by: Jose Maria Casanova Crespo 

Thanks for your feedback, I just merged the patch

Maxime




Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Lee Jones
On Thu, 17 Mar 2022, Lee Jones wrote:

> Presently the Client can be freed whilst still in use.
> 
> Use the already provided lock to prevent this.
> 
> Cc: Felix Kuehling 
> Cc: Alex Deucher 
> Cc: "Christian König" 
> Cc: "Pan, Xinhui" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones 
> ---

I should have clarified here that this patch has only been *build*
tested, since I have no way to run it on real H/W.

Please ensure this is tested on real H/W before it gets applied, since
it *may* have some undesired side-effects.  For instance, I have no
idea if client->lock plays nicely with dev->smi_lock or whether this
may well end up in deadlock.

TIA.

>  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
> b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, 
> struct file *filep)
>   spin_unlock(&dev->smi_lock);
>  
>   synchronize_rcu();
> +
> + spin_lock(&client->lock);
>   kfifo_free(&client->fifo);
>   kfree(client);
> + spin_unlock(&client->lock);
>  
>   return 0;
>  }
> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t 
> *fd)
>   return ret;
>   }
>  
> + spin_lock(&client->lock);
>   ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
>  O_RDWR);
>   if (ret < 0) {
>   kfifo_free(&client->fifo);
>   kfree(client);
> + spin_unlock(&client->lock);
>   return ret;
>   }
>   *fd = ret;
> @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>   spin_lock(&dev->smi_lock);
>   list_add_rcu(&client->list, &dev->smi_clients);
>   spin_unlock(&dev->smi_lock);
> + spin_unlock(&client->lock);
>  
>   return 0;
>  }

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Felix Kuehling

On 2022-03-17 at 09:16, Lee Jones wrote:

Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones 
---
  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
spin_unlock(&dev->smi_lock);
  
  	synchronize_rcu();

+
+   spin_lock(&client->lock);
kfifo_free(&client->fifo);
kfree(client);
+   spin_unlock(&client->lock);


The spin_unlock is after the spinlock data structure has been freed. 
There should be no concurrent users here, since we are freeing the data 
structure. If there still are concurrent users at this point, they will 
crash anyway. So the locking is unnecessary.



  
  	return 0;

  }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
return ret;
}
  
+	spin_lock(&client->lock);


The client was just allocated, and it wasn't added to the client list or 
given to user mode yet. So there can be no concurrent users at this 
point. The locking is unnecessary.


There could be potential issues if someone uses the file descriptor by 
dumb luck before this function returns. So maybe we need to move the 
anon_inode_getfd to the end of the function (just before list_add_rcu) 
so that we only create the file descriptor after the client structure is 
fully initialized.


Regards,
  Felix



ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
   O_RDWR);
if (ret < 0) {
kfifo_free(&client->fifo);
kfree(client);
+   spin_unlock(&client->lock);
return ret;
}
*fd = ret;
@@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
spin_lock(&dev->smi_lock);
list_add_rcu(&client->list, &dev->smi_clients);
spin_unlock(&dev->smi_lock);
+   spin_unlock(&client->lock);
  
  	return 0;

  }
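
Illustrative sketch of the ordering suggested above: finish initializing
the client first, and only then create the handle that user mode can use.
This is userspace C with hypothetical sketch_* names; sketch_publish_fd()
merely stands in for anon_inode_getfd(), and the descriptor value is
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the SMI client structure. */
struct sketch_client {
    bool fifo_ready;  /* stands in for the kfifo allocation */
    bool lock_ready;  /* stands in for spin_lock_init() */
    int fd;           /* handle given to user mode, -1 until published */
};

/*
 * Stand-in for anon_inode_getfd(). Once this returns, user mode can
 * reach the client, so everything must be set up by now.
 */
static int sketch_publish_fd(const struct sketch_client *client)
{
    assert(client->fifo_ready && client->lock_ready);
    return 3; /* arbitrary descriptor number for the sketch */
}

/* Allocates, initializes, and only then publishes the client. */
static struct sketch_client *sketch_event_open(void)
{
    struct sketch_client *client = calloc(1, sizeof(*client));

    if (client == NULL)
        return NULL;

    client->fd = -1;
    client->fifo_ready = true;
    client->lock_ready = true;

    /* Publishing is the last step that can expose the client. */
    client->fd = sketch_publish_fd(client);
    return client;
}
```

With this ordering nobody else can hold a reference while the structure
is still being set up, so no lock is needed around the setup itself.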


Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Lee Jones
Good afternoon Felix,

Thanks for your review.

> On 2022-03-17 at 09:16, Lee Jones wrote:
> > Presently the Client can be freed whilst still in use.
> > 
> > Use the already provided lock to prevent this.
> > 
> > Cc: Felix Kuehling 
> > Cc: Alex Deucher 
> > Cc: "Christian König" 
> > Cc: "Pan, Xinhui" 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: amd-...@lists.freedesktop.org
> > Cc: dri-devel@lists.freedesktop.org
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
> > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > index e4beebb1c80a2..3b9ac1e87231f 100644
> > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, 
> > struct file *filep)
> > spin_unlock(&dev->smi_lock);
> > synchronize_rcu();
> > +
> > +   spin_lock(&client->lock);
> > kfifo_free(&client->fifo);
> > kfree(client);
> > +   spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.

Good point.

If we go forward with this approach the unlock should perhaps be moved
to just before the kfree().

> There
> should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.

The users may well crash, as does the kernel unfortunately.

> > return 0;
> >   }
> > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t 
> > *fd)
> > return ret;
> > }
> > +   spin_lock(&client->lock);
> 
> The client was just allocated, and it wasn't added to the client list or
> given to user mode yet. So there can be no concurrent users at this point.
> The locking is unnecessary.
> 
> There could be potential issues if someone uses the file descriptor by dumb
> luck before this function returns. So maybe we need to move the
> anon_inode_getfd to the end of the function (just before list_add_rcu) so
> that we only create the file descriptor after the client structure is fully
> initialized.

Bingo.  Well done. :)

I can move the function as suggested if that is the best route forward?

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 3/3] drm/msm: Add a way to override processes comm/cmdline

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 1:21 AM Dan Carpenter  wrote:
>
> On Wed, Mar 16, 2022 at 05:29:45PM -0700, Rob Clark wrote:
> >   switch (param) {
> > + case MSM_PARAM_COMM:
> > + case MSM_PARAM_CMDLINE: {
> > + char *str, **paramp;
> > +
> > + str = kmalloc(len + 1, GFP_KERNEL);
>
> if (!str)
> return -ENOMEM;
>
> > + if (copy_from_user(str, u64_to_user_ptr(value), len)) {
> > + kfree(str);
> > + return -EFAULT;
> > + }
> > +
> > + /* Ensure string is null terminated: */
> > + str[len] = '\0';
> > +
> > + if (param == MSM_PARAM_COMM) {
> > + paramp = &ctx->comm;
> > + } else {
> > + paramp = &ctx->cmdline;
> > + }
> > +
> > + kfree(*paramp);
> > + *paramp = str;
> > +
> > + return 0;
> > + }
> >   case MSM_PARAM_SYSPROF:
> >   if (!capable(CAP_SYS_ADMIN))
> >   return -EPERM;
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index 4ec62b601adc..68f3f8ade76d 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -364,14 +364,21 @@ static void retire_submits(struct msm_gpu *gpu);
> >
> >  static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, 
> > char **cmd)
> >  {
> > + struct msm_file_private *ctx = submit->queue->ctx;
> >   struct task_struct *task;
> >
> > + *comm = kstrdup(ctx->comm, GFP_KERNEL);
> > + *cmd  = kstrdup(ctx->cmdline, GFP_KERNEL);
> > +
> >   task = get_pid_task(submit->pid, PIDTYPE_PID);
> >   if (!task)
> >   return;
> >
> > - *comm = kstrdup(task->comm, GFP_KERNEL);
> > - *cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
> > + if (!*comm)
> > + *comm = kstrdup(task->comm, GFP_KERNEL);
>
> What?
>
> If the first allocation failed, then this one is going to fail as well.
> Just return -ENOMEM.  Or maybe this is meant to be checking for an empty
> string?

fwiw, if ctx->comm is NULL, the kstrdup() will return NULL, so this
isn't intended to deal with OoM, but the case that comm and/or cmdline
is not overridden.

BR,
-R

>
> > +
> > + if (!*cmd)
> > + *cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
>
> Same.
>
> >
> >   put_task_struct(task);
> >  }
>
> regards,
> dan carpenter
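
Illustrative sketch of the override-with-fallback pattern explained
above: duplicating a NULL string yields NULL, so a NULL result after the
first copy means "no override set" and the task's own name is used. This
is userspace C; sketch_strdup_or_null() is a hypothetical stand-in for
kstrdup(), and the sketch does not distinguish a missing override from an
allocation failure, which is the point raised in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for kstrdup(): a NULL input gives a NULL result. */
static char *sketch_strdup_or_null(const char *s)
{
    size_t len;
    char *copy;

    if (s == NULL)
        return NULL;

    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Prefers the per-context override and falls back to the task's own
 * name when no override is set. The caller frees the result.
 */
static char *sketch_pick_comm(const char *override_comm, const char *task_comm)
{
    char *comm;

    assert(task_comm != NULL);

    comm = sketch_strdup_or_null(override_comm);
    if (comm == NULL)
        comm = sketch_strdup_or_null(task_comm);
    return comm;
}
```

Calling sketch_pick_comm(NULL, "glxgears") returns a copy of "glxgears",
while sketch_pick_comm("chrome", "glxgears") returns a copy of "chrome".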
>


Re: amd-gfx Digest, Vol 70, Issue 199

2022-03-17 Thread Felix Kuehling

On 2022-03-16 at 21:57, Yat Sin, David wrote:

Use proper amdgpu_gem_prime_import function to handle all kinds of
imports. Remember the dmabuf reference to enable proper multi-GPU
attachment to multiple VMs without erroneously re-exporting the underlying
BO multiple times.

Signed-off-by: Felix Kuehling 
---
  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 38 ++-
  1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index cd89d2e46852..2ac61a1e665e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -2033,30 +2033,27 @@ int
amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
struct amdgpu_bo *bo;
int ret;

-   if (dma_buf->ops != &amdgpu_dmabuf_ops)
-   /* Can't handle non-graphics buffers */
-   return -EINVAL;
-
-   obj = dma_buf->priv;
-   if (drm_to_adev(obj->dev) != adev)
-   /* Can't handle buffers from other devices */
-   return -EINVAL;
+   obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);

bo = gem_to_amdgpu_bo(obj);
if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
-   AMDGPU_GEM_DOMAIN_GTT)))
+   AMDGPU_GEM_DOMAIN_GTT))) {
/* Only VRAM and GTT BOs are supported */
-   return -EINVAL;
+   ret = -EINVAL;
+   goto err_put_obj;
+   }

*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
-   if (!*mem)
-   return -ENOMEM;
+   if (!*mem) {
+   ret = -ENOMEM;
+   goto err_put_obj;
+   }

ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
-   if (ret) {
-   kfree(mem);
-   return ret;
-   }
+   if (ret)
+   goto err_free_mem;

if (size)
*size = amdgpu_bo_size(bo);
@@ -2073,7 +2070,8 @@ int
amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
| KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE
| KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;

-   drm_gem_object_get(&bo->tbo.base);
+   get_dma_buf(dma_buf);
+   (*mem)->dmabuf = dma_buf;
(*mem)->bo = bo;
(*mem)->va = va;
(*mem)->domain = (bo->preferred_domains &
AMDGPU_GEM_DOMAIN_VRAM) ?
@@ -2085,6 +2083,12 @@ int
amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
(*mem)->is_imported = true;

return 0;
+
+err_free_mem:
+   kfree(mem);

Should be kfree(*mem)


Good catch. That was broken in the original code too and I just copied it.

Thanks,
  Felix




Regards,
David


+err_put_obj:
+   drm_gem_object_put(obj);
+   return ret;
  }

  /* Evict a userptr BO by stopping the queues if necessary
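
Illustrative sketch of the kfree(mem) versus kfree(*mem) point above:
with an out-parameter of type pointer-to-pointer, the error path has to
free the object it allocated, not the caller's pointer variable. This is
userspace C with hypothetical sketch_* names; resetting *mem to NULL on
failure is an extra precaution that is not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kgd_mem. */
struct sketch_mem {
    unsigned long va;
};

/* Allocates *mem and unwinds through a goto label if a later step fails. */
static int sketch_import(struct sketch_mem **mem, bool fail_after_alloc)
{
    int ret;

    assert(mem != NULL);

    *mem = calloc(1, sizeof(**mem));
    if (*mem == NULL)
        return -ENOMEM;

    if (fail_after_alloc) {
        ret = -EINVAL;
        goto err_free_mem;
    }

    return 0;

err_free_mem:
    free(*mem);   /* the allocated object, not the caller's variable */
    *mem = NULL;  /* extra precaution, not in the patch */
    return ret;
}
```

Passing mem itself to free() would hand the allocator the address of the
caller's pointer variable and leak the object.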


Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Felix Kuehling



On 2022-03-17 at 11:00, Lee Jones wrote:

Good afternoon Felix,

Thanks for your review.


On 2022-03-17 at 09:16, Lee Jones wrote:

Presently the Client can be freed whilst still in use.

Use the already provided lock to prevent this.

Cc: Felix Kuehling 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones 
---
   drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
   1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
index e4beebb1c80a2..3b9ac1e87231f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
@@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct 
file *filep)
spin_unlock(&dev->smi_lock);
synchronize_rcu();
+
+   spin_lock(&client->lock);
kfifo_free(&client->fifo);
kfree(client);
+   spin_unlock(&client->lock);

The spin_unlock is after the spinlock data structure has been freed.

Good point.

If we go forward with this approach the unlock should perhaps be moved
to just before the kfree().


There
should be no concurrent users here, since we are freeing the data structure.
If there still are concurrent users at this point, they will crash anyway.
So the locking is unnecessary.

The users may well crash, as does the kernel unfortunately.

We only get to kfd_smi_ev_release when the file descriptor is closed. 
User mode has no way to use the client any more at this point. This 
function also removes the client from the dev->smi_clients list. So no 
more events will be added to the client. Therefore it is safe to free 
the client.


If any of the above were not true, it would not be safe to kfree(client).

But if it is safe to kfree(client), then there is no need for the locking.

Regards,
  Felix






return 0;
   }
@@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
return ret;
}
+   spin_lock(&client->lock);

The client was just allocated, and it wasn't added to the client list or
given to user mode yet. So there can be no concurrent users at this point.
The locking is unnecessary.

There could be potential issues if someone uses the file descriptor by dumb
luck before this function returns. So maybe we need to move the
anon_inode_getfd to the end of the function (just before list_add_rcu) so
that we only create the file descriptor after the client structure is fully
initialized.

Bingo.  Well done. :)

I can move the function as suggested if that is the best route forward?



Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Matthew Brost
On Thu, Mar 17, 2022 at 03:06:18AM -0700, Christian König wrote:
> Am 17.03.22 um 10:59 schrieb Daniel Vetter:
> > On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote:
> >> From: Rob Clark 
> >>
> >> In the system suspend path, we don't want to be racing with the
> >> scheduler kthreads pushing additional queued up jobs to the hw
> >> queue (ringbuffer).  So park them first.  While we are at it,
> >> move the wait for active jobs to complete into the new system-
> >> suspend path.
> >>
> >> Signed-off-by: Rob Clark 
> >> ---
> >>   drivers/gpu/drm/msm/adreno/adreno_device.c | 68 --
> >>   1 file changed, 64 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> >> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> index 8859834b51b8..0440a98988fc 100644
> >> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
> >>   static int adreno_runtime_suspend(struct device *dev)
> >>   {
> >>struct msm_gpu *gpu = dev_to_gpu(dev);
> >> -  int remaining;
> >> +
> >> +  /*
> >> +   * We should be holding a runpm ref, which will prevent
> >> +   * runtime suspend.  In the system suspend path, we've
> >> +   * already waited for active jobs to complete.
> >> +   */
> >> +  WARN_ON_ONCE(gpu->active_submits);
> >> +
> >> +  return gpu->funcs->pm_suspend(gpu);
> >> +}
> >> +
> >> +static void suspend_scheduler(struct msm_gpu *gpu)
> >> +{
> >> +  int i;
> >> +
> >> +  /*
> >> +   * Shut down the scheduler before we force suspend, so that
> >> +   * suspend isn't racing with scheduler kthread feeding us
> >> +   * more work.
> >> +   *
> >> +   * Note, we just want to park the thread, and let any jobs
> >> +   * that are already on the hw queue complete normally, as
> >> +   * opposed to the drm_sched_stop() path used for handling
> >> +   * faulting/timed-out jobs.  We can't really cancel any jobs
> >> +   * already on the hw queue without racing with the GPU.
> >> +   */
> >> +  for (i = 0; i < gpu->nr_rings; i++) {
> >> +  struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> >> +  kthread_park(sched->thread);
> > Shouldn't we have some proper interfaces for this?
> 
> If I'm not completely mistaken we already should have one, yes.
> 
> > Also I'm kinda wondering how other drivers do this, feels like we should
> > have a standard way.
> >
> > Finally not flushing out all in-flight requests sounds a bit like a bad
> > idea for system suspend/resume since that's also the hibernation path, and
> > that would mean your shrinker/page reclaim stops working. At least in full
> > generality. Which ain't good for hibernation.
> 
> Completely agree, that looks like an incorrect workaround to me.
> 
> During suspend all userspace applications should be frozen and all of 
> their hardware activity flushed out and waited for completion.
>

Isn't that what Rob is doing?

He parks the scheduler, preventing any new job from being submitted, then
waits for any outstanding jobs to complete naturally (see the
wait_event_timeout below). If the jobs don't complete naturally, the
suspend seems to be aborted? That flow makes sense to me and seems like
a novel way to avoid races.

Matt 
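
The flow described above (park the scheduler, wait a bounded time for the hw ring to drain, abort the suspend on timeout) can be sketched as a small userspace model. All names here (`model_gpu`, `model_tick()`, `model_system_suspend()`) are hypothetical and only mirror the shape of the quoted patch; a "tick" stands in for the passage of time inside `wait_event_timeout()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; field names are illustrative only. */
struct model_gpu {
	bool sched_parked;   /* scheduler thread parked: no new jobs reach the hw */
	int  queued;         /* jobs still waiting in the scheduler */
	int  active;         /* jobs already on the hw ring */
	bool suspended;
};

/* One tick: the hw retires one job; an unparked scheduler feeds one more. */
static void model_tick(struct model_gpu *gpu)
{
	if (gpu->active > 0)
		gpu->active--;
	if (!gpu->sched_parked && gpu->queued > 0) {
		gpu->queued--;
		gpu->active++;
	}
}

/*
 * Park the scheduler, wait up to max_ticks for the hw ring to drain,
 * then suspend. On timeout, unpark again and return -EBUSY so the
 * suspend attempt is aborted.
 */
int model_system_suspend(struct model_gpu *gpu, int max_ticks)
{
	assert(gpu && max_ticks >= 0);

	gpu->sched_parked = true;

	for (int t = 0; t < max_ticks && gpu->active > 0; t++)
		model_tick(gpu);

	if (gpu->active > 0) {
		gpu->sched_parked = false;   /* mirrors resume_scheduler() on error */
		return -EBUSY;
	}

	gpu->suspended = true;
	return 0;
}
```

Note that in this model the jobs still queued in the scheduler stay unsubmitted across a successful suspend; only the jobs already on the ring are drained.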
 
> I do remember that our internal guys came up with pretty much the same 
> idea and it sounded broken to me back then as well.
> 
> Regards,
> Christian.
> 
> >
> > Adding Christian and Andrey.
> > -Daniel
> >
> >> +  }
> >> +}
> >> +
> >> +static void resume_scheduler(struct msm_gpu *gpu)
> >> +{
> >> +  int i;
> >> +
> >> +  for (i = 0; i < gpu->nr_rings; i++) {
> >> +  struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> >> +  kthread_unpark(sched->thread);
> >> +  }
> >> +}
> >> +
> >> +static int adreno_system_suspend(struct device *dev)
> >> +{
> >> +  struct msm_gpu *gpu = dev_to_gpu(dev);
> >> +  int remaining, ret;
> >> +
> >> +  suspend_scheduler(gpu);
> >>   
> >>remaining = wait_event_timeout(gpu->retire_event,
> >>   active_submits(gpu) == 0,
> >>   msecs_to_jiffies(1000));
> >>if (remaining == 0) {
> >>dev_err(dev, "Timeout waiting for GPU to suspend\n");
> >> -  return -EBUSY;
> >> +  ret = -EBUSY;
> >> +  goto out;
> >>}
> >>   
> >> -  return gpu->funcs->pm_suspend(gpu);
> >> +  ret = pm_runtime_force_suspend(dev);
> >> +out:
> >> +  if (ret)
> >> +  resume_scheduler(gpu);
> >> +
> >> +  return ret;
> >>   }
> >> +
> >> +static int adreno_system_resume(struct device *dev)
> >> +{
> >> +  resume_scheduler(dev_to_gpu(dev));
> >> +  return pm_runtime_force_resume(dev);
> >> +}
> >> +
> >>   #endif
> >>   
> >>   static const struct dev_pm_ops adreno_pm_ops = {
> >> -  SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, 
> >> pm_runtime_force_resume)
> >> +  SET_SYSTEM_SLEEP_PM_OPS(adreno_sys

Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 3:06 AM Christian König
 wrote:
>
> Am 17.03.22 um 10:59 schrieb Daniel Vetter:
> > On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote:
> >> From: Rob Clark 
> >>
> >> In the system suspend path, we don't want to be racing with the
> >> scheduler kthreads pushing additional queued up jobs to the hw
> >> queue (ringbuffer).  So park them first.  While we are at it,
> >> move the wait for active jobs to complete into the new system-
> >> suspend path.
> >>
> >> Signed-off-by: Rob Clark 
> >> ---
> >>   drivers/gpu/drm/msm/adreno/adreno_device.c | 68 --
> >>   1 file changed, 64 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> >> b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> index 8859834b51b8..0440a98988fc 100644
> >> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> >> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu)
> >>   static int adreno_runtime_suspend(struct device *dev)
> >>   {
> >>  struct msm_gpu *gpu = dev_to_gpu(dev);
> >> -int remaining;
> >> +
> >> +/*
> >> + * We should be holding a runpm ref, which will prevent
> >> + * runtime suspend.  In the system suspend path, we've
> >> + * already waited for active jobs to complete.
> >> + */
> >> +WARN_ON_ONCE(gpu->active_submits);
> >> +
> >> +return gpu->funcs->pm_suspend(gpu);
> >> +}
> >> +
> >> +static void suspend_scheduler(struct msm_gpu *gpu)
> >> +{
> >> +int i;
> >> +
> >> +/*
> >> + * Shut down the scheduler before we force suspend, so that
> >> + * suspend isn't racing with scheduler kthread feeding us
> >> + * more work.
> >> + *
> >> + * Note, we just want to park the thread, and let any jobs
> >> + * that are already on the hw queue complete normally, as
> >> + * opposed to the drm_sched_stop() path used for handling
> >> + * faulting/timed-out jobs.  We can't really cancel any jobs
> >> + * already on the hw queue without racing with the GPU.
> >> + */
> >> +for (i = 0; i < gpu->nr_rings; i++) {
> >> +struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> >> +kthread_park(sched->thread);
> > Shouldn't we have some proper interfaces for this?
>
> If I'm not completely mistaken we already should have one, yes.

drm_sched_stop() was my first thought, but it carries extra baggage.
Really I *just* want to park the kthread.

Note that amdgpu does (for afaict different reasons) park the kthread
directly as well.

> > Also I'm kinda wondering how other drivers do this, feels like we should
> > have a standard way.

As far as other drivers, it seems like they largely ignore it.  I
suspect other drivers also have problems in this area.

Fwiw, I have a piglit test to try to exercise this path if you want to
try it on other drivers.. might need some futzing around to make sure
enough work is queued up that there is some on hw ring and some queued
up in the scheduler when you try to suspend.

https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/643

> >
> > Finally not flushing out all in-flight requests sounds a bit like a bad
> > idea for system suspend/resume since that's also the hibernation path, and
> > that would mean your shrinker/page reclaim stops working. At least in full
> > generality. Which ain't good for hibernation.
>
> Completely agree, that looks like an incorrect workaround to me.
>
> During suspend all userspace applications should be frozen and all of
> their hardware activity flushed out and waited for completion.
>
> I do remember that our internal guys came up with pretty much the same
> idea and it sounded broken to me back then as well.

userspace frozen != kthread frozen .. that is what this patch is
trying to address, so we aren't racing between shutting down the hw
and the scheduler shoveling more jobs at us.

BR,
-R

> Regards,
> Christian.
>
> >
> > Adding Christian and Andrey.
> > -Daniel
> >
> >> +}
> >> +}
> >> +
> >> +static void resume_scheduler(struct msm_gpu *gpu)
> >> +{
> >> +int i;
> >> +
> >> +for (i = 0; i < gpu->nr_rings; i++) {
> >> +struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> >> +kthread_unpark(sched->thread);
> >> +}
> >> +}
> >> +
> >> +static int adreno_system_suspend(struct device *dev)
> >> +{
> >> +struct msm_gpu *gpu = dev_to_gpu(dev);
> >> +int remaining, ret;
> >> +
> >> +suspend_scheduler(gpu);
> >>
> >>  remaining = wait_event_timeout(gpu->retire_event,
> >> active_submits(gpu) == 0,
> >> msecs_to_jiffies(1000));
> >>  if (remaining == 0) {
> >>  dev_err(dev, "Timeout waiting for GPU to suspend\n");
> >> -return -EBUSY;
> >> +ret = -EBUSY;
> >> +goto out;
> >>  }
> >>
> >> -return gpu->funcs->pm_s

Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Lee Jones
On Thu, 17 Mar 2022, Felix Kuehling wrote:

> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > Good afternoon Felix,
> > 
> > Thanks for your review.
> > 
> > > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > > > Presently the Client can be freed whilst still in use.
> > > > 
> > > > Use the already provided lock to prevent this.
> > > > 
> > > > Cc: Felix Kuehling 
> > > > Cc: Alex Deucher 
> > > > Cc: "Christian König" 
> > > > Cc: "Pan, Xinhui" 
> > > > Cc: David Airlie 
> > > > Cc: Daniel Vetter 
> > > > Cc: amd-...@lists.freedesktop.org
> > > > Cc: dri-devel@lists.freedesktop.org
> > > > Signed-off-by: Lee Jones 
> > > > ---
> > > >drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
> > > >1 file changed, 6 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c 
> > > > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > index e4beebb1c80a2..3b9ac1e87231f 100644
> > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, 
> > > > struct file *filep)
> > > > spin_unlock(&dev->smi_lock);
> > > > synchronize_rcu();
> > > > +
> > > > +   spin_lock(&client->lock);
> > > > kfifo_free(&client->fifo);
> > > > kfree(client);
> > > > +   spin_unlock(&client->lock);
> > > The spin_unlock is after the spinlock data structure has been freed.
> > Good point.
> > 
> > If we go forward with this approach the unlock should perhaps be moved
> > to just before the kfree().
> > 
> > > There
> > > should be no concurrent users here, since we are freeing the data 
> > > structure.
> > > If there still are concurrent users at this point, they will crash anyway.
> > > So the locking is unnecessary.
> > The users may well crash, as does the kernel unfortunately.
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_clients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.

I'm not keen to go into too much detail until it's been patched.

However, there is a way to free the client while it is still in use.

Remember we are multi-threaded.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm do its memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.

We do one GEM obj allocation in the snapshot path (the hw has a
mechanism to snapshot its own state into a gpu buffer.. not sure if
nice debugging functionality like that is a commentary on the blob
driver quality, but I'm not complaining)

I suppose we could pre-allocate this buffer up-front.. but it doesn't
seem like a problem, ie. if allocation fails we just skip snapshotting
stuff that needs the hw crashdumper.  I guess since vram is not
involved, perhaps that makes the situation a bit more straightforward.

> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.
>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.
>
> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)

Note that the bo's associated with the batch are still pinned at this
point, from the bo lifecycle the batch is still active.  So from the
point of view of shrinker, there should be no interaction.  We aren't
doing anything with mmu notifiers (yet), so not entirely sure offhand
the concern there.

Currently we just use GFP_KERNEL and bail if allocation fails.

BR,
-R
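
The "allocate, and skip that part of the snapshot if the allocation fails" approach described above can be sketched in userspace. This is a minimal model under assumptions: `model_snapshot` and `model_capture()` are hypothetical, and the injectable allocator stands in for a `GFP_KERNEL` allocation that may fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *(*model_alloc_fn)(size_t);

/* Hypothetical crash-state record; not the real msm state layout. */
struct model_snapshot {
	bool  has_registers;   /* cheap state that needs no extra buffer */
	void *hw_dump;         /* optional buffer for the hw crashdumper */
};

/* Allocator that always fails, to exercise the bail-out path. */
void *model_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Capture what we can. If the dump buffer cannot be allocated, skip
 * that part and still leave a usable (smaller) snapshot behind.
 */
void model_capture(struct model_snapshot *snap, model_alloc_fn alloc,
		   size_t dump_size)
{
	assert(snap && alloc);

	snap->has_registers = true;
	snap->hw_dump = alloc(dump_size);   /* may be NULL: best effort only */
}

void model_snapshot_release(struct model_snapshot *snap)
{
	free(snap->hw_dump);   /* free(NULL) is a no-op */
	snap->hw_dump = NULL;
}
```

Calling `model_capture()` with `malloc` gives the full snapshot; calling it with `model_failing_alloc` shows the degraded but still valid result.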

> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter  wrote:
>
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > Am 16.03.22 um 16:36 schrieb Rob Clark:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm do its memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.
>
> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.

fwiw, we snapshot everything (cmdstream and bo's marked with dump
flag, in addition to hw state) before resuming the GPU, so there is no
danger of things being trampled.  After state is captured and GPU
reset, we "replay" the submits that were written into the ringbuffer
after the faulting submit.  GPU crashes should be a thing you don't
need to try to optimize.

(At some point, I'd like to use scheduler for the replay, and actually
use drm_sched_stop()/etc.. but last time I looked there were still
some sched bugs in that area which prevented me from deleting a bunch
of code ;-))

BR,
-R

>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.
>
> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)
>
> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch


Re: [Freedreno] [PATCH v3 5/5] drm/msm: allow compile time selection of driver components

2022-03-17 Thread Dmitry Baryshkov

On 17/03/2022 15:44, Dmitry Baryshkov wrote:

On 16/03/2022 20:26, Abhinav Kumar wrote:



On 3/16/2022 12:31 AM, Dmitry Baryshkov wrote:

On 16/03/2022 03:28, Abhinav Kumar wrote:



On 3/3/2022 7:21 PM, Dmitry Baryshkov wrote:
MSM DRM driver already allows one to compile out the DP or DSI 
support.

Add support for disabling other features like MDP4/MDP5/DPU drivers or
direct HDMI output support.

Suggested-by: Stephen Boyd 
Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/Kconfig    | 50 
--

  drivers/gpu/drm/msm/Makefile   | 18 ++--
  drivers/gpu/drm/msm/msm_drv.h  | 33 ++
  drivers/gpu/drm/msm/msm_mdss.c | 13 +++--
  4 files changed, 106 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig
index 9b019598e042..3735fd41eb3b 100644
--- a/drivers/gpu/drm/msm/Kconfig
+++ b/drivers/gpu/drm/msm/Kconfig
@@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO
    Only use this if you are a driver developer.  This should 
*not*

    be enabled for production kernels.  If unsure, say N.
-config DRM_MSM_HDMI_HDCP
-    bool "Enable HDMI HDCP support in MSM DRM driver"
+config DRM_MSM_MDSS
+    bool
+    depends on DRM_MSM
+    default n

Shouldn't DRM_MSM_MDSS default to y?


No, it will be selected either by MDP5 or by DPU1. It is not used if 
DRM_MSM is compiled with just MDP4 or headless support in mind.

Ok got it.




Another question is the compilation validation of the combinations 
of these.


So we need to try:

1) DRM_MSM_MDSS + DRM_MSM_MDP4
2) DRM_MSM_MDSS + DRM_MSM_MDP5
3) DRM_MSM_MDSS + DRM_MSM_DPU

Earlier, since all of them were compiled together, any 
inter-dependencies would not show up. Now that we are separating them 
out, I just wanted to make sure each of the combos compiles?


I think you meant:
- headless
- MDP4
- MDP5
- DPU1
- MDP4 + MDP5
- MDP4 + DPU1
- MDP5 + DPU1
- all three drivers


Yes, each of these combinations.


Each of them was tested.


Hmm. It looks like I had DSI disabled during the tests. Will fix it up.
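
The eight build combinations listed above, together with the rule that DRM_MSM_MDSS is only pulled in by MDP5 or DPU, can be written down as a small truth-table check. This is a sketch under assumptions: the `MODEL_*` bit names and both helpers are hypothetical and only model the `select DRM_MSM_MDSS` lines of the quoted Kconfig; they are not part of the kernel build system.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MDP4 (1u << 0)
#define MODEL_MDP5 (1u << 1)
#define MODEL_DPU  (1u << 2)
#define MODEL_NUM_COMBOS 8u   /* 2^3 on/off choices, including headless (0) */

/* DRM_MSM_MDSS has no prompt in the patch; it is only enabled via "select". */
bool model_mdss_selected(unsigned int combo)
{
	assert(combo < MODEL_NUM_COMBOS);
	return (combo & (MODEL_MDP5 | MODEL_DPU)) != 0;
}

/* Count how many of the build combinations end up with MDSS enabled. */
unsigned int model_count_mdss_builds(void)
{
	unsigned int combo, count = 0;

	for (combo = 0; combo < MODEL_NUM_COMBOS; combo++)
		if (model_mdss_selected(combo))
			count++;
	return count;
}
```

Only the headless and the MDP4-only builds leave MDSS out, which is why the symbol defaults to n in the patch.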






+
+config DRM_MSM_MDP4
+    bool "Enable MDP4 support in MSM DRM driver"
  depends on DRM_MSM
  default y
  help
-  Choose this option to enable HDCP state machine
+  Compile in support for the Mobile Display Processor v4 (MDP4) in

+  the MSM DRM driver. It is the older display controller found in
+  devices using APQ8064/MSM8960/MSM8x60 platforms.
+
+config DRM_MSM_MDP5
+    bool "Enable MDP5 support in MSM DRM driver"
+    depends on DRM_MSM
+    select DRM_MSM_MDSS
+    default y
+    help
+  Compile in support for the Mobile Display Processor v5 (MDP5) in
+  the MSM DRM driver. It is the display controller found in devices
+  using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms.

+
+config DRM_MSM_DPU
+    bool "Enable DPU support in MSM DRM driver"
+    depends on DRM_MSM
+    select DRM_MSM_MDSS
+    default y
+    help
+  Compile in support for the Display Processing Unit in
+  the MSM DRM driver. It is the display controller found in 
devices

+  using e.g. SDM845 and newer platforms.
  config DRM_MSM_DP
  bool "Enable DisplayPort support in MSM DRM driver"
@@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY
  help
    Choose this option if DSI PHY on SM8150/SM8250/SC7280 is 
used on

    the platform.
+
+config DRM_MSM_HDMI
+    bool "Enable HDMI support in MSM DRM driver"
+    depends on DRM_MSM
+    default y
+    help
+  Compile in support for the HDMI output in the MSM DRM driver. It can
+  be a primary or a secondary display on the device. Note that this is used
+  only for the direct HDMI output. If the device outputs HDMI data
+  through some kind of DSI-to-HDMI bridge, this option can be disabled.

+
+config DRM_MSM_HDMI_HDCP
+    bool "Enable HDMI HDCP support in MSM DRM driver"
+    depends on DRM_MSM && DRM_MSM_HDMI
+    default y
+    help
+  Choose this option to enable HDCP state machine
diff --git a/drivers/gpu/drm/msm/Makefile 
b/drivers/gpu/drm/msm/Makefile

index e76927b42033..5fe9c20ab9ee 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -16,6 +16,8 @@ msm-y := \
  adreno/a6xx_gpu.o \
  adreno/a6xx_gmu.o \
  adreno/a6xx_hfi.o \
+
+msm-$(CONFIG_DRM_MSM_HDMI) += \
  hdmi/hdmi.o \
  hdmi/hdmi_audio.o \
  hdmi/hdmi_bridge.o \
@@ -27,8 +29,8 @@ msm-y := \
  hdmi/hdmi_phy_8x60.o \
  hdmi/hdmi_phy_8x74.o \
  hdmi/hdmi_pll_8960.o \
-    disp/mdp_format.o \
-    disp/mdp_kms.o \
+
+msm-$(CONFIG_DRM_MSM_MDP4) += \
  disp/mdp4/mdp4_crtc.o \
  disp/mdp4/mdp4_dtv_encoder.o \
  disp/mdp4/mdp4_lcdc_encoder.o \
@@ -37,6 +39,8 @@ msm-y := \
  disp/mdp4/mdp4_irq.o \
  disp/mdp4/mdp4_kms.o \
  disp/mdp4/mdp4_plane.o \
+
+msm-$(CONFIG_DRM_MSM_MDP5) += \
  disp/mdp5/mdp5_cfg.o \
  disp/mdp5/mdp5_ctl.o \
  disp/mdp5/mdp5_crtc.o \
@@ -47,6 +51,8 @@ msm-y

Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Christian König

Am 17.03.22 um 16:10 schrieb Rob Clark:

[SNIP]
userspace frozen != kthread frozen .. that is what this patch is
trying to address, so we aren't racing between shutting down the hw
and the scheduler shoveling more jobs at us.


Well, that's exactly the problem. The scheduler is supposed to keep 
shoveling more jobs at us until it is empty.


Thinking more about it, we will then keep some dma_fence instances 
unsignaled, and that is an extremely bad idea since it can lead to 
deadlocks during suspend.


So this patch here is an absolute clear NAK from my side. If amdgpu is 
doing something similar that is a severe bug and needs to be addressed 
somehow.


Regards,
Christian.



BR,
-R





Re: [PATCH v2 6/8] drm/shmem-helper: Add generic memory shrinker

2022-03-17 Thread Rob Clark
On Wed, Mar 16, 2022 at 5:13 PM Dmitry Osipenko
 wrote:
>
> On 3/16/22 23:00, Rob Clark wrote:
> > On Mon, Mar 14, 2022 at 3:44 PM Dmitry Osipenko
> >  wrote:
> >>
> >> Introduce a common DRM SHMEM shrinker. It reduces code
> >> duplication among DRM drivers and also handles the complicated locking
> >> for the drivers. This is the initial version of the shrinker, covering the
> >> basic needs of GPU drivers.
> >>
> >> This patch is based on a couple of ideas borrowed from Rob Clark's MSM
> >> shrinker and Thomas Zimmermann's variant of the SHMEM shrinker.
> >>
> >> GPU drivers that want to use generic DRM memory shrinker must support
> >> generic GEM reservations.
> >>
> >> Signed-off-by: Daniel Almeida 
> >> Signed-off-by: Dmitry Osipenko 
> >> ---
> >>  drivers/gpu/drm/drm_gem_shmem_helper.c | 194 +
> >>  include/drm/drm_device.h   |   4 +
> >>  include/drm/drm_gem.h  |  11 ++
> >>  include/drm/drm_gem_shmem_helper.h |  25 
> >>  4 files changed, 234 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> >> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> index 37009418cd28..35be2ee98f11 100644
> >> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> @@ -139,6 +139,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> >> *shmem)
> >>  {
> >> struct drm_gem_object *obj = &shmem->base;
> >>
> >> +   /* take out shmem GEM object from the memory shrinker */
> >> +   drm_gem_shmem_madvise(shmem, 0);
> >> +
> >> WARN_ON(shmem->vmap_use_count);
> >>
> >> if (obj->import_attach) {
> >> @@ -163,6 +166,42 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object 
> >> *shmem)
> >>  }
> >>  EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
> >>
> >> +static void drm_gem_shmem_update_purgeable_status(struct 
> >> drm_gem_shmem_object *shmem)
> >> +{
> >> +   struct drm_gem_object *obj = &shmem->base;
> >> +   struct drm_gem_shmem_shrinker *gem_shrinker = 
> >> obj->dev->shmem_shrinker;
> >> +   size_t page_count = obj->size >> PAGE_SHIFT;
> >> +
> >> +   if (!gem_shrinker || obj->import_attach || !obj->funcs->purge)
> >> +   return;
> >> +
> >> +   mutex_lock(&shmem->vmap_lock);
> >> +   mutex_lock(&shmem->pages_lock);
> >> +   mutex_lock(&gem_shrinker->lock);
> >> +
> >> +   if (shmem->madv < 0) {
> >> +   list_del_init(&shmem->madv_list);
> >> +   goto unlock;
> >> +   } else if (shmem->madv > 0) {
> >> +   if (!list_empty(&shmem->madv_list))
> >> +   goto unlock;
> >> +
> >> +   WARN_ON(gem_shrinker->shrinkable_count + page_count < 
> >> page_count);
> >> +   gem_shrinker->shrinkable_count += page_count;
> >> +
> >> +   list_add_tail(&shmem->madv_list, &gem_shrinker->lru);
> >> +   } else if (!list_empty(&shmem->madv_list)) {
> >> +   list_del_init(&shmem->madv_list);
> >> +
> >> +   WARN_ON(gem_shrinker->shrinkable_count < page_count);
> >> +   gem_shrinker->shrinkable_count -= page_count;
> >> +   }
> >> +unlock:
> >> +   mutex_unlock(&gem_shrinker->lock);
> >> +   mutex_unlock(&shmem->pages_lock);
> >> +   mutex_unlock(&shmem->vmap_lock);
> >> +}
> >> +
> >>  static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object 
> >> *shmem)
> >>  {
> >> struct drm_gem_object *obj = &shmem->base;
> >> @@ -366,6 +405,8 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object 
> >> *shmem,
> >> ret = drm_gem_shmem_vmap_locked(shmem, map);
> >> mutex_unlock(&shmem->vmap_lock);
> >>
> >> +   drm_gem_shmem_update_purgeable_status(shmem);
> >> +
> >> return ret;
> >>  }
> >>  EXPORT_SYMBOL(drm_gem_shmem_vmap);
> >> @@ -409,6 +450,8 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object 
> >> *shmem,
> >> mutex_lock(&shmem->vmap_lock);
> >> drm_gem_shmem_vunmap_locked(shmem, map);
> >> mutex_unlock(&shmem->vmap_lock);
> >> +
> >> +   drm_gem_shmem_update_purgeable_status(shmem);
> >>  }
> >>  EXPORT_SYMBOL(drm_gem_shmem_vunmap);
> >>
> >> @@ -451,6 +494,8 @@ int drm_gem_shmem_madvise(struct drm_gem_shmem_object 
> >> *shmem, int madv)
> >>
> >> mutex_unlock(&shmem->pages_lock);
> >>
> >> +   drm_gem_shmem_update_purgeable_status(shmem);
> >> +
> >> return (madv >= 0);
> >>  }
> >>  EXPORT_SYMBOL(drm_gem_shmem_madvise);
> >> @@ -763,6 +808,155 @@ drm_gem_shmem_prime_import_sg_table(struct 
> >> drm_device *dev,
> >>  }
> >>  EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table);
> >>
> >> +static struct drm_gem_shmem_shrinker *
> >> +to_drm_shrinker(struct shrinker *shrinker)
> >> +{
> >> +   return container_of(shrinker, struct drm_gem_shmem_shrinker, base);
> >> +}
> >> +
> >> +static unsigned long
> >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker,
> >> + 

Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Rob Clark
On Thu, Mar 17, 2022 at 9:04 AM Christian König
 wrote:
>
> Am 17.03.22 um 16:10 schrieb Rob Clark:
> > [SNIP]
> > userspace frozen != kthread frozen .. that is what this patch is
> > trying to address, so we aren't racing between shutting down the hw
> > and the scheduler shoveling more jobs at us.
>
> Well, that's exactly the problem. The scheduler is supposed to keep
> shoveling more jobs at us until it is empty.
>
> Thinking more about it, we will then keep some dma_fence instances
> unsignaled, and that is an extremely bad idea since it can lead to
> deadlocks during suspend.

Hmm, perhaps that is true if you need to migrate things out of vram?
It is at least not a problem when vram is not involved.

> So this patch here is an absolute clear NAK from my side. If amdgpu is
> doing something similar that is a severe bug and needs to be addressed
> somehow.

I think amdgpu's use of kthread_park is not related to suspend, but
didn't look too closely.

And perhaps the solution for this problem is more complex in the case
of amdgpu, I'm not super familiar with the constraints there.  But I
think it is a fine solution for integrated GPUs.

BR,
-R

> Regards,
> Christian.
>
> >
> > BR,
> > -R
> >
>


Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread philip yang

  


On 2022-03-17 11:13 a.m., Lee Jones wrote:
> On Thu, 17 Mar 2022, Felix Kuehling wrote:
>
> > Am 2022-03-17 um 11:00 schrieb Lee Jones:
> > > Good afternoon Felix,
> > >
> > > Thanks for your review.
> > >
> > > > Am 2022-03-17 um 09:16 schrieb Lee Jones:
> > > > > Presently the Client can be freed whilst still in use.
> > > > >
> > > > > Use the already provided lock to prevent this.
> > > > >
> > > > > Cc: Felix Kuehling
> > > > > Cc: Alex Deucher
> > > > > Cc: "Christian König"
> > > > > Cc: "Pan, Xinhui"
> > > > > Cc: David Airlie
> > > > > Cc: Daniel Vetter
> > > > > Cc: amd-...@lists.freedesktop.org
> > > > > Cc: dri-devel@lists.freedesktop.org
> > > > > Signed-off-by: Lee Jones
> > > > > ---
> > > > >  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
> > > > >  1 file changed, 6 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > > index e4beebb1c80a2..3b9ac1e87231f 100644
> > > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> > > > >  	spin_unlock(&dev->smi_lock);
> > > > >  	synchronize_rcu();
> > > > > +
> > > > > +	spin_lock(&client->lock);
> > > > >  	kfifo_free(&client->fifo);
> > > > >  	kfree(client);
> > > > > +	spin_unlock(&client->lock);
> > > >
> > > > The spin_unlock is after the spinlock data structure has been freed.
> > >
> > > Good point.
> > >
> > > If we go forward with this approach the unlock should perhaps be moved
> > > to just before the kfree().
> > >
> > > > There should be no concurrent users here, since we are freeing the data
> > > > structure. If there still are concurrent users at this point, they will
> > > > crash anyway. So the locking is unnecessary.
> > >
> > > The users may well crash, as does the kernel unfortunately.
> >
> > We only get to kfd_smi_ev_release when the file descriptor is closed. User
> > mode has no way to use the client any more at this point. This function also
> > removes the client from the dev->smi_clients list. So no more events will
> > be added to the client. Therefore it is safe to free the client.
> >
> > If any of the above were not true, it would not be safe to kfree(client).
> >
> > But if it is safe to kfree(client), then there is no need for the locking.
>
> I'm not keen to go into too much detail until it's been patched.
>
> However, there is a way to free the client while it is still in use.
>
> Remember we are multi-threaded.

The files_struct->count refcount is used to handle this race:
vfs_read/vfs_write take a file reference, and fput calls release only
when the last reference is dropped. This guarantees that any read/write
from user space has finished by this point.

The other race is the driver calling add_event_to_kfifo while the handle
is being closed. add_event_to_kfifo runs under rcu_read_lock, and
kfd_smi_ev_release calls synchronize_rcu to wait for all RCU readers to
finish. So it is safe to call kfifo_free(&client->fifo) and
kfree(client).
Regards,
Philip
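
The refcount argument above can be modeled in user space. This is only a minimal sketch under stated assumptions, not the kernel's file handling: `demo_file`, `demo_file_get` and `demo_file_put` are hypothetical names, and C11 atomics stand in for the kernel's reference counting. It shows why the release path cannot run while a reader still holds a reference.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an open file with a release callback. */
struct demo_file {
	atomic_int refcount;	/* one reference per holder (fd, in-flight I/O) */
	bool *released;		/* set once the release path has run */
};

/* Create the object with a single reference, owned by the opener. */
static struct demo_file *demo_file_open(bool *released)
{
	struct demo_file *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	atomic_init(&f->refcount, 1);
	f->released = released;
	*released = false;
	return f;
}

/* Take an extra reference, e.g. for the duration of a read. */
static void demo_file_get(struct demo_file *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

/* Drop one reference; release runs exactly once, for the last holder. */
static void demo_file_put(struct demo_file *f)
{
	if (atomic_fetch_sub(&f->refcount, 1) == 1) {
		*f->released = true;
		free(f);
	}
}
```

In this model a close that races with a read only drops the opener's reference; the object is released when the reader drops its own.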


  


  



[PATCH 1/2] drm: Add missing DP DSC extended capability definitions.

2022-03-17 Thread Stanislav Lisovskiy
Adding DP DSC register definitions, we might need for further
DSC implementation, supporting MST and DP branch pass-through mode.

Signed-off-by: Stanislav Lisovskiy 
---
 drivers/gpu/drm/dp/drm_dp.c| 25 +
 include/drm/dp/drm_dp_helper.h | 11 ++-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/dp/drm_dp.c b/drivers/gpu/drm/dp/drm_dp.c
index 703972ae14c6..fe9c72055638 100644
--- a/drivers/gpu/drm/dp/drm_dp.c
+++ b/drivers/gpu/drm/dp/drm_dp.c
@@ -2312,6 +2312,31 @@ u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
 }
 EXPORT_SYMBOL(drm_dp_dsc_sink_max_slice_count);
 
+/**
+ * drm_dp_dsc_sink_bpp_increment_div - Get the bits per pixel precision
+ * which DP DSC sink device supports.
+ */
+u8 drm_dp_dsc_sink_bpp_increment_div(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE])
+{
+	u8 bpp_increment_dpcd = dsc_dpcd[DP_DSC_BITS_PER_PIXEL_INC - DP_DSC_SUPPORT];
+
+	switch (bpp_increment_dpcd) {
+	case DP_DSC_BITS_PER_PIXEL_1_16:
+		return 16;
+	case DP_DSC_BITS_PER_PIXEL_1_8:
+		return 8;
+	case DP_DSC_BITS_PER_PIXEL_1_4:
+		return 4;
+	case DP_DSC_BITS_PER_PIXEL_1_2:
+		return 2;
+	case DP_DSC_BITS_PER_PIXEL_1_1:
+		return 1;
+	}
+
+	return 0;
+}
+
+
 /**
  * drm_dp_dsc_sink_line_buf_depth() - Get the line buffer depth in bits
  * @dsc_dpcd: DSC capabilities from DPCD
diff --git a/include/drm/dp/drm_dp_helper.h b/include/drm/dp/drm_dp_helper.h
index 51e02cf75277..e4c9f4438ccb 100644
--- a/include/drm/dp/drm_dp_helper.h
+++ b/include/drm/dp/drm_dp_helper.h
@@ -246,6 +246,9 @@ struct drm_panel;
 
 #define DP_DSC_SUPPORT  0x060   /* DP 1.4 */
 # define DP_DSC_DECOMPRESSION_IS_SUPPORTED  (1 << 0)
+# define DP_DSC_PASS_THROUGH_IS_SUPPORTED   (1 << 1)
+# define DP_DSC_DYNAMIC_PPS_UPDATE_SUPPORT_COMP_TO_COMP (1 << 2)
+# define DP_DSC_DYNAMIC_PPS_UPDATE_SUPPORT_UNCOMP_TO_COMP  (1 << 3)
 
 #define DP_DSC_REV  0x061
 # define DP_DSC_MAJOR_MASK  (0xf << 0)
@@ -284,12 +287,15 @@ struct drm_panel;
 
 #define DP_DSC_BLK_PREDICTION_SUPPORT   0x066
 # define DP_DSC_BLK_PREDICTION_IS_SUPPORTED (1 << 0)
+# define DP_DSC_RGB_COLOR_CONV_BYPASS_SUPPORT (1 << 1)
 
 #define DP_DSC_MAX_BITS_PER_PIXEL_LOW   0x067   /* eDP 1.4 */
 
 #define DP_DSC_MAX_BITS_PER_PIXEL_HI 0x068   /* eDP 1.4 */
 # define DP_DSC_MAX_BITS_PER_PIXEL_HI_MASK  (0x3 << 0)
 # define DP_DSC_MAX_BITS_PER_PIXEL_HI_SHIFT 8
+# define DP_DSC_MAX_BPP_DELTA_VERSION_MASK  0x06
+# define DP_DSC_MAX_BPP_DELTA_AVAILABILITY  0x08
 
 #define DP_DSC_DEC_COLOR_FORMAT_CAP 0x069
 # define DP_DSC_RGB (1 << 0)
@@ -351,11 +357,13 @@ struct drm_panel;
 # define DP_DSC_24_PER_DP_DSC_SINK  (1 << 2)
 
 #define DP_DSC_BITS_PER_PIXEL_INC   0x06F
+# define DP_DSC_RGB_YCbCr444_MAX_BPP_DELTA_MASK 0x1f
+# define DP_DSC_RGB_YCbCr420_MAX_BPP_DELTA_MASK 0xe0
 # define DP_DSC_BITS_PER_PIXEL_1_16 0x0
 # define DP_DSC_BITS_PER_PIXEL_1_8  0x1
 # define DP_DSC_BITS_PER_PIXEL_1_4  0x2
 # define DP_DSC_BITS_PER_PIXEL_1_2  0x3
-# define DP_DSC_BITS_PER_PIXEL_1 0x4
+# define DP_DSC_BITS_PER_PIXEL_1_1  0x4
 
 #define DP_PSR_SUPPORT  0x070   /* XXX 1.2? */
 # define DP_PSR_IS_SUPPORTED 1
@@ -1825,6 +1833,7 @@ u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE],
 u8 drm_dp_dsc_sink_line_buf_depth(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]);
 int drm_dp_dsc_sink_supported_input_bpcs(const u8 dsc_dpc[DP_DSC_RECEIVER_CAP_SIZE],
 					 u8 dsc_bpc[3]);
+u8 drm_dp_dsc_sink_bpp_increment_div(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]);
 
 static inline bool
 drm_dp_sink_supports_dsc(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE])
-- 
2.24.1.485.gad05a3d8e5
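
For readers who want to try the decoding outside the kernel, here is a small standalone sketch of the lookup done by drm_dp_dsc_sink_bpp_increment_div(). The encodings 0x0 to 0x4 are taken from the header hunk above. Everything prefixed `DEMO_`/`demo_` is hypothetical, and the 0x7 mask is an assumption that only the low three bits of register 0x06F carry the increment; the posted function switches on the whole byte.

```c
#include <stdint.h>

/* Encodings as listed for DP_DSC_BITS_PER_PIXEL_INC in the patch. */
#define DEMO_BPP_INC_1_16	0x0
#define DEMO_BPP_INC_1_8	0x1
#define DEMO_BPP_INC_1_4	0x2
#define DEMO_BPP_INC_1_2	0x3
#define DEMO_BPP_INC_1_1	0x4

/* Assumption (not in the posted patch): increment lives in bits 2:0. */
#define DEMO_BPP_INC_MASK	0x7

/* Return the divisor N such that the sink supports bpp steps of 1/N. */
static uint8_t demo_bpp_increment_div(uint8_t reg_06f)
{
	switch (reg_06f & DEMO_BPP_INC_MASK) {
	case DEMO_BPP_INC_1_16:
		return 16;
	case DEMO_BPP_INC_1_8:
		return 8;
	case DEMO_BPP_INC_1_4:
		return 4;
	case DEMO_BPP_INC_1_2:
		return 2;
	case DEMO_BPP_INC_1_1:
		return 1;
	}

	return 0;	/* unknown encoding */
}
```

Without the mask, any set upper bit in the register would make the switch fall through to the 0 return, which may or may not be the intended behavior.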



Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on

2022-03-17 Thread Lee Jones
On Thu, 17 Mar 2022, philip yang wrote:

>On 2022-03-17 11:13 a.m., Lee Jones wrote:
> 
> On Thu, 17 Mar 2022, Felix Kuehling wrote:
> 
> 
> Am 2022-03-17 um 11:00 schrieb Lee Jones:
> 
> Good afternoon Felix,
> 
> Thanks for your review.
> 
> 
> Am 2022-03-17 um 09:16 schrieb Lee Jones:
> 
> Presently the Client can be freed whilst still in use.
> 
> Use the already provided lock to prevent this.
> 
> Cc: Felix Kuehling
> Cc: Alex Deucher
> Cc: "Christian König"
> Cc: "Pan, Xinhui"
> Cc: David Airlie
> Cc: Daniel Vetter
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones
> ---
>drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
>1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> spin_unlock(&dev->smi_lock);
> synchronize_rcu();
> +
> +   spin_lock(&client->lock);
> kfifo_free(&client->fifo);
> kfree(client);
> +   spin_unlock(&client->lock);
> 
> The spin_unlock is after the spinlock data structure has been freed.
> 
> Good point.
> 
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
> 
> 
> There should be no concurrent users here, since we are freeing the data structure.
> If there still are concurrent users at this point, they will crash anyway.
> So the locking is unnecessary.
> 
> The users may well crash, as does the kernel unfortunately.
> 
> We only get to kfd_smi_ev_release when the file descriptor is closed. User
> mode has no way to use the client any more at this point. This function also
> removes the client from the dev->smi_clients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
> 
> If any of the above were not true, it would not be safe to kfree(client).
> 
> But if it is safe to kfree(client), then there is no need for the locking.
> 
> I'm not keen to go into too much detail until it's been patched.
> 
> However, there is a way to free the client while it is still in use.
> 
> Remember we are multi-threaded.
> 
>files_struct->count refcount is used to handle this race, as
>vfs_read/vfs_write takes file refcount and fput calls release only if
>refcount is 1, to guarantee that read/write from user space is finished
>here.
> 
>Another race is driver add_event_to_kfifo while closing the handler. We
>use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls
>synchronize_rcu to wait for all rcu_read done. So it is safe to call
>kfifo_free(&client->fifo) and kfree(client).

Philip, please reach out to Felix.

We have discussed this in more detail off-line.

-- 
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[PATCH 0/2] Add DP MST DSC support to i915

2022-03-17 Thread Stanislav Lisovskiy
Currently we have only DSC support for DP SST.

Stanislav Lisovskiy (2):
  drm: Add missing DP DSC extended capability definitions.
  drm/i915: Add DSC support to MST path

 drivers/gpu/drm/dp/drm_dp.c |  25 
 drivers/gpu/drm/i915/display/intel_dp.c | 138 --
 drivers/gpu/drm/i915/display/intel_dp.h |  17 +++
 drivers/gpu/drm/i915/display/intel_dp_mst.c | 146 +++-
 include/drm/dp/drm_dp_helper.h  |  11 +-
 5 files changed, 320 insertions(+), 17 deletions(-)

-- 
2.24.1.485.gad05a3d8e5



[PATCH 2/2] drm/i915: Add DSC support to MST path

2022-03-17 Thread Stanislav Lisovskiy
Whenever we are not able to get enough timeslots
for the required PBN, let's try to allocate those
using DSC, the same way as we do for SST.

These patches are still experimental, i.e. not
for merging; they still need to be tested with a
proper DSC display. Submitting them to check
that nothing else blows up, at least.

v2: Add DSC checks to intel_dp_mst_mode_valid_ctx, similar
    to ones we have in intel_dp_mode_valid (Manasi Navare)

v3: Removed redundant edp condition logic from MST DSC
    handling (Manasi Navare)

v4: - Fixed forgotten force_dsc_en condition which was
      always enabled for testing purposes (Manasi Navare)
    - Properly process ret == EDEADLK, thus fixing the
      regression caused by WARN triggered with modeset_lock.

v5: - Removed redundant check (Imre Deak)

Acked-by: Imre Deak 
Signed-off-by: Stanislav Lisovskiy 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 138 --
 drivers/gpu/drm/i915/display/intel_dp.h |  17 +++
 drivers/gpu/drm/i915/display/intel_dp_mst.c | 146 +++-
 3 files changed, 285 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 9e19165fd175..b04771e495cc 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -115,7 +115,6 @@ bool intel_dp_is_edp(struct intel_dp *intel_dp)
 }
 
 static void intel_dp_unset_edid(struct intel_dp *intel_dp);
-static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 dsc_max_bpc);
 
 /* Is link rate UHBR and thus 128b/132b? */
 bool intel_dp_is_uhbr(const struct intel_crtc_state *crtc_state)
@@ -667,11 +666,12 @@ small_joiner_ram_size_bits(struct drm_i915_private *i915)
return 6144 * 8;
 }
 
-static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
-  u32 link_clock, u32 lane_count,
-  u32 mode_clock, u32 mode_hdisplay,
-  bool bigjoiner,
-  u32 pipe_bpp)
+u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
+   u32 link_clock, u32 lane_count,
+   u32 mode_clock, u32 mode_hdisplay,
+   bool bigjoiner,
+   u32 pipe_bpp,
+   u32 timeslots)
 {
u32 bits_per_pixel, max_bpp_small_joiner_ram;
int i;
@@ -683,7 +683,7 @@ static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
 * for MST -> TimeSlotsPerMTP has to be calculated
 */
bits_per_pixel = (link_clock * lane_count * 8) /
-intel_dp_mode_to_fec_clock(mode_clock);
+(intel_dp_mode_to_fec_clock(mode_clock) * timeslots);
drm_dbg_kms(&i915->drm, "Max link bpp: %u\n", bits_per_pixel);
 
/* Small Joiner Check: output bpp <= joiner RAM (bits) / Horiz. width */
@@ -737,9 +737,9 @@ static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915,
return bits_per_pixel << 4;
 }
 
-static u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp,
-  int mode_clock, int mode_hdisplay,
-  bool bigjoiner)
+u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp,
+   int mode_clock, int mode_hdisplay,
+   bool bigjoiner)
 {
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
u8 min_slice_count, i;
@@ -902,8 +902,8 @@ intel_dp_mode_valid_downstream(struct intel_connector *connector,
return MODE_OK;
 }
 
-static bool intel_dp_need_bigjoiner(struct intel_dp *intel_dp,
-   int hdisplay, int clock)
+bool intel_dp_need_bigjoiner(struct intel_dp *intel_dp,
+int hdisplay, int clock)
 {
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
 
@@ -990,7 +990,7 @@ intel_dp_mode_valid(struct drm_connector *connector,
target_clock,
mode->hdisplay,
bigjoiner,
-   pipe_bpp) >> 4;
+   pipe_bpp, 1) >> 4;
dsc_slice_count =
intel_dp_dsc_get_slice_count(intel_dp,
 target_clock,
@@ -1285,7 +1285,7 @@ intel_dp_compute_link_config_wide(struct intel_dp *intel_dp,
return -EINVAL;
 }
 
-static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 max_req_bpc)
+int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 max_req_bpc)
 {
struct drm_i915_private *i915 = dp_to_i915(inte
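
The "Max link bpp" expression changed in the intel_dp_dsc_get_output_bpp() hunk above can be tried in isolation. The sketch below mirrors that integer expression and nothing else from the function. The `demo_` names are hypothetical, and the FEC conversion (dividing the mode clock by 0.972261) is an assumption about what intel_dp_mode_to_fec_clock() does, since its body is not part of this patch.

```c
#include <stdint.h>

/* Assumption: FEC overhead modeled as mode_clock / 0.972261. */
static uint64_t demo_mode_to_fec_clock(uint32_t mode_clock)
{
	return ((uint64_t)mode_clock * 1000000u) / 972261u;
}

/*
 * Mirrors the patched expression:
 *   (link_clock * lane_count * 8) / (fec_clock(mode_clock) * timeslots)
 * using 64-bit intermediates so the products cannot overflow.
 */
static uint32_t demo_max_link_bpp(uint32_t link_clock, uint32_t lane_count,
				  uint32_t mode_clock, uint32_t timeslots)
{
	uint64_t num = (uint64_t)link_clock * lane_count * 8;
	uint64_t den = demo_mode_to_fec_clock(mode_clock) * timeslots;

	if (den == 0)
		return 0;	/* guard added for the sketch only */

	return (uint32_t)(num / den);
}
```

Passing `timeslots = 1`, as the intel_dp_mode_valid() hunk does, leaves the previous SST result unchanged; larger values scale the result down.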

Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware

2022-03-17 Thread Hans de Goede
Hi Daniel,

On 3/17/22 14:28, Daniel Dadap wrote:
> 
>> On Mar 17, 2022, at 07:17, Hans de Goede wrote:
>>
>> Hi,
>>
>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>> Some notebook systems with EC-driven backlight control appear to have a
>>> firmware bug which causes the system to use GPU-driven backlight control
>>> upon a fresh boot, but then switches to EC-driven backlight control
>>> after completing a suspend/resume cycle. All the while, the firmware
>>> reports that the backlight is under EC control, regardless of what is
>>> actually controlling the backlight brightness.
>>>
>>> This leads to the following behavior:
>>>
>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>  WMI-wrapped ACPI method erroneously reporting EC control.
>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>  cycle, due to the backlight control actually being GPU-driven.
>>> * GPU drivers also register their own backlight handlers: in the case
>>>  of the notebook system where this behavior has been observed, both
>>>  amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>  case observed so far) can successfully control the backlight through
>>>  its backlight driver's sysfs interface, but stops working after the
>>>  first suspend/resume cycle.
>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>  fresh boot, but begins to work after the first suspend/resume cycle.
>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>  is not able to control the backlight at any point while the system
>>>  is in operation. On similar hybrid systems with an EC-controlled
>>>  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>  does not register its backlight handler. It has not been determined
>>>  whether the non-functional handler registered by the NVIDIA driver
>>>  is due to another firmware bug, or a bug in the NVIDIA driver.
>>>
>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>> by the GPU drivers. This in turn leads to backlight control appearing
>>> to be non-functional until after completing a suspend/resume cycle.
>>> However, it is still possible to control the backlight through direct
>>> interaction with the working GPU driver's backlight sysfs interface.
>>>
>>> These systems also appear to have a second firmware bug which resets
>>> the EC's brightness level to 100% on resume, but leaves the state in
>>> the kernel at the pre-suspend level. This causes attempts to save
>>> and restore the backlight level across the suspend/resume cycle to
>>> fail, due to the level appearing not to change even though it did.
>>>
>>> In order to work around these issues, add a quirk table to detect
>>> systems that are known to show these behaviors. So far, there is
>>> only one known system that requires these workarounds, and both
>>> issues are present on that system, but the quirks are tracked
>>> separately to make it easier to add them to other systems which
>>> may exhibit one of the bugs, but not the other. The original systems
>>> that this driver was tested on during development do not exhibit
>>> either of these quirks.
>>>
>>> If a system with the "GPU driver has backlight control" quirk is
>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>> (when freshly booted) GPU backlight handler and relay any backlight
>>> brightness level change requests directed at the EC to also be applied
>>> to the GPU backlight interface. This leads to redundant updates
>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>> it does allow the EC backlight control to work when the system is
>>> freshly booted.
>>
>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>> plans to clean-up the whole x86 backlight mess soon and an important part
>> of that is to stop registering multiple backlight interfaces for the
>> same panel/screen.
>>
>> Whereas going with this workaround requires us to have 2
>> backlight interfaces active. Also this will very likely lead to
>> (subtly) different backlight behavior before and after the first
>> suspend/resume.
> 
> I understand. Having multiple backlight devices for the same panel is indeed 
> annoying. Out of curiosity, what is the plan for determining that multiple 
> backlight interfaces are all supposed to control the same panel?

ATM the kernel basically only supports a bunch of different methods
to control the backlight of 1 internal panel. The plan is to tie this
to the panel from a userspace pov by exposing brightness +
max_brightness as properties on the drm_connector object for the
internal panel.

The in kernel tying of the backlight device to the internal panel
will be done hardcoded inside the drm driver(s) based on the
drivers already

Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend

2022-03-17 Thread Christian König

Am 17.03.22 um 17:18 schrieb Rob Clark:

On Thu, Mar 17, 2022 at 9:04 AM Christian König wrote:

Am 17.03.22 um 16:10 schrieb Rob Clark:

[SNIP]
userspace frozen != kthread frozen .. that is what this patch is
trying to address, so we aren't racing between shutting down the hw
and the scheduler shoveling more jobs at us.

Well exactly that's the problem. The scheduler is supposed to keep
shoveling more jobs at us until it is empty.

Thinking more about it we will then keep some dma_fence instance
unsignaled and that is an extremely bad idea since it can lead to
deadlocks during suspend.

Hmm, perhaps that is true if you need to migrate things out of vram?
It is at least not a problem when vram is not involved.


No, it's much wider than that.

See what can happen is that the memory management shrinkers want to wait 
for a dma_fence during suspend.


And if you stop the scheduler they will just wait forever.

What you need to do instead is to drain the scheduler, e.g. call 
drm_sched_entity_flush() with a proper timeout for each entity you have 
created.


Regards,
Christian.




So this patch here is an absolute clear NAK from my side. If amdgpu is
doing something similar that is a severe bug and needs to be addressed
somehow.

I think amdgpu's use of kthread_park is not related to suspend, but
didn't look too closely.

And perhaps the solution for this problem is more complex in the case
of amdgpu, I'm not super familiar with the constraints there.  But I
think it is a fine solution for integrated GPUs.

BR,
-R


Regards,
Christian.


BR,
-R





[PATCH v2 0/3] drm/msm: Add comm/cmdline override

2022-03-17 Thread Rob Clark
From: Rob Clark 

Add a way to override comm/cmdline per-drm_file.  This is useful for
VM scenarios where the host process is just a proxy for the actual
guest process.

Rob Clark (3):
  drm/msm: Add support for pointer params
  drm/msm: Split out helper to get comm/cmdline
  drm/msm: Add a way to override processes comm/cmdline

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 49 -
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  4 +-
 drivers/gpu/drm/msm/msm_drv.c   |  8 ++--
 drivers/gpu/drm/msm/msm_gpu.c   | 40 
 drivers/gpu/drm/msm/msm_gpu.h   | 10 -
 drivers/gpu/drm/msm/msm_rd.c|  5 ++-
 drivers/gpu/drm/msm/msm_submitqueue.c   |  2 +
 include/uapi/drm/msm_drm.h  |  4 ++
 8 files changed, 94 insertions(+), 28 deletions(-)

-- 
2.35.1



[PATCH v2 1/3] drm/msm: Add support for pointer params

2022-03-17 Thread Rob Clark
From: Rob Clark 

The 64b value field is already sufficient to hold a pointer instead of
an immediate, but we also need a length field.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 12 ++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  4 ++--
 drivers/gpu/drm/msm/msm_drv.c   |  8 
 drivers/gpu/drm/msm/msm_gpu.h   |  4 ++--
 drivers/gpu/drm/msm/msm_rd.c|  5 +++--
 include/uapi/drm/msm_drm.h  |  2 ++
 6 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 9efc84929be0..3d307b34854d 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -229,10 +229,14 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
 }
 
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t *value)
+uint32_t param, uint64_t *value, uint32_t *len)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 
+   /* No pointer params yet */
+   if (*len != 0)
+   return -EINVAL;
+
switch (param) {
case MSM_PARAM_GPU_ID:
*value = adreno_gpu->info->revn;
@@ -284,8 +288,12 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 }
 
 int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t value)
+uint32_t param, uint64_t value, uint32_t len)
 {
+   /* No pointer params yet */
+   if (len != 0)
+   return -EINVAL;
+
switch (param) {
case MSM_PARAM_SYSPROF:
if (!capable(CAP_SYS_ADMIN))
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 0490c5fbb780..ab3b5ef80332 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -281,9 +281,9 @@ static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
 }
 
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t *value);
+uint32_t param, uint64_t *value, uint32_t *len);
 int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t value);
+uint32_t param, uint64_t value, uint32_t len);
 const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
const char *fwname);
 struct drm_gem_object *adreno_fw_create_bo(struct msm_gpu *gpu,
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 780f9748aaaf..a5eed5738ac8 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -610,7 +610,7 @@ static int msm_ioctl_get_param(struct drm_device *dev, void *data,
/* for now, we just have 3d pipe.. eventually this would need to
 * be more clever to dispatch to appropriate gpu module:
 */
-   if (args->pipe != MSM_PIPE_3D0)
+   if ((args->pipe != MSM_PIPE_3D0) || (args->pad != 0))
return -EINVAL;
 
gpu = priv->gpu;
@@ -619,7 +619,7 @@ static int msm_ioctl_get_param(struct drm_device *dev, void *data,
return -ENXIO;
 
return gpu->funcs->get_param(gpu, file->driver_priv,
-args->param, &args->value);
+args->param, &args->value, &args->len);
 }
 
 static int msm_ioctl_set_param(struct drm_device *dev, void *data,
@@ -629,7 +629,7 @@ static int msm_ioctl_set_param(struct drm_device *dev, void *data,
struct drm_msm_param *args = data;
struct msm_gpu *gpu;
 
-   if (args->pipe != MSM_PIPE_3D0)
+   if ((args->pipe != MSM_PIPE_3D0) || (args->pad != 0))
return -EINVAL;
 
gpu = priv->gpu;
@@ -638,7 +638,7 @@ static int msm_ioctl_set_param(struct drm_device *dev, void *data,
return -ENXIO;
 
return gpu->funcs->set_param(gpu, file->driver_priv,
-args->param, args->value);
+args->param, args->value, args->len);
 }
 
 static int msm_ioctl_gem_new(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index a84140055920..c28c2ad9f52e 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -44,9 +44,9 @@ struct msm_gpu_config {
  */
 struct msm_gpu_funcs {
int (*get_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t *value);
+uint32_t param, uint64_t *value, uint32_t *len);
int (*set_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
-uint32_t param, uint64_t value);
+   
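
The value/len convention described in the commit message can be illustrated with a small user-space model. This is a sketch under assumptions, not the msm uapi: `demo_param`, `DEMO_PARAM_SYSPROF` and `demo_set_param` are hypothetical, and the struct layout is a guess because the msm_drm.h hunk is not shown above. It models the checks visible in the diff: a non-zero pad is rejected, and any non-zero len is rejected while no pointer params exist.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical parameter block; layout is an assumption for this sketch. */
struct demo_param {
	uint32_t param;	/* which parameter */
	uint64_t value;	/* immediate, or a user pointer when len != 0 */
	uint32_t len;	/* 0 for immediates, else byte length behind value */
	uint32_t pad;	/* must be zero */
};

#define DEMO_PARAM_SYSPROF 1	/* hypothetical id */

/* Validate the block, then apply an immediate-valued parameter. */
static int demo_set_param(const struct demo_param *p, uint64_t *sysprof)
{
	if (p->pad != 0)
		return -EINVAL;

	/* No pointer params yet, so a length makes no sense. */
	if (p->len != 0)
		return -EINVAL;

	switch (p->param) {
	case DEMO_PARAM_SYSPROF:
		*sysprof = p->value;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Rejecting unused fields up front keeps them available for later extension, since old user space is then known to pass zeros.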
