Re: [RFC PATCH 5/7] drm/ttm: add range busy check for range manager
On 16.03.22 at 16:28, Robert Beckett wrote:
On 16/03/2022 14:39, Christian König wrote:
On 16.03.22 at 15:26, Robert Beckett wrote:

[SNIP]

this is where I replace an existing range check via drm_mm with the range check I added in this patch.

Mhm, I still don't get the use case from the code, but I don't think it matters any more.

I suppose we could add another drm_mm range tracker just for testing and shadow track each allocation in the range, but that seemed like a lot of extra infrastructure for no general runtime use.

I have no idea what you mean by that.

I meant as a potential solution to tracking allocations without a range check, we would need to add something external, e.g. adding a shadow drm_mm range tracker, or a bitmask across the range, or stick objects in a list etc.

Ah! So you are trying to get access to the drm_mm inside the ttm_range_manager and not add some additional range check function! Now I got your use case.

well, specifically I was trying to avoid having to get access to the drm_mm. I wanted to maintain an abstract interface at the resource manager level, hence the rfc to ask if we could add a range check to ttm_resource_manager_func. I don't like the idea of code external to ttm having to poke into the implementation details of the manager to get its underlying drm_mm.

The purpose of the ttm_range_manager is to implement a base class which is then extended by the drivers with more explicit functionality. I have it on my TODO list to properly export the ttm_range_manager functions and use them to simplify the amdgpu_gtt_mgr.c implementation. So accessing the drm_mm for a test case sounds perfectly fine to me as long as you document what is happening. E.g. maybe add a wrapper function to get a pointer to the drm_mm.

would you mind explaining the rationale for removing range checks? It seems to me like a natural fit for a memory manager.

TTM manages buffer objects and resources, not address space. The lpfn/fpfn parameters of the resource allocators are actually used as just two independent parameters and do not define any range. We just keep the names for historical reasons. The only places we still use and compare them as ranges are ttm_resource_compat() and ttm_bo_eviction_valuable() and I already have patches to clean up those and move them into the backend resource handling.

except the ttm_range_manager seems to still use them as a range specifier.

Yeah, because the range manager is the backend which handles ranges using the drm_mm :)

If the general design going forward is to not consider ranges, how would you recommend constructing buffers around pre-allocated regions, e.g. uefi frame buffers whose range is dictated externally?

Call ttm_bo_mem_space() with the fpfn/lpfn filled in as required. See function amdgpu_bo_create_kernel_at() for an example.

ah, I see, thanks. To allow similar code to before, which was conceptually just trying to see if a range was currently free, would you be okay with a new ttm_bo_mem_try_space, which does not force eviction, but instead returns -EBUSY?

You can already do that by setting the num_busy_placement to zero. That should prevent any eviction.

Regards,
Christian.

If so, the test can try to alloc, and immediately free if successful, which would imply it was free.
Signed-off-by: Robert Beckett
---
 drivers/gpu/drm/ttm/ttm_range_manager.c | 21 +
 include/drm/ttm/ttm_range_manager.h     |  3 +++
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 8cd4f3fb9f79..5662627bb933 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -206,3 +206,24 @@ int ttm_range_man_fini_nocheck(struct ttm_device *bdev,
 	return 0;
 }
 EXPORT_SYMBOL(ttm_range_man_fini_nocheck);
+
+/**
+ * ttm_range_man_range_busy - Check whether anything is allocated with a range
+ *
+ * @man: memory manager to check
+ * @fpfn: first page number to check
+ * @lpfn: last page number to check
+ *
+ * Return: true if anything allocated within the range, false otherwise.
+ */
+bool ttm_range_man_range_busy(struct ttm_resource_manager *man,
+			      unsigned fpfn, unsigned lpfn)
+{
+	struct ttm_range_manager *rman = to_range_manager(man);
+	struct drm_mm *mm = &rman->mm;
+
+	if (__drm_mm_interval_first(mm, PFN_PHYS(fpfn), PFN_PHYS(lpfn + 1) - 1))
+		return true;
+	return false;
+}
+EXPORT_SYMBOL(ttm_range_man_range_busy);
diff --git a/include/drm/ttm/ttm_range_manager.h b/include/drm/ttm/ttm_range_manager.h
index 7963b957e9ef..86794a3f9101 100644
--- a/include/drm/ttm/ttm_range_manager.h
+++ b/include/drm/ttm/ttm_range_manager.h
@@ -53,4 +53,7 @@ static __always_inline int ttm_range_man_fini(struct ttm_device *bdev,
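
For readers following the range-busy idea in this patch, here is a minimal userspace sketch of the underlying test: does any allocation intersect an inclusive page range [fpfn, lpfn]? The `struct alloc` array and `range_busy()` are hypothetical stand-ins invented for illustration. The patch itself asks the manager's drm_mm interval tree through `__drm_mm_interval_first()` instead of scanning linearly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocated node: an inclusive page range. */
struct alloc {
	unsigned long first;	/* first page of the allocation */
	unsigned long last;	/* last page of the allocation (inclusive) */
};

/*
 * Return true if any allocation intersects [fpfn, lpfn], both inclusive.
 * Two inclusive intervals overlap when each one starts no later than the
 * other one ends.
 */
static bool range_busy(const struct alloc *allocs, size_t n,
		       unsigned long fpfn, unsigned long lpfn)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (allocs[i].first <= lpfn && fpfn <= allocs[i].last)
			return true;
	}
	return false;
}
```

With allocations covering pages 0-3 and 8-15, a query for pages 4-7 reports free, while a query for pages 7-8 reports busy because page 8 is taken.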
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On 16.03.22 at 16:36, Rob Clark wrote:

[SNIP]

just one point of clarification.. in the msm and i915 case it is purely for debugging and telemetry (ie. sending crash logs back to distro for analysis if user has crash reporting enabled).. it isn't used for triggering any action like killing app or compositor.

By the way, how does msm do its memory management for the devcoredumps? I mean it is strictly forbidden to allocate any memory in the GPU reset path.

I would however *strongly* recommend devcoredump support in other GPU drivers (i915's thing pre-dates devcoredump by a lot).. I've used it to debug and fix a couple obscure issues that I was not able to reproduce by myself.

Yes, completely agree as well.

Thanks,
Christian.

BR,
-R
RE: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)
Hi Tvrtko,

> > On 16/03/2022 07:37, Kasireddy, Vivek wrote:
> > Hi Tvrtko,
> >
> >> On 15/03/2022 07:28, Kasireddy, Vivek wrote:
> >>> Hi Tvrtko, Daniel,
> >>>
> > On 11/03/2022 09:39, Daniel Vetter wrote:
> > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy wrote:
> >>
> >> On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
> >> more framebuffers/scanout buffers results in only one that is mappable/
> >> fenceable. Therefore, pageflipping between these 2 FBs where only one
> >> is mappable/fenceable creates latencies large enough to miss alternate
> >> vblanks thereby producing less optimal framerate.
> >>
> >> This mainly happens because when i915_gem_object_pin_to_display_plane()
> >> is called to pin one of the FB objs, the associated vma is identified
> >> as misplaced and therefore i915_vma_unbind() is called which unbinds
> >> and evicts it. This misplaced vma gets subsequently pinned only when
> >> i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This
> >> results in a latency of ~10ms and happens every other vblank/repaint
> >> cycle.
> >> Therefore, to fix this issue, we try to see if there is space to map
> >> at least two objects of a given size and return early if there isn't.
> >> This would ensure that we do not try with PIN_MAPPABLE for any objects
> >> that are too big to map thereby preventing unnecessary unbind.
> >>
> >> Testcase:
> >> Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
> >> with an 8K@60 mode results in only ~40 FPS. Since upstream Weston
> >> submits a frame ~7ms before the next vblank, the latencies seen between
> >> atomic commit and flip event are 7, 24 (7 + 16.66), 7, 24, suggesting
> >> that it misses the vblank every other frame.
> >>
> >> Here is the ftrace snippet that shows the source of the ~10ms latency:
> >> i915_gem_object_pin_to_display_plane() {
> >> 0.102 us       |  i915_gem_object_set_cache_level();
> >>                   i915_gem_object_ggtt_pin_ww() {
> >> 0.390 us       |    i915_vma_instance();
> >> 0.178 us       |    i915_vma_misplaced();
> >>                     i915_vma_unbind() {
> >>                       __i915_active_wait() {
> >> 0.082 us       |        i915_active_acquire_if_busy();
> >> 0.475 us       |      }
> >>                       intel_runtime_pm_get() {
> >> 0.087 us       |        intel_runtime_pm_acquire();
> >> 0.259 us       |      }
> >>                       __i915_active_wait() {
> >> 0.085 us       |        i915_active_acquire_if_busy();
> >> 0.240 us       |      }
> >>                       __i915_vma_evict() {
> >>                         ggtt_unbind_vma() {
> >>                           gen8_ggtt_clear_range() {
> >> 10507.255 us   |          }
> >> 10507.689 us   |        }
> >> 10508.516 us   |      }
> >>
> >> v2: Instead of using bigjoiner checks, determine whether a scanout
> >>     buffer is too big by checking to see if it is possible to map
> >>     two of them into the ggtt.
> >>
> >> v3 (Ville):
> >> - Count how many fb objects can be fit into the available holes
> >>   instead of checking for a hole twice the object size.
> >> - Take alignment constraints into account.
> >> - Limit this large scanout buffer check to >= Gen 11 platforms.
> >>
> >> v4:
> >> - Remove existing heuristic that checks just for size. (Ville)
> >> - Return early if we find space to map at-least two objects. (Tvrtko)
> >> - Slightly update the commit message.
> >>
> >> v5: (Tvrtko)
> >> - Rename the function to indicate that the object may be too big to
> >>   map into the aperture.
> >> - Account for guard pages while calculating the total size required
> >>   for the object.
> >> - Do not subject all objects to the heuristic check and instead
> >>   consider objects only of a certain size.
> >> - Do the hole walk using the rbtree.
> >> - Preserve the existing PIN_NONBLOCK logic.
> >> - Drop the PIN_MAPPABLE check while pinning the VMA.
> >>
> >> v6: (Tvrtko)
> >> - Return 0 on success and the specific error code on failure to
> >>   preserve the existing behavior.
> >>
> >> v7: (Ville)
> >> - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and
> >>   size < ggtt->mappable_end / 4 checks.
> >> - Drop the redundant check that is based on previous heuristic.
> >>
> >> v8:
> >> - Make sure that we are holding the mutex associated with ggtt vm
> >>   as we traverse the hole nodes.
> >>
> >> v9: (Tvrtko)
> >> - Use mutex_lock_interruptible_nested() instead of mutex_lock().
> >>
> >> Cc: Ville Syrjälä
> >> Cc: Maarten Lankhorst
> >> Cc: Tvrtko Ursulin
> >> Cc: M
Re: [PATCH] drm/ttm: fix uninit ptr deref in range manager alloc error path
On 16.03.22 at 20:50, Robert Beckett wrote:

ttm_range_man_alloc would try to ttm_resource_fini the res pointer before it is allocated.

Fixes: de3688e469b0 (drm/ttm: add ttm_resource_fini v2)
Signed-off-by: Robert Beckett

Reviewed-by: Christian König

Good catch, going to push that to drm-misc-fixes.

---
 drivers/gpu/drm/ttm/ttm_range_manager.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_range_manager.c b/drivers/gpu/drm/ttm/ttm_range_manager.c
index 5662627bb933..1b4d8ca52f68 100644
--- a/drivers/gpu/drm/ttm/ttm_range_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_range_manager.c
@@ -89,7 +89,7 @@ static int ttm_range_man_alloc(struct ttm_resource_manager *man,
 	spin_unlock(&rman->lock);

 	if (unlikely(ret)) {
-		ttm_resource_fini(man, *res);
+		ttm_resource_fini(man, &node->base);
 		kfree(node);
 		return ret;
 	}
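
The bug class fixed here can be shown with a small userspace sketch: an error path must clean up through the locally owned object, because the caller's output pointer is only assigned on success. All names below (`struct resource`, `struct node`, `node_alloc()`, the `fail_insert` switch) are hypothetical and only mimic the shape of the kernel code. They are not the TTM API.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the embedded resource and its container. */
struct resource {
	int initialized;
};

struct node {
	struct resource base;	/* first member, like the embedded base resource */
};

static void resource_init(struct resource *r) { r->initialized = 1; }
static void resource_fini(struct resource *r) { r->initialized = 0; }

/*
 * Allocate a node and hand its embedded resource back through *res.
 * fail_insert simulates the range insertion failing.
 */
static int node_alloc(struct resource **res, int fail_insert)
{
	struct node *node = calloc(1, sizeof(*node));

	if (!node)
		return -ENOMEM;

	resource_init(&node->base);

	if (fail_insert) {
		/*
		 * *res has not been assigned yet, so it must not be used
		 * here. Tear down through the node we own instead.
		 */
		resource_fini(&node->base);
		free(node);
		return -ENOSPC;
	}

	*res = &node->base;
	return 0;
}
```

On failure the caller's pointer is left untouched. On success it points at the embedded resource of a live node.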
Re: [PATCH V9 1/5] dt-bindings: display: mediatek: add aal binding for MT8183
On 17/03/2022 06:18, Rex-BC Chen wrote:
> Add aal binding for MT8183.
>
> Signed-off-by: Rex-BC Chen
> ---
>  .../devicetree/bindings/display/mediatek/mediatek,aal.yaml | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>

Reviewed-by: Krzysztof Kozlowski

Best regards,
Krzysztof
Re: [PATCH v8 22/24] drm: rockchip: Add VOP2 driver
Hi Sascha: On 3/16/22 20:22, Andy Yan wrote: Hi Sascha and Daniel: On 3/16/22 15:40, Sascha Hauer wrote: On Tue, Mar 15, 2022 at 02:46:35PM +0800, Andy Yan wrote: Hi Sascha: On 3/11/22 16:33, Sascha Hauer wrote: From: Andy Yan The VOP2 unit is found on Rockchip SoCs beginning with rk3566/rk3568. It replaces the VOP unit found in the older Rockchip SoCs. This driver has been derived from the downstream Rockchip Kernel and heavily modified: - All nonstandard DRM properties have been removed - dropped struct vop2_plane_state and pass around less data between functions - Dropped all DRM_FORMAT_* not known on upstream - rework register access to get rid of excessively used macros - Drop all waiting for framesyncs The driver is tested with HDMI and MIPI-DSI display on a RK3568-EVB board. Overlay support is tested with the modetest utility. AFBC support on the cluster windows is tested with weston-simple-dmabuf-egl on weston using the (yet to be upstreamed) panfrost driver support. Do we need some modification to test AFBC by weston-simple-dma-egl ? By default weston-simple-dma-egl uses DRM_FORMAT_XRGB which in the panfrost driver ends up as PIPE_FORMAT_B8G8R8_UNORM and panfrost_afbc_format() returns PIPE_FORMAT_NONE for that. Change the format to DRM_FORMAT_ABGR using weston-simple-dma-egl -f 0x34324241. This ends up as PIPE_FORMAT_R8G8B8A8_UNORM in panfrost_afbc_format() which is a supported format. I also try weston-simple-dmabuf-egl -f 0x34324241 command, but I got this output log from weston[0]: Layer 5 (pos 0x5000): View 0 (role xdg_toplevel, PID 375, surface ID 3, top-level window 'simple-dmabuf-egl' of org.freedesktop.weston. 
simple-dmabuf-egl, 0xd08275e0): position: (871, 174) -> (1127, 430) [not opaque] outputs: 0 (HDMI-A-1) (primary) dmabuf buffer format: 0x34324241 ABGR modifier: ARM_BLOCK_SIZE=16x16,MODE=YTR|SPARSE (0x851) Layer 6 (pos 0x2): View 0 (role (null), PID 372, surface ID 18, background for output HDMI-A-1, 0xd0863520): position: (0, 0) -> (1920, 1080) [fully opaque] outputs: 0 (HDMI-A-1) (primary) [buffer not available] [repaint] preparing state for output HDMI-A-1 (0) [repaint] trying planes-only build state [view] evaluating view 0xd083b0f0 for output HDMI-A-1 (0) [view] not assigning view 0xd083b0f0 to plane (no buffer available) [view] failing state generation: placing view 0xd083b0f0 to renderer not allowed [repaint] could not build planes-only state, trying mixed [state] using renderer FB ID 73 for mixed mode for output HDMI-A-1 (0) [state] scanout will use for zpos 0 [view] evaluating view 0xd083b0f0 for output HDMI-A-1 (0) [view] not assigning view 0xd083b0f0 to plane (no buffer available) [view] view 0xd083b0f0 will be placed on the renderer [view] evaluating view 0xd08275e0 for output HDMI-A-1 (0) [plane] started with zpos 18446744073709551615 [view] view 0xd08275e0 will be placed on the renderer [view] evaluating view 0xd0863520 for output HDMI-A-1 (0) [view] not assigning view 0xd0863520 to plane (no buffer available) [view] not assigning view 0xd0863520 to plane (occluded by renderer views) [view] view 0xd0863520 will be placed on the renderer From the log we can find that Layer5 view 0(0xd08275e0) is the afbc view rendered by Panfrost. But it at last put on a render not a afbc window of vop "view] view 0xd083b0f0 will be placed on the renderer " The output message from sys/kernel/debug/dri/state can also provide that only non-AFBC window smart0-win0 is used. It seems that it failed in weston drm_output_prepare_plane_view. Maybe I need a deeper dig. After a deeper dig, I found it failed from drm_fb_get_from_dmabuf { ... 
	/* XXX: TODO:
	 *
	 * Currently the buffer is rejected if any dmabuf attribute
	 * flag is set. This keeps us from passing an inverted /
	 * interlaced / bottom-first buffer (or any other type that may
	 * be added in the future) through to an overlay. Ultimately,
	 * these types of buffers should be handled through buffer
	 * transforms and not as spot-checks requiring specific
	 * knowledge.
	 */
	if (dmabuf->attributes.flags) {
		drm_debug(backend, "\t\t\t\t invalid flag 0x%x\n", dmabuf->attributes.flags);
		return NULL;
	}
}

After some grepping, I found that the flag is set in create_dmabuf_buffer by weston-simple-dmabuf-egl itself. So I ran this test with -g:

weston-simple-dmabuf-egl -f 0x34324241 -g

From the log I can see that this view goes to an overlay plane, but it doesn't appear on the screen. Dumping the dri state, I can see that the AFBC window Cluster1-win0 is enabled. So I guess there is something wrong with the vop2 configuration.

I dumped the registers of OVERLAY, Cluster1-win0 and Smart0-win0 (primary plane) and found an obvious error in the 0x604 (OVERLAY_LAYER_SEL) register: the configuration value is 0x5476
Re: [PATCH] fbdev: defio: fix the pagelist corruption
Hi Chuansheng,

On Thu, Mar 17, 2022 at 7:17 AM Chuansheng Liu wrote:
> Easily hit the below list corruption:
> ==
> list_add corruption. prev->next should be next (c0ceb090), but
> was ec604507edc8. (prev=ec604507edc8).
> WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26
> __list_add_valid+0x53/0x80
> CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
> RIP: 0010:__list_add_valid+0x53/0x80
> Call Trace:
> fb_deferred_io_mkwrite+0xea/0x150
> do_page_mkwrite+0x57/0xc0
> do_wp_page+0x278/0x2f0
> __handle_mm_fault+0xdc2/0x1590
> handle_mm_fault+0xdd/0x2c0
> do_user_addr_fault+0x1d3/0x650
> exc_page_fault+0x77/0x180
> ? asm_exc_page_fault+0x8/0x30
> asm_exc_page_fault+0x1e/0x30
> RIP: 0033:0x7fd98fc8fad1
> ==
>
> Figure out the race happens when one process is adding &page->lru into
> the pagelist tail in fb_deferred_io_mkwrite(), another process is
> re-initializing the same &page->lru in fb_deferred_io_fault(), which is
> not protected by the lock.
>
> This fix is to init all the page lists one time during initialization,
> it not only fixes the list corruption, but also avoids INIT_LIST_HEAD()
> redundantly.
>
> Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already
> enlisted")
> Cc: Thomas Zimmermann
> Signed-off-by: Chuansheng Liu

Thanks for your patch!
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
>  void fb_deferred_io_init(struct fb_info *info)
>  {
>  	struct fb_deferred_io *fbdefio = info->fbdefio;
> +	struct page *page;
> +	int i;

unsigned int i;

>  	BUG_ON(!fbdefio);
>  	mutex_init(&fbdefio->lock);
> @@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
>  	INIT_LIST_HEAD(&fbdefio->pagelist);
>  	if (fbdefio->delay == 0) /* set a default of 1 s */
>  		fbdefio->delay = HZ;
> +
> +	/* initialize all the page lists one time */
> +	for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
> +		page = fb_deferred_io_page(info, i);
> +		INIT_LIST_HEAD(&page->lru);
> +	}
>  }
>  EXPORT_SYMBOL_GPL(fb_deferred_io_init);

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
[PULL] drm-intel-next-fixes
Hi Dave & Daniel,

Fix for vm_access() out-of-bounds access and PSR not staying disabled during fastset once determined not reliable. Then a naming fix to avoid conflicts for potential future fixes.

Regards, Joonas

***

drm-intel-next-fixes-2022-03-17:

- Do not re-enable PSR after it was marked as not reliable (Jose)
- Add missing boundary check in vm_access to avoid out-of-bounds access (Mastan)
- Naming fix for HPD short pulse handling for eDP (Jose)

The following changes since commit 5e7f44b5c2c035fe2e5458193c2bbee56db6a090:

  drm/i915/gtt: reduce overzealous alignment constraints for GGTT (2022-03-09 08:34:55 +0200)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-next-fixes-2022-03-17

for you to fetch changes up to 278da06c03655c2bb9bc36ebdf45b90a079b3bfd:

  drm/i915/display: Do not re-enable PSR after it was marked as not reliable (2022-03-16 08:17:40 +0200)

- Do not re-enable PSR after it was marked as not reliable (Jose)
- Add missing boundary check in vm_access to avoid out-of-bounds access (Mastan)
- Naming fix for HPD short pulse handling for eDP (Jose)

José Roberto de Souza (2):
      drm/i915/display: Fix HPD short pulse handling for eDP
      drm/i915/display: Do not re-enable PSR after it was marked as not reliable

Mastan Katragadda (1):
      drm/i915/gem: add missing boundary check in vm_access

 drivers/gpu/drm/i915/display/intel_dp.c  | 2 +-
 drivers/gpu/drm/i915/display/intel_pps.c | 6 +++---
 drivers/gpu/drm/i915/display/intel_pps.h | 2 +-
 drivers/gpu/drm/i915/display/intel_psr.c | 4
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +-
 5 files changed, 10 insertions(+), 6 deletions(-)
Re: [PATCH 00/25] drm/msm/dpu: wide planes support
On 17/03/2022 04:10, Abhinav Kumar wrote:

Hi Dmitry

I have reviewed the series: some patches completely, while others, especially the plane to sspp mapping, are something I still need to check.

But I had one question on the design.

I thought we were going to have a boot param to control whether the driver will internally use both rectangles for the layer, so that in the future, if compositors can do this splitting, we can use that instead of the driver doing it (keep the boot param disabled?).

No need for that in this patch series. If your composer allocates smaller planes, then the driver won't do a thing. For the proper multirect there will be a boot param (at least initially) and then you can work on the custom properties, etc.

Thanks

Abhinav

On 2/9/2022 9:24 AM, Dmitry Baryshkov wrote:

It took me way longer to finish than I expected, and more patches than I initially hoped. This patchset brings in multirect usage to support using two SSPP rectangles for a single plane. Virtual planes support is omitted from this pull request, it will come later.
Dmitry Baryshkov (25):
  drm/msm/dpu: rip out master planes support
  drm/msm/dpu: do not limit the zpos property
  drm/msm/dpu: add support for SSPP allocation to RM
  drm/msm/dpu: move SSPP debugfs creation to dpu_kms.c
  drm/msm/dpu: move pipe_hw to dpu_plane_state
  drm/msm/dpu: inline dpu_plane_get_ctl_flush
  drm/msm/dpu: drop dpu_plane_pipe function
  drm/msm/dpu: get rid of cached flush_mask
  drm/msm/dpu: dpu_crtc_blend_setup: split mixer and ctl logic
  drm/msm/dpu: introduce struct dpu_sw_pipe
  drm/msm/dpu: use dpu_sw_pipe for dpu_hw_sspp callbacks
  drm/msm/dpu: inline _dpu_plane_set_scanout
  drm/msm/dpu: pass dpu_format to _dpu_hw_sspp_setup_scaler3()
  drm/msm/dpu: move stride programming to dpu_hw_sspp_setup_sourceaddress
  drm/msm/dpu: remove dpu_hw_fmt_layout from struct dpu_hw_pipe_cfg
  drm/msm/dpu: drop EAGAIN check from dpu_format_populate_layout
  drm/msm/dpu: drop src_split and multirect check from dpu_crtc_atomic_check
  drm/msm/dpu: move the rest of plane checks to dpu_plane_atomic_check()
  drm/msm/dpu: don't use unsupported blend stages
  drm/msm/dpu: add dpu_hw_pipe_cfg to dpu_plane_state
  drm/msm/dpu: simplify dpu_plane_validate_src()
  drm/msm/dpu: rewrite plane's QoS-related functions to take dpu_sw_pipe and dpu_format
  drm/msm/dpu: rework dpu_plane_atomic_check() and dpu_plane_sspp_atomic_update()
  drm/msm/dpu: populate SmartDMA features in hw catalog
  drm/msm/dpu: add support for wide planes

 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c      | 355 +++-
 drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h      |   1 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c   |   4 -
 .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c    |  10 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c    |  78 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h    |  35 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.c   | 136 +--
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_sspp.h   |  88 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c       |  21 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h       |   1 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c     | 813 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h     |  42 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c        |  81 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h        |   6 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h     |  19 +-
 15 files changed, 827 insertions(+), 863 deletions(-)

--
With best wishes
Dmitry
Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling
On 17.03.22 03:54, Alistair Popple wrote:
> Felix Kuehling writes:
>
>> On 2022-03-11 04:16, David Hildenbrand wrote:
>>> On 10.03.22 18:26, Alex Sierra wrote:
>>>> DEVICE_COHERENT pages introduce a subtle distinction in the way
>>>> "normal" pages can be used by various callers throughout the kernel.
>>>> They behave like normal pages for purposes of mapping in CPU page
>>>> tables, and for COW. But they do not support LRU lists, NUMA
>>>> migration or THP. Therefore we split vm_normal_page into two
>>>> functions vm_normal_any_page and vm_normal_lru_page. The latter will
>>>> only return pages that can be put on an LRU list and that support
>>>> NUMA migration, KSM and THP.
>>>>
>>>> We also introduced a FOLL_LRU flag that adds the same behaviour to
>>>> follow_page and related APIs, to allow callers to specify that they
>>>> expect to put pages on an LRU list.
>>> I still don't see the need for s/vm_normal_page/vm_normal_any_page/. And
>>> as this patch is dominated by that change, I'd suggest (again) to just
>>> drop it as I don't see any value of that renaming. No specifier implies any.
>>
>> OK. If nobody objects, we can adopt that naming convention.
>
> I'd prefer we avoid the churn too, but I don't think we should make
> vm_normal_page() the equivalent of vm_normal_any_page(). It would mean
> vm_normal_page() would return non-LRU device coherent pages, but to me at
> least device coherent pages seem special and not what I'd expect from a
> function with "normal" in the name.
>
> So I think it would be better to s/vm_normal_lru_page/vm_normal_page/ and keep
> vm_normal_any_page() (or perhaps call it vm_any_page?). This is basically what
> the previous incarnation of this feature did:
>
> struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
>                              pte_t pte, bool with_public_device);
> #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
>
> Except we should add:
>
> #define vm_normal_any_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, true)
>

"normal" simply tells us that this is not a special mapping -- IOW, we want the VM to take a look at the memmap and not treat it like a PFN map. What we're changing is that we're now also returning non-lru pages. Fair enough, that's why we introduce vm_normal_lru_page() as a replacement where we really can only deal with lru pages.

vm_normal_page vs vm_normal_lru_page is good enough. "lru" further limits what we get via vm_normal_page, that's even how it's implemented.

vm_normal_page vs vm_normal_any_page is confusing IMHO.

--
Thanks,

David / dhildenb
[PATCH v2 4/5] drm/ssd130x: Reduce temporary buffer sizes
ssd130x_clear_screen() allocates a temporary buffer sized to hold one byte per pixel, while it only needs to hold one bit per pixel.

ssd130x_fb_blit_rect() allocates a temporary buffer sized to hold one byte per pixel for the whole frame buffer, while it only needs to hold one bit per pixel for the part that is to be updated. Pass dst_pitch to drm_fb_xrgb8888_to_mono(), as we have already calculated it anyway.

Fixes: a61732e808672cfa ("drm: Add driver for Solomon SSD130x OLED displays")
Signed-off-by: Geert Uytterhoeven
Acked-by: Javier Martinez Canillas
---
v2:
  - Add Acked-by,
  - s/drm_fb_xrgb8888_to_mono_reversed/drm_fb_xrgb8888_to_mono/ in description.
---
 drivers/gpu/drm/solomon/ssd130x.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/solomon/ssd130x.c b/drivers/gpu/drm/solomon/ssd130x.c
index 7c99af4ce9dd4e5c..38b6c2c14f53644b 100644
--- a/drivers/gpu/drm/solomon/ssd130x.c
+++ b/drivers/gpu/drm/solomon/ssd130x.c
@@ -440,7 +440,8 @@ static void ssd130x_clear_screen(struct ssd130x_device *ssd130x)
 		.y2 = ssd130x->height,
 	};

-	buf = kcalloc(ssd130x->width, ssd130x->height, GFP_KERNEL);
+	buf = kcalloc(DIV_ROUND_UP(ssd130x->width, 8), ssd130x->height,
+		      GFP_KERNEL);
 	if (!buf)
 		return;

@@ -454,6 +455,7 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 {
 	struct ssd130x_device *ssd130x = drm_to_ssd130x(fb->dev);
 	void *vmap = map->vaddr; /* TODO: Use mapping abstraction properly */
+	unsigned int dst_pitch;
 	int ret = 0;
 	u8 *buf = NULL;

@@ -461,11 +463,12 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 	rect->y1 = round_down(rect->y1, 8);
 	rect->y2 = min_t(unsigned int, round_up(rect->y2, 8), ssd130x->height);

-	buf = kcalloc(fb->width, fb->height, GFP_KERNEL);
+	dst_pitch = DIV_ROUND_UP(drm_rect_width(rect), 8);
+	buf = kcalloc(dst_pitch, drm_rect_height(rect), GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;

-	drm_fb_xrgb8888_to_mono(buf, 0, vmap, fb, rect);
+	drm_fb_xrgb8888_to_mono(buf, dst_pitch, vmap, fb, rect);

 	ssd130x_update_rect(ssd130x, buf, rect);

--
2.25.1
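
To make the size reduction in this patch concrete, here is a small userspace sketch of the arithmetic. `DIV_ROUND_UP` is redefined locally to mirror the kernel macro, and the helper names `mono_pitch()`, `mono_buf_size()` and `byte_per_pixel_buf_size()` are made up for this illustration.

```c
#include <stddef.h>

/* Local copy of the kernel-style round-up division, for userspace use. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bytes per scanline when every pixel takes one bit. */
static size_t mono_pitch(unsigned int width)
{
	return DIV_ROUND_UP((size_t)width, 8);
}

/* Buffer size for a width x height area at one bit per pixel. */
static size_t mono_buf_size(unsigned int width, unsigned int height)
{
	return mono_pitch(width) * height;
}

/* Buffer size the old code allocated: one byte per pixel. */
static size_t byte_per_pixel_buf_size(unsigned int width, unsigned int height)
{
	return (size_t)width * height;
}
```

For the 128x32 panel mentioned in the cover letter, the one-byte-per-pixel buffer is 4096 bytes while the one-bit-per-pixel buffer is 512 bytes. A width that is not a multiple of 8, such as 10, still rounds up to whole bytes per line (2 bytes).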
[PATCH v2 3/5] drm/ssd130x: Fix rectangle updates
The rectangle update functions ssd130x_fb_blit_rect() and ssd130x_update_rect() do not behave correctly when x1 != 0 or y1 != 0, or when y1 or y2 are not aligned to display page boundaries. E.g. when used as a text console, only the first line of text is shown on the display.

  1. The buffer passed by ssd130x_fb_blit_rect() points to the first byte of monochrome bitmap data, and thus has its origin at (x1, y1), while ssd130x_update_rect() assumes it is at (0, 0). Fix ssd130x_update_rect() by changing the vertical and horizontal loop ranges, and adding the offsets only when needed.

  2. In ssd130x_fb_blit_rect(), align y1 and y2 to the display page boundaries before doing the color conversion, so the full page is converted and updated. Remove the correction for an unaligned y1 from ssd130x_update_rect(), and add a check to make sure y1 is aligned.

Fixes: a61732e808672cfa ("drm: Add driver for Solomon SSD130x OLED displays")
Signed-off-by: Geert Uytterhoeven
Acked-by: Javier Martinez Canillas
---
v2:
  - Add Acked-by.

Note that instead of calling drm_fb_xrgb8888_to_mono() and transposing the bitmap, the image data could be converted to the transposed format directly. However, that would preclude exposing a monochrome format to userspace when a fourcc for such a monochrome format is introduced.
---
 drivers/gpu/drm/solomon/ssd130x.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/solomon/ssd130x.c b/drivers/gpu/drm/solomon/ssd130x.c
index caee851efd5726e7..7c99af4ce9dd4e5c 100644
--- a/drivers/gpu/drm/solomon/ssd130x.c
+++ b/drivers/gpu/drm/solomon/ssd130x.c
@@ -355,11 +355,14 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
 	unsigned int width = drm_rect_width(rect);
 	unsigned int height = drm_rect_height(rect);
 	unsigned int line_length = DIV_ROUND_UP(width, 8);
-	unsigned int pages = DIV_ROUND_UP(y % 8 + height, 8);
+	unsigned int pages = DIV_ROUND_UP(height, 8);
+	struct drm_device *drm = &ssd130x->drm;
 	u32 array_idx = 0;
 	int ret, i, j, k;
 	u8 *data_array = NULL;

+	drm_WARN_ONCE(drm, y % 8 != 0, "y must be aligned to screen page\n");
+
 	data_array = kcalloc(width, pages, GFP_KERNEL);
 	if (!data_array)
 		return -ENOMEM;
@@ -401,13 +404,13 @@ static int ssd130x_update_rect(struct ssd130x_device *ssd130x, u8 *buf,
 	if (ret < 0)
 		goto out_free;

-	for (i = y / 8; i < y / 8 + pages; i++) {
+	for (i = 0; i < pages; i++) {
 		int m = 8;

 		/* Last page may be partial */
-		if (8 * (i + 1) > ssd130x->height)
+		if (8 * (y / 8 + i + 1) > ssd130x->height)
 			m = ssd130x->height % 8;
-		for (j = x; j < x + width; j++) {
+		for (j = 0; j < width; j++) {
 			u8 data = 0;

 			for (k = 0; k < m; k++) {
@@ -454,6 +457,10 @@ static int ssd130x_fb_blit_rect(struct drm_framebuffer *fb, const struct iosys_m
 	int ret = 0;
 	u8 *buf = NULL;

+	/* Align y to display page boundaries */
+	rect->y1 = round_down(rect->y1, 8);
+	rect->y2 = min_t(unsigned int, round_up(rect->y2, 8), ssd130x->height);
+
 	buf = kcalloc(fb->width, fb->height, GFP_KERNEL);
 	if (!buf)
 		return -ENOMEM;
--
2.25.1
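
The page alignment added to ssd130x_fb_blit_rect() can be tried out in userspace with the sketch below. It assumes display pages are 8 rows high, as in the patch. The `struct rect_y` type and the helpers `round_down8()`, `round_up8()` and `align_to_pages()` are hypothetical names, standing in for the kernel's `round_down()`, `round_up()` and `min_t()`.

```c
/* Hypothetical vertical extent of an update rectangle: [y1, y2). */
struct rect_y {
	unsigned int y1;
	unsigned int y2;
};

/* Round down to a multiple of 8 (one display page is 8 rows). */
static unsigned int round_down8(unsigned int v)
{
	return v & ~7u;
}

/* Round up to a multiple of 8. */
static unsigned int round_up8(unsigned int v)
{
	return (v + 7u) & ~7u;
}

/*
 * Grow the rectangle so it starts on a page boundary and ends on one,
 * but never past the bottom of the display.
 */
static void align_to_pages(struct rect_y *r, unsigned int height)
{
	unsigned int y2 = round_up8(r->y2);

	r->y1 = round_down8(r->y1);
	r->y2 = y2 < height ? y2 : height;
}
```

A damage rectangle covering rows 6 to 14 on a 32-row display becomes rows 0 to 16, so both touched pages are converted in full. On a 30-row display, rows 25 to 30 become rows 24 to 30, which leaves the last page partial.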
[PATCH v2 5/5] drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()
As the temporary buffer is no longer used to store 8-bit grayscale data, its size can be reduced to the size needed to store the monochrome bitmap data.

Fixes: 24c6bedefbe71de9 ("drm/repaper: Use format helper for xrgb8888 to monochrome conversion")
Signed-off-by: Geert Uytterhoeven
Reviewed-by: Javier Martinez Canillas
---
v2:
  - Add Reviewed-by.

Untested due to lack of hardware.

I replaced kmalloc_array() by kmalloc() to match size calculations in other locations in this driver. There is no point in handling a possible multiplication overflow only here.
---
 drivers/gpu/drm/tiny/repaper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
index a096fb8b83e99dc8..7738b87f370ad147 100644
--- a/drivers/gpu/drm/tiny/repaper.c
+++ b/drivers/gpu/drm/tiny/repaper.c
@@ -530,7 +530,7 @@ static int repaper_fb_dirty(struct drm_framebuffer *fb)
 	DRM_DEBUG("Flushing [FB:%d] st=%ums\n", fb->base.id,
 		  epd->factored_stage_time);

-	buf = kmalloc_array(fb->width, fb->height, GFP_KERNEL);
+	buf = kmalloc(fb->width * fb->height / 8, GFP_KERNEL);
 	if (!buf) {
 		ret = -ENOMEM;
 		goto out_exit;
--
2.25.1
[PATCH v2 0/5] drm: Fix monochrome conversion for ssd130x
	Hi all,

This patch series contains fixes and improvements for the XRGB888 to
monochrome conversion in the DRM core, and for its users.

This has been tested on an Adafruit FeatherWing 128x32 OLED, connected
to an OrangeCrab ECP5 FPGA board running a 64 MHz VexRiscv RISC-V
softcore, using a text console with 4x6, 7x14 and 8x8 fonts.

Thanks!

Geert Uytterhoeven (5):
  drm/format-helper: Rename drm_fb_xrgb_to_mono_reversed()
  drm/format-helper: Fix XRGB888 to monochrome conversion
  drm/ssd130x: Fix rectangle updates
  drm/ssd130x: Reduce temporary buffer sizes
  drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()

 drivers/gpu/drm/drm_format_helper.c | 74 +++--
 drivers/gpu/drm/solomon/ssd130x.c   | 24 +++---
 drivers/gpu/drm/tiny/repaper.c      |  4 +-
 include/drm/drm_format_helper.h     |  5 +-
 4 files changed, 48 insertions(+), 59 deletions(-)

--
2.25.1

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
[PATCH v2 1/5] drm/format-helper: Rename drm_fb_xrgb8888_to_mono_reversed()
There is no "reversed" handling in drm_fb_xrgb_to_mono_reversed(): the function just converts from color to grayscale, and reduces the number of grayscale levels from 256 to 2 (i.e. brightness 0-127 is mapped to 0, 128-255 to 1). All "reversed" handling is done in the repaper driver, where this function originated. Hence make this clear by renaming drm_fb_xrgb_to_mono_reversed() to drm_fb_xrgb_to_mono(), and documenting the black/white pixel mapping. Fixes: bcf8b616deb87941 ("drm/format-helper: Add drm_fb_xrgb_to_mono_reversed()") Signed-off-by: Geert Uytterhoeven Acked-by: Javier Martinez Canillas Reviewed-by: Andy Shevchenko --- v2: - Add Acked-by, Reviewed-by, - Join 2 lines. --- drivers/gpu/drm/drm_format_helper.c | 31 ++--- drivers/gpu/drm/solomon/ssd130x.c | 2 +- drivers/gpu/drm/tiny/repaper.c | 2 +- include/drm/drm_format_helper.h | 5 ++--- 4 files changed, 19 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c index bc0f49773868a9b0..5d9d0c695845f575 100644 --- a/drivers/gpu/drm/drm_format_helper.c +++ b/drivers/gpu/drm/drm_format_helper.c @@ -594,8 +594,8 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for } EXPORT_SYMBOL(drm_fb_blit_toio); -static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, unsigned int pixels, - unsigned int start_offset, unsigned int end_len) +static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int pixels, + unsigned int start_offset, unsigned int end_len) { unsigned int xb, i; @@ -621,8 +621,8 @@ static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, unsigned } /** - * drm_fb_xrgb_to_mono_reversed - Convert XRGB to reversed monochrome - * @dst: reversed monochrome destination buffer + * drm_fb_xrgb_to_mono - Convert XRGB to monochrome + * @dst: monochrome destination buffer (0=black, 1=white) * @dst_pitch: Number of bytes between two consecutive scanlines within dst * @src: XRGB source 
buffer * @fb: DRM framebuffer @@ -633,10 +633,10 @@ static void drm_fb_gray8_to_mono_reversed_line(u8 *dst, const u8 *src, unsigned * and use this function to convert to the native format. * * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and - * then the result is converted from grayscale to reversed monohrome. + * then the result is converted from grayscale to monochrome. */ -void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const void *vaddr, - const struct drm_framebuffer *fb, const struct drm_rect *clip) +void drm_fb_xrgb_to_mono(void *dst, unsigned int dst_pitch, const void *vaddr, +const struct drm_framebuffer *fb, const struct drm_rect *clip) { unsigned int linepixels = drm_rect_width(clip); unsigned int lines = clip->y2 - clip->y1; @@ -652,8 +652,8 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const v return; /* -* The reversed mono destination buffer contains 1 bit per pixel -* and destination scanlines have to be in multiple of 8 pixels. +* The mono destination buffer contains 1 bit per pixel and +* destination scanlines have to be in multiple of 8 pixels. */ if (!dst_pitch) dst_pitch = DIV_ROUND_UP(linepixels, 8); @@ -664,9 +664,9 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const v * The cma memory is write-combined so reads are uncached. * Speed up by fetching one line at a time. * -* Also, format conversion from XR24 to reversed monochrome -* are done line-by-line but are converted to 8-bit grayscale -* as an intermediate step. +* Also, format conversion from XR24 to monochrome are done +* line-by-line but are converted to 8-bit grayscale as an +* intermediate step. * * Allocate a buffer to be used for both copying from the cma * memory and to store the intermediate grayscale line pixels. @@ -683,7 +683,7 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const v * are not aligned to multiple of 8. 
* * Calculate if the start and end pixels are not aligned and set the -* offsets for the reversed mono line conversion function to adjust. +* offsets for the mono line conversion function to adjust. */ start_offset = clip->x1 % 8; end_len = clip->x2 % 8; @@ -692,12 +692,11 @@ void drm_fb_xrgb_to_mono_reversed(void *dst, unsigned int dst_pitch, const v for (y = 0; y < lines; y++) { src32 = memcpy(src32, vaddr, len_src32);
[PATCH v2 2/5] drm/format-helper: Fix XRGB888 to monochrome conversion
The conversion functions drm_fb_xrgb_to_mono() and
drm_fb_gray8_to_mono_line() do not behave correctly when the
horizontal boundaries of the clip rectangle are not multiples of 8:
  a. When x1 % 8 != 0, the calculated pitch is not correct,
  b. When x2 % 8 != 0, the pixel data for the last byte is wrong.

Simplify the code and fix (a) by:
  1. Removing start_offset, and always storing the first pixel in the
     first bit of the monochrome destination buffer.
     Drivers that require the first pixel in a byte to be located at an
     x-coordinate that is a multiple of 8 can always align the clip
     rectangle before calling drm_fb_xrgb_to_mono().
     Note that:
       - The ssd130x driver does not need the alignment, as the
	 monochrome buffer is a temporary format,
       - The repaper driver always updates the full screen, so the clip
	 rectangle is always aligned.
  2. Passing the number of pixels to drm_fb_gray8_to_mono_line(),
     instead of the number of bytes, and the number of pixels in the
     last byte.

Fix (b) by explicitly setting the target bit, instead of always setting
bit 7 and shifting the value in each loop iteration.

Remove the bogus pitch check, which operates on bytes instead of pixels,
and triggers when e.g. flashing the cursor on a text console with a font
that is 8 pixels wide.

Drop the confusing comment about scanlines, as a pitch in bytes always
contains a multiple of 8 pixels.

While at it, use the drm_rect_height() helper instead of open-coding the
same operation.

Update the comments accordingly.

Fixes: bcf8b616deb87941 ("drm/format-helper: Add drm_fb_xrgb_to_mono_reversed()")
Signed-off-by: Geert Uytterhoeven
Acked-by: Javier Martinez Canillas
Reviewed-by: Andy Shevchenko
---
v2:
  - Add Acked-by, Reviewed-by,
  - Use ">= 128" instead of "& BIT(7)" to increase readability.

I tried hard to fix this in small steps, but everything was so
intertangled that this turned out to be unfeasible.
Note that making these changes does not introduce regressions in the ssd130x driver, as the latter is broken for x1 != 0 or y1 != 0 anyway. --- drivers/gpu/drm/drm_format_helper.c | 55 ++--- 1 file changed, 18 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/drm_format_helper.c b/drivers/gpu/drm/drm_format_helper.c index 5d9d0c695845f575..e085f855a199013f 100644 --- a/drivers/gpu/drm/drm_format_helper.c +++ b/drivers/gpu/drm/drm_format_helper.c @@ -594,27 +594,16 @@ int drm_fb_blit_toio(void __iomem *dst, unsigned int dst_pitch, uint32_t dst_for } EXPORT_SYMBOL(drm_fb_blit_toio); -static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int pixels, - unsigned int start_offset, unsigned int end_len) -{ - unsigned int xb, i; - - for (xb = 0; xb < pixels; xb++) { - unsigned int start = 0, end = 8; - u8 byte = 0x00; - - if (xb == 0 && start_offset) - start = start_offset; - if (xb == pixels - 1 && end_len) - end = end_len; - - for (i = start; i < end; i++) { - unsigned int x = xb * 8 + i; +static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int pixels) +{ + while (pixels) { + unsigned int i, bits = min(pixels, 8U); + u8 byte = 0; - byte >>= 1; - if (src[x] >> 7) - byte |= BIT(7); + for (i = 0; i < bits; i++, pixels--) { + if (*src++ >= 128) + byte |= BIT(i); } *dst++ = byte; } @@ -634,16 +623,22 @@ static void drm_fb_gray8_to_mono_line(u8 *dst, const u8 *src, unsigned int pixel * * This function uses drm_fb_xrgb_to_gray8() to convert to grayscale and * then the result is converted from grayscale to monochrome. + * + * The first pixel (upper left corner of the clip rectangle) will be converted + * and copied to the first bit (LSB) in the first byte of the monochrome + * destination buffer. + * If the caller requires that the first pixel in a byte must be located at an + * x-coordinate that is a multiple of 8, then the caller must take care itself + * of supplying a suitable clip rectangle. 
*/ void drm_fb_xrgb_to_mono(void *dst, unsigned int dst_pitch, const void *vaddr, const struct drm_framebuffer *fb, const struct drm_rect *clip) { unsigned int linepixels = drm_rect_width(clip); - unsigned int lines = clip->y2 - clip->y1; + unsigned int lines = drm_rect_height(clip); unsigned int cpp = fb->format->cpp[0]; unsigned int len_src32 = linepixels * cpp; struct drm_device *dev = fb->dev; - unsigned int start_offset, end_len; unsigned int y; u8 *mono = dst, *gray8; u32 *s
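The new bit-packing loop is easy to try outside the kernel. Below is a userspace sketch that follows the structure of the patched drm_fb_gray8_to_mono_line(); the name `gray8_to_mono_line()` and the use of `uint8_t` with an open-coded minimum and shift in place of `u8`, `min()` and `BIT()` are adaptations for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack 8-bit grayscale samples into 1 bit per pixel, first pixel in bit 0.
 * Samples >= 128 become 1 (white), everything else 0 (black).
 */
static void gray8_to_mono_line(uint8_t *dst, const uint8_t *src,
			       unsigned int pixels)
{
	assert(dst && src);

	while (pixels) {
		/* The last byte may hold fewer than 8 pixels. */
		unsigned int i, bits = pixels < 8 ? pixels : 8;
		uint8_t byte = 0;

		for (i = 0; i < bits; i++, pixels--) {
			if (*src++ >= 128)
				byte |= (uint8_t)(1U << i);
		}
		*dst++ = byte;
	}
}
```

Because each target bit is set explicitly, a partial last byte leaves its unused high bits at zero, which is the behaviour described under (b) in the commit message.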
Re: [RFC PATCH 1/4] drm/amdkfd: Improve amdgpu_vm_handle_moved
On 17.03.22 at 01:20, Felix Kuehling wrote:
> Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by
> the caller. This will be useful for handling extra BO VA mappings in
> KFD VMs that are managed through the render node API.

Yes, that change has been on my TODO list for quite a while as well.

> TODO: This may also allow simplification of amdgpu_cs_vm_handling. See
> the TODO comment in the code.

No, that won't work just yet.

We need to change the TLB flush detection for that, but I'm already
working on those as well.

> Signed-off-by: Felix Kuehling

Please update the TODO, with that done: Reviewed-by: Christian König

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c      |  6 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 18 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h      |  3 ++-
>  4 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index d162243d8e78..10941f0d8dde 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -826,6 +826,10 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>  			return r;
>  	}
>
> +	/* TODO: Is this loop still needed, or could this be handled by
> +	 * amdgpu_vm_handle_moved, now that it can handle all BOs that are
> +	 * reserved under p->ticket?
> +	 */
>  	amdgpu_bo_list_for_each_entry(e, p->bo_list) {
>  		/* ignore duplicates */
>  		bo = ttm_to_amdgpu_bo(e->tv.bo);
> @@ -845,7 +849,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
>  			return r;
>  	}
>
> -	r = amdgpu_vm_handle_moved(adev, vm);
> +	r = amdgpu_vm_handle_moved(adev, vm, &p->ticket);
>  	if (r)
>  		return r;
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 579adfafe4d0..50805613c38c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -414,7 +414,7 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
>
>  		r = amdgpu_vm_clear_freed(adev, vm, NULL);
>  		if (!r)
> -			r = amdgpu_vm_handle_moved(adev, vm);
> +			r = amdgpu_vm_handle_moved(adev, vm, ticket);
>
>  		if (r && r != -EBUSY)
>  			DRM_ERROR("Failed to invalidate VM page tables (%d))\n",
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index fc4563cf2828..726b42c6d606 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2190,11 +2190,12 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>   * PTs have to be reserved!
>   */
>  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
> -			   struct amdgpu_vm *vm)
> +			   struct amdgpu_vm *vm,
> +			   struct ww_acquire_ctx *ticket)
>  {
>  	struct amdgpu_bo_va *bo_va, *tmp;
>  	struct dma_resv *resv;
> -	bool clear;
> +	bool clear, unlock;
>  	int r;
>
>  	list_for_each_entry_safe(bo_va, tmp, &vm->moved, base.vm_status) {
> @@ -2212,17 +2213,24 @@ int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
>  		spin_unlock(&vm->invalidated_lock);
>
>  		/* Try to reserve the BO to avoid clearing its ptes */
> -		if (!amdgpu_vm_debug && dma_resv_trylock(resv))
> +		if (!amdgpu_vm_debug && dma_resv_trylock(resv)) {
>  			clear = false;
> +			unlock = true;
> +		/* The caller is already holding the reservation lock */
> +		} else if (ticket && dma_resv_locking_ctx(resv) == ticket) {
> +			clear = false;
> +			unlock = false;
>  		/* Somebody else is using the BO right now */
> -		else
> +		} else {
>  			clear = true;
> +			unlock = false;
> +		}
>
>  		r = amdgpu_vm_bo_update(adev, bo_va, clear, NULL);
>  		if (r)
>  			return r;
>
> -		if (!clear)
> +		if (unlock)
>  			dma_resv_unlock(resv);
>  		spin_lock(&vm->invalidated_lock);
>  	}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index a40a6a993bb0..120a76aaae75 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -396,7 +396,8 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
>  			  struct amdgpu_vm *vm,
>  			  struct dma_fence **fence);
>  int amdgpu_vm_handle_moved(struct amdgpu_device *adev,
> -			   struct amdgpu_vm *vm);
> +			   struct amdgpu_vm *vm,
> +
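The three-way decision the patch introduces (lock taken here, lock already held by the caller, lock held by someone else) can be modelled without any kernel types. The sketch below is only an illustration: `struct resv_model`, `struct decision` and `decide()` are hypothetical, the lock is reduced to an owner pointer, and the `amdgpu_vm_debug` condition from the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a reservation object: who holds it, if anyone. */
struct resv_model {
	const void *locked_by;	/* NULL when unlocked */
};

struct decision {
	bool clear;	/* clear the PTEs instead of updating them */
	bool unlock;	/* this call took the lock and must drop it */
};

static const int trylock_owner = 0;	/* token for a lock taken by trylock */

static struct decision decide(struct resv_model *resv, const void *ticket)
{
	struct decision d;

	assert(resv);

	if (!resv->locked_by) {
		/* trylock succeeds: we own the lock and must release it */
		resv->locked_by = &trylock_owner;
		d.clear = false;
		d.unlock = true;
	} else if (ticket && resv->locked_by == ticket) {
		/* the caller already holds the reservation */
		d.clear = false;
		d.unlock = false;
	} else {
		/* somebody else is using the BO right now */
		d.clear = true;
		d.unlock = false;
	}
	return d;
}
```

Splitting `unlock` from `clear` is what makes the middle case possible: the mapping can be updated without clearing, yet the lock is left alone because the caller owns it.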
Re: [PATCH 3/3] drm/msm: Add a way to override processes comm/cmdline
On Wed, Mar 16, 2022 at 05:29:45PM -0700, Rob Clark wrote:
>  	switch (param) {
> +	case MSM_PARAM_COMM:
> +	case MSM_PARAM_CMDLINE: {
> +		char *str, **paramp;
> +
> +		str = kmalloc(len + 1, GFP_KERNEL);

	if (!str)
		return -ENOMEM;

> +		if (copy_from_user(str, u64_to_user_ptr(value), len)) {
> +			kfree(str);
> +			return -EFAULT;
> +		}
> +
> +		/* Ensure string is null terminated: */
> +		str[len] = '\0';
> +
> +		if (param == MSM_PARAM_COMM) {
> +			paramp = &ctx->comm;
> +		} else {
> +			paramp = &ctx->cmdline;
> +		}
> +
> +		kfree(*paramp);
> +		*paramp = str;
> +
> +		return 0;
> +	}
>  	case MSM_PARAM_SYSPROF:
>  		if (!capable(CAP_SYS_ADMIN))
>  			return -EPERM;
> diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> index 4ec62b601adc..68f3f8ade76d 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.c
> +++ b/drivers/gpu/drm/msm/msm_gpu.c
> @@ -364,14 +364,21 @@ static void retire_submits(struct msm_gpu *gpu);
>
>  static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm,
> char **cmd)
>  {
> +	struct msm_file_private *ctx = submit->queue->ctx;
>  	struct task_struct *task;
>
> +	*comm = kstrdup(ctx->comm, GFP_KERNEL);
> +	*cmd = kstrdup(ctx->cmdline, GFP_KERNEL);
> +
>  	task = get_pid_task(submit->pid, PIDTYPE_PID);
>  	if (!task)
>  		return;
>
> -	*comm = kstrdup(task->comm, GFP_KERNEL);
> -	*cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
> +	if (!*comm)
> +		*comm = kstrdup(task->comm, GFP_KERNEL);

What?  If the first allocation failed, then this one is going to fail
as well.  Just return -ENOMEM.  Or maybe this is meant to be checking
for an empty string?

> +
> +	if (!*cmd)
> +		*cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);

Same.

>
>  	put_task_struct(task);
>  }

regards,
dan carpenter
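The first review comment asks for an allocation check before the buffer is used. A userspace sketch of that pattern follows, with malloc()/free() standing in for kmalloc()/kfree() and memcpy() standing in for copy_from_user(); `dup_unterminated()` and `set_override()` are hypothetical names, and the -1 return value merely stands in for -ENOMEM.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy `len` bytes that may lack a terminator into a fresh,
 * NUL-terminated string. Returns NULL if the allocation fails.
 */
static char *dup_unterminated(const char *src, size_t len)
{
	char *str;

	assert(src || len == 0);

	str = malloc(len + 1);
	if (!str)
		return NULL;	/* the check the review asks for */

	if (len)
		memcpy(str, src, len);
	str[len] = '\0';
	return str;
}

/* Replace *slot with a new string, freeing the old value (may be NULL). */
static int set_override(char **slot, const char *src, size_t len)
{
	char *str = dup_unterminated(src, len);

	if (!str)
		return -1;	/* stands in for -ENOMEM */

	free(*slot);
	*slot = str;
	return 0;
}
```

The old value is released only after the new allocation has succeeded, so a failed call leaves the previous override in place.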
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On 3/16/2022 10:50 PM, Rob Clark wrote:
> On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma wrote:
>> From: Shashank Sharma
>>
>> This patch adds a new sysfs event, which will indicate
>> the userland about a GPU reset, and can also provide
>> some information like:
>> - process ID of the process involved with the GPU reset
>> - process name of the involved process
>> - the GPU status info (using flags)
>>
>> This patch also introduces the first flag of the flags
>> bitmap, which can be appended as and when required.
>>
>> V2: Addressed review comments from Christian and Amar
>>    - move the reset information structure to DRM layer
>>    - drop _ctx from struct name
>>    - make pid 32 bit(than 64)
>>    - set flag when VRAM invalid (than valid)
>>    - add process name as well (Amar)
>>
>> Cc: Alexandar Deucher
>> Cc: Christian Koenig
>> Cc: Amaranath Somalapuram
>> Signed-off-by: Shashank Sharma
>> ---
>>  drivers/gpu/drm/drm_sysfs.c | 31 +++
>>  include/drm/drm_sysfs.h     | 10 ++
>>  2 files changed, 41 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
>> index 430e00b16eec..840994810910 100644
>> --- a/drivers/gpu/drm/drm_sysfs.c
>> +++ b/drivers/gpu/drm/drm_sysfs.c
>> @@ -409,6 +409,37 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
>>  }
>>  EXPORT_SYMBOL(drm_sysfs_hotplug_event);
>>
>> +/**
>> + * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
>> + * @dev: DRM device
>> + * @reset_info: The contextual information about the reset (like PID, flags)
>> + *
>> + * Send a uevent for the DRM device specified by @dev. This informs
>> + * user that a GPU reset has occurred, so that an interested client
>> + * can take any recovery or profiling measure.
>> + */
>> +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info)
>> +{
>> +	unsigned char pid_str[13];
>> +	unsigned char flags_str[15];
>> +	unsigned char pname_str[TASK_COMM_LEN + 6];
>> +	unsigned char reset_str[] = "RESET=1";
>> +	char *envp[] = { reset_str, pid_str, pname_str, flags_str, NULL };
>> +
>> +	if (!reset_info) {
>> +		DRM_WARN("No reset info, not sending the event\n");
>> +		return;
>> +	}
>> +
>> +	DRM_DEBUG("generating reset event\n");
>> +
>> +	snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%u", reset_info->pid);
>> +	snprintf(pname_str, ARRAY_SIZE(pname_str), "NAME=%s", reset_info->pname);
>> +	snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", reset_info->flags);
>> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
>> +}
>> +EXPORT_SYMBOL(drm_sysfs_reset_event);
>> +
>>  /**
>>   * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
>>   * change
>> diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
>> index 6273cac44e47..5ba11c760619 100644
>> --- a/include/drm/drm_sysfs.h
>> +++ b/include/drm/drm_sysfs.h
>> @@ -1,16 +1,26 @@
>>  /* SPDX-License-Identifier: GPL-2.0 */
>>  #ifndef _DRM_SYSFS_H_
>>  #define _DRM_SYSFS_H_
>> +#include
>> +
>> +#define DRM_GPU_RESET_FLAG_VRAM_INVALID (1 << 0)
>>
>>  struct drm_device;
>>  struct device;
>>  struct drm_connector;
>>  struct drm_property;
>>
>> +struct drm_reset_event {
>> +	uint32_t pid;
>
> One side note, unrelated to devcoredump vs this..
>
> AFAIU you probably want to be passing around a `struct pid *`, and
> then somehow use pid_vnr() in the context of the process reading the
> event to get the numeric pid.  Otherwise things will not do what you
> expect if the process triggering the crash is in a different pid
> namespace from the compositor.

I am not sure it is a good idea to add the pid extraction complexity
here; it is left up to the driver to extract this information and pass
it to the work queue. In the case of AMDGPU, it is extracted from the
GPU VM. That would also be more flexible for the drivers.

- Shashank

> BR,
> -R
>
>> +	uint32_t flags;
>> +	char pname[TASK_COMM_LEN];
>> +};
>> +
>>  int drm_class_device_register(struct device *dev);
>>  void drm_class_device_unregister(struct device *dev);
>>
>>  void drm_sysfs_hotplug_event(struct drm_device *dev);
>> +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info);
>>  void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
>>  void drm_sysfs_connector_status_event(struct drm_connector *connector,
>>  				      struct drm_property *property);
>> --
>> 2.32.0
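As a side illustration of the fixed-size environment strings in the quoted patch, the userspace sketch below formats a `KEY=value` pair and reports truncation through snprintf()'s return value. `format_u32()` is a hypothetical helper; only the buffer sizes 13 and 15 come from the patch. Whether values large enough to truncate can occur in practice depends on the system's pid limit and on how many flag bits end up being defined, so this is only a demonstration of the check, not a claim about a bug.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PID_STR_LEN   13	/* sizes taken from the quoted patch */
#define FLAGS_STR_LEN 15

/*
 * Format "KEY=value" into buf. Returns 0 on success, or -1 if the
 * result did not fit and was truncated by snprintf().
 */
static int format_u32(char *buf, size_t size, const char *key, uint32_t val)
{
	int n;

	assert(buf && key && size > 0);

	n = snprintf(buf, size, "%s=%u", key, (unsigned int)val);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A full 32-bit value needs up to 10 digits, so "PID=" plus digits plus the terminator can take 15 bytes and "FLAGS=" can take 17.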
Re: [PATCH v3 1/3] drm: allow real encoder to be passed for drm_writeback_connector
Hi Abhinav, Thank you for the patch. On Wed, Mar 16, 2022 at 11:48:16AM -0700, Abhinav Kumar wrote: > For some vendor driver implementations, display hardware can > be shared between the encoder used for writeback and the physical > display. > > In addition resources such as clocks and interrupts can > also be shared between writeback and the real encoder. > > To accommodate such vendor drivers and hardware, allow > real encoder to be passed for drm_writeback_connector using a new > drm_writeback_connector_init_with_encoder() API. The commit message doesn't match the commit. > In addition, to preserve the same call flows for the existing > users of drm_writeback_connector_init(), also allow passing > possible_crtcs as a parameter so that encoder can be initialized > with it. > > changes in v3: > - allow passing possible_crtcs for existing users of > drm_writeback_connector_init() > - squash the vendor changes into the same commit so > that each patch in the series can compile individually > > Co-developed-by: Kandpal Suraj > Signed-off-by: Abhinav Kumar > --- > .../drm/arm/display/komeda/komeda_wb_connector.c | 3 +- > drivers/gpu/drm/arm/malidp_mw.c| 5 +- > drivers/gpu/drm/drm_writeback.c| 103 > + > drivers/gpu/drm/rcar-du/rcar_du_writeback.c| 5 +- > drivers/gpu/drm/vc4/vc4_txp.c | 19 ++-- > drivers/gpu/drm/vkms/vkms_writeback.c | 3 +- > include/drm/drm_writeback.h| 22 - > 7 files changed, 103 insertions(+), 57 deletions(-) > > diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c > b/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c > index e465cc4..40774e6 100644 > --- a/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c > +++ b/drivers/gpu/drm/arm/display/komeda/komeda_wb_connector.c > @@ -155,7 +155,6 @@ static int komeda_wb_connector_add(struct komeda_kms_dev > *kms, > kwb_conn->wb_layer = kcrtc->master->wb_layer; > > wb_conn = &kwb_conn->base; > - wb_conn->encoder.possible_crtcs = BIT(drm_crtc_index(&kcrtc->base)); > > formats = 
komeda_get_layer_fourcc_list(&mdev->fmt_tbl, > kwb_conn->wb_layer->layer_type, > @@ -164,7 +163,7 @@ static int komeda_wb_connector_add(struct komeda_kms_dev > *kms, > err = drm_writeback_connector_init(&kms->base, wb_conn, > &komeda_wb_connector_funcs, > &komeda_wb_encoder_helper_funcs, > -formats, n_formats); > +formats, n_formats, > BIT(drm_crtc_index(&kcrtc->base))); > komeda_put_fourcc_list(formats); > if (err) { > kfree(kwb_conn); > diff --git a/drivers/gpu/drm/arm/malidp_mw.c b/drivers/gpu/drm/arm/malidp_mw.c > index f5847a7..b882066 100644 > --- a/drivers/gpu/drm/arm/malidp_mw.c > +++ b/drivers/gpu/drm/arm/malidp_mw.c > @@ -208,11 +208,12 @@ int malidp_mw_connector_init(struct drm_device *drm) > struct malidp_drm *malidp = drm->dev_private; > u32 *formats; > int ret, n_formats; > + uint32_t possible_crtcs; > > if (!malidp->dev->hw->enable_memwrite) > return 0; > > - malidp->mw_connector.encoder.possible_crtcs = 1 << > drm_crtc_index(&malidp->crtc); > + possible_crtcs = 1 << drm_crtc_index(&malidp->crtc); > drm_connector_helper_add(&malidp->mw_connector.base, >&malidp_mw_connector_helper_funcs); > > @@ -223,7 +224,7 @@ int malidp_mw_connector_init(struct drm_device *drm) > ret = drm_writeback_connector_init(drm, &malidp->mw_connector, > &malidp_mw_connector_funcs, > &malidp_mw_encoder_helper_funcs, > -formats, n_formats); > +formats, n_formats, possible_crtcs); Do you need the local variable ? 
> kfree(formats); > if (ret) > return ret; > diff --git a/drivers/gpu/drm/drm_writeback.c b/drivers/gpu/drm/drm_writeback.c > index dccf4504..17c1471 100644 > --- a/drivers/gpu/drm/drm_writeback.c > +++ b/drivers/gpu/drm/drm_writeback.c > @@ -149,36 +149,15 @@ static const struct drm_encoder_funcs > drm_writeback_encoder_funcs = { > .destroy = drm_encoder_cleanup, > }; > > -/** > - * drm_writeback_connector_init - Initialize a writeback connector and its > properties > - * @dev: DRM device > - * @wb_connector: Writeback connector to initialize > - * @con_funcs: Connector funcs vtable > - * @enc_helper_funcs: Encoder helper funcs vtable to be used by the internal > encoder > - * @formats: Array of supported pixel formats for the writeback engine > - * @n_formats: Length of the formats array > - * > - * Th
[PATCH 1/3] dt-bindings: display: bridge: it66121: Add audio support
Update the ITE bridge HDMI it66121 bindings in order to support
audio.

Signed-off-by: Nicolas Belin
---
 .../devicetree/bindings/display/bridge/ite,it66121.yaml | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml b/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
index 6ec1d5fbb8bc..c6e81f532215 100644
--- a/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
+++ b/Documentation/devicetree/bindings/display/bridge/ite,it66121.yaml
@@ -38,6 +38,9 @@ properties:
   interrupts:
     maxItems: 1

+  "#sound-dai-cells":
+    const: 0
+
   ports:
     $ref: /schemas/graph.yaml#/properties/ports

--
2.25.1
[PATCH 0/3] drm: bridge: it66121: Add audio support
This patch series adds the audio support on the it66121 HDMI bridge.

Patch 1 updates the ITE 66121 HDMI bridge bindings in order to support
audio.

Patch 2 sets the register page length or window length of the ITE 66121
HDMI bridge to 0x100 according to the documentation.

Patch 3 contains the actual driver modifications in order to add the
audio support on the ITE 66121 HDMI bridge.

Nicolas Belin (3):
  dt-bindings: display: bridge: it66121: Add audio support
  drm: bridge: it66121: Fix the register page length
  drm: bridge: it66121: Add audio support

 .../bindings/display/bridge/ite,it66121.yaml |   3 +
 drivers/gpu/drm/bridge/ite-it66121.c         | 629 +-
 2 files changed, 631 insertions(+), 1 deletion(-)

--
2.25.1
[PATCH 2/3] drm: bridge: it66121: Fix the register page length
Set the register page length or window length to
0x100 according to the documentation.

Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver")
Signed-off-by: Nicolas Belin
---
 drivers/gpu/drm/bridge/ite-it66121.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c
index 06b59b422c69..64912b770086 100644
--- a/drivers/gpu/drm/bridge/ite-it66121.c
+++ b/drivers/gpu/drm/bridge/ite-it66121.c
@@ -227,7 +227,7 @@ static const struct regmap_range_cfg it66121_regmap_banks[] = {
 		.selector_mask = 0x1,
 		.selector_shift = 0,
 		.window_start = 0x00,
-		.window_len = 0x130,
+		.window_len = 0x100,
 	},
 };

--
2.25.1
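Why the window length matters can be shown with a little arithmetic. The sketch below rests on two assumptions that the quoted hunk does not confirm: that the paged range starts at virtual address 0x00, and that a virtual register is split into a page number and a bus address by dividing by the window length, which is how regmap's indirect ranges are generally understood to work. `struct paged_range`, `struct paged_addr` and `translate()` are hypothetical names, not regmap API.

```c
#include <assert.h>

/* Hypothetical, simplified view of a paged register range. */
struct paged_range {
	unsigned int range_min;		/* first virtual register address */
	unsigned int window_start;	/* where the window sits on the bus */
	unsigned int window_len;	/* registers visible per page */
};

struct paged_addr {
	unsigned int page;	/* value for the page selector field */
	unsigned int reg;	/* address actually put on the bus */
};

static struct paged_addr translate(const struct paged_range *r,
				   unsigned int reg)
{
	struct paged_addr a;

	assert(r && r->window_len > 0 && reg >= r->range_min);

	a.page = (reg - r->range_min) / r->window_len;
	a.reg = r->window_start + (reg - r->range_min) % r->window_len;
	return a;
}
```

Under these assumptions, a bank-1 register such as 0x130 maps to page 1, bus address 0x30 with a window of 0x100, but to page 1, bus address 0x00 with a window of 0x130.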
[PATCH 3/3] drm: bridge: it66121: Add audio support
Adding the audio support on the HDMI bridge for I2S only. Signed-off-by: Nicolas Belin Signed-off-by: Andy.Hsieh --- drivers/gpu/drm/bridge/ite-it66121.c | 627 +++ 1 file changed, 627 insertions(+) diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c index 64912b770086..514989676d07 100644 --- a/drivers/gpu/drm/bridge/ite-it66121.c +++ b/drivers/gpu/drm/bridge/ite-it66121.c @@ -27,6 +27,8 @@ #include #include +#include + #define IT66121_VENDOR_ID0_REG 0x00 #define IT66121_VENDOR_ID1_REG 0x01 #define IT66121_DEVICE_ID0_REG 0x02 @@ -155,6 +157,9 @@ #define IT66121_AV_MUTE_ON BIT(0) #define IT66121_AV_MUTE_BLUESCRBIT(1) +#define IT66121_PKT_CTS_CTRL_REG 0xC5 +#define IT66121_PKT_CTS_CTRL_SEL BIT(1) + #define IT66121_PKT_GEN_CTRL_REG 0xC6 #define IT66121_PKT_GEN_CTRL_ONBIT(0) #define IT66121_PKT_GEN_CTRL_RPT BIT(1) @@ -202,6 +207,89 @@ #define IT66121_EDID_SLEEP_US 2 #define IT66121_EDID_TIMEOUT_US20 #define IT66121_EDID_FIFO_SIZE 32 + +#define IT66121_CLK_CTRL0_REG 0x58 +#define IT66121_CLK_CTRL0_AUTO_OVER_SAMPLING BIT(4) +#define IT66121_CLK_CTRL0_EXT_MCLK_MASKGENMASK(3, 2) +#define IT66121_CLK_CTRL0_EXT_MCLK_128FS (0 << 2) +#define IT66121_CLK_CTRL0_EXT_MCLK_256FS BIT(2) +#define IT66121_CLK_CTRL0_EXT_MCLK_512FS (2 << 2) +#define IT66121_CLK_CTRL0_EXT_MCLK_1024FS (3 << 2) +#define IT66121_CLK_CTRL0_AUTO_IPCLK BIT(0) +#define IT66121_CLK_STATUS1_REG0x5E +#define IT66121_CLK_STATUS2_REG0x5F + +#define IT66121_AUD_CTRL0_REG 0xE0 +#define IT66121_AUD_SWL(3 << 6) +#define IT66121_AUD_16BIT (0 << 6) +#define IT66121_AUD_18BIT BIT(6) +#define IT66121_AUD_20BIT (2 << 6) +#define IT66121_AUD_24BIT (3 << 6) +#define IT66121_AUD_SPDIFTCBIT(5) +#define IT66121_AUD_SPDIF BIT(4) +#define IT66121_AUD_I2S(0 << 4) +#define IT66121_AUD_EN_I2S3BIT(3) +#define IT66121_AUD_EN_I2S2BIT(2) +#define IT66121_AUD_EN_I2S1BIT(1) +#define IT66121_AUD_EN_I2S0BIT(0) +#define IT66121_AUD_CTRL0_AUD_SEL BIT(4) + +#define IT66121_AUD_CTRL1_REG 0xE1 +#define 
IT66121_AUD_FIFOMAP_REG0xE2 +#define IT66121_AUD_CTRL3_REG 0xE3 +#define IT66121_AUD_SRCVALID_FLAT_REG 0xE4 +#define IT66121_AUD_FLAT_SRC0 BIT(4) +#define IT66121_AUD_FLAT_SRC1 BIT(5) +#define IT66121_AUD_FLAT_SRC2 BIT(6) +#define IT66121_AUD_FLAT_SRC3 BIT(7) +#define IT66121_AUD_HDAUDIO_REG0xE5 + +#define IT66121_AUD_PKT_CTS0_REG 0x130 +#define IT66121_AUD_PKT_CTS1_REG 0x131 +#define IT66121_AUD_PKT_CTS2_REG 0x132 +#define IT66121_AUD_PKT_N0_REG 0x133 +#define IT66121_AUD_PKT_N1_REG 0x134 +#define IT66121_AUD_PKT_N2_REG 0x135 + +#define IT66121_AUD_CHST_MODE_REG 0x191 +#define IT66121_AUD_CHST_CAT_REG 0x192 +#define IT66121_AUD_CHST_SRCNUM_REG0x193 +#define IT66121_AUD_CHST_CHTNUM_REG0x194 +#define IT66121_AUD_CHST_CA_FS_REG 0x198 +#define IT66121_AUD_CHST_OFS_WL_REG0x199 + +#define IT66121_AUD_PKT_CTS_CNT0_REG 0x1A0 +#define IT66121_AUD_PKT_CTS_CNT1_REG 0x1A1 +#define IT66121_AUD_PKT_CTS_CNT2_REG 0x1A2 + +#define IT66121_AUD_FS_22P05K 0x4 +#define IT66121_AUD_FS_44P1K 0x0 +#define IT66121_AUD_FS_88P2K 0x8 +#define IT66121_AUD_FS_176P4K 0xC +#define IT66121_AUD_FS_24K 0x6 +#define IT66121_AUD_FS_48K 0x2 +#define IT66121_AUD_FS_96K 0xA +#define IT66121_AUD_FS_192K0xE +#define IT66121_AUD_FS_768K0x9 +#define IT66121_AUD_FS_32K 0x3 +#define IT66121_AUD_FS_OTHER 0x1 + +#define IT66121_AUD_SWL_21BIT 0xD +#define IT66121_AUD_SWL_24BIT 0xB +#define IT66121_AUD_SWL_23BIT 0x9 +#define IT66121_AUD_SWL_22BIT 0x5 +#define IT66121_AUD_SWL_20BIT 0x3 +#define IT66121_AUD_SWL_17BIT 0xC +#define IT66121_AUD_SWL_19BIT 0x8 +#define IT66121_AUD_SWL_18BIT 0x4 +#define IT66121_AUD_SWL_16BIT
Re: [PATCH,v2] drm/panel: Fix return value check in nt35950_probe()
On 17/03/22 09:37, Lu Wei wrote:
> In function nt35950_probe(), mipi_dsi_device_register_full() is called
> to create a MIPI DSI device. If it fails, a pointer encoded with an error
> will be returned, so use IS_ERR() to check the return value. Besides, use
> PTR_ERR to return the actual errno.
>
> Fixes: 623a3531e9cf ("drm/panel: Add driver for Novatek NT35950 DSI DriverIC panels")
> Signed-off-by: Lu Wei

Reviewed-by: AngeloGioacchino Del Regno

Thanks!

> ---
>  drivers/gpu/drm/panel/panel-novatek-nt35950.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/panel/panel-novatek-nt35950.c b/drivers/gpu/drm/panel/panel-novatek-nt35950.c
> index 288c7fa83ecc..d252e5e56228 100644
> --- a/drivers/gpu/drm/panel/panel-novatek-nt35950.c
> +++ b/drivers/gpu/drm/panel/panel-novatek-nt35950.c
> @@ -579,9 +579,9 @@ static int nt35950_probe(struct mipi_dsi_device *dsi)
>  		}
>
>  		nt->dsi[1] = mipi_dsi_device_register_full(dsi_r_host, info);
> -		if (!nt->dsi[1]) {
> +		if (IS_ERR(nt->dsi[1])) {
>  			dev_err(dev, "Cannot get secondary DSI node\n");
> -			return -ENODEV;
> +			return PTR_ERR(nt->dsi[1]);
>  		}
>  		num_dsis++;
>  	}
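The reason a NULL check is not enough here is that an error pointer is a non-NULL value near the top of the address space. The following userspace sketch re-creates the idea behind the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(); the lowercase names are illustrative stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* same bound the kernel uses */

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the last MAX_ERRNO values of the address space. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

An encoded -ENODEV (-19) is therefore not NULL, so the old `!nt->dsi[1]` test would have treated a failed registration as success.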
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
Am 17.03.22 um 09:42 schrieb Sharma, Shashank: On 3/16/2022 10:50 PM, Rob Clark wrote: On Tue, Mar 8, 2022 at 11:40 PM Shashank Sharma wrote: From: Shashank Sharma This patch adds a new sysfs event, which will indicate the userland about a GPU reset, and can also provide some information like: - process ID of the process involved with the GPU reset - process name of the involved process - the GPU status info (using flags) This patch also introduces the first flag of the flags bitmap, which can be appended as and when required. V2: Addressed review comments from Christian and Amar - move the reset information structure to DRM layer - drop _ctx from struct name - make pid 32 bit(than 64) - set flag when VRAM invalid (than valid) - add process name as well (Amar) Cc: Alexandar Deucher Cc: Christian Koenig Cc: Amaranath Somalapuram Signed-off-by: Shashank Sharma --- drivers/gpu/drm/drm_sysfs.c | 31 +++ include/drm/drm_sysfs.h | 10 ++ 2 files changed, 41 insertions(+) diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c index 430e00b16eec..840994810910 100644 --- a/drivers/gpu/drm/drm_sysfs.c +++ b/drivers/gpu/drm/drm_sysfs.c @@ -409,6 +409,37 @@ void drm_sysfs_hotplug_event(struct drm_device *dev) } EXPORT_SYMBOL(drm_sysfs_hotplug_event); +/** + * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset + * @dev: DRM device + * @reset_info: The contextual information about the reset (like PID, flags) + * + * Send a uevent for the DRM device specified by @dev. This informs + * user that a GPU reset has occurred, so that an interested client + * can take any recovery or profiling measure. 
+ */ +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info) +{ + unsigned char pid_str[13]; + unsigned char flags_str[15]; + unsigned char pname_str[TASK_COMM_LEN + 6]; + unsigned char reset_str[] = "RESET=1"; + char *envp[] = { reset_str, pid_str, pname_str, flags_str, NULL }; + + if (!reset_info) { + DRM_WARN("No reset info, not sending the event\n"); + return; + } + + DRM_DEBUG("generating reset event\n"); + + snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%u", reset_info->pid); + snprintf(pname_str, ARRAY_SIZE(pname_str), "NAME=%s", reset_info->pname); + snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", reset_info->flags); + kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp); +} +EXPORT_SYMBOL(drm_sysfs_reset_event); + /** * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector * change diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h index 6273cac44e47..5ba11c760619 100644 --- a/include/drm/drm_sysfs.h +++ b/include/drm/drm_sysfs.h @@ -1,16 +1,26 @@ /* SPDX-License-Identifier: GPL-2.0 */ #ifndef _DRM_SYSFS_H_ #define _DRM_SYSFS_H_ +#include + +#define DRM_GPU_RESET_FLAG_VRAM_INVALID (1 << 0) struct drm_device; struct device; struct drm_connector; struct drm_property; +struct drm_reset_event { + uint32_t pid; One side note, unrelated to devcoredump vs this.. AFAIU you probably want to be passing around a `struct pid *`, and then somehow use pid_vnr() in the context of the process reading the event to get the numeric pid. Otherwise things will not do what you expect if the process triggering the crash is in a different pid namespace from the compositor. I am not sure if it is a good idea to add the pid extraction complexity in here, it is left upto the driver to extract this information and pass it to the work queue. In case of AMDGPU, its extracted from GPU VM. It would be then more flexible for the drivers as well. Yeah, but that is just used for debugging. 
If we want to use the pid for housekeeping, like for a daemon which kills/restarts processes, we absolutely need that or otherwise won't be able to work with containers. Regards, Christian. - Shashank BR, -R + uint32_t flags; + char pname[TASK_COMM_LEN]; +}; + int drm_class_device_register(struct device *dev); void drm_class_device_unregister(struct device *dev); void drm_sysfs_hotplug_event(struct drm_device *dev); +void drm_sysfs_reset_event(struct drm_device *dev, struct drm_reset_event *reset_info); void drm_sysfs_connector_hotplug_event(struct drm_connector *connector); void drm_sysfs_connector_status_event(struct drm_connector *connector, struct drm_property *property); -- 2.32.0
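As a side illustration of the "KEY=value" strings the patch builds for the uevent, here is a small userspace sketch of the buffer sizing. It is not part of the patch; format_u32_env() and the *_STR_LEN macros are hypothetical names. The arithmetic suggests that the 13 and 15 byte buffers in the patch are enough for realistic pids (Linux caps pids well below 2^32) and for the single flag defined so far, but that a full 32-bit value would be truncated by snprintf().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Worst case for a 32-bit value printed in decimal is 10 digits.
 * sizeof("KEY=") already counts the terminating NUL. */
#define U32_DIGITS    10
#define PID_STR_LEN   (sizeof("PID=") + U32_DIGITS)    /* 15 bytes */
#define FLAGS_STR_LEN (sizeof("FLAGS=") + U32_DIGITS)  /* 17 bytes */

/* Formats "key=val" into buf. Returns 1 if the whole string fit,
 * 0 if snprintf() had to truncate it. */
static int format_u32_env(char *buf, size_t size, const char *key,
			  uint32_t val)
{
	int n = snprintf(buf, size, "%s=%" PRIu32, key, val);

	return n >= 0 && (size_t)n < size;
}
```

Checking the snprintf() return value like this is one way to notice truncation instead of silently emitting a shortened value.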
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On Mon, Mar 14, 2022 at 10:23:27AM -0400, Alex Deucher wrote: > On Fri, Mar 11, 2022 at 3:30 AM Pekka Paalanen wrote: > > > > On Thu, 10 Mar 2022 11:56:41 -0800 > > Rob Clark wrote: > > > > > For something like just notifying a compositor that a gpu crash > > > happened, perhaps drm_event is more suitable. See > > > virtio_gpu_fence_event_create() for an example of adding new event > > > types. Although maybe you want it to be an event which is not device > > > specific. This isn't so much of a debugging use-case as simply > > > notification. > > > > Hi, > > > > for this particular use case, are we now talking about the display > > device (KMS) crashing or the rendering device (OpenGL/Vulkan) crashing? > > > > If the former, I wasn't aware that display device crashes are a thing. > > How should a userspace display server react to those? > > > > If the latter, don't we have EGL extensions or Vulkan API already to > > deliver that? > > > > The above would be about device crashes that directly affect the > > display server. Is that the use case in mind here, or is it instead > > about notifying the display server that some application has caused a > > driver/hardware crash? If the latter, how should a display server react > > to that? Disconnect the application? > > > > Shashank, what is the actual use case you are developing this for? > > > > I've read all the emails here so far, and I don't recall seeing it > > explained. > > > > The idea is that a support daemon or compositor would listen for GPU > reset notifications and do something useful with them (kill the guilty > app, restart the desktop environment, etc.). Today when the GPU > resets, most applications just continue assuming nothing is wrong, > meanwhile the GPU has stopped accepting work until the apps re-init > their context so all of their command submissions just get rejected. > > > Btw. somewhat relatedly, there has been work aiming to allow > > graceful hot-unplug of DRM devices. 
There is a kernel doc outlining how > > the various APIs should react towards userspace when a DRM device > > suddenly disappears. That seems to have some overlap here IMO. > > > > See > > https://www.kernel.org/doc/html/latest/gpu/drm-uapi.html#device-hot-unplug > > which also has a couple pointers to EGL and Vulkan APIs. > > The problem is most applications don't use the GL or VK robustness > APIs. You could use something like that in the compositor, but those > APIs tend to be focused more on the application itself rather than the > GPU in general. E.g., Is my context lost. Which is fine for > restarting your context, but doesn't really help if you want to try > and do something with another application (i.e., the likely guilty > app). Also, on dGPU at least, when you reset the GPU, vram is usually > lost (either due to the memory controller being reset, or vram being > zero'd on init due to ECC support), so even if you are not the guilty > process, in that case you'd need to re-init your context anyway. Isn't that what arb robustness and all that stuff is for? Doing that through sysfs event sounds very wrong, since in general apps just don't have access to that. Also vk equivalent is vk_error_device_lost. Iirc both have information like whether the app was the guilty one causing the hang, or whether it was just victimized because the gpu can't do anything else than a full gpu reset which nukes everything (like amdgpu currently has, aside from the thread unblock trick in the first attempt). And if your app/compositor doesn't use robust contexts then the userspace driver gets to do a best effort attempt at recovery, or exit(). Whatever you can do really. Also note that you don't actually want an event, but a query ioctl (plus maybe a specific errno on your CS ioctl). Neither of the above flows supports events for gpu resets. RESET_STATS ioctl is the i915 implementation of this stuff. 
For the core dump aspect yes pls devcoredump and not reinvented wheels (and i915 is a bad example here, but in defence the i915 sysfs hang event predates devcoredump). Cheers, Daniel > > Alex > > > > > > > Thanks, > > pq -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH 1/4] drm/gma500: Remove unused declarations and other cruft
Most of these are old leftovers from one of the driver merges. This is all dead code. Signed-off-by: Patrik Jakobsson --- drivers/gpu/drm/gma500/psb_drv.h | 75 +--- 1 file changed, 1 insertion(+), 74 deletions(-) diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h index 553d03190ce1..66f61909a8c8 100644 --- a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ -36,12 +36,6 @@ /* Append new drm mode definition here, align with libdrm definition */ #define DRM_MODE_SCALE_NO_SCALE2 -enum { - CHIP_PSB_8108 = 0, /* Poulsbo */ - CHIP_PSB_8109 = 1, /* Poulsbo */ - CHIP_MRST_4100 = 2, /* Moorestown/Oaktrail */ -}; - #define IS_PSB(drm) ((to_pci_dev((drm)->dev)->device & 0xfffe) == 0x8108) #define IS_MRST(drm) ((to_pci_dev((drm)->dev)->device & 0xfff0) == 0x4100) #define IS_CDV(drm) ((to_pci_dev((drm)->dev)->device & 0xfff0) == 0x0be0) @@ -617,15 +611,7 @@ struct psb_ops { int i2c_bus;/* I2C bus identifier for Moorestown */ }; - - -extern int drm_crtc_probe_output_modes(struct drm_device *dev, int, int); -extern int drm_pick_crtcs(struct drm_device *dev); - /* psb_irq.c */ -extern void psb_irq_uninstall_islands(struct drm_device *dev, int hw_islands); -extern int psb_vblank_wait2(struct drm_device *dev, unsigned int *sequence); -extern int psb_vblank_wait(struct drm_device *dev, unsigned int *sequence); extern int psb_enable_vblank(struct drm_crtc *crtc); extern void psb_disable_vblank(struct drm_crtc *crtc); void @@ -636,17 +622,9 @@ psb_disable_pipestat(struct drm_psb_private *dev_priv, int pipe, u32 mask); extern u32 psb_get_vblank_counter(struct drm_crtc *crtc); -/* framebuffer.c */ -extern int psbfb_probed(struct drm_device *dev); -extern int psbfb_remove(struct drm_device *dev, - struct drm_framebuffer *fb); -/* psb_drv.c */ -extern void psb_spank(struct drm_psb_private *dev_priv); - -/* psb_reset.c */ +/* psb_lid.c */ extern void psb_lid_timer_init(struct drm_psb_private *dev_priv); extern void 
psb_lid_timer_takedown(struct drm_psb_private *dev_priv); -extern void psb_print_pagefault(struct drm_psb_private *dev_priv); /* modesetting */ extern void psb_modeset_init(struct drm_device *dev); @@ -689,43 +667,7 @@ extern const struct psb_ops oaktrail_chip_ops; /* cdv_device.c */ extern const struct psb_ops cdv_chip_ops; -/* Debug print bits setting */ -#define PSB_D_GENERAL (1 << 0) -#define PSB_D_INIT(1 << 1) -#define PSB_D_IRQ (1 << 2) -#define PSB_D_ENTRY (1 << 3) -/* debug the get H/V BP/FP count */ -#define PSB_D_HV (1 << 4) -#define PSB_D_DBI_BF (1 << 5) -#define PSB_D_PM (1 << 6) -#define PSB_D_RENDER (1 << 7) -#define PSB_D_REG (1 << 8) -#define PSB_D_MSVDX (1 << 9) -#define PSB_D_TOPAZ (1 << 10) - -extern int drm_idle_check_interval; - /* Utilities */ -static inline u32 MRST_MSG_READ32(int domain, uint port, uint offset) -{ - int mcr = (0xD0<<24) | (port << 16) | (offset << 8); - uint32_t ret_val = 0; - struct pci_dev *pci_root = pci_get_domain_bus_and_slot(domain, 0, 0); - pci_write_config_dword(pci_root, 0xD0, mcr); - pci_read_config_dword(pci_root, 0xD4, &ret_val); - pci_dev_put(pci_root); - return ret_val; -} -static inline void MRST_MSG_WRITE32(int domain, uint port, uint offset, - u32 value) -{ - int mcr = (0xE0<<24) | (port << 16) | (offset << 8) | 0xF0; - struct pci_dev *pci_root = pci_get_domain_bus_and_slot(domain, 0, 0); - pci_write_config_dword(pci_root, 0xD4, value); - pci_write_config_dword(pci_root, 0xD0, mcr); - pci_dev_put(pci_root); -} - static inline uint32_t REGISTER_READ(struct drm_device *dev, uint32_t reg) { struct drm_psb_private *dev_priv = to_drm_psb_private(dev); @@ -806,24 +748,9 @@ static inline void REGISTER_WRITE8(struct drm_device *dev, #define PSB_WVDC32(_val, _offs)iowrite32(_val, dev_priv->vdc_reg + (_offs)) #define PSB_RVDC32(_offs) ioread32(dev_priv->vdc_reg + (_offs)) -/* #define TRAP_SGX_PM_FAULT 1 */ -#ifdef TRAP_SGX_PM_FAULT -#define PSB_RSGX32(_offs) \ -({ \ - if (inl(dev_priv->apm_base + PSB_APM_STS) & 0x3) { 
\ - pr_err("access sgx when it's off!! (READ) %s, %d\n",\ - __FILE__, __LINE__); \ - melay(1000);\ - } \ - ioread32(dev_priv->sgx_reg + (_offs)); \ -}) -#else #define PSB_RSGX32(_offs) ioread32(dev_priv->sgx_reg + (_offs)) -#endif #define PSB_WSGX32(_val, _offs)iowrite32(_val, dev_p
[PATCH 2/4] drm/gma500: Move gma_intel_crtc_funcs into gma_display.c
All functions live in gma_display.c already so move the vtable. Also shorten the name to gma_crtc_funcs. Signed-off-by: Patrik Jakobsson --- drivers/gpu/drm/gma500/cdv_device.c| 2 +- drivers/gpu/drm/gma500/gma_display.c | 12 drivers/gpu/drm/gma500/gma_display.h | 10 ++ drivers/gpu/drm/gma500/oaktrail_device.c | 2 +- drivers/gpu/drm/gma500/psb_device.c| 2 +- drivers/gpu/drm/gma500/psb_drv.h | 2 -- drivers/gpu/drm/gma500/psb_intel_display.c | 12 7 files changed, 17 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c index d7c6cca23e94..887c157d75f4 100644 --- a/drivers/gpu/drm/gma500/cdv_device.c +++ b/drivers/gpu/drm/gma500/cdv_device.c @@ -603,7 +603,7 @@ const struct psb_ops cdv_chip_ops = { .errata = cdv_errata, .crtc_helper = &cdv_intel_helper_funcs, - .crtc_funcs = &gma_intel_crtc_funcs, + .crtc_funcs = &gma_crtc_funcs, .clock_funcs = &cdv_clock_funcs, .output_init = cdv_output_init, diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c index dd801404cf99..931ffb192fc4 100644 --- a/drivers/gpu/drm/gma500/gma_display.c +++ b/drivers/gpu/drm/gma500/gma_display.c @@ -565,6 +565,18 @@ int gma_crtc_set_config(struct drm_mode_set *set, return ret; } +const struct drm_crtc_funcs gma_crtc_funcs = { + .cursor_set = gma_crtc_cursor_set, + .cursor_move = gma_crtc_cursor_move, + .gamma_set = gma_crtc_gamma_set, + .set_config = gma_crtc_set_config, + .destroy = gma_crtc_destroy, + .page_flip = gma_crtc_page_flip, + .enable_vblank = psb_enable_vblank, + .disable_vblank = psb_disable_vblank, + .get_vblank_counter = psb_get_vblank_counter, +}; + /* * Save HW states of given crtc */ diff --git a/drivers/gpu/drm/gma500/gma_display.h b/drivers/gpu/drm/gma500/gma_display.h index 7bd6c1ee8b21..113cf048105e 100644 --- a/drivers/gpu/drm/gma500/gma_display.h +++ b/drivers/gpu/drm/gma500/gma_display.h @@ -58,15 +58,7 @@ extern bool gma_pipe_has_type(struct drm_crtc *crtc, int type); 
extern void gma_wait_for_vblank(struct drm_device *dev); extern int gma_pipe_set_base(struct drm_crtc *crtc, int x, int y, struct drm_framebuffer *old_fb); -extern int gma_crtc_cursor_set(struct drm_crtc *crtc, - struct drm_file *file_priv, - uint32_t handle, - uint32_t width, uint32_t height); -extern int gma_crtc_cursor_move(struct drm_crtc *crtc, int x, int y); extern void gma_crtc_load_lut(struct drm_crtc *crtc); -extern int gma_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green, - u16 *blue, u32 size, - struct drm_modeset_acquire_ctx *ctx); extern void gma_crtc_dpms(struct drm_crtc *crtc, int mode); extern void gma_crtc_prepare(struct drm_crtc *crtc); extern void gma_crtc_commit(struct drm_crtc *crtc); @@ -83,6 +75,8 @@ extern int gma_crtc_set_config(struct drm_mode_set *set, extern void gma_crtc_save(struct drm_crtc *crtc); extern void gma_crtc_restore(struct drm_crtc *crtc); +extern const struct drm_crtc_funcs gma_crtc_funcs; + extern void gma_encoder_prepare(struct drm_encoder *encoder); extern void gma_encoder_commit(struct drm_encoder *encoder); extern void gma_encoder_destroy(struct drm_encoder *encoder); diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c index 5c75eae630b5..40f1bc736125 100644 --- a/drivers/gpu/drm/gma500/oaktrail_device.c +++ b/drivers/gpu/drm/gma500/oaktrail_device.c @@ -545,7 +545,7 @@ const struct psb_ops oaktrail_chip_ops = { .chip_setup = oaktrail_chip_setup, .chip_teardown = oaktrail_teardown, .crtc_helper = &oaktrail_helper_funcs, - .crtc_funcs = &gma_intel_crtc_funcs, + .crtc_funcs = &gma_crtc_funcs, .output_init = oaktrail_output_init, diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c index 3030f18ba022..e93e4191c0ca 100644 --- a/drivers/gpu/drm/gma500/psb_device.c +++ b/drivers/gpu/drm/gma500/psb_device.c @@ -329,7 +329,7 @@ const struct psb_ops psb_chip_ops = { .chip_teardown = psb_chip_teardown, .crtc_helper = &psb_intel_helper_funcs, 
- .crtc_funcs = &gma_intel_crtc_funcs, + .crtc_funcs = &gma_crtc_funcs, .clock_funcs = &psb_clock_funcs, .output_init = psb_output_init, diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h index 66f61909a8c8..88f44dbbc4eb 100644 --- a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ -13,7 +13,6 @@ #include -#include "gma_display.h" #include "gtt.h" #include "intel_bios.h" #include "mmu.h" @@ -647,7 +646,6 @@ extern void oaktrail_lvds_init(st
[PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code
Use the gma_ prefix instead of psb_ since the code is common for all chips. Various coding style fixes. Removal of unused code. Removal of duplicate function declarations. Signed-off-by: Patrik Jakobsson --- drivers/gpu/drm/gma500/gma_display.c | 8 +-- drivers/gpu/drm/gma500/opregion.c| 5 +- drivers/gpu/drm/gma500/power.c | 10 +-- drivers/gpu/drm/gma500/psb_drv.c | 2 +- drivers/gpu/drm/gma500/psb_drv.h | 11 drivers/gpu/drm/gma500/psb_irq.c | 94 +++- drivers/gpu/drm/gma500/psb_irq.h | 19 +++--- 7 files changed, 57 insertions(+), 92 deletions(-) diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c index 931ffb192fc4..1d7964c339f4 100644 --- a/drivers/gpu/drm/gma500/gma_display.c +++ b/drivers/gpu/drm/gma500/gma_display.c @@ -17,7 +17,7 @@ #include "framebuffer.h" #include "gem.h" #include "gma_display.h" -#include "psb_drv.h" +#include "psb_irq.h" #include "psb_intel_drv.h" #include "psb_intel_reg.h" @@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = { .set_config = gma_crtc_set_config, .destroy = gma_crtc_destroy, .page_flip = gma_crtc_page_flip, - .enable_vblank = psb_enable_vblank, - .disable_vblank = psb_disable_vblank, - .get_vblank_counter = psb_get_vblank_counter, + .enable_vblank = gma_enable_vblank, + .disable_vblank = gma_disable_vblank, + .get_vblank_counter = gma_get_vblank_counter, }; /* diff --git a/drivers/gpu/drm/gma500/opregion.c b/drivers/gpu/drm/gma500/opregion.c index fef04ff8c3a9..dc494df71a48 100644 --- a/drivers/gpu/drm/gma500/opregion.c +++ b/drivers/gpu/drm/gma500/opregion.c @@ -23,6 +23,7 @@ */ #include #include "psb_drv.h" +#include "psb_irq.h" #include "psb_intel_reg.h" #define PCI_ASLE 0xe4 @@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device *dev) if (asle && system_opregion ) { /* Don't do this on Medfield or other non PC like devices, they use the bit for something different altogether */ - psb_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE); - 
psb_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE); + gma_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE); + gma_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE); asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN | ASLE_PFMB_EN; diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c index 6f917cfef65b..b91de6d36e41 100644 --- a/drivers/gpu/drm/gma500/power.c +++ b/drivers/gpu/drm/gma500/power.c @@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev) dev_err(dev->dev, "GPU hardware busy, cannot suspend\n"); return -EBUSY; } - psb_irq_uninstall(dev); + gma_irq_uninstall(dev); gma_suspend_display(dev); gma_suspend_pci(pdev); } @@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev) mutex_lock(&power_mutex); gma_resume_pci(pdev); gma_resume_display(pdev); - psb_irq_preinstall(dev); - psb_irq_postinstall(dev); + gma_irq_preinstall(dev); + gma_irq_postinstall(dev); mutex_unlock(&power_mutex); return 0; } @@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool force_on) /* Ok power up needed */ ret = gma_resume_pci(pdev); if (ret == 0) { - psb_irq_preinstall(dev); - psb_irq_postinstall(dev); + gma_irq_preinstall(dev); + gma_irq_postinstall(dev); pm_runtime_get(dev->dev); dev_priv->display_count++; spin_unlock_irqrestore(&power_ctrl_lock, flags); diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c index e30b58184156..82d51e9821ad 100644 --- a/drivers/gpu/drm/gma500/psb_drv.c +++ b/drivers/gpu/drm/gma500/psb_drv.c @@ -380,7 +380,7 @@ static int psb_driver_load(struct drm_device *dev, unsigned long flags) PSB_WVDC32(0x, PSB_INT_MASK_R); spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags); - psb_irq_install(dev, pdev->irq); + gma_irq_install(dev, pdev->irq); dev->max_vblank_count = 0xff; /* only 24 bits of frame count */ diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h index aed167af13c5..0ddfec1a0851 100644 --- 
a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ -609,17 +609,6 @@ struct psb_ops { int i2c_bus;/* I2C bus identifier for Moorestown */ }; -/* psb_irq.c */ -extern int psb_enable_vblank(struct drm_crtc *crtc); -extern void psb_disable_vblank(struct drm_crtc *c
[PATCH 3/4] drm/gma500: Don't store crtc_funcs in psb_ops
The drm_crtc_funcs are all generic and no chip specific functions are necessary. We can therefore directly put gma_crtc_funcs into the drm_crtc. Signed-off-by: Patrik Jakobsson --- drivers/gpu/drm/gma500/cdv_device.c| 1 - drivers/gpu/drm/gma500/oaktrail_device.c | 1 - drivers/gpu/drm/gma500/psb_device.c| 1 - drivers/gpu/drm/gma500/psb_drv.h | 1 - drivers/gpu/drm/gma500/psb_intel_display.c | 3 +-- 5 files changed, 1 insertion(+), 6 deletions(-) diff --git a/drivers/gpu/drm/gma500/cdv_device.c b/drivers/gpu/drm/gma500/cdv_device.c index 887c157d75f4..f854f58bcbb3 100644 --- a/drivers/gpu/drm/gma500/cdv_device.c +++ b/drivers/gpu/drm/gma500/cdv_device.c @@ -603,7 +603,6 @@ const struct psb_ops cdv_chip_ops = { .errata = cdv_errata, .crtc_helper = &cdv_intel_helper_funcs, - .crtc_funcs = &gma_crtc_funcs, .clock_funcs = &cdv_clock_funcs, .output_init = cdv_output_init, diff --git a/drivers/gpu/drm/gma500/oaktrail_device.c b/drivers/gpu/drm/gma500/oaktrail_device.c index 40f1bc736125..5923a9c89312 100644 --- a/drivers/gpu/drm/gma500/oaktrail_device.c +++ b/drivers/gpu/drm/gma500/oaktrail_device.c @@ -545,7 +545,6 @@ const struct psb_ops oaktrail_chip_ops = { .chip_setup = oaktrail_chip_setup, .chip_teardown = oaktrail_teardown, .crtc_helper = &oaktrail_helper_funcs, - .crtc_funcs = &gma_crtc_funcs, .output_init = oaktrail_output_init, diff --git a/drivers/gpu/drm/gma500/psb_device.c b/drivers/gpu/drm/gma500/psb_device.c index e93e4191c0ca..59f325165667 100644 --- a/drivers/gpu/drm/gma500/psb_device.c +++ b/drivers/gpu/drm/gma500/psb_device.c @@ -329,7 +329,6 @@ const struct psb_ops psb_chip_ops = { .chip_teardown = psb_chip_teardown, .crtc_helper = &psb_intel_helper_funcs, - .crtc_funcs = &gma_crtc_funcs, .clock_funcs = &psb_clock_funcs, .output_init = psb_output_init, diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h index 88f44dbbc4eb..aed167af13c5 100644 --- a/drivers/gpu/drm/gma500/psb_drv.h +++ b/drivers/gpu/drm/gma500/psb_drv.h @@ 
-578,7 +578,6 @@ struct psb_ops { /* Sub functions */ struct drm_crtc_helper_funcs const *crtc_helper; - struct drm_crtc_funcs const *crtc_funcs; const struct gma_clock_funcs *clock_funcs; /* Setup hooks */ diff --git a/drivers/gpu/drm/gma500/psb_intel_display.c b/drivers/gpu/drm/gma500/psb_intel_display.c index 6df62fe7c1e0..a99859b5b13a 100644 --- a/drivers/gpu/drm/gma500/psb_intel_display.c +++ b/drivers/gpu/drm/gma500/psb_intel_display.c @@ -488,8 +488,7 @@ void psb_intel_crtc_init(struct drm_device *dev, int pipe, return; } - /* Set the CRTC operations from the chip specific data */ - drm_crtc_init(dev, &gma_crtc->base, dev_priv->ops->crtc_funcs); + drm_crtc_init(dev, &gma_crtc->base, &gma_crtc_funcs); /* Set the CRTC clock functions from chip specific data */ gma_crtc->clock_funcs = dev_priv->ops->clock_funcs; -- 2.35.1
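The idea behind this patch can be sketched outside the driver: when every chip would point at the same function table, the per-chip ops structure does not need to carry that pointer, and the init path can reference the shared table directly. The types and names below are simplified, hypothetical stand-ins, not the real drm_crtc_funcs or psb_ops.

```c
#include <stddef.h>

struct crtc;

/* Simplified stand-in for a CRTC function table. */
struct crtc_funcs {
	int (*enable_vblank)(struct crtc *crtc);
	void (*disable_vblank)(struct crtc *crtc);
};

struct crtc {
	const struct crtc_funcs *funcs;
	int vblank_enabled;
};

/* Per-chip data keeps only what actually differs between chips. */
struct chip_ops {
	const char *name;
	int max_clock_khz;
};

static int common_enable_vblank(struct crtc *crtc)
{
	crtc->vblank_enabled = 1;
	return 0;
}

static void common_disable_vblank(struct crtc *crtc)
{
	crtc->vblank_enabled = 0;
}

/* One shared table for all chips. */
static const struct crtc_funcs common_crtc_funcs = {
	.enable_vblank = common_enable_vblank,
	.disable_vblank = common_disable_vblank,
};

/* The init path no longer looks the table up through chip_ops. */
static void chip_crtc_init(struct crtc *crtc, const struct chip_ops *ops)
{
	(void)ops; /* still available for chip-specific setup */
	crtc->funcs = &common_crtc_funcs;
	crtc->vblank_enabled = 0;
}
```

Two CRTCs initialized for different chips end up with the same funcs pointer, which is the property that makes the extra indirection unnecessary.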
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> On 16.03.22 at 16:36, Rob Clark wrote:
> > [SNIP]
> > just one point of clarification.. in the msm and i915 case it is
> > purely for debugging and telemetry (ie. sending crash logs back to
> > distro for analysis if user has crash reporting enabled).. it isn't
> > used for triggering any action like killing app or compositor.
>
> By the way, how does msm do its memory management for the devcoredumps?

GFP_NORECLAIM all the way. It's purely best effort.

Note that the fancy new plan for i915 discrete gpu is to only support gpu
crash dumps on non-recoverable gpu contexts, i.e. those that do not
continue to the next batch when something bad happens. This is what vk
wants and also what iris now uses (we do context recovery in userspace in
all cases), and non-recoverable contexts greatly simplify the crash dump
gather: Only thing you need to gather is the register state from hw
(before you reset it), all the batchbuffer bo and indirect state bo (in
i915 you can mark which bo to capture in the CS ioctl) can be captured in
a worker later on. Which for non-recoverable context is no issue, since
subsequent batchbuffers won't trample over any of these things.

And that way you can record the crashdump (or at least the big pieces like
all the indirect state stuff) with GFP_KERNEL.

msm probably gets it wrong since embedded drivers have much less shrinker
and generally no mmu notifiers going on :-)

> I mean it is strictly forbidden to allocate any memory in the GPU reset
> path.
>
> > I would however *strongly* recommend devcoredump support in other GPU
> > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > to debug and fix a couple obscure issues that I was not able to
> > reproduce by myself.
>
> Yes, completely agree as well.

+1

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [RESEND PATCH] drm/doc: Clarify what ioctls can be used on render nodes
On Mon, Mar 07, 2022 at 08:32:36AM -0700, Jeffrey Hugo wrote:
> The documentation for render nodes indicates that only "PRIME-related"
> ioctls are valid on render nodes, but the documentation does not clarify
> what that means. If the reader is not familiar with PRIME, they may
> believe this to be only the ioctls with "PRIME" in the name and not other
> ioctls such as the set of syncobj ioctls. Clarify the situation for the
> reader by referencing where the reader will find a current list of valid
> ioctls.
>
> Signed-off-by: Jeffrey Hugo
> Acked-by: Pekka Paalanen

Applied to drm-misc-next, thanks for the patch.
-Daniel

> ---
>
> I was confused by this when reading the documentation. Now that I have
> figured out what the documentation means, I would like to add a clarification
> for the next reader which would have helped me.
>
>  Documentation/gpu/drm-uapi.rst | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 199afb5..ce47b42 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -148,7 +148,9 @@ clients together with the legacy drmAuth authentication procedure.
>  If a driver advertises render node support, DRM core will create a
>  separate render node called renderD. There will be one render node
>  per device. No ioctls except PRIME-related ioctls will be allowed on
> -this node. Especially GEM_OPEN will be explicitly prohibited. Render
> +this node. Especially GEM_OPEN will be explicitly prohibited. For a
> +complete list of driver-independent ioctls that can be used on render
> +nodes, see the ioctls marked DRM_RENDER_ALLOW in drm_ioctl.c Render
>  nodes are designed to avoid the buffer-leaks, which occur if clients
>  guess the flink names or mmap offsets on the legacy interface.
>  Additionally to this basic interface, drivers must mark their
> --
> 2.7.4
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On 17.03.22 at 10:29, Daniel Vetter wrote:
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > On 16.03.22 at 16:36, Rob Clark wrote:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm do its memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.

Ok, good to know that it's as simple as that.

> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants

That's exactly what I've been telling an internal team for a couple of
years now as well. Good to know that this is not totally crazy.

> and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.
>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.

Interesting idea, so basically we only capture the state we need to reset
initially and grab a reference on the killed application to gather the
rest before we clean them up.

Going to keep that in mind as well.

Thanks,
Christian.

> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)
>
> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)
On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote: > > On 15/03/2022 07:28, Kasireddy, Vivek wrote: > > Hi Tvrtko, Daniel, > > > > > > > > On 11/03/2022 09:39, Daniel Vetter wrote: > > > > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy > > > > wrote: > > > > > > > > > > On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or > > > > > more framebuffers/scanout buffers results in only one that is > > > > > mappable/ > > > > > fenceable. Therefore, pageflipping between these 2 FBs where only one > > > > > is mappable/fenceable creates latencies large enough to miss alternate > > > > > vblanks thereby producing less optimal framerate. > > > > > > > > > > This mainly happens because when > > > > > i915_gem_object_pin_to_display_plane() > > > > > is called to pin one of the FB objs, the associated vma is identified > > > > > as misplaced and therefore i915_vma_unbind() is called which unbinds > > > > > and > > > > > evicts it. This misplaced vma gets subseqently pinned only when > > > > > i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This > > > > > results in a latency of ~10ms and happens every other vblank/repaint > > > > > cycle. > > > > > Therefore, to fix this issue, we try to see if there is space to map > > > > > at-least two objects of a given size and return early if there isn't. > > > > > This > > > > > would ensure that we do not try with PIN_MAPPABLE for any objects that > > > > > are too big to map thereby preventing unncessary unbind. > > > > > > > > > > Testcase: > > > > > Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform > > > > > with a 8K@60 mode results in only ~40 FPS. Since upstream Weston > > > > > submits > > > > > a frame ~7ms before the next vblank, the latencies seen between atomic > > > > > commit and flip event are 7, 24 (7 + 16.66), 7, 24. suggesting > > > > > that > > > > > it misses the vblank every other frame. 
> > > > > > > > > > Here is the ftrace snippet that shows the source of the ~10ms latency: > > > > > i915_gem_object_pin_to_display_plane() { > > > > > 0.102 us |i915_gem_object_set_cache_level(); > > > > > i915_gem_object_ggtt_pin_ww() { > > > > > 0.390 us | i915_vma_instance(); > > > > > 0.178 us | i915_vma_misplaced(); > > > > > i915_vma_unbind() { > > > > > __i915_active_wait() { > > > > > 0.082 us |i915_active_acquire_if_busy(); > > > > > 0.475 us | } > > > > > intel_runtime_pm_get() { > > > > > 0.087 us |intel_runtime_pm_acquire(); > > > > > 0.259 us | } > > > > > __i915_active_wait() { > > > > > 0.085 us |i915_active_acquire_if_busy(); > > > > > 0.240 us | } > > > > > __i915_vma_evict() { > > > > > ggtt_unbind_vma() { > > > > > gen8_ggtt_clear_range() { > > > > > 10507.255 us |} > > > > > 10507.689 us | } > > > > > 10508.516 us | } > > > > > > > > > > v2: Instead of using bigjoiner checks, determine whether a scanout > > > > > buffer is too big by checking to see if it is possible to map > > > > > two of them into the ggtt. > > > > > > > > > > v3 (Ville): > > > > > - Count how many fb objects can be fit into the available holes > > > > > instead of checking for a hole twice the object size. > > > > > - Take alignment constraints into account. > > > > > - Limit this large scanout buffer check to >= Gen 11 platforms. > > > > > > > > > > v4: > > > > > - Remove existing heuristic that checks just for size. (Ville) > > > > > - Return early if we find space to map at-least two objects. (Tvrtko) > > > > > - Slightly update the commit message. > > > > > > > > > > v5: (Tvrtko) > > > > > - Rename the function to indicate that the object may be too big to > > > > > map into the aperture. > > > > > - Account for guard pages while calculating the total size required > > > > > for the object. > > > > > - Do not subject all objects to the heuristic check and instead > > > > > consider objects only of a certain size. > > > > > - Do the hole walk using the rbtree. 
> > > > > - Preserve the existing PIN_NONBLOCK logic. > > > > > - Drop the PIN_MAPPABLE check while pinning the VMA. > > > > > > > > > > v6: (Tvrtko) > > > > > - Return 0 on success and the specific error code on failure to > > > > > preserve the existing behavior. > > > > > > > > > > v7: (Ville) > > > > > - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and > > > > > size < ggtt->mappable_end / 4 checks. > > > > > - Drop the redundant check that is based on previous heuristic. > > > > > > > > > > v8: > > > > > - Make sure that we are holding the mutex associated with ggtt vm > > > > > as we traverse the hole nodes. > > > > > > > > > > v9: (Tvrtko) > > > > > - Use mutex_lock_interruptible_nested() instead of mutex_lock(). > > > > >
Re: [PATCH v1] drm/shmem-helper: Correct doc-comment of drm_gem_shmem_get_sg_table()
On Tue, Mar 08, 2022 at 04:34:01PM +0300, Dmitry Osipenko wrote:
> drm_gem_shmem_get_sg_table() never returns NULL on error, but a ERR_PTR.
> Correct the doc comment which says that it returns NULL on error.
>
> Signed-off-by: Dmitry Osipenko
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 8ad0e02991ca..37009418cd28 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -662,7 +662,7 @@ EXPORT_SYMBOL(drm_gem_shmem_print_info);
>   * drm_gem_shmem_get_pages_sgt() instead.
>   *
>   * Returns:
> - * A pointer to the scatter/gather table of pinned pages or NULL on failure.
> + * A pointer to the scatter/gather table of pinned pages or errno on failure.

Hm usually we write "negative errno" for these, since the error numbers
are defined as positive numbers. Care to respin?

Thanks a lot, Daniel

>   */
>  struct sg_table *drm_gem_shmem_get_sg_table(struct drm_gem_shmem_object *shmem)
>  {
> --
> 2.35.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote: > From: Rob Clark > > In the system suspend path, we don't want to be racing with the > scheduler kthreads pushing additional queued up jobs to the hw > queue (ringbuffer). So park them first. While we are at it, > move the wait for active jobs to complete into the new system- > suspend path. > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/msm/adreno/adreno_device.c | 68 -- > 1 file changed, 64 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c > b/drivers/gpu/drm/msm/adreno/adreno_device.c > index 8859834b51b8..0440a98988fc 100644 > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c > @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu) > static int adreno_runtime_suspend(struct device *dev) > { > struct msm_gpu *gpu = dev_to_gpu(dev); > - int remaining; > + > + /* > + * We should be holding a runpm ref, which will prevent > + * runtime suspend. In the system suspend path, we've > + * already waited for active jobs to complete. > + */ > + WARN_ON_ONCE(gpu->active_submits); > + > + return gpu->funcs->pm_suspend(gpu); > +} > + > +static void suspend_scheduler(struct msm_gpu *gpu) > +{ > + int i; > + > + /* > + * Shut down the scheduler before we force suspend, so that > + * suspend isn't racing with scheduler kthread feeding us > + * more work. > + * > + * Note, we just want to park the thread, and let any jobs > + * that are already on the hw queue complete normally, as > + * opposed to the drm_sched_stop() path used for handling > + * faulting/timed-out jobs. We can't really cancel any jobs > + * already on the hw queue without racing with the GPU. > + */ > + for (i = 0; i < gpu->nr_rings; i++) { > + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; > + kthread_park(sched->thread); Shouldn't we have some proper interfaces for this? 
Also I'm kinda wondering how other drivers do this, feels like we should have a standard way. Finally not flushing out all in-flight requests sounds a bit like a bad idea for system suspend/resume since that's also the hibernation path, and that would mean your shrinker/page reclaim stops working. At least in full generality. Which ain't good for hibernation. Adding Christian and Andrey. -Daniel > + } > +} > + > +static void resume_scheduler(struct msm_gpu *gpu) > +{ > + int i; > + > + for (i = 0; i < gpu->nr_rings; i++) { > + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; > + kthread_unpark(sched->thread); > + } > +} > + > +static int adreno_system_suspend(struct device *dev) > +{ > + struct msm_gpu *gpu = dev_to_gpu(dev); > + int remaining, ret; > + > + suspend_scheduler(gpu); > > remaining = wait_event_timeout(gpu->retire_event, > active_submits(gpu) == 0, > msecs_to_jiffies(1000)); > if (remaining == 0) { > dev_err(dev, "Timeout waiting for GPU to suspend\n"); > - return -EBUSY; > + ret = -EBUSY; > + goto out; > } > > - return gpu->funcs->pm_suspend(gpu); > + ret = pm_runtime_force_suspend(dev); > +out: > + if (ret) > + resume_scheduler(gpu); > + > + return ret; > } > + > +static int adreno_system_resume(struct device *dev) > +{ > + resume_scheduler(dev_to_gpu(dev)); > + return pm_runtime_force_resume(dev); > +} > + > #endif > > static const struct dev_pm_ops adreno_pm_ops = { > - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, > pm_runtime_force_resume) > + SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume) > SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL) > }; > > -- > 2.35.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/6] drm: allow real encoder to be passed for drm_writeback_connector
On Fri, Mar 11, 2022 at 10:05:53AM +0200, Laurent Pinchart wrote: > On Fri, Mar 11, 2022 at 10:46:13AM +0300, Dmitry Baryshkov wrote: > > On Fri, 11 Mar 2022 at 04:50, Abhinav Kumar > > wrote: > > > > > > For some vendor driver implementations, display hardware can > > > be shared between the encoder used for writeback and the physical > > > display. > > > > > > In addition resources such as clocks and interrupts can > > > also be shared between writeback and the real encoder. > > > > > > To accommodate such vendor drivers and hardware, allow > > > real encoder to be passed for drm_writeback_connector. > > > > > > Co-developed-by: Kandpal Suraj > > > Signed-off-by: Abhinav Kumar > > > --- > > > drivers/gpu/drm/drm_writeback.c | 8 > > > include/drm/drm_writeback.h | 13 +++-- > > > 2 files changed, 15 insertions(+), 6 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/drm_writeback.c > > > b/drivers/gpu/drm/drm_writeback.c > > > index dccf4504..4dad687 100644 > > > --- a/drivers/gpu/drm/drm_writeback.c > > > +++ b/drivers/gpu/drm/drm_writeback.c > > > @@ -189,8 +189,8 @@ int drm_writeback_connector_init(struct drm_device > > > *dev, > > > if (IS_ERR(blob)) > > > return PTR_ERR(blob); > > > > > > - drm_encoder_helper_add(&wb_connector->encoder, enc_helper_funcs); > > > - ret = drm_encoder_init(dev, &wb_connector->encoder, > > > + drm_encoder_helper_add(wb_connector->encoder, enc_helper_funcs); > > > + ret = drm_encoder_init(dev, wb_connector->encoder, > > >&drm_writeback_encoder_funcs, > > >DRM_MODE_ENCODER_VIRTUAL, NULL); > > > > If the encoder is provided by a separate driver, it might use a > > different set of encoder funcs. > > More than that, if the encoder is provided externally but doesn't have > custom operations, I don't really see the point of having an external > encoder in the first place. > > Has this series been tested with a driver that needs to provide an > encoder, to make sure it fits the purpose ? 
Also, can we not force this setup on all the drivers that don't need it? We have a ton of kms drivers, and forcing unnecessary busywork on drivers is really not good. -Daniel > > > I'd suggest checking whether the wb_connector->encoder is NULL here. > > If it is, allocate one using drmm_kzalloc and init it. > > If it is not NULL, assume that it has been initialized already, so > > skip the drm_encoder_init() and just call the drm_encoder_helper_add() > > > > > if (ret) > > > @@ -204,7 +204,7 @@ int drm_writeback_connector_init(struct drm_device > > > *dev, > > > goto connector_fail; > > > > > > ret = drm_connector_attach_encoder(connector, > > > - &wb_connector->encoder); > > > + wb_connector->encoder); > > > if (ret) > > > goto attach_fail; > > > > > > @@ -233,7 +233,7 @@ int drm_writeback_connector_init(struct drm_device > > > *dev, > > > attach_fail: > > > drm_connector_cleanup(connector); > > > connector_fail: > > > - drm_encoder_cleanup(&wb_connector->encoder); > > > + drm_encoder_cleanup(wb_connector->encoder); > > > fail: > > > drm_property_blob_put(blob); > > > return ret; > > > diff --git a/include/drm/drm_writeback.h b/include/drm/drm_writeback.h > > > index 9697d27..0ba266e 100644 > > > --- a/include/drm/drm_writeback.h > > > +++ b/include/drm/drm_writeback.h > > > @@ -25,13 +25,22 @@ struct drm_writeback_connector { > > > struct drm_connector base; > > > > > > /** > > > -* @encoder: Internal encoder used by the connector to fulfill > > > +* @encoder: handle to drm_encoder used by the connector to > > > fulfill > > > * the DRM framework requirements. The users of the > > > * @drm_writeback_connector control the behaviour of the @encoder > > > * by passing the @enc_funcs parameter to > > > drm_writeback_connector_init() > > > * function. > > > +* > > > +* For some vendor drivers, the hardware resources are shared > > > between > > > +* writeback encoder and rest of the display pipeline.
> > > +* To accommodate such cases, encoder is a handle to the real > > > encoder > > > +* hardware. > > > +* > > > +* For current existing writeback users, this shall continue to > > > be the > > > +* embedded encoder for the writeback connector. > > > +* > > > */ > > > - struct drm_encoder encoder; > > > + struct drm_encoder *encoder; > > > > > > /** > > > * @pixel_formats_blob_ptr: > > -- > Regards, > > Laurent Pinchart -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)
On 17/03/2022 09:47, Daniel Vetter wrote: On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote: On 15/03/2022 07:28, Kasireddy, Vivek wrote: Hi Tvrtko, Daniel, On 11/03/2022 09:39, Daniel Vetter wrote: On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy wrote: On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or more framebuffers/scanout buffers results in only one that is mappable/ fenceable. Therefore, pageflipping between these 2 FBs where only one is mappable/fenceable creates latencies large enough to miss alternate vblanks thereby producing less optimal framerate. This mainly happens because when i915_gem_object_pin_to_display_plane() is called to pin one of the FB objs, the associated vma is identified as misplaced and therefore i915_vma_unbind() is called which unbinds and evicts it. This misplaced vma gets subseqently pinned only when i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This results in a latency of ~10ms and happens every other vblank/repaint cycle. Therefore, to fix this issue, we try to see if there is space to map at-least two objects of a given size and return early if there isn't. This would ensure that we do not try with PIN_MAPPABLE for any objects that are too big to map thereby preventing unncessary unbind. Testcase: Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform with a 8K@60 mode results in only ~40 FPS. Since upstream Weston submits a frame ~7ms before the next vblank, the latencies seen between atomic commit and flip event are 7, 24 (7 + 16.66), 7, 24. suggesting that it misses the vblank every other frame. 
Here is the ftrace snippet that shows the source of the ~10ms latency: i915_gem_object_pin_to_display_plane() { 0.102 us |i915_gem_object_set_cache_level(); i915_gem_object_ggtt_pin_ww() { 0.390 us | i915_vma_instance(); 0.178 us | i915_vma_misplaced(); i915_vma_unbind() { __i915_active_wait() { 0.082 us |i915_active_acquire_if_busy(); 0.475 us | } intel_runtime_pm_get() { 0.087 us |intel_runtime_pm_acquire(); 0.259 us | } __i915_active_wait() { 0.085 us |i915_active_acquire_if_busy(); 0.240 us | } __i915_vma_evict() { ggtt_unbind_vma() { gen8_ggtt_clear_range() { 10507.255 us |} 10507.689 us | } 10508.516 us | } v2: Instead of using bigjoiner checks, determine whether a scanout buffer is too big by checking to see if it is possible to map two of them into the ggtt. v3 (Ville): - Count how many fb objects can be fit into the available holes instead of checking for a hole twice the object size. - Take alignment constraints into account. - Limit this large scanout buffer check to >= Gen 11 platforms. v4: - Remove existing heuristic that checks just for size. (Ville) - Return early if we find space to map at-least two objects. (Tvrtko) - Slightly update the commit message. v5: (Tvrtko) - Rename the function to indicate that the object may be too big to map into the aperture. - Account for guard pages while calculating the total size required for the object. - Do not subject all objects to the heuristic check and instead consider objects only of a certain size. - Do the hole walk using the rbtree. - Preserve the existing PIN_NONBLOCK logic. - Drop the PIN_MAPPABLE check while pinning the VMA. v6: (Tvrtko) - Return 0 on success and the specific error code on failure to preserve the existing behavior. v7: (Ville) - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and size < ggtt->mappable_end / 4 checks. - Drop the redundant check that is based on previous heuristic. v8: - Make sure that we are holding the mutex associated with ggtt vm as we traverse the hole nodes. 
v9: (Tvrtko) - Use mutex_lock_interruptible_nested() instead of mutex_lock(). Cc: Ville Syrjälä Cc: Maarten Lankhorst Cc: Tvrtko Ursulin Cc: Manasi Navare Reviewed-by: Tvrtko Ursulin Signed-off-by: Vivek Kasireddy --- drivers/gpu/drm/i915/i915_gem.c | 128 +++- 1 file changed, 94 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9747924cc57b..e0d731b3f215 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -49,6 +49,7 @@ #include "gem/i915_gem_pm.h" #include "gem/i915_gem_region.h" #include "gem/i915_gem_userptr.h" +#include "gem/i915_gem_tiling.h" #include "gt/intel_engine_user.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" @@ -882,6 +883,96 @@ static void discard_ggtt_vma(struct i915_vma *vma) spin_unlock(&obj->vma.lock); } +static int +i915_gem_object_fits_in_aperture(struct drm_i915_gem_object *obj, +
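Archive note: the changelog above describes the heuristic as counting how many objects fit into the available holes, honouring alignment, and returning early once two fit. The patch body is cut off in this archive, so what follows is only a minimal userspace sketch of that idea, not the i915 implementation. The `struct hole` type and the `align_up()` and `holes_fit_objects()` helpers are hypothetical names invented for illustration; the real code walks the drm_mm hole nodes of the ggtt while holding the vm mutex and also accounts for guard pages, none of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one free range ("hole") in an address space. */
struct hole {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Return true as soon as the holes can hold at least `want` objects of
 * `size` bytes each, with every object starting on an `align` boundary.
 */
static bool holes_fit_objects(const struct hole *holes, size_t nholes,
			      uint64_t size, uint64_t align, unsigned int want)
{
	unsigned int count = 0;
	size_t i;

	assert(size > 0 && align > 0 && (align & (align - 1)) == 0);

	for (i = 0; i < nholes; i++) {
		uint64_t start = align_up(holes[i].start, align);
		uint64_t stride = align_up(size, align);

		/* Pack objects back to back at aligned offsets in this hole. */
		while (start <= holes[i].end && holes[i].end - start >= size) {
			if (++count >= want)
				return true;	/* early exit, no need to scan further */
			start += stride;
		}
	}

	return false;
}
```

With `want` set to 2 this mirrors the "space to map at least two objects" condition from the commit message: a caller would skip the PIN_MAPPABLE attempt when the function returns false.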
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
Am 17.03.22 um 10:59 schrieb Daniel Vetter: On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote: From: Rob Clark In the system suspend path, we don't want to be racing with the scheduler kthreads pushing additional queued up jobs to the hw queue (ringbuffer). So park them first. While we are at it, move the wait for active jobs to complete into the new system- suspend path. Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/adreno_device.c | 68 -- 1 file changed, 64 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c index 8859834b51b8..0440a98988fc 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_device.c +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu) static int adreno_runtime_suspend(struct device *dev) { struct msm_gpu *gpu = dev_to_gpu(dev); - int remaining; + + /* +* We should be holding a runpm ref, which will prevent +* runtime suspend. In the system suspend path, we've +* already waited for active jobs to complete. +*/ + WARN_ON_ONCE(gpu->active_submits); + + return gpu->funcs->pm_suspend(gpu); +} + +static void suspend_scheduler(struct msm_gpu *gpu) +{ + int i; + + /* +* Shut down the scheduler before we force suspend, so that +* suspend isn't racing with scheduler kthread feeding us +* more work. +* +* Note, we just want to park the thread, and let any jobs +* that are already on the hw queue complete normally, as +* opposed to the drm_sched_stop() path used for handling +* faulting/timed-out jobs. We can't really cancel any jobs +* already on the hw queue without racing with the GPU. +*/ + for (i = 0; i < gpu->nr_rings; i++) { + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; + kthread_park(sched->thread); Shouldn't we have some proper interfaces for this? If I'm not completely mistaken we already should have one, yes. 
Also I'm kinda wondering how other drivers do this, feels like we should have a standard way. Finally not flushing out all in-flight requests sounds a bit like a bad idea for system suspend/resume since that's also the hibernation path, and that would mean your shrinker/page reclaim stops working. At least in full generality. Which ain't good for hibernation. Completely agree, that looks like an incorrect workaround to me. During suspend, all userspace applications should be frozen and all of their hardware activity flushed out and waited on for completion. I do remember that our internal guys came up with pretty much the same idea and it sounded broken to me back then as well. Regards, Christian. Adding Christian and Andrey. -Daniel + } +} + +static void resume_scheduler(struct msm_gpu *gpu) +{ + int i; + + for (i = 0; i < gpu->nr_rings; i++) { + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; + kthread_unpark(sched->thread); + } +} + +static int adreno_system_suspend(struct device *dev) +{ + struct msm_gpu *gpu = dev_to_gpu(dev); + int remaining, ret; + + suspend_scheduler(gpu); remaining = wait_event_timeout(gpu->retire_event, active_submits(gpu) == 0, msecs_to_jiffies(1000)); if (remaining == 0) { dev_err(dev, "Timeout waiting for GPU to suspend\n"); - return -EBUSY; + ret = -EBUSY; + goto out; } - return gpu->funcs->pm_suspend(gpu); + ret = pm_runtime_force_suspend(dev); +out: + if (ret) + resume_scheduler(gpu); + + return ret; } + +static int adreno_system_resume(struct device *dev) +{ + resume_scheduler(dev_to_gpu(dev)); + return pm_runtime_force_resume(dev); +} + #endif static const struct dev_pm_ops adreno_pm_ops = { - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, pm_runtime_force_resume) + SET_SYSTEM_SLEEP_PM_OPS(adreno_system_suspend, adreno_system_resume) SET_RUNTIME_PM_OPS(adreno_runtime_suspend, adreno_runtime_resume, NULL) }; -- 2.35.1
Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)
On Thu, Mar 17, 2022 at 10:04:36AM +, Tvrtko Ursulin wrote: > > On 17/03/2022 09:47, Daniel Vetter wrote: > > On Tue, Mar 15, 2022 at 09:45:20AM +, Tvrtko Ursulin wrote: > > > > > > On 15/03/2022 07:28, Kasireddy, Vivek wrote: > > > > Hi Tvrtko, Daniel, > > > > > > > > > > > > > > On 11/03/2022 09:39, Daniel Vetter wrote: > > > > > > On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy > > > > > > wrote: > > > > > > > > > > > > > > On platforms capable of allowing 8K (7680 x 4320) modes, pinning > > > > > > > 2 or > > > > > > > more framebuffers/scanout buffers results in only one that is > > > > > > > mappable/ > > > > > > > fenceable. Therefore, pageflipping between these 2 FBs where only > > > > > > > one > > > > > > > is mappable/fenceable creates latencies large enough to miss > > > > > > > alternate > > > > > > > vblanks thereby producing less optimal framerate. > > > > > > > > > > > > > > This mainly happens because when > > > > > > > i915_gem_object_pin_to_display_plane() > > > > > > > is called to pin one of the FB objs, the associated vma is > > > > > > > identified > > > > > > > as misplaced and therefore i915_vma_unbind() is called which > > > > > > > unbinds and > > > > > > > evicts it. This misplaced vma gets subseqently pinned only when > > > > > > > i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This > > > > > > > results in a latency of ~10ms and happens every other > > > > > > > vblank/repaint cycle. > > > > > > > Therefore, to fix this issue, we try to see if there is space to > > > > > > > map > > > > > > > at-least two objects of a given size and return early if there > > > > > > > isn't. This > > > > > > > would ensure that we do not try with PIN_MAPPABLE for any objects > > > > > > > that > > > > > > > are too big to map thereby preventing unncessary unbind. 
> > > > > > > > > > > > > > Testcase: > > > > > > > Running Weston and weston-simple-egl on an Alderlake_S (ADLS) > > > > > > > platform > > > > > > > with a 8K@60 mode results in only ~40 FPS. Since upstream Weston > > > > > > > submits > > > > > > > a frame ~7ms before the next vblank, the latencies seen between > > > > > > > atomic > > > > > > > commit and flip event are 7, 24 (7 + 16.66), 7, 24. > > > > > > > suggesting that > > > > > > > it misses the vblank every other frame. > > > > > > > > > > > > > > Here is the ftrace snippet that shows the source of the ~10ms > > > > > > > latency: > > > > > > > i915_gem_object_pin_to_display_plane() { > > > > > > > 0.102 us |i915_gem_object_set_cache_level(); > > > > > > >i915_gem_object_ggtt_pin_ww() { > > > > > > > 0.390 us | i915_vma_instance(); > > > > > > > 0.178 us | i915_vma_misplaced(); > > > > > > > i915_vma_unbind() { > > > > > > > __i915_active_wait() { > > > > > > > 0.082 us |i915_active_acquire_if_busy(); > > > > > > > 0.475 us | } > > > > > > > intel_runtime_pm_get() { > > > > > > > 0.087 us |intel_runtime_pm_acquire(); > > > > > > > 0.259 us | } > > > > > > > __i915_active_wait() { > > > > > > > 0.085 us |i915_active_acquire_if_busy(); > > > > > > > 0.240 us | } > > > > > > > __i915_vma_evict() { > > > > > > >ggtt_unbind_vma() { > > > > > > > gen8_ggtt_clear_range() { > > > > > > > 10507.255 us |} > > > > > > > 10507.689 us | } > > > > > > > 10508.516 us | } > > > > > > > > > > > > > > v2: Instead of using bigjoiner checks, determine whether a scanout > > > > > > >buffer is too big by checking to see if it is possible to > > > > > > > map > > > > > > >two of them into the ggtt. > > > > > > > > > > > > > > v3 (Ville): > > > > > > > - Count how many fb objects can be fit into the available holes > > > > > > > instead of checking for a hole twice the object size. > > > > > > > - Take alignment constraints into account. > > > > > > > - Limit this large scanout buffer check to >= Gen 11 platforms. 
> > > > > > > > > > > > > > v4: > > > > > > > - Remove existing heuristic that checks just for size. (Ville) > > > > > > > - Return early if we find space to map at-least two objects. > > > > > > > (Tvrtko) > > > > > > > - Slightly update the commit message. > > > > > > > > > > > > > > v5: (Tvrtko) > > > > > > > - Rename the function to indicate that the object may be too big > > > > > > > to > > > > > > > map into the aperture. > > > > > > > - Account for guard pages while calculating the total size > > > > > > > required > > > > > > > for the object. > > > > > > > - Do not subject all objects to the heuristic check and instead > > > > > > > consider objects only of a certain size. > > > > > > > - Do the hole walk using the rbtree. > > > > > > > - Preserve the existing PIN_NONBLOCK logic. > > > > > > > - Drop the PIN_MAPP
[PULL] drm-misc-fixes
Hi Dave and Daniel,

here's the PR for drm-misc-fixes for this week. Besides the fixes, it
contains a backmerge of drm/drm-fixes to get required Kconfig changes
from upstream.

Best regards
Thomas

drm-misc-fixes-2022-03-17:
 * drm/imx: Don't test bus flags in atomic check
 * drm/mgag200: Fix PLL setup on some models
 * drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
   Kconfig dependencies

The following changes since commit 09688c0166e76ce2fb85e86b9d99be8b0084cdf9:

  Linux 5.17-rc8 (2022-03-13 13:23:37 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2022-03-17

for you to fetch changes up to 3c3384050d68570f9de0fec9e58824decfefba7a:

  drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS (2022-03-17 11:07:57 +0100)

 * drm/imx: Don't test bus flags in atomic check
 * drm/mgag200: Fix PLL setup on some models
 * drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
   Kconfig dependencies

Christoph Niedermaier (1):
      drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()

Jocelyn Falempe (1):
      drm/mgag200: Fix PLL setup for g200wb and g200ew

Marek Vasut (1):
      drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings

Thomas Zimmermann (2):
      Merge drm/drm-fixes into drm-misc-fixes
      drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS

 drivers/gpu/drm/bridge/Kconfig         | 2 +-
 drivers/gpu/drm/imx/parallel-display.c | 8
 drivers/gpu/drm/mgag200/mgag200_pll.c  | 6 +++---
 drivers/gpu/drm/panel/Kconfig          | 1 +
 drivers/gpu/drm/panel/panel-simple.c   | 2 +-
 5 files changed, 6 insertions(+), 13 deletions(-)

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
Hi,

On Thu, 17 Mar 2022 at 09:21, Christian König wrote:
> On 17.03.22 at 09:42, Sharma, Shashank wrote:
> >> AFAIU you probably want to be passing around a `struct pid *`, and
> >> then somehow use pid_vnr() in the context of the process reading the
> >> event to get the numeric pid. Otherwise things will not do what you
> >> expect if the process triggering the crash is in a different pid
> >> namespace from the compositor.
> >
> > I am not sure if it is a good idea to add the pid extraction
> > complexity in here, it is left up to the driver to extract this
> > information and pass it to the work queue. In case of AMDGPU, it's
> > extracted from GPU VM. It would be then more flexible for the drivers
> > as well.
>
> Yeah, but that is just used for debugging.
>
> If we want to use the pid for housekeeping, like for a daemon which
> kills/restarts processes, we absolutely need that or otherwise won't be
> able to work with containers.

100% this.

Pushing back to the compositor is a red herring. The compositor is just
a service which tries to handle window management and input. If you're
looking to kill the offending process or whatever, then that should go
through the session manager - be it systemd or something
container-centric or whatever. At least that way it can deal with
cgroups at the same time, unlike the compositor which is not really
aware of what the thing on the other end of the socket is doing. This
ties in with the support they already have for things like coredump
analysis, and would also be useful for other devices.

Some environments combine compositor and session manager, and a lot of
them have them strongly related, but they're very definitely not the
same thing ...

Cheers,
Daniel
Re: [Intel-gfx] [PATCH v6 2/2] drm/i915/gem: Don't try to map and fence large scanout buffers (v9)
On 17/03/2022 07:08, Kasireddy, Vivek wrote: Hi Tvrtko, On 16/03/2022 07:37, Kasireddy, Vivek wrote: Hi Tvrtko, On 15/03/2022 07:28, Kasireddy, Vivek wrote: Hi Tvrtko, Daniel, On 11/03/2022 09:39, Daniel Vetter wrote: On Mon, 7 Mar 2022 at 21:38, Vivek Kasireddy wrote: On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or more framebuffers/scanout buffers results in only one that is mappable/ fenceable. Therefore, pageflipping between these 2 FBs where only one is mappable/fenceable creates latencies large enough to miss alternate vblanks thereby producing less optimal framerate. This mainly happens because when i915_gem_object_pin_to_display_plane() is called to pin one of the FB objs, the associated vma is identified as misplaced and therefore i915_vma_unbind() is called which unbinds and evicts it. This misplaced vma gets subseqently pinned only when i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This results in a latency of ~10ms and happens every other vblank/repaint cycle. Therefore, to fix this issue, we try to see if there is space to map at-least two objects of a given size and return early if there isn't. This would ensure that we do not try with PIN_MAPPABLE for any objects that are too big to map thereby preventing unncessary unbind. Testcase: Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform with a 8K@60 mode results in only ~40 FPS. Since upstream Weston submits a frame ~7ms before the next vblank, the latencies seen between atomic commit and flip event are 7, 24 (7 + 16.66), 7, 24. suggesting that it misses the vblank every other frame. 
Here is the ftrace snippet that shows the source of the ~10ms latency: i915_gem_object_pin_to_display_plane() { 0.102 us |i915_gem_object_set_cache_level(); i915_gem_object_ggtt_pin_ww() { 0.390 us | i915_vma_instance(); 0.178 us | i915_vma_misplaced(); i915_vma_unbind() { __i915_active_wait() { 0.082 us |i915_active_acquire_if_busy(); 0.475 us | } intel_runtime_pm_get() { 0.087 us |intel_runtime_pm_acquire(); 0.259 us | } __i915_active_wait() { 0.085 us |i915_active_acquire_if_busy(); 0.240 us | } __i915_vma_evict() { ggtt_unbind_vma() { gen8_ggtt_clear_range() { 10507.255 us |} 10507.689 us | } 10508.516 us | } v2: Instead of using bigjoiner checks, determine whether a scanout buffer is too big by checking to see if it is possible to map two of them into the ggtt. v3 (Ville): - Count how many fb objects can be fit into the available holes instead of checking for a hole twice the object size. - Take alignment constraints into account. - Limit this large scanout buffer check to >= Gen 11 platforms. v4: - Remove existing heuristic that checks just for size. (Ville) - Return early if we find space to map at-least two objects. (Tvrtko) - Slightly update the commit message. v5: (Tvrtko) - Rename the function to indicate that the object may be too big to map into the aperture. - Account for guard pages while calculating the total size required for the object. - Do not subject all objects to the heuristic check and instead consider objects only of a certain size. - Do the hole walk using the rbtree. - Preserve the existing PIN_NONBLOCK logic. - Drop the PIN_MAPPABLE check while pinning the VMA. v6: (Tvrtko) - Return 0 on success and the specific error code on failure to preserve the existing behavior. v7: (Ville) - Drop the HAS_GMCH(i915), DISPLAY_VER(i915) < 11 and size < ggtt->mappable_end / 4 checks. - Drop the redundant check that is based on previous heuristic. v8: - Make sure that we are holding the mutex associated with ggtt vm as we traverse the hole nodes. 
v9: (Tvrtko) - Use mutex_lock_interruptible_nested() instead of mutex_lock(). Cc: Ville Syrjälä Cc: Maarten Lankhorst Cc: Tvrtko Ursulin Cc: Manasi Navare Reviewed-by: Tvrtko Ursulin Signed-off-by: Vivek Kasireddy --- drivers/gpu/drm/i915/i915_gem.c | 128 +++ - 1 file changed, 94 insertions(+), 34 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 9747924cc57b..e0d731b3f215 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -49,6 +49,7 @@ #include "gem/i915_gem_pm.h" #include "gem/i915_gem_region.h" #include "gem/i915_gem_userptr.h" +#include "gem/i915_gem_tiling.h" #include "gt/intel_engine_user.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" @@ -882,6 +883,96 @@ static void discard_ggtt_vma(struct i915_vma *vma) spin_unlock(&obj->vma.lock); } +static int +i915_gem_object_fits_in_aperture(struct drm_
Re: [PATCH] drm: drm_bufs: Error out if 'dev->agp' is a null pointer
On Fri, Mar 11, 2022 at 07:23:02AM +, Zheyu Ma wrote:
> The user program can control the 'drm_buf_desc::flags' via ioctl system
> call and enter the function drm_legacy_addbufs_agp(). If the driver
> doesn't initialize the agp resources, the driver will cause a null
> pointer dereference.
>
> The following log reveals it:
> general protection fault, probably for non-canonical address
> 0xdc0f: [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0078-0x007f]
> Call Trace:
>
> drm_ioctl_kernel+0x342/0x450 drivers/gpu/drm/drm_ioctl.c:785
> drm_ioctl+0x592/0x940 drivers/gpu/drm/drm_ioctl.c:885
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:874 [inline]
> __se_sys_ioctl+0xaa/0xf0 fs/ioctl.c:860
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x43/0x90 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Fix this bug by adding a check.
>
> Signed-off-by: Zheyu Ma

You can only hit this if you enabled a DRIVER_LEGACY drm driver, which
opens you up to tons of other CVEs and issues. What's your .config?
-Daniel

> ---
>  drivers/gpu/drm/drm_bufs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
> index fcca21e8efac..4fe2363b1e34 100644
> --- a/drivers/gpu/drm/drm_bufs.c
> +++ b/drivers/gpu/drm/drm_bufs.c
> @@ -734,7 +734,7 @@ int drm_legacy_addbufs_agp(struct drm_device *dev,
>  	int i, valid;
>  	struct drm_buf **temp_buflist;
>
> -	if (!dma)
> +	if (!dma || !dev->agp)
>  		return -EINVAL;
>
>  	count = request->count;
> --
> 2.25.1
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH v1] drm/shmem-helper: Correct doc-comment of drm_gem_shmem_get_sg_table()
On 3/17/22 12:52, Daniel Vetter wrote: > On Tue, Mar 08, 2022 at 04:34:01PM +0300, Dmitry Osipenko wrote: >> drm_gem_shmem_get_sg_table() never returns NULL on error, but a ERR_PTR. >> Correct the doc comment which says that it returns NULL on error. >> >> Signed-off-by: Dmitry Osipenko >> --- >> drivers/gpu/drm/drm_gem_shmem_helper.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c >> b/drivers/gpu/drm/drm_gem_shmem_helper.c >> index 8ad0e02991ca..37009418cd28 100644 >> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c >> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c >> @@ -662,7 +662,7 @@ EXPORT_SYMBOL(drm_gem_shmem_print_info); >> * drm_gem_shmem_get_pages_sgt() instead. >> * >> * Returns: >> - * A pointer to the scatter/gather table of pinned pages or NULL on failure. >> + * A pointer to the scatter/gather table of pinned pages or errno on >> failure. > > Hm usually we write "negative errno" for these, since the error numbers > are defined as positive numbers. Care to respin? It's actually ERR_PTR that is returned here, "errno" was borrowed from some other similar DRM comment. I added this patch to v2 of virtio patchset [1] and will improve the comment in v3, thanks. [1] https://lore.kernel.org/dri-devel/20220314224253.236359-1-dmitry.osipe...@collabora.com/T/#t
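For readers following the NULL vs. errno vs. ERR_PTR point above, here is a minimal userspace sketch of the error-pointer convention. The three helpers are simplified stand-ins for the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(); get_table() and use_table() are hypothetical names made up for this illustration and are not part of the shmem helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	/* Encode a negative errno in the pointer value itself. */
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_table;

/* Hypothetical getter: returns a valid pointer or an ERR_PTR, never NULL. */
static void *get_table(int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	return &dummy_table;
}

/* Hypothetical caller: returns 0 on success or a negative errno. */
static int use_table(int fail)
{
	void *sgt = get_table(fail);

	if (IS_ERR(sgt))
		return (int)PTR_ERR(sgt);

	assert(sgt != NULL);
	return 0;
}
```

A caller that only tests for NULL would treat the error pointer as a valid table, which is why the wording of the doc comment matters.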
Re: [PATCH 2/2] fbdev: Fix cfb_imageblit() for arbitrary image widths
On Sun, Mar 13, 2022 at 08:29:52PM +0100, Thomas Zimmermann wrote: > Commit 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") > broke cfb_imageblit() for image widths that are not aligned to 8-bit > boundaries. Fix this by handling the trailing pixels on each line > separately. The performance improvements in the original commit do not > regress by this change. > > Signed-off-by: Thomas Zimmermann > Fixes: 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") > Reported-by: Marek Szyprowski > Cc: Thomas Zimmermann > Cc: Javier Martinez Canillas > Cc: Sam Ravnborg On both patches: Acked-by: Daniel Vetter > --- > drivers/video/fbdev/core/cfbimgblt.c | 28 > 1 file changed, 24 insertions(+), 4 deletions(-) > > diff --git a/drivers/video/fbdev/core/cfbimgblt.c > b/drivers/video/fbdev/core/cfbimgblt.c > index 7361cfabdd85..9ebda4e0dc7a 100644 > --- a/drivers/video/fbdev/core/cfbimgblt.c > +++ b/drivers/video/fbdev/core/cfbimgblt.c > @@ -218,7 +218,7 @@ static inline void fast_imageblit(const struct fb_image > *image, struct fb_info * > { > u32 fgx = fgcolor, bgx = bgcolor, bpp = p->var.bits_per_pixel; > u32 ppw = 32/bpp, spitch = (image->width + 7)/8; > - u32 bit_mask, eorx; > + u32 bit_mask, eorx, shift; > const char *s = image->data, *src; > u32 __iomem *dst; > const u32 *tab = NULL; > @@ -259,17 +259,23 @@ static inline void fast_imageblit(const struct fb_image > *image, struct fb_info * > > for (i = image->height; i--; ) { > dst = (u32 __iomem *)dst1; > + shift = 8; > src = s; > > + /* > + * Manually unroll the per-line copying loop for better > + * performance. This works until we processed the last > + * completely filled source byte (inclusive). 
> + */ > switch (ppw) { > case 4: /* 8 bpp */ > - for (j = k; j; j -= 2, ++src) { > + for (j = k; j >= 2; j -= 2, ++src) { > FB_WRITEL(colortab[(*src >> 4) & bit_mask], > dst++); > FB_WRITEL(colortab[(*src >> 0) & bit_mask], > dst++); > } > break; > case 2: /* 16 bpp */ > - for (j = k; j; j -= 4, ++src) { > + for (j = k; j >= 4; j -= 4, ++src) { > FB_WRITEL(colortab[(*src >> 6) & bit_mask], > dst++); > FB_WRITEL(colortab[(*src >> 4) & bit_mask], > dst++); > FB_WRITEL(colortab[(*src >> 2) & bit_mask], > dst++); > @@ -277,7 +283,7 @@ static inline void fast_imageblit(const struct fb_image > *image, struct fb_info * > } > break; > case 1: /* 32 bpp */ > - for (j = k; j; j -= 8, ++src) { > + for (j = k; j >= 8; j -= 8, ++src) { > FB_WRITEL(colortab[(*src >> 7) & bit_mask], > dst++); > FB_WRITEL(colortab[(*src >> 6) & bit_mask], > dst++); > FB_WRITEL(colortab[(*src >> 5) & bit_mask], > dst++); > @@ -290,6 +296,20 @@ static inline void fast_imageblit(const struct fb_image > *image, struct fb_info * > break; > } > > + /* > + * For image widths that are not a multiple of 8, there > + * are trailing pixels left on the current line. Print > + * them as well. > + */ > + for (; j--; ) { > + shift -= ppw; > + FB_WRITEL(colortab[(*src >> shift) & bit_mask], dst++); > + if (!shift) { > + shift = 8; > + ++src; > + } > + } > + > dst1 += p->fix.line_length; > s += spitch; > } > -- > 2.35.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
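The two-phase approach in the patch above (whole source bytes first, then the trailing pixels of the line) can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the fbdev code: blit_row() is a hypothetical name, the fast path is a plain inner loop rather than the manually unrolled switch, and plain stores replace FB_WRITEL(). As in the quoted diff, each destination word is assumed to consume ppw source bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: expand one line of a 1-bit-per-pixel source into
 * destination words via a lookup table. Returns the number of words written.
 */
static size_t blit_row(const uint8_t *src, uint32_t *dst,
		       unsigned int width, unsigned int ppw,
		       const uint32_t *colortab)
{
	unsigned int bit_mask = (1u << ppw) - 1;
	unsigned int words_per_byte = 8 / ppw;
	unsigned int j = width / ppw;	/* destination words on this line */
	unsigned int shift = 8;
	uint32_t *start = dst;

	assert(ppw == 1 || ppw == 2 || ppw == 4);

	/* Fast path: only runs while a completely filled source byte is left. */
	for (; j >= words_per_byte; j -= words_per_byte, ++src) {
		unsigned int n;

		for (n = 0; n < words_per_byte; n++) {
			shift -= ppw;
			*dst++ = colortab[(*src >> shift) & bit_mask];
		}
		shift = 8;
	}

	/* Trailing pixels: fewer than words_per_byte words remain. */
	for (; j--; ) {
		shift -= ppw;
		*dst++ = colortab[(*src >> shift) & bit_mask];
		if (!shift) {
			shift = 8;
			++src;
		}
	}

	return (size_t)(dst - start);
}
```

With the old loop condition (j instead of j >= N), a line whose word count is not a multiple of the words per source byte would make j wrap around instead of reaching zero, which matches the breakage described in the commit message.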
Re: [PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code
On Thu, Mar 17, 2022 at 10:25:55AM +0100, Patrik Jakobsson wrote: > Use the gma_ prefix instead of psb_ since the code is common for all > chips. Various coding style fixes. Removal of unused code. Removal of > duplicate function declarations. I didn't really find the above removal things, was that from an old commit message before you split those changes out? Aside from that nit on the commit message on all 4 patches (btw you're threading is somehow broken in this series): Acked-by: Daniel Vetter > > Signed-off-by: Patrik Jakobsson > --- > drivers/gpu/drm/gma500/gma_display.c | 8 +-- > drivers/gpu/drm/gma500/opregion.c| 5 +- > drivers/gpu/drm/gma500/power.c | 10 +-- > drivers/gpu/drm/gma500/psb_drv.c | 2 +- > drivers/gpu/drm/gma500/psb_drv.h | 11 > drivers/gpu/drm/gma500/psb_irq.c | 94 +++- > drivers/gpu/drm/gma500/psb_irq.h | 19 +++--- > 7 files changed, 57 insertions(+), 92 deletions(-) > > diff --git a/drivers/gpu/drm/gma500/gma_display.c > b/drivers/gpu/drm/gma500/gma_display.c > index 931ffb192fc4..1d7964c339f4 100644 > --- a/drivers/gpu/drm/gma500/gma_display.c > +++ b/drivers/gpu/drm/gma500/gma_display.c > @@ -17,7 +17,7 @@ > #include "framebuffer.h" > #include "gem.h" > #include "gma_display.h" > -#include "psb_drv.h" > +#include "psb_irq.h" > #include "psb_intel_drv.h" > #include "psb_intel_reg.h" > > @@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = { > .set_config = gma_crtc_set_config, > .destroy = gma_crtc_destroy, > .page_flip = gma_crtc_page_flip, > - .enable_vblank = psb_enable_vblank, > - .disable_vblank = psb_disable_vblank, > - .get_vblank_counter = psb_get_vblank_counter, > + .enable_vblank = gma_enable_vblank, > + .disable_vblank = gma_disable_vblank, > + .get_vblank_counter = gma_get_vblank_counter, > }; > > /* > diff --git a/drivers/gpu/drm/gma500/opregion.c > b/drivers/gpu/drm/gma500/opregion.c > index fef04ff8c3a9..dc494df71a48 100644 > --- a/drivers/gpu/drm/gma500/opregion.c > +++ b/drivers/gpu/drm/gma500/opregion.c > @@ 
-23,6 +23,7 @@ > */ > #include > #include "psb_drv.h" > +#include "psb_irq.h" > #include "psb_intel_reg.h" > > #define PCI_ASLE 0xe4 > @@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device > *dev) > if (asle && system_opregion ) { > /* Don't do this on Medfield or other non PC like devices, they > use the bit for something different altogether */ > - psb_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE); > - psb_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE); > + gma_enable_pipestat(dev_priv, 0, PIPE_LEGACY_BLC_EVENT_ENABLE); > + gma_enable_pipestat(dev_priv, 1, PIPE_LEGACY_BLC_EVENT_ENABLE); > > asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN > | ASLE_PFMB_EN; > diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c > index 6f917cfef65b..b91de6d36e41 100644 > --- a/drivers/gpu/drm/gma500/power.c > +++ b/drivers/gpu/drm/gma500/power.c > @@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev) > dev_err(dev->dev, "GPU hardware busy, cannot > suspend\n"); > return -EBUSY; > } > - psb_irq_uninstall(dev); > + gma_irq_uninstall(dev); > gma_suspend_display(dev); > gma_suspend_pci(pdev); > } > @@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev) > mutex_lock(&power_mutex); > gma_resume_pci(pdev); > gma_resume_display(pdev); > - psb_irq_preinstall(dev); > - psb_irq_postinstall(dev); > + gma_irq_preinstall(dev); > + gma_irq_postinstall(dev); > mutex_unlock(&power_mutex); > return 0; > } > @@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool > force_on) > /* Ok power up needed */ > ret = gma_resume_pci(pdev); > if (ret == 0) { > - psb_irq_preinstall(dev); > - psb_irq_postinstall(dev); > + gma_irq_preinstall(dev); > + gma_irq_postinstall(dev); > pm_runtime_get(dev->dev); > dev_priv->display_count++; > spin_unlock_irqrestore(&power_ctrl_lock, flags); > diff --git a/drivers/gpu/drm/gma500/psb_drv.c > b/drivers/gpu/drm/gma500/psb_drv.c > index e30b58184156..82d51e9821ad 100644 > 
--- a/drivers/gpu/drm/gma500/psb_drv.c > +++ b/drivers/gpu/drm/gma500/psb_drv.c > @@ -380,7 +380,7 @@ static int psb_driver_load(struct drm_device *dev, > unsigned long flags) > PSB_WVDC32(0x, PSB_INT_MASK_R); > spin_unlock_irqrestore(&dev_priv->irqmask_lock, irqflags); > > - psb_irq_install(dev, pdev->irq); > + gma_irq_install(dev, pdev->irq); > > dev->max_vblank_count = 0xff; /* o
Re: [PATCH 1/2] fbdev: Fix sys_imageblit() for arbitrary image widths
Hello Thomas, On 3/13/22 20:29, Thomas Zimmermann wrote: > Commit 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()") > broke sys_imageblit() for image width that are not aligned to 8-bit > boundaries. Fix this by handling the trailing pixels on each line > separately. The performance improvements in the original commit do not > regress by this change. > > Signed-off-by: Thomas Zimmermann > Fixes: 6f29e04938bf ("fbdev: Improve performance of sys_imageblit()") > Cc: Thomas Zimmermann > Cc: Javier Martinez Canillas > Cc: Sam Ravnborg > --- Looks good to me. Also Marek and Geert mentioned that fixes the issue they were seeing. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Linux Engineering Red Hat
Re: [PATCH 2/2] fbdev: Fix cfb_imageblit() for arbitrary image widths
On 3/13/22 20:29, Thomas Zimmermann wrote: > Commit 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") > broke cfb_imageblit() for image widths that are not aligned to 8-bit > boundaries. Fix this by handling the trailing pixels on each line > separately. The performance improvements in the original commit do not > regress by this change. > > Signed-off-by: Thomas Zimmermann > Fixes: 0d03011894d2 ("fbdev: Improve performance of cfb_imageblit()") > Reported-by: Marek Szyprowski > Cc: Thomas Zimmermann > Cc: Javier Martinez Canillas > Cc: Sam Ravnborg > --- Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Linux Engineering Red Hat
Re: [PATCH] fbdev: defio: fix the pagelist corruption
Hello Chuansheng, On 3/17/22 06:46, Chuansheng Liu wrote: > Easily hit the below list corruption: > == > list_add corruption. prev->next should be next (c0ceb090), but > was ec604507edc8. (prev=ec604507edc8). > WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26 > __list_add_valid+0x53/0x80 > CPU: 65 PID: 3959 Comm: fbdev Tainted: G U > RIP: 0010:__list_add_valid+0x53/0x80 > Call Trace: > > fb_deferred_io_mkwrite+0xea/0x150 > do_page_mkwrite+0x57/0xc0 > do_wp_page+0x278/0x2f0 > __handle_mm_fault+0xdc2/0x1590 > handle_mm_fault+0xdd/0x2c0 > do_user_addr_fault+0x1d3/0x650 > exc_page_fault+0x77/0x180 > ? asm_exc_page_fault+0x8/0x30 > asm_exc_page_fault+0x1e/0x30 > RIP: 0033:0x7fd98fc8fad1 > == > > Figure out the race happens when one process is adding &page->lru into > the pagelist tail in fb_deferred_io_mkwrite(), another process is > re-initializing the same &page->lru in fb_deferred_io_fault(), which is > not protected by the lock. > > This fix is to init all the page lists one time during initialization, > it not only fixes the list corruption, but also avoids INIT_LIST_HEAD() > redundantly. > > Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already > enlisted") > Cc: Thomas Zimmermann > Signed-off-by: Chuansheng Liu > --- This makes sense to me. If you address Geert comment and post a v2, feel free to add: Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Linux Engineering Red Hat
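As an illustration of the fix described above (initialize every page's list node once, then only add it to the pagelist if it is not already enlisted), here is a small userspace sketch. The list helpers are simplified reimplementations in the style of the kernel's list API, struct fake_page and enlist_page() are hypothetical, and the sketch assumes the early-out is a list-empty test on the page's node. Locking is deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal doubly linked list in the style of the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	struct list_head *prev = head->prev;

	node->prev = prev;
	node->next = head;
	prev->next = node;
	head->prev = node;
}

/* Hypothetical stand-in for a page with an embedded list node. */
struct fake_page {
	struct list_head lru;
};

/* Done once at setup time, never again on the fault path. */
static void init_pages(struct fake_page *pages, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		init_list_head(&pages[i].lru);
}

/* Returns 1 if the page was added, 0 if it was already enlisted. */
static int enlist_page(struct fake_page *page, struct list_head *pagelist)
{
	assert(page != NULL && pagelist != NULL);

	if (!list_empty(&page->lru))
		return 0;

	list_add_tail(&page->lru, pagelist);
	return 1;
}
```

If another path re-initialized page->lru while the node was linked into the pagelist, the neighbours would still point at it, which is the kind of inconsistency the list_add corruption warning in the report complains about.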
Re: [PATCH 4/4] drm/gma500: Cosmetic cleanup of irq code
On Thu, Mar 17, 2022 at 12:02 PM Daniel Vetter wrote: > > On Thu, Mar 17, 2022 at 10:25:55AM +0100, Patrik Jakobsson wrote: > > Use the gma_ prefix instead of psb_ since the code is common for all > > chips. Various coding style fixes. Removal of unused code. Removal of > > duplicate function declarations. > > I didn't really find the above removal things, was that from an old commit > message before you split those changes out? I was thinking about the removal of mid_pipe_vsync() (unused code) and the psb_irq declarations in psb_drv.h (duplicate function declarations). Perhaps I should've split this up in several patches. > > Aside from that nit on the commit message on all 4 patches (btw you're > threading is somehow broken in this series): I have a new gitconfig on this machine. It's likely misconfigured with --no-thread or something like that. Thanks for the review. > > Acked-by: Daniel Vetter > > > > Signed-off-by: Patrik Jakobsson > > --- > > drivers/gpu/drm/gma500/gma_display.c | 8 +-- > > drivers/gpu/drm/gma500/opregion.c| 5 +- > > drivers/gpu/drm/gma500/power.c | 10 +-- > > drivers/gpu/drm/gma500/psb_drv.c | 2 +- > > drivers/gpu/drm/gma500/psb_drv.h | 11 > > drivers/gpu/drm/gma500/psb_irq.c | 94 +++- > > drivers/gpu/drm/gma500/psb_irq.h | 19 +++--- > > 7 files changed, 57 insertions(+), 92 deletions(-) > > > > diff --git a/drivers/gpu/drm/gma500/gma_display.c > > b/drivers/gpu/drm/gma500/gma_display.c > > index 931ffb192fc4..1d7964c339f4 100644 > > --- a/drivers/gpu/drm/gma500/gma_display.c > > +++ b/drivers/gpu/drm/gma500/gma_display.c > > @@ -17,7 +17,7 @@ > > #include "framebuffer.h" > > #include "gem.h" > > #include "gma_display.h" > > -#include "psb_drv.h" > > +#include "psb_irq.h" > > #include "psb_intel_drv.h" > > #include "psb_intel_reg.h" > > > > @@ -572,9 +572,9 @@ const struct drm_crtc_funcs gma_crtc_funcs = { > > .set_config = gma_crtc_set_config, > > .destroy = gma_crtc_destroy, > > .page_flip = gma_crtc_page_flip, > > - .enable_vblank = 
psb_enable_vblank, > > - .disable_vblank = psb_disable_vblank, > > - .get_vblank_counter = psb_get_vblank_counter, > > + .enable_vblank = gma_enable_vblank, > > + .disable_vblank = gma_disable_vblank, > > + .get_vblank_counter = gma_get_vblank_counter, > > }; > > > > /* > > diff --git a/drivers/gpu/drm/gma500/opregion.c > > b/drivers/gpu/drm/gma500/opregion.c > > index fef04ff8c3a9..dc494df71a48 100644 > > --- a/drivers/gpu/drm/gma500/opregion.c > > +++ b/drivers/gpu/drm/gma500/opregion.c > > @@ -23,6 +23,7 @@ > > */ > > #include > > #include "psb_drv.h" > > +#include "psb_irq.h" > > #include "psb_intel_reg.h" > > > > #define PCI_ASLE 0xe4 > > @@ -217,8 +218,8 @@ void psb_intel_opregion_enable_asle(struct drm_device > > *dev) > > if (asle && system_opregion ) { > > /* Don't do this on Medfield or other non PC like devices, > > they > > use the bit for something different altogether */ > > - psb_enable_pipestat(dev_priv, 0, > > PIPE_LEGACY_BLC_EVENT_ENABLE); > > - psb_enable_pipestat(dev_priv, 1, > > PIPE_LEGACY_BLC_EVENT_ENABLE); > > + gma_enable_pipestat(dev_priv, 0, > > PIPE_LEGACY_BLC_EVENT_ENABLE); > > + gma_enable_pipestat(dev_priv, 1, > > PIPE_LEGACY_BLC_EVENT_ENABLE); > > > > asle->tche = ASLE_ALS_EN | ASLE_BLC_EN | ASLE_PFIT_EN > > | > > ASLE_PFMB_EN; > > diff --git a/drivers/gpu/drm/gma500/power.c b/drivers/gpu/drm/gma500/power.c > > index 6f917cfef65b..b91de6d36e41 100644 > > --- a/drivers/gpu/drm/gma500/power.c > > +++ b/drivers/gpu/drm/gma500/power.c > > @@ -201,7 +201,7 @@ int gma_power_suspend(struct device *_dev) > > dev_err(dev->dev, "GPU hardware busy, cannot > > suspend\n"); > > return -EBUSY; > > } > > - psb_irq_uninstall(dev); > > + gma_irq_uninstall(dev); > > gma_suspend_display(dev); > > gma_suspend_pci(pdev); > > } > > @@ -223,8 +223,8 @@ int gma_power_resume(struct device *_dev) > > mutex_lock(&power_mutex); > > gma_resume_pci(pdev); > > gma_resume_display(pdev); > > - psb_irq_preinstall(dev); > > - psb_irq_postinstall(dev); > > + 
gma_irq_preinstall(dev); > > + gma_irq_postinstall(dev); > > mutex_unlock(&power_mutex); > > return 0; > > } > > @@ -270,8 +270,8 @@ bool gma_power_begin(struct drm_device *dev, bool > > force_on) > > /* Ok power up needed */ > > ret = gma_resume_pci(pdev); > > if (ret == 0) { > > - psb_irq_preinstall(dev); > > - psb_irq_postinstall(dev); > > + gma_irq_preinstall(dev); > > + gma_irq_postinstall(dev); > > pm_runtime_get(dev->dev); > > dev_priv->display_c
Re: [PATCH v2 0/5] drm: Fix monochrome conversion for sdd130x
Hello Geert,

On 3/17/22 09:18, Geert Uytterhoeven wrote:
> Hi all,
>
> This patch series contains fixes and improvements for the XRGB888 to
> monochrome conversion in the DRM core, and for its users.
>
> This has been tested on an Adafruit FeatherWing 128x32 OLED, connected
> to an OrangeCrab ECP5 FPGA board running a 64 MHz VexRiscv RISC-V
> softcore, using a text console with 4x6, 7x14 and 8x8 fonts.
>
> Thanks!
>
> Geert Uytterhoeven (5):
>   drm/format-helper: Rename drm_fb_xrgb_to_mono_reversed()
>   drm/format-helper: Fix XRGB888 to monochrome conversion
>   drm/ssd130x: Fix rectangle updates
>   drm/ssd130x: Reduce temporary buffer sizes
>   drm/repaper: Reduce temporary buffer size in repaper_fb_dirty()
>

Thanks for re-spinning this series and again for fixing my bugs!

I pushed patches 1-4 to drm-misc (drm-misc-next) but left patch 5, since I would like to give Noralf the opportunity to review/test before pushing.

By the way, you should probably request commit access to the drm-misc tree:

https://drm.pages.freedesktop.org/maintainer-tools/commit-access.html

--
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat
[v8 0/5] enhanced edid driver compatibility
Add support for parsing multiple CEA extension blocks and the HF-EEODB to extend the capability of the drm edid driver.

v4: add one more patch to support HF-SCDB
v5: HF-SCDB and HF-VSDBS carry the same SCDS data. Reuse drm_parse_hdmi_forum_vsdb() to parse this packet.
v6: save the proper extension block index if CTA data information was found in a DisplayID block.
v7: use different parameters to store the CEA and DisplayID block indexes. Configure the DisplayID extension block index before searching for an available DisplayID block.
v8: revert the patch [v7 2/5] change, and check the cea pointer returned from drm_find_cea_extension(). If the driver gets the same cea pointer, exit this routine.

Lee Shawn C (5):
  drm/edid: seek for available CEA block from specific EDID block index
  drm/edid: parse multiple CEA extension block
  drm/edid: read HF-EEODB ext block
  drm/edid: parse HF-EEODB CEA extension block
  drm/edid: check for HF-SCDB block

 drivers/gpu/drm/drm_connector.c | 8 +-
 drivers/gpu/drm/drm_displayid.c | 5 +-
 drivers/gpu/drm/drm_edid.c | 174
 include/drm/drm_edid.h | 4 +-
 4 files changed, 144 insertions(+), 47 deletions(-)

--
2.17.1
[v8 1/5] drm/edid: seek for available CEA block from specific EDID block index
drm_find_cea_extension() always look for a top level CEA block. Pass ext_index from caller then this function to search next available CEA ext block from a specific EDID block pointer. Cc: Jani Nikula Cc: Ville Syrjala Cc: Ankit Nautiyal Cc: intel-gfx Signed-off-by: Lee Shawn C --- drivers/gpu/drm/drm_edid.c | 42 ++ 1 file changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 561f53831e29..1251226d9284 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -3353,16 +3353,14 @@ const u8 *drm_find_edid_extension(const struct edid *edid, return edid_ext; } -static const u8 *drm_find_cea_extension(const struct edid *edid) +static const u8 *drm_find_cea_extension(const struct edid *edid, int *ext_index) { const struct displayid_block *block; struct displayid_iter iter; const u8 *cea; - int ext_index = 0; - /* Look for a top level CEA extension block */ - /* FIXME: make callers iterate through multiple CEA ext blocks? 
*/ - cea = drm_find_edid_extension(edid, CEA_EXT, &ext_index); + /* Look for a CEA extension block from ext_index */ + cea = drm_find_edid_extension(edid, CEA_EXT, ext_index); if (cea) return cea; @@ -3643,10 +3641,10 @@ add_alternate_cea_modes(struct drm_connector *connector, struct edid *edid) struct drm_device *dev = connector->dev; struct drm_display_mode *mode, *tmp; LIST_HEAD(list); - int modes = 0; + int modes = 0, ext_index = 0; /* Don't add CEA modes if the CEA extension block is missing */ - if (!drm_find_cea_extension(edid)) + if (!drm_find_cea_extension(edid, &ext_index)) return 0; /* @@ -4321,11 +4319,11 @@ static void drm_parse_y420cmdb_bitmap(struct drm_connector *connector, static int add_cea_modes(struct drm_connector *connector, struct edid *edid) { - const u8 *cea = drm_find_cea_extension(edid); - const u8 *db, *hdmi = NULL, *video = NULL; + const u8 *cea, *db, *hdmi = NULL, *video = NULL; u8 dbl, hdmi_len, video_len = 0; - int modes = 0; + int modes = 0, ext_index = 0; + cea = drm_find_cea_extension(edid, &ext_index); if (cea && cea_revision(cea) >= 3) { int i, start, end; @@ -4562,7 +4560,7 @@ static void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid) uint8_t *eld = connector->eld; const u8 *cea; const u8 *db; - int total_sad_count = 0; + int total_sad_count = 0, ext_index = 0; int mnl; int dbl; @@ -4571,7 +4569,7 @@ static void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid) if (!edid) return; - cea = drm_find_cea_extension(edid); + cea = drm_find_cea_extension(edid, &ext_index); if (!cea) { DRM_DEBUG_KMS("ELD: no CEA Extension found\n"); return; @@ -4655,11 +4653,11 @@ static void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid) */ int drm_edid_to_sad(struct edid *edid, struct cea_sad **sads) { - int count = 0; + int count = 0, ext_index = 0; int i, start, end, dbl; const u8 *cea; - cea = drm_find_cea_extension(edid); + cea = drm_find_cea_extension(edid, &ext_index); if (!cea) { 
DRM_DEBUG_KMS("SAD: no CEA Extension found\n"); return 0; @@ -4717,11 +4715,11 @@ EXPORT_SYMBOL(drm_edid_to_sad); */ int drm_edid_to_speaker_allocation(struct edid *edid, u8 **sadb) { - int count = 0; + int count = 0, ext_index = 0; int i, start, end, dbl; const u8 *cea; - cea = drm_find_cea_extension(edid); + cea = drm_find_cea_extension(edid, &ext_index); if (!cea) { DRM_DEBUG_KMS("SAD: no CEA Extension found\n"); return 0; @@ -4814,9 +4812,9 @@ bool drm_detect_hdmi_monitor(struct edid *edid) { const u8 *edid_ext; int i; - int start_offset, end_offset; + int start_offset, end_offset, ext_index = 0; - edid_ext = drm_find_cea_extension(edid); + edid_ext = drm_find_cea_extension(edid, &ext_index); if (!edid_ext) return false; @@ -4853,9 +4851,9 @@ bool drm_detect_monitor_audio(struct edid *edid) const u8 *edid_ext; int i, j; bool has_audio = false; - int start_offset, end_offset; + int start_offset, end_offset, ext_index = 0; - edid_ext = drm_find_cea_extension(edid); + edid_ext = drm_find_cea_extension(edid, &ext_index); if (!edid_ext) goto end; @@ -5177,9 +5175,9 @@ static void drm_parse_cea_ext(struct drm_connector *connector, { struct drm_display_info *info = &connector->display_info; const u8
[v8 2/5] drm/edid: parse multiple CEA extension block
Try to find and parse more CEA ext blocks if edid->extensions is greater than one.

v2: split the previous patch in two, and do the CEA block parsing in this one.
v3: simplify this patch based on the previous change.
v4: refine patch v3.
v5: revert the previous change, and check the cea pointer returned from drm_find_cea_extension(). If the driver gets the same cea pointer, exit this routine.

Cc: Jani Nikula
Cc: Ville Syrjala
Cc: Ankit Nautiyal
Cc: Drew Davenport
Cc: intel-gfx
Signed-off-by: Lee Shawn C
---
drivers/gpu/drm/drm_edid.c | 34 +- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 1251226d9284..ef65dd97d700 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -4319,16 +4319,24 @@ static void drm_parse_y420cmdb_bitmap(struct drm_connector *connector, static int add_cea_modes(struct drm_connector *connector, struct edid *edid) { - const u8 *cea, *db, *hdmi = NULL, *video = NULL; - u8 dbl, hdmi_len, video_len = 0; int modes = 0, ext_index = 0; + const u8 *cur_cea = NULL; - cea = drm_find_cea_extension(edid, &ext_index); - if (cea && cea_revision(cea) >= 3) { + for (;;) { + const u8 *cea, *db, *hdmi = NULL, *video = NULL; + u8 dbl, hdmi_len = 0, video_len = 0; int i, start, end; + cea = drm_find_cea_extension(edid, &ext_index); + if (!cea || cea == cur_cea) + break; + cur_cea = cea; + + if (cea_revision(cea) < 3) + continue; + if (cea_db_offsets(cea, &start, &end)) - return 0; + continue; for_each_cea_db(cea, i, start, end) { db = &cea[i]; @@ -4350,15 +4358,15 @@ add_cea_modes(struct drm_connector *connector, struct edid *edid) dbl - 1); } } - } - /* -* We parse the HDMI VSDB after having added the cea modes as we will -* be patching their flags when the sink supports stereo 3D.
-*/ - if (hdmi) - modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video, - video_len); + /* +* We parse the HDMI VSDB after having added the cea modes as we will +* be patching their flags when the sink supports stereo 3D. +*/ + if (hdmi) + modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len, video, + video_len); + } return modes; } -- 2.17.1
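The resumable search that this patch builds on (pass an index in, get the next matching extension block back, with the index advanced past it) can be sketched in userspace as below. This is a simplified model, not the drm_edid.c code: find_extension() and count_extensions() are hypothetical names, the EDID is treated as a flat byte array, and the number of extension blocks is passed in explicitly instead of being read from the base block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_LENGTH	128
#define CEA_EXT		0x02
#define DISPLAYID_EXT	0x70

/*
 * Hypothetical helper: return the first extension block with tag ext_id at
 * or after *ext_index, and advance *ext_index past it so that the next call
 * resumes the search instead of finding the same block again.
 */
static const uint8_t *find_extension(const uint8_t *edid, int num_ext,
				     uint8_t ext_id, int *ext_index)
{
	int i;

	assert(ext_index != NULL);

	if (!edid || num_ext == 0)
		return NULL;

	for (i = *ext_index; i < num_ext; i++) {
		/* Block 0 is the base EDID, extensions start at block 1. */
		const uint8_t *ext = edid + EDID_LENGTH * (i + 1);

		if (ext[0] == ext_id) {
			*ext_index = i + 1;
			return ext;
		}
	}

	return NULL;
}

/* Hypothetical caller: visit every matching block, as add_cea_modes() now does. */
static int count_extensions(const uint8_t *edid, int num_ext, uint8_t ext_id)
{
	int ext_index = 0, count = 0;

	while (find_extension(edid, num_ext, ext_id, &ext_index))
		count++;

	return count;
}
```

The extra cea == cur_cea check in the patch presumably guards against the DisplayID fallback path handing back the same block again; that path is not modelled here.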
[v8 3/5] drm/edid: read HF-EEODB ext block
According to HDMI 2.1 spec. "The HDMI Forum EDID Extension Override Data Block (HF-EEODB) is utilized by Sink Devices to provide an alternate method to indicate an EDID Extension Block count larger than 1, while avoiding the need to present a VESA Block Map in the first E-EDID Extension Block." It is a mandatory for HDMI 2.1 protocol compliance as well. This patch help to know how many HF_EEODB blocks report by sink and read allo HF_EEODB blocks back. v2: support to find CEA block, check EEODB block format, and return available block number in drm_edid_read_hf_eeodb_blk_count(). Cc: Jani Nikula Cc: Ville Syrjala Cc: Ankit Nautiyal Cc: intel-gfx Signed-off-by: Lee Shawn C --- drivers/gpu/drm/drm_connector.c | 8 +++- drivers/gpu/drm/drm_edid.c | 71 +++-- include/drm/drm_edid.h | 2 +- 3 files changed, 74 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c index a50c82bc2b2f..16011023c12e 100644 --- a/drivers/gpu/drm/drm_connector.c +++ b/drivers/gpu/drm/drm_connector.c @@ -2129,7 +2129,7 @@ int drm_connector_update_edid_property(struct drm_connector *connector, const struct edid *edid) { struct drm_device *dev = connector->dev; - size_t size = 0; + size_t size = 0, hf_eeodb_blk_count; int ret; const struct edid *old_edid; @@ -2137,8 +2137,12 @@ int drm_connector_update_edid_property(struct drm_connector *connector, if (connector->override_edid) return 0; - if (edid) + if (edid) { size = EDID_LENGTH * (1 + edid->extensions); + hf_eeodb_blk_count = drm_edid_read_hf_eeodb_blk_count(edid); + if (hf_eeodb_blk_count) + size = EDID_LENGTH * (1 + hf_eeodb_blk_count); + } /* Set the display info, using edid if available, otherwise * resetting the values to defaults. 
This duplicates the work diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index ef65dd97d700..890038758660 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1992,6 +1992,7 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, { int i, j = 0, valid_extensions = 0; u8 *edid, *new; + size_t hf_eeodb_blk_count; struct edid *override; override = drm_get_override_edid(connector); @@ -2051,7 +2052,35 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, } kfree(edid); + return (struct edid *)new; + } + + hf_eeodb_blk_count = drm_edid_read_hf_eeodb_blk_count((struct edid *)edid); + if (hf_eeodb_blk_count >= 2) { + new = krealloc(edid, (hf_eeodb_blk_count + 1) * EDID_LENGTH, GFP_KERNEL); + if (!new) + goto out; edid = new; + + valid_extensions = hf_eeodb_blk_count - 1; + for (j = 2; j <= hf_eeodb_blk_count; j++) { + u8 *block = edid + j * EDID_LENGTH; + + for (i = 0; i < 4; i++) { + if (get_edid_block(data, block, j, EDID_LENGTH)) + goto out; + if (drm_edid_block_valid(block, j, false, NULL)) + break; + } + + if (i == 4) + valid_extensions--; + } + + if (valid_extensions != hf_eeodb_blk_count - 1) { + DRM_ERROR("Not able to retrieve proper EDID contain HF-EEODB data.\n"); + goto out; + } } return (struct edid *)edid; @@ -3315,15 +3344,17 @@ add_detailed_modes(struct drm_connector *connector, struct edid *edid, #define VIDEO_BLOCK 0x02 #define VENDOR_BLOCK0x03 #define SPEAKER_BLOCK 0x04 -#define HDR_STATIC_METADATA_BLOCK 0x6 -#define USE_EXTENDED_TAG 0x07 -#define EXT_VIDEO_CAPABILITY_BLOCK 0x00 +#define EXT_VIDEO_CAPABILITY_BLOCK 0x00 +#define HDR_STATIC_METADATA_BLOCK 0x06 +#define USE_EXTENDED_TAG 0x07 #define EXT_VIDEO_DATA_BLOCK_420 0x0E -#define EXT_VIDEO_CAP_BLOCK_Y420CMDB 0x0F +#define EXT_VIDEO_CAP_BLOCK_Y420CMDB 0x0F +#define EXT_VIDEO_HF_EEODB_DATA_BLOCK 0x78 #define EDID_BASIC_AUDIO (1 << 6) #define EDID_CEA_YCRCB444 (1 << 5) #define EDID_CEA_YCRCB422 (1 << 4) #define EDID_CEA_VCDB_QS (1 << 6) 
+#define HF_EEODB_LENGTH2 /* * Search EDID for CEA extension block. @@ -4273,9 +4304,41 @@ static bool cea_db_is_y420vdb(const u8 *db) return true; } +static bool cea_db_is_hdmi_forum_eeodb(const u8 *db) +{ + if (cea_db_tag(db) != USE_EXTENDED_TAG) + return false; + + if (cea_db_payload_len
[v8 4/5] drm/edid: parse HF-EEODB CEA extension block
While adding CEA modes, try to get available EEODB block number. Then based on it to parse numbers of ext blocks, retrieve CEA information and add more CEA modes. Cc: Jani Nikula Cc: Ville Syrjala Cc: Ankit Nautiyal Cc: intel-gfx Signed-off-by: Lee Shawn C --- drivers/gpu/drm/drm_displayid.c | 5 - drivers/gpu/drm/drm_edid.c | 35 +++-- include/drm/drm_edid.h | 2 +- 3 files changed, 25 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/drm_displayid.c b/drivers/gpu/drm/drm_displayid.c index 32da557b960f..dc649a9efaa2 100644 --- a/drivers/gpu/drm/drm_displayid.c +++ b/drivers/gpu/drm/drm_displayid.c @@ -37,7 +37,10 @@ static const u8 *drm_find_displayid_extension(const struct edid *edid, int *length, int *idx, int *ext_index) { - const u8 *displayid = drm_find_edid_extension(edid, DISPLAYID_EXT, ext_index); + const u8 *displayid = drm_find_edid_extension(edid, + DISPLAYID_EXT, + ext_index, + edid->extensions); const struct displayid_header *base; int ret; diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 890038758660..40c192587f0a 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -3360,23 +3360,23 @@ add_detailed_modes(struct drm_connector *connector, struct edid *edid, * Search EDID for CEA extension block. 
*/ const u8 *drm_find_edid_extension(const struct edid *edid, - int ext_id, int *ext_index) + int ext_id, int *ext_index, int ext_blk_num) { const u8 *edid_ext = NULL; int i; /* No EDID or EDID extensions */ - if (edid == NULL || edid->extensions == 0) + if (edid == NULL || edid->extensions == 0 || *ext_index >= ext_blk_num) return NULL; /* Find CEA extension */ - for (i = *ext_index; i < edid->extensions; i++) { + for (i = *ext_index; i < ext_blk_num; i++) { edid_ext = (const u8 *)edid + EDID_LENGTH * (i + 1); if (edid_ext[0] == ext_id) break; } - if (i >= edid->extensions) + if (i >= ext_blk_num) return NULL; *ext_index = i + 1; @@ -3384,14 +3384,15 @@ const u8 *drm_find_edid_extension(const struct edid *edid, return edid_ext; } -static const u8 *drm_find_cea_extension(const struct edid *edid, int *ext_index) +static const u8 *drm_find_cea_extension(const struct edid *edid, + int *ext_index, int ext_blk_num) { const struct displayid_block *block; struct displayid_iter iter; const u8 *cea; /* Look for a CEA extension block from ext_index */ - cea = drm_find_edid_extension(edid, CEA_EXT, ext_index); + cea = drm_find_edid_extension(edid, CEA_EXT, ext_index, ext_blk_num); if (cea) return cea; @@ -3675,7 +3676,7 @@ add_alternate_cea_modes(struct drm_connector *connector, struct edid *edid) int modes = 0, ext_index = 0; /* Don't add CEA modes if the CEA extension block is missing */ - if (!drm_find_cea_extension(edid, &ext_index)) + if (!drm_find_cea_extension(edid, &ext_index, edid->extensions)) return 0; /* @@ -4327,7 +4328,7 @@ size_t drm_edid_read_hf_eeodb_blk_count(const struct edid *edid) int i, start, end, ext_index = 0; if (edid->extensions) { - cea = drm_find_cea_extension(edid, &ext_index); + cea = drm_find_cea_extension(edid, &ext_index, edid->extensions); if (cea && !cea_db_offsets(cea, &start, &end)) for_each_cea_db(cea, i, start, end) @@ -4384,13 +4385,17 @@ add_cea_modes(struct drm_connector *connector, struct edid *edid) { int modes = 0, ext_index = 0; 
const u8 *cur_cea = NULL; + int ext_blk_num = drm_edid_read_hf_eeodb_blk_count(edid); + + if (!ext_blk_num) + ext_blk_num = edid->extensions; for (;;) { const u8 *cea, *db, *hdmi = NULL, *video = NULL; u8 dbl, hdmi_len = 0, video_len = 0; int i, start, end; - cea = drm_find_cea_extension(edid, &ext_index); + cea = drm_find_cea_extension(edid, &ext_index, ext_blk_num); if (!cea || cea == cur_cea) break; cur_cea = cea; @@ -4640,7 +4645,7 @@ static void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid) if (!edid) return; - cea = drm_find_cea_extension(edid, &ext_index); + cea = drm_find_cea_extension(edi
[v8 5/5] drm/edid: check for HF-SCDB block
Find the HF-SCDB information in the CEA extension block and retrieve the
Max_TMDS_Character_Rate supported by the sink device.

v2: HF-SCDB and HF-VSDB carry the same SCDS data. Reuse
    drm_parse_hdmi_forum_vsdb() to parse this packet.

Cc: Jani Nikula
Cc: Ville Syrjala
Cc: Ankit Nautiyal
Cc: intel-gfx
Signed-off-by: Lee Shawn C
---
 drivers/gpu/drm/drm_edid.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 40c192587f0a..64d13ba0f701 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3350,6 +3350,7 @@ add_detailed_modes(struct drm_connector *connector, struct edid *edid,
 #define EXT_VIDEO_DATA_BLOCK_420	0x0E
 #define EXT_VIDEO_CAP_BLOCK_Y420CMDB	0x0F
 #define EXT_VIDEO_HF_EEODB_DATA_BLOCK	0x78
+#define EXT_VIDEO_HF_SCDB_DATA_BLOCK	0x79
 #define EDID_BASIC_AUDIO	(1 << 6)
 #define EDID_CEA_YCRCB444	(1 << 5)
 #define EDID_CEA_YCRCB422	(1 << 4)
@@ -4277,6 +4278,20 @@ static bool cea_db_is_vcdb(const u8 *db)
 	return true;
 }

+static bool cea_db_is_hdmi_forum_scdb(const u8 *db)
+{
+	if (cea_db_tag(db) != USE_EXTENDED_TAG)
+		return false;
+
+	if (cea_db_payload_len(db) < 7)
+		return false;
+
+	if (cea_db_extended_tag(db) != EXT_VIDEO_HF_SCDB_DATA_BLOCK)
+		return false;
+
+	return true;
+}
+
 static bool cea_db_is_y420cmdb(const u8 *db)
 {
 	if (cea_db_tag(db) != USE_EXTENDED_TAG)
@@ -5274,7 +5289,8 @@ static void drm_parse_cea_ext(struct drm_connector *connector,

 		if (cea_db_is_hdmi_vsdb(db))
 			drm_parse_hdmi_vsdb_video(connector, db);
-		if (cea_db_is_hdmi_forum_vsdb(db))
+		if (cea_db_is_hdmi_forum_vsdb(db) ||
+		    cea_db_is_hdmi_forum_scdb(db))
 			drm_parse_hdmi_forum_vsdb(connector, db);
 		if (cea_db_is_microsoft_vsdb(db))
 			drm_parse_microsoft_vsdb(connector, db);
--
2.17.1
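For readers without the tree at hand, the data block check in this patch can be modelled outside the kernel roughly as follows. This is only a sketch: the db_*() helpers are stand-ins written for this example (assuming the usual CTA-861 layout, tag code in bits 7:5 and payload length in bits 4:0 of the first byte, extended tag in the second byte), not the drm_edid.c functions themselves.

```c
#include <stdbool.h>
#include <stdint.h>

#define USE_EXTENDED_TAG              0x07
#define EXT_VIDEO_HF_SCDB_DATA_BLOCK  0x79

/* Hypothetical stand-ins for the cea_db_*() helpers, assumed layout. */
static unsigned int db_tag(const uint8_t *db)
{
	return db[0] >> 5;		/* tag code, bits 7:5 */
}

static unsigned int db_payload_len(const uint8_t *db)
{
	return db[0] & 0x1f;		/* payload length, bits 4:0 */
}

static unsigned int db_extended_tag(const uint8_t *db)
{
	return db[1];			/* first payload byte */
}

/* Same three checks as cea_db_is_hdmi_forum_scdb() in the patch. */
static bool db_is_hf_scdb(const uint8_t *db)
{
	if (db_tag(db) != USE_EXTENDED_TAG)
		return false;

	if (db_payload_len(db) < 7)
		return false;

	return db_extended_tag(db) == EXT_VIDEO_HF_SCDB_DATA_BLOCK;
}
```

A block that carries the right extended tag but a payload shorter than 7 bytes is rejected, so the parser is never handed a truncated block.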
Re: [Freedreno] [PATCH v3 5/5] drm/msm: allow compile time selection of driver components
On 16/03/2022 20:26, Abhinav Kumar wrote: On 3/16/2022 12:31 AM, Dmitry Baryshkov wrote: On 16/03/2022 03:28, Abhinav Kumar wrote: On 3/3/2022 7:21 PM, Dmitry Baryshkov wrote: MSM DRM driver already allows one to compile out the DP or DSI support. Add support for disabling other features like MDP4/MDP5/DPU drivers or direct HDMI output support. Suggested-by: Stephen Boyd Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/Kconfig | 50 -- drivers/gpu/drm/msm/Makefile | 18 ++-- drivers/gpu/drm/msm/msm_drv.h | 33 ++ drivers/gpu/drm/msm/msm_mdss.c | 13 +++-- 4 files changed, 106 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index 9b019598e042..3735fd41eb3b 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO Only use this if you are a driver developer. This should *not* be enabled for production kernels. If unsure, say N. -config DRM_MSM_HDMI_HDCP - bool "Enable HDMI HDCP support in MSM DRM driver" +config DRM_MSM_MDSS + bool + depends on DRM_MSM + default n shouldnt DRM_MSM_MDSS be defaulted to y? No, it will be selected either by MDP5 or by DPU1. It is not used if DRM_MSM is compiled with just MDP4 or headless support in mind. Ok got it. Another question is the compilation validation of the combinations of these. So we need to try: 1) DRM_MSM_MDSS + DRM_MSM_MDP4 2) DRM_MSM_MDSS + DRM_MSM_MDP5 3) DRM_MSM_MDSS + DRM_MSM_DPU Earlier since all of them were compiled together any inter-dependencies will not show up. Now since we are separating it out, just wanted to make sure each of the combos compile? I think you meant: - headless - MDP4 - MDP5 - DPU1 - MDP4 + MDP5 - MDP4 + DPU1 - MDP5 + DPU1 - all three drivers Yes, each of these combinations. Each of them was tested. 
+
+config DRM_MSM_MDP4
+	bool "Enable MDP4 support in MSM DRM driver"
 	depends on DRM_MSM
 	default y
 	help
-	  Choose this option to enable HDCP state machine
+	  Compile in support for the Mobile Display Processor v4 (MDP4) in
+	  the MSM DRM driver. It is the older display controller found in
+	  devices using APQ8064/MSM8960/MSM8x60 platforms.
+
+config DRM_MSM_MDP5
+	bool "Enable MDP5 support in MSM DRM driver"
+	depends on DRM_MSM
+	select DRM_MSM_MDSS
+	default y
+	help
+	  Compile in support for the Mobile Display Processor v5 (MDP5) in
+	  the MSM DRM driver. It is the display controller found in devices
+	  using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms.
+
+config DRM_MSM_DPU
+	bool "Enable DPU support in MSM DRM driver"
+	depends on DRM_MSM
+	select DRM_MSM_MDSS
+	default y
+	help
+	  Compile in support for the Display Processing Unit in
+	  the MSM DRM driver. It is the display controller found in devices
+	  using e.g. SDM845 and newer platforms.

 config DRM_MSM_DP
 	bool "Enable DisplayPort support in MSM DRM driver"
@@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY
 	help
 	  Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on
 	  the platform.
+
+config DRM_MSM_HDMI
+	bool "Enable HDMI support in MSM DRM driver"
+	depends on DRM_MSM
+	default y
+	help
+	  Compile in support for the HDMI output MSM DRM driver. It can
+	  be a primary or a secondary display on device. Note that this is used
+	  only for the direct HDMI output. If the device outputs HDMI data
+	  through some kind of DSI-to-HDMI bridge, this option can be disabled.
+ +config DRM_MSM_HDMI_HDCP + bool "Enable HDMI HDCP support in MSM DRM driver" + depends on DRM_MSM && DRM_MSM_HDMI + default y + help + Choose this option to enable HDCP state machine diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index e76927b42033..5fe9c20ab9ee 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -16,6 +16,8 @@ msm-y := \ adreno/a6xx_gpu.o \ adreno/a6xx_gmu.o \ adreno/a6xx_hfi.o \ + +msm-$(CONFIG_DRM_MSM_HDMI) += \ hdmi/hdmi.o \ hdmi/hdmi_audio.o \ hdmi/hdmi_bridge.o \ @@ -27,8 +29,8 @@ msm-y := \ hdmi/hdmi_phy_8x60.o \ hdmi/hdmi_phy_8x74.o \ hdmi/hdmi_pll_8960.o \ - disp/mdp_format.o \ - disp/mdp_kms.o \ + +msm-$(CONFIG_DRM_MSM_MDP4) += \ disp/mdp4/mdp4_crtc.o \ disp/mdp4/mdp4_dtv_encoder.o \ disp/mdp4/mdp4_lcdc_encoder.o \ @@ -37,6 +39,8 @@ msm-y := \ disp/mdp4/mdp4_irq.o \ disp/mdp4/mdp4_kms.o \ disp/mdp4/mdp4_plane.o \ + +msm-$(CONFIG_DRM_MSM_MDP5) += \ disp/mdp5/mdp5_cfg.o \ disp/mdp5/mdp5_ctl.o \ disp/mdp5/mdp5_crtc.o \ @@ -47,6 +51,8 @@ msm-y := \ disp/mdp5/mdp5_mixer.o \ disp/mdp5/mdp5_plane.o \ disp/mdp5/mdp5_smp.o \ + +msm-$(CONFIG_DRM_MSM_DPU) +=
[PATCH 5.4 38/43] drm/vrr: Set VRR capable prop only if it is attached to connector
From: Manasi Navare

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

VRR capable property is not attached by default to the connector.
It is attached only if VRR is supported.
So if the driver tries to call the drm core set prop function without
it being attached, that causes a NULL dereference.

Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare
Reviewed-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2337b3827e6a..11a81e8ba963 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -1984,6 +1984,9 @@ EXPORT_SYMBOL(drm_connector_attach_max_bpc_property);
 void drm_connector_set_vrr_capable_property(
 		struct drm_connector *connector, bool capable)
 {
+	if (!connector->vrr_capable_property)
+		return;
+
 	drm_object_property_set_value(&connector->base,
 				      connector->vrr_capable_property,
 				      capable);
--
2.34.1
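The guard added by this commit can be illustrated with a small standalone model. Everything below is hypothetical (struct fake_connector, struct fake_property and fake_set_vrr_capable() are made up for this sketch and are not DRM core API); it only shows the "return early if the property was never attached" pattern the commit message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the DRM property and connector objects. */
struct fake_property {
	unsigned long long value;
};

struct fake_connector {
	/* Stays NULL unless the driver attached the property. */
	struct fake_property *vrr_capable_property;
};

/*
 * Returns true if the value was stored, false if the property was never
 * attached. Without the NULL check, the store below would dereference
 * a NULL pointer on connectors that do not support VRR.
 */
static bool fake_set_vrr_capable(struct fake_connector *connector, bool capable)
{
	if (!connector->vrr_capable_property)
		return false;

	connector->vrr_capable_property->value = capable;
	return true;
}
```

The real function returns void; the bool return here exists only so the early-out is observable from a caller.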
[PATCH 5.10 16/23] drm/vrr: Set VRR capable prop only if it is attached to connector
From: Manasi Navare

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

VRR capable property is not attached by default to the connector.
It is attached only if VRR is supported.
So if the driver tries to call the drm core set prop function without
it being attached, that causes a NULL dereference.

Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare
Reviewed-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 717c4e7271b0..5163433ac561 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2155,6 +2155,9 @@ EXPORT_SYMBOL(drm_connector_attach_max_bpc_property);
 void drm_connector_set_vrr_capable_property(
 		struct drm_connector *connector, bool capable)
 {
+	if (!connector->vrr_capable_property)
+		return;
+
 	drm_object_property_set_value(&connector->base,
 				      connector->vrr_capable_property,
 				      capable);
--
2.34.1
[PATCH 5.15 18/25] drm/vrr: Set VRR capable prop only if it is attached to connector
From: Manasi Navare

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

VRR capable property is not attached by default to the connector.
It is attached only if VRR is supported.
So if the driver tries to call the drm core set prop function without
it being attached, that causes a NULL dereference.

Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare
Reviewed-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2ba257b1ae20..e9b7926d9b66 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2233,6 +2233,9 @@ EXPORT_SYMBOL(drm_connector_atomic_hdr_metadata_equal);
 void drm_connector_set_vrr_capable_property(
 		struct drm_connector *connector, bool capable)
 {
+	if (!connector->vrr_capable_property)
+		return;
+
 	drm_object_property_set_value(&connector->base,
 				      connector->vrr_capable_property,
 				      capable);
--
2.34.1
[PATCH 5.16 22/28] drm/vrr: Set VRR capable prop only if it is attached to connector
From: Manasi Navare

[ Upstream commit 62929726ef0ec72cbbe9440c5d125d4278b99894 ]

VRR capable property is not attached by default to the connector.
It is attached only if VRR is supported.
So if the driver tries to call the drm core set prop function without
it being attached, that causes a NULL dereference.

Cc: Jani Nikula
Cc: Ville Syrjälä
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Manasi Navare
Reviewed-by: Ville Syrjälä
Link: https://patchwork.freedesktop.org/patch/msgid/20220225013055.9282-1-manasi.d.nav...@intel.com
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/drm_connector.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 52e20c68813b..6ae26e7d3dec 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2275,6 +2275,9 @@ EXPORT_SYMBOL(drm_connector_atomic_hdr_metadata_equal);
 void drm_connector_set_vrr_capable_property(
 		struct drm_connector *connector, bool capable)
 {
+	if (!connector->vrr_capable_property)
+		return;
+
 	drm_object_property_set_value(&connector->base,
 				      connector->vrr_capable_property,
 				      capable);
--
2.34.1
[PATCH v4 0/3] drm/bridge: ti-sn65dsi86: Support non-eDP DisplayPort connectors
Implement support for non-eDP connectors on the TI-SN65DSI86 bridge, and
provide IRQ-based hotplug detection to identify when the connector is
present.

no-hpd is extended to be the default behaviour for non-DisplayPort
connectors.

This series is based upon Sam Ravnborg's and Rob Clark's series [0] to
support DRM_BRIDGE_STATE_OPS and NO_CONNECTOR support on the SN65DSI86;
however, some extra modifications have been made on top of Sam's series
to fix compile breakage and the NO_CONNECTOR support.

A full branch with these changes is available at [1].

As in v3, I have not taken ownership of the patches at [0], so it would
be good to hear if Sam has any plans to revive or push this series.
These patches are not expected to be integrated without [0].

[0] https://lore.kernel.org/all/20220206154405.124-1-...@ravnborg.org/
[1] git://git.kernel.org/pub/scm/linux/kernel/git/kbingham/rcar.git
    branch: kbingham/drm-misc/next/sn65dsi86/hpd

Kieran Bingham (1):
  drm/bridge: ti-sn65dsi86: Support hotplug detection

Laurent Pinchart (2):
  drm/bridge: ti-sn65dsi86: Support DisplayPort (non-eDP) mode
  drm/bridge: ti-sn65dsi86: Implement bridge connector operations

 drivers/gpu/drm/bridge/ti-sn65dsi86.c | 191 --
 1 file changed, 176 insertions(+), 15 deletions(-)

--
2.32.0
[PATCH v4 1/3] drm/bridge: ti-sn65dsi86: Support DisplayPort (non-eDP) mode
From: Laurent Pinchart Despite the SN65DSI86 being an eDP bridge, on some systems its output is routed to a DisplayPort connector. Enable DisplayPort mode when the next component in the display pipeline is detected as a DisplayPort connector, and disable eDP features in that case. Signed-off-by: Laurent Pinchart Reworked to set bridge type based on the next bridge/connector. Signed-off-by: Kieran Bingham Reviewed-by: Laurent Pinchart Reviewed-by: Douglas Anderson -- Changes since v1/RFC: - Rebased on top of "drm/bridge: ti-sn65dsi86: switch to devm_drm_of_get_bridge" - eDP/DP mode determined from the next bridge connector type. Changes since v2: - Remove setting of Standard DP Scrambler Seed. (It's read-only). - Prevent setting DP_EDP_CONFIGURATION_SET in ti_sn_bridge_atomic_enable() - Use Doug's suggested text for disabling ASSR on DP mode. Changes since v3: - Remove ASSR_CONTROL definition drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index c892ecba91c7..c5f020a2d0d3 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -93,6 +93,8 @@ #define SN_DATARATE_CONFIG_REG 0x94 #define DP_DATARATE_MASK GENMASK(7, 5) #define DP_DATARATE(x)((x) << 5) +#define SN_TRAINING_SETTING_REG0x95 +#define SCRAMBLE_DISABLE BIT(4) #define SN_ML_TX_MODE_REG 0x96 #define ML_TX_MAIN_LINK_OFF 0 #define ML_TX_NORMAL_MODE BIT(0) @@ -982,6 +984,17 @@ static int ti_sn_link_training(struct ti_sn65dsi86 *pdata, int dp_rate_idx, goto exit; } + /* +* eDP panels use an Alternate Scrambler Seed compared to displays +* hooked up via a full DisplayPort connector. SN65DSI86 only supports +* the alternate scrambler seed, not the normal one, so the only way we +* can support full DisplayPort displays is by fully turning off the +* scrambler. 
+*/ + if (pdata->bridge.type == DRM_MODE_CONNECTOR_DisplayPort) + regmap_update_bits(pdata->regmap, SN_TRAINING_SETTING_REG, + SCRAMBLE_DISABLE, SCRAMBLE_DISABLE); + /* * We'll try to link train several times. As part of link training * the bridge chip will write DP_SET_POWER_D0 to DP_SET_POWER. If @@ -1046,12 +1059,13 @@ static void ti_sn_bridge_atomic_enable(struct drm_bridge *bridge, /* * The SN65DSI86 only supports ASSR Display Authentication method and -* this method is enabled by default. An eDP panel must support this +* this method is enabled for eDP panels. An eDP panel must support this * authentication method. We need to enable this method in the eDP panel * at DisplayPort address 0x0010A prior to link training. */ - drm_dp_dpcd_writeb(&pdata->aux, DP_EDP_CONFIGURATION_SET, - DP_ALTERNATE_SCRAMBLER_RESET_ENABLE); + if (pdata->bridge.type == DRM_MODE_CONNECTOR_eDP) + drm_dp_dpcd_writeb(&pdata->aux, DP_EDP_CONFIGURATION_SET, + DP_ALTERNATE_SCRAMBLER_RESET_ENABLE); /* Set the DP output format (18 bpp or 24 bpp) */ val = (ti_sn_bridge_get_bpp(old_bridge_state) == 18) ? BPP_18_RGB : 0; @@ -1215,6 +1229,8 @@ static int ti_sn_bridge_probe(struct auxiliary_device *adev, pdata->bridge.funcs = &ti_sn_bridge_funcs; pdata->bridge.of_node = np; + pdata->bridge.type = pdata->next_bridge->type == DRM_MODE_CONNECTOR_DisplayPort + ? DRM_MODE_CONNECTOR_DisplayPort : DRM_MODE_CONNECTOR_eDP; drm_bridge_add(&pdata->bridge); -- 2.32.0
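The two decisions this patch makes (pick the bridge type from the next component, then disable the scrambler only in DisplayPort mode) can be sketched as a userspace model. The enum, update_bits() and training_setting() below are hypothetical helpers for illustration; update_bits() merely mimics the read-modify-write that regmap_update_bits() performs on the real register.

```c
#include <stdint.h>

/* Hypothetical connector types standing in for DRM_MODE_CONNECTOR_*. */
enum conn_type { CONN_EDP, CONN_DISPLAYPORT, CONN_OTHER };

#define SCRAMBLE_DISABLE (1u << 4)	/* BIT(4), as in the patch */

/* Anything that is not a DisplayPort connector is treated as eDP. */
static enum conn_type pick_bridge_type(enum conn_type next)
{
	return next == CONN_DISPLAYPORT ? CONN_DISPLAYPORT : CONN_EDP;
}

/* Read-modify-write: only the bits selected by mask change. */
static uint8_t update_bits(uint8_t reg, uint8_t mask, uint8_t val)
{
	return (uint8_t)((reg & ~mask) | (val & mask));
}

/* New value of the training setting register for a given bridge type. */
static uint8_t training_setting(enum conn_type type, uint8_t reg)
{
	if (type == CONN_DISPLAYPORT)
		return update_bits(reg, SCRAMBLE_DISABLE, SCRAMBLE_DISABLE);

	return reg;	/* eDP keeps the alternate scrambler seed */
}
```

In eDP mode the register is left untouched, which matches the patch only writing DP_EDP_CONFIGURATION_SET for eDP and only setting SCRAMBLE_DISABLE for DisplayPort.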
[PATCH v4 2/3] drm/bridge: ti-sn65dsi86: Implement bridge connector operations
From: Laurent Pinchart Implement the bridge connector-related .get_edid() operation, and report the related bridge capabilities and type. Signed-off-by: Laurent Pinchart Signed-off-by: Kieran Bingham Reviewed-by: Laurent Pinchart --- Changes since v1: - The connector .get_modes() operation doesn't rely on EDID anymore, __ti_sn_bridge_get_edid() and ti_sn_bridge_get_edid() got merged together - Fix on top of Sam Ravnborg's DRM_BRIDGE_STATE_OPS Changes since v2: [Kieran] - Only support EDID on DRM_MODE_CONNECTOR_DisplayPort modes. Changes since v3: [Kieran] - Remove PM calls in ti_sn_bridge_get_edid() and simplify drivers/gpu/drm/bridge/ti-sn65dsi86.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index c5f020a2d0d3..910bf3d41d2f 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -1134,10 +1134,19 @@ static void ti_sn_bridge_atomic_post_disable(struct drm_bridge *bridge, pm_runtime_put_sync(pdata->dev); } +static struct edid *ti_sn_bridge_get_edid(struct drm_bridge *bridge, + struct drm_connector *connector) +{ + struct ti_sn65dsi86 *pdata = bridge_to_ti_sn65dsi86(bridge); + + return drm_get_edid(connector, &pdata->aux.ddc); +} + static const struct drm_bridge_funcs ti_sn_bridge_funcs = { .attach = ti_sn_bridge_attach, .detach = ti_sn_bridge_detach, .mode_valid = ti_sn_bridge_mode_valid, + .get_edid = ti_sn_bridge_get_edid, .atomic_pre_enable = ti_sn_bridge_atomic_pre_enable, .atomic_enable = ti_sn_bridge_atomic_enable, .atomic_disable = ti_sn_bridge_atomic_disable, @@ -1232,6 +1241,9 @@ static int ti_sn_bridge_probe(struct auxiliary_device *adev, pdata->bridge.type = pdata->next_bridge->type == DRM_MODE_CONNECTOR_DisplayPort ? 
DRM_MODE_CONNECTOR_DisplayPort : DRM_MODE_CONNECTOR_eDP; + if (pdata->bridge.type == DRM_MODE_CONNECTOR_DisplayPort) + pdata->bridge.ops = DRM_BRIDGE_OP_EDID; + drm_bridge_add(&pdata->bridge); ret = ti_sn_attach_host(pdata); -- 2.32.0
[PATCH v4 3/3] drm/bridge: ti-sn65dsi86: Support hotplug detection
When the SN65DSI86 is used in DisplayPort mode, its output is likely routed to a DisplayPort connector, which can benefit from hotplug detection. Support it in such cases, with both polling mode and IRQ based detection. The implementation is limited to the bridge operations, as the connector operations are legacy and new users should use DRM_BRIDGE_ATTACH_NO_CONNECTOR. Signed-off-by: Laurent Pinchart Signed-off-by: Kieran Bingham --- Changes since v1: - Document the no_hpd field - Rely on the SN_HPD_DISABLE_REG default value in the HPD case - Add a TODO comment regarding IRQ support [Kieran] - Fix spelling s/assrted/asserted/ - Only enable HPD on DisplayPort connector. - Support IRQ based hotplug detect Changes since v2: [Kieran] - Use unsigned int for values read by regmap - Update HPD support warning message - Only enable OP_HPD if IRQ support enabled. - Only register IRQ handler during ti_sn_bridge_probe() - Store IRQ in the struct ti_sn65dsi86 - Register IRQ only when !no-hpd - Refactor DRM_BRIDGE_OP_DETECT and DRM_BRIDGE_OP_HPD handling Since v3: - Fix commit message - Remove stray debug print - initialise val in case of regmap read error in ti_sn_bridge_detect - Ensure pm-runtime reference held for ti_sn_bridge_detect - Reset status immediately after reading to reduce risk of lost interrupts during ti_sn65dsi86_irq_handler() - Reset only the IRQ bits set during ti_sn65dsi86_irq_handler() - Enable / disable IRQ during hpd_{enable,disable} This ensures the handler completes before it is disabled. - Extra comments to detail the notification process in ti_sn65dsi86_irq_handler() - Move SN_IRQ_EN_REG handling to hpd_{enable,disable} calls. 
drivers/gpu/drm/bridge/ti-sn65dsi86.c | 159 +++--- 1 file changed, 146 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index 910bf3d41d2f..0cc0409dcdd4 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -69,6 +69,7 @@ #define BPP_18_RGBBIT(0) #define SN_HPD_DISABLE_REG 0x5C #define HPD_DISABLE BIT(0) +#define HPD_DEBOUNCED_STATE BIT(4) #define SN_GPIO_IO_REG 0x5E #define SN_GPIO_INPUT_SHIFT 4 #define SN_GPIO_OUTPUT_SHIFT 0 @@ -105,10 +106,24 @@ #define SN_PWM_EN_INV_REG 0xA5 #define SN_PWM_INV_MASK BIT(0) #define SN_PWM_EN_MASKBIT(1) +#define SN_IRQ_EN_REG 0xE0 +#define IRQ_ENBIT(0) +#define SN_IRQ_HPD_REG 0xE6 +#define IRQ_HPD_ENBIT(0) +#define IRQ_HPD_INSERTION_EN BIT(1) +#define IRQ_HPD_REMOVAL_ENBIT(2) +#define IRQ_HPD_REPLUG_EN BIT(3) +#define IRQ_HPD_PLL_UNLOCK_EN BIT(5) #define SN_AUX_CMD_STATUS_REG 0xF4 #define AUX_IRQ_STATUS_AUX_RPLY_TOUT BIT(3) #define AUX_IRQ_STATUS_AUX_SHORT BIT(5) #define AUX_IRQ_STATUS_NAT_I2C_FAIL BIT(6) +#define SN_IRQ_HPD_STATUS_REG 0xF5 +#define IRQ_HPD_STATUSBIT(0) +#define IRQ_HPD_INSERTION_STATUS BIT(1) +#define IRQ_HPD_REMOVAL_STATUSBIT(2) +#define IRQ_HPD_REPLUG_STATUS BIT(3) +#define IRQ_PLL_UNLOCKBIT(5) #define MIN_DSI_CLK_FREQ_MHZ 40 @@ -167,6 +182,12 @@ * @pwm_enabled: Used to track if the PWM signal is currently enabled. * @pwm_pin_busy: Track if GPIO4 is currently requested for GPIO or PWM. * @pwm_refclk_freq: Cache for the reference clock input to the PWM. + * + * @no_hpd: Disable hot-plug detection as instructed by device tree (used + *for instance for eDP panels whose HPD signal won't be asserted + *until the panel is turned on, and is thus not usable for + *downstream device detection). + * @irq: IRQ number for the device. 
*/ struct ti_sn65dsi86 { struct auxiliary_device bridge_aux; @@ -201,6 +222,9 @@ struct ti_sn65dsi86 { atomic_tpwm_pin_busy; #endif unsigned intpwm_refclk_freq; + + boolno_hpd; + int irq; }; static const struct regmap_range ti_sn65dsi86_volatile_ranges[] = { @@ -315,23 +339,25 @@ static void ti_sn65dsi86_enable_comms(struct ti_sn65dsi86 *pdata) ti_sn_bridge_set_refclk_freq(pdata); /* -* HPD on this bridge chip is a bit useless. This is an eDP bridge -* so the HPD is an internal signal that's only there to signal that -* the panel is done
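The v3 -> v4 changelog item "Reset only the IRQ bits set" can be sketched as below. This is a model under stated assumptions, not the driver code: struct fake_chip and the fake_*() helpers are invented here, and the status register is assumed to be cleared by acknowledging individual bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define HPD_INSERTION_STATUS (1u << 1)	/* BIT(1), as in the patch */
#define HPD_REMOVAL_STATUS   (1u << 2)	/* BIT(2), as in the patch */

/* Hypothetical chip model holding the latched HPD status bits. */
struct fake_chip {
	uint8_t hpd_status;
};

static uint8_t fake_read_status(const struct fake_chip *chip)
{
	return chip->hpd_status;
}

/* Clear exactly the bits passed in, leaving any others latched. */
static void fake_ack_status(struct fake_chip *chip, uint8_t bits)
{
	chip->hpd_status &= (uint8_t)~bits;
}

/*
 * Read the status once, acknowledge only what was read, then report
 * whether a hotplug notification is needed. A bit that latches after
 * the read is not cleared here and is seen by the next invocation.
 */
static bool fake_irq_handler(struct fake_chip *chip)
{
	uint8_t status = fake_read_status(chip);

	fake_ack_status(chip, status);

	return (status & (HPD_INSERTION_STATUS | HPD_REMOVAL_STATUS)) != 0;
}
```

Acknowledging the value that was read, rather than writing an all-ones mask, is what keeps an event arriving between the read and the acknowledge from being dropped.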
[PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
Presently the Client can be freed whilst still in use. Use the already provided lock to prevent this. Cc: Felix Kuehling Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: amd-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones --- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c index e4beebb1c80a2..3b9ac1e87231f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep) spin_unlock(&dev->smi_lock); synchronize_rcu(); + + spin_lock(&client->lock); kfifo_free(&client->fifo); kfree(client); + spin_unlock(&client->lock); return 0; } @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd) return ret; } + spin_lock(&client->lock); ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client, O_RDWR); if (ret < 0) { kfifo_free(&client->fifo); kfree(client); + spin_unlock(&client->lock); return ret; } *fd = ret; @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd) spin_lock(&dev->smi_lock); list_add_rcu(&client->list, &dev->smi_clients); spin_unlock(&dev->smi_lock); + spin_unlock(&client->lock); return 0; } -- 2.35.1.894.gb6a874cedc-goog
Re: [PATCH v1 1/3] mm: split vm_normal_pages for LRU and non-LRU handling
On Thu, Mar 17, 2022 at 09:13:50AM +0100, David Hildenbrand wrote:
> On 17.03.22 03:54, Alistair Popple wrote:
> > Felix Kuehling writes:
> >
> >> On 2022-03-11 04:16, David Hildenbrand wrote:
> >>> On 10.03.22 18:26, Alex Sierra wrote:
> >>>> DEVICE_COHERENT pages introduce a subtle distinction in the way
> >>>> "normal" pages can be used by various callers throughout the kernel.
> >>>> They behave like normal pages for purposes of mapping in CPU page
> >>>> tables, and for COW. But they do not support LRU lists, NUMA
> >>>> migration or THP. Therefore we split vm_normal_page into two
> >>>> functions vm_normal_any_page and vm_normal_lru_page. The latter will
> >>>> only return pages that can be put on an LRU list and that support
> >>>> NUMA migration, KSM and THP.
> >>>>
> >>>> We also introduced a FOLL_LRU flag that adds the same behaviour to
> >>>> follow_page and related APIs, to allow callers to specify that they
> >>>> expect to put pages on an LRU list.
> >>>>
> >>> I still don't see the need for s/vm_normal_page/vm_normal_any_page/. And
> >>> as this patch is dominated by that change, I'd suggest (again) to just
> >>> drop it as I don't see any value of that renaming. No specifier implies
> >>> any.
> >>
> >> OK. If nobody objects, we can adopt that naming convention.
> >
> > I'd prefer we avoid the churn too, but I don't think we should make
> > vm_normal_page() the equivalent of vm_normal_any_page(). It would mean
> > vm_normal_page() would return non-LRU device coherent pages, but to me at least
> > device coherent pages seem special and not what I'd expect from a function with
> > "normal" in the name.
> >
> > So I think it would be better to s/vm_normal_lru_page/vm_normal_page/ and keep
> > vm_normal_any_page() (or perhaps call it vm_any_page?). This is basically what
> > the previous incarnation of this feature did:
> >
> > struct page *_vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
> >                              pte_t pte, bool with_public_device);
> > #define vm_normal_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, false)
> >
> > Except we should add:
> >
> > #define vm_normal_any_page(vma, addr, pte) _vm_normal_page(vma, addr, pte, true)
> >
>
> "normal" simply tells us that this is not a special mapping -- IOW, we
> want the VM to take a look at the memmap and not treat it like a PFN
> map. What we're changing is that we're now also returning non-lru pages.
> Fair enough, that's why we introduce vm_normal_lru_page() as a
> replacement where we really can only deal with lru pages.
>
> vm_normal_page vs vm_normal_lru_page is good enough. "lru" further
> limits what we get via vm_normal_page, that's even how it's implemented.

This naming makes sense to me.

Jason
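To make the naming that was settled on concrete, here is a toy model of the relationship between the two functions. All names are hypothetical (fake_page, fake_vm_normal_page(), fake_vm_normal_lru_page() and the page kinds are invented for this sketch); it only illustrates the point that the "lru" variant further restricts what the plain variant returns.

```c
#include <stddef.h>

/* Hypothetical page classification, for illustration only. */
enum page_kind { PAGE_LRU, PAGE_DEVICE_COHERENT, PAGE_SPECIAL };

struct fake_page {
	enum page_kind kind;
};

/* "normal" only means: not a special (PFN map like) mapping. */
static struct fake_page *fake_vm_normal_page(struct fake_page *page)
{
	if (!page || page->kind == PAGE_SPECIAL)
		return NULL;

	return page;	/* may be a non-LRU device coherent page */
}

/* The "lru" variant narrows the result of the plain variant. */
static struct fake_page *fake_vm_normal_lru_page(struct fake_page *page)
{
	page = fake_vm_normal_page(page);
	if (page && page->kind != PAGE_LRU)
		return NULL;

	return page;
}
```

Callers that put pages on an LRU list or rely on NUMA migration, KSM or THP would use the narrower variant; everyone else keeps calling the plain one, which avoids the rename churn discussed above.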
Re: [PATCH] fbdev: defio: fix the pagelist corruption
Hi

On 17.03.22 at 06:46, Chuansheng Liu wrote:

Easily hit the below list corruption:
==
list_add corruption. prev->next should be next (c0ceb090), but was ec604507edc8. (prev=ec604507edc8).
WARNING: CPU: 65 PID: 3959 at lib/list_debug.c:26 __list_add_valid+0x53/0x80
CPU: 65 PID: 3959 Comm: fbdev Tainted: G U
RIP: 0010:__list_add_valid+0x53/0x80
Call Trace:
 fb_deferred_io_mkwrite+0xea/0x150
 do_page_mkwrite+0x57/0xc0
 do_wp_page+0x278/0x2f0
 __handle_mm_fault+0xdc2/0x1590
 handle_mm_fault+0xdd/0x2c0
 do_user_addr_fault+0x1d3/0x650
 exc_page_fault+0x77/0x180
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fd98fc8fad1
==

The race happens when one process is adding &page->lru to the pagelist
tail in fb_deferred_io_mkwrite() while another process is re-initializing
the same &page->lru in fb_deferred_io_fault(), which is not protected by
the lock.

The fix is to initialize all the page lists once during initialization.
This not only fixes the list corruption, but also avoids redundant
INIT_LIST_HEAD() calls.
Fixes: 105a940416fc ("fbdev/defio: Early-out if page is already enlisted")
Cc: Thomas Zimmermann
Signed-off-by: Chuansheng Liu

If you fix Geert's comment, feel free to add

Reviewed-by: Thomas Zimmermann

Best regards
Thomas

---
 drivers/video/fbdev/core/fb_defio.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index 98b0f23bf5e2..eafb66ca4f28 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -59,7 +59,6 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
 		printk(KERN_ERR "no mapping available\n");

 	BUG_ON(!page->mapping);
-	INIT_LIST_HEAD(&page->lru);
 	page->index = vmf->pgoff;

 	vmf->page = page;
@@ -220,6 +219,8 @@ static void fb_deferred_io_work(struct work_struct *work)
 void fb_deferred_io_init(struct fb_info *info)
 {
 	struct fb_deferred_io *fbdefio = info->fbdefio;
+	struct page *page;
+	int i;

 	BUG_ON(!fbdefio);
 	mutex_init(&fbdefio->lock);
@@ -227,6 +228,12 @@ void fb_deferred_io_init(struct fb_info *info)
 	INIT_LIST_HEAD(&fbdefio->pagelist);
 	if (fbdefio->delay == 0) /* set a default of 1 s */
 		fbdefio->delay = HZ;
+
+	/* initialize all the page lists one time */
+	for (i = 0; i < info->fix.smem_len; i += PAGE_SIZE) {
+		page = fb_deferred_io_page(info, i);
+		INIT_LIST_HEAD(&page->lru);
+	}
 }
 EXPORT_SYMBOL_GPL(fb_deferred_io_init);

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Ivo Totev
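The idea of the fix (initialize every per-page list node once at setup, and never again in the fault path) can be modelled in plain C as below. This is a single-threaded sketch with made-up names (struct list_node, struct fake_fb, fake_defio_init(), fake_mkwrite()); the real code additionally takes fbdefio->lock around the list manipulation.

```c
#include <stddef.h>

/* Minimal doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

static int list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_tail_node(struct list_node *item, struct list_node *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

#define FAKE_NPAGES 4

/* Hypothetical framebuffer with one list node per page. */
struct fake_fb {
	struct list_node pagelist;
	struct list_node page_lru[FAKE_NPAGES];
};

/* Initialize the pagelist and every per-page node exactly once. */
static void fake_defio_init(struct fake_fb *fb)
{
	size_t i;

	list_init(&fb->pagelist);
	for (i = 0; i < FAKE_NPAGES; i++)
		list_init(&fb->page_lru[i]);
}

/* Early-out if the page is already enlisted, otherwise append it. */
static void fake_mkwrite(struct fake_fb *fb, size_t idx)
{
	if (!list_is_empty(&fb->page_lru[idx]))
		return;

	list_add_tail_node(&fb->page_lru[idx], &fb->pagelist);
}
```

Re-running list_init() on a node that is already linked into pagelist would leave its neighbours pointing at a node that no longer points back, which is the kind of inconsistency the list_add corruption warning reports.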
Re: [PATCH 1/2] drm/i915: Fix renamed struct field
On Wed, 2022-03-16 at 16:45 -0700, Lucas De Marchi wrote:
> Earlier versions of commit a5b7ef27da60 ("drm/i915: Add struct to hold
> IP version") named "ver" as "arch" and then when it was renamed it
> missed the rename on MEDIA_VER_FULL() since it's currently not used.

Reviewed-by: José Roberto de Souza

>
> Fixes: a5b7ef27da60 ("drm/i915: Add struct to hold IP version")
> Cc: José Roberto de Souza
> Cc: Matt Roper
> Signed-off-by: Lucas De Marchi
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 26df561a4e94..7458b107a1d6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -922,7 +922,7 @@ static inline struct intel_gt *to_gt(struct drm_i915_private *i915)
>  	(GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
>
>  #define MEDIA_VER(i915)	(INTEL_INFO(i915)->media.ver)
> -#define MEDIA_VER_FULL(i915)	IP_VER(INTEL_INFO(i915)->media.arch, \
> +#define MEDIA_VER_FULL(i915)	IP_VER(INTEL_INFO(i915)->media.ver, \
>  				       INTEL_INFO(i915)->media.rel)
>  #define IS_MEDIA_VER(i915, from, until) \
>  	(MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
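As a side note on why the stale field name went unnoticed: a function-like macro body is only compiled where the macro is expanded, so a reference to a non-existent member stays silent until the first user appears. A reduced model is sketched below; FAKE_IP_VER() and struct fake_ip_version are assumptions for illustration (the packing of version and release into one integer is a guess at how such a macro is typically written, not a quote of i915_drv.h).

```c
/* Assumed packing: major version in the high byte, release in the low. */
#define FAKE_IP_VER(ver, rel)	((ver) << 8 | (rel))

/* Hypothetical stand-in for the IP version struct; no "arch" member. */
struct fake_ip_version {
	unsigned char ver;
	unsigned char rel;
};

/* Corrected form: uses .ver, as in the patch above. */
#define FAKE_MEDIA_VER_FULL(m)	FAKE_IP_VER((m)->ver, (m)->rel)

/*
 * Still references the removed .arch member. This definition compiles
 * because nothing expands it; the first caller would fail to build.
 */
#define FAKE_MEDIA_VER_FULL_STALE(m)	FAKE_IP_VER((m)->arch, (m)->rel)

static unsigned int fake_media_ver_full(const struct fake_ip_version *media)
{
	return FAKE_MEDIA_VER_FULL(media);
}
```

With this packing, comparisons such as "full version >= FAKE_IP_VER(12, 50)" order first by version and then by release.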
Re: [PATCH 2/2] drm/i915: Add logical mapping for video decode engines
On Wed, 2022-03-16 at 16:45 -0700, Lucas De Marchi wrote:
> From: Matthew Brost
>
> Add logical mapping for VDBOXs. This mapping is required for
> split-frame workloads, which otherwise fail with
>
> 	-F8C53528: [GUC] 0441-INVALID_ENGINE_SUBMIT_MASK
>
> ... if the application is using the logical id to reorder the engines and
> then using it for the batch buffer submission. It's not a big problem on
> media version 11 and 12 as they have only 2 instances of VCS and the
> logical to physical mapping is monotonically increasing - if the
> application is not using the logical id.
>
> Changing it for the previous platforms allows the media driver
> implementation for the next ones (12.50 and above) to be the same,
> checking the logical id. It should also not introduce any bug for the
> old versions of userspace not checking the id.
>
> The mapping added here is the complete map needed by XEHPSDV. Previous
> platforms with only 2 instances will just use a partial map and should
> still work.
>
> Cc: Matt Roper
> Signed-off-by: Matthew Brost
> [ Extend the mapping to media versions 11 and 12 and give proper
>   justification in the commit message why ]
> Signed-off-by: Lucas De Marchi
> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 8080479f27aa..afa2e61cf729 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -731,12 +731,24 @@ static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
>
>  static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
>  {
> -	int i;
> -	u8 map[MAX_ENGINE_INSTANCE + 1];
> +	/*
> +	 * Logical to physical mapping is needed for proper support
> +	 * to split-frame feature.
> +	 */
> +	if (MEDIA_VER(gt->i915) >= 11 && class == VIDEO_DECODE_CLASS) {
> +		static const u8 map[] = { 0, 2, 4, 6, 1, 3, 5, 7 };

You can drop the static.

Other than that LGTM.

Reviewed-by: José Roberto de Souza

>
> -	for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> -		map[i] = i;
> -	populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> +		populate_logical_ids(gt, logical_ids, class,
> +				     map, ARRAY_SIZE(map));
> +	} else {
> +		int i;
> +		u8 map[MAX_ENGINE_INSTANCE + 1];
> +
> +		for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> +			map[i] = i;
> +		populate_logical_ids(gt, logical_ids, class,
> +				     map, ARRAY_SIZE(map));
> +	}
>  }
>
>  /**
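Editorial sketch: the claim that platforms with only two instances "just use a partial map" can be illustrated in userspace. The body of populate_logical_ids() is not shown in the thread, so the walk below is an assumption about how such a map might be consumed; `assign_logical_ids()`, `present_mask` and `MAX_INSTANCES` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 8

/* Logical-to-physical order quoted in the patch. */
static const uint8_t vcs_map[MAX_INSTANCES] = { 0, 2, 4, 6, 1, 3, 5, 7 };

/*
 * Walk the map in order and hand out consecutive logical ids to the
 * physical instances that are actually present (bit n of present_mask
 * set means physical instance n exists). Absent instances are skipped,
 * so a part with fewer engines uses only the first few logical ids.
 */
static void assign_logical_ids(uint32_t present_mask, const uint8_t *map,
			       size_t map_len, uint8_t *logical_ids)
{
	uint8_t next_id = 0;
	size_t j;

	for (j = 0; j < map_len; j++) {
		uint8_t phys = map[j];

		if (phys >= MAX_INSTANCES || !(present_mask & (1u << phys)))
			continue;
		logical_ids[phys] = next_id++;
	}
}
```

With all eight instances present, physical instances 0, 2, 4, 6 receive logical ids 0 to 3 and instances 1, 3, 5, 7 receive 4 to 7. With only instances 0 and 2 present (mask 0x5), they receive logical ids 0 and 1.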
Re: (subset) [PATCH] drm/vc4: add tracepoints for CL submissions
On Tue, 1 Feb 2022 20:26:51 -0100, Melissa Wen wrote:
> Trace submit_cl_ioctl and related IRQs for CL submission and bin/render
> jobs execution. It might be helpful to get a rendering timeline and
> track job throttling.
>
>

Applied to drm/drm-misc (drm-misc-next).

Thanks!
Maxime
Re: [PATCH] drm/vc4: add tracepoints for CL submissions
On Thu, Mar 10, 2022 at 12:54:32PM +0100, Chema Casanova wrote:
> On 10/3/22 at 12:12, Maxime Ripard wrote:
> > On Tue, Mar 01, 2022 at 01:58:26PM -0100, Melissa Wen wrote:
> > > On 02/25, Maxime Ripard wrote:
> > > > Hi Melissa,
> > > >
> > > > On Tue, Feb 01, 2022 at 08:26:51PM -0100, Melissa Wen wrote:
> > > > > Trace submit_cl_ioctl and related IRQs for CL submission and
> > > > > bin/render jobs execution. It might be helpful to get a
> > > > > rendering timeline and track job throttling.
> > > > >
> > > > > Signed-off-by: Melissa Wen
> > > > I'm not really sure what to do about this patch to be honest.
> > > >
> > > > My understanding is that tracepoints are considered as userspace ABI,
> > > > but I can't really judge whether your traces are good enough or if it's
> > > > something that will bite us some time down the road.
> > > Thanks for taking a look at this patch.
> > >
> > > So, I followed the same path of tracing CL submissions on v3d. I mean,
> > > tracking submit_cl ioctl, points when a job (bin/render) starts its
> > > execution, and irqs of completion (bin/render job). We used it to
> > > examine job throttling when running Chromium and, therefore, in addition
> > > to have the timeline of jobs execution, I show some data submitted in
> > > the ioctl to make connections. I think these tracers might be useful for
> > > some investigation in the future, but I'm also not sure if all data are
> > > necessary to be displayed.
> > Yeah, I'm sure that it's useful :)
> >
> > I don't see anything wrong with that patch, really. What I meant is that
> > I don't really have the experience to judge if there's anything wrong in
> > the first place :)
> >
> > If you can get someone with more experience with the v3d driver (Emma,
> > Iago maybe?) I'll definitely be ok merging that patch
>
> I've checked this patch and I've been using these tracepoints.
> They have been working properly.
>
> Reviewed-by: Jose Maria Casanova Crespo

Thanks for your feedback, I just merged the patch

Maxime
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On Thu, 17 Mar 2022, Lee Jones wrote: > Presently the Client can be freed whilst still in use. > > Use the already provided lock to prevent this. > > Cc: Felix Kuehling > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: David Airlie > Cc: Daniel Vetter > Cc: amd-...@lists.freedesktop.org > Cc: dri-devel@lists.freedesktop.org > Signed-off-by: Lee Jones > --- I should have clarified here, that: This patch has only been *build* tested. Since I have no way to run this on real H/W. Please ensure this is tested on real H/W before it gets applied, since it *may* have some undesired side-effects. For instance, I have no idea if client->lock plays nicely with dev->smi_lock or whether this may well end up in deadlock. TIA. > drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > index e4beebb1c80a2..3b9ac1e87231f 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, > struct file *filep) > spin_unlock(&dev->smi_lock); > > synchronize_rcu(); > + > + spin_lock(&client->lock); > kfifo_free(&client->fifo); > kfree(client); > + spin_unlock(&client->lock); > > return 0; > } > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t > *fd) > return ret; > } > > + spin_lock(&client->lock); > ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client, > O_RDWR); > if (ret < 0) { > kfifo_free(&client->fifo); > kfree(client); > + spin_unlock(&client->lock); > return ret; > } > *fd = ret; > @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd) > spin_lock(&dev->smi_lock); > list_add_rcu(&client->list, &dev->smi_clients); > spin_unlock(&dev->smi_lock); > + spin_unlock(&client->lock); > > return 0; > } -- Lee Jones [李琼斯] Principal Technical Lead - 
Developer Services Linaro.org │ Open source software for Arm SoCs
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On 2022-03-17 at 09:16, Lee Jones wrote:
> Presently the Client can be freed whilst still in use.
>
> Use the already provided lock to prevent this.
>
> Cc: Felix Kuehling
> Cc: Alex Deucher
> Cc: "Christian König"
> Cc: "Pan, Xinhui"
> Cc: David Airlie
> Cc: Daniel Vetter
> Cc: amd-...@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: Lee Jones
> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> index e4beebb1c80a2..3b9ac1e87231f 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
>  	spin_unlock(&dev->smi_lock);
>
>  	synchronize_rcu();
> +
> +	spin_lock(&client->lock);
>  	kfifo_free(&client->fifo);
>  	kfree(client);
> +	spin_unlock(&client->lock);

The spin_unlock is after the spinlock data structure has been freed. There
should be no concurrent users here, since we are freeing the data structure.
If there still are concurrent users at this point, they will crash anyway.
So the locking is unnecessary.

>
>  	return 0;
>  }
> @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>  		return ret;
>  	}
>
> +	spin_lock(&client->lock);

The client was just allocated, and it wasn't added to the client list or
given to user mode yet. So there can be no concurrent users at this point.
The locking is unnecessary.

There could be potential issues if someone uses the file descriptor by dumb
luck before this function returns. So maybe we need to move the
anon_inode_getfd to the end of the function (just before list_add_rcu) so
that we only create the file descriptor after the client structure is fully
initialized.

Regards,
  Felix

>  	ret = anon_inode_getfd(kfd_smi_name, &kfd_smi_ev_fops, (void *)client,
>  			       O_RDWR);
>  	if (ret < 0) {
>  		kfifo_free(&client->fifo);
>  		kfree(client);
> +		spin_unlock(&client->lock);
>  		return ret;
>  	}
>  	*fd = ret;
> @@ -264,6 +269,7 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
>  	spin_lock(&dev->smi_lock);
>  	list_add_rcu(&client->list, &dev->smi_clients);
>  	spin_unlock(&dev->smi_lock);
> +	spin_unlock(&client->lock);
>
>  	return 0;
>  }
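Editorial sketch: the ordering suggested in the review (finish initializing the object, then create the handle that exposes it) can be modelled in userspace as follows. Everything here is hypothetical: `struct client`, `handle_table` and `publish_handle()` merely stand in for the driver's client structure and for file-descriptor creation, and the thread does not say whether the eventual fix took this form.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical client object; not the driver's structure. */
struct client {
	bool initialized;
	int events;
};

#define MAX_HANDLES 4
static struct client *handle_table[MAX_HANDLES];

/*
 * Hypothetical stand-in for creating a file descriptor: once this
 * returns, other threads may reach the object through the handle.
 */
static int publish_handle(struct client *c)
{
	int i;

	for (i = 0; i < MAX_HANDLES; i++) {
		if (!handle_table[i]) {
			handle_table[i] = c;
			return i;
		}
	}
	return -1;
}

/* Finish every initialization step first, publish as the last step. */
static int client_open(int *handle)
{
	struct client *c = calloc(1, sizeof(*c));
	int h;

	if (!c)
		return -1;

	c->events = 0;
	c->initialized = true;

	h = publish_handle(c);
	if (h < 0) {
		/* Never published, so nobody else can hold a reference. */
		free(c);
		return -1;
	}

	*handle = h;
	return 0;
}
```

Because the only failure after publication is "no failure at all", the error path never has to free an object that another thread might already be using.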
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
Good afternoon Felix, Thanks for your review. > Am 2022-03-17 um 09:16 schrieb Lee Jones: > > Presently the Client can be freed whilst still in use. > > > > Use the already provided lock to prevent this. > > > > Cc: Felix Kuehling > > Cc: Alex Deucher > > Cc: "Christian König" > > Cc: "Pan, Xinhui" > > Cc: David Airlie > > Cc: Daniel Vetter > > Cc: amd-...@lists.freedesktop.org > > Cc: dri-devel@lists.freedesktop.org > > Signed-off-by: Lee Jones > > --- > > drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > index e4beebb1c80a2..3b9ac1e87231f 100644 > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, > > struct file *filep) > > spin_unlock(&dev->smi_lock); > > synchronize_rcu(); > > + > > + spin_lock(&client->lock); > > kfifo_free(&client->fifo); > > kfree(client); > > + spin_unlock(&client->lock); > > The spin_unlock is after the spinlock data structure has been freed. Good point. If we go forward with this approach the unlock should perhaps be moved to just before the kfree(). > There > should be no concurrent users here, since we are freeing the data structure. > If there still are concurrent users at this point, they will crash anyway. > So the locking is unnecessary. The users may well crash, as does the kernel unfortunately. > > return 0; > > } > > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t > > *fd) > > return ret; > > } > > + spin_lock(&client->lock); > > The client was just allocated, and it wasn't added to the client list or > given to user mode yet. So there can be no concurrent users at this point. > The locking is unnecessary. > > There could be potential issues if someone uses the file descriptor by dumb > luck before this function returns. 
So maybe we need to move the
> anon_inode_getfd to the end of the function (just before list_add_rcu) so
> that we only create the file descriptor after the client structure is fully
> initialized.

Bingo. Well done. :)

I can move the function as suggested if that is the best route forward?

--
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Re: [PATCH 3/3] drm/msm: Add a way to override processes comm/cmdline
On Thu, Mar 17, 2022 at 1:21 AM Dan Carpenter wrote:
>
> On Wed, Mar 16, 2022 at 05:29:45PM -0700, Rob Clark wrote:
> >  	switch (param) {
> > +	case MSM_PARAM_COMM:
> > +	case MSM_PARAM_CMDLINE: {
> > +		char *str, **paramp;
> > +
> > +		str = kmalloc(len + 1, GFP_KERNEL);
>
> 	if (!str)
> 		return -ENOMEM;
>
> > +		if (copy_from_user(str, u64_to_user_ptr(value), len)) {
> > +			kfree(str);
> > +			return -EFAULT;
> > +		}
> > +
> > +		/* Ensure string is null terminated: */
> > +		str[len] = '\0';
> > +
> > +		if (param == MSM_PARAM_COMM) {
> > +			paramp = &ctx->comm;
> > +		} else {
> > +			paramp = &ctx->cmdline;
> > +		}
> > +
> > +		kfree(*paramp);
> > +		*paramp = str;
> > +
> > +		return 0;
> > +	}
> >  	case MSM_PARAM_SYSPROF:
> >  		if (!capable(CAP_SYS_ADMIN))
> >  			return -EPERM;
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
> > index 4ec62b601adc..68f3f8ade76d 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.c
> > +++ b/drivers/gpu/drm/msm/msm_gpu.c
> > @@ -364,14 +364,21 @@ static void retire_submits(struct msm_gpu *gpu);
> >
> >  static void get_comm_cmdline(struct msm_gem_submit *submit, char **comm, char **cmd)
> >  {
> > +	struct msm_file_private *ctx = submit->queue->ctx;
> >  	struct task_struct *task;
> >
> > +	*comm = kstrdup(ctx->comm, GFP_KERNEL);
> > +	*cmd = kstrdup(ctx->cmdline, GFP_KERNEL);
> > +
> >  	task = get_pid_task(submit->pid, PIDTYPE_PID);
> >  	if (!task)
> >  		return;
> >
> > -	*comm = kstrdup(task->comm, GFP_KERNEL);
> > -	*cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
> > +	if (!*comm)
> > +		*comm = kstrdup(task->comm, GFP_KERNEL);
>
> What?
>
> If the first allocation failed, then this one is going to fail as well.
> Just return -ENOMEM. Or maybe this is meant to be checking for an empty
> string?

fwiw, if ctx->comm is NULL, the kstrdup() will return NULL, so this
isn't intended to deal with OoM, but the case that comm and/or cmdline
is not overridden.

BR,
-R

>
> > +
> > +	if (!*cmd)
> > +		*cmd = kstrdup_quotable_cmdline(task, GFP_KERNEL);
>
> Same.
>
> >
> >  	put_task_struct(task);
> >  }
>
> regards,
> dan carpenter
>
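Editorial sketch: the point made in the reply is that kstrdup() returns NULL for a NULL source, so a NULL result here usually means "no override was set" rather than "out of memory". A userspace model of that fallback, with `dup_or_null()` and `pick_comm()` as hypothetical helper names:

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for kstrdup(): a NULL source yields NULL. */
static char *dup_or_null(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;

	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Prefer the per-context override; fall back to the task's own name
 * only when no override was set (or when the copy itself failed).
 */
static char *pick_comm(const char *override, const char *task_comm)
{
	char *comm = dup_or_null(override);

	if (!comm)
		comm = dup_or_null(task_comm);
	return comm;
}
```

The sketch cannot tell "no override" apart from a failed copy of the override; in both cases it retries with the task name, which is the ambiguity the review comment was probing.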
Re: amd-gfx Digest, Vol 70, Issue 199
On 2022-03-16 at 21:57, Yat Sin, David wrote:
> > Use proper amdgpu_gem_prime_import function to handle all kinds of
> > imports. Remember the dmabuf reference to enable proper multi-GPU
> > attachment to multiple VMs without erroneously re-exporting the
> > underlying BO multiple times.
> >
> > Signed-off-by: Felix Kuehling
> > ---
> >  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 38 ++ -
> >  1 file changed, 21 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > index cd89d2e46852..2ac61a1e665e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > @@ -2033,30 +2033,27 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> >  	struct amdgpu_bo *bo;
> >  	int ret;
> >
> > -	if (dma_buf->ops != &amdgpu_dmabuf_ops)
> > -		/* Can't handle non-graphics buffers */
> > -		return -EINVAL;
> > -
> > -	obj = dma_buf->priv;
> > -	if (drm_to_adev(obj->dev) != adev)
> > -		/* Can't handle buffers from other devices */
> > -		return -EINVAL;
> > +	obj = amdgpu_gem_prime_import(adev_to_drm(adev), dma_buf);
> > +	if (IS_ERR(obj))
> > +		return PTR_ERR(obj);
> >
> >  	bo = gem_to_amdgpu_bo(obj);
> >  	if (!(bo->preferred_domains & (AMDGPU_GEM_DOMAIN_VRAM |
> > -				    AMDGPU_GEM_DOMAIN_GTT)))
> > +				    AMDGPU_GEM_DOMAIN_GTT))) {
> >  		/* Only VRAM and GTT BOs are supported */
> > -		return -EINVAL;
> > +		ret = -EINVAL;
> > +		goto err_put_obj;
> > +	}
> >
> >  	*mem = kzalloc(sizeof(struct kgd_mem), GFP_KERNEL);
> > -	if (!*mem)
> > -		return -ENOMEM;
> > +	if (!*mem) {
> > +		ret = -ENOMEM;
> > +		goto err_put_obj;
> > +	}
> >
> >  	ret = drm_vma_node_allow(&obj->vma_node, drm_priv);
> > -	if (ret) {
> > -		kfree(mem);
> > -		return ret;
> > -	}
> > +	if (ret)
> > +		goto err_free_mem;
> >
> >  	if (size)
> >  		*size = amdgpu_bo_size(bo);
> > @@ -2073,7 +2070,8 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> >  		| KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE
> >  		| KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;
> >
> > -	drm_gem_object_get(&bo->tbo.base);
> > +	get_dma_buf(dma_buf);
> > +	(*mem)->dmabuf = dma_buf;
> >  	(*mem)->bo = bo;
> >  	(*mem)->va = va;
> >  	(*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
> > @@ -2085,6 +2083,12 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct amdgpu_device *adev,
> >  	(*mem)->is_imported = true;
> >
> >  	return 0;
> > +
> > +err_free_mem:
> > +	kfree(mem);
>
> Should be kfree(*mem)

Good catch. That was broken in the original code too and I just copied it.

Thanks,
  Felix

>
> Regards,
> David
>
> > +err_put_obj:
> > +	drm_gem_object_put(obj);
> > +	return ret;
> >  }
> >
> >  /* Evict a userptr BO by stopping the queues if necessary
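Editorial sketch: the `kfree(mem)` versus `kfree(*mem)` slip comes from `mem` being an out-parameter, so the allocation lives in `*mem`. A minimal userspace model of the goto-unwind pattern is shown below; `struct mem_desc`, `attach()` and `import_buffer()` are hypothetical names, and resetting `*mem` to NULL on failure is an addition of this sketch, not something the patch does.

```c
#include <errno.h>
#include <stdlib.h>

struct mem_desc {
	int refs;
};

/* Hypothetical step that can fail after the allocation succeeded. */
static int attach(struct mem_desc *m, int should_fail)
{
	if (should_fail)
		return -EINVAL;
	m->refs = 1;
	return 0;
}

/*
 * "mem" is an out-parameter: the allocation lives in *mem. The error
 * path must therefore free *mem; free(mem) would hand the address of
 * the caller's pointer variable to the allocator.
 */
static int import_buffer(struct mem_desc **mem, int should_fail)
{
	int ret;

	*mem = calloc(1, sizeof(**mem));
	if (!*mem)
		return -ENOMEM;

	ret = attach(*mem, should_fail);
	if (ret)
		goto err_free_mem;

	return 0;

err_free_mem:
	free(*mem);
	*mem = NULL;	/* sketch-only: avoid leaving a dangling pointer */
	return ret;
}
```

Keeping one label per acquired resource, unwound in reverse order, makes this kind of mismatch easier to spot in review.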
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On 2022-03-17 at 11:00, Lee Jones wrote:
> Good afternoon Felix,
>
> Thanks for your review.
>
> > On 2022-03-17 at 09:16, Lee Jones wrote:
> > > Presently the Client can be freed whilst still in use.
> > >
> > > Use the already provided lock to prevent this.
> > >
> > > Cc: Felix Kuehling
> > > Cc: Alex Deucher
> > > Cc: "Christian König"
> > > Cc: "Pan, Xinhui"
> > > Cc: David Airlie
> > > Cc: Daniel Vetter
> > > Cc: amd-...@lists.freedesktop.org
> > > Cc: dri-devel@lists.freedesktop.org
> > > Signed-off-by: Lee Jones
> > > ---
> > >  drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > index e4beebb1c80a2..3b9ac1e87231f 100644
> > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c
> > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep)
> > >  	spin_unlock(&dev->smi_lock);
> > >  	synchronize_rcu();
> > > +
> > > +	spin_lock(&client->lock);
> > >  	kfifo_free(&client->fifo);
> > >  	kfree(client);
> > > +	spin_unlock(&client->lock);
> > The spin_unlock is after the spinlock data structure has been freed.
> Good point.
>
> If we go forward with this approach the unlock should perhaps be moved
> to just before the kfree().
>
> > There should be no concurrent users here, since we are freeing the data
> > structure. If there still are concurrent users at this point, they will
> > crash anyway. So the locking is unnecessary.
> The users may well crash, as does the kernel unfortunately.

We only get to kfd_smi_ev_release when the file descriptor is closed. User
mode has no way to use the client any more at this point. This function also
removes the client from the dev->smi_clients list. So no more events will
be added to the client. Therefore it is safe to free the client.

If any of the above were not true, it would not be safe to kfree(client).

But if it is safe to kfree(client), then there is no need for the locking.

Regards,
  Felix

> > >  	return 0;
> > >  }
> > > @@ -247,11 +250,13 @@ int kfd_smi_event_open(struct kfd_dev *dev, uint32_t *fd)
> > >  		return ret;
> > >  	}
> > > +	spin_lock(&client->lock);
> > The client was just allocated, and it wasn't added to the client list or
> > given to user mode yet. So there can be no concurrent users at this point.
> > The locking is unnecessary.
> >
> > There could be potential issues if someone uses the file descriptor by dumb
> > luck before this function returns. So maybe we need to move the
> > anon_inode_getfd to the end of the function (just before list_add_rcu) so
> > that we only create the file descriptor after the client structure is fully
> > initialized.
> Bingo. Well done. :)
>
> I can move the function as suggested if that is the best route forward?
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On Thu, Mar 17, 2022 at 03:06:18AM -0700, Christian König wrote: > Am 17.03.22 um 10:59 schrieb Daniel Vetter: > > On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote: > >> From: Rob Clark > >> > >> In the system suspend path, we don't want to be racing with the > >> scheduler kthreads pushing additional queued up jobs to the hw > >> queue (ringbuffer). So park them first. While we are at it, > >> move the wait for active jobs to complete into the new system- > >> suspend path. > >> > >> Signed-off-by: Rob Clark > >> --- > >> drivers/gpu/drm/msm/adreno/adreno_device.c | 68 -- > >> 1 file changed, 64 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c > >> b/drivers/gpu/drm/msm/adreno/adreno_device.c > >> index 8859834b51b8..0440a98988fc 100644 > >> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c > >> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c > >> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu) > >> static int adreno_runtime_suspend(struct device *dev) > >> { > >>struct msm_gpu *gpu = dev_to_gpu(dev); > >> - int remaining; > >> + > >> + /* > >> + * We should be holding a runpm ref, which will prevent > >> + * runtime suspend. In the system suspend path, we've > >> + * already waited for active jobs to complete. > >> + */ > >> + WARN_ON_ONCE(gpu->active_submits); > >> + > >> + return gpu->funcs->pm_suspend(gpu); > >> +} > >> + > >> +static void suspend_scheduler(struct msm_gpu *gpu) > >> +{ > >> + int i; > >> + > >> + /* > >> + * Shut down the scheduler before we force suspend, so that > >> + * suspend isn't racing with scheduler kthread feeding us > >> + * more work. > >> + * > >> + * Note, we just want to park the thread, and let any jobs > >> + * that are already on the hw queue complete normally, as > >> + * opposed to the drm_sched_stop() path used for handling > >> + * faulting/timed-out jobs. 
We can't really cancel any jobs
> >> +	 * already on the hw queue without racing with the GPU.
> >> +	 */
> >> +	for (i = 0; i < gpu->nr_rings; i++) {
> >> +		struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched;
> >> +		kthread_park(sched->thread);
> > Shouldn't we have some proper interfaces for this?
>
> If I'm not completely mistaken we already should have one, yes.
>
> > Also I'm kinda wondering how other drivers do this, feels like we should
> > have a standard way.
> >
> > Finally not flushing out all in-flight requests sounds a bit like a bad
> > idea for system suspend/resume since that's also the hibernation path, and
> > that would mean your shrinker/page reclaim stops working. At least in full
> > generality. Which ain't good for hibernation.
>
> Completely agree, that looks like an incorrect workaround to me.
>
> During suspend all userspace applications should be frozen and all of
> their hardware activity flushed out and waited for completion.
>

Isn't that what Rob is doing?

He kills the scheduler, preventing any new job from being submitted, then
waits for any outstanding jobs to complete naturally (see the
wait_event_timeout below). If the jobs don't naturally complete the
suspend seems to be aborted? That flow makes sense to me and seems like
a novel way to avoid races.

Matt

> I do remember that our internal guys came up with pretty much the same
> idea and it sounded broken to me back then as well.
>
> Regards,
> Christian.
>
> >
> > Adding Christian and Andrey.
> > -Daniel > > > >> + } > >> +} > >> + > >> +static void resume_scheduler(struct msm_gpu *gpu) > >> +{ > >> + int i; > >> + > >> + for (i = 0; i < gpu->nr_rings; i++) { > >> + struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; > >> + kthread_unpark(sched->thread); > >> + } > >> +} > >> + > >> +static int adreno_system_suspend(struct device *dev) > >> +{ > >> + struct msm_gpu *gpu = dev_to_gpu(dev); > >> + int remaining, ret; > >> + > >> + suspend_scheduler(gpu); > >> > >>remaining = wait_event_timeout(gpu->retire_event, > >> active_submits(gpu) == 0, > >> msecs_to_jiffies(1000)); > >>if (remaining == 0) { > >>dev_err(dev, "Timeout waiting for GPU to suspend\n"); > >> - return -EBUSY; > >> + ret = -EBUSY; > >> + goto out; > >>} > >> > >> - return gpu->funcs->pm_suspend(gpu); > >> + ret = pm_runtime_force_suspend(dev); > >> +out: > >> + if (ret) > >> + resume_scheduler(gpu); > >> + > >> + return ret; > >> } > >> + > >> +static int adreno_system_resume(struct device *dev) > >> +{ > >> + resume_scheduler(dev_to_gpu(dev)); > >> + return pm_runtime_force_resume(dev); > >> +} > >> + > >> #endif > >> > >> static const struct dev_pm_ops adreno_pm_ops = { > >> - SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend, > >> pm_runtime_force_resume) > >> + SET_SYSTEM_SLEEP_PM_OPS(adreno_sys
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On Thu, Mar 17, 2022 at 3:06 AM Christian König wrote: > > Am 17.03.22 um 10:59 schrieb Daniel Vetter: > > On Thu, Mar 10, 2022 at 03:46:05PM -0800, Rob Clark wrote: > >> From: Rob Clark > >> > >> In the system suspend path, we don't want to be racing with the > >> scheduler kthreads pushing additional queued up jobs to the hw > >> queue (ringbuffer). So park them first. While we are at it, > >> move the wait for active jobs to complete into the new system- > >> suspend path. > >> > >> Signed-off-by: Rob Clark > >> --- > >> drivers/gpu/drm/msm/adreno/adreno_device.c | 68 -- > >> 1 file changed, 64 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c > >> b/drivers/gpu/drm/msm/adreno/adreno_device.c > >> index 8859834b51b8..0440a98988fc 100644 > >> --- a/drivers/gpu/drm/msm/adreno/adreno_device.c > >> +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c > >> @@ -619,22 +619,82 @@ static int active_submits(struct msm_gpu *gpu) > >> static int adreno_runtime_suspend(struct device *dev) > >> { > >> struct msm_gpu *gpu = dev_to_gpu(dev); > >> -int remaining; > >> + > >> +/* > >> + * We should be holding a runpm ref, which will prevent > >> + * runtime suspend. In the system suspend path, we've > >> + * already waited for active jobs to complete. > >> + */ > >> +WARN_ON_ONCE(gpu->active_submits); > >> + > >> +return gpu->funcs->pm_suspend(gpu); > >> +} > >> + > >> +static void suspend_scheduler(struct msm_gpu *gpu) > >> +{ > >> +int i; > >> + > >> +/* > >> + * Shut down the scheduler before we force suspend, so that > >> + * suspend isn't racing with scheduler kthread feeding us > >> + * more work. > >> + * > >> + * Note, we just want to park the thread, and let any jobs > >> + * that are already on the hw queue complete normally, as > >> + * opposed to the drm_sched_stop() path used for handling > >> + * faulting/timed-out jobs. We can't really cancel any jobs > >> + * already on the hw queue without racing with the GPU. 
> >> + */ > >> +for (i = 0; i < gpu->nr_rings; i++) { > >> +struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; > >> +kthread_park(sched->thread); > > Shouldn't we have some proper interfaces for this? > > If I'm not completely mistaken we already should have one, yes. drm_sched_stop() was my first thought, but it carries extra baggage. Really I *just* want to park the kthread. Note that amdgpu does (for afaict different reasons) park the kthread directly as well. > > Also I'm kinda wondering how other drivers do this, feels like we should > > have a standard > > way. As far as other drivers, it seems like they largely ignore it. I suspect other drivers also have problems in this area. Fwiw, I have a piglit test to try to exercise this path if you want to try it on other drivers.. might need some futzing around to make sure enough work is queued up that there is some on hw ring and some queued up in the scheduler when you try to suspend. https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/643 > > > > Finally not flushing out all in-flight requests sounds a bit like a bad > > idea for system suspend/resume since that's also the hibernation path, and > > that would mean your shrinker/page reclaim stops working. At least in full > > generality. Which ain't good for hibernation. > > Completely agree, that looks like an incorrect workaround to me. > > During suspend all userspace applications should be frozen and all f > their hardware activity flushed out and waited for completion. > > I do remember that our internal guys came up with pretty much the same > idea and it sounded broken to me back then as well. userspace frozen != kthread frozen .. that is what this patch is trying to address, so we aren't racing between shutting down the hw and the scheduler shoveling more jobs at us. BR, -R > Regards, > Christian. > > > > > Adding Christian and Andrey. 
> > -Daniel > > > >> +} > >> +} > >> + > >> +static void resume_scheduler(struct msm_gpu *gpu) > >> +{ > >> +int i; > >> + > >> +for (i = 0; i < gpu->nr_rings; i++) { > >> +struct drm_gpu_scheduler *sched = &gpu->rb[i]->sched; > >> +kthread_unpark(sched->thread); > >> +} > >> +} > >> + > >> +static int adreno_system_suspend(struct device *dev) > >> +{ > >> +struct msm_gpu *gpu = dev_to_gpu(dev); > >> +int remaining, ret; > >> + > >> +suspend_scheduler(gpu); > >> > >> remaining = wait_event_timeout(gpu->retire_event, > >> active_submits(gpu) == 0, > >> msecs_to_jiffies(1000)); > >> if (remaining == 0) { > >> dev_err(dev, "Timeout waiting for GPU to suspend\n"); > >> -return -EBUSY; > >> +ret = -EBUSY; > >> +goto out; > >> } > >> > >> -return gpu->funcs->pm_s
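Editorial sketch: the suspend sequence debated above (park the scheduler, wait with a timeout for jobs already on the ring, then either power down or unpark and abort) can be modelled in plain userspace C. `struct gpu_model` and its fields are hypothetical, and the polling loop only stands in for the wait_event_timeout() call in the quoted patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the GPU state relevant to suspend. */
struct gpu_model {
	bool scheduler_parked;
	int active_submits;	/* jobs already on the hardware ring */
	int drain_per_tick;	/* jobs the hardware retires per poll */
	bool powered;
};

/* Poll until the ring is empty or the tick budget runs out. */
static bool wait_for_idle(struct gpu_model *gpu, int max_ticks)
{
	while (gpu->active_submits > 0 && max_ticks-- > 0) {
		gpu->active_submits -= gpu->drain_per_tick;
		if (gpu->active_submits < 0)
			gpu->active_submits = 0;
	}
	return gpu->active_submits == 0;
}

static int system_suspend(struct gpu_model *gpu, int max_ticks)
{
	/* 1. Stop feeding new jobs to the hardware. */
	gpu->scheduler_parked = true;

	/* 2. Let jobs that are already queued on the ring finish. */
	if (!wait_for_idle(gpu, max_ticks)) {
		/* 3a. Abort: undo step 1 so the GPU keeps working. */
		gpu->scheduler_parked = false;
		return -EBUSY;
	}

	/* 3b. Ring is empty and nothing can refill it: power down. */
	gpu->powered = false;
	return 0;
}
```

The ordering is the point: parking first closes the window in which the scheduler could push a new job between the idle check and the power-down.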
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On Thu, 17 Mar 2022, Felix Kuehling wrote: > > Am 2022-03-17 um 11:00 schrieb Lee Jones: > > Good afternoon Felix, > > > > Thanks for your review. > > > > > Am 2022-03-17 um 09:16 schrieb Lee Jones: > > > > Presently the Client can be freed whilst still in use. > > > > > > > > Use the already provided lock to prevent this. > > > > > > > > Cc: Felix Kuehling > > > > Cc: Alex Deucher > > > > Cc: "Christian König" > > > > Cc: "Pan, Xinhui" > > > > Cc: David Airlie > > > > Cc: Daniel Vetter > > > > Cc: amd-...@lists.freedesktop.org > > > > Cc: dri-devel@lists.freedesktop.org > > > > Signed-off-by: Lee Jones > > > > --- > > > >drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ > > > >1 file changed, 6 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > > > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > > > index e4beebb1c80a2..3b9ac1e87231f 100644 > > > > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > > > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > > > > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, > > > > struct file *filep) > > > > spin_unlock(&dev->smi_lock); > > > > synchronize_rcu(); > > > > + > > > > + spin_lock(&client->lock); > > > > kfifo_free(&client->fifo); > > > > kfree(client); > > > > + spin_unlock(&client->lock); > > > The spin_unlock is after the spinlock data structure has been freed. > > Good point. > > > > If we go forward with this approach the unlock should perhaps be moved > > to just before the kfree(). > > > > > There > > > should be no concurrent users here, since we are freeing the data > > > structure. > > > If there still are concurrent users at this point, they will crash anyway. > > > So the locking is unnecessary. > > The users may well crash, as does the kernel unfortunately. > We only get to kfd_smi_ev_release when the file descriptor is closed. User > mode has no way to use the client any more at this point. 
This function also
> removes the client from the dev->smi_clients list. So no more events will
> be added to the client. Therefore it is safe to free the client.
>
> If any of the above were not true, it would not be safe to kfree(client).
>
> But if it is safe to kfree(client), then there is no need for the locking.

I'm not keen to go into too much detail until it's been patched.

However, there is a way to free the client while it is still in use.

Remember we are multi-threaded.

--
Lee Jones [李琼斯]
Principal Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter wrote:
>
> On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote:
> > On 16.03.22 at 16:36, Rob Clark wrote:
> > > [SNIP]
> > > just one point of clarification.. in the msm and i915 case it is
> > > purely for debugging and telemetry (ie. sending crash logs back to
> > > distro for analysis if user has crash reporting enabled).. it isn't
> > > used for triggering any action like killing app or compositor.
> >
> > By the way, how does msm do its memory management for the devcoredumps?
>
> GFP_NORECLAIM all the way. It's purely best effort.

We do one GEM obj allocation in the snapshot path (the hw has a
mechanism to snapshot its own state into a gpu buffer.. not sure if
nice debugging functionality like that is a commentary on the blob
driver quality, but I'm not complaining)

I suppose we could pre-allocate this buffer up-front.. but it doesn't
seem like a problem, ie. if allocation fails we just skip snapshotting
stuff that needs the hw crashdumper. I guess since vram is not
involved, perhaps that makes the situation a bit more straightforward.

> Note that the fancy new plan for i915 discrete gpu is to only support gpu
> crash dumps on non-recoverable gpu contexts, i.e. those that do not
> continue to the next batch when something bad happens. This is what vk
> wants and also what iris now uses (we do context recovery in userspace in
> all cases), and non-recoverable contexts greatly simplify the crash dump
> gather: Only thing you need to gather is the register state from hw
> (before you reset it), all the batchbuffer bo and indirect state bo (in
> i915 you can mark which bo to capture in the CS ioctl) can be captured in
> a worker later on. Which for non-recoverable context is no issue, since
> subsequent batchbuffers won't trample over any of these things.
>
> And that way you can record the crashdump (or at least the big pieces like
> all the indirect state stuff) with GFP_KERNEL.
>
> msm probably gets it wrong since embedded drivers have much less shrinker
> and generally no mmu notifiers going on :-)

Note that the bo's associated with the batch are still pinned at this
point, from the bo lifecycle the batch is still active. So from the
point of view of shrinker, there should be no interaction. We aren't
doing anything with mmu notifiers (yet), so not entirely sure offhand
the concern there.

Currently we just use GFP_KERNEL and bail if allocation fails.

BR,
-R

> > I mean it is strictly forbidden to allocate any memory in the GPU reset
> > path.
> >
> > > I would however *strongly* recommend devcoredump support in other GPU
> > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it
> > > to debug and fix a couple obscure issues that I was not able to
> > > reproduce by myself.
> >
> > Yes, completely agree as well.
>
> +1
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
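Editorial sketch: the "best effort" approach described in the reply (capture what needs no allocation, and skip the optional part if its buffer cannot be allocated) might look like the userspace model below. `struct crash_state`, `snapshot()` and the `alloc_fails` switch used to simulate an allocation failure are all hypothetical and not taken from the msm driver.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical crash-state record; field names are illustrative. */
struct crash_state {
	unsigned int regs[4];	/* always captured, no allocation needed */
	unsigned char *hw_dump;	/* optional, needs a buffer */
	size_t hw_dump_len;
};

/*
 * Capture what can be captured. If the optional dump buffer cannot be
 * allocated, skip that part instead of failing the whole snapshot.
 */
static void snapshot(struct crash_state *st, const unsigned int *regs,
		     size_t dump_len, bool alloc_fails)
{
	memcpy(st->regs, regs, sizeof(st->regs));

	st->hw_dump = alloc_fails ? NULL : calloc(1, dump_len);
	st->hw_dump_len = st->hw_dump ? dump_len : 0;
}
```

A consumer of the record then has to treat `hw_dump == NULL` as "not captured" rather than as an error.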
Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event
On Thu, Mar 17, 2022 at 2:29 AM Daniel Vetter wrote: > > On Thu, Mar 17, 2022 at 08:03:27AM +0100, Christian König wrote: > > Am 16.03.22 um 16:36 schrieb Rob Clark: > > > [SNIP] > > > just one point of clarification.. in the msm and i915 case it is > > > purely for debugging and telemetry (ie. sending crash logs back to > > > distro for analysis if user has crash reporting enabled).. it isn't > > > used for triggering any action like killing app or compositor. > > > > By the way, how does msm do its memory management for the devcoredumps? > > GFP_NORECLAIM all the way. It's purely best effort. > > Note that the fancy new plan for i915 discrete gpu is to only support gpu > crash dumps on non-recoverable gpu contexts, i.e. those that do not > continue to the next batch when something bad happens. This is what vk > wants and also what iris now uses (we do context recovery in userspace in > all cases), and non-recoverable contexts greatly simplify the crash dump > gather: Only thing you need to gather is the register state from hw > (before you reset it), all the batchbuffer bo and indirect state bo (in > i915 you can mark which bo to capture in the CS ioctl) can be captured in > a worker later on. Which for non-recoverable context is no issue, since > subsequent batchbuffers won't trample over any of these things. fwiw, we snapshot everything (cmdstream and bo's marked with dump flag, in addition to hw state) before resuming the GPU, so there is no danger of things being trampled. After state is captured and GPU reset, we "replay" the submits that were written into the ringbuffer after the faulting submit. GPU crashes should be a thing you don't need to try to optimize. (At some point, I'd like to use scheduler for the replay, and actually use drm_sched_stop()/etc.. 
but last time I looked there were still some sched bugs in that area which prevented me from deleting a bunch of code ;-)) BR, -R > > And that way you can record the crashdump (or at least the big pieces like > all the indirect state stuff) with GFP_KERNEL. > > msm probably gets it wrong since embedded drivers have much less shrinker > and generally no mmu notifiers going on :-) > > > I mean it is strictly forbidden to allocate any memory in the GPU reset > > path. > > > > > I would however *strongly* recommend devcoredump support in other GPU > > > drivers (i915's thing pre-dates devcoredump by a lot).. I've used it > > > to debug and fix a couple obscure issues that I was not able to > > > reproduce by myself. > > > > Yes, completely agree as well. > > +1 > > Cheers, Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
Re: [Freedreno] [PATCH v3 5/5] drm/msm: allow compile time selection of driver components
On 17/03/2022 15:44, Dmitry Baryshkov wrote: On 16/03/2022 20:26, Abhinav Kumar wrote: On 3/16/2022 12:31 AM, Dmitry Baryshkov wrote: On 16/03/2022 03:28, Abhinav Kumar wrote: On 3/3/2022 7:21 PM, Dmitry Baryshkov wrote: MSM DRM driver already allows one to compile out the DP or DSI support. Add support for disabling other features like MDP4/MDP5/DPU drivers or direct HDMI output support. Suggested-by: Stephen Boyd Signed-off-by: Dmitry Baryshkov --- drivers/gpu/drm/msm/Kconfig | 50 -- drivers/gpu/drm/msm/Makefile | 18 ++-- drivers/gpu/drm/msm/msm_drv.h | 33 ++ drivers/gpu/drm/msm/msm_mdss.c | 13 +++-- 4 files changed, 106 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/Kconfig b/drivers/gpu/drm/msm/Kconfig index 9b019598e042..3735fd41eb3b 100644 --- a/drivers/gpu/drm/msm/Kconfig +++ b/drivers/gpu/drm/msm/Kconfig @@ -46,12 +46,39 @@ config DRM_MSM_GPU_SUDO Only use this if you are a driver developer. This should *not* be enabled for production kernels. If unsure, say N. -config DRM_MSM_HDMI_HDCP - bool "Enable HDMI HDCP support in MSM DRM driver" +config DRM_MSM_MDSS + bool + depends on DRM_MSM + default n shouldnt DRM_MSM_MDSS be defaulted to y? No, it will be selected either by MDP5 or by DPU1. It is not used if DRM_MSM is compiled with just MDP4 or headless support in mind. Ok got it. Another question is the compilation validation of the combinations of these. So we need to try: 1) DRM_MSM_MDSS + DRM_MSM_MDP4 2) DRM_MSM_MDSS + DRM_MSM_MDP5 3) DRM_MSM_MDSS + DRM_MSM_DPU Earlier since all of them were compiled together any inter-dependencies will not show up. Now since we are separating it out, just wanted to make sure each of the combos compile? I think you meant: - headless - MDP4 - MDP5 - DPU1 - MDP4 + MDP5 - MDP4 + DPU1 - MDP5 + DPU1 - all three drivers Yes, each of these combinations. Each of them was tested. Hmm. It looks like I had DSI disabled during the tests. Will fix it up. 
+ +config DRM_MSM_MDP4 + bool "Enable MDP4 support in MSM DRM driver" depends on DRM_MSM default y help - Choose this option to enable HDCP state machine + Compile in support for the Mobile Display Processor v4 (MDP4) in + the MSM DRM driver. It is the older display controller found in + devices using APQ8064/MSM8960/MSM8x60 platforms. + +config DRM_MSM_MDP5 + bool "Enable MDP5 support in MSM DRM driver" + depends on DRM_MSM + select DRM_MSM_MDSS + default y + help + Compile in support for the Mobile Display Processor v5 (MDP5) in + the MSM DRM driver. It is the display controller found in devices + using e.g. APQ8016/MSM8916/APQ8096/MSM8996/MSM8974/SDM6x0 platforms. + +config DRM_MSM_DPU + bool "Enable DPU support in MSM DRM driver" + depends on DRM_MSM + select DRM_MSM_MDSS + default y + help + Compile in support for the Display Processing Unit in + the MSM DRM driver. It is the display controller found in devices + using e.g. SDM845 and newer platforms. config DRM_MSM_DP bool "Enable DisplayPort support in MSM DRM driver" @@ -116,3 +143,20 @@ config DRM_MSM_DSI_7NM_PHY help Choose this option if DSI PHY on SM8150/SM8250/SC7280 is used on the platform. + +config DRM_MSM_HDMI + bool "Enable HDMI support in MSM DRM driver" + depends on DRM_MSM + default y + help + Compile in support for the HDMI output in the MSM DRM driver. It can + be a primary or a secondary display on the device. Note that this is used + only for the direct HDMI output. If the device outputs HDMI data + through some kind of DSI-to-HDMI bridge, this option can be disabled. 
+ +config DRM_MSM_HDMI_HDCP + bool "Enable HDMI HDCP support in MSM DRM driver" + depends on DRM_MSM && DRM_MSM_HDMI + default y + help + Choose this option to enable HDCP state machine diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile index e76927b42033..5fe9c20ab9ee 100644 --- a/drivers/gpu/drm/msm/Makefile +++ b/drivers/gpu/drm/msm/Makefile @@ -16,6 +16,8 @@ msm-y := \ adreno/a6xx_gpu.o \ adreno/a6xx_gmu.o \ adreno/a6xx_hfi.o \ + +msm-$(CONFIG_DRM_MSM_HDMI) += \ hdmi/hdmi.o \ hdmi/hdmi_audio.o \ hdmi/hdmi_bridge.o \ @@ -27,8 +29,8 @@ msm-y := \ hdmi/hdmi_phy_8x60.o \ hdmi/hdmi_phy_8x74.o \ hdmi/hdmi_pll_8960.o \ - disp/mdp_format.o \ - disp/mdp_kms.o \ + +msm-$(CONFIG_DRM_MSM_MDP4) += \ disp/mdp4/mdp4_crtc.o \ disp/mdp4/mdp4_dtv_encoder.o \ disp/mdp4/mdp4_lcdc_encoder.o \ @@ -37,6 +39,8 @@ msm-y := \ disp/mdp4/mdp4_irq.o \ disp/mdp4/mdp4_kms.o \ disp/mdp4/mdp4_plane.o \ + +msm-$(CONFIG_DRM_MSM_MDP5) += \ disp/mdp5/mdp5_cfg.o \ disp/mdp5/mdp5_ctl.o \ disp/mdp5/mdp5_crtc.o \ @@ -47,6 +51,8 @@ msm-y
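The combinations discussed above (headless, MDP4, MDP5, DPU1 and every mix of them) can be enumerated with a short user-space sketch. This is only an illustration of the selection rule stated in the thread, namely that DRM_MSM_MDSS is selected by MDP5 or DPU and is unused for headless or MDP4-only builds. The struct and function names are hypothetical and are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical model of the three display-controller Kconfig switches. */
struct msm_display_config {
    bool mdp4;
    bool mdp5;
    bool dpu;
};

/* Map an index 0..7 onto one of the eight on/off combinations. */
static struct msm_display_config combo_from_index(unsigned int idx)
{
    struct msm_display_config c = {
        .mdp4 = (idx & 1u) != 0,
        .mdp5 = (idx & 2u) != 0,
        .dpu  = (idx & 4u) != 0,
    };
    return c;
}

/* Per the thread: MDSS is pulled in only when MDP5 or DPU is enabled. */
static bool mdss_selected(struct msm_display_config c)
{
    return c.mdp5 || c.dpu;
}

/* Count how many of the eight combinations end up selecting MDSS. */
static unsigned int count_combos_with_mdss(void)
{
    unsigned int idx, count = 0;

    for (idx = 0; idx < 8; idx++)
        if (mdss_selected(combo_from_index(idx)))
            count++;
    return count;
}
```

Index 0 corresponds to the headless build and index 1 to MDP4 only; neither selects MDSS under this rule, which matches the explanation for why the symbol does not default to y.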
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
Am 17.03.22 um 16:10 schrieb Rob Clark: [SNIP] userspace frozen != kthread frozen .. that is what this patch is trying to address, so we aren't racing between shutting down the hw and the scheduler shoveling more jobs at us. Well exactly that's the problem. The scheduler is supposed to keep shoveling more jobs at us until it is empty. Thinking more about it we will then keep some dma_fence instance unsignaled and that is an extremely bad idea since it can lead to deadlocks during suspend. So this patch here is an absolute clear NAK from my side. If amdgpu is doing something similar that is a severe bug and needs to be addressed somehow. Regards, Christian. BR, -R
Re: [PATCH v2 6/8] drm/shmem-helper: Add generic memory shrinker
On Wed, Mar 16, 2022 at 5:13 PM Dmitry Osipenko wrote: > > On 3/16/22 23:00, Rob Clark wrote: > > On Mon, Mar 14, 2022 at 3:44 PM Dmitry Osipenko > > wrote: > >> > >> Introduce a common DRM SHMEM shrinker. It allows to reduce code > >> duplication among DRM drivers, it also handles complicated lockings > >> for the drivers. This is initial version of the shrinker that covers > >> basic needs of GPU drivers. > >> > >> This patch is based on a couple ideas borrowed from Rob's Clark MSM > >> shrinker and Thomas' Zimmermann variant of SHMEM shrinker. > >> > >> GPU drivers that want to use generic DRM memory shrinker must support > >> generic GEM reservations. > >> > >> Signed-off-by: Daniel Almeida > >> Signed-off-by: Dmitry Osipenko > >> --- > >> drivers/gpu/drm/drm_gem_shmem_helper.c | 194 + > >> include/drm/drm_device.h | 4 + > >> include/drm/drm_gem.h | 11 ++ > >> include/drm/drm_gem_shmem_helper.h | 25 > >> 4 files changed, 234 insertions(+) > >> > >> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c > >> b/drivers/gpu/drm/drm_gem_shmem_helper.c > >> index 37009418cd28..35be2ee98f11 100644 > >> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > >> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > >> @@ -139,6 +139,9 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object > >> *shmem) > >> { > >> struct drm_gem_object *obj = &shmem->base; > >> > >> + /* take out shmem GEM object from the memory shrinker */ > >> + drm_gem_shmem_madvise(shmem, 0); > >> + > >> WARN_ON(shmem->vmap_use_count); > >> > >> if (obj->import_attach) { > >> @@ -163,6 +166,42 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object > >> *shmem) > >> } > >> EXPORT_SYMBOL_GPL(drm_gem_shmem_free); > >> > >> +static void drm_gem_shmem_update_purgeable_status(struct > >> drm_gem_shmem_object *shmem) > >> +{ > >> + struct drm_gem_object *obj = &shmem->base; > >> + struct drm_gem_shmem_shrinker *gem_shrinker = > >> obj->dev->shmem_shrinker; > >> + size_t page_count = obj->size >> PAGE_SHIFT; > >> + > >> + if 
(!gem_shrinker || obj->import_attach || !obj->funcs->purge) > >> + return; > >> + > >> + mutex_lock(&shmem->vmap_lock); > >> + mutex_lock(&shmem->pages_lock); > >> + mutex_lock(&gem_shrinker->lock); > >> + > >> + if (shmem->madv < 0) { > >> + list_del_init(&shmem->madv_list); > >> + goto unlock; > >> + } else if (shmem->madv > 0) { > >> + if (!list_empty(&shmem->madv_list)) > >> + goto unlock; > >> + > >> + WARN_ON(gem_shrinker->shrinkable_count + page_count < > >> page_count); > >> + gem_shrinker->shrinkable_count += page_count; > >> + > >> + list_add_tail(&shmem->madv_list, &gem_shrinker->lru); > >> + } else if (!list_empty(&shmem->madv_list)) { > >> + list_del_init(&shmem->madv_list); > >> + > >> + WARN_ON(gem_shrinker->shrinkable_count < page_count); > >> + gem_shrinker->shrinkable_count -= page_count; > >> + } > >> +unlock: > >> + mutex_unlock(&gem_shrinker->lock); > >> + mutex_unlock(&shmem->pages_lock); > >> + mutex_unlock(&shmem->vmap_lock); > >> +} > >> + > >> static int drm_gem_shmem_get_pages_locked(struct drm_gem_shmem_object > >> *shmem) > >> { > >> struct drm_gem_object *obj = &shmem->base; > >> @@ -366,6 +405,8 @@ int drm_gem_shmem_vmap(struct drm_gem_shmem_object > >> *shmem, > >> ret = drm_gem_shmem_vmap_locked(shmem, map); > >> mutex_unlock(&shmem->vmap_lock); > >> > >> + drm_gem_shmem_update_purgeable_status(shmem); > >> + > >> return ret; > >> } > >> EXPORT_SYMBOL(drm_gem_shmem_vmap); > >> @@ -409,6 +450,8 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object > >> *shmem, > >> mutex_lock(&shmem->vmap_lock); > >> drm_gem_shmem_vunmap_locked(shmem, map); > >> mutex_unlock(&shmem->vmap_lock); > >> + > >> + drm_gem_shmem_update_purgeable_status(shmem); > >> } > >> EXPORT_SYMBOL(drm_gem_shmem_vunmap); > >> > >> @@ -451,6 +494,8 @@ int drm_gem_shmem_madvise(struct drm_gem_shmem_object > >> *shmem, int madv) > >> > >> mutex_unlock(&shmem->pages_lock); > >> > >> + drm_gem_shmem_update_purgeable_status(shmem); > >> + > >> return (madv >= 0); > >> } > 
>> EXPORT_SYMBOL(drm_gem_shmem_madvise); > >> @@ -763,6 +808,155 @@ drm_gem_shmem_prime_import_sg_table(struct > >> drm_device *dev, > >> } > >> EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table); > >> > >> +static struct drm_gem_shmem_shrinker * > >> +to_drm_shrinker(struct shrinker *shrinker) > >> +{ > >> + return container_of(shrinker, struct drm_gem_shmem_shrinker, base); > >> +} > >> + > >> +static unsigned long > >> +drm_gem_shmem_shrinker_count_objects(struct shrinker *shrinker, > >> +
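The bookkeeping in drm_gem_shmem_update_purgeable_status() quoted above can be modelled in user space roughly as follows. This is a simplified sketch, not the helper itself: the kernel lists and mutexes are replaced by a boolean and plain fields, all names are hypothetical, and unlike the patch (which only WARNs and carries on) the model refuses an update that would overflow or underflow the counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shrinker's page accounting. */
struct shrinker_model {
    size_t shrinkable_pages;
};

/* Hypothetical stand-in for one shmem GEM object. */
struct object_model {
    int madv;      /* < 0 purged, 0 needed, > 0 purgeable */
    size_t pages;  /* object size in pages */
    bool on_lru;   /* models !list_empty(&shmem->madv_list) */
};

/* Returns false when the counter would wrap; the state is left untouched then. */
static bool update_purgeable_status(struct shrinker_model *s,
                                    struct object_model *o)
{
    if (o->madv < 0) {
        /* Purged: the quoted hunk only unlinks the object here. */
        o->on_lru = false;
        return true;
    }

    if (o->madv > 0) {
        if (o->on_lru)
            return true;              /* already tracked */
        if (s->shrinkable_pages + o->pages < o->pages)
            return false;             /* would overflow */
        s->shrinkable_pages += o->pages;
        o->on_lru = true;
        return true;
    }

    /* madv == 0: the object is needed again, stop tracking it. */
    if (o->on_lru) {
        if (s->shrinkable_pages < o->pages)
            return false;             /* would underflow */
        o->on_lru = false;
        s->shrinkable_pages -= o->pages;
    }
    return true;
}
```

Calling the function twice in a row with madv > 0 leaves the count unchanged the second time, which mirrors the early exit on a non-empty madv_list in the patch.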
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On Thu, Mar 17, 2022 at 9:04 AM Christian König wrote: > > Am 17.03.22 um 16:10 schrieb Rob Clark: > > [SNIP] > > userspace frozen != kthread frozen .. that is what this patch is > > trying to address, so we aren't racing between shutting down the hw > > and the scheduler shoveling more jobs at us. > > Well exactly that's the problem. The scheduler is supposed to keep shoveling > more jobs at us until it is empty. > > Thinking more about it we will then keep some dma_fence instance > unsignaled and that is an extremely bad idea since it can lead to > deadlocks during suspend. Hmm, perhaps that is true if you need to migrate things out of vram? It is at least not a problem when vram is not involved. > So this patch here is an absolute clear NAK from my side. If amdgpu is > doing something similar that is a severe bug and needs to be addressed > somehow. I think amdgpu's use of kthread_park is not related to suspend, but didn't look too closely. And perhaps the solution for this problem is more complex in the case of amdgpu, I'm not super familiar with the constraints there. But I think it is a fine solution for integrated GPUs. BR, -R > Regards, > Christian. > > > > > BR, > > -R > > >
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On 2022-03-17 11:13 a.m., Lee Jones wrote: On Thu, 17 Mar 2022, Felix Kuehling wrote: Am 2022-03-17 um 11:00 schrieb Lee Jones: Good afternoon Felix, Thanks for your review. Am 2022-03-17 um 09:16 schrieb Lee Jones: Presently the Client can be freed whilst still in use. Use the already provided lock to prevent this. Cc: Felix Kuehling Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: amd-...@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones --- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c index e4beebb1c80a2..3b9ac1e87231f 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, struct file *filep) spin_unlock(&dev->smi_lock); synchronize_rcu(); + + spin_lock(&client->lock); kfifo_free(&client->fifo); kfree(client); + spin_unlock(&client->lock); The spin_unlock is after the spinlock data structure has been freed. Good point. If we go forward with this approach the unlock should perhaps be moved to just before the kfree(). There should be no concurrent users here, since we are freeing the data structure. If there still are concurrent users at this point, they will crash anyway. So the locking is unnecessary. The users may well crash, as does the kernel unfortunately. We only get to kfd_smi_ev_release when the file descriptor is closed. User mode has no way to use the client any more at this point. This function also removes the client from the dev->smi_clients list. So no more events will be added to the client. Therefore it is safe to free the client. If any of the above were not true, it would not be safe to kfree(client). But if it is safe to kfree(client), then there is no need for the locking. 
I'm not keen to go into too much detail until it's been patched. However, there is a way to free the client while it is still in use. Remember we are multi-threaded. files_struct->count refcount is used to handle this race, as vfs_read/vfs_write takes file refcount and fput calls release only if refcount is 1, to guarantee that read/write from user space is finished here. Another race is driver add_event_to_kfifo while closing the handler. We use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls synchronize_rcu to wait for all rcu_read done. So it is safe to call kfifo_free(&client->fifo) and kfree(client). Regards, Philip
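The reference-counting argument in the reply above (the release hook only runs once the last reference to the file is dropped, so an in-flight read or write keeps the client alive) can be illustrated with a small user-space model. This is a sketch under stated assumptions: it uses C11 atomics instead of the kernel's file refcount, all names are hypothetical, and it deliberately leaves out the RCU side of the argument.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a file with a reference count and a release hook. */
struct file_model {
    atomic_int refs;     /* number of outstanding references */
    int release_calls;   /* how many times the release step has run */
};

/* Take a reference, as a reader or writer would before touching the file. */
static void file_get(struct file_model *f)
{
    atomic_fetch_add(&f->refs, 1);
}

/*
 * Drop a reference. The release step runs only when the caller held the
 * last reference, i.e. when the previous count was 1.
 */
static bool file_put(struct file_model *f)
{
    if (atomic_fetch_sub(&f->refs, 1) == 1) {
        f->release_calls++;
        return true;
    }
    return false;
}
```

In this model a close that races with a read simply drops its reference first; the release step is deferred until the reader drops its own reference.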
[PATCH 1/2] drm: Add missing DP DSC extended capability definitions.
Adding DP DSC register definitions, we might need for further DSC implementation, supporting MST and DP branch pass-through mode. Signed-off-by: Stanislav Lisovskiy --- drivers/gpu/drm/dp/drm_dp.c | 25 + include/drm/dp/drm_dp_helper.h | 11 ++- 2 files changed, 35 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/dp/drm_dp.c b/drivers/gpu/drm/dp/drm_dp.c index 703972ae14c6..fe9c72055638 100644 --- a/drivers/gpu/drm/dp/drm_dp.c +++ b/drivers/gpu/drm/dp/drm_dp.c @@ -2312,6 +2312,31 @@ u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE], } EXPORT_SYMBOL(drm_dp_dsc_sink_max_slice_count); +/** + * drm_dp_dsc_sink_bpp_increment_div - Get the bits per pixel precision + * which DP DSC sink device supports. + */ +u8 drm_dp_dsc_sink_bpp_increment_div(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]) +{ + u8 bpp_increment_dpcd = dsc_dpcd[DP_DSC_BITS_PER_PIXEL_INC - DP_DSC_SUPPORT]; + + switch (bpp_increment_dpcd) { + case DP_DSC_BITS_PER_PIXEL_1_16: + return 16; + case DP_DSC_BITS_PER_PIXEL_1_8: + return 8; + case DP_DSC_BITS_PER_PIXEL_1_4: + return 4; + case DP_DSC_BITS_PER_PIXEL_1_2: + return 2; + case DP_DSC_BITS_PER_PIXEL_1_1: + return 1; + } + + return 0; +} + + /** * drm_dp_dsc_sink_line_buf_depth() - Get the line buffer depth in bits * @dsc_dpcd: DSC capabilities from DPCD diff --git a/include/drm/dp/drm_dp_helper.h b/include/drm/dp/drm_dp_helper.h index 51e02cf75277..e4c9f4438ccb 100644 --- a/include/drm/dp/drm_dp_helper.h +++ b/include/drm/dp/drm_dp_helper.h @@ -246,6 +246,9 @@ struct drm_panel; #define DP_DSC_SUPPORT 0x060 /* DP 1.4 */ # define DP_DSC_DECOMPRESSION_IS_SUPPORTED (1 << 0) +# define DP_DSC_PASS_THROUGH_IS_SUPPORTED (1 << 1) +# define DP_DSC_DYNAMIC_PPS_UPDATE_SUPPORT_COMP_TO_COMP (1 << 2) +# define DP_DSC_DYNAMIC_PPS_UPDATE_SUPPORT_UNCOMP_TO_COMP (1 << 3) #define DP_DSC_REV 0x061 # define DP_DSC_MAJOR_MASK (0xf << 0) @@ -284,12 +287,15 @@ struct drm_panel; #define DP_DSC_BLK_PREDICTION_SUPPORT 0x066 # define 
DP_DSC_BLK_PREDICTION_IS_SUPPORTED (1 << 0) +# define DP_DSC_RGB_COLOR_CONV_BYPASS_SUPPORT (1 << 1) #define DP_DSC_MAX_BITS_PER_PIXEL_LOW 0x067 /* eDP 1.4 */ #define DP_DSC_MAX_BITS_PER_PIXEL_HI 0x068 /* eDP 1.4 */ # define DP_DSC_MAX_BITS_PER_PIXEL_HI_MASK (0x3 << 0) # define DP_DSC_MAX_BITS_PER_PIXEL_HI_SHIFT 8 +# define DP_DSC_MAX_BPP_DELTA_VERSION_MASK 0x06 +# define DP_DSC_MAX_BPP_DELTA_AVAILABILITY 0x08 #define DP_DSC_DEC_COLOR_FORMAT_CAP 0x069 # define DP_DSC_RGB (1 << 0) @@ -351,11 +357,13 @@ struct drm_panel; # define DP_DSC_24_PER_DP_DSC_SINK (1 << 2) #define DP_DSC_BITS_PER_PIXEL_INC 0x06F +# define DP_DSC_RGB_YCbCr444_MAX_BPP_DELTA_MASK 0x1f +# define DP_DSC_RGB_YCbCr420_MAX_BPP_DELTA_MASK 0xe0 # define DP_DSC_BITS_PER_PIXEL_1_16 0x0 # define DP_DSC_BITS_PER_PIXEL_1_8 0x1 # define DP_DSC_BITS_PER_PIXEL_1_4 0x2 # define DP_DSC_BITS_PER_PIXEL_1_2 0x3 -# define DP_DSC_BITS_PER_PIXEL_1 0x4 +# define DP_DSC_BITS_PER_PIXEL_1_1 0x4 #define DP_PSR_SUPPORT 0x070 /* XXX 1.2? */ # define DP_PSR_IS_SUPPORTED 1 @@ -1825,6 +1833,7 @@ u8 drm_dp_dsc_sink_max_slice_count(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE], u8 drm_dp_dsc_sink_line_buf_depth(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]); int drm_dp_dsc_sink_supported_input_bpcs(const u8 dsc_dpc[DP_DSC_RECEIVER_CAP_SIZE], u8 dsc_bpc[3]); +u8 drm_dp_dsc_sink_bpp_increment_div(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]); static inline bool drm_dp_sink_supports_dsc(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_SIZE]) -- 2.24.1.485.gad05a3d8e5
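The lookup done by drm_dp_dsc_sink_bpp_increment_div() in the patch above can be tried out in user space with a sketch like the following. The two register offsets and the five encodings are taken from the quoted header hunk. Everything else is an assumption made for illustration: the buffer size constant and function name are hypothetical, and masking the byte down to its low three bits before the switch is not something the posted patch does.

```c
#include <stdint.h>

/* Offsets as in the quoted header: the DSC capability block starts at 0x060. */
#define DSC_SUPPORT_OFFSET      0x060
#define DSC_BITS_PER_PIXEL_INC  0x06F

/* Hypothetical: size of a local copy covering 0x060 through 0x06F inclusive. */
#define DSC_CAP_COPY_SIZE       (DSC_BITS_PER_PIXEL_INC - DSC_SUPPORT_OFFSET + 1)

/* Assumption: only the low three bits carry the increment encoding. */
#define DSC_BPP_INC_FIELD_MASK  0x7

/*
 * Returns the divider N such that the sink's bits-per-pixel step is 1/N,
 * or 0 when the encoding is not one of the five listed in the patch.
 */
static uint8_t dsc_bpp_increment_div(const uint8_t caps[DSC_CAP_COPY_SIZE])
{
    uint8_t field = caps[DSC_BITS_PER_PIXEL_INC - DSC_SUPPORT_OFFSET] &
                    DSC_BPP_INC_FIELD_MASK;

    switch (field) {
    case 0x0: return 16;  /* 1/16 bpp */
    case 0x1: return 8;   /* 1/8 bpp */
    case 0x2: return 4;   /* 1/4 bpp */
    case 0x3: return 2;   /* 1/2 bpp */
    case 0x4: return 1;   /* 1 bpp */
    default:  return 0;   /* reserved or unknown encoding */
    }
}
```

The index works out to 0x06F - 0x060 = 15, so the capability copy passed in has to hold at least 16 bytes for this read to stay in bounds.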
Re: [PATCH 1/1] drm/amdkfd: Protect the Client whilst it is being operated on
On Thu, 17 Mar 2022, philip yang wrote: >On 2022-03-17 11:13 a.m., Lee Jones wrote: > > On Thu, 17 Mar 2022, Felix Kuehling wrote: > > > Am 2022-03-17 um 11:00 schrieb Lee Jones: > > Good afternoon Felix, > > Thanks for your review. > > > Am 2022-03-17 um 09:16 schrieb Lee Jones: > > Presently the Client can be freed whilst still in use. > > Use the already provided lock to prevent this. > > Cc: Felix Kuehling > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: David Airlie > Cc: Daniel Vetter > Cc: amd-...@lists.freedesktop.org > Cc: dri-devel@lists.freedesktop.org > Signed-off-by: Lee Jones > --- >drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 6 ++ >1 file changed, 6 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > index e4beebb1c80a2..3b9ac1e87231f 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c > @@ -145,8 +145,11 @@ static int kfd_smi_ev_release(struct inode *inode, > struct file *filep) > spin_unlock(&dev->smi_lock); > synchronize_rcu(); > + > + spin_lock(&client->lock); > kfifo_free(&client->fifo); > kfree(client); > + spin_unlock(&client->lock); > > The spin_unlock is after the spinlock data structure has been freed. > > Good point. > > If we go forward with this approach the unlock should perhaps be moved > to just before the kfree(). > > > There > should be no concurrent users here, since we are freeing the data structure. > If there still are concurrent users at this point, they will crash anyway. > So the locking is unnecessary. > > The users may well crash, as does the kernel unfortunately. > > We only get to kfd_smi_ev_release when the file descriptor is closed. User > mode has no way to use the client any more at this point. This function also > removes the client from the dev->smi_clients list. So no more events will > be added to the client. 
Therefore it is safe to free the client. > > If any of the above were not true, it would not be safe to kfree(client). > > But if it is safe to kfree(client), then there is no need for the locking. > > I'm not keen to go into too much detail until it's been patched. > > However, there is a way to free the client while it is still in use. > > Remember we are multi-threaded. > >files_struct->count refcount is used to handle this race, as >vfs_read/vfs_write takes file refcount and fput calls release only if >refcount is 1, to guarantee that read/write from user space is finished >here. > >Another race is driver add_event_to_kfifo while closing the handler. We >use rcu_read_lock in add_event_to_kfifo, and kfd_smi_ev_release calls >synchronize_rcu to wait for all rcu_read done. So it is safe to call >kfifo_free(&client->fifo) and kfree(client). Philip, please reach out to Felix. We have discussed this in more detail off-line. -- Lee Jones [李琼斯] Principal Technical Lead - Developer Services Linaro.org │ Open source software for Arm SoCs Follow Linaro: Facebook | Twitter | Blog
[PATCH 0/2] Add DP MST DSC support to i915
Currently we have only DSC support for DP SST. Stanislav Lisovskiy (2): drm: Add missing DP DSC extended capability definitions. drm/i915: Add DSC support to MST path drivers/gpu/drm/dp/drm_dp.c | 25 drivers/gpu/drm/i915/display/intel_dp.c | 138 -- drivers/gpu/drm/i915/display/intel_dp.h | 17 +++ drivers/gpu/drm/i915/display/intel_dp_mst.c | 146 +++- include/drm/dp/drm_dp_helper.h | 11 +- 5 files changed, 320 insertions(+), 17 deletions(-) -- 2.24.1.485.gad05a3d8e5
[PATCH 2/2] drm/i915: Add DSC support to MST path
Whenever we are not able to get enough timeslots for required PBN, let's try to allocate those using DSC, the same way as we do for SST. These patches are still experimental, i.e. not for merging; they still need to be tested with a proper DSC display. Submitting them to check that nothing else blows up, at least. v2: Add DSC checks to intel_dp_mst_mode_valid_ctx, similar to ones we have in intel_dp_mode_valid(Manasi Navare) v3: Removed redundant edp condition logic from MST DSC handling(Manasi Navare) v4: - Fixed forgotten force_dsc_en condition which was always enabled for testing purposes(Manasi Navare) - Properly process ret == EDEADLK, thus fixing the regression caused by WARN triggered with modeset_lock. v5: - Removed redundant check(Imre Deak) Acked-by: Imre Deak Signed-off-by: Stanislav Lisovskiy --- drivers/gpu/drm/i915/display/intel_dp.c | 138 -- drivers/gpu/drm/i915/display/intel_dp.h | 17 +++ drivers/gpu/drm/i915/display/intel_dp_mst.c | 146 +++- 3 files changed, 285 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 9e19165fd175..b04771e495cc 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -115,7 +115,6 @@ bool intel_dp_is_edp(struct intel_dp *intel_dp) } static void intel_dp_unset_edid(struct intel_dp *intel_dp); -static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 dsc_max_bpc); /* Is link rate UHBR and thus 128b/132b? 
*/ bool intel_dp_is_uhbr(const struct intel_crtc_state *crtc_state) @@ -667,11 +666,12 @@ small_joiner_ram_size_bits(struct drm_i915_private *i915) return 6144 * 8; } -static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915, - u32 link_clock, u32 lane_count, - u32 mode_clock, u32 mode_hdisplay, - bool bigjoiner, - u32 pipe_bpp) +u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915, + u32 link_clock, u32 lane_count, + u32 mode_clock, u32 mode_hdisplay, + bool bigjoiner, + u32 pipe_bpp, + u32 timeslots) { u32 bits_per_pixel, max_bpp_small_joiner_ram; int i; @@ -683,7 +683,7 @@ static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915, * for MST -> TimeSlotsPerMTP has to be calculated */ bits_per_pixel = (link_clock * lane_count * 8) / -intel_dp_mode_to_fec_clock(mode_clock); +(intel_dp_mode_to_fec_clock(mode_clock) * timeslots); drm_dbg_kms(&i915->drm, "Max link bpp: %u\n", bits_per_pixel); /* Small Joiner Check: output bpp <= joiner RAM (bits) / Horiz. width */ @@ -737,9 +737,9 @@ static u16 intel_dp_dsc_get_output_bpp(struct drm_i915_private *i915, return bits_per_pixel << 4; } -static u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp, - int mode_clock, int mode_hdisplay, - bool bigjoiner) +u8 intel_dp_dsc_get_slice_count(struct intel_dp *intel_dp, + int mode_clock, int mode_hdisplay, + bool bigjoiner) { struct drm_i915_private *i915 = dp_to_i915(intel_dp); u8 min_slice_count, i; @@ -902,8 +902,8 @@ intel_dp_mode_valid_downstream(struct intel_connector *connector, return MODE_OK; } -static bool intel_dp_need_bigjoiner(struct intel_dp *intel_dp, - int hdisplay, int clock) +bool intel_dp_need_bigjoiner(struct intel_dp *intel_dp, +int hdisplay, int clock) { struct drm_i915_private *i915 = dp_to_i915(intel_dp); @@ -990,7 +990,7 @@ intel_dp_mode_valid(struct drm_connector *connector, target_clock, mode->hdisplay, bigjoiner, - pipe_bpp) >> 4; + pipe_bpp, 1) >> 4; dsc_slice_count = intel_dp_dsc_get_slice_count(intel_dp, 
target_clock, @@ -1285,7 +1285,7 @@ intel_dp_compute_link_config_wide(struct intel_dp *intel_dp, return -EINVAL; } -static int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 max_req_bpc) +int intel_dp_dsc_compute_bpp(struct intel_dp *intel_dp, u8 max_req_bpc) { struct drm_i915_private *i915 = dp_to_i915(inte
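The arithmetic changed by the intel_dp_dsc_get_output_bpp() hunk above can be checked by hand with a small user-space sketch. It mirrors the expression as posted, including the division by timeslots, and says nothing about whether that is the final form, since the author describes the series as experimental. The helper names are hypothetical, and the FEC overhead factor of 972261 per million is an assumption about what intel_dp_mode_to_fec_clock() does, not something shown in the quoted diff.

```c
#include <stdint.h>

/* Assumption: usable bandwidth after FEC overhead, in parts per million. */
#define FEC_OVERHEAD_FACTOR 972261u

/* Hypothetical stand-in for intel_dp_mode_to_fec_clock(). */
static uint32_t mode_to_fec_clock(uint32_t mode_clock)
{
    return (uint32_t)(((uint64_t)mode_clock * 1000000u) / FEC_OVERHEAD_FACTOR);
}

/*
 * Mirrors the posted expression:
 *   (link_clock * lane_count * 8) / (fec_clock(mode_clock) * timeslots)
 * Returns 0 when the divisor would be zero. Inputs use the same units
 * as the patch.
 */
static uint32_t max_link_bpp(uint32_t link_clock, uint32_t lane_count,
                             uint32_t mode_clock, uint32_t timeslots)
{
    uint64_t num = (uint64_t)link_clock * lane_count * 8;
    uint64_t den = (uint64_t)mode_to_fec_clock(mode_clock) * timeslots;

    return den ? (uint32_t)(num / den) : 0;
}
```

With timeslots set to 1 the result matches the SST call site in the same patch, which passes 1 as the new last argument.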
Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
Hi Daniel,

On 3/17/22 14:28, Daniel Dadap wrote:
>
>> On Mar 17, 2022, at 07:17, Hans de Goede wrote:
>>
>> Hi,
>>
>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>> Some notebook systems with EC-driven backlight control appear to have a
>>> firmware bug which causes the system to use GPU-driven backlight control
>>> upon a fresh boot, but then switch to EC-driven backlight control
>>> after completing a suspend/resume cycle. All the while, the firmware
>>> reports that the backlight is under EC control, regardless of what is
>>> actually controlling the backlight brightness.
>>>
>>> This leads to the following behavior:
>>>
>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>   WMI-wrapped ACPI method erroneously reporting EC control.
>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>   cycle, due to the backlight control actually being GPU-driven.
>>> * GPU drivers also register their own backlight handlers: in the case
>>>   of the notebook system where this behavior has been observed, both
>>>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>   case observed so far) can successfully control the backlight through
>>>   its backlight driver's sysfs interface, but stops working after the
>>>   first suspend/resume cycle.
>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>   fresh boot, but begins to work after the first suspend/resume cycle.
>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>   is not able to control the backlight at any point while the system
>>>   is in operation. On similar hybrid systems with an EC-controlled
>>>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>   does not register its backlight handler. It has not been determined
>>>   whether the non-functional handler registered by the NVIDIA driver
>>>   is due to another firmware bug, or a bug in the NVIDIA driver.
>>>
>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>> by the GPU drivers. This in turn leads to backlight control appearing
>>> to be non-functional until after completing a suspend/resume cycle.
>>> However, it is still possible to control the backlight through direct
>>> interaction with the working GPU driver's backlight sysfs interface.
>>>
>>> These systems also appear to have a second firmware bug which resets
>>> the EC's brightness level to 100% on resume, but leaves the state in
>>> the kernel at the pre-suspend level. This causes attempts to save
>>> and restore the backlight level across the suspend/resume cycle to
>>> fail, due to the level appearing not to change even though it did.
>>>
>>> In order to work around these issues, add a quirk table to detect
>>> systems that are known to show these behaviors. So far, there is
>>> only one known system that requires these workarounds, and both
>>> issues are present on that system, but the quirks are tracked
>>> separately to make it easier to add them to other systems which
>>> may exhibit one of the bugs, but not the other. The original systems
>>> that this driver was tested on during development do not exhibit
>>> either of these quirks.
>>>
>>> If a system with the "GPU driver has backlight control" quirk is
>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>> (when freshly booted) GPU backlight handler and relay any backlight
>>> brightness level change requests directed at the EC to also be applied
>>> to the GPU backlight interface. This leads to redundant updates
>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>> it does allow the EC backlight control to work when the system is
>>> freshly booted.
>>
>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>> plans to clean-up the whole x86 backlight mess soon and an important part
>> of that is to stop registering multiple backlight interfaces for the
>> same panel/screen.
>>
>> Whereas going with this workaround requires us to have 2
>> backlight interfaces active. Also this will very likely lead to
>> (subtly) different backlight behavior before and after the first
>> suspend/resume.
>
> I understand. Having multiple backlight devices for the same panel is indeed
> annoying. Out of curiosity, what is the plan for determining that multiple
> backlight interfaces are all supposed to control the same panel?

ATM the kernel basically only supports a bunch of different methods to control the backlight of 1 internal panel.

The plan is to tie this to the panel from a userspace pov by making the brightness + max_brightness properties on the drm_connector object for the internal-panel.

The in-kernel tying of the backlight device to the internal panel will be done hardcoded inside the drm driver(s) based on the drivers already
Re: [PATCH 2/3] drm/msm/gpu: Park scheduler threads for system suspend
On 17.03.22 at 17:18, Rob Clark wrote:

On Thu, Mar 17, 2022 at 9:04 AM Christian König wrote:

On 17.03.22 at 16:10, Rob Clark wrote:

[SNIP]

userspace frozen != kthread frozen .. that is what this patch is trying to address, so we aren't racing between shutting down the hw and the scheduler shoveling more jobs at us.

Well exactly that's the problem. The scheduler is supposed to keep shoveling more jobs at us until it is empty. Thinking more about it we will then keep some dma_fence instance unsignaled and that is an extremely bad idea since it can lead to deadlocks during suspend.

Hmm, perhaps that is true if you need to migrate things out of vram? It is at least not a problem when vram is not involved.

No, it's much wider than that. See what can happen is that the memory management shrinkers want to wait for a dma_fence during suspend. And if you stop the scheduler they will just wait forever. What you need to do instead is to drain the scheduler, e.g. call drm_sched_entity_flush() with a proper timeout for each entity you have created.

Regards, Christian.

So this patch here is an absolutely clear NAK from my side. If amdgpu is doing something similar that is a severe bug and needs to be addressed somehow.

I think amdgpu's use of kthread_park is not related to suspend, but didn't look too closely. And perhaps the solution for this problem is more complex in the case of amdgpu, I'm not super familiar with the constraints there. But I think it is a fine solution for integrated GPUs.

BR, -R

Regards, Christian.

BR, -R
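Editorial note: the "drain instead of park" suggestion in the reply above can be pictured with a small user-space model. This is a sketch under stated assumptions, not driver code: struct mock_entity and mock_entity_flush() are invented stand-ins for the kernel's struct drm_sched_entity and drm_sched_entity_flush(), and the model assumes only that a flush takes an entity plus a timeout and reports 0 when the timeout expired before the entity went idle. The -1 return and the tick arithmetic are likewise made up for the demo.

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-in for struct drm_sched_entity; fields exist only for this demo. */
struct mock_entity {
	long pending_jobs;  /* jobs still queued on this entity */
	long ticks_per_job; /* pretend cost of retiring one job */
};

/*
 * Invented stand-in for drm_sched_entity_flush(entity, timeout).
 * Returns the remaining timeout once the entity is idle, or 0 if the
 * queue could not be retired in time (the demo then leaves it untouched).
 */
static long mock_entity_flush(struct mock_entity *e, long timeout)
{
	long needed;

	assert(e->ticks_per_job > 0);
	needed = e->pending_jobs * e->ticks_per_job;
	if (needed >= timeout)
		return 0;

	e->pending_jobs = 0;
	return timeout - needed;
}

/*
 * Drain every entity before suspending. Returns 0 when all queues are
 * empty, -1 when one of them timed out and suspend should be aborted.
 */
static int drain_entities(struct mock_entity *ents, size_t n, long timeout_each)
{
	for (size_t i = 0; i < n; i++) {
		if (mock_entity_flush(&ents[i], timeout_each) == 0)
			return -1;
	}
	return 0;
}
```

The point of the model is the ordering: queued jobs are allowed to finish, so their fences signal, and only then does the hardware go down. Parking the scheduler thread would instead leave those fences pending, which is the deadlock risk described above.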
[PATCH v2 0/3] drm/msm: Add comm/cmdline override
From: Rob Clark

Add a way to override comm/cmdline per-drm_file. This is useful for VM scenarios where the host process is just a proxy for the actual guest process.

Rob Clark (3):
  drm/msm: Add support for pointer params
  drm/msm: Split out helper to get comm/cmdline
  drm/msm: Add a way to override processes comm/cmdline

 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 49 -
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 4 +-
 drivers/gpu/drm/msm/msm_drv.c | 8 ++--
 drivers/gpu/drm/msm/msm_gpu.c | 40
 drivers/gpu/drm/msm/msm_gpu.h | 10 -
 drivers/gpu/drm/msm/msm_rd.c | 5 ++-
 drivers/gpu/drm/msm/msm_submitqueue.c | 2 +
 include/uapi/drm/msm_drm.h | 4 ++
 8 files changed, 94 insertions(+), 28 deletions(-)

-- 
2.35.1
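Editorial note: the first patch in this series ("Add support for pointer params") adds a length next to the existing 64-bit value and, for now, rejects any request whose length or padding is non-zero. The user-space sketch below mirrors only those checks. The struct layout, the name demo_param, and the DEMO_PIPE_3D0 value are assumptions made for illustration; the real uapi definition is not shown in this excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MSM_PIPE_3D0; the value is made up. */
#define DEMO_PIPE_3D0 0x10u

/* Assumed layout: only the field names come from the patch's use of args->... */
struct demo_param {
	uint32_t pipe;
	uint32_t param;
	uint64_t value; /* immediate, or a user pointer when len != 0 */
	uint32_t len;   /* 0 for immediates, byte count for pointer params */
	uint32_t pad;   /* must be zero */
};

/*
 * Mirror of the checks the patch adds: unknown pipe or non-zero padding
 * is rejected, and so is any pointer parameter ("No pointer params yet").
 */
static int demo_check_param(const struct demo_param *args)
{
	assert(args != NULL);

	if (args->pipe != DEMO_PIPE_3D0 || args->pad != 0)
		return -EINVAL;

	if (args->len != 0)
		return -EINVAL;

	return 0;
}
```

Rejecting non-zero pad and len from the start keeps those fields usable later: a caller that passed garbage would already have been getting -EINVAL, so giving the fields a meaning cannot silently change behavior for existing callers.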
[PATCH v2 1/3] drm/msm: Add support for pointer params
From: Rob Clark

The 64b value field is already sufficient to hold a pointer instead of
immediate, but we also need a length field.

Signed-off-by: Rob Clark
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 12 ++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 4 ++--
 drivers/gpu/drm/msm/msm_drv.c | 8
 drivers/gpu/drm/msm/msm_gpu.h | 4 ++--
 drivers/gpu/drm/msm/msm_rd.c | 5 +++--
 include/uapi/drm/msm_drm.h | 2 ++
 6 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 9efc84929be0..3d307b34854d 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -229,10 +229,14 @@ adreno_iommu_create_address_space(struct msm_gpu *gpu,
 }
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-		     uint32_t param, uint64_t *value)
+		     uint32_t param, uint64_t *value, uint32_t *len)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	/* No pointer params yet */
+	if (*len != 0)
+		return -EINVAL;
+
 	switch (param) {
 	case MSM_PARAM_GPU_ID:
 		*value = adreno_gpu->info->revn;
@@ -284,8 +288,12 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 }
 int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-		     uint32_t param, uint64_t value)
+		     uint32_t param, uint64_t value, uint32_t len)
 {
+	/* No pointer params yet */
+	if (len != 0)
+		return -EINVAL;
+
 	switch (param) {
 	case MSM_PARAM_SYSPROF:
 		if (!capable(CAP_SYS_ADMIN))
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 0490c5fbb780..ab3b5ef80332 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -281,9 +281,9 @@ static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
 }
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-		     uint32_t param, uint64_t *value);
+		     uint32_t param, uint64_t *value, uint32_t *len);
 int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
-		     uint32_t param, uint64_t value);
+		     uint32_t param, uint64_t value, uint32_t len);
 const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
 		const char *fwname);
 struct drm_gem_object *adreno_fw_create_bo(struct msm_gpu *gpu,
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index 780f9748aaaf..a5eed5738ac8 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -610,7 +610,7 @@ static int msm_ioctl_get_param(struct drm_device *dev, void *data,
 	/* for now, we just have 3d pipe.. eventually this would need to
 	 * be more clever to dispatch to appropriate gpu module:
 	 */
-	if (args->pipe != MSM_PIPE_3D0)
+	if ((args->pipe != MSM_PIPE_3D0) || (args->pad != 0))
 		return -EINVAL;
 	gpu = priv->gpu;
@@ -619,7 +619,7 @@ static int msm_ioctl_get_param(struct drm_device *dev, void *data,
 		return -ENXIO;
 	return gpu->funcs->get_param(gpu, file->driver_priv,
-				     args->param, &args->value);
+				     args->param, &args->value, &args->len);
 }
 static int msm_ioctl_set_param(struct drm_device *dev, void *data,
@@ -629,7 +629,7 @@ static int msm_ioctl_set_param(struct drm_device *dev, void *data,
 	struct drm_msm_param *args = data;
 	struct msm_gpu *gpu;
-	if (args->pipe != MSM_PIPE_3D0)
+	if ((args->pipe != MSM_PIPE_3D0) || (args->pad != 0))
 		return -EINVAL;
 	gpu = priv->gpu;
@@ -638,7 +638,7 @@ static int msm_ioctl_set_param(struct drm_device *dev, void *data,
 		return -ENXIO;
 	return gpu->funcs->set_param(gpu, file->driver_priv,
-				     args->param, args->value);
+				     args->param, args->value, args->len);
 }
 static int msm_ioctl_gem_new(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index a84140055920..c28c2ad9f52e 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -44,9 +44,9 @@ struct msm_gpu_config {
  */
 struct msm_gpu_funcs {
 	int (*get_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
-			 uint32_t param, uint64_t *value);
+			 uint32_t param, uint64_t *value, uint32_t *len);
 	int (*set_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
-			 uint32_t param, uint64_t value);
+