Re: [PATCH v3 02/15] drm/rockchip: Set dma mask to 64 bit
On 2024-10-17 09:06, Andy Yan wrote:
>
> Hi Robin,
>
> Thanks for your comment.
>
> At 2024-10-17 01:38:23, "Robin Murphy" wrote:
>> On 2024-09-20 9:20 am, Andy Yan wrote:
>>> From: Andy Yan
>>>
>>> The vop mmu support translate physical address upper 4 GB to iova
>>> below 4 GB. So set dma mask to 64 bit to indicate we support address 4GB.
>>>
>>> This can avoid warning message like this on some boards with DDR 4 GB:
>>>
>>> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes),
>>> total 32768 (slots), used 130 (slots)
>>> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes),
>>> total 32768 (slots), used 0 (slots)
>>> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes),
>>> total 32768 (slots), used 130 (slots)
>>> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes),
>>> total 32768 (slots), used 130 (slots)
>>> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes),
>>> total 32768 (slots), used 0 (slots)
>>
>> There are several things wrong with this...
>>
>> AFAICS the VOP itself still only supports 32-bit addresses, so the VOP
>> driver should only be setting a 32-bit DMA mask. The IOMMUs support
>> either 32-bit or 40-bit addresses, and the IOMMU driver does set its DMA
> Does that mean we can only use the dev of IOMMU ? If that is true, would you
> please give some inspiration on how to implement this? Or is there any other
> driver I can follow. Very sorry for that I'm not familiar with memory
> management and the IOMMU.
>
>
>> mask appropriately. None of those numbers is 64, so that's clearly
>> suspicious already. Plus it would seem the claim of the IOMMU being able
>> to address >4GB isn't strictly true for RK3288 (which does supposedly
>> support 8GB of RAM).
>
> We can set DMA mask per device if we can find a right way to do it.

Removing the use of custom rockchip_drm_gem and use the common gem dma
fops should also allow import of framebuffers in >4GB address.

I played around with that [1] last year but never took it further
because it broke multiple VOPs/IOMMUs on e.g. rk3288. Only IOMMU dte
address handling fixes for >4GB support was sent and got merged.

When I tested [1] on an RK3568 back then it was possible to import video
framebuffers located in >4GB memory and display them on screen without a
spam of "swiotlb buffer is full" lines.

Maybe there is some part of the current custom rockchip_drm_gem code
that can be adjusted to work closer to the common gem dma fops, or maybe
fully drop rockchip_drm_gem in favor of common gem dma fops could be an
alternative solution?

[1] https://github.com/Kwiboo/linux-rockchip/commit/70695c8f868adec630592fef536364e59793de81

Regards,
Jonas

>
>>
>> Furthermore, the "display-subsystem" doesn't even exist - it does not
>> represent any actual DMA-capable hardware, so it should not have a DMA
>> mask, and it should not be used for DMA API operations. Buffers for the
>> VOP should be DMA-mapped for the VOP device itself. At the very least
>> the rockchip_gem_alloc_dma() path is clearly broken otherwise (I guess
>> this patch possibly *would* make that brokenness apparent).
>>
>>> Signed-off-by: Andy Yan
>>> Tested-by: Derek Foreman
>>> ---
>>>
>>> (no changes since v1)
>>>
>>> drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
>>> b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
>>> index 04ef7a2c3833..8bc2ff3b04bb 100644
>>> --- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
>>> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
>>> @@ -445,7 +445,9 @@ static int rockchip_drm_platform_probe(struct
>>> platform_device *pdev)
>>>  		return ret;
>>>  	}
>>>
>>> -	return 0;
>>> +	ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(64));
>>
>> Finally as a general thing, please don't misuse
>> dma_coerce_mask_and_coherent() in platform drivers, just use normal
>> dma_set_mask_and_coherent(). The platform bus code has been initialising
>> the dev->dma_mask pointer for years now, drivers should not be messing
>> with it any more.
>
> Got it, thanks again.
>
>>
>> Thanks,
>> Robin.
>>
>>> +
>>> +	return ret;
>>>  }
>>>
>>>  static void rockchip_drm_platform_remove(struct platform_device *pdev)
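
For readers less familiar with DMA masks, the arithmetic behind Robin's point can be sketched in plain userspace C. The macro has the same shape as the kernel's DMA_BIT_MASK(); buffer_fits_mask() is a hypothetical helper written only for this illustration and is not a kernel API. A buffer located above 4 GB does not fit a 32-bit mask (which is when the DMA layer may fall back to swiotlb bounce buffers), while it does fit a 40-bit mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(): the low n bits set. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper (not a kernel API): true if the whole buffer
 * [addr, addr + size) is addressable under the given DMA mask.
 */
static bool buffer_fits_mask(uint64_t addr, size_t size, uint64_t mask)
{
	if (size == 0)
		return addr <= mask;
	/* Guard against wrap-around before checking the last byte. */
	if (addr > UINT64_MAX - (size - 1))
		return false;
	return addr + (size - 1) <= mask;
}
```

With these definitions, a 266240-byte buffer at 0x100000000 fails the check for DMA_BIT_MASK(32) and passes it for DMA_BIT_MASK(40).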
Re: [PATCH 9/9] drm/panfrost: Explicitly clean up panfrost fence
On 16.10.24 at 18:43, Adrián Larumbe wrote:
> On 16.10.2024 15:12, Christian König wrote:
>> On 15.10.24 at 01:31, Adrián Larumbe wrote:
>>> Doesn't make any functional difference because generic dma_fence is the
>>> first panfrost_fence structure member, but I guess it doesn't hurt either.
>>
>> As discussed with Sima we want to push into the exactly opposite direction
>> because that requires that the panfrost module stays loaded as long as
>> fences are around.
>
> Does that mean in future commits the struct dma_fence_ops' .release pointer
> will be done with altogether?

Yes, exactly that's the idea. As a first step I'm preparing patches right
now to enforce using kmalloc instead of driver brewed approaches for
dma_fence handling.

Regards,
Christian.

>> So clearly a NAK to this one here. Rather document on the structure that
>> the dma_fence structure must be the first member.
>>
>> Regards,
>> Christian.
>>
>>> Signed-off-by: Adrián Larumbe
>>> ---
>>>  drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 5d83c6a148ec..fa219f719bdc 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -85,9 +85,15 @@ static const char *panfrost_fence_get_timeline_name(struct dma_fence *fence)
>>>  	}
>>>  }
>>>
>>> +static void panfrost_fence_release(struct dma_fence *fence)
>>> +{
>>> +	kfree(to_panfrost_fence(fence));
>>> +}
>>> +
>>>  static const struct dma_fence_ops panfrost_fence_ops = {
>>>  	.get_driver_name = panfrost_fence_get_driver_name,
>>>  	.get_timeline_name = panfrost_fence_get_timeline_name,
>>> +	.release = panfrost_fence_release,
>>>  };
>>>
>>>  static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, int js_num)
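
The "must be the first member" requirement mentioned in this thread can be checked at compile time as well as documented. Below is a minimal userspace sketch with stand-in types; model_fence and model_panfrost_fence are hypothetical and are not the kernel structures. When the embedded base sits at offset 0, the base pointer and the container pointer are the same address, which is why freeing through the base pointer releases the whole object.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct dma_fence / struct panfrost_fence (hypothetical). */
struct model_fence {
	unsigned long seqno;
};

struct model_panfrost_fence {
	struct model_fence base;	/* must stay the first member */
	int queue;
};

/* Fails the build if someone moves 'base' away from offset 0. */
static_assert(offsetof(struct model_panfrost_fence, base) == 0,
	      "base fence must be the first member");

/* Userspace equivalent of a container_of()-style downcast. */
static struct model_panfrost_fence *
to_model_panfrost_fence(struct model_fence *f)
{
	return (struct model_panfrost_fence *)((char *)f -
			offsetof(struct model_panfrost_fence, base));
}
```

A compile-time check of this kind would turn an accidental reordering of the structure into a build error rather than a silent bug, assuming the project wants that guarantee enforced in code and not only in a comment.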
[PULL] drm-xe-next
Dave, Simona

This week's -next PR. Note the implicit fencing uapi fix.

Thanks,
Thomas

drm-xe-next-2024-10-17:
UAPI Changes:
- (Implicit) Fix the exec unnecessary implicit fencing (Matt Brost)

Driver Changes:
- Fix an inverted if statement (Colin)
- Fixes around display d3cold vs non-d3cold runtime pm (Imre)
- A couple of scheduling fixes (Matt Brost)
- Increase a query timestamp width (Lucas)
- Move a timestamp read (Lucas)
- Tidy some code using multiple put_user() (Lucas)
- Fix an ufence signaling error (Nirmoy)
- Initialize the ufence.signalled field (Matt Auld)
- Display fb alignment work (Juha-Pekka)
- Disallow horizontal flip with tile4 + display20 (Juha-Pekka)
- Extend a workaround (Shekhar)
- Enlarge the global invalidation timeout (Shuicheng)

The following changes since commit a187c1b0a800565a4db6372268692aff99df7f53:

  drm/xe: fix unbalanced rpm put() with declare_wedged() (2024-10-10 09:15:59 +0100)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-next-2024-10-17

for you to fetch changes up to 2eb460ab9f4bc5b575f52568d17936da0af681d8:

  drm/xe: Enlarge the invalidation timeout from 150 to 500 (2024-10-16 16:11:10 +0100)

UAPI Changes:
- (Implicit) Fix the exec unnecessary implicit fencing (Matt Brost)

Driver Changes:
- Fix an inverted if statement (Colin)
- Fixes around display d3cold vs non-d3cold runtime pm (Imre)
- A couple of scheduling fixes (Matt Brost)
- Increase a query timestamp width (Lucas)
- Move a timestamp read (Lucas)
- Tidy some code using multiple put_user() (Lucas)
- Fix an ufence signaling error (Nirmoy)
- Initialize the ufence.signalled field (Matt Auld)
- Display fb alignment work (Juha-Pekka)
- Disallow horizontal flip with tile4 + display20 (Juha-Pekka)
- Extend a workaround (Shekhar)
- Enlarge the global invalidation timeout (Shuicheng)

Colin Ian King (1):
      drm/xe/guc: Fix inverted logic on snapshot->copy check

Imre Deak (2):
      drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling
      drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume

Juha-Pekka Heikkila (3):
      drm/xe: add interface to request physical alignment for buffer objects
      drm/xe/display: align framebuffers according to hw requirements
      drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater

Lucas De Marchi (3):
      drm/xe/query: Increase timestamp width
      drm/xe/query: Move timestamp reg to hwe_read_timestamp()
      drm/xe/query: Tidy up error EFAULT returns

Matthew Auld (1):
      drm/xe/xe_sync: initialise ufence.signalled

Matthew Brost (3):
      drm/xe: Take job list lock in xe_sched_add_pending_job
      drm/xe: Don't free job in TDR
      drm/xe: Use bookkeep slots for external BO's in exec IOCTL

Nirmoy Das (1):
      drm/xe/ufence: ufence can be signaled right after wait_woken

Shekhar Chauhan (1):
      drm/xe/xe3lpg: Extend Wa_18034896535 to Xe3_LPG.

Shuicheng Lin (1):
      drm/xe: Enlarge the invalidation timeout from 150 to 500

 drivers/gpu/drm/i915/display/intel_fb.c            | 13 +
 drivers/gpu/drm/i915/display/intel_fb.h            |  1 +
 drivers/gpu/drm/i915/display/skl_universal_plane.c | 11 +
 .../xe/compat-i915-headers/gem/i915_gem_stolen.h   |  2 +-
 drivers/gpu/drm/xe/display/xe_display.c            | 20 ++--
 drivers/gpu/drm/xe/display/xe_fb_pin.c             | 57 +-
 drivers/gpu/drm/xe/xe_bo.c                         | 29 ---
 drivers/gpu/drm/xe/xe_bo.h                         |  8 ++-
 drivers/gpu/drm/xe/xe_bo_types.h                   |  5 ++
 drivers/gpu/drm/xe/xe_device.c                     |  2 +-
 drivers/gpu/drm/xe/xe_exec.c                       | 12 ++---
 drivers/gpu/drm/xe/xe_ggtt.c                       |  2 +-
 drivers/gpu/drm/xe/xe_gpu_scheduler.h              |  2 +
 drivers/gpu/drm/xe/xe_guc_log.c                    |  2 +-
 drivers/gpu/drm/xe/xe_guc_submit.c                 |  7 ++-
 drivers/gpu/drm/xe/xe_query.c                      | 42 ++--
 drivers/gpu/drm/xe/xe_sync.c                       |  2 +-
 drivers/gpu/drm/xe/xe_wa.c                         |  5 ++
 drivers/gpu/drm/xe/xe_wait_user_fence.c            |  3 --
 19 files changed, 147 insertions(+), 78 deletions(-)
[PATCH v2] drm/i915/lspcon: do not hardcode settle timeout
Avoid hardcoding the LSPCON settle timeout because it takes a longer
time on certain chips made by certain vendors. Use the function that
already exists to determine the timeout.

Reviewed-by: Ankit Nautiyal
Signed-off-by: Giedrius Statkevičius
---
v2: add documentation about the parameter, apply 80 character line
length limit.

 drivers/gpu/drm/display/drm_dp_dual_mode_helper.c | 4 ++--
 drivers/gpu/drm/i915/display/intel_lspcon.c       | 3 ++-
 include/drm/display/drm_dp_dual_mode_helper.h     | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/display/drm_dp_dual_mode_helper.c b/drivers/gpu/drm/display/drm_dp_dual_mode_helper.c
index 14a2a8473682..d14b262b2344 100644
--- a/drivers/gpu/drm/display/drm_dp_dual_mode_helper.c
+++ b/drivers/gpu/drm/display/drm_dp_dual_mode_helper.c
@@ -486,16 +486,16 @@ EXPORT_SYMBOL(drm_lspcon_get_mode);
  * @dev: &drm_device to use
  * @adapter: I2C-over-aux adapter
  * @mode: required mode of operation
+ * @time_out: LSPCON mode change settle timeout
  *
  * Returns:
  * 0 on success, -error on failure/timeout
  */
 int drm_lspcon_set_mode(const struct drm_device *dev, struct i2c_adapter *adapter,
-			enum drm_lspcon_mode mode)
+			enum drm_lspcon_mode mode, int time_out)
 {
 	u8 data = 0;
 	int ret;
-	int time_out = 200;
 	enum drm_lspcon_mode current_mode;
 
 	if (mode == DRM_LSPCON_MODE_PCON)
diff --git a/drivers/gpu/drm/i915/display/intel_lspcon.c b/drivers/gpu/drm/i915/display/intel_lspcon.c
index f9db867fae89..30c31fddec99 100644
--- a/drivers/gpu/drm/i915/display/intel_lspcon.c
+++ b/drivers/gpu/drm/i915/display/intel_lspcon.c
@@ -211,7 +211,8 @@ static int lspcon_change_mode(struct intel_lspcon *lspcon,
 		return 0;
 	}
 
-	err = drm_lspcon_set_mode(intel_dp->aux.drm_dev, ddc, mode);
+	err = drm_lspcon_set_mode(intel_dp->aux.drm_dev, ddc, mode,
+				  lspcon_get_mode_settle_timeout(lspcon));
 	if (err < 0) {
 		drm_err(display->drm, "LSPCON mode change failed\n");
 		return err;
diff --git a/include/drm/display/drm_dp_dual_mode_helper.h b/include/drm/display/drm_dp_dual_mode_helper.h
index 7ee482265087..7ac6969db935 100644
--- a/include/drm/display/drm_dp_dual_mode_helper.h
+++ b/include/drm/display/drm_dp_dual_mode_helper.h
@@ -117,5 +117,5 @@ const char *drm_dp_get_dual_mode_type_name(enum drm_dp_dual_mode_type type);
 int drm_lspcon_get_mode(const struct drm_device *dev, struct i2c_adapter *adapter,
 			enum drm_lspcon_mode *current_mode);
 int drm_lspcon_set_mode(const struct drm_device *dev, struct i2c_adapter *adapter,
-			enum drm_lspcon_mode reqd_mode);
+			enum drm_lspcon_mode reqd_mode, int time_out);
 #endif
-- 
2.47.0
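
The effect of turning the settle timeout into a parameter can be illustrated with a small userspace model. Everything in it is hypothetical (fake_lspcon, model_set_mode(), the 10 ms poll step); it is not the drm_lspcon_set_mode() implementation, only the general poll-until-timeout pattern with a caller-supplied budget.

```c
#include <errno.h>

/* Hypothetical device model: reports the new mode after 'settle_ms'. */
struct fake_lspcon {
	int settle_ms;	/* time the chip needs to switch modes */
	int elapsed_ms;	/* simulated time spent polling so far */
};

#define POLL_STEP_MS 10	/* assumed poll interval for this sketch */

/* Returns 0 once the mode has settled, -ETIMEDOUT if the budget runs out. */
static int model_set_mode(struct fake_lspcon *dev, int time_out_ms)
{
	while (dev->elapsed_ms < dev->settle_ms) {
		if (time_out_ms <= 0)
			return -ETIMEDOUT;
		/* A real driver would sleep here; the model just counts. */
		dev->elapsed_ms += POLL_STEP_MS;
		time_out_ms -= POLL_STEP_MS;
	}
	return 0;
}
```

In this model, a chip that needs 400 ms to settle times out with a fixed 200 ms budget but succeeds when the caller passes a larger, chip-specific value, which is the situation the commit message describes.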
Re: [PATCH v7 1/5] drm: Introduce device wedged event
On 17.10.24 at 04:47, Raag Jadav wrote:
> On Mon, Sep 30, 2024 at 01:08:41PM +0530, Raag Jadav wrote:
>> Introduce device wedged event, which will notify userspace of wedged
>> (hanged/unusable) state of the DRM device through a uevent. This is
>> useful especially in cases where the device is no longer operating as
>> expected even after a hardware reset and has become unrecoverable from
>> driver context.

Well introduce is probably the wrong wording since i915 already has that
and amdgpu looked into it but never upstreamed the support.

I would rather say standardize.

>> Purpose of this implementation is to provide drivers a generic way to
>> recover with the help of userspace intervention. Different drivers may
>> have different ideas of a "wedged device" depending on their hardware
>> implementation, and hence the vendor agnostic nature of the event.
>> It is up to the drivers to decide when they see the need for recovery
>> and how they want to recover from the available methods.
>>
>> Current implementation defines three recovery methods, out of which,
>> drivers can choose to support any one or multiple of them. Preferred
>> recovery method will be sent in the uevent environment as WEDGED=.
>> Userspace consumers (sysadmin) can define udev rules to parse this event
>> and take respective action to recover the device.
>>
>>     =============== ==================================
>>     Recovery method Consumer expectations
>>     =============== ==================================
>>     rebind          unbind + rebind driver
>>     bus-reset       unbind + reset bus device + rebind
>>     reboot          reboot system
>>     =============== ==================================

Well that sounds like userspace would need to be involved in recovery.

That in turn is a complete no-go since we at least need to signal all
dma_fences to unblock the kernel. In other words things like bus reset
needs to happen inside the kernel and *not* in userspace.

What we can do is to signal to userspace: Hey a bus reset of device X
happened, maybe restart container, daemon, whatever service which was
using this device.

Regards,
Christian.

>> v4: s/drm_dev_wedged/drm_dev_wedged_event
>>     Use drm_info() (Jani)
>>     Kernel doc adjustment (Aravind)
>> v5: Send recovery method with uevent (Lina)
>> v6: Access wedge_recovery_opts[] using helper function (Jani)
>>     Use snprintf() (Jani)
>> v7: Convert recovery helpers into regular functions (Andy, Jani)
>>     Aesthetic adjustments (Andy)
>>     Handle invalid method cases
>>
>> Signed-off-by: Raag Jadav
>> ---
>
> Cc'ing amd, collabora and others as I found semi-related work at
>
> https://lore.kernel.org/dri-devel/20230627132323.115440-1-andrealm...@igalia.com/
> https://lore.kernel.org/amd-gfx/20240725150055.1991893-1-alexander.deuc...@amd.com/
> https://lore.kernel.org/dri-devel/20241011225906.3789965-3-adrian.laru...@collabora.com/
> https://lore.kernel.org/amd-gfx/CAAxE2A5v_RkZ9ex4=7jibskvb22_1faj0aanbcmktett5c3...@mail.gmail.com/
>
> Please share feedback about usefulness and adoption of this.
>
> Improvements are welcome.
>
> Raag
>
>>  drivers/gpu/drm/drm_drv.c | 77 +++
>>  include/drm/drm_device.h  | 23
>>  include/drm/drm_drv.h     |  3 ++
>>  3 files changed, 103 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
>> index ac30b0ec9d93..cfe9600da2ee 100644
>> --- a/drivers/gpu/drm/drm_drv.c
>> +++ b/drivers/gpu/drm/drm_drv.c
>> @@ -26,6 +26,8 @@
>>   * DEALINGS IN THE SOFTWARE.
>>   */
>>
>> +#include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -33,6 +35,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>
>> @@ -70,6 +73,42 @@ static struct dentry *drm_debugfs_root;
>>
>>  DEFINE_STATIC_SRCU(drm_unplug_srcu);
>>
>> +/*
>> + * Available recovery methods for wedged device. To be sent along with device
>> + * wedged uevent.
>> + */
>> +static const char *const drm_wedge_recovery_opts[] = {
>> +	[DRM_WEDGE_RECOVERY_REBIND] = "rebind",
>> +	[DRM_WEDGE_RECOVERY_BUS_RESET] = "bus-reset",
>> +	[DRM_WEDGE_RECOVERY_REBOOT] = "reboot",
>> +};
>> +
>> +static bool drm_wedge_recovery_is_valid(enum drm_wedge_recovery method)
>> +{
>> +	static_assert(ARRAY_SIZE(drm_wedge_recovery_opts) == DRM_WEDGE_RECOVERY_MAX);
>> +
>> +	return method >= DRM_WEDGE_RECOVERY_REBIND && method < DRM_WEDGE_RECOVERY_MAX;
>> +}
>> +
>> +/**
>> + * drm_wedge_recovery_name - provide wedge recovery name
>> + * @method: method to be used for recovery
>> + *
>> + * This validates wedge recovery @method against the available ones in
>> + * drm_wedge_recovery_opts[] and provides respective recovery name in string
>> + * format if found valid.
>> + *
>> + * Returns: pointer to const recovery string on success, NULL otherwise.
>> + */
>> +const char *drm_wedge_recovery_name(enum drm_wedge_recovery method)
>> +{
>> +	if (drm_wedge_recovery_is_valid(method))
>> +		return drm_wedge_recovery_opts[method];
>> +
>> +	return NULL;
>> +}
>> +EXPORT_SYMBOL(drm_wedge_recovery_name);
>> +
>>  /*
>>   * DRM Minors
>>   * A DRM device can provide several char-dev interfaces on the DRM-
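
To make the uevent format concrete, here is a hedged userspace sketch of how a "WEDGED=" environment string could be assembled from a recovery-name table. The enum values and strings mirror the quoted patch, but the model_ names, model_format_wedged() and its buffer handling are illustrative assumptions, not code from this series.

```c
#include <stddef.h>
#include <stdio.h>

enum model_wedge_recovery {
	MODEL_WEDGE_RECOVERY_REBIND,
	MODEL_WEDGE_RECOVERY_BUS_RESET,
	MODEL_WEDGE_RECOVERY_REBOOT,
	MODEL_WEDGE_RECOVERY_MAX
};

static const char *const model_recovery_opts[] = {
	[MODEL_WEDGE_RECOVERY_REBIND] = "rebind",
	[MODEL_WEDGE_RECOVERY_BUS_RESET] = "bus-reset",
	[MODEL_WEDGE_RECOVERY_REBOOT] = "reboot",
};

/* NULL for out-of-range methods, like drm_wedge_recovery_name() in the patch. */
static const char *model_recovery_name(int method)
{
	if (method < 0 || method >= MODEL_WEDGE_RECOVERY_MAX)
		return NULL;
	return model_recovery_opts[method];
}

/*
 * Hypothetical helper: writes "WEDGED=" followed by the method name
 * into buf. Returns 0 on success, -1 on invalid method or truncation.
 */
static int model_format_wedged(char *buf, size_t len, int method)
{
	const char *name = model_recovery_name(method);
	int n;

	if (!name)
		return -1;
	n = snprintf(buf, len, "WEDGED=%s", name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Rejecting invalid methods before formatting keeps a NULL name from ever reaching snprintf(), which matches the "Handle invalid method cases" item in the v7 changelog in spirit.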
[PULL] drm-intel-fixes
Hi Dave & Sima,

Here goes drm-intel-fixes towards v6.12-rc4.

Just two DP MST fixes this round.

Regards, Joonas

***

drm-intel-fixes-2024-10-17:

- Two DP bandwidth related MST fixes

The following changes since commit 8e929cb546ee42c9a61d24fae60605e9e3192354:

  Linux 6.12-rc3 (2024-10-13 14:33:32 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/i915/kernel.git tags/drm-intel-fixes-2024-10-17

for you to fetch changes up to 2f54e71359eb2abc0bdf6619cd356e5e350ff27b:

  drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode (2024-10-16 14:56:40 +0300)

- Two DP bandwidth related MST fixes

Imre Deak (2):
      drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation
      drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible mode

 drivers/gpu/drm/i915/display/intel_dp_mst.c | 40 +
 1 file changed, 30 insertions(+), 10 deletions(-)
[BUG] drm/amd/display: possible null-pointer dereference or redundant null check in amdgpu_dm.c
Hello,

Our static analysis tool has identified a potential null-pointer
dereference or redundant null check related to the wait-completion
synchronization mechanism in amdgpu_dm.c in Linux 6.11.

Consider the following execution scenario:

  dmub_aux_setconfig_callback()                     //731
    if (adev->dm.dmub_notify)                       //734
    complete(&adev->dm.dmub_aux_transfer_done);     //737

The variable adev->dm.dmub_notify is checked by an if statement at Line
734, which indicates that adev->dm.dmub_notify can be NULL. Then,
complete() is called at Line 737, which wakes up the waiter in
wait_for_completion().

Consider the waiting side:

  amdgpu_dm_process_dmub_aux_transfer_sync()        //12271
    p_notify = adev->dm.dmub_notify;                //12278
    wait_for_completion_timeout(&adev->dm.dmub_aux_transfer_done, ...); //12287
    if (p_notify->result != AUX_RET_SUCCESS)        //12293

The value of adev->dm.dmub_notify is assigned to p_notify at Line 12278.
If adev->dm.dmub_notify can be NULL at Line 734, then p_notify can also
be NULL after wait_for_completion_timeout() returns at Line 12287.
However, it is dereferenced at Line 12293 without rechecking, causing a
possible null dereference.

In fact, dmub_aux_setconfig_callback() is registered only if
adev->dm.dmub_notify is checked to be not NULL:

  adev->dm.dmub_notify = kzalloc(...);              //2006
  if (!adev->dm.dmub_notify) {                      //2007
    ..
    goto error;                                     //2009
  }                                                 //2010
  ..
  register_dmub_notify_callback(..., dmub_aux_setconfig_callback, ...) //2019

I am not sure whether adev->dm.dmub_notify is assigned NULL elsewhere.
If not, the check at Line 734 is redundant.

Any feedback would be appreciated, thanks!

Sincerely,
Tuo Li
Re: [PATCH v6 01/14] drm/panthor: Add uAPI
On Thu, 2024-10-17 at 12:08 +0200, Boris Brezillon wrote: > On Thu, 17 Oct 2024 10:51:32 +0200 > Erik Faye-Lund wrote: > > > On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote: > > > On Wed, 16 Oct 2024 16:05:55 +0200 > > > Erik Faye-Lund wrote: > > > > > > > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote: > > > > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote: > > > > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote: > > > > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon > > > > > > > wrote: > > > > > > > > +/** > > > > > > > > + * enum drm_panthor_sync_op_flags - Synchronization > > > > > > > > operation > > > > > > > > flags. > > > > > > > > + */ > > > > > > > > +enum drm_panthor_sync_op_flags { > > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: > > > > > > > > Synchronization > > > > > > > > handle type mask. */ > > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff, > > > > > > > > + > > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ: > > > > > > > > Synchronization object type. */ > > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0, > > > > > > > > + > > > > > > > > + /** > > > > > > > > +* > > > > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ: > > > > > > > > Timeline synchronization > > > > > > > > +* object type. > > > > > > > > +*/ > > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCO > > > > > > > > BJ = > > > > > > > > 1, > > > > > > > > + > > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. > > > > > > > > */ > > > > > > > > + DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31, > > > > > > > > + > > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal > > > > > > > > operation. > > > > > > > > */ > > > > > > > > + DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << > > > > > > > > 31), > > > > > > > > > > > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32- > > > > > > > bit > > > > > > > signed > > > > > > > integer, so isn't this undefined behavior in C? 
> > > > > > > > > > > > > > > > > > > Seems this was proposed here: > > > > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1...@arm.com/ > > > > > > > > > > > > ...that kinda sounds like bad advice to me. > > > > > > > > > > > > Also, it's been pointed out to me elsewhere that this isn't > > > > > > *technically speaking* undefined, it's "implementation > > > > > > defined". > > > > > > But as > > > > > > far as kernel interfaces goes, that's pretty much the same; > > > > > > we > > > > > > can't > > > > > > guarantee that the kernel and the user-space is using the > > > > > > same > > > > > > implementation. > > > > > > > > > > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed > > > > > > and > > > > > > unsigned integers": > > > > > > > > > > > > """ > > > > > > Otherwise, the new type is signed and the value cannot be > > > > > > represented > > > > > > in it; either the result is implementation-defined or an > > > > > > implementation-defined signal is raised > > > > > > > > > > > > > > > > > > I think a better approach be to use -1 << 31, which is > > > > > > well- > > > > > > defined. > > > > > > But the problem then becomes assigning it into > > > > > > drm_panthor_sync_op::flags in a well-defined way... Could > > > > > > we > > > > > > make > > > > > > the > > > > > > field signed? That seems a bit bad as well... > > > > > > > > > > Is that a problem? Signed->unsigned conversion is always > > > > > well- > > > > > defined > > > > > (6.3.1.3 again), since it doesn't depend on how the signed > > > > > type > > > > > represents negatives. > > > > > > > > > > Robin. > > > > > > > > Ah, you're right. So that could fix the problem, indeed. > > > > > > On the other hand, I hate the idea of having -1 << 31 to encode > > > bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx > > > when > > > we'll reach a value above 0x7, because then the negative value is > > > hard > > > to map to its unsigned representation. 
If we really care about > > > this > > > corner case, I'd rather go full-defines for flags and call it a > > > day. > > > > > > > Yeah, I suppose it can get ugly for some other cases. > > > > If we rule that out, I think there's only two options I can think > > of > > left: > > > > 1. Using #defines instead, like Boris suggested > > 2. Using 64 bit signed enums (e.g "1ll << 31" instead) > > > > Again, #2 here would be the smaller change. But I kinda think I > > lean > > towards #1, because... These aren't really enumerators. They are > > flags. > > > > ...Yeah, sure. In C the practical difference isn't huge. But if we > > ever > > wanted to support using these enums from C++ code, we'd need to add > > overloaded operators, because C++ doesn't allow ORing together > > enums > > out of the box. > > > > I'm not saying I have any plans on using the uAPI
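
The #define option discussed in this thread can be sketched as follows. The MODEL_ names are shortened stand-ins for the DRM_PANTHOR_SYNC_OP_* values and are hypothetical, not the actual uAPI header. The point of the sketch is that unsigned constants keep bit 31 out of signed arithmetic altogether, so no cast to int and no signed shift is involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned flag constants: no signed shift, no implementation-defined cast. */
#define MODEL_SYNC_OP_HANDLE_TYPE_MASK             0xffu
#define MODEL_SYNC_OP_HANDLE_TYPE_SYNCOBJ          0u
#define MODEL_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ 1u
#define MODEL_SYNC_OP_WAIT                         (0u << 31)
#define MODEL_SYNC_OP_SIGNAL                       (1u << 31)

/* True if the operation bit (bit 31) selects "signal". */
static bool model_is_signal(uint32_t flags)
{
	return (flags & MODEL_SYNC_OP_SIGNAL) != 0;
}

/* Extracts the handle type from the low byte of the flags word. */
static uint32_t model_handle_type(uint32_t flags)
{
	return flags & MODEL_SYNC_OP_HANDLE_TYPE_MASK;
}
```

This assumes a 32-bit unsigned int, which holds on the platforms Linux supports. It also sidesteps the C++ concern raised above, since OR-ing plain unsigned constants needs no overloaded operators.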
Re: [PATCH v12 1/3] dt-bindings: display: mediatek: Add OF graph support for board path
Il 16/10/24 18:09, Rob Herring ha scritto: On Wed, Oct 16, 2024 at 10:26 AM AngeloGioacchino Del Regno wrote: Il 16/10/24 16:00, Rob Herring ha scritto: On Wed, Oct 16, 2024 at 4:23 AM AngeloGioacchino Del Regno wrote: Il 15/10/24 15:48, Rob Herring ha scritto: On Tue, Oct 15, 2024 at 10:32:22AM +0200, AngeloGioacchino Del Regno wrote: Il 14/10/24 19:36, Rob Herring ha scritto: On Mon, Oct 14, 2024 at 3:51 AM AngeloGioacchino Del Regno wrote: The display IPs in MediaTek SoCs support being interconnected with different instances of DDP IPs (for example, merge0 or merge1) and/or with different DDP IPs (for example, rdma can be connected with either color, dpi, dsi, merge, etc), forming a full Display Data Path that ends with an actual display. The final display pipeline is effectively board specific, as it does depend on the display that is attached to it, and eventually on the sensors supported by the board (for example, Adaptive Ambient Light would need an Ambient Light Sensor, otherwise it's pointless!), other than the output type. Add support for OF graphs to most of the MediaTek DDP (display) bindings to add flexibility to build custom hardware paths, hence enabling board specific configuration of the display pipeline and allowing to finally migrate away from using hardcoded paths. 
Reviewed-by: Rob Herring (Arm) Reviewed-by: Alexandre Mergnat Tested-by: Alexandre Mergnat Reviewed-by: CK Hu Tested-by: Michael Walle # on kontron-sbc-i1200 Signed-off-by: AngeloGioacchino Del Regno --- .../display/mediatek/mediatek,aal.yaml| 40 +++ .../display/mediatek/mediatek,ccorr.yaml | 21 ++ .../display/mediatek/mediatek,color.yaml | 22 ++ .../display/mediatek/mediatek,dither.yaml | 22 ++ .../display/mediatek/mediatek,dpi.yaml| 25 +++- .../display/mediatek/mediatek,dsc.yaml| 24 +++ .../display/mediatek/mediatek,dsi.yaml| 27 - .../display/mediatek/mediatek,ethdr.yaml | 22 ++ .../display/mediatek/mediatek,gamma.yaml | 19 + .../display/mediatek/mediatek,merge.yaml | 23 +++ .../display/mediatek/mediatek,od.yaml | 22 ++ .../display/mediatek/mediatek,ovl-2l.yaml | 22 ++ .../display/mediatek/mediatek,ovl.yaml| 22 ++ .../display/mediatek/mediatek,postmask.yaml | 21 ++ .../display/mediatek/mediatek,rdma.yaml | 22 ++ .../display/mediatek/mediatek,ufoe.yaml | 21 ++ 16 files changed, 372 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml b/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml index cf24434854ff..47ddba5c41af 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml @@ -62,6 +62,27 @@ properties: $ref: /schemas/types.yaml#/definitions/phandle-array maxItems: 1 + ports: +$ref: /schemas/graph.yaml#/properties/ports +description: + Input and output ports can have multiple endpoints, each of those + connects to either the primary, secondary, etc, display pipeline. + +properties: + port@0: +$ref: /schemas/graph.yaml#/properties/port +description: AAL input port + + port@1: +$ref: /schemas/graph.yaml#/properties/port +description: + AAL output to the next component's input, for example could be one + of many gamma, overdrive or other blocks. 
+ +required: + - port@0 + - port@1 + required: - compatible - reg @@ -89,5 +110,24 @@ examples: power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>; clocks = <&mmsys CLK_MM_DISP_AAL>; mediatek,gce-client-reg = <&gce SUBSYS_1401 0x5000 0x1000>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + aal0_in: endpoint { + remote-endpoint = <&ccorr0_out>; + }; + }; + + port@1 { + reg = <1>; + aal0_out: endpoint { + remote-endpoint = <&gamma0_in>; + }; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yaml b/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yaml index 9f8366763831..fca8e7bb0cbc 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yaml +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yaml @@ -57,6 +57,27 @@ properties: $ref: /schemas/types.yaml#/definitions/ph
Re: [PATCH 4/5] drm/sched: Re-group and rename the entity run-queue lock
On Wed, 2024-10-16 at 13:20 +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > When writing to a drm_sched_entity's run-queue, writers are protected > through the lock drm_sched_entity.rq_lock. This naming, however, > frequently collides with the separate internal lock of struct > drm_sched_rq, resulting in uses like this: > > spin_lock(&entity->rq_lock); > spin_lock(&entity->rq->lock); > > Rename drm_sched_entity.rq_lock to improve readability. While at it, > re-order that struct's members to make it more obvious what the lock > protects. > > v2: > * Rename some rq_lock straddlers in kerneldoc, improve commit text. > (Philipp) > > Signed-off-by: Tvrtko Ursulin > Suggested-by: Christian König > Cc: Alex Deucher > Cc: Luben Tuikov > Cc: Matthew Brost > Cc: Philipp Stanner > Reviewed-by: Christian König > --- > drivers/gpu/drm/scheduler/sched_entity.c | 28 -- > -- > drivers/gpu/drm/scheduler/sched_main.c | 2 +- > include/drm/gpu_scheduler.h | 21 +- > 3 files changed, 26 insertions(+), 25 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > b/drivers/gpu/drm/scheduler/sched_entity.c > index b72cba292839..c013c2b49aa5 100644 > --- a/drivers/gpu/drm/scheduler/sched_entity.c > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > @@ -105,7 +105,7 @@ int drm_sched_entity_init(struct drm_sched_entity > *entity, > /* We start in an idle state. 
*/ > complete_all(&entity->entity_idle); > > - spin_lock_init(&entity->rq_lock); > + spin_lock_init(&entity->lock); > spsc_queue_init(&entity->job_queue); > > atomic_set(&entity->fence_seq, 0); > @@ -133,10 +133,10 @@ void drm_sched_entity_modify_sched(struct > drm_sched_entity *entity, > { > WARN_ON(!num_sched_list || !sched_list); > > - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > entity->sched_list = sched_list; > entity->num_sched_list = num_sched_list; > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > } > EXPORT_SYMBOL(drm_sched_entity_modify_sched); > > @@ -244,10 +244,10 @@ static void drm_sched_entity_kill(struct > drm_sched_entity *entity) > if (!entity->rq) > return; > > - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > entity->stopped = true; > drm_sched_rq_remove_entity(entity->rq, entity); > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > > /* Make sure this entity is not used by the scheduler at the > moment */ > wait_for_completion(&entity->entity_idle); > @@ -396,9 +396,9 @@ static void drm_sched_entity_wakeup(struct > dma_fence *f, > void drm_sched_entity_set_priority(struct drm_sched_entity *entity, > enum drm_sched_priority priority) > { > - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > entity->priority = priority; > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > } > EXPORT_SYMBOL(drm_sched_entity_set_priority); > > @@ -515,10 +515,10 @@ struct drm_sched_job > *drm_sched_entity_pop_job(struct drm_sched_entity *entity) > > next = to_drm_sched_job(spsc_queue_peek(&entity- > >job_queue)); > if (next) { > - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > drm_sched_rq_update_fifo_locked(entity, > next- > >submit_ts); > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > } > } > > @@ -559,14 +559,14 @@ void drm_sched_entity_select_rq(struct > drm_sched_entity *entity) > if (fence && !dma_fence_is_signaled(fence)) > return; > 
> - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > sched = drm_sched_pick_best(entity->sched_list, entity- > >num_sched_list); > rq = sched ? sched->sched_rq[entity->priority] : NULL; > if (rq != entity->rq) { > drm_sched_rq_remove_entity(entity->rq, entity); > entity->rq = rq; > } > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > > if (entity->num_sched_list == 1) > entity->sched_list = NULL; > @@ -605,9 +605,9 @@ void drm_sched_entity_push_job(struct > drm_sched_job *sched_job) > struct drm_sched_rq *rq; > > /* Add the entity to the run queue */ > - spin_lock(&entity->rq_lock); > + spin_lock(&entity->lock); > if (entity->stopped) { > - spin_unlock(&entity->rq_lock); > + spin_unlock(&entity->lock); > > DRM_ERROR("Trying to push to a killed > entity\n"); > return; > @@ -621,7 +621,7 @@ void drm_sched_entity_push_job(st
[PATCH] drm/tilcdc: conditionally calling drm_atomic_helper_shutdown()
The `drm_atomic_helper_shutdown(dev)` is called only if `priv->is_registered` is true, ensuring that it runs only when the device has been properly registered. Otherwise, if it encounters a defer probe, the following call trace will appear. WARNING: CPU: 0 PID: 13 at drivers/gpu/drm/drm_atomic_state_helper.c:174 drm_atomic_helper_crtc_duplicate_state+0x68/0x70 Modules linked in: CPU: 0 PID: 13 Comm: kworker/u2:1 Hardware name: Generic AM33XX (Flattened Device Tree) Workqueue: events_unbound deferred_probe_work_func unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x24/0x2c dump_stack_lvl from __warn+0x80/0x134 __warn from warn_slowpath_fmt+0x19c/0x1a4 warn_slowpath_fmt from drm_atomic_helper_crtc_duplicate_state+0x68/0x70 drm_atomic_helper_crtc_duplicate_state from drm_atomic_get_crtc_state+0x70/0x110 drm_atomic_get_crtc_state from drm_atomic_helper_disable_all+0x98/0x1c8 drm_atomic_helper_disable_all from drm_atomic_helper_shutdown+0x90/0x144 drm_atomic_helper_shutdown from tilcdc_fini+0x58/0xe0 tilcdc_fini from tilcdc_init.constprop.0+0x23c/0x620 tilcdc_init.constprop.0 from tilcdc_pdev_probe+0x58/0xac tilcdc_pdev_probe from platform_probe+0x64/0xb8 platform_probe from really_probe+0xd0/0x2e0 really_probe from __driver_probe_device+0x90/0x1a8 __driver_probe_device from driver_probe_device+0x38/0x10c driver_probe_device from __device_attach_driver+0x9c/0x110 __device_attach_driver from bus_for_each_drv+0x98/0xec bus_for_each_drv from __device_attach+0xb0/0x1ac __device_attach from bus_probe_device+0x90/0x94 bus_probe_device from deferred_probe_work_func+0x80/0xac deferred_probe_work_func from process_one_work+0x198/0x3f8 process_one_work from worker_thread+0x35c/0x550 worker_thread from kthread+0x108/0x124 kthread from ret_from_fork+0x14/0x28 Exception stack(0xe0039fb0 to 0xe0039ff8) 9fa0: 9fc0: 9fe0: 0013 ---[ end trace ]--- tilcdc 4830e000.lcdc: [drm] *ERROR* Disabling all crtc's during unload failed with -12 Signed-off-by: Xulin Sun --- 
drivers/gpu/drm/tilcdc/tilcdc_drv.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.c b/drivers/gpu/drm/tilcdc/tilcdc_drv.c index cd5eefa06060..9c11ea126b46 100644 --- a/drivers/gpu/drm/tilcdc/tilcdc_drv.c +++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.c @@ -171,11 +171,12 @@ static void tilcdc_fini(struct drm_device *dev) if (priv->crtc) tilcdc_crtc_shutdown(priv->crtc); - if (priv->is_registered) + if (priv->is_registered) { drm_dev_unregister(dev); + drm_atomic_helper_shutdown(dev); + } drm_kms_helper_poll_fini(dev); - drm_atomic_helper_shutdown(dev); tilcdc_irq_uninstall(dev); drm_mode_config_cleanup(dev); -- 2.34.1
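The ordering this patch aims for can be modelled outside the kernel. The following is only a minimal sketch in plain C: struct fake_priv and the fake_* helpers are hypothetical stand-ins that count calls, not the real tilcdc or DRM functions. It shows that when registration never happened (for example after a deferred probe), neither the unregister step nor the atomic shutdown step runs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private state (not the real tilcdc struct). */
struct fake_priv {
	bool is_registered;   /* set only once registration has succeeded */
	int unregister_calls; /* how often the mocked unregister step ran */
	int shutdown_calls;   /* how often the mocked atomic shutdown step ran */
};

/* Mock of the unregister step: only records that it was called. */
static void fake_unregister(struct fake_priv *priv)
{
	priv->unregister_calls++;
}

/* Mock of the atomic shutdown step: only records that it was called. */
static void fake_atomic_shutdown(struct fake_priv *priv)
{
	priv->shutdown_calls++;
}

/* Teardown that follows the ordering in the patch: both steps are gated on is_registered. */
static void fake_fini(struct fake_priv *priv)
{
	if (priv->is_registered) {
		fake_unregister(priv);
		fake_atomic_shutdown(priv);
	}
	/* The remaining, unconditional cleanup is left out of this sketch. */
}
```

Calling fake_fini() on a zero-initialised struct fake_priv leaves both counters at 0, which corresponds to the deferred-probe path from the call trace.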
[linux-next:master] [drm/tests] d219425604: WARNING:at_drivers/gpu/drm/drm_framebuffer.c:#drm_framebuffer_free[drm]
arn (kernel/panic.c:741) kern :warn : [ 111.100940] ? drm_framebuffer_free (drivers/gpu/drm/drm_framebuffer.c:832) drm kern :warn : [ 111.101164] ? report_bug (lib/bug.c:180 lib/bug.c:219) kern :warn : [ 111.101257] ? handle_bug (arch/x86/kernel/traps.c:239) kern :warn : [ 111.101346] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1)) kern :warn : [ 111.101439] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) kern :warn : [ 111.101554] ? drm_framebuffer_free (drivers/gpu/drm/drm_framebuffer.c:832) drm kern :warn : [ 111.101773] ? drm_framebuffer_free (drivers/gpu/drm/drm_framebuffer.c:832) drm kern :warn : [ 111.101991] drm_test_framebuffer_free (drivers/gpu/drm/tests/drm_framebuffer_test.c:693 (discriminator 5)) drm_framebuffer_test kern :warn : [ 111.102139] ? __pfx_drm_test_framebuffer_free (drivers/gpu/drm/tests/drm_framebuffer_test.c:670) drm_framebuffer_test kern :warn : [ 111.102295] ? __pfx_drm_mode_config_init_release (drivers/gpu/drm/drm_mode_config.c:386) drm kern :warn : [ 111.102539] ? __drmm_add_action (drivers/gpu/drm/drm_managed.c:161) drm kern :warn : [ 111.102756] ? __schedule (kernel/sched/core.c:6399) kern :warn : [ 111.102848] ? __pfx_read_tsc (arch/x86/kernel/tsc.c:1130) kern :warn : [ 111.102941] ? ktime_get_ts64 (kernel/time/timekeeping.c:378 (discriminator 4) kernel/time/timekeeping.c:395 (discriminator 4) kernel/time/timekeeping.c:403 (discriminator 4) kernel/time/timekeeping.c:983 (discriminator 4)) kern :warn : [ 111.103037] kunit_try_run_case (lib/kunit/test.c:400 lib/kunit/test.c:443) kern :warn : [ 111.103135] ? __pfx_kunit_try_run_case (lib/kunit/test.c:430) kern :warn : [ 111.103243] ? set_cpus_allowed_ptr (kernel/sched/core.c:3025) kern :warn : [ 111.103345] ? __pfx_set_cpus_allowed_ptr (kernel/sched/core.c:3025) kern :warn : [ 111.103455] ? __pfx_kunit_try_run_case (lib/kunit/test.c:430) kern :warn : [ 111.103574] ? 
__pfx_kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:26) kern :warn : [ 111.103705] kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31) kern :warn : [ 111.103823] kthread (kernel/kthread.c:389) kern :warn : [ 111.103907] ? __pfx_kthread (kernel/kthread.c:342) kern :warn : [ 111.103997] ret_from_fork (arch/x86/kernel/process.c:153) kern :warn : [ 111.104085] ? __pfx_kthread (kernel/kthread.c:342) kern :warn : [ 111.104175] ret_from_fork_asm (arch/x86/entry/entry_64.S:257) kern :warn : [ 111.104272] kern :warn : [ 111.104334] ---[ end trace ]--- kern :info : [ 111.116715] ok 4 drm_test_framebuffer_free kern :info : [ 111.124711] ok 5 drm_test_framebuffer_init The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20241017/202410171515.c79582d2-oliver.s...@intel.com -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH v2 6/9] drm/bridge: Add ITE IT6263 LVDS to HDMI converter
On 10/14/2024, Dmitry Baryshkov wrote: [...] +static int it6263_bridge_atomic_check(struct drm_bridge *bridge, +struct drm_bridge_state *bridge_state, +struct drm_crtc_state *crtc_state, +struct drm_connector_state *conn_state) { + struct drm_display_mode *mode = &crtc_state->adjusted_mode; >>> >>> Use drm_atomic_helper_connector_hdmi_check(). >>> >>> Implement .hdmi_tmds_char_rate_valid(). Also, I think, single and dual link >>> LVDS have different max >>> clock rates. Please correct me if I'm wrong. >> >> I guess this rate will be same for both links in dual lvds mode. >> For single link, it supports only link0. >> We cannot operate link1 its Own. >> >> From ITE point the max rate is rate corresponding to 1080p(148-150MHz) >> >> single and dual link LVDS have different max clock rates, but that >> constraint is >> in SoC side?? ITE HW manual does not mention about this. > > Huh? I checked the datasheet, version 0.8. > It specifies LVDS clock rate (not the mode clock) up to 150 MHz and HDMI The datasheet says "Features(LVDS RX) * Support input clock rate up to 150MHz". The 150MHz is the mode clock rate which kind of matches the words "Features(Combined) * Support up to Full-HD/1080P and UXGA(1600x 1200) display format". LVDS serial clock rate is either x7 or x3.5 the mode clock rate, depending on single link or dual link. > rate up to 225 MHz. Please check both constraints. Will check both constraints. [...] + it->bridge.ops = DRM_BRIDGE_OP_EDID | DRM_BRIDGE_OP_DETECT; >>> >>> | DRM_BRIDGE_OP_HDMI >>> >>> BTW: No HPD IRQ support? >> >> Renesas SMARC RZ/G3E this signal is internal. No dedicted IRQ line >> Populated for this signal. I don't know about NXP and any other platforms >> has HPD wired to test the HPD IRQ support. >> >> Maybe go with poll method now and add hot plug support, >> when we have platform with HPD to test. > > I'm fine with this. According to the datasheet it doesn't seem to have > the IRQ pin at all. It's just surprising to me. 
It'd be nice to mention > that HW doesn't support HPD IRQ either before setting it->bridge.ops or > in the commit message. Will mention this before setting it->bridge.ops. -- Regards, Liu Ying
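The clock relationship described above (LVDS serial clock at x7 the mode clock on a single link, x3.5 on dual link) and the two limits mentioned in the thread can be written down as a small sketch. The function and macro names are made up for illustration, and the 150 MHz and 225 MHz figures are simply the values quoted in this discussion, not independently checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Limits as quoted in the thread, in kHz. Names are hypothetical. */
#define EX_MAX_MODE_CLOCK_KHZ 150000u
#define EX_MAX_TMDS_RATE_KHZ  225000u

/* Per-link LVDS serial clock in kHz: x7 on a single link, x3.5 on dual link. */
static uint32_t ex_lvds_serial_clock_khz(uint32_t mode_clock_khz, bool dual_link)
{
	uint32_t x7 = mode_clock_khz * 7u;

	return dual_link ? x7 / 2u : x7;
}

/* Check a mode against both constraints: the LVDS input side and the HDMI output side. */
static bool ex_clocks_within_limits(uint32_t mode_clock_khz, uint32_t tmds_rate_khz)
{
	return mode_clock_khz <= EX_MAX_MODE_CLOCK_KHZ &&
	       tmds_rate_khz <= EX_MAX_TMDS_RATE_KHZ;
}
```

For a 148500 kHz mode clock this gives 1039500 kHz on a single link and 519750 kHz per link in dual-link mode.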
Re: [PATCH v6 01/14] drm/panthor: Add uAPI
On Thu, 17 Oct 2024 10:51:32 +0200 Erik Faye-Lund wrote: > On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote: > > On Wed, 16 Oct 2024 16:05:55 +0200 > > Erik Faye-Lund wrote: > > > > > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote: > > > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote: > > > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote: > > > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote: > > > > > > > +/** > > > > > > > + * enum drm_panthor_sync_op_flags - Synchronization > > > > > > > operation > > > > > > > flags. > > > > > > > + */ > > > > > > > +enum drm_panthor_sync_op_flags { > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: > > > > > > > Synchronization > > > > > > > handle type mask. */ > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff, > > > > > > > + > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ: > > > > > > > Synchronization object type. */ > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0, > > > > > > > + > > > > > > > + /** > > > > > > > + * > > > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ: > > > > > > > Timeline synchronization > > > > > > > + * object type. > > > > > > > + */ > > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = > > > > > > > 1, > > > > > > > + > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */ > > > > > > > + DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31, > > > > > > > + > > > > > > > + /** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. > > > > > > > */ > > > > > > > + DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31), > > > > > > > > > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit > > > > > > signed > > > > > > integer, so isn't this undefined behavior in C? > > > > > > > > > > > > > > > > Seems this was proposed here: > > > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1...@arm.com/ > > > > > > > > > > ...that kinda sounds like bad advice to me. 
> > > > > > > > > > Also, it's been pointed out to me elsewhere that this isn't > > > > > *technically speaking* undefined, it's "implementation > > > > > defined". > > > > > But as > > > > > far as kernel interfaces goes, that's pretty much the same; we > > > > > can't > > > > > guarantee that the kernel and the user-space is using the same > > > > > implementation. > > > > > > > > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and > > > > > unsigned integers": > > > > > > > > > > """ > > > > > Otherwise, the new type is signed and the value cannot be > > > > > represented > > > > > in it; either the result is implementation-defined or an > > > > > implementation-defined signal is raised > > > > > > > > > > > > > > > I think a better approach be to use -1 << 31, which is well- > > > > > defined. > > > > > But the problem then becomes assigning it into > > > > > drm_panthor_sync_op::flags in a well-defined way... Could we > > > > > make > > > > > the > > > > > field signed? That seems a bit bad as well... > > > > > > > > Is that a problem? Signed->unsigned conversion is always well- > > > > defined > > > > (6.3.1.3 again), since it doesn't depend on how the signed type > > > > represents negatives. > > > > > > > > Robin. > > > > > > Ah, you're right. So that could fix the problem, indeed. > > > > On the other hand, I hate the idea of having -1 << 31 to encode > > bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when > > we'll reach a value above 0x7, because then the negative value is > > hard > > to map to its unsigned representation. If we really care about this > > corner case, I'd rather go full-defines for flags and call it a day. > > > > Yeah, I suppose it can get ugly for some other cases. > > If we rule that out, I think there's only two options I can think of > left: > > 1. Using #defines instead, like Boris suggested > 2. Using 64 bit signed enums (e.g "1ll << 31" instead) > > Again, #2 here would be the smaller change. 
But I kinda think I lean > towards #1, because... These aren't really enumerators. They are flags. > > ...Yeah, sure. In C the practical difference isn't huge. But if we ever > wanted to support using these enums from C++ code, we'd need to add > overloaded operators, because C++ doesn't allow ORing together enums > out of the box. > > I'm not saying I have any plans on using the uAPI from C++, just saying > that if we're going to tackle this, we might as well tackle it > completely... > > Also, expanding the enum-type to 64 bits might have some additional > consequences, like needlessly needing more stack-space to pass values > around etc. > > Thoughts? I'm leaning towards defines, because 64-bit enums are uncommon (FWIW, 'git grep "1ll << 31" include/uapi' returns nothing).
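For reference, the "defines" option could look roughly like the sketch below. The EX_* names are hypothetical and only mirror the enum from the quoted patch; this is not the merged uAPI. Because every constant is an unsigned expression, bit 31 needs no cast and no conversion to a signed type.

```c
#include <stdint.h>

/* Hypothetical define-based spelling of the sync-op flags (illustration only). */
#define EX_SYNC_OP_HANDLE_TYPE_MASK             0xffu
#define EX_SYNC_OP_HANDLE_TYPE_SYNCOBJ          0u
#define EX_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ 1u
#define EX_SYNC_OP_WAIT                         (0u << 31)
#define EX_SYNC_OP_SIGNAL                       (1u << 31)

/* Returns non-zero when the flags word describes a signal operation. */
static inline int ex_sync_op_is_signal(uint32_t flags)
{
	return (flags & EX_SYNC_OP_SIGNAL) != 0;
}

/* Extracts the handle type from the low byte of the flags word. */
static inline uint32_t ex_sync_op_handle_type(uint32_t flags)
{
	return flags & EX_SYNC_OP_HANDLE_TYPE_MASK;
}
```

Flags built this way can be ORed together and stored in a __u32-style field without relying on how an enum is represented.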
Re: [PATCH v6 01/14] drm/panthor: Add uAPI
On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote: > On Wed, 16 Oct 2024 16:05:55 +0200 > Erik Faye-Lund wrote: > > > On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote: > > > On 2024-10-16 2:50 pm, Erik Faye-Lund wrote: > > > > On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote: > > > > > On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote: > > > > > > +/** > > > > > > + * enum drm_panthor_sync_op_flags - Synchronization > > > > > > operation > > > > > > flags. > > > > > > + */ > > > > > > +enum drm_panthor_sync_op_flags { > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: > > > > > > Synchronization > > > > > > handle type mask. */ > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff, > > > > > > + > > > > > > + /** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ: > > > > > > Synchronization object type. */ > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0, > > > > > > + > > > > > > + /** > > > > > > +* > > > > > > @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ: > > > > > > Timeline synchronization > > > > > > +* object type. > > > > > > +*/ > > > > > > + DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = > > > > > > 1, > > > > > > + > > > > > > + /** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */ > > > > > > + DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31, > > > > > > + > > > > > > + /** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. > > > > > > */ > > > > > > + DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31), > > > > > > > > > > Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit > > > > > signed > > > > > integer, so isn't this undefined behavior in C? > > > > > > > > > > > > > Seems this was proposed here: > > > > https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1...@arm.com/ > > > > > > > > ...that kinda sounds like bad advice to me. > > > > > > > > Also, it's been pointed out to me elsewhere that this isn't > > > > *technically speaking* undefined, it's "implementation > > > > defined". 
> > > > But as > > > > far as kernel interfaces goes, that's pretty much the same; we > > > > can't > > > > guarantee that the kernel and the user-space is using the same > > > > implementation. > > > > > > > > Here's the quote from the C99 spec, section 6.3.1.3 "Signed and > > > > unsigned integers": > > > > > > > > """ > > > > Otherwise, the new type is signed and the value cannot be > > > > represented > > > > in it; either the result is implementation-defined or an > > > > implementation-defined signal is raised > > > > > > > > > > > > I think a better approach be to use -1 << 31, which is well- > > > > defined. > > > > But the problem then becomes assigning it into > > > > drm_panthor_sync_op::flags in a well-defined way... Could we > > > > make > > > > the > > > > field signed? That seems a bit bad as well... > > > > > > Is that a problem? Signed->unsigned conversion is always well- > > > defined > > > (6.3.1.3 again), since it doesn't depend on how the signed type > > > represents negatives. > > > > > > Robin. > > > > Ah, you're right. So that could fix the problem, indeed. > > On the other hand, I hate the idea of having -1 << 31 to encode > bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when > we'll reach a value above 0x7, because then the negative value is > hard > to map to its unsigned representation. If we really care about this > corner case, I'd rather go full-defines for flags and call it a day. > Yeah, I suppose it can get ugly for some other cases. If we rule that out, I think there's only two options I can think of left: 1. Using #defines instead, like Boris suggested 2. Using 64 bit signed enums (e.g "1ll << 31" instead) Again, #2 here would be the smaller change. But I kinda think I lean towards #1, because... These aren't really enumerators. They are flags. ...Yeah, sure. In C the practical difference isn't huge. 
But if we ever wanted to support using these enums from C++ code, we'd need to add overloaded operators, because C++ doesn't allow ORing together enums out of the box. I'm not saying I have any plans on using the uAPI from C++, just saying that if we're going to tackle this, we might as well tackle it completely... Also, expanding the enum-type to 64 bits might have some additional consequences, like needlessly needing more stack-space to pass values around etc. Thoughts? Surely there must be some precedence on using the top bit for flags in the kernel, no?
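Robin's point about signed-to-unsigned conversion can be checked with a tiny sketch. The helper name is made up for illustration. It deliberately starts from INT32_MIN rather than from a shift of a negative value, since, as far as I can tell, C99 6.5.7 does not define left shifts of negative operands; only the conversion itself is demonstrated here.

```c
#include <stdint.h>

/*
 * Converting a signed value to an unsigned type is defined by C99 6.3.1.3:
 * the result is the value reduced modulo 2^32 for uint32_t, regardless of
 * how the signed type represents negative numbers.
 */
static uint32_t ex_to_u32(int32_t value)
{
	return (uint32_t)value;
}
```

With this, ex_to_u32(INT32_MIN) yields 0x80000000, i.e. only bit 31 set, and ex_to_u32(-1) yields 0xffffffff.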
[PATCH v3 1/2] dt-bindings: display: bridge: sil, sii9022: Add bus-width
The SI9022 HDMI transmitter can be configured with a bus-width of 16, 18, or 24 bits. Introduce a bus-width property to the input endpoint, specifying the number of parallel RGB input bits connected to the transmitter. Signed-off-by: Wadim Egorov Reviewed-by: Krzysztof Kozlowski --- v3: Add Reviewed-by tag from Krzysztof --- .../bindings/display/bridge/sil,sii9022.yaml | 15 ++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml b/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml index 5a69547ad3d7..1509c4535e53 100644 --- a/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml +++ b/Documentation/devicetree/bindings/display/bridge/sil,sii9022.yaml @@ -81,9 +81,22 @@ properties: properties: port@0: -$ref: /schemas/graph.yaml#/properties/port +unevaluatedProperties: false +$ref: /schemas/graph.yaml#/$defs/port-base description: Parallel RGB input port +properties: + endpoint: +$ref: /schemas/graph.yaml#/$defs/endpoint-base +unevaluatedProperties: false + +properties: + bus-width: +description: + Endpoint bus width. +enum: [ 16, 18, 24 ] +default: 24 + port@1: $ref: /schemas/graph.yaml#/properties/port description: HDMI output port -- 2.34.1
[PATCH v3 0/2] Introduce bus-width property for input bus format
This patch series introduces a bus-width property for the SI9022 HDMI transmitter, allowing the input bus format to be configured based on the number of RGB input pins. The default is set to 24-bit if unspecified. v3: - Add Reviewed-by tag from Krzysztof - Ensure bus_width is set/defaults to 24 even if an endpoint is not defined v2: https://lore.kernel.org/lkml/20241007085213.2918982-1-w.ego...@phytec.de/ v1: https://lore.kernel.org/lkml/20241003082006.2728617-1-w.ego...@phytec.de/T/ Wadim Egorov (2): dt-bindings: display: bridge: sil,sii9022: Add bus-width drm/bridge: sii902x: Set input bus format based on bus-width .../bindings/display/bridge/sil,sii9022.yaml | 15 +++- drivers/gpu/drm/bridge/sii902x.c | 24 ++- 2 files changed, 37 insertions(+), 2 deletions(-) -- 2.34.1
[PATCH v3 2/2] drm/bridge: sii902x: Set input bus format based on bus-width
Introduce a bus-width property to define the number of parallel RGB input pins connected to the transmitter. The input bus formats are updated accordingly. If the property is not specified, default to 24-bit bus-width. Signed-off-by: Wadim Egorov --- v3: Ensure bus_width is set/defaults to 24 even if an endpoint is not defined --- drivers/gpu/drm/bridge/sii902x.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c index 7f91b0db161e..9be9cc5b9025 100644 --- a/drivers/gpu/drm/bridge/sii902x.c +++ b/drivers/gpu/drm/bridge/sii902x.c @@ -180,6 +180,8 @@ struct sii902x { struct gpio_desc *reset_gpio; struct i2c_mux_core *i2cmux; bool sink_is_hdmi; + u32 bus_width; + /* * Mutex protects audio and video functions from interfering * each other, by keeping their i2c command sequences atomic. @@ -477,6 +479,8 @@ static u32 *sii902x_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge, u32 output_fmt, unsigned int *num_input_fmts) { + + struct sii902x *sii902x = bridge_to_sii902x(bridge); u32 *input_fmts; *num_input_fmts = 0; @@ -485,7 +489,20 @@ static u32 *sii902x_bridge_atomic_get_input_bus_fmts(struct drm_bridge *bridge, if (!input_fmts) return NULL; - input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X24; + switch (sii902x->bus_width) { + case 16: + input_fmts[0] = MEDIA_BUS_FMT_RGB565_1X16; + break; + case 18: + input_fmts[0] = MEDIA_BUS_FMT_RGB666_1X18; + break; + case 24: + input_fmts[0] = MEDIA_BUS_FMT_RGB888_1X24; + break; + default: + return NULL; + } + *num_input_fmts = 1; return input_fmts; @@ -1167,6 +1184,11 @@ static int sii902x_probe(struct i2c_client *client) return PTR_ERR(sii902x->reset_gpio); } + sii902x->bus_width = 24; + endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 0, -1); + if (endpoint) + of_property_read_u32(endpoint, "bus-width", &sii902x->bus_width); + endpoint = of_graph_get_endpoint_by_regs(dev->of_node, 1, -1); if (endpoint) { struct device_node *remote = 
of_graph_get_remote_port_parent(endpoint); -- 2.34.1
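The defaulting and mapping logic of this patch can be sketched in standalone C as follows. The enum values are placeholders and do not match the kernel's MEDIA_BUS_FMT_* codes, and the ex_* helpers are hypothetical; a NULL pointer stands in for "no bus-width property found".

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder format codes for illustration; not the kernel's MEDIA_BUS_FMT_* values. */
enum ex_bus_fmt {
	EX_FMT_NONE = 0,
	EX_FMT_RGB565_1X16,
	EX_FMT_RGB666_1X18,
	EX_FMT_RGB888_1X24,
};

#define EX_DEFAULT_BUS_WIDTH 24u

/* Use the devicetree value when present, otherwise fall back to 24 bits. */
static uint32_t ex_effective_bus_width(const uint32_t *dt_bus_width)
{
	return dt_bus_width ? *dt_bus_width : EX_DEFAULT_BUS_WIDTH;
}

/* Map a bus width to an input format; unsupported widths yield EX_FMT_NONE. */
static enum ex_bus_fmt ex_bus_width_to_fmt(uint32_t bus_width)
{
	switch (bus_width) {
	case 16:
		return EX_FMT_RGB565_1X16;
	case 18:
		return EX_FMT_RGB666_1X18;
	case 24:
		return EX_FMT_RGB888_1X24;
	default:
		return EX_FMT_NONE;
	}
}
```

In the patch itself an unsupported width makes the get_input_bus_fmts callback return NULL; EX_FMT_NONE plays that role in this sketch.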
Re: [PATCH v4] drm/meson: switch to a managed drm device
Just a friendly reminder. On 09/10/24 16:15, Anastasia Belova wrote: Switch to a managed drm device to cleanup some error handling and make future work easier. Fix dereference of NULL in meson_drv_bind_master by removing drm_dev_put(drm) before meson_encoder_*_remove and component_unbind_all where drm is dereferenced. Co-developed by Linux Verification Center (linuxtesting.org). Cc: sta...@vger.kernel.org Fixes: 6a044642988b ("drm/meson: fix unbind path if HDMI fails to bind") Signed-off-by: Anastasia Belova --- v2: fix commit message and add Cc: sta...@vger.kernel.org v3: cleanup error paths v4: fix build errors drivers/gpu/drm/meson/meson_crtc.c | 10 +-- drivers/gpu/drm/meson/meson_drv.c | 93 ++ drivers/gpu/drm/meson/meson_drv.h | 3 +- drivers/gpu/drm/meson/meson_encoder_cvbs.c | 8 +- drivers/gpu/drm/meson/meson_encoder_dsi.c | 2 +- drivers/gpu/drm/meson/meson_encoder_hdmi.c | 4 +- drivers/gpu/drm/meson/meson_overlay.c | 8 +- drivers/gpu/drm/meson/meson_plane.c| 10 +-- 8 files changed, 63 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/meson/meson_crtc.c b/drivers/gpu/drm/meson/meson_crtc.c index d70616da8ce2..e1c0bf3baeea 100644 --- a/drivers/gpu/drm/meson/meson_crtc.c +++ b/drivers/gpu/drm/meson/meson_crtc.c @@ -662,13 +662,13 @@ void meson_crtc_irq(struct meson_drm *priv) drm_crtc_handle_vblank(priv->crtc); - spin_lock_irqsave(&priv->drm->event_lock, flags); + spin_lock_irqsave(&priv->drm.event_lock, flags); if (meson_crtc->event) { drm_crtc_send_vblank_event(priv->crtc, meson_crtc->event); drm_crtc_vblank_put(priv->crtc); meson_crtc->event = NULL; } - spin_unlock_irqrestore(&priv->drm->event_lock, flags); + spin_unlock_irqrestore(&priv->drm.event_lock, flags); } int meson_crtc_create(struct meson_drm *priv) @@ -677,18 +677,18 @@ int meson_crtc_create(struct meson_drm *priv) struct drm_crtc *crtc; int ret; - meson_crtc = devm_kzalloc(priv->drm->dev, sizeof(*meson_crtc), + meson_crtc = devm_kzalloc(priv->drm.dev, sizeof(*meson_crtc), GFP_KERNEL); if
(!meson_crtc) return -ENOMEM; meson_crtc->priv = priv; crtc = &meson_crtc->base; - ret = drm_crtc_init_with_planes(priv->drm, crtc, + ret = drm_crtc_init_with_planes(&priv->drm, crtc, priv->primary_plane, NULL, &meson_crtc_funcs, "meson_crtc"); if (ret) { - dev_err(priv->drm->dev, "Failed to init CRTC\n"); + dev_err(priv->drm.dev, "Failed to init CRTC\n"); return ret; } diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c index 4bd0baa2a4f5..dd87c6b61e9e 100644 --- a/drivers/gpu/drm/meson/meson_drv.c +++ b/drivers/gpu/drm/meson/meson_drv.c @@ -182,7 +182,6 @@ static int meson_drv_bind_master(struct device *dev, bool has_components) struct platform_device *pdev = to_platform_device(dev); const struct meson_drm_match_data *match; struct meson_drm *priv; - struct drm_device *drm; struct resource *res; void __iomem *regs; int ret, i; @@ -197,58 +196,49 @@ static int meson_drv_bind_master(struct device *dev, bool has_components) if (!match) return -ENODEV; - drm = drm_dev_alloc(&meson_driver, dev); - if (IS_ERR(drm)) - return PTR_ERR(drm); + priv = devm_drm_dev_alloc(dev, &meson_driver, +struct meson_drm, drm); - priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL); - if (!priv) { - ret = -ENOMEM; - goto free_drm; - } - drm->dev_private = priv; - priv->drm = drm; + if (IS_ERR(priv)) + return PTR_ERR(priv); + + priv->drm.dev_private = priv; priv->dev = dev; priv->compat = match->compat; priv->afbcd.ops = match->afbcd_ops; regs = devm_platform_ioremap_resource_byname(pdev, "vpu"); if (IS_ERR(regs)) { - ret = PTR_ERR(regs); - goto free_drm; + return PTR_ERR(regs); } priv->io_base = regs; res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "hhi"); if (!res) { - ret = -EINVAL; - goto free_drm; + return -EINVAL; } /* Simply ioremap since it may be a shared register zone */ regs = devm_ioremap(dev, res->start, resource_size(res)); if (!regs) { - ret = -EADDRNOTAVAIL; - goto free_drm; + return -EADDRNOTAVAIL; } priv->hhi = 
devm_regmap_init_mmio(dev, regs, &meson_regmap_config); if (IS
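The switch from priv->drm (a pointer) to &priv->drm (an embedded member) in this patch follows the usual embedding pattern, where the driver struct contains the DRM device and the outer struct can be recovered from a pointer to the member. Below is a minimal userspace sketch of that pattern; the ex_* types and the ex_container_of macro are simplified stand-ins written for this example, not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-in for struct drm_device. */
struct ex_drm_device {
	int id;
};

/* Simplified stand-in for the driver struct with the DRM device embedded by value. */
struct ex_meson_drm {
	int compat;
	struct ex_drm_device drm;
};

/* Userspace approximation of container_of(): step back from member to containing struct. */
#define ex_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the driver struct from a pointer to its embedded DRM device. */
static struct ex_meson_drm *ex_to_meson_drm(struct ex_drm_device *drm)
{
	return ex_container_of(drm, struct ex_meson_drm, drm);
}
```

With the device embedded, a single allocation covers both objects, which is why the separate devm_kzalloc() of priv and the free_drm error labels disappear in the diff.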
Re: [PATCH v1 2/5] arm64: dts: qcom: Add support for configuring channel TRE size
On 16/10/2024 16:35, Bjorn Andersson wrote: >>> @@ -1064,7 +1064,7 @@ >>> }; >>> >>> gpi_dma0: dma-controller@90 { >>> - #dma-cells = <3>; >>> + #dma-cells = <4>; >>> compatible = "qcom,sc7280-gpi-dma", >>> "qcom,sm6350-gpi-dma"; >>> reg = <0 0x0090 0 0x6>; >>> interrupts = , >>> @@ -1114,8 +1114,8 @@ >>> "qup-memory"; >>> power-domains = <&rpmhpd SC7280_CX>; >>> required-opps = <&rpmhpd_opp_low_svs>; >>> - dmas = <&gpi_dma0 0 0 QCOM_GPI_I2C>, >>> - <&gpi_dma0 1 0 QCOM_GPI_I2C>; >>> + dmas = <&gpi_dma0 0 0 QCOM_GPI_I2C 64>, >>> + <&gpi_dma0 1 0 QCOM_GPI_I2C 64>; >> >> So everywhere is 64, thus this is fixed. Deduce it from the compatible >> > > If I understand correctly, it's a software tunable property, used to > balance how many TRE elements that should be preallocated. > > If so, it would not be a property of the hardware/compatible, but rather > a result of profiling and a balance between memory "waste" and > performance. In such case I would prefer it being runtime-calculated by the driver, based on frequency or expected bandwidth. And in any case if this is about to stay, having here default values means all upstream users don't need it. What's not upstream, does not exist in such context. We don't add features which are not used by upstream. Best regards, Krzysztof
Re: [PATCH RFC 3/3] arm64: dts: qcom: x1e80100: Add ACD levels for GPU
On 17/10/2024 08:12, Akhil P Oommen wrote: > On Wed, Oct 16, 2024 at 09:50:04AM +0200, Krzysztof Kozlowski wrote: >> On 15/10/2024 21:35, Akhil P Oommen wrote: >>> On Mon, Oct 14, 2024 at 09:40:13AM +0200, Krzysztof Kozlowski wrote: On Sat, Oct 12, 2024 at 01:59:30AM +0530, Akhil P Oommen wrote: > Update GPU node to include acd level values. > > Signed-off-by: Akhil P Oommen > --- > arch/arm64/boot/dts/qcom/x1e80100.dtsi | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/boot/dts/qcom/x1e80100.dtsi > b/arch/arm64/boot/dts/qcom/x1e80100.dtsi > index a36076e3c56b..e6c500480eb1 100644 > --- a/arch/arm64/boot/dts/qcom/x1e80100.dtsi > +++ b/arch/arm64/boot/dts/qcom/x1e80100.dtsi > @@ -3323,60 +3323,69 @@ zap-shader { > }; > > gpu_opp_table: opp-table { > - compatible = "operating-points-v2"; > + compatible = "operating-points-v2-adreno"; This nicely breaks all existing users of this DTS. Sorry, no. We are way past initial bringup/development. One year past. > > How do I identify when devicetree is considered stable? An arbitrary > time period doesn't sound like a good idea. Is there a general consensus > on this? > > X1E chipset is still considered under development at least till the end of > this > year, right? Stable could be when people already get their consumer/final product with it. I got some weeks ago Lenovo T14s laptop and since yesterday working fine with Ubuntu: https://discourse.ubuntu.com/t/ubuntu-24-10-concept-snapdragon-x-elite/48800 All chipsets are under development, even old SM8450, but we avoid breaking it while doing that. Best regards, Krzysztof
Re:Re: [PATCH v3 02/15] drm/rockchip: Set dma mask to 64 bit
Hi Robin, Thanks for your comment. At 2024-10-17 01:38:23, "Robin Murphy" wrote: >On 2024-09-20 9:20 am, Andy Yan wrote: >> From: Andy Yan >> >> The vop mmu support translate physical address upper 4 GB to iova >> below 4 GB. So set dma mask to 64 bit to indicate we support address >>> 4GB. >> >> This can avoid warning message like this on some boards with DDR >>> 4 GB: >> >> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes), >> total 32768 (slots), used 130 (slots) >> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes), >> total 32768 (slots), used 0 (slots) >> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes), >> total 32768 (slots), used 130 (slots) >> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes), >> total 32768 (slots), used 130 (slots) >> rockchip-drm display-subsystem: swiotlb buffer is full (sz: 266240 bytes), >> total 32768 (slots), used 0 (slots) > >There are several things wrong with this... > >AFAICS the VOP itself still only supports 32-bit addresses, so the VOP >driver should only be setting a 32-bit DMA mask. The IOMMUs support >either 32-bit or 40-bit addresses, and the IOMMU driver does set its DMA Does that mean we can only use the dev of the IOMMU? If that is true, would you please give some inspiration on how to implement this? Or is there any other driver I can follow? Sorry, I'm not familiar with memory management and the IOMMU. >mask appropriately. None of those numbers is 64, so that's clearly >suspicious already. Plus it would seem the claim of the IOMMU being able >to address >4GB isn't strictly true for RK3288 (which does supposedly >support 8GB of RAM). We can set the DMA mask per device if we can find the right way to do it. > >Furthermore, the "display-subsystem" doesn't even exist - it does not >represent any actual DMA-capable hardware, so it should not have a DMA >mask, and it should not be used for DMA API operations. 
Buffers for the >VOP should be DMA-mapped for the VOP device itself. At the very least >the rockchip_gem_alloc_dma() path is clearly broken otherwise (I guess >this patch possibly *would* make that brokenness apparent). > >> Signed-off-by: Andy Yan >> Tested-by: Derek Foreman >> --- >> >> (no changes since v1) >> >> drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c >> b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c >> index 04ef7a2c3833..8bc2ff3b04bb 100644 >> --- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c >> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c >> @@ -445,7 +445,9 @@ static int rockchip_drm_platform_probe(struct >> platform_device *pdev) >> return ret; >> } >> >> -return 0; >> +ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(64)); > >Finally as a general thing, please don't misuse >dma_coerce_mask_and_coherent() in platform drivers, just use normal >dma_set_mask_and_coherent(). The platform bus code has been initialising >the dev->dma_mask pointer for years now, drivers should not be messing >with it any more. Got it, thanks again. > >Thanks, >Robin. > >> + >> +return ret; >> } >> >> static void rockchip_drm_platform_remove(struct platform_device *pdev)
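To make the numbers in this discussion concrete, here is a small userspace sketch of what a DMA mask of a given width covers. EX_DMA_BIT_MASK uses the same arithmetic as the kernel's DMA_BIT_MASK() but is defined locally under a different name, and ex_addr_fits() is a hypothetical helper, not a kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n), reproduced locally for illustration. */
#define EX_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* True when every set bit of addr lies within a mask of the given width. */
static bool ex_addr_fits(uint64_t addr, unsigned int bits)
{
	return (addr & ~EX_DMA_BIT_MASK(bits)) == 0;
}
```

An address at exactly 4 GB (0x100000000) does not fit a 32-bit mask but does fit a 40-bit one, which matches the 32-bit VOP versus 32-bit or 40-bit IOMMU distinction Robin describes.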
Re: [PATCH v2 0/5] Small DRM scheduler improvements
On Wed, 2024-10-16 at 13:20 +0100, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > Leftovers from the earlier "DRM scheduler fixes and improvements" > series. > > It looks the fixes have now propagated back to drm-misc-next so this > should now > be mergeable. > > It also needed a small rebase to account for one revert and one > spelling fix > which landed in the meantime. > > As a reminder, what remains are kerneldoc improvements, struct layout > tweaks for > clarity, one trivial cleanup for the FIFO mode, and most importantly > two spin > lock-unlock cycles are removed from the push job path by pulling > taking of the > locks one level up. > > I smoke tested it on the Steam Deck and lockdep seems happy. > > v2: > * Tweaks to commit messages and rename of some leftover rq_lock > naming inside > kerneldoc. > > Cc: Christian König > Cc: Philipp Stanner > > Tvrtko Ursulin (5): > drm/sched: Optimise drm_sched_entity_push_job > drm/sched: Stop setting current entity in FIFO mode > drm/sched: Re-order struct drm_sched_rq members for clarity > drm/sched: Re-group and rename the entity run-queue lock > drm/sched: Further optimise drm_sched_entity_push_job > > drivers/gpu/drm/scheduler/sched_entity.c | 42 +++--- > -- > drivers/gpu/drm/scheduler/sched_main.c | 32 +- > include/drm/gpu_scheduler.h | 34 ++- > 3 files changed, 61 insertions(+), 47 deletions(-) > Applied to drm-misc-next Thank you, P.
[PATCH v13 0/3] drm/mediatek: Add support for OF graphs
Changes in v13:
 - Added comment in commit description of patch [1/3] to warn about
   new port scheme.
 - The series is now fully reviewed, tested, hence *ready*.

Changes in v12:
 - Added comment to describe graph for OVL_ADAPTOR in patch [3/3]
   as suggested by CK Hu.

Changes in v11:
 - Added OVL_ADAPTOR_MDP_RDMA to OVL Adaptor exclusive components list
   to avoid failures in graphs with MDP_RDMA inside
 - Rebased on next-20241004

Changes in v10:
 - Removed erroneously added *.orig/*.rej files

Changes in v9:
 - Rebased on next-20240910
 - Removed redundant assignment and changed a print to dev_err()
 - Dropped if branch to switch conversion as requested; this will
   be sent as a separate commit out of this series.

Changes in v8:
 - Rebased on next-20240617
 - Changed to allow probing a VDO with no available display outputs

Changes in v7:
 - Fix typo in patch 3/3

Changes in v6:
 - Added EPROBE_DEFER check to fix dsi/dpi false positive DT fallback case
 - Dropped refcount of ep_out in mtk_drm_of_get_ddp_ep_cid()
 - Fixed double refcount drop during path building
 - Removed failure upon finding a DT-disabled path as requested
 - Tested again on MT8195, MT8395 boards

Changes in v5:
 - Fixed commit [2/3], changed allOf -> anyOf to get the
   intended allowance in the binding

Changes in v4:
 - Fixed a typo that caused pure OF graphs pipelines multiple
   concurrent outputs to not get correctly parsed (port->id);
 - Added OVL_ADAPTOR support for OF graph specified pipelines;
 - Now tested with fully OF Graph specified pipelines on MT8195
   Chromebooks and MT8395 boards;
 - Rebased on next-20240516

Changes in v3:
 - Rebased on next-20240502 because of renames in mediatek-drm

Changes in v2:
 - Fixed wrong `required` block indentation in commit [2/3]

The display IPs in MediaTek SoCs are *VERY* flexible and those support
being interconnected with different instances of DDP IPs (for example,
merge0 or merge1) and/or with different DDP IPs (for example, rdma can
be connected with either color, dpi, dsi, merge, etc), forming a full
Display Data Path that ends with an actual display.

This series was born because of an issue that I've found while enabling
support for MT8195/MT8395 boards with DSI output as main display: the
current mtk_drm_route variations would not work as currently, the driver
hardcodes a display path for Chromebooks, which have a DisplayPort panel
with DSC support, instead of a DSI panel without DSC support.

There are other reasons for which I wrote this series, and I find that
hardcoding those paths - when a HW path is clearly board-specific - is
highly suboptimal. Also, let's not forget about keeping this driver from
becoming a huge list of paths for each combination of SoC->board->disp
and... this and that.

For more information, please look at the commit description for each of
the commits included in this series.

This series is essential to enable support for the MT8195/MT8395 EVK,
Kontron i1200, Radxa NIO-12L and, mainly, for non-Chromebook boards
and Chromebooks to co-exist without conflicts.

Besides, this is also a valid option for MT8188 Chromebooks which might
have different DSI-or-eDP displays depending on the model (as far as I
can see from the mtk_drm_route attempt for this SoC that is already
present in this driver).

This series was tested on MT8195 Cherry Tomato and on MT8395 Radxa
NIO-12L with both hardcoded paths, OF graph support and partially
hardcoded paths, and pure OF graph support including pipelines that
require OVL_ADAPTOR support.
AngeloGioacchino Del Regno (3): dt-bindings: display: mediatek: Add OF graph support for board path dt-bindings: arm: mediatek: mmsys: Add OF graph support for board path drm/mediatek: Implement OF graphs support for display paths .../bindings/arm/mediatek/mediatek,mmsys.yaml | 28 ++ .../display/mediatek/mediatek,aal.yaml| 40 +++ .../display/mediatek/mediatek,ccorr.yaml | 21 ++ .../display/mediatek/mediatek,color.yaml | 22 ++ .../display/mediatek/mediatek,dither.yaml | 22 ++ .../display/mediatek/mediatek,dpi.yaml| 25 +- .../display/mediatek/mediatek,dsc.yaml| 24 ++ .../display/mediatek/mediatek,dsi.yaml| 27 +- .../display/mediatek/mediatek,ethdr.yaml | 22 ++ .../display/mediatek/mediatek,gamma.yaml | 19 ++ .../display/mediatek/mediatek,merge.yaml | 23 ++ .../display/mediatek/mediatek,od.yaml | 22 ++ .../display/mediatek/mediatek,ovl-2l.yaml | 22 ++ .../display/mediatek/mediatek,ovl.yaml| 22 ++ .../display/mediatek/mediatek,postmask.yaml | 21 ++ .../display/mediatek/mediatek,rdma.yaml | 22 ++ .../display/mediatek/mediatek,ufoe.yaml | 21 ++ drivers/gpu/drm/mediatek/mtk_disp_drv.h | 1 + .../gpu/drm/mediatek/mtk_disp_ovl_adaptor.c | 43 ++- drivers/gpu/drm/mediatek/mtk_dpi.c| 21 +- drivers/gpu/drm/mediatek/mtk_drm_drv.c| 253 +- drivers/gpu/drm/
[PATCH v13 3/3] drm/mediatek: Implement OF graphs support for display paths
It is impossible to add each and every possible DDP path combination for each and every possible combination of SoC and board: right now, this driver hardcodes configuration for 10 SoCs and this is going to grow larger and larger, and with new hacks like the introduction of mtk_drm_route which is anyway not enough for all final routes as the DSI cannot be connected to MERGE if it's not a dual-DSI, or enabling DSC preventively doesn't work if the display doesn't support it, or others. Since practically all display IPs in MediaTek SoCs support being interconnected with different instances of other, or the same, IPs or with different IPs and in different combinations, the final DDP pipeline is effectively a board specific configuration. Implement OF graphs support to the mediatek-drm drivers, allowing to stop hardcoding the paths, and preventing this driver to get a huge amount of arrays for each board and SoC combination, also paving the way to share the same mtk_mmsys_driver_data between multiple SoCs, making it more straightforward to add support for new chips. Note that the OVL_ADAPTOR software component driver needs relatively big changes in order to fully support OF Graphs (and more SoCs anyway) and such changes will come at a later time. As of now, the mtk_disp_ovl_adaptor driver takes the MERGE components (for example, on mt8195, merge 1 to 4) dynamically so, even though later updates to the ovl-adaptor driver will *not* require bindings changes, the merge1-4 will be temporarily omitted in the graph for the MT8195 SoC. This means that an example graph for this SoC looks like: mdp_rdma (0 ~ 7) -> padding (0 ~ 7) -> ethdr -> merge5 and the resulting path in this driver will be `ovl_adaptor -> merge5` Later updates to the ovl adaptor will expand it to support more SoCs and, in turn, to also fully support graphs. 
Reviewed-by: Alexandre Mergnat Tested-by: Alexandre Mergnat Acked-by: Sui Jingfeng Tested-by: Michael Walle # on kontron-sbc-i1200 Reviewed-by: CK Hu Signed-off-by: AngeloGioacchino Del Regno --- drivers/gpu/drm/mediatek/mtk_disp_drv.h | 1 + .../gpu/drm/mediatek/mtk_disp_ovl_adaptor.c | 43 ++- drivers/gpu/drm/mediatek/mtk_dpi.c| 21 +- drivers/gpu/drm/mediatek/mtk_drm_drv.c| 253 +- drivers/gpu/drm/mediatek/mtk_drm_drv.h| 2 +- drivers/gpu/drm/mediatek/mtk_dsi.c| 14 +- 6 files changed, 312 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h b/drivers/gpu/drm/mediatek/mtk_disp_drv.h index 082ac18fe04a..94843974851f 100644 --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h @@ -108,6 +108,7 @@ size_t mtk_ovl_get_num_formats(struct device *dev); void mtk_ovl_adaptor_add_comp(struct device *dev, struct mtk_mutex *mutex); void mtk_ovl_adaptor_remove_comp(struct device *dev, struct mtk_mutex *mutex); +bool mtk_ovl_adaptor_is_comp_present(struct device_node *node); void mtk_ovl_adaptor_connect(struct device *dev, struct device *mmsys_dev, unsigned int next); void mtk_ovl_adaptor_disconnect(struct device *dev, struct device *mmsys_dev, diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c index c6768210b08b..4e064d3c97cc 100644 --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl_adaptor.c @@ -490,6 +490,41 @@ static int compare_of(struct device *dev, void *data) return dev->of_node == data; } +static int ovl_adaptor_of_get_ddp_comp_type(struct device_node *node, + enum mtk_ovl_adaptor_comp_type *ctype) +{ + const struct of_device_id *of_id = of_match_node(mtk_ovl_adaptor_comp_dt_ids, node); + + if (!of_id) + return -EINVAL; + + *ctype = (enum mtk_ovl_adaptor_comp_type)((uintptr_t)of_id->data); + + return 0; +} + +bool mtk_ovl_adaptor_is_comp_present(struct device_node *node) +{ + enum 
mtk_ovl_adaptor_comp_type type; + int ret; + + ret = ovl_adaptor_of_get_ddp_comp_type(node, &type); + if (ret) + return false; + + if (type >= OVL_ADAPTOR_TYPE_NUM) + return false; + + /* +* In the context of mediatek-drm, ETHDR, MDP_RDMA and Padding are +* used exclusively by OVL Adaptor: if this component is not one of +* those, it's likely not an OVL Adaptor path. +*/ + return type == OVL_ADAPTOR_TYPE_ETHDR || + type == OVL_ADAPTOR_TYPE_MDP_RDMA || + type == OVL_ADAPTOR_TYPE_PADDING; +} + static int ovl_adaptor_comp_init(struct device *dev, struct component_match **match) { struct mtk_disp_ovl_adaptor *priv = dev_get_drvdata(dev); @@ -499,12 +534,11 @@ static int ovl_adaptor_comp_init(struct device *dev, struct component_match **ma parent = dev->parent->parent->of_node->parent;
[PATCH v13 1/3] dt-bindings: display: mediatek: Add OF graph support for board path
The display IPs in MediaTek SoCs support being interconnected with different instances of DDP IPs (for example, merge0 or merge1) and/or with different DDP IPs (for example, rdma can be connected with either color, dpi, dsi, merge, etc), forming a full Display Data Path that ends with an actual display. The final display pipeline is effectively board specific, as it does depend on the display that is attached to it, and eventually on the sensors supported by the board (for example, Adaptive Ambient Light would need an Ambient Light Sensor, otherwise it's pointless!), other than the output type. Add support for OF graphs to most of the MediaTek DDP (display) bindings to add flexibility to build custom hardware paths, hence enabling board specific configuration of the display pipeline and allowing to finally migrate away from using hardcoded paths. Please note that - while this commit retains retro-compatibility with old device trees - it will break the ABI for mediatek,dsi and for mediatek,dpi for the sake of consistency between the `ports` in all MediaTek DRM drivers versus DRM bridge drivers as in the previous binding, MediaTek was using `port` (implicitly, port@0) as an OUTPUT, while now the first port is an INPUT, and the second one is an OUTPUT, which is consistent with other DRM drivers which can be chained to drm/mediatek. As for maintainability concerns, I am aware that the old device tree will not be actively tested anymore, but retrocompatibility breakages will *not* be more likely to happen in the future because any addition to the graph (new drivers) will be done only for features present on newer SoCs, keeping the old ones (and their default pipeline) untouched. 
Reviewed-by: Rob Herring (Arm) Reviewed-by: Alexandre Mergnat Tested-by: Alexandre Mergnat Reviewed-by: CK Hu Tested-by: Michael Walle # on kontron-sbc-i1200 Signed-off-by: AngeloGioacchino Del Regno --- .../display/mediatek/mediatek,aal.yaml| 40 +++ .../display/mediatek/mediatek,ccorr.yaml | 21 ++ .../display/mediatek/mediatek,color.yaml | 22 ++ .../display/mediatek/mediatek,dither.yaml | 22 ++ .../display/mediatek/mediatek,dpi.yaml| 25 +++- .../display/mediatek/mediatek,dsc.yaml| 24 +++ .../display/mediatek/mediatek,dsi.yaml| 27 - .../display/mediatek/mediatek,ethdr.yaml | 22 ++ .../display/mediatek/mediatek,gamma.yaml | 19 + .../display/mediatek/mediatek,merge.yaml | 23 +++ .../display/mediatek/mediatek,od.yaml | 22 ++ .../display/mediatek/mediatek,ovl-2l.yaml | 22 ++ .../display/mediatek/mediatek,ovl.yaml| 22 ++ .../display/mediatek/mediatek,postmask.yaml | 21 ++ .../display/mediatek/mediatek,rdma.yaml | 22 ++ .../display/mediatek/mediatek,ufoe.yaml | 21 ++ 16 files changed, 372 insertions(+), 3 deletions(-) diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml b/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml index cf24434854ff..47ddba5c41af 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,aal.yaml @@ -62,6 +62,27 @@ properties: $ref: /schemas/types.yaml#/definitions/phandle-array maxItems: 1 + ports: +$ref: /schemas/graph.yaml#/properties/ports +description: + Input and output ports can have multiple endpoints, each of those + connects to either the primary, secondary, etc, display pipeline. + +properties: + port@0: +$ref: /schemas/graph.yaml#/properties/port +description: AAL input port + + port@1: +$ref: /schemas/graph.yaml#/properties/port +description: + AAL output to the next component's input, for example could be one + of many gamma, overdrive or other blocks. 
+ +required: + - port@0 + - port@1 + required: - compatible - reg @@ -89,5 +110,24 @@ examples: power-domains = <&scpsys MT8173_POWER_DOMAIN_MM>; clocks = <&mmsys CLK_MM_DISP_AAL>; mediatek,gce-client-reg = <&gce SUBSYS_1401 0x5000 0x1000>; + + ports { + #address-cells = <1>; + #size-cells = <0>; + + port@0 { + reg = <0>; + aal0_in: endpoint { + remote-endpoint = <&ccorr0_out>; + }; + }; + + port@1 { + reg = <1>; + aal0_out: endpoint { + remote-endpoint = <&gamma0_in>; + }; + }; + }; }; }; diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yaml b/Documentation/devicetree/bindings/display/mediatek/mediatek,ccorr.yam
[PATCH v13 2/3] dt-bindings: arm: mediatek: mmsys: Add OF graph support for board path
Document OF graph on MMSYS/VDOSYS: this supports up to three DDP paths per HW instance (so potentially up to six displays for multi-vdo SoCs). The MMSYS or VDOSYS is always the first component in the DDP pipeline, so it only supports an output port with multiple endpoints - where each endpoint defines the starting point for one of the (currently three) possible hardware paths. Reviewed-by: Rob Herring (Arm) Reviewed-by: Alexandre Mergnat Tested-by: Alexandre Mergnat Reviewed-by: CK Hu Tested-by: Michael Walle # on kontron-sbc-i1200 Signed-off-by: AngeloGioacchino Del Regno --- .../bindings/arm/mediatek/mediatek,mmsys.yaml | 28 +++ 1 file changed, 28 insertions(+) diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml index b3c6888c1457..3f4262e93c78 100644 --- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml +++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.yaml @@ -93,6 +93,34 @@ properties: '#reset-cells': const: 1 + port: +$ref: /schemas/graph.yaml#/properties/port +description: + Output port node. This port connects the MMSYS/VDOSYS output to + the first component of one display pipeline, for example one of + the available OVL or RDMA blocks. + Some MediaTek SoCs support multiple display outputs per MMSYS. +properties: + endpoint@0: +$ref: /schemas/graph.yaml#/properties/endpoint +description: Output to the primary display pipeline + + endpoint@1: +$ref: /schemas/graph.yaml#/properties/endpoint +description: Output to the secondary display pipeline + + endpoint@2: +$ref: /schemas/graph.yaml#/properties/endpoint +description: Output to the tertiary display pipeline + +anyOf: + - required: + - endpoint@0 + - required: + - endpoint@1 + - required: + - endpoint@2 + required: - compatible - reg -- 2.46.1
Re: [RFC PATCH v2 0/2] TEE subsystem for restricted dma-buf allocations
Hi Jens, On Tue, 15 Oct 2024 at 15:47, Jens Wiklander wrote: > > Hi, > > This patch set allocates the restricted DMA-bufs via the TEE subsystem. > This a complete rewrite compared to the previous patch set [1], and other > earlier proposals [2] and [3] with a separate restricted heap. > > The TEE subsystem handles the DMA-buf allocations since it is the TEE > (OP-TEE, AMD-TEE, TS-TEE, or a future QTEE) which sets up the restrictions > for the memory used for the DMA-bufs. Thanks for proposing this interface. IMHO, this solution will address many concerns raised for the prior vendor specific DMA heaps approach [1] as follows: 1. User-space interacting with the TEE subsystem for restricted memory allocation makes it obvious that the returned DMA buf can't be directly mapped by the CPU. 2. All the low level platform details gets abstracted out for user-space regarding how the platform specific memory restriction comes into play. 3. User-space doesn't have to deal with holding 2 DMA buffer references, one after allocation from DMA heap and other for communication with the TEE subsystem. 4. Allows for better co-ordination with other kernel subsystems dealing with restricted DMA-bufs. [1] https://lore.kernel.org/linux-arm-kernel/20240515112308.10171-1-yong...@mediatek.com/ > > I've added a new IOCTL, TEE_IOC_RSTMEM_ALLOC, to allocate the restricted > DMA-bufs. This new IOCTL reaches the backend TEE driver, allowing it to > choose how to allocate the restricted physical memory. > > TEE_IOC_RSTMEM_ALLOC is quite similar to TEE_IOC_SHM_ALLOC so it's tempting > to extend TEE_IOC_SHM_ALLOC with two new flags > TEE_IOC_SHM_FLAG_SECURE_VIDEO and TEE_IOC_SHM_FLAG_SECURE_TRUSTED_UI for > the same feature. However, it might be a bit confusing since > TEE_IOC_SHM_ALLOC only returns an anonymous file descriptor, but > TEE_IOC_SHM_FLAG_SECURE_VIDEO and TEE_IOC_SHM_FLAG_SECURE_TRUSTED_UI would > return a DMA-buf file descriptor instead. What do others think? 
I think it's better to keep it as a separate IOCTL given the primary objective of buffer allocation and it's usage. -Sumit > > This can be tested on QEMU with the following steps: > repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \ > -b prototype/sdp-v2 > repo sync -j8 > cd build > make toolchains -j4 > make all -j$(nproc) > make run-only > # login and at the prompt: > xtest --sdp-basic > > https://optee.readthedocs.io/en/latest/building/prerequisites.html > list dependencies needed to build the above. > > The tests are pretty basic, mostly checking that a Trusted Application in > the secure world can access and manipulate the memory. There are also some > negative tests for out of bounds buffers etc. > > Thanks, > Jens > > [1] > https://lore.kernel.org/lkml/20240830070351.2855919-1-jens.wiklan...@linaro.org/ > [2] > https://lore.kernel.org/dri-devel/20240515112308.10171-1-yong...@mediatek.com/ > [3] https://lore.kernel.org/lkml/20220805135330.970-1-olivier.ma...@nxp.com/ > > Changes since the V1 RFC: > * Based on v6.11 > * Complete rewrite, replacing the restricted heap with TEE_IOC_RSTMEM_ALLOC > > Changes since Olivier's post [2]: > * Based on Yong Wu's post [1] where much of dma-buf handling is done in > the generic restricted heap > * Simplifications and cleanup > * New commit message for "dma-buf: heaps: add Linaro restricted dmabuf heap > support" > * Replaced the word "secure" with "restricted" where applicable > > Jens Wiklander (2): > tee: add restricted memory allocation > optee: support restricted memory allocation > > drivers/tee/Makefile | 1 + > drivers/tee/optee/core.c | 21 > drivers/tee/optee/optee_private.h | 6 + > drivers/tee/optee/optee_smc.h | 35 ++ > drivers/tee/optee/smc_abi.c | 45 ++- > drivers/tee/tee_core.c| 33 - > drivers/tee/tee_private.h | 2 + > drivers/tee/tee_rstmem.c | 200 ++ > drivers/tee/tee_shm.c | 2 + > drivers/tee/tee_shm_pool.c| 69 ++- > include/linux/tee_core.h | 6 + > include/linux/tee_drv.h | 9 ++ > 
include/uapi/linux/tee.h | 33 - > 13 files changed, 455 insertions(+), 7 deletions(-) > create mode 100644 drivers/tee/tee_rstmem.c > > -- > 2.43.0 >
[PATCH] drm/bochs: Replace deprecated PCI implicit devres
bochs uses pcim_enable_device(), which causes pci_request_region() to implicitly set up devres callbacks which will release the region on driver detach. Despite this, the driver calls pci_release_regions() manually on driver teardown. Implicit devres has been deprecated in PCI in commit 81fcf28e74a3 ("PCI: Document hybrid devres hazards"). Replace the calls to pci_request_region() with ones to always-managed pcim_request_region(). Remove the unnecessary call to pci_release_regions(). Signed-off-by: Philipp Stanner --- drivers/gpu/drm/tiny/bochs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/tiny/bochs.c b/drivers/gpu/drm/tiny/bochs.c index 31fc5d839e10..888f12a67470 100644 --- a/drivers/gpu/drm/tiny/bochs.c +++ b/drivers/gpu/drm/tiny/bochs.c @@ -217,7 +217,7 @@ static int bochs_hw_init(struct drm_device *dev) if (pdev->resource[2].flags & IORESOURCE_MEM) { /* mmio bar with vga and bochs registers present */ - if (pci_request_region(pdev, 2, "bochs-drm") != 0) { + if (pcim_request_region(pdev, 2, "bochs-drm") != 0) { DRM_ERROR("Cannot request mmio region\n"); return -EBUSY; } @@ -258,7 +258,7 @@ static int bochs_hw_init(struct drm_device *dev) size = min(size, mem); } - if (pci_request_region(pdev, 0, "bochs-drm") != 0) + if (pcim_request_region(pdev, 0, "bochs-drm") != 0) DRM_WARN("Cannot request framebuffer, boot fb still active?\n"); bochs->fb_map = ioremap(addr, size); @@ -302,7 +302,7 @@ static void bochs_hw_fini(struct drm_device *dev) release_region(VBE_DISPI_IOPORT_INDEX, 2); if (bochs->fb_map) iounmap(bochs->fb_map); - pci_release_regions(to_pci_dev(dev->dev)); + drm_edid_free(bochs->drm_edid); } -- 2.47.0
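The idea behind managed ("devres") resources that this patch relies on can be modelled in plain userspace C. The sketch below is only an illustration under stated assumptions: struct fake_device, managed_add_action(), managed_detach() and the other names are hypothetical and are not the kernel's devres or PCI API. It shows why a region requested through a managed call needs no explicit release in the teardown path: the release callback is recorded at request time and runs automatically on detach.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model of one device-managed release action. */
struct managed_action {
	void (*release)(void *data);
	void *data;
	struct managed_action *next;
};

/* Hypothetical stand-in for a device; newest action sits at the list head. */
struct fake_device {
	struct managed_action *actions;
};

/* Record a release callback that will run automatically on detach. */
static inline int managed_add_action(struct fake_device *dev,
				     void (*release)(void *), void *data)
{
	struct managed_action *act = malloc(sizeof(*act));

	if (!act)
		return -1;
	act->release = release;
	act->data = data;
	act->next = dev->actions;
	dev->actions = act;
	return 0;
}

/* Run every recorded callback, newest first, and free the bookkeeping. */
static inline void managed_detach(struct fake_device *dev)
{
	while (dev->actions) {
		struct managed_action *act = dev->actions;

		dev->actions = act->next;
		act->release(act->data);
		free(act);
	}
}

/* Models releasing a region: clears the "held" flag passed as data. */
static inline void release_region_model(void *data)
{
	*(int *)data = 0;
}

/* Usage example: two regions are requested and nothing is released by hand. */
static inline void managed_example(void)
{
	struct fake_device dev = { 0 };
	int bar0_held = 1, bar2_held = 1;
	int ret;

	ret = managed_add_action(&dev, release_region_model, &bar2_held);
	assert(ret == 0);
	ret = managed_add_action(&dev, release_region_model, &bar0_held);
	assert(ret == 0);

	managed_detach(&dev);
	assert(!bar0_held && !bar2_held);
}
```

In this model, also releasing by hand in the teardown path would duplicate the release that detach already performs, which is the redundancy the patch removes from bochs_hw_fini().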
[PULL] drm-misc-next
Hi Dave, Simona, A new pull request for drm-misc-next. Cheers, Maarten drm-misc-next-2024-10-17: drm-misc-next for v6.13: Cross-subsystem Changes: - Small fixes to dma-buf. Core Changes: - Convert many drivers to use video aperture helpers and remove the DRM one. Driver Changes: - Add coredump, pantherlake support to accel/ivpu. - Assorted bugfixes to ivpu, edp-panel, bochs, gcc-15, panel/s6e3ha8. - Docbook fixes for TTM. - Add Samsung AMS581VF01 The following changes since commit 26bb2dc102783fef49336b26a94563318f9790d3: Merge tag 'drm-xe-next-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next (2024-10-11 08:01:16 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/drm/misc/kernel.git tags/drm-misc-next-2024-10-17 for you to fetch changes up to 134e71bd1edcc7252b64ca31efe88edfef86d784: drm/sched: Further optimise drm_sched_entity_push_job (2024-10-17 12:20:06 +0200) drm-misc-next for v6.13: Cross-subsystem Changes: - Small fixes to dma-buf. Core Changes: - Convert many drivers to use video aperture helpers and remove the DRM one. Driver Changes: - Add coredump, pantherlake support to accel/ivpu. - Assorted bugfixes to ivpu, edp-panel, bochs, gcc-15, panel/s6e3ha8. - Docbook fixes for TTM. 
- Add Samsung AMS581VF01 Aleksandrs Vinarskis (1): drm/edp-panel: Add panels used by Dell XPS 13 9345 Andrzej Kacprowski (4): accel/ivpu: Update VPU FW API headers accel/ivpu: Allow reading dvfs_mode debugfs file accel/ivpu: Add test_mode bit to force turbo accel/ivpu: Fix reset_engine debugfs file logic Arnd Bergmann (1): drm/panel: s6e3ha8: select CONFIG_DRM_DISPLAY_DSC_HELPER Brahmajit Das (1): drm/display: Fix building with GCC 15 Danila Tikhonov (2): dt-bindings: display: panel: Add Samsung AMS581VF01 drm/panel: Add Samsung AMS581VF01 panel driver Dmitry Baryshkov (1): drm/bridge: lt9611: use HDMI Connector helper to set InfoFrames Dzmitry Sankouski (1): drm/mipi-dsi: fix kernel doc on mipi_dsi_compression_mode_multi Jacek Lawrynowicz (11): accel/ivpu: Rename ivpu_log_level to fw_log_level accel/ivpu: Refactor functions in ivpu_fw_log.c accel/ivpu: Fix fw log printing accel/ivpu: Limit FW version string length accel/ivpu: Stop using hardcoded DRIVER_DATE accel/ivpu: Add auto selection logic for job scheduler accel/ivpu: Remove invalid warnings accel/ivpu: Increase MS info buffer size accel/ivpu: Fix ivpu_jsm_dyndbg_control() accel/ivpu: Remove HWS_EXTRA_EVENTS from test modes accel/ivpu: Fix typos in ivpu_pm.c Jakub Pawlak (1): accel/ivpu: Add tracing for IPC/PM/JOB Jeffrey Hugo (2): accel/qaic: Add ipc_router channel accel/qaic: Add AIC080 support Karol Wachowski (13): accel/ivpu: Add coredump support accel/ivpu: Set 500 ns delay between power island TRICKLE and ENABLE accel/ivpu: Turn on autosuspend on Simics accel/ivpu: Add FW version debugfs entry accel/ivpu: Remove 1-tile power up Simics workaround accel/ivpu: Add one jiffy to bo_wait_ioctl timeout value accel/ivpu: Print JSM message result in case of error accel/ivpu: Remove skip of clock own resource ack on Simics accel/ivpu: Prevent recovery invocation during probe and resume accel/ivpu: Refactor failure diagnostics during boot accel/ivpu: Do not fail on cmdq if failed to allocate preemption buffers 
accel/ivpu: Use whole user and shave ranges for preemption buffers accel/ivpu: Update power island delays Maaz Mombasawala (1): drm/vmwgfx: Stop using dev_private to store driver data. Maciej Falkowski (1): accel/ivpu: Add initial Panther Lake support Maíra Canal (1): MAINTAINERS: Add Maíra to VC4 reviewers Miguel Ojeda (1): drm/panic: Select ZLIB_DEFLATE for DRM_PANIC_SCREEN_QR_CODE Pintu Kumar (2): dma-buf: fix S_IRUGO to 0444, block comments, func declaration dma-buf/heaps: replace kmap_atomic with kmap_local_page Thomas Hellström (1): drm/ttm: Fix incorrect use of kernel-doc format Thomas Zimmermann (30): drm/bochs: Return error from correct pointer Merge drm/drm-next into drm-misc-next drm/amdgpu: Use video aperture helpers drm/arm/hdlcd: Use video aperture helpers drm/armada: Use video aperture helpers drm/ast: Use video aperture helpers drm/hisilicon/hibmc: Use video aperture helpers drm/hyperv-drm: Use video aperture helpers drm/i915: Use video aperture helpers drm/loongson: Use video aperture helpers drm/meson: Use video aperture helpers drm/mgag200: Use video aperture helpers drm/msm: Use video aperture helpers drm/nouveau: Use video aperture helpers drm/ofdrm: Use video a
Re: [PATCH 1/9] drm/panfrost: Replace DRM driver allocation method with newer one
On Tue, 15 Oct 2024 00:31:36 +0100 Adrián Larumbe wrote: > Drop the deprecated DRM driver allocation method in favour of > devm_drm_dev_alloc(). Overall just make it the same as in Panthor. > Also discard now superfluous generic and platform device pointers inside > the main panfrost device structure. > > Some ancient checkpatch issues unearthed as a result of these changes > were also fixed, like lines too long or double assignment in one line. > > Signed-off-by: Adrián Larumbe Acked-by: Boris Brezillon I didn't do a thorough review of the diff, but I'm assuming it's correct if the driver compiles. BTW, I don't see it done in this series, but it might be good to also turn devm_ calls into drmm_ ones, and dev_ into drm_.
Re: [PATCH v1 1/4] mm/hmm: HMM API for P2P DMA to device zone pages
On Thu, Oct 17, 2024 at 04:58:12AM -0700, Christoph Hellwig wrote:
> On Wed, Oct 16, 2024 at 02:44:45PM -0300, Jason Gunthorpe wrote:
> > > > FWIW, I've been expecting this series to be rebased on top of Leon's
> > > > new DMA API series so it doesn't have this issue..
> > >
> > > That's not going to make a difference at this level.
> >
> > I'm not sure what you are asking then.
> >
> > Patch 2 does pci_p2pdma_add_resource() and so a valid struct page with
> > a P2P ZONE_DEVICE type exists, and that gets returned back to the
> > hmm/odp code.
> >
> > Today odp calls dma_map_page() which only works by chance in limited
> > cases. With Leon's revision it will call hmm_dma_map_pfn() ->
> > dma_iova_link() which does call pci_p2pdma_map_type() and should do
> > the right thing.
>
> Again none of this affects the code posted here. It reshuffles the
> callers but has no direct affect on the patches posted here.

I didn't realize till last night that Leon's series did not have P2P
support.

What I'm trying to say is that this is a multi-series project.

A followup based on Leon's initial work will get the ODP DMA mapping
path able to support ZONE_DEVICE P2P pages.

Once that is done, this series sits on top. This series is only about
hmm and effectively allows hmm_range_fault() to return a ZONE_DEVICE
P2P page.

Yonatan should explain this better in the cover letter and mark it as
an RFC series.

So, I know we are still figuring out the P2P support on the DMA API
side, but my expectation for hmm is that hmm_range_fault() returning a
ZONE_DEVICE P2P page is going to be what we want.

> (and the current DMA series lacks P2P support, I'm trying to figure
> out how to properly handle it at the moment).

Yes, I see, I looked through those patches last night and there is a
gap there.

Broadly I think whatever flow NVMe uses for P2P will apply to ODP as
well.

Thanks,
Jason
Re: [PATCH v2 1/4] drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic()
On 2024/10/17 20:13, Maxime Ripard wrote: > On Thu, Oct 17, 2024 at 09:33:07AM GMT, Jinjie Ruan wrote: diff --git a/include/drm/drm_kunit_helpers.h b/include/drm/drm_kunit_helpers.h index e7cc17ee4934..1e7fd4be550c 100644 --- a/include/drm/drm_kunit_helpers.h +++ b/include/drm/drm_kunit_helpers.h @@ -4,6 +4,7 @@ #define DRM_KUNIT_HELPERS_H_ #include +#include #include @@ -120,4 +121,9 @@ drm_kunit_helper_create_crtc(struct kunit *test, const struct drm_crtc_funcs *funcs, const struct drm_crtc_helper_funcs *helper_funcs); +struct drm_display_mode * +drm_kunit_helper_display_mode_from_cea_vic(struct kunit *test, + struct drm_device *dev, + u8 video_code); >>> >>> It's not clear to me what you need the drm_edid header, you just return >>> a drm_display_mode pointer so you can just forward declare the structure >> >> >> There is a compile error without the header,because there is no >> "drm_display_mode_from_cea_vic()" declare. >> >> drivers/gpu/drm/tests/drm_kunit_helpers.c:341:16: error: implicit >> declaration of function ‘drm_display_mode_from_cea_vic’; did you mean >> ‘drm_kunit_display_mode_from_cea_vic’? >> [-Werror=implicit-function-declaration] >> 341 | mode = drm_display_mode_from_cea_vic(dev, video_code); >> |^ >> |drm_kunit_display_mode_from_cea_vic >> drivers/gpu/drm/tests/drm_kunit_helpers.c:341:14: warning: assignment to >> ‘struct drm_display_mode *’ from ‘int’ makes pointer from integer >> without a cast [-Wint-conversion] >> 341 | mode = drm_display_mode_from_cea_vic(dev, video_code); >> | ^ > > Right, but the error is in the C file, not the header. Yes, I have updated it to C file in V3, thank you! > > Maxime
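The point about forward declarations can be shown with a single-file sketch. All names here (struct demo_mode, demo_mode_from_code(), demo_modes) are hypothetical and are not the DRM or KUnit helper API; the two table entries are only sample data. The first part is what a header needs when it merely passes a pointer around, and the second part is what the C file needs once it looks inside the structure or calls into the code that defines it.

```c
#include <assert.h>
#include <stddef.h>

/* Header side: an incomplete type is enough for a pointer return value. */
struct demo_mode;
const struct demo_mode *demo_mode_from_code(unsigned char code);

/* Source side: the full definition is only required where fields are used. */
struct demo_mode {
	unsigned char code;
	int hdisplay;
	int vdisplay;
};

/* Sample data for the illustration only. */
static const struct demo_mode demo_modes[] = {
	{ 1, 640, 480 },
	{ 16, 1920, 1080 },
};

/* Look up a mode by its code; returns NULL when the code is unknown. */
const struct demo_mode *demo_mode_from_code(unsigned char code)
{
	size_t i;

	for (i = 0; i < sizeof(demo_modes) / sizeof(demo_modes[0]); i++) {
		if (demo_modes[i].code == code)
			return &demo_modes[i];
	}
	return NULL;
}

/* Usage example: callers that dereference the result need the definition. */
static inline void demo_mode_example(void)
{
	const struct demo_mode *mode = demo_mode_from_code(16);

	assert(mode != NULL);
	assert(mode->hdisplay == 1920 && mode->vdisplay == 1080);
}
```

Keeping the extra include in the source file rather than the header means users of the header do not pull in declarations they never need.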
Re: [PATCH v1 1/4] mm/hmm: HMM API for P2P DMA to device zone pages
On Thu, Oct 17, 2024 at 12:58:48PM +1100, Alistair Popple wrote:
> Actually I think the rule should be don't look at the page at
> all. hmm_range_fault() is about mirroring PTEs, no assumption should
> even be made about the existence or otherwise of a struct page.

We are not there yet..

> > We don't need to enforce, it we don't know what else the driver will
> > want to use that P2P page for after all. It might stick it in a VMA
> > for some unrelated reason.
>
> And wouldn't that touch the refcount and therefore be wrong?

I mean the originating driver would do that

Jason
[PULL] drm-misc-fixes
Hi Dave, Sima,

here are the fixes from the misc tree for this week.

Best regards
Thomas

drm-misc-fixes-2024-10-17:
Short summary of fixes pull:

ast:
- Clear EDID on unplugged connectors

host1x:
- Fix boot on Tegra186
- Set DMA parameters

mgag200:
- Revert VBLANK support

panel:
- himax-hx83102: Adjust power and gamma

qaic:
- Sgtable loop fixes

vmwgfx:
- Limit display layout allocation size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()

The following changes since commit fcddc71ec7ecf15b4df3c41288c9cf0b8e886111:

  drm/fbdev-dma: Only cleanup deferred I/O if necessary (2024-10-10 09:49:25 +0200)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/misc/kernel.git tags/drm-misc-fixes-2024-10-17

for you to fetch changes up to c09c4f2a972ca7cd9c8926594aa2099bcbcd3b79:

  drm/ast: vga: Clear EDID if no display is connected (2024-10-17 08:50:14 +0200)

Cong Yang (1):
      drm/panel: himax-hx83102: Adjust power and gamma to optimize brightness

Ian Forbes (2):
      drm/vmwgfx: Limit display layout ioctl array size to VMWGFX_NUM_DISPLAY_UNITS
      drm/vmwgfx: Handle possible ENOMEM in vmw_stdu_connector_atomic_check

Jon Hunter (1):
      gpu: host1x: Fix boot regression for Tegra

Nikolay Kuratov (1):
      drm/vmwgfx: Handle surface check failure correctly

Pranjal Ramajor Asha Kanojiya (1):
      accel/qaic: Fix the for loop used to walk SG table

Thierry Reding (1):
      gpu: host1x: Set up device DMA parameters

Thomas Zimmermann (3):
      Revert "drm/mgag200: Add vblank support"
      drm/ast: sil164: Clear EDID if no display is connected
      drm/ast: vga: Clear EDID if no display is connected

Thorsten Blum (1):
      drm/vmwgfx: Remove unnecessary NULL checks before kvfree()

Zack Rusin (1):
      drm/vmwgfx: Cleanup kms setup without 3d

 drivers/accel/qaic/qaic_control.c           |  2 +-
 drivers/accel/qaic/qaic_data.c              |  6 +--
 drivers/gpu/drm/ast/ast_sil164.c            |  2 +
 drivers/gpu/drm/ast/ast_vga.c               |  2 +
 drivers/gpu/drm/mgag200/mgag200_drv.c       | 38 --
 drivers/gpu/drm/mgag200/mgag200_drv.h       | 14 +-
 drivers/gpu/drm/mgag200/mgag200_g200.c      |  5 --
 drivers/gpu/drm/mgag200/mgag200_g200eh.c    |  5 --
 drivers/gpu/drm/mgag200/mgag200_g200eh3.c   |  5 --
 drivers/gpu/drm/mgag200/mgag200_g200er.c    | 10 +---
 drivers/gpu/drm/mgag200/mgag200_g200ev.c    | 10 +---
 drivers/gpu/drm/mgag200/mgag200_g200ew3.c   |  5 --
 drivers/gpu/drm/mgag200/mgag200_g200se.c    | 10 +---
 drivers/gpu/drm/mgag200/mgag200_g200wb.c    |  5 --
 drivers/gpu/drm/mgag200/mgag200_mode.c      | 77 +
 drivers/gpu/drm/panel/panel-himax-hx83102.c | 12 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_blit.c        |  6 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h         |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c         | 34 ++---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.h         |  3 --
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c        |  4 ++
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c     |  9 ++--
 drivers/gpu/host1x/context.c                |  1 +
 drivers/gpu/host1x/dev.c                    | 18 +++
 include/linux/host1x.h                      |  1 +
 25 files changed, 48 insertions(+), 240 deletions(-)

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Re: [PATCH v3] drm/mediatek: Fix color format MACROs in OVL
On 16/10/24 16:17, Hsin-Te Yuan wrote:
> In commit 9f428b95ac89 ("drm/mediatek: Add new color format MACROs in
> OVL"), some new color formats are defined in the MACROs to make the
> switch statement more concise. That commit was intended to be a no-op
> cleanup. However, there are typos in these format MACROs, which cause
> the return value to be incorrect. Fix the typos to ensure the return
> value remains unchanged.
>
> Fixes: 9f428b95ac89 ("drm/mediatek: Add new color format MACROs in OVL")
> Reviewed-by: Douglas Anderson
> Reviewed-by: Matthias Brugger
> Signed-off-by: Hsin-Te Yuan

Reviewed-by: AngeloGioacchino Del Regno
Re: [PATCH v1 1/4] mm/hmm: HMM API for P2P DMA to device zone pages
On Thu, Oct 17, 2024 at 06:12:55AM -0700, Christoph Hellwig wrote:
> On Thu, Oct 17, 2024 at 10:05:39AM -0300, Jason Gunthorpe wrote:
> > Broadly I think whatever flow NVMe uses for P2P will apply to ODP as
> > well.
>
> ODP is a lot simpler than NVMe for P2P actually :(

What is your thinking there? I'm looking at the latest patches and I
would expect dma_iova_init() to accept a phys so it can call
pci_p2pdma_map_type() once for the whole transaction. It is a slow
operation.

Based on the result of pci_p2pdma_map_type() it would have to take one
of three paths: direct, iommu, or acs/switch. It feels like
dma_map_page() should become a new function that takes in the state
and then it can do direct or acs based on the type held in the state.

ODP would have to refresh the state for each page, but could follow
the same code structure.

Jason
Re: [RFC PATCH v2 2/2] optee: support restricted memory allocation
Hi Jens, On Tue, 15 Oct 2024 at 15:47, Jens Wiklander wrote: > > Add support in the OP-TEE backend driver for restricted memory > allocation. The support is limited to only the SMC ABI and for secure > video buffers. > > OP-TEE is probed for the range of restricted physical memory and a > memory pool allocator is initialized if OP-TEE have support for such > memory. > > Signed-off-by: Jens Wiklander > --- > drivers/tee/optee/core.c | 21 +++ > drivers/tee/optee/optee_private.h | 6 + > drivers/tee/optee/optee_smc.h | 35 > drivers/tee/optee/smc_abi.c | 45 --- > 4 files changed, 104 insertions(+), 3 deletions(-) > > diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c > index 39e688d4e974..b6d5cbc6728d 100644 > --- a/drivers/tee/optee/core.c > +++ b/drivers/tee/optee/core.c > @@ -95,6 +95,25 @@ void optee_release_supp(struct tee_context *ctx) > optee_supp_release(&optee->supp); > } > > +int optee_rstmem_alloc(struct tee_context *ctx, struct tee_shm *shm, > + u32 flags, size_t size) > +{ > + struct optee *optee = tee_get_drvdata(ctx->teedev); > + > + if (!optee->sdp_pool) > + return -EINVAL; > + if (flags != TEE_IOC_FLAG_SECURE_VIDEO) > + return -EINVAL; > + return optee->sdp_pool->ops->alloc(optee->sdp_pool, shm, size, 0); > +} > + > +void optee_rstmem_free(struct tee_context *ctx, struct tee_shm *shm) > +{ > + struct optee *optee = tee_get_drvdata(ctx->teedev); > + > + optee->sdp_pool->ops->free(optee->sdp_pool, shm); > +} > + > void optee_remove_common(struct optee *optee) > { > /* Unregister OP-TEE specific client devices on TEE bus */ > @@ -111,6 +130,8 @@ void optee_remove_common(struct optee *optee) > tee_device_unregister(optee->teedev); > > tee_shm_pool_free(optee->pool); > + if (optee->sdp_pool) > + optee->sdp_pool->ops->destroy_pool(optee->sdp_pool); > optee_supp_uninit(&optee->supp); > mutex_destroy(&optee->call_queue.mutex); > } > diff --git a/drivers/tee/optee/optee_private.h > b/drivers/tee/optee/optee_private.h > index 
424898cdc4e9..1f6b2cc992a9 100644 > --- a/drivers/tee/optee/optee_private.h > +++ b/drivers/tee/optee/optee_private.h > @@ -200,6 +200,7 @@ struct optee_ops { > * @notif: notification synchronization struct > * @supp: supplicant synchronization struct for RPC to > supplicant > * @pool: shared memory pool > + * @sdp_pool: restricted memory pool for secure data path > * @rpc_param_count: If > 0 number of RPC parameters to make room for > * @scan_bus_done flag if device registation was already done. > * @scan_bus_work workq to scan optee bus and register optee drivers > @@ -218,6 +219,7 @@ struct optee { > struct optee_notif notif; > struct optee_supp supp; > struct tee_shm_pool *pool; > + struct tee_shm_pool *sdp_pool; > unsigned int rpc_param_count; > bool scan_bus_done; > struct work_struct scan_bus_work; > @@ -340,6 +342,10 @@ void optee_rpc_cmd(struct tee_context *ctx, struct optee > *optee, > int optee_do_bottom_half(struct tee_context *ctx); > int optee_stop_async_notif(struct tee_context *ctx); > > +int optee_rstmem_alloc(struct tee_context *ctx, struct tee_shm *shm, > + u32 flags, size_t size); > +void optee_rstmem_free(struct tee_context *ctx, struct tee_shm *shm); > + > /* > * Small helpers > */ > diff --git a/drivers/tee/optee/optee_smc.h b/drivers/tee/optee/optee_smc.h > index 7d9fa426505b..c3b8a1c204af 100644 > --- a/drivers/tee/optee/optee_smc.h > +++ b/drivers/tee/optee/optee_smc.h > @@ -234,6 +234,39 @@ struct optee_smc_get_shm_config_result { > unsigned long settings; > }; > > +/* > + * Get Secure Data Path memory config > + * > + * Returns the Secure Data Path memory config. 
> + * > + * Call register usage: > + * a0 SMC Function ID, OPTEE_SMC_GET_SDP_CONFIG > + * a2-6 Not used, must be zero > + * a7 Hypervisor Client ID register > + * > + * Have config return register usage: > + * a0 OPTEE_SMC_RETURN_OK > + * a1 Physical address of start of SDP memory > + * a2 Size of SDP memory > + * a3 Not used > + * a4-7 Preserved > + * > + * Not available register usage: > + * a0 OPTEE_SMC_RETURN_ENOTAVAIL > + * a1-3 Not used > + * a4-7 Preserved > + */ > +#define OPTEE_SMC_FUNCID_GET_SDP_CONFIG20 > +#define OPTEE_SMC_GET_SDP_CONFIG \ > + OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_GET_SDP_CONFIG) > + > +struct optee_smc_get_sdp_config_result { > + unsigned long status; > + unsigned long start; > + unsigned long size; > + unsigned long flags; > +}; > + > /* > * Exchanges capabilities between normal world and secure world > * > @@ -278,6 +311,8 @@ struct optee_smc_get_shm_config_result { > #define OP
Re: [PATCH v2 1/4] drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic()
On Thu, Oct 17, 2024 at 09:33:07AM GMT, Jinjie Ruan wrote:
> >> diff --git a/include/drm/drm_kunit_helpers.h b/include/drm/drm_kunit_helpers.h
> >> index e7cc17ee4934..1e7fd4be550c 100644
> >> --- a/include/drm/drm_kunit_helpers.h
> >> +++ b/include/drm/drm_kunit_helpers.h
> >> @@ -4,6 +4,7 @@
> >>  #define DRM_KUNIT_HELPERS_H_
> >>
> >>  #include
> >> +#include
> >>
> >>  #include
> >>
> >> @@ -120,4 +121,9 @@ drm_kunit_helper_create_crtc(struct kunit *test,
> >>  			     const struct drm_crtc_funcs *funcs,
> >>  			     const struct drm_crtc_helper_funcs *helper_funcs);
> >>
> >> +struct drm_display_mode *
> >> +drm_kunit_helper_display_mode_from_cea_vic(struct kunit *test,
> >> +					   struct drm_device *dev,
> >> +					   u8 video_code);
> >
> > It's not clear to me what you need the drm_edid header, you just return
> > a drm_display_mode pointer so you can just forward declare the structure
>
> There is a compile error without the header, because there is no
> "drm_display_mode_from_cea_vic()" declare.
>
> drivers/gpu/drm/tests/drm_kunit_helpers.c:341:16: error: implicit
> declaration of function ‘drm_display_mode_from_cea_vic’; did you mean
> ‘drm_kunit_display_mode_from_cea_vic’?
> [-Werror=implicit-function-declaration]
>   341 |  mode = drm_display_mode_from_cea_vic(dev, video_code);
>       |         ^
>       |         drm_kunit_display_mode_from_cea_vic
> drivers/gpu/drm/tests/drm_kunit_helpers.c:341:14: warning: assignment to
> ‘struct drm_display_mode *’ from ‘int’ makes pointer from integer
> without a cast [-Wint-conversion]
>   341 |  mode = drm_display_mode_from_cea_vic(dev, video_code);
>       |       ^

Right, but the error is in the C file, not the header.

Maxime
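The point about the header versus the C file can be illustrated with a
small sketch. This is not the DRM code: every ex_* name below is
hypothetical, and the "header" and ".c" halves are folded into one file
only so the example compiles on its own. It assumes nothing beyond
standard C: a prototype that only returns a pointer needs just a forward
declaration of the struct, while the full definition (or the include
that provides it) belongs where the struct is actually used.

```c
#include <stddef.h>

/* ---- what the header would carry (hypothetical names) ---- */

/* A forward declaration is enough for a prototype that only
 * passes or returns a pointer to the struct. */
struct ex_mode;

struct ex_mode *ex_helper_mode_from_vic(unsigned char vic);
int ex_mode_vic(const struct ex_mode *mode);

/* ---- what the .c file would carry ---- */

/* The full definition lives where members are accessed. In a real
 * tree this is where the extra #include would go. */
struct ex_mode {
	unsigned char vic;
};

static struct ex_mode ex_modes[] = { { 1 }, { 16 } };

/* Look up a mode by its (illustrative) video code; NULL if unknown. */
struct ex_mode *ex_helper_mode_from_vic(unsigned char vic)
{
	size_t i;

	for (i = 0; i < sizeof(ex_modes) / sizeof(ex_modes[0]); i++)
		if (ex_modes[i].vic == vic)
			return &ex_modes[i];
	return NULL;
}

int ex_mode_vic(const struct ex_mode *mode)
{
	return mode->vic;
}
```

Callers that only store or forward the returned pointer never need the
full struct definition, which keeps the header's include list short.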
Re: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic()
Hi Jinjie, kernel test robot noticed the following build errors: [auto build test ERROR on drm-misc/drm-misc-next] [also build test ERROR on linus/master v6.12-rc3 next-20241017] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Jinjie-Ruan/drm-connector-hdmi-Fix-memory-leak-in-drm_display_mode_from_cea_vic/20241014-152022 base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next patch link: https://lore.kernel.org/r/20241014071632.989108-2-ruanjinjie%40huawei.com patch subject: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic() config: arm-randconfig-002-20241017 (https://download.01.org/0day-ci/archive/20241017/202410172046.2w97yglm-...@intel.com/config) compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241017/202410172046.2w97yglm-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202410172046.2w97yglm-...@intel.com/ All errors (new ones prefixed by >>): >> drivers/gpu/drm/tests/drm_connector_test.c:1008:24: error: passing 'const >> struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' >> discards qualifiers >> [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode); ^ drivers/gpu/drm/tests/drm_connector_test.c:1031:24: error: passing 'const struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode); ^ drivers/gpu/drm/tests/drm_connector_test.c:1051:24: error: passing 'const struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode); ^ drivers/gpu/drm/tests/drm_connector_test.c:1074:24: error: passing 'const struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode); ^ drivers/gpu/drm/tests/drm_connector_test.c:1094:24: error: 
passing 'const struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode); ^ drivers/gpu/drm/tests/drm_connector_test.c:1117:24: error: passing 'const struct drm_display_mode *' to parameter of type 'struct drm_display_mode *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers] drm_mode_destroy(drm, mode); ^~~~ include/drm/drm_modes.h:456:72: note: passing argument to parameter 'mode' here void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode);
Re: [PATCH v1 1/4] mm/hmm: HMM API for P2P DMA to device zone pages
On Thu, Oct 17, 2024 at 06:49:30AM -0700, Christoph Hellwig wrote:
> On Thu, Oct 17, 2024 at 10:46:44AM -0300, Jason Gunthorpe wrote:
> > On Thu, Oct 17, 2024 at 06:12:55AM -0700, Christoph Hellwig wrote:
> > > On Thu, Oct 17, 2024 at 10:05:39AM -0300, Jason Gunthorpe wrote:
> > > > Broadly I think whatever flow NVMe uses for P2P will apply to ODP as
> > > > well.
> > >
> > > ODP is a lot simpler than NVMe for P2P actually :(
> >
> > What is your thinking there? I'm looking at the latest patches and I
> > would expect dma_iova_init() to accept a phys so it can call
> > pci_p2pdma_map_type() once for the whole transaction. It is a slow
> > operation.
>
> You can't do it for the whole transaction. Here is my suggestion
> for ODP:
>
> http://git.infradead.org/?p=users/hch/misc.git;a=shortlog;h=refs/heads/dma-split-wip

OK, this looks very promising. I sketched something similar to the
pci-p2pdma changes a while back too.

BTW this:

  iommu: generalize the batched sync after map interface

I am hoping to go in a direction of adding a gather to the map, just
like unmap. So eventually instead of open coding iotlb_sync_map() you'd
flush the gather and it would do it.

> For NVMe I need to figure out a way to split bios on a per P2P
> type boundary as we don't have any space to record if something is a bus
> mapped address.

Yeah this came up before :\ Can't precompute the p2p type during bio
creation, splitting based on pgmap would be good enough.

Jason
Re: [PATCH v6 01/14] drm/panthor: Add uAPI
Hi Erik,

On 17/10/2024 09:51, Erik Faye-Lund wrote:
On Wed, 2024-10-16 at 16:18 +0200, Boris Brezillon wrote:
On Wed, 16 Oct 2024 16:05:55 +0200 Erik Faye-Lund wrote:
On Wed, 2024-10-16 at 15:02 +0100, Robin Murphy wrote:
On 2024-10-16 2:50 pm, Erik Faye-Lund wrote:
On Wed, 2024-10-16 at 15:16 +0200, Erik Faye-Lund wrote:
On Thu, 2024-02-29 at 17:22 +0100, Boris Brezillon wrote:

+/**
+ * enum drm_panthor_sync_op_flags - Synchronization operation flags.
+ */
+enum drm_panthor_sync_op_flags {
+	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK: Synchronization handle type mask. */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_MASK = 0xff,
+
+	/** @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ: Synchronization object type. */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_SYNCOBJ = 0,
+
+	/**
+	 * @DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ: Timeline synchronization
+	 * object type.
+	 */
+	DRM_PANTHOR_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ = 1,
+
+	/** @DRM_PANTHOR_SYNC_OP_WAIT: Wait operation. */
+	DRM_PANTHOR_SYNC_OP_WAIT = 0 << 31,
+
+	/** @DRM_PANTHOR_SYNC_OP_SIGNAL: Signal operation. */
+	DRM_PANTHOR_SYNC_OP_SIGNAL = (int)(1u << 31),

Why do we cast to int here? 1u << 31 doesn't fit in a 32-bit signed
integer, so isn't this undefined behavior in C?

Seems this was proposed here:
https://lore.kernel.org/dri-devel/89be8f8f-7c4e-4efd-0b7b-c30bcfbf1...@arm.com/

...that kinda sounds like bad advice to me.

Also, it's been pointed out to me elsewhere that this isn't
*technically speaking* undefined, it's "implementation defined". But as
far as kernel interfaces goes, that's pretty much the same; we can't
guarantee that the kernel and the user-space is using the same
implementation.

Here's the quote from the C99 spec, section 6.3.1.3 "Signed and
unsigned integers":

"""
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised
"""

I think a better approach would be to use -1 << 31, which is
well-defined. But the problem then becomes assigning it into
drm_panthor_sync_op::flags in a well-defined way... Could we make the
field signed? That seems a bit bad as well...

Is that a problem? Signed->unsigned conversion is always well-defined
(6.3.1.3 again), since it doesn't depend on how the signed type
represents negatives.

Robin.

Ah, you're right. So that could fix the problem, indeed.

On the other hand, I hate the idea of having -1 << 31 to encode
bit31-set. That's even worse for DRM_PANTHOR_VM_BIND_OP_TYPE_xxx when
we'll reach a value above 0x7, because then the negative value is hard
to map to its unsigned representation. If we really care about this
corner case, I'd rather go full-defines for flags and call it a day.

Yeah, I suppose it can get ugly for some other cases. If we rule that
out, I think there's only two options I can think of left:

1. Using #defines instead, like Boris suggested
2. Using 64 bit signed enums (e.g "1ll << 31" instead)

Again, #2 here would be the smaller change. But I kinda think I lean
towards #1, because...

These aren't really enumerators. They are flags.

...Yeah, sure. In C the practical difference isn't huge. But if we ever
wanted to support using these enums from C++ code, we'd need to add
overloaded operators, because C++ doesn't allow ORing together enums
out of the box.

That's only true for enum classes. Plain ol' enums' values can be ORed
at will (but you will need to `static_cast` them back to the enum type,
admittedly, because they auto-"promote" to int for the arithmetic op).
I've had to use uAPI from C++ and the most painless approach, once you
finish writing it, is to wrap the whole uAPI in C++ constructs anyway.
So I wouldn't consider that angle, personally.

I'm not saying I have any plans on using the uAPI from C++, just saying
that if we're going to tackle this, we might as well tackle it
completely...

Also, expanding the enum-type to 64 bits might have some additional
consequences, like needlessly needing more stack-space to pass values
around etc.

Thoughts? Surely there must be some precedent for using the top bit for
flags in the kernel, no?

-- 
Mihail Atanassov
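As a rough illustration of option 1 from the discussion above (plain
#defines instead of enumerators), here is a minimal sketch. The EX_*
names are hypothetical and are not the panthor uAPI; the only assumption
is that the flags field is a 32-bit unsigned integer. Because every
constant is unsigned from the start, bit 31 never passes through a
signed type and no cast is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag layout, modelled on the enum quoted above. */
#define EX_SYNC_OP_HANDLE_TYPE_MASK              UINT32_C(0xff)
#define EX_SYNC_OP_HANDLE_TYPE_SYNCOBJ           UINT32_C(0)
#define EX_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ  UINT32_C(1)
#define EX_SYNC_OP_WAIT                          (UINT32_C(0) << 31)
#define EX_SYNC_OP_SIGNAL                        (UINT32_C(1) << 31)

/* Compile-time check: bit 31 is set and the value is still unsigned. */
_Static_assert(EX_SYNC_OP_SIGNAL == UINT32_C(0x80000000),
	       "signal flag must be bit 31");

/* Returns non-zero when the signal bit is set in a flags word. */
static inline int ex_sync_op_is_signal(uint32_t flags)
{
	return (flags & EX_SYNC_OP_SIGNAL) != 0;
}

/* Extracts the handle type stored in the low byte. */
static inline uint32_t ex_sync_op_handle_type(uint32_t flags)
{
	return flags & EX_SYNC_OP_HANDLE_TYPE_MASK;
}
```

A caller would build a flags word with an ordinary OR, for example
EX_SYNC_OP_SIGNAL | EX_SYNC_OP_HANDLE_TYPE_TIMELINE_SYNCOBJ, and the
result stays a uint32_t throughout.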
Re: [PATCH v3 00/23] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+
Hi Maxime

On 10/17/2024 7:31 AM, Maxime Ripard wrote:
> On Wed, Oct 16, 2024 at 06:21:06PM GMT, Jessica Zhang wrote:
>> Changes in v3:
>> - Dropped support for CWB on DP connectors for now
>> - Dropped unnecessary PINGPONG array in *_setup_cwb()
>> - Add a check to make sure CWB and CDM aren't supported simultaneously
>>   (Dmitry)
>> - Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry)
>> - Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani)
>> - Dropped duplicate error message for reserving CWB resources (Dmitry)
>> - Added notes in framework changes about posting a separate series to
>>   add proper KUnit tests (Maxime)
>
> I mean, I asked for kunit tests, not for a note that is going to be
> dropped when applying.
>
> Maxime

The framework changes won't be applied without an ack from you, or in
other words until we add the KUnit tests :)

The series was re-pushed to get acks on all other MSM changes and keep
this series ready for validation by other developers and interested
parties. That way only KUnit will be the pending item.

Based on cycles, one of us will add the KUnit tests and we can either
link them to this series or absorb them into this series itself when
they are ready.

Thanks

Abhinav
[RFC PATCH 1/1] drm/ttm, drm/xe: Add ttm_bo_access
Non-contiguous VRAM cannot be mapped in Xe nor can non-visible VRAM
easily be accessed. Add ttm_bo_access, which is similar to
ttm_bo_vm_access, to access such memory.

Only visible VRAM access is supported at the moment, but a follow-up
can add GPU access to non-visible VRAM.

Suggested-by: Thomas Hellström
Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 20 +-
 drivers/gpu/drm/xe/xe_bo.c      | 48 +
 include/drm/ttm/ttm_bo.h        |  2 ++
 3 files changed, 64 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 2c699ed1963a..b53cc064da44 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -405,13 +405,9 @@ static int ttm_bo_vm_access_kmap(struct ttm_buffer_object *bo,
 	return len;
 }
 
-int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
-		     void *buf, int len, int write)
+int ttm_bo_access(struct ttm_buffer_object *bo, unsigned long offset,
+		  void *buf, int len, int write)
 {
-	struct ttm_buffer_object *bo = vma->vm_private_data;
-	unsigned long offset = (addr) - vma->vm_start +
-		((vma->vm_pgoff - drm_vma_node_start(&bo->base.vma_node))
-		 << PAGE_SHIFT);
 	int ret;
 
 	if (len < 1 || (offset + len) > bo->base.size)
@@ -439,6 +435,18 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
 
 	return ret;
 }
+EXPORT_SYMBOL(ttm_bo_access);
+
+int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
+		     void *buf, int len, int write)
+{
+	struct ttm_buffer_object *bo = vma->vm_private_data;
+	unsigned long offset = (addr) - vma->vm_start +
+		((vma->vm_pgoff - drm_vma_node_start(&bo->base.vma_node))
+		 << PAGE_SHIFT);
+
+	return ttm_bo_access(bo, offset, buf, len, write);
+}
 EXPORT_SYMBOL(ttm_bo_vm_access);
 
 static const struct vm_operations_struct ttm_bo_vm_ops = {
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 5b232f2951b1..267f3b03a6d0 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -,6 +,53 @@ static void xe_ttm_bo_swap_notify(struct ttm_buffer_object *ttm_bo)
 	}
 }
 
+static int xe_ttm_access_memory(struct ttm_buffer_object *ttm_bo,
+				unsigned long offset, void *buf, int len,
+				int write)
+{
+	struct xe_bo *bo = ttm_to_xe_bo(ttm_bo);
+	struct xe_device *xe = ttm_to_xe_device(ttm_bo->bdev);
+	struct iosys_map vmap;
+	struct xe_res_cursor cursor;
+	struct xe_mem_region *vram;
+	void __iomem *virtual;
+	int bytes_left = len;
+
+	xe_bo_assert_held(bo);
+
+	if (!mem_type_is_vram(ttm_bo->resource->mem_type))
+		return -EIO;
+
+	/* FIXME: Use GPU for non-visible VRAM */
+	if (!(bo->flags & XE_BO_FLAG_NEEDS_CPU_ACCESS))
+		return -EINVAL;
+
+	vram = res_to_mem_region(ttm_bo->resource);
+	xe_res_first(ttm_bo->resource, offset & ~PAGE_MASK, 0, &cursor);
+
+	do {
+		int wcount = PAGE_SIZE - (offset & PAGE_MASK) > bytes_left ?
+			bytes_left : PAGE_SIZE - (offset & PAGE_MASK);
+
+		virtual = (u8 __force *)vram->mapping + cursor.start;
+
+		iosys_map_set_vaddr_iomem(&vmap, (void __iomem *)virtual);
+		if (write)
+			xe_map_memcpy_to(xe, &vmap, offset & PAGE_MASK, buf,
+					 wcount);
+		else
+			xe_map_memcpy_from(xe, buf, &vmap, offset & PAGE_MASK,
+					   wcount);
+
+		offset += wcount;
+		buf += wcount;
+		bytes_left -= wcount;
+		xe_res_next(&cursor, PAGE_SIZE);
+	} while (bytes_left);
+
+	return len;
+}
+
 const struct ttm_device_funcs xe_ttm_funcs = {
 	.ttm_tt_create = xe_ttm_tt_create,
 	.ttm_tt_populate = xe_ttm_tt_populate,
@@ -1120,6 +1167,7 @@ const struct ttm_device_funcs xe_ttm_funcs = {
 	.move = xe_bo_move,
 	.io_mem_reserve = xe_ttm_io_mem_reserve,
 	.io_mem_pfn = xe_ttm_io_mem_pfn,
+	.access_memory = xe_ttm_access_memory,
 	.release_notify = xe_ttm_bo_release_notify,
 	.eviction_valuable = ttm_bo_eviction_valuable,
 	.delete_mem_notify = xe_ttm_bo_delete_mem_notify,
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 5804408815be..8ea11cd8df39 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -421,6 +421,8 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo);
 int ttm_bo_evict_first(struct ttm_device *bdev,
 		       struct ttm_resource_manager *man,
 		       struct ttm_operation_ctx *ctx);
+int ttm_bo_access(struct ttm_
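The access loop in the patch above splits a byte range into chunks that
never cross a page boundary. The arithmetic behind that kind of loop can
be sketched in plain userspace C as below. This is only a model under
stated assumptions: EX_PAGE_SIZE and the ex_* helpers are hypothetical,
a 4 KiB page is assumed, and no Xe or TTM interface is used. One detail
worth keeping in mind when reading kernel code is that PAGE_MASK is
defined as ~(PAGE_SIZE - 1), so the offset inside a page is
offset & ~PAGE_MASK; the sketch spells the mask out explicitly to avoid
any ambiguity.

```c
#include <stddef.h>

/* Assumed page size for the model; the kernel's PAGE_SIZE varies. */
#define EX_PAGE_SIZE 4096ul

/* Length of the next chunk: whatever is left in the current page,
 * capped by the number of bytes still to transfer. */
static size_t ex_chunk_len(unsigned long offset, size_t bytes_left)
{
	size_t in_page = offset & (EX_PAGE_SIZE - 1);
	size_t room = EX_PAGE_SIZE - in_page;

	return room < bytes_left ? room : bytes_left;
}

/* Walk the range the way a per-page copy loop would and count how
 * many chunks (i.e. page-bounded copies) are needed. */
static size_t ex_count_chunks(unsigned long offset, size_t len)
{
	size_t chunks = 0;

	while (len) {
		size_t n = ex_chunk_len(offset, len);

		offset += n;
		len -= n;
		chunks++;
	}
	return chunks;
}
```

For instance, a 100-byte access starting 6 bytes before a page boundary
is served as a 6-byte chunk followed by a 94-byte chunk.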
[RFC PATCH 0/1] Enable non-contiguous VRAM access in Xe
Patches should be split but quick RFC for feedback.

Matt

Matthew Brost (1):
  drm/ttm, drm/xe: Add ttm_bo_access

 drivers/gpu/drm/ttm/ttm_bo_vm.c | 20 +-
 drivers/gpu/drm/xe/xe_bo.c      | 48 +
 include/drm/ttm/ttm_bo.h        |  2 ++
 3 files changed, 64 insertions(+), 6 deletions(-)

-- 
2.34.1
[PATCH 6.1.y 5.15.y] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)
From: "Wachowski, Karol"

commit 39bc27bd688066a63e56f7f64ad34fae03fbe3b8 upstream.

Lack of check for copy-on-write (COW) mapping in drm_gem_shmem_mmap
allows users to call mmap with PROT_WRITE and MAP_PRIVATE flag
causing a kernel panic due to BUG_ON in vmf_insert_pfn_prot:
BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));

Return -EINVAL early if COW mapping is detected.

This bug affects all drm drivers using default shmem helpers.
It can be reproduced by this simple example:
void *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
ptr[0] = 0;

Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
Cc: Noralf Trønnes
Cc: Eric Anholt
Cc: Rob Herring
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: dri-devel@lists.freedesktop.org
Cc: # v5.2+
Signed-off-by: Wachowski, Karol
Signed-off-by: Jacek Lawrynowicz
Signed-off-by: Daniel Vetter
Link: https://patchwork.freedesktop.org/patch/msgid/20240520100514.925681-1-jacek.lawrynow...@linux.intel.com
Signed-off-by: Greg Kroah-Hartman
[Sherry: bp to fix CVE-2024-39497, ignore context change due to missing
commit 21aa27ddc582 ("drm/shmem-helper: Switch to reservation lock")]
Signed-off-by: Sherry Yang
---
 drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index e33f06bb66eb..fb8093577245 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -638,6 +638,9 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
 		return ret;
 	}
 
+	if (is_cow_mapping(vma->vm_flags))
+		return -EINVAL;
+
 	ret = drm_gem_shmem_get_pages(shmem);
 	if (ret)
 		return ret;
-- 
2.46.0
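For readers unfamiliar with the check added by this patch, the sketch
below models it in userspace. It is an illustration only: the EX_VM_*
constants are local stand-ins that mirror the kernel's VM_SHARED and
VM_MAYWRITE bit values, and ex_shmem_mmap_check() is a hypothetical
helper, not a kernel function. The predicate itself follows the kernel's
is_cow_mapping() definition: a mapping is copy-on-write when it may be
written but is not shared, which is what a MAP_PRIVATE mapping of a
writable file looks like.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the kernel's vm_flags bits (illustrative). */
#define EX_VM_SHARED   0x08ul
#define EX_VM_MAYWRITE 0x20ul

/* Same test as the kernel's is_cow_mapping(): writable in principle,
 * but private rather than shared. */
static bool ex_is_cow_mapping(unsigned long vm_flags)
{
	return (vm_flags & (EX_VM_SHARED | EX_VM_MAYWRITE)) == EX_VM_MAYWRITE;
}

/* Model of the early-exit added to the mmap path: reject COW
 * mappings before any pages are set up. */
static int ex_shmem_mmap_check(unsigned long vm_flags)
{
	if (ex_is_cow_mapping(vm_flags))
		return -EINVAL;
	return 0;
}
```

With this model, a private writable mapping is rejected with -EINVAL,
while shared writable and read-only private mappings pass the check.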
[linux-next:master] [drm/tests] 2735d5e406: WARNING:at_drivers/gpu/drm/drm_framebuffer.c:#drm_framebuffer_init[drm]
Hello,

kernel test robot noticed "WARNING:at_drivers/gpu/drm/drm_framebuffer.c:#drm_framebuffer_init[drm]" on:

commit: 2735d5e4060960c7bd06698b0a1990c7d42c762e ("drm/tests: Add test for drm_framebuffer_init()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[test failed on linux-next/master 15e7d45e786a62a211dd0098fee7c57f84f8c681]

in testcase: kunit
version:
with following parameters:

	group: group-00

config: x86_64-rhel-8.3-kunit
compiler: gcc-12
test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz (Ivy Bridge) with 8G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot
| Closes: https://lore.kernel.org/oe-lkp/202410171619.be977af4-...@intel.com

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241017/202410171619.be977af4-...@intel.com

kern :warn : [ 111.593780] [ cut here ]
kern :warn : [ 111.593995] WARNING: CPU: 0 PID: 4859 at drivers/gpu/drm/drm_framebuffer.c:867 drm_framebuffer_init+0x40/0x380 [drm]
kern :warn : [ 111.594323] Modules linked in: drm_framebuffer_test drm_kunit_helpers linear_ranges intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp btrfs coretemp kvm_intel kvm blake2b_generic xor raid6_pq libcrc32c crct10dif_pclmul crc32_generic crc32_pclmul crc32c_intel ghash_clmulni_intel sd_mod sha512_ssse3 sg rapl ipmi_devintf intel_cstate ipmi_msghandler i915 ahci intel_uncore mei_me ttm intel_gtt libahci mei drm_display_helper libata drm_kms_helper drm_buddy video wmi drm fuse ip_tables [last unloaded: drm_format_test]
kern :warn : [ 111.595499] CPU: 0 UID: 0 PID: 4859 Comm: kunit_try_catch Tainted: G S BN 6.11.0-rc7-01410-g2735d5e40609 #1
kern :warn : [ 111.595716] Tainted: [S]=CPU_OUT_OF_SPEC, [B]=BAD_PAGE, [N]=TEST
kern :warn : [ 111.595842] Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013
kern :warn : [ 111.595990] RIP: 0010:drm_framebuffer_init+0x40/0x380 [drm]
kern :warn : [ 111.596223] Code: 56 41 55 49 89 d5 48 89 f2 41 54 48 c1 ea 03 55 48 89 fd 53 48 89 f3 48 83 ec 10 80 3c 02 00 0f 85 54 02 00 00 48 39 2b 74 1e <0f> 0b 41 bc ea ff ff ff 48 83 c4 10 44 89 e0 5b 5d 41 5c 41 5d 41
kern :warn : [ 111.596572] RSP: 0018:c9edfbd0 EFLAGS: 00210246
kern :warn : [ 111.596689] RAX: dc00 RBX: c9edfcc0 RCX:
kern :warn : [ 111.596835] RDX: 1920001dbfa1 RSI: c9edfcc0 RDI: c9edfd08
kern :warn : [ 111.596979] RBP: 888103087000 R08: 888103087000 R09: 888217ec9100
kern :warn : [ 111.597126] R10: 0003 R11: 00657361656c6572 R12: 1920001dbfc0
kern :warn : [ 111.597272] R13: c9edfc40 R14: R15: c9edfe40
kern :warn : [ 111.597416] FS: () GS:8881c0e0() knlGS:
kern :warn : [ 111.597592] CS: 0010 DS: ES: CR0: 80050033
kern :warn : [ 111.597714] CR2: f7293000 CR3: 0001310c6001 CR4: 001706f0
kern :warn : [ 111.597859] DR0: 874243e0 DR1: 874243e1 DR2: 874243e3
kern :warn : [ 111.598004] DR3: 874243e5 DR6: 0ff0 DR7: 0600
kern :warn : [ 111.598149] Call Trace:
kern :warn : [ 111.598217]
kern :warn : [ 111.598278] ? __warn+0xcc/0x260
kern :warn : [ 111.598365] ? drm_framebuffer_init+0x40/0x380 [drm]
kern :warn : [ 111.598594] ? report_bug+0x261/0x2c0
kern :warn : [ 111.598686] ? handle_bug+0x3c/0x70
kern :warn : [ 111.598773] ? exc_invalid_op+0x17/0x40
kern :warn : [ 111.598867] ? asm_exc_invalid_op+0x1a/0x20
kern :warn : [ 111.598969] ? drm_framebuffer_init+0x40/0x380 [drm]
kern :warn : [ 111.599186] ? _raw_spin_lock_irqsave+0x8b/0xf0
kern :warn : [ 111.599291] drm_test_framebuffer_init_bad_format+0xf0/0x220 [drm_framebuffer_test]
kern :warn : [ 111.599451] ? __drmm_add_action+0x14b/0x280 [drm]
kern :warn : [ 111.599678] ? __pfx_drm_test_framebuffer_init_bad_format+0x10/0x10 [drm_framebuffer_test]
kern :warn : [ 111.599849] ? __pfx_drm_mode_config_init_release+0x10/0x10 [drm]
kern :warn : [ 111.600082] ? __drmm_add_action+0x1a1/0x280 [drm]
kern :warn : [ 111.600295] ? __pfx_drm_mode_config_init_release+0x10/0x10 [drm]
kern :warn : [ 111.600543] ? __schedule+0x7ec/0x1950
kern :warn : [ 111.600635] ? __pfx_read_tsc+0x10/0x10
kern :warn : [ 111.600728] ? ktime_get_ts64+0x82/0x230
kern :warn : [ 111.600823] kunit_try_run_case+0x1b3/0x490
kern :warn : [ 111.600923] ? __pfx_kunit_try_run_case+0x10/0x10
kern :warn : [ 111.601031] ? set_cpus_allowed_ptr+0x85/0xc0
ke
Re: [PATCH v5 1/1] drm/i915/pxp: Add missing tag for Wa_14019159160
On Tue, Oct 15, 2024 at 05:16:58PM -0700, Alan Previn wrote:
> Add missing tag for "Wa_14019159160 - Case 2" (for existing
> PXP code that ensures run alone mode bit is set to allow
> PxP-decryption.
>
> v5: - remove the max IP_VER check since new platforms that
>       i915 supports needs this fix and tag the caller too
>       (John Harrison).
> v4: - Include IP_VER 12.71. (Matt Roper)
> v3: - Check targeted platforms using IP_VAL. (John Harrison)
> v2: - Fix WA id number (John Harrison).
>     - Improve comments and code to be specific
>       for the targeted platforms (John Harrison)
>
> Signed-off-by: Alan Previn
> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 7bd5d2c29056..51847a846002 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -820,8 +820,10 @@ static bool ctx_needs_runalone(const struct intel_context *ce)
>  	bool ctx_is_protected = false;
>
>  	/*
> -	 * On MTL and newer platforms, protected contexts require setting
> -	 * the LRC run-alone bit or else the encryption will not happen.
> +	 * Wa_14019159160 - Case 2.
> +	 * On some platforms, protected contexts require setting
> +	 * the LRC run-alone bit or else the encryption/decryption will not happen.
> +	 * NOTE: Case 2 only applies to PXP use-case of said workaround.
>  	 */
>  	if (GRAPHICS_VER_FULL(ce->engine->i915) >= IP_VER(12, 70) &&
>  	    (ce->engine->class == COMPUTE_CLASS || ce->engine->class == RENDER_CLASS)) {
> @@ -850,6 +852,7 @@ static void init_common_regs(u32 * const regs,
>  	if (GRAPHICS_VER(engine->i915) < 11)
>  		ctl |= _MASKED_BIT_DISABLE(CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT |
>  					   CTX_CTRL_RS_CTX_ENABLE);
> +	/* Wa_14019159160 - Case 2.*/

I don't believe this needs to be repeated, but it doesn't hurt

Reviewed-by: Rodrigo Vivi

>  	if (ctx_needs_runalone(ce))
>  		ctl |= _MASKED_BIT_ENABLE(GEN12_CTX_CTRL_RUNALONE_MODE);
>  	regs[CTX_CONTEXT_CONTROL] = ctl;
>
> base-commit: 01c7b2c084e5c84313f382734c10945b9aa49823
> --
> 2.34.1
>
Re: [PATCH v7 1/5] drm: Introduce device wedged event
Hi Raag,

On 30/09/2024 04:38, Raag Jadav wrote:
> Introduce device wedged event, which will notify userspace of wedged
> (hanged/unusable) state of the DRM device through a uevent. This is
> useful especially in cases where the device is no longer operating as
> expected even after a hardware reset and has become unrecoverable from
> driver context.
>
> Purpose of this implementation is to provide drivers a generic way to
> recover with the help of userspace intervention. Different drivers may
> have different ideas of a "wedged device" depending on their hardware
> implementation, and hence the vendor agnostic nature of the event.
> It is up to the drivers to decide when they see the need for recovery
> and how they want to recover from the available methods.
>
> Current implementation defines three recovery methods, out of which,
> drivers can choose to support any one or multiple of them. Preferred
> recovery method will be sent in the uevent environment as WEDGED=.
> Userspace consumers (sysadmin) can define udev rules to parse this event
> and take respective action to recover the device.
>
>     =============== ==================================
>     Recovery method Consumer expectations
>     =============== ==================================
>     rebind          unbind + rebind driver
>     bus-reset       unbind + reset bus device + rebind
>     reboot          reboot system
>     =============== ==================================

I proposed something similar in the past:
https://lore.kernel.org/dri-devel/20221125175203.52481-1-andrealm...@igalia.com/

The motivation was that amdgpu was getting stuck after every GPU reset,
and there was just a black screen. The uevent would then trigger a daemon
to reset the compositor and get things back together. As you can see in
my thread, the feature was blocked in favor of getting better overall GPU
reset from the kernel side.

What kinds of scenarios make i915/xe need userspace involvement? I tested
a bunch of resets in i915 but never managed to get the driver stuck.

For the bus-reset, amdgpu does that too, but it doesn't require userspace
intervention.
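
For readers following the recovery table quoted above, here is a minimal userspace sketch of how a consumer might map the uevent variable to an action. It assumes the value after "WEDGED=" is exactly one of the three method names from the table (the exact value format is not shown in this thread), and the enum and function names are hypothetical, not part of the proposed kernel interface.

```c
#include <string.h>

/* Hypothetical action codes, one per row of the recovery table. */
enum wedged_action {
	WEDGED_ACTION_UNKNOWN = 0,
	WEDGED_ACTION_REBIND,     /* unbind + rebind driver */
	WEDGED_ACTION_BUS_RESET,  /* unbind + reset bus device + rebind */
	WEDGED_ACTION_REBOOT,     /* reboot system */
};

/*
 * Parse one uevent environment string such as "WEDGED=rebind".
 * Assumption: the value is a single method name from the table.
 * Anything else is reported as WEDGED_ACTION_UNKNOWN.
 */
enum wedged_action wedged_action_from_env(const char *env)
{
	static const char prefix[] = "WEDGED=";
	const size_t prefix_len = sizeof(prefix) - 1;
	const char *method;

	if (!env || strncmp(env, prefix, prefix_len) != 0)
		return WEDGED_ACTION_UNKNOWN;

	method = env + prefix_len;
	if (strcmp(method, "rebind") == 0)
		return WEDGED_ACTION_REBIND;
	if (strcmp(method, "bus-reset") == 0)
		return WEDGED_ACTION_BUS_RESET;
	if (strcmp(method, "reboot") == 0)
		return WEDGED_ACTION_REBOOT;

	return WEDGED_ACTION_UNKNOWN;
}
```

In practice a udev rule would probably do this matching, so the function above is only meant to make the table concrete.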
Re: [PATCH v3] drm/display: Drop obsolete dependency on COMPILE_TEST
On Tue, Oct 15, 2024 at 09:06:04AM -0700, Doug Anderson wrote:
> Hi,
>
> On Tue, Oct 15, 2024 at 4:46 AM Jean Delvare wrote:
> >
> > Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it
> > is possible to test-build any driver which depends on OF on any
> > architecture by explicitly selecting OF. Therefore depending on
> > COMPILE_TEST as an alternative is no longer needed.
> >
> > To avoid reintroducing the randconfig bug originally fixed by commit
> > 876271118aa4 ("drm/display: Fix build error without CONFIG_OF"),
> > DRM_MSM which selects DRM_DISPLAY_DP_HELPER must explicitly depend
> > on OF. This is consistent with what all other DRM drivers are doing.
> >
> > Signed-off-by: Jean Delvare
> > Reviewed-by: Javier Martinez Canillas
> > Cc: David Airlie
> > Cc: Daniel Vetter
> > ---
> > For regular builds, this is a no-op, as OF is always enabled on
> > ARCH_QCOM and SOC_IMX5. So this change only affects test builds. As
> > explained before, allowing test builds only when OF is enabled
> > improves the quality of these test builds, as the result is then
> > closer to how the code is built on its intended targets.
> >
> > Changes in v3:
> > * Rebase on top of kernel v6.11.
> > Changes in v2:
> > * Let DRM_MSM depend on OF so that random test builds won't break.
> >
> >  drivers/gpu/drm/display/Kconfig |    2 +-
> >  drivers/gpu/drm/msm/Kconfig     |    1 +
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > --- linux-6.11.orig/drivers/gpu/drm/display/Kconfig
> > +++ linux-6.11/drivers/gpu/drm/display/Kconfig
> > @@ -3,7 +3,7 @@
> >  config DRM_DISPLAY_DP_AUX_BUS
> >  	tristate
> >  	depends on DRM
> > -	depends on OF || COMPILE_TEST
> > +	depends on OF
> >
> >  config DRM_DISPLAY_HELPER
> >  	tristate
> > --- linux-6.11.orig/drivers/gpu/drm/msm/Kconfig
> > +++ linux-6.11/drivers/gpu/drm/msm/Kconfig
> > @@ -6,6 +6,7 @@ config DRM_MSM
> >  	depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
> >  	depends on COMMON_CLK
> >  	depends on IOMMU_SUPPORT
> > +	depends on OF
>
> Perhaps nobody landed this because you're missing the msm maintainers
> as specified by `./scripts/get_maintainer.pl -f
> drivers/gpu/drm/msm/Kconfig` ? I've added them here. It seems like
> we'd at least need an Ack by those guys since this modified the
> msm/Kconfig...
>
> FWIW I haven't spent massive time studying this, but what you have
> here looks reasonable. I'm happy at least with this from a DP AUX bus
> perspective:
>
> Acked-by: Douglas Anderson
>
> Presumably landing this via drm-misc makes the most sense after MSM
> guys give it an Ack.

Acked-by: Dmitry Baryshkov

-- 
With best wishes
Dmitry
Re: [PATCH v3] drm/display: Drop obsolete dependency on COMPILE_TEST
On 10/15/2024 4:46 AM, Jean Delvare wrote:
> Since commit 0166dc11be91 ("of: make CONFIG_OF user selectable"), it
> is possible to test-build any driver which depends on OF on any
> architecture by explicitly selecting OF. Therefore depending on
> COMPILE_TEST as an alternative is no longer needed.
>
> To avoid reintroducing the randconfig bug originally fixed by commit
> 876271118aa4 ("drm/display: Fix build error without CONFIG_OF"),
> DRM_MSM which selects DRM_DISPLAY_DP_HELPER must explicitly depend
> on OF. This is consistent with what all other DRM drivers are doing.
>
> Signed-off-by: Jean Delvare
> Reviewed-by: Javier Martinez Canillas
> Cc: David Airlie
> Cc: Daniel Vetter
> ---
> For regular builds, this is a no-op, as OF is always enabled on
> ARCH_QCOM and SOC_IMX5. So this change only affects test builds. As
> explained before, allowing test builds only when OF is enabled
> improves the quality of these test builds, as the result is then
> closer to how the code is built on its intended targets.
>
> Changes in v3:
> * Rebase on top of kernel v6.11.
> Changes in v2:
> * Let DRM_MSM depend on OF so that random test builds won't break.
>
>  drivers/gpu/drm/display/Kconfig |    2 +-
>  drivers/gpu/drm/msm/Kconfig     |    1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Abhinav Kumar
linux-next: manual merge of the drm tree with the drm-fixes tree
Hi all,

Today's linux-next merge of the drm tree got a conflict in:

  drivers/gpu/drm/i915/display/intel_dp_mst.c

between commit:

  69b3d8721267 ("drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculation")

from the drm-fixes tree and commit:

  f2e2092a979c ("drm/i915/display: Use joined pipes in dsc helpers for slices, bpp")

from the drm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/i915/display/intel_dp_mst.c
index eeaedd979354,4765bda154c1..
--- a/drivers/gpu/drm/i915/display/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/display/intel_dp_mst.c
@@@ -147,19 -156,6 +148,19 @@@ static int intel_dp_mst_calc_pbn(int pi
  	return DIV_ROUND_UP(effective_data_rate * 64, 54 * 1000);
  }
  
 +static int intel_dp_mst_dsc_get_slice_count(const struct intel_connector *connector,
 +					    const struct intel_crtc_state *crtc_state)
 +{
 +	const struct drm_display_mode *adjusted_mode =
 +		&crtc_state->hw.adjusted_mode;
- 	int num_joined_pipes = crtc_state->joiner_pipes;
++	int num_joined_pipes = intel_crtc_num_joined_pipes(crtc_state);
 +
 +	return intel_dp_dsc_get_slice_count(connector,
 +					    adjusted_mode->clock,
 +					    adjusted_mode->hdisplay,
 +					    num_joined_pipes);
 +}
 +
  static int intel_dp_mst_find_vcpi_slots_for_bpp(struct intel_encoder *encoder,
  						struct intel_crtc_state *crtc_state,
  						int max_bpp,
[PATCH next] drm/amdgpu: Fix a double lock bug
This was supposed to be an unlock instead of a lock. The original
code will lead to a deadlock.

Fixes: ee52489d1210 ("drm/amdgpu: Place NPS mode request on unload")
Signed-off-by: Dan Carpenter
---
From static analysis, not testing.
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
index fcdbcff57632..3be07bcfd117 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
@@ -1605,7 +1605,7 @@ int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev,
 					gmc.xgmi.head)
 			adev->gmc.gmc_funcs->request_mem_partition_mode(tmp_adev,
 									cur_nps_mode);
-	mutex_lock(&hive->hive_lock);
+	mutex_unlock(&hive->hive_lock);
 
 	return r;
 }
-- 
2.45.2
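
To illustrate the class of bug this patch addresses, here is a minimal userspace sketch using POSIX mutexes rather than the kernel's `mutex_lock()`/`mutex_unlock()`. The `struct hive` and `hive_request_mode()` names are hypothetical and only loosely modelled on the patched function. The point is that every exit path must release the lock it took, since a second lock call on a default (non-recursive) mutex from the same thread would never return.

```c
#include <pthread.h>

/* Hypothetical stand-in for a device group protected by one lock. */
struct hive {
	pthread_mutex_t lock;
	int nps_mode;
};

/*
 * Apply a new mode under the lock. Returns 0 on success, -1 if the
 * requested mode is rejected. The lock is released exactly once on the
 * way out; calling pthread_mutex_lock() here instead would block
 * forever on a default mutex, which is the bug pattern the patch fixes.
 */
int hive_request_mode(struct hive *hive, int mode)
{
	int r = 0;

	pthread_mutex_lock(&hive->lock);

	if (mode < 0)
		r = -1;            /* reject, keep the old mode */
	else
		hive->nps_mode = mode;

	pthread_mutex_unlock(&hive->lock);

	return r;
}
```

After the call returns, another caller can take the lock again, which is what the mistaken second lock call would have prevented.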
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
Matthew Brost writes: > On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: >> >> Matthew Brost writes: >> >> > On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: >> >> >> >> Matthew Brost writes: >> >> >> >> > On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: >> >> >> >> >> >> Matthew Brost writes: >> >> >> >> >> >> > On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: >> >> >> >> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: >> >> >> >> >> >> [...] >> >> >> >> >> >> >> > > +{ >> >> >> >> > > + unsigned long i; >> >> >> >> > > + >> >> >> >> > > + for (i = 0; i < npages; i++) { >> >> >> >> > > + struct page *page = pfn_to_page(src_pfns[i]); >> >> >> >> > > + >> >> >> >> > > + if (!get_page_unless_zero(page)) { >> >> >> >> > > + src_pfns[i] = 0; >> >> >> >> > > + continue; >> >> >> >> > > + } >> >> >> >> > > + >> >> >> >> > > + if (!trylock_page(page)) { >> >> >> >> > > + src_pfns[i] = 0; >> >> >> >> > > + put_page(page); >> >> >> >> > > + continue; >> >> >> >> > > + } >> >> >> >> > > + >> >> >> >> > > + src_pfns[i] = migrate_pfn(src_pfns[i]) | >> >> >> >> > > MIGRATE_PFN_MIGRATE; >> >> >> >> > >> >> >> >> > This needs to be converted to use a folio like >> >> >> >> > migrate_device_range(). But more importantly this should be split >> >> >> >> > out as >> >> >> >> > a function that both migrate_device_range() and this function can >> >> >> >> > call >> >> >> >> > given this bit is identical. >> >> >> >> > >> >> >> >> >> >> >> >> Missed the folio conversion and agree a helper shared between this >> >> >> >> function and migrate_device_range would be a good idea. Let add >> >> >> >> that. >> >> >> >> >> >> >> > >> >> >> > Alistair, >> >> >> > >> >> >> > Ok, I think now I want to go slightly different direction here to >> >> >> > give >> >> >> > GPUSVM a bit more control over several eviction scenarios. 
>> >> >> > >> >> >> > What if I exported the helper discussed above, e.g., >> >> >> > >> >> >> > 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) >> >> >> > 906 { >> >> >> > 907 struct folio *folio; >> >> >> > 908 >> >> >> > 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); >> >> >> > 910 if (!folio) >> >> >> > 911 return 0; >> >> >> > 912 >> >> >> > 913 if (!folio_trylock(folio)) { >> >> >> > 914 folio_put(folio); >> >> >> > 915 return 0; >> >> >> > 916 } >> >> >> > 917 >> >> >> > 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; >> >> >> > 919 } >> >> >> > 920 EXPORT_SYMBOL(migrate_device_pfn_lock); >> >> >> > >> >> >> > And then also export migrate_device_unmap. >> >> >> > >> >> >> > The usage here would be let a driver collect the device pages in >> >> >> > virtual >> >> >> > address range via hmm_range_fault, lock device pages under notifier >> >> >> > lock ensuring device pages are valid, drop the notifier lock and call >> >> >> > migrate_device_unmap. >> >> >> >> >> >> I'm still working through this series but that seems a bit dubious, the >> >> >> locking here is pretty subtle and easy to get wrong so seeing some code >> >> >> would help me a lot in understanding what you're suggesting. >> >> >> >> >> > >> >> > For sure locking in tricky, my mistake on not working through this >> >> > before sending out the next rev but it came to mind after sending + >> >> > regarding some late feedback from Thomas about using hmm for eviction >> >> > [2]. His suggestion of using hmm_range_fault to trigger migration >> >> > doesn't work for coherent pages, while something like below does. >> >> > >> >> > [2] >> >> > https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 >> >> > >> >> > Here is a snippet I have locally which seems to work. 
>> >> > >> >> > 2024 retry: >> >> > 2025 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); >> >> > 2026 hmm_range.hmm_pfns = src; >> >> > 2027 >> >> > 2028 while (true) { >> >> > 2029 mmap_read_lock(mm); >> >> > 2030 err = hmm_range_fault(&hmm_range); >> >> > 2031 mmap_read_unlock(mm); >> >> > 2032 if (err == -EBUSY) { >> >> > 2033 if (time_after(jiffies, timeout)) >> >> > 2034 break; >> >> > 2035 >> >> > 2036 hmm_range.notifier_seq = >> >> > mmu_interval_read_begin(notifier); >> >> > 2037 continue; >> >> > 2038 } >> >> > 2039 break; >> >> > 2040 } >> >> > 2041 if (err) >> >> > 2042 goto err_put; >> >> > 2043 >> >> > 2044 drm_gpusvm_notifier_lock(gpusvm); >> >
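
The "take a reference unless the count is zero, then trylock, otherwise back out" step discussed in this thread can be sketched in plain userspace C. This is only an analogy built on C11 atomics: `struct obj` and the `obj_*` helpers are hypothetical and are not the kernel's page or folio API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object with a reference count and a one-bit lock. */
struct obj {
	atomic_int refcount;
	atomic_flag locked;
};

/* Take a reference only if the object is still live (count != 0). */
bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

void obj_put(struct obj *o)
{
	atomic_fetch_sub(&o->refcount, 1);
}

bool obj_trylock(struct obj *o)
{
	/* test_and_set returns the previous state: false means we won. */
	return !atomic_flag_test_and_set(&o->locked);
}

void obj_unlock(struct obj *o)
{
	atomic_flag_clear(&o->locked);
}

/*
 * Combined step: succeed only with both a reference and the lock held.
 * If the lock is contended, drop the reference again so the caller can
 * simply skip this object, much like the quoted loop zeroes the entry.
 */
bool obj_get_and_trylock(struct obj *o)
{
	if (!obj_get_unless_zero(o))
		return false;

	if (!obj_trylock(o)) {
		obj_put(o);
		return false;
	}
	return true;
}
```

The ordering is the part worth noticing: the reference is taken first so the object cannot go away while the lock is attempted, and it is dropped again on the failure path so nothing leaks.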
[PATCH] drm/exynos: fix potential integer overflow in exynos_drm_gem_dumb_create()
From: Zichen Xie

This was found by a static analyzer.
There may be a potential integer overflow in
exynos_drm_gem_dumb_create(). args->size is defined as "__u64" while
args->pitch and args->height are both defined as "__u32". The result of
"args->pitch * args->height" is computed in "__u32" and truncated
unless one operand is cast first.
Even though the overflow is unlikely in practice, we still recommend
adding an explicit cast.

Fixes: 7da5907c84f8 ("drm/exynos: fixed page align bug.")
Signed-off-by: Zichen Xie
---
 drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 638ca96830e9..de2126853d2c 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -337,7 +337,7 @@ int exynos_drm_gem_dumb_create(struct drm_file *file_priv,
 	 */
 
 	args->pitch = args->width * ((args->bpp + 7) / 8);
-	args->size = args->pitch * args->height;
+	args->size = (__u64)args->pitch * args->height;
 
 	if (is_drm_iommu_supported(dev))
 		flags = EXYNOS_BO_NONCONTIG | EXYNOS_BO_WC;
-- 
2.34.1
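
The arithmetic behind this fix can be reproduced in a few lines of userspace C. This is a sketch using `<stdint.h>` types in place of `__u32`/`__u64`; the two helper names are hypothetical and exist only to contrast the expression before and after the patch.

```c
#include <stdint.h>

/*
 * Mirrors the pre-patch expression: both operands are 32-bit, so the
 * multiplication is carried out in 32 bits and wraps modulo 2^32
 * before the result is widened to 64 bits for the return value.
 */
uint64_t dumb_size_truncated(uint32_t pitch, uint32_t height)
{
	return pitch * height;
}

/*
 * Mirrors the patched expression: casting one operand first makes the
 * whole multiplication happen in 64 bits, so no bits are lost.
 */
uint64_t dumb_size_widened(uint32_t pitch, uint32_t height)
{
	return (uint64_t)pitch * height;
}
```

For example, with pitch = 65536 and height = 65536 the true product is 2^32. The first helper wraps to 0, while the second returns 4294967296.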
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
On Fri, Oct 18, 2024 at 08:58:02AM +1100, Alistair Popple wrote: > > Matthew Brost writes: > > > On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: > >> > >> Matthew Brost writes: > >> > >> > On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: > >> >> > >> >> Matthew Brost writes: > >> >> > >> >> > On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: > >> >> >> > >> >> >> Matthew Brost writes: > >> >> >> > >> >> >> > On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: > >> >> >> >> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: > >> >> >> > >> >> >> [...] > >> >> >> > >> >> >> >> > > +{ > >> >> >> >> > > + unsigned long i; > >> >> >> >> > > + > >> >> >> >> > > + for (i = 0; i < npages; i++) { > >> >> >> >> > > + struct page *page = pfn_to_page(src_pfns[i]); > >> >> >> >> > > + > >> >> >> >> > > + if (!get_page_unless_zero(page)) { > >> >> >> >> > > + src_pfns[i] = 0; > >> >> >> >> > > + continue; > >> >> >> >> > > + } > >> >> >> >> > > + > >> >> >> >> > > + if (!trylock_page(page)) { > >> >> >> >> > > + src_pfns[i] = 0; > >> >> >> >> > > + put_page(page); > >> >> >> >> > > + continue; > >> >> >> >> > > + } > >> >> >> >> > > + > >> >> >> >> > > + src_pfns[i] = migrate_pfn(src_pfns[i]) | > >> >> >> >> > > MIGRATE_PFN_MIGRATE; > >> >> >> >> > > >> >> >> >> > This needs to be converted to use a folio like > >> >> >> >> > migrate_device_range(). But more importantly this should be > >> >> >> >> > split out as > >> >> >> >> > a function that both migrate_device_range() and this function > >> >> >> >> > can call > >> >> >> >> > given this bit is identical. > >> >> >> >> > > >> >> >> >> > >> >> >> >> Missed the folio conversion and agree a helper shared between this > >> >> >> >> function and migrate_device_range would be a good idea. Let add > >> >> >> >> that. 
> >> >> >> >> > >> >> >> > > >> >> >> > Alistair, > >> >> >> > > >> >> >> > Ok, I think now I want to go slightly different direction here to > >> >> >> > give > >> >> >> > GPUSVM a bit more control over several eviction scenarios. > >> >> >> > > >> >> >> > What if I exported the helper discussed above, e.g., > >> >> >> > > >> >> >> > 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) > >> >> >> > 906 { > >> >> >> > 907 struct folio *folio; > >> >> >> > 908 > >> >> >> > 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); > >> >> >> > 910 if (!folio) > >> >> >> > 911 return 0; > >> >> >> > 912 > >> >> >> > 913 if (!folio_trylock(folio)) { > >> >> >> > 914 folio_put(folio); > >> >> >> > 915 return 0; > >> >> >> > 916 } > >> >> >> > 917 > >> >> >> > 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; > >> >> >> > 919 } > >> >> >> > 920 EXPORT_SYMBOL(migrate_device_pfn_lock); > >> >> >> > > >> >> >> > And then also export migrate_device_unmap. > >> >> >> > > >> >> >> > The usage here would be let a driver collect the device pages in > >> >> >> > virtual > >> >> >> > address range via hmm_range_fault, lock device pages under notifier > >> >> >> > lock ensuring device pages are valid, drop the notifier lock and > >> >> >> > call > >> >> >> > migrate_device_unmap. > >> >> >> > >> >> >> I'm still working through this series but that seems a bit dubious, > >> >> >> the > >> >> >> locking here is pretty subtle and easy to get wrong so seeing some > >> >> >> code > >> >> >> would help me a lot in understanding what you're suggesting. > >> >> >> > >> >> > > >> >> > For sure locking in tricky, my mistake on not working through this > >> >> > before sending out the next rev but it came to mind after sending + > >> >> > regarding some late feedback from Thomas about using hmm for eviction > >> >> > [2]. His suggestion of using hmm_range_fault to trigger migration > >> >> > doesn't work for coherent pages, while something like below does. 
> >> >> > > >> >> > [2] > >> >> > https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 > >> >> > > >> >> > Here is a snippet I have locally which seems to work. > >> >> > > >> >> > 2024 retry: > >> >> > 2025 hmm_range.notifier_seq = > >> >> > mmu_interval_read_begin(notifier); > >> >> > 2026 hmm_range.hmm_pfns = src; > >> >> > 2027 > >> >> > 2028 while (true) { > >> >> > 2029 mmap_read_lock(mm); > >> >> > 2030 err = hmm_range_fault(&hmm_range); > >> >> > 2031 mmap_read_unlock(mm); > >> >> > 2032 if (err == -EBUSY) { > >> >> > 2033 if (time_after(jiffies, timeout)) > >> >> > 2034 break; > >> >> > 2035 > >> >> > 2036 hmm_range.notifier_seq = > >> >>
Re: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic()
Hi Jinjie,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on linus/master v6.12-rc3 next-20241017]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jinjie-Ruan/drm-connector-hdmi-Fix-memory-leak-in-drm_display_mode_from_cea_vic/20241014-152022
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:    https://lore.kernel.org/r/20241014071632.989108-2-ruanjinjie%40huawei.com
patch subject: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic()
config: arc-randconfig-002-20241017 (https://download.01.org/0day-ci/archive/20241018/202410180830.oitxtsov-...@intel.com/config)
compiler: arc-elf-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241018/202410180830.oitxtsov-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot
| Closes: https://lore.kernel.org/oe-kbuild-all/202410180830.oitxtsov-...@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/tests/drm_connector_test.c: In function 'drm_test_drm_hdmi_compute_mode_clock_rgb':
>> drivers/gpu/drm/tests/drm_connector_test.c:1008:31: warning: passing argument 2 of 'drm_mode_destroy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    1008 |         drm_mode_destroy(drm, mode);
         |                               ^~~~
   In file included from drivers/gpu/drm/tests/drm_connector_test.c:13:
   include/drm/drm_modes.h:456:72: note: expected 'struct drm_display_mode *' but argument is of type 'const struct drm_display_mode *'
     456 | void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode);
         |                                                                      ~^~~~
   drivers/gpu/drm/tests/drm_connector_test.c: In function 'drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc':
   drivers/gpu/drm/tests/drm_connector_test.c:1031:31: warning: passing argument 2 of 'drm_mode_destroy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    1031 |         drm_mode_destroy(drm, mode);
         |                               ^~~~
   include/drm/drm_modes.h:456:72: note: expected 'struct drm_display_mode *' but argument is of type 'const struct drm_display_mode *'
     456 | void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode);
         |                                                                      ~^~~~
   drivers/gpu/drm/tests/drm_connector_test.c: In function 'drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1':
   drivers/gpu/drm/tests/drm_connector_test.c:1051:31: warning: passing argument 2 of 'drm_mode_destroy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    1051 |         drm_mode_destroy(drm, mode);
         |                               ^~~~
   include/drm/drm_modes.h:456:72: note: expected 'struct drm_display_mode *' but argument is of type 'const struct drm_display_mode *'
     456 | void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode);
         |                                                                      ~^~~~
   drivers/gpu/drm/tests/drm_connector_test.c: In function 'drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc':
   drivers/gpu/drm/tests/drm_connector_test.c:1074:31: warning: passing argument 2 of 'drm_mode_destroy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    1074 |         drm_mode_destroy(drm, mode);
         |                               ^~~~
   include/drm/drm_modes.h:456:72: note: expected 'struct drm_display_mode *' but argument is of type 'const struct drm_display_mode *'
     456 | void drm_mode_destroy(struct drm_device *dev, struct drm_display_mode *mode);
         |                                                                      ~^~~~
   drivers/gpu/drm/tests/drm_connector_test.c: In function 'drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1':
   drivers/gpu/drm/tests/drm_connector_test.c:1094:31: warning: passing argument 2 of 'drm_mode_destroy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    1094 |         drm_mode_destroy(drm, mode);
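
The warning reported above is generic C behaviour and can be shown outside the kernel. The sketch below uses hypothetical `struct mode`, `mode_create()` and `mode_destroy()` stand-ins rather than the DRM API. It shows one way to avoid the warning, namely keeping the local pointer non-const when the function owns the object and must free it. Whether the patch author chose this or another fix is not stated in the report.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a display mode object. */
struct mode {
	int clock;
};

struct mode *mode_create(int clock)
{
	struct mode *mode = malloc(sizeof(*mode));

	if (mode)
		mode->clock = clock;
	return mode;
}

/* Takes a non-const pointer because it frees the object. */
void mode_destroy(struct mode *mode)
{
	free(mode);
}

/*
 * If 'mode' were declared as 'const struct mode *', passing it to
 * mode_destroy() would trigger -Wdiscarded-qualifiers, as in the
 * report. Declaring it non-const matches the fact that this function
 * owns the object and releases it before returning.
 */
int mode_clock_roundtrip(int clock)
{
	struct mode *mode = mode_create(clock);
	int ret;

	if (!mode)
		return -1;

	ret = mode->clock;
	mode_destroy(mode);
	return ret;
}
```

Read-only helpers can still take `const struct mode *` parameters. Only the owner that eventually destroys the object needs the non-const pointer.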
[PATCH 5/5] drm: lcdif: Use drm_bridge_connector
Initialize a connector by calling drm_bridge_connector_init() for
each encoder so that downstream bridge drivers don't need to create
connectors any more.

Signed-off-by: Liu Ying
---
 drivers/gpu/drm/mxsfb/Kconfig     |  1 +
 drivers/gpu/drm/mxsfb/lcdif_drv.c | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mxsfb/Kconfig b/drivers/gpu/drm/mxsfb/Kconfig
index 264e74f45554..06c95e556380 100644
--- a/drivers/gpu/drm/mxsfb/Kconfig
+++ b/drivers/gpu/drm/mxsfb/Kconfig
@@ -27,6 +27,7 @@ config DRM_IMX_LCDIF
 	depends on DRM && OF
 	depends on COMMON_CLK
 	depends on ARCH_MXC || COMPILE_TEST
+	select DRM_BRIDGE_CONNECTOR
 	select DRM_CLIENT_SELECTION
 	select DRM_MXS
 	select DRM_KMS_HELPER
diff --git a/drivers/gpu/drm/mxsfb/lcdif_drv.c b/drivers/gpu/drm/mxsfb/lcdif_drv.c
index 58ccad9c425d..d4521da6675e 100644
--- a/drivers/gpu/drm/mxsfb/lcdif_drv.c
+++ b/drivers/gpu/drm/mxsfb/lcdif_drv.c
@@ -16,7 +16,9 @@
 #include
 #include
+#include
 #include
+#include
 #include
 #include
 #include
@@ -56,6 +58,7 @@ static int lcdif_attach_bridge(struct lcdif_drm_private *lcdif)
 		struct device_node *remote;
 		struct of_endpoint of_ep;
 		struct drm_encoder *encoder;
+		struct drm_connector *connector;
 
 		remote = of_graph_get_remote_port_parent(ep);
 		if (!of_device_is_available(remote)) {
@@ -97,13 +100,25 @@ static int lcdif_attach_bridge(struct lcdif_drm_private *lcdif)
 			return ret;
 		}
 
-		ret = drm_bridge_attach(encoder, bridge, NULL, 0);
+		ret = drm_bridge_attach(encoder, bridge, NULL,
+					DRM_BRIDGE_ATTACH_NO_CONNECTOR);
 		if (ret) {
 			of_node_put(ep);
 			return dev_err_probe(dev, ret,
 					     "Failed to attach bridge for endpoint%u\n",
 					     of_ep.id);
 		}
+
+		connector = drm_bridge_connector_init(lcdif->drm, encoder);
+		if (IS_ERR(connector)) {
+			ret = PTR_ERR(connector);
+			dev_err(dev, "Failed to initialize bridge connector: %d\n",
+				ret);
+			of_node_put(ep);
+			return ret;
+		}
+
+		drm_connector_attach_encoder(connector, encoder);
 	}
 
 	return 0;
-- 
2.34.1
[PATCH 0/5] drm: lcdif: Use drm_bridge_connector
Hi,

This patch series aims to use drm_bridge_connector in the i.MX8MP LCDIF
driver so that bridge drivers don't need to initialize DRM connectors.

Patches 1-3 add HDMI connectors to the DTs of some i.MX8MP platforms as
preparation work. The Synopsys DW HDMI bridge core driver would try to
find the bridge of the HDMI connector after the LCDIF driver starts to
use drm_bridge_connector.

Patch 4 sets output_port to 1 in the i.MX8MP HDMI TX driver, also as
preparation work. The Synopsys DW HDMI bridge core driver needs to know
the output port index so that it can use the port index to find and
attach the next bridge. The next bridge attachment is needed after the
LCDIF driver starts to use drm_bridge_connector.

Patch 5 makes the LCDIF driver use drm_bridge_connector.

With this patch set, an in-flight ITE IT6263 bridge driver[1] doesn't
need to initialize a DRM connector.

[1] https://patchwork.freedesktop.org/patch/619465/?series=139266&rev=2

Liu Ying (5):
  arm64: dts: imx8mp-kontron-bl-osm-s: Add HDMI connector
  arm64: dts: imx8mp-kontron-smarc-eval-carrier: Add HDMI connector
  arm64: dts: imx8mp-msc-sm2s-ep1: Add HDMI connector
  drm/bridge: imx8mp-hdmi-tx: Set output_port to 1
  drm: lcdif: Use drm_bridge_connector

 .../dts/freescale/imx8mp-kontron-bl-osm-s.dts | 19 +++++++++++++++++++
 .../imx8mp-kontron-smarc-eval-carrier.dts     | 19 +++++++++++++++++++
 .../dts/freescale/imx8mp-msc-sm2s-ep1.dts     | 19 +++++++++++++++++++
 drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c   |  1 +
 drivers/gpu/drm/mxsfb/Kconfig                 |  1 +
 drivers/gpu/drm/mxsfb/lcdif_drv.c             | 17 ++++++++++++++++-
 6 files changed, 75 insertions(+), 1 deletion(-)

-- 
2.34.1
[PATCH 4/5] drm/bridge: imx8mp-hdmi-tx: Set output_port to 1
Set DW HDMI platform data's output_port to 1 in imx8mp_dw_hdmi_probe()
so that dw_hdmi_probe() called by imx8mp_dw_hdmi_probe() can tell the
DW HDMI bridge core driver about the output port we are using. The next
bridge can then be found in dw_hdmi_parse_dt() according to the port
index, and it can be attached to the bridge chain in
dw_hdmi_bridge_attach() when the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag is
set. The output_port value matches the port index used by devicetree.
This is a preparation for making the i.MX8MP LCDIF driver use
drm_bridge_connector, which requires the DRM_BRIDGE_ATTACH_NO_CONNECTOR
flag.

Signed-off-by: Liu Ying
---
 drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c
index 8fcc6d18f4ab..54a53f96929a 100644
--- a/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c
+++ b/drivers/gpu/drm/bridge/imx/imx8mp-hdmi-tx.c
@@ -96,6 +96,7 @@ static int imx8mp_dw_hdmi_probe(struct platform_device *pdev)
 		return dev_err_probe(dev, PTR_ERR(hdmi->pixclk),
 				     "Unable to get pixel clock\n");
 
+	plat_data->output_port = 1;
 	plat_data->mode_valid = imx8mp_hdmi_mode_valid;
 	plat_data->phy_ops = &imx8mp_hdmi_phy_ops;
 	plat_data->phy_name = "SAMSUNG HDMI TX PHY";
-- 
2.34.1
[PATCH 2/5] arm64: dts: imx8mp-kontron-smarc-eval-carrier: Add HDMI connector
Add an HDMI connector to connect with the i.MX8MP HDMI TX output.

This is a preparation for making the i.MX8MP LCDIF driver use drm_bridge_connector, which requires the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag. With that flag, the DW HDMI bridge core driver will try to attach the next bridge, which is the HDMI connector.

Signed-off-by: Liu Ying
---
 .../imx8mp-kontron-smarc-eval-carrier.dts | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mp-kontron-smarc-eval-carrier.dts b/arch/arm64/boot/dts/freescale/imx8mp-kontron-smarc-eval-carrier.dts
index 2173a36ff691..815f313a2d33 100644
--- a/arch/arm64/boot/dts/freescale/imx8mp-kontron-smarc-eval-carrier.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mp-kontron-smarc-eval-carrier.dts
@@ -29,6 +29,17 @@ extcon_usbc: usbc {
 		id-gpios = <&gpio1 10 GPIO_ACTIVE_HIGH>;
 	};
 
+	hdmi-connector {
+		compatible = "hdmi-connector";
+		type = "a";
+
+		port {
+			hdmi_in: endpoint {
+				remote-endpoint = <&hdmi_tx_out>;
+			};
+		};
+	};
+
 	sound {
 		compatible = "simple-audio-card";
 		simple-audio-card,bitclock-master = <&codec_dai>;
@@ -108,6 +119,14 @@ &hdmi_tx {
 	pinctrl-0 = <&pinctrl_hdmi>;
 	ddc-i2c-bus = <&i2c3>;
 	status = "okay";
+
+	ports {
+		port@1 {
+			hdmi_tx_out: endpoint {
+				remote-endpoint = <&hdmi_in>;
+			};
+		};
+	};
 };
 
 &hdmi_tx_phy {
-- 
2.34.1
[PATCH 1/5] arm64: dts: imx8mp-kontron-bl-osm-s: Add HDMI connector
Add an HDMI connector to connect with the i.MX8MP HDMI TX output.

This is a preparation for making the i.MX8MP LCDIF driver use drm_bridge_connector, which requires the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag. With that flag, the DW HDMI bridge core driver will try to attach the next bridge, which is the HDMI connector.

Signed-off-by: Liu Ying
---
 .../dts/freescale/imx8mp-kontron-bl-osm-s.dts | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts b/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts
index 0eb9e726a9b8..445bf5a46c6a 100644
--- a/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts
+++ b/arch/arm64/boot/dts/freescale/imx8mp-kontron-bl-osm-s.dts
@@ -23,6 +23,17 @@ extcon_usbc: usbc {
 		id-gpios = <&gpio1 10 GPIO_ACTIVE_HIGH>;
 	};
 
+	hdmi-connector {
+		compatible = "hdmi-connector";
+		type = "a";
+
+		port {
+			hdmi_in: endpoint {
+				remote-endpoint = <&hdmi_tx_out>;
+			};
+		};
+	};
+
 	leds {
 		compatible = "gpio-leds";
 
@@ -168,6 +179,14 @@ &hdmi_tx {
 	pinctrl-0 = <&pinctrl_hdmi>;
 	ddc-i2c-bus = <&i2c2>;
 	status = "okay";
+
+	ports {
+		port@1 {
+			hdmi_tx_out: endpoint {
+				remote-endpoint = <&hdmi_in>;
+			};
+		};
+	};
 };
 
 &hdmi_tx_phy {
-- 
2.34.1
Re: [PATCH v2 6/9] drm/bridge: Add ITE IT6263 LVDS to HDMI converter
On 10/14/2024, Dmitry Baryshkov wrote:
> On Sun, Oct 13, 2024 at 10:48:54AM +, Biju Das wrote:

[...]

>>>>> +static int it6263_bridge_attach(struct drm_bridge *bridge,
>>>>> +				enum drm_bridge_attach_flags flags) {
>>>>> +	struct it6263 *it = bridge_to_it6263(bridge);
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = drm_bridge_attach(bridge->encoder, it->next_bridge, bridge,
>>>>> +				flags | DRM_BRIDGE_ATTACH_NO_CONNECTOR);
>>>>> +	if (ret < 0)
>>>>> +		return ret;
>>>>> +
>>>>> +	if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)
>>>>> +		return 0;
>>>>> +
>>>>> +	it->connector.polled = DRM_CONNECTOR_POLL_CONNECT |
>>>>> +			       DRM_CONNECTOR_POLL_DISCONNECT;
>>>>> +
>>>>
>>>> Please strongly consider dropping this and using drm_bridge_connector
>>>> in the host driver.
>>>
>>> I can't afford to make i.MX8MP imx-lcdif KMS use drm_bridge_connector
>>> currently. Maybe the Renesas RZ/G3E SMARC EVK that Biju tested the v1
>>> patch set with is also not using drm_bridge_connector. I hope we can
>>> leave it as-is for now.
>>
>> The Renesas platform uses the drm_bridge_connector_init() helper to
>> create a drm_connector for each output, instead of relying on the bridge
>> drivers doing so. It attaches the bridges with the
>> DRM_BRIDGE_ATTACH_NO_CONNECTOR flag to instruct them not to create a
>> connector.
>>
>> On the Renesas platform, it exits from here:
>> 	if (flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)
>> 		return 0;
>>
>> Maybe it is good to have both cases to start with. Add support for both
>> cases now. Later, when imx-lcdif KMS starts using drm_bridge_connector,
>> we can start dropping connector creation from the bridge drivers?
>
> Do we have a timeline for this?

I sent out a patch series to make the i.MX LCDIF driver use drm_bridge_connector just now.

https://patchwork.freedesktop.org/series/140148/

-- 
Regards,
Liu Ying
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
Mika Penttilä writes: > Hi, > > On 10/18/24 00:58, Alistair Popple wrote: >> Matthew Brost writes: >> >>> On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: Matthew Brost writes: > On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: >> Matthew Brost writes: >> >>> On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: Matthew Brost writes: > On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: >> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: [...] +{ + unsigned long i; + + for (i = 0; i < npages; i++) { + struct page *page = pfn_to_page(src_pfns[i]); + + if (!get_page_unless_zero(page)) { + src_pfns[i] = 0; + continue; + } + + if (!trylock_page(page)) { + src_pfns[i] = 0; + put_page(page); + continue; + } + + src_pfns[i] = migrate_pfn(src_pfns[i]) | MIGRATE_PFN_MIGRATE; >>> This needs to be converted to use a folio like >>> migrate_device_range(). But more importantly this should be split >>> out as >>> a function that both migrate_device_range() and this function can >>> call >>> given this bit is identical. >>> >> Missed the folio conversion and agree a helper shared between this >> function and migrate_device_range would be a good idea. Let add that. >> > Alistair, > > Ok, I think now I want to go slightly different direction here to give > GPUSVM a bit more control over several eviction scenarios. > > What if I exported the helper discussed above, e.g., > > 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) > 906 { > 907 struct folio *folio; > 908 > 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); > 910 if (!folio) > 911 return 0; > 912 > 913 if (!folio_trylock(folio)) { > 914 folio_put(folio); > 915 return 0; > 916 } > 917 > 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; > 919 } > 920 EXPORT_SYMBOL(migrate_device_pfn_lock); > > And then also export migrate_device_unmap. 
> > The usage here would be let a driver collect the device pages in > virtual > address range via hmm_range_fault, lock device pages under notifier > lock ensuring device pages are valid, drop the notifier lock and call > migrate_device_unmap. I'm still working through this series but that seems a bit dubious, the locking here is pretty subtle and easy to get wrong so seeing some code would help me a lot in understanding what you're suggesting. >>> For sure locking in tricky, my mistake on not working through this >>> before sending out the next rev but it came to mind after sending + >>> regarding some late feedback from Thomas about using hmm for eviction >>> [2]. His suggestion of using hmm_range_fault to trigger migration >>> doesn't work for coherent pages, while something like below does. >>> >>> [2] >>> https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 >>> >>> Here is a snippet I have locally which seems to work. >>> >>> 2024 retry: >>> 2025 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); >>> 2026 hmm_range.hmm_pfns = src; >>> 2027 >>> 2028 while (true) { >>> 2029 mmap_read_lock(mm); >>> 2030 err = hmm_range_fault(&hmm_range); >>> 2031 mmap_read_unlock(mm); >>> 2032 if (err == -EBUSY) { >>> 2033 if (time_after(jiffies, timeout)) >>> 2034 break; >>> 2035 >>> 2036 hmm_range.notifier_seq = >>> mmu_interval_read_begin(notifier); >>> 2037 continue; >>> 2038 } >>> 2039 break; >>> 2040 } >>> 2041 if (err) >>> 2042 goto err_put; >>> 2043 >>> 2044 drm_gpusvm_notifier_lock(gpusvm); >>> 2045 if (mmu_interval_read_retry(notifier, >>> hmm_range.notifier_seq)) { >>> 2046 drm_gpusvm_notifier_unlock(gpusvm); >>> 2047 memset(src, 0, s
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
Matthew Brost writes: > On Fri, Oct 18, 2024 at 08:58:02AM +1100, Alistair Popple wrote: >> >> Matthew Brost writes: >> >> > On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: >> >> >> >> Matthew Brost writes: >> >> >> >> > On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: >> >> >> >> >> >> Matthew Brost writes: >> >> >> >> >> >> > On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: >> >> >> >> >> >> >> >> Matthew Brost writes: >> >> >> >> >> >> >> >> > On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: >> >> >> >> >> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: >> >> >> >> >> >> >> >> [...] >> >> >> >> >> >> >> >> >> > > +{ >> >> >> >> >> > > +unsigned long i; >> >> >> >> >> > > + >> >> >> >> >> > > +for (i = 0; i < npages; i++) { >> >> >> >> >> > > +struct page *page = pfn_to_page(src_pfns[i]); >> >> >> >> >> > > + >> >> >> >> >> > > +if (!get_page_unless_zero(page)) { >> >> >> >> >> > > +src_pfns[i] = 0; >> >> >> >> >> > > +continue; >> >> >> >> >> > > +} >> >> >> >> >> > > + >> >> >> >> >> > > +if (!trylock_page(page)) { >> >> >> >> >> > > +src_pfns[i] = 0; >> >> >> >> >> > > +put_page(page); >> >> >> >> >> > > +continue; >> >> >> >> >> > > +} >> >> >> >> >> > > + >> >> >> >> >> > > +src_pfns[i] = migrate_pfn(src_pfns[i]) | >> >> >> >> >> > > MIGRATE_PFN_MIGRATE; >> >> >> >> >> > >> >> >> >> >> > This needs to be converted to use a folio like >> >> >> >> >> > migrate_device_range(). But more importantly this should be >> >> >> >> >> > split out as >> >> >> >> >> > a function that both migrate_device_range() and this function >> >> >> >> >> > can call >> >> >> >> >> > given this bit is identical. >> >> >> >> >> > >> >> >> >> >> >> >> >> >> >> Missed the folio conversion and agree a helper shared between >> >> >> >> >> this >> >> >> >> >> function and migrate_device_range would be a good idea. Let add >> >> >> >> >> that. 
>> >> >> >> >> >> >> >> >> > >> >> >> >> > Alistair, >> >> >> >> > >> >> >> >> > Ok, I think now I want to go slightly different direction here to >> >> >> >> > give >> >> >> >> > GPUSVM a bit more control over several eviction scenarios. >> >> >> >> > >> >> >> >> > What if I exported the helper discussed above, e.g., >> >> >> >> > >> >> >> >> > 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) >> >> >> >> > 906 { >> >> >> >> > 907 struct folio *folio; >> >> >> >> > 908 >> >> >> >> > 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); >> >> >> >> > 910 if (!folio) >> >> >> >> > 911 return 0; >> >> >> >> > 912 >> >> >> >> > 913 if (!folio_trylock(folio)) { >> >> >> >> > 914 folio_put(folio); >> >> >> >> > 915 return 0; >> >> >> >> > 916 } >> >> >> >> > 917 >> >> >> >> > 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; >> >> >> >> > 919 } >> >> >> >> > 920 EXPORT_SYMBOL(migrate_device_pfn_lock); >> >> >> >> > >> >> >> >> > And then also export migrate_device_unmap. >> >> >> >> > >> >> >> >> > The usage here would be let a driver collect the device pages in >> >> >> >> > virtual >> >> >> >> > address range via hmm_range_fault, lock device pages under >> >> >> >> > notifier >> >> >> >> > lock ensuring device pages are valid, drop the notifier lock and >> >> >> >> > call >> >> >> >> > migrate_device_unmap. >> >> >> >> >> >> >> >> I'm still working through this series but that seems a bit dubious, >> >> >> >> the >> >> >> >> locking here is pretty subtle and easy to get wrong so seeing some >> >> >> >> code >> >> >> >> would help me a lot in understanding what you're suggesting. >> >> >> >> >> >> >> > >> >> >> > For sure locking in tricky, my mistake on not working through this >> >> >> > before sending out the next rev but it came to mind after sending + >> >> >> > regarding some late feedback from Thomas about using hmm for eviction >> >> >> > [2]. 
His suggestion of using hmm_range_fault to trigger migration >> >> >> > doesn't work for coherent pages, while something like below does. >> >> >> > >> >> >> > [2] >> >> >> > https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 >> >> >> > >> >> >> > Here is a snippet I have locally which seems to work. >> >> >> > >> >> >> > 2024 retry: >> >> >> > 2025 hmm_range.notifier_seq = >> >> >> > mmu_interval_read_begin(notifier); >> >> >> > 2026 hmm_range.hmm_pfns = src; >> >> >> > 2027 >> >> >> > 2028 while (true) { >> >> >> > 2029 mmap_read_lock(mm); >> >> >> > 2030 err = hmm_range_fault(&hmm_range); >> >> >> > 2031 mmap_read_unlock(mm); >> >> >> > 2032 if (err == -EBUSY) { >> >> >> > 2033 if (time_after
[PATCH] dma-buf: Eliminate all duplicate fences in dma_fence_unwrap_merge
When dma_fence_unwrap_merge is called on fence chains where the fences aren't ordered by context, the merging logic breaks down and we end up inserting fences twice. Doing this repeatedly leads to the number of fences going up exponentially, and in some gaming workloads we'll end up running out of memory to store the resulting array altogether, leading to a warning such as:

vkd3d_queue: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
CPU: 2 PID: 5287 Comm: vkd3d_queue Tainted: G S 6.10.7-200.fsync.fc40.x86_64 #1
Hardware name: Dell Inc. G5 5505/0NCW8W, BIOS 1.11.0 03/22/2022
Call Trace:
 dump_stack_lvl+0x5d/0x80
 warn_alloc+0x164/0x190
 ? srso_return_thunk+0x5/0x5f
 ? __alloc_pages_direct_compact+0x1d9/0x220
 __alloc_pages_slowpath.constprop.2+0xd14/0xd80
 __alloc_pages_noprof+0x32b/0x350
 ? dma_fence_array_create+0x48/0x110
 __kmalloc_large_node+0x6f/0x130
 __kmalloc_noprof+0x2dd/0x4a0
 ? dma_fence_array_create+0x48/0x110
 dma_fence_array_create+0x48/0x110
 __dma_fence_unwrap_merge+0x481/0x5b0
 sync_file_merge.constprop.0+0xf8/0x180
 sync_file_ioctl+0x476/0x590
 ? srso_return_thunk+0x5/0x5f
 ? __seccomp_filter+0xe8/0x5a0
 __x64_sys_ioctl+0x97/0xd0
 do_syscall_64+0x82/0x160
 ? srso_return_thunk+0x5/0x5f
 ? drm_syncobj_destroy_ioctl+0x8b/0xb0
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? __check_object_size+0x58/0x230
 ? srso_return_thunk+0x5/0x5f
 ? srso_return_thunk+0x5/0x5f
 ? drm_ioctl+0x2ba/0x530
 ? __pfx_drm_syncobj_destroy_ioctl+0x10/0x10
 ? srso_return_thunk+0x5/0x5f
 ? ktime_get_mono_fast_ns+0x3b/0xd0
 ? srso_return_thunk+0x5/0x5f
 ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu]
 ? srso_return_thunk+0x5/0x5f
 ? syscall_exit_to_user_mode+0x72/0x200
 ? srso_return_thunk+0x5/0x5f
 ? do_syscall_64+0x8e/0x160
 ? syscall_exit_to_user_mode+0x72/0x200
 ? srso_return_thunk+0x5/0x5f
 ? do_syscall_64+0x8e/0x160
 ? srso_return_thunk+0x5/0x5f
 ? syscall_exit_to_user_mode+0x72/0x200
 ? srso_return_thunk+0x5/0x5f
 ? do_syscall_64+0x8e/0x160
 ? do_syscall_64+0x8e/0x160
 ? srso_return_thunk+0x5/0x5f
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

It's a bit unfortunate that we end up with quadratic complexity w.r.t. the number of merged fences in all cases, but I'd argue in practice there shouldn't be more than a handful of in-flight fences to merge.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3617
Signed-off-by: Friedrich Vock
---
 drivers/dma-buf/dma-fence-unwrap.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c
index 628af51c81af..46277cef0bc6 100644
--- a/drivers/dma-buf/dma-fence-unwrap.c
+++ b/drivers/dma-buf/dma-fence-unwrap.c
@@ -68,7 +68,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
 	struct dma_fence *tmp, **array;
 	ktime_t timestamp;
 	unsigned int i;
-	size_t count;
+	size_t count, j;
 
 	count = 0;
 	timestamp = ns_to_ktime(0);
@@ -127,6 +127,10 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
 			 * function is used multiple times. So attempt to order
 			 * the fences by context as we pass over them and merge
 			 * fences with the same context.
+			 *
+			 * We will remove any remaining duplicate fences down
+			 * below, but doing this here saves us from having to
+			 * iterate over the array to detect the duplicate.
 			 */
 			if (!tmp || tmp->context > next->context) {
 				tmp = next;
@@ -145,7 +149,12 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences,
 		}
 
 		if (tmp) {
-			array[count++] = dma_fence_get(tmp);
+			for (j = 0; j < count; ++j) {
+				if (array[j] == tmp)
+					break;
+			}
+			if (j == count)
+				array[count++] = dma_fence_get(tmp);
 			fences[sel] = dma_fence_unwrap_next(&iter[sel]);
 		}
 	} while (tmp);
-- 
2.47.0
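For readers following along, the duplicate check at the end of the merge loop is an append-if-absent over the output array. Below is a minimal userspace sketch of that step, not the kernel code: struct fence and append_unique() are hypothetical stand-ins, and reference counting (dma_fence_get()) is left out. The scan has to compare against the entries already stored, array[j] for j < count, since array[count] is the slot about to be written.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for struct dma_fence; only pointer identity
 * matters for the duplicate check modelled here.
 */
struct fence {
	unsigned long context;
	unsigned long seqno;
};

/*
 * Append f to array unless the same pointer is already stored.
 * The caller must make sure array has room for count + 1 entries.
 * Returns the element count after the call.
 */
static size_t append_unique(struct fence **array, size_t count,
			    struct fence *f)
{
	size_t j;

	/* Linear scan over the entries stored so far: array[0..count-1]. */
	for (j = 0; j < count; ++j) {
		if (array[j] == f)
			return count;	/* duplicate, nothing appended */
	}

	array[count] = f;
	return count + 1;
}
```

Appending fences a, b, a in that order leaves two entries. Each append scans everything stored so far, which is where the quadratic cost mentioned in the commit message comes from.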
[git pull] drm fixes for 6.12-rc4
Hi Linus,

Weekly fixes, msm and xe are the two main ones, with a bunch of scattered fixes including a largish revert in mgag200, then amdgpu, vmwgfx and a scattering of other minor ones.

All seems pretty regular,

Regards,
Dave.

drm-fixes-2024-10-18:
drm fixes for 6.12-rc4

msm:
- Display:
  - move CRTC resource assignment to atomic_check to make consecutive calls to atomic_check() consistent
  - fix rounding / sign-extension issues with pclk calculation in case of DSC
  - cleanups to drop incorrect null checks in dpu snapshots
  - fix to use kvzalloc in dpu snapshot to avoid allocation issues in heavily loaded system cases
  - Fix to not program merge_3d block if dual LM is not being used
  - Fix to not flush merge_3d block if it's not enabled, otherwise this leads to false timeouts
- GPU:
  - a7xx: add a fence wait before SMMU table update

xe:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usage (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)

amdgpu:
- SR-IOV fix
- CS chunk handling fix
- MES fixes
- SMU13 fixes

amdkfd:
- VRAM usage reporting fix

radeon:
- Fix possible_clones handling

i915:
- Two DP bandwidth related MST fixes

ast:
- Clear EDID on unplugged connectors

host1x:
- Fix boot on Tegra186
- Set DMA parameters

mgag200:
- Revert VBLANK support

panel:
- himax-hx83102: Adjust power and gamma

qaic:
- Sgtable loop fixes

vmwgfx:
- Limit display layout allocation size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()

The following changes since commit 8e929cb546ee42c9a61d24fae60605e9e3192354:

  Linux 6.12-rc3 (2024-10-13 14:33:32 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/kernel.git tags/drm-fixes-2024-10-18

for you to fetch changes up to 83f000784844cb9d4669ef1a3366479db3197b33:

  Merge tag 'drm-xe-fixes-2024-10-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes (2024-10-18 13:53:41 +1000)

drm fixes for 6.12-rc4

msm:
- Display:
  - move CRTC resource assignment to atomic_check to make consecutive calls to atomic_check() consistent
  - fix rounding / sign-extension issues with pclk calculation in case of DSC
  - cleanups to drop incorrect null checks in dpu snapshots
  - fix to use kvzalloc in dpu snapshot to avoid allocation issues in heavily loaded system cases
  - Fix to not program merge_3d block if dual LM is not being used
  - Fix to not flush merge_3d block if it's not enabled, otherwise this leads to false timeouts
- GPU:
  - a7xx: add a fence wait before SMMU table update

xe:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usage (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)

amdgpu:
- SR-IOV fix
- CS chunk handling fix
- MES fixes
- SMU13 fixes

amdkfd:
- VRAM usage reporting fix

radeon:
- Fix possible_clones handling

i915:
- Two DP bandwidth related MST fixes

ast:
- Clear EDID on unplugged connectors

host1x:
- Fix boot on Tegra186
- Set DMA parameters

mgag200:
- Revert VBLANK support

panel:
- himax-hx83102: Adjust power and gamma

qaic:
- Sgtable loop fixes

vmwgfx:
- Limit display layout allocation size
- Handle allocation errors in connector checks
- Clean up KMS code for 2d-only setup
- Report surface-check errors correctly
- Remove NULL test around kvfree()

Alex Deucher (4):
  drm/amdgpu: enable enforce_isolation sysfs node on VFs
  drm/amdgpu/smu13: always apply the powersave optimization
  drm/amdgpu/swsmu: Only force workload setup on init
  drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUs

Aradhya Bhatia (1):
  drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg

Cong Yang (1):
  drm/panel: himax-hx83102: Adjust power and gamma to optimize brightness

Dave Airlie (5):
  Merge tag 'drm-msm-fixes-2024-10-16' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
  Merge tag 'amd-drm-fixes-6.12-2024-10-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  Merge tag 'drm-int
Re: [PATCH next] drm/amdgpu: Fix a double lock bug
On 10/18/2024 1:10 AM, Dan Carpenter wrote:
> This was supposed to be an unlock instead of a lock. The original
> code will lead to a deadlock.
>
> Fixes: ee52489d1210 ("drm/amdgpu: Place NPS mode request on unload")
> Signed-off-by: Dan Carpenter

Thanks, this is being taken care of with a follow-up patch -
https://patchwork.freedesktop.org/patch/620162/

Thanks,
Lijo

> ---
> From static analysis, not testing.
> ---
> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> index fcdbcff57632..3be07bcfd117 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
> @@ -1605,7 +1605,7 @@ int amdgpu_xgmi_request_nps_change(struct amdgpu_device *adev,
> 					gmc.xgmi.head)
> 		adev->gmc.gmc_funcs->request_mem_partition_mode(tmp_adev,
> 								cur_nps_mode);
> -	mutex_lock(&hive->hive_lock);
> +	mutex_unlock(&hive->hive_lock);
>
> 	return r;
> }
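To illustrate why the stray second lock matters: kernel mutexes are not recursive, so a path that takes hive_lock and then locks it again instead of unlocking never returns. Below is a toy userspace sketch of the two shapes, assuming a made-up struct toy_mutex that reports -EDEADLK where a real mutex would hang. None of these names exist in amdgpu.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: relocking reports -EDEADLK instead of hanging. */
struct toy_mutex {
	bool held;
};

static int toy_lock(struct toy_mutex *m)
{
	if (m->held)
		return -EDEADLK;
	m->held = true;
	return 0;
}

static void toy_unlock(struct toy_mutex *m)
{
	m->held = false;
}

/* Buggy shape: the lock taken at entry is "released" with a second lock. */
static int request_change_buggy(struct toy_mutex *m)
{
	int r = toy_lock(m);

	if (r)
		return r;
	/* ... work done under the lock ... */
	return toy_lock(m);	/* should have been toy_unlock() */
}

/* Fixed shape: every successful lock is paired with an unlock. */
static int request_change_fixed(struct toy_mutex *m)
{
	int r = toy_lock(m);

	if (r)
		return r;
	/* ... work done under the lock ... */
	toy_unlock(m);
	return 0;
}
```

With the fixed shape the function can be called repeatedly. With the buggy shape the first call already fails, and the lock stays held for every later caller.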
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
On 10/18/24 08:59, Alistair Popple wrote: > Matthew Brost writes: > >> On Fri, Oct 18, 2024 at 08:58:02AM +1100, Alistair Popple wrote: >>> Matthew Brost writes: >>> On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: > Matthew Brost writes: > >> On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: >>> Matthew Brost writes: >>> On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: > Matthew Brost writes: > >> On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: >>> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: > [...] > > +{ > + unsigned long i; > + > + for (i = 0; i < npages; i++) { > + struct page *page = pfn_to_page(src_pfns[i]); > + > + if (!get_page_unless_zero(page)) { > + src_pfns[i] = 0; > + continue; > + } > + > + if (!trylock_page(page)) { > + src_pfns[i] = 0; > + put_page(page); > + continue; > + } > + > + src_pfns[i] = migrate_pfn(src_pfns[i]) | > MIGRATE_PFN_MIGRATE; This needs to be converted to use a folio like migrate_device_range(). But more importantly this should be split out as a function that both migrate_device_range() and this function can call given this bit is identical. >>> Missed the folio conversion and agree a helper shared between this >>> function and migrate_device_range would be a good idea. Let add >>> that. >>> >> Alistair, >> >> Ok, I think now I want to go slightly different direction here to >> give >> GPUSVM a bit more control over several eviction scenarios. 
>> >> What if I exported the helper discussed above, e.g., >> >> 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) >> 906 { >> 907 struct folio *folio; >> 908 >> 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); >> 910 if (!folio) >> 911 return 0; >> 912 >> 913 if (!folio_trylock(folio)) { >> 914 folio_put(folio); >> 915 return 0; >> 916 } >> 917 >> 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; >> 919 } >> 920 EXPORT_SYMBOL(migrate_device_pfn_lock); >> >> And then also export migrate_device_unmap. >> >> The usage here would be let a driver collect the device pages in >> virtual >> address range via hmm_range_fault, lock device pages under notifier >> lock ensuring device pages are valid, drop the notifier lock and call >> migrate_device_unmap. > I'm still working through this series but that seems a bit dubious, > the > locking here is pretty subtle and easy to get wrong so seeing some > code > would help me a lot in understanding what you're suggesting. > For sure locking in tricky, my mistake on not working through this before sending out the next rev but it came to mind after sending + regarding some late feedback from Thomas about using hmm for eviction [2]. His suggestion of using hmm_range_fault to trigger migration doesn't work for coherent pages, while something like below does. [2] https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 Here is a snippet I have locally which seems to work. 2024 retry: 2025 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); 2026 hmm_range.hmm_pfns = src; 2027 2028 while (true) { 2029 mmap_read_lock(mm); 2030 err = hmm_range_fault(&hmm_range); 2031 mmap_read_unlock(mm); 2032 if (err == -EBUSY) { 2033 if (time_after(jiffies, timeout)) 2034 break; 2035 2036 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); 2037 continue; 2038 } 2039 break; 2040 } 2041 if (err) 2042 goto err_put; 2043 2044 drm_gpusvm_notifier_lo
[PULL] drm-xe-fixes
Hi Dave and Simona,

drm-xe-fixes for 6.12-rc4. Mostly some error path fixes and locking adjustments. The timestamp bit width fix corrects delta time calculations in userspace, and there is one display fix for the tile4 modifier on LNL/BMG.

thanks
Lucas De Marchi

drm-xe-fixes-2024-10-17:
Driver Changes:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usage (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)

The following changes since commit 8e929cb546ee42c9a61d24fae60605e9e3192354:

  Linux 6.12-rc3 (2024-10-13 14:33:32 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/xe/kernel.git tags/drm-xe-fixes-2024-10-17

for you to fetch changes up to ffafd12696d1a4c8eeb7386d798d75e1fafb4e01:

  drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater (2024-10-16 09:07:09 -0500)

Driver Changes:
- New workaround to Xe2 (Aradhya)
- Fix unbalanced rpm put (Matthew Auld)
- Remove fragile lock optimization (Matthew Brost)
- Fix job release, delegating it to the drm scheduler (Matthew Brost)
- Fix timestamp bit width for Xe2 (Lucas)
- Fix external BO's dma-resv usage (Matthew Brost)
- Fix returning success for timeout in wait_token (Nirmoy)
- Initialize fence to avoid it being detected as signaled (Matthew Auld)
- Improve cache flush for BMG (Matthew Auld)
- Don't allow hflip for tile4 framebuffer on Xe2 (Juha-Pekka)

Aradhya Bhatia (1):
  drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg

Juha-Pekka Heikkila (1):
  drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater

Lucas De Marchi (1):
  drm/xe/query: Increase timestamp width

Matthew Auld (4):
  drm/xe: fix unbalanced rpm put() with fence_fini()
  drm/xe: fix unbalanced rpm put() with declare_wedged()
  drm/xe/xe_sync: initialise ufence.signalled
  drm/xe/bmg: improve cache flushing behaviour

Matthew Brost (3):
  drm/xe: Take job list lock in xe_sched_add_pending_job
  drm/xe: Don't free job in TDR
  drm/xe: Use bookkeep slots for external BO's in exec IOCTL

Nirmoy Das (1):
  drm/xe/ufence: ufence can be signaled right after wait_woken

 drivers/gpu/drm/i915/display/intel_fb.c | 13 ++
 drivers/gpu/drm/i915/display/intel_fb.h | 1 +
 drivers/gpu/drm/i915/display/skl_universal_plane.c | 11
 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 3 ---
 drivers/gpu/drm/xe/xe_device.c | 4 +--
 drivers/gpu/drm/xe/xe_exec.c | 12 +++--
 drivers/gpu/drm/xe/xe_gpu_scheduler.h | 2 ++
 drivers/gpu/drm/xe/xe_gt.c | 1 -
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 29 ++
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h | 1 -
 drivers/gpu/drm/xe/xe_guc_submit.c | 7 --
 drivers/gpu/drm/xe/xe_query.c | 6 -
 drivers/gpu/drm/xe/xe_sync.c | 2 +-
 drivers/gpu/drm/xe/xe_vm.c | 8 ++
 drivers/gpu/drm/xe/xe_wa.c | 4 +++
 drivers/gpu/drm/xe/xe_wait_user_fence.c | 3 ---
 16 files changed, 63 insertions(+), 44 deletions(-)
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
Hi, On 10/18/24 00:58, Alistair Popple wrote: > Matthew Brost writes: > >> On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: >>> Matthew Brost writes: >>> On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: > Matthew Brost writes: > >> On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: >>> Matthew Brost writes: >>> On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: > On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: >>> [...] >>> >>> +{ >>> + unsigned long i; >>> + >>> + for (i = 0; i < npages; i++) { >>> + struct page *page = pfn_to_page(src_pfns[i]); >>> + >>> + if (!get_page_unless_zero(page)) { >>> + src_pfns[i] = 0; >>> + continue; >>> + } >>> + >>> + if (!trylock_page(page)) { >>> + src_pfns[i] = 0; >>> + put_page(page); >>> + continue; >>> + } >>> + >>> + src_pfns[i] = migrate_pfn(src_pfns[i]) | >>> MIGRATE_PFN_MIGRATE; >> This needs to be converted to use a folio like >> migrate_device_range(). But more importantly this should be split >> out as >> a function that both migrate_device_range() and this function can >> call >> given this bit is identical. >> > Missed the folio conversion and agree a helper shared between this > function and migrate_device_range would be a good idea. Let add that. > Alistair, Ok, I think now I want to go slightly different direction here to give GPUSVM a bit more control over several eviction scenarios. What if I exported the helper discussed above, e.g., 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) 906 { 907 struct folio *folio; 908 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); 910 if (!folio) 911 return 0; 912 913 if (!folio_trylock(folio)) { 914 folio_put(folio); 915 return 0; 916 } 917 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; 919 } 920 EXPORT_SYMBOL(migrate_device_pfn_lock); And then also export migrate_device_unmap. 
The usage here would be let a driver collect the device pages in virtual address range via hmm_range_fault, lock device pages under notifier lock ensuring device pages are valid, drop the notifier lock and call migrate_device_unmap. >>> I'm still working through this series but that seems a bit dubious, the >>> locking here is pretty subtle and easy to get wrong so seeing some code >>> would help me a lot in understanding what you're suggesting. >>> >> For sure locking in tricky, my mistake on not working through this >> before sending out the next rev but it came to mind after sending + >> regarding some late feedback from Thomas about using hmm for eviction >> [2]. His suggestion of using hmm_range_fault to trigger migration >> doesn't work for coherent pages, while something like below does. >> >> [2] >> https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 >> >> Here is a snippet I have locally which seems to work. >> >> 2024 retry: >> 2025 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); >> 2026 hmm_range.hmm_pfns = src; >> 2027 >> 2028 while (true) { >> 2029 mmap_read_lock(mm); >> 2030 err = hmm_range_fault(&hmm_range); >> 2031 mmap_read_unlock(mm); >> 2032 if (err == -EBUSY) { >> 2033 if (time_after(jiffies, timeout)) >> 2034 break; >> 2035 >> 2036 hmm_range.notifier_seq = >> mmu_interval_read_begin(notifier); >> 2037 continue; >> 2038 } >> 2039 break; >> 2040 } >> 2041 if (err) >> 2042 goto err_put; >> 2043 >> 2044 drm_gpusvm_notifier_lock(gpusvm); >> 2045 if (mmu_interval_read_retry(notifier, >> hmm_range.notifier_seq)) { >> 2046 drm_gpusvm_notifier_unlock(gpusvm); >> 2047 memset(src, 0, sizeof(*src) * npages); >> 2048 goto retry; >> 2049 } >> 2050 for (i = 0; i < npages; ++i)
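A rough userspace model of the migrate_device_pfn_lock() helper quoted above may help when reasoning about its return values. It is only a sketch: struct model_folio and the model_* names are hypothetical, the refcount and lock are plain fields rather than atomics, and the MODEL_PFN_* constants are assumed to mirror MIGRATE_PFN_VALID, MIGRATE_PFN_MIGRATE and MIGRATE_PFN_SHIFT from include/linux/migrate.h (worth checking against your tree).

```c
#include <stdbool.h>

/* Assumed to mirror include/linux/migrate.h; verify against your tree. */
#define MODEL_PFN_VALID		(1UL << 0)
#define MODEL_PFN_MIGRATE	(1UL << 1)
#define MODEL_PFN_SHIFT		6

/* Hypothetical single-threaded stand-in for struct folio. */
struct model_folio {
	int refcount;	/* 0 models a folio that is already being freed */
	bool locked;
};

/* Encode a pfn the way migrate_pfn() is assumed to. */
static unsigned long model_migrate_pfn(unsigned long pfn)
{
	return (pfn << MODEL_PFN_SHIFT) | MODEL_PFN_VALID;
}

/*
 * Take a reference, then try to lock. On any failure return 0 and
 * leave the folio as it was found, so the caller skips this pfn.
 */
static unsigned long model_pfn_lock(struct model_folio *folio,
				    unsigned long pfn)
{
	if (folio->refcount == 0)
		return 0;		/* could not get a reference */
	folio->refcount++;

	if (folio->locked) {
		folio->refcount--;	/* trylock failed, drop the reference */
		return 0;
	}
	folio->locked = true;

	return model_migrate_pfn(pfn) | MODEL_PFN_MIGRATE;
}
```

A zero entry in the source array therefore means "not collected for migration", while a non-zero entry implies the caller now holds both a reference and the lock and has to release them later.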
Re: [PATCH v10 0/3] Add initial support for the Rockchip RK3588 HDMI TX Controller
On Wed, 16 Oct 2024 23:06:50 +0300, Cristian Ciocaltea wrote:
> The Rockchip RK3588 SoC family integrates the Synopsys DesignWare HDMI
> 2.1 Quad-Pixel (QP) TX controller, which is a new IP block, quite
> different from those used in the previous generations of Rockchip SoCs.
>
> The controller supports the following features, among others:
>
> * Fixed Rate Link (FRL)
> * Display Stream Compression (DSC)
> * 4K@120Hz and 8K@60Hz video modes
> * Variable Refresh Rate (VRR) including Quick Media Switching (QMS)
> * Fast Vactive (FVA)
> * SCDC I2C DDC access
> * Multi-stream audio
> * Enhanced Audio Return Channel (EARC)
>
> [...]

Applied to misc/kernel.git (drm-misc-next).

Thanks!
Maxime
Re: vc4: HDMI Sink doesn't support RGB, something's wrong.
On Wed, Oct 16, 2024 at 07:16:43PM GMT, Dave Stevenson wrote: > Hi Stefan > > On Tue, 15 Oct 2024 at 22:13, Stefan Wahren wrote: > > > > Hi Dave, > > > > Am 15.10.24 um 11:32 schrieb Dave Stevenson: > > > On Mon, 14 Oct 2024 at 22:16, Stefan Wahren wrote: > > >> > > >> Am 14.10.24 um 12:54 schrieb Dave Stevenson: > > >>> On Mon, 14 Oct 2024 at 10:04, Maxime Ripard wrote: > > Hi, > > > > On Sun, Oct 13, 2024 at 09:57:58PM GMT, Stefan Wahren wrote: > > > Am 13.10.24 um 21:11 schrieb Dave Stevenson: > > >> Hi Stefan. > > >> > > >> On Sun, 13 Oct 2024, 18:19 Stefan Wahren, wrote: > > >> > > >> Hi, > > >> > > >> i recently switch for my suspend2idle tests from Raspberry Pi > > >> Bullseye > > >> to Bookworm. After that testing suspend2idle shows a new > > >> warning > > >> which i > > >> never saw before: > > >> > > >> HDMI Sink doesn't support RGB, something's wrong. > > >> > > >> > > >> Can you provide the edid of your display please? > > >> ... > > > > > > The failure is coming from sink_supports_format_bpc()[1], but the flag > > > for DRM_COLOR_FORMAT_RGB444 should have been set from > > > update_display_info()[2] parsing the EDID. > > > > > > Loading that EDID in via drm.edid_firmware has given me a console at > > > 1920x1200@60 without any issues, so I'm a little confused as to what > > > is going on. > > >> Since this warning only occurs on resume and not during normal boot, i > > >> would assume there is no issue with EDID. Maybe the flag get lost. I > > >> should have mention that X11 doesn't recover in this case and the > > >> display stays black. > > > Ah, I hadn't realised you meant it was only on resume that it didn't > > > come back up. > > > > > > I suspect you're right that the state gets lost somehow. It may be > > > triggered by the returning of connector_status_unknown on the > > > connector, but haven't traced it back. > > > > > > If I pick up your patches, what do I need to add to replicate this? 
> > i prepared a branch for you, which contains the latest suspend2idle patches:
> >
> > https://github.com/lategoodbye/linux-dev/commits/v6.12-pm/
> >
> > Steps:
> > 1. Flash latest Raspberry Pi OS (32 bit) on SD card
> > 2. Build Kernel from repo above with arm/multi_v7_defconfig
> > 3. Replace Kernel, modules + DTB on SD card with build ones
> > 4. add the following to config.txt
> >      device_tree=bcm2837-rpi-3-b-plus.dtb
> >      enable_uart=1
> > 5. change/add the following to cmdline.txt
> >      console=ttyS1,115200
> >      no_console_suspend=1
> > 6. connect the following devices to Raspberry Pi 3 B+ :
> >      USB mouse
> >      USB keyboard
> >      HDMI monitor
> >      Debug UART adapter (USB side to PC)
> > 7. Power on board and boot into X11
> > 8. Change to root
> > 9. Enable wakeup for ttyS1
>
> So I remember for next time
> echo enabled > /sys/class/tty/ttyS1/power/wakeup
>
> > 10. Trigger suspend to idle via X11 (echo freeze > /sys/power/state)
> > 11. Wakeup Raspberry Pi via Debug UART
>
> I don't get the error you are seeing, but I also don't get the display
> resuming.
> pm has obviously killed the power to the HDMI block, and it has the
> reset values in as can be seen via /sys/kernel/debug/dri/0/hdmi_regs.
> Nothing in the driver restores these registers, and I'm not sure if it
> is meant to do so.
> Run kmstest or similar from this state and the change of mode
> reprogrammes the blocks so we get the display back again.
>
> I've also enabled CONFIG_DRM_LOAD_EDID_FIRMWARE so that I can use your
> EDID, and get the same results.
>
> Knee-capping the HDMI block on suspend seems an unlikely mechanism to
> work reliably. On the more recent Pis there is a need to be quite
> careful in disabling the pipeline to avoid getting data stuck in
> FIFOs.
> I feel I must be missing something here.

I think we're probably missing calls to drm_mode_config_helper_suspend and
drm_mode_config_helper_resume.

Maxime
Re: [PATCH v3 00/23] drm/msm/dpu: Add Concurrent Writeback Support for DPU 10.x+
On Wed, Oct 16, 2024 at 06:21:06PM GMT, Jessica Zhang wrote:
> Changes in v3:
> - Dropped support for CWB on DP connectors for now
> - Dropped unnecessary PINGPONG array in *_setup_cwb()
> - Add a check to make sure CWB and CDM aren't supported simultaneously
>   (Dmitry)
> - Document cwb_enabled checks in dpu_crtc_get_topology() (Dmitry)
> - Moved implementation of drm_crtc_in_clone_mode() to drm_crtc.c (Jani)
> - Dropped duplicate error message for reserving CWB resources (Dmitry)
> - Added notes in framework changes about posting a separate series to
>   add proper KUnit tests (Maxime)

I mean, I asked for kunit tests, not for a note that is going to be
dropped when applying.

Maxime
Re: (subset) [PATCH v4 0/5] Add support for DisplayPort on SA8775P platform
On Fri, 04 Oct 2024 16:00:41 +0530, Soutrik Mukhopadhyay wrote:
> This series adds support for the DisplayPort controller
> and eDP PHY v5 found on the Qualcomm SA8775P platform.
>

Applied, thanks!

[1/5] dt-bindings: phy: Add eDP PHY compatible for sa8775p
      commit: 7adb3d221a4d6a4f5e0793c3bd35f1168934035c
[2/5] phy: qcom: edp: Introduce aux_cfg array for version specific aux settings
      commit: 913463587d528d766a8e12c7790995e273ec84fb
[3/5] phy: qcom: edp: Add support for eDP PHY on SA8775P
      commit: 3f12bf16213c30d8e645027efd94a19c13ee0253

Best regards,
-- 
~Vinod
[PATCH] drm/bridge: Mark the of_node of the aux bridge device as reused
There are some cases where sharing the of_node leaves different resource
providers confused about the same resource being shared by two different
devices. Avoid that by marking the of_node as reused, since the aux bridge
device is reusing the parent of_node.

Signed-off-by: Abel Vesa
---
 drivers/gpu/drm/bridge/aux-bridge.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/bridge/aux-bridge.c b/drivers/gpu/drm/bridge/aux-bridge.c
index b29980f95379ec7af873ed6e0fb79a9abb663c7b..ec3299ae49d6abdb75ee98acfe0682f1acc459f8 100644
--- a/drivers/gpu/drm/bridge/aux-bridge.c
+++ b/drivers/gpu/drm/bridge/aux-bridge.c
@@ -60,6 +60,7 @@ int drm_aux_bridge_register(struct device *parent)
 	adev->dev.parent = parent;
 	adev->dev.of_node = of_node_get(parent->of_node);
 	adev->dev.release = drm_aux_bridge_release;
+	adev->dev.of_node_reused = true;
 
 	ret = auxiliary_device_init(adev);
 	if (ret) {

---
base-commit: d61a00525464bfc5fe92c6ad713350988e492b88
change-id: 20241017-drm-aux-bridge-mark-of-node-reused-5c2ee740ff19

Best regards,
-- 
Abel Vesa
Re: [PATCH v2 02/29] mm/migrate: Add migrate_device_prepopulated_range
On Thu, Oct 17, 2024 at 04:49:11PM +1100, Alistair Popple wrote: > > Matthew Brost writes: > > > On Thu, Oct 17, 2024 at 02:21:13PM +1100, Alistair Popple wrote: > >> > >> Matthew Brost writes: > >> > >> > On Thu, Oct 17, 2024 at 12:49:55PM +1100, Alistair Popple wrote: > >> >> > >> >> Matthew Brost writes: > >> >> > >> >> > On Wed, Oct 16, 2024 at 04:46:52AM +, Matthew Brost wrote: > >> >> >> On Wed, Oct 16, 2024 at 03:04:06PM +1100, Alistair Popple wrote: > >> >> > >> >> [...] > >> >> > >> >> >> > > +{ > >> >> >> > > +unsigned long i; > >> >> >> > > + > >> >> >> > > +for (i = 0; i < npages; i++) { > >> >> >> > > +struct page *page = pfn_to_page(src_pfns[i]); > >> >> >> > > + > >> >> >> > > +if (!get_page_unless_zero(page)) { > >> >> >> > > +src_pfns[i] = 0; > >> >> >> > > +continue; > >> >> >> > > +} > >> >> >> > > + > >> >> >> > > +if (!trylock_page(page)) { > >> >> >> > > +src_pfns[i] = 0; > >> >> >> > > +put_page(page); > >> >> >> > > +continue; > >> >> >> > > +} > >> >> >> > > + > >> >> >> > > +src_pfns[i] = migrate_pfn(src_pfns[i]) | > >> >> >> > > MIGRATE_PFN_MIGRATE; > >> >> >> > > >> >> >> > This needs to be converted to use a folio like > >> >> >> > migrate_device_range(). But more importantly this should be split > >> >> >> > out as > >> >> >> > a function that both migrate_device_range() and this function can > >> >> >> > call > >> >> >> > given this bit is identical. > >> >> >> > > >> >> >> > >> >> >> Missed the folio conversion and agree a helper shared between this > >> >> >> function and migrate_device_range would be a good idea. Let add that. > >> >> >> > >> >> > > >> >> > Alistair, > >> >> > > >> >> > Ok, I think now I want to go slightly different direction here to give > >> >> > GPUSVM a bit more control over several eviction scenarios. 
> >> >> > > >> >> > What if I exported the helper discussed above, e.g., > >> >> > > >> >> > 905 unsigned long migrate_device_pfn_lock(unsigned long pfn) > >> >> > 906 { > >> >> > 907 struct folio *folio; > >> >> > 908 > >> >> > 909 folio = folio_get_nontail_page(pfn_to_page(pfn)); > >> >> > 910 if (!folio) > >> >> > 911 return 0; > >> >> > 912 > >> >> > 913 if (!folio_trylock(folio)) { > >> >> > 914 folio_put(folio); > >> >> > 915 return 0; > >> >> > 916 } > >> >> > 917 > >> >> > 918 return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; > >> >> > 919 } > >> >> > 920 EXPORT_SYMBOL(migrate_device_pfn_lock); > >> >> > > >> >> > And then also export migrate_device_unmap. > >> >> > > >> >> > The usage here would be let a driver collect the device pages in > >> >> > virtual > >> >> > address range via hmm_range_fault, lock device pages under notifier > >> >> > lock ensuring device pages are valid, drop the notifier lock and call > >> >> > migrate_device_unmap. > >> >> > >> >> I'm still working through this series but that seems a bit dubious, the > >> >> locking here is pretty subtle and easy to get wrong so seeing some code > >> >> would help me a lot in understanding what you're suggesting. > >> >> > >> > > >> > For sure locking in tricky, my mistake on not working through this > >> > before sending out the next rev but it came to mind after sending + > >> > regarding some late feedback from Thomas about using hmm for eviction > >> > [2]. His suggestion of using hmm_range_fault to trigger migration > >> > doesn't work for coherent pages, while something like below does. > >> > > >> > [2] > >> > https://patchwork.freedesktop.org/patch/610957/?series=137870&rev=1#comment_1125461 > >> > > >> > Here is a snippet I have locally which seems to work. 
> >> > > >> > 2024 retry: > >> > 2025 hmm_range.notifier_seq = mmu_interval_read_begin(notifier); > >> > 2026 hmm_range.hmm_pfns = src; > >> > 2027 > >> > 2028 while (true) { > >> > 2029 mmap_read_lock(mm); > >> > 2030 err = hmm_range_fault(&hmm_range); > >> > 2031 mmap_read_unlock(mm); > >> > 2032 if (err == -EBUSY) { > >> > 2033 if (time_after(jiffies, timeout)) > >> > 2034 break; > >> > 2035 > >> > 2036 hmm_range.notifier_seq = > >> > mmu_interval_read_begin(notifier); > >> > 2037 continue; > >> > 2038 } > >> > 2039 break; > >> > 2040 } > >> > 2041 if (err) > >> > 2042 goto err_put; > >> > 2043 > >> > 2044 drm_gpusvm_notifier_lock(gpusvm); > >> > 2045 if (mmu_interval_read_retry(notifier, > >> > hmm_range.notifier_seq)) { > >> > 2046 drm_gpusvm_notifier_unlock(gpusvm); > >>
Re: [PATCH V4 07/10] accel/amdxdna: Add command execution
On Wed, Oct 16, 2024 at 08:53:05PM -0700, Lizhi Hou wrote: > > On 10/14/24 19:13, Matthew Brost wrote: > > On Fri, Oct 11, 2024 at 04:12:41PM -0700, Lizhi Hou wrote: > > > Add interfaces for user application to submit command and wait for its > > > completion. > > > > > > Co-developed-by: Min Ma > > > Signed-off-by: Min Ma > > > Signed-off-by: Lizhi Hou > > > --- > > > drivers/accel/amdxdna/aie2_ctx.c | 624 +- > > > drivers/accel/amdxdna/aie2_message.c | 343 ++ > > > drivers/accel/amdxdna/aie2_pci.c | 6 + > > > drivers/accel/amdxdna/aie2_pci.h | 35 + > > > drivers/accel/amdxdna/aie2_psp.c | 2 + > > > drivers/accel/amdxdna/aie2_smu.c | 2 + > > > drivers/accel/amdxdna/amdxdna_ctx.c | 375 ++- > > > drivers/accel/amdxdna/amdxdna_ctx.h | 110 +++ > > > drivers/accel/amdxdna/amdxdna_gem.c | 1 + > > > .../accel/amdxdna/amdxdna_mailbox_helper.c| 5 + > > > drivers/accel/amdxdna/amdxdna_pci_drv.c | 6 + > > > drivers/accel/amdxdna/amdxdna_pci_drv.h | 5 + > > > drivers/accel/amdxdna/amdxdna_sysfs.c | 5 + > > > drivers/accel/amdxdna/npu1_regs.c | 1 + > > > drivers/accel/amdxdna/npu2_regs.c | 1 + > > > drivers/accel/amdxdna/npu4_regs.c | 1 + > > > drivers/accel/amdxdna/npu5_regs.c | 1 + > > > include/trace/events/amdxdna.h| 41 ++ > > > include/uapi/drm/amdxdna_accel.h | 59 ++ > > > 19 files changed, 1614 insertions(+), 9 deletions(-) > > > > > > diff --git a/drivers/accel/amdxdna/aie2_ctx.c > > > b/drivers/accel/amdxdna/aie2_ctx.c > > > index 617fc05077d9..f9010a902c99 100644 > > > --- a/drivers/accel/amdxdna/aie2_ctx.c > > > +++ b/drivers/accel/amdxdna/aie2_ctx.c > > > @@ -8,8 +8,11 @@ > > > #include > > > #include > > > #include > > > +#include > > > #include > > > +#include > > > +#include "aie2_msg_priv.h" > > > #include "aie2_pci.h" > > > #include "aie2_solver.h" > > > #include "amdxdna_ctx.h" > > > @@ -17,6 +20,361 @@ > > > #include "amdxdna_mailbox.h" > > > #include "amdxdna_pci_drv.h" > > > +bool force_cmdlist; > > > +module_param(force_cmdlist, bool, 0600); > > > 
+MODULE_PARM_DESC(force_cmdlist, "Force use command list (Default > > > false)"); > > > + > > > +#define HWCTX_MAX_TIMEOUT6 /* miliseconds */ > > > + > > > +static int > > > +aie2_hwctx_add_job(struct amdxdna_hwctx *hwctx, struct amdxdna_sched_job > > > *job) > > > +{ > > > + struct amdxdna_sched_job *other; > > > + int idx; > > > + > > > + idx = get_job_idx(hwctx->priv->seq); > > > + /* When pending list full, hwctx->seq points to oldest fence */ > > > + other = hwctx->priv->pending[idx]; > > > + if (other && other->fence) > > > + return -EAGAIN; > > > + > > > + if (other) { > > > + dma_fence_put(other->out_fence); > > > + amdxdna_job_put(other); > > > + } > > > + > > > + hwctx->priv->pending[idx] = job; > > > + job->seq = hwctx->priv->seq++; > > > + kref_get(&job->refcnt); > > > + > > > + return 0; > > > +} > > > + > > > +static struct amdxdna_sched_job * > > > +aie2_hwctx_get_job(struct amdxdna_hwctx *hwctx, u64 seq) > > > +{ > > > + int idx; > > > + > > > + /* Special sequence number for oldest fence if exist */ > > > + if (seq == AMDXDNA_INVALID_CMD_HANDLE) { > > > + idx = get_job_idx(hwctx->priv->seq); > > > + goto out; > > > + } > > > + > > > + if (seq >= hwctx->priv->seq) > > > + return ERR_PTR(-EINVAL); > > > + > > > + if (seq + HWCTX_MAX_CMDS < hwctx->priv->seq) > > > + return NULL; > > > + > > > + idx = get_job_idx(seq); > > > + > > > +out: > > > + return hwctx->priv->pending[idx]; > > > +} > > > + > > > +/* The bad_job is used in aie2_sched_job_timedout, otherwise, set it to > > > NULL */ > > > +static void aie2_hwctx_stop(struct amdxdna_dev *xdna, struct > > > amdxdna_hwctx *hwctx, > > > + struct drm_sched_job *bad_job) > > > +{ > > > + drm_sched_stop(&hwctx->priv->sched, bad_job); > > > + aie2_destroy_context(xdna->dev_handle, hwctx); > > > +} > > > + > > > +static int aie2_hwctx_restart(struct amdxdna_dev *xdna, struct > > > amdxdna_hwctx *hwctx) > > > +{ > > > + struct amdxdna_gem_obj *heap = hwctx->priv->heap; > > > + int ret; > > > + > > > + ret = 
aie2_create_context(xdna->dev_handle, hwctx); > > > + if (ret) { > > > + XDNA_ERR(xdna, "Create hwctx failed, ret %d", ret); > > > + goto out; > > > + } > > > + > > > + ret = aie2_map_host_buf(xdna->dev_handle, hwctx->fw_ctx_id, > > > + heap->mem.userptr, heap->mem.size); > > > + if (ret) { > > > + XDNA_ERR(xdna, "Map host buf failed, ret %d", ret); > > > + goto out; > > > + } > > > + > > > + if (hwctx->status != HWCTX_STAT_READY) { > > > + XDNA_DBG(xdna, "hwctx is not ready, status %d", hwctx->status); > > >
Re: [PATCH v3] locking/ww_mutex: Adjust to lockdep nest_lock requirements
On Thu, Oct 17, 2024 at 05:10:07PM +0200, Thomas Hellström wrote:
> When using mutex_acquire_nest() with a nest_lock, lockdep refcounts the
> number of acquired lockdep_maps of mutexes of the same class, and also
> keeps a pointer to the first acquired lockdep_map of a class. That pointer
> is then used for various comparison-, printing- and checking purposes,
> but there is no mechanism to actively ensure that lockdep_map stays in
> memory. Instead, a warning is printed if the lockdep_map is freed and
> there are still held locks of the same lock class, even if the lockdep_map
> itself has been released.
>
> In the context of WW/WD transactions that means that if a user unlocks
> and frees a ww_mutex from within an ongoing ww transaction, and that
> mutex happens to be the first ww_mutex grabbed in the transaction,
> such a warning is printed and there might be a risk of a UAF.
>
> Note that this is only a problem when lockdep is enabled and affects only
> dereferences of struct lockdep_map.
>
> Adjust to this by adding a fake lockdep_map to the acquired context and
> make sure it is the first acquired lockdep map of the associated
> ww_mutex class. Then hold it for the duration of the WW/WD transaction.
>
> This has the side effect that trying to lock a ww mutex *without* a
> ww_acquire_context but where such a context has been acquired, we'd see
> a lockdep splat. The test-ww_mutex.c selftest attempts to do that, so
> modify that particular test to not acquire a ww_acquire_context if it
> is not going to be used.
>
> v2:
> - Lower the number of locks in the test-ww_mutex
>   stress(STRESS_ALL) test to accommodate the dummy lock
>   introduced in this patch without overflowing lockdep held lock
>   references.
>
> v3:
> - Adjust the ww_test_normal locking-api selftest to avoid
>   recursive locking (Boqun Feng)
> - Initialize the dummy lock map with LD_WAIT_SLEEP to agree with
>   how the corresponding ww_mutex lockmaps are initialized
>   (Boqun Feng)
>

Thanks!

> Cc: Peter Zijlstra
> Cc: Ingo Molnar
> Cc: Will Deacon
> Cc: Waiman Long
> Cc: Boqun Feng
> Cc: Maarten Lankhorst
> Cc: Christian König
> Cc: dri-devel@lists.freedesktop.org
> Cc: linux-ker...@vger.kernel.org

Feel free to use these tags if you need.

Co-developed-by: Boqun Feng
Signed-off-by: Boqun Feng

> Signed-off-by: Thomas Hellström
> Acked-by: maarten.lankho...@linux.intel.com #v1

Tested-by: Boqun Feng

Peter, since the v2 of this is actually picked in tip/locking/core, I
assume you are going to drop that and pick this v3? Let me know how you
want to proceed, since I have a PR based on tip/locking/core.

Regards,
Boqun

> ---
>  include/linux/ww_mutex.h       | 14 ++
>  kernel/locking/test-ww_mutex.c |  8 +---
>  lib/locking-selftest.c         |  4 ++--
>  3 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index bb763085479a..45ff6f7a872b 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -65,6 +65,16 @@ struct ww_acquire_ctx {
>  #endif
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  	struct lockdep_map dep_map;
> +	/**
> +	 * @first_lock_dep_map: fake lockdep_map for first locked ww_mutex.
> +	 *
> +	 * lockdep requires the lockdep_map for the first locked ww_mutex
> +	 * in a ww transaction to remain in memory until all ww_mutexes of
> +	 * the transaction have been unlocked. Ensure this by keeping a
> +	 * fake locked ww_mutex lockdep map between ww_acquire_init() and
> +	 * ww_acquire_fini().
> +	 */
> +	struct lockdep_map first_lock_dep_map;
>  #endif
>  #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>  	unsigned int deadlock_inject_interval;
> @@ -146,7 +156,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
>  	debug_check_no_locks_freed((void *)ctx, sizeof(*ctx));
>  	lockdep_init_map(&ctx->dep_map, ww_class->acquire_name,
>  			 &ww_class->acquire_key, 0);
> +	lockdep_init_map_wait(&ctx->first_lock_dep_map, ww_class->mutex_name,
> +			      &ww_class->mutex_key, 0, LD_WAIT_SLEEP);
>  	mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
> +	mutex_acquire_nest(&ctx->first_lock_dep_map, 0, 0, &ctx->dep_map, _RET_IP_);
>  #endif
>  #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
>  	ctx->deadlock_inject_interval = 1;
> @@ -185,6 +198,7 @@ static inline void ww_acquire_done(struct ww_acquire_ctx *ctx)
>  static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
>  {
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> +	mutex_release(&ctx->first_lock_dep_map, _THIS_IP_);
>  	mutex_release(&ctx->dep_map, _THIS_IP_);
>  #endif
>  #ifdef DEBUG_WW_MUTEXES
> diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c
> index 10a5736a21c2..5d58b2c0ef98 100644
> --- a/kernel/locking/test-ww_mutex.c
> +++ b/kernel/locking/test-ww_mutex.c
> @@ -62,7 +62,8 @@ static int __test_mutex(unsigned int flags)
>
Re: [PATCH] drm/bridge: Mark the of_node of the aux bridge device as reused
On Thu, Oct 17, 2024 at 06:35:26PM +0300, Abel Vesa wrote:
> There are some cases where sharing the of_node renders different resources
> providers confused about the same resource being shared by two different
> devices.

Can you be more specific about what type of issue you are trying to avoid
here? Is it due to pinctrl for example?

> Avoid that by marking the of_node as reused since the aux bridge
> device is reusing the parent of_node.
>
> Signed-off-by: Abel Vesa
> ---
>  drivers/gpu/drm/bridge/aux-bridge.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/bridge/aux-bridge.c b/drivers/gpu/drm/bridge/aux-bridge.c
> index b29980f95379ec7af873ed6e0fb79a9abb663c7b..ec3299ae49d6abdb75ee98acfe0682f1acc459f8 100644
> --- a/drivers/gpu/drm/bridge/aux-bridge.c
> +++ b/drivers/gpu/drm/bridge/aux-bridge.c
> @@ -60,6 +60,7 @@ int drm_aux_bridge_register(struct device *parent)
>  	adev->dev.parent = parent;
>  	adev->dev.of_node = of_node_get(parent->of_node);
>  	adev->dev.release = drm_aux_bridge_release;
> +	adev->dev.of_node_reused = true;

Please use the intended device_set_of_node_from_dev() helper for this.

>
>  	ret = auxiliary_device_init(adev);
>  	if (ret) {

Johan
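
For readers unfamiliar with the helper named in the review above, here is a
rough userspace model of what device_set_of_node_from_dev() does, as far as I
understand it: it drops any node reference the device already holds, takes a
reference on the other device's node, and sets of_node_reused. Everything
prefixed mock_ is a stand-in invented for this sketch and is not kernel API;
the real helper operates on struct device and uses of_node_get()/of_node_put().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device-tree node. */
struct mock_of_node {
	int refcount;
};

/* Hypothetical stand-in holding only the two struct device fields used here. */
struct mock_device {
	struct mock_of_node *of_node;
	bool of_node_reused;
};

/* Take a reference; NULL is passed through, as with of_node_get(). */
static struct mock_of_node *mock_of_node_get(struct mock_of_node *np)
{
	if (np)
		np->refcount++;
	return np;
}

/* Drop a reference; NULL is ignored, as with of_node_put(). */
static void mock_of_node_put(struct mock_of_node *np)
{
	if (np) {
		assert(np->refcount > 0);
		np->refcount--;
	}
}

/*
 * Model of the helper: release the old node (if any), share dev2's node
 * with a new reference, and flag the node as reused so that resource
 * providers can tell the two devices apart.
 */
static void mock_device_set_of_node_from_dev(struct mock_device *dev,
					     const struct mock_device *dev2)
{
	mock_of_node_put(dev->of_node);
	dev->of_node = mock_of_node_get(dev2->of_node);
	dev->of_node_reused = true;
}
```

With such a helper, the two assignments in the patch (of_node and
of_node_reused) would collapse into a single call, which is presumably the
point of the review comment.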
[PATCH] drm/syncobj: ensure progress for syncobj queries
Userspace might poll a syncobj with the query ioctl. Call
dma_fence_enable_sw_signaling to ensure dma_fence_is_signaled returns
true in finite time.
---

panvk hits this issue when timeline semaphore is enabled. It uses the
transfer ioctl to propagate fences. dma_fence_unwrap_merge converts the
dma_fence_chain to a dma_fence_array. dma_fence_array_signaled never
returns true unless signaling is enabled.
---
 drivers/gpu/drm/drm_syncobj.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 4fcfc0b9b386c..58c5593c897a2 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1689,6 +1689,9 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
 			    DRM_SYNCOBJ_QUERY_FLAGS_LAST_SUBMITTED) {
 				point = fence->seqno;
 			} else {
+				/* ensure forward progress */
+				dma_fence_enable_sw_signaling(fence);
+
 				dma_fence_chain_for_each(iter, fence) {
 					if (iter->context != fence->context) {
 						dma_fence_put(iter);
-- 
2.47.0.rc1.288.g06298d1525-goog
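
The failure mode described in the patch notes can be illustrated with a small
userspace model. This is only a sketch under assumptions: the mock_ types and
functions are invented here and are not the dma_fence API. It assumes, as the
patch notes state, that the array's signaled check relies on a pending counter
that is only decremented by callbacks, and that those callbacks are only
installed once signaling is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_MAX_FENCES 4

/* Hypothetical stand-in for a member fence. */
struct mock_fence {
	bool signaled;
};

/* Hypothetical stand-in for a fence array that aggregates member fences. */
struct mock_fence_array {
	struct mock_fence *fences[MOCK_MAX_FENCES];
	size_t num_fences;
	size_t num_pending;     /* only updated through installed callbacks */
	bool signaling_enabled; /* whether callbacks have been installed */
};

static void mock_array_init(struct mock_fence_array *a,
			    struct mock_fence **fences, size_t n)
{
	size_t i;

	assert(n <= MOCK_MAX_FENCES);
	for (i = 0; i < n; i++)
		a->fences[i] = fences[i];
	a->num_fences = n;
	a->num_pending = n;
	a->signaling_enabled = false;
}

/* The query trusts the counter and never inspects the member fences. */
static bool mock_array_signaled(const struct mock_fence_array *a)
{
	return a->num_pending == 0;
}

/* Install callbacks; members that already signaled are accounted at once. */
static void mock_array_enable_signaling(struct mock_fence_array *a)
{
	size_t i;

	if (a->signaling_enabled)
		return;
	a->signaling_enabled = true;
	for (i = 0; i < a->num_fences; i++)
		if (a->fences[i]->signaled)
			a->num_pending--;
}

/* Signal one member; the array only notices if callbacks are installed. */
static void mock_fence_signal(struct mock_fence_array *a, size_t idx)
{
	struct mock_fence *f = a->fences[idx];

	if (f->signaled)
		return;
	f->signaled = true;
	if (a->signaling_enabled)
		a->num_pending--;
}
```

In this model, a caller that only ever polls mock_array_signaled() sees false
forever, even after every member has signaled, until something calls
mock_array_enable_signaling(). That is the gap the patch closes in the query
ioctl.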
Re: vc4: HDMI Sink doesn't support RGB, something's wrong.
On Thu, 17 Oct 2024 at 16:59, Maxime Ripard wrote:
>
> On Thu, Oct 17, 2024 at 05:26:46PM GMT, Stefan Wahren wrote:
> > On 17.10.24 at 16:27, Maxime Ripard wrote:
> > > On Wed, Oct 16, 2024 at 07:16:43PM GMT, Dave Stevenson wrote:
> > > > Hi Stefan
> > > >
> > > > On Tue, 15 Oct 2024 at 22:13, Stefan Wahren wrote:
> > > > > Hi Dave,
> > ...
> > > > > i prepared a branch for you, which contains the latest suspend2idle
> > > > > patches:
> > > > >
> > > > > https://github.com/lategoodbye/linux-dev/commits/v6.12-pm/
> > > > >
> > > > > Steps:
> > > > > 1. Flash latest Raspberry Pi OS (32 bit) on SD card
> > > > > 2. Build Kernel from repo above with arm/multi_v7_defconfig
> > > > > 3. Replace Kernel, modules + DTB on SD card with build ones
> > > > > 4. add the following to config.txt
> > > > >      device_tree=bcm2837-rpi-3-b-plus.dtb
> > > > >      enable_uart=1
> > > > > 5. change/add the following to cmdline.txt
> > > > >      console=ttyS1,115200
> > > > >      no_console_suspend=1
> > > > > 6. connect the following devices to Raspberry Pi 3 B+ :
> > > > >      USB mouse
> > > > >      USB keyboard
> > > > >      HDMI monitor
> > > > >      Debug UART adapter (USB side to PC)
> > > > > 7. Power on board and boot into X11
> > > > > 8. Change to root
> > > > > 9. Enable wakeup for ttyS1
> > > > So I remember for next time
> > > > echo enabled > /sys/class/tty/ttyS1/power/wakeup
> > > >
> > > > > 10. Trigger suspend to idle via X11 (echo freeze > /sys/power/state)
> > > > > 11. Wakeup Raspberry Pi via Debug UART
> > > > I don't get the error you are seeing, but I also don't get the display
> > > > resuming.
> > > > pm has obviously killed the power to the HDMI block, and it has the
> > > > reset values in as can be seen via /sys/kernel/debug/dri/0/hdmi_regs.
> > > > Nothing in the driver restores these registers, and I'm not sure if it
> > > > is meant to do so.
> > > > Run kmstest or similar from this state and the change of mode
> > > > reprogrammes the blocks so we get the display back again.
> > > >
> > > > I've also enabled CONFIG_DRM_LOAD_EDID_FIRMWARE so that I can use your
> > > > EDID, and get the same results.
> > > >
> > > > Knee-capping the HDMI block on suspend seems an unlikely mechanism to
> > > > work reliably. On the more recent Pis there is a need to be quite
> > > > careful in disabling the pipeline to avoid getting data stuck in
> > > > FIFOs.
> > > > I feel I must be missing something here.
> > >
> > > I think we're probably missing calls to drm_mode_config_helper_suspend and
> > > drm_mode_config_helper_resume.
> >
> > Okay, i tried this and it works better (HDMI resumes fast), but it also
> > triggers a lot of WARN
>
> vc4_plane_reset deviates from the helper there:
> https://elixir.bootlin.com/linux/v6.11.3/source/drivers/gpu/drm/drm_atomic_state_helper.c#L326
>
> We should adjust it.

Yes, it looks like that WARN is inappropriate, and we should be
freeing the old state.

> > and the "doesn't support RGB ..." warnings are still there.
> >
> > I pushed my changes to the branch and attached the dmesg output.
>
> I can't help you there, it doesn't make sense to me. The EDID should be
> correct.

Nor can I.

I've just taken the latest branch and HDMI does resume correctly after
suspend now.

We have seen monitors that do weird things on HPD when they stop
getting video and go into standby mode, so I wonder if that is the
case with your monitor.

I do wonder if the HDMI part of the display is the correct place to
handle drm_mode_config_helper_[suspend|resume]. All other users seem
to do it at the base DRM driver level, which would be vc4_drv.c.
I've done that and pushed it to
https://github.com/6by9/linux/tree/lategoodbye-suspend. That also
works for me without your changes to the HDMI side.
That branch also includes the above fix to vc4_plane_reset too.

  Dave
Re: [PATCH v7 1/5] drm: Introduce device wedged event
On Thu, Oct 17, 2024 at 09:59:10AM +0200, Christian König wrote:
> On 17.10.24 at 04:47, Raag Jadav wrote:
> > On Mon, Sep 30, 2024 at 01:08:41PM +0530, Raag Jadav wrote:
> > > Introduce device wedged event, which will notify userspace of wedged
> > > (hanged/unusable) state of the DRM device through a uevent. This is
> > > useful especially in cases where the device is no longer operating as
> > > expected even after a hardware reset and has become unrecoverable from
> > > driver context.
>
> Well introduce is probably the wrong wording since i915 already has that and
> amdgpu looked into it but never upstreamed the support.

In i915 we have the reset and error uevents, but not one specific for 'wedge'.
This would indeed be a new one.

>
> I would rather say standardize.
>
> > >
> > > Purpose of this implementation is to provide drivers a generic way to
> > > recover with the help of userspace intervention. Different drivers may
> > > have different ideas of a "wedged device" depending on their hardware
> > > implementation, and hence the vendor agnostic nature of the event.
> > > It is up to the drivers to decide when they see the need for recovery
> > > and how they want to recover from the available methods.
> > >
> > > Current implementation defines three recovery methods, out of which,
> > > drivers can choose to support any one or multiple of them. Preferred
> > > recovery method will be sent in the uevent environment as WEDGED=.
> > > Userspace consumers (sysadmin) can define udev rules to parse this event
> > > and take respective action to recover the device.
> > >
> > >     =============== ==================================
> > >     Recovery method Consumer expectations
> > >     =============== ==================================
> > >     rebind          unbind + rebind driver
> > >     bus-reset       unbind + reset bus device + rebind
> > >     reboot          reboot system
> > >     =============== ==================================
>
> Well that sounds like userspace would need to be involved in recovery.
>
> That in turn is a complete no-go since we at least need to signal all
> dma_fences to unblock the kernel. In other words things like bus reset needs
> to happen inside the kernel and *not* in userspace.
>
> What we can do is to signal to userspace: Hey a bus reset of device X
> happened, maybe restart container, daemon, whatever service which was using
> this device.

Well, when we declare a device 'wedged' it is because we don't want to take
any drastic measures inside the kernel and want to leave it in a protected and
unusable state, in a way that users wouldn't lose display, for instance, or at
least so that the device is in a debuggable state.

Then, the instructions here are to tell what could possibly be attempted from
userspace to get the device to a usable state.

The 'wedge' mode (the one emitting this uevent) needs to be responsible for
signaling all the fences and everything needed for a clean unbind and whatever
next step might be indicated to userspace.

That should already be part of any wedged mode, regardless of the uevent to
inform userspace here.

>
> Regards,
> Christian.
>
> > >
> > > v4: s/drm_dev_wedged/drm_dev_wedged_event
> > >     Use drm_info() (Jani)
> > >     Kernel doc adjustment (Aravind)
> > > v5: Send recovery method with uevent (Lina)
> > > v6: Access wedge_recovery_opts[] using helper function (Jani)
> > >     Use snprintf() (Jani)
> > > v7: Convert recovery helpers into regular functions (Andy, Jani)
> > >     Aesthetic adjustments (Andy)
> > >     Handle invalid method cases
> > >
> > > Signed-off-by: Raag Jadav
> > > ---
> > Cc'ing amd, collabora and others as I found semi-related work at
> >
> > https://lore.kernel.org/dri-devel/20230627132323.115440-1-andrealm...@igalia.com/
> > https://lore.kernel.org/amd-gfx/20240725150055.1991893-1-alexander.deuc...@amd.com/
> > https://lore.kernel.org/dri-devel/20241011225906.3789965-3-adrian.laru...@collabora.com/
> > https://lore.kernel.org/amd-gfx/CAAxE2A5v_RkZ9ex4=7jibskvb22_1faj0aanbcmktett5c3...@mail.gmail.com/
> >
> >
> > Please share feedback about usefulness and adoption of this.
> > Improvements are welcome.
> > > > Raag > > > > > drivers/gpu/drm/drm_drv.c | 77 +++ > > > include/drm/drm_device.h | 23 > > > include/drm/drm_drv.h | 3 ++ > > > 3 files changed, 103 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > > index ac30b0ec9d93..cfe9600da2ee 100644 > > > --- a/drivers/gpu/drm/drm_drv.c > > > +++ b/drivers/gpu/drm/drm_drv.c > > > @@ -26,6 +26,8 @@ > > >* DEALINGS IN THE SOFTWARE. > > >*/ > > > +#include > > > +#include > > > #include > > > #include > > > #include > > > @@ -33,6 +35,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > #include > > > #include > > > @@ -70,6 +73,42 @@ static struct dentry *drm_debugfs_
Re: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic()
Hi Jinjie, kernel test robot noticed the following build warnings: [auto build test WARNING on drm-misc/drm-misc-next] [also build test WARNING on linus/master v6.12-rc3 next-20241017] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Jinjie-Ruan/drm-connector-hdmi-Fix-memory-leak-in-drm_display_mode_from_cea_vic/20241014-152022 base: git://anongit.freedesktop.org/drm/drm-misc drm-misc-next patch link: https://lore.kernel.org/r/20241014071632.989108-2-ruanjinjie%40huawei.com patch subject: [PATCH 1/3] drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic() config: x86_64-randconfig-121-20241017 (https://download.01.org/0day-ci/archive/20241018/202410180045.ubklh7fi-...@intel.com/config) compiler: gcc-12 (Debian 12.2.0-14) 12.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241018/202410180045.ubklh7fi-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202410180045.ubklh7fi-...@intel.com/ sparse warnings: (new ones prefixed by >>) >> drivers/gpu/drm/tests/drm_connector_test.c:1008:31: sparse: sparse: >> incorrect type in argument 2 (different modifiers) @@ expected struct >> drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] >> mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1008:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1008:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1031:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1031:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1031:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1051:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1051:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1051:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1074:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1074:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1074:31: sparse: got struct drm_display_mode const *[assigned] mode 
drivers/gpu/drm/tests/drm_connector_test.c:1094:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1094:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1094:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1117:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1117:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1117:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1142:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1142:31: sparse: expected struct drm_display_mode *mode drivers/gpu/drm/tests/drm_connector_test.c:1142:31: sparse: got struct drm_display_mode const *[assigned] mode drivers/gpu/drm/tests/drm_connector_test.c:1182:31: sparse: sparse: incorrect type in argument 2 (different modifiers) @@ expected struct drm_display_mode *mode @@ got struct drm_display_mode const *[assigned] mode @@ drivers/gpu/drm/tests/drm_connector_test.c:1182:31: sparse: expected struct drm_dis
Re: [PATCH] i915: fix DRM_I915_GVT_KVMGT dependencies
On Thu, Oct 17, 2024 at 05:11:37AM +, Arnd Bergmann wrote:
> On Thu, Oct 17, 2024, at 00:26, Sean Christopherson wrote:
> > On Tue, Oct 15, 2024, Arnd Bergmann wrote:
> >> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> >> index 46301c06d18a..985cb78d8256 100644
> >> --- a/drivers/gpu/drm/i915/Kconfig
> >> +++ b/drivers/gpu/drm/i915/Kconfig
> >> @@ -118,9 +118,8 @@ config DRM_I915_USERPTR
> >>  config DRM_I915_GVT_KVMGT
> >>  	tristate "Enable KVM host support Intel GVT-g graphics virtualization"
> >>  	depends on DRM_I915
> >> -	depends on X86
> >> +	depends on KVM_X86
> >
> > Can GVT-g even work on non-Intel CPUs? I.e. would it make sense to take a
> > dependency on KVM_INTEL?
> >
>
> Yes, I think that should work, but I'm not sure if it needs a dependency
> on both KVM_INTEL and KVM_X86 in that case, to handle the link-time
> dependency in all configurations.

Not sure either, but let's take the safest approach.
Pushed this patch. Thanks.

>
> Arnd
Re: vc4: HDMI Sink doesn't support RGB, something's wrong.
On 17.10.24 at 18:37, Dave Stevenson wrote:
On Thu, 17 Oct 2024 at 16:59, Maxime Ripard wrote:
On Thu, Oct 17, 2024 at 05:26:46PM GMT, Stefan Wahren wrote:
On 17.10.24 at 16:27, Maxime Ripard wrote:
On Wed, Oct 16, 2024 at 07:16:43PM GMT, Dave Stevenson wrote:

Hi Stefan

On Tue, 15 Oct 2024 at 22:13, Stefan Wahren wrote:

Hi Dave, ...

I prepared a branch for you, which contains the latest suspend2idle patches:
https://github.com/lategoodbye/linux-dev/commits/v6.12-pm/

Steps:
1. Flash latest Raspberry Pi OS (32 bit) on SD card
2. Build Kernel from repo above with arm/multi_v7_defconfig
3. Replace Kernel, modules + DTB on SD card with the built ones
4. Add the following to config.txt:
   device_tree=bcm2837-rpi-3-b-plus.dtb
   enable_uart=1
5. Change/add the following in cmdline.txt:
   console=ttyS1,115200 no_console_suspend=1
6. Connect the following devices to Raspberry Pi 3 B+:
   USB mouse
   USB keyboard
   HDMI monitor
   Debug UART adapter (USB side to PC)
7. Power on board and boot into X11
8. Change to root
9. Enable wakeup for ttyS1

So I remember for next time:
echo enabled > /sys/class/tty/ttyS1/power/wakeup

10. Trigger suspend to idle via X11 (echo freeze > /sys/power/state)
11. Wakeup Raspberry Pi via Debug UART

I don't get the error you are seeing, but I also don't get the display resuming. pm has obviously killed the power to the HDMI block, and it holds its reset values, as can be seen via /sys/kernel/debug/dri/0/hdmi_regs. Nothing in the driver restores these registers, and I'm not sure if it is meant to do so. Run kmstest or similar from this state and the change of mode reprogrammes the blocks so we get the display back again.

I've also enabled CONFIG_DRM_LOAD_EDID_FIRMWARE so that I can use your EDID, and get the same results.

Knee-capping the HDMI block on suspend seems an unlikely mechanism to work reliably. On the more recent Pis there is a need to be quite careful in disabling the pipeline to avoid getting data stuck in FIFOs. I feel I must be missing something here.
I think we're probably missing calls to drm_mode_config_helper_suspend and drm_mode_config_helper_resume. Okay, i tried this and it works better (HDMI resumes fast), but it also triggers a lot of WARN vc4_plane_reset deviates from the helper there: https://elixir.bootlin.com/linux/v6.11.3/source/drivers/gpu/drm/drm_atomic_state_helper.c#L326 We should adjust it. Yes, it looks like that WARN is inappropriate, and we should be freeing the old state. and the "doesn't support RGB ..." warnings are still there. I pushed my changes to the branch and attached the dmesg output. I can't help you there, it doesn't make sense to me. The EDID should be correct. Nor can I. I've just taken the latest branch and HDMI does resume correctly after suspend now. No problem. At the end I just wanted to know if the warning was related to the problem that HDMI doesn't resume. Now it's clear these are not related and I can investigate further. We have seen monitors that do weird things on HPD when they stop getting video and go into standby mode, so I wonder if that is the case with your monitor. I do wonder if the HDMI part of the display is the correct place to handle drm_mode_config_helper_[suspend|resume]. All other users seem to do it at the base DRM driver level, which would be vc4_drv.c. I've done that and pushed it to https://github.com/6by9/linux/tree/lategoodbye-suspend. That also works for me without your changes to the HDMI side. That branch also includes the above fix to vc4_plane_reset too. Yes, that's a better place. Nice, thank you. Dave
[PATCH v3] locking/ww_mutex: Adjust to lockdep nest_lock requirements
When using mutex_acquire_nest() with a nest_lock, lockdep refcounts the number of acquired lockdep_maps of mutexes of the same class, and also keeps a pointer to the first acquired lockdep_map of a class. That pointer is then used for various comparison, printing and checking purposes, but there is no mechanism to actively ensure that lockdep_map stays in memory. Instead, a warning is printed if the lockdep_map is freed and there are still held locks of the same lock class, even if the lockdep_map itself has been released.

In the context of WW/WD transactions that means that if a user unlocks and frees a ww_mutex from within an ongoing ww transaction, and that mutex happens to be the first ww_mutex grabbed in the transaction, such a warning is printed and there might be a risk of a UAF.

Note that this is only a problem when lockdep is enabled and affects only dereferences of struct lockdep_map.

Adjust to this by adding a fake lockdep_map to the acquired context and make sure it is the first acquired lockdep map of the associated ww_mutex class. Then hold it for the duration of the WW/WD transaction.

This has the side effect that trying to lock a ww mutex *without* a ww_acquire_context while such a context has been acquired produces a lockdep splat. The test-ww_mutex.c selftest attempts to do that, so modify that particular test to not acquire a ww_acquire_context if it is not going to be used.

v2:
- Lower the number of locks in the test-ww_mutex stress(STRESS_ALL) test to accommodate the dummy lock introduced in this patch without overflowing lockdep held lock references.
v3: - Adjust the ww_test_normal locking-api selftest to avoid recursive locking (Boqun Feng) - Initialize the dummy lock map with LD_WAIT_SLEEP to agree with how the corresponding ww_mutex lockmaps are initialized (Boqun Feng) Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: Maarten Lankhorst Cc: Christian König Cc: dri-devel@lists.freedesktop.org Cc: linux-ker...@vger.kernel.org Signed-off-by: Thomas Hellström Acked-by: maarten.lankho...@linux.intel.com #v1 --- include/linux/ww_mutex.h | 14 ++ kernel/locking/test-ww_mutex.c | 8 +--- lib/locking-selftest.c | 4 ++-- 3 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h index bb763085479a..45ff6f7a872b 100644 --- a/include/linux/ww_mutex.h +++ b/include/linux/ww_mutex.h @@ -65,6 +65,16 @@ struct ww_acquire_ctx { #endif #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map dep_map; + /** +* @first_lock_dep_map: fake lockdep_map for first locked ww_mutex. +* +* lockdep requires the lockdep_map for the first locked ww_mutex +* in a ww transaction to remain in memory until all ww_mutexes of +* the transaction have been unlocked. Ensure this by keeping a +* fake locked ww_mutex lockdep map between ww_acquire_init() and +* ww_acquire_fini(). 
+*/ + struct lockdep_map first_lock_dep_map; #endif #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH unsigned int deadlock_inject_interval; @@ -146,7 +156,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx, debug_check_no_locks_freed((void *)ctx, sizeof(*ctx)); lockdep_init_map(&ctx->dep_map, ww_class->acquire_name, &ww_class->acquire_key, 0); + lockdep_init_map_wait(&ctx->first_lock_dep_map, ww_class->mutex_name, + &ww_class->mutex_key, 0, LD_WAIT_SLEEP); mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_); + mutex_acquire_nest(&ctx->first_lock_dep_map, 0, 0, &ctx->dep_map, _RET_IP_); #endif #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH ctx->deadlock_inject_interval = 1; @@ -185,6 +198,7 @@ static inline void ww_acquire_done(struct ww_acquire_ctx *ctx) static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx) { #ifdef CONFIG_DEBUG_LOCK_ALLOC + mutex_release(&ctx->first_lock_dep_map, _THIS_IP_); mutex_release(&ctx->dep_map, _THIS_IP_); #endif #ifdef DEBUG_WW_MUTEXES diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 10a5736a21c2..5d58b2c0ef98 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -62,7 +62,8 @@ static int __test_mutex(unsigned int flags) int ret; ww_mutex_init(&mtx.mutex, &ww_class); - ww_acquire_init(&ctx, &ww_class); + if (flags & TEST_MTX_CTX) + ww_acquire_init(&ctx, &ww_class); INIT_WORK_ONSTACK(&mtx.work, test_mutex_work); init_completion(&mtx.ready); @@ -90,7 +91,8 @@ static int __test_mutex(unsigned int flags) ret = wait_for_completion_timeout(&mtx.done, TIMEOUT); } ww_mutex_unlock(&mtx.mutex); - ww_acquire_fini(&ctx); + if (flags & TEST_MTX_CTX) + ww_acquire_fini(&ctx); if (ret)
[PATCH] dma-buf/dma-fence_array: use kvzalloc
Reports indicate that some userspace applications try to merge more than 80k fences into a single dma_fence_array, leading to a warning from kzalloc() that the requested size is too big.

While that is clearly a userspace bug, we should probably handle that case gracefully in the kernel.

So we can either reject requests to merge more than a reasonable number of fences (64k maybe?) or we can start to use kvzalloc() instead of kzalloc(). This patch does the latter.

Signed-off-by: Christian König
CC: sta...@vger.kernel.org
---
 drivers/dma-buf/dma-fence-array.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
index 8a08ffde31e7..46ac42bcfac0 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -119,8 +119,8 @@ static void dma_fence_array_release(struct dma_fence *fence)
 	for (i = 0; i < array->num_fences; ++i)
 		dma_fence_put(array->fences[i]);
 
-	kfree(array->fences);
-	dma_fence_free(fence);
+	kvfree(array->fences);
+	kvfree_rcu(fence, rcu);
 }
 
 static void dma_fence_array_set_deadline(struct dma_fence *fence,
@@ -153,7 +153,7 @@ struct dma_fence_array *dma_fence_array_alloc(int num_fences)
 {
 	struct dma_fence_array *array;
 
-	return kzalloc(struct_size(array, callbacks, num_fences), GFP_KERNEL);
+	return kvzalloc(struct_size(array, callbacks, num_fences), GFP_KERNEL);
 }
 EXPORT_SYMBOL(dma_fence_array_alloc);
 
--
2.34.1
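For a sense of scale, the size computation behind dma_fence_array_alloc() can be approximated in plain userspace C. This is only a sketch: flex_array_size() is a hypothetical helper written in the spirit of the kernel's struct_size(), and the header and per-entry sizes passed to it are assumptions, since the real values depend on architecture and kernel configuration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace approximation of a struct_size()-style computation
 * (hypothetical helper, not the kernel macro): header + count * entry,
 * saturating to SIZE_MAX on overflow so that a following allocation
 * fails instead of silently being undersized.
 */
static size_t flex_array_size(size_t header, size_t entry, size_t count)
{
	if (entry != 0 && count > (SIZE_MAX - header) / entry)
		return SIZE_MAX;
	return header + entry * count;
}
```

With an assumed 64-byte header and 32-byte entries, 80000 fences already require a multi-megabyte, physically contiguous buffer from kzalloc(); kvzalloc() can fall back to vmalloc() when such a contiguous allocation is not available.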
[PATCH 01/11] accel/ivpu: Do not fail when more than 1 tile is fused
From: Karol Wachowski

Allow TILE_FUSE register to disable more than 1 tile. The driver should not prevent such configurations from being functional.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_hw_btrs.c | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_hw_btrs.c b/drivers/accel/ivpu/ivpu_hw_btrs.c
index 6d5f1cc711435..3212c99f36823 100644
--- a/drivers/accel/ivpu/ivpu_hw_btrs.c
+++ b/drivers/accel/ivpu/ivpu_hw_btrs.c
@@ -141,16 +141,10 @@ static int read_tile_config_fuse(struct ivpu_device *vdev, u32 *tile_fuse_config
 	}
 
 	config = REG_GET_FLD(VPU_HW_BTRS_LNL_TILE_FUSE, CONFIG, fuse);
-	if (!tile_disable_check(config)) {
-		ivpu_err(vdev, "Fuse: Invalid tile disable config (0x%x)\n", config);
-		return -EIO;
-	}
+	if (!tile_disable_check(config))
+		ivpu_warn(vdev, "More than 1 tile disabled, tile fuse config mask: 0x%x\n", config);
 
-	if (config)
-		ivpu_dbg(vdev, MISC, "Fuse: %d tiles enabled. Tile number %d disabled\n",
-			 BTRS_LNL_TILE_MAX_NUM - 1, ffs(config) - 1);
-	else
-		ivpu_dbg(vdev, MISC, "Fuse: All %d tiles enabled\n", BTRS_LNL_TILE_MAX_NUM);
+	ivpu_dbg(vdev, MISC, "Tile disable config mask: 0x%x\n", config);
 
 	*tile_fuse_config = config;
 	return 0;
--
2.45.1
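The relaxed check in the patch above can be illustrated outside the driver. A minimal userspace sketch, assuming that tile_disable_check() accepts a mask with at most one bit set (its body is not part of this patch) and using a hypothetical tile count; none of the names below are driver symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_TILE_MAX_NUM 6 /* hypothetical tile count, for illustration only */

/* Count the set bits in the fuse mask; each set bit marks one disabled tile. */
static unsigned int tiles_disabled(uint32_t config)
{
	unsigned int count = 0;

	/* Ignore bits that do not correspond to a tile. */
	config &= (1u << EXAMPLE_TILE_MAX_NUM) - 1;

	while (config) {
		config &= config - 1; /* clear the lowest set bit */
		count++;
	}
	return count;
}

/* Assumed behaviour of the old check: true when at most one tile is disabled. */
static bool at_most_one_tile_disabled(uint32_t config)
{
	return tiles_disabled(config) <= 1;
}

/* Number of tiles that remain usable under the given fuse mask. */
static unsigned int tiles_enabled(uint32_t config)
{
	return EXAMPLE_TILE_MAX_NUM - tiles_disabled(config);
}
```

Under this reading, a mask such as 0x5 (two tiles fused off) was rejected with -EIO before the patch and now only produces a warning, while the raw mask is still passed on to the caller.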
[PATCH 02/11] accel/ivpu: Defer MMU root page table allocation
From: Karol Wachowski Defer root page table allocation and unify context init/fini functions. Move allocation of the root page table from the file_priv_open function to perform a lazy allocation approach during ivpu_bo_pin(). By doing so, we avoid the overhead of allocating page tables for simple operations like GET_PARAM that do not require them. Additionally, the MMU context descriptor table initialization has been moved to the ivpu_mmu_context_map_page function. This change streamlines the process and ensures that the descriptor table is only initialized when it is actually needed. Refactor init/fini functions to remove redundant code and make the context management more straightforward. Overall, these changes lead to a reduction in the time taken by the file descriptor open operation, as the costly root page table allocation is now avoided for operations that do not require it. Signed-off-by: Karol Wachowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz --- drivers/accel/ivpu/ivpu_drv.c | 12 +-- drivers/accel/ivpu/ivpu_mmu.c | 94 ++--- drivers/accel/ivpu/ivpu_mmu.h | 4 +- drivers/accel/ivpu/ivpu_mmu_context.c | 145 +- drivers/accel/ivpu/ivpu_mmu_context.h | 9 +- 5 files changed, 115 insertions(+), 149 deletions(-) diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c index e7d8967c02f29..34e3e9b1c3f23 100644 --- a/drivers/accel/ivpu/ivpu_drv.c +++ b/drivers/accel/ivpu/ivpu_drv.c @@ -86,7 +86,7 @@ static void file_priv_unbind(struct ivpu_device *vdev, struct ivpu_file_priv *fi ivpu_cmdq_release_all_locked(file_priv); ivpu_bo_unbind_all_bos_from_context(vdev, &file_priv->ctx); - ivpu_mmu_user_context_fini(vdev, &file_priv->ctx); + ivpu_mmu_context_fini(vdev, &file_priv->ctx); file_priv->bound = false; drm_WARN_ON(&vdev->drm, !xa_erase_irq(&vdev->context_xa, file_priv->ctx.id)); } @@ -254,9 +254,7 @@ static int ivpu_open(struct drm_device *dev, struct drm_file *file) goto err_unlock; } - ret = ivpu_mmu_user_context_init(vdev, 
&file_priv->ctx, ctx_id); - if (ret) - goto err_xa_erase; + ivpu_mmu_context_init(vdev, &file_priv->ctx, ctx_id); file_priv->default_job_limit.min = FIELD_PREP(IVPU_JOB_ID_CONTEXT_MASK, (file_priv->ctx.id - 1)); @@ -273,8 +271,6 @@ static int ivpu_open(struct drm_device *dev, struct drm_file *file) return 0; -err_xa_erase: - xa_erase_irq(&vdev->context_xa, ctx_id); err_unlock: mutex_unlock(&vdev->context_list_lock); mutex_destroy(&file_priv->ms_lock); @@ -652,9 +648,7 @@ static int ivpu_dev_init(struct ivpu_device *vdev) if (ret) goto err_shutdown; - ret = ivpu_mmu_global_context_init(vdev); - if (ret) - goto err_shutdown; + ivpu_mmu_global_context_init(vdev); ret = ivpu_mmu_init(vdev); if (ret) diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c index c078e214b2212..4ff0d7a519859 100644 --- a/drivers/accel/ivpu/ivpu_mmu.c +++ b/drivers/accel/ivpu/ivpu_mmu.c @@ -696,7 +696,7 @@ int ivpu_mmu_invalidate_tlb(struct ivpu_device *vdev, u16 ssid) return ret; } -static int ivpu_mmu_cd_add(struct ivpu_device *vdev, u32 ssid, u64 cd_dma) +static int ivpu_mmu_cdtab_entry_set(struct ivpu_device *vdev, u32 ssid, u64 cd_dma, bool valid) { struct ivpu_mmu_info *mmu = vdev->mmu; struct ivpu_mmu_cdtab *cdtab = &mmu->cdtab; @@ -708,30 +708,29 @@ static int ivpu_mmu_cd_add(struct ivpu_device *vdev, u32 ssid, u64 cd_dma) return -EINVAL; entry = cdtab->base + (ssid * IVPU_MMU_CDTAB_ENT_SIZE); - - if (cd_dma != 0) { - cd[0] = FIELD_PREP(IVPU_MMU_CD_0_TCR_T0SZ, IVPU_MMU_T0SZ_48BIT) | - FIELD_PREP(IVPU_MMU_CD_0_TCR_TG0, 0) | - FIELD_PREP(IVPU_MMU_CD_0_TCR_IRGN0, 0) | - FIELD_PREP(IVPU_MMU_CD_0_TCR_ORGN0, 0) | - FIELD_PREP(IVPU_MMU_CD_0_TCR_SH0, 0) | - FIELD_PREP(IVPU_MMU_CD_0_TCR_IPS, IVPU_MMU_IPS_48BIT) | - FIELD_PREP(IVPU_MMU_CD_0_ASID, ssid) | - IVPU_MMU_CD_0_TCR_EPD1 | - IVPU_MMU_CD_0_AA64 | - IVPU_MMU_CD_0_R | - IVPU_MMU_CD_0_ASET | - IVPU_MMU_CD_0_V; - cd[1] = cd_dma & IVPU_MMU_CD_1_TTB0_MASK; - cd[2] = 0; - cd[3] = 0x7444; - - /* For global context generate 
memory fault on VPU */ - if (ssid == IVPU_GLOBAL_CONTEXT_MMU_SSID) - cd[0] |= IVPU_MMU_CD_0_A; - } else { - memset(cd, 0, sizeof(cd));
[PATCH 03/11] accel/ivpu: Remove copy engine support
From: Andrzej Kacprowski Copy engine was deprecated by the FW and is no longer supported. Compute engine includes all copy engine functionality and should be used instead. This change does not affect user space as the copy engine was never used outside of a couple of tests. Signed-off-by: Andrzej Kacprowski Reviewed-by: Jacek Lawrynowicz Signed-off-by: Jacek Lawrynowicz --- drivers/accel/ivpu/ivpu_drv.h | 5 +--- drivers/accel/ivpu/ivpu_job.c | 43 +++ drivers/accel/ivpu/ivpu_jsm_msg.c | 8 +++--- include/uapi/drm/ivpu_accel.h | 6 + 4 files changed, 21 insertions(+), 41 deletions(-) diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h index f905021ac1748..5b4f5104b4708 100644 --- a/drivers/accel/ivpu/ivpu_drv.h +++ b/drivers/accel/ivpu/ivpu_drv.h @@ -49,11 +49,8 @@ #define IVPU_JOB_ID_JOB_MASK GENMASK(7, 0) #define IVPU_JOB_ID_CONTEXT_MASK GENMASK(31, 8) -#define IVPU_NUM_ENGINES 2 #define IVPU_NUM_PRIORITIES4 -#define IVPU_NUM_CMDQS_PER_CTX (IVPU_NUM_ENGINES * IVPU_NUM_PRIORITIES) - -#define IVPU_CMDQ_INDEX(engine, priority) ((engine) * IVPU_NUM_PRIORITIES + (priority)) +#define IVPU_NUM_CMDQS_PER_CTX (IVPU_NUM_PRIORITIES) #define IVPU_PLATFORM_SILICON 0 #define IVPU_PLATFORM_SIMICS 2 diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c index 98e0b7b614071..f580959e87787 100644 --- a/drivers/accel/ivpu/ivpu_job.c +++ b/drivers/accel/ivpu/ivpu_job.c @@ -247,8 +247,7 @@ static int ivpu_cmdq_fini(struct ivpu_file_priv *file_priv, struct ivpu_cmdq *cm static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u16 engine, u8 priority) { - int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority); - struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx]; + struct ivpu_cmdq *cmdq = file_priv->cmdq[priority]; int ret; lockdep_assert_held(&file_priv->lock); @@ -257,7 +256,7 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u16 cmdq = ivpu_cmdq_alloc(file_priv); if (!cmdq) return NULL; - 
file_priv->cmdq[cmdq_idx] = cmdq; + file_priv->cmdq[priority] = cmdq; } ret = ivpu_cmdq_init(file_priv, cmdq, engine, priority); @@ -267,15 +266,14 @@ static struct ivpu_cmdq *ivpu_cmdq_acquire(struct ivpu_file_priv *file_priv, u16 return cmdq; } -static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 engine, u8 priority) +static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u8 priority) { - int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority); - struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx]; + struct ivpu_cmdq *cmdq = file_priv->cmdq[priority]; lockdep_assert_held(&file_priv->lock); if (cmdq) { - file_priv->cmdq[cmdq_idx] = NULL; + file_priv->cmdq[priority] = NULL; ivpu_cmdq_fini(file_priv, cmdq); ivpu_cmdq_free(file_priv, cmdq); } @@ -283,14 +281,12 @@ static void ivpu_cmdq_release_locked(struct ivpu_file_priv *file_priv, u16 engin void ivpu_cmdq_release_all_locked(struct ivpu_file_priv *file_priv) { - u16 engine; u8 priority; lockdep_assert_held(&file_priv->lock); - for (engine = 0; engine < IVPU_NUM_ENGINES; engine++) - for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++) - ivpu_cmdq_release_locked(file_priv, engine, priority); + for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++) + ivpu_cmdq_release_locked(file_priv, priority); } /* @@ -301,19 +297,15 @@ void ivpu_cmdq_release_all_locked(struct ivpu_file_priv *file_priv) */ static void ivpu_cmdq_reset(struct ivpu_file_priv *file_priv) { - u16 engine; u8 priority; mutex_lock(&file_priv->lock); - for (engine = 0; engine < IVPU_NUM_ENGINES; engine++) { - for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++) { - int cmdq_idx = IVPU_CMDQ_INDEX(engine, priority); - struct ivpu_cmdq *cmdq = file_priv->cmdq[cmdq_idx]; + for (priority = 0; priority < IVPU_NUM_PRIORITIES; priority++) { + struct ivpu_cmdq *cmdq = file_priv->cmdq[priority]; - if (cmdq) - cmdq->db_registered = false; - } + if (cmdq) + cmdq->db_registered = false; } 
mutex_unlock(&file_priv->lock); @@ -334,16 +326,11 @@ void ivpu_cmdq_reset_all_contexts(struct ivpu_device *vdev) static void ivpu_cmdq_fini_all(struct ivpu_file_priv *file_priv) { - u16 engine; u8 priority; - for (engine = 0; engine < IVPU_NUM_ENGINES; engine++) { -
[PATCH 00/11] accel/ivpu: Changes for 6.13-rc5
- Remove support for deprecated and unused copy engine
- Improved open() performance by lazy allocating MMU page tables
- Error handling fixes in MMU code
- Extend VPU address ranges to allow bigger workloads

Andrzej Kacprowski (1):
  accel/ivpu: Remove copy engine support

Karol Wachowski (9):
  accel/ivpu: Do not fail when more than 1 tile is fused
  accel/ivpu: Defer MMU root page table allocation
  accel/ivpu: Clear CDTAB entry in case of failure
  accel/ivpu: Unmap partially mapped BOs in case of errors
  accel/ivpu: Use xa_alloc_cyclic() instead of custom function
  accel/ivpu: Make command queue ID allocated on XArray
  accel/ivpu: Don't allocate preemption buffers when MIP is disabled
  accel/ivpu: Increase DMA address range
  accel/ivpu: Move secondary preemption buffer allocation to DMA range

Maciej Falkowski (1):
  accel/ivpu: Add debug Kconfig option

 drivers/accel/ivpu/Kconfig            |  10 ++
 drivers/accel/ivpu/Makefile           |   2 +
 drivers/accel/ivpu/ivpu_drv.c         |  31 +++--
 drivers/accel/ivpu/ivpu_drv.h         |  16 +--
 drivers/accel/ivpu/ivpu_fw.c          |   8 +-
 drivers/accel/ivpu/ivpu_hw.c          |  10 +-
 drivers/accel/ivpu/ivpu_hw_btrs.c     |  12 +-
 drivers/accel/ivpu/ivpu_job.c         | 148 ++--
 drivers/accel/ivpu/ivpu_job.h         |   2 +
 drivers/accel/ivpu/ivpu_jsm_msg.c     |   8 +-
 drivers/accel/ivpu/ivpu_mmu.c         | 101 ++--
 drivers/accel/ivpu/ivpu_mmu.h         |   4 +-
 drivers/accel/ivpu/ivpu_mmu_context.c | 158 ++
 drivers/accel/ivpu/ivpu_mmu_context.h |   9 +-
 drivers/accel/ivpu/ivpu_pm.c          |   2 +
 include/uapi/drm/ivpu_accel.h         |   6 +-
 16 files changed, 243 insertions(+), 284 deletions(-)

--
2.45.1
[PATCH 05/11] accel/ivpu: Unmap partially mapped BOs in case of errors
From: Karol Wachowski

Ensure that all buffers that were created only partially through allocated scatter-gather table are unmapped from MMU600 in case of errors.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_mmu_context.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu_context.c b/drivers/accel/ivpu/ivpu_mmu_context.c
index 8992fe93b679a..697b57071d546 100644
--- a/drivers/accel/ivpu/ivpu_mmu_context.c
+++ b/drivers/accel/ivpu/ivpu_mmu_context.c
@@ -432,6 +432,7 @@ int
 ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 			 u64 vpu_addr, struct sg_table *sgt, bool llc_coherent)
 {
+	size_t start_vpu_addr = vpu_addr;
 	struct scatterlist *sg;
 	int ret;
 	u64 prot;
@@ -462,7 +463,7 @@ ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 		ret = ivpu_mmu_context_map_pages(vdev, ctx, vpu_addr, dma_addr, size, prot);
 		if (ret) {
 			ivpu_err(vdev, "Failed to map context pages\n");
-			goto err_unlock;
+			goto err_unmap_pages;
 		}
 		vpu_addr += size;
 	}
@@ -472,7 +473,7 @@ ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 		if (ret) {
 			ivpu_err(vdev, "Failed to set context descriptor for context %u: %d\n",
 				 ctx->id, ret);
-			goto err_unlock;
+			goto err_unmap_pages;
 		}
 		ctx->is_cd_valid = true;
 	}
@@ -480,17 +481,19 @@ ivpu_mmu_context_map_sgt(struct ivpu_device *vdev, struct ivpu_mmu_context *ctx,
 	/* Ensure page table modifications are flushed from wc buffers to memory */
 	wmb();
 
-	mutex_unlock(&ctx->lock);
-
 	ret = ivpu_mmu_invalidate_tlb(vdev, ctx->id);
-	if (ret)
+	if (ret) {
 		ivpu_err(vdev, "Failed to invalidate TLB for ctx %u: %d\n", ctx->id, ret);
-	return ret;
+		goto err_unmap_pages;
+	}
 
-err_unlock:
 	mutex_unlock(&ctx->lock);
-	return ret;
+	return 0;
+err_unmap_pages:
+	ivpu_mmu_context_unmap_pages(ctx, start_vpu_addr, vpu_addr - start_vpu_addr);
+	mutex_unlock(&ctx->lock);
+	return ret;
 }
 
 void
--
2.45.1
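The unwinding pattern used by the patch above (remember the start address, advance as chunks are mapped, and on any failure unmap exactly the range mapped so far) can be shown in isolation. A minimal userspace sketch with hypothetical stand-ins for the mapping primitives; map_chunk(), unmap_range() and the page array are illustration only and are not driver functions.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_PAGES 16

static int fake_mapped[FAKE_PAGES]; /* 1 when the page is "mapped" */
static int fail_at = -1;            /* page index that makes map_chunk() fail, -1 for never */

/*
 * Hypothetical mapping primitive: map `count` pages starting at page `addr`.
 * All-or-nothing: on failure nothing in this chunk is left mapped.
 * Callers must keep addr + count within FAKE_PAGES.
 */
static int map_chunk(uint64_t addr, size_t count)
{
	for (size_t i = 0; i < count; i++)
		if ((int)(addr + i) == fail_at)
			return -1;
	for (size_t i = 0; i < count; i++)
		fake_mapped[addr + i] = 1;
	return 0;
}

/* Hypothetical counterpart: unmap `count` pages starting at page `addr`. */
static void unmap_range(uint64_t addr, size_t count)
{
	for (size_t i = 0; i < count; i++)
		fake_mapped[addr + i] = 0;
}

/* Map consecutive chunks; on failure, undo everything mapped by this call. */
static int map_all(uint64_t addr, const size_t *chunks, size_t nchunks)
{
	uint64_t start = addr;

	for (size_t i = 0; i < nchunks; i++) {
		if (map_chunk(addr, chunks[i]) != 0)
			goto err_unmap;
		addr += chunks[i];
	}
	return 0;

err_unmap:
	/* addr - start is exactly the length that was successfully mapped. */
	unmap_range(start, addr - start);
	return -1;
}

/* Number of pages currently marked as mapped. */
static size_t mapped_count(void)
{
	size_t n = 0;

	for (size_t i = 0; i < FAKE_PAGES; i++)
		n += (size_t)fake_mapped[i];
	return n;
}
```

Because the cursor is advanced only after a chunk succeeds, the difference between the cursor and the saved start address always describes the range that has to be undone, which is the same arithmetic the patch passes to ivpu_mmu_context_unmap_pages().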
[PATCH 06/11] accel/ivpu: Use xa_alloc_cyclic() instead of custom function
From: Karol Wachowski

Remove custom ivpu_id_alloc() wrapper used for ID allocations and replace
it with the standard xa_alloc_cyclic() API.

The idea behind ivpu_id_alloc() was to have monotonic IDs, so the driver
is easier to debug because the same IDs are not reused all over. The same
can be achieved just by using the appropriate Linux API.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_drv.c | 11 ++++-------
 drivers/accel/ivpu/ivpu_drv.h |  4 ++--
 drivers/accel/ivpu/ivpu_job.c | 34 ++++++----------------------------
 3 files changed, 12 insertions(+), 37 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 34e3e9b1c3f23..383e3eb988983 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -256,10 +256,8 @@ static int ivpu_open(struct drm_device *dev, struct drm_file *file)
 
 	ivpu_mmu_context_init(vdev, &file_priv->ctx, ctx_id);
 
-	file_priv->default_job_limit.min = FIELD_PREP(IVPU_JOB_ID_CONTEXT_MASK,
-						      (file_priv->ctx.id - 1));
-	file_priv->default_job_limit.max = file_priv->default_job_limit.min | IVPU_JOB_ID_JOB_MASK;
-	file_priv->job_limit = file_priv->default_job_limit;
+	file_priv->job_limit.min = FIELD_PREP(IVPU_JOB_ID_CONTEXT_MASK, (file_priv->ctx.id - 1));
+	file_priv->job_limit.max = file_priv->job_limit.min | IVPU_JOB_ID_JOB_MASK;
 
 	mutex_unlock(&vdev->context_list_lock);
 	drm_dev_exit(idx);
@@ -618,9 +616,8 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
 	lockdep_set_class(&vdev->submitted_jobs_xa.xa_lock, &submitted_jobs_xa_lock_class_key);
 	INIT_LIST_HEAD(&vdev->bo_list);
 
-	vdev->default_db_limit.min = IVPU_MIN_DB;
-	vdev->default_db_limit.max = IVPU_MAX_DB;
-	vdev->db_limit = vdev->default_db_limit;
+	vdev->db_limit.min = IVPU_MIN_DB;
+	vdev->db_limit.max = IVPU_MAX_DB;
 
 	ret = drmm_mutex_init(&vdev->drm, &vdev->context_list_lock);
 	if (ret)
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 5b4f5104b4708..6774402821706 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -137,7 +137,7 @@ struct ivpu_device {
 
 	struct xarray db_xa;
 	struct xa_limit db_limit;
-	struct xa_limit default_db_limit;
+	u32 db_next;
 
 	struct mutex bo_list_lock; /* Protects bo_list */
 	struct list_head bo_list;
@@ -174,7 +174,7 @@ struct ivpu_file_priv {
 	struct list_head ms_instance_list;
 	struct ivpu_bo *ms_info_bo;
 	struct xa_limit job_limit;
-	struct xa_limit default_job_limit;
+	u32 job_id_next;
 	bool has_mmu_faults;
 	bool bound;
 	bool aborted;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index f580959e87787..9154c2e14245f 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -72,26 +72,6 @@ static void ivpu_preemption_buffers_free(struct ivpu_device *vdev,
 	ivpu_bo_free(cmdq->secondary_preempt_buf);
 }
 
-static int ivpu_id_alloc(struct xarray *xa, u32 *id, void *entry, struct xa_limit *limit,
-			 const struct xa_limit default_limit)
-{
-	int ret;
-
-	ret = __xa_alloc(xa, id, entry, *limit, GFP_KERNEL);
-	if (ret) {
-		limit->min = default_limit.min;
-		ret = __xa_alloc(xa, id, entry, *limit, GFP_KERNEL);
-		if (ret)
-			return ret;
-	}
-
-	limit->min = *id + 1;
-	if (limit->min > limit->max)
-		limit->min = default_limit.min;
-
-	return ret;
-}
-
 static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 {
 	struct ivpu_device *vdev = file_priv->vdev;
@@ -102,11 +82,9 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 	if (!cmdq)
 		return NULL;
 
-	xa_lock(&vdev->db_xa); /* lock here to protect db_limit */
-	ret = ivpu_id_alloc(&vdev->db_xa, &cmdq->db_id, NULL, &vdev->db_limit,
-			    vdev->default_db_limit);
-	xa_unlock(&vdev->db_xa);
-	if (ret) {
+	ret = xa_alloc_cyclic(&vdev->db_xa, &cmdq->db_id, NULL, vdev->db_limit, &vdev->db_next,
+			      GFP_KERNEL);
+	if (ret < 0) {
 		ivpu_err(vdev, "Failed to allocate doorbell id: %d\n", ret);
 		goto err_free_cmdq;
 	}
@@ -554,9 +532,9 @@ static int ivpu_job_submit(struct ivpu_job *job, u8 priority)
 
 	xa_lock(&vdev->submitted_jobs_xa);
 	is_first_job = xa_empty(&vdev->submitted_jobs_xa);
-	ret = ivpu_id_alloc(&vdev->submitted_jobs_xa, &job->job_id, job, &file_priv->job_limit,
-			    file_priv->default_job_limit);
-	if (ret) {
+	ret = __xa_alloc_cyclic(&vde
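The cyclic behaviour this patch relies on can be illustrated outside the kernel. The following is only a minimal userspace sketch, not the XArray implementation: `sketch_ids`, `sketch_alloc_cyclic()` and `sketch_release()` are hypothetical names, and a fixed boolean table stands in for the XArray. It mirrors the documented return convention of xa_alloc_cyclic() (0 on success, 1 on success after wrapping, negative on failure), which is presumably why the patch changes the error check from `if (ret)` to `if (ret < 0)`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an XArray with a tiny ID space. */
#define SKETCH_MAX_IDS 16

struct sketch_ids {
	bool used[SKETCH_MAX_IDS];
};

/*
 * Hand out the first free ID at or above *next within [min, max].
 * If nothing is free there, wrap once and search [min, *next).
 * Returns 0 on success, 1 on success after wrapping, -1 if the range is
 * full (the kernel API reports -EBUSY in that case).
 */
static int sketch_alloc_cyclic(struct sketch_ids *ids, uint32_t *id,
			       uint32_t min, uint32_t max, uint32_t *next)
{
	uint32_t start = *next > min ? *next : min;
	uint32_t i;

	assert(min <= max && max < SKETCH_MAX_IDS);

	/* First pass: continue where the previous allocation stopped. */
	for (i = start; i <= max; i++) {
		if (!ids->used[i]) {
			ids->used[i] = true;
			*id = i;
			*next = i + 1;
			return 0;
		}
	}

	/* Second pass: wrap around to the bottom of the range. */
	for (i = min; i < start && i <= max; i++) {
		if (!ids->used[i]) {
			ids->used[i] = true;
			*id = i;
			*next = i + 1;
			return 1;
		}
	}

	return -1;
}

/* Free an ID; it is not reused until the cursor wraps back to it. */
static void sketch_release(struct sketch_ids *ids, uint32_t id)
{
	assert(id < SKETCH_MAX_IDS);
	ids->used[id] = false;
}
```

With a range of 1..3, allocating twice, releasing ID 1 and allocating again yields 3 rather than 1, which is the "IDs are not reused all over" property the commit message describes; only the following allocation wraps back to 1 and reports it by returning 1.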
[PATCH 04/11] accel/ivpu: Clear CDTAB entry in case of failure
From: Karol Wachowski

Don't leave a context descriptor in case CFGI_ALL flush fails.
Mark it as invalid (by clearing valid bit) so nothing is left in
partially-initialized state.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_mmu.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_mmu.c b/drivers/accel/ivpu/ivpu_mmu.c
index 4ff0d7a519859..26ef52fbb93e5 100644
--- a/drivers/accel/ivpu/ivpu_mmu.c
+++ b/drivers/accel/ivpu/ivpu_mmu.c
@@ -749,10 +749,17 @@ static int ivpu_mmu_cdtab_entry_set(struct ivpu_device *vdev, u32 ssid, u64 cd_d
 
 	ret = ivpu_mmu_cmdq_write_cfgi_all(vdev);
 	if (ret)
-		goto unlock;
+		goto err_invalidate;
 
 	ret = ivpu_mmu_cmdq_sync(vdev);
+	if (ret)
+		goto err_invalidate;
 unlock:
+	mutex_unlock(&mmu->lock);
+	return 0;
+
+err_invalidate:
+	WRITE_ONCE(entry[0], 0);
 	mutex_unlock(&mmu->lock);
 	return ret;
 }
-- 
2.45.1
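The error-path pattern in this patch (publish the descriptor, flush, and roll the descriptor back if the flush fails) can be sketched in plain userspace C. Everything below is an assumption made for illustration: `sketch_cd`, `SKETCH_CD_VALID`, the descriptor layout and the flush callbacks are hypothetical, and only the idea of zeroing word 0 on failure comes from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: word 0 is assumed to carry the valid bit. */
#define SKETCH_CD_VALID 0x1ull

struct sketch_cd {
	uint64_t word[4];
};

/* Stand-ins for the CFGI_ALL write plus sync; return 0 or a negative error. */
static int sketch_flush_ok(void)
{
	return 0;
}

static int sketch_flush_fail(void)
{
	return -5;
}

/*
 * Fill in the descriptor, then flush. If the flush fails, clear word 0 so
 * the entry is not left valid but only partially propagated.
 */
static int sketch_cd_set(struct sketch_cd *cd, uint64_t table_dma, int (*flush)(void))
{
	int ret;

	assert(cd != NULL && flush != NULL);

	cd->word[1] = table_dma;
	cd->word[0] = SKETCH_CD_VALID;

	ret = flush();
	if (ret)
		goto err_invalidate;

	return 0;

err_invalidate:
	cd->word[0] = 0;	/* mirrors WRITE_ONCE(entry[0], 0) in the patch */
	return ret;
}
```

Keeping the rollback under a single label means every later failure point can reuse it, which is how the patch handles both the CFGI_ALL write and the sync.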
[PATCH 08/11] accel/ivpu: Don't allocate preemption buffers when MIP is disabled
From: Karol Wachowski

Do not allocate preemption buffers when Mid Inference Preemption (MIP)
is disabled through test mode.

Rename IVPU_TEST_MODE_PREEMPTION_DISABLE to IVPU_TEST_MODE_MIP_DISABLE
to better describe that this test mode only disables MIP - job level
preemption will still occur.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_drv.h | 2 +-
 drivers/accel/ivpu/ivpu_job.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 8e79d78906bfe..3fdff3f6cffd8 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -197,7 +197,7 @@ extern bool ivpu_force_snoop;
 #define IVPU_TEST_MODE_NULL_SUBMISSION	  BIT(2)
 #define IVPU_TEST_MODE_D0I3_MSG_DISABLE	  BIT(4)
 #define IVPU_TEST_MODE_D0I3_MSG_ENABLE	  BIT(5)
-#define IVPU_TEST_MODE_PREEMPTION_DISABLE BIT(6)
+#define IVPU_TEST_MODE_MIP_DISABLE	  BIT(6)
 #define IVPU_TEST_MODE_DISABLE_TIMEOUTS	  BIT(8)
 #define IVPU_TEST_MODE_TURBO		  BIT(9)
 extern int ivpu_test_mode;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 82a57a30244d3..39ba6d3d8b0de 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -35,7 +35,8 @@ static int ivpu_preemption_buffers_create(struct ivpu_device *vdev,
 	u64 primary_size = ALIGN(vdev->fw->primary_preempt_buf_size, PAGE_SIZE);
 	u64 secondary_size = ALIGN(vdev->fw->secondary_preempt_buf_size, PAGE_SIZE);
 
-	if (vdev->fw->sched_mode != VPU_SCHEDULING_MODE_HW)
+	if (vdev->fw->sched_mode != VPU_SCHEDULING_MODE_HW ||
+	    ivpu_test_mode & IVPU_TEST_MODE_MIP_DISABLE)
 		return 0;
 
 	cmdq->primary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &vdev->hw->ranges.user,
@@ -347,8 +348,7 @@ static int ivpu_cmdq_push_job(struct ivpu_cmdq *cmdq, struct ivpu_job *job)
 	if (unlikely(ivpu_test_mode & IVPU_TEST_MODE_NULL_SUBMISSION))
 		entry->flags = VPU_JOB_FLAGS_NULL_SUBMISSION_MASK;
 
-	if (vdev->fw->sched_mode == VPU_SCHEDULING_MODE_HW &&
-	    (unlikely(!(ivpu_test_mode & IVPU_TEST_MODE_PREEMPTION_DISABLE)))) {
+	if (vdev->fw->sched_mode == VPU_SCHEDULING_MODE_HW) {
 		if (cmdq->primary_preempt_buf) {
 			entry->primary_preempt_buf_addr = cmdq->primary_preempt_buf->vpu_addr;
 			entry->primary_preempt_buf_size = ivpu_bo_size(cmdq->primary_preempt_buf);
-- 
2.45.1
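The effect of moving the test-mode check from job submission to buffer allocation can be shown with a small predicate. This is a userspace sketch under assumptions: the `sketch_*` names are hypothetical, and only the bit position (6) and the two conditions are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_BIT(n)			(1u << (n))
#define SKETCH_TEST_MODE_MIP_DISABLE	SKETCH_BIT(6)	/* same bit as in the patch */

/* Hypothetical reduction of the firmware state to its scheduling mode. */
enum sketch_sched_mode {
	SKETCH_SCHED_MODE_OS,
	SKETCH_SCHED_MODE_HW,
};

struct sketch_fw {
	enum sketch_sched_mode sched_mode;
};

/*
 * Decide once, at command queue creation, whether preemption buffers are
 * needed: only with hardware scheduling and only when MIP is not disabled.
 */
static bool sketch_needs_preempt_bufs(const struct sketch_fw *fw, unsigned int test_mode)
{
	assert(fw != NULL);

	if (fw->sched_mode != SKETCH_SCHED_MODE_HW ||
	    (test_mode & SKETCH_TEST_MODE_MIP_DISABLE))
		return false;

	return true;
}
```

Because the buffers are simply never created when MIP is disabled, the submission path only has to check whether the buffer pointers are set, which matches the simplified condition in ivpu_cmdq_push_job().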
[PATCH 11/11] accel/ivpu: Move secondary preemption buffer allocation to DMA range
From: Karol Wachowski

Secondary preemption buffer is accessible by NPU's DMA and can be
allocated with addresses above 4 GB. Move secondary preemption buffer
allocation from SHAVE range which is much smaller (2GB) to DMA range.

This allows allocating more command queues with corresponding preemption
buffers without running out of address range.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_job.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 39ba6d3d8b0de..7149312f16e19 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -46,7 +46,7 @@ static int ivpu_preemption_buffers_create(struct ivpu_device *vdev,
 		return -ENOMEM;
 	}
 
-	cmdq->secondary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &vdev->hw->ranges.shave,
+	cmdq->secondary_preempt_buf = ivpu_bo_create(vdev, &file_priv->ctx, &vdev->hw->ranges.dma,
 						     secondary_size, DRM_IVPU_BO_WC);
 	if (!cmdq->secondary_preempt_buf) {
 		ivpu_err(vdev, "Failed to create secondary preemption buffer\n");
-- 
2.45.1
[PATCH 10/11] accel/ivpu: Increase DMA address range
From: Karol Wachowski

Increase DMA address range to:
 * 128 GB on 37xx (due to MMU limitations)
 * 256 GB on other generations

Merge User and DMA ranges on 40xx and above as it is possible to access
whole 256 GBs from both FW and DMA.

Increase User range on 37xx from 255MB to 511MB to allow loading very
large models.

Do not set global_alias_pio_base/size on other generations than 37xx
as it's only used on 37xx anyway.

Signed-off-by: Karol Wachowski
Signed-off-by: Andrzej Kacprowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_fw.c          |  6 ++++--
 drivers/accel/ivpu/ivpu_hw.c          | 10 +++++-----
 drivers/accel/ivpu/ivpu_mmu_context.c |  4 ++--
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index d358cf0b0f972..6037ec0b30968 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -584,8 +584,10 @@ void ivpu_fw_boot_params_setup(struct ivpu_device *vdev, struct vpu_boot_params
 	boot_params->ipc_payload_area_start = ipc_mem_rx->vpu_addr + ivpu_bo_size(ipc_mem_rx) / 2;
 	boot_params->ipc_payload_area_size = ivpu_bo_size(ipc_mem_rx) / 2;
 
-	boot_params->global_aliased_pio_base = vdev->hw->ranges.user.start;
-	boot_params->global_aliased_pio_size = ivpu_hw_range_size(&vdev->hw->ranges.user);
+	if (ivpu_hw_ip_gen(vdev) == IVPU_HW_IP_37XX) {
+		boot_params->global_aliased_pio_base = vdev->hw->ranges.user.start;
+		boot_params->global_aliased_pio_size = ivpu_hw_range_size(&vdev->hw->ranges.user);
+	}
 
 	/* Allow configuration for L2C_PAGE_TABLE with boot param value */
 	boot_params->autoconfig = 1;
diff --git a/drivers/accel/ivpu/ivpu_hw.c b/drivers/accel/ivpu/ivpu_hw.c
index 1c259d7178151..09ada8b500b99 100644
--- a/drivers/accel/ivpu/ivpu_hw.c
+++ b/drivers/accel/ivpu/ivpu_hw.c
@@ -114,14 +114,14 @@ static void memory_ranges_init(struct ivpu_device *vdev)
 {
 	if (ivpu_hw_ip_gen(vdev) == IVPU_HW_IP_37XX) {
 		ivpu_hw_range_init(&vdev->hw->ranges.global, 0x80000000, SZ_512M);
-		ivpu_hw_range_init(&vdev->hw->ranges.user,   0xc0000000, 255 * SZ_1M);
+		ivpu_hw_range_init(&vdev->hw->ranges.user,   0x88000000, 511 * SZ_1M);
 		ivpu_hw_range_init(&vdev->hw->ranges.shave, 0x180000000, SZ_2G);
-		ivpu_hw_range_init(&vdev->hw->ranges.dma,   0x200000000, SZ_8G);
+		ivpu_hw_range_init(&vdev->hw->ranges.dma,   0x200000000, SZ_128G);
 	} else {
 		ivpu_hw_range_init(&vdev->hw->ranges.global, 0x80000000, SZ_512M);
-		ivpu_hw_range_init(&vdev->hw->ranges.user,   0x80000000, SZ_256M);
-		ivpu_hw_range_init(&vdev->hw->ranges.shave,  0x80000000 + SZ_256M, SZ_2G - SZ_256M);
-		ivpu_hw_range_init(&vdev->hw->ranges.dma,   0x200000000, SZ_8G);
+		ivpu_hw_range_init(&vdev->hw->ranges.shave,  0x80000000, SZ_2G);
+		ivpu_hw_range_init(&vdev->hw->ranges.user,  0x100000000, SZ_256G);
+		vdev->hw->ranges.dma = vdev->hw->ranges.user;
 	}
 }
 
diff --git a/drivers/accel/ivpu/ivpu_mmu_context.c b/drivers/accel/ivpu/ivpu_mmu_context.c
index 697b57071d546..891967a95bc3c 100644
--- a/drivers/accel/ivpu/ivpu_mmu_context.c
+++ b/drivers/accel/ivpu/ivpu_mmu_context.c
@@ -571,8 +571,8 @@ void ivpu_mmu_context_init(struct ivpu_device *vdev, struct ivpu_mmu_context *ct
 		start = vdev->hw->ranges.global.start;
 		end = vdev->hw->ranges.shave.end;
 	} else {
-		start = vdev->hw->ranges.user.start;
-		end = vdev->hw->ranges.dma.end;
+		start = min_t(u64, vdev->hw->ranges.user.start, vdev->hw->ranges.shave.start);
+		end = max_t(u64, vdev->hw->ranges.user.end, vdev->hw->ranges.dma.end);
 	}
 
 	drm_mm_init(&ctx->mm, start, end - start);
-- 
2.45.1
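The min/max change in ivpu_mmu_context_init() follows from the new layout: once the user range no longer starts at the bottom of the address space, taking `user.start` as the allocator base would leave the shave range outside the managed span. The sketch below works through that arithmetic in userspace. It rests on assumptions: the `sketch_*` names are hypothetical, the addresses are illustrative values for a merged user/DMA layout, and a range is taken to end at `start + size`.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SZ_2G	(2ull << 30)
#define SKETCH_SZ_256G	(256ull << 30)

/* Hypothetical address range; end is exclusive (start + size). */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

static void sketch_range_init(struct sketch_range *r, uint64_t start, uint64_t size)
{
	assert(size != 0);
	r->start = start;
	r->end = start + size;
}

/*
 * Compute the span a per-context allocator has to cover so that user,
 * shave and DMA allocations all fall inside it.
 */
static void sketch_ctx_span(const struct sketch_range *user,
			    const struct sketch_range *shave,
			    const struct sketch_range *dma,
			    uint64_t *start, uint64_t *end)
{
	*start = user->start < shave->start ? user->start : shave->start;
	*end = user->end > dma->end ? user->end : dma->end;
}
```

With shave at 0x80000000 (2 GB) and a merged user/DMA range at 0x100000000 (256 GB), the span runs from 0x80000000 to 0x4100000000, whereas starting at `user.start` would begin at 0x100000000 and exclude the shave range entirely.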
[PATCH 09/11] accel/ivpu: Add debug Kconfig option
From: Maciej Falkowski

Add CONFIG_DRM_ACCEL_IVPU_DEBUG option that:
 - Adds -DDEBUG that enables printk regardless of the kernel config
 - Enables unsafe module params (that are now disabled by default)

Signed-off-by: Maciej Falkowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/Kconfig    | 10 ++++++++++
 drivers/accel/ivpu/Makefile   |  2 ++
 drivers/accel/ivpu/ivpu_drv.c |  2 ++
 drivers/accel/ivpu/ivpu_fw.c  |  2 ++
 drivers/accel/ivpu/ivpu_pm.c  |  2 ++
 5 files changed, 18 insertions(+)

diff --git a/drivers/accel/ivpu/Kconfig b/drivers/accel/ivpu/Kconfig
index e4d418b44626e..8858b32e05640 100644
--- a/drivers/accel/ivpu/Kconfig
+++ b/drivers/accel/ivpu/Kconfig
@@ -16,3 +16,13 @@ config DRM_ACCEL_IVPU
 	  and Deep Learning applications.
 
 	  If "M" is selected, the module will be called intel_vpu.
+
+config DRM_ACCEL_IVPU_DEBUG
+	bool "Intel NPU debug mode"
+	depends on DRM_ACCEL_IVPU
+	default n
+	help
+	  Choose this option to enable additional
+	  debug features for the Intel NPU driver:
+	  - Always print debug messages regardless of dyndbg config,
+	  - Enable unsafe module params.
diff --git a/drivers/accel/ivpu/Makefile b/drivers/accel/ivpu/Makefile
index e73937c86d9ad..1029e0bab0615 100644
--- a/drivers/accel/ivpu/Makefile
+++ b/drivers/accel/ivpu/Makefile
@@ -24,4 +24,6 @@ intel_vpu-$(CONFIG_DEV_COREDUMP) += ivpu_coredump.o
 
 obj-$(CONFIG_DRM_ACCEL_IVPU) += intel_vpu.o
 
+subdir-ccflags-$(CONFIG_DRM_ACCEL_IVPU_DEBUG) += -DDEBUG
+
 CFLAGS_ivpu_trace_points.o = -I$(src)
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index f5a8d93fe2a57..ca2bf47ce2484 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -43,8 +43,10 @@ module_param_named(dbg_mask, ivpu_dbg_mask, int, 0644);
 MODULE_PARM_DESC(dbg_mask, "Driver debug mask. See IVPU_DBG_* macros.");
 
 int ivpu_test_mode;
+#if IS_ENABLED(CONFIG_DRM_ACCEL_IVPU_DEBUG)
 module_param_named_unsafe(test_mode, ivpu_test_mode, int, 0644);
 MODULE_PARM_DESC(test_mode, "Test mode mask. See IVPU_TEST_MODE_* macros.");
+#endif
 
 u8 ivpu_pll_min_ratio;
 module_param_named(pll_min_ratio, ivpu_pll_min_ratio, byte, 0644);
diff --git a/drivers/accel/ivpu/ivpu_fw.c b/drivers/accel/ivpu/ivpu_fw.c
index be367465e7df4..d358cf0b0f972 100644
--- a/drivers/accel/ivpu/ivpu_fw.c
+++ b/drivers/accel/ivpu/ivpu_fw.c
@@ -46,8 +46,10 @@
 #define IVPU_FOCUS_PRESENT_TIMER_MS 1000
 
 static char *ivpu_firmware;
+#if IS_ENABLED(CONFIG_DRM_ACCEL_IVPU_DEBUG)
 module_param_named_unsafe(firmware, ivpu_firmware, charp, 0644);
 MODULE_PARM_DESC(firmware, "NPU firmware binary in /lib/firmware/..");
+#endif
 
 static struct {
 	int gen;
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index e567df79a6129..dbc0711e28d13 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -24,8 +24,10 @@
 #include "vpu_boot_api.h"
 
 static bool ivpu_disable_recovery;
+#if IS_ENABLED(CONFIG_DRM_ACCEL_IVPU_DEBUG)
 module_param_named_unsafe(disable_recovery, ivpu_disable_recovery, bool, 0644);
 MODULE_PARM_DESC(disable_recovery, "Disables recovery when NPU hang is detected");
+#endif
 
 static unsigned long ivpu_tdr_timeout_ms;
 module_param_named(tdr_timeout_ms, ivpu_tdr_timeout_ms, ulong, 0644);
-- 
2.45.1
[PATCH 07/11] accel/ivpu: Make command queue ID allocated on XArray
From: Karol Wachowski

Use XArray for dynamic command queue ID allocations instead of fixed
ones. This is required by upcoming changes to UAPI that will allow
user space to manage command queues instead of having a predefined
number of queues in a context.

Signed-off-by: Karol Wachowski
Reviewed-by: Jacek Lawrynowicz
Signed-off-by: Jacek Lawrynowicz
---
 drivers/accel/ivpu/ivpu_drv.c | 6 +++
 drivers/accel/ivpu/ivpu_drv.h | 7 ++-
 drivers/accel/ivpu/ivpu_job.c | 91 ++-
 drivers/accel/ivpu/ivpu_job.h | 2 +
 4 files changed, 60 insertions(+), 46 deletions(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 383e3eb988983..f5a8d93fe2a57 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -104,6 +104,8 @@ static void file_priv_release(struct kref *ref)
 	pm_runtime_get_sync(vdev->drm.dev);
 	mutex_lock(&vdev->context_list_lock);
 	file_priv_unbind(vdev, file_priv);
+	drm_WARN_ON(&vdev->drm, !xa_empty(&file_priv->cmdq_xa));
+	xa_destroy(&file_priv->cmdq_xa);
 	mutex_unlock(&vdev->context_list_lock);
 	pm_runtime_put_autosuspend(vdev->drm.dev);
 
@@ -259,6 +261,10 @@ static int ivpu_open(struct drm_device *dev, struct drm_file *file)
 	file_priv->job_limit.min = FIELD_PREP(IVPU_JOB_ID_CONTEXT_MASK, (file_priv->ctx.id - 1));
 	file_priv->job_limit.max = file_priv->job_limit.min | IVPU_JOB_ID_JOB_MASK;
 
+	xa_init_flags(&file_priv->cmdq_xa, XA_FLAGS_ALLOC1);
+	file_priv->cmdq_limit.min = IVPU_CMDQ_MIN_ID;
+	file_priv->cmdq_limit.max = IVPU_CMDQ_MAX_ID;
+
 	mutex_unlock(&vdev->context_list_lock);
 	drm_dev_exit(idx);
 
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index 6774402821706..8e79d78906bfe 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -52,6 +52,9 @@
 #define IVPU_NUM_PRIORITIES    4
 #define IVPU_NUM_CMDQS_PER_CTX (IVPU_NUM_PRIORITIES)
 
+#define IVPU_CMDQ_MIN_ID 1
+#define IVPU_CMDQ_MAX_ID 255
+
 #define IVPU_PLATFORM_SILICON 0
 #define IVPU_PLATFORM_SIMICS  2
 #define IVPU_PLATFORM_FPGA    3
@@ -168,13 +171,15 @@ struct ivpu_file_priv {
 	struct kref ref;
 	struct ivpu_device *vdev;
 	struct mutex lock; /* Protects cmdq */
-	struct ivpu_cmdq *cmdq[IVPU_NUM_CMDQS_PER_CTX];
+	struct xarray cmdq_xa;
 	struct ivpu_mmu_context ctx;
 	struct mutex ms_lock; /* Protects ms_instance_list, ms_info_bo */
 	struct list_head ms_instance_list;
 	struct ivpu_bo *ms_info_bo;
 	struct xa_limit job_limit;
 	u32 job_id_next;
+	struct xa_limit cmdq_limit;
+	u32 cmdq_id_next;
 	bool has_mmu_faults;
 	bool bound;
 	bool aborted;
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 9154c2e14245f..82a57a30244d3 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -89,9 +89,16 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 		goto err_free_cmdq;
 	}
 
+	ret = xa_alloc_cyclic(&file_priv->cmdq_xa, &cmdq->id, cmdq, file_priv->cmdq_limit,
+			      &file_priv->cmdq_id_next, GFP_KERNEL);
+	if (ret < 0) {
+		ivpu_err(vdev, "Failed to allocate command queue id: %d\n", ret);
+		goto err_erase_db_xa;
+	}
+
 	cmdq->mem = ivpu_bo_create_global(vdev, SZ_4K, DRM_IVPU_BO_WC | DRM_IVPU_BO_MAPPABLE);
 	if (!cmdq->mem)
-		goto err_erase_xa;
+		goto err_erase_cmdq_xa;
 
 	ret = ivpu_preemption_buffers_create(vdev, file_priv, cmdq);
 	if (ret)
@@ -99,7 +106,9 @@ static struct ivpu_cmdq *ivpu_cmdq_alloc(struct ivpu_file_priv *file_priv)
 
 	return cmdq;
 
-err_erase_xa:
+err_erase_cmdq_xa:
+	xa_erase(&file_priv->cmdq_xa, cmdq->id);
+err_erase_db_xa:
 	xa_erase(&vdev->db_xa, cmdq->db_id);
 err_free_cmdq:
 	kfree(cmdq);
@@ -123,13 +132,13 @@ static int ivpu_hws_cmdq_init(struct ivpu_file_priv *file_priv, struct ivpu_cmdq
 	struct ivpu_device *vdev = file_priv->vdev;
 	int ret;
 
-	ret = ivpu_jsm_hws_create_cmdq(vdev, file_priv->ctx.id, file_priv->ctx.id, cmdq->db_id,
+	ret = ivpu_jsm_hws_create_cmdq(vdev, file_priv->ctx.id, file_priv->ctx.id, cmdq->id,
 				       task_pid_nr(current), engine,
 				       cmdq->mem->vpu_addr, ivpu_bo_size(cmdq->mem));
 	if (ret)
 		return ret;
 
-	ret = ivpu_jsm_hws_set_context_sched_properties(vdev, file_priv->ctx.id, cmdq->db_id,
+	ret = ivpu_jsm_hws_set_context_sched_properties(vdev, file_priv->ctx.id, cmdq->id,
 							priority);
 	if (ret)
 		return ret;
@@ -143,20 +152,21 @@ static int ivpu_register_db(struct ivpu_file_priv *file_priv, struct