Re: [PATCH v2] drm: Copy drm_wait_vblank to user before returning
On 2021-08-12 9:49 p.m., Mark Yacoub wrote:
> From: Mark Yacoub
>
> [Why]
> Userspace should get back a copy of drm_wait_vblank that's been modified
> even when drm_wait_vblank_ioctl returns a failure.
>
> Rationale:
> drm_wait_vblank_ioctl modifies the request and expects the user to read
> it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
> the sequence to become current_vblank_count + sequence (which was
> RELATIVE), but now it became ABSOLUTE.
> drmWaitVBlank (in libdrm) expects this to be the case as it modifies
> the request to be Absolute so it expects the sequence to have been
> updated.
>
> The change is in compat_drm_wait_vblank, which is called by
> drm_compat_ioctl. This change of copying the data back regardless of the
> return number makes it on par with drm_ioctl, which always copies the
> data before returning.
>
> [How]
> Return from the function after everything has been copied to user.
>
> Fixes: IGT:kms_flip::modeset-vs-vblank-race-interruptible
> Tested on ChromeOS Trogdor(msm)
>
> Signed-off-by: Mark Yacoub
> Change-Id: I98da279a5f1329c66a9d1e06b88d40b247b51313

With the Gerrit Change-Id removed,

Reviewed-by: Michel Dänzer

--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer
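The copy-back behaviour discussed in this message can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct vbl_request`, `wait_vblank_core()` and `wait_vblank_compat()` are hypothetical stand-ins, not the DRM uapi or the kernel functions, and `memcpy()` stands in for the user copy helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the real DRM uapi definitions. */
enum vbl_type { VBL_ABSOLUTE = 0, VBL_RELATIVE = 1 };

struct vbl_request {
	enum vbl_type type;
	uint32_t sequence;
};

/* Models the core handler: rewrites the request in place, may still fail. */
static int wait_vblank_core(struct vbl_request *req, uint32_t current_count,
			    int interrupted)
{
	if (req->type == VBL_RELATIVE) {
		req->sequence += current_count;	/* relative -> absolute */
		req->type = VBL_ABSOLUTE;
	}
	return interrupted ? -EINTR : 0;
}

/*
 * Models the compat wrapper after the fix: the (possibly modified) request
 * is copied back to the caller's buffer before the error code is returned.
 */
static int wait_vblank_compat(struct vbl_request *user_req,
			      uint32_t current_count, int interrupted)
{
	struct vbl_request kreq;
	int err;

	memcpy(&kreq, user_req, sizeof(kreq));	/* stands in for copy_from_user */
	err = wait_vblank_core(&kreq, current_count, interrupted);
	memcpy(user_req, &kreq, sizeof(kreq));	/* copied back even on error */
	return err;
}
```

With the copy-back in place, a caller that retries after an interrupted wait resubmits an ABSOLUTE request with the already-adjusted sequence, so the offset is not applied a second time.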
Re: [Intel-gfx] [PATCH v6 01/15] drm/i915/pxp: Define PXP component interface
On Wed, 28 Jul 2021, Daniele Ceraolo Spurio wrote: > This will be used for communication between the i915 driver and the mei > one. Defining it in a stand-alone patch to avoid circualr dependedencies > between the patches modifying the 2 drivers. > > Split out from an original patch from Huang, Sean Z > > v2: rename the component struct (Rodrigo) > > Signed-off-by: Daniele Ceraolo Spurio > Cc: Rodrigo Vivi > Reviewed-by: Rodrigo Vivi > --- > include/drm/i915_component.h | 1 + > include/drm/i915_pxp_tee_interface.h | 45 > 2 files changed, 46 insertions(+) > create mode 100644 include/drm/i915_pxp_tee_interface.h > > diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h > index 55c3b123581b..c1e2a43d2d1e 100644 > --- a/include/drm/i915_component.h > +++ b/include/drm/i915_component.h > @@ -29,6 +29,7 @@ > enum i915_component_type { > I915_COMPONENT_AUDIO = 1, > I915_COMPONENT_HDCP, > + I915_COMPONENT_PXP > }; > > /* MAX_PORT is the number of port > diff --git a/include/drm/i915_pxp_tee_interface.h > b/include/drm/i915_pxp_tee_interface.h > new file mode 100644 > index ..09b8389152af > --- /dev/null > +++ b/include/drm/i915_pxp_tee_interface.h > @@ -0,0 +1,45 @@ > +/* SPDX-License-Identifier: MIT */ > +/* > + * Copyright © 2020 Intel Corporation > + * > + * Authors: > + * Vitaly Lubart IMO we should avoid adding new authors lists in code comments. Git log provides accurate and up-to-date information, and I don't want patches where people add their names to authors lists. BR, Jani. > + */ > + > +#ifndef _I915_PXP_TEE_INTERFACE_H_ > +#define _I915_PXP_TEE_INTERFACE_H_ > + > +#include > +#include > + > +/** > + * struct i915_pxp_component_ops - ops for PXP services. 
> + * @owner: Module providing the ops > + * @send: sends data to PXP > + * @receive: receives data from PXP > + */ > +struct i915_pxp_component_ops { > + /** > + * @owner: owner of the module provding the ops > + */ > + struct module *owner; > + > + int (*send)(struct device *dev, const void *message, size_t size); > + int (*recv)(struct device *dev, void *buffer, size_t size); > +}; > + > +/** > + * struct i915_pxp_component - Used for communication between i915 and TEE > + * drivers for the PXP services > + * @tee_dev: device that provide the PXP service from TEE Bus. > + * @pxp_ops: Ops implemented by TEE driver, used by i915 driver. > + */ > +struct i915_pxp_component { > + struct device *tee_dev; > + const struct i915_pxp_component_ops *ops; > + > + /* To protect the above members. */ > + struct mutex mutex; > +}; > + > +#endif /* _I915_TEE_PXP_INTERFACE_H_ */ -- Jani Nikula, Intel Open Source Graphics Center
Re: [PATCH 10/64] lib80211: Use struct_group() for memcpy() region
On Tue, 2021-07-27 at 13:58 -0700, Kees Cook wrote: > > +++ b/include/linux/ieee80211.h > @@ -297,9 +297,11 @@ static inline u16 ieee80211_sn_sub(u16 sn1, u16 sn2) > struct ieee80211_hdr { > __le16 frame_control; > __le16 duration_id; > - u8 addr1[ETH_ALEN]; > - u8 addr2[ETH_ALEN]; > - u8 addr3[ETH_ALEN]; > + struct_group(addrs, > + u8 addr1[ETH_ALEN]; > + u8 addr2[ETH_ALEN]; > + u8 addr3[ETH_ALEN]; > + ); > __le16 seq_ctrl; > u8 addr4[ETH_ALEN]; > } __packed __aligned(2); This file isn't really just lib80211, it's also used by everyone else for 802.11, but I guess that's OK - after all, this doesn't really result in any changes here. > +++ b/net/wireless/lib80211_crypt_ccmp.c > @@ -136,7 +136,8 @@ static int ccmp_init_iv_and_aad(const struct > ieee80211_hdr *hdr, > pos = (u8 *) hdr; > aad[0] = pos[0] & 0x8f; > aad[1] = pos[1] & 0xc7; > - memcpy(aad + 2, hdr->addr1, 3 * ETH_ALEN); > + BUILD_BUG_ON(sizeof(hdr->addrs) != 3 * ETH_ALEN); > + memcpy(aad + 2, &hdr->addrs, ETH_ALEN); However, how is it you don't need the same change in net/mac80211/wpa.c? We have three similar instances: /* AAD (extra authenticate-only data) / masked 802.11 header * FC | A1 | A2 | A3 | SC | [A4] | [QC] */ put_unaligned_be16(len_a, &aad[0]); put_unaligned(mask_fc, (__le16 *)&aad[2]); memcpy(&aad[4], &hdr->addr1, 3 * ETH_ALEN); and memcpy(&aad[4], &hdr->addr1, 3 * ETH_ALEN); and memcpy(aad + 2, &hdr->addr1, 3 * ETH_ALEN); so those should also be changed, it seems? In which case I'd probably prefer to do this separately from the staging drivers ... johannes
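For readers unfamiliar with the grouping idea under discussion, here is a minimal userspace sketch. `STRUCT_GROUP`, `struct hdr_sketch` and `copy_addrs()` are hypothetical names made up for illustration; the macro is a simplified stand-in and is not claimed to match the kernel's struct_group() definition. It relies on C11 anonymous structs and unions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/*
 * Simplified stand-in for a member-grouping macro: an anonymous union of
 * an anonymous struct (members stay addressable by name) and a named
 * struct (the whole region is addressable as one object).
 */
#define STRUCT_GROUP(name, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } name; \
	}

struct hdr_sketch {
	uint16_t frame_control;
	uint16_t duration_id;
	STRUCT_GROUP(addrs,
		uint8_t addr1[ETH_ALEN];
		uint8_t addr2[ETH_ALEN];
		uint8_t addr3[ETH_ALEN];
	);
	uint16_t seq_ctrl;
};

/* Copy all three addresses in one call, sized by the group itself. */
static void copy_addrs(uint8_t *dst, const struct hdr_sketch *hdr)
{
	_Static_assert(sizeof(hdr->addrs) == 3 * ETH_ALEN,
		       "addrs must span exactly three addresses");
	memcpy(dst, &hdr->addrs, sizeof(hdr->addrs));
}
```

Because both source and length refer to the same named object, a bounds-checking memcpy() can see that the copy stays inside one member, which is the point of grouping the fields.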
Re: [PATCH 39/64] mac80211: Use memset_after() to clear tx status
On Sat, 2021-07-31 at 08:55 -0700, Kees Cook wrote:
> On Tue, Jul 27, 2021 at 01:58:30PM -0700, Kees Cook wrote:
> > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > field bounds checking for memset(), avoid intentionally writing across
> > neighboring fields.
> >
> > Use memset_after() so memset() doesn't get confused about writing
> > beyond the destination member that is intended to be the starting point
> > of zeroing through the end of the struct.
> >
> > Note that the common helper, ieee80211_tx_info_clear_status(), does NOT
> > clear ack_signal, but the open-coded versions do. All three perform
> > checks that the ack_signal position hasn't changed, though.
>
> Quick ping on this question: there is a mismatch between the common
> helper and the other places that do this. Is there a bug here?

Yes.

The common helper should also clear ack_signal, but that was broken by
commit e3e1a0bcb3f1 ("mac80211: reduce IEEE80211_TX_MAX_RATES"), because
that commit changed the order of the fields and updated carl9170 and p54
properly but not the common helper...

It doesn't actually matter much because ack_signal is normally filled in
afterwards, and even if it isn't, it's just for statistics.

The correct thing to do here would be to

	memset_after(&info->status, 0, rates);

johannes
Re: [PATCH 39/64] mac80211: Use memset_after() to clear tx status
On Sat, 2021-07-31 at 08:55 -0700, Kees Cook wrote:
>
> > @@ -278,9 +278,7 @@ static void carl9170_tx_release(struct kref *ref)
> >  	BUILD_BUG_ON(
> >  	    offsetof(struct ieee80211_tx_info, status.ack_signal) != 20);
> >
> > -	memset(&txinfo->status.ack_signal, 0,
> > -	       sizeof(struct ieee80211_tx_info) -
> > -	       offsetof(struct ieee80211_tx_info, status.ack_signal));
> > +	memset_after(&txinfo->status, 0, rates);

FWIW, I think we should also remove the BUILD_BUG_ON() now in all the
places - that was meant to give people a hint to update if some field
ordering etc. changed, but now that it's "after rates" this is no longer
necessary.

johannes
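To illustrate why "everything after rates" removes the need for a hard-coded offset check, here is a userspace sketch of the idea. `OFFSETOF_END`, `MEMSET_AFTER`, `struct status_sketch` and `clear_status()` are hypothetical names; the macro is a simplified stand-in, not the kernel's memset_after() definition, and it assumes a compiler that supports `__typeof__` (GCC or Clang).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first byte after MEMBER within TYPE. */
#define OFFSETOF_END(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Fill everything in *obj that follows MEMBER with the byte value v. */
#define MEMSET_AFTER(obj, v, member) \
	memset((unsigned char *)(obj) + \
		       OFFSETOF_END(__typeof__(*(obj)), member), \
	       (v), \
	       sizeof(*(obj)) - OFFSETOF_END(__typeof__(*(obj)), member))

/* Hypothetical layout loosely shaped like a TX status block. */
struct status_sketch {
	unsigned char rates[4];
	int ack_signal;
	unsigned int ampdu_len;
};

/* Clears every field placed after rates, whatever those fields are. */
static void clear_status(struct status_sketch *st)
{
	MEMSET_AFTER(st, 0, rates);
}
```

The start of the cleared region is derived from the named member, so reordering or adding fields after `rates` cannot silently leave one of them uncleared.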
Re: [PATCH 2/2] drm/i915: Add pci ids and uapi for DG1
Op 12-08-2021 om 23:10 schreef Jason Ekstrand: > On Thu, Aug 12, 2021 at 9:49 AM Daniel Vetter wrote: >> On Thu, Aug 12, 2021 at 2:44 PM Maarten Lankhorst >> wrote: >>> DG1 has support for local memory, which requires the usage of the >>> lmem placement extension for creating bo's, and memregion queries >>> to obtain the size. Because of this, those parts of the uapi are >>> no longer guarded behind FAKE_LMEM. >>> >>> According to the pull request referenced below, mesa should be mostly >>> ready for DG1. VK_EXT_memory_budget is not hooked up yet, but we >>> should definitely just enable the uapi parts by default. >>> >>> Signed-off-by: Maarten Lankhorst >>> References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584 >>> Cc: Jordan Justen jordan.l.jus...@intel.com >>> Cc: Jason Ekstrand ja...@jlekstrand.net >> Acked-by: Daniel Vetter > Acked-by: Jason Ekstrand > >>> --- >>> drivers/gpu/drm/i915/gem/i915_gem_create.c | 3 --- >>> drivers/gpu/drm/i915/i915_pci.c| 1 + >>> drivers/gpu/drm/i915/i915_query.c | 3 --- >>> 3 files changed, 1 insertion(+), 6 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c >>> b/drivers/gpu/drm/i915/gem/i915_gem_create.c >>> index 23fee13a3384..1d341b8c47c0 100644 >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c >>> @@ -347,9 +347,6 @@ static int ext_set_placements(struct >>> i915_user_extension __user *base, >>> { >>> struct drm_i915_gem_create_ext_memory_regions ext; >>> >>> - if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)) >>> - return -ENODEV; >>> - >>> if (copy_from_user(&ext, base, sizeof(ext))) >>> return -EFAULT; >>> >>> diff --git a/drivers/gpu/drm/i915/i915_pci.c >>> b/drivers/gpu/drm/i915/i915_pci.c >>> index 1bbd09ad5287..93ccdc6bbd03 100644 >>> --- a/drivers/gpu/drm/i915/i915_pci.c >>> +++ b/drivers/gpu/drm/i915/i915_pci.c >>> @@ -1115,6 +1115,7 @@ static const struct pci_device_id pciidlist[] = { >>> 
INTEL_RKL_IDS(&rkl_info), >>> INTEL_ADLS_IDS(&adl_s_info), >>> INTEL_ADLP_IDS(&adl_p_info), >>> + INTEL_DG1_IDS(&dg1_info), >>> {0, 0, 0} >>> }; >>> MODULE_DEVICE_TABLE(pci, pciidlist); >>> diff --git a/drivers/gpu/drm/i915/i915_query.c >>> b/drivers/gpu/drm/i915/i915_query.c >>> index e49da36c62fb..5e2b909827f4 100644 >>> --- a/drivers/gpu/drm/i915/i915_query.c >>> +++ b/drivers/gpu/drm/i915/i915_query.c >>> @@ -432,9 +432,6 @@ static int query_memregion_info(struct drm_i915_private >>> *i915, >>> u32 total_length; >>> int ret, id, i; >>> >>> - if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM)) >>> - return -ENODEV; >>> - >>> if (query_item->flags != 0) >>> return -EINVAL; >>> >>> -- >>> 2.32.0 >>> >> >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> http://blog.ffwll.ch Pushed this patch and did the revert from previous patch in drm-intel/topic/core-for-ci, enjoy!
Re: [PATCH v1] fbtft: fb_st7789v: added reset on init_display()
On Fri, Aug 13, 2021 at 08:25:10AM +0200, Oliver Graute wrote:
> staging: fbtft: fb_st7789v: reset display before initialization

What is this line here, and why is this not your subject line instead?

>
> In rare cases the display is flipped or mirrored. This was observed more
> often in a low temperature environment. A clean reset on init_display()
> should help to get registers in a sane state.
>
> Signed-off-by: Oliver Graute

What commit does this fix?

thanks,

greg k-h
[PATCH] drm: radeon: r600_dma: Replace cpu_to_le32() by lower_32_bits()
This patch fixes the following sparse errors:

drivers/gpu/drm/radeon/r600_dma.c:247:30: warning: incorrect type in assignment (different base types)
drivers/gpu/drm/radeon/r600_dma.c:247:30:    expected unsigned int volatile [usertype]
drivers/gpu/drm/radeon/r600_dma.c:247:30:    got restricted __le32 [usertype]

Signed-off-by: zhaoxiao
---
 drivers/gpu/drm/radeon/r600_dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c
index fb65e6fb5c4f..a2d0b1edcd22 100644
--- a/drivers/gpu/drm/radeon/r600_dma.c
+++ b/drivers/gpu/drm/radeon/r600_dma.c
@@ -244,7 +244,7 @@ int r600_dma_ring_test(struct radeon_device *rdev,
 	gpu_addr = rdev->wb.gpu_addr + index;

 	tmp = 0xCAFEDEAD;
-	rdev->wb.wb[index/4] = cpu_to_le32(tmp);
+	rdev->wb.wb[index/4] = lower_32_bits(tmp);

 	r = radeon_ring_lock(rdev, ring, 4);
 	if (r) {
--
2.20.1
[PATCH 1/2] drm: avoid races with modesetting rights
In drm_client_modeset.c and drm_fb_helper.c, drm_master_internal_{acquire,release} are used to avoid races with DRM userspace. These functions hold onto drm_device.master_mutex while committing, and bail if there's already a master. However, ioctls can still race between themselves. A time-of-check-to-time-of-use error can occur if an ioctl that changes the modeset has its rights revoked after it validates its permissions, but before it completes. There are three ioctls that can affect modesetting permissions: - DROP_MASTER ioctl removes rights for a master and its leases - REVOKE_LEASE ioctl revokes rights for a specific lease - SET_MASTER ioctl sets the device master if the master role hasn't been acquired yet All these races can be avoided by introducing an SRCU that acts as a barrier for ioctls that can change modesetting permissions. Processes that perform modesetting should hold a read lock on the new drm_device.master_barrier_srcu, and ioctls that change these permissions should call synchronize_srcu before returning. This ensures that any process that might have seen old permissions are flushed out before DROP_MASTER/REVOKE_LEASE/SET_MASTER ioctls return to userspace. 
Reported-by: Daniel Vetter Signed-off-by: Desmond Cheong Zhi Xi --- drivers/gpu/drm/drm_auth.c | 17 ++--- drivers/gpu/drm/drm_client_modeset.c | 10 ++ drivers/gpu/drm/drm_drv.c| 2 ++ drivers/gpu/drm/drm_fb_helper.c | 20 drivers/gpu/drm/drm_internal.h | 5 +++-- drivers/gpu/drm/drm_ioctl.c | 25 + include/drm/drm_device.h | 11 +++ include/drm/drm_ioctl.h | 7 +++ 8 files changed, 76 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c index 60a6b21474b1..004506608e76 100644 --- a/drivers/gpu/drm/drm_auth.c +++ b/drivers/gpu/drm/drm_auth.c @@ -29,6 +29,7 @@ */ #include +#include #include #include @@ -448,21 +449,31 @@ void drm_master_put(struct drm_master **master) EXPORT_SYMBOL(drm_master_put); /* Used by drm_client and drm_fb_helper */ -bool drm_master_internal_acquire(struct drm_device *dev) +bool drm_master_internal_acquire(struct drm_device *dev, int *idx) { + *idx = srcu_read_lock(&dev->master_barrier_srcu); + mutex_lock(&dev->master_mutex); if (dev->master) { mutex_unlock(&dev->master_mutex); + srcu_read_unlock(&dev->master_barrier_srcu, *idx); return false; } + mutex_unlock(&dev->master_mutex); return true; } EXPORT_SYMBOL(drm_master_internal_acquire); /* Used by drm_client and drm_fb_helper */ -void drm_master_internal_release(struct drm_device *dev) +void drm_master_internal_release(struct drm_device *dev, int idx) { - mutex_unlock(&dev->master_mutex); + srcu_read_unlock(&dev->master_barrier_srcu, idx); } EXPORT_SYMBOL(drm_master_internal_release); + +/* Used by drm_ioctl */ +void drm_master_flush(struct drm_device *dev) +{ + synchronize_srcu(&dev->master_barrier_srcu); +} diff --git a/drivers/gpu/drm/drm_client_modeset.c b/drivers/gpu/drm/drm_client_modeset.c index ced09c7c06f9..9885f36f71b7 100644 --- a/drivers/gpu/drm/drm_client_modeset.c +++ b/drivers/gpu/drm/drm_client_modeset.c @@ -1165,13 +1165,14 @@ int drm_client_modeset_commit(struct drm_client_dev *client) { struct drm_device *dev = client->dev; int 
ret; + int idx; - if (!drm_master_internal_acquire(dev)) + if (!drm_master_internal_acquire(dev, &idx)) return -EBUSY; ret = drm_client_modeset_commit_locked(client); - drm_master_internal_release(dev); + drm_master_internal_release(dev, idx); return ret; } @@ -1215,8 +1216,9 @@ int drm_client_modeset_dpms(struct drm_client_dev *client, int mode) { struct drm_device *dev = client->dev; int ret = 0; + int idx; - if (!drm_master_internal_acquire(dev)) + if (!drm_master_internal_acquire(dev, &idx)) return -EBUSY; mutex_lock(&client->modeset_mutex); @@ -1226,7 +1228,7 @@ int drm_client_modeset_dpms(struct drm_client_dev *client, int mode) drm_client_modeset_dpms_legacy(client, mode); mutex_unlock(&client->modeset_mutex); - drm_master_internal_release(dev); + drm_master_internal_release(dev, idx); return ret; } diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7a5097467ba5..c313f0674db3 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -574,6 +574,7 @@ static void drm_dev_init_release(struct drm_device *dev, void *res) mutex_destroy(&dev->clientlist_mutex); mutex_destroy(&dev->filelist_mutex); mutex_destroy(&dev->struct_mutex); + cleanup_srcu_struct(&dev->master_barrier_srcu); drm_legacy_destroy_membe
[PATCH 2/2] drm: unexport drm_ioctl_permit
Since the last user of drm_ioctl_permit was removed, and it's now only used in drm_ioctl.c, unexport the symbol. Reported-by: Daniel Vetter Signed-off-by: Desmond Cheong Zhi Xi --- drivers/gpu/drm/drm_ioctl.c | 15 +-- include/drm/drm_ioctl.h | 1 - 2 files changed, 1 insertion(+), 15 deletions(-) diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index eb4ec3fab7d1..fe271f6f96ab 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -522,19 +522,7 @@ int drm_version(struct drm_device *dev, void *data, return err; } -/** - * drm_ioctl_permit - Check ioctl permissions against caller - * - * @flags: ioctl permission flags. - * @file_priv: Pointer to struct drm_file identifying the caller. - * - * Checks whether the caller is allowed to run an ioctl with the - * indicated permissions. - * - * Returns: - * Zero if allowed, -EACCES otherwise. - */ -int drm_ioctl_permit(u32 flags, struct drm_file *file_priv) +static int drm_ioctl_permit(u32 flags, struct drm_file *file_priv) { /* ROOT_ONLY is only for CAP_SYS_ADMIN */ if (unlikely((flags & DRM_ROOT_ONLY) && !capable(CAP_SYS_ADMIN))) @@ -557,7 +545,6 @@ int drm_ioctl_permit(u32 flags, struct drm_file *file_priv) return 0; } -EXPORT_SYMBOL(drm_ioctl_permit); #define DRM_IOCTL_DEF(ioctl, _func, _flags)\ [DRM_IOCTL_NR(ioctl)] = { \ diff --git a/include/drm/drm_ioctl.h b/include/drm/drm_ioctl.h index 13a68cdcea36..fd29842127e5 100644 --- a/include/drm/drm_ioctl.h +++ b/include/drm/drm_ioctl.h @@ -174,7 +174,6 @@ struct drm_ioctl_desc { .name = #ioctl \ } -int drm_ioctl_permit(u32 flags, struct drm_file *file_priv); long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); long drm_ioctl_kernel(struct file *, drm_ioctl_t, void *, u32); #ifdef CONFIG_COMPAT -- 2.25.1
[PATCH 0/2] drm: update the ioctl handler
Hi,

Finally got around to it. This patchset implements some updates to the
drm ioctl handler that were first raised by Daniel Vetter in [1].
Namely:

- Flush concurrent processes that can change the modeset when DRM
  masters are set/dropped or a lease is revoked
- Unexport drm_ioctl_permit()

Thoughts and comments would be very appreciated.

Link: https://lore.kernel.org/lkml/YN9kAFcfGoB13x7f@phenom.ffwll.local/ [1]

Best wishes,
Desmond

Desmond Cheong Zhi Xi (2):
  drm: avoid races with modesetting rights
  drm: unexport drm_ioctl_permit

 drivers/gpu/drm/drm_auth.c           | 17 +---
 drivers/gpu/drm/drm_client_modeset.c | 10 ---
 drivers/gpu/drm/drm_drv.c            |  2 ++
 drivers/gpu/drm/drm_fb_helper.c      | 20 --
 drivers/gpu/drm/drm_internal.h       |  5 ++--
 drivers/gpu/drm/drm_ioctl.c          | 40 +++-
 include/drm/drm_device.h             | 11
 include/drm/drm_ioctl.h              |  8 +-
 8 files changed, 77 insertions(+), 36 deletions(-)

--
2.25.1
[PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
From: Michel Dänzer schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when GFXOFF transitions from enabled to disabled. This makes sure the delayed work will be scheduled as intended in the reverse case. In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs to use mutex_trylock instead of mutex_lock. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h| 3 +++ 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f3fd5ec710b6..8b025f70706c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) struct amdgpu_device *adev = container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); - mutex_lock(&adev->gfx.gfx_off_mutex); + /* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. */ + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true +* when adev->gfx.gfx_off_req_count is already 0, we might race with that. +* Re-schedule to make sure gfx off will be re-enabled in the HW eventually. 
+*/ + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + return; + } + if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true)) adev->gfx.gfx_off_state = true; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index a0be0772c8b3..da4c46db3093 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -28,9 +28,6 @@ #include "amdgpu_rlc.h" #include "amdgpu_ras.h" -/* delay 0.1 second to enable gfx off feature */ -#define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100) - /* * GPU GFX IP block helpers function. */ @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) adev->gfx.gfx_off_req_count--; if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { - schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE); - } else if (!enable && adev->gfx.gfx_off_state) { - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + } else if (!enable) { + if (adev->gfx.gfx_off_req_count == 1 && !adev->gfx.gfx_off_state) + cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); + + if (adev->gfx.gfx_off_state && + !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { adev->gfx.gfx_off_state = false; if (adev->gfx.funcs->init_spm_golden) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h index d43fe2ed8116..dcdb505bb7f4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h @@ -32,6 +32,9 @@ #include "amdgpu_rlc.h" #include "soc15.h" +/* delay 0.1 second to enable gfx off feature */ +#define AMDGPU_GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100) + /* GFX current status */ #define AMDGPU_GFX_NORMAL_MODE 0xL 
#define AMDGPU_GFX_SAFE_MODE 0x0001L -- 2.32.0
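The "try the lock, otherwise back off and re-queue" pattern that the commit message describes for the delayed worker can be sketched in userspace as follows. Everything here is an assumption for illustration: `struct gfx_sketch` and the helper names are hypothetical, a C11 `atomic_flag` stands in for the kernel mutex, and a counter stands in for re-queuing the delayed work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state; the atomic_flag is a stand-in for the real mutex. */
struct gfx_sketch {
	atomic_flag lock;
	bool off_state;		/* true once the feature has been enabled */
	int req_count;		/* outstanding "keep it disabled" requests */
	int reschedules;	/* how often the worker had to back off */
};

static bool sketch_trylock(struct gfx_sketch *g)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&g->lock);
}

static void sketch_lock(struct gfx_sketch *g)
{
	while (!sketch_trylock(g))
		;	/* spin; good enough for a sketch */
}

static void sketch_unlock(struct gfx_sketch *g)
{
	atomic_flag_clear(&g->lock);
}

/* Returns true if the worker ran its body, false if it backed off. */
static bool delayed_enable_worker(struct gfx_sketch *g)
{
	if (!sketch_trylock(g)) {
		/*
		 * The lock holder may be waiting for this worker to finish,
		 * so blocking here could deadlock. Back off and retry later.
		 */
		g->reschedules++;	/* stand-in for re-queuing the work */
		return false;
	}
	if (!g->off_state && g->req_count == 0)
		g->off_state = true;
	sketch_unlock(g);
	return true;
}
```

The worker never blocks on the lock, so a caller that holds the lock while waiting for the worker to finish cannot end up waiting on itself.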
Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks
On 2021-08-13 6:23 a.m., Lazar, Lijo wrote: > > > On 8/12/2021 10:24 PM, Michel Dänzer wrote: >> On 2021-08-12 1:33 p.m., Lazar, Lijo wrote: >>> On 8/12/2021 1:41 PM, Michel Dänzer wrote: On 2021-08-12 7:55 a.m., Koenig, Christian wrote: > Hi James, > > Evan seems to have understood how this all works together. > > See while any begin/end use critical section is active the work should > not be active. > > When you handle only one ring you can just call cancel in begin use and > schedule in end use. But when you have more than one ring you need a lock > or counter to prevent concurrent work items to be started. > > Michelle's idea to use mod_delayed_work is a bad one because it assumes > that the delayed work is still running. It merely assumes that the work may already have been scheduled before. Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I think it can still have some effect when there's a single work item for multiple rings, as described by James, it's probably negligible, since presumably the time intervals between ring_begin_use and ring_end_use are normally much shorter than a second. So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with dropping it. > Something similar applies to the first patch I think, There are no cancel work calls in that case, so the commit log is accurate TTBOMK. >>> >>> Curious - >>> >>> For patch 1, does it make a difference if any delayed work scheduled is >>> cancelled in the else part before proceeding? >>> >>> } else if (!enable && adev->gfx.gfx_off_state) { >>> cancel_delayed_work(); >> >> I tried the patch below. >> >> While this does seem to fix the problem as well, I see a potential issue: >> >> 1. amdgpu_gfx_off_ctrl locks adev->gfx.gfx_off_mutex >> 2. amdgpu_device_delay_enable_gfx_off runs, blocks in mutex_lock >> 3. 
amdgpu_gfx_off_ctrl calls cancel_delayed_work_sync >> >> I'm afraid this would deadlock? (CONFIG_PROVE_LOCKING doesn't complain >> though) > > Should use the cancel_delayed_work instead of the _sync version. The thing is, it's not clear to me from cancel_delayed_work's description that it's guaranteed not to wait for amdgpu_device_delay_enable_gfx_off to finish if it's already running. If that's not guaranteed, it's prone to the same deadlock. > As you mentioned - at best work is not scheduled yet and cancelled > successfully, or at worst it's waiting for the mutex. In the worst case, if > amdgpu_device_delay_enable_gfx_off gets the mutex after amdgpu_gfx_off_ctrl > unlocks it, there is an extra check as below. > > if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) > > The count wouldn't be 0 and hence it won't enable GFXOFF. I'm not sure, but it might also be possible for amdgpu_device_delay_enable_gfx_off to get the mutex only after amdgpu_gfx_off_ctrl was called again and set adev->gfx.gfx_off_req_count back to 0. >> Maybe it's possible to fix it with cancel_delayed_work_sync somehow, but I'm >> not sure how offhand. (With cancel_delayed_work instead, I'm worried >> amdgpu_device_delay_enable_gfx_off might still enable GFXOFF in the HW >> immediately after amdgpu_gfx_off_ctrl unlocks the mutex. Then again, that >> might happen with mod_delayed_work as well...) > > As mentioned earlier, cancel_delayed_work won't cause this issue. > > In the mod_delayed_ patch, mod_ version is called only when req_count is 0. > While that is a good thing, it keeps alive one more contender for the mutex. Not sure what you mean. It leaves the possibility of amdgpu_device_delay_enable_gfx_off running just after amdgpu_gfx_off_ctrl tried to postpone it. As discussed above, something similar might be possible with cancel_delayed_work as well. 
> The cancel_ version eliminates that contender if happens to be called at the > right time (more likely if there are multiple requests to disable gfxoff). On > the other hand, don't know how costly it is to call cancel_ every time on the > else part (or maybe call only once when count increments to 1?). Sure, why not, though I doubt it matters much — I expect adev->gfx.gfx_off_req_count transitioning between 0 <-> 1 to be the most common case by far. I sent out a v2 patch which should address all these issues. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and X developer
Re: [PATCH v18 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver
On Mon, Jun 07, 2021 at 01:40:06AM +0300, Dmitry Osipenko wrote:
> 01.06.2021 07:21, Dmitry Osipenko wrote:
> > This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
> > which is done using interconnect framework. It fixes display corruption that
> > happens due to insufficient memory bandwidth.
> >
> > Changelog:
> >
> > v18: - Moved total peak bandwidth from CRTC state to plane state and removed
> >        dummy plane bandwidth state initialization from T186+ plane hub. This
> >        was suggested by Thierry Reding to v17.
> >
> >      - I haven't done anything about the cursor's plane bandwidth which
> >        doesn't contribute to overlapping bandwidths for a small sized
> >        window because it works okay as-is.
>
> Thierry, will you take these patches for 5.14?

As discussed offline, I've picked these up for v5.15 with a small patch
squashed in to unbreak the Tegra186 and later support.

Thanks,
Thierry
[PATCH 0/4] drm/dp: add some defines and prep for DP 2.0
I'll probably want to merge these to drm-intel-next and drm-misc-next
via a topic branch.

Jani Nikula (4):
  drm/dp: add DP 2.0 UHBR link rate and bw code conversions
  drm/dp: use more of the extended receiver cap
  drm/dp: add LTTPR DP 2.0 DPCD addresses
  drm/dp: add helper for extracting adjust 128b/132b TX FFE preset

 drivers/gpu/drm/drm_dp_helper.c | 42 +
 include/drm/drm_dp_helper.h     |  6 +
 2 files changed, 43 insertions(+), 5 deletions(-)

--
2.20.1
[PATCH 1/4] drm/dp: add DP 2.0 UHBR link rate and bw code conversions
The bw code equals link_rate / 0.27 Gbps only for 8b/10b link rates.
Handle DP 2.0 UHBR rates as special cases, though this is not pretty.

Cc: Manasi Navare
Signed-off-by: Jani Nikula
---
 drivers/gpu/drm/drm_dp_helper.c | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 6d0f2c447f3b..9b2a2961fca8 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -207,15 +207,33 @@ EXPORT_SYMBOL(drm_dp_lttpr_link_train_channel_eq_delay);

 u8 drm_dp_link_rate_to_bw_code(int link_rate)
 {
-	/* Spec says link_bw = link_rate / 0.27Gbps */
-	return link_rate / 27000;
+	switch (link_rate) {
+	case 1000000:
+		return DP_LINK_BW_10;
+	case 1350000:
+		return DP_LINK_BW_13_5;
+	case 2000000:
+		return DP_LINK_BW_20;
+	default:
+		/* Spec says link_bw = link_rate / 0.27Gbps */
+		return link_rate / 27000;
+	}
 }
 EXPORT_SYMBOL(drm_dp_link_rate_to_bw_code);

 int drm_dp_bw_code_to_link_rate(u8 link_bw)
 {
-	/* Spec says link_rate = link_bw * 0.27Gbps */
-	return link_bw * 27000;
+	switch (link_bw) {
+	case DP_LINK_BW_10:
+		return 1000000;
+	case DP_LINK_BW_13_5:
+		return 1350000;
+	case DP_LINK_BW_20:
+		return 2000000;
+	default:
+		/* Spec says link_rate = link_bw * 0.27Gbps */
+		return link_bw * 27000;
+	}
 }
 EXPORT_SYMBOL(drm_dp_bw_code_to_link_rate);

--
2.20.1
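A short worked example of the arithmetic behind the special cases, as a sketch. `bw_code_8b10b()` is a hypothetical helper, and the unit of `link_rate` (10 kb/s per count) is inferred from the patch's own comment that 0.27 Gbps corresponds to 27000.

```c
#include <assert.h>

/*
 * Inferred from the patch: link_rate counts units of 10 kb/s, so
 * 0.27 Gb/s corresponds to 27000 and 10 Gb/s to 1000000.
 */
static unsigned int bw_code_8b10b(int link_rate)
{
	return (unsigned int)(link_rate / 27000);
}
```

For the 8b/10b rates the division is exact: 162000 gives 0x06, 270000 gives 0x0a, 540000 gives 0x14 and 810000 gives 0x1e. For a 10 Gb/s UHBR rate it would give 1000000 / 27000 = 37 with a remainder, which is not an exact multiple and is why the patch maps the UHBR rates explicitly instead.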
[PATCH 2/4] drm/dp: use more of the extended receiver cap
Extend the use of extended receiver cap at 0x2200 to cover MAIN_LINK_CHANNEL_CODING_CAP in 0x2206, in case an implementation hides the DP 2.0 128b/132b channel encoding cap. Cc: Manasi Navare Signed-off-by: Jani Nikula --- drivers/gpu/drm/drm_dp_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 9b2a2961fca8..9389f92cb944 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -608,7 +608,7 @@ static u8 drm_dp_downstream_port_count(const u8 dpcd[DP_RECEIVER_CAP_SIZE]) static int drm_dp_read_extended_dpcd_caps(struct drm_dp_aux *aux, u8 dpcd[DP_RECEIVER_CAP_SIZE]) { - u8 dpcd_ext[6]; + u8 dpcd_ext[DP_MAIN_LINK_CHANNEL_CODING + 1]; int ret; /* -- 2.20.1
[PATCH 3/4] drm/dp: add LTTPR DP 2.0 DPCD addresses
DP 2.0 brings some new DPCD addresses for PHY repeaters. Cc: Manasi Navare Signed-off-by: Jani Nikula --- include/drm/drm_dp_helper.h | 4 1 file changed, 4 insertions(+) diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h index 1d5b3dbb6e56..f3a61341011d 100644 --- a/include/drm/drm_dp_helper.h +++ b/include/drm/drm_dp_helper.h @@ -1319,6 +1319,10 @@ struct drm_panel; #define DP_MAX_LANE_COUNT_PHY_REPEATER 0xf0004 /* 1.4a */ #define DP_Repeater_FEC_CAPABILITY 0xf0004 /* 1.4 */ #define DP_PHY_REPEATER_EXTENDED_WAIT_TIMEOUT 0xf0005 /* 1.4a */ +#define DP_MAIN_LINK_CHANNEL_CODING_PHY_REPEATER 0xf0006 /* 2.0 */ +# define DP_PHY_REPEATER_128B132B_SUPPORTED(1 << 0) +/* See DP_128B132B_SUPPORTED_LINK_RATES for values */ +#define DP_PHY_REPEATER_128B132B_RATES 0xf0007 /* 2.0 */ enum drm_dp_phy { DP_PHY_DPRX, -- 2.20.1
[PATCH 4/4] drm/dp: add helper for extracting adjust 128b/132b TX FFE preset
The DP 2.0 128b/132b channel coding uses TX FFE presets instead of vswing and pre-emphasis. Cc: Manasi Navare Signed-off-by: Jani Nikula --- drivers/gpu/drm/drm_dp_helper.c | 14 ++ include/drm/drm_dp_helper.h | 2 ++ 2 files changed, 16 insertions(+) diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 9389f92cb944..2843238a78e6 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -130,6 +130,20 @@ u8 drm_dp_get_adjust_request_pre_emphasis(const u8 link_status[DP_LINK_STATUS_SI } EXPORT_SYMBOL(drm_dp_get_adjust_request_pre_emphasis); +/* DP 2.0 128b/132b */ +u8 drm_dp_get_adjust_tx_ffe_preset(const u8 link_status[DP_LINK_STATUS_SIZE], + int lane) +{ + int i = DP_ADJUST_REQUEST_LANE0_1 + (lane >> 1); + int s = ((lane & 1) ? +DP_ADJUST_TX_FFE_PRESET_LANE1_SHIFT : +DP_ADJUST_TX_FFE_PRESET_LANE0_SHIFT); + u8 l = dp_link_status(link_status, i); + + return (l >> s) & 0xf; +} +EXPORT_SYMBOL(drm_dp_get_adjust_tx_ffe_preset); + u8 drm_dp_get_adjust_request_post_cursor(const u8 link_status[DP_LINK_STATUS_SIZE], unsigned int lane) { diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h index f3a61341011d..3ee0b3ffb8a5 100644 --- a/include/drm/drm_dp_helper.h +++ b/include/drm/drm_dp_helper.h @@ -1494,6 +1494,8 @@ u8 drm_dp_get_adjust_request_voltage(const u8 link_status[DP_LINK_STATUS_SIZE], int lane); u8 drm_dp_get_adjust_request_pre_emphasis(const u8 link_status[DP_LINK_STATUS_SIZE], int lane); +u8 drm_dp_get_adjust_tx_ffe_preset(const u8 link_status[DP_LINK_STATUS_SIZE], + int lane); u8 drm_dp_get_adjust_request_post_cursor(const u8 link_status[DP_LINK_STATUS_SIZE], unsigned int lane); -- 2.20.1
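The nibble extraction in the helper above can be tried in isolation with something like the following. This is only a sketch: the register offsets (0x202 for the first link status byte, 0x206 for the adjust request) and the shift values 0 and 4 are assumptions about the header definitions, and the names are shortened for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_STATUS_SIZE        6     /* assumed DP_LINK_STATUS_SIZE */
#define LANE0_1_STATUS          0x202 /* assumed DP_LANE0_1_STATUS */
#define ADJUST_REQUEST_LANE0_1  0x206 /* assumed DP_ADJUST_REQUEST_LANE0_1 */
#define FFE_PRESET_LANE0_SHIFT  0     /* assumed: even lane in low nibble */
#define FFE_PRESET_LANE1_SHIFT  4     /* assumed: odd lane in high nibble */

/* Two lanes share one byte; pick the byte by lane / 2, the nibble by lane & 1. */
uint8_t get_adjust_tx_ffe_preset(const uint8_t link_status[LINK_STATUS_SIZE],
				 int lane)
{
	int reg, shift;
	uint8_t byte;

	assert(lane >= 0 && lane < 4);

	reg = ADJUST_REQUEST_LANE0_1 + (lane >> 1);
	shift = (lane & 1) ? FFE_PRESET_LANE1_SHIFT : FFE_PRESET_LANE0_SHIFT;
	/* The link status array starts at LANE0_1_STATUS, so rebase the index. */
	byte = link_status[reg - LANE0_1_STATUS];

	return (byte >> shift) & 0xf;
}
```

For a link status block whose last two bytes are 0x5a and 0xc3, lanes 0 to 3 would read back presets 0xa, 0x5, 0x3 and 0xc under these assumptions.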
Re: [PATCH 2/2] drm/amdgpu: Use mod_delayed_work in JPEG/UVD/VCE/VCN ring_end_use hooks
On 8/13/2021 4:01 PM, Michel Dänzer wrote: On 2021-08-13 6:23 a.m., Lazar, Lijo wrote: On 8/12/2021 10:24 PM, Michel Dänzer wrote: On 2021-08-12 1:33 p.m., Lazar, Lijo wrote: On 8/12/2021 1:41 PM, Michel Dänzer wrote: On 2021-08-12 7:55 a.m., Koenig, Christian wrote: Hi James, Evan seems to have understood how this all works together. See while any begin/end use critical section is active the work should not be active. When you handle only one ring you can just call cancel in begin use and schedule in end use. But when you have more than one ring you need a lock or counter to prevent concurrent work items to be started. Michel's idea to use mod_delayed_work is a bad one because it assumes that the delayed work is still running. It merely assumes that the work may already have been scheduled before. Admittedly, I missed the cancel_delayed_work_sync calls for patch 2. While I think it can still have some effect when there's a single work item for multiple rings, as described by James, it's probably negligible, since presumably the time intervals between ring_begin_use and ring_end_use are normally much shorter than a second. So, while patch 2 is at worst a no-op (since mod_delayed_work is the same as schedule_delayed_work if the work hasn't been scheduled yet), I'm fine with dropping it. Something similar applies to the first patch I think, There are no cancel work calls in that case, so the commit log is accurate TTBOMK. Curious - For patch 1, does it make a difference if any delayed work scheduled is cancelled in the else part before proceeding? } else if (!enable && adev->gfx.gfx_off_state) { cancel_delayed_work(); I tried the patch below. While this does seem to fix the problem as well, I see a potential issue: 1. amdgpu_gfx_off_ctrl locks adev->gfx.gfx_off_mutex 2. amdgpu_device_delay_enable_gfx_off runs, blocks in mutex_lock 3. amdgpu_gfx_off_ctrl calls cancel_delayed_work_sync I'm afraid this would deadlock? 
(CONFIG_PROVE_LOCKING doesn't complain though) Should use the cancel_delayed_work instead of the _sync version. The thing is, it's not clear to me from cancel_delayed_work's description that it's guaranteed not to wait for amdgpu_device_delay_enable_gfx_off to finish if it's already running. If that's not guaranteed, it's prone to the same deadlock. From what I understood from the description, cancel initiates a cancel. If the work has already started, it returns false saying it couldn't succeed otherwise cancels out the scheduled work and returns true. In the note below, it asks to specifically use the _sync version if we need to wait for an already started work and that definitely has the problem of deadlock you mentioned above. * Note: * The work callback function may still be running on return, unless * it returns %true and the work doesn't re-arm itself. Explicitly flush or * use cancel_delayed_work_sync() to wait on it. As you mentioned - at best work is not scheduled yet and cancelled successfully, or at worst it's waiting for the mutex. In the worst case, if amdgpu_device_delay_enable_gfx_off gets the mutex after amdgpu_gfx_off_ctrl unlocks it, there is an extra check as below. if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) The count wouldn't be 0 and hence it won't enable GFXOFF. I'm not sure, but it might also be possible for amdgpu_device_delay_enable_gfx_off to get the mutex only after amdgpu_gfx_off_ctrl was called again and set adev->gfx.gfx_off_req_count back to 0. Yes, this is a case we can't avoid in either case. If the work has already started, then mod_delayed_ also doesn't have any impact. Another case is work thread already got the mutex and a disable request comes just at that time. It needs to wait till mutex is released by work, that could mean enable gfxoff immediately followed by disable. Maybe it's possible to fix it with cancel_delayed_work_sync somehow, but I'm not sure how offhand. 
(With cancel_delayed_work instead, I'm worried amdgpu_device_delay_enable_gfx_off might still enable GFXOFF in the HW immediately after amdgpu_gfx_off_ctrl unlocks the mutex. Then again, that might happen with mod_delayed_work as well...) As mentioned earlier, cancel_delayed_work won't cause this issue. In the mod_delayed_ patch, mod_ version is called only when req_count is 0. While that is a good thing, it keeps alive one more contender for the mutex. Not sure what you mean. It leaves the possibility of amdgpu_device_delay_enable_gfx_off running just after amdgpu_gfx_off_ctrl tried to postpone it. As discussed above, something similar might be possible with cancel_delayed_work as well. The mod_delayed is called only req_count gets back to 0. If there is another disable request comes after that, it doesn't cancel out the work scheduled nor does it adjust the delay. Ex: Disable gfxoff -> Enable gfxoff (now the work is scheduled) -> Disable gfxoff (within 5ms
[PATCH] drm/i915/gt: Potential error pointer dereference in pinned_context()
If the intel_engine_create_pinned_context() function returns an error pointer, then dereferencing "ce" will Oops. Use "vm" instead of "ce->vm". Fixes: cf586021642d ("drm/i915/gt: Pipelined page migration") Signed-off-by: Dan Carpenter --- drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index d0a7c934fd3b..1dac21aa7e5c 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -177,7 +177,7 @@ static struct intel_context *pinned_context(struct intel_gt *gt) ce = intel_engine_create_pinned_context(engine, vm, SZ_512K, I915_GEM_HWS_MIGRATE, &key, "migrate"); - i915_vm_put(ce->vm); + i915_vm_put(vm); return ce; } -- 2.20.1
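To make the failure mode concrete, here is a small userspace sketch of the error-pointer convention and of why the fixed code drops the reference through the local "vm" pointer. Every name here (err_ptr, is_err, create_context, pinned_context_sketch) is a stand-in invented for the example, not the i915 or kernel API, and the assumption that a successfully created context takes its own reference on the vm is labeled in the code.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ERRNO 4095

struct vm { int refcount; };
struct context { struct vm *vm; };

/* Encode a negative errno in a pointer value, in the style of ERR_PTR(). */
void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True when the pointer value falls in the top MAX_ERRNO addresses. */
int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

void vm_put(struct vm *vm)
{
	vm->refcount--;
}

static struct context ctx_storage;

/*
 * Hypothetical constructor. Assumption: on success the context holds its
 * own reference on vm; on failure it returns an error pointer instead.
 */
struct context *create_context(struct vm *vm, int fail)
{
	if (fail)
		return err_ptr(-12); /* -ENOMEM */

	vm->refcount++;
	ctx_storage.vm = vm;
	return &ctx_storage;
}

struct context *pinned_context_sketch(struct vm *vm, int fail)
{
	struct context *ce = create_context(vm, fail);

	/*
	 * Drop the caller's reference through the local pointer. Using
	 * ce->vm here would dereference an error value when creation failed.
	 */
	vm_put(vm);
	return ce;
}
```

In both the success and the failure path the caller's reference is released exactly once, and "ce" is only ever inspected with is_err() before being used.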
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 8/13/2021 3:59 PM, Michel Dänzer wrote: From: Michel Dänzer schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when GFXOFF transitions from enabled to disabled. This makes sure the delayed work will be scheduled as intended in the reverse case. In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs to use mutex_trylock instead of mutex_lock. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h| 3 +++ 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f3fd5ec710b6..8b025f70706c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) struct amdgpu_device *adev = container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); - mutex_lock(&adev->gfx.gfx_off_mutex); + /* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. */ + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true +* when adev->gfx.gfx_off_req_count is already 0, we might race with that. +* Re-schedule to make sure gfx off will be re-enabled in the HW eventually. 
+*/ + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + return; This is not needed and is just creating another thread to contend for mutex. The checks below take care of enabling gfxoff correctly. If it's already in gfx_off state, it doesn't do anything. So I don't see why this change is needed. The other problem is amdgpu_get_gfx_off_status() also uses the same mutex. So it won't be knowing which thread it would be contending against and blindly creates more work items. + } + if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true)) adev->gfx.gfx_off_state = true; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index a0be0772c8b3..da4c46db3093 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -28,9 +28,6 @@ #include "amdgpu_rlc.h" #include "amdgpu_ras.h" -/* delay 0.1 second to enable gfx off feature */ -#define GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100) - /* * GPU GFX IP block helpers function. */ @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) adev->gfx.gfx_off_req_count--; if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) { - schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE); - } else if (!enable && adev->gfx.gfx_off_state) { - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + } else if (!enable) { + if (adev->gfx.gfx_off_req_count == 1 && !adev->gfx.gfx_off_state) + cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); This has the deadlock problem as discussed in the other thread. 
Thanks, Lijo + if (adev->gfx.gfx_off_state && + !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { adev->gfx.gfx_off_state = false; if (adev->gfx.funcs->init_spm_golden) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h index d43fe2ed8116..dcdb505bb7f4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h @@ -32,6 +32,9 @@ #include "amdgpu_rlc.h" #include "soc15.h" +/* delay 0.1 second to enable gfx off feature */ +#define AMDGPU_GFX_OFF_DELAY_ENABLE msecs_to_jiffies(100) + /* GFX current status */ #define AMDGPU_GFX_NORMAL_MODE0xL #define AMDGPU_GFX_SAFE_MODE
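The difference between the three workqueue calls discussed in this thread can be modelled without any kernel code. The sketch below is a single-threaded toy model, not the workqueue implementation: the struct and function names are invented, time is a plain integer in milliseconds, and the return values follow the documented behaviour as I understand it (schedule returns false when the work is already pending, mod returns true when it modified a pending timer, cancel returns true when it removed pending work).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a struct delayed_work: just a pending flag and a deadline. */
struct delayed_work_model {
	bool pending;
	long expires;
};

/* Like schedule_delayed_work(): a no-op if the work is already pending. */
bool model_schedule(struct delayed_work_model *w, long now, long delay)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Like mod_delayed_work(): always (re)arms the timer with the new delay. */
bool model_mod(struct delayed_work_model *w, long now, long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/* Like cancel_delayed_work(): removes pending work, never waits. */
bool model_cancel(struct delayed_work_model *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}
```

In this model a second model_schedule() at t=50 leaves the deadline at 100, which is the stutter described in the commit message; either model_mod() or a model_cancel() followed by model_schedule() pushes the deadline back.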
Re: [RFC PATCH 06/17] drm/exynos: dsi: Handle exynos specifics via driver_data
Hi Inki, On Fri, Aug 13, 2021 at 03:50:56PM +0900, Inki Dae wrote: > On 21. 7. 26. at 2:25 AM, Sam Ravnborg wrote: > > On Sun, Jul 04, 2021 at 02:32:19PM +0530, Jagan Teki wrote: > >> Exynos DSI driver is actually a Samsung MIPI DSIM bridge > >> IP which is also used in i.MX8MM platforms. > >> > >> Right now the existing driver has some exynos drm specific > >> code bases like te_irq, crtc and component_ops. > >> > >> In order to switch this driver into a common bridge driver > >> We can see 2 options to handle the exynos specific code. > >> > >> A. Drop the component_ops, and rework other specifics. > >>This may lead to more foundation work as it requires > >>more changes in exynos drm drivers stack. > >> > >> B. Handle the exynos specifics via driver data, and make > >>the common bridge work in different platforms and plan > >>for option A in future. > >> > >> So, this patch is trying to add option B) changes to handle > >> exynos specifics via driver_data. > > > > We really should find someone that has the time, energy, knowledge and > > hardware that can include device_link support once and for all for > > bridges. > > Then we would avoid hacks like this. > > > > I see no other options at the moment, but look forward for a better > > solution. > > I'm not sure that it's correct to share this mipi dsi driver with > I.MX8MM SoC even though it's a same IP because this MIPI DSI device > isn't peripheral device but in SoC. > > It would mean that Exynos MIPI DSI device depends on SoC architecture, > and Exynos and I.MX series are totally different SoC. So if we share > the same driver for the MIPI DSI device then many things we didn't > predict may happen in the future. Isn't that true for external components too, though? Any driver shared by multiple systems will face this issue, where it will be developed with some use cases in mind, and regressions may happen when the driver is then extended to support other use cases not required for the original platform. 
In general we don't want multiple drivers for the same IP core unless there are valid technical reasons for that. It's the whole point of the device tree, being able to describe how IP cores are integrated, so that code can be reused across platforms. Of course, integration differences between SoCs can sometimes vary wildly and require some amount of glue code. > I don't want to make Jagan's efforts > in vain for the community but clarify whether this is correct way or > not. If this is only the way we have to go then we could more focus on > actual solution not such hack. Impossible work with Jagan alone I > think. I do agree that we need more correct solutions and less hacks in general :-) The issues faced on Exynos also exist on other platforms, so it would be much better to solve them well once than duplicating implementations with less test coverage and reviews. There have been efforts in the past to address some of these issues, which have resulted in solutions such as the component framework. However, I'd argue that we've never taken it to the last step, and have always stopped with half solutions. The component framework, for instance, is painful to use, and the handling of .remove() in most drivers is completely broken because of that (not just because of that though, we have issues in the DRM core that make hot-unplug just impossible to handle safely). > So let's get started with a question, > - Is MIPI-DSI bridge device or Encoder device? I think that MIPI-DSI > is a Encoder device managed by atomic KMS. If MIPI-DSI should be > handled as bridge device then does now drm bridge framework provide > everything to share one driver with one more SoC? I mean something > that drm bridge has to consider for such driver support, which is > shared with one more SoC. The DRM "encoder" concept was a bit of a historical mistake that we are stuck with as drm_encoder is exposed to userspace. 
It comes from a time where nobody was envisioning chaining multiple encoders. DRM is moving to modelling every component after the CRTC as a bridge. This brings much more flexibility, and in that model, the drm_encoder becomes more or less a stub. The DRM bridge API has been extended in the past to support more features, and if anything is still missing that makes it difficult to move away from drm_encoder, we can of course address the issues in drm_bridge. > And Display mode - VIDEO and COMMAND mode - is generic type of MIPI > DSI, and also componentised subsystem is a generic solution to resolve > probing order issue not Exynos specific feature. These are driver > specific ones not Exynos SoC I think. As SoC specific things should be > considered, I think MIPI DSI Driver - interrupt handler and probing > order things are really specific to device driver - should be > separated but we could share the control part of the device. > > I was busy with other projects so didn't care of Linux DRM world so > there may be my missing something. >
Re: [PATCH v18 0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver
13.08.2021 13:33, Thierry Reding wrote: > On Mon, Jun 07, 2021 at 01:40:06AM +0300, Dmitry Osipenko wrote: >> 01.06.2021 07:21, Dmitry Osipenko wrote: >>> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver, >>> which is done using interconnect framework. It fixes display corruption that >>> happens due to insufficient memory bandwidth. >>> >>> Changelog: >>> >>> v18: - Moved total peak bandwidth from CRTC state to plane state and removed >>>dummy plane bandwidth state initialization from T186+ plane hub. This >>>was suggested by Thierry Reding to v17. >>> >>> - I haven't done anything about the cursor's plane bandwidth which >>>doesn't contribute to overlapping bandwidths for a small sized >>>window because it works okay as-is. >> >> Thierry, will you take these patches for 5.14? > > As discussed offline, I've picked these up for v5.15 with a small patch > squashed in to unbreak the Tegra186 and later support. Cool, thanks.
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 2021-08-13 1:50 p.m., Lazar, Lijo wrote: > > > On 8/13/2021 3:59 PM, Michel Dänzer wrote: >> From: Michel Dänzer >> >> schedule_delayed_work does not push back the work if it was already >> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms >> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF >> was disabled and re-enabled again during those 100 ms. >> >> This resulted in frame drops / stutter with the upcoming mutter 41 >> release on Navi 14, due to constantly enabling GFXOFF in the HW and >> disabling it again (for getting the GPU clock counter). >> >> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from >> enabled to disabled. This makes sure the delayed work will be scheduled >> as intended in the reverse case. >> >> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs >> to use mutex_trylock instead of mutex_lock. >> >> v2: >> * Use cancel_delayed_work_sync & mutex_trylock instead of >> mod_delayed_work. >> >> Signed-off-by: Michel Dänzer >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 +++-- >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +++ >> 3 files changed, 20 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index f3fd5ec710b6..8b025f70706c 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct >> work_struct *work) >> struct amdgpu_device *adev = >> container_of(work, struct amdgpu_device, >> gfx.gfx_off_delay_work.work); >> - mutex_lock(&adev->gfx.gfx_off_mutex); >> + /* mutex_lock could deadlock with cancel_delayed_work_sync in >> amdgpu_gfx_off_ctrl. 
*/ >> + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { >> + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be called >> with enable=true >> + * when adev->gfx.gfx_off_req_count is already 0, we might race >> with that. >> + * Re-schedule to make sure gfx off will be re-enabled in the HW >> eventually. >> + */ >> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, >> AMDGPU_GFX_OFF_DELAY_ENABLE); >> + return; > > This is not needed and is just creating another thread to contend for mutex. Still not sure what you mean by that. What other thread? > The checks below take care of enabling gfxoff correctly. If it's already in > gfx_off state, it doesn't do anything. So I don't see why this change is > needed. mutex_trylock is needed to prevent the deadlock discussed before and below. schedule_delayed_work is needed due to this scenario hinted at by the comment: 1. amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work 2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which fails GFXOFF would never get re-enabled in HW in this case (until amdgpu_gfx_off_ctrl calls schedule_delayed_work again). (cancel_delayed_work_sync guarantees there's no pending delayed work when it returns, even if amdgpu_device_delay_enable_gfx_off calls schedule_delayed_work) > The other problem is amdgpu_get_gfx_off_status() also uses the same mutex. Not sure what for TBH. AFAICT there's only one implementation of this for Renoir, which just reads a register. (It's only called from debugfs) > So it won't be knowing which thread it would be contending against and > blindly creates more work items. There is only ever at most one instance of the delayed work at any time. amdgpu_device_delay_enable_gfx_off doesn't care whether amdgpu_gfx_off_ctrl or amdgpu_get_gfx_off_status is holding the mutex, it just keeps re-scheduling itself 100 ms later until it succeeds. 
>> @@ -569,9 +566,13 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, >> bool enable) >> adev->gfx.gfx_off_req_count--; >> if (enable && !adev->gfx.gfx_off_state && >> !adev->gfx.gfx_off_req_count) { >> - schedule_delayed_work(&adev->gfx.gfx_off_delay_work, >> GFX_OFF_DELAY_ENABLE); >> - } else if (!enable && adev->gfx.gfx_off_state) { >> - if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, >> false)) { >> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, >> AMDGPU_GFX_OFF_DELAY_ENABLE); >> + } else if (!enable) { >> + if (adev->gfx.gfx_off_req_count == 1 && !adev->gfx.gfx_off_state) >> + cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); > > This has the deadlock problem as discussed in the other thread. It does not. If amdgpu_device_delay_enable_gfx_off runs while amdgpu_gfx_off_ctrl holds the mutex, mutex_trylock fails and the former bails. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and X d
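The trylock-and-re-arm behaviour argued for above can be walked through with a toy, single-threaded model. This is only a sketch under stated assumptions: the mutex is reduced to a boolean, the delayed work to a pending flag, the SMU call is assumed to always succeed, and all names are invented for the example rather than taken from amdgpu.

```c
#include <assert.h>
#include <stdbool.h>

struct gfx_model {
	bool mutex_held;    /* stands in for adev->gfx.gfx_off_mutex */
	bool work_pending;  /* stands in for gfx_off_delay_work being queued */
	bool gfx_off_state; /* GFXOFF currently enabled in HW */
	int req_count;      /* outstanding "disable GFXOFF" requests */
};

bool model_trylock(struct gfx_model *g)
{
	if (g->mutex_held)
		return false;
	g->mutex_held = true;
	return true;
}

void model_unlock(struct gfx_model *g)
{
	g->mutex_held = false;
}

/* Models the v2 handler: never blocks on the mutex, re-arms itself instead. */
void model_delayed_work(struct gfx_model *g)
{
	g->work_pending = false; /* the handler is running now */

	if (!model_trylock(g)) {
		/* Someone else holds the mutex: try again after the delay. */
		g->work_pending = true;
		return;
	}

	/* Same check as the original handler before enabling GFXOFF. */
	if (!g->gfx_off_state && g->req_count == 0)
		g->gfx_off_state = true;

	model_unlock(g);
}
```

When the handler fires while the mutex is held it leaves GFXOFF untouched and stays pending; the next run with the mutex free enables GFXOFF, provided no disable request is outstanding. The model does not capture the extra latency Lijo objects to, only that the handler cannot block.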
Re: [Intel-gfx] [PATCH v2] fbdev/efifb: Release PCI device's runtime PM ref during FB destroy
On Mon, Aug 09, 2021 at 04:31:46PM +0300, Imre Deak wrote: > Atm the EFI FB platform driver gets a runtime PM reference for the > associated GFX PCI device during probing the EFI FB platform device and > releases it only when the platform device gets unbound. > > When fbcon switches to the FB provided by the PCI device's driver (for > instance i915/drmfb), the EFI FB will get only unregistered without the > EFI FB platform device getting unbound, keeping the runtime PM reference > acquired during the platform device probing. This reference will prevent > the PCI driver from runtime suspending the device. > > Fix this by releasing the RPM reference from the EFI FB's destroy hook, > called when the FB gets unregistered. > > While at it assert that pm_runtime_get_sync() didn't fail. > > v2: > - Move pm_runtime_get_sync() before register_framebuffer() to avoid its > race wrt. efifb_destroy()->pm_runtime_put(). (Daniel) > - Assert that pm_runtime_get_sync() didn't fail. > - Clarify commit message wrt. platform/PCI device/driver and driver > removal vs. device unbinding. > > Fixes: a6c0fd3d5a8b ("efifb: Ensure graphics device for efifb stays at PCI > D0") > Cc: Kai-Heng Feng > Cc: Daniel Vetter > Reviewed-by: Daniel Vetter (v1) > Signed-off-by: Imre Deak Thanks for the reviews, pushed to drm-intel-next. 
> --- > drivers/video/fbdev/efifb.c | 21 ++--- > 1 file changed, 14 insertions(+), 7 deletions(-) > > diff --git a/drivers/video/fbdev/efifb.c b/drivers/video/fbdev/efifb.c > index 8ea8f079cde26..edca3703b9640 100644 > --- a/drivers/video/fbdev/efifb.c > +++ b/drivers/video/fbdev/efifb.c > @@ -47,6 +47,8 @@ static bool use_bgrt = true; > static bool request_mem_succeeded = false; > static u64 mem_flags = EFI_MEMORY_WC | EFI_MEMORY_UC; > > +static struct pci_dev *efifb_pci_dev;/* dev with BAR covering the > efifb */ > + > static struct fb_var_screeninfo efifb_defined = { > .activate = FB_ACTIVATE_NOW, > .height = -1, > @@ -243,6 +245,9 @@ static inline void efifb_show_boot_graphics(struct > fb_info *info) {} > > static void efifb_destroy(struct fb_info *info) > { > + if (efifb_pci_dev) > + pm_runtime_put(&efifb_pci_dev->dev); > + > if (info->screen_base) { > if (mem_flags & (EFI_MEMORY_UC | EFI_MEMORY_WC)) > iounmap(info->screen_base); > @@ -333,7 +338,6 @@ ATTRIBUTE_GROUPS(efifb); > > static bool pci_dev_disabled;/* FB base matches BAR of a disabled > device */ > > -static struct pci_dev *efifb_pci_dev;/* dev with BAR covering the > efifb */ > static struct resource *bar_resource; > static u64 bar_offset; > > @@ -569,17 +573,22 @@ static int efifb_probe(struct platform_device *dev) > pr_err("efifb: cannot allocate colormap\n"); > goto err_groups; > } > + > + if (efifb_pci_dev) > + WARN_ON(pm_runtime_get_sync(&efifb_pci_dev->dev) < 0); > + > err = register_framebuffer(info); > if (err < 0) { > pr_err("efifb: cannot register framebuffer\n"); > - goto err_fb_dealoc; > + goto err_put_rpm_ref; > } > fb_info(info, "%s frame buffer device\n", info->fix.id); > - if (efifb_pci_dev) > - pm_runtime_get_sync(&efifb_pci_dev->dev); > return 0; > > -err_fb_dealoc: > +err_put_rpm_ref: > + if (efifb_pci_dev) > + pm_runtime_put(&efifb_pci_dev->dev); > + > fb_dealloc_cmap(&info->cmap); > err_groups: > sysfs_remove_groups(&dev->dev.kobj, efifb_groups); > @@ -603,8 +612,6 @@ static int 
efifb_remove(struct platform_device *pdev) > unregister_framebuffer(info); > sysfs_remove_groups(&pdev->dev.kobj, efifb_groups); > framebuffer_release(info); > - if (efifb_pci_dev) > - pm_runtime_put(&efifb_pci_dev->dev); > > return 0; > } > -- > 2.27.0 >
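The reference balance the v2 patch aims for can be checked with a tiny counter model. This is a sketch, not the efifb code: the runtime PM reference is a plain integer, registration failure is a parameter, and the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

struct fb_model {
	int rpm_refs;    /* runtime PM references held on the PCI device */
	bool registered; /* framebuffer currently registered */
};

/* Probe: take the reference before registering, drop it if that fails. */
int model_probe(struct fb_model *fb, bool register_fails)
{
	fb->rpm_refs++;
	if (register_fails) {
		fb->rpm_refs--;
		return -1;
	}
	fb->registered = true;
	return 0;
}

/* Destroy hook: runs when the FB is unregistered, with or without unbind. */
void model_destroy(struct fb_model *fb)
{
	fb->rpm_refs--;
	fb->registered = false;
}
```

Taking the reference before registration matters because, once registered, the destroy hook may run at any time; in this model every path ends with rpm_refs back at zero.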
Re: [PATCH v2 2/2] drm/panel: s6d27a1: Add driver for Samsung S6D27A1 display panel
On Sat, Aug 7, 2021 at 3:31 PM Markuss Broks wrote: > This adds a driver for Samsung S6D27A1 display controller and panel. > This panel is found in the Samsung GT-I8160 mobile phone, > and possibly some other mobile phones. > > This display needs manufacturer commands to configure it; > the commands used in this driver were taken from downstream driver > by Gareth Phillips; sadly, there is almost no documentation on what they > actually do. > > This driver re-uses the DBI infrastructure to communicate with the display. > > This driver is heavily based on WideChips WS2401 display controller > driver by Linus Walleij and on other panel drivers for reference. > > Signed-off-by: Markuss Broks > > v2 -> v3: Both v3 patches applied to drm-misc-next and pushed. Yours, Linus Walleij
Re: [PATCH 1/2] dt-bindings: display: panel: Add Truly NT35521 panel support
On Wed, Aug 11, 2021 at 12:51:56PM -0600, Rob Herring wrote: > On Wed, Aug 04, 2021 at 04:13:51PM +0800, Shawn Guo wrote: > > The Truly NT35521 is a 5.24" 1280x720 DSI panel, and the backlight is > > managed through DSI link. > > > > Signed-off-by: Shawn Guo > > --- > > .../bindings/display/panel/truly,nt35521.yaml | 62 +++ > > 1 file changed, 62 insertions(+) > > create mode 100644 > > Documentation/devicetree/bindings/display/panel/truly,nt35521.yaml > > > > diff --git > > a/Documentation/devicetree/bindings/display/panel/truly,nt35521.yaml > > b/Documentation/devicetree/bindings/display/panel/truly,nt35521.yaml > > new file mode 100644 > > index ..4727c3df6eb8 > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/display/panel/truly,nt35521.yaml > > @@ -0,0 +1,62 @@ > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > > +%YAML 1.2 > > +--- > > +$id: http://devicetree.org/schemas/display/panel/truly,nt35521.yaml# > > +$schema: http://devicetree.org/meta-schemas/core.yaml# > > + > > +title: Truly NT35521 5.24" 1280x720 MIPI-DSI Panel > > + > > +maintainers: > > + - Shawn Guo > > + > > +description: | > > + The Truly NT35521 is a 5.24" 1280x720 MIPI-DSI panel. The panel > > backlight > > + is managed through DSI link. > > + > > +allOf: > > + - $ref: panel-common.yaml# > > + > > +properties: > > + compatible: > > +const: truly,nt35521 > > + > > + reg: true > > + > > + reset-gpios: true > > + > > + enable-gpios: true > > + > > + pwr-positive5-gpios: > > +maxItems: 1 > > + > > + pwr-negative5-gpios: > > +maxItems: 1 > > Are these +/-5V supplies? If so, they should be modeled with > gpio-regulator perhaps unless the panel connection could only ever be > GPIOs. Hi Rob, The binding has been updated in v2 [1]. Please help review that. Thanks! 
Shawn [1] https://lore.kernel.org/linux-arm-msm/20210809051008.6172-2-shawn@linaro.org/T/#m587035a602b1be6c5326dcf24af01b3e8a5d2cc9 > > > + > > +required: > > + - compatible > > + - reg > > + - reset-gpios > > + - enable-gpios > > + - pwr-positive5-gpios > > + - pwr-negative5-gpios > > + > > +additionalProperties: false > > + > > +examples: > > + - | > > +#include > > + > > +dsi { > > +#address-cells = <1>; > > +#size-cells = <0>; > > + > > +panel@0 { > > +compatible = "truly,nt35521"; > > +reg = <0>; > > +reset-gpios = <&msmgpio 25 GPIO_ACTIVE_LOW>; > > +pwr-positive5-gpios = <&msmgpio 114 GPIO_ACTIVE_HIGH>; > > +pwr-negative5-gpios = <&msmgpio 17 GPIO_ACTIVE_HIGH>; > > +enable-gpios = <&msmgpio 10 GPIO_ACTIVE_HIGH>; > > +}; > > +}; > > +... > > -- > > 2.17.1 > > > >
Re: [PATCH] drm/i915/gt: Potential error pointer dereference in pinned_context()
On 8/13/21 1:36 PM, Dan Carpenter wrote: If the intel_engine_create_pinned_context() function returns an error pointer, then dereferencing "ce" will Oops. Use "vm" instead of "ce->vm". Fixes: cf586021642d ("drm/i915/gt: Pipelined page migration") Signed-off-by: Dan Carpenter --- drivers/gpu/drm/i915/gt/intel_migrate.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index d0a7c934fd3b..1dac21aa7e5c 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -177,7 +177,7 @@ static struct intel_context *pinned_context(struct intel_gt *gt) ce = intel_engine_create_pinned_context(engine, vm, SZ_512K, I915_GEM_HWS_MIGRATE, &key, "migrate"); - i915_vm_put(ce->vm); + i915_vm_put(vm); return ce; } Thanks, Reviewed-by: Thomas Hellström
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 8/13/2021 7:04 PM, Michel Dänzer wrote: On 2021-08-13 1:50 p.m., Lazar, Lijo wrote: On 8/13/2021 3:59 PM, Michel Dänzer wrote: From: Michel Dänzer schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when GFXOFF transitions from enabled to disabled. This makes sure the delayed work will be scheduled as intended in the reverse case. In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs to use mutex_trylock instead of mutex_lock. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +++ 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f3fd5ec710b6..8b025f70706c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) struct amdgpu_device *adev = container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); - mutex_lock(&adev->gfx.gfx_off_mutex); + /* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. */ + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true + * when adev->gfx.gfx_off_req_count is already 0, we might race with that. 
+ * Re-schedule to make sure gfx off will be re-enabled in the HW eventually. + */ + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + return; This is not needed and is just creating another thread to contend for mutex. Still not sure what you mean by that. What other thread? Sorry, I meant it schedules another workitem and delays GFXOFF enablement further. For ex: if it was another function like gfx_off_status holding the lock at the time of check. The checks below take care of enabling gfxoff correctly. If it's already in gfx_off state, it doesn't do anything. So I don't see why this change is needed. mutex_trylock is needed to prevent the deadlock discussed before and below. schedule_delayed_work is needed due to this scenario hinted at by the comment: 1. amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work 2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which fails GFXOFF would never get re-enabled in HW in this case (until amdgpu_gfx_off_ctrl calls schedule_delayed_work again). (cancel_delayed_work_sync guarantees there's no pending delayed work when it returns, even if amdgpu_device_delay_enable_gfx_off calls schedule_delayed_work) I think we need to explain based on the original code before. There is an assumption here that the only other contention of this mutex is with the gfx_off_ctrl function. That is not true, so this is not the only case where mutex_trylock can fail. It could be because gfx_off_status is holding the lock. As far as I understand if the work has already started running when schedule_delayed_work is called, it will insert another in the work queue after delay. Based on that understanding I didn't find a problem with the original code. Maybe, mutex_trylock is added to call _sync to make sure work is cancelled or not running but that breaks other assumptions. The other problem is amdgpu_get_gfx_off_status() also uses the same mutex. Not sure what for TBH.
AFAICT there's only one implementation of this for Renoir, which just reads a register. (It's only called from debugfs) I'm not sure either :) But as long as there are other functions that contend for the same lock, it's not good to implement based on assumptions only about a particular scenario. So it won't be knowing which thread it would be contending against and blindly creates more work items. There is only ever at most one instance of the delayed work at any time. amdgpu_device_delay_enable_gfx_off doesn't care whether amdgpu_gfx_off_ctrl or amdgpu_get_gfx_off_status is holding the mutex, it just keeps re-scheduling itself 100 ms later until it succeeds. Yes, that is the problem, there could be cases where it could have gone to gfxoff right after gfx_off_status releases the lock, but it doesn't delaying it further. That wo
Re: [Intel-gfx] [PATCH v6 09/15] drm/i915/pxp: Implement PXP irq handler
On Wed, Jul 28, 2021 at 07:01:00PM -0700, Daniele Ceraolo Spurio wrote: > From: "Huang, Sean Z" > > The HW will generate a teardown interrupt when session termination is > required, which requires i915 to submit a terminating batch. Once the HW > is done with the termination it will generate another interrupt, at > which point it is safe to re-create the session. > > Since the termination and re-creation flow is something we want to > trigger from the driver as well, use a common work function that can be > called both from the irq handler and from the driver set-up flows, which > has the added benefit of allowing us to skip any extra locks because > the work itself serializes the operations. > > v2: use struct completion instead of bool (Chris) > v3: drop locks, clean up functions and improve comments (Chris), > move to common work function. > v4: improve comments, simplify wait logic (Rodrigo) > v5: unconditionally set interrupts, rename state_attacked var (Rodrigo) > > Signed-off-by: Huang, Sean Z > Signed-off-by: Daniele Ceraolo Spurio > Cc: Chris Wilson > Cc: Rodrigo Vivi > Reviewed-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/Makefile| 1 + > drivers/gpu/drm/i915/gt/intel_gt_irq.c | 7 ++ > drivers/gpu/drm/i915/i915_reg.h | 1 + > drivers/gpu/drm/i915/pxp/intel_pxp.c | 66 +++-- > drivers/gpu/drm/i915/pxp/intel_pxp.h | 8 ++ > drivers/gpu/drm/i915/pxp/intel_pxp_irq.c | 99 > drivers/gpu/drm/i915/pxp/intel_pxp_irq.h | 32 +++ > drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 54 ++- > drivers/gpu/drm/i915/pxp/intel_pxp_session.h | 5 +- > drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 8 +- > drivers/gpu/drm/i915/pxp/intel_pxp_types.h | 18 > 11 files changed, 283 insertions(+), 16 deletions(-) > create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.c > create mode 100644 drivers/gpu/drm/i915/pxp/intel_pxp_irq.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e13bc803e5ce..5dcdf5942d32 100644 > --- a/drivers/gpu/drm/i915/Makefile >
+++ b/drivers/gpu/drm/i915/Makefile > @@ -279,6 +279,7 @@ i915-y += i915_perf.o > i915-$(CONFIG_DRM_I915_PXP) += \ > pxp/intel_pxp.o \ > pxp/intel_pxp_cmd.o \ > + pxp/intel_pxp_irq.o \ > pxp/intel_pxp_session.o \ > pxp/intel_pxp_tee.o > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c > b/drivers/gpu/drm/i915/gt/intel_gt_irq.c > index b2de83be4d97..699a74582d32 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c > @@ -13,6 +13,7 @@ > #include "intel_lrc_reg.h" > #include "intel_uncore.h" > #include "intel_rps.h" > +#include "pxp/intel_pxp_irq.h" > > static void guc_irq_handler(struct intel_guc *guc, u16 iir) > { > @@ -64,6 +65,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 > instance, > if (instance == OTHER_GTPM_INSTANCE) > return gen11_rps_irq_handler(&gt->rps, iir); > > + if (instance == OTHER_KCR_INSTANCE) > + return intel_pxp_irq_handler(&gt->pxp, iir); > + > WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n", > instance, iir); > } > @@ -196,6 +200,9 @@ void gen11_gt_irq_reset(struct intel_gt *gt) > intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0); > intel_uncore_write(uncore, GEN11_GUC_SG_INTR_ENABLE, 0); > intel_uncore_write(uncore, GEN11_GUC_SG_INTR_MASK, ~0); > + > + intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_ENABLE, 0); > + intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_MASK, ~0); > } > > void gen11_gt_irq_postinstall(struct intel_gt *gt) > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 70eed4fe3fe3..1a2a71916dfc 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -8086,6 +8086,7 @@ enum { > /* irq instances for OTHER_CLASS */ > #define OTHER_GUC_INSTANCE 0 > #define OTHER_GTPM_INSTANCE 1 > +#define OTHER_KCR_INSTANCE 4 > > #define GEN11_INTR_IDENTITY_REG(x) _MMIO(0x190060 + ((x) * 4)) > > diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c >
b/drivers/gpu/drm/i915/pxp/intel_pxp.c > index 26176d43a02d..b0c7edc10cc3 100644 > --- a/drivers/gpu/drm/i915/pxp/intel_pxp.c > +++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c > @@ -2,7 +2,9 @@ > /* > * Copyright(c) 2020 Intel Corporation. > */ > +#include > #include "intel_pxp.h" > +#include "intel_pxp_irq.h" > #include "intel_pxp_session.h" > #include "intel_pxp_tee.h" > #include "gt/intel_context.h" > @@ -68,6 +70,16 @@ void intel_pxp_init(struct intel_pxp *pxp) > > mutex_init(&pxp->tee_mutex); > > + /* > + * we'll use the completion to check if there is a termination pending, > + * so we start it as completed and we reinit it when a termination > + * is triggered. > + */ > + init_compl
Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects
On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote: > This API allows user mode to create protected buffers and to mark > contexts as making use of such objects. Only when using contexts > marked in such a way is the execution guaranteed to work as expected. > > Contexts can only be marked as using protected content at creation time > (i.e. the parameter is immutable) and they must be both bannable and not > recoverable. > > All protected objects and contexts that have backing storage will be > considered invalid when the PXP session is destroyed and all new > submissions using them will be rejected. All intel contexts within the > invalidated gem contexts will be marked banned. A new flag has been > added to the RESET_STATS ioctl to report the context invalidation to > userspace. > > This patch was previously sent as 2 separate patches, which have been > squashed following a request to have all the uapi in a single patch. > I've retained the s-o-b from both. > > v5: squash patches, rebase on proto_ctx, update kerneldoc > > v6: rebase on obj create_ext changes > > Signed-off-by: Daniele Ceraolo Spurio > Signed-off-by: Bommu Krishnaiah > Cc: Rodrigo Vivi > Cc: Chris Wilson > Cc: Lionel Landwerlin > Cc: Jason Ekstrand > Cc: Daniel Vetter > Reviewed-by: Rodrigo Vivi #v5 > --- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 68 -- > drivers/gpu/drm/i915/gem/i915_gem_context.h | 18 > .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 + > drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 - > drivers/gpu/drm/i915/gem/i915_gem_object.c| 6 ++ > drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++ > .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 ++ > drivers/gpu/drm/i915/pxp/intel_pxp.c | 89 +++ > drivers/gpu/drm/i915/pxp/intel_pxp.h | 15 > drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 3 + > drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 5 ++ > include/uapi/drm/i915_drm.h | 55 +++- > 13 files changed, 371
insertions(+), 26 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index cff72679ad7c..0cd3e2d06188 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -77,6 +77,8 @@ > #include "gt/intel_gpu_commands.h" > #include "gt/intel_ring.h" > > +#include "pxp/intel_pxp.h" > + > #include "i915_gem_context.h" > #include "i915_trace.h" > #include "i915_user_extensions.h" > @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct > drm_i915_private *i915, > return 0; > } > > +static int proto_context_set_protected(struct drm_i915_private *i915, > +struct i915_gem_proto_context *pc, > +bool protected) > +{ > + int ret = 0; > + > + if (!intel_pxp_is_enabled(&i915->gt.pxp)) > + ret = -ENODEV; > + else if (!protected) > + pc->user_flags &= ~BIT(UCONTEXT_PROTECTED); > + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || > + !(pc->user_flags & BIT(UCONTEXT_BANNABLE))) > + ret = -EPERM; > + else > + pc->user_flags |= BIT(UCONTEXT_PROTECTED); > + > + return ret; > +} > + > static struct i915_gem_proto_context * > proto_context_create(struct drm_i915_private *i915, unsigned int flags) > { > @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct > drm_i915_file_private *fpriv, > ret = -EPERM; > else if (args->value) > pc->user_flags |= BIT(UCONTEXT_BANNABLE); > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) > + ret = -EPERM; > else > pc->user_flags &= ~BIT(UCONTEXT_BANNABLE); > break; > @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct > drm_i915_file_private *fpriv, > case I915_CONTEXT_PARAM_RECOVERABLE: > if (args->size) > ret = -EINVAL; > - else if (args->value) > - pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); > - else > + else if (!args->value) > pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE); > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) > + ret = -EPERM; > + else > + pc->user_flags |= 
BIT(UCONTEXT_RECOVERABLE); > break; > > case I915_CONTEXT_PARAM_PRIORITY: > @@ -724,6 +749,11 @@ static int set_proto_ctx_param(struct > drm_i915_file_private *fpriv, > args->value); > break; > > + case I915_CONTEXT_PARAM_PROTECTED_CONTENT: > + ret =
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 2021-08-13 4:14 p.m., Lazar, Lijo wrote: > On 8/13/2021 7:04 PM, Michel Dänzer wrote: >> On 2021-08-13 1:50 p.m., Lazar, Lijo wrote: >>> On 8/13/2021 3:59 PM, Michel Dänzer wrote: From: Michel Dänzer schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when GFXOFF transitions from enabled to disabled. This makes sure the delayed work will be scheduled as intended in the reverse case. In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs to use mutex_trylock instead of mutex_lock. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +++ 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f3fd5ec710b6..8b025f70706c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) struct amdgpu_device *adev = container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); - mutex_lock(&adev->gfx.gfx_off_mutex); + /* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. 
*/ + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true + * when adev->gfx.gfx_off_req_count is already 0, we might race with that. + * Re-schedule to make sure gfx off will be re-enabled in the HW eventually. + */ + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE); + return; >>> >>> This is not needed and is just creating another thread to contend for mutex. >> >> Still not sure what you mean by that. What other thread? > > Sorry, I meant it schedules another workitem and delays GFXOFF enablement > further. For ex: if it was another function like gfx_off_status holding the > lock at the time of check. > >> >>> The checks below take care of enabling gfxoff correctly. If it's already in >>> gfx_off state, it doesn't do anything. So I don't see why this change is >>> needed. >> >> mutex_trylock is needed to prevent the deadlock discussed before and below. >> >> schedule_delayed_work is needed due to this scenario hinted at by the >> comment: >> >> 1. amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work >> 2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which fails >> >> GFXOFF would never get re-enabled in HW in this case (until >> amdgpu_gfx_off_ctrl calls schedule_delayed_work again). >> >> (cancel_delayed_work_sync guarantees there's no pending delayed work when it >> returns, even if amdgpu_device_delay_enable_gfx_off calls >> schedule_delayed_work) >> > > I think we need to explain based on the original code before. There is an > assumption here that the only other contention of this mutex is with the > gfx_off_ctrl function. Not really. > As far as I understand if the work has already started running when > schedule_delayed_work is called, it will insert another in the work queue > after delay. Based on that understanding I didn't find a problem with the > original code.
Original code as in without this patch or the mod_delayed_work patch? If so, the problem is not when the work has already started running. It's that when it hasn't started running yet, schedule_delayed_work doesn't change the timeout for the already scheduled work, so it ends up enabling GFXOFF earlier than intended (and thus at all in scenarios when it's not supposed to). > [...], there could be cases where it could have gone to gfxoff right after > gfx_off_status releases the lock, but it doesn't delaying it further. That > would be the case if some other function is also introduced which takes this > mutex. I really don't think we need to worry about amdgpu_get_gfx_off_status, since it's only called from debugfs (and should be very short). If something hits that debugfs file and it causes highe
Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects
On Fri, Aug 13, 2021 at 04:37:53PM +0200, Daniel Vetter wrote: > On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote: > > This API allows user mode to create protected buffers and to mark > > contexts as making use of such objects. Only when using contexts > > marked in such a way is the execution guaranteed to work as expected. > > > > Contexts can only be marked as using protected content at creation time > > (i.e. the parameter is immutable) and they must be both bannable and not > > recoverable. > > > > All protected objects and contexts that have backing storage will be > > considered invalid when the PXP session is destroyed and all new > > submissions using them will be rejected. All intel contexts within the > > invalidated gem contexts will be marked banned. A new flag has been > > added to the RESET_STATS ioctl to report the context invalidation to > > userspace. > > > > This patch was previously sent as 2 separate patches, which have been > > squashed following a request to have all the uapi in a single patch. > > I've retained the s-o-b from both.
> > > > v5: squash patches, rebase on proto_ctx, update kerneldoc > > > > v6: rebase on obj create_ext changes > > > > Signed-off-by: Daniele Ceraolo Spurio > > Signed-off-by: Bommu Krishnaiah > > Cc: Rodrigo Vivi > > Cc: Chris Wilson > > Cc: Lionel Landwerlin > > Cc: Jason Ekstrand > > Cc: Daniel Vetter > > Reviewed-by: Rodrigo Vivi #v5 > > --- > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 68 -- > > drivers/gpu/drm/i915/gem/i915_gem_context.h | 18 > > .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 + > > drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 > > .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 - > > drivers/gpu/drm/i915/gem/i915_gem_object.c| 6 ++ > > drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++ > > .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 ++ > > drivers/gpu/drm/i915/pxp/intel_pxp.c | 89 +++ > > drivers/gpu/drm/i915/pxp/intel_pxp.h | 15 > > drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 3 + > > drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 5 ++ > > include/uapi/drm/i915_drm.h | 55 +++- > > 13 files changed, 371 insertions(+), 26 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > index cff72679ad7c..0cd3e2d06188 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > > @@ -77,6 +77,8 @@ > > #include "gt/intel_gpu_commands.h" > > #include "gt/intel_ring.h" > > > > +#include "pxp/intel_pxp.h" > > + > > #include "i915_gem_context.h" > > #include "i915_trace.h" > > #include "i915_user_extensions.h" > > @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct > > drm_i915_private *i915, > > return 0; > > } > > > > +static int proto_context_set_protected(struct drm_i915_private *i915, > > + struct i915_gem_proto_context *pc, > > + bool protected) > > +{ > > + int ret = 0; > > + > > + if (!intel_pxp_is_enabled(&i915->gt.pxp)) > > + ret = -ENODEV; > > + else if (!protected) > > + pc->user_flags &= 
~BIT(UCONTEXT_PROTECTED); > > + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || > > +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) > > + ret = -EPERM; > > + else > > + pc->user_flags |= BIT(UCONTEXT_PROTECTED); > > + > > + return ret; > > +} > > + > > static struct i915_gem_proto_context * > > proto_context_create(struct drm_i915_private *i915, unsigned int flags) > > { > > @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct > > drm_i915_file_private *fpriv, > > ret = -EPERM; > > else if (args->value) > > pc->user_flags |= BIT(UCONTEXT_BANNABLE); > > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) > > + ret = -EPERM; > > else > > pc->user_flags &= ~BIT(UCONTEXT_BANNABLE); > > break; > > @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct > > drm_i915_file_private *fpriv, > > case I915_CONTEXT_PARAM_RECOVERABLE: > > if (args->size) > > ret = -EINVAL; > > - else if (args->value) > > - pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); > > - else > > + else if (!args->value) > > pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE); > > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) > > + ret = -EPERM; > > + else > > + pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); > > break; > > > > case I915_CONTEXT_PARAM_PRIORITY: > > @@ -724,6 +749,11 @@ static int
[PATCH v2 1/2] drm/i915/ttm: Reorganize the ttm move code somewhat
In order to make the code a bit more readable and to facilitate async memcpy moves, reorganize the move code a little. Determine at an early stage whether to copy or to clear. v2: - Don't set up the memcpy iterators unless we are actually going to memcpy. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 77 ++--- 1 file changed, 44 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 771eb2963123..d07de18529ab 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -431,6 +431,7 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj, } static int i915_ttm_accel_move(struct ttm_buffer_object *bo, + bool clear, struct ttm_resource *dst_mem, struct sg_table *dst_st) { @@ -449,13 +450,10 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo, return -EINVAL; dst_level = i915_ttm_cache_level(i915, dst_mem, ttm); - if (!ttm || !ttm_tt_is_populated(ttm)) { + if (clear) { if (bo->type == ttm_bo_type_kernel) return -EINVAL; - if (ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)) - return 0; - intel_engine_pm_get(i915->gt.migrate.context->engine); ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL, dst_st->sgl, dst_level, @@ -489,6 +487,41 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo, return ret; } +static void __i915_ttm_move(struct ttm_buffer_object *bo, bool clear, + struct ttm_resource *dst_mem, + struct sg_table *dst_st) +{ + int ret; + + ret = i915_ttm_accel_move(bo, clear, dst_mem, dst_st); + if (ret) { + struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + struct intel_memory_region *dst_reg, *src_reg; + union { + struct ttm_kmap_iter_tt tt; + struct ttm_kmap_iter_iomap io; + } _dst_iter, _src_iter; + struct ttm_kmap_iter *dst_iter, *src_iter; + + dst_reg = i915_ttm_region(bo->bdev, dst_mem->mem_type); + src_reg = i915_ttm_region(bo->bdev, 
bo->resource->mem_type); + GEM_BUG_ON(!dst_reg || !src_reg); + + dst_iter = !cpu_maps_iomem(dst_mem) ? + ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) : + ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap, +dst_st, dst_reg->region.start); + + src_iter = !cpu_maps_iomem(bo->resource) ? + ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) : + ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap, +obj->ttm.cached_io_st, +src_reg->region.start); + + ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter); + } +} + static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, struct ttm_operation_ctx *ctx, struct ttm_resource *dst_mem, @@ -497,19 +530,11 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); struct ttm_resource_manager *dst_man = ttm_manager_type(bo->bdev, dst_mem->mem_type); - struct intel_memory_region *dst_reg, *src_reg; - union { - struct ttm_kmap_iter_tt tt; - struct ttm_kmap_iter_iomap io; - } _dst_iter, _src_iter; - struct ttm_kmap_iter *dst_iter, *src_iter; + struct ttm_tt *ttm = bo->ttm; struct sg_table *dst_st; + bool clear; int ret; - dst_reg = i915_ttm_region(bo->bdev, dst_mem->mem_type); - src_reg = i915_ttm_region(bo->bdev, bo->resource->mem_type); - GEM_BUG_ON(!dst_reg || !src_reg); - /* Sync for now. We could do the actual copy async. */ ret = ttm_bo_wait_ctx(bo, ctx); if (ret) @@ -526,9 +551,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, } /* Populate ttm with pages if needed. Typically system memory. */ - if (bo->ttm && (dst_man->use_tt || - (bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))) { - ret = ttm_tt_populate(bo->bdev, bo->ttm, ctx); + if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))) { + ret = ttm_tt_populate(bo->bdev, ttm, ctx); if (ret)
[PATCH v2 0/2] drm/i915, drm/ttm: Update the ttm_move_memcpy() interface
The ttm_move_memcpy() function was intended to be able to be used async under a fence. We are going to utilize that as a fallback if the gpu clearing blit fails before we set up CPU- or GPU ptes to the memory region. But to accomplish that the bo argument to ttm_move_memcpy() needs to be replaced. Patch 1 reorganizes the i915 ttm move code a bit to make the change in patch 2 smaller. Patch 2 updates the ttm_move_memcpy() interface. v2: - Don't initialize memcpy iterators until they are actually needed (Patch 1). - Added proper R-B:s and Cc:s Thomas Hellström (2): drm/i915/ttm: Reorganize the ttm move code somewhat drm/ttm, drm/i915: Update ttm_move_memcpy for async use drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 77 ++--- drivers/gpu/drm/ttm/ttm_bo_util.c | 20 +++ include/drm/ttm/ttm_bo_driver.h | 2 +- 3 files changed, 55 insertions(+), 44 deletions(-) Cc: Christian König -- 2.31.1
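For readers following the series, here is a minimal userspace sketch of why the bo argument gets in the way of async use. It is not TTM code: `mock_bo`, `mock_move_job`, `mock_move_prepare` and `mock_move_run` are hypothetical names invented for illustration. The idea it models is that the clear-or-copy decision is taken once, when the move is queued, so a later change to the object cannot alter what the deferred move does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object whose state may change
 * between the time a move is queued and the time it actually runs. */
struct mock_bo {
	bool populated;
};

/* Everything the deferred move needs, captured at queue time. */
struct mock_move_job {
	bool clear;               /* decided once, never re-read from the bo */
	size_t len;
	unsigned char *dst;
	const unsigned char *src; /* may be NULL when clear is true */
};

/* Snapshot the clear-or-copy decision while the bo state is still valid. */
static void mock_move_prepare(struct mock_move_job *job,
			      const struct mock_bo *bo,
			      unsigned char *dst,
			      const unsigned char *src, size_t len)
{
	assert(job && bo && dst);
	job->clear = !bo->populated;
	job->len = len;
	job->dst = dst;
	job->src = src;
}

/* Runs later (in the real driver, possibly under a fence). It looks only
 * at the job, so changes made to the bo in the meantime have no effect. */
static void mock_move_run(const struct mock_move_job *job)
{
	if (job->clear) {
		memset(job->dst, 0, job->len);
	} else {
		assert(job->src);
		memcpy(job->dst, job->src, job->len);
	}
}
```

If the bo were passed instead of the bool, `mock_move_run()` would have to read `populated` when it runs, which is the stale-state hazard the cover letter alludes to.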
[PATCH v2 2/2] drm/ttm, drm/i915: Update ttm_move_memcpy for async use
The buffer object argument to ttm_move_memcpy was only used to determine whether the destination memory should be cleared only or whether we should copy data. Replace it with a "clear" bool, and update the callers. The intention here is to be able to use ttm_move_memcpy() async under a dma-fence as a fallback if an accelerated blit fails in a security-critical path where data might leak if the blit is not properly performed. For that purpose the bo is an unsuitable argument since its relevant members might already have changed at call time. Finally, update the ttm_move_memcpy kerneldoc that seems to have ended up with a stale version. Cc: Christian König Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +- drivers/gpu/drm/ttm/ttm_bo_util.c | 20 ++-- include/drm/ttm/ttm_bo_driver.h | 2 +- 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index d07de18529ab..6995c66cbe21 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -518,7 +518,7 @@ static void __i915_ttm_move(struct ttm_buffer_object *bo, bool clear, obj->ttm.cached_io_st, src_reg->region.start); - ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter); + ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter); } } diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index 763fa6f4e07d..5c20d0541cc3 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -78,22 +78,21 @@ void ttm_mem_io_free(struct ttm_device *bdev, /** * ttm_move_memcpy - Helper to perform a memcpy ttm move operation. - * @bo: The struct ttm_buffer_object. - * @new_mem: The struct ttm_resource we're moving to (copy destination). - * @new_iter: A struct ttm_kmap_iter representing the destination resource. + * @clear: Whether to clear rather than copy.
+ * @num_pages: Number of pages of the operation. + * @dst_iter: A struct ttm_kmap_iter representing the destination resource. * @src_iter: A struct ttm_kmap_iter representing the source resource. * * This function is intended to be able to move out async under a * dma-fence if desired. */ -void ttm_move_memcpy(struct ttm_buffer_object *bo, +void ttm_move_memcpy(bool clear, u32 num_pages, struct ttm_kmap_iter *dst_iter, struct ttm_kmap_iter *src_iter) { const struct ttm_kmap_iter_ops *dst_ops = dst_iter->ops; const struct ttm_kmap_iter_ops *src_ops = src_iter->ops; - struct ttm_tt *ttm = bo->ttm; struct dma_buf_map src_map, dst_map; pgoff_t i; @@ -102,10 +101,7 @@ void ttm_move_memcpy(struct ttm_buffer_object *bo, return; /* Don't move nonexistent data. Clear destination instead. */ - if (src_ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm))) { - if (ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)) - return; - + if (clear) { for (i = 0; i < num_pages; ++i) { dst_ops->map_local(dst_iter, &dst_map, i); if (dst_map.is_iomem) @@ -149,6 +145,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, struct ttm_kmap_iter_linear_io io; } _dst_iter, _src_iter; struct ttm_kmap_iter *dst_iter, *src_iter; + bool clear; int ret = 0; if (ttm && ((ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) || @@ -172,7 +169,10 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, goto out_src_iter; } - ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter); + clear = src_iter->ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm)); + if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))) + ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter); + src_copy = *src_mem; ttm_bo_move_sync_cleanup(bo, dst_mem); diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 68d6069572aa..5f087575194b 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -322,7 +322,7 @@ int ttm_bo_tt_bind(struct ttm_buffer_object *bo, 
struct ttm_resource *mem); */ void ttm_bo_tt_destroy(struct ttm_buffer_object *bo); -void ttm_move_memcpy(struct ttm_buffer_object *bo, +void ttm_move_memcpy(bool clear, u32 num_pages, struct ttm_kmap_iter *dst_iter, struct ttm_kmap_iter *src_iter); -- 2.31.1
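The caller-side condition added to ttm_bo_move_memcpy() above is compact, so a small truth-table sketch may help when reviewing it. This is a hypothetical userspace model, not kernel code: `mock_tt`, `mock_move_action` and `mock_move_decide` are invented names, and `zero_alloc` merely stands in for the TTM_PAGE_FLAG_ZERO_ALLOC test. It mirrors the two lines of the hunk: first compute `clear`, then skip the call entirely when clearing was not requested for an unpopulated ttm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the caller-side decision. */
enum mock_move_action {
	MOCK_MOVE_SKIP,  /* ttm_move_memcpy() is not called at all */
	MOCK_MOVE_CLEAR, /* called with clear == true */
	MOCK_MOVE_COPY   /* called with clear == false */
};

/* Hypothetical stand-in for the two ttm_tt properties the hunk looks at. */
struct mock_tt {
	bool populated;
	bool zero_alloc; /* models TTM_PAGE_FLAG_ZERO_ALLOC being set */
};

/*
 * Mirrors the condition in the patch:
 *   clear = src maps tt && (!ttm || !populated)
 *   call the helper unless (clear && ttm && !ZERO_ALLOC)
 */
static enum mock_move_action
mock_move_decide(bool src_maps_tt, const struct mock_tt *tt)
{
	bool clear = src_maps_tt && (!tt || !tt->populated);

	if (clear && tt && !tt->zero_alloc)
		return MOCK_MOVE_SKIP;

	return clear ? MOCK_MOVE_CLEAR : MOCK_MOVE_COPY;
}
```

Under this model the "don't clear unless ZERO_ALLOC was requested" early return that used to live inside ttm_move_memcpy() becomes the SKIP outcome in the caller, which matches the behaviour of the removed lines.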
[PATCH v6 0/6] Add Unisoc's drm kms module
ChangeList:
RFC v1: 1. only upstream modeset and atomic at first commit. 2. remove some unused code; 3. use alpha and blend_mode properties; 3. add yaml support; 4. remove auto-adaptive panel driver; 5. bugfix
RFC v2: 1. add sprd crtc and plane module for KMS, preparing for multi crtc&encoder 2. remove gem drivers, use generic CMA handlers 3. remove redundant "module_init", all the sub modules loading by KMS
RFC v3: 1. multi crtc&encoder design has problems, so roll back to v1
RFC v4: 1. update to gcc-linaro-7.5.0 2. update to Linux 5.6-rc3 3. remove pm_runtime support 4. add COMPILE_TEST, remove unused kconfig 5. "drm_dev_put" on drm_unbind 6. fix some naming convention issues 7. remove semaphore lock for crtc flip 8. remove static variables
RFC v5: 1. optimize encoder and connector code implementation 2. use "platform_get_irq" and "platform_get_resource" 3. drop useless function return type, drop useless debug log 4. custom properties should be separate, so drop them 5. use DRM_XXX to replace pr_xxx 6. drop dsi&dphy hal callback ops 7. drop useless callback ops checking 8. add comments for sprd dpu structure
RFC v6: 1. Access registers via readl/writel 2. Check for unsupported KMS properties (format, rotation, blend_mode, etc) in plane_check ops 3. Remove always-true checks for dpu core ops
RFC v7: 1. Fix DTC unit name warnings 2. Fix the problem of maintainers 3. Call drmm_mode_config_init for mode config init 4. Embed drm_device in sprd_drm and use devm_drm_dev_alloc 5. Replace DRM_XXX with drm_xxx in the KMS module, but not suitable for other subsystems 6. Remove plane_update stuff, dpu handles all the HW updates in crtc->atomic_flush 7. Dsi&Dphy code structure adjustment, all moved to "sprd/"
v0: 1. Remove the dpu_core_ops layer from the sprd crtc driver, but dpu_layer needs to be kept. Because all the HW updates happen in crtc->atomic_flush, we need temporary storage of all layers for the dpu pageflip in atomic_flush. 2. Add ports subnode with port@X.
v1: 1. Remove dphy and dsi graph binding, merge the dphy driver into the dsi. 2. Add commit messages for Unisoc's virtual nodes.
v2: 1. Use drm_xxx to replace all DRM_XXX. 2. Use kzalloc to replace devm_kzalloc for sprd_dsi/sprd_dpu structure init. 3. Remove dpu_core_ops midlayer.
v3: 1. Remove dpu_layer midlayer and commit layers by atomic_update
v4: 1. Move the devm_drm_dev_alloc to the master_ops->bind function. 2. With the managed drmm_mode_config_init(), it is no longer necessary for drivers to explicitly call drm_mode_config_cleanup, so delete it. 3. Use drmm_helpers to allocate crtc, planes and encoder. 4. Move allocation of crtc, planes and encoder to the bind function. 5. Move rotation enum definitions to crtc layer reg bitfields.
v5: 1. Remove subdir-ccflags-y from Makefile. 2. Keep the selects sorted alphabetically in Kconfig. 3. Fix the checkpatch warnings. 4. Use mode_set_nofb instead of mode_valid callback. 5. Follow the OF-Graph bindings, use of_graph_get_port_by_id instead of of_parse_phandle. 6. Use zpos to represent the layer position. 7. Rebase to latest drm misc branch. 8. Remove panel_in port for dsi node. 9. Drop the dsi ip file prefix. 10. Add Signed-off-by for dsi&dphy patch. 11. Use the mode_flags of mipi_dsi_device to set up crtc DPI and EDPI mode.
v6: 1. Disable and clear interrupts before registering the dpu IRQ 2. Init dpi config used by crtc_state->adjusted_mode on mode_set_nofb 3. Remove enable_irq and disable_irq function calls. 4. Remove drm_format_info function call. 5. Redesign the way to access the dsi registers. 6. Reduce the dsi_context member variables.
Kevin Tang (6):
  dt-bindings: display: add Unisoc's drm master bindings
  drm/sprd: add Unisoc's drm kms master
  dt-bindings: display: add Unisoc's dpu bindings
  drm/sprd: add Unisoc's drm display controller driver
  dt-bindings: display: add Unisoc's mipi dsi controller bindings
  drm/sprd: add Unisoc's drm mipi dsi&dphy driver

 .../display/sprd/sprd,display-subsystem.yaml  |   64 +
 .../display/sprd/sprd,sharkl3-dpu.yaml        |   77 +
 .../display/sprd/sprd,sharkl3-dsi-host.yaml   |   88 ++
 drivers/gpu/drm/Kconfig                       |    2 +
 drivers/gpu/drm/Makefile                      |    1 +
 drivers/gpu/drm/sprd/Kconfig                  |   13 +
 drivers/gpu/drm/sprd/Makefile                 |    6 +
 drivers/gpu/drm/sprd/megacores_pll.c          |  317 +
 drivers/gpu/drm/sprd/megacores_pll.h          |  146 ++
 drivers/gpu/drm/sprd/sprd_dpu.c               |  954 +
 drivers/gpu/drm/sprd/sprd_dpu.h               |  109 ++
 drivers/gpu/drm/sprd/sprd_drm.c               |  207 +++
 drivers/gpu/drm/sprd/sprd_drm.h               |   19 +
 drivers/gpu/drm/sprd/sprd_dsi.c               | 1260 +
 drivers/gpu/drm/sprd/sprd_dsi.h               |   94 ++
 15 files changed, 3357 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/sprd/sprd,display-subsystem.yaml
 create mode 100644 Documentation/devicetree/bindings/display/sprd/sprd,sharkl
[PATCH v6 1/6] dt-bindings: display: add Unisoc's drm master bindings
From: Kevin Tang The Unisoc DRM master device is a virtual device needed to list all DPU devices or other display interface nodes that comprise the graphics subsystem Unisoc's display pipeline have several components as below description, multi display controllers and corresponding physical interfaces. For different display scenarios, dpu0 and dpu1 maybe binding to different encoder. E.g: dpu0 and dpu1 both binding to DSI for dual mipi-dsi display; dpu0 binding to DSI for primary display, and dpu1 binding to DP for external display; Cc: Orson Zhai Cc: Chunyan Zhang Signed-off-by: Kevin Tang Reviewed-by: Rob Herring --- .../display/sprd/sprd,display-subsystem.yaml | 64 +++ 1 file changed, 64 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/sprd/sprd,display-subsystem.yaml diff --git a/Documentation/devicetree/bindings/display/sprd/sprd,display-subsystem.yaml b/Documentation/devicetree/bindings/display/sprd/sprd,display-subsystem.yaml new file mode 100644 index 0..3d107e943 --- /dev/null +++ b/Documentation/devicetree/bindings/display/sprd/sprd,display-subsystem.yaml @@ -0,0 +1,64 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/sprd/sprd,display-subsystem.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Unisoc DRM master device + +maintainers: + - Kevin Tang + +description: | + The Unisoc DRM master device is a virtual device needed to list all + DPU devices or other display interface nodes that comprise the + graphics subsystem. + + Unisoc's display pipeline have several components as below description, + multi display controllers and corresponding physical interfaces. + For different display scenarios, dpu0 and dpu1 maybe binding to different + encoder. 
+ + E.g: + dpu0 and dpu1 both binding to DSI for dual mipi-dsi display; + dpu0 binding to DSI for primary display, and dpu1 binding to DP for external display; + + +-+ + | | + |+-+ | + ++ | +++-+|DPHY/CPHY| | +--+ + |+->+dpu0+--->+MIPI|DSI +--->+Combo+->+Panel0| + |AXI | | +++-++-+ | +--+ + || | ^ | + || | | | + || | +---+ | + || | | | + |APB | | +--+-++---++---+ | +--+ + |+->+dpu1+--->+DisplayPort+--->+PHY+->+Panel1| + || | +++---++---+ | +--+ + ++ | | + +-+ + +properties: + compatible: +const: sprd,display-subsystem + + ports: +$ref: /schemas/types.yaml#/definitions/phandle-array +description: + Should contain a list of phandles pointing to display interface port + of DPU devices. + +required: + - compatible + - ports + +additionalProperties: false + +examples: + - | +display-subsystem { +compatible = "sprd,display-subsystem"; +ports = <&dpu_out>; +}; + -- 2.29.0
[PATCH v6 2/6] drm/sprd: add Unisoc's drm kms master
Adds DRM support for Unisoc's display subsystem.

This is a DRM KMS driver. It provides support for application frameworks
in Android, Yocto and more. An application framework can access Unisoc's
internal display peripherals through libdrm or libkms. It has been tested
with modetest (a DRM/KMS test tool) and Android HWComposer.

Cc: Orson Zhai
Cc: Chunyan Zhang
Signed-off-by: Kevin Tang

v4:
  - Move the devm_drm_dev_alloc to master_ops->bind function.
  - With the managed drmm_mode_config_init(), it is no longer necessary for
    drivers to explicitly call drm_mode_config_cleanup, so delete it.

v5:
  - Remove subdir-ccflags-y for Makefile.
  - Keep the selects sorted by alphabet for Kconfig.
---
 drivers/gpu/drm/Kconfig         |   2 +
 drivers/gpu/drm/Makefile        |   1 +
 drivers/gpu/drm/sprd/Kconfig    |  11 ++
 drivers/gpu/drm/sprd/Makefile   |   3 +
 drivers/gpu/drm/sprd/sprd_drm.c | 205
 drivers/gpu/drm/sprd/sprd_drm.h |  16 +++
 6 files changed, 238 insertions(+)
 create mode 100644 drivers/gpu/drm/sprd/Kconfig
 create mode 100644 drivers/gpu/drm/sprd/Makefile
 create mode 100644 drivers/gpu/drm/sprd/sprd_drm.c
 create mode 100644 drivers/gpu/drm/sprd/sprd_drm.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 3c16bd1af..a92525445 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -386,6 +386,8 @@ source "drivers/gpu/drm/xlnx/Kconfig"
 source "drivers/gpu/drm/gud/Kconfig"
+source "drivers/gpu/drm/sprd/Kconfig"
+
 # Keep legacy drivers last
 menuconfig DRM_LEGACY
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 5279db439..03a52d1dc 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -126,3 +126,4 @@ obj-$(CONFIG_DRM_MCDE) += mcde/
 obj-$(CONFIG_DRM_TIDSS) += tidss/
 obj-y += xlnx/
 obj-y += gud/
+obj-$(CONFIG_DRM_SPRD) += sprd/
diff --git a/drivers/gpu/drm/sprd/Kconfig b/drivers/gpu/drm/sprd/Kconfig
new file mode 100644
index 0..726c3e76d
--- /dev/null
+++ b/drivers/gpu/drm/sprd/Kconfig
@@ -0,0 +1,11 @@
+config DRM_SPRD
+	tristate "DRM
Support for Unisoc SoCs Platform" + depends on ARCH_SPRD || COMPILE_TEST + depends on DRM && OF + select DRM_GEM_CMA_HELPER + select DRM_KMS_CMA_HELPER + select DRM_KMS_HELPER + help + Choose this option if you have a Unisoc chipset. + If M is selected the module will be called sprd_drm. + diff --git a/drivers/gpu/drm/sprd/Makefile b/drivers/gpu/drm/sprd/Makefile new file mode 100644 index 0..9850f00b8 --- /dev/null +++ b/drivers/gpu/drm/sprd/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-y := sprd_drm.o diff --git a/drivers/gpu/drm/sprd/sprd_drm.c b/drivers/gpu/drm/sprd/sprd_drm.c new file mode 100644 index 0..6b00a6f27 --- /dev/null +++ b/drivers/gpu/drm/sprd/sprd_drm.c @@ -0,0 +1,205 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2020 Unisoc Inc. + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "sprd_drm.h" + +#define DRIVER_NAME"sprd" +#define DRIVER_DESC"Spreadtrum SoCs' DRM Driver" +#define DRIVER_DATE"20200201" +#define DRIVER_MAJOR 1 +#define DRIVER_MINOR 0 + +static const struct drm_mode_config_helper_funcs sprd_drm_mode_config_helper = { + .atomic_commit_tail = drm_atomic_helper_commit_tail_rpm, +}; + +static const struct drm_mode_config_funcs sprd_drm_mode_config_funcs = { + .fb_create = drm_gem_fb_create, + .atomic_check = drm_atomic_helper_check, + .atomic_commit = drm_atomic_helper_commit, +}; + +static void sprd_drm_mode_config_init(struct drm_device *drm) +{ + drm->mode_config.min_width = 0; + drm->mode_config.min_height = 0; + drm->mode_config.max_width = 8192; + drm->mode_config.max_height = 8192; + drm->mode_config.allow_fb_modifiers = true; + + drm->mode_config.funcs = &sprd_drm_mode_config_funcs; + drm->mode_config.helper_private = &sprd_drm_mode_config_helper; +} + +DEFINE_DRM_GEM_CMA_FOPS(sprd_drm_fops); + +static struct drm_driver sprd_drm_drv = { + .driver_features= DRIVER_GEM | DRIVER_MODESET | 
DRIVER_ATOMIC, + .fops = &sprd_drm_fops, + + /* GEM Operations */ + DRM_GEM_CMA_DRIVER_OPS, + + .name = DRIVER_NAME, + .desc = DRIVER_DESC, + .date = DRIVER_DATE, + .major = DRIVER_MAJOR, + .minor = DRIVER_MINOR, +}; + +static int sprd_drm_bind(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + struct drm_device *drm; + struct sprd_drm *sprd; + int ret; + + sprd = devm_drm_dev_alloc(dev, &sprd_drm_drv, struct sprd_drm, drm); + if (IS_ERR(sprd)) +
[PATCH v6 3/6] dt-bindings: display: add Unisoc's dpu bindings
From: Kevin Tang DPU (Display Processor Unit) is the Display Controller for the Unisoc SoCs which transfers the image data from a video memory buffer to an internal LCD interface. Cc: Orson Zhai Cc: Chunyan Zhang Signed-off-by: Kevin Tang Reviewed-by: Rob Herring --- .../display/sprd/sprd,sharkl3-dpu.yaml| 77 +++ 1 file changed, 77 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dpu.yaml diff --git a/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dpu.yaml b/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dpu.yaml new file mode 100644 index 0..4ebea60b8 --- /dev/null +++ b/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dpu.yaml @@ -0,0 +1,77 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/sprd/sprd,sharkl3-dpu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Unisoc Sharkl3 Display Processor Unit (DPU) + +maintainers: + - Kevin Tang + +description: | + DPU (Display Processor Unit) is the Display Controller for the Unisoc SoCs + which transfers the image data from a video memory buffer to an internal + LCD interface. + +properties: + compatible: +const: sprd,sharkl3-dpu + + reg: +maxItems: 1 + + interrupts: +maxItems: 1 + + clocks: +minItems: 2 + + clock-names: +items: + - const: clk_src_128m + - const: clk_src_384m + + power-domains: +maxItems: 1 + + iommus: +maxItems: 1 + + port: +type: object +description: + A port node with endpoint definitions as defined in + Documentation/devicetree/bindings/media/video-interfaces.txt. + That port should be the output endpoint, usually output to + the associated DSI. 
+ +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - port + +additionalProperties: false + +examples: + - | +#include +#include +dpu: dpu@6300 { +compatible = "sprd,sharkl3-dpu"; +reg = <0x6300 0x1000>; +interrupts = ; +clock-names = "clk_src_128m", "clk_src_384m"; + +clocks = <&pll CLK_TWPLL_128M>, + <&pll CLK_TWPLL_384M>; + +dpu_port: port { +dpu_out: endpoint { +remote-endpoint = <&dsi_in>; +}; +}; +}; -- 2.29.0
[PATCH v6 4/6] drm/sprd: add Unisoc's drm display controller driver
Adds DPU (Display Processor Unit) support for Unisoc's display subsystem.
It supports multiple planes, scaler, rotation, PQ (Picture Quality) and more.

v2:
  - Use drm_xxx to replace all DRM_XXX.
  - Use kzalloc to replace devm_kzalloc for sprd_dpu structure init.

v3:
  - Remove the dpu_layer midlayer and commit layers by atomic_update

v4:
  - Use drmm_helpers to allocate crtc and planes.
  - Move rotation enum definitions to crtc layer reg bitfields.
  - Move allocation of crtc and planes to the bind function.

v5:
  - Fix the checkpatch warnings.
  - Use mode_set_nofb instead of mode_valid callback.
  - Follow the OF-Graph bindings, use of_graph_get_port_by_id
    instead of of_parse_phandle.
  - Use zpos to represent the layer position.
  - Rebase to latest drm misc branch.

v6:
  - Disable and clear interrupts before registering the dpu IRQ
  - Init dpi config used by crtc_state->adjusted_mode on mode_set_nofb
  - Remove enable_irq and disable_irq function calls.
  - Remove drm_format_info function call.

Cc: Orson Zhai
Cc: Chunyan Zhang
Signed-off-by: Kevin Tang
---
 drivers/gpu/drm/sprd/Kconfig    |   1 +
 drivers/gpu/drm/sprd/Makefile   |   3 +-
 drivers/gpu/drm/sprd/sprd_dpu.c | 937
 drivers/gpu/drm/sprd/sprd_dpu.h | 109
 drivers/gpu/drm/sprd/sprd_drm.c |   1 +
 drivers/gpu/drm/sprd/sprd_drm.h |   2 +
 6 files changed, 1052 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/sprd/sprd_dpu.c
 create mode 100644 drivers/gpu/drm/sprd/sprd_dpu.h

diff --git a/drivers/gpu/drm/sprd/Kconfig b/drivers/gpu/drm/sprd/Kconfig
index 726c3e76d..37762c333 100644
--- a/drivers/gpu/drm/sprd/Kconfig
+++ b/drivers/gpu/drm/sprd/Kconfig
@@ -5,6 +5,7 @@ config DRM_SPRD
 	select DRM_GEM_CMA_HELPER
 	select DRM_KMS_CMA_HELPER
 	select DRM_KMS_HELPER
+	select VIDEOMODE_HELPERS
 	help
 	  Choose this option if you have a Unisoc chipset.
 	  If M is selected the module will be called sprd_drm.
diff --git a/drivers/gpu/drm/sprd/Makefile b/drivers/gpu/drm/sprd/Makefile index 9850f00b8..ab12b95e6 100644 --- a/drivers/gpu/drm/sprd/Makefile +++ b/drivers/gpu/drm/sprd/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y := sprd_drm.o +obj-y := sprd_drm.o \ + sprd_dpu.o diff --git a/drivers/gpu/drm/sprd/sprd_dpu.c b/drivers/gpu/drm/sprd/sprd_dpu.c new file mode 100644 index 0..448dd4fb6 --- /dev/null +++ b/drivers/gpu/drm/sprd/sprd_dpu.c @@ -0,0 +1,937 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2020 Unisoc Inc. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include + +#include "sprd_drm.h" +#include "sprd_dpu.h" + +/* Global control registers */ +#define REG_DPU_CTRL 0x04 +#define REG_DPU_CFG0 0x08 +#define REG_PANEL_SIZE 0x20 +#define REG_BLEND_SIZE 0x24 +#define REG_BG_COLOR 0x2C + +/* Layer0 control registers */ +#define REG_LAY_BASE_ADDR0 0x30 +#define REG_LAY_BASE_ADDR1 0x34 +#define REG_LAY_BASE_ADDR2 0x38 +#define REG_LAY_CTRL 0x40 +#define REG_LAY_SIZE 0x44 +#define REG_LAY_PITCH 0x48 +#define REG_LAY_POS0x4C +#define REG_LAY_ALPHA 0x50 +#define REG_LAY_CROP_START 0x5C + +/* Interrupt control registers */ +#define REG_DPU_INT_EN 0x1E0 +#define REG_DPU_INT_CLR0x1E4 +#define REG_DPU_INT_STS0x1E8 + +/* DPI control registers */ +#define REG_DPI_CTRL 0x1F0 +#define REG_DPI_H_TIMING 0x1F4 +#define REG_DPI_V_TIMING 0x1F8 + +/* MMU control registers */ +#define REG_MMU_EN 0x800 +#define REG_MMU_VPN_RANGE 0x80C +#define REG_MMU_VAOR_ADDR_RD 0x818 +#define REG_MMU_VAOR_ADDR_WR 0x81C +#define REG_MMU_INV_ADDR_RD0x820 +#define REG_MMU_INV_ADDR_WR0x824 +#define REG_MMU_PPN1 0x83C +#define REG_MMU_RANGE1 0x840 +#define REG_MMU_PPN2 0x844 +#define REG_MMU_RANGE2 0x848 + +/* Global control bits */ +#define BIT_DPU_RUNBIT(0) +#define BIT_DPU_STOP BIT(1) +#define BIT_DPU_REG_UPDATE BIT(2) +#define 
BIT_DPU_IF_EDPIBIT(0) + +/* Layer control bits */ +#define BIT_DPU_LAY_EN BIT(0) +#define BIT_DPU_LAY_LAYER_ALPHA(0x01 << 2) +#define BIT_DPU_LAY_COMBO_ALPHA(0x02 << 2) +#define BIT_DPU_LAY_FORMAT_YUV422_2PLANE (0x00 << 4) +#define BIT_DPU_LAY_FORMAT_YUV420_2PLANE (0x01 << 4) +#define BIT_DPU_LAY_FORMAT_YUV420_3PLANE (0x02 << 4) +#define BIT_DPU_LAY_FORMAT_ARGB(0x03 << 4) +#define BIT_DPU_LAY_FORMAT_RGB565 (0x04 << 4) +#define BIT_DPU_LAY_DATA_ENDIAN_B0B1B2B3
[PATCH v6 5/6] dt-bindings: display: add Unisoc's mipi dsi controller bindings
From: Kevin Tang Adds MIPI DSI Controller support for Unisoc's display subsystem. v5: - Remove panel_in port for dsi node. Cc: Orson Zhai Cc: Chunyan Zhang Signed-off-by: Kevin Tang Reviewed-by: Rob Herring --- .../display/sprd/sprd,sharkl3-dsi-host.yaml | 88 +++ 1 file changed, 88 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dsi-host.yaml diff --git a/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dsi-host.yaml b/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dsi-host.yaml new file mode 100644 index 0..bc5594d18 --- /dev/null +++ b/Documentation/devicetree/bindings/display/sprd/sprd,sharkl3-dsi-host.yaml @@ -0,0 +1,88 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/sprd/sprd,sharkl3-dsi-host.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Unisoc MIPI DSI Controller + +maintainers: + - Kevin Tang + +properties: + compatible: +const: sprd,sharkl3-dsi-host + + reg: +maxItems: 1 + + interrupts: +maxItems: 2 + + clocks: +minItems: 1 + + clock-names: +items: + - const: clk_src_96m + + power-domains: +maxItems: 1 + + ports: +type: object + +properties: + "#address-cells": +const: 1 + + "#size-cells": +const: 0 + + port@0: +type: object +description: + A port node with endpoint definitions as defined in + Documentation/devicetree/bindings/media/video-interfaces.txt. + That port should be the input endpoint, usually coming from + the associated DPU. 
+ +required: + - "#address-cells" + - "#size-cells" + - port@0 + +additionalProperties: false + +required: + - compatible + - reg + - interrupts + - clocks + - clock-names + - ports + +additionalProperties: false + +examples: + - | +#include +#include +dsi: dsi@6310 { +compatible = "sprd,sharkl3-dsi-host"; +reg = <0x6310 0x1000>; +interrupts = , + ; +clock-names = "clk_src_96m"; +clocks = <&pll CLK_TWPLL_96M>; +ports { +#address-cells = <1>; +#size-cells = <0>; +port@0 { +reg = <0>; +dsi_in: endpoint { +remote-endpoint = <&dpu_out>; +}; +}; +}; +}; -- 2.29.0
[PATCH v6 6/6] drm/sprd: add Unisoc's drm mipi dsi&dphy driver
Adds dsi host controller support for Unisoc's display subsystem.
Adds dsi phy support for Unisoc's display subsystem.
Only MIPI DSI displays are supported; DP/TV/HDMI will be supported
in the future.

v1:
  - Remove dphy and dsi graph binding, merge the dphy driver into the dsi.

v2:
  - Use drm_xxx to replace all DRM_XXX.
  - Use kzalloc to replace devm_kzalloc for sprd_dsi structure init.

v4:
  - Use drmm_helpers to allocate encoder.
  - Move allocation of encoder and connector to the bind function.

v5:
  - Drop the dsi ip file prefix.
  - Fix the checkpatch warnings.
  - Add Signed-off-by for dsi&dphy patch.
  - Use the mode_flags of mipi_dsi_device to setup crtc DPI and EDPI mode.

v6:
  - Redesign the way to access the dsi register.
  - Reduce the dsi_context member variables.
---
 drivers/gpu/drm/sprd/Kconfig         |    1 +
 drivers/gpu/drm/sprd/Makefile        |    4 +-
 drivers/gpu/drm/sprd/megacores_pll.c |  317 +++
 drivers/gpu/drm/sprd/megacores_pll.h |  146 +++
 drivers/gpu/drm/sprd/sprd_dpu.c      |   17 +
 drivers/gpu/drm/sprd/sprd_drm.c      |    1 +
 drivers/gpu/drm/sprd/sprd_drm.h      |    1 +
 drivers/gpu/drm/sprd/sprd_dsi.c      | 1260 ++
 drivers/gpu/drm/sprd/sprd_dsi.h      |   94 ++
 9 files changed, 1840 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/sprd/megacores_pll.c
 create mode 100644 drivers/gpu/drm/sprd/megacores_pll.h
 create mode 100644 drivers/gpu/drm/sprd/sprd_dsi.c
 create mode 100644 drivers/gpu/drm/sprd/sprd_dsi.h

diff --git a/drivers/gpu/drm/sprd/Kconfig b/drivers/gpu/drm/sprd/Kconfig
index 37762c333..3edeaeca0 100644
--- a/drivers/gpu/drm/sprd/Kconfig
+++ b/drivers/gpu/drm/sprd/Kconfig
@@ -5,6 +5,7 @@ config DRM_SPRD
 	select DRM_GEM_CMA_HELPER
 	select DRM_KMS_CMA_HELPER
 	select DRM_KMS_HELPER
+	select DRM_MIPI_DSI
 	select VIDEOMODE_HELPERS
 	help
 	  Choose this option if you have a Unisoc chipset.
diff --git a/drivers/gpu/drm/sprd/Makefile b/drivers/gpu/drm/sprd/Makefile index ab12b95e6..73f96c459 100644 --- a/drivers/gpu/drm/sprd/Makefile +++ b/drivers/gpu/drm/sprd/Makefile @@ -1,4 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 obj-y := sprd_drm.o \ - sprd_dpu.o + sprd_dpu.o \ + sprd_dsi.o \ + megacores_pll.o diff --git a/drivers/gpu/drm/sprd/megacores_pll.c b/drivers/gpu/drm/sprd/megacores_pll.c new file mode 100644 index 0..0dfd3c372 --- /dev/null +++ b/drivers/gpu/drm/sprd/megacores_pll.c @@ -0,0 +1,317 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2020 Unisoc Inc. + */ + +#include +#include +#include +#include +#include +#include + +#include "megacores_pll.h" + +#define L 0 +#define H 1 +#define CLK0 +#define DATA 1 +#define INFINITY 0x +#define MIN_OUTPUT_FREQ(100) + +#define AVERAGE(a, b) (min(a, b) + abs((b) - (a)) / 2) + +/* sharkle */ +#define VCO_BAND_LOW 750 +#define VCO_BAND_MID 1100 +#define VCO_BAND_HIGH 1500 +#define PHY_REF_CLK26000 + +static int dphy_calc_pll_param(struct dphy_pll *pll) +{ + const u32 khz = 1000; + const u32 mhz = 100; + const unsigned long long factor = 100; + unsigned long long tmp; + int i; + + pll->potential_fvco = pll->freq / khz; + pll->ref_clk = PHY_REF_CLK / khz; + + for (i = 0; i < 4; ++i) { + if (pll->potential_fvco >= VCO_BAND_LOW && + pll->potential_fvco <= VCO_BAND_HIGH) { + pll->fvco = pll->potential_fvco; + pll->out_sel = BIT(i); + break; + } + pll->potential_fvco <<= 1; + } + if (pll->fvco == 0) + return -EINVAL; + + if (pll->fvco >= VCO_BAND_LOW && pll->fvco <= VCO_BAND_MID) { + /* vco band control */ + pll->vco_band = 0x0; + /* low pass filter control */ + pll->lpf_sel = 1; + } else if (pll->fvco > VCO_BAND_MID && pll->fvco <= VCO_BAND_HIGH) { + pll->vco_band = 0x1; + pll->lpf_sel = 0; + } else + return -EINVAL; + + pll->nint = pll->fvco / pll->ref_clk; + tmp = pll->fvco * factor * mhz; + do_div(tmp, pll->ref_clk); + tmp = tmp - pll->nint * factor * mhz; + tmp *= BIT(20); + do_div(tmp, 1); + 
pll->kint = (u32)tmp; + pll->refin = 3; /* pre-divider bypass */ + pll->sdm_en = true; /* use fraction N PLL */ + pll->fdk_s = 0x1; /* fraction */ + pll->cp_s = 0x0; + pll->det_delay = 0x1; + + return 0; +} + +static void dphy_set_pll_reg(struct dphy_pll *pll, struct regmap *regmap) +{ + struct pll_reg *reg = &pll->reg; + u8 *val; +
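The VCO band search at the top of dphy_calc_pll_param() can be illustrated in isolation. The following is only a userspace sketch of that one step, not the driver code: the names pick_vco and struct vco_choice are hypothetical, and it assumes the band limits 750 and 1500 from the patch are in MHz. It doubles the requested frequency up to three times until it falls inside the VCO band and records which doubling step was used as a one-hot out_sel value.

```c
#include <assert.h>
#include <stdint.h>

/* Band limits taken from the patch; treating them as MHz is an assumption. */
#define VCO_BAND_LOW_MHZ   750u
#define VCO_BAND_HIGH_MHZ  1500u

/* Hypothetical result type for this sketch. */
struct vco_choice {
	uint32_t fvco_mhz;  /* chosen VCO frequency, 0 if none was found */
	uint32_t out_sel;   /* one-hot divider select, i.e. BIT(i) in the patch */
};

/*
 * Try freq, 2*freq, 4*freq and 8*freq and take the first value that lies
 * inside [VCO_BAND_LOW_MHZ, VCO_BAND_HIGH_MHZ].
 * Returns 0 on success, -1 if no candidate fits (the patch returns -EINVAL).
 */
static inline int pick_vco(uint32_t freq_mhz, struct vco_choice *out)
{
	uint32_t candidate = freq_mhz;
	int i;

	assert(out);
	out->fvco_mhz = 0;
	out->out_sel = 0;

	for (i = 0; i < 4; ++i) {
		if (candidate >= VCO_BAND_LOW_MHZ &&
		    candidate <= VCO_BAND_HIGH_MHZ) {
			out->fvco_mhz = candidate;
			out->out_sel = 1u << i;
			return 0;
		}
		candidate <<= 1;
	}

	return -1;
}
```

Under these assumptions a 500 MHz request is doubled once and lands on a 1000 MHz VCO, while a 50 MHz request cannot reach the band within three doublings and is rejected.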
Re: [PATCH v1] fbtft: fb_st7789v: added reset on init_display()
On Fri, Aug 13, 2021 at 02:54:30PM +0200, Oliver Graute wrote:
> On 13/08/21, Greg KH wrote:
> > On Fri, Aug 13, 2021 at 08:25:10AM +0200, Oliver Graute wrote:
> > > staging: fbtft: fb_st7789v: reset display before initialization
> >
> > What is this line here, and why is this not your subject line instead?
>
> I'll put the line as subject instead.
>
> > > In rare cases the display is flipped or mirrored. This was observed more
> > > often in a low temperature environment. A clean reset on init_display()
> > > should help to get registers in a sane state.
> > >
> > > Signed-off-by: Oliver Graute
> >
> > What commit does this fix?
>
> this is a fix for a rare behavior of the fb_st7789v display. Not a
> bugfix for a specific commit.

So if it has always been broken, list the commit where the code was
added to the kernel, as this should be backported to the stable kernels,
right?

thanks,

greg k-h
Re: [PATCH v1] staging: fbtft: fb_st7789v: reset display before initialization
On Fri, Aug 13, 2021 at 02:59:27PM +0200, Oliver Graute wrote:
> In rare cases the display is flipped or mirrored. This was observed more
> often in a low temperature environment. A clean reset on init_display()
> should help to get registers in a sane state.
>
> Signed-off-by: Oliver Graute
> ---
>  drivers/staging/fbtft/fb_st7789v.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/staging/fbtft/fb_st7789v.c b/drivers/staging/fbtft/fb_st7789v.c
> index 3a280cc1892c..0a2dbed9ffc7 100644
> --- a/drivers/staging/fbtft/fb_st7789v.c
> +++ b/drivers/staging/fbtft/fb_st7789v.c
> @@ -82,6 +82,8 @@ enum st7789v_command {
>  {
>  	int rc;
>
> +	par->fbtftops.reset(par);
> +
>  	rc = init_tearing_effect_line(par);
>  	if (rc)
>  		return rc;
> --
> 2.17.1
>
>

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
Re: [PATCH] drm: radeon: r600_dma: Replace cpu_to_le32() by lower_32_bits()
On 2021-08-13 10:54 a.m., zhaoxiao wrote:
> This patch fixes the following sparse errors:
> drivers/gpu/drm/radeon/r600_dma.c:247:30: warning: incorrect type in assignment (different base types)
> drivers/gpu/drm/radeon/r600_dma.c:247:30:    expected unsigned int volatile [usertype]
> drivers/gpu/drm/radeon/r600_dma.c:247:30:    got restricted __le32 [usertype]
>
> Signed-off-by: zhaoxiao
> ---
>  drivers/gpu/drm/radeon/r600_dma.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/radeon/r600_dma.c b/drivers/gpu/drm/radeon/r600_dma.c
> index fb65e6fb5c4f..a2d0b1edcd22 100644
> --- a/drivers/gpu/drm/radeon/r600_dma.c
> +++ b/drivers/gpu/drm/radeon/r600_dma.c
> @@ -244,7 +244,7 @@ int r600_dma_ring_test(struct radeon_device *rdev,
>  	gpu_addr = rdev->wb.gpu_addr + index;
>
>  	tmp = 0xCAFEDEAD;
> -	rdev->wb.wb[index/4] = cpu_to_le32(tmp);
> +	rdev->wb.wb[index/4] = lower_32_bits(tmp);
>
>  	r = radeon_ring_lock(rdev, ring, 4);
>  	if (r) {
>

Seems better to mark rdev->wb.wb as little endian instead. It's read with
le32_to_cpu (with some exceptions which look like bugs), which would result
in 0xADDEFECA like this on a big-endian host.

--
Earthling Michel Dänzer | https://redhat.com
Libre software enthusiast | Mesa and X developer
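The byte-order concern in the reply above can be checked with a small userspace sketch. It is not kernel code: swab32_sketch is a hypothetical stand-in for the byte reversal that le32_to_cpu() performs on a big-endian CPU (on a little-endian CPU le32_to_cpu() leaves the value unchanged). It shows what the test value would read back as if it were stored in native order and then read through the little-endian accessor on such a host.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8)  |
	       ((v & 0x00ff0000u) >> 8)  |
	       ((v & 0xff000000u) >> 24);
}

/* Sanity helper: swapping twice must give back the original value. */
static inline void swab32_roundtrip_check(uint32_t v)
{
	assert(swab32_sketch(swab32_sketch(v)) == v);
}
```

Calling swab32_sketch(0xCAFEDEAD) reverses the bytes CA FE DE AD into AD DE FE CA, which is why a native-order store paired with a little-endian read would not return the value the ring test wrote.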
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 8/13/2021 8:10 PM, Michel Dänzer wrote: On 2021-08-13 4:14 p.m., Lazar, Lijo wrote: On 8/13/2021 7:04 PM, Michel Dänzer wrote: On 2021-08-13 1:50 p.m., Lazar, Lijo wrote: On 8/13/2021 3:59 PM, Michel Dänzer wrote: From: Michel Dänzer schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when GFXOFF transitions from enabled to disabled. This makes sure the delayed work will be scheduled as intended in the reverse case. In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs to use mutex_trylock instead of mutex_lock. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. Signed-off-by: Michel Dänzer --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +++ 3 files changed, 20 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f3fd5ec710b6..8b025f70706c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -2777,7 +2777,16 @@ static void amdgpu_device_delay_enable_gfx_off(struct work_struct *work) struct amdgpu_device *adev = container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work); - mutex_lock(&adev->gfx.gfx_off_mutex); + /* mutex_lock could deadlock with cancel_delayed_work_sync in amdgpu_gfx_off_ctrl. 
*/
+	if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) {
+		/* If there's a bug which causes amdgpu_gfx_off_ctrl to be called with enable=true
+		 * when adev->gfx.gfx_off_req_count is already 0, we might race with that.
+		 * Re-schedule to make sure gfx off will be re-enabled in the HW eventually.
+		 */
+		schedule_delayed_work(&adev->gfx.gfx_off_delay_work, AMDGPU_GFX_OFF_DELAY_ENABLE);
+		return;

This is not needed and is just creating another thread to contend for mutex.

Still not sure what you mean by that. What other thread?

Sorry, I meant it schedules another workitem and delays GFXOFF enablement further. For ex: if it was another function like gfx_off_status holding the lock at the time of check.

The checks below take care of enabling gfxoff correctly. If it's already in gfx_off state, it doesn't do anything. So I don't see why this change is needed.

mutex_trylock is needed to prevent the deadlock discussed before and below.

schedule_delayed_work is needed due to this scenario hinted at by the comment:

1. amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work
2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which fails

GFXOFF would never get re-enabled in HW in this case (until amdgpu_gfx_off_ctrl calls schedule_delayed_work again).

(cancel_delayed_work_sync guarantees there's no pending delayed work when it returns, even if amdgpu_device_delay_enable_gfx_off calls schedule_delayed_work)

I think we need to explain based on the original code before. There is an assumption here that the only other contention of this mutex is with the gfx_off_ctrl function.

Not really.

As far as I understand if the work has already started running when schedule_delayed_work is called, it will insert another in the work queue after delay. Based on that understanding I didn't find a problem with the original code.

Original code as in without this patch or the mod_delayed_work patch? If so, the problem is not when the work has already started running. It's that when it hasn't started running yet, schedule_delayed_work doesn't change the timeout for the already scheduled work, so it ends up enabling GFXOFF earlier than intended (and thus at all in scenarios when it's not supposed to).

I meant the original implementation of amdgpu_device_delay_enable_gfx_off().

If you indeed want to use _sync, there is a small problem with this implementation also which is roughly equivalent to the original problem you faced.

amdgpu_gfx_off_ctrl(disable) locks mutex
calls cancel_delayed_work_sync
amdgpu_device_delay_enable_gfx_off already started running
mutex_trylock fails and schedules another one
amdgpu_gfx_off_ctrl(enable)
schedule_delayed_work() - Delay is not extended, it's the same as when it's rearmed from work item.

Probably overthinking the solution. Looking back, the mod_ version is simpler :). Maybe just delay it further every time there is a call with enable instead of doing it onl
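The difference the thread keeps returning to, that schedule_delayed_work() leaves an already pending timer alone while mod_delayed_work() pushes it back, can be modelled in a few lines of userspace C. This is only a toy model under stated assumptions, not the kernel workqueue implementation: struct delayed_work_model and the model_* helpers are hypothetical names, time is a plain counter, and only the pending flag and expiry are tracked.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delayed work item: just a pending flag and an expiry. */
struct delayed_work_model {
	bool pending;
	unsigned long expires;
};

/*
 * Models schedule_delayed_work(): if the work is already pending, nothing
 * changes and false is returned; otherwise the work is armed.
 */
static inline bool model_schedule(struct delayed_work_model *w,
				  unsigned long now, unsigned long delay)
{
	assert(w);
	if (w->pending)
		return false;
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/*
 * Models mod_delayed_work(): always (re)arms the timer.
 * Returns true if the work was already pending.
 */
static inline bool model_mod(struct delayed_work_model *w,
			     unsigned long now, unsigned long delay)
{
	bool was_pending;

	assert(w);
	was_pending = w->pending;
	w->pending = true;
	w->expires = now + delay;
	return was_pending;
}

/* Models cancel_delayed_work(): returns true if pending work was cancelled. */
static inline bool model_cancel(struct delayed_work_model *w)
{
	bool was_pending;

	assert(w);
	was_pending = w->pending;
	w->pending = false;
	return was_pending;
}
```

In this model, arming at t=0 with a delay of 100 and calling model_schedule() again at t=50 leaves the expiry at 100, which corresponds to GFXOFF being re-enabled earlier than intended. Cancelling first and then scheduling, or calling model_mod(), moves the expiry to a full delay after the latest call.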
Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H
On Thu, Aug 12, 2021 at 10:38:18PM +, Matthew Brost wrote: > On Thu, Aug 12, 2021 at 09:47:23PM +0200, Daniel Vetter wrote: > > On Thu, Aug 12, 2021 at 03:23:30PM +, Matthew Brost wrote: > > > On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote: > > > > On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote: > > > > > Flush the work queue for GuC generated G2H messages durinr a GT reset. > > > > > This is accomplished by spinning on the the list of outstanding G2H to > > > > > go empty. > > > > > > > > > > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC > > > > > interface") > > > > > Signed-off-by: Matthew Brost > > > > > Cc: > > > > > --- > > > > > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 + > > > > > 1 file changed, 5 insertions(+) > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > index 3cd2da6f5c03..e5eb2df11b4a 100644 > > > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct > > > > > intel_guc *guc) > > > > > wait_for_reset(guc, > > > > > &guc->outstanding_submission_g2h); > > > > > } while (!list_empty(&guc->ct.requests.incoming)); > > > > > } > > > > > + > > > > > + /* Flush any GuC generated G2H */ > > > > > + while (!list_empty(&guc->ct.requests.incoming)) > > > > > + msleep(20); > > > > > > > > flush_work or flush_workqueue, beacuse that comes with lockdep > > > > annotations. Dont hand-roll stuff like this if at all possible. > > > > > > lockdep puked when used that. > > > > Lockdep tends to be right ... So we definitely want that, but maybe a > > different flavour, or there's something wrong with the workqueue setup. > > > > Here is a dependency chain that lockdep doesn't like. 
> > fs_reclaim_acquire -> &gt->reset.mutex (shrinker) > workqueue -> fs_reclaim_acquire (error capture in workqueue) > &gt->reset.mutex -> workqueue (reset) > > In practice I don't think we couldn't ever hit this but lockdep does > looks right here. Trying to work out how to fix this. We really need to > all G2H to done being processed before we proceed during a reset or we > have races. Have a few ideas of how to handle this but can't convince > myself any of them are fully safe. It might be false sharing due to a single workqueue, or a single-threaded workqueue. Essentially the lockdep annotations for work_struct track two things: - dependencies against the specific work item - dependencies against anything queued on that work queue, if you flush the entire queue, or if you flush a work item that's on a single-threaded queue. Because if guc/host communication is inverted like this here, you have a much bigger problem. Note that if you pick a different workqueue for your guc work stuff then you need to make sure that's all properly flushed on suspend and driver unload. It might also be that the reset work is on the wrong workqueue. Either way, this must be fixed, because I've seen too many of these "it never happens in practice" blow up, plus if your locking scheme is engineered with quicksand forget about anyone ever understanding it. -Daniel > > Splat below: > > [ 154.625989] == > [ 154.632195] WARNING: possible circular locking dependency detected > [ 154.638393] 5.14.0-rc5-guc+ #50 Tainted: G U > [ 154.643991] -- > [ 154.650196] i915_selftest/1673 is trying to acquire lock: > [ 154.655621] 8881079cb918 > ((work_completion)(&ct->requests.worker)){+.+.}-{0:0}, at: > __flush_work+0x350/0x4d0 > [ 154.665826] >but task is already holding lock: > [ 154.671682] 8881079cbfb8 (&gt->reset.mutex){+.+.}-{3:3}, at: > intel_gt_reset+0xf0/0x300 [i915] > [ 154.680659] >which lock already depends on the new lock. 
> > [ 154.688857] >the existing dependency chain (in reverse order) is: > [ 154.696365] >-> #2 (&gt->reset.mutex){+.+.}-{3:3}: > [ 154.702571]lock_acquire+0xd2/0x300 > [ 154.706695]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915] > [ 154.712959]intel_gt_init_reset+0x61/0x80 [i915] > [ 154.718258]intel_gt_init_early+0xe6/0x120 [i915] > [ 154.723648]i915_driver_probe+0x592/0xdc0 [i915] > [ 154.728942]i915_pci_probe+0x43/0x1c0 [i915] > [ 154.733891]pci_device_probe+0x9b/0x110 > [ 154.738362]really_probe+0x1a6/0x3a0 > [ 154.742568]__driver_probe_device+0xf9/0x170 > [ 154.747468]driver_probe_device+0x19/0x90 > [ 154.752114]__driver_attach+0x99/0x170 > [ 154.756492]bus_for_each_dev+0x73/0xc0 > [ 154.760870]bus_add_driver+0x14b/0
[PATCH v5 0/9] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and use in DRM
hi Jason, Greg, Daniel, dri-everyone, drm_debug_enabled() is called a lot (by drm-debug api) to do unlikely bit-tests to selectively enable debug printing; this is a good job for DYNAMIC_DEBUG, IFF it is built with JUMP_LABEL. This patchset enables the use of dynamic-debug to avoid those drm_debug_enabled() overheads, if CONFIG_DRM_USE_DYNAMIC_DEBUG=y. v5: much rework - based on Daniel Vetter's feedback, not RFC anymore. (except last one) - move POC bit_map callback code into dynamic_debug add .data to struct kernel_param add DEFINE_DYNAMIC_DEBUG_CATEGORIES : a declarative interface for bits => control-queries this is all new functionality. - use DEFINE_DYNAMIC_DEBUG_CATEGORIES in i915, amdgpu adds selectivity/control to existing categorizations - DRM_USE_DYNAMIC_DEBUG replace DRM_UT_ (an enum) with DRM_CAT_ (a prefix string, cpp-prepended to format) _UT_ still present, drm_debug_enabled() still used todo: change __drm_debug param-var to read DDD_CATEGORIES's param-var might suffice to keep parallel schemes coherent. - RFC add tracer func as syslog alternate test_dynamic_debug.ko: uses tracer for observability, does selftest has some misuse risk; calling pr_debug recursively. v4: (brown-bagger, various fixes after snips) v3: fixes missed SOB, && on BOL, commit-log tweaks v2: https://lore.kernel.org/lkml/20210711055003.528167-1-jim.cro...@gmail.com/ v1: https://lore.kernel.org/lkml/20201204035318.332419-1-jim.cro...@gmail.com/ Doing so creates many new pr_debug callsites, otherwise i915 has ~120 prdbgs, and drm has just 1; bash-5.1# modprobe i915 dyndbg: 8 debug prints in module video dyndbg: 305 debug prints in module drm dyndbg: 207 debug prints in module drm_kms_helper dyndbg: 2 debug prints in module ttm dyndbg: 1720 debug prints in module i915 On amdgpu, enabling it adds ~3200 prdbgs, currently at 56 bytes each. So CONFIG_DRM_USE_DYNAMIC_DEBUG=y affects resource requirements. Im working on a diet-plan. 
Im running this patchset bare-metal on an i7/i915 laptop & an r9/amdgpu desktop (both as loadable modules). I booted the amdgpu box with: BOOT_IMAGE=(hd2,gpt2)/vmlinuz-5.14.0-rc4-d7a-9-g5db471cba844 \ root=UUID=mumble ro \ rootflags=subvol=root00 rhgb \ dynamic_debug.verbose=3 main.dyndbg=+p \ amdgpu.debug=1 amdgpu.test=1 \ "amdgpu.dyndbg=format ^[ +p" That last line enables ~1700 prdbg callsites with a format like '[DML' etc at boot, and amdgpu.test=1 triggers 90 seconds of tests, yielding ~76k prdbgs in 409 seconds, before I turned them off with: echo module amdgpu -p > /proc/dynamic_debug/control Its worth noting, this changes the dyndbg-state underneath settings applied with `echo > parameters/debug`; the latter is qualitatively writeonly, maybe a param_get should return "NA" "-1" this merged cleanly, on top of commit d65ef4634e5c795a6a4df1d198992c70e9692fb3 (drm-tip/drm-tip) Jim Cromie (9): drm/print: fixup spelling in a comment moduleparam: add data member to struct kernel_param dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks i915/gvt: remove spaces in pr_debug "gvt: core:" etc prefixes i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:" etc categories amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to control categorized pr_debugs drm_print: add choice to use dynamic debug in drm-debug amdgpu_ucode: reduce number of pr_debug calls dyndbg: RFC add tracer facility RFC drivers/gpu/drm/Kconfig | 13 + drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 ++ .../gpu/drm/amd/display/dc/core/dc_debug.c| 44 ++- drivers/gpu/drm/drm_print.c | 49 ++- drivers/gpu/drm/i915/gvt/Makefile | 4 + drivers/gpu/drm/i915/gvt/debug.h | 18 +- drivers/gpu/drm/i915/i915_params.c| 35 +++ include/drm/drm_print.h | 143 +++-- include/linux/dynamic_debug.h | 82 - include/linux/moduleparam.h | 11 +- lib/Kconfig.debug | 10 + lib/Makefile | 1 + lib/dynamic_debug.c | 171 -- lib/test_dynamic_debug.c | 247 +++ 14 files changed, 901 insertions(+), 220 deletions(-) create mode 
100644 lib/test_dynamic_debug.c -- 2.31.1
Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects
On 8/13/2021 7:37 AM, Daniel Vetter wrote: On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote: This api allow user mode to create protected buffers and to mark contexts as making use of such objects. Only when using contexts marked in such a way is the execution guaranteed to work as expected. Contexts can only be marked as using protected content at creation time (i.e. the parameter is immutable) and they must be both bannable and not recoverable. All protected objects and contexts that have backing storage will be considered invalid when the PXP session is destroyed and all new submissions using them will be rejected. All intel contexts within the invalidated gem contexts will be marked banned. A new flag has been added to the RESET_STATS ioctl to report the context invalidation to userspace. This patch was previously sent as 2 separate patches, which have been squashed following a request to have all the uapi in a single patch. I've retained the s-o-b from both. v5: squash patches, rebase on proto_ctx, update kerneldoc v6: rebase on obj create_ext changes Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Bommu Krishnaiah Cc: Rodrigo Vivi Cc: Chris Wilson Cc: Lionel Landwerlin Cc: Jason Ekstrand Cc: Daniel Vetter Reviewed-by: Rodrigo Vivi #v5 --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 68 -- drivers/gpu/drm/i915/gem/i915_gem_context.h | 18 .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 6 ++ drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++ .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 ++ drivers/gpu/drm/i915/pxp/intel_pxp.c | 89 +++ drivers/gpu/drm/i915/pxp/intel_pxp.h | 15 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 3 + drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 5 ++ include/uapi/drm/i915_drm.h | 55 +++- 13 files changed, 371 insertions(+), 26 deletions(-) diff --git 
a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index cff72679ad7c..0cd3e2d06188 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -77,6 +77,8 @@ #include "gt/intel_gpu_commands.h" #include "gt/intel_ring.h" +#include "pxp/intel_pxp.h" + #include "i915_gem_context.h" #include "i915_trace.h" #include "i915_user_extensions.h" @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct drm_i915_private *i915, return 0; } +static int proto_context_set_protected(struct drm_i915_private *i915, + struct i915_gem_proto_context *pc, + bool protected) +{ + int ret = 0; + + if (!intel_pxp_is_enabled(&i915->gt.pxp)) + ret = -ENODEV; + else if (!protected) + pc->user_flags &= ~BIT(UCONTEXT_PROTECTED); + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) + ret = -EPERM; + else + pc->user_flags |= BIT(UCONTEXT_PROTECTED); + + return ret; +} + static struct i915_gem_proto_context * proto_context_create(struct drm_i915_private *i915, unsigned int flags) { @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, ret = -EPERM; else if (args->value) pc->user_flags |= BIT(UCONTEXT_BANNABLE); + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) + ret = -EPERM; else pc->user_flags &= ~BIT(UCONTEXT_BANNABLE); break; @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, case I915_CONTEXT_PARAM_RECOVERABLE: if (args->size) ret = -EINVAL; - else if (args->value) - pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); - else + else if (!args->value) pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE); + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) + ret = -EPERM; + else + pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); break; case I915_CONTEXT_PARAM_PRIORITY: @@ -724,6 +749,11 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, args->value); break; + case 
I915_CONTEXT_PARAM_PROTECTED_CONTENT: + ret = proto_context_set_protected(fpriv->dev_priv, pc, +
[PATCH v5 1/9] drm/print: fixup spelling in a comment
s/prink/printk/ - no functional changes Signed-off-by: Jim Cromie --- include/drm/drm_print.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 9b66be54dd16..15a089a87c22 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -327,7 +327,7 @@ static inline bool drm_debug_enabled(enum drm_debug_category category) /* * struct device based logging * - * Prefer drm_device based logging over device or prink based logging. + * Prefer drm_device based logging over device or printk based logging. */ __printf(3, 4) -- 2.31.1
[PATCH v5 2/9] moduleparam: add data member to struct kernel_param
Add a const void* data member to the struct, to allow attaching private data that will be used soon by a setter method (via kp->data) to perform more elaborate actions. To attach the data at compile time, add new macros: module_param_cbd() derives from module_param_cb(), adding data param, and latter is redefined to use former. It calls __module_param_call_wdata(), which accepts a new data param and inits .data with it. Re-define __module_param_call() to use it. Use of this new data member will be rare, it might be worth redoing this as a separate/sub-type to de-bloat the base case. --- v4+: . const void* data - Signed-off-by: Jim Cromie --- include/linux/moduleparam.h | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h index eed280fae433..878387e0b2d9 100644 --- a/include/linux/moduleparam.h +++ b/include/linux/moduleparam.h @@ -78,6 +78,7 @@ struct kernel_param { const struct kparam_string *str; const struct kparam_array *arr; }; + const void *data; }; extern const struct kernel_param __start___param[], __stop___param[]; @@ -175,6 +176,9 @@ struct kparam_array #define module_param_cb(name, ops, arg, perm)\ __module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0) +#define module_param_cbd(name, ops, arg, perm, data) \ + __module_param_call_wdata(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0, data) + #define module_param_cb_unsafe(name, ops, arg, perm) \ __module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1,\ KERNEL_PARAM_FL_UNSAFE) @@ -284,14 +288,17 @@ struct kparam_array /* This is the fundamental function for registering boot/module parameters. 
*/ -#define __module_param_call(prefix, name, ops, arg, perm, level, flags) \ +#define __module_param_call(prefix, name, ops, arg, perm, level, flags) \ + __module_param_call_wdata(prefix, name, ops, arg, perm, level, flags, NULL) + +#define __module_param_call_wdata(prefix, name, ops, arg, perm, level, flags, data) \ /* Default value instead of permissions? */ \ static const char __param_str_##name[] = prefix #name; \ static struct kernel_param __moduleparam_const __param_##name \ __used __section("__param") \ __aligned(__alignof__(struct kernel_param)) \ = { __param_str_##name, THIS_MODULE, ops, \ - VERIFY_OCTAL_PERMISSIONS(perm), level, flags, { arg } } + VERIFY_OCTAL_PERMISSIONS(perm), level, flags, { arg }, data } /* Obsolete - use module_param_cb() */ #define module_param_call(name, _set, _get, arg, perm) \ -- 2.31.1
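The layering described in this patch (an extended macro that takes a data argument, with the existing macro redefined to forward to it with NULL) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct demo_param`, `DEMO_PARAM`, `DEMO_PARAM_WDATA` and `demo_set` are hypothetical names invented here, not the kernel's `struct kernel_param` or its macros.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a parameter descriptor; not the kernel struct. */
struct demo_param {
	const char *name;
	int (*set)(const char *val, const struct demo_param *kp);
	void *arg;
	const void *data;	/* optional private data, read by the setter */
};

/* Setter that inspects kp->data; returns 1 if private data was attached. */
static int demo_set(const char *val, const struct demo_param *kp)
{
	const char *tag = kp->data ? (const char *)kp->data : "(no data)";

	printf("%s=%s [%s]\n", kp->name, val, tag);
	return kp->data != NULL;
}

/* Extended form: accepts the extra data argument and stores it. */
#define DEMO_PARAM_WDATA(pname, setter, parg, pdata)		\
	static const struct demo_param demo_param_##pname =	\
		{ #pname, setter, parg, pdata }

/* Legacy form: forwards to the extended form with data = NULL. */
#define DEMO_PARAM(pname, setter, parg)				\
	DEMO_PARAM_WDATA(pname, setter, parg, NULL)

static int demo_plain_value, demo_tagged_value;

DEMO_PARAM(plain, demo_set, &demo_plain_value);
DEMO_PARAM_WDATA(tagged, demo_set, &demo_tagged_value, "category map");
```

Existing users of the legacy macro keep compiling unchanged and simply see a NULL data pointer, which is the property the commit message relies on when it redefines the base macro in terms of the new one.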
[PATCH v5 3/9] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks
DEFINE_DYNAMIC_DEBUG_CATEGORIES(name, var, bitmap_desc, @bit_descs) allows users to define a drm.debug style (bitmap) sysfs interface, and to specify the desired mapping from bits[0-N] to the format-prefix'd pr_debug()s to be controlled. DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, "i915/gvt bitmap desc", /** * search-prefixes, passed to dd-exec_queries * defines bits 0-N in order. * leading ^ is tacitly inserted (by callback currently) * trailing space used here excludes subcats. * helper macro needs more work * macro to autogen ++$i, 0x%x$i ? */ _DD_cat_("gvt:cmd: "), _DD_cat_("gvt:core: "), _DD_cat_("gvt:dpy: "), _DD_cat_("gvt:el: "), _DD_cat_("gvt:irq: "), _DD_cat_("gvt:mm: "), _DD_cat_("gvt:mmio: "), _DD_cat_("gvt:render: "), _DD_cat_("gvt:sched: ")); dynamic_debug.c: add 3 new elements: - int param_set_dyndbg() - int param_get_dyndbg() - struct kernel_param_ops param_ops_dyndbg Following the model of kernel/params.c STANDARD_PARAM_DEFS, All 3 are non-static and exported. dynamic_debug.h: Add DEFINE_DYNAMIC_DEBUG_CATEGORIES() described above, and a do-nothing stub. Note that it also calls MODULE_PARM_DESC for the user, but expects the user to catenate all the bit-descriptions together (as is done in drm.debug), and in the following uses in amdgpu, i915. This in the hope that someone can offer an auto-incrementing label-generating macro, producing "\tbit-4 0x10\t" etc, and can show how to apply it to __VA_ARGS__. Also extern the struct kernel_param param_ops_dyndbg symbol, as is done in moduleparams.h for all the STANDARD params. USAGE NOTES: Using dyndbg to query on "format ^$prefix" requires that the prefix be present in the compiled-in format string; where run-time prefixing is used, that format would be "%s...", which is not usefully selectable. Adding structural query terms (func,file,lineno) could help (module is already done), but DEFINE_DYNAMIC_DEBUG_CATEGORIES can't do that now, adding it needs a better reason imo. 
Dyndbg is completely agnostic wrt the categorization scheme used, to play well with any prefix convention already in use. Ad-hoc categories and sub-categories are implicitly allowed, author discipline and review is expected. Here are some examples: "1","2","3" 2 doesnt imply 1. otherwize, sorta like printk levels "1:","2:","3:" are better, avoiding [1-9]\d+ ambiguity "hi:","mid:","low:" are reasonable, and imply independence "todo:","rfc:" might be handy "A:".."Z:" uhm, yeah Hierarchical classes/categories are natural: "drm::"is used in later commit "drm:::" is a natural extension. "drm:atomic:fail:" has been proposed, sounds directly useful Some properties of a hierarchical category deserve explication: Trailing spaces matter ! With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the ":" doesnt terminate the search-space, the trailing space does. So a "drm:" search specification will match all DRM categories & subcategories, and will not be useful in an interface where all categories are controlled together. That said, "drm:atomic:" & "drm:atomic: " are different, and both are useful in cases. Ad-Hoc sub-categories: These have a caveat wrt wrapper macros adding prefixes like "drm:atomic: "; the trailing space in the prefix means that drm_dbg("fail: ...") renders as "drm:atomic: fail: ", which obviously isn't ideal wrt clear and simple bitmaps. A possible solution is to have a FOO_() version of every FOO() macro which (anti-mnemonically) elides the trailing space, which is normally inserted by a modified FOO(). Doing this would enforce a policy decision that "debug categories will be space terminated", with an pressure-relief valve. Summarizing: - "drm:kms: " & "drm:kms:" are different - "drm:kms"also different - includes drm:kms2: - "drm:kms:\t" also different - "drm:kms:*" doesnt work, no wildcard on format atm. Order matters in DEFINE_DYNAMIC_DEBUG_CATEGORIES(... 
@bit_descs) @bit_descs (array) position determines the bit mapping to the prefix, so to keep a stable map, new categories or 3rd level categories must be added to the end. Since bits are/will-stay applied 0-N, the later bits can countermand the earlier ones, but it's tricky. Consider: DD_CATs(... "drm:atomic:", "drm:atomic:fail:" ) // misleading The 1st search-term is misleading, because it includes (modifies) subcategories, but then the 2nd overrides it. So don't do that. There is still plenty of bikeshedding to do. --- v4+: . rename to DEFINE_DYNAMIC_DEBUG_CATEGORIES from DEFINE_DYNDBG_BITMAP . in query, replace hardcoded "i915" w kp->mod->name . static inline the stubs . const *str in structs, const array. -Emil . dyndbg: add do-nothing DEFINE_DYNAMIC_DEBUG_CATEGORIES if !DD_CORE . ca
[PATCH v5 4/9] i915/gvt: remove spaces in pr_debug "gvt: core:" etc prefixes
Taking embedded spaces out of existing prefixes makes them better class-prefixes; simplifying the nested quoting needed otherwise: $> echo "format '^gvt: core:' +p" >control Dropping the internal spaces means any trailing space in a query will more clearly terminate the prefix being searched for. Consider a generic drm-debug example: # turn off ATOMIC reports echo format "^drm:atomic: " -p > control # turn off all ATOMIC:* reports, including any sub-categories echo format "^drm:atomic:" -p > control # turn on ATOMIC:FAIL: reports echo format "^drm:atomic:fail: " +p > control Removing embedded spaces in the class-prefixes simplifies the corresponding match-prefix. This means that "quoted" match-prefixes are only needed when the trailing space is desired, in order to exclude explicitly sub-categorized pr-debugs; in this example, "drm:atomic:fail:". RFC: maybe the prefix catenation should paste in the " " class-prefix terminator explicitly. A pr_debug_() flavor could exclude the " ", allowing ad-hoc sub-categorization by appending for example, "fail:" to "drm:atomic:". Signed-off-by: Jim Cromie --- drivers/gpu/drm/i915/gvt/debug.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/debug.h b/drivers/gpu/drm/i915/gvt/debug.h index c6027125c1ec..b4021f41c546 100644 --- a/drivers/gpu/drm/i915/gvt/debug.h +++ b/drivers/gpu/drm/i915/gvt/debug.h @@ -36,30 +36,30 @@ do { \ } while (0) #define gvt_dbg_core(fmt, args...) \ - pr_debug("gvt: core: "fmt, ##args) + pr_debug("gvt:core: "fmt, ##args) #define gvt_dbg_irq(fmt, args...) \ - pr_debug("gvt: irq: "fmt, ##args) + pr_debug("gvt:irq: "fmt, ##args) #define gvt_dbg_mm(fmt, args...) \ - pr_debug("gvt: mm: "fmt, ##args) + pr_debug("gvt:mm: "fmt, ##args) #define gvt_dbg_mmio(fmt, args...) \ - pr_debug("gvt: mmio: "fmt, ##args) + pr_debug("gvt:mmio: "fmt, ##args) #define gvt_dbg_dpy(fmt, args...) 
\ - pr_debug("gvt: dpy: "fmt, ##args) + pr_debug("gvt:dpy: "fmt, ##args) #define gvt_dbg_el(fmt, args...) \ - pr_debug("gvt: el: "fmt, ##args) + pr_debug("gvt:el: "fmt, ##args) #define gvt_dbg_sched(fmt, args...) \ - pr_debug("gvt: sched: "fmt, ##args) + pr_debug("gvt:sched: "fmt, ##args) #define gvt_dbg_render(fmt, args...) \ - pr_debug("gvt: render: "fmt, ##args) + pr_debug("gvt:render: "fmt, ##args) #define gvt_dbg_cmd(fmt, args...) \ - pr_debug("gvt: cmd: "fmt, ##args) + pr_debug("gvt:cmd: "fmt, ##args) #endif -- 2.31.1
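The effect of the trailing space on a match-prefix can be checked with a small userspace sketch. It assumes that a "format ^prefix" query selects a callsite only when the compiled-in format string begins with that prefix; `demo_prefix_selects`, `demo_count_selected` and the sample format strings are made up for illustration and are not dyndbg code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Assumption: an anchored query selects a callsite only when the
 * compiled-in format string starts with the given prefix.
 */
static bool demo_prefix_selects(const char *prefix, const char *format)
{
	return strncmp(format, prefix, strlen(prefix)) == 0;
}

/* Made-up callsite formats, for illustration only. */
static const char *const demo_formats[] = {
	"drm:atomic: commit started\n",
	"drm:atomic:fail: bad plane state\n",
	"drm:kms: connector probed\n",
};

/* Count how many of the sample formats a given prefix would select. */
static size_t demo_count_selected(const char *prefix)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(demo_formats) / sizeof(demo_formats[0]); i++)
		if (demo_prefix_selects(prefix, demo_formats[i]))
			n++;
	return n;
}
```

With these samples, "drm:atomic: " (trailing space) selects only the plain atomic message, while "drm:atomic:" also picks up the "drm:atomic:fail: " sub-category. An old-style format such as "gvt: core: ..." is not selected by "gvt:core: ", which is why the embedded spaces are removed here.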
[PATCH v5 5/9] i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:" etc categories
The gvt component of this driver has ~120 pr_debugs, in 9 categories quite similar to those in DRM. Following the interface model of drm.debug, add a parameter to map bits to these categorizations. DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, "dyndbg bitmap desc", { "gvt:cmd: ", "command processing" }, { "gvt:core: ", "core help" }, { "gvt:dpy: ", "display help" }, { "gvt:el: ", "help" }, { "gvt:irq: ", "help" }, { "gvt:mm: ", "help" }, { "gvt:mmio: ", "help" }, { "gvt:render: ", "help" }, { "gvt:sched: ", "help" }); The actual patch differs in a few details; the cmd_help() macro emits the initialization construct. If CONFIG_DRM_USE_DYNAMIC_DEBUG is set, gvt/Makefile adds -DDYNAMIC_DEBUG_MODULE to cflags. --- v4+: . static decl of vector of bit->class descriptors - Emil.V . relocate gvt-makefile chunk from elsewhere Signed-off-by: Jim Cromie --- drivers/gpu/drm/i915/gvt/Makefile | 4 drivers/gpu/drm/i915/i915_params.c | 35 ++ 2 files changed, 39 insertions(+) diff --git a/drivers/gpu/drm/i915/gvt/Makefile b/drivers/gpu/drm/i915/gvt/Makefile index ea8324abc784..846ba73b8de6 100644 --- a/drivers/gpu/drm/i915/gvt/Makefile +++ b/drivers/gpu/drm/i915/gvt/Makefile @@ -7,3 +7,7 @@ GVT_SOURCE := gvt.o aperture_gm.o handlers.o vgpu.o trace_points.o firmware.o \ ccflags-y += -I $(srctree)/$(src) -I $(srctree)/$(src)/$(GVT_DIR)/ i915-y += $(addprefix $(GVT_DIR)/, $(GVT_SOURCE)) + +#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG +ccflags-y += -DDYNAMIC_DEBUG_MODULE +#endif diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index e07f4cfea63a..683e942a074e 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -265,3 +265,38 @@ void i915_params_free(struct i915_params *params) I915_PARAMS_FOR_EACH(FREE); #undef FREE } + +#ifdef DRM_USE_DYNAMIC_DEBUG +/* todo: needs DYNAMIC_DEBUG_MODULE in some cases */ + +unsigned long __gvt_debug; +EXPORT_SYMBOL(__gvt_debug); + +#define _help(key) "\t\"" key "\"\t: help for 
" key "\n" + +#define I915_GVT_CATEGORIES(name) \ + " Enable debug output via /sys/module/i915/parameters/" #name \ + ", where each bit enables a debug category.\n" \ + _help("gvt:cmd:") \ + _help("gvt:core:") \ + _help("gvt:dpy:") \ + _help("gvt:el:")\ + _help("gvt:irq:") \ + _help("gvt:mm:")\ + _help("gvt:mmio:") \ + _help("gvt:render:")\ + _help("gvt:sched:") + +DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, + I915_GVT_CATEGORIES(debug_gvt), + _DD_cat_("gvt:cmd:"), + _DD_cat_("gvt:core:"), + _DD_cat_("gvt:dpy:"), + _DD_cat_("gvt:el:"), + _DD_cat_("gvt:irq:"), + _DD_cat_("gvt:mm:"), + _DD_cat_("gvt:mmio:"), + _DD_cat_("gvt:render:"), + _DD_cat_("gvt:sched:")); + +#endif -- 2.31.1
[PATCH v5 6/9] amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to control categorized pr_debugs
logger_types.h defines many DC_LOG_*() categorized debug wrappers. Most of these use DRM debug API, so are controllable using drm.debug, but others use bare pr_debug("$prefix: .."), each with a different class-prefix matching "^\[\w+\]:" Use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create a /sys debug_dc parameter, modinfos, and to specify a map from bits -> categorized pr_debugs to be controlled. Signed-off-by: Jim Cromie --- .../gpu/drm/amd/display/dc/core/dc_debug.c| 44 ++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c index 21be2a684393..69e68d721512 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c @@ -36,8 +36,50 @@ #include "resource.h" -#define DC_LOGGER_INIT(logger) +#ifdef DRM_USE_DYNAMIC_DEBUG +/* define a drm.debug style dyndbg pr-debug control point */ +#include + +unsigned long __debug_dc; +EXPORT_SYMBOL(__debug_dc); + +#define _help_(key)"\t " key "\t- help for " key "\n" + +/* Id like to do these inside DEFINE_DYNAMIC_DEBUG_CATEGORIES, if possible */ +#define DC_DYNDBG_BITMAP_DESC(name)\ + "Control pr_debugs via /sys/module/amdgpu/parameters/" #name\ + ", where each bit controls a debug category.\n" \ + _help_("[SURFACE]:")\ + _help_("[CURSOR]:") \ + _help_("[PFLIP]:") \ + _help_("[VBLANK]:") \ + _help_("[HW_LINK_TRAINING]:") \ + _help_("[HW_AUDIO]:") \ + _help_("[SCALER]:") \ + _help_("[BIOS]:") \ + _help_("[BANDWIDTH_CALCS]:")\ + _help_("[DML]:")\ + _help_("[IF_TRACE]:") \ + _help_("[GAMMA]:") \ + _help_("[SMU_MSG]:") + +DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_dc, __debug_dc, + DC_DYNDBG_BITMAP_DESC(debug_dc), + _DD_cat_("[CURSOR]:"), + _DD_cat_("[PFLIP]:"), + _DD_cat_("[VBLANK]:"), + _DD_cat_("[HW_LINK_TRAINING]:"), + _DD_cat_("[HW_AUDIO]:"), + _DD_cat_("[SCALER]:"), + _DD_cat_("[BIOS]:"), + _DD_cat_("[BANDWIDTH_CALCS]:"), + _DD_cat_("[DML]:"), + _DD_cat_("[IF_TRACE]:"), + 
_DD_cat_("[GAMMA]:"), + _DD_cat_("[SMU_MSG]:")); +#endif +#define DC_LOGGER_INIT(logger) #define SURFACE_TRACE(...) do {\ if (dc->debug.surface_trace) \ -- 2.31.1
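To make the bits-to-categories map concrete, here is a rough userspace sketch of how a bitmap write could be turned into one query per changed bit. It is an assumption about the general shape of such a callback, not the code in this series: `demo_build_queries`, `demo_cats` and the exact query text are hypothetical, and the category list is abbreviated to the first four prefixes declared above.

```c
#include <stddef.h>
#include <stdio.h>

#define DEMO_QUERY_LEN 64

/* Abbreviated, position-ordered category prefixes: bit 0 is the first entry. */
static const char *const demo_cats[] = {
	"[CURSOR]:",
	"[PFLIP]:",
	"[VBLANK]:",
	"[HW_LINK_TRAINING]:",
};

/*
 * For every bit that differs between old_bits and new_bits, write a
 * "format ^<prefix> +p" (bit set) or "-p" (bit cleared) query into out[].
 * Returns the number of queries written.
 */
static int demo_build_queries(unsigned long old_bits, unsigned long new_bits,
			      char out[][DEMO_QUERY_LEN], int max_out)
{
	unsigned long changed = old_bits ^ new_bits;
	size_t i;
	int n = 0;

	for (i = 0; i < sizeof(demo_cats) / sizeof(demo_cats[0]); i++) {
		if (n >= max_out)
			break;
		if (!(changed & (1UL << i)))
			continue;
		snprintf(out[n], DEMO_QUERY_LEN, "format ^%s %cp",
			 demo_cats[i], (new_bits & (1UL << i)) ? '+' : '-');
		n++;
	}
	return n;
}
```

Because array position fixes the bit number, appending new categories at the end keeps existing bit assignments stable, which matches the ordering advice given for @bit_descs earlier in the series.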
[PATCH v5 7/9] drm_print: add choice to use dynamic debug in drm-debug
drm's debug system writes 10 distinct categories of messages to syslog using a small API[1]: drm_dbg*(10 names), DRM_DEBUG*(8 names), DRM_DEV_DEBUG*(3 names). There are thousands of these callsites, each categorized by their authors. These callsites can be enabled at runtime by their category, each controlled by a bit in drm.debug (/sys/module/drm/parameters/debug). In the current "basic" implementation, drm_debug_enabled() tests these bits in __drm_debug each time an API[1] call is executed; while cheap individually, the costs accumulate. This patch uses dynamic-debug with jump-label to patch enabled calls onto their respective NOOP slots, avoiding all runtime bit-checks of __drm_debug. Dynamic debug has no concept of category, but we can emulate one by replacing enum categories with a set of prefix-strings; "drm:core:", "drm:kms:", "drm:driver:" etc, and prepend them (at compile time) to the given formats. Then we can use: `echo module drm format "^drm:core: " +p > control` to enable the whole category with one query. This conversion yields ~2100 new callsites on my i7/i915 laptop: dyndbg: 195 debug prints in module drm_kms_helper dyndbg: 298 debug prints in module drm dyndbg: 1630 debug prints in module i915 CONFIG_DRM_USE_DYNAMIC_DEBUG enables this, and is available if CONFIG_DYNAMIC_DEBUG or CONFIG_DYNAMIC_DEBUG_CORE is chosen, and if CONFIG_JUMP_LABEL is enabled; this is because it's required to get the promised optimizations. The "basic" -> "dyndbg" switchover is layered into the macro scheme A. use DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug, __drm_debug, "DRM debug category-per-bit control", { "drm:core:", "enable CORE debug messages" }, { "drm:kms:", "enable KMS debug messages" }, ...); B. A "classy" version of DRM_UT_ map, named DRM_DBG_CAT_ DRM_DBG_CLASS_ was proposed, I had agreed, but reconsidered; CATEGORY is already DRM's term-of-art, and adding a near-synonym 'CLASS' only adds ambiguity. "basic": DRM_DBG_CAT_ <=== DRM_UT_. Identity map. 
"dyndbg": #define DRM_DBG_CAT_KMS"drm:kms: " #define DRM_DBG_CAT_PRIME "drm:prime: " #define DRM_DBG_CAT_ATOMIC "drm:atomic: " DRM_UT_* are preserved, since theyre used elsewhere. We can probably reduce its use further, but thats a separate thing. C. drm_dev_dbg() & drm_debug() are interposed with macros basic: forward to renamed fn, with args preserved enabled: redirect to pr_debug, dev_dbg, with CATEGORY # format this is where drm_debug_enabled() is avoided. prefix is prepended at compile-time, no category at runtime. D. API[1] uses DRM_DBG_CAT_s these already use (C), now they use (B) too, to get the correct token type for "basic" and "dyndbg" configs. NOTES: Code Review is expected to catch lack of correspondence between bit=>prefix definitions (the selector) and the prefixes used in the API[1] layer above pr_debug() I've coded the search-prefixes/categories with a trailing space, which excludes any sub-categories added later. This convention protects any "drm:atomic:fail:" callsites from getting stomped on by `echo 0 > debug`. Other categories could differ, but we need some default. Dyndbg requires that the prefix be in the compiled-in format string; run-time prefixing evades callsite selection by category. pr_debug("%s: ...", __func__, ...) // not ideal With "lineno X" in a query, its possible to enable single callsites, but it is tedious, and useless in a category context. Unfortunately __func__ is not a macro, and cannot be catenated at preprocess/compile time. pr_debug("Entry: ...") // +fml gives useful log-info pr_debug("Exit: ...") // hard to catch them all But "func foo" added to query-command would work, should it be useful enough to justify extending the declarative interface. --- v4+: . use DEFINE_DYNAMIC_DEBUG_CATEGORIES in drm_print.c . s/DRM_DBG_CLASS_/DRM_DBG_CAT_/ - dont need another term . default=y in KBuild entry - per @DanVet . move some commit-log prose to dyndbg commit . add-prototyes to (param_get/set)_dyndbg . more wrinkles found by . 
relocate ratelimit chunk from elsewhere . add kernel doc Signed-off-by: Jim Cromie --- drivers/gpu/drm/Kconfig | 13 drivers/gpu/drm/drm_print.c | 49 + include/drm/drm_print.h | 141 3 files changed, 159 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 7ff89690a976..97e38d86fd27 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -57,6 +57,19 @@ config DRM_DEBUG_MM If in doubt, say "N". +config DRM_USE_DYNAMIC_DEBUG + bool "use dynamic debug to implement drm.debug" + default y + depends on DRM + depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE + depends on JUMP_LABEL + help + The "basic" drm.debug facility does a lot of
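
The category-as-prefix scheme from this commit message can be sketched in plain user-space C. This is only an illustration under stated assumptions: `DEMO_CAT_*`, `DEMO_FMT()` and `format_matches()` are hypothetical names, not the kernel's macros, and `format_matches()` merely stands in for a dyndbg `format "^prefix"` query.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical category prefixes; note the trailing space. */
#define DEMO_CAT_KMS    "drm:kms: "
#define DEMO_CAT_ATOMIC "drm:atomic: "

/*
 * Adjacent string literals are concatenated by the compiler, so the
 * category ends up inside the format string itself. Nothing is
 * prepended at run time.
 */
#define DEMO_FMT(cat, fmt) cat fmt

/* Stand-in for a 'format "^prefix"' query: an anchored prefix match. */
static bool format_matches(const char *format, const char *prefix)
{
	assert(format != NULL && prefix != NULL);
	return strncmp(format, prefix, strlen(prefix)) == 0;
}
```

With the trailing space, a query for "drm:atomic: " leaves a "drm:atomic:fail: " callsite alone, while a query for "drm:atomic:" selects both.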
[PATCH v5 8/9] amdgpu_ucode: reduce number of pr_debug calls
There are blocks of DRM_DEBUG calls, consolidate their args into single calls. With dynamic-debug in use, each callsite consumes 56 bytes of ro callsite data, and this patch removes about 65 calls, so it saves ~3.5kb. no functional changes. Signed-off-by: Jim Cromie --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 -- 1 file changed, 158 insertions(+), 135 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c index 2834981f8c08..14a9fef1f4c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c @@ -30,17 +30,26 @@ static void amdgpu_ucode_print_common_hdr(const struct common_firmware_header *hdr) { - DRM_DEBUG("size_bytes: %u\n", le32_to_cpu(hdr->size_bytes)); - DRM_DEBUG("header_size_bytes: %u\n", le32_to_cpu(hdr->header_size_bytes)); - DRM_DEBUG("header_version_major: %u\n", le16_to_cpu(hdr->header_version_major)); - DRM_DEBUG("header_version_minor: %u\n", le16_to_cpu(hdr->header_version_minor)); - DRM_DEBUG("ip_version_major: %u\n", le16_to_cpu(hdr->ip_version_major)); - DRM_DEBUG("ip_version_minor: %u\n", le16_to_cpu(hdr->ip_version_minor)); - DRM_DEBUG("ucode_version: 0x%08x\n", le32_to_cpu(hdr->ucode_version)); - DRM_DEBUG("ucode_size_bytes: %u\n", le32_to_cpu(hdr->ucode_size_bytes)); - DRM_DEBUG("ucode_array_offset_bytes: %u\n", - le32_to_cpu(hdr->ucode_array_offset_bytes)); - DRM_DEBUG("crc32: 0x%08x\n", le32_to_cpu(hdr->crc32)); + DRM_DEBUG("size_bytes: %u\n" + "header_size_bytes: %u\n" + "header_version_major: %u\n" + "header_version_minor: %u\n" + "ip_version_major: %u\n" + "ip_version_minor: %u\n" + "ucode_version: 0x%08x\n" + "ucode_size_bytes: %u\n" + "ucode_array_offset_bytes: %u\n" + "crc32: 0x%08x\n", + le32_to_cpu(hdr->size_bytes), + le32_to_cpu(hdr->header_size_bytes), + le16_to_cpu(hdr->header_version_major), + le16_to_cpu(hdr->header_version_minor), + le16_to_cpu(hdr->ip_version_major), + le16_to_cpu(hdr->ip_version_minor), + 
le32_to_cpu(hdr->ucode_version), + le32_to_cpu(hdr->ucode_size_bytes), + le32_to_cpu(hdr->ucode_array_offset_bytes), + le32_to_cpu(hdr->crc32)); } void amdgpu_ucode_print_mc_hdr(const struct common_firmware_header *hdr) @@ -55,9 +64,9 @@ void amdgpu_ucode_print_mc_hdr(const struct common_firmware_header *hdr) const struct mc_firmware_header_v1_0 *mc_hdr = container_of(hdr, struct mc_firmware_header_v1_0, header); - DRM_DEBUG("io_debug_size_bytes: %u\n", - le32_to_cpu(mc_hdr->io_debug_size_bytes)); - DRM_DEBUG("io_debug_array_offset_bytes: %u\n", + DRM_DEBUG("io_debug_size_bytes: %u\n" + "io_debug_array_offset_bytes: %u\n", + le32_to_cpu(mc_hdr->io_debug_size_bytes), le32_to_cpu(mc_hdr->io_debug_array_offset_bytes)); } else { DRM_ERROR("Unknown MC ucode version: %u.%u\n", version_major, version_minor); @@ -82,13 +91,17 @@ void amdgpu_ucode_print_smc_hdr(const struct common_firmware_header *hdr) switch (version_minor) { case 0: v2_0_hdr = container_of(hdr, struct smc_firmware_header_v2_0, v1_0.header); - DRM_DEBUG("ppt_offset_bytes: %u\n", le32_to_cpu(v2_0_hdr->ppt_offset_bytes)); - DRM_DEBUG("ppt_size_bytes: %u\n", le32_to_cpu(v2_0_hdr->ppt_size_bytes)); + DRM_DEBUG("ppt_offset_bytes: %u\n" + "ppt_size_bytes: %u\n", + le32_to_cpu(v2_0_hdr->ppt_offset_bytes), + le32_to_cpu(v2_0_hdr->ppt_size_bytes)); break; case 1: v2_1_hdr = container_of(hdr, struct smc_firmware_header_v2_1, v1_0.header); - DRM_DEBUG("pptable_count: %u\n", le32_to_cpu(v2_1_hdr->pptable_count)); - DRM_DEBUG("pptable_entry_offset: %u\n", le32_to_cpu(v2_1_hdr->pptable_entry_offset)); + DRM_DEBUG("pptable_count: %u\n" + "pptable_entry_offset: %u\n", + le32_to_cpu(v2_1_hdr->pptable_count), + le32_to_cpu(v2_1_hdr->pptable_entry_offset)); break; default: break; @@ -111,10 +124,12 @@ void amdgpu_ucode_print_gfx_hdr(const struct common_firmware_header *hdr)
[PATCH v5 9/9] dyndbg: RFC add tracer facility RFC
Sean Paul seanp...@chromium.org proposed, in https://patchwork.freedesktop.org/series/78133/ drm/trace: Mirror DRM debug logs to tracefs That patchset's goal is to be able to duplicate the debug stream to a tracing destination, by splitting drm_debug_enabled() into syslog & trace flavors, and enabling them separately. That clashes rather deeply with this patchset's goal; to avoid drm_debug_enabled() using dyndbg. Instead, this puts the print-to-trace decision in dyndbg, after the is-it-enabled test (which is a noop), so it has near zero additional cost (other than memory increase); the print-to-trace test is only done on enabled callsites. The basic elements: - add a new struct _ddebug member: (*tracer)(char *format, ...) - add a new T flag to enable tracer - adjust the static-key-enable/disable condition for (p|T) - if (p) before printk, since T enables too. - if (T) call tracer if its true = int dynamic_debug_register_tracer(query, modname, tracer); = int dynamic_debug_unregister_tracer(query, modname, cookie); This new interface lets clients set/unset a tracer function on each callsite matching the query, for example: "drm:atomic:fail:". Clients are expected to unregister the same callsites they register (a cookie), allowing protection of each client's dyndbg-state setup against overwrites by others. Intended Behavior: (things are in flux, RFC) - register sets empty slot, doesnt overwrite the query selects callsites, and sets +T (grammar requires +-action) - register allows same-tracer over-set wo warn 2nd register can then enable superset, subset, disjoint set - unregister clears slot if it matches cookie/tracer query selects set, -T (as tested) tolerates over-clears - dd-exec-queries(+/-T) can modify the enablements not sure its needed, but it falls out.. The code is currently in-line in ddebug_change(), to be moved to separate fn, rc determines flow, may also veto/alter changes by altering flag-settings - tbd. 
TBD: Im not sure what happens when exec-queries(+T) hits a site wo a tracer (silence I think. maybe not ideal). internal call-chain gets a tracer param: New arg: public: dynamic_debug_exec_queries ro-string copy moved ... 1 ddebug_exec_queries tracer=NULL ... to here 2 ddebug_exec_query tracer=NULL call-chain gets (re)used: with !NULL public: dynamic_debug_register_tracer tracer=client's w ro-string 1 ddebug_exec_queries tracer ... SELFTEST: test_dynamic_debug.ko: Uses the tracer facility to do a selftest: - A custom tracer counts the number of calls (of T-enabled pr_debugs), - do_debugging(x) calls a set of categorized pr_debugs x times - test registers the tracer on the function, then iteratively: manipulates dyndbg states via query-cmds runs do_debugging() counts enabled callsite executions reports mismatches - modprobe test_dynamic_debug use_bad_tracer=1 attaches a bad/recursive tracer Bad Things Happen. has thrown me interesting panics. NOTES: This needs more work. RFC. ERRORS (or WARNINGS): It should be an error to +T a callsite which has no aux_print set (ie already registered with a query that selected that callsite). This tacitly enforces registration. Then +T,-T can toggle those aux_print callsites (or subsets of them) to tailor the debug-stream for the purpose. Controlling flow is the best work limiter. --- v4+: (this patch sent after (on top of) v4) . fix "too many arguments to function", and name the args: int (*aux_print)(const char *fmt, char *prefix, char *label, void *); prefix : is a slot for dynamic_emit_prefix, or for custom buffer insert label : for builtin-caller used by drm-trace-print void* : vaf, add type constraint later. . fix printk (to syslog) needs if (+p), since +T also enables . add prototypes for un/register_aux_print . change iface names: s/aux_print/tracer/ . also s/trace_print/tracer/ . struct va_format *vaf - tighten further ? 
Signed-off-by: Jim Cromie --- include/linux/dynamic_debug.h | 32 - lib/Kconfig.debug | 10 ++ lib/Makefile | 1 + lib/dynamic_debug.c | 109 +++ lib/test_dynamic_debug.c | 247 ++ 5 files changed, 372 insertions(+), 27 deletions(-) create mode 100644 lib/test_dynamic_debug.c diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h index 42cfb37d4870..cbcb1c94cec3 100644 --- a/include/linux/dynamic_debug.h +++ b/include/linux/dynamic_debug.h @@ -20,6 +20,7 @@ struct _ddebug { const char *function; const char *filename; const char *format; + int (*tracer)(const char *fmt, char *prefix, char *label, struct va_format *vaf); unsigned int lineno:18; /* * The flags field controls the behaviour at the callsite. @@ -27,7 +28,11 @@ struct _ddebug { * writes commands
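
The p/T flag handling and the register/unregister rules listed in the commit message can be sketched in user space as follows. This is one reading of the "Intended Behavior" list, not the patch's code: `struct demo_callsite`, `DEMO_FLAG_P`, `DEMO_FLAG_T` and the `demo_*()` functions are hypothetical, and the tracer signature is simplified to a single argument.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FLAG_P 0x1u	/* print to the log */
#define DEMO_FLAG_T 0x2u	/* call the registered tracer */

struct demo_callsite {
	const char *format;
	unsigned int flags;
	int (*tracer)(const char *format);
};

static int demo_log_calls;	/* counts stand-in printk calls */
static int demo_trace_calls;	/* counts tracer invocations */

static int demo_count_tracer(const char *format)
{
	(void)format;
	return ++demo_trace_calls;
}

/* A callsite is enabled if either p or T is set. */
static void demo_emit(const struct demo_callsite *cs)
{
	assert(cs != NULL);
	if (!(cs->flags & (DEMO_FLAG_P | DEMO_FLAG_T)))
		return;			/* disabled: nothing is tested further */
	if (cs->flags & DEMO_FLAG_P)
		demo_log_calls++;	/* stands in for printk */
	if ((cs->flags & DEMO_FLAG_T) && cs->tracer)
		cs->tracer(cs->format);
}

/* Register fills an empty slot or re-sets the same tracer; never overwrites. */
static int demo_register(struct demo_callsite *cs, int (*tracer)(const char *))
{
	if (cs->tracer && cs->tracer != tracer)
		return -1;
	cs->tracer = tracer;
	cs->flags |= DEMO_FLAG_T;
	return 0;
}

/* Unregister clears a matching slot and tolerates an already-empty one. */
static int demo_unregister(struct demo_callsite *cs, int (*tracer)(const char *))
{
	if (cs->tracer && cs->tracer != tracer)
		return -1;
	cs->tracer = NULL;
	cs->flags &= ~DEMO_FLAG_T;
	return 0;
}
```

In this sketch the tracer test costs nothing for disabled callsites, because it is only reached after the enabled check.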
Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects
On 8/13/2021 7:42 AM, Daniel Vetter wrote: On Fri, Aug 13, 2021 at 04:37:53PM +0200, Daniel Vetter wrote: On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote: This api allow user mode to create protected buffers and to mark contexts as making use of such objects. Only when using contexts marked in such a way is the execution guaranteed to work as expected. Contexts can only be marked as using protected content at creation time (i.e. the parameter is immutable) and they must be both bannable and not recoverable. All protected objects and contexts that have backing storage will be considered invalid when the PXP session is destroyed and all new submissions using them will be rejected. All intel contexts within the invalidated gem contexts will be marked banned. A new flag has been added to the RESET_STATS ioctl to report the context invalidation to userspace. This patch was previously sent as 2 separate patches, which have been squashed following a request to have all the uapi in a single patch. I've retained the s-o-b from both. 
v5: squash patches, rebase on proto_ctx, update kerneldoc v6: rebase on obj create_ext changes Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Bommu Krishnaiah Cc: Rodrigo Vivi Cc: Chris Wilson Cc: Lionel Landwerlin Cc: Jason Ekstrand Cc: Daniel Vetter Reviewed-by: Rodrigo Vivi #v5 --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 68 -- drivers/gpu/drm/i915/gem/i915_gem_context.h | 18 .../gpu/drm/i915/gem/i915_gem_context_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 - drivers/gpu/drm/i915/gem/i915_gem_object.c| 6 ++ drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++ .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 ++ drivers/gpu/drm/i915/pxp/intel_pxp.c | 89 +++ drivers/gpu/drm/i915/pxp/intel_pxp.h | 15 drivers/gpu/drm/i915/pxp/intel_pxp_session.c | 3 + drivers/gpu/drm/i915/pxp/intel_pxp_types.h| 5 ++ include/uapi/drm/i915_drm.h | 55 +++- 13 files changed, 371 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index cff72679ad7c..0cd3e2d06188 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -77,6 +77,8 @@ #include "gt/intel_gpu_commands.h" #include "gt/intel_ring.h" +#include "pxp/intel_pxp.h" + #include "i915_gem_context.h" #include "i915_trace.h" #include "i915_user_extensions.h" @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct drm_i915_private *i915, return 0; } +static int proto_context_set_protected(struct drm_i915_private *i915, + struct i915_gem_proto_context *pc, + bool protected) +{ + int ret = 0; + + if (!intel_pxp_is_enabled(&i915->gt.pxp)) + ret = -ENODEV; + else if (!protected) + pc->user_flags &= ~BIT(UCONTEXT_PROTECTED); + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) || +!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) + ret = -EPERM; + else + pc->user_flags |= BIT(UCONTEXT_PROTECTED); + + return ret; +} + static 
struct i915_gem_proto_context * proto_context_create(struct drm_i915_private *i915, unsigned int flags) { @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, ret = -EPERM; else if (args->value) pc->user_flags |= BIT(UCONTEXT_BANNABLE); + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) + ret = -EPERM; else pc->user_flags &= ~BIT(UCONTEXT_BANNABLE); break; @@ -693,10 +716,12 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, case I915_CONTEXT_PARAM_RECOVERABLE: if (args->size) ret = -EINVAL; - else if (args->value) - pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); - else + else if (!args->value) pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE); + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED)) + ret = -EPERM; + else + pc->user_flags |= BIT(UCONTEXT_RECOVERABLE); break; case I915_CONTEXT_PARAM_PRIORITY: @@ -724,6 +749,11 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv, args->value); break; + case I915_CONTEXT_PARAM_PROTECTED_CONTENT: + ret = proto_context_set_protected(fp
Re: [PATCH v5 2/9] moduleparam: add data member to struct kernel_param
On Fri, Aug 13, 2021 at 09:17:10AM -0600, Jim Cromie wrote:
> Add a const void* data member to the struct, to allow attaching
> private data that will be used soon by a setter method (via kp->data)
> to perform more elaborate actions.
>
> To attach the data at compile time, add new macros:
>
> module_param_cbd() derives from module_param_cb(), adding data param,
> and latter is redefined to use former.
>
> It calls __module_param_call_wdata(), which accepts a new data param
> and inits .data with it. Re-define __module_param_call() to use it.
>
> Use of this new data member will be rare, it might be worth redoing
> this as a separate/sub-type to de-bloat the base case.

...

> +#define module_param_cbd(name, ops, arg, perm, data) \
> +	__module_param_call_wdata(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, 0, data)

Cryptic name. Moreover, inconsistent with the rest. What about
module_param_cb_data() ?

>  #define module_param_cb_unsafe(name, ops, arg, perm) \
>  	__module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, \
>  			    KERNEL_PARAM_FL_UNSAFE)

(above left for the above comment)

...

> +#define __module_param_call_wdata(prefix, name, ops, arg, perm, level, flags, data) \

Similar __module_param_call_with_data()

-- 
With Best Regards,
Andy Shevchenko
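
The layering discussed above, where the data-less macro is redefined in terms of the data-carrying one, can be sketched in user space. This is a minimal illustration, not the kernel's moduleparam.h: `struct demo_param`, `DEMO_PARAM_CB()`, `DEMO_PARAM_CB_DATA()` and `demo_prefixes` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for struct kernel_param. */
struct demo_param {
	const char *name;
	int perm;
	const void *data;	/* private data for a setter callback */
};

/* Data-carrying initializer. */
#define DEMO_PARAM_CB_DATA(_name, _perm, _data) \
	{ .name = #_name, .perm = (_perm), .data = (_data) }

/* The data-less form is defined in terms of the data-carrying one. */
#define DEMO_PARAM_CB(_name, _perm) \
	DEMO_PARAM_CB_DATA(_name, _perm, NULL)

/* Example private data: a table a setter could walk. */
static const char *const demo_prefixes[] = { "drm:core: ", "drm:kms: " };

/* What a setter would read back via kp->data. */
static const void *demo_param_data(const struct demo_param *kp)
{
	assert(kp != NULL);
	return kp->data;
}
```

Existing users of the data-less form keep compiling unchanged and simply see a NULL data pointer.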
Re: [PATCH v2] drm: Copy drm_wait_vblank to user before returning
Thanks for your review Michel!

@MAINTAINER, could you please strip the Change-Id when applying?
Thanks!

On Fri, Aug 13, 2021 at 3:33 AM Michel Dänzer wrote:
>
> On 2021-08-12 9:49 p.m., Mark Yacoub wrote:
> > From: Mark Yacoub
> >
> > [Why]
> > Userspace should get back a copy of drm_wait_vblank that's been modified
> > even when drm_wait_vblank_ioctl returns a failure.
> >
> > Rationale:
> > drm_wait_vblank_ioctl modifies the request and expects the user to read
> > it back. When the type is RELATIVE, it modifies it to ABSOLUTE and updates
> > the sequence to become current_vblank_count + sequence (which was
> > RELATIVE), but now it became ABSOLUTE.
> > drmWaitVBlank (in libdrm) expects this to be the case as it modifies
> > the request to be Absolute so it expects the sequence to have been
> > updated.
> >
> > The change is in compat_drm_wait_vblank, which is called by
> > drm_compat_ioctl. This change of copying the data back regardless of the
> > return number makes it on par with drm_ioctl, which always copies the
> > data before returning.
> >
> > [How]
> > Return from the function after everything has been copied to user.
> >
> > Fixes: IGT:kms_flip::modeset-vs-vblank-race-interruptible
> > Tested on ChromeOS Trogdor(msm)
> >
> > Signed-off-by: Mark Yacoub
> > Change-Id: I98da279a5f1329c66a9d1e06b88d40b247b51313
>
> With the Gerrit Change-Id removed,
>
> Reviewed-by: Michel Dänzer
>
>
> --
> Earthling Michel Dänzer               |               https://redhat.com
> Libre software enthusiast             |             Mesa and X developer
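
The behaviour described in the quoted patch, copying the modified request back even when the call fails, can be sketched in user-space C. This is an illustration only: `struct demo_vblank_req` and the `demo_*()` functions are hypothetical stand-ins, and `memcpy()` stands in for the copy to and from user memory.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

enum demo_seq_type { DEMO_ABSOLUTE = 0, DEMO_RELATIVE = 1 };

struct demo_vblank_req {
	enum demo_seq_type type;
	unsigned int sequence;
};

/* Stand-in for the ioctl body: rewrites RELATIVE into ABSOLUTE, may fail. */
static int demo_wait_vblank(struct demo_vblank_req *req,
			    unsigned int current_count, int interrupted)
{
	if (req->type == DEMO_RELATIVE) {
		req->sequence += current_count;
		req->type = DEMO_ABSOLUTE;
	}
	return interrupted ? -EINTR : 0;
}

/* Stand-in for the compat wrapper: copy back first, then return the error. */
static int demo_compat_wait_vblank(struct demo_vblank_req *user_req,
				   unsigned int current_count, int interrupted)
{
	struct demo_vblank_req req;
	int err;

	assert(user_req != NULL);
	memcpy(&req, user_req, sizeof(req));	/* copy in */
	err = demo_wait_vblank(&req, current_count, interrupted);
	memcpy(user_req, &req, sizeof(req));	/* copy out, even on error */
	return err;
}
```

Because the caller sees the ABSOLUTE request after an interrupted call, a retry does not add the current count a second time.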
Re: [PATCH 10/64] lib80211: Use struct_group() for memcpy() region
On Fri, Aug 13, 2021 at 10:04:09AM +0200, Johannes Berg wrote: > On Tue, 2021-07-27 at 13:58 -0700, Kees Cook wrote: > > > > +++ b/include/linux/ieee80211.h > > @@ -297,9 +297,11 @@ static inline u16 ieee80211_sn_sub(u16 sn1, u16 sn2) > > struct ieee80211_hdr { > > __le16 frame_control; > > __le16 duration_id; > > - u8 addr1[ETH_ALEN]; > > - u8 addr2[ETH_ALEN]; > > - u8 addr3[ETH_ALEN]; > > + struct_group(addrs, > > + u8 addr1[ETH_ALEN]; > > + u8 addr2[ETH_ALEN]; > > + u8 addr3[ETH_ALEN]; > > + ); > > __le16 seq_ctrl; > > u8 addr4[ETH_ALEN]; > > } __packed __aligned(2); > > This file isn't really just lib80211, it's also used by everyone else > for 802.11, but I guess that's OK - after all, this doesn't really > result in any changes here. > > > +++ b/net/wireless/lib80211_crypt_ccmp.c > > @@ -136,7 +136,8 @@ static int ccmp_init_iv_and_aad(const struct > > ieee80211_hdr *hdr, > > pos = (u8 *) hdr; > > aad[0] = pos[0] & 0x8f; > > aad[1] = pos[1] & 0xc7; > > - memcpy(aad + 2, hdr->addr1, 3 * ETH_ALEN); > > + BUILD_BUG_ON(sizeof(hdr->addrs) != 3 * ETH_ALEN); > > + memcpy(aad + 2, &hdr->addrs, ETH_ALEN); > > > However, how is it you don't need the same change in net/mac80211/wpa.c? > > We have three similar instances: > > /* AAD (extra authenticate-only data) / masked 802.11 header > * FC | A1 | A2 | A3 | SC | [A4] | [QC] */ > put_unaligned_be16(len_a, &aad[0]); > put_unaligned(mask_fc, (__le16 *)&aad[2]); > memcpy(&aad[4], &hdr->addr1, 3 * ETH_ALEN); > > > and > > memcpy(&aad[4], &hdr->addr1, 3 * ETH_ALEN); > > and > > memcpy(aad + 2, &hdr->addr1, 3 * ETH_ALEN); > > so those should also be changed, it seems? Ah! Yes, thanks for pointing this out. During earlier development I split the "cross-field write" changes from the "cross-field read" changes, and it looks like I missed moving lib80211_crypt_ccmp.c into that portion of the series (which I haven't posted nor finished -- it's lower priority than fixing the cross-field writes). 
> In which case I'd probably prefer to do this separately from the staging > drivers ... Agreed. Sorry for the noise on that part. I will double-check the other patches. -- Kees Cook
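
The idea behind grouping the three address fields so that one copy covers them can be sketched in user-space C. This is a simplified approximation, not the kernel's struct_group() definition: `DEMO_STRUCT_GROUP()`, `struct demo_hdr` and `demo_copy_addrs()` are hypothetical, and the sketch relies on C11 anonymous structs and unions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/*
 * Simplified stand-in for struct_group(): the members stay reachable
 * by their own names and are also reachable as one named sub-struct.
 */
#define DEMO_STRUCT_GROUP(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

struct demo_hdr {
	uint16_t frame_control;
	uint16_t duration_id;
	DEMO_STRUCT_GROUP(addrs,
		uint8_t addr1[DEMO_ETH_ALEN];
		uint8_t addr2[DEMO_ETH_ALEN];
		uint8_t addr3[DEMO_ETH_ALEN];
	);
	uint16_t seq_ctrl;
};

/* Compile-time check, in the spirit of BUILD_BUG_ON(). */
static_assert(sizeof(((struct demo_hdr *)0)->addrs) == 3 * DEMO_ETH_ALEN,
	      "addrs must span exactly the three addresses");

/* Copy all three addresses; the size comes from the group itself. */
static void demo_copy_addrs(uint8_t *dst, const struct demo_hdr *hdr)
{
	memcpy(dst, &hdr->addrs, sizeof(hdr->addrs));
}
```

The copy now names a single object whose size the compiler knows, instead of starting at addr1 and running past its end.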
Re: [PATCH 1/2] drm: avoid races with modesetting rights
On Fri, Aug 13, 2021 at 04:54:49PM +0800, Desmond Cheong Zhi Xi wrote: > In drm_client_modeset.c and drm_fb_helper.c, > drm_master_internal_{acquire,release} are used to avoid races with DRM > userspace. These functions hold onto drm_device.master_mutex while > committing, and bail if there's already a master. > > However, ioctls can still race between themselves. A > time-of-check-to-time-of-use error can occur if an ioctl that changes > the modeset has its rights revoked after it validates its permissions, > but before it completes. > > There are three ioctls that can affect modesetting permissions: > > - DROP_MASTER ioctl removes rights for a master and its leases > > - REVOKE_LEASE ioctl revokes rights for a specific lease > > - SET_MASTER ioctl sets the device master if the master role hasn't > been acquired yet > > All these races can be avoided by introducing an SRCU that acts as a > barrier for ioctls that can change modesetting permissions. Processes > that perform modesetting should hold a read lock on the new > drm_device.master_barrier_srcu, and ioctls that change these > permissions should call synchronize_srcu before returning. > > This ensures that any process that might have seen old permissions are > flushed out before DROP_MASTER/REVOKE_LEASE/SET_MASTER ioctls return > to userspace. > > Reported-by: Daniel Vetter > Signed-off-by: Desmond Cheong Zhi Xi This looks pretty solid, but I think there's one gap where we can still race. Scenario. Process A has a drm fd with master rights and two threads: - thread 1 does a long-running display operation (like a modeset or whatever) - thread 2 does a drop-master Then we start a new process B, which acquires master in drm_open (there is no other one left). This is like setmaster ioctl, but your DRM_MASTER_FLUSH bit doesn't work there. The other thing is that for modeset stuff (which this all is) srcu is probably massive overkill, and a simple rwsem should be good enough too. 
Maybe even better, since the rwsem guarantees that no new reader can start
once you try to acquire the write side.

Finally, and this is a bit of a bikeshed: I don't much like how
DRM_MASTER_FLUSH leaks the need of these very few places into the very core
drm_ioctl function. One idea I had was to use task_work in a special
function, roughly

void master_flush()
{
	down_write(master_rwsem);
	up_write(master_rwsem);
}

void drm_master_flush()
{
	init_task_work(fpriv->master_flush_work, master_flush);
	task_work_add(fpriv->master_flush_work);
	/* if task_work_add fails we're exiting, at which point the lack
	 * of master flush doesn't matter */
}

And maybe put a comment above the function explaining why and how this
works.

We could even do a drm_master_unlock_and_flush helper, since that's really
what everyone wants, and it would make it very clear which master state
changes need this flush. Instead of setting a flag bit in an ioctl table
very far away ...

Thoughts?
-Daniel

> ---
>  drivers/gpu/drm/drm_auth.c           | 17 ++---
>  drivers/gpu/drm/drm_client_modeset.c | 10 ++
>  drivers/gpu/drm/drm_drv.c            |  2 ++
>  drivers/gpu/drm/drm_fb_helper.c      | 20
>  drivers/gpu/drm/drm_internal.h       |  5 +++--
>  drivers/gpu/drm/drm_ioctl.c          | 25 +
>  include/drm/drm_device.h             | 11 +++
>  include/drm/drm_ioctl.h              |  7 +++
>  8 files changed, 76 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 60a6b21474b1..004506608e76 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -29,6 +29,7 @@
>   */
>
>  #include
> +#include
>
>  #include
>  #include
> @@ -448,21 +449,31 @@ void drm_master_put(struct drm_master **master)
>  EXPORT_SYMBOL(drm_master_put);
>
>  /* Used by drm_client and drm_fb_helper */
> -bool drm_master_internal_acquire(struct drm_device *dev)
> +bool drm_master_internal_acquire(struct drm_device *dev, int *idx)
>  {
> +	*idx = srcu_read_lock(&dev->master_barrier_srcu);
> +
>  	mutex_lock(&dev->master_mutex);
>  	if (dev->master) {
>  		mutex_unlock(&dev->master_mutex);
> +		srcu_read_unlock(&dev->master_barrier_srcu, *idx);
>  		return false;
>  	}
> +	mutex_unlock(&dev->master_mutex);
>
>  	return true;
>  }
>  EXPORT_SYMBOL(drm_master_internal_acquire);
>
>  /* Used by drm_client and drm_fb_helper */
> -void drm_master_internal_release(struct drm_device *dev)
> +void drm_master_internal_release(struct drm_device *dev, int idx)
>  {
> -	mutex_unlock(&dev->master_mutex);
> +	srcu_read_unlock(&dev->master_barrier_srcu, idx);
>  }
>  EXPORT_SYMBOL(drm_master_internal_release);
> +
> +/* Used by drm_ioctl */
> +void drm_master_flush(struct drm_device *dev)
> +{
> +	synchronize_srcu(&dev->master_barrier_srcu);
> +}
> diff --git a/drive
Re: [PATCH 2/2] drm: unexport drm_ioctl_permit
On Fri, Aug 13, 2021 at 04:54:50PM +0800, Desmond Cheong Zhi Xi wrote: > Since the last user of drm_ioctl_permit was removed, and it's now only > used in drm_ioctl.c, unexport the symbol. > > Reported-by: Daniel Vetter > Signed-off-by: Desmond Cheong Zhi Xi Applied to drm-misc-next for 5.16, thanks for your patch. -Daniel > --- > drivers/gpu/drm/drm_ioctl.c | 15 +-- > include/drm/drm_ioctl.h | 1 - > 2 files changed, 1 insertion(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c > index eb4ec3fab7d1..fe271f6f96ab 100644 > --- a/drivers/gpu/drm/drm_ioctl.c > +++ b/drivers/gpu/drm/drm_ioctl.c > @@ -522,19 +522,7 @@ int drm_version(struct drm_device *dev, void *data, > return err; > } > > -/** > - * drm_ioctl_permit - Check ioctl permissions against caller > - * > - * @flags: ioctl permission flags. > - * @file_priv: Pointer to struct drm_file identifying the caller. > - * > - * Checks whether the caller is allowed to run an ioctl with the > - * indicated permissions. > - * > - * Returns: > - * Zero if allowed, -EACCES otherwise. 
> - */ > -int drm_ioctl_permit(u32 flags, struct drm_file *file_priv) > +static int drm_ioctl_permit(u32 flags, struct drm_file *file_priv) > { > /* ROOT_ONLY is only for CAP_SYS_ADMIN */ > if (unlikely((flags & DRM_ROOT_ONLY) && !capable(CAP_SYS_ADMIN))) > @@ -557,7 +545,6 @@ int drm_ioctl_permit(u32 flags, struct drm_file > *file_priv) > > return 0; > } > -EXPORT_SYMBOL(drm_ioctl_permit); > > #define DRM_IOCTL_DEF(ioctl, _func, _flags) \ > [DRM_IOCTL_NR(ioctl)] = { \ > diff --git a/include/drm/drm_ioctl.h b/include/drm/drm_ioctl.h > index 13a68cdcea36..fd29842127e5 100644 > --- a/include/drm/drm_ioctl.h > +++ b/include/drm/drm_ioctl.h > @@ -174,7 +174,6 @@ struct drm_ioctl_desc { > .name = #ioctl \ > } > > -int drm_ioctl_permit(u32 flags, struct drm_file *file_priv); > long drm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg); > long drm_ioctl_kernel(struct file *, drm_ioctl_t, void *, u32); > #ifdef CONFIG_COMPAT > -- > 2.25.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v5 3/9] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks
On Fri, Aug 13, 2021 at 09:17:11AM -0600, Jim Cromie wrote: > DEFINE_DYNAMIC_DEBUG_CATEGORIES(name, var, bitmap_desc, @bit_descs) > allows users to define a drm.debug style (bitmap) sysfs interface, and > to specify the desired mapping from bits[0-N] to the format-prefix'd > pr_debug()s to be controlled. > > DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, > "i915/gvt bitmap desc", > /** >* search-prefixes, passed to dd-exec_queries >* defines bits 0-N in order. >* leading ^ is tacitly inserted (by callback currently) >* trailing space used here excludes subcats. >* helper macro needs more work >* macro to autogen ++$i, 0x%x$i ? >*/ > _DD_cat_("gvt:cmd: "), > _DD_cat_("gvt:core: "), > _DD_cat_("gvt:dpy: "), > _DD_cat_("gvt:el: "), > _DD_cat_("gvt:irq: "), > _DD_cat_("gvt:mm: "), > _DD_cat_("gvt:mmio: "), > _DD_cat_("gvt:render: "), > _DD_cat_("gvt:sched: ")); > > dynamic_debug.c: add 3 new elements: > > - int param_set_dyndbg() > - int param_get_dyndbg() > - struct kernel_param_ops param_ops_dyndbg > > Following the model of kernel/params.c STANDARD_PARAM_DEFS, All 3 are > non-static and exported. > > dynamic_debug.h: > > Add DEFINE_DYNAMIC_DEBUG_CATEGORIES() described above, and a do-nothing stub. > > Note that it also calls MODULE_PARM_DESC for the user, but expects the > user to catenate all the bit-descriptions together (as is done in > drm.debug), and in the following uses in amdgpu, i915. > > This in the hope that someone can offer an auto-incrementing > label-generating macro, producing "\tbit-4 0x10\t" etc, and can show > how to apply it to __VA_ARGS__. > > Also extern the struct kernel_param param_ops_dyndbg symbol, as is > done in moduleparams.h for all the STANDARD params. > > USAGE NOTES: > > Using dyndbg to query on "format ^$prefix" requires that the prefix be > present in the compiled-in format string; where run-time prefixing is > used, that format would be "%s...", which is not usefully selectable. 
> > Adding structural query terms (func,file,lineno) could help (module is > already done), but DEFINE_DYNAMIC_DEBUG_CATEGORIES can't do that now, > adding it needs a better reason imo. > > Dyndbg is completely agnostic wrt the categorization scheme used, to > play well with any prefix convention already in use. Ad-hoc > categories and sub-categories are implicitly allowed, author > discipline and review is expected. > > Here are some examples: > > "1","2","3" 2 doesnt imply 1. > otherwize, sorta like printk levels > "1:","2:","3:"are better, avoiding [1-9]\d+ ambiguity > "hi:","mid:","low:" are reasonable, and imply independence > "todo:","rfc:"might be handy > "A:".."Z:"uhm, yeah > > Hierarchical classes/categories are natural: > > "drm::" is used in later commit > "drm:::"is a natural extension. > "drm:atomic:fail:"has been proposed, sounds directly useful > > Some properties of a hierarchical category deserve explication: > > Trailing spaces matter ! > > With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the > ":" doesnt terminate the search-space, the trailing space does. > So a "drm:" search specification will match all DRM categories & > subcategories, and will not be useful in an interface where all > categories are controlled together. That said, "drm:atomic:" & > "drm:atomic: " are different, and both are useful in cases. > > Ad-Hoc sub-categories: > > These have a caveat wrt wrapper macros adding prefixes like > "drm:atomic: "; the trailing space in the prefix means that > drm_dbg("fail: ...") renders as "drm:atomic: fail: ", which obviously > isn't ideal wrt clear and simple bitmaps. > > A possible solution is to have a FOO_() version of every FOO() macro > which (anti-mnemonically) elides the trailing space, which is normally > inserted by a modified FOO(). Doing this would enforce a policy > decision that "debug categories will be space terminated", with an > pressure-relief valve. 
> > Summarizing: > > - "drm:kms: " & "drm:kms:" are different > - "drm:kms" also different - includes drm:kms2: > - "drm:kms:\t" also different > - "drm:kms:*"doesnt work, no wildcard on format atm. > > Order matters in DEFINE_DYNAMIC_DEBUG_CATEGORIES(... @bit_descs) > > @bit_descs (array) position determines the bit mapping to the prefix, > so to keep a stable map, new categories or 3rd level categories must > be added to the end. > > Since bits are/will-stay applied 0-N, the later bits can countermand > the earlier ones, but its tricky - consider; > > DD_CATs(... "drm:atomic:", ""drm:atomic:fail:" ) // misleading > > The 1st search-term is misleading, because it includes (modifies) > subcategories, but then 2nd overrides it. So don't do that. > > There is still plenty of bikeshedding to
Re: [PATCH] drm/amdgpu: Cancel delayed work when GFXOFF is disabled
On 2021-08-13 5:07 p.m., Lazar, Lijo wrote: > > > On 8/13/2021 8:10 PM, Michel Dänzer wrote: >> On 2021-08-13 4:14 p.m., Lazar, Lijo wrote: >>> On 8/13/2021 7:04 PM, Michel Dänzer wrote: On 2021-08-13 1:50 p.m., Lazar, Lijo wrote: > On 8/13/2021 3:59 PM, Michel Dänzer wrote: >> From: Michel Dänzer >> >> schedule_delayed_work does not push back the work if it was already >> scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms >> after the first time GFXOFF was disabled and re-enabled, even if GFXOFF >> was disabled and re-enabled again during those 100 ms. >> >> This resulted in frame drops / stutter with the upcoming mutter 41 >> release on Navi 14, due to constantly enabling GFXOFF in the HW and >> disabling it again (for getting the GPU clock counter). >> >> To fix this, call cancel_delayed_work_sync when GFXOFF transitions from >> enabled to disabled. This makes sure the delayed work will be scheduled >> as intended in the reverse case. >> >> In order to avoid a deadlock, amdgpu_device_delay_enable_gfx_off needs >> to use mutex_trylock instead of mutex_lock. >> >> v2: >> * Use cancel_delayed_work_sync & mutex_trylock instead of >> mod_delayed_work. 
>> >> Signed-off-by: Michel Dänzer >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 11 ++- >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 13 +++-- >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 3 +++ >> 3 files changed, 20 insertions(+), 7 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index f3fd5ec710b6..8b025f70706c 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -2777,7 +2777,16 @@ static void >> amdgpu_device_delay_enable_gfx_off(struct work_struct *work) >> struct amdgpu_device *adev = >> container_of(work, struct amdgpu_device, >> gfx.gfx_off_delay_work.work); >> - mutex_lock(&adev->gfx.gfx_off_mutex); >> + /* mutex_lock could deadlock with cancel_delayed_work_sync in >> amdgpu_gfx_off_ctrl. */ >> + if (!mutex_trylock(&adev->gfx.gfx_off_mutex)) { >> + /* If there's a bug which causes amdgpu_gfx_off_ctrl to be >> called with enable=true >> + * when adev->gfx.gfx_off_req_count is already 0, we might race >> with that. >> + * Re-schedule to make sure gfx off will be re-enabled in the >> HW eventually. >> + */ >> + schedule_delayed_work(&adev->gfx.gfx_off_delay_work, >> AMDGPU_GFX_OFF_DELAY_ENABLE); >> + return; > > This is not needed and is just creating another thread to contend for > mutex. Still not sure what you mean by that. What other thread? >>> >>> Sorry, I meant it schedules another workitem and delays GFXOFF enablement >>> further. For ex: if it was another function like gfx_off_status holding the >>> lock at the time of check. >>> > The checks below take care of enabling gfxoff correctly. If it's already > in gfx_off state, it doesn't do anything. So I don't see why this change > is needed. mutex_trylock is needed to prevent the deadlock discussed before and below. schedule_delayed_work is needed due to this scenario hinted at by the comment: 1. 
amdgpu_gfx_off_ctrl locks mutex, calls schedule_delayed_work
2. amdgpu_device_delay_enable_gfx_off runs, calls mutex_trylock, which fails

GFXOFF would never get re-enabled in HW in this case (until amdgpu_gfx_off_ctrl calls schedule_delayed_work again).

(cancel_delayed_work_sync guarantees there's no pending delayed work when it returns, even if amdgpu_device_delay_enable_gfx_off calls schedule_delayed_work)

>>>
>>> I think we need to explain based on the original code before. There is an
>>> assumption here that the only other contention of this mutex is with the
>>> gfx_off_ctrl function.
>>
>> Not really.
>>
>>
>>> As far as I understand if the work has already started running when
>>> schedule_delayed_work is called, it will insert another in the work queue
>>> after delay. Based on that understanding I didn't find a problem with the
>>> original code.
>>
>> Original code as in without this patch or the mod_delayed_work patch? If so,
>> the problem is not when the work has already started running. It's that when
>> it hasn't started running yet, schedule_delayed_work doesn't change the
>> timeout for the already scheduled work, so it ends up enabling GFXOFF
>> earlier than intended (and thus at all in scenarios when it's not supposed
>> to).
>>
>
> I meant the original implementation of amdgpu_device_delay_enable_gfx_off().
>
>
> If you indeed w
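For readers following the thread, the timing problem from the commit message can be illustrated with a toy userspace model. This is only a sketch and not kernel code: toy_schedule() and toy_cancel() are hypothetical stand-ins that mimic the behaviour discussed above, namely that schedule_delayed_work() does nothing when the work is already pending, and that cancelling first lets the next schedule call start a fresh delay.

```c
#include <stdbool.h>

/* Toy userspace model of a delayed work item; not kernel code. */
struct toy_delayed_work {
	bool pending;           /* timer armed, callback not yet run */
	unsigned long expires;  /* absolute time (ms) at which the callback runs */
};

/* Mimics schedule_delayed_work(): a no-op if the work is already pending. */
static bool toy_schedule(struct toy_delayed_work *w, unsigned long now,
			 unsigned long delay)
{
	if (w->pending)
		return false;   /* existing expiry is left untouched */
	w->pending = true;
	w->expires = now + delay;
	return true;
}

/* Mimics cancel_delayed_work_sync(): reports whether work was pending. */
static bool toy_cancel(struct toy_delayed_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Expiry time after GFXOFF is re-enabled at t=0 and again at t=60. */
static unsigned long toy_expiry_after_toggle(bool cancel_on_disable)
{
	struct toy_delayed_work w = { false, 0 };

	toy_schedule(&w, 0, 100);       /* t=0: re-enabled, arm a 100 ms delay */
	if (cancel_on_disable)
		toy_cancel(&w);         /* t=50: disabled again, drop the work */
	toy_schedule(&w, 60, 100);      /* t=60: re-enabled once more */
	return w.expires;
}
```

Without the cancel, the second toy_schedule() call is ignored and the work still fires at t=100, only 40 ms after the last re-enable. With the cancel, the delay restarts and the work fires at t=160.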
Re: [PATCH 39/64] mac80211: Use memset_after() to clear tx status
On Fri, Aug 13, 2021 at 09:40:07AM +0200, Johannes Berg wrote:
> On Sat, 2021-07-31 at 08:55 -0700, Kees Cook wrote:
> > On Tue, Jul 27, 2021 at 01:58:30PM -0700, Kees Cook wrote:
> > > In preparation for FORTIFY_SOURCE performing compile-time and run-time
> > > field bounds checking for memset(), avoid intentionally writing across
> > > neighboring fields.
> > >
> > > Use memset_after() so memset() doesn't get confused about writing
> > > beyond the destination member that is intended to be the starting point
> > > of zeroing through the end of the struct.
> > >
> > > Note that the common helper, ieee80211_tx_info_clear_status(), does NOT
> > > clear ack_signal, but the open-coded versions do. All three perform
> > > checks that the ack_signal position hasn't changed, though.
> >
> > Quick ping on this question: there is a mismatch between the common
> > helper and the other places that do this. Is there a bug here?
>
> Yes.
>
> The common helper should also clear ack_signal, but that was broken by
> commit e3e1a0bcb3f1 ("mac80211: reduce IEEE80211_TX_MAX_RATES"), because
> that commit changed the order of the fields and updated carl9170 and p54
> properly but not the common helper...

It looks like p54 actually uses the rates, which is why it does this manually. I can't see why carl9170 does this manually, though.

> It doesn't actually matter much because ack_signal is normally filled in
> afterwards, and even if it isn't, it's just for statistics.
>
> The correct thing to do here would be to
>
> memset_after(&info->status, 0, rates);

Sounds good; I will adjust these (and drop the BUILD_BUG_ONs, as you suggest in the next email).

Thanks!

-Kees

--
Kees Cook
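As a rough illustration of what "clear everything after a member" means, here is a userspace sketch. The names are assumptions made for the example: struct toy_status is a made-up layout (not the mac80211 one), and TOY_MEMSET_AFTER() is a simplified stand-in that takes the type explicitly instead of using typeof() like the kernel helper does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status struct for illustration; not the mac80211 layout. */
struct toy_status {
	int rates[4];
	int ack_signal;
	int ampdu_len;
};

/* Offset of the first byte past 'member' within 'type'. */
#define TOY_OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Set every byte after 'member' through the end of *obj to 'v'. */
#define TOY_MEMSET_AFTER(obj, type, v, member) \
	memset((unsigned char *)(obj) + TOY_OFFSETOFEND(type, member), (v), \
	       sizeof(type) - TOY_OFFSETOFEND(type, member))

/* Keeps 'rates' intact and zeroes ack_signal and everything after it. */
static struct toy_status toy_clear_status(void)
{
	struct toy_status s = { { 1, 2, 3, 4 }, -42, 7 };

	TOY_MEMSET_AFTER(&s, struct toy_status, 0, rates);
	return s;
}
```

Anchoring the clear on the last member to keep (rates here) means a later field reorder cannot silently leave ack_signal out, which is the failure mode described above.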
Re: [PATCH v5 3/9] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks
On Fri, Aug 13, 2021 at 06:51:05PM +0300, Andy Shevchenko wrote: > On Fri, Aug 13, 2021 at 09:17:11AM -0600, Jim Cromie wrote: > > +int param_set_dyndbg(const char *instr, const struct kernel_param *kp) > > +{ > > + unsigned long inbits; > > + int rc, i, chgct = 0, totct = 0; > > + char query[OUR_QUERY_SIZE]; > > + struct dyndbg_bitdesc *bitmap = (struct dyndbg_bitdesc *) kp->data; > > So you need space after ')' ? More importantly, if ->data is of type 'void *', it is bad style to cast the pointer at all. I can't tell what type 'data' has; if it is added to kernel_param as part of this series, I wasn't cc'd on the patch that did that.
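To make the style point concrete: in C, a 'void *' converts implicitly to any object pointer type, so no cast is needed. The sketch below assumes the field really is a 'void *'; struct toy_param and struct toy_bitdesc are hypothetical stand-ins, not the real kernel_param or dyndbg_bitdesc.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures differ. */
struct toy_bitdesc {
	const char *prefix;
};

struct toy_param {
	void *data;     /* assumed to be 'void *' for this example */
};

static const char *toy_first_prefix(const struct toy_param *kp)
{
	/* No cast: void * converts implicitly to struct toy_bitdesc *. */
	struct toy_bitdesc *bitmap = kp->data;

	return bitmap ? bitmap[0].prefix : NULL;
}
```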
[GIT PULL] drm/tegra: Changes for v5.15-rc1
Hi Dave,

The following changes since commit e73f0f0ee7541171d89f2e2491130c7771ba58d3:

  Linux 5.14-rc1 (2021-07-11 15:07:40 -0700)

are available in the Git repository at:

  ssh://git.freedesktop.org/git/tegra/linux.git tags/drm/tegra/for-5.15-rc1

for you to fetch changes up to fed0289394173509b3150617e17739d0094ce88e:

  gpu: host1x: debug: Dump DMASTART and DMAEND register (2021-08-13 18:23:32 +0200)

Once you've merged these I plan to push the libdrm changes which are going to use this new ABI and which also contain some basic sanity tests that we want to start running for regression testing.

Thanks,
Thierry

drm/tegra: Changes for v5.15-rc1

The bulk of these changes is a more modern ABI that can be efficiently used on newer SoCs as well as older ones. The userspace parts for this are available here:

  - libdrm support: https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
  - VAAPI driver: https://github.com/cyndis/vaapi-tegra-driver

In addition, existing userspace from the grate reverse-engineering project has been updated to use this new ABI:

  - X11 driver: https://github.com/grate-driver/xf86-video-opentegra
  - 3D driver: https://github.com/grate-driver/grate

Other than that, there's also support for display memory bandwidth management for various generations and a bit of cleanup.

Dmitry Osipenko (2):
      drm/tegra: dc: Support memory bandwidth management
      drm/tegra: dc: Extend debug stats with total number of events

Mikko Perttunen (15):
      gpu: host1x: Add DMA fence implementation
      gpu: host1x: Add no-recovery mode
      gpu: host1x: Add job release callback
      gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
      gpu: host1x: Add option to skip firewall for a job
      drm/tegra: Extract tegra_gem_lookup()
      drm/tegra: Add new UAPI to header
      drm/tegra: Boot VIC during runtime PM resume
      drm/tegra: Allocate per-engine channel in core code
      drm/tegra: Implement new UAPI
      drm/tegra: Implement syncpoint management UAPI
      drm/tegra: Implement syncpoint wait UAPI
      drm/tegra: Implement job submission part of new UAPI
      drm/tegra: Add job firewall
      drm/tegra: Bump driver version

Thierry Reding (3):
      gpu: host1x: debug: Use dma_addr_t more consistently
      gpu: host1x: debug: Dump only relevant parts of CDMA push buffer
      gpu: host1x: debug: Dump DMASTART and DMAEND register

 drivers/gpu/drm/tegra/Kconfig              |   1 +
 drivers/gpu/drm/tegra/Makefile             |   3 +
 drivers/gpu/drm/tegra/dc.c                 | 358 -
 drivers/gpu/drm/tegra/dc.h                 |  17 +
 drivers/gpu/drm/tegra/drm.c                |  98 +++--
 drivers/gpu/drm/tegra/drm.h                |  12 +
 drivers/gpu/drm/tegra/firewall.c           | 254
 drivers/gpu/drm/tegra/gem.c                |  13 +
 drivers/gpu/drm/tegra/gem.h                |   2 +
 drivers/gpu/drm/tegra/plane.c              | 117 ++
 drivers/gpu/drm/tegra/plane.h              |  16 +
 drivers/gpu/drm/tegra/submit.c             | 625 +
 drivers/gpu/drm/tegra/submit.h             |  21 +
 drivers/gpu/drm/tegra/uapi.c               | 338
 drivers/gpu/drm/tegra/uapi.h               |  58 +++
 drivers/gpu/drm/tegra/vic.c                | 112 +++---
 drivers/gpu/host1x/Makefile                |   1 +
 drivers/gpu/host1x/cdma.c                  |  58 ++-
 drivers/gpu/host1x/fence.c                 | 168
 drivers/gpu/host1x/fence.h                 |  13 +
 drivers/gpu/host1x/hw/channel_hw.c         |  87 +++-
 drivers/gpu/host1x/hw/debug_hw.c           |  32 +-
 drivers/gpu/host1x/hw/debug_hw_1x01.c      |   8 +-
 drivers/gpu/host1x/hw/debug_hw_1x06.c      |  16 +-
 drivers/gpu/host1x/hw/hw_host1x02_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x04_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x05_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x06_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x07_uclass.h |  12 +
 drivers/gpu/host1x/intr.c                  |   9 +
 drivers/gpu/host1x/intr.h                  |   2 +
 drivers/gpu/host1x/job.c                   |  98 +++--
 drivers/gpu/host1x/job.h                   |  16 +
 drivers/gpu/host1x/syncpt.c                |   2 +
 drivers/gpu/host1x/syncpt.h                |  12 +
 include/linux/host1x.h                     |  27 +-
 include/uapi/drm/tegra_drm.h               | 425 ++--
 37 files changed, 2882 insertions(+), 197 deletions(-)
 create mode 100644 drivers/gpu/drm/tegra/firewall.c
 create mode 100644 drivers/gpu/drm/tegra/submit.c
 create mode 100644 drivers/gpu/drm/tegra/submit.h
 create mode 100644 drivers/gpu/drm/tegra/uapi.c
 create mode 100644 drivers/gpu/drm/tegra/uapi.h
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers
[tegra-drm:drm/tegra/for-next 16/17] drivers/gpu/drm/tegra/dc.c:1843:53: warning: variable 'new_dc_state' set but not used
tree: git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next head: ad85b0843ee4536593415ca890d7fb52cd7f1fbe commit: 04d5d5df9df79f9045e76404775fc8a084aac23d [16/17] drm/tegra: dc: Support memory bandwidth management config: arm-defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git remote add tegra-drm git://anongit.freedesktop.org/tegra/linux.git git fetch --no-tags tegra-drm drm/tegra/for-next git checkout 04d5d5df9df79f9045e76404775fc8a084aac23d # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=arm If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>): drivers/gpu/drm/tegra/dc.c: In function 'tegra_crtc_update_memory_bandwidth': >> drivers/gpu/drm/tegra/dc.c:1843:53: warning: variable 'new_dc_state' set but >> not used [-Wunused-but-set-variable] 1843 | const struct tegra_dc_state *old_dc_state, *new_dc_state; | ^~~~ >> drivers/gpu/drm/tegra/dc.c:1843:38: warning: variable 'old_dc_state' set but >> not used [-Wunused-but-set-variable] 1843 | const struct tegra_dc_state *old_dc_state, *new_dc_state; | ^~~~ drivers/gpu/drm/tegra/dc.c: In function 'tegra_crtc_calculate_memory_bandwidth': >> drivers/gpu/drm/tegra/dc.c:2223:38: warning: variable 'old_state' set but >> not used [-Wunused-but-set-variable] 2223 | const struct drm_crtc_state *old_state; | ^ vim +/new_dc_state +1843 drivers/gpu/drm/tegra/dc.c 1836 1837 static void 1838 tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc, 1839 struct drm_atomic_state *state, 1840 bool prepare_bandwidth_transition) 1841 { 1842 const struct tegra_plane_state *old_tegra_state, *new_tegra_state; > 1843 const struct tegra_dc_state *old_dc_state, *new_dc_state; 1844 u32 i, new_avg_bw, 
old_avg_bw, new_peak_bw, old_peak_bw; 1845 const struct drm_plane_state *old_plane_state; 1846 const struct drm_crtc_state *old_crtc_state; 1847 struct tegra_dc_window window, old_window; 1848 struct tegra_dc *dc = to_tegra_dc(crtc); 1849 struct tegra_plane *tegra; 1850 struct drm_plane *plane; 1851 1852 if (dc->soc->has_nvdisplay) 1853 return; 1854 1855 old_crtc_state = drm_atomic_get_old_crtc_state(state, crtc); 1856 old_dc_state = to_const_dc_state(old_crtc_state); 1857 new_dc_state = to_const_dc_state(crtc->state); 1858 1859 if (!crtc->state->active) { 1860 if (!old_crtc_state->active) 1861 return; 1862 1863 /* 1864 * When CRTC is disabled on DPMS, the state of attached planes 1865 * is kept unchanged. Hence we need to enforce removal of the 1866 * bandwidths from the ICC paths. 1867 */ 1868 drm_atomic_crtc_for_each_plane(plane, crtc) { 1869 tegra = to_tegra_plane(plane); 1870 1871 icc_set_bw(tegra->icc_mem, 0, 0); 1872 icc_set_bw(tegra->icc_mem_vfilter, 0, 0); 1873 } 1874 1875 return; 1876 } 1877 1878 for_each_old_plane_in_state(old_crtc_state->state, plane, 1879 old_plane_state, i) { 1880 old_tegra_state = to_const_tegra_plane_state(old_plane_state); 1881 new_tegra_state = to_const_tegra_plane_state(plane->state); 1882 tegra = to_tegra_plane(plane); 1883 1884 /* 1885 * We're iterating over the global atomic state and it contains 1886 * planes from another CRTC, hence we need to filter out the 1887 * planes unrelated to this CRTC. 1888 */ 1889 if (tegra->dc != dc) 1890 continue; 1891 1892 new_avg_bw = new_tegra_state->avg_memory_bandwidth; 1893 old_avg_bw = old_tegra_state->avg_memory_bandwidth; 1894 1895 new_peak_bw = new_tegra_state->total_peak_memory_bandwidth; 1896 old_peak_bw = old_tegra_state->total_peak_memory_bandwidth; 1897 1898
Re: [PATCH 0/1] Fix DRM driver initialization failure in kernel v5.14
Just a friendly reminder that this fix for a regression needs review. It should be a quick review. It would probably be good to ensure this gets in before the final 5.14 release, otherwise this is going to be a very visible regression for anyone that uses DRM and does not use debugfs. Thanks! -- Dan
[pull] amdgpu, amdkfd drm-next-5.15
Hi Dave, Daniel,

Updates for 5.15. Mostly bug fixes and cleanups.

The following changes since commit a43e2a0e11491b73e2acaa27ee74d6c3b86deac0:

  drm/amdkfd: Allow querying SVM attributes that are clear (2021-08-06 16:12:32 -0400)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-5.15-2021-08-13

for you to fetch changes up to 554594567b1fa3da74f88ec7b2dc83d000c58e98:

  drm/display: fix possible null-pointer dereference in dcn10_set_clock() (2021-08-11 17:19:54 -0400)

amd-drm-next-5.15-2021-08-13:

amdgpu:
- Improve aux i2c tracing
- Misc display updates
- Misc code cleanups
- sprintf to sysfs_emit updates
- Fix some fan control corner cases with suspend

amdkfd:
- Enable CWSR with software scheduling

Alex Deucher (1):
      drm/amdgpu: handle VCN instances when harvesting (v2)

Anson Jacob (1):
      drm/amd/display: use GFP_ATOMIC in amdgpu_dm_irq_schedule_work

Anthony Koo (2):
      drm/amd/display: [FW Promotion] Release 0.0.78
      drm/amd/display: 3.2.148

Ashley Thomas (1):
      drm/amd/display: Add AUX I2C tracing.

Darren Powell (7):
      amdgpu/pm: Replace navi10 usage of sprintf with sysfs_emit
      amdgpu/pm: Replace smu11 usage of sprintf with sysfs_emit
      amdgpu/pm: Replace smu12/13 usage of sprintf with sysfs_emit
      amdgpu/pm: Replace vega10 usage of sprintf with sysfs_emit
      amdgpu/pm: Replace vega12,20 usage of sprintf with sysfs_emit
      amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit
      amdgpu/pm: Replace amdgpu_pm usage of sprintf with sysfs_emit

Eric Bernstein (1):
      drm/amd/display: Remove invalid assert for ODM + MPC case

Mukul Joshi (1):
      drm/amdkfd: CWSR with software scheduler

Nicholas Kazlauskas (2):
      drm/amd/display: Clear GPINT after DMCUB has reset
      drm/amd/display: Increase timeout threshold for DMCUB reset

Philip Yang (1):
      drm/amdkfd: AIP mGPUs best prefetch location for xnack on

Randy Dunlap (2):
      drm/amd/display: use do-while-0 for DC_TRACE_LEVEL_MESSAGE()
      drm/amdgpu: fix kernel-doc warnings on non-kernel-doc comments

Roy Chan (5):
      drm/amd/display: fix missing writeback disablement if plane is removed
      drm/amd/display: refactor the codes to centralize the stream/pipe checking logic
      drm/amd/display: refactor the cursor programing codes
      drm/amd/display: fix incorrect CM/TF programming sequence in dwb
      drm/amd/display: Correct comment style

Ryan Taylor (2):
      drm/amd/pm: restore fan_mode AMD_FAN_CTRL_NONE on resume (v2)
      drm/amd/pm: graceful exit on restore fan mode failure (v2)

Sergio Miguéns Iglesias (1):
      drm/amdgpu: Removed unnecessary if statement

Tuo Li (2):
      gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable access in amdgpu_i2c_router_select_ddc_port()
      drm/display: fix possible null-pointer dereference in dcn10_set_clock()

Victor Zhao (1):
      drm/amdgpu: Extend full access wait time in guest

Wenjing Liu (1):
      drm/amd/display: add authentication_complete in hdcp output

YuBiao Wang (1):
      drm/amd/amdgpu: skip locking delayed work if not initialized.

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10.c |  31
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v10_3.c   |  31
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c  |  33 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c         |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c      |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c             |   3 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_i2c.c            |   2 +-
 drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c            |   6 +-
 drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c              |  16 +-
 .../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c  |  21 ++-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c               |  35 ++--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_irq.c  |   2 +-
 drivers/gpu/drm/amd/display/dc/core/dc.c           |  62 ---
 drivers/gpu/drm/amd/display/dc/core/dc_stream.c    | 106 +++-
 drivers/gpu/drm/amd/display/dc/dc.h                |   2 +-
 drivers/gpu/drm/amd/display/dc/dce/dce_aux.c       | 192 -
 drivers/gpu/drm/amd/display/dc/dce/dmub_psr.c      |   2 +-
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  |  11 +-
 drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c |  14 +-
 .../gpu/drm/amd/display/dc/dcn30/dcn30_dwb_cm.c    |  90 +++---
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c |  12 +-
 .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c  |   1 -
 drivers/gpu/drm/amd/display/dmub/inc/dmub_cmd.h    |   6 +-
 drivers/gpu/drm/amd/display/dmub/src/dmub_dcn31.c  |  18 +-
 drivers/gpu/drm/amd/display/modules/hdcp/hdcp.c    |   5 +-
 drivers/gpu/drm/amd/display/modules/hdcp/hdcp.h    |   8 +
 .../amd/display/modules/hdcp/hdcp1_transition.c    |
[PATCH v2 00/12] Implement generic prot_guest_has() helper function
This patch series provides a generic helper function, prot_guest_has(), to replace the sme_active(), sev_active(), sev_es_active() and mem_encrypt_active() functions.

It is expected that as new protected virtualization technologies are added to the kernel, they can all be covered by a single function call instead of a collection of specific function calls all called from the same locations.

The powerpc and s390 patches have been compile tested only. Can the folks copied on this series verify that nothing breaks for them?

Cc: Andi Kleen
Cc: Andy Lutomirski
Cc: Ard Biesheuvel
Cc: Baoquan He
Cc: Benjamin Herrenschmidt
Cc: Borislav Petkov
Cc: Christian Borntraeger
Cc: Daniel Vetter
Cc: Dave Hansen
Cc: Dave Young
Cc: David Airlie
Cc: Heiko Carstens
Cc: Ingo Molnar
Cc: Joerg Roedel
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Thomas Zimmermann
Cc: Vasily Gorbik
Cc: VMware Graphics
Cc: Will Deacon

---

Patches based on:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
  0b52902cd2d9 ("Merge branch 'efi/urgent'")

Changes since v1:
- Move some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT in prep for use of prot_guest_has() by TDX.
- Add type includes to the protected_guest.h header file to prevent build errors outside of x86.
- Make amd_prot_guest_has() EXPORT_SYMBOL_GPL
- Use amd_prot_guest_has() in place of checking sme_me_mask in the arch/x86/mm/mem_encrypt.c file.

Tom Lendacky (12):
  x86/ioremap: Selectively build arch override encryption functions
  mm: Introduce a function to check for virtualization protection features
  x86/sev: Add an x86 version of prot_guest_has()
  powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
  x86/sme: Replace occurrences of sme_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_active() with prot_guest_has()
  x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
  treewide: Replace the use of mem_encrypt_active() with prot_guest_has()
  mm: Remove the now unused mem_encrypt_active() function
  x86/sev: Remove the now unused mem_encrypt_active() function
  powerpc/pseries/svm: Remove the now unused mem_encrypt_active() function
  s390/mm: Remove the now unused mem_encrypt_active() function

 arch/Kconfig                               |  3 ++
 arch/powerpc/include/asm/mem_encrypt.h     |  5 --
 arch/powerpc/include/asm/protected_guest.h | 30 +++
 arch/powerpc/platforms/pseries/Kconfig     |  1 +
 arch/s390/include/asm/mem_encrypt.h        |  2 -
 arch/x86/Kconfig                           |  1 +
 arch/x86/include/asm/io.h                  |  8 +++
 arch/x86/include/asm/kexec.h               |  2 +-
 arch/x86/include/asm/mem_encrypt.h         | 13 +
 arch/x86/include/asm/protected_guest.h     | 29 +++
 arch/x86/kernel/crash_dump_64.c            |  4 +-
 arch/x86/kernel/head64.c                   |  4 +-
 arch/x86/kernel/kvm.c                      |  3 +-
 arch/x86/kernel/kvmclock.c                 |  4 +-
 arch/x86/kernel/machine_kexec_64.c         | 19 +++
 arch/x86/kernel/pci-swiotlb.c              |  9 ++--
 arch/x86/kernel/relocate_kernel_64.S       |  2 +-
 arch/x86/kernel/sev.c                      |  6 +--
 arch/x86/kvm/svm/svm.c                     |  3 +-
 arch/x86/mm/ioremap.c                      | 18 +++
 arch/x86/mm/mem_encrypt.c                  | 60 +++---
 arch/x86/mm/mem_encrypt_identity.c         |  3 +-
 arch/x86/mm/pat/set_memory.c               |  3 +-
 arch/x86/platform/efi/efi_64.c             |  9 ++--
 arch/x86/realmode/init.c                   |  8 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    |  4 +-
 drivers/gpu/drm/drm_cache.c                |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c        |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_msg.c        |  6 +--
 drivers/iommu/amd/init.c                   |  7 +--
 drivers/iommu/amd/iommu.c                  |  3 +-
 drivers/iommu/amd/iommu_v2.c               |  3 +-
 drivers/iommu/iommu.c                      |  3 +-
 fs/proc/vmcore.c                           |  6 +--
 include/linux/mem_encrypt.h                |  4 --
 include/linux/protected_guest.h            | 40 +++
 kernel/dma/swiotlb.c                       |  4 +-
 37 files changed, 232 insertions(+), 105 deletions(-)
 create mode 100644 arch/powerpc/include/asm/protected_guest.h
 create mode 100644 arch/x86/include/asm/protected_guest.h
 create mode 100644 include/linux/protected_guest.h

--
2.32.0
[PATCH v2 01/12] x86/ioremap: Selectively build arch override encryption functions
In prep for other uses of the prot_guest_has() function besides AMD's memory encryption support, selectively build the AMD memory encryption architecture override functions only when CONFIG_AMD_MEM_ENCRYPT=y. These functions are: - early_memremap_pgprot_adjust() - arch_memremap_can_ram_remap() Additionally, routines that are only invoked by these architecture override functions can also be conditionally built. These functions are: - memremap_should_map_decrypted() - memremap_is_efi_data() - memremap_is_setup_data() - early_memremap_is_setup_data() And finally, phys_mem_access_encrypted() is conditionally built as well, but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is not set. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Signed-off-by: Tom Lendacky --- arch/x86/include/asm/io.h | 8 arch/x86/mm/ioremap.c | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 841a5d104afa..5c6a4af0b911 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -391,6 +391,7 @@ extern void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size) #define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc #endif +#ifdef CONFIG_AMD_MEM_ENCRYPT extern bool arch_memremap_can_ram_remap(resource_size_t offset, unsigned long size, unsigned long flags); @@ -398,6 +399,13 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, extern bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size); +#else +static inline bool phys_mem_access_encrypted(unsigned long phys_addr, +unsigned long size) +{ + return true; +} +#endif /** * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 60ade7dd71bd..ccff76cedd8f 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -508,6 +508,7 @@ void 
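The "real declaration under the config option, static inline stub otherwise" pattern used for phys_mem_access_encrypted() above can be sketched in plain C as follows. TOY_MEM_ENCRYPT is a hypothetical macro standing in for CONFIG_AMD_MEM_ENCRYPT, and the function names are made up for the example; this is an illustration of the pattern, not the patch itself.

```c
#include <stdbool.h>

/*
 * TOY_MEM_ENCRYPT stands in for CONFIG_AMD_MEM_ENCRYPT. It is left
 * undefined here on purpose, so the stub below is what gets compiled.
 */
#ifdef TOY_MEM_ENCRYPT
bool toy_phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size);
#else
static inline bool toy_phys_mem_access_encrypted(unsigned long phys_addr,
						 unsigned long size)
{
	(void)phys_addr;
	(void)size;
	return true;    /* same default as the stub in the patch */
}
#endif

/* The caller needs no #ifdef of its own. */
static bool toy_caller(void)
{
	return toy_phys_mem_access_encrypted(0x1000, 4096);
}
```

Keeping the #ifdef in the header means call sites compile unchanged whether or not the option is set, and the compiler can drop the stub call entirely.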
unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) memunmap((void *)((unsigned long)addr & PAGE_MASK)); } +#ifdef CONFIG_AMD_MEM_ENCRYPT /* * Examine the physical address to determine if it is an area of memory * that should be mapped decrypted. If the memory is not part of the @@ -746,7 +747,6 @@ bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size) return arch_memremap_can_ram_remap(phys_addr, size, 0); } -#ifdef CONFIG_AMD_MEM_ENCRYPT /* Remap memory with encryption */ void __init *early_memremap_encrypted(resource_size_t phys_addr, unsigned long size) -- 2.32.0
[PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features
In prep for other protected virtualization technologies, introduce a generic helper function, prot_guest_has(), that can be used to check for specific protection attributes, like memory encryption. This is intended to eliminate having to add multiple technology-specific checks to the code (e.g. if (sev_active() || tdx_active())). Reviewed-by: Joerg Roedel Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/Kconfig| 3 +++ include/linux/protected_guest.h | 35 + 2 files changed, 38 insertions(+) create mode 100644 include/linux/protected_guest.h diff --git a/arch/Kconfig b/arch/Kconfig index 98db63496bab..bd4f60c581f1 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -1231,6 +1231,9 @@ config RELR config ARCH_HAS_MEM_ENCRYPT bool +config ARCH_HAS_PROTECTED_GUEST + bool + config HAVE_SPARSE_SYSCALL_NR bool help diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h new file mode 100644 index ..43d4dde94793 --- /dev/null +++ b/include/linux/protected_guest.h @@ -0,0 +1,35 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Protected Guest (and Host) Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#ifndef _PROTECTED_GUEST_H +#define _PROTECTED_GUEST_H + +#ifndef __ASSEMBLY__ + +#include +#include + +#define PATTR_MEM_ENCRYPT 0 /* Encrypted memory */ +#define PATTR_HOST_MEM_ENCRYPT 1 /* Host encrypted memory */ +#define PATTR_GUEST_MEM_ENCRYPT2 /* Guest encrypted memory */ +#define PATTR_GUEST_PROT_STATE 3 /* Guest encrypted state */ + +#ifdef CONFIG_ARCH_HAS_PROTECTED_GUEST + +#include + +#else /* !CONFIG_ARCH_HAS_PROTECTED_GUEST */ + +static inline bool prot_guest_has(unsigned int attr) { return false; } + +#endif /* CONFIG_ARCH_HAS_PROTECTED_GUEST */ + +#endif /* __ASSEMBLY__ */ + +#endif /* _PROTECTED_GUEST_H */ -- 2.32.0
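A minimal userspace sketch of the idea, for illustration only: callers ask about an attribute, and one function maps it onto whatever the platform provides. The PATTR_* values are copied from the patch above; struct toy_platform and the extra platform argument are assumptions made so the example is self-contained (the proposed prot_guest_has() takes only the attribute).

```c
#include <stdbool.h>

/* Attribute values as defined in the patch. */
#define PATTR_MEM_ENCRYPT        0
#define PATTR_HOST_MEM_ENCRYPT   1
#define PATTR_GUEST_MEM_ENCRYPT  2
#define PATTR_GUEST_PROT_STATE   3

/* Hypothetical platform description, only for this example. */
struct toy_platform {
	bool host_encrypt;
	bool guest_encrypt;
	bool guest_state;
};

/* One query function instead of a chain of technology-specific checks. */
static bool toy_prot_guest_has(const struct toy_platform *p, unsigned int attr)
{
	switch (attr) {
	case PATTR_MEM_ENCRYPT:
		return p->host_encrypt || p->guest_encrypt;
	case PATTR_HOST_MEM_ENCRYPT:
		return p->host_encrypt;
	case PATTR_GUEST_MEM_ENCRYPT:
		return p->guest_encrypt;
	case PATTR_GUEST_PROT_STATE:
		return p->guest_state;
	default:
		return false;   /* unknown attributes are reported as absent */
	}
}
```

A call site then tests a capability, for example toy_prot_guest_has(&plat, PATTR_GUEST_MEM_ENCRYPT), rather than naming each technology, so supporting a new one only means extending the switch.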
[PATCH v2 03/12] x86/sev: Add an x86 version of prot_guest_has()
Introduce an x86 version of the prot_guest_has() function. This will be used in the more generic x86 code to replace vendor specific calls like sev_active(), etc. While the name suggests this is intended mainly for guests, it will also be used for host memory encryption checks in place of sme_active(). Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Reviewed-by: Joerg Roedel Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/x86/Kconfig | 1 + arch/x86/include/asm/mem_encrypt.h | 2 ++ arch/x86/include/asm/protected_guest.h | 29 ++ arch/x86/mm/mem_encrypt.c | 25 ++ include/linux/protected_guest.h| 5 + 5 files changed, 62 insertions(+) create mode 100644 arch/x86/include/asm/protected_guest.h diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 421fa9e38c60..82e5fb713261 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1514,6 +1514,7 @@ config AMD_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED select INSTRUCTION_DECODER select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS + select ARCH_HAS_PROTECTED_GUEST help Say yes to enable support for the encryption of system memory. 
This requires an AMD processor that supports Secure Memory diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 9c80c68d75b5..a46d47662772 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -53,6 +53,7 @@ void __init sev_es_init_vc_handling(void); bool sme_active(void); bool sev_active(void); bool sev_es_active(void); +bool amd_prot_guest_has(unsigned int attr); #define __bss_decrypted __section(".bss..decrypted") @@ -78,6 +79,7 @@ static inline void sev_es_init_vc_handling(void) { } static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } +static inline bool amd_prot_guest_has(unsigned int attr) { return false; } static inline int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; } diff --git a/arch/x86/include/asm/protected_guest.h b/arch/x86/include/asm/protected_guest.h new file mode 100644 index ..51e4eefd9542 --- /dev/null +++ b/arch/x86/include/asm/protected_guest.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Protected Guest (and Host) Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. 
+ * + * Author: Tom Lendacky + */ + +#ifndef _X86_PROTECTED_GUEST_H +#define _X86_PROTECTED_GUEST_H + +#include + +#ifndef __ASSEMBLY__ + +static inline bool prot_guest_has(unsigned int attr) +{ +#ifdef CONFIG_AMD_MEM_ENCRYPT + if (sme_me_mask) + return amd_prot_guest_has(attr); +#endif + + return false; +} + +#endif /* __ASSEMBLY__ */ + +#endif /* _X86_PROTECTED_GUEST_H */ diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index ff08dc463634..edc67ddf065d 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -389,6 +390,30 @@ bool noinstr sev_es_active(void) return sev_status & MSR_AMD64_SEV_ES_ENABLED; } +bool amd_prot_guest_has(unsigned int attr) +{ + switch (attr) { + case PATTR_MEM_ENCRYPT: + return sme_me_mask != 0; + + case PATTR_SME: + case PATTR_HOST_MEM_ENCRYPT: + return sme_active(); + + case PATTR_SEV: + case PATTR_GUEST_MEM_ENCRYPT: + return sev_active(); + + case PATTR_SEV_ES: + case PATTR_GUEST_PROT_STATE: + return sev_es_active(); + + default: + return false; + } +} +EXPORT_SYMBOL_GPL(amd_prot_guest_has); + /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */ bool force_dma_unencrypted(struct device *dev) { diff --git a/include/linux/protected_guest.h b/include/linux/protected_guest.h index 43d4dde94793..5ddef1b6a2ea 100644 --- a/include/linux/protected_guest.h +++ b/include/linux/protected_guest.h @@ -20,6 +20,11 @@ #define PATTR_GUEST_MEM_ENCRYPT2 /* Guest encrypted memory */ #define PATTR_GUEST_PROT_STATE 3 /* Guest encrypted state */ +/* 0x800 - 0x8ff reserved for AMD */ +#define PATTR_SME 0x800 +#define PATTR_SEV 0x801 +#define PATTR_SEV_ES 0x802 + #ifdef CONFIG_ARCH_HAS_PROTECTED_GUEST #include -- 2.32.0
[PATCH v2 04/12] powerpc/pseries/svm: Add a powerpc version of prot_guest_has()
Introduce a powerpc version of the prot_guest_has() function. This will be used to replace the powerpc mem_encrypt_active() implementation, so the implementation will initially only support the PATTR_MEM_ENCRYPT attribute. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Tom Lendacky --- arch/powerpc/include/asm/protected_guest.h | 30 ++ arch/powerpc/platforms/pseries/Kconfig | 1 + 2 files changed, 31 insertions(+) create mode 100644 arch/powerpc/include/asm/protected_guest.h diff --git a/arch/powerpc/include/asm/protected_guest.h b/arch/powerpc/include/asm/protected_guest.h new file mode 100644 index ..ce55c2c7e534 --- /dev/null +++ b/arch/powerpc/include/asm/protected_guest.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Protected Guest (and Host) Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#ifndef _POWERPC_PROTECTED_GUEST_H +#define _POWERPC_PROTECTED_GUEST_H + +#include + +#ifndef __ASSEMBLY__ + +static inline bool prot_guest_has(unsigned int attr) +{ + switch (attr) { + case PATTR_MEM_ENCRYPT: + return is_secure_guest(); + + default: + return false; + } +} + +#endif /* __ASSEMBLY__ */ + +#endif /* _POWERPC_PROTECTED_GUEST_H */ diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 5e037df2a3a1..8ce5417d6feb 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -159,6 +159,7 @@ config PPC_SVM select SWIOTLB select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED + select ARCH_HAS_PROTECTED_GUEST help There are certain POWER platforms which support secure guests using the Protected Execution Facility, with the help of an Ultravisor -- 2.32.0
[PATCH v2 05/12] x86/sme: Replace occurrences of sme_active() with prot_guest_has()
Replace occurrences of sme_active() with the more generic prot_guest_has() using PATTR_HOST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c where PATTR_SME will be used. If future support is added for other memory encryption technologies, the use of PATTR_HOST_MEM_ENCRYPT can be updated, as required, to use PATTR_SME. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Joerg Roedel Cc: Will Deacon Reviewed-by: Joerg Roedel Signed-off-by: Tom Lendacky --- arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/machine_kexec_64.c | 3 ++- arch/x86/kernel/pci-swiotlb.c| 9 - arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/mm/ioremap.c| 6 +++--- arch/x86/mm/mem_encrypt.c| 10 +- arch/x86/mm/mem_encrypt_identity.c | 3 ++- arch/x86/realmode/init.c | 5 +++-- drivers/iommu/amd/init.c | 7 --- 10 files changed, 25 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index 0a6e34b07017..11b7c06e2828 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page, unsigned long page_list, unsigned long start_address, unsigned int preserve_context, - unsigned int sme_active); + unsigned int host_mem_enc_active); #endif #define ARCH_HAS_KIMAGE_ARCH diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index a46d47662772..956338406cec 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sme_active(void); bool sev_active(void); bool sev_es_active(void); bool amd_prot_guest_has(unsigned int attr); @@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct 
boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_prot_guest_has(unsigned int attr) { return false; } diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 131f30fdcfbd..8e7b517ad738 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image) (unsigned long)page_list, image->start, image->preserve_context, - sme_active()); + prot_guest_has(PATTR_HOST_MEM_ENCRYPT)); #ifdef CONFIG_KEXEC_JUMP if (image->preserve_context) diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c index c2cfa5e7c152..bd9a9cfbc9a2 100644 --- a/arch/x86/kernel/pci-swiotlb.c +++ b/arch/x86/kernel/pci-swiotlb.c @@ -6,7 +6,7 @@ #include #include #include -#include +#include #include #include @@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void) swiotlb = 1; /* -* If SME is active then swiotlb will be set to 1 so that bounce -* buffers are allocated and used for devices that do not support -* the addressing range required for the encryption mask. +* Set swiotlb to 1 so that bounce buffers are allocated and used for +* devices that can't support DMA to encrypted memory. 
*/ - if (sme_active()) + if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT)) swiotlb = 1; return swiotlb; diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S index c53271aebb64..c8fe74a28143 100644 --- a/arch/x86/kernel/relocate_kernel_64.S +++ b/arch/x86/kernel/relocate_kernel_64.S @@ -47,7 +47,7 @@ SYM_CODE_START_NOALIGN(relocate_kernel) * %rsi page_list * %rdx start address * %rcx preserve_context -* %r8 sme_active +* %r8 host_mem_enc_active */ /* Save the CPU context, used for jumping back */ diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index ccff76cedd8f..583afd54c7e1 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -14,7 +14,7 @@ #include #include #include -#include +#include #include #include @@ -703,7 +703,7 @@ bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long
[PATCH v2 06/12] x86/sev: Replace occurrences of sev_active() with prot_guest_has()
Replace occurrences of sev_active() with the more generic prot_guest_has() using PATTR_GUEST_MEM_ENCRYPT, except for in arch/x86/mm/mem_encrypt*.c where PATTR_SEV will be used. If future support is added for other memory encryption technologies, the use of PATTR_GUEST_MEM_ENCRYPT can be updated, as required, to use PATTR_SEV. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Ard Biesheuvel Reviewed-by: Joerg Roedel Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/crash_dump_64.c| 4 +++- arch/x86/kernel/kvm.c | 3 ++- arch/x86/kernel/kvmclock.c | 4 ++-- arch/x86/kernel/machine_kexec_64.c | 16 arch/x86/kvm/svm/svm.c | 3 ++- arch/x86/mm/ioremap.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 15 +++ arch/x86/platform/efi/efi_64.c | 9 + 9 files changed, 32 insertions(+), 30 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 956338406cec..7e25de37c148 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_active(void); bool sev_es_active(void); bool amd_prot_guest_has(unsigned int attr); @@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_prot_guest_has(unsigned int attr) { return false; } diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c index 045e82e8945b..0cfe35f03e67 100644 --- a/arch/x86/kernel/crash_dump_64.c +++ b/arch/x86/kernel/crash_dump_64.c @@ -10,6 +10,7 @@ #include #include #include +#include static ssize_t 
__copy_oldmem_page(unsigned long pfn, char *buf, size_t csize, unsigned long offset, int userbuf, @@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize, ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos) { - return read_from_oldmem(buf, count, ppos, 0, sev_active()); + return read_from_oldmem(buf, count, ppos, 0, + prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)); } diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index a26643dc6bd6..9d08ad2f3faa 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void) { int cpu; - if (!sev_active()) + if (!prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) return; for_each_possible_cpu(cpu) { diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index ad273e5861c1..f7ba78a23dcd 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -16,9 +16,9 @@ #include #include #include +#include #include -#include #include #include @@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void) * hvclock is shared between the guest and the hypervisor, must * be mapped decrypted. 
*/ - if (sev_active()) { + if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) { r = set_memory_decrypted((unsigned long) hvclock_mem, 1UL << order); if (r) { diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 8e7b517ad738..66ff788b79c9 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd) } pte = pte_offset_kernel(pmd, vaddr); - if (sev_active()) + if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) prot = PAGE_KERNEL_EXEC; set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot)); @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable) level4p = (pgd_t *)__va(start_pgtable); clear_page(level4p); - if (sev_active()) { + if (prot_guest_has(PATTR_GUEST_MEM_ENCRYPT)) { info.page_flag |= _PAGE_ENC; info.kernpg_flag |= _PAGE_ENC; } @@ -570,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void) */ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) { - if (sev_active()) + if (!prot_guest_has(PATTR_HOST_MEM_ENCRYPT)) return 0; /* -
[PATCH v2 07/12] x86/sev: Replace occurrences of sev_es_active() with prot_guest_has()
Replace occurrences of sev_es_active() with the more generic prot_guest_has() using PATTR_GUEST_PROT_STATE, except for in arch/x86/kernel/sev*.c and arch/x86/mm/mem_encrypt*.c where PATTR_SEV_ES will be used. If future support is added for other memory encryption technologies, the use of PATTR_GUEST_PROT_STATE can be updated, as required, to specifically use PATTR_SEV_ES. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/sev.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 7 +++ arch/x86/realmode/init.c | 3 +-- 4 files changed, 7 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 7e25de37c148..797146e0cd6b 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -50,7 +50,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_es_active(void); bool amd_prot_guest_has(unsigned int attr); #define __bss_decrypted __section(".bss..decrypted") @@ -74,7 +73,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_es_active(void) { return false; } static inline bool amd_prot_guest_has(unsigned int attr) { return false; } static inline int __init diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index a6895e440bc3..66a4ab9d95d7 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -11,7 +11,7 @@ #include /* For show_regs() */ #include -#include +#include #include #include #include @@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd) int cpu; u64 pfn; - if (!sev_es_active()) + if (!prot_guest_has(PATTR_SEV_ES)) return 0; pflags = _PAGE_NX | _PAGE_RW; @@ -774,7 +774,7 @@ void __init
sev_es_init_vc_handling(void) BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % PAGE_SIZE); - if (!sev_es_active()) + if (!prot_guest_has(PATTR_SEV_ES)) return; if (!sev_es_check_cpu_features()) diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index 83bc928f529e..38dfa84b77a1 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -383,8 +383,7 @@ static bool sme_active(void) return sme_me_mask && !sev_active(); } -/* Needs to be called from non-instrumentable code */ -bool noinstr sev_es_active(void) +static bool sev_es_active(void) { return sev_status & MSR_AMD64_SEV_ES_ENABLED; } @@ -482,7 +481,7 @@ static void print_mem_encrypt_feature_info(void) pr_cont(" SEV"); /* Encrypted Register State */ - if (sev_es_active()) + if (amd_prot_guest_has(PATTR_SEV_ES)) pr_cont(" SEV-ES"); pr_cont("\n"); @@ -501,7 +500,7 @@ void __init mem_encrypt_init(void) * With SEV, we need to unroll the rep string I/O instructions, * but SEV-ES supports them through the #VC handler. */ - if (amd_prot_guest_has(PATTR_SEV) && !sev_es_active()) + if (amd_prot_guest_has(PATTR_SEV) && !amd_prot_guest_has(PATTR_SEV_ES)) static_branch_enable(&sev_enable_key); print_mem_encrypt_feature_info(); diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c index 2109ae569c67..7711d0071f41 100644 --- a/arch/x86/realmode/init.c +++ b/arch/x86/realmode/init.c @@ -2,7 +2,6 @@ #include #include #include -#include #include #include @@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header *th) if (prot_guest_has(PATTR_HOST_MEM_ENCRYPT)) th->flags |= TH_FLAGS_SME_ACTIVE; - if (sev_es_active()) { + if (prot_guest_has(PATTR_GUEST_PROT_STATE)) { /* * Skip the call to verify_cpu() in secondary_startup_64 as it * will cause #VC exceptions when the AP can't handle them yet. -- 2.32.0
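For readers following the attribute mapping across patches 05 to 07, the vendor hook can be pictured roughly as below. This is a userspace sketch, not code from the series: the enum values, the SEV_ENABLED and SEV_ES_ENABLED bit positions and the two state variables are stand-ins chosen for illustration, and the exact pairing of generic and AMD-specific attributes in the switch is an assumption. Only the attribute names and the `sme_me_mask && !sev_active()` relation are taken from the patches above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Attribute names follow the series; the numeric values are made up here. */
enum pattr {
	PATTR_MEM_ENCRYPT,       /* any memory encryption */
	PATTR_HOST_MEM_ENCRYPT,  /* generic: host-side encryption */
	PATTR_GUEST_MEM_ENCRYPT, /* generic: guest-side encryption */
	PATTR_GUEST_PROT_STATE,  /* generic: protected guest register state */
	PATTR_SME,               /* AMD specific */
	PATTR_SEV,
	PATTR_SEV_ES,
	PATTR_MAX
};

/* Stand-ins for the status bits; the real definitions live in the kernel. */
#define SEV_ENABLED    (1u << 0)
#define SEV_ES_ENABLED (1u << 1)

/* Stand-ins for the kernel's sme_me_mask and sev_status variables. */
static uint64_t sme_me_mask;
static uint64_t sev_status;

static bool sev_active(void)    { return sev_status & SEV_ENABLED; }
static bool sme_active(void)    { return sme_me_mask && !sev_active(); }
static bool sev_es_active(void) { return sev_status & SEV_ES_ENABLED; }

/* One switch answers both the generic and the AMD-specific questions. */
static bool amd_prot_guest_has(unsigned int attr)
{
	assert(attr < PATTR_MAX);

	switch (attr) {
	case PATTR_MEM_ENCRYPT:
		return sme_me_mask != 0;
	case PATTR_SME:
	case PATTR_HOST_MEM_ENCRYPT:
		return sme_active();
	case PATTR_SEV:
	case PATTR_GUEST_MEM_ENCRYPT:
		return sev_active();
	case PATTR_SEV_ES:
	case PATTR_GUEST_PROT_STATE:
		return sev_es_active();
	default:
		return false;
	}
}
```

With this shape, the helpers can become static to mem_encrypt.c, as the hunk above does for sev_es_active(), because outside callers only ever pass an attribute.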
[PATCH v2 08/12] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()
Replace occurrences of mem_encrypt_active() with calls to prot_guest_has() with the PATTR_MEM_ENCRYPT attribute. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: David Airlie Cc: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: VMware Graphics Cc: Joerg Roedel Cc: Will Deacon Cc: Dave Young Cc: Baoquan He Signed-off-by: Tom Lendacky --- arch/x86/kernel/head64.c| 4 ++-- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/mem_encrypt.c | 5 ++--- arch/x86/mm/pat/set_memory.c| 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++- drivers/gpu/drm/drm_cache.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++--- drivers/iommu/amd/iommu.c | 3 ++- drivers/iommu/amd/iommu_v2.c| 3 ++- drivers/iommu/iommu.c | 3 ++- fs/proc/vmcore.c| 6 +++--- kernel/dma/swiotlb.c| 4 ++-- 13 files changed, 29 insertions(+), 24 deletions(-) diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index de01903c3735..cafed6456d45 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr, * there is no need to zero it after changing the memory encryption * attribute. 
*/ - if (mem_encrypt_active()) { + if (prot_guest_has(PATTR_MEM_ENCRYPT)) { vaddr = (unsigned long)__start_bss_decrypted; vaddr_end = (unsigned long)__end_bss_decrypted; for (; vaddr < vaddr_end; vaddr += PMD_SIZE) { diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 3ed0f28f12af..7f012fc1b600 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -694,7 +694,7 @@ static bool __init early_memremap_is_setup_data(resource_size_t phys_addr, bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size, unsigned long flags) { - if (!mem_encrypt_active()) + if (!prot_guest_has(PATTR_MEM_ENCRYPT)) return true; if (flags & MEMREMAP_ENC) @@ -724,7 +724,7 @@ pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr, { bool encrypted_prot; - if (!mem_encrypt_active()) + if (!prot_guest_has(PATTR_MEM_ENCRYPT)) return prot; encrypted_prot = true; diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index 38dfa84b77a1..69aed9935b5e 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -364,8 +364,7 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size) /* * SME and SEV are very similar but they are not the same, so there are * times that the kernel will need to distinguish between SME and SEV. The - * sme_active() and sev_active() functions are used for this. When a - * distinction isn't needed, the mem_encrypt_active() function can be used. + * sme_active() and sev_active() functions are used for this. * * The trampoline code is a good example for this requirement. Before * paging is activated, SME will access all memory as decrypted, but SEV @@ -451,7 +450,7 @@ void __init mem_encrypt_free_decrypted_mem(void) * The unused memory range was mapped decrypted, change the encryption * attribute from decrypted to encrypted before freeing it. 
*/ - if (mem_encrypt_active()) { + if (amd_prot_guest_has(PATTR_MEM_ENCRYPT)) { r = set_memory_encrypted(vaddr, npages); if (r) { pr_warn("failed to free unused decrypted pages\n"); diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index ad8a5c586a35..6925f2bb4be1 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include @@ -1986,7 +1987,7 @@ static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc) int ret; /* Nothing to do if memory encryption is not active */ - if (!mem_encrypt_active()) + if (!prot_guest_has(PATTR_MEM_ENCRYPT)) return 0; /* Should not be working on unaligned addresses */ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 971c5b8e75dc..21c1e3056070 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -38,6 +38,7 @@ #include #include #include +#include #include "amdgpu.h" #include "amdgpu_irq.h" @@ -1250,7 +1251,8 @@ static int amdgpu_pci_probe(struct pci_dev *pdev, * howe
[PATCH v2 09/12] mm: Remove the now unused mem_encrypt_active() function
The mem_encrypt_active() function has been replaced by prot_guest_has(), so remove the implementation. Reviewed-by: Joerg Roedel Signed-off-by: Tom Lendacky --- include/linux/mem_encrypt.h | 4 1 file changed, 4 deletions(-) diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h index 5c4a18a91f89..ae4526389261 100644 --- a/include/linux/mem_encrypt.h +++ b/include/linux/mem_encrypt.h @@ -16,10 +16,6 @@ #include -#else /* !CONFIG_ARCH_HAS_MEM_ENCRYPT */ - -static inline bool mem_encrypt_active(void) { return false; } - #endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */ #ifdef CONFIG_AMD_MEM_ENCRYPT -- 2.32.0
[PATCH v2 10/12] x86/sev: Remove the now unused mem_encrypt_active() function
The mem_encrypt_active() function has been replaced by prot_guest_has(), so remove the implementation. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Reviewed-by: Joerg Roedel Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 797146e0cd6b..94c089e9ea69 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -97,11 +97,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { } extern char __start_bss_decrypted[], __end_bss_decrypted[], __start_bss_decrypted_unused[]; -static inline bool mem_encrypt_active(void) -{ - return sme_me_mask; -} - static inline u64 sme_get_me_mask(void) { return sme_me_mask; -- 2.32.0
[PATCH v2 11/12] powerpc/pseries/svm: Remove the now unused mem_encrypt_active() function
The mem_encrypt_active() function has been replaced by prot_guest_has(), so remove the implementation. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Tom Lendacky --- arch/powerpc/include/asm/mem_encrypt.h | 5 - 1 file changed, 5 deletions(-) diff --git a/arch/powerpc/include/asm/mem_encrypt.h b/arch/powerpc/include/asm/mem_encrypt.h index ba9dab07c1be..2f26b8fc8d29 100644 --- a/arch/powerpc/include/asm/mem_encrypt.h +++ b/arch/powerpc/include/asm/mem_encrypt.h @@ -10,11 +10,6 @@ #include -static inline bool mem_encrypt_active(void) -{ - return is_secure_guest(); -} - static inline bool force_dma_unencrypted(struct device *dev) { return is_secure_guest(); -- 2.32.0
[PATCH v2 12/12] s390/mm: Remove the now unused mem_encrypt_active() function
The mem_encrypt_active() function has been replaced by prot_guest_has(), so remove the implementation. Since the default implementation of the prot_guest_has() matches the s390 implementation of mem_encrypt_active(), prot_guest_has() does not need to be implemented in s390 (the config option ARCH_HAS_PROTECTED_GUEST is not set). Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Signed-off-by: Tom Lendacky --- arch/s390/include/asm/mem_encrypt.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h index 2542cbf7e2d1..08a8b96606d7 100644 --- a/arch/s390/include/asm/mem_encrypt.h +++ b/arch/s390/include/asm/mem_encrypt.h @@ -4,8 +4,6 @@ #ifndef __ASSEMBLY__ -static inline bool mem_encrypt_active(void) { return false; } - int set_memory_encrypted(unsigned long addr, int numpages); int set_memory_decrypted(unsigned long addr, int numpages); -- 2.32.0
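The s390 argument rests on a config-gated default. A minimal sketch of that arrangement follows, assuming a plain preprocessor symbol in place of the Kconfig option and a hypothetical arch hook named arch_prot_guest_has(); neither name is taken from the series. It only illustrates why an architecture that leaves the option unset keeps the old "always false" behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for a config-gated header. An
 * architecture that opts in would define ARCH_HAS_PROTECTED_GUEST and
 * supply arch_prot_guest_has() (name assumed for this sketch).
 */
#ifdef ARCH_HAS_PROTECTED_GUEST
bool arch_prot_guest_has(unsigned int attr);

static inline bool prot_guest_has(unsigned int attr)
{
	return arch_prot_guest_has(attr);
}
#else
/* No arch support configured: every attribute reads as absent. */
static inline bool prot_guest_has(unsigned int attr)
{
	(void)attr;
	return false;
}
#endif

/* What the removed s390 helper returned, kept here for comparison only. */
static inline bool old_s390_mem_encrypt_active(void)
{
	return false;
}

/* The default stub and the removed helper agree for any attribute. */
static inline void check_default_matches_s390(unsigned int attr)
{
	assert(prot_guest_has(attr) == old_s390_mem_encrypt_active());
}
```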
Re: [PATCH] drm/arm/malidp: fix mode_valid couldn't cull invalid modes issue
On Tue, Aug 10, 2021 at 10:43:31AM +0800, sandor...@nxp.com wrote: > From: Sandor Yu > > In function malidp_crtc_mode_valid, mode->crtc_clock = 0 when run > in drm_helper_probe_single_connector_modes. > Invalid video modes are not culled > and all modes move to the connector's modes list. > It is not expected by mode_valid. > > Replace mode->crtc_clock with mode->clock to fix the issue. > > Signed-off-by: Sandor Yu It looks like at least drm/bridge/cdns-dsi.c does the same thing of using mode->clock when validating, so looks like a legit bug. Acked-by: Liviu Dudau Many thanks, Liviu > --- > drivers/gpu/drm/arm/malidp_crtc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/arm/malidp_crtc.c > b/drivers/gpu/drm/arm/malidp_crtc.c > index 494075ddbef6..55890334385d 100644 > --- a/drivers/gpu/drm/arm/malidp_crtc.c > +++ b/drivers/gpu/drm/arm/malidp_crtc.c > @@ -31,7 +31,7 @@ static enum drm_mode_status malidp_crtc_mode_valid(struct > drm_crtc *crtc, >* check that the hardware can drive the required clock rate, >* but skip the check if the clock is meant to be disabled (req_rate = > 0) >*/ > - long rate, req_rate = mode->crtc_clock * 1000; > + long rate, req_rate = mode->clock * 1000; > > if (req_rate) { > rate = clk_round_rate(hwdev->pxlclk, req_rate); > -- > 2.17.1 > -- | I would like to | | fix the world, | | but they're not | | giving me the | \ source code! / --- ¯\_(ツ)_/¯
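To make the failure mode concrete, here is a small userspace sketch of the check. The struct, round_rate() and mode_ok() are stand-ins invented for illustration; round_rate() merely imitates a clock that cannot exceed a maximum and is not how clk_round_rate() is implemented. The point it shows is that a zero request skips validation entirely, so a field that is still 0 at probe time lets every mode through.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal stand-in for the two drm_display_mode fields discussed above. */
struct mode {
	int clock;      /* pixel clock in kHz, known when the mode is probed */
	int crtc_clock; /* still 0 at probe time, per the commit message */
};

/* Hypothetical clock that cannot go above max_hz. */
static long round_rate(long max_hz, long req_hz)
{
	assert(max_hz > 0 && req_hz > 0);
	return req_hz <= max_hz ? req_hz : max_hz;
}

/* Mirrors the shape of the check: a zero request skips validation. */
static bool mode_ok(long max_hz, long khz)
{
	long req_rate = khz * 1000;

	if (req_rate) {
		long rate = round_rate(max_hz, req_rate);

		if (rate != req_rate)
			return false;
	}
	return true;
}
```

Calling mode_ok() with crtc_clock accepts a 594000 kHz mode on a 165 MHz clock, while calling it with clock rejects the same mode.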
Re: [PATCH 07/11] treewide: Replace the use of mem_encrypt_active() with prot_guest_has()
On 8/12/21 5:07 AM, Kirill A. Shutemov wrote: On Wed, Aug 11, 2021 at 10:52:55AM -0500, Tom Lendacky wrote: On 8/11/21 7:19 AM, Kirill A. Shutemov wrote: On Tue, Aug 10, 2021 at 02:48:54PM -0500, Tom Lendacky wrote: On 8/10/21 1:45 PM, Kuppuswamy, Sathyanarayanan wrote: ... Looking at the code again, now I *think* the reason is accessing a global variable from __startup_64() inside the TDX version of prot_guest_has(). __startup_64() is special. If you access any global variable you need to use fixup_pointer(). See the comment before __startup_64(). I'm not sure how you get away with accessing sme_me_mask directly from there. Any clues? Maybe it is just luck and the compiler generates code that is just right for your case, I don't know. Hmm... yeah, could be that the compiler is using rip-relative addressing for it because it lives in the .data section? I guess. It has to be fixed. It may break with a compiler upgrade or any random change around the code. I'll look at doing that separately from this series. BTW, does it work with clang for you? I haven't tried with clang, I'll check on that. Thanks, Tom
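The fixup_pointer() concern can be illustrated with plain address arithmetic. This is a sketch under the assumption that the helper rebases a link-time address by subtracting the image's link base and adding the address it was actually loaded at; the function and parameter names below are invented, and the addresses are arbitrary small values chosen so the example also works on 32-bit hosts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rebase a link-time address onto the address the image actually runs
 * from. link_base plays the role of the image start symbol, load_base
 * that of the physical load address.
 */
static uintptr_t fixup_address(uintptr_t link_addr, uintptr_t link_base,
			       uintptr_t load_base)
{
	assert(link_addr >= link_base);
	return link_addr - link_base + load_base;
}

/*
 * RIP-relative access needs no fixup: the distance between the code and
 * the variable is the same wherever the image is loaded.
 */
static uintptr_t rip_relative(uintptr_t code_runtime_addr,
			      intptr_t displacement)
{
	return code_runtime_addr + (uintptr_t)displacement;
}
```

If the compiler happens to emit a RIP-relative access, the unfixed code lands on the right byte anyway, which would explain why it can appear to work until code generation changes.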
[PATCH] drm/i915/selftest: Fix use of err in igt_reset_{fail, nop}_engine()
Clang warns: In file included from drivers/gpu/drm/i915/gt/intel_reset.c:1514: drivers/gpu/drm/i915/gt/selftest_hangcheck.c:465:62: warning: variable 'err' is uninitialized when used here [-Wuninitialized] pr_err("[%s] Create context failed: %d!\n", engine->name, err); ^~~ ... drivers/gpu/drm/i915/gt/selftest_hangcheck.c:580:62: warning: variable 'err' is uninitialized when used here [-Wuninitialized] pr_err("[%s] Create context failed: %d!\n", engine->name, err); ^~~ ... 2 warnings generated. This appears to be a copy and paste issue. Use ce directly using the %pe specifier to pretty print the error code so that err is not used uninitialized in these functions. Fixes: 3a7b72665ea5 ("drm/i915/selftest: Bump selftest timeouts for hangcheck") Reported-by: Dan Carpenter Signed-off-by: Nathan Chancellor --- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c index 08f011f893b2..2c1ed32ca5ac 100644 --- a/drivers/gpu/drm/i915/gt/selftest_hangcheck.c +++ b/drivers/gpu/drm/i915/gt/selftest_hangcheck.c @@ -462,7 +462,7 @@ static int igt_reset_nop_engine(void *arg) ce = intel_context_create(engine); if (IS_ERR(ce)) { - pr_err("[%s] Create context failed: %d!\n", engine->name, err); + pr_err("[%s] Create context failed: %pe!\n", engine->name, ce); return PTR_ERR(ce); } @@ -577,7 +577,7 @@ static int igt_reset_fail_engine(void *arg) ce = intel_context_create(engine); if (IS_ERR(ce)) { - pr_err("[%s] Create context failed: %d!\n", engine->name, err); + pr_err("[%s] Create context failed: %pe!\n", engine->name, ce); return PTR_ERR(ce); } base-commit: 927dfdd09d8c03ba100ed0c8c3915f8e1d1f5556 -- 2.33.0.rc2
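For readers unfamiliar with the idiom the fix relies on: the error code travels inside the returned pointer, so no separate err variable is needed at the failure site. Below is a userspace re-implementation written for illustration only; it is not copied from the kernel's err.h, and create_context() is a hypothetical stand-in for intel_context_create().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top MAX_ERRNO values of the address space. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* True if the pointer value falls in the range reserved for error codes. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the negative errno from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical constructor that either succeeds or returns an error pointer. */
static void *create_context(bool fail)
{
	static int context;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&context;
}

/* The fixed pattern: derive the code from the pointer, not a separate err. */
static int use_context(bool fail)
{
	void *ce = create_context(fail);

	if (IS_ERR(ce))
		return (int)PTR_ERR(ce);
	return 0;
}
```

In the kernel, the %pe specifier used by the patch takes such a pointer directly and prints the error it encodes.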
Re: [tegra-drm:drm/tegra/for-next 16/17] drivers/gpu/drm/tegra/dc.c:1843:53: warning: variable 'new_dc_state' set but not used
13.08.2021 19:36, kernel test robot wrote: > tree: git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next > head: ad85b0843ee4536593415ca890d7fb52cd7f1fbe > commit: 04d5d5df9df79f9045e76404775fc8a084aac23d [16/17] drm/tegra: dc: > Support memory bandwidth management > config: arm-defconfig (attached as .config) > compiler: arm-linux-gnueabi-gcc (GCC) 11.2.0 > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > git remote add tegra-drm git://anongit.freedesktop.org/tegra/linux.git > git fetch --no-tags tegra-drm drm/tegra/for-next > git checkout 04d5d5df9df79f9045e76404775fc8a084aac23d > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross > ARCH=arm > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All warnings (new ones prefixed by >>): > >drivers/gpu/drm/tegra/dc.c: In function > 'tegra_crtc_update_memory_bandwidth': >>> drivers/gpu/drm/tegra/dc.c:1843:53: warning: variable 'new_dc_state' set >>> but not used [-Wunused-but-set-variable] > 1843 | const struct tegra_dc_state *old_dc_state, *new_dc_state; > | ^~~~ >>> drivers/gpu/drm/tegra/dc.c:1843:38: warning: variable 'old_dc_state' set >>> but not used [-Wunused-but-set-variable] > 1843 | const struct tegra_dc_state *old_dc_state, *new_dc_state; > | ^~~~ >drivers/gpu/drm/tegra/dc.c: In function > 'tegra_crtc_calculate_memory_bandwidth': >>> drivers/gpu/drm/tegra/dc.c:2223:38: warning: variable 'old_state' set but >>> not used [-Wunused-but-set-variable] > 2223 | const struct drm_crtc_state *old_state; > | ^ > > > vim +/new_dc_state +1843 drivers/gpu/drm/tegra/dc.c > > 1836 > 1837static void > 1838tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc, > 1839 struct drm_atomic_state > *state, > 1840 bool > prepare_bandwidth_transition) > 1841{ > 1842const struct
tegra_plane_state *old_tegra_state, > *new_tegra_state; >> 1843 const struct tegra_dc_state *old_dc_state, *new_dc_state; > 1844u32 i, new_avg_bw, old_avg_bw, new_peak_bw, old_peak_bw; > 1845const struct drm_plane_state *old_plane_state; > 1846const struct drm_crtc_state *old_crtc_state; > 1847struct tegra_dc_window window, old_window; > 1848struct tegra_dc *dc = to_tegra_dc(crtc); > 1849struct tegra_plane *tegra; > 1850struct drm_plane *plane; > 1851 > 1852if (dc->soc->has_nvdisplay) > 1853return; > 1854 > 1855old_crtc_state = drm_atomic_get_old_crtc_state(state, > crtc); > 1856old_dc_state = to_const_dc_state(old_crtc_state); > 1857new_dc_state = to_const_dc_state(crtc->state); I probably should update compiler or set W=1 to get that warning. These variables were used in older versions of the patch and they can be removed now. Please amend the patch with this: diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 9ebb1b6840c6..e2b806369eac 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -1908,7 +1908,6 @@ tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc, bool prepare_bandwidth_transition) { const struct tegra_plane_state *old_tegra_state, *new_tegra_state; - const struct tegra_dc_state *old_dc_state, *new_dc_state; u32 i, new_avg_bw, old_avg_bw, new_peak_bw, old_peak_bw; const struct drm_plane_state *old_plane_state; const struct drm_crtc_state *old_crtc_state; @@ -1921,8 +1920,6 @@ tegra_crtc_update_memory_bandwidth(struct drm_crtc *crtc, return; old_crtc_state = drm_atomic_get_old_crtc_state(state, crtc); - old_dc_state = to_const_dc_state(old_crtc_state); - new_dc_state = to_const_dc_state(crtc->state); if (!crtc->state->active) { if (!old_crtc_state->active) diff --git a/drivers/gpu/drm/tegra/dc.h b/drivers/gpu/drm/tegra/dc.h index 26ad1e448c44..871cfb0cd31c 100644 --- a/drivers/gpu/drm/tegra/dc.h +++ b/drivers/gpu/drm/tegra/dc.h @@ -35,12 +35,6 @@ static inline struct tegra_dc_state *to_dc_state(struct 
drm_crtc_state *state) return NULL; } -static inline const struct tegra_dc_state * -to_const_dc_state(const struct drm_crtc_state *state) -{ - return to_d
Re: [PATCH v5 2/9] moduleparam: add data member to struct kernel_param
On Fri, Aug 13, 2021 at 9:44 AM Andy Shevchenko wrote: > > On Fri, Aug 13, 2021 at 09:17:10AM -0600, Jim Cromie wrote: > > Add a const void* data member to the struct, to allow attaching > > private data that will be used soon by a setter method (via kp->data) > > to perform more elaborate actions. > > > > To attach the data at compile time, add new macros: > > > > module_param_cbd() derives from module_param_cb(), adding data param, > > and latter is redefined to use former. > > > > It calls __module_param_call_wdata(), which accepts a new data param > > and inits .data with it. Re-define __module_param_call() to use it. > > > > Use of this new data member will be rare, it might be worth redoing > > this as a separate/sub-type to de-bloat the base case. > > ... > > > +#define module_param_cbd(name, ops, arg, perm, data) > > \ > > + __module_param_call_wdata(MODULE_PARAM_PREFIX, name, ops, arg, perm, > > -1, 0, data) > > Cryptic name. Moreover, inconsistent with the rest. > What about module_param_cb_data() ? > > > #define module_param_cb_unsafe(name, ops, arg, perm) > > \ > > __module_param_call(MODULE_PARAM_PREFIX, name, ops, arg, perm, -1, > > \ > > KERNEL_PARAM_FL_UNSAFE) > > (above left for the above comment) > > ... > > > +#define __module_param_call_wdata(prefix, name, ops, arg, perm, level, > > flags, data) \ > > Similar __module_param_call_with_data() > > -- > With Best Regards, > Andy Shevchenko > > yes to all renames, revised. thanks
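The macro layering described in the patch (the existing macro redefined in terms of a new one that takes a data argument) can be sketched in userspace as follows. The struct and macro names are cut-down stand-ins invented for this example; the real struct kernel_param and __module_param_call() carry more fields and place the object in a dedicated section.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for struct kernel_param with the proposed data member. */
struct param {
	const char *name;
	void *arg;
	const void *data; /* private data for the setter, NULL if unused */
};

/* Full form: the only macro that actually builds the initializer. */
#define PARAM_CALL_WITH_DATA(pname, parg, pdata) \
	{ .name = #pname, .arg = (parg), .data = (pdata) }

/* The existing form becomes a thin wrapper that passes no data. */
#define PARAM_CALL(pname, parg) \
	PARAM_CALL_WITH_DATA(pname, parg, NULL)

static int verbose;
static const char *const level_names[] = { "off", "on" };

static struct param p_plain = PARAM_CALL(verbose, &verbose);
static struct param p_data  = PARAM_CALL_WITH_DATA(verbose, &verbose, level_names);

/* A setter can reach its private table through the param, as kp->data would. */
static const char *level_name(const struct param *kp, int level)
{
	const char *const *names = kp->data;

	assert(names != NULL && (level == 0 || level == 1));
	return names[level];
}
```

Existing users of the short form are unaffected; they simply end up with a NULL data pointer.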
Re: [tegra-drm:drm/tegra/for-next 16/17] drivers/gpu/drm/tegra/dc.c:1843:53: warning: variable 'new_dc_state' set but not used
13.08.2021 20:12, Dmitry Osipenko wrote: ... > I probably should update compiler or set W=1 to get that warning. These > variables were used in older versions of the patch and they can be removed > now. > > Please amend the patch with this: Perhaps too late already. I'll make a patch for that and then also check whether the UAPI patches were fixed.
Re: [PATCH v2 02/12] mm: Introduce a function to check for virtualization protection features
On 8/13/21 9:59 AM, Tom Lendacky wrote: In prep for other protected virtualization technologies, introduce a generic helper function, prot_guest_has(), that can be used to check for specific protection attributes, like memory encryption. This is intended to eliminate having to add multiple technology-specific checks to the code (e.g. if (sev_active() || tdx_active())). Reviewed-by: Joerg Roedel Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/Kconfig| 3 +++ include/linux/protected_guest.h | 35 + 2 files changed, 38 insertions(+) create mode 100644 include/linux/protected_guest.h Reviewed-by: Kuppuswamy Sathyanarayanan -- Sathyanarayanan Kuppuswamy Linux Kernel Developer
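The motivation quoted above, avoiding call sites of the form `if (sev_active() || tdx_active())`, can be illustrated with a toy dispatcher. This is a sketch only: the registration table, the backend functions and the attribute value are all invented here, and the series itself does not use a runtime registry. What it shows is the property the commit message is after, namely that a call site names an attribute and never a technology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PATTR_GUEST_MEM_ENCRYPT 1u /* value assumed for this sketch */
#define MAX_BACKENDS 4

typedef bool (*prot_backend_fn)(unsigned int attr);

static prot_backend_fn backends[MAX_BACKENDS];
static size_t nr_backends;

/* Hypothetical per-technology state, set by platform detection code. */
static bool sev_enabled;
static bool tdx_enabled;

static bool sev_backend(unsigned int attr)
{
	return sev_enabled && attr == PATTR_GUEST_MEM_ENCRYPT;
}

static bool tdx_backend(unsigned int attr)
{
	return tdx_enabled && attr == PATTR_GUEST_MEM_ENCRYPT;
}

static void register_backend(prot_backend_fn fn)
{
	assert(fn != NULL && nr_backends < MAX_BACKENDS);
	backends[nr_backends++] = fn;
}

/* Every registered backend is asked; callers never name a technology. */
static bool prot_guest_has(unsigned int attr)
{
	for (size_t i = 0; i < nr_backends; i++)
		if (backends[i](attr))
			return true;
	return false;
}

/* A call site: one attribute check instead of a chain of vendor checks. */
static bool pages_need_decrypting(void)
{
	return prot_guest_has(PATTR_GUEST_MEM_ENCRYPT);
}
```

Adding a further technology then means adding a backend, not touching pages_need_decrypting() or any other caller.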
Re: [PATCH v2 00/12] Implement generic prot_guest_has() helper function
On 8/13/21 11:59 AM, Tom Lendacky wrote: This patch series provides a generic helper function, prot_guest_has(), to replace the sme_active(), sev_active(), sev_es_active() and mem_encrypt_active() functions. It is expected that as new protected virtualization technologies are added to the kernel, they can all be covered by a single function call instead of a collection of specific function calls all called from the same locations. The powerpc and s390 patches have been compile tested only. Can the folks copied on this series verify that nothing breaks for them. There are some patches related to PPC that added new calls to the mem_encrypt_active() function that are not yet in the tip tree. After the merge window, I'll need to send a v3 with those additional changes before this series can be applied. Thanks, Tom
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 Duncan (1i5t5.dun...@cox.net) changed: What|Removed |Added CC||1i5t5.dun...@cox.net --- Comment #2 from Duncan (1i5t5.dun...@cox.net) --- This has been reported (by someone else) on the dri-devel list (with the main kernel list and the devs CCed) as well, with me confirming it there. No answer from the devs there either. The reporter and I followed reporting instructions to take it to the list, and no hint it was even seen, despite the release getting closer and closer. So I was going to try bugzilla (despite instructions to take it to the list), to see if I could raise the profile a bit, and find this bug. Anyway, it's on both channels now. FWIW: https://lore.kernel.org/dri-devel/?q=5.14.0-rc4+broke+drm%2Fttm Tho FWIW your symptoms are a bit different than those of the OP there and I. We were able to boot, but only to legacy low-res VGA mode. He has a boot-splash enabled and the screen blanked from early boot when the drm-framebuffer would normally take over until the login prompt, which appeared in vga mode. I prefer to see the boot messages so no splash, and didn't have it blank, the screen just never left the legacy vga mode it normally uses for early boot. We're both on Radeons; he's on the old radeon driver while I'm on amdgpu (polaris-11, rx460). I wonder if you don't have the legacy vgacon (CONFIG_VGA_CONSOLE) enabled as a fall-back, as that would explain an apparent hang due to no valid graphics (tho the system may have booted, just without graphics). Alternatively, I don't know what the behavior of non-radeon/amdgpu drm-framebuffer drivers is, maybe whatever you're running either does hang or simply doesn't fall back to vgacon as our radeon and amdgpu drivers did? But in both his case and mine it bisected to the same commit, 69de4421bb, and reverting it against current gave both of us working systems again, so it's the same bug. -- You may reply to this email to add a comment. 
You are receiving this mail because: You are watching the assignee of the bug.
Re: [PATCH 39/64] mac80211: Use memset_after() to clear tx status
On Fri, 2021-08-13 at 09:08 -0700, Kees Cook wrote: > > > > The common helper should also clear ack_signal, but that was broken by > > commit e3e1a0bcb3f1 ("mac80211: reduce IEEE80211_TX_MAX_RATES"), because > > that commit changed the order of the fields and updated carl9170 and p54 > > properly but not the common helper... > > It looks like p54 actually uses the rates, which is why it does this > manually. I can't see why carl9170 does this manually, though. mac80211 also uses the rates again later on status reporting; it just expects the number of attempts to be filled in, etc. I haven't looked at carl9170, but I would expect it to do something there and do it correctly: even though it's old, it's a well-written driver and uses mac80211 rate control, so this would need to be correct for decent performance. But I guess it could be that the helper could be used because the rates were already handed to the firmware, and the code was just copy/pasted from p54 (the drivers were, IIRC, developed by the same folks). johannes
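To illustrate why a "clear everything after this member" helper is sensitive to field order, here is a minimal userspace sketch. The `clear_after()` macro is a hypothetical stand-in for the idea behind memset_after(), not the kernel macro itself, and both struct layouts are invented for the example; they are not the real struct ieee80211_tx_info, and which way the fields actually moved in e3e1a0bcb3f1 is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the idea behind memset_after(): fill every
 * byte of *obj that lies after the end of `member` with the value v.
 */
#define clear_after(obj, v, member)                                      \
	do {                                                              \
		unsigned char *base_ = (unsigned char *)(obj);            \
		size_t end_ = (size_t)((unsigned char *)&(obj)->member -  \
				       base_) + sizeof((obj)->member);    \
		memset(base_ + end_, (v), sizeof(*(obj)) - end_);         \
	} while (0)

/* Invented layout: ack_signal sits after rates, so it gets cleared. */
struct status_rates_first {
	unsigned char rates[4];
	int ack_signal;
	int ampdu_len;
};

/* Invented layout: ack_signal sits before rates, so it survives. */
struct status_signal_first {
	int ack_signal;
	unsigned char rates[4];
	int ampdu_len;
};

int ack_signal_after_clear_rates_first(void)
{
	struct status_rates_first s = {
		.rates = { 1, 2, 3, 4 }, .ack_signal = -42, .ampdu_len = 7,
	};

	clear_after(&s, 0, rates);
	assert(s.rates[0] == 1);	/* the rates themselves are kept */
	return s.ack_signal;
}

int ack_signal_after_clear_signal_first(void)
{
	struct status_signal_first s = {
		.ack_signal = -42, .rates = { 1, 2, 3, 4 }, .ampdu_len = 7,
	};

	clear_after(&s, 0, rates);
	return s.ack_signal;
}
```

With the first layout the stale ack_signal is wiped along with everything else behind the rates; with the second it is left untouched, which is the kind of silent breakage a field reorder can cause when only some callers are updated.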
[Bug 214001] [bisected][regression] After commit "drm/ttm: Initialize debugfs from ttm_global_init()" kernels without debugfs explicitly set to 'allow all' fail to boot
https://bugzilla.kernel.org/show_bug.cgi?id=214001 --- Comment #3 from Linux_Chemist (untaintablean...@hotmail.co.uk) --- Thanks for your comment, Duncan! Yes, I'm on a customised kernel that has a lot removed (including debugfs as you can tell) and also amdgpu (RX 5700). There's usually a bug in a testing RC every few releases; I just report them here after bisecting. Seems the right place for it even if it's not lol. Caught a nice bug last release cycle with the memory reservation for the bios (https://www.phoronix.com/scan.php?page=news_item&px=Linux-Always-Reserve-1MB). (I wasn't sure whether to file this one as an AMD ("non-intel") specific 'video' bug, but the commit was for 'drivers/gpu/drm/ttm', which I assume is agnostic. I don't know what it's for or whether only amdgpu/radeon makes use of it, but it is interesting that the 3 of us have similar hardware.) I can confirm all my .configs have had CONFIG_VGA_CONSOLE=y in them (though a lot of fallback stuff is pulled out, which probably stops me getting the legacy low-res VGA mode you mention, c'est la vie). But anyways, as you say, the ability to create a bootable kernel only becomes an issue from the commit in question when not having CONFIG_DEBUG_FS=y (and CONFIG_DEBUG_FS_ALLOW_ALL=y along with that). Don't get me wrong, it's not a showstopper 'massive bug' because you can always put debugfs + 'allow all' into your kernel; I did so and am happily on rc5 now. But that's why I'd like a consensus to be known or shared (i.e. change the wording for the kconfig options) about whether a lot of things are expecting debugfs to be there in some form now: is it now an 'essential' part of the kernel? Or should things that rely on it fail gracefully if they don't find it? Either it's essential and this isn't a bug and there needs to be clarification that debugfs should always be there in some form, or this is a bug and the commit needs to be tweaked to account for debugfs not being there or being there in a diminished capacity. 
It is a bit silly that even CONFIG_DEBUG_FS_ALLOW_NONE wouldn't work for this bug because that seems like it should be providing that 'fail gracefully' mechanism to debugfs being 'there' but 'don't bother with it'.
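The "fail gracefully" option raised in this comment can be sketched in plain userspace C. Everything below is a hypothetical model: `fake_debugfs_create_dir()`, the two `subsystem_init_*()` variants and the error-pointer helpers only imitate the kernel convention of returning an error code encoded in a pointer. It is not the TTM code and not the fix that was eventually merged.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitation of the kernel's error-pointer convention. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct dir {
	const char *name;
};

static struct dir ttm_dir = { "ttm" };
static struct dir *debug_root;	/* NULL means "no debug files" */

/* Hypothetical: hands back an error pointer when debug support is off. */
static struct dir *fake_debugfs_create_dir(bool debugfs_available)
{
	return debugfs_available ? &ttm_dir : err_ptr(-ENODEV);
}

/* Strict variant: a missing debug directory aborts initialisation. */
int subsystem_init_strict(bool debugfs_available)
{
	struct dir *d = fake_debugfs_create_dir(debugfs_available);

	assert(d != NULL);
	if (is_err(d))
		return -ENODEV;
	debug_root = d;
	return 0;
}

/* Graceful variant: debug files are optional, init still succeeds. */
int subsystem_init_graceful(bool debugfs_available)
{
	struct dir *d = fake_debugfs_create_dir(debugfs_available);

	debug_root = is_err(d) ? NULL : d;
	return 0;
}

bool debug_files_enabled(void)
{
	return debug_root != NULL;
}
```

The strict variant reproduces the shape of the reported regression (no debugfs, no working subsystem); the graceful variant treats the debug directory as a nice-to-have and carries on without it.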
Re: [PATCH 5/9] drm/i915/guc: Flush the work queue for GuC generated G2H
On Fri, Aug 13, 2021 at 05:11:59PM +0200, Daniel Vetter wrote: > On Thu, Aug 12, 2021 at 10:38:18PM +, Matthew Brost wrote: > > On Thu, Aug 12, 2021 at 09:47:23PM +0200, Daniel Vetter wrote: > > > On Thu, Aug 12, 2021 at 03:23:30PM +, Matthew Brost wrote: > > > > On Thu, Aug 12, 2021 at 04:11:28PM +0200, Daniel Vetter wrote: > > > > > On Wed, Aug 11, 2021 at 01:16:18AM +, Matthew Brost wrote: > > > > > > Flush the work queue for GuC generated G2H messages during a GT > > > > > > reset. > > > > > > This is accomplished by spinning on the list of outstanding G2H > > > > > > to > > > > > > go empty. > > > > > > > > > > > > Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new > > > > > > GuC interface") > > > > > > Signed-off-by: Matthew Brost > > > > > > Cc: > > > > > > --- > > > > > > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 5 + > > > > > > 1 file changed, 5 insertions(+) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > > index 3cd2da6f5c03..e5eb2df11b4a 100644 > > > > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c > > > > > > @@ -727,6 +727,11 @@ void intel_guc_submission_reset_prepare(struct > > > > > > intel_guc *guc) > > > > > > wait_for_reset(guc, > > > > > > &guc->outstanding_submission_g2h); > > > > > > } while (!list_empty(&guc->ct.requests.incoming)); > > > > > > } > > > > > > + > > > > > > + /* Flush any GuC generated G2H */ > > > > > > + while (!list_empty(&guc->ct.requests.incoming)) > > > > > > + msleep(20); > > > > > > > > > > flush_work or flush_workqueue, because that comes with lockdep > > > > > annotations. Don't hand-roll stuff like this if at all possible. > > > > > > > > lockdep puked when I used that. > > > > > > Lockdep tends to be right ... 
So we definitely want that, but maybe a > > > different flavour, or there's something wrong with the workqueue setup. > > > > > > > Here is a dependency chain that lockdep doesn't like. > > > > fs_reclaim_acquire -> &gt->reset.mutex (shrinker) > > workqueue -> fs_reclaim_acquire (error capture in workqueue) > > &gt->reset.mutex -> workqueue (reset) > > > > In practice I don't think we could ever hit this, but lockdep does > > look right here. Trying to work out how to fix this. We really need > > all G2H to be done being processed before we proceed during a reset or we > > have races. Have a few ideas of how to handle this but can't convince > > myself any of them are fully safe. > > It might be false sharing due to a single workqueue, or a single-threaded > workqueue. > > Essentially the lockdep annotations for work_struct track two things: > - dependencies against the specific work item > - dependencies against anything queued on that work queue, if you flush > the entire queue, or if you flush a work item that's on a > single-threaded queue. > > Because if guc/host communication is inverted like this here, you have a > much bigger problem. > > Note that if you pick a different workqueue for your guc work stuff then > you need to make sure that's all properly flushed on suspend and driver > unload. > > It might also be that the reset work is on the wrong workqueue. > > Either way, this must be fixed, because I've seen too many of these "it > never happens in practice" blow up, plus if your locking scheme is > engineered with quicksand forget about anyone ever understanding it. The solution is to allocate memory for the error capture in an atomic context if the error capture is being done from the G2H work queue. That means this can possibly fail if the system is under memory pressure, but that is better than a lockdep splat. 
Matt > -Daniel > > > > > Splat below: > > > > [ 154.625989] == > > [ 154.632195] WARNING: possible circular locking dependency detected > > [ 154.638393] 5.14.0-rc5-guc+ #50 Tainted: G U > > [ 154.643991] -- > > [ 154.650196] i915_selftest/1673 is trying to acquire lock: > > [ 154.655621] 8881079cb918 > > ((work_completion)(&ct->requests.worker)){+.+.}-{0:0}, at: > > __flush_work+0x350/0x4d0 > > [ 154.665826] > > but task is already holding lock: > > [ 154.671682] 8881079cbfb8 (&gt->reset.mutex){+.+.}-{3:3}, at: > > intel_gt_reset+0xf0/0x300 [i915] > > [ 154.680659] > > which lock already depends on the new lock. > > > > [ 154.688857] > > the existing dependency chain (in reverse order) is: > > [ 154.696365] > > -> #2 (&gt->reset.mutex){+.+.}-{3:3}: > > [ 154.702571] lock_acquire+0xd2/0x300 > > [ 154.706695] i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915] > > [ 154.712959] intel_gt_init_reset+0x61/0x80 [i915]
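The three-edge dependency chain described in this thread can be modelled as a tiny directed graph, which may help show why lockdep objects and why an atomic allocation in the error capture would break the cycle. This is only an illustrative userspace sketch: the lock-class names follow the chain quoted in the mail, but the graph code and every function name here are hypothetical, and the real lockdep implementation is far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Lock classes taken from the chain quoted in the mail. */
enum lock_class { FS_RECLAIM, RESET_MUTEX, G2H_WORKER, NR_CLASSES };

/* edge[a][b] is true when b has been acquired while a was held. */
static bool edge[NR_CLASSES][NR_CLASSES];

void add_dependency(enum lock_class held, enum lock_class taken)
{
	assert(held < NR_CLASSES && taken < NR_CLASSES);
	edge[held][taken] = true;
}

/* Depth-first search: can `to` be reached from `from` via known edges? */
static bool reaches(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (edge[from][next] && reaches(next, to, seen))
			return true;
	return false;
}

/* Taking `taken` while holding `held` closes a cycle if taken already
 * leads back to held through previously recorded dependencies. */
bool would_close_cycle(enum lock_class held, enum lock_class taken)
{
	bool seen[NR_CLASSES] = { false };

	return reaches(taken, held, seen);
}

/* Rebuild the chain from the mail and ask whether flushing the G2H
 * worker under the reset mutex would be flagged. */
bool reset_flush_is_flagged(bool capture_may_reclaim)
{
	memset(edge, 0, sizeof(edge));
	/* shrinker: fs_reclaim -> gt->reset.mutex */
	add_dependency(FS_RECLAIM, RESET_MUTEX);
	/* error capture in the worker: workqueue -> fs_reclaim */
	if (capture_may_reclaim)
		add_dependency(G2H_WORKER, FS_RECLAIM);
	/* reset: gt->reset.mutex -> workqueue (the flush being added) */
	return would_close_cycle(RESET_MUTEX, G2H_WORKER);
}
```

With `capture_may_reclaim` set, the flush completes the loop reset.mutex -> workqueue -> fs_reclaim -> reset.mutex. When the error capture cannot enter reclaim (the atomic-allocation approach proposed above), the workqueue -> fs_reclaim edge disappears and so does the cycle, at the cost of the allocation possibly failing under memory pressure.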
Re: [PATCH v1] fbtft: fb_st7789v: added reset on init_display()
On 13/08/21, Greg KH wrote: > On Fri, Aug 13, 2021 at 08:25:10AM +0200, Oliver Graute wrote: > > staging: fbtft: fb_st7789v: reset display before initialization > > What is this line here, and why is this not your subject line instead? I'll put the line as subject instead. > > In rare cases the display is flipped or mirrored. This was observed more > > often in a low temperature environment. A clean reset on init_display() > > should help to get registers in a sane state. > > > > Signed-off-by: Oliver Graute > > What commit does this fix? this is a fix for a rare behavior of the fb_st7789v display. Not a bugfix for a specific commit. Best regards, Oliver