Re: [PATCH next 09/11] tree-wide: minmax: Replace all the uses of max() for array sizes with max_const()
On Sun, Jan 28, 2024 at 07:34:23PM +, David Laight wrote: > These are the only uses of max() that require a constant value > from constant parameters. > There don't seem to be any similar uses of min(). > > Replacing the max() by max_const() lets min()/max() be simplified > speeding up compilation. > > max_const() will convert enums to int (or unsigned int) so that the > casts added by max_t() are no longer needed. > > Signed-off-by: David Laight > --- For > fs/btrfs/tree-checker.c| 2 +- Acked-by: David Sterba
Re: [PATCH 2/2] drm/bridge: samsung-dsim: Fix porch calculation rounding
On 25.01.24 19:44, Adam Ford wrote: > On Mon, Dec 11, 2023 at 9:33 PM Adam Ford wrote: >> >> When using video sync pulses, the HFP, HBP, and HSA are divided between >> the available lanes if there is more than one lane. For certain >> timings and lane configurations, the HFP may not be evenly divisible. >> If the HFP is rounded down, it ends up being too small which can cause >> some monitors to not sync properly. In these instances, adjust htotal >> and hsync to round the HFP up, and recalculate the htotal. >> >> Tested-by: Frieder Schrempf # Kontron BL >> i.MX8MM with HDMI monitor >> Signed-off-by: Adam Ford > > Gentle nudge on this one. Basically this fixes an issue with the 8MP, > but it's still unknown why it doesn't work on 8MM or 8MN, but Frieder > confirmed there are no regressions on 8MM or 8MN. I only tested two specific display setups on i.MX8MM. So of course I can't confirm the absence of regressions in general. Anyway, I think this should be applied.
RE: [PATCH linux-next/master 1/1] media: staging/imx7: Add handling for the abnormal interrupt BIT_STATFF_INT
@rmf...@gmail.com@Laurent Pinchart@martin.kepplin...@puri.sm@ker...@puri.sm@linux-me...@vger.kernel.org Any feedback? Thanks B&R. Name: Alice Yuan Address: No.192, Zhangjiang Liangjing Road, Pudong New Area, Shanghai > -Original Message- > From: Alice Yuan > Sent: 8 December 2023 10:15 > To: rmf...@gmail.com; Laurent Pinchart > ; martin.kepplin...@puri.sm; > ker...@puri.sm; linux-me...@vger.kernel.org > Cc: LnxRevLi ; linux-de...@linux.nxdi.nxp.com; > ker...@pengutronix.de; dl-linux-imx ; > dri-devel@lists.freedesktop.org; linux-arm-ker...@lists.infradead.org; > patchwork-...@pengutronix.de > Subject: [PATCH linux-next/master 1/1] media: staging/imx7: Add handling for the > abnormal interrupt BIT_STATFF_INT > > From: "alice.yuan" > > When doing 2592x1944 resolution capture, we found that sometimes > "BIT_STATFF_INT" is abnormal: the stat FIFO is not full, but DMA transfer > done interrupts are still generated, and at that point we get broken frames. > > From the i.MX8MM reference manual, the STATFIFO full interrupt > status indicates that the amount of data in the STATFIFO has reached the trigger > level. BIT_STATFF_INT is clearly defined: 0: STATFIFO is not full, 1: > STATFIFO is full. > > So we should handle the abnormal interrupts when BIT_STATFF_INT is 0 and the > stat FIFO is not full: drop the broken frames and recover from the > abnormal status. 
> > Signed-off-by: alice.yuan > --- > drivers/media/platform/nxp/imx7-media-csi.c | 25 - > 1 file changed, 19 insertions(+), 6 deletions(-) > > diff --git a/drivers/media/platform/nxp/imx7-media-csi.c > b/drivers/media/platform/nxp/imx7-media-csi.c > index 15049c6aab37..9012b155c2d7 100644 > --- a/drivers/media/platform/nxp/imx7-media-csi.c > +++ b/drivers/media/platform/nxp/imx7-media-csi.c > @@ -249,6 +249,7 @@ struct imx7_csi { > bool is_streaming; > int buf_num; > u32 frame_sequence; > + int frame_skip; > > bool last_eof; > struct completion last_eof_completion; @@ -686,6 +687,7 @@ static > void imx7_csi_enable(struct imx7_csi *csi) > imx7_csi_dmareq_rff_enable(csi); > imx7_csi_hw_enable(csi); > > + csi->frame_skip = 0; > if (csi->model == IMX7_CSI_IMX8MQ) > imx7_csi_baseaddr_switch_on_second_frame(csi); > } > @@ -764,6 +766,12 @@ static irqreturn_t imx7_csi_irq_handler(int irq, void > *data) > imx7_csi_error_recovery(csi); > } > > + if (!(status & BIT_STATFF_INT)) { > + dev_warn(csi->dev, "Stat fifo is not full\n"); > + imx7_csi_error_recovery(csi); > + csi->frame_skip++; > + } > + > if (status & BIT_ADDR_CH_ERR_INT) { > imx7_csi_hw_disable(csi); > > @@ -790,14 +798,19 @@ static irqreturn_t imx7_csi_irq_handler(int irq, void > *data) > > if ((status & BIT_DMA_TSF_DONE_FB1) || > (status & BIT_DMA_TSF_DONE_FB2)) { > - imx7_csi_vb2_buf_done(csi); > - > - if (csi->last_eof) { > - complete(&csi->last_eof_completion); > - csi->last_eof = false; > + if (csi->frame_skip) { > + dev_warn(csi->dev, "skip frame: %d\n", csi->frame_skip); > + csi->frame_skip--; > + goto out; > + } else { > + imx7_csi_vb2_buf_done(csi); > + if (csi->last_eof) { > + complete(&csi->last_eof_completion); > + csi->last_eof = false; > + } > } > } > - > +out: > spin_unlock(&csi->irqlock); > > return IRQ_HANDLED; > -- > 2.37.1
[PATCH] drm/i915: use READ_ONCE() to read vma->iomap in concurrent environment
In i915_vma_pin_iomap(), vma->iomap is read using READ_ONCE() at line 562 562ptr = READ_ONCE(vma->iomap); but is read directly at line 597 592if (unlikely(cmpxchg(&vma->iomap, NULL, ptr))) { 593if (page_unmask_bits(ptr)) 594__i915_gem_object_release_map(vma->obj); 595else 596io_mapping_unmap(ptr); 597ptr = vma->iomap; There is a similar patch: https://github.com/torvalds/linux/commit/c1c0ce31b2420d5c173228a2132a492ede03d81f That patch found two reads of the same variable where one was protected and the other was not, and added READ_ONCE() to protect the unprotected read. Signed-off-by: linke li --- drivers/gpu/drm/i915/i915_vma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index d09aad34ba37..9fcc11db0505 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -594,7 +594,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma) __i915_gem_object_release_map(vma->obj); else io_mapping_unmap(ptr); - ptr = vma->iomap; + ptr = READ_ONCE(vma->iomap); } } -- 2.39.3 (Apple Git-145)
Re: [PATCH RFC for upstream 1/4] dt-bindings: display: panel-simple: add ETML1010G3DRA
Hi Conor, On Fri, 2024-01-26 at 16:27 +, Conor Dooley wrote: > Hey, > > On Fri, Jan 26, 2024 at 09:57:23AM +0100, Yannic Moog wrote: > > Add Emerging Display Technology Corp. etml1010g3dra 10.1" LCD-TFT LVDS > > panel compatible string. > > > > Signed-off-by: Yannic Moog > > > [PATCH RFC for upstream 1/4] > > The "for upstream" here is not really relevant, what else would the > patch be for? I sent this for internal review first and forgot to remove the tags, sorry. Yannic > > Acked-by: Conor Dooley > > Thanks, > Conor. >
Re: [PATCH v3 1/3] drm/atomic: Allow drivers to write their own plane check for async flips
On Sun, 28 Jan 2024 18:25:13 -0300 André Almeida wrote: > Some hardware are more flexible on what they can flip asynchronously, so > rework the plane check so drivers can implement their own check, lifting > up some of the restrictions. > > Signed-off-by: André Almeida > --- > v3: no changes > > drivers/gpu/drm/drm_atomic_uapi.c | 62 ++- > include/drm/drm_atomic_uapi.h | 12 ++ > include/drm/drm_plane.h | 5 +++ > 3 files changed, 62 insertions(+), 17 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c > b/drivers/gpu/drm/drm_atomic_uapi.c > index aee4a65d4959..6d5b9fec90c7 100644 > --- a/drivers/gpu/drm/drm_atomic_uapi.c > +++ b/drivers/gpu/drm/drm_atomic_uapi.c > @@ -620,7 +620,7 @@ static int drm_atomic_plane_set_property(struct drm_plane > *plane, > return 0; > } > > -static int > +int > drm_atomic_plane_get_property(struct drm_plane *plane, > const struct drm_plane_state *state, > struct drm_property *property, uint64_t *val) > @@ -683,6 +683,7 @@ drm_atomic_plane_get_property(struct drm_plane *plane, > > return 0; > } > +EXPORT_SYMBOL(drm_atomic_plane_get_property); > > static int drm_atomic_set_writeback_fb_for_connector( > struct drm_connector_state *conn_state, > @@ -1026,18 +1027,54 @@ int drm_atomic_connector_commit_dpms(struct > drm_atomic_state *state, > return ret; > } > > -static int drm_atomic_check_prop_changes(int ret, uint64_t old_val, uint64_t > prop_value, > +int drm_atomic_check_prop_changes(int ret, uint64_t old_val, uint64_t > prop_value, Hi, should the word "async" be somewhere in the function name? 
>struct drm_property *prop) > { > if (ret != 0 || old_val != prop_value) { > drm_dbg_atomic(prop->dev, > -"[PROP:%d:%s] No prop can be changed during > async flip\n", > +"[PROP:%d:%s] This prop cannot be changed during > async flip\n", > prop->base.id, prop->name); > return -EINVAL; > } > > return 0; > } > +EXPORT_SYMBOL(drm_atomic_check_prop_changes); > + > +/* plane changes may have exceptions, so we have a special function for them > */ > +static int drm_atomic_check_plane_changes(struct drm_property *prop, > + struct drm_plane *plane, > + struct drm_plane_state *plane_state, > + struct drm_mode_object *obj, > + u64 prop_value, u64 old_val) > +{ > + struct drm_mode_config *config = &plane->dev->mode_config; > + int ret; > + > + if (plane->funcs->check_async_props) > + return plane->funcs->check_async_props(prop, plane, plane_state, > + obj, prop_value, > old_val); Is it really ok to allow drivers to opt-in to also *reject* the FB_ID and IN_FENCE_FD props on async commits? Either intentionally or by accident. > + > + /* > + * if you are trying to change something other than the FB ID, your > + * change will be either rejected or ignored, so we can stop the check > + * here > + */ > + if (prop != config->prop_fb_id) { > + ret = drm_atomic_plane_get_property(plane, plane_state, > + prop, &old_val); > + return drm_atomic_check_prop_changes(ret, old_val, prop_value, > prop); When I first read this code, it seemed to say: "If the prop is not FB_ID, then do the usual prop change checking that happens on all changes, not only async.". Reading this patch a few more times over, I finally realized drm_atomic_check_prop_changes() is about async specifically. 
> + } > + > + if (plane_state->plane->type != DRM_PLANE_TYPE_PRIMARY) { > + drm_dbg_atomic(prop->dev, > +"[OBJECT:%d] Only primary planes can be changed > during async flip\n", > +obj->id); > + return -EINVAL; > + } > + > + return 0; > +} > > int drm_atomic_set_property(struct drm_atomic_state *state, > struct drm_file *file_priv, > @@ -1100,7 +1137,6 @@ int drm_atomic_set_property(struct drm_atomic_state > *state, > case DRM_MODE_OBJECT_PLANE: { > struct drm_plane *plane = obj_to_plane(obj); > struct drm_plane_state *plane_state; > - struct drm_mode_config *config = &plane->dev->mode_config; > > plane_state = drm_atomic_get_plane_state(state, plane); > if (IS_ERR(plane_state)) { > @@ -1108,19 +1144,11 @@ int drm_atomic_set_property(struct drm_atomic_state > *state, > break; > } > > - if (async_flip && prop != config->prop_fb_id) { > -
Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker
On Mon, 29 Jan 2024 09:16:04 +0300 Dmitry Osipenko wrote: > On 1/26/24 21:12, Boris Brezillon wrote: > > On Fri, 26 Jan 2024 19:27:49 +0300 > > Dmitry Osipenko wrote: > > > >> On 1/26/24 12:55, Boris Brezillon wrote: > >>> On Fri, 26 Jan 2024 00:56:47 +0300 > >>> Dmitry Osipenko wrote: > >>> > On 1/25/24 13:19, Boris Brezillon wrote: > > On Fri, 5 Jan 2024 21:46:16 +0300 > > Dmitry Osipenko wrote: > > > >> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object > >> *shmem) > >> +{ > >> + return (shmem->madv >= 0) && shmem->base.funcs->evict && > >> + refcount_read(&shmem->pages_use_count) && > >> + !refcount_read(&shmem->pages_pin_count) && > >> + !shmem->base.dma_buf && !shmem->base.import_attach && > >> + !shmem->evicted; > > > > Are we missing > > > > && dma_resv_test_signaled(shmem->base.resv, > > DMA_RESV_USAGE_BOOKKEEP) > > > > to make sure the GPU is done using the BO? > > The same applies to drm_gem_shmem_is_purgeable() BTW. > > > > If you don't want to do this test here, we need a way to let drivers > > provide a custom is_{evictable,purgeable}() test. > > > > I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked() > > to let drivers move the GEMs that were used most recently (those > > referenced by a GPU job) at the end of the evictable LRU. > > We have the signaled-check in the common drm_gem_evict() helper: > > https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496 > > >>> > >>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of > >>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific > >>> > >>> ->evict() hook (though that means calling dma_resv_test_signaled() > >>> twice, which is not great, oh well). > >> > >> Maybe we should change drm_gem_evict() to use BOOKKEEP. The > >> test_signaled(BOOKKEEP) should be a "stronger" check than > >> test_signaled(READ)? > > > > It is, just wondering if some users have a good reason to want > > READ here. 
> > > >> > >>> The problem about the evictable LRU remains though: we need a way to let > >>> drivers put their BOs at the end of the list when the BO has been used > >>> by the GPU, don't we? > >> > >> If BO is use, then it won't be evicted, while idling BOs will be > >> evicted. Hence, the used BOs will be naturally moved down the LRU list > >> each time shrinker is invoked. > >> > > > > That only do the trick if the BOs being used most often are busy when > > the shrinker kicks in though. Let's take this scenario: > > > > > > BO 1BO 2 > > shinker > > > > busy > > idle (first-pos-in-evictable-LRU) > > > > busy > > idle (second-pos-in-evictable-LRU) > > > > busy > > idle > > > > busy > > idle > > > > busy > > idle > > > > > > find a BO to evict > > > > pick BO 2 > > > > busy (swapin) > > idle > > > > If the LRU had been updated at each busy event, BO 1 should have > > been picked for eviction. But we evicted the BO that was first > > recorded idle instead of the one that was least recently > > recorded busy. > > You have to swapin(BO) every time BO goes to busy state, and swapin does > drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved > down the LRU list. Ah, that's the bit I was missing. It makes sense now. I guess that's good enough for now, we can sort out the BOOKKEEP vs READ in a follow-up series. Reviewed-by: Boris Brezillon
Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker
On Fri, 5 Jan 2024 21:46:16 +0300 Dmitry Osipenko wrote: > +/** > + * drm_gem_shmem_swapin_locked() - Moves shmem GEM back to memory and enables > + * hardware access to the memory. > + * @shmem: shmem GEM object > + * > + * This function moves shmem GEM back to memory if it was previously evicted > + * by the memory shrinker. The GEM is ready to use on success. > + * > + * Returns: > + * 0 on success or a negative error code on failure. > + */ > +int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem) > +{ > + int err; > + > + dma_resv_assert_held(shmem->base.resv); > + > + if (!shmem->evicted) > + return 0; Shouldn't we have a drm_gem_shmem_shrinker_update_lru_locked() even if the object wasn't evicted, such that idle->busy transition moves the BO to the list tail? > + > + err = drm_gem_shmem_acquire_pages(shmem); > + if (err) > + return err; > + > + shmem->evicted = false; > + > + drm_gem_shmem_shrinker_update_lru_locked(shmem); > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(drm_gem_shmem_swapin_locked); > +
Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker
On Mon, 29 Jan 2024 09:55:11 +0100 Boris Brezillon wrote: > On Mon, 29 Jan 2024 09:16:04 +0300 > Dmitry Osipenko wrote: > > > On 1/26/24 21:12, Boris Brezillon wrote: > > > On Fri, 26 Jan 2024 19:27:49 +0300 > > > Dmitry Osipenko wrote: > > > > > >> On 1/26/24 12:55, Boris Brezillon wrote: > > >>> On Fri, 26 Jan 2024 00:56:47 +0300 > > >>> Dmitry Osipenko wrote: > > >>> > > On 1/25/24 13:19, Boris Brezillon wrote: > > > On Fri, 5 Jan 2024 21:46:16 +0300 > > > Dmitry Osipenko wrote: > > > > > >> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object > > >> *shmem) > > >> +{ > > >> +return (shmem->madv >= 0) && shmem->base.funcs->evict && > > >> +refcount_read(&shmem->pages_use_count) && > > >> +!refcount_read(&shmem->pages_pin_count) && > > >> +!shmem->base.dma_buf && !shmem->base.import_attach && > > >> +!shmem->evicted; > > > > > > Are we missing > > > > > > && dma_resv_test_signaled(shmem->base.resv, > > > DMA_RESV_USAGE_BOOKKEEP) > > > > > > to make sure the GPU is done using the BO? > > > The same applies to drm_gem_shmem_is_purgeable() BTW. > > > > > > If you don't want to do this test here, we need a way to let drivers > > > provide a custom is_{evictable,purgeable}() test. > > > > > > I guess we should also expose > > > drm_gem_shmem_shrinker_update_lru_locked() > > > to let drivers move the GEMs that were used most recently (those > > > referenced by a GPU job) at the end of the evictable LRU. > > > > We have the signaled-check in the common drm_gem_evict() helper: > > > > https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496 > > > > >>> > > >>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of > > >>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific > > >>> > > >>> ->evict() hook (though that means calling dma_resv_test_signaled() > > >>> twice, which is not great, oh well). > > >> > > >> Maybe we should change drm_gem_evict() to use BOOKKEEP. 
The > > >> test_signaled(BOOKKEEP) should be a "stronger" check than > > >> test_signaled(READ)? > > > > > > It is, just wondering if some users have a good reason to want > > > READ here. > > > > > >> > > >>> The problem about the evictable LRU remains though: we need a way to let > > >>> drivers put their BOs at the end of the list when the BO has been used > > >>> by the GPU, don't we? > > >> > > >> If BO is use, then it won't be evicted, while idling BOs will be > > >> evicted. Hence, the used BOs will be naturally moved down the LRU list > > >> each time shrinker is invoked. > > >> > > > > > > That only do the trick if the BOs being used most often are busy when > > > the shrinker kicks in though. Let's take this scenario: > > > > > > > > > BO 1 BO 2 > > > shinker > > > > > > busy > > > idle (first-pos-in-evictable-LRU) > > > > > > busy > > > idle (second-pos-in-evictable-LRU) > > > > > > busy > > > idle > > > > > > busy > > > idle > > > > > > busy > > > idle > > > > > > > > > find a BO to evict > > > > > > pick BO 2 > > > > > > busy (swapin) > > > idle > > > > > > If the LRU had been updated at each busy event, BO 1 should have > > > been picked for eviction. But we evicted the BO that was first > > > recorded idle instead of the one that was least recently > > > recorded busy. > > > > You have to swapin(BO) every time BO goes to busy state, and swapin does > > drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved > > down the LRU list. > > Ah, that's the bit I was missing. It makes sense now. I guess that's > good enough for now, we can sort out the BOOKKEEP vs READ in a > follow-up series. On second look, it seems drm_gem_shmem_swapin_locked() doesn't call drm_gem_shmem_shrinker_update_lru_locked() if the BO was already resident? Is there something else I'm overlooking here? > > Reviewed-by: Boris Brezillon
Re: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans
On Sun, 28 Jan 2024, David Laight wrote: > blk_stack_limits() contains: > t->zoned = max(t->zoned, b->zoned); > These are bool, so it is just a bitwise or. Should be a logical or, really. And || in code. BR, Jani. > However it generates: > error: comparison of constant ‘0’ with boolean expression is always true > [-Werror=bool-compare] > inside the signedness check that max() does unless a '+ 0' is added. > It is a shame the compiler generates this warning for code that will > be optimised away. > > Change so that the extra '+ 0' can be removed. > > Signed-off-by: David Laight > --- > block/blk-settings.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/block/blk-settings.c b/block/blk-settings.c > index 06ea91e51b8b..9ca21fea039d 100644 > --- a/block/blk-settings.c > +++ b/block/blk-settings.c > @@ -688,7 +688,7 @@ int blk_stack_limits(struct queue_limits *t, struct > queue_limits *b, > b->max_secure_erase_sectors); > t->zone_write_granularity = max(t->zone_write_granularity, > b->zone_write_granularity); > - t->zoned = max(t->zoned, b->zoned); > + t->zoned = t->zoned | b->zoned; > return ret; > } > EXPORT_SYMBOL(blk_stack_limits); -- Jani Nikula, Intel
Re: [PATCH RESEND] drm/virtio: set segment size for virtio_gpu device
On 1/23/24 21:14, Sebastian Ott wrote: > drm/virtio: set segment size for virtio_gpu device > > Set the segment size of the virtio_gpu device to the value > used by the drm helpers when allocating sg lists to fix the > following complaint from DMA_API debug code: > DMA-API: virtio-pci :07:00.0: mapping sg segment longer than device > claims to support [len=262144] [max=65536] > > Signed-off-by: Sebastian Ott > --- > drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c > b/drivers/gpu/drm/virtio/virtgpu_drv.c > index 4334c7608408..74b2cb3295af 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_drv.c > +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c > @@ -94,6 +94,7 @@ static int virtio_gpu_probe(struct virtio_device *vdev) > goto err_free; > } > > + dma_set_max_seg_size(dev->dev, dma_max_mapping_size(dev->dev) ? : > UINT_MAX); > ret = virtio_gpu_init(vdev, dev); > if (ret) > goto err_free; Added stable tag and applied to misc-fixes. Please use `git send-email` next time. Thanks -- Best regards, Dmitry
Re: [PATCH] drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
On 26.01.24 19:28, Dave Airlie wrote: > Just FYI this conflictted pretty heavily with drm-misc-next changes in > the same area, someone should check drm-tip has the correct > resolution, I'm not really sure what is definitely should be. > > Dave. Thanks! I took a quick look at what is now in Linus' tree and it looks correct to me. The only thing I'm missing is my Reviewed-by tag which got lost somewhere, but I can get over that.
RE: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans
From: Jani Nikula > Sent: 29 January 2024 09:08 > > On Sun, 28 Jan 2024, David Laight wrote: > > blk_stack_limits() contains: > > t->zoned = max(t->zoned, b->zoned); > > These are bool, so it is just a bitwise or. > > Should be a logical or, really. And || in code. Not really, bitwise is fine for bool (especially for 'or') and generates better code. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
Re: [PATCH v5 0/3] drm/i915: Fix VMA UAF on destroy against deactivate race
Hi Janusz, There seems to be a regression in CI related to this: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_129026v2/bat-dg1-7/igt@gem_lmem_swapping@random-engi...@lmem0.html#dmesg-warnings1053 Please have a look. Regards, Nirmoy On 1/24/2024 6:13 PM, Janusz Krzysztofik wrote: Object debugging tools were sporadically reporting illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle. [161.359441] ODEBUG: free active (active state 0) object: 88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915] [161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0 ... [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1 [161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 [161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915] [161.360592] RIP: 0010:debug_print_object+0x80/0xb0 ... [161.361347] debug_object_free+0xeb/0x110 [161.361362] i915_active_fini+0x14/0x130 [i915] [161.361866] release_references+0xfe/0x1f0 [i915] [161.362543] i915_vma_parked+0x1db/0x380 [i915] [161.363129] __gt_park+0x121/0x230 [i915] [161.363515] intel_wakeref_put_last+0x1f/0x70 [i915] That has been tracked down to be happening when another thread is deactivating the VMA inside __active_retire() helper, after the VMA's active counter has been already decremented to 0, but before deactivation of the VMA's object is reported to the object debugging tool. We could prevent from that race by serializing i915_active_fini() with __active_retire() via ref->tree_lock, but that wouldn't stop the VMA from being used, e.g. from __i915_vma_retire() called at the end of __active_retire(), after that VMA has been already freed by a concurrent i915_vma_destroy() on return from the i915_active_fini(). 
Then, we should rather fix the issue at the VMA level, not in i915_active. Since __i915_vma_parked() is called from __gt_park() on last put of the GT's wakeref, the issue could be addressed by holding the GT wakeref long enough for __active_retire() to complete before that wakeref is released and the GT parked. A VMA associated with a request doesn't acquire a GT wakeref by itself. Instead, it depends on a wakeref held directly by the request's active intel_context for a GT associated with its VM, and indirectly on that intel_context's engine wakeref if the engine belongs to the same GT as the VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. In case of single-GT platforms, at least one of those wakerefs is usually held long enough for the request's VMA to be deactivated on time, before it is destroyed on last put of its VM GT wakeref. However, on multi-GT platforms, a request may use a VMA from a GT other than the one that hosts the request's engine, in which case it is protected only by the intel_context's VM GT wakeref. There was an attempt to fix the issue on 2-GT Meteor Lake by acquiring an extra wakeref for a Primary GT from i915_gem_do_execbuffer() -- see commit f56fe3e91787 ("drm/i915: Fix a VMA UAF for multi-gt platform"). However, that fix proved insufficient -- the issue was still reported by CI. That wakeref was released on exit from i915_gem_do_execbuffer(), potentially before completion of the request and deactivation of its associated VMAs. Moreover, CI reports indicated that single-GT platforms also suffered sporadically from the same race. I believe the issue was introduced by commit d93939730347 ("drm/i915: Remove the vma refcount") which moved a call to i915_active_fini() from a dropped i915_vma_release(), called on last put of the removed VMA kref, to i915_vma_parked() processing path called on last put of a GT wakeref. 
However, its visibility to the object debugging tool was suppressed by a bug in i915_active that was fixed two weeks later with commit e92eb246feb9 ("drm/i915/active: Fix missing debug object activation"). Fix the issue by getting a wakeref for the VMA's GT when activating it, and putting that wakeref only after the VMA is deactivated. However, exclude global GTT from that processing path, otherwise the GPU never goes idle. Since __i915_vma_retire() may be called from atomic contexts, use async variant of wakeref put. Also, to avoid circular locking dependency, take care of acquiring the wakeref before VM mutex when both are needed. Having that fixed, stop explicitly acquiring the extra GT0 wakeref from inside i915_gem_do_execbuffer(), and also drop an extra call to i915_active_wait(), introduced by commit 7a2280e8dcd2 ("drm/i915: Wait for active retire before i915_active_fini()") as another insufficient fix for this UAF race. v5: Replace "tile" with "GT" across commit descriptions (Rodrigo), - reword commit message and description of patch 2 reusing relevant chunks moved there from commit description
Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker
On 1/29/24 12:01, Boris Brezillon wrote: > On Fri, 5 Jan 2024 21:46:16 +0300 > Dmitry Osipenko wrote: > >> +/** >> + * drm_gem_shmem_swapin_locked() - Moves shmem GEM back to memory and >> enables >> + * hardware access to the memory. >> + * @shmem: shmem GEM object >> + * >> + * This function moves shmem GEM back to memory if it was previously evicted >> + * by the memory shrinker. The GEM is ready to use on success. >> + * >> + * Returns: >> + * 0 on success or a negative error code on failure. >> + */ >> +int drm_gem_shmem_swapin_locked(struct drm_gem_shmem_object *shmem) >> +{ >> +int err; >> + >> +dma_resv_assert_held(shmem->base.resv); >> + >> +if (!shmem->evicted) >> +return 0; > > Shouldn't we have a drm_gem_shmem_shrinker_update_lru_locked() even if > the object wasn't evicted, such that idle->busy transition moves the BO > to the list tail? Seems so, good catch. I'll double-check and remove it in the next version. -- Best regards, Dmitry
Re: [RFC PATCH 0/2] drm/amd/display: switch amdgpu_dm_connector to
On Fri, 26 Jan 2024, Mario Limonciello wrote: > On 1/26/2024 10:28, Melissa Wen wrote: >> Hi, >> >> I'm debugging a null-pointer dereference when running >> igt@kms_connector_force_edid and the way I found to solve the bug is to >> stop using the raw edid handler in amdgpu_connector_funcs_force and >> create_eml_sink in favor of managing resources via struct drm_edid helpers >> (Patch 1). The proper solution seems to be to switch amdgpu_dm_connector >> from struct edid to struct drm_edid and avoid the usage of different >> approaches in the driver (Patch 2). However, doing it implies a good >> amount of work and validation, therefore I decided to send this RFC >> first to collect opinions and check if there is any parallel work on >> this side. It's a work in progress. >> >> The null-pointer error triggered by the igt@kms_connector_force_edid test >> was introduced by: >> - e54ed41620f ("drm/amd/display: Remove unwanted drm edid references") >> >> You can check the error trace in the first patch. >> >> This series was tested with kms_hdmi_inject and kms_force_connector. No >> null-pointer error, kms_hdmi_inject is successful and kms_force_connector >> is successful after the second execution - the force-edid subtest >> still fails in the first run (I'm still investigating). >> >> There are also a couple of cast warnings to be addressed - I'm looking >> for the best replacement. >> >> I appreciate any feedback and testing. > > So I'm actually a little bit worried by hardcoding EDID_LENGTH in this > series. > > I have some other patches that I'm posting later on that let you get the > EDID from the _DDC BIOS method too. My observation was that the EDID can be > anywhere up to 512 bytes according to the ACPI spec. > > An earlier version of my patch was using EDID_LENGTH when fetching it > and the EDID checksum failed. > > I'll CC you on the post, we probably want to get your changes and mine > merged together. 
One of the main points of struct drm_edid is that it tracks the allocation size separately. We should simply not trust edid->extensions, because most of the time it originates from outside the kernel. Using drm_edid and immediately drm_edid_raw() falls short. That function should only be used during migration to help. And yeah, it also means EDID parsing should be done in drm_edid.c, and not spread out all over the subsystem. BR, Jani. > >> >> Melissa >> >> Melissa Wen (2): >>drm/amd/display: fix null-pointer dereference on edid reading >>drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid >> >> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 78 ++- >> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 4 +- >> .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 9 ++- >> .../display/amdgpu_dm/amdgpu_dm_mst_types.c | 23 +++--- >> 4 files changed, 60 insertions(+), 54 deletions(-) >> > -- Jani Nikula, Intel
[PATCH RESEND v7 0/7] Add support for XLCDC to sam9x7 SoC family.
This patch series aims to add support for the XLCDC IP of the sam9x7 SoC family to the DRM subsystem. XLCDC IP has additional registers and new configuration bits compared to the existing register set of HLCDC IP. The new compatible string "microchip,sam9x75-xlcdc" is defined for the sam9x75 variant of the sam9x7 SoC family. The is_xlcdc flag under driver data and IP-specific driver ops help to differentiate the XLCDC and existing HLCDC code within the same driver. changes in v7: * LCDC IP driver ops functions are declared static and their declarations are removed. changes in v6: * Fixed cosmetic defects. * Added comments for readability. changes in v5: * return value of regmap_read_poll_timeout is checked in the failure case. * HLCDC and XLCDC specific driver functions are now invoked using their IP-specific driver ops w/o the need of checking the is_xlcdc flag. * Removed empty spaces and blank lines. changes in v4: * fixed kernel warnings reported by kernel test robot. changes in v3: * Removed de-referencing the value of the is_xlcdc flag multiple times in a single function. * Removed cpu_relax() call when using regmap_read_poll_timeout. * Updated xfactor and yfactor equations using shift operators * Defined CSC coefficients in an array for code readability. changes in v2: * Changed the driver compatible name from "microchip,sam9x7-xlcdc" to "microchip,sam9x75-xlcdc". * Moved is_xlcdc flag to driver data. * Removed unused macro definitions. * Added Co-developed-by tags * Replaced regmap_read() with regmap_read_poll_timeout() call * Split code into two helpers for code readability. 
Durai Manickam KR (1):
  drm: atmel-hlcdc: Define SAM9X7 SoC XLCDC specific registers

Manikandan Muralidharan (6):
  drm: atmel-hlcdc: add flag and driver ops to differentiate XLCDC and HLCDC IP
  drm: atmel-hlcdc: add LCD controller layer definition for sam9x75
  drm: atmel_hlcdc: Add support for XLCDC in atmel LCD driver
  drm: atmel-hlcdc: add DPI mode support for XLCDC
  drm: atmel-hlcdc: add vertical and horizontal scaling support for XLCDC
  drm: atmel-hlcdc: add support for DSI output formats

 .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c | 175 +++--
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 105 +
 drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 86
 .../gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 369 +++---
 include/linux/mfd/atmel-hlcdc.h | 10 +
 5 files changed, 647 insertions(+), 98 deletions(-)
--
2.25.1
[PATCH RESEND v7 1/7] drm: atmel-hlcdc: add flag and driver ops to differentiate XLCDC and HLCDC IP
Add an is_xlcdc flag and LCD IP specific ops in driver data to differentiate XLCDC and HLCDC code within the atmel-hlcdc driver files. Signed-off-by: Manikandan Muralidharan --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 37 1 file changed, 37 insertions(+) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h index 5b5c774e0edf..d5e01ff8c7f9 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h @@ -177,6 +177,9 @@ struct atmel_hlcdc_layer_cfg_layout { int csc; }; +struct atmel_hlcdc_plane_state; +struct atmel_hlcdc_dc; + /** * Atmel HLCDC DMA descriptor structure * @@ -288,6 +291,36 @@ atmel_hlcdc_layer_to_plane(struct atmel_hlcdc_layer *layer) return container_of(layer, struct atmel_hlcdc_plane, layer); } +/** + * struct atmel_lcdc_dc_ops - describes atmel_lcdc ops group + * to differentiate HLCDC and XLCDC IP code support. + * @plane_setup_scaler: update the vertical and horizontal scaling factors + * @update_lcdc_buffers: update each LCDC layer's DMA registers. + * @lcdc_atomic_disable: disable LCDC interrupts and layers + * @lcdc_update_general_settings: update each LCDC layer's general + * configuration register. + * @lcdc_atomic_update: enable the LCDC layers and interrupts. + * @lcdc_csc_init: update the color space conversion coefficients of + * the High-end overlay register. + * @lcdc_irq_dbg: raise an alert in case of interrupt overrun in any LCDC layer.
+ */ +struct atmel_lcdc_dc_ops { + void (*plane_setup_scaler)(struct atmel_hlcdc_plane *plane, + struct atmel_hlcdc_plane_state *state); + void (*update_lcdc_buffers)(struct atmel_hlcdc_plane *plane, + struct atmel_hlcdc_plane_state *state, + u32 sr, int i); + void (*lcdc_atomic_disable)(struct atmel_hlcdc_plane *plane); + void (*lcdc_update_general_settings)(struct atmel_hlcdc_plane *plane, +struct atmel_hlcdc_plane_state *state); + void (*lcdc_atomic_update)(struct atmel_hlcdc_plane *plane, + struct atmel_hlcdc_dc *dc); + void (*lcdc_csc_init)(struct atmel_hlcdc_plane *plane, + const struct atmel_hlcdc_layer_desc *desc); + void (*lcdc_irq_dbg)(struct atmel_hlcdc_plane *plane, +const struct atmel_hlcdc_layer_desc *desc); +}; + /** * Atmel HLCDC Display Controller description structure. * @@ -304,8 +337,10 @@ atmel_hlcdc_layer_to_plane(struct atmel_hlcdc_layer *layer) * @conflicting_output_formats: true if RGBXXX output formats conflict with * each other. * @fixed_clksrc: true if clock source is fixed + * @is_xlcdc: true if XLCDC IP is supported * @layers: a layer description table describing available layers * @nlayers: layer description table size + * @ops: atmel lcdc dc ops */ struct atmel_hlcdc_dc_desc { int min_width; @@ -317,8 +352,10 @@ struct atmel_hlcdc_dc_desc { int max_hpw; bool conflicting_output_formats; bool fixed_clksrc; + bool is_xlcdc; const struct atmel_hlcdc_layer_desc *layers; int nlayers; + const struct atmel_lcdc_dc_ops *ops; }; /** -- 2.25.1
[PATCH RESEND v7 2/7] drm: atmel-hlcdc: add LCD controller layer definition for sam9x75
Add the LCD controller layer definition and descriptor structure for sam9x75 for the following layers: - Base Layer - Overlay1 Layer - Overlay2 Layer - High End Overlay Signed-off-by: Manikandan Muralidharan --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 97 1 file changed, 97 insertions(+) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c index fa0f9a93d50d..d30aec174aa2 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c @@ -462,6 +462,99 @@ static const struct atmel_hlcdc_dc_desc atmel_hlcdc_dc_sam9x60 = { .layers = atmel_hlcdc_sam9x60_layers, }; +static const struct atmel_hlcdc_layer_desc atmel_xlcdc_sam9x75_layers[] = { + { + .name = "base", + .formats = &atmel_hlcdc_plane_rgb_formats, + .regs_offset = 0x60, + .id = 0, + .type = ATMEL_HLCDC_BASE_LAYER, + .cfgs_offset = 0x1c, + .layout = { + .xstride = { 2 }, + .default_color = 3, + .general_config = 4, + .disc_pos = 5, + .disc_size = 6, + }, + .clut_offset = 0x700, + }, + { + .name = "overlay1", + .formats = &atmel_hlcdc_plane_rgb_formats, + .regs_offset = 0x160, + .id = 1, + .type = ATMEL_HLCDC_OVERLAY_LAYER, + .cfgs_offset = 0x1c, + .layout = { + .pos = 2, + .size = 3, + .xstride = { 4 }, + .pstride = { 5 }, + .default_color = 6, + .chroma_key = 7, + .chroma_key_mask = 8, + .general_config = 9, + }, + .clut_offset = 0xb00, + }, + { + .name = "overlay2", + .formats = &atmel_hlcdc_plane_rgb_formats, + .regs_offset = 0x260, + .id = 2, + .type = ATMEL_HLCDC_OVERLAY_LAYER, + .cfgs_offset = 0x1c, + .layout = { + .pos = 2, + .size = 3, + .xstride = { 4 }, + .pstride = { 5 }, + .default_color = 6, + .chroma_key = 7, + .chroma_key_mask = 8, + .general_config = 9, + }, + .clut_offset = 0xf00, + }, + { + .name = "high-end-overlay", + .formats = &atmel_hlcdc_plane_rgb_and_yuv_formats, + .regs_offset = 0x360, + .id = 3, + .type = ATMEL_HLCDC_OVERLAY_LAYER, + .cfgs_offset = 0x30, + .layout = { + .pos = 2, + 
.size = 3, + .memsize = 4, + .xstride = { 5, 7 }, + .pstride = { 6, 8 }, + .default_color = 9, + .chroma_key = 10, + .chroma_key_mask = 11, + .general_config = 12, + .csc = 16, + .scaler_config = 23, + }, + .clut_offset = 0x1300, + }, +}; + +static const struct atmel_hlcdc_dc_desc atmel_xlcdc_dc_sam9x75 = { + .min_width = 0, + .min_height = 0, + .max_width = 2048, + .max_height = 2048, + .max_spw = 0xff, + .max_vpw = 0xff, + .max_hpw = 0x3ff, + .fixed_clksrc = true, + .is_xlcdc = true, + .nlayers = ARRAY_SIZE(atmel_xlcdc_sam9x75_layers), + .layers = atmel_xlcdc_sam9x75_layers, +}; + static const struct of_device_id atmel_hlcdc_of_match[] = { { .compatible = "atmel,at91sam9n12-hlcdc", @@ -487,6 +580,10 @@ static const struct of_device_id atmel_hlcdc_of_match[] = { .compatible = "microchip,sam9x60-hlcdc", .data = &atmel_hlcdc_dc_sam9x60, }, + { + .compatible = "microchip,sam9x75-xlcdc", + .data = &atmel_xlcdc_dc_sam9x75, + }, { /* sentinel */ }, }; MODULE_DEVICE_TABLE(of, atmel_hlcdc_of_match); -- 2.25.1
[PATCH RESEND v7 3/7] drm: atmel-hlcdc: Define SAM9X7 SoC XLCDC specific registers
From: Durai Manickam KR The register address of the XLCDC IP used in SAM9X7 SoC family are different from the previous HLCDC. Defining those address space with valid macros. Signed-off-by: Durai Manickam KR [manikanda...@microchip.com: Remove unused macro definitions] Signed-off-by: Manikandan Muralidharan --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 42 include/linux/mfd/atmel-hlcdc.h | 10 + 2 files changed, 52 insertions(+) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h index d5e01ff8c7f9..e6b4b0bb11b9 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h @@ -15,6 +15,7 @@ #include +/* LCD controller common registers */ #define ATMEL_HLCDC_LAYER_CHER 0x0 #define ATMEL_HLCDC_LAYER_CHDR 0x4 #define ATMEL_HLCDC_LAYER_CHSR 0x8 @@ -128,6 +129,47 @@ #define ATMEL_HLCDC_MAX_LAYERS 6 +/* XLCDC controller specific registers */ +#define ATMEL_XLCDC_LAYER_ENR 0x10 +#define ATMEL_XLCDC_LAYER_EN BIT(0) + +#define ATMEL_XLCDC_LAYER_IER 0x0 +#define ATMEL_XLCDC_LAYER_IDR 0x4 +#define ATMEL_XLCDC_LAYER_ISR 0xc +#define ATMEL_XLCDC_LAYER_OVR_IRQ(p) BIT(2 + (8 * (p))) + +#define ATMEL_XLCDC_LAYER_PLANE_ADDR(p)(((p) * 0x4) + 0x18) + +#define ATMEL_XLCDC_LAYER_DMA_CFG 0 + +#define ATMEL_XLCDC_LAYER_DMA BIT(0) +#define ATMEL_XLCDC_LAYER_REP BIT(1) +#define ATMEL_XLCDC_LAYER_DISCEN BIT(4) + +#define ATMEL_XLCDC_LAYER_SFACTC_A0_MULT_AS(4 << 6) +#define ATMEL_XLCDC_LAYER_SFACTA_ONE BIT(9) +#define ATMEL_XLCDC_LAYER_DFACTC_M_A0_MULT_AS (6 << 11) +#define ATMEL_XLCDC_LAYER_DFACTA_ONE BIT(14) + +#define ATMEL_XLCDC_LAYER_A0_SHIFT 16 +#define ATMEL_XLCDC_LAYER_A0(x)\ + ((x) << ATMEL_XLCDC_LAYER_A0_SHIFT) + +#define ATMEL_XLCDC_LAYER_VSCALER_LUMA_ENABLE BIT(0) +#define ATMEL_XLCDC_LAYER_VSCALER_CHROMA_ENABLEBIT(1) +#define ATMEL_XLCDC_LAYER_HSCALER_LUMA_ENABLE BIT(4) +#define ATMEL_XLCDC_LAYER_HSCALER_CHROMA_ENABLEBIT(5) + +#define ATMEL_XLCDC_LAYER_VXSYCFG_ONE BIT(0) 
+#define ATMEL_XLCDC_LAYER_VXSYTAP2_ENABLE BIT(4) +#define ATMEL_XLCDC_LAYER_VXSCCFG_ONE BIT(16) +#define ATMEL_XLCDC_LAYER_VXSCTAP2_ENABLE BIT(20) + +#define ATMEL_XLCDC_LAYER_HXSYCFG_ONE BIT(0) +#define ATMEL_XLCDC_LAYER_HXSYTAP2_ENABLE BIT(4) +#define ATMEL_XLCDC_LAYER_HXSCCFG_ONE BIT(16) +#define ATMEL_XLCDC_LAYER_HXSCTAP2_ENABLE BIT(20) + /** * Atmel HLCDC Layer registers layout structure * diff --git a/include/linux/mfd/atmel-hlcdc.h b/include/linux/mfd/atmel-hlcdc.h index a186119a49b5..80d675a03b39 100644 --- a/include/linux/mfd/atmel-hlcdc.h +++ b/include/linux/mfd/atmel-hlcdc.h @@ -22,6 +22,8 @@ #define ATMEL_HLCDC_DITHER BIT(6) #define ATMEL_HLCDC_DISPDLYBIT(7) #define ATMEL_HLCDC_MODE_MASK GENMASK(9, 8) +#define ATMEL_XLCDC_MODE_MASK GENMASK(10, 8) +#define ATMEL_XLCDC_DPIBIT(11) #define ATMEL_HLCDC_PP BIT(10) #define ATMEL_HLCDC_VSPSU BIT(12) #define ATMEL_HLCDC_VSPHO BIT(13) @@ -34,6 +36,12 @@ #define ATMEL_HLCDC_IDR0x30 #define ATMEL_HLCDC_IMR0x34 #define ATMEL_HLCDC_ISR0x38 +#define ATMEL_XLCDC_ATTRE 0x3c + +#define ATMEL_XLCDC_BASE_UPDATEBIT(0) +#define ATMEL_XLCDC_OVR1_UPDATEBIT(1) +#define ATMEL_XLCDC_OVR3_UPDATEBIT(2) +#define ATMEL_XLCDC_HEO_UPDATE BIT(3) #define ATMEL_HLCDC_CLKPOL BIT(0) #define ATMEL_HLCDC_CLKSEL BIT(2) @@ -48,6 +56,8 @@ #define ATMEL_HLCDC_DISP BIT(2) #define ATMEL_HLCDC_PWMBIT(3) #define ATMEL_HLCDC_SIPBIT(4) +#define ATMEL_XLCDC_SD BIT(5) +#define ATMEL_XLCDC_CM BIT(6) #define ATMEL_HLCDC_SOFBIT(0) #define ATMEL_HLCDC_SYNCDISBIT(1) -- 2.25.1
[PATCH RESEND v7 4/7] drm: atmel_hlcdc: Add support for XLCDC in atmel LCD driver
XLCDC in SAM9X7 has different sets of registers and additional configuration bits when compared to previous HLCDC IP. Read/write operation on the controller registers is now separated using the XLCDC status flag and with HLCDC and XLCDC IP specific ops. HEO scaling, window resampling, Alpha blending, YUV-to-RGB conversion in XLCDC is derived and handled using additional configuration bits and registers. Writing one to the Enable fields of each layer in LCD_ATTRE is required to reflect the values set in Configuration, FBA, Enable registers of each layer. Signed-off-by: Manikandan Muralidharan Co-developed-by: Hari Prasath Gujulan Elango Signed-off-by: Hari Prasath Gujulan Elango Co-developed-by: Durai Manickam KR Signed-off-by: Durai Manickam KR --- .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c| 33 +- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 6 + drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 3 + .../gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 349 +++--- 4 files changed, 329 insertions(+), 62 deletions(-) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c index cc5cf4c2faf7..1ac31c0c474a 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c @@ -79,6 +79,7 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct drm_crtc *c) unsigned int mask = ATMEL_HLCDC_CLKDIV_MASK | ATMEL_HLCDC_CLKPOL; unsigned int cfg = 0; int div, ret; + bool is_xlcdc = crtc->dc->desc->is_xlcdc; /* get encoder from crtc */ drm_for_each_encoder(en_iter, ddev) { @@ -164,10 +165,10 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct drm_crtc *c) state = drm_crtc_state_to_atmel_hlcdc_crtc_state(c->state); cfg = state->output_mode << 8; - if (adj->flags & DRM_MODE_FLAG_NVSYNC) + if (!is_xlcdc && (adj->flags & DRM_MODE_FLAG_NVSYNC)) cfg |= ATMEL_HLCDC_VSPOL; - if (adj->flags & DRM_MODE_FLAG_NHSYNC) + if (!is_xlcdc && (adj->flags & DRM_MODE_FLAG_NHSYNC)) cfg |= ATMEL_HLCDC_HSPOL; 
regmap_update_bits(regmap, ATMEL_HLCDC_CFG(5), @@ -202,6 +203,20 @@ static void atmel_hlcdc_crtc_atomic_disable(struct drm_crtc *c, pm_runtime_get_sync(dev->dev); + if (crtc->dc->desc->is_xlcdc) { + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_CM); + if (regmap_read_poll_timeout(regmap, ATMEL_HLCDC_SR, status, +!(status & ATMEL_XLCDC_CM), +10, 1000)) + dev_warn(dev->dev, "Atmel LCDC status register CMSTS timeout\n"); + + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_SD); + if (regmap_read_poll_timeout(regmap, ATMEL_HLCDC_SR, status, +status & ATMEL_XLCDC_SD, +10, 1000)) + dev_warn(dev->dev, "Atmel LCDC status register SDSTS timeout\n"); + } + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_HLCDC_DISP); while (!regmap_read(regmap, ATMEL_HLCDC_SR, &status) && (status & ATMEL_HLCDC_DISP)) @@ -256,6 +271,20 @@ static void atmel_hlcdc_crtc_atomic_enable(struct drm_crtc *c, !(status & ATMEL_HLCDC_DISP)) cpu_relax(); + if (crtc->dc->desc->is_xlcdc) { + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_CM); + if (regmap_read_poll_timeout(regmap, ATMEL_HLCDC_SR, status, +status & ATMEL_XLCDC_CM, +10, 1000)) + dev_warn(dev->dev, "Atmel LCDC status register CMSTS timeout\n"); + + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_SD); + if (regmap_read_poll_timeout(regmap, ATMEL_HLCDC_SR, status, +!(status & ATMEL_XLCDC_SD), +10, 1000)) + dev_warn(dev->dev, "Atmel LCDC status register SDSTS timeout\n"); + } + pm_runtime_put_sync(dev->dev); } diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c index d30aec174aa2..18a3a95f94be 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c @@ -58,6 +58,7 @@ static const struct atmel_hlcdc_dc_desc atmel_hlcdc_dc_at91sam9n12 = { .conflicting_output_formats = true, .nlayers = ARRAY_SIZE(atmel_hlcdc_at91sam9n12_layers), .layers = atmel_hlcdc_at91sam9n12_layers, + .ops = &atmel_hlcdc_ops, }; static const struct atmel_hlcdc_layer_desc 
atmel_hlcdc_at91sam9x5_layers[] = { @@ -151,6 +152,7 @@ static const struct atmel_hlcdc_dc_desc atmel_hlcdc_dc_at91sam9x5 = {
[PATCH RESEND v7 6/7] drm: atmel-hlcdc: add vertical and horizontal scaling support for XLCDC
Update the LCDC_HEOCFG30 and LCDC_HEOCFG31 registers of the XLCDC IP, which support vertical and horizontal scaling with bilinear and bicubic coefficient taps for the Chroma and Luma components of the pixel. Signed-off-by: Manikandan Muralidharan --- drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 2 ++ drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h | 4 .../gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 20 +++ 3 files changed, 26 insertions(+) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c index 18a3a95f94be..debd4bf3e1b0 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c @@ -541,6 +541,8 @@ static const struct atmel_hlcdc_layer_desc atmel_xlcdc_sam9x75_layers[] = { .general_config = 12, .csc = 16, .scaler_config = 23, + .vxs_config = 30, + .hxs_config = 31, }, .clut_offset = 0x1300, }, diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h index 1ef15f2d536c..216beaf1da0e 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h @@ -198,6 +198,8 @@ * @disc_pos: discard area position register * @disc_size: discard area size register * @csc: color space conversion register + * @vxs_config: vertical scaler filter taps control register + * @hxs_config: horizontal scaler filter taps control register */ struct atmel_hlcdc_layer_cfg_layout { int xstride[ATMEL_HLCDC_LAYER_MAX_PLANES]; @@ -217,6 +219,8 @@ struct atmel_hlcdc_layer_cfg_layout { int disc_pos; int disc_size; int csc; + int vxs_config; + int hxs_config; }; struct atmel_hlcdc_plane_state; diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c index 59ddd743ce92..a527badf865d 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c @@ -966,6 +966,26 @@ static void xlcdc_csc_init(struct
atmel_hlcdc_plane *plane, desc->layout.csc + i, xlcdc_csc_coeffs[i]); } + + if (desc->layout.vxs_config && desc->layout.hxs_config) { + /* +* Updating vxs.config and hxs.config fixes the +* Green Color Issue in SAM9X7 EGT Video Player App +*/ + atmel_hlcdc_layer_write_cfg(&plane->layer, + desc->layout.vxs_config, + ATMEL_XLCDC_LAYER_VXSYCFG_ONE | + ATMEL_XLCDC_LAYER_VXSYTAP2_ENABLE | + ATMEL_XLCDC_LAYER_VXSCCFG_ONE | + ATMEL_XLCDC_LAYER_VXSCTAP2_ENABLE); + + atmel_hlcdc_layer_write_cfg(&plane->layer, + desc->layout.hxs_config, + ATMEL_XLCDC_LAYER_HXSYCFG_ONE | + ATMEL_XLCDC_LAYER_HXSYTAP2_ENABLE | + ATMEL_XLCDC_LAYER_HXSCCFG_ONE | + ATMEL_XLCDC_LAYER_HXSCTAP2_ENABLE); + } } static int atmel_hlcdc_plane_init_properties(struct atmel_hlcdc_plane *plane) -- 2.25.1
[PATCH RESEND v7 5/7] drm: atmel-hlcdc: add DPI mode support for XLCDC
Add support for Display Pixel Interface (DPI) Compatible Mode support in atmel-hlcdc driver for XLCDC IP along with legacy pixel mapping. DPI mode BIT is configured in LCDC_CFG5 register. Signed-off-by: Manikandan Muralidharan [durai.manicka...@microchip.com: update DPI mode bit using is_xlcdc flag] Signed-off-by: Durai Manickam KR --- .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c| 21 +-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c index 1ac31c0c474a..1899be2eb6a3 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c @@ -30,10 +30,12 @@ * * @base: base CRTC state * @output_mode: RGBXXX output mode + * @dpi: output DPI mode */ struct atmel_hlcdc_crtc_state { struct drm_crtc_state base; unsigned int output_mode; + u8 dpi; }; static inline struct atmel_hlcdc_crtc_state * @@ -164,6 +166,8 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct drm_crtc *c) state = drm_crtc_state_to_atmel_hlcdc_crtc_state(c->state); cfg = state->output_mode << 8; + if (is_xlcdc) + cfg |= state->dpi << 11; if (!is_xlcdc && (adj->flags & DRM_MODE_FLAG_NVSYNC)) cfg |= ATMEL_HLCDC_VSPOL; @@ -176,7 +180,9 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct drm_crtc *c) ATMEL_HLCDC_VSPDLYS | ATMEL_HLCDC_VSPDLYE | ATMEL_HLCDC_DISPPOL | ATMEL_HLCDC_DISPDLY | ATMEL_HLCDC_VSPSU | ATMEL_HLCDC_VSPHO | - ATMEL_HLCDC_GUARDTIME_MASK | ATMEL_HLCDC_MODE_MASK, + ATMEL_HLCDC_GUARDTIME_MASK | + (is_xlcdc ? 
ATMEL_XLCDC_MODE_MASK | + ATMEL_XLCDC_DPI : ATMEL_HLCDC_MODE_MASK), cfg); clk_disable_unprepare(crtc->dc->hlcdc->sys_clk); @@ -374,7 +380,15 @@ static int atmel_hlcdc_crtc_select_output_mode(struct drm_crtc_state *state) hstate = drm_crtc_state_to_atmel_hlcdc_crtc_state(state); hstate->output_mode = fls(output_fmts) - 1; - + if (crtc->dc->desc->is_xlcdc) { + /* check if MIPI DPI bit needs to be set */ + if (fls(output_fmts) > 3) { + hstate->output_mode -= 4; + hstate->dpi = 1; + } else { + hstate->dpi = 0; + } + } return 0; } @@ -478,6 +492,7 @@ static struct drm_crtc_state * atmel_hlcdc_crtc_duplicate_state(struct drm_crtc *crtc) { struct atmel_hlcdc_crtc_state *state, *cur; + struct atmel_hlcdc_crtc *c = drm_crtc_to_atmel_hlcdc_crtc(crtc); if (WARN_ON(!crtc->state)) return NULL; @@ -489,6 +504,8 @@ atmel_hlcdc_crtc_duplicate_state(struct drm_crtc *crtc) cur = drm_crtc_state_to_atmel_hlcdc_crtc_state(crtc->state); state->output_mode = cur->output_mode; + if (c->dc->desc->is_xlcdc) + state->dpi = cur->dpi; return &state->base; } -- 2.25.1
[PATCH RESEND v7 7/7] drm: atmel-hlcdc: add support for DSI output formats
Add support for the following DPI mode if the encoder type is DSI as per the XLCDC IP datasheet: - 16BPPCFG1 - 16BPPCFG2 - 16BPPCFG3 - 18BPPCFG1 - 18BPPCFG2 - 24BPP Signed-off-by: Manikandan Muralidharan [durai.manicka...@microchip.com: update output format using is_xlcdc flag] Signed-off-by: Durai Manickam KR --- .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c| 123 +- 1 file changed, 88 insertions(+), 35 deletions(-) diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c index 1899be2eb6a3..6f529769b036 100644 --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c @@ -295,11 +295,18 @@ static void atmel_hlcdc_crtc_atomic_enable(struct drm_crtc *c, } -#define ATMEL_HLCDC_RGB444_OUTPUT BIT(0) -#define ATMEL_HLCDC_RGB565_OUTPUT BIT(1) -#define ATMEL_HLCDC_RGB666_OUTPUT BIT(2) -#define ATMEL_HLCDC_RGB888_OUTPUT BIT(3) -#define ATMEL_HLCDC_OUTPUT_MODE_MASK GENMASK(3, 0) +#define ATMEL_HLCDC_RGB444_OUTPUT BIT(0) +#define ATMEL_HLCDC_RGB565_OUTPUT BIT(1) +#define ATMEL_HLCDC_RGB666_OUTPUT BIT(2) +#define ATMEL_HLCDC_RGB888_OUTPUT BIT(3) +#define ATMEL_HLCDC_DPI_RGB565C1_OUTPUTBIT(4) +#define ATMEL_HLCDC_DPI_RGB565C2_OUTPUTBIT(5) +#define ATMEL_HLCDC_DPI_RGB565C3_OUTPUTBIT(6) +#define ATMEL_HLCDC_DPI_RGB666C1_OUTPUTBIT(7) +#define ATMEL_HLCDC_DPI_RGB666C2_OUTPUTBIT(8) +#define ATMEL_HLCDC_DPI_RGB888_OUTPUT BIT(9) +#define ATMEL_HLCDC_OUTPUT_MODE_MASK GENMASK(3, 0) +#define ATMEL_XLCDC_OUTPUT_MODE_MASK GENMASK(9, 0) static int atmel_hlcdc_connector_output_mode(struct drm_connector_state *state) { @@ -313,53 +320,99 @@ static int atmel_hlcdc_connector_output_mode(struct drm_connector_state *state) if (!encoder) encoder = connector->encoder; - switch (atmel_hlcdc_encoder_get_bus_fmt(encoder)) { - case 0: - break; - case MEDIA_BUS_FMT_RGB444_1X12: - return ATMEL_HLCDC_RGB444_OUTPUT; - case MEDIA_BUS_FMT_RGB565_1X16: - return ATMEL_HLCDC_RGB565_OUTPUT; - case 
MEDIA_BUS_FMT_RGB666_1X18: - return ATMEL_HLCDC_RGB666_OUTPUT; - case MEDIA_BUS_FMT_RGB888_1X24: - return ATMEL_HLCDC_RGB888_OUTPUT; - default: - return -EINVAL; - } - - for (j = 0; j < info->num_bus_formats; j++) { - switch (info->bus_formats[j]) { - case MEDIA_BUS_FMT_RGB444_1X12: - supported_fmts |= ATMEL_HLCDC_RGB444_OUTPUT; + if (encoder->encoder_type == DRM_MODE_ENCODER_DSI) { + /* +* atmel-hlcdc to support DSI formats with DSI video pipeline +* when DRM_MODE_ENCODER_DSI type is set by +* connector driver component. +*/ + switch (atmel_hlcdc_encoder_get_bus_fmt(encoder)) { + case 0: break; case MEDIA_BUS_FMT_RGB565_1X16: - supported_fmts |= ATMEL_HLCDC_RGB565_OUTPUT; - break; + return ATMEL_HLCDC_DPI_RGB565C1_OUTPUT; case MEDIA_BUS_FMT_RGB666_1X18: - supported_fmts |= ATMEL_HLCDC_RGB666_OUTPUT; - break; + return ATMEL_HLCDC_DPI_RGB666C1_OUTPUT; + case MEDIA_BUS_FMT_RGB666_1X24_CPADHI: + return ATMEL_HLCDC_DPI_RGB666C2_OUTPUT; case MEDIA_BUS_FMT_RGB888_1X24: - supported_fmts |= ATMEL_HLCDC_RGB888_OUTPUT; - break; + return ATMEL_HLCDC_DPI_RGB888_OUTPUT; default: + return -EINVAL; + } + + for (j = 0; j < info->num_bus_formats; j++) { + switch (info->bus_formats[j]) { + case MEDIA_BUS_FMT_RGB565_1X16: + supported_fmts |= + ATMEL_HLCDC_DPI_RGB565C1_OUTPUT; + break; + case MEDIA_BUS_FMT_RGB666_1X18: + supported_fmts |= + ATMEL_HLCDC_DPI_RGB666C1_OUTPUT; + break; + case MEDIA_BUS_FMT_RGB666_1X24_CPADHI: + supported_fmts |= + ATMEL_HLCDC_DPI_RGB666C2_OUTPUT; + break; + case MEDIA_BUS_FMT_RGB888_1X24: + supported_fmts |= + ATMEL_HLCDC_DPI_RGB888_OUTPUT; + break; +
Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs
Hi Boris: Thanks for your great work. One thing to please take note of: the commit (arm64: dts: rockchip: rk3588: Add GPU nodes) in [1] seems to remove the "disabled" status of usb_host2_xhci; this may cause a boot issue on some boards that use the combphy2_psu phy for other functions. https://gitlab.freedesktop.org/panfrost/linux/-/commit/bf5b542294894219c2d366f6dd0d32a9dcf17252 At 2024-01-23 00:30:31, "Boris Brezillon" wrote: >Hello, > >This is the 4th version of the kernel driver for Mali CSF-based GPUs. > >A branch based on drm-misc-next and containing all the dependencies >that are not yet available in drm-misc-next is here[1], and another [2] >containing extra patches to have things working on rk3588. The CSF >firmware binary can be found here[3], and should be placed under >/lib/firmware/arm/mali/arch10.8/mali_csffw.bin. > >The mesa MR adding v10 support on top of panthor is available here [4]. > >Steve, I intentionally dropped your R-b on "drm/panthor: Add the heap >logical block" and "drm/panthor: Add the scheduler logical block" >because the tiler-OOM handling changed enough to require a new review >IMHO. > >Regarding the GPL2+MIT relicensing, I collected Clément's R-b for the >devfreq code, but am still lacking Alexey Sheplyakov for some bits in >panthor_gpu.c. The rest of the code is either new, or covered by the >Linaro, Arm and Collabora acks. > >And here is a non-exhaustive changelog, check each commit for a detailed >changelog. > >v4: >- Fix various bugs in the VM logic >- Address comments from Steven, Liviu, Ketil and Chris >- Move tiler OOM handling out of the scheduler interrupt handling path > so we can properly recover when the system runs out of memory, and > panthor is blocked trying to allocate heap chunks >- Rework the heap locking to support concurrent chunk allocation. Not > sure if this is supposed to happen, but we need to be robust against > userspace passing the same heap context to two scheduling groups. 
> Wasn't needed before the tiler_oom rework, because heap allocation > was serialized by the scheduler lock. >- Make kernel BO destruction robust to NULL/ERR pointers > >v3: >- Quite a few changes at the MMU/sched level to fix some > race conditions and deadlocks >- Addition of a sync-only VM_BIND operation (to support > vkQueueSparseBind with zero commands). >- Addition of a VM_GET_STATE ioctl >- Various cosmetic changes (see the commit changelogs for more details) >- Various fixes (see the commit changelogs for more details) > >v2: >- Rename the driver (pancsf -> panthor) >- Split the commit adding the driver to ease review >- Use drm_sched for dependency tracking/job submission >- Add a VM_BIND ioctl >- Add the concept of exclusive VM for BOs that are only ever mapped to a > single VM >- Document the code and uAPI >- Add a DT binding doc > >Regards, > >Boris > >[1]https://gitlab.freedesktop.org/panfrost/linux/-/tree/panthor-v4 >[2]https://gitlab.freedesktop.org/panfrost/linux/-/tree/panthor-v4+rk3588 >[3]https://gitlab.com/firefly-linux/external/libmali/-/raw/firefly/firmware/g610/mali_csffw.bin >[4]https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26358 > >Boris Brezillon (13): > drm/panthor: Add uAPI > drm/panthor: Add GPU register definitions > drm/panthor: Add the device logical block > drm/panthor: Add the GPU logical block > drm/panthor: Add GEM logical block > drm/panthor: Add the devfreq logical block > drm/panthor: Add the MMU/VM logical block > drm/panthor: Add the FW logical block > drm/panthor: Add the heap logical block > drm/panthor: Add the scheduler logical block > drm/panthor: Add the driver frontend block > drm/panthor: Allow driver compilation > drm/panthor: Add an entry to MAINTAINERS > >Liviu Dudau (1): > dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs > > .../bindings/gpu/arm,mali-valhall-csf.yaml| 147 + > Documentation/gpu/driver-uapi.rst |5 + > MAINTAINERS | 11 + > drivers/gpu/drm/Kconfig |2 + > 
drivers/gpu/drm/Makefile |1 + > drivers/gpu/drm/panthor/Kconfig | 23 + > drivers/gpu/drm/panthor/Makefile | 15 + > drivers/gpu/drm/panthor/panthor_devfreq.c | 283 ++ > drivers/gpu/drm/panthor/panthor_devfreq.h | 25 + > drivers/gpu/drm/panthor/panthor_device.c | 544 +++ > drivers/gpu/drm/panthor/panthor_device.h | 393 ++ > drivers/gpu/drm/panthor/panthor_drv.c | 1470 +++ > drivers/gpu/drm/panthor/panthor_fw.c | 1334 +++ > drivers/gpu/drm/panthor/panthor_fw.h | 504 +++ > drivers/gpu/drm/panthor/panthor_gem.c | 228 ++ > drivers/gpu/drm/panthor/panthor_gem.h | 144 + > drivers/gpu/drm/panthor/panthor_gpu.c | 482 +++ > drivers/gpu/drm/panthor/panthor_gpu.h | 52 + > drivers/gpu/drm/panthor/panthor_heap.c| 596 +++ > drivers/gpu/drm/panthor/panthor_heap.h| 39 +
Re: [PATCH 2/2] drm/amd: Fetch the EDID from _DDC if available for eDP
On Fri, 26 Jan 2024, Mario Limonciello wrote: > Some manufacturers have intentionally put an EDID that differs from > the EDID on the internal panel on laptops. > > Attempt to fetch this EDID if it exists and prefer it over the EDID > that is provided by the panel. > > Signed-off-by: Mario Limonciello > --- > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 30 +++ > .../gpu/drm/amd/amdgpu/amdgpu_connectors.c| 5 > .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 - > .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 7 +++-- > 5 files changed, 49 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > index c5f3859fd682..99abe12567a4 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > @@ -1520,6 +1520,7 @@ int amdgpu_acpi_get_mem_info(struct amdgpu_device > *adev, int xcc_id, > > void amdgpu_acpi_get_backlight_caps(struct amdgpu_dm_backlight_caps *caps); > bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev); > +void *amdgpu_acpi_edid(struct amdgpu_device *adev, struct drm_connector > *connector); > void amdgpu_acpi_detect(void); > void amdgpu_acpi_release(void); > #else > @@ -1537,6 +1538,7 @@ static inline int amdgpu_acpi_get_mem_info(struct > amdgpu_device *adev, > } > static inline void amdgpu_acpi_fini(struct amdgpu_device *adev) { } > static inline bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev) > { return false; } > +static inline void *amdgpu_acpi_edid(struct amdgpu_device *adev, struct > drm_connector *connector) { return NULL; } > static inline void amdgpu_acpi_detect(void) { } > static inline void amdgpu_acpi_release(void) { } > static inline bool amdgpu_acpi_is_power_shift_control_supported(void) { > return false; } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > index e550067e5c5d..c106335f1f22 100644 > --- 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c > @@ -1380,6 +1380,36 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device > *adev) > #endif > } > > +/** > + * amdgpu_acpi_edid > + * @adev: amdgpu_device pointer > + * @connector: drm_connector pointer > + * > + * Returns the EDID used for the internal panel if present, NULL otherwise. > + */ > +void * > +amdgpu_acpi_edid(struct amdgpu_device *adev, struct drm_connector *connector) > +{ > + struct drm_device *ddev = adev_to_drm(adev); > + struct acpi_device *acpidev = ACPI_COMPANION(ddev->dev); > + void *edid; > + int r; > + > + if (!acpidev) > + return NULL; > + > + if (connector->connector_type != DRM_MODE_CONNECTOR_eDP) > + return NULL; > + > + r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, &edid); > + if (r < 0) { > + DRM_DEBUG_DRIVER("Failed to get EDID from ACPI: %d\n", r); > + return NULL; > + } > + > + return kmemdup(edid, r, GFP_KERNEL); > +} > + > /* > * amdgpu_acpi_detect - detect ACPI ATIF/ATCS methods > * > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c > index 9caba10315a8..c7e1563a46d3 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c > @@ -278,6 +278,11 @@ static void amdgpu_connector_get_edid(struct > drm_connector *connector) > struct amdgpu_device *adev = drm_to_adev(dev); > struct amdgpu_connector *amdgpu_connector = > to_amdgpu_connector(connector); > > + if (amdgpu_connector->edid) > + return; > + > + /* if the BIOS specifies the EDID via _DDC, prefer this */ > + amdgpu_connector->edid = amdgpu_acpi_edid(adev, connector); Imagine the EDID returned by acpi_video_get_edid() has edid->extensions bigger than 4. Of course it should not, but you have no guarantees, and it originates outside of the kernel. The real fix is to have the function return a struct drm_edid which tracks the allocation size separately. 
Unfortunately, it requires a bunch of changes along the way. We've mostly done it in i915, and I've sent a series to do this in drm/bridge [1]. Bottom line, we should stop using struct edid in drivers. They'll all parse the info differently, and from what I've seen, often wrong. BR, Jani. [1] https://patchwork.freedesktop.org/series/128149/ > if (amdgpu_connector->edid) > return; > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index cd98b3565178..1faa21f542bd 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -6562,17 +6562,23 @@ static void amdgpu_dm_connector_funcs_force(struct > drm_connector *connector) > { >
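Jani's point about the untrusted extension count can be made concrete. The sketch below is a standalone userspace model, not the proposed kernel fix (which is to return a struct drm_edid that tracks the allocation size separately): it clamps the size implied by the EDID header against the number of bytes the firmware actually handed back, so a bogus edid->extensions can never cause an out-of-bounds read.

```c
#include <assert.h>
#include <stddef.h>

#define EDID_BLOCK_SIZE 128

/* Return the usable EDID size in bytes, or 0 if we did not even get a
 * full base block.  Byte 126 of the base block is the extension count;
 * it comes from outside the kernel, so never trust it beyond what the
 * firmware actually returned. */
static size_t edid_usable_size(const unsigned char *edid, size_t returned_len)
{
	size_t claimed;

	if (returned_len < EDID_BLOCK_SIZE)
		return 0;

	claimed = ((size_t)edid[126] + 1) * EDID_BLOCK_SIZE;

	/* Trust the smaller of the two sizes. */
	return claimed <= returned_len ? claimed : returned_len;
}
```

With this shape, a header claiming five extensions while the ACPI call only returned 256 bytes is silently clamped instead of being read past the allocation.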
RE: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans
On Mon, 29 Jan 2024, David Laight wrote:
> From: Jani Nikula
>> Sent: 29 January 2024 09:08
>>
>> On Sun, 28 Jan 2024, David Laight wrote:
>> > blk_stack_limits() contains:
>> >     t->zoned = max(t->zoned, b->zoned);
>> > These are bool, so it is just a bitwise or.
>>
>> Should be a logical or, really. And || in code.
>
> Not really, bitwise is fine for bool (especially for 'or')
> and generates better code.

Logical operations on booleans are more readable for humans than bitwise ones. And semantically correct: with a = b || c you know what happens regardless of the types in question; with a = b | c you have to look up the types to know what's going on.

To me, better code only matters if it's a hotpath. That said, not my area of maintenance, so *shrug*.

BR, Jani.

--
Jani Nikula, Intel
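For C99 bools specifically, all three spellings compute the same value, so the disagreement above is purely about readability and generated code. A standalone sanity check (max_val is a stand-in for the kernel's max(), which additionally does type checking):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's max(); the real macro also checks types. */
#define max_val(a, b) ((a) > (b) ? (a) : (b))

static bool or_bitwise(bool a, bool b) { return a | b; }
static bool or_logical(bool a, bool b) { return a || b; }
static bool or_max(bool a, bool b)     { return max_val(a, b); }
```

All four input combinations agree across the three forms, which is why the choice comes down to style rather than behaviour.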
Re: Making drm_gpuvm work across gpu devices
Am 26.01.24 um 21:13 schrieb Zeng, Oak:

-Original Message-
From: Christian König
Sent: Friday, January 26, 2024 5:10 AM
To: Zeng, Oak; David Airlie
Cc: Ghimiray, Himal Prasad; thomas.hellst...@linux.intel.com; Winiarski, Michal; Felix Kuehling; Welty, Brian; Shah, Ankur N; dri-devel@lists.freedesktop.org; intel...@lists.freedesktop.org; Gupta, saurabhg; Danilo Krummrich; Daniel Vetter; Brost, Matthew; Bommu, Krishnaiah; Vishwanathapura, Niranjana
Subject: Re: Making drm_gpuvm work across gpu devices

Hi Oak,

you can still use SVM, but it should not be a design criterion for the kernel UAPI. In other words, the UAPI should be designed in such a way that the GPU virtual address can be equal to the CPU virtual address of a buffer, but can also be different, to support use cases where this isn't the case.

Terminology:

SVM: any technology which can achieve a shared virtual address space between CPU and devices. The virtual address space can be managed by user space or kernel space. Intel implemented an SVM based on the BO-centric GPU driver (gem-create, vm-bind), where the virtual address space is managed by the UMD.

System allocator: another way to implement SVM. The user just uses malloc'ed memory for GPU submission. The virtual address space is managed by Linux core mm. In practice, we leverage HMM to implement the system allocator. This article describes the details of all those different models: https://developer.nvidia.com/blog/simplifying-gpu-application-development-with-heterogeneous-memory-management/

Our programming model allows a mixed use of the system allocator (even though system allocator is ) and traditional vm_bind (where cpu address can != gpu address). Let me re-post the pseudo code:

 1. Fd0 = open("/dev/dri/render0")
 2. Fd1 = open("/dev/dri/render1")
 3. Fd3 = open("/dev/dri/xe-svm")
 4. Gpu_Vm0 = xe_vm_create(fd0)
 5. Gpu_Vm1 = xe_vm_create(fd1)
 6. Queue0 = xe_exec_queue_create(fd0, gpu_vm0)
 7. Queue1 = xe_exec_queue_create(fd1, gpu_vm1)
 8. ptr = malloc()
 9. bo = xe_bo_create(fd0)
10. Vm_bind(bo, gpu_vm0, va) // va is from UMD; the cpu can access bo with the same or a different va. It is UMD's responsibility that va doesn't conflict with malloc'ed ptrs.
11. Xe_exec(queue0, ptr) // submit a gpu job which uses ptr, on card0
12. Xe_exec(queue1, ptr) // submit a gpu job which uses ptr, on card1
13. Xe_exec(queue0, va) // submit a gpu job which uses va, on card0

In the above code, the va used in vm_bind (line 10, Intel's API to bind an object to a va for GPU access) can be different from the CPU address when the CPU accesses the same object. But whenever the user uses a malloc'ed ptr for GPU submission (lines 11 and 12, the so-called system allocator), it implies that the CPU and GPUs use the same ptr for access. In the above vm_bind, it is the user/UMD's responsibility to guarantee that the vm_bind va doesn't conflict with malloc'ed ptrs; otherwise it is treated as a programming error.

I think this design still meets your design restrictions.

Well, why do you need this "Fd3 = open("/dev/dri/xe-svm")"? As far as I see, fd3 isn't used anywhere. What you can do is bind parts of your process address space to your driver connections (fd1, fd2 etc.) with a vm_bind(), but this should *not* come about by implicitly using some other file descriptor in the process. As far as I can see, this design is exactly what failed so badly with KFD.

Regards,
Christian.

Additionally to what Dave wrote, I can summarize a few things I have learned while working on the AMD GPU drivers in the last decade or so:

1. Userspace requirements are *not* relevant for UAPI or even more general kernel driver design.
2. What should be done is to look at the hardware capabilities and try to expose those in a safe manner to userspace.
3. The userspace requirements are then used to validate the kernel driver and especially the UAPI design, to ensure that nothing was missed.

The consequence of this is that nobody should ever use things like Cuda, Vulkan, OpenCL, OpenGL etc. as an argument to propose a certain UAPI design.
What should be done instead is to say: My hardware works in this and that way -> we want to expose it like this -> because that enables us to implement the high level API in this and that way. Only this gives then a complete picture of how things interact together and allows the kernel community to influence and validate the design. What you described above is mainly bottom up. I know other people do top down, or whole system vertical HW-SW co-design. I don't have strong opinion here. Regards, Oak This doesn't mean that you need to throw away everything, but it gives a clear restriction that designs are not nailed in stone and for example you can't use something like a waterfall model. Going to answer on your other questions separately. Regards, Christian. Am 25.01.24 um 06:25 schrieb Zeng, Oak: Hi Dave, Let me step back. When I wrote " shared virtual address spac
[PATCH v2 0/1] Implement device_attach for virtio gpu
To realize dGPU prime feature for virtio gpu, we are trying to let dGPU import vram object of virtio gpu. As vram objects don't have backing pages and thus can't implement the drm_gem_object_funcs.get_sg_table callback, this removes calling drm_gem_map_dma_buf in virtgpu_gem_map_dma_buf and implement virtgpu specific map/unmap/attach callbacks to support both of shmem objects and vram objects. Changes from v1 to v2: -Reimplement virtgpu_gem_device_attach() -Remove calling drm dma-buf funcs in virtgpu callbacks and reimplement virtgpu specific dma-buf callbacks. Julia Zhang (1): drm/virtio: Implement device_attach drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++--- 1 file changed, 36 insertions(+), 4 deletions(-) -- 2.34.1
Re: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans
On Mon, Jan 29, 2024 at 09:22:40AM +, David Laight wrote: > From: Jani Nikula > > Sent: 29 January 2024 09:08 > > > > On Sun, 28 Jan 2024, David Laight wrote: > > > blk_stack_limits() contains: > > > t->zoned = max(t->zoned, b->zoned); > > > These are bool, so it is just a bitwise or. > > > > Should be a logical or, really. And || in code. > > Not really, bitwise is fine for bool (especially for 'or') > and generates better code. For | vs || the type doesn't make a difference... It makes a difference for AND. "0x1 & 0x10" vs "0x1 && 0x10". regards, dan carpenter
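Dan's example is easy to demonstrate in isolation: with non-bool operands, the two AND spellings disagree whenever the set bits don't overlap, while for OR only the resulting *value* differs (0x11 vs 1), not its truthiness. A standalone check:

```c
#include <assert.h>
#include <stdbool.h>

/* Bitwise AND tests for overlapping bits; logical AND tests that both
 * operands are non-zero.  For bool operands the two coincide; for plain
 * ints they do not. */
static bool and_bitwise(int a, int b) { return a & b; }
static bool and_logical(int a, int b) { return a && b; }
```

This is the case Jani's readability argument is really about: with `||` and `&&` the result is predictable without knowing the operand types.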
Re: Making drm_gpuvm work across gpu devices
Well, Daniel and Dave noted it as well, so I'm just repeating it: your design choices are not an argument to get something upstream. It's the job of the maintainers and, in the end, of Linus to judge whether something is acceptable or not.

As far as I can see, a good part of this idea has been exercised at length with KFD, and it turned out to not be the best approach. So from what I've seen, the design you outlined is extremely unlikely to go upstream.

Regards,
Christian.

Am 27.01.24 um 03:21 schrieb Zeng, Oak:

Regarding the idea of expanding userptr to support migration, we explored this idea a long time ago. It provides similar functions to the system allocator, but its interface is not as convenient as the system allocator's. Besides the shared virtual address space, another benefit of a system allocator is that you can offload a cpu program to the gpu more easily; you don't need to call a driver specific API (such as register_userptr and vm_bind in this case) for memory allocation. We also scoped the implementation. It turned out to be big, and not as beautiful as hmm. That's why we gave up this approach.

From: Christian König
Sent: Friday, January 26, 2024 7:52 AM
To: Thomas Hellström; Daniel Vetter
Cc: Brost, Matthew; Felix Kuehling; Welty, Brian; Ghimiray, Himal Prasad; Zeng, Oak; Gupta, saurabhg; Danilo Krummrich; dri-devel@lists.freedesktop.org; Bommu, Krishnaiah; Dave Airlie; Vishwanathapura, Niranjana; intel...@lists.freedesktop.org
Subject: Re: Making drm_gpuvm work across gpu devices

Am 26.01.24 um 09:21 schrieb Thomas Hellström:

Hi, all

On Thu, 2024-01-25 at 19:32 +0100, Daniel Vetter wrote:
On Wed, Jan 24, 2024 at 09:33:12AM +0100, Christian König wrote:
Am 23.01.24 um 20:37 schrieb Zeng, Oak:

[SNIP] Yes, most APIs are per-device based. One exception I know of is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each kfd_process represents a process across N gpu devices.

Yeah, and that was a big mistake in my opinion.
We should really not do that ever again. Need to say, kfd SVM represent a shared virtual address space across CPU and all GPU devices on the system. This is by the definition of SVM (shared virtual memory). This is very different from our legacy gpu *device* driver which works for only one device (i.e., if you want one device to access another device's memory, you will have to use dma-buf export/import etc). Exactly that thinking is what we have currently found as blocker for a virtualization projects. Having SVM as device independent feature which somehow ties to the process address space turned out to be an extremely bad idea. The background is that this only works for some use cases but not all of them. What's working much better is to just have a mirror functionality which says that a range A..B of the process address space is mapped into a range C..D of the GPU address space. Those ranges can then be used to implement the SVM feature required for higher level APIs and not something you need at the UAPI or even inside the low level kernel memory management. When you talk about migrating memory to a device you also do this on a per device basis and *not* tied to the process address space. If you then get crappy performance because userspace gave contradicting information where to migrate memory then that's a bug in userspace and not something the kernel should try to prevent somehow. [SNIP] I think if you start using the same drm_gpuvm for multiple devices you will sooner or later start to run into the same mess we have seen with KFD, where we moved more and more functionality from the KFD to the DRM render node because we found that a lot of the stuff simply doesn't work correctly with a single object to maintain the state. As I understand it, KFD is designed to work across devices. A single pseudo /dev/kfd device represent all hardware gpu devices. 
That is why during kfd open, many pdd (process device data) is created, each for one hardware device for this process. Yes, I'm perfectly aware of that. And I can only r
Re: [PATCH] drm/etnaviv: fix DMA direction handling for cached read/write buffers
Hi Lucas, On Fri, 26 Jan 2024 at 17:00, Lucas Stach wrote: > The dma sync operation needs to be done with DMA_BIDIRECTIONAL when > the BO is prepared for both read and write operations. With the > current inverted if ladder it would only be synced for DMA_FROM_DEVICE. > > [...] > > static inline enum dma_data_direction etnaviv_op_to_dma_dir(u32 op) > { > - if (op & ETNA_PREP_READ) > + if (op & (ETNA_PREP_READ | ETNA_PREP_WRITE)) > + return DMA_BIDIRECTIONAL; This test will always be true for _either_ read or write. Cheers, Daniel
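Daniel's observation can be checked outside the driver. The flag values and helpers below are a hedged model, not the real etnaviv UAPI: the test as posted fires whenever *either* bit is set, so a read-only prep still returns DMA_BIDIRECTIONAL; requiring both bits restores the intended three-way mapping.

```c
#include <assert.h>

/* Illustrative values; the real flags live in the etnaviv UAPI header. */
#define ETNA_PREP_READ  0x01
#define ETNA_PREP_WRITE 0x02

enum dma_dir { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };

/* The hunk as posted: the condition is true for either bit alone. */
static enum dma_dir dir_as_posted(unsigned int op)
{
	if (op & (ETNA_PREP_READ | ETNA_PREP_WRITE))
		return DMA_BIDIRECTIONAL;
	if (op & ETNA_PREP_WRITE)
		return DMA_TO_DEVICE;
	return DMA_FROM_DEVICE;
}

/* Intended behaviour: bidirectional only when BOTH bits are set. */
static enum dma_dir dir_intended(unsigned int op)
{
	unsigned int both = ETNA_PREP_READ | ETNA_PREP_WRITE;

	if ((op & both) == both)
		return DMA_BIDIRECTIONAL;
	if (op & ETNA_PREP_WRITE)
		return DMA_TO_DEVICE;
	return DMA_FROM_DEVICE;
}
```

The `(op & mask) == mask` comparison is the usual idiom for "all of these bits are set", as opposed to `op & mask` which means "any of these bits is set".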
[PATCH v2 1/1] drm/virtio: Implement device_attach
As vram objects don't have backing pages and thus can't implement the drm_gem_object_funcs.get_sg_table callback, this removes the drm dma-buf callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf() and implements virtgpu specific map/unmap/attach callbacks to support both shmem objects and vram objects. Signed-off-by: Julia Zhang --- drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++--- 1 file changed, 36 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c index 44425f20d91a..b490a5343b06 100644 --- a/drivers/gpu/drm/virtio/virtgpu_prime.c +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach, { struct drm_gem_object *obj = attach->dmabuf->priv; struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + struct sg_table *sgt; + int ret; if (virtio_gpu_is_vram(bo)) return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir); - return drm_gem_map_dma_buf(attach, dir); + sgt = drm_prime_pages_to_sg(obj->dev, + to_drm_gem_shmem_obj(obj)->pages, + obj->size >> PAGE_SHIFT); + if (IS_ERR(sgt)) + return sgt; + + ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC); + if (ret) { + sg_free_table(sgt); + kfree(sgt); + return ERR_PTR(ret); + } + + return sgt; } static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct drm_gem_object *obj = attach->dmabuf->priv; struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + if (!sgt) + return; + if (virtio_gpu_is_vram(bo)) { virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir); - return; + } else { + dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC); + sg_free_table(sgt); + kfree(sgt); } +} + +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf, +struct dma_buf_attachment *attach) +{ + struct drm_gem_object *obj = attach->dmabuf->priv; + 
struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + int ret = 0; + + if (!virtio_gpu_is_vram(bo) && obj->funcs->pin) + ret = obj->funcs->pin(obj); - drm_gem_unmap_dma_buf(attach, sgt, dir); + return ret; } static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops = { @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops = { .vmap = drm_gem_dmabuf_vmap, .vunmap = drm_gem_dmabuf_vunmap, }, - .device_attach = drm_gem_map_attach, + .device_attach = virtgpu_gem_device_attach, .get_uuid = virtgpu_virtio_get_uuid, }; -- 2.34.1
Re: [PATCH] drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
Just FYI, this conflicted pretty heavily with drm-misc-next changes in the same area; someone should check that drm-tip has the correct resolution, I'm not really sure what it definitely should be.

FWIW, this looks rather messy now. The drm-tip doesn't build. There was a new call to samsung_dsim_set_stop_state() introduced in commit b2fe2292624ac (drm: bridge: samsung-dsim: enter display mode in the enable() callback). Also, merge commit 663a907e199b (Merge remote-tracking branch 'drm-misc/drm-misc-next' into drm-tip) is broken because it completely removes samsung_dsim_atomic_disable(). Dunno what's going on there.

I'm just about to look at getting it to compile again and I'm trying to retest it.

-michael
Re: [PATCH 03/19] drm/i915/dp: Add support to notify MST connectors to retry modesets
On Tue, 2024-01-23 at 12:28 +0200, Imre Deak wrote: > On shared (Thunderbolt) links with DP tunnels, the modeset may need > to > be retried on all connectors on the link due to a link BW limitation > arising only after the atomic check phase. To support this add a > helper > function queuing a work to retry the modeset on a given port's > connector > and at the same time any MST connector with streams through the same > port. A follow-up change enabling the DP tunnel Bandwidth Allocation > Mode will take this into use. > > Signed-off-by: Imre Deak > --- > drivers/gpu/drm/i915/display/intel_display.c | 5 +- > drivers/gpu/drm/i915/display/intel_dp.c | 55 > ++- > drivers/gpu/drm/i915/display/intel_dp.h | 8 +++ > .../drm/i915/display/intel_dp_link_training.c | 3 +- > drivers/gpu/drm/i915/display/intel_dp_mst.c | 2 + > 5 files changed, 67 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c > b/drivers/gpu/drm/i915/display/intel_display.c > index a92e959c8ac7b..0caebbb3e2dbb 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -8060,8 +8060,9 @@ void intel_hpd_poll_fini(struct > drm_i915_private *i915) > /* Kill all the work that may have been queued by hpd. 
*/ > drm_connector_list_iter_begin(&i915->drm, &conn_iter); > for_each_intel_connector_iter(connector, &conn_iter) { > - if (connector->modeset_retry_work.func) > - cancel_work_sync(&connector- > >modeset_retry_work); > + if (connector->modeset_retry_work.func && > + cancel_work_sync(&connector->modeset_retry_work)) > + drm_connector_put(&connector->base); > if (connector->hdcp.shim) { > cancel_delayed_work_sync(&connector- > >hdcp.check_work); > cancel_work_sync(&connector->hdcp.prop_work); > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > b/drivers/gpu/drm/i915/display/intel_dp.c > index ab415f41924d7..4e36c2c39888e 100644 > --- a/drivers/gpu/drm/i915/display/intel_dp.c > +++ b/drivers/gpu/drm/i915/display/intel_dp.c > @@ -2837,6 +2837,50 @@ intel_dp_audio_compute_config(struct > intel_encoder *encoder, > intel_dp_is_uhbr(pipe_config) > ; > } > > +void intel_dp_queue_modeset_retry_work(struct intel_connector > *connector) > +{ > + struct drm_i915_private *i915 = to_i915(connector->base.dev); > + > + drm_connector_get(&connector->base); > + if (!queue_work(i915->unordered_wq, &connector- > >modeset_retry_work)) > + drm_connector_put(&connector->base); > +} > + > +void > +intel_dp_queue_modeset_retry_for_link(struct intel_atomic_state > *state, > + struct intel_encoder *encoder, > + const struct intel_crtc_state > *crtc_state, > + const struct > drm_connector_state *conn_state) > +{ > + struct drm_i915_private *i915 = to_i915(crtc_state- > >uapi.crtc->dev); > + struct intel_connector *connector; > + struct intel_digital_connector_state *iter_conn_state; > + struct intel_dp *intel_dp; > + int i; > + > + if (conn_state) { > + connector = to_intel_connector(conn_state- > >connector); > + intel_dp_queue_modeset_retry_work(connector); > + > + return; > + } > + > + if (drm_WARN_ON(&i915->drm, > + !intel_crtc_has_type(crtc_state, > INTEL_OUTPUT_DP_MST))) > + return; > + > + intel_dp = enc_to_intel_dp(encoder); > + > + for_each_new_intel_connector_in_state(state, 
connector, > iter_conn_state, i) { > + (void)iter_conn_state; Checked iter_conn_state->base->crtc documentation: @crtc: CRTC to connect connector to, NULL if disabled. Do we need to check if connector is "disabled" or is it impossible scenario? BR, Jouni Högander > + > + if (connector->mst_port != intel_dp) > + continue; > + > + intel_dp_queue_modeset_retry_work(connector); > + } > +} > + > int > intel_dp_compute_config(struct intel_encoder *encoder, > struct intel_crtc_state *pipe_config, > @@ -6436,6 +6480,14 @@ static void > intel_dp_modeset_retry_work_fn(struct work_struct *work) > mutex_unlock(&connector->dev->mode_config.mutex); > /* Send Hotplug uevent so userspace can reprobe */ > drm_kms_helper_connector_hotplug_event(connector); > + > + drm_connector_put(connector); > +} > + > +void intel_dp_init_modeset_retry_work(struct intel_connector > *connector) > +{ > + INIT_WORK(&connector->modeset_retry_work, > + intel_dp_modeset_retry_work_fn); > } > > bool > @@ -6452,8 +6504,7 @@ intel_dp_init_connector(
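The reference-counting contract in the patch (take a connector reference before queuing; drop it again if queue_work() reports the work was already pending; the work function drops it when it runs) can be modelled outside the kernel. Everything below is a toy single-threaded sketch whose names merely mirror the kernel API, not the i915 code itself:

```c
#include <assert.h>
#include <stdbool.h>

struct conn { int refs; bool pending; };

static void get(struct conn *c) { c->refs++; }
static void put(struct conn *c) { c->refs--; }

/* Like queue_work(): returns false when the work was already pending. */
static bool queue_work_model(struct conn *c)
{
	if (c->pending)
		return false;
	c->pending = true;
	return true;
}

static void queue_modeset_retry(struct conn *c)
{
	get(c);
	if (!queue_work_model(c))
		put(c); /* already queued: that earlier queue holds the ref */
}

static void work_fn(struct conn *c)
{
	c->pending = false;
	put(c); /* drop the reference taken at queue time */
}
```

The invariant the pattern preserves: exactly one extra reference per pending work item, no matter how many times the queue helper is called before the work runs. The cancel path in intel_hpd_poll_fini() follows the same rule, dropping the reference only when cancel_work_sync() actually cancelled a pending work.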
Re: [PATCH] drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
> Also, merge commit 663a907e199b (Merge remote-tracking branch 'drm-misc/drm-misc-next' into drm-tip) is broken because it completely removes samsung_dsim_atomic_disable(). Dunno what's going on there.

Actually, that merge commit looks even worse. It somehow folds the original samsung_dsim_atomic_disable() into samsung_dsim_atomic_enable(). Therefore, the enable op will now clear the DSIM_STATE_VIDOUT_AVAILABLE flag?! It will also never be set. Dunno how to proceed here.

-michael
Re: (subset) [PATCH RFC for upstream 0/4] Add support for ETML1010G3DRA LVDS display on phyBOARD-Pollux
Hi, On Fri, 26 Jan 2024 09:57:22 +0100, Yannic Moog wrote: > This series adds the LVDS panel support in simple-panel, introduces > device tree support for the LVDS panel on the phyBOARD-Pollux and > enables the FSL_LDB config option needed for the driver for the imx8mp > ldb bridge. > Thanks, Applied to https://anongit.freedesktop.org/git/drm/drm-misc.git (drm-misc-next) [1/4] dt-bindings: display: panel-simple: add ETML1010G3DRA https://cgit.freedesktop.org/drm/drm-misc/commit/?id=7a61bbc10a7b2734fbffa4438340b6878cce2e5c [2/4] drm/panel: simple: Add EDT ETML1010G3DRA panel https://cgit.freedesktop.org/drm/drm-misc/commit/?id=aeb262c353354eab81ab0d3242afa70984b7dc34 -- Neil
Re: [PATCH 1/8] video: Add helpers for decoding screen_info
Thomas Zimmermann writes:

Hello Thomas,

> The plain values as stored in struct screen_info need to be decoded
> before being used. Add helpers that decode the type of video output
> and the framebuffer I/O aperture.
>
> Old or non-x86 systems may not set the type of video directly, but
> only indicate the presence by storing 0x01 in orig_video_isVGA. The
> decoding logic in screen_info_video_type() takes this into account.

I always disliked how the orig_video_isVGA variable lost its meaning.

> It then follows similar code in vgacon's vgacon_startup() to detect
> the video type from the given values.
>
> A call to screen_info_resources() returns all known resources of the
> given screen_info. The resources' values have been taken from existing
> code in vgacon and vga16fb. These drivers can later be converted to
> use the new interfaces.
>
> Signed-off-by: Thomas Zimmermann
> ---

Thanks for doing this! It's quite useful to have these helpers, since, as you mention, the screen_info data decoding is complex and the variables used to store the video type and modes are confusing / misleading.

Reviewed-by: Javier Martinez Canillas

I just have a few comments below:

> +static inline bool __screen_info_has_ega_gfx(unsigned int mode)
> +{
> +	switch (mode) {
> +	case 0x0d: /* 320x200-4 */
> +	case 0x0e: /* 640x200-4 */
> +	case 0x0f: /* 640x350-1 */
> +	case 0x10: /* 640x350-4 */

I wonder if it makes sense to define some constant macros for these modes? I know that check_mode_supported() in drivers/video/fbdev/vga16fb.c also uses magic numbers, but I believe it would improve readability.
> +int screen_info_resources(const struct screen_info *si, struct resource *r, > size_t num) > +{ > + struct resource *pos = r; > + unsigned int type = screen_info_video_type(si); > + u64 base, size; > + > + switch (type) { > + case VIDEO_TYPE_MDA: > + if (num > 0) > + resource_init_io_named(pos++, 0x3b0, 12, "mda"); I see that drivers/video/fbdev/i740_reg.h has a #define MDA_BASE 0x3B0. Maybe move to a header in include/video along with the other constants mentioned above ? > + if (num > 1) > + resource_init_io_named(pos++, 0x3bf, 0x01, "mda"); > + if (num > 2) > + resource_init_mem_named(pos++, 0xb, 0x2000, "mda"); Same for these start addresses. I see that are also used by mdacon_startup() in drivers/video/console/mdacon.c, so some constants defined somewhere might be useful for that console driver too. The comment also applies to all the other start addresses, since I see that those are used by other drivers (i810, vgacon, etc). -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
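Javier's suggestion might look like the following. The constant names are hypothetical (nothing like them exists in include/video today); the switch mirrors __screen_info_has_ega_gfx() from the patch, with the magic BIOS mode numbers given names:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical named constants for the BIOS mode numbers; values match
 * the comments in the patch (mode number -> resolution-bpp). */
#define VIDEO_MODE_EGA_320X200_4  0x0d
#define VIDEO_MODE_EGA_640X200_4  0x0e
#define VIDEO_MODE_EGA_640X350_1  0x0f
#define VIDEO_MODE_EGA_640X350_4  0x10
#define VIDEO_MODE_VGA_640X480_4  0x12
#define VIDEO_MODE_VGA_320X200_8  0x13
#define VIDEO_MODE_VESA_800X600_4 0x6a

static bool has_ega_gfx(unsigned int mode)
{
	switch (mode) {
	case VIDEO_MODE_EGA_320X200_4:
	case VIDEO_MODE_EGA_640X200_4:
	case VIDEO_MODE_EGA_640X350_1:
	case VIDEO_MODE_EGA_640X350_4:
		return true;
	default:
		return false;
	}
}
```

Both the helper and vga16fb's check_mode_supported() could then share the same definitions, which is the readability win being suggested.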
[PATCH RESEND v3 1/3] drm/stm: dsi: use new SYSTEM_SLEEP_PM_OPS() macro
Use SYSTEM_SLEEP_PM_OPS() instead of the old SET_SYSTEM_SLEEP_PM_OPS(). This means we don't need __maybe_unused on the functions.

Signed-off-by: Raphael Gallais-Pou
---
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index d5f8c923d7bc..b1aee43d51e9 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -544,7 +544,7 @@ static void dw_mipi_dsi_stm_remove(struct platform_device *pdev)
 	regulator_disable(dsi->vdd_supply);
 }
 
-static int __maybe_unused dw_mipi_dsi_stm_suspend(struct device *dev)
+static int dw_mipi_dsi_stm_suspend(struct device *dev)
 {
 	struct dw_mipi_dsi_stm *dsi = dw_mipi_dsi_stm_plat_data.priv_data;
 
@@ -556,7 +556,7 @@ static int __maybe_unused dw_mipi_dsi_stm_suspend(struct device *dev)
 	return 0;
 }
 
-static int __maybe_unused dw_mipi_dsi_stm_resume(struct device *dev)
+static int dw_mipi_dsi_stm_resume(struct device *dev)
 {
 	struct dw_mipi_dsi_stm *dsi = dw_mipi_dsi_stm_plat_data.priv_data;
 	int ret;
@@ -580,8 +580,8 @@ static int __maybe_unused dw_mipi_dsi_stm_resume(struct device *dev)
 }
 
 static const struct dev_pm_ops dw_mipi_dsi_stm_pm_ops = {
-	SET_SYSTEM_SLEEP_PM_OPS(dw_mipi_dsi_stm_suspend,
-				dw_mipi_dsi_stm_resume)
+	SYSTEM_SLEEP_PM_OPS(dw_mipi_dsi_stm_suspend,
+			    dw_mipi_dsi_stm_resume)
 };
 
 static struct platform_driver dw_mipi_dsi_stm_driver = {
-- 
2.25.1
[PATCH RESEND v3 2/3] drm/stm: dsi: add pm runtime ops
From: Yannick Fertre Update control of clocks and supply thanks to the PM runtime mechanism to avoid kernel crash during a system suspend. Signed-off-by: Yannick Fertre Signed-off-by: Raphael Gallais-Pou --- Changes in v2: - Changed SET_RUNTIME_PM_OPS to RUNTIME_PM_OPS and removed __maybe_unused --- drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c index b1aee43d51e9..82fff9e84345 100644 --- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c +++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include @@ -77,6 +78,7 @@ enum dsi_color { struct dw_mipi_dsi_stm { void __iomem *base; struct clk *pllref_clk; + struct clk *pclk; struct dw_mipi_dsi *dsi; u32 hw_version; int lane_min_kbps; @@ -443,7 +445,6 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct dw_mipi_dsi_stm *dsi; - struct clk *pclk; int ret; dsi = devm_kzalloc(dev, sizeof(*dsi), GFP_KERNEL); @@ -483,21 +484,21 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev) goto err_clk_get; } - pclk = devm_clk_get(dev, "pclk"); - if (IS_ERR(pclk)) { - ret = PTR_ERR(pclk); + dsi->pclk = devm_clk_get(dev, "pclk"); + if (IS_ERR(dsi->pclk)) { + ret = PTR_ERR(dsi->pclk); DRM_ERROR("Unable to get peripheral clock: %d\n", ret); goto err_dsi_probe; } - ret = clk_prepare_enable(pclk); + ret = clk_prepare_enable(dsi->pclk); if (ret) { DRM_ERROR("%s: Failed to enable peripheral clk\n", __func__); goto err_dsi_probe; } dsi->hw_version = dsi_read(dsi, DSI_VERSION) & VERSION; - clk_disable_unprepare(pclk); + clk_disable_unprepare(dsi->pclk); if (dsi->hw_version != HWVER_130 && dsi->hw_version != HWVER_131) { ret = -ENODEV; @@ -551,6 +552,7 @@ static int dw_mipi_dsi_stm_suspend(struct device *dev) DRM_DEBUG_DRIVER("\n"); clk_disable_unprepare(dsi->pllref_clk); + 
clk_disable_unprepare(dsi->pclk); regulator_disable(dsi->vdd_supply); return 0; @@ -569,8 +571,16 @@ static int dw_mipi_dsi_stm_resume(struct device *dev) return ret; } + ret = clk_prepare_enable(dsi->pclk); + if (ret) { + regulator_disable(dsi->vdd_supply); + DRM_ERROR("Failed to enable pclk: %d\n", ret); + return ret; + } + ret = clk_prepare_enable(dsi->pllref_clk); if (ret) { + clk_disable_unprepare(dsi->pclk); regulator_disable(dsi->vdd_supply); DRM_ERROR("Failed to enable pllref_clk: %d\n", ret); return ret; @@ -582,6 +592,8 @@ static int dw_mipi_dsi_stm_resume(struct device *dev) static const struct dev_pm_ops dw_mipi_dsi_stm_pm_ops = { SYSTEM_SLEEP_PM_OPS(dw_mipi_dsi_stm_suspend, dw_mipi_dsi_stm_resume) + RUNTIME_PM_OPS(dw_mipi_dsi_stm_suspend, + dw_mipi_dsi_stm_resume, NULL) }; static struct platform_driver dw_mipi_dsi_stm_driver = { -- 2.25.1
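The resume path above has to unwind in reverse order on failure: if pllref_clk fails to enable, pclk and the regulator must be switched off again. A standalone model of that ordering (counters stand in for the regulator and the two clocks; the goto-label style shown is the common kernel idiom, though the patch unwinds inline):

```c
#include <assert.h>

static int vdd, pclk, pllref;

static int enable_ok(int *r) { (*r)++; return 0; }
static void disable(int *r)  { (*r)--; }

/* Model of the resume path: each failure undoes everything enabled so
 * far, in reverse order. fail_pllref simulates clk_prepare_enable()
 * failing on the last resource. */
static int resume(int fail_pllref)
{
	int ret;

	ret = enable_ok(&vdd);
	if (ret)
		return ret;

	ret = enable_ok(&pclk);
	if (ret)
		goto err_vdd;

	ret = fail_pllref ? -1 : enable_ok(&pllref);
	if (ret)
		goto err_pclk;

	return 0;

err_pclk:
	disable(&pclk);
err_vdd:
	disable(&vdd);
	return ret;
}
```

On success all three counters end up at 1; on a simulated pllref failure everything is back at 0, which is exactly the invariant the error paths in the patch maintain.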
[PATCH RESEND v3 3/3] drm/stm: dsi: expose DSI PHY internal clock
             |\
pll4_p_ck -->| 1 \
             |    |--> dsi_k
ck_dsi_phy ->| 0 /
             |/
           DSISRC

A DSI clock is missing in the clock framework. Looking at the clk_summary, it appears that 'ck_dsi_phy' is not implemented, since the DSI kernel clock is based on the internal DSI PLL. The common clock driver can not directly expose this 'ck_dsi_phy' clock because it does not share any registers with the DSI. Thus it needs to be done directly within the DSI phy driver.

Signed-off-by: Raphael Gallais-Pou
---
Changes in v3:
- Fix smatch warning: .../dw_mipi_dsi-stm.c:719 dw_mipi_dsi_stm_probe() warn: 'dsi->pclk' from clk_prepare_enable() not released on lines: 719.
---
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 247 ++++++++++++++++++++++++---
 1 file changed, 216 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c index 82fff9e84345..b20123854c4a 100644 --- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c +++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c @@ -7,7 +7,9 @@ */ #include +#include #include +#include #include #include #include @@ -77,9 +79,12 @@ enum dsi_color { struct dw_mipi_dsi_stm { void __iomem *base; + struct device *dev; struct clk *pllref_clk; struct clk *pclk; + struct clk_hw txbyte_clk; struct dw_mipi_dsi *dsi; + struct dw_mipi_dsi_plat_data pdata; u32 hw_version; int lane_min_kbps; int lane_max_kbps; @@ -196,29 +201,198 @@ static int dsi_pll_get_params(struct dw_mipi_dsi_stm *dsi, return 0; } -static int dw_mipi_dsi_phy_init(void *priv_data) +#define clk_to_dw_mipi_dsi_stm(clk) \ + container_of(clk, struct dw_mipi_dsi_stm, txbyte_clk) + +static void dw_mipi_dsi_clk_disable(struct clk_hw *clk) { - struct dw_mipi_dsi_stm *dsi = priv_data; + struct dw_mipi_dsi_stm *dsi = clk_to_dw_mipi_dsi_stm(clk); + + DRM_DEBUG_DRIVER("\n"); + + /* Disable the DSI PLL */ + dsi_clear(dsi, DSI_WRPCR, WRPCR_PLLEN); + + /* Disable the regulator */ + dsi_clear(dsi, DSI_WRPCR, WRPCR_REGEN | WRPCR_BGREN); +} + +static int dw_mipi_dsi_clk_enable(struct clk_hw *clk) +{ + 
 	struct dw_mipi_dsi_stm *dsi = clk_to_dw_mipi_dsi_stm(clk);
 	u32 val;
 	int ret;

+	DRM_DEBUG_DRIVER("\n");
+
 	/* Enable the regulator */
 	dsi_set(dsi, DSI_WRPCR, WRPCR_REGEN | WRPCR_BGREN);
-	ret = readl_poll_timeout(dsi->base + DSI_WISR, val, val & WISR_RRS,
-				 SLEEP_US, TIMEOUT_US);
+	ret = readl_poll_timeout_atomic(dsi->base + DSI_WISR, val, val & WISR_RRS,
+					SLEEP_US, TIMEOUT_US);
 	if (ret)
 		DRM_DEBUG_DRIVER("!TIMEOUT! waiting REGU, let's continue\n");

 	/* Enable the DSI PLL & wait for its lock */
 	dsi_set(dsi, DSI_WRPCR, WRPCR_PLLEN);
-	ret = readl_poll_timeout(dsi->base + DSI_WISR, val, val & WISR_PLLLS,
-				 SLEEP_US, TIMEOUT_US);
+	ret = readl_poll_timeout_atomic(dsi->base + DSI_WISR, val, val & WISR_PLLLS,
+					SLEEP_US, TIMEOUT_US);
 	if (ret)
 		DRM_DEBUG_DRIVER("!TIMEOUT! waiting PLL, let's continue\n");

 	return 0;
 }

+static int dw_mipi_dsi_clk_is_enabled(struct clk_hw *hw)
+{
+	struct dw_mipi_dsi_stm *dsi = clk_to_dw_mipi_dsi_stm(hw);
+
+	return dsi_read(dsi, DSI_WRPCR) & WRPCR_PLLEN;
+}
+
+static unsigned long dw_mipi_dsi_clk_recalc_rate(struct clk_hw *hw,
+						 unsigned long parent_rate)
+{
+	struct dw_mipi_dsi_stm *dsi = clk_to_dw_mipi_dsi_stm(hw);
+	unsigned int idf, ndiv, odf, pll_in_khz, pll_out_khz;
+	u32 val;
+
+	DRM_DEBUG_DRIVER("\n");
+
+	pll_in_khz = (unsigned int)(parent_rate / 1000);
+
+	val = dsi_read(dsi, DSI_WRPCR);
+
+	idf = (val & WRPCR_IDF) >> 11;
+	if (!idf)
+		idf = 1;
+	ndiv = (val & WRPCR_NDIV) >> 2;
+	odf = int_pow(2, (val & WRPCR_ODF) >> 16);
+
+	/* Get the adjusted pll out value */
+	pll_out_khz = dsi_pll_get_clkout_khz(pll_in_khz, idf, ndiv, odf);
+
+	return (unsigned long)pll_out_khz * 1000;
+}
+
+static long dw_mipi_dsi_clk_round_rate(struct clk_hw *hw, unsigned long rate,
+				       unsigned long *parent_rate)
+{
+	struct dw_mipi_dsi_stm *dsi = clk_to_dw_mipi_dsi_stm(hw);
+	unsigned int idf, ndiv, odf, pll_in_khz, pll_out_khz;
+	int ret;
+
+	DRM_DEBUG_DRIVER("\n");
+
+	pll_in_khz = (unsigned int)(*parent_rate / 1000);
+
+	/* Compute best pll parameters */
+	idf = 0;
+	ndiv = 0;
+	odf = 0;
+
+	ret = dsi_pll_get_params(dsi, pll_in_khz, rate / 1000,
+				 &idf, &ndiv, &odf);
[PATCH RESEND v3 0/3] Update STM DSI PHY driver
This patch series adds several features to the dw-mipi-dsi PHY driver that are missing or need updating. The first patch updates a PM macro. The second patch adds runtime PM support to the driver. The third patch adds a clock provider generated by the PHY itself. As explained in the commit log of the second patch, a clock declaration is missing; since this clock is the parent of 'dsi_k', it leads to an orphan clock. Most importantly, this patch anticipates future versions of the DSI PHY and its inclusion within the display subsystem and the DRM framework. The last patch fixes a corner case introduced previously: since 'dsi' and 'dsi_k' are gated by the same bit of the same register, both are referenced as the peripheral clock in the device-tree. --- Changes in v3-resend: - Removed last patch as it has been merged https://lore.kernel.org/lkml/bf49f4c9-9e81-4c91-972d-13782d996...@foss.st.com/ Changes in v3: - Fix smatch warning (disable dsi->pclk when clk_register fails) Changes in v2: - Added patch 1/4 to use SYSTEM_SLEEP_PM_OPS instead of the old macro and removed __maybe_unused accordingly - Changed SET_RUNTIME_PM_OPS to RUNTIME_PM_OPS Raphael Gallais-Pou (3): drm/stm: dsi: use new SYSTEM_SLEEP_PM_OPS() macro drm/stm: dsi: expose DSI PHY internal clock Yannick Fertre (1): drm/stm: dsi: add pm runtime ops drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 279 ++ 1 file changed, 238 insertions(+), 41 deletions(-) -- 2.25.1
Re: [PATCH v4 00/14] drm: Add a driver for CSF-based Mali GPUs
On Mon, 29 Jan 2024 17:20:47 +0800 (CST) "Andy Yan" wrote: > Hi Boris: > > Thanks for your great work. > > One thing to take note of: > commit (arm64: dts: rockchip: rk3588: Add GPU nodes) in [1] seems to remove the > "disabled" status > of usb_host2_xhci, which may cause a boot issue on some boards that use > combphy2_psu phy for other functions. Oops, should be fixed in https://gitlab.freedesktop.org/panfrost/linux/-/commits/panthor-next+rk3588 now. Thanks, Boris
Re: [PATCH 03/19] drm/i915/dp: Add support to notify MST connectors to retry modesets
On Mon, Jan 29, 2024 at 12:36:12PM +0200, Hogander, Jouni wrote: > On Tue, 2024-01-23 at 12:28 +0200, Imre Deak wrote: > > [...] > > +void > > +intel_dp_queue_modeset_retry_for_link(struct intel_atomic_state *state, > > + struct intel_encoder *encoder, > > + const struct intel_crtc_state > > *crtc_state, > > + const struct drm_connector_state > > *conn_state) > > +{ > > + struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev); > > + struct intel_connector *connector; > > + struct intel_digital_connector_state *iter_conn_state; > > + struct intel_dp *intel_dp; > > + int i; > > + > > + if (conn_state) { > > + connector = to_intel_connector(conn_state->connector); > > + intel_dp_queue_modeset_retry_work(connector); > > + > > + return; > > + } > > + > > + if (drm_WARN_ON(&i915->drm, > > + !intel_crtc_has_type(crtc_state, > > INTEL_OUTPUT_DP_MST))) > > + return; > > + > > + intel_dp = enc_to_intel_dp(encoder); > > + > > + for_each_new_intel_connector_in_state(state, connector, > > iter_conn_state, i) { > > + (void)iter_conn_state; > > Checked iter_conn_state->base->crtc documentation: > > @crtc: CRTC to connect connector to, NULL if disabled. > > Do we need to check if connector is "disabled" or is it impossible > scenario? Yes, it does show if the connector is disabled and it would make sense to not notify those. However the check for that would be racy, at least during a non-blocking commit, but I think also in general where userspace could be in the middle of enabling this connector. The point of the notification is that userspace re-checks the mode it wants on each MST connector to be enabled, so to prevent that it would miss the re-check on connectors with a pending enabling like above, the notification is simply sent to all the connectors in the MST topology. 
> > BR, > > Jouni Högander > > > > + > > + if (connector->mst_port != intel_dp) > > + continue; > > + > > + intel_dp_queue_modeset_retry_work(connector); > > + } > > +} > > + > > int > > intel_dp_compute_config(struct intel_encoder *encoder, > > struct intel_crtc_state *pipe_config, > > @@ -6436,6 +6480,14 @@ static void > > intel_dp_modeset_retry_work_fn(struct work_struct *work) > > mutex_unlock(&connector->dev->mode_config.mutex); > > /* Send Hotplug uevent so userspace can reprobe */ > > drm_kms_helper_connector_hotplug_event(connector); > > + > > + drm_connector_put(connector); > > +} > > + > > +void intel_dp_init_modeset_retry_work(struct intel_connector > > *connector) > > +{ > > + INIT_WORK(&connector->modeset_retry_work, > > + intel_dp_modeset_retry_work_fn); > > } > > > > bool > > @@ -6452,8 +6504,7 @@ intel_dp_init_connector(struct > > intel_digital_port *dig_port, > > int type; > > > > /* Initialize the work for modeset in case of link train > > failure */ > > - INIT_WORK(&intel_connector->modeset_retry_work, > > - intel_dp_modeset_retry_work_fn); > > + intel_dp_init_modeset_retry_work(intel_connector); > > > > if (drm_WARN(dev, dig_port->max_lanes < 1, > > "Not enough lanes (%d) for DP on > > [ENCODER:%d:%s]\n", > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.h > > b/drivers/gpu/drm/i915/display/intel_dp.h > > index 530cc97bc42f4..105c2086310db 100644 > > --- a/drivers/gpu/drm/i915/display/intel_dp.h > > +++ b/drivers/gpu/drm/i915/display/intel_dp.h > > @@ -23,6 +23,8 @@ struct intel_digital_port; > > struct intel_dp; > > struct intel_encoder; > > > > +struct work_struct; > > + > > struct link_config_limits { > > int min_rate, max_rate; > > int min_lane_count, max_lane_count; > > @@ -43,6 +45,12 @@ void intel_dp_adjust_compliance_config(struct > > intel_dp *intel_dp, > > bool intel_dp_limited_color_range(const struct intel_crtc_state > > *crtc_state, > > const struct drm_connector_state > > *conn_state); > > int intel_dp_min_bpp(enum 
intel_output_format output_format); > > +void intel_dp_init_modeset_retry_work(struct intel_connector > > *connector); > > +void intel_dp_queue_modeset_retry_work(struct intel_connector > > *connector); > > +void intel_dp_queue_modeset_retry_for_link(struct intel_atomic_state > > *state, > > + struct intel_encoder > > *encoder, > > + const struct > > intel_crtc_state *crtc_state, > > + const struct > > drm_connector_state *conn_state); > > bool intel_dp_init_connector(struct intel_digital_port *dig_port, > > st
Re: Re: Re: [PATCH 3/5] drm/ttm: replace busy placement with flags v6
On Fri, 2024-01-26 at 16:22 -0600, Lucas De Marchi wrote: > On Fri, Jan 26, 2024 at 04:16:58PM -0600, Lucas De Marchi wrote: > > On Thu, Jan 18, 2024 at 05:38:16PM +0100, Thomas Hellström wrote: > > > > > > On 1/17/24 13:27, Thomas Hellström wrote: > > > > > > > > On 1/17/24 11:47, Thomas Hellström wrote: > > > > > Hi, Christian > > > > > > > > > > Xe changes look good. Will send the series to xe ci to check > > > > > for > > > > > regressions. > > > > > > > > Hmm, there are some checkpatch warnings about author / SOB > > > > email > > > > mismatch, > > > > > > With those fixed, this patch is > > > > > > Reviewed-by: Thomas Hellström > > > > > > it actually broke drm-tip now that this is merged: > > > > ../drivers/gpu/drm/xe/xe_bo.c:41:10: error: ‘struct ttm_placement’ > > has no member named ‘num_busy_placement’; did you mean > > ‘num_placement’ > > 41 | .num_busy_placement = 1, > > | ^~ > > | num_placement > > ../drivers/gpu/drm/xe/xe_bo.c:41:31: error: excess elements in > > struct initializer [-Werror] > > 41 | .num_busy_placement = 1, > > | ^ > > > > > > Apparently a conflict with another patch that got applied a few > > days > > ago: a201c6ee37d6 ("drm/xe/bo: Evict VRAM to TT rather than to > > system") > > oh, no... apparently that commit is from a long time ago. The > problem > was that drm-misc-next was not yet in sync with drm-next. Thomas, do > you > have a fixup for this to put in rerere? > > Lucas De Marchi I added this as a manual fixup and ran some quick igt tests. Seems to work.
Re: [PATCH v5 0/3] drm/i915: Fix VMA UAF on destroy against deactivate race
Hi Nirmoy, On Monday, 29 January 2024 10:24:07 CET Nirmoy Das wrote: > Hi Janusz, > > There seems to be a regression in CI related to this: > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_129026v2/bat-dg1-7/ igt@gem_lmem_swapping@random-engi...@lmem0.html#dmesg-warnings1053 > > Please have a look. Yes, that's a problem, the series can't be merged in its current shape. However, I'm not sure if that's really a regression, or rather an exposure of another, already existing issue. It looks like a race between two concurrent calls to our __active_retire() used in DMA fence callbacks. I'm going to verify some ideas for a fix on trybot. Thanks, Janusz > > > Regards, > > Nirmoy > > On 1/24/2024 6:13 PM, Janusz Krzysztofik wrote: > > Object debugging tools were sporadically reporting illegal attempts to > > free a still active i915 VMA object when parking a GT believed to be idle. > > > > [161.359441] ODEBUG: free active (active state 0) object: 88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915] > > [161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0 > > ... > > [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1- CI_DRM_13375-g003f860e5577+ #1 > > [161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/ RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 > > [161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915] > > [161.360592] RIP: 0010:debug_print_object+0x80/0xb0 > > ... 
> > [161.361347] debug_object_free+0xeb/0x110 > > [161.361362] i915_active_fini+0x14/0x130 [i915] > > [161.361866] release_references+0xfe/0x1f0 [i915] > > [161.362543] i915_vma_parked+0x1db/0x380 [i915] > > [161.363129] __gt_park+0x121/0x230 [i915] > > [161.363515] intel_wakeref_put_last+0x1f/0x70 [i915] > > > > That has been tracked down to be happening when another thread is > > deactivating the VMA inside __active_retire() helper, after the VMA's > > active counter has been already decremented to 0, but before deactivation > > of the VMA's object is reported to the object debugging tool. > > > > We could prevent from that race by serializing i915_active_fini() with > > __active_retire() via ref->tree_lock, but that wouldn't stop the VMA from > > being used, e.g. from __i915_vma_retire() called at the end of > > __active_retire(), after that VMA has been already freed by a concurrent > > i915_vma_destroy() on return from the i915_active_fini(). Then, we should > > rather fix the issue at the VMA level, not in i915_active. > > > > Since __i915_vma_parked() is called from __gt_park() on last put of the > > GT's wakeref, the issue could be addressed by holding the GT wakeref long > > enough for __active_retire() to complete before that wakeref is released > > and the GT parked. > > > > A VMA associated with a request doesn't acquire a GT wakeref by itself. > > Instead, it depends on a wakeref held directly by the request's active > > intel_context for a GT associated with its VM, and indirectly on that > > intel_context's engine wakeref if the engine belongs to the same GT as the > > VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. > > > > In case of single-GT platforms, at least one of those wakerefs is usually > > held long enough for the request's VMA to be deactivated on time, before > > it is destroyed on last put of its VM GT wakeref. 
However, on multi-GT > > platforms, a request may use a VMA from a GT other than the one that hosts > > the request's engine, then it is protected only with the intel_context's > > VM GT wakeref. > > > > There was an attempt to fix the issue on 2-GT Meteor Lake by acquiring an > > extra wakeref for a Primary GT from i915_gem_do_execbuffer() -- see commit > > f56fe3e91787 ("drm/i915: Fix a VMA UAF for multi-gt platform"). However, > > that fix proved insufficient -- the issue was still reported by CI. > > That wakeref was released on exit from i915_gem_do_execbuffer(), then > > potentially before completion of the request and deactivation of its > > associated VMAs. Moreover, CI reports indicated that single-GT platforms > > also suffered sporadically from the same race. > > > > I believe the issue was introduced by commit d93939730347 ("drm/i915: > > Remove the vma refcount") which moved a call to i915_active_fini() from > > a dropped i915_vma_release(), called on last put of the removed VMA kref, > > to i915_vma_parked() processing path called on last put of a GT wakeref. > > However, its visibility to the object debugging tool was suppressed by a > > bug in i915_active that was fixed two weeks later with commit e92eb246feb9 > > ("drm/i915/active: Fix missing debug object activation"). > > > > Fix the issue by getting a wakeref for the VMA's GT when activating it, > > and putting that wakeref only after the VMA is deactivated. However, > > exclude global GTT from that processing path, otherwise the GP
Re: [PATCH 2/8] video: Provide screen_info_get_pci_dev() to find screen_info's PCI device
Thomas Zimmermann writes: > Add screen_info_get_pci_dev() to find the PCI device of an instance > of screen_info. Does nothing on systems without PCI bus. > > Signed-off-by: Thomas Zimmermann > --- [...] > +struct pci_dev *screen_info_pci_dev(const struct screen_info *si) > +{ > + struct resource res[SCREEN_INFO_MAX_RESOURCES]; > + size_t i, numres; > + int ret; > + > + ret = screen_info_resources(si, res, ARRAY_SIZE(res)); > + if (ret < 0) > + return ERR_PTR(ret); > + numres = ret; > + I would just drop the ret variable and assign the screen_info_resources() return value to numres. I think that makes the code easier to follow. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH v2 10/10] drm/vboxvideo: fix mapping leaks
Hi Philipp, On 1/23/24 10:43, Philipp Stanner wrote: > When the PCI devres API was introduced to this driver, it was wrongly > assumed that initializing the device with pcim_enable_device() instead > of pci_enable_device() will make all PCI functions managed. > > This is wrong and was caused by the quite confusing devres API for PCI > in which some, but not all, functions become managed that way. > > The function pci_iomap_range() is never managed. > > Replace pci_iomap_range() with the actually managed function > pcim_iomap_range(). > > Additionally, add a call to pcim_request_region() to ensure exclusive > access to BAR 0. I'm a bit worried about this last change. There might be issues where the pcim_request_region() fails due to e.g. a conflict with the simplefb / simpledrm code. There is a drm_aperture_remove_conflicting_pci_framebuffers() call done before hw_init() gets called, but still this has been known to cause issues in the past. Can you split out the adding of the pcim_request_region() into a separate patch and *not* mark that separate patch for stable ? Regards, Hans > > CC: # v5.10+ > Fixes: 8558de401b5f ("drm/vboxvideo: use managed pci functions") > Signed-off-by: Philipp Stanner > --- > drivers/gpu/drm/vboxvideo/vbox_main.c | 24 +--- > 1 file changed, 13 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/vboxvideo/vbox_main.c > b/drivers/gpu/drm/vboxvideo/vbox_main.c > index 42c2d8a99509..7f686a0190e6 100644 > --- a/drivers/gpu/drm/vboxvideo/vbox_main.c > +++ b/drivers/gpu/drm/vboxvideo/vbox_main.c > @@ -42,12 +42,11 @@ static int vbox_accel_init(struct vbox_private *vbox) > /* Take a command buffer for each screen from the end of usable VRAM. 
*/ > vbox->available_vram_size -= vbox->num_crtcs * VBVA_MIN_BUFFER_SIZE; > > - vbox->vbva_buffers = pci_iomap_range(pdev, 0, > - vbox->available_vram_size, > - vbox->num_crtcs * > - VBVA_MIN_BUFFER_SIZE); > - if (!vbox->vbva_buffers) > - return -ENOMEM; > + vbox->vbva_buffers = pcim_iomap_range( > + pdev, 0, vbox->available_vram_size, > + vbox->num_crtcs * VBVA_MIN_BUFFER_SIZE); > + if (IS_ERR(vbox->vbva_buffers)) > + return PTR_ERR(vbox->vbva_buffers); > > for (i = 0; i < vbox->num_crtcs; ++i) { > vbva_setup_buffer_context(&vbox->vbva_info[i], > @@ -115,12 +114,15 @@ int vbox_hw_init(struct vbox_private *vbox) > > DRM_INFO("VRAM %08x\n", vbox->full_vram_size); > > + ret = pcim_request_region(pdev, 0, "vboxvideo"); > + if (ret) > + return ret; > + > /* Map guest-heap at end of vram */ > - vbox->guest_heap = > - pci_iomap_range(pdev, 0, GUEST_HEAP_OFFSET(vbox), > - GUEST_HEAP_SIZE); > - if (!vbox->guest_heap) > - return -ENOMEM; > + vbox->guest_heap = pcim_iomap_range(pdev, 0, > + GUEST_HEAP_OFFSET(vbox), GUEST_HEAP_SIZE); > + if (IS_ERR(vbox->guest_heap)) > + return PTR_ERR(vbox->guest_heap); > > /* Create guest-heap mem-pool use 2^4 = 16 byte chunks */ > vbox->guest_pool = devm_gen_pool_create(vbox->ddev.dev, 4, -1,
Re: [PATCH 3/8] firmware/sysfb: Set firmware-framebuffer parent device
Thomas Zimmermann writes: > Set the firmware framebuffer's parent device, which usually is the > graphics hardware's physical device. Integrates the framebuffer in > the Linux device hierarchy and lets Linux handle dependencies among > devices. For example, the graphics hardware won't be suspended while > the firmware device is still active. > > Signed-off-by: Thomas Zimmermann > --- > drivers/firmware/sysfb.c | 11 ++- > drivers/firmware/sysfb_simplefb.c | 5 - > include/linux/sysfb.h | 3 ++- > 3 files changed, 16 insertions(+), 3 deletions(-) > > diff --git a/drivers/firmware/sysfb.c b/drivers/firmware/sysfb.c > index 19706bd2642a..8a42da3f67a9 100644 > --- a/drivers/firmware/sysfb.c > +++ b/drivers/firmware/sysfb.c > @@ -29,6 +29,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -72,6 +73,8 @@ EXPORT_SYMBOL_GPL(sysfb_disable); > static __init int sysfb_init(void) > { > const struct screen_info *si = &screen_info; > + struct device *parent = NULL; > + struct pci_dev *pparent; Maybe pci_parent? It's easier to read than pparent IMO. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 4/8] fbdev/efifb: Remove PM for parent device
Thomas Zimmermann writes: > The EFI device has the correct parent device set. This allows Linux > to handle the power management internally. Hence, remove the manual > PM management for the parent device from efifb. > > Signed-off-by: Thomas Zimmermann > --- Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 5/8] firmware/sysfb: Create firmware device only for enabled PCI devices
Thomas Zimmermann writes: > Test if the firmware framebuffer's parent PCI device, if any, has > been enabled. If not, the firmware framebuffer is most likely not > working. Hence, do not create a device for the firmware framebuffer > on disabled PCI devices. > > So far, efifb tracked the status of the PCI parent device internally > and did not bind if it was disabled. This patch implements the > functionality for all firmware framebuffers. > > Signed-off-by: Thomas Zimmermann > --- [...] > > +static __init bool sysfb_pci_dev_is_enabled(struct pci_dev *pdev) > +{ > +#if defined(CONFIG_PCI) > + /* > + * TODO: Try to integrate this code into the PCI subsystem > + */ > + int ret; > + u16 command; > + > + ret = pci_read_config_word(pdev, PCI_COMMAND, &command); > + if (ret != PCIBIOS_SUCCESSFUL) > + return false; > + if (!(command & PCI_COMMAND_MEMORY)) > + return false; > + return true; > +#else > + // Getting here without PCI support is probably a bug. > + return false; Should we warn before return in this case ? Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 6/8] fbdev/efifb: Do not track parent device status
Thomas Zimmermann writes: > There will be no EFI framebuffer device for disabled parent devices > and thus we never probe efifb in that case. Hence remove the tracking > code from efifb. > > Signed-off-by: Thomas Zimmermann > --- Nice cleanup. Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 7/8] firmware/sysfb: Update screen_info for relocated EFI framebuffers
Thomas Zimmermann writes: > On ARM PCI systems, the PCI hierarchy might be reconfigured during > boot and the firmware framebuffer might move as a result of that. > The values in screen_info will then be invalid. > > Work around this problem by tracking the framebuffer's initial > location before it gets relocated; then fix the screen_info state > between relocation and creating the firmware framebuffer's device. > > This functionality has been lifted from efifb. See the commit message > of commit 55d728a40d36 ("efi/fb: Avoid reconfiguration of BAR that > covers the framebuffer") for more information. > > Signed-off-by: Thomas Zimmermann > --- [...] > #if defined(CONFIG_PCI) Shouldn't this be && !defined(CONFIG_X86)? Or maybe && defined(CONFIG_ARM64), although I don't know if the same also applies to other EFI platforms (e.g. CONFIG_RISCV). Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 7/8] firmware/sysfb: Update screen_info for relocated EFI framebuffers
Javier Martinez Canillas writes: > Thomas Zimmermann writes: > >> On ARM PCI systems, the PCI hierarchy might be reconfigured during >> boot and the firmware framebuffer might move as a result of that. >> The values in screen_info will then be invalid. >> >> Work around this problem by tracking the framebuffer's initial >> location before it get relocated; then fix the screen_info state >> between reloaction and creating the firmware framebuffer's device. >> >> This functionality has been lifted from efifb. See the commit message >> of commit 55d728a40d36 ("efi/fb: Avoid reconfiguration of BAR that >> covers the framebuffer") for more information. >> >> Signed-off-by: Thomas Zimmermann >> --- > > [...] > >> #if defined(CONFIG_PCI) > > Shouldn't this be && !defined(CONFIG_X86) ? Or maybe && > defined(CONFIG_ARM64), although I don't know if the same > also applies to other EFI platforms (e.g: CONFIG_RISCV). > Answering my own question, the !defined(CONFIG_X86) was dropped in the commit dcf8f5ce3165 ("drivers/fbdev/efifb: Allow BAR to be moved instead of claiming it"). The rationale is explained in that commit message: While this is less likely to occur on x86, given that the firmware's PCI resource allocation is more likely to be preserved, this is a worthwhile sanity check to have in place, and so let's remove the preprocessor conditional that makes it !X86 only. So it is OK to just guard with #if defined(CONFIG_PCI). -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [PATCH 8/8] fbdev/efifb: Remove framebuffer relocation tracking
Thomas Zimmermann writes: > If the firmware framebuffer has been relocated, the sysfb code > fixes the screen_info state before it creates the framebuffer's > platform device. Efifb will automatically receive a screen_info > with updated values. Hence remove the tracking from efifb. > > Signed-off-by: Thomas Zimmermann > --- Reviewed-by: Javier Martinez Canillas -- Best regards, Javier Martinez Canillas Core Platforms Red Hat
Re: [Linaro-mm-sig] [PATCH 2/3] udmabuf: Sync buffer mappings for attached devices
On 26.01.24 at 18:24, Andrew Davis wrote: On 1/25/24 2:30 PM, Daniel Vetter wrote: On Tue, Jan 23, 2024 at 04:12:26PM -0600, Andrew Davis wrote: Currently this driver creates a SGT table using the CPU as the target device, then performs the dma_sync operations against that SGT. This is backwards to how DMA-BUFs are supposed to behave. This may have worked for the case where these buffers were given only back to the same CPU that produced them as in the QEMU case. And only then because the original author had the dma_sync operations also backwards, syncing for the "device" on begin_cpu. This was noticed and "fixed" in this patch[0]. That then meant we were sync'ing from the CPU to the CPU using a pseudo-device "miscdevice". Which then caused another issue due to the miscdevice not having a proper DMA mask (and why should it, the CPU is not a DMA device). The fix for that was an even more egregious hack[1] that declares the CPU is coherent with itself and can access its own memory space. Unwind all this and perform the correct action by doing the dma_sync operations for each device currently attached to the backing buffer. [0] commit 1ffe09590121 ("udmabuf: fix dma-buf cpu access") [1] commit 9e9fa6a9198b ("udmabuf: Set the DMA mask for the udmabuf device (v2)") Signed-off-by: Andrew Davis So yeah the above hacks are terrible, but I don't think this is better. What you're doing now is that you're potentially doing the flushing multiple times, so if you have a lot of importers with live mappings this is a performance regression. I'd take lower performing but correct than fast and broken. :) Syncing for CPU/device is about making sure the CPU/device can see the data produced by the other. Some devices might be dma-coherent and syncing for them would be a NOP, but we can't know that here in this driver. Let's say we have two attached devices, one that is cache coherent and one that isn't. 
If we only sync for first attached device then that is converted to a NOP and we never flush like the second device needed. Same is true for devices behind IOMMU or with an L3 cache when syncing in the other direction for CPU. So we have to sync for all attached devices to ensure we get even the lowest common denominator device sync'd. It is up to the DMA-API layer to decide which syncs need to actually do something. If all attached devices are coherent then all syncs will be NOPs and we have no performance penalty. It's probably time to bite the bullet and teach the dma-api about flushing for multiple devices. Or some way we can figure out which is the one device we need to pick which gives us the right amount of flushing. Seems like a constraint solving micro-optimization. The DMA-API layer would have to track which buffers have already been flushed from CPU cache and also track that nothing has been written into those caches since that point, only then could it skip the flush. But that is already the point of the dirty bit in the caches themselves, cleaning already clean cache lines is essentially free in hardware. And so is invalidating lines, it is just flipping a bit. Well to separate the functionality a bit. What the DMA-API should provide is abstracting how the platform does flushing and invalidation of caches and the information which devices uses which caches and what needs to be flushed/invalidated to allow access between devices and the CPU. In other words what's necessary is the following: 1. sync device to cpu 2. sync cpu to device 3. sync device to device 1 and 2 are already present and implemented for years, but 3 is missing together with some of the necessary infrastructure to actually implement this. E.g. we don't know which devices write into which caches etc... On top of this we need the functionality to track who has accessed which piece of data and what DMA-API functions needs to be called to make things work for a specific use case. 
But this is then DMA-buf, I/O layer drivers etc.. and should not belong into the DMA-API. I also strongly think that putting the SWIOTLB bounce buffer functionality into the DMA-API was not the right choice. Regards, Christian. Andrew Cheers, Sima --- drivers/dma-buf/udmabuf.c | 41 +++ 1 file changed, 16 insertions(+), 25 deletions(-) diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c index 3a23f0a7d112a..ab6764322523c 100644 --- a/drivers/dma-buf/udmabuf.c +++ b/drivers/dma-buf/udmabuf.c @@ -26,8 +26,6 @@ MODULE_PARM_DESC(size_limit_mb, "Max size of a dmabuf, in megabytes. Default is struct udmabuf { pgoff_t pagecount; struct page **pages; - struct sg_table *sg; - struct miscdevice *device; struct list_head attachments; struct mutex lock; }; @@ -169,12 +167,8 @@ static void unmap_udmabuf(struct dma_buf_attachment *at, static void release_udmabuf(struct dma_buf *buf) { struct udmabuf *ubuf = buf->priv; - struct de
[PATCH] backlight: ktz8866: Correct the check for of_property_read_u32
of_property_read_u32() returns 0 on success, so the checks must be inverted for the intended branch to be taken.

Fixes: f8449c8f7355 ("backlight: ktz8866: Add support for Kinetic KTZ8866 backlight")
Signed-off-by: Jianhua Lu
---
 drivers/video/backlight/ktz8866.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/backlight/ktz8866.c b/drivers/video/backlight/ktz8866.c
index 9c980f2571ee..014877b5a984 100644
--- a/drivers/video/backlight/ktz8866.c
+++ b/drivers/video/backlight/ktz8866.c
@@ -97,20 +97,20 @@ static void ktz8866_init(struct ktz8866 *ktz)
 {
 	unsigned int val = 0;

-	if (of_property_read_u32(ktz->client->dev.of_node, "current-num-sinks", &val))
+	if (!of_property_read_u32(ktz->client->dev.of_node, "current-num-sinks", &val))
 		ktz8866_write(ktz, BL_EN, BIT(val) - 1);
 	else
 		/* Enable all 6 current sinks if the number of current sinks isn't specified. */
 		ktz8866_write(ktz, BL_EN, BIT(6) - 1);

-	if (of_property_read_u32(ktz->client->dev.of_node, "kinetic,current-ramp-delay-ms", &val)) {
+	if (!of_property_read_u32(ktz->client->dev.of_node, "kinetic,current-ramp-delay-ms", &val)) {
 		if (val <= 128)
 			ktz8866_write(ktz, BL_CFG2, BIT(7) | (ilog2(val) << 3) | PWM_HYST);
 		else
 			ktz8866_write(ktz, BL_CFG2, BIT(7) | ((5 + val / 64) << 3) | PWM_HYST);
 	}

-	if (of_property_read_u32(ktz->client->dev.of_node, "kinetic,led-enable-ramp-delay-ms", &val)) {
+	if (!of_property_read_u32(ktz->client->dev.of_node, "kinetic,led-enable-ramp-delay-ms", &val)) {
 		if (val == 0)
 			ktz8866_write(ktz, BL_DIMMING, 0);
 		else {
-- 
2.43.0
Re: [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure
On 27.01.24 at 17:50, Jonathan Cameron wrote: + iio_buffer_dmabuf_put(attach); + +out_dmabuf_put: + dma_buf_put(dmabuf); As below. Feels like a __free(dma_buf_put) bit of magic would be a nice to have. I'm working on the patches right now, just one quick question. Having a __free(dma_buf_put) requires that dma_buf_put is first "registered" as a freeing function using DEFINE_FREE() in , which has not been done yet. That would mean carrying a dma-buf specific patch in your tree, are you OK with that? Needs an ACK from appropriate maintainer, but otherwise I'm fine doing so. Alternative is to circle back to this later after this code is upstream. Separate patches for that please, the autocleanup feature is so new that I'm not 100% convinced that everything works out smoothly from the start. Regards, Christian. Cheers, -Paul
Re: [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure
Hi Christian, On Monday, 29 January 2024 at 13:52 +0100, Christian König wrote: > On 27.01.24 at 17:50, Jonathan Cameron wrote: > > > > > + iio_buffer_dmabuf_put(attach); > > > > > + > > > > > +out_dmabuf_put: > > > > > + dma_buf_put(dmabuf); > > > > As below. Feels like a __free(dma_buf_put) bit of magic would > > > > be a > > > > nice to have. > > > I'm working on the patches right now, just one quick question. > > > > > > Having a __free(dma_buf_put) requires that dma_buf_put is first > > > "registered" as a freeing function using DEFINE_FREE() in > > > > > buf.h>, which has not been done yet. > > > > > > That would mean carrying a dma-buf specific patch in your tree, > > > are you > > > OK with that? > > Needs an ACK from appropriate maintainer, but otherwise I'm fine > > doing > > so. Alternative is to circle back to this later after this code is > > upstream. > > Separate patches for that please, the autocleanup feature is so new > that > I'm not 100% convinced that everything works out smoothly from the > start. Separate patches is a given, did you mean outside this patchset? Because I can send a separate patchset that introduces scope-based management for dma_fence and dma_buf, but then it won't have users. Cheers, -Paul
Re: [PATCH v2 10/10] drm/vboxvideo: fix mapping leaks
Hi, On Mon, 2024-01-29 at 12:15 +0100, Hans de Goede wrote: > Hi Philipp, > > On 1/23/24 10:43, Philipp Stanner wrote: > > When the PCI devres API was introduced to this driver, it was > > wrongly > > assumed that initializing the device with pcim_enable_device() > > instead > > of pci_enable_device() will make all PCI functions managed. > > > > This is wrong and was caused by the quite confusing devres API for > > PCI > > in which some, but not all, functions become managed that way. > > > > The function pci_iomap_range() is never managed. > > > > Replace pci_iomap_range() with the actually managed function > > pcim_iomap_range(). > > > > Additionally, add a call to pcim_request_region() to ensure > > exclusive > > access to BAR 0. > > I'm a bit worried about this last change. There might be > issues where the pcim_request_region() fails due to > e.g. a conflict with the simplefb / simpledrm code. > > There is a drm_aperture_remove_conflicting_pci_framebuffers() > call done before hw_init() gets called, but still this > has been known to cause issues in the past. > > Can you split out the adding of the pcim_request_region() > into a separate patch and *not* mark that separate patch > for stable ? Yes, that sounds reasonable. I'll split it out and deal with it once I send the other DRM patches from my backlog. Greetings, P. 
> > Regards, > > Hans > > > > > > > > > CC: # v5.10+ > > Fixes: 8558de401b5f ("drm/vboxvideo: use managed pci functions") > > Signed-off-by: Philipp Stanner > > --- > > drivers/gpu/drm/vboxvideo/vbox_main.c | 24 +-- > > - > > 1 file changed, 13 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/gpu/drm/vboxvideo/vbox_main.c > > b/drivers/gpu/drm/vboxvideo/vbox_main.c > > index 42c2d8a99509..7f686a0190e6 100644 > > --- a/drivers/gpu/drm/vboxvideo/vbox_main.c > > +++ b/drivers/gpu/drm/vboxvideo/vbox_main.c > > @@ -42,12 +42,11 @@ static int vbox_accel_init(struct vbox_private > > *vbox) > > /* Take a command buffer for each screen from the end of > > usable VRAM. */ > > vbox->available_vram_size -= vbox->num_crtcs * > > VBVA_MIN_BUFFER_SIZE; > > > > - vbox->vbva_buffers = pci_iomap_range(pdev, 0, > > - vbox- > > >available_vram_size, > > - vbox->num_crtcs * > > - VBVA_MIN_BUFFER_SIZE); > > - if (!vbox->vbva_buffers) > > - return -ENOMEM; > > + vbox->vbva_buffers = pcim_iomap_range( > > + pdev, 0, vbox->available_vram_size, > > + vbox->num_crtcs * VBVA_MIN_BUFFER_SIZE); > > + if (IS_ERR(vbox->vbva_buffers)) > > + return PTR_ERR(vbox->vbva_buffers); > > > > for (i = 0; i < vbox->num_crtcs; ++i) { > > vbva_setup_buffer_context(&vbox->vbva_info[i], > > @@ -115,12 +114,15 @@ int vbox_hw_init(struct vbox_private *vbox) > > > > DRM_INFO("VRAM %08x\n", vbox->full_vram_size); > > > > + ret = pcim_request_region(pdev, 0, "vboxvideo"); > > + if (ret) > > + return ret; > > + > > /* Map guest-heap at end of vram */ > > - vbox->guest_heap = > > - pci_iomap_range(pdev, 0, GUEST_HEAP_OFFSET(vbox), > > - GUEST_HEAP_SIZE); > > - if (!vbox->guest_heap) > > - return -ENOMEM; > > + vbox->guest_heap = pcim_iomap_range(pdev, 0, > > + GUEST_HEAP_OFFSET(vbox), GUEST_HEAP_SIZE); > > + if (IS_ERR(vbox->guest_heap)) > > + return PTR_ERR(vbox->guest_heap); > > > > /* Create guest-heap mem-pool use 2^4 = 16 byte chunks */ > > vbox->guest_pool = devm_gen_pool_create(vbox->ddev.dev, 4, > > 
-1, >
Re: [PATCH v2 1/1] drm/virtio: Implement device_attach
Am 29.01.24 um 11:31 schrieb Julia Zhang: Vram objects don't have backing pages and thus can't implement the drm_gem_object_funcs.get_sg_table callback. This removes the drm dma-buf callbacks in virtgpu_gem_map_dma_buf()/virtgpu_gem_unmap_dma_buf() and implements virtgpu-specific map/unmap/attach callbacks to support both shmem objects and vram objects. Signed-off-by: Julia Zhang I need to find more time to look into the code, but offhand I would say that this is the correct solution. Regards, Christian. --- drivers/gpu/drm/virtio/virtgpu_prime.c | 40 +++--- 1 file changed, 36 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c index 44425f20d91a..b490a5343b06 100644 --- a/drivers/gpu/drm/virtio/virtgpu_prime.c +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c @@ -49,11 +49,26 @@ virtgpu_gem_map_dma_buf(struct dma_buf_attachment *attach, { struct drm_gem_object *obj = attach->dmabuf->priv; struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + struct sg_table *sgt; + int ret; if (virtio_gpu_is_vram(bo)) return virtio_gpu_vram_map_dma_buf(bo, attach->dev, dir); - return drm_gem_map_dma_buf(attach, dir); + sgt = drm_prime_pages_to_sg(obj->dev, + to_drm_gem_shmem_obj(obj)->pages, + obj->size >> PAGE_SHIFT); + if (IS_ERR(sgt)) + return sgt; + + ret = dma_map_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC); + if (ret) { + sg_free_table(sgt); + kfree(sgt); + return ERR_PTR(ret); + } + + return sgt; } static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, @@ -63,12 +78,29 @@ static void virtgpu_gem_unmap_dma_buf(struct dma_buf_attachment *attach, struct drm_gem_object *obj = attach->dmabuf->priv; struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + if (!sgt) + return; + if (virtio_gpu_is_vram(bo)) { virtio_gpu_vram_unmap_dma_buf(attach->dev, sgt, dir); - return; + } else { + dma_unmap_sgtable(attach->dev, sgt, dir, DMA_ATTR_SKIP_CPU_SYNC); + sg_free_table(sgt); + 
kfree(sgt); } +} + +static int virtgpu_gem_device_attach(struct dma_buf *dma_buf, +struct dma_buf_attachment *attach) +{ + struct drm_gem_object *obj = attach->dmabuf->priv; + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj); + int ret = 0; + + if (!virtio_gpu_is_vram(bo) && obj->funcs->pin) + ret = obj->funcs->pin(obj); - drm_gem_unmap_dma_buf(attach, sgt, dir); + return ret; } static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops = { @@ -83,7 +115,7 @@ static const struct virtio_dma_buf_ops virtgpu_dmabuf_ops = { .vmap = drm_gem_dmabuf_vmap, .vunmap = drm_gem_dmabuf_vunmap, }, - .device_attach = drm_gem_map_attach, + .device_attach = virtgpu_gem_device_attach, .get_uuid = virtgpu_virtio_get_uuid, };
Re: [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure
Am 29.01.24 um 14:06 schrieb Paul Cercueil: Hi Christian, Le lundi 29 janvier 2024 à 13:52 +0100, Christian König a écrit : Am 27.01.24 um 17:50 schrieb Jonathan Cameron: + iio_buffer_dmabuf_put(attach); + +out_dmabuf_put: + dma_buf_put(dmabuf); As below. Feels like a __free(dma_buf_put) bit of magic would be a nice to have. I'm working on the patches right now, just one quick question. Having a __free(dma_buf_put) requires that dma_buf_put is first "registered" as a freeing function using DEFINE_FREE() in , which has not been done yet. That would mean carrying a dma-buf specific patch in your tree, are you OK with that? Needs an ACK from appropriate maintainer, but otherwise I'm fine doing so. Alternative is to circle back to this later after this code is upstream. Separate patches for that please, the autocleanup feature is so new that I'm not 100% convinced that everything works out smoothly from the start. Separate patches is a given, did you mean outside this patchset? Because I can send a separate patchset that introduces scope-based management for dma_fence and dma_buf, but then it won't have users. Outside of the patchset, this is essentially brand new stuff. IIRC we have quite a number of dma_fence selftests and sw_sync which is basically code inside the drivers/dma-buf directory only there for testing DMA-buf functionality. Convert those over as well and I'm more than happy to upstream this change. Thanks, Christian. Cheers, -Paul
[PATCH 0/5] drm/msm: Add display support for X1E80100
This patchset adds display support for X1E80100. The support for embedded DisplayPort on this platform will not be enabled using the connector type from driver match data, but through an 'is-edp' property via DT. That subsequent work will be part of a separate patchset. Signed-off-by: Abel Vesa --- Abel Vesa (4): dt-bindings: display/msm: document MDSS on X1E80100 dt-bindings: display/msm: Document the DPU for X1E80100 drm/msm: mdss: Add X1E80100 support drm/msm/dpu: Add X1E80100 support Abhinav Kumar (1): drm/msm/dp: Try looking for link-frequencies into the port@0's endpoint first .../bindings/display/msm/qcom,sm8650-dpu.yaml | 5 +- .../bindings/display/msm/qcom,x1e80100-mdss.yaml | 249 .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h | 449 + drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 + drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 1 + drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c| 1 + drivers/gpu/drm/msm/dp/dp_parser.c | 6 +- drivers/gpu/drm/msm/msm_mdss.c | 10 + 8 files changed, 721 insertions(+), 2 deletions(-) --- base-commit: 6776c8d0924953c6bbd4920d8408f4c1d898af71 change-id: 20231201-x1e80100-display-a46324400baf Best regards, -- Abel Vesa
[PATCH 1/5] dt-bindings: display/msm: document MDSS on X1E80100
Document the MDSS hardware found on the Qualcomm X1E80100 platform. Signed-off-by: Abel Vesa --- .../bindings/display/msm/qcom,x1e80100-mdss.yaml | 249 + 1 file changed, 249 insertions(+) diff --git a/Documentation/devicetree/bindings/display/msm/qcom,x1e80100-mdss.yaml b/Documentation/devicetree/bindings/display/msm/qcom,x1e80100-mdss.yaml new file mode 100644 index ..eaa91f7d61ac --- /dev/null +++ b/Documentation/devicetree/bindings/display/msm/qcom,x1e80100-mdss.yaml @@ -0,0 +1,249 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/msm/qcom,x1e80100-mdss.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Qualcomm X1E80100 Display MDSS + +maintainers: + - Abel Vesa + +description: + X1E80100 MSM Mobile Display Subsystem(MDSS), which encapsulates sub-blocks like + DPU display controller, DP interfaces, etc. + +$ref: /schemas/display/msm/mdss-common.yaml# + +properties: + compatible: +const: qcom,x1e80100-mdss + + clocks: +items: + - description: Display AHB + - description: Display hf AXI + - description: Display core + + iommus: +maxItems: 1 + + interconnects: +maxItems: 3 + + interconnect-names: +maxItems: 3 + +patternProperties: + "^display-controller@[0-9a-f]+$": +type: object +properties: + compatible: +const: qcom,x1e80100-dpu + + "^displayport-controller@[0-9a-f]+$": +type: object +properties: + compatible: +const: qcom,x1e80100-dp + + "^phy@[0-9a-f]+$": +type: object +properties: + compatible: +const: qcom,x1e80100-dp-phy + +required: + - compatible + +unevaluatedProperties: false + +examples: + - | +#include +#include +#include + +display-subsystem@ae0 { +compatible = "qcom,x1e80100-mdss"; +reg = <0x0ae0 0x1000>; +reg-names = "mdss"; + +interconnects = <&mmss_noc MASTER_MDP 0 &gem_noc SLAVE_LLCC 0>, +<&mc_virt MASTER_LLCC 0 &mc_virt SLAVE_EBI1 0>; +<&gem_noc MASTER_APPSS_PROC 0 &config_noc SLAVE_DISPLAY_CFG 0>; +interconnect-names = "mdp0-mem", "mdp1-mem", "cpu-cfg"; 
+ +resets = <&dispcc_core_bcr>; + +power-domains = <&dispcc_gdsc>; + +clocks = <&dispcc DISP_CC_MDSS_AHB_CLK>, + <&gcc GCC_DISP_AHB_CLK>, + <&gcc GCC_DISP_HF_AXI_CLK>, + <&dispcc DISP_CC_MDSS_MDP_CLK>; +clock-names = "iface", "bus", "nrt_bus", "core"; + +interrupts = ; +interrupt-controller; +#interrupt-cells = <1>; + +iommus = <&apps_smmu 0x1c00 0x2>; + +#address-cells = <1>; +#size-cells = <1>; +ranges; + +display-controller@ae01000 { +compatible = "qcom,x1e80100-dpu"; +reg = <0x0ae01000 0x8f000>, + <0x0aeb 0x2008>; +reg-names = "mdp", "vbif"; + +clocks = <&gcc_axi_clk>, + <&dispcc_ahb_clk>, + <&dispcc_mdp_lut_clk>, + <&dispcc_mdp_clk>, + <&dispcc_mdp_vsync_clk>; +clock-names = "nrt_bus", + "iface", + "lut", + "core", + "vsync"; + +assigned-clocks = <&dispcc_mdp_vsync_clk>; +assigned-clock-rates = <1920>; + +operating-points-v2 = <&mdp_opp_table>; +power-domains = <&rpmhpd RPMHPD_MMCX>; + +interrupt-parent = <&mdss>; +interrupts = <0>; + +ports { +#address-cells = <1>; +#size-cells = <0>; + +port@0 { +reg = <0>; +dpu_intf1_out: endpoint { +remote-endpoint = <&dsi0_in>; +}; +}; + +port@1 { +reg = <1>; +dpu_intf2_out: endpoint { +remote-endpoint = <&dsi1_in>; +}; +}; +}; + +mdp_opp_table: opp-table { +compatible = "operating-points-v2"; + +opp-2 { +opp-hz = /bits/ 64 <2>; +required-opps = <&rpmhpd_opp_low_svs>; +}; + +opp-32500 { +opp-hz = /bits/ 64 <32500>; +required-opps = <&rpmhpd_opp_svs>; +}; + +opp-37500 { +opp-hz = /bits/ 64 <37500>; +required-opps = <&rpmhpd_opp_svs_l1>; +}; + +opp-51400 { +opp-hz = /bits/ 64 <51400>; +required-opps = <&rpmhpd_opp_nom>; +}; +
[PATCH 2/5] dt-bindings: display/msm: Document the DPU for X1E80100
Document the DPU for Qualcomm X1E80100 platform in the SM8650 schema, as they are similar. Signed-off-by: Abel Vesa --- Documentation/devicetree/bindings/display/msm/qcom,sm8650-dpu.yaml | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/Documentation/devicetree/bindings/display/msm/qcom,sm8650-dpu.yaml b/Documentation/devicetree/bindings/display/msm/qcom,sm8650-dpu.yaml index a01d15a03317..f84fa6d5e6a2 100644 --- a/Documentation/devicetree/bindings/display/msm/qcom,sm8650-dpu.yaml +++ b/Documentation/devicetree/bindings/display/msm/qcom,sm8650-dpu.yaml @@ -13,7 +13,10 @@ $ref: /schemas/display/msm/dpu-common.yaml# properties: compatible: -const: qcom,sm8650-dpu +items: + - enum: + - qcom,sm8650-dpu + - qcom,x1e80100-dpu reg: items: -- 2.34.1
[PATCH 3/5] drm/msm: mdss: Add X1E80100 support
Add support for MDSS on X1E80100. Signed-off-by: Abel Vesa --- drivers/gpu/drm/msm/msm_mdss.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c index 455b2e3a0cdd..eddf7fdbb60a 100644 --- a/drivers/gpu/drm/msm/msm_mdss.c +++ b/drivers/gpu/drm/msm/msm_mdss.c @@ -564,6 +564,15 @@ static const struct msm_mdss_data sdm670_data = { .highest_bank_bit = 1, }; +static const struct msm_mdss_data x1e80100_data = { + .ubwc_enc_version = UBWC_4_0, + .ubwc_dec_version = UBWC_4_3, + .ubwc_swizzle = 6, + .ubwc_static = 1, + .highest_bank_bit = 2, + .macrotile_mode = 1, +}; + static const struct msm_mdss_data sdm845_data = { .ubwc_enc_version = UBWC_2_0, .ubwc_dec_version = UBWC_2_0, @@ -655,6 +664,7 @@ static const struct of_device_id mdss_dt_match[] = { { .compatible = "qcom,sm8450-mdss", .data = &sm8350_data }, { .compatible = "qcom,sm8550-mdss", .data = &sm8550_data }, { .compatible = "qcom,sm8650-mdss", .data = &sm8550_data}, + { .compatible = "qcom,x1e80100-mdss", .data = &x1e80100_data}, {} }; MODULE_DEVICE_TABLE(of, mdss_dt_match); -- 2.34.1
[PATCH 5/5] drm/msm/dpu: Add X1E80100 support
Add definitions for the display hardware used on the Qualcomm X1E80100 platform. Co-developed-by: Abhinav Kumar Signed-off-by: Abhinav Kumar Signed-off-by: Abel Vesa --- .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h | 449 + drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 + drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 1 + drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c| 1 + 4 files changed, 453 insertions(+) diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h new file mode 100644 index ..d4f1fbfa420a --- /dev/null +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h @@ -0,0 +1,449 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (c) 2023, Linaro Limited + */ + +#ifndef _DPU_9_2_X1E80100_H +#define _DPU_9_2_X1E80100_H + +static const struct dpu_caps x1e80100_dpu_caps = { + .max_mixer_width = DEFAULT_DPU_OUTPUT_LINE_WIDTH, + .max_mixer_blendstages = 0xb, + .has_src_split = true, + .has_dim_layer = true, + .has_idle_pc = true, + .has_3d_merge = true, + .max_linewidth = 5120, + .pixel_ram_size = DEFAULT_PIXEL_RAM_SIZE, +}; + +static const struct dpu_mdp_cfg x1e80100_mdp = { + .name = "top_0", + .base = 0, .len = 0x494, + .features = BIT(DPU_MDP_PERIPH_0_REMOVED), + .clk_ctrls = { + [DPU_CLK_CTRL_REG_DMA] = { .reg_off = 0x2bc, .bit_off = 20 }, + }, +}; + +/* FIXME: get rid of DPU_CTL_SPLIT_DISPLAY in favour of proper ACTIVE_CTL support */ +static const struct dpu_ctl_cfg x1e80100_ctl[] = { + { + .name = "ctl_0", .id = CTL_0, + .base = 0x15000, .len = 0x290, + .features = CTL_SM8550_MASK | BIT(DPU_CTL_SPLIT_DISPLAY), + .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9), + }, { + .name = "ctl_1", .id = CTL_1, + .base = 0x16000, .len = 0x290, + .features = CTL_SM8550_MASK | BIT(DPU_CTL_SPLIT_DISPLAY), + .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10), + }, { + .name = "ctl_2", .id = CTL_2, + .base = 0x17000, .len = 0x290, + .features = CTL_SM8550_MASK, + .intr_start = 
DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 11), + }, { + .name = "ctl_3", .id = CTL_3, + .base = 0x18000, .len = 0x290, + .features = CTL_SM8550_MASK, + .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 12), + }, { + .name = "ctl_4", .id = CTL_4, + .base = 0x19000, .len = 0x290, + .features = CTL_SM8550_MASK, + .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 13), + }, { + .name = "ctl_5", .id = CTL_5, + .base = 0x1a000, .len = 0x290, + .features = CTL_SM8550_MASK, + .intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 23), + }, +}; + +static const struct dpu_sspp_cfg x1e80100_sspp[] = { + { + .name = "sspp_0", .id = SSPP_VIG0, + .base = 0x4000, .len = 0x344, + .features = VIG_SDM845_MASK, + .sblk = &dpu_vig_sblk_qseed3_3_2, + .xin_id = 0, + .type = SSPP_TYPE_VIG, + }, { + .name = "sspp_1", .id = SSPP_VIG1, + .base = 0x6000, .len = 0x344, + .features = VIG_SDM845_MASK, + .sblk = &dpu_vig_sblk_qseed3_3_2, + .xin_id = 4, + .type = SSPP_TYPE_VIG, + }, { + .name = "sspp_2", .id = SSPP_VIG2, + .base = 0x8000, .len = 0x344, + .features = VIG_SDM845_MASK, + .sblk = &dpu_vig_sblk_qseed3_3_2, + .xin_id = 8, + .type = SSPP_TYPE_VIG, + }, { + .name = "sspp_3", .id = SSPP_VIG3, + .base = 0xa000, .len = 0x344, + .features = VIG_SDM845_MASK, + .sblk = &dpu_vig_sblk_qseed3_3_2, + .xin_id = 12, + .type = SSPP_TYPE_VIG, + }, { + .name = "sspp_8", .id = SSPP_DMA0, + .base = 0x24000, .len = 0x344, + .features = DMA_SDM845_MASK, + .sblk = &dpu_dma_sblk, + .xin_id = 1, + .type = SSPP_TYPE_DMA, + }, { + .name = "sspp_9", .id = SSPP_DMA1, + .base = 0x26000, .len = 0x344, + .features = DMA_SDM845_MASK, + .sblk = &dpu_dma_sblk, + .xin_id = 5, + .type = SSPP_TYPE_DMA, + }, { + .name = "sspp_10", .id = SSPP_DMA2, + .base = 0x28000, .len = 0x344, + .features = DMA_SDM845_MASK, + .sblk = &dpu_dma_sblk, + .xin_id = 9, + .type = SSPP_TYPE_DMA, +
[PATCH 4/5] drm/msm/dp: Try looking for link-frequencies into the port@0's endpoint first
From: Abhinav Kumar On platforms where the endpoint used is on port@0, looking for port@1 instead results in just ignoring the max link-frequencies altogether. Look at port@0 first, then, if not found, look for port@1. Signed-off-by: Abhinav Kumar Signed-off-by: Abel Vesa --- drivers/gpu/drm/msm/dp/dp_parser.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c b/drivers/gpu/drm/msm/dp/dp_parser.c index 7032dcc8842b..eec5b8b83f4b 100644 --- a/drivers/gpu/drm/msm/dp/dp_parser.c +++ b/drivers/gpu/drm/msm/dp/dp_parser.c @@ -97,7 +97,11 @@ static u32 dp_parser_link_frequencies(struct device_node *of_node) u64 frequency = 0; int cnt; - endpoint = of_graph_get_endpoint_by_regs(of_node, 1, 0); /* port@1 */ + endpoint = of_graph_get_endpoint_by_regs(of_node, 0, 0); /* port@0 */ + + if (!endpoint) + endpoint = of_graph_get_endpoint_by_regs(of_node, 1, 0); /* port@1 */ + if (!endpoint) return 0; -- 2.34.1
Re: Implement per-key keyboard backlight as auxdisplay?
Hi, On 1/19/24 21:15, Pavel Machek wrote: > Hi! > 2. Implement per-key keyboards as auxdisplay - Pro: - Already has a concept for led positions - Is conceptually closer to "multiple leds forming a singular entity" - Con: - No preexisting UPower support - No concept for special hardware lighting modes - No support for arbitrary led outlines yet (e.g. ISO style enter-key) >>> >>> Please do this one. >> >> Ok, so based on the discussion so far and Pavel's feedback lets try to >> design a custom userspace API for this. I do not believe that auxdisplay >> is a good fit because: > > Ok, so lets call this a "display". These days, framebuffers and drm > handles displays. My proposal is to use similar API as other displays. > >> So my proposal would be an ioctl interface (ioctl only no r/w) >> using /dev/rgbkbd0 /dev/rgbkbd1, etc. registered as a misc chardev. >> >> For per key controllable rgb LEDs we need to discuss a coordinate >> system. I propose using a fixed size of 16 rows of 64 keys, >> so 64x16 in standard WxH notation. >> >> And then storing RGB in separate bytes, so userspace will then >> always send a buffer of 192 bytes per line (64x3) x 14 rows >> = 3072 bytes. With the kernel driver ignoring parts of >> the buffer where there are no actual keys. > > That's really really weird interface. If you are doing RGB888 64x14, > lets make it a ... display? :-). > > ioctl always sending 3072 bytes is really a hack. > > Small displays exist and are quite common, surely we'd handle this as > a display: > https://pajenicko.cz/displeje/graficky-oled-displej-0-66-64x48-i2c-bily-wemos-d1-mini > It is 64x48. This is indeed a display and should use display APIs > And then there's this: > https://pajenicko.cz/displeje/maticovy-8x8-led-displej-s-radicem-max7219 > and this: > https://pajenicko.cz/displeje/maticovy-8x32-led-displej-s-radicem-max7219 > > One of them is 8x8. > > Surely those should be displays, too? 
The 8x8 one not really; the other one could be used to scroll some text on, but cannot display images, so not really displays IMHO. Anyway, we are talking about keyboards here, and those do not have a regular x-y grid like your examples above, so they certainly do not count as displays. See the long discussion earlier in the thread. Regards, Hans
Re: Implement per-key keyboard backlight as auxdisplay?
Hi Werner, On 1/19/24 17:04, Werner Sembach wrote: > Am 19.01.24 um 09:44 schrieb Hans de Goede: >> So my proposal would be an ioctl interface (ioctl only no r/w) >> using /dev/rgbkbd0 /dev/rgbkbd1, etc. registered as a misc chardev. >> >> For per key controllable rgb LEDs we need to discuss a coordinate >> system. I propose using a fixed size of 16 rows of 64 keys, >> so 64x16 in standard WxH notation. >> >> And then storing RGB in separate bytes, so userspace will then >> always send a buffer of 192 bytes per line (64x3) x 14 rows >> = 3072 bytes. With the kernel driver ignoring parts of >> the buffer where there are no actual keys. > Just to be sure: the "14 rows" is a typo? And should be 16 rows? Yes that should be 16. >> This way we can address all the possible keys in the various >> standard layouts in one standard way and then the drivers can >> just skip keys which are not there when preparing the buffer >> to send to the hw / fw. > > Some remarks here: > > - Some keyboards might have two or more leds for big keys like (iso-)enter, > shift, capslock, num+, etc. that in theory are individually controllable by > the firmware. In Windows drivers this is usually abstracted away, but could > be interesting for effects (e.g. if the top of iso-enter is separate from the > bottom of iso-enter like with one of our devices). > > - In combination with this: The driver might not be able to tell if the > actual physical keyboard is ISO or ANSI, so it might not be able to > correctly assign the leds around enter as being an own key or being > part of ANSI- or ISO-enter. > > - Should the interface have different addresses for the different enter and > num+ styles (or even the different length shifts and spacebars)? > > One idea for this: Actually assign 1 value per line for tall keys per line, 3 > (or maybe even 4, to have one spare) values per line for wide keys and 6 (or > 8) values for space. 
e.g.: That sounds workable. OTOH, combined with your remarks about also supporting lightbars, I'm starting to think that we need to just punt this to userspace. So basically change things from trying to present a standardized address space where, say, the 'Q' key is always in the same place, to just modeling a keyboard as a string of LEDs (1-dimensional, so an array), and leave mapping which address in the array is which key to userspace; then userspace can have json or whatever files for this per keyboard. This keeps the kernel interface much more KISS which I think is what we need to strive for. So instead of having /dev/rgbkbd we get a /dev/rgbledstring and then that can be used for rgb-kbds and also your lightbar example as well as actual RGB LED strings, which depending on the controller may also have zones / effects, etc. just like the keyboards. > - Right shift would have 3 values in row 10. The first value might be the > left side of shift or the additional ABNT/JIS key. The 2nd Key might be the > left side or middle of shift and the third key might be the right side of > shift or the only value for the whole key. The additional ABNT/JIS key still > also has a dedicated value which is used by drivers which can differentiate > between physical layouts. > > - Enter would have 3 values in row 8 and 3 values in row 9. With the same > disambiguation as the additional ABNT/JIS but this time for ansi-/ and iso-# > > - Num+ would have 2 values, one row 8 and one in row 9. The one in row 9 > might control the whole key or might just control the lower half. The one in > row 8 might be another key or the upper half > > For the left half of the main block the leftmost value should be the "might > be the only relevant"-value while the rightmost value should be the "might > be another key"-value. For the right side of the main block this should be > swapped. 
> Unused values should be adjacent to the "might be another key"-value, e.g.:
>
> |                                   | Left shift value 1    | Left shift value 2     | Left shift value 3 | Left shift value 4 | 102nd key value |
> | ISO/ANSI aware                    | Left shift color      | Unused                 | Unused             | Unused             | 102nd key color |
> | ISO non aware 1 led under shift   | Left shift color      | Unused                 | Unused             | 102nd key color    | Unused          |
> | ANSI non aware 1 led under shift  | Left shift color      | Unused                 | Unused             | Unused             | Unused          |
> | ISO non aware 2 leds under shift  | Left shift left color | Left shift right color | Unused             | 102nd key color    | Unused          |
> | ANSI non aware 2 leds under shift | Left shift left color | Left shift right color | Unused             | Unused             | Unused          |
> | ISO non aware 3 leds under shift  | Left shift left color | Left shift middle co
Re: [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure
Le lundi 29 janvier 2024 à 14:17 +0100, Christian König a écrit : > Am 29.01.24 um 14:06 schrieb Paul Cercueil: > > Hi Christian, > > > > Le lundi 29 janvier 2024 à 13:52 +0100, Christian König a écrit : > > > Am 27.01.24 um 17:50 schrieb Jonathan Cameron: > > > > > > > + iio_buffer_dmabuf_put(attach); > > > > > > > + > > > > > > > +out_dmabuf_put: > > > > > > > + dma_buf_put(dmabuf); > > > > > > As below. Feels like a __free(dma_buf_put) bit of magic > > > > > > would > > > > > > be a > > > > > > nice to have. > > > > > I'm working on the patches right now, just one quick > > > > > question. > > > > > > > > > > Having a __free(dma_buf_put) requires that dma_buf_put is > > > > > first > > > > > "registered" as a freeing function using DEFINE_FREE() in > > > > > > > > > buf.h>, which has not been done yet. > > > > > > > > > > That would mean carrying a dma-buf specific patch in your > > > > > tree, > > > > > are you > > > > > OK with that? > > > > Needs an ACK from appropriate maintainer, but otherwise I'm > > > > fine > > > > doing > > > > so. Alternative is to circle back to this later after this > > > > code is > > > > upstream. > > > Separate patches for that please, the autocleanup feature is so > > > new > > > that > > > I'm not 100% convinced that everything works out smoothly from > > > the > > > start. > > Separate patches is a given, did you mean outside this patchset? > > Because I can send a separate patchset that introduces scope-based > > management for dma_fence and dma_buf, but then it won't have users. > > Outside of the patchset, this is essentially brand new stuff. > > IIRC we have quite a number of dma_fence selftests and sw_sync which > is > basically code inside the drivers/dma-buf directory only there for > testing DMA-buf functionality. > > Convert those over as well and I'm more than happy to upstream this > change. 
Well there is very little to convert there; you can use scope-based management when the unref is done in all exit points of the functional block, and the only place I could find that does that in drivers/dma-buf/ was in dma_fence_chain_enable_signaling() in dma-fence-chain.c. Cheers, -Paul
Re: [PATCH 1/2] ACPI: video: Handle fetching EDID that is longer than 256 bytes
On Fri, Jan 26, 2024 at 7:55 PM Mario Limonciello wrote: > > The ACPI specification allows for an EDID to be up to 512 bytes but > the _DDC EDID fetching code will only try up to 256 bytes. > > Modify the code to instead start at 512 bytes and work its way > down. > > Link: > https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/Apx_B_Video_Extensions/output-device-specific-methods.html#ddc-return-the-edid-for-this-device > Signed-off-by: Mario Limonciello > --- > drivers/acpi/acpi_video.c | 23 --- > 1 file changed, 16 insertions(+), 7 deletions(-) > > diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c > index 62f4364e4460..b3b15dd4755d 100644 > --- a/drivers/acpi/acpi_video.c > +++ b/drivers/acpi/acpi_video.c > @@ -624,6 +624,10 @@ acpi_video_device_EDID(struct acpi_video_device *device, > arg0.integer.value = 1; > else if (length == 256) > arg0.integer.value = 2; > + else if (length == 384) > + arg0.integer.value = 3; > + else if (length == 512) > + arg0.integer.value = 4; It looks like switch () would be somewhat better. Or maybe even arg0.integer.value = length / 128; The validation could be added too: if (arg0.integer.value > 4 || arg0.integer.value * 128 != length) return -EINVAL; but it is pointless, because the caller is never passing an invalid number to it AFAICS. 
> else > return -EINVAL; > > @@ -1443,7 +1447,7 @@ int acpi_video_get_edid(struct acpi_device *device, int > type, int device_id, > > for (i = 0; i < video->attached_count; i++) { > video_device = video->attached_array[i].bind_info; > - length = 256; > + length = 512; > > if (!video_device) > continue; > @@ -1478,13 +1482,18 @@ int acpi_video_get_edid(struct acpi_device *device, > int type, int device_id, > > if (ACPI_FAILURE(status) || !buffer || > buffer->type != ACPI_TYPE_BUFFER) { > - length = 128; > - status = acpi_video_device_EDID(video_device, &buffer, > - length); > - if (ACPI_FAILURE(status) || !buffer || > - buffer->type != ACPI_TYPE_BUFFER) { > - continue; > + while (length) { I would prefer a do {} while () loop here, which could include the first invocation of acpi_video_device_EDID() too (and reduce code duplication a bit). > + length -= 128; > + status = acpi_video_device_EDID(video_device, > &buffer, > + length); No line break, please. > + if (ACPI_FAILURE(status) || !buffer || > + buffer->type != ACPI_TYPE_BUFFER) { > + continue; > + } > + break; > } > + if (!length) > + continue; > } > > *edid = buffer->buffer.pointer; > --
Re: [PATCH v5 5/8] iio: core: Add new DMABUF interface infrastructure
Le lundi 29 janvier 2024 à 14:32 +0100, Paul Cercueil a écrit : > Le lundi 29 janvier 2024 à 14:17 +0100, Christian König a écrit : > > Am 29.01.24 um 14:06 schrieb Paul Cercueil: > > > Hi Christian, > > > > > > Le lundi 29 janvier 2024 à 13:52 +0100, Christian König a écrit : > > > > Am 27.01.24 um 17:50 schrieb Jonathan Cameron: > > > > > > > > + iio_buffer_dmabuf_put(attach); > > > > > > > > + > > > > > > > > +out_dmabuf_put: > > > > > > > > + dma_buf_put(dmabuf); > > > > > > > As below. Feels like a __free(dma_buf_put) bit of magic > > > > > > > would > > > > > > > be a > > > > > > > nice to have. > > > > > > I'm working on the patches right now, just one quick > > > > > > question. > > > > > > > > > > > > Having a __free(dma_buf_put) requires that dma_buf_put is > > > > > > first > > > > > > "registered" as a freeing function using DEFINE_FREE() in > > > > > > > > > > > buf.h>, which has not been done yet. > > > > > > > > > > > > That would mean carrying a dma-buf specific patch in your > > > > > > tree, > > > > > > are you > > > > > > OK with that? > > > > > Needs an ACK from appropriate maintainer, but otherwise I'm > > > > > fine > > > > > doing > > > > > so. Alternative is to circle back to this later after this > > > > > code is > > > > > upstream. > > > > Separate patches for that please, the autocleanup feature is so > > > > new > > > > that > > > > I'm not 100% convinced that everything works out smoothly from > > > > the > > > > start. > > > Separate patches is a given, did you mean outside this patchset? > > > Because I can send a separate patchset that introduces scope- > > > based > > > management for dma_fence and dma_buf, but then it won't have > > > users. > > > > Outside of the patchset, this is essentially brand new stuff. > > > > IIRC we have quite a number of dma_fence selftests and sw_sync > > which > > is > > basically code inside the drivers/dma-buf directory only there for > > testing DMA-buf functionality. 
> > Convert those over as well and I'm more than happy to upstream this change. Well there is very little to convert there; you can use scope-based management when the unref is done in all exit points of the functional block, and the only place I could find that does that in drivers/dma-buf/ was in dma_fence_chain_enable_signaling() in dma-fence-chain.c. Actually - not even that, since it doesn't call dma_fence_get() and dma_fence_put() on the same fence. So I cannot use it anywhere in drivers/dma-buf/. -Paul
Re: Re: Re: [PATCH 1/3] bits: introduce fixed-type genmasks
On Wed, Jan 24, 2024 at 07:27:58AM -0800, Yury Norov wrote: On Wed, Jan 24, 2024 at 08:03:53AM -0600, Lucas De Marchi wrote: On Wed, Jan 24, 2024 at 09:58:26AM +0200, Jani Nikula wrote: > On Tue, 23 Jan 2024, Lucas De Marchi wrote: > > From: Yury Norov > > > > Generalize __GENMASK() to support different types, and implement > > fixed-types versions of GENMASK() based on it. The fixed-type version > > allows more strict checks to the min/max values accepted, which is > > useful for defining registers like implemented by i915 and xe drivers > > with their REG_GENMASK*() macros. > > Mmh, the commit message says the fixed-type version allows more strict > checks, but none are actually added. GENMASK_INPUT_CHECK() remains the > same. > > Compared to the i915 and xe versions, this is more lax now. You could > specify GENMASK_U32(63,32) without complaints. Doing this on top of this series: -#define XELPDP_PORT_M2P_COMMAND_TYPE_MASK REG_GENMASK(30, 27) +#define XELPDP_PORT_M2P_COMMAND_TYPE_MASK REG_GENMASK(62, 32) and I do get a build failure: ../drivers/gpu/drm/i915/display/intel_cx0_phy.c: In function ‘__intel_cx0_read_once’: ../include/linux/bits.h:41:31: error: left shift count >= width of type [-Werror=shift-count-overflow] 41 | (((t)~0ULL - ((t)(1) << (l)) + 1) & \ | ^~ I would better include this in commit message to avoid people's confusion. If it comes to v2, can you please do it and mention that this trick relies on shift-count-overflow compiler check? either that or an explicit check as it was suggested. What's your preference? Lucas De Marchi Thanks, Yury
Re: Making drm_gpuvm work across gpu devices
On 2024-01-25 13:32, Daniel Vetter wrote: On Wed, Jan 24, 2024 at 09:33:12AM +0100, Christian König wrote: Am 23.01.24 um 20:37 schrieb Zeng, Oak: [SNIP] Yes most API are per device based. One exception I know is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each kfd_process represent a process across N gpu devices. Yeah and that was a big mistake in my opinion. We should really not do that ever again. Need to say, kfd SVM represent a shared virtual address space across CPU and all GPU devices on the system. This is by the definition of SVM (shared virtual memory). This is very different from our legacy gpu *device* driver which works for only one device (i.e., if you want one device to access another device's memory, you will have to use dma-buf export/import etc). Exactly that thinking is what we have currently found as blocker for a virtualization projects. Having SVM as device independent feature which somehow ties to the process address space turned out to be an extremely bad idea. The background is that this only works for some use cases but not all of them. What's working much better is to just have a mirror functionality which says that a range A..B of the process address space is mapped into a range C..D of the GPU address space. Those ranges can then be used to implement the SVM feature required for higher level APIs and not something you need at the UAPI or even inside the low level kernel memory management. When you talk about migrating memory to a device you also do this on a per device basis and *not* tied to the process address space. If you then get crappy performance because userspace gave contradicting information where to migrate memory then that's a bug in userspace and not something the kernel should try to prevent somehow. 
[SNIP] I think if you start using the same drm_gpuvm for multiple devices you will sooner or later start to run into the same mess we have seen with KFD, where we moved more and more functionality from the KFD to the DRM render node because we found that a lot of the stuff simply doesn't work correctly with a single object to maintain the state. As I understand it, KFD is designed to work across devices. A single pseudo /dev/kfd device represent all hardware gpu devices. That is why during kfd open, many pdd (process device data) is created, each for one hardware device for this process. Yes, I'm perfectly aware of that. And I can only repeat myself that I see this design as a rather extreme failure. And I think it's one of the reasons why NVidia is so dominant with Cuda. This whole approach KFD takes was designed with the idea of extending the CPU process into the GPUs, but this idea only works for a few use cases and is not something we should apply to drivers in general. A very good example are virtualization use cases where you end up with CPU address != GPU address because the VAs are actually coming from the guest VM and not the host process. SVM is a high level concept of OpenCL, Cuda, ROCm etc.. This should not have any influence on the design of the kernel UAPI. If you want to do something similar as KFD for Xe I think you need to get explicit permission to do this from Dave and Daniel and maybe even Linus. I think the one and only one exception where an SVM uapi like in kfd makes sense, is if the _hardware_ itself, not the software stack defined semantics that you've happened to build on top of that hw, enforces a 1:1 mapping with the cpu process address space. Which means your hardware is using PASID, IOMMU based translation, PCI-ATS (address translation services) or whatever your hw calls it and has _no_ device-side pagetables on top. 
Which from what I've seen all devices with device-memory have, simply because they need some place to store whether that memory is currently in device memory or should be translated using PASID. Currently there's no gpu that works with PASID only, but there are some on-cpu-die accelerator things that do work like that. Maybe in the future there will be some accelerators that are fully cpu cache coherent (including atomics) with something like CXL, and the on-device memory is managed as normal system memory with struct page as ZONE_DEVICE and accelerator va -> physical address translation is only done with PASID ... but for now I haven't seen that, definitely not in upstream drivers. And the moment you have some per-device pagetables or per-device memory management of some sort (like using gpuva mgr) then I'm 100% agreeing with Christian that the kfd SVM model is too strict and not a great idea. That basically means, without ATS/PRI+PASID you cannot implement a unified memory programming model, where GPUs or accelerators access virtual addresses without pre-registering them with an SVM API call. Unified memory is a feature implemented by the KFD SVM API and used by ROCm. This is used e.g. to implement OpenMP USM (unified shared memory). It'
Re: [PATCH 4/5] drm/msm/dp: Try looking for link-frequencies into the port@0's endpoint first
On Mon, 29 Jan 2024 at 15:19, Abel Vesa wrote: > > From: Abhinav Kumar > > On platforms where the endpoint used is on port@0, looking for port@1 > instead results in just ignoring the max link-frequencies altogether. > Look at port@0 first, then, if not found, look for port@1. NAK. Platforms do not "use port@0". It is for the connection between DPU and DP, while the link-frequencies property is for the link between DP controller and the actual display. > > Signed-off-by: Abhinav Kumar > Signed-off-by: Abel Vesa > --- > drivers/gpu/drm/msm/dp/dp_parser.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/msm/dp/dp_parser.c > b/drivers/gpu/drm/msm/dp/dp_parser.c > index 7032dcc8842b..eec5b8b83f4b 100644 > --- a/drivers/gpu/drm/msm/dp/dp_parser.c > +++ b/drivers/gpu/drm/msm/dp/dp_parser.c > @@ -97,7 +97,11 @@ static u32 dp_parser_link_frequencies(struct device_node > *of_node) > u64 frequency = 0; > int cnt; > > - endpoint = of_graph_get_endpoint_by_regs(of_node, 1, 0); /* port@1 */ > + endpoint = of_graph_get_endpoint_by_regs(of_node, 0, 0); /* port@0 */ > + > + if (!endpoint) > + endpoint = of_graph_get_endpoint_by_regs(of_node, 1, 0); /* > port@1 */ > + > if (!endpoint) > return 0; > > > -- > 2.34.1 > -- With best wishes Dmitry
Re: Re: Re: [PATCH 1/3] bits: introduce fixed-type genmasks
On Mon, Jan 29, 2024 at 08:49:35AM -0600, Lucas De Marchi wrote: > On Wed, Jan 24, 2024 at 07:27:58AM -0800, Yury Norov wrote: > > On Wed, Jan 24, 2024 at 08:03:53AM -0600, Lucas De Marchi wrote: > > > On Wed, Jan 24, 2024 at 09:58:26AM +0200, Jani Nikula wrote: > > > > On Tue, 23 Jan 2024, Lucas De Marchi wrote: > > > > > From: Yury Norov > > > > > > > > > > Generalize __GENMASK() to support different types, and implement > > > > > fixed-types versions of GENMASK() based on it. The fixed-type version > > > > > allows more strict checks to the min/max values accepted, which is > > > > > useful for defining registers like implemented by i915 and xe drivers > > > > > with their REG_GENMASK*() macros. > > > > > > > > Mmh, the commit message says the fixed-type version allows more strict > > > > checks, but none are actually added. GENMASK_INPUT_CHECK() remains the > > > > same. > > > > > > > > Compared to the i915 and xe versions, this is more lax now. You could > > > > specify GENMASK_U32(63,32) without complaints. > > > > > > Doing this on top of the this series: > > > > > > -#define XELPDP_PORT_M2P_COMMAND_TYPE_MASKREG_GENMASK(30, > > > 27) > > > +#define XELPDP_PORT_M2P_COMMAND_TYPE_MASKREG_GENMASK(62, > > > 32) > > > > > > and I do get a build failure: > > > > > > ../drivers/gpu/drm/i915/display/intel_cx0_phy.c: In function > > > ‘__intel_cx0_read_once’: > > > ../include/linux/bits.h:41:31: error: left shift count >= width of type > > > [-Werror=shift-count-overflow] > > >41 | (((t)~0ULL - ((t)(1) << (l)) + 1) & \ > > > | ^~ > > > > I would better include this in commit message to avoid people's > > confusion. If it comes to v2, can you please do it and mention that > > this trick relies on shift-count-overflow compiler check? > > either that or an explicit check as it was suggested. What's your > preference? Let's put a comment in the code. 
An argument that shift-count-overflow may be disabled sounds more like a speculation unless we have a solid example of a build system where the error is disabled for a good sane reason, but possible GENMASK() overflow is still considered dangerous. GENMASK() is all about bit shifts, so shift-related error is something I'd expect when using GENMASK(). Also, the macro is widely used in the kernel: yury:linux$ git grep GENMASK | wc -l 26879 Explicit check would add pressure on the compiler for nothing. Thanks, Yury
Re: [PATCH 3/5] drm/msm: mdss: Add X1E80100 support
On Mon, 29 Jan 2024 at 15:19, Abel Vesa wrote: > > Add support for MDSS on X1E80100. > > Signed-off-by: Abel Vesa > --- > drivers/gpu/drm/msm/msm_mdss.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/gpu/drm/msm/msm_mdss.c b/drivers/gpu/drm/msm/msm_mdss.c > index 455b2e3a0cdd..eddf7fdbb60a 100644 > --- a/drivers/gpu/drm/msm/msm_mdss.c > +++ b/drivers/gpu/drm/msm/msm_mdss.c > @@ -564,6 +564,15 @@ static const struct msm_mdss_data sdm670_data = { > .highest_bank_bit = 1, > }; > > +static const struct msm_mdss_data x1e80100_data = { > + .ubwc_enc_version = UBWC_4_0, > + .ubwc_dec_version = UBWC_4_3, > + .ubwc_swizzle = 6, > + .ubwc_static = 1, > + .highest_bank_bit = 2, > + .macrotile_mode = 1, Missing .reg_bus_bw, LGTM otherwise > +}; > + > static const struct msm_mdss_data sdm845_data = { > .ubwc_enc_version = UBWC_2_0, > .ubwc_dec_version = UBWC_2_0, > @@ -655,6 +664,7 @@ static const struct of_device_id mdss_dt_match[] = { > { .compatible = "qcom,sm8450-mdss", .data = &sm8350_data }, > { .compatible = "qcom,sm8550-mdss", .data = &sm8550_data }, > { .compatible = "qcom,sm8650-mdss", .data = &sm8550_data}, > + { .compatible = "qcom,x1e80100-mdss", .data = &x1e80100_data}, > {} > }; > MODULE_DEVICE_TABLE(of, mdss_dt_match); > > -- > 2.34.1 > -- With best wishes Dmitry
Re: [PATCH 5/5] drm/msm/dpu: Add X1E80100 support
On Mon, 29 Jan 2024 at 15:19, Abel Vesa wrote: > > Add definitions for the display hardware used on the Qualcomm X1E80100 > platform. > > Co-developed-by: Abhinav Kumar > Signed-off-by: Abhinav Kumar > Signed-off-by: Abel Vesa > --- > .../drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h | 449 > + > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 + > drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 1 + > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c| 1 + > 4 files changed, 453 insertions(+) > Reviewed-by: Dmitry Baryshkov -- With best wishes Dmitry
Re: [PATCH v4 0/8] drm/amd/display: Introduce KUnit to Display Mode Library
That we include so many C files with relative paths seems to be odd. Apart from that looks good to me. Christian. Am 26.01.24 um 16:48 schrieb Rodrigo Siqueira: In 2022, we got a great patchset from a GSoC project introducing unit tests to the amdgpu display. Since version 3, this effort was put on hold, and now I'm attempting to revive it. I'll add part of the original cover letter at the bottom of this cover letter, but you can read all the original messages at: https://lore.kernel.org/amd-gfx/20220912155919.39877-1-mairaca...@riseup.net/ Anyway, this new version changes are: - Rebase and adjust conflicts. - Rewrite part of the dc_dmub_srv_test to represent a real scenario that simulates some parameter configuration for using 4k144 and 4k240 displays. Thanks Siqueira Original cover letter Hello, This series is version 3 of the introduction of unit testing to the AMDPGU driver [1]. Our main goal is to bring unit testing to the AMD display driver; in particular, we'll focus on the Display Mode Library (DML) for DCN2.0, DMUB, and some of the DCE functions. This implementation intends to help developers to recognize bugs before they are merged into the mainline and also makes it possible for future code refactors of the AMD display driver. For the implementation of the tests, we decided to go with the Kernel Unit Testing Framework (KUnit). KUnit makes it possible to run test suites on kernel boot or load the tests as a module. It reports all test case results through a TAP (Test Anything Protocol) in the kernel log. Moreover, KUnit unifies the test structure and provides tools to simplify the testing for developers and CI systems. In regards to CI pipelines, we believe kunit_tool [2] provides ease of use, but we are also working on integrating KUnit into IGT [3]. Since the second version, we've chosen a mix of approaches to integrate KUnit tests into amdgpu: 1. Tests that use static functions are included through guards [4]. 2. 
Tests without static functions are included through a Makefile. We understand that testing static functions is not ideal, but taking into consideration that this driver relies heavily on static functions with complex behavior which would benefit from unit testing, otherwise, black-box tested through public functions with dozens of arguments and sometimes high cyclomatic complexity. The first seven patches represent what we intend to do for the rest of the DML modules: systematic testing of the DML functions, especially mathematically complicated functions. Also, it shows how simple it is to add new tests to the DML. Among the tests, we highlight the dcn20_fpu_test, which, had it existed then, could catch the defects introduced to dcn20_fpu.c by 8861c27a6c [5] later fixed by 9ad5d02c2a [6]. In this series, there's also an example of how unit tests can help avoid regressions and keep track of changes in behavior. [..] Isabella Basso (1): drm/amd/display: Introduce KUnit tests to display_rq_dlg_calc_20 Magali Lemes (1): drm/amd/display: Introduce KUnit tests for dcn20_fpu Maíra Canal (5): drm/amd/display: Introduce KUnit tests to the bw_fixed library drm/amd/display: Introduce KUnit tests to the display_mode_vba library drm/amd/display: Introduce KUnit to dcn20/display_mode_vba_20 library drm/amd/display: Introduce KUnit tests to dc_dmub_srv library Documentation/gpu: Add Display Core Unit Test documentation Tales Aparecida (1): drm/amd/display: Introduce KUnit tests for fixed31_32 library .../gpu/amdgpu/display/display-test.rst | 88 ++ Documentation/gpu/amdgpu/display/index.rst| 1 + drivers/gpu/drm/amd/display/Kconfig | 52 + drivers/gpu/drm/amd/display/Makefile | 2 +- drivers/gpu/drm/amd/display/dc/dc_dmub_srv.c | 4 + .../dc/dml/dcn20/display_mode_vba_20.c| 4 + .../dc/dml/dcn20/display_rq_dlg_calc_20.c | 4 + drivers/gpu/drm/amd/display/tests/Makefile| 18 + .../display/tests/dc/basics/fixpt31_32_test.c | 232 + .../amd/display/tests/dc/dc_dmub_srv_test.c | 159 
.../tests/dc/dml/calcs/bw_fixed_test.c| 323 +++ .../tests/dc/dml/dcn20/dcn20_fpu_test.c | 561 +++ .../dc/dml/dcn20/display_mode_vba_20_test.c | 888 ++ .../dml/dcn20/display_rq_dlg_calc_20_test.c | 124 +++ .../tests/dc/dml/display_mode_vba_test.c | 741 +++ 15 files changed, 3200 insertions(+), 1 deletion(-) create mode 100644 Documentation/gpu/amdgpu/display/display-test.rst create mode 100644 drivers/gpu/drm/amd/display/tests/Makefile create mode 100644 drivers/gpu/drm/amd/display/tests/dc/basics/fixpt31_32_test.c create mode 100644 drivers/gpu/drm/amd/display/tests/dc/dc_dmub_srv_test.c create mode 100644 drivers/gpu/drm/amd/display/tests/dc/dml/calcs/bw_fixed_test.c create mode 100644 drivers/gpu/drm/amd/display/tests/dc/dml/dcn20/dcn20_fpu_test.c create mode 100644 drivers/gpu/drm/amd/
Re: Making drm_gpuvm work across gpu devices
Am 29.01.24 um 16:03 schrieb Felix Kuehling: On 2024-01-25 13:32, Daniel Vetter wrote: On Wed, Jan 24, 2024 at 09:33:12AM +0100, Christian König wrote: Am 23.01.24 um 20:37 schrieb Zeng, Oak: [SNIP] Yes most API are per device based. One exception I know is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each kfd_process represent a process across N gpu devices. Yeah and that was a big mistake in my opinion. We should really not do that ever again. Need to say, kfd SVM represent a shared virtual address space across CPU and all GPU devices on the system. This is by the definition of SVM (shared virtual memory). This is very different from our legacy gpu *device* driver which works for only one device (i.e., if you want one device to access another device's memory, you will have to use dma-buf export/import etc). Exactly that thinking is what we have currently found as blocker for a virtualization projects. Having SVM as device independent feature which somehow ties to the process address space turned out to be an extremely bad idea. The background is that this only works for some use cases but not all of them. What's working much better is to just have a mirror functionality which says that a range A..B of the process address space is mapped into a range C..D of the GPU address space. Those ranges can then be used to implement the SVM feature required for higher level APIs and not something you need at the UAPI or even inside the low level kernel memory management. When you talk about migrating memory to a device you also do this on a per device basis and *not* tied to the process address space. If you then get crappy performance because userspace gave contradicting information where to migrate memory then that's a bug in userspace and not something the kernel should try to prevent somehow. 
[SNIP] I think if you start using the same drm_gpuvm for multiple devices you will sooner or later start to run into the same mess we have seen with KFD, where we moved more and more functionality from the KFD to the DRM render node because we found that a lot of the stuff simply doesn't work correctly with a single object to maintain the state. As I understand it, KFD is designed to work across devices. A single pseudo /dev/kfd device represent all hardware gpu devices. That is why during kfd open, many pdd (process device data) is created, each for one hardware device for this process. Yes, I'm perfectly aware of that. And I can only repeat myself that I see this design as a rather extreme failure. And I think it's one of the reasons why NVidia is so dominant with Cuda. This whole approach KFD takes was designed with the idea of extending the CPU process into the GPUs, but this idea only works for a few use cases and is not something we should apply to drivers in general. A very good example are virtualization use cases where you end up with CPU address != GPU address because the VAs are actually coming from the guest VM and not the host process. SVM is a high level concept of OpenCL, Cuda, ROCm etc.. This should not have any influence on the design of the kernel UAPI. If you want to do something similar as KFD for Xe I think you need to get explicit permission to do this from Dave and Daniel and maybe even Linus. I think the one and only one exception where an SVM uapi like in kfd makes sense, is if the _hardware_ itself, not the software stack defined semantics that you've happened to build on top of that hw, enforces a 1:1 mapping with the cpu process address space. Which means your hardware is using PASID, IOMMU based translation, PCI-ATS (address translation services) or whatever your hw calls it and has _no_ device-side pagetables on top. 
Which from what I've seen all devices with device-memory have, simply because they need some place to store whether that memory is currently in device memory or should be translated using PASID. Currently there's no gpu that works with PASID only, but there are some on-cpu-die accelerator things that do work like that. Maybe in the future there will be some accelerators that are fully cpu cache coherent (including atomics) with something like CXL, and the on-device memory is managed as normal system memory with struct page as ZONE_DEVICE and accelerator va -> physical address translation is only done with PASID ... but for now I haven't seen that, definitely not in upstream drivers. And the moment you have some per-device pagetables or per-device memory management of some sort (like using gpuva mgr) then I'm 100% agreeing with Christian that the kfd SVM model is too strict and not a great idea. That basically means, without ATS/PRI+PASID you cannot implement a unified memory programming model, where GPUs or accelerators access virtual addresses without pre-registering them with an SVM API call. Unified memory is a feature implemented by the KFD SVM API and
Re: [PATCH] drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
Just FYI this conflicted pretty heavily with drm-misc-next changes in the same area, someone should check drm-tip has the correct resolution, I'm not really sure what it definitely should be. FWIW, this looks rather messy now. The drm-tip doesn't build. There was a new call to samsung_dsim_set_stop_state() introduced in commit b2fe2292624ac (drm: bridge: samsung-dsim: enter display mode in the enable() callback). I had a closer look at the latest linux-next (where somehow my patch made it into) and tried to apply commit b2fe2292624ac (drm: bridge: samsung-dsim: enter display mode in the enable() callback). It looks like only the following hunk is still needed from that patch. Everything else is covered by this fixes patch. Dario, could you rebase your commit onto this patch? I had a quick test with this change and it seems to work fine for our case. --snip-- diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c b/drivers/gpu/drm/bridge/samsung-dsim.c index 63a1a0c88be4..92755c90e7d2 100644 --- a/drivers/gpu/drm/bridge/samsung-dsim.c +++ b/drivers/gpu/drm/bridge/samsung-dsim.c @@ -1498,6 +1498,8 @@ static void samsung_dsim_atomic_disable(struct drm_bridge *bridge, if (!(dsi->state & DSIM_STATE_ENABLED)) return; + samsung_dsim_set_display_enable(dsi, false); + dsi->state &= ~DSIM_STATE_VIDOUT_AVAILABLE; } @@ -1506,8 +1508,6 @@ static void samsung_dsim_atomic_post_disable(struct drm_bridge *bridge, { struct samsung_dsim *dsi = bridge_to_dsi(bridge); - samsung_dsim_set_display_enable(dsi, false); - dsi->state &= ~DSIM_STATE_ENABLED; pm_runtime_put_sync(dsi->dev); } --snip-- -michael
Re: [PATCH 2/2] drm/amd: Fetch the EDID from _DDC if available for eDP
On 1/29/2024 03:39, Jani Nikula wrote: On Fri, 26 Jan 2024, Mario Limonciello wrote: Some manufacturers have intentionally put an EDID that differs from the EDID on the internal panel on laptops. Attempt to fetch this EDID if it exists and prefer it over the EDID that is provided by the panel. Signed-off-by: Mario Limonciello --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 30 +++ .../gpu/drm/amd/amdgpu/amdgpu_connectors.c| 5 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 8 - .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 7 +++-- 5 files changed, 49 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index c5f3859fd682..99abe12567a4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1520,6 +1520,7 @@ int amdgpu_acpi_get_mem_info(struct amdgpu_device *adev, int xcc_id, void amdgpu_acpi_get_backlight_caps(struct amdgpu_dm_backlight_caps *caps); bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev); +void *amdgpu_acpi_edid(struct amdgpu_device *adev, struct drm_connector *connector); void amdgpu_acpi_detect(void); void amdgpu_acpi_release(void); #else @@ -1537,6 +1538,7 @@ static inline int amdgpu_acpi_get_mem_info(struct amdgpu_device *adev, } static inline void amdgpu_acpi_fini(struct amdgpu_device *adev) { } static inline bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev) { return false; } +static inline void *amdgpu_acpi_edid(struct amdgpu_device *adev, struct drm_connector *connector) { return NULL; } static inline void amdgpu_acpi_detect(void) { } static inline void amdgpu_acpi_release(void) { } static inline bool amdgpu_acpi_is_power_shift_control_supported(void) { return false; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c index e550067e5c5d..c106335f1f22 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -1380,6 +1380,36 @@ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev) #endif } +/** + * amdgpu_acpi_edid + * @adev: amdgpu_device pointer + * @connector: drm_connector pointer + * + * Returns the EDID used for the internal panel if present, NULL otherwise. + */ +void * +amdgpu_acpi_edid(struct amdgpu_device *adev, struct drm_connector *connector) +{ + struct drm_device *ddev = adev_to_drm(adev); + struct acpi_device *acpidev = ACPI_COMPANION(ddev->dev); + void *edid; + int r; + + if (!acpidev) + return NULL; + + if (connector->connector_type != DRM_MODE_CONNECTOR_eDP) + return NULL; + + r = acpi_video_get_edid(acpidev, ACPI_VIDEO_DISPLAY_LCD, -1, &edid); + if (r < 0) { + DRM_DEBUG_DRIVER("Failed to get EDID from ACPI: %d\n", r); + return NULL; + } + + return kmemdup(edid, r, GFP_KERNEL); +} + /* * amdgpu_acpi_detect - detect ACPI ATIF/ATCS methods * diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c index 9caba10315a8..c7e1563a46d3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c @@ -278,6 +278,11 @@ static void amdgpu_connector_get_edid(struct drm_connector *connector) struct amdgpu_device *adev = drm_to_adev(dev); struct amdgpu_connector *amdgpu_connector = to_amdgpu_connector(connector); + if (amdgpu_connector->edid) + return; + + /* if the BIOS specifies the EDID via _DDC, prefer this */ + amdgpu_connector->edid = amdgpu_acpi_edid(adev, connector); Imagine the EDID returned by acpi_video_get_edid() has edid->extensions bigger than 4. Of course it should not, but you have no guarantees, and it originates outside of the kernel. The real fix is to have the function return a struct drm_edid which tracks the allocation size separately. Unfortunately, it requires a bunch of changes along the way. We've mostly done it in i915, and I've sent a series to do this in drm/bridge [1]. 
Bottom line, we should stop using struct edid in drivers. They'll all parse the info differently, and from what I've seen, often wrong. Thanks for the feedback. In that case this specific change should probably rebase on the Melissa's work https://lore.kernel.org/amd-gfx/20240126163429.56714-1-m...@igalia.com/ after she takes into account the feedback. Let me ask you this though - do you think that after that's done should we let all drivers get EDID from BIOS as a priority? Or would you prefer that this is unique to amdgpu? Something like: 1) If user specifies on kernel command line and puts an EDID in /lib/firmware use that. 2) If BIOS has EDID in _DDC and it's eDP panel, use that. 3) Get panel EDID. BR, Jani. [1] https
Re: [PATCH 1/2] ACPI: video: Handle fetching EDID that is longer than 256 bytes
On 1/29/2024 07:54, Rafael J. Wysocki wrote: On Fri, Jan 26, 2024 at 7:55 PM Mario Limonciello wrote: The ACPI specification allows for an EDID to be up to 512 bytes but the _DDC EDID fetching code will only try up to 256 bytes. Modify the code to start at 512 bytes and work its way down instead. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/Apx_B_Video_Extensions/output-device-specific-methods.html#ddc-return-the-edid-for-this-device Signed-off-by: Mario Limonciello --- drivers/acpi/acpi_video.c | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c index 62f4364e4460..b3b15dd4755d 100644 --- a/drivers/acpi/acpi_video.c +++ b/drivers/acpi/acpi_video.c @@ -624,6 +624,10 @@ acpi_video_device_EDID(struct acpi_video_device *device, arg0.integer.value = 1; else if (length == 256) arg0.integer.value = 2; + else if (length == 384) + arg0.integer.value = 3; + else if (length == 512) + arg0.integer.value = 4; It looks like switch () would be somewhat better. Or maybe even arg0.integer.value = length / 128; The validation could be added too: if (arg0.integer.value > 4 || arg0.integer.value * 128 != length) return -EINVAL; but it is pointless, because the caller is never passing an invalid number to it AFAICS. Thanks. I'll swap over to one of these suggestions. I will also split this patch separately from the other as the other will take some time with refactoring necessary in DRM that will take a cycle or two.
else return -EINVAL; @@ -1443,7 +1447,7 @@ int acpi_video_get_edid(struct acpi_device *device, int type, int device_id, for (i = 0; i < video->attached_count; i++) { video_device = video->attached_array[i].bind_info; - length = 256; + length = 512; if (!video_device) continue; @@ -1478,13 +1482,18 @@ int acpi_video_get_edid(struct acpi_device *device, int type, int device_id, if (ACPI_FAILURE(status) || !buffer || buffer->type != ACPI_TYPE_BUFFER) { - length = 128; - status = acpi_video_device_EDID(video_device, &buffer, - length); - if (ACPI_FAILURE(status) || !buffer || - buffer->type != ACPI_TYPE_BUFFER) { - continue; + while (length) { I would prefer a do {} while () loop here, which could include the first invocation of acpi_video_device_EDID() too (and reduce code duplication a bit). + length -= 128; + status = acpi_video_device_EDID(video_device, &buffer, + length); No line break, please. + if (ACPI_FAILURE(status) || !buffer || + buffer->type != ACPI_TYPE_BUFFER) { + continue; + } + break; } + if (!length) + continue; } *edid = buffer->buffer.pointer; --
Re: Making drm_gpuvm work across gpu devices
On 2024-01-29 10:33, Christian König wrote: Am 29.01.24 um 16:03 schrieb Felix Kuehling: On 2024-01-25 13:32, Daniel Vetter wrote: On Wed, Jan 24, 2024 at 09:33:12AM +0100, Christian König wrote: Am 23.01.24 um 20:37 schrieb Zeng, Oak: [SNIP] Yes most API are per device based. One exception I know is actually the kfd SVM API. If you look at the svm_ioctl function, it is per-process based. Each kfd_process represent a process across N gpu devices. Yeah and that was a big mistake in my opinion. We should really not do that ever again. Need to say, kfd SVM represent a shared virtual address space across CPU and all GPU devices on the system. This is by the definition of SVM (shared virtual memory). This is very different from our legacy gpu *device* driver which works for only one device (i.e., if you want one device to access another device's memory, you will have to use dma-buf export/import etc). Exactly that thinking is what we have currently found as blocker for a virtualization projects. Having SVM as device independent feature which somehow ties to the process address space turned out to be an extremely bad idea. The background is that this only works for some use cases but not all of them. What's working much better is to just have a mirror functionality which says that a range A..B of the process address space is mapped into a range C..D of the GPU address space. Those ranges can then be used to implement the SVM feature required for higher level APIs and not something you need at the UAPI or even inside the low level kernel memory management. When you talk about migrating memory to a device you also do this on a per device basis and *not* tied to the process address space. If you then get crappy performance because userspace gave contradicting information where to migrate memory then that's a bug in userspace and not something the kernel should try to prevent somehow. 
[SNIP] I think if you start using the same drm_gpuvm for multiple devices you will sooner or later start to run into the same mess we have seen with KFD, where we moved more and more functionality from the KFD to the DRM render node because we found that a lot of the stuff simply doesn't work correctly with a single object to maintain the state. As I understand it, KFD is designed to work across devices. A single pseudo /dev/kfd device represent all hardware gpu devices. That is why during kfd open, many pdd (process device data) is created, each for one hardware device for this process. Yes, I'm perfectly aware of that. And I can only repeat myself that I see this design as a rather extreme failure. And I think it's one of the reasons why NVidia is so dominant with Cuda. This whole approach KFD takes was designed with the idea of extending the CPU process into the GPUs, but this idea only works for a few use cases and is not something we should apply to drivers in general. A very good example are virtualization use cases where you end up with CPU address != GPU address because the VAs are actually coming from the guest VM and not the host process. SVM is a high level concept of OpenCL, Cuda, ROCm etc.. This should not have any influence on the design of the kernel UAPI. If you want to do something similar as KFD for Xe I think you need to get explicit permission to do this from Dave and Daniel and maybe even Linus. I think the one and only one exception where an SVM uapi like in kfd makes sense, is if the _hardware_ itself, not the software stack defined semantics that you've happened to build on top of that hw, enforces a 1:1 mapping with the cpu process address space. Which means your hardware is using PASID, IOMMU based translation, PCI-ATS (address translation services) or whatever your hw calls it and has _no_ device-side pagetables on top. 
Which from what I've seen all devices with device-memory have, simply because they need some place to store whether that memory is currently in device memory or should be translated using PASID. Currently there's no gpu that works with PASID only, but there are some on-cpu-die accelerator things that do work like that. Maybe in the future there will be some accelerators that are fully cpu cache coherent (including atomics) with something like CXL, and the on-device memory is managed as normal system memory with struct page as ZONE_DEVICE and accelerator va -> physical address translation is only done with PASID ... but for now I haven't seen that, definitely not in upstream drivers. And the moment you have some per-device pagetables or per-device memory management of some sort (like using gpuva mgr) then I'm 100% agreeing with Christian that the kfd SVM model is too strict and not a great idea. That basically means, without ATS/PRI+PASID you cannot implement a unified memory programming model, where GPUs or accelerators access virtual addresses without pre-registering them with an SVM API call. Unified memo
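The "mirror" model described above — a process VA range A..B mapped into a GPU VA range C..D, per device — can be illustrated with a trivial translation helper. This is purely a sketch of the concept; the struct and function names are invented for the example and imply no kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* One per-device mirror entry: CPU range [cpu_start, cpu_start + len)
 * maps to the GPU range starting at gpu_start.  Illustrative only. */
struct va_mirror {
	uint64_t cpu_start;
	uint64_t gpu_start;
	uint64_t len;
};

/* Translate a CPU VA through the mirror; returns 1 on success and
 * stores the GPU VA, or 0 when the address is outside the range. */
static int mirror_translate(const struct va_mirror *m, uint64_t cpu_va,
			    uint64_t *gpu_va)
{
	if (cpu_va < m->cpu_start || cpu_va >= m->cpu_start + m->len)
		return 0;

	*gpu_va = m->gpu_start + (cpu_va - m->cpu_start);
	return 1;
}

/* Demo mapping with made-up addresses. */
static const struct va_mirror demo_mirror = {
	.cpu_start = 0x1000,
	.gpu_start = 0x80000000ULL,
	.len = 0x2000,
};

static uint64_t demo_gpu_va;
```

The point of the sketch is that nothing here ties the GPU range to the CPU process address space as a whole — a virtualization layer can plug guest VAs into cpu_start just as easily.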
Re: [PATCH 2/2] drm/amd: Fetch the EDID from _DDC if available for eDP
On Mon, 29 Jan 2024, Mario Limonciello wrote: > On 1/29/2024 03:39, Jani Nikula wrote: >> On Fri, 26 Jan 2024, Mario Limonciello wrote: >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c >>> index 9caba10315a8..c7e1563a46d3 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c >>> @@ -278,6 +278,11 @@ static void amdgpu_connector_get_edid(struct >>> drm_connector *connector) >>> struct amdgpu_device *adev = drm_to_adev(dev); >>> struct amdgpu_connector *amdgpu_connector = >>> to_amdgpu_connector(connector); >>> >>> + if (amdgpu_connector->edid) >>> + return; >>> + >>> + /* if the BIOS specifies the EDID via _DDC, prefer this */ >>> + amdgpu_connector->edid = amdgpu_acpi_edid(adev, connector); >> >> Imagine the EDID returned by acpi_video_get_edid() has edid->extensions >> bigger than 4. Of course it should not, but you have no guarantees, and >> it originates outside of the kernel. >> >> The real fix is to have the function return a struct drm_edid which >> tracks the allocation size separately. Unfortunately, it requires a >> bunch of changes along the way. We've mostly done it in i915, and I've >> sent a series to do this in drm/bridge [1]. Looking at it again, perhaps the ACPI code should just return a blob, and the drm code should have a helper to wrap that around struct drm_edid, so that the ACPI code does not have to depend on drm. Basic idea remains. >> Bottom line, we should stop using struct edid in drivers. They'll all >> parse the info differently, and from what I've seen, often wrong. >> >> > > Thanks for the feedback. In that case this specific change should > probably rebase on the Melissa's work > https://lore.kernel.org/amd-gfx/20240126163429.56714-1-m...@igalia.com/ > after she takes into account the feedback. 
> > Let me ask you this though - do you think that after that's done should > we let all drivers get EDID from BIOS as a priority? Or would you > prefer that this is unique to amdgpu? If the reason for having this is that the panel EDID contains some garbage, that's certainly not unique to amdgpu... :p > Something like: > > 1) If user specifies on kernel command line and puts an EDID in > /lib/firmware use that. > 2) If BIOS has EDID in _DDC and it's eDP panel, use that. I think we should also look into this. We currently don't do this, and it might help with some machines. However, gut feeling says it's probably better to keep this as a per driver decision instead of trying to bolt it into drm helpers. BR, Jani. > 3) Get panel EDID. > -- Jani Nikula, Intel
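The priority order proposed above (user-supplied firmware EDID, then ACPI _DDC for eDP panels, then the panel's own EDID) boils down to a simple selection policy. The enum and function below are hypothetical names sketching that policy, not driver code.

```c
#include <assert.h>

/* Illustrative EDID sources, in the priority order discussed. */
enum edid_source { EDID_NONE, EDID_FIRMWARE, EDID_ACPI_DDC, EDID_PANEL };

static enum edid_source pick_edid_source(int have_firmware_override,
					 int have_acpi_ddc, int is_edp,
					 int have_panel_edid)
{
	if (have_firmware_override)
		return EDID_FIRMWARE;	/* 1) user-supplied EDID wins */
	if (have_acpi_ddc && is_edp)
		return EDID_ACPI_DDC;	/* 2) BIOS _DDC, eDP only */
	if (have_panel_edid)
		return EDID_PANEL;	/* 3) fall back to the panel */
	return EDID_NONE;
}
```

Whether step 2 lives in a drm helper or stays a per-driver decision is exactly the open question in the thread.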
Re: [PATCH] drm/xe: Fix a build error
Am 27.01.24 um 16:53 schrieb Oak Zeng: This fixes a build failure on drm-tip. This issue was introduced during merge of "drm/ttm: replace busy placement with flags v6". For some reason, the xe_bo.c part of above change is not merged. Manually merge the missing part to drm_tip Mhm, I provided this as manual fixup for drm-tip in this rerere commit: commit afc5797e8c03bed3ec47a34f2bc3cf03fce24411 Author: Christian König Date: Thu Jan 25 10:44:54 2024 +0100 2024y-01m-25d-09h-44m-07s UTC: drm-tip rerere cache update git version 2.34.1 And for me compiling xe in drm-tip worked fine after that. No idea why that didn't work for you. Anyway feel free to add my rb to this patch here if it helps in any way. Regards, Christian. Signed-off-by: Oak Zeng --- drivers/gpu/drm/xe/xe_bo.c | 33 +++-- 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 686d716c5581..d6a193060cc0 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -38,22 +38,26 @@ static const struct ttm_place sys_placement_flags = { static struct ttm_placement sys_placement = { .num_placement = 1, .placement = &sys_placement_flags, - .num_busy_placement = 1, - .busy_placement = &sys_placement_flags, }; -static const struct ttm_place tt_placement_flags = { - .fpfn = 0, - .lpfn = 0, - .mem_type = XE_PL_TT, - .flags = 0, +static const struct ttm_place tt_placement_flags[] = { + { + .fpfn = 0, + .lpfn = 0, + .mem_type = XE_PL_TT, + .flags = TTM_PL_FLAG_DESIRED, + }, + { + .fpfn = 0, + .lpfn = 0, + .mem_type = XE_PL_SYSTEM, + .flags = TTM_PL_FLAG_FALLBACK, + } }; static struct ttm_placement tt_placement = { - .num_placement = 1, - .placement = &tt_placement_flags, - .num_busy_placement = 1, - .busy_placement = &sys_placement_flags, + .num_placement = 2, + .placement = tt_placement_flags, }; bool mem_type_is_vram(u32 mem_type) @@ -230,8 +234,6 @@ static int __xe_bo_placement_for_flags(struct xe_device *xe, struct xe_bo *bo, 
bo->placement = (struct ttm_placement) { .num_placement = c, .placement = bo->placements, - .num_busy_placement = c, - .busy_placement = bo->placements, }; return 0; @@ -251,7 +253,6 @@ static void xe_evict_flags(struct ttm_buffer_object *tbo, /* Don't handle scatter gather BOs */ if (tbo->type == ttm_bo_type_sg) { placement->num_placement = 0; - placement->num_busy_placement = 0; return; } @@ -1391,8 +1392,6 @@ static int __xe_bo_fixed_placement(struct xe_device *xe, bo->placement = (struct ttm_placement) { .num_placement = 1, .placement = place, - .num_busy_placement = 1, - .busy_placement = place, }; return 0; @@ -2150,9 +2149,7 @@ int xe_bo_migrate(struct xe_bo *bo, u32 mem_type) xe_place_from_ttm_type(mem_type, &requested); placement.num_placement = 1; - placement.num_busy_placement = 1; placement.placement = &requested; - placement.busy_placement = &requested; /* * Stolen needs to be handled like below VRAM handling if we ever need
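For context on what the patch above converts to: with "drm/ttm: replace busy placement with flags v6", the separate busy_placement list is replaced by per-entry flags, where TTM_PL_FLAG_DESIRED entries are tried first and TTM_PL_FLAG_FALLBACK entries only when the desired ones cannot be satisfied. The snippet below models that two-pass selection with made-up names; it is not TTM code.

```c
#include <assert.h>

#define PL_FLAG_DESIRED  (1 << 0)	/* models TTM_PL_FLAG_DESIRED */
#define PL_FLAG_FALLBACK (1 << 1)	/* models TTM_PL_FLAG_FALLBACK */

struct demo_place {
	int mem_type;
	unsigned int flags;
};

/* Mirrors the shape of tt_placement_flags[] from the patch: TT is
 * desired, SYSTEM is the fallback.  Memory type numbers are made up. */
static const struct demo_place demo_tt_places[] = {
	{ .mem_type = 1, .flags = PL_FLAG_DESIRED },	/* "TT" */
	{ .mem_type = 0, .flags = PL_FLAG_FALLBACK },	/* "SYSTEM" */
};

/* Two-pass selection: skip FALLBACK entries on the first pass, then
 * retry with only FALLBACK entries.  avail_mask has one bit per
 * memory type that currently has room. */
static int pick_placement(const struct demo_place *p, int n,
			  unsigned int avail_mask)
{
	int i;

	for (i = 0; i < n; i++)
		if (!(p[i].flags & PL_FLAG_FALLBACK) &&
		    (avail_mask & (1u << p[i].mem_type)))
			return p[i].mem_type;

	for (i = 0; i < n; i++)
		if ((p[i].flags & PL_FLAG_FALLBACK) &&
		    (avail_mask & (1u << p[i].mem_type)))
			return p[i].mem_type;

	return -1;
}
```

This is why a single two-entry placement array can replace the old pair of placement/busy_placement lists.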
Re: [PATCH] drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE
On 29.01.24 10:20, Frieder Schrempf wrote: > On 26.01.24 19:28, Dave Airlie wrote: >> Just FYI this conflicted pretty heavily with drm-misc-next changes in >> the same area, someone should check drm-tip has the correct >> resolution, I'm not really sure what it definitely should be. >> >> Dave. > > Thanks! I took a quick look at what is now in Linus' tree and it looks > correct to me. The only thing I'm missing is my Reviewed-by tag which > got lost somewhere, but I can get over that. Apparently I missed the point here. I was looking at the wrong trees (drm-next and master instead of drm-misc-next and drm-tip). Sorry for the noise. Michael already pointed out the correct details.
Re: [PATCH 2/2] drm/amd: Fetch the EDID from _DDC if available for eDP
On 1/29/2024 10:46, Jani Nikula wrote: On Mon, 29 Jan 2024, Mario Limonciello wrote: On 1/29/2024 03:39, Jani Nikula wrote: On Fri, 26 Jan 2024, Mario Limonciello wrote: diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c index 9caba10315a8..c7e1563a46d3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c @@ -278,6 +278,11 @@ static void amdgpu_connector_get_edid(struct drm_connector *connector) struct amdgpu_device *adev = drm_to_adev(dev); struct amdgpu_connector *amdgpu_connector = to_amdgpu_connector(connector); + if (amdgpu_connector->edid) + return; + + /* if the BIOS specifies the EDID via _DDC, prefer this */ + amdgpu_connector->edid = amdgpu_acpi_edid(adev, connector); Imagine the EDID returned by acpi_video_get_edid() has edid->extensions bigger than 4. Of course it should not, but you have no guarantees, and it originates outside of the kernel. The real fix is to have the function return a struct drm_edid which tracks the allocation size separately. Unfortunately, it requires a bunch of changes along the way. We've mostly done it in i915, and I've sent a series to do this in drm/bridge [1]. Looking at it again, perhaps the ACPI code should just return a blob, and the drm code should have a helper to wrap that around struct drm_edid, so that the ACPI code does not have to depend on drm. Basic idea remains. I'd ideally like to split this stuff and Melissa's rework to be independent if possible. I'll see if that's actually feasible. Bottom line, we should stop using struct edid in drivers. They'll all parse the info differently, and from what I've seen, often wrong. Thanks for the feedback. In that case this specific change should probably rebase on the Melissa's work https://lore.kernel.org/amd-gfx/20240126163429.56714-1-m...@igalia.com/ after she takes into account the feedback. 
Let me ask you this though - do you think that after that's done should we let all drivers get EDID from BIOS as a priority? Or would you prefer that this is unique to amdgpu? If the reason for having this is that the panel EDID contains some garbage, that's certainly not unique to amdgpu... :p OK; maybe a helper in DRM that wraps the ACPI code then and amdgpu will use the helper for this series. I'm also thinking it makes sense to have a new /proc/cmdline setup option to ignore the BIOS for EDID. I'm hoping that since Windows uses _DDC that BIOS will be higher quality; but you know; BIOS =) Something like: 1) If user specifies on kernel command line and puts an EDID in /lib/firmware use that. 2) If BIOS has EDID in _DDC and it's eDP panel, use that. I think we should also look into this. We currently don't do this, and it might help with some machines. However, gut feeling says it's probably better to keep this as a per driver decision instead of trying to bolt it into drm helpers. OK; I'll wire up the helper and if you want to use in the future you can too then. BR, Jani. 3) Get panel EDID.
[PATCH v6 0/6] iio: new DMABUF based API, v6
Hi Jonathan,

This is the v6 of my patchset that introduces a new interface based on DMABUF objects. The code was updated quite a bit, using the feedback on the list for this patchset but also the feedback I received on the FunctionFS patchset that I'm working on upstreaming in parallel [1], where the DMABUF handling code is very similar. See below for the full changelog.

I decided to drop the scope-based memory management for dma_buf and I hope you are OK with that. Christian wants the patch(es) to support scope-based memory management in dma-buf as a separate patchset; once it's in, I will gladly send a follow-up patch to use __free() where it makes sense.

For performance numbers, I'll point you to the cover letter for my v5 patchset [2].

This patchset was based on next-20240129.

Cheers,
-Paul

[1] https://lore.kernel.org/all/20230322092118.9213-1-p...@crapouillou.net/
[2] https://lore.kernel.org/linux-iio/219abc43b4fdd4a13b307ed2efaa0e6869e68e3f.ca...@gmail.com/T/

---
Changelog:

* [2/6]:
  - Use new prototype for axi_dmac_alloc_desc() as it changed upstream

* [3/6]:
  - Remove dead code in iio_dma_resv_lock()
  - Fix non-block actually blocking
  - Cache dma_buf_attachment instead of mapping/unmapping it for every transfer
  - Return -EINVAL instead of IIO_IOCTL_UNHANDLED for unknown ioctl
  - Make .block_enqueue() callback take a dma_fence pointer, which will be passed to iio_buffer_signal_dmabuf_done() instead of the dma_buf_attachment; and remove the backpointer from the priv structure to the dma_fence.
  - Use dma_fence_begin/end_signalling in the dma_fence critical sections
  - Unref dma_fence and dma_buf_attachment in worker, because they might try to lock the dma_resv, which would deadlock.
  - Add buffer ops to lock/unlock the queue. This is motivated by the fact that once the dma_fence has been installed, we cannot lock anything anymore - so the queue must be locked before the dma_fence is installed.
  - Use 'long retl' variable to handle the return value of dma_resv_wait_timeout()
  - Protect dmabufs list access with a mutex
  - Rework iio_buffer_find_attachment() to use the internal dmabufs list, instead of messing with dmabufs private data.
  - Add an atomically-increasing sequence number for fences

* [4/6]:
  - Update iio_dma_buffer_enqueue_dmabuf() to take a dma_fence pointer
  - Pass that dma_fence pointer along to iio_buffer_signal_dmabuf_done()
  - Add iio_dma_buffer_lock_queue() / iio_dma_buffer_unlock_queue()
  - Do not lock the queue in iio_dma_buffer_enqueue_dmabuf(). The caller will ensure that it has been locked already.
  - Replace "int += bool;" by "if (bool) int++;"
  - Use dma_fence_begin/end_signalling in the dma_fence critical sections
  - Use one "num_dmabufs" field instead of one "num_blocks" and one "num_fileio_blocks". Make it an atomic_t, which makes it possible to decrement it atomically in iio_buffer_block_release() without having to lock the queue mutex; and in turn, it means that we don't need to use iio_buffer_block_put_atomic() everywhere to avoid locking the queue mutex twice.
  - Use cleanup.h guard(mutex) when possible
  - Explicitly list all states in the switch in iio_dma_can_enqueue_block()
  - Rename iio_dma_buffer_fileio_mode() to iio_dma_buffer_can_use_fileio(), and add a comment explaining why it cannot race vs. DMABUF.

* [5/6]:
  - Populate .lock_queue / .unlock_queue callbacks
  - Switch to atomic memory allocations in .submit_queue, because of the dma_fence critical section
  - Make sure that the size of the scatterlist is enough

---
Paul Cercueil (6):
  dmaengine: Add API function dmaengine_prep_slave_dma_vec()
  dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec
  iio: core: Add new DMABUF interface infrastructure
  iio: buffer-dma: Enable support for DMABUFs
  iio: buffer-dmaengine: Support new DMABUF based userspace API
  Documentation: iio: Document high-speed DMABUF based API

 Documentation/iio/dmabuf_api.rst              |  54 ++
 Documentation/iio/index.rst                   |   2 +
 drivers/dma/dma-axi-dmac.c                    |  40 ++
 drivers/iio/buffer/industrialio-buffer-dma.c  | 181 ++-
 .../buffer/industrialio-buffer-dmaengine.c    |  58 ++-
 drivers/iio/industrialio-buffer.c             | 462 ++
 include/linux/dmaengine.h                     |  25 +
 include/linux/iio/buffer-dma.h                |  31 ++
 include/linux/iio/buffer_impl.h               |  33 ++
 include/uapi/linux/iio/buffer.h               |  22 +
 10 files changed, 891 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/iio/dmabuf_api.rst

-- 
2.43.0
[PATCH v6 2/6] dmaengine: dma-axi-dmac: Implement device_prep_slave_dma_vec
Add implementation of the .device_prep_slave_dma_vec() callback. Signed-off-by: Paul Cercueil --- v3: New patch v5: Implement .device_prep_slave_dma_vec() instead of v3's .device_prep_slave_dma_array(). v6: Use new prototype for axi_dmac_alloc_desc() as it changed upstream. --- drivers/dma/dma-axi-dmac.c | 40 ++ 1 file changed, 40 insertions(+) diff --git a/drivers/dma/dma-axi-dmac.c b/drivers/dma/dma-axi-dmac.c index 4e339c04fc1e..276856a1742d 100644 --- a/drivers/dma/dma-axi-dmac.c +++ b/drivers/dma/dma-axi-dmac.c @@ -620,6 +620,45 @@ static struct axi_dmac_sg *axi_dmac_fill_linear_sg(struct axi_dmac_chan *chan, return sg; } +static struct dma_async_tx_descriptor * +axi_dmac_prep_slave_dma_vec(struct dma_chan *c, const struct dma_vec *vecs, + size_t nb, enum dma_transfer_direction direction, + unsigned long flags) +{ + struct axi_dmac_chan *chan = to_axi_dmac_chan(c); + struct axi_dmac_desc *desc; + unsigned int num_sgs = 0; + struct axi_dmac_sg *dsg; + size_t i; + + if (direction != chan->direction) + return NULL; + + for (i = 0; i < nb; i++) + num_sgs += DIV_ROUND_UP(vecs[i].len, chan->max_length); + + desc = axi_dmac_alloc_desc(chan, num_sgs); + if (!desc) + return NULL; + + dsg = desc->sg; + + for (i = 0; i < nb; i++) { + if (!axi_dmac_check_addr(chan, vecs[i].addr) || + !axi_dmac_check_len(chan, vecs[i].len)) { + kfree(desc); + return NULL; + } + + dsg = axi_dmac_fill_linear_sg(chan, direction, vecs[i].addr, 1, + vecs[i].len, dsg); + } + + desc->cyclic = false; + + return vchan_tx_prep(&chan->vchan, &desc->vdesc, flags); +} + static struct dma_async_tx_descriptor *axi_dmac_prep_slave_sg( struct dma_chan *c, struct scatterlist *sgl, unsigned int sg_len, enum dma_transfer_direction direction, @@ -1055,6 +1094,7 @@ static int axi_dmac_probe(struct platform_device *pdev) dma_dev->device_tx_status = dma_cookie_status; dma_dev->device_issue_pending = axi_dmac_issue_pending; dma_dev->device_prep_slave_sg = axi_dmac_prep_slave_sg; + 
dma_dev->device_prep_slave_dma_vec = axi_dmac_prep_slave_dma_vec; dma_dev->device_prep_dma_cyclic = axi_dmac_prep_dma_cyclic; dma_dev->device_prep_interleaved_dma = axi_dmac_prep_interleaved; dma_dev->device_terminate_all = axi_dmac_terminate_all; -- 2.43.0
[PATCH v6 1/6] dmaengine: Add API function dmaengine_prep_slave_dma_vec()
This function can be used to initiate a scatter-gather DMA transfer, where the address and size of each segment is located in one entry of the dma_vec array. The major difference with dmaengine_prep_slave_sg() is that it supports specifying the lengths of each DMA transfer; as trying to override the length of the transfer with dmaengine_prep_slave_sg() is a very tedious process. The introduction of a new API function is also justified by the fact that scatterlists are on their way out. Note that dmaengine_prep_interleaved_dma() is not helpful either in that case, as it assumes that the address of each segment will be higher than the one of the previous segment, which we just cannot guarantee in case of a scatter-gather transfer. Signed-off-by: Paul Cercueil --- v3: New patch v5: Replace with function dmaengine_prep_slave_dma_vec(), and struct 'dma_vec'. Note that at some point we will need to support cyclic transfers using dmaengine_prep_slave_dma_vec(). Maybe with a new "flags" parameter to the function? --- include/linux/dmaengine.h | 25 + 1 file changed, 25 insertions(+) diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h index 3df70d6131c8..ee5931ddb42f 100644 --- a/include/linux/dmaengine.h +++ b/include/linux/dmaengine.h @@ -160,6 +160,16 @@ struct dma_interleaved_template { struct data_chunk sgl[]; }; +/** + * struct dma_vec - DMA vector + * @addr: Bus address of the start of the vector + * @len: Length in bytes of the DMA vector + */ +struct dma_vec { + dma_addr_t addr; + size_t len; +}; + /** * enum dma_ctrl_flags - DMA flags to augment operation preparation, * control completion, and communicate status. 
@@ -910,6 +920,10 @@ struct dma_device { struct dma_async_tx_descriptor *(*device_prep_dma_interrupt)( struct dma_chan *chan, unsigned long flags); + struct dma_async_tx_descriptor *(*device_prep_slave_dma_vec)( + struct dma_chan *chan, const struct dma_vec *vecs, + size_t nents, enum dma_transfer_direction direction, + unsigned long flags); struct dma_async_tx_descriptor *(*device_prep_slave_sg)( struct dma_chan *chan, struct scatterlist *sgl, unsigned int sg_len, enum dma_transfer_direction direction, @@ -972,6 +986,17 @@ static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_single( dir, flags, NULL); } +static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_dma_vec( + struct dma_chan *chan, const struct dma_vec *vecs, size_t nents, + enum dma_transfer_direction dir, unsigned long flags) +{ + if (!chan || !chan->device || !chan->device->device_prep_slave_dma_vec) + return NULL; + + return chan->device->device_prep_slave_dma_vec(chan, vecs, nents, + dir, flags); +} + static inline struct dma_async_tx_descriptor *dmaengine_prep_slave_sg( struct dma_chan *chan, struct scatterlist *sgl, unsigned int sg_len, enum dma_transfer_direction dir, unsigned long flags) -- 2.43.0
[PATCH v6 3/6] iio: core: Add new DMABUF interface infrastructure
Add the necessary infrastructure to the IIO core to support a new optional DMABUF based interface.

With this new interface, DMABUF objects (externally created) can be attached to an IIO buffer, and subsequently used for data transfer. A userspace application can then use this interface to share DMABUF objects between several interfaces, allowing it to transfer data in a zero-copy fashion, for instance between IIO and the USB stack. The userspace application can also memory-map the DMABUF objects, and access the sample data directly. The advantage of doing this vs. the read() interface is that it avoids an extra copy of the data between the kernel and userspace. This is particularly useful for high-speed devices which produce several megabytes or even gigabytes of data per second.

As part of the interface, 3 new IOCTLs have been added:

IIO_BUFFER_DMABUF_ATTACH_IOCTL(int fd):
 Attach the DMABUF object identified by the given file descriptor to the buffer.

IIO_BUFFER_DMABUF_DETACH_IOCTL(int fd):
 Detach the DMABUF object identified by the given file descriptor from the buffer. Note that closing the IIO buffer's file descriptor will automatically detach all previously attached DMABUF objects.

IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *):
 Request a data transfer to/from the given DMABUF object. Its file descriptor, as well as the transfer size and flags are provided in the "iio_dmabuf" structure.

These three IOCTLs have to be performed on the IIO buffer's file descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.

Signed-off-by: Paul Cercueil --- v2: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL(). v3: - Get rid of the old IOCTLs. The IIO subsystem does not create or manage DMABUFs anymore, and only attaches/detaches externally created DMABUFs. - Add IIO_BUFFER_DMABUF_CYCLIC to the supported flags.
v5: - Use dev_err() instead of pr_err() - Inline to_iio_dma_fence() - Add comment to explain why we unref twice when detaching dmabuf - Remove TODO comment. It is actually safe to free the file's private data even when transfers are still pending because it won't be accessed. - Fix documentation of new fields in struct iio_buffer_access_funcs - iio_dma_resv_lock() does not need to be exported, make it static v6: - Remove dead code in iio_dma_resv_lock() - Fix non-block actually blocking - Cache dma_buf_attachment instead of mapping/unmapping it for every transfer - Return -EINVAL instead of IIO_IOCTL_UNHANDLED for unknown ioctl - Make .block_enqueue() callback take a dma_fence pointer, which will be passed to iio_buffer_signal_dmabuf_done() instead of the dma_buf_attachment; and remove the backpointer from the priv structure to the dma_fence. - Use dma_fence_begin/end_signalling in the dma_fence critical sections - Unref dma_fence and dma_buf_attachment in worker, because they might try to lock the dma_resv, which would deadlock. - Add buffer ops to lock/unlock the queue. This is motivated by the fact that once the dma_fence has been installed, we cannot lock anything anymore - so the queue must be locked before the dma_fence is installed. - Use 'long retl' variable to handle the return value of dma_resv_wait_timeout() - Protect dmabufs list access with a mutex - Rework iio_buffer_find_attachment() to use the internal dmabufs list, instead of messing with dmabufs private data. - Add an atomically-increasing sequence number for fences --- drivers/iio/industrialio-buffer.c | 462 ++ include/linux/iio/buffer_impl.h | 33 +++ include/uapi/linux/iio/buffer.h | 22 ++ 3 files changed, 517 insertions(+) diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c index b581a7e80566..0e63a09fa90a 100644 --- a/drivers/iio/industrialio-buffer.c +++ b/drivers/iio/industrialio-buffer.c @@ -9,14 +9,19 @@ * - Better memory allocation techniques? 
* - Alternative access techniques? */ +#include #include #include #include #include +#include +#include +#include #include #include #include #include +#include #include #include @@ -28,6 +33,32 @@ #include #include +#define DMABUF_ENQUEUE_TIMEOUT_MS 5000 + +struct iio_dma_fence; + +struct iio_dmabuf_priv { + struct list_head entry; + struct kref ref; + + struct iio_buffer *buffer; + struct iio_dma_buffer_block *block; + + u64 context; + spinlock_t lock; + + struct dma_buf_attachment *attach; + struct sg_table *sgt; + enum dma_data_direction dir; + atomic_t seqno; +}; + +struct iio_dma_fence { + struct dma_fence base; + struct iio_dmabuf_priv *priv; + struct work_struct work; +}; + static const char * const iio_endian_prefix[] = { [IIO_BE] = "be", [IIO_LE] = "le", @@ -332,6 +3
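A minimal userspace view of the three IOCTLs described in the commit message: the struct layout and ioctl numbers below are copied from this series' include/uapi/linux/iio/buffer.h (spelled with stdint types instead of __u32/__u64), so treat them as belonging to the patchset rather than a released kernel; enqueue_dmabuf() is an illustrative call sequence only, with no real device assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>

/* Mirror of the uapi struct from the patchset (stdint spellings). */
struct iio_dmabuf {
	uint32_t fd;		/* DMABUF file descriptor */
	uint32_t flags;		/* 0, or IIO_BUFFER_DMABUF_CYCLIC */
	uint64_t bytes_used;	/* number of bytes to transfer */
};

#define IIO_BUFFER_DMABUF_ATTACH_IOCTL	_IOW('i', 0x92, int)
#define IIO_BUFFER_DMABUF_DETACH_IOCTL	_IOW('i', 0x93, int)
#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x94, struct iio_dmabuf)

/* Illustrative flow: attach a DMABUF to the buffer fd (itself
 * obtained via IIO_BUFFER_GET_FD_IOCTL()), then enqueue a transfer.
 * Returns -1 on ioctl failure, the raw ioctl result otherwise. */
static int enqueue_dmabuf(int buffer_fd, int dmabuf_fd, uint64_t nbytes)
{
	struct iio_dmabuf req = {
		.fd = (uint32_t)dmabuf_fd,
		.flags = 0,
		.bytes_used = nbytes,
	};

	if (ioctl(buffer_fd, IIO_BUFFER_DMABUF_ATTACH_IOCTL, &dmabuf_fd) < 0)
		return -1;

	return ioctl(buffer_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &req);
}
```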
[PATCH v6 5/6] iio: buffer-dmaengine: Support new DMABUF based userspace API
Use the functions provided by the buffer-dma core to implement the DMABUF userspace API in the buffer-dmaengine IIO buffer implementation.

Since we want to be able to transfer an arbitrary number of bytes and not necessarily the full DMABUF, the associated scatterlist is converted to an array of DMA addresses + lengths, which is then passed to dmaengine_prep_slave_dma_array().

Signed-off-by: Paul Cercueil

---
v3: Use the new dmaengine_prep_slave_dma_array(), and adapt the code to work with the new functions introduced in industrialio-buffer-dma.c.

v5: - Use the new dmaengine_prep_slave_dma_vec().
    - Restrict to input buffers, since output buffers are not yet supported by IIO buffers.

v6: - Populate .lock_queue / .unlock_queue callbacks
    - Switch to atomic memory allocations in .submit_queue, because of the dma_fence critical section
    - Make sure that the size of the scatterlist is enough
---
 .../buffer/industrialio-buffer-dmaengine.c | 58 +--
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index 45fe7d0d42ee..c4cfdb0c1231 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -64,15 +64,54 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	struct dmaengine_buffer *dmaengine_buffer =
 		iio_buffer_to_dmaengine_buffer(&queue->buffer);
 	struct dma_async_tx_descriptor *desc;
+	struct scatterlist *sgl;
+	struct dma_vec *vecs;
 	dma_cookie_t cookie;
+	size_t len_total;
+	size_t max_size;
+	unsigned int i;
+	int nents;
 
-	block->bytes_used = min(block->size, dmaengine_buffer->max_size);
-	block->bytes_used = round_down(block->bytes_used,
-			dmaengine_buffer->align);
+	if (queue->buffer.direction != IIO_BUFFER_DIRECTION_IN) {
+		/* We do not yet support output buffers.
		 */
+		return -EINVAL;
+	}
 
-	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-		block->phys_addr, block->bytes_used, DMA_DEV_TO_MEM,
-		DMA_PREP_INTERRUPT);
+	if (block->sg_table) {
+		sgl = block->sg_table->sgl;
+		nents = sg_nents_for_len(sgl, block->bytes_used);
+		if (nents < 0)
+			return nents;
+
+		vecs = kmalloc_array(nents, sizeof(*vecs), GFP_ATOMIC);
+		if (!vecs)
+			return -ENOMEM;
+
+		len_total = block->bytes_used;
+
+		for (i = 0; i < nents; i++) {
+			vecs[i].addr = sg_dma_address(sgl);
+			vecs[i].len = min(sg_dma_len(sgl), len_total);
+			len_total -= vecs[i].len;
+
+			sgl = sg_next(sgl);
+		}
+
+		desc = dmaengine_prep_slave_dma_vec(dmaengine_buffer->chan,
+						    vecs, nents, DMA_DEV_TO_MEM,
+						    DMA_PREP_INTERRUPT);
+		kfree(vecs);
+	} else {
+		max_size = min(block->size, dmaengine_buffer->max_size);
+		max_size = round_down(max_size, dmaengine_buffer->align);
+		block->bytes_used = max_size;
+
+		desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
+						   block->phys_addr,
+						   block->bytes_used,
+						   DMA_DEV_TO_MEM,
+						   DMA_PREP_INTERRUPT);
+	}
 
 	if (!desc)
 		return -ENOMEM;
@@ -120,6 +159,13 @@ static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
 	.data_available = iio_dma_buffer_data_available,
 	.release = iio_dmaengine_buffer_release,
 
+	.enqueue_dmabuf = iio_dma_buffer_enqueue_dmabuf,
+	.attach_dmabuf = iio_dma_buffer_attach_dmabuf,
+	.detach_dmabuf = iio_dma_buffer_detach_dmabuf,
+
+	.lock_queue = iio_dma_buffer_lock_queue,
+	.unlock_queue = iio_dma_buffer_unlock_queue,
+
 	.modes = INDIO_BUFFER_HARDWARE,
 	.flags = INDIO_BUFFER_FLAG_FIXED_WATERMARK,
};
-- 
2.43.0
[PATCH v6 4/6] iio: buffer-dma: Enable support for DMABUFs
Implement iio_dma_buffer_attach_dmabuf(), iio_dma_buffer_detach_dmabuf() and iio_dma_buffer_transfer_dmabuf(), which can then be used by the IIO DMA buffer implementations.

Signed-off-by: Paul Cercueil

---
v3: Update code to provide the functions that will be used as callbacks for the new IOCTLs.

v6: - Update iio_dma_buffer_enqueue_dmabuf() to take a dma_fence pointer
    - Pass that dma_fence pointer along to iio_buffer_signal_dmabuf_done()
    - Add iio_dma_buffer_lock_queue() / iio_dma_buffer_unlock_queue()
    - Do not lock the queue in iio_dma_buffer_enqueue_dmabuf(). The caller will ensure that it has been locked already.
    - Replace "int += bool;" by "if (bool) int++;"
    - Use dma_fence_begin/end_signalling in the dma_fence critical sections
    - Use one "num_dmabufs" field instead of one "num_blocks" and one "num_fileio_blocks". Make it an atomic_t, which makes it possible to decrement it atomically in iio_buffer_block_release() without having to lock the queue mutex; and in turn, it means that we don't need to use iio_buffer_block_put_atomic() everywhere to avoid locking the queue mutex twice.
    - Use cleanup.h guard(mutex) when possible
    - Explicitly list all states in the switch in iio_dma_can_enqueue_block()
    - Rename iio_dma_buffer_fileio_mode() to iio_dma_buffer_can_use_fileio(), and add a comment explaining why it cannot race vs. DMABUF.
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 181 +--
 include/linux/iio/buffer-dma.h               |  31 
 2 files changed, 201 insertions(+), 11 deletions(-)

diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 5610ba67925e..c0f539af98f9 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -4,6 +4,8 @@
  * Author: Lars-Peter Clausen
  */
+#include +#include #include #include #include
@@ -14,6 +16,8 @@ #include #include #include +#include +#include #include #include
@@ -94,13 +98,18 @@ static void iio_buffer_block_release(struct kref *kref)
 {
 	struct iio_dma_buffer_block *block = container_of(kref,
 		struct iio_dma_buffer_block, kref);
+	struct iio_dma_buffer_queue *queue = block->queue;
 
-	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
+	WARN_ON(block->fileio && block->state != IIO_BLOCK_STATE_DEAD);
 
-	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
-			  block->vaddr, block->phys_addr);
+	if (block->fileio) {
+		dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
+				  block->vaddr, block->phys_addr);
+	} else {
+		atomic_dec(&queue->num_dmabufs);
+	}
 
-	iio_buffer_put(&block->queue->buffer);
+	iio_buffer_put(&queue->buffer);
 	kfree(block);
 }
@@ -163,7 +172,7 @@ static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 }
 
 static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
-	struct iio_dma_buffer_queue *queue, size_t size)
+	struct iio_dma_buffer_queue *queue, size_t size, bool fileio)
 {
 	struct iio_dma_buffer_block *block;
@@ -171,13 +180,16 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	if (!block)
 		return NULL;
 
-	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
-		&block->phys_addr, GFP_KERNEL);
-	if (!block->vaddr) {
-		kfree(block);
-		return NULL;
+	if (fileio) {
+		block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
+			&block->phys_addr, GFP_KERNEL);
+		if (!block->vaddr) {
+			kfree(block);
+			return NULL;
+		}
 	}
 
+	block->fileio = fileio;
 	block->size = size;
 	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
@@ -186,6 +198,9 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	iio_buffer_get(&queue->buffer);
 
+	if (!fileio)
+		atomic_inc(&queue->num_dmabufs);
+
 	return block;
 }
@@ -206,13 +221,21 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 {
 	struct iio_dma_buffer_queue *queue = block->queue;
 	unsigned long flags;
+	bool cookie;
+
+	cookie = dma_fence_begin_signalling();
 
 	spin_lock_irqsave(&queue->list_lock, flags);
 	_iio_dma_buffer_block_done(block);
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
+	if (!block->fileio)
+		iio_buffer_signal_dmabuf_done(block->fence, 0);
+
 	iio_buffer_block_put_atomic(block);
 	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDN
[PATCH v6 6/6] Documentation: iio: Document high-speed DMABUF based API
Document the new DMABUF based API.

Signed-off-by: Paul Cercueil

---
v2: - Explicitly state that the new interface is optional and is not implemented by all drivers.
    - The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL.
    - Move the page up a bit in the index since it is core stuff and not driver-specific.

v3: Update the documentation to reflect the new API.

v5: Use description lists for the documentation of the three new IOCTLs instead of abusing subsections.
---
 Documentation/iio/dmabuf_api.rst | 54 
 Documentation/iio/index.rst      |  2 ++
 2 files changed, 56 insertions(+)
 create mode 100644 Documentation/iio/dmabuf_api.rst

diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst
new file mode 100644
index ..1cd6cd51a582
--- /dev/null
+++ b/Documentation/iio/dmabuf_api.rst
@@ -0,0 +1,54 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+High-speed DMABUF interface for IIO
+===================================
+
+1. Overview
+===========
+
+The Industrial I/O subsystem supports access to buffers through a
+file-based interface, with read() and write() access calls through the
+IIO device's dev node.
+
+It additionally supports a DMABUF based interface, where the userspace
+can attach DMABUF objects (externally created) to an IIO buffer, and
+subsequently use them for data transfers.
+
+A userspace application can then use this interface to share DMABUF
+objects between several interfaces, allowing it to transfer data in a
+zero-copy fashion, for instance between IIO and the USB stack.
+
+The userspace application can also memory-map the DMABUF objects, and
+access the sample data directly. The advantage of doing this vs. the
+read() interface is that it avoids an extra copy of the data between the
+kernel and userspace. This is particularly useful for high-speed devices
+which produce several megabytes or even gigabytes of data per second.
+
+It does however increase the userspace-kernelspace synchronization
+overhead, as the DMA_BUF_SYNC_START and DMA_BUF_SYNC_END IOCTLs have to
+be used for data integrity.
+
+2. User API
+===========
+
+As part of this interface, three new IOCTLs have been added. These three
+IOCTLs have to be performed on the IIO buffer's file descriptor,
+obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
+
+  ``IIO_BUFFER_DMABUF_ATTACH_IOCTL(int)``
+    Attach the DMABUF object, identified by its file descriptor, to the
+    IIO buffer. Returns zero on success, and a negative errno value on
+    error.
+
+  ``IIO_BUFFER_DMABUF_DETACH_IOCTL(int)``
+    Detach the given DMABUF object, identified by its file descriptor,
+    from the IIO buffer. Returns zero on success, and a negative errno
+    value on error.
+
+    Note that closing the IIO buffer's file descriptor will
+    automatically detach all previously attached DMABUF objects.
+
+  ``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *iio_dmabuf)``
+    Enqueue a previously attached DMABUF object to the buffer queue.
+    Enqueued DMABUFs will be read from (if output buffer) or written to
+    (if input buffer) as long as the buffer is enabled.
diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst
index 1b7292c58cd0..3eae8fcb1938 100644
--- a/Documentation/iio/index.rst
+++ b/Documentation/iio/index.rst
@@ -9,6 +9,8 @@ Industrial I/O
 
   iio_configfs
 
+   dmabuf_api
+
 ep93xx_adc
 bno055
-- 
2.43.0
[PATCH 6.7 251/346] drm: Fix TODO list mentioning non-KMS drivers
6.7-stable review patch. If anyone has any objections, please let me know.

--

From: Thomas Zimmermann

commit 9cf5ca1f485cae406968947a92bf304603999fa1 upstream.

Non-KMS drivers have been removed from DRM. Update the TODO list accordingly.

Signed-off-by: Thomas Zimmermann
Fixes: a276afc19eec ("drm: Remove some obsolete drm pciids(tdfx, mga, i810, savage, r128, sis, via)")
Cc: Cai Huoqing
Cc: Daniel Vetter
Cc: Dave Airlie
Cc: Thomas Zimmermann
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: David Airlie
Cc: Daniel Vetter
Cc: Jonathan Corbet
Cc: dri-devel@lists.freedesktop.org
Cc: # v6.3+
Cc: linux-...@vger.kernel.org
Reviewed-by: David Airlie
Reviewed-by: Daniel Vetter
Acked-by: Alex Deucher
Link: https://patchwork.freedesktop.org/patch/msgid/20231122122449.11588-3-tzimmerm...@suse.de
Signed-off-by: Greg Kroah-Hartman
---
 Documentation/gpu/todo.rst | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -337,8 +337,8 @@ connector register/unregister fixes
 
 Level: Intermediate
 
-Remove load/unload callbacks from all non-DRIVER_LEGACY drivers
+Remove load/unload callbacks
+
 The load/unload callbacks in struct &drm_driver are very much midlayers, plus
 for historical reasons they get the ordering wrong (and we can't fix that)
@@ -347,8 +347,7 @@ between setting up the &drm_driver struc
 
 - Rework drivers to no longer use the load/unload callbacks, directly coding the
   load/unload sequence into the driver's probe function.
 
-- Once all non-DRIVER_LEGACY drivers are converted, disallow the load/unload
-  callbacks for all modern drivers.
+- Once all drivers are converted, remove the load/unload callbacks.
 
 Contact: Daniel Vetter
[PATCH 6.7 253/346] drm: Disable the cursor plane on atomic contexts with virtualized drivers
6.7-stable review patch. If anyone has any objections, please let me know. -- From: Zack Rusin commit 4e3b70da64a53784683cfcbac2deda5d6e540407 upstream. Cursor planes on virtualized drivers have special meaning and require that the clients handle them in specific ways, e.g. the cursor plane should react to the mouse movement the way a mouse cursor would be expected to and the client is required to set hotspot properties on it in order for the mouse events to be routed correctly. This breaks the contract as specified by the "universal planes". Fix it by disabling the cursor planes on virtualized drivers while adding a foundation on top of which it's possible to special case mouse cursor planes for clients that want it. Disabling the cursor planes makes some kms compositors which were broken, e.g. Weston, fallback to software cursor which works fine or at least better than currently while having no effect on others, e.g. gnome-shell or kwin, which put virtualized drivers on a deny-list when running in atomic context to make them fallback to legacy kms and avoid this issue. 
Signed-off-by: Zack Rusin
Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list (v2)")
Cc: # v5.4+
Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Dave Airlie
Cc: Gerd Hoffmann
Cc: Hans de Goede
Cc: Gurchetan Singh
Cc: Chia-I Wu
Cc: dri-devel@lists.freedesktop.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
Acked-by: Pekka Paalanen
Reviewed-by: Javier Martinez Canillas
Acked-by: Simon Ser
Signed-off-by: Javier Martinez Canillas
Link: https://patchwork.freedesktop.org/patch/msgid/20231023074613.41327-2-aest...@redhat.com
Signed-off-by: Greg Kroah-Hartman
---
 drivers/gpu/drm/drm_plane.c          | 13 +
 drivers/gpu/drm/qxl/qxl_drv.c        |  2 +-
 drivers/gpu/drm/vboxvideo/vbox_drv.c |  2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.c |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |  2 +-
 include/drm/drm_drv.h                |  9 +
 include/drm/drm_file.h               | 12 
 7 files changed, 38 insertions(+), 4 deletions(-)

--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -678,6 +678,19 @@ int drm_mode_getplane_res(struct drm_dev
 		    !file_priv->universal_planes)
 			continue;
 
+		/*
+		 * If we're running on a virtualized driver then,
+		 * unless userspace advertizes support for the
+		 * virtualized cursor plane, disable cursor planes
+		 * because they'll be broken due to missing cursor
+		 * hotspot info.
+		 */
+		if (plane->type == DRM_PLANE_TYPE_CURSOR &&
+		    drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT) &&
+		    file_priv->atomic &&
+		    !file_priv->supports_virtualized_cursor_plane)
+			continue;
+
 		if (drm_lease_held(file_priv, plane->base.id)) {
 			if (count < plane_resp->count_planes &&
 			    put_user(plane->base.id, plane_ptr + count))
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -285,7 +285,7 @@ static const struct drm_ioctl_desc qxl_i
 };
 
 static struct drm_driver qxl_driver = {
-	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
+	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT,
 
 	.dumb_create = qxl_mode_dumb_create,
 	.dumb_map_offset = drm_gem_ttm_dumb_map_offset,
--- a/drivers/gpu/drm/vboxvideo/vbox_drv.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_drv.c
@@ -182,7 +182,7 @@ DEFINE_DRM_GEM_FOPS(vbox_fops);
 
 static const struct drm_driver driver = {
 	.driver_features =
-	    DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
+	    DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT,
 
 	.fops = &vbox_fops,
 	.name = DRIVER_NAME,
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -177,7 +177,7 @@ static const struct drm_driver driver =
 	 * out via drm_device::driver_features:
 	 */
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC |
-			   DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE,
+			   DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE | DRIVER_CURSOR_HOTSPOT,
 
 	.open = virtio_gpu_driver_open,
 	.postclose = virtio_gpu_driver_postclose,
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -1611,7 +1611,7 @@ static const struct file_operations vmwg
 
 static const struct drm_driver driver = {
 	.driver_features =
-	    DRIVER_MODESET | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_GEM,
+	    DRIVER_MODESET | DRIVER_RENDER | DRI
Re: [PATCH] drm/sched: Drain all entities in DRM sched run job worker
On 2024-01-29 02:44, Christian König wrote:
> Am 26.01.24 um 17:29 schrieb Matthew Brost:
>> On Fri, Jan 26, 2024 at 11:32:57AM +0100, Christian König wrote:
>>> Am 25.01.24 um 18:30 schrieb Matthew Brost:
>>>> On Thu, Jan 25, 2024 at 04:12:58PM +0100, Christian König wrote:
>>>>> Am 24.01.24 um 22:08 schrieb Matthew Brost:
>>>>>> All entities must be drained in the DRM scheduler run job worker to
>>>>>> avoid the following case. An entity found that is ready, no job found
>>>>>> ready on entity, and run job worker goes idle with other entities + jobs
>>>>>> ready. Draining all ready entities (i.e. loop over all ready entities)
>>>>>> in the run job worker ensures all jobs that are ready will be scheduled.
>>>>> That doesn't make sense. drm_sched_select_entity() only returns entities
>>>>> which are "ready", e.g. have a job to run.
>>>> That is what I thought too, hence my original design, but it is not
>>>> exactly true. Let me explain.
>>>>
>>>> drm_sched_select_entity() returns an entity with a non-empty spsc queue
>>>> (job in queue) and no *current* waiting dependencies [1]. Dependencies
>>>> for an entity can be added when drm_sched_entity_pop_job() is called
>>>> [2][3], returning a NULL job. Thus we can get into a scenario where two
>>>> entities A and B both have jobs and no current dependencies. A's job is
>>>> waiting on B's job, entity A gets selected first, a dependency gets
>>>> installed in drm_sched_entity_pop_job(), run work goes idle, and now we
>>>> deadlock.
>>> And here is the real problem. run work doesn't go idle in that moment.
>>>
>>> drm_sched_run_job_work() should restart itself until there is either no
>>> more space in the ring buffer or it can't find a ready entity any more.
>>>
>>> At least that was the original design when that was all still driven by
>>> a kthread.
>>>
>>> It can perfectly be that we messed this up when switching from kthread
>>> to a work item.
>>>
>> Right, that's what this patch does - the run worker does not go idle
>> until no ready entities are found.
>> That was incorrect in the original patch and fixed here. Do you have
>> any issues with this fix? It has been tested three times and clearly
>> fixes the issue.
>
> Ah! Yes, in this case that patch here is a little bit ugly as well.
>
> The original idea was that run_job restarts so that we are able to pause
> the submission thread without searching for an entity to submit more.
>
> I strongly suggest to replace the while loop with a call to
> drm_sched_run_job_queue() so that when the entity can't provide a job we
> just restart the queuing work.

I agree with Christian. This more closely preserves the original design of the GPU schedulers, so we should go with that.
-- 
Regards,
Luben