Re: [PATCH v6 32/32] drm/doc: gpusvm: Add GPU SVM documentation

2025-02-27 Thread Alistair Popple
On Mon, Feb 24, 2025 at 08:43:11PM -0800, Matthew Brost wrote:
> Add documentation for agree upon GPU SVM design principles, current
> status, and future plans.

Thanks for writing this up. In general I didn't see anything too controversial
but added a couple of comments below.

> 
> v4:
>  - Address Thomas's feedback
> v5:
>  - s/Current/Basline (Thomas)
> 
> Signed-off-by: Matthew Brost 
> Reviewed-by: Thomas Hellström 
> ---
>  Documentation/gpu/rfc/gpusvm.rst | 84 
>  Documentation/gpu/rfc/index.rst  |  4 ++
>  2 files changed, 88 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/gpusvm.rst
> 
> diff --git a/Documentation/gpu/rfc/gpusvm.rst 
> b/Documentation/gpu/rfc/gpusvm.rst
> new file mode 100644
> index ..063412160685
> --- /dev/null
> +++ b/Documentation/gpu/rfc/gpusvm.rst
> @@ -0,0 +1,84 @@
> +===
> +GPU SVM Section
> +===
> +
> +Agreed upon design principles
> +=

As a general comment I think it would be nice if we could add some rationale/
reasons for these design principles. Things inevitably change and if/when
we need to violate or update these principles it would be good to have some
documented rationale for why we decided on them in the first place because the
reasoning may have become invalid by then.

> +* migrate_to_ram path
> + * Rely only on core MM concepts (migration PTEs, page references, and
> +   page locking).
> + * No driver specific locks other than locks for hardware interaction in
> +   this path. These are not required and generally a bad idea to
> +   invent driver defined locks to seal core MM races.

In principle I agree. The problem I think you will run into is the analogue of
what adding a trylock_page() to do_swap_page() fixes. Which is that a concurrent
GPU fault (which is highly likely after handling a CPU fault due to the GPU PTEs
becoming invalid) may, depending on your design, kick off a migration of the
page to the GPU via migrate_vma_setup().

The problem with that is migrate_vma_setup() will temporarily raise the folio
refcount, which can cause the migrate_to_ram() callback to fail but the elevated
refcount from migrate_to_ram() can also cause the GPU migration to fail thus
leading to a live-lock when both CPU and GPU fault handlers just keep retrying.

This was particularly problematic for us on multi-GPU setups, and our solution
was to introduce a migration critical section in the form of a mutex to ensure
only one thread was calling migrate_vma_setup() at a time.
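
To make the critical section concrete, here is a rough userspace sketch (plain
C with pthreads, not UVM or kernel code). The names fake_page, migrate_lock,
migrate_once() and run_two_faulters() are made up for illustration, and the
"sole holder" check only loosely models the refcount check that makes a
migration attempt fail. With the mutex held around the whole attempt, neither
thread ever observes the other's transient reference, so both make progress:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device page; not a kernel structure. */
struct fake_page {
	int transient_refs;	/* references held by in-flight migration attempts */
	int migrations;		/* number of attempts that succeeded */
};

struct faulter_args {
	struct fake_page *page;
	int iterations;
	int failures;
};

/* The migration critical section: one attempt at a time. */
static pthread_mutex_t migrate_lock = PTHREAD_MUTEX_INITIALIZER;

/* One migration attempt; it succeeds only if nobody else holds a reference. */
static bool migrate_once(struct fake_page *page)
{
	bool sole_holder;

	pthread_mutex_lock(&migrate_lock);
	page->transient_refs++;			/* reference taken during setup */
	sole_holder = (page->transient_refs == 1);
	if (sole_holder)
		page->migrations++;
	page->transient_refs--;			/* reference dropped afterwards */
	pthread_mutex_unlock(&migrate_lock);

	return sole_holder;
}

/* Models a fault handler that keeps attempting migrations. */
static void *faulter(void *opaque)
{
	struct faulter_args *args = opaque;

	for (int i = 0; i < args->iterations; i++)
		if (!migrate_once(args->page))
			args->failures++;
	return NULL;
}

/* Runs a "CPU" and a "GPU" faulter concurrently; returns failed attempts. */
static int run_two_faulters(int iterations)
{
	struct fake_page page = { 0, 0 };
	struct faulter_args cpu = { &page, iterations, 0 };
	struct faulter_args gpu = { &page, iterations, 0 };
	pthread_t cpu_thread, gpu_thread;

	if (pthread_create(&cpu_thread, NULL, faulter, &cpu))
		return -1;
	if (pthread_create(&gpu_thread, NULL, faulter, &gpu)) {
		pthread_join(cpu_thread, NULL);
		return -1;
	}
	pthread_join(cpu_thread, NULL);
	pthread_join(gpu_thread, NULL);

	/* Every attempt either migrated the page or was counted as a failure. */
	assert(page.migrations + cpu.failures + gpu.failures == 2 * iterations);
	return cpu.failures + gpu.failures;
}
```

Build with -pthread. Because every attempt runs under migrate_lock,
run_two_faulters() reports no failed attempts in this toy model.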

And now that I've looked at UVM development history, and remembered more
context, this is why I had a vague recollection that adding a migration entry
in do_swap_page() would be better than taking a page lock. Doing so fixes the
issue with concurrent GPU faults blocking migrate_to_ram() because it makes
migrate_vma_setup() ignore the page.

> + * Partial migration is supported (i.e., a subset of pages attempting to
> +   migrate can actually migrate, with only the faulting page guaranteed
> +   to migrate).
> + * Driver handles mixed migrations via retry loops rather than locking.
>
> +* Eviction

This is a term that seems to be somewhat overloaded depending on context so a
definition would be nice. Is your view of eviction migrating data from GPU back
to CPU without a virtual address to free up GPU memory? (that's what I think of,
but would be good to make sure we're in sync).

> + * Only looking at physical memory data structures and locks as opposed 
> to
> +   looking at virtual memory data structures and locks.
> + * No looking at mm/vma structs or relying on those being locked.

Agree with the above points.

> +* GPU fault side
> + * mmap_read only used around core MM functions which require this lock
> +   and should strive to take mmap_read lock only in GPU SVM layer.
> + * Big retry loop to handle all races with the mmu notifier under the gpu
> +   pagetable locks/mmu notifier range lock/whatever we end up calling
> +  those.

Again, one of the issues here (particularly with multi-GPU setups) is that it's
very easy to live-lock with retry loops because even attempting a migration that
fails can cause migration/fault handling in other threads to fail, either by
calling mmu_notifiers or taking a page reference.

Those are probably things that we should fix on the MM side, but for now UVM at
least uses a lock to ensure forward progress.

> + * Races (especially against concurrent eviction or migrate_to_ram)
> +   should not be handled on the fault side by trying to hold locks;
> +   rather, they should be handled using retry loops. One possible
> +   exception is holding a BO's dma-resv lock during the initial migration
> +   to VRAM, as this is a well-defined lock that can be taken underneath
> +   the mmap_read lock.

See my earlier comments. Although note I agree with this in principle, and we do
jus

[PATCH 3/4] drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8953

2025-02-27 Thread Dmitry Baryshkov
The MSM8953 platform doesn't have DSC blocks, nor does it have DSC
registers in the PINGPONG block. Drop the DPU_PINGPONG_DSC feature bit
from the PINGPONG's feature mask and, as it is the only remaining bit,
drop the .features assignment completely.

Fixes: 7a6109ce1c2c ("drm/msm/dpu: Add support for MSM8953")
Reported-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h
index 14f36ea6ad0eb61e87f043437a8cd78bb1bde49c..04f2021a7bef1bdefee77ab34074c06713f80487 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h
@@ -100,14 +100,12 @@ static const struct dpu_pingpong_cfg msm8953_pp[] = {
{
.name = "pingpong_0", .id = PINGPONG_0,
.base = 0x7, .len = 0xd4,
-   .features = PINGPONG_MSM8996_MASK,
.sblk = &msm8996_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12),
}, {
.name = "pingpong_1", .id = PINGPONG_1,
.base = 0x70800, .len = 0xd4,
-   .features = PINGPONG_MSM8996_MASK,
.sblk = &msm8996_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13),

-- 
2.39.5



[PATCH 2/4] drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8917

2025-02-27 Thread Dmitry Baryshkov
The MSM8917 platform doesn't have DSC blocks, nor does it have DSC
registers in the PINGPONG block. Drop the DPU_PINGPONG_DSC feature bit
from the PINGPONG's feature mask and, as it is the only remaining bit,
drop the .features assignment completely.

Fixes: 62af6e1cb596 ("drm/msm/dpu: Add support for MSM8917")
Reported-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h
index 6bdaecca676144f9162ab1839d99f3e2e3386dc7..6f2c40b303e2b017fc3f913563a1a251779a9124 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h
@@ -93,7 +93,6 @@ static const struct dpu_pingpong_cfg msm8917_pp[] = {
{
.name = "pingpong_0", .id = PINGPONG_0,
.base = 0x7, .len = 0xd4,
-   .features = PINGPONG_MSM8996_MASK,
.sblk = &msm8996_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12),

-- 
2.39.5



Re: [PATCH v2] drm/ci: fix merge request rules

2025-02-27 Thread Dmitry Baryshkov
On Thu, Feb 27, 2025 at 09:50:50AM +0530, Vignesh Raman wrote:
> Merge request pipelines were only created when changes
> were made to drivers/gpu/drm/ci/, causing MRs that
> didn't touch this path to break. Fix MR pipeline rules
> to trigger jobs for all changes.
> 
> Run jobs automatically for marge-bot and scheduled
> pipelines, but in all other cases run manually. Also
> remove CI_PROJECT_NAMESPACE checks specific to mesa.
> 
> Fixes: df54f04f2020 ("drm/ci: update gitlab rules")
> Signed-off-by: Vignesh Raman 
> ---
> 
> v2:
>   - Run jobs automatically for marge-bot and scheduled
> pipelines, but in all other cases run manually. Also
> remove CI_PROJECT_NAMESPACE checks specific to mesa.
> 
> ---
>  drivers/gpu/drm/ci/gitlab-ci.yml | 21 +
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ci/gitlab-ci.yml 
> b/drivers/gpu/drm/ci/gitlab-ci.yml
> index f04aabe8327c..f4e324e156db 100644
> --- a/drivers/gpu/drm/ci/gitlab-ci.yml
> +++ b/drivers/gpu/drm/ci/gitlab-ci.yml
> @@ -143,11 +143,11 @@ stages:
>  # Pre-merge pipeline
>  - if: &is-pre-merge $CI_PIPELINE_SOURCE == "merge_request_event"
>  # Push to a branch on a fork
> -- if: &is-fork-push $CI_PROJECT_NAMESPACE != "mesa" && 
> $CI_PIPELINE_SOURCE == "push"
> +- if: &is-fork-push $CI_PIPELINE_SOURCE == "push"
>  # nightly pipeline
>  - if: &is-scheduled-pipeline $CI_PIPELINE_SOURCE == "schedule"
>  # pipeline for direct pushes that bypassed the CI
> -- if: &is-direct-push $CI_PROJECT_NAMESPACE == "mesa" && 
> $CI_PIPELINE_SOURCE == "push" && $GITLAB_USER_LOGIN != "marge-bot"
> +- if: &is-direct-push $CI_PIPELINE_SOURCE == "push" && 
> $GITLAB_USER_LOGIN != "marge-bot"
>  
>  
>  # Rules applied to every job in the pipeline
> @@ -170,26 +170,15 @@ stages:
>  - !reference [.disable-farm-mr-rules, rules]
>  # Never run immediately after merging, as we just ran everything
>  - !reference [.never-post-merge-rules, rules]
> -# Build everything in merge pipelines, if any files affecting the 
> pipeline
> -# were changed
> +# Build everything in merge pipelines
>  - if: *is-merge-attempt
> -  changes: &all_paths
> -  - drivers/gpu/drm/ci/**/*
>when: on_success
>  # Same as above, but for pre-merge pipelines
>  - if: *is-pre-merge
> -  changes:
> -*all_paths
> -  when: manual
> -# Skip everything for pre-merge and merge pipelines which don't change
> -# anything in the build
> -- if: *is-merge-attempt
> -  when: never
> -- if: *is-pre-merge
> -  when: never
> +- when: manual

I believe there should be no dash on this line

>  # Build everything after someone bypassed the CI
>  - if: *is-direct-push
> -  when: on_success
> +- when: manual

And on this line too.

>  # Build everything in scheduled pipelines
>  - if: *is-scheduled-pipeline
>when: on_success
> -- 
> 2.47.2
> 

-- 
With best wishes
Dmitry


[PATCH v3 2/2] drm/panthor: Avoid sleep locking in the internal BO size path

2025-02-27 Thread Adrián Larumbe
Commit 434e5ca5b5d7 ("drm/panthor: Expose size of driver internal BO's over
fdinfo") locks the VMS xarray, to avoid UAF errors when the same VM is
being concurrently destroyed by another thread. However, that puts the
current thread in atomic context, which means taking the VMS' heap locks
will trigger a warning as the thread is no longer allowed to sleep.

Because in this case replacing the heap mutex with a spinlock isn't
feasible, the fdinfo handler no longer traverses the list of heaps for
every single VM associated with an open DRM file. Instead, when a new heap
chunk is allocated, its size is accumulated into a pool-wide tally, which
also makes the atomic context code path somewhat faster.
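
As a rough userspace illustration of the tallying scheme (C11 atomics, not the
driver code; pool_model and the helper names are hypothetical), the reader
needs only a single atomic load and never takes a sleeping lock:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a heap pool; only the running total matters. */
struct pool_model {
	atomic_size_t size;
};

/* Called when a chunk is allocated: add its size to the pool-wide tally. */
static void pool_chunk_added(struct pool_model *pool, size_t chunk_size)
{
	atomic_fetch_add(&pool->size, chunk_size);
}

/* Called when a chunk is freed: subtract its size again. */
static void pool_chunk_freed(struct pool_model *pool, size_t chunk_size)
{
	atomic_fetch_sub(&pool->size, chunk_size);
}

/* Reader side: one atomic load, usable where sleeping is not allowed. */
static size_t pool_total_size(struct pool_model *pool)
{
	return atomic_load(&pool->size);
}

/* Small scenario: two 4 KiB chunks are allocated, then one is freed. */
static size_t demo_tally(void)
{
	struct pool_model pool;

	atomic_init(&pool.size, 0);
	pool_chunk_added(&pool, 4096);
	pool_chunk_added(&pool, 4096);
	assert(pool_total_size(&pool) == 8192);
	pool_chunk_freed(&pool, 4096);
	return pool_total_size(&pool);
}
```

The design trade-off is that the total is kept up to date at allocation and
free time, so the query path does no list traversal at all.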

Signed-off-by: Adrián Larumbe 
Fixes: 3e2c8c718567 ("drm/panthor: Expose size of driver internal BO's over fdinfo")
---
 drivers/gpu/drm/panthor/panthor_heap.c | 59 +-
 drivers/gpu/drm/panthor/panthor_mmu.c  |  8 +---
 2 files changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index db0285ce5812..7cd05c7ee342 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -97,6 +97,9 @@ struct panthor_heap_pool {
 
/** @gpu_contexts: Buffer object containing the GPU heap contexts. */
struct panthor_kernel_bo *gpu_contexts;
+
+   /** @size: Size of all chunks across all heaps in the pool. */
+   atomic_t size;
 };
 
 static int panthor_heap_ctx_stride(struct panthor_device *ptdev)
@@ -118,7 +121,7 @@ static void *panthor_get_heap_ctx(struct panthor_heap_pool *pool, int id)
   panthor_get_heap_ctx_offset(pool, id);
 }
 
-static void panthor_free_heap_chunk(struct panthor_vm *vm,
+static void panthor_free_heap_chunk(struct panthor_heap_pool *pool,
struct panthor_heap *heap,
struct panthor_heap_chunk *chunk)
 {
@@ -127,11 +130,13 @@ static void panthor_free_heap_chunk(struct panthor_vm *vm,
heap->chunk_count--;
mutex_unlock(&heap->lock);
 
+   atomic_sub(heap->chunk_size, &pool->size);
+
panthor_kernel_bo_destroy(chunk->bo);
kfree(chunk);
 }
 
-static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
+static int panthor_alloc_heap_chunk(struct panthor_heap_pool *pool,
struct panthor_vm *vm,
struct panthor_heap *heap,
bool initial_chunk)
@@ -144,7 +149,7 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
if (!chunk)
return -ENOMEM;
 
-   chunk->bo = panthor_kernel_bo_create(ptdev, vm, heap->chunk_size,
+   chunk->bo = panthor_kernel_bo_create(pool->ptdev, vm, heap->chunk_size,
 DRM_PANTHOR_BO_NO_MMAP,
 DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
 PANTHOR_VM_KERNEL_AUTO_VA);
@@ -180,6 +185,8 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
heap->chunk_count++;
mutex_unlock(&heap->lock);
 
+   atomic_add(heap->chunk_size, &pool->size);
+
return 0;
 
 err_destroy_bo:
@@ -191,16 +198,16 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
return ret;
 }
 
-static void panthor_free_heap_chunks(struct panthor_vm *vm,
+static void panthor_free_heap_chunks(struct panthor_heap_pool *pool,
 struct panthor_heap *heap)
 {
struct panthor_heap_chunk *chunk, *tmp;
 
list_for_each_entry_safe(chunk, tmp, &heap->chunks, node)
-   panthor_free_heap_chunk(vm, heap, chunk);
+   panthor_free_heap_chunk(pool, heap, chunk);
 }
 
-static int panthor_alloc_heap_chunks(struct panthor_device *ptdev,
+static int panthor_alloc_heap_chunks(struct panthor_heap_pool *pool,
 struct panthor_vm *vm,
 struct panthor_heap *heap,
 u32 chunk_count)
@@ -209,7 +216,7 @@ static int panthor_alloc_heap_chunks(struct panthor_device *ptdev,
u32 i;
 
for (i = 0; i < chunk_count; i++) {
-   ret = panthor_alloc_heap_chunk(ptdev, vm, heap, true);
+   ret = panthor_alloc_heap_chunk(pool, vm, heap, true);
if (ret)
return ret;
}
@@ -226,7 +233,7 @@ panthor_heap_destroy_locked(struct panthor_heap_pool *pool, u32 handle)
if (!heap)
return -EINVAL;
 
-   panthor_free_heap_chunks(pool->vm, heap);
+   panthor_free_heap_chunks(pool, heap);
mutex_destroy(&heap->lock);
kfree(heap);
return 0;
@@ -308,7 +315,7 @@ int panthor_heap_create(struct panthor_heap_pool *pool,
heap->max_chunks = max_chunks;
  

Re: [PATCH 7/7] drm/msm/dpu: remove DPU_CTL_SPLIT_DISPLAY from CTL blocks on DPU >= 5.0

2025-02-27 Thread Dmitry Baryshkov
On Fri, Feb 21, 2025 at 12:37:40AM +0100, Marijn Suijten wrote:
> On 2025-02-20 12:26:24, Dmitry Baryshkov wrote:
> > Since DPU 5.0 CTL blocks do not require DPU_CTL_SPLIT_DISPLAY, as single
> > CTL is used for both interfaces. As both RM and encoder now handle
> > active CTLs, drop that feature bit.
> 
> I was wondering if this bit only existed to ensure the right "pair" of CTLs
> exist: not on DPU 4.0, but on DPU 3.0 we see that CTL_0 and CTL_2 have this 
> bit
> but not CTL_1.  Meaning that split display can only work when that specific 
> pair
> of CTL_0 and CTL_2 is used in conjunction?

Unfortunately I don't have a deep knowledge of those platforms and I
don't have a way to test it. My SDM660 board (IFC6560) doesn't have DSI1
routed anywhere.

> 
> > 
> > Signed-off-by: Dmitry Baryshkov 
> 
> Reviewed-by: Marijn Suijten 
> 

-- 
With best wishes
Dmitry


[PATCH v2 3/8] drm/msm/dpu: pass master interface to CTL configuration

2025-02-27 Thread Dmitry Baryshkov
Active controls require setup of the master interface. Pass the selected
interface to CTL configuration.

Reviewed-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
index da9994a79ca293ec0265680c438835742102db2a..a0ba55ab3c894c200225fe48ec6214ae4135d059 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c
@@ -60,6 +60,8 @@ static void _dpu_encoder_phys_cmd_update_intf_cfg(
return;
 
intf_cfg.intf = phys_enc->hw_intf->idx;
+   if (phys_enc->split_role == ENC_ROLE_MASTER)
+   intf_cfg.intf_master = phys_enc->hw_intf->idx;
intf_cfg.intf_mode_sel = DPU_CTL_MODE_SEL_CMD;
intf_cfg.stream_sel = cmd_enc->stream_sel;
intf_cfg.mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index abd6600046cb3a91bf88ca240fd9b9c306b0ea2e..232055473ba55998b79dd2e8c752c129bbffbff4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -298,6 +298,8 @@ static void dpu_encoder_phys_vid_setup_timing_engine(
if (phys_enc->hw_cdm)
intf_cfg.cdm = phys_enc->hw_cdm->idx;
intf_cfg.intf = phys_enc->hw_intf->idx;
+   if (phys_enc->split_role == ENC_ROLE_MASTER)
+   intf_cfg.intf_master = phys_enc->hw_intf->idx;
intf_cfg.intf_mode_sel = DPU_CTL_MODE_SEL_VID;
intf_cfg.stream_sel = 0; /* Don't care value for video mode */
intf_cfg.mode_3d = dpu_encoder_helper_get_3d_blend_mode(phys_enc);

-- 
2.39.5



[PATCH v2 6/8] drm/msm/dpu: allocate single CTL for DPU >= 5.0

2025-02-27 Thread Dmitry Baryshkov
Unlike previous generations, since DPU 5.0 it is possible to use just one
CTL to handle all INTF and WB blocks for a single output. And one has to
use a single CTL to support the bonded DSI config. Allocate a single CTL
for these DPU versions.

Reviewed-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 17 +
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h |  2 ++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 5baf9df702b84b74ba00e703ad3cc12afb0e94a4..4dbc9bc7eb4f151f83055220665ee5fd238ae7ba 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -53,6 +53,8 @@ int dpu_rm_init(struct drm_device *dev,
/* Clear, setup lists */
memset(rm, 0, sizeof(*rm));
 
+   rm->has_legacy_ctls = (cat->mdss_ver->core_major_ver < 5);
+
/* Interrogate HW catalog and create tracking items for hw blocks */
for (i = 0; i < cat->mixer_count; i++) {
struct dpu_hw_mixer *hw;
@@ -381,10 +383,16 @@ static int _dpu_rm_reserve_ctls(
int i = 0, j, num_ctls;
bool needs_split_display;
 
-   /* each hw_intf needs its own hw_ctrl to program its control path */
-   num_ctls = top->num_intf;
+   if (rm->has_legacy_ctls) {
+   /* each hw_intf needs its own hw_ctrl to program its control path */
+   num_ctls = top->num_intf;
 
-   needs_split_display = _dpu_rm_needs_split_display(top);
+   needs_split_display = _dpu_rm_needs_split_display(top);
+   } else {
+   /* use single CTL */
+   num_ctls = 1;
+   needs_split_display = false;
+   }
 
for (j = 0; j < ARRAY_SIZE(rm->ctl_blks); j++) {
const struct dpu_hw_ctl *ctl;
@@ -402,7 +410,8 @@ static int _dpu_rm_reserve_ctls(
 
DPU_DEBUG("ctl %d caps 0x%lX\n", j + CTL_0, features);
 
-   if (needs_split_display != has_split_display)
+   if (rm->has_legacy_ctls &&
+   needs_split_display != has_split_display)
continue;
 
ctl_idx[i] = j;
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
index 99bd594ee0d1995eca5a1f661b15e24fdf6acf39..130f753c36338544e84a305b266c3b47fa028d84 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h
@@ -24,6 +24,7 @@ struct dpu_global_state;
  * @dspp_blks: array of dspp hardware resources
  * @hw_sspp: array of sspp hardware resources
  * @cdm_blk: cdm hardware resource
+ * @has_legacy_ctls: DPU uses pre-ACTIVE CTL blocks.
  */
 struct dpu_rm {
struct dpu_hw_blk *pingpong_blks[PINGPONG_MAX - PINGPONG_0];
@@ -37,6 +38,7 @@ struct dpu_rm {
struct dpu_hw_blk *dsc_blks[DSC_MAX - DSC_0];
struct dpu_hw_sspp *hw_sspp[SSPP_MAX - SSPP_NONE];
struct dpu_hw_blk *cdm_blk;
+   bool has_legacy_ctls;
 };
 
 struct dpu_rm_sspp_requirements {

-- 
2.39.5



[PATCH v2 4/8] drm/msm/dpu: use single CTL if it is the only CTL returned by RM

2025-02-27 Thread Dmitry Baryshkov
On DPU >= 5.0 CTL blocks were reworked in order to support using a
single CTL for all outputs. In preparation for reworking the RM code to
return a single CTL, make sure that dpu_encoder can cope with that.
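
A standalone sketch of the selection rule described above (plain C;
pick_ctl_index() and its integer arguments are illustrative stand-ins, the
driver itself works on hw_ctl pointers):

```c
#include <assert.h>

/*
 * Returns the index of the CTL a physical encoder should use, or -1 if
 * none is available. num_ctl is the number of CTLs handed back by the
 * resource manager, phys_idx the index of the physical encoder.
 */
static int pick_ctl_index(int num_ctl, int phys_idx)
{
	assert(phys_idx >= 0);

	/* A single CTL is shared by all physical encoders. */
	if (num_ctl == 1)
		return 0;

	/* Otherwise each physical encoder needs a CTL of its own. */
	return phys_idx < num_ctl ? phys_idx : -1;
}
```

With one CTL, both physical encoders (indices 0 and 1) map to index 0. With
two CTLs, a third encoder gets -1, which corresponds to the "no ctl block
assigned" error path.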

Reviewed-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 32992e9525530ea4dec2f46643fc06d40d3bca7b..e7dad94d91a7b6e99adb9aadb48aa8cd164babfa 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -1288,7 +1288,11 @@ static void dpu_encoder_virt_atomic_mode_set(struct drm_encoder *drm_enc,
return;
}
 
-   phys->hw_ctl = i < num_ctl ? to_dpu_hw_ctl(hw_ctl[i]) : NULL;
+   /* Use first (and only) CTL if active CTLs are supported */
+   if (num_ctl == 1)
+   phys->hw_ctl = to_dpu_hw_ctl(hw_ctl[0]);
+   else
+   phys->hw_ctl = i < num_ctl ? to_dpu_hw_ctl(hw_ctl[i]) : NULL;
if (!phys->hw_ctl) {
DPU_ERROR_ENC(dpu_enc,
"no ctl block assigned at idx: %d\n", i);

-- 
2.39.5



[PATCH v2 1/8] drm/msm/dpu: don't overwrite CTL_MERGE_3D_ACTIVE register

2025-02-27 Thread Dmitry Baryshkov
In case of complex pipelines (e.g. the forthcoming quad-pipe) the DPU
might use more than one MERGE_3D block for a single output.  Follow the
pattern used for the other *_ACTIVE registers and extend the
CTL_MERGE_3D_ACTIVE register instead of simply writing a new value there.
Currently at most one MERGE_3D block is being used, so this has no impact
on existing targets.
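
The difference between the two approaches can be shown with a small userspace
model (plain C, no hardware access; the reg variable merely stands in for
CTL_MERGE_3D_ACTIVE and the function names are made up for this sketch):

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Old behaviour: each call replaces whatever was selected before. */
static void merge_3d_overwrite(uint32_t *reg, unsigned int idx)
{
	assert(idx < 32);
	*reg = BIT(idx);
}

/* New behaviour: keep the current value and OR in the new block. */
static void merge_3d_accumulate(uint32_t *reg, unsigned int idx)
{
	assert(idx < 32);
	*reg |= BIT(idx);
}

/* Enable blocks 0 and 1 in turn and return the resulting mask. */
static uint32_t demo_two_blocks(void (*select)(uint32_t *, unsigned int))
{
	uint32_t reg = 0;	/* models CTL_MERGE_3D_ACTIVE */

	select(&reg, 0);
	select(&reg, 1);
	return reg;
}
```

In this model, overwriting leaves only the last block selected (mask 0x2),
while accumulating keeps both (mask 0x3). With a single block in use the two
variants give the same result.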

Reviewed-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 4893f10d6a5832521808c0f4d8b231c356dbdc41..32ab33b314fc44e12ccb935c1695d2eea5c7d9b2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -548,6 +548,7 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
u32 dsc_active = 0;
u32 wb_active = 0;
u32 mode_sel = 0;
+   u32 merge_3d_active = 0;
 
/* CTL_TOP[31:28] carries group_id to collate CTL paths
 * per VM. Explicitly disable it until VM support is
@@ -562,6 +563,7 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
intf_active = DPU_REG_READ(c, CTL_INTF_ACTIVE);
wb_active = DPU_REG_READ(c, CTL_WB_ACTIVE);
dsc_active = DPU_REG_READ(c, CTL_DSC_ACTIVE);
+   merge_3d_active = DPU_REG_READ(c, CTL_MERGE_3D_ACTIVE);
 
if (cfg->intf)
intf_active |= BIT(cfg->intf - INTF_0);
@@ -572,14 +574,14 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
if (cfg->dsc)
dsc_active |= cfg->dsc;
 
+   if (cfg->merge_3d)
+   merge_3d_active |= BIT(cfg->merge_3d - MERGE_3D_0);
+
DPU_REG_WRITE(c, CTL_TOP, mode_sel);
DPU_REG_WRITE(c, CTL_INTF_ACTIVE, intf_active);
DPU_REG_WRITE(c, CTL_WB_ACTIVE, wb_active);
DPU_REG_WRITE(c, CTL_DSC_ACTIVE, dsc_active);
-
-   if (cfg->merge_3d)
-   DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE,
- BIT(cfg->merge_3d - MERGE_3D_0));
+   DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE, merge_3d_active);
 
if (cfg->cdm)
DPU_REG_WRITE(c, CTL_CDM_ACTIVE, cfg->cdm);

-- 
2.39.5



[PATCH v2 5/8] drm/msm/dpu: don't select single flush for active CTL blocks

2025-02-27 Thread Dmitry Baryshkov
In case of ACTIVE CTLs, a single CTL is being used for flushing all INTF
blocks. Don't skip programming the CTL on those targets.

Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
index 232055473ba55998b79dd2e8c752c129bbffbff4..8a618841e3ea89acfe4a42d48319a6c54a1b3495 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c
@@ -374,7 +374,8 @@ static void dpu_encoder_phys_vid_underrun_irq(void *arg)
 static bool dpu_encoder_phys_vid_needs_single_flush(
struct dpu_encoder_phys *phys_enc)
 {
-   return phys_enc->split_role != ENC_ROLE_SOLO;
+   return !(phys_enc->hw_ctl->caps->features & BIT(DPU_CTL_ACTIVE_CFG)) &&
+   phys_enc->split_role != ENC_ROLE_SOLO;
 }
 
 static void dpu_encoder_phys_vid_atomic_mode_set(

-- 
2.39.5



[PATCH v2 8/8] drm/msm/dpu: drop now-unused condition for has_legacy_ctls

2025-02-27 Thread Dmitry Baryshkov
Now that we have dropped DPU_CTL_SPLIT_DISPLAY from the DPU >= 5.0
configuration, drop the rm->has_legacy_ctls condition which short-circuited
the check for those platforms.

Suggested-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 

---

Note, it is impossible to reorder the commits in any other sensible way. The
DPU_CTL_SPLIT_DISPLAY can not be dropped before the patch that enables
single-CTL support.
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 4dbc9bc7eb4f151f83055220665ee5fd238ae7ba..2557effe639b5360bc948a49b0cccdb59ee35dab 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -410,8 +410,7 @@ static int _dpu_rm_reserve_ctls(
 
DPU_DEBUG("ctl %d caps 0x%lX\n", j + CTL_0, features);
 
-   if (rm->has_legacy_ctls &&
-   needs_split_display != has_split_display)
+   if (needs_split_display != has_split_display)
continue;
 
ctl_idx[i] = j;

-- 
2.39.5



[PATCH v2 7/8] drm/msm/dpu: remove DPU_CTL_SPLIT_DISPLAY from CTL blocks on DPU >= 5.0

2025-02-27 Thread Dmitry Baryshkov
Since DPU 5.0 CTL blocks do not require DPU_CTL_SPLIT_DISPLAY, as a single
CTL is used for both interfaces. As both RM and encoder now handle
active CTLs, drop that feature bit.

Reviewed-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h  | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h  | 4 ++--
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h   | 4 ++--
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h  | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 5 ++---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h | 5 ++---
 11 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
index bcb39807fe61e231d6e318d8729ed86f213fb06a..a705e3e761d9a578777cd03011e90df8002127a6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h
@@ -27,17 +27,16 @@ static const struct dpu_mdp_cfg sm8650_mdp = {
},
 };
 
-/* FIXME: get rid of DPU_CTL_SPLIT_DISPLAY in favour of proper ACTIVE_CTL support */
 static const struct dpu_ctl_cfg sm8650_ctl[] = {
{
.name = "ctl_0", .id = CTL_0,
.base = 0x15000, .len = 0x1000,
-   .features = CTL_SM8550_MASK | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = CTL_SM8550_MASK,
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9),
}, {
.name = "ctl_1", .id = CTL_1,
.base = 0x16000, .len = 0x1000,
-   .features = CTL_SM8550_MASK | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = CTL_SM8550_MASK,
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10),
}, {
.name = "ctl_2", .id = CTL_2,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index 36cc9dbc00b5c1219e1aa557dd4ee0e801b5c9e7..714c27abddbec28e9d0a4f2d7c70828a6c1b0be5 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -37,17 +37,16 @@ static const struct dpu_mdp_cfg sm8150_mdp = {
},
 };
 
-/* FIXME: get rid of DPU_CTL_SPLIT_DISPLAY in favour of proper ACTIVE_CTL support */
 static const struct dpu_ctl_cfg sm8150_ctl[] = {
{
.name = "ctl_0", .id = CTL_0,
.base = 0x1000, .len = 0x1e0,
-   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = BIT(DPU_CTL_ACTIVE_CFG),
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9),
}, {
.name = "ctl_1", .id = CTL_1,
.base = 0x1200, .len = 0x1e0,
-   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = BIT(DPU_CTL_ACTIVE_CFG),
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10),
}, {
.name = "ctl_2", .id = CTL_2,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
index e8eacdb47967a227567a96a85a93a69befbb00d5..669f3a44c3387d5620530edab0fcca8d70671cb8 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
@@ -41,12 +41,12 @@ static const struct dpu_ctl_cfg sc8180x_ctl[] = {
{
.name = "ctl_0", .id = CTL_0,
.base = 0x1000, .len = 0x1e0,
-   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = BIT(DPU_CTL_ACTIVE_CFG),
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 9),
}, {
.name = "ctl_1", .id = CTL_1,
.base = 0x1200, .len = 0x1e0,
-   .features = BIT(DPU_CTL_ACTIVE_CFG) | BIT(DPU_CTL_SPLIT_DISPLAY),
+   .features = BIT(DPU_CTL_ACTIVE_CFG),
.intr_start = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR2, 10),
}, {
.name = "ctl_2", .id = CTL_2,
diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h
index 2fe674d1e05988f39f66a01fedee96113437ea65..0d102888741a0c61ac547ec568e44c1e91350835 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h
@@ -38,12 +38,12 @@ static const struct dpu_ctl_cfg sm7150_ctl[

Re: [PATCH v6 32/32] drm/doc: gpusvm: Add GPU SVM documentation

2025-02-27 Thread Alistair Popple
On Thu, Feb 27, 2025 at 08:36:35PM -0800, Matthew Brost wrote:
> On Fri, Feb 28, 2025 at 01:34:42PM +1100, Alistair Popple wrote:
> > On Mon, Feb 24, 2025 at 08:43:11PM -0800, Matthew Brost wrote:
> > > Add documentation for agree upon GPU SVM design principles, current
> > > status, and future plans.
> > 
> > Thanks for writing this up. In general I didn't see anything too 
> > controversial
> > but added a couple of comments below.
> > 
> > > 
> > > v4:
> > >  - Address Thomas's feedback
> > > v5:
> > >  - s/Current/Basline (Thomas)
> > > 
> > > Signed-off-by: Matthew Brost 
> > > Reviewed-by: Thomas Hellström 
> > > ---
> > >  Documentation/gpu/rfc/gpusvm.rst | 84 
> > >  Documentation/gpu/rfc/index.rst  |  4 ++
> > >  2 files changed, 88 insertions(+)
> > >  create mode 100644 Documentation/gpu/rfc/gpusvm.rst
> > > 
> > > diff --git a/Documentation/gpu/rfc/gpusvm.rst 
> > > b/Documentation/gpu/rfc/gpusvm.rst
> > > new file mode 100644
> > > index ..063412160685
> > > --- /dev/null
> > > +++ b/Documentation/gpu/rfc/gpusvm.rst
> > > @@ -0,0 +1,84 @@
> > > +===
> > > +GPU SVM Section
> > > +===
> > > +
> > > +Agreed upon design principles
> > > +=
> > 
> > > As a general comment I think it would be nice if we could add some rationale/
> > > reasons for these design principles. Things inevitably change and if/when
> > > we need to violate or update these principles it would be good to have some
> > > documented rationale for why we decided on them in the first place because
> > > the reasoning may have become invalid by then.
> > 
> 
> Let me try to add some things to the various cases.

Thanks!

> > > +* migrate_to_ram path
> > > + * Rely only on core MM concepts (migration PTEs, page references, and
> > > +   page locking).
> > > + * No driver specific locks other than locks for hardware interaction in
> > > +   this path. These are not required and generally a bad idea to
> > > +   invent driver defined locks to seal core MM races.
> > 
> > > In principle I agree. The problem I think you will run into is the analogue
> > > of what adding a trylock_page() to do_swap_page() fixes. Which is that a
> > > concurrent GPU fault (which is highly likely after handling a CPU fault due
> > > to the GPU PTEs becoming invalid) may, depending on your design, kick off a
> > > migration of the page to the GPU via migrate_vma_setup().
> > 
> > > The problem with that is migrate_vma_setup() will temporarily raise the folio
> > > refcount, which can cause the migrate_to_ram() callback to fail but the
> > > elevated refcount from migrate_to_ram() can also cause the GPU migration to
> > > fail thus leading to a live-lock when both CPU and GPU fault handlers just
> > > keep retrying.
> > 
> > > This was particularly problematic for us on multi-GPU setups, and our
> > > solution was to introduce a migration critical section in the form of a
> > > mutex to ensure only one thread was calling migrate_vma_setup() at a time.
> > 
> > > And now that I've looked at UVM development history, and remembered more
> > > context, this is why I had a vague recollection that adding a migration
> > > entry in do_swap_page() would be better than taking a page lock. Doing so
> > > fixes the issue with concurrent GPU faults blocking migrate_to_ram() because
> > > it makes migrate_vma_setup() ignore the page.
> > 
> 
> Ok, this is something to keep an eye on. In the current Xe code, we try
> to migrate a chunk of memory from the CPU to the GPU in our GPU fault
> handler once per fault. If it fails due to racing CPU access, we simply
> leave it in CPU memory and move on. We don't have any real migration
> policies in Xe yet—that is being worked on as a follow-up to my series.
> However, if we had a policy requiring a memory region to 'must be in
> GPU,' this could conceivably lead to a livelock with concurrent CPU and
> GPU access. I'm still not fully convinced that a driver-side lock is the
> solution here, but without encountering the issue on our side, I can't
> be completely certain what the solution is.

Right - we have migration policies that can cause us to try harder to migrate.
Also I agree with you that a driver-side lock might not be the best solution
here. It's what we did due to various limitations we have, but they are
unimportant for this discussion.

I agree the ideal solution wouldn't involve locks and would instead be to fix
the migration interfaces up such that one thread attempting to migrate doesn't
cause another thread which has started a migration to fail. The solution to that
isn't obvious, but I don't think it would be impossible either.
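
To make the "migration critical section" mentioned above a bit more concrete, here is a minimal userspace sketch. It is a toy model and not kernel code: `struct toy_page`, `toy_migrate()` and `migrate_lock` are hypothetical names, and the refcount check only imitates how an unexpected extra reference can make a migration attempt give up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page: one base reference plus transient ones. */
struct toy_page {
	int refcount;	/* 1 when idle */
	int location;	/* 0 = system memory, 1 = device memory */
};

/* The "migration critical section": only one thread sets up a migration. */
static pthread_mutex_t migrate_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Try to move the page to dest. The transient reference imitates the one
 * taken while collecting pages for migration; the attempt fails if anyone
 * else holds an extra reference at that point.
 */
static bool toy_migrate(struct toy_page *page, int dest)
{
	bool ok;

	assert(dest == 0 || dest == 1);

	pthread_mutex_lock(&migrate_lock);
	page->refcount++;			/* transient reference */
	ok = (page->refcount == 2);		/* base + ours, nothing else */
	if (ok)
		page->location = dest;
	page->refcount--;
	pthread_mutex_unlock(&migrate_lock);

	return ok;
}
```

Without the mutex, two threads could both take their transient reference first and then both see an unexpected count, so both would fail and retry. With it, the attempts are serialised and only a reference held by someone else makes an attempt fail.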

> > > + * Partial migration is supported (i.e., a subset of pages attempting to
> > > +   migrate can actually migrate, with only the faulting page guaranteed
> > > +   to migrate).
> > > + * Driver handles mixed migrations via retry loops rather than locking.
> > >
> > > +* Eviction
> > 
> > This i

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Dave Airlie
On Fri, 28 Feb 2025 at 11:49, Timur Tabi  wrote:
>
> On Fri, 2025-02-28 at 07:37 +1000, Dave Airlie wrote:
> > I've tried to retrofit checking 0x to drivers a lot, I'd
> > prefer not to. Drivers getting stuck in wait for clear bits for ever.
>
> That's what read_poll_timeout() is for.  I'm surprised Nouveau doesn't use it.

That doesn't handle the PCIE returns 0x case at all, which is
the thing we most want to handle, it also uses the CPU timer whereas
nouveau's wait infrastructure uses the GPU timer usually (though that
could be changed).
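
As a rough illustration of the kind of wait loop being discussed, here is a standalone sketch. The value after `0x` is cut off in this archive; the sketch assumes it is the all-ones pattern that a read typically returns when a PCIe device has stopped responding. `poll_bits_clear()`, `reg_read_fn` and the `fake_reg` helper are hypothetical names, and the loop is bounded by a read count rather than by the CPU or GPU timer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register accessor; a real driver would do an MMIO read here. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until all bits in mask read back as zero.
 * Returns 0 on success, -ENODEV if a read returns all ones (assumed here
 * to mean the device stopped responding), -ETIMEDOUT after max_reads tries.
 */
static int poll_bits_clear(reg_read_fn read_reg, void *ctx,
			   uint32_t mask, unsigned int max_reads)
{
	unsigned int n;

	assert(read_reg != NULL);

	for (n = 0; n < max_reads; n++) {
		uint32_t val = read_reg(ctx);

		if (val == UINT32_MAX)
			return -ENODEV;
		if ((val & mask) == 0)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake register for experiments: replays a list, repeating the last value. */
struct fake_reg {
	const uint32_t *values;
	size_t count;	/* must be at least 1 */
	size_t pos;
};

static uint32_t fake_reg_read(void *ctx)
{
	struct fake_reg *reg = ctx;
	uint32_t val = reg->values[reg->pos];

	if (reg->pos + 1 < reg->count)
		reg->pos++;
	return val;
}
```

The all-ones check comes before the mask test on purpose: an all-ones value has every mask bit set, so without it a dead device would just look like a busy one until the timeout.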

Dave.


linux-next: manual merge of the drm-xe tree with the drm tree

2025-02-27 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm-xe tree got a conflict in:

  drivers/gpu/drm/xe/display/xe_display.c

between commit:

  1b242ceec536 ("drm/i915/audio: convert to struct intel_display")

from the drm tree and commit:

  d41d048043c4 ("drm/xe/display: Drop xe_display_driver_remove()")

from the drm-xe tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/gpu/drm/xe/display/xe_display.c
index 02a413a07382,279b786d64dc..
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@@ -169,7 -169,8 +169,8 @@@ static void xe_display_fini(void *arg
  
intel_hpd_poll_fini(xe);
intel_hdcp_component_fini(display);
 -  intel_audio_deinit(xe);
 +  intel_audio_deinit(display);
+   intel_display_driver_remove(display);
  }
  
  int xe_display_init(struct xe_device *xe)




Re: [PATCH v7 12/15] drm/msm/dpu: blend pipes per mixer pairs config

2025-02-27 Thread Dmitry Baryshkov
On Wed, Feb 26, 2025 at 08:31:01PM +0800, Jun Nie wrote:
> Currently, only 2 pipes are used at most for a plane. A stage structure
> describes the configuration for a mixer pair. So only one stage is needed
> for current usage cases. The quad-pipe case will be added in future and 2
> stages are used in the case. So extend the stage to an array with array
> size STAGES_PER_PLANE and blend pipes per mixer pair with configuration
> in the stage structure.
> 
> Signed-off-by: Jun Nie 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c| 45 +++--
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h |  3 +-
>  2 files changed, 31 insertions(+), 17 deletions(-)
> 
> @@ -463,15 +463,24 @@ static void _dpu_crtc_blend_setup_mixer(struct drm_crtc *crtc,
>   if (pstate->stage == DPU_STAGE_BASE && format->alpha_enable)
>   bg_alpha_enable = true;
>  
> - for (i = 0; i < PIPES_PER_PLANE; i++) {
> - if (!pstate->pipe[i].sspp)
> - continue;
> - set_bit(pstate->pipe[i].sspp->idx, fetch_active);
> - _dpu_crtc_blend_setup_pipe(crtc, plane,
> -mixer, cstate->num_mixers,
> -pstate->stage,
> -format, fb ? fb->modifier : 0,
> -&pstate->pipe[i], i, stage_cfg);
> + /* loop pipe per mixer pair with config in stage structure */
> + for (stage = 0; stage < STAGES_PER_PLANE; stage++) {
> + head_pipe_in_stage = stage * PIPES_PER_STAGE;
> + for (i = 0; i < PIPES_PER_STAGE; i++) {
> + pipe_idx = i + head_pipe_in_stage;
> + if (!pstate->pipe[pipe_idx].sspp)
> + continue;

empty line

> + lms_in_pair = min(cstate->num_mixers - (stage * PIPES_PER_STAGE),
> +   PIPES_PER_STAGE);
> + set_bit(pstate->pipe[pipe_idx].sspp->idx, fetch_active);
> + _dpu_crtc_blend_setup_pipe(crtc, plane,
> +&mixer[head_pipe_in_stage],
> +lms_in_pair,
> +pstate->stage,
> +format, fb ? fb->modifier : 0,
> +&pstate->pipe[pipe_idx], i,
> +&stage_cfg[stage]);
> + }
>   }
>  
>   /* blend config update */

[...]
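
For anyone following the index arithmetic in the hunk above, here is a small standalone sketch. STAGES_PER_PLANE is set to 2 purely to illustrate the future quad-pipe case (the patch itself defines it as 1), and the helper names `pipe_index()` and `lms_in_stage()` are made up for this example.

```c
#include <assert.h>

#define STAGES_PER_PLANE 2	/* assumed quad-pipe value; the patch uses 1 */
#define PIPES_PER_STAGE  2
#define PIPES_PER_PLANE  (PIPES_PER_STAGE * STAGES_PER_PLANE)

/* Index into the per-plane pipe array for slot i of the given stage. */
static int pipe_index(int stage, int i)
{
	assert(stage >= 0 && stage < STAGES_PER_PLANE);
	assert(i >= 0 && i < PIPES_PER_STAGE);

	return stage * PIPES_PER_STAGE + i;
}

/*
 * Number of mixers left for the given stage, capped at PIPES_PER_STAGE.
 * This mirrors the min() expression in the hunk.
 */
static int lms_in_stage(int num_mixers, int stage)
{
	int remaining = num_mixers - stage * PIPES_PER_STAGE;

	return remaining < PIPES_PER_STAGE ? remaining : PIPES_PER_STAGE;
}
```

With four mixers both stages get two each, while with two mixers the second stage is left with none.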

> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> index 5f010d36672cc6440c69779908b315aab285eaf0..74bf3ab9d6cfb8152b32d89a6c66e4d92d5cee1d 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
> @@ -34,8 +34,9 @@
>  #define DPU_MAX_PLANES   4
>  #endif
>  
> -#define PIPES_PER_PLANE  2
> +#define STAGES_PER_PLANE 1
>  #define PIPES_PER_STAGE  2
> +#define PIPES_PER_PLANE  (PIPES_PER_STAGE * STAGES_PER_PLANE)

Please move this to the previous patch.

With that fixed:

Reviewed-by: Dmitry Baryshkov 

>  #ifndef DPU_MAX_DE_CURVES
>  #define DPU_MAX_DE_CURVES3
>  #endif
> 
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry


Re: [PATCH v7 13/15] drm/msm/dpu: support SSPP assignment for quad-pipe case

2025-02-27 Thread Dmitry Baryshkov
On Wed, Feb 26, 2025 at 08:31:02PM +0800, Jun Nie wrote:
> Currently, SSPPs are assigned to a maximum of two pipes. However,
> quad-pipe usage scenarios require four pipes and involve configuring
> two stages. In quad-pipe case, the first two pipes share a set of
> mixer configurations and enable multi-rect mode when certain
> conditions are met. The same applies to the subsequent two pipes.
> 
> Assign SSPPs to the pipes in each stage using a unified method and
> to loop the stages accordingly.
> 
> Signed-off-by: Jun Nie 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  | 11 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h  |  2 +
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 71 ---
>  3 files changed, 58 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> index 0a053c5888262d863a1e549e14e3aa40a80c3f06..9405453cbf5d852e72a5f954cd8c6aed3a222723 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c
> @@ -1366,6 +1366,17 @@ int dpu_crtc_vblank(struct drm_crtc *crtc, bool en)
>   return 0;
>  }
>  
> +/**
> + * dpu_crtc_get_num_lm - Get mixer number in this CRTC pipeline
> + * @state: Pointer to drm crtc state object
> + */
> +unsigned int dpu_crtc_get_num_lm(const struct drm_crtc_state *state)
> +{
> + struct dpu_crtc_state *cstate = to_dpu_crtc_state(state);
> +
> + return cstate->num_mixers;
> +}
> +
>  #ifdef CONFIG_DEBUG_FS
>  static int _dpu_debugfs_status_show(struct seq_file *s, void *data)
>  {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
> index 0b148f3ce0d7af80ec4ffcd31d8632a5815b16f1..b14bab2754635953da402d09e11a43b9b4cf4153 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h
> @@ -264,4 +264,6 @@ static inline enum dpu_crtc_client_type dpu_crtc_get_client_type(
>  
>  void dpu_crtc_frame_event_cb(struct drm_crtc *crtc, u32 event);
>  
> +unsigned int dpu_crtc_get_num_lm(const struct drm_crtc_state *state);
> +
>  #endif /* _DPU_CRTC_H_ */
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
> index d67f2ad20b4754ca4bcb759a65a39628b7236b0f..d1d6c91ed0f8e1c62b757ca42546fbc421609f72 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
> @@ -1112,11 +1112,10 @@ static int dpu_plane_virtual_assign_resources(struct drm_crtc *crtc,
>   struct dpu_rm_sspp_requirements reqs;
>   struct dpu_plane_state *pstate;
>   struct dpu_sw_pipe *pipe;
> - struct dpu_sw_pipe *r_pipe;
>   struct dpu_sw_pipe_cfg *pipe_cfg;
> - struct dpu_sw_pipe_cfg *r_pipe_cfg;
> + struct dpu_plane *pdpu = to_dpu_plane(plane);
>   const struct msm_format *fmt;
> - int i;
> + int i, num_lm, stage_id, num_stages;
>  
>   if (plane_state->crtc)
>   crtc_state = drm_atomic_get_new_crtc_state(state,
> @@ -1124,11 +1123,6 @@ static int dpu_plane_virtual_assign_resources(struct drm_crtc *crtc,
>  
>   pstate = to_dpu_plane_state(plane_state);
>  
> - pipe = &pstate->pipe[0];
> - r_pipe = &pstate->pipe[1];
> - pipe_cfg = &pstate->pipe_cfg[0];
> - r_pipe_cfg = &pstate->pipe_cfg[1];
> -
>   for (i = 0; i < PIPES_PER_PLANE; i++)
>   pstate->pipe[i].sspp = NULL;
>  
> @@ -1142,24 +1136,49 @@ static int dpu_plane_virtual_assign_resources(struct drm_crtc *crtc,
>  
>   reqs.rot90 = drm_rotation_90_or_270(plane_state->rotation);
>  
> - pipe->sspp = dpu_rm_reserve_sspp(&dpu_kms->rm, global_state, crtc, &reqs);
> - if (!pipe->sspp)
> - return -ENODEV;
> -
> - if (!dpu_plane_try_multirect_parallel(pipe, pipe_cfg, r_pipe, r_pipe_cfg,
> -   pipe->sspp,
> -   msm_framebuffer_format(plane_state->fb),
> -   dpu_kms->catalog->caps->max_linewidth)) {
> - /* multirect is not possible, use two SSPP blocks */
> - r_pipe->sspp = dpu_rm_reserve_sspp(&dpu_kms->rm, global_state, crtc, &reqs);
> - if (!r_pipe->sspp)
> - return -ENODEV;
> -
> - pipe->multirect_index = DPU_SSPP_RECT_SOLO;
> - pipe->multirect_mode = DPU_SSPP_MULTIRECT_NONE;
> -
> - r_pipe->multirect_index = DPU_SSPP_RECT_SOLO;
> - r_pipe->multirect_mode = DPU_SSPP_MULTIRECT_NONE;
> + num_lm = dpu_crtc_get_num_lm(crtc_state);
> + num_stages = (num_lm + 1) / 2;
> + for (stage_id = 0; stage_id < num_stages; stage_id++) {
> + for (i = stage_id * PIPES_PER_STAGE; i < (stage_id + 1) * PIPES_PER_STAGE; i++) {
> + struct dpu_sw_pipe *r_pipe;
> + struct dpu_sw_pipe_cfg *r_

[git pull] drm fixes for 6.14-rc5

2025-02-27 Thread Dave Airlie
Hi Linus,

This week's fixes pull, amdgpu mostly, with some xe and a few misc
others. The fb defio fix is a bit of a change, but it avoids some nasty
NULL pointer crashes due to defio assuming page backing in places it
didn't have pages.

Regards,
Dave.

drm-fixes-2025-02-28:
drm fixes for 6.14-rc5

amdgpu:
- Legacy dpm suspend/resume fix
- Runtime PM fix for DELL G5 SE
- MAINTAINERS updates
- Enforce Isolation fixes
- mailmap update
- EDID reading i2c fix
- PSR fix
- eDP fix
- HPD interrupt handling fix
- Clear memory fix

amdkfd:
- MQD handling fix

vkms:
- fix rounding error

imagination:
- header fix

nouveau:
- connector status fix

fb/defio:
- NULL ptr fix for defio drivers

i915:
- Fix encoder HW state readout for DP UHBR MST

xe:
- OA uapi fix (Umesh)
- Userptr related fixes
- Remove a duplicated register entry
- Scheduler related fix to prevent exec races when freeing it

The following changes since commit d082ecbc71e9e0bf49883ee4afd435a77a5101b6:

  Linux 6.14-rc4 (2025-02-23 12:32:57 -0800)

are available in the Git repository at:

  https://gitlab.freedesktop.org/drm/kernel.git tags/drm-fixes-2025-02-28

for you to fetch changes up to 6a5884f200693eeffac4b008faf1e8bdf1c92af5:

  Merge tag 'drm-xe-fixes-2025-02-27' of
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
(2025-02-28 10:47:09 +1000)




Alex Deucher (4):
  drm/amdgpu: disable BAR resize on Dell G5 SE
  MAINTAINERS: update amdgpu maintainers list
  drm/amdgpu/gfx: only call mes for enforce isolation if supported
  drm/amdgpu/mes: keep enforce isolation up to date

Aurabindo Pillai (1):
  MAINTAINERS: Update AMDGPU DML maintainers info

Dave Airlie (4):
  Merge tag 'amd-drm-fixes-6.14-2025-02-26' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
  Merge tag 'drm-misc-fixes-2025-02-27' of
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
  Merge tag 'drm-intel-fixes-2025-02-27' of
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
  Merge tag 'drm-xe-fixes-2025-02-27' of
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

David Yat Sin (1):
  drm/amdkfd: Preserve cp_hqd_pq_control on update_mqd

Harry Wentland (1):
  drm/vkms: Round fixp2int conversion in lerp_u16

Imre Deak (1):
  drm/i915/dp_mst: Fix encoder HW state readout for UHBR MST

Maarten Lankhorst (1):
  MAINTAINERS: Add entry for DMEM cgroup controller

Masahiro Yamada (1):
  drm/imagination: remove unnecessary header include path

Matthew Auld (2):
  drm/xe/userptr: restore invalidation list on error
  drm/xe/userptr: fix EFAULT handling

Melissa Wen (1):
  drm/amd/display: restore edid reading from a given i2c adapter

Mingcong Bai (1):
  drm/xe/regs: remove a duplicate definition for RING_CTL_SIZE(size)

Pierre-Eric Pelloux-Prayer (1):
  drm/amdgpu: init return value in amdgpu_ttm_clear_buffer

Rodrigo Siqueira (2):
  MAINTAINERS: Change my role from Maintainer to Reviewer
  mailmap: Add entry for Rodrigo Siqueira

Roman Li (1):
  drm/amd/display: Fix HPD after gpu reset

Tejas Upadhyay (1):
  drm/xe: cancel pending job timer before freeing scheduler

Thomas Zimmermann (2):
  drm/nouveau: Do not override forced connector status
  drm/fbdev-dma: Add shadow buffering for deferred I/O

Tom Chung (1):
  drm/amd/display: Disable PSR-SU on eDP panels

Umesh Nerlige Ramappa (1):
  drm/xe/oa: Allow oa_exponent value of 0

Yilin Chen (1):
  drm/amd/display: add a quirk to enable eDP0 on DP1

chr[] (1):
  amdgpu/pm/legacy: fix suspend/resume issues

 .mailmap   |   3 +
 MAINTAINERS|  16 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   7 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c|  11 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c|  20 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.h|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/mes_v11_0.c |   4 +
 drivers/gpu/drm/amd/amdgpu/mes_v12_0.c |   4 +
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c   |   6 +-
 drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v1

[PATCH v1] drm/ci: use shallow clone to avoid timeouts

2025-02-27 Thread Vignesh Raman
The python-artifacts job has a timeout of 10 minutes, which causes
build failures when it is unable to clone the repository within the
specified limit. Set GIT_DEPTH to 10 to speed up cloning and avoid
build failures due to timeouts when fetching the full repository.

Signed-off-by: Vignesh Raman 
---
 drivers/gpu/drm/ci/gitlab-ci.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/ci/gitlab-ci.yml b/drivers/gpu/drm/ci/gitlab-ci.yml
index f04aabe8327c..647ef980ad4a 100644
--- a/drivers/gpu/drm/ci/gitlab-ci.yml
+++ b/drivers/gpu/drm/ci/gitlab-ci.yml
@@ -40,6 +40,8 @@ variables:
   ARTIFACTS_BASE_URL: 
https://${CI_PROJECT_ROOT_NAMESPACE}.${CI_PAGES_DOMAIN}/-/${CI_PROJECT_NAME}/-/jobs/${CI_JOB_ID}/artifacts
   # Python scripts for structured logger
   PYTHONPATH: "$PYTHONPATH:$CI_PROJECT_DIR/install"
+  # Set to 0 to disable shallow cloning
+  GIT_DEPTH: 10
 
 
 default:
-- 
2.47.2



Re: [PATCH] drm/xe: Select INTEL_VSEC to fix build dependency

2025-02-27 Thread Su Hui

On 2025/2/28 00:03, Lucas De Marchi wrote:

On Thu, Feb 27, 2025 at 03:32:06PM +0800, Su Hui wrote:

When building a randconfig, there is an error:
ld: drivers/gpu/drm/xe/xe_vsec.o: in function `xe_vsec_init':
xe_vsec.c:(.text+0x182): undefined reference to `intel_vsec_register'

When CONFIG_DRM_XE=y and CONFIG_INTEL_VSEC=m is set, ld couldn't find
'intel_vsec_register'. Select INTEL_VSEC to fix this error.

Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices")
Signed-off-by: Su Hui 
---
drivers/gpu/drm/xe/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/xe/Kconfig b/drivers/gpu/drm/xe/Kconfig
index b51a2bde73e2..7a60d96d2dd6 100644
--- a/drivers/gpu/drm/xe/Kconfig
+++ b/drivers/gpu/drm/xe/Kconfig
@@ -44,6 +44,7 @@ config DRM_XE
select WANT_DEV_COREDUMP
select AUXILIARY_BUS
select HMM_MIRROR
+    select INTEL_VSEC


intel_vsec is an x86 platform driver. I think we probably want to add a
config that depends on INTEL_VSEC rather than selecting it like this.
At the very least we need an `if x86` and also make sure the driver
works without that part.


There is a recursive dependency between INTEL_VSEC and DRM_XE:

    symbol DRM_XE depends on INTEL_VSEC
    symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES
    symbol X86_PLATFORM_DEVICES is selected by DRM_XE

So if using 'depends on INTEL_VSEC', we should remove 'select
X86_PLATFORM_DEVICES', like this one:


 config DRM_XE
    tristate "Intel Xe Graphics"
    depends on DRM && PCI && MMU && (m || (y && KUNIT=y))
+   depends on !X86 || INTEL_VSEC || INTEL_VSEC=n
+   depends on !X86 || !ACPI || ACPI_WMI
    select INTERVAL_TREE
    # we need shmfs for the swappable backing store, and in particular
    # the shmem_readpage() which depends upon tmpfs
@@ -27,8 +29,6 @@ config DRM_XE
    select BACKLIGHT_CLASS_DEVICE if ACPI
    select INPUT if ACPI
    select ACPI_VIDEO if X86 && ACPI
-   select X86_PLATFORM_DEVICES if X86 && ACPI
-   select ACPI_WMI if X86 && ACPI

The 'select X86_PLATFORM_DEVICES' was introduced by 67a9e86dc130
("drm/xe: select X86_PLATFORM_DEVICES when ACPI_WMI is selected"), so the
ACPI_WMI select needs to be changed as well.


Another choice is to use 'select INTEL_VSEC if X86', which needs no
other changes.

Any suggestions on these two choices?




Re: [PATCH 6/6] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

2025-02-27 Thread Matthew Brost
On Thu, Feb 27, 2025 at 09:51:15AM -0700, Cavitt, Jonathan wrote:
> Some responses below.  If I skip over anything, just assume that I'm taking
> the request into consideration and that it will be fixed for version 2 of
> this patch series.
> 
> -Original Message-
> From: Brost, Matthew  
> Sent: Thursday, February 27, 2025 12:25 AM
> To: Cavitt, Jonathan 
> Cc: intel...@lists.freedesktop.org; Gupta, saurabhg 
> ; Zuo, Alex ; 
> joonas.lahti...@linux.intel.com; Zhang, Jianxun ; 
> dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 6/6] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
> > 
> > On Wed, Feb 26, 2025 at 10:55:56PM +, Jonathan Cavitt wrote:
> > > Add support for userspace to get various properties from a specified VM.
> > > The currently supported properties are:
> > > 
> > > - The number of engine resets the VM has observed
> > > - The number of exec queue bans the VM has observed, up to the last 50
> > >   relevant ones, and how many of those were caused by faults.
> > > 
> > > The latter request also includes information on the exec queue bans,
> > > such as the ID of the banned exec queue, whether the ban was caused by a
> > > pagefault or not, and the address and address type of the associated
> > > fault (if one exists).
> > > 
> > 
> > > Signed-off-by: Jonathan Cavitt 
> > > Suggested-by: Matthew Brost 
> > > ---
> [...]
> > 
> > > +
> > > +struct drm_xe_ban {
> > > + /** @exec_queue_id: ID of banned exec queue */
> > > + __u32 exec_queue_id;
> > 
> > I don't think we can reliably associate a page fault with an
> > exec_queue_id at the moment, given my above statement about having to
> > capture all state at the time of the page fault. Maybe we could with
> > some tricks between the page fault and the IOMMU CAT error G2H?
> > Regardless, let's ask the UMD we are targeting [1] if this information
> > would be helpful. It would seemingly have to be vendor-specific
> > information, not part of the generic Vk information.
> > 
> > Additionally, it might be good to ask what other vendor-specific
> > information, if any, we'd need here based on what the current page fault
> > interface supports.
> > 
> > [1] 
> > https://registry.khronos.org/vulkan/specs/latest/man/html/VK_EXT_device_fault.html
> 
> The original request was something along the lines of having a mirror of the
> DRM_IOCTL_I915_GET_RESET_STATS on XeKMD.  Those reset stats contain
> information on the "context" ID, which maps to the exec queue ID on XeKMD.
> 
> Even if we can't reasonably blame a pagefault on a particular exec queue, in
> order to match the request correctly, this information needs to be returned.
> 
> The I915 reset stats also contain information on the number of observed engine
> resets, so that needs to be returned as well.
> 
> @joonas.lahti...@linux.intel.com can provide more details.  Or maybe
> @Mistat, Tomasz .
> 

You haven't really answered my question here or below where you say see
above. We need a UMD use case posted with any uAPI changes before
merging them. I know the above Vk extension is going to be
implemented on top of this series, but it is very unclear where the
number-of-resets requirement / UMD use case is coming from, which makes
it impossible to review.

Again I suggest focusing on the Vk use case first, or talking to our UMD
partners to figure out exactly why something similar to
DRM_IOCTL_I915_GET_RESET_STATS is required in Xe. I have made similar
comments on VLK-69424.

Matt

> > 
> > > > + /** @faulted: Whether or not the ban has an associated pagefault.  0 is no, 1 is yes */
> > > + __u32 faulted;
> > > + /** @address: Address of the fault, if relevant */
> > > + __u64 address;
> > > + /** @address_type: enum drm_xe_fault_address_type, if relevant */
> > > + __u32 address_type;
> > 
> > We likely need a fault_size field to support VkDeviceSize
> > addressPrecision; as defined here [2]. I believe we can extract this
> > information from pagefault.fault_level.
> > 
> > [2] 
> > https://registry.khronos.org/vulkan/specs/latest/man/html/VkDeviceFaultAddressInfoEXT.html
> 
> I can add this field as a prototype, though it will always return SZ_4K until
> we can have a longer discussion on how to map between the fault_level and the
> fault_size.
> 
> > 
> > > + /** @pad: MBZ */
> > > + __u32 pad;
> > > + /** @reserved: MBZ */
> > > + __u64 reserved[3];
> > > +};
> > > +
> > > +struct drm_xe_faults {
> > > + /** @num_faults: Number of faults observed on the VM */
> > > + __u32 num_faults;
> > > + /** @num_bans: Number of bans observed on the VM */
> > > + __u32 num_bans;
> > 
> > I don't think num_bans and num_faults really provide any benefit for
> > > supporting [1]. The requirement for [1] is device faults, nothing more.
> > With that in mind, I'd lean toward an array of a single structure
> > (returned in drm_xe_vm_get_property.data, number of faults can be
> > inferred from the returned size) reporting all faults, with each entry
> > containing all the faul

Re: [git pull] drm fixes for 6.14-rc5

2025-02-27 Thread pr-tracker-bot
The pull request you sent on Fri, 28 Feb 2025 13:10:16 +1000:

> https://gitlab.freedesktop.org/drm/kernel.git tags/drm-fixes-2025-02-28

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/76544811c850a1f4c055aa182b513b7a843868ea

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH v1 1/7] virtio-gpu api: add blob userptr resource

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Add a new kind of blob resource, called userptr, that lets the
host access guest user space memory. It is used to implement the
buffer-based userptr feature in virtio GPU.

- The capset VIRTIO_GPU_CAPSET_HSAKMT is used for context init.
In this patch series only an HSAKMT context can use the userptr
feature. HSAKMT is a GPU compute library in the HSA stack, playing
a role similar to libdrm in the Mesa stack.
- The new flag VIRTIO_GPU_BLOB_FLAG_USE_USERPTR is used in blob create
to indicate that the blob create ioctl is creating a userptr
blob resource.

Signed-off-by: Honglei Huang 
---
 include/uapi/linux/virtio_gpu.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h
index bf2c9cabd207..4da36a1e62c4 100644
--- a/include/uapi/linux/virtio_gpu.h
+++ b/include/uapi/linux/virtio_gpu.h
@@ -65,6 +65,11 @@
  */
 #define VIRTIO_GPU_F_CONTEXT_INIT4
 
+/*
+ * VIRTGPU_BLOB_FLAG_USE_USERPTR
+ */
+#define VIRTIO_GPU_F_RESOURCE_USERPTR5
+
 enum virtio_gpu_ctrl_type {
VIRTIO_GPU_UNDEFINED = 0,
 
@@ -312,6 +317,7 @@ struct virtio_gpu_cmd_submit {
 /* 3 is reserved for gfxstream */
 #define VIRTIO_GPU_CAPSET_VENUS 4
 #define VIRTIO_GPU_CAPSET_DRM 6
+#define VIRTIO_GPU_CAPSET_HSAKMT 8
 
 /* VIRTIO_GPU_CMD_GET_CAPSET_INFO */
 struct virtio_gpu_get_capset_info {
@@ -404,6 +410,7 @@ struct virtio_gpu_resource_create_blob {
 #define VIRTIO_GPU_BLOB_FLAG_USE_MAPPABLE 0x0001
 #define VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE0x0002
 #define VIRTIO_GPU_BLOB_FLAG_USE_CROSS_DEVICE 0x0004
+#define VIRTIO_GPU_BLOB_FLAG_USE_USERPTR  0x0008
/* zero is invalid blob mem */
__le32 blob_mem;
__le32 blob_flags;
-- 
2.34.1



[PATCH v1 7/7] drm/virtio: implement userptr: add mmu notifier

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Add an MMU notifier. This has some benefits:
- The UMD does not need to manage the userptrs; it just allocates and frees
user space memory, and with the MMU notifier the userptrs can be managed by
the kernel.
- It can achieve a performance improvement of 20%~30%. With the MMU notifier,
a UMD like OpenCL can achieve 98% of bare-metal performance in
some benchmarks such as Geekbench and CLpeak.

Signed-off-by: Honglei Huang 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h |  47 ++-
 drivers/gpu/drm/virtio/virtgpu_ioctl.c   |   4 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c |   2 +
 drivers/gpu/drm/virtio/virtgpu_userptr.c | 423 ++-
 4 files changed, 469 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index fa5dd46e3732..6fa6dd9d1738 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRIVER_NAME "virtio_gpu"
 #define DRIVER_DESC "virtio GPU"
@@ -121,9 +122,33 @@ struct virtio_gpu_object_userptr_ops {
int (*get_pages)(struct virtio_gpu_object_userptr *userptr);
void (*put_pages)(struct virtio_gpu_object_userptr *userptr);
void (*release)(struct virtio_gpu_object_userptr *userptr);
-   int (*insert)(struct virtio_gpu_object_userptr *userptr, struct virtio_gpu_fpriv *fpriv);
-   int (*remove)(struct virtio_gpu_object_userptr *userptr, struct virtio_gpu_fpriv *fpriv);
+   int (*insert)(struct virtio_gpu_object_userptr *userptr,
+ struct virtio_gpu_fpriv *fpriv);
+   int (*remove)(struct virtio_gpu_object_userptr *userptr,
+ struct virtio_gpu_fpriv *fpriv);
+   bool (*valid)(struct virtio_gpu_object_userptr *userptr);
+   void (*notifier_init)(struct virtio_gpu_object_userptr *userptr,
+ struct mm_struct *mm);
+   int (*notifier_add)(struct virtio_gpu_object_userptr *userptr,
+   unsigned long start, unsigned long length);
+   void (*notifier_remove)(struct virtio_gpu_object_userptr *userptr);
+   int (*split)(struct virtio_gpu_object_userptr *userptr,
+unsigned long start, unsigned long last,
+struct virtio_gpu_object_userptr **pnew);
+   void (*evict)(struct virtio_gpu_object_userptr *userptr);
+   void (*update)(struct virtio_gpu_object_userptr *userptr);
+   struct virtio_gpu_object_userptr *(*split_new)(
+   struct virtio_gpu_object_userptr *userptr, unsigned long start,
+   unsigned long last);
 };
+
+enum userptr_work_list_ops {
+   USERPTR_OP_NULL,
+   USERPTR_OP_UNMAP,
+   USERPTR_OP_UPDATE,
+   USERPTR_OP_EVICT,
+};
+
 struct virtio_gpu_object_userptr {
struct virtio_gpu_object base;
const struct virtio_gpu_object_userptr_ops *ops;
@@ -142,6 +167,16 @@ struct virtio_gpu_object_userptr {
struct sg_table *sgt;
 
struct interval_tree_node it_node;
+
+#ifdef CONFIG_MMU_NOTIFIER
+   struct list_head work_list;
+   enum userptr_work_list_ops op;
+   atomic_t in_release;
+   struct mm_struct *mm;
+   uint64_t notifier_start;
+   uint64_t notifier_last;
+   struct mmu_interval_notifier notifier;
+#endif
 };
 
 #define to_virtio_gpu_shmem(virtio_gpu_object) \
@@ -317,6 +352,12 @@ struct virtio_gpu_fpriv {
bool explicit_debug_name;
struct rb_root_cached userptrs_tree;
struct mutex userptrs_tree_lock;
+
+#ifdef CONFIG_MMU_NOTIFIER
+   struct work_struct userptr_work;
+   struct list_head userptr_work_list;
+   spinlock_t userptr_work_list_lock;
+#endif
 };
 
 /* virtgpu_ioctl.c */
@@ -536,4 +577,6 @@ bool virtio_gpu_is_userptr(struct virtio_gpu_object *bo);
 void virtio_gpu_userptr_interval_tree_init(struct virtio_gpu_fpriv *vfpriv);
 void virtio_gpu_userptr_set_handle(struct virtio_gpu_object *qobj,
   uint32_t handle);
+uint32_t virtio_gpu_userptr_get_handle(struct virtio_gpu_object *qobj);
+void virtio_gpu_userptr_list_work_init(struct virtio_gpu_fpriv *vfpriv);
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index ad1ac8d0eadf..14326fd8fee9 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -697,8 +697,10 @@ static int virtio_gpu_context_init_ioctl(struct drm_device 
*dev,
}
}
 
-   if (vfpriv->context_init & VIRTIO_GPU_CAPSET_HSAKMT)
+   if (vfpriv->context_init & VIRTIO_GPU_CAPSET_HSAKMT) {
+   virtio_gpu_userptr_list_work_init(vfpriv);
virtio_gpu_userptr_interval_tree_init(vfpriv);
+   }
 
virtio_gpu_create_context_locked(vgdev, vfpriv);
virtio_gpu_notify(vgdev);
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 3d5158cae

[PATCH v1 4/7] drm/virtio: implement userptr: add userptr obj

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Add an implementation of virtio GPU userptr. The current solution pins
all of the user space memory. The UMD needs to manage all the userptrs.

Signed-off-by: Honglei Huang 
---
 drivers/gpu/drm/virtio/Makefile  |   3 +-
 drivers/gpu/drm/virtio/virtgpu_drv.h |  33 
 drivers/gpu/drm/virtio/virtgpu_object.c  |   5 +
 drivers/gpu/drm/virtio/virtgpu_userptr.c | 230 +++
 4 files changed, 270 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c

diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile
index d2e1788a8227..fe7332a621aa 100644
--- a/drivers/gpu/drm/virtio/Makefile
+++ b/drivers/gpu/drm/virtio/Makefile
@@ -6,6 +6,7 @@
 virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_gem.o virtgpu_vram.o \
virtgpu_display.o virtgpu_vq.o \
virtgpu_fence.o virtgpu_object.o virtgpu_debugfs.o virtgpu_plane.o \
-   virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o virtgpu_submit.o
+   virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o virtgpu_submit.o \
+   virtgpu_userptr.o
 
 obj-$(CONFIG_DRM_VIRTIO_GPU) += virtio-gpu.o
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 7bdcbaa20ef1..f3dcbd241f5a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -85,6 +85,7 @@ struct virtio_gpu_object_params {
uint32_t blob_mem;
uint32_t blob_flags;
uint64_t blob_id;
+   uint64_t userptr;
 };
 
 struct virtio_gpu_object {
@@ -112,12 +113,38 @@ struct virtio_gpu_object_vram {
struct drm_mm_node vram_node;
 };
 
+struct virtio_gpu_object_userptr;
+
+struct virtio_gpu_object_userptr_ops {
+   int (*get_pages)(struct virtio_gpu_object_userptr *userptr);
+   void (*put_pages)(struct virtio_gpu_object_userptr *userptr);
+   void (*release)(struct virtio_gpu_object_userptr *userptr);
+};
+struct virtio_gpu_object_userptr {
+   struct virtio_gpu_object base;
+   const struct virtio_gpu_object_userptr_ops *ops;
+   struct mutex lock;
+
+   uint64_t start;
+   uint32_t npages;
+   uint32_t bo_handle;
+   uint32_t flags;
+
+   struct virtio_gpu_device *vgdev;
+   struct drm_file *file;
+   struct page **pages;
+   struct sg_table *sgt;
+};
+
 #define to_virtio_gpu_shmem(virtio_gpu_object) \
container_of((virtio_gpu_object), struct virtio_gpu_object_shmem, base)
 
 #define to_virtio_gpu_vram(virtio_gpu_object) \
container_of((virtio_gpu_object), struct virtio_gpu_object_vram, base)
 
+#define to_virtio_gpu_userptr(virtio_gpu_object) \
+   container_of((virtio_gpu_object), struct virtio_gpu_object_userptr, base)
+
 struct virtio_gpu_object_array {
struct ww_acquire_ctx ticket;
struct list_head next;
@@ -489,4 +516,10 @@ void virtio_gpu_vram_unmap_dma_buf(struct device *dev,
 int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
 
+/* virtgpu_userptr.c */
+int virtio_gpu_userptr_create(struct virtio_gpu_device *vgdev,
+ struct drm_file *file,
+ struct virtio_gpu_object_params *params,
+ struct virtio_gpu_object **bo_ptr);
+bool virtio_gpu_is_userptr(struct virtio_gpu_object *bo);
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_object.c 
b/drivers/gpu/drm/virtio/virtgpu_object.c
index c7e74cf13022..31659b0a028d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_object.c
+++ b/drivers/gpu/drm/virtio/virtgpu_object.c
@@ -80,6 +80,11 @@ void virtio_gpu_cleanup_object(struct virtio_gpu_object *bo)
drm_gem_free_mmap_offset(&vram->base.base.base);
drm_gem_object_release(&vram->base.base.base);
kfree(vram);
+   } else if (virtio_gpu_is_userptr(bo)) {
+   struct virtio_gpu_object_userptr *userptr = to_virtio_gpu_userptr(bo);
+
+   drm_gem_object_release(&userptr->base.base.base);
+   kfree(userptr);
}
 }
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_userptr.c 
b/drivers/gpu/drm/virtio/virtgpu_userptr.c
new file mode 100644
index ..b4a08811d345
--- /dev/null
+++ b/drivers/gpu/drm/virtio/virtgpu_userptr.c
@@ -0,0 +1,230 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include 
+#include 
+#include 
+
+#include "virtgpu_drv.h"
+#include "drm/drm_gem.h"
+
+static struct sg_table *
+virtio_gpu_userptr_get_sg_table(struct drm_gem_object *obj);
+
+static void virtio_gpu_userptr_free(struct drm_gem_object *obj)
+{
+   struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
+   struct virtio_gpu_device *vgdev = obj->dev->dev_private;
+   struct virtio_gpu_object_userptr *userptr = to_virtio_gpu_userptr(bo);
+
+   if (bo->created) {
+   userptr->ops->release(userptr);
+
+   virtio_gpu_cmd_unref_resource(vgd

[PATCH v1 2/7] drm/virtgpu api: add blob userptr resource

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

This makes the blob userptr resource available to guest userspace.

- Flag VIRTGPU_BLOB_FLAG_USE_USERPTR: set by guest userspace at blob
creation to request a blob userptr resource.
- Flag VIRTGPU_BLOB_FLAG_USERPTR_RDONLY: marks the userptr as read-only;
if it is not set, the userptr is writable.
- New parameter userptr: passes the userspace memory address to
virtio GPU. As in other userptr designs, virtio GPU needs a userspace
memory range for the device to access.

The userptr feature is basic and essential on the compute side: it lets
the device access userspace memory directly instead of copying.
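
A minimal sketch of how guest userspace might combine the new flags with
the address, for illustration only. The flag values are the ones defined
in the diff below; struct example_blob_args and example_userptr_args()
are hypothetical names, not part of the uAPI, and the real ioctl argument
struct has more fields than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined by this patch. */
#define VIRTGPU_BLOB_FLAG_USE_USERPTR    0x0008
#define VIRTGPU_BLOB_FLAG_USERPTR_RDONLY 0x0010

/* Hypothetical cut-down stand-in for the create-blob arguments. */
struct example_blob_args {
	uint32_t blob_flags;
	uint64_t size;
	uint64_t userptr;
};

/* Hypothetical helper: describe a userspace buffer as a blob userptr. */
static struct example_blob_args
example_userptr_args(void *addr, uint64_t size, int read_only)
{
	struct example_blob_args args = { 0 };

	assert(addr != NULL && size != 0);

	args.blob_flags = VIRTGPU_BLOB_FLAG_USE_USERPTR;
	if (read_only)
		args.blob_flags |= VIRTGPU_BLOB_FLAG_USERPTR_RDONLY;
	args.size = size;
	/* The address travels as a plain 64-bit integer. */
	args.userptr = (uint64_t)(uintptr_t)addr;
	return args;
}
```

Leaving read_only at zero gives a writable userptr, matching the default
described above.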

Signed-off-by: Honglei Huang 
---
 include/uapi/drm/virtgpu_drm.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
index c2ce71987e9b..071f31752721 100644
--- a/include/uapi/drm/virtgpu_drm.h
+++ b/include/uapi/drm/virtgpu_drm.h
@@ -179,13 +179,14 @@ struct drm_virtgpu_resource_create_blob {
 #define VIRTGPU_BLOB_FLAG_USE_MAPPABLE 0x0001
 #define VIRTGPU_BLOB_FLAG_USE_SHAREABLE0x0002
 #define VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE 0x0004
+#define VIRTGPU_BLOB_FLAG_USE_USERPTR  0x0008
+#define VIRTGPU_BLOB_FLAG_USERPTR_RDONLY   0x0010
/* zero is invalid blob_mem */
__u32 blob_mem;
__u32 blob_flags;
__u32 bo_handle;
__u32 res_handle;
__u64 size;
-
/*
 * for 3D contexts with VIRTGPU_BLOB_MEM_HOST3D_GUEST and
 * VIRTGPU_BLOB_MEM_HOST3D otherwise, must be zero.
@@ -194,6 +195,7 @@ struct drm_virtgpu_resource_create_blob {
__u32 cmd_size;
__u64 cmd;
__u64 blob_id;
+   __u64 userptr;
 };
 
 #define VIRTGPU_CONTEXT_PARAM_CAPSET_ID   0x0001
-- 
2.34.1



[PATCH v1 0/7] Add virtio gpu userptr support

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Hello,

This series adds virtio GPU userptr support and the libhsakmt capset.
The userptr feature lets the host access guest user space memory. It is
used for the GPU compute use case, to enable the ROCm/OpenCL native
context. Note that we are not implementing SVM here; this is just a
buffer-based userptr implementation.
The libhsakmt capset is used for the ROCm context; libhsakmt plays a role
similar to that of libdrm in Mesa.

Patches 1-2 add the libhsakmt capset and the userptr blob resource flag.
Patches 3-5 implement the basic userptr feature. In some popular benchmarks,
it reaches about 70% of bare-metal efficiency with the OpenCL API.
Patch 6 adds the interval tree.
Patch 7 adds the MMU notifier, so that the UMD does not need to manage
userptrs, and increases efficiency by 20% to 30%. With this patch, OpenCL in
ROCm can achieve 95%+ efficiency compared to bare metal in some popular
benchmarks.

Honglei Huang (7):
  virtio-gpu api: add blob userptr resource
  drm/virtgpu api: add blob userptr resource
  drm/virtio: implement userptr: probe for the feature
  drm/virtio: implement userptr: add userptr obj
  drm/virtio: advertise base userptr feature to userspace
  drm/virtio: implement userptr: add interval tree
  drm/virtio: implement userptr: add mmu notifier

 drivers/gpu/drm/virtio/Makefile  |   3 +-
 drivers/gpu/drm/virtio/virtgpu_debugfs.c |   1 +
 drivers/gpu/drm/virtio/virtgpu_drv.c |   1 +
 drivers/gpu/drm/virtio/virtgpu_drv.h |  91 +++
 drivers/gpu/drm/virtio/virtgpu_ioctl.c   |  22 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c |  10 +-
 drivers/gpu/drm/virtio/virtgpu_object.c  |   5 +
 drivers/gpu/drm/virtio/virtgpu_userptr.c | 766 +++
 include/uapi/drm/virtgpu_drm.h   |   5 +-
 include/uapi/linux/virtio_gpu.h  |   7 +
 10 files changed, 905 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/virtio/virtgpu_userptr.c

-- 
2.34.1




[PATCH v1 5/7] drm/virtio: advertise base userptr feature to userspace

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Introduce the basic userptr feature to userspace.

Signed-off-by: Honglei Huang 
---
 drivers/gpu/drm/virtio/virtgpu_ioctl.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index e4f76f315550..8a89774d0737 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -36,7 +36,9 @@
 
 #define VIRTGPU_BLOB_FLAG_USE_MASK (VIRTGPU_BLOB_FLAG_USE_MAPPABLE | \
VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \
-   VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE)
+   VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE | \
+   VIRTGPU_BLOB_FLAG_USE_USERPTR | \
+   VIRTGPU_BLOB_FLAG_USERPTR_RDONLY)
 
 /* Must be called with &virtio_gpu_fpriv.struct_mutex held. */
 static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev,
@@ -444,6 +446,8 @@ static int verify_blob(struct virtio_gpu_device *vgdev,
 {
if (!vgdev->has_resource_blob)
return -EINVAL;
+   if (!vgdev->has_resource_userptr && rc_blob->userptr)
+   return -EINVAL;
 
if (rc_blob->blob_flags & ~VIRTGPU_BLOB_FLAG_USE_MASK)
return -EINVAL;
@@ -489,6 +493,7 @@ static int verify_blob(struct virtio_gpu_device *vgdev,
params->size = rc_blob->size;
params->blob = true;
params->blob_flags = rc_blob->blob_flags;
+   params->userptr = rc_blob->userptr;
return 0;
 }
 
@@ -527,8 +532,10 @@ static int virtio_gpu_resource_create_blob_ioctl(struct 
drm_device *dev,
  vfpriv->ctx_id, NULL, NULL);
}
 
-   if (guest_blob)
+   if (guest_blob && !params.userptr)
ret = virtio_gpu_object_create(vgdev, &params, &bo, NULL);
+   else if (guest_blob && params.userptr)
+   ret = virtio_gpu_userptr_create(vgdev, file, &params, &bo);
else if (!guest_blob && host3d_blob)
ret = virtio_gpu_vram_create(vgdev, &params, &bo);
else
-- 
2.34.1



[PATCH v1 6/7] drm/virtio: implement userptr: add interval tree

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Add an interval tree to manage the userptrs and prevent repeated creation.
If the userptr already exists, the ioctl returns the existing BO and the
offset of the requested address within it.
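
The lookup described above can be sketched roughly as follows. This is
not the code from the patch: it swaps the interval tree for a linear scan
over an array to stay self-contained, it assumes a request only matches
when it lies entirely inside an existing range, and struct example_range
and example_find() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one already-created userptr BO. */
struct example_range {
	uint64_t start;     /* first byte of the range */
	uint64_t last;      /* last byte of the range, inclusive */
	uint32_t bo_handle; /* handle of the BO backing the range */
};

/*
 * Return the existing range that fully contains [addr, addr + size - 1]
 * and store the offset of addr within it, or return NULL if none does.
 */
static const struct example_range *
example_find(const struct example_range *ranges, size_t count,
	     uint64_t addr, uint64_t size, uint64_t *offset)
{
	uint64_t last;
	size_t i;

	assert(size != 0);
	last = addr + size - 1;

	for (i = 0; i < count; i++) {
		if (addr >= ranges[i].start && last <= ranges[i].last) {
			*offset = addr - ranges[i].start;
			return &ranges[i];
		}
	}
	return NULL;
}
```

A caller that gets a non-NULL result would hand back bo_handle plus the
offset instead of creating a second BO for the same memory.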

Signed-off-by: Honglei Huang 
---
 drivers/gpu/drm/virtio/virtgpu_drv.h |  16 ++-
 drivers/gpu/drm/virtio/virtgpu_ioctl.c   |  13 ++-
 drivers/gpu/drm/virtio/virtgpu_userptr.c | 129 ++-
 include/uapi/drm/virtgpu_drm.h   |   1 +
 4 files changed, 152 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index f3dcbd241f5a..fa5dd46e3732 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -54,6 +54,7 @@
 #define STATE_INITIALIZING 0
 #define STATE_OK 1
 #define STATE_ERR 2
+#define STATE_RES_EXISTS 3
 
 #define MAX_CAPSET_ID 63
 #define MAX_RINGS 64
@@ -114,18 +115,23 @@ struct virtio_gpu_object_vram {
 };
 
 struct virtio_gpu_object_userptr;
+struct virtio_gpu_fpriv;
 
 struct virtio_gpu_object_userptr_ops {
int (*get_pages)(struct virtio_gpu_object_userptr *userptr);
void (*put_pages)(struct virtio_gpu_object_userptr *userptr);
void (*release)(struct virtio_gpu_object_userptr *userptr);
+   int (*insert)(struct virtio_gpu_object_userptr *userptr, struct virtio_gpu_fpriv *fpriv);
+   int (*remove)(struct virtio_gpu_object_userptr *userptr, struct virtio_gpu_fpriv *fpriv);
 };
 struct virtio_gpu_object_userptr {
struct virtio_gpu_object base;
const struct virtio_gpu_object_userptr_ops *ops;
struct mutex lock;
 
+   uint64_t ptr;
uint64_t start;
+   uint64_t last;
uint32_t npages;
uint32_t bo_handle;
uint32_t flags;
@@ -134,6 +140,8 @@ struct virtio_gpu_object_userptr {
struct drm_file *file;
struct page **pages;
struct sg_table *sgt;
+
+   struct interval_tree_node it_node;
 };
 
 #define to_virtio_gpu_shmem(virtio_gpu_object) \
@@ -307,6 +315,8 @@ struct virtio_gpu_fpriv {
struct mutex context_lock;
char debug_name[DEBUG_NAME_MAX_LEN];
bool explicit_debug_name;
+   struct rb_root_cached userptrs_tree;
+   struct mutex userptrs_tree_lock;
 };
 
 /* virtgpu_ioctl.c */
@@ -520,6 +530,10 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, 
void *data,
 int virtio_gpu_userptr_create(struct virtio_gpu_device *vgdev,
  struct drm_file *file,
  struct virtio_gpu_object_params *params,
- struct virtio_gpu_object **bo_ptr);
+ struct virtio_gpu_object **bo_ptr,
+ struct drm_virtgpu_resource_create_blob *rc_blob);
 bool virtio_gpu_is_userptr(struct virtio_gpu_object *bo);
+void virtio_gpu_userptr_interval_tree_init(struct virtio_gpu_fpriv *vfpriv);
+void virtio_gpu_userptr_set_handle(struct virtio_gpu_object *qobj,
+  uint32_t handle);
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 8a89774d0737..ad1ac8d0eadf 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -534,8 +534,11 @@ static int virtio_gpu_resource_create_blob_ioctl(struct 
drm_device *dev,
 
if (guest_blob && !params.userptr)
ret = virtio_gpu_object_create(vgdev, &params, &bo, NULL);
-   else if (guest_blob && params.userptr)
-   ret = virtio_gpu_userptr_create(vgdev, file, &params, &bo);
+   else if (guest_blob && params.userptr) {
+   ret = virtio_gpu_userptr_create(vgdev, file, &params, &bo, rc_blob);
+   if (ret > 0)
+   return ret;
+   }
else if (!guest_blob && host3d_blob)
ret = virtio_gpu_vram_create(vgdev, &params, &bo);
else
@@ -567,6 +570,9 @@ static int virtio_gpu_resource_create_blob_ioctl(struct 
drm_device *dev,
rc_blob->res_handle = bo->hw_res_handle;
rc_blob->bo_handle = handle;
 
+   if (guest_blob && params.userptr)
+   virtio_gpu_userptr_set_handle(bo, handle);
+
/*
 * The handle owns the reference now.  But we must drop our
 * remaining reference *after* we no longer need to dereference
@@ -691,6 +697,9 @@ static int virtio_gpu_context_init_ioctl(struct drm_device 
*dev,
}
}
 
+   if (vfpriv->context_init & VIRTIO_GPU_CAPSET_HSAKMT)
+   virtio_gpu_userptr_interval_tree_init(vfpriv);
+
virtio_gpu_create_context_locked(vgdev, vfpriv);
virtio_gpu_notify(vgdev);
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_userptr.c 
b/drivers/gpu/drm/virtio/virtgpu_userptr.c
index b4a08811d345..03398c3b9f30 100644
--- a/drivers/gpu/drm/virtio/virtgpu_userptr.c
+++ b/drivers/gpu/drm/virtio/virtgpu_userptr.c
@@ -10,6 +10,92 @@
 static struct sg_table *
 virtio_gpu_userp

Re: [PATCH 0/4] Check Rust signatures at compile time

2025-02-27 Thread Andreas Hindborg
"Alice Ryhl"  writes:

> Signed-off-by: Alice Ryhl 

What is going on with the cover letter of this one?


Best regards,
Andreas Hindborg




Re: [PATCH RESEND v3] drm/xe: xe_gen_wa_oob: replace program_invocation_short_name

2025-02-27 Thread Lucas De Marchi

On Thu, Feb 27, 2025 at 08:39:21AM -0500, Tamir Duberstein wrote:

Hi Lucas, chiming in here since I also care about building on macOS.

On Mon, Feb 24, 2025 at 10:05 AM Lucas De Marchi
 wrote:


Is this the approach taken for other similar issues you had? Note that
argv[0] and program_invocation_short_name are not the same thing. For
this particular binary I don't really care and if it's the approach
taken in other places, I'm ok using it.


Believe it or not, this is the only place that
program_invocation_short_name has ever been used in the kernel. There
have been numerous instances of:

#define _GNU_SOURCE /* for program_invocation_short_name */

but never any actual callers (that I could find in the git history)
other than this one.
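
For readers wondering about the difference: glibc sets
program_invocation_short_name to the part of argv[0] after the last
slash, so a portable replacement has to strip the directory itself.
A minimal sketch, assuming that is the behaviour wanted here;
example_short_name() is a hypothetical helper and not necessarily what
the patch does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the component of argv0 after the last '/'. */
static const char *example_short_name(const char *argv0)
{
	const char *slash;

	assert(argv0 != NULL);
	slash = strrchr(argv0, '/');
	return slash ? slash + 1 : argv0;
}
```

Called with "./xe_gen_wa_oob" it yields "xe_gen_wa_oob"; a name without
any slash is returned unchanged.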


I was expecting you'd take the acks and merge it all through a single
tree since you received push back on the need to build the kernel in
macOS.  Is this the only thing missing and you'd want it to go through
drm?


I believe the other patches have been applied or dropped. When I last
tested building allmodconfig this was the only issue I ran into (macOS
arm64), so I asked Daniel for this resend.


fair enough.  Pushed to drm-xe-next since nobody ever reads the
usage for this helper tool and it doesn't really matter if now it's ugly.

Lucas De Marchi



Cheers.
Tamir


[PATCH v2 0/6] Support for Adreno 623 GPU

2025-02-27 Thread Akhil P Oommen
This series adds support for the A623 GPU found in QCS8300 chipsets. This
GPU IP is very similar to the A621 GPU, except for the UBWC configuration
and the GMU firmware.

Both DT patches are for Bjorn, and the rest of the patches are for Rob
Clark to pick up.

---
Changes in v2:
- Fix hwcg config (Konrad)
- Split gpucc reg list patch (Rob)
- Rebase on msm-next tip
- Link to v1: 
https://lore.kernel.org/r/20250213-a623-gpu-support-v1-0-993c65c39...@quicinc.com

---
Jie Zhang (6):
  drm/msm/a6xx: Split out gpucc register block
  drm/msm/a6xx: Fix gpucc register block for A621
  drm/msm/a6xx: Add support for Adreno 623
  dt-bindings: display/msm/gmu: Add Adreno 623 GMU
  arm64: dts: qcom: qcs8300: Add gpu and gmu nodes
  arm64: dts: qcom: qcs8300-ride: Enable Adreno 623 GPU

 .../devicetree/bindings/display/msm/gmu.yaml   |  1 +
 arch/arm64/boot/dts/qcom/qcs8300-ride.dts  |  8 ++
 arch/arm64/boot/dts/qcom/qcs8300.dtsi  | 93 ++
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c  | 29 +++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  |  8 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c| 13 ++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h| 17 
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|  5 ++
 8 files changed, 171 insertions(+), 3 deletions(-)
---
base-commit: 89839e69f6154feecd79bd01171375225b0296e9
change-id: 20250213-a623-gpu-support-f6698603fb85
prerequisite-change-id: 20250131-b4-branch-gfx-smmu-b03261963064:v5
prerequisite-patch-id: f8fd1a2020c940e595e58a8bd3c55d00d3d87271
prerequisite-patch-id: 08a0540f75b0f95fd2018b38c9ed5c6f96433b4d

Best regards,
-- 
Akhil P Oommen 



[PATCH v2 3/6] drm/msm/a6xx: Add support for Adreno 623

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Add support for the Adreno 623 GPU found in QCS8300 chipsets.

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
---
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c   | 29 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  8 
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  5 +
 4 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
index 
edffb7737a97b268bb2986d557969e651988a344..53e2ff4406d8f0afe474aaafbf0e459ef8f4577d
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
@@ -879,6 +879,35 @@ static const struct adreno_info a6xx_gpus[] = {
{ 0, 0 },
{ 137, 1 },
),
+   }, {
+   .chip_ids = ADRENO_CHIP_IDS(0x06020300),
+   .family = ADRENO_6XX_GEN3,
+   .fw = {
+   [ADRENO_FW_SQE] = "a650_sqe.fw",
+   [ADRENO_FW_GMU] = "a623_gmu.bin",
+   },
+   .gmem = SZ_512K,
+   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
+   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
+   ADRENO_QUIRK_HAS_HW_APRIV,
+   .init = a6xx_gpu_init,
+   .a6xx = &(const struct a6xx_info) {
+   .hwcg = a690_hwcg,
+   .protect = &a650_protect,
+   .gmu_cgc_mode = 0x00020200,
+   .prim_fifo_threshold = 0x0001,
+   .bcms = (const struct a6xx_bcm[]) {
+   { .name = "SH0", .buswidth = 16 },
+   { .name = "MC0", .buswidth = 4 },
+   {
+   .name = "ACV",
+   .fixed = true,
+   .perfmode = BIT(3),
+   },
+   { /* sentinel */ },
+   },
+   },
+   .address_space_size = SZ_16G,
}, {
.chip_ids = ADRENO_CHIP_IDS(
0x06030001,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 
0ae29a7c8a4d3f74236a35cc919f69d5c0a384a0..1820c167fcee609deee3d49e7b5dd3736da23d99
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -616,6 +616,14 @@ static void a6xx_calc_ubwc_config(struct adreno_gpu *gpu)
gpu->ubwc_config.uavflagprd_inv = 2;
}
 
+   if (adreno_is_a623(gpu)) {
+   gpu->ubwc_config.highest_bank_bit = 16;
+   gpu->ubwc_config.amsbc = 1;
+   gpu->ubwc_config.rgb565_predicator = 1;
+   gpu->ubwc_config.uavflagprd_inv = 2;
+   gpu->ubwc_config.macrotile_mode = 1;
+   }
+
if (adreno_is_a640_family(gpu))
gpu->ubwc_config.amsbc = 1;
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 
2c10474ccc95cf2515c6583007a9b5cc478f836c..3222a406d08950008ca8c67a9b78cdd0e98e888c
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -1227,7 +1227,7 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu,
_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[1],
&a6xx_state->gmu_registers[1], true);
 
-   if (adreno_is_a621(adreno_gpu))
+   if (adreno_is_a621(adreno_gpu) || adreno_is_a623(adreno_gpu))
_a6xx_get_gmu_registers(gpu, a6xx_state, &a621_gpucc_reg,
&a6xx_state->gmu_registers[2], false);
else
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 
dcf454629ce037b2a8274a6699674ad754ce1f07..92caba3584da0400b44a903e465814af165d40a3
 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -442,6 +442,11 @@ static inline int adreno_is_a621(const struct adreno_gpu 
*gpu)
return gpu->info->chip_ids[0] == 0x06020100;
 }
 
+static inline int adreno_is_a623(const struct adreno_gpu *gpu)
+{
+   return gpu->info->chip_ids[0] == 0x06020300;
+}
+
 static inline int adreno_is_a630(const struct adreno_gpu *gpu)
 {
return adreno_is_revn(gpu, 630);

-- 
2.48.1



[PATCH v2 5/6] arm64: dts: qcom: qcs8300: Add gpu and gmu nodes

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Add gpu and gmu nodes for the qcs8300 chipset.

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
---
 arch/arm64/boot/dts/qcom/qcs8300.dtsi | 93 +++
 1 file changed, 93 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qcs8300.dtsi 
b/arch/arm64/boot/dts/qcom/qcs8300.dtsi
index 
f1c90db7b0e689035fbbaaa551611be34adf9ab6..2dc487dcc584cd0a057e18c53e2f945b8636ad14
 100644
--- a/arch/arm64/boot/dts/qcom/qcs8300.dtsi
+++ b/arch/arm64/boot/dts/qcom/qcs8300.dtsi
@@ -2660,6 +2660,99 @@ serdes0: phy@8909000 {
status = "disabled";
};
 
+   gpu: gpu@3d0 {
+   compatible = "qcom,adreno-623.0", "qcom,adreno";
+   reg = <0x0 0x03d0 0x0 0x4>,
+ <0x0 0x03d9e000 0x0 0x1000>,
+ <0x0 0x03d61000 0x0 0x800>;
+   reg-names = "kgsl_3d0_reg_memory",
+   "cx_mem",
+   "cx_dbgc";
+   interrupts = ;
+   iommus = <&adreno_smmu 0 0xc00>,
+<&adreno_smmu 1 0xc00>;
+   operating-points-v2 = <&gpu_opp_table>;
+   qcom,gmu = <&gmu>;
+   interconnects = <&gem_noc MASTER_GFX3D QCOM_ICC_TAG_ALWAYS
+&mc_virt SLAVE_EBI1 QCOM_ICC_TAG_ALWAYS>;
+   interconnect-names = "gfx-mem";
+   #cooling-cells = <2>;
+
+   status = "disabled";
+
+   gpu_zap_shader: zap-shader {
+   memory-region = <&gpu_microcode_mem>;
+   };
+
+   gpu_opp_table: opp-table {
+   compatible = "operating-points-v2";
+
+   opp-87700 {
+   opp-hz = /bits/ 64 <87700>;
+   opp-level = 
;
+   opp-peak-kBps = <12484375>;
+   };
+
+   opp-78000 {
+   opp-hz = /bits/ 64 <78000>;
+   opp-level = 
;
+   opp-peak-kBps = <10687500>;
+   };
+
+   opp-59900 {
+   opp-hz = /bits/ 64 <59900>;
+   opp-level = ;
+   opp-peak-kBps = <8171875>;
+   };
+
+   opp-47900 {
+   opp-hz = /bits/ 64 <47900>;
+   opp-level = 
;
+   opp-peak-kBps = <5285156>;
+   };
+   };
+   };
+
+   gmu: gmu@3d6a000 {
+   compatible = "qcom,adreno-gmu-623.0", "qcom,adreno-gmu";
+   reg = <0x0 0x03d6a000 0x0 0x34000>,
+ <0x0 0x03de 0x0 0x1>,
+ <0x0 0x0b29 0x0 0x1>;
+   reg-names = "gmu", "rscc", "gmu_pdc";
+   interrupts = ,
+;
+   interrupt-names = "hfi", "gmu";
+   clocks = <&gpucc GPU_CC_CX_GMU_CLK>,
+<&gpucc GPU_CC_CXO_CLK>,
+<&gcc GCC_DDRSS_GPU_AXI_CLK>,
+<&gcc GCC_GPU_MEMNOC_GFX_CLK>,
+<&gpucc GPU_CC_AHB_CLK>,
+<&gpucc GPU_CC_HUB_CX_INT_CLK>,
+<&gpucc GPU_CC_HLOS1_VOTE_GPU_SMMU_CLK>;
+   clock-names = "gmu",
+ "cxo",
+ "axi",
+ "memnoc",
+ "ahb",
+ "hub",
+ "smmu_vote";
+   power-domains = <&gpucc GPU_CC_CX_GDSC>,
+   <&gpucc GPU_CC_GX_GDSC>;
+   power-domain-names = "cx",
+"gx";
+   iommus = <&adreno_smmu 5 0xc00>;
+   operating-points-v2 = <&gmu_opp_table>;
+
+   gmu_opp_table: opp-table {
+   compatible = "operating-points-v2";
+
+   opp-2 {
+   opp-hz = /bits/ 64 <2>;
+   opp-level = 
;
+

[PATCH v2 2/6] drm/msm/a6xx: Fix gpucc register block for A621

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Adreno 621 has a different memory map for the GPUCC block, so update the
a6xx_gpu_state code to dump the correct set of gpucc registers.

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  9 +++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 12 
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 
81763876e4029713994b47729a2cec7e1dd3fbb9..2c10474ccc95cf2515c6583007a9b5cc478f836c
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -1226,8 +1226,13 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu,
&a6xx_state->gmu_registers[0], false);
_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[1],
&a6xx_state->gmu_registers[1], true);
-   _a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gpucc_reg,
-   &a6xx_state->gmu_registers[2], false);
+
+   if (adreno_is_a621(adreno_gpu))
+   _a6xx_get_gmu_registers(gpu, a6xx_state, &a621_gpucc_reg,
+   &a6xx_state->gmu_registers[2], false);
+   else
+   _a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gpucc_reg,
+   &a6xx_state->gmu_registers[2], false);
 
if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
return;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
index 
31c7462ab6d7b877c55abc04b98c0a80dac87759..e545106c70be713b07904187a9e246e08499f228
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
@@ -376,6 +376,17 @@ static const u32 a6xx_gmu_gpucc_registers[] = {
0xbc00, 0xbc16, 0xbc20, 0xbc27,
 };
 
+static const u32 a621_gmu_gpucc_registers[] = {
+   /* GPU CC */
+   0x9800, 0x980e, 0x9c00, 0x9c0e, 0xb000, 0xb004, 0xb400, 0xb404,
+   0xb800, 0xb804, 0xbc00, 0xbc05, 0xbc14, 0xbc1d, 0xbc2a, 0xbc30,
+   0xbc32, 0xbc32, 0xbc41, 0xbc55, 0xbc66, 0xbc68, 0xbc78, 0xbc7a,
+   0xbc89, 0xbc8a, 0xbc9c, 0xbc9e, 0xbca0, 0xbca3, 0xbcb3, 0xbcb5,
+   0xbcc5, 0xbcc7, 0xbcd6, 0xbcd8, 0xbce8, 0xbce9, 0xbcf9, 0xbcfc,
+   0xbd0b, 0xbd0c, 0xbd1c, 0xbd1e, 0xbd40, 0xbd70, 0xbe00, 0xbe16,
+   0xbe20, 0xbe2d,
+};
+
 static const u32 a6xx_gmu_cx_rscc_registers[] = {
/* GPU RSCC */
0x008c, 0x008c, 0x0101, 0x0102, 0x0340, 0x0342, 0x0344, 0x0347,
@@ -390,6 +401,7 @@ static const struct a6xx_registers a6xx_gmu_reglist[] = {
 };
 
 static const struct a6xx_registers a6xx_gpucc_reg = REGS(a6xx_gmu_gpucc_registers, 0, 0);
+static const struct a6xx_registers a621_gpucc_reg = REGS(a621_gmu_gpucc_registers, 0, 0);
 
 static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu);
 static u32 a7xx_get_cp_roq_size(struct msm_gpu *gpu);

-- 
2.48.1



[PATCH v2 1/6] drm/msm/a6xx: Split out gpucc register block

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Some GPUs have a different memory map for the GPUCC block, so split the
gpucc range out of a6xx_gmu_cx_registers into a separate block to
accommodate those GPUs.
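
As background, the register arrays touched here read as flat lists of
(first, last) offset pairs. A minimal sketch of walking such a list,
assuming each pair is an inclusive range; example_count_registers() is a
hypothetical helper and not a function from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the registers described by a flat array of
 * inclusive (first, last) offset pairs.
 */
static size_t example_count_registers(const uint32_t *pairs, size_t n)
{
	size_t total = 0;
	size_t i;

	assert(n % 2 == 0); /* the array must hold whole pairs */

	for (i = 0; i < n; i += 2)
		total += pairs[i + 1] - pairs[i] + 1;
	return total;
}
```

Splitting one array into two, as this patch does, changes which block a
pair is dumped with but not the pairs themselves.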

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 8 +---
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h | 5 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
index 
0fcae53c0b140b42d9af313695ad6121c9fc5618..81763876e4029713994b47729a2cec7e1dd3fbb9
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c
@@ -1214,18 +1214,20 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu,
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 
a6xx_state->gmu_registers = state_kcalloc(a6xx_state,
-   3, sizeof(*a6xx_state->gmu_registers));
+   4, sizeof(*a6xx_state->gmu_registers));
 
if (!a6xx_state->gmu_registers)
return;
 
-   a6xx_state->nr_gmu_registers = 3;
+   a6xx_state->nr_gmu_registers = 4;
 
/* Get the CX GMU registers from AHB */
_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[0],
&a6xx_state->gmu_registers[0], false);
_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[1],
&a6xx_state->gmu_registers[1], true);
+   _a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gpucc_reg,
+   &a6xx_state->gmu_registers[2], false);
 
if (!a6xx_gmu_gx_is_on(&a6xx_gpu->gmu))
return;
@@ -1234,7 +1236,7 @@ static void a6xx_get_gmu_registers(struct msm_gpu *gpu,
gpu_write(gpu, REG_A6XX_GMU_AO_AHB_FENCE_CTRL, 0);
 
_a6xx_get_gmu_registers(gpu, a6xx_state, &a6xx_gmu_reglist[2],
-   &a6xx_state->gmu_registers[2], false);
+   &a6xx_state->gmu_registers[3], false);
 }
 
 static struct msm_gpu_state_bo *a6xx_snapshot_gmu_bo(
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
index 
dd4c28a8d9233d8079abaf0065317c1d613dba32..31c7462ab6d7b877c55abc04b98c0a80dac87759
 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu_state.h
@@ -363,6 +363,9 @@ static const u32 a6xx_gmu_cx_registers[] = {
0x51e0, 0x51e2, 0x51f0, 0x51f0, 0x5200, 0x5201,
/* GMU AO */
0x9300, 0x9316, 0x9400, 0x9400,
+};
+
+static const u32 a6xx_gmu_gpucc_registers[] = {
/* GPU CC */
0x9800, 0x9812, 0x9840, 0x9852, 0x9c00, 0x9c04, 0x9c07, 0x9c0b,
0x9c15, 0x9c1c, 0x9c1e, 0x9c2d, 0x9c3c, 0x9c3d, 0x9c3f, 0x9c40,
@@ -386,6 +389,8 @@ static const struct a6xx_registers a6xx_gmu_reglist[] = {
REGS(a6xx_gmu_gx_registers, 0, 0),
 };
 
+static const struct a6xx_registers a6xx_gpucc_reg = REGS(a6xx_gmu_gpucc_registers, 0, 0);
+
 static u32 a6xx_get_cp_roq_size(struct msm_gpu *gpu);
 static u32 a7xx_get_cp_roq_size(struct msm_gpu *gpu);
 

-- 
2.48.1



[PATCH v2 6/6] arm64: dts: qcom: qcs8300-ride: Enable Adreno 623 GPU

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Enable the GPU on the qcs8300-ride platform and provide the path to the
zap shader.

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
Reviewed-by: Dmitry Baryshkov 
---
 arch/arm64/boot/dts/qcom/qcs8300-ride.dts | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/qcs8300-ride.dts 
b/arch/arm64/boot/dts/qcom/qcs8300-ride.dts
index 
b5c9f89b34356bbf8387643e8702a2a5f50b332f..5f6c6a1f59655bee62ca9ab09c4ee60c1b826a66
 100644
--- a/arch/arm64/boot/dts/qcom/qcs8300-ride.dts
+++ b/arch/arm64/boot/dts/qcom/qcs8300-ride.dts
@@ -285,6 +285,14 @@ queue3 {
};
 };
 
+&gpu {
+   status = "okay";
+};
+
+&gpu_zap_shader {
+   firmware-name = "qcom/qcs8300/a623_zap.mbn";
+};
+
 &qupv3_id_0 {
status = "okay";
 };

-- 
2.48.1



[PATCH v2 4/6] dt-bindings: display/msm/gmu: Add Adreno 623 GMU

2025-02-27 Thread Akhil P Oommen
From: Jie Zhang 

Document Adreno 623 GMU in the dt-binding specification.

Signed-off-by: Jie Zhang 
Signed-off-by: Akhil P Oommen 
Reviewed-by: Krzysztof Kozlowski 
---
 Documentation/devicetree/bindings/display/msm/gmu.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index 
ab884e2364293ed4e79ddfec35b3c5f4d14ae853..4392aa7a4ffe2492d69a21e067be1f42e00016d8
 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -123,6 +123,7 @@ allOf:
 compatible:
   contains:
 enum:
+  - qcom,adreno-gmu-623.0
   - qcom,adreno-gmu-635.0
   - qcom,adreno-gmu-660.1
   - qcom,adreno-gmu-663.0

-- 
2.48.1



Re: [PATCH][next] drm/nouveau: Avoid multiple -Wflex-array-member-not-at-end warnings

2025-02-27 Thread Danilo Krummrich

On 2/12/25 10:01 AM, Gustavo A. R. Silva wrote:

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

So, in order to avoid ending up with flexible-array members in the
middle of other structs, we use the `struct_group_tagged()` helper
to separate the flexible arrays from the rest of the members in the
flexible structures. We then use the newly created tagged `struct
nvif_ioctl_v0_hdr` and `struct nvif_ioctl_mthd_v0_hdr` to replace the
type of the objects causing trouble in multiple structures.

We also want to ensure that when new members need to be added to the
flexible structures, they are always included within the newly created
tagged structs. For this, we use `static_assert()`. This ensures that the
memory layout for both the flexible structure and the new tagged struct
is the same after any changes.

So, with these changes, fix the following warnings:
drivers/gpu/drm/nouveau/nvif/object.c:60:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nvif/object.c:233:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nvif/object.c:214:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nvif/object.c:152:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nvif/object.c:138:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nvif/object.c:104:38: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nouveau_svm.c:83:35: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]
drivers/gpu/drm/nouveau/nouveau_svm.c:82:30: warning: structure containing a 
flexible array member is not at the end of another structure 
[-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva 


Applied to drm-misc-next, thanks!
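
The header/flexible-array split described in the quoted commit message can be
sketched in plain C. This is a minimal userspace illustration, not the nouveau
code: `struct msg`, `struct msg_hdr` and `struct request` are hypothetical
names, and the anonymous union is written out by hand to approximate what
`struct_group_tagged()` generates in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Tagged header type: the fixed members only, no flexible array. */
struct msg_hdr {
	unsigned char version;
	unsigned char type;
};

/* Flexible structure: same leading layout, followed by the flexible array. */
struct msg {
	union {		/* roughly the shape struct_group_tagged() produces */
		struct {
			unsigned char version;
			unsigned char type;
		};
		struct msg_hdr hdr;
	};
	unsigned char data[];
};

/* Fails to build if a member is added to struct msg outside the header. */
static_assert(offsetof(struct msg, data) == sizeof(struct msg_hdr),
	      "struct member likely outside of the tagged header");

/* Embedding the header type in the middle of a struct is fine. */
struct request {
	struct msg_hdr ioctl;	/* no flexible array here, so no warning */
	unsigned int payload;
};

/* Both views of the header alias the same bytes. */
static inline void msg_layout_selfcheck(void)
{
	struct msg m;

	m.version = 1;
	m.type = 5;
	assert(m.hdr.version == 1 && m.hdr.type == 5);
}
```

The `static_assert()` is what keeps the two layouts in sync: a member added to
`struct msg` outside the grouped header moves `data` and breaks the build.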


Re: [PATCH v2 3/6] drm/msm/a6xx: Add support for Adreno 623

2025-02-27 Thread Konrad Dybcio
On 27.02.2025 9:07 PM, Akhil P Oommen wrote:
> From: Jie Zhang 
> 
> Add support for Adreno 623 GPU found in QCS8300 chipsets.
> 
> Signed-off-by: Jie Zhang 
> Signed-off-by: Akhil P Oommen 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c   | 29 
> +
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  8 
>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  2 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  5 +
>  4 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> index 
> edffb7737a97b268bb2986d557969e651988a344..53e2ff4406d8f0afe474aaafbf0e459ef8f4577d
>  100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
> @@ -879,6 +879,35 @@ static const struct adreno_info a6xx_gpus[] = {
>   { 0, 0 },
>   { 137, 1 },
>   ),
> + }, {
> + .chip_ids = ADRENO_CHIP_IDS(0x06020300),
> + .family = ADRENO_6XX_GEN3,
> + .fw = {
> + [ADRENO_FW_SQE] = "a650_sqe.fw",
> + [ADRENO_FW_GMU] = "a623_gmu.bin",
> + },
> + .gmem = SZ_512K,
> + .inactive_period = DRM_MSM_INACTIVE_PERIOD,
> + .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
> + ADRENO_QUIRK_HAS_HW_APRIV,
> + .init = a6xx_gpu_init,
> + .a6xx = &(const struct a6xx_info) {
> + .hwcg = a690_hwcg,

You used the a620 table before, I'm assuming a690 is correct after all?

Konrad


Re: [PATCH 9/9] arm64: dts: imx95: Describe Mali G310 GPU

2025-02-27 Thread Marek Vasut

On 2/27/25 6:43 PM, Frank Li wrote:
[...]


diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi 
b/arch/arm64/boot/dts/freescale/imx95.dtsi
index 3af13173de4bd..36bad211e5558 100644
--- a/arch/arm64/boot/dts/freescale/imx95.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
@@ -249,6 +249,37 @@ dummy: clock-dummy {
clock-output-names = "dummy";
};

+   gpu_fixed_reg: fixed-gpu-reg {
+   compatible = "regulator-fixed";
+   regulator-min-microvolt = <92>;
+   regulator-max-microvolt = <92>;
+   regulator-name = "vdd_gpu";
+   regulator-always-on;
+   regulator-boot-on;


Does this really need regulator-boot-on and regulator-always-on?


I don't think so, this is a development remnant, fixed, thanks.

[...]


+   gpu: gpu@4d90 {
+   compatible = "fsl,imx95-mali", "arm,mali-valhall-csf";
+   reg = <0 0x4d90 0 0x48>;
+   clocks = <&scmi_clk IMX95_CLK_GPU>;
+   clock-names = "core";
+   interrupts = ,
+,
+;
+   interrupt-names = "gpu", "job", "mmu";
+   mali-supply = <&gpu_fixed_reg>;
+   operating-points-v2 = <&gpu_opp_table>;
+   power-domains = <&scmi_devpd IMX95_PD_GPU>, <&scmi_perf 
IMX95_PERF_GPU>;
+   power-domain-names = "mix", "perf";
+   resets = <&gpu_blk_ctrl 0>;
+   #cooling-cells = <2>;
+   dynamic-power-coefficient = <1013>;
+   status = "disabled";


The GPU is an internal module, which does not depend much on other modules
such as pinmux. Why is the default status "disabled"? Presumably the GPU driver
will turn off clock and power if not used.

My thinking was that there are MX95 SoCs with the GPU fused off, hence it is
better to keep the GPU disabled in DT by default. But I can also keep it
enabled and the few boards which do not have an MX95 SoC with GPU can
explicitly disable it in board DT.


What do you think ?


Re: [PATCH 8/9] drm/panthor: Add i.MX95 support

2025-02-27 Thread Marek Vasut

On 2/27/25 9:17 PM, Marco Felsch wrote:

[...]


diff --git a/drivers/gpu/drm/panthor/panthor_drv.c 
b/drivers/gpu/drm/panthor/panthor_drv.c
index 06fe46e320738..2504a456d45c4 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1591,6 +1591,7 @@ static struct attribute *panthor_attrs[] = {
  ATTRIBUTE_GROUPS(panthor);
  
  static const struct of_device_id dt_match[] = {

+   { .compatible = "fsl,imx95-mali" },   /* G310 */

  ^
 nxp?

Can we switch to nxp instead?

We can ... is that the current recommendation ?

Why not stick with fsl , is that deprecated now ?


Re: [PATCH 7/9] dt-bindings: gpu: mali-valhall-csf: Document i.MX95 support

2025-02-27 Thread Marek Vasut

On 2/27/25 7:38 PM, Rob Herring (Arm) wrote:


On Thu, 27 Feb 2025 17:58:07 +0100, Marek Vasut wrote:

The instance of the GPU populated in Freescale i.MX95 is the
Mali G310, document support for this variant.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
  Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
  1 file changed, 1 insertion(+)



My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:


doc reference errors (make refcheckdocs):

See 
https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20250227170012.124768-8-ma...@denx.de

It seems there are no errors in this list ?


[PATCH v2 1/9] drm/dp: Add definitions for POST_LT_ADJ training sequence

2025-02-27 Thread Ville Syrjala
From: Ville Syrjälä 

Add the bit definitions needed for POST_LT_ADJ sequence.

v2: DP_POST_LT_ADJ_REQ_IN_PROGRESS is bit 1 not 5 (Jani)

Signed-off-by: Ville Syrjälä 
---
 include/drm/display/drm_dp.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/drm/display/drm_dp.h b/include/drm/display/drm_dp.h
index c413ef68f9a3..e2d2ae573d8b 100644
--- a/include/drm/display/drm_dp.h
+++ b/include/drm/display/drm_dp.h
@@ -115,6 +115,7 @@
 
 #define DP_MAX_LANE_COUNT   0x002
 # define DP_MAX_LANE_COUNT_MASK0x1f
+# define DP_POST_LT_ADJ_REQ_SUPPORTED  (1 << 5) /* 1.3 */
 # define DP_TPS3_SUPPORTED (1 << 6) /* 1.2 */
 # define DP_ENHANCED_FRAME_CAP (1 << 7)
 
@@ -571,6 +572,7 @@
 
 #define DP_LANE_COUNT_SET  0x101
 # define DP_LANE_COUNT_MASK0x0f
+# define DP_POST_LT_ADJ_REQ_GRANTED (1 << 5) /* 1.3 */
 # define DP_LANE_COUNT_ENHANCED_FRAME_EN(1 << 7)
 
 #define DP_TRAINING_PATTERN_SET0x102
@@ -788,6 +790,7 @@
 
 #define DP_LANE_ALIGN_STATUS_UPDATED0x204
 #define  DP_INTERLANE_ALIGN_DONE(1 << 0)
+#define  DP_POST_LT_ADJ_REQ_IN_PROGRESS (1 << 1) /* 1.3 */
 #define  DP_128B132B_DPRX_EQ_INTERLANE_ALIGN_DONE   (1 << 2) /* 2.0 E11 */
 #define  DP_128B132B_DPRX_CDS_INTERLANE_ALIGN_DONE  (1 << 3) /* 2.0 E11 */
 #define  DP_128B132B_LT_FAILED  (1 << 4) /* 2.0 E11 */
-- 
2.45.3
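
As an illustration of how the three new bits might be consumed, here is a
minimal userspace sketch. The bit definitions are copied from the patch above;
the helper names are hypothetical and are not part of the patch or of the DRM
DP helpers. It shows only the bit manipulation on raw DPCD byte values, not the
full POST_LT_ADJ sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as added by the patch above. */
#define DP_POST_LT_ADJ_REQ_SUPPORTED	(1 << 5)	/* in DP_MAX_LANE_COUNT */
#define DP_LANE_COUNT_MASK		0x0f
#define DP_POST_LT_ADJ_REQ_GRANTED	(1 << 5)	/* in DP_LANE_COUNT_SET */
#define DP_POST_LT_ADJ_REQ_IN_PROGRESS	(1 << 1)	/* in DP_LANE_ALIGN_STATUS_UPDATED */

/* Hypothetical helper: does the sink advertise POST_LT_ADJ_REQ support? */
static inline bool sink_supports_post_lt_adj(uint8_t max_lane_count_reg)
{
	return (max_lane_count_reg & DP_POST_LT_ADJ_REQ_SUPPORTED) != 0;
}

/* Hypothetical helper: compose the byte to write to DP_LANE_COUNT_SET. */
static inline uint8_t lane_count_set_value(uint8_t lanes, bool grant)
{
	uint8_t v = lanes & DP_LANE_COUNT_MASK;

	if (grant)
		v |= DP_POST_LT_ADJ_REQ_GRANTED;
	return v;
}

/* Hypothetical helper: is the sink still requesting adjustments? */
static inline bool post_lt_adj_in_progress(uint8_t lane_align_status)
{
	return (lane_align_status & DP_POST_LT_ADJ_REQ_IN_PROGRESS) != 0;
}

static inline void post_lt_adj_selfcheck(void)
{
	/* 4 lanes with the grant bit set is 0x04 | 0x20. */
	assert(lane_count_set_value(4, true) == 0x24);
}
```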



Re: [PATCH 02/17] bitops: Add generic parity calculation for u64

2025-02-27 Thread David Laight
On Thu, 27 Feb 2025 13:05:29 -0500
Yury Norov  wrote:

> On Wed, Feb 26, 2025 at 10:29:11PM +, David Laight wrote:
> > On Mon, 24 Feb 2025 14:27:03 -0500
> > Yury Norov  wrote:
> >   
> > > +#define parity(val)  \
> > > +({   \
> > > + u64 __v = (val);\
> > > + int __ret;  \
> > > + switch (BITS_PER_TYPE(val)) {   \
> > > + case 64:\
> > > + __v ^= __v >> 32;   \
> > > + fallthrough;\
> > > + case 32:\
> > > + __v ^= __v >> 16;   \
> > > + fallthrough;\
> > > + case 16:\
> > > + __v ^= __v >> 8;\
> > > + fallthrough;\
> > > + case 8: \
> > > + __v ^= __v >> 4;\
> > > + __ret =  (0x6996 >> (__v & 0xf)) & 1;   \
> > > + break;  \
> > > + default:\
> > > + BUILD_BUG();\
> > > + }   \
> > > + __ret;  \
> > > +})
> > > +  
> > 
> > You really don't want to do that!
> > gcc makes a right hash of it for x86 (32bit).
> > See https://www.godbolt.org/z/jG8dv3cvs  
> 
> GCC fails to even understand this. Of course, the __v should be an
> __auto_type. But that way GCC fails to understand that case 64 is
> dead code for all smaller types and throws a false-positive
> Wshift-count-overflow. This is a known issue, unfixed for 25 years!

Just do __v ^= __v >> 16 >> 16
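
The double shift avoids the warning because each individual shift count stays
below the width of a 32-bit operand: on a 32-bit type the expression is well
defined and simply yields 0, while on a 64-bit type it still folds in the upper
half. A minimal userspace sketch, assuming a hypothetical helper
`parity_ulong()` that is not part of the posted patch:

```c
#include <assert.h>

/*
 * Hypothetical helper: parity of an unsigned long, written so that it
 * builds without a shift-count warning whether long is 32 or 64 bits wide.
 */
static inline int parity_ulong(unsigned long v)
{
	/*
	 * Two 16-bit shifts instead of one 32-bit shift: this is v ^= 0
	 * when long is 32 bits, and a real fold when long is 64 bits.
	 */
	v ^= v >> 16 >> 16;
	v ^= v >> 16;
	v ^= v >> 8;
	v ^= v >> 4;
	/* Bit n of 0x6996 is the parity of the 4-bit value n. */
	return (0x6996 >> (v & 0xf)) & 1;
}

/* Reference implementation: fold one bit at a time. */
static inline int parity_ulong_slow(unsigned long v)
{
	int p = 0;

	for (; v; v >>= 1)
		p ^= (int)(v & 1);
	return p;
}

static inline void parity_ulong_selfcheck(void)
{
	const unsigned long samples[] = {
		0UL, 1UL, 3UL, 0x80000000UL, 0xffffffffUL, ~0UL
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(parity_ulong(samples[i]) == parity_ulong_slow(samples[i]));
}
```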

> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210
>  
> > You do better using a __v32 after the 64bit xor.  
> 
> It should be an __auto_type. I already mentioned. So because of that,
> we can either do something like this:
> 
>   #define parity(val) \
>   ({  \
>   #ifdef CLANG  \
>   __auto_type __v = (val);\
>   #else /* GCC; because of this and that */ \
>   u64 __v = (val);\
>   #endif\
>   int __ret;  \
> 
> Or simply disable Wshift-count-overflow for GCC.

For 64bit values on 32bit it is probably better to do:
int p32(unsigned long long x)
{
unsigned int lo = x;
lo ^= x >> 32;
lo ^= lo >> 16;
lo ^= lo >> 8;
lo ^= lo >> 4;
return (0x6996 >> (lo & 0xf)) & 1;
}
That stops the compiler doing 64bit shifts (ok on x86, but probably not 
elsewhere).
It is likely to be reasonably optimal for most 64bit cpu as well.
(For x86-64 it probably removes a load of REX prefixes.)
(It adds an extra instruction to arm because of its barrel shifter.)
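
For readers wondering where 0x6996 comes from: it can be derived as a 16-entry
lookup table in which bit n holds the parity of the nibble n. The sketch below
rebuilds that constant and uses it in a fixed-width variant of p32(); the
function names are hypothetical and the code is only meant to check the
reasoning, not to suggest an implementation for the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Build the nibble parity table: bit n is set when n has an odd popcount. */
static inline unsigned int nibble_parity_table(void)
{
	unsigned int table = 0, n;

	for (n = 0; n < 16; n++) {
		unsigned int bits = n, p = 0;

		for (; bits; bits >>= 1)
			p ^= bits & 1;
		table |= p << n;
	}
	return table;
}

/* Same shape as p32(): one 64-bit shift, then only 32-bit operations. */
static inline int parity64_via_u32(uint64_t x)
{
	uint32_t lo = (uint32_t)x;

	lo ^= (uint32_t)(x >> 32);	/* the only 64-bit operation */
	lo ^= lo >> 16;
	lo ^= lo >> 8;
	lo ^= lo >> 4;
	return (int)((nibble_parity_table() >> (lo & 0xf)) & 1);
}

static inline void parity64_selfcheck(void)
{
	/* The derived table matches the magic constant used in the thread. */
	assert(nibble_parity_table() == 0x6996);
}
```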


> 
> > Even the 64bit version is probably sub-optimal (both gcc and clang).
> > The whole lot ends up being a big single register dependency chain.
> > You want to do:  
> 
> No, I don't. I want to have a sane compiler that does it for me.
> 
> > mov %eax, %edx
> > shrl $n, %eax
> > xor %edx, %eax
> > so that the 'mov' and 'shrl' can happen in the same clock
> > (without relying on the register-register move being optimised out).
> > 
> > I dropped in the arm64 for an example of where the magic shift of 6996
> > just adds an extra instruction.  
> 
> It's still unclear to me that this parity thing is used in hot paths.
> If that holds, it's unclear that your hand-made version is better than
> what's generated by GCC.

I wasn't seriously considering doing that optimisation.
Perhaps just hoping it might make a compiler person think :-)

David

> 
> Do you have any perf test?
> 
> Thanks,
> Yury



[PATCH 6/9] drm/panthor: Reset GPU after L2 cache power off

2025-02-27 Thread Marek Vasut
This seems necessary on Freescale i.MX95 Mali G310 to reliably resume
from runtime PM suspend. Without this, if only the L2 is powered down
on RPM entry, the GPU gets stuck and does not indicate the firmware is
booted after RPM resume.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/gpu/drm/panthor/panthor_gpu.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c 
b/drivers/gpu/drm/panthor/panthor_gpu.c
index 671049020afaa..0f07ef7d9aea7 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -470,11 +470,12 @@ int panthor_gpu_soft_reset(struct panthor_device *ptdev)
  */
 void panthor_gpu_suspend(struct panthor_device *ptdev)
 {
-   /* On a fast reset, simply power down the L2. */
-   if (!ptdev->reset.fast)
-   panthor_gpu_soft_reset(ptdev);
-   else
-   panthor_gpu_power_off(ptdev, L2, 1, 2);
+   /*
+* Power off the L2 and soft reset the GPU, that makes
+* iMX95 Mali G310 resume without firmware boot timeout.
+*/
+   panthor_gpu_power_off(ptdev, L2, 1, 2);
+   panthor_gpu_soft_reset(ptdev);
 
panthor_gpu_irq_suspend(&ptdev->gpu->irq);
 }
-- 
2.47.2



[PATCH 0/9] arm64: dts: imx95: Add support for Mali G310 GPU

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in i.MX95 is the G310.
Add support for the GPUMIX reset via simple-reset driver,
add reset and multiple power domains support into panthor
GPU driver, add iMX95 GPU support into panthor driver and
describe the iMX95 GPU in imx95.dtsi DT.

Marek Vasut (9):
  dt-bindings: reset: imx95-gpu-blk-ctrl: Document Freescale i.MX95 GPU
reset
  reset: simple: Add support for Freescale i.MX95 GPU reset
  dt-bindings: gpu: mali-valhall-csf: Document optional reset
  drm/panthor: Implement optional reset
  drm/panthor: Implement support for multiple power domains
  drm/panthor: Reset GPU after L2 cache power off
  dt-bindings: gpu: mali-valhall-csf: Document i.MX95 support
  drm/panthor: Add i.MX95 support
  arm64: dts: imx95: Describe Mali G310 GPU

 .../bindings/gpu/arm,mali-valhall-csf.yaml|  4 +
 .../reset/fsl,imx95-gpu-blk-ctrl.yaml | 49 
 arch/arm64/boot/dts/freescale/imx95.dtsi  | 62 +++
 drivers/gpu/drm/panthor/Kconfig   |  1 +
 drivers/gpu/drm/panthor/panthor_device.c  | 79 +++
 drivers/gpu/drm/panthor/panthor_device.h  |  8 ++
 drivers/gpu/drm/panthor/panthor_drv.c |  1 +
 drivers/gpu/drm/panthor/panthor_gpu.c | 12 +--
 drivers/reset/reset-simple.c  |  8 ++
 9 files changed, 219 insertions(+), 5 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml

---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org

-- 
2.47.2



[PATCH 4/9] drm/panthor: Implement optional reset

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 does require
release from reset by writing into a single GPUMIX block controller
GPURESET register bit 0. Implement support for one optional reset.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/gpu/drm/panthor/Kconfig  |  1 +
 drivers/gpu/drm/panthor/panthor_device.c | 23 +++
 drivers/gpu/drm/panthor/panthor_device.h |  3 +++
 3 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/panthor/Kconfig b/drivers/gpu/drm/panthor/Kconfig
index 55b40ad07f3b0..ab62bd6a0750f 100644
--- a/drivers/gpu/drm/panthor/Kconfig
+++ b/drivers/gpu/drm/panthor/Kconfig
@@ -14,6 +14,7 @@ config DRM_PANTHOR
select IOMMU_IO_PGTABLE_LPAE
select IOMMU_SUPPORT
select PM_DEVFREQ
+   select RESET_SIMPLE if SOC_IMX9
help
  DRM driver for ARM Mali CSF-based GPUs.
 
diff --git a/drivers/gpu/drm/panthor/panthor_device.c 
b/drivers/gpu/drm/panthor/panthor_device.c
index a9da1d1eeb707..51ee9cae94504 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -64,6 +64,17 @@ static int panthor_clk_init(struct panthor_device *ptdev)
return 0;
 }
 
+static int panthor_reset_init(struct panthor_device *ptdev)
+{
+   ptdev->resets = 
devm_reset_control_get_optional_exclusive_deasserted(ptdev->base.dev, NULL);
+   if (IS_ERR(ptdev->resets))
+   return dev_err_probe(ptdev->base.dev,
+PTR_ERR(ptdev->resets),
+"get reset failed");
+
+   return 0;
+}
+
 void panthor_device_unplug(struct panthor_device *ptdev)
 {
/* This function can be called from two different path: the reset work
@@ -217,6 +228,10 @@ int panthor_device_init(struct panthor_device *ptdev)
if (ret)
return ret;
 
+   ret = panthor_reset_init(ptdev);
+   if (ret)
+   return ret;
+
ret = panthor_devfreq_init(ptdev);
if (ret)
return ret;
@@ -470,6 +485,10 @@ int panthor_device_resume(struct device *dev)
if (ret)
goto err_disable_stacks_clk;
 
+   ret = reset_control_deassert(ptdev->resets);
+   if (ret)
+   goto err_disable_coregroup_clk;
+
panthor_devfreq_resume(ptdev);
 
if (panthor_device_is_initialized(ptdev) &&
@@ -512,6 +531,9 @@ int panthor_device_resume(struct device *dev)
 
 err_suspend_devfreq:
panthor_devfreq_suspend(ptdev);
+   reset_control_assert(ptdev->resets);
+
+err_disable_coregroup_clk:
clk_disable_unprepare(ptdev->clks.coregroup);
 
 err_disable_stacks_clk:
@@ -563,6 +585,7 @@ int panthor_device_suspend(struct device *dev)
 
panthor_devfreq_suspend(ptdev);
 
+   reset_control_assert(ptdev->resets);
clk_disable_unprepare(ptdev->clks.coregroup);
clk_disable_unprepare(ptdev->clks.stacks);
clk_disable_unprepare(ptdev->clks.core);
diff --git a/drivers/gpu/drm/panthor/panthor_device.h 
b/drivers/gpu/drm/panthor/panthor_device.h
index da6574021664b..fea3a05778e2e 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -111,6 +111,9 @@ struct panthor_device {
struct clk *coregroup;
} clks;
 
+   /** @resets: GPU reset. */
+   struct reset_control *resets;
+
/** @coherent: True if the CPU/GPU are memory coherent. */
bool coherent;
 
-- 
2.47.2
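
The ordering this patch adds to the resume path (deassert the reset only after
the clocks are on, and assert it again before the clocks are turned off on
failure) can be modelled in userspace. This is a sketch with stub functions
that only record call order; none of the names below exist in panthor, and the
"firmware resume" step stands in for whatever follows the reset deassert.

```c
#include <assert.h>
#include <string.h>

/* Call log so the ordering can be inspected afterwards. */
static char call_log[128];

static void log_call(const char *name)
{
	strcat(call_log, name);
	strcat(call_log, ";");
}

/* Stubs standing in for clock, reset and later resume steps. */
static int clk_enable_stub(void) { log_call("clk_on"); return 0; }
static void clk_disable_stub(void) { log_call("clk_off"); }
static int reset_deassert_stub(void) { log_call("reset_deassert"); return 0; }
static void reset_assert_stub(void) { log_call("reset_assert"); }
static int fw_resume_stub(int fail) { log_call("fw_resume"); return fail ? -1 : 0; }

/* Model of the resume path: unwind in reverse order of setup. */
int model_resume(int fw_fails)
{
	int ret;

	call_log[0] = '\0';

	ret = clk_enable_stub();
	if (ret)
		return ret;

	ret = reset_deassert_stub();
	if (ret)
		goto err_disable_clk;

	ret = fw_resume_stub(fw_fails);
	if (ret)
		goto err_assert_reset;

	return 0;

err_assert_reset:
	reset_assert_stub();
err_disable_clk:
	clk_disable_stub();
	return ret;
}
```

Calling `model_resume(1)` leaves the reset asserted and the clock disabled, in
that order, which mirrors the error labels added in the diff above.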



[PATCH 7/9] dt-bindings: gpu: mali-valhall-csf: Document i.MX95 support

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 is the
Mali G310, document support for this variant.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml 
b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
index 0efa06822a543..3ab62bd424e41 100644
--- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
+++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
@@ -18,6 +18,7 @@ properties:
 oneOf:
   - items:
   - enum:
+  - fsl,imx95-mali# G310
   - rockchip,rk3588-mali
   - const: arm,mali-valhall-csf   # Mali Valhall GPU model/revision is 
fully discoverable
 
-- 
2.47.2



RE: [PATCH 6/6] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

2025-02-27 Thread Cavitt, Jonathan
Some responses below.  If I skip over anything, just assume that I'm taking the
request into consideration and that it will be fixed for version 2 of this
patch series.

-Original Message-
From: Brost, Matthew  
Sent: Thursday, February 27, 2025 12:25 AM
To: Cavitt, Jonathan 
Cc: intel...@lists.freedesktop.org; Gupta, saurabhg ; 
Zuo, Alex ; joonas.lahti...@linux.intel.com; Zhang, Jianxun 
; dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 6/6] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
> 
> On Wed, Feb 26, 2025 at 10:55:56PM +, Jonathan Cavitt wrote:
> > Add support for userspace to get various properties from a specified VM.
> > The currently supported properties are:
> > 
> > - The number of engine resets the VM has observed
> > - The number of exec queue bans the VM has observed, up to the last 50
> >   relevant ones, and how many of those were caused by faults.
> > 
> > The latter request also includes information on the exec queue bans,
> > such as the ID of the banned exec queue, whether the ban was caused by a
> > pagefault or not, and the address and address type of the associated
> > fault (if one exists).
> > 
> 
> > Signed-off-by: Jonathan Cavitt 
> > Suggested-by: Matthew Brost 
> > ---
[...]
> 
> > +
> > +struct drm_xe_ban {
> > +   /** @exec_queue_id: ID of banned exec queue */
> > +   __u32 exec_queue_id;
> 
> I don't think we can reliably associate a page fault with an
> exec_queue_id at the moment, given my above statement about having to
> capture all state at the time of the page fault. Maybe we could with
> some tricks between the page fault and the IOMMU CAT error G2H?
> Regardless, let's ask the UMD we are targeting [1] if this information
> would be helpful. It would seemingly have to be vendor-specific
> information, not part of the generic Vk information.
> 
> Additionally, it might be good to ask what other vendor-specific
> information, if any, we'd need here based on what the current page fault
> interface supports.
> 
> [1] 
> https://registry.khronos.org/vulkan/specs/latest/man/html/VK_EXT_device_fault.html

The original request was something along the lines of having a mirror of the
DRM_IOCTL_I915_GET_RESET_STATS on XeKMD.  Those reset stats contain
information on the "context" ID, which maps to the exec queue ID on XeKMD.

Even if we can't reasonably blame a pagefault on a particular exec queue, in
order to match the request correctly, this information needs to be returned.

The I915 reset stats also contain information on the number of observed engine
resets, so that needs to be returned as well.

@joonas.lahti...@linux.intel.com can provide more details.  Or maybe
@Mistat, Tomasz .

> 
> > +   /** @faulted: Whether or not the ban has an associated pagefault.  0 is 
> > no, 1 is yes */
> > +   __u32 faulted;
> > +   /** @address: Address of the fault, if relevant */
> > +   __u64 address;
> > +   /** @address_type: enum drm_xe_fault_address_type, if relevant */
> > +   __u32 address_type;
> 
> We likely need a fault_size field to support VkDeviceSize
> addressPrecision; as defined here [2]. I believe we can extract this
> information from pagefault.fault_level.
> 
> [2] 
> https://registry.khronos.org/vulkan/specs/latest/man/html/VkDeviceFaultAddressInfoEXT.html

I can add this field as a prototype, though it will always return SZ_4K until we
can have a longer discussion on how to map between the fault_level and the
fault_size.

> 
> > +   /** @pad: MBZ */
> > +   __u32 pad;
> > +   /** @reserved: MBZ */
> > +   __u64 reserved[3];
> > +};
> > +
> > +struct drm_xe_faults {
> > +   /** @num_faults: Number of faults observed on the VM */
> > +   __u32 num_faults;
> > +   /** @num_bans: Number of bans observed on the VM */
> > +   __u32 num_bans;
> 
> I don't think num_bans and num_faults really provide any benefit for
> > supporting [1]. The requirement for [1] is device faults, nothing more.
> With that in mind, I'd lean toward an array of a single structure
> (returned in drm_xe_vm_get_property.data, number of faults can be
> inferred from the returned size) reporting all faults, with each entry
> containing all the fault information. If another use case arises for
> reporting all banned queues, we can add a property for that.

I'm fairly certain the full ban list was directly requested, but I can break
it into a third query at least.

Also, the abstraction is done here because that's how copy_from_user
has historically been used.  I'd rather not experiment with trying to
copy_from_user a structure array and bungling it, but I guess I can give
it a try at least...

> 
> > +   /** @reserved: MBZ */
> > +   __u64 reserved[2];
> > +   /** @list: Dynamic sized array of drm_xe_ban bans */
> > +   struct drm_xe_ban list[];
> 
> list[0] would be the preferred way.

That is not how dynamic arrays are handled for
struct drm_xe_query_engines,
struct drm_xe_query_mem_regions,
struct drm_xe_query_config,
struct drm_xe_query_gt_list,
struct drm_xe_query_topology_mask,

[PATCH 9/9] arm64: dts: imx95: Describe Mali G310 GPU

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in i.MX95 is the G310,
describe this GPU in the DT. Include description of the
GPUMIX block controller, which can be operated as a simple
reset. Include dummy GPU voltage regulator and OPP tables.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 arch/arm64/boot/dts/freescale/imx95.dtsi | 62 
 1 file changed, 62 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi 
b/arch/arm64/boot/dts/freescale/imx95.dtsi
index 3af13173de4bd..36bad211e5558 100644
--- a/arch/arm64/boot/dts/freescale/imx95.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
@@ -249,6 +249,37 @@ dummy: clock-dummy {
clock-output-names = "dummy";
};
 
+   gpu_fixed_reg: fixed-gpu-reg {
+   compatible = "regulator-fixed";
+   regulator-min-microvolt = <92>;
+   regulator-max-microvolt = <92>;
+   regulator-name = "vdd_gpu";
+   regulator-always-on;
+   regulator-boot-on;
+   };
+
+   gpu_opp_table: opp_table {
+   compatible = "operating-points-v2";
+
+   opp-5 {
+   opp-hz = /bits/ 64 <5>;
+   opp-hz-real = /bits/ 64 <5>;
+   opp-microvolt = <92>;
+   };
+
+   opp-8 {
+   opp-hz = /bits/ 64 <8>;
+   opp-hz-real = /bits/ 64 <8>;
+   opp-microvolt = <92>;
+   };
+
+   opp-10 {
+   opp-hz = /bits/ 64 <10>;
+   opp-hz-real = /bits/ 64 <10>;
+   opp-microvolt = <92>;
+   };
+   };
+
clk_ext1: clock-ext1 {
compatible = "fixed-clock";
#clock-cells = <0>;
@@ -1846,6 +1877,37 @@ netc_emdio: mdio@0,0 {
};
};
 
+   gpu_blk_ctrl: reset-controller@4d81 {
+   compatible = "fsl,imx95-gpu-blk-ctrl";
+   reg = <0x0 0x4d81 0x0 0xc>;
+   #reset-cells = <1>;
+   clocks = <&scmi_clk IMX95_CLK_GPUAPB>;
+   assigned-clocks = <&scmi_clk IMX95_CLK_GPUAPB>;
+   assigned-clock-parents = <&scmi_clk 
IMX95_CLK_SYSPLL1_PFD1_DIV2>;
+   assigned-clock-rates = <1>;
+   power-domains = <&scmi_devpd IMX95_PD_GPU>;
+   status = "disabled";
+   };
+
+   gpu: gpu@4d90 {
+   compatible = "fsl,imx95-mali", "arm,mali-valhall-csf";
+   reg = <0 0x4d90 0 0x48>;
+   clocks = <&scmi_clk IMX95_CLK_GPU>;
+   clock-names = "core";
+   interrupts = ,
+,
+;
+   interrupt-names = "gpu", "job", "mmu";
+   mali-supply = <&gpu_fixed_reg>;
+   operating-points-v2 = <&gpu_opp_table>;
+   power-domains = <&scmi_devpd IMX95_PD_GPU>, <&scmi_perf 
IMX95_PERF_GPU>;
+   power-domain-names = "mix", "perf";
+   resets = <&gpu_blk_ctrl 0>;
+   #cooling-cells = <2>;
+   dynamic-power-coefficient = <1013>;
+   status = "disabled";
+   };
+
ddr-pmu@4e090dc0 {
compatible = "fsl,imx95-ddr-pmu", "fsl,imx93-ddr-pmu";
reg = <0x0 0x4e090dc0 0x0 0x200>;
-- 
2.47.2



[PATCH 5/9] drm/panthor: Implement support for multiple power domains

2025-02-27 Thread Marek Vasut
The driver core binding of power domains to driver instances only works
for a single power domain; in case there are multiple power domains,
it is necessary to explicitly attach via dev_pm_domain_attach*().
As the DT bindings list support for up to 5 power domains, add support
for attaching them all. This is useful on Freescale i.MX95, which
has two power domains.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/gpu/drm/panthor/panthor_device.c | 56 
 drivers/gpu/drm/panthor/panthor_device.h |  5 +++
 2 files changed, 61 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_device.c 
b/drivers/gpu/drm/panthor/panthor_device.c
index 51ee9cae94504..4348b7e917b64 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -75,6 +75,58 @@ static int panthor_reset_init(struct panthor_device *ptdev)
return 0;
 }
 
+/* Generic power domain handling code, see drivers/gpu/drm/tiny/simpledrm.c */
+static void panthor_detach_genpd(void *res)
+{
+   struct panthor_device *ptdev = res;
+   int i;
+
+   if (ptdev->pwr_dom_count <= 1)
+   return;
+
+   for (i = ptdev->pwr_dom_count - 1; i >= 0; i--)
+   dev_pm_domain_detach(ptdev->pwr_dom_devs[i], true);
+}
+
+static int panthor_genpd_init(struct panthor_device *ptdev)
+{
+   struct device *dev = ptdev->base.dev;
+   int i, ret;
+
+   ptdev->pwr_dom_count = of_count_phandle_with_args(dev->of_node, "power-domains",
+ "#power-domain-cells");
+   /*
+* Single power-domain devices are handled by the driver core, nothing
+* to do here. The same goes for device nodes without a "power-domains"
+* property.
+*/
+   if (ptdev->pwr_dom_count <= 1)
+   return 0;
+
+   if (ptdev->pwr_dom_count > ARRAY_SIZE(ptdev->pwr_dom_devs)) {
+   drm_warn(&ptdev->base, "Too many power domains (%d) for this device\n",
+ptdev->pwr_dom_count);
+   return -EINVAL;
+   }
+
+   for (i = 0; i < ptdev->pwr_dom_count; i++) {
+   ptdev->pwr_dom_devs[i] = dev_pm_domain_attach_by_id(dev, i);
+   if (!IS_ERR(ptdev->pwr_dom_devs[i]))
+   continue;
+
+   ret = PTR_ERR(ptdev->pwr_dom_devs[i]);
+   if (ret != -EPROBE_DEFER) {
+   drm_warn(&ptdev->base, "pm_domain_attach_by_id(%u) failed: %d\n", i, ret);
+   continue;
+   }
+
+   /* Missing dependency, try again. */
+   panthor_detach_genpd(ptdev);
+   return ret;
+   }
+
+   return devm_add_action_or_reset(dev, panthor_detach_genpd, ptdev);
+}
+
 void panthor_device_unplug(struct panthor_device *ptdev)
 {
/* This function can be called from two different path: the reset work
@@ -232,6 +284,10 @@ int panthor_device_init(struct panthor_device *ptdev)
if (ret)
return ret;
 
+   ret = panthor_genpd_init(ptdev);
+   if (ret)
+   return ret;
+
ret = panthor_devfreq_init(ptdev);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_device.h 
b/drivers/gpu/drm/panthor/panthor_device.h
index fea3a05778e2e..7fb65447253e9 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -114,6 +114,11 @@ struct panthor_device {
/** @resets: GPU reset. */
struct reset_control *resets;
 
+   /** @pwr_dom_count: Power domain count */
+   int pwr_dom_count;
+   /** @pwr_dom_devs: Power domain devices */
+   struct device *pwr_dom_devs[5];
+
/** @coherent: True if the CPU/GPU are memory coherent. */
bool coherent;
 
-- 
2.47.2
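
The attach flow in the patch above can be sketched in user space roughly as follows. This is a minimal mock under stated assumptions, not the kernel API: `struct fake_dev`, `fake_attach_by_id()` and `fake_detach_all()` are hypothetical stand-ins for the panthor device, dev_pm_domain_attach_by_id() and dev_pm_domain_detach(). It only illustrates the control flow: zero or one domain is left to the driver core, several domains are attached one by one, and a deferred probe undoes the domains attached so far.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_DOMAINS 5		/* the binding allows up to 5 power domains */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace <errno.h> */
#endif

/* Hypothetical device model, for illustration only. */
struct fake_dev {
	int domain_count;		/* number of "power-domains" entries */
	int defer_at;			/* index that reports a missing dependency, or -1 */
	int attached[MAX_DOMAINS];	/* 1 if domain i is currently attached */
};

/* Stand-in for dev_pm_domain_attach_by_id(): returns 0 or a negative errno. */
static int fake_attach_by_id(struct fake_dev *dev, int i)
{
	if (i == dev->defer_at)
		return -EPROBE_DEFER;
	dev->attached[i] = 1;
	return 0;
}

/* Stand-in for the detach loop: release domains [0, upto) in reverse order. */
static void fake_detach_all(struct fake_dev *dev, int upto)
{
	for (int i = upto - 1; i >= 0; i--)
		dev->attached[i] = 0;
}

static int fake_genpd_init(struct fake_dev *dev)
{
	/* Zero or one domain: nothing to attach explicitly. */
	if (dev->domain_count <= 1)
		return 0;

	if (dev->domain_count > MAX_DOMAINS)
		return -EINVAL;

	for (int i = 0; i < dev->domain_count; i++) {
		int ret = fake_attach_by_id(dev, i);

		if (ret == -EPROBE_DEFER) {
			/* Missing dependency: undo and let the caller retry later. */
			fake_detach_all(dev, i);
			return ret;
		}
	}
	return 0;
}

/* Helper for inspecting the mock state. */
static int fake_count_attached(const struct fake_dev *dev)
{
	int n = 0;

	for (int i = 0; i < MAX_DOMAINS; i++)
		n += dev->attached[i];
	return n;
}
```

Calling fake_genpd_init() on a device with two domains attaches both, while a device whose second domain defers ends up with nothing attached and -EPROBE_DEFER as the result.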



[PATCH 8/9] drm/panthor: Add i.MX95 support

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 is the
Mali G310. Add support for this variant.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/gpu/drm/panthor/panthor_drv.c | 1 +
 drivers/gpu/drm/panthor/panthor_gpu.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/panthor/panthor_drv.c 
b/drivers/gpu/drm/panthor/panthor_drv.c
index 06fe46e320738..2504a456d45c4 100644
--- a/drivers/gpu/drm/panthor/panthor_drv.c
+++ b/drivers/gpu/drm/panthor/panthor_drv.c
@@ -1591,6 +1591,7 @@ static struct attribute *panthor_attrs[] = {
 ATTRIBUTE_GROUPS(panthor);
 
 static const struct of_device_id dt_match[] = {
+   { .compatible = "fsl,imx95-mali" }, /* G310 */
{ .compatible = "rockchip,rk3588-mali" },
{ .compatible = "arm,mali-valhall-csf" },
{}
diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c 
b/drivers/gpu/drm/panthor/panthor_gpu.c
index 0f07ef7d9aea7..2371ab8e50627 100644
--- a/drivers/gpu/drm/panthor/panthor_gpu.c
+++ b/drivers/gpu/drm/panthor/panthor_gpu.c
@@ -67,6 +67,7 @@ struct panthor_model {
 }
 
 static const struct panthor_model gpu_models[] = {
+   GPU_MODEL(g310, 0, 0),  /* NXP i.MX95 */
GPU_MODEL(g610, 10, 7),
{},
 };
-- 
2.47.2



[PATCH 1/9] dt-bindings: reset: imx95-gpu-blk-ctrl: Document Freescale i.MX95 GPU reset

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 must be released
from reset by writing bit 0 of the GPURESET register in the GPUMIX block
controller. Document support for this reset register.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 .../reset/fsl,imx95-gpu-blk-ctrl.yaml | 49 +++
 1 file changed, 49 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml

diff --git 
a/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml 
b/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml
new file mode 100644
index 0..dc701bd556c0b
--- /dev/null
+++ b/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/reset/fsl,imx95-gpu-blk-ctrl.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX95 GPU Block Controller
+
+maintainers:
+  - Marek Vasut 
+
+description: |
+  This reset controller is a block of ad-hoc debug registers, one of
+  which is a single-bit GPU reset.
+
+properties:
+  compatible:
+- const: fsl,imx95-gpu-blk-ctrl
+
+  reg:
+maxItems: 1
+
+  clocks:
+maxItems: 1
+
+  power-domains:
+maxItems: 1
+
+  '#reset-cells':
+const: 1
+
+required:
+  - compatible
+  - reg
+  - clocks
+  - power-domains
+  - '#reset-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+reset-controller@4d81 {
+compatible = "fsl,imx95-gpu-blk-ctrl";
+reg = <0x0 0x4d81 0x0 0xc>;
+clocks = <&scmi_clk IMX95_CLK_GPUAPB>;
+power-domains = <&scmi_devpd IMX95_PD_GPU>;
+#reset-cells = <1>;
+};
-- 
2.47.2



[PATCH 3/9] dt-bindings: gpu: mali-valhall-csf: Document optional reset

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 must be released
from reset by writing bit 0 of the GPURESET register in the GPUMIX block
controller. Document support for one optional reset.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 .../devicetree/bindings/gpu/arm,mali-valhall-csf.yaml  | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml 
b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
index a5b4e00217587..0efa06822a543 100644
--- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
+++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
@@ -61,6 +61,9 @@ properties:
 minItems: 1
 maxItems: 5
 
+  resets:
+maxItems: 1
+
   sram-supply: true
 
   "#cooling-cells":
-- 
2.47.2



[PATCH 2/4] rust: add #[export] macro

2025-02-27 Thread Alice Ryhl
This macro behaves like #[no_mangle], but also performs an assertion
that the Rust function has the same signature as what is declared in the
C header.

If the signatures don't match, you will get errors that look like this:

error[E0308]: `if` and `else` have incompatible types
  --> /rust/kernel/print.rs:22:22
   |
21 | #[export]
   | - expected because of this
22 | unsafe extern "C" fn rust_fmt_argument(
   |  ^ expected `u8`, found `i8`
   |
   = note: expected fn item `unsafe extern "C" fn(*mut u8, *mut u8, *mut c_void) -> *mut u8 {bindings::rust_fmt_argument}`
  found fn item `unsafe extern "C" fn(*mut i8, *mut i8, *const c_void) -> *mut i8 {print::rust_fmt_argument}`

Signed-off-by: Alice Ryhl 
---
 rust/kernel/prelude.rs |  2 +-
 rust/macros/export.rs  | 25 +
 rust/macros/helpers.rs | 19 ++-
 rust/macros/lib.rs | 18 ++
 rust/macros/quote.rs   | 21 +++--
 5 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/rust/kernel/prelude.rs b/rust/kernel/prelude.rs
index dde2e0649790..889102f5a81e 100644
--- a/rust/kernel/prelude.rs
+++ b/rust/kernel/prelude.rs
@@ -17,7 +17,7 @@
 pub use crate::alloc::{flags::*, Box, KBox, KVBox, KVVec, KVec, VBox, VVec, 
Vec};
 
 #[doc(no_inline)]
-pub use macros::{module, pin_data, pinned_drop, vtable, Zeroable};
+pub use macros::{export, module, pin_data, pinned_drop, vtable, Zeroable};
 
 pub use super::{build_assert, build_error};
 
diff --git a/rust/macros/export.rs b/rust/macros/export.rs
new file mode 100644
index ..3398e1655124
--- /dev/null
+++ b/rust/macros/export.rs
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use crate::helpers::function_name;
+use proc_macro::TokenStream;
+
+pub(crate) fn export(_attr: TokenStream, ts: TokenStream) -> TokenStream {
+let Some(name) = function_name(ts.clone()) else {
+return "::core::compile_error!(\"The #[export] attribute must be used on a function.\");"
+.parse::<TokenStream>()
+.unwrap();
+};
+
+let signature_check = quote!(
+const _: () = {
+if true {
+::kernel::bindings::#name
+} else {
+#name
+};
+};
+);
+
+let no_mangle = "#[no_mangle]".parse::<TokenStream>().unwrap();
+TokenStream::from_iter([signature_check, no_mangle, ts])
+}
diff --git a/rust/macros/helpers.rs b/rust/macros/helpers.rs
index 563dcd2b7ace..3e04f8ecfc74 100644
--- a/rust/macros/helpers.rs
+++ b/rust/macros/helpers.rs
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use proc_macro::{token_stream, Group, TokenStream, TokenTree};
+use proc_macro::{token_stream, Group, Ident, TokenStream, TokenTree};
 
 pub(crate) fn try_ident(it: &mut token_stream::IntoIter) -> Option {
 if let Some(TokenTree::Ident(ident)) = it.next() {
@@ -215,3 +215,20 @@ pub(crate) fn parse_generics(input: TokenStream) -> 
(Generics, Vec) {
 rest,
 )
 }
+
+/// Given a function declaration, finds the name of the function.
+pub(crate) fn function_name(input: TokenStream) -> Option<Ident> {
+let mut input = input.into_iter();
+while let Some(token) = input.next() {
+match token {
+TokenTree::Ident(i) if i.to_string() == "fn" => {
+if let Some(TokenTree::Ident(i)) = input.next() {
+return Some(i);
+}
+return None;
+}
+_ => continue,
+}
+}
+None
+}
diff --git a/rust/macros/lib.rs b/rust/macros/lib.rs
index d61bc6a56425..3cbf7705c4c1 100644
--- a/rust/macros/lib.rs
+++ b/rust/macros/lib.rs
@@ -9,6 +9,7 @@
 #[macro_use]
 mod quote;
 mod concat_idents;
+mod export;
 mod helpers;
 mod module;
 mod paste;
@@ -174,6 +175,23 @@ pub fn vtable(attr: TokenStream, ts: TokenStream) -> 
TokenStream {
 vtable::vtable(attr, ts)
 }
 
+/// Export a function so that C code can call it.
+///
+/// This macro has the following effect:
+///
+/// * Disables name mangling for this function.
+/// * Verifies at compile-time that the function signature matches what's in the header file.
+///
+/// This macro requires that the function is mentioned in a C header file, and that the header file
+/// is included in `rust/bindings/bindings_helper.h`.
+///
+/// This macro is *not* the same as the C macro `EXPORT_SYMBOL*`, since all Rust symbols are
+/// currently automatically exported with `EXPORT_SYMBOL_GPL`.
+#[proc_macro_attribute]
+pub fn export(attr: TokenStream, ts: TokenStream) -> TokenStream {
+export::export(attr, ts)
+}
+
 /// Concatenate two identifiers.
 ///
 /// This is useful in macros that need to declare or reference items with names
diff --git a/rust/macros/quote.rs b/rust/macros/quote.rs
index 33a199e4f176..c18960a91082 100644
--- a/rust/macros/quote.rs
+++ b/rust/macros/quote.rs
@@ -20,6 +20,12 @@ fn to_tokens(&self, tokens: &mut TokenStream) {
 }
 }
 
+impl ToToke

[PATCH 0/4] Check Rust signatures at compile time

2025-02-27 Thread Alice Ryhl
Signed-off-by: Alice Ryhl 
---
Alice Ryhl (4):
  rust: fix signature of rust_fmt_argument
  rust: add #[export] macro
  print: use new #[export] macro for rust_fmt_argument
  panic_qr: use new #[export] macro

 drivers/gpu/drm/drm_panic.c |  5 -
 drivers/gpu/drm/drm_panic_qr.rs | 15 +++
 include/drm/drm_panic.h |  7 +++
 include/linux/sprintf.h |  3 +++
 lib/vsprintf.c  |  3 ---
 rust/bindings/bindings_helper.h |  4 
 rust/kernel/prelude.rs  |  2 +-
 rust/kernel/print.rs| 11 ++-
 rust/macros/export.rs   | 25 +
 rust/macros/helpers.rs  | 19 ++-
 rust/macros/lib.rs  | 18 ++
 rust/macros/quote.rs| 21 +++--
 12 files changed, 112 insertions(+), 21 deletions(-)
---
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
change-id: 20250227-export-macro-9aa9f1016d8c

Best regards,
-- 
Alice Ryhl 



[PATCH 1/4] rust: fix signature of rust_fmt_argument

2025-02-27 Thread Alice Ryhl
Without this change, the rest of this series will emit the following
error message:

error[E0308]: `if` and `else` have incompatible types
  --> /rust/kernel/print.rs:22:22
   |
21 | #[export]
   | - expected because of this
22 | unsafe extern "C" fn rust_fmt_argument(
   |  ^ expected `u8`, found `i8`
   |
   = note: expected fn item `unsafe extern "C" fn(*mut u8, *mut u8, *mut c_void) -> *mut u8 {bindings::rust_fmt_argument}`
  found fn item `unsafe extern "C" fn(*mut i8, *mut i8, *const c_void) -> *mut i8 {print::rust_fmt_argument}`

The error may be different depending on the architecture.

Fixes: 787983da7718 ("vsprintf: add new `%pA` format specifier")
Signed-off-by: Alice Ryhl 
---
 lib/vsprintf.c   | 2 +-
 rust/kernel/print.rs | 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 56fe96319292..a8ac4c4fffcf 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2285,7 +2285,7 @@ int __init no_hash_pointers_enable(char *str)
 early_param("no_hash_pointers", no_hash_pointers_enable);
 
 /* Used for Rust formatting ('%pA'). */
-char *rust_fmt_argument(char *buf, char *end, void *ptr);
+char *rust_fmt_argument(char *buf, char *end, const void *ptr);
 
 /*
  * Show a '%p' thing.  A kernel extension is that the '%p' is followed
diff --git a/rust/kernel/print.rs b/rust/kernel/print.rs
index b19ee490be58..8551631dedf1 100644
--- a/rust/kernel/print.rs
+++ b/rust/kernel/print.rs
@@ -6,13 +6,13 @@
 //!
 //! Reference: 
 
-use core::{
+use core::fmt;
+
+use crate::{
 ffi::{c_char, c_void},
-fmt,
+str::RawFormatter,
 };
 
-use crate::str::RawFormatter;
-
 // Called from `vsprintf` with format specifier `%pA`.
 #[expect(clippy::missing_safety_doc)]
 #[no_mangle]

-- 
2.48.1.658.g4767266eb4-goog
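
The series performs its check on the Rust side, but the underlying idea (fail the build when a definition's type differs from the declared one) can also be sketched in plain C11. This is only an analogue under assumptions, not what the series does: `fmt_argument_fn`, `SIGNATURE_MATCHES`, `CHECK_SIGNATURE` and the `demo_*` functions are hypothetical names, and `_Generic` is used here to compare a function's type with the expected pointer-to-function type.

```c
#include <assert.h>
#include <stddef.h>

/* Expected type, mirroring the fixed prototype with a const pointer argument. */
typedef char *(*fmt_argument_fn)(char *buf, char *end, const void *ptr);

/* Evaluates to 1 if &fn has exactly the expected type, 0 otherwise. */
#define SIGNATURE_MATCHES(fn, type) _Generic(&(fn), type: 1, default: 0)

/* Breaks the build when the signature does not match. */
#define CHECK_SIGNATURE(fn, type) \
	_Static_assert(SIGNATURE_MATCHES(fn, type), "signature mismatch for " #fn)

/* Definition that agrees with the expected type. */
static char *demo_fmt_argument(char *buf, char *end, const void *ptr)
{
	(void)ptr;
	if (buf < end)
		*buf++ = '?';
	return buf;
}

/* Definition using the pre-fix prototype (non-const pointer argument). */
static char *demo_fmt_argument_old(char *buf, char *end, void *ptr)
{
	(void)end;
	(void)ptr;
	return buf;
}

/* Compiles: the types agree. */
CHECK_SIGNATURE(demo_fmt_argument, fmt_argument_fn);

/*
 * Uncommenting the next line stops the build, because `void *` and
 * `const void *` parameters make the two function types incompatible:
 *
 * CHECK_SIGNATURE(demo_fmt_argument_old, fmt_argument_fn);
 */
```

As with the Rust macro, the check costs nothing at run time; it exists only to turn a silent declaration/definition mismatch into a compile error.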



[PATCH 4/4] panic_qr: use new #[export] macro

2025-02-27 Thread Alice Ryhl
This validates at compile time that the signatures match what is in the
header file. It highlights one annoyance with the compile-time check,
which is that it can only be used with functions marked unsafe.

If the function is not unsafe, then this error is emitted:

error[E0308]: `if` and `else` have incompatible types
   --> /drivers/gpu/drm/drm_panic_qr.rs:987:19
|
986 | #[export]
| - expected because of this
987 | pub extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: usize) -> usize {
|   ^^ expected unsafe fn, found safe fn
|
= note: expected fn item `unsafe extern "C" fn(_, _) -> _ {kernel::bindings::drm_panic_qr_max_data_size}`
   found fn item `extern "C" fn(_, _) -> _ {drm_panic_qr_max_data_size}`

Signed-off-by: Alice Ryhl 
---
 drivers/gpu/drm/drm_panic.c |  5 -
 drivers/gpu/drm/drm_panic_qr.rs | 15 +++
 include/drm/drm_panic.h |  7 +++
 rust/bindings/bindings_helper.h |  4 
 4 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c
index f128d345b16d..dee5301dd729 100644
--- a/drivers/gpu/drm/drm_panic.c
+++ b/drivers/gpu/drm/drm_panic.c
@@ -486,11 +486,6 @@ static void drm_panic_qr_exit(void)
stream.workspace = NULL;
 }
 
-extern size_t drm_panic_qr_max_data_size(u8 version, size_t url_len);
-
-extern u8 drm_panic_qr_generate(const char *url, u8 *data, size_t data_len, 
size_t data_size,
-   u8 *tmp, size_t tmp_size);
-
 static int drm_panic_get_qr_code_url(u8 **qr_image)
 {
struct kmsg_dump_iter iter;
diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
index bcf248f69252..d055655aa0cd 100644
--- a/drivers/gpu/drm/drm_panic_qr.rs
+++ b/drivers/gpu/drm/drm_panic_qr.rs
@@ -27,7 +27,10 @@
 //! * 
 
 use core::cmp;
-use kernel::str::CStr;
+use kernel::{
+prelude::*,
+str::CStr,
+};
 
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Ord, PartialOrd)]
 struct Version(usize);
@@ -929,7 +932,7 @@ fn draw_all(&mut self, data: impl Iterator) {
 /// * `tmp` must be valid for reading and writing for `tmp_size` bytes.
 ///
 /// They must remain valid for the duration of the function call.
-#[no_mangle]
+#[export]
 pub unsafe extern "C" fn drm_panic_qr_generate(
 url: *const kernel::ffi::c_char,
 data: *mut u8,
@@ -980,8 +983,12 @@ fn draw_all(&mut self, data: impl Iterator) {
 /// * If `url_len` > 0, remove the 2 segments header/length and also count the
 ///   conversion to numeric segments.
 /// * If `url_len` = 0, only removes 3 bytes for 1 binary segment.
-#[no_mangle]
-pub extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: usize) -> 
usize {
+///
+/// # Safety
+///
+/// Always safe to call.
+#[export]
+pub unsafe extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: 
usize) -> usize {
 #[expect(clippy::manual_range_contains)]
 if version < 1 || version > 40 {
 return 0;
diff --git a/include/drm/drm_panic.h b/include/drm/drm_panic.h
index f4e1fa9ae607..2a1536e0229a 100644
--- a/include/drm/drm_panic.h
+++ b/include/drm/drm_panic.h
@@ -163,4 +163,11 @@ static inline void drm_panic_unlock(struct drm_device 
*dev, unsigned long flags)
 
 #endif
 
+#if defined(CONFIG_DRM_PANIC_SCREEN_QR_CODE)
+extern size_t drm_panic_qr_max_data_size(u8 version, size_t url_len);
+
+extern u8 drm_panic_qr_generate(const char *url, u8 *data, size_t data_len, 
size_t data_size,
+   u8 *tmp, size_t tmp_size);
+#endif
+
 #endif /* __DRM_PANIC_H__ */
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 55354e4dec14..5345aa93fb8a 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -36,6 +36,10 @@
 #include 
 #include 
 
+#if defined(CONFIG_DRM_PANIC_SCREEN_QR_CODE)
+#include 
+#endif
+
 /* `bindgen` gets confused at certain things. */
 const size_t RUST_CONST_HELPER_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
 const size_t RUST_CONST_HELPER_PAGE_SIZE = PAGE_SIZE;

-- 
2.48.1.658.g4767266eb4-goog



Re: [PATCH v6 17/32] drm/xe: Do not allow CPU address mirror VMA unbind if the GPU has bindings

2025-02-27 Thread Thomas Hellström
On Mon, 2025-02-24 at 20:42 -0800, Matthew Brost wrote:
> uAPI is designed with the use case that only mapping a BO to a
> malloc'd
> address will unbind a CPU-address mirror VMA. Therefore, allowing a
> CPU-address mirror VMA to unbind when the GPU has bindings in the
> range
> being unbound does not make much sense. This behavior is not
> supported,
> as it simplifies the code. This decision can always be revisited if a
> use case arises.
> 
> v3:
>  - s/arrises/arises (Thomas)
>  - s/system allocator/GPU address mirror (Thomas)
>  - Kernel doc (Thomas)
>  - Newline between function defs (Thomas)
> v5:
>  - Kernel doc (Thomas)
> v6:
>  - Only compile if CONFIG_DRM_GPUSVM selected (CI, Lucas)
> 
> Signed-off-by: Matthew Brost 
> Reviewed-by: Himal Prasad Ghimiray 
Reviewed-by: Thomas Hellström 

> ---
>  drivers/gpu/drm/xe/xe_svm.c | 15 +++
>  drivers/gpu/drm/xe/xe_svm.h |  8 
>  drivers/gpu/drm/xe/xe_vm.c  | 16 
>  3 files changed, 39 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_svm.c
> b/drivers/gpu/drm/xe/xe_svm.c
> index a9d32cd69ae9..80076f4dc4b4 100644
> --- a/drivers/gpu/drm/xe/xe_svm.c
> +++ b/drivers/gpu/drm/xe/xe_svm.c
> @@ -434,3 +434,18 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
>  
>   return err;
>  }
> +
> +/**
> + * xe_svm_has_mapping() - SVM has mappings
> + * @vm: The VM.
> + * @start: Start address.
> + * @end: End address.
> + *
> + * Check if an address range has SVM mappings.
> + *
> + * Return: True if address range has a SVM mapping, False otherwise
> + */
> +bool xe_svm_has_mapping(struct xe_vm *vm, u64 start, u64 end)
> +{
> + return drm_gpusvm_has_mapping(&vm->svm.gpusvm, start, end);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_svm.h
> b/drivers/gpu/drm/xe/xe_svm.h
> index 87cbda5641bb..35e044e492e0 100644
> --- a/drivers/gpu/drm/xe/xe_svm.h
> +++ b/drivers/gpu/drm/xe/xe_svm.h
> @@ -57,6 +57,8 @@ void xe_svm_close(struct xe_vm *vm);
>  int xe_svm_handle_pagefault(struct xe_vm *vm, struct xe_vma *vma,
>       struct xe_tile *tile, u64 fault_addr,
>       bool atomic);
> +
> +bool xe_svm_has_mapping(struct xe_vm *vm, u64 start, u64 end);
>  #else
>  static inline bool xe_svm_range_pages_valid(struct xe_svm_range
> *range)
>  {
> @@ -86,6 +88,12 @@ int xe_svm_handle_pagefault(struct xe_vm *vm,
> struct xe_vma *vma,
>  {
>   return 0;
>  }
> +
> +static inline
> +bool xe_svm_has_mapping(struct xe_vm *vm, u64 start, u64 end)
> +{
> + return false;
> +}
>  #endif
>  
>  /**
> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> index 870629cbb859..a3ef76504ce8 100644
> --- a/drivers/gpu/drm/xe/xe_vm.c
> +++ b/drivers/gpu/drm/xe/xe_vm.c
> @@ -2442,6 +2442,17 @@ static int vm_bind_ioctl_ops_parse(struct
> xe_vm *vm, struct drm_gpuva_ops *ops,
>   struct xe_vma *old =
>   gpuva_to_vma(op->base.remap.unmap-
> >va);
>   bool skip = xe_vma_is_cpu_addr_mirror(old);
> + u64 start = xe_vma_start(old), end =
> xe_vma_end(old);
> +
> + if (op->base.remap.prev)
> + start = op->base.remap.prev->va.addr
> +
> + op->base.remap.prev-
> >va.range;
> + if (op->base.remap.next)
> + end = op->base.remap.next->va.addr;
> +
> + if (xe_vma_is_cpu_addr_mirror(old) &&
> +     xe_svm_has_mapping(vm, start, end))
> + return -EBUSY;
>  
>   op->remap.start = xe_vma_start(old);
>   op->remap.range = xe_vma_size(old);
> @@ -2524,6 +2535,11 @@ static int vm_bind_ioctl_ops_parse(struct
> xe_vm *vm, struct drm_gpuva_ops *ops,
>   {
>   struct xe_vma *vma = gpuva_to_vma(op-
> >base.unmap.va);
>  
> + if (xe_vma_is_cpu_addr_mirror(vma) &&
> +     xe_svm_has_mapping(vm,
> xe_vma_start(vma),
> +    xe_vma_end(vma)))
> + return -EBUSY;
> +
>   if (!xe_vma_is_cpu_addr_mirror(vma))
>   xe_vma_ops_incr_pt_update_ops(vops,
> op->tile_mask);
>   break;
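
The range computed in the remap case above can be read as follows: the part of the old VMA that actually goes away lies between the end of the kept prefix (`prev`) and the start of the kept suffix (`next`). A minimal user-space sketch of that arithmetic, with hypothetical types (`struct va_span`, `struct hole`, `unmapped_hole()`) standing in for the drm_gpuva structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a GPU VA mapping: [addr, addr + range). */
struct va_span {
	uint64_t addr;
	uint64_t range;
};

/* Half-open interval [start, end) that is really being unmapped. */
struct hole {
	uint64_t start;
	uint64_t end;
};

/*
 * Start from the whole old mapping, then shrink it by whatever the
 * remap keeps in front (prev) and behind (next). Either may be NULL.
 */
static struct hole unmapped_hole(struct va_span old,
				 const struct va_span *prev,
				 const struct va_span *next)
{
	struct hole h = { old.addr, old.addr + old.range };

	if (prev)
		h.start = prev->addr + prev->range;
	if (next)
		h.end = next->addr;
	return h;
}
```

Only this hole would then be passed to the "has mapping" check, so SVM ranges in the kept prefix or suffix do not cause a spurious -EBUSY.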



Re: [PATCH 6/9] drm/panthor: Reset GPU after L2 cache power off

2025-02-27 Thread Marek Vasut

On 2/27/25 6:17 PM, Boris Brezillon wrote:

[...]


> > diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c 
> > b/drivers/gpu/drm/panthor/panthor_gpu.c
> > index 671049020afaa..0f07ef7d9aea7 100644
> > --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> > @@ -470,11 +470,12 @@ int panthor_gpu_soft_reset(struct panthor_device *ptdev)
> >   */
> >  void panthor_gpu_suspend(struct panthor_device *ptdev)
> >  {
> > -   /* On a fast reset, simply power down the L2. */
> > -   if (!ptdev->reset.fast)
> > -   panthor_gpu_soft_reset(ptdev);
> > -   else
> > -   panthor_gpu_power_off(ptdev, L2, 1, 2);
> > +   /*
> > +* Power off the L2 and soft reset the GPU, that makes
> > +* iMX95 Mali G310 resume without firmware boot timeout.
> > +*/
> > +   panthor_gpu_power_off(ptdev, L2, 1, 2);
> > +   panthor_gpu_soft_reset(ptdev);
> 
> Unfortunately, if you do that unconditionally we no longer have a
> fast-reset. Would be good to figure out why the fast-reset doesn't work
> on this platform.


I was hoping to get some hints on this one. I spent quite a while trying
to narrow this down, and finally got it down to this particular bit.

The NXP downstream vendor kernel Mali driver does not seem to have
anything interesting regarding L2 power handling, but I might have
missed it; the code is difficult to read.

Have you ever seen anything problematic in this specific L2 department?

Do you have any hints on how I can debug this further?


Re: [PATCH v4 2/2] drm/tiny: add driver for Apple Touch Bars in x86 Macs

2025-02-27 Thread Aditya Garg


> On 27 Feb 2025, at 10:24 PM, kernel test robot  wrote:
> 
> Hi Aditya,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on linus/master]
> [also build test WARNING on v6.14-rc4 next-20250227]
> [If your patch is applied to the wrong git tree, kindly drop us a note.

Version 7 of this patch has already been submitted; I am not sure why the
kernel test robot tested version 4.

Re: [PATCH 1/9] dt-bindings: reset: imx95-gpu-blk-ctrl: Document Freescale i.MX95 GPU reset

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 05:58:01PM +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 does require
> release from reset by writing into a single GPUMIX block controller
> GPURESET register bit 0. Document support for this reset register.
>
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  .../reset/fsl,imx95-gpu-blk-ctrl.yaml | 49 +++
>  1 file changed, 49 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml
>
> diff --git 
> a/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml 
> b/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml
> new file mode 100644
> index 0..dc701bd556c0b
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reset/fsl,imx95-gpu-blk-ctrl.yaml
> @@ -0,0 +1,49 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/reset/fsl,imx95-gpu-blk-ctrl.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Freescale i.MX95 GPU Block Controller
> +
> +maintainers:
> +  - Marek Vasut 
> +
> +description: |

The '|' is not needed here.

> +  This reset controller is a block of ad-hoc debug registers, one of
> +  which is a single-bit GPU reset.
> +
> +properties:
> +  compatible:
> +- const: fsl,imx95-gpu-blk-ctrl
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +maxItems: 1
> +
> +  power-domains:
> +maxItems: 1
> +
> +  '#reset-cells':
> +const: 1
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - power-domains
> +  - '#reset-cells'
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +reset-controller@4d81 {
> +compatible = "fsl,imx95-gpu-blk-ctrl";
> +reg = <0x0 0x4d81 0x0 0xc>;

Not sure if this passes dt_binding_check. As far as I remember, the
examples default to 32-bit addresses, i.e.
reg = <0x4d81 0xc>

> +clocks = <&scmi_clk IMX95_CLK_GPUAPB>;

I suppose you missed the dt-bindings include file for IMX95_CLK_GPUAPB.

Frank
> +power-domains = <&scmi_devpd IMX95_PD_GPU>;
> +#reset-cells = <1>;
> +};
> --
> 2.47.2
>


Re: [PATCH 7/9] dt-bindings: gpu: mali-valhall-csf: Document i.MX95 support

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 05:58:07PM +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 is the
> Mali G310, document support for this variant.
>
> Signed-off-by: Marek Vasut 

Reviewed-by: Frank Li 

> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml 
> b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> index 0efa06822a543..3ab62bd424e41 100644
> --- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> +++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
> @@ -18,6 +18,7 @@ properties:
>  oneOf:
>- items:
>- enum:
> +  - fsl,imx95-mali# G310
>- rockchip,rk3588-mali
>- const: arm,mali-valhall-csf   # Mali Valhall GPU model/revision 
> is fully discoverable
>
> --
> 2.47.2
>


Re: [PATCH 6/9] drm/panthor: Reset GPU after L2 cache power off

2025-02-27 Thread Boris Brezillon
On Thu, 27 Feb 2025 17:58:06 +0100
Marek Vasut  wrote:

> This seems necessary on Freescale i.MX95 Mali G310 to reliably resume
> from runtime PM suspend. Without this, if only the L2 is powered down
> on RPM entry, the GPU gets stuck and does not indicate the firmware is
> booted after RPM resume.
> 
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  drivers/gpu/drm/panthor/panthor_gpu.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c 
> b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 671049020afaa..0f07ef7d9aea7 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -470,11 +470,12 @@ int panthor_gpu_soft_reset(struct panthor_device *ptdev)
>   */
>  void panthor_gpu_suspend(struct panthor_device *ptdev)
>  {
> - /* On a fast reset, simply power down the L2. */
> - if (!ptdev->reset.fast)
> - panthor_gpu_soft_reset(ptdev);
> - else
> - panthor_gpu_power_off(ptdev, L2, 1, 2);
> + /*
> +  * Power off the L2 and soft reset the GPU, that makes
> +  * iMX95 Mali G310 resume without firmware boot timeout.
> +  */
> + panthor_gpu_power_off(ptdev, L2, 1, 2);
> + panthor_gpu_soft_reset(ptdev);

Unfortunately, if you do that unconditionally we no longer have a
fast-reset. Would be good to figure out why the fast-reset doesn't work
on this platform.
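
To make the concern concrete, the suspend control flow before and after the hunk can be mocked in user space. This is a sketch under assumptions: `struct op_log`, `record()`, `suspend_before()` and `suspend_after()` are hypothetical names, and the two recorded operations merely stand in for panthor_gpu_soft_reset() and panthor_gpu_power_off(ptdev, L2, 1, 2).

```c
#include <assert.h>
#include <stdbool.h>

/* Operations the suspend path may issue, in the order they were issued. */
enum op { OP_NONE, OP_SOFT_RESET, OP_L2_POWER_OFF };

struct op_log {
	enum op ops[2];
	int count;
};

static void record(struct op_log *log, enum op op)
{
	log->ops[log->count++] = op;
}

/* Control flow before the patch: a fast reset only powers down the L2. */
static void suspend_before(struct op_log *log, bool fast_reset)
{
	if (!fast_reset)
		record(log, OP_SOFT_RESET);
	else
		record(log, OP_L2_POWER_OFF);
}

/* Control flow after the patch: both steps run, the flag is ignored. */
static void suspend_after(struct op_log *log, bool fast_reset)
{
	(void)fast_reset;
	record(log, OP_L2_POWER_OFF);
	record(log, OP_SOFT_RESET);
}
```

With `fast_reset` set, suspend_before() logs only the L2 power-off, while suspend_after() always logs the L2 power-off followed by the soft reset, which is the loss of the fast path pointed out above.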


Re: [PATCH 2/9] reset: simple: Add support for Freescale i.MX95 GPU reset

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 05:58:02PM +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 does require
> release from reset by writing into a single GPUMIX block controller
> GPURESET register bit 0. Implement support for this reset register.

Reviewed-by: Frank Li 

>
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  drivers/reset/reset-simple.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/reset/reset-simple.c b/drivers/reset/reset-simple.c
> index 2760678398308..1415a941fd6eb 100644
> --- a/drivers/reset/reset-simple.c
> +++ b/drivers/reset/reset-simple.c
> @@ -133,9 +133,17 @@ static const struct reset_simple_devdata 
> reset_simple_active_low = {
>   .status_active_low = true,
>  };
>
> +static const struct reset_simple_devdata reset_simple_fsl_imx95_gpu_blk_ctrl 
> = {
> + .reg_offset = 0x8,
> + .active_low = true,
> + .status_active_low = true,
> +};
> +
>  static const struct of_device_id reset_simple_dt_ids[] = {
>   { .compatible = "altr,stratix10-rst-mgr",
>   .data = &reset_simple_socfpga },
> + { .compatible = "fsl,imx95-gpu-blk-ctrl",
> + .data = &reset_simple_fsl_imx95_gpu_blk_ctrl },
>   { .compatible = "st,stm32-rcc", },
>   { .compatible = "allwinner,sun6i-a31-clock-reset",
>   .data = &reset_simple_active_low },
> --
> 2.47.2
>


Re: [PATCH 02/17] bitops: Add generic parity calculation for u64

2025-02-27 Thread Yury Norov
On Thu, Feb 27, 2025 at 07:38:58AM +0100, Jiri Slaby wrote:
> On 26. 02. 25, 19:33, Yury Norov wrote:
> > > Not in cases where macros are inevitable. I mean, do we need parityXX() 
> > > for
> > > XX in (8, 16, 32, 64) at all? Isn't the parity() above enough for 
> > > everybody?
> > 
> > The existing codebase has something like:
> > 
> >  int ret;
> > 
> >  ret = i3c_master_get_free_addr(m, last_addr + 1);
> >  ret |= parity8(ret) ? 0 : BIT(7)
> > 
> > So if we'll switch it to a macro like one above, it will become a
> > 32-bit parity. It wouldn't be an error because i3c_master_get_free_addr()
> > returns an u8 or -ENOMEM, and the error code is checked explicitly.
> > 
> > But if we decide to go with parity() only, some users will have to
> > call it like parity((u8)val) explicitly. Which is not bad actually.
> 
> That cast looks ugly -- we apparently need parityXX(). (In this particular
> case we could do parity8(last_addr), but I assume there are more cases like
> this.) Thanks for looking up the case for this.

This parity8() is used in just 2 drivers - i3c and hwmon/spd5118. The hwmon
driver looks good. I3C, yeah, makes this implied typecast, which is nasty
regardless.

This is the new code, and I think if we all agree that generic parity()
would be a better API, it's a good time to convert existing users now.
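
To illustrate the implied typecast, here is a small userspace sketch. `parity8_model()` and `parity32_model()` are hypothetical names, written under the assumption that an 8-bit helper takes a u8 argument and a generic parity() on an int would consider all 32 bits; they are not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of 8 bits; the uint8_t parameter silently truncates wider values. */
static int parity8_model(uint8_t val)
{
	val ^= val >> 4;                     /* fold high nibble into low nibble */
	return (0x6996 >> (val & 0xf)) & 1;  /* 16-entry parity lookup table */
}

/* Parity over all 32 bits, as a width-generic helper would see an int. */
static int parity32_model(uint32_t val)
{
	val ^= val >> 16;
	val ^= val >> 8;
	return parity8_model((uint8_t)val);
}
```

For a value such as 0x107, the 8-bit helper sees only 0x07 (odd parity) while the 32-bit version also counts bit 8 (even parity). A caller that relies on truncation today would need an explicit (u8) cast after switching to a generic parity().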

Thanks,
Yury


Re: [PATCH 8/9] drm/panthor: Add i.MX95 support

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 05:58:08PM +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 is the
> Mali G310, add support for this variant.
>
> Signed-off-by: Marek Vasut 

Reviewed-by: Frank Li 

> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  drivers/gpu/drm/panthor/panthor_drv.c | 1 +
>  drivers/gpu/drm/panthor/panthor_gpu.c | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c 
> b/drivers/gpu/drm/panthor/panthor_drv.c
> index 06fe46e320738..2504a456d45c4 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -1591,6 +1591,7 @@ static struct attribute *panthor_attrs[] = {
>  ATTRIBUTE_GROUPS(panthor);
>
>  static const struct of_device_id dt_match[] = {
> + { .compatible = "fsl,imx95-mali" }, /* G310 */
>   { .compatible = "rockchip,rk3588-mali" },
>   { .compatible = "arm,mali-valhall-csf" },
>   {}
> diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c 
> b/drivers/gpu/drm/panthor/panthor_gpu.c
> index 0f07ef7d9aea7..2371ab8e50627 100644
> --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> @@ -67,6 +67,7 @@ struct panthor_model {
>  }
>
>  static const struct panthor_model gpu_models[] = {
> + GPU_MODEL(g310, 0, 0),  /* NXP i.MX95 */
>   GPU_MODEL(g610, 10, 7),
>   {},
>  };
> --
> 2.47.2
>


Re: [PATCH 9/9] arm64: dts: imx95: Describe Mali G310 GPU

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 05:58:09PM +0100, Marek Vasut wrote:
> The instance of the GPU populated in i.MX95 is the G310,
> describe this GPU in the DT. Include description of the
> GPUMIX block controller, which can be operated as a simple
> reset. Include dummy GPU voltage regulator and OPP tables.
>
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  arch/arm64/boot/dts/freescale/imx95.dtsi | 62 
>  1 file changed, 62 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi 
> b/arch/arm64/boot/dts/freescale/imx95.dtsi
> index 3af13173de4bd..36bad211e5558 100644
> --- a/arch/arm64/boot/dts/freescale/imx95.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
> @@ -249,6 +249,37 @@ dummy: clock-dummy {
>   clock-output-names = "dummy";
>   };
>
> + gpu_fixed_reg: fixed-gpu-reg {
> + compatible = "regulator-fixed";
> + regulator-min-microvolt = <92>;
> + regulator-max-microvolt = <92>;
> + regulator-name = "vdd_gpu";
> + regulator-always-on;
> + regulator-boot-on;

Does this really need both regulator-boot-on and regulator-always-on?

> + };
> +
> + gpu_opp_table: opp_table {
> + compatible = "operating-points-v2";
> +
> + opp-5 {
> + opp-hz = /bits/ 64 <5>;
> + opp-hz-real = /bits/ 64 <5>;
> + opp-microvolt = <92>;
> + };
> +
> + opp-8 {
> + opp-hz = /bits/ 64 <8>;
> + opp-hz-real = /bits/ 64 <8>;
> + opp-microvolt = <92>;
> + };
> +
> + opp-10 {
> + opp-hz = /bits/ 64 <10>;
> + opp-hz-real = /bits/ 64 <10>;
> + opp-microvolt = <92>;
> + };
> + };
> +
>   clk_ext1: clock-ext1 {
>   compatible = "fixed-clock";
>   #clock-cells = <0>;
> @@ -1846,6 +1877,37 @@ netc_emdio: mdio@0,0 {
>   };
>   };
>
> + gpu_blk_ctrl: reset-controller@4d81 {
> + compatible = "fsl,imx95-gpu-blk-ctrl";
> + reg = <0x0 0x4d81 0x0 0xc>;
> + #reset-cells = <1>;
> + clocks = <&scmi_clk IMX95_CLK_GPUAPB>;
> + assigned-clocks = <&scmi_clk IMX95_CLK_GPUAPB>;
> + assigned-clock-parents = <&scmi_clk 
> IMX95_CLK_SYSPLL1_PFD1_DIV2>;
> + assigned-clock-rates = <1>;
> + power-domains = <&scmi_devpd IMX95_PD_GPU>;
> + status = "disabled";
> + };
> +
> + gpu: gpu@4d90 {
> + compatible = "fsl,imx95-mali", "arm,mali-valhall-csf";
> + reg = <0 0x4d90 0 0x48>;
> + clocks = <&scmi_clk IMX95_CLK_GPU>;
> + clock-names = "core";
> + interrupts = ,
> +  ,
> +  ;
> + interrupt-names = "gpu", "job", "mmu";
> + mali-supply = <&gpu_fixed_reg>;
> + operating-points-v2 = <&gpu_opp_table>;
> + power-domains = <&scmi_devpd IMX95_PD_GPU>, <&scmi_perf 
> IMX95_PERF_GPU>;
> + power-domain-names = "mix", "perf";
> + resets = <&gpu_blk_ctrl 0>;
> + #cooling-cells = <2>;
> + dynamic-power-coefficient = <1013>;
> + status = "disabled";

The GPU is an internal module with little dependence on other modules
such as pinmux. Why is the default status "disabled"? Presumably the GPU
driver will turn off clocks and power when the GPU is not in use.

Frank

> + };
> +
>   ddr-pmu@4e090dc0 {
>   compatible = "fsl,imx95-ddr-pmu", "fsl,imx93-ddr-pmu";
>   reg = <0x0 0x4e090dc0 0x0 0x200>;
> --
> 2.47.2
>


[PATCH 2/9] reset: simple: Add support for Freescale i.MX95 GPU reset

2025-02-27 Thread Marek Vasut
The instance of the GPU populated in Freescale i.MX95 does require
release from reset by writing into a single GPUMIX block controller
GPURESET register bit 0. Implement support for this reset register.

Signed-off-by: Marek Vasut 
---
Cc: Boris Brezillon 
Cc: Conor Dooley 
Cc: David Airlie 
Cc: Fabio Estevam 
Cc: Krzysztof Kozlowski 
Cc: Liviu Dudau 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Pengutronix Kernel Team 
Cc: Philipp Zabel 
Cc: Rob Herring 
Cc: Sascha Hauer 
Cc: Sebastian Reichel 
Cc: Shawn Guo 
Cc: Simona Vetter 
Cc: Steven Price 
Cc: Thomas Zimmermann 
Cc: devicet...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: i...@lists.linux.dev
Cc: linux-arm-ker...@lists.infradead.org
---
 drivers/reset/reset-simple.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/reset/reset-simple.c b/drivers/reset/reset-simple.c
index 2760678398308..1415a941fd6eb 100644
--- a/drivers/reset/reset-simple.c
+++ b/drivers/reset/reset-simple.c
@@ -133,9 +133,17 @@ static const struct reset_simple_devdata 
reset_simple_active_low = {
.status_active_low = true,
 };
 
+static const struct reset_simple_devdata reset_simple_fsl_imx95_gpu_blk_ctrl = 
{
+   .reg_offset = 0x8,
+   .active_low = true,
+   .status_active_low = true,
+};
+
 static const struct of_device_id reset_simple_dt_ids[] = {
{ .compatible = "altr,stratix10-rst-mgr",
.data = &reset_simple_socfpga },
+   { .compatible = "fsl,imx95-gpu-blk-ctrl",
+   .data = &reset_simple_fsl_imx95_gpu_blk_ctrl },
{ .compatible = "st,stm32-rcc", },
{ .compatible = "allwinner,sun6i-a31-clock-reset",
.data = &reset_simple_active_low },
-- 
2.47.2
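
As a rough illustration of what the `.active_low = true` setting means for bit 0 of the register, here is a userspace sketch. `update_reset_bit()` is a hypothetical helper that models a single 32-bit register value, assuming the usual convention that an active-low reset line is asserted by clearing its bit; it is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: return the new register value after asserting or
 * deasserting the reset controlled by 'bit'.
 */
static uint32_t update_reset_bit(uint32_t reg, unsigned int bit,
				 bool assert_reset, bool active_low)
{
	/* Active-low: assert clears the bit, deassert sets it. */
	if (assert_reset ^ active_low)
		reg |= UINT32_C(1) << bit;
	else
		reg &= ~(UINT32_C(1) << bit);
	return reg;
}
```

Under that assumption, releasing the GPU from reset corresponds to setting bit 0, which matches the commit message's description of writing the GPURESET bit.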



[PATCH] drm/msm: fix a potential memory leak issue in submit_create()

2025-02-27 Thread Haoxiang Li
msm_fence_alloc() allocates a containing structure and returns a
pointer to a member embedded in it. Thus, freeing only the returned
pointer is not enough.
Add a helper 'msm_fence_free()' in msm_fence.h/msm_fence.c
to do the complete job.

Fixes: f94e6a51e17c ("drm/msm: Pre-allocate hw_fence")
Cc: sta...@vger.kernel.org
Signed-off-by: Haoxiang Li 
---
 drivers/gpu/drm/msm/msm_fence.c  | 7 +++
 drivers/gpu/drm/msm/msm_fence.h  | 1 +
 drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
index 1a5d4f1c8b42..0e257afaf443 100644
--- a/drivers/gpu/drm/msm/msm_fence.c
+++ b/drivers/gpu/drm/msm/msm_fence.c
@@ -184,6 +184,13 @@ msm_fence_alloc(void)
return &f->base;
 }
 
+void msm_fence_free(struct dma_fence *fence)
+{
+   struct msm_fence *f = to_msm_fence(fence);
+
+   kfree(f);
+}
+
 void
 msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
 {
diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
index 148196375a0b..635c68629070 100644
--- a/drivers/gpu/drm/msm/msm_fence.h
+++ b/drivers/gpu/drm/msm/msm_fence.h
@@ -82,6 +82,7 @@ bool msm_fence_completed(struct msm_fence_context *fctx, 
uint32_t fence);
 void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
 
 struct dma_fence * msm_fence_alloc(void);
+void msm_fence_free(struct dma_fence *fence);
 void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
 
 static inline bool
diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index dee470403036..3fdcfc5714b6 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -56,7 +56,7 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
 
ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue);
if (ret) {
-   kfree(submit->hw_fence);
+   msm_fence_free(submit->hw_fence);
kfree(submit);
return ERR_PTR(ret);
}
-- 
2.25.1
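
The container pattern the commit message describes can be sketched in userspace C as follows. All names here (`struct base_obj`, `struct container_obj`, `obj_alloc()`, `obj_free()`, `container_of_model()`) are hypothetical; the embedded member is deliberately placed after another field so that the two pointers differ, which may or may not be the layout of the real struct msm_fence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical embedded object handed out to callers. */
struct base_obj {
	int refcount;
};

/* Hypothetical container that is what actually gets allocated. */
struct container_obj {
	int private_data;
	struct base_obj base;  /* deliberately not the first member */
};

/* Userspace equivalent of the kernel's container_of() idea. */
#define container_of_model(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Allocate the container but return only the embedded member. */
static struct base_obj *obj_alloc(void)
{
	struct container_obj *c = calloc(1, sizeof(*c));

	return c ? &c->base : NULL;
}

/* Recover the container from a pointer to its embedded member. */
static struct container_obj *obj_container(struct base_obj *b)
{
	return container_of_model(b, struct container_obj, base);
}

/* Free the whole allocation, not just the address of the member. */
static void obj_free(struct base_obj *b)
{
	if (b)
		free(obj_container(b));
}
```

Whether the member pointer and the container pointer coincide depends on the member's offset; going through the container keeps the free correct either way.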



[PATCH v2 0/8] drm/msm/dpu: improve CTL handling on DPU >= 5.0 platforms

2025-02-27 Thread Dmitry Baryshkov
Since version 5.0 the DPU got an improved way of handling multi-output
configurations. It is now possible to program all pending changes
through a single CTL and flush everything at the same time.

Implement corresponding changes in the DPU driver.

Signed-off-by: Dmitry Baryshkov 
---
Changes in v2:
- Made CTL_MERGE_3D_ACTIVE writes unconditional (Marijn)
- Added CTL_INTF_MASTER clearing in dpu_hw_ctl_reset_intf_cfg_v1
  (Marijn)
- Added a patch to drop extra rm->has_legacy_ctls condition (and an
  explanation why it can not be folded in an earlier patch).
- Link to v1: 
https://lore.kernel.org/r/20250220-dpu-active-ctl-v1-0-71ca67a56...@linaro.org

---
Dmitry Baryshkov (8):
  drm/msm/dpu: don't overwrite CTL_MERGE_3D_ACTIVE register
  drm/msm/dpu: program master INTF value
  drm/msm/dpu: pass master interface to CTL configuration
  drm/msm/dpu: use single CTL if it is the only CTL returned by RM
  drm/msm/dpu: don't select single flush for active CTL blocks
  drm/msm/dpu: allocate single CTL for DPU >= 5.0
  drm/msm/dpu: remove DPU_CTL_SPLIT_DISPLAY from CTL blocks on DPU >= 5.0
  drm/msm/dpu: drop now-unused condition for has_legacy_ctls

 .../gpu/drm/msm/disp/dpu1/catalog/dpu_10_0_sm8650.h  |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h  |  4 ++--
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_5_2_sm7150.h   |  4 ++--
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_8_4_sa8775p.h  |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   |  5 ++---
 .../gpu/drm/msm/disp/dpu1/catalog/dpu_9_2_x1e80100.h |  5 ++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c  |  6 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c |  2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c |  5 -
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c   | 20 +---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h   |  2 ++
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c   | 14 +++---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h   |  2 ++
 18 files changed, 65 insertions(+), 39 deletions(-)
---
base-commit: be5c7bbb3a64baf884481a1ba0c2f8fb2f93f7c3
change-id: 20250209-dpu-active-ctl-08cca4d8b08a

Best regards,
-- 
Dmitry Baryshkov 



[PATCH v2 2/8] drm/msm/dpu: program master INTF value

2025-02-27 Thread Dmitry Baryshkov
If several interfaces are being handled through a single CTL, a main
('master') INTF needs to be programmed into a separate register. Write
corresponding value into that register.

Co-developed-by: Marijn Suijten 
Signed-off-by: Marijn Suijten 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c | 12 
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
index 
32ab33b314fc44e12ccb935c1695d2eea5c7d9b2..60c4206c6f2833293fdcc56b653f7d3124a5
 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.c
@@ -583,6 +583,9 @@ static void dpu_hw_ctl_intf_cfg_v1(struct dpu_hw_ctl *ctx,
DPU_REG_WRITE(c, CTL_DSC_ACTIVE, dsc_active);
DPU_REG_WRITE(c, CTL_MERGE_3D_ACTIVE, merge_3d_active);
 
+   if (cfg->intf_master)
+   DPU_REG_WRITE(c, CTL_INTF_MASTER, BIT(cfg->intf_master - 
INTF_0));
+
if (cfg->cdm)
DPU_REG_WRITE(c, CTL_CDM_ACTIVE, cfg->cdm);
 }
@@ -625,6 +628,7 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl 
*ctx,
 {
struct dpu_hw_blk_reg_map *c = &ctx->hw;
u32 intf_active = 0;
+   u32 intf_master = 0;
u32 wb_active = 0;
u32 merge3d_active = 0;
u32 dsc_active;
@@ -651,6 +655,14 @@ static void dpu_hw_ctl_reset_intf_cfg_v1(struct dpu_hw_ctl 
*ctx,
intf_active = DPU_REG_READ(c, CTL_INTF_ACTIVE);
intf_active &= ~BIT(cfg->intf - INTF_0);
DPU_REG_WRITE(c, CTL_INTF_ACTIVE, intf_active);
+
+   intf_master = DPU_REG_READ(c, CTL_INTF_MASTER);
+
+   /* Unset this intf as master, if it is the current master */
+   if (intf_master == BIT(cfg->intf - INTF_0)) {
+   DPU_DEBUG_DRIVER("Unsetting INTF_%d master\n", 
cfg->intf - INTF_0);
+   DPU_REG_WRITE(c, CTL_INTF_MASTER, 0);
+   }
}
 
if (cfg->wb) {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
index 
85c6c835cc8780e6cb66f3a262d9897c91962935..e95989a2fdda6344d0cb9d3036e6ed22a0458675
 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_ctl.h
@@ -36,6 +36,7 @@ struct dpu_hw_stage_cfg {
 /**
  * struct dpu_hw_intf_cfg :Describes how the DPU writes data to output 
interface
  * @intf : Interface id
+ * @intf_master:   Master interface id in the dual pipe topology
  * @mode_3d:   3d mux configuration
  * @merge_3d:  3d merge block used
  * @intf_mode_sel: Interface mode, cmd / vid
@@ -45,6 +46,7 @@ struct dpu_hw_stage_cfg {
  */
 struct dpu_hw_intf_cfg {
enum dpu_intf intf;
+   enum dpu_intf intf_master;
enum dpu_wb wb;
enum dpu_3d_blend_mode mode_3d;
enum dpu_merge_3d merge_3d;

-- 
2.39.5
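
The register handling in this patch can be modelled in userspace as below. This is only a sketch: `enum model_intf` and the `master_on_*()` helpers are hypothetical, with 0 meaning "no interface" to mirror the `if (cfg->intf_master)` check in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical interface ids; 0 means "none". */
enum model_intf { MODEL_INTF_NONE = 0, MODEL_INTF_0, MODEL_INTF_1, MODEL_INTF_2 };

/* One-hot bit for an interface, as in BIT(intf - INTF_0). */
static uint32_t intf_bit(enum model_intf intf)
{
	return UINT32_C(1) << (intf - MODEL_INTF_0);
}

/* Configure path: write the master bit only when a master is requested. */
static uint32_t master_on_cfg(uint32_t cur, enum model_intf master)
{
	return master != MODEL_INTF_NONE ? intf_bit(master) : cur;
}

/* Reset path: clear the register only if 'intf' is the current master. */
static uint32_t master_on_reset(uint32_t cur, enum model_intf intf)
{
	if (intf == MODEL_INTF_NONE)
		return cur;
	return cur == intf_bit(intf) ? 0 : cur;
}
```

Resetting a non-master interface leaves the register untouched, so tearing down the secondary interface of a pair does not disturb the master selection.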



Re: [PATCH v6 32/32] drm/doc: gpusvm: Add GPU SVM documentation

2025-02-27 Thread Matthew Brost
On Fri, Feb 28, 2025 at 01:34:42PM +1100, Alistair Popple wrote:
> On Mon, Feb 24, 2025 at 08:43:11PM -0800, Matthew Brost wrote:
> > Add documentation for agree upon GPU SVM design principles, current
> > status, and future plans.
> 
> Thanks for writing this up. In general I didn't see anything too controversial
> but added a couple of comments below.
> 
> > 
> > v4:
> >  - Address Thomas's feedback
> > v5:
> >  - s/Current/Basline (Thomas)
> > 
> > Signed-off-by: Matthew Brost 
> > Reviewed-by: Thomas Hellström 
> > ---
> >  Documentation/gpu/rfc/gpusvm.rst | 84 
> >  Documentation/gpu/rfc/index.rst  |  4 ++
> >  2 files changed, 88 insertions(+)
> >  create mode 100644 Documentation/gpu/rfc/gpusvm.rst
> > 
> > diff --git a/Documentation/gpu/rfc/gpusvm.rst 
> > b/Documentation/gpu/rfc/gpusvm.rst
> > new file mode 100644
> > index ..063412160685
> > --- /dev/null
> > +++ b/Documentation/gpu/rfc/gpusvm.rst
> > @@ -0,0 +1,84 @@
> > +===
> > +GPU SVM Section
> > +===
> > +
> > +Agreed upon design principles
> > +=
> 
> As a general comment I think it would be nice if we could add some rationale/
> reasons for these design principles. Things inevitably change and if/when
> we need to violate or update these principles it would be good to have some
> documented rationale for why we decided on them in the first place because the
> reasoning may have become invalid by then.
> 

Let me try to add something to the various cases.

> > +* migrate_to_ram path
> > +   * Rely only on core MM concepts (migration PTEs, page references, and
> > + page locking).
> > +   * No driver specific locks other than locks for hardware interaction in
> > + this path. These are not required and generally a bad idea to
> > + invent driver defined locks to seal core MM races.
> 
> In principle I agree. The problem I think you will run into is the analogue of
> what adding a trylock_page() to do_swap_page() fixes. Which is that a concurrent
> GPU fault (which is highly likely after handling a CPU fault due to the GPU PTEs
> becoming invalid) may, depending on your design, kick off a migration of the
> page to the GPU via migrate_vma_setup().
> 
> The problem with that is migrate_vma_setup() will temporarily raise the folio
> refcount, which can cause the migrate_to_ram() callback to fail but the elevated
> refcount from migrate_to_ram() can also cause the GPU migration to fail thus
> leading to a live-lock when both CPU and GPU fault handlers just keep retrying.
> 
> This was particularly problematic for us on multi-GPU setups, and our solution
> was to introduce a migration critical section in the form of a mutex to ensure
> only one thread was calling migrate_vma_setup() at a time.
> 
> And now that I've looked at UVM development history, and remembered more
> context, this is why I had a vague recollection that adding a migration entry
> in do_swap_page() would be better than taking a page lock. Doing so fixes the
> issue with concurrent GPU faults blocking migrate_to_ram() because it makes
> migrate_vma_setup() ignore the page.
> 

Ok, this is something to keep an eye on. In the current Xe code, we try
to migrate a chunk of memory from the CPU to the GPU in our GPU fault
handler once per fault. If it fails due to racing CPU access, we simply
leave it in CPU memory and move on. We don't have any real migration
policies in Xe yet—that is being worked on as a follow-up to my series.
However, if we had a policy stating that a memory region 'must be in
GPU,' this could conceivably lead to a livelock with concurrent CPU and
GPU access. I'm still not fully convinced that a driver-side lock is the
solution here, but without encountering the issue on our side, I can't
be completely certain what the solution is.
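
The "try once per fault, otherwise leave it in CPU memory" behaviour described above can be sketched as follows. This is a userspace model under stated assumptions, not Xe code: `model_gpu_fault()`, `model_migrate_fn` and `model_migrate_always_races()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum model_placement { MODEL_IN_CPU_MEMORY, MODEL_IN_GPU_MEMORY };

/* Hypothetical migration hook: returns true if the chunk moved to the GPU. */
typedef bool (*model_migrate_fn)(void *ctx);

/* One migration attempt per fault; no retry loop, so no livelock here. */
static enum model_placement model_gpu_fault(model_migrate_fn migrate, void *ctx)
{
	if (migrate(ctx))
		return MODEL_IN_GPU_MEMORY;

	/* Lost a race with CPU access: map the pages where they are. */
	return MODEL_IN_CPU_MEMORY;
}

/* Example hook modelling a racing CPU access; counts attempts via ctx. */
static bool model_migrate_always_races(void *ctx)
{
	int *attempts = ctx;

	(*attempts)++;
	return false;
}
```

A "must be in GPU" policy would replace the fallback with a retry, which is where the livelock concern raised in the quoted text would apply.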

> > +   * Partial migration is supported (i.e., a subset of pages attempting to
> > + migrate can actually migrate, with only the faulting page guaranteed
> > + to migrate).
> > +   * Driver handles mixed migrations via retry loops rather than locking.
> >
> > +* Eviction
> 
> This is a term that seems to be somewhat overloaded depending on context so a
> definition would be nice. Is your view of eviction migrating data from GPU back
> to CPU without a virtual address to free up GPU memory? (that's what I think of,
> but would be good to make sure we're in sync).
> 

Yes. When GPU memory is oversubscribed, we find the physical backing in
an LRU list to evict. In Xe, this is a TTM BO.

> > +   * Only looking at physical memory data structures and locks as opposed to
> > + looking at virtual memory data structures and locks.
> > +   * No looking at mm/vma structs or relying on those being locked.
> 
> Agree with the above points.
> 
> > +* GPU fault side
> > +   * mmap_read only used around core MM functions which require this lock
> > + and should strive to take mmap_read lock only in GPU SVM la

Re: [PATCH v2 8/8] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl

2025-02-27 Thread kernel test robot
Hi Jonathan,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-xe/drm-xe-next]
[also build test WARNING on next-20250227]
[cannot apply to linus/master v6.14-rc4]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Jonathan-Cavitt/drm-xe-xe_gt_pagefault-Disallow-writes-to-read-only-VMAs/20250228-032204
base:   https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link:
https://lore.kernel.org/r/20250227191457.84035-9-jonathan.cavitt%40intel.com
patch subject: [PATCH v2 8/8] drm/xe/xe_vm: Implement xe_vm_get_property_ioctl
config: i386-buildonly-randconfig-002-20250228 
(https://download.01.org/0day-ci/archive/20250228/202502281118.xnrflzlo-...@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project 
cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20250228/202502281118.xnrflzlo-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202502281118.xnrflzlo-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/xe/xe_vm.c:3267:3: warning: label followed by a declaration 
>> is a C23 extension [-Wc23-extensions]
3267 | struct xe_exec_queue_ban_entry *entry;
 | ^
   1 warning generated.


vim +3267 drivers/gpu/drm/xe/xe_vm.c

  3260  
  3261  static u32 xe_vm_get_property_size(struct xe_vm *vm, u32 property)
  3262  {
  3263  u32 size = 0;
  3264  
  3265  switch (property) {
  3266  case DRM_XE_VM_GET_PROPERTY_FAULTS:
> 3267  struct xe_exec_queue_ban_entry *entry;
  3268  
  3269  spin_lock(&vm->bans.lock);
  3270  list_for_each_entry(entry, &vm->bans.list, list) {
  3271  struct xe_pagefault *pf = entry->pf;
  3272  
  3273  size += pf ? sizeof(struct drm_xe_ban) : 0;
  3274  }
  3275  spin_unlock(&vm->bans.lock);
  3276  return size;
  3277  case DRM_XE_VM_GET_PROPERTY_BANS:
  3278  spin_lock(&vm->bans.lock);
  3279  size = vm->bans.len * sizeof(struct drm_xe_ban);
  3280  spin_unlock(&vm->bans.lock);
  3281  return size;
  3282  case DRM_XE_VM_GET_PROPERTY_NUM_RESETS:
  3283  return 0;
  3284  default:
  3285  return -EINVAL;
  3286  }
  3287  }
  3288  
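
One common way to address this class of warning is to wrap the case body in braces so that the declaration sits inside a block rather than directly after the label. The sketch below shows that pattern in standalone C; `model_property_size()`, `struct model_entry` and the enum are hypothetical stand-ins, not the xe_vm.c code.

```c
#include <assert.h>
#include <stddef.h>

enum model_property { MODEL_PROP_FAULTS, MODEL_PROP_BANS };

/* Hypothetical list entry standing in for the ban list in the listing. */
struct model_entry {
	int has_pagefault;
	const struct model_entry *next;
};

static size_t model_property_size(enum model_property prop,
				  const struct model_entry *head,
				  size_t len, size_t elem_size)
{
	switch (prop) {
	case MODEL_PROP_FAULTS: {
		/* The braces open a block, so this declaration is legal pre-C23. */
		const struct model_entry *entry;
		size_t size = 0;

		for (entry = head; entry; entry = entry->next)
			size += entry->has_pagefault ? elem_size : 0;
		return size;
	}
	case MODEL_PROP_BANS:
		return len * elem_size;
	}
	return 0;
}
```

Hoisting the declaration to the top of the function would work as well; the braces simply keep the variable scoped to the one case that uses it.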

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH v1] drm/ci: remove CI_PRE_CLONE_SCRIPT

2025-02-27 Thread Vignesh Raman

Hi Dmitry,

On 27/02/25 11:21, Dmitry Baryshkov wrote:

On Thu, Feb 27, 2025 at 10:06:24AM +0530, Vignesh Raman wrote:

If we are not caching the git archive, do not
set CI_PRE_CLONE_SCRIPT. Setting it makes CI
try to download the cache first, and if it is
missing, it tries to clone the repo within a
time limit, which can cause build failures.


Please wrap the commit message according to the guidelines. 47 chars in
a line is way too short.

BTW: this didn't help with the python-artifacts issue. It still times
out.


The issue was with shallow cloning, and I have posted another patch.
https://lore.kernel.org/dri-devel/20250228031501.483475-1-vignesh.ra...@collabora.com/T/#u

The commit message is wrapped according to the guidelines in this patch.

Thanks.

Regards,
Vignesh





Signed-off-by: Vignesh Raman 
---
  drivers/gpu/drm/ci/gitlab-ci.yml | 6 --
  1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/ci/gitlab-ci.yml b/drivers/gpu/drm/ci/gitlab-ci.yml
index f4e324e156db..0bc4ac344757 100644
--- a/drivers/gpu/drm/ci/gitlab-ci.yml
+++ b/drivers/gpu/drm/ci/gitlab-ci.yml
@@ -13,12 +13,6 @@ variables:
FDO_UPSTREAM_REPO: helen.fornazier/linux   # The repo where the git-archive 
daily runs
MESA_TEMPLATES_COMMIT: &ci-templates-commit 
d5aa3941aa03c2f716595116354fb81eb8012acb
DRM_CI_PROJECT_URL: https://gitlab.freedesktop.org/${DRM_CI_PROJECT_PATH}
-  CI_PRE_CLONE_SCRIPT: |-
-  set -o xtrace
-  curl -L --retry 4 -f --retry-all-errors --retry-delay 60 -s 
${DRM_CI_PROJECT_URL}/-/raw/${DRM_CI_COMMIT_SHA}/.gitlab-ci/download-git-cache.sh
 -o download-git-cache.sh
-  bash download-git-cache.sh
-  rm download-git-cache.sh
-  set +o xtrace
S3_JWT_FILE: /s3_jwt
S3_JWT_FILE_SCRIPT: |-
echo -n '${S3_JWT}' > '${S3_JWT_FILE}' &&
--
2.47.2







Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Dave Airlie
On Fri, 28 Feb 2025 at 09:07, John Hubbard  wrote:
>
> On Thu Feb 27, 2025 at 1:42 PM PST, Dave Airlie wrote:
> > On Thu, 27 Feb 2025 at 11:34, John Hubbard  wrote:
> >> On Wed Feb 26, 2025 at 5:02 PM PST, Greg KH wrote:
> >> > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote:
> ...
> > nova is just a drm driver, it's not a rewrite of the drm subsystem,
> > that sort of effort would entail a much larger commitment.
>
> Maybe at this point in the discussion it would help to discern between
> nova-core and nova-drm:
>
> drivers/gpu/nova-core/ (under discussion here)

nova-core won't be suffering any of the issues Jason is raising,
nova-core isn't going to have userspace facing interfaces or be part
of any subsystem with major lifetime expectations. It has to deal with
the hardware going away due to hot unplugs, and that is what this
devres is for.

nova-core will be a kernel internal pci driver, and vfio and nova-drm
will load on top of it, once those drivers are loaded and talking to
userspace they will keep references on the nova-core driver module
through normal means.

Dave.


[PATCH 0/4] drm/msm/dpu: disable DSC on some of old DPU models

2025-02-27 Thread Dmitry Baryshkov
During one of the chats Abhinav pointed out that in the 1.x generation
most of the DPU/MDP5 instances didn't have DSC support. Also SDM630
didn't provide DSC support. Disable DSC on those platforms.

Signed-off-by: Dmitry Baryshkov 
---
Dmitry Baryshkov (4):
  drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8937
  drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8917
  drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8953
  drm/msm/dpu: remove DSC feature bit for PINGPONG on SDM630

 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h | 2 --
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_15_msm8917.h | 1 -
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_16_msm8953.h | 2 --
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h   | 5 +++--
 4 files changed, 3 insertions(+), 7 deletions(-)
---
base-commit: be5c7bbb3a64baf884481a1ba0c2f8fb2f93f7c3
change-id: 20250228-dpu-fix-catalog-649db1fc29a6

Best regards,
-- 
Dmitry Baryshkov 



[PATCH 4/4] drm/msm/dpu: remove DSC feature bit for PINGPONG on SDM630

2025-02-27 Thread Dmitry Baryshkov
The SDM630 platform doesn't have DSC blocks, nor does it have DSC
registers in the PINGPONG block. Drop the DPU_PINGPONG_DSC feature bit
from the PINGPONG's feature mask, replacing PINGPONG_SDM845_MASK and
PINGPONG_SDM845_TE2_MASK with proper bitmasks.

Fixes: 7204df5e7e68 ("drm/msm/dpu: add support for SDM660 and SDM630 platforms")
Reported-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h
index 
df01227fc36468f4945c03e767e1409ea4fc0896..4fdc9c19a74a0c52ae502b77fb8697a53bef0f97
 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_3_sdm630.h
@@ -115,14 +115,15 @@ static const struct dpu_pingpong_cfg sdm630_pp[] = {
{
.name = "pingpong_0", .id = PINGPONG_0,
.base = 0x7, .len = 0xd4,
-   .features = PINGPONG_SDM845_TE2_MASK,
+   .features = BIT(DPU_PINGPONG_DITHER) |
+   BIT(DPU_PINGPONG_TE2),
.sblk = &sdm845_pp_sblk_te,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12),
}, {
.name = "pingpong_2", .id = PINGPONG_2,
.base = 0x71000, .len = 0xd4,
-   .features = PINGPONG_SDM845_MASK,
+   .features = BIT(DPU_PINGPONG_DITHER),
.sblk = &sdm845_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 14),

-- 
2.39.5



[PATCH 1/4] drm/msm/dpu: remove DSC feature bit for PINGPONG on MSM8937

2025-02-27 Thread Dmitry Baryshkov
The MSM8937 platform doesn't have DSC blocks, nor does it have DSC
registers in the PINGPONG block. Drop the DPU_PINGPONG_DSC feature bit
from the PINGPONG's feature mask and, as it is the only remaining bit,
drop the .features assignment completely.

Fixes: c079680bb0fa ("drm/msm/dpu: Add support for MSM8937")
Reported-by: Abhinav Kumar 
Signed-off-by: Dmitry Baryshkov 
---
 drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h 
b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h
index 
ab3dfb0b374ead36c7f07b0a77c703fb2c09ff8a..a848f825c5948c5819758e131af60b83b543b15a
 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_1_14_msm8937.h
@@ -100,14 +100,12 @@ static const struct dpu_pingpong_cfg msm8937_pp[] = {
{
.name = "pingpong_0", .id = PINGPONG_0,
.base = 0x7, .len = 0xd4,
-   .features = PINGPONG_MSM8996_MASK,
.sblk = &msm8996_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 12),
}, {
.name = "pingpong_1", .id = PINGPONG_1,
.base = 0x70800, .len = 0xd4,
-   .features = PINGPONG_MSM8996_MASK,
.sblk = &msm8996_pp_sblk,
.intr_done = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
.intr_rdptr = DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 13),

-- 
2.39.5



RE: [PATCH] dma-buf: Take a breath during dma-fence-chain subtests

2025-02-27 Thread Gote, Nitin R
Hi,
 
> Am 27.02.25 um 13:52 schrieb Andi Shyti:
> > Hi Nitin,
> >
> > On Wed, Feb 26, 2025 at 09:25:34PM +0530, Nitin Gote wrote:
> >> Give the scheduler a chance to breathe by calling cond_resched() as
> >> some of the loops may take some time on old machines (like
> >> apl/bsw/pnv), and so catch the attention of the watchdogs.
> >>
> >> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12904
> >> Signed-off-by: Nitin Gote 
> > This patch goes beyond the intel-gfx domain so that you need to add
> > some people in Cc. By running checkpatch, you should add:
> >
> > Sumit Semwal  (maintainer:DMA BUFFER SHARING
> > FRAMEWORK) "Christian König" 
> > (maintainer:DMA BUFFER SHARING FRAMEWORK) linux-
> me...@vger.kernel.org
> > (open list:DMA BUFFER SHARING FRAMEWORK)
> > dri-devel@lists.freedesktop.org (open list:DMA BUFFER SHARING
> > FRAMEWORK)
> >
> > I added them now, but you might still be asked to resend.

Thank you Andi, for adding dma-buf maintainers.

> >
> > Said that, at a first glance, I don't have anything against this
> > patch.
> 
> There has been some push to deprecate cond_resched() cause it is almost always
> not appropriate.

Thank you, Christian, for the review.
I could not find any commit or documentation about deprecating the
cond_resched() API.
If you have a reference, could you please share it?

> 
> That said, if I'm not completely mistaken, the usage here is also not 100%
> correct.
> 
> Question is why is the test taking 26 (busy?) seconds to complete? That sounds
> really long even for a very old CPU.
> 
> Do we maybe have an udelay() here which should have been an usleep() or
> similar?

I will check and test with udelay() or similar api.
And I will resend a patch after testing.  

Regards,
Nitin
> 
> Regards,
> Christian.
> 
> >
> > Andi
> >
> >> ---
> >> Hi,
> >>
> >> For reviewer reference, adding here watchdog issue seen on old
> >> machines during dma-fence-chain subtests testing. This log is
> >> retrieved from device pstore log while testing dma-buf@all-tests:
> >>
> >> dma-buf: Running dma_fence_chain
> >> Panic#1 Part7
> >> <6> sizeof(dma_fence_chain)=184
> >> <6> dma-buf: Running dma_fence_chain/sanitycheck <6> dma-buf: Running
> >> dma_fence_chain/find_seqno <6> dma-buf: Running
> >> dma_fence_chain/find_signaled <6> dma-buf: Running
> >> dma_fence_chain/find_out_of_order <6> dma-buf: Running
> >> dma_fence_chain/find_gap <6> dma-buf: Running
> >> dma_fence_chain/find_race <6> Completed 4095 cycles <6> dma-buf:
> >> Running dma_fence_chain/signal_forward <6> dma-buf: Running
> >> dma_fence_chain/signal_backward <6> dma-buf: Running
> >> dma_fence_chain/wait_forward <6> dma-buf: Running
> >> dma_fence_chain/wait_backward <0> watchdog: BUG: soft lockup - CPU#2
> >> stuck for 26s! [dmabuf:2263]
> >> Panic#1 Part6
> >> <4> irq event stamp: 415735
> >> <4> hardirqs last  enabled at (415734): []
> >> handle_softirqs+0xab/0x4d0 <4> hardirqs last disabled at (415735):
> >> [] sysvec_apic_timer_interrupt+0x11/0xc0
> >> <4> softirqs last  enabled at (415728): []
> >> __irq_exit_rcu+0x13f/0x160 <4> softirqs last disabled at (415733):
> >> [] __irq_exit_rcu+0x13f/0x160 <4> CPU: 2 UID: 0
> >> PID: 2263 Comm: dmabuf Not tainted
> >> 6.14.0-rc2-drm-next_483-g7b91683e7de7+ #1 <4> Hardware name: Intel
> >> corporation NUC6CAYS/NUC6CAYB, BIOS
> AYAPLCEL.86A.0056.2018.0926.1100
> >> 09/26/2018 <4> RIP: 0010:handle_softirqs+0xb1/0x4d0 <4> RSP:
> >> 0018:c9154f60 EFLAGS: 0246 <4> RAX:  RBX:
> >> 0001 RCX:  <4> RDX: 
> RSI:
> >>  RDI:  <4> RBP: c9154fb8 R08:
> >>  R09:  <4> R10: 
> R11:
> >>  R12: 000a <4> R13: 0200
> R14:
> >> 0200 R15: 00400100 <4> FS:
> >> 77521c5cd940() GS:88827790()
> >> knlGS:
> >> Panic#1 Part5
> >> <4> CS:  0010 DS:  ES:  CR0: 80050033 <4> CR2:
> >> 5dbfee8c00c4 CR3: 000133d38000 CR4: 003526f0 <4>
> Call
> >> Trace:
> >> <4>  
> >> <4>  ? show_regs+0x6c/0x80
> >> <4>  ? watchdog_timer_fn+0x247/0x2d0
> >> <4>  ? __pfx_watchdog_timer_fn+0x10/0x10 <4>  ?
> >> __hrtimer_run_queues+0x1d0/0x420 <4>  ? hrtimer_interrupt+0x116/0x290
> >> <4>  ? __sysvec_apic_timer_interrupt+0x70/0x1e0
> >> <4>  ? sysvec_apic_timer_interrupt+0x47/0xc0
> >> <4>  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
> >> <4>  ? handle_softirqs+0xb1/0x4d0
> >> <4>  __irq_exit_rcu+0x13f/0x160
> >> <4>  irq_exit_rcu+0xe/0x20
> >> <4>  sysvec_irq_work+0xa0/0xc0
> >> <4>  
> >> <4>  
> >> <4>  asm_sysvec_irq_work+0x1b/0x20
> >> <4> RIP: 0010:_raw_spin_unlock_irqrestore+0x57/0x80
> >> <4> RSP: 0018:c9000292b8f0 EFLAGS: 0246 <4> RAX:
> >>  RBX: 88810f235480 RCX:  <4> RDX:
> >>  RSI:  RDI:  <4>
> RBP:
> >

Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Timur Tabi
On Fri, 2025-02-28 at 07:37 +1000, Dave Airlie wrote:
> I've tried to retrofit checking 0x to drivers a lot, I'd
> prefer not to. Drivers getting stuck in wait for clear bits for ever.

That's what read_poll_timeout() is for.  I'm surprised Nouveau doesn't use it.
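
For reference, a minimal userspace sketch of the bounded-polling idea behind
read_poll_timeout() (the kernel helper from linux/iopoll.h). This is not the
kernel macro: poll_bits_clear(), struct fake_reg and fake_reg_read() are
hypothetical names, and a simple poll budget stands in for the kernel's
microsecond timeout so the example stays self-contained.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register accessor: returns the current register value. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until all bits in @mask read back as zero, giving up after
 * @max_polls reads instead of spinning forever.
 * Returns 0 on success, -ETIMEDOUT if the bits never cleared.
 */
int poll_bits_clear(reg_read_fn read_reg, void *ctx,
		    uint32_t mask, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if ((read_reg(ctx) & mask) == 0)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Hypothetical stand-in for hardware: busy for the first @busy_reads reads. */
struct fake_reg {
	unsigned int busy_reads;
};

uint32_t fake_reg_read(void *ctx)
{
	struct fake_reg *reg = ctx;

	if (reg->busy_reads > 0) {
		reg->busy_reads--;
		return 0x1;	/* busy bit still set */
	}
	return 0x0;		/* busy bit cleared */
}
```

A caller would treat -ETIMEDOUT as a hardware fault and bail out, rather than
waiting indefinitely for the bits to clear.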


Re: [PATCH][next] drm/nouveau: Avoid multiple -Wflex-array-member-not-at-end warnings

2025-02-27 Thread Gustavo A. R. Silva

 > Applied to drm-misc-next, thanks!

Awesome. :)

Thank you, guys.
--
Gustavo



Re: [PATCH 02/17] bitops: Add generic parity calculation for u64

2025-02-27 Thread H. Peter Anvin
On February 27, 2025 1:57:41 PM PST, David Laight 
 wrote:
>On Thu, 27 Feb 2025 13:05:29 -0500
>Yury Norov  wrote:
>
>> On Wed, Feb 26, 2025 at 10:29:11PM +, David Laight wrote:
>> > On Mon, 24 Feb 2025 14:27:03 -0500
>> > Yury Norov  wrote:
>> >   
>> > > +#define parity(val) \
>> > > +({  \
>> > > +u64 __v = (val);\
>> > > +int __ret;  \
>> > > +switch (BITS_PER_TYPE(val)) {   \
>> > > +case 64:\
>> > > +__v ^= __v >> 32;   \
>> > > +fallthrough;\
>> > > +case 32:\
>> > > +__v ^= __v >> 16;   \
>> > > +fallthrough;\
>> > > +case 16:\
>> > > +__v ^= __v >> 8;\
>> > > +fallthrough;\
>> > > +case 8: \
>> > > +__v ^= __v >> 4;\
>> > > +__ret =  (0x6996 >> (__v & 0xf)) & 1;   \
>> > > +break;  \
>> > > +default:\
>> > > +BUILD_BUG();\
>> > > +}   \
>> > > +__ret;  \
>> > > +})
>> > > +  
>> > 
>> > You really don't want to do that!
>> > gcc makes a right hash of it for x86 (32bit).
>> > See https://www.godbolt.org/z/jG8dv3cvs  
>> 
>> GCC fails to even understand this. Of course, the __v should be an
>> __auto_type. But that way GCC fails to understand that case 64 is
>> a dead code for all smaller type and throws a false-positive 
>> Wshift-count-overflow. This is a known issue, unfixed for 25 years!
>
>Just do __v ^= __v >> 16 >> 16
>
>> 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=4210
>>  
>> > You do better using a __v32 after the 64bit xor.  
>> 
>> It should be an __auto_type. I already mentioned. So because of that,
>> we can either do something like this:
>> 
>>   #define parity(val)\
>>   ({ \
>>   #ifdef CLANG  \
>>  __auto_type __v = (val);\
>>   #else /* GCC; because of this and that */ \
>>  u64 __v = (val);\
>>   #endif\
>>  int __ret;  \
>> 
>> Or simply disable Wshift-count-overflow for GCC.
>
>For 64bit values on 32bit it is probably better to do:
>int p32(unsigned long long x)
>{
>unsigned int lo = x;
>lo ^= x >> 32;
>lo ^= lo >> 16;
>lo ^= lo >> 8;
>lo ^= lo >> 4;
>return (0x6996 >> (lo & 0xf)) & 1;
>}
>That stops the compiler doing 64bit shifts (ok on x86, but probably not 
>elsewhere).
>It is likely to be reasonably optimal for most 64bit cpu as well.
>(For x86-64 it probably removes a load of REX prefix.)
>(It adds an extra instruction to arm because of its barrel shifter.)
>
>
>> 
>> > Even the 64bit version is probably sub-optimal (both gcc and clang).
>> > The whole lot ends up being a big single register dependency chain.
>> > You want to do:  
>> 
>> No, I don't. I want to have a sane compiler that does it for me.
>> 
>> >mov %eax, %edx
>> >shrl $n, %eax
>> >xor %edx, %eax
>> > so that the 'mov' and 'shrl' can happen in the same clock
>> > (without relying on the register-register move being optimised out).
>> > 
>> > I dropped in the arm64 for an example of where the magic shift of 6996
>> > just adds an extra instruction.  
>> 
>> It's still unclear to me that this parity thing is used in hot paths.
>> If that holds, it's unclear that your hand-made version is better than
>> what's generated by GCC.
>
>I wasn't seriously considering doing that optimisation.
>Perhaps just hoping it might make a compiler person think :-)
>
>   David
>
>> 
>> Do you have any perf test?
>> 
>> Thanks,
>> Yury
>

What the compiler people need to do is to not make __builtin_parity*() generate 
crap.
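
For reference, the 32-bit folding approach from David's p32() above can be
sketched as a self-contained function and checked against a bit-by-bit
reference. The names parity64_fold() and parity64_naive() are hypothetical
and only serve this illustration; nothing here is the proposed kernel macro.

```c
#include <stdint.h>

/*
 * Fold a 64-bit value down to one nibble using 32-bit XORs only, then
 * look up the parity of that nibble in the 16-entry bit table 0x6996
 * (bit n of 0x6996 is the parity of n).
 */
int parity64_fold(uint64_t x)
{
	uint32_t lo = (uint32_t)x;

	lo ^= (uint32_t)(x >> 32);	/* the only 64-bit shift */
	lo ^= lo >> 16;
	lo ^= lo >> 8;
	lo ^= lo >> 4;
	return (0x6996 >> (lo & 0xf)) & 1;
}

/* Reference implementation: XOR the bits together one at a time. */
int parity64_naive(uint64_t x)
{
	int p = 0;

	while (x) {
		p ^= (int)(x & 1);
		x >>= 1;
	}
	return p;
}
```

Both functions return 1 for an odd number of set bits and 0 otherwise;
whether the folded form actually beats the compiler builtin on a given
target is exactly what the thread leaves open.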


[PATCH v1 3/7] drm/virtio: implement userptr: probe for the feature

2025-02-27 Thread Honglei Huang
From: Honglei Huang 

Add probe code path for virtio gpu userptr.

Signed-off-by: Honglei Huang 
---
 drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 +
 drivers/gpu/drm/virtio/virtgpu_drv.c | 1 +
 drivers/gpu/drm/virtio/virtgpu_drv.h | 1 +
 drivers/gpu/drm/virtio/virtgpu_kms.c | 8 ++--
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c 
b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
index 853dd9aa397e..da9fa034db0e 100644
--- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c
+++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
@@ -57,6 +57,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data)
virtio_gpu_add_bool(m, "context init", vgdev->has_context_init);
virtio_gpu_add_int(m, "cap sets", vgdev->num_capsets);
virtio_gpu_add_int(m, "scanouts", vgdev->num_scanouts);
+   virtio_gpu_add_int(m, "blob userptr", vgdev->has_resource_userptr);
if (vgdev->host_visible_region.len) {
seq_printf(m, "%-16s : 0x%lx +0x%lx\n", "host visible region",
   (unsigned long)vgdev->host_visible_region.addr,
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c 
b/drivers/gpu/drm/virtio/virtgpu_drv.c
index ffca6e2e1c9a..d79558139084 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -151,6 +151,7 @@ static unsigned int features[] = {
VIRTIO_GPU_F_RESOURCE_UUID,
VIRTIO_GPU_F_RESOURCE_BLOB,
VIRTIO_GPU_F_CONTEXT_INIT,
+   VIRTIO_GPU_F_RESOURCE_USERPTR,
 };
 static struct virtio_driver virtio_gpu_driver = {
.feature_table = features,
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h 
b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 64c236169db8..7bdcbaa20ef1 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -249,6 +249,7 @@ struct virtio_gpu_device {
bool has_resource_blob;
bool has_host_visible;
bool has_context_init;
+   bool has_resource_userptr;
struct virtio_shm_region host_visible_region;
struct drm_mm host_visible_mm;
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c 
b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 7dfb2006c561..3d5158caef46 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -174,6 +174,9 @@ int virtio_gpu_init(struct virtio_device *vdev, struct 
drm_device *dev)
if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_RESOURCE_BLOB)) {
vgdev->has_resource_blob = true;
}
+   if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_RESOURCE_USERPTR)) {
+   vgdev->has_resource_userptr = true;
+   }
if (virtio_get_shm_region(vgdev->vdev, &vgdev->host_visible_region,
  VIRTIO_GPU_SHM_ID_HOST_VISIBLE)) {
if (!devm_request_mem_region(&vgdev->vdev->dev,
@@ -197,11 +200,12 @@ int virtio_gpu_init(struct virtio_device *vdev, struct 
drm_device *dev)
vgdev->has_context_init = true;
}
 
-   DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible",
+   DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible 
%cresource_userptr",
 vgdev->has_virgl_3d? '+' : '-',
 vgdev->has_edid? '+' : '-',
 vgdev->has_resource_blob ? '+' : '-',
-vgdev->has_host_visible ? '+' : '-');
+vgdev->has_host_visible ? '+' : '-',
+vgdev->has_resource_userptr ? '+' : '-');
 
DRM_INFO("features: %ccontext_init\n",
 vgdev->has_context_init ? '+' : '-');
-- 
2.34.1



RE: [PATCH] drm/bridge:anx7625: Enable DSC feature

2025-02-27 Thread Xin Ji
> > > > > > > > > > > > From: Dmitry Baryshkov
> > > > > > > > > > > > 
> > > > > > > > > > > > Sent: Thursday, February 13, 2025 9:04 PM
> > > > > > > > > > > > To: Xin Ji 
> > > > > > > > > > > > Cc: Andrzej Hajda ; Neil
> > > > > > > > > > > > Armstrong ; Robert Foss
> > > > > > > > > > > > ; Laurent Pinchart
> > > > > > > > > > > > ; Jonas Karlman
> > > > > > > > > > > > ; Jernej Skrabec
> > > > > > > > > > > > ; Maarten Lankhorst
> > > > > > > > > > > > ; Maxime Ripard
> > > > > > > > > > > > ; Thomas Zimmermann
> > > > > > > > > > > > ;
> > > > > > > > > > David
> > > > > > > > > > > > Airlie ; Simona Vetter
> > > > > > > > > > > > ; Bernie Liang
> > > > > > > > > > > > ; Qilin Wen
> > > > > > > > > > > > ; treapk...@google.com;
> > > > > > > > > > > > dri-devel@lists.freedesktop.org; linux-
> > > > > > > > > > > > ker...@vger.kernel.org
> > > > > > > > > > > > Subject: Re: [PATCH] drm/bridge:anx7625: Enable
> > > > > > > > > > > > DSC feature
> > > > > > > > > >
> > > > > > > > > > > Please remove such splats, use something more sensible.
> > > > > > > > > OK, I'll change the subject
> > > > > > > >
> > > > > > > > It's not about the subject. Compare just "ABC DEF wrote:"
> > > > > > > > > and your quotation header.
> > > > > > > > Sorry, this message is automatically added by Outlook, I'll
> > > > > > > > remove it.
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Feb 13, 2025 at 08:33:30PM +0800, Xin Ji wrote:
> > > > > > > > > > > > > As anx7625 MIPI RX bandwidth(maximum 1.5Gbps per
> > > > > > > > > > > > > lane) and internal pixel clock(maximum 300M) 
> > > > > > > > > > > > > limitation.
> > > > > > > > > > > > > Anx7625 must enable DSC feature while MIPI
> > > > > > > > > > > > > source want to output 4K30
> > > > > > resolution.
> > > > > > > > > > > >
> > > > > > > > > > > > This commit message is pretty hard to read and
> > > > > > > > > > > > understand for a non-native speaker. Please
> > > > > > > > > > > > consider rewriting it so that it is easier to
> > > > > > > > > > understand it.
> > > > > > > > > > > >
> > > > > > > > > > > Thanks for the review, sorry about that, I'll
> > > > > > > > > > > > > rewrite the commit message
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Xin Ji 
> > > > > > > > > > > > > ---
> > > > > > > > > > > > >  drivers/gpu/drm/bridge/analogix/anx7625.c | 300
> > > > > > > > > > > > > ++
> > > > > > > > > > > > >
> > > ++drivers/gpu/drm/bridge/analogix/anx7625.
> > > > > > > > > > > > > ++h
> > > > > > > > > > > > > ++|
> > > > > > > > > > > > > 32 +++
> > > > > > > > > > > > >  2 files changed, 284 insertions(+), 48
> > > > > > > > > > > > > deletions(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git
> > > > > > > > > > > > > a/drivers/gpu/drm/bridge/analogix/anx7625.c
> > > > > > > > > > > > > b/drivers/gpu/drm/bridge/analogix/anx7625.c
> > > > > > > > > > > > > index 4be34d5c7a3b..7d86ab02f71c 100644
> > > > > > > > > > > > > --- a/drivers/gpu/drm/bridge/analogix/anx7625.c
> > > > > > > > > > > > > +++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
> > > > > > > > > > > > > @@ -22,6 +22,7 @@
> > > > > > > > > > > > >
> > > > > > > > > > > > >  #include 
> > > > > > > > > > > > > #include 
> > > > > > > > > > > > > +#include 
> > > > > > > > > > > > >  #include 
> > > > > > > > > > > > > #include   #include
> > > > > > > > > > > > >  @@
> > > > > > > > > > > > > -476,6
> > > > > > > > > > > > > +477,138 @@ static int
> > > > > > > > > > > > > +anx7625_set_k_value(struct anx7625_data
> > > > > > > > > > > > > +*ctx)
> > > > > > > > > > > > >
> > > > > > > > > > > > > MIPI_DIGITAL_ADJ_1, 0x3D); }
> > > > > > > > > > > > >
> > > > > > > > > > > > > +static int anx7625_set_dsc_params(struct
> > > > > > > > > > > > > +anx7625_data *ctx)
> > > {
> > > > > > > > > > > > > + int ret, i;
> > > > > > > > > > > > > + u16 htotal, vtotal;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + if (!ctx->dsc_en)
> > > > > > > > > > > > > + return 0;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + /* Htotal */
> > > > > > > > > > > > > + htotal = ctx->dt.hactive.min + ctx-
> >dt.hfront_porch.min +
> > > > > > > > > > > > > + ctx->dt.hback_porch.min + 
> > > > > > > > > > > > > ctx->dt.hsync_len.min;
> > > > > > > > > > > > > + ret = anx7625_reg_write(ctx, 
> > > > > > > > > > > > > ctx->i2c.tx_p2_client,
> > > > > > > > > > > > > + 
> > > > > > > > > > > > > HORIZONTAL_TOTAL_PIXELS_L, htotal);
> > > > > > > > > > > > > + ret |= anx7625_reg_write(ctx, 
> > > > > > > > > > > > > ctx->i2c.tx_p2_client,
> > > > > > > > > > > > > +  
> > > > > > > > > > > > > HORIZONTAL_TOTAL_PIXELS_H, htotal >>
> 8);
> > > > > > > > > > > > > + /* Hactive */
> > > > > > > > > > > > > + ret |= anx7625_reg_write(ctx,

Re: [PATCH] drm/vboxvideo: Remove unused hgsmi_cursor_position

2025-02-27 Thread Dr. David Alan Gilbert
* li...@treblig.org (li...@treblig.org) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> hgsmi_cursor_position() has been unused since 2018's
> commit 35f3288c453e ("staging: vboxvideo: Atomic phase 1: convert cursor to
> universal plane")
> 
> Remove it.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Hi David, Simona,
  Will this one be picked up by drm-next?  It got Hans's
review back on 16 Dec.
( in 2513e942-6391-4a96-b487-1e4ba19b7...@redhat.com )

  Thanks,

Dave

> ---
>  drivers/gpu/drm/vboxvideo/hgsmi_base.c  | 37 -
>  drivers/gpu/drm/vboxvideo/vboxvideo_guest.h |  2 --
>  2 files changed, 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vboxvideo/hgsmi_base.c 
> b/drivers/gpu/drm/vboxvideo/hgsmi_base.c
> index 87dccaecc3e5..db994aeaa0f9 100644
> --- a/drivers/gpu/drm/vboxvideo/hgsmi_base.c
> +++ b/drivers/gpu/drm/vboxvideo/hgsmi_base.c
> @@ -181,40 +181,3 @@ int hgsmi_update_pointer_shape(struct gen_pool *ctx, u32 
> flags,
>  
>   return rc;
>  }
> -
> -/**
> - * hgsmi_cursor_position - Report the guest cursor position.  The host may
> - * wish to use this information to re-position its
> - * own cursor (though this is currently unlikely).
> - * The current host cursor position is returned.
> - * Return: 0 or negative errno value.
> - * @ctx:  The context containing the heap used.
> - * @report_position:  Are we reporting a position?
> - * @x:Guest cursor X position.
> - * @y:Guest cursor Y position.
> - * @x_host:   Host cursor X position is stored here.  Optional.
> - * @y_host:   Host cursor Y position is stored here.  Optional.
> - */
> -int hgsmi_cursor_position(struct gen_pool *ctx, bool report_position,
> -   u32 x, u32 y, u32 *x_host, u32 *y_host)
> -{
> - struct vbva_cursor_position *p;
> -
> - p = hgsmi_buffer_alloc(ctx, sizeof(*p), HGSMI_CH_VBVA,
> -VBVA_CURSOR_POSITION);
> - if (!p)
> - return -ENOMEM;
> -
> - p->report_position = report_position;
> - p->x = x;
> - p->y = y;
> -
> - hgsmi_buffer_submit(ctx, p);
> -
> - *x_host = p->x;
> - *y_host = p->y;
> -
> - hgsmi_buffer_free(ctx, p);
> -
> - return 0;
> -}
> diff --git a/drivers/gpu/drm/vboxvideo/vboxvideo_guest.h 
> b/drivers/gpu/drm/vboxvideo/vboxvideo_guest.h
> index 55fcee3a6470..643c4448bdcb 100644
> --- a/drivers/gpu/drm/vboxvideo/vboxvideo_guest.h
> +++ b/drivers/gpu/drm/vboxvideo/vboxvideo_guest.h
> @@ -34,8 +34,6 @@ int hgsmi_query_conf(struct gen_pool *ctx, u32 index, u32 
> *value_ret);
>  int hgsmi_update_pointer_shape(struct gen_pool *ctx, u32 flags,
>  u32 hot_x, u32 hot_y, u32 width, u32 height,
>  u8 *pixels, u32 len);
> -int hgsmi_cursor_position(struct gen_pool *ctx, bool report_position,
> -   u32 x, u32 y, u32 *x_host, u32 *y_host);
>  
>  bool vbva_enable(struct vbva_buf_ctx *vbva_ctx, struct gen_pool *ctx,
>struct vbva_buffer *vbva, s32 screen);
> -- 
> 2.47.1
> 
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/


Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread Dave Airlie
On Wed, 26 Feb 2025 at 00:11, Alexandre Courbot  wrote:
>
> On Mon Feb 24, 2025 at 9:07 PM JST, Danilo Krummrich wrote:
> > CC: Gary
> >
> > On Mon, Feb 24, 2025 at 10:40:00AM +0900, Alexandre Courbot wrote:
> >> This inability to sleep while we are accessing registers seems very
> >> constraining to me, if not dangerous. It is pretty common to have
> >> functions intermingle hardware accesses with other operations that might
> >> sleep, and this constraint means that in such cases the caller would
> >> need to perform guard lifetime management manually:
> >>
> >>   let bar_guard = bar.try_access()?;
> >>   /* do something non-sleeping with bar_guard */
> >>   drop(bar_guard);
> >>
> >>   /* do something that might sleep */
> >>
> >>   let bar_guard = bar.try_access()?;
> >>   /* do something non-sleeping with bar_guard */
> >>   drop(bar_guard);
> >>
> >>   ...
> >>
> >> Failure to drop the guard potentially introduces a race condition, which
> >> will receive no compile-time warning and potentially not even a runtime
> >> one unless lockdep is enabled. This problem does not exist with the
> >> equivalent C code AFAICT, which makes the Rust version actually more
> >> error-prone and dangerous, the opposite of what we are trying to achieve
> >> with Rust. Or am I missing something?
> >
> > Generally you are right, but you have to see it from a different 
> > perspective.
> >
> > What you describe is not an issue that comes from the design of the API, 
> > but is
> > a limitation of Rust in the kernel. People are aware of the issue and with 
> > klint
> > [1] there are solutions for that in the pipeline, see also [2] and [3].
> >
> > [1] https://rust-for-linux.com/klint
> > [2] https://github.com/Rust-for-Linux/klint/blob/trunk/doc/atomic_context.md
> > [3] https://www.memorysafety.org/blog/gary-guo-klint-rust-tools/
>
> Thanks, I wasn't aware of klint and it looks indeed cool, even if not perfect
> by its own admission. But even if we ignore the safety issue, the other one
> (ergonomics) is still there.
>
> Basically this way of accessing registers imposes quite a mental burden on its
> users. It requires a very different (and harsher) discipline than when writing
> the same code in C, and the correct granularity to use is unclear to me.
>
> For instance, if I want to do the equivalent of Nouveau's nvkm_usec() to poll 
> a
> particular register in a busy loop, should I call try_access() once before the
> loop? Or every time before accessing the register? I'm afraid having to check
> that the resource is still alive before accessing any register is going to
> become tedious very quickly.
>
> I understand that we want to protect against accessing the IO region of an
> unplugged device ; but still there is no guarantee that the device won't be
> unplugged in the middle of a critical section, however short. Thus the driver
> code should be able to recognize that the device has fallen off the bus when 
> it
> e.g. gets a bunch of 0xff instead of a valid value. So do we really need this
> extra protection that AFAICT isn't used in C?

Yes.

I've tried to retrofit checking 0x to drivers a lot, I'd
prefer not to. Drivers getting stuck in wait for clear bits for ever.

Dave.


Re: [PATCH 9/9] arm64: dts: imx95: Describe Mali G310 GPU

2025-02-27 Thread Frank Li
On Thu, Feb 27, 2025 at 09:36:55PM +0100, Marek Vasut wrote:
> On 2/27/25 6:43 PM, Frank Li wrote:
> [...]
>
> > > diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi 
> > > b/arch/arm64/boot/dts/freescale/imx95.dtsi
> > > index 3af13173de4bd..36bad211e5558 100644
> > > --- a/arch/arm64/boot/dts/freescale/imx95.dtsi
> > > +++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
> > > @@ -249,6 +249,37 @@ dummy: clock-dummy {
> > >   clock-output-names = "dummy";
> > >   };
> > >
> > > + gpu_fixed_reg: fixed-gpu-reg {
> > > + compatible = "regulator-fixed";
> > > + regulator-min-microvolt = <92>;
> > > + regulator-max-microvolt = <92>;
> > > + regulator-name = "vdd_gpu";
> > > + regulator-always-on;
> > > + regulator-boot-on;
> >
> > > Does it really need regulator-boot-on and regulator-always-on ?
>
> I don't think so, this is a development remnant, fixed, thanks.
>
> [...]
>
> > > + gpu: gpu@4d90 {
> > > + compatible = "fsl,imx95-mali", "arm,mali-valhall-csf";
> > > + reg = <0 0x4d90 0 0x48>;
> > > + clocks = <&scmi_clk IMX95_CLK_GPU>;
> > > + clock-names = "core";
> > > + interrupts = ,
> > > +  ,
> > > +  ;
> > > + interrupt-names = "gpu", "job", "mmu";
> > > + mali-supply = <&gpu_fixed_reg>;
> > > + operating-points-v2 = <&gpu_opp_table>;
> > > + power-domains = <&scmi_devpd IMX95_PD_GPU>, <&scmi_perf 
> > > IMX95_PERF_GPU>;
> > > + power-domain-names = "mix", "perf";
> > > + resets = <&gpu_blk_ctrl 0>;
> > > + #cooling-cells = <2>;
> > > + dynamic-power-coefficient = <1013>;
> > > + status = "disabled";
> >
> > > The GPU is an internal module that has little dependence on other modules
> > > such as pinmux. Why is the default status "disabled"? The GPU driver is
> > > supposed to turn off clock and power if not used.
> My thinking was that there are MX95 SoC with GPU fused off, hence it is
> better to keep the GPU disabled in DT by default. But I can also keep it
> enabled and the few boards which do not have MX95 SoC with GPU can
> explicitly disable it in board DT.
>
> What do you think ?

GPU Fuse off should use access-control, see thread
https://lore.kernel.org/imx/20250207120213.GD14860@localhost.localdomain/

Frank


Re: [PATCH 4/4] panic_qr: use new #[export] macro

2025-02-27 Thread Boqun Feng
On Thu, Feb 27, 2025 at 05:02:02PM +, Alice Ryhl wrote:
> This validates at compile time that the signatures match what is in the
> header file. It highlights one annoyance with the compile-time check,
> which is that it can only be used with functions marked unsafe.
> 
> If the function is not unsafe, then this error is emitted:
> 
> error[E0308]: `if` and `else` have incompatible types

Is there a way to improve this error message? I vaguely remember there
are ways to produce customized error messages.

Regards,
Boqun

>--> /drivers/gpu/drm/drm_panic_qr.rs:987:19
> |
> 986 | #[export]
> | - expected because of this
> 987 | pub extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: 
> usize) -> usize {
> |   ^^ expected unsafe fn, found 
> safe fn
> |
> = note: expected fn item `unsafe extern "C" fn(_, _) -> _ 
> {kernel::bindings::drm_panic_qr_max_data_size}`
>found fn item `extern "C" fn(_, _) -> _ 
> {drm_panic_qr_max_data_size}`
> 
> Signed-off-by: Alice Ryhl 
> ---
>  drivers/gpu/drm/drm_panic.c |  5 -
>  drivers/gpu/drm/drm_panic_qr.rs | 15 +++
>  include/drm/drm_panic.h |  7 +++
>  rust/bindings/bindings_helper.h |  4 
>  4 files changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c
> index f128d345b16d..dee5301dd729 100644
> --- a/drivers/gpu/drm/drm_panic.c
> +++ b/drivers/gpu/drm/drm_panic.c
> @@ -486,11 +486,6 @@ static void drm_panic_qr_exit(void)
>   stream.workspace = NULL;
>  }
>  
> -extern size_t drm_panic_qr_max_data_size(u8 version, size_t url_len);
> -
> -extern u8 drm_panic_qr_generate(const char *url, u8 *data, size_t data_len, 
> size_t data_size,
> - u8 *tmp, size_t tmp_size);
> -
>  static int drm_panic_get_qr_code_url(u8 **qr_image)
>  {
>   struct kmsg_dump_iter iter;
> diff --git a/drivers/gpu/drm/drm_panic_qr.rs b/drivers/gpu/drm/drm_panic_qr.rs
> index bcf248f69252..d055655aa0cd 100644
> --- a/drivers/gpu/drm/drm_panic_qr.rs
> +++ b/drivers/gpu/drm/drm_panic_qr.rs
> @@ -27,7 +27,10 @@
>  //! * 
>  
>  use core::cmp;
> -use kernel::str::CStr;
> +use kernel::{
> +prelude::*,
> +str::CStr,
> +};
>  
>  #[derive(Debug, Clone, Copy, PartialEq, Eq, Ord, PartialOrd)]
>  struct Version(usize);
> @@ -929,7 +932,7 @@ fn draw_all(&mut self, data: impl Iterator) {
>  /// * `tmp` must be valid for reading and writing for `tmp_size` bytes.
>  ///
>  /// They must remain valid for the duration of the function call.
> -#[no_mangle]
> +#[export]
>  pub unsafe extern "C" fn drm_panic_qr_generate(
>  url: *const kernel::ffi::c_char,
>  data: *mut u8,
> @@ -980,8 +983,12 @@ fn draw_all(&mut self, data: impl Iterator) {
>  /// * If `url_len` > 0, remove the 2 segments header/length and also count 
> the
>  ///   conversion to numeric segments.
>  /// * If `url_len` = 0, only removes 3 bytes for 1 binary segment.
> -#[no_mangle]
> -pub extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: usize) -> 
> usize {
> +///
> +/// # Safety
> +///
> +/// Always safe to call.
> +#[export]
> +pub unsafe extern "C" fn drm_panic_qr_max_data_size(version: u8, url_len: 
> usize) -> usize {
>  #[expect(clippy::manual_range_contains)]
>  if version < 1 || version > 40 {
>  return 0;
> diff --git a/include/drm/drm_panic.h b/include/drm/drm_panic.h
> index f4e1fa9ae607..2a1536e0229a 100644
> --- a/include/drm/drm_panic.h
> +++ b/include/drm/drm_panic.h
> @@ -163,4 +163,11 @@ static inline void drm_panic_unlock(struct drm_device 
> *dev, unsigned long flags)
>  
>  #endif
>  
> +#if defined(CONFIG_DRM_PANIC_SCREEN_QR_CODE)
> +extern size_t drm_panic_qr_max_data_size(u8 version, size_t url_len);
> +
> +extern u8 drm_panic_qr_generate(const char *url, u8 *data, size_t data_len, 
> size_t data_size,
> + u8 *tmp, size_t tmp_size);
> +#endif
> +
>  #endif /* __DRM_PANIC_H__ */
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 55354e4dec14..5345aa93fb8a 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -36,6 +36,10 @@
>  #include 
>  #include 
>  
> +#if defined(CONFIG_DRM_PANIC_SCREEN_QR_CODE)
> +#include 
> +#endif
> +
>  /* `bindgen` gets confused at certain things. */
>  const size_t RUST_CONST_HELPER_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
>  const size_t RUST_CONST_HELPER_PAGE_SIZE = PAGE_SIZE;
> 
> -- 
> 2.48.1.658.g4767266eb4-goog
> 


Re: [PATCH 9/9] arm64: dts: imx95: Describe Mali G310 GPU

2025-02-27 Thread Marek Vasut

On 2/27/25 10:27 PM, Frank Li wrote:

[...]


> > > > +   gpu: gpu@4d90 {
> > > > +   compatible = "fsl,imx95-mali", "arm,mali-valhall-csf";
> > > > +   reg = <0 0x4d90 0 0x48>;
> > > > +   clocks = <&scmi_clk IMX95_CLK_GPU>;
> > > > +   clock-names = "core";
> > > > +   interrupts = ,
> > > > +,
> > > > +;
> > > > +   interrupt-names = "gpu", "job", "mmu";
> > > > +   mali-supply = <&gpu_fixed_reg>;
> > > > +   operating-points-v2 = <&gpu_opp_table>;
> > > > +   power-domains = <&scmi_devpd IMX95_PD_GPU>, <&scmi_perf
> > > > IMX95_PERF_GPU>;
> > > > +   power-domain-names = "mix", "perf";
> > > > +   resets = <&gpu_blk_ctrl 0>;
> > > > +   #cooling-cells = <2>;
> > > > +   dynamic-power-coefficient = <1013>;
> > > > +   status = "disabled";
> > >
> > > The GPU is an internal module that has little dependence on other modules
> > > such as pinmux. Why is the default status "disabled"? The GPU driver is
> > > supposed to turn off clock and power if not used.
> >
> > My thinking was that there are MX95 SoC with GPU fused off, hence it is
> > better to keep the GPU disabled in DT by default. But I can also keep it
> > enabled and the few boards which do not have MX95 SoC with GPU can
> > explicitly disable it in board DT.
> >
> > What do you think ?
>
> GPU Fuse off should use access-control, see thread
> https://lore.kernel.org/imx/20250207120213.GD14860@localhost.localdomain/

Did that thread ever go anywhere ? It seems there is no real conclusion,
is there ? +Cc Alex .


Re: [PATCH v2 3/6] drm/msm/a6xx: Add support for Adreno 623

2025-02-27 Thread Konrad Dybcio
On 27.02.2025 10:06 PM, Akhil P Oommen wrote:
> On 2/28/2025 1:59 AM, Konrad Dybcio wrote:
>> On 27.02.2025 9:07 PM, Akhil P Oommen wrote:
>>> From: Jie Zhang 
>>>
>>> Add support for Adreno 623 GPU found in QCS8300 chipsets.
>>>
>>> Signed-off-by: Jie Zhang 
>>> Signed-off-by: Akhil P Oommen 
>>> ---
>>>  drivers/gpu/drm/msm/adreno/a6xx_catalog.c   | 29 
>>> +
>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  8 
>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  2 +-
>>>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |  5 +
>>>  4 files changed, 43 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>> index edffb7737a97b268bb2986d557969e651988a344..53e2ff4406d8f0afe474aaafbf0e459ef8f4577d 100644
>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_catalog.c
>>> @@ -879,6 +879,35 @@ static const struct adreno_info a6xx_gpus[] = {
>>> { 0, 0 },
>>> { 137, 1 },
>>> ),
>>> +   }, {
>>> +   .chip_ids = ADRENO_CHIP_IDS(0x06020300),
>>> +   .family = ADRENO_6XX_GEN3,
>>> +   .fw = {
>>> +   [ADRENO_FW_SQE] = "a650_sqe.fw",
>>> +   [ADRENO_FW_GMU] = "a623_gmu.bin",
>>> +   },
>>> +   .gmem = SZ_512K,
>>> +   .inactive_period = DRM_MSM_INACTIVE_PERIOD,
>>> +   .quirks = ADRENO_QUIRK_HAS_CACHED_COHERENT |
>>> +   ADRENO_QUIRK_HAS_HW_APRIV,
>>> +   .init = a6xx_gpu_init,
>>> +   .a6xx = &(const struct a6xx_info) {
>>> +   .hwcg = a690_hwcg,
>>
>> You used the a620 table before, I'm assuming a690 is correct after all?
> 
> Correct. a690_hwcg array has the recommended values for a623.

Thanks for double checking

Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH v4 2/2] drm/tiny: add driver for Apple Touch Bars in x86 Macs

2025-02-27 Thread kernel test robot
Hi Aditya,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.14-rc4 next-20250227]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Aditya-Garg/drm-format-helper-Add-conversion-from-XRGB-to-BGR888/20250224-214352
base:   linus/master
patch link:
https://lore.kernel.org/r/844C1D39-4891-4DC2-8458-F46FA1B59FA0%40live.com
patch subject: [PATCH v4 2/2] drm/tiny: add driver for Apple Touch Bars in x86 Macs
config: loongarch-allyesconfig 
(https://download.01.org/0day-ci/archive/20250228/202502280748.onktnumk-...@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20250228/202502280748.onktnumk-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202502280748.onktnumk-...@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/tiny/appletbdrm.c: In function 'appletbdrm_primary_plane_duplicate_state':
>> drivers/gpu/drm/tiny/appletbdrm.c:524:40: warning: variable 'old_appletbdrm_state' set but not used [-Wunused-but-set-variable]
 524 | struct appletbdrm_plane_state *old_appletbdrm_state;
 |^~~~


vim +/old_appletbdrm_state +524 drivers/gpu/drm/tiny/appletbdrm.c

   520  
   521  static struct drm_plane_state *appletbdrm_primary_plane_duplicate_state(struct drm_plane *plane)
   522  {
   523  	struct drm_shadow_plane_state *new_shadow_plane_state;
 > 524  	struct appletbdrm_plane_state *old_appletbdrm_state;
   525  	struct appletbdrm_plane_state *appletbdrm_state;
   526  
   527  	if (WARN_ON(!plane->state))
   528  		return NULL;
   529  
   530  	old_appletbdrm_state = to_appletbdrm_plane_state(plane->state);
   531  	appletbdrm_state = kzalloc(sizeof(*appletbdrm_state), GFP_KERNEL);
   532  	if (!appletbdrm_state)
   533  		return NULL;
   534  
   535  	/* Request and response are not duplicated and are allocated in .atomic_check */
   536  	appletbdrm_state->request = NULL;
   537  	appletbdrm_state->response = NULL;
   538  
   539  	appletbdrm_state->request_size = 0;
   540  	appletbdrm_state->frames_size = 0;
   541  
   542  	new_shadow_plane_state = &appletbdrm_state->base;
   543  
   544  	__drm_gem_duplicate_shadow_plane_state(plane, new_shadow_plane_state);
   545  
   546  	return &new_shadow_plane_state->base;
   547  }
   548  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH 3/9] dt-bindings: gpu: mali-valhall-csf: Document optional reset

2025-02-27 Thread Rob Herring (Arm)


On Thu, 27 Feb 2025 17:58:03 +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 does require
> release from reset by writing into a single GPUMIX block controller
> GPURESET register bit 0. Document support for one optional reset.
> 
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  .../devicetree/bindings/gpu/arm,mali-valhall-csf.yaml  | 3 +++
>  1 file changed, 3 insertions(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:


doc reference errors (make refcheckdocs):

See 
https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20250227170012.124768-4-ma...@denx.de

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.



Re: [PATCH 7/9] dt-bindings: gpu: mali-valhall-csf: Document i.MX95 support

2025-02-27 Thread Rob Herring (Arm)


On Thu, 27 Feb 2025 17:58:07 +0100, Marek Vasut wrote:
> The instance of the GPU populated in Freescale i.MX95 is the
> Mali G310, document support for this variant.
> 
> Signed-off-by: Marek Vasut 
> ---
> Cc: Boris Brezillon 
> Cc: Conor Dooley 
> Cc: David Airlie 
> Cc: Fabio Estevam 
> Cc: Krzysztof Kozlowski 
> Cc: Liviu Dudau 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Pengutronix Kernel Team 
> Cc: Philipp Zabel 
> Cc: Rob Herring 
> Cc: Sascha Hauer 
> Cc: Sebastian Reichel 
> Cc: Shawn Guo 
> Cc: Simona Vetter 
> Cc: Steven Price 
> Cc: Thomas Zimmermann 
> Cc: devicet...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: i...@lists.linux.dev
> Cc: linux-arm-ker...@lists.infradead.org
> ---
>  Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml | 1 +
>  1 file changed, 1 insertion(+)
> 

My bot found errors running 'make dt_binding_check' on your patch:

yamllint warnings/errors:

dtschema/dtc warnings/errors:


doc reference errors (make refcheckdocs):

See 
https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20250227170012.124768-8-ma...@denx.de

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.



Re: [PATCH 0/4] Check Rust signatures at compile time

2025-02-27 Thread Alice Ryhl
On Thu, Feb 27, 2025 at 8:31 PM Greg Kroah-Hartman wrote:
>
> On Thu, Feb 27, 2025 at 05:01:58PM +, Alice Ryhl wrote:
> > Signed-off-by: Alice Ryhl 
>
> It's a bit odd to sign off on a 0/X email with no patch or description
> :)

What b4 does, I do. ;)

> Seriously, nice work!  This resolves the issues I had, and it looks like
> you found a needed fix already where things were not quite defined
> properly.
>
> Reviewed-by: Greg Kroah-Hartman 

Thanks!

Alice


Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice implementation

2025-02-27 Thread John Hubbard
On Thu Feb 27, 2025 at 1:42 PM PST, Dave Airlie wrote:
> On Thu, 27 Feb 2025 at 11:34, John Hubbard  wrote:
>> On Wed Feb 26, 2025 at 5:02 PM PST, Greg KH wrote:
>> > On Wed, Feb 26, 2025 at 07:47:30PM -0400, Jason Gunthorpe wrote:
...
> nova is just a drm driver, it's not a rewrite of the drm subsystem,
> that sort of effort would entail a much larger commitment.

Maybe at this point in the discussion it would help to discern between
nova-core and nova-drm:

drivers/gpu/nova-core/ (under discussion here)
drivers/gpu/drm/nova/ (Future)

...keeping in mind that nova-core will be used by other, non-DRM things,
notably VFIO.

>
> DRM has reasons for doing what drm does, that is a separate discussion
> of how a rust driver fits into the DRM. The rust code has to conform
> to the C expectations for the subsystems they are fitting into.
>
> The drm has spent years moving things to devm/drmm type constructs,
> adding hotplug with the unplug mechanisms, but it's a long journey and
> certainly not something nova would want to wait to reconstruct from
> scratch.

ack.

thanks,
John Hubbard




Re: [PATCH v4 2/2] drm/tiny: add driver for Apple Touch Bars in x86 Macs

2025-02-27 Thread kernel test robot
Hi Aditya,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.14-rc4 next-20250227]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Aditya-Garg/drm-format-helper-Add-conversion-from-XRGB-to-BGR888/20250224-214352
base:   linus/master
patch link:
https://lore.kernel.org/r/844C1D39-4891-4DC2-8458-F46FA1B59FA0%40live.com
patch subject: [PATCH v4 2/2] drm/tiny: add driver for Apple Touch Bars in x86 Macs
config: mips-allyesconfig 
(https://download.01.org/0day-ci/archive/20250228/202502280740.4ffokilx-...@intel.com/config)
compiler: mips-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20250228/202502280740.4ffokilx-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202502280740.4ffokilx-...@intel.com/

All warnings (new ones prefixed by >>):

   In file included from drivers/gpu/drm/tiny/appletbdrm.c:13:
   drivers/gpu/drm/tiny/appletbdrm.c: In function 'appletbdrm_send_request':
>> include/drm/drm_print.h:589:54: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
 589 | dev_##level##type((drm) ? (drm)->dev : NULL, "[drm] " fmt, ##__VA_ARGS__)
 |  ^~~~
   include/linux/dev_printk.h:110:30: note: in definition of macro 'dev_printk_index_wrap'
 110 | _p_func(dev, fmt, ##__VA_ARGS__);   \
 |  ^~~
   include/linux/dev_printk.h:154:56: note: in expansion of macro 'dev_fmt'
 154 | dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
 |^~~
   include/drm/drm_print.h:589:9: note: in expansion of macro 'dev_err'
 589 | dev_##level##type((drm) ? (drm)->dev : NULL, "[drm] " fmt, ##__VA_ARGS__)
 | ^~~~
   include/drm/drm_print.h:602:9: note: in expansion of macro '__drm_printk'
 602 | __drm_printk((drm), err,, "*ERROR* " fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/gpu/drm/tiny/appletbdrm.c:173:17: note: in expansion of macro 'drm_err'
 173 | drm_err(drm, "Actual size (%d) doesn't match expected size (%lu)\n",
 | ^~~
   drivers/gpu/drm/tiny/appletbdrm.c: In function 'appletbdrm_read_response':
>> include/drm/drm_print.h:589:54: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' {aka 'unsigned int'} [-Wformat=]
 589 | dev_##level##type((drm) ? (drm)->dev : NULL, "[drm] " fmt, ##__VA_ARGS__)
 |  ^~~~
   include/linux/dev_printk.h:110:30: note: in definition of macro 'dev_printk_index_wrap'
 110 | _p_func(dev, fmt, ##__VA_ARGS__);   \
 |  ^~~
   include/linux/dev_printk.h:154:56: note: in expansion of macro 'dev_fmt'
 154 | dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
 |^~~
   include/drm/drm_print.h:589:9: note: in expansion of macro 'dev_err'
 589 | dev_##level##type((drm) ? (drm)->dev : NULL, "[drm] " fmt, ##__VA_ARGS__)
 | ^~~~
   include/drm/drm_print.h:602:9: note: in expansion of macro '__drm_printk'
 602 | __drm_printk((drm), err,, "*ERROR* " fmt, ##__VA_ARGS__)
 | ^~~~
   drivers/gpu/drm/tiny/appletbdrm.c:214:17: note: in expansion of macro 'drm_err'
 214 | drm_err(drm, "Actual size (%d) doesn't match expected size (%lu)\n",
 | ^~~
   drivers/gpu/drm/tiny/appletbdrm.c: In function 'appletbdrm_primary_plane_duplicate_state':
   drivers/gpu/drm/tiny/appletbdrm.c:524:40: warning: variable 'old_appletbdrm_state' set but not used [-Wunused-but-set-variable]
 524 | struct appletbdrm_plane_state *old_appletbdrm_state;
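
The two -Wformat warnings above appear on this 32-bit mips build because
size_t is 'unsigned int' there, while '%lu' expects 'unsigned long'. A
minimal userspace sketch of the usual portable fix, the '%zu' conversion,
follows. It is not the driver's code: snprintf() stands in for drm_err(),
and format_size_mismatch() and its parameters are hypothetical names used
only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of appletbdrm): builds the same
 * size-mismatch message as the driver's drm_err() call, but prints the
 * size_t argument with %zu. %zu matches size_t on both 32-bit and 64-bit
 * targets, whereas %lu only matches where size_t is unsigned long.
 */
static int format_size_mismatch(char *buf, size_t buflen,
				int actual, size_t expected)
{
	return snprintf(buf, buflen,
			"Actual size (%d) doesn't match expected size (%zu)",
			actual, expected);
}
```

With '%zu' the format string no longer depends on the width of size_t, so
the same source should build without this warning on both 32-bit and 64-bit
configurations.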
   

[PATCH v3 1/2] drm/panthor: Replace sleep locks with spinlocks in fdinfo path

2025-02-27 Thread Adrián Larumbe
Commit 0590c94c3596 ("drm/panthor: Fix race condition when gathering fdinfo
group samples") introduced an xarray lock to deal with potential
use-after-free errors when accessing groups' fdinfo figures. However, taking
the xarray lock puts the kernel in atomic context, so the next nested mutex
lock will raise a warning when the kernel is compiled with mutex debug options:

CONFIG_DEBUG_RT_MUTEXES=y
CONFIG_DEBUG_MUTEXES=y

Replace Panthor's group fdinfo data mutex with a guarded spinlock.

Signed-off-by: Adrián Larumbe 
Fixes: 0590c94c3596 ("drm/panthor: Fix race condition when gathering fdinfo group samples")
Reviewed-by: Liviu Dudau 
Reviewed-by: Boris Brezillon 
---
 drivers/gpu/drm/panthor/panthor_sched.c | 26 -
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index 1a276db095ff..4d31d1967716 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -9,6 +9,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -631,10 +632,10 @@ struct panthor_group {
struct panthor_gpu_usage data;
 
/**
-* @lock: Mutex to govern concurrent access from drm file's fdinfo callback
-* and job post-completion processing function
+* @fdinfo.lock: Spinlock to govern concurrent access from drm file's fdinfo
+* callback and job post-completion processing function
 */
-   struct mutex lock;
+   spinlock_t lock;
 
/** @fdinfo.kbo_sizes: Aggregate size of private kernel BO's held by the group. */
size_t kbo_sizes;
@@ -910,8 +911,6 @@ static void group_release_work(struct work_struct *work)
   release_work);
u32 i;
 
-   mutex_destroy(&group->fdinfo.lock);
-
for (i = 0; i < group->queue_count; i++)
group_free_queue(group, group->queues[i]);
 
@@ -2861,12 +2860,12 @@ static void update_fdinfo_stats(struct panthor_job *job)
struct panthor_job_profiling_data *slots = queue->profiling.slots->kmap;
struct panthor_job_profiling_data *data = &slots[job->profiling.slot];
 
-   mutex_lock(&group->fdinfo.lock);
-   if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
-   fdinfo->cycles += data->cycles.after - data->cycles.before;
-   if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
-   fdinfo->time += data->time.after - data->time.before;
-   mutex_unlock(&group->fdinfo.lock);
+   scoped_guard(spinlock, &group->fdinfo.lock) {
+   if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_CYCLES)
+   fdinfo->cycles += data->cycles.after - data->cycles.before;
+   if (job->profiling.mask & PANTHOR_DEVICE_PROFILING_TIMESTAMP)
+   fdinfo->time += data->time.after - data->time.before;
+   }
 }
 
 void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
@@ -2880,12 +2879,11 @@ void panthor_fdinfo_gather_group_samples(struct panthor_file *pfile)
 
xa_lock(&gpool->xa);
xa_for_each(&gpool->xa, i, group) {
-   mutex_lock(&group->fdinfo.lock);
+   guard(spinlock)(&group->fdinfo.lock);
pfile->stats.cycles += group->fdinfo.data.cycles;
pfile->stats.time += group->fdinfo.data.time;
group->fdinfo.data.cycles = 0;
group->fdinfo.data.time = 0;
-   mutex_unlock(&group->fdinfo.lock);
}
xa_unlock(&gpool->xa);
 }
@@ -3537,7 +3535,7 @@ int panthor_group_create(struct panthor_file *pfile,
mutex_unlock(&sched->reset.lock);
 
add_group_kbo_sizes(group->ptdev, group);
-   mutex_init(&group->fdinfo.lock);
+   spin_lock_init(&group->fdinfo.lock);
 
return gid;
 
-- 
2.47.1
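
For readers unfamiliar with the guard()/scoped_guard() helpers used in the
patch above: they build on the compiler's cleanup attribute, so the lock is
dropped automatically when the guard variable goes out of scope. The
following is a rough userspace sketch of that idea, assuming GCC or Clang.
Every toy_* name and the TOY_GUARD macro are invented for illustration and
are not the kernel API; the toy spinlock is a C11 atomic flag rather than
spinlock_t.

```c
#include <stdatomic.h>

/* Toy spinlock built on a C11 atomic flag; stands in for spinlock_t. */
struct toy_spinlock {
	atomic_flag locked;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* busy-wait, never sleeps */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Cleanup handler: called with a pointer to the guard variable. */
static void toy_guard_release(struct toy_spinlock **l)
{
	toy_spin_unlock(*l);
}

/*
 * Take the lock now and drop it when the enclosing scope ends.
 * Note that lockp is evaluated twice, so it must have no side effects.
 */
#define TOY_GUARD(lockp) \
	struct toy_spinlock *toy_guard_var \
		__attribute__((cleanup(toy_guard_release), unused)) = \
		(toy_spin_lock(lockp), (lockp))

struct toy_stats {
	struct toy_spinlock lock;
	unsigned long long cycles;
};

/* Mirrors the gather loop: lock, drain one group, unlock per iteration. */
static unsigned long long toy_gather(struct toy_stats *groups, int n)
{
	unsigned long long total = 0;

	for (int i = 0; i < n; i++) {
		TOY_GUARD(&groups[i].lock);
		total += groups[i].cycles;
		groups[i].cycles = 0;
	} /* guard variable leaves scope here, releasing the lock */

	return total;
}
```

Because the guard variable is scoped to the loop body, the lock is released
at the end of every iteration, which is why the patch can drop the explicit
unlock call in panthor_fdinfo_gather_group_samples().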


