Re: [Intel-gfx] [PATCH] drm/i915/guc: Use iosys_map interface to update lrc_desc
On 11.03.2022 10:40, Lucas De Marchi wrote: > On Tue, Mar 08, 2022 at 10:17:42PM +0530, Balasubramani Vivekanandan wrote: > > This patch is continuation of the effort to move all pointers in i915, > > which at any point may be pointing to device memory or system memory, to > > iosys_map interface. > > More details about the need of this change is explained in the patch > > series which initiated this task > > https://patchwork.freedesktop.org/series/99711/ > > > > This patch converts all access to the lrc_desc through iosys_map > > interfaces. > > > > Cc: Lucas De Marchi > > Cc: John Harrison > > Cc: Matthew Brost > > Cc: Umesh Nerlige Ramappa > > Signed-off-by: Balasubramani Vivekanandan > > > > --- > > ... > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h > > @@ -2245,13 +2256,13 @@ static void > > prepare_context_registration_info(struct intel_context *ce) > > GEM_BUG_ON(i915_gem_object_is_lmem(guc->ct.vma->obj) != > >i915_gem_object_is_lmem(ce->ring->vma->obj)); > > > > - desc = __get_lrc_desc(guc, ctx_id); > > - desc->engine_class = engine_class_to_guc_class(engine->class); > > - desc->engine_submit_mask = engine->logical_mask; > > - desc->hw_context_desc = ce->lrc.lrca; > > - desc->priority = ce->guc_state.prio; > > - desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD; > > - guc_context_policy_init(engine, desc); > > + memset(&desc, 0, sizeof(desc)); > > previously we would re-use whatever was left in > guc->lrc_desc_pool_vaddr. Here we are changing it to always zero > everything and set the fields we are interested in. > > As I'm not too familiar with this part and I see us traversing child > guc_process_desc > which may point to the same id, it doesn't _feel_ safe. Did you check if > this is not zero'ing what it shouldn't? > > Matt Brost / John / Daniele, could you clarify? 
>
> thanks
> Lucas De Marchi

I verified that struct guc_lrc_desc is not updated anywhere else in the driver other than in prepare_context_registration_info. So I went ahead with clearing it before updating the fields. But I will still wait for comments from Matt Brost / John / Daniele for their confirmation.

Thanks
Bala
Re: [Intel-gfx] [PATCH v2 4/7] drm/i915/guc: use the memcpy_from_wc call from the drm
On 21.03.2022 14:14, Lucas De Marchi wrote: > On Thu, Mar 03, 2022 at 11:30:10PM +0530, Balasubramani Vivekanandan wrote: > > memcpy_from_wc functions in i915_memcpy.c will be removed and replaced > > by the implementation in drm_cache.c. > > Updated to use the functions provided by drm_cache.c. > > > > v2: Check if the log object allocated from local memory or system memory > >and according setup the iosys_map (Lucas) > > > > Cc: Lucas De Marchi > > > > Signed-off-by: Balasubramani Vivekanandan > > > > --- > > drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 15 --- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > index a24dc6441872..b9db765627ea 100644 > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c > > @@ -3,6 +3,7 @@ > > * Copyright © 2014-2019 Intel Corporation > > */ > > > > +#include > > #include > > #include > > > > @@ -206,6 +207,7 @@ static void guc_read_update_log_buffer(struct > > intel_guc_log *log) > > enum guc_log_buffer_type type; > > void *src_data, *dst_data; > > bool new_overflow; > > + struct iosys_map src_map; > > > > mutex_lock(&log->relay.lock); > > > > @@ -282,14 +284,21 @@ static void guc_read_update_log_buffer(struct > > intel_guc_log *log) > > } > > > > /* Just copy the newly written data */ > > + if (i915_gem_object_is_lmem(log->vma->obj)) > > + iosys_map_set_vaddr_iomem(&src_map, (void __iomem > > *)src_data); > > + else > > + iosys_map_set_vaddr(&src_map, src_data); > > It would be better to keep this outside of the loop. So inside the loop > we can use only iosys_map_incr(&src_map, buffer_size). However you'd > also have to handle the read_offset. The iosys_map_ API has both a > src_offset and dst_offset due to situations like that. Maybe this is > missing in the new drm_memcpy_* function you're adding? 
> > This function was not correct wrt to IO memory access with the other > 2 places in this function doing plain memcpy(). Since we are starting to > use iosys_map here, we probably should handle this commit as "migrate to > iosys_map", and convert those. In your current final state > we have 3 variables aliasing the same memory location. IMO it will be > error prone to keep it like that

Yes, it is a good suggestion to completely change the reading of the GuC log for the relay to use the iosys_map interfaces. Though it was planned eventually, doing it now with this series will avoid mixing memcpy() and the drm_memcpy_*() functions (which need iosys_map parameters). I will make the changes.

> > +Michal, some questions: > > - I'm not very familiar with the relayfs API. Is the `dst_data += PAGE_SIZE;` > really correct? > > - Could you double check this patch and ack if ok? > > Heads up that since the log buffer is potentially in lmem, we will need > to convert this function to take that into account. All those accesses > to log_buf_state need to use the proper kernel abstraction for system vs > I/O memory. > > thanks > Lucas De Marchi > > > + > > if (read_offset > write_offset) { > > - i915_memcpy_from_wc(dst_data, src_data, write_offset); > > + drm_memcpy_from_wc_vaddr(dst_data, &src_map, > > +write_offset); > > bytes_to_copy = buffer_size - read_offset; > > } else { > > bytes_to_copy = write_offset - read_offset; > > } > > - i915_memcpy_from_wc(dst_data + read_offset, > > - src_data + read_offset, bytes_to_copy); > > + iosys_map_incr(&src_map, read_offset); > > + drm_memcpy_from_wc_vaddr(dst_data + read_offset, &src_map, > > +bytes_to_copy); > > > > src_data += buffer_size; > > dst_data += buffer_size; > > -- > > 2.25.1 > >
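For readers following the copy logic in the quoted hunk, the "copy only the newly written data" step can be modelled in userspace C. This is only a sketch under stated assumptions: copy_from_offset() and copy_new_data() are hypothetical names, plain memcpy() stands in for the write-combined copy, and the source offset is passed explicitly instead of advancing a shared mapping, which is roughly what the src_offset remark in the review points at.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the "copy only the newly written data" step.
 * copy_from_offset() is a hypothetical stand-in for an offset-aware
 * copy helper; it is not a kernel function.
 */
static void copy_from_offset(void *dst, const void *src, size_t src_offset,
			     size_t len)
{
	memcpy(dst, (const char *)src + src_offset, len);
}

/* Returns the number of bytes copied into dst. */
static size_t copy_new_data(char *dst, const char *src, size_t buffer_size,
			    size_t read_offset, size_t write_offset)
{
	size_t copied = 0;
	size_t bytes_to_copy;

	if (read_offset > write_offset) {
		/* The writer wrapped around: first take [0, write_offset). */
		copy_from_offset(dst, src, 0, write_offset);
		copied += write_offset;
		bytes_to_copy = buffer_size - read_offset;
	} else {
		bytes_to_copy = write_offset - read_offset;
	}

	/* Then take [read_offset, read_offset + bytes_to_copy). */
	copy_from_offset(dst + read_offset, src, read_offset, bytes_to_copy);
	return copied + bytes_to_copy;
}
```

Because the source base never changes, the same mapping could be set up once before the loop and only the offsets would vary per buffer.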
Re: [Intel-gfx] [PATCH v2 5/7] drm/i915/selftests: use the memcpy_from_wc call from the drm
On 21.03.2022 16:07, Lucas De Marchi wrote: > Now Cc'ing Daniel properly > > Lucas De Marchi > > On Mon, Mar 21, 2022 at 04:00:56PM -0700, Lucas De Marchi wrote: > > +Thomas Zimmermann and +Daniel Vetter > > > > Could you take a look below regarding the I/O to I/O memory access? > > > > On Thu, Mar 03, 2022 at 11:30:11PM +0530, Balasubramani Vivekanandan wrote: > > > memcpy_from_wc functions in i915_memcpy.c will be removed and replaced > > > by the implementation in drm_cache.c. > > > Updated to use the functions provided by drm_cache.c. > > > > > > v2: check if the source and destination memory address is from local > > > memory or system memory and initialize the iosys_map accordingly > > > (Lucas) > > > > > > Cc: Lucas De Marchi > > > Cc: Matthew Auld > > > Cc: Thomas Hellstr_m > > > > > > Signed-off-by: Balasubramani Vivekanandan > > > > > > --- > > > .../drm/i915/selftests/intel_memory_region.c | 41 +-- > > > 1 file changed, 28 insertions(+), 13 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c > > > b/drivers/gpu/drm/i915/selftests/intel_memory_region.c > > > index ba32893e0873..d16ecb905f3b 100644 > > > --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c > > > +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c > > > @@ -7,6 +7,7 @@ > > > #include > > > > > > #include > > > +#include > > > > > > #include "../i915_selftest.h" > > > > > > @@ -1133,7 +1134,7 @@ static const char *repr_type(u32 type) > > > > > > static struct drm_i915_gem_object * > > > create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 > > > type, > > > - void **out_addr) > > > + struct iosys_map *out_addr) > > > { > > > struct drm_i915_gem_object *obj; > > > void *addr; > > > @@ -1153,7 +1154,11 @@ create_region_for_mapping(struct > > > intel_memory_region *mr, u64 size, u32 type, > > > return addr; > > > } > > > > > > - *out_addr = addr; > > > + if (i915_gem_object_is_lmem(obj)) > > > + 
iosys_map_set_vaddr_iomem(out_addr, (void __iomem *)addr); > > > + else > > > + iosys_map_set_vaddr(out_addr, addr); > > > + > > > return obj; > > > } > > > > > > @@ -1164,24 +1169,33 @@ static int wrap_ktime_compare(const void *A, > > > const void *B) > > > return ktime_compare(*a, *b); > > > } > > > > > > -static void igt_memcpy_long(void *dst, const void *src, size_t size) > > > +static void igt_memcpy_long(struct iosys_map *dst, struct iosys_map *src, > > > + size_t size) > > > { > > > - unsigned long *tmp = dst; > > > - const unsigned long *s = src; > > > + unsigned long *tmp = dst->is_iomem ? > > > + (unsigned long __force *)dst->vaddr_iomem : > > > + dst->vaddr; > > > > if we access vaddr_iomem/vaddr we basically break the promise of > > abstracting system and I/O memory. There is no point in receiving > > struct iosys_map as argument and then break the abstraction. > >

Hi Lucas,

In this patch I did not attempt to convert the memory accesses to the iosys_map interfaces that abstract system and I/O memory. The test functions take iosys_map structures instead of raw pointers for the benefit of igt_memcpy_from_wc(), which needs iosys_map variables to pass to drm_memcpy_from_wc(). The other test functions also receive iosys_map structures, but I kept their earlier behavior by converting the structures back to raw pointers.

I briefly tried using iosys_map structures to perform the memory copy inside the other test functions, but dropped it after realizing that support is lacking for (a) mentioned below in your comment. Since bringing in support for (a) requires some discussion, I did not proceed with it.

Regards,
Bala

> > > + const unsigned long *s = src->is_iomem ? > > > + (unsigned long __force *)src->vaddr_iomem : > > > + src->vaddr; > > > > > > size = size / sizeof(unsigned long); > > > while (size--) > > > *tmp++ = *s++; > > &g
Re: [Intel-gfx] [PATCH v7 3/9] drm/i915/gt: Optimize the migration and clear loop
On 29.03.2022 00:37, Ramalingam C wrote: > Move the static calculations out of the loops for copy and clear. > > Signed-off-by: Ramalingam C > Reviewed-by: Thomas Hellström > --- > drivers/gpu/drm/i915/gt/intel_migrate.c | 44 - > 1 file changed, 21 insertions(+), 23 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c > b/drivers/gpu/drm/i915/gt/intel_migrate.c > index 17dd372a47d1..ec9a9e7cb388 100644 > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c > @@ -526,6 +526,7 @@ intel_context_migrate_copy(struct intel_context *ce, > struct i915_request **out) > { > struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst); > + u32 src_offset, dst_offset; > struct i915_request *rq; > int err; > > @@ -534,8 +535,20 @@ intel_context_migrate_copy(struct intel_context *ce, > > GEM_BUG_ON(ce->ring->size < SZ_64K); > > + src_offset = 0; > + dst_offset = CHUNK_SZ; > + if (HAS_64K_PAGES(ce->engine->i915)) { > + GEM_BUG_ON(!src_is_lmem && !dst_is_lmem); > + > + src_offset = 0; > + dst_offset = 0; > + if (src_is_lmem) > + src_offset = CHUNK_SZ; > + if (dst_is_lmem) > + dst_offset = 2 * CHUNK_SZ; > + } > + > do { > - u32 src_offset, dst_offset; > int len; > > rq = i915_request_create(ce); > @@ -563,19 +576,6 @@ intel_context_migrate_copy(struct intel_context *ce, > if (err) > goto out_rq; > > - src_offset = 0; > - dst_offset = CHUNK_SZ; > - if (HAS_64K_PAGES(ce->engine->i915)) { > - GEM_BUG_ON(!src_is_lmem && !dst_is_lmem); > - > - src_offset = 0; > - dst_offset = 0; > - if (src_is_lmem) > - src_offset = CHUNK_SZ; > - if (dst_is_lmem) > - dst_offset = 2 * CHUNK_SZ; > - } > - > len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, > src_offset, CHUNK_SZ); > if (len <= 0) { > @@ -585,12 +585,10 @@ intel_context_migrate_copy(struct intel_context *ce, > > err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem, > dst_offset, len); > - if (err < 0) > - goto out_rq; > - if (err < len) { > + if (err < len) > err = -EINVAL; 
> + if (err < 0) > goto out_rq; > - } With this change, for the case 0 < err < len, now the code does not reach `goto out_rq`. Is it the expected behavior? If yes, can you please add some details regarding this change in the commit description. Regards, Bala > > err = rq->engine->emit_flush(rq, EMIT_INVALIDATE); > if (err) > @@ -691,6 +689,7 @@ intel_context_migrate_clear(struct intel_context *ce, > { > struct sgt_dma it = sg_sgt(sg); > struct i915_request *rq; > + u32 offset; > int err; > > GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm); > @@ -698,8 +697,11 @@ intel_context_migrate_clear(struct intel_context *ce, > > GEM_BUG_ON(ce->ring->size < SZ_64K); > > + offset = 0; > + if (HAS_64K_PAGES(ce->engine->i915) && is_lmem) > + offset = CHUNK_SZ; > + > do { > - u32 offset; > int len; > > rq = i915_request_create(ce); > @@ -727,10 +729,6 @@ intel_context_migrate_clear(struct intel_context *ce, > if (err) > goto out_rq; > > - offset = 0; > - if (HAS_64K_PAGES(ce->engine->i915) && is_lmem) > - offset = CHUNK_SZ; > - > len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ); > if (len <= 0) { > err = len; > -- > 2.20.1 >
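The control flow around the second emit_pte() call can be checked with a small model in plain C. This is a sketch, assuming the hunk reads exactly as quoted above and that len is positive at this point; struct pte_check, check_old() and check_new() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Outcome of the check that follows the second emit_pte() call. */
struct pte_check {
	bool bail;	/* true if control reaches "goto out_rq" */
	int err;	/* value of err at that point */
};

/* Flow before the patch. */
static struct pte_check check_old(int err, int len)
{
	if (err < 0)
		return (struct pte_check){ true, err };
	if (err < len) {
		err = -EINVAL;
		return (struct pte_check){ true, err };
	}
	return (struct pte_check){ false, err };
}

/* Flow after the patch, as the quoted hunk reads. */
static struct pte_check check_new(int err, int len)
{
	if (err < len)
		err = -EINVAL;
	if (err < 0)
		return (struct pte_check){ true, err };
	return (struct pte_check){ false, err };
}
```

Under that reading, a short write (0 < err < len) appears to bail out with -EINVAL in both versions, while a negative error returned by emit_pte() would now be replaced by -EINVAL instead of being propagated. Either way, a note in the commit description would make the intent explicit.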
Re: [Intel-gfx] [PATCH 1/2] drm/i915/ats-m: add ATS-M platform info
Looks good to me. Reviewed-by: Balasubramani Vivekanandan On 28.03.2022 17:08, Matt Roper wrote: > ATS-M is a server platform based on Xe_HPG and Xe_HPM, but without > display support. From a driver point of view, it's easiest to just > handle it as DG2 (including identifying as PLATFORM_DG2), but with the > display disabled in the device info. > > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/i915_pci.c | 40 - > 1 file changed, 25 insertions(+), 15 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > index 67b89769f577..2025e1114927 100644 > --- a/drivers/gpu/drm/i915/i915_pci.c > +++ b/drivers/gpu/drm/i915/i915_pci.c > @@ -1040,25 +1040,35 @@ static const struct intel_device_info xehpsdv_info = { > .require_force_probe = 1, > }; > > +#define DG2_FEATURES \ > + XE_HP_FEATURES, \ > + XE_HPM_FEATURES, \ > + DGFX_FEATURES, \ > + .graphics.rel = 55, \ > + .media.rel = 55, \ > + PLATFORM(INTEL_DG2), \ > + .has_4tile = 1, \ > + .has_64k_pages = 1, \ > + .has_guc_deprivilege = 1, \ > + .needs_compact_pt = 1, \ > + .platform_engine_mask = \ > + BIT(RCS0) | BIT(BCS0) | \ > + BIT(VECS0) | BIT(VECS1) | \ > + BIT(VCS0) | BIT(VCS2) > + > static const struct intel_device_info dg2_info = { > - XE_HP_FEATURES, > - XE_HPM_FEATURES, > + DG2_FEATURES, > XE_LPD_FEATURES, > - DGFX_FEATURES, > - .graphics.rel = 55, > - .media.rel = 55, > - .has_4tile = 1, > - PLATFORM(INTEL_DG2), > - .has_guc_deprivilege = 1, > - .has_64k_pages = 1, > - .needs_compact_pt = 1, > - .platform_engine_mask = > - BIT(RCS0) | BIT(BCS0) | > - BIT(VECS0) | BIT(VECS1) | > - BIT(VCS0) | BIT(VCS2), > - .require_force_probe = 1, > .display.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | > BIT(TRANSCODER_C) | BIT(TRANSCODER_D), > + .require_force_probe = 1, > +}; > + > +__maybe_unused > +static const struct intel_device_info ats_m_info = { > + DG2_FEATURES, > + .display = { 0 }, > + .require_force_probe = 1, > }; > > #undef PLATFORM > -- > 2.34.1 >
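The pattern the patch uses, a shared feature macro expanded inside several designated initializers, can be illustrated outside the driver. The sketch below is a reduced model only: struct device_info, BASE_FEATURES and the field values are hypothetical and do not mirror the real intel_device_info layout.

```c
#include <stdbool.h>

/* Hypothetical, much reduced device-info structure. */
struct device_info {
	int graphics_rel;
	int media_rel;
	bool has_64k_pages;
	unsigned int transcoder_mask;	/* 0 means no display */
	bool require_force_probe;
};

/* Fields shared by every variant, in the spirit of DG2_FEATURES. */
#define BASE_FEATURES \
	.graphics_rel = 55, \
	.media_rel = 55, \
	.has_64k_pages = true

/* Variant with display: shared fields plus a transcoder mask. */
static const struct device_info with_display = {
	BASE_FEATURES,
	.transcoder_mask = 0xf,
	.require_force_probe = true,
};

/* Headless variant: same shared fields, display left empty. */
static const struct device_info headless = {
	BASE_FEATURES,
	.transcoder_mask = 0,
	.require_force_probe = true,
};
```

Keeping the macro free of a trailing comma lets each user decide what follows it, which is how the patch appends XE_LPD_FEATURES for one variant and an empty display for the other.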
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert slpc to iosys_map
On 09.05.2022 21:05, Mullati Siva wrote: > From: Siva Mullati > > Convert slpc shared data to use iosys_map rather than > plain pointer and save it in the intel_guc_slpc struct. > This will help with in read and update slpc shared data > after the slpc init by abstracting the IO vs system memory. > > Signed-off-by: Siva Mullati > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 82 +++ > .../gpu/drm/i915/gt/uc/intel_guc_slpc_types.h | 5 +- > 2 files changed, 50 insertions(+), 37 deletions(-) Acked-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > index 1db833da42df..ee9fd8e7f1d4 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > @@ -14,6 +14,13 @@ > #include "gt/intel_gt_regs.h" > #include "gt/intel_rps.h" > > +#define slpc_blob_read(slpc_, field_) \ > +iosys_map_rd_field(&(slpc_)->slpc_map, 0, \ > +struct slpc_shared_data, field_) > +#define slpc_blob_write(slpc_, field_, val_) \ > + iosys_map_wr_field(&(slpc_)->slpc_map, 0, \ > + struct slpc_shared_data, field_, val_) > + > static inline struct intel_guc *slpc_to_guc(struct intel_guc_slpc *slpc) > { > return container_of(slpc, struct intel_guc, slpc); > @@ -52,50 +59,51 @@ void intel_guc_slpc_init_early(struct intel_guc_slpc > *slpc) > slpc->selected = __guc_slpc_selected(guc); > } > > -static void slpc_mem_set_param(struct slpc_shared_data *data, > +static void slpc_mem_set_param(struct intel_guc_slpc *slpc, > u32 id, u32 value) > { > + u32 bits = slpc_blob_read(slpc, override_params.bits[id >> 5]); > + > GEM_BUG_ON(id >= SLPC_MAX_OVERRIDE_PARAMETERS); > /* >* When the flag bit is set, corresponding value will be read >* and applied by SLPC. 
>*/ > - data->override_params.bits[id >> 5] |= (1 << (id % 32)); > - data->override_params.values[id] = value; > + bits |= (1 << (id % 32)); > + slpc_blob_write(slpc, override_params.bits[id >> 5], bits); > + slpc_blob_write(slpc, override_params.values[id], value); > } > > -static void slpc_mem_set_enabled(struct slpc_shared_data *data, > +static void slpc_mem_set_enabled(struct intel_guc_slpc *slpc, >u8 enable_id, u8 disable_id) > { > /* >* Enabling a param involves setting the enable_id >* to 1 and disable_id to 0. >*/ > - slpc_mem_set_param(data, enable_id, 1); > - slpc_mem_set_param(data, disable_id, 0); > + slpc_mem_set_param(slpc, enable_id, 1); > + slpc_mem_set_param(slpc, disable_id, 0); > } > > -static void slpc_mem_set_disabled(struct slpc_shared_data *data, > +static void slpc_mem_set_disabled(struct intel_guc_slpc *slpc, > u8 enable_id, u8 disable_id) > { > /* >* Disabling a param involves setting the enable_id >* to 0 and disable_id to 1. >*/ > - slpc_mem_set_param(data, disable_id, 1); > - slpc_mem_set_param(data, enable_id, 0); > + slpc_mem_set_param(slpc, disable_id, 1); > + slpc_mem_set_param(slpc, enable_id, 0); > } > > static u32 slpc_get_state(struct intel_guc_slpc *slpc) > { > - struct slpc_shared_data *data; > - > GEM_BUG_ON(!slpc->vma); > > - drm_clflush_virt_range(slpc->vaddr, sizeof(u32)); > - data = slpc->vaddr; > + if (!slpc->slpc_map.is_iomem) > + drm_clflush_virt_range(slpc->slpc_map.vaddr, sizeof(u32)); > > - return data->header.global_state; > + return slpc_blob_read(slpc, header.global_state); > } > > static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value) > @@ -156,7 +164,9 @@ static int slpc_query_task_state(struct intel_guc_slpc > *slpc) > i915_probe_error(i915, "Failed to query task state (%pe)\n", >ERR_PTR(ret)); > > - drm_clflush_virt_range(slpc->vaddr, SLPC_PAGE_SIZE_BYTES); > + if (!slpc->slpc_map.is_iomem) > + drm_clflush_virt_range(slpc->slpc_map.vaddr, > +SLPC_PAGE_SIZE_BYTES); > > return ret; > } > 
@@ -243,10 +253,11 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc) > struct drm_i915_private *i915 = slpc_to_i915(slpc); > u32 size = PAGE_ALIGN(sizeof(struct slpc_shared_data)); >
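The read-modify-write that slpc_mem_set_param() performs through the field accessors can be sketched in userspace C. This is an illustration under assumptions: blob_read32(), blob_write32() and set_param() are hypothetical names, the structure is a reduced stand-in for the shared data, and the parameter count of 256 is assumed for the example. Unlike the quoted hunk, the sketch checks the id before touching the blob.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OVERRIDE_PARAMETERS 256	/* assumed value for this sketch */

/* Hypothetical reduced layout of the shared override block. */
struct override_params {
	uint32_t bits[MAX_OVERRIDE_PARAMETERS / 32];
	uint32_t values[MAX_OVERRIDE_PARAMETERS];
};

/* Stand-ins for field accessors: every access goes through a copy. */
static uint32_t blob_read32(const void *blob, size_t offset)
{
	uint32_t v;

	memcpy(&v, (const char *)blob + offset, sizeof(v));
	return v;
}

static void blob_write32(void *blob, size_t offset, uint32_t v)
{
	memcpy((char *)blob + offset, &v, sizeof(v));
}

/* Returns 0 on success, -1 if id is out of range. */
static int set_param(void *blob, uint32_t id, uint32_t value)
{
	size_t bits_off, val_off;
	uint32_t bits;

	if (id >= MAX_OVERRIDE_PARAMETERS)
		return -1;

	bits_off = offsetof(struct override_params, bits) +
		   (id >> 5) * sizeof(uint32_t);
	val_off = offsetof(struct override_params, values) +
		  id * sizeof(uint32_t);

	/* Set the flag bit first, then store the value it refers to. */
	bits = blob_read32(blob, bits_off);
	bits |= UINT32_C(1) << (id % 32);
	blob_write32(blob, bits_off, bits);
	blob_write32(blob, val_off, value);
	return 0;
}
```

For id 37, for example, the flag lands in word 1 (37 >> 5) at bit 5 (37 % 32).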
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert ct buffer to iosys_map
On 09.05.2022 12:19, Mullati Siva wrote: > From: Siva Mullati > > Convert CT commands and descriptors to use iosys_map rather > than plain pointer and save it in the intel_guc_ct_buffer struct. > This will help with ct_write and ct_read for cmd send and receive > after the initialization by abstracting the IO vs system memory. > > Signed-off-by: Siva Mullati Acked-by: Balasubramani Vivekanandan > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 195 +- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 9 +- > 2 files changed, 122 insertions(+), 82 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > index f01325cd1b62..bd5b4312d968 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > @@ -44,6 +44,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CT_PROBE_ERROR(_ct, _fmt, ...) \ > i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__) > > +#define ct_desc_read(desc_map_, field_) \ > + iosys_map_rd_field(desc_map_, 0, struct guc_ct_buffer_desc, field_) > +#define ct_desc_write(desc_map_, field_, val_) \ > + iosys_map_wr_field(desc_map_, 0, struct guc_ct_buffer_desc, field_, > val_) > + > /** > * DOC: CTB Blob > * > @@ -76,6 +81,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE) > #define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4) > > +#define CTB_SEND_DESC_OFFSET 0u > +#define CTB_RECV_DESC_OFFSET (CTB_DESC_SIZE) > +#define CTB_SEND_CMDS_OFFSET (2 * CTB_DESC_SIZE) > +#define CTB_RECV_CMDS_OFFSET (2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE) > + > struct ct_request { > struct list_head link; > u32 fence; > @@ -113,9 +123,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct) > init_waitqueue_head(&ct->wq); > } > > -static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) > +static void guc_ct_buffer_desc_init(struct 
iosys_map *desc) > { > - memset(desc, 0, sizeof(*desc)); > + iosys_map_memset(desc, 0, 0, sizeof(struct guc_ct_buffer_desc)); > } > > static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) > @@ -128,17 +138,18 @@ static void guc_ct_buffer_reset(struct > intel_guc_ct_buffer *ctb) > space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space; > atomic_set(&ctb->space, space); > > - guc_ct_buffer_desc_init(ctb->desc); > + guc_ct_buffer_desc_init(&ctb->desc_map); > } > > static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb, > -struct guc_ct_buffer_desc *desc, > -u32 *cmds, u32 size_in_bytes, u32 resv_space) > +struct iosys_map *desc, > +struct iosys_map *cmds, > +u32 size_in_bytes, u32 resv_space) > { > GEM_BUG_ON(size_in_bytes % 4); > > - ctb->desc = desc; > - ctb->cmds = cmds; > + ctb->desc_map = *desc; > + ctb->cmds_map = *cmds; > ctb->size = size_in_bytes / 4; > ctb->resv_space = resv_space / 4; > > @@ -218,12 +229,13 @@ static int ct_register_buffer(struct intel_guc_ct *ct, > bool send, > int intel_guc_ct_init(struct intel_guc_ct *ct) > { > struct intel_guc *guc = ct_to_guc(ct); > - struct guc_ct_buffer_desc *desc; > + struct iosys_map blob_map; > + struct iosys_map desc_map; > + struct iosys_map cmds_map; > u32 blob_size; > u32 cmds_size; > u32 resv_space; > void *blob; > - u32 *cmds; > int err; > > err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO); > @@ -242,27 +254,35 @@ int intel_guc_ct_init(struct intel_guc_ct *ct) > > CT_DEBUG(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), > blob_size); > > - /* store pointers to desc and cmds for send ctb */ > - desc = blob; > - cmds = blob + 2 * CTB_DESC_SIZE; > + if (i915_gem_object_is_lmem(ct->vma->obj)) > + iosys_map_set_vaddr_iomem(&blob_map, > + (void __iomem *)blob); > + else > + iosys_map_set_vaddr(&blob_map, blob); > + > + /* store sysmap to desc_map and cmds_map for send ctb */ > + desc_map = IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_DESC_OFFSET); > + 
cmds_map = IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_CMDS_OFFSET); > cm
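The offset macros introduced by the patch describe one blob carved into two descriptors followed by two command buffers. A userspace sketch of that layout is shown below. The sizes are assumptions chosen for the example (only the 4x ratio between the two command buffers comes from the quoted code), and struct view, view_at() and view_ptr() are hypothetical helpers that play the role of an offset-initialised mapping.

```c
#include <stddef.h>

#define DESC_SIZE		2048u	/* assumed for this sketch */
#define H2G_BUFFER_SIZE		4096u	/* assumed for this sketch */
#define G2H_BUFFER_SIZE		(4 * H2G_BUFFER_SIZE)

#define SEND_DESC_OFFSET	0u
#define RECV_DESC_OFFSET	(DESC_SIZE)
#define SEND_CMDS_OFFSET	(2 * DESC_SIZE)
#define RECV_CMDS_OFFSET	(2 * DESC_SIZE + H2G_BUFFER_SIZE)
#define BLOB_SIZE		(RECV_CMDS_OFFSET + G2H_BUFFER_SIZE)

/* Hypothetical sub-view of a larger mapping: base pointer plus offset. */
struct view {
	unsigned char *base;
	size_t offset;
};

/* Derive a view that starts 'offset' bytes into the parent view. */
static struct view view_at(const struct view *parent, size_t offset)
{
	struct view v = { parent->base, parent->offset + offset };

	return v;
}

/* Address of the first byte covered by the view. */
static unsigned char *view_ptr(const struct view *v)
{
	return v->base + v->offset;
}
```

Deriving every sub-view from the single blob view keeps the memory-type decision in one place: whatever the blob is, the descriptor and command views inherit it.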
[Intel-gfx] [PATCH] drm/i915/hwconfig: Report no hwconfig support on ADL-N
ADL-N, being a subplatform of ADL-P, lacks support for the hwconfig table. Add an explicit check to skip ADL-N.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
index 79c66b6b51a3..5aaa3948de74 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c
@@ -94,7 +94,7 @@ static int guc_hwconfig_fill_buffer(struct intel_guc *guc, struct intel_hwconfig

 static bool has_table(struct drm_i915_private *i915)
 {
-	if (IS_ALDERLAKE_P(i915))
+	if (IS_ALDERLAKE_P(i915) && !IS_ADLP_N(i915))
 		return true;
 	if (IS_DG2(i915))
 		return true;
--
2.25.1
[Intel-gfx] [PATCH] drm/i915/display/adl_p: Updates to HDMI combo PHY voltage swing table
New updates to HDMI combo PHY voltage swing tables. Actually with this update (bspec updated on 08/17/2021), the values are reverted back to be same as icelake for HDMI combo PHY. Bspec: 49291 Signed-off-by: Balasubramani Vivekanandan --- .../drm/i915/display/intel_ddi_buf_trans.c| 22 +-- 1 file changed, 1 insertion(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c b/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c index 85f58dd3df72..5cae1d19bcbb 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c +++ b/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c @@ -878,26 +878,6 @@ static const struct intel_ddi_buf_trans adls_combo_phy_trans_edp_hbr3 = { .num_entries = ARRAY_SIZE(_adls_combo_phy_trans_edp_hbr3), }; -static const union intel_ddi_buf_trans_entry _adlp_combo_phy_trans_hdmi[] = { - /* NT mV Trans mVdb */ - { .icl = { 0x6, 0x60, 0x3F, 0x00, 0x00 } }, /* 400400 0.0 */ - { .icl = { 0x6, 0x68, 0x3F, 0x00, 0x00 } }, /* 500500 0.0 */ - { .icl = { 0xA, 0x73, 0x3F, 0x00, 0x00 } }, /* 650650 0.0 ALS */ - { .icl = { 0xA, 0x78, 0x3F, 0x00, 0x00 } }, /* 800800 0.0 */ - { .icl = { 0xB, 0x7F, 0x3F, 0x00, 0x00 } }, /* 1000 1000 0.0 Re-timer */ - { .icl = { 0xB, 0x7F, 0x3B, 0x00, 0x04 } }, /* FullRed -1.5 */ - { .icl = { 0xB, 0x7F, 0x39, 0x00, 0x06 } }, /* FullRed -1.8 */ - { .icl = { 0xB, 0x7F, 0x37, 0x00, 0x08 } }, /* FullRed -2.0 CRLS */ - { .icl = { 0xB, 0x7F, 0x35, 0x00, 0x0A } }, /* FullRed -2.5 */ - { .icl = { 0xB, 0x7F, 0x33, 0x00, 0x0C } }, /* FullRed -3.0 */ -}; - -static const struct intel_ddi_buf_trans adlp_combo_phy_trans_hdmi = { - .entries = _adlp_combo_phy_trans_hdmi, - .num_entries = ARRAY_SIZE(_adlp_combo_phy_trans_hdmi), - .hdmi_default_entry = ARRAY_SIZE(_adlp_combo_phy_trans_hdmi) - 1, -}; - static const union intel_ddi_buf_trans_entry _adlp_combo_phy_trans_dp_hbr[] = { /* NT mV Trans mV db */ { .icl = { 0xA, 0x35, 0x3F, 0x00, 0x00 } }, /* 350 350 0.0 */ @@ -1556,7 +1536,7 @@ adlp_get_combo_buf_trans(struct 
intel_encoder *encoder, int *n_entries) { if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI)) - return intel_get_buf_trans(&adlp_combo_phy_trans_hdmi, n_entries); + return intel_get_buf_trans(&icl_combo_phy_trans_hdmi, n_entries); else if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_EDP)) return adlp_get_combo_buf_trans_edp(encoder, crtc_state, n_entries); else -- 2.25.1
Re: [Intel-gfx] [PATCH v5 5/6] drm/i915/sseu: Disassociate internal subslice mask representation from uapi
On 23.05.2022 13:45, Matt Roper wrote: > As with EU masks, it's easier to store subslice/DSS masks internally in > a format that's more natural for the driver to work with, and then only > covert into the u8[] uapi form when the query ioctl is invoked. Since > the hardware design changed significantly with Xe_HP, we'll use a union > to choose between the old "hsw-style" subslice masks or the newer xehp > mask. HSW-style masks will be stored in an array of u8's, indexed by > slice (there's never more than 6 subslices per slice on older > platforms). For Xe_HP and beyond where slices no longer exist, we only > need a single bitmask. However we already know that this mask is > eventually going to grow too large for a simple u64 to hold, so we'll > represent it in a manner that can be operated on by the utilities in > linux/bitmap.h. > > v2: > - Fix typo: BIT(s) -> BIT(ss) in gen9_sseu_device_status() > > v3: > - Eliminate sseu->ss_stride and just calculate the stride while >specifically handling uapi. (Tvrtko) > - Use BITMAP_BITS() macro to refer to size of masks rather than >passing I915_MAX_SS_FUSE_BITS directly. (Tvrtko) > - Report compute/geometry DSS masks separately when dumping Xe_HP SSEU >info. (Tvrtko) > - Restore dropped range checks to intel_sseu_has_subslice(). (Tvrtko) > > v4: > - Make the bitmap size macro check the size of the .xehp field rather >than the containing union. (Tvrtko) > - Don't add GEM_BUG_ON() intel_sseu_has_subslice()'s check for whether >slice or subslice ID exceed sseu->max_[sub]slices; various loops >in the driver are expected to exceed these, so we should just >silently return 'false.' 
> > Cc: Tvrtko Ursulin > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 5 +- > drivers/gpu/drm/i915/gt/intel_engine_cs.c| 4 +- > drivers/gpu/drm/i915/gt/intel_gt.c | 12 +- > drivers/gpu/drm/i915/gt/intel_sseu.c | 261 +++ > drivers/gpu/drm/i915/gt/intel_sseu.h | 76 -- > drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 30 +-- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +- > drivers/gpu/drm/i915/i915_getparam.c | 3 +- > drivers/gpu/drm/i915/i915_query.c| 13 +- > 9 files changed, 241 insertions(+), 187 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index ab4c5ab28e4d..a3bb73f5d53b 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -1875,6 +1875,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt, > { > const struct sseu_dev_info *device = >->info.sseu; > struct drm_i915_private *i915 = gt->i915; > + unsigned int dev_subslice_mask = intel_sseu_get_hsw_subslices(device, > 0); > > /* No zeros in any field. */ > if (!user->slice_mask || !user->subslice_mask || > @@ -1901,7 +1902,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt, > if (user->slice_mask & ~device->slice_mask) > return -EINVAL; > > - if (user->subslice_mask & ~device->subslice_mask[0]) > + if (user->subslice_mask & ~dev_subslice_mask) > return -EINVAL; > > if (user->max_eus_per_subslice > device->max_eus_per_subslice) > @@ -1915,7 +1916,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt, > /* Part specific restrictions. 
*/ > if (GRAPHICS_VER(i915) == 11) { > unsigned int hw_s = hweight8(device->slice_mask); > - unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]); > + unsigned int hw_ss_per_s = hweight8(dev_subslice_mask); > unsigned int req_s = hweight8(context->slice_mask); > unsigned int req_ss = hweight8(context->subslice_mask); > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > index 1adbf34c3632..f0acf8518a51 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > @@ -674,8 +674,8 @@ static void engine_mask_apply_compute_fuses(struct > intel_gt *gt) > if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) > return; > > - ccs_mask = > intel_slicemask_from_dssmask(intel_sseu_get_compute_subslices(&info->sseu), > - ss_per_ccs); > + ccs_mask = > intel_slicemask_from_xehp_dssmask(info->sseu.compute_subslice_mask, > + ss_per_ccs); > /* >* If all DSS in a quadrant are fused off, the corresponding CCS >* engine is not available for use. > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c > b/drivers/gpu/drm/i915/gt/intel_gt.c > index 034182f85501..2921f510642f 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > @@ -133,13 +133,6 @@ static const struct in
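The idea in the commit message, keeping masks in a driver-friendly form and producing the u8[] layout only at the uapi boundary, can be sketched as follows. This is a simplified illustration: mask_to_u8_array() is a hypothetical helper, a uint64_t stands in for the internal mask (the commit message notes the real mask is expected to outgrow a u64, hence the bitmap helpers there), and the least-significant-byte-first order is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write the low nbits of an internal mask into a
 * byte array, least significant byte first. Returns the number of bytes
 * written, or 0 if out[] is too small or nbits exceeds 64.
 */
static size_t mask_to_u8_array(uint64_t mask, unsigned int nbits,
			       uint8_t *out, size_t out_len)
{
	size_t nbytes = (nbits + 7) / 8;
	size_t i;

	if (nbits > 64 || nbytes > out_len)
		return 0;

	for (i = 0; i < nbytes; i++)
		out[i] = (uint8_t)(mask >> (8 * i));

	return nbytes;
}
```

With this split, internal users can test and count bits on the native mask, and only the query path pays for the conversion.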
Re: [Intel-gfx] [PATCH v5 0/6] i915: SSEU handling updates
15_MAX_SS_FUSE_BITS > >around directly to bitmap operations. > > - Improved debugfs / dmesg reporting for Xe_HP dumps > > - Various assertion check improvements. > > > > v5: > - Rebase to latest drm-tip (resolve trivial conflicts) > - Move XEHP_BITMAP_BITS() to the header so that we can also replace a usage > of >I915_MAX_SS_FUSE_BITS in one of the inline functions. (Bala) > - Change the local variable in intel_slicemask_from_xehp_dssmask() from u16 > to >'unsigned long' to make it a bit more future-proof. > - Incorporate ack's received from Tvrtko and Lionel. > > Cc: Tvrtko Ursulin > Cc: Balasubramani Vivekanandan Patch looks good to me. I do not have any comments except for the request to please check the checkpatch warnings. Reviewed-by: Balasubramani Vivekanandan > > Matt Roper (6): > drm/i915/xehp: Use separate sseu init function > drm/i915/xehp: Drop GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK > drm/i915/sseu: Simplify gen11+ SSEU handling > drm/i915/sseu: Don't try to store EU mask internally in UAPI format > drm/i915/sseu: Disassociate internal subslice mask representation from > uapi > drm/i915/pvc: Add SSEU changes > > drivers/gpu/drm/i915/gem/i915_gem_context.c | 5 +- > drivers/gpu/drm/i915/gt/intel_engine_cs.c| 4 +- > drivers/gpu/drm/i915/gt/intel_gt.c | 12 +- > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 1 + > drivers/gpu/drm/i915/gt/intel_sseu.c | 450 --- > drivers/gpu/drm/i915/gt/intel_sseu.h | 92 ++-- > drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 30 +- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 +- > drivers/gpu/drm/i915/i915_drv.h | 2 + > drivers/gpu/drm/i915/i915_getparam.c | 11 +- > drivers/gpu/drm/i915/i915_pci.c | 3 +- > drivers/gpu/drm/i915/i915_query.c| 26 +- > drivers/gpu/drm/i915/intel_device_info.h | 1 + > 13 files changed, 397 insertions(+), 264 deletions(-) > > -- > 2.35.3 >
[Intel-gfx] [PATCH] drm/i915/display/adlp: More updates to voltage swing table
Voltage swing table updated for eDP HBR3

Bspec: 49291
Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c b/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c
index 5cae1d19bcbb..e6cf50922dca 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c
@@ -933,9 +933,9 @@ static const union intel_ddi_buf_trans_entry _adlp_combo_phy_trans_dp_hbr2_edp_h
 	{ .icl = { 0x6, 0x7F, 0x2B, 0x00, 0x14 } },	/* 350 900 8.2 */
 	{ .icl = { 0xA, 0x4C, 0x3F, 0x00, 0x00 } },	/* 500 500 0.0 */
 	{ .icl = { 0xC, 0x73, 0x34, 0x00, 0x0B } },	/* 500 700 2.9 */
-	{ .icl = { 0x6, 0x7F, 0x2F, 0x00, 0x10 } },	/* 500 900 5.1 */
-	{ .icl = { 0xC, 0x6C, 0x3C, 0x00, 0x03 } },	/* 650 700 0.6 */
-	{ .icl = { 0x6, 0x7F, 0x35, 0x00, 0x0A } },	/* 600 900 3.5 */
+	{ .icl = { 0x6, 0x7F, 0x30, 0x00, 0x0F } },	/* 500 900 5.1 */
+	{ .icl = { 0xC, 0x63, 0x3F, 0x00, 0x00 } },	/* 650 700 0.6 */
+	{ .icl = { 0x6, 0x7F, 0x38, 0x00, 0x07 } },	/* 600 900 3.5 */
 	{ .icl = { 0x6, 0x7F, 0x3F, 0x00, 0x00 } },	/* 900 900 0.0 */
 };
--
2.25.1
Re: [Intel-gfx] [PATCH] drm/i915/dg2: Correct DSS check for Wa_1308578152
On 07.06.2022 08:47, Matt Roper wrote:
> When converting our DSS masks to bitmaps, we fumbled the condition used
> to check whether any DSS are present in the first gslice. Since
> intel_sseu_find_first_xehp_dss() returns a 0-based number, we need a >=
> condition rather than >.
>
> Fixes: b87d39019651 ("drm/i915/sseu: Disassociate internal subslice mask
> representation from uapi")
> Reported-by: Balasubramani Vivekanandan
> Signed-off-by: Matt Roper
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 1b191b234160..67104ba8951e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -2079,7 +2079,7 @@ engine_fake_wa_init(struct intel_engine_cs *engine,
> struct i915_wa_list *wal)
>
>  static bool needs_wa_1308578152(struct intel_engine_cs *engine)
>  {
> -	return intel_sseu_find_first_xehp_dss(&engine->gt->info.sseu, 0, 0) >
> +	return intel_sseu_find_first_xehp_dss(&engine->gt->info.sseu, 0, 0) >=
>  		GEN_DSS_PER_GSLICE;
>  }

Acked-by: Balasubramani Vivekanandan

>
> --
> 2.35.3
>
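For readers following along, the off-by-one described in the commit message can be reproduced with a small user-space sketch. Everything here is an assumption made for illustration: first_dss() is a hypothetical stand-in for intel_sseu_find_first_xehp_dss(sseu, 0, 0) operating on a plain 32-bit mask, and DSS_PER_GSLICE is given the value 4 only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; the driver uses GEN_DSS_PER_GSLICE. */
#define DSS_PER_GSLICE 4

/*
 * Hypothetical stand-in for intel_sseu_find_first_xehp_dss(sseu, 0, 0):
 * returns the 0-based index of the lowest set bit, or 32 if no bit is set.
 */
static unsigned int first_dss(uint32_t dss_mask)
{
	unsigned int i;

	for (i = 0; i < 32; i++)
		if (dss_mask & (UINT32_C(1) << i))
			return i;
	return 32;
}

/* Old check: index DSS_PER_GSLICE lies in the second gslice but is missed. */
static bool first_gslice_empty_old(uint32_t dss_mask)
{
	return first_dss(dss_mask) > DSS_PER_GSLICE;
}

/* Fixed check: any 0-based index >= DSS_PER_GSLICE is outside gslice 0. */
static bool first_gslice_empty_fixed(uint32_t dss_mask)
{
	return first_dss(dss_mask) >= DSS_PER_GSLICE;
}
```

With a mask of 0x10 (only bit 4 set, i.e. the first DSS of the second gslice) the old check reports that gslice 0 still has a DSS, while the fixed check reports it as empty.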
Re: [Intel-gfx] [PATCH] drm/i915/xehp: Correct steering initialization
On 07.06.2022 10:57, Matt Roper wrote:
> Another mistake during the conversion to DSS bitmaps: after retrieving
> the DSS ID intel_sseu_find_first_xehp_dss() we forgot to modulo it down
> to obtain which ID within the current gslice it is.
>
> Fixes: b87d39019651 ("drm/i915/sseu: Disassociate internal subslice mask
> representation from uapi")
> Cc: Balasubramani Vivekanandan
> Signed-off-by: Matt Roper
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index b7421f109c13..a5c0508c5b63 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -1177,8 +1177,8 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list
> *wal)
>  	}
>
>  	slice = __ffs(slice_mask);
> -	subslice = intel_sseu_find_first_xehp_dss(sseu, GEN_DSS_PER_GSLICE,
> slice);
> -	WARN_ON(subslice > GEN_DSS_PER_GSLICE);
> +	subslice = intel_sseu_find_first_xehp_dss(sseu, GEN_DSS_PER_GSLICE,
> slice) %
> +		GEN_DSS_PER_GSLICE;

Acked-by: Balasubramani Vivekanandan

>
>  	__add_mcr_wa(gt, wal, slice, subslice);
>
> --
> 2.35.3
>
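The difference between the global DSS index and the ID within a gslice can be shown with a short user-space sketch. The helper first_dss_in_group() is hypothetical and only mimics the behaviour described in the commit message (search starts at the given group and the result is a global, 0-based bit index); DSS_PER_GSLICE = 4 is an assumed value for the example.

```c
#include <stdint.h>

/* Assumed value for illustration; the driver uses GEN_DSS_PER_GSLICE. */
#define DSS_PER_GSLICE 4

/*
 * Hypothetical stand-in for intel_sseu_find_first_xehp_dss(): returns the
 * global 0-based index of the first set bit at or after group * groupsize,
 * or 32 if there is none.
 */
static unsigned int first_dss_in_group(uint32_t dss_mask,
				       unsigned int groupsize,
				       unsigned int group)
{
	unsigned int i;

	for (i = group * groupsize; i < 32; i++)
		if (dss_mask & (UINT32_C(1) << i))
			return i;
	return 32;
}

/* Steering wants the ID relative to the gslice, hence the modulo. */
static unsigned int steering_subslice(uint32_t dss_mask, unsigned int slice)
{
	return first_dss_in_group(dss_mask, DSS_PER_GSLICE, slice) %
		DSS_PER_GSLICE;
}
```

For a mask of 0x40 and slice 1, the lookup returns the global index 6, whereas the value needed for steering is 2 (the third DSS inside gslice 1).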
Re: [Intel-gfx] [PATCH 17/23] drm/i915/mtl: Update MBUS_DBOX credits
On 27.07.2022 18:34, Radhakrishna Sripada wrote: > Display version 14 platforms has different credits values compared to ADL-P. > Update the credits based on pipe usage. > > Bspec: 49213 > > Cc: Jose Roberto de Souza > Cc: Matt Roper > Original Author: Caz Yokoyama > Signed-off-by: José Roberto de Souza > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/i915_reg.h | 4 +++ > drivers/gpu/drm/i915/intel_pm.c | 47 ++--- > 2 files changed, 47 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index d37607109398..2f9cbdd068e8 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -1125,8 +1125,12 @@ > #define MBUS_DBOX_REGULATE_B2B_TRANSACTIONS_EN REG_BIT(16) /* tgl+ */ > #define MBUS_DBOX_BW_CREDIT_MASK REG_GENMASK(15, 14) > #define MBUS_DBOX_BW_CREDIT(x) > REG_FIELD_PREP(MBUS_DBOX_BW_CREDIT_MASK, x) > +#define MBUS_DBOX_BW_4CREDITS_MTL0x2 > +#define MBUS_DBOX_BW_8CREDITS_MTL0x3 > #define MBUS_DBOX_B_CREDIT_MASK REG_GENMASK(12, 8) > #define MBUS_DBOX_B_CREDIT(x) > REG_FIELD_PREP(MBUS_DBOX_B_CREDIT_MASK, x) > +#define MBUS_DBOX_I_CREDIT_MASK REG_GENMASK(7, 5) > +#define MBUS_DBOX_I_CREDIT(x) > REG_FIELD_PREP(MBUS_DBOX_I_CREDIT_MASK, x) > #define MBUS_DBOX_A_CREDIT_MASK REG_GENMASK(3, 0) > #define MBUS_DBOX_A_CREDIT(x) > REG_FIELD_PREP(MBUS_DBOX_A_CREDIT_MASK, x) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index f71b3b8b590c..58a3c72418a7 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -8443,6 +8443,27 @@ void intel_dbuf_post_plane_update(struct > intel_atomic_state *state) > new_dbuf_state->enabled_slices); > } > > +static bool xelpdp_is_one_pipe_per_dbuf_bank(enum pipe pipe, u8 active_pipes) > +{ > + switch (pipe) { > + case PIPE_A: > + case PIPE_D: > + if (is_power_of_2(active_pipes & (BIT(PIPE_A) | BIT(PIPE_D > + return true; > + break; > + case PIPE_B: > + case PIPE_C: 
> + if (is_power_of_2(active_pipes & (BIT(PIPE_B) | BIT(PIPE_C > + return true; > + break; > + default: /* to suppress compiler warning */ > + MISSING_CASE(pipe); > + break; > + } > + > + return false; > +} > + > void intel_mbus_dbox_update(struct intel_atomic_state *state) > { > struct drm_i915_private *i915 = to_i915(state->base.dev); > @@ -8462,20 +8483,28 @@ void intel_mbus_dbox_update(struct intel_atomic_state > *state) >new_dbuf_state->active_pipes == old_dbuf_state->active_pipes)) > return; > > + if (DISPLAY_VER(i915) >= 14) > + val |= MBUS_DBOX_I_CREDIT(2); > + > if (DISPLAY_VER(i915) >= 12) { > val |= MBUS_DBOX_B2B_TRANSACTIONS_MAX(16); > val |= MBUS_DBOX_B2B_TRANSACTIONS_DELAY(1); > val |= MBUS_DBOX_REGULATE_B2B_TRANSACTIONS_EN; > } > > - /* Wa_22010947358:adl-p */ > - if (IS_ALDERLAKE_P(i915)) > + if (DISPLAY_VER(i915) >= 14) > + val |= new_dbuf_state->joined_mbus ? MBUS_DBOX_A_CREDIT(12) : > + MBUS_DBOX_A_CREDIT(8); > + else if (IS_ALDERLAKE_P(i915)) > + /* Wa_22010947358:adl-p */ > val |= new_dbuf_state->joined_mbus ? MBUS_DBOX_A_CREDIT(6) : >MBUS_DBOX_A_CREDIT(4); > else > val |= MBUS_DBOX_A_CREDIT(2); > > - if (IS_ALDERLAKE_P(i915)) { > + if (DISPLAY_VER(i915) >= 14) { > + val |= MBUS_DBOX_B_CREDIT(0xA); > + } else if (IS_ALDERLAKE_P(i915)) { > val |= MBUS_DBOX_BW_CREDIT(2); > val |= MBUS_DBOX_B_CREDIT(8); > } else if (DISPLAY_VER(i915) >= 12) { > @@ -8487,10 +8516,20 @@ void intel_mbus_dbox_update(struct intel_atomic_state > *state) > } > > for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) { > + u32 pipe_val = val; > + > if (!new_crtc_state->hw.active || > !intel_crtc_needs_modeset(new_crtc_state)) > continue; > > - intel_de_write(i915, PIPE_MBUS_DBOX_CTL(crtc->pipe), val); > + if (DISPLAY_VER(i915) >= 14) { Only MTL and its subplatforms require the BW Credits to be set in MBUS_DBOX_CTL register. No future platforms with DISPLAY_VER(i915) higher than or equal to 14 has BW Credits field in the MBUS_DBOX_CTL register. 
So please change the if condition to IS_METEORLAKE(i915) Regards, Bala > + if (xelpdp_is_one_pipe_per_dbuf_bank(
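To make the review request concrete, here is a minimal user-space sketch of the two ways of gating the BW credit programming. The struct fake_dev type and both helpers are made up for illustration and do not exist in i915; they only model DISPLAY_VER(i915) >= 14 against an IS_METEORLAKE(i915)-style platform check.

```c
#include <stdbool.h>

/* Hypothetical device description, not an i915 structure. */
struct fake_dev {
	unsigned int display_ver;
	bool is_meteorlake;
};

/* Version-based gate: also matches any later display IP. */
static bool bw_credits_by_version(const struct fake_dev *dev)
{
	return dev->display_ver >= 14;
}

/* Platform-based gate: matches only MTL and its subplatforms. */
static bool bw_credits_by_platform(const struct fake_dev *dev)
{
	return dev->is_meteorlake;
}
```

For an imagined later platform with display version 15, the version-based gate would still program the BW credits field, which according to the review comment does not exist there; the platform-based gate would not.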
Re: [Intel-gfx] [PATCH v2] drm/i915/dg2: Add additional HDMI pixel clock frequencies
On 01.08.2022 16:48, Taylor, Clinton A wrote: > Using the BSPEC algorithm add addition HDMI pixel clocks to the existing > table. > > v2: remove 297000 unused entry > > Cc: Matt Roper > Cc: Radhakrishna Sripada > Signed-off-by: Taylor, Clinton A Reviewed-by: Balasubramani Vivekanandan > --- > drivers/gpu/drm/i915/display/intel_snps_phy.c | 1115 + > 1 file changed, 1115 insertions(+) > > diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c > b/drivers/gpu/drm/i915/display/intel_snps_phy.c > index 0bdbedc67d7d..f75808e0c95e 100644 > --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c > +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c > @@ -518,6 +518,1085 @@ static const struct intel_mpllb_state dg2_hdmi_148_5 = > { > }; > > /* values in the below table are calculted using the algo */ > +static const struct intel_mpllb_state dg2_hdmi_25200 = { > + .clock = 25200, > + .ref_control = > + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), > + .mpllb_cp = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 7) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 14) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), > + .mpllb_div = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 5) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 0), > + .mpllb_div2 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 128) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), > + .mpllb_fracn1 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 65535), > + .mpllb_fracn2 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 41943) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2621), > + .mpllb_sscen = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), > +}; > + > 
+static const struct intel_mpllb_state dg2_hdmi_27027 = { > + .clock = 27027, > + .ref_control = > + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), > + .mpllb_cp = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 6) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 14) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), > + .mpllb_div = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 5) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 0), > + .mpllb_div2 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 140) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), > + .mpllb_fracn1 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 65535), > + .mpllb_fracn2 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 31876) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 46555), > + .mpllb_sscen = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), > +}; > + > +static const struct intel_mpllb_state dg2_hdmi_28320 = { > + .clock = 28320, > + .ref_control = > + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), > + .mpllb_cp = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 6) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 14) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), > + .mpllb_div = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 5) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 0), > + .mpllb_div2 = > + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | > + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 148) | > + REG_FIELD_PREP(SN
Re: [Intel-gfx] [v1.1 01/23] drm/i915: Read graphics/media/display arch version from hw
On 27.07.2022 20:46, Radhakrishna Sripada wrote: > From: Matt Roper > > Going forward, the hardware teams no longer consider new platforms to > have a "generation" in the way we've defined it for past platforms. > Instead, each IP block (graphics, media, display) will have their own > architecture major.minor versions and stepping ID's which should be read > directly from a register in the MMIO space. New hardware programming > styles, features, and workarounds should be conditional solely on the > architecture version, and should no longer be derived from the PCI > device ID, revision ID, or platform-specific feature flags. > > v1.1: Fix build error > > Bspec: 63361, 64111 > > Signed-off-by: Matt Roper > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 2 + > drivers/gpu/drm/i915/i915_driver.c| 80 ++- > drivers/gpu/drm/i915/i915_drv.h | 16 ++-- > drivers/gpu/drm/i915/i915_pci.c | 1 + > drivers/gpu/drm/i915/i915_reg.h | 6 ++ > drivers/gpu/drm/i915/intel_device_info.c | 32 > drivers/gpu/drm/i915/intel_device_info.h | 14 > .../gpu/drm/i915/selftests/mock_gem_device.c | 1 + > 8 files changed, 128 insertions(+), 24 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > index 60d6eb5f245b..fab8e4ff74d5 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > @@ -39,6 +39,8 @@ > #define FORCEWAKE_ACK_RENDER_GEN9_MMIO(0xd84) > #define FORCEWAKE_ACK_MEDIA_GEN9 _MMIO(0xd88) > > +#define GMD_ID_GRAPHICS _MMIO(0xd8c) > + > #define MCFG_MCR_SELECTOR_MMIO(0xfd0) > #define SF_MCR_SELECTOR _MMIO(0xfd8) > #define GEN8_MCR_SELECTOR_MMIO(0xfdc) > diff --git a/drivers/gpu/drm/i915/i915_driver.c > b/drivers/gpu/drm/i915/i915_driver.c > index deb8a8b76965..33566f6e9546 100644 > --- a/drivers/gpu/drm/i915/i915_driver.c > +++ b/drivers/gpu/drm/i915/i915_driver.c > @@ -70,6 +70,7 @@ > #include "gem/i915_gem_pm.h" > #include "gt/intel_gt.h" > #include 
"gt/intel_gt_pm.h" > +#include "gt/intel_gt_regs.h" > #include "gt/intel_rc6.h" > > #include "pxp/intel_pxp_pm.h" > @@ -306,15 +307,83 @@ static void sanitize_gpu(struct drm_i915_private *i915) > __intel_gt_reset(to_gt(i915), ALL_ENGINES); > } > > +#define IP_VER_READ(offset, ri_prefix) \ > + addr = pci_iomap_range(pdev, 0, offset, sizeof(u32)); \ > + if (drm_WARN_ON(&i915->drm, !addr)) { \ > + /* Fall back to whatever was in the device info */ \ > + RUNTIME_INFO(i915)->ri_prefix.ver = > INTEL_INFO(i915)->ri_prefix.ver; \ > + RUNTIME_INFO(i915)->ri_prefix.rel = > INTEL_INFO(i915)->ri_prefix.rel; \ > + goto ri_prefix##done; \ > + } \ > + \ > + ver = ioread32(addr); \ > + pci_iounmap(pdev, addr); \ > + \ > + RUNTIME_INFO(i915)->ri_prefix.ver = REG_FIELD_GET(GMD_ID_ARCH_MASK, > ver); \ > + RUNTIME_INFO(i915)->ri_prefix.rel = REG_FIELD_GET(GMD_ID_RELEASE_MASK, > ver); \ > + RUNTIME_INFO(i915)->ri_prefix.step = REG_FIELD_GET(GMD_ID_STEP, ver); \ > + \ > + /* Sanity check against expected versions from device info */ \ > + if (RUNTIME_INFO(i915)->ri_prefix.ver != > INTEL_INFO(i915)->ri_prefix.ver || \ > + RUNTIME_INFO(i915)->ri_prefix.rel > > INTEL_INFO(i915)->ri_prefix.rel) \ > + drm_dbg(&i915->drm, \ > + "Hardware reports " #ri_prefix " IP version %u.%u but > minimum expected is %u.%u\n", \ > + RUNTIME_INFO(i915)->ri_prefix.ver, \ > + RUNTIME_INFO(i915)->ri_prefix.rel, \ > + INTEL_INFO(i915)->ri_prefix.ver, \ > + INTEL_INFO(i915)->ri_prefix.rel); \ > +ri_prefix##done: > + > +/** > + * intel_ipver_early_init - setup IP version values > + * @dev_priv: device private > + * > + * Setup the graphics version for the current device. This must be done > before > + * any code that performs checks on GRAPHICS_VER or DISPLAY_VER, so this > + * function should be called very early in the driver initialization > sequence. > + * > + * Regular MMIO access is not yet setup at the point this function is called > so > + * we peek at the appropriate MMIO offset directly. 
The GMD_ID register is > + * part of an 'always on' power well by design, so we don't need to worry > about > + * forcewake while reading it. > + */ > +static void intel_ipver_early_init(struct drm_i915_private *i915) > +{ > + struct pci_dev *pdev = to_pci_dev(i915->drm.dev); > + void __iomem *addr; > + u32 ver; > + > + if (!HAS_GMD_ID(i915)) { > + drm_WARN_ON(&i915->drm, INTEL_INFO(i915)->graphics.ver > 12); > + > + RUNTIME_INFO(i915)->graphics.ver = > INTEL_INFO
Re: [Intel-gfx] [PATCH v7 3/9] drm/i915/gt: Optimize the migration and clear loop
On 01.04.2022 18:07, Ramalingam C wrote: > Move the static calculations out of the loops for copy and clear. > > Signed-off-by: Ramalingam C > Reviewed-by: Thomas Hellstrom > --- > drivers/gpu/drm/i915/gt/intel_migrate.c | 40 - > 1 file changed, 19 insertions(+), 21 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c > b/drivers/gpu/drm/i915/gt/intel_migrate.c > index e81f20266f62..580b4cf1efa2 100644 > --- a/drivers/gpu/drm/i915/gt/intel_migrate.c > +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c > @@ -526,6 +526,7 @@ intel_context_migrate_copy(struct intel_context *ce, > struct i915_request **out) > { > struct sgt_dma it_src = sg_sgt(src), it_dst = sg_sgt(dst); > + u32 src_offset, dst_offset; > struct i915_request *rq; > int err; > > @@ -535,8 +536,18 @@ intel_context_migrate_copy(struct intel_context *ce, > > GEM_BUG_ON(ce->ring->size < SZ_64K); > > + src_offset = 0; > + dst_offset = CHUNK_SZ; > + if (HAS_64K_PAGES(ce->engine->i915)) { > + src_offset = 0; > + dst_offset = 0; > + if (src_is_lmem) > + src_offset = CHUNK_SZ; > + if (dst_is_lmem) > + dst_offset = 2 * CHUNK_SZ; > + } > + > do { > - u32 src_offset, dst_offset; > int len; > > rq = i915_request_create(ce); > @@ -564,17 +575,6 @@ intel_context_migrate_copy(struct intel_context *ce, > if (err) > goto out_rq; > > - src_offset = 0; > - dst_offset = CHUNK_SZ; > - if (HAS_64K_PAGES(ce->engine->i915)) { > - src_offset = 0; > - dst_offset = 0; > - if (src_is_lmem) > - src_offset = CHUNK_SZ; > - if (dst_is_lmem) > - dst_offset = 2 * CHUNK_SZ; > - } > - > len = emit_pte(rq, &it_src, src_cache_level, src_is_lmem, > src_offset, CHUNK_SZ); > if (len <= 0) { > @@ -584,12 +584,10 @@ intel_context_migrate_copy(struct intel_context *ce, > > err = emit_pte(rq, &it_dst, dst_cache_level, dst_is_lmem, > dst_offset, len); > - if (err < 0) > - goto out_rq; > - if (err < len) { > + if (err < len) > err = -EINVAL; > + if (err < 0) > goto out_rq; > - } did you take a look at my comment at 
https://patchwork.freedesktop.org/patch/479847/?series=101106&rev=6? Above change looks like a regression, can you check again? Regards, Bala > > err = rq->engine->emit_flush(rq, EMIT_INVALIDATE); > if (err) > @@ -690,6 +688,7 @@ intel_context_migrate_clear(struct intel_context *ce, > { > struct sgt_dma it = sg_sgt(sg); > struct i915_request *rq; > + u32 offset; > int err; > > GEM_BUG_ON(ce->vm != ce->engine->gt->migrate.context->vm); > @@ -697,8 +696,11 @@ intel_context_migrate_clear(struct intel_context *ce, > > GEM_BUG_ON(ce->ring->size < SZ_64K); > > + offset = 0; > + if (HAS_64K_PAGES(ce->engine->i915) && is_lmem) > + offset = CHUNK_SZ; > + > do { > - u32 offset; > int len; > > rq = i915_request_create(ce); > @@ -726,10 +728,6 @@ intel_context_migrate_clear(struct intel_context *ce, > if (err) > goto out_rq; > > - offset = 0; > - if (HAS_64K_PAGES(ce->engine->i915) && is_lmem) > - offset = CHUNK_SZ; > - > len = emit_pte(rq, &it, cache_level, is_lmem, offset, CHUNK_SZ); > if (len <= 0) { > err = len; > -- > 2.20.1 >
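The linked comment is not reproduced in this thread, so the exact concern is not known here. One observable difference between the two orderings of the checks, shown as a user-space sketch with hypothetical helper names, is that the reordered version replaces a negative error code from emit_pte() with -EINVAL. A return value of 0 stands for "carry on with the request".

```c
#include <errno.h>

/* Ordering before the patch: a negative error is passed through unchanged. */
static int check_before_patch(int err, int len)
{
	if (err < 0)
		return err;
	if (err < len)
		return -EINVAL;
	return 0;
}

/*
 * Ordering after the patch: a negative err is also smaller than len, so it
 * is overwritten with -EINVAL before the err < 0 test is reached.
 */
static int check_after_patch(int err, int len)
{
	if (err < len)
		err = -EINVAL;
	if (err < 0)
		return err;
	return 0;
}
```

For err = -ENOMEM and len = 8, the first helper returns -ENOMEM and the second returns -EINVAL; for short or complete writes both behave the same.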
[Intel-gfx] [PATCH] drm/i915/uc: use io memcpy functions for device memory copy
When copying RSA use io memcpy functions if the destination address
contains a GPU local memory address. Considering even the source
address can be on local memory, a bounce buffer is used to copy from io
to io.
The intention of this patch is to make i915 portable outside x86 mainly
on ARM64.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index bb864655c495..06d30670e15c 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -589,7 +589,7 @@ static int uc_fw_rsa_data_create(struct intel_uc_fw *uc_fw)
 	struct intel_gt *gt = __uc_fw_to_gt(uc_fw);
 	struct i915_vma *vma;
 	size_t copied;
-	void *vaddr;
+	void *vaddr, *bounce;
 	int err;

 	err = i915_inject_probe_error(gt->i915, -ENXIO);
@@ -621,7 +621,26 @@ static int uc_fw_rsa_data_create(struct intel_uc_fw *uc_fw)
 		goto unpin_out;
 	}

-	copied = intel_uc_fw_copy_rsa(uc_fw, vaddr, vma->size);
+	if (i915_gem_object_is_lmem(vma->obj)) {
+		/* When vma is allocated from the GPU local memory, it means
+		 * the destination address contains an io memory and we need to
+		 * use memcpy function for io memory for copying, to ensure
+		 * i915 portability outside x86. It is most likely the RSA will
+		 * also be on local memory and so the source of copy will also
+		 * be an io address. Since we cannot directly copy from io to
+		 * io, we use a bounce buffer to copy.
+		 */
+		copied = 0;
+		bounce = kmalloc(vma->size, GFP_KERNEL);
+		if (likely(bounce)) {
+			copied = intel_uc_fw_copy_rsa(uc_fw, bounce, vma->size);
+			memcpy_toio((void __iomem *)vaddr, bounce, copied);
+			kfree(bounce);
+		}
+	} else {
+		copied = intel_uc_fw_copy_rsa(uc_fw, vaddr, vma->size);
+	}
+
 	i915_gem_object_unpin_map(vma->obj);

 	if (copied < uc_fw->rsa_size) {
--
2.25.1
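The bounce-buffer idea from the commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: copy_via_bounce() is a hypothetical helper, malloc()/free() stand in for kmalloc()/kfree(), and the two plain memcpy() calls stand in for the io-aware copy helpers that would be needed on real device memory.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy size bytes from one "io" region to another through a temporary
 * system-memory buffer. Returns the number of bytes copied, or 0 if the
 * bounce buffer could not be allocated.
 */
static size_t copy_via_bounce(void *dst_io, const void *src_io, size_t size)
{
	void *bounce = malloc(size);

	if (!bounce)
		return 0;

	memcpy(bounce, src_io, size);	/* stands in for an io-to-system copy */
	memcpy(dst_io, bounce, size);	/* stands in for memcpy_toio() */
	free(bounce);

	return size;
}
```

As in the patch, a failed allocation shows up to the caller as zero bytes copied, so the caller's existing "copied < expected size" check still catches it.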
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert ct buffer to iosys_map
On 04.04.2022 15:01, Mullati Siva wrote: > From: Siva Mullati > > Convert CT commands and descriptors to use iosys_map rather > than plain pointer and save it in the intel_guc_ct_buffer struct. > This will help with ct_write and ct_read for cmd send and receive > after the initialization by abstracting the IO vs system memory. > > Signed-off-by: Siva Mullati > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 200 +- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 9 +- > 2 files changed, 127 insertions(+), 82 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > index f01325cd1b62..64568dc90b05 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > @@ -44,6 +44,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CT_PROBE_ERROR(_ct, _fmt, ...) \ > i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__) > > +#define ct_desc_read(desc_map_, field_) \ > + iosys_map_rd_field(desc_map_, 0, struct guc_ct_buffer_desc, field_) > +#define ct_desc_write(desc_map_, field_, val_) \ > + iosys_map_wr_field(desc_map_, 0, struct guc_ct_buffer_desc, field_, > val_) > + Did you try to make the change Lucas mentioned in his comment on rev0, to pass `struct guc_ct_buffer_desc *` as first argument to the above macros? Was it not feasible? 
> /** > * DOC: CTB Blob > * > @@ -76,6 +81,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE) > #define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4) > > +#define CTB_SEND_DESC_OFFSET 0u > +#define CTB_RECV_DESC_OFFSET (CTB_DESC_SIZE) > +#define CTB_SEND_CMDS_OFFSET (2 * CTB_DESC_SIZE) > +#define CTB_RECV_CMDS_OFFSET (2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE) > + > struct ct_request { > struct list_head link; > u32 fence; > @@ -113,9 +123,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct) > init_waitqueue_head(&ct->wq); > } > > -static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) > +static void guc_ct_buffer_desc_init(struct iosys_map *desc) > { > - memset(desc, 0, sizeof(*desc)); > + iosys_map_memset(desc, 0, 0, sizeof(struct guc_ct_buffer_desc)); > } > > static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) > @@ -128,17 +138,18 @@ static void guc_ct_buffer_reset(struct > intel_guc_ct_buffer *ctb) > space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space; > atomic_set(&ctb->space, space); > > - guc_ct_buffer_desc_init(ctb->desc); > + guc_ct_buffer_desc_init(&ctb->desc_map); > } > > static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb, > -struct guc_ct_buffer_desc *desc, > -u32 *cmds, u32 size_in_bytes, u32 resv_space) > +struct iosys_map *desc, > +struct iosys_map *cmds, > +u32 size_in_bytes, u32 resv_space) > { > GEM_BUG_ON(size_in_bytes % 4); > > - ctb->desc = desc; > - ctb->cmds = cmds; > + ctb->desc_map = *desc; > + ctb->cmds_map = *cmds; > ctb->size = size_in_bytes / 4; > ctb->resv_space = resv_space / 4; > > @@ -218,12 +229,13 @@ static int ct_register_buffer(struct intel_guc_ct *ct, > bool send, > int intel_guc_ct_init(struct intel_guc_ct *ct) > { > struct intel_guc *guc = ct_to_guc(ct); > - struct guc_ct_buffer_desc *desc; > + struct iosys_map blob_map; > + struct iosys_map desc_map; > + struct iosys_map cmds_map; > 
u32 blob_size; > u32 cmds_size; > u32 resv_space; > void *blob; > - u32 *cmds; > int err; > > err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO); > @@ -242,27 +254,35 @@ int intel_guc_ct_init(struct intel_guc_ct *ct) > > CT_DEBUG(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), > blob_size); > > - /* store pointers to desc and cmds for send ctb */ > - desc = blob; > - cmds = blob + 2 * CTB_DESC_SIZE; > + if (i915_gem_object_is_lmem(ct->vma->obj)) > + iosys_map_set_vaddr_iomem(&blob_map, > + (void __iomem *)blob); > + else > + iosys_map_set_vaddr(&blob_map, blob); > + > + /* store sysmap to desc_map and cmds_map for send ctb */ > + desc_map = IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_DESC_OFFSET); > + cmds_map = IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_CMDS_OFFSET); > cmds_size = CTB_H2G_BUFFER_SIZE; > resv_space = 0; > - CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u/%u\n", "send", > - ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size, > - resv_space); > + CT_DEBUG(ct, "%s desc %#x cmds %#x size %u/%u\n", "send", > + CTB_SEND_DESC_OFF
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert ct buffer to iosys_map
On 04.04.2022 15:01, Mullati Siva wrote: > From: Siva Mullati > > Convert CT commands and descriptors to use iosys_map rather > than plain pointer and save it in the intel_guc_ct_buffer struct. > This will help with ct_write and ct_read for cmd send and receive > after the initialization by abstracting the IO vs system memory. > > Signed-off-by: Siva Mullati > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 200 +- > drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 9 +- > 2 files changed, 127 insertions(+), 82 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > index f01325cd1b62..64568dc90b05 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > @@ -44,6 +44,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CT_PROBE_ERROR(_ct, _fmt, ...) \ > i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__) > > +#define ct_desc_read(desc_map_, field_) \ > + iosys_map_rd_field(desc_map_, 0, struct guc_ct_buffer_desc, field_) > +#define ct_desc_write(desc_map_, field_, val_) \ > + iosys_map_wr_field(desc_map_, 0, struct guc_ct_buffer_desc, field_, > val_) > + > /** > * DOC: CTB Blob > * > @@ -76,6 +81,11 @@ static inline struct drm_device *ct_to_drm(struct > intel_guc_ct *ct) > #define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE) > #define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4) > > +#define CTB_SEND_DESC_OFFSET 0u > +#define CTB_RECV_DESC_OFFSET (CTB_DESC_SIZE) > +#define CTB_SEND_CMDS_OFFSET (2 * CTB_DESC_SIZE) > +#define CTB_RECV_CMDS_OFFSET (2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE) > + > struct ct_request { > struct list_head link; > u32 fence; > @@ -113,9 +123,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct) > init_waitqueue_head(&ct->wq); > } > > -static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) > +static void guc_ct_buffer_desc_init(struct iosys_map *desc) > { > - 
memset(desc, 0, sizeof(*desc)); > + iosys_map_memset(desc, 0, 0, sizeof(struct guc_ct_buffer_desc)); > } > > static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) > @@ -128,17 +138,18 @@ static void guc_ct_buffer_reset(struct > intel_guc_ct_buffer *ctb) > space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space; > atomic_set(&ctb->space, space); > > - guc_ct_buffer_desc_init(ctb->desc); > + guc_ct_buffer_desc_init(&ctb->desc_map); > } > > static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb, > -struct guc_ct_buffer_desc *desc, > -u32 *cmds, u32 size_in_bytes, u32 resv_space) > +struct iosys_map *desc, > +struct iosys_map *cmds, > +u32 size_in_bytes, u32 resv_space) > { > GEM_BUG_ON(size_in_bytes % 4); > > - ctb->desc = desc; > - ctb->cmds = cmds; > + ctb->desc_map = *desc; > + ctb->cmds_map = *cmds; > ctb->size = size_in_bytes / 4; > ctb->resv_space = resv_space / 4; > > @@ -218,12 +229,13 @@ static int ct_register_buffer(struct intel_guc_ct *ct, > bool send, > int intel_guc_ct_init(struct intel_guc_ct *ct) > { > struct intel_guc *guc = ct_to_guc(ct); > - struct guc_ct_buffer_desc *desc; > + struct iosys_map blob_map; > + struct iosys_map desc_map; > + struct iosys_map cmds_map; > u32 blob_size; > u32 cmds_size; > u32 resv_space; > void *blob; > - u32 *cmds; > int err; > > err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO); > @@ -242,27 +254,35 @@ int intel_guc_ct_init(struct intel_guc_ct *ct) > > CT_DEBUG(ct, "base=%#x size=%u\n", intel_guc_ggtt_offset(guc, ct->vma), > blob_size); > > - /* store pointers to desc and cmds for send ctb */ > - desc = blob; > - cmds = blob + 2 * CTB_DESC_SIZE; > + if (i915_gem_object_is_lmem(ct->vma->obj)) > + iosys_map_set_vaddr_iomem(&blob_map, > + (void __iomem *)blob); > + else > + iosys_map_set_vaddr(&blob_map, blob); > + > + /* store sysmap to desc_map and cmds_map for send ctb */ > + desc_map = IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_DESC_OFFSET); > + cmds_map = 
IOSYS_MAP_INIT_OFFSET(&blob_map, CTB_SEND_CMDS_OFFSET); > cmds_size = CTB_H2G_BUFFER_SIZE; > resv_space = 0; > - CT_DEBUG(ct, "%s desc %#tx cmds %#tx size %u/%u\n", "send", > - ptrdiff(desc, blob), ptrdiff(cmds, blob), cmds_size, > - resv_space); > + CT_DEBUG(ct, "%s desc %#x cmds %#x size %u/%u\n", "send", > + CTB_SEND_DESC_OFFSET, (u32)CTB_SEND_CMDS_OFFSET, > + cmds_size, resv_space); > > - guc_ct_buffer_init(&ct->ctbs.send, desc, cmds, cmds_size, resv_space); > + guc_ct
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert slpc to iosys_map
On 16.03.2022 18:26, Mullati Siva wrote: > From: Siva Mullati > > Convert slpc shared data to use iosys_map rather than > plain pointer and save it in the intel_guc_slpc struct. > This will help with in read and update slpc shared data > after the slpc init by abstracting the IO vs system memory. > > Signed-off-by: Siva Mullati > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 79 +++ > .../gpu/drm/i915/gt/uc/intel_guc_slpc_types.h | 5 +- > 2 files changed, 47 insertions(+), 37 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > index 9f032c65a488..3a9ec6b03ceb 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c > @@ -14,6 +14,13 @@ > #include "gt/intel_gt_regs.h" > #include "gt/intel_rps.h" > > +#define slpc_blob_read(slpc_, field_) \ > +iosys_map_rd_field(&(slpc_)->slpc_map, 0, \ > +struct slpc_shared_data, field_) > +#define slpc_blob_write(slpc_, field_, val_) \ > + iosys_map_wr_field(&(slpc_)->slpc_map, 0, \ > + struct slpc_shared_data, field_, val_) > + > static inline struct intel_guc *slpc_to_guc(struct intel_guc_slpc *slpc) > { > return container_of(slpc, struct intel_guc, slpc); > @@ -52,50 +59,50 @@ void intel_guc_slpc_init_early(struct intel_guc_slpc > *slpc) > slpc->selected = __guc_slpc_selected(guc); > } > > -static void slpc_mem_set_param(struct slpc_shared_data *data, > +static void slpc_mem_set_param(struct intel_guc_slpc *slpc, > u32 id, u32 value) > { > + u32 bits = slpc_blob_read(slpc, override_params.bits[id >> 5]); > + > GEM_BUG_ON(id >= SLPC_MAX_OVERRIDE_PARAMETERS); > /* >* When the flag bit is set, corresponding value will be read >* and applied by SLPC. 
>*/ > - data->override_params.bits[id >> 5] |= (1 << (id % 32)); > - data->override_params.values[id] = value; > + bits |= (1 << (id % 32)); > + slpc_blob_write(slpc, override_params.bits[id >> 5], bits); > + slpc_blob_write(slpc, override_params.values[id], value); > } > > -static void slpc_mem_set_enabled(struct slpc_shared_data *data, > +static void slpc_mem_set_enabled(struct intel_guc_slpc *slpc, >u8 enable_id, u8 disable_id) > { > /* >* Enabling a param involves setting the enable_id >* to 1 and disable_id to 0. >*/ > - slpc_mem_set_param(data, enable_id, 1); > - slpc_mem_set_param(data, disable_id, 0); > + slpc_mem_set_param(slpc, enable_id, 1); > + slpc_mem_set_param(slpc, disable_id, 0); > } > > -static void slpc_mem_set_disabled(struct slpc_shared_data *data, > +static void slpc_mem_set_disabled(struct intel_guc_slpc *slpc, > u8 enable_id, u8 disable_id) > { > /* >* Disabling a param involves setting the enable_id >* to 0 and disable_id to 1. >*/ > - slpc_mem_set_param(data, disable_id, 1); > - slpc_mem_set_param(data, enable_id, 0); > + slpc_mem_set_param(slpc, disable_id, 1); > + slpc_mem_set_param(slpc, enable_id, 0); > } > > static u32 slpc_get_state(struct intel_guc_slpc *slpc) > { > - struct slpc_shared_data *data; > - > GEM_BUG_ON(!slpc->vma); > > - drm_clflush_virt_range(slpc->vaddr, sizeof(u32)); > - data = slpc->vaddr; > + drm_clflush_virt_range(slpc->slpc_map.vaddr, sizeof(u32)); clflush will not be required if the slpc_map contains io memory address. 
So the drm_clflush_virt_range can be added under a check for system memory > > - return data->header.global_state; > + return slpc_blob_read(slpc, header.global_state); > } > > static int guc_action_slpc_set_param(struct intel_guc *guc, u8 id, u32 value) > @@ -156,7 +163,7 @@ static int slpc_query_task_state(struct intel_guc_slpc > *slpc) > drm_err(&i915->drm, "Failed to query task state (%pe)\n", > ERR_PTR(ret)); > > - drm_clflush_virt_range(slpc->vaddr, SLPC_PAGE_SIZE_BYTES); > + drm_clflush_virt_range(slpc->slpc_map.vaddr, SLPC_PAGE_SIZE_BYTES); Also here we need clfush only for system memory address. > > return ret; > } > @@ -243,10 +250,11 @@ int intel_guc_slpc_init(struct intel_guc_slpc *slpc) > struct drm_i915_private *i915 = slpc_to_i915(slpc); > u32 size = PAGE_ALIGN(sizeof(struct slpc_shared_data)); > int err; > + void *vaddr; > > GEM_BUG_ON(slpc->vma); > > - err = intel_guc_allocate_and_map_vma(guc, size, &slpc->vma, (void > **)&slpc->vaddr); > + err = intel_guc_allocate_and_map_vma(guc, size, &slpc->vma, (void > **)&vaddr); > if (unlikely(err)) { > drm_err(&i915->drm, > "Failed to allocate SLPC struct (err=%pe)\n", > @@ -254,6 +262,12 @@ int intel_guc_slpc_init(st
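The review point above (flush only when the map holds a system memory address) and the flag-bit arithmetic in slpc_mem_set_param() can be modelled in userspace roughly as follows. This is only a sketch: struct slpc_toy_map, slpc_toy_clflush() and the other slpc_toy_* names are hypothetical stand-ins, not kernel APIs, and the "flush" merely increments a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct iosys_map; not the kernel type. */
struct slpc_toy_map {
	bool is_iomem;
	void *vaddr;
};

/* Counts flushes so the effect is observable without real cache ops. */
static unsigned int slpc_toy_flush_calls;

static void slpc_toy_clflush(void *addr, size_t len)
{
	(void)addr;
	(void)len;
	slpc_toy_flush_calls++;
}

/* Flush only when the mapping is backed by system memory. */
static void slpc_toy_flush_if_sysmem(const struct slpc_toy_map *map, size_t len)
{
	if (!map->is_iomem)
		slpc_toy_clflush(map->vaddr, len);
}

/* Same bit arithmetic as slpc_mem_set_param(): 32 flag bits per word. */
static void slpc_toy_set_param(uint32_t *bits, uint32_t *values,
			       uint32_t id, uint32_t value)
{
	bits[id >> 5] |= UINT32_C(1) << (id % 32);
	values[id] = value;
}
```

In the driver the equivalent check would presumably be on slpc_map.is_iomem before calling drm_clflush_virt_range(); whether that is the form finally merged is not known from this thread.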
[Intel-gfx] [PATCH v3 0/7] drm/i915: Use the memcpy_from_wc function from drm
drm_memcpy_from_wc() performs a fast copy from WC memory using
non-temporal instructions. There are currently two similar implementations
of this function: drm_memcpy_from_wc() in drm_cache.c and
i915_memcpy_from_wc() in i915/i915_memcpy.c. drm_memcpy_from_wc() was
added recently through the series
https://patchwork.freedesktop.org/patch/436276/?series=90681&rev=6

The goal of this patch series is to change all users of
i915_memcpy_from_wc() to drm_memcpy_from_wc(), have a common
implementation in drm, and eventually remove the copy from i915.

Another benefit of using the memcpy functions from drm is that
drm_memcpy_from_wc() is available for non-x86 architectures.
i915_memcpy_from_wc() is implemented only for x86 and prevents building
i915 for ARM64. drm_memcpy_from_wc() does a fast copy using non-temporal
instructions on x86, and on other architectures falls back to the memcpy()
family of functions.

Another major difference is that, unlike i915_memcpy_from_wc(),
drm_memcpy_from_wc() will not fail if the passed address is not suitably
aligned for non-temporal load instructions or if the platform lacks
support for those instructions (non-temporal load instructions are
provided by the SSE4.1 instruction set extension). Instead,
drm_memcpy_from_wc() continues with fallback functions to complete the
copy. This relieves the caller from checking the return value of
i915_memcpy_from_wc() and explicitly using a fallback.

A follow-up series will be created to remove the memcpy_from_wc functions
from i915 once the dependency is completely removed.

v2: Fixed missing check to find if the address is from system memory or
    io memory and use the right initialization function to construct the
    iosys_map structure (Review feedback from Lucas)

v3: "drm/i915/guc: use the memcpy_from_wc call from the drm" replaced by
    patch "drm/i915/guc: use iosys_map abstraction to access GuC log".
    The new patch makes a wider change than the old one. It completely
    changes the access to the GuC log to use the iosys_map abstraction, in
    addition to using drm_memcpy_from_wc.

Cc: Jani Nikula
Cc: Lucas De Marchi
Cc: David Airlie
Cc: Daniel Vetter
Cc: Chris Wilson
Cc: Thomas Hellström
Cc: Joonas Lahtinen
Cc: Rodrigo Vivi
Cc: Tvrtko Ursulin
Cc: Nirmoy Das

Balasubramani Vivekanandan (7):
  drm: Relax alignment constraint for destination address
  drm: Add drm_memcpy_from_wc() variant which accepts destination address
  drm/i915: use the memcpy_from_wc call from the drm
  drm/i915/guc: use iosys_map abstraction to access GuC log
  drm/i915/selftests: use the memcpy_from_wc call from the drm
  drm/i915/gt: Avoid direct dereferencing of io memory
  drm/i915: Avoid dereferencing io mapped memory

 drivers/gpu/drm/drm_cache.c | 99 +--
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c | 21 ++--
 drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h | 2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_capture.c | 52 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 77 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.h | 3 +-
 drivers/gpu/drm/i915/i915_gpu_error.c | 45 +
 .../drm/i915/selftests/intel_memory_region.c | 41 +---
 include/drm/drm_cache.h | 3 +
 10 files changed, 261 insertions(+), 90 deletions(-)

-- 
2.25.1
[Intel-gfx] [PATCH v3 2/7] drm: Add drm_memcpy_from_wc() variant which accepts destination address
Fast copy using non-temporal instructions for x86 currently exists in two
places: one implementation in the i915 driver at i915/i915_memcpy.c and
another in drm_cache.c. The plan is to remove the duplicate implementation
in the i915 driver and use the functions from drm_cache.c.

A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts an
address as argument instead of an iosys_map for the destination. It is a
very common scenario in i915 to copy from a WC memory type, which may be
io memory or system memory, to a destination address pointing to system
memory. To avoid the overhead of creating an iosys_map for the
destination, the new variant accepts the address directly.

Also, a new function is exported from drm_cache.c to report whether fast
copy is supported by the platform. It is required for i915.

v2: Added a new argument to drm_memcpy_from_wc_vaddr() which provides the
    offset into the src address to start copy from.

Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Thomas Hellström

Signed-off-by: Balasubramani Vivekanandan
Reviewed-by: Lucas De Marchi
Reviewed-by: Nirmoy Das
---
 drivers/gpu/drm/drm_cache.c | 55 +
 include/drm/drm_cache.h | 3 ++
 2 files changed, 58 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 2e2545df3310..8c7af755f7bc 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -358,6 +358,55 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
 }
 EXPORT_SYMBOL(drm_memcpy_from_wc);
 
+/**
+ * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source
+ * that may be WC to a destination in system memory.
+ * @dst: The destination pointer
+ * @src: The source pointer
+ * @src_offset: The offset from which to copy
+ * @len: The size of the area to transfer in bytes
+ *
+ * Same as drm_memcpy_from_wc except destination is accepted as system memory
+ * address. Useful in situations where passing destination address as iosys_map
+ * is simply an overhead and can be avoided.
+ */
+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,
+			      size_t src_offset, unsigned long len)
+{
+	const void *src_addr = src->is_iomem ?
+				(void const __force *)src->vaddr_iomem :
+				src->vaddr;
+
+	if (WARN_ON(in_interrupt())) {
+		iosys_map_memcpy_from(dst, src, src_offset, len);
+		return;
+	}
+
+	if (static_branch_likely(&has_movntdqa)) {
+		__drm_memcpy_from_wc(dst, src_addr + src_offset, len);
+		return;
+	}
+
+	iosys_map_memcpy_from(dst, src, src_offset, len);
+}
+EXPORT_SYMBOL(drm_memcpy_from_wc_vaddr);
+
+/*
+ * drm_memcpy_fastcopy_supported - Returns if fast copy using non-temporal
+ * instructions is supported
+ *
+ * Returns true if platform has support for fast copying from wc memory type
+ * using non-temporal instructions. Else false.
+ */
+bool drm_memcpy_fastcopy_supported(void)
+{
+	if (static_branch_likely(&has_movntdqa))
+		return true;
+
+	return false;
+}
+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
+
 /*
  * drm_memcpy_init_early - One time initialization of the WC memcpy code
  */
@@ -382,6 +431,12 @@ void drm_memcpy_from_wc(struct iosys_map *dst,
 }
 EXPORT_SYMBOL(drm_memcpy_from_wc);
 
+bool drm_memcpy_fastcopy_supported(void)
+{
+	return false;
+}
+EXPORT_SYMBOL(drm_memcpy_fastcopy_supported);
+
 void drm_memcpy_init_early(void)
 {
 }
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index 22deb216b59c..d1b57c84a659 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -77,4 +77,7 @@ void drm_memcpy_init_early(void);
 void drm_memcpy_from_wc(struct iosys_map *dst,
 			const struct iosys_map *src,
 			unsigned long len);
+bool drm_memcpy_fastcopy_supported(void);
+void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src,
+			      size_t src_offset, unsigned long len);
 #endif
-- 
2.25.1
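For readers who want to follow the control flow of drm_memcpy_from_wc_vaddr() outside the kernel, a rough userspace model might look like the sketch below. Everything named toy_* is hypothetical: struct toy_map only imitates struct iosys_map, the in_irq and has_fast_copy flags stand in for in_interrupt() and the has_movntdqa static key, and both paths end in a plain memcpy() because the point is the path selection, not the copy itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct iosys_map. */
struct toy_map {
	bool is_iomem;
	void *vaddr;	/* stands in for both vaddr and vaddr_iomem */
};

/* Counts how often each path ran, so the choice is observable. */
struct toy_stats {
	unsigned int fast;
	unsigned int fallback;
};

/*
 * Model of the decision flow: interrupt context or missing CPU support
 * selects the fallback, otherwise the fast path is used. The source
 * offset is applied before copying, as with the src_offset argument.
 */
static void toy_copy_from_wc_vaddr(void *dst, const struct toy_map *src,
				   size_t src_offset, size_t len,
				   bool in_irq, bool has_fast_copy,
				   struct toy_stats *stats)
{
	const char *src_addr = (const char *)src->vaddr + src_offset;

	if (!in_irq && has_fast_copy)
		stats->fast++;
	else
		stats->fallback++;

	/* Both paths are a plain memcpy() in this sketch. */
	memcpy(dst, src_addr, len);
}
```

The caller never sees a failure in either case, which is the property the commit message relies on.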
[Intel-gfx] [PATCH v3 1/7] drm: Relax alignment constraint for destination address
There is no need for the destination address to be aligned to a 16-byte
boundary in order to use the non-temporal instructions while copying.
Non-temporal instructions are used only for loading from the source
address, which is the side that has alignment constraints. When storing
the data, we only need to pick the right instructions based on whether
the destination address is aligned or not.

__memcpy_ntdqu is copied from i915/i915_memcpy.c

Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Chris Wilson

Signed-off-by: Balasubramani Vivekanandan
Reviewed-by: Lucas De Marchi
Reviewed-by: Nirmoy Das
---
 drivers/gpu/drm/drm_cache.c | 44 -
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 7051c9c909c2..2e2545df3310 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -278,18 +278,50 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
 	kernel_fpu_end();
 }
 
+static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len)
+{
+	kernel_fpu_begin();
+
+	while (len >= 4) {
+		asm("movntdqa (%0), %%xmm0\n"
+		    "movntdqa 16(%0), %%xmm1\n"
+		    "movntdqa 32(%0), %%xmm2\n"
+		    "movntdqa 48(%0), %%xmm3\n"
+		    "movups %%xmm0, (%1)\n"
+		    "movups %%xmm1, 16(%1)\n"
+		    "movups %%xmm2, 32(%1)\n"
+		    "movups %%xmm3, 48(%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 64;
+		dst += 64;
+		len -= 4;
+	}
+	while (len--) {
+		asm("movntdqa (%0), %%xmm0\n"
+		    "movups %%xmm0, (%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 16;
+		dst += 16;
+	}
+
+	kernel_fpu_end();
+}
+
 /*
  * __drm_memcpy_from_wc copies @len bytes from @src to @dst using
- * non-temporal instructions where available. Note that all arguments
- * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
- * of 16.
+ * non-temporal instructions where available. Note that @src must be aligned to
+ * 16 bytes and @len must be a multiple of 16.
  */
 static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len)
 {
-	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
+	if (unlikely(((unsigned long)src | len) & 15)) {
 		memcpy(dst, src, len);
-	else if (likely(len))
-		__memcpy_ntdqa(dst, src, len >> 4);
+	} else if (likely(len)) {
+		if (IS_ALIGNED((unsigned long)dst, 16))
+			__memcpy_ntdqa(dst, src, len >> 4);
+		else
+			__memcpy_ntdqu(dst, src, len >> 4);
+	}
 }
 
 /**
-- 
2.25.1
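The dispatch in the reworked __drm_memcpy_from_wc() can be checked in isolation with a small userspace sketch. The enum and pick_copy_path() below are hypothetical names introduced only for illustration; they classify a (dst, src, len) triple the same way the patched function does, without issuing any non-temporal instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the paths taken by __drm_memcpy_from_wc(). */
enum copy_path {
	COPY_PLAIN_MEMCPY,	 /* src or len is not a multiple of 16 */
	COPY_NT_ALIGNED_STORE,	 /* movntdqa loads, aligned stores */
	COPY_NT_UNALIGNED_STORE, /* movntdqa loads, movups stores */
	COPY_NOTHING,		 /* src aligned and len == 0 */
};

static enum copy_path pick_copy_path(uintptr_t dst, uintptr_t src, size_t len)
{
	/* Only the source address and the length constrain the fast path. */
	if ((src | (uintptr_t)len) & 15)
		return COPY_PLAIN_MEMCPY;

	if (!len)
		return COPY_NOTHING;

	/* Destination alignment only selects the store instruction. */
	return (dst & 15) ? COPY_NT_UNALIGNED_STORE : COPY_NT_ALIGNED_STORE;
}
```

Before this patch, a misaligned destination would have landed in the plain memcpy() case; with it, only the store variant changes.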
[Intel-gfx] [PATCH v3 4/7] drm/i915/guc: use iosys_map abstraction to access GuC log
Pointer to the GuC log may be pointing to system memory or device memory based on if the GuC log is backed by system memory or GPU local memory. If the GuC log is on the local memory, we need to use memcpy_[from/to]io APIs to access the logs to support i915 on non-x86 architectures. iosys_map family of APIs provide the needed abstraction to access such address pointers. There is parallel work ongoing to move all such memory access in i915 to iosys_map APIs. Pointer to GuC log ported to iosys_map in this patch as it provides a good base when changing to drm_memcpy_from_wc. Cc: Lucas De Marchi Cc: Daniele Ceraolo Spurio Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h | 2 +- .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 52 + drivers/gpu/drm/i915/gt/uc/intel_guc_log.c| 77 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_log.h| 3 +- 4 files changed, 98 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h b/drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h index 3624abfd22d1..47bed2a0c409 100644 --- a/drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h +++ b/drivers/gpu/drm/i915/gt/uc/guc_capture_fwif.h @@ -21,7 +21,7 @@ struct file; */ struct __guc_capture_bufstate { u32 size; - void *data; + struct iosys_map *data_map; u32 rd; u32 wr; }; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c index c4e25966d3e9..c4f7a28956b8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c @@ -5,6 +5,7 @@ #include +#include #include #include "gt/intel_engine_regs.h" @@ -826,7 +827,6 @@ guc_capture_log_remove_dw(struct intel_guc *guc, struct __guc_capture_bufstate * struct drm_i915_private *i915 = guc_to_gt(guc)->i915; int tries = 2; int avail = 0; - u32 *src_data; if (!guc_capture_buf_cnt(buf)) return 0; @@ -834,8 +834,7 @@ guc_capture_log_remove_dw(struct intel_guc *guc, struct __guc_capture_bufstate * while 
(tries--) { avail = guc_capture_buf_cnt_to_end(buf); if (avail >= sizeof(u32)) { - src_data = (u32 *)(buf->data + buf->rd); - *dw = *src_data; + *dw = iosys_map_rd(buf->data_map, buf->rd, u32); buf->rd += 4; return 4; } @@ -852,7 +851,7 @@ guc_capture_data_extracted(struct __guc_capture_bufstate *b, int size, void *dest) { if (guc_capture_buf_cnt_to_end(b) >= size) { - memcpy(dest, (b->data + b->rd), size); + drm_memcpy_from_wc_vaddr(dest, b->data_map, b->rd, size); b->rd += size; return true; } @@ -1343,22 +1342,24 @@ static void __guc_capture_process_output(struct intel_guc *guc) struct intel_uc *uc = container_of(guc, typeof(*uc), guc); struct drm_i915_private *i915 = guc_to_gt(guc)->i915; struct guc_log_buffer_state log_buf_state_local; - struct guc_log_buffer_state *log_buf_state; + unsigned int capture_offset; struct __guc_capture_bufstate buf; - void *src_data = NULL; + struct iosys_map src_map; bool new_overflow; int ret; - log_buf_state = guc->log.buf_addr + - (sizeof(struct guc_log_buffer_state) * GUC_CAPTURE_LOG_BUFFER); - src_data = guc->log.buf_addr + intel_guc_get_log_buffer_offset(GUC_CAPTURE_LOG_BUFFER); + src_map = IOSYS_MAP_INIT_OFFSET(&guc->log.buf_map, + intel_guc_get_log_buffer_offset(GUC_CAPTURE_LOG_BUFFER)); /* * Make a copy of the state structure, inside GuC log buffer * (which is uncached mapped), on the stack to avoid reading * from it multiple times. 
*/ - memcpy(&log_buf_state_local, log_buf_state, sizeof(struct guc_log_buffer_state)); + capture_offset = sizeof(struct guc_log_buffer_state) * GUC_CAPTURE_LOG_BUFFER; + drm_memcpy_from_wc_vaddr(&log_buf_state_local, &guc->log.buf_map, +capture_offset, +sizeof(struct guc_log_buffer_state)); buffer_size = intel_guc_get_log_buffer_size(GUC_CAPTURE_LOG_BUFFER); read_offset = log_buf_state_local.read_ptr; write_offset = log_buf_state_local.sampled_write_ptr; @@ -1385,7 +1386,7 @@ static void __guc_capture_process_output(struct intel_guc *guc) buf.size = buffer_size; buf.rd = read_offset; buf.wr = write_offset; - buf.data = src_data; + buf.data_map = &src_map; if (!uc->reset_in_progress) { do {
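As a rough illustration of what the conversion in guc_capture_log_remove_dw() amounts to, the sketch below derives a sub-map at an offset (in the spirit of IOSYS_MAP_INIT_OFFSET) and reads one dword through an accessor rather than by dereferencing a pointer. All log_toy_* and toy_* names are hypothetical userspace stand-ins, and the wrap-around and retry handling of the real function is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct iosys_map. */
struct log_toy_map {
	bool is_iomem;
	unsigned char *vaddr;
};

/* Returns a copy of @base advanced by @offset bytes. */
static struct log_toy_map log_toy_map_offset(const struct log_toy_map *base,
					     size_t offset)
{
	struct log_toy_map m = *base;

	m.vaddr += offset;
	return m;
}

/* Accessor-style read; the kernel helper would pick the io or sysmem path. */
static uint32_t log_toy_rd_u32(const struct log_toy_map *map, size_t offset)
{
	uint32_t val;

	memcpy(&val, map->vaddr + offset, sizeof(val));
	return val;
}

/* Simplified mirror of struct __guc_capture_bufstate after the patch. */
struct toy_bufstate {
	uint32_t size;
	const struct log_toy_map *data_map;
	uint32_t rd;
	uint32_t wr;
};

/* Pops one dword if available; no wrap-around handling in this sketch. */
static int toy_remove_dw(struct toy_bufstate *buf, uint32_t *dw)
{
	if (buf->rd + sizeof(uint32_t) > buf->wr)
		return 0;

	*dw = log_toy_rd_u32(buf->data_map, buf->rd);
	buf->rd += 4;
	return 4;
}
```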
[Intel-gfx] [PATCH v3 6/7] drm/i915/gt: Avoid direct dereferencing of io memory
io mapped memory should not be directly dereferenced to ensure portability. io memory should be read/written/copied using helper functions. i915_memcpy_from_wc() function was used to copy the data from io memory to a temporary buffer and pointer to the temporary buffer was passed to CRC calculation function. But i915_memcpy_from_wc() only does a copy if the platform supports fast copy using non-temporal instructions. Otherwise the pointer to io memory was passed for CRC calculation. CRC function will directly dereference io memory and would not work properly on non-x86 platforms. To make it portable, it should be ensured always temporary buffer is used for CRC and not io memory. drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for 2 reasons. - i915_memcpy_from_wc() will be deprecated. - drm_memcpy_from_wc_vaddr() will not fail if the fast copy is not supported but uses memcpy_fromio as fallback for copying. Cc: Matthew Brost Cc: Michał Winiarski Signed-off-by: Balasubramani Vivekanandan Acked-by: Nirmoy Das --- drivers/gpu/drm/i915/gt/selftest_reset.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index 37c38bdd5f47..7a455583c687 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -3,6 +3,7 @@ * Copyright © 2018 Intel Corporation */ +#include #include #include "gem/i915_gem_stolen.h" @@ -82,7 +83,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start, @@ -98,10 +99,9 @@ __igt_reset_stolen(struct intel_gt *gt, ((page + 1) << PAGE_SHIFT) - 1)) memset_io(s, STACK_MAGIC, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = 
tmp; - crc[page] = crc32_le(0, in, PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, 0, PAGE_SIZE); + crc[page] = crc32_le(0, tmp, PAGE_SIZE); io_mapping_unmap(s); } @@ -122,7 +122,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; u32 x; ggtt->vm.insert_page(&ggtt->vm, dma, @@ -134,10 +134,9 @@ __igt_reset_stolen(struct intel_gt *gt, ggtt->error_capture.start, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = tmp; - x = crc32_le(0, in, PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, 0, PAGE_SIZE); + x = crc32_le(0, tmp, PAGE_SIZE); if (x != crc[page] && !__drm_mm_interval_first(>->i915->mm.stolen, @@ -146,7 +145,7 @@ __igt_reset_stolen(struct intel_gt *gt, pr_debug("unused stolen page %pa modified by GPU reset\n", &page); if (count++ == 0) - igt_hexdump(in, PAGE_SIZE); + igt_hexdump(tmp, PAGE_SIZE); max = page; } -- 2.25.1
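The pattern this patch enforces (always copy the io page into a temporary buffer, then run the CRC on that buffer) can be sketched in userspace as below. The names toy_crc32() and toy_crc_of_io_page() are hypothetical, the byte loop merely stands in for an io-safe copy such as memcpy_fromio(), and the CRC shown is the common reflected CRC-32 with initial and final inversion, which is an assumption for the sketch and differs from the seed convention of crc32_le(0, ...) used in the selftest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise reflected CRC-32 (polynomial 0xEDB88320), init and final xor ~0. */
static uint32_t toy_crc32(const unsigned char *p, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Copy from the (pretend) io mapping into @tmp first, then checksum @tmp.
 * The CRC routine never touches the io pointer directly.
 */
static uint32_t toy_crc_of_io_page(const volatile unsigned char *io,
				   unsigned char *tmp, size_t len)
{
	for (size_t i = 0; i < len; i++)
		tmp[i] = io[i];	/* stand-in for an io-safe copy helper */

	return toy_crc32(tmp, len);
}
```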
[Intel-gfx] [PATCH v3 5/7] drm/i915/selftests: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. v2: check if the source and destination memory address is from local memory or system memory and initialize the iosys_map accordingly (Lucas) Cc: Lucas De Marchi Cc: Matthew Auld Cc: Thomas Hellstr_m Cc: Thomas Zimmermann Cc: Daniel Vetter Signed-off-by: Balasubramani Vivekanandan Acked-by: Nirmoy Das --- .../drm/i915/selftests/intel_memory_region.c | 41 +-- 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c index 73eb53edb8de..420210c20ad5 100644 --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c @@ -7,6 +7,7 @@ #include #include +#include #include "../i915_selftest.h" @@ -1141,7 +1142,7 @@ static const char *repr_type(u32 type) static struct drm_i915_gem_object * create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type, - void **out_addr) + struct iosys_map *out_addr) { struct drm_i915_gem_object *obj; void *addr; @@ -1161,7 +1162,11 @@ create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type, return addr; } - *out_addr = addr; + if (i915_gem_object_is_lmem(obj)) + iosys_map_set_vaddr_iomem(out_addr, (void __iomem *)addr); + else + iosys_map_set_vaddr(out_addr, addr); + return obj; } @@ -1172,24 +1177,33 @@ static int wrap_ktime_compare(const void *A, const void *B) return ktime_compare(*a, *b); } -static void igt_memcpy_long(void *dst, const void *src, size_t size) +static void igt_memcpy_long(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - unsigned long *tmp = dst; - const unsigned long *s = src; + unsigned long *tmp = dst->is_iomem ? + (unsigned long __force *)dst->vaddr_iomem : + dst->vaddr; + const unsigned long *s = src->is_iomem ? 
+ (unsigned long __force *)src->vaddr_iomem : + src->vaddr; size = size / sizeof(unsigned long); while (size--) *tmp++ = *s++; } -static inline void igt_memcpy(void *dst, const void *src, size_t size) +static inline void igt_memcpy(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - memcpy(dst, src, size); + memcpy(dst->is_iomem ? (void __force *)dst->vaddr_iomem : dst->vaddr, + src->is_iomem ? (void __force *)src->vaddr_iomem : src->vaddr, + size); } -static inline void igt_memcpy_from_wc(void *dst, const void *src, size_t size) +static inline void igt_memcpy_from_wc(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - i915_memcpy_from_wc(dst, src, size); + drm_memcpy_from_wc(dst, src, size); } static int _perf_memcpy(struct intel_memory_region *src_mr, @@ -1199,7 +1213,8 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, struct drm_i915_private *i915 = src_mr->i915; const struct { const char *name; - void (*copy)(void *dst, const void *src, size_t size); + void (*copy)(struct iosys_map *dst, struct iosys_map *src, +size_t size); bool skip; } tests[] = { { @@ -1213,11 +1228,11 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, { "memcpy_from_wc", igt_memcpy_from_wc, - !i915_has_memcpy_from_wc(), + !drm_memcpy_fastcopy_supported(), }, }; struct drm_i915_gem_object *src, *dst; - void *src_addr, *dst_addr; + struct iosys_map src_addr, dst_addr; int ret = 0; int i; @@ -1245,7 +1260,7 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, t0 = ktime_get(); - tests[i].copy(dst_addr, src_addr, size); + tests[i].copy(&dst_addr, &src_addr, size); t1 = ktime_get(); t[pass] = ktime_sub(t1, t0); -- 2.25.1
[Intel-gfx] [PATCH v3 3/7] drm/i915: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by
the implementation in drm_cache.c. Updated to use the functions provided
by drm_cache.c.

v2: Pass newly added src offset argument to the modified
    drm_memcpy_from_wc_vaddr() function.

Signed-off-by: Balasubramani Vivekanandan
Reviewed-by: Lucas De Marchi
Reviewed-by: Nirmoy Das
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 06b1b188ce5a..c1ff0a591a24 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -438,16 +438,16 @@ static void
 i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size)
 {
 	void __iomem *src_map;
-	void __iomem *src_ptr;
+	struct iosys_map src;
+
 	dma_addr_t dma = i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT);
 
 	src_map = io_mapping_map_wc(&obj->mm.region->iomap,
 				    dma - obj->mm.region->region.start,
 				    PAGE_SIZE);
 
-	src_ptr = src_map + offset_in_page(offset);
-	if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size))
-		memcpy_fromio(dst, src_ptr, size);
+	iosys_map_set_vaddr_iomem(&src, src_map);
+	drm_memcpy_from_wc_vaddr(dst, &src, offset_in_page(offset), size);
 
 	io_mapping_unmap(src_map);
 }
-- 
2.25.1
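The function above splits the object offset in two: offset >> PAGE_SHIFT selects the page to map, and offset_in_page(offset) becomes the src_offset passed to drm_memcpy_from_wc_vaddr(). A minimal userspace sketch of that arithmetic follows, assuming 4 KiB pages; the TOY_* macros and toy_split_offset() are hypothetical names, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for the sketch: 4 KiB pages, as on x86. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (UINT64_C(1) << TOY_PAGE_SHIFT)

struct toy_page_pos {
	uint64_t page;		 /* index of the page to map */
	uint64_t offset_in_page; /* byte offset inside that page */
};

static struct toy_page_pos toy_split_offset(uint64_t offset)
{
	struct toy_page_pos pos = {
		.page = offset >> TOY_PAGE_SHIFT,
		.offset_in_page = offset & (TOY_PAGE_SIZE - 1),
	};

	return pos;
}
```

For example, offset 0x3010 maps page 3 and starts the copy 0x10 bytes into it.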
[Intel-gfx] [PATCH v3 7/7] drm/i915: Avoid dereferencing io mapped memory
Pointer passed to zlib_deflate() for compression could point to io mapped memory and might end up in direct derefencing. io mapped memory is copied to a temporary buffer, which is then shared to zlib_deflate(), only for the case where platform supports fast copy using non-temporal instructions. If the platform lacks support, then io mapped memory is directly used. Direct dereferencing of io memory makes driver not portable outside x86 and should be avoided. With this patch, io memory is always copied to a temporary buffer irrespective of platform support for fast copy. The i915_has_memcpy_from_wc() check is removed. And drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for 2 reasons. - i915_memcpy_from_wc() will be deprecated. - drm_memcpy_from_wc_vaddr() will not fail if the fast copy is not supported instead continues copying using memcpy_fromio as fallback. Signed-off-by: Balasubramani Vivekanandan Acked-by: Nirmoy Das --- drivers/gpu/drm/i915/i915_gpu_error.c | 45 +++ 1 file changed, 25 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 0512c66fa4f3..9cafacb4ceb6 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -262,9 +262,12 @@ static bool compress_init(struct i915_vma_compress *c) return false; } - c->tmp = NULL; - if (i915_has_memcpy_from_wc()) - c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + if (!c->tmp) { + kfree(zstream->workspace); + pool_fini(&c->pool); + return false; + } return true; } @@ -296,15 +299,17 @@ static void *compress_next_page(struct i915_vma_compress *c, } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { struct z_stream_s *zstream = &c->zstream; - zstream->next_in = src; - if (wc && c->tmp && i915_memcpy_from_wc(c->tmp, src, 
PAGE_SIZE)) + if (src->is_iomem) { + drm_memcpy_from_wc_vaddr(c->tmp, src, 0, PAGE_SIZE); zstream->next_in = c->tmp; + } else { + zstream->next_in = src->vaddr; + } zstream->avail_in = PAGE_SIZE; do { @@ -393,9 +398,8 @@ static bool compress_start(struct i915_vma_compress *c) } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { void *ptr; @@ -403,8 +407,7 @@ static int compress_page(struct i915_vma_compress *c, if (!ptr) return -ENOMEM; - if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE))) - memcpy(ptr, src, PAGE_SIZE); + drm_memcpy_from_wc_vaddr(ptr, src, 0, PAGE_SIZE); list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list); cond_resched(); @@ -1092,6 +1095,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, if (drm_mm_node_allocated(&ggtt->error_capture)) { void __iomem *s; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { mutex_lock(&ggtt->error_mutex); @@ -1100,9 +1104,8 @@ i915_vma_coredump_create(const struct intel_gt *gt, mb(); s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE); - ret = compress_page(compress, - (void __force *)s, dst, - true); + iosys_map_set_vaddr_iomem(&src, s); + ret = compress_page(compress, &src, dst); io_mapping_unmap(s); mb(); @@ -1114,6 +1117,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, } else if (vma_res->bi.lmem) { struct intel_memory_region *mem = vma_res->mr; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { void __iomem *s; @@ -1121,15 +1125,15 @@ i915_vma_coredump_create(const struct intel_gt *gt, s = io_mapping_map_wc(&mem->iomap,
Re: [Intel-gfx] [PATCH 1/1] drm/i915/guc: Convert ct buffer to iosys_map
On 28.04.2022 19:43, Siva Mullati wrote: > > On 14/04/22 17:41, Balasubramani Vivekanandan wrote: > > On 04.04.2022 15:01, Mullati Siva wrote: > >> From: Siva Mullati > >> > >> Convert CT commands and descriptors to use iosys_map rather > >> than plain pointer and save it in the intel_guc_ct_buffer struct. > >> This will help with ct_write and ct_read for cmd send and receive > >> after the initialization by abstracting the IO vs system memory. > >> > >> Signed-off-by: Siva Mullati > >> --- > >> drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 200 +- > >> drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 9 +- > >> 2 files changed, 127 insertions(+), 82 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > >> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > >> index f01325cd1b62..64568dc90b05 100644 > >> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > >> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c > >> @@ -44,6 +44,11 @@ static inline struct drm_device *ct_to_drm(struct > >> intel_guc_ct *ct) > >> #define CT_PROBE_ERROR(_ct, _fmt, ...) \ > >>i915_probe_error(ct_to_i915(ct), "CT: " _fmt, ##__VA_ARGS__) > >> > >> +#define ct_desc_read(desc_map_, field_) \ > >> + iosys_map_rd_field(desc_map_, 0, struct guc_ct_buffer_desc, field_) > >> +#define ct_desc_write(desc_map_, field_, val_) \ > >> + iosys_map_wr_field(desc_map_, 0, struct guc_ct_buffer_desc, field_, > >> val_) > >> + > > Did you try to make the change Lucas mentioned in his comment on rev0, > > to pass `struct guc_ct_buffer_desc *` as first argument to the above > > macros? Was it not feasible? > It is not feasible. 
> >> /** > >> * DOC: CTB Blob > >> * > >> @@ -76,6 +81,11 @@ static inline struct drm_device *ct_to_drm(struct > >> intel_guc_ct *ct) > >> #define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE) > >> #define G2H_ROOM_BUFFER_SIZE (CTB_G2H_BUFFER_SIZE / 4) > >> > >> +#define CTB_SEND_DESC_OFFSET 0u > >> +#define CTB_RECV_DESC_OFFSET (CTB_DESC_SIZE) > >> +#define CTB_SEND_CMDS_OFFSET (2 * CTB_DESC_SIZE) > >> +#define CTB_RECV_CMDS_OFFSET (2 * CTB_DESC_SIZE + > >> CTB_H2G_BUFFER_SIZE) > >> + > >> struct ct_request { > >>struct list_head link; > >>u32 fence; > >> @@ -113,9 +123,9 @@ void intel_guc_ct_init_early(struct intel_guc_ct *ct) > >>init_waitqueue_head(&ct->wq); > >> } > >> > >> -static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) > >> +static void guc_ct_buffer_desc_init(struct iosys_map *desc) > >> { > >> - memset(desc, 0, sizeof(*desc)); > >> + iosys_map_memset(desc, 0, 0, sizeof(struct guc_ct_buffer_desc)); > >> } > >> > >> static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) > >> @@ -128,17 +138,18 @@ static void guc_ct_buffer_reset(struct > >> intel_guc_ct_buffer *ctb) > >>space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size) - ctb->resv_space; > >>atomic_set(&ctb->space, space); > >> > >> - guc_ct_buffer_desc_init(ctb->desc); > >> + guc_ct_buffer_desc_init(&ctb->desc_map); > >> } > >> > >> static void guc_ct_buffer_init(struct intel_guc_ct_buffer *ctb, > >> - struct guc_ct_buffer_desc *desc, > >> - u32 *cmds, u32 size_in_bytes, u32 resv_space) > >> + struct iosys_map *desc, > >> + struct iosys_map *cmds, > >> + u32 size_in_bytes, u32 resv_space) > >> { > >>GEM_BUG_ON(size_in_bytes % 4); > >> > >> - ctb->desc = desc; > >> - ctb->cmds = cmds; > >> + ctb->desc_map = *desc; > >> + ctb->cmds_map = *cmds; > >>ctb->size = size_in_bytes / 4; > >>ctb->resv_space = resv_space / 4; > >> > >> @@ -218,12 +229,13 @@ static int ct_register_buffer(struct intel_guc_ct > >> *ct, bool send, > >> int intel_guc_ct_init(struct intel_guc_ct *ct) > >> { > 
>>struct intel_guc *guc = ct_to_guc(ct); > >> - struct guc_ct_buffer_desc *desc; > >> + struct iosys_map blob_map; > >> + struct iosys_map desc_map; > >> + struct ios
[Intel-gfx] [PATCH 0/7] drm/i915: Use the memcpy_from_wc function from drm
drm_memcpy_from_wc() performs a fast copy from WC memory using non-temporal instructions. There are currently two similar implementations of this function: drm_memcpy_from_wc() in drm_cache.c and i915_memcpy_from_wc() in i915/i915_memcpy.c. drm_memcpy_from_wc() was added recently through the series https://patchwork.freedesktop.org/patch/436276/?series=90681&rev=6

The goal of this patch series is to convert all users of i915_memcpy_from_wc() to drm_memcpy_from_wc(), so that there is a common implementation in drm, and eventually to remove the copy from i915.

Another benefit of using the memcpy functions from drm is that drm_memcpy_from_wc() is available on non-x86 architectures. i915_memcpy_from_wc() is implemented only for x86, which prevents building i915 for ARM64. drm_memcpy_from_wc() does a fast copy using non-temporal instructions on x86 and falls back to the memcpy() family of functions on other architectures.

Another major difference is that, unlike i915_memcpy_from_wc(), drm_memcpy_from_wc() does not fail if the passed address is not suitably aligned for non-temporal load instructions or if the platform lacks support for those instructions (non-temporal loads are provided by the SSE4.1 instruction set extension). Instead, drm_memcpy_from_wc() completes the copy using fallback functions. This relieves the caller from checking the return value of i915_memcpy_from_wc() and explicitly using a fallback.

A follow-up series will be created to remove the memcpy_from_wc functions from i915 once the dependency is completely removed.
Cc: Jani Nikula
Cc: Lucas De Marchi
Cc: David Airlie
Cc: Daniel Vetter
Cc: Chris Wilson
Cc: Thomas Hellström
Cc: Joonas Lahtinen
Cc: Rodrigo Vivi
Cc: Tvrtko Ursulin

Balasubramani Vivekanandan (7):
  drm: Relax alignment constraint for destination address
  drm: Add drm_memcpy_from_wc() variant which accepts destination address
  drm/i915: use the memcpy_from_wc call from the drm
  drm/i915/guc: use the memcpy_from_wc call from the drm
  drm/i915/selftests: use the memcpy_from_wc call from the drm
  drm/i915/gt: Avoid direct dereferencing of io memory
  drm/i915: Avoid dereferencing io mapped memory

 drivers/gpu/drm/drm_cache.c                   | 98 +--
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  8 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c      | 21 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    | 11 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c         | 45 +
 .../drm/i915/selftests/intel_memory_region.c  |  8 +-
 include/drm/drm_cache.h                       |  3 +
 7 files changed, 148 insertions(+), 46 deletions(-)

-- 
2.25.1
[Intel-gfx] [PATCH 1/7] drm: Relax alignment constraint for destination address
There is no need for the destination address to be aligned to 16 byte
boundary to be able to use the non-temporal instructions while copying.
Non-temporal instructions are used only for loading from the source
address which has alignment constraints.
We only need to take care of using the right instructions, based on
whether destination address is aligned or not, while storing the data to
the destination address.

__memcpy_ntdqu is copied from i915/i915_memcpy.c

Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Chris Wilson

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/drm_cache.c | 44 -
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index c3e6e615bf09..a21c1350eb09 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -278,18 +278,50 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
 	kernel_fpu_end();
 }
 
+static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len)
+{
+	kernel_fpu_begin();
+
+	while (len >= 4) {
+		asm("movntdqa   (%0), %%xmm0\n"
+		    "movntdqa 16(%0), %%xmm1\n"
+		    "movntdqa 32(%0), %%xmm2\n"
+		    "movntdqa 48(%0), %%xmm3\n"
+		    "movups %%xmm0,   (%1)\n"
+		    "movups %%xmm1, 16(%1)\n"
+		    "movups %%xmm2, 32(%1)\n"
+		    "movups %%xmm3, 48(%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 64;
+		dst += 64;
+		len -= 4;
+	}
+	while (len--) {
+		asm("movntdqa (%0), %%xmm0\n"
+		    "movups %%xmm0, (%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 16;
+		dst += 16;
+	}
+
+	kernel_fpu_end();
+}
+
 /*
  * __drm_memcpy_from_wc copies @len bytes from @src to @dst using
- * non-temporal instructions where available. Note that all arguments
- * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
- * of 16.
+ * non-temporal instructions where available. Note that @src must be aligned to
+ * 16 bytes and @len must be a multiple of 16.
  */
 static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len)
 {
-	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
+	if (unlikely(((unsigned long)src | len) & 15)) {
 		memcpy(dst, src, len);
-	else if (likely(len))
-		__memcpy_ntdqa(dst, src, len >> 4);
+	} else if (likely(len)) {
+		if (IS_ALIGNED((unsigned long)dst, 16))
+			__memcpy_ntdqa(dst, src, len >> 4);
+		else
+			__memcpy_ntdqu(dst, src, len >> 4);
+	}
 }
 
 /**
-- 
2.25.1
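The path selection introduced by this patch can be modelled in plain userspace C. The sketch below is only an illustration: the enum and the function names wc_copy_select_path() and wc_copy_model() are hypothetical and do not exist in drm_cache.c, and an ordinary memcpy() stands in for every copy path, since the SSE4.1 assembly and kernel_fpu_begin()/kernel_fpu_end() are not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the branches of __drm_memcpy_from_wc(). */
enum wc_copy_path {
	WC_COPY_NONE,         /* source aligned, len == 0: nothing to do */
	WC_COPY_FALLBACK,     /* source or length not a multiple of 16 */
	WC_COPY_NT_ALIGNED,   /* non-temporal load, aligned store */
	WC_COPY_NT_UNALIGNED, /* non-temporal load, unaligned store */
};

/* Mirrors the checks in the patched function; never dereferences. */
static enum wc_copy_path wc_copy_select_path(const void *dst,
					     const void *src, size_t len)
{
	/* Only the source address and the length gate the fast path. */
	if (((uintptr_t)src | len) & 15)
		return WC_COPY_FALLBACK;
	if (!len)
		return WC_COPY_NONE;
	/* The destination alignment only selects the store instruction. */
	return ((uintptr_t)dst & 15) ? WC_COPY_NT_UNALIGNED
				     : WC_COPY_NT_ALIGNED;
}

/* Userspace stand-in: reports the path, copies with memcpy(). */
static enum wc_copy_path wc_copy_model(void *dst, const void *src, size_t len)
{
	enum wc_copy_path path = wc_copy_select_path(dst, src, len);

	assert(dst != NULL && src != NULL);
	if (path != WC_COPY_NONE)
		memcpy(dst, src, len);
	return path;
}
```

With this model, a misaligned destination no longer forces the fallback; it only changes which store variant would be used, which is the behaviour the commit message describes.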
[Intel-gfx] [PATCH 2/7] drm: Add drm_memcpy_from_wc() variant which accepts destination address
Fast copy using non-temporal instructions for x86 currently exists at two locations. One is implemented in i915 driver at i915/i915_memcpy.c and another copy at drm_cache.c. The plan is to remove the duplicate implementation in i915 driver and use the functions from drm_cache.c. A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts address as argument instead of iosys_map for destination. It is a very common scenario in i915 to copy from a WC memory type, which may be an io memory or a system memory to a destination address pointing to system memory. To avoid the overhead of creating iosys_map type for the destination, new variant is created to accept the address directly. Also a new function is exported in drm_cache.c to find if the fast copy is supported by the platform or not. It is required for i915. Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Thomas Hellstr_m Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/drm_cache.c | 54 + include/drm/drm_cache.h | 3 +++ 2 files changed, 57 insertions(+) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index a21c1350eb09..eb0bcd33665e 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -358,6 +358,54 @@ void drm_memcpy_from_wc(struct iosys_map *dst, } EXPORT_SYMBOL(drm_memcpy_from_wc); +/** + * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source + * that may be WC. + * @dst: The destination pointer + * @src: The source pointer + * @len: The size of the area to transfer in bytes + * + * Same as drm_memcpy_from_wc except destination is accepted as system memory + * address. Useful in situations where passing destination address as iosys_map + * is simply an overhead and can be avoided. 
+ */ +void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src, + unsigned long len) +{ + if (WARN_ON(in_interrupt())) { + iosys_map_memcpy_from(dst, src, 0, len); + return; + } + + if (static_branch_likely(&has_movntdqa)) { + __drm_memcpy_from_wc(dst, +src->is_iomem ? +(void const __force *)src->vaddr_iomem : +src->vaddr, +len); + return; + } + + iosys_map_memcpy_from(dst, src, 0, len); +} +EXPORT_SYMBOL(drm_memcpy_from_wc_vaddr); + +/* + * drm_memcpy_fastcopy_supported - Returns if fast copy using non-temporal + * instructions is supported + * + * Returns true if platform has support for fast copying from wc memory type + * using non-temporal instructions. Else false. + */ +bool drm_memcpy_fastcopy_supported(void) +{ + if (static_branch_likely(&has_movntdqa)) + return true; + + return false; +} +EXPORT_SYMBOL(drm_memcpy_fastcopy_supported); + /* * drm_memcpy_init_early - One time initialization of the WC memcpy code */ @@ -382,6 +430,12 @@ void drm_memcpy_from_wc(struct iosys_map *dst, } EXPORT_SYMBOL(drm_memcpy_from_wc); +bool drm_memcpy_fastcopy_supported(void) +{ + return false; +} +EXPORT_SYMBOL(drm_memcpy_fastcopy_supported); + void drm_memcpy_init_early(void) { } diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h index 22deb216b59c..8f48e4dcd7dc 100644 --- a/include/drm/drm_cache.h +++ b/include/drm/drm_cache.h @@ -77,4 +77,7 @@ void drm_memcpy_init_early(void); void drm_memcpy_from_wc(struct iosys_map *dst, const struct iosys_map *src, unsigned long len); +bool drm_memcpy_fastcopy_supported(void); +void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src, + unsigned long len); #endif -- 2.25.1
[Intel-gfx] [PATCH 3/7] drm/i915: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 2d593d573ef1..49ff8e3e71d9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -449,16 +449,16 @@ static void i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size) { void __iomem *src_map; - void __iomem *src_ptr; + struct iosys_map src_ptr; + dma_addr_t dma = i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT); src_map = io_mapping_map_wc(&obj->mm.region->iomap, dma - obj->mm.region->region.start, PAGE_SIZE); - src_ptr = src_map + offset_in_page(offset); - if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size)) - memcpy_fromio(dst, src_ptr, size); + iosys_map_set_vaddr_iomem(&src_ptr, (src_map + offset_in_page(offset))); + drm_memcpy_from_wc_vaddr(dst, &src_ptr, size); io_mapping_unmap(src_map); } -- 2.25.1
[Intel-gfx] [PATCH 4/7] drm/i915/guc: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index b53f61f3101f..1990762f07de 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -3,6 +3,7 @@ * Copyright © 2014-2019 Intel Corporation */ +#include #include #include "gt/intel_gt.h" @@ -205,6 +206,7 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log) enum guc_log_buffer_type type; void *src_data, *dst_data; bool new_overflow; + struct iosys_map src_map; mutex_lock(&log->relay.lock); @@ -281,14 +283,17 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log) } /* Just copy the newly written data */ + iosys_map_set_vaddr(&src_map, src_data); if (read_offset > write_offset) { - i915_memcpy_from_wc(dst_data, src_data, write_offset); + drm_memcpy_from_wc_vaddr(dst_data, &src_map, +write_offset); bytes_to_copy = buffer_size - read_offset; } else { bytes_to_copy = write_offset - read_offset; } - i915_memcpy_from_wc(dst_data + read_offset, - src_data + read_offset, bytes_to_copy); + iosys_map_incr(&src_map, read_offset); + drm_memcpy_from_wc_vaddr(dst_data + read_offset, &src_map, +bytes_to_copy); src_data += buffer_size; dst_data += buffer_size; -- 2.25.1
[Intel-gfx] [PATCH 5/7] drm/i915/selftests: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/selftests/intel_memory_region.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c index 7acba1d2135e..d7531aa6965a 100644 --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c @@ -7,6 +7,7 @@ #include #include +#include #include "../i915_selftest.h" @@ -1033,7 +1034,10 @@ static inline void igt_memcpy(void *dst, const void *src, size_t size) static inline void igt_memcpy_from_wc(void *dst, const void *src, size_t size) { - i915_memcpy_from_wc(dst, src, size); + struct iosys_map src_map; + + iosys_map_set_vaddr(&src_map, (void *)src); + drm_memcpy_from_wc_vaddr(dst, &src_map, size); } static int _perf_memcpy(struct intel_memory_region *src_mr, @@ -1057,7 +1061,7 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, { "memcpy_from_wc", igt_memcpy_from_wc, - !i915_has_memcpy_from_wc(), + !drm_memcpy_fastcopy_supported(), }, }; struct drm_i915_gem_object *src, *dst; -- 2.25.1
[Intel-gfx] [PATCH 6/7] drm/i915/gt: Avoid direct dereferencing of io memory
io mapped memory should not be directly dereferenced to ensure portability. io memory should be read/written/copied using helper functions. i915_memcpy_from_wc() function was used to copy the data from io memory to a temporary buffer and pointer to the temporary buffer was passed to CRC calculation function. But i915_memcpy_from_wc() only does a copy if the platform supports fast copy using non-temporal instructions. Otherwise the pointer to io memory was passed for CRC calculation. CRC function will directly dereference io memory and would not work properly on non-x86 platforms. To make it portable, it should be ensured always temporary buffer is used for CRC and not io memory. drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for 2 reasons. - i915_memcpy_from_wc() will be deprecated. - drm_memcpy_from_wc_vaddr() will not fail if the fast copy is not supported but uses memcpy_fromio as fallback for copying. Cc: Matthew Brost Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/selftest_reset.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index 37c38bdd5f47..79d2bd7ef3b9 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -3,6 +3,7 @@ * Copyright © 2018 Intel Corporation */ +#include #include #include "gem/i915_gem_stolen.h" @@ -82,7 +83,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start, @@ -98,10 +99,9 @@ __igt_reset_stolen(struct intel_gt *gt, ((page + 1) << PAGE_SHIFT) - 1)) memset_io(s, STACK_MAGIC, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = tmp; - crc[page] = crc32_le(0, in, 
PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, PAGE_SIZE); + crc[page] = crc32_le(0, tmp, PAGE_SIZE); io_mapping_unmap(s); } @@ -122,7 +122,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; u32 x; ggtt->vm.insert_page(&ggtt->vm, dma, @@ -134,10 +134,9 @@ __igt_reset_stolen(struct intel_gt *gt, ggtt->error_capture.start, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = tmp; - x = crc32_le(0, in, PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, PAGE_SIZE); + x = crc32_le(0, tmp, PAGE_SIZE); if (x != crc[page] && !__drm_mm_interval_first(>->i915->mm.stolen, @@ -146,7 +145,7 @@ __igt_reset_stolen(struct intel_gt *gt, pr_debug("unused stolen page %pa modified by GPU reset\n", &page); if (count++ == 0) - igt_hexdump(in, PAGE_SIZE); + igt_hexdump(tmp, PAGE_SIZE); max = page; } -- 2.25.1
[Intel-gfx] [PATCH 7/7] drm/i915: Avoid dereferencing io mapped memory
Pointer passed to zlib_deflate() for compression could point to io mapped memory and might end up in direct derefencing. io mapped memory is copied to a temporary buffer, which is then shared to zlib_deflate(), only for the case where platform supports fast copy using non-temporal instructions. If the platform lacks support, then io mapped memory is directly used. Direct dereferencing of io memory makes driver not portable outside x86 and should be avoided. With this patch, io memory is always copied to a temporary buffer irrespective of platform support for fast copy. The i915_has_memcpy_from_wc() check is removed. And drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for 2 reasons. - i915_memcpy_from_wc() will be deprecated. - drm_memcpy_from_wc_vaddr() will not fail if the fast copy is not supported instead continues copying using memcpy_fromio as fallback. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/i915_gpu_error.c | 45 +++ 1 file changed, 25 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 1d042551619e..0c5917a7a545 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -258,9 +258,12 @@ static bool compress_init(struct i915_vma_compress *c) return false; } - c->tmp = NULL; - if (i915_has_memcpy_from_wc()) - c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + if (!c->tmp) { + kfree(zstream->workspace); + pool_fini(&c->pool); + return false; + } return true; } @@ -292,15 +295,17 @@ static void *compress_next_page(struct i915_vma_compress *c, } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { struct z_stream_s *zstream = &c->zstream; - zstream->next_in = src; - if (wc && c->tmp && i915_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) + if 
(src->is_iomem) { + drm_memcpy_from_wc_vaddr(c->tmp, src, PAGE_SIZE); zstream->next_in = c->tmp; + } else { + zstream->next_in = src->vaddr; + } zstream->avail_in = PAGE_SIZE; do { @@ -389,9 +394,8 @@ static bool compress_start(struct i915_vma_compress *c) } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { void *ptr; @@ -399,8 +403,7 @@ static int compress_page(struct i915_vma_compress *c, if (!ptr) return -ENOMEM; - if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE))) - memcpy(ptr, src, PAGE_SIZE); + drm_memcpy_from_wc_vaddr(ptr, src, PAGE_SIZE); list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list); cond_resched(); @@ -1054,6 +1057,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, if (drm_mm_node_allocated(&ggtt->error_capture)) { void __iomem *s; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { mutex_lock(&ggtt->error_mutex); @@ -1062,9 +1066,8 @@ i915_vma_coredump_create(const struct intel_gt *gt, mb(); s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE); - ret = compress_page(compress, - (void __force *)s, dst, - true); + iosys_map_set_vaddr_iomem(&src, s); + ret = compress_page(compress, &src, dst); io_mapping_unmap(s); mb(); @@ -1076,6 +1079,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, } else if (vma_res->bi.lmem) { struct intel_memory_region *mem = vma_res->mr; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { void __iomem *s; @@ -1083,15 +1087,15 @@ i915_vma_coredump_create(const struct intel_gt *gt, s = io_mapping_map_wc(&mem->iomap, dma - mem->region.start,
Re: [Intel-gfx] [PATCH 0/7] drm/i915: Use the memcpy_from_wc function from drm
On 23.02.2022 10:02, Das, Nirmoy wrote: > > On 22/02/2022 15:51, Balasubramani Vivekanandan wrote: > > drm_memcpy_from_wc() performs fast copy from WC memory type using > > non-temporal instructions. Now there are two similar implementations of > > this function. One exists in drm_cache.c as drm_memcpy_from_wc() and > > another implementation in i915/i915_memcpy.c as i915_memcpy_from_wc(). > > drm_memcpy_from_wc() was the recent addition through the series > > https://patchwork.freedesktop.org/patch/436276/?series=90681&rev=6 > > > > The goal of this patch series is to change all users of > > i915_memcpy_from_wc() to drm_memcpy_from_wc() and a have common > > implementation in drm and eventually remove the copy from i915. > > > > Another benefit of using memcpy functions from drm is that > > drm_memcpy_from_wc() is available for non-x86 architectures. > > i915_memcpy_from_wc() is implemented only for x86 and prevents building > > i915 for ARM64. > > drm_memcpy_from_wc() does fast copy using non-temporal instructions for > > x86 and for other architectures makes use of memcpy() family of > > functions as fallback. > > > > Another major difference is unlike i915_memcpy_from_wc(), > > drm_memcpy_from_wc() will not fail if the passed address argument is not > > alignment to be used with non-temporal load instructions or if the > > platform lacks support for those instructions (non-temporal load > > instructions are provided through SSE4.1 instruction set extension). > > Instead drm_memcpy_from_wc() continues with fallback functions to > > complete the copy. > > This relieves the caller from checking the return value of > > i915_memcpy_from_wc() and explicitly using a fallback. > > > > Follow up series will be created to remove the memcpy_from_wc functions > > from i915 once the dependency is completely removed. 
>
> Overall the series looks good to me but I think you can add another patch to remove
> i915_memcpy_from_wc() as I don't see any other usages left after this series, may be I
> am missing something?

I have changed all users of i915_memcpy_from_wc() to the drm function. But there is another function, i915_unaligned_memcpy_from_wc(), in i915_memcpy.c which blocks eliminating the i915_memcpy.c file from i915 completely. This function accepts an unaligned source address, does the fast copy only for the aligned region of memory, and copies the remaining part using the memcpy function.

Either I can move i915_unaligned_memcpy_from_wc() to drm as well, but since it is more of a platform-specific handling, I am concerned whether it makes sense to keep it in drm. Or else I have to retain i915_unaligned_memcpy_from_wc() inside i915 and refactor the function to use drm_memcpy_from_wc() instead of __memcpy_ntdqu().

But before making more changes, I wanted feedback on the current change, so I decided to go ahead and create the series for review.
Regards,
Bala

>
> Regards,
> Nirmoy
>
> >
> > Cc: Jani Nikula
> > Cc: Lucas De Marchi
> > Cc: David Airlie
> > Cc: Daniel Vetter
> > Cc: Chris Wilson
> > Cc: Thomas Hellström
> > Cc: Joonas Lahtinen
> > Cc: Rodrigo Vivi
> > Cc: Tvrtko Ursulin
> >
> > Balasubramani Vivekanandan (7):
> >   drm: Relax alignment constraint for destination address
> >   drm: Add drm_memcpy_from_wc() variant which accepts destination
> >     address
> >   drm/i915: use the memcpy_from_wc call from the drm
> >   drm/i915/guc: use the memcpy_from_wc call from the drm
> >   drm/i915/selftests: use the memcpy_from_wc call from the drm
> >   drm/i915/gt: Avoid direct dereferencing of io memory
> >   drm/i915: Avoid dereferencing io mapped memory
> >
> >  drivers/gpu/drm/drm_cache.c                   | 98 +--
> >  drivers/gpu/drm/i915/gem/i915_gem_object.c    |  8 +-
> >  drivers/gpu/drm/i915/gt/selftest_reset.c      | 21 ++--
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_log.c    | 11 ++-
> >  drivers/gpu/drm/i915/i915_gpu_error.c         | 45 +
> >  .../drm/i915/selftests/intel_memory_region.c  |  8 +-
> >  include/drm/drm_cache.h                       |  3 +
> >  7 files changed, 148 insertions(+), 46 deletions(-)
> >
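The behaviour described in the reply above (fast copy for the 16-byte-aligned region, ordinary memcpy for the rest) can be illustrated by how a source range would be partitioned. This is a rough userspace sketch of that description only; it is not taken from i915_memcpy.c, and struct wc_split and wc_split_range() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical partition of a copy whose source may be unaligned. */
struct wc_split {
	size_t head; /* bytes before the first 16-byte boundary (memcpy) */
	size_t body; /* multiple of 16, eligible for the fast copy */
	size_t tail; /* leftover bytes after the last full 16-byte chunk */
};

static struct wc_split wc_split_range(uintptr_t src, size_t len)
{
	struct wc_split s = { 0, 0, 0 };
	size_t misalign = src & 15;

	if (misalign) {
		/* Distance to the next 16-byte boundary, capped at len. */
		s.head = 16 - misalign;
		if (s.head > len)
			s.head = len;
	}
	/* Largest multiple of 16 that fits in what is left. */
	s.body = (len - s.head) & ~(size_t)15;
	s.tail = len - s.head - s.body;

	assert(s.head + s.body + s.tail == len);
	return s;
}
```

Only the body part satisfies the source alignment and length constraints of the non-temporal path; head and tail would go through a plain copy in this model.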
[Intel-gfx] [PATCH v2 5/7] drm/i915/selftests: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. v2: check if the source and destination memory address is from local memory or system memory and initialize the iosys_map accordingly (Lucas) Cc: Lucas De Marchi Cc: Matthew Auld Cc: Thomas Hellstr_m Signed-off-by: Balasubramani Vivekanandan --- .../drm/i915/selftests/intel_memory_region.c | 41 +-- 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/intel_memory_region.c b/drivers/gpu/drm/i915/selftests/intel_memory_region.c index ba32893e0873..d16ecb905f3b 100644 --- a/drivers/gpu/drm/i915/selftests/intel_memory_region.c +++ b/drivers/gpu/drm/i915/selftests/intel_memory_region.c @@ -7,6 +7,7 @@ #include #include +#include #include "../i915_selftest.h" @@ -1133,7 +1134,7 @@ static const char *repr_type(u32 type) static struct drm_i915_gem_object * create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type, - void **out_addr) + struct iosys_map *out_addr) { struct drm_i915_gem_object *obj; void *addr; @@ -1153,7 +1154,11 @@ create_region_for_mapping(struct intel_memory_region *mr, u64 size, u32 type, return addr; } - *out_addr = addr; + if (i915_gem_object_is_lmem(obj)) + iosys_map_set_vaddr_iomem(out_addr, (void __iomem *)addr); + else + iosys_map_set_vaddr(out_addr, addr); + return obj; } @@ -1164,24 +1169,33 @@ static int wrap_ktime_compare(const void *A, const void *B) return ktime_compare(*a, *b); } -static void igt_memcpy_long(void *dst, const void *src, size_t size) +static void igt_memcpy_long(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - unsigned long *tmp = dst; - const unsigned long *s = src; + unsigned long *tmp = dst->is_iomem ? + (unsigned long __force *)dst->vaddr_iomem : + dst->vaddr; + const unsigned long *s = src->is_iomem ? 
+ (unsigned long __force *)src->vaddr_iomem : + src->vaddr; size = size / sizeof(unsigned long); while (size--) *tmp++ = *s++; } -static inline void igt_memcpy(void *dst, const void *src, size_t size) +static inline void igt_memcpy(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - memcpy(dst, src, size); + memcpy(dst->is_iomem ? (void __force *)dst->vaddr_iomem : dst->vaddr, + src->is_iomem ? (void __force *)src->vaddr_iomem : src->vaddr, + size); } -static inline void igt_memcpy_from_wc(void *dst, const void *src, size_t size) +static inline void igt_memcpy_from_wc(struct iosys_map *dst, struct iosys_map *src, + size_t size) { - i915_memcpy_from_wc(dst, src, size); + drm_memcpy_from_wc(dst, src, size); } static int _perf_memcpy(struct intel_memory_region *src_mr, @@ -1191,7 +1205,8 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, struct drm_i915_private *i915 = src_mr->i915; const struct { const char *name; - void (*copy)(void *dst, const void *src, size_t size); + void (*copy)(struct iosys_map *dst, struct iosys_map *src, +size_t size); bool skip; } tests[] = { { @@ -1205,11 +1220,11 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, { "memcpy_from_wc", igt_memcpy_from_wc, - !i915_has_memcpy_from_wc(), + !drm_memcpy_fastcopy_supported(), }, }; struct drm_i915_gem_object *src, *dst; - void *src_addr, *dst_addr; + struct iosys_map src_addr, dst_addr; int ret = 0; int i; @@ -1237,7 +1252,7 @@ static int _perf_memcpy(struct intel_memory_region *src_mr, t0 = ktime_get(); - tests[i].copy(dst_addr, src_addr, size); + tests[i].copy(&dst_addr, &src_addr, size); t1 = ktime_get(); t[pass] = ktime_sub(t1, t0); -- 2.25.1
[Intel-gfx] [PATCH v2 4/7] drm/i915/guc: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. v2: Check if the log object allocated from local memory or system memory and according setup the iosys_map (Lucas) Cc: Lucas De Marchi Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/uc/intel_guc_log.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c index a24dc6441872..b9db765627ea 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_log.c @@ -3,6 +3,7 @@ * Copyright © 2014-2019 Intel Corporation */ +#include #include #include @@ -206,6 +207,7 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log) enum guc_log_buffer_type type; void *src_data, *dst_data; bool new_overflow; + struct iosys_map src_map; mutex_lock(&log->relay.lock); @@ -282,14 +284,21 @@ static void guc_read_update_log_buffer(struct intel_guc_log *log) } /* Just copy the newly written data */ + if (i915_gem_object_is_lmem(log->vma->obj)) + iosys_map_set_vaddr_iomem(&src_map, (void __iomem *)src_data); + else + iosys_map_set_vaddr(&src_map, src_data); + if (read_offset > write_offset) { - i915_memcpy_from_wc(dst_data, src_data, write_offset); + drm_memcpy_from_wc_vaddr(dst_data, &src_map, +write_offset); bytes_to_copy = buffer_size - read_offset; } else { bytes_to_copy = write_offset - read_offset; } - i915_memcpy_from_wc(dst_data + read_offset, - src_data + read_offset, bytes_to_copy); + iosys_map_incr(&src_map, read_offset); + drm_memcpy_from_wc_vaddr(dst_data + read_offset, &src_map, +bytes_to_copy); src_data += buffer_size; dst_data += buffer_size; -- 2.25.1
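The "copy the newly written data" step touched by this patch handles a circular log buffer in which the write offset may have wrapped behind the read offset. A minimal userspace model of that offset arithmetic, assuming plain memory on both sides and using memcpy() in place of drm_memcpy_from_wc_vaddr(), could look like the following. The function name ring_copy_new_data() and its return value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bytes written since the last read from @src to the same
 * offsets in @dst. Returns the number of bytes copied.
 */
static size_t ring_copy_new_data(unsigned char *dst, const unsigned char *src,
				 size_t buffer_size, size_t read_offset,
				 size_t write_offset)
{
	size_t bytes_to_copy;
	size_t copied = 0;

	assert(read_offset <= buffer_size && write_offset <= buffer_size);

	if (read_offset > write_offset) {
		/* Wrapped: first copy the part at the start of the buffer. */
		memcpy(dst, src, write_offset);
		copied += write_offset;
		bytes_to_copy = buffer_size - read_offset;
	} else {
		bytes_to_copy = write_offset - read_offset;
	}

	/* Then copy from the read offset onwards. */
	memcpy(dst + read_offset, src + read_offset, bytes_to_copy);
	return copied + bytes_to_copy;
}
```

In the patch itself the second copy is expressed by advancing the source map with iosys_map_incr() rather than by pointer arithmetic on the source, but the offsets involved are the same as in this model.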
[Intel-gfx] [PATCH v2 3/7] drm/i915: use the memcpy_from_wc call from the drm
memcpy_from_wc functions in i915_memcpy.c will be removed and replaced by the implementation in drm_cache.c. Updated to use the functions provided by drm_cache.c. Signed-off-by: Balasubramani Vivekanandan Reviewed-by: Lucas De Marchi --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 372bc220faeb..5de657c3190e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -438,6 +438,8 @@ i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset { void __iomem *src_map; void __iomem *src_ptr; + struct iosys_map src; + dma_addr_t dma = i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT); src_map = io_mapping_map_wc(&obj->mm.region->iomap, @@ -445,8 +447,8 @@ i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset PAGE_SIZE); src_ptr = src_map + offset_in_page(offset); - if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size)) - memcpy_fromio(dst, src_ptr, size); + iosys_map_set_vaddr_iomem(&src, src_ptr); + drm_memcpy_from_wc_vaddr(dst, &src, size); io_mapping_unmap(src_map); } -- 2.25.1
[Intel-gfx] [PATCH v2 6/7] drm/i915/gt: Avoid direct dereferencing of io memory
io mapped memory should not be directly dereferenced to ensure portability. io memory should be read/written/copied using helper functions. i915_memcpy_from_wc() function was used to copy the data from io memory to a temporary buffer and pointer to the temporary buffer was passed to CRC calculation function. But i915_memcpy_from_wc() only does a copy if the platform supports fast copy using non-temporal instructions. Otherwise the pointer to io memory was passed for CRC calculation. CRC function will directly dereference io memory and would not work properly on non-x86 platforms. To make it portable, it should be ensured always temporary buffer is used for CRC and not io memory. drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for 2 reasons. - i915_memcpy_from_wc() will be deprecated. - drm_memcpy_from_wc_vaddr() will not fail if the fast copy is not supported but uses memcpy_fromio as fallback for copying. Cc: Matthew Brost Cc: Michał Winiarski Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gt/selftest_reset.c | 21 ++--- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index 37c38bdd5f47..79d2bd7ef3b9 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -3,6 +3,7 @@ * Copyright © 2018 Intel Corporation */ +#include #include #include "gem/i915_gem_stolen.h" @@ -82,7 +83,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; ggtt->vm.insert_page(&ggtt->vm, dma, ggtt->error_capture.start, @@ -98,10 +99,9 @@ __igt_reset_stolen(struct intel_gt *gt, ((page + 1) << PAGE_SHIFT) - 1)) memset_io(s, STACK_MAGIC, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = tmp; - crc[page] = 
crc32_le(0, in, PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, PAGE_SIZE); + crc[page] = crc32_le(0, tmp, PAGE_SIZE); io_mapping_unmap(s); } @@ -122,7 +122,7 @@ __igt_reset_stolen(struct intel_gt *gt, for (page = 0; page < num_pages; page++) { dma_addr_t dma = (dma_addr_t)dsm->start + (page << PAGE_SHIFT); void __iomem *s; - void *in; + struct iosys_map src_map; u32 x; ggtt->vm.insert_page(&ggtt->vm, dma, @@ -134,10 +134,9 @@ __igt_reset_stolen(struct intel_gt *gt, ggtt->error_capture.start, PAGE_SIZE); - in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) - in = tmp; - x = crc32_le(0, in, PAGE_SIZE); + iosys_map_set_vaddr_iomem(&src_map, s); + drm_memcpy_from_wc_vaddr(tmp, &src_map, PAGE_SIZE); + x = crc32_le(0, tmp, PAGE_SIZE); if (x != crc[page] && !__drm_mm_interval_first(&gt->i915->mm.stolen, @@ -146,7 +145,7 @@ __igt_reset_stolen(struct intel_gt *gt, pr_debug("unused stolen page %pa modified by GPU reset\n", &page); if (count++ == 0) - igt_hexdump(in, PAGE_SIZE); + igt_hexdump(tmp, PAGE_SIZE); max = page; } -- 2.25.1
[Intel-gfx] [PATCH v2 7/7] drm/i915: Avoid dereferencing io mapped memory
The pointer passed to zlib_deflate() for compression could point to io mapped memory and might end up being dereferenced directly. io mapped memory is copied to a temporary buffer, which is then passed to zlib_deflate(), only when the platform supports fast copy using non-temporal instructions. If the platform lacks support, the io mapped memory is used directly.

Direct dereferencing of io memory makes the driver non-portable outside x86 and should be avoided. With this patch, io memory is always copied to a temporary buffer irrespective of platform support for fast copy. The i915_has_memcpy_from_wc() check is removed.

drm_memcpy_from_wc_vaddr() is now used for copying instead of i915_memcpy_from_wc() for two reasons:
- i915_memcpy_from_wc() will be deprecated.
- drm_memcpy_from_wc_vaddr() does not fail if fast copy is not supported; it continues copying using memcpy_fromio as the fallback.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 45 +++
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 4967e79806f8..1ca5072b85db 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -259,9 +259,12 @@ static bool compress_init(struct i915_vma_compress *c) return false; } - c->tmp = NULL; - if (i915_has_memcpy_from_wc()) - c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + c->tmp = pool_alloc(&c->pool, ALLOW_FAIL); + if (!c->tmp) { + kfree(zstream->workspace); + pool_fini(&c->pool); + return false; + } return true; } @@ -293,15 +296,17 @@ static void *compress_next_page(struct i915_vma_compress *c, } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { struct z_stream_s *zstream = &c->zstream; - zstream->next_in = src; - if (wc && c->tmp && i915_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) + if 
(src->is_iomem) { + drm_memcpy_from_wc_vaddr(c->tmp, src, PAGE_SIZE); zstream->next_in = c->tmp; + } else { + zstream->next_in = src->vaddr; + } zstream->avail_in = PAGE_SIZE; do { @@ -390,9 +395,8 @@ static bool compress_start(struct i915_vma_compress *c) } static int compress_page(struct i915_vma_compress *c, -void *src, -struct i915_vma_coredump *dst, -bool wc) +struct iosys_map *src, +struct i915_vma_coredump *dst) { void *ptr; @@ -400,8 +404,7 @@ static int compress_page(struct i915_vma_compress *c, if (!ptr) return -ENOMEM; - if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE))) - memcpy(ptr, src, PAGE_SIZE); + drm_memcpy_from_wc_vaddr(ptr, src, PAGE_SIZE); list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list); cond_resched(); @@ -1055,6 +1058,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, if (drm_mm_node_allocated(&ggtt->error_capture)) { void __iomem *s; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { mutex_lock(&ggtt->error_mutex); @@ -1063,9 +1067,8 @@ i915_vma_coredump_create(const struct intel_gt *gt, mb(); s = io_mapping_map_wc(&ggtt->iomap, slot, PAGE_SIZE); - ret = compress_page(compress, - (void __force *)s, dst, - true); + iosys_map_set_vaddr_iomem(&src, s); + ret = compress_page(compress, &src, dst); io_mapping_unmap(s); mb(); @@ -1077,6 +1080,7 @@ i915_vma_coredump_create(const struct intel_gt *gt, } else if (vma_res->bi.lmem) { struct intel_memory_region *mem = vma_res->mr; dma_addr_t dma; + struct iosys_map src; for_each_sgt_daddr(dma, iter, vma_res->bi.pages) { void __iomem *s; @@ -1084,15 +1088,15 @@ i915_vma_coredump_create(const struct intel_gt *gt, s = io_mapping_map_wc(&mem->iomap, dma - mem->region.start,
[Intel-gfx] [PATCH v2 1/7] drm: Relax alignment constraint for destination address
There is no need for the destination address to be aligned to 16 byte boundary to be able to use the non-temporal instructions while copying. Non-temporal instructions are used only for loading from the source address which has alignment constraints. We only need to take care of using the right instructions, based on whether destination address is aligned or not, while storing the data to the destination address. __memcpy_ntdqu is copied from i915/i915_memcpy.c Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Chris Wilson Signed-off-by: Balasubramani Vivekanandan Reviewed-by: Lucas De Marchi --- drivers/gpu/drm/drm_cache.c | 44 - 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index c3e6e615bf09..a21c1350eb09 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -278,18 +278,50 @@ static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len) kernel_fpu_end(); } +static void __memcpy_ntdqu(void *dst, const void *src, unsigned long len) +{ + kernel_fpu_begin(); + + while (len >= 4) { + asm("movntdqa (%0), %%xmm0\n" + "movntdqa 16(%0), %%xmm1\n" + "movntdqa 32(%0), %%xmm2\n" + "movntdqa 48(%0), %%xmm3\n" + "movups %%xmm0, (%1)\n" + "movups %%xmm1, 16(%1)\n" + "movups %%xmm2, 32(%1)\n" + "movups %%xmm3, 48(%1)\n" + :: "r" (src), "r" (dst) : "memory"); + src += 64; + dst += 64; + len -= 4; + } + while (len--) { + asm("movntdqa (%0), %%xmm0\n" + "movups %%xmm0, (%1)\n" + :: "r" (src), "r" (dst) : "memory"); + src += 16; + dst += 16; + } + + kernel_fpu_end(); +} + /* * __drm_memcpy_from_wc copies @len bytes from @src to @dst using - * non-temporal instructions where available. Note that all arguments - * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple - * of 16. + * non-temporal instructions where available. Note that @src must be aligned to + * 16 bytes and @len must be a multiple of 16. 
*/ static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len) { - if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15)) + if (unlikely(((unsigned long)src | len) & 15)) { memcpy(dst, src, len); - else if (likely(len)) - __memcpy_ntdqa(dst, src, len >> 4); + } else if (likely(len)) { + if (IS_ALIGNED((unsigned long)dst, 16)) + __memcpy_ntdqa(dst, src, len >> 4); + else + __memcpy_ntdqu(dst, src, len >> 4); + } } /** -- 2.25.1
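The dispatch in the patched __drm_memcpy_from_wc() can be sketched in portable userspace C. This is only an illustration under stated assumptions: the sketch_* names and the enum are hypothetical, and every path copies with plain memcpy() because the real non-temporal loads need SSE4.1 and a kernel FPU context. What it shows is the order of the checks: source address and length decide whether the fast path is usable at all, and the destination alignment only selects the store flavour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the copy paths; these names are not in the patch. */
enum sketch_copy_path {
	SKETCH_COPY_NOTHING,            /* len == 0, nothing to do */
	SKETCH_COPY_PLAIN_MEMCPY,       /* src or len is not a multiple of 16 */
	SKETCH_COPY_NT_ALIGNED_STORE,   /* src, len and dst are all 16-byte aligned */
	SKETCH_COPY_NT_UNALIGNED_STORE, /* src and len aligned, dst is not */
};

/* Mirrors the order of checks in the patched __drm_memcpy_from_wc(). */
static enum sketch_copy_path sketch_pick_path(const void *dst, const void *src,
					      size_t len)
{
	if (((uintptr_t)src | len) & 15)
		return SKETCH_COPY_PLAIN_MEMCPY;
	if (!len)
		return SKETCH_COPY_NOTHING;
	if (((uintptr_t)dst & 15) == 0)
		return SKETCH_COPY_NT_ALIGNED_STORE;
	return SKETCH_COPY_NT_UNALIGNED_STORE;
}

/*
 * Portable stand-in: all paths copy with memcpy() here; only the returned
 * label tells which kernel helper would have been chosen.
 */
static enum sketch_copy_path sketch_memcpy_from_wc(void *dst, const void *src,
						   size_t len)
{
	enum sketch_copy_path path;

	assert(dst && src);
	path = sketch_pick_path(dst, src, len);
	if (path != SKETCH_COPY_NOTHING)
		memcpy(dst, src, len);
	return path;
}
```

With a 16-byte aligned source and a length that is a multiple of 16, shifting only the destination by one byte moves the result from the aligned-store label to the unaligned-store label, which is the case the patch newly admits to the fast path.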
[Intel-gfx] [PATCH v2 0/7] drm/i915: Use the memcpy_from_wc function from drm
drm_memcpy_from_wc() performs fast copy from WC memory type using non-temporal instructions. There are now two similar implementations of this function: one in drm_cache.c as drm_memcpy_from_wc() and another in i915/i915_memcpy.c as i915_memcpy_from_wc(). drm_memcpy_from_wc() was added recently through the series
https://patchwork.freedesktop.org/patch/436276/?series=90681&rev=6

The goal of this patch series is to change all users of i915_memcpy_from_wc() to drm_memcpy_from_wc(), have a common implementation in drm, and eventually remove the copy from i915.

Another benefit of using the memcpy functions from drm is that drm_memcpy_from_wc() is available for non-x86 architectures. i915_memcpy_from_wc() is implemented only for x86 and prevents building i915 for ARM64. drm_memcpy_from_wc() does fast copy using non-temporal instructions on x86, and on other architectures uses the memcpy() family of functions as fallback.

Another major difference is that, unlike i915_memcpy_from_wc(), drm_memcpy_from_wc() will not fail if the passed address argument is not suitably aligned for non-temporal load instructions, or if the platform lacks support for those instructions (non-temporal load instructions are provided through the SSE4.1 instruction set extension). Instead drm_memcpy_from_wc() continues with fallback functions to complete the copy. This relieves the caller from checking the return value of i915_memcpy_from_wc() and explicitly using a fallback.

A follow-up series will be created to remove the memcpy_from_wc functions from i915 once the dependency is completely removed.

v2: Fixed missing check to find if the address is from system memory or io memory and use the right initialization function to construct the iosys_map structure (Review feedback from Lucas)

Cc: Jani Nikula
Cc: Lucas De Marchi
Cc: David Airlie
Cc: Daniel Vetter
Cc: Chris Wilson
Cc: Thomas Hellström
Cc: Joonas Lahtinen
Cc: Rodrigo Vivi
Cc: Tvrtko Ursulin
Cc: Nirmoy Das

Balasubramani Vivekanandan (7):
  drm: Relax alignment constraint for destination address
  drm: Add drm_memcpy_from_wc() variant which accepts destination address
  drm/i915: use the memcpy_from_wc call from the drm
  drm/i915/guc: use the memcpy_from_wc call from the drm
  drm/i915/selftests: use the memcpy_from_wc call from the drm
  drm/i915/gt: Avoid direct dereferencing of io memory
  drm/i915: Avoid dereferencing io mapped memory

 drivers/gpu/drm/drm_cache.c                  | 98 +--
 drivers/gpu/drm/i915/gem/i915_gem_object.c   |  6 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c     | 21 ++--
 drivers/gpu/drm/i915/gt/uc/intel_guc_log.c   | 15 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c        | 45 +
 .../drm/i915/selftests/intel_memory_region.c | 41 +---
 include/drm/drm_cache.h                      |  3 +
 7 files changed, 174 insertions(+), 55 deletions(-)

-- 2.25.1
[Intel-gfx] [PATCH v2 2/7] drm: Add drm_memcpy_from_wc() variant which accepts destination address
Fast copy using non-temporal instructions for x86 currently exists at two locations: one implemented in the i915 driver at i915/i915_memcpy.c and another copy at drm_cache.c. The plan is to remove the duplicate implementation in the i915 driver and use the functions from drm_cache.c.

A variant of drm_memcpy_from_wc() is added in drm_cache.c which accepts a plain address instead of an iosys_map for the destination. It is a very common scenario in i915 to copy from a WC memory type, which may be io memory or system memory, to a destination address pointing to system memory. To avoid the overhead of creating an iosys_map for the destination, the new variant accepts the address directly.

Also, a new function is exported in drm_cache.c to find out whether fast copy is supported by the platform. It is required for i915.

Cc: Maarten Lankhorst
Cc: Maxime Ripard
Cc: Thomas Zimmermann
Cc: David Airlie
Cc: Daniel Vetter
Cc: Thomas Hellström
Cc: Lucas De Marchi
Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/drm_cache.c | 54 +
 include/drm/drm_cache.h     |  3 +++
 2 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index a21c1350eb09..97959eecc300 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -358,6 +358,54 @@ void drm_memcpy_from_wc(struct iosys_map *dst, } EXPORT_SYMBOL(drm_memcpy_from_wc); +/** + * drm_memcpy_from_wc_vaddr - Perform the fastest available memcpy from a source + * that may be WC to a destination in system memory. + * @dst: The destination pointer + * @src: The source pointer + * @len: The size of the area to transfer in bytes + * + * Same as drm_memcpy_from_wc except destination is accepted as system memory + * address. Useful in situations where passing destination address as iosys_map + * is simply an overhead and can be avoided. 
+ */ +void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src, + unsigned long len) +{ + if (WARN_ON(in_interrupt())) { + iosys_map_memcpy_from(dst, src, 0, len); + return; + } + + if (static_branch_likely(&has_movntdqa)) { + __drm_memcpy_from_wc(dst, +src->is_iomem ? +(void const __force *)src->vaddr_iomem : +src->vaddr, +len); + return; + } + + iosys_map_memcpy_from(dst, src, 0, len); +} +EXPORT_SYMBOL(drm_memcpy_from_wc_vaddr); + +/* + * drm_memcpy_fastcopy_supported - Returns if fast copy using non-temporal + * instructions is supported + * + * Returns true if platform has support for fast copying from wc memory type + * using non-temporal instructions. Else false. + */ +bool drm_memcpy_fastcopy_supported(void) +{ + if (static_branch_likely(&has_movntdqa)) + return true; + + return false; +} +EXPORT_SYMBOL(drm_memcpy_fastcopy_supported); + /* * drm_memcpy_init_early - One time initialization of the WC memcpy code */ @@ -382,6 +430,12 @@ void drm_memcpy_from_wc(struct iosys_map *dst, } EXPORT_SYMBOL(drm_memcpy_from_wc); +bool drm_memcpy_fastcopy_supported(void) +{ + return false; +} +EXPORT_SYMBOL(drm_memcpy_fastcopy_supported); + void drm_memcpy_init_early(void) { } diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h index 22deb216b59c..8f48e4dcd7dc 100644 --- a/include/drm/drm_cache.h +++ b/include/drm/drm_cache.h @@ -77,4 +77,7 @@ void drm_memcpy_init_early(void); void drm_memcpy_from_wc(struct iosys_map *dst, const struct iosys_map *src, unsigned long len); +bool drm_memcpy_fastcopy_supported(void); +void drm_memcpy_from_wc_vaddr(void *dst, const struct iosys_map *src, + unsigned long len); #endif -- 2.25.1
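The shape of drm_memcpy_from_wc_vaddr(), a plain destination pointer plus a tagged source mapping, can be illustrated with a small userspace model. Everything named sketch_* below is hypothetical and only stands in for the kernel's struct iosys_map, iosys_map_set_vaddr(), iosys_map_set_vaddr_iomem() and memcpy_fromio(); the volatile byte loop is an assumption chosen to make the "io" path visibly different, not how the kernel implements it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct iosys_map. */
struct sketch_map {
	bool is_iomem;
	union {
		const volatile void *vaddr_iomem; /* void __iomem * in the kernel */
		const void *vaddr;
	};
};

static void sketch_map_set_vaddr(struct sketch_map *m, const void *p)
{
	m->is_iomem = false;
	m->vaddr = p;
}

static void sketch_map_set_vaddr_iomem(struct sketch_map *m,
				       const volatile void *p)
{
	m->is_iomem = true;
	m->vaddr_iomem = p;
}

/* Stand-in for memcpy_fromio(): byte-wise reads through a volatile pointer. */
static void sketch_memcpy_fromio(void *dst, const volatile void *src,
				 size_t len)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	while (len--)
		*d++ = *s++;
}

/* The destination is always system memory; only the source is tagged. */
static void sketch_copy_from_map(void *dst, const struct sketch_map *src,
				 size_t len)
{
	assert(dst && src);
	if (src->is_iomem)
		sketch_memcpy_fromio(dst, src->vaddr_iomem, len);
	else
		memcpy(dst, src->vaddr, len);
}
```

The caller tags the source once, at the point where it knows how the memory was mapped, and every later copy picks the right accessor without the caller casting away the io annotation.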
[Intel-gfx] [PATCH] drm/i915/guc: Use iosys_map interface to update lrc_desc
This patch is a continuation of the effort to move all pointers in i915 which at any point may be pointing to device memory or system memory to the iosys_map interface. More details about the need for this change are explained in the patch series which initiated this task:
https://patchwork.freedesktop.org/series/99711/

This patch converts all accesses to the lrc_desc to go through the iosys_map interfaces.

Cc: Lucas De Marchi
Cc: John Harrison
Cc: Matthew Brost
Cc: Umesh Nerlige Ramappa
Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/gt/uc/intel_guc.h            |  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c     | 68 ---
 2 files changed, 43 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index e439e6c1ac8b..cbbc24dbaf0f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -168,7 +168,7 @@ struct intel_guc { /** @lrc_desc_pool: object allocated to hold the GuC LRC descriptor pool */ struct i915_vma *lrc_desc_pool; /** @lrc_desc_pool_vaddr: contents of the GuC LRC descriptor pool */ - void *lrc_desc_pool_vaddr; + struct iosys_map lrc_desc_pool_vaddr; /** * @context_lookup: used to resolve intel_context from guc_id, if a diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 9ec03234d2c2..84b17ded886a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -467,13 +467,14 @@ static u32 *get_wq_pointer(struct guc_process_desc *desc, return &__get_parent_scratch(ce)->wq[ce->parallel.guc.wqi_tail / sizeof(u32)]; } -static struct guc_lrc_desc *__get_lrc_desc(struct intel_guc *guc, u32 index) +static void __write_lrc_desc(struct intel_guc *guc, u32 index, +struct guc_lrc_desc *desc) { - struct guc_lrc_desc *base = guc->lrc_desc_pool_vaddr; + unsigned int size = sizeof(struct guc_lrc_desc); GEM_BUG_ON(index >= GUC_MAX_CONTEXT_ID); - 
return &base[index]; + iosys_map_memcpy_to(&guc->lrc_desc_pool_vaddr, index * size, desc, size); } static inline struct intel_context *__get_context(struct intel_guc *guc, u32 id) @@ -489,20 +490,28 @@ static int guc_lrc_desc_pool_create(struct intel_guc *guc) { u32 size; int ret; + void *addr; size = PAGE_ALIGN(sizeof(struct guc_lrc_desc) * GUC_MAX_CONTEXT_ID); ret = intel_guc_allocate_and_map_vma(guc, size, &guc->lrc_desc_pool, -(void **)&guc->lrc_desc_pool_vaddr); +&addr); + if (ret) return ret; + if (i915_gem_object_is_lmem(guc->lrc_desc_pool->obj)) + iosys_map_set_vaddr_iomem(&guc->lrc_desc_pool_vaddr, + (void __iomem *)addr); + else + iosys_map_set_vaddr(&guc->lrc_desc_pool_vaddr, addr); + return 0; } static void guc_lrc_desc_pool_destroy(struct intel_guc *guc) { - guc->lrc_desc_pool_vaddr = NULL; + iosys_map_clear(&guc->lrc_desc_pool_vaddr); i915_vma_unpin_and_release(&guc->lrc_desc_pool, I915_VMA_RELEASE_MAP); } @@ -513,9 +522,11 @@ static inline bool guc_submission_initialized(struct intel_guc *guc) static inline void _reset_lrc_desc(struct intel_guc *guc, u32 id) { - struct guc_lrc_desc *desc = __get_lrc_desc(guc, id); + unsigned int size = sizeof(struct guc_lrc_desc); - memset(desc, 0, sizeof(*desc)); + GEM_BUG_ON(id >= GUC_MAX_CONTEXT_ID); + + iosys_map_memset(&guc->lrc_desc_pool_vaddr, id * size, 0, size); } static inline bool ctx_id_mapped(struct intel_guc *guc, u32 id) @@ -2233,7 +2244,7 @@ static void prepare_context_registration_info(struct intel_context *ce) struct intel_engine_cs *engine = ce->engine; struct intel_guc *guc = &engine->gt->uc.guc; u32 ctx_id = ce->guc_id.id; - struct guc_lrc_desc *desc; + struct guc_lrc_desc desc; struct intel_context *child; GEM_BUG_ON(!engine->mask); @@ -2245,13 +2256,13 @@ static void prepare_context_registration_info(struct intel_context *ce) GEM_BUG_ON(i915_gem_object_is_lmem(guc->ct.vma->obj) != i915_gem_object_is_lmem(ce->ring->vma->obj)); - desc = __get_lrc_desc(guc, ctx_id); - desc->engine_class = 
engine_class_to_guc_class(engine->class); - desc->engine_submit_mask = engine->logical_mask; - desc->hw_context_desc = ce->lrc.lrca; - desc->priority = ce->guc_state.prio; - desc->context_flags = CONTEXT_REGISTRATI
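The access pattern this patch moves to (build the descriptor on the stack, zero it, fill the fields, then copy the whole slot into the pool at index * size) can be sketched in userspace C. The struct layout, the pool size and all sketch_* names are hypothetical; memcpy() and memset() on a byte array stand in for iosys_map_memcpy_to() and iosys_map_memset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CONTEXT_ID 4 /* hypothetical; stands in for GUC_MAX_CONTEXT_ID */

/* Hypothetical, much reduced descriptor; not the real struct guc_lrc_desc. */
struct sketch_desc {
	uint32_t engine_class;
	uint32_t priority;
	uint32_t context_flags;
};

/* Stand-in for iosys_map_memcpy_to(): copy one whole slot at an offset. */
static void sketch_pool_write(unsigned char *pool, size_t index,
			      const struct sketch_desc *desc)
{
	assert(index < SKETCH_MAX_CONTEXT_ID);
	memcpy(pool + index * sizeof(*desc), desc, sizeof(*desc));
}

/* Stand-in for iosys_map_memset(): clear one slot at an offset. */
static void sketch_pool_reset(unsigned char *pool, size_t index)
{
	assert(index < SKETCH_MAX_CONTEXT_ID);
	memset(pool + index * sizeof(struct sketch_desc), 0,
	       sizeof(struct sketch_desc));
}

/* Build on the stack from a zeroed struct, then publish the full slot. */
static void sketch_register(unsigned char *pool, size_t index,
			    uint32_t engine_class, uint32_t priority)
{
	struct sketch_desc desc;

	memset(&desc, 0, sizeof(desc));
	desc.engine_class = engine_class;
	desc.priority = priority;
	sketch_pool_write(pool, index, &desc);
}
```

Because the slot is written as a whole from a zeroed struct, any field the caller does not set ends up as zero in the pool, whatever the slot held before. That is the behaviour change the review in this thread asks about, so the sketch makes it explicit rather than arguing that it is safe.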
[Intel-gfx] [PATCH] drm/i915: Add fallback inside memcpy_from_wc functions
The memcpy_from_wc functions can fail if SSE4.1 is not supported or the supplied addresses are not 16-byte aligned. It was then up to the caller to use memcpy as a fallback.
Now the fallback to memcpy is implemented inside the memcpy_from_wc functions, relieving the user from checking the return value of i915_memcpy_from_wc and doing the fallback.

When copying from an io memory address, memcpy_fromio should be used as the fallback. So a new function is added to the family of memcpy_from_wc functions, which should be used while copying from io memory.

This change is also implemented with the intention to prepare for porting the memcpy_from_wc code to ARM64. Since SSE4.1 is not valid for ARM, accelerated reads will not be supported and the driver should always rely on the fallback.
So there would be a few more places in the code where a fallback should be introduced. For example, GuC log relay is currently not using a fallback, since a GPU supporting GuC submission will mostly have an SSE4.1-enabled CPU. This no longer holds with discrete GPUs and with enabling support for ARM64.
With the fallback moved inside the memcpy_from_wc functions, call sites look neater and the fallback can be implemented in a uniform way.
Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 3 +- drivers/gpu/drm/i915/gt/selftest_reset.c | 8 ++- drivers/gpu/drm/i915/i915_gpu_error.c | 9 ++- drivers/gpu/drm/i915/i915_memcpy.c | 78 -- drivers/gpu/drm/i915/i915_memcpy.h | 18 ++--- 5 files changed, 77 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index e03e362d320b..b139a88fce70 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -452,8 +452,7 @@ i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset PAGE_SIZE); src_ptr = src_map + offset_in_page(offset); - if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size)) - memcpy_fromio(dst, src_ptr, size); + i915_io_memcpy_from_wc(dst, src_ptr, size); io_mapping_unmap(src_map); } diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index 37c38bdd5f47..64b8521a8b28 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -99,8 +99,10 @@ __igt_reset_stolen(struct intel_gt *gt, memset_io(s, STACK_MAGIC, PAGE_SIZE); in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) + if (i915_can_memcpy_from_wc(tmp, in, PAGE_SIZE)) { + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); in = tmp; + } crc[page] = crc32_le(0, in, PAGE_SIZE); io_mapping_unmap(s); @@ -135,8 +137,10 @@ __igt_reset_stolen(struct intel_gt *gt, PAGE_SIZE); in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) + if (i915_can_memcpy_from_wc(tmp, in, PAGE_SIZE)) { + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); in = tmp; + } x = crc32_le(0, in, PAGE_SIZE); if (x != crc[page] && diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index aee42eae4729..90db5de86c25 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ 
b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -296,8 +296,10 @@ static int compress_page(struct i915_vma_compress *c, struct z_stream_s *zstream = &c->zstream; zstream->next_in = src; - if (wc && c->tmp && i915_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) + if (wc && c->tmp && i915_can_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) { + i915_io_memcpy_from_wc(c->tmp, src, PAGE_SIZE); zstream->next_in = c->tmp; + } zstream->avail_in = PAGE_SIZE; do { @@ -396,8 +398,11 @@ static int compress_page(struct i915_vma_compress *c, if (!ptr) return -ENOMEM; - if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE))) + if (wc) + i915_io_memcpy_from_wc(ptr, src, PAGE_SIZE); + else memcpy(ptr, src, PAGE_SIZE); + list_add_tail(&virt_to_page(ptr)->lru, &dst->page_list); cond_resched(); diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c index 1b021a4902de..b1f8abf35452 100644 --- a/drivers/gpu/drm/i915/i915_memcpy.c +++ b/drivers/gpu/drm/i915/i915_memcpy.c @@ -24,15 +24,10 @@ #include #include +#include #include "i915_memcpy.h" -#if IS_ENABLED(CONFIG_DRM_I
[Intel-gfx] [PATCH v2 0/1] Add fallback inside memcpy_from_wc functions
A fallback is implemented inside the memcpy_from_wc functions for the case where copying using accelerated reads is not possible.

v2: Fixed Sparse warnings

Balasubramani Vivekanandan (1):
  drm/i915: Add fallback inside memcpy_from_wc functions

 drivers/gpu/drm/i915/gem/i915_gem_object.c |  5 +-
 drivers/gpu/drm/i915/gt/selftest_reset.c   |  8 ++-
 drivers/gpu/drm/i915/i915_gpu_error.c      |  9 ++-
 drivers/gpu/drm/i915/i915_memcpy.c         | 78 --
 drivers/gpu/drm/i915/i915_memcpy.h         | 18 ++---
 5 files changed, 78 insertions(+), 40 deletions(-)

-- 2.25.1
[Intel-gfx] [PATCH v2 1/1] drm/i915: Add fallback inside memcpy_from_wc functions
The memcpy_from_wc functions can fail if SSE4.1 is not supported or the supplied addresses are not 16-byte aligned. It was then up to the caller to use memcpy as a fallback.
Now the fallback to memcpy is implemented inside the memcpy_from_wc functions, relieving the user from checking the return value of i915_memcpy_from_wc and doing the fallback.

When copying from an io memory address, memcpy_fromio should be used as the fallback. So a new function is added to the family of memcpy_from_wc functions, which should be used while copying from io memory.

This change is also implemented with the intention to prepare for porting the memcpy_from_wc code to ARM64. Since SSE4.1 is not valid for ARM, accelerated reads will not be supported and the driver should always rely on the fallback.
So there would be a few more places in the code where a fallback should be introduced. For example, GuC log relay is currently not using a fallback, since a GPU supporting GuC submission will mostly have an SSE4.1-enabled CPU. This no longer holds with discrete GPUs and with enabling support for ARM64.
With the fallback moved inside the memcpy_from_wc functions, call sites look neater and the fallback can be implemented in a uniform way.
Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 5 +- drivers/gpu/drm/i915/gt/selftest_reset.c | 8 ++- drivers/gpu/drm/i915/i915_gpu_error.c | 9 ++- drivers/gpu/drm/i915/i915_memcpy.c | 78 -- drivers/gpu/drm/i915/i915_memcpy.h | 18 ++--- 5 files changed, 78 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index e03e362d320b..e187c4bfb7e4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -444,7 +444,7 @@ static void i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset, void *dst, int size) { void __iomem *src_map; - void __iomem *src_ptr; + const void __iomem *src_ptr; dma_addr_t dma = i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT); src_map = io_mapping_map_wc(&obj->mm.region->iomap, @@ -452,8 +452,7 @@ i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 offset PAGE_SIZE); src_ptr = src_map + offset_in_page(offset); - if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size)) - memcpy_fromio(dst, src_ptr, size); + i915_io_memcpy_from_wc(dst, src_ptr, size); io_mapping_unmap(src_map); } diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c b/drivers/gpu/drm/i915/gt/selftest_reset.c index 37c38bdd5f47..64b8521a8b28 100644 --- a/drivers/gpu/drm/i915/gt/selftest_reset.c +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c @@ -99,8 +99,10 @@ __igt_reset_stolen(struct intel_gt *gt, memset_io(s, STACK_MAGIC, PAGE_SIZE); in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) + if (i915_can_memcpy_from_wc(tmp, in, PAGE_SIZE)) { + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); in = tmp; + } crc[page] = crc32_le(0, in, PAGE_SIZE); io_mapping_unmap(s); @@ -135,8 +137,10 @@ __igt_reset_stolen(struct intel_gt *gt, PAGE_SIZE); in = (void __force *)s; - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) + if (i915_can_memcpy_from_wc(tmp, 
in, PAGE_SIZE)) { + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); in = tmp; + } x = crc32_le(0, in, PAGE_SIZE); if (x != crc[page] && diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 127ff56c8ce6..2c14a28c 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -297,8 +297,10 @@ static int compress_page(struct i915_vma_compress *c, struct z_stream_s *zstream = &c->zstream; zstream->next_in = src; - if (wc && c->tmp && i915_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) + if (wc && c->tmp && i915_can_memcpy_from_wc(c->tmp, src, PAGE_SIZE)) { + i915_io_memcpy_from_wc(c->tmp, (const void __iomem *)src, PAGE_SIZE); zstream->next_in = c->tmp; + } zstream->avail_in = PAGE_SIZE; do { @@ -397,8 +399,11 @@ static int compress_page(struct i915_vma_compress *c, if (!ptr) return -ENOMEM; - if (!(wc && i915_memcpy_from_wc(ptr, src, PAGE_SIZE))) + if (wc) + i915_io_memcpy_from_wc(ptr, src, PAGE_SIZE); + else memcpy(ptr, src, PAGE_SIZE); +
Re: [Intel-gfx] [PATCH v2 1/1] drm/i915: Add fallback inside memcpy_from_wc functions
On 08.02.2022 11:11, Lucas De Marchi wrote: > On Mon, Feb 07, 2022 at 09:43:08PM +0530, Balasubramani Vivekanandan wrote: > > memcpy_from_wc functions can fail if SSE4.1 is not supported or the > > supplied addresses are not 16-byte aligned. It was then upto to the > > caller to use memcpy as fallback. > > Now fallback to memcpy is implemented inside memcpy_from_wc functions > > relieving the user from checking the return value of i915_memcpy_from_wc > > and doing fallback. > > > > When doing copying from io memory address memcpy_fromio should be used > > as fallback. So a new function is added to the family of memcpy_to_wc > > functions which should be used while copying from io memory. > > > > This change is implemented also with an intention to perpare for porting > > memcpy_from_wc code to ARM64. Since SSE4.1 is not valid for ARM, > > accelerated reads will not be supported and the driver should rely on > > fallback always. > > So there would be few more places in the code where fallback should be > > introduced. For e.g. GuC log relay is currently not using fallback since > > a GPU supporting GuC submission will mostly have SSE4.1 enabled CPU. > > This is no more valid with Discrete GPU and with enabling support for > > ARM64. > > With fallback moved inside memcpy_from_wc function, call sites would > > look neat and fallback can be implemented in a uniform way. 
> > > > Signed-off-by: Balasubramani Vivekanandan > > > > --- > > drivers/gpu/drm/i915/gem/i915_gem_object.c | 5 +- > > drivers/gpu/drm/i915/gt/selftest_reset.c | 8 ++- > > drivers/gpu/drm/i915/i915_gpu_error.c | 9 ++- > > drivers/gpu/drm/i915/i915_memcpy.c | 78 -- > > drivers/gpu/drm/i915/i915_memcpy.h | 18 ++--- > > 5 files changed, 78 insertions(+), 40 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c > > b/drivers/gpu/drm/i915/gem/i915_gem_object.c > > index e03e362d320b..e187c4bfb7e4 100644 > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c > > @@ -444,7 +444,7 @@ static void > > i915_gem_object_read_from_page_iomap(struct drm_i915_gem_object *obj, u64 > > offset, void *dst, int size) > > { > > void __iomem *src_map; > > - void __iomem *src_ptr; > > + const void __iomem *src_ptr; > > dma_addr_t dma = i915_gem_object_get_dma_address(obj, offset >> > > PAGE_SHIFT); > > > > src_map = io_mapping_map_wc(&obj->mm.region->iomap, > > @@ -452,8 +452,7 @@ i915_gem_object_read_from_page_iomap(struct > > drm_i915_gem_object *obj, u64 offset > > PAGE_SIZE); > > > > src_ptr = src_map + offset_in_page(offset); > > - if (!i915_memcpy_from_wc(dst, (void __force *)src_ptr, size)) > > - memcpy_fromio(dst, src_ptr, size); > > + i915_io_memcpy_from_wc(dst, src_ptr, size); > > nitpick, but maybe to align with the memcpy_fromio() API this would > better be named i915_memcpy_fromio_wc()? I too thought for a moment should I rename to i915_memcpy_fromio_wc() but stayed with the current name, when preparing the patch. I will rename it. 
> > > > > io_mapping_unmap(src_map); > > } > > diff --git a/drivers/gpu/drm/i915/gt/selftest_reset.c > > b/drivers/gpu/drm/i915/gt/selftest_reset.c > > index 37c38bdd5f47..64b8521a8b28 100644 > > --- a/drivers/gpu/drm/i915/gt/selftest_reset.c > > +++ b/drivers/gpu/drm/i915/gt/selftest_reset.c > > @@ -99,8 +99,10 @@ __igt_reset_stolen(struct intel_gt *gt, > > memset_io(s, STACK_MAGIC, PAGE_SIZE); > > > > in = (void __force *)s; > > - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) > > + if (i915_can_memcpy_from_wc(tmp, in, PAGE_SIZE)) { > > + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); > > in = tmp; > > + } > > crc[page] = crc32_le(0, in, PAGE_SIZE); > > > > io_mapping_unmap(s); > > @@ -135,8 +137,10 @@ __igt_reset_stolen(struct intel_gt *gt, > > PAGE_SIZE); > > > > in = (void __force *)s; > > - if (i915_memcpy_from_wc(tmp, in, PAGE_SIZE)) > > + if (i915_can_memcpy_from_wc(tmp, in, PAGE_SIZE)) { > > + i915_io_memcpy_from_wc(tmp, in, PAGE_SIZE); > > but you removed __iomem above Yeah, it is a mistake. I will change it. There is one more place in the same file which needs
Re: [Intel-gfx] [PATCH v2 02/21] drm/i915: Parse and set stepping for platforms with GMD
On 18.08.2022 16:41, Radhakrishna Sripada wrote: > From: José Roberto de Souza > > The GMD step field do not properly match the current stepping convention > that we use(STEP_A0, STEP_A1, STEP_B0...). > > One platform could have { arch = 12, rel = 70, step = 1 } and the > actual stepping is STEP_B0 but without the translation of the step > field would mean STEP_A1. > That is why we will need to have gmd_to_intel_step tables for each IP. > > Signed-off-by: José Roberto de Souza > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/intel_step.c | 60 +++ > 1 file changed, 60 insertions(+) Reviewed-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/intel_step.c > b/drivers/gpu/drm/i915/intel_step.c > index 42b3133d8387..0fa7147c7d0f 100644 > --- a/drivers/gpu/drm/i915/intel_step.c > +++ b/drivers/gpu/drm/i915/intel_step.c > @@ -135,6 +135,48 @@ static const struct intel_step_info adlp_n_revids[] = { > [0x0] = { COMMON_GT_MEDIA_STEP(A0), .display_step = STEP_D0 }, > }; > > +struct gmd_to_intel_step { > + struct ip_version gmd; > + enum intel_step step; > +}; > + > +static const struct gmd_to_intel_step gmd_graphics_table[] = { > + { .gmd.ver = 12, .gmd.rel = 70, .gmd.step = 0, .step = STEP_A0 }, > + { .gmd.ver = 12, .gmd.rel = 70, .gmd.step = 4, .step = STEP_B0 }, > + { .gmd.ver = 12, .gmd.rel = 71, .gmd.step = 0, .step = STEP_A0 }, > + { .gmd.ver = 12, .gmd.rel = 71, .gmd.step = 4, .step = STEP_B0 }, > + { .gmd.ver = 12, .gmd.rel = 73, .gmd.step = 0, .step = STEP_A0 }, > + { .gmd.ver = 12, .gmd.rel = 73, .gmd.step = 4, .step = STEP_B0 }, > +}; > + > +static const struct gmd_to_intel_step gmd_media_table[] = { > + { .gmd.ver = 13, .gmd.rel = 70, .gmd.step = 0, .step = STEP_A0 }, > + { .gmd.ver = 13, .gmd.rel = 70, .gmd.step = 4, .step = STEP_B0 }, > +}; > + > +static const struct gmd_to_intel_step gmd_display_table[] = { > + { .gmd.ver = 14, .gmd.rel = 0, .gmd.step = 0, .step = STEP_A0 }, > + { .gmd.ver = 14, .gmd.rel = 0, .gmd.step = 4, 
.step = STEP_B0 }, > +}; > + > +static u8 gmd_to_intel_step(struct drm_i915_private *i915, > + struct ip_version *gmd, > + const struct gmd_to_intel_step *table, > + int len) > +{ > + int i; > + > + for (i = 0; i < len; i++) { > + if (table[i].gmd.ver == gmd->ver && > + table[i].gmd.rel == gmd->rel && > + table[i].gmd.step == gmd->step) > + return table[i].step; > + } > + > + drm_dbg(&i915->drm, "Using future steppings\n"); > + return STEP_FUTURE; > +} > + > static void pvc_step_init(struct drm_i915_private *i915, int pci_revid); > > void intel_step_init(struct drm_i915_private *i915) > @@ -144,6 +186,24 @@ void intel_step_init(struct drm_i915_private *i915) > int revid = INTEL_REVID(i915); > struct intel_step_info step = {}; > > + if (HAS_GMD_ID(i915)) { > + step.graphics_step = gmd_to_intel_step(i915, > + > &RUNTIME_INFO(i915)->graphics, > +gmd_graphics_table, > + > ARRAY_SIZE(gmd_graphics_table)); > + step.media_step = gmd_to_intel_step(i915, > + &RUNTIME_INFO(i915)->media, > + gmd_media_table, > + > ARRAY_SIZE(gmd_media_table)); > + step.display_step = gmd_to_intel_step(i915, > + > &RUNTIME_INFO(i915)->display, > + gmd_display_table, > + > ARRAY_SIZE(gmd_display_table)); > + RUNTIME_INFO(i915)->step = step; > + > + return; > + } > + > if (IS_PONTEVECCHIO(i915)) { > pvc_step_init(i915, revid); > return; > -- > 2.25.1 >
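For readers following the thread, the translation described in the commit message can be sketched as a standalone table lookup. This is a minimal illustration only, not the driver code: every `sketch_` name is a hypothetical stand-in for `struct ip_version`, `enum intel_step` and the `gmd_*_table` arrays in the patch, and only the two graphics entries for 12.70 are reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types quoted in the patch. */
enum sketch_step { SKETCH_STEP_A0, SKETCH_STEP_B0, SKETCH_STEP_FUTURE };

struct sketch_ip_version {
	unsigned int ver;
	unsigned int rel;
	unsigned int step;
};

struct sketch_gmd_map {
	struct sketch_ip_version gmd;	/* raw GMD ID fields */
	enum sketch_step step;		/* driver stepping convention */
};

/* Two entries copied from the graphics table in the patch. */
static const struct sketch_gmd_map sketch_graphics_table[] = {
	{ .gmd = { .ver = 12, .rel = 70, .step = 0 }, .step = SKETCH_STEP_A0 },
	{ .gmd = { .ver = 12, .rel = 70, .step = 4 }, .step = SKETCH_STEP_B0 },
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Linear search; an unknown triple falls back to the "future" stepping. */
static enum sketch_step
sketch_gmd_to_step(const struct sketch_ip_version *gmd,
		   const struct sketch_gmd_map *table, size_t len)
{
	size_t i;

	assert(gmd != NULL && table != NULL);

	for (i = 0; i < len; i++) {
		if (table[i].gmd.ver == gmd->ver &&
		    table[i].gmd.rel == gmd->rel &&
		    table[i].gmd.step == gmd->step)
			return table[i].step;
	}

	return SKETCH_STEP_FUTURE;
}
```

A raw step value that has no table entry is treated as a future stepping in this sketch, mirroring the fallback in the quoted `gmd_to_intel_step()`.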
Re: [Intel-gfx] [PATCH v2 03/21] drm/i915/mtl: MMIO range is now 4MB
On 18.08.2022 16:41, Radhakrishna Sripada wrote: > From: Matt Roper > > Previously only dgfx platforms had a 4MB MMIO range, but starting with > MTL we now use the larger range for all platforms. > > Bspec: 63834, 63830 > Signed-off-by: Matt Roper > Signed-off-by: Radhakrishna Sripada Reviewed-by: Balasubramani Vivekanandan > --- > drivers/gpu/drm/i915/intel_uncore.c | 11 ++- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c > b/drivers/gpu/drm/i915/intel_uncore.c > index a852c471d1b3..e0a8a8cb2052 100644 > --- a/drivers/gpu/drm/i915/intel_uncore.c > +++ b/drivers/gpu/drm/i915/intel_uncore.c > @@ -2232,14 +2232,15 @@ int intel_uncore_setup_mmio(struct intel_uncore > *uncore, phys_addr_t phys_addr) >* clobbering the GTT which we want ioremap_wc instead. Fortunately, >* the register BAR remains the same size for all the earlier >* generations up to Ironlake. > - * For dgfx chips register range is expanded to 4MB. > + * For dgfx chips register range is expanded to 4MB, and this larger > + * range is also used for integrated gpus beginning with Meteor Lake. >*/ > - if (GRAPHICS_VER(i915) < 5) > - mmio_size = 512 * 1024; > - else if (IS_DGFX(i915)) > + if (IS_DGFX(i915) || GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) > mmio_size = 4 * 1024 * 1024; > - else > + else if (GRAPHICS_VER(i915) >= 5) > mmio_size = 2 * 1024 * 1024; > + else > + mmio_size = 512 * 1024; > > uncore->regs = ioremap(phys_addr, mmio_size); > if (uncore->regs == NULL) { > -- > 2.25.1 >
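The reordered size selection in the hunk above can be sketched outside the kernel as a pure function. This is only an illustration under stated assumptions: `sketch_mmio_size()` and `SKETCH_IP_VER()` are hypothetical names, and the version encoding (major in the upper bits, release in the low byte) is assumed to match how `IP_VER()` compares versions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed encoding: major version above the low byte, release in the low byte. */
#define SKETCH_IP_VER(ver, rel) (((ver) << 8) | (rel))

/*
 * Pick the MMIO BAR size to map:
 *  - 4MB for discrete parts and for anything at or beyond 12.70 (MTL),
 *  - 2MB for the remaining platforms from graphics version 5 onwards,
 *  - 512KB for everything older.
 */
static size_t sketch_mmio_size(bool is_dgfx, unsigned int ver, unsigned int rel)
{
	if (is_dgfx || SKETCH_IP_VER(ver, rel) >= SKETCH_IP_VER(12, 70))
		return 4 * 1024 * 1024;
	else if (ver >= 5)
		return 2 * 1024 * 1024;

	return 512 * 1024;
}
```

Checking the largest range first is what lets the integrated 12.70+ case share a branch with the discrete case.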
Re: [Intel-gfx] [PATCH v2 04/21] drm/i915/mtl: Don't mask off CCS according to DSS fusing
On 18.08.2022 16:41, Radhakrishna Sripada wrote: > From: Matt Roper > > Unlike the Xe_HP platforms, MTL only has a single CCS engine; the > quad-based engine masking logic does not apply to this platform (or > presumably any future platforms that only have 0 or 1 CCS). > > Signed-off-by: Matt Roper > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) Reviewed-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > index 37fa813af766..17e7f20bbb48 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > @@ -672,7 +672,7 @@ static void engine_mask_apply_compute_fuses(struct > intel_gt *gt) > unsigned long ccs_mask; > unsigned int i; > > - if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) > + if (hweight32(CCS_MASK(gt)) <= 1) > return; > > ccs_mask = > intel_slicemask_from_xehp_dssmask(info->sseu.compute_subslice_mask, > -- > 2.25.1 >
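The new early-return condition counts set bits in the CCS mask instead of comparing IP versions. A minimal userspace sketch of that check follows; `sketch_hweight32()` is a hypothetical stand-in for the kernel's `hweight32()`, and `sketch_skip_ccs_fuse_masking()` is an illustrative name, not a driver function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Population count: clear the lowest set bit until none remain. */
static unsigned int sketch_hweight32(uint32_t w)
{
	unsigned int count = 0;

	while (w) {
		w &= w - 1;
		count++;
	}

	return count;
}

/*
 * With zero or one CCS engine there is nothing to mask by quadrant,
 * so the DSS-fuse based masking can be skipped.
 */
static bool sketch_skip_ccs_fuse_masking(uint32_t ccs_mask)
{
	return sketch_hweight32(ccs_mask) <= 1;
}
```

Keying the check on the engine count rather than on a platform version means any later platform with at most one CCS takes the same early return without another version test.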
Re: [Intel-gfx] [PATCH v2 05/21] drm/i915/mtl: Define engine context layouts
On 18.08.2022 16:41, Radhakrishna Sripada wrote: > From: Matt Roper > > The part of the media and blitter engine contexts that we care about for > setting up an initial state are the same on MTL as they were on DG2 > (and PVC), so we need to update the driver conditions to re-use the DG2 > context table. > > For render/compute engines, the part of the context images are nearly > the same, although the layout had a very slight change --- one POSH > register was removed and the placement of some LRI/noops adjusted > slightly to compensate. > > Bspec: 46261, 46260, 45585 > Signed-off-by: Matt Roper > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 47 - > 1 file changed, 46 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > b/drivers/gpu/drm/i915/gt/intel_lrc.c > index eec73c66406c..d3833cbaabcb 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -606,6 +606,49 @@ static const u8 dg2_rcs_offsets[] = { > END > }; > > +static const u8 mtl_rcs_offsets[] = { > + NOP(1), > + LRI(15, POSTED), > + REG16(0x244), > + REG(0x034), > + REG(0x030), > + REG(0x038), > + REG(0x03c), > + REG(0x168), > + REG(0x140), > + REG(0x110), > + REG(0x1c0), > + REG(0x1c4), > + REG(0x1c8), > + REG(0x180), > + REG16(0x2b4), Inspecting Bspecs 46261 and 46260 indicates the following 2 registers are replaced by NOP for MTL. Can you check? 
> + REG(0x120), > + REG(0x124), > + > + NOP(1), > + LRI(9, POSTED), > + REG16(0x3a8), > + REG16(0x28c), > + REG16(0x288), > + REG16(0x284), > + REG16(0x280), > + REG16(0x27c), > + REG16(0x278), > + REG16(0x274), > + REG16(0x270), > + > + NOP(2), > + LRI(2, POSTED), > + REG16(0x5a8), > + REG16(0x5ac), > + > + NOP(6), > + LRI(1, 0), > + REG(0x0c8), > + > + END > +}; > + > #undef END > #undef REG16 > #undef REG > @@ -624,7 +667,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs > *engine) > !intel_engine_has_relative_mmio(engine)); > > if (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE) { > - if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 70)) > + return mtl_rcs_offsets; > + else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > return dg2_rcs_offsets; > else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) > return xehp_rcs_offsets; Similarly, according to Bspec 45582, the same 2 registers indicated above are replaced by NOPs in the Copy and Blitter engine contexts on MTL compared to DG2. So we have to create a new structure for MTL for the Copy and Media engine contexts. Regards, Bala > -- > 2.25.1 >
Re: [Intel-gfx] [PATCH] drm/i915/dg2: Incorporate Wa_16014892111 into DRAW_WATERMARK tuning
On 23.08.2022 13:24, Matt Roper wrote: > Although register tuning settings are generally implemented via the > workaround infrastructure, it turns out that the DRAW_WATERMARK register > is not properly saved/restored by hardware around power events (i.e., > RC6 entry) so updates to the value cannot be applied in the usual > manner. New workaround Wa_16014892111 informs us that any tuning > updates to this register must instead be applied via an INDIRECT_CTX > batch buffer. This will ensure that the necessary value is re-applied > when a context begins running, even if an RC6 entry had wiped the > register back to hardware defaults since the last context ran. > > Fixes: 6dc85721df74 ("drm/i915/dg2: Add additional tuning settings") > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6642 > Signed-off-by: Matt Roper Reviewed-by: Balasubramani Vivekanandan > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 21 + > drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 -- > 2 files changed, 21 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > b/drivers/gpu/drm/i915/gt/intel_lrc.c > index eec73c66406c..070cec4ff8a4 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -1242,6 +1242,23 @@ dg2_emit_rcs_hang_wabb(const struct intel_context *ce, > u32 *cs) > return cs; > } > > +/* > + * The bspec's tuning guide asks us to program a vertical watermark value of > + * 0x3FF. However this register is not saved/restored properly by the > + * hardware, so we're required to apply the desired value via INDIRECT_CTX > + * batch buffer to ensure the value takes effect properly. All other bits > + * in this register should remain at 0 (the hardware default). 
> + */ > +static u32 * > +dg2_emit_draw_watermark_setting(u32 *cs) > +{ > + *cs++ = MI_LOAD_REGISTER_IMM(1); > + *cs++ = i915_mmio_reg_offset(DRAW_WATERMARK); > + *cs++ = REG_FIELD_PREP(VERT_WM_VAL, 0x3FF); > + > + return cs; > +} > + > static u32 * > gen12_emit_indirect_ctx_rcs(const struct intel_context *ce, u32 *cs) > { > @@ -1263,6 +1280,10 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context > *ce, u32 *cs) > if (!HAS_FLAT_CCS(ce->engine->i915)) > cs = gen12_emit_aux_table_inv(cs, GEN12_GFX_CCS_AUX_NV); > > + /* Wa_16014892111 */ > + if (IS_DG2(ce->engine->i915)) > + cs = dg2_emit_draw_watermark_setting(cs); > + > return cs; > } > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c > b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index 31e129329fb0..3cdb8294e13f 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -2685,8 +2685,6 @@ add_render_compute_tuning_settings(struct > drm_i915_private *i915, > if (IS_DG2(i915)) { > wa_write_or(wal, XEHP_L3SCQREG7, BLEND_FILL_CACHING_OPT_DIS); > wa_write_clr_set(wal, RT_CTRL, STACKID_CTRL, STACKID_CTRL_512); > - wa_write_clr_set(wal, DRAW_WATERMARK, VERT_WM_VAL, > - REG_FIELD_PREP(VERT_WM_VAL, 0x3FF)); > > /* >* This is also listed as Wa_22012654132 for certain DG2 > -- > 2.37.2 >
Re: [Intel-gfx] [PATCH v4 04/11] drm/i915/mtl: Define engine context layouts
On 01.09.2022 23:03, Radhakrishna Sripada wrote: > From: Matt Roper > > The part of the media and blitter engine contexts that we care about for > setting up an initial state are the same on MTL as they were on DG2 > (and PVC), so we need to update the driver conditions to re-use the DG2 > context table. > > For render/compute engines, the part of the context images are nearly > the same, although the layout had a very slight change --- one POSH > register was removed and the placement of some LRI/noops adjusted > slightly to compensate. > > v2: > - Dg2, mtl xcs offsets slightly vary. Use a separate offsets array(Bala) > - Drop unused registers in mtl rcs offsets.(Bala) > > Bspec: 46261, 46260, 45585 > Cc: Balasubramani Vivekanandan > Signed-off-by: Matt Roper > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 81 - > 1 file changed, 79 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > b/drivers/gpu/drm/i915/gt/intel_lrc.c > index 070cec4ff8a4..ecb030ee39cd 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -264,6 +264,38 @@ static const u8 dg2_xcs_offsets[] = { > END > }; > > +static const u8 mtl_xcs_offsets[] = { > + NOP(1), > + LRI(13, POSTED), > + REG16(0x244), > + REG(0x034), > + REG(0x030), > + REG(0x038), > + REG(0x03c), > + REG(0x168), > + REG(0x140), > + REG(0x110), > + REG(0x1c0), > + REG(0x1c4), > + REG(0x1c8), > + REG(0x180), > + REG16(0x2b4), Comparing this with Bspec 45585, there are a few NOPs missing here. > + > + NOP(1), > + LRI(9, POSTED), > + REG16(0x3a8), > + REG16(0x28c), > + REG16(0x288), > + REG16(0x284), > + REG16(0x280), > + REG16(0x27c), > + REG16(0x278), > + REG16(0x274), > + REG16(0x270), > + > + END > +}; > + > static const u8 gen8_rcs_offsets[] = { > NOP(1), > LRI(14, POSTED), > @@ -606,6 +638,47 @@ static const u8 dg2_rcs_offsets[] = { > END > }; > > +static const u8 mtl_rcs_offsets[] = { > + NOP(1), > + LRI(13, POSTED), > + 
REG16(0x244), > + REG(0x034), > + REG(0x030), > + REG(0x038), > + REG(0x03c), > + REG(0x168), > + REG(0x140), > + REG(0x110), > + REG(0x1c0), > + REG(0x1c4), > + REG(0x1c8), > + REG(0x180), > + REG16(0x2b4), > + > + NOP(1), > + LRI(9, POSTED), > + REG16(0x3a8), > + REG16(0x28c), > + REG16(0x288), > + REG16(0x284), > + REG16(0x280), > + REG16(0x27c), > + REG16(0x278), > + REG16(0x274), > + REG16(0x270), > + > + NOP(2), > + LRI(2, POSTED), > + REG16(0x5a8), > + REG16(0x5ac), > + > + NOP(6), > + LRI(1, 0), > + REG(0x0c8), > + > + END > +}; > + > #undef END > #undef REG16 > #undef REG > @@ -624,7 +697,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs > *engine) > !intel_engine_has_relative_mmio(engine)); > > if (engine->flags & I915_ENGINE_HAS_RCS_REG_STATE) { > - if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 70)) > + return mtl_rcs_offsets; > + else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > return dg2_rcs_offsets; > else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) > return xehp_rcs_offsets; > @@ -637,7 +712,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs > *engine) > else > return gen8_rcs_offsets; > } else { > - if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 70)) > + return mtl_xcs_offsets; > + else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) > return dg2_xcs_offsets; > else if (GRAPHICS_VER(engine->i915) >= 12) > return gen12_xcs_offsets; > -- > 2.34.1 >
Re: [Intel-gfx] [PATCH v3 05/11] drm/i915/mtl: Add gmbus and gpio support
On 31.08.2022 14:49, Radhakrishna Sripada wrote: > Add tables to map the GMBUS pin pairs to GPIO registers and port to DDC. > From spec we have registers GPIO_CTL[1-5] mapped to native display phys and > GPIO_CTL[9-12] are mapped to TC ports. > > v2: > - Drop unused GPIO pins(MattR) > > BSpec: 49306 > > Cc: Matt Roper > Original Author: Brian J Lovin > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/display/intel_gmbus.c | 15 +++ > drivers/gpu/drm/i915/display/intel_gmbus.h | 1 + > 2 files changed, 16 insertions(+) Reviewed-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/display/intel_gmbus.c > b/drivers/gpu/drm/i915/display/intel_gmbus.c > index 6f6cfccad477..74443f57f62d 100644 > --- a/drivers/gpu/drm/i915/display/intel_gmbus.c > +++ b/drivers/gpu/drm/i915/display/intel_gmbus.c > @@ -117,6 +117,18 @@ static const struct gmbus_pin gmbus_pins_dg2[] = { > [GMBUS_PIN_9_TC1_ICP] = { "tc1", GPIOJ }, > }; >
Re: [Intel-gfx] [PATCH v4 05/11] drm/i915/mtl: Add gmbus and gpio support
On 01.09.2022 23:03, Radhakrishna Sripada wrote: > Add tables to map the GMBUS pin pairs to GPIO registers and port to DDC. > From spec we have registers GPIO_CTL[1-5] mapped to native display phys and > GPIO_CTL[9-12] are mapped to TC ports. > > v2: > - Drop unused GPIO pins(MattR) > > BSpec: 49306 > > Cc: Matt Roper > Original Author: Brian J Lovin > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/display/intel_gmbus.c | 15 +++ > drivers/gpu/drm/i915/display/intel_gmbus.h | 1 + > 2 files changed, 16 insertions(+) Reviewed-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/display/intel_gmbus.c > b/drivers/gpu/drm/i915/display/intel_gmbus.c > index 6f6cfccad477..74443f57f62d 100644 > --- a/drivers/gpu/drm/i915/display/intel_gmbus.c > +++ b/drivers/gpu/drm/i915/display/intel_gmbus.c > @@ -117,6 +117,18 @@ static const struct gmbus_pin gmbus_pins_dg2[] = { > [GMBUS_PIN_9_TC1_ICP] = { "tc1", GPIOJ }, > }; >
Re: [Intel-gfx] [PATCH v4 11/11] drm/i915/mtl: Do not update GV point, mask value
On 01.09.2022 23:03, Radhakrishna Sripada wrote: > Display 14 and future platforms do not directly communicate to Pcode > via mailbox the SAGV bandwidth information. PM Demand registers are > used to communicate display power requirements to the PUnit which would > include GV point and mask value. > > Skip programming GV point and mask values through legacy pcode mailbox > interface. I agree with Matt's suggestion in v2 of this patch series to move this patch to the future series where we would introduce the new pm_demand interface. It would make more sense there. > > Bspec: 64636 > Cc: Matt Roper > Original Author: Caz Yokoyama > Signed-off-by: Radhakrishna Sripada > --- > drivers/gpu/drm/i915/intel_pm.c | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index b19a1ecb010e..69efd613bbde 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -3923,6 +3923,14 @@ void intel_sagv_pre_plane_update(struct > intel_atomic_state *state) > { > struct drm_i915_private *i915 = to_i915(state->base.dev); > > + /* > + * No need to update mask value/restrict because > + * "Pcode only wants to use GV bandwidth value, not the mask value." > + * for DISPLAY_VER() >= 14. > + */ > + if (DISPLAY_VER(i915) >= 14) > + return; > + My suggestion would be to remove the DISPLAY version check here and do it at the place this function is invoked from. Then for versions < 14, intel_sagv_pre_plane_update can be called, and for higher versions we need to implement the new pm_demand interface. > /* >* Just return if we can't control SAGV or don't have it. 
>* This is different from situation when we have SAGV but just can't > @@ -3943,6 +3951,16 @@ void intel_sagv_post_plane_update(struct > intel_atomic_state *state) > { > struct drm_i915_private *i915 = to_i915(state->base.dev); > > + /* > + * No need to update mask value/restrict because > + * "Pcode only wants to use GV bandwidth value, not the mask value." > + * for DISPLAY_VER() >= 14. > + * > + * GV bandwidth will be set by intel_pmdemand_post_plane_update() > + */ > + if (DISPLAY_VER(i915) >= 14) > + return; ditto > + > /* >* Just return if we can't control SAGV or don't have it. >* This is different from situation when we have SAGV but just can't Regards, Bala > -- > 2.34.1 >
[Intel-gfx] [PATCH] drm/i915/display: Print display info inside driver display initialization
Move the printing of the display version and feature flags from the main driver probe into the display initialization. This is in line with isolating the display code from the main driver and helps the Xe driver to reuse it. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_display_driver.c | 5 + drivers/gpu/drm/i915/i915_driver.c | 2 -- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c index 9d9b034b9bdc..2fbb3c956336 100644 --- a/drivers/gpu/drm/i915/display/intel_display_driver.c +++ b/drivers/gpu/drm/i915/display/intel_display_driver.c @@ -380,6 +380,8 @@ int intel_display_driver_probe(struct drm_i915_private *i915) void intel_display_driver_register(struct drm_i915_private *i915) { + struct drm_printer p = drm_info_printer(i915->drm.dev); + if (!HAS_DISPLAY(i915)) return; @@ -407,6 +409,9 @@ void intel_display_driver_register(struct drm_i915_private *i915) * fbdev->async_cookie. */ drm_kms_helper_poll_init(&i915->drm); + + intel_display_device_info_print(DISPLAY_INFO(i915), + DISPLAY_RUNTIME_INFO(i915), &p); } /* part #1: call before irq uninstall */ diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c index e5d693904123..d50347e5773a 100644 --- a/drivers/gpu/drm/i915/i915_driver.c +++ b/drivers/gpu/drm/i915/i915_driver.c @@ -699,8 +699,6 @@ static void i915_welcome_messages(struct drm_i915_private *dev_priv) intel_device_info_print(INTEL_INFO(dev_priv), RUNTIME_INFO(dev_priv), &p); - intel_display_device_info_print(DISPLAY_INFO(dev_priv), - DISPLAY_RUNTIME_INFO(dev_priv), &p); i915_print_iommu_status(dev_priv, &p); for_each_gt(gt, dev_priv, i) intel_gt_info_print(&gt->info, &p); -- 2.25.1
Re: [Intel-gfx] [PATCH] drm/i915/display: Print display info inside driver display initialization
On 21.09.2023 10:38, Jani Nikula wrote: > On Thu, 21 Sep 2023, Balasubramani Vivekanandan > wrote: > > Separate the printing of display version and feature flags from the main > > driver probe to inside the display initialization. This is in alignment > > with isolating the display code from the main driver and helps Xe driver > > to reuse it. > > > > Signed-off-by: Balasubramani Vivekanandan > > > > --- > > drivers/gpu/drm/i915/display/intel_display_driver.c | 5 + > > drivers/gpu/drm/i915/i915_driver.c | 2 -- > > 2 files changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c > > b/drivers/gpu/drm/i915/display/intel_display_driver.c > > index 9d9b034b9bdc..2fbb3c956336 100644 > > --- a/drivers/gpu/drm/i915/display/intel_display_driver.c > > +++ b/drivers/gpu/drm/i915/display/intel_display_driver.c > > @@ -380,6 +380,8 @@ int intel_display_driver_probe(struct drm_i915_private > > *i915) > > > > void intel_display_driver_register(struct drm_i915_private *i915) > > { > > + struct drm_printer p = drm_info_printer(i915->drm.dev); > > It needs to be a debug printer, not info printer, maybe: > > struct drm_printer p = drm_debug_printer("display info:"); > > Unfortunately, it's not device specific, but that's for another set of > patches another day. Yeah, that's the reason I deliberately used the info printer. Anyway, I will resend the patch with it changed to the debug printer. Regards, Bala > > BR, > Jani. > > > + > > if (!HAS_DISPLAY(i915)) > > return; > > > > @@ -407,6 +409,9 @@ void intel_display_driver_register(struct > > drm_i915_private *i915) > > * fbdev->async_cookie. 
> > */ > > drm_kms_helper_poll_init(&i915->drm); > > + > > + intel_display_device_info_print(DISPLAY_INFO(i915), > > + DISPLAY_RUNTIME_INFO(i915), &p); > > } > > > > /* part #1: call before irq uninstall */ > > diff --git a/drivers/gpu/drm/i915/i915_driver.c > > b/drivers/gpu/drm/i915/i915_driver.c > > index e5d693904123..d50347e5773a 100644 > > --- a/drivers/gpu/drm/i915/i915_driver.c > > +++ b/drivers/gpu/drm/i915/i915_driver.c > > @@ -699,8 +699,6 @@ static void i915_welcome_messages(struct > > drm_i915_private *dev_priv) > > > > intel_device_info_print(INTEL_INFO(dev_priv), > > RUNTIME_INFO(dev_priv), &p); > > - intel_display_device_info_print(DISPLAY_INFO(dev_priv), > > - DISPLAY_RUNTIME_INFO(dev_priv), > > &p); > > i915_print_iommu_status(dev_priv, &p); > > for_each_gt(gt, dev_priv, i) > > intel_gt_info_print(&gt->info, &p); > > -- > Jani Nikula, Intel
[Intel-gfx] [PATCH v2] drm/i915/display: Print display info inside driver display initialization
Move the printing of the display version and feature flags from the main driver probe into the display initialization. This is in line with isolating the display code from the main driver and helps the Xe driver to reuse it. v2: Replace drm_info_printer with drm_debug_printer (Jani) Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_display_driver.c | 5 + drivers/gpu/drm/i915/i915_driver.c | 2 -- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_driver.c b/drivers/gpu/drm/i915/display/intel_display_driver.c index 9d9b034b9bdc..44b59ac301e6 100644 --- a/drivers/gpu/drm/i915/display/intel_display_driver.c +++ b/drivers/gpu/drm/i915/display/intel_display_driver.c @@ -380,6 +380,8 @@ int intel_display_driver_probe(struct drm_i915_private *i915) void intel_display_driver_register(struct drm_i915_private *i915) { + struct drm_printer p = drm_debug_printer("i915 display info:"); + if (!HAS_DISPLAY(i915)) return; @@ -407,6 +409,9 @@ void intel_display_driver_register(struct drm_i915_private *i915) * fbdev->async_cookie. */ drm_kms_helper_poll_init(&i915->drm); + + intel_display_device_info_print(DISPLAY_INFO(i915), + DISPLAY_RUNTIME_INFO(i915), &p); } /* part #1: call before irq uninstall */ diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c index e5d693904123..d50347e5773a 100644 --- a/drivers/gpu/drm/i915/i915_driver.c +++ b/drivers/gpu/drm/i915/i915_driver.c @@ -699,8 +699,6 @@ static void i915_welcome_messages(struct drm_i915_private *dev_priv) intel_device_info_print(INTEL_INFO(dev_priv), RUNTIME_INFO(dev_priv), &p); - intel_display_device_info_print(DISPLAY_INFO(dev_priv), - DISPLAY_RUNTIME_INFO(dev_priv), &p); i915_print_iommu_status(dev_priv, &p); for_each_gt(gt, dev_priv, i) intel_gt_info_print(&gt->info, &p); -- 2.25.1
Re: [Intel-gfx] [PATCH] drm/i915/mtl: Add Wa_22016670082
On 25.10.2023 18:47, Dnyaneshwar Bhadane wrote: > Implemented workaround for XeLPM+ > BSpec: 51762 > > Signed-off-by: Dnyaneshwar Bhadane > --- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 3 +++ > 1 file changed, 3 insertions(+) Reviewed-by: Balasubramani Vivekanandan Regards, Bala > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c > b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index 192ac0e59afa..6ae7a4de83b0 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -1674,6 +1674,9 @@ xelpmp_gt_workarounds_init(struct intel_gt *gt, struct > i915_wa_list *wal) >*/ > wa_write_or(wal, XELPMP_GSC_MOD_CTRL, FORCE_MISS_FTLB); > > + /* Wa_22016670082 */ > + wa_write_or(wal, GEN12_SQCNT1, GEN12_STRICT_RAR_ENABLE); > + > debug_dump_steering(gt); > } > > -- > 2.34.1 >
[Intel-gfx] [PATCH] drm/i915/display: Fix IP version of the WAs
WAs 14011508470 and 14011503030 were also being applied on IP versions newer than those to which they are applicable. Fix the IP version checks for these workarounds. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_display_power.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 18ff7f3639ff..5f091502719b 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -1697,14 +1697,14 @@ static void icl_display_core_init(struct drm_i915_private *dev_priv, if (resume) intel_dmc_load_program(dev_priv); - /* Wa_14011508470:tgl,dg1,rkl,adl-s,adl-p */ - if (DISPLAY_VER(dev_priv) >= 12) + /* Wa_14011508470:tgl,dg1,rkl,adl-s,adl-p,dg2 */ + if (IS_DISPLAY_IP_RANGE(dev_priv, IP_VER(12, 0), IP_VER(13, 0))) intel_de_rmw(dev_priv, GEN11_CHICKEN_DCPR_2, 0, DCPR_CLEAR_MEMSTAT_DIS | DCPR_SEND_RESP_IMM | DCPR_MASK_LPMODE | DCPR_MASK_MAXLATENCY_MEMUP_CLR); /* Wa_14011503030:xelpd */ - if (DISPLAY_VER(dev_priv) >= 13) + if (DISPLAY_VER(dev_priv) == 13) intel_de_write(dev_priv, XELPD_DISPLAY_ERR_FATAL_MASK, ~0); } -- 2.25.1
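To make the effect of the two changed conditions concrete, here is a small standalone sketch of the bounded checks. It assumes the range check is inclusive at both ends and that versions are encoded with the major number above the low byte; all `sketch_` names are hypothetical and are not driver macros.

```c
#include <stdbool.h>

/* Assumed encoding: major version above the low byte, release in the low byte. */
#define SKETCH_IP_VER(ver, rel) (((ver) << 8) | (rel))

/* Inclusive range check on an encoded IP version (assumption, see above). */
static bool sketch_ip_in_range(unsigned int ip, unsigned int from,
			       unsigned int until)
{
	return ip >= from && ip <= until;
}

/* Bounded replacement for the old open-ended "display version >= 12" test. */
static bool sketch_needs_wa_14011508470(unsigned int display_ip)
{
	return sketch_ip_in_range(display_ip, SKETCH_IP_VER(12, 0),
				  SKETCH_IP_VER(13, 0));
}

/* Exact match replaces the old open-ended "display version >= 13" test. */
static bool sketch_needs_wa_14011503030(unsigned int display_ver)
{
	return display_ver == 13;
}
```

With the open-ended comparisons, a display version 14 device would have picked up both workarounds; with the bounded forms it picks up neither.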
[Intel-gfx] [PATCH] drm/i915/display: Don't use port enum as register offset
Display DDI ports are enumerated as PORT_A, PORT_B, and so on. The enums are also used as an index to access the DDI_BUF_CTL register for the port. With the introduction of TypeC ports, new enums PORT_TC1, PORT_TC2, ... were added starting from enum value 4 to match the index position of the DDI_BUF_CTL register of those ports, because those early platforms had only 3 non-TypeC ports (PORT_A, PORT_B, PORT_C) followed by TypeC ports. So the enums PORT_D, PORT_E, ... and PORT_TC1, PORT_TC2, ... used the same enum values. The driver also used the condition `if (port > PORT_TC1)` to identify whether a port is a TypeC port. From XELPD onwards, additional non-TypeC ports were added to the platform, called PORT D and PORT E, and the DDI registers for those ports were positioned after the TypeC ports. So the enums PORT_D and PORT_E can't be used, as their values do not match the register position. This led to the creation of new enums PORT_D_XELPD and PORT_E_XELPD for ports D and E. The condition `if (port > PORT_TC1)` was no longer valid for identifying a TypeC port on XELPD. It also led to many additional special checks for ports PORT_D_XELPD/PORT_E_XELPD. With new platforms indicating that the DDI register positions of ports can vary across platforms, it is no longer feasible to keep the port enum values matched to the DDI register positions. The port DDI register position is now maintained in a separate data structure that is part of the platform device info, and ports are enumerated independently. With the enums for TypeC ports defined at the bottom, the driver can easily identify the TypeC ports.
Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/icl_dsi.c| 12 ++-- drivers/gpu/drm/i915/display/intel_bios.c | 4 +- drivers/gpu/drm/i915/display/intel_ddi.c | 62 +++-- drivers/gpu/drm/i915/display/intel_display.c | 12 ++-- drivers/gpu/drm/i915/display/intel_display.h | 8 +-- .../drm/i915/display/intel_display_power.c| 40 +-- drivers/gpu/drm/i915/display/intel_fdi.c | 14 ++-- drivers/gpu/drm/i915/display/intel_tc.c | 6 +- drivers/gpu/drm/i915/gvt/display.c| 30 - drivers/gpu/drm/i915/gvt/handlers.c | 17 +++-- drivers/gpu/drm/i915/i915_pci.c | 66 --- drivers/gpu/drm/i915/i915_reg.h | 8 ++- drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 +-- include/drm/i915_component.h | 2 +- 15 files changed, 144 insertions(+), 148 deletions(-) diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c index ed4d93942dbd..70098b67149b 100644 --- a/drivers/gpu/drm/i915/display/icl_dsi.c +++ b/drivers/gpu/drm/i915/display/icl_dsi.c @@ -548,11 +548,11 @@ static void gen11_dsi_enable_ddi_buffer(struct intel_encoder *encoder) enum port port; for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp |= DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 500)) drm_err(&dev_priv->drm, "DDI port:%c buffer idle\n", @@ -1400,11 +1400,11 @@ static void gen11_dsi_disable_port(struct intel_encoder *encoder) gen11_dsi_ungate_clocks(encoder); for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp &= ~DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, 
DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index 4c543e8205ca..ab472fa757d8 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/display/intel_bios.c @@ -2436,8 +2436,8 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915, [PORT_A] = { DVO_PORT_HDMIA, DVO_PORT_DPA, -1 }, [PORT_B] = { DVO_PORT_HDMIB, DVO_PORT_DPB, -1 }, [PORT_C] = { D
[Intel-gfx] [PATCH v2] drm/i915/display: Don't use port enum as register offset
Display DDI ports are enumerated as PORT_A, PORT_B, and so on. The enums are also used as an index to access the DDI_BUF_CTL register for the port. With the introduction of TypeC ports, new enums PORT_TC1, PORT_TC2, ... were added starting from enum value 4 to match the index position of the DDI_BUF_CTL register of those ports, because those early platforms had only 3 non-TypeC ports (PORT_A, PORT_B, PORT_C) followed by TypeC ports. So the enums PORT_D, PORT_E, ... and PORT_TC1, PORT_TC2, ... used the same enum values. The driver also used the condition `if (port > PORT_TC1)` to identify whether a port is a TypeC port. From XELPD onwards, additional non-TypeC ports were added to the platform, called PORT D and PORT E, and the DDI registers for those ports were positioned after the TypeC ports. So the enums PORT_D and PORT_E can't be used, as their values do not match the register position. This led to the creation of new enums PORT_D_XELPD and PORT_E_XELPD for ports D and E. The condition `if (port > PORT_TC1)` was no longer valid for identifying a TypeC port on XELPD. It also led to many additional special checks for ports PORT_D_XELPD/PORT_E_XELPD. With new platforms indicating that the DDI register positions of ports can vary across platforms, it is no longer feasible to keep the port enum values matched to the DDI register positions. The port DDI register position is now maintained in a separate data structure that is part of the platform device info, and ports are enumerated independently. With the enums for TypeC ports defined at the bottom, the driver can easily identify the TypeC ports. Removed a WARN_ON as it is no longer valid. The WARN was added in commit 327f8d8c336d ("drm/i915: simplify setting of ddi_io_power_domain"). The ddi_io_power_domain calculation has changed completely since that commit and doesn't need this WARN_ON anymore.
Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/icl_dsi.c| 12 ++-- drivers/gpu/drm/i915/display/intel_bios.c | 4 +- drivers/gpu/drm/i915/display/intel_ddi.c | 63 +++--- drivers/gpu/drm/i915/display/intel_display.c | 12 ++-- drivers/gpu/drm/i915/display/intel_display.h | 8 +-- .../drm/i915/display/intel_display_power.c| 40 +-- drivers/gpu/drm/i915/display/intel_fdi.c | 14 ++-- drivers/gpu/drm/i915/display/intel_tc.c | 6 +- drivers/gpu/drm/i915/gvt/display.c| 30 - drivers/gpu/drm/i915/gvt/handlers.c | 17 +++-- drivers/gpu/drm/i915/i915_pci.c | 66 --- drivers/gpu/drm/i915/i915_reg.h | 8 ++- drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 +-- include/drm/i915_component.h | 2 +- 15 files changed, 144 insertions(+), 149 deletions(-) diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c index ed4d93942dbd..70098b67149b 100644 --- a/drivers/gpu/drm/i915/display/icl_dsi.c +++ b/drivers/gpu/drm/i915/display/icl_dsi.c @@ -548,11 +548,11 @@ static void gen11_dsi_enable_ddi_buffer(struct intel_encoder *encoder) enum port port; for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp |= DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 500)) drm_err(&dev_priv->drm, "DDI port:%c buffer idle\n", @@ -1400,11 +1400,11 @@ static void gen11_dsi_disable_port(struct intel_encoder *encoder) gen11_dsi_ungate_clocks(encoder); for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp &= ~DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, 
DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index 4c543e8205ca..ab472fa757d8 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/disp
Re: [Intel-gfx] [PATCH] drm/i915: Noop lrc_init_wa_ctx() on recent/future platforms
On 07.09.2022 16:08, Lucas De Marchi wrote: > Except for graphics version 8 and 9, nothing is done in > lrc_init_wa_ctx(). Assume this won't be needed on future platforms as > well and remove the warning. > > Note that this function is not called for anything below version 8 since > those don't use either guc or execlist, i.e. HAS_EXECLISTS() is false. > > Signed-off-by: Lucas De Marchi > --- > drivers/gpu/drm/i915/gt/intel_lrc.c | 16 > 1 file changed, 4 insertions(+), 12 deletions(-) Reviewed-by: Balasubramani Vivekanandan > > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c > b/drivers/gpu/drm/i915/gt/intel_lrc.c > index 070cec4ff8a4..43fa7b3422c4 100644 > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > @@ -1695,24 +1695,16 @@ void lrc_init_wa_ctx(struct intel_engine_cs *engine) > unsigned int i; > int err; > > - if (!(engine->flags & I915_ENGINE_HAS_RCS_REG_STATE)) > + if (GRAPHICS_VER(engine->i915) >= 11 || > + !(engine->flags & I915_ENGINE_HAS_RCS_REG_STATE)) > return; > > - switch (GRAPHICS_VER(engine->i915)) { > - case 12: > - case 11: > - return; > - case 9: > + if (GRAPHICS_VER(engine->i915) == 9) { > wa_bb_fn[0] = gen9_init_indirectctx_bb; > wa_bb_fn[1] = NULL; > - break; > - case 8: > + } else if (GRAPHICS_VER(engine->i915) == 8) { > wa_bb_fn[0] = gen8_init_indirectctx_bb; > wa_bb_fn[1] = NULL; > - break; > - default: > - MISSING_CASE(GRAPHICS_VER(engine->i915)); > - return; > } > > err = lrc_create_wa_ctx(engine); > -- > 2.37.2 >
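For readers following the review, the control flow after this change can be sketched in isolation. This is a minimal standalone sketch, not driver code: `wa_bb_func_t`, `gen8_init_bb`, `gen9_init_bb` and `pick_wa_bb` are hypothetical stand-ins for the driver's callback type and the `gen8_init_indirectctx_bb`/`gen9_init_indirectctx_bb` functions named in the diff, and the final `return NULL` for other versions is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's workaround batch-buffer callback. */
typedef int (*wa_bb_func_t)(void);

/* Hypothetical callbacks; they only return their version for illustration. */
static int gen8_init_bb(void) { return 8; }
static int gen9_init_bb(void) { return 9; }

/*
 * Pick the callback for a graphics version. Version 11 and anything
 * newer is treated as "nothing to do", so no MISSING_CASE-style warning
 * is needed when a new platform shows up.
 */
static wa_bb_func_t pick_wa_bb(int graphics_ver, int has_rcs_reg_state)
{
	if (graphics_ver >= 11 || !has_rcs_reg_state)
		return NULL;

	if (graphics_ver == 9)
		return gen9_init_bb;
	if (graphics_ver == 8)
		return gen8_init_bb;

	/* Assumption of this sketch: other versions also get no callback. */
	return NULL;
}
```

A caller would check the result for NULL and skip the workaround context setup in that case, which mirrors the early `return` in the patch.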
Re: [Intel-gfx] [PATCH v2] drm/i915/display: Don't use port enum as register offset
On 21.09.2022 21:12, Ville Syrjälä wrote: > On Wed, Sep 21, 2022 at 10:52:59PM +0530, Balasubramani Vivekanandan wrote: > > Display DDI ports are enumerated as PORT_A,PORT_B... . The enums are > > also used as an index to access the DDI_BUF_CTL register for the port. > > > > With the introduction of TypeC ports, new enums PORT_TC1,PORT_TC2.. were > > added starting from enum value 4 to match the index position of the > > DDI_BUF_CTL register of those ports. Because those early platforms had > > only 3 non-TypeC ports PORT_A,PORT_B, PORT_C followed by TypeC ports. > > So the enums PORT_D,PORT_E.. and PORT_TC1,PORT_TC2.. used the same enum > > values. > > > > Driver also used the condition `if (port > PORT_TC1)` to identify if a > > port is a TypeC port or non-TypeC. > > No one should really be doing that, apart from a few exceptions > during initialization. Apart from that I don't think enum port > should really be doing anything else these days than being the > register block offset we pass to the port registers. Yes, my main concern is with fixing the enum values of the ports to match the register block offset. As we have seen with PORT_D_XELPD and PORT_E_XELPD, when the hardware moves the register offset of a port to a new position it creates a lot of chaos in the driver. It resulted in: 1. a new function xelpd_hpd_pin, 2. a new condition check `if (DISPLAY_VER(dev_priv) >= 13 && port >= PORT_D_XELPD) {` in function intel_ddi_init(), 3. a new conditional check in the intel_port_to_phy() function, 4. a new array item in `static const struct intel_ddi_port_domains d13_port_domains[] = {`. All this special handling can be avoided if we do not tie the port enum values to the register offset, as shown in this series. I am also worried about how much mess it would create in the driver if a newer platform adds new TypeC ports at register offsets after PORT_E_XELPD, or if it moves the offsets of the existing TypeC ports. > > Well, the VBT code does screw over that idea kinda. 
I've been > occasionally pondering some kind of separate namespace for ports > for the VBT code but haven't really thought it through in > any detail. > > > > > From XELPD, additional non-TypeC ports were added in the platform > > calling them as PORT D, PORT E and the DDI registers for those ports > > were positioned after TypeC ports. So the enums PORT_D and PORT_E can't > > be used as their values do not match with register position. It led to > > creating new enums PORT_D_XELPD, PORT_E_XELPD for ports D and E. > > > > The condition `if (port > PORT_TC1)` was no more valid for XELPD to > > identify a TypeC port. Also it led to many additional special checks for > > ports PORT_D_XELPD/PORT_E_XELPD. > > > > With new platforms indicating that the DDI register positions of ports > > can vary across platforms it makes no more feasible to maintain the port > > enum values to match the DDI register position. > > Do we know that it's going to get even more messy? I see a big possibility. > > Anyways, we have the exact same thing with AUX CH, so trying to > change one but not the other isn't really going to help. Yes, I am aware of it. The DDI_BUF_CTL and AUX CH registers have the same relative offsets for the ports. My plan is to use the current series as preparation work to clean up the AUX CH handling as well. I will send a follow-up patch for it. > > And on top of that we have the horrorshow in intel_port_to_phy() > & co. I think the phy stuff is probably what we should try to sort > out next, since IMO it's the bigger mess. Agreed. Regards, Bala > > -- > Ville Syrjälä > Intel
Re: [Intel-gfx] [PATCH 01/12] drm/i915/gen8: Create separate reg definitions for new MCR registers
*engine, struct i915_wa_li > wa_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE); > > /* Wa_14010449647:xehpsdv */ > - wa_masked_en(wal, GEN7_HALF_SLICE_CHICKEN1, > + wa_masked_en(wal, GEN8_HALF_SLICE_CHICKEN1, >GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE); > > /* Wa_18011725039:xehpsdv */ > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c > index 8f1165146013..9495a7928bc8 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c > @@ -244,8 +244,8 @@ struct __ext_steer_reg { > }; > > static const struct __ext_steer_reg xe_extregs[] = { > - {"GEN7_SAMPLER_INSTDONE", GEN7_SAMPLER_INSTDONE}, > - {"GEN7_ROW_INSTDONE", GEN7_ROW_INSTDONE} > + {"GEN8_SAMPLER_INSTDONE", GEN8_SAMPLER_INSTDONE}, > + {"GEN8_ROW_INSTDONE", GEN8_ROW_INSTDONE} > }; > > static void __fill_ext_reg(struct __guc_mmio_reg_descr *ext, > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > index a0372735cddb..9229243992c2 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > @@ -35,7 +35,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore) > > if (GRAPHICS_VER(uncore->i915) == 9) { > /* DOP Clock Gating Enable for GuC clocks */ > - intel_uncore_rmw(uncore, GEN7_MISCCPCTL, > + intel_uncore_rmw(uncore, GEN8_MISCCPCTL, >0, GEN8_DOP_CLOCK_GATE_GUC_ENABLE); > > /* allows for 5us (in 10ns units) before GT can go to RC6 */ > diff --git a/drivers/gpu/drm/i915/gvt/handlers.c > b/drivers/gpu/drm/i915/gvt/handlers.c > index daac2050d77d..700cc9688f47 100644 > --- a/drivers/gpu/drm/i915/gvt/handlers.c > +++ b/drivers/gpu/drm/i915/gvt/handlers.c > @@ -2257,7 +2257,7 @@ static int init_generic_mmio_info(struct intel_gvt *gvt) > MMIO_DFH(_MMIO(0x2438), D_ALL, F_CMD_ACCESS, NULL, NULL); > MMIO_DFH(_MMIO(0x243c), D_ALL, F_CMD_ACCESS, NULL, NULL); > MMIO_DFH(_MMIO(0x7018), D_ALL, F_MODE_MASK | 
F_CMD_ACCESS, NULL, NULL); > - MMIO_DFH(HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, NULL, > NULL); > + MMIO_DFH(HSW_HALF_SLICE_CHICKEN3, D_ALL, F_MODE_MASK | F_CMD_ACCESS, > NULL, NULL); > MMIO_DFH(GEN7_HALF_SLICE_CHICKEN1, D_ALL, F_MODE_MASK | F_CMD_ACCESS, > NULL, NULL); > > /* display */ > diff --git a/drivers/gpu/drm/i915/gvt/mmio_context.c > b/drivers/gpu/drm/i915/gvt/mmio_context.c > index 1c6e941c9666..ac58460fb305 100644 > --- a/drivers/gpu/drm/i915/gvt/mmio_context.c > +++ b/drivers/gpu/drm/i915/gvt/mmio_context.c > @@ -111,7 +111,7 @@ static struct engine_mmio gen9_engine_mmio_list[] > __cacheline_aligned = { > {RCS0, GEN9_SCRATCH_LNCF1, 0, false}, /* 0xb008 */ > {RCS0, GEN7_HALF_SLICE_CHICKEN1, 0x, true}, /* 0xe100 */ > {RCS0, HALF_SLICE_CHICKEN2, 0x, true}, /* 0xe180 */ > - {RCS0, HALF_SLICE_CHICKEN3, 0x, true}, /* 0xe184 */ > + {RCS0, HSW_HALF_SLICE_CHICKEN3, 0x, true}, /* 0xe184 */ Since it is for Gen9 and above, can we use GEN8_HALF_SLICE_CHICKEN3 register name here? Rest looks good. 
Reviewed-by: Balasubramani Vivekanandan Regards, Bala > {RCS0, GEN9_HALF_SLICE_CHICKEN5, 0x, true}, /* 0xe188 */ > {RCS0, GEN9_HALF_SLICE_CHICKEN7, 0x, true}, /* 0xe194 */ > {RCS0, GEN8_ROW_CHICKEN, 0x, true}, /* 0xe4f0 */ > diff --git a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c > b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c > index 8279dc580a3e..638b77d64bf4 100644 > --- a/drivers/gpu/drm/i915/intel_gvt_mmio_table.c > +++ b/drivers/gpu/drm/i915/intel_gvt_mmio_table.c > @@ -102,7 +102,7 @@ static int iterate_generic_mmio(struct > intel_gvt_mmio_table_iter *iter) > MMIO_D(_MMIO(0x2438)); > MMIO_D(_MMIO(0x243c)); > MMIO_D(_MMIO(0x7018)); > - MMIO_D(HALF_SLICE_CHICKEN3); > + MMIO_D(HSW_HALF_SLICE_CHICKEN3); > MMIO_D(GEN7_HALF_SLICE_CHICKEN1); > /* display */ > MMIO_F(_MMIO(0x60220), 0x20); > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 8f86f56e7ca4..1aa77b18fd3c 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -4325,8 +4325,8 @@ static void gen8_set_l3sqc_credits(struct > drm_i915_private *dev_priv, > u32 val; > > /* WaTempDisableDOPClkGating:bdw */ > - misccpctl = intel_uncore_read(&dev_priv->uncore, GEN7_MISCCPCTL); > - intel_u
Re: [Intel-gfx] [PATCH 05/12] drm/i915/xehp: Check for faults on primary GAM
On 19.09.2022 15:32, Matt Roper wrote: > On Xe_HP the fault registers are now in a multicast register range. > However as part of the GAM these registers follow special rules and we > need only read from the "primary" GAM's instance to get the information > we need. So a single intel_gt_mcr_read_any() (which will automatically > steer to the primary GAM) is sufficient; we don't need to loop over each > instance of the MCR register. > > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_gt.c | 40 - > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 3 ++ > 2 files changed, 42 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c > b/drivers/gpu/drm/i915/gt/intel_gt.c > index 5ddae95d4886..1cb7dd40ec47 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > @@ -304,6 +304,42 @@ static void gen6_check_faults(struct intel_gt *gt) > } > } > > +static void xehp_check_faults(struct intel_gt *gt) > +{ > + u32 fault; > + > + /* > + * Although the fault register now lives in an MCR register range, > + * the GAM registers are special and we only truly need to read > + * the "primary" GAM instance rather than handling each instance > + * individually. intel_gt_mcr_read_any() will automatically steer > + * toward the primary instance. > + */ > + fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); > + if (fault & RING_FAULT_VALID) { > + u32 fault_data0, fault_data1; > + u64 fault_addr; > + > + fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0); > + fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1); > + > + fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) | > + ((u64)fault_data0 << 12); > + > + drm_dbg(&gt->i915->drm, "Unexpected fault\n" > + "\tAddr: 0x%08x_%08x\n" > + "\tAddress space: %s\n" > + "\tEngine ID: %d\n" > + "\tSource ID: %d\n" > + "\tType: %d\n", > + upper_32_bits(fault_addr), lower_32_bits(fault_addr), > + fault_data1 & FAULT_GTT_SEL ? 
"GGTT" : "PPGTT", > + GEN8_RING_FAULT_ENGINE_ID(fault), > + RING_FAULT_SRCID(fault), > + RING_FAULT_FAULT_TYPE(fault)); > + } > +} > + > static void gen8_check_faults(struct intel_gt *gt) > { > struct intel_uncore *uncore = gt->uncore; > @@ -350,7 +386,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt) > struct drm_i915_private *i915 = gt->i915; > > /* From GEN8 onwards we only have one 'All Engine Fault Register' */ > - if (GRAPHICS_VER(i915) >= 8) > + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) > + xehp_check_faults(gt); > + else if (GRAPHICS_VER(i915) >= 8) > gen8_check_faults(gt); > else if (GRAPHICS_VER(i915) >= 6) > gen6_check_faults(gt); > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > index cf87a1b36a21..dff38b0c4430 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > @@ -1024,11 +1024,14 @@ > #define GEN9_BLT_MOCS(i) _MMIO(__GEN9_BCS0_MOCS0 + (i) * > 4) > > #define GEN12_FAULT_TLB_DATA0 _MMIO(0xceb8) > +#define XEHP_FAULT_TLB_DATA0 _MMIO(0xceb8) > #define GEN12_FAULT_TLB_DATA1 _MMIO(0xcebc) > +#define XEHP_FAULT_TLB_DATA1 _MMIO(0xcebc) > #define FAULT_VA_HIGH_BITS (0xf << 0) > #define FAULT_GTT_SEL (1 << 4) > > #define GEN12_RING_FAULT_REG _MMIO(0xcec4) > +#define XEHP_RING_FAULT_REG _MMIO(0xcec4) The fault registers GEN12_FAULT_TLB_DATA0, GEN12_FAULT_TLB_DATA1 and GEN12_RING_FAULT_REG are used in a few other places in the driver for platforms including Xe_HP. Don't we need to take care of those as well? Regards, Bala > #define GEN8_RING_FAULT_ENGINE_ID(x) (((x) >> 12) & 0x7) > #define RING_FAULT_GTTSEL_MASK (1 << 11) > #define RING_FAULT_SRCID(x) (((x) >> 3) & 0xff) > -- > 2.37.3 >
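As an aside for readers of this thread, the address reconstruction in xehp_check_faults() can be checked on its own. The following is a minimal sketch that copies the two masks from the quoted hunk (`FAULT_VA_HIGH_BITS`, `FAULT_GTT_SEL`); the helper names `fault_address` and `fault_space` are hypothetical and do not exist in the driver.

```c
#include <stdint.h>

/* Masks as quoted from intel_gt_regs.h in the patch. */
#define FAULT_VA_HIGH_BITS	(0xfu << 0)
#define FAULT_GTT_SEL		(1u << 4)

/*
 * Rebuild the faulting address from the two TLB data registers:
 * DATA0 carries address bits 12..43, the low nibble of DATA1 carries
 * bits 44..47. The low 12 bits (page offset) are not reported.
 */
static uint64_t fault_address(uint32_t data0, uint32_t data1)
{
	return ((uint64_t)(data1 & FAULT_VA_HIGH_BITS) << 44) |
	       ((uint64_t)data0 << 12);
}

/* Bit 4 of DATA1 selects the address space, as in the debug message. */
static const char *fault_space(uint32_t data1)
{
	return (data1 & FAULT_GTT_SEL) ? "GGTT" : "PPGTT";
}
```

Both casts to a 64-bit type happen before the shifts, which is what keeps the upper address bits from being truncated.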
[Intel-gfx] [PATCH v3 0/6] drm/i915/display: Don't use port enum as register offset
Prior to display version 12, platforms had DDI ports A,B,C,D,E,F represented by enums PORT_A,PORT_B...PORT_F. The DDI register offsets of the ports were in the same order as the ports. So the port enums were directly used as an index to calculate the register offset of the ports. Starting in display version 12, TypeC ports were introduced in the platforms. These were defined as new enums PORT_TC1,PORT_TC2... The later generation platforms had DDI register offsets of TypeC and non-TypeC ports interleaved and the existing port enums didn't match the order of the DDI register offsets. So the enums could no longer be used as an index to calculate the register offset. This led to the creation of new platform-specific enums for the ports like PORT_D_XELPD, PORT_E_XELPD to match the index of the ports in those platforms and additional code to handle the special enums. So we want to make the port enums not tied to the DDI register offset and use an index from somewhere else to calculate the register offsets. The index of the DDI ports in the platform is now defined as part of the device info. The series includes a few patches at the end which do some cleanup and fixes made possible by having unique enums for the ports. 
Cc: Jani Nikula Cc: Ville Syrjälä Balasubramani Vivekanandan (6): drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro drm/i915/display: Define the DDI port indices inside device info drm/i915/display: Free port enums from tied to register offset drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific definitions drm/i915/display: Fix port_identifier function drm/i915/display: cleanup unused DDI port enums drivers/gpu/drm/i915/display/icl_dsi.c| 12 ++-- drivers/gpu/drm/i915/display/intel_bios.c | 7 +-- drivers/gpu/drm/i915/display/intel_ddi.c | 63 +++ drivers/gpu/drm/i915/display/intel_display.c | 12 ++-- drivers/gpu/drm/i915/display/intel_display.h | 29 + .../drm/i915/display/intel_display_power.c| 40 +--- drivers/gpu/drm/i915/display/intel_fdi.c | 14 ++--- drivers/gpu/drm/i915/display/intel_tc.c | 6 +- drivers/gpu/drm/i915/gvt/display.c| 30 - drivers/gpu/drm/i915/gvt/handlers.c | 17 ++--- drivers/gpu/drm/i915/i915_pci.c | 46 +- drivers/gpu/drm/i915/i915_reg.h | 4 +- drivers/gpu/drm/i915/intel_device_info.h | 1 + drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 +-- include/drm/i915_component.h | 2 +- 15 files changed, 140 insertions(+), 153 deletions(-) -- 2.34.1
[Intel-gfx] [PATCH v3 1/6] drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
This is a prep patch for a series in which the register offsets of the DDI ports are calculated not from the port enums but from a separate data structure that is part of the device info. So the device info is passed as a parameter to the DDI_BUF_CTL macro, but it is not used yet. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/icl_dsi.c | 12 +++--- drivers/gpu/drm/i915/display/intel_ddi.c | 39 +++- drivers/gpu/drm/i915/display/intel_display.c | 6 ++- drivers/gpu/drm/i915/display/intel_fdi.c | 14 +++ drivers/gpu/drm/i915/display/intel_tc.c | 6 +-- drivers/gpu/drm/i915/gvt/display.c | 30 +++ drivers/gpu/drm/i915/gvt/handlers.c | 17 + drivers/gpu/drm/i915/i915_reg.h | 6 ++- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 ++--- 9 files changed, 76 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c index ed4d93942dbd..70098b67149b 100644 --- a/drivers/gpu/drm/i915/display/icl_dsi.c +++ b/drivers/gpu/drm/i915/display/icl_dsi.c @@ -548,11 +548,11 @@ static void gen11_dsi_enable_ddi_buffer(struct intel_encoder *encoder) enum port port; for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp |= DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 500)) drm_err(&dev_priv->drm, "DDI port:%c buffer idle\n", @@ -1400,11 +1400,11 @@ static void gen11_dsi_disable_port(struct intel_encoder *encoder) gen11_dsi_ungate_clocks(encoder); for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp &= ~DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, 
DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 643832d55c28..aae429bd2e2b 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,7 +172,7 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, return; } - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get idle\n", port_name(port)); @@ -189,7 +189,7 @@ static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, return; } - ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), IS_DG2(dev_priv) ? 
1200 : 500, 10, 10); if (ret) @@ -730,7 +730,7 @@ static void intel_ddi_get_encoder_pipes(struct intel_encoder *encoder, if (!wakeref) return; - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); if (!(tmp & DDI_BUF_CTL_ENABLE)) goto out; @@ -1397,8 +1397,8 @@ hsw_set_signal_levels(struct intel_encoder *encoder, intel_dp->DP &= ~DDI_BUF_EMP_MASK; intel_dp->DP |= signal_levels; - intel_de_write(dev_priv, DDI_BUF_CTL(port), intel_dp->DP); - intel_de_posting_read(dev_priv, DDI_BUF_CTL(port)); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), intel_dp->DP); + intel_de_posting_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); } static void _icl_ddi_enable_clock(struct drm_i915_private *i915, i915_reg_t reg, @@ -2577,10 +2577,10 @@ static void intel_disable_ddi_buf(struct intel_encoder *encoder, bool wait = false; u32 val; - val = intel_de_read(dev_priv,
[Intel-gfx] [PATCH v3 2/6] drm/i915/display: Define the DDI port indices inside device info
Prior to display version 12, platforms had DDI ports A,B,C,D,E,F represented by enums PORT_A,PORT_B...PORT_F. The DDI register offsets of the ports were in the same order as the ports. So the port enums were directly used as an index to calculate the register offset of the ports. Starting in display version 12, TypeC ports were introduced in the platforms. These were defined as new enums PORT_TC1,PORT_TC2... The later generation platforms had DDI register offsets of TypeC and non-TypeC ports interleaved and the existing port enums didn't match the order of the DDI register offsets. So the enums could no longer be used as an index to calculate the register offset. This led to the creation of new platform-specific enums for the ports like PORT_D_XELPD, PORT_E_XELPD to match the index of the ports in those platforms and additional code to handle the special enums. So we want to make the port enums not tied to the DDI register offset and use an index from somewhere else to calculate the register offsets. The index of the DDI ports in the platform is now defined as part of the device info. This patch just adds the indices to the device info. Later patches in the series use that index for offset calculation. 
Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/i915_pci.c | 46 ++-- drivers/gpu/drm/i915/intel_device_info.h | 1 + 2 files changed, 44 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index cace897e1db1..e7eb7c0ea7fd 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -130,6 +130,42 @@ [PIPE_D] = TGL_CURSOR_D_OFFSET, \ } +#define GEN9_DDI_INDEX \ + .display.ddi_index = { \ + [PORT_A] = 0, \ + [PORT_B] = 1, \ + [PORT_C] = 2, \ + [PORT_D] = 3, \ + [PORT_E] = 4, \ + [PORT_F] = 5, \ + } + +#define GEN12_DDI_INDEX \ + .display.ddi_index = { \ + [PORT_A] = 0, \ + [PORT_B] = 1, \ + [PORT_C] = 2, \ + [PORT_TC1] = 3, \ + [PORT_TC2] = 4, \ + [PORT_TC3] = 5, \ + [PORT_TC4] = 6, \ + [PORT_TC5] = 7, \ + [PORT_TC6] = 8, \ + } + +#define XE_LPD_DDI_INDEX \ + .display.ddi_index = { \ + [PORT_A] = 0, \ + [PORT_B] = 1, \ + [PORT_C] = 2, \ + [PORT_TC1] = 3, \ + [PORT_TC2] = 4, \ + [PORT_TC3] = 5, \ + [PORT_TC4] = 6, \ + [PORT_D_XELPD] = 7, \ + [PORT_E_XELPD] = 8, \ + } + #define I9XX_COLORS \ .display.color = { .gamma_lut_size = 256 } #define I965_COLORS \ @@ -664,7 +700,8 @@ static const struct intel_device_info chv_info = { .display.has_psr = 1, \ .display.has_psr_hw_tracking = 1, \ .display.dbuf.size = 896 - 4, /* 4 blocks for bypass path allocation */ \ - .display.dbuf.slice_mask = BIT(DBUF_S1) + .display.dbuf.slice_mask = BIT(DBUF_S1), \ + GEN9_DDI_INDEX #define SKL_PLATFORM \ GEN9_FEATURES, \ @@ -732,7 +769,8 @@ static const struct intel_device_info skl_gt4_info = { IVB_CURSOR_OFFSETS, \ IVB_COLORS, \ GEN9_DEFAULT_PAGE_SIZES, \ - GEN_DEFAULT_REGIONS + GEN_DEFAULT_REGIONS, \ + GEN9_DDI_INDEX static const struct intel_device_info bxt_info = { GEN9_LP_FEATURES, @@ -886,6 +924,7 @@ static const struct intel_device_info jsl_info = { [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \ }, \ TGL_CURSOR_OFFSETS, \ + GEN12_DDI_INDEX, \ .has_global_mocs = 1, \ .has_pxp = 1, \ 
.display.has_dsb = 0 /* FIXME: LUT load is broken with DSB */ @@ -983,7 +1022,8 @@ static const struct intel_device_info adl_s_info = { [TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \ [TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \ }, \ - TGL_CURSOR_OFFSETS + TGL_CURSOR_OFFSETS, \ + XE_LPD_DDI_INDEX static const struct intel_device_info adl_p_info = { GEN12_FEATURES, diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index bc87d3156b14..a93f54990a01 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -292,6 +292,7 @@ struct intel_device_info { u32 pipe_offsets[I915_MAX_TRANSCODERS]; u32 trans_offsets[I915_MAX_TRANSCODERS]; u32 cursor_offsets[I915_MAX_PIPES]; + u32 ddi_index[I915_MAX_PORTS]; struct { u32 degamma_lut_size; -- 2.34.1
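To illustrate the idea of this patch outside the driver, here is a minimal standalone sketch of a per-platform index table built with designated initializers. The index values are the ones from GEN12_DDI_INDEX and XE_LPD_DDI_INDEX above; `struct display_info_sketch`, `MAX_PORTS_SKETCH`, `ddi_index()` and the de-aliased enum layout are assumptions made for the sketch and are not the driver's definitions.

```c
#include <stdint.h>

/* Port enum with unique values, independent of any register layout. */
enum port {
	PORT_A, PORT_B, PORT_C, PORT_D, PORT_E, PORT_F,
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6,
	PORT_D_XELPD, PORT_E_XELPD,
	MAX_PORTS_SKETCH	/* hypothetical array bound for this sketch */
};

/* Hypothetical subset of the device info: one DDI index per port. */
struct display_info_sketch {
	uint32_t ddi_index[MAX_PORTS_SKETCH];
};

/* Display version 12 layout: TypeC ports directly follow A, B, C. */
static const struct display_info_sketch gen12_sketch = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5,
		[PORT_TC4] = 6, [PORT_TC5] = 7, [PORT_TC6] = 8,
	},
};

/* XE_LPD layout: ports D and E sit after the four TypeC ports. */
static const struct display_info_sketch xe_lpd_sketch = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5,
		[PORT_TC4] = 6, [PORT_D_XELPD] = 7, [PORT_E_XELPD] = 8,
	},
};

/* Look up the register index of a port on a given platform. */
static uint32_t ddi_index(const struct display_info_sketch *info, enum port port)
{
	return info->ddi_index[port];
}
```

One property of this layout worth keeping in mind: array elements that a platform does not list are zero-initialized in C, so a port missing from a table silently maps to index 0, the same index as PORT_A.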
[Intel-gfx] [PATCH v3 3/6] drm/i915/display: Free port enums from tied to register offset
With the index required for DDI register offset calculation available in the device info, the DDI_BUF_CTL macro is updated to make use of it. Any new macros to access the DDI registers should follow the same procedure. This frees the port enums from being tied to the register offset of the DDI registers. We can remove all the enum aliases and clean up the enum definitions. The key target of the patch series, removing platform-specific port definitions like PORT_D_XELPD and PORT_E_XELPD, is not yet covered here. The definitions are still retained and will be handled in the following patch. Removed a WARN_ON as it is no longer valid. The WARN was added in the commit "327f8d8c336d drm/i915: simplify setting of ddi_io_power_domain". The ddi_io_power_domain calculation has changed completely since that commit and doesn't need this WARN_ON anymore. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_ddi.c | 1 - drivers/gpu/drm/i915/display/intel_display.h | 8 +++- drivers/gpu/drm/i915/i915_reg.h | 6 ++ include/drm/i915_component.h | 2 +- 4 files changed, 6 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index aae429bd2e2b..00ac683ef96b 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -4474,7 +4474,6 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->update_complete = intel_ddi_update_complete; } - drm_WARN_ON(&dev_priv->drm, port > PORT_I); dig_port->ddi_io_power_domain = intel_display_power_ddi_io_domain(dev_priv, port); if (init_dp) { diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index 884e8e67b17c..aa5ded6b513c 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -212,18 +212,16 @@ enum port { PORT_H, PORT_I, - /* tgl+ */ - PORT_TC1 = PORT_D, + /* Non-TypeC ports must be 
defined above */ + PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6, - /* XE_LPD repositions D/E offsets and bitfields */ - PORT_D_XELPD = PORT_TC5, + PORT_D_XELPD, PORT_E_XELPD, - I915_MAX_PORTS }; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 67f3b17b2360..12a6fe7ee010 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -170,6 +170,7 @@ #define _MMIO_CURSOR2(pipe, reg) _MMIO(INTEL_INFO(dev_priv)->display.cursor_offsets[(pipe)] - \ INTEL_INFO(dev_priv)->display.cursor_offsets[PIPE_A] + \ DISPLAY_MMIO_BASE(dev_priv) + (reg)) +#define _MMIO_DDI(i915, port, a, b) _MMIO_PORT(INTEL_INFO(i915)->display.ddi_index[port], a, b) #define __MASKED_FIELD(mask, value) ((mask) << 16 | (value)) #define _MASKED_FIELD(mask, value) ({ \ @@ -6936,10 +6937,7 @@ enum skl_power_gate { /* DDI Buffer Control */ #define _DDI_BUF_CTL_A 0x64000 #define _DDI_BUF_CTL_B 0x64100 -#define DDI_BUF_CTL(i915, port) ({ \ - (void)i915; /* Suppress unused variable warning */ \ - _MMIO_PORT(port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B); \ -}) +#define DDI_BUF_CTL(i915, port) _MMIO_DDI(i915, port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B) #define DDI_BUF_CTL_ENABLE (1 << 31) #define DDI_BUF_TRANS_SELECT(n) ((n) << 24) diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h index c1e2a43d2d1e..f95ff82c3b4a 100644 --- a/include/drm/i915_component.h +++ b/include/drm/i915_component.h @@ -35,7 +35,7 @@ enum i915_component_type { /* MAX_PORT is the number of port * It must be sync with I915_MAX_PORTS defined i915_drv.h */ -#define MAX_PORTS 9 +#define MAX_PORTS 17 /** * struct i915_audio_component - Used for direct communication between i915 and hda drivers -- 2.34.1
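The offset calculation that _MMIO_DDI() enables can be sketched as plain arithmetic. This sketch assumes that _MMIO_PORT() picks evenly spaced register instances from two base addresses (first instance plus index times stride); that behaviour is not shown in the patch itself. The two base values are the ones quoted in the hunk, while `pick_even()` and `ddi_buf_ctl_offset()` are hypothetical helper names.

```c
#include <stdint.h>

#define DDI_BUF_CTL_A	0x64000u	/* _DDI_BUF_CTL_A in the patch */
#define DDI_BUF_CTL_B	0x64100u	/* _DDI_BUF_CTL_B in the patch */

/* Evenly spaced register instances: first + index * (second - first). */
static uint32_t pick_even(uint32_t index, uint32_t a, uint32_t b)
{
	return a + index * (b - a);
}

/*
 * The port value is only used to look up the platform's DDI index;
 * the index, not the enum value, decides the register offset.
 */
static uint32_t ddi_buf_ctl_offset(const uint32_t *ddi_index, unsigned int port)
{
	return pick_even(ddi_index[port], DDI_BUF_CTL_A, DDI_BUF_CTL_B);
}
```

With a table such as `{0, 1, 2, 7, 8}`, the fourth entry resolves to 0x64700 even though its position in the table is 3, which is the decoupling the commit message describes.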
[Intel-gfx] [PATCH v3 5/6] drm/i915/display: Fix port_identifier function
The port_identifier function was broken while the TypeC ports were using enum aliases: it would return the wrong string for TypeC ports. Now that the DDI ports have unique enums, fix port_identifier to cover all ports. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_display.h | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index e3aa8080b79f..e0d5a9e569d8 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -250,6 +250,18 @@ static inline const char *port_identifier(enum port port) return "Port H"; case PORT_I: return "Port I"; + case PORT_TC1: + return "Port TC1"; + case PORT_TC2: + return "Port TC2"; + case PORT_TC3: + return "Port TC3"; + case PORT_TC4: + return "Port TC4"; + case PORT_TC5: + return "Port TC5"; + case PORT_TC6: + return "Port TC6"; default: return ""; } -- 2.34.1
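Why the aliases made this function unfixable can be seen in a reduced example. The following is a minimal sketch with a hypothetical four-entry enum (`enum port_sketch`) and a hypothetical helper `port_name_sketch()`; it is not the driver's port_identifier().

```c
/* Hypothetical reduced enum with unique values for every port. */
enum port_sketch {
	SK_PORT_A,
	SK_PORT_D,
	SK_PORT_TC1,	/* with an alias this would be "= SK_PORT_D" */
	SK_PORT_TC2
};

/*
 * With unique values every port can have its own case label. If
 * SK_PORT_TC1 were an alias of SK_PORT_D, the two labels below would be
 * duplicate case values, which C rejects, so only one name per value
 * could ever be returned.
 */
static const char *port_name_sketch(enum port_sketch port)
{
	switch (port) {
	case SK_PORT_A:
		return "Port A";
	case SK_PORT_D:
		return "Port D";
	case SK_PORT_TC1:
		return "Port TC1";
	case SK_PORT_TC2:
		return "Port TC2";
	default:
		return "";
	}
}
```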
[Intel-gfx] [PATCH v3 6/6] drm/i915/display: cleanup unused DDI port enums
DDI port enums PORT_G/H/I were added in the commit "6c8337dafaa9 drm/i915/tgl: Add additional ports for Tiger Lake" to identify new ports added in the platform. In subsequent commits those ports were identified by the new enums PORT_TC1/TC2/TC3.. to differentiate TypeC ports from non-TypeC ports. However, the enum definitions PORT_G/H/I and a few usages of these enums were left in place. These enums are unused as of today and can be removed. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_bios.c| 3 --- drivers/gpu/drm/i915/display/intel_display.h | 9 - include/drm/i915_component.h | 2 +- 3 files changed, 1 insertion(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index ab472fa757d8..b0dfb37e402a 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/display/intel_bios.c @@ -2404,9 +2404,6 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915, [PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 }, [PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, DVO_PORT_CRT }, [PORT_F] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 }, - [PORT_G] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 }, - [PORT_H] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 }, - [PORT_I] = { DVO_PORT_HDMII, DVO_PORT_DPI, -1 }, }; /* * RKL VBT uses PHY based mapping. 
Combo PHYs A,B,C,D diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index e0d5a9e569d8..1abc5da650f6 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -208,9 +208,6 @@ enum port { PORT_D, PORT_E, PORT_F, - PORT_G, - PORT_H, - PORT_I, /* Non-TypeC ports must be defined above */ PORT_TC1, @@ -244,12 +241,6 @@ static inline const char *port_identifier(enum port port) return "Port E"; case PORT_F: return "Port F"; - case PORT_G: - return "Port G"; - case PORT_H: - return "Port H"; - case PORT_I: - return "Port I"; case PORT_TC1: return "Port TC1"; case PORT_TC2: diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h index 4b31bab5533a..335822d6960a 100644 --- a/include/drm/i915_component.h +++ b/include/drm/i915_component.h @@ -35,7 +35,7 @@ enum i915_component_type { /* MAX_PORT is the number of port * It must be sync with I915_MAX_PORTS defined i915_drv.h */ -#define MAX_PORTS 15 +#define MAX_PORTS 12 /** * struct i915_audio_component - Used for direct communication between i915 and hda drivers -- 2.34.1
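As a rough illustration of what the cleaned-up enum buys, the sketch below
shows the two properties the patch relies on: the port count drops to 12
(six non-TypeC ports plus six TypeC ports), and a TypeC port can be
recognised purely by its position in the enum. This is a standalone sketch,
not driver code: port_is_typec() is a hypothetical helper name, and the
compile-time check is just one possible way to keep MAX_PORTS and
I915_MAX_PORTS in sync.

```c
#include <stdbool.h>

/* Port enum as it looks once PORT_G/H/I are gone (sketch, values unique). */
enum port {
	PORT_A = 0,
	PORT_B,
	PORT_C,
	PORT_D,
	PORT_E,
	PORT_F,

	/* Non-TypeC ports must be defined above */
	PORT_TC1,
	PORT_TC2,
	PORT_TC3,
	PORT_TC4,
	PORT_TC5,
	PORT_TC6,

	I915_MAX_PORTS
};

/* Value from include/drm/i915_component.h after this patch. */
#define MAX_PORTS 12

/* Assumption of this sketch: catch a mismatch at build time. */
_Static_assert(I915_MAX_PORTS == MAX_PORTS,
	       "MAX_PORTS must stay in sync with I915_MAX_PORTS");

/* Hypothetical helper: TypeC ports are exactly those at or after PORT_TC1. */
static inline bool port_is_typec(enum port port)
{
	return port >= PORT_TC1 && port < I915_MAX_PORTS;
}
```

The ordering check only works because no alias remains in the enum, so
keeping the "Non-TypeC ports must be defined above" comment honest matters.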
[Intel-gfx] [PATCH v3 4/6] drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific definitions
Port enums are no longer used in the DDI register offset calculation, so the
platform-specific port redefinitions can be removed. With them goes the code
that was required to handle these special definitions.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_bios.c     |  4 +-
 drivers/gpu/drm/i915/display/intel_ddi.c      | 23 +--
 drivers/gpu/drm/i915/display/intel_display.c  |  6 +--
 drivers/gpu/drm/i915/display/intel_display.h  |  2 -
 .../drm/i915/display/intel_display_power.c    | 40 +--
 drivers/gpu/drm/i915/i915_pci.c               |  4 +-
 include/drm/i915_component.h                  |  2 +-
 7 files changed, 10 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 4c543e8205ca..ab472fa757d8 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2436,8 +2436,8 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915,
 		[PORT_A] = { DVO_PORT_HDMIA, DVO_PORT_DPA, -1 },
 		[PORT_B] = { DVO_PORT_HDMIB, DVO_PORT_DPB, -1 },
 		[PORT_C] = { DVO_PORT_HDMIC, DVO_PORT_DPC, -1 },
-		[PORT_D_XELPD] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
-		[PORT_E_XELPD] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 },
+		[PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
+		[PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 },
 		[PORT_TC1] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 },
 		[PORT_TC2] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 },
 		[PORT_TC3] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 },
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 00ac683ef96b..73ef6e97c446 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4115,17 +4115,6 @@ static bool hti_uses_phy(struct drm_i915_private *i915, enum phy phy)
 		i915->hti_state & HDPORT_DDI_USED(phy);
 }
 
-static enum hpd_pin xelpd_hpd_pin(struct drm_i915_private *dev_priv,
-				  enum port port)
-{
-	if (port >= PORT_D_XELPD)
-		return HPD_PORT_D + port - PORT_D_XELPD;
-	else if (port >= PORT_TC1)
-		return HPD_PORT_TC1 + port - PORT_TC1;
-	else
-		return HPD_PORT_A + port - PORT_A;
-}
-
 static enum hpd_pin dg1_hpd_pin(struct drm_i915_private *dev_priv,
 				enum port port)
 {
@@ -4294,13 +4283,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 	encoder = &dig_port->base;
 	encoder->devdata = devdata;
 
-	if (DISPLAY_VER(dev_priv) >= 13 && port >= PORT_D_XELPD) {
-		drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs,
-				 DRM_MODE_ENCODER_TMDS,
-				 "DDI %c/PHY %c",
-				 port_name(port - PORT_D_XELPD + PORT_D),
-				 phy_name(phy));
-	} else if (DISPLAY_VER(dev_priv) >= 12) {
+	if (DISPLAY_VER(dev_priv) >= 12) {
 		enum tc_port tc_port = intel_port_to_tc(dev_priv, port);
 
 		drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs,
@@ -4430,9 +4413,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 
 	intel_ddi_buf_trans_init(encoder);
 
-	if (DISPLAY_VER(dev_priv) >= 13)
-		encoder->hpd_pin = xelpd_hpd_pin(dev_priv, port);
-	else if (IS_DG1(dev_priv))
+	if (IS_DG1(dev_priv))
 		encoder->hpd_pin = dg1_hpd_pin(dev_priv, port);
 	else if (IS_ROCKETLAKE(dev_priv))
 		encoder->hpd_pin = rkl_hpd_pin(dev_priv, port);
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 194a4758ee04..caf81f4b7f2a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -2135,9 +2135,7 @@ bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy)
 
 enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port)
 {
-	if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD)
-		return PHY_D + port - PORT_D_XELPD;
-	else if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1)
+	if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1)
 		return PHY_F + port - PORT_TC1;
 	else if (IS_ALDERLAKE_S(i915) && port >= PORT_TC1)
 		return PHY_B + port - PORT_TC1;
@@ -7903,7 +7901,7 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 		intel_ddi_init(dev_priv, PORT_A);
Re: [Intel-gfx] [PATCH v2] drm/i915/display: Don't use port enum as register offset
On 23.09.2022 12:52, Jani Nikula wrote: > On Wed, 21 Sep 2022, Balasubramani Vivekanandan > wrote: > > Display DDI ports are enumerated as PORT_A,PORT_B... . The enums are > > also used as an index to access the DDI_BUF_CTL register for the port. > > > > With the introduction of TypeC ports, new enums PORT_TC1,PORT_TC2.. were > > added starting from enum value 4 to match the index position of the > > DDI_BUF_CTL register of those ports. Because those early platforms had > > only 3 non-TypeC ports PORT_A,PORT_B, PORT_C followed by TypeC ports. > > So the enums PORT_D,PORT_E.. and PORT_TC1,PORT_TC2.. used the same enum > > values. > > > > Driver also used the condition `if (port > PORT_TC1)` to identify if a > > port is a TypeC port or non-TypeC. > > > > From XELPD, additional non-TypeC ports were added in the platform > > calling them as PORT D, PORT E and the DDI registers for those ports > > were positioned after TypeC ports. So the enums PORT_D and PORT_E can't > > be used as their values do not match with register position. It led to > > creating new enums PORT_D_XELPD, PORT_E_XELPD for ports D and E. > > > > The condition `if (port > PORT_TC1)` was no more valid for XELPD to > > identify a TypeC port. Also it led to many additional special checks for > > ports PORT_D_XELPD/PORT_E_XELPD. > > > > With new platforms indicating that the DDI register positions of ports > > can vary across platforms it makes no more feasible to maintain the port > > enum values to match the DDI register position. > > > > Port DDI register position is now maintained in a separate datastructure > > part of the platform device info and ports are enumerated independently. > > With enums for TypeC ports defined at the bottom, driver can easily > > identify the TypeC ports. > > > > Removed a WARN_ON as it is no longer valid. 
The WARN was added in > > commit - "327f8d8c336d drm/i915: simplify setting of ddi_io_power_domain" > > The ddi_io_power_domain calculation has changed completely since the > > commit and doesn't need this WARN_ON anymore. > > > > Signed-off-by: Balasubramani Vivekanandan > > > > I agree with the premise that defining platform specific port enums such > as PORT_D_XELPD to tackle differences in register offsets is handling > the problem at the wrong abstraction level. > > I am not (at least not yet) convinced with the approach of adding > platform specific mappings in .display.ddi_offsets. The main problem I > have with that is adding yet another way to deal with different register > offsets. We already have many, and adding a new one isn't appealing. > > Not that this *is* different from .display.pipe_offsets and > .display.trans_offsets which are actual *offsets*. The solution here is > actually misnamed; it's about indexes, not offsets. > > Finally, even if we were to choose this approach, this should be split > to at least three separate patches. First, pass i915 to the register > macro, no other changes, totally non-functional. Second, use the > indexes. Third, remove PORT_D_XELPD etc. > > I'm still considering alternatives. In the mean time, please find some > random comments on the details inline. Thanks for the comments. I have floated a new revision of the series after addressing your review comments. > > BR, > Jani. 
> > > --- > > drivers/gpu/drm/i915/display/icl_dsi.c| 12 ++-- > > drivers/gpu/drm/i915/display/intel_bios.c | 4 +- > > drivers/gpu/drm/i915/display/intel_ddi.c | 63 +++--- > > drivers/gpu/drm/i915/display/intel_display.c | 12 ++-- > > drivers/gpu/drm/i915/display/intel_display.h | 8 +-- > > .../drm/i915/display/intel_display_power.c| 40 +-- > > drivers/gpu/drm/i915/display/intel_fdi.c | 14 ++-- > > drivers/gpu/drm/i915/display/intel_tc.c | 6 +- > > drivers/gpu/drm/i915/gvt/display.c| 30 - > > drivers/gpu/drm/i915/gvt/handlers.c | 17 +++-- > > drivers/gpu/drm/i915/i915_pci.c | 66 --- > > drivers/gpu/drm/i915/i915_reg.h | 8 ++- > > drivers/gpu/drm/i915/intel_device_info.h | 1 + > > drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 +-- > > include/drm/i915_component.h | 2 +- > > 15 files changed, 144 insertions(+), 149 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c > > b/drivers/gpu/drm/i915/display/icl_dsi.c > > index ed4d93942dbd..70098b67149b 100644 > > --- a/drivers/gpu/drm/i915/display/icl
Re: [Intel-gfx] [PATCH v2] drm/i915/display: Don't use port enum as register offset
On 23.09.2022 13:18, Ville Syrjälä wrote: > On Fri, Sep 23, 2022 at 12:52:48PM +0300, Jani Nikula wrote: > > On Wed, 21 Sep 2022, Balasubramani Vivekanandan > > wrote: > > > Display DDI ports are enumerated as PORT_A,PORT_B... . The enums are > > > also used as an index to access the DDI_BUF_CTL register for the port. > > > > > > With the introduction of TypeC ports, new enums PORT_TC1,PORT_TC2.. were > > > added starting from enum value 4 to match the index position of the > > > DDI_BUF_CTL register of those ports. Because those early platforms had > > > only 3 non-TypeC ports PORT_A,PORT_B, PORT_C followed by TypeC ports. > > > So the enums PORT_D,PORT_E.. and PORT_TC1,PORT_TC2.. used the same enum > > > values. > > > > > > Driver also used the condition `if (port > PORT_TC1)` to identify if a > > > port is a TypeC port or non-TypeC. > > > > > > From XELPD, additional non-TypeC ports were added in the platform > > > calling them as PORT D, PORT E and the DDI registers for those ports > > > were positioned after TypeC ports. So the enums PORT_D and PORT_E can't > > > be used as their values do not match with register position. It led to > > > creating new enums PORT_D_XELPD, PORT_E_XELPD for ports D and E. > > > > > > The condition `if (port > PORT_TC1)` was no more valid for XELPD to > > > identify a TypeC port. Also it led to many additional special checks for > > > ports PORT_D_XELPD/PORT_E_XELPD. > > > > > > With new platforms indicating that the DDI register positions of ports > > > can vary across platforms it makes no more feasible to maintain the port > > > enum values to match the DDI register position. > > > > > > Port DDI register position is now maintained in a separate datastructure > > > part of the platform device info and ports are enumerated independently. > > > With enums for TypeC ports defined at the bottom, driver can easily > > > identify the TypeC ports. > > > > > > Removed a WARN_ON as it is no longer valid. 
The WARN was added in > > > commit - "327f8d8c336d drm/i915: simplify setting of ddi_io_power_domain" > > > The ddi_io_power_domain calculation has changed completely since the > > > commit and doesn't need this WARN_ON anymore. > > > > > > Signed-off-by: Balasubramani Vivekanandan > > > > > > > I agree with the premise that defining platform specific port enums such > > as PORT_D_XELPD to tackle differences in register offsets is handling > > the problem at the wrong abstraction level. > > > > I am not (at least not yet) convinced with the approach of adding > > platform specific mappings in .display.ddi_offsets. The main problem I > > have with that is adding yet another way to deal with different register > > offsets. We already have many, and adding a new one isn't appealing. > > > > Not that this *is* different from .display.pipe_offsets and > > .display.trans_offsets which are actual *offsets*. The solution here is > > actually misnamed; it's about indexes, not offsets. > > > > Finally, even if we were to choose this approach, this should be split > > to at least three separate patches. First, pass i915 to the register > > macro, no other changes, totally non-functional. Second, use the > > indexes. Third, remove PORT_D_XELPD etc. > > > > I'm still considering alternatives. In the mean time, please find some > > random comments on the details inline. > > One of the earlier alternatives proposed was some kind of declarative > struct to describe each port, which would include separate indexes needed > for different things (among information on the type of DDI/PHY/etc.) > I think there was some attempt at something like that, but IIRC it > tried to do a bunch of other stuff too so it got bikeshedded to death. > > I guess one key question is: Do we need to freestanding DDI/AUX/etc. > register accesses or can we assume the encoder struct is always there? 
> That would dictate whether we need any magic in the register macros at > all, or whether we can just trust the caller to pass in the right > index. Wouldn't it be a big restriction to say it wouldn't be possible to read the DDI registers of a port if it has no encoder struct associated with it? AFAIU from the driver, encoder struct would be created for a port, if it is enabled in the VBT. Can we then assume that if a port is not enabled in VBT, there would be never a need to read its registers? > > Oh, and the other key question i
Re: [Intel-gfx] [PATCH 01/12] drm/i915/gen8: Create separate reg definitions for new MCR registers
On 19.09.2022 15:32, Matt Roper wrote: > Gen8 was the first time our hardware had multicast registers (or at > least the first time the multicast nature was exposed and MMIO accesses > could be steered). There are some registers that transitioned from > singleton behavior to multicast during the gen7 -> gen8 transition; > let's duplicate the register definitions for those registers in > preparation for upcoming patches that will handle MCR registers in a > special manner. > > The registers adjusted are: > * MISCCPCTL > * SAMPLER_INSTDONE > * ROW_INSTDONE > * ROW_CHICKEN2 > * HALF_SLICE_CHICKEN1 > * HALF_SLICE_CHICKEN3 > > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 ++-- > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 11 +- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 22 +-- > .../gpu/drm/i915/gt/uc/intel_guc_capture.c| 4 ++-- > drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 +- > drivers/gpu/drm/i915/gvt/handlers.c | 2 +- > drivers/gpu/drm/i915/gvt/mmio_context.c | 2 +- > drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 2 +- > drivers/gpu/drm/i915/intel_pm.c | 10 - > 9 files changed, 34 insertions(+), 25 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > index 2ddcad497fa3..c408bac3c533 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > @@ -1559,11 +1559,11 @@ void intel_engine_get_instdone(const struct > intel_engine_cs *engine, > for_each_ss_steering(iter, engine->gt, slice, subslice) { > instdone->sampler[slice][subslice] = > intel_gt_mcr_read(engine->gt, > - GEN7_SAMPLER_INSTDONE, > + GEN8_SAMPLER_INSTDONE, > slice, subslice); > instdone->row[slice][subslice] = > intel_gt_mcr_read(engine->gt, > - GEN7_ROW_INSTDONE, > + GEN8_ROW_INSTDONE, > slice, subslice); > } > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > index 1cbb7226400b..e5a1ea255640 100644 > --- 
a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > @@ -647,6 +647,9 @@ > > #define GEN7_MISCCPCTL _MMIO(0x9424) > #define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0) > + > +#define GEN8_MISCCPCTL _MMIO(0x9424) > +#define GEN8_DOP_CLOCK_GATE_ENABLE REG_BIT(0) When I went through the driver to check if is there any instance where platforms above Gen7 still using Gen7 registers, I found the following two functions still using GEN7_MISCCPCTL. Can you check? * dg2_gt_workarounds_init * pvc_gt_workarounds_init Regards, Bala > #define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1) > #define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2) > #define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1 << 4) > @@ -1068,18 +1071,22 @@ > #define GEN12_GAM_DONE _MMIO(0xcf68) > > #define GEN7_HALF_SLICE_CHICKEN1 _MMIO(0xe100) /* IVB GT1 + VLV > */ > +#define GEN8_HALF_SLICE_CHICKEN1 _MMIO(0xe100) > #define GEN7_MAX_PS_THREAD_DEP (8 << 12) > #define GEN7_SINGLE_SUBSCAN_DISPATCH_ENABLE(1 << 10) > #define GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE(1 << 4) > #define GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE (1 << 3) > > #define GEN7_SAMPLER_INSTDONE_MMIO(0xe160) > +#define GEN8_SAMPLER_INSTDONE_MMIO(0xe160) > #define GEN7_ROW_INSTDONE_MMIO(0xe164) > +#define GEN8_ROW_INSTDONE_MMIO(0xe164) > > #define HALF_SLICE_CHICKEN2 _MMIO(0xe180) > #define GEN8_ST_PO_DISABLE (1 << 13) > > -#define HALF_SLICE_CHICKEN3 _MMIO(0xe184) > +#define HSW_HALF_SLICE_CHICKEN3 _MMIO(0xe184) > +#define GEN8_HALF_SLICE_CHICKEN3 _MMIO(0xe184) > #define HSW_SAMPLE_C_PERFORMANCE (1 << 9) > #define GEN8_CENTROID_PIXEL_OPT_DIS(1 << 8) > #define GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC(1 << 5) > @@ -1132,6 +1139,8 @@ > #define DISABLE_EARLY_EOT REG_BIT(1) > > #define GEN7_ROW_CHICKEN2_MMIO(0xe4f4) > + > +#define GEN8_ROW_CHICKEN2_MMIO(0xe4f4) > #define GEN12_DISABLE_READ_SUPPRESSION REG_BIT(15) > #define GEN12_DISABLE_EARLY_READ REG_BIT(14) > #define GEN12_ENABLE_LA
Re: [Intel-gfx] [PATCH 02/12] drm/i915/xehp: Create separate reg definitions for new MCR registers
On 19.09.2022 15:32, Matt Roper wrote: > Starting in Xe_HP, several registers our driver works with have been > converted from singleton registers into replicated registers with > multicast behavior. Although the registers are still located at the > same MMIO offsets as on previous platforms, let's duplicate the register > definitions in preparation for upcoming patches that will handle > multicast registers in a special manner. > > The registers that are now replicated on Xe_HP are: > * PAT_INDEX (mslice replication) > * FF_MODE2 (gslice replication) > * COMMON_SLICE_CHICKEN3 (gslice replication) > * SLICE_COMMON_ECO_CHICKEN1 (gslice replication) > * SLICE_UNIT_LEVEL_CLKGATE (gslice replication) > * LNCFCMOCS (lncf replication) > > Bspec: 66534 > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 18 - > drivers/gpu/drm/i915/gt/intel_gtt.c | 29 ++--- > drivers/gpu/drm/i915/gt/intel_mocs.c| 5 +++- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 24 - > drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 7 +++-- > 5 files changed, 52 insertions(+), 31 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > index e5a1ea255640..559e3473f14c 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h > +++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h > @@ -329,6 +329,7 @@ > #define GEN7_TLB_RD_ADDR _MMIO(0x4700) > > #define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) > * 4) > +#define XEHP_PAT_INDEX(index)_MMIO(0x4800 + (index) > * 4) > > #define XEHP_TILE0_ADDR_RANGE_MMIO(0x4900) > #define XEHP_TILE_LMEM_RANGE_SHIFT 8 > @@ -387,7 +388,8 @@ > #define DIS_OVER_FETCH_CACHE REG_BIT(1) > #define DIS_MULT_MISS_RD_SQUASHREG_BIT(0) > > -#define FF_MODE2 _MMIO(0x6604) > +#define GEN12_FF_MODE2 _MMIO(0x6604) > +#define XEHP_FF_MODE2_MMIO(0x6604) > #define FF_MODE2_GS_TIMER_MASK REG_GENMASK(31, 24) > #define FF_MODE2_GS_TIMER_224 > REG_FIELD_PREP(FF_MODE2_GS_TIMER_MASK, 224) > #define 
FF_MODE2_TDS_TIMER_MASKREG_GENMASK(23, 16) > @@ -442,6 +444,7 @@ > #define GEN8_HDC_CHICKEN1_MMIO(0x7304) > > #define GEN11_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) > +#define XEHP_COMMON_SLICE_CHICKEN3 _MMIO(0x7304) > #define DG1_FLOAT_POINT_BLEND_OPT_STRICT_MODE_EN REG_BIT(12) > #define XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE REG_BIT(12) > #define GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC REG_BIT(11) > @@ -455,10 +458,9 @@ > #define DISABLE_PIXEL_MASK_CAMMING (1 << 14) > > #define GEN9_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) > -#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) > - > -#define SLICE_COMMON_ECO_CHICKEN1_MMIO(0x731c) > +#define XEHP_SLICE_COMMON_ECO_CHICKEN1 _MMIO(0x731c) > #define MSC_MSAA_REODER_BUF_BYPASS_DISABLE REG_BIT(14) > +#define GEN11_STATE_CACHE_REDIRECT_TO_CS (1 << 11) > > #define GEN9_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + (slice) * 0x4) > #define GEN10_SLICE_PGCTL_ACK(slice) _MMIO(0x804c + ((slice) / 3) * > 0x34 + \ > @@ -703,7 +705,8 @@ > #define GAMTLBVEBOX0_CLKGATE_DIS REG_BIT(16) > #define LTCDD_CLKGATE_DIS REG_BIT(10) > > -#define SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) > +#define GEN11_SLICE_UNIT_LEVEL_CLKGATE _MMIO(0x94d4) > +#define XEHP_SLICE_UNIT_LEVEL_CLKGATE_MMIO(0x94d4) > #define SARBUNIT_CLKGATE_DIS (1 << 5) > #define RCCUNIT_CLKGATE_DIS(1 << 7) > #define MSCUNIT_CLKGATE_DIS(1 << 10) > @@ -718,7 +721,7 @@ > #define VSUNIT_CLKGATE_DIS_TGL REG_BIT(19) > #define PSDUNIT_CLKGATE_DISREG_BIT(5) > > -#define SUBSLICE_UNIT_LEVEL_CLKGATE _MMIO(0x9524) > +#define GEN11_SUBSLICE_UNIT_LEVEL_CLKGATE_MMIO(0x9524) > #define DSS_ROUTER_CLKGATE_DIS REG_BIT(28) > #define GWUNIT_CLKGATE_DIS REG_BIT(16) > > @@ -943,7 +946,8 @@ > > /* MOCS (Memory Object Control State) registers */ > #define GEN9_LNCFCMOCS(i)_MMIO(0xb020 + (i) * 4) /* L3 > Cache Control */ GEN9_LNCFCMOCS is used in few functions in file selftest_mocs.c. This patch has untouched those instances. Is it by intention to handle it part of a separate series? 
If the plan is to handle it later sometime can we create a ticket to keep track of it? Regards, Bala > -#define GEN9_LNCFCMOCS_REG_COUNT 32 > +#define XEHP_LNCFCMOCS(i)_MMIO(0xb020 + (i) * 4) /* L3 > Cache Control */ > +#define LNCFCM
[Intel-gfx] [PATCH v4 0/6] drm/i915/display: Don't use port enum as register offset
Prior to display version 12, platforms had DDI ports A, B, C, D, E and F,
represented by the enums PORT_A, PORT_B ... PORT_F. The DDI register offsets
of the ports were in the same order as the ports, so the port enums were
used directly as the index to calculate a port's register offset.

Starting with display version 12, TypeC ports were introduced. These were
defined as the new enums PORT_TC1, PORT_TC2 ... Later-generation platforms
interleave the DDI register offsets of TypeC and non-TypeC ports, so the
existing port enums no longer match the order of the DDI register offsets
and can no longer be used as the index to calculate the register offset.
This led to the creation of platform-specific port enums such as
PORT_D_XELPD and PORT_E_XELPD, which match the index of the ports on those
platforms, along with additional code to handle these special enums.

So we want to decouple the port enums from the DDI register offsets and take
the index used to calculate the register offsets from somewhere else. The
index of each DDI port on a platform is now defined as part of the device
info.

The series ends with a few patches that do cleanup and fixes made possible
by the unique port enums.

v2: ddi_index is defined for platforms starting from Gen75. Many platforms
from Gen75 onwards have DDI support.
Cc: Jani Nikula
Cc: Ville Syrjälä

Balasubramani Vivekanandan (6):
  drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
  drm/i915/display: Define the DDI port indices inside device info
  drm/i915/display: Free port enums from tied to register offset
  drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific definitions
  drm/i915/display: Fix port_identifier function
  drm/i915/display: cleanup unused DDI port enums

 drivers/gpu/drm/i915/display/icl_dsi.c        | 12 ++--
 drivers/gpu/drm/i915/display/intel_bios.c     |  7 +--
 drivers/gpu/drm/i915/display/intel_ddi.c      | 63 +++
 drivers/gpu/drm/i915/display/intel_display.c  | 12 ++--
 drivers/gpu/drm/i915/display/intel_display.h  | 29 +
 .../drm/i915/display/intel_display_power.c    | 40 +---
 drivers/gpu/drm/i915/display/intel_fdi.c      | 14 ++---
 drivers/gpu/drm/i915/display/intel_tc.c       |  6 +-
 drivers/gpu/drm/i915/gvt/display.c            | 30 -
 drivers/gpu/drm/i915/gvt/handlers.c           | 17 ++---
 drivers/gpu/drm/i915/i915_pci.c               | 46 +-
 drivers/gpu/drm/i915/i915_reg.h               |  4 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  1 +
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 +--
 include/drm/i915_component.h                  |  2 +-
 15 files changed, 140 insertions(+), 153 deletions(-)

-- 
2.34.1
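The approach described in this cover letter can be sketched in a few lines of
standalone C. This is only an illustration under stated assumptions: struct
display_info stands in for the display part of the device info,
ddi_buf_ctl_offset() is a hypothetical helper rather than the DDI_BUF_CTL
macro itself, the base address and stride are illustrative values, and the
XE_LPD table uses PORT_D/PORT_E on the assumption that
PORT_D_XELPD/PORT_E_XELPD are gone by the end of the series.

```c
#include <stdint.h>

/* Ports enumerated independently of any register layout (sketch). */
enum port {
	PORT_A = 0, PORT_B, PORT_C, PORT_D, PORT_E, PORT_F,
	/* Non-TypeC ports must be defined above */
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6,
	I915_MAX_PORTS
};

/* Hypothetical stand-in for the display part of the device info. */
struct display_info {
	uint32_t ddi_index[I915_MAX_PORTS];
};

/* Illustrative register base and per-port stride, assumed for this sketch. */
#define DDI_BUF_CTL_BASE	0x64000u
#define DDI_BUF_CTL_STRIDE	0x100u

/* Hypothetical helper: the offset comes from the table, not the enum value. */
static inline uint32_t ddi_buf_ctl_offset(const struct display_info *info,
					  enum port port)
{
	return DDI_BUF_CTL_BASE + info->ddi_index[port] * DDI_BUF_CTL_STRIDE;
}

/* Older layout: registers follow the port order A..F. */
static const struct display_info gen75_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_D] = 3, [PORT_E] = 4, [PORT_F] = 5,
	},
};

/* XE_LPD-style layout: D/E registers sit after the TypeC ones (assumed). */
static const struct display_info xe_lpd_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5, [PORT_TC4] = 6,
		[PORT_D] = 7, [PORT_E] = 8,
	},
};
```

With these tables the same enum value resolves to a different register slot
per platform: PORT_D maps to index 3 with the Gen75 table and to index 7 with
the XE_LPD one, without any platform-specific enum.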
[Intel-gfx] [PATCH v4 1/6] drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
This is a prep patch for a patch series in which register offset of the DDI ports are not calculated using the port enums but using a different datastructure part of the device info. So the device info is passed as a parameter to the macro DDI_BUF_CTL but unused yet. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/icl_dsi.c | 12 +++--- drivers/gpu/drm/i915/display/intel_ddi.c | 39 +++- drivers/gpu/drm/i915/display/intel_display.c | 6 ++- drivers/gpu/drm/i915/display/intel_fdi.c | 14 +++ drivers/gpu/drm/i915/display/intel_tc.c | 6 +-- drivers/gpu/drm/i915/gvt/display.c | 30 +++ drivers/gpu/drm/i915/gvt/handlers.c | 17 + drivers/gpu/drm/i915/i915_reg.h | 6 ++- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 ++--- 9 files changed, 76 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c index 47f13750f6fa..f7c1f6561423 100644 --- a/drivers/gpu/drm/i915/display/icl_dsi.c +++ b/drivers/gpu/drm/i915/display/icl_dsi.c @@ -548,11 +548,11 @@ static void gen11_dsi_enable_ddi_buffer(struct intel_encoder *encoder) enum port port; for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp |= DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 500)) drm_err(&dev_priv->drm, "DDI port:%c buffer idle\n", @@ -1400,11 +1400,11 @@ static void gen11_dsi_disable_port(struct intel_encoder *encoder) gen11_dsi_ungate_clocks(encoder); for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp &= ~DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, 
DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 971356237eca..77a986696c76 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,7 +172,7 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, return; } - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get idle\n", port_name(port)); @@ -189,7 +189,7 @@ static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, return; } - ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), IS_DG2(dev_priv) ? 
1200 : 500, 10, 10); if (ret) @@ -730,7 +730,7 @@ static void intel_ddi_get_encoder_pipes(struct intel_encoder *encoder, if (!wakeref) return; - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); if (!(tmp & DDI_BUF_CTL_ENABLE)) goto out; @@ -1397,8 +1397,8 @@ hsw_set_signal_levels(struct intel_encoder *encoder, intel_dp->DP &= ~DDI_BUF_EMP_MASK; intel_dp->DP |= signal_levels; - intel_de_write(dev_priv, DDI_BUF_CTL(port), intel_dp->DP); - intel_de_posting_read(dev_priv, DDI_BUF_CTL(port)); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), intel_dp->DP); + intel_de_posting_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); } static void _icl_ddi_enable_clock(struct drm_i915_private *i915, i915_reg_t reg, @@ -2577,10 +2577,10 @@ static void intel_disable_ddi_buf(struct intel_encoder *encoder, bool wait = false; u32 val; - val = intel_de_read(dev_priv,
[Intel-gfx] [PATCH v4 2/6] drm/i915/display: Define the DDI port indices inside device info
Prior to display version 12, platforms had DDI ports A, B, C, D, E and F,
represented by the enums PORT_A, PORT_B ... PORT_F. The DDI register offsets
of the ports were in the same order as the ports, so the port enums were
used directly as the index to calculate a port's register offset.

Starting with display version 12, TypeC ports were introduced. These were
defined as the new enums PORT_TC1, PORT_TC2 ... Later-generation platforms
interleave the DDI register offsets of TypeC and non-TypeC ports, so the
existing port enums no longer match the order of the DDI register offsets
and can no longer be used as the index to calculate the register offset.
This led to the creation of platform-specific port enums such as
PORT_D_XELPD and PORT_E_XELPD, which match the index of the ports on those
platforms, along with additional code to handle these special enums.

So we want to decouple the port enums from the DDI register offsets and take
the index used to calculate the register offsets from somewhere else. The
index of each DDI port on a platform is now defined as part of the device
info.

This patch only adds the indices to the device info. Later patches in the
series use them for the offset calculation.
Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/i915_pci.c          | 46 ++--
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 38460a0bd7cb..b37a95755b77 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -132,6 +132,42 @@
 		[PIPE_D] = TGL_CURSOR_D_OFFSET, \
 	}

+#define GEN75_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_D] = 3, \
+		[PORT_E] = 4, \
+		[PORT_F] = 5, \
+	}
+
+#define GEN12_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_TC1] = 3, \
+		[PORT_TC2] = 4, \
+		[PORT_TC3] = 5, \
+		[PORT_TC4] = 6, \
+		[PORT_TC5] = 7, \
+		[PORT_TC6] = 8, \
+	}
+
+#define XE_LPD_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_TC1] = 3, \
+		[PORT_TC2] = 4, \
+		[PORT_TC3] = 5, \
+		[PORT_TC4] = 6, \
+		[PORT_D_XELPD] = 7, \
+		[PORT_E_XELPD] = 8, \
+	}
+
 #define I9XX_COLORS \
 	.display.color = { .gamma_lut_size = 256 }
 #define I965_COLORS \
@@ -562,7 +598,8 @@ static const struct intel_device_info vlv_info = {
 	.display.has_dp_mst = 1, \
 	.has_rc6p = 0 /* RC6p removed-by HSW */, \
 	HSW_PIPE_OFFSETS, \
-	.has_runtime_pm = 1
+	.has_runtime_pm = 1, \
+	GEN75_DDI_INDEX

 #define HSW_PLATFORM \
 	G75_FEATURES, \
@@ -733,7 +770,8 @@ static const struct intel_device_info skl_gt4_info = {
 	IVB_CURSOR_OFFSETS, \
 	IVB_COLORS, \
 	GEN9_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	GEN75_DDI_INDEX

 static const struct intel_device_info bxt_info = {
 	GEN9_LP_FEATURES,
@@ -887,6 +925,7 @@ static const struct intel_device_info jsl_info = {
 		[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
 	}, \
 	TGL_CURSOR_OFFSETS, \
+	GEN12_DDI_INDEX, \
 	.has_global_mocs = 1, \
 	.has_pxp = 1, \
 	.display.has_dsb = 0 /* FIXME: LUT load is broken with DSB */
@@ -984,7 +1023,8 @@ static const struct intel_device_info adl_s_info = {
 		[TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
 		[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
 	}, \
-	TGL_CURSOR_OFFSETS
+	TGL_CURSOR_OFFSETS, \
+	XE_LPD_DDI_INDEX

 static const struct intel_device_info adl_p_info = {
 	GEN12_FEATURES,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index bc87d3156b14..a93f54990a01 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -292,6 +292,7 @@ struct intel_device_info {
 		u32 pipe_offsets[I915_MAX_TRANSCODERS];
 		u32 trans_offsets[I915_MAX_TRANSCODERS];
 		u32 cursor_offsets[I915_MAX_PIPES];
+		u32 ddi_index[I915_MAX_PORTS];

 		struct {
 			u32 degamma_lut_size;
--
2.34.1
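For readers following along, here is a minimal standalone sketch of the per-platform index table idea from this patch. It is not the driver code: struct device_info is a hypothetical, reduced stand-in for struct intel_device_info, the enum assumes the unique (non-aliased) port values that later patches in the series introduce, and ddi_index_of() is a helper invented for illustration.

```c
#include <stdint.h>

/* Assumed enum layout with unique values; not the layout at this patch. */
enum port {
	PORT_A, PORT_B, PORT_C, PORT_D, PORT_E, PORT_F,
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6,
	PORT_D_XELPD, PORT_E_XELPD,
	I915_MAX_PORTS
};

/* Hypothetical, reduced stand-in for struct intel_device_info. */
struct device_info {
	uint32_t ddi_index[I915_MAX_PORTS];
};

/* Index values copied from GEN12_DDI_INDEX in the patch. */
static const struct device_info gen12_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5,
		[PORT_TC4] = 6, [PORT_TC5] = 7, [PORT_TC6] = 8,
	},
};

/* Index values copied from XE_LPD_DDI_INDEX in the patch. */
static const struct device_info xe_lpd_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5,
		[PORT_TC4] = 6, [PORT_D_XELPD] = 7, [PORT_E_XELPD] = 8,
	},
};

/* Illustrative helper: the same enum can map to a different slot per platform. */
static uint32_t ddi_index_of(const struct device_info *info, enum port port)
{
	return info->ddi_index[port];
}
```

One thing worth keeping in mind with designated initializers: entries that are not listed are zero-initialised by C, so in this sketch a port missing from a platform's table resolves to index 0, the same as PORT_A.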
[Intel-gfx] [PATCH v4 3/6] drm/i915/display: Free port enums from tied to register offset
With the index required for DDI register offset calculation available in the device info, the DDI_BUF_CTL macro is updated to make use of it. Any new macro that accesses the DDI registers should follow the same procedure. This frees the port enums from being tied to the register offsets of the DDI registers, so we can remove all the enum aliases and clean up the enum definitions.

The key target of the patch series, removing platform specific port definitions like PORT_D_XELPD and PORT_E_XELPD, is not yet covered here. The definitions are retained for now and will be handled in the following patch.

Removed a WARN_ON as it is no longer valid. The WARN was added in the commit "327f8d8c336d drm/i915: simplify setting of ddi_io_power_domain". The ddi_io_power_domain calculation has changed completely since that commit and doesn't need this WARN_ON anymore.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_ddi.c     | 1 -
 drivers/gpu/drm/i915/display/intel_display.h | 8 +++-
 drivers/gpu/drm/i915/i915_reg.h              | 6 ++
 include/drm/i915_component.h                 | 2 +-
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 77a986696c76..7dd6d108a26f 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4492,7 +4492,6 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 		encoder->update_complete = intel_ddi_update_complete;
 	}

-	drm_WARN_ON(&dev_priv->drm, port > PORT_I);
 	dig_port->ddi_io_power_domain = intel_display_power_ddi_io_domain(dev_priv, port);

 	if (init_dp) {
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 2af4a1925063..9112833b39eb 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -212,18 +212,16 @@ enum port {
 	PORT_H,
 	PORT_I,

-	/* tgl+ */
-	PORT_TC1 = PORT_D,
+	/* Non-TypeC ports must be defined above */
+	PORT_TC1,
 	PORT_TC2,
 	PORT_TC3,
 	PORT_TC4,
 	PORT_TC5,
 	PORT_TC6,

-	/* XE_LPD repositions D/E offsets and bitfields */
-	PORT_D_XELPD = PORT_TC5,
+	PORT_D_XELPD,
 	PORT_E_XELPD,
-
 	I915_MAX_PORTS
 };

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a91bbc6e1255..cae48786c5a3 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -170,6 +170,7 @@
 #define _MMIO_CURSOR2(pipe, reg) _MMIO(INTEL_INFO(dev_priv)->display.cursor_offsets[(pipe)] - \
 				       INTEL_INFO(dev_priv)->display.cursor_offsets[PIPE_A] + \
 				       DISPLAY_MMIO_BASE(dev_priv) + (reg))
+#define _MMIO_DDI(i915, port, a, b) _MMIO_PORT(INTEL_INFO(i915)->display.ddi_index[port], a, b)

 #define __MASKED_FIELD(mask, value) ((mask) << 16 | (value))
 #define _MASKED_FIELD(mask, value) ({ \
@@ -6936,10 +6937,7 @@ enum skl_power_gate {
 /* DDI Buffer Control */
 #define _DDI_BUF_CTL_A 0x64000
 #define _DDI_BUF_CTL_B 0x64100
-#define DDI_BUF_CTL(i915, port) ({ \
-	(void)i915; /* Suppress unused variable warning */ \
-	_MMIO_PORT(port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B); \
-})
+#define DDI_BUF_CTL(i915, port) _MMIO_DDI(i915, port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B)
 #define DDI_BUF_CTL_ENABLE (1 << 31)
 #define DDI_BUF_TRANS_SELECT(n) ((n) << 24)
diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h
index c1e2a43d2d1e..f95ff82c3b4a 100644
--- a/include/drm/i915_component.h
+++ b/include/drm/i915_component.h
@@ -35,7 +35,7 @@ enum i915_component_type {
 /* MAX_PORT is the number of port
  * It must be sync with I915_MAX_PORTS defined i915_drv.h
  */
-#define MAX_PORTS 9
+#define MAX_PORTS 17

 /**
  * struct i915_audio_component - Used for direct communication between i915 and hda drivers
--
2.34.1
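To make the effect of _MMIO_DDI concrete, the arithmetic can be sketched outside the driver as below. This assumes that _MMIO_PORT picks evenly spaced register instances, i.e. a + index * (b - a), which is my understanding of the existing helper. The functions pick_even(), ddi_buf_ctl_by_enum() and ddi_buf_ctl_by_index() and the standalone xe_lpd_ddi_index table are names made up for this sketch; the enum layout is the one after this patch.

```c
#include <stdint.h>

/* Addresses of the first two DDI_BUF_CTL instances, as in the patch. */
#define _DDI_BUF_CTL_A 0x64000
#define _DDI_BUF_CTL_B 0x64100

/* Enum values after this patch: A..I are 0..8, TC1..TC6 are 9..14. */
enum port {
	PORT_A, PORT_B, PORT_C, PORT_D, PORT_E, PORT_F, PORT_G, PORT_H, PORT_I,
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6,
	PORT_D_XELPD, PORT_E_XELPD,
	I915_MAX_PORTS
};

/* Hypothetical standalone table, values taken from XE_LPD_DDI_INDEX. */
static const uint32_t xe_lpd_ddi_index[I915_MAX_PORTS] = {
	[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
	[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5, [PORT_TC4] = 6,
	[PORT_D_XELPD] = 7, [PORT_E_XELPD] = 8,
};

/* Evenly spaced register instances: a + index * (b - a). */
static uint32_t pick_even(uint32_t index, uint32_t a, uint32_t b)
{
	return a + index * (b - a);
}

/* Old scheme: the enum value itself is used as the index. */
static uint32_t ddi_buf_ctl_by_enum(enum port port)
{
	return pick_even((uint32_t)port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B);
}

/* New scheme: look up the platform index first, then compute the offset. */
static uint32_t ddi_buf_ctl_by_index(const uint32_t *ddi_index, enum port port)
{
	return pick_even(ddi_index[port], _DDI_BUF_CTL_A, _DDI_BUF_CTL_B);
}
```

With unique enum values, PORT_D_XELPD is 15, so using the enum directly would land on 0x64F00, while the table lookup gives index 7 and therefore 0x64700. That is why the macro has to switch to the table in the same patch that drops the aliases.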
[Intel-gfx] [PATCH v4 5/6] drm/i915/display: Fix port_identifier function
The port_identifier function was broken while TypeC ports were using enum aliases: it returned the wrong string for TypeC ports. Now that the DDI ports have unique enums, fix port_identifier to cover all ports.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_display.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 62604cadf0b8..4a5f7df7492b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -250,6 +250,18 @@ static inline const char *port_identifier(enum port port)
 		return "Port H";
 	case PORT_I:
 		return "Port I";
+	case PORT_TC1:
+		return "Port TC1";
+	case PORT_TC2:
+		return "Port TC2";
+	case PORT_TC3:
+		return "Port TC3";
+	case PORT_TC4:
+		return "Port TC4";
+	case PORT_TC5:
+		return "Port TC5";
+	case PORT_TC6:
+		return "Port TC6";
 	default:
 		return "<invalid>";
 	}
--
2.34.1
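A small standalone illustration of why the aliases made this function return the wrong string. The enums and function names below (old_port, new_port, old_port_identifier, new_port_identifier) are invented for the sketch and are much shorter than the real enum port; only the aliasing pattern is taken from the series.

```c
/* Old style: TypeC enums alias the legacy values (TC1 == D, TC2 == E). */
enum old_port {
	OLD_PORT_A, OLD_PORT_B, OLD_PORT_C, OLD_PORT_D, OLD_PORT_E,
	OLD_PORT_TC1 = OLD_PORT_D,
	OLD_PORT_TC2,
};

/* New style: every port has its own value. */
enum new_port {
	NEW_PORT_A, NEW_PORT_B, NEW_PORT_C, NEW_PORT_D, NEW_PORT_E,
	NEW_PORT_TC1, NEW_PORT_TC2,
};

static const char *old_port_identifier(enum old_port port)
{
	switch (port) {
	case OLD_PORT_D:
		return "Port D";	/* also taken for OLD_PORT_TC1 */
	case OLD_PORT_E:
		return "Port E";	/* also taken for OLD_PORT_TC2 */
	/*
	 * A separate "case OLD_PORT_TC1:" cannot be added here: it has the
	 * same value as OLD_PORT_D and the compiler rejects duplicate cases.
	 */
	default:
		return "<invalid>";
	}
}

static const char *new_port_identifier(enum new_port port)
{
	switch (port) {
	case NEW_PORT_D:
		return "Port D";
	case NEW_PORT_E:
		return "Port E";
	case NEW_PORT_TC1:
		return "Port TC1";
	case NEW_PORT_TC2:
		return "Port TC2";
	default:
		return "<invalid>";
	}
}
```

In the aliased layout the TypeC ports can only ever be reported under the legacy names, which is the breakage the commit message describes.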
[Intel-gfx] [PATCH v4 4/6] drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific defintions
Port enums are no more used in the DDI register offset caculcation. We can remove the platform specific port redefinitions. Along with it we also get rid of the code required for handling these special definitions. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_bios.c | 4 +- drivers/gpu/drm/i915/display/intel_ddi.c | 23 +-- drivers/gpu/drm/i915/display/intel_display.c | 6 +-- drivers/gpu/drm/i915/display/intel_display.h | 2 - .../drm/i915/display/intel_display_power.c| 40 +-- drivers/gpu/drm/i915/i915_pci.c | 4 +- include/drm/i915_component.h | 2 +- 7 files changed, 10 insertions(+), 71 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c index 4c543e8205ca..ab472fa757d8 100644 --- a/drivers/gpu/drm/i915/display/intel_bios.c +++ b/drivers/gpu/drm/i915/display/intel_bios.c @@ -2436,8 +2436,8 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915, [PORT_A] = { DVO_PORT_HDMIA, DVO_PORT_DPA, -1 }, [PORT_B] = { DVO_PORT_HDMIB, DVO_PORT_DPB, -1 }, [PORT_C] = { DVO_PORT_HDMIC, DVO_PORT_DPC, -1 }, - [PORT_D_XELPD] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 }, - [PORT_E_XELPD] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 }, + [PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 }, + [PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 }, [PORT_TC1] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 }, [PORT_TC2] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 }, [PORT_TC3] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 }, diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 7dd6d108a26f..b95124c4fe74 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -4133,17 +4133,6 @@ static bool hti_uses_phy(struct drm_i915_private *i915, enum phy phy) i915->hti_state & HDPORT_DDI_USED(phy); } -static enum hpd_pin xelpd_hpd_pin(struct drm_i915_private *dev_priv, - enum port port) -{ - if (port >= PORT_D_XELPD) - return HPD_PORT_D + port - 
PORT_D_XELPD; - else if (port >= PORT_TC1) - return HPD_PORT_TC1 + port - PORT_TC1; - else - return HPD_PORT_A + port - PORT_A; -} - static enum hpd_pin dg1_hpd_pin(struct drm_i915_private *dev_priv, enum port port) { @@ -4312,13 +4301,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder = &dig_port->base; encoder->devdata = devdata; - if (DISPLAY_VER(dev_priv) >= 13 && port >= PORT_D_XELPD) { - drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs, -DRM_MODE_ENCODER_TMDS, -"DDI %c/PHY %c", -port_name(port - PORT_D_XELPD + PORT_D), -phy_name(phy)); - } else if (DISPLAY_VER(dev_priv) >= 12) { + if (DISPLAY_VER(dev_priv) >= 12) { enum tc_port tc_port = intel_port_to_tc(dev_priv, port); drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs, @@ -4448,9 +4431,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) intel_ddi_buf_trans_init(encoder); - if (DISPLAY_VER(dev_priv) >= 13) - encoder->hpd_pin = xelpd_hpd_pin(dev_priv, port); - else if (IS_DG1(dev_priv)) + if (IS_DG1(dev_priv)) encoder->hpd_pin = dg1_hpd_pin(dev_priv, port); else if (IS_ROCKETLAKE(dev_priv)) encoder->hpd_pin = rkl_hpd_pin(dev_priv, port); diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 8681055843f0..febe85a8a9c8 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -2135,9 +2135,7 @@ bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy) enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port) { - if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD) - return PHY_D + port - PORT_D_XELPD; - else if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1) + if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1) return PHY_F + port - PORT_TC1; else if (IS_ALDERLAKE_S(i915) && port >= PORT_TC1) return PHY_B + port - PORT_TC1; @@ -7907,7 +7905,7 @@ static void intel_setup_outputs(struct 
drm_i915_private *dev_priv) intel_ddi_init(dev_priv, PORT_A);
[Intel-gfx] [PATCH v4 6/6] drm/i915/display: cleanup unused DDI port enums
DDI port enums PORT_G/H/I were added in the commit "6c8337dafaa9 drm/i915/tgl: Add additional ports for Tiger Lake" to identify new ports added in the platform. In subsequent commits those ports were identified by the new enums PORT_TC1/TC2/TC3.. to differentiate TypeC ports from non-TypeC ports. However, the enum definitions PORT_G/H/I and a few usages of these enums were left as they were. These enums are unused as of today and can be removed.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_bios.c    | 3 ---
 drivers/gpu/drm/i915/display/intel_display.h | 9 -
 include/drm/i915_component.h                 | 2 +-
 3 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index ab472fa757d8..b0dfb37e402a 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2404,9 +2404,6 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915,
 		[PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
 		[PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, DVO_PORT_CRT },
 		[PORT_F] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 },
-		[PORT_G] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 },
-		[PORT_H] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 },
-		[PORT_I] = { DVO_PORT_HDMII, DVO_PORT_DPI, -1 },
 	};
 	/*
 	 * RKL VBT uses PHY based mapping. Combo PHYs A,B,C,D
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 4a5f7df7492b..5a55b9f43ce3 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -208,9 +208,6 @@ enum port {
 	PORT_D,
 	PORT_E,
 	PORT_F,
-	PORT_G,
-	PORT_H,
-	PORT_I,

 	/* Non-TypeC ports must be defined above */
 	PORT_TC1,
@@ -244,12 +241,6 @@ static inline const char *port_identifier(enum port port)
 		return "Port E";
 	case PORT_F:
 		return "Port F";
-	case PORT_G:
-		return "Port G";
-	case PORT_H:
-		return "Port H";
-	case PORT_I:
-		return "Port I";
 	case PORT_TC1:
 		return "Port TC1";
 	case PORT_TC2:
diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h
index 4b31bab5533a..335822d6960a 100644
--- a/include/drm/i915_component.h
+++ b/include/drm/i915_component.h
@@ -35,7 +35,7 @@ enum i915_component_type {
 /* MAX_PORT is the number of port
  * It must be sync with I915_MAX_PORTS defined i915_drv.h
  */
-#define MAX_PORTS 15
+#define MAX_PORTS 12

 /**
  * struct i915_audio_component - Used for direct communication between i915 and hda drivers
--
2.34.1
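Since MAX_PORTS has to be adjusted by hand each time the enum changes (17, then 15, then 12 over this series), a compile-time check is one way such a pair of constants can be kept in sync. The following is only a sketch of that idea and is not part of the patch: both definitions live in one file here, whereas in the kernel they are in separate headers, and the enum is written out as it looks after this patch, starting at PORT_A.

```c
#include <assert.h>	/* static_assert (C11) */

/* enum port as it looks after this patch: 6 non-TypeC + 6 TypeC ports. */
enum port {
	PORT_A, PORT_B, PORT_C, PORT_D, PORT_E, PORT_F,

	/* Non-TypeC ports must be defined above */
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6,

	I915_MAX_PORTS
};

/* Value from include/drm/i915_component.h after this patch. */
#define MAX_PORTS 12

/* Hypothetical guard: the build fails if the two counts drift apart. */
static_assert(MAX_PORTS == I915_MAX_PORTS,
	      "MAX_PORTS must be kept in sync with I915_MAX_PORTS");

/* Illustrative accessor so the count can be inspected at run time. */
static int max_ports(void)
{
	return I915_MAX_PORTS;
}
```

Removing or adding an enumerator without touching MAX_PORTS would then show up as a build error rather than as a silent mismatch between i915 and the audio component.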
[Intel-gfx] [PATCH v5 2/7] drm/i915/display: Pass struct drm_i915_private to DDI_CLK_SEL macro
DDI_CLK_SEL is another macro which returns the register offset based on the DDI port enum. So DDI_CLK_SEL has to be prepared for the new method being developed for calculating the register offsets of DDI ports. The macro receives the i915 private structure as a new parameter for the upcoming changes.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 17 +
 drivers/gpu/drm/i915/i915_reg.h          |  5 -
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 77a986696c76..e7beafafb857 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -305,7 +305,8 @@ static void intel_ddi_init_dp_buf_reg(struct intel_encoder *encoder,
 static int icl_calc_tbt_pll_link(struct drm_i915_private *dev_priv,
 				 enum port port)
 {
-	u32 val = intel_de_read(dev_priv, DDI_CLK_SEL(port)) & DDI_CLK_SEL_MASK;
+	u32 val = intel_de_read(dev_priv, DDI_CLK_SEL(dev_priv, port)) &
+		DDI_CLK_SEL_MASK;

 	switch (val) {
 	case DDI_CLK_SEL_NONE:
@@ -1656,7 +1657,7 @@ static void jsl_ddi_tc_enable_clock(struct intel_encoder *encoder,
 	 * "For DDIC and DDID, program DDI_CLK_SEL to map the MG clock to the port.
 	 * MG does not exist, but the programming is required to ungate DDIC and DDID."
 	 */
-	intel_de_write(i915, DDI_CLK_SEL(port), DDI_CLK_SEL_MG);
+	intel_de_write(i915, DDI_CLK_SEL(i915, port), DDI_CLK_SEL_MG);

 	icl_ddi_combo_enable_clock(encoder, crtc_state);
 }
@@ -1668,7 +1669,7 @@ static void jsl_ddi_tc_disable_clock(struct intel_encoder *encoder)

 	icl_ddi_combo_disable_clock(encoder);

-	intel_de_write(i915, DDI_CLK_SEL(port), DDI_CLK_SEL_NONE);
+	intel_de_write(i915, DDI_CLK_SEL(i915, port), DDI_CLK_SEL_NONE);
 }

 static bool jsl_ddi_tc_is_clock_enabled(struct intel_encoder *encoder)
@@ -1677,7 +1678,7 @@ static bool jsl_ddi_tc_is_clock_enabled(struct intel_encoder *encoder)
 	enum port port = encoder->port;
 	u32 tmp;

-	tmp = intel_de_read(i915, DDI_CLK_SEL(port));
+	tmp = intel_de_read(i915, DDI_CLK_SEL(i915, port));
 	if ((tmp & DDI_CLK_SEL_MASK) == DDI_CLK_SEL_NONE)
 		return false;

@@ -1696,7 +1697,7 @@ static void icl_ddi_tc_enable_clock(struct intel_encoder *encoder,
 	if (drm_WARN_ON(&i915->drm, !pll))
 		return;

-	intel_de_write(i915, DDI_CLK_SEL(port),
+	intel_de_write(i915, DDI_CLK_SEL(i915, port),
 		       icl_pll_to_ddi_clk_sel(encoder, crtc_state));

 	mutex_lock(&i915->display.dpll.lock);
@@ -1720,7 +1721,7 @@ static void icl_ddi_tc_disable_clock(struct intel_encoder *encoder)

 	mutex_unlock(&i915->display.dpll.lock);

-	intel_de_write(i915, DDI_CLK_SEL(port), DDI_CLK_SEL_NONE);
+	intel_de_write(i915, DDI_CLK_SEL(i915, port), DDI_CLK_SEL_NONE);
 }

 static bool icl_ddi_tc_is_clock_enabled(struct intel_encoder *encoder)
@@ -1730,7 +1731,7 @@ static bool icl_ddi_tc_is_clock_enabled(struct intel_encoder *encoder)
 	enum port port = encoder->port;
 	u32 tmp;

-	tmp = intel_de_read(i915, DDI_CLK_SEL(port));
+	tmp = intel_de_read(i915, DDI_CLK_SEL(i915, port));
 	if ((tmp & DDI_CLK_SEL_MASK) == DDI_CLK_SEL_NONE)
 		return false;

@@ -1748,7 +1749,7 @@ static struct intel_shared_dpll *icl_ddi_tc_get_pll(struct intel_encoder *encode
 	enum intel_dpll_id id;
 	u32 tmp;

-	tmp = intel_de_read(i915, DDI_CLK_SEL(port));
+	tmp = intel_de_read(i915, DDI_CLK_SEL(i915, port));

 	switch (tmp & DDI_CLK_SEL_MASK) {
 	case DDI_CLK_SEL_TBT_162:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a91bbc6e1255..acb764755338 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7075,7 +7075,10 @@ enum skl_power_gate {
 #define PORT_CLK_SEL_NONE REG_FIELD_PREP(PORT_CLK_SEL_MASK, 7)

 /* On ICL+ this is the same as PORT_CLK_SEL, but all bits change. */
-#define DDI_CLK_SEL(port) PORT_CLK_SEL(port)
+#define DDI_CLK_SEL(i915, port) ({ \
+	(void)i915; /* Suppress unused variable warning */ \
+	PORT_CLK_SEL(port); \
+	})
 #define DDI_CLK_SEL_MASK REG_GENMASK(31, 28)
 #define DDI_CLK_SEL_NONE REG_FIELD_PREP(DDI_CLK_SEL_MASK, 0x0)
 #define DDI_CLK_SEL_MG REG_FIELD_PREP(DDI_CLK_SEL_MASK, 0x8)
--
2.34.1
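The prep step in this patch, adding a parameter to a macro before anything uses it, can be tried in isolation with the sketch below. It relies on GNU C statement expressions (supported by gcc and clang), just like the macro in the patch. The struct is an empty stand-in, the two PORT_CLK_SEL addresses are illustrative values chosen for the example, PORT_CLK_SEL is simplified to plain arithmetic, and clk_sel_offset() is a wrapper invented here.

```c
#include <stdint.h>

/* Minimal stand-in; the real struct drm_i915_private is far larger. */
struct drm_i915_private {
	int placeholder;
};

/* Illustrative addresses for the first two register instances. */
#define _PORT_CLK_SEL_A 0x46100
#define _PORT_CLK_SEL_B 0x46104

/* Simplified: evenly spaced instances, indexed directly by port. */
#define PORT_CLK_SEL(port) \
	(_PORT_CLK_SEL_A + (port) * (_PORT_CLK_SEL_B - _PORT_CLK_SEL_A))

/*
 * Prep step: the macro accepts i915 but does not use it yet. The (void)
 * cast keeps callers free of unused-variable warnings, and the value of
 * the statement expression is its last expression.
 */
#define DDI_CLK_SEL(i915, port) ({ \
	(void)(i915); /* Suppress unused variable warning */ \
	PORT_CLK_SEL(port); \
})

/* Invented wrapper so the macro result can be checked from a test. */
static uint32_t clk_sel_offset(struct drm_i915_private *i915, int port)
{
	return DDI_CLK_SEL(i915, port);
}
```

Splitting the change this way keeps the call-site churn in one mechanical patch, so the later patch that actually switches to the index lookup only has to touch the macro definition. One caveat of statement expressions is that they are only valid inside a function body, so a macro written this way cannot be used in a file-scope initializer.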
[Intel-gfx] [PATCH v5 3/7] drm/i915/display: Define the DDI port indices inside device info
Prior to display version 12, platforms had DDI ports A, B, C, D, E, F represented by enums PORT_A, PORT_B ... PORT_F. The DDI register offsets of the ports were in the same order as the ports, so the port enums were used directly as the index to calculate the register offset of each port.

Starting with display version 12, TypeC ports were introduced. These were defined as new enums PORT_TC1, PORT_TC2, ... The later generation platforms have the DDI register offsets of TypeC and non-TypeC ports interleaved, and the existing port enums no longer match the order of the DDI register offsets. So the enums can no longer be used as the index to calculate the register offset. This led to the creation of new platform specific enums like PORT_D_XELPD and PORT_E_XELPD to match the index of the ports on those platforms, plus additional code to handle the special enums.

So we want to decouple the port enums from the DDI register offsets and take the index used for the offset calculation from somewhere else. The index of each DDI port on a platform is now defined as part of the device info.

This patch only adds the indices to the device info. Later patches in the series use that index for the offset calculation.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/i915_pci.c          | 46 ++--
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 38460a0bd7cb..b37a95755b77 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -132,6 +132,42 @@
 		[PIPE_D] = TGL_CURSOR_D_OFFSET, \
 	}

+#define GEN75_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_D] = 3, \
+		[PORT_E] = 4, \
+		[PORT_F] = 5, \
+	}
+
+#define GEN12_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_TC1] = 3, \
+		[PORT_TC2] = 4, \
+		[PORT_TC3] = 5, \
+		[PORT_TC4] = 6, \
+		[PORT_TC5] = 7, \
+		[PORT_TC6] = 8, \
+	}
+
+#define XE_LPD_DDI_INDEX \
+	.display.ddi_index = { \
+		[PORT_A] = 0, \
+		[PORT_B] = 1, \
+		[PORT_C] = 2, \
+		[PORT_TC1] = 3, \
+		[PORT_TC2] = 4, \
+		[PORT_TC3] = 5, \
+		[PORT_TC4] = 6, \
+		[PORT_D_XELPD] = 7, \
+		[PORT_E_XELPD] = 8, \
+	}
+
 #define I9XX_COLORS \
 	.display.color = { .gamma_lut_size = 256 }
 #define I965_COLORS \
@@ -562,7 +598,8 @@ static const struct intel_device_info vlv_info = {
 	.display.has_dp_mst = 1, \
 	.has_rc6p = 0 /* RC6p removed-by HSW */, \
 	HSW_PIPE_OFFSETS, \
-	.has_runtime_pm = 1
+	.has_runtime_pm = 1, \
+	GEN75_DDI_INDEX

 #define HSW_PLATFORM \
 	G75_FEATURES, \
@@ -733,7 +770,8 @@ static const struct intel_device_info skl_gt4_info = {
 	IVB_CURSOR_OFFSETS, \
 	IVB_COLORS, \
 	GEN9_DEFAULT_PAGE_SIZES, \
-	GEN_DEFAULT_REGIONS
+	GEN_DEFAULT_REGIONS, \
+	GEN75_DDI_INDEX

 static const struct intel_device_info bxt_info = {
 	GEN9_LP_FEATURES,
@@ -887,6 +925,7 @@ static const struct intel_device_info jsl_info = {
 		[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
 	}, \
 	TGL_CURSOR_OFFSETS, \
+	GEN12_DDI_INDEX, \
 	.has_global_mocs = 1, \
 	.has_pxp = 1, \
 	.display.has_dsb = 0 /* FIXME: LUT load is broken with DSB */
@@ -984,7 +1023,8 @@ static const struct intel_device_info adl_s_info = {
 		[TRANSCODER_DSI_0] = TRANSCODER_DSI0_OFFSET, \
 		[TRANSCODER_DSI_1] = TRANSCODER_DSI1_OFFSET, \
 	}, \
-	TGL_CURSOR_OFFSETS
+	TGL_CURSOR_OFFSETS, \
+	XE_LPD_DDI_INDEX

 static const struct intel_device_info adl_p_info = {
 	GEN12_FEATURES,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index bc87d3156b14..a93f54990a01 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -292,6 +292,7 @@ struct intel_device_info {
 		u32 pipe_offsets[I915_MAX_TRANSCODERS];
 		u32 trans_offsets[I915_MAX_TRANSCODERS];
 		u32 cursor_offsets[I915_MAX_PIPES];
+		u32 ddi_index[I915_MAX_PORTS];

 		struct {
 			u32 degamma_lut_size;
--
2.34.1
[Intel-gfx] [PATCH v5 0/7] drm/i915/display: Don't use port enum as register offset
Prior to display version 12, platforms had DDI ports A, B, C, D, E, F represented by enums PORT_A, PORT_B ... PORT_F. The DDI register offsets of the ports were in the same order as the ports, so the port enums were used directly as the index to calculate the register offset of each port.

Starting with display version 12, TypeC ports were introduced. These were defined as new enums PORT_TC1, PORT_TC2, ... The later generation platforms have the DDI register offsets of TypeC and non-TypeC ports interleaved, and the existing port enums no longer match the order of the DDI register offsets. So the enums can no longer be used as the index to calculate the register offset. This led to the creation of new platform specific enums like PORT_D_XELPD and PORT_E_XELPD to match the index of the ports on those platforms, plus additional code to handle the special enums.

So we want to decouple the port enums from the DDI register offsets and take the index used for the offset calculation from somewhere else. The index of each DDI port on a platform is now defined as part of the device info.

The series includes a few patches at the end which do some cleanup and fixes made possible by having unique enums for the ports.

v2: ddi_index defined for platforms starting from Gen75. Many platforms from Gen75 onwards have DDI support.
v3: Updated DDI_CLK_SEL macro to use the new index for DDI register offset calculation.

Cc: Jani Nikula
Cc: Ville Syrjälä

Balasubramani Vivekanandan (7):
  drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
  drm/i915/display: Pass struct drm_i915_private to DDI_CLK_SEL macro
  drm/i915/display: Define the DDI port indices inside device info
  drm/i915/display: Free port enums from tied to register offset
  drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific defintions
  drm/i915/display: Fix port_identifier function
  drm/i915/display: cleanup unused DDI port enums

 drivers/gpu/drm/i915/display/icl_dsi.c        | 12 +--
 drivers/gpu/drm/i915/display/intel_bios.c     |  7 +-
 drivers/gpu/drm/i915/display/intel_ddi.c      | 80 ---
 drivers/gpu/drm/i915/display/intel_display.c  | 12 +--
 drivers/gpu/drm/i915/display/intel_display.h  | 29 ---
 .../drm/i915/display/intel_display_power.c    | 40 +-
 drivers/gpu/drm/i915/display/intel_fdi.c      | 14 ++--
 drivers/gpu/drm/i915/display/intel_tc.c       |  6 +-
 drivers/gpu/drm/i915/gvt/display.c            | 30 +++
 drivers/gpu/drm/i915/gvt/handlers.c           | 17 ++--
 drivers/gpu/drm/i915/i915_pci.c               | 46 ++-
 drivers/gpu/drm/i915/i915_reg.h               |  7 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  1 +
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c   | 10 +--
 include/drm/i915_component.h                  |  2 +-
 15 files changed, 151 insertions(+), 162 deletions(-)

--
2.34.1
[Intel-gfx] [PATCH v5 4/7] drm/i915/display: Free port enums from tied to register offset
With the index required for DDI register offset calculation available in the device info, the macros which used port enums to calculate the DDI register offsets i.e. DDI_BUF_CTL and DDI_CLK_SEL are updated to make use of the index rather than enum directly. Any new macros access that DDI registers should follow the same procedure. This would free the port enums from tied to the register offset of DDI registers. We can remove all the enum aliases and clean up the enum definitions. The key target of the patch series to remove platform specific definitions of ports like PORT_D_XELPD, PORT_E_XELPD is not yet covered here. The definitions are still retained and will be handled in the follow patch. Removed a WARN_ON as it is no longer valid. The WARN was added in the commit "327f8d8c336d drm/i915: simplify setting of ddi_io_power_domain" The ddi_io_power_domain calculation has changed completely since the commit and doesn't need this WARN_ON anymore. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/intel_ddi.c | 1 - drivers/gpu/drm/i915/display/intel_display.h | 8 +++- drivers/gpu/drm/i915/i915_reg.h | 12 include/drm/i915_component.h | 2 +- 4 files changed, 8 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index e7beafafb857..74b4271063d1 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -4493,7 +4493,6 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->update_complete = intel_ddi_update_complete; } - drm_WARN_ON(&dev_priv->drm, port > PORT_I); dig_port->ddi_io_power_domain = intel_display_power_ddi_io_domain(dev_priv, port); if (init_dp) { diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index 2af4a1925063..9112833b39eb 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h 
@@ -212,18 +212,16 @@ enum port { PORT_H, PORT_I, - /* tgl+ */ - PORT_TC1 = PORT_D, + /* Non-TypeC ports must be defined above */ + PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4, PORT_TC5, PORT_TC6, - /* XE_LPD repositions D/E offsets and bitfields */ - PORT_D_XELPD = PORT_TC5, + PORT_D_XELPD, PORT_E_XELPD, - I915_MAX_PORTS }; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index acb764755338..15e6b9482ee8 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -170,6 +170,7 @@ #define _MMIO_CURSOR2(pipe, reg) _MMIO(INTEL_INFO(dev_priv)->display.cursor_offsets[(pipe)] - \ INTEL_INFO(dev_priv)->display.cursor_offsets[PIPE_A] + \ DISPLAY_MMIO_BASE(dev_priv) + (reg)) +#define _MMIO_DDI(i915, port, a, b) _MMIO_PORT(INTEL_INFO(i915)->display.ddi_index[port], a, b) #define __MASKED_FIELD(mask, value) ((mask) << 16 | (value)) #define _MASKED_FIELD(mask, value) ({ \ @@ -6936,10 +6937,7 @@ enum skl_power_gate { /* DDI Buffer Control */ #define _DDI_BUF_CTL_A 0x64000 #define _DDI_BUF_CTL_B 0x64100 -#define DDI_BUF_CTL(i915, port) ({ \ - (void)i915; /* Suppress unused variable warning */ \ - _MMIO_PORT(port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B); \ -}) +#define DDI_BUF_CTL(i915, port) _MMIO_DDI(i915, port, _DDI_BUF_CTL_A, _DDI_BUF_CTL_B) #define DDI_BUF_CTL_ENABLE(1 << 31) #define DDI_BUF_TRANS_SELECT(n) ((n) << 24) @@ -7075,10 +7073,8 @@ enum skl_power_gate { #define PORT_CLK_SEL_NONE REG_FIELD_PREP(PORT_CLK_SEL_MASK, 7) /* On ICL+ this is the same as PORT_CLK_SEL, but all bits change. 
*/ -#define DDI_CLK_SEL(i915, port)({ \ - (void)i915; /* Suppress unused variable warning */ \ - PORT_CLK_SEL(port); \ - }) +#define DDI_CLK_SEL(i915, port)_MMIO_DDI(i915, port, _PORT_CLK_SEL_A, _PORT_CLK_SEL_B) + #define DDI_CLK_SEL_MASK REG_GENMASK(31, 28) #define DDI_CLK_SEL_NONE REG_FIELD_PREP(DDI_CLK_SEL_MASK, 0x0) #define DDI_CLK_SEL_MG REG_FIELD_PREP(DDI_CLK_SEL_MASK, 0x8) diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h index c1e2a43d2d1e..f95ff82c3b4a 100644 --- a/include/drm/i915_component.h +++ b/include/drm/i915_component.h @@ -35
[Intel-gfx] [PATCH v5 1/7] drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
This is a prep patch for a patch series in which register offset of the DDI ports are not calculated using the port enums but using a different datastructure part of the device info. So the device info is passed as a parameter to the macro DDI_BUF_CTL but unused yet. Signed-off-by: Balasubramani Vivekanandan --- drivers/gpu/drm/i915/display/icl_dsi.c | 12 +++--- drivers/gpu/drm/i915/display/intel_ddi.c | 39 +++- drivers/gpu/drm/i915/display/intel_display.c | 6 ++- drivers/gpu/drm/i915/display/intel_fdi.c | 14 +++ drivers/gpu/drm/i915/display/intel_tc.c | 6 +-- drivers/gpu/drm/i915/gvt/display.c | 30 +++ drivers/gpu/drm/i915/gvt/handlers.c | 17 + drivers/gpu/drm/i915/i915_reg.h | 6 ++- drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 ++--- 9 files changed, 76 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/i915/display/icl_dsi.c b/drivers/gpu/drm/i915/display/icl_dsi.c index 47f13750f6fa..f7c1f6561423 100644 --- a/drivers/gpu/drm/i915/display/icl_dsi.c +++ b/drivers/gpu/drm/i915/display/icl_dsi.c @@ -548,11 +548,11 @@ static void gen11_dsi_enable_ddi_buffer(struct intel_encoder *encoder) enum port port; for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp |= DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 500)) drm_err(&dev_priv->drm, "DDI port:%c buffer idle\n", @@ -1400,11 +1400,11 @@ static void gen11_dsi_disable_port(struct intel_encoder *encoder) gen11_dsi_ungate_clocks(encoder); for_each_dsi_port(port, intel_dsi->ports) { - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); tmp &= ~DDI_BUF_CTL_ENABLE; - intel_de_write(dev_priv, 
DDI_BUF_CTL(port), tmp); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), tmp); - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 971356237eca..77a986696c76 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,7 +172,7 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, return; } - if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + if (wait_for_us((intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), 8)) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get idle\n", port_name(port)); @@ -189,7 +189,7 @@ static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, return; } - ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + ret = _wait_for(!(intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)) & DDI_BUF_IS_IDLE), IS_DG2(dev_priv) ? 
1200 : 500, 10, 10); if (ret) @@ -730,7 +730,7 @@ static void intel_ddi_get_encoder_pipes(struct intel_encoder *encoder, if (!wakeref) return; - tmp = intel_de_read(dev_priv, DDI_BUF_CTL(port)); + tmp = intel_de_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); if (!(tmp & DDI_BUF_CTL_ENABLE)) goto out; @@ -1397,8 +1397,8 @@ hsw_set_signal_levels(struct intel_encoder *encoder, intel_dp->DP &= ~DDI_BUF_EMP_MASK; intel_dp->DP |= signal_levels; - intel_de_write(dev_priv, DDI_BUF_CTL(port), intel_dp->DP); - intel_de_posting_read(dev_priv, DDI_BUF_CTL(port)); + intel_de_write(dev_priv, DDI_BUF_CTL(dev_priv, port), intel_dp->DP); + intel_de_posting_read(dev_priv, DDI_BUF_CTL(dev_priv, port)); } static void _icl_ddi_enable_clock(struct drm_i915_private *i915, i915_reg_t reg, @@ -2577,10 +2577,10 @@ static void intel_disable_ddi_buf(struct intel_encoder *encoder, bool wait = false; u32 val; - val = intel_de_read(dev_priv,
[Intel-gfx] [PATCH v5 5/7] drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific defintions
Port enums are no longer used in the DDI register offset calculation, so
we can remove the platform-specific port redefinitions. Along with them,
we also get rid of the code required for handling these special
definitions.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_bios.c | 4 +-
 drivers/gpu/drm/i915/display/intel_ddi.c | 23 +--
 drivers/gpu/drm/i915/display/intel_display.c | 6 +--
 drivers/gpu/drm/i915/display/intel_display.h | 2 -
 .../drm/i915/display/intel_display_power.c | 40 +--
 drivers/gpu/drm/i915/i915_pci.c | 4 +-
 include/drm/i915_component.h | 2 +-
 7 files changed, 10 insertions(+), 71 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 4c543e8205ca..ab472fa757d8 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2436,8 +2436,8 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915,
 		[PORT_A] = { DVO_PORT_HDMIA, DVO_PORT_DPA, -1 },
 		[PORT_B] = { DVO_PORT_HDMIB, DVO_PORT_DPB, -1 },
 		[PORT_C] = { DVO_PORT_HDMIC, DVO_PORT_DPC, -1 },
-		[PORT_D_XELPD] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
-		[PORT_E_XELPD] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 },
+		[PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
+		[PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, -1 },
 		[PORT_TC1] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 },
 		[PORT_TC2] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 },
 		[PORT_TC3] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 },
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 74b4271063d1..0b6f884650d3 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4134,17 +4134,6 @@ static bool hti_uses_phy(struct drm_i915_private *i915, enum phy phy)
 		i915->hti_state & HDPORT_DDI_USED(phy);
 }
 
-static enum hpd_pin xelpd_hpd_pin(struct drm_i915_private *dev_priv,
-				  enum port port)
-{
-	if (port >= PORT_D_XELPD)
-		return HPD_PORT_D + port - PORT_D_XELPD;
-	else if (port >= PORT_TC1)
-		return HPD_PORT_TC1 + port - PORT_TC1;
-	else
-		return HPD_PORT_A + port - PORT_A;
-}
-
 static enum hpd_pin dg1_hpd_pin(struct drm_i915_private *dev_priv,
 				enum port port)
 {
@@ -4313,13 +4302,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 	encoder = &dig_port->base;
 	encoder->devdata = devdata;
 
-	if (DISPLAY_VER(dev_priv) >= 13 && port >= PORT_D_XELPD) {
-		drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs,
-				 DRM_MODE_ENCODER_TMDS,
-				 "DDI %c/PHY %c",
-				 port_name(port - PORT_D_XELPD + PORT_D),
-				 phy_name(phy));
-	} else if (DISPLAY_VER(dev_priv) >= 12) {
+	if (DISPLAY_VER(dev_priv) >= 12) {
 		enum tc_port tc_port = intel_port_to_tc(dev_priv, port);
 
 		drm_encoder_init(&dev_priv->drm, &encoder->base, &intel_ddi_funcs,
@@ -4449,9 +4432,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
 
 	intel_ddi_buf_trans_init(encoder);
 
-	if (DISPLAY_VER(dev_priv) >= 13)
-		encoder->hpd_pin = xelpd_hpd_pin(dev_priv, port);
-	else if (IS_DG1(dev_priv))
+	if (IS_DG1(dev_priv))
 		encoder->hpd_pin = dg1_hpd_pin(dev_priv, port);
 	else if (IS_ROCKETLAKE(dev_priv))
 		encoder->hpd_pin = rkl_hpd_pin(dev_priv, port);
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 8681055843f0..febe85a8a9c8 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -2135,9 +2135,7 @@ bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy)
 
 enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port)
 {
-	if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD)
-		return PHY_D + port - PORT_D_XELPD;
-	else if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1)
+	if (DISPLAY_VER(i915) >= 13 && port >= PORT_TC1)
 		return PHY_F + port - PORT_TC1;
 	else if (IS_ALDERLAKE_S(i915) && port >= PORT_TC1)
 		return PHY_B + port - PORT_TC1;
@@ -7907,7 +7905,7 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv)
 		intel_ddi_init(dev_priv, PORT_A);
[Intel-gfx] [PATCH v5 6/7] drm/i915/display: Fix port_identifier function
The port_identifier function was broken while TypeC ports were using
enum aliases: it returned the wrong string for TypeC ports. Now that the
DDI ports have unique enums, fix port_identifier to cover all ports.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_display.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 62604cadf0b8..4a5f7df7492b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -250,6 +250,18 @@ static inline const char *port_identifier(enum port port)
 		return "Port H";
 	case PORT_I:
 		return "Port I";
+	case PORT_TC1:
+		return "Port TC1";
+	case PORT_TC2:
+		return "Port TC2";
+	case PORT_TC3:
+		return "Port TC3";
+	case PORT_TC4:
+		return "Port TC4";
+	case PORT_TC5:
+		return "Port TC5";
+	case PORT_TC6:
+		return "Port TC6";
 	default:
 		return "";
 	}
-- 
2.34.1
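To illustrate why enum aliases break a name lookup like port_identifier, here is a minimal standalone sketch. It assumes an alias of the form PORT_TC1 = PORT_D; the enum and function names below (APORT_*, UPORT_*, aliased_identifier, unique_identifier) are hypothetical and are not taken from the driver. Because an aliased enumerator has the same value as the one it aliases, a switch cannot carry a separate case for it, so the TypeC port is reported under the non-TypeC name.

```c
#include <assert.h>

/* Hypothetical aliased layout: the TypeC name reuses the value of port D. */
enum aliased_port {
	APORT_A, APORT_B, APORT_C, APORT_D,
	APORT_TC1 = APORT_D,
};

/* Hypothetical unique layout: every port has its own value. */
enum unique_port {
	UPORT_A, UPORT_B, UPORT_C, UPORT_D,
	UPORT_TC1,
};

/* The alias is indistinguishable from PORT_D at compile time and at run time. */
static_assert(APORT_TC1 == APORT_D, "alias shares the value of port D");

static const char *aliased_identifier(enum aliased_port port)
{
	switch (port) {
	case APORT_A:
		return "Port A";
	case APORT_B:
		return "Port B";
	case APORT_C:
		return "Port C";
	case APORT_D:
		/* A separate "case APORT_TC1:" would be a duplicate case value. */
		return "Port D";
	default:
		return "<invalid>";
	}
}

static const char *unique_identifier(enum unique_port port)
{
	switch (port) {
	case UPORT_A:
		return "Port A";
	case UPORT_B:
		return "Port B";
	case UPORT_C:
		return "Port C";
	case UPORT_D:
		return "Port D";
	case UPORT_TC1:
		return "Port TC1";
	default:
		return "<invalid>";
	}
}
```

With the aliased layout, aliased_identifier(APORT_TC1) can only ever return "Port D"; with unique values, unique_identifier(UPORT_TC1) returns "Port TC1".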
[Intel-gfx] [PATCH v5 7/7] drm/i915/display: cleanup unused DDI port enums
The DDI port enums PORT_G/H/I were added in commit
"6c8337dafaa9 drm/i915/tgl: Add additional ports for Tiger Lake"
to identify the new ports added on that platform. In subsequent commits,
those ports were identified by the new enums PORT_TC1/TC2/TC3.. to
differentiate TypeC ports from non-TypeC ports. However, the enum
definitions PORT_G/H/I and a few usages of them were left as they were.
These enums are unused today and can be removed.

Signed-off-by: Balasubramani Vivekanandan
---
 drivers/gpu/drm/i915/display/intel_bios.c | 3 ---
 drivers/gpu/drm/i915/display/intel_display.h | 9 ---------
 include/drm/i915_component.h | 2 +-
 3 files changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index ab472fa757d8..b0dfb37e402a 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -2404,9 +2404,6 @@ static enum port dvo_port_to_port(struct drm_i915_private *i915,
 		[PORT_D] = { DVO_PORT_HDMID, DVO_PORT_DPD, -1 },
 		[PORT_E] = { DVO_PORT_HDMIE, DVO_PORT_DPE, DVO_PORT_CRT },
 		[PORT_F] = { DVO_PORT_HDMIF, DVO_PORT_DPF, -1 },
-		[PORT_G] = { DVO_PORT_HDMIG, DVO_PORT_DPG, -1 },
-		[PORT_H] = { DVO_PORT_HDMIH, DVO_PORT_DPH, -1 },
-		[PORT_I] = { DVO_PORT_HDMII, DVO_PORT_DPI, -1 },
 	};
 	/*
 	 * RKL VBT uses PHY based mapping. Combo PHYs A,B,C,D
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index 4a5f7df7492b..5a55b9f43ce3 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -208,9 +208,6 @@ enum port {
 	PORT_D,
 	PORT_E,
 	PORT_F,
-	PORT_G,
-	PORT_H,
-	PORT_I,
 
 	/* Non-TypeC ports must be defined above */
 	PORT_TC1,
@@ -244,12 +241,6 @@ static inline const char *port_identifier(enum port port)
 		return "Port E";
 	case PORT_F:
 		return "Port F";
-	case PORT_G:
-		return "Port G";
-	case PORT_H:
-		return "Port H";
-	case PORT_I:
-		return "Port I";
 	case PORT_TC1:
 		return "Port TC1";
 	case PORT_TC2:
diff --git a/include/drm/i915_component.h b/include/drm/i915_component.h
index 4b31bab5533a..335822d6960a 100644
--- a/include/drm/i915_component.h
+++ b/include/drm/i915_component.h
@@ -35,7 +35,7 @@ enum i915_component_type {
 /* MAX_PORT is the number of port
  * It must be sync with I915_MAX_PORTS defined i915_drv.h
  */
-#define MAX_PORTS 15
+#define MAX_PORTS 12
 
 /**
  * struct i915_audio_component - Used for direct communication between i915 and hda drivers
-- 
2.34.1
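The MAX_PORTS change from 15 to 12 follows from the enum shrinking by three entries. The standalone sketch below shows the arithmetic and one way such a hand-maintained constant could be checked at compile time. It assumes that I915_MAX_PORTS is the terminating enumerator of enum port and that PORT_NONE is -1; neither is shown in the patch above, and the static_assert is an illustration, not part of the series.

```c
#include <assert.h>

/*
 * Assumed shape of enum port after this patch: six non-TypeC ports
 * followed by six TypeC ports. PORT_NONE and I915_MAX_PORTS are
 * assumptions of this sketch.
 */
enum port {
	PORT_NONE = -1,

	PORT_A = 0,
	PORT_B,
	PORT_C,
	PORT_D,
	PORT_E,
	PORT_F,

	/* Non-TypeC ports must be defined above */
	PORT_TC1,
	PORT_TC2,
	PORT_TC3,
	PORT_TC4,
	PORT_TC5,
	PORT_TC6,

	I915_MAX_PORTS
};

/* Value from include/drm/i915_component.h after the patch. */
#define MAX_PORTS 12

/* Fails the build if the two counts drift apart again. */
static_assert(MAX_PORTS == I915_MAX_PORTS,
	      "MAX_PORTS must match the number of entries in enum port");
```

Before the patch, PORT_G, PORT_H and PORT_I added three more enumerators, which gives the old value of 15.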
[Intel-gfx] [PATCH v6 0/7] drm/i915/display: Don't use port enum as register offset
Prior to display version 12, platforms had DDI ports A, B, C, D, E and F,
represented by the enums PORT_A, PORT_B, ... PORT_F. The DDI register
offsets of the ports were in the same order as the ports, so the port
enums were used directly as the index to calculate the register offset of
each port.

Starting with display version 12, TypeC ports were introduced. These were
defined as new enums PORT_TC1, PORT_TC2, ... Later-generation platforms
had the DDI register offsets of TypeC and non-TypeC ports interleaved, and
the existing port enums didn't match the order of the DDI register
offsets. So the enums could no longer be used as the index to calculate
the register offset. This led to the creation of new platform-specific
enums for the ports, like PORT_D_XELPD and PORT_E_XELPD, to match the
index of the ports on those platforms, plus additional code to handle the
special enums.

So we want to decouple the port enums from the DDI register offsets and
take the index used to calculate the register offsets from somewhere
else. The index of the DDI ports on the platform is now defined as part
of device info.

The series includes a few patches at the end that do some cleanup and
fixes made possible by the unique enums for the ports.

v2: ddi_index defined for platforms starting from Gen75. Many platforms
    from Gen75 have DDI support.
v3: Updated DDI_CLK_SEL macro to use the new index for DDI register
    offset calculation.
v4: After removing the d13_port_domains array, d12_port_domains is used
    for all platforms with DISPLAY_VER 12 and above. So the port_end
    member had to be fixed to extend it to ports D and E.

Cc: Jani Nikula
Cc: Ville Syrjälä

Balasubramani Vivekanandan (7):
  drm/i915/display: Pass struct drm_i915_private to DDI_BUF_CTL macro
  drm/i915/display: Pass struct drm_i915_private to DDI_CLK_SEL macro
  drm/i915/display: Define the DDI port indices inside device info
  drm/i915/display: Free port enums from tied to register offset
  drm/i915/display: Remove PORT_D_XELPD/PORT_E_XELPD platform specific definitions
  drm/i915/display: Fix port_identifier function
  drm/i915/display: cleanup unused DDI port enums

 drivers/gpu/drm/i915/display/icl_dsi.c | 12 +--
 drivers/gpu/drm/i915/display/intel_bios.c | 7 +-
 drivers/gpu/drm/i915/display/intel_ddi.c | 80 ---
 drivers/gpu/drm/i915/display/intel_display.c | 12 +--
 drivers/gpu/drm/i915/display/intel_display.h | 29 ---
 .../drm/i915/display/intel_display_power.c | 44 +-
 drivers/gpu/drm/i915/display/intel_fdi.c | 14 ++--
 drivers/gpu/drm/i915/display/intel_tc.c | 6 +-
 drivers/gpu/drm/i915/gvt/display.c | 30 +++
 drivers/gpu/drm/i915/gvt/handlers.c | 17 ++--
 drivers/gpu/drm/i915/i915_pci.c | 46 ++-
 drivers/gpu/drm/i915/i915_reg.h | 7 +-
 drivers/gpu/drm/i915/intel_device_info.h | 1 +
 drivers/gpu/drm/i915/intel_gvt_mmio_table.c | 10 +--
 include/drm/i915_component.h | 2 +-
 15 files changed, 153 insertions(+), 164 deletions(-)

-- 
2.34.1
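The idea described in the cover letter can be sketched in a few lines of standalone C. This is not the driver code: only the ddi_index name comes from the changelog above, while struct device_info, the two example tables, ddi_reg_offset, and the base/stride values are hypothetical and chosen for illustration. The point is that the enum values stay purely logical, and a per-platform table supplies the position of each port in the register layout.

```c
#include <assert.h>
#include <stdint.h>

/* Logical port identifiers; their values carry no register meaning. */
enum port {
	PORT_A, PORT_B, PORT_C, PORT_D, PORT_E,
	PORT_TC1, PORT_TC2, PORT_TC3, PORT_TC4,
	PORT_MAX
};

#define DDI_INDEX_NONE (-1)

/* Hypothetical stand-in for the per-platform device info. */
struct device_info {
	int8_t ddi_index[PORT_MAX];
};

/* Illustrative register base and stride, assumed for this sketch. */
#define DDI_REG_BASE   0x64000u
#define DDI_REG_STRIDE 0x100u

/* Made-up platform where register order follows port order, no TypeC. */
static const struct device_info legacy_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_D] = 3, [PORT_E] = 4,
		[PORT_TC1] = DDI_INDEX_NONE, [PORT_TC2] = DDI_INDEX_NONE,
		[PORT_TC3] = DDI_INDEX_NONE, [PORT_TC4] = DDI_INDEX_NONE,
	},
};

/* Made-up platform where TypeC registers sit between ports C and D. */
static const struct device_info interleaved_info = {
	.ddi_index = {
		[PORT_A] = 0, [PORT_B] = 1, [PORT_C] = 2,
		[PORT_TC1] = 3, [PORT_TC2] = 4, [PORT_TC3] = 5, [PORT_TC4] = 6,
		[PORT_D] = 7, [PORT_E] = 8,
	},
};

/* Returns the register offset for a port, or -1 if the port is absent. */
static int64_t ddi_reg_offset(const struct device_info *info, enum port port)
{
	assert(info);

	if ((int)port < 0 || port >= PORT_MAX)
		return -1;
	if (info->ddi_index[port] == DDI_INDEX_NONE)
		return -1;

	return DDI_REG_BASE + (uint32_t)info->ddi_index[port] * DDI_REG_STRIDE;
}
```

In this sketch PORT_D resolves to a different offset on each platform without any platform-specific enum: the same logical value is looked up in a different table. That is the property that lets definitions such as PORT_D_XELPD be dropped.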