Re: [Intel-gfx] [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-13 Thread Pekka Paalanen
On Tue, 12 Oct 2021 19:11:29 +
"Shankar, Uma"  wrote:

> > -Original Message-
> > From: Pekka Paalanen 
> > Sent: Tuesday, October 12, 2021 5:30 PM
> > To: Simon Ser 
> > Cc: Shankar, Uma ; intel-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; harry.wentl...@amd.com; ville.syrj...@linux.intel.com; brian.star...@arm.com; sebast...@sebastianwick.net; shashank.sha...@amd.com
> > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> > 
> > On Tue, 12 Oct 2021 10:35:37 +
> > Simon Ser  wrote:
> >   
> > > On Tuesday, October 12th, 2021 at 12:30, Pekka Paalanen wrote:
> > >  
> > > > is there a practice of landing proposal documents in the kernel? How
> > > > does that work, will a kernel tree carry the patch files?
> > > > Or should this document be worded like documentation for an accepted
> > > > feature, and then the patches either land or don't?  
> > >
> > > Once everyone agrees, the RFC can land. I don't think a kernel tree is
> > > necessary. See:
> > >
> > > https://dri.freedesktop.org/docs/drm/gpu/rfc/index.html  
> > 
> > Does this mean the RFC doc patch will land, but the code patches will
> > remain in the review cycles waiting for userspace proving vehicles?
> > Rather than e.g. committed as files that people would need to apply
> > themselves? Or how does one find the code patches corresponding to RFC docs?
> 
> As I understand, this section was added to finalize the design and debate
> the UAPI, structures, headers, etc. Once a general agreement is in place
> with all the stakeholders, we can have an ack on the design and approach
> and get it merged. It hence serves as an approved reference for the UAPI,
> accepted and agreed by the community at large.
> 
> Once the code lands, all the documentation will be added to the right
> driver sections and helpers, like it's been done currently.

I'm just wondering: someone browses a kernel tree, and discovers this
RFC doc in there. They want to see or test the latest (WIP) kernel
implementation of it. How will they find the code / patches?


Thanks,
pq




[Intel-gfx] [PATCH v3] lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()

2021-10-13 Thread Vlastimil Babka
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be allocated
from memblock, even if stack depot ends up not actually used. The default size
of stack_table is 4MB on 32-bit, 8MB on 64-bit.

This is fine for use-cases such as KASAN which is also a config option and
has overhead on its own. But it's an issue for functionality that has to be
actually enabled on boot (page_owner) or depends on hardware (GPU drivers)
and thus the memory might be wasted. This was raised as an issue [1] when
attempting to add stackdepot support for SLUB's debug object tracking
functionality. It's common to build kernels with CONFIG_SLUB_DEBUG and enable
slub_debug on boot only when needed, or create only specific kmem caches with
debugging for testing purposes.

It would thus be more efficient if stackdepot's table was allocated only when
actually going to be used. This patch makes the allocation (and the whole
stack_depot_init() call) optional:

- Add a CONFIG_STACKDEPOT_ALWAYS_INIT flag to keep using the current
  well-defined point of allocation as part of mem_init(). Make CONFIG_KASAN
  select this flag.
- Other users have to call stack_depot_init() as part of their own init when
  it's determined that stack depot will actually be used. This may depend on
  both config and runtime conditions. Convert current users which are
  page_owner and several in the DRM subsystem. Same will be done for SLUB
  later.
- Because the init might now be called after the boot-time memblock allocation
  has given all memory to the buddy allocator, change stack_depot_init() to
  allocate stack_table with kvmalloc() when memblock is no longer available.
  Also handle allocation failure by disabling stackdepot (could have
  theoretically happened even with memblock allocation previously), and don't
  unnecessarily align the memblock allocation to its own size anymore. A
  sketch of the resulting init path follows below.
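
For illustration, here is a condensed sketch of the resulting
stack_depot_init() logic (close to, but not verbatim from, this patch;
names of messages and minor details may differ). The default table size
quoted above follows from roughly 2^20 hash buckets times the pointer
size: 4MB on 32-bit, 8MB on 64-bit.

/* Sketch only -- see the actual patch for the authoritative version. */
int stack_depot_init(void)
{
	static DEFINE_MUTEX(stack_depot_init_mutex);
	int ret = 0;

	mutex_lock(&stack_depot_init_mutex);
	if (!stack_depot_disable && !stack_table) {
		size_t size = STACK_HASH_SIZE * sizeof(struct stack_record *);

		if (slab_is_available())	/* late init: buddy/slab is up */
			stack_table = kvmalloc(size, GFP_KERNEL);
		else				/* early init: memblock still owns memory */
			stack_table = memblock_alloc(size, SMP_CACHE_BYTES);

		if (stack_table) {
			memset(stack_table, 0, size);	/* hash buckets start out empty */
		} else {
			pr_err("Stack Depot hash table allocation failed, disabling\n");
			stack_depot_disable = true;
			ret = -ENOMEM;
		}
	}
	mutex_unlock(&stack_depot_init_mutex);
	return ret;
}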

[1] https://lore.kernel.org/all/CAMuHMdW=eovzm1re5fvoen87nkfilmm2+ah7enu2kxehcvb...@mail.gmail.com/

Signed-off-by: Vlastimil Babka 
Acked-by: Dmitry Vyukov 
Reviewed-by: Marco Elver  # stackdepot
Cc: Marco Elver 
Cc: Vijayanand Jitta 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: Andrey Konovalov 
Cc: Dmitry Vyukov 
Cc: Geert Uytterhoeven 
Cc: Oliver Glitta 
Cc: Imran Khan 
---
Changes in v3:
- stack_depot_init_mutex made static and moved inside stack_depot_init()
  Reported-by: kernel test robot 
- use !stack_table condition instead of stack_table == NULL, as reported
  by checkpatch on freedesktop.org patchwork
 drivers/gpu/drm/drm_dp_mst_topology.c   |  1 +
 drivers/gpu/drm/drm_mm.c|  4 +++
 drivers/gpu/drm/i915/intel_runtime_pm.c |  3 +++
 include/linux/stackdepot.h  | 25 ---
 init/main.c |  2 +-
 lib/Kconfig |  4 +++
 lib/Kconfig.kasan   |  2 +-
 lib/stackdepot.c| 33 +
 mm/page_owner.c |  2 ++
 9 files changed, 60 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 86d13d6bc463..b0ebdc843a00 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -5493,6 +5493,7 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
mutex_init(&mgr->probe_lock);
 #if IS_ENABLED(CONFIG_DRM_DEBUG_DP_MST_TOPOLOGY_REFS)
mutex_init(&mgr->topology_ref_history_lock);
+   stack_depot_init();
 #endif
INIT_LIST_HEAD(&mgr->tx_msg_downq);
INIT_LIST_HEAD(&mgr->destroy_port_list);
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 93d48a6f04ab..5916228ea0c9 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -983,6 +983,10 @@ void drm_mm_init(struct drm_mm *mm, u64 start, u64 size)
add_hole(&mm->head_node);
 
mm->scan_active = 0;
+
+#ifdef CONFIG_DRM_DEBUG_MM
+   stack_depot_init();
+#endif
 }
 EXPORT_SYMBOL(drm_mm_init);
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index eaf7688f517d..d083506986e1 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -78,6 +78,9 @@ static void __print_depot_stack(depot_stack_handle_t stack,
 static void init_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm)
 {
spin_lock_init(&rpm->debug.lock);
+
+   if (rpm->available)
+   stack_depot_init();
 }
 
 static noinline depot_stack_handle_t
diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 6bb4bc1a5f54..40fc5e92194f 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -13,6 +13,22 @@
 
 typedef u32 depot_stack_handle_t;
 
+/*
+ * Every user of stack depot has to call this during its own init when it's
+ * decided that stack depot will actually be used.

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)

2021-10-13 Thread Patchwork
== Series Details ==

Series: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)
URL   : https://patchwork.freedesktop.org/series/95549/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
50fb572ebbb4 lib/stackdepot: allow optional init and stack_table allocation by kvmalloc()
-:7: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#7: 
Currently, enabling CONFIG_STACKDEPOT means its stack_table will be allocated

-:209: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!stack_table"
#209: FILE: lib/stackdepot.c:175:
+   if (!stack_depot_disable && stack_table == NULL) {

total: 0 errors, 1 warnings, 1 checks, 147 lines checked




[Intel-gfx] ✓ Fi.CI.BAT: success for lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)

2021-10-13 Thread Patchwork
== Series Details ==

Series: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)
URL   : https://patchwork.freedesktop.org/series/95549/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10728 -> Patchwork_21326


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_21326:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@hangcheck:
- {fi-jsl-1}: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-jsl-1/igt@i915_selftest@l...@hangcheck.html

  
Known issues


  Here are the changes found in Patchwork_21326 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-kbl-soraka:  NOTRUN -> [SKIP][3] ([fdo#109271]) +3 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@kms_flip@basic-flip-vs-modeset@c-dp1:
- fi-cfl-8109u:   [PASS][4] -> [FAIL][5] ([i915#4165])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-mode...@c-dp1.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-cfl-8109u/igt@kms_flip@basic-flip-vs-mode...@c-dp1.html

  * igt@kms_frontbuffer_tracking@basic:
- fi-cml-u2:  [PASS][6] -> [DMESG-WARN][7] ([i915#4269])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html
- fi-cfl-8109u:   [PASS][8] -> [FAIL][9] ([i915#2546])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-cfl-8109u/igt@kms_frontbuffer_track...@basic.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-cfl-8109u/igt@kms_frontbuffer_track...@basic.html

  
 Possible fixes 

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][10] ([i915#3303]) -> [PASS][11]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html

  * igt@i915_selftest@live@perf:
- {fi-tgl-dsi}:   [DMESG-WARN][12] ([i915#2867]) -> [PASS][13] +9 
similar issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-tgl-dsi/igt@i915_selftest@l...@perf.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/fi-tgl-dsi/igt@i915_selftest@l...@perf.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#2546]: https://gitlab.freedesktop.org/drm/intel/issues/2546
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#3970]: https://gitlab.freedesktop.org/drm/intel/issues/3970
  [i915#4165]: https://gitlab.freedesktop.org/drm/intel/issues/4165
  [i915#4269]: https://gitlab.freedesktop.org/drm/intel/issues/4269


Participating hosts (41 -> 35)
--

  Missing(6): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-apl-guc fi-ctg-p8600 
fi-icl-y 


Build changes
-

  * Linux: CI_DRM_10728 -> Patchwork_21326

  CI-20190529: 20190529
  CI_DRM_10728: 82a9f298afec66c882e710078138891826ce5e22 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6242: 721fd85ee95225ed5df322f7182bdfa9b86a3e68 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_21326: 50fb572ebbb41c369436391aba246b388d6b0f13 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

50fb572ebbb4 lib/stackdepot: allow optional init and stack_table allocation by 
kvmalloc()

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/index.html


Re: [Intel-gfx] [PATCH v2] drm/i915: Remove memory frequency calculation

2021-10-13 Thread Zhao, Yakui




On 2021/10/13 10:54, Matt Roper wrote:

On Tue, Oct 12, 2021 at 06:00:46PM -0700, José Roberto de Souza wrote:

This memory frequency calculation is only used to check whether the
result is zero, which is not useful as it will never actually be zero.

Also the calculation is wrong: we should be checking another bit to
select the appropriate frequency multiplier, while this code is stuck
with a fixed multiplier.

So drop it as a whole.

v2:
- Also remove memory frequency calculation for gen9 LP platforms

Cc: Yakui Zhao 
Cc: Matt Roper 
Fixes: f8112cb9574b ("drm/i915/gen11+: Only load DRAM information from pcode")
Signed-off-by: José Roberto de Souza 


Reviewed-by: Matt Roper 


After removing the memory frequency check, the EHL SBL can work as
expected. Otherwise it will fail some checks in intel_dram_detect()
because of the incorrect memory frequency calculation.


Add: Tested-by: Zhao Yakui 



---
  drivers/gpu/drm/i915/i915_reg.h   |  8 
  drivers/gpu/drm/i915/intel_dram.c | 30 ++
  2 files changed, 2 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a897f4abea0c3..8825f7ac477b6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -11109,12 +11109,6 @@ enum skl_power_gate {
  #define  DC_STATE_DEBUG_MASK_CORES(1 << 0)
  #define  DC_STATE_DEBUG_MASK_MEMORY_UP(1 << 1)
  
-#define BXT_P_CR_MC_BIOS_REQ_0_0_0	_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)

-#define  BXT_REQ_DATA_MASK 0x3F
-#define  BXT_DRAM_CHANNEL_ACTIVE_SHIFT 12
-#define  BXT_DRAM_CHANNEL_ACTIVE_MASK  (0xF << 12)
-#define  BXT_MEMORY_FREQ_MULTIPLIER_HZ 1
-
  #define BXT_D_CR_DRP0_DUNIT8  0x1000
  #define BXT_D_CR_DRP0_DUNIT9  0x1200
  #define  BXT_D_CR_DRP0_DUNIT_START8
@@ -11145,9 +11139,7 @@ enum skl_power_gate {
  #define  BXT_DRAM_TYPE_LPDDR4 (0x2 << 22)
  #define  BXT_DRAM_TYPE_DDR4   (0x4 << 22)
  
-#define SKL_MEMORY_FREQ_MULTIPLIER_HZ		2

  #define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04)
-#define  SKL_REQ_DATA_MASK (0xF << 0)
  #define  DG1_GEAR_TYPEREG_BIT(16)
  
  #define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5000)

diff --git a/drivers/gpu/drm/i915/intel_dram.c b/drivers/gpu/drm/i915/intel_dram.c
index 30a0cab5eff46..0adadfd9528aa 100644
--- a/drivers/gpu/drm/i915/intel_dram.c
+++ b/drivers/gpu/drm/i915/intel_dram.c
@@ -244,7 +244,6 @@ static int
  skl_get_dram_info(struct drm_i915_private *i915)
  {
struct dram_info *dram_info = &i915->dram_info;
-   u32 mem_freq_khz, val;
int ret;
  
  	dram_info->type = skl_get_dram_type(i915);

@@ -255,17 +254,6 @@ skl_get_dram_info(struct drm_i915_private *i915)
if (ret)
return ret;
  
-	val = intel_uncore_read(&i915->uncore,

-   SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
-   mem_freq_khz = DIV_ROUND_UP((val & SKL_REQ_DATA_MASK) *
-   SKL_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
-
-   if (dram_info->num_channels * mem_freq_khz == 0) {
-   drm_info(&i915->drm,
-"Couldn't get system memory bandwidth\n");
-   return -EINVAL;
-   }
-
return 0;
  }
  
@@ -350,24 +338,10 @@ static void bxt_get_dimm_info(struct dram_dimm_info *dimm, u32 val)

  static int bxt_get_dram_info(struct drm_i915_private *i915)
  {
struct dram_info *dram_info = &i915->dram_info;
-   u32 dram_channels;
-   u32 mem_freq_khz, val;
-   u8 num_active_channels, valid_ranks = 0;
+   u32 val;
+   u8 valid_ranks = 0;
int i;
  
-	val = intel_uncore_read(&i915->uncore, BXT_P_CR_MC_BIOS_REQ_0_0_0);

-   mem_freq_khz = DIV_ROUND_UP((val & BXT_REQ_DATA_MASK) *
-   BXT_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
-
-   dram_channels = val & BXT_DRAM_CHANNEL_ACTIVE_MASK;
-   num_active_channels = hweight32(dram_channels);
-
-   if (mem_freq_khz * num_active_channels == 0) {
-   drm_info(&i915->drm,
-"Couldn't get system memory bandwidth\n");
-   return -EINVAL;
-   }
-
/*
 * Now read each DUNIT8/9/10/11 to check the rank of each dimms.
 */
--
2.33.0





Re: [Intel-gfx] [PATCH v5] drm/i915/gt: move remaining debugfs interfaces into gt

2021-10-13 Thread Andi Shyti
Hi,

sorry, just forgot to add the changelog

On Wed, Oct 13, 2021 at 12:17:38AM +0200, Andi Shyti wrote:
> From: Andi Shyti 
> 
> The following interfaces:
> 
>   i915_wedged
>   i915_forcewake_user
> 
> are dependent on gt values. Put them inside gt/ and drop the
> "i915_" prefix name. This would be the new structure:
> 
>   dri/0/gt
>   |
>   +-- forcewake_user
>   |
>   \-- reset
> 
> For backwards compatibility with existing igt (and the slight
> semantic difference between operating on the i915 abi entry
> points and the deep gt info):
> 
>   dri/0
>   |
>   +-- i915_wedged
>   |
>   \-- i915_forcewake_user
> 
> remain at the top level.
> 
> Signed-off-by: Andi Shyti 
> Cc: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Reviewed-by: Lucas De Marchi 
> ---

Changelog:
--
v4 -> v5: https://patchwork.freedesktop.org/patch/458293/
 * rename static functions exposed to header files so that they
   can keep a coherent namespace (thanks Lucas!)
 * add Lucas r-b.

v3 -> v4: https://patchwork.freedesktop.org/patch/458225/
 * remove the unnecessary interrupt_info_show() information; it was
   already removed here by Chris:

cf977e18610e6 ("drm/i915/gem: Spring clean debugfs")

v2 -> v3: https://patchwork.freedesktop.org/patch/458108/
 * keep the original interfaces as they were (thanks Chris) but
   implement the functionality inside the gt. The upper level
   files will call the gt functions (thanks Lucas).

v1 -> v2: https://patchwork.freedesktop.org/patch/456652/
 * keep the original interfaces intact (thanks Chris).

Andi
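
As a generic illustration (not code from this series, which uses i915's
own gt debugfs helpers), a gt-level "reset" entry like the one in the
tree above could be wired up with plain kernel debugfs helpers; all
names below are hypothetical:

#include <linux/debugfs.h>
#include <linux/fs.h>

/* Placeholder logic: report and set the wedged state of the GT. */
static int reset_get(void *data, u64 *val)
{
	*val = 0;	/* e.g. query intel_gt_is_wedged() in real code */
	return 0;
}

static int reset_set(void *data, u64 val)
{
	/* e.g. trigger a GT reset here */
	return 0;
}

DEFINE_SIMPLE_ATTRIBUTE(reset_fops, reset_get, reset_set, "%llu\n");

static void gt_debugfs_register(struct dentry *dri_root, void *gt)
{
	/* dri/0/gt/reset, matching the layout described above */
	struct dentry *root = debugfs_create_dir("gt", dri_root);

	debugfs_create_file("reset", 0644, root, gt, &reset_fops);
}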


[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gt: move remaining debugfs interfaces into gt (rev12)

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: move remaining debugfs interfaces into gt (rev12)
URL   : https://patchwork.freedesktop.org/series/75333/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10728_full -> Patchwork_21322_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21322_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21322_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21322_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_frontbuffer_tracking@fbc-suspend:
- shard-kbl:  [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl7/igt@kms_frontbuffer_track...@fbc-suspend.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-kbl4/igt@kms_frontbuffer_track...@fbc-suspend.html

  
Known issues


  Here are the changes found in Patchwork_21322_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
- shard-apl:  NOTRUN -> [DMESG-WARN][3] ([i915#180]) +2 similar 
issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-apl8/igt@gem_ctx_isolation@preservation...@bcs0.html

  * igt@gem_ctx_persistence@engines-mixed-process:
- shard-snb:  NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-snb6/igt@gem_ctx_persiste...@engines-mixed-process.html

  * igt@gem_ctx_shared@q-in-order:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271]) +294 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-snb7/igt@gem_ctx_sha...@q-in-order.html

  * igt@gem_eio@in-flight-suspend:
- shard-kbl:  [PASS][6] -> [DMESG-WARN][7] ([i915#180]) +1 similar 
issue
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl4/igt@gem_...@in-flight-suspend.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-kbl7/igt@gem_...@in-flight-suspend.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-glk9/igt@gem_exec_f...@basic-deadline.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-glk9/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-tglb: NOTRUN -> [FAIL][10] ([i915#2842]) +1 similar issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-tglb1/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-glk:  [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-glk6/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][13] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-iclb1/igt@gem_exec_fair@basic-p...@vcs1.html
- shard-tglb: [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb5/igt@gem_exec_fair@basic-p...@vcs1.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-tglb2/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_schedule@u-submit-golden-slice@vecs0:
- shard-skl:  NOTRUN -> [INCOMPLETE][16] ([i915#3797])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-skl7/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html

  * igt@gem_exec_whisper@basic-fds-forked:
- shard-glk:  [PASS][17] -> [DMESG-WARN][18] ([i915#118]) +1 
similar issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-glk6/igt@gem_exec_whis...@basic-fds-forked.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-glk1/igt@gem_exec_whis...@basic-fds-forked.html

  * igt@gem_pxp@reject-modify-context-protection-off-2:
- shard-tglb: NOTRUN -> [SKIP][19] ([i915#4270])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-tglb8/igt@gem_...@reject-modify-context-protection-off-2.html

  * igt@gem_render_copy@x-tiled-to-vebox-yf-tiled:
- shard-kbl:  NOTRUN -> [SKIP][20] ([fdo#109271]) +100 similar 
issues
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21322/shard-kbl6/i

Re: [Intel-gfx] [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline

2021-10-13 Thread Pekka Paalanen
On Tue, 12 Oct 2021 20:58:27 +
"Shankar, Uma"  wrote:

> > -Original Message-
> > From: Pekka Paalanen 
> > Sent: Tuesday, October 12, 2021 4:01 PM
> > To: Shankar, Uma 
> > Cc: intel-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; harry.wentl...@amd.com; ville.syrj...@linux.intel.com; brian.star...@arm.com; sebast...@sebastianwick.net; shashank.sha...@amd.com
> > Subject: Re: [RFC v2 01/22] drm: RFC for Plane Color Hardware Pipeline
> > 
> > On Tue,  7 Sep 2021 03:08:43 +0530
> > Uma Shankar  wrote:
> >   
> > > This is an RFC proposal for plane color hardware blocks.
> > > It exposes the property interface to userspace and calls out the
> > > details of the interfaces created and their intended purpose.
> > >
> > > Credits: Ville Syrjälä 
> > > Signed-off-by: Uma Shankar 
> > > ---
> > >  Documentation/gpu/rfc/drm_color_pipeline.rst | 167 +++
> > >  1 file changed, 167 insertions(+)
> > >  create mode 100644 Documentation/gpu/rfc/drm_color_pipeline.rst
> > >
> > > diff --git a/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > new file mode 100644
> > > index ..0d1ca858783b
> > > --- /dev/null
> > > +++ b/Documentation/gpu/rfc/drm_color_pipeline.rst
> > > @@ -0,0 +1,167 @@
> > > +==
> > > +Display Color Pipeline: Proposed DRM Properties  

...

> > > +Proposal is to have below properties for a plane:
> > > +
> > > +* Plane Degamma or Pre-Curve:
> > > + * This will be used to linearize the input framebuffer data.
> > > + * It will apply the reverse of the color transfer function.
> > > + * It can be a degamma curve or OETF for HDR.  
> > 
> > As you want to produce light-linear values, you use EOTF or inverse OETF.
> > 
> > The term OETF has a built-in assumption that that happens in a camera:
> > it takes in light and produces an electrical signal. Lately I have 
> > personally started talking about non-linear encoding of color values, 
> > since EOTF is often associated with displays if nothing else is said 
> > (taking in an electrical signal and producing light).
> > 
> > So this would be decoding the color values into light-linear color 
> > values. That is what an EOTF does, yes, but I feel there is a nuanced 
> > difference. A piece of equipment implements an EOTF by turning an 
> > electrical signal into light, hence EOTF often refers to specific 
> > equipment. You could talk about content EOTF to denote content value 
> > encoding, as opposed to output or display EOTF, but that might be 
> > confusing if you look at e.g. the diagrams in BT.2100: is it the EOTF or is 
> > it the inverse OETF? Is the (inverse?) OOTF included?
> > 
> > So I try to side-step those questions by talking about encoding.  
> 
> The idea here is that the frame buffer presented to the display plane
> engine will be non-linear.
> So the output of a media decode should result in content with the EOTF applied.

Hi,

sure, but the question is: which EOTF. There can be many different
things called "EOTF" in a single pipeline, and then it's up to the
document writer to make the difference between them. Comparing two
documents with different conventions causes a lot of confusion in my
personal experience, so it is good to define the concepts more
carefully.

> So the output of a media decode should result in content with the EOTF applied.

I suspect you have it backwards. Media decode produces electrical
(non-linear) pixel color values. If EOTF was applied, they would be
linear instead (and require more memory to achieve the same visual
precision).

If you want to put it this way, you could say "with inverse EOTF
applied", but that might be slightly confusing because it is already
baked into the video; it's not something a media decoder has to
specifically apply, I think. However, the (inverse) EOTF in this case
is the content EOTF, not the display EOTF.

If content and display EOTF differ, then one must apply first content
EOTF and then inverse display EOTF to get values that are correctly
encoded for the display. (This is necessary but not sufficient in
general.) Mind, that this is not an OOTF nor an artistic adjustment,
this is purely a value encoding conversion.
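
In symbols (my notation), that pure re-encoding step is:

  V_display = EOTF_display^-1( EOTF_content( V_content ) )

where the content EOTF decodes the values to light-linear and the
inverse display EOTF re-encodes them for the display; no OOTF and no
artistic adjustment is involved.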

> Playback transfer function (EOTF): inverse OETF plus rendering intent gamma. 

Does "rendering intent gamma" refer to artistic adjustments, not OOTF?

cf. BT.2100 Annex 1, "The relationship between the OETF, the EOTF and
the OOTF", although I find those diagrams somewhat confusing still. It
does not seem to clearly account for transmission non-linear encoding
being different from the display EOTF.

Different documents use OOTF to refer to different things. Then there
is also the fundamental difference between PQ and HLG systems, where
OOTF is by definition in different places of the
camera-transmission-display pipeline.

> 
> To make it linear, we should apply the OETF. Confusion is whether OETF is 
> equivalent to
> in

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Remove memory frequency calculation (rev2)

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: Remove memory frequency calculation (rev2)
URL   : https://patchwork.freedesktop.org/series/95748/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10728_full -> Patchwork_21324_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21324_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21324_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21324_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_cursor_edge_walk@pipe-d-128x128-bottom-edge:
- shard-tglb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb3/igt@kms_cursor_edge_w...@pipe-d-128x128-bottom-edge.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-tglb6/igt@kms_cursor_edge_w...@pipe-d-128x128-bottom-edge.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
- shard-kbl:  [PASS][3] -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl7/igt@kms_frontbuffer_track...@fbc-suspend.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-kbl3/igt@kms_frontbuffer_track...@fbc-suspend.html

  
Known issues


  Here are the changes found in Patchwork_21324_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@engines-mixed-process:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-snb2/igt@gem_ctx_persiste...@engines-mixed-process.html

  * igt@gem_ctx_shared@q-in-order:
- shard-snb:  NOTRUN -> [SKIP][6] ([fdo#109271]) +294 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-snb5/igt@gem_ctx_sha...@q-in-order.html

  * igt@gem_eio@unwedge-stress:
- shard-skl:  [PASS][7] -> [TIMEOUT][8] ([i915#2369] / [i915#3063])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-skl1/igt@gem_...@unwedge-stress.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-skl1/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842]) +3 similar 
issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb8/igt@gem_exec_fair@basic-f...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-tglb8/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-tglb: NOTRUN -> [FAIL][11] ([i915#2842]) +1 similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-tglb5/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
- shard-kbl:  [PASS][12] -> [FAIL][13] ([i915#2842]) +2 similar 
issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][14] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-iclb2/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_schedule@u-submit-golden-slice@vecs0:
- shard-skl:  NOTRUN -> [INCOMPLETE][15] ([i915#3797])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-skl3/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html

  * igt@gem_pxp@reject-modify-context-protection-off-2:
- shard-tglb: NOTRUN -> [SKIP][16] ([i915#4270]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-tglb3/igt@gem_...@reject-modify-context-protection-off-2.html

  * igt@gem_render_copy@x-tiled-to-vebox-yf-tiled:
- shard-kbl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +125 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-kbl7/igt@gem_render_c...@x-tiled-to-vebox-yf-tiled.html

  * igt@gem_softpin@evict-snoop:
- shard-tglb: NOTRUN -> [SKIP][18] ([fdo#109312])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-tglb3/igt@gem_soft...@evict-snoop.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][19] ([i915#3002])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21324/shard-apl6/igt@gem_userptr_bl...@input-checking.html

  * igt@gem_userptr_blits@unsync-unm

Re: [Intel-gfx] [PATCH v2] drm/i915: Remove memory frequency calculation

2021-10-13 Thread Ville Syrjälä
On Tue, Oct 12, 2021 at 06:00:46PM -0700, José Roberto de Souza wrote:
> This memory frequency calculation is only used to check whether the
> result is zero, which is not useful as it will never actually be zero.
> 
> Also the calculation is wrong: we should be checking another bit to
> select the appropriate frequency multiplier, while this code is stuck
> with a fixed multiplier.

I don't think the alternate ref clock was ever used.
At least I don't recall ever seeing it.

The real problem with this is that IIRC this is just the last
requested frequency. So on a system with SAGV this will
change dynamically.

> 
> So drop it as a whole.

We have a second copy of this in gen6_update_ring_freq(). Rather
than removing one and leaving another potentially broken one behind,
we should probably just consolidate on a single implementation.

> 
> v2:
> - Also remove memory frequency calculation for gen9 LP platforms
> 
> Cc: Yakui Zhao 
> Cc: Matt Roper 
> Fixes: f8112cb9574b ("drm/i915/gen11+: Only load DRAM information from pcode")
> Signed-off-by: José Roberto de Souza 
> ---
>  drivers/gpu/drm/i915/i915_reg.h   |  8 
>  drivers/gpu/drm/i915/intel_dram.c | 30 ++
>  2 files changed, 2 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index a897f4abea0c3..8825f7ac477b6 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -11109,12 +11109,6 @@ enum skl_power_gate {
>  #define  DC_STATE_DEBUG_MASK_CORES   (1 << 0)
>  #define  DC_STATE_DEBUG_MASK_MEMORY_UP   (1 << 1)
>  
> -#define BXT_P_CR_MC_BIOS_REQ_0_0_0   _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
> -#define  BXT_REQ_DATA_MASK   0x3F
> -#define  BXT_DRAM_CHANNEL_ACTIVE_SHIFT   12
> -#define  BXT_DRAM_CHANNEL_ACTIVE_MASK(0xF << 12)
> -#define  BXT_MEMORY_FREQ_MULTIPLIER_HZ   1
> -
>  #define BXT_D_CR_DRP0_DUNIT8 0x1000
>  #define BXT_D_CR_DRP0_DUNIT9 0x1200
>  #define  BXT_D_CR_DRP0_DUNIT_START   8
> @@ -11145,9 +11139,7 @@ enum skl_power_gate {
>  #define  BXT_DRAM_TYPE_LPDDR4(0x2 << 22)
>  #define  BXT_DRAM_TYPE_DDR4  (0x4 << 22)
>  
> -#define SKL_MEMORY_FREQ_MULTIPLIER_HZ2
>  #define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04)
> -#define  SKL_REQ_DATA_MASK   (0xF << 0)
>  #define  DG1_GEAR_TYPE   REG_BIT(16)
>  
>  #define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5000)
> diff --git a/drivers/gpu/drm/i915/intel_dram.c b/drivers/gpu/drm/i915/intel_dram.c
> index 30a0cab5eff46..0adadfd9528aa 100644
> --- a/drivers/gpu/drm/i915/intel_dram.c
> +++ b/drivers/gpu/drm/i915/intel_dram.c
> @@ -244,7 +244,6 @@ static int
>  skl_get_dram_info(struct drm_i915_private *i915)
>  {
>   struct dram_info *dram_info = &i915->dram_info;
> - u32 mem_freq_khz, val;
>   int ret;
>  
>   dram_info->type = skl_get_dram_type(i915);
> @@ -255,17 +254,6 @@ skl_get_dram_info(struct drm_i915_private *i915)
>   if (ret)
>   return ret;
>  
> - val = intel_uncore_read(&i915->uncore,
> - SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
> - mem_freq_khz = DIV_ROUND_UP((val & SKL_REQ_DATA_MASK) *
> - SKL_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
> -
> - if (dram_info->num_channels * mem_freq_khz == 0) {
> - drm_info(&i915->drm,
> -  "Couldn't get system memory bandwidth\n");
> - return -EINVAL;
> - }
> -
>   return 0;
>  }
>  
> @@ -350,24 +338,10 @@ static void bxt_get_dimm_info(struct dram_dimm_info *dimm, u32 val)
>  static int bxt_get_dram_info(struct drm_i915_private *i915)
>  {
>   struct dram_info *dram_info = &i915->dram_info;
> - u32 dram_channels;
> - u32 mem_freq_khz, val;
> - u8 num_active_channels, valid_ranks = 0;
> + u32 val;
> + u8 valid_ranks = 0;
>   int i;
>  
> - val = intel_uncore_read(&i915->uncore, BXT_P_CR_MC_BIOS_REQ_0_0_0);
> - mem_freq_khz = DIV_ROUND_UP((val & BXT_REQ_DATA_MASK) *
> - BXT_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
> -
> - dram_channels = val & BXT_DRAM_CHANNEL_ACTIVE_MASK;
> - num_active_channels = hweight32(dram_channels);
> -
> - if (mem_freq_khz * num_active_channels == 0) {
> - drm_info(&i915->drm,
> -  "Couldn't get system memory bandwidth\n");
> - return -EINVAL;
> - }
> -
>   /*
>* Now read each DUNIT8/9/10/11 to check the rank of each dimms.
>*/
> -- 
> 2.33.0

-- 
Ville Syrjälä
Intel


[Intel-gfx] [PATCH 0/1] drm/i915: vlv sideband

2021-10-13 Thread Jani Nikula
Three main ideas here:

- vlv sideband only has the name "sideband" in common with the rest of
  intel_sideband.[ch]

- we may need better abstractions on the  dependency,
  this should help a little bit; maybe vlv_sideband.[ch] can be turned
  into that abstraction layer

- we probably want to split out sideband registers from i915_reg.h, and
  they could go to vlv_sideband.h or vlv_sideband_reg.h or something

BR,
Jani.


Cc: Lucas De Marchi 
Cc: Ville Syrjälä 


Jani Nikula (1):
  drm/i915: split out vlv sideband to a separate file

 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/g4x_dp.c |   2 +-
 drivers/gpu/drm/i915/display/g4x_hdmi.c   |   2 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c|   1 +
 drivers/gpu/drm/i915/display/intel_display.c  |   1 +
 .../drm/i915/display/intel_display_debugfs.c  |   1 -
 .../drm/i915/display/intel_display_power.c|   4 +-
 drivers/gpu/drm/i915/display/intel_dp.c   |   1 -
 drivers/gpu/drm/i915/display/intel_dpio_phy.c |   5 +-
 drivers/gpu/drm/i915/display/intel_dpll.c |   2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  |   2 +-
 drivers/gpu/drm/i915/display/vlv_dsi.c|   2 +-
 drivers/gpu/drm/i915/display/vlv_dsi_pll.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |   1 +
 drivers/gpu/drm/i915/gt/intel_rps.c   |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c   |   1 -
 drivers/gpu/drm/i915/i915_sysfs.c |   1 -
 drivers/gpu/drm/i915/intel_pm.c   |   1 +
 drivers/gpu/drm/i915/intel_sideband.c | 257 -
 drivers/gpu/drm/i915/intel_sideband.h | 110 
 drivers/gpu/drm/i915/vlv_sideband.c   | 266 ++
 drivers/gpu/drm/i915/vlv_sideband.h   | 123 
 22 files changed, 405 insertions(+), 382 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/vlv_sideband.c
 create mode 100644 drivers/gpu/drm/i915/vlv_sideband.h

-- 
2.30.2



[Intel-gfx] [PATCH 1/1] drm/i915: split out vlv sideband to a separate file

2021-10-13 Thread Jani Nikula
The VLV/CHV sideband code is pretty distinct from the rest of the
sideband code. Split it out to new vlv_sideband.[ch].

Pure code movement with relevant #include changes, and a tiny checkpatch
fix on top.

Cc: Lucas De Marchi 
Cc: Ville Syrjälä 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/g4x_dp.c |   2 +-
 drivers/gpu/drm/i915/display/g4x_hdmi.c   |   2 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c|   1 +
 drivers/gpu/drm/i915/display/intel_display.c  |   1 +
 .../drm/i915/display/intel_display_debugfs.c  |   1 -
 .../drm/i915/display/intel_display_power.c|   4 +-
 drivers/gpu/drm/i915/display/intel_dp.c   |   1 -
 drivers/gpu/drm/i915/display/intel_dpio_phy.c |   5 +-
 drivers/gpu/drm/i915/display/intel_dpll.c |   2 +-
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c  |   2 +-
 drivers/gpu/drm/i915/display/vlv_dsi.c|   2 +-
 drivers/gpu/drm/i915/display/vlv_dsi_pll.c|   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |   1 +
 drivers/gpu/drm/i915/gt/intel_rps.c   |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c   |   1 -
 drivers/gpu/drm/i915/i915_sysfs.c |   1 -
 drivers/gpu/drm/i915/intel_pm.c   |   1 +
 drivers/gpu/drm/i915/intel_sideband.c | 257 -
 drivers/gpu/drm/i915/intel_sideband.h | 110 
 drivers/gpu/drm/i915/vlv_sideband.c   | 266 ++
 drivers/gpu/drm/i915/vlv_sideband.h   | 123 
 22 files changed, 405 insertions(+), 382 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/vlv_sideband.c
 create mode 100644 drivers/gpu/drm/i915/vlv_sideband.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 21b05ed0e4e8..d50d2b144fc6 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -54,6 +54,7 @@ i915-y += i915_drv.o \
  intel_step.o \
  intel_uncore.o \
  intel_wakeref.o \
+ vlv_sideband.o \
  vlv_suspend.o
 
 # core library code
diff --git a/drivers/gpu/drm/i915/display/g4x_dp.c b/drivers/gpu/drm/i915/display/g4x_dp.c
index 85a09c3e09e8..dc41868d01ef 100644
--- a/drivers/gpu/drm/i915/display/g4x_dp.c
+++ b/drivers/gpu/drm/i915/display/g4x_dp.c
@@ -18,7 +18,7 @@
 #include "intel_hdmi.h"
 #include "intel_hotplug.h"
 #include "intel_pps.h"
-#include "intel_sideband.h"
+#include "vlv_sideband.h"
 
 struct dp_link_dpll {
int clock;
diff --git a/drivers/gpu/drm/i915/display/g4x_hdmi.c b/drivers/gpu/drm/i915/display/g4x_hdmi.c
index be352e9f0afc..88c427f3c346 100644
--- a/drivers/gpu/drm/i915/display/g4x_hdmi.c
+++ b/drivers/gpu/drm/i915/display/g4x_hdmi.c
@@ -14,8 +14,8 @@
 #include "intel_fifo_underrun.h"
 #include "intel_hdmi.h"
 #include "intel_hotplug.h"
-#include "intel_sideband.h"
 #include "intel_sdvo.h"
+#include "vlv_sideband.h"
 
 static void intel_hdmi_prepare(struct intel_encoder *encoder,
   const struct intel_crtc_state *crtc_state)
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index ecb28e8f1eb6..44bb18773509 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -30,6 +30,7 @@
 #include "intel_display_types.h"
 #include "intel_psr.h"
 #include "intel_sideband.h"
+#include "vlv_sideband.h"
 
 /**
  * DOC: CDCLK / RAWCLK
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 9cf987ee143d..3602fdb2a549 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -109,6 +109,7 @@
 #include "i9xx_plane.h"
 #include "skl_scaler.h"
 #include "skl_universal_plane.h"
+#include "vlv_sideband.h"
 
 static void i9xx_crtc_clock_get(struct intel_crtc *crtc,
struct intel_crtc_state *pipe_config);
diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index bc5113589f0a..e04767695530 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -20,7 +20,6 @@
 #include "intel_hdmi.h"
 #include "intel_pm.h"
 #include "intel_psr.h"
-#include "intel_sideband.h"
 #include "intel_sprite.h"
 
 static inline struct drm_i915_private *node_to_i915(struct drm_info_node *node)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c
index 06e9879aedd7..709569211c85 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -3,12 +3,11 @@
  * Copyright © 2019 Intel Corporation
  */
 
-#include "display/intel_crt.h"
-
 #include "i915_drv.h"
 #include "i915_irq.h"
 #include "intel_cdclk.h"
 #include "intel_combo_phy.h"
+#include "intel_crt.h"
 #include "intel_de

[Intel-gfx] ✗ Fi.CI.IGT: failure for lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)

2021-10-13 Thread Patchwork
== Series Details ==

Series: lib/stackdepot: allow optional init and stack_table allocation by kvmalloc() (rev3)
URL   : https://patchwork.freedesktop.org/series/95549/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10728_full -> Patchwork_21326_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21326_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21326_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21326_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_rpm@reg-read-ioctl:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-iclb1/igt@i915_pm_...@reg-read-ioctl.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-iclb7/igt@i915_pm_...@reg-read-ioctl.html

  * igt@kms_frontbuffer_tracking@fbc-suspend:
- shard-kbl:  [PASS][3] -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl7/igt@kms_frontbuffer_track...@fbc-suspend.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-kbl4/igt@kms_frontbuffer_track...@fbc-suspend.html

  

### Piglit changes ###

 Possible regressions 

  * spec@glsl-1.50@execution@built-in-functions@gs-op-assign-mult-ivec2-ivec2 
(NEW):
- pig-snb-2600:   NOTRUN -> [FAIL][5]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/pig-snb-2600/spec@glsl-1.50@execution@built-in-functi...@gs-op-assign-mult-ivec2-ivec2.html

  
New tests
-

  New tests have been introduced between CI_DRM_10728_full and 
Patchwork_21326_full:

### New Piglit tests (1) ###

  * spec@glsl-1.50@execution@built-in-functions@gs-op-assign-mult-ivec2-ivec2:
- Statuses : 1 fail(s)
- Exec time: [0.19] s

  

Known issues


  Here are the changes found in Patchwork_21326_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@engines-mixed-process:
- shard-snb:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-snb6/igt@gem_ctx_persiste...@engines-mixed-process.html

  * igt@gem_exec_fair@basic-deadline:
- shard-skl:  NOTRUN -> [FAIL][7] ([i915#2846])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-skl9/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-tglb: NOTRUN -> [FAIL][8] ([i915#2842]) +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-tglb3/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-glk:  [PASS][9] -> [FAIL][10] ([i915#2842]) +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-glk2/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
- shard-kbl:  [PASS][11] -> [FAIL][12] ([i915#2842]) +1 similar 
issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-kbl6/igt@gem_exec_fair@basic-n...@vcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][13] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-iclb4/igt@gem_exec_fair@basic-p...@vcs1.html
- shard-tglb: [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb5/igt@gem_exec_fair@basic-p...@vcs1.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-tglb6/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_schedule@u-submit-golden-slice@vecs0:
- shard-skl:  NOTRUN -> [INCOMPLETE][16] ([i915#3797])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-skl10/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html

  * igt@gem_fenced_exec_thrash@2-spare-fences:
- shard-snb:  [PASS][17] -> [INCOMPLETE][18] ([i915#2055])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-snb5/igt@gem_fenced_exec_thr...@2-spare-fences.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21326/shard-snb6/igt@gem_fenced_exec_thr...@2-spare-fences.html

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][19] -> [SKIP][20] ([

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: vlv sideband

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: vlv sideband
URL   : https://patchwork.freedesktop.org/series/95764/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
ba91b0757d4b drm/i915: split out vlv sideband to a separate file
-:666: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#666: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 934 lines checked




Re: [Intel-gfx] [PATCH 0/1] drm/i915: vlv sideband

2021-10-13 Thread Ville Syrjälä
On Wed, Oct 13, 2021 at 01:11:58PM +0300, Jani Nikula wrote:
> Three main ideas here:
> 
> - vlv sideband only has the name "sideband" in common with the rest of
>   intel_sideband.[ch]

I wouldn't put it like that. There are two actual sideband
implementations in that file:
- vlv/chv iosf sideband (vlv_sideband)
- lpt/wpt iosf sideband (intel_sbi)

And the third thing in that file is the snb+ pcode mailbox stuff,
which has nothing to do with sideband.

-- 
Ville Syrjälä
Intel


[Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Maarten Lankhorst
No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using the dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune, it's questionable.
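
For illustration, a minimal sketch of the resulting wait loop, assuming
the dma_resv_iter API being introduced around this time
(dma_resv_iter_begin()/dma_resv_for_each_fence_unlocked()/
dma_resv_iter_end()); the actual patch may differ in detail:

/* Sketch only: wait on each fence without allocating memory. */
static long
wait_reservation_sketch(struct dma_resv *resv, unsigned int flags, long timeout)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;

	dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL);
	dma_resv_for_each_fence_unlocked(&cursor, fence) {
		/* i915_gem_object_wait_fence() returns the remaining timeout */
		timeout = i915_gem_object_wait_fence(fence, flags, timeout);
		if (timeout <= 0)
			break;
	}
	dma_resv_iter_end(&cursor);

	return timeout;
}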

This will result in the following lockdep splat.

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
 *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: 
i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [83.540038]  validate_chain+0xb37/0x1e70
<4> [83.540042]  __lock_acquire+0x5a1/0xb70
<4> [83.540046]  lock_acquire+0

Re: [Intel-gfx] [PATCH 0/1] drm/i915: vlv sideband

2021-10-13 Thread Jani Nikula
On Wed, 13 Oct 2021, Ville Syrjälä  wrote:
> On Wed, Oct 13, 2021 at 01:11:58PM +0300, Jani Nikula wrote:
>> Three main ideas here:
>> 
>> - vlv sideband only has the name "sideband" in common with the rest of
>>   intel_sideband.[ch]
>
> I wouldn't put it like that. There are two actual sideband 
> implementtions in that file:
> - vlv/chv iosf sideband (vlv_sideband)
> - lpt/wpt iosf sideband (intel_sbi)
>
> And the third thing in that file is the snb+ pcode mailbox stuff,
> which has nothing to do with sideband.

Fair enough... but no opposition to the splitting out of vlv/chv iosf
sideband? vlv_sideband.[ch] like here? I'm fine with renaming too.

I can follow up with lpt/wpt iosf split out (intel_sbi.[ch]?) and snb+
pcode (intel_pcode.[ch]?).

I think we've just put all of them together way back when this was all
probably bundled in i915_drv.c or something...


BR,
Jani.



-- 
Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: vlv sideband

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: vlv sideband
URL   : https://patchwork.freedesktop.org/series/95764/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10728 -> Patchwork_21327


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/index.html

Known issues


  Here are the changes found in Patchwork_21327 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@hangcheck:
- fi-ivb-3770:[PASS][1] -> [INCOMPLETE][2] ([i915#3303])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_chamelium@vga-hpd-fast:
- fi-kbl-guc: NOTRUN -> [SKIP][3] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-kbl-guc/igt@kms_chamel...@vga-hpd-fast.html

  * igt@kms_frontbuffer_tracking@basic:
- fi-cml-u2:  [PASS][4] -> [DMESG-WARN][5] ([i915#4269])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-guc: NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#533])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-kbl-guc/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-c:
- fi-kbl-guc: NOTRUN -> [SKIP][7] ([fdo#109271]) +41 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-kbl-guc/igt@kms_pipe_crc_ba...@read-crc-pipe-c.html

  * igt@runner@aborted:
- fi-ivb-3770:NOTRUN -> [FAIL][8] ([fdo#109271])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-ivb-3770/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][9] ([i915#3303]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html

  * igt@i915_selftest@live@perf:
- {fi-tgl-dsi}:   [DMESG-WARN][11] ([i915#2867]) -> [PASS][12] +9 
similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/fi-tgl-dsi/igt@i915_selftest@l...@perf.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/fi-tgl-dsi/igt@i915_selftest@l...@perf.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2867]: https://gitlab.freedesktop.org/drm/intel/issues/2867
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#4269]: https://gitlab.freedesktop.org/drm/intel/issues/4269
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (41 -> 37)
--

  Additional (1): fi-kbl-guc 
  Missing(5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-apl-guc fi-ctg-p8600 


Build changes
-

  * Linux: CI_DRM_10728 -> Patchwork_21327

  CI-20190529: 20190529
  CI_DRM_10728: 82a9f298afec66c882e710078138891826ce5e22 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6242: 721fd85ee95225ed5df322f7182bdfa9b86a3e68 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_21327: ba91b0757d4b185a92e03981ca99df05ca7cea22 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ba91b0757d4b drm/i915: split out vlv sideband to a separate file

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/index.html


[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: Use dma_resv_iter for waiting in 
i915_gem_object_wait_reservation.
URL   : https://patchwork.freedesktop.org/series/95765/
State : failure

== Summary ==

CALLscripts/checksyscalls.sh
  CALLscripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK include/generated/compile.h
make[4]: *** No rule to make target 'drivers/gpu/drm/i915/dma_resv_utils.o', 
needed by 'drivers/gpu/drm/i915/i915.o'.  Stop.
scripts/Makefile.build:540: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:540: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:540: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1868: recipe for target 'drivers' failed
make: *** [drivers] Error 2




Re: [Intel-gfx] [PATCH 0/1] drm/i915: vlv sideband

2021-10-13 Thread Ville Syrjälä
On Wed, Oct 13, 2021 at 01:47:09PM +0300, Jani Nikula wrote:
> On Wed, 13 Oct 2021, Ville Syrjälä  wrote:
> > On Wed, Oct 13, 2021 at 01:11:58PM +0300, Jani Nikula wrote:
> >> Three main ideas here:
> >> 
> >> - vlv sideband only has the name "sideband" in common with the rest of
> >>   intel_sideband.[ch]
> >
> > I wouldn't put it like that. There are two actual sideband 
> > implementations in that file:
> > - vlv/chv iosf sideband (vlv_sideband)
> > - lpt/wpt iosf sideband (intel_sbi)
> >
> > And the third thing in that file is the snb+ pcode mailbox stuff,
> > which has nothing to do with sideband.
> 
> Fair enough... but no opposition to the splitting out of vlv/chv iosf
> sideband? vlv_sideband.[ch] like here? I'm fine with renaming too.
> 
> I can follow up with lpt/wpt iosf split out (intel_sbi.[ch]?) and snb+
> pcode (intel_pcode.[ch]?).

Yeah, I guess just full split is the cleanest. Those names seem OK
to me. Or I suppose we could rename the intel_sbi stuff to lpt_sbi
or something? Might not be worth the hassle. Adding a small comment
to intel_sbi.c to document what it's for should be a sufficient reminder.

> I think we've just put all of them together way back when this was all
> probably bundled in i915_drv.c or something...

Yeah. I think the common thread was that you need to go through
a mailbox, but the file name didn't really reflect that.

-- 
Ville Syrjälä
Intel


[Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Maarten Lankhorst
No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune, it's questionable.

This will result in the following lockdep splat.

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
 *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: 
i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [83.540038]  validate_chain+0xb37/0x1e70
<4> [83.540042]  __lock_acquire+0x5a1/0xb70
<4> [83.540046]  lock_acquire+0
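
For illustration, a minimal sketch of the dma_resv_iter pattern the commit
message describes, assuming the contemporary iterator API;
i915_gem_object_wait_fence() stands in for the driver's existing helper,
whose body is not shown in this message:

	struct dma_resv_iter cursor;
	struct dma_fence *fence;
	long ret = timeout;

	/* Walk all fences without allocating a fence array. */
	dma_resv_iter_begin(&cursor, obj->base.resv, true);
	dma_resv_for_each_fence_unlocked(&cursor, fence) {
		ret = i915_gem_object_wait_fence(fence, flags, ret);
		if (ret <= 0)
			break;	/* timeout or error */
	}
	dma_resv_iter_end(&cursor);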

Re: [Intel-gfx] [PATCH] drm/i915: Prefer struct_size over open coded arithmetic

2021-10-13 Thread Jani Nikula
On Mon, 11 Oct 2021, Len Baker  wrote:
> Hi,
>
> On Sun, Oct 03, 2021 at 12:42:58PM +0200, Len Baker wrote:
>> As noted in the "Deprecated Interfaces, Language Features, Attributes,
>> and Conventions" documentation [1], size calculations (especially
>> multiplication) should not be performed in memory allocator (or similar)
>> function arguments due to the risk of them overflowing. This could lead
>> to values wrapping around and a smaller allocation being made than the
>> caller was expecting. Using those allocations could lead to linear
>> overflows of heap memory and other misbehaviors.
>>
>> In this case these are not actually dynamic sizes: all the operands
>> involved in the calculation are constant values. However it is better to
>> refactor them anyway, just to keep the open-coded math idiom out of
>> code.
>>
>> So, add at the end of the struct i915_syncmap a union with two flexible
>> array members (these arrays share the same memory layout). This is
>> possible using the new DECLARE_FLEX_ARRAY macro. And then, use the
>> struct_size() helper to do the arithmetic instead of the argument
>> "size + count * size" in the kmalloc and kzalloc() functions.
>>
>> Also, take the opportunity to refactor the __sync_seqno and __sync_child
>> making them more readable.
>>
>> This code was detected with the help of Coccinelle and audited and fixed
>> manually.
>>
>> [1] 
>> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
>>
>> Signed-off-by: Len Baker 
>> ---
>>  drivers/gpu/drm/i915/i915_syncmap.c | 12 
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> I received a mail telling that this patch doesn't build:
>
> == Series Details ==
>
> Series: drm/i915: Prefer struct_size over open coded arithmetic
> URL   : https://patchwork.freedesktop.org/series/95408/
> State : failure
>
> But it builds without error against linux-next (tag next-20211001). Against
> which tree and branch do I need to build?

drm-tip [1]. It's a sort of linux-next for graphics. I think there are
still some branches that don't feed to linux-next.

BR,
Jani.


[1] https://cgit.freedesktop.org/drm/drm-tip


>
> Regards,
> Len

-- 
Jani Nikula, Intel Open Source Graphics Center
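
For illustration, a minimal sketch of the struct_size() plus
DECLARE_FLEX_ARRAY() pattern the quoted commit message describes. The
struct below is a simplified stand-in, not the actual layout of
struct i915_syncmap:

	struct syncmap_sketch {
		unsigned int height;
		unsigned int bitmap;
		/* Two views of the same trailing storage. */
		union {
			DECLARE_FLEX_ARRAY(u32, seqno);
			DECLARE_FLEX_ARRAY(struct syncmap_sketch *, child);
		};
	};

	struct syncmap_sketch *p;

	/* The open-coded "sizeof(*p) + KSYNCMAP * sizeof(u32)" becomes
	 * an overflow-checked calculation: */
	p = kmalloc(struct_size(p, seqno, KSYNCMAP), GFP_KERNEL);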


Re: [Intel-gfx] [PATCH 1/1] RFC : drm/i915: Adding new sysfs frequency attributes

2021-10-13 Thread Jani Nikula
On Fri, 08 Oct 2021, Sujaritha Sundaresan  
wrote:
> This patch adds the following new sysfs frequency attributes;

Why?

Sysfs is uapi. What's the userspace consumer for these?

More comments inline.

>   - punit_req_freq_mhz
>   - throttle_reason_status
>   - throttle_reason_pl1
>   - throttle_reason_pl2
>   - throttle_reason_pl4
>   - throttle_reason_thermal
>   - throttle_reason_prochot
>   - throttle_reason_ratl
>   - throttle_reason_vr_thermalert
>   - throttle_reason_vr_tdc
>
> Signed-off-by: Sujaritha Sundaresan 
> Cc: Dale B Stimson 
> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c |  83 +
>  drivers/gpu/drm/i915/gt/intel_rps.h |  10 +++
>  drivers/gpu/drm/i915/i915_reg.h |  11 +++
>  drivers/gpu/drm/i915/i915_sysfs.c   | 135 
>  4 files changed, 239 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index 172de6c9f949..c03d99f2608c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -2153,6 +2153,89 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
>   return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
>  }
>  
> +static u32 __rps_read_mmio(struct intel_gt *gt, i915_reg_t reg32)
> +{
> + intel_wakeref_t wakeref;
> + u32 val;
> +
> + with_intel_runtime_pm(gt->uncore->rpm, wakeref)
> + val = intel_uncore_read(gt->uncore, reg32);
> +
> + return val;
> +}
> +
> +u32 intel_rps_read_throttle_reason_status(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 status = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> GT0_PERF_LIMIT_REASONS_MASK;
> +
> + return status;
> +}
> +
> +u32 intel_rps_read_throttle_reason_pl1(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 pl1 = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> POWER_LIMIT_1_MASK;
> +
> + return pl1;
> +}
> +
> +u32 intel_rps_read_throttle_reason_pl2(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 pl2 = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> POWER_LIMIT_2_MASK;
> +
> + return pl2;
> +}
> +
> +u32 intel_rps_read_throttle_reason_pl4(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 pl4 = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> POWER_LIMIT_4_MASK;
> +
> + return pl4;
> +}
> +
> +u32 intel_rps_read_throttle_reason_thermal(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 thermal = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> THERMAL_LIMIT_MASK;
> +
> + return thermal;
> +}
> +
> +u32 intel_rps_read_throttle_reason_prochot(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 prochot = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> PROCHOT_MASK;
> +
> + return prochot;
> +}
> +
> +u32 intel_rps_read_throttle_reason_ratl(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 ratl = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & RATL_MASK;
> +
> + return ratl;
> +}
> +
> +u32 intel_rps_read_throttle_reason_vr_thermalert(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 thermalert = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & 
> VR_THERMALERT_MASK;
> +
> + return thermalert;
> +}
> +
> +u32 intel_rps_read_throttle_reason_vr_tdc(struct intel_rps *rps)
> +{
> + struct intel_gt *gt = rps_to_gt(rps);
> + u32 tdc = __rps_read_mmio(gt, GT0_PERF_LIMIT_REASONS) & VR_TDC_MASK;
> +
> + return tdc;
> +}
> +
>  /* External interface for intel_ips.ko */
>  
>  static struct drm_i915_private __rcu *ips_mchdev;
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
> b/drivers/gpu/drm/i915/gt/intel_rps.h
> index 11960d64ca82..d6ac97f1facd 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.h
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.h
> @@ -42,6 +42,16 @@ u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
>  u32 intel_rps_read_punit_req(struct intel_rps *rps);
>  u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
>  u32 intel_rps_read_state_cap(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_status(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_pl1(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_pl2(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_pl4(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_thermal(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_prochot(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_ratl(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_vr_thermalert(struct intel_rps *rps);
> +u32 intel_rps_read_throttle_reason_vr_tdc(struct intel_rps *rps);
>  
>  void gen5_rps_irq_handler(struct intel_r
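
As an aside, the helpers above all read GT0_PERF_LIMIT_REASONS and apply a
mask. Purely as an illustration (the helper name below is hypothetical, not
part of the posted patch), they could be collapsed into one parameterized
reader:

	static u32 rps_read_limit_reasons(struct intel_rps *rps, u32 mask)
	{
		/* One register read, masked per throttle reason. */
		return __rps_read_mmio(rps_to_gt(rps),
				       GT0_PERF_LIMIT_REASONS) & mask;
	}

	/* e.g. intel_rps_read_throttle_reason_pl1(rps) is then just
	 * rps_read_limit_reasons(rps, POWER_LIMIT_1_MASK). */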

Re: [Intel-gfx] [PATCH] drm/i915: Prefer struct_size over open coded arithmetic

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:24:05PM +0300, Jani Nikula wrote:
> On Mon, 11 Oct 2021, Len Baker  wrote:
> > Hi,
> >
> > On Sun, Oct 03, 2021 at 12:42:58PM +0200, Len Baker wrote:
> >> As noted in the "Deprecated Interfaces, Language Features, Attributes,
> >> and Conventions" documentation [1], size calculations (especially
> >> multiplication) should not be performed in memory allocator (or similar)
> >> function arguments due to the risk of them overflowing. This could lead
> >> to values wrapping around and a smaller allocation being made than the
> >> caller was expecting. Using those allocations could lead to linear
> >> overflows of heap memory and other misbehaviors.
> >>
> >> In this case these are not actually dynamic sizes: all the operands
> >> involved in the calculation are constant values. However it is better to
> >> refactor them anyway, just to keep the open-coded math idiom out of
> >> code.
> >>
> >> So, add at the end of the struct i915_syncmap a union with two flexible
> >> array members (these arrays share the same memory layout). This is
> >> possible using the new DECLARE_FLEX_ARRAY macro. And then, use the
> >> struct_size() helper to do the arithmetic instead of the argument
> >> "size + count * size" in the kmalloc and kzalloc() functions.
> >>
> >> Also, take the opportunity to refactor the __sync_seqno and __sync_child
> >> making them more readable.
> >>
> >> This code was detected with the help of Coccinelle and audited and fixed
> >> manually.
> >>
> >> [1] 
> >> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
> >>
> >> Signed-off-by: Len Baker 
> >> ---
> >>  drivers/gpu/drm/i915/i915_syncmap.c | 12 
> >>  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > I received a mail telling that this patch doesn't build:
> >
> > == Series Details ==
> >
> > Series: drm/i915: Prefer struct_size over open coded arithmetic
> > URL   : https://patchwork.freedesktop.org/series/95408/
> > State : failure
> >
> > But it builds without error against linux-next (tag next-20211001). Against
> > which tree and branch do I need to build?
> 
> drm-tip [1]. It's a sort of linux-next for graphics. I think there are
> still some branches that don't feed to linux-next.

Yeah we need to get gt-next in linux-next asap. Joonas promised to send
out his patch to make that happen in dim.
-Daniel

> 
> BR,
> Jani.
> 
> 
> [1] https://cgit.freedesktop.org/drm/drm-tip
> 
> 
> >
> > Regards,
> > Len
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/1] drm/i915: split out vlv sideband to a separate file

2021-10-13 Thread Hans de Goede
Hi,

On 10/13/21 12:11 PM, Jani Nikula wrote:
> The VLV/CHV sideband code is pretty distinct from the rest of the
> sideband code. Split it out to new vlv_sideband.[ch].
> 
> Pure code movement with relevant #include changes, and a tiny checkpatch
> fix on top.
> 
> Cc: Lucas De Marchi 
> Cc: Ville Syrjälä 
> Signed-off-by: Jani Nikula 

Thanks, patch looks good to me:

Reviewed-by: Hans de Goede 

Feel free to keep the Reviewed-by if you do a new version with
the improved commit msg suggested by Ville.

Regards,

Hans

> ---
>  drivers/gpu/drm/i915/Makefile |   1 +
>  drivers/gpu/drm/i915/display/g4x_dp.c |   2 +-
>  drivers/gpu/drm/i915/display/g4x_hdmi.c   |   2 +-
>  drivers/gpu/drm/i915/display/intel_cdclk.c|   1 +
>  drivers/gpu/drm/i915/display/intel_display.c  |   1 +
>  .../drm/i915/display/intel_display_debugfs.c  |   1 -
>  .../drm/i915/display/intel_display_power.c|   4 +-
>  drivers/gpu/drm/i915/display/intel_dp.c   |   1 -
>  drivers/gpu/drm/i915/display/intel_dpio_phy.c |   5 +-
>  drivers/gpu/drm/i915/display/intel_dpll.c |   2 +-
>  drivers/gpu/drm/i915/display/intel_dsi_vbt.c  |   2 +-
>  drivers/gpu/drm/i915/display/vlv_dsi.c|   2 +-
>  drivers/gpu/drm/i915/display/vlv_dsi_pll.c|   2 +-
>  drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c |   1 +
>  drivers/gpu/drm/i915/gt/intel_rps.c   |   1 +
>  drivers/gpu/drm/i915/i915_debugfs.c   |   1 -
>  drivers/gpu/drm/i915/i915_sysfs.c |   1 -
>  drivers/gpu/drm/i915/intel_pm.c   |   1 +
>  drivers/gpu/drm/i915/intel_sideband.c | 257 -
>  drivers/gpu/drm/i915/intel_sideband.h | 110 
>  drivers/gpu/drm/i915/vlv_sideband.c   | 266 ++
>  drivers/gpu/drm/i915/vlv_sideband.h   | 123 
>  22 files changed, 405 insertions(+), 382 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/vlv_sideband.c
>  create mode 100644 drivers/gpu/drm/i915/vlv_sideband.h
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 21b05ed0e4e8..d50d2b144fc6 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -54,6 +54,7 @@ i915-y += i915_drv.o \
> intel_step.o \
> intel_uncore.o \
> intel_wakeref.o \
> +   vlv_sideband.o \
> vlv_suspend.o
>  
>  # core library code
> diff --git a/drivers/gpu/drm/i915/display/g4x_dp.c 
> b/drivers/gpu/drm/i915/display/g4x_dp.c
> index 85a09c3e09e8..dc41868d01ef 100644
> --- a/drivers/gpu/drm/i915/display/g4x_dp.c
> +++ b/drivers/gpu/drm/i915/display/g4x_dp.c
> @@ -18,7 +18,7 @@
>  #include "intel_hdmi.h"
>  #include "intel_hotplug.h"
>  #include "intel_pps.h"
> -#include "intel_sideband.h"
> +#include "vlv_sideband.h"
>  
>  struct dp_link_dpll {
>   int clock;
> diff --git a/drivers/gpu/drm/i915/display/g4x_hdmi.c 
> b/drivers/gpu/drm/i915/display/g4x_hdmi.c
> index be352e9f0afc..88c427f3c346 100644
> --- a/drivers/gpu/drm/i915/display/g4x_hdmi.c
> +++ b/drivers/gpu/drm/i915/display/g4x_hdmi.c
> @@ -14,8 +14,8 @@
>  #include "intel_fifo_underrun.h"
>  #include "intel_hdmi.h"
>  #include "intel_hotplug.h"
> -#include "intel_sideband.h"
>  #include "intel_sdvo.h"
> +#include "vlv_sideband.h"
>  
>  static void intel_hdmi_prepare(struct intel_encoder *encoder,
>  const struct intel_crtc_state *crtc_state)
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c 
> b/drivers/gpu/drm/i915/display/intel_cdclk.c
> index ecb28e8f1eb6..44bb18773509 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> @@ -30,6 +30,7 @@
>  #include "intel_display_types.h"
>  #include "intel_psr.h"
>  #include "intel_sideband.h"
> +#include "vlv_sideband.h"
>  
>  /**
>   * DOC: CDCLK / RAWCLK
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index 9cf987ee143d..3602fdb2a549 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -109,6 +109,7 @@
>  #include "i9xx_plane.h"
>  #include "skl_scaler.h"
>  #include "skl_universal_plane.h"
> +#include "vlv_sideband.h"
>  
>  static void i9xx_crtc_clock_get(struct intel_crtc *crtc,
>   struct intel_crtc_state *pipe_config);
> diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c 
> b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
> index bc5113589f0a..e04767695530 100644
> --- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
> +++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
> @@ -20,7 +20,6 @@
>  #include "intel_hdmi.h"
>  #include "intel_pm.h"
>  #include "intel_psr.h"
> -#include "intel_sideband.h"
>  #include "intel_sprite.h"
>  
>  static inline struct drm_i915_private *node_to_i915(struct drm_info_node 
> *node)
> diff --git a/drivers/gpu/drm/i915/display/intel_display_power

Re: [Intel-gfx] mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)

2021-10-13 Thread Arnd Bergmann
On Wed, Oct 13, 2021 at 12:54 PM Arnd Bergmann  wrote:
> On Thu, Oct 7, 2021 at 11:51 AM Geert Uytterhoeven  
> wrote:
>
> -msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
> -msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
> -msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
> +msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o \
> + disp/mdp4/mdp4_lvds_pll.o \
> + hdmi/hdmi_pll_8960.o \
> + hdmi/hdmi_phy_8996.o
>
>  msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o

I fixed my local copy now after noticing that these should not go
under CONFIG_DRM_FBDEV_EMULATION but under the top-level option:

@@ -23,8 +23,10 @@ msm-y := \
hdmi/hdmi_i2c.o \
hdmi/hdmi_phy.o \
hdmi/hdmi_phy_8960.o \
+   hdmi/hdmi_phy_8996.o \
hdmi/hdmi_phy_8x60.o \
hdmi/hdmi_phy_8x74.o \
+   hdmi/hdmi_pll_8960.o \
edp/edp.o \
edp/edp_aux.o \
edp/edp_bridge.o \
@@ -37,6 +39,7 @@ msm-y := \
disp/mdp4/mdp4_dtv_encoder.o \
disp/mdp4/mdp4_lcdc_encoder.o \
disp/mdp4/mdp4_lvds_connector.o \
+   disp/mdp4/mdp4_lvds_pll.o \
disp/mdp4/mdp4_irq.o \
disp/mdp4/mdp4_kms.o \
disp/mdp4/mdp4_plane.o \

   Arnd


Re: [Intel-gfx] mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)

2021-10-13 Thread Arnd Bergmann
On Thu, Oct 7, 2021 at 11:51 AM Geert Uytterhoeven  wrote:
> On Wed, Oct 6, 2021 at 9:28 AM Christian König  
> wrote:
> > Am 06.10.21 um 09:20 schrieb Stephen Rothwell:
> > > On Tue, 5 Oct 2021 22:48:03 -0700 Randy Dunlap  
> > > wrote:
> > >> on i386:
> > >>
> > >> ld: drivers/gpu/drm/msm/hdmi/hdmi_phy.o:(.rodata+0x3f0): undefined 
> > >> reference to `msm_hdmi_phy_8996_cfg'

I ran into the same thing now as well.
> > depends on (ARCH_QCOM || SOC_IMX5 || COMPILE_TEST) && COMMON_CLK
>
> I'd make that:
>
> -depends on DRM
> +   depends on COMMON_CLK && DRM && IOMMU_SUPPORT
> depends on ARCH_QCOM || SOC_IMX5 || COMPILE_TEST
> -depends on IOMMU_SUPPORT
> -   depends on (OF && COMMON_CLK) || COMPILE_TEST
> +   depends on OF || COMPILE_TEST
>
> to keep a better separation between hard and soft dependencies.
>
> Note that the "depends on OF || COMPILE_TEST" can even be
> deleted, as the dependency on ARCH_QCOM || SOC_IMX5 implies OF.

Looks good to me, I would also drop that last line in this case, and maybe
add this change as building without COMMON_CLK is no longer possible:

diff --git a/drivers/gpu/drm/msm/Makefile b/drivers/gpu/drm/msm/Makefile
index 904535eda0c4..a5d87e03812f 100644
--- a/drivers/gpu/drm/msm/Makefile
+++ b/drivers/gpu/drm/msm/Makefile
@@ -116,10 +116,10 @@ msm-$(CONFIG_DRM_MSM_DP)+= dp/dp_aux.o \
  dp/dp_power.o \
  dp/dp_audio.o

-msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o
-msm-$(CONFIG_COMMON_CLK) += disp/mdp4/mdp4_lvds_pll.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_pll_8960.o
-msm-$(CONFIG_COMMON_CLK) += hdmi/hdmi_phy_8996.o
+msm-$(CONFIG_DRM_FBDEV_EMULATION) += msm_fbdev.o \
+ disp/mdp4/mdp4_lvds_pll.o \
+ hdmi/hdmi_pll_8960.o \
+ hdmi/hdmi_phy_8996.o

 msm-$(CONFIG_DRM_MSM_HDMI_HDCP) += hdmi/hdmi_hdcp.o

Has anyone submitted a patch already, or should I send the version
that I am using locally now?

Arnd


Re: [Intel-gfx] [PATCH 1/4] drm: Introduce drm_modeset_lock_ctx_retry()

2021-10-13 Thread Daniel Vetter
On Mon, Oct 04, 2021 at 02:15:51PM +0300, Ville Syrjälä wrote:
> On Tue, Jul 20, 2021 at 03:44:49PM +0200, Daniel Vetter wrote:
> > On Thu, Jul 15, 2021 at 09:49:51PM +0300, Ville Syrjala wrote:
> > > From: Ville Syrjälä 
> > > 
> > > Quite a few places are hand rolling the modeset lock backoff dance.
> > > Let's suck that into a helper macro that is easier to use without
> > > forgetting some steps.
> > > 
> > > The main downside is probably that the implementation of
> > > drm_with_modeset_lock_ctx() is a bit harder to read than a hand
> > > rolled version on account of being split across three functions,
> > > but the actual code using it ends up being much simpler.
> > > 
> > > Cc: Sean Paul 
> > > Cc: Daniel Vetter 
> > > Signed-off-by: Ville Syrjälä 
> > > ---
> > >  drivers/gpu/drm/drm_modeset_lock.c | 44 ++
> > >  include/drm/drm_modeset_lock.h | 20 ++
> > >  2 files changed, 64 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> > > b/drivers/gpu/drm/drm_modeset_lock.c
> > > index fcfe1a03c4a1..083df96632e8 100644
> > > --- a/drivers/gpu/drm/drm_modeset_lock.c
> > > +++ b/drivers/gpu/drm/drm_modeset_lock.c
> > > @@ -425,3 +425,47 @@ int drm_modeset_lock_all_ctx(struct drm_device *dev,
> > >   return 0;
> > >  }
> > >  EXPORT_SYMBOL(drm_modeset_lock_all_ctx);
> > > +
> > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > +  struct drm_atomic_state *state,
> > > +  unsigned int flags, int *ret)
> > > +{
> > > + drm_modeset_acquire_init(ctx, flags);
> > > +
> > > + if (state)
> > > + state->acquire_ctx = ctx;
> > > +
> > > + *ret = -EDEADLK;
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_begin);
> > > +
> > > +bool _drm_modeset_lock_loop(int *ret)
> > > +{
> > > + if (*ret == -EDEADLK) {
> > > + *ret = 0;
> > > + return true;
> > > + }
> > > +
> > > + return false;
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_loop);
> > > +
> > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > +struct drm_atomic_state *state,
> > > +int *ret)
> > > +{
> > > + if (*ret == -EDEADLK) {
> > > + if (state)
> > > + drm_atomic_state_clear(state);
> > > +
> > > + *ret = drm_modeset_backoff(ctx);
> > > + if (*ret == 0) {
> > > + *ret = -EDEADLK;
> > > + return;
> > > + }
> > > + }
> > > +
> > > + drm_modeset_drop_locks(ctx);
> > > + drm_modeset_acquire_fini(ctx);
> > > +}
> > > +EXPORT_SYMBOL(_drm_modeset_lock_end);
> > > diff --git a/include/drm/drm_modeset_lock.h 
> > > b/include/drm/drm_modeset_lock.h
> > > index aafd07388eb7..5eaad2533de5 100644
> > > --- a/include/drm/drm_modeset_lock.h
> > > +++ b/include/drm/drm_modeset_lock.h
> > > @@ -26,6 +26,7 @@
> > >  
> > >  #include 
> > >  
> > > +struct drm_atomic_state;
> > >  struct drm_modeset_lock;
> > >  
> > >  /**
> > > @@ -203,4 +204,23 @@ modeset_lock_fail:   
> > > \
> > >   if (!drm_drv_uses_atomic_modeset(dev))  \
> > >   mutex_unlock(&dev->mode_config.mutex);
> > >  
> > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > +  struct drm_atomic_state *state,
> > > +  unsigned int flags,
> > > +  int *ret);
> > > +bool _drm_modeset_lock_loop(int *ret);
> > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > +struct drm_atomic_state *state,
> > > +int *ret);
> > > +
> > > +/*
> > > + * Note that one must always use "continue" rather than
> > > + * "break" or "return" to handle errors within the
> > > + * drm_modeset_lock_ctx_retry() block.
> > 
> > I'm not sold on loop macros with these kind of restrictions, C just isn't
> > a great language for these. That's why e.g. drm_connector_iter doesn't
> > give you a macro, but only the begin/next/end function calls explicitly.
> 
> We already use this pattern extensively in i915. Gem ww ctx has one,
> power domains/pps/etc. use similar things. It makes the code pretty nice,
> with the slight caveat that an accidental 'break' can ruin your day. But
> so can an accidental return with other constructs (and we even had that
> happen a few times with the connector iterators), so not a dealbreaker
> IMO.
> 
> So if we don't want this drm wide I guess I can propose this just for
> i915 since it fits in perfectly there.

Well I don't like them for i915 either.

And yes C is dangerous, but also C is verbose. I think one lesson from igt
is that too many magic block constructs are bad, it's just not how C
works. Definitely not in the kernel, where "oops I got it wrong because it
was too clever" is bad.

> > Yes the macro we have is also not nice, but at least it's a screaming
> > macro since it's all uppercase, so
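
For context, a hypothetical usage sketch of the proposed helper. The macro
body is not part of the quoted hunks, so this assumes it expands to the
_drm_modeset_lock_begin()/_loop()/_end() calls above:

	struct drm_modeset_acquire_ctx ctx;
	int ret;

	drm_modeset_lock_ctx_retry(&ctx, state, 0, ret) {
		ret = drm_modeset_lock(&crtc->mutex, &ctx);
		if (ret)
			continue;	/* "continue", never "break"/"return" */
		/* ... do the locked work ... */
	}
	/* ret now holds 0 or the final error code */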

Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute

2021-10-13 Thread Daniel Vetter
On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:
> On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
> > i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
> > priority management") introduced some scheduling related vfuncs which
> > take integer request priority as argument.
> > 
> > Make them instead take struct i915_sched_attr, which is the type
> > encapsulating this information, so it probably aligns with the design
> > better. It definitely enables extending the set of scheduling attributes.
> > 
> 
> Understand the motivation here but the i915_scheduler is going to
> disappear when we move to the DRM scheduler or at least its functionality
> of priority inheritance will be pushed into the DRM scheduler. I'd be
> very careful making any changes here as the priority in the DRM
> scheduler is defined as single enum:

Yeah I'm not sure it makes sense to build this and make the conversion to
drm/sched even harder. We've already merged a lot of code with a "we'll
totally convert to drm/sched right after" promise, there's not really room
for more fun like this built on top of i915-scheduler.
-Daniel

> 
> /* These are often used as an (initial) index
>  * to an array, and as such should start at 0.
>  */
> enum drm_sched_priority {
> DRM_SCHED_PRIORITY_MIN,
> DRM_SCHED_PRIORITY_NORMAL,
> DRM_SCHED_PRIORITY_HIGH,
> DRM_SCHED_PRIORITY_KERNEL,
> 
> DRM_SCHED_PRIORITY_COUNT,
> DRM_SCHED_PRIORITY_UNSET = -2
> };
> 
> Adding a field to the i915_sched_attr is fairly easy as we already have
> a structure but changing the DRM scheduler might be a tougher sell.
> Anyway, can you make this work without adding the 'nice' field to
> i915_sched_attr? Might be worth exploring so when we move to the DRM
> scheduler this feature drops in a little cleaner.
> 
> Matt
> 
> > Signed-off-by: Tvrtko Ursulin 
> > Cc: Matthew Brost 
> > Cc: Daniele Ceraolo Spurio 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 3 ++-
> >  drivers/gpu/drm/i915/i915_scheduler.c| 4 ++--
> >  drivers/gpu/drm/i915/i915_scheduler_types.h  | 4 ++--
> >  4 files changed, 9 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index 7147fe80919e..e91d803a6453 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs 
> > *engine)
> > return engine->class != RENDER_CLASS;
> >  }
> >  
> > -static void kick_execlists(const struct i915_request *rq, int prio)
> > +static void kick_execlists(const struct i915_request *rq,
> > +  const struct i915_sched_attr *attr)
> >  {
> > struct intel_engine_cs *engine = rq->engine;
> > struct i915_sched_engine *sched_engine = engine->sched_engine;
> > const struct i915_request *inflight;
> > +   const int prio = attr->priority;
> >  
> > /*
> >  * We only need to kick the tasklet once for the high priority
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index ba0de35f6323..b5883a4365ca 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct 
> > intel_engine_cs *engine)
> >  }
> >  
> >  static void guc_bump_inflight_request_prio(struct i915_request *rq,
> > -  int prio)
> > +  const struct i915_sched_attr *attr)
> >  {
> > struct intel_context *ce = rq->context;
> > +   const int prio = attr->priority;
> > u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
> >  
> > /* Short circuit function */
> > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
> > b/drivers/gpu/drm/i915/i915_scheduler.c
> > index 762127dd56c5..534bab99fcdc 100644
> > --- a/drivers/gpu/drm/i915/i915_scheduler.c
> > +++ b/drivers/gpu/drm/i915/i915_scheduler.c
> > @@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node 
> > *node,
> >  
> > /* Must be called before changing the nodes priority */
> > if (sched_engine->bump_inflight_request_prio)
> > -   sched_engine->bump_inflight_request_prio(from, prio);
> > +   sched_engine->bump_inflight_request_prio(from, attr);
> >  
> > WRITE_ONCE(node->attr.priority, prio);
> >  
> > @@ -280,7 +280,7 @@ static void __i915_schedule(struct i915_sched_node 
> > *node,
> >  
> > /* Defer (tasklet) submission until after all of

Re: [Intel-gfx] [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> The warning popped up; it says to increase it by the number of occurrences.
> I saw it 18 times so here it is.
> It started to show up since commit
>2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> property")
> 
> Increase DRM_OBJECT_MAX_PROPERTY by 18.
> 
> Signed-off-by: Sebastian Andrzej Siewior 

Which driver where? Whoever added that into upstream should also have
realized this (things will just not work) and included it in there. So if
things are tested correctly this should be part of a larger series to add
these 18 props somewhere.

Also maybe we should just dynamically allocate this array if people have
this many properties on their objects.
-Daniel

> ---
> 
> I have no idea whether this is correct or just a symptom of another
> problem. This has been observed with i915 and full debug.
> 
>  include/drm/drm_mode_object.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/drm/drm_mode_object.h b/include/drm/drm_mode_object.h
> index c34a3e8030e12..1e5399e47c3a5 100644
> --- a/include/drm/drm_mode_object.h
> +++ b/include/drm/drm_mode_object.h
> @@ -60,7 +60,7 @@ struct drm_mode_object {
>   void (*free_cb)(struct kref *kref);
>  };
>  
> -#define DRM_OBJECT_MAX_PROPERTY 24
> +#define DRM_OBJECT_MAX_PROPERTY 42
>  /**
>   * struct drm_object_properties - property tracking for &drm_mode_object
>   */
> -- 
> 2.33.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
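
A rough sketch of the dynamic allocation Daniel suggests; the "capacity"
field and the helper below are hypothetical, and this assumes the fixed
arrays in struct drm_object_properties become pointers:

	static int drm_object_grow_properties(struct drm_object_properties *p,
					      unsigned int want)
	{
		struct drm_property **props;
		uint64_t *values;

		if (want <= p->capacity)
			return 0;

		/* Grow both parallel arrays on demand instead of relying
		 * on a fixed DRM_OBJECT_MAX_PROPERTY. */
		props = krealloc_array(p->properties, want,
				       sizeof(*props), GFP_KERNEL);
		if (!props)
			return -ENOMEM;
		p->properties = props;

		values = krealloc_array(p->values, want,
					sizeof(*values), GFP_KERNEL);
		if (!values)
			return -ENOMEM;
		p->values = values;

		p->capacity = want;
		return 0;
	}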


Re: [Intel-gfx] [PATCH] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:
> Hi, Tvrtko,
> 
> On 10/5/21 13:31, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin 
> > 
> > In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
> > when rendering is done on Intel dgfx and scanout/composition on Intel
> > igfx.
> > 
> > Before this patch the driver was not quite ready for that setup, mainly
> > because it was able to emit a semaphore wait between the two GPUs, which
> > results in deadlocks because semaphore target location in HWSP is neither
> > shared between the two, nor mapped in both GGTT spaces.
> > 
> > To fix it the patch adds an additional check to a couple of relevant code
> > paths in order to prevent using semaphores for inter-engine
> > synchronisation when relevant objects are not in the same GGTT space.
> > 
> > v2:
> >   * Avoid adding rq->i915. (Chris)
> > 
> > v3:
> >   * Use GGTT which describes the limit more precisely.
> > 
> > Signed-off-by: Tvrtko Ursulin 
> > Cc: Daniel Vetter 
> > Cc: Matthew Auld 
> > Cc: Thomas Hellström 
> 
> An IMO pretty important bugfix. I read up a bit on the previous discussion
> on this, and from what I understand the other two options were
> 
> 1) Ripping out the semaphore code,
> 2) Consider dma-fences from other instances of the same driver as foreign.
> 
> For imported dma-bufs we do 2), but particularly with lmem and p2p that's a
> more straightforward decision.
> 
> I don't think 1) is a reasonable approach to fix this bug (but perhaps as a
> general cleanup?), and for 2) yes I guess we might end up doing that, unless
> we find some real benefits in treating same-driver-separate-device
> dma-fences as local, but for this particular bug, IMO this is a reasonable
> fix.

The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it's really just not. So we still need to do
this properly.

> Reviewed-by: Thomas Hellström 

But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel

> 
> 
> 
> 
> 
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 12 +++-
> >   1 file changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c 
> > b/drivers/gpu/drm/i915/i915_request.c
> > index 79da5eca60af..4f189982f67e 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
> > return 0;
> >   }
> > +static bool
> > +can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
> > +{
> > +   return to->engine->gt->ggtt == from->engine->gt->ggtt;
> > +}
> > +
> >   static int
> >   emit_semaphore_wait(struct i915_request *to,
> > struct i915_request *from,
> > @@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
> > const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
> > struct i915_sw_fence *wait = &to->submit;
> > +   if (!can_use_semaphore_wait(to, from))
> > +   goto await_fence;
> > +
> > if (!intel_context_use_semaphores(to->context))
> > goto await_fence;
> > @@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request 
> > *to,
> >  * immediate execution, and so we must wait until it reaches the
> >  * active slot.
> >  */
> > -   if (intel_engine_has_semaphores(to->engine) &&
> > +   if (can_use_semaphore_wait(to, from) &&
> > +   intel_engine_has_semaphores(to->engine) &&
> > !i915_request_has_initial_breadcrumb(to)) {
> > err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
> > if (err < 0)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/4] drm: Introduce drm_modeset_lock_ctx_retry()

2021-10-13 Thread Ville Syrjälä
On Wed, Oct 13, 2021 at 01:59:47PM +0200, Daniel Vetter wrote:
> On Mon, Oct 04, 2021 at 02:15:51PM +0300, Ville Syrjälä wrote:
> > On Tue, Jul 20, 2021 at 03:44:49PM +0200, Daniel Vetter wrote:
> > > On Thu, Jul 15, 2021 at 09:49:51PM +0300, Ville Syrjala wrote:
> > > > From: Ville Syrjälä 
> > > > 
> > > > Quite a few places are hand rolling the modeset lock backoff dance.
> > > > Let's suck that into a helper macro that is easier to use without
> > > > forgetting some steps.
> > > > 
> > > > The main downside is probably that the implementation of
> > > > drm_with_modeset_lock_ctx() is a bit harder to read than a hand
> > > > rolled version on account of being split across three functions,
> > > > but the actual code using it ends up being much simpler.
> > > > 
> > > > Cc: Sean Paul 
> > > > Cc: Daniel Vetter 
> > > > Signed-off-by: Ville Syrjälä 
> > > > ---
> > > >  drivers/gpu/drm/drm_modeset_lock.c | 44 ++
> > > >  include/drm/drm_modeset_lock.h | 20 ++
> > > >  2 files changed, 64 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> > > > b/drivers/gpu/drm/drm_modeset_lock.c
> > > > index fcfe1a03c4a1..083df96632e8 100644
> > > > --- a/drivers/gpu/drm/drm_modeset_lock.c
> > > > +++ b/drivers/gpu/drm/drm_modeset_lock.c
> > > > @@ -425,3 +425,47 @@ int drm_modeset_lock_all_ctx(struct drm_device 
> > > > *dev,
> > > > return 0;
> > > >  }
> > > >  EXPORT_SYMBOL(drm_modeset_lock_all_ctx);
> > > > +
> > > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > > +struct drm_atomic_state *state,
> > > > +unsigned int flags, int *ret)
> > > > +{
> > > > +   drm_modeset_acquire_init(ctx, flags);
> > > > +
> > > > +   if (state)
> > > > +   state->acquire_ctx = ctx;
> > > > +
> > > > +   *ret = -EDEADLK;
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_begin);
> > > > +
> > > > +bool _drm_modeset_lock_loop(int *ret)
> > > > +{
> > > > +   if (*ret == -EDEADLK) {
> > > > +   *ret = 0;
> > > > +   return true;
> > > > +   }
> > > > +
> > > > +   return false;
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_loop);
> > > > +
> > > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > > +  struct drm_atomic_state *state,
> > > > +  int *ret)
> > > > +{
> > > > +   if (*ret == -EDEADLK) {
> > > > +   if (state)
> > > > +   drm_atomic_state_clear(state);
> > > > +
> > > > +   *ret = drm_modeset_backoff(ctx);
> > > > +   if (*ret == 0) {
> > > > +   *ret = -EDEADLK;
> > > > +   return;
> > > > +   }
> > > > +   }
> > > > +
> > > > +   drm_modeset_drop_locks(ctx);
> > > > +   drm_modeset_acquire_fini(ctx);
> > > > +}
> > > > +EXPORT_SYMBOL(_drm_modeset_lock_end);
> > > > diff --git a/include/drm/drm_modeset_lock.h 
> > > > b/include/drm/drm_modeset_lock.h
> > > > index aafd07388eb7..5eaad2533de5 100644
> > > > --- a/include/drm/drm_modeset_lock.h
> > > > +++ b/include/drm/drm_modeset_lock.h
> > > > @@ -26,6 +26,7 @@
> > > >  
> > > >  #include 
> > > >  
> > > > +struct drm_atomic_state;
> > > >  struct drm_modeset_lock;
> > > >  
> > > >  /**
> > > > @@ -203,4 +204,23 @@ modeset_lock_fail: 
> > > > \
> > > > if (!drm_drv_uses_atomic_modeset(dev))  
> > > > \
> > > > mutex_unlock(&dev->mode_config.mutex);
> > > >  
> > > > +void _drm_modeset_lock_begin(struct drm_modeset_acquire_ctx *ctx,
> > > > +struct drm_atomic_state *state,
> > > > +unsigned int flags,
> > > > +int *ret);
> > > > +bool _drm_modeset_lock_loop(int *ret);
> > > > +void _drm_modeset_lock_end(struct drm_modeset_acquire_ctx *ctx,
> > > > +  struct drm_atomic_state *state,
> > > > +  int *ret);
> > > > +
> > > > +/*
> > > > + * Note that one must always use "continue" rather than
> > > > + * "break" or "return" to handle errors within the
> > > > + * drm_modeset_lock_ctx_retry() block.
> > > 
> > > I'm not sold on loop macros with these kind of restrictions, C just isn't
> > > a great language for these. That's why e.g. drm_connector_iter doesn't
> > > give you a macro, but only the begin/next/end function calls explicitly.
> > 
> > We already use this pattern extensively in i915. Gem ww ctx has one,
> > power domains/pps/etc. use similar things. It makes the code pretty nice,
> > with the slight caveat that an accidental 'break' can ruin your day. But
> > so can an accidental return with other constructs (and we even had that
> > happen a few times with the connector iterators), so not a dealbreaker
> > IMO.
> > 
> > 

Re: [Intel-gfx] [PATCH 03/11] drm/i915: Restructure probe to handle multi-tile platforms

2021-10-13 Thread Jani Nikula
On Fri, 08 Oct 2021, Matt Roper  wrote:
> On a multi-tile platform, each tile has its own registers + GGTT space,
> and BAR 0 is extended to cover all of them.  Upcoming patches will start
> exposing the tiles as multiple GTs within a single PCI device.  In
> preparation for supporting such setups, restructure the driver's probe
> code a bit.
>
> Only the primary/root tile is initialized for now; the other tiles will
> be detected and plugged in by future patches once the necessary
> infrastructure is in place to handle them.
>
> Original-author: Abdiel Janulgue
> Cc: Daniele Ceraolo Spurio 
> Cc: Matthew Auld 
> Cc: Joonas Lahtinen 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: Tvrtko Ursulin 
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/gt/intel_gt.c   | 45 
>  drivers/gpu/drm/i915/gt/intel_gt.h   |  3 ++
>  drivers/gpu/drm/i915/gt/intel_gt_pm.c|  9 -
>  drivers/gpu/drm/i915/gt/intel_gt_types.h |  5 +++
>  drivers/gpu/drm/i915/i915_drv.c  | 20 +--
>  drivers/gpu/drm/i915/intel_uncore.c  | 12 +++
>  drivers/gpu/drm/i915/intel_uncore.h  |  3 +-
>  7 files changed, 76 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
> b/drivers/gpu/drm/i915/gt/intel_gt.c
> index 1cb1948ac959..f4bea1f1de77 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.c
> @@ -900,6 +900,51 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, 
> i915_reg_t reg)
>   return intel_uncore_read_fw(gt->uncore, reg);
>  }
>  
> +static int
> +tile_setup(struct intel_gt *gt, unsigned int id, phys_addr_t phys_addr)
> +{
> + int ret;
> +
> + intel_uncore_init_early(gt->uncore, gt->i915);
> +
> + ret = intel_uncore_setup_mmio(gt->uncore, phys_addr);
> + if (ret)
> + return ret;
> +
> + gt->phys_addr = phys_addr;
> +
> + return 0;
> +}
> +
> +static void tile_cleanup(struct intel_gt *gt)
> +{
> + intel_uncore_cleanup_mmio(gt->uncore);
> +}
> +
> +int intel_probe_gts(struct drm_i915_private *i915)
> +{
> + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
> + phys_addr_t phys_addr;
> + unsigned int mmio_bar;
> + int ret;
> +
> + mmio_bar = GRAPHICS_VER(i915) == 2 ? 1 : 0;
> + phys_addr = pci_resource_start(pdev, mmio_bar);
> +
> + /* We always have at least one primary GT on any device */
> + ret = tile_setup(&i915->gt, 0, phys_addr);
> + if (ret)
> + return ret;
> +
> + /* TODO: add more tiles */
> + return 0;
> +}
> +
> +void intel_gts_release(struct drm_i915_private *i915)
> +{
> + tile_cleanup(&i915->gt);
> +}

Please call the functions intel_gt_*.

BR,
Jani.
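
(For illustration, the rename being asked for could look like the
following; these exact names are an assumption, not from the patch:)

	int intel_gt_probe_all(struct drm_i915_private *i915);
	void intel_gt_release_all(struct drm_i915_private *i915);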



> +
>  void intel_gt_info_print(const struct intel_gt_info *info,
>struct drm_printer *p)
>  {
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
> b/drivers/gpu/drm/i915/gt/intel_gt.h
> index 74e771871a9b..f4f35a70cbe4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt.h
> @@ -85,6 +85,9 @@ static inline bool intel_gt_needs_read_steering(struct 
> intel_gt *gt,
>  
>  u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
>  
> +int intel_probe_gts(struct drm_i915_private *i915);
> +void intel_gts_release(struct drm_i915_private *i915);
> +
>  void intel_gt_info_print(const struct intel_gt_info *info,
>struct drm_printer *p);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm.c 
> b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> index 524eaf678790..76f498edb0d5 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm.c
> @@ -126,7 +126,14 @@ static const struct intel_wakeref_ops wf_ops = {
>  
>  void intel_gt_pm_init_early(struct intel_gt *gt)
>  {
> - intel_wakeref_init(&gt->wakeref, gt->uncore->rpm, &wf_ops);
> + /*
> +  * We access the runtime_pm structure via gt->i915 here rather than
> +  * gt->uncore as we do elsewhere in the file because gt->uncore is not
> +  * yet initialized for all tiles at this point in the driver startup.
> +  * runtime_pm is per-device rather than per-tile, so this is still the
> +  * correct structure.
> +  */
> + intel_wakeref_init(&gt->wakeref, &gt->i915->runtime_pm, &wf_ops);
>   seqcount_mutex_init(&gt->stats.lock, &gt->wakeref.mutex);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
> b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> index 14216cc471b1..66143316d92e 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
> @@ -180,6 +180,11 @@ struct intel_gt {
>  
>   const struct intel_mmio_range *steering_table[NUM_STEERING_TYPES];
>  
> + /*
> +  * Base of per-tile GTTMMADR where we can derive the MMIO and the GGTT.
> +  */
> + phys_addr_t phys_addr;
> +
>   struct intel_gt_info {
>   intel_engine_mask_t engine_mask;


Re: [Intel-gfx] [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Sebastian Andrzej Siewior
On 2021-10-13 14:02:59 [+0200], Daniel Vetter wrote:
> On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> > The warning popped up; it says to increase it by the number of occurrences.
> > I saw it 18 times, so here it is.
> > It started to show up since commit
> >2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> > property")
> > 
> > Increase DRM_OBJECT_MAX_PROPERTY by 18.
> > 
> > Signed-off-by: Sebastian Andrzej Siewior 
> 
> Which driver where? Whomever added that into upstream should also have
> realized this (things will just not work) and include it in there. So if
> things are tested correctly this should be part of a larger series to add
> these 18 props somewhere.

This is on i915 with full debug. If I remember correctly, it wasn't
there before commit
   c7fcbf2513973 ("drm/plane: check that fb_damage is set up when used")

With that commit the box crashed until commit 
   2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
property")

where I then observed this.

> Also maybe we should just dynamically allocate this array if people have
> this many properties on their objects.
> -Daniel

Sebastian
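
For reference, the limit being discussed is the fixed-size array bound in
include/drm/drm_mode_object.h. A sketch of the relevant definitions (the
exact value 24 is an assumption based on kernels of this vintage, so +18
would take it to 42):

  #define DRM_OBJECT_MAX_PROPERTY 24

  struct drm_object_properties {
          int count;
          /* both arrays are sized by the compile-time constant above */
          struct drm_property *properties[DRM_OBJECT_MAX_PROPERTY];
          uint64_t values[DRM_OBJECT_MAX_PROPERTY];
  };

Attaching more properties than that to a single mode object trips the
warning described above, which is also why the alternative Daniel mentions
is to allocate these arrays dynamically.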


Re: [Intel-gfx] [PATCH 1/6] drm/i915: Update dma_fence_work

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:25PM +0200, Thomas Hellström wrote:
> Move the release callback to after fence signaling to align with
> what's done for upcoming VM_BIND user-fence signaling.
> 
> Finally call the work callback regardless of whether we have a fence
> error or not and update the existing callbacks accordingly. We will
> need this to intercept the error for failsafe migration.
> 
> Signed-off-by: Thomas Hellström 

I think before we make this thing more complex we really should either
move this into dma-buf/ as a proper thing, or just open-code.

Minimally at least any new async dma_fence worker needs to have
dma_fence_begin/end_signalling annotations, or we're just digging a grave
here.

I'm also not seeing the point in building everything on top of this; for
many cases just an open-coded work_struct should be a lot simpler. It's
just more to clean up later on, that much is for sure.
-Daniel
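
The annotations Daniel means are the dma-fence signalling critical section
helpers from the dma-buf core. A minimal sketch of how the worker could use
them, with the release handling elided; only the begin/end pair is the point
here, the rest mirrors the fence_work shown below:

  static void fence_work(struct work_struct *work)
  {
          struct dma_fence_work *f = container_of(work, typeof(*f), work);
          bool cookie;

          /* Everything between begin/end is a fence-signalling critical
           * section: lockdep complains if it e.g. allocates with GFP_KERNEL
           * or takes a lock that is also held while waiting on fences. */
          cookie = dma_fence_begin_signalling();

          f->ops->work(f);
          dma_fence_signal(&f->dma);

          dma_fence_end_signalling(cookie);
  }

With CONFIG_PROVE_LOCKING this teaches lockdep the signalling dependencies,
so "allocate memory on the signalling path" deadlocks show up at runtime
instead of in the field.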

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  5 +++
>  drivers/gpu/drm/i915/i915_sw_fence_work.c   | 36 ++---
>  drivers/gpu/drm/i915/i915_sw_fence_work.h   |  1 +
>  drivers/gpu/drm/i915/i915_vma.c | 12 +--
>  4 files changed, 33 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> index f0435c6feb68..2143ebaf5b6f 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
> @@ -28,6 +28,11 @@ static void clflush_work(struct dma_fence_work *base)
>  {
>   struct clflush *clflush = container_of(base, typeof(*clflush), base);
>  
> + if (base->error) {
> + dma_fence_set_error(&base->dma, base->error);
> + return;
> + }
> +
>   __do_clflush(clflush->obj);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> index 5b33ef23d54c..5b55cddafc9b 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> @@ -6,21 +6,24 @@
>  
>  #include "i915_sw_fence_work.h"
>  
> -static void fence_complete(struct dma_fence_work *f)
> +static void dma_fence_work_complete(struct dma_fence_work *f)
>  {
> + dma_fence_signal(&f->dma);
> +
>   if (f->ops->release)
>   f->ops->release(f);
> - dma_fence_signal(&f->dma);
> +
> + dma_fence_put(&f->dma);
>  }
>  
> -static void fence_work(struct work_struct *work)
> +static void dma_fence_work_work(struct work_struct *work)
>  {
>   struct dma_fence_work *f = container_of(work, typeof(*f), work);
>  
> - f->ops->work(f);
> + if (f->ops->work)
> + f->ops->work(f);
>  
> - fence_complete(f);
> - dma_fence_put(&f->dma);
> + dma_fence_work_complete(f);
>  }
>  
>  static int __i915_sw_fence_call
> @@ -31,17 +34,13 @@ fence_notify(struct i915_sw_fence *fence, enum 
> i915_sw_fence_notify state)
>   switch (state) {
>   case FENCE_COMPLETE:
>   if (fence->error)
> - dma_fence_set_error(&f->dma, fence->error);
> -
> - if (!f->dma.error) {
> - dma_fence_get(&f->dma);
> - if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
> - fence_work(&f->work);
> - else
> - queue_work(system_unbound_wq, &f->work);
> - } else {
> - fence_complete(f);
> - }
> + cmpxchg(&f->error, 0, fence->error);
> +
> + dma_fence_get(&f->dma);
> + if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
> + dma_fence_work_work(&f->work);
> + else
> + queue_work(system_unbound_wq, &f->work);
>   break;
>  
>   case FENCE_FREE:
> @@ -84,10 +83,11 @@ void dma_fence_work_init(struct dma_fence_work *f,
>const struct dma_fence_work_ops *ops)
>  {
>   f->ops = ops;
> + f->error = 0;
>   spin_lock_init(&f->lock);
>   dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
>   i915_sw_fence_init(&f->chain, fence_notify);
> - INIT_WORK(&f->work, fence_work);
> + INIT_WORK(&f->work, dma_fence_work_work);
>  }
>  
>  int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.h 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.h
> index d56806918d13..caa59fb5252b 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.h
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.h
> @@ -24,6 +24,7 @@ struct dma_fence_work_ops {
>  struct dma_fence_work {
>   struct dma_fence dma;
>   spinlock_t lock;
> + int error;
>  
>   struct i915_sw_fence chain;
>   struct i915_sw_dma_fence_cb cb;
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 4b7fc4647e46..5123ac28ad9a 100644
> --- a/d

Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> The TTM managers and, possibly, the gtt address space managers will
> need to be able to order fences for async operation.
> Using dma_fence_is_later() for this will require that the fences we hand
> them are from a single fence context and ordered.
> 
> Introduce a struct dma_fence_work_timeline, and a function to attach
> struct dma_fence_work to such a timeline in a way that all previous
> fences attached to the timeline will be signaled when the latest
> attached struct dma_fence_work signals.
> 
> Signed-off-by: Thomas Hellström 

I'm not understanding why we need this:

- if we just want to order dma_fence work, then an ordered workqueue is
  what we want. Which is why hand-rolling is better than reusing
  dma_fence_work for absolutely everything.

- if we just need to make sure the public fences signal in order, then
  it's a dma_fence_chain.

Definitely no more "it looks like it's shared code but isn't" stuff in
i915.
-Daniel
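
To illustrate the two alternatives with a sketch (the workqueue name,
prev_fence/new_fence and the seqno scheme are made up for the example):

  /* (a) order the work itself: an ordered workqueue executes items
   * one at a time, in submission order */
  struct workqueue_struct *wq = alloc_ordered_workqueue("vm-bind", 0);

  queue_work(wq, &f->work);

  /* (b) order only the public fences: chain each new fence onto the
   * previous one; dma_fence_chain_init() takes ownership of both
   * fence references and guarantees signalling order */
  struct dma_fence_chain *chain = dma_fence_chain_alloc();

  dma_fence_chain_init(chain, prev_fence, new_fence, seqno++);

Variant (a) serialises execution, variant (b) only the signalling order
visible to other drivers; which one is wanted depends on whether the
underlying operations may run concurrently.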

> ---
>  drivers/gpu/drm/i915/i915_sw_fence_work.c | 89 ++-
>  drivers/gpu/drm/i915/i915_sw_fence_work.h | 58 +++
>  2 files changed, 145 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
> b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> index 5b55cddafc9b..87cdb3158042 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
> @@ -5,6 +5,66 @@
>   */
>  
>  #include "i915_sw_fence_work.h"
> +#include "i915_utils.h"
> +
> +/**
> + * dma_fence_work_timeline_attach - Attach a struct dma_fence_work to a
> + * timeline.
> + * @tl: The timeline to attach to.
> + * @f: The struct dma_fence_work.
> + * @tl_cb: The i915_sw_dma_fence_cb needed to attach to the
> + * timeline. This is typically embedded into the structure that also
> + * embeds the struct dma_fence_work.
> + *
> + * This function takes a timeline reference and associates it with the
> + * struct dma_fence_work. That reference is given up when the fence
> + * signals. Furthermore it assigns a fence context and a seqno to the
> + * dma-fence, and then chains upon the previous fence of the timeline
> + * if any, to make sure that the fence signals after that fence. The
> + * @tl_cb callback structure is needed for that chaining. Finally
> + * the registered last fence of the timeline is replaced by this fence, and
> + * the timeline takes a reference on the fence, which is released when
> + * the fence signals.
> + */
> +void dma_fence_work_timeline_attach(struct dma_fence_work_timeline *tl,
> + struct dma_fence_work *f,
> + struct i915_sw_dma_fence_cb *tl_cb)
> +{
> + struct dma_fence *await;
> +
> + if (tl->ops->get)
> + tl->ops->get(tl);
> +
> + spin_lock(&tl->lock);
> + await = tl->last_fence;
> + tl->last_fence = dma_fence_get(&f->dma);
> + f->dma.seqno = tl->seqno++;
> + f->dma.context = tl->context;
> + f->tl = tl;
> + spin_unlock(&tl->lock);
> +
> + if (await) {
> + __i915_sw_fence_await_dma_fence(&f->chain, await, tl_cb);
> + dma_fence_put(await);
> + }
> +}
> +
> +static void dma_fence_work_timeline_detach(struct dma_fence_work *f)
> +{
> + struct dma_fence_work_timeline *tl = f->tl;
> + bool put = false;
> +
> + spin_lock(&tl->lock);
> + if (tl->last_fence == &f->dma) {
> + put = true;
> + tl->last_fence = NULL;
> + }
> + spin_unlock(&tl->lock);
> + if (tl->ops->put)
> + tl->ops->put(tl);
> + if (put)
> + dma_fence_put(&f->dma);
> +}
>  
>  static void dma_fence_work_complete(struct dma_fence_work *f)
>  {
> @@ -13,6 +73,9 @@ static void dma_fence_work_complete(struct dma_fence_work 
> *f)
>   if (f->ops->release)
>   f->ops->release(f);
>  
> + if (f->tl)
> + dma_fence_work_timeline_detach(f);
> +
>   dma_fence_put(&f->dma);
>  }
>  
> @@ -53,14 +116,17 @@ fence_notify(struct i915_sw_fence *fence, enum 
> i915_sw_fence_notify state)
>  
>  static const char *get_driver_name(struct dma_fence *fence)
>  {
> - return "dma-fence";
> + struct dma_fence_work *f = container_of(fence, typeof(*f), dma);
> +
> + return (f->tl && f->tl->ops->name) ? f->tl->ops->name : "dma-fence";
>  }
>  
>  static const char *get_timeline_name(struct dma_fence *fence)
>  {
>   struct dma_fence_work *f = container_of(fence, typeof(*f), dma);
>  
> - return f->ops->name ?: "work";
> + return (f->tl && f->tl->name) ? f->tl->name :
> + f->ops->name ?: "work";
>  }
>  
>  static void fence_release(struct dma_fence *fence)
> @@ -84,6 +150,7 @@ void dma_fence_work_init(struct dma_fence_work *f,
>  {
>   f->ops = ops;
>   f->error = 0;
> + f->tl = NULL;
>   spin_lock_init(&f->lock);
>   dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
>   i

Re: [Intel-gfx] [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:35:25PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-10-13 14:02:59 [+0200], Daniel Vetter wrote:
> > On Tue, Oct 05, 2021 at 08:51:51AM +0200, Sebastian Andrzej Siewior wrote:
> > > The warning popped up; it says to increase it by the number of occurrences.
> > > I saw it 18 times, so here it is.
> > > It started to show up since commit
> > >2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> > > property")
> > > 
> > > Increase DRM_OBJECT_MAX_PROPERTY by 18.
> > > 
> > > Signed-off-by: Sebastian Andrzej Siewior 
> > 
> > Which driver where? Whomever added that into upstream should also have
> > realized this (things will just not work) and include it in there. So if
> > things are tested correctly this should be part of a larger series to add
> > these 18 props somewhere.
> 
> This is on i915 with full debug. If I remember correctly, it wasn't
> there before commit
>c7fcbf2513973 ("drm/plane: check that fb_damage is set up when used")
> 
> With that commit the box crashed until commit 
>2f425cf5242a0 ("drm: Fix oops in damage self-tests by mocking damage 
> property")
> 
> where I then observed this.

Hm, there's a pile of commits in that range, and nothing immediately jumps
out. The thing is, 18 is likely way too much: if e.g. a single new property
on a plane pushes all planes over the limit, you already get IIRC 3x4 = 12
warnings simply because we have that many planes.

So it would be good to know the actual culprit.

Can you please try to bisect the above range, applying the patch as a local
fixup at each step (without committing it, since that would confuse git
bisect a bit, I think), so we know what/where went wrong?

I'm still confused why this isn't showing up anywhere in our intel ci ...

Thanks, Daniel
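
A sketch of the workflow Daniel is asking for (the patch file name and the
good/bad endpoints are placeholders):

  git bisect start 2f425cf5242a0 <last-known-good>
  # at every bisect step:
  git apply increase-max-property.diff    # apply the bump as a local fixup
  # build, boot, check whether the WARN fires ...
  git checkout -- .                       # drop the fixup again
  git bisect good    # or: git bisect bad

Keeping the bump uncommitted at each step means every tested tree still
builds and boots, without an extra commit confusing git bisect's
bookkeeping.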

> 
> > Also maybe we should just dynamically allocate this array if people have
> > this many properties on their objects.
> > -Daniel
> 
> Sebastian

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/6] drm/i915: Update dma_fence_work

2021-10-13 Thread Thomas Hellström



On 10/13/21 14:41, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:25PM +0200, Thomas Hellström wrote:

Move the release callback to after fence signaling to align with
what's done for upcoming VM_BIND user-fence signaling.

Finally call the work callback regardless of whether we have a fence
error or not and update the existing callbacks accordingly. We will
need this to intercept the error for failsafe migration.

Signed-off-by: Thomas Hellström 

I think before we make this thing more complex we really should either
move this into dma-buf/ as a proper thing, or just open-code.

Minimally at least any new async dma_fence worker needs to have
dma_fence_begin/end_signalling annotations, or we're just digging a grave
here.

I'm also not seeing the point in building everything on top of this; for
many cases just an open-coded work_struct should be a lot simpler. It's
just more to clean up later on, that much is for sure.
-Daniel


Yes, I mentioned to Matthew, I'm going to respin this based on our 
previous discussions.


Forgot to mention on the ML.

/Thomas



---
  drivers/gpu/drm/i915/gem/i915_gem_clflush.c |  5 +++
  drivers/gpu/drm/i915/i915_sw_fence_work.c   | 36 ++---
  drivers/gpu/drm/i915/i915_sw_fence_work.h   |  1 +
  drivers/gpu/drm/i915/i915_vma.c | 12 +--
  4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f0435c6feb68..2143ebaf5b6f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -28,6 +28,11 @@ static void clflush_work(struct dma_fence_work *base)
  {
struct clflush *clflush = container_of(base, typeof(*clflush), base);
  
+	if (base->error) {

+   dma_fence_set_error(&base->dma, base->error);
+   return;
+   }
+
__do_clflush(clflush->obj);
  }
  
diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c b/drivers/gpu/drm/i915/i915_sw_fence_work.c

index 5b33ef23d54c..5b55cddafc9b 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -6,21 +6,24 @@
  
  #include "i915_sw_fence_work.h"
  
-static void fence_complete(struct dma_fence_work *f)

+static void dma_fence_work_complete(struct dma_fence_work *f)
  {
+   dma_fence_signal(&f->dma);
+
if (f->ops->release)
f->ops->release(f);
-   dma_fence_signal(&f->dma);
+
+   dma_fence_put(&f->dma);
  }
  
-static void fence_work(struct work_struct *work)

+static void dma_fence_work_work(struct work_struct *work)
  {
struct dma_fence_work *f = container_of(work, typeof(*f), work);
  
-	f->ops->work(f);

+   if (f->ops->work)
+   f->ops->work(f);
  
-	fence_complete(f);

-   dma_fence_put(&f->dma);
+   dma_fence_work_complete(f);
  }
  
  static int __i915_sw_fence_call

@@ -31,17 +34,13 @@ fence_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
switch (state) {
case FENCE_COMPLETE:
if (fence->error)
-   dma_fence_set_error(&f->dma, fence->error);
-
-   if (!f->dma.error) {
-   dma_fence_get(&f->dma);
-   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-   fence_work(&f->work);
-   else
-   queue_work(system_unbound_wq, &f->work);
-   } else {
-   fence_complete(f);
-   }
+   cmpxchg(&f->error, 0, fence->error);
+
+   dma_fence_get(&f->dma);
+   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+   dma_fence_work_work(&f->work);
+   else
+   queue_work(system_unbound_wq, &f->work);
break;
  
  	case FENCE_FREE:

@@ -84,10 +83,11 @@ void dma_fence_work_init(struct dma_fence_work *f,
 const struct dma_fence_work_ops *ops)
  {
f->ops = ops;
+   f->error = 0;
spin_lock_init(&f->lock);
dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
i915_sw_fence_init(&f->chain, fence_notify);
-   INIT_WORK(&f->work, fence_work);
+   INIT_WORK(&f->work, dma_fence_work_work);
  }
  
  int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.h 
b/drivers/gpu/drm/i915/i915_sw_fence_work.h
index d56806918d13..caa59fb5252b 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.h
@@ -24,6 +24,7 @@ struct dma_fence_work_ops {
  struct dma_fence_work {
struct dma_fence dma;
spinlock_t lock;
+   int error;
  
  	struct i915_sw_fence chain;

struct i915_sw_dma_fence_cb cb;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
i

Re: [Intel-gfx] [PATCH v2] drm/locking: add backtrace for locking contended locks without backoff

2021-10-13 Thread Jani Nikula
On Fri, 01 Oct 2021, Jani Nikula  wrote:
> If drm_modeset_lock() returns -EDEADLK, the caller is supposed to drop
> all currently held locks using drm_modeset_backoff(). Failing to do so
> will result in warnings and backtraces on the paths trying to lock a
> contended lock. Add support for optionally printing the backtrace on the
> path that hit the deadlock and didn't gracefully handle the situation.
>
> For example, the patch [1] inadvertently dropped the return value check
> and error return on replacing calc_watermark_data() with
> intel_compute_global_watermarks(). The backtraces on the subsequent
> locking paths hitting WARN_ON(ctx->contended) were unhelpful, but adding
> the backtrace to the deadlock path produced this helpful printout:
>
> <7> [98.002465] drm_modeset_lock attempting to lock a contended lock without 
> backoff:
>drm_modeset_lock+0x107/0x130
>drm_atomic_get_plane_state+0x76/0x150
>skl_compute_wm+0x251d/0x2b20 [i915]
>intel_atomic_check+0x1942/0x29e0 [i915]
>drm_atomic_check_only+0x554/0x910
>drm_atomic_nonblocking_commit+0xe/0x50
>drm_mode_atomic_ioctl+0x8c2/0xab0
>drm_ioctl_kernel+0xac/0x140
>
> Add new CONFIG_DRM_DEBUG_MODESET_LOCK to enable modeset lock debugging
> with stack depot and trace.
>
> [1] https://lore.kernel.org/r/20210924114741.15940-4-jani.nik...@intel.com
>
> v2:
> - default y if DEBUG_WW_MUTEX_SLOWPATH (Daniel)
> - depends on DEBUG_KERNEL
>
> Cc: Daniel Vetter 
> Cc: Dave Airlie 
> Reviewed-by: Daniel Vetter 
> Signed-off-by: Jani Nikula 

Pushed to drm-misc-next, thanks for the review.

BR,
Jani.
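
For context, the graceful handling the commit message refers to is the
usual acquire-context retry loop; a minimal sketch per the drm_modeset_lock
kerneldoc conventions (the specific lock taken here is just an example):

  struct drm_modeset_acquire_ctx ctx;
  int ret;

  drm_modeset_acquire_init(&ctx, 0);
retry:
  ret = drm_modeset_lock(&crtc->mutex, &ctx);
  if (ret == -EDEADLK) {
          drm_modeset_backoff(&ctx); /* drops held locks, waits on the
                                      * contended one */
          goto retry;
  }
  /* ... perform the update ... */
  drm_modeset_drop_locks(&ctx);
  drm_modeset_acquire_fini(&ctx);

A caller that forgets the drm_modeset_backoff() branch is exactly what the
new CONFIG_DRM_DEBUG_MODESET_LOCK backtrace catches.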

> ---
>  drivers/gpu/drm/Kconfig| 15 +
>  drivers/gpu/drm/drm_modeset_lock.c | 49 --
>  include/drm/drm_modeset_lock.h |  8 +
>  3 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 2a926d0de423..a4c020a9a0eb 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -100,6 +100,21 @@ config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>This has the potential to use a lot of memory and print some very
>large kernel messages. If in doubt, say "N".
>  
> +config DRM_DEBUG_MODESET_LOCK
> + bool "Enable backtrace history for lock contention"
> + depends on STACKTRACE_SUPPORT
> + depends on DEBUG_KERNEL
> + depends on EXPERT
> + select STACKDEPOT
> + default y if DEBUG_WW_MUTEX_SLOWPATH
> + help
> +   Enable debug tracing of failures to gracefully handle drm modeset lock
> +   contention. A history of each drm modeset lock path hitting -EDEADLK
> +   will be saved until gracefully handled, and the backtrace will be
> +   printed when attempting to lock a contended lock.
> +
> +   If in doubt, say "N".
> +
>  config DRM_FBDEV_EMULATION
>   bool "Enable legacy fbdev support for your modesetting driver"
>   depends on DRM
> diff --git a/drivers/gpu/drm/drm_modeset_lock.c 
> b/drivers/gpu/drm/drm_modeset_lock.c
> index bf8a6e823a15..4d32b61fa1fd 100644
> --- a/drivers/gpu/drm/drm_modeset_lock.c
> +++ b/drivers/gpu/drm/drm_modeset_lock.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  /**
>   * DOC: kms locking
> @@ -77,6 +78,45 @@
>  
>  static DEFINE_WW_CLASS(crtc_ww_class);
>  
> +#if IS_ENABLED(CONFIG_DRM_DEBUG_MODESET_LOCK)
> +static noinline depot_stack_handle_t __stack_depot_save(void)
> +{
> + unsigned long entries[8];
> + unsigned int n;
> +
> + n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> +
> + return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
> +}
> +
> +static void __stack_depot_print(depot_stack_handle_t stack_depot)
> +{
> + struct drm_printer p = drm_debug_printer("drm_modeset_lock");
> + unsigned long *entries;
> + unsigned int nr_entries;
> + char *buf;
> +
> + buf = kmalloc(PAGE_SIZE, GFP_NOWAIT | __GFP_NOWARN);
> + if (!buf)
> + return;
> +
> + nr_entries = stack_depot_fetch(stack_depot, &entries);
> + stack_trace_snprint(buf, PAGE_SIZE, entries, nr_entries, 2);
> +
> + drm_printf(&p, "attempting to lock a contended lock without 
> backoff:\n%s", buf);
> +
> + kfree(buf);
> +}
> +#else /* CONFIG_DRM_DEBUG_MODESET_LOCK */
> +static depot_stack_handle_t __stack_depot_save(void)
> +{
> + return 0;
> +}
> +static void __stack_depot_print(depot_stack_handle_t stack_depot)
> +{
> +}
> +#endif /* CONFIG_DRM_DEBUG_MODESET_LOCK */
> +
>  /**
>   * drm_modeset_lock_all - take all modeset locks
>   * @dev: DRM device
> @@ -225,7 +265,9 @@ EXPORT_SYMBOL(drm_modeset_acquire_fini);
>   */
>  void drm_modeset_drop_locks(struct drm_modeset_acquire_ctx *ctx)
>  {
> - WARN_ON(ctx->contended);
> + if (WARN_ON(ctx->contended))
> + __stack_depot_print(ctx->stack_depot);
> +
>   while (!list_empty(&ctx->locked)) {
>   struct drm_modeset_lock *lock;
>  
> @@ -243,7 +285,8 @@ static i

Re: [Intel-gfx] [PATCH v2] component: do not leave master devres group open after bind

2021-10-13 Thread Greg KH
On Wed, Oct 06, 2021 at 04:47:57PM +0300, Kai Vehmanen wrote:
> Hi,
> 
> On Tue, 5 Oct 2021, Greg KH wrote:
> 
> > On Wed, Sep 22, 2021 at 11:54:32AM +0300, Kai Vehmanen wrote:
> > > In current code, the devres group for aggregate master is left open
> > > after call to component_master_add_*(). This leads to problems when the
> > > master does further managed allocations on its own. When any
> > > participating driver calls component_del(), this leads to immediate
> > > release of resources.
> [...]
> > > the devres group, and by closing the devres group after
> > > the master->ops->bind() call is done. This allows devres allocations
> > > done by the driver acting as master to be isolated from the binding state
> > > of the aggregate driver. This modifies the logic originally introduced in
> > > commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master 
> > > device")
> > > 
> > > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
> > > Signed-off-by: Kai Vehmanen 
> > > Acked-by: Imre Deak 
> > > Acked-by: Russell King (Oracle) 
> > 
> > What commit does this "fix:"?  And does it need to go to stable
> > kernel(s)?
> 
> I didn't put a "Fixes" on the original commit 9e1ccb4a7700 
> ("drivers/base: fix devres handling for master device") as it alone
> didn't cause problems. It did open the door for possible devres issues
> for anybody calling component_master_add_*().
> 
> On audio side, this surfaced with the more recent commit 3fcaf24e5dce 
> ("ALSA: hda: Allocate resources with device-managed APIs"). In theory one 
> could have hit issues already before, but this made it very easy to hit
> on actual systems.
> 
> If I'd have to pick one, it would be 9e1ccb4a7700 ("drivers/base: fix 
> devres handling for master device"). And yes, given comments on this 
> thread, I'd say this needs to go to stable kernels.

Then please add a fixes: line and a cc: stable line and resend.

thanks,

greg k-h
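
Concretely, the resend would carry trailers along these lines (assuming
9e1ccb4a7700 is indeed the commit being fixed, per Kai's analysis above):

  Fixes: 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
  Cc: <stable@vger.kernel.org>

The Fixes: tag lets the stable tooling work out which kernels need the
backport, while the explicit Cc: stable line requests it outright.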


[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: vlv sideband

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: vlv sideband
URL   : https://patchwork.freedesktop.org/series/95764/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10728_full -> Patchwork_21327_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21327_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21327_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21327_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_frontbuffer_tracking@fbc-suspend:
- shard-kbl:  [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl7/igt@kms_frontbuffer_track...@fbc-suspend.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-kbl2/igt@kms_frontbuffer_track...@fbc-suspend.html

  
Known issues


  Here are the changes found in Patchwork_21327_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
- shard-kbl:  [PASS][3] -> [DMESG-WARN][4] ([i915#180]) +5 similar 
issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl1/igt@gem_ctx_isolation@preservation...@vcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-kbl1/igt@gem_ctx_isolation@preservation...@vcs0.html

  * igt@gem_ctx_persistence@legacy-engines-persistence:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +2 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-snb5/igt@gem_ctx_persiste...@legacy-engines-persistence.html

  * igt@gem_ctx_shared@q-in-order:
- shard-snb:  NOTRUN -> [SKIP][6] ([fdo#109271]) +224 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-snb5/igt@gem_ctx_sha...@q-in-order.html

  * igt@gem_eio@unwedge-stress:
- shard-skl:  [PASS][7] -> [TIMEOUT][8] ([i915#2369] / [i915#3063])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-skl1/igt@gem_...@unwedge-stress.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-skl5/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-skl:  NOTRUN -> [FAIL][9] ([i915#2846])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-skl1/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-tglb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-tglb2/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-glk:  [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-glk1/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-kbl2/igt@gem_exec_fair@basic-p...@vcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][15] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-iclb2/igt@gem_exec_fair@basic-p...@vcs1.html
- shard-tglb: [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb5/igt@gem_exec_fair@basic-p...@vcs1.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-tglb3/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_schedule@u-submit-golden-slice@vecs0:
- shard-skl:  NOTRUN -> [INCOMPLETE][18] ([i915#3797])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-skl8/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html

  * igt@gem_fenced_exec_thrash@2-spare-fences:
- shard-snb:  [PASS][19] -> [INCOMPLETE][20] ([i915#2055])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-snb5/igt@gem_fenced_exec_thr...@2-spare-fences.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21327/shard-snb6/igt@gem_fenced_exec_thr...@2-spare-fences.html

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][21] -> [SKIP][22] ([i915#2190])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10728/shard-tglb3/igt@gem_huc_c..

Re: [Intel-gfx] [PATCH 03/14] drm/i915/xehpsdv: enforce min GTT alignment

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:44PM +0530, Ramalingam C wrote:
> From: Matthew Auld 
> 
> For local-memory objects we need to align the GTT addresses to 64K, both
> for the ppgtt and ggtt.
> 
> Signed-off-by: Matthew Auld 
> Signed-off-by: Stuart Summers 
> Signed-off-by: Ramalingam C 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 

Do we still need this with relocations removed? Userspace is picking all
the addresses for us, so all we have to check is whether userspace got it
right.
-Daniel
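
For softpin users the practical consequence is that an EXEC_OBJECT_PINNED
offset for a local-memory object must already be 64K-aligned, or the pin is
rejected. A sketch of the userspace side (field names are from struct
drm_i915_gem_exec_object2; the ALIGN() helper is assumed):

  struct drm_i915_gem_exec_object2 obj = { };

  /* 64K GTT alignment required for lmem objects on such platforms */
  obj.offset = ALIGN(proposed_offset, 64 * 1024);
  obj.flags  = EXEC_OBJECT_PINNED;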


> ---
>  drivers/gpu/drm/i915/i915_vma.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 4b7fc4647e46..1ea1fa08efdf 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
> alignment, u64 flags)
>   }
>  
>   color = 0;
> - if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
> - color = vma->obj->cache_level;
> + if (vma->obj) {
> + if (HAS_64K_PAGES(vma->vm->i915) && 
> i915_gem_object_is_lmem(vma->obj))
> + alignment = max(alignment, I915_GTT_PAGE_SIZE_64K);
> +
> + if (i915_vm_has_cache_coloring(vma->vm))
> + color = vma->obj->cache_level;
> + }
>  
>   if (flags & PIN_OFFSET_FIXED) {
>   u64 offset = flags & PIN_OFFSET_MASK;
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread kernel test robot
Hi Maarten,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip drm-exynos/exynos-drm-next 
tegra-drm/drm/tegra/for-next v5.15-rc5 next-20211013]
[cannot apply to airlied/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Maarten-Lankhorst/drm-i915-Use-dma_resv_iter-for-waiting-in-i915_gem_object_wait_reservation/20211013-184219
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-a015-20211013 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# 
https://github.com/0day-ci/linux/commit/647f0c4c47ffea53967daf523e8b935707e7a586
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Maarten-Lankhorst/drm-i915-Use-dma_resv_iter-for-waiting-in-i915_gem_object_wait_reservation/20211013-184219
git checkout 647f0c4c47ffea53967daf523e8b935707e7a586
# save the attached .config to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/i915/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:18:10: fatal error: 
>> dma_resv_utils.h: No such file or directory
  18 | #include "dma_resv_utils.h"
 |  ^~
   compilation terminated.


vim +18 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c

09137e94543761 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c Chris Wilson  
2020-07-08  17  
6d393ef5ff5cac drivers/gpu/drm/i915/gem/i915_gem_shrinker.c Chris Wilson  
2020-12-23 @18  #include "dma_resv_utils.h"
be6a0376950475 drivers/gpu/drm/i915/i915_gem_shrinker.c Daniel Vetter 
2015-03-18  19  #include "i915_trace.h"
be6a0376950475 drivers/gpu/drm/i915/i915_gem_shrinker.c Daniel Vetter 
2015-03-18  20  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [Intel-gfx] [PATCH 13/14] drm/i915/uapi: document behaviour for DG2 64K support

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:54PM +0530, Ramalingam C wrote:
> From: Matthew Auld 
> 
> On discrete platforms like DG2, we need to support a minimum page size
> of 64K when dealing with device local-memory. This is quite tricky for
> various reasons, so try to document the new implicit uapi for this.
> 
> Signed-off-by: Matthew Auld 
> Signed-off-by: Ramalingam C 
> ---
>  include/uapi/drm/i915_drm.h | 61 ++---
>  1 file changed, 56 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index aa2a7eccfb94..d62e8b7ed8b6 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
>   /**
>* When the EXEC_OBJECT_PINNED flag is specified this is populated by
>* the user with the GTT offset at which this object will be pinned.
> +  *
>* When the I915_EXEC_NO_RELOC flag is specified this must contain the
>* presumed_offset of the object.
> +  *
>* During execbuffer2 the kernel populates it with the value of the
>* current GTT offset of the object, for future presumed_offset writes.
> +  *
> +  * See struct drm_i915_gem_create_ext for the rules when dealing with
> +  * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
> +  * minimum page sizes, like DG2.
>*/
>   __u64 offset;
>  
> @@ -3001,11 +3007,56 @@ struct drm_i915_gem_create_ext {
>*

I think a heading here (or a bit earlier) about Page alignment would be
good. Just mark it up as bold or something (since real sphinx headings
won't work).

>* The (page-aligned) allocated size for the object will be returned.
>*
> -  * Note that for some devices we have might have further minimum
> -  * page-size restrictions(larger than 4K), like for device local-memory.
> -  * However in general the final size here should always reflect any
> -  * rounding up, if for example using the 
> I915_GEM_CREATE_EXT_MEMORY_REGIONS
> -  * extension to place the object in device local-memory.
> +  * On discrete platforms, starting from DG2, we have to contend with GTT
> +  * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> +  * objects.  Specifically the hardware only supports 64K or larger GTT
> +  * page sizes for such memory. The kernel will already ensure that all
> +  * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> +  * sizes underneath.
> +  *
> +  * Note that the returned size here will always reflect any required
> +  * rounding up done by the kernel, i.e 4K will now become 64K on devices
> +  * such as DG2. The GTT alignment will also need be at least 64K for
> +  * such objects.
> +  *

I think here we should have a "Special DG2 placement restrictions" heading
for clarity

> +  * Note that due to how the hardware implements 64K GTT page support, we
> +  * have some further complications:
> +  *
> +  *   1.) The entire PDE(which covers a 2M virtual address range), must

Does this really format into a nice list in the html output? Also don't use
both "." and ")"; usually in text it's just ")".

> +  *   contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same
> +  *   PDE is forbidden by the hardware.
> +  *
> +  *   2.) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> +  *   objects.
> +  *
> +  * To handle the above the kernel implements a memory coloring scheme to
> +  * prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
> +  * I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is
> +  * ever unable to evict the required pages for the given PDE(different
> +  * color) when inserting the object into the GTT then it will simply
> +  * fail the request.
> +  *
> +  * Since userspace needs to manage the GTT address space themselves,
> +  * special care is needed to ensure this doesn't happen. The simplest
> +  * scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE
> +  * objects to 2M, which avoids any issues here. At the very least this
> +  * is likely needed for objects that can be placed in both
> +  * I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
> +  * potential issues when the kernel needs to migrate the object behind
> +  * the scenes, since that might also involve evicting other objects.
> +  *
> +  * To summarise the GTT rules, on platforms like DG2:
> +  *
> +  *   1.) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must
> +  *   have 64K alignment. The kernel will reject this otherwise.
> +  *
> +  *   2.) All I915_MEMORY_CLASS_DEVICE objects must never be placed in
> +  *   the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The
> +  *   kernel will r

Re: [Intel-gfx] [PATCH 14/14] Doc/gpu/rfc/i915: i915 DG2 uAPI

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:55PM +0530, Ramalingam C wrote:
> Details of the new features getting added as part of DG2 enabling and their
> implicit impact on the uAPI.
> 
> Signed-off-by: Ramalingam C 
> cc: Daniel Vetter 
> cc: Matthew Auld 
> ---
>  Documentation/gpu/rfc/i915_dg2.rst | 47 ++
>  Documentation/gpu/rfc/index.rst|  3 ++
>  2 files changed, 50 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_dg2.rst

Please move this and any uapi doc patch this relies on to the front of the
series, so it serves as an intro.

I think the 64k side looks good with the uapi docs, once it's fully
reviewed and acked.

What we still need is proper uapi docs for flat CCS. I think for that a
separate flat CCS DOC: section would be good, which is then referenced by
the gem_create_ext kerneldoc with a sphinx hyperlink.

The other thing that's missing here is the set of DG2 flat CCS drm
modifiers. So we need another patch for that, which in its kerneldoc then
also links to the flat CCS DOC: section.

Finally that flat ccs doc section needs to discuss all the flat ccs issues
and uapi we've discussed. That patch needs to be acked both by userspace
driver folks, and by compositor folks (because of the modifier uapi
aspect). Please cc Pekka and Simon Ser for the compositor acks (but feel
free to add more people).
-Daniel

> 
> diff --git a/Documentation/gpu/rfc/i915_dg2.rst 
> b/Documentation/gpu/rfc/i915_dg2.rst
> new file mode 100644
> index ..a83ca26cd758
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_dg2.rst
> @@ -0,0 +1,47 @@
> +====================
> +I915 DG2 RFC Section
> +====================
> +
> +Upstream plan
> +=============
> +Plan to upstream the DG2 enabling is:
> +
> +* Merge basic HW enabling for DG2(Still without pciid)
> +* Merge the 64k support for lmem
> +* Merge the flat CCS enabling patches
> +* Add the pciid for DG2 and enable the DG2 in CI
> +
> +
> +64K page support for lmem
> +=========================
> +On DG2 hw, local memory supports a minimum GTT page size of 64K only; 4K is
> +no longer supported.
> +
> +DG2 hw does not support mixing 64K (lmem) and 4K (smem) pages in the same
> +ppgtt page table. Refer to struct drm_i915_gem_create_ext for the
> +implications of handling the 64K page size.
> +
> +.. kernel-doc:: include/uapi/drm/i915_drm.h
> +   :functions: drm_i915_gem_create_ext
> +
> +
> +flat CCS support for lmem
> +=========================
> +Gen 12+ devices support 3D surfaces compression and compression formats. 
> This is
> +accomplished by an additional compression control state (CCS) stored for 
> each surface.
> +
> +Gen 12 devices (TGL and DG1) store compression state in a separate region
> +of memory. It is managed by userspace and has an associated set of
> +userspace-managed page tables used by hardware for address translation.
> +
> +In Gen 12.5 devices (XEHPSDV and DG2), flat CCS is introduced to replace
> +the userspace-managed AUX pagetable with a flat-indexed region of device
> +memory for storing the compression state.
> +
> +The GOP driver steals a chunk of memory for the CCS surface corresponding
> +to the entire range of local memory. The memory required for the CCS of
> +the entire local memory is 1/256 of the main local memory. The GOP driver
> +also programs a secure register (XEHPSDV_FLAT_CCS_BASE_ADDR 0x4910) with
> +this address value.
> +
> +So the total local memory available for driver allocation is the total
> +lmem size minus the CCS data size.
> +
> +Flat CCS data needs to be cleared when an lmem object is allocated, and
> +CCS data can be copied in and out of the CCS region through
> +XY_CTRL_SURF_COPY_BLT.
> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> index 91e93a705230..afb320ed4028 100644
> --- a/Documentation/gpu/rfc/index.rst
> +++ b/Documentation/gpu/rfc/index.rst
> @@ -20,6 +20,9 @@ host such documentation:
>  
>  i915_gem_lmem.rst
>  
> +.. toctree::
> +i915_dg2.rst
> +
>  .. toctree::
>  
>  i915_scheduler.rst
> -- 
> 2.20.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 00/14] drm/i915/dg2: Enabling 64k page size and flat ccs

2021-10-13 Thread Daniel Vetter
On Mon, Oct 11, 2021 at 09:41:41PM +0530, Ramalingam C wrote:
> This series introduces the enabling patches for the new flat CCS feature and
> 64K page support for i915 local memory, along with documentation on the
> uAPI impact.
> 
> 64k page support
> ================
> 
> On discrete platforms, starting from DG2, we have to contend with GTT
> page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
> objects. Specifically the hardware only supports 64K or larger GTT page
> sizes for such memory. The kernel will already ensure that all
> I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
> sizes underneath.
> 
> Note that the returned size here will always reflect any required
> rounding up done by the kernel, i.e 4K will now become 64K on devices
> such as DG2. The GTT alignment will also need be at least 64K for such
> objects.
> 
> Note that due to how the hardware implements 64K GTT page support, we
> have some further complications:
> 
> 1.) The entire PDE(which covers a 2M virtual address range), must
> contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same PDE is
> forbidden by the hardware.
> 
> 2.) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
> objects.
> 
> To handle the above the kernel implements a memory coloring scheme to
> prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
> I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is ever
> unable to evict the required pages for the given PDE(different color)
> when inserting the object into the GTT then it will simply fail the
> request.
> 
> Since userspace needs to manage the GTT address space themselves,
> special care is needed to ensure this doesn’t happen. The simplest
> scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE
> objects to 2M, which avoids any issues here. At the very least this is
> likely needed for objects that can be placed in both
> I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
> potential issues when the kernel needs to migrate the object behind the
> scenes, since that might also involve evicting other objects.
> 
> To summarise the GTT rules, on platforms like DG2:
> 
> 1.) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must have
> 64K alignment. The kernel will reject this otherwise.
> 
> 2.) All I915_MEMORY_CLASS_DEVICE objects must never be placed in the
> same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The kernel will
> reject this otherwise.
> 
> 3.) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and
> I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out to
> 2M.
> 
> Flat CCS:
> =========
> Gen 12+ devices support 3D surfaces compression and compression formats.
> This is accomplished by an additional compression control state (CCS)
> stored for each surface.
> 
> Gen 12 devices (TGL and DG1) store compression state in a separate
> region of memory. It is managed by userspace and has an associated set
> of userspace-managed page tables used by hardware for address
> translation.
> 
> In Gen 12.5 devices (XEHPSDV and DG2), flat CCS is introduced to replace
> the userspace-managed AUX pagetable with a flat-indexed region of
> device memory for storing the compression state.
> 
> The GOP driver steals a chunk of memory for the CCS surface corresponding
> to the entire range of local memory. The memory required for the CCS of
> the entire local memory is 1/256 of the main local memory. The GOP driver
> will also program a secure register (XEHPSDV_FLAT_CCS_BASE_ADDR 0x4910)
> with this address value.
> 
> TODO: add patches for the flatccs modifiers and kdoc for them.

Ah it's here too :-)

Since this is uapi we also need a link to the IGTs (or at least where the
tests are), and to the Mesa MR (if that hasn't all landed yet).
-Daniel
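
Two concrete illustrations of the rules above (the figures and variable
names are examples, not taken from the series):

With the 1/256 CCS ratio, a hypothetical 16 GiB of local memory loses
16 GiB / 256 = 64 MiB to CCS state, leaving 16320 MiB for allocations.

The "align to 2M" advice for objects that can live in both memory classes
boils down to address-space management on the userspace side like:

  /* place lmem-capable objects on 2M boundaries so one PDE never has to
   * mix 64K (lmem) and 4K (smem) PTEs */
  uint64_t start = ALIGN(hole_start, 2ull << 20);
  uint64_t size  = ALIGN(obj_size, 2ull << 20);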

> 
> *** BLURB HERE ***
> 
> Abdiel Janulgue (1):
>   drm/i915/lmem: Enable lmem for platforms with Flat CCS
> 
> Ayaz A Siddiqui (1):
>   drm/i915/gt: Clear compress metadata for Gen12.5 >= platforms
> 
> Bommu Krishnaiah (1):
>   drm/i915: Add vm min alignment support
> 
> CQ Tang (1):
>   drm/i915/xehpsdv: Add has_flat_ccs to device info
> 
> Matthew Auld (8):
>   drm/i915/xehpsdv: set min page-size to 64K
>   drm/i915/xehpsdv: enforce min GTT alignment
>   drm/i915: enforce min page size for scratch
>   drm/i915/gtt/xehpsdv: move scratch page to system memory
>   drm/i915/xehpsdv: support 64K GTT pages
>   drm/i915/selftests: account for min_alignment in GTT selftests
>   drm/i915/xehpsdv: implement memory coloring
>   drm/i915/uapi: document behaviour for DG2 64K support
> 
> Ramalingam C (1):
>   Doc/gpu/rfc/i915: i915 DG2 uAPI
> 
> Stuart Summers (1):
>   drm/i915: Add has_64k_pages flag
> 
>  Documentation/gpu/rfc/i915_dg2.rst|  47 ++
>  Documentation/gpu/rfc/index.rst   |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_stolen.c|   6 +-
>  .../gpu/drm/i915/gem/selftests/huge_pages.c   |  61 
>  .../i915/gem/selftest

Re: [Intel-gfx] [PATCH 1/3] drm:Enable buddy allocator support

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 07:05:34PM +0530, Arunpravin wrote:
> Port Intel buddy manager to drm root folder

One patch to move it 1:1, then follow-up patches to change it. Not
everything in one.

Also i915 needs to be adapted to use this too, or this just doesn't make
sense.

I'm also wondering whether we shouldn't have a ready-made ttm helper for
this, so it all just glues in?
-Daniel
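
For readers skimming the patch below, the intended use of the manager is
roughly this (a sketch based only on the signatures in the patch; the sizes
are arbitrary examples):

  struct drm_buddy_mm mm;
  int err;

  /* manage 4 GiB of address space in 4K minimum chunks */
  err = drm_buddy_init(&mm, 4ull << 30, 4 << 10);
  if (err)
          return err;
  /* ... allocate and free power-of-two blocks, then tear the manager
   * down again via the matching fini call ... */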

> Implemented range allocation support for the provided order
> Implemented TOP-DOWN support
> Implemented freeing up unused pages on contiguous allocation
> Moved range allocation and freelist pickup into a single function
> 
> Signed-off-by: Arunpravin 
> ---
>  drivers/gpu/drm/Makefile|   2 +-
>  drivers/gpu/drm/drm_buddy.c | 705 
>  drivers/gpu/drm/drm_drv.c   |   3 +
>  include/drm/drm_buddy.h | 157 
>  4 files changed, 866 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/gpu/drm/drm_buddy.c
>  create mode 100644 include/drm/drm_buddy.h
> 
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index a118692a6df7..fe1a2fc09675 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -18,7 +18,7 @@ drm-y   :=  drm_aperture.o drm_auth.o drm_cache.o \
>   drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
>   drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
>   drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
> - drm_managed.o drm_vblank_work.o
> + drm_managed.o drm_vblank_work.o drm_buddy.o
>  
>  drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
> drm_dma.o \
>   drm_legacy_misc.o drm_lock.o drm_memory.o 
> drm_scatter.o \
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> new file mode 100644
> index ..8cd118574665
> --- /dev/null
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -0,0 +1,705 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2021 Intel Corporation
> + */
> +
> +#include 
> +#include 
> +
> +#include 
> +
> +static struct kmem_cache *slab_blocks;
> +
> +static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
> +struct drm_buddy_block *parent,
> +unsigned int order,
> +u64 offset)
> +{
> + struct drm_buddy_block *block;
> +
> + BUG_ON(order > DRM_BUDDY_MAX_ORDER);
> +
> + block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
> + if (!block)
> + return NULL;
> +
> + block->header = offset;
> + block->header |= order;
> + block->parent = parent;
> +
> + BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
> + return block;
> +}
> +
> +static void drm_block_free(struct drm_buddy_mm *mm,
> +struct drm_buddy_block *block)
> +{
> + kmem_cache_free(slab_blocks, block);
> +}
> +
> +static void mark_allocated(struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_ALLOCATED;
> +
> + list_del(&block->link);
> +}
> +
> +static void mark_free(struct drm_buddy_mm *mm,
> +   struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_FREE;
> +
> + list_add(&block->link,
> + &mm->free_list[drm_buddy_block_order(block)]);
> +}
> +
> +static void mark_split(struct drm_buddy_block *block)
> +{
> + block->header &= ~DRM_BUDDY_HEADER_STATE;
> + block->header |= DRM_BUDDY_SPLIT;
> +
> + list_del(&block->link);
> +}
> +
> +/**
> + * drm_buddy_init - init memory manager
> + *
> + * @mm: DRM buddy manager to initialize
> + * @size: size in bytes to manage
> + * @chunk_size: minimum page size in bytes for our allocations
> + *
> + * Initializes the memory manager and its resources.
> + *
> + * Returns:
> + * 0 on success, error code on failure.
> + */
> +int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
> +{
> + unsigned int i;
> + u64 offset;
> +
> + if (size < chunk_size)
> + return -EINVAL;
> +
> + if (chunk_size < PAGE_SIZE)
> + return -EINVAL;
> +
> + if (!is_power_of_2(chunk_size))
> + return -EINVAL;
> +
> + size = round_down(size, chunk_size);
> +
> + mm->size = size;
> + mm->avail = size;
> + mm->chunk_size = chunk_size;
> + mm->max_order = ilog2(size) - ilog2(chunk_size);
> +
> + BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
> +
> + mm->free_list = kmalloc_array(mm->max_order + 1,
> +   sizeof(struct list_head),
> +   GFP_KERNEL);
> + if (!mm->free_list)
> + return -ENOMEM;
> +
> + for (i = 0; i <= mm->max_order; ++i)
> + INIT_LIST_HEAD(&mm->free_list[i]);
> +
> + mm->n_roots = hweight

Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:
> No memory should be allocated when calling i915_gem_object_wait,
> because it may be called to idle a BO when evicting memory.
> 
> Fix this by using dma_resv_iter helpers to call
> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
> Also remove dma_resv_prune, it's questionable.
> 
> This will result in the following lockdep splat.
> 
> <4> [83.538517] ==
> <4> [83.538520] WARNING: possible circular locking dependency detected
> <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
> <4> [83.538525] --
> <4> [83.538527] gem_render_line/5242 is trying to acquire lock:
> <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
> __kmalloc_track_caller+0x56/0x270
> <4> [83.538538]
> but task is already holding lock:
> <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> i915_vma_pin_ww+0x1c7/0x970 [i915]
> <4> [83.538638]
> which lock already depends on the new lock.
> <4> [83.538642]
> the existing dependency chain (in reverse order) is:
> <4> [83.538645]
> -> #1 (&vm->mutex/1){+.+.}-{3:3}:
> <4> [83.538649]lock_acquire+0xd3/0x310
> <4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
> <4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
> <4> [83.538794]ppgtt_init+0x55/0x70 [i915]
> <4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
> <4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
> <4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
> <4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
> <4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
> <4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
> <4> [83.539197]pci_device_probe+0x9b/0x110
> <4> [83.539201]really_probe+0x1b0/0x3b0
> <4> [83.539205]__driver_probe_device+0xf6/0x170
> <4> [83.539208]driver_probe_device+0x1a/0x90
> <4> [83.539210]__driver_attach+0x93/0x160
> <4> [83.539213]bus_for_each_dev+0x72/0xc0
> <4> [83.539216]bus_add_driver+0x14b/0x1f0
> <4> [83.539220]driver_register+0x66/0xb0
> <4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
> <4> [83.539227]do_one_initcall+0x53/0x2e0
> <4> [83.539230]do_init_module+0x55/0x200
> <4> [83.539234]load_module+0x2700/0x2980
> <4> [83.539237]__do_sys_finit_module+0xaa/0x110
> <4> [83.539241]do_syscall_64+0x37/0xb0
> <4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
> <4> [83.539247]
> -> #0 (fs_reclaim){+.+.}-{0:0}:
> <4> [83.539251]validate_chain+0xb37/0x1e70
> <4> [83.539254]__lock_acquire+0x5a1/0xb70
> <4> [83.539258]lock_acquire+0xd3/0x310
> <4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
> <4> [83.539264]__kmalloc_track_caller+0x56/0x270
> <4> [83.539267]krealloc+0x48/0xa0
> <4> [83.539270]dma_resv_get_fences+0x1c3/0x280
> <4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
> <4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
> <4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
> <4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
> <4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
> <4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
> <4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
> <4> [83.539759]drm_ioctl_kernel+0xac/0x140
> <4> [83.539763]drm_ioctl+0x201/0x3d0
> <4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
> <4> [83.539769]do_syscall_64+0x37/0xb0
> <4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
> <4> [83.539775]
> other info that might help us debug this:
> <4> [83.539778]  Possible unsafe locking scenario:
> <4> [83.539781]CPU0CPU1
> <4> [83.539783]
> <4> [83.539785]   lock(&vm->mutex/1);
> <4> [83.539788]lock(fs_reclaim);
> <4> [83.539791]lock(&vm->mutex/1);
> <4> [83.539794]   lock(fs_reclaim);
> <4> [83.539796]
>  *** DEADLOCK ***
> <4> [83.539799] 3 locks held by gem_render_line/5242:
> <4> [83.539802]  #0: c9d4bbf0 
> (reservation_ww_class_acquire){+.+.}-{0:0}, at: 
> i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
> <4> [83.539870]  #1: 88811e48bae8 
> (reservation_ww_class_mutex){+.+.}-{3:3}, at: eb_validate_vmas+0x81/0x8e0 
> [i915]
> <4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> i915_vma_pin_ww+0x1c7/0x970 [i915]
> <4> [83.540011]
> stack backtrace:
> <4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
> 5.15.0-rc5-CI-Trybot_8062+ #1
> <4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
> BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
> <4> [83.540023] Call Trace:

Re: [Intel-gfx] [PATCH 03/28] dma-buf: add dma_resv selftest v3

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:17PM +0200, Christian König wrote:
> Just exercising a very minor subset of the functionality, but already
> proven useful.
> 
> v2: add missing locking
> v3: some more cleanup and consolidation, add unlocked test as well
> 
> Signed-off-by: Christian König 

Yeah this is great: if we then get some specific bug later on, it will
hopefully be very easy to add a unit test for that precise bug.

I scrolled through, looks correct.

Reviewed-by: Daniel Vetter 
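
As a usage note, hedged since it is based on how the existing dma-fence
selftests are wired up rather than on this patch: with
CONFIG_DMABUF_SELFTESTS=m the tests run at module load time, so exercising
the new dma_resv cases is just

  modprobe dmabuf_selftests

with pass/fail reported in dmesg and reflected in the module load result.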

> ---
>  drivers/dma-buf/Makefile  |   3 +-
>  drivers/dma-buf/selftests.h   |   1 +
>  drivers/dma-buf/st-dma-resv.c | 282 ++
>  3 files changed, 285 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/dma-buf/st-dma-resv.c
> 
> diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
> index 1ef021273a06..511805dbeb75 100644
> --- a/drivers/dma-buf/Makefile
> +++ b/drivers/dma-buf/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_DMABUF_SYSFS_STATS) += dma-buf-sysfs-stats.o
>  dmabuf_selftests-y := \
>   selftest.o \
>   st-dma-fence.o \
> - st-dma-fence-chain.o
> + st-dma-fence-chain.o \
> + st-dma-resv.o
>  
>  obj-$(CONFIG_DMABUF_SELFTESTS)   += dmabuf_selftests.o
> diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h
> index bc8cea67bf1e..97d73aaa31da 100644
> --- a/drivers/dma-buf/selftests.h
> +++ b/drivers/dma-buf/selftests.h
> @@ -12,3 +12,4 @@
>  selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */
>  selftest(dma_fence, dma_fence)
>  selftest(dma_fence_chain, dma_fence_chain)
> +selftest(dma_resv, dma_resv)
> diff --git a/drivers/dma-buf/st-dma-resv.c b/drivers/dma-buf/st-dma-resv.c
> new file mode 100644
> index ..50d3791ccb8c
> --- /dev/null
> +++ b/drivers/dma-buf/st-dma-resv.c
> @@ -0,0 +1,282 @@
> +/* SPDX-License-Identifier: MIT */
> +
> +/*
> +* Copyright © 2019 Intel Corporation
> +* Copyright © 2021 Advanced Micro Devices, Inc.
> +*/
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "selftest.h"
> +
> +static struct spinlock fence_lock;
> +
> +static const char *fence_name(struct dma_fence *f)
> +{
> + return "selftest";
> +}
> +
> +static const struct dma_fence_ops fence_ops = {
> + .get_driver_name = fence_name,
> + .get_timeline_name = fence_name,
> +};
> +
> +static struct dma_fence *alloc_fence(void)
> +{
> + struct dma_fence *f;
> +
> + f = kmalloc(sizeof(*f), GFP_KERNEL);
> + if (!f)
> + return NULL;
> +
> + dma_fence_init(f, &fence_ops, &fence_lock, 0, 0);
> + return f;
> +}
> +
> +static int sanitycheck(void *arg)
> +{
> + struct dma_resv resv;
> + struct dma_fence *f;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_fence_signal(f);
> + dma_fence_put(f);
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r)
> + pr_err("Resv locking failed\n");
> + else
> + dma_resv_unlock(&resv);
> + dma_resv_fini(&resv);
> + return r;
> +}
> +
> +static int test_signaling(void *arg, bool shared)
> +{
> + struct dma_resv resv;
> + struct dma_fence *f;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r) {
> + pr_err("Resv locking failed\n");
> + goto err_free;
> + }
> +
> + if (shared) {
> + r = dma_resv_reserve_shared(&resv, 1);
> + if (r) {
> + pr_err("Resv shared slot allocation failed\n");
> + goto err_unlock;
> + }
> +
> + dma_resv_add_shared_fence(&resv, f);
> + } else {
> + dma_resv_add_excl_fence(&resv, f);
> + }
> +
> + if (dma_resv_test_signaled(&resv, shared)) {
> + pr_err("Resv unexpectedly signaled\n");
> + r = -EINVAL;
> + goto err_unlock;
> + }
> + dma_fence_signal(f);
> + if (!dma_resv_test_signaled(&resv, shared)) {
> + pr_err("Resv not reporting signaled\n");
> + r = -EINVAL;
> + goto err_unlock;
> + }
> +err_unlock:
> + dma_resv_unlock(&resv);
> +err_free:
> + dma_resv_fini(&resv);
> + dma_fence_put(f);
> + return r;
> +}
> +
> +static int test_excl_signaling(void *arg)
> +{
> + return test_signaling(arg, false);
> +}
> +
> +static int test_shared_signaling(void *arg)
> +{
> + return test_signaling(arg, true);
> +}
> +
> +static int test_for_each(void *arg, bool shared)
> +{
> + struct dma_resv_iter cursor;
> + struct dma_fence *f, *fence;
> + struct dma_resv resv;
> + int r;
> +
> + f = alloc_fence();
> + if (!f)
> + return -ENOMEM;
> +
> + dma_resv_init(&resv);
> + r = dma_resv_lock(&resv, NULL);
> + if (r) {
> + pr_err("Resv 

Re: [Intel-gfx] [PATCH 11/28] drm/amdgpu: use the new iterator in amdgpu_sync_resv

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:25PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

Yeah these iterators rock :-)
-Daniel

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 44 
>  1 file changed, 14 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index 862eb3c1c4c5..f7d8487799b2 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -252,41 +252,25 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct 
> amdgpu_sync *sync,
>struct dma_resv *resv, enum amdgpu_sync_mode mode,
>void *owner)
>  {
> - struct dma_resv_list *flist;
> + struct dma_resv_iter cursor;
>   struct dma_fence *f;
> - unsigned i;
> - int r = 0;
> + int r;
>  
>   if (resv == NULL)
>   return -EINVAL;
>  
> - /* always sync to the exclusive fence */
> - f = dma_resv_excl_fence(resv);
> - dma_fence_chain_for_each(f, f) {
> - struct dma_fence_chain *chain = to_dma_fence_chain(f);
> -
> - if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> -chain->fence : f)) {
> - r = amdgpu_sync_fence(sync, f);
> - dma_fence_put(f);
> - if (r)
> - return r;
> - break;
> - }
> - }
> -
> - flist = dma_resv_shared_list(resv);
> - if (!flist)
> - return 0;
> -
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> -   dma_resv_held(resv));
> -
> - if (amdgpu_sync_test_fence(adev, mode, owner, f)) {
> - r = amdgpu_sync_fence(sync, f);
> - if (r)
> - return r;
> + dma_resv_for_each_fence(&cursor, resv, true, f) {
> + dma_fence_chain_for_each(f, f) {
> + struct dma_fence_chain *chain = to_dma_fence_chain(f);
> +
> + if (amdgpu_sync_test_fence(adev, mode, owner, chain ?
> +chain->fence : f)) {
> + r = amdgpu_sync_fence(sync, f);
> + dma_fence_put(f);
> + if (r)
> + return r;
> + break;
> + }
>   }
>   }
>   return 0;
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 12/28] drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:26PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index e8d70b6e6737..722e3c9e8882 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1345,10 +1345,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
> ttm_buffer_object *bo,
>   const struct ttm_place *place)
>  {
>   unsigned long num_pages = bo->resource->num_pages;
> + struct dma_resv_iter resv_cursor;
>   struct amdgpu_res_cursor cursor;
> - struct dma_resv_list *flist;
>   struct dma_fence *f;
> - int i;
>  
>   /* Swapout? */
>   if (bo->resource->mem_type == TTM_PL_SYSTEM)
> @@ -1362,14 +1361,9 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct 
> ttm_buffer_object *bo,
>* If true, then return false as any KFD process needs all its BOs to
>* be resident to run successfully
>*/
> - flist = dma_resv_shared_list(bo->base.resv);
> - if (flist) {
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> - dma_resv_held(bo->base.resv));
> - if (amdkfd_fence_check_mm(f, current->mm))
> - return false;
> - }
> + dma_resv_for_each_fence(&resv_cursor, bo->base.resv, true, f) {

^false?

At least I'm not seeing the code look at the exclusive fence here.
-Daniel

> + if (amdkfd_fence_check_mm(f, current->mm))
> + return false;
>   }
>  
>   switch (bo->resource->mem_type) {
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 13/28] drm/amdgpu: use new iterator in amdgpu_vm_prt_fini

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:27PM +0200, Christian König wrote:
> No need to actually allocate an array of fences here.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 26 +-
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 6b15cad78de9..e42dd79ed6f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -2090,30 +2090,14 @@ static void amdgpu_vm_free_mapping(struct 
> amdgpu_device *adev,
>  static void amdgpu_vm_prt_fini(struct amdgpu_device *adev, struct amdgpu_vm 
> *vm)
>  {
>   struct dma_resv *resv = vm->root.bo->tbo.base.resv;
> - struct dma_fence *excl, **shared;
> - unsigned i, shared_count;
> - int r;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>  
> - r = dma_resv_get_fences(resv, &excl, &shared_count, &shared);
> - if (r) {
> - /* Not enough memory to grab the fence list, as last resort
> -  * block for all the fences to complete.
> -  */
> - dma_resv_wait_timeout(resv, true, false,
> - MAX_SCHEDULE_TIMEOUT);
> - return;
> - }
> -
> - /* Add a callback for each fence in the reservation object */
> - amdgpu_vm_prt_get(adev);

I was confused for a bit why the old code wouldn't leak a refcount for
!excl case, but it's all handled.

Not sure amdgpu_vm_add_prt_cb still needs to handle the !fence case, it's
a bit of a gotcha but I guess it can happen?

Either way, looks correct.

Reviewed-by: Daniel Vetter 

> - amdgpu_vm_add_prt_cb(adev, excl);
> -
> - for (i = 0; i < shared_count; ++i) {
> + dma_resv_for_each_fence(&cursor, resv, true, fence) {
> + /* Add a callback for each fence in the reservation object */
>   amdgpu_vm_prt_get(adev);
> - amdgpu_vm_add_prt_cb(adev, shared[i]);
> + amdgpu_vm_add_prt_cb(adev, fence);
>   }
> -
> - kfree(shared);
>  }
>  
>  /**
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 03/14] drm/i915/xehpsdv: enforce min GTT alignment

2021-10-13 Thread Matthew Auld

On 13/10/2021 14:38, Daniel Vetter wrote:

On Mon, Oct 11, 2021 at 09:41:44PM +0530, Ramalingam C wrote:

From: Matthew Auld 

For local-memory objects we need to align the GTT addresses to 64K, both
for the ppgtt and ggtt.

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 


Do we still need this with relocations removed? Userspace is picking all
the addresses for us, so all we have to check is whether userspace got it
right.


Yeah, for OFFSET_FIXED this just validates that the provided address is 
correctly aligned to 64K, while for the in-kernel insertion stuff we 
still need to allocate an address that is aligned to 64K. Setting the 
alignment here handles both cases.
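
As a rough sketch of what the two paths amount to (illustrative only,
not the exact i915 code, and the helper name is made up):

/*
 * Sketch: with HAS_64K_PAGES + lmem the minimum alignment is bumped to
 * 64K as in the diff below. For OFFSET_FIXED we only validate the
 * address userspace picked; for in-kernel insertion the allocator has
 * to honour the same alignment.
 */
static int check_fixed_gtt_offset(u64 offset, u64 alignment)
{
	/* alignment is a power of two, so this is !IS_ALIGNED() */
	if (offset & (alignment - 1))
		return -EINVAL;

	return 0;
}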



-Daniel



---
  drivers/gpu/drm/i915/i915_vma.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 4b7fc4647e46..1ea1fa08efdf 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
}
  
  	color = 0;

-   if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
-   color = vma->obj->cache_level;
+   if (vma->obj) {
+   if (HAS_64K_PAGES(vma->vm->i915) && 
i915_gem_object_is_lmem(vma->obj))
+   alignment = max(alignment, I915_GTT_PAGE_SIZE_64K);
+
+   if (i915_vm_has_cache_coloring(vma->vm))
+   color = vma->obj->cache_level;
+   }
  
  	if (flags & PIN_OFFSET_FIXED) {

u64 offset = flags & PIN_OFFSET_MASK;
--
2.20.1





Re: [Intel-gfx] [PATCH 14/28] drm/msm: use new iterator in msm_gem_describe

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:28PM +0200, Christian König wrote:
> Simplifying the code a bit. Also drop the RCU read side lock since the
> object is locked anyway.
> 
> Untested since I can't get the driver to compile on !ARM.

Cross-compiler install is pretty easy and you should have that for pushing
drm changes to drm-misc :-)

> Signed-off-by: Christian König 

Assuming this compiles, it looks correct.

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/msm/msm_gem.c | 19 +--
>  1 file changed, 5 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> index 40a9863f5951..5bd511f07c07 100644
> --- a/drivers/gpu/drm/msm/msm_gem.c
> +++ b/drivers/gpu/drm/msm/msm_gem.c
> @@ -880,7 +880,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct 
> seq_file *m,
>  {
>   struct msm_gem_object *msm_obj = to_msm_bo(obj);
>   struct dma_resv *robj = obj->resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
>   struct dma_fence *fence;
>   struct msm_gem_vma *vma;
>   uint64_t off = drm_vma_node_start(&obj->vma_node);
> @@ -955,22 +955,13 @@ void msm_gem_describe(struct drm_gem_object *obj, 
> struct seq_file *m,
>   seq_puts(m, "\n");
>   }
>  
> - rcu_read_lock();
> - fobj = dma_resv_shared_list(robj);
> - if (fobj) {
> - unsigned int i, shared_count = fobj->shared_count;
> -
> - for (i = 0; i < shared_count; i++) {
> - fence = rcu_dereference(fobj->shared[i]);
> + dma_resv_for_each_fence(&cursor, robj, true, fence) {
> + if (dma_resv_iter_is_exclusive(&cursor))
> + describe_fence(fence, "Exclusive", m);
> + else
>   describe_fence(fence, "Shared", m);
> - }
>   }
>  
> - fence = dma_resv_excl_fence(robj);
> - if (fence)
> - describe_fence(fence, "Exclusive", m);
> - rcu_read_unlock();
> -
>   msm_gem_unlock(obj);
>  }
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 15/28] drm/radeon: use new iterator in radeon_sync_resv

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:29PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/radeon/radeon_sync.c | 22 +++---
>  1 file changed, 3 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_sync.c 
> b/drivers/gpu/drm/radeon/radeon_sync.c
> index 9257b60144c4..b991ba1bcd51 100644
> --- a/drivers/gpu/drm/radeon/radeon_sync.c
> +++ b/drivers/gpu/drm/radeon/radeon_sync.c
> @@ -91,33 +91,17 @@ int radeon_sync_resv(struct radeon_device *rdev,
>struct dma_resv *resv,
>bool shared)
>  {
> - struct dma_resv_list *flist;
> - struct dma_fence *f;
> + struct dma_resv_iter cursor;
>   struct radeon_fence *fence;
> - unsigned i;
> + struct dma_fence *f;
>   int r = 0;
>  
> - /* always sync to the exclusive fence */
> - f = dma_resv_excl_fence(resv);
> - fence = f ? to_radeon_fence(f) : NULL;
> - if (fence && fence->rdev == rdev)
> - radeon_sync_fence(sync, fence);
> - else if (f)
> - r = dma_fence_wait(f, true);
> -
> - flist = dma_resv_shared_list(resv);
> - if (shared || !flist || r)
> - return r;
> -
> - for (i = 0; i < flist->shared_count; ++i) {
> - f = rcu_dereference_protected(flist->shared[i],
> -   dma_resv_held(resv));
> + dma_resv_for_each_fence(&cursor, resv, shared, f) {
>   fence = to_radeon_fence(f);
>   if (fence && fence->rdev == rdev)
>   radeon_sync_fence(sync, fence);
>   else
>   r = dma_fence_wait(f, true);
> -
>   if (r)
>   break;
>   }
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 17/28] drm/i915: use the new iterator in i915_gem_busy_ioctl v2

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 02:44:50PM +0200, Christian König wrote:
> Am 05.10.21 um 14:40 schrieb Tvrtko Ursulin:
> > 
> > On 05/10/2021 12:37, Christian König wrote:
> > > This makes the function much simpler since the complex
> > > retry logic is now handled elsewhere.
> > > 
> > > Signed-off-by: Christian König 
> > > Reviewed-by: Tvrtko Ursulin 
> > 
> > Reminder - r-b was retracted until at least more text is added to commit
> > message about pros and cons. But really some discussion was had inside the
> > i915 team on the topic.
> 
> Sure, going to move those to a different branch.
> 
> But I really only see the following options:
> 1. Grab the lock.
> 2. Use the _unlocked variant with get/put.
> 3. Add another _rcu iterator just for this case.
> 
> I'm fine with either, but Daniel pretty much already rejected #3 and #2/#1
> have more overhead than the original one.

Anything that removes open-code rcu/lockless magic from i915 gets my ack,
there's way too much of this everywhere. So on this:

Acked-by: Daniel Vetter 

I've asked Maarten to review the i915 ones for you, please pester him if
it's not happening :-)
-Daniel

> 
> Regards,
> Christian.
> 
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_busy.c | 35 ++--
> > >   1 file changed, 14 insertions(+), 21 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > index 6234e17259c1..dc72b36dae54 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> > > @@ -82,8 +82,8 @@ i915_gem_busy_ioctl(struct drm_device *dev, void
> > > *data,
> > >   {
> > >   struct drm_i915_gem_busy *args = data;
> > >   struct drm_i915_gem_object *obj;
> > > -    struct dma_resv_list *list;
> > > -    unsigned int seq;
> > > +    struct dma_resv_iter cursor;
> > > +    struct dma_fence *fence;
> > >   int err;
> > >     err = -ENOENT;
> > > @@ -109,27 +109,20 @@ i915_gem_busy_ioctl(struct drm_device *dev,
> > > void *data,
> > >    * to report the overall busyness. This is what the wait-ioctl
> > > does.
> > >    *
> > >    */
> > > -retry:
> > > -    seq = raw_read_seqcount(&obj->base.resv->seq);
> > > -
> > > -    /* Translate the exclusive fence to the READ *and* WRITE engine */
> > > -    args->busy =
> > > busy_check_writer(dma_resv_excl_fence(obj->base.resv));
> > > -
> > > -    /* Translate shared fences to READ set of engines */
> > > -    list = dma_resv_shared_list(obj->base.resv);
> > > -    if (list) {
> > > -    unsigned int shared_count = list->shared_count, i;
> > > -
> > > -    for (i = 0; i < shared_count; ++i) {
> > > -    struct dma_fence *fence =
> > > -    rcu_dereference(list->shared[i]);
> > > -
> > > +    args->busy = 0;
> > > +    dma_resv_iter_begin(&cursor, obj->base.resv, true);
> > > +    dma_resv_for_each_fence_unlocked(&cursor, fence) {
> > > +    if (dma_resv_iter_is_restarted(&cursor))
> > > +    args->busy = 0;
> > > +
> > > +    if (dma_resv_iter_is_exclusive(&cursor))
> > > +    /* Translate the exclusive fence to the READ *and*
> > > WRITE engine */
> > > +    args->busy |= busy_check_writer(fence);
> > > +    else
> > > +    /* Translate shared fences to READ set of engines */
> > >   args->busy |= busy_check_reader(fence);
> > > -    }
> > >   }
> > > -
> > > -    if (args->busy && read_seqcount_retry(&obj->base.resv->seq, seq))
> > > -    goto retry;
> > > +    dma_resv_iter_end(&cursor);
> > >     err = 0;
> > >   out:
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 23/28] drm: use new iterator in drm_gem_fence_array_add_implicit v3

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:37PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> v2: add missing rcu_read_lock()/unlock()
> v3: switch to locked version
> 
> Signed-off-by: Christian König 
> Reviewed-by: Tvrtko Ursulin 

Please make sure you also apply this to the new copy of this code in
drm/sched. This one here is up for deletion, once I get all the driver
conversions I have landed ...
-Daniel

> ---
>  drivers/gpu/drm/drm_gem.c | 26 +-
>  1 file changed, 5 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index 09c820045859..4dcdec6487bb 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1340,31 +1340,15 @@ int drm_gem_fence_array_add_implicit(struct xarray 
> *fence_array,
>struct drm_gem_object *obj,
>bool write)
>  {
> - int ret;
> - struct dma_fence **fences;
> - unsigned int i, fence_count;
> -
> - if (!write) {
> - struct dma_fence *fence =
> - dma_resv_get_excl_unlocked(obj->resv);
> -
> - return drm_gem_fence_array_add(fence_array, fence);
> - }
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
> + int ret = 0;
>  
> - ret = dma_resv_get_fences(obj->resv, NULL,
> - &fence_count, &fences);
> - if (ret || !fence_count)
> - return ret;
> -
> - for (i = 0; i < fence_count; i++) {
> - ret = drm_gem_fence_array_add(fence_array, fences[i]);
> + dma_resv_for_each_fence(&cursor, obj->resv, write, fence) {
> + ret = drm_gem_fence_array_add(fence_array, fence);
>   if (ret)
>   break;
>   }
> -
> - for (; i < fence_count; i++)
> - dma_fence_put(fences[i]);
> - kfree(fences);
>   return ret;
>  }
>  EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Thomas Hellström
On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:
> On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> > The TTM managers and, possibly, the gtt address space managers will
> > need to be able to order fences for async operation.
> > Using dma_fence_is_later() for this will require that the fences we
> > hand
> > them are from a single fence context and ordered.
> > 
> > Introduce a struct dma_fence_work_timeline, and a function to
> > attach
> > struct dma_fence_work to such a timeline in a way that all previous
> > fences attached to the timeline will be signaled when the latest
> > attached struct dma_fence_work signals.
> > 
> > Signed-off-by: Thomas Hellström 
> 
> I'm not understanding why we need this:
> 
> - if we just want to order dma_fence work, then an ordered workqueue is
>   what we want. Which is why hand-rolling is better than reusing
>   dma_fence_work for absolutely everything.
> 
> - if we just need to make sure the public fences signal in order, then
>   it's a dma_fence_chain.

Part of the same series that needs reworking.

What we need here is a way to coalesce multiple fences from various
contexts (including both gpu and work fences) into a single fence and
then attach it to a timeline.
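
Roughly this shape, as an untested sketch using the kernel's
dma_fence_array (linux/dma-fence-array.h; on success the array takes
over the fences[] references and the array allocation):

static struct dma_fence *coalesce_fences(struct dma_fence **fences,
					 int num_fences, u64 context,
					 unsigned int seqno)
{
	struct dma_fence_array *array;

	/* Signals once all backing fences have signaled. */
	array = dma_fence_array_create(num_fences, fences, context,
				       seqno, false);
	if (!array)
		return NULL;

	return &array->base;
}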

/Thomas






Re: [Intel-gfx] [PATCH 24/28] drm: use new iterator in drm_gem_plane_helper_prepare_fb v2

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:38PM +0200, Christian König wrote:
> Makes the handling a bit more complex, but avoids the use of
> dma_resv_get_excl_unlocked().
> 
> v2: improve coding and documentation
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/drm_gem_atomic_helper.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
> b/drivers/gpu/drm/drm_gem_atomic_helper.c
> index e570398abd78..8534f78d4d6d 100644
> --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> @@ -143,6 +143,7 @@
>   */
>  int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>  {
> + struct dma_resv_iter cursor;
>   struct drm_gem_object *obj;
>   struct dma_fence *fence;
>  
> @@ -150,9 +151,17 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane 
> *plane, struct drm_plane_st
>   return 0;
>  
>   obj = drm_gem_fb_get_obj(state->fb, 0);
> - fence = dma_resv_get_excl_unlocked(obj->resv);
> - drm_atomic_set_fence_for_plane(state, fence);
> + dma_resv_iter_begin(&cursor, obj->resv, false);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + /* TODO: We only use the first write fence here and need to fix
> +  * the drm_atomic_set_fence_for_plane() API to accept more than
> +  * one. */

I'm confused, right now there is only one write fence. So no need to
iterate, and also no need to add a TODO. If/when we add more write fences
then I think this needs to be revisited, and ofc then we do need to update
the set_fence helpers to carry an entire array of fences.
-Daniel

> + dma_fence_get(fence);
> + break;
> + }
> + dma_resv_iter_end(&cursor);
>  
> + drm_atomic_set_fence_for_plane(state, fence);
>   return 0;
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_plane_helper_prepare_fb);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 25/28] drm/nouveau: use the new iterator in nouveau_fence_sync

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:39PM +0200, Christian König wrote:
> Simplifying the code a bit.
> 
> Signed-off-by: Christian König 

A bit of a tricky conversion since the previous code was clever with the ret
handling in the loop, but looks correct.

Please mention in the commit message that this code now also waits for all
shared fences in all cases. Previously if we found an exclusive fence, we
bailed out. That needs to be recorded in the commit message, together with
an explainer that de facto too many other drivers have broken this rule
already, and so you have to always iterate all fences.

With that added:

Reviewed-by: Daniel Vetter 


> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 48 +++--
>  1 file changed, 12 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
> b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index 05d0b3eb3690..26f9299df881 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -339,14 +339,15 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool 
> lazy, bool intr)
>  }
>  
>  int
> -nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, 
> bool exclusive, bool intr)
> +nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
> +bool exclusive, bool intr)
>  {
>   struct nouveau_fence_chan *fctx = chan->fence;
> - struct dma_fence *fence;
>   struct dma_resv *resv = nvbo->bo.base.resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>   struct nouveau_fence *f;
> - int ret = 0, i;
> + int ret;
>  
>   if (!exclusive) {
>   ret = dma_resv_reserve_shared(resv, 1);
> @@ -355,10 +356,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
> nouveau_channel *chan, bool e
>   return ret;
>   }
>  
> - fobj = dma_resv_shared_list(resv);
> - fence = dma_resv_excl_fence(resv);
> -
> - if (fence) {
> + dma_resv_for_each_fence(&cursor, resv, exclusive, fence) {
>   struct nouveau_channel *prev = NULL;
>   bool must_wait = true;
>  
> @@ -366,41 +364,19 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct 
> nouveau_channel *chan, bool e
>   if (f) {
>   rcu_read_lock();
>   prev = rcu_dereference(f->channel);
> - if (prev && (prev == chan || fctx->sync(f, prev, chan) 
> == 0))
> + if (prev && (prev == chan ||
> +  fctx->sync(f, prev, chan) == 0))
>   must_wait = false;
>   rcu_read_unlock();
>   }
>  
> - if (must_wait)
> + if (must_wait) {
>   ret = dma_fence_wait(fence, intr);
> -
> - return ret;
> - }
> -
> - if (!exclusive || !fobj)
> - return ret;
> -
> - for (i = 0; i < fobj->shared_count && !ret; ++i) {
> - struct nouveau_channel *prev = NULL;
> - bool must_wait = true;
> -
> - fence = rcu_dereference_protected(fobj->shared[i],
> - dma_resv_held(resv));
> -
> - f = nouveau_local_fence(fence, chan->drm);
> - if (f) {
> - rcu_read_lock();
> - prev = rcu_dereference(f->channel);
> - if (prev && (prev == chan || fctx->sync(f, prev, chan) 
> == 0))
> - must_wait = false;
> - rcu_read_unlock();
> + if (ret)
> + return ret;
>   }
> -
> - if (must_wait)
> - ret = dma_fence_wait(fence, intr);
>   }
> -
> - return ret;
> + return 0;
>  }
>  
>  void
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 26/28] drm/nouveau: use the new iterator in nv50_wndw_prepare_fb

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:40PM +0200, Christian König wrote:
> Makes the handling a bit more complex, but avoids the use of
> dma_resv_get_excl_unlocked().
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/nouveau/dispnv50/wndw.c | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c 
> b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> index 8d048bacd6f0..30712a681e2a 100644
> --- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> +++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
> @@ -539,6 +539,8 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>   struct nouveau_bo *nvbo;
>   struct nv50_head_atom *asyh;
>   struct nv50_wndw_ctxdma *ctxdma;
> + struct dma_resv_iter cursor;
> + struct dma_fence *fence;
>   int ret;
>  
>   NV_ATOMIC(drm, "%s prepare: %p\n", plane->name, fb);
> @@ -561,7 +563,13 @@ nv50_wndw_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>   asyw->image.handle[0] = ctxdma->object.handle;
>   }
>  
> - asyw->state.fence = dma_resv_get_excl_unlocked(nvbo->bo.base.resv);
> + dma_resv_iter_begin(&cursor, nvbo->bo.base.resv, false);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + /* TODO: We only use the first writer here */

Same thing as with the atomic core helper. This is actually broken,
because for atomic we really do _not_ want to wait for any shared fences.
Which this will do, if there's no exclusive fence attached.

So upgrading my general concern on this and the atomic helper patch to a
reject, since I think it's broken.
-Daniel

> + asyw->state.fence = dma_fence_get(fence);
> + break;
> + }
> + dma_resv_iter_end(&cursor);
>   asyw->image.offset[0] = nvbo->offset;
>  
>   if (wndw->func->prepare) {
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 27/28] drm/etnaviv: use new iterator in etnaviv_gem_describe

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:41PM +0200, Christian König wrote:
> Instead of hand rolling the logic.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem.c | 31 ++-
>  1 file changed, 11 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> index 8f1b5af47dd6..0eeb33de2ff4 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
> @@ -428,19 +428,17 @@ int etnaviv_gem_wait_bo(struct etnaviv_gpu *gpu, struct 
> drm_gem_object *obj,
>  static void etnaviv_gem_describe_fence(struct dma_fence *fence,
>   const char *type, struct seq_file *m)
>  {
> - if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))

Yay for removing open-coded tests like this. Drivers really should have no
business digging around in fence->flags (i915 is terrible in this regard
unfortunately).

> - seq_printf(m, "\t%9s: %s %s seq %llu\n",
> -type,
> -fence->ops->get_driver_name(fence),
> -fence->ops->get_timeline_name(fence),
> -fence->seqno);
> + seq_printf(m, "\t%9s: %s %s seq %llu\n", type,
> +fence->ops->get_driver_name(fence),
> +fence->ops->get_timeline_name(fence),
> +fence->seqno);
>  }
>  
>  static void etnaviv_gem_describe(struct drm_gem_object *obj, struct seq_file 
> *m)
>  {
>   struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
>   struct dma_resv *robj = obj->resv;
> - struct dma_resv_list *fobj;
> + struct dma_resv_iter cursor;
>   struct dma_fence *fence;
>   unsigned long off = drm_vma_node_start(&obj->vma_node);
>  
> @@ -449,21 +447,14 @@ static void etnaviv_gem_describe(struct drm_gem_object 
> *obj, struct seq_file *m)
>   obj->name, kref_read(&obj->refcount),
>   off, etnaviv_obj->vaddr, obj->size);
>  
> - rcu_read_lock();
> - fobj = dma_resv_shared_list(robj);
> - if (fobj) {
> - unsigned int i, shared_count = fobj->shared_count;
> -
> - for (i = 0; i < shared_count; i++) {
> - fence = rcu_dereference(fobj->shared[i]);
> + dma_resv_iter_begin(&cursor, robj, true);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + if (dma_resv_iter_is_exclusive(&cursor))
> + etnaviv_gem_describe_fence(fence, "Exclusive", m);
> + else
>   etnaviv_gem_describe_fence(fence, "Shared", m);
> - }
>   }
> -
> - fence = dma_resv_excl_fence(robj);
> - if (fence)
> - etnaviv_gem_describe_fence(fence, "Exclusive", m);
> - rcu_read_unlock();
> + dma_resv_iter_end(&cursor);

Reviewed-by: Daniel Vetter 

Please make sure it compiles on arm before pushing :-)

>  }
>  
>  void etnaviv_gem_describe_objects(struct etnaviv_drm_private *priv,
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 28/28] drm/etnaviv: replace dma_resv_get_excl_unlocked

2021-10-13 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:42PM +0200, Christian König wrote:
> We certainly hold the reservation lock here, no need for the RCU dance.
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> index 4dd7d9d541c0..7e17bc2b5df1 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
> @@ -195,7 +195,7 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
> *submit)
>   if (ret)
>   return ret;
>   } else {
> - bo->excl = dma_resv_get_excl_unlocked(robj);

Maybe have that in the series to sunset dma_resv_get_excl_unlocked()? Just
so it makes a bit more sense from a motivation pov. Or explain that in the
commit message.

Anyway looks correct.

Reviewed-by: Daniel Vetter 
> + bo->excl = dma_fence_get(dma_resv_excl_fence(robj));
>   }
>  
>   }
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 04:21:43PM +0200, Thomas Hellström wrote:
> On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:
> > On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:
> > > The TTM managers and, possibly, the gtt address space managers will
> > > need to be able to order fences for async operation.
> > > Using dma_fence_is_later() for this will require that the fences we
> > > hand
> > > them are from a single fence context and ordered.
> > > 
> > > Introduce a struct dma_fence_work_timeline, and a function to
> > > attach
> > > struct dma_fence_work to such a timeline in a way that all previous
> > > fences attached to the timeline will be signaled when the latest
> > > attached struct dma_fence_work signals.
> > > 
> > > Signed-off-by: Thomas Hellström 
> > 
> > I'm not understanding why we need this:
> > 
> > - if we just want to order dma_fence work, then an ordered workqueue is
> >   what we want. Which is why hand-rolling is better than reusing
> >   dma_fence_work for absolutely everything.
> > 
> > - if we just need to make sure the public fences signal in order, then
> >   it's a dma_fence_chain.
> 
> Part of the same series that needs reworking.
> 
> What we need here is a way to coalesce multiple fences from various
> contexts (including both gpu and work fences) into a single fence and
> then attach it to a timeline.

I thought dma_fence_chain does this for you, including coalescing on the
same timeline. Or at least it's supposed to, because if it doesn't you can
produce some rather epic chain explosions with vulkan :-)
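
Minimal sketch of that pattern, untested (dma_fence_chain_init()
consumes the prev and fence references):

static struct dma_fence *timeline_append(struct dma_fence *prev,
					 struct dma_fence *fence, u64 seqno)
{
	struct dma_fence_chain *chain = dma_fence_chain_alloc();

	if (!chain)
		return NULL;

	/* The new head signals only after prev and fence have signaled. */
	dma_fence_chain_init(chain, prev, fence, seqno);
	return &chain->base;
}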
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 4/6] drm/i915: Add a struct dma_fence_work timeline

2021-10-13 Thread Thomas Hellström



On 10/13/21 16:33, Daniel Vetter wrote:

On Wed, Oct 13, 2021 at 04:21:43PM +0200, Thomas Hellström wrote:

On Wed, 2021-10-13 at 14:43 +0200, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:28PM +0200, Thomas Hellström wrote:

The TTM managers and, possibly, the gtt address space managers will
need to be able to order fences for async operation.
Using dma_fence_is_later() for this will require that the fences we
hand
them are from a single fence context and ordered.

Introduce a struct dma_fence_work_timeline, and a function to
attach
struct dma_fence_work to such a timeline in a way that all previous
fences attached to the timeline will be signaled when the latest
attached struct dma_fence_work signals.

Signed-off-by: Thomas Hellström 

I'm not understanding why we need this:

- if we just want to order dma_fence work, then an ordered workqueue is
   what we want. Which is why hand-rolling is better than reusing
   dma_fence_work for absolutely everything.

- if we just need to make sure the public fences signal in order, then
   it's a dma_fence_chain.

Part of the same series that needs reworking.

What we need here is a way to coalesce multiple fences from various
contexts (including both gpu and work fences) into a single fence and
then attach it to a timeline.

I thought dma_fence_chain does this for you, including coalescing on the
same timeline. Or at least it's supposed to, because if it doesn't you can
produce some rather epic chain explosions with vulkan :-)


I'll take a look to see if I can use dma_fence_chain for this case.

Thanks,

/Thomas


-Daniel


Re: [Intel-gfx] [PATCH 2/6] drm/i915: Introduce refcounted sg-tables

2021-10-13 Thread Daniel Vetter
On Fri, Oct 08, 2021 at 03:35:26PM +0200, Thomas Hellström wrote:
> As we start to introduce asynchronous failsafe object migration,
> where we update the object state and then submit asynchronous
> commands we need to record what memory resources are actually used
> by various parts of the command stream. Initially for three purposes:
> 
> 1) Error capture.
> 2) Asynchronous migration error recovery.
> 3) Asynchronous vma bind.
> 
> At the time when these happen, the object state may have been updated
> to be several migrations ahead and object sg-tables discarded.
> 
> In order to make it possible to keep sg-tables with memory resource
> information for these operations, introduce refcounted sg-tables that
> aren't freed until the last user is done with them.
> 
> The alternative would be to reference information sitting on the
> corresponding ttm_resources which typically have the same lifetime as
> these refcounted sg_tables, but that leads to other awkward constructs:
> Due to the design direction chosen for ttm resource managers that would
> lead to diamond-style inheritance, the LMEM resources may sometimes be
> prematurely freed, and finally the subclassed struct ttm_resource would
> have to bleed into the asynchronous vma bind code.

On the diamond inheritance I was pondering some more whether we shouldn't
just do the classic C union horrors, i.e.

struct ttm_resource {
/* stuff */
};

struct ttm_drm_mm_resource {
struct ttm_resource base;
struct drm_mm_node node;
};

struct ttm_buddy_resource {
struct ttm_resource base;
struct drm_buddy_node node;
};

Whatever else we have, maybe also integer resources for guc_id.

And then the horrors:

struct i915_gem_resource {
union {
struct ttm_resource base;
struct ttm_drm_mm_resource drm_mm;
struct ttm_buddy_resource buddy;
};

/* i915 stuff */
};

BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
offsetof(struct i915_gem_resource, drm_mm.base))
BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
offsetof(struct i915_gem_resource, buddy.base))

This is horrible, but also in official C89 and later unions are the only
ways to do inheritance. The only reason we can do different in linux is
because we compile with strict aliasing turned off.

So I think we can shrug this off as officially sanctioned horrors. There's
a small downside with overhead maybe, but I don't think the amount in
difference between the various allocators is big enough that we should
care. Plus a pointer to driver stuff to resolve the diamond inheritance
through different means isn't free either.

But also this is for much later, I think for now refcounting sglist as a
standalone thing is ok, since we do seem to need them in a bunch of
places. But eventually I do think we should aim to merge them with
ttm_resource, if/when those get refcounted.
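
The standalone thing is really just this shape (sketch only, the actual
i915 struct has a bit more to it):

struct refct_sgt {
	struct kref kref;
	struct sg_table table;
};

static void refct_sgt_release(struct kref *ref)
{
	struct refct_sgt *rsgt = container_of(ref, struct refct_sgt, kref);

	sg_free_table(&rsgt->table);
	kfree(rsgt);
}

Users then pair kref_get(&rsgt->kref) with
kref_put(&rsgt->kref, refct_sgt_release) instead of tying the table's
lifetime to the object.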
-Daniel

> 
> Signed-off-by: Thomas Hellström 
> ---
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   3 +-
>  drivers/gpu/drm/i915/gem/i915_gem_ttm.c   | 159 +++---
>  drivers/gpu/drm/i915/i915_scatterlist.c   |  62 +--
>  drivers/gpu/drm/i915/i915_scatterlist.h   |  76 -
>  drivers/gpu/drm/i915/intel_region_ttm.c   |  15 +-
>  drivers/gpu/drm/i915/intel_region_ttm.h   |   5 +-
>  drivers/gpu/drm/i915/selftests/mock_region.c  |  12 +-
>  7 files changed, 238 insertions(+), 94 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index 7c3da4e3e737..d600cf7ceb35 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -485,6 +485,7 @@ struct drm_i915_gem_object {
>*/
>   struct list_head region_link;
>  
> + struct i915_refct_sgt *rsgt;
>   struct sg_table *pages;
>   void *mapping;
>  
> @@ -538,7 +539,7 @@ struct drm_i915_gem_object {
>   } mm;
>  
>   struct {
> - struct sg_table *cached_io_st;
> + struct i915_refct_sgt *cached_io_rsgt;
>   struct i915_gem_object_page_iter get_io_page;
>   struct drm_i915_gem_object *backup;
>   bool created:1;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index 74a1ffd0d7dd..4b4d7457bef9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -34,7 +34,7 @@
>   * struct i915_ttm_tt - TTM page vector with additional private information
>   * @ttm: The base TTM page vector.
>   * @dev: The struct device used for dma mapping and unmapping.
> - * @cached_st: The cached scatter-gather table.
> + * @cached_rsgt: The cached scatter-gather table.
>   *
>   * Note that DMA may be going on right up to the point where the page-
>   * vector is unpopulated in delayed

Re: [Intel-gfx] [PATCH 2/6] drm/i915: Introduce refcounted sg-tables

2021-10-13 Thread Thomas Hellström



On 10/13/21 16:41, Daniel Vetter wrote:

On Fri, Oct 08, 2021 at 03:35:26PM +0200, Thomas Hellström wrote:

As we start to introduce asynchronous failsafe object migration,
where we update the object state and then submit asynchronous
commands we need to record what memory resources are actually used
by various parts of the command stream. Initially for three purposes:

1) Error capture.
2) Asynchronous migration error recovery.
3) Asynchronous vma bind.

At the time when these happen, the object state may have been updated
to be several migrations ahead and object sg-tables discarded.

In order to make it possible to keep sg-tables with memory resource
information for these operations, introduce refcounted sg-tables that
aren't freed until the last user is done with them.

The alternative would be to reference information sitting on the
corresponding ttm_resources which typically have the same lifetime as
these refcounted sg_tables, but that leads to other awkward constructs:
Due to the design direction chosen for ttm resource managers that would
lead to diamond-style inheritance, the LMEM resources may sometimes be
prematurely freed, and finally the subclassed struct ttm_resource would
have to bleed into the asynchronous vma bind code.

On the diamond inheritance I was pondering some more whether we shouldn't
just do the classic C union horrors, i.e.

struct ttm_resource {
/* stuff */
};

struct ttm_drm_mm_resource {
struct ttm_resource base;
struct drm_mm_node node;
};

struct ttm_buddy_resource {
struct ttm_resource base;
struct drm_buddy_node node;
};

Whatever else we have, maybe also integer resources for guc_id.

And then the horrors:

struct i915_gem_resource {
union {
struct ttm_resource base;
struct ttm_drm_mm_resource drm_mm;
struct ttm_buddy_resource buddy;
};

/* i915 stuff */
};

BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
offsetof(struct i915_gem_resource, drm_mm.base))
BUILD_BUG_ON(offsetof(struct i915_gem_resource, base) !=
offsetof(struct i915_gem_resource, buddy.base))

This is horrible, but also in official C89 and later unions are the only
ways to do inheritance. The only reason we can do different in linux is
because we compile with strict aliasing turned off.

So I think we can shrug this off as officially sanctioned horrors. There's
a small downside with overhead maybe, but I don't think the amount in
difference between the various allocators is big enough that we should
care. Plus a pointer to driver stuff to resolve the diamond inheritance
through different means isn't free either.


Yes, this is exactly what was meant by "awkward constructs" in the 
commit message.


My thoughts are still that all this could be avoided by a different 
design for struct ttm_resource,
but I agree we can make do with refcounted sg-lists for now, to see where
this ends up when all related resource-on-lru stuff lands in TTM.


/Thomas




[Intel-gfx] ✗ Fi.CI.BUILD: failure for mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o) (rev2)

2021-10-13 Thread Patchwork
== Series Details ==

Series: mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o) 
(rev2)
URL   : https://patchwork.freedesktop.org/series/95495/
State : failure

== Summary ==

Applying: mmotm 2021-10-05-19-53 uploaded (drivers/gpu/drm/msm/hdmi/hdmi_phy.o)
error: patch failed: drivers/gpu/drm/msm/Makefile:116
error: drivers/gpu/drm/msm/Makefile: patch does not apply
error: Did you hand edit your patch?
It does not apply to blobs recorded in its index.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Using index info to reconstruct a base tree...
Patch failed at 0001 mmotm 2021-10-05-19-53 uploaded 
(drivers/gpu/drm/msm/hdmi/hdmi_phy.o)
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation. (rev3)

2021-10-13 Thread Patchwork
== Series Details ==

Series: drm/i915: Use dma_resv_iter for waiting in 
i915_gem_object_wait_reservation. (rev3)
URL   : https://patchwork.freedesktop.org/series/95765/
State : failure

== Summary ==

CALLscripts/checksyscalls.sh
  CALLscripts/atomic/check-atomics.sh
  DESCEND objtool
  CHK include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/gem/i915_gem_shrinker.o
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c: In function ‘i915_gem_shrink’:
drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:231:4: error: implicit declaration 
of function ‘dma_resv_prune’; did you mean ‘dma_resv_fini’? 
[-Werror=implicit-function-declaration]
dma_resv_prune(obj->base.resv);
^~
dma_resv_fini
cc1: all warnings being treated as errors
scripts/Makefile.build:277: recipe for target 
'drivers/gpu/drm/i915/gem/i915_gem_shrinker.o' failed
make[4]: *** [drivers/gpu/drm/i915/gem/i915_gem_shrinker.o] Error 1
scripts/Makefile.build:540: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:540: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:540: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1868: recipe for target 'drivers' failed
make: *** [drivers] Error 2




Re: [Intel-gfx] [PATCH] drm/i915/display: Remove check for low voltage sku for max dp source rate

2021-10-13 Thread Imre Deak
On Thu, Oct 07, 2021 at 01:19:25PM +0530, Nautiyal, Ankit K wrote:
> 
> On 10/5/2021 9:01 PM, Imre Deak wrote:
> > On Tue, Oct 05, 2021 at 01:34:21PM +0300, Jani Nikula wrote:
> > > Cc: Imre, I think you were involved in adding the checks.
> > About ADL-S the spec says:
> > 
> > Bspec 53597:
> > Combo Port Maximum Speed:
> > OEM must use VBT to specify a maximum that is tolerated by the board design.
> > 
> > Combo Port HBR3 support:
> > May require retimer on motherboard. The OEM must use VBT to limit the link 
> > rate to HBR2 if HBR3 not supported by motherboard.
> > 
> > Bspec/49201:
> > Combo Port HBR3/6.48GHz support:
> > Only supported on SKUs with higher I/O voltage
> > 
> I take the above to mean that only high voltage SKUs support HBR3 and
> > on those SKUs the OEM must limit this to HBR2 if HBR3 would require a
> > retimer on the board, but the board doesn't have this.
> > 
> > If the above isn't correct and low voltage SKUs also in fact support
> HBR3 (with retimers if necessary) then this should imo be clarified at
> > Bspec/49201. The VBT limit could be used then if present, ignoring the
> > low voltage SKU readout.
> 
> Thanks Imre for the inputs.
> 
> As you have mentioned note : rate >5.4 G supported only on High voltage I/O,
> is mentioned for platforms like ICL, JSL and Display 12 platforms.
> 
> I had again asked the HW team and VBT/GOP team whether we can safely rely on
> VBT for the max rate for these platforms, without worrying about the SKU's
> IO Voltage, and also requested them to update the Bspec page for the same.
> 
> In response the Bspec pages 49201, 20598 are now updated with the note "OEM
> must use VBT to specify a maximum that is tolerated by the board design" for
> the rates above 5.4G.

Ok, thanks for this, now the spec is closer to the proposed changes. On
some platforms it's still unclear if the default max rate in the lack of
a VBT limit is HBR2 or HBR3. The ADL-S overview at Bspec/53597 is clear
now wrt. this:

(*) "May require retimer on motherboard. The OEM must use VBT to limit the link 
rate
to HBR2 if HBR3 not supported by motherboard."

ideally it should still clarify if the potential retimer requirement applies to
both eDP and DP or only to DP.

I still see the followings to adjust in the spec so that it reflects
the patch:

- ICL
  - bspec/20584:
"Increased IO voltage may be required to support HBR3 for the highest 
DisplayPort
 and eDP resolutions."

 should be changed to (*) above mentioning that HBR3 is only supported on
 eDP.

  - bspec/20598:
"Combo HBR3: OEM must use VBT to specify a miximum that is tolerated by the
board design."

The DP/HBR3 support on ICL should be removed.

For eDP/HBR3 on ICL the above comment should be changed to (*).

- JSL
  - bspec/32247:
"Increased IO voltage may be required to support HBR3 for the highest 
DisplayPort
 resolutions."

should be removed/changed to (*).

  - bspec/20598:
"OEM must use VBT to specify a miximum that is tolerated by the
board design."

should be changed to (*).

- TGL:
  - bspec/49201:
"Combo HBR3: OEM must use VBT to specify a miximum that is tolerated
by the board design."

The DP/HBR3 support should be removed, for eDP/HBR3 the above should
be changed to (*).

- RKL:
  - bspec/49201, 49204:
Remove the RKL tag, since there is a separate page for RKL.

  - bspec/49202:
"Combo HBR3: Only supported on SKUs with higher I/O voltage"

should be changed to (*).

- ADLS:
  - bspec/49201, 49204:
The ADLS tag should be removed, since there is a separate page for ADLS.

  - bspec/53720:
"Combo HBR3: OEM must use VBT to specify a miximum that is tolerated by the
board design."

should be changed to (*).

- DG1:
  - bspec/49205:
"Combo HBR3: Only supported on SKUs with higher I/O voltage"

should be changed to (*) above.

- DG2:
  - bspec/53657:
For Combo HBR3 (*) should be added.

  - bspec/54034:
For Combo HBR3 (*) should be added.

- ADLP:
  - bspec/49185:
"Combo DP/HBR3: OEM must use VBT to specify a miximum that is tolerated by
the board design. An external re-timer may be needed."

should be changed to (*).


Also could you add a debug print with the voltage configuration of combo
PHYs somewhere in intel_combo_phy.c?
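
Something along these lines would do (untested sketch; the register and
field names are assumed from the ICL PORT_COMP_DW3 definitions):

static void intel_combo_phy_log_voltage(struct drm_i915_private *i915,
					enum phy phy)
{
	u32 val = intel_de_read(i915, ICL_PORT_COMP_DW3(phy));

	drm_dbg_kms(&i915->drm, "Combo PHY %c voltage info: 0x%08x\n",
		    phy_name(phy), val & VOLTAGE_INFO_MASK);
}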

> From what I understand, we can depend upon the VBT's rate, and if there are
> some low voltage I/O SKUs that do not support HBR3 rate, it should be
> limited by the VBT.
> 
> Thanks & Regards,
> 
> Ankit
> 
> > > BR,
> > > Jani.
> > > 
> > > On Tue, 05 Oct 2021, "Nautiyal, Ankit K"  
> > > wrote:
> > > > On 10/5/2021 1:34 PM, Jani Nikula wrote:
> > > > > On Tue, 05 Oct 2021, Ankit Nautiyal  
> > > > > wrote:
> > > > > > The low voltage sku check can be ignored as OEMs need to consider 
> > > > > > that
> > > > > > when designing the board and then put any limits in VBT.
> > > > > "can" or "must"?
> > > > > 
> > > > > VBT has been notoriously buggy over the

Re: [Intel-gfx] [PATCH] drm/i915/display: Remove check for low voltage sku for max dp source rate

2021-10-13 Thread Jani Nikula
On Wed, 13 Oct 2021, Imre Deak  wrote:
> On Thu, Oct 07, 2021 at 01:19:25PM +0530, Nautiyal, Ankit K wrote:
>> 
>> On 10/5/2021 9:01 PM, Imre Deak wrote:
>> > On Tue, Oct 05, 2021 at 01:34:21PM +0300, Jani Nikula wrote:
>> > > Cc: Imre, I think you were involved in adding the checks.
>> > About ADL-S the spec says:
>> > 
>> > Bspec 53597:
>> > Combo Port Maximum Speed:
>> > OEM must use VBT to specify a maximum that is tolerated by the board 
>> > design.
>> > 
>> > Combo Port HBR3 support:
>> > May require retimer on motherboard. The OEM must use VBT to limit the link 
>> > rate to HBR2 if HBR3 not supported by motherboard.
>> > 
>> > Bspec/49201:
>> > Combo Port HBR3/6.48GHz support:
>> > Only supported on SKUs with higher I/O voltage
>> > 
>> > I take the above to mean that only high voltage SKUs support HBR3 and
>> > on those SKUs the OEM must limit this to HBR2 if HBR3 would require a
>> > retimer on the board, but the board doesn't have this.
>> > 
>> > If the above isn't correct and low voltage SKUs also in fact support
>> > HBR3 (with retimers if necessary) then this should imo be clarified at
>> > Bspec/49201. The VBT limit could be used then if present, ignoring the
>> > low voltage SKU readout.
>> 
>> Thanks Imre for the inputs.
>> 
>> As you have mentioned note : rate >5.4 G supported only on High voltage I/O,
>> is mentioned for platforms like ICL, JSL and Display 12 platforms.
>> 
>> I had again asked the HW team and VBT/GOP team whether we can safely rely on
>> VBT for the max rate for these platforms, without worrying about the SKU's
>> IO Voltage, and also requested them to update the Bspec page for the same.
>> 
>> In response the Bspec pages 49201, 20598 are now updated with the note "OEM
>> must use VBT to specify a maximum that is tolerated by the board design" for
>> the rates above 5.4G.
>
> Ok, thanks for this, now the spec is closer to the proposed changes. On
> some platforms it's still unclear if the default max rate in the lack of
> a VBT limit is HBR2 or HBR3. The ADL-S overview at Bspec/53597 is clear
> now wrt. this:
>
> (*) "May require retimer on motherboard. The OEM must use VBT to limit the 
> link rate
> to HBR2 if HBR3 not supported by motherboard."
>
> ideally it should still clarify if the potential retimer requirement applies 
> to
> both eDP and DP or only to DP.
>
> I still see the followings to adjust in the spec so that it reflects
> the patch:
>
> - ICL
>   - bspec/20584:
> "Increased IO voltage may be required to support HBR3 for the highest 
> DisplayPort
>  and eDP resolutions."
>
>  should be changed to (*) above mentioning that HBR3 is only supported on
>  eDP.
>
>   - bspec/20598:
> "Combo HBR3: OEM must use VBT to specify a miximum that is tolerated by 
> the
> board design."
>
> The DP/HBR3 support on ICL should be removed.
>
> For eDP/HBR3 on ICL the above comment should be changed to (*).
>
> - JSL
>   - bspec/32247:
> "Increased IO voltage may be required to support HBR3 for the highest 
> DisplayPort
>  resolutions."
>
> should be removed/changed to (*).
>
>   - bspec/20598:
> "OEM must use VBT to specify a miximum that is tolerated by the
> board design."
>
> should be changed to (*).
>
> - TGL:
>   - bspec/49201:
> "Combo HBR3: OEM must use VBT to specify a miximum that is tolerated
> by the board design."
>
> The DP/HBR3 support should be removed, for eDP/HBR3 the above should
> be changed to (*).
>
> - RKL:
>   - bspec/49201, 49204:
> Remove the RKL tag, since there is a separate page for RKL.
>
>   - bspec/49202:
> "Combo HBR3: Only supported on SKUs with higher I/O voltage"
>
> should be changed to (*).
>
> - ADLS:
>   - bspec/49201, 49204:
> The ADLS tag should be removed, since there is a separate page for ADLS.
>
>   - bspec/53720:
> "Combo HBR3: OEM must use VBT to specify a miximum that is tolerated by 
> the
> board design."
>
> should be changed to (*).
>
> - DG1:
>   - bspec/49205:
> "Combo HBR3: Only supported on SKUs with higher I/O voltage"
>
> should be changed to (*) above.
>
> - DG2:
>   - bspec/53657:
> For Combo HBR3 (*) should be added.
>
>   - bspec/54034:
> For Combo HBR3 (*) should be added.
>
> - ADLP:
>   - bspec/49185:
> "Combo DP/HBR3: OEM must use VBT to specify a miximum that is tolerated by
> the board design. An external re-timer may be needed."
>
> should be changed to (*).
>
>
> Also could you add a debug print with the voltage configuration of combo
> PHYs somewhere in intel_combo_phy.c?
>
>> From what I understand, we can rely on the VBT's max rate, and if there
>> are some low voltage I/O SKUs that do not support the HBR3 rate, it
>> should be limited by the VBT.
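
The clamping this implies is simple; a minimal sketch, assuming hypothetical
helpers (platform_max_rate(), vbt_max_rate() and sink_max_rate() are
illustrative names, not actual i915 functions):

/* Pick the source max link rate, letting an OEM VBT limit override
 * the platform default (e.g. HBR3 dropped to HBR2 when the board
 * lacks the required retimers). Helper names are made up.
 */
static int max_source_rate_khz(struct intel_dp *intel_dp)
{
	int max = platform_max_rate(intel_dp);	/* e.g. 810000 for HBR3 */
	int vbt = vbt_max_rate(intel_dp);	/* 0 if the VBT has no limit */

	if (vbt)
		max = min(max, vbt);

	return min(max, sink_max_rate(intel_dp));
}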
>> 
>> Thanks & Regards,
>> 
>> Ankit
>> 
>> > > BR,
>> > > Jani.
>> > > 
>> > > On Tue, 05 Oct 2021, "Nautiyal, Ankit K"  
>> > > wrote:
>> > > > On 10/5/2021 1:34 PM, Jani Nikula wrote:
>> > > > > On Tue, 05 Oct 2021, Ankit Naut

Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 15:00, Daniel Vetter wrote:

On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:

No memory should be allocated when calling i915_gem_object_wait,
because it may be called to idle a BO when evicting memory.

Fix this by using dma_resv_iter helpers to call
i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
Also remove dma_resv_prune, it's questionable.
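
For readers unfamiliar with the new iterator API, the allocation-free wait
pattern looks roughly like this (a sketch of dma_resv_iter usage, not the
exact hunk from the patch):

/* Wait on all fences in a reservation object without allocating
 * memory, using the dma_resv_iter helpers.
 */
static long wait_all_fences(struct dma_resv *resv, bool intr, long timeout)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;

	dma_resv_iter_begin(&cursor, resv, true /* all fences */);
	dma_resv_for_each_fence_unlocked(&cursor, fence) {
		timeout = dma_fence_wait_timeout(fence, intr, timeout);
		if (timeout <= 0)
			break;
	}
	dma_resv_iter_end(&cursor);

	return timeout;
}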

This will result in the following lockdep splat.

<4> [83.538517] ==
<4> [83.538520] WARNING: possible circular locking dependency detected
<4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
<4> [83.538525] --
<4> [83.538527] gem_render_line/5242 is trying to acquire lock:
<4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
__kmalloc_track_caller+0x56/0x270
<4> [83.538538]
but task is already holding lock:
<4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.538638]
which lock already depends on the new lock.
<4> [83.538642]
the existing dependency chain (in reverse order) is:
<4> [83.538645]
-> #1 (&vm->mutex/1){+.+.}-{3:3}:
<4> [83.538649]lock_acquire+0xd3/0x310
<4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
<4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
<4> [83.538794]ppgtt_init+0x55/0x70 [i915]
<4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
<4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
<4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
<4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
<4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
<4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
<4> [83.539197]pci_device_probe+0x9b/0x110
<4> [83.539201]really_probe+0x1b0/0x3b0
<4> [83.539205]__driver_probe_device+0xf6/0x170
<4> [83.539208]driver_probe_device+0x1a/0x90
<4> [83.539210]__driver_attach+0x93/0x160
<4> [83.539213]bus_for_each_dev+0x72/0xc0
<4> [83.539216]bus_add_driver+0x14b/0x1f0
<4> [83.539220]driver_register+0x66/0xb0
<4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
<4> [83.539227]do_one_initcall+0x53/0x2e0
<4> [83.539230]do_init_module+0x55/0x200
<4> [83.539234]load_module+0x2700/0x2980
<4> [83.539237]__do_sys_finit_module+0xaa/0x110
<4> [83.539241]do_syscall_64+0x37/0xb0
<4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539247]
-> #0 (fs_reclaim){+.+.}-{0:0}:
<4> [83.539251]validate_chain+0xb37/0x1e70
<4> [83.539254]__lock_acquire+0x5a1/0xb70
<4> [83.539258]lock_acquire+0xd3/0x310
<4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
<4> [83.539264]__kmalloc_track_caller+0x56/0x270
<4> [83.539267]krealloc+0x48/0xa0
<4> [83.539270]dma_resv_get_fences+0x1c3/0x280
<4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
<4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
<4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
<4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
<4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
<4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
<4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
<4> [83.539759]drm_ioctl_kernel+0xac/0x140
<4> [83.539763]drm_ioctl+0x201/0x3d0
<4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
<4> [83.539769]do_syscall_64+0x37/0xb0
<4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [83.539775]
other info that might help us debug this:
<4> [83.539778]  Possible unsafe locking scenario:
<4> [83.539781]CPU0CPU1
<4> [83.539783]
<4> [83.539785]   lock(&vm->mutex/1);
<4> [83.539788]lock(fs_reclaim);
<4> [83.539791]lock(&vm->mutex/1);
<4> [83.539794]   lock(fs_reclaim);
<4> [83.539796]
  *** DEADLOCK ***
<4> [83.539799] 3 locks held by gem_render_line/5242:
<4> [83.539802]  #0: c9d4bbf0 
(reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_gem_do_execbuffer+0x8e5/0x20a0 
[i915]
<4> [83.539870]  #1: 88811e48bae8 (reservation_ww_class_mutex){+.+.}-{3:3}, 
at: eb_validate_vmas+0x81/0x8e0 [i915]
<4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
i915_vma_pin_ww+0x1c7/0x970 [i915]
<4> [83.540011]
stack backtrace:
<4> [83.540014] CPU: 2 PID: 5242 Comm: gem_render_line Not tainted 
5.15.0-rc5-CI-Trybot_8062+ #1
<4> [83.540019] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, 
BIOS TNTGL357.0038.2020.1124.1648 11/24/2020
<4> [83.540023] Call Trace:
<4> [83.540026]  dump_stack_lvl+0x56/0x7b
<4> [83.540030]  check_noncircular+0x12e/0x150
<4> [83.540034]  ? _raw_spin_unlock_irqrestore+0x50/0x60
<4> [

Re: [Intel-gfx] [PATCH 0/1] drm/i915: vlv sideband

2021-10-13 Thread Lucas De Marchi

On Wed, Oct 13, 2021 at 01:47:09PM +0300, Jani Nikula wrote:

On Wed, 13 Oct 2021, Ville Syrjälä  wrote:

On Wed, Oct 13, 2021 at 01:11:58PM +0300, Jani Nikula wrote:

Three main ideas here:

- vlv sideband only has the name "sideband" in common with the rest of
  intel_sideband.[ch]


I wouldn't put it like that. There are two actual sideband
implementations in that file:
- vlv/chv iosf sideband (vlv_sideband)
- lpt/wpt iosf sideband (intel_sbi)

And the third thing in that file is the snb+ pcode mailbox stuff,
which has nothing to do with sideband.


Fair enough... but no opposition to the splitting out of vlv/chv iosf
sideband? vlv_sideband.[ch] like here? I'm fine with renaming too.

I can follow up with lpt/wpt iosf split out (intel_sbi.[ch]?) and snb+
pcode (intel_pcode.[ch]?).


yeah, I think that if we move intel_pcode.[ch] out, then we probably
don't even have to worry about the iosf_* calls for other archs. The
common stuff would be in pcode and the others would be compiled out for
archs that don't have it (i.e. only x86 adds it).

+Siva, who was looking into this iosf abstraction.

Lucas De Marchi



I think we've just put all of them together way back when this was all
probably bundled in i915_drv.c or something...


BR,
Jani.



--
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [RFC 6/8] drm/i915: Make some recently added vfuncs use full scheduling attribute

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 13:01, Daniel Vetter wrote:

On Wed, Oct 06, 2021 at 10:12:29AM -0700, Matthew Brost wrote:

On Mon, Oct 04, 2021 at 03:36:48PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Code added in 71ed60112d5d ("drm/i915: Add kick_backend function to
i915_sched_engine") and ee242ca704d3 ("drm/i915/guc: Implement GuC
priority management") introduced some scheduling related vfuncs which
take integer request priority as argument.

Make them instead take struct i915_sched_attr, which is the type
encapsulating this information, so it probably aligns with the design
better. It definitely enables extending the set of scheduling attributes.



Understand the motivation here, but the i915_scheduler is going to
disappear when we move to the DRM scheduler, or at least its priority
inheritance functionality will be pushed into the DRM scheduler. I'd be
very careful making any changes here as the priority in the DRM
scheduler is defined as single enum:


Yeah I'm not sure it makes sense to build this and make the conversion to
drm/sched even harder. We've already merged a lot of code with a "we'll
totally convert to drm/sched right after" promise, there's not really room
for more fun like this built on top of i915-scheduler.


It is not really fun on top of the i915 scheduler. It is fun on top of
the concept of uapi gem context priority. As long as there is gem context
priority, and requests inherit from it, the concept works. This is
demonstrated by the fact that it ties in with the GuC backend, which
already reduces to three priorities. The granularity is limited, but it
does something.


Implementation details aside, the key question is the proposal to tie
process nice to GPU scheduling priority. There seems to be interest
from other parties, so there probably is something here.
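
A sketch of what tying the two could look like, purely illustrative (the
linear scaling and the use of the uapi priority bounds are assumptions,
not a proposed interface):

/* Map task nice (-20..19) onto the i915 user priority range
 * (-1023..1023, i.e. I915_CONTEXT_MIN/MAX_USER_PRIORITY), keeping
 * the convention that a lower nice value means a higher priority.
 */
static int nice_to_i915_priority(int nice)
{
	return -nice * I915_CONTEXT_MAX_USER_PRIORITY / 20;
}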


But I do plan to simplify this RFC to not add anything to 
i915_sched_attr and also drop the task sched attr change notifier.


Regards,

Tvrtko


-Daniel



/* These are often used as an (initial) index
  * to an array, and as such should start at 0.
  */
enum drm_sched_priority {
 DRM_SCHED_PRIORITY_MIN,
 DRM_SCHED_PRIORITY_NORMAL,
 DRM_SCHED_PRIORITY_HIGH,
 DRM_SCHED_PRIORITY_KERNEL,

 DRM_SCHED_PRIORITY_COUNT,
 DRM_SCHED_PRIORITY_UNSET = -2
};
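
For illustration, collapsing a wide integer priority into this enum could
look like the following sketch (assuming only the relative ordering matters;
this is not existing conversion code):

/* Illustrative collapse of a signed i915-style priority into the
 * three user-visible DRM scheduler levels.
 */
static enum drm_sched_priority to_drm_sched_priority(int prio)
{
	if (prio < 0)
		return DRM_SCHED_PRIORITY_MIN;
	if (prio > 0)
		return DRM_SCHED_PRIORITY_HIGH;
	return DRM_SCHED_PRIORITY_NORMAL;
}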

Adding a field to the i915_sched_attr is fairly easy as we already have
a structure but changing the DRM scheduler might be a tougher sell.
Anyway you can make this work without adding the 'nice' field to
i915_sched_attr? Might be worth exploring so when we move to the DRM
scheduler this feature drops in a little cleaner.

Matt


Signed-off-by: Tvrtko Ursulin 
Cc: Matthew Brost 
Cc: Daniele Ceraolo Spurio 
---
  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 +++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c| 3 ++-
  drivers/gpu/drm/i915/i915_scheduler.c| 4 ++--
  drivers/gpu/drm/i915/i915_scheduler_types.h  | 4 ++--
  4 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7147fe80919e..e91d803a6453 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3216,11 +3216,13 @@ static bool can_preempt(struct intel_engine_cs *engine)
return engine->class != RENDER_CLASS;
  }
  
-static void kick_execlists(const struct i915_request *rq, int prio)

+static void kick_execlists(const struct i915_request *rq,
+  const struct i915_sched_attr *attr)
  {
struct intel_engine_cs *engine = rq->engine;
struct i915_sched_engine *sched_engine = engine->sched_engine;
const struct i915_request *inflight;
+   const int prio = attr->priority;
  
  	/*

 * We only need to kick the tasklet once for the high priority
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index ba0de35f6323..b5883a4365ca 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2414,9 +2414,10 @@ static void guc_init_breadcrumbs(struct intel_engine_cs 
*engine)
  }
  
  static void guc_bump_inflight_request_prio(struct i915_request *rq,

-  int prio)
+  const struct i915_sched_attr *attr)
  {
struct intel_context *ce = rq->context;
+   const int prio = attr->priority;
u8 new_guc_prio = map_i915_prio_to_guc_prio(prio);
  
  	/* Short circuit function */

diff --git a/drivers/gpu/drm/i915/i915_scheduler.c 
b/drivers/gpu/drm/i915/i915_scheduler.c
index 762127dd56c5..534bab99fcdc 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -255,7 +255,7 @@ static void __i915_schedule(struct i915_sched_node *node,
  
  		/* Must 

Re: [Intel-gfx] [PATCH 1/1] drm/i915: split out vlv sideband to a separate file

2021-10-13 Thread Lucas De Marchi

On Wed, Oct 13, 2021 at 01:11:59PM +0300, Jani Nikula wrote:

The VLV/CHV sideband code is pretty distinct from the rest of the
sideband code. Split it out to new vlv_sideband.[ch].

Pure code movement with relevant #include changes, and a tiny checkpatch
fix on top.

Cc: Lucas De Marchi 
Cc: Ville Syrjälä 
Signed-off-by: Jani Nikula 


Acked-by: Lucas De Marchi 

thanks
Lucas De Marchi


Re: [Intel-gfx] [PATCH] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 13:06, Daniel Vetter wrote:

On Tue, Oct 05, 2021 at 03:05:25PM +0200, Thomas Hellström wrote:

Hi, Tvrtko,

On 10/5/21 13:31, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

In short this makes i915 work for hybrid setups (DRI_PRIME=1 with Mesa)
when rendering is done on Intel dgfx and scanout/composition on Intel
igfx.

Before this patch the driver was not quite ready for that setup, mainly
because it was able to emit a semaphore wait between the two GPUs, which
results in deadlocks because semaphore target location in HWSP is neither
shared between the two, nor mapped in both GGTT spaces.

To fix it the patch adds an additional check to a couple of relevant code
paths in order to prevent using semaphores for inter-engine
synchronisation when relevant objects are not in the same GGTT space.

v2:
   * Avoid adding rq->i915. (Chris)

v3:
   * Use GGTT which describes the limit more precisely.

Signed-off-by: Tvrtko Ursulin 
Cc: Daniel Vetter 
Cc: Matthew Auld 
Cc: Thomas Hellström 


An IMO pretty important bugfix. I read up a bit on the previous discussion
on this, and from what I understand the other two options were

1) Ripping out the semaphore code,
2) Consider dma-fences from other instances of the same driver as foreign.

For imported dma-bufs we do 2), but particularly with lmem and p2p that's a
more straightforward decision.

I don't think 1) is a reasonable approach to fix this bug (but perhaps as a
general cleanup?), and for 2) yes, I guess we might end up doing that, unless
we find some real benefits in treating same-driver-separate-device
dma-fences as local, but for this particular bug, IMO this is a reasonable
fix.


The foreign dma-fences have uapi impact, which Tvrtko shrugged off as
"it's a good idea", and no, it's really just not. So we still need to do
this properly.


I always said let's merge the fix and discuss it. The fix only improved one
failure and did not introduce any of the new issues you are worried about.
They were all already there.


So let's start the discussion: why is it not a good idea to extend the
concept of priority inheritance to the hybrid case?


Today we can have a high priority compositor waiting for client rendering,
or even I915_PRIORITY_DISPLAY, which I _think_ somehow ties into page
flips with full screen stuff, and with igpu we do priority inheritance
in those cases. Why would it be a bad idea to do the same in the hybrid
setup?


Regards,

Tvrtko




Reviewed-by: Thomas Hellström 


But I'm also ok with just merging this as-is so the situation doesn't
become too entertaining.
-Daniel








---
   drivers/gpu/drm/i915/i915_request.c | 12 +++-
   1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 79da5eca60af..4f189982f67e 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1145,6 +1145,12 @@ __emit_semaphore_wait(struct i915_request *to,
return 0;
   }
+static bool
+can_use_semaphore_wait(struct i915_request *to, struct i915_request *from)
+{
+   return to->engine->gt->ggtt == from->engine->gt->ggtt;
+}
+
   static int
   emit_semaphore_wait(struct i915_request *to,
struct i915_request *from,
@@ -1153,6 +1159,9 @@ emit_semaphore_wait(struct i915_request *to,
const intel_engine_mask_t mask = READ_ONCE(from->engine)->mask;
struct i915_sw_fence *wait = &to->submit;
+   if (!can_use_semaphore_wait(to, from))
+   goto await_fence;
+
if (!intel_context_use_semaphores(to->context))
goto await_fence;
@@ -1256,7 +1265,8 @@ __i915_request_await_execution(struct i915_request *to,
 * immediate execution, and so we must wait until it reaches the
 * active slot.
 */
-   if (intel_engine_has_semaphores(to->engine) &&
+   if (can_use_semaphore_wait(to, from) &&
+   intel_engine_has_semaphores(to->engine) &&
!i915_request_has_initial_breadcrumb(to)) {
err = __emit_semaphore_wait(to, from, from->fence.seqno - 1);
if (err < 0)




Re: [Intel-gfx] [PATCH 2/2] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

2021-10-13 Thread Tvrtko Ursulin



On 13/10/2021 01:56, Umesh Nerlige Ramappa wrote:

With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically with
a period of 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.
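
The sampling and the 32-to-64-bit extension reduce to roughly the following
(a simplified sketch; the helper names and field layout are illustrative,
not the actual driver code):

/* Extend a wrapping 32-bit GT timestamp to 64 bits; correct as long
 * as the worker samples at least once per wrap period.
 */
static u64 extend_32b(u64 *prev, u32 raw)
{
	if (raw < lower_32_bits(*prev))
		*prev += 1ULL << 32;	/* counter wrapped since last sample */
	*prev = (*prev & ~0xffffffffULL) | raw;
	return *prev;
}

/* Compute busyness from the GuC-shared values at sample time. */
static u64 sample_busyness(u64 total, u32 context_id, u64 start, u64 now)
{
	u64 busy = total;

	/* a valid id and a non-zero start mean a context is running */
	if (context_id != ~0u && start)
		busy += now - start;

	return busy;
}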

Note:
There might be over-accounting of busyness due to the fact that GuC
may be updating the total and start values while the KMD is reading them
(i.e. the KMD may read the updated total and the stale start). In such a
case, the user may see a higher busyness value followed by smaller ones,
which would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at guc level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate guc and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of guc state
- Add hooks to gt park/unpark for guc busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to guc initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and guc stats objects
- Since disable_submission is called from many places, move resetting
   stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
   callbacks and worker with gt reset


Looks good to me now, for some combination of high level and incomplete
low level review (I did not check the overflow handling or the GuC page
layout and flow). Both patches:


Acked-by: Tvrtko Ursulin 

Do you have someone available to check the parts I did not and r-b?

Regards,

Tvrtko



Signed-off-by: John Harrison 
Signed-off-by: Umesh Nerlige Ramappa 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c |  28 +-
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  33 ++-
  .../drm/i915/gt/intel_execlists_submission.c  |  34 +++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +
  drivers/gpu/drm/i915/gt/intel_reset.c |  16 ++
  drivers/gpu/drm/i915/gt/intel_reset.h |   1 +
  .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  30 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  21 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   5 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  13 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 267 ++
  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   2 +
  drivers/gpu/drm/i915/i915_reg.h   |   2 +
  14 files changed, 427 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 38436f4b5706..6b783fdcba2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1873,23 +1873,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
intel_engine_print_breadcrumbs(engine, m);
  }
  
-static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,

-   ktime_t *now)
-{
-   struct intel_engine_execlists_stats *stats = &engine->stats.execlists;
-   ktime_t total = stats->total;
-
-   /*
-* If the engine is executing something at the moment
-* add it to the total.
-*/
-   *now = 

Re: [Intel-gfx] [PATCH] drm/i915/dg2: Tile 4 plane format support

2021-10-13 Thread Ramalingam C
On 2021-10-12 at 11:28:45 +0300, Stanislav Lisovskiy wrote:
> TileF (Tile4 in bspec) is a 4K tile format organized into
> 64B subtiles with the same basic shape as legacy TileY;
> it will be supported by Display 13.
> 
> v2: - Fixed wrong case condition(Jani Nikula)
> - Increased I915_FORMAT_MOD_F_TILED up to 12(Imre Deak)
> 
> v3: - s/I915_TILING_F/TILING_4/g
> - s/I915_FORMAT_MOD_F_TILED/I915_FORMAT_MOD_4_TILED/g
> - Removed unneeded fencing code
> 
> Cc: Imre Deak 
> Cc: Matt Roper 
> Cc: Maarten Lankhorst 
> Signed-off-by: Stanislav Lisovskiy 
> Signed-off-by: Matt Roper 
> Signed-off-by: Juha-Pekka Heikkilä 
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  2 ++
>  drivers/gpu/drm/i915/display/intel_fb.c   |  7 
>  drivers/gpu/drm/i915/display/intel_fbc.c  |  1 +
>  .../drm/i915/display/skl_universal_plane.c| 36 ++-
>  drivers/gpu/drm/i915/i915_drv.h   |  1 +
>  drivers/gpu/drm/i915/i915_pci.c   |  1 +
>  drivers/gpu/drm/i915/i915_reg.h   |  1 +
>  drivers/gpu/drm/i915/intel_device_info.h  |  1 +
>  drivers/gpu/drm/i915/intel_pm.c   |  1 +
>  include/uapi/drm/drm_fourcc.h |  8 +
>  10 files changed, 50 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index 4f0badb11bbb..524a20fa67ce 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -1325,6 +1325,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
>   case DRM_FORMAT_MOD_LINEAR:
>   case I915_FORMAT_MOD_X_TILED:
>   case I915_FORMAT_MOD_Y_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   break;
>   default:
>   drm_dbg(&dev_priv->drm,
> @@ -9330,6 +9331,7 @@ static int intel_atomic_check_async(struct 
> intel_atomic_state *state)
>   case I915_FORMAT_MOD_X_TILED:
>   case I915_FORMAT_MOD_Y_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   break;
>   default:
>   drm_dbg_kms(&i915->drm,
> diff --git a/drivers/gpu/drm/i915/display/intel_fb.c 
> b/drivers/gpu/drm/i915/display/intel_fb.c
> index fa1f375e696b..e19739fef825 100644
> --- a/drivers/gpu/drm/i915/display/intel_fb.c
> +++ b/drivers/gpu/drm/i915/display/intel_fb.c
> @@ -127,6 +127,12 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, 
> int color_plane)
>   return 128;
>   else
>   return 512;
> + case I915_FORMAT_MOD_4_TILED:
> + /*
> +  * Each 4K tile consists of 64B(8*8) subtiles, with
> +  * same shape as Y Tile(i.e 4*16B OWords)
> +  */
> + return 128;
>   case I915_FORMAT_MOD_Y_TILED_CCS:
>   if (is_ccs_plane(fb, color_plane))
>   return 128;
> @@ -305,6 +311,7 @@ unsigned int intel_surf_alignment(const struct 
> drm_framebuffer *fb,
>   case I915_FORMAT_MOD_Y_TILED_CCS:
>   case I915_FORMAT_MOD_Yf_TILED_CCS:
>   case I915_FORMAT_MOD_Y_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
>   return 1 * 1024 * 1024;
>   default:
> diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c 
> b/drivers/gpu/drm/i915/display/intel_fbc.c
> index 1f66de77a6b1..f079a771f802 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
> @@ -747,6 +747,7 @@ static bool tiling_is_valid(struct drm_i915_private 
> *dev_priv,
>   case DRM_FORMAT_MOD_LINEAR:
>   case I915_FORMAT_MOD_Y_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   return DISPLAY_VER(dev_priv) >= 9;
>   case I915_FORMAT_MOD_X_TILED:
>   return true;
> diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
> b/drivers/gpu/drm/i915/display/skl_universal_plane.c
> index a0e53a3b267a..586aa660ba7a 100644
> --- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
> +++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
> @@ -207,6 +207,13 @@ static const u64 adlp_step_a_plane_format_modifiers[] = {
>   DRM_FORMAT_MOD_INVALID
>  };
>  
> +static const u64 dg2_plane_format_modifiers[] = {
> + I915_FORMAT_MOD_X_TILED,
> + I915_FORMAT_MOD_4_TILED,
> + DRM_FORMAT_MOD_LINEAR,
> + DRM_FORMAT_MOD_INVALID
> +};
> +
>  int skl_format_to_fourcc(int format, bool rgb_order, bool alpha)
>  {
>   switch (format) {
> @@ -795,6 +802,8 @@ static u32 skl_plane_ctl_tiling(u64 fb_modifier)
>   return PLANE_CTL_TILED_X;
>   case I915_FORMAT_MOD_Y_TILED:
>   return PLANE_CTL_TILED_Y;
> + case I915_FORMAT_MOD_4_TILED:
> + return PLANE_CTL_TILED_F;
>   case I915_FORMAT_MOD_Y_TILED_CCS:
>   case I915

Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Daniel Vetter
On Wed, Oct 13, 2021 at 04:37:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 13/10/2021 15:00, Daniel Vetter wrote:
> > On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:
> > > No memory should be allocated when calling i915_gem_object_wait,
> > > because it may be called to idle a BO when evicting memory.
> > > 
> > > Fix this by using dma_resv_iter helpers to call
> > > i915_gem_object_wait_fence() on each fence, which cleans up the code a 
> > > lot.
> > > Also remove dma_resv_prune, it's questionable.
> > > 
> > > This will result in the following lockdep splat.
> > > 
> > > <4> [83.538517] ==
> > > <4> [83.538520] WARNING: possible circular locking dependency detected
> > > <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
> > > <4> [83.538525] --
> > > <4> [83.538527] gem_render_line/5242 is trying to acquire lock:
> > > <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
> > > __kmalloc_track_caller+0x56/0x270
> > > <4> [83.538538]
> > > but task is already holding lock:
> > > <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
> > > i915_vma_pin_ww+0x1c7/0x970 [i915]
> > > <4> [83.538638]
> > > which lock already depends on the new lock.
> > > <4> [83.538642]
> > > the existing dependency chain (in reverse order) is:
> > > <4> [83.538645]
> > > -> #1 (&vm->mutex/1){+.+.}-{3:3}:
> > > <4> [83.538649]lock_acquire+0xd3/0x310
> > > <4> [83.538654]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
> > > <4> [83.538730]i915_address_space_init+0xf5/0x1b0 [i915]
> > > <4> [83.538794]ppgtt_init+0x55/0x70 [i915]
> > > <4> [83.538856]gen8_ppgtt_create+0x44/0x5d0 [i915]
> > > <4> [83.538912]i915_ppgtt_create+0x28/0xf0 [i915]
> > > <4> [83.538971]intel_gt_init+0x130/0x3b0 [i915]
> > > <4> [83.539029]i915_gem_init+0x14b/0x220 [i915]
> > > <4> [83.539100]i915_driver_probe+0x97e/0xdd0 [i915]
> > > <4> [83.539149]i915_pci_probe+0x43/0x1d0 [i915]
> > > <4> [83.539197]pci_device_probe+0x9b/0x110
> > > <4> [83.539201]really_probe+0x1b0/0x3b0
> > > <4> [83.539205]__driver_probe_device+0xf6/0x170
> > > <4> [83.539208]driver_probe_device+0x1a/0x90
> > > <4> [83.539210]__driver_attach+0x93/0x160
> > > <4> [83.539213]bus_for_each_dev+0x72/0xc0
> > > <4> [83.539216]bus_add_driver+0x14b/0x1f0
> > > <4> [83.539220]driver_register+0x66/0xb0
> > > <4> [83.539222]hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
> > > <4> [83.539227]do_one_initcall+0x53/0x2e0
> > > <4> [83.539230]do_init_module+0x55/0x200
> > > <4> [83.539234]load_module+0x2700/0x2980
> > > <4> [83.539237]__do_sys_finit_module+0xaa/0x110
> > > <4> [83.539241]do_syscall_64+0x37/0xb0
> > > <4> [83.539244]entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > <4> [83.539247]
> > > -> #0 (fs_reclaim){+.+.}-{0:0}:
> > > <4> [83.539251]validate_chain+0xb37/0x1e70
> > > <4> [83.539254]__lock_acquire+0x5a1/0xb70
> > > <4> [83.539258]lock_acquire+0xd3/0x310
> > > <4> [83.539260]fs_reclaim_acquire+0x9d/0xd0
> > > <4> [83.539264]__kmalloc_track_caller+0x56/0x270
> > > <4> [83.539267]krealloc+0x48/0xa0
> > > <4> [83.539270]dma_resv_get_fences+0x1c3/0x280
> > > <4> [83.539274]i915_gem_object_wait+0x1ff/0x410 [i915]
> > > <4> [83.539342]i915_gem_evict_for_node+0x16b/0x440 [i915]
> > > <4> [83.539412]i915_gem_gtt_reserve+0xff/0x130 [i915]
> > > <4> [83.539482]i915_vma_pin_ww+0x765/0x970 [i915]
> > > <4> [83.539556]eb_validate_vmas+0x6fe/0x8e0 [i915]
> > > <4> [83.539626]i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
> > > <4> [83.539693]i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
> > > <4> [83.539759]drm_ioctl_kernel+0xac/0x140
> > > <4> [83.539763]drm_ioctl+0x201/0x3d0
> > > <4> [83.539766]__x64_sys_ioctl+0x6a/0xa0
> > > <4> [83.539769]do_syscall_64+0x37/0xb0
> > > <4> [83.539772]entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > <4> [83.539775]
> > > other info that might help us debug this:
> > > <4> [83.539778]  Possible unsafe locking scenario:
> > > <4> [83.539781]CPU0CPU1
> > > <4> [83.539783]
> > > <4> [83.539785]   lock(&vm->mutex/1);
> > > <4> [83.539788]lock(fs_reclaim);
> > > <4> [83.539791]lock(&vm->mutex/1);
> > > <4> [83.539794]   lock(fs_reclaim);
> > > <4> [83.539796]
> > >   *** DEADLOCK ***
> > > <4> [83.539799] 3 locks held by gem_render_line/5242:
> > > <4> [83.539802]  #0: c9d4bbf0 
> > > (reservation_ww_class_acquire){+.+.}-{0:0}, at: 
> > > i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
> > > <4> [83.539870]  #1: 88811e48bae8 
> > > (reservation

[Intel-gfx] [PATCH v3] component: do not leave master devres group open after bind

2021-10-13 Thread Kai Vehmanen
In the current code, the devres group for the aggregate master is left open
after the call to component_master_add_*(). This leads to problems when the
master does further managed allocations on its own. When any
participating driver calls component_del(), this leads to the immediate
release of those resources.

This came up when investigating a page fault occurring on i915 DRM
driver unbind with a 5.15-rc1 kernel. The following sequence occurs:

 i915_pci_remove()
   -> intel_display_driver_unregister()
 -> i915_audio_component_cleanup()
   -> component_del()
 -> component.c:take_down_master()
   -> hdac_component_master_unbind() [via master->ops->unbind()]
   -> devres_release_group(master->parent, NULL)

With older kernels this has not caused issues, but with the audio driver
moving to use managed interfaces for more of its allocations, this no
longer works. The devres log shows the following:

component_master_add_with_match()
[  126.886032] snd_hda_intel :00:1f.3: DEVRES ADD 323ccdc5 
devm_component_match_release (24 bytes)
[  126.886045] snd_hda_intel :00:1f.3: DEVRES ADD 865cdb29 grp< (0 
bytes)
[  126.886049] snd_hda_intel :00:1f.3: DEVRES ADD 1b480725 grp< (0 
bytes)

audio driver completes its PCI probe()
[  126.892238] snd_hda_intel :00:1f.3: DEVRES ADD 1b480725 
pcim_iomap_release (48 bytes)

component_del() called() at DRM/i915 unbind()
[  137.579422] i915 :00:02.0: DEVRES REL ef44c293 grp< (0 bytes)
[  137.579445] snd_hda_intel :00:1f.3: DEVRES REL 865cdb29 grp< (0 
bytes)
[  137.579458] snd_hda_intel :00:1f.3: DEVRES REL 1b480725 
pcim_iomap_release (48 bytes)

So the "devres_release_group(master->parent, NULL)" ends up freeing the
pcim_iomap allocation. Upon the next runtime resume, the audio driver will
cause a page fault, as the iomap allocation was released without the driver
knowing about it.

Fix this issue by using the "struct master" pointer as the identifier for
the devres group, and by closing the devres group after
the master->ops->bind() call is done. This allows devres allocations
done by the driver acting as master to be isolated from the binding state
of the aggregate driver. This modifies the logic originally introduced in
commit 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")

Cc: sta...@vger.kernel.org
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
Fixes: 9e1ccb4a7700 ("drivers/base: fix devres handling for master device")
Signed-off-by: Kai Vehmanen 
Acked-by: Imre Deak 
Acked-by: Russell King (Oracle) 
---
 drivers/base/component.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

V3 changes:
 - address feedback from Greg KH, add a Fixes tag and cc stable
V2 changes:
 - after review from Imre and Russell, removed the RFC tag
 - rebased on top of 5.15-rc2 (V1 was on drm-tip)
 - CI test results for V1 show that this patch fixes multiple
   failures in i915 unbind and module reload tests:
   https://patchwork.freedesktop.org/series/94889/

diff --git a/drivers/base/component.c b/drivers/base/component.c
index 5e79299f6c3f..870485cbbb87 100644
--- a/drivers/base/component.c
+++ b/drivers/base/component.c
@@ -246,7 +246,7 @@ static int try_to_bring_up_master(struct master *master,
return 0;
}
 
-   if (!devres_open_group(master->parent, NULL, GFP_KERNEL))
+   if (!devres_open_group(master->parent, master, GFP_KERNEL))
return -ENOMEM;
 
/* Found all components */
@@ -258,6 +258,7 @@ static int try_to_bring_up_master(struct master *master,
return ret;
}
 
+   devres_close_group(master->parent, NULL);
master->bound = true;
return 1;
 }
@@ -282,7 +283,7 @@ static void take_down_master(struct master *master)
 {
if (master->bound) {
master->ops->unbind(master->parent);
-   devres_release_group(master->parent, NULL);
+   devres_release_group(master->parent, master);
master->bound = false;
}
 }

base-commit: 9e1ff307c779ce1f0f810c7ecce3d95bbae40896
-- 
2.33.0
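
For reference, the identifier-based devres group pattern the fix relies on
looks like this minimal sketch (the 'tag' and the surrounding functions are
made up for illustration, not code from the patch):

/* Scope managed allocations with an explicit group id so that a
 * later release tears down only this group, not everything opened
 * with a NULL id.
 */
static int bind_with_group(struct device *dev, void *tag)
{
	if (!devres_open_group(dev, tag, GFP_KERNEL))
		return -ENOMEM;

	/* devm_*() allocations made here belong to the group */

	devres_close_group(dev, tag);	/* stop adding to the group */
	return 0;
}

static void unbind_with_group(struct device *dev, void *tag)
{
	devres_release_group(dev, tag);	/* frees only this group */
}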



Re: [Intel-gfx] [PATCH 2/2] drm/i915/pmu: Connect engine busyness stats from GuC to pmu

2021-10-13 Thread Umesh Nerlige Ramappa

On Wed, Oct 13, 2021 at 05:06:26PM +0100, Tvrtko Ursulin wrote:


On 13/10/2021 01:56, Umesh Nerlige Ramappa wrote:

With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be over-accounting of busyness due to the fact that GuC
may be updating the total and start values while the KMD is reading them
(i.e. the KMD may read the updated total and the stale start). In such a
case, the user may see a higher busyness value followed by smaller ones,
which would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at guc level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate guc and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of guc state
- Add hooks to gt park/unpark for guc busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to guc initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and guc stats objects
- Since disable_submission is called from many places, move resetting
  stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
  callbacks and worker with gt reset


Looks good to me now, for some combination of high level and
incomplete low level review (I did not check the overflow handling or
the GuC page layout and flow). Both patches:


Acked-by: Tvrtko Ursulin 


Thanks



Do you have someone available to check the parts I did not and r-b?


I will check with Matt/John.

Regards,
Umesh


Regards,

Tvrtko



Signed-off-by: John Harrison 
Signed-off-by: Umesh Nerlige Ramappa 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  28 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  33 ++-
 .../drm/i915/gt/intel_execlists_submission.c  |  34 +++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |   2 +
 drivers/gpu/drm/i915/gt/intel_reset.c |  16 ++
 drivers/gpu/drm/i915/gt/intel_reset.h |   1 +
 .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h  |   1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  30 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  21 ++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h|   5 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  13 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 267 ++
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h |   2 +
 drivers/gpu/drm/i915/i915_reg.h   |   2 +
 14 files changed, 427 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 38436f4b5706..6b783fdcba2a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1873,23 +1873,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
intel_engine_print_breadcrumbs(engine, m);
 }
-static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine,
-   ktime_t *now)
-{
-   struct intel_engine_execlists_stats *stats = &engine->stats.execlists;
-   ktime_t total = stats->total;
-
-   /*
-* If the en

Re: [Intel-gfx] [PATCH] drm/i915: Use dma_resv_iter for waiting in i915_gem_object_wait_reservation.

2021-10-13 Thread Maarten Lankhorst
Op 13-10-2021 om 17:37 schreef Tvrtko Ursulin:
>
> On 13/10/2021 15:00, Daniel Vetter wrote:
>> On Wed, Oct 13, 2021 at 02:32:03PM +0200, Maarten Lankhorst wrote:
>>> No memory should be allocated when calling i915_gem_object_wait,
>>> because it may be called to idle a BO when evicting memory.
>>>
>>> Fix this by using dma_resv_iter helpers to call
>>> i915_gem_object_wait_fence() on each fence, which cleans up the code a lot.
>>> Also remove dma_resv_prune, it's questionable.
>>>
>>> This will result in the following lockdep splat.
>>>
>>> <4> [83.538517] ==
>>> <4> [83.538520] WARNING: possible circular locking dependency detected
>>> <4> [83.538522] 5.15.0-rc5-CI-Trybot_8062+ #1 Not tainted
>>> <4> [83.538525] --
>>> <4> [83.538527] gem_render_line/5242 is trying to acquire lock:
>>> <4> [83.538530] 8275b1e0 (fs_reclaim){+.+.}-{0:0}, at: 
>>> __kmalloc_track_caller+0x56/0x270
>>> <4> [83.538538]
>>> but task is already holding lock:
>>> <4> [83.538540] 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
>>> i915_vma_pin_ww+0x1c7/0x970 [i915]
>>> <4> [83.538638]
>>> which lock already depends on the new lock.
>>> <4> [83.538642]
>>> the existing dependency chain (in reverse order) is:
>>> <4> [83.538645]
>>> -> #1 (&vm->mutex/1){+.+.}-{3:3}:
>>> <4> [83.538649]    lock_acquire+0xd3/0x310
>>> <4> [83.538654]    i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
>>> <4> [83.538730]    i915_address_space_init+0xf5/0x1b0 [i915]
>>> <4> [83.538794]    ppgtt_init+0x55/0x70 [i915]
>>> <4> [83.538856]    gen8_ppgtt_create+0x44/0x5d0 [i915]
>>> <4> [83.538912]    i915_ppgtt_create+0x28/0xf0 [i915]
>>> <4> [83.538971]    intel_gt_init+0x130/0x3b0 [i915]
>>> <4> [83.539029]    i915_gem_init+0x14b/0x220 [i915]
>>> <4> [83.539100]    i915_driver_probe+0x97e/0xdd0 [i915]
>>> <4> [83.539149]    i915_pci_probe+0x43/0x1d0 [i915]
>>> <4> [83.539197]    pci_device_probe+0x9b/0x110
>>> <4> [83.539201]    really_probe+0x1b0/0x3b0
>>> <4> [83.539205]    __driver_probe_device+0xf6/0x170
>>> <4> [83.539208]    driver_probe_device+0x1a/0x90
>>> <4> [83.539210]    __driver_attach+0x93/0x160
>>> <4> [83.539213]    bus_for_each_dev+0x72/0xc0
>>> <4> [83.539216]    bus_add_driver+0x14b/0x1f0
>>> <4> [83.539220]    driver_register+0x66/0xb0
>>> <4> [83.539222]    hdmi_get_spk_alloc+0x1f/0x50 [snd_hda_codec_hdmi]
>>> <4> [83.539227]    do_one_initcall+0x53/0x2e0
>>> <4> [83.539230]    do_init_module+0x55/0x200
>>> <4> [83.539234]    load_module+0x2700/0x2980
>>> <4> [83.539237]    __do_sys_finit_module+0xaa/0x110
>>> <4> [83.539241]    do_syscall_64+0x37/0xb0
>>> <4> [83.539244]    entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> <4> [83.539247]
>>> -> #0 (fs_reclaim){+.+.}-{0:0}:
>>> <4> [83.539251]    validate_chain+0xb37/0x1e70
>>> <4> [83.539254]    __lock_acquire+0x5a1/0xb70
>>> <4> [83.539258]    lock_acquire+0xd3/0x310
>>> <4> [83.539260]    fs_reclaim_acquire+0x9d/0xd0
>>> <4> [83.539264]    __kmalloc_track_caller+0x56/0x270
>>> <4> [83.539267]    krealloc+0x48/0xa0
>>> <4> [83.539270]    dma_resv_get_fences+0x1c3/0x280
>>> <4> [83.539274]    i915_gem_object_wait+0x1ff/0x410 [i915]
>>> <4> [83.539342]    i915_gem_evict_for_node+0x16b/0x440 [i915]
>>> <4> [83.539412]    i915_gem_gtt_reserve+0xff/0x130 [i915]
>>> <4> [83.539482]    i915_vma_pin_ww+0x765/0x970 [i915]
>>> <4> [83.539556]    eb_validate_vmas+0x6fe/0x8e0 [i915]
>>> <4> [83.539626]    i915_gem_do_execbuffer+0x9a6/0x20a0 [i915]
>>> <4> [83.539693]    i915_gem_execbuffer2_ioctl+0x11f/0x2c0 [i915]
>>> <4> [83.539759]    drm_ioctl_kernel+0xac/0x140
>>> <4> [83.539763]    drm_ioctl+0x201/0x3d0
>>> <4> [83.539766]    __x64_sys_ioctl+0x6a/0xa0
>>> <4> [83.539769]    do_syscall_64+0x37/0xb0
>>> <4> [83.539772]    entry_SYSCALL_64_after_hwframe+0x44/0xae
>>> <4> [83.539775]
>>> other info that might help us debug this:
>>> <4> [83.539778]  Possible unsafe locking scenario:
>>> <4> [83.539781]    CPU0    CPU1
>>> <4> [83.539783]        
>>> <4> [83.539785]   lock(&vm->mutex/1);
>>> <4> [83.539788]    lock(fs_reclaim);
>>> <4> [83.539791]    lock(&vm->mutex/1);
>>> <4> [83.539794]   lock(fs_reclaim);
>>> <4> [83.539796]
>>>   *** DEADLOCK ***
>>> <4> [83.539799] 3 locks held by gem_render_line/5242:
>>> <4> [83.539802]  #0: c9d4bbf0 
>>> (reservation_ww_class_acquire){+.+.}-{0:0}, at: 
>>> i915_gem_do_execbuffer+0x8e5/0x20a0 [i915]
>>> <4> [83.539870]  #1: 88811e48bae8 
>>> (reservation_ww_class_mutex){+.+.}-{3:3}, at: eb_validate_vmas+0x81/0x8e0 
>>> [i915]
>>> <4> [83.539936]  #2: 88813471d1e0 (&vm->mutex/1){+.+.}-{3:3}, at: 
>>> i915_vma_pin_ww+0x1c7/0x970 [i915]
>>> <4> [83.540011]

[Intel-gfx] ✗ Fi.CI.BAT: failure for component: do not leave master devres group open after bind (rev3)

2021-10-13 Thread Patchwork
== Series Details ==

Series: component: do not leave master devres group open after bind (rev3)
URL   : https://patchwork.freedesktop.org/series/94889/
State : failure

== Summary ==

Applying: component: do not leave master devres group open after bind
Using index info to reconstruct a base tree...
M   drivers/base/component.c
Falling back to patching base and 3-way merge...
No changes -- Patch already applied.




Re: [Intel-gfx] [RFC PATCH] drm: Increase DRM_OBJECT_MAX_PROPERTY by 18.

2021-10-13 Thread Sebastian Andrzej Siewior
On 2021-10-13 14:57:34 [+0200], Daniel Vetter wrote:
> Hm there's a pile of commits there, and nothing immediately jumps
> out. The thing is, 18 is likely way too much, since if e.g. we have a
> single new property on a plane and that pushes over the limit on all of
> them, you get iirc 3x4 already simply because we have that many planes.
> 
> So would be good to know the actual culprit.
> 
> Can you pls try to bisect the above range, applying the patch as a fixup
> locally (without committing it, that would confuse git bisect a bit I
> think), so we know what/where went wrong?

c7fcbf2513973 -> does not boot
c7fcbf2513973 + 2f425cf5242a0 -> boots, 18 x DRM_OBJECT_MAX_PROPERTY
6f11f37459d8f -> boots, 0 x DRM_OBJECT_MAX_PROPERTY
6f11f37459d8f + 2f425cf5242a0 -> boots, 18 x DRM_OBJECT_MAX_PROPERTY

> I'm still confused why this isn't showing up anywhere in our intel ci ...
> 
> Thanks, Daniel

Sebastian


[Intel-gfx] [PATCH 3/3] drm/amdgpu: Replace drm_mm with drm buddy manager

2021-10-13 Thread Arunpravin
Add drm buddy allocator support for vram memory management

Signed-off-by: Arunpravin 
---
 .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h|  97 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h   |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  | 251 ++
 3 files changed, 217 insertions(+), 135 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
index acfa207cf970..2c17e948355e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h
@@ -30,12 +30,15 @@
 #include 
 #include 
 
+#include "amdgpu_vram_mgr.h"
+
 /* state back for walking over vram_mgr and gtt_mgr allocations */
 struct amdgpu_res_cursor {
uint64_tstart;
uint64_tsize;
uint64_tremaining;
-   struct drm_mm_node  *node;
+   void*node;
+   uint32_tmem_type;
 };
 
 /**
@@ -52,27 +55,63 @@ static inline void amdgpu_res_first(struct ttm_resource 
*res,
uint64_t start, uint64_t size,
struct amdgpu_res_cursor *cur)
 {
+   struct drm_buddy_block *block;
+   struct list_head *head, *next;
struct drm_mm_node *node;
 
-   if (!res || res->mem_type == TTM_PL_SYSTEM) {
-   cur->start = start;
-   cur->size = size;
-   cur->remaining = size;
-   cur->node = NULL;
-   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
-   return;
-   }
+   if (!res)
+   goto err_out;
 
BUG_ON(start + size > res->num_pages << PAGE_SHIFT);
 
-   node = to_ttm_range_mgr_node(res)->mm_nodes;
-   while (start >= node->size << PAGE_SHIFT)
-   start -= node++->size << PAGE_SHIFT;
+   cur->mem_type = res->mem_type;
+
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   head = &to_amdgpu_vram_mgr_node(res)->blocks;
+
+   block = list_first_entry_or_null(head,
+struct drm_buddy_block,
+link);
+   if (!block)
+   goto err_out;
+
+   while (start >= node_size(block)) {
+   start -= node_size(block);
+
+   next = block->link.next;
+   if (next != head)
+   block = list_entry(next, struct 
drm_buddy_block, link);
+   }
+
+   cur->start = node_start(block) + start;
+   cur->size = min(node_size(block) - start, size);
+   cur->remaining = size;
+   cur->node = block;
+   break;
+   case TTM_PL_TT:
+   node = to_ttm_range_mgr_node(res)->mm_nodes;
+   while (start >= node->size << PAGE_SHIFT)
+   start -= node++->size << PAGE_SHIFT;
+
+   cur->start = (node->start << PAGE_SHIFT) + start;
+   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   cur->remaining = size;
+   cur->node = node;
+   break;
+   default:
+   goto err_out;
+   }
 
-   cur->start = (node->start << PAGE_SHIFT) + start;
-   cur->size = min((node->size << PAGE_SHIFT) - start, size);
+   return;
+
+err_out:
+   cur->start = start;
+   cur->size = size;
cur->remaining = size;
-   cur->node = node;
+   cur->node = NULL;
+   WARN_ON(res && start + size > res->num_pages << PAGE_SHIFT);
+   return;
 }
 
 /**
@@ -85,7 +124,9 @@ static inline void amdgpu_res_first(struct ttm_resource *res,
  */
 static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t 
size)
 {
-   struct drm_mm_node *node = cur->node;
+   struct drm_buddy_block *block;
+   struct drm_mm_node *node;
+   struct list_head *next;
 
BUG_ON(size > cur->remaining);
 
@@ -99,9 +140,27 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor 
*cur, uint64_t size)
return;
}
 
-   cur->node = ++node;
-   cur->start = node->start << PAGE_SHIFT;
-   cur->size = min(node->size << PAGE_SHIFT, cur->remaining);
+   switch (cur->mem_type) {
+   case TTM_PL_VRAM:
+   block = cur->node;
+
+   next = block->link.next;
+   block = list_entry(next, struct drm_buddy_block, link);
+
+   cur->node = block;
+   cur->start = node_start(block);
+   cur->size = min(node_size(block), cur->remaining);
+   break;
+   case TTM_PL_TT:
+   node = cur->node;
+
+   cur->node = ++node;
+   cur->start = node->start << PAGE_SHIFT;
+   cur->size = min(node->size << PAGE_SHIFT, cur->re

[Intel-gfx] [PATCH 1/3] drm:Enable buddy allocator support

2021-10-13 Thread Arunpravin
Port Intel buddy manager to drm root folder
Implemented range allocation support for the provided order
Implemented TOP-DOWN support
Implemented freeing up unused pages on contiguous allocation
Moved range allocation and freelist pickup into a single function

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/Makefile|   2 +-
 drivers/gpu/drm/drm_buddy.c | 705 
 drivers/gpu/drm/drm_drv.c   |   3 +
 include/drm/drm_buddy.h | 157 
 4 files changed, 866 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/drm_buddy.c
 create mode 100644 include/drm/drm_buddy.h

diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index a118692a6df7..fe1a2fc09675 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -18,7 +18,7 @@ drm-y   :=drm_aperture.o drm_auth.o drm_cache.o \
drm_dumb_buffers.o drm_mode_config.o drm_vblank.o \
drm_syncobj.o drm_lease.o drm_writeback.o drm_client.o \
drm_client_modeset.o drm_atomic_uapi.o drm_hdcp.o \
-   drm_managed.o drm_vblank_work.o
+   drm_managed.o drm_vblank_work.o drm_buddy.o
 
 drm-$(CONFIG_DRM_LEGACY) += drm_agpsupport.o drm_bufs.o drm_context.o 
drm_dma.o \
drm_legacy_misc.o drm_lock.o drm_memory.o 
drm_scatter.o \
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
new file mode 100644
index ..8cd118574665
--- /dev/null
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -0,0 +1,705 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2021 Intel Corporation
+ */
+
+#include 
+#include 
+
+#include 
+
+static struct kmem_cache *slab_blocks;
+
+static struct drm_buddy_block *drm_block_alloc(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *parent,
+  unsigned int order,
+  u64 offset)
+{
+   struct drm_buddy_block *block;
+
+   BUG_ON(order > DRM_BUDDY_MAX_ORDER);
+
+   block = kmem_cache_zalloc(slab_blocks, GFP_KERNEL);
+   if (!block)
+   return NULL;
+
+   block->header = offset;
+   block->header |= order;
+   block->parent = parent;
+
+   BUG_ON(block->header & DRM_BUDDY_HEADER_UNUSED);
+   return block;
+}
+
+static void drm_block_free(struct drm_buddy_mm *mm,
+  struct drm_buddy_block *block)
+{
+   kmem_cache_free(slab_blocks, block);
+}
+
+static void mark_allocated(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_ALLOCATED;
+
+   list_del(&block->link);
+}
+
+static void mark_free(struct drm_buddy_mm *mm,
+ struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_FREE;
+
+   list_add(&block->link,
+   &mm->free_list[drm_buddy_block_order(block)]);
+}
+
+static void mark_split(struct drm_buddy_block *block)
+{
+   block->header &= ~DRM_BUDDY_HEADER_STATE;
+   block->header |= DRM_BUDDY_SPLIT;
+
+   list_del(&block->link);
+}
+
+/**
+ * drm_buddy_init - init memory manager
+ *
+ * @mm: DRM buddy manager to initialize
+ * @size: size in bytes to manage
+ * @chunk_size: minimum page size in bytes for our allocations
+ *
+ * Initializes the memory manager and its resources.
+ *
+ * Returns:
+ * 0 on success, error code on failure.
+ */
+int drm_buddy_init(struct drm_buddy_mm *mm, u64 size, u64 chunk_size)
+{
+   unsigned int i;
+   u64 offset;
+
+   if (size < chunk_size)
+   return -EINVAL;
+
+   if (chunk_size < PAGE_SIZE)
+   return -EINVAL;
+
+   if (!is_power_of_2(chunk_size))
+   return -EINVAL;
+
+   size = round_down(size, chunk_size);
+
+   mm->size = size;
+   mm->avail = size;
+   mm->chunk_size = chunk_size;
+   mm->max_order = ilog2(size) - ilog2(chunk_size);
+
+   BUG_ON(mm->max_order > DRM_BUDDY_MAX_ORDER);
+
+   mm->free_list = kmalloc_array(mm->max_order + 1,
+ sizeof(struct list_head),
+ GFP_KERNEL);
+   if (!mm->free_list)
+   return -ENOMEM;
+
+   for (i = 0; i <= mm->max_order; ++i)
+   INIT_LIST_HEAD(&mm->free_list[i]);
+
+   mm->n_roots = hweight64(size);
+
+   mm->roots = kmalloc_array(mm->n_roots,
+ sizeof(struct drm_buddy_block *),
+ GFP_KERNEL);
+   if (!mm->roots)
+   goto out_free_list;
+
+   offset = 0;
+   i = 0;
+
+   /*
+* Split into power-of-two blocks, in case we are given a size that is
+* not itself a power-of-two.
+*/
+   do {
+   struct drm_buddy_block *root;
+   unsigned int order;
+   

[Intel-gfx] [PATCH 2/3] drm/amdgpu:move vram manager defines into a header file

2021-10-13 Thread Arunpravin
Move vram related defines and inline functions into
a separate header file

Signed-off-by: Arunpravin 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 72 
 1 file changed, 72 insertions(+)
 create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
new file mode 100644
index ..fcab6475ccbb
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2021 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_VRAM_MGR_H__
+#define __AMDGPU_VRAM_MGR_H__
+
+#include 
+
+struct amdgpu_vram_mgr_node {
+   struct ttm_resource base;
+   struct list_head blocks;
+   unsigned long flags;
+};
+
+struct amdgpu_vram_reservation {
+   uint64_t start;
+   uint64_t size;
+   uint64_t min_size;
+   unsigned long flags;
+   struct list_head block;
+   struct list_head node;
+};
+
+static inline uint64_t node_start(struct drm_buddy_block *block)
+{
+   return drm_buddy_block_offset(block);
+}
+
+static inline uint64_t node_size(struct drm_buddy_block *block)
+{
+   return PAGE_SIZE << drm_buddy_block_order(block);
+}
+
+static inline struct amdgpu_vram_mgr_node *
+to_amdgpu_vram_mgr_node(struct ttm_resource *res)
+{
+   return container_of(res, struct amdgpu_vram_mgr_node, base);
+}
+
+static inline struct amdgpu_vram_mgr *
+to_vram_mgr(struct ttm_resource_manager *man)
+{
+   return container_of(man, struct amdgpu_vram_mgr, manager);
+}
+
+static inline struct amdgpu_device *
+to_amdgpu_device(struct amdgpu_vram_mgr *mgr)
+{
+   return container_of(mgr, struct amdgpu_device, mman.vram_mgr);
+}
+
+#endif
-- 
2.25.1
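
As a usage note, walking the buddy blocks backing a node with the helpers
above could look like this (an illustrative sketch; 'node' is assumed to be
a populated amdgpu_vram_mgr_node):

/* Iterate the buddy blocks of a vram node and print their placement
 * using the node_start()/node_size() helpers.
 */
static void dump_vram_node(struct amdgpu_vram_mgr_node *node)
{
	struct drm_buddy_block *block;

	list_for_each_entry(block, &node->blocks, link)
		pr_debug("block: start=%llu size=%llu\n",
			 node_start(block), node_size(block));
}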



Re: [Intel-gfx] [PATCH 23/26] drm/i915: Make request conflict tracking understand parallel submits

2021-10-13 Thread Matthew Brost
On Tue, Oct 12, 2021 at 03:08:05PM -0700, John Harrison wrote:
> On 10/4/2021 15:06, Matthew Brost wrote:
> > If an object in the excl or shared slot is a composite fence from a
> > parallel submit and the current request in the conflict tracking is from
> > the same parallel context there is no need to enforce ordering as the
> > ordering already implicit. Make the request conflict tracking understand
> ordering already -> ordering is already
> 
> > this by comparing the parents parallel fence values and skipping the
> parents -> parent's
> 
> > conflict insertion if the values match.
> Presumably, this is to cope with the fact that the parallel submit fences do
> not look like regular submission fences. And hence the existing code that
> says 'new fence belongs to same context as old fence, so safe to ignore'
> does not work with parallel submission. However, this change does not appear
> to be adding parallel submit support to an existing 'same context' check. It
> seems to be a brand new check that does not exist for single submission.
> What makes parallel submit different? If we aren't skipping same context
> fences for single submits, why do we need it for parallel? Conversely, if we
> need it for parallel then why don't we need it for single?
> 
> And if the single submission version is simply somewhere else in the code,
> why do the parallel version here instead of at the same place?
> 
> John.
> 
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/i915_request.c | 43 +++--
> >   1 file changed, 29 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c 
> > b/drivers/gpu/drm/i915/i915_request.c
> > index e9bfa32f9270..cf89624020ad 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1325,6 +1325,25 @@ i915_request_await_external(struct i915_request *rq, 
> > struct dma_fence *fence)
> > return err;
> >   }
> > +static inline bool is_parallel_rq(struct i915_request *rq)
> > +{
> > +   return intel_context_is_parallel(rq->context);
> > +}
> > +
> > +static inline struct intel_context *request_to_parent(struct i915_request 
> > *rq)
> > +{
> > +   return intel_context_to_parent(rq->context);
> > +}
> > +
> > +static bool is_same_parallel_context(struct i915_request *to,
> > +struct i915_request *from)
> > +{
> > +   if (is_parallel_rq(to))
> Should this not say '&& is_parallel_rq(from)'?
> 

Missed this one. That isn't necessary: if from is not a parallel
submit, the following compare of parents will always return false. I
could add it if you insist, as either way works.
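
For reference, a sketch of the helper being compared (paraphrased from this
series; illustrative only):

    static inline struct intel_context *
    intel_context_to_parent(struct intel_context *ce)
    {
            /* a child resolves to its parent; anything else resolves to itself */
            if (intel_context_is_child(ce))
                    return ce->parallel.parent;
            else
                    return ce;
    }

So when from is not a parallel submit, request_to_parent(from) is simply
from->context, which can never equal the parent context of a parallel to.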

Matt 

> > +   return request_to_parent(to) == request_to_parent(from);
> > +
> > +   return false;
> > +}
> > +
> >   int
> >   i915_request_await_execution(struct i915_request *rq,
> >  struct dma_fence *fence)
> > @@ -1356,11 +1375,14 @@ i915_request_await_execution(struct i915_request 
> > *rq,
> >  * want to run our callback in all cases.
> >  */
> > -   if (dma_fence_is_i915(fence))
> > +   if (dma_fence_is_i915(fence)) {
> > +   if (is_same_parallel_context(rq, to_request(fence)))
> > +   continue;
> > ret = __i915_request_await_execution(rq,
> >  to_request(fence));
> > -   else
> > +   } else {
> > ret = i915_request_await_external(rq, fence);
> > +   }
> > if (ret < 0)
> > return ret;
> > } while (--nchild);
> > @@ -1461,10 +1483,13 @@ i915_request_await_dma_fence(struct i915_request 
> > *rq, struct dma_fence *fence)
> >  fence))
> > continue;
> > -   if (dma_fence_is_i915(fence))
> > +   if (dma_fence_is_i915(fence)) {
> > +   if (is_same_parallel_context(rq, to_request(fence)))
> > +   continue;
> > ret = i915_request_await_request(rq, to_request(fence));
> > -   else
> > +   } else {
> > ret = i915_request_await_external(rq, fence);
> > +   }
> > if (ret < 0)
> > return ret;
> > @@ -1539,16 +1564,6 @@ i915_request_await_object(struct i915_request *to,
> > return ret;
> >   }
> > -static inline bool is_parallel_rq(struct i915_request *rq)
> > -{
> > -   return intel_context_is_parallel(rq->context);
> > -}
> > -
> > -static inline struct intel_context *request_to_parent(struct i915_request 
> > *rq)
> > -{
> > -   return intel_context_to_parent(rq->context);
> > -}
> > -
> >   static struct i915_request *
> >   __i915_request_ensure_parallel_ordering(struct i915_request *rq,
> > struct intel_timeline *timeline)
> 


Re: [Intel-gfx] [PATCH 10/26] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids

2021-10-13 Thread Matthew Brost
On Fri, Oct 08, 2021 at 09:40:43AM -0700, John Harrison wrote:
> On 10/7/2021 18:21, Matthew Brost wrote:
> > On Thu, Oct 07, 2021 at 03:03:04PM -0700, John Harrison wrote:
> > > On 10/4/2021 15:06, Matthew Brost wrote:
> > > > Assign contexts in parent-child relationship consecutive guc_ids. This
> > > > is accomplished by partitioning guc_id space between ones that need to
> > > > be consecutive (1/16 available guc_ids) and ones that do not (15/16 of
> > > > available guc_ids). The consecutive search is implemented via the bitmap
> > > > API.
> > > > 
> > > > This is a precursor to the full GuC multi-lrc implementation but aligns
> > > > to how GuC multi-lrc interface is defined - guc_ids must be consecutive
> > > > when using the GuC multi-lrc interface.
> > > > 
> > > > v2:
> > > >(Daniel Vetter)
> > > > - Explicitly state why we assign consecutive guc_ids
> > > > v3:
> > > >(John Harrison)
> > > > - Bring back in spin lock
> > > > 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.h|   6 +-
> > > >.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 104 
> > > > ++
> > > >2 files changed, 86 insertions(+), 24 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > index 25a598e2b6e8..a9f4ec972bfb 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > > > @@ -76,9 +76,13 @@ struct intel_guc {
> > > >  */
> > > > spinlock_t lock;
> > > > /**
> > > > -* @guc_ids: used to allocate new guc_ids
> > > > +* @guc_ids: used to allocate new guc_ids, single-lrc
> > > >  */
> > > > struct ida guc_ids;
> > > > +   /**
> > > > +* @guc_ids_bitmap: used to allocate new guc_ids, 
> > > > multi-lrc
> > > > +*/
> > > > +   unsigned long *guc_ids_bitmap;
> > > > /**
> > > >  * @guc_id_list: list of intel_context with valid 
> > > > guc_ids but no
> > > >  * refs
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > index 1f2809187513..79e7732e83b2 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > @@ -128,6 +128,16 @@ guc_create_virtual(struct intel_engine_cs 
> > > > **siblings, unsigned int count);
> > > >#define GUC_REQUEST_SIZE 64 /* bytes */
> > > > +/*
> > > > + * We reserve 1/16 of the guc_ids for multi-lrc as these need to be 
> > > > contiguous
> > > > + * per the GuC submission interface. A different allocation algorithm 
> > > > is used
> > > > + * (bitmap vs. ida) between multi-lrc and single-lrc hence the reason 
> > > > to
> > > > + * partition the guc_id space. We believe the number of multi-lrc 
> > > > contexts in
> > > > + * use should be low and 1/16 should be sufficient. Minimum of 32 
> > > > guc_ids for
> > > > + * multi-lrc.
> > > > + */
> > > > +#define NUMBER_MULTI_LRC_GUC_ID
> > > > (GUC_MAX_LRC_DESCRIPTORS / 16)
> > > > +
> > > >/*
> > > > * Below is a set of functions which control the GuC scheduling 
> > > > state which
> > > > * require a lock.
> > > > @@ -1206,6 +1216,11 @@ int intel_guc_submission_init(struct intel_guc 
> > > > *guc)
> > > > INIT_WORK(&guc->submission_state.destroyed_worker,
> > > >   destroyed_worker_func);
> > > > +   guc->submission_state.guc_ids_bitmap =
> > > > +   bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID, GFP_KERNEL);
> > > > +   if (!guc->submission_state.guc_ids_bitmap)
> > > > +   return -ENOMEM;
> > > > +
> > > > return 0;
> > > >}
> > > > @@ -1217,6 +1232,7 @@ void intel_guc_submission_fini(struct intel_guc 
> > > > *guc)
> > > > guc_lrc_desc_pool_destroy(guc);
> > > > guc_flush_destroyed_contexts(guc);
> > > > i915_sched_engine_put(guc->sched_engine);
> > > > +   bitmap_free(guc->submission_state.guc_ids_bitmap);
> > > >}
> > > >static inline void queue_request(struct i915_sched_engine 
> > > > *sched_engine,
> > > > @@ -1268,18 +1284,43 @@ static void guc_submit_request(struct 
> > > > i915_request *rq)
> > > > spin_unlock_irqrestore(&sched_engine->lock, flags);
> > > >}
> > > > -static int new_guc_id(struct intel_guc *guc)
> > > > +static int new_guc_id(struct intel_guc *guc, struct intel_context *ce)
> > > >{
> > > > -   return ida_simple_get(&guc->submission_state.guc_ids, 0,
> > > > - GUC_MAX_LRC_DESCRIPTORS, GFP_KERNEL |
> > > > - __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
> > > > +   int ret;
> > > > +
> > > > +   GEM_BUG_ON(inte

[Intel-gfx] [PATCH 1/2] drm: Add Gamma and Degamma LUT sizes props to drm_crtc to validate.

2021-10-13 Thread Mark Yacoub
From: Mark Yacoub 

[Why]
1. drm_atomic_helper_check doesn't check for the LUT sizes of either Gamma
or Degamma props in the new CRTC state, allowing any invalid size to
be passed on.
2. Each driver has its own LUT size, which could also be different for
legacy users.

[How]
1. Create |degamma_lut_size| and |gamma_lut_size| to save the LUT sizes
assigned by the driver when it's initializing its color and CTM
management.
2. Create drm_atomic_helper_check_crtcs, which is called by
drm_atomic_helper_check to verify that the LUT sizes in the new CRTC
state match the sizes saved in drm_crtc.
3. Rename the older LUT checks that test the color channels to indicate
they are channel checks. They are not included in
drm_atomic_helper_check_crtcs as they are hardware specific and are to
be called by the driver.
4. As the LUT size check now happens in drm_atomic_helper_check, remove
the LUT check in intel_color.c.

Fixes: igt@kms_color@pipe-A-invalid-gamma-lut-sizes on MTK
Tested on Zork (amdgpu), Jacuzzi (mediatek), and Volteer (TGL).

v1:
1. Fix typos
2. Remove the LUT size check from intel driver
3. Rename old LUT check to indicate it's a channel change

Signed-off-by: Mark Yacoub 
---
 drivers/gpu/drm/drm_atomic_helper.c| 60 ++
 drivers/gpu/drm/drm_color_mgmt.c   | 14 ++---
 drivers/gpu/drm/i915/display/intel_color.c | 14 ++---
 include/drm/drm_atomic_helper.h|  1 +
 include/drm/drm_color_mgmt.h   |  7 +--
 include/drm/drm_crtc.h | 11 
 6 files changed, 89 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index bc3487964fb5e..5feb2ad0209c3 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -929,6 +929,62 @@ drm_atomic_helper_check_planes(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_atomic_helper_check_planes);
 
+/**
+ * drm_atomic_helper_check_crtcs - validate state object for CRTC changes
+ * @state: the driver state object
+ *
+ * Check the CRTC state object such as the Gamma/Degamma LUT sizes if the new
+ * state holds them.
+ *
+ * RETURNS:
+ * Zero for success or -errno
+ */
+int drm_atomic_helper_check_crtcs(struct drm_atomic_state *state)
+{
+   struct drm_crtc *crtc;
+   struct drm_crtc_state *new_crtc_state;
+   int i;
+
+   for_each_new_crtc_in_state (state, crtc, new_crtc_state, i) {
+   if (new_crtc_state->color_mgmt_changed &&
+   new_crtc_state->gamma_lut) {
+   uint64_t supported_lut_size = crtc->gamma_lut_size;
+   uint32_t supported_legacy_lut_size = crtc->gamma_size;
+   uint32_t new_state_lut_size =
+   drm_color_lut_size(new_crtc_state->gamma_lut);
+
+   if (new_state_lut_size != supported_lut_size &&
+   new_state_lut_size != supported_legacy_lut_size) {
+   drm_dbg_state(
+   state->dev,
+   "Invalid Gamma LUT size. Should be %u 
(or %u for legacy) but got %u.\n",
+   supported_lut_size,
+   supported_legacy_lut_size,
+   new_state_lut_size);
+   return -EINVAL;
+   }
+   }
+
+   if (new_crtc_state->color_mgmt_changed &&
+   new_crtc_state->degamma_lut) {
+   uint32_t new_state_lut_size =
+   drm_color_lut_size(new_crtc_state->degamma_lut);
+   uint64_t supported_lut_size = crtc->degamma_lut_size;
+
+   if (new_state_lut_size != supported_lut_size) {
+   drm_dbg_state(
+   state->dev,
+   "Invalid Degamma LUT size. Should be %u 
but got %u.\n",
+   supported_lut_size, new_state_lut_size);
+   return -EINVAL;
+   }
+   }
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_atomic_helper_check_crtcs);
+
 /**
  * drm_atomic_helper_check - validate state object
  * @dev: DRM device
@@ -974,6 +1030,10 @@ int drm_atomic_helper_check(struct drm_device *dev,
if (ret)
return ret;
 
+   ret = drm_atomic_helper_check_crtcs(state);
+   if (ret)
+   return ret;
+
if (state->legacy_cursor_update)
state->async_update = !drm_atomic_helper_async_check(dev, 
state);
 
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index bb14f488c8f6c..e5b820ce823bf 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -166,6 +166,7 @@ void drm_crtc_enable_color_mgmt(struct d
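
For context, a minimal sketch (not part of the patch) of the driver-side init
the new check validates against; the 4096/256 sizes below are hypothetical:

    /* hypothetical driver init: advertise a 4096-entry degamma and gamma
     * LUT plus the 256-entry legacy gamma table; incoming blobs are then
     * validated against exactly these sizes by the new helper. */
    drm_mode_crtc_set_gamma_size(crtc, 256);
    drm_crtc_enable_color_mgmt(crtc, 4096, true, 4096);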

[Intel-gfx] [PATCH 2/2] amd/amdgpu_dm: Verify Gamma and Degamma LUT sizes using DRM Core check

2021-10-13 Thread Mark Yacoub
From: Mark Yacoub 

[Why]
drm_atomic_helper_check_crtcs now verifies both legacy and non-legacy LUT
sizes. There is no need to check it within amdgpu_dm_atomic_check.

[How]
Remove the local call to verify LUT sizes and use DRM Core function
instead.

Tested on ChromeOS Zork.

v1:
Remove amdgpu_dm_verify_lut_sizes everywhere.

Signed-off-by: Mark Yacoub 
---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c |  8 ++---
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |  1 -
 .../amd/display/amdgpu_dm/amdgpu_dm_color.c   | 35 ---
 3 files changed, 4 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
index f74663b6b046e..47f8de1cfc3a5 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -10244,6 +10244,10 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
}
}
 #endif
+   ret = drm_atomic_helper_check_crtcs(state);
+   if (ret)
+   return ret;
+
for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, 
new_crtc_state, i) {
dm_old_crtc_state = to_dm_crtc_state(old_crtc_state);
 
@@ -10253,10 +10257,6 @@ static int amdgpu_dm_atomic_check(struct drm_device 
*dev,
dm_old_crtc_state->dsc_force_changed == false)
continue;
 
-   ret = amdgpu_dm_verify_lut_sizes(new_crtc_state);
-   if (ret)
-   goto fail;
-
if (!new_crtc_state->enable)
continue;
 
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
index fcb9c4a629c32..22730e5542092 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
@@ -617,7 +617,6 @@ void amdgpu_dm_trigger_timing_sync(struct drm_device *dev);
 #define MAX_COLOR_LEGACY_LUT_ENTRIES 256
 
 void amdgpu_dm_init_color_mod(void);
-int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state);
 int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state *crtc);
 int amdgpu_dm_update_plane_color_mgmt(struct dm_crtc_state *crtc,
  struct dc_plane_state *dc_plane_state);
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c 
b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
index a022e5bb30a5c..319f8a8a89835 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_color.c
@@ -284,37 +284,6 @@ static int __set_input_tf(struct dc_transfer_func *func,
return res ? 0 : -ENOMEM;
 }
 
-/**
- * Verifies that the Degamma and Gamma LUTs attached to the |crtc_state| are of
- * the expected size.
- * Returns 0 on success.
- */
-int amdgpu_dm_verify_lut_sizes(const struct drm_crtc_state *crtc_state)
-{
-   const struct drm_color_lut *lut = NULL;
-   uint32_t size = 0;
-
-   lut = __extract_blob_lut(crtc_state->degamma_lut, &size);
-   if (lut && size != MAX_COLOR_LUT_ENTRIES) {
-   DRM_DEBUG_DRIVER(
-   "Invalid Degamma LUT size. Should be %u but got %u.\n",
-   MAX_COLOR_LUT_ENTRIES, size);
-   return -EINVAL;
-   }
-
-   lut = __extract_blob_lut(crtc_state->gamma_lut, &size);
-   if (lut && size != MAX_COLOR_LUT_ENTRIES &&
-   size != MAX_COLOR_LEGACY_LUT_ENTRIES) {
-   DRM_DEBUG_DRIVER(
-   "Invalid Gamma LUT size. Should be %u (or %u for 
legacy) but got %u.\n",
-   MAX_COLOR_LUT_ENTRIES, MAX_COLOR_LEGACY_LUT_ENTRIES,
-   size);
-   return -EINVAL;
-   }
-
-   return 0;
-}
-
 /**
  * amdgpu_dm_update_crtc_color_mgmt: Maps DRM color management to DC stream.
  * @crtc: amdgpu_dm crtc state
@@ -348,10 +317,6 @@ int amdgpu_dm_update_crtc_color_mgmt(struct dm_crtc_state 
*crtc)
bool is_legacy;
int r;
 
-   r = amdgpu_dm_verify_lut_sizes(&crtc->base);
-   if (r)
-   return r;
-
degamma_lut = __extract_blob_lut(crtc->base.degamma_lut, &degamma_size);
regamma_lut = __extract_blob_lut(crtc->base.gamma_lut, &regamma_size);
 
-- 
2.33.0.882.g93a45727a2-goog



Re: [Intel-gfx] [PATCH 12/26] drm/i915/guc: Implement multi-lrc submission

2021-10-13 Thread Matthew Brost
On Fri, Oct 08, 2021 at 10:20:24AM -0700, John Harrison wrote:
> On 10/4/2021 15:06, Matthew Brost wrote:
> > Implement multi-lrc submission via a single workqueue entry and single
> > H2G. The workqueue entry contains an updated tail value for each
> > request, of all the contexts in the multi-lrc submission, and updates
> > these values simultaneously. As such, the tasklet and bypass path have
> > been updated to coalesce requests into a single submission.
> > 
> > v2:
> >   (John Harrison)
> >- s/wqe/wqi
> >- Use FIELD_PREP macros
> >- Add GEM_BUG_ONs ensures length fits within field
> >- Add comment / white space to intel_guc_write_barrier
> >   (Kernel test robot)
> >- Make need_tasklet a static function
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.c|  26 ++
> >   drivers/gpu/drm/i915/gt/uc/intel_guc.h|   8 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c |  24 +-
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_fwif.h   |  23 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 319 --
> >   drivers/gpu/drm/i915/i915_request.h   |   8 +
> >   6 files changed, 335 insertions(+), 73 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > index 8f8182bf7c11..7191e8439290 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
> > @@ -756,3 +756,29 @@ void intel_guc_load_status(struct intel_guc *guc, 
> > struct drm_printer *p)
> > }
> > }
> >   }
> > +
> > +void intel_guc_write_barrier(struct intel_guc *guc)
> > +{
> > +   struct intel_gt *gt = guc_to_gt(guc);
> > +
> > +   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > +   /*
> > +* Ensure intel_uncore_write_fw can be used rather than
> > +* intel_uncore_write.
> > +*/
> > +   GEM_BUG_ON(guc->send_regs.fw_domains);
> > +
> > +   /*
> > +* This register is used by the i915 and GuC for MMIO based
> > +* communication. Once we are in this code CTBs are the only
> > +* method the i915 uses to communicate with the GuC so it is
> > +* safe to write to this register (a value of 0 is NOP for MMIO
> > +* communication). If we ever start mixing CTBs and MMIOs a new
> > +* register will have to be chosen.
> > +*/
> Hmm, missed it before, but this comment is very CTB centric and the barrier
> function is now being used for parallel submission work queues. Seems like
> an extra comment should be added to cover that case. Just something simple
> about WQ usage also being guaranteed to be post CTB switch over.
> 

Sure.
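
A possible wording for that extra comment, sketched only as a suggestion:

    /*
     * The same barrier is also used before posting multi-lrc work queue
     * items. The work queue is only ever written after the switch to
     * CTBs, so the reasoning above holds for that path as well.
     */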

> > +   intel_uncore_write_fw(gt->uncore, GEN11_SOFT_SCRATCH(0), 0);
> > +   } else {
> > +   /* wmb() sufficient for a barrier if in smem */
> > +   wmb();
> > +   }
> > +}
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > index a9f4ec972bfb..147f39cc0f2f 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
> > @@ -46,6 +46,12 @@ struct intel_guc {
> >  * submitted until the stalled request is processed.
> >  */
> > struct i915_request *stalled_request;
> > +   enum {
> > +   STALL_NONE,
> > +   STALL_REGISTER_CONTEXT,
> > +   STALL_MOVE_LRC_TAIL,
> > +   STALL_ADD_REQUEST,
> > +   } submission_stall_reason;
> > /* intel_guc_recv interrupt related state */
> > /** @irq_lock: protects GuC irq state */
> > @@ -361,4 +367,6 @@ void intel_guc_submission_cancel_requests(struct 
> > intel_guc *guc);
> >   void intel_guc_load_status(struct intel_guc *guc, struct drm_printer *p);
> > +void intel_guc_write_barrier(struct intel_guc *guc);
> > +
> >   #endif
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > index 20c710a74498..10d1878d2826 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
> > @@ -377,28 +377,6 @@ static u32 ct_get_next_fence(struct intel_guc_ct *ct)
> > return ++ct->requests.last_fence;
> >   }
> > -static void write_barrier(struct intel_guc_ct *ct)
> > -{
> > -   struct intel_guc *guc = ct_to_guc(ct);
> > -   struct intel_gt *gt = guc_to_gt(guc);
> > -
> > -   if (i915_gem_object_is_lmem(guc->ct.vma->obj)) {
> > -   GEM_BUG_ON(guc->send_regs.fw_domains);
> > -   /*
> > -* This register is used by the i915 and GuC for MMIO based
> > -* communication. Once we are in this code CTBs are the only
> > -* method the i915 uses to communicate with the GuC so it is
> > -* safe to write to this register (a value of 0 is NOP for MMIO
> > -* communication). If we ever start mixing CTBs and MMIOs a

Re: [Intel-gfx] [PATCH] drm/i915/uapi: Add comment clarifying purpose of I915_TILING_* values

2021-10-13 Thread Yokoyama, Caz
Looks good to me.
Reviewed-by: Caz Yokoyama 
-caz

On Tue, 2021-10-12 at 15:12 -0700, Matt Roper wrote:
> The I915_TILING_* values in our uapi header are intended solely for
> use
> with the old get_tiling/set_tiling ioctls that operate on hardware
> de-tiling fences; all other uapi communication about tiling types is
> done via framebuffer modifiers rather than with these old values.
> 
> On newer Intel platforms detiling fences no longer exist so the old
> get_tiling/set_tiling ioctls are no longer usable and will always
> return
> -EOPNOTSUPP.  This means there's no reason to add new tiling types
> (such
> as the Tile4 format introduced by Xe_HP) to the uapi header
> here.  Any
> kernel-internal code that needs to represent tiling format should
> either
> rely on framebuffer modifiers (as the display code does) or use some
> kind of non-uapi enum (as the GEM blt selftest now does).
> 
> References: 
> https://patchwork.freedesktop.org/patch/456656/?series=95308
> Cc: Ville Syrjälä 
> Signed-off-by: Matt Roper 
> ---
>  include/uapi/drm/i915_drm.h | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/include/uapi/drm/i915_drm.h
> b/include/uapi/drm/i915_drm.h
> index aa2a7eccfb94..9b8e61163c39 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1522,6 +1522,12 @@ struct drm_i915_gem_caching {
>  #define I915_TILING_NONE 0
>  #define I915_TILING_X1
>  #define I915_TILING_Y2
> +/*
> + * Do not add new tiling types here.  The I915_TILING_* values are
> for
> + * de-tiling fence registers that no longer exist on modern
> platforms.  Although
> + * the hardware may support new types of tiling in general (e.g.,
> Tile4), we
> + * do not need to add them to the uapi that is specific to now-
> defunct ioctls.
> + */
>  #define I915_TILING_LAST I915_TILING_Y
>  
>  #define I915_BIT_6_SWIZZLE_NONE  0


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2,1/4] dri: do not check for NULL debugfs dentry

2021-10-13 Thread Patchwork
== Series Details ==

Series: series starting with [v2,1/4] dri: do not check for NULL debugfs dentry
URL   : https://patchwork.freedesktop.org/series/95794/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
bb1c720488a1 dri: do not check for NULL debugfs dentry
-:93: CHECK:LINE_SPACING: Please don't use multiple blank lines
#93: FILE: include/drm/drm_file.h:84:
 
+

total: 0 errors, 0 warnings, 1 checks, 73 lines checked
5a230733e5b5 drm/ttm: do not set NULL to debugfs dentry
9a2340e7beba drm/i915/gt: do not check for NULL debugfs dentry
3e3b63e04133 vgaswitcheroo: do not check for NULL debugfs dentry




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [v2,1/4] dri: do not check for NULL debugfs dentry

2021-10-13 Thread Patchwork
== Series Details ==

Series: series starting with [v2,1/4] dri: do not check for NULL debugfs dentry
URL   : https://patchwork.freedesktop.org/series/95794/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:314:49: error: static assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"

Re: [Intel-gfx] [PATCH 10/26] drm/i915/guc: Assign contexts in parent-child relationship consecutive guc_ids

2021-10-13 Thread John Harrison

On 10/13/2021 11:03, Matthew Brost wrote:

On Fri, Oct 08, 2021 at 09:40:43AM -0700, John Harrison wrote:

On 10/7/2021 18:21, Matthew Brost wrote:

On Thu, Oct 07, 2021 at 03:03:04PM -0700, John Harrison wrote:

On 10/4/2021 15:06, Matthew Brost wrote:

Assign contexts in parent-child relationship consecutive guc_ids. This
is accomplished by partitioning guc_id space between ones that need to
be consecutive (1/16 available guc_ids) and ones that do not (15/16 of
available guc_ids). The consecutive search is implemented via the bitmap
API.

This is a precursor to the full GuC multi-lrc implementation but aligns
to how GuC multi-lrc interface is defined - guc_ids must be consecutive
when using the GuC multi-lrc interface.

v2:
(Daniel Vetter)
 - Explicitly state why we assign consecutive guc_ids
v3:
(John Harrison)
 - Bring back in spin lock

Signed-off-by: Matthew Brost 
---
drivers/gpu/drm/i915/gt/uc/intel_guc.h|   6 +-
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 104 ++
2 files changed, 86 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 25a598e2b6e8..a9f4ec972bfb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -76,9 +76,13 @@ struct intel_guc {
 */
spinlock_t lock;
/**
-* @guc_ids: used to allocate new guc_ids
+* @guc_ids: used to allocate new guc_ids, single-lrc
 */
struct ida guc_ids;
+   /**
+* @guc_ids_bitmap: used to allocate new guc_ids, multi-lrc
+*/
+   unsigned long *guc_ids_bitmap;
/**
 * @guc_id_list: list of intel_context with valid guc_ids but no
 * refs
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 1f2809187513..79e7732e83b2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -128,6 +128,16 @@ guc_create_virtual(struct intel_engine_cs **siblings, 
unsigned int count);
#define GUC_REQUEST_SIZE 64 /* bytes */
+/*
+ * We reserve 1/16 of the guc_ids for multi-lrc as these need to be contiguous
+ * per the GuC submission interface. A different allocation algorithm is used
+ * (bitmap vs. ida) between multi-lrc and single-lrc hence the reason to
+ * partition the guc_id space. We believe the number of multi-lrc contexts in
+ * use should be low and 1/16 should be sufficient. Minimum of 32 guc_ids for
+ * multi-lrc.
+ */
+#define NUMBER_MULTI_LRC_GUC_ID	(GUC_MAX_LRC_DESCRIPTORS / 16)
+
/*
 * Below is a set of functions which control the GuC scheduling state which
 * require a lock.
@@ -1206,6 +1216,11 @@ int intel_guc_submission_init(struct intel_guc *guc)
INIT_WORK(&guc->submission_state.destroyed_worker,
  destroyed_worker_func);
+   guc->submission_state.guc_ids_bitmap =
+   bitmap_zalloc(NUMBER_MULTI_LRC_GUC_ID, GFP_KERNEL);
+   if (!guc->submission_state.guc_ids_bitmap)
+   return -ENOMEM;
+
return 0;
}
@@ -1217,6 +1232,7 @@ void intel_guc_submission_fini(struct intel_guc *guc)
guc_lrc_desc_pool_destroy(guc);
guc_flush_destroyed_contexts(guc);
i915_sched_engine_put(guc->sched_engine);
+   bitmap_free(guc->submission_state.guc_ids_bitmap);
}
static inline void queue_request(struct i915_sched_engine *sched_engine,
@@ -1268,18 +1284,43 @@ static void guc_submit_request(struct i915_request *rq)
spin_unlock_irqrestore(&sched_engine->lock, flags);
}
-static int new_guc_id(struct intel_guc *guc)
+static int new_guc_id(struct intel_guc *guc, struct intel_context *ce)
{
-   return ida_simple_get(&guc->submission_state.guc_ids, 0,
- GUC_MAX_LRC_DESCRIPTORS, GFP_KERNEL |
- __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
+   int ret;
+
+   GEM_BUG_ON(intel_context_is_child(ce));
+
+   if (intel_context_is_parent(ce))
+   ret = 
bitmap_find_free_region(guc->submission_state.guc_ids_bitmap,
+ NUMBER_MULTI_LRC_GUC_ID,
+ 
order_base_2(ce->parallel.number_children
+  + 1));
+   else
+   ret = ida_simple_get(&guc->submission_state.guc_ids,
+NUMBER_MULTI_LRC_GUC_ID,
+GUC_MAX_LRC_DESCRIPTORS,
+GFP_KERNEL | __GFP_RETRY_MAYFAIL |
+__GFP_NOWARN);
+   if (unlikely(ret < 0))
+   return ret;
+
+   ce->guc_id.id = ret;
+   return 0;
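
For completeness, a sketch of the matching release path (paraphrased from this
series; illustrative only - the real code also resets ce->guc_id.id under the
submission lock):

    static void __release_guc_id(struct intel_guc *guc, struct intel_context *ce)
    {
            GEM_BUG_ON(intel_context_is_child(ce));

            if (intel_context_is_parent(ce))
                    /* multi-lrc: return the whole consecutive region */
                    bitmap_release_region(guc->submission_state.guc_ids_bitmap,
                                          ce->guc_id.id,
                                          order_base_2(ce->parallel.number_children + 1));
            else
                    /* single-lrc: return the id to the ida */
                    ida_simple_remove(&guc->submission_state.guc_ids,
                                      ce->guc_id.id);
    }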
 

Re: [Intel-gfx] [PATCH v2] drm/i915: Remove memory frequency calculation

2021-10-13 Thread Souza, Jose
On Wed, 2021-10-13 at 12:32 +0300, Ville Syrjälä wrote:
> On Tue, Oct 12, 2021 at 06:00:46PM -0700, José Roberto de Souza wrote:
> > This memory frequency calculated is only used to check if it is zero,
> > what is not useful as it will never actually be zero.
> > 
> > Also the calculation is wrong, we should be checking other bit to
> > select the appropriate frequency multiplier while this code is stuck
> > with a fixed multiplier.
> 
> I don't think the alternate ref clock was ever used.
> At least I don't recall ever seeing it.
> 
> The real problem with this is that IIRC this is just the last
> requested frequency. So on a system with SAGV this will
> change dynamically.
> 
> > 
> > So here dropping it as whole.
> 
> We have a second copy of this in gen6_update_ring_freq(). Rather
> than removing one and leaving another potentially broken one behind we
> should probably just consolidate on a single implementation.

gen6_update_ring_freq() is related to GPU frequency, not memory; it doesn't 
look related at all to me.

> 
> > 
> > v2:
> > - Also remove memory frequency calculation for gen9 LP platforms
> > 
> > Cc: Yakui Zhao 
> > Cc: Matt Roper 
> > Fixes: f8112cb9574b ("drm/i915/gen11+: Only load DRAM information from 
> > pcode")
> > Signed-off-by: José Roberto de Souza 
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h   |  8 
> >  drivers/gpu/drm/i915/intel_dram.c | 30 ++
> >  2 files changed, 2 insertions(+), 36 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h 
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index a897f4abea0c3..8825f7ac477b6 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -11109,12 +11109,6 @@ enum skl_power_gate {
> >  #define  DC_STATE_DEBUG_MASK_CORES (1 << 0)
> >  #define  DC_STATE_DEBUG_MASK_MEMORY_UP (1 << 1)
> >  
> > -#define BXT_P_CR_MC_BIOS_REQ_0_0_0 _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x7114)
> > -#define  BXT_REQ_DATA_MASK 0x3F
> > -#define  BXT_DRAM_CHANNEL_ACTIVE_SHIFT 12
> > -#define  BXT_DRAM_CHANNEL_ACTIVE_MASK  (0xF << 12)
> > -#define  BXT_MEMORY_FREQ_MULTIPLIER_HZ 1
> > -
> >  #define BXT_D_CR_DRP0_DUNIT8   0x1000
> >  #define BXT_D_CR_DRP0_DUNIT9   0x1200
> >  #define  BXT_D_CR_DRP0_DUNIT_START 8
> > @@ -11145,9 +11139,7 @@ enum skl_power_gate {
> >  #define  BXT_DRAM_TYPE_LPDDR4  (0x2 << 22)
> >  #define  BXT_DRAM_TYPE_DDR4(0x4 << 22)
> >  
> > -#define SKL_MEMORY_FREQ_MULTIPLIER_HZ  2
> >  #define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU  _MMIO(MCHBAR_MIRROR_BASE_SNB + 
> > 0x5E04)
> > -#define  SKL_REQ_DATA_MASK (0xF << 0)
> >  #define  DG1_GEAR_TYPE REG_BIT(16)
> >  
> >  #define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN 
> > _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5000)
> > diff --git a/drivers/gpu/drm/i915/intel_dram.c 
> > b/drivers/gpu/drm/i915/intel_dram.c
> > index 30a0cab5eff46..0adadfd9528aa 100644
> > --- a/drivers/gpu/drm/i915/intel_dram.c
> > +++ b/drivers/gpu/drm/i915/intel_dram.c
> > @@ -244,7 +244,6 @@ static int
> >  skl_get_dram_info(struct drm_i915_private *i915)
> >  {
> > struct dram_info *dram_info = &i915->dram_info;
> > -   u32 mem_freq_khz, val;
> > int ret;
> >  
> > dram_info->type = skl_get_dram_type(i915);
> > @@ -255,17 +254,6 @@ skl_get_dram_info(struct drm_i915_private *i915)
> > if (ret)
> > return ret;
> >  
> > -   val = intel_uncore_read(&i915->uncore,
> > -   SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
> > -   mem_freq_khz = DIV_ROUND_UP((val & SKL_REQ_DATA_MASK) *
> > -   SKL_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
> > -
> > -   if (dram_info->num_channels * mem_freq_khz == 0) {
> > -   drm_info(&i915->drm,
> > -"Couldn't get system memory bandwidth\n");
> > -   return -EINVAL;
> > -   }
> > -
> > return 0;
> >  }
> >  
> > @@ -350,24 +338,10 @@ static void bxt_get_dimm_info(struct dram_dimm_info 
> > *dimm, u32 val)
> >  static int bxt_get_dram_info(struct drm_i915_private *i915)
> >  {
> > struct dram_info *dram_info = &i915->dram_info;
> > -   u32 dram_channels;
> > -   u32 mem_freq_khz, val;
> > -   u8 num_active_channels, valid_ranks = 0;
> > +   u32 val;
> > +   u8 valid_ranks = 0;
> > int i;
> >  
> > -   val = intel_uncore_read(&i915->uncore, BXT_P_CR_MC_BIOS_REQ_0_0_0);
> > -   mem_freq_khz = DIV_ROUND_UP((val & BXT_REQ_DATA_MASK) *
> > -   BXT_MEMORY_FREQ_MULTIPLIER_HZ, 1000);
> > -
> > -   dram_channels = val & BXT_DRAM_CHANNEL_ACTIVE_MASK;
> > -   num_active_channels = hweight32(dram_channels);
> > -
> > -   if (mem_freq_khz * num_active_channels == 0) {
> > -   drm_info(&i915->drm,
> > -"Couldn't get system memory bandwidth\n");
> > -   return -EINV

Re: [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Parallel submission aka multi-bb execbuf (rev4)

2021-10-13 Thread John Harrison

On 10/12/2021 17:15, Matthew Brost wrote:

On Tue, Oct 12, 2021 at 03:15:00PM -0700, John Harrison wrote:

On 10/4/2021 15:21, Patchwork wrote:

== Series Details ==

Series: Parallel submission aka multi-bb execbuf (rev4)
URL   : https://patchwork.freedesktop.org/series/92789/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e2a47a99bf9d drm/i915/guc: Move GuC guc_id allocation under submission state 
sub-struct
f83d8f1539fa drm/i915/guc: Take GT PM ref when deregistering context
-:79: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'gt' - possible side-effects?
#79: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.h:44:
+#define with_intel_gt_pm(gt, tmp) \
+   for (tmp = 1, intel_gt_pm_get(gt); tmp; \
+intel_gt_pm_put(gt), tmp = 0)

-:79: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'tmp' - possible side-effects?
#79: FILE: drivers/gpu/drm/i915/gt/intel_gt_pm.h:44:
+#define with_intel_gt_pm(gt, tmp) \
+   for (tmp = 1, intel_gt_pm_get(gt); tmp; \
+intel_gt_pm_put(gt), tmp = 0)

Not sure what these two are complaining about? But 'gt' and 'tmp' should be
wrapped with parentheses when used?


Not sure, but I think this one is fine.
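
For reference, what MACRO_ARG_REUSE warns about: the macro names gt in both
the get and the put, so an argument with side effects would be evaluated
twice (next_gt() and do_something() below are hypothetical):

    /* fine for a plain variable, but this would call next_gt() once for
     * intel_gt_pm_get() and again for intel_gt_pm_put(): */
    with_intel_gt_pm(next_gt(&iter), tmp)
            do_something();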


total: 0 errors, 0 warnings, 2 checks, 290 lines checked
93e5284929b3 drm/i915/guc: Take engine PM when a context is pinned with GuC 
submission
4dd6554d994d drm/i915/guc: Don't call switch_to_kernel_context with GuC 
submission
8629b55f536c drm/i915: Add logical engine mapping
8117ec0a1ca7 drm/i915: Expose logical engine instance to user
aa8e1eb4dd4e drm/i915/guc: Introduce context parent-child relationship
aaf50eacc2fd drm/i915/guc: Add multi-lrc context registration
e5f6f50e66d1 drm/i915/guc: Ensure GuC schedule operations do not operate on 
child contexts
adf21ba138f3 drm/i915/guc: Assign contexts in parent-child relationship 
consecutive guc_ids
40ef33318b81 drm/i915/guc: Implement parallel context pin / unpin functions
1ad560c70346 drm/i915/guc: Implement multi-lrc submission
-:364: CHECK:SPACING: spaces preferred around that '*' (ctx:ExV)
#364: FILE: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:771:
+   *wqi++ = child->ring->tail / sizeof(u64);
^

This seems like a bogus warning.


Agree.


total: 0 errors, 0 warnings, 1 checks, 570 lines checked
466c01457dec drm/i915/guc: Insert submit fences between requests in 
parent-child relationship
2ece815c1f18 drm/i915/guc: Implement multi-lrc reset
7add5784199f drm/i915/guc: Update debugfs for GuC multi-lrc
-:23: CHECK:LINE_SPACING: Please don't use multiple blank lines
#23: FILE: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:3707:
+

This should be fixed.


Done.
  

total: 0 errors, 0 warnings, 1 checks, 67 lines checked
966991d7bbed drm/i915: Fix bug in user proto-context creation that leaked 
contexts
0eb3d3bf0c84 drm/i915/guc: Connect UAPI to GuC multi-lrc interface
68c6596b649a drm/i915/doc: Update parallel submit doc to point to i915_drm.h
-:13: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#13:
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 10 lines checked
8290f5d15ca2 drm/i915/guc: Add basic GuC multi-lrc selftest
-:22: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#22:
new file mode 100644

These two can be ignored.

Agree.


total: 0 errors, 1 warnings, 0 checks, 190 lines checked
ade3768c42d5 drm/i915/guc: Implement no mid batch preemption for multi-lrc
57882939d788 drm/i915: Multi-BB execbuf
-:369: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_i' - possible side-effects?
#369: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1854:
+#define for_each_batch_create_order(_eb, _i) \
+   for (_i = 0; _i < (_eb)->num_batches; ++_i)

Again, not sure the 'reuse' comment means but should also use '(_i)'?


I haven't been able to figure out how to fix these ones. I think you
only need () if you deref the variable.
The () is to prevent any kind of operator precedence confusion when 
passing in something more exciting than a simple variable. Doesn't have 
to be a deref, it could be any operator. Granted, extremely unlikely for 
this particular macro but generally good practice just in case. E.g. 
someone passes in weird things like 'a, func()' as '_i'.
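
The textbook illustration of the precedence problem:

    #define BAD_DOUBLE(x)  x + x
    #define GOOD_DOUBLE(x) ((x) + (x))

    /* BAD_DOUBLE(a) * 2 expands to a + a * 2, silently changing the
     * result; GOOD_DOUBLE(a) * 2 expands to ((a) + (a)) * 2. */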


John.

  

-:371: ERROR:MULTISTATEMENT_MACRO_USE_DO_WHILE: Macros with multiple statements 
should be enclosed in a do - while loop
#371: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1856:
+#define for_each_batch_add_order(_eb, _i) \
+   BUILD_BUG_ON(!typecheck(int, _i)); \
+   for (_i = (_eb)->num_batches - 1; _i >= 0; --_i)

This seems bogus. Wrapping it in a do/while will break the purpose!
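
For reference, why a do/while wrapper can't work here: the macro deliberately
ends with a bare for header that must bind to the caller's next statement
(flush_batch() below is hypothetical):

    for_each_batch_add_order(eb, i)
            flush_batch(eb, i);

    /* wrapping the macro body in do { ... } while (0) would leave the for
     * header with no statement inside the block, so it could no longer
     * attach to the caller's loop body. */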


Right. Added the BUILD_BUG_ON here because I did have a bug where I used
an unsigned with this macro and that breaks the macro.
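
The bug class being guarded against (eb and submit_batch() are hypothetical):

    unsigned int i; /* BUG: must be a signed type for this macro */

    for (i = eb->num_batches - 1; i >= 0; --i)
            /* i >= 0 is always true for an unsigned type, so the loop
             * never terminates (i wraps to UINT_MAX past zero); the
             * BUILD_BUG_ON(!typecheck(int, _i)) in the macro makes the
             * compiler flag this at build time instead. */
            submit_batch(eb, i);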

Matt


-:371: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_i' - possible side-effects?
#371: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1856:

Re: [Intel-gfx] [PATCH 23/26] drm/i915: Make request conflict tracking understand parallel submits

2021-10-13 Thread John Harrison

On 10/13/2021 10:51, Matthew Brost wrote:

On Tue, Oct 12, 2021 at 03:08:05PM -0700, John Harrison wrote:

On 10/4/2021 15:06, Matthew Brost wrote:

If an object in the excl or shared slot is a composite fence from a
parallel submit and the current request in the conflict tracking is from
the same parallel context there is no need to enforce ordering as the
ordering already implicit. Make the request conflict tracking understand

ordering already -> ordering is already


this by comparing the parents parallel fence values and skipping the

parents -> parent's


conflict insertion if the values match.

Presumably, this is to cope with the fact that the parallel submit fences do
not look like regular submission fences. And hence the existing code that
says 'new fence belongs to same context as old fence, so safe to ignore'
does not work with parallel submission. However, this change does not appear
to be adding parallel submit support to an existing 'same context' check. It
seems to be a brand new check that does not exist for single submission.
What makes parallel submit different? If we aren't skipping same context
fences for single submits, why do we need it for parallel? Conversely, if we
need it for parallel then why don't we need it for single?

And if the single submission version is simply somewhere else in the code,
why do the parallel version here instead of at the same place?

John.


Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/i915/i915_request.c | 43 +++--
   1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index e9bfa32f9270..cf89624020ad 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1325,6 +1325,25 @@ i915_request_await_external(struct i915_request *rq, 
struct dma_fence *fence)
return err;
   }
+static inline bool is_parallel_rq(struct i915_request *rq)
+{
+   return intel_context_is_parallel(rq->context);
+}
+
+static inline struct intel_context *request_to_parent(struct i915_request *rq)
+{
+   return intel_context_to_parent(rq->context);
+}
+
+static bool is_same_parallel_context(struct i915_request *to,
+struct i915_request *from)
+{
+   if (is_parallel_rq(to))

Should this not say '&& is_parallel_rq(from)'?


Missed this one. That isn't necessary: if from is not a parallel
submit, the following compare of parents will always return false. I
could add it if you insist, as either way works.

Matt
It was more a question of whether req_to_parent() works fine 
irrespective of whether the rq is a parent, child or single?


John.




+   return request_to_parent(to) == request_to_parent(from);
+
+   return false;
+}
+
   int
   i915_request_await_execution(struct i915_request *rq,
 struct dma_fence *fence)
@@ -1356,11 +1375,14 @@ i915_request_await_execution(struct i915_request *rq,
 * want to run our callback in all cases.
 */
-   if (dma_fence_is_i915(fence))
+   if (dma_fence_is_i915(fence)) {
+   if (is_same_parallel_context(rq, to_request(fence)))
+   continue;
ret = __i915_request_await_execution(rq,
 to_request(fence));
-   else
+   } else {
ret = i915_request_await_external(rq, fence);
+   }
if (ret < 0)
return ret;
} while (--nchild);
@@ -1461,10 +1483,13 @@ i915_request_await_dma_fence(struct i915_request *rq, 
struct dma_fence *fence)
 fence))
continue;
-   if (dma_fence_is_i915(fence))
+   if (dma_fence_is_i915(fence)) {
+   if (is_same_parallel_context(rq, to_request(fence)))
+   continue;
ret = i915_request_await_request(rq, to_request(fence));
-   else
+   } else {
ret = i915_request_await_external(rq, fence);
+   }
if (ret < 0)
return ret;
@@ -1539,16 +1564,6 @@ i915_request_await_object(struct i915_request *to,
return ret;
   }
-static inline bool is_parallel_rq(struct i915_request *rq)
-{
-   return intel_context_is_parallel(rq->context);
-}
-
-static inline struct intel_context *request_to_parent(struct i915_request *rq)
-{
-   return intel_context_to_parent(rq->context);
-}
-
   static struct i915_request *
   __i915_request_ensure_parallel_ordering(struct i915_request *rq,
struct intel_timeline *timeline)



