[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: (near)atomic gamma LUT updates via vblank workers

2021-10-21 Thread Patchwork
== Series Details ==

Series: drm/i915: (near)atomic gamma LUT updates via vblank workers
URL   : https://patchwork.freedesktop.org/series/96089/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10767_full -> Patchwork_21399_full


Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_21399_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
- shard-kbl:  NOTRUN -> [DMESG-WARN][1] ([i915#3002])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-kbl7/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-persistence:
- shard-snb:  NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#1099]) +4 
similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-persistence.html

  * igt@gem_exec_fair@basic-none@rcs0:
- shard-kbl:  [PASS][3] -> [FAIL][4] ([i915#2842]) +1 similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-kbl7/igt@gem_exec_fair@basic-n...@rcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-kbl3/igt@gem_exec_fair@basic-n...@rcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
- shard-tglb: [PASS][5] -> [FAIL][6] ([i915#2842]) +5 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-tglb6/igt@gem_exec_fair@basic-p...@bcs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-tglb1/igt@gem_exec_fair@basic-p...@bcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-tglb: NOTRUN -> [FAIL][7] ([i915#2842])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-tglb6/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][8] -> [FAIL][9] ([i915#2849])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-iclb7/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_params@no-vebox:
- shard-skl:  NOTRUN -> [SKIP][10] ([fdo#109271]) +76 similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-skl4/igt@gem_exec_par...@no-vebox.html

  * igt@gem_exec_reloc@basic-wc-gtt-noreloc:
- shard-skl:  [PASS][11] -> [DMESG-WARN][12] ([i915#1982])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-skl9/igt@gem_exec_re...@basic-wc-gtt-noreloc.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-skl10/igt@gem_exec_re...@basic-wc-gtt-noreloc.html

  * igt@gem_userptr_blits@readonly-unsync:
- shard-tglb: NOTRUN -> [SKIP][13] ([i915#3297])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-tglb2/igt@gem_userptr_bl...@readonly-unsync.html

  * igt@gen9_exec_parse@allowed-single:
- shard-skl:  [PASS][14] -> [DMESG-WARN][15] ([i915#1436] / 
[i915#716])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-skl2/igt@gen9_exec_pa...@allowed-single.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-skl8/igt@gen9_exec_pa...@allowed-single.html

  * igt@gen9_exec_parse@basic-rejected:
- shard-iclb: NOTRUN -> [SKIP][16] ([i915#2856])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-iclb1/igt@gen9_exec_pa...@basic-rejected.html
- shard-tglb: NOTRUN -> [SKIP][17] ([i915#2856])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-tglb2/igt@gen9_exec_pa...@basic-rejected.html

  * igt@i915_pm_rpm@dpms-non-lpsp:
- shard-tglb: NOTRUN -> [SKIP][18] ([fdo#111644] / [i915#1397] / 
[i915#2411])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-tglb2/igt@i915_pm_...@dpms-non-lpsp.html

  * igt@kms_big_fb@linear-32bpp-rotate-0:
- shard-glk:  [PASS][19] -> [DMESG-WARN][20] ([i915#118]) +1 
similar issue
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-glk5/igt@kms_big...@linear-32bpp-rotate-0.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-glk1/igt@kms_big...@linear-32bpp-rotate-0.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-180-hflip:
- shard-apl:  NOTRUN -> [SKIP][21] ([fdo#109271] / [i915#3777])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-apl1/igt@kms_big...@x-tiled-max-hw-stride-32bpp-rotate-180-hflip.html

  * igt@kms_big_fb@x-tiled-max-hw-stride-64bpp-rotate-0-hflip:
- shard-kbl:  NOTRUN -> [SKIP][22] ([fdo#109271] / [i915#3777])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21399/shard-kbl6/igt@kms_big...@x-tiled-max-hw-stride-64bpp-rotate-0-hflip.html

  * igt

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/dp: Use DP2.0 DPCD 248h updated register/field names for DP PHY CTS

2021-10-21 Thread Patchwork
== Series Details ==

Series: drm/dp: Use DP2.0 DPCD 248h updated register/field names for DP PHY CTS
URL   : https://patchwork.freedesktop.org/series/96096/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10767_full -> Patchwork_21400_full


Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_21400_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
- shard-kbl:  NOTRUN -> [DMESG-WARN][1] ([i915#3002]) +1 similar 
issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-kbl4/igt@gem_cre...@create-massive.html
- shard-apl:  NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-apl3/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-cleanup:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-cleanup.html

  * igt@gem_exec_fair@basic-deadline:
- shard-glk:  [PASS][4] -> [FAIL][5] ([i915#2846])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-glk9/igt@gem_exec_f...@basic-deadline.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-glk8/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none@vcs0:
- shard-kbl:  [PASS][6] -> [FAIL][7] ([i915#2842]) +3 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-kbl7/igt@gem_exec_fair@basic-n...@vcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
- shard-tglb: [PASS][8] -> [FAIL][9] ([i915#2842]) +2 similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-tglb6/igt@gem_exec_fair@basic-p...@bcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-tglb6/igt@gem_exec_fair@basic-p...@bcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-tglb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-tglb1/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-iclb7/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_params@no-vebox:
- shard-skl:  NOTRUN -> [SKIP][13] ([fdo#109271]) +77 similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-skl4/igt@gem_exec_par...@no-vebox.html

  * igt@gem_userptr_blits@readonly-unsync:
- shard-tglb: NOTRUN -> [SKIP][14] ([i915#3297])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-tglb2/igt@gem_userptr_bl...@readonly-unsync.html

  * igt@gem_userptr_blits@vma-merge:
- shard-apl:  NOTRUN -> [FAIL][15] ([i915#3318])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-apl8/igt@gem_userptr_bl...@vma-merge.html

  * igt@gen9_exec_parse@basic-rejected:
- shard-iclb: NOTRUN -> [SKIP][16] ([i915#2856])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-iclb5/igt@gen9_exec_pa...@basic-rejected.html
- shard-tglb: NOTRUN -> [SKIP][17] ([i915#2856])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-tglb2/igt@gen9_exec_pa...@basic-rejected.html

  * igt@i915_module_load@reload-with-fault-injection:
- shard-skl:  [PASS][18] -> [DMESG-WARN][19] ([i915#1982])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-skl9/igt@i915_module_l...@reload-with-fault-injection.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-skl2/igt@i915_module_l...@reload-with-fault-injection.html

  * igt@i915_pm_rpm@dpms-non-lpsp:
- shard-tglb: NOTRUN -> [SKIP][20] ([fdo#111644] / [i915#1397] / 
[i915#2411])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-tglb2/igt@i915_pm_...@dpms-non-lpsp.html

  * igt@kms_async_flips@alternate-sync-async-flip:
- shard-skl:  [PASS][21] -> [FAIL][22] ([i915#2521])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10767/shard-skl10/igt@kms_async_fl...@alternate-sync-async-flip.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21400/shard-skl2/igt@kms_async_fl...@alternate-sync-async-flip.html

  * igt@kms_big_fb@linear-32bpp-rotate-0:
- shard-glk:  [PASS][23] -> [DMESG-WARN][24] ([i915#118])
   [23]: 
https://intel-gfx-ci.01.org/tr

[Intel-gfx] [PULL] drm-misc-fixes

2021-10-21 Thread Maarten Lankhorst
Hi Dave,

New drm-misc-fixes without the vc4 changes. I feel that needs some more 
discussion first.

drm-misc-fixes-2021-10-21-1:
drm-misc-fixes for v5.15-rc7:
- Rebased, to remove vc4 patches.
- Fix mxsfb crash on unload.
- Use correct sync parameters for Feixin K101-IM2BYL02.
- Assorted kmb modeset/atomic fixes.

The following changes since commit 519d81956ee277b4419c723adfb154603c2565ba:

  Linux 5.15-rc6 (2021-10-17 20:00:13 -1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-10-21-1

for you to fetch changes up to 74056092ff415e7e20ce2544689b32ee811c4f0b:

  drm/kmb: Enable ADV bridge after modeset (2021-10-21 11:08:09 +0200)

----------------------------------------------------------------
drm-misc-fixes for v5.15-rc7:
- Rebased, to remove vc4 patches.
- Fix mxsfb crash on unload.
- Use correct sync parameters for Feixin K101-IM2BYL02.
- Assorted kmb modeset/atomic fixes.

----------------------------------------------------------------
Anitha Chrisanthus (4):
  drm/kmb: Work around for higher system clock
  drm/kmb: Limit supported mode to 1080p
  drm/kmb: Corrected typo in handle_lcd_irq
  drm/kmb: Enable ADV bridge after modeset

Dan Johansen (1):
  drm/panel: ilitek-ili9881c: Fix sync for Feixin K101-IM2BYL02 panel

Edmund Dea (2):
  drm/kmb: Remove clearing DPHY regs
  drm/kmb: Disable change of plane parameters

Marek Vasut (1):
  drm: mxsfb: Fix NULL pointer dereference crash on unload

 drivers/gpu/drm/kmb/kmb_crtc.c| 41 +++--
 drivers/gpu/drm/kmb/kmb_drv.c |  2 +-
 drivers/gpu/drm/kmb/kmb_drv.h | 10 ++-
 drivers/gpu/drm/kmb/kmb_dsi.c | 25 +---
 drivers/gpu/drm/kmb/kmb_dsi.h |  2 +-
 drivers/gpu/drm/kmb/kmb_plane.c   | 43 ++-
 drivers/gpu/drm/kmb/kmb_plane.h   |  6 
 drivers/gpu/drm/mxsfb/mxsfb_drv.c |  6 +++-
 drivers/gpu/drm/panel/panel-ilitek-ili9881c.c | 12 
 9 files changed, 123 insertions(+), 24 deletions(-)


Re: [Intel-gfx] [RFC PATCH 1/4] drm/dp: Rename DPCD 248h according to DP 2.0 specs

2021-10-21 Thread Jani Nikula
On Wed, 20 Oct 2021, Khaled Almahallawy  wrote:
> DPCD 248h name was changed from “PHY_TEST_PATTERN” in DP 1.4 to 
> “LINK_QUAL_PATTERN_SELECT” in DP 2.0.

Please use ASCII double quotes ". Please reflow the commit message to
limit line lengths to about 72 characters.

> Also, DPCD 248h [6:0] is the same as DPCDs 10Bh/10Ch/10Dh/10Eh [6:0]. So 
> removed the repeated definition of PHY patterns.
>
> Reference: “DPCD 248h/10Bh/10Ch/10Dh/10Eh Name/Description Consistency”
> https://groups.vesa.org/wg/AllMem/documentComment/2738
>
> Signed-off-by: Khaled Almahallawy 
> ---
>  drivers/gpu/drm/drm_dp_helper.c |  6 +++---
>  include/drm/drm_dp_helper.h | 13 +++--
>  2 files changed, 6 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
> index ada0a1ff262d..c9c928c08026 100644
> --- a/drivers/gpu/drm/drm_dp_helper.c
> +++ b/drivers/gpu/drm/drm_dp_helper.c
> @@ -2489,19 +2489,19 @@ int drm_dp_get_phy_test_pattern(struct drm_dp_aux 
> *aux,
>   if (lanes & DP_ENHANCED_FRAME_CAP)
>   data->enhanced_frame_cap = true;
>  
> - err = drm_dp_dpcd_readb(aux, DP_PHY_TEST_PATTERN, &data->phy_pattern);
> + err = drm_dp_dpcd_readb(aux, DP_LINK_QUAL_PATTERN_SELECT, 
> &data->phy_pattern);
>   if (err < 0)
>   return err;
>  
>   switch (data->phy_pattern) {
> - case DP_PHY_TEST_PATTERN_80BIT_CUSTOM:
> + case DP_LINK_QUAL_PATTERN_80BIT_CUSTOM:
>   err = drm_dp_dpcd_read(aux, DP_TEST_80BIT_CUSTOM_PATTERN_7_0,
>  &data->custom80, sizeof(data->custom80));
>   if (err < 0)
>   return err;
>  
>   break;
> - case DP_PHY_TEST_PATTERN_CP2520:
> + case DP_LINK_QUAL_PATTERN_CP2520_PAT_1:
>   err = drm_dp_dpcd_read(aux, DP_TEST_HBR2_SCRAMBLER_RESET,
>  &data->hbr2_reset,
>  sizeof(data->hbr2_reset));
> diff --git a/include/drm/drm_dp_helper.h b/include/drm/drm_dp_helper.h
> index afdf7f4183f9..ef915bb75bb4 100644
> --- a/include/drm/drm_dp_helper.h
> +++ b/include/drm/drm_dp_helper.h
> @@ -862,16 +862,9 @@ struct drm_panel;
>  # define DP_TEST_CRC_SUPPORTED   (1 << 5)
>  # define DP_TEST_COUNT_MASK  0xf
>  
> -#define DP_PHY_TEST_PATTERN 0x248
> -# define DP_PHY_TEST_PATTERN_SEL_MASK   0x7
> -# define DP_PHY_TEST_PATTERN_NONE   0x0
> -# define DP_PHY_TEST_PATTERN_D10_2  0x1
> -# define DP_PHY_TEST_PATTERN_ERROR_COUNT0x2
> -# define DP_PHY_TEST_PATTERN_PRBS7  0x3
> -# define DP_PHY_TEST_PATTERN_80BIT_CUSTOM   0x4
> -# define DP_PHY_TEST_PATTERN_CP2520 0x5
> -
> -#define DP_PHY_SQUARE_PATTERN0x249
> +#define DP_LINK_QUAL_PATTERN_SELECT 0x248

Please add a comment here referencing where the values are. There are
examples in the file.
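
For illustration only, such a reference might look like the sketch below
(hypothetical wording; the exact comment style should follow the existing
examples in the file):

#define DP_LINK_QUAL_PATTERN_SELECT	    0x248
	/* Pattern values in [6:0] are shared with DPCD 10Bh-10Eh
	 * (DP_LINK_QUAL_LANEx_SET); see the VESA "DPCD 248h/10Bh/10Ch/10Dh/10Eh
	 * Name/Description Consistency" change referenced in the commit message.
	 */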

> +
> +#define DP_PHY_SQUARE_PATTERN   0x249
>  
>  #define DP_TEST_HBR2_SCRAMBLER_RESET0x24A
>  #define DP_TEST_80BIT_CUSTOM_PATTERN_7_00x250

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [RFC PATCH 0/4] drm/dp: Use DP2.0 DPCD 248h updated register/field names for DP PHY CTS

2021-10-21 Thread Jani Nikula
On Wed, 20 Oct 2021, Khaled Almahallawy  wrote:
> This series updates the DPCD 248h register name and PHY test pattern names
> to follow the DP 2.0 spec.
> It also updates the DP PHY CTS code of the affected drivers (i915, amd, msm).
> No functional changes expected.
>  
> Reference: “DPCD 248h/10Bh/10Ch/10Dh/10Eh Name/Description Consistency”
> https://groups.vesa.org/wg/AllMem/documentComment/2738

You can't do renames like this piece by piece. Every commit must build.

Incidentally, this is one of the reasons we often don't bother with
renames to follow spec changes, but rather stick to the original names.

However, in this case you could switch all drivers to the different test
pattern macros piece by piece, as they're already there.
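
Concretely, each driver commit can then make the kind of mechanical
replacement patch 1/4 already shows for drm_dp_helper.c (sketch, reusing the
DP_LINK_QUAL_PATTERN_* macros that are already present in drm_dp_helper.h):

-	case DP_PHY_TEST_PATTERN_CP2520:
+	case DP_LINK_QUAL_PATTERN_CP2520_PAT_1:

A final commit can then drop the old DP_PHY_TEST_PATTERN_* definitions once
no driver references them, so every intermediate commit still builds.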


BR,
Jani.


>
> Khaled Almahallawy (4):
>   drm/dp: Rename DPCD 248h according to DP 2.0 specs
>   drm/i915/dp: Use DP 2.0 LINK_QUAL_PATTERN_* Phy test pattern
> definitions
>   drm/amd/dc: Use DPCD 248h DP 2.0 new name
>   drm/msm/dp: Use DPCD 248h DP 2.0 new names/definitions
>
>  drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c |  2 +-
>  drivers/gpu/drm/drm_dp_helper.c  |  6 +++---
>  drivers/gpu/drm/i915/display/intel_dp.c  | 12 ++--
>  drivers/gpu/drm/msm/dp/dp_catalog.c  | 12 ++--
>  drivers/gpu/drm/msm/dp/dp_ctrl.c | 12 ++--
>  drivers/gpu/drm/msm/dp/dp_link.c | 16 
>  include/drm/drm_dp_helper.h  | 13 +++--
>  7 files changed, 33 insertions(+), 40 deletions(-)

-- 
Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] [PATCH 0/2] Selective fetch support for biplanar formats

2021-10-21 Thread Jouni Högander
These patches implement selective update configuration for biplanar formats.
The workaround that forced a full fetch for multi-planar formats is also
reverted.

Jouni Högander (2):
  drm/i915/display: Add initial selective fetch support for biplanar
formats
  Revert "drm/i915/display/psr: Do full fetch when handling multi-planar
formats"

 drivers/gpu/drm/i915/display/intel_psr.c | 34 +++-
 1 file changed, 27 insertions(+), 7 deletions(-)

-- 
2.25.1



[Intel-gfx] [PATCH 1/2] drm/i915/display: Add initial selective fetch support for biplanar formats

2021-10-21 Thread Jouni Högander
Biplanar formats use two planes (Y and UV). This patch adds handling of the
Y plane selective fetch area by utilizing the existing linked plane mechanism.
The UV plane Y offset configuration is also modified according to Bspec.
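
As a worked example of the Bspec rule quoted in the patch (UV surface start Y
= half of the Y plane start Y; illustrative numbers only, assuming a 4:2:0
biplanar format such as NV12 and a color plane base y of 0): if the selective
fetch clip starts at clip->y1 = 120, the Y plane offset is programmed with
y = 120, while the UV plane offset is programmed with y = 120 / 2 = 60,
matching the half-height UV surface.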

Signed-off-by: Jouni Högander 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 30 +---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 49c2dfbd4055..469bf95178f3 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -1467,10 +1467,19 @@ void intel_psr2_program_plane_sel_fetch(struct 
intel_plane *plane,
val |= plane_state->uapi.dst.x1;
intel_de_write_fw(dev_priv, PLANE_SEL_FETCH_POS(pipe, plane->id), val);
 
-   /* TODO: consider auxiliary surfaces */
-   x = plane_state->uapi.src.x1 >> 16;
-   y = (plane_state->uapi.src.y1 >> 16) + clip->y1;
+   x = plane_state->view.color_plane[color_plane].x;
+
+   /*
+* From Bspec: UV surface Start Y Position = half of Y plane Y
+* start position.
+*/
+   if (!color_plane)
+   y = plane_state->view.color_plane[color_plane].y + clip->y1;
+   else
+   y = plane_state->view.color_plane[color_plane].y + clip->y1 / 2;
+
val = y << 16 | x;
+
intel_de_write_fw(dev_priv, PLANE_SEL_FETCH_OFFSET(pipe, plane->id),
  val);
 
@@ -1700,6 +1709,7 @@ int intel_psr2_sel_fetch_update(struct intel_atomic_state 
*state,
for_each_oldnew_intel_plane_in_state(state, plane, old_plane_state,
 new_plane_state, i) {
struct drm_rect *sel_fetch_area, inter;
+   struct intel_plane *linked = 
new_plane_state->planar_linked_plane;
 
if (new_plane_state->uapi.crtc != crtc_state->uapi.crtc ||
!new_plane_state->uapi.visible)
@@ -1718,6 +1728,20 @@ int intel_psr2_sel_fetch_update(struct 
intel_atomic_state *state,
sel_fetch_area->y1 = inter.y1 - new_plane_state->uapi.dst.y1;
sel_fetch_area->y2 = inter.y2 - new_plane_state->uapi.dst.y1;
crtc_state->update_planes |= BIT(plane->id);
+
+   /*
+* Sel_fetch_area is calculated for UV plane. Use
+* same area for Y plane as well.
+*/
+   if (linked) {
+   struct intel_plane_state *linked_new_plane_state =
+ intel_atomic_get_new_plane_state(state, linked);
+   struct drm_rect *linked_sel_fetch_area =
+ &linked_new_plane_state->psr2_sel_fetch_area;
+
+   linked_sel_fetch_area->y1 = sel_fetch_area->y1;
+   linked_sel_fetch_area->y2 = sel_fetch_area->y2;
+   }
}
 
 skip_sel_fetch_set_loop:
-- 
2.25.1



[Intel-gfx] [PATCH 2/2] Revert "drm/i915/display/psr: Do full fetch when handling multi-planar formats"

2021-10-21 Thread Jouni Högander
This reverts commit 1f61f0655b95d5b89589390e6f83c4a61d9b1e8d.

Now we are supporting selective fetch for biplanar formats. We can revert WA
patch which forced using full fetch for biplanar formats.

Signed-off-by: Jouni Högander 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 469bf95178f3..65282a545dbf 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -1571,9 +1571,6 @@ static void intel_psr2_sel_fetch_pipe_alignment(const 
struct intel_crtc_state *c
  * also planes are not updated if they have a negative X
  * position so for now doing a full update in this cases
  *
- * TODO: We are missing multi-planar formats handling, until it is
- * implemented it will send full frame updates.
- *
  * Plane scaling and rotation is not supported by selective fetch and both
  * properties can change without a modeset, so need to be check at every
  * atomic commmit.
@@ -1583,7 +1580,6 @@ static bool psr2_sel_fetch_plane_state_supported(const 
struct intel_plane_state
if (plane_state->uapi.dst.y1 < 0 ||
plane_state->uapi.dst.x1 < 0 ||
plane_state->scaler_id >= 0 ||
-   plane_state->hw.fb->format->num_planes > 1 ||
plane_state->uapi.rotation != DRM_MODE_ROTATE_0)
return false;
 
-- 
2.25.1



Re: [Intel-gfx] [PATCH v4 01/11] drm/i915: Add a table with a descriptor for all i915 modifiers

2021-10-21 Thread Jani Nikula
On Wed, 20 Oct 2021, Imre Deak  wrote:
> Add a table describing all the framebuffer modifiers used by i915 at one
> place. This has the benefit of deduplicating the listing of supported
> modifiers for each platform and checking the support of these modifiers
> on a given plane. This also simplifies in a similar way getting some
> attribute for a modifier, for instance checking if the modifier is a
> CCS modifier type.
>
> While at it drop the cursor plane filtering from skl_plane_has_rc_ccs(),
> as the cursor plane is registered with DRM core elsewhere.
>
> v1: Unchanged.
> v2:
> - Keep the plane caps calculation in the plane code and pass an enum
>   with these caps to intel_fb_get_modifiers(). (Ville)
> - Get the modifiers calling intel_fb_get_modifiers() in i9xx_plane.c as
>   well.
> v3:
> - s/.id/.modifier/ (Ville)
> - Keep modifier_desc vs. plane_cap filter conditions consistent. (Ville)
> - Drop redundant cursor plane check from skl_plane_has_rc_ccs(). (Ville)
> - Use from, until display version fields in modifier_desc instead of a mask. 
> (Jani)
> - Unexport struct intel_modifier_desc, separate its decl and init. (Jani)
> - Remove enum pipe, plane_id forward decls from intel_fb.h, which are
>   not needed after v2.
> v4:
> - Reuse IS_DISPLAY_VER() instead of open-coding it. (Jani)
> - Preserve the current modifier order exposed to user space. (Ville)
>
> Cc: Ville Syrjälä 
> Cc: Juha-Pekka Heikkila 
> Cc: Jani Nikula 
> Signed-off-by: Imre Deak 
> Reviewed-by: Juha-Pekka Heikkila  (v3)
> ---
>  drivers/gpu/drm/i915/display/i9xx_plane.c |  30 +--
>  drivers/gpu/drm/i915/display/intel_cursor.c   |  19 +-
>  .../drm/i915/display/intel_display_types.h|   1 -
>  drivers/gpu/drm/i915/display/intel_fb.c   | 152 +++
>  drivers/gpu/drm/i915/display/intel_fb.h   |  13 ++
>  drivers/gpu/drm/i915/display/intel_sprite.c   |  35 +---
>  drivers/gpu/drm/i915/display/skl_scaler.c |   1 +
>  .../drm/i915/display/skl_universal_plane.c| 178 +-
>  8 files changed, 245 insertions(+), 184 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c 
> b/drivers/gpu/drm/i915/display/i9xx_plane.c
> index b1439ba78f67b..a939accff7ee2 100644
> --- a/drivers/gpu/drm/i915/display/i9xx_plane.c
> +++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
> @@ -60,22 +60,11 @@ static const u32 vlv_primary_formats[] = {
>   DRM_FORMAT_XBGR16161616F,
>  };
>  
> -static const u64 i9xx_format_modifiers[] = {
> - I915_FORMAT_MOD_X_TILED,
> - DRM_FORMAT_MOD_LINEAR,
> - DRM_FORMAT_MOD_INVALID
> -};
> -
>  static bool i8xx_plane_format_mod_supported(struct drm_plane *_plane,
>   u32 format, u64 modifier)
>  {
> - switch (modifier) {
> - case DRM_FORMAT_MOD_LINEAR:
> - case I915_FORMAT_MOD_X_TILED:
> - break;
> - default:
> + if (!intel_fb_plane_supports_modifier(to_intel_plane(_plane), modifier))
>   return false;
> - }
>  
>   switch (format) {
>   case DRM_FORMAT_C8:
> @@ -92,13 +81,8 @@ static bool i8xx_plane_format_mod_supported(struct 
> drm_plane *_plane,
>  static bool i965_plane_format_mod_supported(struct drm_plane *_plane,
>   u32 format, u64 modifier)
>  {
> - switch (modifier) {
> - case DRM_FORMAT_MOD_LINEAR:
> - case I915_FORMAT_MOD_X_TILED:
> - break;
> - default:
> + if (!intel_fb_plane_supports_modifier(to_intel_plane(_plane), modifier))
>   return false;
> - }
>  
>   switch (format) {
>   case DRM_FORMAT_C8:
> @@ -768,6 +752,7 @@ intel_primary_plane_create(struct drm_i915_private 
> *dev_priv, enum pipe pipe)
>   struct intel_plane *plane;
>   const struct drm_plane_funcs *plane_funcs;
>   unsigned int supported_rotations;
> + const u64 *modifiers;
>   const u32 *formats;
>   int num_formats;
>   int ret, zpos;
> @@ -875,21 +860,26 @@ intel_primary_plane_create(struct drm_i915_private 
> *dev_priv, enum pipe pipe)
>   plane->disable_flip_done = ilk_primary_disable_flip_done;
>   }
>  
> + modifiers = intel_fb_plane_get_modifiers(dev_priv, PLANE_HAS_TILING);
> +
>   if (DISPLAY_VER(dev_priv) >= 5 || IS_G4X(dev_priv))
>   ret = drm_universal_plane_init(&dev_priv->drm, &plane->base,
>  0, plane_funcs,
>  formats, num_formats,
> -i9xx_format_modifiers,
> +modifiers,
>  DRM_PLANE_TYPE_PRIMARY,
>  "primary %c", pipe_name(pipe));
>   else
>   ret = drm_universal_plane_init(&dev_priv->drm, &plane->base,
>  0, plane_funcs,
>  formats, num_format

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Selective fetch support for biplanar formats

2021-10-21 Thread Patchwork
== Series Details ==

Series: Selective fetch support for biplanar formats
URL   : https://patchwork.freedesktop.org/series/96113/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
c3d07cfdcd88 drm/i915/display: Add initial selective fetch support for biplanar 
formats
7164a9803ffc Revert "drm/i915/display/psr: Do full fetch when handling 
multi-planar formats"
-:12: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#12: 
Now we are supporting selective fetch for biplanar formats. We can revert WA

total: 0 errors, 1 warnings, 0 checks, 16 lines checked




Re: [Intel-gfx] [PATCH 1/2] drm/i915/display: Add initial selective fetch support for biplanar formats

2021-10-21 Thread Jani Nikula
On Thu, 21 Oct 2021, Jouni Högander  wrote:
> Biplanar formats use two planes (Y and UV). This patch adds handling of the
> Y plane selective fetch area by utilizing the existing linked plane mechanism.
> The UV plane Y offset configuration is also modified according to Bspec.

FYI, it's fine to add the bspec reference as a tag in the commit
message, e.g.

Bspec: 12345

See git log --grep="^Bspec:" for examples.

No need to resend for this.

BR,
Jani.

>
> Signed-off-by: Jouni Högander 
> ---
>  drivers/gpu/drm/i915/display/intel_psr.c | 30 +---
>  1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
> b/drivers/gpu/drm/i915/display/intel_psr.c
> index 49c2dfbd4055..469bf95178f3 100644
> --- a/drivers/gpu/drm/i915/display/intel_psr.c
> +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> @@ -1467,10 +1467,19 @@ void intel_psr2_program_plane_sel_fetch(struct 
> intel_plane *plane,
>   val |= plane_state->uapi.dst.x1;
>   intel_de_write_fw(dev_priv, PLANE_SEL_FETCH_POS(pipe, plane->id), val);
>  
> - /* TODO: consider auxiliary surfaces */
> - x = plane_state->uapi.src.x1 >> 16;
> - y = (plane_state->uapi.src.y1 >> 16) + clip->y1;
> + x = plane_state->view.color_plane[color_plane].x;
> +
> + /*
> +  * From Bspec: UV surface Start Y Position = half of Y plane Y
> +  * start position.
> +  */
> + if (!color_plane)
> + y = plane_state->view.color_plane[color_plane].y + clip->y1;
> + else
> + y = plane_state->view.color_plane[color_plane].y + clip->y1 / 2;
> +
>   val = y << 16 | x;
> +
>   intel_de_write_fw(dev_priv, PLANE_SEL_FETCH_OFFSET(pipe, plane->id),
> val);
>  
> @@ -1700,6 +1709,7 @@ int intel_psr2_sel_fetch_update(struct 
> intel_atomic_state *state,
>   for_each_oldnew_intel_plane_in_state(state, plane, old_plane_state,
>new_plane_state, i) {
>   struct drm_rect *sel_fetch_area, inter;
> + struct intel_plane *linked = 
> new_plane_state->planar_linked_plane;
>  
>   if (new_plane_state->uapi.crtc != crtc_state->uapi.crtc ||
>   !new_plane_state->uapi.visible)
> @@ -1718,6 +1728,20 @@ int intel_psr2_sel_fetch_update(struct 
> intel_atomic_state *state,
>   sel_fetch_area->y1 = inter.y1 - new_plane_state->uapi.dst.y1;
>   sel_fetch_area->y2 = inter.y2 - new_plane_state->uapi.dst.y1;
>   crtc_state->update_planes |= BIT(plane->id);
> +
> + /*
> +  * Sel_fetch_area is calculated for UV plane. Use
> +  * same area for Y plane as well.
> +  */
> + if (linked) {
> + struct intel_plane_state *linked_new_plane_state =
> +   intel_atomic_get_new_plane_state(state, linked);
> + struct drm_rect *linked_sel_fetch_area =
> +   &linked_new_plane_state->psr2_sel_fetch_area;
> +
> + linked_sel_fetch_area->y1 = sel_fetch_area->y1;
> + linked_sel_fetch_area->y2 = sel_fetch_area->y2;
> + }
>   }
>  
>  skip_sel_fetch_set_loop:

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [PATCH 1/4] drm/i915: Move function prototypes to the correct header

2021-10-21 Thread Jani Nikula
On Thu, 21 Oct 2021, Ville Syrjala  wrote:
> From: Ville Syrjälä 
>
> A bunch of function prototypes were left behind when the
> plane/crtc code got reshuffled to new files. Move the
> prototypes as well.

Reviewed-by: Jani Nikula 


>
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/display/intel_crtc.h   | 5 +
>  drivers/gpu/drm/i915/display/intel_psr.c| 2 +-
>  drivers/gpu/drm/i915/display/intel_sprite.h | 4 
>  3 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_crtc.h 
> b/drivers/gpu/drm/i915/display/intel_crtc.h
> index a5ae997581aa..22363fbbc925 100644
> --- a/drivers/gpu/drm/i915/display/intel_crtc.h
> +++ b/drivers/gpu/drm/i915/display/intel_crtc.h
> @@ -9,10 +9,13 @@
>  #include 
>  
>  enum pipe;
> +struct drm_display_mode;
>  struct drm_i915_private;
>  struct intel_crtc;
>  struct intel_crtc_state;
>  
> +int intel_usecs_to_scanlines(const struct drm_display_mode *adjusted_mode,
> +  int usecs);
>  u32 intel_crtc_max_vblank_count(const struct intel_crtc_state *crtc_state);
>  int intel_crtc_init(struct drm_i915_private *dev_priv, enum pipe pipe);
>  struct intel_crtc_state *intel_crtc_state_alloc(struct intel_crtc *crtc);
> @@ -21,5 +24,7 @@ void intel_crtc_state_reset(struct intel_crtc_state 
> *crtc_state,
>  u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc);
>  void intel_crtc_vblank_on(const struct intel_crtc_state *crtc_state);
>  void intel_crtc_vblank_off(const struct intel_crtc_state *crtc_state);
> +void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state);
> +void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state);
>  
>  #endif
> diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
> b/drivers/gpu/drm/i915/display/intel_psr.c
> index 49c2dfbd4055..ccffe05784d3 100644
> --- a/drivers/gpu/drm/i915/display/intel_psr.c
> +++ b/drivers/gpu/drm/i915/display/intel_psr.c
> @@ -28,13 +28,13 @@
>  
>  #include "i915_drv.h"
>  #include "intel_atomic.h"
> +#include "intel_crtc.h"
>  #include "intel_de.h"
>  #include "intel_display_types.h"
>  #include "intel_dp_aux.h"
>  #include "intel_hdmi.h"
>  #include "intel_psr.h"
>  #include "intel_snps_phy.h"
> -#include "intel_sprite.h"
>  #include "skl_universal_plane.h"
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/display/intel_sprite.h 
> b/drivers/gpu/drm/i915/display/intel_sprite.h
> index c085eb87705c..4f63e4967731 100644
> --- a/drivers/gpu/drm/i915/display/intel_sprite.h
> +++ b/drivers/gpu/drm/i915/display/intel_sprite.h
> @@ -27,14 +27,10 @@ struct intel_plane_state;
>  #define VBLANK_EVASION_TIME_US 100
>  #endif
>  
> -int intel_usecs_to_scanlines(const struct drm_display_mode *adjusted_mode,
> -  int usecs);
>  struct intel_plane *intel_sprite_plane_create(struct drm_i915_private 
> *dev_priv,
> enum pipe pipe, int plane);
>  int intel_sprite_set_colorkey_ioctl(struct drm_device *dev, void *data,
>   struct drm_file *file_priv);
> -void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state);
> -void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state);
>  int intel_plane_check_src_coordinates(struct intel_plane_state *plane_state);
>  int chv_plane_check_rotation(const struct intel_plane_state *plane_state);

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [PATCH 3/4] drm/i915: Use vblank workers for gamma updates

2021-10-21 Thread Jani Nikula
On Thu, 21 Oct 2021, Ville Syrjala  wrote:
> From: Ville Syrjälä 
>
> The pipe gamma registers are single buffered so they should only
> be updated during the vblank to avoid screen tearing. In fact they
> really should only be updated between start of vblank and frame
> start because that is the only time the pipe is guaranteed to be
> empty. Already at frame start the pipe begins to fill up with
> data for the next frame.
>
> Unfortunately frame start happens ~1 scanline after the start
> of vblank which in practice doesn't always leave us enough time to
> finish the gamma update in time (gamma LUTs can be several KiB of
> data we have to bash into the registers). However we must try our
> best and so we'll add a vblank work for each pipe from where we
> can do the gamma update. Additionally we could consider pushing
> frame start forward to the max of ~4 scanlines after start of
> vblank. But not sure that's exactly a validated configuration.
> As it stands the ~100 first pixels tend to make it through with
> the old gamma values.
>
> Even though the vblank worker is running on a high priority thread
> we still have to contend with C-states. If the CPU happens to be in
> a deep C-state when the vblank interrupt arrives even the irq
> handler gets delayed massively (I've observed dozens of scanlines
> worth of latency). To avoid that problem we'll use the qos mechanism
> to keep the CPU awake while the vblank work is scheduled.
>
> With all this hooked up we can finally enjoy near atomic gamma
> updates. It even works across several pipes from the same atomic
> commit which previously was a total fail because we did the
> gamma updates for each pipe serially after waiting for all
> pipes to have latched the double buffered registers.
>
> In the future the DSB should take over this responsibility
> which will hopefully avoid some of these issues.
>
> Kudos to Lyude for finishing the actual vblank workers.
> Works like the proverbial train toilet.
>
> v2: Add missing intel_atomic_state fwd declaration
> v3: Clean up properly when not scheduling the worker
> v4: Clean up the rest and add tracepoints
>
> CC: Lyude Paul 
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/display/intel_crtc.c | 76 ++-
>  drivers/gpu/drm/i915/display/intel_crtc.h |  4 +-
>  drivers/gpu/drm/i915/display/intel_display.c  |  9 +--
>  .../drm/i915/display/intel_display_types.h|  8 ++
>  drivers/gpu/drm/i915/i915_trace.h | 42 ++
>  5 files changed, 129 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c 
> b/drivers/gpu/drm/i915/display/intel_crtc.c
> index 0f8b48b6911c..4758c61adae8 100644
> --- a/drivers/gpu/drm/i915/display/intel_crtc.c
> +++ b/drivers/gpu/drm/i915/display/intel_crtc.c
> @@ -3,12 +3,14 @@
>   * Copyright © 2020 Intel Corporation
>   */
>  #include 
> +#include 
>  #include 
>  
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "i915_trace.h"
>  #include "i915_vgpu.h"
> @@ -167,6 +169,8 @@ static void intel_crtc_destroy(struct drm_crtc *_crtc)
>  {
>   struct intel_crtc *crtc = to_intel_crtc(_crtc);
>  
> + cpu_latency_qos_remove_request(&crtc->vblank_pm_qos);
> +
>   drm_crtc_cleanup(&crtc->base);
>   kfree(crtc);
>  }
> @@ -344,6 +348,8 @@ int intel_crtc_init(struct drm_i915_private *dev_priv, 
> enum pipe pipe)
>  
>   intel_crtc_crc_init(crtc);
>  
> + cpu_latency_qos_add_request(&crtc->vblank_pm_qos, PM_QOS_DEFAULT_VALUE);
> +
>   drm_WARN_ON(&dev_priv->drm, drm_crtc_index(&crtc->base) != crtc->pipe);
>  
>   return 0;
> @@ -354,6 +360,65 @@ int intel_crtc_init(struct drm_i915_private *dev_priv, 
> enum pipe pipe)
>   return ret;
>  }
>  
> +static bool intel_crtc_needs_vblank_work(const struct intel_crtc_state 
> *crtc_state)
> +{
> + return crtc_state->hw.active &&
> + !intel_crtc_needs_modeset(crtc_state) &&
> + !crtc_state->preload_luts &&
> + (crtc_state->uapi.color_mgmt_changed ||
> +  crtc_state->update_pipe);
> +}
> +
> +static void intel_crtc_vblank_work(struct kthread_work *base)
> +{
> + struct drm_vblank_work *work = to_drm_vblank_work(base);
> + struct intel_crtc_state *crtc_state =
> + container_of(work, typeof(*crtc_state), vblank_work);
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> +
> + trace_intel_crtc_vblank_work_start(crtc);
> +
> + intel_color_load_luts(crtc_state);
> +
> + if (crtc_state->uapi.event) {
> + spin_lock_irq(&crtc->base.dev->event_lock);
> + drm_crtc_send_vblank_event(&crtc->base, crtc_state->uapi.event);
> + crtc_state->uapi.event = NULL;
> + spin_unlock_irq(&crtc->base.dev->event_lock);
> + }
> +
> + trace_intel_crtc_vblank_work_end(crtc);
> +}
> +
> +static void intel_crtc_vblank_work_init(struct intel_crtc_state *crtc_state)
> +{
> + struct intel

[Intel-gfx] [PATCH 04/28] drm/i915: Remove unused bits of i915_vma/active api

2021-10-21 Thread Maarten Lankhorst
While reworking the code to move the eviction fence to the object, it became
clear that the best code is removed code.

Remove some functions that are unused, and change a function's definition
if it is only used in one place.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_active.c | 28 +++-
 drivers/gpu/drm/i915/i915_active.h | 17 +
 drivers/gpu/drm/i915/i915_vma.c|  2 +-
 drivers/gpu/drm/i915/i915_vma.h|  2 --
 4 files changed, 5 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_active.c 
b/drivers/gpu/drm/i915/i915_active.c
index 3103c1e1fd14..ee2b3a375362 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -426,8 +426,9 @@ replace_barrier(struct i915_active *ref, struct 
i915_active_fence *active)
return true;
 }
 
-int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence)
+int i915_active_add_request(struct i915_active *ref, struct i915_request *rq)
 {
+   struct dma_fence *fence = &rq->fence;
struct i915_active_fence *active;
int err;
 
@@ -436,7 +437,7 @@ int i915_active_ref(struct i915_active *ref, u64 idx, 
struct dma_fence *fence)
if (err)
return err;
 
-   active = active_instance(ref, idx);
+   active = active_instance(ref, i915_request_timeline(rq)->fence_context);
if (!active) {
err = -ENOMEM;
goto out;
@@ -477,29 +478,6 @@ __i915_active_set_fence(struct i915_active *ref,
return prev;
 }
 
-static struct i915_active_fence *
-__active_fence(struct i915_active *ref, u64 idx)
-{
-   struct active_node *it;
-
-   it = __active_lookup(ref, idx);
-   if (unlikely(!it)) { /* Contention with parallel tree builders! */
-   spin_lock_irq(&ref->tree_lock);
-   it = __active_lookup(ref, idx);
-   spin_unlock_irq(&ref->tree_lock);
-   }
-   GEM_BUG_ON(!it); /* slot must be preallocated */
-
-   return &it->base;
-}
-
-struct dma_fence *
-__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence)
-{
-   /* Only valid while active, see i915_active_acquire_for_context() */
-   return __i915_active_set_fence(ref, __active_fence(ref, idx), fence);
-}
-
 struct dma_fence *
 i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f)
 {
diff --git a/drivers/gpu/drm/i915/i915_active.h 
b/drivers/gpu/drm/i915/i915_active.h
index 5fcdb0e2bc9e..7eb44132183a 100644
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -164,26 +164,11 @@ void __i915_active_init(struct i915_active *ref,
__i915_active_init(ref, active, retire, flags, &__mkey, &__wkey);   
\
 } while (0)
 
-struct dma_fence *
-__i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence);
-int i915_active_ref(struct i915_active *ref, u64 idx, struct dma_fence *fence);
-
-static inline int
-i915_active_add_request(struct i915_active *ref, struct i915_request *rq)
-{
-   return i915_active_ref(ref,
-  i915_request_timeline(rq)->fence_context,
-  &rq->fence);
-}
+int i915_active_add_request(struct i915_active *ref, struct i915_request *rq);
 
 struct dma_fence *
 i915_active_set_exclusive(struct i915_active *ref, struct dma_fence *f);
 
-static inline bool i915_active_has_exclusive(struct i915_active *ref)
-{
-   return rcu_access_pointer(ref->excl.fence);
-}
-
 int __i915_active_wait(struct i915_active *ref, int state);
 static inline int i915_active_wait(struct i915_active *ref)
 {
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 90546fa58fc1..1187f1956c20 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1220,7 +1220,7 @@ __i915_request_await_bind(struct i915_request *rq, struct 
i915_vma *vma)
return __i915_request_await_exclusive(rq, &vma->active);
 }
 
-int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request *rq)
+static int __i915_vma_move_to_active(struct i915_vma *vma, struct i915_request 
*rq)
 {
int err;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 648dbe744c96..b882fd7b5f99 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -55,8 +55,6 @@ static inline bool i915_vma_is_active(const struct i915_vma 
*vma)
 /* do not reserve memory to prevent deadlocks */
 #define __EXEC_OBJECT_NO_RESERVE BIT(31)
 
-int __must_check __i915_vma_move_to_active(struct i915_vma *vma,
-  struct i915_request *rq);
 int __must_check _i915_vma_move_to_active(struct i915_vma *vma,
  struct i915_request *rq,
  struct dma_fence *fence,
-- 
2.33.0



[Intel-gfx] [PATCH 02/28] drm/i915: use new iterator in i915_gem_object_wait_reservation

2021-10-21 Thread Maarten Lankhorst
From: Christian König 

Simplifying the code a bit.

Signed-off-by: Christian König 
[mlankhorst: Handle timeout = 0 correctly, use new i915_request_wait_timeout.]
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_wait.c | 65 
 1 file changed, 20 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index f909aaa09d9c..840c13706999 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -25,7 +25,7 @@ i915_gem_object_wait_fence(struct dma_fence *fence,
return timeout;
 
if (dma_fence_is_i915(fence))
-   return i915_request_wait(to_request(fence), flags, timeout);
+   return i915_request_wait_timeout(to_request(fence), flags, 
timeout);
 
return dma_fence_wait_timeout(fence,
  flags & I915_WAIT_INTERRUPTIBLE,
@@ -37,58 +37,29 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
 unsigned int flags,
 long timeout)
 {
-   struct dma_fence *excl;
-   bool prune_fences = false;
-
-   if (flags & I915_WAIT_ALL) {
-   struct dma_fence **shared;
-   unsigned int count, i;
-   int ret;
-
-   ret = dma_resv_get_fences(resv, &excl, &count, &shared);
-   if (ret)
-   return ret;
-
-   for (i = 0; i < count; i++) {
-   timeout = i915_gem_object_wait_fence(shared[i],
-flags, timeout);
-   if (timeout < 0)
-   break;
-
-   dma_fence_put(shared[i]);
-   }
-
-   for (; i < count; i++)
-   dma_fence_put(shared[i]);
-   kfree(shared);
+   struct dma_resv_iter cursor;
+   struct dma_fence *fence;
+   long ret = timeout ?: 1;
+
+   dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL);
+   dma_resv_for_each_fence_unlocked(&cursor, fence) {
+   ret = i915_gem_object_wait_fence(fence, flags, timeout);
+   if (ret <= 0)
+   break;
 
-   /*
-* If both shared fences and an exclusive fence exist,
-* then by construction the shared fences must be later
-* than the exclusive fence. If we successfully wait for
-* all the shared fences, we know that the exclusive fence
-* must all be signaled. If all the shared fences are
-* signaled, we can prune the array and recover the
-* floating references on the fences/requests.
-*/
-   prune_fences = count && timeout >= 0;
-   } else {
-   excl = dma_resv_get_excl_unlocked(resv);
+   if (timeout)
+   timeout = ret;
}
-
-   if (excl && timeout >= 0)
-   timeout = i915_gem_object_wait_fence(excl, flags, timeout);
-
-   dma_fence_put(excl);
+   dma_resv_iter_end(&cursor);
 
/*
 * Opportunistically prune the fences iff we know they have *all* been
 * signaled.
 */
-   if (prune_fences)
+   if (timeout > 0)
dma_resv_prune(resv);
 
-   return timeout;
+   return ret;
 }
 
 static void fence_set_priority(struct dma_fence *fence,
@@ -196,7 +167,11 @@ i915_gem_object_wait(struct drm_i915_gem_object *obj,
 
timeout = i915_gem_object_wait_reservation(obj->base.resv,
   flags, timeout);
-   return timeout < 0 ? timeout : 0;
+
+   if (timeout < 0)
+   return timeout;
+
+   return !timeout ? -ETIME : 0;
 }
 
 static inline unsigned long nsecs_to_jiffies_timeout(const u64 n)
-- 
2.33.0



[Intel-gfx] [PATCH 06/28] drm/i915: Remove gen6_ppgtt_unpin_all

2021-10-21 Thread Maarten Lankhorst
gen6_ppgtt_unpin_all is unused, kill it.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Matthew Auld 
---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 11 ---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.h |  1 -
 2 files changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 890191f286e3..9fdbd9d3372b 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -405,17 +405,6 @@ void gen6_ppgtt_unpin(struct i915_ppgtt *base)
i915_vma_unpin(ppgtt->vma);
 }
 
-void gen6_ppgtt_unpin_all(struct i915_ppgtt *base)
-{
-   struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(base);
-
-   if (!atomic_read(&ppgtt->pin_count))
-   return;
-
-   i915_vma_unpin(ppgtt->vma);
-   atomic_set(&ppgtt->pin_count, 0);
-}
-
 struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
 {
struct i915_ggtt * const ggtt = gt->ggtt;
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
index 6a61a5c3a85a..ab0eecb086dd 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.h
@@ -71,7 +71,6 @@ static inline struct gen6_ppgtt *to_gen6_ppgtt(struct 
i915_ppgtt *base)
 
 int gen6_ppgtt_pin(struct i915_ppgtt *base, struct i915_gem_ww_ctx *ww);
 void gen6_ppgtt_unpin(struct i915_ppgtt *base);
-void gen6_ppgtt_unpin_all(struct i915_ppgtt *base);
 void gen6_ppgtt_enable(struct intel_gt *gt);
 void gen7_ppgtt_enable(struct intel_gt *gt);
 struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt);
-- 
2.33.0



[Intel-gfx] [PATCH 01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Maarten Lankhorst
The i915_request fence wait behaves differently for timeout = 0
compared to expected dma-fence behavior.

i915 behavior:
- Unsignaled: -ETIME
- Signaled: 0 (= timeout)

Expected:
- Unsignaled: 0
- Signaled: 1

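A minimal illustration of the resulting split for a non-blocking check
(caller code is a sketch, not part of the patch):

	/* Non-blocking probe; i915 semantics are kept for existing callers: */
	if (i915_request_wait(rq, 0, 0) == -ETIME)
		;	/* request not yet signaled */

	/*
	 * i915_request_wait_timeout(rq, flags, 0) instead returns 1 once the
	 * request has signaled, matching dma-fence expectations; dma-fence
	 * callbacks such as i915_fence_wait() use this variant.
	 */
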
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_request.c | 57 -
 drivers/gpu/drm/i915/i915_request.h |  5 +++
 2 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 820a1f38b271..42cd17357771 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -96,9 +96,9 @@ static signed long i915_fence_wait(struct dma_fence *fence,
   bool interruptible,
   signed long timeout)
 {
-   return i915_request_wait(to_request(fence),
-interruptible | I915_WAIT_PRIORITY,
-timeout);
+   return i915_request_wait_timeout(to_request(fence),
+interruptible | I915_WAIT_PRIORITY,
+timeout);
 }
 
 struct kmem_cache *i915_request_slab_cache(void)
@@ -1857,23 +1857,27 @@ static void request_wait_wake(struct dma_fence *fence, 
struct dma_fence_cb *cb)
 }
 
 /**
- * i915_request_wait - wait until execution of request has finished
+ * i915_request_wait_timeout - wait until execution of request has finished
  * @rq: the request to wait upon
  * @flags: how to wait
  * @timeout: how long to wait in jiffies
  *
- * i915_request_wait() waits for the request to be completed, for a
+ * i915_request_wait_timeout() waits for the request to be completed, for a
  * maximum of @timeout jiffies (with MAX_SCHEDULE_TIMEOUT implying an
  * unbounded wait).
  *
  * Returns the remaining time (in jiffies) if the request completed, which may
- * be zero or -ETIME if the request is unfinished after the timeout expires.
+ * be zero if the request is unfinished after the timeout expires.
+ * If the timeout is 0, it will return 1 if the fence is signaled.
+ *
  * May return -EINTR is called with I915_WAIT_INTERRUPTIBLE and a signal is
  * pending before the request completes.
+ *
+ * NOTE: This function has the same wait semantics as dma-fence.
  */
-long i915_request_wait(struct i915_request *rq,
-  unsigned int flags,
-  long timeout)
+long i915_request_wait_timeout(struct i915_request *rq,
+  unsigned int flags,
+  long timeout)
 {
const int state = flags & I915_WAIT_INTERRUPTIBLE ?
TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
@@ -1883,7 +1887,7 @@ long i915_request_wait(struct i915_request *rq,
GEM_BUG_ON(timeout < 0);
 
if (dma_fence_is_signaled(&rq->fence))
-   return timeout;
+   return timeout ?: 1;
 
if (!timeout)
return -ETIME;
@@ -1992,6 +1996,39 @@ long i915_request_wait(struct i915_request *rq,
return timeout;
 }
 
+/**
+ * i915_request_wait - wait until execution of request has finished
+ * @rq: the request to wait upon
+ * @flags: how to wait
+ * @timeout: how long to wait in jiffies
+ *
+ * i915_request_wait() waits for the request to be completed, for a
+ * maximum of @timeout jiffies (with MAX_SCHEDULE_TIMEOUT implying an
+ * unbounded wait).
+ *
+ * Returns the remaining time (in jiffies) if the request completed, which may
+ * be zero or -ETIME if the request is unfinished after the timeout expires.
+ * May return -EINTR is called with I915_WAIT_INTERRUPTIBLE and a signal is
+ * pending before the request completes.
+ *
+ * NOTE: This function behaves differently from dma-fence wait semantics for
+ * timeout = 0. It returns 0 on success, and -ETIME if not signaled.
+ */
+long i915_request_wait(struct i915_request *rq,
+  unsigned int flags,
+  long timeout)
+{
+   long ret = i915_request_wait_timeout(rq, flags, timeout);
+
+   if (!ret)
+   return -ETIME;
+
+   if (ret > 0 && !timeout)
+   return 0;
+
+   return ret;
+}
+
 static int print_sched_attr(const struct i915_sched_attr *attr,
char *buf, int x, int len)
 {
diff --git a/drivers/gpu/drm/i915/i915_request.h 
b/drivers/gpu/drm/i915/i915_request.h
index dc359242d1ae..3c6e8acd1457 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -414,6 +414,11 @@ void i915_request_unsubmit(struct i915_request *request);
 
 void i915_request_cancel(struct i915_request *rq, int error);
 
+long i915_request_wait_timeout(struct i915_request *rq,
+  unsigned int flags,
+  long timeout)
+   __attribute__((nonnull(1)));
+
 long i915_request_wait(struct i915_request *rq,
   unsigned int flags,
   

[Intel-gfx] [PATCH 08/28] drm/i915: Create a full object for mock_ring, v2.

2021-10-21 Thread Maarten Lankhorst
This allows us to finally get rid of all the assumptions that vma->obj can be NULL.

Changes since v1:
- Ensure the mock_ring vma is pinned to prevent a fault.
- Pin it high to avoid failure in evict_for_vma selftest.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gt/mock_engine.c | 38 ---
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c 
b/drivers/gpu/drm/i915/gt/mock_engine.c
index 8b89215afe46..bb99fc03f503 100644
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -35,9 +35,31 @@ static void mock_timeline_unpin(struct intel_timeline *tl)
atomic_dec(&tl->pin_count);
 }
 
+static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
+{
+   struct i915_address_space *vm = &ggtt->vm;
+   struct drm_i915_private *i915 = vm->i915;
+   struct drm_i915_gem_object *obj;
+   struct i915_vma *vma;
+
+   obj = i915_gem_object_create_internal(i915, size);
+   if (IS_ERR(obj))
+   return ERR_CAST(obj);
+
+   vma = i915_vma_instance(obj, vm, NULL);
+   if (IS_ERR(vma))
+   goto err;
+
+   return vma;
+
+err:
+   i915_gem_object_put(obj);
+   return vma;
+}
+
 static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
 {
-   const unsigned long sz = PAGE_SIZE / 2;
+   const unsigned long sz = PAGE_SIZE;
struct intel_ring *ring;
 
ring = kzalloc(sizeof(*ring) + sz, GFP_KERNEL);
@@ -50,15 +72,11 @@ static struct intel_ring *mock_ring(struct intel_engine_cs 
*engine)
ring->vaddr = (void *)(ring + 1);
atomic_set(&ring->pin_count, 1);
 
-   ring->vma = i915_vma_alloc();
-   if (!ring->vma) {
+   ring->vma = create_ring_vma(engine->gt->ggtt, PAGE_SIZE);
+   if (IS_ERR(ring->vma)) {
kfree(ring);
return NULL;
}
-   i915_active_init(&ring->vma->active, NULL, NULL, 0);
-   __set_bit(I915_VMA_GGTT_BIT, __i915_vma_flags(ring->vma));
-   __set_bit(DRM_MM_NODE_ALLOCATED_BIT, &ring->vma->node.flags);
-   ring->vma->node.size = sz;
 
intel_ring_update_space(ring);
 
@@ -67,8 +85,7 @@ static struct intel_ring *mock_ring(struct intel_engine_cs 
*engine)
 
 static void mock_ring_free(struct intel_ring *ring)
 {
-   i915_active_fini(&ring->vma->active);
-   i915_vma_free(ring->vma);
+   i915_vma_put(ring->vma);
 
kfree(ring);
 }
@@ -125,6 +142,7 @@ static void mock_context_unpin(struct intel_context *ce)
 
 static void mock_context_post_unpin(struct intel_context *ce)
 {
+   i915_vma_unpin(ce->ring->vma);
 }
 
 static void mock_context_destroy(struct kref *ref)
@@ -169,7 +187,7 @@ static int mock_context_alloc(struct intel_context *ce)
 static int mock_context_pre_pin(struct intel_context *ce,
struct i915_gem_ww_ctx *ww, void **unused)
 {
-   return 0;
+   return i915_vma_pin_ww(ce->ring->vma, ww, 0, 0, PIN_GLOBAL | PIN_HIGH);
 }
 
 static int mock_context_pin(struct intel_context *ce, void *unused)
-- 
2.33.0



[Intel-gfx] [PATCH 03/28] drm/i915: Remove dma_resv_prune

2021-10-21 Thread Maarten Lankhorst
The signaled bit is already used for quickly testing whether a fence is signaled.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/Makefile|  1 -
 drivers/gpu/drm/i915/dma_resv_utils.c| 17 -
 drivers/gpu/drm/i915/dma_resv_utils.h| 13 -
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c |  3 ---
 drivers/gpu/drm/i915/gem/i915_gem_wait.c |  8 
 5 files changed, 42 deletions(-)
 delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.c
 delete mode 100644 drivers/gpu/drm/i915/dma_resv_utils.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 467872cca027..b87e3ed10d86 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -60,7 +60,6 @@ i915-y += i915_drv.o \
 
 # core library code
 i915-y += \
-   dma_resv_utils.o \
i915_memcpy.o \
i915_mm.o \
i915_sw_fence.o \
diff --git a/drivers/gpu/drm/i915/dma_resv_utils.c 
b/drivers/gpu/drm/i915/dma_resv_utils.c
deleted file mode 100644
index 7df91b7e4ca8..
--- a/drivers/gpu/drm/i915/dma_resv_utils.c
+++ /dev/null
@@ -1,17 +0,0 @@
-// SPDX-License-Identifier: MIT
-/*
- * Copyright © 2020 Intel Corporation
- */
-
-#include 
-
-#include "dma_resv_utils.h"
-
-void dma_resv_prune(struct dma_resv *resv)
-{
-   if (dma_resv_trylock(resv)) {
-   if (dma_resv_test_signaled(resv, true))
-   dma_resv_add_excl_fence(resv, NULL);
-   dma_resv_unlock(resv);
-   }
-}
diff --git a/drivers/gpu/drm/i915/dma_resv_utils.h 
b/drivers/gpu/drm/i915/dma_resv_utils.h
deleted file mode 100644
index b9d8fb5f8367..
--- a/drivers/gpu/drm/i915/dma_resv_utils.h
+++ /dev/null
@@ -1,13 +0,0 @@
-/* SPDX-License-Identifier: MIT */
-/*
- * Copyright © 2020 Intel Corporation
- */
-
-#ifndef DMA_RESV_UTILS_H
-#define DMA_RESV_UTILS_H
-
-struct dma_resv;
-
-void dma_resv_prune(struct dma_resv *resv);
-
-#endif /* DMA_RESV_UTILS_H */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 5ab136ffdeb2..af3eb7fd951d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -15,7 +15,6 @@
 
 #include "gt/intel_gt_requests.h"
 
-#include "dma_resv_utils.h"
 #include "i915_trace.h"
 
 static bool swap_available(void)
@@ -229,8 +228,6 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
i915_gem_object_unlock(obj);
}
 
-   dma_resv_prune(obj->base.resv);
-
scanned += obj->base.size >> PAGE_SHIFT;
 skip:
i915_gem_object_put(obj);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c 
b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
index 840c13706999..1592d95c3ead 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
@@ -10,7 +10,6 @@
 
 #include "gt/intel_engine.h"
 
-#include "dma_resv_utils.h"
 #include "i915_gem_ioctls.h"
 #include "i915_gem_object.h"
 
@@ -52,13 +51,6 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
}
dma_resv_iter_end(&cursor);
 
-   /*
-* Opportunistically prune the fences iff we know they have *all* been
-* signaled.
-*/
-   if (timeout > 0)
-   dma_resv_prune(resv);
-
return ret;
 }
 
-- 
2.33.0



[Intel-gfx] [PATCH 07/28] drm/i915: Create a dummy object for gen6 ppgtt

2021-10-21 Thread Maarten Lankhorst
We currently have to special-case vma->obj being NULL because
of gen6 ppgtt and mock_engine. Fix gen6 ppgtt, so we may soon
be able to remove a few checks. As the object only exists as
a fake object pointing into the ggtt, it has no backing storage
and no real allocation is made; it just has to look real enough.

Also kill pin_mutex; it's not compatible with ww locking,
and we can use the vm lock instead.
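
A rough sketch of the idea, using the reworked helper from the diff below;
"pd_dummy_obj_ops" and the size calculation are illustrative placeholders,
as the gen6_ppgtt.c hunk is truncated in this archive:

  /* Hypothetical sketch: create a fake, storage-less object for the page
   * directory through __i915_gem_object_create_internal(), with ops that
   * never allocate real backing pages -- it only has to look real enough
   * for the vma code.
   */
  static struct drm_i915_gem_object *
  pd_dummy_obj_create(struct drm_i915_private *i915, unsigned int npages)
  {
          return __i915_gem_object_create_internal(i915, &pd_dummy_obj_ops,
                                                   npages * PAGE_SIZE);
  }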

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |  44 ---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 122 +++
 drivers/gpu/drm/i915/gt/gen6_ppgtt.h |   1 -
 drivers/gpu/drm/i915/i915_drv.h  |   4 +
 4 files changed, 99 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c 
b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index a57a6b7013c2..c5150a1ee3d2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -145,24 +145,10 @@ static const struct drm_i915_gem_object_ops 
i915_gem_object_internal_ops = {
.put_pages = i915_gem_object_put_pages_internal,
 };
 
-/**
- * i915_gem_object_create_internal: create an object with volatile pages
- * @i915: the i915 device
- * @size: the size in bytes of backing storage to allocate for the object
- *
- * Creates a new object that wraps some internal memory for private use.
- * This object is not backed by swappable storage, and as such its contents
- * are volatile and only valid whilst pinned. If the object is reaped by the
- * shrinker, its pages and data will be discarded. Equally, it is not a full
- * GEM object and so not valid for access from userspace. This makes it useful
- * for hardware interfaces like ringbuffers (which are pinned from the time
- * the request is written to the time the hardware stops accessing it), but
- * not for contexts (which need to be preserved when not active for later
- * reuse). Note that it is not cleared upon allocation.
- */
 struct drm_i915_gem_object *
-i915_gem_object_create_internal(struct drm_i915_private *i915,
-   phys_addr_t size)
+__i915_gem_object_create_internal(struct drm_i915_private *i915,
+ const struct drm_i915_gem_object_ops *ops,
+ phys_addr_t size)
 {
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
@@ -179,7 +165,7 @@ i915_gem_object_create_internal(struct drm_i915_private 
*i915,
return ERR_PTR(-ENOMEM);
 
drm_gem_private_object_init(&i915->drm, &obj->base, size);
-   i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class, 
0);
+   i915_gem_object_init(obj, ops, &lock_class, 0);
obj->mem_flags |= I915_BO_FLAG_STRUCT_PAGE;
 
/*
@@ -199,3 +185,25 @@ i915_gem_object_create_internal(struct drm_i915_private 
*i915,
 
return obj;
 }
+
+/**
+ * i915_gem_object_create_internal: create an object with volatile pages
+ * @i915: the i915 device
+ * @size: the size in bytes of backing storage to allocate for the object
+ *
+ * Creates a new object that wraps some internal memory for private use.
+ * This object is not backed by swappable storage, and as such its contents
+ * are volatile and only valid whilst pinned. If the object is reaped by the
+ * shrinker, its pages and data will be discarded. Equally, it is not a full
+ * GEM object and so not valid for access from userspace. This makes it useful
+ * for hardware interfaces like ringbuffers (which are pinned from the time
+ * the request is written to the time the hardware stops accessing it), but
+ * not for contexts (which need to be preserved when not active for later
+ * reuse). Note that it is not cleared upon allocation.
+ */
+struct drm_i915_gem_object *
+i915_gem_object_create_internal(struct drm_i915_private *i915,
+   phys_addr_t size)
+{
+   return __i915_gem_object_create_internal(i915, 
&i915_gem_object_internal_ops, size);
+}
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 9fdbd9d3372b..5caa1703716e 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -262,13 +262,10 @@ static void gen6_ppgtt_cleanup(struct i915_address_space 
*vm)
 {
struct gen6_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
 
-   __i915_vma_put(ppgtt->vma);
-
gen6_ppgtt_free_pd(ppgtt);
free_scratch(vm);
 
mutex_destroy(&ppgtt->flush);
-   mutex_destroy(&ppgtt->pin_mutex);
 
free_pd(&ppgtt->base.vm, ppgtt->base.pd);
 }
@@ -331,37 +328,6 @@ static const struct i915_vma_ops pd_vma_ops = {
.unbind_vma = pd_vma_unbind,
 };
 
-static struct i915_vma *pd_vma_create(struct gen6_ppgtt *ppgtt, int size)
-{
-   struct i915_ggtt *ggtt = ppgtt->base.vm.gt->ggtt;
-   struct i915_vma *vma;
-
-   GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_P

[Intel-gfx] [PATCH 05/28] drm/i915: Slightly rework EXEC_OBJECT_CAPTURE handling, v2.

2021-10-21 Thread Maarten Lankhorst
Use a single NULL-terminated array for simplicity instead of a linked
list. This might slightly speed up execbuf when many vmas are marked
for capture, and it definitely removes an allocation from a signaling path.

We are not allowed to allocate memory in eb_move_to_gpu, but we can't
enforce that through annotations yet.
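
Condensed sketch of the scheme (both snippets are taken from the diff below;
the +1 slot from kvcalloc() is zeroed and acts as the NULL terminator):

  /* request setup: one allocation, sized up front from capture_count */
  rq->capture_list = kvcalloc(eb->capture_count + 1,
                              sizeof(*rq->capture_list),
                              GFP_KERNEL | __GFP_NOWARN);

  /* error-capture side: walk until the NULL terminator, with no list
   * nodes and no allocations on the signaling path
   */
  for (i = 0; rq->capture_list && rq->capture_list[i]; i++)
          capture = capture_vma(capture, rq->capture_list[i], "user", gfp);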

Changes since v1:
- Rebase on top of multi-batchbuffer changes.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Niranjana Vishwanathapura  #v1
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 29 ---
 drivers/gpu/drm/i915/i915_gpu_error.c |  9 +++---
 drivers/gpu/drm/i915/i915_request.c   |  9 ++
 drivers/gpu/drm/i915/i915_request.h   |  7 +
 4 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9c323666bd7c..eaacadd2d2e5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -265,6 +265,9 @@ struct i915_execbuffer {
/* number of batches in execbuf IOCTL */
unsigned int num_batches;
 
+   /* Number of objects with EXEC_OBJECT_CAPTURE set */
+   unsigned int capture_count;
+
/** list of vma not yet bound during reservation phase */
struct list_head unbound;
 
@@ -909,6 +912,9 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
goto err;
}
 
+   if (eb->exec[i].flags & EXEC_OBJECT_CAPTURE)
+   eb->capture_count++;
+
err = eb_validate_vma(eb, &eb->exec[i], vma);
if (unlikely(err)) {
i915_vma_put(vma);
@@ -1906,19 +1912,11 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
assert_vma_held(vma);
 
if (flags & EXEC_OBJECT_CAPTURE) {
-   struct i915_capture_list *capture;
+   eb->capture_count--;
 
for_each_batch_create_order(eb, j) {
-   if (!eb->requests[j])
-   break;
-
-   capture = kmalloc(sizeof(*capture), GFP_KERNEL);
-   if (capture) {
-   capture->next =
-   eb->requests[j]->capture_list;
-   capture->vma = vma;
-   eb->requests[j]->capture_list = capture;
-   }
+   if (eb->requests[j]->capture_list)
+   
eb->requests[j]->capture_list[eb->capture_count] = vma;
}
}
 
@@ -3130,6 +3128,14 @@ eb_requests_create(struct i915_execbuffer *eb, struct 
dma_fence *in_fence,
return out_fence;
}
 
+   if (eb->capture_count) {
+   eb->requests[i]->capture_list =
+   kvcalloc(eb->capture_count + 1,
+   sizeof(*eb->requests[i]->capture_list),
+   GFP_KERNEL | __GFP_NOWARN);
+   }
+
+
/*
 * Only the first request added (committed to backend) has to
 * take the in fences into account as all subsequent requests
@@ -3197,6 +3203,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
eb.fences = NULL;
eb.num_fences = 0;
+   eb.capture_count = 0;
 
memset(eb.requests, 0, sizeof(struct i915_request *) *
   ARRAY_SIZE(eb.requests));
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 2a2d7643b551..45104bb12a98 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1356,10 +1356,10 @@ capture_user(struct intel_engine_capture_vma *capture,
 const struct i915_request *rq,
 gfp_t gfp)
 {
-   struct i915_capture_list *c;
+   int i;
 
-   for (c = rq->capture_list; c; c = c->next)
-   capture = capture_vma(capture, c->vma, "user", gfp);
+   for (i = 0; rq->capture_list[i]; i++)
+   capture = capture_vma(capture, rq->capture_list[i], "user", 
gfp);
 
return capture;
 }
@@ -1407,7 +1407,8 @@ intel_engine_coredump_add_request(struct 
intel_engine_coredump *ee,
 * by userspace.
 */
vma = capture_vma(vma, rq->batch, "batch", gfp);
-   vma = capture_user(vma, rq, gfp);
+   if (rq->capture_list)
+   vma = capture_user(vma, rq, gfp);
vma = capture_vma(vma, rq->ring->vma, "ring", gfp);
vma = capture_vma(vma, rq->context->state, "HW context", gfp);
 
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index 42cd17357771.

[Intel-gfx] [PATCH 12/28] drm/i915: Remove resv from i915_vma

2021-10-21 Thread Maarten Lankhorst
It's just an alias of vma->obj->base.resv; there is no need to duplicate it.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 ++--
 drivers/gpu/drm/i915/i915_vma.c| 9 -
 drivers/gpu/drm/i915/i915_vma.h| 6 +++---
 drivers/gpu/drm/i915/i915_vma_types.h  | 1 -
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index eaacadd2d2e5..e614591ca510 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1007,7 +1007,7 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
}
 
if (!(ev->flags & EXEC_OBJECT_WRITE)) {
-   err = dma_resv_reserve_shared(vma->resv, 1);
+   err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
if (err)
return err;
}
@@ -2173,7 +2173,7 @@ static int eb_parse(struct i915_execbuffer *eb)
goto err_trampoline;
}
 
-   err = dma_resv_reserve_shared(shadow->resv, 1);
+   err = dma_resv_reserve_shared(shadow->obj->base.resv, 1);
if (err)
goto err_trampoline;
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index aebfc232b58b..ac09b685678a 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -116,7 +116,6 @@ vma_create(struct drm_i915_gem_object *obj,
vma->vm = i915_vm_get(vm);
vma->ops = &vm->vma_ops;
vma->obj = obj;
-   vma->resv = obj->base.resv;
vma->size = obj->base.size;
vma->display_alignment = I915_GTT_MIN_ALIGNMENT;
 
@@ -1032,7 +1031,7 @@ int i915_ggtt_pin(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
GEM_BUG_ON(!i915_vma_is_ggtt(vma));
 
 #ifdef CONFIG_LOCKDEP
-   WARN_ON(!ww && dma_resv_held(vma->resv));
+   WARN_ON(!ww && dma_resv_held(vma->obj->base.resv));
 #endif
 
do {
@@ -1251,19 +1250,19 @@ int _i915_vma_move_to_active(struct i915_vma *vma,
}
 
if (fence) {
-   dma_resv_add_excl_fence(vma->resv, fence);
+   dma_resv_add_excl_fence(vma->obj->base.resv, fence);
obj->write_domain = I915_GEM_DOMAIN_RENDER;
obj->read_domains = 0;
}
} else {
if (!(flags & __EXEC_OBJECT_NO_RESERVE)) {
-   err = dma_resv_reserve_shared(vma->resv, 1);
+   err = dma_resv_reserve_shared(vma->obj->base.resv, 1);
if (unlikely(err))
return err;
}
 
if (fence) {
-   dma_resv_add_shared_fence(vma->resv, fence);
+   dma_resv_add_shared_fence(vma->obj->base.resv, fence);
obj->write_domain = 0;
}
}
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 423e0df81c87..9a931ecb09e5 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -232,16 +232,16 @@ static inline void __i915_vma_put(struct i915_vma *vma)
kref_put(&vma->ref, i915_vma_release);
 }
 
-#define assert_vma_held(vma) dma_resv_assert_held((vma)->resv)
+#define assert_vma_held(vma) dma_resv_assert_held((vma)->obj->base.resv)
 
 static inline void i915_vma_lock(struct i915_vma *vma)
 {
-   dma_resv_lock(vma->resv, NULL);
+   dma_resv_lock(vma->obj->base.resv, NULL);
 }
 
 static inline void i915_vma_unlock(struct i915_vma *vma)
 {
-   dma_resv_unlock(vma->resv);
+   dma_resv_unlock(vma->obj->base.resv);
 }
 
 int __must_check
diff --git a/drivers/gpu/drm/i915/i915_vma_types.h 
b/drivers/gpu/drm/i915/i915_vma_types.h
index 80e93bf00f2e..8a0decb19bcc 100644
--- a/drivers/gpu/drm/i915/i915_vma_types.h
+++ b/drivers/gpu/drm/i915/i915_vma_types.h
@@ -178,7 +178,6 @@ struct i915_vma {
const struct i915_vma_ops *ops;
 
struct drm_i915_gem_object *obj;
-   struct dma_resv *resv; /** Alias of obj->resv */
 
struct sg_table *pages;
void __iomem *iomap;
-- 
2.33.0



[Intel-gfx] [PATCH 10/28] drm/i915: Change shrink ordering to use locking around unbinding.

2021-10-21 Thread Maarten Lankhorst
Call drop_pages with the gem object lock held, instead of the other
way around. This will allow us to drop the vma bindings with the
gem object lock held.

We plan to require the object lock for unpinning in the future,
and this is an easy target.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 42 ++--
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index af3eb7fd951d..d3f29a66cb36 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -36,8 +36,8 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
return swap_available() || obj->mm.madv == I915_MADV_DONTNEED;
 }
 
-static bool unsafe_drop_pages(struct drm_i915_gem_object *obj,
- unsigned long shrink, bool trylock_vm)
+static int drop_pages(struct drm_i915_gem_object *obj,
+  unsigned long shrink, bool trylock_vm)
 {
unsigned long flags;
 
@@ -208,26 +208,28 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 
spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
 
-   err = 0;
-   if (unsafe_drop_pages(obj, shrink, trylock_vm)) {
-   /* May arrive from get_pages on another bo */
-   if (!ww) {
-   if (!i915_gem_object_trylock(obj))
-   goto skip;
-   } else {
-   err = i915_gem_object_lock(obj, ww);
-   if (err)
-   goto skip;
-   }
-
-   if (!__i915_gem_object_put_pages(obj)) {
-   try_to_writeback(obj, shrink);
-   count += obj->base.size >> PAGE_SHIFT;
-   }
-   if (!ww)
-   i915_gem_object_unlock(obj);
+   /* May arrive from get_pages on another bo */
+   if (!ww) {
+   if (!i915_gem_object_trylock(obj))
+   goto skip;
+   } else {
+   err = i915_gem_object_lock(obj, ww);
+   if (err)
+   goto skip;
}
 
+   if (drop_pages(obj, shrink, trylock_vm) &&
+   !__i915_gem_object_put_pages(obj)) {
+   try_to_writeback(obj, shrink);
+   count += obj->base.size >> PAGE_SHIFT;
+   }
+
+   if (dma_resv_test_signaled(obj->base.resv, true))
+   dma_resv_add_excl_fence(obj->base.resv, NULL);
+
+   if (!ww)
+   i915_gem_object_unlock(obj);
+
scanned += obj->base.size >> PAGE_SHIFT;
 skip:
i915_gem_object_put(obj);
-- 
2.33.0



[Intel-gfx] [PATCH 23/28] drm/i915: Call i915_gem_evict_vm in vm_fault_gtt to prevent new ENOSPC errors

2021-10-21 Thread Maarten Lankhorst
Now that we cannot unbind and kill the currently locked object directly,
because we're removing short-term pinning, we may have to unbind the
object from the gtt manually, using an i915_gem_evict_vm() call.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index 65fc6ff5f59d..6d557bb9926f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -357,8 +357,22 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 
0, flags);
}
 
-   /* The entire mappable GGTT is pinned? Unexpected! */
-   GEM_BUG_ON(vma == ERR_PTR(-ENOSPC));
+   /*
+* The entire mappable GGTT is pinned? Unexpected!
+* Try to evict the object we locked too, as normally we skip it
+* due to lack of short term pinning inside execbuf.
+*/
+   if (vma == ERR_PTR(-ENOSPC)) {
+   ret = mutex_lock_interruptible(&ggtt->vm.mutex);
+   if (!ret) {
+   ret = i915_gem_evict_vm(&ggtt->vm, &ww);
+   mutex_unlock(&ggtt->vm.mutex);
+   }
+   if (ret)
+   goto err_reset;
+   vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 
0, flags);
+   }
+   GEM_WARN_ON(vma == ERR_PTR(-ENOSPC));
}
if (IS_ERR(vma)) {
ret = PTR_ERR(vma);
-- 
2.33.0



[Intel-gfx] [PATCH 13/28] drm/i915: Remove pages_mutex and intel_gtt->vma_ops.set/clear_pages members

2021-10-21 Thread Maarten Lankhorst
Big delta, but it boils down to moving set_pages to i915_vma.c and removing
the special handling; all callers use the defaults anyway. We only remap
in the ggtt, so the default case will fall through.

Because we still don't require locking in i915_vma_unpin(), handle this by
using xchg in get_pages(), as it's serialized by obj->mutex, and cmpxchg in
unpin, which only fails if we race against a new pin.
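
In other words (condensed sketch, not the literal hunk): vma->pages is
published with xchg() while obj->mutex is held in get_pages(), and cleared
with cmpxchg() in the unlocked unpin path, so a racing re-pin that already
installed pages simply wins:

  /* get_pages(), serialized by obj->mutex: publish the new pages */
  WARN_ON(xchg(&vma->pages, pages));

  /* unpin path, no lock held: only clear if nobody re-pinned and replaced
   * the pages meanwhile; a failed cmpxchg just means we lost a benign race
   * against a new pin
   */
  cmpxchg(&vma->pages, pages, NULL);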

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/display/intel_dpt.c  |   2 -
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  15 -
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 345 
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  13 -
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   7 -
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  12 -
 drivers/gpu/drm/i915/i915_vma.c   | 388 --
 drivers/gpu/drm/i915/i915_vma.h   |   3 +
 drivers/gpu/drm/i915/i915_vma_types.h |   1 -
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  12 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c |   4 -
 11 files changed, 368 insertions(+), 434 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c 
b/drivers/gpu/drm/i915/display/intel_dpt.c
index 8f7b1f7534a4..ef428f3fc538 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -221,8 +221,6 @@ intel_dpt_create(struct intel_framebuffer *fb)
 
vm->vma_ops.bind_vma= dpt_bind_vma;
vm->vma_ops.unbind_vma  = dpt_unbind_vma;
-   vm->vma_ops.set_pages   = ggtt_set_pages;
-   vm->vma_ops.clear_pages = clear_pages;
 
vm->pte_encode = gen8_ggtt_pte_encode;
 
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 5caa1703716e..5c048b4ccd4d 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -270,19 +270,6 @@ static void gen6_ppgtt_cleanup(struct i915_address_space 
*vm)
free_pd(&ppgtt->base.vm, ppgtt->base.pd);
 }
 
-static int pd_vma_set_pages(struct i915_vma *vma)
-{
-   vma->pages = ERR_PTR(-ENODEV);
-   return 0;
-}
-
-static void pd_vma_clear_pages(struct i915_vma *vma)
-{
-   GEM_BUG_ON(!vma->pages);
-
-   vma->pages = NULL;
-}
-
 static void pd_vma_bind(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,
struct i915_vma *vma,
@@ -322,8 +309,6 @@ static void pd_vma_unbind(struct i915_address_space *vm, 
struct i915_vma *vma)
 }
 
 static const struct i915_vma_ops pd_vma_ops = {
-   .set_pages = pd_vma_set_pages,
-   .clear_pages = pd_vma_clear_pages,
.bind_vma = pd_vma_bind,
.unbind_vma = pd_vma_unbind,
 };
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index f17383e76eb7..6da57199bb33 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -20,9 +20,6 @@
 #include "intel_gtt.h"
 #include "gen8_ppgtt.h"
 
-static int
-i915_get_ggtt_vma_pages(struct i915_vma *vma);
-
 static void i915_ggtt_color_adjust(const struct drm_mm_node *node,
   unsigned long color,
   u64 *start,
@@ -875,21 +872,6 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 
size)
return 0;
 }
 
-int ggtt_set_pages(struct i915_vma *vma)
-{
-   int ret;
-
-   GEM_BUG_ON(vma->pages);
-
-   ret = i915_get_ggtt_vma_pages(vma);
-   if (ret)
-   return ret;
-
-   vma->page_sizes = vma->obj->mm.page_sizes;
-
-   return 0;
-}
-
 static void gen6_gmch_remove(struct i915_address_space *vm)
 {
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
@@ -950,8 +932,6 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
ggtt->vm.vma_ops.bind_vma= ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma  = ggtt_unbind_vma;
-   ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
-   ggtt->vm.vma_ops.clear_pages = clear_pages;
 
ggtt->vm.pte_encode = gen8_ggtt_pte_encode;
 
@@ -1100,8 +1080,6 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt)
 
ggtt->vm.vma_ops.bind_vma= ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma  = ggtt_unbind_vma;
-   ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
-   ggtt->vm.vma_ops.clear_pages = clear_pages;
 
return ggtt_probe_common(ggtt, size);
 }
@@ -1145,8 +1123,6 @@ static int i915_gmch_probe(struct i915_ggtt *ggtt)
 
ggtt->vm.vma_ops.bind_vma= ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma  = ggtt_unbind_vma;
-   ggtt->vm.vma_ops.set_pages   = ggtt_set_pages;
-   ggtt->vm.vma_ops.clear_pages = clear_pages;
 
if (unlikely(ggtt->do_idle_maps))
drm_notice(&i915->drm,
@@ -1294,324 +1270,3 @@ void i915_ggtt_resume(struct i915_ggtt *ggtt)
 
intel_ggtt_restore_fences(ggtt);
 }
-
-static struct scatterlist *
-rotate_pages(struct drm_i915_gem_objec

[Intel-gfx] [PATCH 15/28] drm/i915: Add lock for unbinding to i915_gem_object_ggtt_pin_ww

2021-10-21 Thread Maarten Lankhorst
Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_gem.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 981e383d1a5d..6aa9e465b48e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -931,7 +931,14 @@ i915_gem_object_ggtt_pin_ww(struct drm_i915_gem_object 
*obj,
goto new_vma;
}
 
-   ret = i915_vma_unbind(vma);
+   ret = 0;
+   if (!ww)
+   ret = i915_gem_object_lock_interruptible(obj, NULL);
+   if (!ret) {
+   ret = i915_vma_unbind(vma);
+   if (!ww)
+   i915_gem_object_unlock(obj);
+   }
if (ret)
return ERR_PTR(ret);
}
-- 
2.33.0



[Intel-gfx] [PATCH 18/28] drm/i915: Take trylock during eviction, v2.

2021-10-21 Thread Maarten Lankhorst
Now that freeing objects takes the object lock when destroying the
backing pages, we can confidently take the object lock even for dead
objects.

Use this fact to take the object lock in the shrinker, without requiring
a reference to the object, so all calls to unbind take the object lock.

This is the last step towards requiring the object lock for vma_unbind.

Changes since v1:
- No longer require the refcount, as every freed object now holds the lock
  when unbinding VMA's.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c |  6 
 drivers/gpu/drm/i915/i915_gem_evict.c| 34 +---
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index d3f29a66cb36..34c12e5983eb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -403,12 +403,18 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, 
unsigned long event, void *ptr
list_for_each_entry_safe(vma, next,
 &i915->ggtt.vm.bound_list, vm_link) {
unsigned long count = vma->node.size >> PAGE_SHIFT;
+   struct drm_i915_gem_object *obj = vma->obj;
 
if (!vma->iomap || i915_vma_is_active(vma))
continue;
 
+   if (!i915_gem_object_trylock(obj))
+   continue;
+
if (__i915_vma_unbind(vma) == 0)
freed_pages += count;
+
+   i915_gem_object_unlock(obj);
}
mutex_unlock(&i915->ggtt.vm.mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 2b73ddb11c66..286efa462eca 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -58,6 +58,9 @@ mark_free(struct drm_mm_scan *scan,
if (i915_vma_is_pinned(vma))
return false;
 
+   if (!i915_gem_object_trylock(vma->obj))
+   return false;
+
list_add(&vma->evict_link, unwind);
return drm_mm_scan_add_block(scan, &vma->node);
 }
@@ -178,6 +181,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
ret = drm_mm_scan_remove_block(&scan, &vma->node);
BUG_ON(ret);
+   i915_gem_object_unlock(vma->obj);
}
 
/*
@@ -222,10 +226,12 @@ i915_gem_evict_something(struct i915_address_space *vm,
 * of any of our objects, thus corrupting the list).
 */
list_for_each_entry_safe(vma, next, &eviction_list, evict_link) {
-   if (drm_mm_scan_remove_block(&scan, &vma->node))
+   if (drm_mm_scan_remove_block(&scan, &vma->node)) {
__i915_vma_pin(vma);
-   else
+   } else {
list_del(&vma->evict_link);
+   i915_gem_object_unlock(vma->obj);
+   }
}
 
/* Unbinding will emit any required flushes */
@@ -234,16 +240,22 @@ i915_gem_evict_something(struct i915_address_space *vm,
__i915_vma_unpin(vma);
if (ret == 0)
ret = __i915_vma_unbind(vma);
+
+   i915_gem_object_unlock(vma->obj);
}
 
while (ret == 0 && (node = drm_mm_scan_color_evict(&scan))) {
vma = container_of(node, struct i915_vma, node);
 
+
/* If we find any non-objects (!vma), we cannot evict them */
-   if (vma->node.color != I915_COLOR_UNEVICTABLE)
+   if (vma->node.color != I915_COLOR_UNEVICTABLE &&
+   i915_gem_object_trylock(vma->obj)) {
ret = __i915_vma_unbind(vma);
-   else
-   ret = -ENOSPC; /* XXX search failed, try again? */
+   i915_gem_object_unlock(vma->obj);
+   } else {
+   ret = -ENOSPC;
+   }
}
 
return ret;
@@ -333,6 +345,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
break;
}
 
+   if (!i915_gem_object_trylock(vma->obj)) {
+   ret = -ENOSPC;
+   break;
+   }
+
/*
 * Never show fear in the face of dragons!
 *
@@ -350,6 +367,8 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
__i915_vma_unpin(vma);
if (ret == 0)
ret = __i915_vma_unbind(vma);
+
+   i915_gem_object_unlock(vma->obj);
}
 
return ret;
@@ -393,6 +412,9 @@ int i915_gem_evict_vm(struct i915_address_space *vm)
if (i915_vma_is_pinned(vma))
continue;
 
+   

[Intel-gfx] [PATCH 11/28] drm/i915/pm: Move CONTEXT_VALID_BIT check

2021-10-21 Thread Maarten Lankhorst
Resetting will clear the CONTEXT_VALID_BIT, so wait until after that to test.

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index a1334b48dde7..849fbb229bd3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -52,8 +52,6 @@ static int __engine_unpark(struct intel_wakeref *wf)
/* Discard stale context state from across idling */
ce = engine->kernel_context;
if (ce) {
-   GEM_BUG_ON(test_bit(CONTEXT_VALID_BIT, &ce->flags));
-
/* Flush all pending HW writes before we touch the context */
while (unlikely(intel_context_inflight(ce)))
intel_engine_flush_submission(engine);
@@ -68,6 +66,9 @@ static int __engine_unpark(struct intel_wakeref *wf)
 ce->timeline->seqno,
 READ_ONCE(*ce->timeline->hwsp_seqno),
 ce->ring->emit);
+
+   GEM_BUG_ON(test_bit(CONTEXT_VALID_BIT, &ce->flags));
+
GEM_BUG_ON(ce->timeline->seqno !=
   READ_ONCE(*ce->timeline->hwsp_seqno));
}
-- 
2.33.0



[Intel-gfx] [PATCH 16/28] drm/i915: Rework context handling in hugepages selftests

2021-10-21 Thread Maarten Lankhorst
In the next commit, we don't evict when refcount = 0, so we need to
drain freed objects between subtests, because we want to pin new bo's
in the same place; otherwise the test fails.

Furthermore, since each subtest is separate, it's a lot better to use
i915_live_selftests, so each subtest starts with a clean slate and a
clean address space.

Signed-off-by: Maarten Lankhorst 
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   | 127 +++---
 1 file changed, 79 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index b2003133deaf..493509f90b35 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -22,6 +22,21 @@
 #include "selftests/mock_region.h"
 #include "selftests/i915_random.h"
 
+struct i915_gem_context *hugepage_ctx(struct drm_i915_private *i915, struct 
file *file)
+{
+   struct i915_gem_context *ctx = live_context(i915, file);
+   struct i915_address_space *vm;
+
+   if (IS_ERR(ctx))
+   return ctx;
+
+   vm = ctx->vm;
+   if (vm)
+   WRITE_ONCE(vm->scrub_64K, true);
+
+   return ctx;
+}
+
 static const unsigned int page_sizes[] = {
I915_GTT_PAGE_SIZE_2M,
I915_GTT_PAGE_SIZE_64K,
@@ -959,6 +974,8 @@ static int igt_mock_ppgtt_64K(void *arg)
__i915_gem_object_put_pages(obj);
i915_gem_object_unlock(obj);
i915_gem_object_put(obj);
+
+   i915_gem_drain_freed_objects(i915);
}
}
 
@@ -1080,10 +1097,6 @@ static int __igt_write_huge(struct intel_context *ce,
if (IS_ERR(vma))
return PTR_ERR(vma);
 
-   err = i915_vma_unbind(vma);
-   if (err)
-   return err;
-
err = i915_vma_pin(vma, size, 0, flags | offset);
if (err) {
/*
@@ -1117,7 +1130,7 @@ static int __igt_write_huge(struct intel_context *ce,
return err;
 }
 
-static int igt_write_huge(struct i915_gem_context *ctx,
+static int igt_write_huge(struct drm_i915_private *i915,
  struct drm_i915_gem_object *obj)
 {
struct i915_gem_engines *engines;
@@ -1127,6 +1140,8 @@ static int igt_write_huge(struct i915_gem_context *ctx,
IGT_TIMEOUT(end_time);
unsigned int max_page_size;
unsigned int count;
+   struct i915_gem_context *ctx;
+   struct file *file;
u64 max;
u64 num;
u64 size;
@@ -1134,6 +1149,16 @@ static int igt_write_huge(struct i915_gem_context *ctx,
int i, n;
int err = 0;
 
+   file = mock_file(i915);
+   if (IS_ERR(file))
+   return PTR_ERR(file);
+
+   ctx = hugepage_ctx(i915, file);
+   if (IS_ERR(ctx)) {
+   err = PTR_ERR(ctx);
+   goto out;
+   }
+
GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
 
size = obj->base.size;
@@ -1153,7 +1178,7 @@ static int igt_write_huge(struct i915_gem_context *ctx,
}
i915_gem_context_unlock_engines(ctx);
if (!n)
-   return 0;
+   goto out;
 
/*
 * To keep things interesting when alternating between engines in our
@@ -1215,6 +1240,8 @@ static int igt_write_huge(struct i915_gem_context *ctx,
 
kfree(order);
 
+out:
+   fput(file);
return err;
 }
 
@@ -1277,8 +1304,7 @@ static u32 igt_random_size(struct rnd_state *prng,
 
 static int igt_ppgtt_smoke_huge(void *arg)
 {
-   struct i915_gem_context *ctx = arg;
-   struct drm_i915_private *i915 = ctx->i915;
+   struct drm_i915_private *i915 = arg;
struct drm_i915_gem_object *obj;
I915_RND_STATE(prng);
struct {
@@ -1302,6 +1328,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
u32 min = backends[i].min;
u32 max = backends[i].max;
u32 size = max;
+
 try_again:
size = igt_random_size(&prng, min, rounddown_pow_of_two(size));
 
@@ -1336,7 +1363,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
goto out_unpin;
}
 
-   err = igt_write_huge(ctx, obj);
+   err = igt_write_huge(i915, obj);
if (err) {
pr_err("%s write-huge failed with size=%u, i=%d\n",
   __func__, size, i);
@@ -1363,8 +1390,7 @@ static int igt_ppgtt_smoke_huge(void *arg)
 
 static int igt_ppgtt_sanity_check(void *arg)
 {
-   struct i915_gem_context *ctx = arg;
-   struct drm_i915_private *i915 = ctx->i915;
+   struct drm_i915_private *i915 = arg;
unsigned int supported = INTEL_INFO(i915)->page_sizes;
struct {
igt_create_fn fn;
@@ -1431,7 +1457,7 @@ static int igt_ppgtt_sanity_check(void *arg)
if (pages)
obj->mm.page_sizes.sg =

[Intel-gfx] [PATCH 09/28] drm/i915: vma is always backed by an object.

2021-10-21 Thread Maarten Lankhorst
vma->obj and vma->resv are now never NULL, and some checks can be removed.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gt/intel_context.c   |  2 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 48 ---
 drivers/gpu/drm/i915/i915_vma.h   |  3 --
 4 files changed, 22 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 5634d14052bc..e0220ac0e9b6 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -219,7 +219,7 @@ int __intel_context_do_pin_ww(struct intel_context *ce,
 */
 
err = i915_gem_object_lock(ce->timeline->hwsp_ggtt->obj, ww);
-   if (!err && ce->ring->vma->obj)
+   if (!err)
err = i915_gem_object_lock(ce->ring->vma->obj, ww);
if (!err && ce->state)
err = i915_gem_object_lock(ce->state->obj, ww);
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 586dca1731ce..3e6fac0340ef 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -1357,7 +1357,7 @@ int intel_ring_submission_setup(struct intel_engine_cs 
*engine)
err = i915_gem_object_lock(timeline->hwsp_ggtt->obj, &ww);
if (!err && gen7_wa_vma)
err = i915_gem_object_lock(gen7_wa_vma->obj, &ww);
-   if (!err && engine->legacy.ring->vma->obj)
+   if (!err)
err = i915_gem_object_lock(engine->legacy.ring->vma->obj, &ww);
if (!err)
err = intel_timeline_pin(timeline, &ww);
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 1187f1956c20..aebfc232b58b 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -40,12 +40,12 @@
 
 static struct kmem_cache *slab_vmas;
 
-struct i915_vma *i915_vma_alloc(void)
+static struct i915_vma *i915_vma_alloc(void)
 {
return kmem_cache_zalloc(slab_vmas, GFP_KERNEL);
 }
 
-void i915_vma_free(struct i915_vma *vma)
+static void i915_vma_free(struct i915_vma *vma)
 {
return kmem_cache_free(slab_vmas, vma);
 }
@@ -426,10 +426,8 @@ int i915_vma_bind(struct i915_vma *vma,
 
work->base.dma.error = 0; /* enable the queue_work() */
 
-   if (vma->obj) {
-   __i915_gem_object_pin_pages(vma->obj);
-   work->pinned = i915_gem_object_get(vma->obj);
-   }
+   __i915_gem_object_pin_pages(vma->obj);
+   work->pinned = i915_gem_object_get(vma->obj);
} else {
vma->ops->bind_vma(vma->vm, NULL, vma, cache_level, bind_flags);
}
@@ -670,7 +668,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
}
 
color = 0;
-   if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
+   if (i915_vm_has_cache_coloring(vma->vm))
color = vma->obj->cache_level;
 
if (flags & PIN_OFFSET_FIXED) {
@@ -795,17 +793,14 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned 
int flags)
 static int vma_get_pages(struct i915_vma *vma)
 {
int err = 0;
-   bool pinned_pages = false;
+   bool pinned_pages = true;
 
if (atomic_add_unless(&vma->pages_count, 1, 0))
return 0;
 
-   if (vma->obj) {
-   err = i915_gem_object_pin_pages(vma->obj);
-   if (err)
-   return err;
-   pinned_pages = true;
-   }
+   err = i915_gem_object_pin_pages(vma->obj);
+   if (err)
+   return err;
 
/* Allocations ahoy! */
if (mutex_lock_interruptible(&vma->pages_mutex)) {
@@ -838,8 +833,8 @@ static void __vma_put_pages(struct i915_vma *vma, unsigned 
int count)
if (atomic_sub_return(count, &vma->pages_count) == 0) {
vma->ops->clear_pages(vma);
GEM_BUG_ON(vma->pages);
-   if (vma->obj)
-   i915_gem_object_unpin_pages(vma->obj);
+
+   i915_gem_object_unpin_pages(vma->obj);
}
mutex_unlock(&vma->pages_mutex);
 }
@@ -875,7 +870,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
int err;
 
 #ifdef CONFIG_PROVE_LOCKING
-   if (debug_locks && !WARN_ON(!ww) && vma->resv)
+   if (debug_locks && !WARN_ON(!ww))
assert_vma_held(vma);
 #endif
 
@@ -983,7 +978,7 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
 
GEM_BUG_ON(!vma->pages);
err = i915_vma_bind(vma,
-   vma->obj ? vma->obj->cache_level : 0,
+   vma->obj->cache_level,
flags, work);
if (err)
goto err_remove;
@@ -1037,7 +1032,7 @@ int i915_ggtt

[Intel-gfx] [PATCH 17/28] drm/i915: Ensure gem_contexts selftests work with unbind changes.

2021-10-21 Thread Maarten Lankhorst
In the next commit, we don't evict when refcount = 0.

igt_vm_isolation() continuously tries to pin/unpin at the same address,
but also calls put() on the object, which means the object may not
be unpinned in time.

Instead of this, re-use the same objects over and over, so they can
be unbound as required.

Signed-off-by: Maarten Lankhorst 
---
 .../drm/i915/gem/selftests/i915_gem_context.c | 54 +++
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index b32f7fed2d9c..3fc595b57cf4 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -1481,10 +1481,10 @@ static int check_scratch(struct i915_address_space *vm, 
u64 offset)
 
 static int write_to_scratch(struct i915_gem_context *ctx,
struct intel_engine_cs *engine,
+   struct drm_i915_gem_object *obj,
u64 offset, u32 value)
 {
struct drm_i915_private *i915 = ctx->i915;
-   struct drm_i915_gem_object *obj;
struct i915_address_space *vm;
struct i915_request *rq;
struct i915_vma *vma;
@@ -1497,15 +1497,9 @@ static int write_to_scratch(struct i915_gem_context *ctx,
if (err)
return err;
 
-   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
-   if (IS_ERR(obj))
-   return PTR_ERR(obj);
-
cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
-   if (IS_ERR(cmd)) {
-   err = PTR_ERR(cmd);
-   goto out;
-   }
+   if (IS_ERR(cmd))
+   return PTR_ERR(cmd);
 
*cmd++ = MI_STORE_DWORD_IMM_GEN4;
if (GRAPHICS_VER(i915) >= 8) {
@@ -1569,17 +1563,19 @@ static int write_to_scratch(struct i915_gem_context 
*ctx,
i915_vma_unpin(vma);
 out_vm:
i915_vm_put(vm);
-out:
-   i915_gem_object_put(obj);
+
+   if (!err)
+   err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
+
return err;
 }
 
 static int read_from_scratch(struct i915_gem_context *ctx,
 struct intel_engine_cs *engine,
+struct drm_i915_gem_object *obj,
 u64 offset, u32 *value)
 {
struct drm_i915_private *i915 = ctx->i915;
-   struct drm_i915_gem_object *obj;
struct i915_address_space *vm;
const u32 result = 0x100;
struct i915_request *rq;
@@ -1594,10 +1590,6 @@ static int read_from_scratch(struct i915_gem_context 
*ctx,
if (err)
return err;
 
-   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
-   if (IS_ERR(obj))
-   return PTR_ERR(obj);
-
if (GRAPHICS_VER(i915) >= 8) {
const u32 GPR0 = engine->mmio_base + 0x600;
 
@@ -1615,7 +1607,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
if (IS_ERR(cmd)) {
err = PTR_ERR(cmd);
-   goto out;
+   goto err_unpin;
}
 
memset(cmd, POISON_INUSE, PAGE_SIZE);
@@ -1651,7 +1643,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
cmd = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WB);
if (IS_ERR(cmd)) {
err = PTR_ERR(cmd);
-   goto out;
+   goto err_unpin;
}
 
memset(cmd, POISON_INUSE, PAGE_SIZE);
@@ -1722,8 +1714,10 @@ static int read_from_scratch(struct i915_gem_context 
*ctx,
i915_vma_unpin(vma);
 out_vm:
i915_vm_put(vm);
-out:
-   i915_gem_object_put(obj);
+
+   if (!err)
+   err = i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT);
+
return err;
 }
 
@@ -1765,6 +1759,7 @@ static int igt_vm_isolation(void *arg)
u64 vm_total;
u32 expected;
int err;
+   struct drm_i915_gem_object *obj_a, *obj_b;
 
if (GRAPHICS_VER(i915) < 7)
return 0;
@@ -1810,6 +1805,18 @@ static int igt_vm_isolation(void *arg)
vm_total = ctx_a->vm->total;
GEM_BUG_ON(ctx_b->vm->total != vm_total);
 
+   obj_a = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj_a)) {
+   err = PTR_ERR(obj_a);
+   goto out_file;
+   }
+
+   obj_b = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   if (IS_ERR(obj_b)) {
+   err = PTR_ERR(obj_b);
+   goto put_a;
+   }
+
count = 0;
num_engines = 0;
for_each_uabi_engine(engine, i915) {
@@ -1832,10 +1839,10 @@ static int igt_vm_isolation(void *arg)
   I915_GTT_PAGE_SIZE, vm_total,

[Intel-gfx] [PATCH 24/28] drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for i915_vma_unbind

2021-10-21 Thread Maarten Lankhorst
We want to remove more members of i915_vma, which requires the locking to be
held more often.

Start requiring the gem object lock for i915_vma_unbind, as it's one of the
callers that may unpin pages.

Some special care is needed when evicting, because the last reference to the
object may be held by the VMA, so after __i915_vma_unbind the vma may be
garbage, and we need to cache vma->obj before unlocking.
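
The pattern that motivates caching the object pointer looks roughly like this
(sketch; __i915_vma_unbind() may drop the last reference to the vma):

  struct drm_i915_gem_object *obj = vma->obj;

  i915_gem_object_lock(obj, NULL);
  ret = __i915_vma_unbind(vma);
  /* vma may be freed past this point; only the cached obj is still safe */
  i915_gem_object_unlock(obj);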

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/display/intel_fb_pin.c   |  2 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../i915/gem/selftests/i915_gem_client_blt.c  |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|  6 +++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  | 45 ---
 drivers/gpu/drm/i915/i915_gem.c   |  2 +
 drivers/gpu/drm/i915/i915_vma.c   | 27 ++-
 drivers/gpu/drm/i915/i915_vma.h   |  1 +
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 22 -
 drivers/gpu/drm/i915/selftests/i915_vma.c |  8 ++--
 10 files changed, 91 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fb_pin.c 
b/drivers/gpu/drm/i915/display/intel_fb_pin.c
index 3f77f3013584..1c0ff2ef3937 100644
--- a/drivers/gpu/drm/i915/display/intel_fb_pin.c
+++ b/drivers/gpu/drm/i915/display/intel_fb_pin.c
@@ -47,7 +47,7 @@ intel_pin_fb_obj_dpt(struct drm_framebuffer *fb,
goto err;
 
if (i915_vma_misplaced(vma, 0, alignment, 0)) {
-   ret = i915_vma_unbind(vma);
+   ret = i915_vma_unbind_unlocked(vma);
if (ret) {
vma = ERR_PTR(ret);
goto err;
diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 493509f90b35..97473ac7c7f7 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -646,7 +646,7 @@ static int igt_mock_ppgtt_misaligned_dma(void *arg)
 * pages.
 */
for (offset = 4096; offset < page_size; offset += 4096) {
-   err = i915_vma_unbind(vma);
+   err = i915_vma_unbind_unlocked(vma);
if (err)
goto out_unpin;
 
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
index 8402ed925a69..8fb5be799b3c 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
@@ -318,7 +318,7 @@ static int pin_buffer(struct i915_vma *vma, u64 addr)
int err;
 
if (drm_mm_node_allocated(&vma->node) && vma->node.start != addr) {
-   err = i915_vma_unbind(vma);
+   err = i915_vma_unbind_unlocked(vma);
if (err)
return err;
}
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index 6d30cdfa80f3..e69e8861352d 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -165,7 +165,9 @@ static int check_partial_mapping(struct drm_i915_gem_object 
*obj,
kunmap(p);
 
 out:
+   i915_gem_object_lock(obj, NULL);
__i915_vma_put(vma);
+   i915_gem_object_unlock(obj);
return err;
 }
 
@@ -259,7 +261,9 @@ static int check_partial_mappings(struct 
drm_i915_gem_object *obj,
if (err)
return err;
 
+   i915_gem_object_lock(obj, NULL);
__i915_vma_put(vma);
+   i915_gem_object_unlock(obj);
 
if (igt_timeout(end_time,
"%s: timed out after tiling=%d stride=%d\n",
@@ -1349,7 +1353,9 @@ static int __igt_mmap_revoke(struct drm_i915_private 
*i915,
 * for other objects. Ergo we have to revoke the previous mmap PTE
 * access as it no longer points to the same object.
 */
+   i915_gem_object_lock(obj, NULL);
err = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
+   i915_gem_object_unlock(obj);
if (err) {
pr_err("Failed to unbind object!\n");
goto out_unmap;
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index df29fd992b45..44bc30458cac 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -118,22 +118,45 @@ void i915_ggtt_suspend(struct i915_ggtt *ggtt)
struct i915_vma *vma, *vn;
int open;
 
+retry:
+   i915_gem_drain_freed_objects(ggtt->vm.i915);
+
mutex_lock(&ggtt->vm.mutex);
 
/* Skip rewriting PTE on VMA unbind. */
open = atomic_xchg(&ggtt->vm.open, 0);
 
list_for_each_entry_safe(vma, vn, &ggtt->vm.bound_list, vm_link) {
+   str

[Intel-gfx] [PATCH 20/28] drm/i915: Ensure i915_vma tests do not get -ENOSPC with the locking changes.

2021-10-21 Thread Maarten Lankhorst
Now that we require locking to evict, multiple vmas from the same object
might not be evicted. This is expected and required, because execbuf will
move to short-term pinning by using the lock only. This causes these
tests to fail, because they create a ton of vmas for the same object.

Unbind manually to prevent spurious -ENOSPC in those mock tests.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/selftests/i915_vma.c | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c 
b/drivers/gpu/drm/i915/selftests/i915_vma.c
index 1f10fe36619b..5c5809dfe9b2 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -691,7 +691,11 @@ static int igt_vma_rotate_remap(void *arg)
}
 
i915_vma_unpin(vma);
-
+   err = i915_vma_unbind(vma);
+   if (err) {
+   pr_err("Unbinding returned 
%i\n", err);
+   goto out_object;
+   }
cond_resched();
}
}
@@ -848,6 +852,11 @@ static int igt_vma_partial(void *arg)
 
i915_vma_unpin(vma);
nvma++;
+   err = i915_vma_unbind(vma);
+   if (err) {
+   pr_err("Unbinding returned %i\n", err);
+   goto out_object;
+   }
 
cond_resched();
}
@@ -882,6 +891,12 @@ static int igt_vma_partial(void *arg)
 
i915_vma_unpin(vma);
 
+   err = i915_vma_unbind(vma);
+   if (err) {
+   pr_err("Unbinding returned %i\n", err);
+   goto out_object;
+   }
+
count = 0;
list_for_each_entry(vma, &obj->vma.list, obj_link)
count++;
-- 
2.33.0



[Intel-gfx] [PATCH 19/28] drm/i915: Pass trylock context to callers

2021-10-21 Thread Maarten Lankhorst
Signed-off-by: Maarten Lankhorst 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  8 --
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  4 +--
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |  2 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c|  2 +-
 drivers/gpu/drm/i915/gvt/aperture_gm.c|  2 +-
 drivers/gpu/drm/i915/i915_drv.h   |  5 +++-
 drivers/gpu/drm/i915/i915_gem_evict.c | 15 ++-
 drivers/gpu/drm/i915/i915_gem_gtt.c   |  8 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h   |  3 +++
 drivers/gpu/drm/i915/i915_vgpu.c  |  2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 13 +
 .../gpu/drm/i915/selftests/i915_gem_evict.c   | 27 ---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 14 +-
 18 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e614591ca510..bbf2a10738f7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -764,7 +764,7 @@ static int eb_reserve(struct i915_execbuffer *eb)
case 1:
/* Too fragmented, unbind everything and retry */
mutex_lock(&eb->context->vm->mutex);
-   err = i915_gem_evict_vm(eb->context->vm);
+   err = i915_gem_evict_vm(eb->context->vm, &eb->ww);
mutex_unlock(&eb->context->vm->mutex);
if (err)
return err;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 59201801cec5..accd8fad0ed7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -211,9 +211,13 @@ static inline int 
i915_gem_object_lock_interruptible(struct drm_i915_gem_object
return __i915_gem_object_lock(obj, ww, true);
 }
 
-static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj)
+static inline bool i915_gem_object_trylock(struct drm_i915_gem_object *obj,
+  struct i915_gem_ww_ctx *ww)
 {
-   return dma_resv_trylock(obj->base.resv);
+   if (!ww)
+   return dma_resv_trylock(obj->base.resv);
+   else
+   return ww_mutex_trylock(&obj->base.resv->lock, &ww->ctx);
 }
 
 static inline void i915_gem_object_unlock(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index 34c12e5983eb..5375f3f9f016 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -210,7 +210,7 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
 
/* May arrive from get_pages on another bo */
if (!ww) {
-   if (!i915_gem_object_trylock(obj))
+   if (!i915_gem_object_trylock(obj, NULL))
goto skip;
} else {
err = i915_gem_object_lock(obj, ww);
@@ -408,7 +408,7 @@ i915_gem_shrinker_vmap(struct notifier_block *nb, unsigned 
long event, void *ptr
if (!vma->iomap || i915_vma_is_active(vma))
continue;
 
-   if (!i915_gem_object_trylock(obj))
+   if (!i915_gem_object_trylock(obj, NULL))
continue;
 
if (__i915_vma_unbind(vma) == 0)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index ddd37ccb1362..d629ef5abf9a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -653,7 +653,7 @@ static int __i915_gem_object_create_stolen(struct 
intel_memory_region *mem,
cache_level = HAS_LLC(mem->i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
i915_gem_object_set_cache_coherency(obj, cache_level);
 
-   if (WARN_ON(!i915_gem_object_trylock(obj)))
+   if (WARN_ON(!i915_gem_object_trylock(obj, NULL)))
return -EBUSY;
 
i915_gem_object_init_memory_region(obj, mem);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 849fbb229bd3..d66c8bd50ae3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -26,7 +26,7 @@ static void dbg_poison_ce(struct intel_context *ce)
int type = i915_coherent_map_type(ce->engine->i915, obj, true);
 

[Intel-gfx] [PATCH 25/28] drm/i915: Require object lock when freeing pages during destruction

2021-10-21 Thread Maarten Lankhorst
TTM already requires this, and we require it for delayed destroy.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 1e426a42a36c..d549f34829d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -257,6 +257,8 @@ static void __i915_gem_object_free_mmaps(struct 
drm_i915_gem_object *obj)
  */
 void __i915_gem_object_pages_fini(struct drm_i915_gem_object *obj)
 {
+   assert_object_held(obj);
+
if (!list_empty(&obj->vma.list)) {
struct i915_vma *vma;
 
@@ -323,7 +325,10 @@ static void __i915_gem_free_objects(struct 
drm_i915_private *i915,
obj->ops->delayed_free(obj);
continue;
}
+
+   i915_gem_object_lock(obj, NULL);
__i915_gem_object_pages_fini(obj);
+   i915_gem_object_unlock(obj);
__i915_gem_free_object(obj);
 
/* But keep the pointer alive for RCU-protected lookups */
-- 
2.33.0



[Intel-gfx] [PATCH 14/28] drm/i915: Take object lock in i915_ggtt_pin if ww is not set

2021-10-21 Thread Maarten Lankhorst
i915_vma_wait_for_bind needs the vma lock held; fix the caller.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_vma.c | 40 +++--
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index bacc8d68e495..2877dcd62acb 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1348,23 +1348,15 @@ static void flush_idle_contexts(struct intel_gt *gt)
intel_gt_wait_for_idle(gt, MAX_SCHEDULE_TIMEOUT);
 }
 
-int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
- u32 align, unsigned int flags)
+static int __i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
+  u32 align, unsigned int flags)
 {
struct i915_address_space *vm = vma->vm;
int err;
 
-   GEM_BUG_ON(!i915_vma_is_ggtt(vma));
-
-#ifdef CONFIG_LOCKDEP
-   WARN_ON(!ww && dma_resv_held(vma->obj->base.resv));
-#endif
-
do {
-   if (ww)
-   err = i915_vma_pin_ww(vma, ww, 0, align, flags | 
PIN_GLOBAL);
-   else
-   err = i915_vma_pin(vma, 0, align, flags | PIN_GLOBAL);
+   err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL);
+
if (err != -ENOSPC) {
if (!err) {
err = i915_vma_wait_for_bind(vma);
@@ -1383,6 +1375,30 @@ int i915_ggtt_pin(struct i915_vma *vma, struct 
i915_gem_ww_ctx *ww,
} while (1);
 }
 
+int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww,
+ u32 align, unsigned int flags)
+{
+   struct i915_gem_ww_ctx _ww;
+   int err;
+
+   GEM_BUG_ON(!i915_vma_is_ggtt(vma));
+
+   if (ww)
+   return __i915_ggtt_pin(vma, ww, align, flags);
+
+#ifdef CONFIG_LOCKDEP
+   WARN_ON(dma_resv_held(vma->obj->base.resv));
+#endif
+
+   for_i915_gem_ww(&_ww, err, true) {
+   err = i915_gem_object_lock(vma->obj, &_ww);
+   if (!err)
+   err = __i915_ggtt_pin(vma, &_ww, align, flags);
+   }
+
+   return err;
+}
+
 static void __vma_close(struct i915_vma *vma, struct intel_gt *gt)
 {
/*
-- 
2.33.0



[Intel-gfx] [PATCH 28/28] drm/i915: Remove short-term pins from execbuf, v4.

2021-10-21 Thread Maarten Lankhorst
Add a flag PIN_VALIDATE, to indicate we don't need to pin and are only
protected by the object lock.

This removes the need to unpin, which is done by just releasing the
lock.

eb_reserve is slightly reworked for readability, but the same steps
are still done:
- First pass pins with NONBLOCK.
- Second pass unbinds all objects first, then pins.
- Third pass is only called when not all objects are softpinned, and
  unbinds all objects, then calls i915_gem_evict_vm(), then pins.

When evicting the entire vm in eb_reserve() we do temporarily pin objects
that are marked with EXEC_OBJECT_PINNED. This is because they are already
at their destination, and i915_gem_evict_vm() would otherwise unbind them.

However, we reduce the visibility of those pins by limiting the pin
to our call to i915_gem_evict_vm() only, and pin with vm->mutex held,
instead of the entire duration of the execbuf.

Not sure the latter matters; one can hope..
In theory we could kill the pinning by adding an extra flag to the vma
to temporarily prevent unbinding from the gtt for i915_gem_evict_vm only, but
I think that might be overkill. We're still holding the object lock, and
we don't have blocking eviction yet. It's likely sufficient to simply
enforce EXEC_OBJECT_PINNED for all objects on >= gen12.
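
The three passes above boil down to a loop of roughly this shape (condensed
sketch; eb_evict_vm() is the helper added in the diff below, while
eb_pin_all(), unbind_all_unpinned() and all_objects_softpinned() are
illustrative placeholders for the split-out reservation helpers):

  for (pass = 0; pass <= 2; pass++) {
          if (pass == 2 && all_objects_softpinned(eb))
                  break;                  /* third pass not needed */

          if (pass >= 1)
                  unbind_all_unpinned(eb);
          if (pass == 2) {
                  err = eb_evict_vm(eb);  /* last resort: evict the whole vm */
                  if (err)
                          return err;
          }

          /* first pass uses PIN_NONBLOCK so existing bindings are reused */
          err = eb_pin_all(eb, pass == 0 ? PIN_NONBLOCK : 0);
          if (err != -ENOSPC)
                  return err;
  }
  return -ENOSPC;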

Changes since v1:
- Split out eb_reserve() into separate functions for readability.
Changes since v2:
- Make batch buffer mappable on platforms where only GGTT is available,
  to prevent moving the batch buffer during relocations.
Changes since v3:
- Preserve current behavior for batch buffer, instead be cautious when
  calling i915_gem_object_ggtt_pin_ww, and re-use the current batch vma
  if it's inside ggtt and map-and-fenceable.

Signed-off-by: Maarten Lankhorst 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 252 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h   |   1 +
 drivers/gpu/drm/i915/i915_vma.c   |  24 +-
 3 files changed, 161 insertions(+), 116 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index bbf2a10738f7..19f91143cfcf 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -439,7 +439,7 @@ eb_pin_vma(struct i915_execbuffer *eb,
else
pin_flags = entry->offset & PIN_OFFSET_MASK;
 
-   pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED;
+   pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED | PIN_VALIDATE;
if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_GTT))
pin_flags |= PIN_GLOBAL;
 
@@ -457,17 +457,15 @@ eb_pin_vma(struct i915_execbuffer *eb,
 entry->pad_to_size,
 entry->alignment,
 eb_pin_flags(entry, ev->flags) |
-PIN_USER | PIN_NOEVICT);
+PIN_USER | PIN_NOEVICT | 
PIN_VALIDATE);
if (unlikely(err))
return err;
}
 
if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
err = i915_vma_pin_fence(vma);
-   if (unlikely(err)) {
-   i915_vma_unpin(vma);
+   if (unlikely(err))
return err;
-   }
 
if (vma->fence)
ev->flags |= __EXEC_OBJECT_HAS_FENCE;
@@ -483,13 +481,9 @@ eb_pin_vma(struct i915_execbuffer *eb,
 static inline void
 eb_unreserve_vma(struct eb_vma *ev)
 {
-   if (!(ev->flags & __EXEC_OBJECT_HAS_PIN))
-   return;
-
if (unlikely(ev->flags & __EXEC_OBJECT_HAS_FENCE))
__i915_vma_unpin_fence(ev->vma);
 
-   __i915_vma_unpin(ev->vma);
ev->flags &= ~__EXEC_OBJECT_RESERVED;
 }
 
@@ -682,10 +676,8 @@ static int eb_reserve_vma(struct i915_execbuffer *eb,
 
if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
err = i915_vma_pin_fence(vma);
-   if (unlikely(err)) {
-   i915_vma_unpin(vma);
+   if (unlikely(err))
return err;
-   }
 
if (vma->fence)
ev->flags |= __EXEC_OBJECT_HAS_FENCE;
@@ -697,85 +689,129 @@ static int eb_reserve_vma(struct i915_execbuffer *eb,
return 0;
 }
 
-static int eb_reserve(struct i915_execbuffer *eb)
+static int eb_evict_vm(struct i915_execbuffer *eb)
+{
+   const unsigned int count = eb->buffer_count;
+   unsigned int i;
+   int err;
+
+   err = mutex_lock_interruptible(&eb->context->vm->mutex);
+   if (err)
+   return err;
+
+   /* pin to protect against i915_gem_evict_vm evicting below */
+   for (i = 0; i < count; i++) {
+   struct eb_vma *ev = &eb->vma[i];
+
+   if (ev->flags & __EXEC_OBJECT_HAS_PIN)
+ 

[Intel-gfx] [PATCH 22/28] drm/i915: Make i915_gem_evict_vm work correctly for already locked objects

2021-10-21 Thread Maarten Lankhorst
i915_gem_execbuf will call i915_gem_evict_vm() after failing to pin
all objects in the first round. We are about to remove those short-term
pins, but even without those the objects are still locked. Add a special
case to allow i915_gem_evict_vm to evict locked objects as well.

This might also allow multiple objects sharing the same resv to be evicted.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 24f5e3345e43..f502a617b35c 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -410,21 +410,42 @@ int i915_gem_evict_vm(struct i915_address_space *vm, struct i915_gem_ww_ctx *ww)
 	do {
 		struct i915_vma *vma, *vn;
 		LIST_HEAD(eviction_list);
+		LIST_HEAD(locked_eviction_list);
 
 		list_for_each_entry(vma, &vm->bound_list, vm_link) {
 			if (i915_vma_is_pinned(vma))
 				continue;
 
+			/*
+			 * If we already own the lock, trylock fails. In case the resv
+			 * is shared among multiple objects, we still need the object ref.
+			 */
+			if (ww && (dma_resv_locking_ctx(vma->obj->base.resv) == &ww->ctx)) {
+				__i915_vma_pin(vma);
+				list_add(&vma->evict_link, &locked_eviction_list);
+				continue;
+			}
+
 			if (!i915_gem_object_trylock(vma->obj, ww))
 				continue;
 
 			__i915_vma_pin(vma);
 			list_add(&vma->evict_link, &eviction_list);
 		}
-		if (list_empty(&eviction_list))
+		if (list_empty(&eviction_list) && list_empty(&locked_eviction_list))
 			break;
 
 		ret = 0;
+		/* Unbind locked objects first, before unlocking the eviction_list */
+		list_for_each_entry_safe(vma, vn, &locked_eviction_list, evict_link) {
+			__i915_vma_unpin(vma);
+
+			if (ret == 0)
+				ret = __i915_vma_unbind(vma);
+			if (ret != -EINTR) /* "Get me out of here!" */
+				ret = 0;
+		}
+
 		list_for_each_entry_safe(vma, vn, &eviction_list, evict_link) {
 			__i915_vma_unpin(vma);
 			if (ret == 0)
-- 
2.33.0
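
The check added above boils down to "is this reservation already locked by my
ww transaction?". As a small sketch (the patch open-codes this in the loop;
the helper name below is made up):

	static bool vma_locked_by_us(struct i915_vma *vma, struct i915_gem_ww_ctx *ww)
	{
		/* dma_resv_locking_ctx() returns the ww_acquire_ctx that
		 * currently holds the reservation lock, or NULL */
		return ww && dma_resv_locking_ctx(vma->obj->base.resv) == &ww->ctx;
	}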



[Intel-gfx] [PATCH 26/28] drm/i915: Remove assert_object_held_shared

2021-10-21 Thread Maarten Lankhorst
This duct-tape workaround is no longer required: unbind and destroy have been
fixed to take the obj->resv mutex before destroying, and obj->mm.lock has been
removed, so obj->resv is always required as well.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c  |  4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_object.h  | 14 --
 drivers/gpu/drm/i915/gem/i915_gem_pages.c   | 12 ++--
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c |  2 +-
 drivers/gpu/drm/i915/i915_vma.c |  6 +++---
 5 files changed, 12 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index d549f34829d0..a4fb76b7841d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -548,7 +548,7 @@ bool i915_gem_object_has_struct_page(const struct 
drm_i915_gem_object *obj)
 #ifdef CONFIG_LOCKDEP
if (IS_DGFX(to_i915(obj->base.dev)) &&
i915_gem_object_evictable((void __force *)obj))
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 #endif
return obj->mem_flags & I915_BO_FLAG_STRUCT_PAGE;
 }
@@ -567,7 +567,7 @@ bool i915_gem_object_has_iomem(const struct 
drm_i915_gem_object *obj)
 #ifdef CONFIG_LOCKDEP
if (IS_DGFX(to_i915(obj->base.dev)) &&
i915_gem_object_evictable((void __force *)obj))
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 #endif
return obj->mem_flags & I915_BO_FLAG_IOMEM;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index accd8fad0ed7..ef4f2568d46a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -158,20 +158,6 @@ i915_gem_object_put(struct drm_i915_gem_object *obj)
 
 #define assert_object_held(obj) dma_resv_assert_held((obj)->base.resv)
 
-/*
- * If more than one potential simultaneous locker, assert held.
- */
-static inline void assert_object_held_shared(const struct drm_i915_gem_object *obj)
-{
-   /*
-* Note mm list lookup is protected by
-* kref_get_unless_zero().
-*/
-   if (IS_ENABLED(CONFIG_LOCKDEP) &&
-   kref_read(&obj->base.refcount) > 0)
-   assert_object_held(obj);
-}
-
 static inline int __i915_gem_object_lock(struct drm_i915_gem_object *obj,
 struct i915_gem_ww_ctx *ww,
 bool intr)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 8eb1c3a6fc9c..dbd226b0ea49 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -19,7 +19,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object 
*obj,
bool shrinkable;
int i;
 
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 
if (i915_gem_object_is_volatile(obj))
obj->mm.madv = I915_MADV_DONTNEED;
@@ -94,7 +94,7 @@ int i915_gem_object_get_pages(struct drm_i915_gem_object 
*obj)
struct drm_i915_private *i915 = to_i915(obj->base.dev);
int err;
 
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 
if (unlikely(obj->mm.madv != I915_MADV_WILLNEED)) {
drm_dbg(&i915->drm,
@@ -121,7 +121,7 @@ int __i915_gem_object_get_pages(struct drm_i915_gem_object 
*obj)
 
assert_object_held(obj);
 
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 
if (unlikely(!i915_gem_object_has_pages(obj))) {
GEM_BUG_ON(i915_gem_object_has_pinned_pages(obj));
@@ -168,7 +168,7 @@ void i915_gem_object_truncate(struct drm_i915_gem_object 
*obj)
 /* Try to discard unwanted pages */
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj)
 {
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
GEM_BUG_ON(i915_gem_object_has_pages(obj));
 
if (obj->ops->writeback)
@@ -199,7 +199,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object 
*obj)
 {
struct sg_table *pages;
 
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 
pages = fetch_and_zero(&obj->mm.pages);
if (IS_ERR_OR_NULL(pages))
@@ -229,7 +229,7 @@ int __i915_gem_object_put_pages(struct drm_i915_gem_object 
*obj)
return -EBUSY;
 
/* May be called by shrinker from within get_pages() (on another bo) */
-   assert_object_held_shared(obj);
+   assert_object_held(obj);
 
i915_gem_object_release_mmap_offset(obj);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 3173c9f9a040..a315c010f635 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -109,7 +109,7 @@ static void i915

[Intel-gfx] [PATCH 21/28] drm/i915: Drain the ttm delayed workqueue too

2021-10-21 Thread Maarten Lankhorst
Be thorough..

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_drv.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 22c891720c6d..7c5ed5957fe2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1819,6 +1819,7 @@ static inline void i915_gem_drain_freed_objects(struct 
drm_i915_private *i915)
 */
while (atomic_read(&i915->mm.free_count)) {
flush_work(&i915->mm.free_work);
+   flush_delayed_work(&i915->bdev.wq);
rcu_barrier();
}
 }
-- 
2.33.0



[Intel-gfx] [PATCH 27/28] drm/i915: Remove support for unlocked i915_vma unbind

2021-10-21 Thread Maarten Lankhorst
Now that we require the object lock for all ops, some code handling
race conditions can be removed.

This is required to not take short-term pins inside execbuf.

Signed-off-by: Maarten Lankhorst 
Acked-by: Niranjana Vishwanathapura 
---
 drivers/gpu/drm/i915/i915_vma.c | 40 +
 1 file changed, 5 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 8131dbf89048..65168db534f0 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -749,7 +749,6 @@ i915_vma_detach(struct i915_vma *vma)
 static bool try_qad_pin(struct i915_vma *vma, unsigned int flags)
 {
unsigned int bound;
-   bool pinned = true;
 
bound = atomic_read(&vma->flags);
do {
@@ -759,34 +758,10 @@ static bool try_qad_pin(struct i915_vma *vma, unsigned 
int flags)
if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR)))
return false;
 
-   if (!(bound & I915_VMA_PIN_MASK))
-   goto unpinned;
-
GEM_BUG_ON(((bound + 1) & I915_VMA_PIN_MASK) == 0);
} while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
 
return true;
-
-unpinned:
-   /*
-* If pin_count==0, but we are bound, check under the lock to avoid
-* racing with a concurrent i915_vma_unbind().
-*/
-   mutex_lock(&vma->vm->mutex);
-   do {
-   if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR))) {
-   pinned = false;
-   break;
-   }
-
-   if (unlikely(flags & ~bound)) {
-   pinned = false;
-   break;
-   }
-   } while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
-   mutex_unlock(&vma->vm->mutex);
-
-   return pinned;
 }
 
 
@@ -1112,13 +1087,7 @@ __i915_vma_get_pages(struct i915_vma *vma)
vma->ggtt_view.type, ret);
}
 
-   pages = xchg(&vma->pages, pages);
-
-   /* did we race against a put_pages? */
-   if (pages && pages != vma->obj->mm.pages) {
-   sg_free_table(vma->pages);
-   kfree(vma->pages);
-   }
+   vma->pages = pages;
 
return ret;
 }
@@ -1152,13 +1121,14 @@ I915_SELFTEST_EXPORT int i915_vma_get_pages(struct 
i915_vma *vma)
 static void __vma_put_pages(struct i915_vma *vma, unsigned int count)
 {
/* We allocate under vma_get_pages, so beware the shrinker */
-   struct sg_table *pages = READ_ONCE(vma->pages);
+   struct sg_table *pages = vma->pages;
 
GEM_BUG_ON(atomic_read(&vma->pages_count) < count);
 
if (atomic_sub_return(count, &vma->pages_count) == 0) {
-   if (pages == cmpxchg(&vma->pages, pages, NULL) &&
-   pages != vma->obj->mm.pages) {
+   vma->pages = NULL;
+
+   if (pages != vma->obj->mm.pages) {
sg_free_table(pages);
kfree(pages);
}
-- 
2.33.0



Re: [Intel-gfx] [PATCH 3/4] drm/i915: Use vblank workers for gamma updates

2021-10-21 Thread Ville Syrjälä
On Thu, Oct 21, 2021 at 01:35:12PM +0300, Jani Nikula wrote:
> On Thu, 21 Oct 2021, Ville Syrjala  wrote:
> > From: Ville Syrjälä 
> >
> > The pipe gamma registers are single buffered so they should only
> > be updated during the vblank to avoid screen tearing. In fact they
> > really should only be updated between start of vblank and frame
> > start because that is the only time the pipe is guaranteed to be
> > empty. Already at frame start the pipe begins to fill up with
> > data for the next frame.
> >
> > Unfortunately frame start happens ~1 scanline after the start
> > of vblank which in practice doesn't always leave us enough time to
> > finish the gamma update in time (gamma LUTs can be several KiB of
> > data we have to bash into the registers). However we must try our
> > best and so we'll add a vblank work for each pipe from where we
> > can do the gamma update. Additionally we could consider pushing
> > frame start forward to the max of ~4 scanlines after start of
> > vblank. But not sure that's exactly a validated configuration.
> > As it stands the ~100 first pixels tend to make it through with
> > the old gamma values.
> >
> > Even though the vblank worker is running on a high priority thread
> > we still have to contend with C-states. If the CPU happens be in
> > a deep C-state when the vblank interrupt arrives even the irq
> > handler gets delayed massively (I've observed dozens of scanlines
> > worth of latency). To avoid that problem we'll use the qos mechanism
> > to keep the CPU awake while the vblank work is scheduled.
> >
> > With all this hooked up we can finally enjoy near atomic gamma
> > updates. It even works across several pipes from the same atomic
> > commit which previously was a total fail because we did the
> > gamma updates for each pipe serially after waiting for all
> > pipes to have latched the double buffered registers.
> >
> > In the future the DSB should take over this responsibility
> > which will hopefully avoid some of these issues.
> >
> > Kudos to Lyude for finishing the actual vblank workers.
> > Works like the proverbial train toilet.
> >
> > v2: Add missing intel_atomic_state fwd declaration
> > v3: Clean up properly when not scheduling the worker
> > v4: Clean up the rest and add tracepoints
> >
> > CC: Lyude Paul 
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  drivers/gpu/drm/i915/display/intel_crtc.c | 76 ++-
> >  drivers/gpu/drm/i915/display/intel_crtc.h |  4 +-
> >  drivers/gpu/drm/i915/display/intel_display.c  |  9 +--
> >  .../drm/i915/display/intel_display_types.h|  8 ++
> >  drivers/gpu/drm/i915/i915_trace.h | 42 ++
> >  5 files changed, 129 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c 
> > b/drivers/gpu/drm/i915/display/intel_crtc.c
> > index 0f8b48b6911c..4758c61adae8 100644
> > --- a/drivers/gpu/drm/i915/display/intel_crtc.c
> > +++ b/drivers/gpu/drm/i915/display/intel_crtc.c
> > @@ -3,12 +3,14 @@
> >   * Copyright © 2020 Intel Corporation
> >   */
> >  #include 
> > +#include 
> >  #include 
> >  
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include "i915_trace.h"
> >  #include "i915_vgpu.h"
> > @@ -167,6 +169,8 @@ static void intel_crtc_destroy(struct drm_crtc *_crtc)
> >  {
> > struct intel_crtc *crtc = to_intel_crtc(_crtc);
> >  
> > +   cpu_latency_qos_remove_request(&crtc->vblank_pm_qos);
> > +
> > drm_crtc_cleanup(&crtc->base);
> > kfree(crtc);
> >  }
> > @@ -344,6 +348,8 @@ int intel_crtc_init(struct drm_i915_private *dev_priv, 
> > enum pipe pipe)
> >  
> > intel_crtc_crc_init(crtc);
> >  
> > +   cpu_latency_qos_add_request(&crtc->vblank_pm_qos, PM_QOS_DEFAULT_VALUE);
> > +
> > drm_WARN_ON(&dev_priv->drm, drm_crtc_index(&crtc->base) != crtc->pipe);
> >  
> > return 0;
> > @@ -354,6 +360,65 @@ int intel_crtc_init(struct drm_i915_private *dev_priv, 
> > enum pipe pipe)
> > return ret;
> >  }
> >  
> > +static bool intel_crtc_needs_vblank_work(const struct intel_crtc_state 
> > *crtc_state)
> > +{
> > +   return crtc_state->hw.active &&
> > +   !intel_crtc_needs_modeset(crtc_state) &&
> > +   !crtc_state->preload_luts &&
> > +   (crtc_state->uapi.color_mgmt_changed ||
> > +crtc_state->update_pipe);
> > +}
> > +
> > +static void intel_crtc_vblank_work(struct kthread_work *base)
> > +{
> > +   struct drm_vblank_work *work = to_drm_vblank_work(base);
> > +   struct intel_crtc_state *crtc_state =
> > +   container_of(work, typeof(*crtc_state), vblank_work);
> > +   struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > +
> > +   trace_intel_crtc_vblank_work_start(crtc);
> > +
> > +   intel_color_load_luts(crtc_state);
> > +
> > +   if (crtc_state->uapi.event) {
> > +   spin_lock_irq(&crtc->base.dev->event_lock);
> > +   drm_crtc_send_vblank_event(&crtc->base, crtc_state->uapi.event);
> > +   crtc_state
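
The C-state handling described above follows the usual CPU latency QoS
pattern; a condensed sketch (the add/remove calls match the quoted diff, while
the exact points where the request is tightened and relaxed around the vblank
work are assumptions based on the commit message, since that part of the patch
is not quoted here):

	struct pm_qos_request qos;

	/* at crtc init: register an inactive request */
	cpu_latency_qos_add_request(&qos, PM_QOS_DEFAULT_VALUE);

	/* when the vblank work is scheduled: forbid deep C-states so neither
	 * the vblank irq nor the high-priority worker gets delayed */
	cpu_latency_qos_update_request(&qos, 0);

	/* once the gamma LUTs have been written: relax the constraint again */
	cpu_latency_qos_update_request(&qos, PM_QOS_DEFAULT_VALUE);

	/* at crtc destroy */
	cpu_latency_qos_remove_request(&qos);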

Re: [Intel-gfx] [PATCH v4 01/11] drm/i915: Add a table with a descriptor for all i915 modifiers

2021-10-21 Thread Imre Deak
On Thu, Oct 21, 2021 at 01:14:59PM +0300, Jani Nikula wrote:
> On Wed, 20 Oct 2021, Imre Deak  wrote:
> > Add a table describing all the framebuffer modifiers used by i915 at one
> > place. This has the benefit of deduplicating the listing of supported
> > modifiers for each platform and checking the support of these modifiers
> > on a given plane. This also simplifies in a similar way getting some
> > attribute for a modifier, for instance checking if the modifier is a
> > CCS modifier type.
> >
> > While at it drop the cursor plane filtering from skl_plane_has_rc_ccs(),
> > as the cursor plane is registered with DRM core elsewhere.
> >
> > v1: Unchanged.
> > v2:
> > - Keep the plane caps calculation in the plane code and pass an enum
> >   with these caps to intel_fb_get_modifiers(). (Ville)
> > - Get the modifiers calling intel_fb_get_modifiers() in i9xx_plane.c as
> >   well.
> > v3:
> > - s/.id/.modifier/ (Ville)
> > - Keep modifier_desc vs. plane_cap filter conditions consistent. (Ville)
> > - Drop redundant cursor plane check from skl_plane_has_rc_ccs(). (Ville)
> > - Use from, until display version fields in modifier_desc instead of a 
> > mask. (Jani)
> > - Unexport struct intel_modifier_desc, separate its decl and init. (Jani)
> > - Remove enum pipe, plane_id forward decls from intel_fb.h, which are
> >   not needed after v2.
> > v4:
> > - Reuse IS_DISPLAY_VER() instead of open-coding it. (Jani)
> > - Preserve the current modifier order exposed to user space. (Ville)
> >
> > Cc: Ville Syrjälä 
> > Cc: Juha-Pekka Heikkila 
> > Cc: Jani Nikula 
> > Signed-off-by: Imre Deak 
> > Reviewed-by: Juha-Pekka Heikkila  (v3)
> > ---
> >  drivers/gpu/drm/i915/display/i9xx_plane.c |  30 +--
> >  drivers/gpu/drm/i915/display/intel_cursor.c   |  19 +-
> >  .../drm/i915/display/intel_display_types.h|   1 -
> >  drivers/gpu/drm/i915/display/intel_fb.c   | 152 +++
> >  drivers/gpu/drm/i915/display/intel_fb.h   |  13 ++
> >  drivers/gpu/drm/i915/display/intel_sprite.c   |  35 +---
> >  drivers/gpu/drm/i915/display/skl_scaler.c |   1 +
> >  .../drm/i915/display/skl_universal_plane.c| 178 +-
> >  8 files changed, 245 insertions(+), 184 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c 
> > b/drivers/gpu/drm/i915/display/i9xx_plane.c
> > index b1439ba78f67b..a939accff7ee2 100644
> > --- a/drivers/gpu/drm/i915/display/i9xx_plane.c
> > +++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
> > @@ -60,22 +60,11 @@ static const u32 vlv_primary_formats[] = {
> > DRM_FORMAT_XBGR16161616F,
> >  };
> >  
> > -static const u64 i9xx_format_modifiers[] = {
> > -   I915_FORMAT_MOD_X_TILED,
> > -   DRM_FORMAT_MOD_LINEAR,
> > -   DRM_FORMAT_MOD_INVALID
> > -};
> > -
> >  static bool i8xx_plane_format_mod_supported(struct drm_plane *_plane,
> > u32 format, u64 modifier)
> >  {
> > -   switch (modifier) {
> > -   case DRM_FORMAT_MOD_LINEAR:
> > -   case I915_FORMAT_MOD_X_TILED:
> > -   break;
> > -   default:
> > +   if (!intel_fb_plane_supports_modifier(to_intel_plane(_plane), modifier))
> > return false;
> > -   }
> >  
> > switch (format) {
> > case DRM_FORMAT_C8:
> > @@ -92,13 +81,8 @@ static bool i8xx_plane_format_mod_supported(struct 
> > drm_plane *_plane,
> >  static bool i965_plane_format_mod_supported(struct drm_plane *_plane,
> > u32 format, u64 modifier)
> >  {
> > -   switch (modifier) {
> > -   case DRM_FORMAT_MOD_LINEAR:
> > -   case I915_FORMAT_MOD_X_TILED:
> > -   break;
> > -   default:
> > +   if (!intel_fb_plane_supports_modifier(to_intel_plane(_plane), modifier))
> > return false;
> > -   }
> >  
> > switch (format) {
> > case DRM_FORMAT_C8:
> > @@ -768,6 +752,7 @@ intel_primary_plane_create(struct drm_i915_private 
> > *dev_priv, enum pipe pipe)
> > struct intel_plane *plane;
> > const struct drm_plane_funcs *plane_funcs;
> > unsigned int supported_rotations;
> > +   const u64 *modifiers;
> > const u32 *formats;
> > int num_formats;
> > int ret, zpos;
> > @@ -875,21 +860,26 @@ intel_primary_plane_create(struct drm_i915_private 
> > *dev_priv, enum pipe pipe)
> > plane->disable_flip_done = ilk_primary_disable_flip_done;
> > }
> >  
> > +   modifiers = intel_fb_plane_get_modifiers(dev_priv, PLANE_HAS_TILING);
> > +
> > if (DISPLAY_VER(dev_priv) >= 5 || IS_G4X(dev_priv))
> > ret = drm_universal_plane_init(&dev_priv->drm, &plane->base,
> >0, plane_funcs,
> >formats, num_formats,
> > -  i9xx_format_modifiers,
> > +  modifiers,
> >DRM_PLANE_TYPE_PRIMARY,
> >"primary %c", pipe_name(pipe));
> > else
> >
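
To illustrate the descriptor-table idea discussed above, a simplified sketch
(the struct layout, version ranges and helper below are illustrative
placeholders, not the actual intel_fb.c definitions):

	struct intel_modifier_desc {
		u64 modifier;		/* DRM framebuffer modifier */
		u8 display_ver_from;	/* first display version supporting it */
		u8 display_ver_until;	/* last display version supporting it */
		u8 plane_caps;		/* caps a plane must provide, e.g. PLANE_HAS_TILING */
	};

	static const struct intel_modifier_desc intel_modifiers[] = {
		{ .modifier = I915_FORMAT_MOD_X_TILED, .display_ver_from = 2,
		  .display_ver_until = 13, .plane_caps = PLANE_HAS_TILING },
		{ .modifier = DRM_FORMAT_MOD_LINEAR, .display_ver_from = 2,
		  .display_ver_until = 13 },
	};

	static bool plane_supports_modifier(u8 display_ver, u8 plane_caps, u64 modifier)
	{
		int i;

		for (i = 0; i < ARRAY_SIZE(intel_modifiers); i++) {
			const struct intel_modifier_desc *md = &intel_modifiers[i];

			if (md->modifier != modifier)
				continue;

			/* supported if the platform is in range and the plane
			 * has every capability the modifier requires */
			return display_ver >= md->display_ver_from &&
			       display_ver <= md->display_ver_until &&
			       !(md->plane_caps & ~plane_caps);
		}

		return false;
	}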

[Intel-gfx] ✓ Fi.CI.BAT: success for Selective fetch support for biplanar formats

2021-10-21 Thread Patchwork
== Series Details ==

Series: Selective fetch support for biplanar formats
URL   : https://patchwork.freedesktop.org/series/96113/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10768 -> Patchwork_21401


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_21401:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_frontbuffer_tracking@basic:
- {fi-hsw-gt1}:   [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html

  
Known issues


  Here are the changes found in Patchwork_21401 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-snb-2600:NOTRUN -> [SKIP][3] ([fdo#109271]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-snb-2600/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@gem_huc_copy@huc-copy:
- fi-glk-dsi: NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#2190])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-glk-dsi/igt@gem_huc_c...@huc-copy.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bsw-nick:NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-bsw-nick/igt@kms_chamel...@dp-crc-fast.html

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-glk-dsi: NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-glk-dsi/igt@kms_chamel...@hdmi-hpd-fast.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-glk-dsi: NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#533])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-glk-dsi/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_psr@primary_page_flip:
- fi-glk-dsi: NOTRUN -> [SKIP][8] ([fdo#109271]) +30 similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-glk-dsi/igt@kms_psr@primary_page_flip.html

  * igt@prime_vgem@basic-fence-flip:
- fi-bsw-nick:NOTRUN -> [SKIP][9] ([fdo#109271]) +63 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-bsw-nick/igt@prime_v...@basic-fence-flip.html

  
 Possible fixes 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][10] ([i915#1888]) -> [PASS][11]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][12] ([i915#3303]) -> [PASS][13]
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
- fi-snb-2600:[INCOMPLETE][14] ([i915#3921]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#3921]: https://gitlab.freedesktop.org/drm/intel/issues/3921
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Participating hosts (41 -> 34)
--

  Additional (2): fi-glk-dsi fi-bsw-nick 
  Missing(9): fi-kbl-soraka fi-bdw-5557u bat-dg1-6 bat-dg1-5 fi-hsw-4200u 
fi-bsw-cyan bat-adlp-4 fi-ctg-p8600 fi-kbl-8809g 


Build changes
-

  * Linux: CI_DRM_10768 -> Patchwork_21401

  CI-20190529: 20190529
  CI_DRM_10768: 0e1c99720e0793390c9758dc1b4eedd7395b1382 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [01/28] drm/i915: Fix i915_request fence wait 
semantics
URL   : https://patchwork.freedesktop.org/series/96115/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
ddb763fd10e7 drm/i915: Fix i915_request fence wait semantics
0c036c1715f3 drm/i915: use new iterator in i915_gem_object_wait_reservation
6df3a2f9c58a drm/i915: Remove dma_resv_prune
-:23: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#23: 
deleted file mode 100644

total: 0 errors, 1 warnings, 0 checks, 42 lines checked
1740df189181 drm/i915: Remove unused bits of i915_vma/active api
ec8de168f667 drm/i915: Slightly rework EXEC_OBJECT_CAPTURE handling, v2.
-:73: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#73: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3134:
+   kvcalloc(eb->capture_count + 1,
+   sizeof(*eb->requests[i]->capture_list),

-:77: CHECK:LINE_SPACING: Please don't use multiple blank lines
#77: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:3138:
+
+

total: 0 errors, 0 warnings, 2 checks, 119 lines checked
ce8a62bcaae3 drm/i915: Remove gen6_ppgtt_unpin_all
e22208d09f16 drm/i915: Create a dummy object for gen6 ppgtt
-:178: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#178: FILE: drivers/gpu/drm/i915/gt/gen6_ppgtt.c:376:
+static void pd_dummy_obj_put_pages(struct drm_i915_gem_object *obj,
+struct sg_table *pages)

-:200: WARNING:LONG_LINE: line length of 119 exceeds 100 columns
#200: FILE: drivers/gpu/drm/i915/gt/gen6_ppgtt.c:398:
+   pd->pt.base = 
__i915_gem_object_create_internal(ppgtt->base.vm.gt->i915, &pd_dummy_obj_ops, 
I915_PDES * SZ_4K);

total: 0 errors, 1 warnings, 1 checks, 256 lines checked
5fa1e4d967c0 drm/i915: Create a full object for mock_ring, v2.
-:6: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#6: 
This allows us to finally get rid of all the assumptions that vma->obj is NULL.

total: 0 errors, 1 warnings, 0 checks, 73 lines checked
c56256b7900b drm/i915: vma is always backed by an object.
3324301339ce drm/i915: Change shrink ordering to use locking around unbinding.
-:28: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#28: FILE: drivers/gpu/drm/i915/gem/i915_gem_shrinker.c:40:
+static int drop_pages(struct drm_i915_gem_object *obj,
+  unsigned long shrink, bool trylock_vm)

total: 0 errors, 0 warnings, 1 checks, 56 lines checked
5b0d3b29f3ce drm/i915/pm: Move CONTEXT_VALID_BIT check
-:6: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#6: 
Resetting will clear the CONTEXT_VALID_BIT, so wait until after that to test.

total: 0 errors, 1 warnings, 0 checks, 17 lines checked
494b7f7d1d07 drm/i915: Remove resv from i915_vma
971bfcf716db drm/i915: Remove pages_mutex and 
intel_gtt->vma_ops.set/clear_pages members
-:545: CHECK:LINE_SPACING: Please don't use multiple blank lines
#545: FILE: drivers/gpu/drm/i915/i915_vma.c:791:
 
+

total: 0 errors, 0 warnings, 1 checks, 659 lines checked
9ca977d4ea2d drm/i915: Take object lock in i915_ggtt_pin if ww is not set
6af62101e38d drm/i915: Add lock for unbinding to i915_gem_object_ggtt_pin_ww
-:8: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

total: 0 errors, 1 warnings, 0 checks, 15 lines checked
1893beb58bd2 drm/i915: Rework context handling in hugepages selftests
12aa1adab3a1 drm/i915: Ensure gem_contexts selftests work with unbind changes.
b30510e7c7a5 drm/i915: Take trylock during eviction, v2.
-:92: CHECK:LINE_SPACING: Please don't use multiple blank lines
#92: FILE: drivers/gpu/drm/i915/i915_gem_evict.c:250:
 
+

total: 0 errors, 0 warnings, 1 checks, 109 lines checked
65bb798433f5 drm/i915: Pass trylock context to callers
-:7: WARNING:COMMIT_MESSAGE: Missing commit description - Add an appropriate one

-:391: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#391: FILE: drivers/gpu/drm/i915/i915_vma.c:1373:
if (mutex_lock_interruptible(&vm->mutex) == 0) {
+

total: 0 errors, 1 warnings, 1 checks, 446 lines checked
83936045aeae drm/i915: Ensure i915_vma tests do not get -ENOSPC with the 
locking changes.
9ffbca093c70 drm/i915: Drain the ttm delayed workqueue too
1752c3fdca53 drm/i915: Make i915_gem_evict_vm work correctly for already locked 
objects
d80031a5a1fa drm/i915: Call i915_gem_evict_vm in vm_fault_gtt to prevent new 
ENOSPC errors
5845081d8530 drm/i915: Add i915_vma_unbind_unlocked, and take obj lock for 
i915_vma_unbind
-:7: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#7: 
We want to remove more members of i915_vma, which requires the locking to be

total: 0 errors, 1 warnings, 0 checks, 313 lines checked

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [01/28] drm/i915: Fix i915_request fence wait 
semantics
URL   : https://patchwork.freedesktop.org/series/96115/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/gem/selftests/huge_pages.c:25:25: warning: symbol 
'hugepage_ctx' was not declared. Should it be static?




[Intel-gfx] ✗ Fi.CI.DOCS: warning for series starting with [01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [01/28] drm/i915: Fix i915_request fence wait 
semantics
URL   : https://patchwork.freedesktop.org/series/96115/
State : warning

== Summary ==

$ make htmldocs 2>&1 > /dev/null | grep i915
./drivers/gpu/drm/i915/i915_gem_evict.c:110: warning: Function parameter or 
member 'ww' not described in 'i915_gem_evict_something'
./drivers/gpu/drm/i915/i915_gem_evict.c:281: warning: Function parameter or 
member 'ww' not described in 'i915_gem_evict_for_node'
./drivers/gpu/drm/i915/i915_gem_evict.c:393: warning: Function parameter or 
member 'ww' not described in 'i915_gem_evict_vm'
./drivers/gpu/drm/i915/i915_gem_gtt.c:100: warning: Function parameter or 
member 'ww' not described in 'i915_gem_gtt_reserve'
./drivers/gpu/drm/i915/i915_gem_gtt.c:192: warning: Function parameter or 
member 'ww' not described in 'i915_gem_gtt_insert'




[Intel-gfx] [PATCH v3] drm/i915/display: program audio CDCLK-TS for keepalives

2021-10-21 Thread Kai Vehmanen
XE_LPD display adds support for a display audio codec keepalive feature.
This feature also works when the display codec is in the D3 state and the
audio link is off (BCLK off). To enable this functionality, the display
driver must update the AUD_TS_CDCLK_M/N registers whenever CDCLK is changed.
Actual timestamps are generated only when the audio codec driver
specifically enables the KeepAlive (KAE) feature.

This patch adds new hooks to intel_set_cdclk() in order to inform
display audio driver when CDCLK change is started and when it is
complete.

Bspec: 53679
Signed-off-by: Kai Vehmanen 
Reviewed-by: Uma Shankar 
Acked-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/display/intel_audio.c | 37 ++
 drivers/gpu/drm/i915/display/intel_audio.h |  2 ++
 drivers/gpu/drm/i915/display/intel_cdclk.c |  5 +++
 drivers/gpu/drm/i915/i915_reg.h|  4 +++
 4 files changed, 48 insertions(+)

Changes V2->V3:
 - added review sign-offs by Uma and Ville
 - rebase to latest upstream, no other changes
Changes V1->V2:
 - addressed review comments Jani Nikula (Sep 10)
 - added an initial call to intel_audio_cdclk_change_post() so 
   that AUD_CDCLK initial configuration is always performed


diff --git a/drivers/gpu/drm/i915/display/intel_audio.c 
b/drivers/gpu/drm/i915/display/intel_audio.c
index 03e8c05a74f6..a96523f1b052 100644
--- a/drivers/gpu/drm/i915/display/intel_audio.c
+++ b/drivers/gpu/drm/i915/display/intel_audio.c
@@ -947,6 +947,40 @@ void intel_init_audio_hooks(struct drm_i915_private 
*dev_priv)
}
 }
 
+struct aud_ts_cdclk_m_n {
+	u8 m;
+	u16 n;
+};
+
+void intel_audio_cdclk_change_pre(struct drm_i915_private *i915)
+{
+	if (DISPLAY_VER(i915) >= 13)
+		intel_de_rmw(i915, AUD_TS_CDCLK_M, AUD_TS_CDCLK_M_EN, 0);
+}
+
+static void get_aud_ts_cdclk_m_n(int refclk, int cdclk, struct aud_ts_cdclk_m_n *aud_ts)
+{
+	if (refclk == 24000)
+		aud_ts->m = 12;
+	else
+		aud_ts->m = 15;
+
+	aud_ts->n = cdclk * aud_ts->m / 24000;
+}
+
+void intel_audio_cdclk_change_post(struct drm_i915_private *i915)
+{
+	struct aud_ts_cdclk_m_n aud_ts;
+
+	if (DISPLAY_VER(i915) >= 13) {
+		get_aud_ts_cdclk_m_n(i915->cdclk.hw.ref, i915->cdclk.hw.cdclk, &aud_ts);
+
+		intel_de_write(i915, AUD_TS_CDCLK_N, aud_ts.n);
+		intel_de_write(i915, AUD_TS_CDCLK_M, aud_ts.m | AUD_TS_CDCLK_M_EN);
+		drm_dbg_kms(&i915->drm, "aud_ts_cdclk set to M=%u, N=%u\n", aud_ts.m, aud_ts.n);
+	}
+}
+
 static int glk_force_audio_cdclk_commit(struct intel_atomic_state *state,
struct intel_crtc *crtc,
bool enable)
@@ -1330,6 +1364,9 @@ static void i915_audio_component_init(struct 
drm_i915_private *dev_priv)
dev_priv->audio_freq_cntrl = aud_freq;
}
 
+   /* init with current cdclk */
+   intel_audio_cdclk_change_post(dev_priv);
+
dev_priv->audio_component_registered = true;
 }
 
diff --git a/drivers/gpu/drm/i915/display/intel_audio.h 
b/drivers/gpu/drm/i915/display/intel_audio.h
index a3657c7a7ba2..dcb259dd2da7 100644
--- a/drivers/gpu/drm/i915/display/intel_audio.h
+++ b/drivers/gpu/drm/i915/display/intel_audio.h
@@ -18,6 +18,8 @@ void intel_audio_codec_enable(struct intel_encoder *encoder,
 void intel_audio_codec_disable(struct intel_encoder *encoder,
   const struct intel_crtc_state *old_crtc_state,
   const struct drm_connector_state 
*old_conn_state);
+void intel_audio_cdclk_change_pre(struct drm_i915_private *dev_priv);
+void intel_audio_cdclk_change_post(struct drm_i915_private *dev_priv);
 void intel_audio_init(struct drm_i915_private *dev_priv);
 void intel_audio_deinit(struct drm_i915_private *dev_priv);
 
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c 
b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 9e466d829019..63d1e3b225c6 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -24,6 +24,7 @@
 #include 
 
 #include "intel_atomic.h"
+#include "intel_audio.h"
 #include "intel_bw.h"
 #include "intel_cdclk.h"
 #include "intel_de.h"
@@ -1975,6 +1976,8 @@ static void intel_set_cdclk(struct drm_i915_private 
*dev_priv,
intel_psr_pause(intel_dp);
}
 
+   intel_audio_cdclk_change_pre(dev_priv);
+
/*
 * Lock aux/gmbus while we change cdclk in case those
 * functions use cdclk. Not all platforms/ports do,
@@ -2003,6 +2006,8 @@ static void intel_set_cdclk(struct drm_i915_private 
*dev_priv,
intel_psr_resume(intel_dp);
}
 
+   intel_audio_cdclk_change_post(dev_priv);
+
if (drm_WARN(&dev_priv->drm,
 intel_cdclk_changed(&dev_priv->cdclk.hw, cdclk_config),
 "cdclk state doesn't match!\n")) {
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/g
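
For reference, the M/N programming above is plain arithmetic; a standalone
version of the helper from the patch, with a worked example (the cdclk values
are purely illustrative):

	struct aud_ts_cdclk_m_n { u8 m; u16 n; };

	static void get_aud_ts_cdclk_m_n(int refclk, int cdclk,
					 struct aud_ts_cdclk_m_n *aud_ts)
	{
		aud_ts->m = (refclk == 24000) ? 12 : 15;
		aud_ts->n = cdclk * aud_ts->m / 24000;
	}

	/* e.g. refclk = 24000 kHz, cdclk = 480000 kHz -> M = 12, N = 240;
	 * with a 38.4 MHz reference, M = 15 and N = 480000 * 15 / 24000 = 300 */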

Re: [Intel-gfx] [PATCH 02/28] drm/i915: use new iterator in i915_gem_object_wait_reservation

2021-10-21 Thread Maarten Lankhorst
On 21-10-2021 at 12:38, Christian König wrote:
> On 21.10.21 at 12:35, Maarten Lankhorst wrote:
>> From: Christian König 
>>
>> Simplifying the code a bit.
>>
>> Signed-off-by: Christian König 
>> [mlankhorst: Handle timeout = 0 correctly, use new 
>> i915_request_wait_timeout.]
>> Signed-off-by: Maarten Lankhorst 
>
> LGTM, do you want to push it or should I pick it up into drm-misc-next? 

I think it can be applied to drm-intel-gt-next, after a backmerge. It needs
patch 1 too, which fixes i915_request_wait semantics when used in dma-fence.
It exports a dma-fence compatible i915_request_wait_timeout function, used in
this patch.



Re: [Intel-gfx] [PATCH] Revert "drm/i915/bios: gracefully disable dual eDP for now"

2021-10-21 Thread Jani Nikula
On Tue, 19 Oct 2021, "Souza, Jose"  wrote:
> On Tue, 2021-10-19 at 14:43 +0300, Jani Nikula wrote:
>> This reverts commit 05734ca2a8f76c9eb3890b3c9dfc3467f03105c1.
>> 
>> It's not graceful, instead it leads to boot time warning splats in the
>> case it is supposed to handle gracefully. Apparently the BIOS/GOP
>> enabling the port we end up skipping leads to state readout
>> problems. Back to the drawing board.
>
> Reviewed-by: José Roberto de Souza 

Thanks, pushed.

BR,
Jani.


>
>> 
>> References: 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21255/bat-adlp-4/boot0.txt
>> Fixes: 05734ca2a8f7 ("drm/i915/bios: gracefully disable dual eDP for now")
>> Cc: José Roberto de Souza 
>> Cc: Uma Shankar 
>> Cc: Ville Syrjälä 
>> Cc: Swati Sharma 
>> Signed-off-by: Jani Nikula 
>> ---
>>  drivers/gpu/drm/i915/display/intel_bios.c | 47 ---
>>  1 file changed, 47 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/display/intel_bios.c 
>> b/drivers/gpu/drm/i915/display/intel_bios.c
>> index b99907c656bb..f9776ca85de3 100644
>> --- a/drivers/gpu/drm/i915/display/intel_bios.c
>> +++ b/drivers/gpu/drm/i915/display/intel_bios.c
>> @@ -1930,50 +1930,6 @@ static int _intel_bios_max_tmds_clock(const struct 
>> intel_bios_encoder_data *devd
>>  }
>>  }
>>  
>> -static enum port get_edp_port(struct drm_i915_private *i915)
>> -{
>> -const struct intel_bios_encoder_data *devdata;
>> -enum port port;
>> -
>> -for_each_port(port) {
>> -devdata = i915->vbt.ports[port];
>> -
>> -if (devdata && intel_bios_encoder_supports_edp(devdata))
>> -return port;
>> -}
>> -
>> -return PORT_NONE;
>> -}
>> -
>> -/*
>> - * FIXME: The power sequencer and backlight code currently do not support 
>> more
>> - * than one set registers, at least not on anything other than VLV/CHV. It 
>> will
>> - * clobber the registers. As a temporary workaround, gracefully prevent more
>> - * than one eDP from being registered.
>> - */
>> -static void sanitize_dual_edp(struct intel_bios_encoder_data *devdata,
>> -  enum port port)
>> -{
>> -struct drm_i915_private *i915 = devdata->i915;
>> -struct child_device_config *child = &devdata->child;
>> -enum port p;
>> -
>> -/* CHV might not clobber PPS registers. */
>> -if (IS_CHERRYVIEW(i915))
>> -return;
>> -
>> -p = get_edp_port(i915);
>> -if (p == PORT_NONE)
>> -return;
>> -
>> -drm_dbg_kms(&i915->drm, "both ports %c and %c configured as eDP, "
>> -"disabling port %c eDP\n", port_name(p), port_name(port),
>> -port_name(port));
>> -
>> -child->device_type &= ~DEVICE_TYPE_DISPLAYPORT_OUTPUT;
>> -child->device_type &= ~DEVICE_TYPE_INTERNAL_CONNECTOR;
>> -}
>> -
>>  static bool is_port_valid(struct drm_i915_private *i915, enum port port)
>>  {
>>  /*
>> @@ -2031,9 +1987,6 @@ static void parse_ddi_port(struct drm_i915_private 
>> *i915,
>>  supports_typec_usb, supports_tbt,
>>  devdata->dsc != NULL);
>>  
>> -if (is_edp)
>> -sanitize_dual_edp(devdata, port);
>> -
>>  if (is_dvi)
>>  sanitize_ddc_pin(devdata, port);
>>  
>

-- 
Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [PATCH 02/28] drm/i915: use new iterator in i915_gem_object_wait_reservation

2021-10-21 Thread Tvrtko Ursulin



On 21/10/2021 12:06, Maarten Lankhorst wrote:

On 21-10-2021 at 12:38, Christian König wrote:

On 21.10.21 at 12:35, Maarten Lankhorst wrote:

From: Christian König 

Simplifying the code a bit.

Signed-off-by: Christian König 
[mlankhorst: Handle timeout = 0 correctly, use new i915_request_wait_timeout.]
Signed-off-by: Maarten Lankhorst 


LGTM, do you want to push it or should I pick it up into drm-misc-next?


I think it can be applied to drm-intel-gt-next, after a backmerge. It needs
patch 1 too, which fixes i915_request_wait semantics when used in dma-fence.
It exports a dma-fence compatible i915_request_wait_timeout function, used in
this patch.


I don't think my open has been resolved, at least I haven't seen a reply 
from Daniel on the topic of potential for infinite waits with untrusted 
clients after this change. +Daniel


Regards,

Tvrtko


[Intel-gfx] [drm-intel:drm-intel-gt-next 5/10] drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:248:3: error: implicit declaration of function 'wbinvd_on_all_cpus'

2021-10-21 Thread kernel test robot
tree:   git://anongit.freedesktop.org/drm-intel drm-intel-gt-next
head:   ab5d964c001b9efffcbfa4d67a30186b67d79771
commit: a035154da45d19e09dc68454673ff257a660aece [5/10] drm/i915/dmabuf: add 
paranoid flush-on-acquire
config: x86_64-randconfig-a004-20211021 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 
3cea2505fd8d99a9ba0cb625aecfe28a47c4e3f8)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git remote add drm-intel git://anongit.freedesktop.org/drm-intel
git fetch --no-tags drm-intel drm-intel-gt-next
git checkout a035154da45d19e09dc68454673ff257a660aece
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:248:3: error: implicit 
>> declaration of function 'wbinvd_on_all_cpus' 
>> [-Werror,-Wimplicit-function-declaration]
   wbinvd_on_all_cpus();
   ^
   1 error generated.


vim +/wbinvd_on_all_cpus +248 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c

   232  
   233  static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object 
*obj)
   234  {
   235  struct drm_i915_private *i915 = to_i915(obj->base.dev);
   236  struct sg_table *pages;
   237  unsigned int sg_page_sizes;
   238  
   239  assert_object_held(obj);
   240  
   241  pages = dma_buf_map_attachment(obj->base.import_attach,
   242 DMA_BIDIRECTIONAL);
   243  if (IS_ERR(pages))
   244  return PTR_ERR(pages);
   245  
   246  /* XXX: consider doing a vmap flush or something */
   247  if (!HAS_LLC(i915) || i915_gem_object_can_bypass_llc(obj))
 > 248  wbinvd_on_all_cpus();
   249  
   250  sg_page_sizes = i915_sg_dma_sizes(pages->sgl);
   251  __i915_gem_object_set_pages(obj, pages, sg_page_sizes);
   252  
   253  return 0;
   254  }
   255  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [01/28] drm/i915: Fix i915_request fence wait 
semantics
URL   : https://patchwork.freedesktop.org/series/96115/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10768 -> Patchwork_21402


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_21402:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_frontbuffer_tracking@basic:
- {fi-hsw-gt1}:   [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html

  
Known issues


  Here are the changes found in Patchwork_21402 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_huc_copy@huc-copy:
- fi-glk-dsi: NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-glk-dsi/igt@gem_huc_c...@huc-copy.html

  * igt@gem_tiled_blits@basic:
- fi-pnv-d510:[PASS][4] -> [INCOMPLETE][5] ([i915#299])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-pnv-d510/igt@gem_tiled_bl...@basic.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-pnv-d510/igt@gem_tiled_bl...@basic.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bsw-nick:NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-bsw-nick/igt@kms_chamel...@dp-crc-fast.html

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-glk-dsi: NOTRUN -> [SKIP][7] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-glk-dsi/igt@kms_chamel...@hdmi-hpd-fast.html

  * igt@kms_frontbuffer_tracking@basic:
- fi-cfl-8109u:   [PASS][8] -> [FAIL][9] ([i915#2546])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-cfl-8109u/igt@kms_frontbuffer_track...@basic.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-cfl-8109u/igt@kms_frontbuffer_track...@basic.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-glk-dsi: NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#533])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-glk-dsi/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_psr@primary_page_flip:
- fi-glk-dsi: NOTRUN -> [SKIP][11] ([fdo#109271]) +30 similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-glk-dsi/igt@kms_psr@primary_page_flip.html

  * igt@prime_vgem@basic-fence-flip:
- fi-bsw-nick:NOTRUN -> [SKIP][12] ([fdo#109271]) +63 similar issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-bsw-nick/igt@prime_v...@basic-fence-flip.html

  * igt@runner@aborted:
- fi-pnv-d510:NOTRUN -> [FAIL][13] ([i915#2403] / [i915#4312])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-pnv-d510/igt@run...@aborted.html

  
 Possible fixes 

  * igt@i915_selftest@live@hangcheck:
- {fi-hsw-gt1}:   [DMESG-WARN][14] ([i915#3303]) -> [PASS][15]
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-hsw-gt1/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_frontbuffer_tracking@basic:
- fi-cml-u2:  [DMESG-WARN][16] ([i915#4269]) -> [PASS][17]
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/fi-cml-u2/igt@kms_frontbuffer_track...@basic.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2403]: https://gitlab.freedesktop.org/drm/intel/issues/2403
  [i915#2546]: https://gitlab.freedesktop.org/drm/intel/issues/2546
  [i915#299]: https://gitlab.freedesktop.org/drm/intel/issues/299
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#4269]: https://gitlab.freedesktop.org/drm/intel/issues/4269
  [i915#4312]: https://gitlab.freedesktop.org/drm/intel/issues/4312
  [i915#533]: 

Re: [Intel-gfx] [PATCH 12/28] drm/amdgpu: use new iterator in amdgpu_ttm_bo_eviction_valuable

2021-10-21 Thread Daniel Vetter
On Tue, Oct 19, 2021 at 12:30:40PM -0400, Felix Kuehling wrote:
> On 2021-10-19 at 7:36 a.m., Christian König wrote:
> > On 13.10.21 at 16:07, Daniel Vetter wrote:
> >> On Tue, Oct 05, 2021 at 01:37:26PM +0200, Christian König wrote:
> >>> Simplifying the code a bit.
> >>>
> >>> Signed-off-by: Christian König 
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 14 --
> >>>   1 file changed, 4 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> index e8d70b6e6737..722e3c9e8882 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>> @@ -1345,10 +1345,9 @@ static bool
> >>> amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>   const struct ttm_place *place)
> >>>   {
> >>>   unsigned long num_pages = bo->resource->num_pages;
> >>> +    struct dma_resv_iter resv_cursor;
> >>>   struct amdgpu_res_cursor cursor;
> >>> -    struct dma_resv_list *flist;
> >>>   struct dma_fence *f;
> >>> -    int i;
> >>>     /* Swapout? */
> >>>   if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>> @@ -1362,14 +1361,9 @@ static bool
> >>> amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>    * If true, then return false as any KFD process needs all its
> >>> BOs to
> >>>    * be resident to run successfully
> >>>    */
> >>> -    flist = dma_resv_shared_list(bo->base.resv);
> >>> -    if (flist) {
> >>> -    for (i = 0; i < flist->shared_count; ++i) {
> >>> -    f = rcu_dereference_protected(flist->shared[i],
> >>> -    dma_resv_held(bo->base.resv));
> >>> -    if (amdkfd_fence_check_mm(f, current->mm))
> >>> -    return false;
> >>> -    }
> >>> +    dma_resv_for_each_fence(&resv_cursor, bo->base.resv, true, f) {
> >>     ^false?
> >>
> >> At least I'm not seeing the code look at the exclusive fence here.
> >
> > Yes, but that's correct. We need to look at all potential fences.
> 
> amdkfd_fence_check_mm is only meaningful for KFD eviction fences, and
> they are always added as shared fences. I think setting all_fences =
> false would return only the exclusive fence.

Hm yeah I got that wrong, which puts my entire review a bit in question
:-)

Anyway on the patch: Reviewed-by: Daniel Vetter 

> 
> Regards,
>   Felix
> 
> 
> >
> > It's a design problem in KFD if you ask me, but that is a completely
> > different topic.
> >
> > Christian.
> >
> >> -Daniel
> >>
> >>> +    if (amdkfd_fence_check_mm(f, current->mm))
> >>> +    return false;
> >>>   }
> >>>     switch (bo->resource->mem_type) {
> >>> -- 
> >>> 2.25.1
> >>>
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 14/28] drm/msm: use new iterator in msm_gem_describe

2021-10-21 Thread Daniel Vetter
On Tue, Oct 19, 2021 at 01:49:08PM +0200, Christian König wrote:
> On 13.10.21 at 16:14, Daniel Vetter wrote:
> > On Tue, Oct 05, 2021 at 01:37:28PM +0200, Christian König wrote:
> > > Simplifying the code a bit. Also drop the RCU read side lock since the
> > > object is locked anyway.
> > > 
> > > Untested since I can't get the driver to compile on !ARM.
> > Cross-compiler install is pretty easy and you should have that for pushing
> > drm changes to drm-misc :-)
> 
> I do have cross compile setups for some architectures, but I seriously can't
> do that for every single driver.
> 
> With only a bit of work we allowed MSM to be compile tested on other
> architectures as well now. That even yielded a couple of missing includes
> and dependencies in MSM which just don't matter on ARM.

The only ones you need is arm32 and arm64.
-Daniel

> 
> > 
> > > Signed-off-by: Christian König 
> > Assuming this compiles, it looks correct.
> 
> Yes it does.
> 
> > 
> > Reviewed-by: Daniel Vetter 
> 
> 
> Thanks,
> Christian.
> 
> > 
> > > ---
> > >   drivers/gpu/drm/msm/msm_gem.c | 19 +--
> > >   1 file changed, 5 insertions(+), 14 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
> > > index 40a9863f5951..5bd511f07c07 100644
> > > --- a/drivers/gpu/drm/msm/msm_gem.c
> > > +++ b/drivers/gpu/drm/msm/msm_gem.c
> > > @@ -880,7 +880,7 @@ void msm_gem_describe(struct drm_gem_object *obj, 
> > > struct seq_file *m,
> > >   {
> > >   struct msm_gem_object *msm_obj = to_msm_bo(obj);
> > >   struct dma_resv *robj = obj->resv;
> > > - struct dma_resv_list *fobj;
> > > + struct dma_resv_iter cursor;
> > >   struct dma_fence *fence;
> > >   struct msm_gem_vma *vma;
> > >   uint64_t off = drm_vma_node_start(&obj->vma_node);
> > > @@ -955,22 +955,13 @@ void msm_gem_describe(struct drm_gem_object *obj, 
> > > struct seq_file *m,
> > >   seq_puts(m, "\n");
> > >   }
> > > - rcu_read_lock();
> > > - fobj = dma_resv_shared_list(robj);
> > > - if (fobj) {
> > > - unsigned int i, shared_count = fobj->shared_count;
> > > -
> > > - for (i = 0; i < shared_count; i++) {
> > > - fence = rcu_dereference(fobj->shared[i]);
> > > + dma_resv_for_each_fence(&cursor, robj, true, fence) {
> > > + if (dma_resv_iter_is_exclusive(&cursor))
> > > + describe_fence(fence, "Exclusive", m);
> > > + else
> > >   describe_fence(fence, "Shared", m);
> > > - }
> > >   }
> > > - fence = dma_resv_excl_fence(robj);
> > > - if (fence)
> > > - describe_fence(fence, "Exclusive", m);
> > > - rcu_read_unlock();
> > > -
> > >   msm_gem_unlock(obj);
> > >   }
> > > -- 
> > > 2.25.1
> > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 24/28] drm: use new iterator in drm_gem_plane_helper_prepare_fb v2

2021-10-21 Thread Daniel Vetter
On Tue, Oct 19, 2021 at 05:51:38PM +0200, Christian König wrote:
> On 19.10.21 at 16:30, Daniel Vetter wrote:
> > On Tue, Oct 19, 2021 at 03:02:26PM +0200, Christian König wrote:
> > > On 13.10.21 at 16:23, Daniel Vetter wrote:
> > > > On Tue, Oct 05, 2021 at 01:37:38PM +0200, Christian König wrote:
> > > > > Makes the handling a bit more complex, but avoids the use of
> > > > > dma_resv_get_excl_unlocked().
> > > > > 
> > > > > v2: improve coding and documentation
> > > > > 
> > > > > Signed-off-by: Christian König 
> > > > > ---
> > > > >drivers/gpu/drm/drm_gem_atomic_helper.c | 13 +++--
> > > > >1 file changed, 11 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
> > > > > b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > > > > index e570398abd78..8534f78d4d6d 100644
> > > > > --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> > > > > +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > > > > @@ -143,6 +143,7 @@
> > > > > */
> > > > >int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, 
> > > > > struct drm_plane_state *state)
> > > > >{
> > > > > + struct dma_resv_iter cursor;
> > > > >   struct drm_gem_object *obj;
> > > > >   struct dma_fence *fence;
> > > > > @@ -150,9 +151,17 @@ int drm_gem_plane_helper_prepare_fb(struct 
> > > > > drm_plane *plane, struct drm_plane_st
> > > > >   return 0;
> > > > >   obj = drm_gem_fb_get_obj(state->fb, 0);
> > > > > - fence = dma_resv_get_excl_unlocked(obj->resv);
> > > > > - drm_atomic_set_fence_for_plane(state, fence);
> > > > > + dma_resv_iter_begin(&cursor, obj->resv, false);
> > > > > + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> > > > > + /* TODO: We only use the first write fence here and 
> > > > > need to fix
> > > > > +  * the drm_atomic_set_fence_for_plane() API to accept 
> > > > > more than
> > > > > +  * one. */
> > > > I'm confused, right now there is only one write fence. So no need to
> > > > iterate, and also no need to add a TODO. If/when we add more write 
> > > > fences
> > > > then I think this needs to be revisited, and ofc then we do need to 
> > > > update
> > > > the set_fence helpers to carry an entire array of fences.
> > > Well could be that I misunderstood you, but in your last explanation it
> > > sounded like the drm_atomic_set_fence_for_plane() function needs fixing
> > > anyway because a plane could have multiple BOs.
> > > 
> > > So in my understanding what we need is a
> > > drm_atomic_add_dependency_for_plane() function which records that a 
> > > certain
> > > fence needs to be signaled before a flip.
> > Yeah that's another issue, but in practice there's no libva which decodes
> > into planar yuv with different fences between the planes. So not a bug in
> > practice.
> > 
> > But this is entirely orthogonal to you picking up the wrong fence here if
> > there's no exclusive fence set:
> > 
> > - old code: Either pick the exclusive fence, or no fence if the exclusive
> >one is not set.
> > 
> > - new code: Pick the exclusive fence or the first shared fence
> 
> Hui what?
> 
> We use "dma_resv_iter_begin(&cursor, obj->resv, *false*);" here which means
> that only the exclusive fence is returned and no shared fences whatsoever.
> 
> My next step is to replace the boolean with a bunch of use case describing
> enums. I hope that will make it much clearer what's going on here.

Yeah I got that entirely wrong, which is kinda bad since that's about the
only thing worth checking in these conversions :-/

I'll go recheck them again and slap some more r-b on stuff.
-Daniel

> 
> Christian.
> 
> > New behaviour is busted, because scanning out and reading from a buffer at
> > the same time (for the next frame, e.g. to copy over damaged areas or some
> > other tricks) is very much a supported thing. Atomic _only_ wants to look
> > at the exclusive fence slot, which means "there is an implicitly synced
> > write to this buffer". Implicitly synced reads _must_ be ignored.
> 
> 
> > 
> > Now amdgpu doesn't have this distinction in its uapi, but many drivers do.
> > -Daniel
> > 
> > > Support for more than one write fence then comes totally naturally.
> > > 
> > > Christian.
> > > 
> > > > -Daniel
> > > > 
> > > > > + dma_fence_get(fence);
> > > > > + break;
> > > > > + }
> > > > > + dma_resv_iter_end(&cursor);
> > > > > + drm_atomic_set_fence_for_plane(state, fence);
> > > > >   return 0;
> > > > >}
> > > > >EXPORT_SYMBOL_GPL(drm_gem_plane_helper_prepare_fb);
> > > > > -- 
> > > > > 2.25.1
> > > > > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 24/28] drm: use new iterator in drm_gem_plane_helper_prepare_fb v2

2021-10-21 Thread Daniel Vetter
On Tue, Oct 05, 2021 at 01:37:38PM +0200, Christian König wrote:
> Makes the handling a bit more complex, but avoids the use of
> dma_resv_get_excl_unlocked().
> 
> v2: improve coding and documentation
> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/drm_gem_atomic_helper.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c 
> b/drivers/gpu/drm/drm_gem_atomic_helper.c
> index e570398abd78..8534f78d4d6d 100644
> --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> @@ -143,6 +143,7 @@
>   */
>  int drm_gem_plane_helper_prepare_fb(struct drm_plane *plane, struct 
> drm_plane_state *state)
>  {
> + struct dma_resv_iter cursor;
>   struct drm_gem_object *obj;
>   struct dma_fence *fence;
>  
> @@ -150,9 +151,17 @@ int drm_gem_plane_helper_prepare_fb(struct drm_plane 
> *plane, struct drm_plane_st
>   return 0;
>  
>   obj = drm_gem_fb_get_obj(state->fb, 0);
> - fence = dma_resv_get_excl_unlocked(obj->resv);
> - drm_atomic_set_fence_for_plane(state, fence);
> + dma_resv_iter_begin(&cursor, obj->resv, false);
> + dma_resv_for_each_fence_unlocked(&cursor, fence) {
> + /* TODO: We only use the first write fence here and need to fix

Maybe reword the TODO to say that currently there's only one write fence,
and if that changes we have work to do. Or something like that. The current
comment sounds like multiple write fences are already possible, which is
not the case.

With that:

Reviewed-by: Daniel Vetter 

> +  * the drm_atomic_set_fence_for_plane() API to accept more than
> +  * one. */
> + dma_fence_get(fence);
> + break;
> + }
> + dma_resv_iter_end(&cursor);
>  
> + drm_atomic_set_fence_for_plane(state, fence);
>   return 0;
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_plane_helper_prepare_fb);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] [PATCH 1/4] drm/i915/clflush: fixup handling of cache_dirty

2021-10-21 Thread Matthew Auld
In theory, if clflush_work_create() somehow fails here and we don't yet
have mm.pages populated, then we end up resetting cache_dirty, which is
likely wrong, since that will potentially skip the flush-on-acquire if it
was needed.

It looks like intel_user_framebuffer_dirty() can arrive here before the
pages are populated.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index f0435c6feb68..d09365b5eb29 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -20,6 +20,7 @@ static void __do_clflush(struct drm_i915_gem_object *obj)
 {
GEM_BUG_ON(!i915_gem_object_has_pages(obj));
drm_clflush_sg(obj->mm.pages);
+   obj->cache_dirty = false;
 
i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
 }
@@ -115,6 +116,5 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
GEM_BUG_ON(obj->write_domain != I915_GEM_DOMAIN_CPU);
}
 
-   obj->cache_dirty = false;
return true;
 }
-- 
2.26.3



[Intel-gfx] [PATCH 4/4] drm/i915: stop setting cache_dirty on discrete

2021-10-21 Thread Matthew Auld
Should not be needed. Even with non-coherent display, we should be using
device local-memory there, and not system memory.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 10 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c |  7 +--
 drivers/gpu/drm/i915/gem/i915_gem_pages.c  |  1 +
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d30d5a699788..26532c07d467 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -18,18 +18,28 @@
 
 static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
+   if (IS_DGFX(i915))
+   return false;
+
return !(obj->cache_level == I915_CACHE_NONE ||
 obj->cache_level == I915_CACHE_WT);
 }
 
 bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
if (obj->cache_dirty)
return false;
 
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
return true;
 
+   if (IS_DGFX(i915))
+   return false;
+
/* Currently in use by HW (display engine)? Keep flushed. */
return i915_gem_object_is_framebuffer(obj);
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c 
b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 1e426a42a36c..170c74a2e46d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -114,18 +114,21 @@ void __i915_gem_object_fini(struct drm_i915_gem_object 
*obj)
 void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 unsigned int cache_level)
 {
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
+
obj->cache_level = cache_level;
 
if (cache_level != I915_CACHE_NONE)
obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
   I915_BO_CACHE_COHERENT_FOR_WRITE);
-   else if (HAS_LLC(to_i915(obj->base.dev)))
+   else if (HAS_LLC(i915))
obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
else
obj->cache_coherent = 0;
 
obj->cache_dirty =
-   !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE);
+   !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE) &&
+   !IS_DGFX(i915);
 }
 
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 8eb1c3a6fc9c..76530ca265de 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -26,6 +26,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object 
*obj,
 
/* Make the pages coherent with the GPU (flushing any swapin). */
if (obj->cache_dirty) {
+   WARN_ON_ONCE(IS_DGFX(i915));
obj->write_domain = 0;
if (i915_gem_object_has_struct_page(obj))
drm_clflush_sg(pages);
-- 
2.26.3



[Intel-gfx] [PATCH 2/4] drm/i915/clflush: disallow on discrete

2021-10-21 Thread Matthew Auld
We seem to have an unfortunate issue where we arrive from:

i915_gem_object_flush_if_display+0x86/0xd0 [i915]
intel_user_framebuffer_dirty+0x1a/0x50 [i915]
drm_mode_dirtyfb_ioctl+0xfb/0x1b0

Which can be before the pages are populated (and pinned for display), and
so i915_gem_object_has_struct_page() might still return true, as per the
ttm backend. We could re-order the later get_pages() call here, but on
discrete everything should already be coherent, with the exception of the
display engine, and even there display surfaces must be allocated in
device local-memory anyway, so there should in theory be no conceivable
reason to ever call i915_gem_clflush_object() on discrete.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/4320
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index d09365b5eb29..b0822fd99709 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -70,6 +70,8 @@ static struct clflush *clflush_work_create(struct 
drm_i915_gem_object *obj)
 bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 unsigned int flags)
 {
+
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
struct clflush *clflush;
 
assert_object_held(obj);
@@ -81,7 +83,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 * anything not backed by physical memory we consider to be always
 * coherent and not need clflushing.
 */
-   if (!i915_gem_object_has_struct_page(obj)) {
+   if (!i915_gem_object_has_struct_page(obj) || IS_DGFX(i915)) {
obj->cache_dirty = false;
return false;
}
@@ -106,7 +108,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
obj->base.resv, NULL, true,
-   
i915_fence_timeout(to_i915(obj->base.dev)),
+   i915_fence_timeout(i915),
I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
dma_fence_work_commit(&clflush->base);
-- 
2.26.3



[Intel-gfx] [PATCH 3/4] drm/i915: move cpu_write_needs_clflush

2021-10-21 Thread Matthew Auld
Move it next to its partner in crime, gpu_write_needs_clflush.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 12 
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 15 ++-
 drivers/gpu/drm/i915/i915_gem.c|  2 +-
 3 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c 
b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index b684a62bf3b0..d30d5a699788 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -22,6 +22,18 @@ static bool gpu_write_needs_clflush(struct 
drm_i915_gem_object *obj)
 obj->cache_level == I915_CACHE_WT);
 }
 
+bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
+{
+   if (obj->cache_dirty)
+   return false;
+
+   if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+   return true;
+
+   /* Currently in use by HW (display engine)? Keep flushed. */
+   return i915_gem_object_is_framebuffer(obj);
+}
+
 static void
 flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 59201801cec5..199f2ef928c3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -517,6 +517,7 @@ void i915_gem_object_set_cache_coherency(struct 
drm_i915_gem_object *obj,
 bool i915_gem_object_can_bypass_llc(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj);
+bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj);
 
 int __must_check
 i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
@@ -535,23 +536,11 @@ void i915_gem_object_make_unshrinkable(struct 
drm_i915_gem_object *obj);
 void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj);
 void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj);
 
-static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
-{
-   if (obj->cache_dirty)
-   return false;
-
-   if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
-   return true;
-
-   /* Currently in use by HW (display engine)? Keep flushed. */
-   return i915_gem_object_is_framebuffer(obj);
-}
-
 static inline void __start_cpu_write(struct drm_i915_gem_object *obj)
 {
obj->read_domains = I915_GEM_DOMAIN_CPU;
obj->write_domain = I915_GEM_DOMAIN_CPU;
-   if (cpu_write_needs_clflush(obj))
+   if (i915_gem_cpu_write_needs_clflush(obj))
obj->cache_dirty = true;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 981e383d1a5d..d0e642c82064 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -764,7 +764,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 * perspective, requiring manual detiling by the client.
 */
if (!i915_gem_object_has_struct_page(obj) ||
-   cpu_write_needs_clflush(obj))
+   i915_gem_cpu_write_needs_clflush(obj))
/* Note that the gtt paths might fail with non-page-backed user
 * pointers (e.g. gtt mappings when moving data between
 * textures). Fallback to the shmem path in that case.
-- 
2.26.3



Re: [Intel-gfx] [PATCH 3/3] drm/i915/guc/slpc: Update boost sysfs hooks for SLPC

2021-10-21 Thread Nilawar, Badal

Please fix the code-style warnings and errors from the checkpatch results.

On 21-10-2021 01:22, Vinay Belgaumkar wrote:

Add a helper to sort through the SLPC/RPS cases of get/set methods.
Boost frequency will be modified as long as it is within the constraints
of RP0 and if it is different from the existing one. We will set min
freq to boost only if there is an active waiter.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/intel_rps.c | 44 +
  drivers/gpu/drm/i915/gt/intel_rps.h |  2 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 18 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h |  1 +
  drivers/gpu/drm/i915/i915_sysfs.c   | 21 ++
  5 files changed, 69 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index 023e9c0b9f4a..19c57aac9553 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -935,6 +935,50 @@ void intel_rps_park(struct intel_rps *rps)
GT_TRACE(rps_to_gt(rps), "park:%x\n", rps->cur_freq);
  }
  
+u32 intel_rps_get_boost_frequency(struct intel_rps *rps)

+{
+   struct intel_guc_slpc *slpc = rps_to_slpc(rps);
+
+   if (rps_uses_slpc(rps))
+   return slpc->boost_freq;
+   else
+   return intel_gpu_freq(rps, rps->boost_freq);
+}
+
+static int set_boost_freq(struct intel_rps *rps, u32 val)
+{
+   bool boost = false;
+
+   /* Validate against (static) hardware limits */
+   val = intel_freq_opcode(rps, val);
+   if (val < rps->min_freq || val > rps->max_freq)
+   return -EINVAL;
+
+   mutex_lock(&rps->lock);
+   if (val != rps->boost_freq) {
+   rps->boost_freq = val;
+   boost = atomic_read(&rps->num_waiters);
+   }
+   mutex_unlock(&rps->lock);
+   if (boost)
+   schedule_work(&rps->work);
+
+   return 0;
+}
+
+int intel_rps_set_boost_frequency(struct intel_rps *rps, u32 freq)
+{
+   struct intel_guc_slpc *slpc;
+
+   if (rps_uses_slpc(rps)) {
+   slpc = rps_to_slpc(rps);
+
+   return intel_guc_slpc_set_boost_freq(slpc, freq);
+   } else {
+   return set_boost_freq(rps, freq);
+   }
+}
+
  void intel_rps_update_waiters(struct intel_rps *rps)
  {
struct intel_guc_slpc *slpc = rps_to_slpc(rps);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
b/drivers/gpu/drm/i915/gt/intel_rps.h
index 4ca9924cb5ed..ce81094cf58e 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -24,6 +24,8 @@ void intel_rps_park(struct intel_rps *rps);
  void intel_rps_unpark(struct intel_rps *rps);
  void intel_rps_boost(struct i915_request *rq);
  void intel_rps_update_waiters(struct intel_rps *rps);
+u32 intel_rps_get_boost_frequency(struct intel_rps *rps);
+int intel_rps_set_boost_frequency(struct intel_rps *rps, u32 freq);
  
  int intel_rps_set(struct intel_rps *rps, u8 val);

  void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive);
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
index a104371a8b79..7881bc1a5af8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
@@ -613,6 +613,24 @@ void intel_guc_slpc_boost(struct intel_guc_slpc *slpc)
slpc->num_waiters++;
  }
  
+int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val)

+{
+   if (val < slpc->min_freq || val > slpc->rp0_freq)
+   return -EINVAL;
+
+   if (val != slpc->boost_freq) {
+   slpc->boost_freq = val;
+
+   /* Apply only if there are active waiters */
+   if (slpc->num_waiters)
+   return slpc_set_param(slpc,
+ 
SLPC_PARAM_GLOBAL_MIN_GT_UNSLICE_FREQ_MHZ,
+ slpc->boost_freq);


As per comments from some other ML thread, a wakeref may be needed here.

CC: jon.ew...@intel.com, ashutosh.di...@intel.com


+   }
+
+   return 0;
+}
+
  void intel_guc_slpc_update_waiters(struct intel_guc_slpc *slpc)
  {
/* Return min back to the softlimit.
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
index 25093dfdea0b..d8191f2b965b 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.h
@@ -34,6 +34,7 @@ int intel_guc_slpc_enable(struct intel_guc_slpc *slpc);
  void intel_guc_slpc_fini(struct intel_guc_slpc *slpc);
  int intel_guc_slpc_set_max_freq(struct intel_guc_slpc *slpc, u32 val);
  int intel_guc_slpc_set_min_freq(struct intel_guc_slpc *slpc, u32 val);
+int intel_guc_slpc_set_boost_freq(struct intel_guc_slpc *slpc, u32 val);
  int intel_guc_slpc_get_max_freq(struct intel_guc_slpc *slpc, u32 *val);
  int intel_guc_slpc_get_min_freq(struct in

Re: [Intel-gfx] [PATCH] drm/i915/gem: stop using PAGE_KERNEL_IO

2021-10-21 Thread Daniel Vetter
On Wed, Oct 20, 2021 at 02:06:25AM -0700, Lucas De Marchi wrote:
> PAGE_KERNEL_IO is only defined for x86 and is the same as PAGE_KERNEL.
> Use the latter since that is also available on other archs, which should
> help us get i915 there.
> 
> This is the same as what was done in commit 80c33624e472 ("io-mapping:
> Fixup for different names of writecombine"). Later the commit
> 80c33624e472 ("io-mapping: Fixup for different names of writecombine")
> added a "Fixes" tag to the first one, but that is actually fixing a
> separate issue:  the different names for pgprot_writecombine().
> 
> Fast-forward to today, it seems the only 2 archs that define
> pgprot_noncached_wc() are microblaze and powerpc. Microblaze has the
> same definition for pgprot_writecombine() since commit
> 97ccedd793ac ("microblaze: Provide pgprot_device/writecombine macros for
> nommu"). Powerpc has 3 variants and all of them have the same behavior
> for pgprot_writecombine() and pgprot_noncached_wc(). From the commit message
> and linked issue, the fallback was needed for arm, but apparently today
> all the variants there also have pgprot_writecombine().
> 
> So, just use PAGE_KERNEL, and just use pgprot_writecombine().
> 
> Signed-off-by: Lucas De Marchi 

I think a bit more history on PAGE_KERNEL_IO is useful to add. It was
added in be43d72835ba ("x86: add _PAGE_IOMAP pte flag for IO mappings").
The one and only user was lost in f955371ca9d3 ("x86: remove the
Xen-specific _PAGE_IOMAP PTE flag"), therefore it's safe to do this.

With that added Reviewed-by: Daniel Vetter 

Also if you're motivated, maybe delete PAGE_KERNEL_IO across the tree and
get x86 maintainers to merge the entire series?
-Daniel


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_pages.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> index 8eb1c3a6fc9c..68fe1837ef54 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
> @@ -289,7 +289,7 @@ static void *i915_gem_object_map_page(struct 
> drm_i915_gem_object *obj,
>   pgprot = PAGE_KERNEL;
>   break;
>   case I915_MAP_WC:
> - pgprot = pgprot_writecombine(PAGE_KERNEL_IO);
> + pgprot = pgprot_writecombine(PAGE_KERNEL);
>   break;
>   }
>  
> @@ -333,7 +333,7 @@ static void *i915_gem_object_map_pfn(struct 
> drm_i915_gem_object *obj,
>   i = 0;
>   for_each_sgt_daddr(addr, iter, obj->mm.pages)
>   pfns[i++] = (iomap + addr) >> PAGE_SHIFT;
> - vaddr = vmap_pfn(pfns, n_pfn, pgprot_writecombine(PAGE_KERNEL_IO));
> + vaddr = vmap_pfn(pfns, n_pfn, pgprot_writecombine(PAGE_KERNEL));
>   if (pfns != stack)
>   kvfree(pfns);
>  
> -- 
> 2.33.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 2/2] dma-buf: Fix dma_resv_test_signaled.

2021-10-21 Thread Daniel Vetter
On Fri, Oct 15, 2021 at 02:56:59PM +0200, Christian König wrote:
> 
> 
> Am 15.10.21 um 14:52 schrieb Maarten Lankhorst:
> > Op 15-10-2021 om 14:07 schreef Christian König:
> > > Am 15.10.21 um 13:57 schrieb Maarten Lankhorst:
> > > > Commit 7fa828cb9265 ("dma-buf: use new iterator in 
> > > > dma_resv_test_signaled")
> > > > accidentally forgot to test whether the dma-buf is actually signaled, 
> > > > breaking
> > > > pretty much everything depending on it.
> > > NAK, the dma_resv_for_each_fence_unlocked() returns only unsignaled 
> > > fences. So the code is correct as it is.
> > That seems like it might cause some unexpected behavior when that function 
> > is called with one of the fence locks held, if it calls dma_fence_signal().
> > 
> > Could it be changed to only test the signaled bit, in which case this patch 
> > would still be useful?
> 
> That's exactly what I suggested as well, but Daniel was against that because
> of concerns around barriers.

I don't want open-coded bitmask tests, because the current code we have in
dma-fence.c is missing barriers, and that doesn't get better if we spread
that all around. But if you want this then wrap it in some static inline
in dma-fence.h or so, that's fine. Just not open-coded outside of these
files, like i915-gem code does a lot (which imo is just plain a disaster).
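For example, the wrapper could be as small as this (purely a sketch; the
helper name here is made up, and the barrier question above would still
need to be settled once, in this one place):

static inline bool dma_fence_test_signaled_bit(struct dma_fence *fence)
{
	/*
	 * One central place for the bit test, so a missing-barrier fix
	 * only has to happen here and not in every open-coded caller.
	 */
	return test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags);
}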

> > Or at least add some lockdep annotations, that fence->lock might be taken. 
> > So any hangs would at least be easy to spot with lockdep.
> 
> That should be trivial doable.

might_lock is trivial to add, but the full picture is more complicated. The spinlock is
provided by the fence code, which means there's lots of different lockdep
classes. A might_lock on fence->lock is better than nothing, but maybe not
good enough.

What we might need are a few more pieces:
- a fake dma-fence spinlock lockdep key, maybe call it dma_fence_lock_key
  or so.
- in dma_fence_init we lock dma_fence_lock_key, and then might_lock the
  actual spinlock passed as an argument. This establishes dependencies
  from that fake lock to all real fence spinlocks
- anywhere we need a might_lock we take dma_fence_lock_key instead (rough
  sketch below)

The potential issue here is that this might result in lockdep splats in
cases where fences somehow naturally nest (maybe drm/sched job fence vs hw
fence). So perhaps too much.
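To make those three pieces a bit more concrete, a rough sketch could look
like this (names are invented for illustration, this is not existing
dma-fence code, and it assumes CONFIG_LOCKDEP):

static struct lockdep_map dma_fence_lock_map = {
	.name = "dma_fence_lock_key",
};

/* called from dma_fence_init(): tie the fake class to every real fence lock */
static inline void dma_fence_lockdep_init(spinlock_t *lock)
{
	lock_map_acquire(&dma_fence_lock_map);
	might_lock(lock);
	lock_map_release(&dma_fence_lock_map);
}

/* annotation for paths that may end up taking some fence->lock */
static inline void dma_fence_might_lock(void)
{
	lock_map_acquire(&dma_fence_lock_map);
	lock_map_release(&dma_fence_lock_map);
}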
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] ✗ Fi.CI.IGT: failure for Selective fetch support for biplanar formats

2021-10-21 Thread Patchwork
== Series Details ==

Series: Selective fetch support for biplanar formats
URL   : https://patchwork.freedesktop.org/series/96113/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10768_full -> Patchwork_21401_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21401_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21401_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21401_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_cursor_crc@pipe-c-cursor-64x21-offscreen:
- shard-tglb: NOTRUN -> [INCOMPLETE][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb8/igt@kms_cursor_...@pipe-c-cursor-64x21-offscreen.html

  
Known issues


  Here are the changes found in Patchwork_21401_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-tglb: NOTRUN -> [SKIP][2] ([i915#1839])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb5/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][3] ([i915#3002])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-snb7/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@engines-hostile:
- shard-snb:  NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#1099]) +2 
similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-snb5/igt@gem_ctx_persiste...@engines-hostile.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: NOTRUN -> [FAIL][5] ([i915#2842])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb5/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-pace-share@rcs0:
- shard-tglb: [PASS][6] -> [FAIL][7] ([i915#2842])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-tglb1/igt@gem_exec_fair@basic-pace-sh...@rcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb8/igt@gem_exec_fair@basic-pace-sh...@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2842]) +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-glk2/igt@gem_exec_fair@basic-pace-s...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-glk1/igt@gem_exec_fair@basic-pace-s...@rcs0.html

  * igt@gem_exec_params@no-vebox:
- shard-tglb: NOTRUN -> [SKIP][10] ([fdo#109283])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb2/igt@gem_exec_par...@no-vebox.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][11] ([i915#2658])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-apl1/igt@gem_pr...@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-kbl:  NOTRUN -> [WARN][12] ([i915#2658])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-kbl6/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_pxp@dmabuf-shared-protected-dst-is-context-refcounted:
- shard-tglb: NOTRUN -> [SKIP][13] ([i915#4270])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb2/igt@gem_...@dmabuf-shared-protected-dst-is-context-refcounted.html

  * igt@gem_userptr_blits@readonly-unsync:
- shard-tglb: NOTRUN -> [SKIP][14] ([i915#3297])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb2/igt@gem_userptr_bl...@readonly-unsync.html

  * igt@gem_userptr_blits@vma-merge:
- shard-skl:  NOTRUN -> [FAIL][15] ([i915#3318])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-skl6/igt@gem_userptr_bl...@vma-merge.html

  * igt@gen7_exec_parse@basic-rejected:
- shard-tglb: NOTRUN -> [SKIP][16] ([fdo#109289]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb2/igt@gen7_exec_pa...@basic-rejected.html

  * igt@gen9_exec_parse@basic-rejected:
- shard-iclb: NOTRUN -> [SKIP][17] ([i915#2856])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-iclb3/igt@gen9_exec_pa...@basic-rejected.html

  * igt@gen9_exec_parse@cmd-crossing-page:
- shard-tglb: NOTRUN -> [SKIP][18] ([i915#2856]) +1 similar issue
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21401/shard-tglb2/igt@gen9_exec_pa...@cmd-crossing-page.html

  * igt@i915_pm_dc@dc6-psr:
- shard-iclb: [PASS][19] -> [FAIL][20] ([i915#454])
   [

Re: [Intel-gfx] [PATCH 3/4] drm/i915: Use vblank workers for gamma updates

2021-10-21 Thread Jani Nikula
On Thu, 21 Oct 2021, Ville Syrjälä  wrote:
> On Thu, Oct 21, 2021 at 01:35:12PM +0300, Jani Nikula wrote:
>> On Thu, 21 Oct 2021, Ville Syrjala  wrote:
>> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
>> > b/drivers/gpu/drm/i915/display/intel_display.c
>> > index 79a7552af7b5..1375d963c0a8 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_display.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
>> > @@ -8818,6 +8818,8 @@ static void intel_atomic_commit_tail(struct 
>> > intel_atomic_state *state)
>> >intel_set_cdclk_post_plane_update(state);
>> >}
>> >  
>> > +  intel_wait_for_vblank_works(state);
>> 
>> Nitpick, I think the function name can be confusing due to the plural
>> vs. verb here. intel_wait_for_vblank_work_end(), _finish(), _done()?
>
> I guess _end() would match what I called the tracepoint. Another
> idea could be s/works/workers/

Either works for me.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] [PATCH 1/2] drm/i915/dmabuf: fix broken build

2021-10-21 Thread Matthew Auld
wbinvd_on_all_cpus() is only defined on x86 it seems, plus we need to
include asm/smp.h here.

Reported-by: kernel test robot 
Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 1adcd8e02d29..a45d0ec2c5b6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -12,6 +12,13 @@
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
+#if defined(CONFIG_X86)
+#include <asm/smp.h>
+#else
+#define wbinvd_on_all_cpus() \
+   pr_warn(DRIVER_NAME ": Missing cache flush in %s\n", __func__)
+#endif
+
 I915_SELFTEST_DECLARE(static bool force_different_devices;)
 
 static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
-- 
2.26.3



[Intel-gfx] [PATCH 2/2] drm/i915/dmabuf: drop the flush on discrete

2021-10-21 Thread Matthew Auld
We were overzealous here; even though discrete is non-LLC, it should
still always be coherent.

Signed-off-by: Matthew Auld 
Cc: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index a45d0ec2c5b6..848e81368043 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -251,7 +251,8 @@ static int i915_gem_object_get_pages_dmabuf(struct 
drm_i915_gem_object *obj)
return PTR_ERR(pages);
 
/* XXX: consider doing a vmap flush or something */
-   if (!HAS_LLC(i915) || i915_gem_object_can_bypass_llc(obj))
+   if ((!HAS_LLC(i915) && !IS_DGFX(i915)) ||
+   i915_gem_object_can_bypass_llc(obj))
wbinvd_on_all_cpus();
 
sg_page_sizes = i915_sg_dma_sizes(pages->sgl);
-- 
2.26.3



[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [01/28] drm/i915: Fix i915_request fence wait semantics

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [01/28] drm/i915: Fix i915_request fence wait 
semantics
URL   : https://patchwork.freedesktop.org/series/96115/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10768_full -> Patchwork_21402_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_21402_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_21402_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_21402_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_linear_blits@normal:
- shard-glk:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-glk4/igt@gem_linear_bl...@normal.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-glk4/igt@gem_linear_bl...@normal.html

  * igt@gem_mmap_gtt@cpuset-big-copy-xy:
- shard-iclb: NOTRUN -> [INCOMPLETE][3]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-iclb3/igt@gem_mmap_...@cpuset-big-copy-xy.html

  * igt@gem_ppgtt@blt-vs-render-ctxn:
- shard-tglb: [PASS][4] -> [FAIL][5] +1 similar issue
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-tglb5/igt@gem_pp...@blt-vs-render-ctxn.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-tglb8/igt@gem_pp...@blt-vs-render-ctxn.html

  
Known issues


  Here are the changes found in Patchwork_21402_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-tglb: NOTRUN -> [SKIP][6] ([i915#1839])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-tglb1/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][7] ([i915#3002])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-snb2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
- shard-tglb: [PASS][8] -> [INCOMPLETE][9] ([i915#1373])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-tglb8/igt@gem_ctx_isolation@preservation...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-tglb7/igt@gem_ctx_isolation@preservation...@rcs0.html

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
- shard-kbl:  [PASS][10] -> [DMESG-WARN][11] ([i915#180]) +3 
similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-kbl7/igt@gem_ctx_isolation@preservation...@vcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-kbl6/igt@gem_ctx_isolation@preservation...@vcs0.html

  * igt@gem_ctx_isolation@preservation-s3@vecs0:
- shard-skl:  [PASS][12] -> [INCOMPLETE][13] ([i915#198])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-skl3/igt@gem_ctx_isolation@preservation...@vecs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-skl1/igt@gem_ctx_isolation@preservation...@vecs0.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][14] ([fdo#109271] / [i915#1099]) +4 
similar issues
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-snb2/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-glk:  [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-glk5/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-glk7/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@bcs0:
- shard-tglb: [PASS][17] -> [FAIL][18] ([i915#2842]) +2 similar 
issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-tglb1/igt@gem_exec_fair@basic-p...@bcs0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-tglb2/igt@gem_exec_fair@basic-p...@bcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][19] -> [FAIL][20] ([i915#2842]) +2 similar 
issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10768/shard-kbl2/igt@gem_exec_fair@basic-p...@vecs0.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-kbl2/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_params@no-vebox:
- shard-tglb: NOTRUN -> [SKIP][21] ([fdo#109283])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21402/shard-tglb3/igt@gem_exec_par...@no-vebox.html

  * igt@gem_pwrite@basic-exhaustion:
  

[Intel-gfx] [PULL] drm-intel-gt-next

2021-10-21 Thread Joonas Lahtinen
Hi Dave & Daniel,

Here comes the final feature PR for 5.16.

As the biggest thing, it adds a multi-LRC (parallel) submission
implementation for GuC and a simplified parallel submission uAPI
to go with it (which only works with GuC for now). It has a similar
mission to the bonded submission uAPI; take a look at the kerneldocs
for full detail.

Then there are some improvements to making sure old pages are flushed from
caches before making them available to userspace. Those extra flushes may
be visible in corner-case scenarios if an application is frequently importing
new dmabufs on non-LLC hardware. The better approach would anyway be to
recycle a pool of dmabufs rather than destroy and recreate them.

In addition to that, there are only minor changes with mainly developer
impact, and those can be seen in the shortlog.

Regards, Joonas

PS. There has been a request to backmerge drm-next after you merge this
PR, to bring in the dma-resv iterators. I'll do that.

PPS. Will send out the dim patches for updating the "for-linux-next-gt"
 branch, to make sure we avoid future conflicts.

***

drm-intel-gt-next-2021-10-21:

UAPI Changes:

- Expose multi-LRC submission interface

  Similar to the bonded submission interface but simplified.
  Comes with GuC only implementation for now. See kerneldoc
  for more details.

  Userspace changes: https://github.com/intel/media-driver/pull/1252

- Expose logical engine instance to user

  Needed by the multi-LRC submission interface for GuC

  Userspace changes: https://github.com/intel/media-driver/pull/1252

Driver Changes:

- Fix blank screen booting crashes when CONFIG_CC_OPTIMIZE_FOR_SIZE=y (Hugh)
- Add support for multi-LRC submission in the GuC backend (Matt B)
- Add extra cache flushing before making pages userspace visible (Matt A, 
Thomas)
- Mark internal GPU object pages dirty so they will be flushed properly (Matt A)

- Move remaining debugfs interfaces i915_wedged/i915_forcewake_user into gt 
(Andi)
- Replace the unconditional clflushes with drm_clflush_virt_range() (Ville)
- Remove IS_ACTIVE macro completely (Lucas)
- Improve kerneldocs for cache_dirty (Matt A)

- Add missing includes (Lucas)
- Selftest improvements (Matt R, Ran, Matt A)

The following changes since commit 1a839e016e4964b5c8384e5d82e5e5ac02a23f52:

  drm/i915: remove IS_ACTIVE (2021-10-07 11:04:05 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-gt-next-2021-10-21

for you to fetch changes up to ab5d964c001b9efffcbfa4d67a30186b67d79771:

  drm/i915/selftests: mark up hugepages object with start_cpu_write (2021-10-20 
16:50:42 +0100)


UAPI Changes:

- Expose multi-LRC submission interface

  Similar to the bonded submission interface but simplified.
  Comes with GuC only implementation for now. See kerneldoc
  for more details.

  Userspace changes: https://github.com/intel/media-driver/pull/1252

- Expose logical engine instance to user

  Needed by the multi-LRC submission interface for GuC

  Userspace changes: https://github.com/intel/media-driver/pull/1252

Driver Changes:

- Fix blank screen booting crashes when CONFIG_CC_OPTIMIZE_FOR_SIZE=y (Hugh)
- Add support for multi-LRC submission in the GuC backend (Matt B)
- Add extra cache flushing before making pages userspace visible (Matt A, 
Thomas)
- Mark internal GPU object pages dirty so they will be flushed properly (Matt A)

- Move remaining debugfs interfaces i915_wedged/i915_forcewake_user into gt 
(Andi)
- Replace the unconditional clflushes with drm_clflush_virt_range() (Ville)
- Remove IS_ACTIVE macro completely (Lucas)
- Improve kerneldocs for cache_dirty (Matt A)

- Add missing includes (Lucas)
- Selftest improvements (Matt R, Ran, Matt A)


Andi Shyti (1):
  drm/i915/gt: move remaining debugfs interfaces into gt

Hugh Dickins (1):
  drm/i915: fix blank screen booting crashes

Lucas De Marchi (2):
  drm/i915/gt: include tsc.h where used
  drm/i915/gt: add asm/cacheflush.h for use of clflush()

Matt Roper (1):
  drm/i915: Stop using I915_TILING_* in client blit selftest

Matthew Auld (9):
  drm/i915: mark dmabuf objects as ALLOC_USER
  drm/i915: mark userptr objects as ALLOC_USER
  drm/i915: extract bypass-llc check into helper
  drm/i915/dmabuf: add paranoid flush-on-acquire
  drm/i915/userptr: add paranoid flush-on-acquire
  drm/i915/shmem: ensure flush during swap-in on non-LLC
  drm/i915: expand on the kernel-doc for cache_dirty
  drm/i915: mark up internal objects with start_cpu_write
  drm/i915/selftests: mark up hugepages object with start_cpu_write

Matthew Brost (24):
  drm/i915/guc: Move GuC guc_id allocation under submission state sub-struct
  drm/i915/guc: Take GT PM ref when deregistering context
  drm/i915/guc: Take engine PM when a context is pinned with GuC submission
  drm/i915/gu

Re: [Intel-gfx] [PATCH v3] drm/i915/display: Wait PSR2 get out of deep sleep to update pipe

2021-10-21 Thread Gwan-gyeong Mun




On 10/14/21 12:03 AM, Souza, Jose wrote:

On Wed, 2021-10-13 at 23:39 +0300, Gwan-gyeong Mun wrote:


On 10/11/21 11:53 PM, Souza, Jose wrote:

On Thu, 2021-10-07 at 12:31 +0300, Gwan-gyeong Mun wrote:


On 10/6/21 11:04 PM, Souza, Jose wrote:

On Wed, 2021-10-06 at 11:50 +0300, Gwan-gyeong Mun wrote:


On 10/6/21 2:18 AM, José Roberto de Souza wrote:

Alderlake-P was getting 'max time under evasion' messages when PSR2
is enabled; this is due to PIPE_SCANLINE/PIPEDSL returning 0 over a
period of time longer than VBLANK_EVASION_TIME_US.

For PSR1 we had the same issue, so intel_psr_wait_for_idle() was
implemented to wait for PSR1 to get into the idle state, but nothing was
done for PSR2.

For PSR2 we can't only wait for the idle state, as PSR2 tends to stay
in the sleep state (ready to send selective updates).
Waiting for any state below deep sleep proved to be effective in
avoiding the evasion messages and also did not waste a lot of time.

v2:
- dropping the additional wait_for loops, only the _wait_for_atomic()
is necessary
- waiting for states below EDP_PSR2_STATUS_STATE_DEEP_SLEEP

v3:
- dropping intel_wait_for_condition_atomic() function

Cc: Ville Syrjälä 
Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 .../drm/i915/display/intel_display_debugfs.c  |  3 +-
 drivers/gpu/drm/i915/display/intel_psr.c  | 52 +++
 drivers/gpu/drm/i915/i915_reg.h   | 10 ++--
 3 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c 
b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index 309d74fd86ce1..d7dd3a57c6170 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -303,8 +303,7 @@ psr_source_status(struct intel_dp *intel_dp, struct 
seq_file *m)
 };
 val = intel_de_read(dev_priv,
 EDP_PSR2_STATUS(intel_dp->psr.transcoder));
-status_val = (val & EDP_PSR2_STATUS_STATE_MASK) >>
-  EDP_PSR2_STATUS_STATE_SHIFT;
+status_val = REG_FIELD_GET(EDP_PSR2_STATUS_STATE_MASK, val);
 if (status_val < ARRAY_SIZE(live_status))
 status = live_status[status_val];
 } else {
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 7a205fd5023bb..ade514fc0a24d 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -1809,15 +1809,21 @@ void intel_psr_post_plane_update(const struct 
intel_atomic_state *state)
 _intel_psr_post_plane_update(state, crtc_state);
 }

-/**
- * psr_wait_for_idle - wait for PSR1 to idle
- * @intel_dp: Intel DP
- * @out_value: PSR status in case of failure
- *
- * Returns: 0 on success or -ETIMEOUT if PSR status does not idle.
- *
- */
-static int psr_wait_for_idle(struct intel_dp *intel_dp, u32 *out_value)
+static int _psr2_ready_for_pipe_update_locked(struct intel_dp *intel_dp)
+{
+struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
+
+/*
+ * Any state lower than EDP_PSR2_STATUS_STATE_DEEP_SLEEP is enough.
+ * As all higher states has bit 4 of PSR2 state set we can just wait for
+ * EDP_PSR2_STATUS_STATE_DEEP_SLEEP to be cleared.
+ */
+return intel_de_wait_for_clear(dev_priv,
+   EDP_PSR2_STATUS(intel_dp->psr.transcoder),
+   EDP_PSR2_STATUS_STATE_DEEP_SLEEP, 50);

Below the DEEP_SLEEP state there are IDLE, CAPTURE, CAPTURE_FS, SLEEP,
BUFON_FW, ML_UP, SU_STANDBY, etc. Since these states change quickly, I
think the testing done so far is a little insufficient to show that the
evasion messages are completely avoided.


What is your suggestion of test for this?

I left my Alderlake-P running overnight (more than 12 hours) with a news
website open.
This website reloads the page every 5 minutes, so it entered and exited
DC5/6 states several times without any evasion messages.


I think it may be necessary to test a little more or to have
confirmation from the HW person in charge.


I can file an issue for this but it will probably take several weeks to get
an answer.


Yes, I am not disparaging what you tested.
However, since the current code only confirms that bit 31 of the
PSR2_STATUS register has gone to 0,
it does not guarantee that the tested use cases have exercised the
IDLE, CAPTURE, CAPTURE_FS, SLEEP, BUFON_FW, ML_UP, SU_STANDBY, and
FAST_SLEEP states.

I can't think of a way to test each of the above states right now, but
what I can suggest is: after "intel_de_wait_for_clear(dev_priv,
EDP_PSR2_STATUS(intel_dp->psr.transcoder),
EDP_PSR2_STATUS_STATE_DEEP_SLEEP, 50)" completes, can you add code that
prints the current PSR2 status?
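For example, something roughly like this in
_psr2_ready_for_pipe_update_locked() (just a sketch reusing the helpers
already in the patch; whether to log only on timeout or unconditionally is
up to you):

	int ret;

	ret = intel_de_wait_for_clear(dev_priv,
				      EDP_PSR2_STATUS(intel_dp->psr.transcoder),
				      EDP_PSR2_STATUS_STATE_DEEP_SLEEP, 50);
	if (ret)
		/* log the live PSR2 state so later evasion reports can be correlated */
		drm_dbg_kms(&dev_priv->drm, "PSR2 state on timeout: 0x%x\n",
			    REG_FIELD_GET(EDP_PSR2_STATUS_STATE_MASK,
					  intel_de_read(dev_priv,
							EDP_PSR2_STATUS(intel_dp->psr.transcoder))));

	return ret;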

If so, I think it will be easy to analyze the problem in case the evasion
messages occur again after this code is applied.
If additional confirmation from the responsible HW developer is received
at a later time, it is thought that future work such as deleting the
code that outputs the newly added current PSR Status

Re: [Intel-gfx] [PATCH 1/3] drm/i915: Add struct to hold IP version

2021-10-21 Thread Jani Nikula
On Wed, 20 Oct 2021, "Souza, Jose"  wrote:
> On Wed, 2021-10-20 at 12:47 +0300, Jani Nikula wrote:
>> On Tue, 19 Oct 2021, José Roberto de Souza  wrote:
>> > The constant platform display version is not using this new struct but
>> > the runtime variant will definitely use it.
>> 
>> Cc: Some more folks to hijack this thread. Sorry! ;)
>> 
>> We added runtime info to i915, because we had this idea and goal of
>> turning the device info to a truly const pointer to the info structures
>> in i915_pci.c that are stored in rodata. The idea was that we'll have a
>> complete split of mutable and immutable device data, with all the
>> mutable data in runtime info.
>> 
>> Alas, we never got there. More and more data that was mostly const but
>> sometimes needed tweaking kept piling up. mkwrite_device_info() was
>> supposed to be a clue not to modify device info runtime, but instead it
>> proliferated. Now we have places like intel_fbc_init() disabling FBC
>> through that. But most importantly, we have fusing that considerably
changes the device info, and copying all of that data over to
>> runtime info probably isn't worth it.
>> 
>> Should we just acknowledge that the runtime info is useless, and move
>> some of that data to intel_device_info and some of it elsewhere in i915?
>
> With newer platforms getting more and more modular, I believe we will
> need to store even more mutable platform information.
>
> In my opinion a separation of immutable and mutable platform
> information is cleaner and easier to maintain.

Yeah, that's kind of what the original point was with device and runtime
info split. It's just that a lot of the supposedly immutable platform
info has turned into mutable information.

I think either we need to properly follow through with that idea, and
only store a const struct intel_device_info * to the rodata in
i915_pci.c, or just scrap it. None of this "almost immutable" business
that we currently have. "Almost immutable" means "mutable".

The main problem is that we'll still want to have the initial values in
static data. One idea is something like this:

struct intel_device_info {
const struct intel_runtime_info *runtime_info;
/* ... */
};

static const struct intel_device_info i965g_info = {
.runtime_info = &i965g_initial_runtime_info;
/* ... */
};

And things like .pipe_mask would be part of struct
intel_runtime_info. You'd copy the stuff over from the intel_device_info
runtime_info member to i915->__runtime, but i915->__info would be a
const pointer to the device info. You'd never access the runtime_info
member of intel_device_info after probe.
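Probe would then boil down to roughly this (purely illustrative; the
function name is made up, and __info/__runtime are the hypothetical fields
from above):

static void intel_device_info_init(struct drm_i915_private *i915,
				   const struct intel_device_info *info)
{
	/* const pointer into the rodata tables in i915_pci.c */
	i915->__info = info;
	/* mutable copy of the initial values; fusing only ever touches this */
	i915->__runtime = *info->runtime_info;
}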

It's just really painful, for instance because we already have two sets
of flags, display and non-display, and those would be further multiplied
by mutable/immutable. And we should probably increase, not decrease, the
split between display and non-display. The macro horror show of
i915_pci.c would just grow worse.


BR,
Jani.



>
>> 
>> 
>> BR,
>> Jani.
>> 
>

-- 
Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] [PATCH] drm/i915/cdclk: put the cdclk vtables in const data

2021-10-21 Thread Jani Nikula
Add the const that was accidentally left out from the vtables.

Fixes: 6b4cd9cba620 ("drm/i915: constify the cdclk vtable")
Cc: Dave Airlie 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_cdclk.c | 44 +++---
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c 
b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 9e466d829019..868dd43a7542 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -2885,7 +2885,7 @@ u32 intel_read_rawclk(struct drm_i915_private *dev_priv)
return freq;
 }
 
-static struct intel_cdclk_funcs tgl_cdclk_funcs = {
+static const struct intel_cdclk_funcs tgl_cdclk_funcs = {
.get_cdclk = bxt_get_cdclk,
.set_cdclk = bxt_set_cdclk,
.bw_calc_min_cdclk = skl_bw_calc_min_cdclk,
@@ -2893,7 +2893,7 @@ static struct intel_cdclk_funcs tgl_cdclk_funcs = {
.calc_voltage_level = tgl_calc_voltage_level,
 };
 
-static struct intel_cdclk_funcs ehl_cdclk_funcs = {
+static const struct intel_cdclk_funcs ehl_cdclk_funcs = {
.get_cdclk = bxt_get_cdclk,
.set_cdclk = bxt_set_cdclk,
.bw_calc_min_cdclk = skl_bw_calc_min_cdclk,
@@ -2901,7 +2901,7 @@ static struct intel_cdclk_funcs ehl_cdclk_funcs = {
.calc_voltage_level = ehl_calc_voltage_level,
 };
 
-static struct intel_cdclk_funcs icl_cdclk_funcs = {
+static const struct intel_cdclk_funcs icl_cdclk_funcs = {
.get_cdclk = bxt_get_cdclk,
.set_cdclk = bxt_set_cdclk,
.bw_calc_min_cdclk = skl_bw_calc_min_cdclk,
@@ -2909,7 +2909,7 @@ static struct intel_cdclk_funcs icl_cdclk_funcs = {
.calc_voltage_level = icl_calc_voltage_level,
 };
 
-static struct intel_cdclk_funcs bxt_cdclk_funcs = {
+static const struct intel_cdclk_funcs bxt_cdclk_funcs = {
.get_cdclk = bxt_get_cdclk,
.set_cdclk = bxt_set_cdclk,
.bw_calc_min_cdclk = skl_bw_calc_min_cdclk,
@@ -2917,54 +2917,54 @@ static struct intel_cdclk_funcs bxt_cdclk_funcs = {
.calc_voltage_level = bxt_calc_voltage_level,
 };
 
-static struct intel_cdclk_funcs skl_cdclk_funcs = {
+static const struct intel_cdclk_funcs skl_cdclk_funcs = {
.get_cdclk = skl_get_cdclk,
.set_cdclk = skl_set_cdclk,
.bw_calc_min_cdclk = skl_bw_calc_min_cdclk,
.modeset_calc_cdclk = skl_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs bdw_cdclk_funcs = {
+static const struct intel_cdclk_funcs bdw_cdclk_funcs = {
.get_cdclk = bdw_get_cdclk,
.set_cdclk = bdw_set_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = bdw_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs chv_cdclk_funcs = {
+static const struct intel_cdclk_funcs chv_cdclk_funcs = {
.get_cdclk = vlv_get_cdclk,
.set_cdclk = chv_set_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = vlv_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs vlv_cdclk_funcs = {
+static const struct intel_cdclk_funcs vlv_cdclk_funcs = {
.get_cdclk = vlv_get_cdclk,
.set_cdclk = vlv_set_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = vlv_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs hsw_cdclk_funcs = {
+static const struct intel_cdclk_funcs hsw_cdclk_funcs = {
.get_cdclk = hsw_get_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = fixed_modeset_calc_cdclk,
 };
 
 /* SNB, IVB, 965G, 945G */
-static struct intel_cdclk_funcs fixed_400mhz_cdclk_funcs = {
+static const struct intel_cdclk_funcs fixed_400mhz_cdclk_funcs = {
.get_cdclk = fixed_400mhz_get_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = fixed_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs ilk_cdclk_funcs = {
+static const struct intel_cdclk_funcs ilk_cdclk_funcs = {
.get_cdclk = fixed_450mhz_get_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = fixed_modeset_calc_cdclk,
 };
 
-static struct intel_cdclk_funcs gm45_cdclk_funcs = {
+static const struct intel_cdclk_funcs gm45_cdclk_funcs = {
.get_cdclk = gm45_get_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = fixed_modeset_calc_cdclk,
@@ -2972,7 +2972,7 @@ static struct intel_cdclk_funcs gm45_cdclk_funcs = {
 
 /* G45 uses G33 */
 
-static struct intel_cdclk_funcs i965gm_cdclk_funcs = {
+static const struct intel_cdclk_funcs i965gm_cdclk_funcs = {
.get_cdclk = i965gm_get_cdclk,
.bw_calc_min_cdclk = intel_bw_calc_min_cdclk,
.modeset_calc_cdclk = fixed_modeset_calc_cdclk,
@@ -2980,19 +2980,19 @@ static struct intel_cdclk_funcs i965gm_cdclk_funcs = {
 
 /* i965G uses fixed 400 */
 
-static struct intel_cdclk_funcs pnv_cdclk_funcs = {
+static const struct intel_cdclk_funcs

Re: [Intel-gfx] [PATCH] drm/i915/cdclk: put the cdclk vtables in const data

2021-10-21 Thread Ville Syrjälä
On Thu, Oct 21, 2021 at 04:34:08PM +0300, Jani Nikula wrote:
> Add the const that was accidentally left out from the vtables.
> 
> Fixes: 6b4cd9cba620 ("drm/i915: constify the cdclk vtable")
> Cc: Dave Airlie 
> Signed-off-by: Jani Nikula 

Reviewed-by: Ville Syrjälä 

And if you're sufficiently bored, you could also move i915->cdclk_funcs
into i915->cdclk.funcs.

-- 
Ville Syrjälä
Intel


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/display: program audio CDCLK-TS for keepalives (rev4)

2021-10-21 Thread Patchwork
== Series Details ==

Series: drm/i915/display: program audio CDCLK-TS for keepalives (rev4)
URL   : https://patchwork.freedesktop.org/series/94551/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10769 -> Patchwork_21403


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_21403:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_frontbuffer_tracking@basic:
- {fi-hsw-gt1}:   [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10769/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-hsw-gt1/igt@kms_frontbuffer_track...@basic.html

  
Known issues


  Here are the changes found in Patchwork_21403 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-skl-6700k2:  NOTRUN -> [SKIP][3] ([fdo#109271]) +28 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-skl-6700k2/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@amdgpu/amd_basic@cs-sdma:
- fi-kbl-7500u:   NOTRUN -> [SKIP][4] ([fdo#109271]) +28 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-kbl-7500u/igt@amdgpu/amd_ba...@cs-sdma.html

  * igt@amdgpu/amd_basic@query-info:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][5] ([fdo#109315])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-tgl-1115g4/igt@amdgpu/amd_ba...@query-info.html

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][6] ([fdo#109271]) +27 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@amdgpu/amd_cs_nop@nop-gfx0:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][7] ([fdo#109315] / [i915#2575]) +16 
similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-tgl-1115g4/igt@amdgpu/amd_cs_...@nop-gfx0.html

  * igt@gem_huc_copy@huc-copy:
- fi-skl-6700k2:  NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#2190])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-skl-6700k2/igt@gem_huc_c...@huc-copy.html
- fi-kbl-7500u:   NOTRUN -> [SKIP][9] ([fdo#109271] / [i915#2190])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-kbl-7500u/igt@gem_huc_c...@huc-copy.html
- fi-tgl-1115g4:  NOTRUN -> [SKIP][10] ([i915#2190])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-tgl-1115g4/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_backlight@basic-brightness:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][11] ([i915#1155])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-tgl-1115g4/igt@i915_pm_backli...@basic-brightness.html

  * igt@i915_selftest@live@execlists:
- fi-bsw-nick:[PASS][12] -> [INCOMPLETE][13] ([i915#2940])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10769/fi-bsw-nick/igt@i915_selftest@l...@execlists.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-bsw-nick/igt@i915_selftest@l...@execlists.html
- fi-bsw-kefka:   [PASS][14] -> [INCOMPLETE][15] ([i915#2940])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10769/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-bsw-kefka/igt@i915_selftest@l...@execlists.html

  * igt@i915_selftest@live@gt_heartbeat:
- fi-kbl-soraka:  [PASS][16] -> [DMESG-FAIL][17] ([i915#2291] / 
[i915#541])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10769/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  * igt@i915_selftest@live@hangcheck:
- fi-icl-y:   [PASS][18] -> [INCOMPLETE][19] ([i915#3057] / 
[i915#3965])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10769/fi-icl-y/igt@i915_selftest@l...@hangcheck.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-icl-y/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][20] ([fdo#111827]) +8 similar issues
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-tgl-1115g4/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   NOTRUN -> [SKIP][21] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21403/fi-kbl-7500u/igt@kms_chame

Re: [Intel-gfx] [PATCH] drm/i915/cdclk: put the cdclk vtables in const data

2021-10-21 Thread Ville Syrjälä
On Thu, Oct 21, 2021 at 04:37:02PM +0300, Ville Syrjälä wrote:
> On Thu, Oct 21, 2021 at 04:34:08PM +0300, Jani Nikula wrote:
> > Add the const that was accidentally left out from the vtables.
> > 
> > Fixes: 6b4cd9cba620 ("drm/i915: constify the cdclk vtable")
> > Cc: Dave Airlie 
> > Signed-off-by: Jani Nikula 
> 
> Reviewed-by: Ville Syrjälä 
> 
> And if you're sufficiently bored could also move i915->cdclk_funcs
> into i915->cdclk.funcs.

Oh and we should move the cdclk_funcs struct definition into
intel_cdclk.c.

-- 
Ville Syrjälä
Intel


Re: [Intel-gfx] [PATCH v3 13/13] drm/i915: replace drm_detect_hdmi_monitor() with drm_display_info.is_hdmi

2021-10-21 Thread Ville Syrjälä
On Wed, Oct 20, 2021 at 12:51:21AM +0200, Claudio Suarez wrote:
> drm_get_edid() internally calls to drm_connector_update_edid_property()
> and then drm_add_display_info(), which parses the EDID.
> This happens in the function intel_hdmi_set_edid() and
> intel_sdvo_tmds_sink_detect() (via intel_sdvo_get_edid()).
> 
> Once EDID is parsed, the monitor HDMI support information is available
> through drm_display_info.is_hdmi. Retriving the same information with
> drm_detect_hdmi_monitor() is less efficient. Change to
> drm_display_info.is_hdmi

I meant we need to examine all call chains that can lead to
.detect() to make sure all of them do in fact update the
display_info beforehand.

> 
> This is a TODO task in Documentation/gpu/todo.rst
> 
> Signed-off-by: Claudio Suarez 
> ---
>  drivers/gpu/drm/i915/display/intel_hdmi.c | 2 +-
>  drivers/gpu/drm/i915/display/intel_sdvo.c | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c 
> b/drivers/gpu/drm/i915/display/intel_hdmi.c
> index b04685bb6439..008e5b0ba408 100644
> --- a/drivers/gpu/drm/i915/display/intel_hdmi.c
> +++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
> @@ -2355,7 +2355,7 @@ intel_hdmi_set_edid(struct drm_connector *connector)
>   to_intel_connector(connector)->detect_edid = edid;
>   if (edid && edid->input & DRM_EDID_INPUT_DIGITAL) {
>   intel_hdmi->has_audio = drm_detect_monitor_audio(edid);
> - intel_hdmi->has_hdmi_sink = drm_detect_hdmi_monitor(edid);
> + intel_hdmi->has_hdmi_sink = connector->display_info.is_hdmi;
>  
>   connected = true;
>   }
> diff --git a/drivers/gpu/drm/i915/display/intel_sdvo.c 
> b/drivers/gpu/drm/i915/display/intel_sdvo.c
> index 6cb27599ea03..b4065e4df644 100644
> --- a/drivers/gpu/drm/i915/display/intel_sdvo.c
> +++ b/drivers/gpu/drm/i915/display/intel_sdvo.c
> @@ -2060,8 +2060,9 @@ intel_sdvo_tmds_sink_detect(struct drm_connector 
> *connector)
>   if (edid->input & DRM_EDID_INPUT_DIGITAL) {
>   status = connector_status_connected;
>   if (intel_sdvo_connector->is_hdmi) {
> - intel_sdvo->has_hdmi_monitor = 
> drm_detect_hdmi_monitor(edid);
>   intel_sdvo->has_hdmi_audio = 
> drm_detect_monitor_audio(edid);
> + intel_sdvo->has_hdmi_monitor =
> + 
> connector->display_info.is_hdmi;
>   }
>   } else
>   status = connector_status_disconnected;
> -- 
> 2.33.0
> 
> 

-- 
Ville Syrjälä
Intel


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/4] drm/i915/clflush: fixup handling of cache_dirty

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [1/4] drm/i915/clflush: fixup handling of 
cache_dirty
URL   : https://patchwork.freedesktop.org/series/96119/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
a73be2e8149b drm/i915/clflush: fixup handling of cache_dirty
b9d957e084bd drm/i915/clflush: disallow on discrete
-:35: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#35: FILE: drivers/gpu/drm/i915/gem/i915_gem_clflush.c:73:
 {
+

total: 0 errors, 0 warnings, 1 checks, 24 lines checked
8458b5d40c8d drm/i915: move cpu_write_needs_clflush
e587884c5e81 drm/i915: stop setting cache_dirty on discrete




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [1/4] drm/i915/clflush: fixup handling of cache_dirty

2021-10-21 Thread Patchwork
== Series Details ==

Series: series starting with [1/4] drm/i915/clflush: fixup handling of 
cache_dirty
URL   : https://patchwork.freedesktop.org/series/96119/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 'gen6_write16' 
- different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 'gen6_write32' 
- different lock contexts for basic block
+./include/linux/spinlock.h:418:9: warning: context imbalance in 'gen6_write8' 
- different lock contexts for basic block




[Intel-gfx] [PATCH v2 00/17] drm/i915/dg2: Enabling 64k page size and flat ccs

2021-10-21 Thread Ramalingam C
This series introduces the enabling patches for the new memory compression
feature Flat CCS and for 64K page support in i915 local memory, along with
documentation of the uAPI impact. The details of the features and their
uAPI implications are included below; they are also added to
Documentation/gpu/rfc/i915_dg2.rst

DG2 64K page size support:
=

On discrete platforms, starting from DG2, we have to contend with GTT
page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
objects. Specifically the hardware only supports 64K or larger GTT page
sizes for such memory. The kernel will already ensure that all
I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
sizes underneath.

Note that the object size returned by the kernel will always reflect any
required rounding up done by the kernel, i.e. 4K will become 64K on devices
such as DG2.
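
As a rough illustration (a hypothetical userspace-side sketch, not code
from this series; the SZ_* constants and helper name are spelled out here
just to keep it self-contained), the rounding means a 4K request comes
back as 64K:

  /* Illustrative only: round a requested object size up to the 64K
   * minimum backing page size used for I915_MEMORY_CLASS_DEVICE on
   * DG2-like devices. */
  #define SZ_4K  0x1000ULL
  #define SZ_64K 0x10000ULL

  static unsigned long long dg2_lmem_obj_size(unsigned long long requested)
  {
          /* round up to the 64K minimum page size */
          return (requested + SZ_64K - 1) & ~(SZ_64K - 1);
  }

  /* dg2_lmem_obj_size(SZ_4K) == SZ_64K, i.e. a 4K request becomes 64K */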

Special DG2 GTT address alignment requirement:
=

The GTT alignment will also need to be at least 64K for such objects.

Note that due to how the hardware implements 64K GTT page support, we
have some further complications:

1) The entire PDE (which covers a 2M virtual address range) must contain
only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same PDE is forbidden
by the hardware.

2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
objects.

To handle the above, the kernel implements a memory coloring scheme to
prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is ever
unable to evict the required pages for the given PDE (different color)
when inserting the object into the GTT, then it will simply fail the
request.

Since userspace manages the GTT address space itself, special
care is needed to ensure this doesn’t happen. The simplest
scheme is to align and round up all I915_MEMORY_CLASS_DEVICE
objects to 2M, which avoids any issues here. At the very least this is
objects to 2M, which avoids any issues here. At the very least this is
likely needed for objects that can be placed in both
I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
potential issues when the kernel needs to migrate the object behind the
scenes, since that might also involve evicting other objects.

To summarise the GTT rules, on platforms like DG2:

1) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must have
64K alignment; otherwise the kernel will reject the request.

2) I915_MEMORY_CLASS_DEVICE objects must never be placed in the same
PDE as I915_MEMORY_CLASS_SYSTEM objects; otherwise the kernel will reject
the request.

3) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and
I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out to
2M.
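
A minimal sketch of how a userspace allocator might honour rule 3 above
(a hypothetical helper, not part of the i915 uAPI; SZ_2M is defined here
only to keep the sketch self-contained):

  /* Pad dual-placement (LMEM + SMEM capable) objects so that both their
   * GTT offset and their size are 2M multiples, keeping each object
   * within whole PDEs and out of reach of the coloring restrictions. */
  #define SZ_2M 0x200000ULL

  static unsigned long long pad_to_2m(unsigned long long v)
  {
          return (v + SZ_2M - 1) & ~(SZ_2M - 1);
  }

  /* e.g. reserve pad_to_2m(obj_size) bytes of GTT at a pad_to_2m() offset */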

Flat CCS support for lmem
=
On Xe-HP and later devices, we use dedicated compression control state
(CCS) stored in local memory for each surface, to support the 3D and
media compression formats.

The memory required for the CCS of the entire local memory is 1/256 of
the local memory size. So before the kernel boots, the required memory is
reserved for the CCS data and a secure register is programmed with
the CCS base address.
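
For a rough sense of the arithmetic (an illustrative sketch, not code
from this series; the helper name is made up):

  /* CCS metadata is 1/256 of local memory, matching the >> 8 used by the
   * series' GET_CCS_SIZE() helper. */
  static unsigned long long ccs_size_for(unsigned long long lmem_size)
  {
          return lmem_size >> 8; /* i.e. lmem_size / 256 */
  }

  /* ccs_size_for(16ULL << 30) == 64ULL << 20: 16 GiB of LMEM needs 64 MiB of CCS */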

Flat CCS data needs to be cleared when an lmem object is allocated.
CCS data can be copied in and out of the CCS region through
XY_CTRL_SURF_COPY_BLT. The CPU can’t access the CCS data directly.

When lmem is exhausted, if the object’s placements include smem, then
we can directly decompress the compressed lmem object into smem and
start using it from smem itself.

But when we need to swap out the compressed lmem object into an smem
region even though the object’s placement doesn’t include smem, then we copy the
lmem content as-is into the smem region along with the CCS data (using
XY_CTRL_SURF_COPY_BLT). When the object is used again, the lmem content is
swapped back in along with restoration of the CCS data (using
XY_CTRL_SURF_COPY_BLT) at the corresponding location.

Flat-CCS Modifiers for different compression formats

I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate buffers in the
Flat CCS render compression formats. Though the general layout is the same
as I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, a new hashing/compression
algorithm is used. Render compression uses 128 byte compression blocks.

I915_FORMAT_MOD_F_TILED_DG2_MC_CCS - used to indicate buffers in the Flat
CCS media compression formats. Though the general layout is the same as
I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, a new hashing/compression algorithm
is used. Media compression uses 256 byte compression blocks.

I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate buffers in the
Flat CCS clear color render compression formats. This is the unified
compression format for clear color render compression. The general layout
is a tiled layout using 4K tiles, i.e. the Tile4 layout.

v2:
  Fixed some formatting issues and platform naming issues
  Added some more documentation on Flat-CCS


Abdiel Janulgue (1):
  drm/i915/

[Intel-gfx] [PATCH v2 01/17] drm/i915: Add has_64k_pages flag

2021-10-21 Thread Ramalingam C
From: Stuart Summers 

Add a new platform flag, has_64k_pages, for platforms supporting
base page sizes of 64k.

Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/i915_drv.h  | 2 ++
 drivers/gpu/drm/i915/i915_pci.c  | 2 ++
 drivers/gpu/drm/i915/intel_device_info.h | 1 +
 3 files changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 12256218634f..a16fde38a252 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1714,6 +1714,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_MSLICES(dev_priv) \
(INTEL_INFO(dev_priv)->has_mslices)
 
+#define HAS_64K_PAGES(dev_priv) (INTEL_INFO(dev_priv)->has_64k_pages)
+
 #define HAS_IPC(dev_priv)   (INTEL_INFO(dev_priv)->display.has_ipc)
 
 #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 169837de395d..8ef484a23652 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1015,6 +1015,7 @@ static const struct intel_device_info xehpsdv_info = {
DGFX_FEATURES,
PLATFORM(INTEL_XEHPSDV),
.display = { },
+   .has_64k_pages = 1,
.pipe_mask = 0,
.platform_engine_mask =
BIT(RCS0) | BIT(BCS0) |
@@ -1033,6 +1034,7 @@ static const struct intel_device_info dg2_info = {
.graphics_rel = 55,
.media_rel = 55,
PLATFORM(INTEL_DG2),
+   .has_64k_pages = 1,
.platform_engine_mask =
BIT(RCS0) | BIT(BCS0) |
BIT(VECS0) | BIT(VECS1) |
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index 8e6f48d1eb7b..dd453b96af19 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -123,6 +123,7 @@ enum intel_ppgtt_type {
func(is_dgfx); \
/* Keep has_* in alphabetical order */ \
func(has_64bit_reloc); \
+   func(has_64k_pages); \
func(gpu_reset_clobbers_display); \
func(has_reset_engine); \
func(has_global_mocs); \
-- 
2.20.1



[Intel-gfx] [PATCH v2 02/17] drm/i915/xehpsdv: set min page-size to 64K

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

LMEM should be allocated at 64K granularity, since 4K page support will
eventually be dropped for LMEM when using the PPGTT.

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c  | 6 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c | 5 -
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c 
b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index ddd37ccb1362..f52a06f05fc7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -778,6 +778,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, 
u16 type,
struct intel_uncore *uncore = &i915->uncore;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct intel_memory_region *mem;
+   resource_size_t min_page_size;
resource_size_t io_start;
resource_size_t lmem_size;
u64 lmem_base;
@@ -789,8 +790,11 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, 
u16 type,
lmem_size = pci_resource_len(pdev, 2) - lmem_base;
io_start = pci_resource_start(pdev, 2) + lmem_base;
 
+   min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
+   I915_GTT_PAGE_SIZE_4K;
+
mem = intel_memory_region_create(i915, lmem_base, lmem_size,
-I915_GTT_PAGE_SIZE_4K, io_start,
+min_page_size, io_start,
 type, instance,
 &i915_region_stolen_lmem_ops);
if (IS_ERR(mem))
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index afb35d2e5c73..073d28d96669 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -193,6 +193,7 @@ static struct intel_memory_region *setup_lmem(struct 
intel_gt *gt)
struct intel_uncore *uncore = gt->uncore;
struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
struct intel_memory_region *mem;
+   resource_size_t min_page_size;
resource_size_t io_start;
resource_size_t lmem_size;
int err;
@@ -207,10 +208,12 @@ static struct intel_memory_region *setup_lmem(struct 
intel_gt *gt)
if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2)))
return ERR_PTR(-ENODEV);
 
+   min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
+   I915_GTT_PAGE_SIZE_4K;
mem = intel_memory_region_create(i915,
 0,
 lmem_size,
-I915_GTT_PAGE_SIZE_4K,
+min_page_size,
 io_start,
 INTEL_MEMORY_LOCAL,
 0,
-- 
2.20.1



[Intel-gfx] [PATCH v2 03/17] drm/i915/xehpsdv: enforce min GTT alignment

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

For local-memory objects we need to align the GTT addresses to 64K, both
for the ppgtt and ggtt.

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/i915_vma.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 90546fa58fc1..c31b4bc8af16 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
}
 
color = 0;
-   if (vma->obj && i915_vm_has_cache_coloring(vma->vm))
-   color = vma->obj->cache_level;
+   if (vma->obj) {
+   if (HAS_64K_PAGES(vma->vm->i915) && 
i915_gem_object_is_lmem(vma->obj))
+   alignment = max(alignment, I915_GTT_PAGE_SIZE_64K);
+
+   if (i915_vm_has_cache_coloring(vma->vm))
+   color = vma->obj->cache_level;
+   }
 
if (flags & PIN_OFFSET_FIXED) {
u64 offset = flags & PIN_OFFSET_MASK;
-- 
2.20.1



[Intel-gfx] [PATCH v2 04/17] drm/i915: enforce min page size for scratch

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

If the device needs a 64K minimum GTT page size for device local-memory,
like on XEHPSDV, then we need to fail the allocation if we can't
meet it, instead of falling back to 4K pages. Otherwise we can't
safely support the insertion of device local-memory pages for
this vm, since the HW expects the correct physical alignment and
size for every PTE once the page-table is marked as 64K GTT mode.

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_gtt.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 67d14afa6623..2a6eec5f0d58 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -334,6 +334,18 @@ int setup_scratch_page(struct i915_address_space *vm)
if (size == I915_GTT_PAGE_SIZE_4K)
return -ENOMEM;
 
+   /*
+* If we need 64K minimum GTT pages for device local-memory,
+* like on XEHPSDV, then we need to fail the allocation here,
+* otherwise we can't safely support the insertion of
+* local-memory pages for this vm, since the HW expects the
+* correct physical alignment and size when the page-table is
+* operating in 64K GTT mode, which includes any scratch PTEs,
+* since userpsace can still touch them.
+*/
+   if (HAS_64K_PAGES(vm->i915))
+   return -ENOMEM;
+
size = I915_GTT_PAGE_SIZE_4K;
} while (1);
 }
-- 
2.20.1



[Intel-gfx] [PATCH v2 06/17] drm/i915/xehpsdv: support 64K GTT pages

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

XEHPSDV optimises 64K GTT pages for local-memory, since everything
should be allocated at 64K granularity. We say goodbye to sparse
entries, and instead get a compact 256B page-table for 64K pages,
which should be more cache friendly. 4K pages for local-memory
are no longer supported by the HW.
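
To make the numbers concrete (back-of-the-envelope arithmetic, not taken
from the patch): a PDE covers 2M of address space, so with 64K pages it
holds 2M / 64K = 32 PTEs, and at 8 bytes per PTE the compacted page-table
occupies only 32 * 8 = 256B of its 4K backing page; the igt_ppgtt_compact()
selftest below is written to catch any stray writes that land outside
those 32 entries.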

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  61 ++
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 106 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   3 +
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   1 +
 4 files changed, 168 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c 
b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
index 41d0680f3bd7..9c2ffa4090f1 100644
--- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c
@@ -1451,6 +1451,66 @@ static int igt_ppgtt_sanity_check(void *arg)
return err;
 }
 
+static int igt_ppgtt_compact(void *arg)
+{
+   struct i915_gem_context *ctx = arg;
+   struct drm_i915_private *i915 = ctx->i915;
+   struct drm_i915_gem_object *obj;
+   int err;
+
+   /*
+* Simple test to catch issues with compact 64K pages -- since the pt is
+* compacted to 256B that gives us 32 entries per pt, however since the
+* backing page for the pt is 4K, any extra entries we might incorrectly
+* write out should be ignored by the HW. If ever hit such a case this
+* test should catch it since some of our writes would land in scratch.
+*/
+
+   if (!HAS_64K_PAGES(i915)) {
+   pr_info("device lacks compact 64K page support, skipping\n");
+   return 0;
+   }
+
+   if (!HAS_LMEM(i915)) {
+   pr_info("device lacks LMEM support, skipping\n");
+   return 0;
+   }
+
+   /* We want the range to cover multiple page-table boundaries. */
+   obj = i915_gem_object_create_lmem(i915, SZ_4M, 0);
+   if (IS_ERR(obj))
+   return PTR_ERR(obj);
+
+   err = i915_gem_object_pin_pages_unlocked(obj);
+   if (err)
+   goto out_put;
+
+   if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_64K) {
+   pr_info("LMEM compact unable to allocate huge-page(s)\n");
+   goto out_unpin;
+   }
+
+   /*
+* Disable 2M GTT pages by forcing the page-size to 64K for the GTT
+* insertion.
+*/
+   obj->mm.page_sizes.sg = I915_GTT_PAGE_SIZE_64K;
+
+   err = igt_write_huge(ctx, obj);
+   if (err)
+   pr_err("LMEM compact write-huge failed\n");
+
+out_unpin:
+   i915_gem_object_unpin_pages(obj);
+out_put:
+   i915_gem_object_put(obj);
+
+   if (err == -ENOMEM)
+   err = 0;
+
+   return err;
+}
+
 static int igt_tmpfs_fallback(void *arg)
 {
struct i915_gem_context *ctx = arg;
@@ -1681,6 +1741,7 @@ int i915_gem_huge_page_live_selftests(struct 
drm_i915_private *i915)
SUBTEST(igt_tmpfs_fallback),
SUBTEST(igt_ppgtt_smoke_huge),
SUBTEST(igt_ppgtt_sanity_check),
+   SUBTEST(igt_ppgtt_compact),
};
struct i915_gem_context *ctx;
struct i915_address_space *vm;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 6bff6bf1a450..fec0f20f1b93 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -233,6 +233,8 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * 
const vm,
   start, end, lvl);
} else {
unsigned int count;
+   unsigned int pte = gen8_pd_index(start, 0);
+   unsigned int num_ptes;
u64 *vaddr;
 
count = gen8_pt_count(start, end);
@@ -242,10 +244,18 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * 
const vm,
atomic_read(&pt->used));
GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
+   num_ptes = count;
+   if (pt->is_compact) {
+   GEM_BUG_ON(num_ptes % 16);
+   GEM_BUG_ON(pte % 16);
+   num_ptes /= 16;
+   pte /= 16;
+   }
+
vaddr = px_vaddr(pt);
-   memset64(vaddr + gen8_pd_index(start, 0),
+   memset64(vaddr + pte,
 vm->scratch[0]->encode,
-count);
+num_ptes);
 
atomic_sub(count, &pt->used);
   

[Intel-gfx] [PATCH v2 05/17] drm/i915/gtt/xehpsdv: move scratch page to system memory

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

On some platforms the hw has dropped support for 4K GTT pages when
dealing with LMEM, and due to the design of 64K GTT pages in the hw, we
can only mark the *entire* page-table as operating in 64K GTT mode,
since the enable bit is still on the pde, and not the pte. And since we
still need to allow 4K GTT pages for SMEM objects, we can't have a
"normal" 4K page-table with scratch pointing to LMEM, since that's
undefined from the hw pov. The simplest solution is to just move the 64K
scratch page to SMEM on such platforms and call it a day, since that
should work for all configurations.

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |  1 +
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 23 +--
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 ++
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  2 ++
 drivers/gpu/drm/i915/selftests/mock_gtt.c |  2 ++
 6 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 890191f286e3..49e7651d764a 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -440,6 +440,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
ppgtt->base.vm.cleanup = gen6_ppgtt_cleanup;
 
ppgtt->base.vm.alloc_pt_dma = alloc_pt_dma;
+   ppgtt->base.vm.alloc_scratch_dma = alloc_pt_dma;
ppgtt->base.vm.pte_encode = ggtt->vm.pte_encode;
 
ppgtt->base.pd = __alloc_pd(I915_PDES);
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index 037a9a6e4889..6bff6bf1a450 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -777,10 +777,29 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
 */
ppgtt->vm.has_read_only = !IS_GRAPHICS_VER(gt->i915, 11, 12);
 
-   if (HAS_LMEM(gt->i915))
+   if (HAS_LMEM(gt->i915)) {
ppgtt->vm.alloc_pt_dma = alloc_pt_lmem;
-   else
+
+   /*
+* On some platforms the hw has dropped support for 4K GTT pages
+* when dealing with LMEM, and due to the design of 64K GTT
+* pages in the hw, we can only mark the *entire* page-table as
+* operating in 64K GTT mode, since the enable bit is still on
+* the pde, and not the pte. And since we still need to allow
+* 4K GTT pages for SMEM objects, we can't have a "normal" 4K
+* page-table with scratch pointing to LMEM, since that's
+* undefined from the hw pov. The simplest solution is to just
+* move the 64K scratch page to SMEM on such platforms and call
+* it a day, since that should work for all configurations.
+*/
+   if (HAS_64K_PAGES(gt->i915))
+   ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
+   else
+   ppgtt->vm.alloc_scratch_dma = alloc_pt_lmem;
+   } else {
ppgtt->vm.alloc_pt_dma = alloc_pt_dma;
+   ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
+   }
 
err = gen8_init_scratch(&ppgtt->vm);
if (err)
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index f17383e76eb7..289316007029 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -1077,6 +1077,7 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt)
ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE;
 
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+   ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
ggtt->vm.clear_range = nop_clear_range;
if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915))
@@ -1129,6 +1130,7 @@ static int i915_gmch_probe(struct i915_ggtt *ggtt)
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
 
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
+   ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
 
if (needs_idle_maps(i915)) {
drm_notice(&i915->drm,
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 2a6eec5f0d58..56fbd37a6b54 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -298,7 +298,7 @@ int setup_scratch_page(struct i915_address_space *vm)
do {
struct drm_i915_gem_object *obj;
 
-   obj = vm->alloc_pt_dma(vm, size);
+   obj = vm->alloc_scratch_dma(vm, size);
if (IS_ERR(obj))
goto skip;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index bc6750263359..6d13f4ab4d4a 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/

[Intel-gfx] [PATCH v2 07/17] drm/i915: Add vm min alignment support

2021-10-21 Thread Ramalingam C
From: Bommu Krishnaiah 

Replace the hard coded 4K alignment value with vm->min_alignment.

Cc: Wilson Chris P 
Signed-off-by: Bommu Krishnaiah 
Signed-off-by: Ramalingam C 
---
 .../i915/gem/selftests/i915_gem_client_blt.c  | 23 ---
 drivers/gpu/drm/i915/gt/intel_gtt.c   |  9 
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  9 
 3 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
index 8402ed925a69..6b9b861e43e5 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c
@@ -39,6 +39,7 @@ struct tiled_blits {
struct blit_buffer scratch;
struct i915_vma *batch;
u64 hole;
+   u64 align;
u32 width;
u32 height;
 };
@@ -410,14 +411,21 @@ tiled_blits_create(struct intel_engine_cs *engine, struct 
rnd_state *prng)
goto err_free;
}
 
-   hole_size = 2 * PAGE_ALIGN(WIDTH * HEIGHT * 4);
+   t->align = I915_GTT_PAGE_SIZE_2M; /* XXX worst case, derive from vm! */
+   t->align = max(t->align,
+  i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_LOCAL));
+   t->align = max(t->align,
+  i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_SYSTEM));
+
+   hole_size = 2 * round_up(WIDTH * HEIGHT * 4, t->align);
hole_size *= 2; /* room to maneuver */
-   hole_size += 2 * I915_GTT_MIN_ALIGNMENT;
+   hole_size += 2 * t->align; /* padding on either side */
 
mutex_lock(&t->ce->vm->mutex);
memset(&hole, 0, sizeof(hole));
err = drm_mm_insert_node_in_range(&t->ce->vm->mm, &hole,
- hole_size, 0, I915_COLOR_UNEVICTABLE,
+ hole_size, t->align,
+ I915_COLOR_UNEVICTABLE,
  0, U64_MAX,
  DRM_MM_INSERT_BEST);
if (!err)
@@ -428,7 +436,7 @@ tiled_blits_create(struct intel_engine_cs *engine, struct 
rnd_state *prng)
goto err_put;
}
 
-   t->hole = hole.start + I915_GTT_MIN_ALIGNMENT;
+   t->hole = hole.start + t->align;
pr_info("Using hole at %llx\n", t->hole);
 
err = tiled_blits_create_buffers(t, WIDTH, HEIGHT, prng);
@@ -455,7 +463,7 @@ static void tiled_blits_destroy(struct tiled_blits *t)
 static int tiled_blits_prepare(struct tiled_blits *t,
   struct rnd_state *prng)
 {
-   u64 offset = PAGE_ALIGN(t->width * t->height * 4);
+   u64 offset = round_up(t->width * t->height * 4, t->align);
u32 *map;
int err;
int i;
@@ -486,8 +494,7 @@ static int tiled_blits_prepare(struct tiled_blits *t,
 
 static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
 {
-   u64 offset =
-   round_up(t->width * t->height * 4, 2 * I915_GTT_MIN_ALIGNMENT);
+   u64 offset = round_up(t->width * t->height * 4, 2 * t->align);
int err;
 
/* We want to check position invariant tiling across GTT eviction */
@@ -500,7 +507,7 @@ static int tiled_blits_bounce(struct tiled_blits *t, struct 
rnd_state *prng)
 
/* Reposition so that we overlap the old addresses, and slightly off */
err = tiled_blit(t,
-&t->buffers[2], t->hole + I915_GTT_MIN_ALIGNMENT,
+&t->buffers[2], t->hole + t->align,
 &t->buffers[1], t->hole + 3 * offset / 2);
if (err)
return err;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 56fbd37a6b54..4743921b7638 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -216,6 +216,15 @@ void i915_address_space_init(struct i915_address_space 
*vm, int subclass)
 
GEM_BUG_ON(!vm->total);
drm_mm_init(&vm->mm, 0, vm->total);
+
+   memset64(vm->min_alignment, I915_GTT_MIN_ALIGNMENT,
+ARRAY_SIZE(vm->min_alignment));
+
+   if (HAS_64K_PAGES(vm->i915)) {
+   vm->min_alignment[INTEL_MEMORY_LOCAL] = I915_GTT_PAGE_SIZE_64K;
+   vm->min_alignment[INTEL_MEMORY_STOLEN_LOCAL] = 
I915_GTT_PAGE_SIZE_64K;
+   }
+
vm->mm.head_node.color = I915_COLOR_UNEVICTABLE;
 
INIT_LIST_HEAD(&vm->bound_list);
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 6d0233ffae17..20101eef4c95 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -28,6 +28,8 @@
 #include "gt/intel_reset.h"
 #include "i915_selftest.h"
 #include "i915_vma_types.h"
+#include "i915_params.h"
+#include "intel_memory_region.h"
 
 #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
 
@

[Intel-gfx] [PATCH v2 08/17] drm/i915/selftests: account for min_alignment in GTT selftests

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 96 ---
 1 file changed, 63 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 46f4236039a9..fdb4bf88293b 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -237,6 +237,8 @@ static int lowlevel_hole(struct i915_address_space *vm,
 u64 hole_start, u64 hole_end,
 unsigned long end_time)
 {
+   const unsigned int min_alignment =
+   i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM);
I915_RND_STATE(seed_prng);
struct i915_vma *mock_vma;
unsigned int size;
@@ -250,9 +252,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
I915_RND_SUBSTATE(prng, seed_prng);
struct drm_i915_gem_object *obj;
unsigned int *order, count, n;
-   u64 hole_size;
+   u64 hole_size, aligned_size;
 
-   hole_size = (hole_end - hole_start) >> size;
+   aligned_size = max_t(u32, ilog2(min_alignment), size);
+   hole_size = (hole_end - hole_start) >> aligned_size;
if (hole_size > KMALLOC_MAX_SIZE / sizeof(u32))
hole_size = KMALLOC_MAX_SIZE / sizeof(u32);
count = hole_size >> 1;
@@ -273,8 +276,8 @@ static int lowlevel_hole(struct i915_address_space *vm,
}
GEM_BUG_ON(!order);
 
-   GEM_BUG_ON(count * BIT_ULL(size) > vm->total);
-   GEM_BUG_ON(hole_start + count * BIT_ULL(size) > hole_end);
+   GEM_BUG_ON(count * BIT_ULL(aligned_size) > vm->total);
+   GEM_BUG_ON(hole_start + count * BIT_ULL(aligned_size) > 
hole_end);
 
/* Ignore allocation failures (i.e. don't report them as
 * a test failure) as we are purposefully allocating very
@@ -297,10 +300,10 @@ static int lowlevel_hole(struct i915_address_space *vm,
}
 
for (n = 0; n < count; n++) {
-   u64 addr = hole_start + order[n] * BIT_ULL(size);
+   u64 addr = hole_start + order[n] * 
BIT_ULL(aligned_size);
intel_wakeref_t wakeref;
 
-   GEM_BUG_ON(addr + BIT_ULL(size) > vm->total);
+   GEM_BUG_ON(addr + BIT_ULL(aligned_size) > vm->total);
 
if (igt_timeout(end_time,
"%s timed out before %d/%d\n",
@@ -343,7 +346,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
}
 
mock_vma->pages = obj->mm.pages;
-   mock_vma->node.size = BIT_ULL(size);
+   mock_vma->node.size = BIT_ULL(aligned_size);
mock_vma->node.start = addr;
 
with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
@@ -354,7 +357,7 @@ static int lowlevel_hole(struct i915_address_space *vm,
 
i915_random_reorder(order, count, &prng);
for (n = 0; n < count; n++) {
-   u64 addr = hole_start + order[n] * BIT_ULL(size);
+   u64 addr = hole_start + order[n] * 
BIT_ULL(aligned_size);
intel_wakeref_t wakeref;
 
GEM_BUG_ON(addr + BIT_ULL(size) > vm->total);
@@ -398,8 +401,10 @@ static int fill_hole(struct i915_address_space *vm,
 {
const u64 hole_size = hole_end - hole_start;
struct drm_i915_gem_object *obj;
+   const unsigned int min_alignment =
+   i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM);
const unsigned long max_pages =
-   min_t(u64, ULONG_MAX - 1, hole_size/2 >> PAGE_SHIFT);
+   min_t(u64, ULONG_MAX - 1, (hole_size / 2) >> 
ilog2(min_alignment));
const unsigned long max_step = max(int_sqrt(max_pages), 2UL);
unsigned long npages, prime, flags;
struct i915_vma *vma;
@@ -440,14 +445,17 @@ static int fill_hole(struct i915_address_space *vm,
 
offset = p->offset;
list_for_each_entry(obj, &objects, st_link) {
+   u64 aligned_size = 
round_up(obj->base.size,
+   
min_alignment);
+
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma))
continue;
 
 

[Intel-gfx] [PATCH v2 09/17] drm/i915/xehpsdv: implement memory coloring

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

The basic idea is that each 2M block (page-table) has a color, depending
on whether the page-table is occupied by LMEM objects (64K) or SMEM
objects (4K), where our goal is to prevent mixing 64K and 4K GTT pages in
the same page-table, which is not supported by the HW.

Signed-off-by: Matthew Auld 
Signed-off-by: Stuart Summers 
Signed-off-by: Ramalingam C 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 16 ++
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  6 
 drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++
 drivers/gpu/drm/i915/i915_vma.c   | 46 +++
 4 files changed, 71 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index fec0f20f1b93..666745adbe93 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -464,6 +464,19 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
return idx;
 }
 
+static void xehpsdv_ppgtt_color_adjust(const struct drm_mm_node *node,
+  unsigned long color,
+  u64 *start,
+  u64 *end)
+{
+   if (i915_node_color_differs(node, color))
+   *start = round_up(*start, SZ_2M);
+
+   node = list_next_entry(node, node_list);
+   if (i915_node_color_differs(node, color))
+   *end = round_down(*end, SZ_2M);
+}
+
 static void
 xehpsdv_ppgtt_insert_huge(struct i915_vma *vma,
  struct sgt_dma *iter,
@@ -901,6 +914,9 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
ppgtt->vm.alloc_scratch_dma = alloc_pt_dma;
}
 
+   if (HAS_64K_PAGES(gt->i915))
+   ppgtt->vm.mm.color_adjust = xehpsdv_ppgtt_color_adjust;
+
err = gen8_init_scratch(&ppgtt->vm);
if (err)
goto err_free;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 20101eef4c95..34696acde342 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -397,6 +397,12 @@ i915_vm_has_cache_coloring(struct i915_address_space *vm)
return i915_is_ggtt(vm) && vm->mm.color_adjust;
 }
 
+static inline bool
+i915_vm_has_memory_coloring(struct i915_address_space *vm)
+{
+   return !i915_is_ggtt(vm) && vm->mm.color_adjust;
+}
+
 static inline struct i915_ggtt *
 i915_vm_to_ggtt(struct i915_address_space *vm)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 2b73ddb11c66..006bf4924c24 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -292,6 +292,13 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 
/* Always look at the page afterwards to avoid the end-of-GTT */
end += I915_GTT_PAGE_SIZE;
+   } else if (i915_vm_has_memory_coloring(vm)) {
+   /*
+* Expand the search the cover the page-table boundries, in
+* case we need to flip the color of the page-table(s).
+*/
+   start = round_down(start, SZ_2M);
+   end = round_up(end, SZ_2M);
}
GEM_BUG_ON(start >= end);
 
@@ -321,6 +328,16 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
if (node->color == target->color)
continue;
}
+   } else if (i915_vm_has_memory_coloring(vm)) {
+   if (node->start + node->size <= target->start) {
+   if (node->color == target->color)
+   continue;
+   }
+
+   if (node->start >= target->start + target->size) {
+   if (node->color == target->color)
+   continue;
+   }
}
 
if (i915_vma_is_pinned(vma)) {
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c31b4bc8af16..92b124ecc38c 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -585,6 +585,10 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, 
unsigned long color)
struct drm_mm_node *node = &vma->node;
struct drm_mm_node *other;
 
+   /* Only valid to be called on an already inserted vma */
+   GEM_BUG_ON(!drm_mm_node_allocated(node));
+   GEM_BUG_ON(list_empty(&node->node_list));
+
/*
 * On some machines we have to be careful when putting differing types
 * of snoopable memory together to avoid the prefetcher crossing memory
@@ -592,22 +596,34 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, 
unsigned long color)
 * these constraints apply and set the drm_mm.color_

[Intel-gfx] [PATCH v2 11/17] drm/i915/lmem: Enable lmem for platforms with Flat CCS

2021-10-21 Thread Ramalingam C
From: Abdiel Janulgue 

A portion of device memory is reserved for Flat CCS, so the usable
device memory is reduced by the size of the Flat CCS region. The size of
Flat CCS is specified via the “XEHPSDV_FLAT_CCS_BASE_ADDR” register.
So to get the effective device memory we need to subtract the
Flat CCS memory size from the total device memory.
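
For a rough sense of scale (illustrative arithmetic only, not from the
patch): since the CCS region is 1/256 of local memory, a 16 GiB tile
would dedicate 16 GiB / 256 = 64 MiB to CCS, leaving roughly 16320 MiB of
effective device memory; the code below simply treats everything above
the base address read from XEHPSDV_FLAT_CCS_BASE_ADDR as reserved.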

Cc: Matthew Auld 
Signed-off-by: Abdiel Janulgue 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_gt.c  | 19 ++
 drivers/gpu/drm/i915/gt/intel_gt.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_region_lmem.c | 22 +++--
 drivers/gpu/drm/i915/i915_reg.h |  3 +++
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 1cb1948ac959..fd82ebee8724 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -900,6 +900,25 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, 
i915_reg_t reg)
return intel_uncore_read_fw(gt->uncore, reg);
 }
 
+u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
+{
+   int type;
+   u8 sliceid, subsliceid;
+
+   for (type = 0; type < NUM_STEERING_TYPES; type++) {
+   if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
+   intel_gt_get_valid_steering(gt, type, &sliceid,
+   &subsliceid);
+   return intel_uncore_read_with_mcr_steering(gt->uncore,
+  reg,
+  sliceid,
+  subsliceid);
+   }
+   }
+
+   return intel_uncore_read(gt->uncore, reg);
+}
+
 void intel_gt_info_print(const struct intel_gt_info *info,
 struct drm_printer *p)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
b/drivers/gpu/drm/i915/gt/intel_gt.h
index 74e771871a9b..24b78398a587 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -84,6 +84,7 @@ static inline bool intel_gt_needs_read_steering(struct 
intel_gt *gt,
 }
 
 u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
 
 void intel_gt_info_print(const struct intel_gt_info *info,
 struct drm_printer *p);
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 073d28d96669..d1f88beb26fe 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -201,8 +201,26 @@ static struct intel_memory_region *setup_lmem(struct 
intel_gt *gt)
if (!IS_DGFX(i915))
return ERR_PTR(-ENODEV);
 
-   /* Stolen starts from GSMBASE on DG1 */
-   lmem_size = intel_uncore_read64(uncore, GEN12_GSMBASE);
+   if (HAS_FLAT_CCS(i915)) {
+   u64 tile_stolen, flat_ccs_base_addr_reg, flat_ccs_base;
+
+   lmem_size = pci_resource_len(pdev, 2);
+   flat_ccs_base_addr_reg = intel_gt_read_register(gt, 
XEHPSDV_FLAT_CCS_BASE_ADDR);
+   flat_ccs_base = (flat_ccs_base_addr_reg >> 
XEHPSDV_CCS_BASE_SHIFT) * SZ_64K;
+   tile_stolen = lmem_size - flat_ccs_base;
+
+   /* If the FLAT_CCS_BASE_ADDR register is not populated, flag an 
error */
+   if (tile_stolen == lmem_size)
+   DRM_ERROR("CCS_BASE_ADDR register did not have expected 
value\n");
+
+   lmem_size -= tile_stolen;
+   } else {
+   /* Stolen starts from GSMBASE without CCS */
+   lmem_size = intel_uncore_read64(&i915->uncore, GEN12_GSMBASE);
+   if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2)))
+   return ERR_PTR(-ENODEV);
+   }
+
 
io_start = pci_resource_start(pdev, 2);
if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2)))
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1e221fbe37fd..3693eb03f5aa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12469,6 +12469,9 @@ enum skl_power_gate {
 #define GEN12_GSMBASE  _MMIO(0x108100)
 #define GEN12_DSMBASE  _MMIO(0x1080C0)
 
+#define XEHPSDV_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
+#define   XEHPSDV_CCS_BASE_SHIFT   8
+
 /* gamt regs */
 #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4)
 #define   GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW  0x67F1427F /* max/min for 
LRA1/2 */
-- 
2.20.1



[Intel-gfx] [PATCH v2 10/17] drm/i915/xehpsdv: Add has_flat_ccs to device info

2021-10-21 Thread Ramalingam C
From: CQ Tang 

Gen12+ devices support 3D surface (buffer) compression and various
compression formats. This is accomplished by an additional compression
control state (CCS) stored for each surface.

Gen12 devices (TGL family and DG1) store compression states in a separate
region of memory. It is managed by user-space and has an associated set of
user-space managed page tables used by hardware for address translation.

In Gen12.5 devices (XEHPSDV, DG2, etc.), a new feature is introduced,
i.e. Flat CCS. It replaces AUX page tables with a flat indexed region of
device memory for storing compression states.

Cc: Joonas Lahtinen 
Cc: Matthew Auld 
Signed-off-by: CQ Tang 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/i915_drv.h  | 2 ++
 drivers/gpu/drm/i915/i915_pci.c  | 1 +
 drivers/gpu/drm/i915/intel_device_info.h | 1 +
 3 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a16fde38a252..57948e0ee48b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1721,6 +1721,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i))
 #define HAS_LMEM(i915) HAS_REGION(i915, REGION_LMEM)
 
+#define HAS_FLAT_CCS(dev_priv)   (INTEL_INFO(dev_priv)->has_flat_ccs)
+
 #define HAS_GT_UC(dev_priv)(INTEL_INFO(dev_priv)->has_gt_uc)
 
 #define HAS_POOLED_EU(dev_priv)(INTEL_INFO(dev_priv)->has_pooled_eu)
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 8ef484a23652..68367b505dc4 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -991,6 +991,7 @@ static const struct intel_device_info adl_p_info = {
XE_HP_PAGE_SIZES, \
.dma_mask_size = 46, \
.has_64bit_reloc = 1, \
+   .has_flat_ccs = 1, \
.has_global_mocs = 1, \
.has_gt_uc = 1, \
.has_llc = 1, \
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index dd453b96af19..87ee1d86d2ac 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -126,6 +126,7 @@ enum intel_ppgtt_type {
func(has_64k_pages); \
func(gpu_reset_clobbers_display); \
func(has_reset_engine); \
+   func(has_flat_ccs); \
func(has_global_mocs); \
func(has_gt_uc); \
func(has_l3_dpf); \
-- 
2.20.1



[Intel-gfx] [PATCH v2 12/17] drm/i915/gt: Clear compress metadata for Xe_HP platforms

2021-10-21 Thread Ramalingam C
From: Ayaz A Siddiqui 

Xe-HP and later devices support Flat CCS, which reserves a portion of
the device memory to store compression metadata. When clearing a
device memory buffer object we therefore also need to clear the
associated CCS buffer.

Flat CCS memory cannot be directly accessed by S/W.
The address of the CCS buffer associated with a main BO is automatically
calculated by the device itself. KMD/UMD can only access this buffer
indirectly, using the XY_CTRL_SURF_COPY_BLT cmd with the address of the
device memory buffer.
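
To put the sizes in perspective (back-of-the-envelope, not from the
patch): the migration code clears LMEM in CHUNK_SZ = 8M chunks, so each
chunk carries 8M / 256 = 32K of CCS, i.e. 128 blocks of 256 bytes, which
fits comfortably within a single XY_CTRL_SURF_COPY_BLT (each command can
transfer up to 1024 blocks).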

v2: Fixed issues with platform naming [Lucas]

Cc: CQ Tang 
Signed-off-by: Ayaz A Siddiqui 
Signed-off-by: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  14 +++
 drivers/gpu/drm/i915/gt/intel_migrate.c  | 120 ++-
 2 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index f8253012d166..07bf5a1753bd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -203,6 +203,20 @@
 #define GFX_OP_DRAWRECT_INFO ((0x3<<29)|(0x1d<<24)|(0x80<<16)|(0x3))
 #define GFX_OP_DRAWRECT_INFO_I965  ((0x7900<<16)|0x2)
 
+#define XY_CTRL_SURF_INSTR_SIZE5
+#define MI_FLUSH_DW_SIZE   3
+#define XY_CTRL_SURF_COPY_BLT  ((2 << 29) | (0x48 << 22) | 3)
+#define   SRC_ACCESS_TYPE_SHIFT21
+#define   DST_ACCESS_TYPE_SHIFT20
+#define   CCS_SIZE_SHIFT   8
+#define   XY_CTRL_SURF_MOCS_SHIFT  25
+#define   NUM_CCS_BYTES_PER_BLOCK  256
+#define   NUM_CCS_BLKS_PER_XFER1024
+#define   INDIRECT_ACCESS  0
+#define   DIRECT_ACCESS1
+#define  MI_FLUSH_LLC  BIT(9)
+#define  MI_FLUSH_CCS  BIT(16)
+
 #define COLOR_BLT_CMD  (2 << 29 | 0x40 << 22 | (5 - 2))
 #define XY_COLOR_BLT_CMD   (2 << 29 | 0x50 << 22)
 #define SRC_COPY_BLT_CMD   (2 << 29 | 0x43 << 22)
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index afb1cce9a352..0bed01750884 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -17,6 +17,7 @@ struct insert_pte_data {
 };
 
 #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */
+#define GET_CCS_SIZE(i915, size)   (HAS_FLAT_CCS(i915) ? (size) >> 8 : 0)
 
 static bool engine_supports_migration(struct intel_engine_cs *engine)
 {
@@ -490,15 +491,104 @@ intel_context_migrate_copy(struct intel_context *ce,
return err;
 }
 
-static int emit_clear(struct i915_request *rq, int size, u32 value)
+static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
+{
+   /* Mask the 3 LSB to use the PPGTT address space */
+   *cmd++ = MI_FLUSH_DW | flags;
+   *cmd++ = lower_32_bits(dst);
+   *cmd++ = upper_32_bits(dst);
+
+   return cmd;
+}
+
+static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size)
+{
+   u32 num_cmds, num_blks, total_size;
+
+   if (!GET_CCS_SIZE(i915, size))
+   return 0;
+
+   /*
+* XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte
+* blocks. one XY_CTRL_SURF_COPY_BLT command can
+* trnasfer upto 1024 blocks.
+*/
+   num_blks = (GET_CCS_SIZE(i915, size) +
+  (NUM_CCS_BYTES_PER_BLOCK - 1)) >> 8;
+   num_cmds = (num_blks + (NUM_CCS_BLKS_PER_XFER - 1)) >> 10;
+   total_size = (XY_CTRL_SURF_INSTR_SIZE) * num_cmds;
+
+   /*
+* We need to add a flush before and after
+* XY_CTRL_SURF_COPY_BLT
+*/
+   total_size += 2 * MI_FLUSH_DW_SIZE;
+   return total_size;
+}
+
+static u32 *_i915_ctrl_surf_copy_blt(u32 *cmd, u64 src_addr, u64 dst_addr,
+u8 src_mem_access, u8 dst_mem_access,
+int src_mocs, int dst_mocs,
+u16 num_ccs_blocks)
+{
+   int i = num_ccs_blocks;
+
+   /*
+* The XY_CTRL_SURF_COPY_BLT instruction is used to copy the CCS
+* data in and out of the CCS region.
+*
+* We can copy at most 1024 blocks of 256 bytes using one
+* XY_CTRL_SURF_COPY_BLT instruction.
+*
+* In case we need to copy more than 1024 blocks, we need to add
+* another instruction to the same batch buffer.
+*
+* 1024 blocks of 256 bytes of CCS represent a total 256KB of CCS.
+*
+* 256 KB of CCS represents 256 * 256 KB = 64 MB of LMEM.
+*/
+   do {
+   /*
+* We use logical AND with 1023 since the size field
+* takes values which is in the range of 0 - 1023
+*/
+   *cmd++ = ((XY_CTRL_SURF_COPY_BLT) |
+ (src_mem_access << SRC_ACCESS_TYPE_SHIFT) |
+ (dst_mem_access << DST_ACCESS_TYPE_SHIFT) |
+  

[Intel-gfx] [PATCH v2 13/17] drm/i915/dg2: Tile 4 plane format support

2021-10-21 Thread Ramalingam C
From: Stanislav Lisovskiy 

The TileF (Tile4 in the bspec) format is a 4K tile organized into
64B subtiles with the same basic shape as legacy TileY,
and will be supported by Display13.
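
As a quick sanity check on the geometry (simple arithmetic, not from the
patch): 4096 B / 64 B = 64 subtiles per tile, i.e. an 8x8 grid, and each
64B subtile is 4 x 16B OWords, matching the shape noted in the
intel_fb.c comment below.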

v2: - Fixed wrong case condition(Jani Nikula)
- Increased I915_FORMAT_MOD_F_TILED up to 12(Imre Deak)

v3: - s/I915_TILING_F/TILING_4/g
- s/I915_FORMAT_MOD_F_TILED/I915_FORMAT_MOD_4_TILED/g
- Removed unneeded fencing code

Cc: Imre Deak 
Cc: Matt Roper 
Cc: Maarten Lankhorst 
Signed-off-by: Stanislav Lisovskiy 
Signed-off-by: Matt Roper 
Signed-off-by: Juha-Pekka Heikkilä 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  1 +
 drivers/gpu/drm/i915/display/intel_fb.c   |  7 
 drivers/gpu/drm/i915/display/intel_fbc.c  |  1 +
 .../drm/i915/display/intel_plane_initial.c|  1 +
 .../drm/i915/display/skl_universal_plane.c| 36 ++-
 drivers/gpu/drm/i915/i915_drv.h   |  1 +
 drivers/gpu/drm/i915/i915_pci.c   |  1 +
 drivers/gpu/drm/i915/i915_reg.h   |  1 +
 drivers/gpu/drm/i915/intel_device_info.h  |  1 +
 drivers/gpu/drm/i915/intel_pm.c   |  1 +
 include/uapi/drm/drm_fourcc.h |  8 +
 11 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index ce5d6633029a..9b678839bf2b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -8877,6 +8877,7 @@ static int intel_atomic_check_async(struct 
intel_atomic_state *state)
case I915_FORMAT_MOD_X_TILED:
case I915_FORMAT_MOD_Y_TILED:
case I915_FORMAT_MOD_Yf_TILED:
+   case I915_FORMAT_MOD_4_TILED:
break;
default:
drm_dbg_kms(&i915->drm,
diff --git a/drivers/gpu/drm/i915/display/intel_fb.c 
b/drivers/gpu/drm/i915/display/intel_fb.c
index fa1f375e696b..e19739fef825 100644
--- a/drivers/gpu/drm/i915/display/intel_fb.c
+++ b/drivers/gpu/drm/i915/display/intel_fb.c
@@ -127,6 +127,12 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, 
int color_plane)
return 128;
else
return 512;
+   case I915_FORMAT_MOD_4_TILED:
+   /*
+* Each 4K tile consists of 64B(8*8) subtiles, with
+* same shape as Y Tile(i.e 4*16B OWords)
+*/
+   return 128;
case I915_FORMAT_MOD_Y_TILED_CCS:
if (is_ccs_plane(fb, color_plane))
return 128;
@@ -305,6 +311,7 @@ unsigned int intel_surf_alignment(const struct 
drm_framebuffer *fb,
case I915_FORMAT_MOD_Y_TILED_CCS:
case I915_FORMAT_MOD_Yf_TILED_CCS:
case I915_FORMAT_MOD_Y_TILED:
+   case I915_FORMAT_MOD_4_TILED:
case I915_FORMAT_MOD_Yf_TILED:
return 1 * 1024 * 1024;
default:
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c 
b/drivers/gpu/drm/i915/display/intel_fbc.c
index 1f66de77a6b1..f079a771f802 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -747,6 +747,7 @@ static bool tiling_is_valid(struct drm_i915_private 
*dev_priv,
case DRM_FORMAT_MOD_LINEAR:
case I915_FORMAT_MOD_Y_TILED:
case I915_FORMAT_MOD_Yf_TILED:
+   case I915_FORMAT_MOD_4_TILED:
return DISPLAY_VER(dev_priv) >= 9;
case I915_FORMAT_MOD_X_TILED:
return true;
diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c 
b/drivers/gpu/drm/i915/display/intel_plane_initial.c
index dcd698a02da2..d80855ee9b96 100644
--- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
+++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
@@ -125,6 +125,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
case DRM_FORMAT_MOD_LINEAR:
case I915_FORMAT_MOD_X_TILED:
case I915_FORMAT_MOD_Y_TILED:
+   case I915_FORMAT_MOD_4_TILED:
break;
default:
drm_dbg(&dev_priv->drm,
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 7444b88829ea..0eb4509f7f7a 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -207,6 +207,13 @@ static const u64 adlp_step_a_plane_format_modifiers[] = {
DRM_FORMAT_MOD_INVALID
 };
 
+static const u64 dg2_plane_format_modifiers[] = {
+   I915_FORMAT_MOD_X_TILED,
+   I915_FORMAT_MOD_4_TILED,
+   DRM_FORMAT_MOD_LINEAR,
+   DRM_FORMAT_MOD_INVALID
+};
+
 int skl_format_to_fourcc(int format, bool rgb_order, bool alpha)
 {
switch (format) {
@@ -795,6 +802,8 @@ static u32 skl_plane_ctl_tiling(u64 fb_modifier)
return PLANE_CTL_TILED_X;
case I915_FORMAT_MOD_Y_TILED:
return PLA
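As a cross-check of the tile geometry described in the comment above, a standalone sketch (not from the patch) working out the Tile4 dimensions:

#include <stdio.h>

int main(void)
{
	/* A 64B subtile is 4 OWords of 16B: 16 bytes wide, 4 rows high. */
	const int subtile_w_bytes = 16;
	const int subtile_h_rows = 4;

	/* A 4K Tile4 tile is an 8x8 grid of such subtiles. */
	const int tile_w_bytes = 8 * subtile_w_bytes;	/* 128 bytes, matching intel_tile_width_bytes() */
	const int tile_h_rows = 8 * subtile_h_rows;	/* 32 rows */

	printf("Tile4: %d bytes x %d rows = %d bytes\n",
	       tile_w_bytes, tile_h_rows, tile_w_bytes * tile_h_rows);	/* 4096 */
	return 0;
}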

[Intel-gfx] [PATCH v2 15/17] drm/i915/uapi: document behaviour for DG2 64K support

2021-10-21 Thread Ramalingam C
From: Matthew Auld 

On discrete platforms like DG2, we need to support a minimum page size
of 64K when dealing with device local-memory. This is quite tricky for
various reasons, so try to document the new implicit uapi for this.

v2: Applied formatting suggestions [Daniel]

Signed-off-by: Matthew Auld 
Signed-off-by: Ramalingam C 
cc: Simon Ser 
cc: Pekka Paalanen 
---
 include/uapi/drm/i915_drm.h | 67 ++---
 1 file changed, 62 insertions(+), 5 deletions(-)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 914ebd9290e5..89bcf5a77958 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 {
/**
 * When the EXEC_OBJECT_PINNED flag is specified this is populated by
 * the user with the GTT offset at which this object will be pinned.
+*
 * When the I915_EXEC_NO_RELOC flag is specified this must contain the
 * presumed_offset of the object.
+*
 * During execbuffer2 the kernel populates it with the value of the
 * current GTT offset of the object, for future presumed_offset writes.
+*
+* See struct drm_i915_gem_create_ext for the rules when dealing with
+* alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with
+* minimum page sizes, like DG2.
 */
__u64 offset;
 
@@ -3144,11 +3150,62 @@ struct drm_i915_gem_create_ext {
 *
 * The (page-aligned) allocated size for the object will be returned.
 *
-* Note that for some devices we have might have further minimum
-* page-size restrictions(larger than 4K), like for device local-memory.
-* However in general the final size here should always reflect any
-* rounding up, if for example using the 
I915_GEM_CREATE_EXT_MEMORY_REGIONS
-* extension to place the object in device local-memory.
+*
+* **DG2 64K min page size implications:**
+*
+* On discrete platforms, starting from DG2, we have to contend with GTT
+* page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE
+* objects.  Specifically the hardware only supports 64K or larger GTT
+* page sizes for such memory. The kernel will already ensure that all
+* I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page
+* sizes underneath.
+*
+* Note that the returned size here will always reflect any required
+* rounding up done by the kernel, i.e. 4K will now become 64K on devices
+* such as DG2.
+*
+* **Special DG2 GTT address alignment requirement:**
+*
+* The GTT alignment will also need to be at least 64K for such objects.
+*
+* Note that due to how the hardware implements 64K GTT page support, we
+* have some further complications:
+*
+*   1) The entire PDE (which covers a 2M virtual address range) must
+*   contain only 64K PTEs, i.e. mixing 4K and 64K PTEs in the same
+*   PDE is forbidden by the hardware.
+*
+*   2) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM
+*   objects.
+*
+* To handle the above the kernel implements a memory coloring scheme to
+* prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and
+* I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is
+* ever unable to evict the required pages for the given PDE (different
+* color) when inserting the object into the GTT then it will simply
+* fail the request.
+*
+* Since userspace needs to manage the GTT address space itself,
+* special care is needed to ensure this doesn't happen. The simplest
+* scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE
+* objects to 2M, which avoids any issues here. At the very least this
+* is likely needed for objects that can be placed in both
+* I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid
+* potential issues when the kernel needs to migrate the object behind
+* the scenes, since that might also involve evicting other objects.
+*
+* **To summarise the GTT rules, on platforms like DG2:**
+*
+*   1) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must
+*   have 64K alignment. The kernel will reject this otherwise.
+*
+*   2) All I915_MEMORY_CLASS_DEVICE objects must never be placed in
+*   the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The
+*   kernel will reject this otherwise.
+*
+*   3) Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and
+*   I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out
+*   to 2M.
 */
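A minimal userspace-side sketch of the alignment rules summarised above (the helper name and the policy of always padding dual-placement objects to 2M are illustrative assumptions, not part of the uAPI):

#include <stdbool.h>
#include <stdint.h>

#define SZ_4K	(4ull * 1024)
#define SZ_64K	(64ull * 1024)
#define SZ_2M	(2ull * 1024 * 1024)

/*
 * Hypothetical helper: pick a GTT alignment that satisfies the DG2 rules
 * documented above, based on the placements an object may end up in.
 */
static uint64_t dg2_safe_gtt_alignment(bool can_be_lmem, bool can_be_smem)
{
	if (can_be_lmem && can_be_smem)
		return SZ_2M;	/* keeps 64K and 4K PTEs out of the same PDE */
	if (can_be_lmem)
		return SZ_64K;	/* hardware minimum for I915_MEMORY_CLASS_DEVICE */
	return SZ_4K;		/* pure system memory objects can stay at 4K */
}

int main(void)
{
	/* An object that may live in either lmem or smem -> 2M alignment. */
	return dg2_safe_gtt_alignment(true, true) == SZ_2M ? 0 : 1;
}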

[Intel-gfx] [PATCH v2 14/17] uapi/drm/dg2: Format modifier for DG2 unified compression and clear color

2021-10-21 Thread Ramalingam C
From: Matt Roper 

DG2 unifies render compression and media compression into a single
format for the first time.  The programming and buffer layout is
supposed to match compression on older gen12 platforms, but the
actual compression algorithm is different from any previous platform; as
such, we need a new framebuffer modifier to represent buffers in this
format, but otherwise we can re-use the existing gen12 compression driver
logic.

DG2 clear color render compression uses Tile4 layout. Therefore, we need
to define a new format modifier for uAPI to support clear color rendering.

Signed-off-by: Matt Roper 
Signed-off-by: Mika Kahola  (v2)
Signed-off-by: Juha-Pekka Heikkilä 
Signed-off-by: Ramalingam C 
cc: Simon Ser 
Cc: Pekka Paalanen 
---
 drivers/gpu/drm/i915/display/intel_display.c  |  3 ++
 .../drm/i915/display/intel_display_types.h| 10 +++-
 drivers/gpu/drm/i915/display/intel_fb.c   |  7 +++
 .../drm/i915/display/skl_universal_plane.c| 49 +--
 include/uapi/drm/drm_fourcc.h | 30 
 5 files changed, 94 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 9b678839bf2b..2949fe9f5b9f 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -1013,6 +1013,9 @@ intel_get_format_info(const struct drm_mode_fb_cmd2 *cmd)
  cmd->pixel_format);
case I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS:
case I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_MC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC:
return lookup_format_info(gen12_ccs_formats,
  ARRAY_SIZE(gen12_ccs_formats),
  cmd->pixel_format);
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index 39e11eaec1a3..3aa47a9965d9 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -2047,14 +2047,20 @@ static inline bool is_ccs_modifier(u64 modifier)
   modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC ||
   modifier == I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS ||
   modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
-  modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
+  modifier == I915_FORMAT_MOD_Yf_TILED_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_RC_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_MC_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC;
 }
 
 static inline bool is_gen12_ccs_modifier(u64 modifier)
 {
return modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS ||
   modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC ||
-  modifier == I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS;
+  modifier == I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_RC_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_MC_CCS ||
+  modifier == I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC;
 }
 
 #endif /*  __INTEL_DISPLAY_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/display/intel_fb.c 
b/drivers/gpu/drm/i915/display/intel_fb.c
index e19739fef825..8216b03b8aae 100644
--- a/drivers/gpu/drm/i915/display/intel_fb.c
+++ b/drivers/gpu/drm/i915/display/intel_fb.c
@@ -127,6 +127,9 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, 
int color_plane)
return 128;
else
return 512;
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_MC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC:
case I915_FORMAT_MOD_4_TILED:
/*
 * Each 4K tile consists of 64B(8*8) subtiles, with
@@ -314,6 +317,10 @@ unsigned int intel_surf_alignment(const struct 
drm_framebuffer *fb,
case I915_FORMAT_MOD_4_TILED:
case I915_FORMAT_MOD_Yf_TILED:
return 1 * 1024 * 1024;
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS:
+   case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC:
+   case I915_FORMAT_MOD_F_TILED_DG2_MC_CCS:
+   return 16 * 1024;
default:
MISSING_CASE(fb->modifier);
return 0;
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 0eb4509f7f7a..0aaccaff46f7 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -207,7 +207,19 @@ static const u64 adlp_step_a_plane_format_modifiers[] = {
DRM_FORMAT_MOD_INVALID
 };
 
+static const u64 dg2_step_a_b_plane_format_modifiers[] = {
+  
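For orientation, a small standalone sketch of how the three new modifiers map to compression block sizes (the enum below is only a stand-in for illustration; the real modifier encodings live in include/uapi/drm/drm_fourcc.h):

#include <stdint.h>

/* Stand-in tokens for the three DG2 flat-CCS modifiers added by this patch. */
enum dg2_ccs_kind {
	DG2_RC_CCS,	/* I915_FORMAT_MOD_F_TILED_DG2_RC_CCS    */
	DG2_MC_CCS,	/* I915_FORMAT_MOD_F_TILED_DG2_MC_CCS    */
	DG2_RC_CCS_CC,	/* I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC */
};

/*
 * Render compression (with or without clear color) uses 128 byte compression
 * blocks; media compression uses 256 byte blocks.
 */
static inline uint32_t dg2_ccs_block_bytes(enum dg2_ccs_kind kind)
{
	return kind == DG2_MC_CCS ? 256 : 128;
}

int main(void)
{
	return dg2_ccs_block_bytes(DG2_RC_CCS) == 128 ? 0 : 1;
}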

[Intel-gfx] [PATCH v2 16/17] drm/i915/Flat-CCS: Document on Flat-CCS memory compression

2021-10-21 Thread Ramalingam C
Document the Flat-CCS feature and the kernel handling required, along with
the modifiers used.

Signed-off-by: Ramalingam C 
cc: Simon Ser 
cc: Pekka Paalanen 
---
 drivers/gpu/drm/i915/gt/intel_migrate.c | 47 +
 1 file changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c 
b/drivers/gpu/drm/i915/gt/intel_migrate.c
index 0bed01750884..ad5a28da1c6a 100644
--- a/drivers/gpu/drm/i915/gt/intel_migrate.c
+++ b/drivers/gpu/drm/i915/gt/intel_migrate.c
@@ -491,6 +491,53 @@ intel_context_migrate_copy(struct intel_context *ce,
return err;
 }
 
+/**
+ * DOC: Flat-CCS - Memory compression for Local memory
+ *
+ * On Xe-HP and later devices, we use dedicated compression control state (CCS)
+ * stored in local memory for each surface, to support the 3D and media
+ * compression formats.
+ *
+ * The memory required for the CCS of the entire local memory is 1/256 of the
+ * local memory size. So before kernel boot, the required memory is reserved
+ * for the CCS data and a secure register will be programmed with the CCS base
+ * address.
+ *
+ * Flat CCS data needs to be cleared when an lmem object is allocated.
+ * CCS data can be copied in and out of the CCS region through
+ * XY_CTRL_SURF_COPY_BLT. The CPU can't access the CCS data directly.
+ *
+ * When we exhaust the lmem, if the object's placements support smem, then
+ * we can directly decompress the compressed lmem object into smem and start
+ * using it from smem itself.
+ *
+ * But when we need to swap out the compressed lmem object into a smem region
+ * even though the object's placement doesn't support smem, then we copy the
+ * lmem content as it is into the smem region along with the CCS data (using
+ * XY_CTRL_SURF_COPY_BLT). When the object is referenced again, the lmem
+ * content will be swapped back in along with restoration of the CCS data
+ * (using XY_CTRL_SURF_COPY_BLT) at the corresponding location.
+ *
+ *
+ * Flat-CCS Modifiers for different compression formats
+ * 
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS - used to indicate buffers of Flat CCS
+ * render compression formats. Though the general layout is the same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS, a new hashing/compression algorithm
+ * is used. Render compression uses 128 byte compression blocks.
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_MC_CCS - used to indicate buffers of Flat CCS
+ * media compression formats. Though the general layout is the same as
+ * I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS, a new hashing/compression algorithm
+ * is used. Media compression uses 256 byte compression blocks.
+ *
+ * I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC - used to indicate buffers of Flat
+ * CCS clear color render compression formats. Unified compression format for
+ * clear color render compression. The general layout is a tiled layout using
+ * 4K tiles, i.e. the Tile4 layout.
+ */
+
 static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags)
 {
/* Mask the 3 LSB to use the PPGTT address space */
-- 
2.20.1
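A tiny decision sketch of the swap-out behaviour described in the DOC comment above (standalone illustration; the enum and helper are not the driver's code):

#include <stdbool.h>

enum lmem_swap_action {
	DECOMPRESS_TO_SMEM,	/* object may also be placed in smem */
	COPY_LMEM_AND_CCS,	/* lmem-only: copy content + CCS via XY_CTRL_SURF_COPY_BLT */
};

static enum lmem_swap_action pick_swap_action(bool placement_allows_smem)
{
	/*
	 * If the object may also be placed in smem, it can simply be
	 * decompressed into smem and used from there; otherwise the raw
	 * (still compressed) lmem content is copied out together with its
	 * CCS data so it can be restored later.
	 */
	return placement_allows_smem ? DECOMPRESS_TO_SMEM : COPY_LMEM_AND_CCS;
}

int main(void)
{
	/* An lmem-only object must be swapped out together with its CCS data. */
	return pick_swap_action(false) == COPY_LMEM_AND_CCS ? 0 : 1;
}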



[Intel-gfx] [PATCH v2 17/17] Doc/gpu/rfc/i915: i915 DG2 uAPI

2021-10-21 Thread Ramalingam C
Details of the new features being added as part of DG2 enabling and their
implicit impact on the uAPI.

v2: improved the Flat-CCS documentation [Danvet & CQ]

Signed-off-by: Ramalingam C 
cc: Daniel Vetter 
cc: Matthew Auld 
cc: Simon Ser 
cc: Pekka Paalanen 
---
 Documentation/gpu/rfc/i915_dg2.rst | 32 ++
 Documentation/gpu/rfc/index.rst|  3 +++
 2 files changed, 35 insertions(+)
 create mode 100644 Documentation/gpu/rfc/i915_dg2.rst

diff --git a/Documentation/gpu/rfc/i915_dg2.rst 
b/Documentation/gpu/rfc/i915_dg2.rst
new file mode 100644
index ..9d28b1812bc7
--- /dev/null
+++ b/Documentation/gpu/rfc/i915_dg2.rst
@@ -0,0 +1,32 @@
+
+I915 DG2 RFC Section
+
+
+Upstream plan
+=
+The plan to upstream the DG2 enabling is:
+
+* Merge basic HW enabling for DG2 (still without the pciid)
+* Merge the 64k support for lmem
+* Merge the flat CCS enabling patches
+* Add the pciid for DG2 and enable the DG2 in CI
+
+
+64K page support for lmem
+=
+On DG2 hw, local memory supports a minimum GTT page size of 64K only; 4K is
+no longer supported.
+
+DG2 hw doesn't support 64K (lmem) and 4K (smem) pages in the same ppgtt
+page table. Refer to struct drm_i915_gem_create_ext for the implications of
+handling the 64K page size.
+
+.. kernel-doc:: include/uapi/drm/i915_drm.h
+:functions: drm_i915_gem_create_ext
+
+
+Flat CCS support for lmem
+=
+
+.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_migrate.c
+:doc: Flat-CCS - Memory compression for Local memory
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index 91e93a705230..afb320ed4028 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -20,6 +20,9 @@ host such documentation:
 
 i915_gem_lmem.rst
 
+.. toctree::
+i915_dg2.rst
+
 .. toctree::
 
 i915_scheduler.rst
-- 
2.20.1



Re: [Intel-gfx] [PATCH v2 13/17] drm/i915/dg2: Tile 4 plane format support

2021-10-21 Thread Lisovskiy, Stanislav
On Thu, Oct 21, 2021 at 07:56:23PM +0530, Ramalingam C wrote:
> From: Stanislav Lisovskiy 
> 
> TileF (Tile4 in the bspec) is a 4K tile organized into
> 64B subtiles with the same basic shape as legacy TileY,
> and will be supported by Display13.
> 
> v2: - Fixed wrong case condition(Jani Nikula)
> - Increased I915_FORMAT_MOD_F_TILED up to 12(Imre Deak)
> 
> v3: - s/I915_TILING_F/TILING_4/g
> - s/I915_FORMAT_MOD_F_TILED/I915_FORMAT_MOD_4_TILED/g
> - Removed unneeded fencing code
> 
> Cc: Imre Deak 
> Cc: Matt Roper 
> Cc: Maarten Lankhorst 
> Signed-off-by: Stanislav Lisovskiy 
> Signed-off-by: Matt Roper 
> Signed-off-by: Juha-Pekka Heikkilä 
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  1 +
>  drivers/gpu/drm/i915/display/intel_fb.c   |  7 
>  drivers/gpu/drm/i915/display/intel_fbc.c  |  1 +
>  .../drm/i915/display/intel_plane_initial.c|  1 +
>  .../drm/i915/display/skl_universal_plane.c| 36 ++-
>  drivers/gpu/drm/i915/i915_drv.h   |  1 +
>  drivers/gpu/drm/i915/i915_pci.c   |  1 +
>  drivers/gpu/drm/i915/i915_reg.h   |  1 +
>  drivers/gpu/drm/i915/intel_device_info.h  |  1 +
>  drivers/gpu/drm/i915/intel_pm.c   |  1 +
>  include/uapi/drm/drm_fourcc.h |  8 +
>  11 files changed, 50 insertions(+), 9 deletions(-)

Was I supposed to change TILE_F/TILE_4 everywhere first,
as per your comment?

Stan

> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index ce5d6633029a..9b678839bf2b 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -8877,6 +8877,7 @@ static int intel_atomic_check_async(struct 
> intel_atomic_state *state)
>   case I915_FORMAT_MOD_X_TILED:
>   case I915_FORMAT_MOD_Y_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   break;
>   default:
>   drm_dbg_kms(&i915->drm,
> diff --git a/drivers/gpu/drm/i915/display/intel_fb.c 
> b/drivers/gpu/drm/i915/display/intel_fb.c
> index fa1f375e696b..e19739fef825 100644
> --- a/drivers/gpu/drm/i915/display/intel_fb.c
> +++ b/drivers/gpu/drm/i915/display/intel_fb.c
> @@ -127,6 +127,12 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, 
> int color_plane)
>   return 128;
>   else
>   return 512;
> + case I915_FORMAT_MOD_4_TILED:
> + /*
> +  * Each 4K tile consists of 64B(8*8) subtiles, with
> +  * same shape as Y Tile(i.e 4*16B OWords)
> +  */
> + return 128;
>   case I915_FORMAT_MOD_Y_TILED_CCS:
>   if (is_ccs_plane(fb, color_plane))
>   return 128;
> @@ -305,6 +311,7 @@ unsigned int intel_surf_alignment(const struct 
> drm_framebuffer *fb,
>   case I915_FORMAT_MOD_Y_TILED_CCS:
>   case I915_FORMAT_MOD_Yf_TILED_CCS:
>   case I915_FORMAT_MOD_Y_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
>   return 1 * 1024 * 1024;
>   default:
> diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c 
> b/drivers/gpu/drm/i915/display/intel_fbc.c
> index 1f66de77a6b1..f079a771f802 100644
> --- a/drivers/gpu/drm/i915/display/intel_fbc.c
> +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
> @@ -747,6 +747,7 @@ static bool tiling_is_valid(struct drm_i915_private 
> *dev_priv,
>   case DRM_FORMAT_MOD_LINEAR:
>   case I915_FORMAT_MOD_Y_TILED:
>   case I915_FORMAT_MOD_Yf_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   return DISPLAY_VER(dev_priv) >= 9;
>   case I915_FORMAT_MOD_X_TILED:
>   return true;
> diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c 
> b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> index dcd698a02da2..d80855ee9b96 100644
> --- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
> +++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> @@ -125,6 +125,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
>   case DRM_FORMAT_MOD_LINEAR:
>   case I915_FORMAT_MOD_X_TILED:
>   case I915_FORMAT_MOD_Y_TILED:
> + case I915_FORMAT_MOD_4_TILED:
>   break;
>   default:
>   drm_dbg(&dev_priv->drm,
> diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 
> b/drivers/gpu/drm/i915/display/skl_universal_plane.c
> index 7444b88829ea..0eb4509f7f7a 100644
> --- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
> +++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
> @@ -207,6 +207,13 @@ static const u64 adlp_step_a_plane_format_modifiers[] = {
>   DRM_FORMAT_MOD_INVALID
>  };
>  
> +static const u64 dg2_plane_format_modifiers[] = {
> + I915_FORMAT_MOD_X_TILED,
> + I915_FORMAT_MOD_4_TILED,
> + DRM_FORMAT_MOD_LINEAR,
> + DRM_FORMA

Re: [Intel-gfx] [PATCH v2 14/17] uapi/drm/dg2: Format modifier for DG2 unified compression and clear color

2021-10-21 Thread Simon Ser
For the include/uapi/drm/drm_fourcc.h changes:

Acked-by: Simon Ser 


Re: [Intel-gfx] [PATCH v2 14/17] uapi/drm/dg2: Format modifier for DG2 unified compression and clear color

2021-10-21 Thread Ville Syrjälä
On Thu, Oct 21, 2021 at 07:56:24PM +0530, Ramalingam C wrote:
> From: Matt Roper 
> 
> DG2 unifies render compression and media compression into a single
> format for the first time.  The programming and buffer layout is
> supposed to match compression on older gen12 platforms, but the
> actual compression algorithm is different from any previous platform; as
> such, we need a new framebuffer modifier to represent buffers in this
> format, but otherwise we can re-use the existing gen12 compression driver
> logic.
> 
> DG2 clear color render compression uses Tile4 layout. Therefore, we need
> to define a new format modifier for uAPI to support clear color rendering.
> 
> Signed-off-by: Matt Roper 
> Signed-off-by: Mika Kahola  (v2)
> Signed-off-by: Juha-Pekka Heikkilä 
> Signed-off-by: Ramalingam C 
> cc: Simon Ser 
> Cc: Pekka Paalanen 
> ---
>  drivers/gpu/drm/i915/display/intel_display.c  |  3 ++
>  .../drm/i915/display/intel_display_types.h| 10 +++-
>  drivers/gpu/drm/i915/display/intel_fb.c   |  7 +++
>  .../drm/i915/display/skl_universal_plane.c| 49 +--
>  include/uapi/drm/drm_fourcc.h | 30 
>  5 files changed, 94 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index 9b678839bf2b..2949fe9f5b9f 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -1013,6 +1013,9 @@ intel_get_format_info(const struct drm_mode_fb_cmd2 
> *cmd)
> cmd->pixel_format);
>   case I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS:
>   case I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS:
> + case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS:
> + case I915_FORMAT_MOD_F_TILED_DG2_MC_CCS:
> + case I915_FORMAT_MOD_F_TILED_DG2_RC_CCS_CC:
>   return lookup_format_info(gen12_ccs_formats,
> ARRAY_SIZE(gen12_ccs_formats),
> cmd->pixel_format);

That doesn't seem right. Flat CCS is invisible to the user so the format
info should not include a CCS plane.

-- 
Ville Syrjälä
Intel

