[Intel-gfx] [PULL] gvt-next-fixes

2020-08-05 Thread Zhenyu Wang

Hi,

Here's a pull with only gvt fixes for 5.9-rc1. Two fixes are included
to make guest suspend/resume work gracefully.

Thanks
-- 
The following changes since commit e57bd05ec0d2d82d63725dedf9f5a063f879de25:

  drm/i915: Update DRIVER_DATE to 20200715 (2020-07-15 14:18:02 +0300)

are available in the Git repository at:

  https://github.com/intel/gvt-linux tags/gvt-next-fixes-2020-08-05

for you to fetch changes up to 9e7c0efadb86ddb58965561bbca638d44792d78f:

  drm/i915/gvt: Do not reset pv_notified when vGPU transit from D3->D0 
(2020-07-29 14:18:32 +0800)


gvt-next-fixes-2020-08-05

- Fix guest suspend/resume low performance handling of shadow ppgtt (Colin)
- Fix PV notifier handling for guest suspend/resume (Colin)


Colin Xu (2):
  drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.
  drm/i915/gvt: Do not reset pv_notified when vGPU transit from D3->D0

 drivers/gpu/drm/i915/gvt/cfg_space.c | 24 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/gvt/gtt.c       |  2 +-
 drivers/gpu/drm/i915/gvt/gtt.h       |  2 ++
 drivers/gpu/drm/i915/gvt/gvt.h       |  3 +++
 drivers/gpu/drm/i915/gvt/vgpu.c      | 20 +++++++++++++++++---
 5 files changed, 47 insertions(+), 4 deletions(-)





[Intel-gfx] [PATCH] drm/i915/gt: Prevent immediate reuse of the last context tag

2020-08-05 Thread Chris Wilson
While we only release the context tag after we have processed the
context-switch event away from the context, be paranoid in case that
value remains live in HW and so avoid reusing the last tag for the next
context after a brief idle.
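
For illustration, a stand-alone analogue of the cyclic-allocation trick
used by next_cyclic_tag() below (an assumed simplification: a bare
bitmask with no engine locking; the real code operates on
engine->context_tag under the engine's submission rules):

	/* Stand-alone analogue of next_cyclic_tag(); assumes at least one
	 * free bit in *tags (the real code asserts this via GEM_BUG_ON).
	 */
	static unsigned int next_tag(unsigned long *tags, unsigned int last)
	{
		unsigned long mask = ~0ul << last;		/* skip the last tag */
		unsigned long avail = (*tags & mask) ?: *tags;	/* else wrap to bit 0 */
		unsigned int bit = __builtin_ctzl(avail);	/* lowest free bit */

		*tags &= ~(1ul << bit);	/* mark the tag as in use */
		return bit + 1;		/* 1-based; caller stores this as 'last' */
	}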

Signed-off-by: Chris Wilson 
Cc: Ramalingam C 
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 20 
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index c400aaa2287b..bfa0199b7a2c 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -330,6 +330,7 @@ struct intel_engine_cs {
atomic_t fw_active;
 
unsigned long context_tag;
+   unsigned long context_last;
 
struct rb_node uabi_node;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 417f6b0c6c61..f8a0ee67d930 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1335,6 +1335,21 @@ static void intel_context_update_runtime(struct 
intel_context *ce)
ce->runtime.total += dt;
 }
 
+static unsigned int next_cyclic_tag(struct intel_engine_cs *engine)
+{
+   unsigned long tag, mask = ~0ul << engine->context_last;
+
+   /* Cyclically allocate unused ids, prevent immediate reuse of last */
+   tag = READ_ONCE(engine->context_tag);
+   tag = (tag & mask) ?: tag;
+   GEM_BUG_ON(tag == 0);
+
+   tag = __ffs(tag);
+   clear_bit(tag, &engine->context_tag);
+
+   return engine->context_last = tag + 1;
+}
+
 static inline struct intel_engine_cs *
 __execlists_schedule_in(struct i915_request *rq)
 {
@@ -1355,12 +1370,9 @@ __execlists_schedule_in(struct i915_request *rq)
ce->lrc.ccid = ce->tag;
} else {
/* We don't need a strict matching tag, just different values */
-   unsigned int tag = ffs(READ_ONCE(engine->context_tag));
+   unsigned int tag = next_cyclic_tag(engine);
 
-   GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
-   clear_bit(tag - 1, &engine->context_tag);
ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32);
-
BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
}
 
-- 
2.20.1



Re: [Intel-gfx] [PATCH] drm/i915/gt: Prevent immediate reuse of the last context tag

2020-08-05 Thread Chris Wilson
Quoting Chris Wilson (2020-08-05 09:37:51)
> While we only release the context tag after we have processed the
> context-switch event away from the context, be paranoid in case that
> value remains live in HW and so avoid reusing the last tag for the next
> context after a brief idle.
> 

Fixes: 5c4a53e3b1cb ("drm/i915/execlists: Track inflight CCID")
> Signed-off-by: Chris Wilson 
> Cc: Ramalingam C 
Cc:  # v5.5+
-Chris


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/gt: Prevent immediate reuse of the last context tag

2020-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Prevent immediate reuse of the last context tag
URL   : https://patchwork.freedesktop.org/series/80277/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.




[Intel-gfx] [PULL] drm-misc-next-fixes

2020-08-05 Thread Maarten Lankhorst
drm-misc-next-fixes-2020-08-05:
drm-misc-next-fixes for v5.9-rc1:
- Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
- Fix a fbcon OOB read in fbdev, found by syzbot.
- Mark vga_tryget static as it's not used elsewhere.
- Small fixes to xlnx.
- Remove null check for kfree in drm_dev_release.
- Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
- Fix mode initialization in omap_connector_mode_valid().
The following changes since commit 206739119508d5ab4b42ab480ff61a7e6cd72d7c:

  Merge tag 'amd-drm-next-5.9-2020-07-17' of 
git://people.freedesktop.org/~agd5f/linux into drm-next (2020-07-23 15:38:11 
+1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-fixes-2020-08-05

for you to fetch changes up to a34a0a632dd991a371fec56431d73279f9c54029:

  drm: fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi 
(2020-08-04 12:21:11 -0400)


drm-misc-next-fixes for v5.9-rc1:
- Fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi
- Fix a fbcon OOB read in fbdev, found by syzbot.
- Mark vga_tryget static as it's not used elsewhere.
- Small fixes to xlnx.
- Remove null check for kfree in drm_dev_release.
- Fix DRM_FORMAT_MOD_AMLOGIC_FBC definition.
- Fix mode initialization in omap_connector_mode_valid().


Christoph Hellwig (1):
  vgaarb: mark vga_tryget static

Colin Ian King (1):
  drm: xlnx: fix spelling mistake "failes" -> "failed"

Hyun Kwon (1):
  drm: xlnx: zynqmp: Use switch - case for link rate downshift

Li Heng (1):
  drm: Remove redundant NULL check

Neil Armstrong (1):
  drm/fourcc: fix Amlogic Video Framebuffer Compression macro

Tetsuo Handa (1):
  fbmem: pull fbcon_update_vcs() out of fb_set_var()

Ville Syrjälä (1):
  drm/omap: Use {} to zero initialize the mode

Wei Yongjun (1):
  drm: xlnx: Fix typo in parameter description

Xin Xiong (1):
  drm: fix drm_dp_mst_port refcount leaks in drm_dp_mst_allocate_vcpi

 drivers/gpu/drm/drm_dp_mst_topology.c|  7 ---
 drivers/gpu/drm/drm_drv.c|  3 +--
 drivers/gpu/drm/omapdrm/omap_connector.c |  2 +-
 drivers/gpu/drm/xlnx/zynqmp_dp.c | 33 +---
 drivers/gpu/vga/vgaarb.c |  3 +--
 drivers/video/fbdev/core/fbmem.c |  8 ++--
 drivers/video/fbdev/core/fbsysfs.c   |  4 ++--
 drivers/video/fbdev/ps3fb.c  |  5 +++--
 include/linux/fb.h   |  2 --
 include/linux/vgaarb.h   |  6 --
 include/uapi/drm/drm_fourcc.h|  2 +-
 11 files changed, 33 insertions(+), 42 deletions(-)


[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Prevent immediate reuse of the last context tag

2020-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Prevent immediate reuse of the last context tag
URL   : https://patchwork.freedesktop.org/series/80277/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8844 -> Patchwork_18308


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18308 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18308, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_18308:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_rpm@module-reload:
- fi-hsw-4770:[PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-hsw-4770/igt@i915_pm_...@module-reload.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-hsw-4770/igt@i915_pm_...@module-reload.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_pm_rpm@module-reload:
- {fi-kbl-7560u}: [PASS][3] -> [DMESG-WARN][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-kbl-7560u/igt@i915_pm_...@module-reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-kbl-7560u/igt@i915_pm_...@module-reload.html

  * igt@runner@aborted:
- {fi-tgl-dsi}:   NOTRUN -> [FAIL][5]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-tgl-dsi/igt@run...@aborted.html

  
Known issues


  Here are the changes found in Patchwork_18308 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@kms_busy@basic@flip:
- fi-kbl-x1275:   [PASS][6] -> [DMESG-WARN][7] ([i915#62] / [i915#92] / 
[i915#95])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-kbl-x1275/igt@kms_busy@ba...@flip.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-kbl-x1275/igt@kms_busy@ba...@flip.html

  
 Possible fixes 

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-bsw-kefka:   [DMESG-WARN][8] ([i915#1982]) -> [PASS][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html

  * igt@i915_selftest@live@execlists:
- fi-kbl-guc: [INCOMPLETE][10] ([i915#794]) -> [PASS][11]
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-kbl-guc/igt@i915_selftest@l...@execlists.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-kbl-guc/igt@i915_selftest@l...@execlists.html

  * igt@i915_selftest@live@gem_contexts:
- fi-tgl-u2:  [INCOMPLETE][12] ([i915#2045]) -> [PASS][13]
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-tgl-u2/igt@i915_selftest@live@gem_contexts.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-edp1:
- fi-icl-u2:  [DMESG-WARN][14] ([i915#1982]) -> [PASS][15] +1 
similar issue
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@c-edp1.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@c-hdmi-a2:
- fi-skl-guc: [DMESG-WARN][16] ([i915#2203]) -> [PASS][17]
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-skl-guc/igt@kms_flip@basic-flip-vs-wf_vbl...@c-hdmi-a2.html

  
 Warnings 

  * igt@i915_pm_rpm@module-reload:
- fi-kbl-guc: [DMESG-FAIL][18] ([i915#2203]) -> [DMESG-WARN][19] 
([i915#2203])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-kbl-guc/igt@i915_pm_...@module-reload.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-kbl-guc/igt@i915_pm_...@module-reload.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- fi-kbl-x1275:   [DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][21] ([i915#62] / [i915#92]) +6 similar issues
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8844/fi-kbl-x1275/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18308/fi-kbl-x1275/igt@kms_cursor_leg...@basic-busy-flip-befo

[Intel-gfx] [PATCH v3 0/2] HDCP minor refactoring

2020-08-05 Thread Anshuman Gupta
No functional change.

Anshuman Gupta (2):
  drm/i915/hdcp: Add update_pipe early return
  drm/i915/hdcp: No direct access to power_well desc

 drivers/gpu/drm/i915/display/intel_hdcp.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

-- 
2.26.2



[Intel-gfx] [PATCH v3 2/2] drm/i915/hdcp: No direct access to power_well desc

2020-08-05 Thread Anshuman Gupta
The HDCP code doesn't need to access power_well internals; instead it
should use intel_display_power_well_is_enabled() to get the status of
the desired power_well.
No functional change.

v2:
- used with_intel_runtime_pm instead of get/put. [Jani]
v3:
- rebased.

Cc: Jani Nikula 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_hdcp.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
b/drivers/gpu/drm/i915/display/intel_hdcp.c
index a1e0d518e529..e76b049618db 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -148,9 +148,8 @@ static int intel_hdcp_poll_ksv_fifo(struct 
intel_digital_port *dig_port,
 
 static bool hdcp_key_loadable(struct drm_i915_private *dev_priv)
 {
-   struct i915_power_domains *power_domains = &dev_priv->power_domains;
-   struct i915_power_well *power_well;
enum i915_power_well_id id;
+   intel_wakeref_t wakeref;
bool enabled = false;
 
/*
@@ -162,17 +161,9 @@ static bool hdcp_key_loadable(struct drm_i915_private 
*dev_priv)
else
id = SKL_DISP_PW_1;
 
-   mutex_lock(&power_domains->lock);
-
/* PG1 (power well #1) needs to be enabled */
-   for_each_power_well(dev_priv, power_well) {
-   if (power_well->desc->id == id) {
-   enabled = power_well->desc->ops->is_enabled(dev_priv,
-   power_well);
-   break;
-   }
-   }
-   mutex_unlock(&power_domains->lock);
+   with_intel_runtime_pm(&dev_priv->runtime_pm, wakeref)
+   enabled = intel_display_power_well_is_enabled(dev_priv, id);
 
/*
 * Another req for hdcp key loadability is enabled state of pll for
-- 
2.26.2



[Intel-gfx] [PATCH v3 1/2] drm/i915/hdcp: Add update_pipe early return

2020-08-05 Thread Anshuman Gupta
Currently intel_hdcp_update_pipe() is also called for non-HDCP
connectors and goes through its conditional code flow, which is
completely unnecessary for non-HDCP connectors; therefore it makes
sense to have an early return. No functional change.

v2:
- rebased.

Reviewed-by: Uma Shankar 
Signed-off-by: Anshuman Gupta 
---
 drivers/gpu/drm/i915/display/intel_hdcp.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
b/drivers/gpu/drm/i915/display/intel_hdcp.c
index 89a4d294822d..a1e0d518e529 100644
--- a/drivers/gpu/drm/i915/display/intel_hdcp.c
+++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
@@ -2082,11 +2082,15 @@ void intel_hdcp_update_pipe(struct intel_atomic_state 
*state,
struct intel_connector *connector =
to_intel_connector(conn_state->connector);
struct intel_hdcp *hdcp = &connector->hdcp;
-   bool content_protection_type_changed =
+   bool content_protection_type_changed, desired_and_not_enabled = false;
+
+   if (!connector->hdcp.shim)
+   return;
+
+   content_protection_type_changed =
(conn_state->hdcp_content_type != hdcp->content_type &&
 conn_state->content_protection !=
 DRM_MODE_CONTENT_PROTECTION_UNDESIRED);
-   bool desired_and_not_enabled = false;
 
/*
 * During the HDCP encryption session if Type change is requested,
-- 
2.26.2



Re: [Intel-gfx] [PATCH 28/66] drm/i915/gem: Replace i915_gem_object.mm.mutex with reservation_ww_class

2020-08-05 Thread Chris Wilson
Quoting Thomas Hellström (Intel) (2020-07-29 14:44:41)
> 
> On 7/29/20 2:17 PM, Tvrtko Ursulin wrote:
> >
> > On 28/07/2020 12:17, Thomas Hellström (Intel) wrote:
> >> On 7/16/20 5:53 PM, Tvrtko Ursulin wrote:
> >>> On 15/07/2020 16:43, Maarten Lankhorst wrote:
>  On 15-07-2020 at 13:51, Chris Wilson wrote:
> > Our goal is to pull all memory reservations (next iteration
> > obj->ops->get_pages()) under a ww_mutex, and to align those 
> > reservations
> > with other drivers, i.e. control all such allocations with the
> > reservation_ww_class. Currently, this is under the purview of the
> > obj->mm.mutex, and while obj->mm remains an embedded struct we can
> > "simply" switch to using the reservation_ww_class 
> > obj->base.resv->lock
> >
> > The major consequence is the impact on the shrinker paths as the
> > reservation_ww_class is used to wrap allocations, and a ww_mutex does
> > not support subclassing so we cannot do our usual trick of knowing 
> > that
> > we never recurse inside the shrinker and instead have to finish the
> > reclaim with a trylock. This may result in us failing to release the
> > pages after having released the vma. This will have to do until a 
> > better
> > idea comes along.
> >
> > However, this step only converts the mutex over and continues to 
> > treat
> > everything as a single allocation and pinning the pages. With the
> > ww_mutex in place we can remove the temporary pinning, as we can then
> > reserve all storage en masse.
> >
> > One last thing to do: kill the implict page pinning for active vma.
> > This will require us to invalidate the vma->pages when the backing 
> > store
> > is removed (and we expect that while the vma is active, we mark the
> > backing store as active so that it cannot be removed while the HW is
> > busy.)
> >
> > Signed-off-by: Chris Wilson 
> >>>
> >>> [snip]
> >>>
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > index dc8f052a0ffe..4e928103a38f 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
> > @@ -47,10 +47,7 @@ static bool unsafe_drop_pages(struct 
> > drm_i915_gem_object *obj,
> >   if (!(shrink & I915_SHRINK_BOUND))
> >   flags = I915_GEM_OBJECT_UNBIND_TEST;
> >   -    if (i915_gem_object_unbind(obj, flags) == 0)
> > -    __i915_gem_object_put_pages(obj);
> > -
> > -    return !i915_gem_object_has_pages(obj);
> > +    return i915_gem_object_unbind(obj, flags) == 0;
> >   }
> >     static void try_to_writeback(struct drm_i915_gem_object *obj,
> > @@ -199,14 +196,14 @@ i915_gem_shrink(struct drm_i915_private *i915,
> > spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> >   -    if (unsafe_drop_pages(obj, shrink)) {
> > -    /* May arrive from get_pages on another bo */
> > -    mutex_lock(&obj->mm.lock);
> > +    if (unsafe_drop_pages(obj, shrink) &&
> > +    i915_gem_object_trylock(obj)) {
> >>>
>  Why trylock? Because of the nesting? In that case, still use ww ctx 
>  if provided please
> >>>
> >>> By "if provided" you mean for code paths where we are calling the 
> >>> shrinker ourselves, as opposed to reclaim, like shmem_get_pages?
> >>>
> >>> That indeed sounds like the right thing to do, since all the 
> >>> get_pages from execbuf are in the reservation phase, collecting a 
> >>> list of GEM objects to lock, the ones to shrink sound like should be 
> >>> on that list.
> >>>
> > + __i915_gem_object_put_pages(obj);
> >   if (!i915_gem_object_has_pages(obj)) {
> >   try_to_writeback(obj, shrink);
> >   count += obj->base.size >> PAGE_SHIFT;
> >   }
> > -    mutex_unlock(&obj->mm.lock);
> > +    i915_gem_object_unlock(obj);
> >   }
> >     scanned += obj->base.size >> PAGE_SHIFT;
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
> > index ff72ee2fd9cd..ac12e1c20e66 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
> > @@ -265,7 +265,6 @@ i915_gem_object_set_tiling(struct 
> > drm_i915_gem_object *obj,
> >    * pages to prevent them being swapped out and causing 
> > corruption
> >    * due to the change in swizzling.
> >    */
> > -    mutex_lock(&obj->mm.lock);
> >   if (i915_gem_object_has_pages(obj) &&
> >   obj->mm.madv == I915_MADV_WILLNEED &&
> >   i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
> > @@ -280,7 +279,6 @@ i915_gem_object

[Intel-gfx] [PATCH 24/37] drm/i915/gem: Reintroduce multiple passes for reloc processing

2020-08-05 Thread Chris Wilson
The prospect of locking the entire submission sequence under a wide
ww_mutex re-imposes some key restrictions, in particular that we must
not call copy_(from|to)_user underneath the mutex (as the faulthandlers
themselves may need to take the ww_mutex). To satisfy this requirement,
we need to split the relocation handling into multiple phases again.
After dropping the reservations, we need to allocate enough buffer space
to both copy the relocations from userspace into, and serve as the
relocation command buffer. Once we have finished copying the
relocations, we can then re-acquire all the objects for the execbuf and
rebind them, including our new relocations objects. After we have bound
all the new and old objects into their final locations, we can then
convert the relocation entries into the GPU commands to update the
relocated vma. Finally, once it is all over and we have dropped the
ww_mutex for the last time, we can then complete the update of the user
relocation entries.
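
In pseudo-C, the resulting phase ordering looks like this (all names
here are illustrative stand-ins, not the actual i915 entry points):

	struct eb; /* stand-in for struct i915_execbuffer */

	static int copy_relocs_from_user(struct eb *eb) { return 0; } /* may fault */
	static int acquire_and_bind_all(struct eb *eb) { return 0; }  /* ww_mutex */
	static int emit_reloc_commands(struct eb *eb) { return 0; }   /* GPU patching */
	static void release_all(struct eb *eb) { }                    /* drop locks */
	static int write_relocs_to_user(struct eb *eb) { return 0; }  /* may fault */

	static int execbuf_relocs(struct eb *eb)
	{
		int err;

		err = copy_relocs_from_user(eb);	/* 1: no ww_mutex held */
		if (err)
			return err;

		err = acquire_and_bind_all(eb);		/* 2: lock everything en masse */
		if (err)
			return err;

		err = emit_reloc_commands(eb);		/* 3: still under the ww_mutex */

		release_all(eb);			/* 4: drop the ww_mutex... */
		return err ?: write_relocs_to_user(eb);	/* ...before touching userspace */
	}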

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 883 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  | 206 ++--
 .../drm/i915/gt/intel_gt_buffer_pool_types.h  |   2 +-
 3 files changed, 585 insertions(+), 506 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 0839397c7e50..58e40348b551 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -59,6 +59,20 @@ struct eb_vma_array {
struct eb_vma vma[];
 };
 
+struct eb_relocs_link {
+   unsigned long rsvd; /* overwritten by MI_BATCH_BUFFER_END */
+   struct i915_vma *vma;
+};
+
+struct eb_relocs {
+   struct i915_vma *head;
+   struct drm_i915_gem_relocation_entry *map;
+   unsigned int pos;
+   unsigned int max;
+
+   unsigned int bufsz;
+};
+
 #define __EXEC_OBJECT_HAS_PIN  BIT(31)
#define __EXEC_OBJECT_HAS_FENCE BIT(30)
#define __EXEC_OBJECT_NEEDS_MAP BIT(29)
@@ -250,6 +264,7 @@ struct i915_execbuffer {
 
struct intel_engine_cs *engine; /** engine to queue the request to */
struct intel_context *context; /* logical state for the request */
+   struct intel_context *reloc_context; /* distinct context for relocs */
struct i915_gem_context *gem_context; /** caller's context */
 
struct i915_request *request; /** our request to build */
@@ -261,27 +276,11 @@ struct i915_execbuffer {
/** list of all vma required to be bound for this execbuf */
struct list_head bind_list;
 
-   /** list of vma that have execobj.relocation_count */
-   struct list_head relocs_list;
-
struct list_head submit_list;
 
-   /**
-* Track the most recently used object for relocations, as we
-* frequently have to perform multiple relocations within the same
-* obj/page
-*/
-   struct reloc_cache {
-   struct drm_mm_node node; /** temporary GTT binding */
-
-   struct intel_context *ce;
-
-   struct i915_vma *target;
-   struct i915_request *rq;
-   struct i915_vma *rq_vma;
-   u32 *rq_cmd;
-   unsigned int rq_size;
-   } reloc_cache;
+   /** list of vma that have execobj.relocation_count */
+   struct list_head relocs_list;
+   unsigned long relocs_count;
 
struct eb_cmdparser {
struct eb_vma *shadow;
@@ -297,7 +296,6 @@ struct i915_execbuffer {
 
unsigned int gen; /** Cached value of INTEL_GEN */
bool use_64bit_reloc : 1;
-   bool has_llc : 1;
bool has_fence : 1;
bool needs_unfenced : 1;
 
@@ -485,6 +483,7 @@ static int eb_create(struct i915_execbuffer *eb)
INIT_LIST_HEAD(&eb->bind_list);
INIT_LIST_HEAD(&eb->submit_list);
INIT_LIST_HEAD(&eb->relocs_list);
+   eb->relocs_count = 0;
 
return 0;
 }
@@ -631,8 +630,10 @@ eb_add_vma(struct i915_execbuffer *eb,
list_add_tail(&ev->bind_link, &eb->bind_list);
list_add_tail(&ev->submit_link, &eb->submit_list);
 
-   if (entry->relocation_count)
+   if (entry->relocation_count) {
list_add_tail(&ev->reloc_link, &eb->relocs_list);
+   eb->relocs_count += entry->relocation_count;
+   }
 
/*
 * SNA is doing fancy tricks with compressing batch buffers, which leads
@@ -1889,8 +1890,6 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned 
long handle)
 
 static void eb_destroy(const struct i915_execbuffer *eb)
 {
-   GEM_BUG_ON(eb->reloc_cache.rq);
-
eb_vma_array_put(eb->array);
if (eb->lut_size > 0)
kfree(eb->buckets);
@@ -1908,90 +1907,11 @@ static void eb_info_init(struct i915_execbuffer *eb,
 {
/* Must be a variable in the struct to allow GCC to unroll. */
eb->gen = INTEL_GEN(i915);
-   eb->h

[Intel-gfx] [PATCH 21/37] drm/i915/gem: Include cmdparser in common execbuf pinning

2020-08-05 Thread Chris Wilson
Pull the cmdparser allocations into the reservation phase, and then
they are included in the common vma pinning pass.

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 360 +++---
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  10 +
 drivers/gpu/drm/i915/i915_cmd_parser.c|  21 +-
 3 files changed, 230 insertions(+), 161 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4cdaf5d81ef1..236d4ad3516b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -25,6 +25,7 @@
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
 #include "i915_gem_ioctls.h"
+#include "i915_memcpy.h"
 #include "i915_sw_fence_work.h"
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
@@ -53,6 +54,7 @@ struct eb_bind_vma {
 
 struct eb_vma_array {
struct kref kref;
+   struct list_head aux_list;
struct eb_vma vma[];
 };
 
@@ -254,7 +256,6 @@ struct i915_execbuffer {
 
struct i915_request *request; /** our request to build */
struct eb_vma *batch; /** identity of the batch obj/vma */
-   struct i915_vma *trampoline; /** trampoline used for chaining */
 
/** actual size of execobj[] as we may extend it for the cmdparser */
unsigned int buffer_count;
@@ -284,6 +285,11 @@ struct i915_execbuffer {
unsigned int rq_size;
} reloc_cache;
 
+   struct eb_cmdparser {
+   struct eb_vma *shadow;
+   struct eb_vma *trampoline;
+   } parser;
+
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
 
@@ -310,6 +316,10 @@ struct i915_execbuffer {
unsigned long num_fences;
 };
 
+static struct drm_i915_gem_exec_object2 no_entry = {
+   .offset = -1ull
+};
+
 static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
 {
return intel_engine_requires_cmd_parser(eb->engine) ||
@@ -326,6 +336,7 @@ static struct eb_vma_array *eb_vma_array_create(unsigned 
int count)
return NULL;
 
kref_init(&arr->kref);
+   INIT_LIST_HEAD(&arr->aux_list);
arr->vma[0].vma = NULL;
 
return arr;
@@ -351,16 +362,31 @@ static inline void eb_unreserve_vma(struct eb_vma *ev)
   __EXEC_OBJECT_HAS_FENCE);
 }
 
+static void eb_vma_destroy(struct eb_vma *ev)
+{
+   eb_unreserve_vma(ev);
+   i915_vma_put(ev->vma);
+}
+
+static void eb_destroy_aux(struct eb_vma_array *arr)
+{
+   struct eb_vma *ev, *en;
+
+   list_for_each_entry_safe(ev, en, &arr->aux_list, reloc_link) {
+   eb_vma_destroy(ev);
+   kfree(ev);
+   }
+}
+
 static void eb_vma_array_destroy(struct kref *kref)
 {
struct eb_vma_array *arr = container_of(kref, typeof(*arr), kref);
-   struct eb_vma *ev = arr->vma;
+   struct eb_vma *ev;
 
-   while (ev->vma) {
-   eb_unreserve_vma(ev);
-   i915_vma_put(ev->vma);
-   ev++;
-   }
+   eb_destroy_aux(arr);
+
+   for (ev = arr->vma; ev->vma; ev++)
+   eb_vma_destroy(ev);
 
kvfree(arr);
 }
@@ -408,8 +434,8 @@ eb_lock_vma(struct i915_execbuffer *eb, struct 
ww_acquire_ctx *acquire)
 
 static int eb_create(struct i915_execbuffer *eb)
 {
-   /* Allocate an extra slot for use by the command parser + sentinel */
-   eb->array = eb_vma_array_create(eb->buffer_count + 2);
+   /* Allocate an extra slot for use by the sentinel */
+   eb->array = eb_vma_array_create(eb->buffer_count + 1);
if (!eb->array)
return -ENOMEM;
 
@@ -1076,7 +1102,7 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct 
eb_bind_vma *bind)
GEM_BUG_ON(!(drm_mm_node_allocated(&vma->node) ^
 drm_mm_node_allocated(&bind->hole)));
 
-   if (entry->offset != vma->node.start) {
+   if (entry != &no_entry && entry->offset != vma->node.start) {
entry->offset = vma->node.start | UPDATE;
*work->p_flags |= __EXEC_HAS_RELOC;
}
@@ -1369,7 +1395,8 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
struct i915_vma *vma = ev->vma;
 
if (eb_pin_vma_inplace(eb, entry, ev)) {
-   if (entry->offset != vma->node.start) {
+   if (entry != &no_entry &&
+   entry->offset != vma->node.start) {
entry->offset = vma->node.start | UPDATE;
eb->args->flags |= __EXEC_HAS_RELOC;
}
@@ -1540,6 +1567,113 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
} while (1);
 }
 
+static int eb_alloc_cmdparser(struct i915_execbuffer *eb)
+{
+   struct intel_gt_buffer_pool_node *

[Intel-gfx] [PATCH 32/37] drm/i915: Specialise GGTT binding

2020-08-05 Thread Chris Wilson
The Global GTT mappings do not require any backing storage for the page
directories and so do not need extensive support for preallocations, or
for handling multiple bindings en masse. The Global GTT bindings also
need to take into account an eviction strategy for pinned vma, which we
want to explicitly avoid for user bindings. It is easier to specialise
i915_ggtt_pin() to keep the pages/address alive while they are used by
HW in its private GTT, while we deconstruct and rebuild i915_vma_pin().
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   7 +-
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  |   7 +-
 .../gpu/drm/i915/gt/intel_engine_heartbeat.h  |   4 +-
 drivers/gpu/drm/i915/gt/selftest_context.c|   2 +-
 .../drm/i915/gt/selftest_engine_heartbeat.c   |   7 +-
 drivers/gpu/drm/i915/i915_active.c|   2 +-
 drivers/gpu/drm/i915/i915_vma.c   | 180 --
 drivers/gpu/drm/i915/i915_vma.h   |   1 +
 .../gpu/drm/i915/selftests/i915_gem_evict.c   | 151 ---
 9 files changed, 180 insertions(+), 181 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index fae75830494d..3aadb3c80794 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -390,8 +390,11 @@ int gen6_ppgtt_pin(struct i915_ppgtt *base)
 * size. We allocate at the top of the GTT to avoid fragmentation.
 */
err = 0;
-   if (!atomic_read(&ppgtt->pin_count))
-   err = i915_ggtt_pin(ppgtt->vma, GEN6_PD_ALIGN, PIN_HIGH);
+   if (!atomic_read(&ppgtt->pin_count)) {
+   err = i915_ggtt_pin_locked(ppgtt->vma, GEN6_PD_ALIGN, PIN_HIGH);
+   if (err == 0)
+   err = i915_vma_wait_for_bind(ppgtt->vma);
+   }
if (!err)
atomic_inc(&ppgtt->pin_count);
mutex_unlock(&ppgtt->pin_mutex);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 377cbfdb3355..382b0ced18e8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -255,7 +255,7 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
return err;
 }
 
-int intel_engine_flush_barriers(struct intel_engine_cs *engine)
+int intel_engine_flush_barriers(struct intel_engine_cs *engine, gfp_t gfp)
 {
struct i915_sched_attr attr = {
.priority = I915_USER_PRIORITY(I915_PRIORITY_MIN),
@@ -270,12 +270,13 @@ int intel_engine_flush_barriers(struct intel_engine_cs 
*engine)
if (!intel_engine_pm_get_if_awake(engine))
return 0;
 
-   if (mutex_lock_interruptible(&ce->timeline->mutex)) {
+   if (mutex_lock_interruptible_nested(&ce->timeline->mutex,
+   !gfpflags_allow_blocking(gfp))) {
err = -EINTR;
goto out_rpm;
}
 
-   rq = heartbeat_create(ce, GFP_KERNEL);
+   rq = heartbeat_create(ce, gfp);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto out_unlock;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
index a7b8c0f9e005..996e12e7ccf8 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h
@@ -7,6 +7,8 @@
 #ifndef INTEL_ENGINE_HEARTBEAT_H
 #define INTEL_ENGINE_HEARTBEAT_H
 
+#include 
+
 struct intel_engine_cs;
 
 void intel_engine_init_heartbeat(struct intel_engine_cs *engine);
@@ -18,6 +20,6 @@ void intel_engine_park_heartbeat(struct intel_engine_cs 
*engine);
 void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine);
 
 int intel_engine_pulse(struct intel_engine_cs *engine);
-int intel_engine_flush_barriers(struct intel_engine_cs *engine);
+int intel_engine_flush_barriers(struct intel_engine_cs *engine, gfp_t gfp);
 
 #endif /* INTEL_ENGINE_HEARTBEAT_H */
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
b/drivers/gpu/drm/i915/gt/selftest_context.c
index 1f4020e906a8..e97e522f947b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -261,7 +261,7 @@ static int __live_active_context(struct intel_engine_cs 
*engine)
}
 
/* Now make sure our idle-barriers are flushed */
-   err = intel_engine_flush_barriers(engine);
+   err = intel_engine_flush_barriers(engine, GFP_KERNEL);
if (err)
goto err;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
index e73854dd2fe0..d22a7956c9a5 100644
--- a/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/selftest_engine_heartbeat.c
@@ -146,6 +146,11 @@ static int __live_idle_pulse(struct intel_engine_cs 
*engine,
return err;
 }

[Intel-gfx] [PATCH 00/37] Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Chris Wilson
Long story short, we need to manage evictions using dma_resv & dma_fence
tracking. The backing storage will then be managed using the ww_mutex
borrowed from (and shared via) obj->base.resv, rather than the current
obj->mm.lock.

Skipping over the breadcrumbs, the first step is to remove the final
crutches of struct_mutex from execbuf and to broaden the hold for the
dma-resv to guard not just publishing the dma-fences, but for the
duration of the execbuf submission (holding all objects and their
backing store from the point of acquisition to publishing of the final
GPU work, after which the guard is delegated to the dma-fences).

This is of course made complicated by our history. On top of the user's
objects, we also have the HW/kernel objects with their own lifetimes,
and a bunch of auxiliary objects used for working around unhappy HW and
for providing the legacy relocation mechanism. We add every auxiliary
object to the list of user objects required, and attempt to acquire them
en masse. Since all the objects can be known a priori, we can build a
list of those objects and pass that to a routine that can resolve the
-EDEADLK (and evictions). [To avoid relocations imposing a penalty on
sane userspace that avoids them, we do not touch any relocations until
necessary, at which point we have to unroll the state, and rebuild a new
list with more auxiliary buffers to accommodate the extra copy_from_user].
More examples are included as to how we can break down operations
involving multiple objects into an acquire phase prior to those
operations, keeping the -EDEADLK handling under control.

execbuf is the unique interface in that it deals with multiple user
and kernel buffers. After that, we have callers that in principle care
about accessing a single buffer, and so can be migrated over to a helper
that permits only holding one such buffer at a time. That enables us to
swap out obj->mm.lock for obj->base.resv->lock, and use lockdep to spot
illegal nesting, and to throw away the temporary pins by replacing them
with holding the ww_mutex for the duration instead.

What's changed? Some patch splitting and we need to pull in Matthew's
patch to map the page directories under the ww_mutex.
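
The heart of the acquire phase is the standard ww_mutex -EDEADLK dance;
below is a minimal two-object sketch using the stock kernel API (the
i915_acquire_ctx added later in the series generalises this to an
arbitrary list and sleeps on the contended lock via
dma_resv_lock_slow_interruptible() rather than bare retries):

	#include <linux/dma-resv.h>	/* reservation_ww_class */
	#include <linux/kernel.h>	/* swap() */
	#include <linux/ww_mutex.h>

	static int acquire_pair(struct ww_mutex *a, struct ww_mutex *b,
				struct ww_acquire_ctx *ctx)
	{
		int err;

		ww_acquire_init(ctx, &reservation_ww_class);
	retry:
		err = ww_mutex_lock_interruptible(a, ctx);
		if (!err) {
			err = ww_mutex_lock_interruptible(b, ctx);
			if (err) /* drop everything before backing off */
				ww_mutex_unlock(a);
		}
		if (err == -EDEADLK) {
			/* An older context owns one of the locks; retry in
			 * the opposite order. Our stamp is fixed, so we
			 * eventually become the oldest and make progress.
			 */
			swap(a, b);
			goto retry;
		}
		if (err) { /* -EINTR */
			ww_acquire_fini(ctx);
			return err;
		}

		/* Acquisition phase complete; the caller unlocks both
		 * and calls ww_acquire_fini() when finished.
		 */
		ww_acquire_done(ctx);
		return 0;
	}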


[Intel-gfx] [PATCH 04/37] drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission

2020-08-05 Thread Chris Wilson
Move the slow register write and readback out of the critical path for
execlists submission and delay it until the following worker, shaving
off around 200us. Note that the same signal_irq_work() is allowed to run
concurrently on each CPU (but it will only be queued once; once running
it can be requeued and re-executed), so we have to remember to lock the
global interactions as we cannot rely on the signal_irq_work() itself
providing the serialisation (in contrast to a tasklet).

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 72 ++---
 drivers/gpu/drm/i915/gt/intel_engine_pm.h   |  5 ++
 2 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index d8b206e53660..dee6d5c9b413 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -30,6 +30,7 @@
 #include "i915_trace.h"
 #include "intel_breadcrumbs.h"
 #include "intel_context.h"
+#include "intel_engine_pm.h"
 #include "intel_gt_pm.h"
 #include "intel_gt_requests.h"
 
@@ -57,12 +58,10 @@ static void irq_disable(struct intel_engine_cs *engine)
 
 static void __intel_breadcrumbs_arm_irq(struct intel_breadcrumbs *b)
 {
-   lockdep_assert_held(&b->irq_lock);
-
-   if (!b->irq_engine || b->irq_armed)
+   if (!b->irq_engine)
return;
 
-   if (!intel_gt_pm_get_if_awake(b->irq_engine->gt))
+   if (GEM_WARN_ON(!intel_gt_pm_get_if_awake(b->irq_engine->gt)))
return;
 
/*
@@ -83,15 +82,13 @@ static void __intel_breadcrumbs_arm_irq(struct 
intel_breadcrumbs *b)
 
if (!b->irq_enabled++)
irq_enable(b->irq_engine);
+
+   /* Requests may have completed before we could enable the interrupt. */
+   irq_work_queue(&b->irq_work);
 }
 
 static void __intel_breadcrumbs_disarm_irq(struct intel_breadcrumbs *b)
 {
-   lockdep_assert_held(&b->irq_lock);
-
-   if (!b->irq_engine || !b->irq_armed)
-   return;
-
GEM_BUG_ON(!b->irq_enabled);
if (!--b->irq_enabled)
irq_disable(b->irq_engine);
@@ -105,8 +102,6 @@ static void add_signaling_context(struct intel_breadcrumbs 
*b,
 {
intel_context_get(ce);
list_add_tail(&ce->signal_link, &b->signalers);
-   if (list_is_first(&ce->signal_link, &b->signalers))
-   __intel_breadcrumbs_arm_irq(b);
 }
 
 static void remove_signaling_context(struct intel_breadcrumbs *b,
@@ -197,7 +192,30 @@ static void signal_irq_work(struct irq_work *work)
 
spin_lock(&b->irq_lock);
 
-   if (list_empty(&b->signalers))
+   /*
+* Keep the irq armed until the interrupt after all listeners are gone.
+*
+* Enabling/disabling the interrupt is rather costly, roughly a couple
+* of hundred microseconds. If we are proactive and enable/disable
+* the interrupt around every request that wants a breadcrumb, we
+* quickly drown in the extra orders of magnitude of latency imposed
+* on request submission.
+*
+* So we try to be lazy, and keep the interrupts enabled until no
+* more listeners appear within a breadcrumb interrupt interval (that
+* is until a request completes that no one cares about). The
+* observation is that listeners come in batches, and will often
+* listen to a bunch of requests in succession.
+*
+* We also try to avoid raising too many interrupts, as they may
+* be generated by userspace batches and it is unfortunately rather
+* too easy to drown the CPU under a flood of GPU interrupts. Thus
+* whenever no one appears to be listening, we turn off the interrupts.
+* Fewer interrupts should conserve power -- at the very least, fewer
+* interrupts draw less ire from other users of the system and tools
+* like powertop.
+*/
+   if (b->irq_armed && list_empty(&b->signalers))
__intel_breadcrumbs_disarm_irq(b);
 
list_splice_init(&b->signaled_requests, &signal);
@@ -251,6 +269,15 @@ static void signal_irq_work(struct irq_work *work)
 
i915_request_put(rq);
}
+
+   if (!READ_ONCE(b->irq_armed) && !list_empty(&b->signalers)) {
+   spin_lock(&b->irq_lock);
+   if (!b->irq_armed)
+   __intel_breadcrumbs_arm_irq(b);
+   spin_unlock(&b->irq_lock);
+   }
+   if (READ_ONCE(b->irq_armed) && intel_engine_is_parking(b->irq_engine))
+   irq_work_queue(&b->irq_work); /* flush the signalers */
 }
 
 struct intel_breadcrumbs *
@@ -292,16 +319,8 @@ void intel_breadcrumbs_reset(struct intel_breadcrumbs *b)
 
 void intel_breadcrumbs_park(struct intel_breadcrumbs *b)
 {
-   unsigned long flags;
-
-   if (!READ_ONCE(b->irq_armed))
-   return;
-
-   spin_

[Intel-gfx] [PATCH 06/37] drm/i915/gt: Don't cancel the interrupt shadow too early

2020-08-05 Thread Chris Wilson
We currently want to keep the interrupt enabled until the interrupt after
which we have no more work to do. This heuristic was broken by us
kicking the irq-work on adding a completed request without attaching a
signaler -- hence it appearing to the irq-worker that an interrupt had
fired when we were idle.

Fixes: bda4d4db6dd6 ("drm/i915/gt: Replace 
intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 9710d09e7670..ae8895b48eca 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -216,7 +216,7 @@ static void signal_irq_work(struct irq_work *work)
 * interrupt draw less ire from other users of the system and tools
 * like powertop.
 */
-   if (b->irq_armed && list_empty(&b->signalers))
+   if (!signal && b->irq_armed && list_empty(&b->signalers))
__intel_breadcrumbs_disarm_irq(b);
 
list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
-- 
2.20.1



[Intel-gfx] [PATCH 30/37] drm/i915: Hold wakeref for the duration of the vma GGTT binding

2020-08-05 Thread Chris Wilson
Now that we have pushed the binding itself outside of the vm->mutex, we
are clear of the potential wakeref inversions and can take the wakeref
around the actual duration of the HW interaction.

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 39 
 drivers/gpu/drm/i915/i915_vma.c  |  6 -
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 92b6cc754d5b..a2c7c55b358d 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -434,27 +434,39 @@ static void i915_ggtt_clear_range(struct 
i915_address_space *vm,
intel_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
 }
 
-static void ggtt_bind_vma(struct i915_address_space *vm,
- struct i915_vm_pt_stash *stash,
- struct i915_vma *vma,
- enum i915_cache_level cache_level,
- u32 flags)
+static void __ggtt_bind_vma(struct i915_address_space *vm,
+   struct i915_vm_pt_stash *stash,
+   struct i915_vma *vma,
+   enum i915_cache_level cache_level,
+   u32 flags)
 {
struct drm_i915_gem_object *obj = vma->obj;
+   intel_wakeref_t wakeref;
u32 pte_flags;
 
-   if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
-   return;
-
/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
pte_flags = 0;
if (i915_gem_object_is_readonly(obj))
pte_flags |= PTE_READ_ONLY;
 
-   vm->insert_entries(vm, vma, cache_level, pte_flags);
+   with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref)
+   vm->insert_entries(vm, vma, cache_level, pte_flags);
+
vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
 
+static void ggtt_bind_vma(struct i915_address_space *vm,
+ struct i915_vm_pt_stash *stash,
+ struct i915_vma *vma,
+ enum i915_cache_level cache_level,
+ u32 flags)
+{
+   if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
+   return;
+
+   __ggtt_bind_vma(vm, stash, vma, cache_level, flags);
+}
+
 static void ggtt_unbind_vma(struct i915_address_space *vm, struct i915_vma 
*vma)
 {
vm->clear_range(vm, vma->node.start, vma->size);
@@ -571,19 +583,12 @@ static void aliasing_gtt_bind_vma(struct 
i915_address_space *vm,
  enum i915_cache_level cache_level,
  u32 flags)
 {
-   u32 pte_flags;
-
-   /* Currently applicable only to VLV */
-   pte_flags = 0;
-   if (i915_gem_object_is_readonly(vma->obj))
-   pte_flags |= PTE_READ_ONLY;
-
if (flags & I915_VMA_LOCAL_BIND)
ppgtt_bind_vma(&i915_vm_to_ggtt(vm)->alias->vm,
   stash, vma, cache_level, flags);
 
if (flags & I915_VMA_GLOBAL_BIND)
-   vm->insert_entries(vm, vma, cache_level, pte_flags);
+   __ggtt_bind_vma(vm, stash, vma, cache_level, flags);
 }
 
 static void aliasing_gtt_unbind_vma(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 40e38b533b59..320f6f8ec042 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -794,7 +794,6 @@ static int __wait_for_unbind(struct i915_vma *vma, unsigned 
int flags)
 int i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
 {
struct i915_vma_work *work = NULL;
-   intel_wakeref_t wakeref = 0;
unsigned int bound;
int err;
 
@@ -813,9 +812,6 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
return err;
}
 
-   if (flags & PIN_GLOBAL)
-   wakeref = intel_runtime_pm_get(&vma->vm->i915->runtime_pm);
-
err = __wait_for_unbind(vma, flags);
if (err)
goto err_rpm;
@@ -925,8 +921,6 @@ int i915_vma_pin(struct i915_vma *vma, u64 size, u64 
alignment, u64 flags)
 err_fence:
dma_fence_work_commit_imm(&work->base);
 err_rpm:
-   if (wakeref)
-   intel_runtime_pm_put(&vma->vm->i915->runtime_pm, wakeref);
if (vma->obj)
i915_gem_object_unpin_pages(vma->obj);
return err;
-- 
2.20.1



[Intel-gfx] [PATCH 36/37] drm/i915/display: Drop object lock from intel_unpin_fb_vma

2020-08-05 Thread Chris Wilson
The obj->resv->lock does not serialise anything within
intel_unpin_fb_vma(), so remove the redundant contention point.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/display/intel_display.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 522c772a2111..a70b41b63650 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -2311,12 +2311,9 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 
 void intel_unpin_fb_vma(struct i915_vma *vma, unsigned long flags)
 {
-   i915_gem_object_lock(vma->obj);
if (flags & PLANE_HAS_FENCE)
i915_vma_unpin_fence(vma);
i915_gem_object_unpin_from_display_plane(vma);
-   i915_gem_object_unlock(vma->obj);
-
i915_vma_put(vma);
 }
 
-- 
2.20.1



[Intel-gfx] [PATCH 28/37] drm/i915: Acquire the object lock around page directories

2020-08-05 Thread Chris Wilson
Now that the page directories are backed by an object, and we wish to
acquire multiple objects together under the same acquire context, teach
i915_vm_map_pt_stash() to use i915_acquire_ctx.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 14 +++-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  4 +++
 drivers/gpu/drm/i915/gt/intel_ppgtt.c | 34 +--
 4 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d3ac2542a039..94ec3536cac4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1450,7 +1450,7 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
return eb_vm_work_cancel(work, err);
 
/* We also need to prepare mappings to write the PD pages */
-   err = i915_vm_map_pt_stash(work->vm, &work->stash);
+   err = __i915_vm_map_pt_stash_locked(work->vm, &work->stash);
if (err)
return eb_vm_work_cancel(work, err);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c 
b/drivers/gpu/drm/i915/gt/intel_gtt.c
index 1a7efbad8f74..b0629de490a3 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
@@ -19,7 +19,8 @@ struct drm_i915_gem_object *alloc_pt_dma(struct 
i915_address_space *vm, int sz)
return i915_gem_object_create_internal(vm->i915, sz);
 }
 
-int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+int __map_pt_dma_locked(struct i915_address_space *vm,
+   struct drm_i915_gem_object *obj)
 {
void *vaddr;
 
@@ -31,6 +32,17 @@ int map_pt_dma(struct i915_address_space *vm, struct 
drm_i915_gem_object *obj)
return 0;
 }
 
+int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj)
+{
+   int err;
+
+   i915_gem_object_lock(obj);
+   err = __map_pt_dma_locked(vm, obj);
+   i915_gem_object_unlock(obj);
+
+   return err;
+}
+
 void __i915_vm_close(struct i915_address_space *vm)
 {
struct i915_vma *vma, *vn;
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index c659dbd6cda2..b4e1519e4028 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -525,6 +525,8 @@ struct i915_page_directory *alloc_pd(struct 
i915_address_space *vm);
 struct i915_page_directory *__alloc_pd(int npde);
 
 int map_pt_dma(struct i915_address_space *vm, struct drm_i915_gem_object *obj);
+int __map_pt_dma_locked(struct i915_address_space *vm,
+   struct drm_i915_gem_object *obj);
 
 void free_px(struct i915_address_space *vm,
 struct i915_page_table *pt, int lvl);
@@ -573,6 +575,8 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
   u64 size);
 int i915_vm_map_pt_stash(struct i915_address_space *vm,
 struct i915_vm_pt_stash *stash);
+int __i915_vm_map_pt_stash_locked(struct i915_address_space *vm,
+ struct i915_vm_pt_stash *stash);
 void i915_vm_free_pt_stash(struct i915_address_space *vm,
   struct i915_vm_pt_stash *stash);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c 
b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
index 11e7288464c0..ada894885795 100644
--- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c
@@ -5,6 +5,8 @@
 
 #include 
 
+#include "mm/i915_acquire_ctx.h"
+
 #include "i915_trace.h"
 #include "intel_gtt.h"
 #include "gen6_ppgtt.h"
@@ -253,15 +255,15 @@ int i915_vm_alloc_pt_stash(struct i915_address_space *vm,
return 0;
 }
 
-int i915_vm_map_pt_stash(struct i915_address_space *vm,
-struct i915_vm_pt_stash *stash)
+int __i915_vm_map_pt_stash_locked(struct i915_address_space *vm,
+ struct i915_vm_pt_stash *stash)
 {
struct i915_page_table *pt;
int n, err;
 
for (n = 0; n < ARRAY_SIZE(stash->pt); n++) {
for (pt = stash->pt[n]; pt; pt = pt->stash) {
-   err = map_pt_dma(vm, pt->base);
+   err = __map_pt_dma_locked(vm, pt->base);
if (err)
return err;
}
@@ -270,6 +272,32 @@ int i915_vm_map_pt_stash(struct i915_address_space *vm,
return 0;
 }
 
+int i915_vm_map_pt_stash(struct i915_address_space *vm,
+struct i915_vm_pt_stash *stash)
+{
+   struct i915_acquire_ctx acquire;
+   struct i915_page_table *pt;
+   int n, err;
+
+   /* Acquire all the pages for the page directories simultaneously */
+   i915_acquire_ctx_init(&acquire);
+   for (n = 0; n < ARRAY_SIZE(stash->pt); n++)

[Intel-gfx] [PATCH 25/37] drm/i915: Add an implementation for common reservation_ww_class locking

2020-08-05 Thread Chris Wilson
From: Maarten Lankhorst 

i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory
eviction. We don't use it yet, but let's start by adding the definition
first.

To use it, we have to pass a non-NULL ww to gem_object_lock, and not
unlock directly; that is done in i915_gem_ww_ctx_fini.
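
A sketch of the intended calling convention (i915_gem_ww_ctx_init/fini
follow the description above; the backoff helper for the -EDEADLK case
and do_work_locked() are assumed, hypothetical names):

	static int do_work_locked(struct drm_i915_gem_object *obj)
	{
		return 0; /* hypothetical payload on the locked object */
	}

	static int use_object_locked(struct drm_i915_gem_object *obj)
	{
		struct i915_gem_ww_ctx ww;
		int err;

		i915_gem_ww_ctx_init(&ww, true /* interruptible */);
	retry:
		err = i915_gem_object_lock(obj, &ww);
		if (!err)
			err = do_work_locked(obj);
		if (err == -EDEADLK) {
			err = i915_gem_ww_ctx_backoff(&ww);
			if (!err)
				goto retry;
		}
		i915_gem_ww_ctx_fini(&ww); /* unlocks everything still held */
		return err;
	}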

Changes since v1:
- Change ww_ctx and obj order in locking functions (Joonas Lahtinen)

v3: Build a list of all objects first, centralise -EDEADLK handling

Signed-off-by: Maarten Lankhorst 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/Makefile |   4 +
 drivers/gpu/drm/i915/i915_globals.c   |   1 +
 drivers/gpu/drm/i915/i915_globals.h   |   1 +
 drivers/gpu/drm/i915/mm/i915_acquire_ctx.c| 139 ++
 drivers/gpu/drm/i915/mm/i915_acquire_ctx.h|  34 +++
 drivers/gpu/drm/i915/mm/st_acquire_ctx.c  | 242 ++
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 +
 7 files changed, 422 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/mm/i915_acquire_ctx.c
 create mode 100644 drivers/gpu/drm/i915/mm/i915_acquire_ctx.h
 create mode 100644 drivers/gpu/drm/i915/mm/st_acquire_ctx.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index bda4c0e408f8..a3a4c8a555ec 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -125,6 +125,10 @@ gt-y += \
gt/gen9_renderstate.o
 i915-y += $(gt-y)
 
+# Memory + DMA management
+i915-y += \
+   mm/i915_acquire_ctx.o
+
 # GEM (Graphics Execution Management) code
 gem-y += \
gem/i915_gem_busy.o \
diff --git a/drivers/gpu/drm/i915/i915_globals.c 
b/drivers/gpu/drm/i915/i915_globals.c
index 3aa213684293..51ec42a14694 100644
--- a/drivers/gpu/drm/i915/i915_globals.c
+++ b/drivers/gpu/drm/i915/i915_globals.c
@@ -87,6 +87,7 @@ static void __i915_globals_cleanup(void)
 
 static __initconst int (* const initfn[])(void) = {
i915_global_active_init,
+   i915_global_acquire_init,
i915_global_buddy_init,
i915_global_context_init,
i915_global_gem_context_init,
diff --git a/drivers/gpu/drm/i915/i915_globals.h 
b/drivers/gpu/drm/i915/i915_globals.h
index b2f5cd9b9b1a..11227abf2769 100644
--- a/drivers/gpu/drm/i915/i915_globals.h
+++ b/drivers/gpu/drm/i915/i915_globals.h
@@ -27,6 +27,7 @@ void i915_globals_exit(void);
 
 /* constructors */
 int i915_global_active_init(void);
+int i915_global_acquire_init(void);
 int i915_global_buddy_init(void);
 int i915_global_context_init(void);
 int i915_global_gem_context_init(void);
diff --git a/drivers/gpu/drm/i915/mm/i915_acquire_ctx.c 
b/drivers/gpu/drm/i915/mm/i915_acquire_ctx.c
new file mode 100644
index ..d1c3b958c15d
--- /dev/null
+++ b/drivers/gpu/drm/i915/mm/i915_acquire_ctx.c
@@ -0,0 +1,139 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2020 Intel Corporation
+ */
+
+#include 
+
+#include "i915_globals.h"
+#include "gem/i915_gem_object.h"
+
+#include "i915_acquire_ctx.h"
+
+static struct i915_global_acquire {
+   struct i915_global base;
+   struct kmem_cache *slab_acquires;
+} global;
+
+struct i915_acquire {
+   struct drm_i915_gem_object *obj;
+   struct i915_acquire *next;
+};
+
+static struct i915_acquire *i915_acquire_alloc(void)
+{
+   return kmem_cache_alloc(global.slab_acquires, GFP_KERNEL);
+}
+
+static void i915_acquire_free(struct i915_acquire *lnk)
+{
+   kmem_cache_free(global.slab_acquires, lnk);
+}
+
+void i915_acquire_ctx_init(struct i915_acquire_ctx *ctx)
+{
+   ww_acquire_init(&ctx->ctx, &reservation_ww_class);
+   ctx->locked = NULL;
+}
+
+int i915_acquire_ctx_lock(struct i915_acquire_ctx *ctx,
+ struct drm_i915_gem_object *obj)
+{
+   struct i915_acquire *lock, *lnk;
+   int err;
+
+   lock = i915_acquire_alloc();
+   if (!lock)
+   return -ENOMEM;
+
+   lock->obj = i915_gem_object_get(obj);
+   lock->next = NULL;
+
+   while ((lnk = lock)) {
+   obj = lnk->obj;
+   lock = lnk->next;
+
+   err = dma_resv_lock_interruptible(obj->base.resv, &ctx->ctx);
+   if (err == -EDEADLK) {
+   struct i915_acquire *old;
+
+   while ((old = ctx->locked)) {
+   i915_gem_object_unlock(old->obj);
+   ctx->locked = old->next;
+   old->next = lock;
+   lock = old;
+   }
+
+   err = dma_resv_lock_slow_interruptible(obj->base.resv,
+  &ctx->ctx);
+   }
+   if (!err) {
+   lnk->next = ctx->locked;
+   ctx->locked = lnk;
+   } else {
+   i915_gem_object_put(obj);
+   i915_acquire_free(lnk);
+   }
+   if (err == -EALREADY)
+ 

[Intel-gfx] [PATCH 11/37] drm/i915/gem: Move the 'cached' info to i915_execbuffer

2020-08-05 Thread Chris Wilson
The reloc_cache contains some details that are used outside of the
relocation handling, so lift those out of the embedded struct into the
principal struct i915_execbuffer.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 61 +++
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  6 +-
 2 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e7e16c62df1c..e9ef0c287fd9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -261,11 +261,6 @@ struct i915_execbuffer {
 */
struct reloc_cache {
struct drm_mm_node node; /** temporary GTT binding */
-   unsigned int gen; /** Cached value of INTEL_GEN */
-   bool use_64bit_reloc : 1;
-   bool has_llc : 1;
-   bool has_fence : 1;
-   bool needs_unfenced : 1;
 
struct intel_context *ce;
 
@@ -283,6 +278,12 @@ struct i915_execbuffer {
u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
 
+   unsigned int gen; /** Cached value of INTEL_GEN */
+   bool use_64bit_reloc : 1;
+   bool has_llc : 1;
+   bool has_fence : 1;
+   bool needs_unfenced : 1;
+
/**
 * Indicate either the size of the hastable used to resolve
 * relocation handles, or if negative that we are using a direct
@@ -540,11 +541,11 @@ eb_validate_vma(struct i915_execbuffer *eb,
 */
entry->offset = gen8_noncanonical_addr(entry->offset);
 
-   if (!eb->reloc_cache.has_fence) {
+   if (!eb->has_fence) {
entry->flags &= ~EXEC_OBJECT_NEEDS_FENCE;
} else {
if ((entry->flags & EXEC_OBJECT_NEEDS_FENCE ||
-eb->reloc_cache.needs_unfenced) &&
+eb->needs_unfenced) &&
i915_gem_object_is_tiled(vma->obj))
entry->flags |= EXEC_OBJECT_NEEDS_GTT | 
__EXEC_OBJECT_NEEDS_MAP;
}
@@ -592,7 +593,7 @@ eb_add_vma(struct i915_execbuffer *eb,
if (entry->relocation_count &&
!(ev->flags & EXEC_OBJECT_PINNED))
ev->flags |= __EXEC_OBJECT_NEEDS_BIAS;
-   if (eb->reloc_cache.has_fence)
+   if (eb->has_fence)
ev->flags |= EXEC_OBJECT_NEEDS_FENCE;
 
eb->batch = ev;
@@ -995,15 +996,19 @@ relocation_target(const struct 
drm_i915_gem_relocation_entry *reloc,
return gen8_canonical_addr((int)reloc->delta + target->node.start);
 }
 
-static void reloc_cache_init(struct reloc_cache *cache,
-struct drm_i915_private *i915)
+static void eb_info_init(struct i915_execbuffer *eb,
+struct drm_i915_private *i915)
 {
/* Must be a variable in the struct to allow GCC to unroll. */
-   cache->gen = INTEL_GEN(i915);
-   cache->has_llc = HAS_LLC(i915);
-   cache->use_64bit_reloc = HAS_64BIT_RELOC(i915);
-   cache->has_fence = cache->gen < 4;
-   cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
+   eb->gen = INTEL_GEN(i915);
+   eb->has_llc = HAS_LLC(i915);
+   eb->use_64bit_reloc = HAS_64BIT_RELOC(i915);
+   eb->has_fence = eb->gen < 4;
+   eb->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
+}
+
+static void reloc_cache_init(struct reloc_cache *cache)
+{
cache->node.flags = 0;
cache->rq = NULL;
cache->target = NULL;
@@ -1011,8 +1016,9 @@ static void reloc_cache_init(struct reloc_cache *cache,
 
 #define RELOC_TAIL 4
 
-static int reloc_gpu_chain(struct reloc_cache *cache)
+static int reloc_gpu_chain(struct i915_execbuffer *eb)
 {
+   struct reloc_cache *cache = &eb->reloc_cache;
struct intel_gt_buffer_pool_node *pool;
struct i915_request *rq = cache->rq;
struct i915_vma *batch;
@@ -1036,9 +1042,9 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
GEM_BUG_ON(cache->rq_size + RELOC_TAIL > PAGE_SIZE  / sizeof(u32));
cmd = cache->rq_cmd + cache->rq_size;
*cmd++ = MI_ARB_CHECK;
-   if (cache->gen >= 8)
+   if (eb->gen >= 8)
*cmd++ = MI_BATCH_BUFFER_START_GEN8;
-   else if (cache->gen >= 6)
+   else if (eb->gen >= 6)
*cmd++ = MI_BATCH_BUFFER_START;
else
*cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT;
@@ -1061,7 +1067,7 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
goto out_pool;
 
cmd = i915_gem_object_pin_map(batch->obj,
- cache->has_llc ?
+ eb->has_llc ?
  I915_MAP_FORCE_WB :
  I915_MAP_FORC

[Intel-gfx] [PATCH 03/37] drm/i915/gt: Free stale request on destroying the virtual engine

2020-08-05 Thread Chris Wilson
Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted,
completed request. Therefore we need to potentially clean up the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the siblings' execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.
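
As a sketch only (the names are taken from this patch, but the code
below is illustrative rather than the actual dequeue loop), the reader
side that the RCU grace period protects looks like:

        struct i915_request *rq;

        rcu_read_lock();
        rq = READ_ONCE(ve->request);
        if (rq)
                rq = i915_request_get_rcu(rq); /* may race with retire */
        rcu_read_unlock();

Deferring the kfree by a grace period means such a peek can never
dereference freed memory, even after the context has been closed.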

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 417f6b0c6c61..cb04bc5474be 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -180,6 +180,7 @@
 #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
 
 struct virtual_engine {
+   struct rcu_head rcu;
struct intel_engine_cs base;
struct intel_context context;
 
@@ -5393,10 +5394,25 @@ static void virtual_context_destroy(struct kref *kref)
container_of(kref, typeof(*ve), context.ref);
unsigned int n;
 
-   GEM_BUG_ON(!list_empty(virtual_queue(ve)));
-   GEM_BUG_ON(ve->request);
GEM_BUG_ON(ve->context.inflight);
 
+   if (unlikely(ve->request)) {
+   struct i915_request *old;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ve->base.active.lock, flags);
+
+   old = fetch_and_zero(&ve->request);
+   if (old) {
+   GEM_BUG_ON(!i915_request_completed(old));
+   __i915_request_submit(old);
+   i915_request_put(old);
+   }
+
+   spin_unlock_irqrestore(&ve->base.active.lock, flags);
+   }
+   GEM_BUG_ON(!list_empty(virtual_queue(ve)));
+
for (n = 0; n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
struct rb_node *node = &ve->nodes[sibling->id].rb;
@@ -5422,7 +5438,7 @@ static void virtual_context_destroy(struct kref *kref)
intel_engine_free_request_pool(&ve->base);
 
kfree(ve->bonds);
-   kfree(ve);
+   kfree_rcu(ve, rcu);
 }
 
 static void virtual_engine_initial_hint(struct virtual_engine *ve)
-- 
2.20.1



[Intel-gfx] [PATCH 19/37] drm/i915/gem: Asynchronous GTT unbinding

2020-08-05 Thread Chris Wilson
It is reasonably common for userspace (even modern drivers like iris) to
reuse an active address for a new buffer. This would cause the
application to stall under its mutex (originally struct_mutex) until the
old batches were idle and it could synchronously remove the stale PTE.
However, we can queue up a job that waits on the signal for the old
nodes to complete and upon those signals, remove the old nodes replacing
them with the new ones for the batch. This is still CPU driven, but in
theory we can do the GTT patching from the GPU. The job itself has a
completion signal allowing the execbuf to wait upon the rebinding, and
also other observers to coordinate with the common VM activity.

Letting userspace queue up more work allows it to do more without
blocking other clients. In turn, we take care not to let it build up
too much concurrent work, creating a small number of queues for each
context to limit the number of concurrent tasks.

The implementation relies on only scheduling one unbind operation per
vma as we use the unbound vma->node location to track the stale PTE. If
there are multiple processes thrashing the same vm, the eviction
processing will become synchronous, with the clients having to wait for
execbuf to schedule their work.
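
In outline, and only as a sketch (eb_vm_work and eb_vm_ops are
illustrative names, not code from this patch), the unbind job reuses
the dma_fence_work pattern already present in the driver:

        struct eb_vm_work {
                struct dma_fence_work base; /* carries the completion fence */
                struct i915_vma *vma;       /* stale node to be replaced */
        };

        static int eb_vm_work_fn(struct dma_fence_work *work)
        {
                struct eb_vm_work *w = container_of(work, typeof(*w), base);

                /* Old users have signaled: swap the stale PTEs for new */
                return rewrite_ptes(w->vma); /* placeholder */
        }

        static const struct dma_fence_work_ops eb_vm_ops = {
                .name = "eb_vm_work",
                .work = eb_vm_work_fn,
        };

Execbuf then awaits base.dma before submission, as do any other
observers of the common VM activity.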

Closes: https://gitlab.freedesktop.org/drm/intel/issues/1402
Signed-off-by: Chris Wilson 
Cc: Matthew Auld 
Cc: Andi Shyti 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 919 --
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   1 +
 drivers/gpu/drm/i915/gt/intel_gtt.c   |   4 +
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   2 +
 drivers/gpu/drm/i915/i915_gem.c   |   7 +
 drivers/gpu/drm/i915/i915_gem_gtt.c   |   5 +
 drivers/gpu/drm/i915/i915_vma.c   |  71 +-
 drivers/gpu/drm/i915/i915_vma.h   |   4 +
 8 files changed, 883 insertions(+), 130 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 32d23718ee1e..301e67dcdbde 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -18,6 +18,7 @@
 #include "gt/intel_gt.h"
 #include "gt/intel_gt_buffer_pool.h"
 #include "gt/intel_gt_pm.h"
+#include "gt/intel_gt_requests.h"
 #include "gt/intel_ring.h"
 
 #include "i915_drv.h"
@@ -44,6 +45,12 @@ struct eb_vma {
u32 handle;
 };
 
+struct eb_bind_vma {
+   struct eb_vma *ev;
+   struct drm_mm_node hole;
+   unsigned int bind_flags;
+};
+
 struct eb_vma_array {
struct kref kref;
struct eb_vma vma[];
@@ -67,11 +74,12 @@ struct eb_vma_array {
 I915_EXEC_RESOURCE_STREAMER)
 
 /* Catch emission of unexpected errors for CI! */
+#define __EINVAL__ 22
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)
 #undef EINVAL
 #define EINVAL ({ \
DRM_DEBUG_DRIVER("EINVAL at %s:%d\n", __func__, __LINE__); \
-   22; \
+   __EINVAL__; \
 })
 #endif
 
@@ -323,6 +331,12 @@ static struct eb_vma_array *eb_vma_array_create(unsigned 
int count)
return arr;
 }
 
+static struct eb_vma_array *eb_vma_array_get(struct eb_vma_array *arr)
+{
+   kref_get(&arr->kref);
+   return arr;
+}
+
 static inline void eb_unreserve_vma(struct eb_vma *ev)
 {
struct i915_vma *vma = ev->vma;
@@ -456,7 +470,10 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 
*entry,
 const struct i915_vma *vma,
 unsigned int flags)
 {
-   if (vma->node.size < entry->pad_to_size)
+   if (test_bit(I915_VMA_ERROR_BIT, __i915_vma_flags(vma)))
+   return true;
+
+   if (vma->node.size < max(vma->size, entry->pad_to_size))
return true;
 
if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment))
@@ -481,32 +498,6 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 
*entry,
return false;
 }
 
-static u64 eb_pin_flags(const struct drm_i915_gem_exec_object2 *entry,
-   unsigned int exec_flags)
-{
-   u64 pin_flags = 0;
-
-   if (exec_flags & EXEC_OBJECT_NEEDS_GTT)
-   pin_flags |= PIN_GLOBAL;
-
-   /*
-* Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
-* limit address to the first 4GBs for unflagged objects.
-*/
-   if (!(exec_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS))
-   pin_flags |= PIN_ZONE_4G;
-
-   if (exec_flags & __EXEC_OBJECT_NEEDS_MAP)
-   pin_flags |= PIN_MAPPABLE;
-
-   if (exec_flags & EXEC_OBJECT_PINNED)
-   pin_flags |= entry->offset | PIN_OFFSET_FIXED;
-   else if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS)
-   pin_flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
-
-   return pin_flags;
-}
-
 static bool eb_pin_vma_fence_inplace(struct eb_vma *ev)
 {
return false; /* We need to add some new fence serialisation */
@@ -520,6 +511,10 @@ eb_pin_vma_inplace(struct i915_e

[Intel-gfx] [PATCH 22/37] drm/i915/gem: Include secure batch in common execbuf pinning

2020-08-05 Thread Chris Wilson
Pull the GGTT binding for the secure batch dispatch into the common vma
pinning routine for execbuf, so that there is just a single central
place for all i915_vma_pin() calls.

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 88 +++
 1 file changed, 51 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 236d4ad3516b..19cab5541dbc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1674,6 +1674,48 @@ static int eb_alloc_cmdparser(struct i915_execbuffer *eb)
return err;
 }
 
+static int eb_secure_batch(struct i915_execbuffer *eb)
+{
+   struct i915_vma *vma = eb->batch->vma;
+
+   /*
+* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
+* batch" bit. Hence we need to pin secure batches into the global gtt.
+* hsw should have this fixed, but bdw mucks it up again.
+*/
+   if (!(eb->batch_flags & I915_DISPATCH_SECURE))
+   return 0;
+
+   if (GEM_WARN_ON(vma->vm != &eb->engine->gt->ggtt->vm)) {
+   struct eb_vma *ev;
+
+   ev = kzalloc(sizeof(*ev), GFP_KERNEL);
+   if (!ev)
+   return -ENOMEM;
+
+   vma = i915_vma_instance(vma->obj,
+   &eb->engine->gt->ggtt->vm,
+   NULL);
+   if (IS_ERR(vma)) {
+   kfree(ev);
+   return PTR_ERR(vma);
+   }
+
+   ev->vma = i915_vma_get(vma);
+   ev->exec = &no_entry;
+
+   list_add(&ev->submit_link, &eb->submit_list);
+   list_add(&ev->reloc_link, &eb->array->aux_list);
+   list_add(&ev->bind_link, &eb->bind_list);
+
+   GEM_BUG_ON(eb->batch->vma->private);
+   eb->batch = ev;
+   }
+
+   eb->batch->flags |= EXEC_OBJECT_NEEDS_GTT;
+   return 0;
+}
+
 static unsigned int eb_batch_index(const struct i915_execbuffer *eb)
 {
if (eb->args->flags & I915_EXEC_BATCH_FIRST)
@@ -1823,6 +1865,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (err)
return err;
 
+   err = eb_secure_batch(eb);
+   if (err)
+   return err;
+
return 0;
 }
 
@@ -2798,7 +2844,7 @@ static int eb_parse(struct i915_execbuffer *eb)
return 0;
 }
 
-static int eb_submit(struct i915_execbuffer *eb, struct i915_vma *batch)
+static int eb_submit(struct i915_execbuffer *eb)
 {
int err;
 
@@ -2825,7 +2871,7 @@ static int eb_submit(struct i915_execbuffer *eb, struct 
i915_vma *batch)
}
 
err = eb->engine->emit_bb_start(eb->request,
-   batch->node.start +
+   eb->batch->vma->node.start +
eb->batch_start_offset,
eb->batch_len,
eb->batch_flags);
@@ -3486,7 +3532,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
struct i915_execbuffer eb;
struct dma_fence *in_fence = NULL;
struct sync_file *out_fence = NULL;
-   struct i915_vma *batch;
int out_fence_fd = -1;
int err;
 
@@ -3601,34 +3646,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (err)
goto err_vma;
 
-   /*
-* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
-* batch" bit. Hence we need to pin secure batches into the global gtt.
-* hsw should have this fixed, but bdw mucks it up again. */
-   batch = i915_vma_get(eb.batch->vma);
-   if (eb.batch_flags & I915_DISPATCH_SECURE) {
-   struct i915_vma *vma;
-
-   /*
-* So on first glance it looks freaky that we pin the batch here
-* outside of the reservation loop. But:
-* - The batch is already pinned into the relevant ppgtt, so we
-*   already have the backing storage fully allocated.
-* - No other BO uses the global gtt (well contexts, but meh),
-*   so we don't really have issues with multiple objects not
-*   fitting due to fragmentation.
-* So this is actually safe.
-*/
-   vma = i915_gem_object_ggtt_pin(batch->obj, NULL, 0, 0, 0);
-   if (IS_ERR(vma)) {
-   err = PTR_ERR(vma);
-   goto err_vma;
-   }
-
-   GEM_BUG_ON(vma->obj != batch->obj);
-   batch = vma;
-   }
-
/* All GPU relocation batches must be submitted prior to the user rq */
GEM_BUG_ON(eb.reloc_cache.rq);
 
@@ -3636,7 +3653,7 @@ i915_gem_do_exec

[Intel-gfx] [PATCH 12/37] drm/i915/gem: Break apart the early i915_vma_pin from execbuf object lookup

2020-08-05 Thread Chris Wilson
As a prelude to the next step, where we want to perform all the object
allocations together under the same lock, we must first delay the
i915_vma_pin() as that implicitly does the allocations for us, one by
one. As it only does the allocations one by one, it is not allowed to
wait/evict, whereas by pulling all the allocations together the entire
set can be scheduled as one.
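
The intended shape of the flow, as a simplified sketch (variable names
assumed from the surrounding code):

        /* Phase 1: cheap in-place pins; no waiting or eviction allowed */
        list_for_each_entry(ev, &eb->bind_list, bind_link)
                if (!eb_pin_vma(eb, ev->exec, ev))
                        list_add_tail(&ev->unbound_link, &unbound);

        /*
         * Phase 2: allocate for all the leftovers together, under one
         * lock, where the whole set can wait and evict as a unit.
         */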

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 74 ++-
 1 file changed, 41 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e9ef0c287fd9..2f6fa8b3a805 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -34,6 +34,8 @@ struct eb_vma {
 
/** This vma's place in the execbuf reservation list */
struct drm_i915_gem_exec_object2 *exec;
+
+   struct list_head bind_link;
struct list_head unbound_link;
struct list_head reloc_link;
 
@@ -248,8 +250,8 @@ struct i915_execbuffer {
/** actual size of execobj[] as we may extend it for the cmdparser */
unsigned int buffer_count;
 
-   /** list of vma not yet bound during reservation phase */
-   struct list_head unbound;
+   /** list of all vma required to be bound for this execbuf */
+   struct list_head bind_list;
 
/** list of vma that have execobj.relocation_count */
struct list_head relocs_list;
@@ -577,6 +579,8 @@ eb_add_vma(struct i915_execbuffer *eb,
eb->lut_size)]);
}
 
+   list_add_tail(&ev->bind_link, &eb->bind_list);
+
if (entry->relocation_count)
list_add_tail(&ev->reloc_link, &eb->relocs_list);
 
@@ -598,16 +602,6 @@ eb_add_vma(struct i915_execbuffer *eb,
 
eb->batch = ev;
}
-
-   if (eb_pin_vma(eb, entry, ev)) {
-   if (entry->offset != vma->node.start) {
-   entry->offset = vma->node.start | UPDATE;
-   eb->args->flags |= __EXEC_HAS_RELOC;
-   }
-   } else {
-   eb_unreserve_vma(ev);
-   list_add_tail(&ev->unbound_link, &eb->unbound);
-   }
 }
 
 static int eb_reserve_vma(const struct i915_execbuffer *eb,
@@ -682,13 +676,31 @@ static int wait_for_timeline(struct intel_timeline *tl)
} while (1);
 }
 
-static int eb_reserve(struct i915_execbuffer *eb)
+static int eb_reserve_vm(struct i915_execbuffer *eb)
 {
-   const unsigned int count = eb->buffer_count;
unsigned int pin_flags = PIN_USER | PIN_NONBLOCK;
-   struct list_head last;
+   struct list_head last, unbound;
struct eb_vma *ev;
-   unsigned int i, pass;
+   unsigned int pass;
+
+   INIT_LIST_HEAD(&unbound);
+   list_for_each_entry(ev, &eb->bind_list, bind_link) {
+   struct drm_i915_gem_exec_object2 *entry = ev->exec;
+   struct i915_vma *vma = ev->vma;
+
+   if (eb_pin_vma(eb, entry, ev)) {
+   if (entry->offset != vma->node.start) {
+   entry->offset = vma->node.start | UPDATE;
+   eb->args->flags |= __EXEC_HAS_RELOC;
+   }
+   } else {
+   eb_unreserve_vma(ev);
+   list_add_tail(&ev->unbound_link, &unbound);
+   }
+   }
+
+   if (list_empty(&unbound))
+   return 0;
 
/*
 * Attempt to pin all of the buffers into the GTT.
@@ -726,7 +738,7 @@ static int eb_reserve(struct i915_execbuffer *eb)
if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
return -EINTR;
 
-   list_for_each_entry(ev, &eb->unbound, unbound_link) {
+   list_for_each_entry(ev, &unbound, unbound_link) {
err = eb_reserve_vma(eb, ev, pin_flags);
if (err)
break;
@@ -737,13 +749,11 @@ static int eb_reserve(struct i915_execbuffer *eb)
}
 
/* Resort *all* the objects into priority order */
-   INIT_LIST_HEAD(&eb->unbound);
+   INIT_LIST_HEAD(&unbound);
INIT_LIST_HEAD(&last);
-   for (i = 0; i < count; i++) {
-   unsigned int flags;
+   list_for_each_entry(ev, &eb->bind_list, bind_link) {
+   unsigned int flags = ev->flags;
 
-   ev = &eb->vma[i];
-   flags = ev->flags;
if (flags & EXEC_OBJECT_PINNED &&
flags & __EXEC_OBJECT_HAS_PIN)
continue;
@@ -752,17 +762,17 @@ static int eb_reserve(struct i915_execbuffer *eb)
 

[Intel-gfx] [PATCH 34/37] drm/i915/gt: Push the wait for the context to bound to the request

2020-08-05 Thread Chris Wilson
Rather than synchronously waiting within intel_context_pin() for the
context to be bound, we can track the pending completion of the bind
fence and only submit requests along the context once it is signaled.
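
The core idea, sketched with the existing request API (the patch
itself routes this through an i915_sw_fence so that the await is
attached once at pin time rather than per request):

        struct dma_fence *bind;

        /* order the request after the vma's pending bind, if any */
        bind = i915_active_fence_get(&vma->active.excl);
        if (bind) {
                err = i915_request_await_dma_fence(rq, bind);
                dma_fence_put(bind);
        }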

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/Makefile  |  1 +
 drivers/gpu/drm/i915/gt/intel_context.c| 80 +-
 drivers/gpu/drm/i915/gt/intel_context.h|  6 ++
 drivers/gpu/drm/i915/i915_active.h |  1 -
 drivers/gpu/drm/i915/i915_request.c|  4 ++
 drivers/gpu/drm/i915/i915_sw_fence_await.c | 62 +
 drivers/gpu/drm/i915/i915_sw_fence_await.h | 19 +
 7 files changed, 140 insertions(+), 33 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_sw_fence_await.c
 create mode 100644 drivers/gpu/drm/i915/i915_sw_fence_await.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index a3a4c8a555ec..2cf54db8b847 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -61,6 +61,7 @@ i915-y += \
i915_memcpy.o \
i915_mm.o \
i915_sw_fence.o \
+   i915_sw_fence_await.o \
i915_sw_fence_work.o \
i915_syncmap.o \
i915_user_extensions.o
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index ff3f7580d1ca..04c2f207b11d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -10,6 +10,7 @@
 
 #include "i915_drv.h"
 #include "i915_globals.h"
+#include "i915_sw_fence_await.h"
 
 #include "intel_context.h"
 #include "intel_engine.h"
@@ -140,31 +141,71 @@ intel_context_acquire_lock(struct intel_context *ce,
return 0;
 }
 
+static int await_bind(struct dma_fence_await *fence, struct i915_vma *vma)
+{
+   struct dma_fence *bind;
+   int err = 0;
+
+   bind = i915_active_fence_get(&vma->active.excl);
+   if (bind) {
+   err = i915_sw_fence_await_dma_fence(&fence->await, bind,
+   0, GFP_KERNEL);
+   dma_fence_put(bind);
+   }
+
+   return err;
+}
+
 static int intel_context_active_locked(struct intel_context *ce)
 {
+   struct dma_fence_await *fence;
int err;
 
+   fence = dma_fence_await_create(GFP_KERNEL);
+   if (!fence)
+   return -ENOMEM;
+
err = __ring_active_locked(ce->ring);
if (err)
-   return err;
+   goto out_fence;
+
+   err = await_bind(fence, ce->ring->vma);
+   if (err < 0)
+   goto err_ring;
 
err = intel_timeline_pin_locked(ce->timeline);
if (err)
goto err_ring;
 
-   if (!ce->state)
-   return 0;
-
-   err = __context_active_locked(ce->state);
-   if (err)
+   err = await_bind(fence, ce->timeline->hwsp_ggtt);
+   if (err < 0)
goto err_timeline;
 
-   return 0;
+   if (ce->state) {
+   err = __context_active_locked(ce->state);
+   if (err)
+   goto err_timeline;
+
+   err = await_bind(fence, ce->state);
+   if (err < 0)
+   goto err_state;
+   }
+
+   /* Must be the last action as it *releases* the ce->active */
+   if (atomic_read(&fence->await.pending) > 1)
+   i915_active_set_exclusive(&ce->active, &fence->dma);
 
+   err = 0;
+   goto out_fence;
+
+err_state:
+   __context_retire_state(ce->state);
 err_timeline:
intel_timeline_unpin(ce->timeline);
 err_ring:
__ring_retire(ce->ring);
+out_fence:
+   i915_sw_fence_commit(&fence->await);
return err;
 }
 
@@ -322,27 +363,6 @@ static void intel_context_active_release(struct 
intel_context *ce)
i915_active_release(&ce->active);
 }
 
-static int __intel_context_sync(struct intel_context *ce)
-{
-   int err;
-
-   err = i915_vma_wait_for_bind(ce->ring->vma);
-   if (err)
-   return err;
-
-   err = i915_vma_wait_for_bind(ce->timeline->hwsp_ggtt);
-   if (err)
-   return err;
-
-   if (ce->state) {
-   err = i915_vma_wait_for_bind(ce->state);
-   if (err)
-   return err;
-   }
-
-   return 0;
-}
-
 int __intel_context_do_pin(struct intel_context *ce)
 {
int err;
@@ -368,10 +388,6 @@ int __intel_context_do_pin(struct intel_context *ce)
}
 
if (likely(!atomic_add_unless(&ce->pin_count, 1, 0))) {
-   err = __intel_context_sync(ce);
-   if (unlikely(err))
-   goto out_unlock;
-
err = intel_context_active_acquire(ce);
if (unlikely(err))
goto out_unlock;
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 07be021882cc..f48df2784a6c 100644
--- a/drivers/gpu/drm/i915/g

[Intel-gfx] [PATCH 14/37] drm/i915: Serialise i915_vma_pin_inplace() with i915_vma_unbind()

2020-08-05 Thread Chris Wilson
Directly serialise the atomic pinning against evicting the vma from
unbind, with a pair of coupled cmpxchg loops, to avoid fighting over
vm->mutex.
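
Spelling out the coupling (a summary of the invariant, not new code):

        /*
         * pin:    succeeds only while no ERROR/OVERFLOW bit is set,
         *         publishing pin_count++ in one atomic_try_cmpxchg.
         * unbind: succeeds only while pin_count is zero, publishing
         *         I915_VMA_ERROR in one atomic_try_cmpxchg.
         *
         * Each side rereads vma->flags on failure, so one of the two
         * always observes the other's update without vm->mutex.
         */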

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_vma.c | 45 ++---
 1 file changed, 14 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index dbe11b349175..17ce0bce318e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -742,12 +742,10 @@ i915_vma_detach(struct i915_vma *vma)
 
 bool i915_vma_pin_inplace(struct i915_vma *vma, unsigned int flags)
 {
-   unsigned int bound;
-   bool pinned = true;
+   unsigned int bound = atomic_read(&vma->flags);
 
GEM_BUG_ON(flags & ~I915_VMA_BIND_MASK);
 
-   bound = atomic_read(&vma->flags);
do {
if (unlikely(flags & ~bound))
return false;
@@ -755,34 +753,10 @@ bool i915_vma_pin_inplace(struct i915_vma *vma, unsigned 
int flags)
if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR)))
return false;
 
-   if (!(bound & I915_VMA_PIN_MASK))
-   goto unpinned;
-
GEM_BUG_ON(((bound + 1) & I915_VMA_PIN_MASK) == 0);
} while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
 
return true;
-
-unpinned:
-   /*
-* If pin_count==0, but we are bound, check under the lock to avoid
-* racing with a concurrent i915_vma_unbind().
-*/
-   mutex_lock(&vma->vm->mutex);
-   do {
-   if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR))) {
-   pinned = false;
-   break;
-   }
-
-   if (unlikely(flags & ~bound)) {
-   pinned = false;
-   break;
-   }
-   } while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
-   mutex_unlock(&vma->vm->mutex);
-
-   return pinned;
 }
 
 static int vma_get_pages(struct i915_vma *vma)
@@ -1292,6 +1266,7 @@ void __i915_vma_evict(struct i915_vma *vma)
 
 int __i915_vma_unbind(struct i915_vma *vma)
 {
+   unsigned int bound;
int ret;
 
lockdep_assert_held(&vma->vm->mutex);
@@ -1299,10 +1274,18 @@ int __i915_vma_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return 0;
 
-   if (i915_vma_is_pinned(vma)) {
-   vma_print_allocator(vma, "is pinned");
-   return -EAGAIN;
-   }
+   /* Serialise with i915_vma_pin_inplace() */
+   bound = atomic_read(&vma->flags);
+   do {
+   if (unlikely(bound & I915_VMA_PIN_MASK)) {
+   vma_print_allocator(vma, "is pinned");
+   return -EAGAIN;
+   }
+
+   if (unlikely(bound & I915_VMA_ERROR))
+   break;
+   } while (!atomic_try_cmpxchg(&vma->flags,
+&bound, bound | I915_VMA_ERROR));
 
/*
 * After confirming that no one else is pinning this vma, wait for
-- 
2.20.1



[Intel-gfx] [PATCH 08/37] drm/i915/gem: Don't drop the timeline lock during execbuf

2020-08-05 Thread Chris Wilson
Our timeline lock is our defence against a concurrent execbuf
interrupting our request construction. We need to hold it throughout
or, for example, a second thread may interject a relocation request in
between our own relocation request and execution in the ring.

A second, major benefit is that it allows us to preserve a large chunk
of the ringbuffer for our exclusive use, which should virtually
eliminate the threat of hitting a wait_for_space during request
construction -- although we should have already dropped other
contentious locks at that point.
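
Conceptually, request construction becomes one critical section under
the timeline mutex (a sketch using helpers from this series, with
error handling elided):

        mutex_lock(&ce->timeline->mutex);

        rq = __i915_request_create(ce, GFP_KERNEL);
        /* ... reserve ring space, emit relocations and the batch ... */
        __i915_request_commit(rq);
        __i915_request_queue(rq, &attr);

        mutex_unlock(&ce->timeline->mutex);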

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 462 +++---
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  29 +-
 2 files changed, 312 insertions(+), 179 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9ce114d67288..2dc30dbbdbf3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -267,6 +267,8 @@ struct i915_execbuffer {
bool has_fence : 1;
bool needs_unfenced : 1;
 
+   struct intel_context *ce;
+
struct i915_vma *target;
struct i915_request *rq;
struct i915_vma *rq_vma;
@@ -650,6 +652,35 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
return 0;
 }
 
+static void retire_requests(struct intel_timeline *tl)
+{
+   struct i915_request *rq, *rn;
+
+   list_for_each_entry_safe(rq, rn, &tl->requests, link)
+   if (!i915_request_retire(rq))
+   break;
+}
+
+static int wait_for_timeline(struct intel_timeline *tl)
+{
+   do {
+   struct dma_fence *fence;
+   int err;
+
+   fence = i915_active_fence_get(&tl->last_request);
+   if (!fence)
+   return 0;
+
+   err = dma_fence_wait(fence, true);
+   dma_fence_put(fence);
+   if (err)
+   return err;
+
+   /* Retiring may trigger a barrier, requiring an extra pass */
+   retire_requests(tl);
+   } while (1);
+}
+
 static int eb_reserve(struct i915_execbuffer *eb)
 {
const unsigned int count = eb->buffer_count;
@@ -657,7 +688,6 @@ static int eb_reserve(struct i915_execbuffer *eb)
struct list_head last;
struct eb_vma *ev;
unsigned int i, pass;
-   int err = 0;
 
/*
 * Attempt to pin all of the buffers into the GTT.
@@ -673,18 +703,37 @@ static int eb_reserve(struct i915_execbuffer *eb)
 * room for the earlier objects *unless* we need to defragment.
 */
 
-   if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
-   return -EINTR;
-
pass = 0;
do {
+   int err = 0;
+
+   /*
+* We need to hold one lock as we bind all the vma so that
+* we have a consistent view of the entire vm and can plan
+* evictions to fill the whole GTT. If we allow a second
+* thread to run as we do this, it will either unbind
+* everything we want pinned, or steal space that we need for
+* ourselves. The closer we are to a full GTT, the more likely
+* such contention will cause us to fail to bind the workload
+* for this batch. Since we know at this point we need to
+* find space for new buffers, we know that extra pressure
+* from contention is likely.
+*
+* In lieu of being able to hold vm->mutex for the entire
+* sequence (it's complicated!), we opt for struct_mutex.
+*/
+   if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
+   return -EINTR;
+
list_for_each_entry(ev, &eb->unbound, bind_link) {
err = eb_reserve_vma(eb, ev, pin_flags);
if (err)
break;
}
-   if (!(err == -ENOSPC || err == -EAGAIN))
-   break;
+   if (!(err == -ENOSPC || err == -EAGAIN)) {
+   mutex_unlock(&eb->i915->drm.struct_mutex);
+   return err;
+   }
 
/* Resort *all* the objects into priority order */
INIT_LIST_HEAD(&eb->unbound);
@@ -713,38 +762,50 @@ static int eb_reserve(struct i915_execbuffer *eb)
list_add_tail(&ev->bind_link, &last);
}
list_splice_tail(&last, &eb->unbound);
+   mutex_unlock(&eb->i915->drm.struct_mutex);
 
if (err == -EAGAIN) {
-   mutex_unlock(&eb->i915->drm.struct_mutex);
flu

[Intel-gfx] [PATCH 18/37] drm/i915/gem: Separate the ww_mutex walker into its own list

2020-08-05 Thread Chris Wilson
In preparation for making eb_vma bigger and heavier to run in
parallel, we need to stop applying an in-place swap() to reorder
around ww_mutex deadlocks. Keep the array intact and reorder the locks
using a dedicated list.

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 83 ---
 1 file changed, 54 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 62a1de1dd238..32d23718ee1e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -38,6 +38,7 @@ struct eb_vma {
struct list_head bind_link;
struct list_head unbound_link;
struct list_head reloc_link;
+   struct list_head submit_link;
 
struct hlist_node node;
u32 handle;
@@ -256,6 +257,8 @@ struct i915_execbuffer {
/** list of vma that have execobj.relocation_count */
struct list_head relocs_list;
 
+   struct list_head submit_list;
+
/**
 * Track the most recently used object for relocations, as we
 * frequently have to perform multiple relocations within the same
@@ -353,6 +356,42 @@ static void eb_vma_array_put(struct eb_vma_array *arr)
kref_put(&arr->kref, eb_vma_array_destroy);
 }
 
+static int
+eb_lock_vma(struct i915_execbuffer *eb, struct ww_acquire_ctx *acquire)
+{
+   struct eb_vma *ev;
+   int err = 0;
+
+   list_for_each_entry(ev, &eb->submit_list, submit_link) {
+   struct i915_vma *vma = ev->vma;
+
+   err = ww_mutex_lock_interruptible(&vma->resv->lock, acquire);
+   if (err == -EDEADLK) {
+   struct eb_vma *unlock = ev, *en;
+
+   list_for_each_entry_safe_continue_reverse(unlock, en,
+ 
&eb->submit_list,
+ submit_link) {
+   ww_mutex_unlock(&unlock->vma->resv->lock);
+   list_move_tail(&unlock->submit_link, 
&eb->submit_list);
+   }
+
+   GEM_BUG_ON(!list_is_first(&ev->submit_link, 
&eb->submit_list));
+   err = ww_mutex_lock_slow_interruptible(&vma->resv->lock,
+  acquire);
+   }
+   if (err) {
+   list_for_each_entry_continue_reverse(ev,
+&eb->submit_list,
+submit_link)
+   ww_mutex_unlock(&ev->vma->resv->lock);
+   break;
+   }
+   }
+
+   return err;
+}
+
 static int eb_create(struct i915_execbuffer *eb)
 {
/* Allocate an extra slot for use by the command parser + sentinel */
@@ -405,6 +444,10 @@ static int eb_create(struct i915_execbuffer *eb)
eb->lut_size = -eb->buffer_count;
}
 
+   INIT_LIST_HEAD(&eb->bind_list);
+   INIT_LIST_HEAD(&eb->submit_list);
+   INIT_LIST_HEAD(&eb->relocs_list);
+
return 0;
 }
 
@@ -572,6 +615,7 @@ eb_add_vma(struct i915_execbuffer *eb,
}
 
list_add_tail(&ev->bind_link, &eb->bind_list);
+   list_add_tail(&ev->submit_link, &eb->submit_list);
 
if (entry->relocation_count)
list_add_tail(&ev->reloc_link, &eb->relocs_list);
@@ -938,9 +982,6 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
unsigned int i;
int err = 0;
 
-   INIT_LIST_HEAD(&eb->bind_list);
-   INIT_LIST_HEAD(&eb->relocs_list);
-
for (i = 0; i < eb->buffer_count; i++) {
struct i915_vma *vma;
 
@@ -1613,38 +1654,19 @@ static int eb_relocate(struct i915_execbuffer *eb)
 
 static int eb_move_to_gpu(struct i915_execbuffer *eb)
 {
-   const unsigned int count = eb->buffer_count;
struct ww_acquire_ctx acquire;
-   unsigned int i;
+   struct eb_vma *ev;
int err = 0;
 
ww_acquire_init(&acquire, &reservation_ww_class);
 
-   for (i = 0; i < count; i++) {
-   struct eb_vma *ev = &eb->vma[i];
-   struct i915_vma *vma = ev->vma;
-
-   err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire);
-   if (err == -EDEADLK) {
-   GEM_BUG_ON(i == 0);
-   do {
-   int j = i - 1;
-
-   ww_mutex_unlock(&eb->vma[j].vma->resv->lock);
-
-   swap(eb->vma[i],  eb->vma[j]);
-   } while (--i);
+   err = eb_lock_vma(eb, &acquire);
+   if (err)
+   goto err_fini;
 
-   err = ww_mutex_lock_slow_interr

[Intel-gfx] [PATCH 17/37] drm/i915/gem: Assign context id for async work

2020-08-05 Thread Chris Wilson
Allocate a few dma-fence context ids that we can associate with the
async work [for the CPU] launched on behalf of this context. For extra
fun, we allow a configurable concurrency width.

A current example would be that we spawn an unbound worker for every
userptr get_pages. In the future, we wish to charge this work to the
context that initiated the async work and to impose concurrency limits
based on the context.
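
For example, with 8 online CPUs (so async.width is the power of two 8,
then decremented to serve as a mask of 7):

        ctx->async.context = dma_fence_context_alloc(8); /* ids N..N+7 */
        ctx->async.width = 8 - 1;                        /* mask */

        /*
         * Successive i915_gem_context_async_id() calls return
         * N, N+1, ..., N+7, N, ... spreading the context's async
         * work across eight fence timelines.
         */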

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 4 
 drivers/gpu/drm/i915/gem/i915_gem_context.h   | 6 ++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++
 3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index db893f6c516b..bc80e7d3c50a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -721,6 +721,10 @@ __create_context(struct drm_i915_private *i915)
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->link);
 
+   ctx->async.width = rounddown_pow_of_two(num_online_cpus());
+   ctx->async.context = dma_fence_context_alloc(ctx->async.width);
+   ctx->async.width--;
+
spin_lock_init(&ctx->stale.lock);
INIT_LIST_HEAD(&ctx->stale.engines);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index a133f92bbedb..f254458a795e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
 
+static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx)
+{
+   return (ctx->async.context +
+   (atomic_fetch_inc(&ctx->async.cur) & ctx->async.width));
+}
+
 static inline struct i915_gem_context *
 i915_gem_context_get(struct i915_gem_context *ctx)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index ae14ca24a11f..52561f98000f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -85,6 +85,12 @@ struct i915_gem_context {
 
struct intel_timeline *timeline;
 
+   struct {
+   u64 context;
+   atomic_t cur;
+   unsigned int width;
+   } async;
+
/**
 * @vm: unique address space (GTT)
 *
-- 
2.20.1



[Intel-gfx] [PATCH 01/37] drm/i915/gem: Reduce context termination list iteration guard to RCU

2020-08-05 Thread Chris Wilson
As we now protect the timeline list using RCU, we can drop the
timeline->mutex for guarding the list iteration during context close, as
we are searching for an inflight request. Any new request will see the
context is banned and not be submitted. In doing so, pull the checks
for a concurrent submission of the request (notably
i915_request_completed()) under the engine spinlock, to fully serialise
with __i915_request_submit(). That is, in the case of preempt-to-busy,
where the request may be completed during __i915_request_submit(), we
need to be careful to sample the request status after serialising so
that we do not miss the request the engine is actually submitting.

Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()")
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 32 -
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d8cccbab7a51..db893f6c516b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -439,29 +439,36 @@ static bool __cancel_engine(struct intel_engine_cs 
*engine)
return __reset_engine(engine);
 }
 
-static struct intel_engine_cs *__active_engine(struct i915_request *rq)
+static bool
+__active_engine(struct i915_request *rq, struct intel_engine_cs **active)
 {
struct intel_engine_cs *engine, *locked;
+   bool ret = false;
 
/*
 * Serialise with __i915_request_submit() so that it sees
 * is-banned?, or we know the request is already inflight.
+*
+* Note that rq->engine is unstable, and so we double
+* check that we have acquired the lock on the final engine.
 */
locked = READ_ONCE(rq->engine);
spin_lock_irq(&locked->active.lock);
while (unlikely(locked != (engine = READ_ONCE(rq->engine {
spin_unlock(&locked->active.lock);
-   spin_lock(&engine->active.lock);
locked = engine;
+   spin_lock(&locked->active.lock);
}
 
-   engine = NULL;
-   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
-   engine = rq->engine;
+   if (!i915_request_completed(rq)) {
+   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
+   *active = locked;
+   ret = true;
+   }
 
spin_unlock_irq(&locked->active.lock);
 
-   return engine;
+   return ret;
 }
 
 static struct intel_engine_cs *active_engine(struct intel_context *ce)
@@ -472,17 +479,16 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
if (!ce->timeline)
return NULL;
 
-   mutex_lock(&ce->timeline->mutex);
-   list_for_each_entry_reverse(rq, &ce->timeline->requests, link) {
-   if (i915_request_completed(rq))
-   break;
+   rcu_read_lock();
+   list_for_each_entry_rcu(rq, &ce->timeline->requests, link) {
+   if (i915_request_is_active(rq) && i915_request_completed(rq))
+   continue;
 
/* Check with the backend if the request is inflight */
-   engine = __active_engine(rq);
-   if (engine)
+   if (__active_engine(rq, &engine))
break;
}
-   mutex_unlock(&ce->timeline->mutex);
+   rcu_read_unlock();
 
return engine;
 }
-- 
2.20.1



[Intel-gfx] [PATCH 31/37] drm/i915/gt: Refactor heartbeat request construction and submission

2020-08-05 Thread Chris Wilson
Pull the individual strands of creating a custom heartbeat request
into a pair of common functions. This will reduce the number of
changes we will need to make in future.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gt/intel_engine_heartbeat.c  | 59 +--
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c 
b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
index 8ffdf676c0a0..377cbfdb3355 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c
@@ -37,12 +37,33 @@ static bool next_heartbeat(struct intel_engine_cs *engine)
return true;
 }
 
+static struct i915_request *
+heartbeat_create(struct intel_context *ce, gfp_t gfp)
+{
+   struct i915_request *rq;
+
+   intel_context_enter(ce);
+   rq = __i915_request_create(ce, gfp);
+   intel_context_exit(ce);
+
+   return rq;
+}
+
 static void idle_pulse(struct intel_engine_cs *engine, struct i915_request *rq)
 {
engine->wakeref_serial = READ_ONCE(engine->serial) + 1;
i915_request_add_active_barriers(rq);
 }
 
+static void heartbeat_commit(struct i915_request *rq,
+const struct i915_sched_attr *attr)
+{
+   idle_pulse(rq->engine, rq);
+
+   __i915_request_commit(rq);
+   __i915_request_queue(rq, attr);
+}
+
 static void show_heartbeat(const struct i915_request *rq,
   struct intel_engine_cs *engine)
 {
@@ -137,18 +158,14 @@ static void heartbeat(struct work_struct *wrk)
goto out;
}
 
-   intel_context_enter(ce);
-   rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
-   intel_context_exit(ce);
+   rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN);
if (IS_ERR(rq))
goto unlock;
 
-   idle_pulse(engine, rq);
if (engine->i915->params.enable_hangcheck)
engine->heartbeat.systole = i915_request_get(rq);
 
-   __i915_request_commit(rq);
-   __i915_request_queue(rq, &attr);
+   heartbeat_commit(rq, &attr);
 
 unlock:
mutex_unlock(&ce->timeline->mutex);
@@ -220,19 +237,14 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
goto out_rpm;
}
 
-   intel_context_enter(ce);
-   rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
-   intel_context_exit(ce);
+   rq = heartbeat_create(ce, GFP_NOWAIT | __GFP_NOWARN);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto out_unlock;
}
 
__set_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags);
-   idle_pulse(engine, rq);
-
-   __i915_request_commit(rq);
-   __i915_request_queue(rq, &attr);
+   heartbeat_commit(rq, &attr);
GEM_BUG_ON(rq->sched.attr.priority < I915_PRIORITY_BARRIER);
err = 0;
 
@@ -245,8 +257,12 @@ int intel_engine_pulse(struct intel_engine_cs *engine)
 
 int intel_engine_flush_barriers(struct intel_engine_cs *engine)
 {
+   struct i915_sched_attr attr = {
+   .priority = I915_USER_PRIORITY(I915_PRIORITY_MIN),
+   };
+   struct intel_context *ce = engine->kernel_context;
struct i915_request *rq;
-   int err = 0;
+   int err;
 
if (llist_empty(&engine->barrier_tasks))
return 0;
@@ -254,15 +270,22 @@ int intel_engine_flush_barriers(struct intel_engine_cs 
*engine)
if (!intel_engine_pm_get_if_awake(engine))
return 0;
 
-   rq = i915_request_create(engine->kernel_context);
+   if (mutex_lock_interruptible(&ce->timeline->mutex)) {
+   err = -EINTR;
+   goto out_rpm;
+   }
+
+   rq = heartbeat_create(ce, GFP_KERNEL);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
-   goto out_rpm;
+   goto out_unlock;
}
 
-   idle_pulse(engine, rq);
-   i915_request_add(rq);
+   heartbeat_commit(rq, &attr);
 
+   err = 0;
+out_unlock:
+   mutex_unlock(&ce->timeline->mutex);
 out_rpm:
intel_engine_pm_put(engine);
return err;
-- 
2.20.1



[Intel-gfx] [PATCH 07/37] drm/i915/gt: Split the breadcrumb spinlock between global and contexts

2020-08-05 Thread Chris Wilson
As we funnel more and more contexts into the breadcrumbs on an engine,
the hold time of b->irq_lock grows. As we may then contend with the
b->irq_lock during request submission, this increases the burden upon
the engine->active.lock and so directly impacts both our execution
latency and client latency. If we split the b->irq_lock by introducing a
per-context spinlock to manage the signalers within a context, we then
only need the b->irq_lock for enabling/disabling the interrupt and can
avoid taking the lock for walking the list of contexts within the signal
worker. Even with the current setup, this greatly reduces the number of
times we have to take and fight for b->irq_lock.
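
The resulting division of the locks, as inferred from the patch:

        /*
         * b->signalers_lock -- insertion into the engine's signalers
         * b->irq_lock       -- arming/disarming the user interrupt
         * ce->signal_lock   -- the per-context list of breadcrumbs
         *
         * The signal worker walks b->signalers under rcu_read_lock()
         * with spin_trylock(&ce->signal_lock), skipping contended
         * contexts until the next interrupt.
         */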

Fixes: bda4d4db6dd6 ("drm/i915/gt: Replace 
intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 157 ++
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |   6 +-
 drivers/gpu/drm/i915/gt/intel_context.c   |   1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   1 +
 4 files changed, 89 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index ae8895b48eca..8802b47fbd8f 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -100,15 +100,16 @@ static void __intel_breadcrumbs_disarm_irq(struct 
intel_breadcrumbs *b)
 static void add_signaling_context(struct intel_breadcrumbs *b,
  struct intel_context *ce)
 {
-   intel_context_get(ce);
-   list_add_tail(&ce->signal_link, &b->signalers);
+   lockdep_assert_held(&b->signalers_lock);
+   list_add_rcu(&ce->signal_link, &b->signalers);
 }
 
 static void remove_signaling_context(struct intel_breadcrumbs *b,
 struct intel_context *ce)
 {
-   list_del(&ce->signal_link);
-   intel_context_put(ce);
+   spin_lock(&b->signalers_lock);
+   list_del_rcu(&ce->signal_link);
+   spin_unlock(&b->signalers_lock);
 }
 
 static inline bool __request_completed(const struct i915_request *rq)
@@ -184,15 +185,12 @@ static void signal_irq_work(struct irq_work *work)
struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
const ktime_t timestamp = ktime_get();
struct llist_node *signal, *sn;
-   struct intel_context *ce, *cn;
-   struct list_head *pos, *next;
+   struct intel_context *ce;
 
signal = NULL;
if (unlikely(!llist_empty(&b->signaled_requests)))
signal = llist_del_all(&b->signaled_requests);
 
-   spin_lock(&b->irq_lock);
-
/*
 * Keep the irq armed until the interrupt after all listeners are gone.
 *
@@ -216,11 +214,23 @@ static void signal_irq_work(struct irq_work *work)
 * interrupt draw less ire from other users of the system and tools
 * like powertop.
 */
-   if (!signal && b->irq_armed && list_empty(&b->signalers))
-   __intel_breadcrumbs_disarm_irq(b);
+   if (!signal && READ_ONCE(b->irq_armed) && list_empty(&b->signalers)) {
+   spin_lock(&b->irq_lock);
+   if (b->irq_armed)
+   __intel_breadcrumbs_disarm_irq(b);
+   spin_unlock(&b->irq_lock);
+   }
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(ce, &b->signalers, signal_link) {
+   struct list_head *pos, *next;
+   bool release = false;
 
-   list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
-   GEM_BUG_ON(list_empty(&ce->signals));
+   if (!spin_trylock(&ce->signal_lock))
+   continue;
+
+   if (list_empty(&ce->signals))
+   goto unlock;
 
list_for_each_safe(pos, next, &ce->signals) {
struct i915_request *rq =
@@ -253,11 +263,16 @@ static void signal_irq_work(struct irq_work *work)
if (&ce->signals == pos) { /* now empty */
add_retire(b, ce->timeline);
remove_signaling_context(b, ce);
+   release = true;
}
}
-   }
 
-   spin_unlock(&b->irq_lock);
+unlock:
+   spin_unlock(&ce->signal_lock);
+   if (release)
+   intel_context_put(ce);
+   }
+   rcu_read_unlock();
 
llist_for_each_safe(signal, sn, signal) {
struct i915_request *rq =
@@ -292,14 +307,15 @@ intel_breadcrumbs_create(struct intel_engine_cs 
*irq_engine)
if (!b)
return NULL;
 
-   spin_lock_init(&b->irq_lock);
+   b->irq_engine = irq_engine;
+
+   spin_lock_init(&b->signalers_lock);
INIT_LIST_HEAD(&b->signalers);
init_llist_head(&b->signaled_requests);
 
+   spin_lock_init(&b->irq_lock);
  

[Intel-gfx] [PATCH 05/37] drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock

2020-08-05 Thread Chris Wilson
Make b->signaled_requests a lockless list so that we can manipulate
it outside of b->irq_lock.
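
The resulting pairing (condensed from the hunks below):

        /* producer, any context, no lock held: */
        if (llist_add(&rq->signal_node, &b->signaled_requests))
                irq_work_queue(&b->irq_work); /* first entry kicks it */

        /* consumer, signal_irq_work(): */
        signal = llist_del_all(&b->signaled_requests);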

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_breadcrumbs.c   | 28 +++
 .../gpu/drm/i915/gt/intel_breadcrumbs_types.h |  2 +-
 drivers/gpu/drm/i915/i915_request.h   |  6 +++-
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index dee6d5c9b413..9710d09e7670 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -169,16 +169,13 @@ static void add_retire(struct intel_breadcrumbs *b, 
struct intel_timeline *tl)
intel_engine_add_retire(b->irq_engine, tl);
 }
 
-static bool __signal_request(struct i915_request *rq, struct list_head 
*signals)
+static bool __signal_request(struct i915_request *rq)
 {
-   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
-
if (!__dma_fence_signal(&rq->fence)) {
i915_request_put(rq);
return false;
}
 
-   list_add_tail(&rq->signal_link, signals);
return true;
 }
 
@@ -186,9 +183,13 @@ static void signal_irq_work(struct irq_work *work)
 {
struct intel_breadcrumbs *b = container_of(work, typeof(*b), irq_work);
const ktime_t timestamp = ktime_get();
+   struct llist_node *signal, *sn;
struct intel_context *ce, *cn;
struct list_head *pos, *next;
-   LIST_HEAD(signal);
+
+   signal = NULL;
+   if (unlikely(!llist_empty(&b->signaled_requests)))
+   signal = llist_del_all(&b->signaled_requests);
 
spin_lock(&b->irq_lock);
 
@@ -218,8 +219,6 @@ static void signal_irq_work(struct irq_work *work)
if (b->irq_armed && list_empty(&b->signalers))
__intel_breadcrumbs_disarm_irq(b);
 
-   list_splice_init(&b->signaled_requests, &signal);
-
list_for_each_entry_safe(ce, cn, &b->signalers, signal_link) {
GEM_BUG_ON(list_empty(&ce->signals));
 
@@ -236,7 +235,11 @@ static void signal_irq_work(struct irq_work *work)
 * spinlock as the callback chain may end up adding
 * more signalers to the same context or engine.
 */
-   __signal_request(rq, &signal);
+   clear_bit(I915_FENCE_FLAG_SIGNAL, &rq->fence.flags);
+   if (__signal_request(rq)) {
+   rq->signal_node.next = signal;
+   signal = &rq->signal_node;
+   }
}
 
/*
@@ -256,9 +259,9 @@ static void signal_irq_work(struct irq_work *work)
 
spin_unlock(&b->irq_lock);
 
-   list_for_each_safe(pos, next, &signal) {
+   llist_for_each_safe(signal, sn, signal) {
struct i915_request *rq =
-   list_entry(pos, typeof(*rq), signal_link);
+   llist_entry(signal, typeof(*rq), signal_node);
struct list_head cb_list;
 
spin_lock(&rq->lock);
@@ -291,7 +294,7 @@ intel_breadcrumbs_create(struct intel_engine_cs *irq_engine)
 
spin_lock_init(&b->irq_lock);
INIT_LIST_HEAD(&b->signalers);
-   INIT_LIST_HEAD(&b->signaled_requests);
+   init_llist_head(&b->signaled_requests);
 
init_irq_work(&b->irq_work, signal_irq_work);
 
@@ -346,7 +349,8 @@ static void insert_breadcrumb(struct i915_request *rq,
 * its signal completion.
 */
if (__request_completed(rq)) {
-   if (__signal_request(rq, &b->signaled_requests))
+   if (__signal_request(rq) &&
+   llist_add(&rq->signal_node, &b->signaled_requests))
irq_work_queue(&b->irq_work);
return;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h 
b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
index 8e53b9942695..3fa19820b37a 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
@@ -35,7 +35,7 @@ struct intel_breadcrumbs {
struct intel_engine_cs *irq_engine;
 
struct list_head signalers;
-   struct list_head signaled_requests;
+   struct llist_head signaled_requests;
 
struct irq_work irq_work; /* for use from inside irq_lock */
 
diff --git a/drivers/gpu/drm/i915/i915_request.h 
b/drivers/gpu/drm/i915/i915_request.h
index 16b721080195..874af6db6103 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -176,7 +176,11 @@ struct i915_request {
struct intel_context *context;
struct intel_ring *ring;
struct intel_timeline __rcu *timeline;
-   struct list_head signal_link;
+
+   union {
+   struct list_head signal_link;
+   struct llist_node signal_node;
+ 

[Intel-gfx] [PATCH 02/37] drm/i915/gt: Protect context lifetime with RCU

2020-08-05 Thread Chris Wilson
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.
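
As a sketch of what the caveat means for readers (the standard
typesafe-by-RCU pattern; lookup() is a placeholder):

        rcu_read_lock();
        ce = lookup(...);                  /* slot may be recycled */
        if (ce && !kref_get_unless_zero(&ce->ref))
                ce = NULL;                 /* already dead, retry */
        /*
         * On success, revalidate identity: the slot may have been
         * freed and reallocated as a different context meanwhile.
         */
        rcu_read_unlock();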

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c | 330 +---
 drivers/gpu/drm/i915/i915_active.c  |  10 +
 drivers/gpu/drm/i915/i915_active.h  |   2 +
 drivers/gpu/drm/i915/i915_utils.h   |   7 +
 4 files changed, 202 insertions(+), 147 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 52db2bde44a3..4e7924640ffa 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -22,7 +22,7 @@ static struct i915_global_context {
 
 static struct intel_context *intel_context_alloc(void)
 {
-   return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+   return kmem_cache_alloc(global.slab_ce, GFP_KERNEL);
 }
 
 void intel_context_free(struct intel_context *ce)
@@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce)
kmem_cache_free(global.slab_ce, ce);
 }
 
+static int __context_pin_state(struct i915_vma *vma)
+{
+   unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
+   int err;
+
+   err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&vma->active);
+   if (err)
+   goto err_unpin;
+
+   /*
+* And mark it as a globally pinned object to let the shrinker know
+* it cannot reclaim the object until we release it.
+*/
+   i915_vma_make_unshrinkable(vma);
+   vma->obj->mm.dirty = true;
+
+   return 0;
+
+err_unpin:
+   i915_vma_unpin(vma);
+   return err;
+}
+
+static void __context_unpin_state(struct i915_vma *vma)
+{
+   i915_vma_make_shrinkable(vma);
+   i915_active_release(&vma->active);
+   __i915_vma_unpin(vma);
+}
+
+static int __ring_active(struct intel_ring *ring)
+{
+   int err;
+
+   err = intel_ring_pin(ring);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&ring->vma->active);
+   if (err)
+   goto err_pin;
+
+   return 0;
+
+err_pin:
+   intel_ring_unpin(ring);
+   return err;
+}
+
+static void __ring_retire(struct intel_ring *ring)
+{
+   i915_active_release(&ring->vma->active);
+   intel_ring_unpin(ring);
+}
+
+__i915_active_call
+static void __intel_context_retire(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+
+   CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n",
+intel_context_get_total_runtime_ns(ce),
+intel_context_get_avg_runtime_ns(ce));
+
+   set_bit(CONTEXT_VALID_BIT, &ce->flags);
+   if (ce->state)
+   __context_unpin_state(ce->state);
+
+   intel_timeline_unpin(ce->timeline);
+   __ring_retire(ce->ring);
+
+   intel_context_put(ce);
+}
+
+static int __intel_context_active(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+   int err;
+
+   CE_TRACE(ce, "active\n");
+
+   intel_context_get(ce);
+
+   err = __ring_active(ce->ring);
+   if (err)
+   goto err_put;
+
+   err = intel_timeline_pin(ce->timeline);
+   if (err)
+   goto err_ring;
+
+   if (!ce->state)
+   return 0;
+
+   err = __context_pin_state(ce->state);
+   if (err)
+   goto err_timeline;
+
+   return 0;
+
+err_timeline:
+   intel_timeline_unpin(ce->timeline);
+err_ring:
+   __ring_retire(ce->ring);
+err_put:
+   intel_context_put(ce);
+   return err;
+}
+
+static void __intel_context_ctor(void *arg)
+{
+   struct intel_context *ce = arg;
+
+   INIT_LIST_HEAD(&ce->signal_link);
+   INIT_LIST_HEAD(&ce->signals);
+
+   atomic_set(&ce->pin_count, 0);
+   mutex_init(&ce->pin_mutex);
+
+   ce->active_count = 0;
+   i915_active_init(&ce->active,
+__intel_context_active, __intel_context_retire);
+
+   ce->inflight = NULL;
+   ce->lrc_reg_state = NULL;
+   ce->lrc.desc = 0;
+}
+
+static void
+__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
+{
+   GEM_BUG_ON(!engine->cops);
+   GEM_BUG_ON(!engine->gt->vm);
+
+   kref_init(&ce->ref);
+   i915_active_reinit(&ce->active);
+   mutex_reinit(&ce->pin_mutex);
+
+   ce->engine = engine;
+   ce->ops = engine->cops;
+   ce->sseu = engine->sseu;
+
+   ce->wa_bb_page = 0;
+   ce->flags = 0;
+   ce->tag = 0;
+
+

[Intel-gfx] [PATCH 16/37] drm/i915: Always defer fenced work to the worker

2020-08-05 Thread Chris Wilson
Currently, if an error is raised we always call the cleanup locally
[and skip the main work callback]. However, some future users may need
to take a mutex to cleanup and so we cannot immediately execute the
cleanup as we may still be in interrupt context. For example, if we have
committed sensitive changes [like evicting from the ppGTT layout] that
are visible but gated behind the fence, we need to ensure those changes
are completed even after an error. [This does suggest the split between
the work/release callback is artificial and we may be able to simplify
the worker api by only requiring a single callback.]

With the execute-immediate flag, this should still result in the
immediate cleanup of an error in most cases.
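
For reference, the intended shape of a user after this change is roughly
as below; the callback names are hypothetical, while the ops structure
and its work/release hooks match the clflush worker later in this
series:

static int my_work(struct dma_fence_work *base)
{
	/* skipped entirely if base->dma.error was already set */
	return do_the_thing(base); /* hypothetical */
}

static void my_release(struct dma_fence_work *base)
{
	/* now always run from the worker, so taking a mutex is safe */
	cleanup_the_thing(base); /* hypothetical */
}

static const struct dma_fence_work_ops my_ops = {
	.work = my_work,
	.release = my_release,
};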

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_sw_fence_work.c | 26 +++
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
b/drivers/gpu/drm/i915/i915_sw_fence_work.c
index a3a81bb8f2c3..e094fd0a4202 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -16,11 +16,14 @@ static void fence_complete(struct dma_fence_work *f)
 static void fence_work(struct work_struct *work)
 {
struct dma_fence_work *f = container_of(work, typeof(*f), work);
-   int err;
 
-   err = f->ops->work(f);
-   if (err)
-   dma_fence_set_error(&f->dma, err);
+   if (!f->dma.error) {
+   int err;
+
+   err = f->ops->work(f);
+   if (err)
+   dma_fence_set_error(&f->dma, err);
+   }
 
fence_complete(f);
dma_fence_put(&f->dma);
@@ -36,15 +39,10 @@ fence_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
if (fence->error)
dma_fence_set_error(&f->dma, fence->error);
 
-   if (!f->dma.error) {
-   dma_fence_get(&f->dma);
-   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-   fence_work(&f->work);
-   else
-   queue_work(system_unbound_wq, &f->work);
-   } else {
-   fence_complete(f);
-   }
+   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+   fence_work(&f->work);
+   else
+   queue_work(system_unbound_wq, &f->work);
break;
 
case FENCE_FREE:
@@ -91,6 +89,8 @@ void dma_fence_work_init(struct dma_fence_work *f,
dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
i915_sw_fence_init(&f->chain, fence_notify);
INIT_WORK(&f->work, fence_work);
+
+   dma_fence_get(&f->dma); /* once for the chain; once for the work */
 }
 
 int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 15/37] drm/i915: Add list_for_each_entry_safe_continue_reverse

2020-08-05 Thread Chris Wilson
One more list iterator variant, for when we want to unwind from inside
one list iterator with the intention of restarting from the current
entry as the new head of the list.
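
Its typical use, as seen later in this series when backing off from a
contended ww_mutex, is to unlock everything acquired before the current
entry (struct item, try_lock and unlock_item are hypothetical here):

	list_for_each_entry(pos, &head, link) {
		err = try_lock(pos);
		if (err) {
			struct item *unlock = pos, *n;

			/* unwind the locks taken before pos */
			list_for_each_entry_safe_continue_reverse(unlock, n,
								  &head, link)
				unlock_item(unlock);
			break;
		}
	}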

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_utils.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index ef8db3aa75c7..3873834f2316 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -266,6 +266,12 @@ static inline int list_is_last_rcu(const struct list_head 
*list,
return READ_ONCE(list->next) == head;
 }
 
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)
\
+   for (pos = list_prev_entry(pos, member),\
+n = list_prev_entry(pos, member);  \
+&pos->member != (head);\
+pos = n, n = list_prev_entry(n, member))
+
 static inline unsigned long msecs_to_jiffies_timeout(const unsigned int m)
 {
unsigned long j = msecs_to_jiffies(m);
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 10/37] drm/i915/gem: Rename the list of relocations to reloc_list

2020-08-05 Thread Chris Wilson
Continuing the theme of calling the lists a foo_list, rename the relocs
list. This frees the name relocs for the old reloc_cache, which is no
longer a cache.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index a5b63ae17241..e7e16c62df1c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,7 +252,7 @@ struct i915_execbuffer {
struct list_head unbound;
 
/** list of vma that have execobj.relocation_count */
-   struct list_head relocs;
+   struct list_head relocs_list;
 
/**
 * Track the most recently used object for relocations, as we
@@ -577,7 +577,7 @@ eb_add_vma(struct i915_execbuffer *eb,
}
 
if (entry->relocation_count)
-   list_add_tail(&ev->reloc_link, &eb->relocs);
+   list_add_tail(&ev->reloc_link, &eb->relocs_list);
 
/*
 * SNA is doing fancy tricks with compressing batch buffers, which leads
@@ -932,7 +932,7 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
unsigned int i;
int err = 0;
 
-   INIT_LIST_HEAD(&eb->relocs);
+   INIT_LIST_HEAD(&eb->relocs_list);
INIT_LIST_HEAD(&eb->unbound);
 
for (i = 0; i < eb->buffer_count; i++) {
@@ -1592,7 +1592,7 @@ static int eb_relocate(struct i915_execbuffer *eb)
struct eb_vma *ev;
int flush;
 
-   list_for_each_entry(ev, &eb->relocs, reloc_link) {
+   list_for_each_entry(ev, &eb->relocs_list, reloc_link) {
err = eb_relocate_vma(eb, ev);
if (err)
break;
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 13/37] drm/i915/gem: Remove the call for no-evict i915_vma_pin

2020-08-05 Thread Chris Wilson
Remove the stub i915_vma_pin() used for incrementally pinning objects for
execbuf (under the severe restriction that they must not wait on a
resource as we may have already pinned it) and replace it with a
i915_vma_pin_inplace() that is only allowed to reclaim the currently
bound location for the vma (and will never wait for a pinned resource).

v2: Bail out if fences are in use.
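
A minimal sketch of the idea behind i915_vma_pin_inplace(), using
hypothetical flag names (VMA_BOUND, VMA_PIN_MASK) rather than the
driver's: atomically take a pin count only while the vma is already
bound, and never wait:

#include <linux/atomic.h>
#include <linux/bits.h>

#define VMA_BOUND	BIT(10)		/* hypothetical */
#define VMA_PIN_MASK	(BIT(10) - 1)	/* hypothetical */

static bool pin_inplace(atomic_t *flags)
{
	int old = atomic_read(flags);

	do {
		if (!(old & VMA_BOUND))
			return false; /* not bound: caller takes the slow path */
		if ((old & VMA_PIN_MASK) == VMA_PIN_MASK)
			return false; /* pin count saturated */
	} while (!atomic_try_cmpxchg(flags, &old, old + 1));

	return true; /* pinned without waiting on any resource */
}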

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 55 +--
 drivers/gpu/drm/i915/i915_vma.c   |  6 +-
 drivers/gpu/drm/i915/i915_vma.h   |  2 +
 3 files changed, 31 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 2f6fa8b3a805..62a1de1dd238 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -464,49 +464,41 @@ static u64 eb_pin_flags(const struct 
drm_i915_gem_exec_object2 *entry,
return pin_flags;
 }
 
+static bool eb_pin_vma_fence_inplace(struct eb_vma *ev)
+{
+   return false; /* We need to add some new fence serialisation */
+}
+
 static inline bool
-eb_pin_vma(struct i915_execbuffer *eb,
-  const struct drm_i915_gem_exec_object2 *entry,
-  struct eb_vma *ev)
+eb_pin_vma_inplace(struct i915_execbuffer *eb,
+  const struct drm_i915_gem_exec_object2 *entry,
+  struct eb_vma *ev)
 {
struct i915_vma *vma = ev->vma;
-   u64 pin_flags;
+   unsigned int pin_flags;
 
-   if (vma->node.size)
-   pin_flags = vma->node.start;
-   else
-   pin_flags = entry->offset & PIN_OFFSET_MASK;
+   if (eb_vma_misplaced(entry, vma, ev->flags))
+   return false;
 
-   pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED;
+   pin_flags = PIN_USER;
if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_GTT))
pin_flags |= PIN_GLOBAL;
 
/* Attempt to reuse the current location if available */
-   if (unlikely(i915_vma_pin(vma, 0, 0, pin_flags))) {
-   if (entry->flags & EXEC_OBJECT_PINNED)
-   return false;
-
-   /* Failing that pick any _free_ space if suitable */
-   if (unlikely(i915_vma_pin(vma,
- entry->pad_to_size,
- entry->alignment,
- eb_pin_flags(entry, ev->flags) |
- PIN_USER | PIN_NOEVICT)))
-   return false;
-   }
+   if (!i915_vma_pin_inplace(vma, pin_flags))
+   return false;
 
if (unlikely(ev->flags & EXEC_OBJECT_NEEDS_FENCE)) {
-   if (unlikely(i915_vma_pin_fence(vma))) {
-   i915_vma_unpin(vma);
+   if (!eb_pin_vma_fence_inplace(ev)) {
+   __i915_vma_unpin(vma);
return false;
}
-
-   if (vma->fence)
-   ev->flags |= __EXEC_OBJECT_HAS_FENCE;
}
 
+   GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags));
+
ev->flags |= __EXEC_OBJECT_HAS_PIN;
-   return !eb_vma_misplaced(entry, vma, ev->flags);
+   return true;
 }
 
 static int
@@ -688,14 +680,17 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
struct drm_i915_gem_exec_object2 *entry = ev->exec;
struct i915_vma *vma = ev->vma;
 
-   if (eb_pin_vma(eb, entry, ev)) {
+   if (eb_pin_vma_inplace(eb, entry, ev)) {
if (entry->offset != vma->node.start) {
entry->offset = vma->node.start | UPDATE;
eb->args->flags |= __EXEC_HAS_RELOC;
}
} else {
-   eb_unreserve_vma(ev);
-   list_add_tail(&ev->unbound_link, &unbound);
+   /* Lightly sort user placed objects to the fore */
+   if (ev->flags & EXEC_OBJECT_PINNED)
+   list_add(&ev->unbound_link, &unbound);
+   else
+   list_add_tail(&ev->unbound_link, &unbound);
}
}
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c6bf04ca2032..dbe11b349175 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -740,11 +740,13 @@ i915_vma_detach(struct i915_vma *vma)
list_del(&vma->vm_link);
 }
 
-static bool try_qad_pin(struct i915_vma *vma, unsigned int flags)
+bool i915_vma_pin_inplace(struct i915_vma *vma, unsigned int flags)
 {
unsigned int bound;
bool pinned = true;
 
+   GEM_BUG_ON(flags & ~I915_VMA_BIND_MASK);
+
bound = atomic_read(&vma->flags);
do {

[Intel-gfx] [PATCH 09/37] drm/i915/gem: Rename execbuf.bind_link to unbound_link

2020-08-05 Thread Chris Wilson
Rename the current list of unbound objects so that we can keep track of all
objects that we need to bind, as well as the list of currently unbound
[unprocessed] objects.

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 2dc30dbbdbf3..a5b63ae17241 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -34,7 +34,7 @@ struct eb_vma {
 
/** This vma's place in the execbuf reservation list */
struct drm_i915_gem_exec_object2 *exec;
-   struct list_head bind_link;
+   struct list_head unbound_link;
struct list_head reloc_link;
 
struct hlist_node node;
@@ -605,7 +605,7 @@ eb_add_vma(struct i915_execbuffer *eb,
}
} else {
eb_unreserve_vma(ev);
-   list_add_tail(&ev->bind_link, &eb->unbound);
+   list_add_tail(&ev->unbound_link, &eb->unbound);
}
 }
 
@@ -725,7 +725,7 @@ static int eb_reserve(struct i915_execbuffer *eb)
if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex))
return -EINTR;
 
-   list_for_each_entry(ev, &eb->unbound, bind_link) {
+   list_for_each_entry(ev, &eb->unbound, unbound_link) {
err = eb_reserve_vma(eb, ev, pin_flags);
if (err)
break;
@@ -751,15 +751,15 @@ static int eb_reserve(struct i915_execbuffer *eb)
 
if (flags & EXEC_OBJECT_PINNED)
/* Pinned must have their slot */
-   list_add(&ev->bind_link, &eb->unbound);
+   list_add(&ev->unbound_link, &eb->unbound);
else if (flags & __EXEC_OBJECT_NEEDS_MAP)
/* Map require the lowest 256MiB (aperture) */
-   list_add_tail(&ev->bind_link, &eb->unbound);
+   list_add_tail(&ev->unbound_link, &eb->unbound);
else if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS))
/* Prioritise 4GiB region for restricted bo */
-   list_add(&ev->bind_link, &last);
+   list_add(&ev->unbound_link, &last);
else
-   list_add_tail(&ev->bind_link, &last);
+   list_add_tail(&ev->unbound_link, &last);
}
list_splice_tail(&last, &eb->unbound);
mutex_unlock(&eb->i915->drm.struct_mutex);
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 20/37] drm/i915/gem: Bind the fence async for execbuf

2020-08-05 Thread Chris Wilson
It is illegal to wait on another vma while holding the vm->mutex, as
that easily leads to ABBA deadlocks (we wait on a second vma that waits
on us to release the vm->mutex). So while the vm->mutex exists, we
compute the required register transfer inside the i915_ggtt.mutex, setting
up a fence for tracking the register writes, but move the waiting outside
of the lock into the async binding pipeline.
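
Schematically, using only the entry points added by this patch: the
blocking wait under the mutex is replaced by an await queued on the bind
work, with the register write applied later from the worker:

	/* before, under vm->mutex (may dma_fence_wait()): */
	err = __i915_vma_pin_fence(vma);

	/* after, still under vm->mutex but non-blocking: */
	err = __i915_vma_pin_fence_async(vma, &work->base);

	/* later, from the worker, outside the mutex: */
	__i915_vma_apply_fence_async(vma);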

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  21 +--
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c  | 139 +-
 drivers/gpu/drm/i915/gt/intel_ggtt_fencing.h  |   5 +
 drivers/gpu/drm/i915/i915_vma.h   |   2 -
 4 files changed, 152 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 301e67dcdbde..4cdaf5d81ef1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1054,15 +1054,6 @@ static int eb_reserve_vma(struct eb_vm_work *work, 
struct eb_bind_vma *bind)
return err;
 
 pin:
-   if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-   err = __i915_vma_pin_fence(vma); /* XXX no waiting */
-   if (unlikely(err))
-   return err;
-
-   if (vma->fence)
-   bind->ev->flags |= __EXEC_OBJECT_HAS_FENCE;
-   }
-
bind_flags &= ~atomic_read(&vma->flags);
if (bind_flags) {
err = set_bind_fence(vma, work);
@@ -1093,6 +1084,15 @@ static int eb_reserve_vma(struct eb_vm_work *work, 
struct eb_bind_vma *bind)
bind->ev->flags |= __EXEC_OBJECT_HAS_PIN;
GEM_BUG_ON(eb_vma_misplaced(entry, vma, bind->ev->flags));
 
+   if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
+   err = __i915_vma_pin_fence_async(vma, &work->base);
+   if (unlikely(err))
+   return err;
+
+   if (vma->fence)
+   bind->ev->flags |= __EXEC_OBJECT_HAS_FENCE;
+   }
+
return 0;
 }
 
@@ -1158,6 +1158,9 @@ static void __eb_bind_vma(struct eb_vm_work *work)
struct eb_bind_vma *bind = &work->bind[n];
struct i915_vma *vma = bind->ev->vma;
 
+   if (bind->ev->flags & __EXEC_OBJECT_HAS_FENCE)
+   __i915_vma_apply_fence_async(vma);
+
if (!bind->bind_flags)
goto put;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c 
b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 7fb36b12fe7a..ce06b949dc7c 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -21,10 +21,13 @@
  * IN THE SOFTWARE.
  */
 
+#include "i915_active.h"
 #include "i915_drv.h"
 #include "i915_scatterlist.h"
+#include "i915_sw_fence_work.h"
 #include "i915_pvinfo.h"
 #include "i915_vgpu.h"
+#include "i915_vma.h"
 
 /**
  * DOC: fence register handling
@@ -340,19 +343,37 @@ static struct i915_fence_reg *fence_find(struct i915_ggtt 
*ggtt)
return ERR_PTR(-EDEADLK);
 }
 
-int __i915_vma_pin_fence(struct i915_vma *vma)
+static int fence_wait_bind(struct i915_fence_reg *reg)
+{
+   struct dma_fence *fence;
+   int err = 0;
+
+   fence = i915_active_fence_get(®->active.excl);
+   if (fence) {
+   err = dma_fence_wait(fence, true);
+   dma_fence_put(fence);
+   }
+
+   return err;
+}
+
+static int __i915_vma_pin_fence(struct i915_vma *vma)
 {
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vma->vm);
-   struct i915_fence_reg *fence;
+   struct i915_fence_reg *fence = vma->fence;
struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
int err;
 
lockdep_assert_held(&vma->vm->mutex);
 
/* Just update our place in the LRU if our fence is getting reused. */
-   if (vma->fence) {
-   fence = vma->fence;
+   if (fence) {
GEM_BUG_ON(fence->vma != vma);
+
+   err = fence_wait_bind(fence);
+   if (err)
+   return err;
+
atomic_inc(&fence->pin_count);
if (!fence->dirty) {
list_move_tail(&fence->link, &ggtt->fence_list);
@@ -384,6 +405,116 @@ int __i915_vma_pin_fence(struct i915_vma *vma)
return err;
 }
 
+static int set_bind_fence(struct i915_fence_reg *fence,
+ struct dma_fence_work *work)
+{
+   struct dma_fence *prev;
+   int err;
+
+   if (rcu_access_pointer(fence->active.excl.fence) == &work->dma)
+   return 0;
+
+   err = i915_sw_fence_await_active(&work->chain,
+&fence->active,
+I915_ACTIVE_AWAIT_ACTIVE);
+   if (err)
+   return err;
+
+   if (i915_active_ac

[Intel-gfx] [PATCH 33/37] drm/i915/gt: Acquire backing storage for the context

2020-08-05 Thread Chris Wilson
Pull the individual acquisition of the context objects (state, ring,
timeline) under a common i915_acquire_ctx in preparation to allow the
context to evict memory (or rather the i915_acquire_ctx on its behalf).

The context objects maintain their semi-permanent status; that is they
are assumed to be accessible by the HW at all times until we receive a
signal from the HW that they are no longer in use. Currently, we
generate such a signal ourselves from the context switch following the
final use of the objects. This means that they will remain on the HW for
an indefinite amount of time, and we retain the use of pinning to keep
them in the same place. As they are pinned, they can be processed
outside of the working set for the requests within the context. This is
useful, as the contexts share some global state, causing them to incur a
global lock via their objects. By requiring that lock only as the context
is activated, we reduce both its frequency and its duration
(as compared to execbuf).
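
Roughly, the acquire flow introduced here is as follows (sketch only,
error unwinding elided):

	struct i915_acquire_ctx acquire;
	int err;

	i915_acquire_ctx_init(&acquire);

	/* ww-lock the ring, state and timeline objects */
	err = intel_context_acquire_lock(ce, &acquire);
	if (err == 0)
		/* then reserve all the backing storage en masse */
		err = i915_acquire_mm(&acquire);
	if (err == 0)
		err = intel_context_active_locked(ce);

	i915_acquire_ctx_fini(&acquire);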

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gt/intel_context.c   | 113 ++---
 drivers/gpu/drm/i915/gt/intel_ring.c  |  17 ++-
 drivers/gpu/drm/i915/gt/intel_ring.h  |   5 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   | 119 +++---
 drivers/gpu/drm/i915/gt/intel_timeline.c  |  14 ++-
 drivers/gpu/drm/i915/gt/intel_timeline.h  |   8 +-
 drivers/gpu/drm/i915/gt/mock_engine.c |   2 +
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  30 -
 8 files changed, 240 insertions(+), 68 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index cde356c7754d..ff3f7580d1ca 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -6,6 +6,7 @@
 
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_pm.h"
+#include "mm/i915_acquire_ctx.h"
 
 #include "i915_drv.h"
 #include "i915_globals.h"
@@ -30,12 +31,12 @@ void intel_context_free(struct intel_context *ce)
kmem_cache_free(global.slab_ce, ce);
 }
 
-static int __context_pin_state(struct i915_vma *vma)
+static int __context_active_locked(struct i915_vma *vma)
 {
unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
int err;
 
-   err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
+   err = i915_ggtt_pin_locked(vma, 0, bias | PIN_HIGH);
if (err)
return err;
 
@@ -57,18 +58,18 @@ static int __context_pin_state(struct i915_vma *vma)
return err;
 }
 
-static void __context_unpin_state(struct i915_vma *vma)
+static void __context_retire_state(struct i915_vma *vma)
 {
i915_vma_make_shrinkable(vma);
i915_active_release(&vma->active);
__i915_vma_unpin(vma);
 }
 
-static int __ring_active(struct intel_ring *ring)
+static int __ring_active_locked(struct intel_ring *ring)
 {
int err;
 
-   err = intel_ring_pin(ring);
+   err = intel_ring_pin_locked(ring);
if (err)
return err;
 
@@ -100,7 +101,7 @@ static void __intel_context_retire(struct i915_active 
*active)
 
set_bit(CONTEXT_VALID_BIT, &ce->flags);
if (ce->state)
-   __context_unpin_state(ce->state);
+   __context_retire_state(ce->state);
 
intel_timeline_unpin(ce->timeline);
__ring_retire(ce->ring);
@@ -108,27 +109,53 @@ static void __intel_context_retire(struct i915_active 
*active)
intel_context_put(ce);
 }
 
-static int __intel_context_active(struct i915_active *active)
+static int
+__intel_context_acquire_lock(struct intel_context *ce,
+struct i915_acquire_ctx *ctx)
+{
+   return i915_acquire_ctx_lock(ctx, ce->state->obj);
+}
+
+static int
+intel_context_acquire_lock(struct intel_context *ce,
+  struct i915_acquire_ctx *ctx)
 {
-   struct intel_context *ce = container_of(active, typeof(*ce), active);
int err;
 
-   CE_TRACE(ce, "active\n");
+   err = intel_ring_acquire_lock(ce->ring, ctx);
+   if (err)
+   return err;
 
-   intel_context_get(ce);
+   if (ce->state) {
+   err = __intel_context_acquire_lock(ce, ctx);
+   if (err)
+   return err;
+   }
 
-   err = __ring_active(ce->ring);
+   /* Note that the timeline will migrate as the seqno wrap around */
+   err = intel_timeline_acquire_lock(ce->timeline, ctx);
if (err)
-   goto err_put;
+   return err;
+
+   return 0;
+}
 
-   err = intel_timeline_pin(ce->timeline);
+static int intel_context_active_locked(struct intel_context *ce)
+{
+   int err;
+
+   err = __ring_active_locked(ce->ring);
+   if (err)
+   return err;
+
+   err = intel_timeline_pin_locked(ce->timeline);
if (err)
goto err_ring;
 
if (!ce->state)
   

[Intel-gfx] [PATCH 26/37] drm/i915/gem: Pull execbuf dma resv under a single critical section

2020-08-05 Thread Chris Wilson
Acquire all the objects, their backing storage, and the page directories
used by execbuf under a single common ww_mutex. Albeit we have to
restart the critical section a few times in order to handle various
restrictions (such as avoiding copy_(from|to)_user and mmap_sem).
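
The i915_acquire_ctx wraps the canonical ww_mutex dance that the removed
eb_lock_vma() performed by hand; in outline (unwind_and_requeue() and
the error unwinding are hypothetical here):

	ww_acquire_init(&ctx, &reservation_ww_class);
	list_for_each_entry(obj, &objects, link) {
		err = ww_mutex_lock_interruptible(&obj->resv->lock, &ctx);
		if (err == -EDEADLK) {
			/* unlock everything taken so far, requeue those
			 * entries behind us, then sleep on the contended
			 * lock before resuming the walk */
			unwind_and_requeue(&objects, obj);
			err = ww_mutex_lock_slow_interruptible(&obj->resv->lock,
							       &ctx);
		}
		if (err)
			goto err_unwind; /* drop all held locks */
	}
	ww_acquire_done(&ctx);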

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 166 +-
 .../i915/gem/selftests/i915_gem_execbuffer.c  |   2 +
 2 files changed, 84 insertions(+), 84 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 58e40348b551..3a79b6facb02 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -20,6 +20,7 @@
 #include "gt/intel_gt_pm.h"
 #include "gt/intel_gt_requests.h"
 #include "gt/intel_ring.h"
+#include "mm/i915_acquire_ctx.h"
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
@@ -267,6 +268,8 @@ struct i915_execbuffer {
struct intel_context *reloc_context; /* distinct context for relocs */
struct i915_gem_context *gem_context; /** caller's context */
 
+   struct i915_acquire_ctx acquire; /** lock for _all_ DMA reservations */
+
struct i915_request *request; /** our request to build */
struct eb_vma *batch; /** identity of the batch obj/vma */
 
@@ -392,42 +395,6 @@ static void eb_vma_array_put(struct eb_vma_array *arr)
kref_put(&arr->kref, eb_vma_array_destroy);
 }
 
-static int
-eb_lock_vma(struct i915_execbuffer *eb, struct ww_acquire_ctx *acquire)
-{
-   struct eb_vma *ev;
-   int err = 0;
-
-   list_for_each_entry(ev, &eb->submit_list, submit_link) {
-   struct i915_vma *vma = ev->vma;
-
-   err = ww_mutex_lock_interruptible(&vma->resv->lock, acquire);
-   if (err == -EDEADLK) {
-   struct eb_vma *unlock = ev, *en;
-
-   list_for_each_entry_safe_continue_reverse(unlock, en,
- 
&eb->submit_list,
- submit_link) {
-   ww_mutex_unlock(&unlock->vma->resv->lock);
-   list_move_tail(&unlock->submit_link, 
&eb->submit_list);
-   }
-
-   GEM_BUG_ON(!list_is_first(&ev->submit_link, 
&eb->submit_list));
-   err = ww_mutex_lock_slow_interruptible(&vma->resv->lock,
-  acquire);
-   }
-   if (err) {
-   list_for_each_entry_continue_reverse(ev,
-&eb->submit_list,
-submit_link)
-   ww_mutex_unlock(&ev->vma->resv->lock);
-   break;
-   }
-   }
-
-   return err;
-}
-
 static int eb_create(struct i915_execbuffer *eb)
 {
/* Allocate an extra slot for use by the sentinel */
@@ -656,6 +623,25 @@ eb_add_vma(struct i915_execbuffer *eb,
}
 }
 
+static int eb_lock_mm(struct i915_execbuffer *eb)
+{
+   struct eb_vma *ev;
+   int err;
+
+   list_for_each_entry(ev, &eb->bind_list, bind_link) {
+   err = i915_acquire_ctx_lock(&eb->acquire, ev->vma->obj);
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
+static int eb_acquire_mm(struct i915_execbuffer *eb)
+{
+   return i915_acquire_mm(&eb->acquire);
+}
+
 struct eb_vm_work {
struct dma_fence_work base;
struct eb_vma_array *array;
@@ -1378,7 +1364,15 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
unsigned long count;
struct eb_vma *ev;
unsigned int pass;
-   int err = 0;
+   int err;
+
+   err = eb_lock_mm(eb);
+   if (err)
+   return err;
+
+   err = eb_acquire_mm(eb);
+   if (err)
+   return err;
 
count = 0;
INIT_LIST_HEAD(&unbound);
@@ -1404,10 +1398,15 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
if (count == 0)
return 0;
 
+   /* We need to reserve page directories, release all, start over */
+   i915_acquire_ctx_fini(&eb->acquire);
+
pass = 0;
do {
struct eb_vm_work *work;
 
+   i915_acquire_ctx_init(&eb->acquire);
+
/*
 * We need to hold one lock as we bind all the vma so that
 * we have a consistent view of the entire vm and can plan
@@ -1424,6 +1423,11 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
 * beneath it, so we have to stage and preallocate all the
 * resources we may require before taking the mutex.
 */
+
+   err = eb_lock_mm(eb);
+   if (e

[Intel-gfx] [PATCH 37/37] drm/i915/gem: Delay attach mmu-notifier until we acquire the pinned userptr

2020-08-05 Thread Chris Wilson
On the fast path, we first try to pin the user pages and then attach the
mmu-notifier. On the slow path, we did it the opposite way around,
carrying the mmu-notifier over from the tail of the fast path. However,
if we are mapping a fresh batch of user pages, we will always hit a pmd
split operation (to replace the zero pages with real pages), triggering
an invalidate-range callback for this userptr, and so we have to cancel
the work [after completing the pinning] and cause the caller to retry
(an extra EAGAIN return from an ioctl for some paths). If we follow the
fast path approach and attach the callback after completion, we only see
the invalidate-range for revocations of our pages.

The risk (the same as for the fast path) is that if the mmu-notifier
should have been run during the page lookup, we will have missed it and
the pages will be mixed. One might conclude that the fast path is wrong,
and we should always attach the mmu-notifier first and bear the cost of
redundant repetition.
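
The resulting fast-path ordering is then simply pin first, mark active
last; in sketch form, with pin_user_pages_fast() standing in for
whichever GUP variant the path actually uses:

	pinned = pin_user_pages_fast(ptr, num_pages, gup_flags, pvec);
	if (pinned == num_pages) {
		pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
		if (!IS_ERR(pages))
			__i915_gem_userptr_set_active(obj, true);
	}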

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 80907c00c6fd..ba1f01650eeb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -500,14 +500,13 @@ __i915_gem_userptr_get_pages_worker(struct work_struct 
*_work)
pages = __i915_gem_userptr_alloc_pages(obj, pvec,
   npages);
if (!IS_ERR(pages)) {
+   __i915_gem_userptr_set_active(obj, true);
pinned = 0;
pages = NULL;
}
}
 
obj->userptr.work = ERR_CAST(pages);
-   if (IS_ERR(pages))
-   __i915_gem_userptr_set_active(obj, false);
}
i915_gem_object_unlock(obj);
 
@@ -566,7 +565,6 @@ static int i915_gem_userptr_get_pages(struct 
drm_i915_gem_object *obj)
struct mm_struct *mm = obj->userptr.mm->mm;
struct page **pvec;
struct sg_table *pages;
-   bool active;
int pinned;
unsigned int gup_flags = 0;
 
@@ -621,19 +619,16 @@ static int i915_gem_userptr_get_pages(struct 
drm_i915_gem_object *obj)
}
}
 
-   active = false;
if (pinned < 0) {
pages = ERR_PTR(pinned);
pinned = 0;
} else if (pinned < num_pages) {
pages = __i915_gem_userptr_get_pages_schedule(obj);
-   active = pages == ERR_PTR(-EAGAIN);
} else {
pages = __i915_gem_userptr_alloc_pages(obj, pvec, num_pages);
-   active = !IS_ERR(pages);
+   if (!IS_ERR(pages))
+   __i915_gem_userptr_set_active(obj, true);
}
-   if (active)
-   __i915_gem_userptr_set_active(obj, true);
 
if (IS_ERR(pages))
unpin_user_pages(pvec, pinned);
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 35/37] drm/i915: Remove unused i915_gem_evict_vm()

2020-08-05 Thread Chris Wilson
Obsolete, last user removed.

Signed-off-by: Chris Wilson 
Reviewed-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/i915_drv.h   |  1 -
 drivers/gpu/drm/i915/i915_gem_evict.c | 57 ---
 .../gpu/drm/i915/selftests/i915_gem_evict.c   | 40 -
 3 files changed, 98 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 05a2624116a1..04243dc286c7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1867,7 +1867,6 @@ int __must_check i915_gem_evict_something(struct 
i915_address_space *vm,
 int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
 struct drm_mm_node *node,
 unsigned int flags);
-int i915_gem_evict_vm(struct i915_address_space *vm);
 
 /* i915_gem_internal.c */
 struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 6501939929d5..e35f0ba5e245 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -343,63 +343,6 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
return ret;
 }
 
-/**
- * i915_gem_evict_vm - Evict all idle vmas from a vm
- * @vm: Address space to cleanse
- *
- * This function evicts all vmas from a vm.
- *
- * This is used by the execbuf code as a last-ditch effort to defragment the
- * address space.
- *
- * To clarify: This is for freeing up virtual address space, not for freeing
- * memory in e.g. the shrinker.
- */
-int i915_gem_evict_vm(struct i915_address_space *vm)
-{
-   int ret = 0;
-
-   lockdep_assert_held(&vm->mutex);
-   trace_i915_gem_evict_vm(vm);
-
-   /* Switch back to the default context in order to unpin
-* the existing context objects. However, such objects only
-* pin themselves inside the global GTT and performing the
-* switch otherwise is ineffective.
-*/
-   if (i915_is_ggtt(vm)) {
-   ret = ggtt_flush(vm->gt);
-   if (ret)
-   return ret;
-   }
-
-   do {
-   struct i915_vma *vma, *vn;
-   LIST_HEAD(eviction_list);
-
-   list_for_each_entry(vma, &vm->bound_list, vm_link) {
-   if (i915_vma_is_pinned(vma))
-   continue;
-
-   __i915_vma_pin(vma);
-   list_add(&vma->evict_link, &eviction_list);
-   }
-   if (list_empty(&eviction_list))
-   break;
-
-   ret = 0;
-   list_for_each_entry_safe(vma, vn, &eviction_list, evict_link) {
-   __i915_vma_unpin(vma);
-   if (ret == 0)
-   ret = __i915_vma_unbind(vma);
-   if (ret != -EINTR) /* "Get me out of here!" */
-   ret = 0;
-   }
-   } while (ret == 0);
-
-   return ret;
-}
-
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/i915_gem_evict.c"
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c 
b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 48ea7f0ff7b9..b851b17d6f5a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -327,52 +327,12 @@ static int igt_evict_for_cache_color(void *arg)
return err;
 }
 
-static int igt_evict_vm(void *arg)
-{
-   struct intel_gt *gt = arg;
-   struct i915_ggtt *ggtt = gt->ggtt;
-   LIST_HEAD(objects);
-   int err;
-
-   /* Fill the GGTT with pinned objects and try to evict everything. */
-
-   err = populate_ggtt(ggtt, &objects);
-   if (err)
-   goto cleanup;
-
-   /* Everything is pinned, nothing should happen */
-   mutex_lock(&ggtt->vm.mutex);
-   err = i915_gem_evict_vm(&ggtt->vm);
-   mutex_unlock(&ggtt->vm.mutex);
-   if (err) {
-   pr_err("i915_gem_evict_vm on a full GGTT returned err=%d]\n",
-  err);
-   goto cleanup;
-   }
-
-   unpin_ggtt(ggtt);
-
-   mutex_lock(&ggtt->vm.mutex);
-   err = i915_gem_evict_vm(&ggtt->vm);
-   mutex_unlock(&ggtt->vm.mutex);
-   if (err) {
-   pr_err("i915_gem_evict_vm on a full GGTT returned err=%d]\n",
-  err);
-   goto cleanup;
-   }
-
-cleanup:
-   cleanup_objects(ggtt, &objects);
-   return err;
-}
-
 int i915_gem_evict_mock_selftests(void)
 {
static const struct i915_subtest tests[] = {
SUBTEST(igt_evict_something),
SUBTEST(igt_evict_for_vma),
SUBTEST(igt_evict_for_cache_color),
-   SUBTEST(igt_evict_vm),
SUBTEST(igt_overcommit),
};
struct drm_i915_private *i915;
-- 
2.20.1

__

[Intel-gfx] [PATCH 29/37] drm/i915/gem: Replace i915_gem_object.mm.mutex with reservation_ww_class

2020-08-05 Thread Chris Wilson
Our goal is to pull all memory reservations (next iteration
obj->ops->get_pages()) under a ww_mutex, and to align those reservations
with other drivers, i.e. control all such allocations with the
reservation_ww_class. Currently, this is under the purview of the
obj->mm.mutex, and while obj->mm remains an embedded struct we can
"simply" switch to using the reservation_ww_class obj->base.resv->lock

The major consequence is the impact on the shrinker paths as the
reservation_ww_class is used to wrap allocations, and a ww_mutex does
not support subclassing so we cannot do our usual trick of knowing that
we never recurse inside the shrinker and instead have to finish the
reclaim with a trylock. This may result in us failing to release the
pages after having released the vma. This will have to do until a better
idea comes along.

However, this step only converts the mutex over and continues to treat
everything as a single allocation and pinning the pages. With the
ww_mutex in place we can remove the temporary pinning, as we can then
reserve all storage en masse.

One last thing to do: kill the implicit page pinning for active vma.
This will require us to invalidate the vma->pages when the backing store
is removed (and we expect that while the vma is active, we mark the
backing store as active so that it cannot be removed while the HW is
busy.)
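
In the shrinker, that final trylock looks roughly like (sketch):

	if (!dma_resv_trylock(obj->base.resv))
		return false; /* contended: we cannot wait inside reclaim */

	/* ... unbind the vma and release the backing pages ... */

	dma_resv_unlock(obj->base.resv);
	return true;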

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |  20 +-
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  19 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c|  65 ++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  40 +++-
 drivers/gpu/drm/i915/gem/i915_gem_object.c|   8 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h|  37 +--
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 -
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 147 ++--
 drivers/gpu/drm/i915/gem/i915_gem_phys.c  |   8 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  13 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c|   2 -
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  15 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  32 ++-
 .../i915/gem/selftests/i915_gem_coherency.c   |  14 +-
 .../drm/i915/gem/selftests/i915_gem_context.c |  10 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c|   2 +
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   2 -
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |   1 -
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |   5 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   |   6 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h   |   2 -
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |   1 +
 drivers/gpu/drm/i915/i915_gem.c   |  16 +-
 drivers/gpu/drm/i915/i915_vma.c   | 216 +++---
 drivers/gpu/drm/i915/i915_vma_types.h |   6 -
 drivers/gpu/drm/i915/mm/i915_acquire_ctx.c|  12 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   4 +-
 .../drm/i915/selftests/intel_memory_region.c  |  17 +-
 28 files changed, 336 insertions(+), 385 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index bc0223716906..a32fd0d5570b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -27,16 +27,8 @@ static void __do_clflush(struct drm_i915_gem_object *obj)
 static int clflush_work(struct dma_fence_work *base)
 {
struct clflush *clflush = container_of(base, typeof(*clflush), base);
-   struct drm_i915_gem_object *obj = clflush->obj;
-   int err;
-
-   err = i915_gem_object_pin_pages(obj);
-   if (err)
-   return err;
-
-   __do_clflush(obj);
-   i915_gem_object_unpin_pages(obj);
 
+   __do_clflush(clflush->obj);
return 0;
 }
 
@@ -44,7 +36,7 @@ static void clflush_release(struct dma_fence_work *base)
 {
struct clflush *clflush = container_of(base, typeof(*clflush), base);
 
-   i915_gem_object_put(clflush->obj);
+   i915_gem_object_unpin_pages(clflush->obj);
 }
 
 static const struct dma_fence_work_ops clflush_ops = {
@@ -63,8 +55,14 @@ static struct clflush *clflush_work_create(struct 
drm_i915_gem_object *obj)
if (!clflush)
return NULL;
 
+   if (__i915_gem_object_get_pages_locked(obj)) {
+   kfree(clflush);
+   return NULL;
+   }
+
dma_fence_work_init(&clflush->base, &clflush_ops);
-   clflush->obj = i915_gem_object_get(obj); /* obj <-> clflush cycle */
+   __i915_gem_object_pin_pages(obj);
+   clflush->obj = obj; /* Beware the obj.resv <-> clflush fence cycle */
 
return clflush;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 2679380159fc..f965fa6c3353 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -124,19 +124,19 @@ static int i915_gem_begin_cpu_access(struct dma_buf 
*dma_buf, e

[Intel-gfx] [PATCH 23/37] drm/i915/gem: Manage GTT placement bias (starting offset) explicitly

2020-08-05 Thread Chris Wilson
Since we can control placement in the ppGTT explicitly, we can specify
our desired starting offset exactly on a per-vma basis. This prevents us
from falling into a few corner cases where we confuse the user with our choices.

Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 67 +--
 1 file changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 19cab5541dbc..0839397c7e50 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -36,6 +36,7 @@ struct eb_vma {
 
/** This vma's place in the execbuf reservation list */
struct drm_i915_gem_exec_object2 *exec;
+   u32 bias;
 
struct list_head bind_link;
struct list_head unbound_link;
@@ -61,15 +62,12 @@ struct eb_vma_array {
 #define __EXEC_OBJECT_HAS_PIN  BIT(31)
 #define __EXEC_OBJECT_HAS_FENCEBIT(30)
 #define __EXEC_OBJECT_NEEDS_MAPBIT(29)
-#define __EXEC_OBJECT_NEEDS_BIAS   BIT(28)
-#define __EXEC_OBJECT_INTERNAL_FLAGS   (~0u << 28) /* all of the above */
+#define __EXEC_OBJECT_INTERNAL_FLAGS   (~0u << 29) /* all of the above */
 
 #define __EXEC_HAS_RELOC   BIT(31)
 #define __EXEC_INTERNAL_FLAGS  (~0u << 31)
 #define UPDATE PIN_OFFSET_FIXED
 
-#define BATCH_OFFSET_BIAS (256*1024)
-
 #define __I915_EXEC_ILLEGAL_FLAGS \
(__I915_EXEC_UNKNOWN_FLAGS | \
 I915_EXEC_CONSTANTS_MASK  | \
@@ -291,7 +289,7 @@ struct i915_execbuffer {
} parser;
 
u64 invalid_flags; /** Set of execobj.flags that are invalid */
-   u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+   u32 context_bias;
 
u32 batch_start_offset; /** Location within object of batch */
u32 batch_len; /** Length of batch within object */
@@ -491,11 +489,12 @@ static int eb_create(struct i915_execbuffer *eb)
return 0;
 }
 
-static bool
-eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry,
-const struct i915_vma *vma,
-unsigned int flags)
+static bool eb_vma_misplaced(const struct eb_vma *ev)
 {
+   const struct drm_i915_gem_exec_object2 *entry = ev->exec;
+   const struct i915_vma *vma = ev->vma;
+   unsigned int flags = ev->flags;
+
if (test_bit(I915_VMA_ERROR_BIT, __i915_vma_flags(vma)))
return true;
 
@@ -509,8 +508,7 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 
*entry,
vma->node.start != entry->offset)
return true;
 
-   if (flags & __EXEC_OBJECT_NEEDS_BIAS &&
-   vma->node.start < BATCH_OFFSET_BIAS)
+   if (vma->node.start < ev->bias)
return true;
 
if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
@@ -529,10 +527,7 @@ static bool eb_pin_vma_fence_inplace(struct eb_vma *ev)
return false; /* We need to add some new fence serialisation */
 }
 
-static inline bool
-eb_pin_vma_inplace(struct i915_execbuffer *eb,
-  const struct drm_i915_gem_exec_object2 *entry,
-  struct eb_vma *ev)
+static inline bool eb_pin_vma_inplace(struct eb_vma *ev)
 {
struct i915_vma *vma = ev->vma;
unsigned int pin_flags;
@@ -541,7 +536,7 @@ eb_pin_vma_inplace(struct i915_execbuffer *eb,
if (!i915_active_is_idle(&vma->vm->binding))
return false;
 
-   if (eb_vma_misplaced(entry, vma, ev->flags))
+   if (eb_vma_misplaced(ev))
return false;
 
pin_flags = PIN_USER;
@@ -559,7 +554,7 @@ eb_pin_vma_inplace(struct i915_execbuffer *eb,
}
}
 
-   GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags));
+   GEM_BUG_ON(eb_vma_misplaced(ev));
 
ev->flags |= __EXEC_OBJECT_HAS_PIN;
return true;
@@ -608,9 +603,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
entry->flags |= EXEC_OBJECT_NEEDS_GTT | 
__EXEC_OBJECT_NEEDS_MAP;
}
 
-   if (!(entry->flags & EXEC_OBJECT_PINNED))
-   entry->flags |= eb->context_flags;
-
return 0;
 }
 
@@ -627,6 +619,7 @@ eb_add_vma(struct i915_execbuffer *eb,
ev->vma = vma;
ev->exec = entry;
ev->flags = entry->flags;
+   ev->bias = eb->context_bias;
 
if (eb->lut_size > 0) {
ev->handle = entry->handle;
@@ -653,7 +646,8 @@ eb_add_vma(struct i915_execbuffer *eb,
if (i == batch_idx) {
if (entry->relocation_count &&
!(ev->flags & EXEC_OBJECT_PINNED))
-   ev->flags |= __EXEC_OBJECT_NEEDS_BIAS;
+   ev->bias = max_t(u32, ev->bias, SZ_256K);
+
if (eb->has_fence)
ev->flags |= EXEC_OBJECT_NEEDS_FENCE;
 
@@ -979,8 +973,9 @@ static int eb_reserve_vma(struct eb_vm_work *work, struct 
eb_bind_vma *bind)
co

[Intel-gfx] [PATCH 27/37] drm/i915/gtt: map the PD up front

2020-08-05 Thread Chris Wilson
From: Matthew Auld 

We need to generalise our accessors for the page directories and tables
from using the simple kmap_atomic to supporting local memory, and this
setup must be done on acquisition of the backing storage, prior to
entering fence execution contexts. Here we replace the kmap with the
object mapping code which, for a simple single-page shmemfs object, will
return a plain kmap that is then kept for the lifetime of the page
directory.
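
A minimal sketch of the new helper (the caching type and the unused vm
parameter are assumptions; only the map_pt_dma()/px_vaddr() names come
from the diff below):

static int map_pt_dma(struct i915_address_space *vm,
		      struct drm_i915_gem_object *obj)
{
	void *vaddr;

	vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
	if (IS_ERR(vaddr))
		return PTR_ERR(vaddr);

	/* px_vaddr() can now simply return the cached mapping */
	return 0;
}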

Signed-off-by: Matthew Auld 
Signed-off-by: Chris Wilson 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  3 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c  | 11 +++---
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c  | 26 ++
 drivers/gpu/drm/i915/gt/intel_ggtt.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c   | 34 ---
 drivers/gpu/drm/i915/gt/intel_gtt.h   |  9 ++---
 drivers/gpu/drm/i915/gt/intel_ppgtt.c |  7 ++--
 drivers/gpu/drm/i915/i915_vma.c   |  3 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |  8 ++---
 9 files changed, 45 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 3a79b6facb02..d3ac2542a039 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1449,7 +1449,8 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
if (err)
return eb_vm_work_cancel(work, err);
 
-   err = i915_vm_pin_pt_stash(work->vm, &work->stash);
+   /* We also need to prepare mappings to write the PD pages */
+   err = i915_vm_map_pt_stash(work->vm, &work->stash);
if (err)
return eb_vm_work_cancel(work, err);
 
diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
index 1b4fa9ce6658..dd723d9832b9 100644
--- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c
@@ -105,9 +105,8 @@ static void gen6_ppgtt_clear_range(struct 
i915_address_space *vm,
 * entries back to scratch.
 */
 
-   vaddr = kmap_atomic_px(pt);
+   vaddr = px_vaddr(pt);
memset32(vaddr + pte, scratch_pte, count);
-   kunmap_atomic(vaddr);
 
pte = 0;
}
@@ -129,7 +128,7 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
 
GEM_BUG_ON(!pd->entry[act_pt]);
 
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, act_pt));
+   vaddr = px_vaddr(i915_pt_entry(pd, act_pt));
do {
GEM_BUG_ON(iter.sg->length < I915_GTT_PAGE_SIZE);
vaddr[act_pte] = pte_encode | GEN6_PTE_ADDR_ENCODE(iter.dma);
@@ -145,12 +144,10 @@ static void gen6_ppgtt_insert_entries(struct 
i915_address_space *vm,
}
 
if (++act_pte == GEN6_PTES) {
-   kunmap_atomic(vaddr);
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, ++act_pt));
+   vaddr = px_vaddr(i915_pt_entry(pd, ++act_pt));
act_pte = 0;
}
} while (1);
-   kunmap_atomic(vaddr);
 
vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 }
@@ -242,7 +239,7 @@ static int gen6_ppgtt_init_scratch(struct gen6_ppgtt *ppgtt)
if (IS_ERR(vm->scratch[1]))
return PTR_ERR(vm->scratch[1]);
 
-   ret = pin_pt_dma(vm, vm->scratch[1]);
+   ret = map_pt_dma(vm, vm->scratch[1]);
if (ret) {
i915_gem_object_put(vm->scratch[1]);
return ret;
diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c 
b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
index eb64f474a78c..ca25e751a023 100644
--- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
+++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c
@@ -237,11 +237,10 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * 
const vm,
atomic_read(&pt->used));
GEM_BUG_ON(!count || count >= atomic_read(&pt->used));
 
-   vaddr = kmap_atomic_px(pt);
+   vaddr = px_vaddr(pt);
memset64(vaddr + gen8_pd_index(start, 0),
 vm->scratch[0]->encode,
 count);
-   kunmap_atomic(vaddr);
 
atomic_sub(count, &pt->used);
start += count;
@@ -370,7 +369,7 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt,
gen8_pte_t *vaddr;
 
pd = i915_pd_entry(pdp, gen8_pd_index(idx, 2));
-   vaddr = kmap_atomic_px(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
+   vaddr = px_vaddr(i915_pt_entry(pd, gen8_pd_index(idx, 1)));
do {
GEM_BUG_ON(iter->sg->length < I915_GTT_PAGE_SIZE);
vaddr[gen8_pd_index(idx, 0)] = pte_encode | iter->dma;
@@ -397,12 +396,10 @@ gen8_ppgtt_inser

[Intel-gfx] ✓ Fi.CI.BAT: success for HDCP minor refactoring (rev3)

2020-08-05 Thread Patchwork
== Series Details ==

Series: HDCP minor refactoring (rev3)
URL   : https://patchwork.freedesktop.org/series/77224/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8845 -> Patchwork_18309


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/index.html

Known issues


  Here are the changes found in Patchwork_18309 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@gt_lrc:
- fi-tgl-u2:  [PASS][1] -> [DMESG-FAIL][2] ([i915#1233])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-icl-u2:  [PASS][3] -> [DMESG-WARN][4] ([i915#1982])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_pipe_crc_basic@hang-read-crc-pipe-a:
- fi-bsw-kefka:   [PASS][5] -> [DMESG-WARN][6] ([i915#1982])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-bsw-kefka/igt@kms_pipe_crc_ba...@hang-read-crc-pipe-a.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-bsw-kefka/igt@kms_pipe_crc_ba...@hang-read-crc-pipe-a.html

  
 Possible fixes 

  * igt@kms_busy@basic@flip:
- {fi-tgl-dsi}:   [DMESG-WARN][7] ([i915#1982]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-tgl-dsi/igt@kms_busy@ba...@flip.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-tgl-dsi/igt@kms_busy@ba...@flip.html
- fi-kbl-x1275:   [DMESG-WARN][9] ([i915#62] / [i915#92] / [i915#95]) 
-> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_busy@ba...@flip.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-kbl-x1275/igt@kms_busy@ba...@flip.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-7500u:   [DMESG-WARN][11] ([i915#2203]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1:
- fi-icl-u2:  [DMESG-WARN][13] ([i915#1982]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html

  
 Warnings 

  * igt@kms_cursor_legacy@basic-flip-before-cursor-atomic:
- fi-kbl-x1275:   [DMESG-WARN][15] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][16] ([i915#62] / [i915#92] / [i915#95]) +5 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-before-cursor-atomic.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-kbl-x1275/igt@kms_cursor_leg...@basic-flip-before-cursor-atomic.html

  * igt@kms_force_connector_basic@force-edid:
- fi-kbl-x1275:   [DMESG-WARN][17] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][18] ([i915#62] / [i915#92]) +4 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1233]: https://gitlab.freedesktop.org/drm/intel/issues/1233
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (44 -> 37)
--

  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_8845 -> Patchwork_18309

  CI-20190529: 20190529
  CI_DRM_8845: a486392fed875e0b9154eaeb4bf6a4193484e0b3 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5758: bb34603947667cb44ed9ff0db8dccbb9d3f42357 @ 
git://anongit.freedesktop.org/xorg/app/inte

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Patchwork
== Series Details ==

Series: Replace obj->mm.lock with reservation_ww_class
URL   : https://patchwork.freedesktop.org/series/80291/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
fa0ff87bd9b0 drm/i915/gem: Reduce context termination list iteration guard to 
RCU
-:20: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#20: 
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests

-:20: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit d22d2d073ef8 ("drm/i915: Protect 
i915_request_await_start from early waits")'
#20: 
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from 
early waits") # rcu protection of timeline->requests

total: 1 errors, 1 warnings, 0 checks, 65 lines checked
ce73eec0532c drm/i915/gt: Protect context lifetime with RCU
dd5a5156d3f0 drm/i915/gt: Free stale request on destroying the virtual engine
3bc9e76bcdd9 drm/i915/gt: Defer enabling the breadcrumb interrupt to after 
submission
4749f15dd1d3 drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb 
spinlock
042d0d931dcf drm/i915/gt: Don't cancel the interrupt shadow too early
1ee396796726 drm/i915/gt: Split the breadcrumb spinlock between global and 
contexts
-:339: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#339: FILE: drivers/gpu/drm/i915/gt/intel_context_types.h:54:
+   spinlock_t signal_lock;

total: 0 errors, 0 warnings, 1 checks, 293 lines checked
98e29e72ccd2 drm/i915/gem: Don't drop the timeline lock during execbuf
f0a442c8ea00 drm/i915/gem: Rename execbuf.bind_link to unbound_link
aae43a649127 drm/i915/gem: Rename the list of relocations to reloc_list
f6729467a31c drm/i915/gem: Move the 'cached' info to i915_execbuffer
cbf2d6b56eea drm/i915/gem: Break apart the early i915_vma_pin from execbuf 
object lookup
89bf55e09136 drm/i915/gem: Remove the call for no-evict i915_vma_pin
225f65b93b67 drm/i915: Serialise i915_vma_pin_inplace() with i915_vma_unbind()
aeec0677ca2f drm/i915: Add list_for_each_entry_safe_continue_reverse
-:25: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'pos' - possible side-effects?
#25: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)
\
+   for (pos = list_prev_entry(pos, member),\
+n = list_prev_entry(pos, member);  \
+&pos->member != (head);\
+pos = n, n = list_prev_entry(n, member))

-:25: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'n' - possible side-effects?
#25: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)
\
+   for (pos = list_prev_entry(pos, member),\
+n = list_prev_entry(pos, member);  \
+&pos->member != (head);\
+pos = n, n = list_prev_entry(n, member))

-:25: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible 
side-effects?
#25: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)
\
+   for (pos = list_prev_entry(pos, member),\
+n = list_prev_entry(pos, member);  \
+&pos->member != (head);\
+pos = n, n = list_prev_entry(n, member))

total: 0 errors, 0 warnings, 3 checks, 12 lines checked
7ebf143a2b61 drm/i915: Always defer fenced work to the worker
1105264eb6df drm/i915/gem: Assign context id for async work
b5426c3e8f1e drm/i915/gem: Separate the ww_mutex walker into its own list
418443e4f373 drm/i915/gem: Asynchronous GTT unbinding
6909712422ff drm/i915/gem: Bind the fence async for execbuf
08253ca5d7f8 drm/i915/gem: Include cmdparser in common execbuf pinning
c04ff4d7bd4f drm/i915/gem: Include secure batch in common execbuf pinning
555b300812f8 drm/i915/gem: Manage GTT placement bias (starting offset) 
explicitly
245b73f9f203 drm/i915/gem: Reintroduce multiple passes for reloc processing
-:1390: WARNING:MEMORY_BARRIER: memory barrier without comment
#1390: FILE: drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c:174:
+   wmb();

total: 0 errors, 1 warnings, 0 checks, 1401 lines checked
aec7bc6e676a drm/i915: Add an implementation for common reservation_ww_class 
locking
-:65: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#65: 
new file mode 100644

-:360: WARNING:LINE_SPACING: Missing a blank line after declarations
#360: FILE: drivers/gpu/drm/i915/mm/st_acquire_ctx.c:106:
+   const unsigned int total = ARRAY_SIZE(dl->obj);
+   I915_RND_STATE(prng);

-:456: WA

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Patchwork
== Series Details ==

Series: Replace obj->mm.lock with reservation_ww_class
URL   : https://patchwork.freedesktop.org/series/80291/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ Fi.CI.BAT: success for Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Patchwork
== Series Details ==

Series: Replace obj->mm.lock with reservation_ww_class
URL   : https://patchwork.freedesktop.org/series/80291/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8845 -> Patchwork_18310


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/index.html

Known issues


  Here are the changes found in Patchwork_18310 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s0:
- fi-tgl-u2:  [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-tgl-u2/igt@gem_exec_susp...@basic-s0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-tgl-u2/igt@gem_exec_susp...@basic-s0.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-byt-j1900:   [DMESG-WARN][3] ([i915#1982]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-byt-j1900/igt@i915_module_l...@reload.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-byt-j1900/igt@i915_module_l...@reload.html

  * igt@kms_busy@basic@flip:
- fi-kbl-x1275:   [DMESG-WARN][5] ([i915#62] / [i915#92] / [i915#95]) 
-> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_busy@ba...@flip.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-kbl-x1275/igt@kms_busy@ba...@flip.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-7500u:   [DMESG-WARN][7] ([i915#2203]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1:
- fi-icl-u2:  [DMESG-WARN][9] ([i915#1982]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-x1275:   [DMESG-WARN][11] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][12] ([i915#1982] / [i915#62] / [i915#92])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html

  * igt@kms_force_connector_basic@force-connector-state:
- fi-kbl-x1275:   [DMESG-WARN][13] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][14] ([i915#62] / [i915#92]) +1 similar issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_force_connector_ba...@force-connector-state.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-kbl-x1275/igt@kms_force_connector_ba...@force-connector-state.html

  * igt@prime_vgem@basic-fence-flip:
- fi-kbl-x1275:   [DMESG-WARN][15] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][16] ([i915#62] / [i915#92] / [i915#95]) +9 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@prime_v...@basic-fence-flip.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/fi-kbl-x1275/igt@prime_v...@basic-fence-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (44 -> 37)
--

  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_8845 -> Patchwork_18310

  CI-20190529: 20190529
  CI_DRM_8845: a486392fed875e0b9154eaeb4bf6a4193484e0b3 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5758: bb34603947667cb44ed9ff0db8dccbb9d3f42357 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_18310: ed8fec45359a20d6d36aa007d51975e219e295d1 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ed8fec45359a drm/i915/gem: Delay attach mmu-notifier until we acquire the 
pinned userptr
5fce5e4b3f18 drm/i915/display: Drop object lock from intel_unpin_fb_vma
6e88c2dcdd0b drm/i915: Remove unused i915_gem_evict_vm()
c51644

Re: [Intel-gfx] [PATCH v3 1/2] drm/i915/hdcp: Add update_pipe early return

2020-08-05 Thread Ramalingam C
On 2020-08-05 at 17:15:20 +0530, Anshuman Gupta wrote:
> Currently intel_hdcp_update_pipe() is also getting called for non-hdcp
> connectors and goes through its conditional code flow, which is completely
> unnecessary for non-hdcp connectors, therefore it makes sense to
> have an early return. No functional change.
Looks good to me

Reviewed-by: Ramalingam C 
> 
> v2:
> - rebased.
> 
> Reviewed-by: Uma Shankar 
> Signed-off-by: Anshuman Gupta 
> ---
>  drivers/gpu/drm/i915/display/intel_hdcp.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
> b/drivers/gpu/drm/i915/display/intel_hdcp.c
> index 89a4d294822d..a1e0d518e529 100644
> --- a/drivers/gpu/drm/i915/display/intel_hdcp.c
> +++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
> @@ -2082,11 +2082,15 @@ void intel_hdcp_update_pipe(struct intel_atomic_state 
> *state,
>   struct intel_connector *connector =
>   to_intel_connector(conn_state->connector);
>   struct intel_hdcp *hdcp = &connector->hdcp;
> - bool content_protection_type_changed =
> + bool content_protection_type_changed, desired_and_not_enabled = false;
> +
> + if (!connector->hdcp.shim)
> + return;
> +
> + content_protection_type_changed =
>   (conn_state->hdcp_content_type != hdcp->content_type &&
>conn_state->content_protection !=
>DRM_MODE_CONTENT_PROTECTION_UNDESIRED);
> - bool desired_and_not_enabled = false;
>  
>   /*
>* During the HDCP encryption session if Type change is requested,
> -- 
> 2.26.2
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 10/37] drm/i915/gem: Rename the list of relocations to reloc_list

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

Continuing the theme of calling the lists a foo_list, rename the relocs
list. This means that we can now use relocs for the old reloc_cache that
is not a cache anymore.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index a5b63ae17241..e7e16c62df1c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -252,7 +252,7 @@ struct i915_execbuffer {
struct list_head unbound;
  
  	/** list of vma that have execobj.relocation_count */

-   struct list_head relocs;
+   struct list_head relocs_list;
  
  	/**

 * Track the most recently used object for relocations, as we
@@ -577,7 +577,7 @@ eb_add_vma(struct i915_execbuffer *eb,
}
  
  	if (entry->relocation_count)

-   list_add_tail(&ev->reloc_link, &eb->relocs);
+   list_add_tail(&ev->reloc_link, &eb->relocs_list);
  
  	/*

 * SNA is doing fancy tricks with compressing batch buffers, which leads
@@ -932,7 +932,7 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
unsigned int i;
int err = 0;
  
-	INIT_LIST_HEAD(&eb->relocs);

+   INIT_LIST_HEAD(&eb->relocs_list);
INIT_LIST_HEAD(&eb->unbound);
  
  	for (i = 0; i < eb->buffer_count; i++) {

@@ -1592,7 +1592,7 @@ static int eb_relocate(struct i915_execbuffer *eb)
struct eb_vma *ev;
int flush;
  
-		list_for_each_entry(ev, &eb->relocs, reloc_link) {

+   list_for_each_entry(ev, &eb->relocs_list, reloc_link) {
err = eb_relocate_vma(eb, ev);
if (err)
break;



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 11/37] drm/i915/gem: Move the 'cached' info to i915_execbuffer

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

The reloc_cache contains some details that are used outside of the
relocation handling, so lift those out of the embedded struct into the
principal struct i915_execbuffer.

Signed-off-by: Chris Wilson 
---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 61 +++
  .../i915/gem/selftests/i915_gem_execbuffer.c  |  6 +-
  2 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e7e16c62df1c..e9ef0c287fd9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -261,11 +261,6 @@ struct i915_execbuffer {
 */
struct reloc_cache {
struct drm_mm_node node; /** temporary GTT binding */
-   unsigned int gen; /** Cached value of INTEL_GEN */
-   bool use_64bit_reloc : 1;
-   bool has_llc : 1;
-   bool has_fence : 1;
-   bool needs_unfenced : 1;
  
  		struct intel_context *ce;
  
@@ -283,6 +278,12 @@ struct i915_execbuffer {

u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
  
+	unsigned int gen; /** Cached value of INTEL_GEN */

+   bool use_64bit_reloc : 1;
+   bool has_llc : 1;
+   bool has_fence : 1;
+   bool needs_unfenced : 1;
+
/**
 * Indicate either the size of the hastable used to resolve
 * relocation handles, or if negative that we are using a direct
@@ -540,11 +541,11 @@ eb_validate_vma(struct i915_execbuffer *eb,
 */
entry->offset = gen8_noncanonical_addr(entry->offset);
  
-	if (!eb->reloc_cache.has_fence) {

+   if (!eb->has_fence) {
entry->flags &= ~EXEC_OBJECT_NEEDS_FENCE;
} else {
if ((entry->flags & EXEC_OBJECT_NEEDS_FENCE ||
-eb->reloc_cache.needs_unfenced) &&
+eb->needs_unfenced) &&
i915_gem_object_is_tiled(vma->obj))
entry->flags |= EXEC_OBJECT_NEEDS_GTT | 
__EXEC_OBJECT_NEEDS_MAP;
}
@@ -592,7 +593,7 @@ eb_add_vma(struct i915_execbuffer *eb,
if (entry->relocation_count &&
!(ev->flags & EXEC_OBJECT_PINNED))
ev->flags |= __EXEC_OBJECT_NEEDS_BIAS;
-   if (eb->reloc_cache.has_fence)
+   if (eb->has_fence)
ev->flags |= EXEC_OBJECT_NEEDS_FENCE;
  
  		eb->batch = ev;

@@ -995,15 +996,19 @@ relocation_target(const struct 
drm_i915_gem_relocation_entry *reloc,
return gen8_canonical_addr((int)reloc->delta + target->node.start);
  }
  
-static void reloc_cache_init(struct reloc_cache *cache,

-struct drm_i915_private *i915)
+static void eb_info_init(struct i915_execbuffer *eb,
+struct drm_i915_private *i915)
  {
/* Must be a variable in the struct to allow GCC to unroll. */
-   cache->gen = INTEL_GEN(i915);
-   cache->has_llc = HAS_LLC(i915);
-   cache->use_64bit_reloc = HAS_64BIT_RELOC(i915);
-   cache->has_fence = cache->gen < 4;
-   cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
+   eb->gen = INTEL_GEN(i915);
+   eb->has_llc = HAS_LLC(i915);
+   eb->use_64bit_reloc = HAS_64BIT_RELOC(i915);
+   eb->has_fence = eb->gen < 4;
+   eb->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
+}
+
+static void reloc_cache_init(struct reloc_cache *cache)
+{
cache->node.flags = 0;
cache->rq = NULL;
cache->target = NULL;
@@ -1011,8 +1016,9 @@ static void reloc_cache_init(struct reloc_cache *cache,
  
  #define RELOC_TAIL 4
  
-static int reloc_gpu_chain(struct reloc_cache *cache)

+static int reloc_gpu_chain(struct i915_execbuffer *eb)
  {
+   struct reloc_cache *cache = &eb->reloc_cache;
struct intel_gt_buffer_pool_node *pool;
struct i915_request *rq = cache->rq;
struct i915_vma *batch;
@@ -1036,9 +1042,9 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
GEM_BUG_ON(cache->rq_size + RELOC_TAIL > PAGE_SIZE  / sizeof(u32));
cmd = cache->rq_cmd + cache->rq_size;
*cmd++ = MI_ARB_CHECK;
-   if (cache->gen >= 8)
+   if (eb->gen >= 8)
*cmd++ = MI_BATCH_BUFFER_START_GEN8;
-   else if (cache->gen >= 6)
+   else if (eb->gen >= 6)
*cmd++ = MI_BATCH_BUFFER_START;
else
*cmd++ = MI_BATCH_BUFFER_START | MI_BATCH_GTT;
@@ -1061,7 +1067,7 @@ static int reloc_gpu_chain(struct reloc_cache *cache)
goto out_pool;
  
  	cmd = i915_gem_object_pin_map(batch->obj,

- cache->has_llc ?
+ eb->has_llc ?
  I915_MAP_FORCE_WB :

Re: [Intel-gfx] [PATCH v3 2/2] drm/i915/hdcp: No direct access to power_well desc

2020-08-05 Thread Ramalingam C
On 2020-08-05 at 17:15:21 +0530, Anshuman Gupta wrote:
> HDCP code doesn't need to access power_well internals;
> instead it should use intel_display_power_well_is_enabled()
> to get the status of the desired power_well.
> No functional change.
> 
> v2:
> - used with_intel_runtime_pm instead of get/put. [Jani]
> v3:
> - rebased.
> 
> Cc: Jani Nikula 
> Signed-off-by: Anshuman Gupta 
LGTM.

Reviewed-by: Ramalingam C 

-Ram

> ---
>  drivers/gpu/drm/i915/display/intel_hdcp.c | 15 +++
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_hdcp.c 
> b/drivers/gpu/drm/i915/display/intel_hdcp.c
> index a1e0d518e529..e76b049618db 100644
> --- a/drivers/gpu/drm/i915/display/intel_hdcp.c
> +++ b/drivers/gpu/drm/i915/display/intel_hdcp.c
> @@ -148,9 +148,8 @@ static int intel_hdcp_poll_ksv_fifo(struct 
> intel_digital_port *dig_port,
>  
>  static bool hdcp_key_loadable(struct drm_i915_private *dev_priv)
>  {
> - struct i915_power_domains *power_domains = &dev_priv->power_domains;
> - struct i915_power_well *power_well;
>   enum i915_power_well_id id;
> + intel_wakeref_t wakeref;
>   bool enabled = false;
>  
>   /*
> @@ -162,17 +161,9 @@ static bool hdcp_key_loadable(struct drm_i915_private 
> *dev_priv)
>   else
>   id = SKL_DISP_PW_1;
>  
> - mutex_lock(&power_domains->lock);
> -
>   /* PG1 (power well #1) needs to be enabled */
> - for_each_power_well(dev_priv, power_well) {
> - if (power_well->desc->id == id) {
> - enabled = power_well->desc->ops->is_enabled(dev_priv,
> - power_well);
> - break;
> - }
> - }
> - mutex_unlock(&power_domains->lock);
> + with_intel_runtime_pm(&dev_priv->runtime_pm, wakeref)
> + enabled = intel_display_power_well_is_enabled(dev_priv, id);
>  
>   /*
>* Another req for hdcp key loadability is enabled state of pll for
> -- 
> 2.26.2
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5 1/5] drm/i915: Add enable/disable flip done and flip done handler

2020-08-05 Thread Karthik B S



On 7/25/2020 4:56 AM, Paulo Zanoni wrote:

On Mon, 2020-07-20 at 17:01 +0530, Karthik B S wrote:

Add enable/disable flip done functions and the flip done handler
function which handles the flip done interrupt.

Enable the flip done interrupt in IER.

The enable flip done function is called before writing the
surface address register, as the write to this register triggers
the flip done interrupt.

The flip done handler is used to send the page flip event as soon as the
surface address is written as per the requirement of async flips.
The interrupt is disabled after the event is sent.

v2: -Change function name from icl_* to skl_* (Paulo)
 -Move flip handler to this patch (Paulo)
 -Remove vblank_put() (Paulo)
 -Enable flip done interrupt for gen9+ only (Paulo)
 -Enable flip done interrupt in power_well_post_enable hook (Paulo)
 -Removed the event check in flip done handler to handle async
  flips without pageflip events.

v3: -Move skl_disable_flip_done out of interrupt handler (Paulo)
 -Make the pending vblank event NULL in the beginning of
  flip_done_handler to remove sporadic WARN_ON that is seen.

v4: -Calculate timestamps using flip done time stamp and current
  timestamp for async flips (Ville)

v5: -Fix the sparse warning by making the function 'g4x_get_flip_counter'
  static.(Reported-by: kernel test robot )
 -Fix the typo in commit message.

Signed-off-by: Karthik B S 
Signed-off-by: Vandita Kulkarni 
---
  drivers/gpu/drm/i915/display/intel_display.c | 10 +++
  drivers/gpu/drm/i915/i915_irq.c  | 83 ++--
  drivers/gpu/drm/i915/i915_irq.h  |  2 +
  drivers/gpu/drm/i915/i915_reg.h  |  4 +-
  4 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index db2a5a1a9b35..b8ff032195d9 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -15562,6 +15562,13 @@ static void intel_atomic_commit_tail(struct 
intel_atomic_state *state)
  
  	intel_dbuf_pre_plane_update(state);
  
+	for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {

+   if (new_crtc_state->uapi.async_flip) {
+   skl_enable_flip_done(&crtc->base);
+   break;


Do we really want the break here? What if more than one CRTC wants an
async flip?


Thanks for the review.
This will fail for the multiple-CRTC case; I will remove this break.


Perhaps you could extend IGT to try this.


Currently we cannot add this scenario of having two CRTCs in the same
commit, as we're using the page flip ioctl. But I did try by hacking via
the atomic path, and two displays with async flips are working fine.



+   }
+   }
+
/* Now enable the clocks, plane, pipe, and connectors that we set up. */
dev_priv->display.commit_modeset_enables(state);
  
@@ -15583,6 +15590,9 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)

drm_atomic_helper_wait_for_flip_done(dev, &state->base);
  
  	for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {

+   if (new_crtc_state->uapi.async_flip)
+   skl_disable_flip_done(&crtc->base);


Here we don't break in the first found, so at least there's an
inconsistency.


I will remove the break in the earlier loop.

+
if (new_crtc_state->hw.active &&
!needs_modeset(new_crtc_state) &&
!new_crtc_state->preload_luts &&
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1fa67700d8f4..95953b393941 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -697,14 +697,24 @@ u32 i915_get_vblank_counter(struct drm_crtc *crtc)
return (((high1 << 8) | low) + (pixel >= vbl_start)) & 0xff;
  }
  
+static u32 g4x_get_flip_counter(struct drm_crtc *crtc)

+{
+   struct drm_i915_private *dev_priv = to_i915(crtc->dev);
+   enum pipe pipe = to_intel_crtc(crtc)->pipe;
+
+   return I915_READ(PIPE_FLIPCOUNT_G4X(pipe));
+}
+
  u32 g4x_get_vblank_counter(struct drm_crtc *crtc)
  {
struct drm_i915_private *dev_priv = to_i915(crtc->dev);
enum pipe pipe = to_intel_crtc(crtc)->pipe;
  
+	if (crtc->state->async_flip)

+   return g4x_get_flip_counter(crtc);
+
return I915_READ(PIPE_FRMCOUNT_G4X(pipe));


I don't understand the intention behind this, can you please clarify?
This goes back to my reply of the cover letter. It seems that here
we're going to alternate between two different counters in our vblank
count. So if user space alternates between sometimes using async flips
and sometimes using normal flip it's going to get some very weird
deltas, isn't it? At least this is what I remember from when I played
with these registers: FLIPCOUNT drifts away from FRMCOUNT when we start
using async 

Re: [Intel-gfx] [PATCH v5 1/5] drm/i915: Add enable/disable flip done and flip done handler

2020-08-05 Thread Karthik B S



On 7/27/2020 5:57 PM, Michel Dänzer wrote:

On 2020-07-25 1:26 a.m., Paulo Zanoni wrote:

On Mon, 2020-07-20 at 17:01 +0530, Karthik B S wrote:


diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1fa67700d8f4..95953b393941 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -697,14 +697,24 @@ u32 i915_get_vblank_counter(struct drm_crtc *crtc)
return (((high1 << 8) | low) + (pixel >= vbl_start)) & 0xff;
  }
  
+static u32 g4x_get_flip_counter(struct drm_crtc *crtc)

+{
+   struct drm_i915_private *dev_priv = to_i915(crtc->dev);
+   enum pipe pipe = to_intel_crtc(crtc)->pipe;
+
+   return I915_READ(PIPE_FLIPCOUNT_G4X(pipe));
+}
+
  u32 g4x_get_vblank_counter(struct drm_crtc *crtc)
  {
struct drm_i915_private *dev_priv = to_i915(crtc->dev);
enum pipe pipe = to_intel_crtc(crtc)->pipe;
  
+	if (crtc->state->async_flip)

+   return g4x_get_flip_counter(crtc);
+
return I915_READ(PIPE_FRMCOUNT_G4X(pipe));


I don't understand the intention behind this, can you please clarify?
This goes back to my reply of the cover letter. It seems that here
we're going to alternate between two different counters in our vblank
count. So if user space alternates between sometimes using async flips
and sometimes using normal flip it's going to get some very weird
deltas, isn't it? At least this is what I remember from when I played
with these registers: FLIPCOUNT drifts away from FRMCOUNT when we start
using async flips.


This definitely looks wrong. The counter value returned by the
get_vblank_counter hook is supposed to increment when a vertical blank
period occurs; page flips are not supposed to affect this in any way.



Thanks for the review.
As per the feedback received, I will be removing this and reverting
to the original implementation in the next revision.


Thanks,
Karthik.B.S



___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5 1/5] drm/i915: Add enable/disable flip done and flip done handler

2020-08-05 Thread Karthik B S



On 7/28/2020 3:04 AM, Daniel Vetter wrote:

On Mon, Jul 27, 2020 at 2:27 PM Michel Dänzer  wrote:


On 2020-07-25 1:26 a.m., Paulo Zanoni wrote:

On Mon, 2020-07-20 at 17:01 +0530, Karthik B S wrote:


diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1fa67700d8f4..95953b393941 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -697,14 +697,24 @@ u32 i915_get_vblank_counter(struct drm_crtc *crtc)
  return (((high1 << 8) | low) + (pixel >= vbl_start)) & 0xff;
  }

+static u32 g4x_get_flip_counter(struct drm_crtc *crtc)
+{
+struct drm_i915_private *dev_priv = to_i915(crtc->dev);
+enum pipe pipe = to_intel_crtc(crtc)->pipe;
+
+return I915_READ(PIPE_FLIPCOUNT_G4X(pipe));
+}
+
  u32 g4x_get_vblank_counter(struct drm_crtc *crtc)
  {
  struct drm_i915_private *dev_priv = to_i915(crtc->dev);
  enum pipe pipe = to_intel_crtc(crtc)->pipe;

+if (crtc->state->async_flip)
+return g4x_get_flip_counter(crtc);
+
  return I915_READ(PIPE_FRMCOUNT_G4X(pipe));


I don't understand the intention behind this, can you please clarify?
This goes back to my reply of the cover letter. It seems that here
we're going to alternate between two different counters in our vblank
count. So if user space alternates between sometimes using async flips
and sometimes using normal flip it's going to get some very weird
deltas, isn't it? At least this is what I remember from when I played
with these registers: FLIPCOUNT drifts away from FRMCOUNT when we start
using async flips.


This definitely looks wrong. The counter value returned by the
get_vblank_counter hook is supposed to increment when a vertical blank
period occurs; page flips are not supposed to affect this in any way.


Also you just flat out can't access crtc->state from interrupt
context. Anything you need in there needs to be protected by the right
irq-type spin_lock, updates correctly synchronized against both the
interrupt handler and atomic updates, and data copied over, not
pointers. Otherwise just crash&burn.


Thanks for the review.
I will be removing this change in the next revision based on the 
feedback received, but I will keep this in mind whenever I have to
access something from interrupt context.


Thanks,
Karthik.B.S

-Daniel


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 14/37] drm/i915: Serialise i915_vma_pin_inplace() with i915_vma_unbind()

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

Directly serialise the atomic pinning with evicting the vma from unbind,
using a pair of coupled cmpxchg to avoid fighting over vm->mutex.


Assumption being bind/unbind should never contend and create a 
busy-spinny section? And motivation being.. ?



Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_vma.c | 45 ++---
  1 file changed, 14 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index dbe11b349175..17ce0bce318e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -742,12 +742,10 @@ i915_vma_detach(struct i915_vma *vma)
  
  bool i915_vma_pin_inplace(struct i915_vma *vma, unsigned int flags)

  {
-   unsigned int bound;
-   bool pinned = true;
+   unsigned int bound = atomic_read(&vma->flags);
  
  	GEM_BUG_ON(flags & ~I915_VMA_BIND_MASK);
  
-	bound = atomic_read(&vma->flags);

do {
if (unlikely(flags & ~bound))
return false;
@@ -755,34 +753,10 @@ bool i915_vma_pin_inplace(struct i915_vma *vma, unsigned 
int flags)
if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR)))
return false;
  
-		if (!(bound & I915_VMA_PIN_MASK))

-   goto unpinned;
-
GEM_BUG_ON(((bound + 1) & I915_VMA_PIN_MASK) == 0);
} while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
  
  	return true;

-
-unpinned:
-   /*
-* If pin_count==0, but we are bound, check under the lock to avoid
-* racing with a concurrent i915_vma_unbind().
-*/
-   mutex_lock(&vma->vm->mutex);
-   do {
-   if (unlikely(bound & (I915_VMA_OVERFLOW | I915_VMA_ERROR))) {
-   pinned = false;
-   break;
-   }
-
-   if (unlikely(flags & ~bound)) {
-   pinned = false;
-   break;
-   }
-   } while (!atomic_try_cmpxchg(&vma->flags, &bound, bound + 1));
-   mutex_unlock(&vma->vm->mutex);
-
-   return pinned;
  }
  
  static int vma_get_pages(struct i915_vma *vma)

@@ -1292,6 +1266,7 @@ void __i915_vma_evict(struct i915_vma *vma)
  
  int __i915_vma_unbind(struct i915_vma *vma)

  {
+   unsigned int bound;
int ret;
  
  	lockdep_assert_held(&vma->vm->mutex);

@@ -1299,10 +1274,18 @@ int __i915_vma_unbind(struct i915_vma *vma)
if (!drm_mm_node_allocated(&vma->node))
return 0;
  
-	if (i915_vma_is_pinned(vma)) {

-   vma_print_allocator(vma, "is pinned");
-   return -EAGAIN;
-   }
+   /* Serialise with i915_vma_pin_inplace() */
+   bound = atomic_read(&vma->flags);
+   do {
+   if (unlikely(bound & I915_VMA_PIN_MASK)) {
+   vma_print_allocator(vma, "is pinned");
+   return -EAGAIN;
+   }
+
+   if (unlikely(bound & I915_VMA_ERROR))
+   break;
+   } while (!atomic_try_cmpxchg(&vma->flags,
+&bound, bound | I915_VMA_ERROR));
Using the error flag is somehow critical for this scheme to work? Can 
you please explain in the comment and/or commit message?


  
  	/*

 * After confirming that no one else is pinning this vma, wait for



Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
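
As a reader's aid, here is a minimal userspace sketch of the coupled-cmpxchg scheme from the patch above, with invented names and a simplified flag layout: the pin side refuses to increment once the unbind/error flag is set, and the unbind side refuses to set that flag while the pin count is non-zero, so the two paths serialise purely on the flags word. Note the loops only retry when the flags word changes under them, so contention is brief rather than a spin for the length of an unbind:

#include <stdatomic.h>
#include <stdbool.h>

#define PIN_MASK        0x00ffffffu     /* low bits: pin count */
#define ERROR_FLAG      0x80000000u     /* set while the vma is being unbound */

static bool try_pin(atomic_uint *flags)
{
        unsigned int old = atomic_load(flags);

        do {
                if (old & ERROR_FLAG)
                        return false;   /* unbind won the race */
        } while (!atomic_compare_exchange_weak(flags, &old, old + 1));

        return true;
}

static bool try_unbind(atomic_uint *flags)
{
        unsigned int old = atomic_load(flags);

        do {
                if (old & PIN_MASK)
                        return false;   /* still pinned; caller must retry */
        } while (!atomic_compare_exchange_weak(flags, &old, old | ERROR_FLAG));

        return true;    /* no try_pin() can succeed from here on */
}
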


Re: [Intel-gfx] [PATCH 16/37] drm/i915: Always defer fenced work to the worker

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

Currently, if an error is raised we always call the cleanup locally
[and skip the main work callback]. However, some future users may need
to take a mutex to cleanup and so we cannot immediately execute the
cleanup as we may still be in interrupt context. For example, if we have
committed sensitive changes [like evicting from the ppGTT layout] that
are visible but gated behind the fence, we need to ensure those changes
are completed even after an error. [This does suggest the split between
the work/release callback is artificial and we may be able to simplify
the worker api by only requiring a single callback.]

With the execute-immediate flag, for most cases this should result in
immediate cleanup of an error.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_sw_fence_work.c | 26 +++
  1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence_work.c 
b/drivers/gpu/drm/i915/i915_sw_fence_work.c
index a3a81bb8f2c3..e094fd0a4202 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence_work.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence_work.c
@@ -16,11 +16,14 @@ static void fence_complete(struct dma_fence_work *f)
  static void fence_work(struct work_struct *work)
  {
struct dma_fence_work *f = container_of(work, typeof(*f), work);
-   int err;
  
-	err = f->ops->work(f);

-   if (err)
-   dma_fence_set_error(&f->dma, err);
+   if (!f->dma.error) {
+   int err;
+
+   err = f->ops->work(f);
+   if (err)
+   dma_fence_set_error(&f->dma, err);
+   }
  
  	fence_complete(f);

dma_fence_put(&f->dma);
@@ -36,15 +39,10 @@ fence_notify(struct i915_sw_fence *fence, enum 
i915_sw_fence_notify state)
if (fence->error)
dma_fence_set_error(&f->dma, fence->error);
  
-		if (!f->dma.error) {

-   dma_fence_get(&f->dma);
-   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
-   fence_work(&f->work);
-   else
-   queue_work(system_unbound_wq, &f->work);
-   } else {
-   fence_complete(f);
-   }
+   if (test_bit(DMA_FENCE_WORK_IMM, &f->dma.flags))
+   fence_work(&f->work);
+   else
+   queue_work(system_unbound_wq, &f->work);
break;
  
  	case FENCE_FREE:

@@ -91,6 +89,8 @@ void dma_fence_work_init(struct dma_fence_work *f,
dma_fence_init(&f->dma, &fence_ops, &f->lock, 0, 0);
i915_sw_fence_init(&f->chain, fence_notify);
INIT_WORK(&f->work, fence_work);
+
+   dma_fence_get(&f->dma); /* once for the chain; once for the work */
  }
  
  int dma_fence_work_chain(struct dma_fence_work *f, struct dma_fence *signal)




Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 17/37] drm/i915/gem: Assign context id for async work

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

Allocate a few dma fence context ids that we can use to associate async work
[for the CPU] launched on behalf of this context. For extra fun, we allow
a configurable concurrency width.

A current example would be that we spawn an unbound worker for every
userptr get_pages. In the future, we wish to charge this work to the
context that initiated the async work and to impose concurrency limits
based on the context.

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 4 
  drivers/gpu/drm/i915/gem/i915_gem_context.h   | 6 ++
  drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++
  3 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index db893f6c516b..bc80e7d3c50a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -721,6 +721,10 @@ __create_context(struct drm_i915_private *i915)
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->link);
  
+	ctx->async.width = rounddown_pow_of_two(num_online_cpus());

+   ctx->async.context = dma_fence_context_alloc(ctx->async.width);
+   ctx->async.width--;
+
spin_lock_init(&ctx->stale.lock);
INIT_LIST_HEAD(&ctx->stale.engines);
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h

index a133f92bbedb..f254458a795e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -134,6 +134,12 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
  int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
   struct drm_file *file);
  
+static inline u64 i915_gem_context_async_id(struct i915_gem_context *ctx)

+{
+   return (ctx->async.context +
+   (atomic_fetch_inc(&ctx->async.cur) & ctx->async.width));
+}
+
  static inline struct i915_gem_context *
  i915_gem_context_get(struct i915_gem_context *ctx)
  {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index ae14ca24a11f..52561f98000f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -85,6 +85,12 @@ struct i915_gem_context {
  
  	struct intel_timeline *timeline;
  
+	struct {

+   u64 context;
+   atomic_t cur;
+   unsigned int width;
+   } async;
+
/**
 * @vm: unique address space (GTT)
 *



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
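
A small userspace sketch of the id arithmetic reviewed above (illustrative names): the context reserves a power-of-two block of fence contexts, stores width - 1 as a mask so the modulo reduces to a single AND, and one atomic cursor then spreads async work across the block:

#include <stdatomic.h>
#include <stdint.h>

struct async_ids {
        uint64_t base;          /* first fence context id of the block */
        atomic_uint cur;        /* monotonically increasing cursor */
        unsigned int mask;      /* width - 1, with width a power of two */
};

static void async_ids_init(struct async_ids *ids, uint64_t base,
                           unsigned int width /* must be a power of two */)
{
        ids->base = base;
        atomic_init(&ids->cur, 0);
        ids->mask = width - 1;
}

static uint64_t async_id_next(struct async_ids *ids)
{
        /* fetch-and-increment, wrapped into the block by the mask */
        return ids->base + (atomic_fetch_add(&ids->cur, 1) & ids->mask);
}
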


Re: [Intel-gfx] [PATCH 23/37] drm/i915/gem: Manage GTT placement bias (starting offset) explicitly

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:22, Chris Wilson wrote:

Since we can control placement in the ppGTT explicitly, we can specify
our desired starting offset exactly on a per-vma basis. This prevents us
from falling into a few corner cases where we confuse the user with our choices.

Signed-off-by: Chris Wilson 
---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 67 +--
  1 file changed, 31 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 19cab5541dbc..0839397c7e50 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -36,6 +36,7 @@ struct eb_vma {
  
  	/** This vma's place in the execbuf reservation list */

struct drm_i915_gem_exec_object2 *exec;
+   u32 bias;
  
  	struct list_head bind_link;

struct list_head unbound_link;
@@ -61,15 +62,12 @@ struct eb_vma_array {
  #define __EXEC_OBJECT_HAS_PIN BIT(31)
  #define __EXEC_OBJECT_HAS_FENCE   BIT(30)
  #define __EXEC_OBJECT_NEEDS_MAP   BIT(29)
-#define __EXEC_OBJECT_NEEDS_BIAS   BIT(28)
-#define __EXEC_OBJECT_INTERNAL_FLAGS   (~0u << 28) /* all of the above */
+#define __EXEC_OBJECT_INTERNAL_FLAGS   (~0u << 29) /* all of the above */
  
  #define __EXEC_HAS_RELOC	BIT(31)

  #define __EXEC_INTERNAL_FLAGS (~0u << 31)
  #define UPDATEPIN_OFFSET_FIXED
  
-#define BATCH_OFFSET_BIAS (256*1024)

-
  #define __I915_EXEC_ILLEGAL_FLAGS \
(__I915_EXEC_UNKNOWN_FLAGS | \
 I915_EXEC_CONSTANTS_MASK  | \
@@ -291,7 +289,7 @@ struct i915_execbuffer {
} parser;
  
  	u64 invalid_flags; /** Set of execobj.flags that are invalid */

-   u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+   u32 context_bias;
  
  	u32 batch_start_offset; /** Location within object of batch */

u32 batch_len; /** Length of batch within object */
@@ -491,11 +489,12 @@ static int eb_create(struct i915_execbuffer *eb)
return 0;
  }
  
-static bool

-eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry,
-const struct i915_vma *vma,
-unsigned int flags)
+static bool eb_vma_misplaced(const struct eb_vma *ev)
  {
+   const struct drm_i915_gem_exec_object2 *entry = ev->exec;
+   const struct i915_vma *vma = ev->vma;
+   unsigned int flags = ev->flags;
+
if (test_bit(I915_VMA_ERROR_BIT, __i915_vma_flags(vma)))
return true;
  
@@ -509,8 +508,7 @@ eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry,

vma->node.start != entry->offset)
return true;
  
-	if (flags & __EXEC_OBJECT_NEEDS_BIAS &&

-   vma->node.start < BATCH_OFFSET_BIAS)
+   if (vma->node.start < ev->bias)
return true;
  
  	if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&

@@ -529,10 +527,7 @@ static bool eb_pin_vma_fence_inplace(struct eb_vma *ev)
return false; /* We need to add some new fence serialisation */
  }
  
-static inline bool

-eb_pin_vma_inplace(struct i915_execbuffer *eb,
-  const struct drm_i915_gem_exec_object2 *entry,
-  struct eb_vma *ev)
+static inline bool eb_pin_vma_inplace(struct eb_vma *ev)
  {
struct i915_vma *vma = ev->vma;
unsigned int pin_flags;
@@ -541,7 +536,7 @@ eb_pin_vma_inplace(struct i915_execbuffer *eb,
if (!i915_active_is_idle(&vma->vm->binding))
return false;
  
-	if (eb_vma_misplaced(entry, vma, ev->flags))

+   if (eb_vma_misplaced(ev))
return false;
  
  	pin_flags = PIN_USER;

@@ -559,7 +554,7 @@ eb_pin_vma_inplace(struct i915_execbuffer *eb,
}
}
  
-	GEM_BUG_ON(eb_vma_misplaced(entry, vma, ev->flags));

+   GEM_BUG_ON(eb_vma_misplaced(ev));
  
  	ev->flags |= __EXEC_OBJECT_HAS_PIN;

return true;
@@ -608,9 +603,6 @@ eb_validate_vma(struct i915_execbuffer *eb,
entry->flags |= EXEC_OBJECT_NEEDS_GTT | 
__EXEC_OBJECT_NEEDS_MAP;
}
  
-	if (!(entry->flags & EXEC_OBJECT_PINNED))

-   entry->flags |= eb->context_flags;
-
return 0;
  }
  
@@ -627,6 +619,7 @@ eb_add_vma(struct i915_execbuffer *eb,

ev->vma = vma;
ev->exec = entry;
ev->flags = entry->flags;
+   ev->bias = eb->context_bias;
  
  	if (eb->lut_size > 0) {

ev->handle = entry->handle;
@@ -653,7 +646,8 @@ eb_add_vma(struct i915_execbuffer *eb,
if (i == batch_idx) {
if (entry->relocation_count &&
!(ev->flags & EXEC_OBJECT_PINNED))
-   ev->flags |= __EXEC_OBJECT_NEEDS_BIAS;
+   ev->bias = max_t(u32, ev->bias, SZ_256K);


What dictates the 256KiB border? Wondering if this is too hidden in here 
or not.


Regards,

Tvrtko


+
if (eb->has_fence)
ev->flags |= EXE
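
One possible answer to the question above, taken from the diff itself: SZ_256K matches the BATCH_OFFSET_BIAS define (256*1024) that the same patch removes, so the constant appears to be the old batch-buffer bias carried over inline rather than a new constraint.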

Re: [Intel-gfx] [PATCH 6/6] drm/xen-front: Add support for EDID based configuration

2020-08-05 Thread kernel test robot
Hi Oleksandr,

I love your patch! Perhaps something to improve:

[auto build test WARNING on drm-exynos/exynos-drm-next]
[also build test WARNING on drm-intel/for-linux-next 
tegra-drm/drm/tegra/for-next drm-tip/drm-tip linus/master v5.8 next-20200804]
[cannot apply to xen-tip/linux-next drm/drm-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Oleksandr-Andrushchenko/Fixes-and-improvements-for-Xen-pvdrm/20200731-205350
base:   https://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git 
exynos-drm-next
compiler: aarch64-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


cppcheck warnings: (new ones prefixed by >>)

>> drivers/irqchip/irq-gic.c:161:24: warning: Local variable gic_data shadows 
>> outer variable [shadowVar]
struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
  ^
   drivers/irqchip/irq-gic.c:123:29: note: Shadowed declaration
   static struct gic_chip_data gic_data[CONFIG_ARM_GIC_MAX_NR] __read_mostly;
   ^
   drivers/irqchip/irq-gic.c:161:24: note: Shadow variable
struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
  ^
   drivers/irqchip/irq-gic.c:167:24: warning: Local variable gic_data shadows 
outer variable [shadowVar]
struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
  ^
   drivers/irqchip/irq-gic.c:123:29: note: Shadowed declaration
   static struct gic_chip_data gic_data[CONFIG_ARM_GIC_MAX_NR] __read_mostly;
   ^
   drivers/irqchip/irq-gic.c:167:24: note: Shadow variable
struct gic_chip_data *gic_data = irq_data_get_irq_chip_data(d);
  ^
>> drivers/irqchip/irq-gic.c:400:28: warning: Local variable gic_irq shadows 
>> outer function [shadowFunction]
unsigned int cascade_irq, gic_irq;
  ^
   drivers/irqchip/irq-gic.c:171:28: note: Shadowed declaration
   static inline unsigned int gic_irq(struct irq_data *d)
  ^
   drivers/irqchip/irq-gic.c:400:28: note: Shadow variable
unsigned int cascade_irq, gic_irq;
  ^
>> drivers/irqchip/irq-gic.c:1507:14: warning: Local variable gic_cpu_base 
>> shadows outer function [shadowFunction]
phys_addr_t gic_cpu_base;
^
   drivers/irqchip/irq-gic.c:165:29: note: Shadowed declaration
   static inline void __iomem *gic_cpu_base(struct irq_data *d)
   ^
   drivers/irqchip/irq-gic.c:1507:14: note: Shadow variable
phys_addr_t gic_cpu_base;
^
>> drivers/irqchip/irq-gic-v3.c:874:71: warning: Boolean result is used in 
>> bitwise operation. Clarify expression with parentheses. [clarifyCondition]
gic_data.rdists.has_direct_lpi &= (!!(typer & GICR_TYPER_DirectLPIS) |
 ^
>> drivers/irqchip/irq-gic-v3.c:1808:6: warning: Local variable 
>> nr_redist_regions shadows outer variable [shadowVar]
u32 nr_redist_regions;
^
   drivers/irqchip/irq-gic-v3.c:1880:6: note: Shadowed declaration
u32 nr_redist_regions;
^
   drivers/irqchip/irq-gic-v3.c:1808:6: note: Shadow variable
u32 nr_redist_regions;
^
>> drivers/irqchip/irq-gic-v3.c:2042:6: warning: Local variable maint_irq_mode 
>> shadows outer variable [shadowVar]
int maint_irq_mode;
^
   drivers/irqchip/irq-gic-v3.c:1884:6: note: Shadowed declaration
int maint_irq_mode;
^
   drivers/irqchip/irq-gic-v3.c:2042:6: note: Shadow variable
int maint_irq_mode;
^
>> drivers/gpu/drm/xen/xen_drm_front_cfg.c:76:6: warning: Variable 'ret' is 
>> reassigned a value before the old one has been used. [redundantAssignment]
ret = xen_drm_front_get_edid(front_info, index, pages,
^
   drivers/gpu/drm/xen/xen_drm_front_cfg.c:61:0: note: Variable 'ret' is 
reassigned a value before the old one has been used.
int i, npages, ret = -ENOMEM;
   ^
   drivers/gpu/drm/xen/xen_drm_front_cfg.c:76:6: note: Variable 'ret' is 
reassigned a value before the old one has been used.
ret = xen_drm_front_get_edid(front_info, index, pages,
^

vim +/ret +76 drivers/gpu/drm/xen/xen_drm_front_cfg.c

54  
55  static void cfg_connector_edid(struct xen_drm_front_info *front_info,
56 struct xen_drm_front_cfg_connector 
*connector,
57 int index)
58  {
59  struct page **pages;
60  u32 edid_sz;
61  int i, npages, ret = -ENOMEM;
62  
63  connector->edid = vmalloc(XENDISPL_EDID_MAX_SIZE);
64  if (!connector->edid)
65  

[Intel-gfx] [PATCH] i915/tgl: Fix TC-cold block/unblock sequence

2020-08-05 Thread Imre Deak
The command register is the low PCODE MBOX register, not the high
one as described by the spec. This left the system with the TC-cold
power state being blocked all the time. Fix things by using the correct
register.

Also, to make sure we retry a request for at least 600usec when the
PCODE MBOX command itself succeeded but the TC-cold block command
failed, sleep for 1msec unconditionally after any failure.

The change was tested with JTAG register read of the HW/FW's actual
TC-cold state, which reported the expected states after this change.

Tested-by: Nivedita Swaminathan 
Cc: José Roberto de Souza 
Signed-off-by: Imre Deak 
---
 drivers/gpu/drm/i915/display/intel_display_power.c | 10 +-
 drivers/gpu/drm/i915/i915_reg.h|  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 9f0241a53a45..8f0b712ed7a0 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -3927,12 +3927,13 @@ tgl_tc_cold_request(struct drm_i915_private *i915, bool 
block)
int ret;
 
while (1) {
-   u32 low_val = 0, high_val;
+   u32 low_val;
+   u32 high_val = 0;
 
if (block)
-   high_val = TGL_PCODE_EXIT_TCCOLD_DATA_H_BLOCK_REQ;
+   low_val = TGL_PCODE_EXIT_TCCOLD_DATA_L_BLOCK_REQ;
else
-   high_val = TGL_PCODE_EXIT_TCCOLD_DATA_H_UNBLOCK_REQ;
+   low_val = TGL_PCODE_EXIT_TCCOLD_DATA_L_UNBLOCK_REQ;
 
/*
 * Spec states that we should timeout the request after 200us
@@ -3951,8 +3952,7 @@ tgl_tc_cold_request(struct drm_i915_private *i915, bool 
block)
if (++tries == 3)
break;
 
-   if (ret == -EAGAIN)
-   msleep(1);
+   msleep(1);
}
 
if (ret)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2b403df03404..e85c6fc1f3cb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -9226,8 +9226,8 @@ enum {
 #define   DISPLAY_IPS_CONTROL  0x19
 #define   TGL_PCODE_TCCOLD 0x26
 #define TGL_PCODE_EXIT_TCCOLD_DATA_L_EXIT_FAILED   REG_BIT(0)
-#define TGL_PCODE_EXIT_TCCOLD_DATA_H_BLOCK_REQ 0
-#define TGL_PCODE_EXIT_TCCOLD_DATA_H_UNBLOCK_REQ   REG_BIT(0)
+#define TGL_PCODE_EXIT_TCCOLD_DATA_L_BLOCK_REQ 0
+#define TGL_PCODE_EXIT_TCCOLD_DATA_L_UNBLOCK_REQ   REG_BIT(0)
 /* See also IPS_CTL */
 #define IPS_PCODE_CONTROL  (1 << 30)
 #define   HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL 0x1A
-- 
2.23.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
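
The resulting retry loop, extracted for clarity (simplified, with attempt() standing in for the PCODE MBOX request that the spec bounds to 200us per try): three attempts with an unconditional 1ms sleep after any failure comfortably cover the 600usec retry window the commit message cites.

#include <linux/delay.h>

static int request_with_retries(int (*attempt)(void))
{
        int tries = 0;
        int ret;

        while (1) {
                ret = attempt();        /* internally times out after ~200us */
                if (!ret)
                        break;
                if (++tries == 3)
                        break;
                msleep(1);              /* after any failure, not just -EAGAIN */
        }

        return ret;
}
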


Re: [Intel-gfx] [PATCH 01/37] drm/i915/gem: Reduce context termination list iteration guard to RCU

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:21, Chris Wilson wrote:

As we now protect the timeline list using RCU, we can drop the
timeline->mutex for guarding the list iteration during context close, as
we are searching for an inflight request. Any new request will see that the
context is banned and will not be submitted. In doing so, pull the checks for
a concurrent submission of the request (notably the
i915_request_completed()) under the engine spinlock, to fully serialise
with __i915_request_submit(). That is, in the case of preempt-to-busy,
where the request may be completed during __i915_request_submit(),
we need to be careful to sample the request status after
serialising so that we don't miss the request the engine is actually
submitting.

Fixes: 4a3174152147 ("drm/i915/gem: Refine occupancy test in kill_context()")
References: d22d2d073ef8 ("drm/i915: Protect i915_request_await_start from early 
waits") # rcu protection of timeline->requests
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1622
Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c | 32 -
  1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d8cccbab7a51..db893f6c516b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -439,29 +439,36 @@ static bool __cancel_engine(struct intel_engine_cs 
*engine)
return __reset_engine(engine);
  }
  
-static struct intel_engine_cs *__active_engine(struct i915_request *rq)

+static bool
+__active_engine(struct i915_request *rq, struct intel_engine_cs **active)
  {
struct intel_engine_cs *engine, *locked;
+   bool ret = false;
  
  	/*

 * Serialise with __i915_request_submit() so that it sees
 * is-banned?, or we know the request is already inflight.
+*
+* Note that rq->engine is unstable, and so we double
+* check that we have acquired the lock on the final engine.
 */
locked = READ_ONCE(rq->engine);
spin_lock_irq(&locked->active.lock);
while (unlikely(locked != (engine = READ_ONCE(rq->engine {
spin_unlock(&locked->active.lock);
-   spin_lock(&engine->active.lock);
locked = engine;
+   spin_lock(&locked->active.lock);
}
  
-	engine = NULL;

-   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
-   engine = rq->engine;
+   if (!i915_request_completed(rq)) {
+   if (i915_request_is_active(rq) && rq->fence.error != -EIO)
+   *active = locked;
+   ret = true;


So not completed but also not submitted will return true and no engine..


+   }
  
  	spin_unlock_irq(&locked->active.lock);
  
-	return engine;

+   return ret;
  }
  
  static struct intel_engine_cs *active_engine(struct intel_context *ce)

@@ -472,17 +479,16 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
if (!ce->timeline)
return NULL;
  
-	mutex_lock(&ce->timeline->mutex);

-   list_for_each_entry_reverse(rq, &ce->timeline->requests, link) {
-   if (i915_request_completed(rq))
-   break;
+   rcu_read_lock();
+   list_for_each_entry_rcu(rq, &ce->timeline->requests, link) {
+   if (i915_request_is_active(rq) && i915_request_completed(rq))
+   continue;
  
  		/* Check with the backend if the request is inflight */

-   engine = __active_engine(rq);
-   if (engine)
+   if (__active_engine(rq, &engine))
break;


... hence the caller of this will say no action. Because not active
means not submitted, that's okay and matches the old behaviour. The need for
a bool return and an output engine looks like a consequence of iterating the list
in a different direction.


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko


}
-   mutex_unlock(&ce->timeline->mutex);
+   rcu_read_unlock();
  
  	return engine;

  }


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
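
A distilled userspace sketch of the lock-chasing idiom in __active_engine() discussed above (types and names invented here): the request's engine pointer can change until we hold the lock of the engine it currently points to, so we lock, re-check the pointer under that lock, and chase until it is stable:

#include <pthread.h>
#include <stdatomic.h>

struct engine {
        pthread_mutex_t lock;
};

struct request {
        _Atomic(struct engine *) engine;        /* may be moved concurrently */
};

static struct engine *lock_request_engine(struct request *rq)
{
        struct engine *locked = atomic_load(&rq->engine);

        pthread_mutex_lock(&locked->lock);
        while (locked != atomic_load(&rq->engine)) {
                pthread_mutex_unlock(&locked->lock);
                locked = atomic_load(&rq->engine);
                pthread_mutex_lock(&locked->lock);
        }

        return locked;  /* caller releases locked->lock when done */
}
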


Re: [Intel-gfx] [PATCH 02/37] drm/i915/gt: Protect context lifetime with RCU

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:21, Chris Wilson wrote:

Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.


Is this related to the debugfs race, optimising driver latencies, or
both? Needing to hack up mutex_reinit bothers me, on top of the general desire
to avoid even more RCU complexity.


Regards,

Tvrtko


Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/gt/intel_context.c | 330 +---
  drivers/gpu/drm/i915/i915_active.c  |  10 +
  drivers/gpu/drm/i915/i915_active.h  |   2 +
  drivers/gpu/drm/i915/i915_utils.h   |   7 +
  4 files changed, 202 insertions(+), 147 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 52db2bde44a3..4e7924640ffa 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -22,7 +22,7 @@ static struct i915_global_context {
  
  static struct intel_context *intel_context_alloc(void)

  {
-   return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL);
+   return kmem_cache_alloc(global.slab_ce, GFP_KERNEL);
  }
  
  void intel_context_free(struct intel_context *ce)

@@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce)
kmem_cache_free(global.slab_ce, ce);
  }
  
+static int __context_pin_state(struct i915_vma *vma)

+{
+   unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS;
+   int err;
+
+   err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&vma->active);
+   if (err)
+   goto err_unpin;
+
+   /*
+* And mark it as a globally pinned object to let the shrinker know
+* it cannot reclaim the object until we release it.
+*/
+   i915_vma_make_unshrinkable(vma);
+   vma->obj->mm.dirty = true;
+
+   return 0;
+
+err_unpin:
+   i915_vma_unpin(vma);
+   return err;
+}
+
+static void __context_unpin_state(struct i915_vma *vma)
+{
+   i915_vma_make_shrinkable(vma);
+   i915_active_release(&vma->active);
+   __i915_vma_unpin(vma);
+}
+
+static int __ring_active(struct intel_ring *ring)
+{
+   int err;
+
+   err = intel_ring_pin(ring);
+   if (err)
+   return err;
+
+   err = i915_active_acquire(&ring->vma->active);
+   if (err)
+   goto err_pin;
+
+   return 0;
+
+err_pin:
+   intel_ring_unpin(ring);
+   return err;
+}
+
+static void __ring_retire(struct intel_ring *ring)
+{
+   i915_active_release(&ring->vma->active);
+   intel_ring_unpin(ring);
+}
+
+__i915_active_call
+static void __intel_context_retire(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+
+   CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n",
+intel_context_get_total_runtime_ns(ce),
+intel_context_get_avg_runtime_ns(ce));
+
+   set_bit(CONTEXT_VALID_BIT, &ce->flags);
+   if (ce->state)
+   __context_unpin_state(ce->state);
+
+   intel_timeline_unpin(ce->timeline);
+   __ring_retire(ce->ring);
+
+   intel_context_put(ce);
+}
+
+static int __intel_context_active(struct i915_active *active)
+{
+   struct intel_context *ce = container_of(active, typeof(*ce), active);
+   int err;
+
+   CE_TRACE(ce, "active\n");
+
+   intel_context_get(ce);
+
+   err = __ring_active(ce->ring);
+   if (err)
+   goto err_put;
+
+   err = intel_timeline_pin(ce->timeline);
+   if (err)
+   goto err_ring;
+
+   if (!ce->state)
+   return 0;
+
+   err = __context_pin_state(ce->state);
+   if (err)
+   goto err_timeline;
+
+   return 0;
+
+err_timeline:
+   intel_timeline_unpin(ce->timeline);
+err_ring:
+   __ring_retire(ce->ring);
+err_put:
+   intel_context_put(ce);
+   return err;
+}
+
+static void __intel_context_ctor(void *arg)
+{
+   struct intel_context *ce = arg;
+
+   INIT_LIST_HEAD(&ce->signal_link);
+   INIT_LIST_HEAD(&ce->signals);
+
+   atomic_set(&ce->pin_count, 0);
+   mutex_init(&ce->pin_mutex);
+
+   ce->active_count = 0;
+   i915_active_init(&ce->active,
+__intel_context_active, __intel_context_retire);
+
+   ce->inflight = NULL;
+   ce->lrc_reg_state = NULL;
+   ce->lrc.desc = 0;
+}
+
+static void
+__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
+{
+   GEM_BUG_ON(!engine->cops);
+   GEM_BUG_ON(!engine->gt->vm);
+
+   kref_init(&ce->re
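
A note on the constructor pattern visible in the (truncated) diff above: with a typesafe-by-RCU slab, a freed object can be re-allocated while a stale reader still dereferences it, so reader-visible invariants such as locks and list heads are initialised once in the slab constructor and deliberately not re-cleared per allocation. A loose userspace analogue, with invented names:

#include <pthread.h>

struct ctx {
        pthread_mutex_t pin_mutex;      /* ctor-initialised, never re-cleared */
        int pin_count;                  /* reset per allocation instead */
};

/* Runs once when the backing memory is first created, not on every alloc. */
static void ctx_ctor(void *arg)
{
        struct ctx *ce = arg;

        pthread_mutex_init(&ce->pin_mutex, NULL);
}

/* Per-allocation setup touches only fields no stale reader depends on. */
static void ctx_alloc_init(struct ctx *ce)
{
        ce->pin_count = 0;
}
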

Re: [Intel-gfx] [PATCH 03/37] drm/i915/gt: Free stale request on destroying the virtual engine

2020-08-05 Thread Tvrtko Ursulin



On 05/08/2020 13:21, Chris Wilson wrote:

Since preempt-to-busy, we may unsubmit a request while it is still on
the HW and completes asynchronously. That means it may be retired and in
the process destroy the virtual engine (as the user has closed their
context), but that engine may still be holding onto the unsubmitted
completed request. Therefore we need to potentially clean up the old
request on destroying the virtual engine. We also have to keep the
virtual_engine alive until after the sibling's execlists_dequeue() have
finished peeking into the virtual engines, for which we serialise with
RCU.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
  drivers/gpu/drm/i915/gt/intel_lrc.c | 22 +++---
  1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 417f6b0c6c61..cb04bc5474be 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -180,6 +180,7 @@
  #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
  
  struct virtual_engine {

+   struct rcu_head rcu;
struct intel_engine_cs base;
struct intel_context context;
  
@@ -5393,10 +5394,25 @@ static void virtual_context_destroy(struct kref *kref)

container_of(kref, typeof(*ve), context.ref);
unsigned int n;
  
-	GEM_BUG_ON(!list_empty(virtual_queue(ve)));

-   GEM_BUG_ON(ve->request);
GEM_BUG_ON(ve->context.inflight);
  
+	if (unlikely(ve->request)) {

+   struct i915_request *old;
+   unsigned long flags;
+
+   spin_lock_irqsave(&ve->base.active.lock, flags);
+
+   old = fetch_and_zero(&ve->request);
+   if (old) {
+   GEM_BUG_ON(!i915_request_completed(old));
+   __i915_request_submit(old);
+   i915_request_put(old);
+   }
+
+   spin_unlock_irqrestore(&ve->base.active.lock, flags);
+   }
+   GEM_BUG_ON(!list_empty(virtual_queue(ve)));
+
for (n = 0; n < ve->num_siblings; n++) {
struct intel_engine_cs *sibling = ve->siblings[n];
struct rb_node *node = &ve->nodes[sibling->id].rb;
@@ -5422,7 +5438,7 @@ static void virtual_context_destroy(struct kref *kref)
intel_engine_free_request_pool(&ve->base);
  
  	kfree(ve->bonds);

-   kfree(ve);
+   kfree_rcu(ve, rcu);
  }
  
  static void virtual_engine_initial_hint(struct virtual_engine *ve)




If it were to go in without the previous patch, I think it would simply mean a
normal kfree here. In both cases it looks okay to me.


Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
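
For reference, the kfree_rcu() idiom the patch settles on, in isolation (a generic illustration, not the driver's structs): the embedded rcu_head lets the actual kfree() be deferred past an RCU grace period, so the siblings' execlists_dequeue() peeking mentioned in the commit message never observes freed memory.

#include <linux/rcupdate.h>
#include <linux/slab.h>

struct ve_like {
        struct rcu_head rcu;    /* storage used by the deferred free */
        /* ... payload read under rcu_read_lock() elsewhere ... */
};

static void ve_like_destroy(struct ve_like *ve)
{
        /* queue the kfree() for after all current RCU readers finish */
        kfree_rcu(ve, rcu);
}
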


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for i915/tgl: Fix TC-cold block/unblock sequence

2020-08-05 Thread Patchwork
== Series Details ==

Series: i915/tgl: Fix TC-cold block/unblock sequence
URL   : https://patchwork.freedesktop.org/series/80302/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
26989606f3cd i915/tgl: Fix TC-cold block/unblock sequence
-:52: WARNING:MSLEEP: msleep < 20ms can sleep for up to 20ms; see 
Documentation/timers/timers-howto.rst
#52: FILE: drivers/gpu/drm/i915/display/intel_display_power.c:3955:
+   msleep(1);

total: 0 errors, 1 warnings, 0 checks, 35 lines checked


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for i915/tgl: Fix TC-cold block/unblock sequence

2020-08-05 Thread Patchwork
== Series Details ==

Series: i915/tgl: Fix TC-cold block/unblock sequence
URL   : https://patchwork.freedesktop.org/series/80302/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ Fi.CI.BAT: success for i915/tgl: Fix TC-cold block/unblock sequence

2020-08-05 Thread Patchwork
== Series Details ==

Series: i915/tgl: Fix TC-cold block/unblock sequence
URL   : https://patchwork.freedesktop.org/series/80302/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8845 -> Patchwork_18311


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/index.html

Known issues


  Here are the changes found in Patchwork_18311 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_module_load@reload:
- fi-apl-guc: [PASS][1] -> [DMESG-WARN][2] ([i915#1635] / 
[i915#1982])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-apl-guc/igt@i915_module_l...@reload.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-apl-guc/igt@i915_module_l...@reload.html

  * igt@i915_selftest@live@execlists:
- fi-kbl-r:   [PASS][3] -> [INCOMPLETE][4] ([i915#794])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-r/igt@i915_selftest@l...@execlists.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-kbl-r/igt@i915_selftest@l...@execlists.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-byt-j1900:   [DMESG-WARN][5] ([i915#1982]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-byt-j1900/igt@i915_module_l...@reload.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-byt-j1900/igt@i915_module_l...@reload.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-bsw-kefka:   [DMESG-WARN][7] ([i915#1982]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-bsw-kefka/igt@i915_pm_...@basic-pci-d3-state.html

  * igt@kms_busy@basic@flip:
- {fi-tgl-dsi}:   [DMESG-WARN][9] ([i915#1982]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-tgl-dsi/igt@kms_busy@ba...@flip.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-tgl-dsi/igt@kms_busy@ba...@flip.html
- fi-kbl-x1275:   [DMESG-WARN][11] ([i915#62] / [i915#92] / [i915#95]) 
-> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_busy@ba...@flip.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-kbl-x1275/igt@kms_busy@ba...@flip.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-7500u:   [DMESG-WARN][13] ([i915#2203]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-kbl-7500u/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_flip@basic-flip-vs-wf_vblank@b-edp1:
- fi-icl-u2:  [DMESG-WARN][15] ([i915#1982]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-icl-u2/igt@kms_flip@basic-flip-vs-wf_vbl...@b-edp1.html

  
 Warnings 

  * igt@kms_force_connector_basic@force-edid:
- fi-kbl-x1275:   [DMESG-WARN][17] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][18] ([i915#62] / [i915#92]) +2 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-kbl-x1275/igt@kms_force_connector_ba...@force-edid.html

  * igt@prime_vgem@basic-fence-flip:
- fi-kbl-x1275:   [DMESG-WARN][19] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][20] ([i915#62] / [i915#92] / [i915#95]) +5 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/fi-kbl-x1275/igt@prime_v...@basic-fence-flip.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/fi-kbl-x1275/igt@prime_v...@basic-fence-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1635]: https://gitlab.freedesktop.org/drm/intel/issues/1635
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62
  [i915#794]: https://gitlab.freedesktop.org/drm/intel/issues/794
  [i915#92]: https://gitlab.freedesktop.org/drm/intel/issues/92
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (44 -> 37)
--

  Missing(7): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-ctg-p8600 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_8845 

Re: [Intel-gfx] [PATCH 26/37] drm/i915/gem: Pull execbuf dma resv under a single critical section

2020-08-05 Thread Intel

Hi, Chris,

On 8/5/20 2:22 PM, Chris Wilson wrote:

Acquire all the objects and their backing storage, and page directories,
as used by execbuf, under a single common ww_mutex, although we have to
restart the critical section a few times in order to handle various
restrictions (such as avoiding copy_(from|to)_user and mmap_sem).

Signed-off-by: Chris Wilson 
---
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 166 +-
  .../i915/gem/selftests/i915_gem_execbuffer.c  |   2 +
  2 files changed, 84 insertions(+), 84 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 58e40348b551..3a79b6facb02 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -20,6 +20,7 @@
  #include "gt/intel_gt_pm.h"
  #include "gt/intel_gt_requests.h"
  #include "gt/intel_ring.h"
+#include "mm/i915_acquire_ctx.h"
  
  #include "i915_drv.h"

  #include "i915_gem_clflush.h"
@@ -267,6 +268,8 @@ struct i915_execbuffer {
struct intel_context *reloc_context; /* distinct context for relocs */
struct i915_gem_context *gem_context; /** caller's context */
  
+	struct i915_acquire_ctx acquire; /** lock for _all_ DMA reservations */

+
struct i915_request *request; /** our request to build */
struct eb_vma *batch; /** identity of the batch obj/vma */
  
@@ -392,42 +395,6 @@ static void eb_vma_array_put(struct eb_vma_array *arr)

kref_put(&arr->kref, eb_vma_array_destroy);
  }
  
-static int

-eb_lock_vma(struct i915_execbuffer *eb, struct ww_acquire_ctx *acquire)
-{
-   struct eb_vma *ev;
-   int err = 0;
-
-   list_for_each_entry(ev, &eb->submit_list, submit_link) {
-   struct i915_vma *vma = ev->vma;
-
-   err = ww_mutex_lock_interruptible(&vma->resv->lock, acquire);
-   if (err == -EDEADLK) {
-   struct eb_vma *unlock = ev, *en;
-
-   list_for_each_entry_safe_continue_reverse(unlock, en,
- 
&eb->submit_list,
- submit_link) {
-   ww_mutex_unlock(&unlock->vma->resv->lock);
-   list_move_tail(&unlock->submit_link, 
&eb->submit_list);
-   }
-
-   GEM_BUG_ON(!list_is_first(&ev->submit_link, 
&eb->submit_list));
-   err = ww_mutex_lock_slow_interruptible(&vma->resv->lock,
-  acquire);
-   }
-   if (err) {
-   list_for_each_entry_continue_reverse(ev,
-&eb->submit_list,
-submit_link)
-   ww_mutex_unlock(&ev->vma->resv->lock);
-   break;
-   }
-   }
-
-   return err;
-}
-
  static int eb_create(struct i915_execbuffer *eb)
  {
/* Allocate an extra slot for use by the sentinel */
@@ -656,6 +623,25 @@ eb_add_vma(struct i915_execbuffer *eb,
}
  }
  
+static int eb_lock_mm(struct i915_execbuffer *eb)

+{
+   struct eb_vma *ev;
+   int err;
+
+   list_for_each_entry(ev, &eb->bind_list, bind_link) {
+   err = i915_acquire_ctx_lock(&eb->acquire, ev->vma->obj);
+   if (err)
+   return err;
+   }
+
+   return 0;
+}
+
+static int eb_acquire_mm(struct i915_execbuffer *eb)
+{
+   return i915_acquire_mm(&eb->acquire);
+}
+
  struct eb_vm_work {
struct dma_fence_work base;
struct eb_vma_array *array;
@@ -1378,7 +1364,15 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
unsigned long count;
struct eb_vma *ev;
unsigned int pass;
-   int err = 0;
+   int err;
+
+   err = eb_lock_mm(eb);
+   if (err)
+   return err;
+
+   err = eb_acquire_mm(eb);
+   if (err)
+   return err;
  
  	count = 0;

INIT_LIST_HEAD(&unbound);
@@ -1404,10 +1398,15 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
if (count == 0)
return 0;
  
+	/* We need to reserve page directories, release all, start over */

+   i915_acquire_ctx_fini(&eb->acquire);
+
pass = 0;
do {
struct eb_vm_work *work;
  
+		i915_acquire_ctx_init(&eb->acquire);

+
/*
 * We need to hold one lock as we bind all the vma so that
 * we have a consistent view of the entire vm and can plan
@@ -1424,6 +1423,11 @@ static int eb_reserve_vm(struct i915_execbuffer *eb)
 * beneath it, so we have to stage and preallocate all the
 * resources we may require before taking the mutex.
 */
+
+  

Re: [Intel-gfx] [PATCH 00/37] Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Intel

Hi, Chris,


On 8/5/20 2:21 PM, Chris Wilson wrote:

Long story short, we need to manage evictions using dma_resv & dma_fence
tracking. The backing storage will then be managed using the ww_mutex
borrowed from (and shared via) obj->base.resv, rather than the current
obj->mm.lock.

Skipping over the breadcrumbs,


While these are perhaps needed fixes, could we submit them as a separate
series? From what I can tell they are not a direct part of the locking
rework, and some of them were actually part of a series that Dave NAK'ed
and may require additional justification.




  the first step is to remove the final
crutches of struct_mutex from execbuf and to broaden the hold for the
dma-resv to guard not just publishing the dma-fences, but for the
duration of the execbuf submission (holding all objects and their
backing store from the point of acquisition to publishing of the final
GPU work, after which the guard is delegated to the dma-fences).

This is of course made complicated by our history. On top of the user's
objects, we also have the HW/kernel objects with their own lifetimes,
and a bunch of auxiliary objects used for working around unhappy HW and
for providing the legacy relocation mechanism. We add every auxiliary
object to the list of user objects required, and attempt to acquire them
en masse. Since all the objects can be known a priori, we can build a
list of those objects and pass that to a routine that can resolve the
-EDEADLK (and evictions). [To avoid relocations imposing a penalty on
sane userspace that avoids them, we do not touch any relocations until
necessary, at which point we have to unroll the state, and rebuild a new
list with more auxiliary buffers to accommodate the extra copy_from_user].
More examples are included as to how we can break down operations
involving multiple objects into an acquire phase prior to those
operations, keeping the -EDEADLK handling under control (a minimal
sketch of this acquire/backoff loop follows the quoted cover letter).

execbuf is the unique interface in that it deals with multiple user
and kernel buffers. After that, we have callers that in principle care
about accessing a single buffer, and so can be migrated over to a helper
that permits only holding one such buffer at a time. That enables us to
swap out obj->mm.lock for obj->base.resv->lock, and use lockdep to spot
illegal nesting, and to throw away the temporary pins by replacing them
with holding the ww_mutex for the duration instead.

What's changed? Some patch splitting and we need to pull in Matthew's
patch to map the page directories under the ww_mutex.
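
The acquire/backoff loop described above follows the classic ww_mutex
pattern from Documentation/locking/ww-mutex-design.rst. A minimal
sketch, assuming a hypothetical struct obj with an embedded ww_mutex
and list link (this is not the i915_acquire_ctx implementation itself):

	struct obj {
		struct ww_mutex lock;
		struct list_head link;
	};

	static int lock_all(struct list_head *objs, struct ww_acquire_ctx *ctx)
	{
		struct obj *obj, *res_obj = NULL, *contended;
		int err;

		ww_acquire_init(ctx, &reservation_ww_class);
	retry:
		list_for_each_entry(obj, objs, link) {
			if (obj == res_obj) {
				res_obj = NULL;	/* already held via the slow path */
				continue;
			}

			err = ww_mutex_lock_interruptible(&obj->lock, ctx);
			if (err) {
				contended = obj;
				goto unwind;
			}
		}

		ww_acquire_done(ctx);	/* no further locks will be taken */
		return 0;

	unwind:
		/* Drop every lock taken earlier in this pass ... */
		list_for_each_entry_continue_reverse(obj, objs, link)
			ww_mutex_unlock(&obj->lock);
		/* ... plus one still held from a previous slow-path acquire. */
		if (res_obj)
			ww_mutex_unlock(&res_obj->lock);

		if (err == -EDEADLK) {
			/* Lost a seqno race: sleep on the contended lock, retry. */
			err = ww_mutex_lock_slow_interruptible(&contended->lock, ctx);
			if (!err) {
				res_obj = contended;
				goto retry;
			}
		}

		ww_acquire_fini(ctx);
		return err;
	}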


I would still like to see a justification for the newly introduced async
work, as opposed to adding it as an optimizing / regression-fixing series
following the locking rework. That async work introduces a bunch of code
complexity, and it would be beneficial to see a discussion of the
tradeoffs and how it aligns with the upstream proposed dma-fence
annotations.


Thanks,

Thomas


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.IGT: failure for HDCP minor refactoring (rev3)

2020-08-05 Thread Patchwork
== Series Details ==

Series: HDCP minor refactoring (rev3)
URL   : https://patchwork.freedesktop.org/series/77224/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8845_full -> Patchwork_18309_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18309_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18309_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_18309_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gtt:
- shard-glk:  [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk8/igt@i915_selftest@l...@gtt.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-glk7/igt@i915_selftest@l...@gtt.html

  
 Warnings 

  * igt@perf@blocking-parameterized:
- shard-glk:  [FAIL][3] ([i915#1542]) -> [INCOMPLETE][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk9/igt@p...@blocking-parameterized.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-glk8/igt@p...@blocking-parameterized.html

  
Known issues


  Here are the changes found in Patchwork_18309_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@kms_cursor_crc@pipe-a-cursor-alpha-transparent:
- shard-skl:  [PASS][5] -> [FAIL][6] ([i915#54])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl3/igt@kms_cursor_...@pipe-a-cursor-alpha-transparent.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-skl3/igt@kms_cursor_...@pipe-a-cursor-alpha-transparent.html

  * igt@kms_cursor_edge_walk@pipe-b-64x64-bottom-edge:
- shard-glk:  [PASS][7] -> [DMESG-WARN][8] ([i915#1982])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk1/igt@kms_cursor_edge_w...@pipe-b-64x64-bottom-edge.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-glk1/igt@kms_cursor_edge_w...@pipe-b-64x64-bottom-edge.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible@bc-hdmi-a1-hdmi-a2:
- shard-glk:  [PASS][9] -> [FAIL][10] ([i915#79])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk4/igt@kms_flip@2x-flip-vs-expired-vblank-interrupti...@bc-hdmi-a1-hdmi-a2.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-glk9/igt@kms_flip@2x-flip-vs-expired-vblank-interrupti...@bc-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1:
- shard-skl:  [PASS][11] -> [FAIL][12] ([i915#79])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl1/igt@kms_flip@flip-vs-expired-vblank-interrupti...@b-edp1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-skl1/igt@kms_flip@flip-vs-expired-vblank-interrupti...@b-edp1.html

  * igt@kms_hdr@bpc-switch-suspend:
- shard-kbl:  [PASS][13] -> [DMESG-WARN][14] ([i915#180]) +6 
similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-kbl1/igt@kms_...@bpc-switch-suspend.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-kbl1/igt@kms_...@bpc-switch-suspend.html

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-min:
- shard-skl:  [PASS][15] -> [FAIL][16] ([fdo#108145] / [i915#265])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl8/igt@kms_plane_alpha_bl...@pipe-b-constant-alpha-min.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-skl10/igt@kms_plane_alpha_bl...@pipe-b-constant-alpha-min.html

  * igt@kms_plane_lowres@pipe-b-tiling-none:
- shard-apl:  [PASS][17] -> [DMESG-WARN][18] ([i915#1635] / 
[i915#1982])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-apl2/igt@kms_plane_low...@pipe-b-tiling-none.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-apl6/igt@kms_plane_low...@pipe-b-tiling-none.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-pixel-format:
- shard-skl:  [PASS][19] -> [DMESG-WARN][20] ([i915#1982]) +12 
similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl7/igt@kms_plane_scal...@pipe-b-scaler-with-pixel-format.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18309/shard-skl5/igt@kms_plane_scal...@pipe-b-scaler-with-pixel-format.html

  * igt@kms_psr@psr2_sprite_plane_move:
- shard-iclb: [PASS][21] -> [SKIP][22] ([fdo#109441]) +1 similar 
issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-iclb2/igt

[Intel-gfx] ✗ Fi.CI.IGT: failure for Replace obj->mm.lock with reservation_ww_class

2020-08-05 Thread Patchwork
== Series Details ==

Series: Replace obj->mm.lock with reservation_ww_class
URL   : https://patchwork.freedesktop.org/series/80291/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8845_full -> Patchwork_18310_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18310_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18310_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_18310_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt:
- shard-tglb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-tglb3/igt@kms_frontbuffer_track...@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-tglb3/igt@kms_frontbuffer_track...@fbc-1p-primscrn-pri-indfb-draw-mmap-gtt.html

  
New tests
-

  New tests have been introduced between CI_DRM_8845_full and 
Patchwork_18310_full:

### New IGT tests (1) ###

  * igt@i915_selftest@mock@acquire:
- Statuses : 7 pass(s)
- Exec time: [0.67, 1.79] s

  

Known issues


  Here are the changes found in Patchwork_18310_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@kms_big_fb@y-tiled-16bpp-rotate-0:
- shard-skl:  [PASS][3] -> [DMESG-WARN][4] ([i915#1982]) +12 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl7/igt@kms_big...@y-tiled-16bpp-rotate-0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-skl2/igt@kms_big...@y-tiled-16bpp-rotate-0.html

  * igt@kms_big_fb@y-tiled-64bpp-rotate-0:
- shard-glk:  [PASS][5] -> [DMESG-FAIL][6] ([i915#118] / [i915#95])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk3/igt@kms_big...@y-tiled-64bpp-rotate-0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-glk8/igt@kms_big...@y-tiled-64bpp-rotate-0.html

  * igt@kms_cursor_crc@pipe-c-cursor-suspend:
- shard-kbl:  [PASS][7] -> [DMESG-WARN][8] ([i915#180]) +2 similar 
issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-kbl6/igt@kms_cursor_...@pipe-c-cursor-suspend.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-kbl7/igt@kms_cursor_...@pipe-c-cursor-suspend.html

  * igt@kms_cursor_edge_walk@pipe-b-128x128-bottom-edge:
- shard-apl:  [PASS][9] -> [DMESG-WARN][10] ([i915#1635] / 
[i915#1982]) +3 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-apl3/igt@kms_cursor_edge_w...@pipe-b-128x128-bottom-edge.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-apl1/igt@kms_cursor_edge_w...@pipe-b-128x128-bottom-edge.html

  * igt@kms_cursor_legacy@pipe-a-forked-bo:
- shard-glk:  [PASS][11] -> [DMESG-WARN][12] ([i915#118] / 
[i915#95])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk3/igt@kms_cursor_leg...@pipe-a-forked-bo.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-glk8/igt@kms_cursor_leg...@pipe-a-forked-bo.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@b-hdmi-a2:
- shard-glk:  [PASS][13] -> [FAIL][14] ([i915#79])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk1/igt@kms_flip@flip-vs-expired-vblank-interrupti...@b-hdmi-a2.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-glk8/igt@kms_flip@flip-vs-expired-vblank-interrupti...@b-hdmi-a2.html

  * igt@kms_flip@flip-vs-expired-vblank@c-dp1:
- shard-kbl:  [PASS][15] -> [FAIL][16] ([i915#79])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-kbl2/igt@kms_flip@flip-vs-expired-vbl...@c-dp1.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-kbl4/igt@kms_flip@flip-vs-expired-vbl...@c-dp1.html

  * igt@kms_flip@flip-vs-wf_vblank-interruptible@a-edp1:
- shard-tglb: [PASS][17] -> [DMESG-WARN][18] ([i915#1982])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-tglb3/igt@kms_flip@flip-vs-wf_vblank-interrupti...@a-edp1.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18310/shard-tglb2/igt@kms_flip@flip-vs-wf_vblank-interrupti...@a-edp1.html

  * igt@kms_hdr@bpc-switch-suspend:
- shard-skl:  [PASS][19] -> [INCOMPLETE][20] ([i915#198])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl4/igt@kms_...@bpc-switch-suspend.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwo

Re: [Intel-gfx] [PATCH v4] drm/kmb: Add support for KeemBay Display

2020-08-05 Thread Sam Ravnborg
Hi Anitha.

On Mon, Aug 03, 2020 at 09:02:24PM +, Chrisanthus, Anitha wrote:
> Hi Sam,
> I installed codespell, but the dictionary.txt in 
> usr/share/codespell/dictionary.txt
> seems to be different from yours. Mine is version 1.8. Where can I get the 
> dictionary.txt
> that you are using?
I dunno.

$ apt info codespell
Package: codespell
Version: 1.16.0-2
Priority: optional
Section: universe/devel
Origin: Ubuntu
Maintainer: Ubuntu Developers 
Original-Maintainer: Debian Python Modules Team 

Bugs: https://bugs.launchpad.net/ubuntu/+filebug
Installed-Size: 572 kB
Depends: python3, python3-chardet, python3:any
Homepage: https://github.com/codespell-project/codespell/
Download-Size: 118 kB
APT-Manual-Installed: yes
APT-Sources: http://dk.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
Description: Find and fix common misspellings in text files
 codespell is designed to find and fix common misspellings in text files.
 It is designed primarily for checking misspelled words in source code,
 but it can be used with other files as well.

> I have corrected the relevant spelling warnings from your email and have sent 
> v5.

The spelling mistakes were the least relevant warnings.
Please see examples in the following.

> > -:146: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open
> > parenthesis
> > #146: FILE: drivers/gpu/drm/kmb/kmb_crtc.c:58:
> > +   kmb_clr_bitmask_lcd(kmb, LCD_INT_ENABLE,
> > +   LCD_INT_VERT_COMP);
Here we want LCD_INT_VERT_COMP to be aligned right after the opening
'('. It must be indented with a number of tabs followed by the necessary
spaces to achieve this indent. Always use tabs for indent if possible.
So in other words, 8 spaces are not OK; use a tab instead.
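
For example, the call from the warning would become (continuation
indented with tabs, then just enough spaces to land under the '('):

	kmb_clr_bitmask_lcd(kmb, LCD_INT_ENABLE,
			    LCD_INT_VERT_COMP);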

Same goes for similar warnings.
> > -:427: CHECK:LINE_SPACING: Please don't use multiple blank lines
> > #427: FILE: drivers/gpu/drm/kmb/kmb_drv.c:74:
> > +
> > +
> > 
Do not use two consecutive blank lines.

> > -:463: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
> > #463: FILE: drivers/gpu/drm/kmb/kmb_drv.c:110:
> > +   kmb->sys_clk_mhz = clk_get_rate(kmb_clk.clk_pll0)/100;
> >  ^

Spaces around all operators - so space before and after '/' here.
Same goes for following warnings of the same type.
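
I.e. the line quoted in the warning becomes:

	kmb->sys_clk_mhz = clk_get_rate(kmb_clk.clk_pll0) / 100;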

> > -:688: CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
> > #688: FILE: drivers/gpu/drm/kmb/kmb_drv.c:335:
> > +   if (status & LCD_INT_EOF) {
> > +
As the warning says - no empty line after an opening '{'.

> > 
> > -:701: CHECK:CAMELCASE: Avoid CamelCase: 
> > #701: FILE: drivers/gpu/drm/kmb/kmb_drv.c:348:
> > +   LCD_LAYERn_DMA_CFG
> > 
If you have a reason to use CamelCase then this can be ignored.
A good reason could be that this is how it is done in the datasheet.
In this case maybe use LCD_LAYER_N_DMA_CFG or similar.

> > -:957: CHECK:BRACES: braces {} should be used on all arms of this statement
> > #957: FILE: drivers/gpu/drm/kmb/kmb_drv.c:604:
> > +   if (adv_bridge == ERR_PTR(-EPROBE_DEFER))
> > [...]
> > +   else if (IS_ERR(adv_bridge)) {
> > [...]
If we use {} in one arm of the statement, use it in all arms.
This, like the other tidbits, improves readability.
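
So for the statement quoted above, with the elided bodies sketched in:

	if (adv_bridge == ERR_PTR(-EPROBE_DEFER)) {
		/* ... defer path ... */
	} else if (IS_ERR(adv_bridge)) {
		/* ... error path ... */
	}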

Same for all similar warnings.

> > -:1026: WARNING:UNDOCUMENTED_DT_STRING: DT compatible string
> > "intel,kmb_display" appears un-documented -- check
> > ./Documentation/devicetree/bindings/
> > #1026: FILE: drivers/gpu/drm/kmb/kmb_drv.c:673:
> > +   {.compatible = "intel,kmb_display"},

Binding is missing - we cannot apply a driver for an unknown binding.
The binding must be in DT-schema (yaml) format.

> > 
> > -:1122: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without
> > comment
> > #1122: FILE: drivers/gpu/drm/kmb/kmb_drv.h:35:
> > +   spinlock_t  irq_lock;

Add comment.
And consider a more specific name like kmb_irq_lock - allows for easier
grepping.
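
E.g. something like the following - the comment text here is only a
guess at what the lock actually protects and needs to come from whoever
knows the hardware:

	/* protects the LCD controller interrupt registers (assumed) */
	spinlock_t kmb_irq_lock;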

> > 
> > -:1360: CHECK:PREFER_KERNEL_TYPES: Prefer kernel type 'u16' over 'uint16_t'
> > #1360: FILE: drivers/gpu/drm/kmb/kmb_dsi.c:95:
> > +   uint16_t default_bit_rate_mbps;
As the warning says. This goes again later.

> > -:1947: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written
> > "fg_cfg->sections[i]"
> > #1947: FILE: drivers/gpu/drm/kmb/kmb_dsi.c:682:
> > +   if (fg_cfg->sections[i] != NULL)

Hmm, I like the current code. But better to please checkpatch here.

I did not go through them all.
The point is that all the warnings from checkpatch should be considered,
and most of them are legit and should be fixed.

Sam

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✓ Fi.CI.IGT: success for i915/tgl: Fix TC-cold block/unblock sequence

2020-08-05 Thread Patchwork
== Series Details ==

Series: i915/tgl: Fix TC-cold block/unblock sequence
URL   : https://patchwork.freedesktop.org/series/80302/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8845_full -> Patchwork_18311_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_18311_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_balancer@bonded-early:
- shard-kbl:  [PASS][1] -> [FAIL][2] ([i915#2079])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-kbl4/igt@gem_exec_balan...@bonded-early.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-kbl1/igt@gem_exec_balan...@bonded-early.html

  * igt@gem_exec_balancer@nop:
- shard-iclb: [PASS][3] -> [INCOMPLETE][4] ([i915#2268])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-iclb2/igt@gem_exec_balan...@nop.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-iclb2/igt@gem_exec_balan...@nop.html

  * igt@gem_partial_pwrite_pread@writes-after-reads-uncached:
- shard-apl:  [PASS][5] -> [FAIL][6] ([i915#1635])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-apl7/igt@gem_partial_pwrite_pr...@writes-after-reads-uncached.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-apl2/igt@gem_partial_pwrite_pr...@writes-after-reads-uncached.html

  * igt@kms_big_fb@linear-64bpp-rotate-180:
- shard-glk:  [PASS][7] -> [DMESG-FAIL][8] ([i915#118] / [i915#95])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk4/igt@kms_big...@linear-64bpp-rotate-180.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-glk8/igt@kms_big...@linear-64bpp-rotate-180.html

  * igt@kms_color@pipe-a-gamma:
- shard-skl:  [PASS][9] -> [FAIL][10] ([i915#71])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl3/igt@kms_co...@pipe-a-gamma.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-skl3/igt@kms_co...@pipe-a-gamma.html

  * igt@kms_cursor_crc@pipe-a-cursor-alpha-transparent:
- shard-skl:  [PASS][11] -> [FAIL][12] ([i915#54])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl3/igt@kms_cursor_...@pipe-a-cursor-alpha-transparent.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-skl3/igt@kms_cursor_...@pipe-a-cursor-alpha-transparent.html

  * igt@kms_cursor_crc@pipe-c-cursor-suspend:
- shard-skl:  [PASS][13] -> [INCOMPLETE][14] ([i915#300])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl5/igt@kms_cursor_...@pipe-c-cursor-suspend.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-skl1/igt@kms_cursor_...@pipe-c-cursor-suspend.html

  * igt@kms_cursor_edge_walk@pipe-b-128x128-bottom-edge:
- shard-glk:  [PASS][15] -> [DMESG-WARN][16] ([i915#1982])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-glk6/igt@kms_cursor_edge_w...@pipe-b-128x128-bottom-edge.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-glk3/igt@kms_cursor_edge_w...@pipe-b-128x128-bottom-edge.html

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-untiled:
- shard-apl:  [PASS][17] -> [DMESG-WARN][18] ([i915#1635] / 
[i915#1982])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-apl2/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-untiled.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-apl7/igt@kms_draw_...@draw-method-xrgb2101010-mmap-wc-untiled.html

  * igt@kms_hdr@bpc-switch-suspend:
- shard-kbl:  [PASS][19] -> [DMESG-WARN][20] ([i915#180]) +8 
similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-kbl1/igt@kms_...@bpc-switch-suspend.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-kbl4/igt@kms_...@bpc-switch-suspend.html
- shard-skl:  [PASS][21] -> [FAIL][22] ([i915#1188])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl4/igt@kms_...@bpc-switch-suspend.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-skl8/igt@kms_...@bpc-switch-suspend.html

  * igt@kms_plane_scaling@pipe-b-scaler-with-pixel-format:
- shard-skl:  [PASS][23] -> [DMESG-WARN][24] ([i915#1982]) +12 
similar issues
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8845/shard-skl7/igt@kms_plane_scal...@pipe-b-scaler-with-pixel-format.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18311/shard-skl2/igt@kms_plane_scal...@pipe-b-scaler-with-pixel-format.html

  * igt@kms_psr@psr2_no_drrs:
- shard-iclb: [PASS][25] -> [SKIP][26] ([fdo#109441])
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM

[Intel-gfx] [PATCH v2] drm/i915/gt: Implement WA_1406941453

2020-08-05 Thread clinton . a . taylor
From: Clint Taylor 

Enable HW Default flip for small PL.

bspec: 52890
bspec: 53508
bspec: 53273

v2: rebase to drm-tip
Reviewed-by: Matt Atwood 
Signed-off-by: Clint Taylor 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 6 ++
 drivers/gpu/drm/i915/i915_reg.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index cef1c122696f..cb02813c5e92 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -639,6 +639,9 @@ static void tgl_ctx_workarounds_init(struct intel_engine_cs 
*engine,
   FF_MODE2_GS_TIMER_MASK | FF_MODE2_TDS_TIMER_MASK,
   FF_MODE2_GS_TIMER_224  | FF_MODE2_TDS_TIMER_128,
   0);
+
+   /* Wa_1406941453:gen12 */
+   WA_SET_BIT_MASKED(GEN10_SAMPLER_MODE, ENABLE_SMALLPL);
 }
 
 static void
@@ -1522,6 +1525,9 @@ static void icl_whitelist_build(struct intel_engine_cs 
*engine)
whitelist_reg_ext(w, PS_INVOCATION_COUNT,
  RING_FORCE_TO_NONPRIV_ACCESS_RD |
  RING_FORCE_TO_NONPRIV_RANGE_4);
+
+   /* Wa_1406941453:gen12 */
+   whitelist_reg(w, GEN10_SAMPLER_MODE);
break;
 
case VIDEO_DECODE_CLASS:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2b403df03404..494b2e1e358e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -9314,6 +9314,7 @@ enum {
 #define   GEN11_LSN_UNSLCVC_GAFS_HALF_SF_MAXALLOC  (1 << 7)
 
 #define GEN10_SAMPLER_MODE _MMIO(0xE18C)
+#define   ENABLE_SMALLPL   REG_BIT(15)
#define   GEN11_SAMPLER_ENABLE_HEADLESS_MSG  REG_BIT(5)
 
 /* IVYBRIDGE DPF */
-- 
2.27.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm/i915/gt: Implement WA_1406941453 (rev2)

2020-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Implement WA_1406941453 (rev2)
URL   : https://patchwork.freedesktop.org/series/78243/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Implement WA_1406941453 (rev2)

2020-08-05 Thread Patchwork
== Series Details ==

Series: drm/i915/gt: Implement WA_1406941453 (rev2)
URL   : https://patchwork.freedesktop.org/series/78243/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8846 -> Patchwork_18312


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_18312 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_18312, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_18312:

### IGT changes ###

 Possible regressions 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [PASS][1] -> [DMESG-WARN][2] +1 similar issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
Known issues


  Here are the changes found in Patchwork_18312 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@execlists:
- fi-icl-y:   [PASS][3] -> [INCOMPLETE][4] ([i915#2276])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-icl-y/igt@i915_selftest@l...@execlists.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-icl-y/igt@i915_selftest@l...@execlists.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-atomic:
- fi-icl-u2:  [PASS][5] -> [DMESG-WARN][6] ([i915#1982]) +2 similar 
issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-icl-u2/igt@kms_cursor_leg...@basic-flip-after-cursor-atomic.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-apl-guc: [DMESG-WARN][7] ([i915#1635] / [i915#1982]) -> 
[PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-apl-guc/igt@i915_module_l...@reload.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-apl-guc/igt@i915_module_l...@reload.html

  * igt@i915_pm_rpm@basic-pci-d3-state:
- fi-byt-j1900:   [DMESG-WARN][9] ([i915#1982]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-byt-j1900/igt@i915_pm_...@basic-pci-d3-state.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-byt-j1900/igt@i915_pm_...@basic-pci-d3-state.html

  * igt@i915_pm_rpm@module-reload:
- fi-bsw-kefka:   [INCOMPLETE][11] ([i915#151] / [i915#1844] / 
[i915#1909] / [i915#392]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-bsw-kefka/igt@i915_pm_...@module-reload.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-bsw-kefka/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@gt_lrc:
- fi-tgl-u2:  [DMESG-FAIL][13] ([i915#1233]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-tgl-u2/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- fi-icl-u2:  [DMESG-WARN][15] ([i915#1982]) -> [PASS][16] +1 
similar issue
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-x1275:   [DMESG-WARN][17] ([i915#62] / [i915#92]) -> 
[DMESG-WARN][18] ([i915#62] / [i915#92] / [i915#95]) +4 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-kbl-x1275/igt@gem_exec_susp...@basic-s0.html

  * igt@i915_pm_rpm@module-reload:
- fi-kbl-guc: [DMESG-FAIL][19] ([i915#2203]) -> [DMESG-WARN][20] 
([i915#2203])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/fi-kbl-guc/igt@i915_pm_...@module-reload.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_18312/fi-kbl-guc/igt@i915_pm_...@module-reload.html

  * igt@kms_force_connector_basic@force-connector-state:
- fi-kbl-x1275:   [DMESG-WARN][21] ([i915#62] / [i915#92] / [i915#95]) 
-> [DMESG-WARN][22] ([i915#62] / [i915#92]) +3 similar issues
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8846/f

Re: [Intel-gfx] [PATCH 1/8] drm/atomic-helper: reset vblank on crtc reset

2020-08-05 Thread daniel
On Thu, Aug 06, 2020 at 03:43:02PM +0900, Tetsuo Handa wrote:
> As of commit 47ec5303d73ea344 ("Merge 
> git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next") on linux.git,
> my VMware environment cannot boot. Do I need to bisect?

That sounds like a good idea, but please start a new thread (not reply to
some random existing ones), with maintainers for drivers/gpu/drm/vmwgfx
only. Not a massive list of random folks who have no idea what's going on
here. From get_maintainer.pl:

$ scripts/get_maintainer.pl -f drivers/gpu/drm/vmwgfx/
VMware Graphics  (supporter:DRM DRIVER 
FOR VMWARE VIRTUAL GPU)
Roland Scheidegger  (supporter:DRM DRIVER FOR VMWARE 
VIRTUAL GPU)
David Airlie  (maintainer:DRM DRIVERS)
Daniel Vetter  (maintainer:DRM DRIVERS)
dri-de...@lists.freedesktop.org (open list:DRM DRIVER FOR VMWARE VIRTUAL GPU)
linux-ker...@vger.kernel.org (open list)

Cheers, Daniel

> 
> [9.314496][T1] vga16fb: mapped to 0x71050562
> [9.467770][T1] Console: switching to colour frame buffer device 80x30
> [9.632092][T1] fb0: VGA16 VGA frame buffer device
> [9.651768][T1] ACPI: AC Adapter [ACAD] (on-line)
> [9.672544][T1] input: Power Button as 
> /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
> [9.722373][T1] ACPI: Power Button [PWRF]
> [9.744231][T1] ioatdma: Intel(R) QuickData Technology Driver 5.00
> [9.820147][T1] N_HDLC line discipline registered with maxframe=4096
> [9.835649][T1] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [9.852567][T1] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 
> 115200) is a 16550A
> [   10.033372][T1] Cyclades driver 2.6
> [   10.049928][T1] Initializing Nozomi driver 2.1d
> [   10.065493][T1] RocketPort device driver module, version 2.09, 
> 12-June-2003
> [   10.095368][T1] No rocketport ports found; unloading driver
> [   10.112430][T1] Non-volatile memory driver v1.3
> [   10.127090][T1] Linux agpgart interface v0.103
> [   10.144037][T1] agpgart-intel :00:00.0: Intel 440BX Chipset
> [   10.162275][T1] agpgart-intel :00:00.0: AGP aperture is 256M @ 0x0
> [   10.181130][T1] [drm] DMA map mode: Caching DMA mappings.
> [   10.195150][T1] [drm] Capabilities:
> [   10.208728][T1] [drm]   Rect copy.
> [   10.222772][T1] [drm]   Cursor.
> [   10.235364][T1] [drm]   Cursor bypass.
> [   10.249121][T1] [drm]   Cursor bypass 2.
> [   10.260590][T1] [drm]   8bit emulation.
> [   10.272220][T1] [drm]   Alpha cursor.
> [   10.284670][T1] [drm]   3D.
> [   10.295051][T1] [drm]   Extended Fifo.
> [   10.305180][T1] [drm]   Multimon.
> [   10.315506][T1] [drm]   Pitchlock.
> [   10.325167][T1] [drm]   Irq mask.
> [   10.334262][T1] [drm]   Display Topology.
> [   10.343519][T1] [drm]   GMR.
> [   10.352775][T1] [drm]   Traces.
> [   10.362166][T1] [drm]   GMR2.
> [   10.370716][T1] [drm]   Screen Object 2.
> [   10.379220][T1] [drm]   Command Buffers.
> [   10.388489][T1] [drm]   Command Buffers 2.
> [   10.396055][T1] [drm]   Guest Backed Resources.
> [   10.403290][T1] [drm]   DX Features.
> [   10.409911][T1] [drm]   HP Command Queue.
> [   10.417820][T1] [drm] Capabilities2:
> [   10.424216][T1] [drm]   Grow oTable.
> [   10.430423][T1] [drm]   IntraSurface copy.
> [   10.436371][T1] [drm] Max GMR ids is 64
> [   10.442651][T1] [drm] Max number of GMR pages is 65536
> [   10.450317][T1] [drm] Max dedicated hypervisor surface memory is 0 kiB
> [   10.458809][T1] [drm] Maximum display memory size is 262144 kiB
> [   10.466330][T1] [drm] VRAM at 0xe800 size is 4096 kiB
> [   10.474704][T1] [drm] MMIO at 0xfe00 size is 256 kiB
> [   10.484625][T1] [TTM] Zone  kernel: Available graphics memory: 4030538 
> KiB
> [   10.500730][T1] [TTM] Zone   dma32: Available graphics memory: 2097152 
> KiB
> [   10.516851][T1] [TTM] Initializing pool allocator
> [   10.527542][T1] [TTM] Initializing DMA pool allocator
> [   10.540197][T1] BUG: kernel NULL pointer dereference, address: 
> 0438
> [   10.550087][T1] #PF: supervisor read access in kernel mode
> [   10.550087][T1] #PF: error_code(0x) - not-present page
> [   10.550087][T1] PGD 0 P4D 0 
> [   10.550087][T1] Oops:  [#1] PREEMPT SMP
> [   10.550087][T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0+ #271
> [   10.550087][T1] Hardware name: VMware, Inc. VMware Virtual 
> Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
> [   10.550087][T1] RIP: 0010:drm_dev_has_vblank+0x9/0x20
> [   10.550087][T1] Code: 5d 41 5e 41 5f e9 e7 fa 01 ff e8 e2 fa 01 ff 45 
> 31 e4 41 8b 5f 48 eb a7 cc cc cc cc cc cc cc cc cc 53 48 89 fb e8 c7 fa 01 ff 
> <8b> 83 38 04 00 00 5b 85 c0 0f 95 c0 c3 66 2e 0f 1f 84 00 00 00 00
> [   10.550087][T1] RSP: :c9027b80 EFLAGS: 00010293
> [   10.550087][