date:20210806

[Intel-gfx] ✓ Fi.CI.IGT: success for Begin enabling Xe_HP SDV and DG2 platforms (rev9)

2021-08-06 Thread Patchwork

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev9)
URL   : https://patchwork.freedesktop.org/series/92135/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10451_full -> Patchwork_20776_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20776_full:

### CI changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * boot:
- {shard-rkl}:([PASS][1], [PASS][2], [PASS][3], [PASS][4], 
[PASS][5], [PASS][6], [PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], 
[PASS][12], [PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], 
[PASS][18], [PASS][19], [PASS][20], [PASS][21], [PASS][22]) -> ([PASS][23], 
[PASS][24], [PASS][25], [PASS][26], [PASS][27], [FAIL][28], [PASS][29], 
[PASS][30], [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], 
[PASS][36], [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], 
[PASS][42], [PASS][43], [PASS][44], [PASS][45])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-6/boot.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-6/boot.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-6/boot.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-6/boot.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-5/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-5/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-5/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-5/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-5/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-2/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-rkl-1/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-6/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-6/boot.html
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-6/boot.html
   [26]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [27]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [28]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [29]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [30]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [31]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-5/boot.html
   [32]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [33]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [34]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [35]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [36]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [37]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-2/boot.html
   [38]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [39]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [40]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [41]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [42]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [43]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20776/shard-rkl-1/boot.html
   [44]: 
https://intel-gfx-ci

Re: [Intel-gfx] [PATCH i-g-t v3 03/11] lib/i915/gem_mman: add fixed mode to gem_mmap_offset__cpu

2021-08-06 Thread Maarten Lankhorst

Op 02-08-2021 om 08:29 schreef Dixit, Ashutosh:
> On Fri, 30 Jul 2021 01:53:40 -0700, Matthew Auld wrote:
>> On discrete we only support the new fixed mode.
>>
>> Signed-off-by: Matthew Auld 
>> Cc: Maarten Lankhorst 
>> Cc: Ashutosh Dixit 
>> Cc: Daniel Vetter 
>> Cc: Ramalingam C 
>> ---
>>  lib/i915/gem_mman.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
>> index c432bb16..563a7ccf 100644
>> --- a/lib/i915/gem_mman.c
>> +++ b/lib/i915/gem_mman.c
>> @@ -474,8 +474,14 @@ void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t 
>> offset, uint64_t size, uns
>>  void *__gem_mmap_offset__cpu(int fd, uint32_t handle, uint64_t offset,
>>   uint64_t size, unsigned prot)
>>  {
>> -return __gem_mmap_offset(fd, handle, offset, size, prot,
>> +void *ptr;
>> +
>> +ptr = __gem_mmap_offset(fd, handle, offset, size, prot,
>>   I915_MMAP_OFFSET_WB);
>> +if (!ptr)
>> +ptr = __gem_mmap_offset__fixed(fd, handle, offset, size, prot);
>> +
>> +return ptr;
> Imo there's some asymmetry here. If we are adding fixed mode to
> mmap__device_coherent (in the previous patch) then we should also be adding
> it to mmap__cpu_coherent (as before). Or, if we are adding fixed mode to
> __gem_mmap_offset__cpu we should also be adding it to
> __gem_mmap_offset__wc. Thanks.

I've applied the patch series for now. Not sure I agree on adding it to 
__gem_mmap_offset__wc, but if there's a need, we could do that. :)

[Intel-gfx] [PULL] drm-intel-gt-next

2021-08-06 Thread Joonas Lahtinen

Hi Dave & Daniel,

Sorry for the big PR in advance. Had the summer vacations and did not
notice until tool late how many patches were in already before leaving.

As requested, there is a lot of refactoring to increase the use of TTM
allocator and prep for DRM scheduler. Note that at times the patches trade
simplicity for performance and revert optimizations not seen as critical.
So some performance regressions are expected.

As an example is dropping of faster GPU relocation path for older platforms,
which should be mitigated by updating to the latest UMD versions to regain
the perf.

This PR drops various bits of uAPI where userspace has dropped the ball after 
reviews
have been completed on both sides (Thanks Jason for doing the due-diligence!). 
[1]
Due to the complexity of tracing back who used what, I think we could do better 
in the
future to avoid such situations to begin with. See below for my suggestion [2].

In addition to the refactoring and uAPI cleanup there is preliminay code for
ADL-P/XeHP and DG2 platforms. Fixes for ADL-S and removal of CNL code.
A couple of fixes that have been Cc: stable already. Removing unnecessary
workarounds per stepping and adding missing for Gen12 iGFX.

I915_MMAP_OFFSET_FIXED is also added to to align with the static/fixed caching
mode per BO instead of per-mapping mode (for dGFX only). There is GuC firmware
interface update and backend code major rework that unblock enabling GuC on 
Gen11
(not on by default). Then there is the addition of the GuCRC power management
feature which can be enabled for Gen12+ when submission is enabled.

Then there is finally the drm-next backmerge and 3 topic branch merges.

I think that is mostly all. I tried to summarize much in the tag description so
it should hopefully not be horribly long...

Regards, Joonas

[1] Given that Jason is only human and there is no way to automate this 
analysis, we
may have to bring something back as fixes if we find breakage (like with the 
MMAP IOCTL).

[2] As we first require to merge the kernel code, should we introduce some 
further rules
that the userspace has to land their code before the final kernel version 
release? If
that is not followed through, we would neuter the new uAPI with as small patch 
as possible
in the final release candidate and then remove it in next release. Thoughts?

***

drm-intel-gt-next-2021-08-06-1:

UAPI Changes:

- Add I915_MMAP_OFFSET_FIXED

  On devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid
  type. On devices without local memory, this caching mode is invalid.

  As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will
  be used, depending on the object placement on creation. WB will be used
  when the object can only exist in system memory, WC otherwise.

  Userspace: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888

- Reinstate the mmap ioctl for (already released) integrated Gen12 platforms

  Rationale: Otherwise media driver breaks eg. for ADL-P. Long term goal is
  still to sunset the IOCTL even for integrated and require using mmap_offset.

- Reject caching/set_domain IOCTLs on discrete

  Expected to become immutable property of the BO

- Disallow changing context parameters after first use on Gen12 and earlier
- Require setting context parameters at creation on platforms after Gen12

  Rationale (for both): Allow less dynamic changes to the context to simplify
  the implementation and avoid user shooting theirselves in the foot.

- Drop I915_CONTEXT_PARAM_RINGSIZE

  Userspace PR for compute-driver has not been merged

- Drop I915_CONTEXT_PARAM_NO_ZEROMAP

  Userspace PR for libdrm / Beignet was never landed

- Drop CONTEXT_CLONE API

  Userspace PR for Mesa was never landed

- Drop getparam support for I915_CONTEXT_PARAM_ENGINES

  Only existed for symmetry wrt. setparam, never used.

- Disallow bonding of virtual engines

  Drop the prep work, no hardware has been released needing it.

- (Implicit) Disable gpu relocations

  Media userspace was the last userspace to still use them. They
  have converted so performance can be regained with an update.

Core Changes:

- Merge topic branch 'topic/i915-ttm-2021-06-11' (from Maarten)
- Merge topic branch 'topic/revid_steppings' (from Matt R)
- Merge topic branch 'topic/xehp-dg2-definitions-2021-07-21' (from Matt R)
- Backmerges drm-next (Rodrigo)

Driver Changes:

- Initial workarounds for ADL-P (Clint)
- Preliminary code for XeHP/DG2 (Stuart, Umesh, Matt R, Prathap, Ram,
  Venkata, Akeem, Tvrtko, John, Lucas)
- Fix ADL-S DMA mask size to 39 bits (Tejas)
- Remove code for CNL (Lucas)
- Add ADL-P GuC/HuC firmwares (John)
- Update HuC to 7.9.3 for TGL/ADL-S/RKL (John)
- Fix -EDEADLK handling regression (Ville)
- Implement Wa_1508744258 for DG1 and Gen12 iGFX (Jose)
- Extend Wa_1406941453 to ADL-S (Jose)
- Drop unnecessary workarounds per stepping for SKL/BXT/ICL (Matt R)
- Use fuse info to enable SFC on Gen12 (Venkata)
- Unconditionally flush the pages on acq

Re: [Intel-gfx] [PATCH i-g-t v3 03/11] lib/i915/gem_mman: add fixed mode to gem_mmap_offset__cpu

2021-08-06 Thread Maarten Lankhorst

Op 02-08-2021 om 08:29 schreef Dixit, Ashutosh:
> On Fri, 30 Jul 2021 01:53:40 -0700, Matthew Auld wrote:
>> On discrete we only support the new fixed mode.
>>
>> Signed-off-by: Matthew Auld 
>> Cc: Maarten Lankhorst 
>> Cc: Ashutosh Dixit 
>> Cc: Daniel Vetter 
>> Cc: Ramalingam C 
>> ---
>>  lib/i915/gem_mman.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/i915/gem_mman.c b/lib/i915/gem_mman.c
>> index c432bb16..563a7ccf 100644
>> --- a/lib/i915/gem_mman.c
>> +++ b/lib/i915/gem_mman.c
>> @@ -474,8 +474,14 @@ void *gem_mmap__cpu(int fd, uint32_t handle, uint64_t 
>> offset, uint64_t size, uns
>>  void *__gem_mmap_offset__cpu(int fd, uint32_t handle, uint64_t offset,
>>   uint64_t size, unsigned prot)
>>  {
>> -return __gem_mmap_offset(fd, handle, offset, size, prot,
>> +void *ptr;
>> +
>> +ptr = __gem_mmap_offset(fd, handle, offset, size, prot,
>>   I915_MMAP_OFFSET_WB);
>> +if (!ptr)
>> +ptr = __gem_mmap_offset__fixed(fd, handle, offset, size, prot);
>> +
>> +return ptr;
> Imo there's some asymmetry here. If we are adding fixed mode to
> mmap__device_coherent (in the previous patch) then we should also be adding
> it to mmap__cpu_coherent (as before). Or, if we are adding fixed mode to
> __gem_mmap_offset__cpu we should also be adding it to
> __gem_mmap_offset__wc. Thanks.

Why do we need it in __gem_mmap_offset_wc btw?

Rest of series doesn't seem to be blocked by it, so pushed for now.

Re: [Intel-gfx] [PULL] drm-intel-gt-next

2021-08-06 Thread Joonas Lahtinen

Quoting Joonas Lahtinen (2021-08-06 13:06:17)
> Hi Dave & Daniel,
> 
> Sorry for the big PR in advance. Had the summer vacations and did not
> notice until tool late how many patches were in already before leaving.
> 
> As requested, there is a lot of refactoring to increase the use of TTM
> allocator and prep for DRM scheduler. Note that at times the patches trade
> simplicity for performance and revert optimizations not seen as critical.
> So some performance regressions are expected.
> 
> As an example is dropping of faster GPU relocation path for older platforms,
> which should be mitigated by updating to the latest UMD versions to regain
> the perf.
> 
> This PR drops various bits of uAPI where userspace has dropped the ball after 
> reviews
> have been completed on both sides (Thanks Jason for doing the 
> due-diligence!). [1]
> Due to the complexity of tracing back who used what, I think we could do 
> better in the
> future to avoid such situations to begin with. See below for my suggestion 
> [2].
> 
> In addition to the refactoring and uAPI cleanup there is preliminay code for
> ADL-P/XeHP and DG2 platforms. Fixes for ADL-S and removal of CNL code.
> A couple of fixes that have been Cc: stable already. Removing unnecessary
> workarounds per stepping and adding missing for Gen12 iGFX.
> 
> I915_MMAP_OFFSET_FIXED is also added to to align with the static/fixed caching
> mode per BO instead of per-mapping mode (for dGFX only). There is GuC firmware
> interface update and backend code major rework that unblock enabling GuC on 
> Gen11
> (not on by default). Then there is the addition of the GuCRC power management
> feature which can be enabled for Gen12+ when submission is enabled.

There is also addition of I915_USERPTR_PROBE with userspace changes in:

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12044

Regards, Joonas

> 
> Then there is finally the drm-next backmerge and 3 topic branch merges.
> 
> I think that is mostly all. I tried to summarize much in the tag description 
> so
> it should hopefully not be horribly long...
> 
> Regards, Joonas
> 
> [1] Given that Jason is only human and there is no way to automate this 
> analysis, we
> may have to bring something back as fixes if we find breakage (like with the 
> MMAP IOCTL).
> 
> [2] As we first require to merge the kernel code, should we introduce some 
> further rules
> that the userspace has to land their code before the final kernel version 
> release? If
> that is not followed through, we would neuter the new uAPI with as small 
> patch as possible
> in the final release candidate and then remove it in next release. Thoughts?
> 
> ***
> 
> drm-intel-gt-next-2021-08-06-1:
> 
> UAPI Changes:
> 
> - Add I915_MMAP_OFFSET_FIXED
> 
>   On devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid
>   type. On devices without local memory, this caching mode is invalid.
> 
>   As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will
>   be used, depending on the object placement on creation. WB will be used
>   when the object can only exist in system memory, WC otherwise.
> 
>   Userspace: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888
> 
> - Reinstate the mmap ioctl for (already released) integrated Gen12 platforms
> 
>   Rationale: Otherwise media driver breaks eg. for ADL-P. Long term goal is
>   still to sunset the IOCTL even for integrated and require using mmap_offset.
> 
> - Reject caching/set_domain IOCTLs on discrete
> 
>   Expected to become immutable property of the BO
> 
> - Disallow changing context parameters after first use on Gen12 and earlier
> - Require setting context parameters at creation on platforms after Gen12
> 
>   Rationale (for both): Allow less dynamic changes to the context to simplify
>   the implementation and avoid user shooting theirselves in the foot.
> 
> - Drop I915_CONTEXT_PARAM_RINGSIZE
> 
>   Userspace PR for compute-driver has not been merged
> 
> - Drop I915_CONTEXT_PARAM_NO_ZEROMAP
> 
>   Userspace PR for libdrm / Beignet was never landed
> 
> - Drop CONTEXT_CLONE API
> 
>   Userspace PR for Mesa was never landed
> 
> - Drop getparam support for I915_CONTEXT_PARAM_ENGINES
> 
>   Only existed for symmetry wrt. setparam, never used.
> 
> - Disallow bonding of virtual engines
> 
>   Drop the prep work, no hardware has been released needing it.
> 
> - (Implicit) Disable gpu relocations
> 
>   Media userspace was the last userspace to still use them. They
>   have converted so performance can be regained with an update.
> 
> Core Changes:
> 
> - Merge topic branch 'topic/i915-ttm-2021-06-11' (from Maarten)
> - Merge topic branch 'topic/revid_steppings' (from Matt R)
> - Merge topic branch 'topic/xehp-dg2-definitions-2021-07-21' (from Matt R)
> - Backmerges drm-next (Rodrigo)
> 
> Driver Changes:
> 
> - Initial workarounds for ADL-P (Clint)
> - Preliminary code for XeHP/DG2 (Stuart, Umesh, Matt R, Prathap, Ram,
>   Venkata, Akeem, Tvrtko, John, Lucas)
>

Re: [Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-06 Thread Tvrtko Ursulin




On 05/08/2021 17:32, Matthew Brost wrote:

On Thu, Aug 05, 2021 at 01:05:09PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

When a non-persistent context exits we currently mark it as banned in
order to trigger fast termination of any outstanding GPU jobs it may have
left running.

In doing so we apply a very strict 1ms limit in which the left over job
has to preempt before we issues an engine resets.

Some workloads are not able to cleanly preempt in that time window and it
can be argued that it would instead be better to give them a bit more
grace since avoiding engine resets is generally preferrable.

To achieve this the patch splits handling of banned contexts from simply
closed non-persistent ones and then applies different timeouts for both
and also extends the criteria which determines if a request should be
scheduled back in after preemption or not.

15ms preempt timeout grace is given to exited non-persistent contexts
which have been empirically tested to satisfy customers requirements
and still provides reasonably quick cleanup post exit.



I think you need to rework your thinking here a bit as this a very
execlists specific solution and the GuC needs to be considered.


Slipped my mind GuC patches were merged in the meantime. (This patch 
predates that.) But I think wording in the commit message is fine. It is 
just the implementation that now has to handle the GuC as well.



v2:
  * Streamline fast path checks.

v3:
  * Simplify by using only schedulable status.
  * Increase timeout to 20ms.

v4:
  * Fix live_execlists selftest.

v5:
  * Fix logic in kill_engines.

v6:
  * Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Zhen Han 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +--
  drivers/gpu/drm/i915/gt/intel_context.c   |  2 ++
  drivers/gpu/drm/i915/gt/intel_context.h   | 17 +-
  drivers/gpu/drm/i915/gt/intel_context_types.h |  1 +
  .../drm/i915/gt/intel_execlists_submission.c  | 11 --
  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +++--
  drivers/gpu/drm/i915/i915_request.c   |  2 +-
  7 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index cff72679ad7c..21fe5d4057ab 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1065,7 +1065,8 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
return engine;
  }
  
-static void kill_engines(struct i915_gem_engines *engines, bool ban)

+static void
+kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
  {
struct i915_gem_engines_iter it;
struct intel_context *ce;
@@ -1079,8 +1080,15 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 */
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
+   bool skip = false;
+
+   if (ban)
+   skip = intel_context_ban(ce, NULL);
+   else if (!persistent)
+   skip = !intel_context_clear_schedulable(ce);


schedulable doesn't hook into the backend at all, while
intel_context_ban does. In the case of GuC submission intel_context_ban
changes to preemption timeout to 1 us and disables scheduling resulting
in the context getting kicked off the hardware immediately. You likely
need to update intel_context_clear_schedulable to use the same vfunc as
intel_context_ban() but accept an argument for the value of the
preemption timeout. For a ban user a lower value, for clearing
schedulable use a higher value.


Okay I'll have a look. Might go back to closed flag as opposed to 
schedulable as well since I don't quite like schedulable being the odd 
one out.




  
-		if (ban && intel_context_ban(ce, NULL))

+   /* Already previously banned or made non-schedulable? */
+   if (skip)
continue;
  
  		/*

@@ -1093,7 +1101,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
engine = active_engine(ce);
  
  		/* First attempt to gracefully cancel the context */

-   if (engine && !__cancel_engine(engine) && ban)
+   if (engine && !__cancel_engine(engine) && (ban || !persistent))
/*
 * If we are unable to send a preemptive pulse to bump
 * the context from the GPU, we have to resort to a full
@@ -1105,8 +1113,6 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
  
  static void kill_context(struct i915_gem_context *ctx)

  {
-   bool ban = (!i915_gem_context_is_persistent(ctx) ||
-   !ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
  
  	spin_lock_irq(&ctx->stale.lock);

@@ -1119,7 +1125,8 @@ static vo

Re: [Intel-gfx] [PATCH 0/4] Enable GuC submission by default on DG1

2021-08-06 Thread Intel


Hi,

On 8/3/21 7:26 PM, Matthew Brost wrote:

On Tue, Aug 03, 2021 at 02:15:13PM +0200, Daniel Vetter wrote:

On Tue, Aug 3, 2021 at 6:53 AM Matthew Brost  wrote:

Minimum set of patches to enable GuC submission on DG1 and enable it by
default.

A little difficult to test as IGTs do not work with DG1 due to a bunch
of uAPI features being disabled (e.g. relocations, caching memory
options, etc...).

Matt Auld has an igt series which fixes a lot of this stuff, would be
good to do at least a Test-With run with that.


It looks like Maarten now merged Matt's series to IGT.

There is a series on IGT trybot with pending work to have some igt tests 
support relocations,


https://patchwork.freedesktop.org/series/92043/

One of the tests that have WIP fixes is gem_exec_whisper, and that 
particular test has historically shown occasional hangs with GuC 
submission on DG1 so it would be very desirable if we could make that 
test in particular work (I haven't verified that that's the case) reliably.


Also the following series:

https://patchwork.freedesktop.org/series/93455/

tries a bit harder to get some more tests running, squashing the above 
series on top of latest IGT.


Thanks,
/Thomas

Re: [Intel-gfx] [PATCH 0/4] Enable GuC submission by default on DG1

2021-08-06 Thread Intel




On 8/6/21 1:34 PM, Thomas Hellström (Intel) wrote:

Hi,

On 8/3/21 7:26 PM, Matthew Brost wrote:

On Tue, Aug 03, 2021 at 02:15:13PM +0200, Daniel Vetter wrote:
On Tue, Aug 3, 2021 at 6:53 AM Matthew Brost 
 wrote:
Minimum set of patches to enable GuC submission on DG1 and enable 
it by

default.

A little difficult to test as IGTs do not work with DG1 due to a bunch
of uAPI features being disabled (e.g. relocations, caching memory
options, etc...).

Matt Auld has an igt series which fixes a lot of this stuff, would be
good to do at least a Test-With run with that.


It looks like Maarten now merged Matt's series to IGT.

There is a series on IGT trybot with pending work to have some igt 
tests support relocations,


https://patchwork.freedesktop.org/series/92043/

One of the tests that have WIP fixes is gem_exec_whisper, and that 
particular test has historically shown occasional hangs with GuC 
submission on DG1 so it would be very desirable if we could make that 
test in particular work (I haven't verified that that's the case) 
reliably.


Also the following series:

https://patchwork.freedesktop.org/series/93455/

tries a bit harder to get some more tests running, squashing the above 
series on top of latest IGT.


Thanks,
/Thomas

And also while we're working on getting igt adapted to uapi changes and 
to get more LMEM coverage in there, an IMO relevant test case to run 
manually is "piglit quick"on top of DG1-enabled OpenGL checking for 
regressions and significant changes in execution time.


/Thomas

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gvt: Fix cached atomics setting for Windows VM (rev2)

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915/gvt: Fix cached atomics setting for Windows VM (rev2)
URL   : https://patchwork.freedesktop.org/series/92809/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10454_full -> Patchwork_20779_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20779_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20779_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20779_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_schedule@u-submit-golden-slice@vecs0:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb5/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb4/igt@gem_exec_schedule@u-submit-golden-sl...@vecs0.html

  
Known issues


  Here are the changes found in Patchwork_20779_full that come from known 
issues:

### CI changes ###

 Issues hit 

  * boot:
- shard-iclb: ([PASS][3], [PASS][4], [PASS][5], [PASS][6], 
[PASS][7], [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], 
[PASS][13], [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], 
[PASS][19], [PASS][20], [PASS][21], [PASS][22], [PASS][23], [PASS][24], 
[PASS][25], [PASS][26], [PASS][27]) -> ([PASS][28], [PASS][29], [PASS][30], 
[PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36], 
[PASS][37], [FAIL][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42], 
[PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48], 
[PASS][49], [PASS][50], [PASS][51], [PASS][52]) ([i915#3521])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb3/boot.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb8/boot.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb8/boot.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb7/boot.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb7/boot.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb7/boot.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb6/boot.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb6/boot.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb6/boot.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb6/boot.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb5/boot.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb5/boot.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb5/boot.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb4/boot.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb4/boot.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb4/boot.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb4/boot.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb3/boot.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb3/boot.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb2/boot.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb2/boot.html
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb2/boot.html
   [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb1/boot.html
   [26]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb1/boot.html
   [27]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10454/shard-iclb1/boot.html
   [28]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb1/boot.html
   [29]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb1/boot.html
   [30]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb1/boot.html
   [31]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb2/boot.html
   [32]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb2/boot.html
   [33]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb2/boot.html
   [34]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb2/boot.html
   [35]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb3/boot.html
   [36]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20779/shard-iclb3/boot.html
   [37]: 
ht

Re: [Intel-gfx] [PATCH 0/4] Enable GuC submission by default on DG1

2021-08-06 Thread Intel




On 8/6/21 1:34 PM, Thomas Hellström (Intel) wrote:

Hi,

On 8/3/21 7:26 PM, Matthew Brost wrote:

On Tue, Aug 03, 2021 at 02:15:13PM +0200, Daniel Vetter wrote:
On Tue, Aug 3, 2021 at 6:53 AM Matthew Brost 
 wrote:
Minimum set of patches to enable GuC submission on DG1 and enable 
it by

default.

A little difficult to test as IGTs do not work with DG1 due to a bunch
of uAPI features being disabled (e.g. relocations, caching memory
options, etc...).

Matt Auld has an igt series which fixes a lot of this stuff, would be
good to do at least a Test-With run with that.


It looks like Maarten now merged Matt's series to IGT.

There is a series on IGT trybot with pending work to have some igt 
tests support relocations,


https://patchwork.freedesktop.org/series/92043/

One of the tests that have WIP fixes is gem_exec_whisper, and that 
particular test has historically shown occasional hangs with GuC 
submission on DG1 so it would be very desirable if we could make that 
test in particular work (I haven't verified that that's the case) 
reliably.


Also the following series:

https://patchwork.freedesktop.org/series/93455/

tries a bit harder to get some more tests running, squashing the above 
series on top of latest IGT.


Thanks,
/Thomas


Just verified that gem-exec-whisper is running now on DG1 on latest igt 
and the above series. Haven't tried with GuC submission, though.


/Thomas

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Daniel Vetter

On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
>
> On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
> >
> > Originally drm_sched_job_init was the point of no return, after which
> > drivers must submit a job. I've split that up, which allows us to fix
> > this issue pretty easily.
> >
> > Only thing we have to take care of is to not skip to error paths after
> > that. Other drivers do this the same for out-fence and similar things.
> >
> > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > Cc: Rob Clark 
> > Cc: Rob Clark 
> > Cc: Sean Paul 
> > Cc: Sumit Semwal 
> > Cc: "Christian König" 
> > Cc: linux-arm-...@vger.kernel.org
> > Cc: dri-de...@lists.freedesktop.org
> > Cc: freedr...@lists.freedesktop.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: linaro-mm-...@lists.linaro.org
> > Signed-off-by: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> > drm_device *dev,
> > return ERR_PTR(ret);
> > }
> >
> > -   /* FIXME: this is way too early */
> > -   drm_sched_job_arm(&job->base);
> > -
> > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> >
> > kref_init(&submit->ref);
> > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> > *data,
> >
> > submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
> >
> > +   /* point of no return, we _have_ to submit no matter what */
> > +   drm_sched_job_arm(&submit->base);
> > +
> > /*
> >  * Allocate an id which can be used by WAIT_FENCE ioctl to map back
> >  * to the underlying fence.
> > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> > *data,
> > if (submit->fence_id < 0) {
> > ret = submit->fence_id = 0;
> > submit->fence_id = 0;
> > -   goto out;
> > }
> >
> > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > struct sync_file *sync_file = 
> > sync_file_create(submit->user_fence);
> > if (!sync_file) {
> > ret = -ENOMEM;
> > -   goto out;
> > +   } else {
> > +   fd_install(out_fence_fd, sync_file->file);
> > +   args->fence_fd = out_fence_fd;
> > }
> > -   fd_install(out_fence_fd, sync_file->file);
> > -   args->fence_fd = out_fence_fd;
>
> I wonder if instead we should (approximately) undo "drm/msm/submit:
> Simplify out-fence-fd handling" so that the point that it could fail
> is moved up ahead of the drm_sched_job_arm()?

Hm yeah. Up to you how you want to paint this shed, I think either is fine.

> Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
> a quick look, it looks like it won't, but I'm still playing catchup
> and haven't had a chance to look at your entire series.  If it doesn't
> work before drm_sched_job_arm(), then there is really no way to
> prevent a error path between the fence-init and job-submit.

Yes. I thought I've checked that I put the _arm() in the right spot,
but I guess I screwed up and you need the fence before the point where
I've put the job_arm()? And yes the error path cannot be avoided for
out-fences, that's what I tried to explain in the commit message.

> But, prior to your series, wouldn't a failure after
> drm_sched_job_init() but before the job is submitted just burn a
> fence-id, and otherwise carry on it's merry way?

Maybe? I'm not sure whether the scheduler gets confused about the gap
and freak out abou that. I'm fairly new to that code and learning
(which is part why I'm working on it). Since you look up in
fences/syncobj after job_init() it should be pretty easy to whip up a
testcase and see what happens. Also as long as nothing fails you won't
see an issue, that's for sure.
-Daniel

> BR,
> -R
>
> > }
> >
> > submit_attach_object_fences(submit);
> > --
> > 2.32.0
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Rob Clark

On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter  wrote:
>
> On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
> >
> > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
> > >
> > > Originally drm_sched_job_init was the point of no return, after which
> > > drivers must submit a job. I've split that up, which allows us to fix
> > > this issue pretty easily.
> > >
> > > Only thing we have to take care of is to not skip to error paths after
> > > that. Other drivers do this the same for out-fence and similar things.
> > >
> > > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > > Cc: Rob Clark 
> > > Cc: Rob Clark 
> > > Cc: Sean Paul 
> > > Cc: Sumit Semwal 
> > > Cc: "Christian König" 
> > > Cc: linux-arm-...@vger.kernel.org
> > > Cc: dri-de...@lists.freedesktop.org
> > > Cc: freedr...@lists.freedesktop.org
> > > Cc: linux-me...@vger.kernel.org
> > > Cc: linaro-mm-...@lists.linaro.org
> > > Signed-off-by: Daniel Vetter 
> > > ---
> > >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> > > drm_device *dev,
> > > return ERR_PTR(ret);
> > > }
> > >
> > > -   /* FIXME: this is way too early */
> > > -   drm_sched_job_arm(&job->base);
> > > -
> > > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> > >
> > > kref_init(&submit->ref);
> > > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> > > *data,
> > >
> > > submit->user_fence = 
> > > dma_fence_get(&submit->base.s_fence->finished);
> > >
> > > +   /* point of no return, we _have_ to submit no matter what */
> > > +   drm_sched_job_arm(&submit->base);
> > > +
> > > /*
> > >  * Allocate an id which can be used by WAIT_FENCE ioctl to map 
> > > back
> > >  * to the underlying fence.
> > > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, 
> > > void *data,
> > > if (submit->fence_id < 0) {
> > > ret = submit->fence_id = 0;
> > > submit->fence_id = 0;
> > > -   goto out;
> > > }
> > >
> > > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > struct sync_file *sync_file = 
> > > sync_file_create(submit->user_fence);
> > > if (!sync_file) {
> > > ret = -ENOMEM;
> > > -   goto out;
> > > +   } else {
> > > +   fd_install(out_fence_fd, sync_file->file);
> > > +   args->fence_fd = out_fence_fd;
> > > }
> > > -   fd_install(out_fence_fd, sync_file->file);
> > > -   args->fence_fd = out_fence_fd;
> >
> > I wonder if instead we should (approximately) undo "drm/msm/submit:
> > Simplify out-fence-fd handling" so that the point that it could fail
> > is moved up ahead of the drm_sched_job_arm()?
>
> Hm yeah. Up to you how you want to paint this shed, I think either is fine.
>
> > Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
> > a quick look, it looks like it won't, but I'm still playing catchup
> > and haven't had a chance to look at your entire series.  If it doesn't
> > work before drm_sched_job_arm(), then there is really no way to
> > prevent a error path between the fence-init and job-submit.
>
> Yes. I thought I've checked that I put the _arm() in the right spot,
> but I guess I screwed up and you need the fence before the point where
> I've put the job_arm()? And yes the error path cannot be avoided for
> out-fences, that's what I tried to explain in the commit message.
>
> > But, prior to your series, wouldn't a failure after
> > drm_sched_job_init() but before the job is submitted just burn a
> > fence-id, and otherwise carry on it's merry way?
>
> Maybe? I'm not sure whether the scheduler gets confused about the gap
> and freak out abou that. I'm fairly new to that code and learning
> (which is part why I'm working on it). Since you look up in
> fences/syncobj after job_init() it should be pretty easy to whip up a
> testcase and see what happens. Also as long as nothing fails you won't
> see an issue, that's for sure.

fair.. I'll try to come up with a test case.. pre-scheduler-conversion
it wasn't a problem to fail after the fence seqno was allocated (well,
I guess you might have problems if you had 2^31 failures in a row)

BR,
-R

> -Daniel
>
> > BR,
> > -R
> >
> > > }
> > >
> > > submit_attach_object_fences(submit);
> > > --
> > > 2.32.0
> > >
>
>
>
> --
> Daniel Vetter
> Software Engineer,

[Intel-gfx] [PATCH v5.1 4/9] drm/i915/xehpsdv: Add compute DSS type

2021-08-06 Thread Matt Roper

From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.

v2:
 - Generalize a comment about uapi access to geometry/compute masks; the
   proposed uapi has changed since the comment was first written, and
   will show up in a future series once the userspace code is published.
   (Lucas)

v3:
 - Eliminate unnecessary has_compute_dss flag.  (Lucas)
 - Drop unwanted comment change in uapi header.  (Lucas)

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Cc: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 60 +---
 drivers/gpu/drm/i915/gt/intel_sseu.h |  4 +-
 drivers/gpu/drm/i915/i915_reg.h  |  3 +-
 3 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbd272943c3f..b0e09b58005e 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
 }
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
 {
int offset = slice * sseu->ss_stride;
 
-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
 }
 
 unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
 }
 
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
 {
int s, ss;
 
-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);
 
for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,
 
sseu->slice_mask |= BIT(s);
 
-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs geometry DSS. To
+* reduce variation between GENs around subslice usage, store a
+* mask for both the geometry and compute enabled masks since
+* userspace will need to be able to query these masks
+* independently.  Also compute a total enabled subslice count
+* for the purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));
 
for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 {
struct sseu_dev_info *sseu = >->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -160,7 +185,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
   GEN11_GT_S_ENA_MASK;
 
-   dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
+   g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
+

[Intel-gfx] [PATCH] drm/i915: Only access SFC_DONE when media domain is not fused off

2021-08-06 Thread Matt Roper

The SFC_DONE register lives within the corresponding VD0/VD2/VD4/VD6
forcewake domain and is not accessible if the vdbox in that domain is
fused off and the forcewake is not initialized.

This mistake went unnoticed because until recently we were using the
wrong register offset for the SFC_DONE register; once the register
offset was corrected, we started hitting errors like

  <4> [544.989065] i915 :cc:00.0: Uninitialized forcewake domain(s) 0x80 
accessed at 0x1ce000

on parts with fused-off vdbox engines.

Fixes: e50dbdbfd9fb ("drm/i915/tgl: Add SFC instdone to error state")
Fixes: 82929a2140eb ("drm/i915: Correct SFC_DONE register offset")
Cc: Daniele Ceraolo Spurio 
Cc: Mika Kuoppala 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_gpu_error.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index 0f08bcfbe964..9cf6ac575de1 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -727,9 +727,18 @@ static void err_print_gt(struct drm_i915_error_state_buf 
*m,
if (GRAPHICS_VER(m->i915) >= 12) {
int i;
 
-   for (i = 0; i < GEN12_SFC_DONE_MAX; i++)
+   for (i = 0; i < GEN12_SFC_DONE_MAX; i++) {
+   /*
+* SFC_DONE resides in the VD forcewake domain, so it
+* only exists if the corresponding VCS engine is
+* present.
+*/
+   if (!HAS_ENGINE(gt->_gt, _VCS(i * 2)))
+   continue;
+
err_printf(m, "  SFC_DONE[%d]: 0x%08x\n", i,
   gt->sfc_done[i]);
+   }
 
err_printf(m, "  GAM_DONE: 0x%08x\n", gt->gam_done);
}
@@ -1598,6 +1607,14 @@ static void gt_record_regs(struct intel_gt_coredump *gt)
 
if (GRAPHICS_VER(i915) >= 12) {
for (i = 0; i < GEN12_SFC_DONE_MAX; i++) {
+   /*
+* SFC_DONE resides in the VD forcewake domain, so it
+* only exists if the corresponding VCS engine is
+* present.
+*/
+   if (!HAS_ENGINE(gt->_gt, _VCS(i * 2)))
+   continue;
+
gt->sfc_done[i] =
intel_uncore_read(uncore, GEN12_SFC_DONE(i));
}
-- 
2.25.4

Re: [Intel-gfx] [PATCH 1/1] drm/i915: Check if engine has heartbeat when closing a context

2021-08-06 Thread John Harrison


On 8/2/2021 02:40, Tvrtko Ursulin wrote:

On 30/07/2021 19:13, John Harrison wrote:

On 7/30/2021 02:49, Tvrtko Ursulin wrote:

On 30/07/2021 01:13, John Harrison wrote:

On 7/28/2021 17:34, Matthew Brost wrote:
If an engine associated with a context does not have a heartbeat, 
ban it
immediately. This is needed for GuC submission as a idle pulse 
doesn't

kick the context off the hardware where it then can check for a
heartbeat and ban the context.


Pulse, that is a request with I915_PRIORITY_BARRIER, does not 
preempt a running normal priority context?


Why does it matter then whether or not heartbeats are enabled - when 
heartbeat just ends up sending the same engine pulse (eventually, 
with raising priority)?
The point is that the pulse is pointless. See the rest of my comments 
below, specifically "the context will get resubmitted to the hardware 
after the pulse completes". To re-iterate...


Yes, it preempts the context. Yes, it does so whether heartbeats are 
enabled or not. But so what? Who cares? You have preempted a context. 
It is no longer running on the hardware. BUT IT IS STILL A VALID 
CONTEXT. 


It is valid yes, and it even may be the current ABI so another 
question is whether it is okay to change that.


The backend scheduler will just resubmit it to the hardware as soon 
as the pulse completes. The only reason this works at all is because 
of the horrid hack in the execlist scheduler's back end 
implementation (in __execlists_schedule_in):

 if (unlikely(intel_context_is_closed(ce) &&
  !intel_engine_has_heartbeat(engine)))
 intel_context_set_banned(ce);


Right, is the above code then needed with this patch - when ban is 
immediately applied on the higher level?


The actual back end scheduler is saying "Is this a zombie context? Is 
the heartbeat disabled? Then ban it". No other scheduler backend is 
going to have knowledge of zombie context status or of the heartbeat 
status. Nor are they going to call back into the higher levels of the 
i915 driver to trigger a ban operation. Certainly a hardware 
implemented scheduler is not going to be looking at private i915 
driver information to decide whether to submit a context or whether 
to tell the OS to kill it off instead.


For persistence to work with a hardware scheduler (or a non-Intel 
specific scheduler such as the DRM one), the handling of zombie 
contexts, banning, etc. *must* be done entirely in the front end. It 
cannot rely on any backend hacks. That means you can't rely on any 
fancy behaviour of pulses.


If you want to ban a context then you must explicitly ban that 
context. If you want to ban it at some later point then you need to 
track it at the top level as a zombie and then explicitly ban that 
zombie at whatever later point.


I am still trying to understand it all. If I go by the commit message:

"""
This is needed for GuC submission as a idle pulse doesn't
kick the context off the hardware where it then can check for a
heartbeat and ban the context.
"""

That did not explain things for me. Sentence does not appear to make 
sense. Now, it seems "kick off the hardware" is meant as revoke and 
not just preempt. Which is fine, perhaps just needs to be written more 
explicitly. But the part of checking for heartbeat after idle pulse 
does not compute for me. It is the heartbeat which emits idle pulses, 
not idle pulse emitting heartbeats.
I am in agreement that the commit message is confusing and does not 
explain either the problem or the solution.






But anyway, I can buy the handling at the front end story completely. 
It makes sense. We just need to agree that a) it is okay to change the 
ABI and b) remove the backend check from execlists if it is not needed 
any longer.


And if ABI change is okay then commit message needs to talk about it 
loudly and clearly.
I don't think we have a choice. The current ABI is not and cannot ever 
be compatible with any scheduler external to i915. It cannot be 
implemented with a hardware scheduler such as the GuC and it cannot be 
implemented with an external software scheduler such as the DRM one.


My view is that any implementation involving knowledge of the heartbeat 
is fundamentally broken.


According to Daniel Vetter, the DRM ABI on this subject is that an 
actively executing context should persist until the DRM file handle is 
closed. That seems like a much more plausible and simple ABI than one 
that says 'if the heartbeat is running then a context will persist 
forever, if the heartbeat is not running then it will be killed 
immediately, if the heart was running but then stops running then the 
context will be killed on the next context switch, ...'. And if I 
understand it correctly, the current ABI allows a badly written user app 
to cause a denial of service by leaving contexts permanently running an 
infinite loop on the hardware even after the app has been killed! How 
can that ever be considered a good idea?


Th

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev10)

2021-08-06 Thread Patchwork

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev10)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0ebc963f011d drm/i915/xehp: Loop over all gslices for INSTDONE processing
-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'iter_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'gslice_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

-:135: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'dss_' - possible 
side-effects?
#135: FILE: drivers/gpu/drm/i915/gt/intel_engine_types.h:582:
+#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, 
dss_) \
+   for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
+(iter_) < GEN_MAX_SUBSLICES; \
+(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
+(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
+   for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))

total: 0 errors, 0 warnings, 3 checks, 164 lines checked
91cde6f62ffc drm/i915/dg2: Report INSTDONE_GEOM values in error state
9dd72d6a97d4 drm/i915/xehpsdv: Add compute DSS type
9793162b05a6 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
8db8e3b6bca0 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
8c7ca653ac33 drm/i915/dg2: Add new LRI reg offsets
23befa04925d drm/i915/dg2: Maintain backward-compatible nested batch behavior
c5300668b502 drm/i915/dg2: Configure PCON in DP pre-enable path

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Begin enabling Xe_HP SDV and DG2 platforms (rev10)

2021-08-06 Thread Patchwork

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev10)
URL   : https://patchwork.freedesktop.org/series/92135/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1901:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1410:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1268:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9:

Re: [Intel-gfx] [PATCH] drm/i915: Fix syncmap memory leak

2021-08-06 Thread John Harrison


On 7/30/2021 12:53, Matthew Brost wrote:

A small race exists between intel_gt_retire_requests_timeout and
intel_timeline_exit which could result in the syncmap not getting
free'd. Rather than work to hard to seal this race, simply cleanup the

free'd -> freed


syncmap on fini.

unreferenced object 0x88813bc53b18 (size 96):
   comm "gem_close_race", pid 5410, jiffies 4294917818 (age 1105.600s)
   hex dump (first 32 bytes):
 01 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00  
 00 00 00 00 00 00 00 00 6b 6b 6b 6b 06 00 00 00  
   backtrace:
 [<120b863a>] __sync_alloc_leaf+0x1e/0x40 [i915]
 [<042f6959>] __sync_set+0x1bb/0x240 [i915]
 [<90f0e90f>] i915_request_await_dma_fence+0x1c7/0x400 [i915]
 [<56a48219>] i915_request_await_object+0x222/0x360 [i915]
 [] i915_gem_do_execbuffer+0x1bd0/0x2250 [i915]
 [<3c9d830f>] i915_gem_execbuffer2_ioctl+0x405/0xce0 [i915]
 [] drm_ioctl_kernel+0xb0/0xf0 [drm]
 [] drm_ioctl+0x305/0x3c0 [drm]
 [<8b0d8986>] __x64_sys_ioctl+0x71/0xb0
 [<76c362a4>] do_syscall_64+0x33/0x80
 [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Matthew Brost 
Fixes: 531958f6f357 ("drm/i915/gt: Track timeline activeness in enter/exit")
Cc: 
---
  drivers/gpu/drm/i915/gt/intel_timeline.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
b/drivers/gpu/drm/i915/gt/intel_timeline.c
index c4a126c8caef..1257f4f11e66 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -127,6 +127,15 @@ static void intel_timeline_fini(struct rcu_head *rcu)
  
  	i915_vma_put(timeline->hwsp_ggtt);

i915_active_fini(&timeline->active);
+
+   /*
+* A small race exists between intel_gt_retire_requests_timeout and
+* intel_timeline_exit which could result in the syncmap not getting
+* free'd. Rather than work to hard to seal this race, simply cleanup
+* the syncmap on fini.
What is the race? I'm going round in circles just trying to work out how 
intel_gt_retire_requests_timeout is supposed to get to 
intel_timeline_exit in the first place.


Also, free'd -> freed.

John.



+*/
+   i915_syncmap_free(&timeline->sync);
+
kfree(timeline);
  }

Re: [Intel-gfx] [PATCH] drm/i915: Disable bonding on gen12+ platforms

2021-08-06 Thread John Harrison


On 7/28/2021 12:21, Matthew Brost wrote:

Disable bonding on gen12+ platforms aside from ones already supported by
the i915 - TGL, RKL, and ADL-S.

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 05c3ee191710..9c3672bac0e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -446,6 +446,13 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
__user *base, void *data)
u16 idx, num_bonds;
int err, n;
  
+	if (GRAPHICS_VER(i915) >= 12 && !IS_TIGERLAKE(i915) &&

+   !IS_ROCKETLAKE(i915) && !IS_ALDERLAKE_S(i915)) {
+   drm_dbg(&i915->drm,
+   "Bonding on gen12+ aside from TGL, RKL, and ADL_S not 
allowed\n");

I would have said not supported rather than not allowed. Either way:
Reviewed-by: John Harrison 


+   return -ENODEV;
+   }
+
if (get_user(idx, &ext->virtual_index))
return -EFAULT;

Re: [Intel-gfx] [PATCH] drm/i915: Fix syncmap memory leak

2021-08-06 Thread Matthew Brost

On Fri, Aug 06, 2021 at 11:23:06AM -0700, John Harrison wrote:
> On 7/30/2021 12:53, Matthew Brost wrote:
> > A small race exists between intel_gt_retire_requests_timeout and
> > intel_timeline_exit which could result in the syncmap not getting
> > free'd. Rather than work to hard to seal this race, simply cleanup the
> free'd -> freed
> 

Sure.

> > syncmap on fini.
> > 
> > unreferenced object 0x88813bc53b18 (size 96):
> >comm "gem_close_race", pid 5410, jiffies 4294917818 (age 1105.600s)
> >hex dump (first 32 bytes):
> >  01 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00  
> >  00 00 00 00 00 00 00 00 6b 6b 6b 6b 06 00 00 00  
> >backtrace:
> >  [<120b863a>] __sync_alloc_leaf+0x1e/0x40 [i915]
> >  [<042f6959>] __sync_set+0x1bb/0x240 [i915]
> >  [<90f0e90f>] i915_request_await_dma_fence+0x1c7/0x400 [i915]
> >  [<56a48219>] i915_request_await_object+0x222/0x360 [i915]
> >  [] i915_gem_do_execbuffer+0x1bd0/0x2250 [i915]
> >  [<3c9d830f>] i915_gem_execbuffer2_ioctl+0x405/0xce0 [i915]
> >  [] drm_ioctl_kernel+0xb0/0xf0 [drm]
> >  [] drm_ioctl+0x305/0x3c0 [drm]
> >  [<8b0d8986>] __x64_sys_ioctl+0x71/0xb0
> >  [<76c362a4>] do_syscall_64+0x33/0x80
> >  [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > Signed-off-by: Matthew Brost 
> > Fixes: 531958f6f357 ("drm/i915/gt: Track timeline activeness in enter/exit")
> > Cc: 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_timeline.c | 9 +
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
> > b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > index c4a126c8caef..1257f4f11e66 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_timeline.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
> > @@ -127,6 +127,15 @@ static void intel_timeline_fini(struct rcu_head *rcu)
> > i915_vma_put(timeline->hwsp_ggtt);
> > i915_active_fini(&timeline->active);
> > +
> > +   /*
> > +* A small race exists between intel_gt_retire_requests_timeout and
> > +* intel_timeline_exit which could result in the syncmap not getting
> > +* free'd. Rather than work to hard to seal this race, simply cleanup
> > +* the syncmap on fini.
> What is the race? I'm going round in circles just trying to work out how
> intel_gt_retire_requests_timeout is supposed to get to intel_timeline_exit
> in the first place.
> 

intel_gt_retire_requests_timeout increments tl->active_count, active_count == 2
intel_timeline_exit is called, returns on atomic_add_unless, active_count == 1
intel_gt_retire_requests_timeout decrements tl->active_count, active_count == 0
i915_syncmap_free is never called, memory leak

Matt

> Also, free'd -> freed.
> 
> John.
> 
> 
> > +*/
> > +   i915_syncmap_free(&timeline->sync);
> > +
> > kfree(timeline);
> >   }
>

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Daniel Vetter

On Fri, Aug 6, 2021 at 7:15 PM Rob Clark  wrote:
>
> On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter  wrote:
> >
> > On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
> > >
> > > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  
> > > wrote:
> > > >
> > > > Originally drm_sched_job_init was the point of no return, after which
> > > > drivers must submit a job. I've split that up, which allows us to fix
> > > > this issue pretty easily.
> > > >
> > > > Only thing we have to take care of is to not skip to error paths after
> > > > that. Other drivers do this the same for out-fence and similar things.
> > > >
> > > > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > > > Cc: Rob Clark 
> > > > Cc: Rob Clark 
> > > > Cc: Sean Paul 
> > > > Cc: Sumit Semwal 
> > > > Cc: "Christian König" 
> > > > Cc: linux-arm-...@vger.kernel.org
> > > > Cc: dri-de...@lists.freedesktop.org
> > > > Cc: freedr...@lists.freedesktop.org
> > > > Cc: linux-me...@vger.kernel.org
> > > > Cc: linaro-mm-...@lists.linaro.org
> > > > Signed-off-by: Daniel Vetter 
> > > > ---
> > > >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> > > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> > > > drm_device *dev,
> > > > return ERR_PTR(ret);
> > > > }
> > > >
> > > > -   /* FIXME: this is way too early */
> > > > -   drm_sched_job_arm(&job->base);
> > > > -
> > > > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> > > >
> > > > kref_init(&submit->ref);
> > > > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, 
> > > > void *data,
> > > >
> > > > submit->user_fence = 
> > > > dma_fence_get(&submit->base.s_fence->finished);
> > > >
> > > > +   /* point of no return, we _have_ to submit no matter what */
> > > > +   drm_sched_job_arm(&submit->base);
> > > > +
> > > > /*
> > > >  * Allocate an id which can be used by WAIT_FENCE ioctl to map 
> > > > back
> > > >  * to the underlying fence.
> > > > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, 
> > > > void *data,
> > > > if (submit->fence_id < 0) {
> > > > ret = submit->fence_id = 0;
> > > > submit->fence_id = 0;
> > > > -   goto out;
> > > > }
> > > >
> > > > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > struct sync_file *sync_file = 
> > > > sync_file_create(submit->user_fence);
> > > > if (!sync_file) {
> > > > ret = -ENOMEM;
> > > > -   goto out;
> > > > +   } else {
> > > > +   fd_install(out_fence_fd, sync_file->file);
> > > > +   args->fence_fd = out_fence_fd;
> > > > }
> > > > -   fd_install(out_fence_fd, sync_file->file);
> > > > -   args->fence_fd = out_fence_fd;
> > >
> > > I wonder if instead we should (approximately) undo "drm/msm/submit:
> > > Simplify out-fence-fd handling" so that the point that it could fail
> > > is moved up ahead of the drm_sched_job_arm()?
> >
> > Hm yeah. Up to you how you want to paint this shed, I think either is fine.
> >
> > > Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
> > > a quick look, it looks like it won't, but I'm still playing catchup
> > > and haven't had a chance to look at your entire series.  If it doesn't
> > > work before drm_sched_job_arm(), then there is really no way to
> > > prevent a error path between the fence-init and job-submit.
> >
> > Yes. I thought I've checked that I put the _arm() in the right spot,
> > but I guess I screwed up and you need the fence before the point where
> > I've put the job_arm()? And yes the error path cannot be avoided for
> > out-fences, that's what I tried to explain in the commit message.
> >
> > > But, prior to your series, wouldn't a failure after
> > > drm_sched_job_init() but before the job is submitted just burn a
> > > fence-id, and otherwise carry on it's merry way?
> >
> > Maybe? I'm not sure whether the scheduler gets confused about the gap
> > and freak out abou that. I'm fairly new to that code and learning
> > (which is part why I'm working on it). Since you look up in
> > fences/syncobj after job_init() it should be pretty easy to whip up a
> > testcase and see what happens. Also as long as nothing fails you won't
> > see an issue, that's for sure.
>
> fair.. I'll try to come up with a test case.. pre-scheduler-conversion
> it wasn't a problem to fail after the fence seqno

[Intel-gfx] ✓ Fi.CI.BAT: success for Begin enabling Xe_HP SDV and DG2 platforms (rev10)

2021-08-06 Thread Patchwork

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev10)
URL   : https://patchwork.freedesktop.org/series/92135/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10457 -> Patchwork_20781


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/index.html

Known issues


  Here are the changes found in Patchwork_20781 that come from known issues:

### IGT changes ###

 Possible fixes 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][1] ([i915#1888]) -> [PASS][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@hangcheck:
- fi-icl-u2:  [DMESG-FAIL][3] -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Participating hosts (37 -> 34)
--

  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10457 -> Patchwork_20781

  CI-20190529: 20190529
  CI_DRM_10457: 7700f858b68060307b0a7d94377a5d8f64000e5d @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6162: 2f32b9e0da5f1ac9529318dd5b836c8cf4d3c441 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20781: c5300668b502fff59ac563d2a7b22f2341bd8c95 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c5300668b502 drm/i915/dg2: Configure PCON in DP pre-enable path
23befa04925d drm/i915/dg2: Maintain backward-compatible nested batch behavior
8c7ca653ac33 drm/i915/dg2: Add new LRI reg offsets
8db8e3b6bca0 drm/i915/xehpsdv: Read correct RP_STATE_CAP register
9793162b05a6 drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
9dd72d6a97d4 drm/i915/xehpsdv: Add compute DSS type
91cde6f62ffc drm/i915/dg2: Report INSTDONE_GEOM values in error state
0ebc963f011d drm/i915/xehp: Loop over all gslices for INSTDONE processing

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/index.html

Re: [Intel-gfx] [PATCH 2/4] drm/i915/guc: put all guc objects in lmem when available

2021-08-06 Thread John Harrison


On 8/2/2021 22:11, Matthew Brost wrote:

From: Daniele Ceraolo Spurio 

The firmware binary has to be loaded from lmem and the recommendation is
to put all other objects in there as well. Note that we don't fall back
to system memory if the allocation in lmem fails because all objects are
allocated during driver load and if we have issues with lmem at that point
something is seriously wrong with the system, so no point in trying to
handle it.

Cc: Matthew Auld 
Cc: Abdiel Janulgue 
Cc: Michal Wajdeczko 
Cc: Vinay Belgaumkar 
Cc: Radoslaw Szwichtenberg 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gem/i915_gem_lmem.c  | 26 
  drivers/gpu/drm/i915/gem/i915_gem_lmem.h  |  4 ++
  drivers/gpu/drm/i915/gt/uc/intel_guc.c|  9 ++-
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 11 +++-
  drivers/gpu/drm/i915/gt/uc/intel_huc.c| 14 -
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  | 75 +--
  6 files changed, 127 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index eb345305dc52..034226c5d4d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -103,6 +103,32 @@ __i915_gem_object_create_lmem_with_ps(struct 
drm_i915_private *i915,
 size, page_size, flags);
  }
  
+struct drm_i915_gem_object *

+i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
+ const void *data, size_t size)
+{
+   struct drm_i915_gem_object *obj;
+   void *map;
+
+   obj = i915_gem_object_create_lmem(i915,
+ round_up(size, PAGE_SIZE),
+ I915_BO_ALLOC_CONTIGUOUS);
+   if (IS_ERR(obj))
+   return obj;
+
+   map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
+   if (IS_ERR(map)) {
+   i915_gem_object_put(obj);
+   return map;
+   }
+
+   memcpy(map, data, size);
+
+   i915_gem_object_unpin_map(obj);
+
+   return obj;
+}
+
  struct drm_i915_gem_object *
  i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h 
b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
index 4ee81fc66302..1b88ea13435c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.h
@@ -23,6 +23,10 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object 
*obj);
  
  bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
  
+struct drm_i915_gem_object *

+i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
+ const void *data, size_t size);
+
  struct drm_i915_gem_object *
  __i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
  resource_size_t size,
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
index 979128e28372..55160d3e401a 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.c
@@ -3,6 +3,7 @@
   * Copyright © 2014-2019 Intel Corporation
   */
  
+#include "gem/i915_gem_lmem.h"

  #include "gt/intel_gt.h"
  #include "gt/intel_gt_irq.h"
  #include "gt/intel_gt_pm_irq.h"
@@ -630,7 +631,13 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc 
*guc, u32 size)
u64 flags;
int ret;
  
-	obj = i915_gem_object_create_shmem(gt->i915, size);

+   if (HAS_LMEM(gt->i915))
+   obj = i915_gem_object_create_lmem(gt->i915, size,
+ I915_BO_ALLOC_CPU_CLEAR |
+ I915_BO_ALLOC_CONTIGUOUS);
+   else
+   obj = i915_gem_object_create_shmem(gt->i915, size);
+
if (IS_ERR(obj))
return ERR_CAST(obj);
  
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c

index 76fe766ad1bc..962be0c12208 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -41,7 +41,7 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
  }
  
  /* Copy RSA signature from the fw image to HW for verification */

-static void guc_xfer_rsa(struct intel_uc_fw *guc_fw,
+static int guc_xfer_rsa(struct intel_uc_fw *guc_fw,
 struct intel_uncore *uncore)
  {
u32 rsa[UOS_RSA_SCRATCH_COUNT];
@@ -49,10 +49,13 @@ static void guc_xfer_rsa(struct intel_uc_fw *guc_fw,
int i;
  
  	copied = intel_uc_fw_copy_rsa(guc_fw, rsa, sizeof(rsa));

-   GEM_BUG_ON(copied < sizeof(rsa));
+   if (copied < sizeof(rsa))
+   return -ENOMEM;
  
  	for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++)

intel_uncore_write(u

Re: [Intel-gfx] [PATCH 3/4] drm/i915/guc: Add DG1 GuC / HuC firmware defs

2021-08-06 Thread John Harrison


On 8/2/2021 22:11, Matthew Brost wrote:

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
index f8cb00ffb506..a685d563df72 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c
@@ -51,6 +51,7 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
  #define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
+   fw_def(DG1, 0, guc_def(dg1, 62, 0, 0), huc_def(dg1,  7, 9, 3)) \
fw_def(ROCKETLAKE,  0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
fw_def(TIGERLAKE,   0, guc_def(tgl, 62, 0, 0), huc_def(tgl,  7, 9, 3)) \
fw_def(JASPERLAKE,  0, guc_def(ehl, 62, 0, 0), huc_def(ehl,  9, 0, 0)) \


Reviewed-by: John Harrison

Re: [Intel-gfx] [PATCH 4/4] drm/i915/guc: Enable GuC submission by default on DG1

2021-08-06 Thread John Harrison


On 8/2/2021 22:11, Matthew Brost wrote:

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/uc/intel_uc.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index da57d18d9f6b..fc2fc8d111d8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -35,7 +35,7 @@ static void uc_expand_default_options(struct intel_uc *uc)
}
  
  	/* Intermediate platforms are HuC authentication only */

-   if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
+   if (IS_ALDERLAKE_S(i915)) {
i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
return;
}


Reviewed-by: John Harrison

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Only access SFC_DONE when media domain is not fused off

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Only access SFC_DONE when media domain is not fused off
URL   : https://patchwork.freedesktop.org/series/93468/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0e9d9b7275f5 drm/i915: Only access SFC_DONE when media domain is not fused off
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#15: 
  <4> [544.989065] i915 :cc:00.0: Uninitialized forcewake domain(s) 0x80 
accessed at 0x1ce000

total: 0 errors, 1 warnings, 0 checks, 33 lines checked

Re: [Intel-gfx] [PATCH] drm/i915: Disable bonding on gen12+ platforms

2021-08-06 Thread Daniel Vetter

On Fri, Aug 6, 2021 at 8:25 PM John Harrison  wrote:
> On 7/28/2021 12:21, Matthew Brost wrote:
> > Disable bonding on gen12+ platforms aside from ones already supported by
> > the i915 - TGL, RKL, and ADL-S.
> >
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > index 05c3ee191710..9c3672bac0e2 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > @@ -446,6 +446,13 @@ set_proto_ctx_engines_bond(struct i915_user_extension 
> > __user *base, void *data)
> >   u16 idx, num_bonds;
> >   int err, n;
> >
> > + if (GRAPHICS_VER(i915) >= 12 && !IS_TIGERLAKE(i915) &&
> > + !IS_ROCKETLAKE(i915) && !IS_ALDERLAKE_S(i915)) {
> > + drm_dbg(&i915->drm,
> > + "Bonding on gen12+ aside from TGL, RKL, and ADL_S not 
> > allowed\n");
> I would have said not supported rather than not allowed. Either way:
> Reviewed-by: John Harrison 

Either is fine with me.

Acked-by: Daniel Vetter 

>
> > + return -ENODEV;
> > + }
> > +
> >   if (get_user(idx, &ext->virtual_index))
> >   return -EFAULT;
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Rob Clark

On Fri, Aug 6, 2021 at 11:41 AM Daniel Vetter  wrote:
>
> On Fri, Aug 6, 2021 at 7:15 PM Rob Clark  wrote:
> >
> > On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter  wrote:
> > >
> > > On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
> > > >
> > > > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  
> > > > wrote:
> > > > >
> > > > > Originally drm_sched_job_init was the point of no return, after which
> > > > > drivers must submit a job. I've split that up, which allows us to fix
> > > > > this issue pretty easily.
> > > > >
> > > > > Only thing we have to take care of is to not skip to error paths after
> > > > > that. Other drivers do this the same for out-fence and similar things.
> > > > >
> > > > > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > > > > Cc: Rob Clark 
> > > > > Cc: Rob Clark 
> > > > > Cc: Sean Paul 
> > > > > Cc: Sumit Semwal 
> > > > > Cc: "Christian König" 
> > > > > Cc: linux-arm-...@vger.kernel.org
> > > > > Cc: dri-de...@lists.freedesktop.org
> > > > > Cc: freedr...@lists.freedesktop.org
> > > > > Cc: linux-me...@vger.kernel.org
> > > > > Cc: linaro-mm-...@lists.linaro.org
> > > > > Signed-off-by: Daniel Vetter 
> > > > > ---
> > > > >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> > > > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > > > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > @@ -52,9 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> > > > > drm_device *dev,
> > > > > return ERR_PTR(ret);
> > > > > }
> > > > >
> > > > > -   /* FIXME: this is way too early */
> > > > > -   drm_sched_job_arm(&job->base);
> > > > > -
> > > > > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> > > > >
> > > > > kref_init(&submit->ref);
> > > > > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device *dev, 
> > > > > void *data,
> > > > >
> > > > > submit->user_fence = 
> > > > > dma_fence_get(&submit->base.s_fence->finished);
> > > > >
> > > > > +   /* point of no return, we _have_ to submit no matter what */
> > > > > +   drm_sched_job_arm(&submit->base);
> > > > > +
> > > > > /*
> > > > >  * Allocate an id which can be used by WAIT_FENCE ioctl to 
> > > > > map back
> > > > >  * to the underlying fence.
> > > > > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device 
> > > > > *dev, void *data,
> > > > > if (submit->fence_id < 0) {
> > > > > ret = submit->fence_id = 0;
> > > > > submit->fence_id = 0;
> > > > > -   goto out;
> > > > > }
> > > > >
> > > > > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > struct sync_file *sync_file = 
> > > > > sync_file_create(submit->user_fence);
> > > > > if (!sync_file) {
> > > > > ret = -ENOMEM;
> > > > > -   goto out;
> > > > > +   } else {
> > > > > +   fd_install(out_fence_fd, sync_file->file);
> > > > > +   args->fence_fd = out_fence_fd;
> > > > > }
> > > > > -   fd_install(out_fence_fd, sync_file->file);
> > > > > -   args->fence_fd = out_fence_fd;
> > > >
> > > > I wonder if instead we should (approximately) undo "drm/msm/submit:
> > > > Simplify out-fence-fd handling" so that the point that it could fail
> > > > is moved up ahead of the drm_sched_job_arm()?
> > >
> > > Hm yeah. Up to you how you want to paint this shed, I think either is 
> > > fine.
> > >
> > > > Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
> > > > a quick look, it looks like it won't, but I'm still playing catchup
> > > > and haven't had a chance to look at your entire series.  If it doesn't
> > > > work before drm_sched_job_arm(), then there is really no way to
> > > > prevent a error path between the fence-init and job-submit.
> > >
> > > Yes. I thought I've checked that I put the _arm() in the right spot,
> > > but I guess I screwed up and you need the fence before the point where
> > > I've put the job_arm()? And yes the error path cannot be avoided for
> > > out-fences, that's what I tried to explain in the commit message.
> > >
> > > > But, prior to your series, wouldn't a failure after
> > > > drm_sched_job_init() but before the job is submitted just burn a
> > > > fence-id, and otherwise carry on it's merry way?
> > >
> > > Maybe? I'm not sure whether the scheduler gets confused about the gap
> > > and freak out abou that. I'm fairly new to that code and learning
> > > (which is part why I'm working on it). Since you look up in
> > > fences/syncobj after job_ini

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Daniel Vetter

On Fri, Aug 6, 2021 at 8:57 PM Rob Clark  wrote:
>
> On Fri, Aug 6, 2021 at 11:41 AM Daniel Vetter  wrote:
> >
> > On Fri, Aug 6, 2021 at 7:15 PM Rob Clark  wrote:
> > >
> > > On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter  
> > > wrote:
> > > >
> > > > On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
> > > > >
> > > > > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  
> > > > > wrote:
> > > > > >
> > > > > > Originally drm_sched_job_init was the point of no return, after 
> > > > > > which
> > > > > > drivers must submit a job. I've split that up, which allows us to 
> > > > > > fix
> > > > > > this issue pretty easily.
> > > > > >
> > > > > > Only thing we have to take care of is to not skip to error paths 
> > > > > > after
> > > > > > that. Other drivers do this the same for out-fence and similar 
> > > > > > things.
> > > > > >
> > > > > > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > > > > > Cc: Rob Clark 
> > > > > > Cc: Rob Clark 
> > > > > > Cc: Sean Paul 
> > > > > > Cc: Sumit Semwal 
> > > > > > Cc: "Christian König" 
> > > > > > Cc: linux-arm-...@vger.kernel.org
> > > > > > Cc: dri-de...@lists.freedesktop.org
> > > > > > Cc: freedr...@lists.freedesktop.org
> > > > > > Cc: linux-me...@vger.kernel.org
> > > > > > Cc: linaro-mm-...@lists.linaro.org
> > > > > > Signed-off-by: Daniel Vetter 
> > > > > > ---
> > > > > >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> > > > > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > > > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > > > > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > @@ -52,9 +52,6 @@ static struct msm_gem_submit 
> > > > > > *submit_create(struct drm_device *dev,
> > > > > > return ERR_PTR(ret);
> > > > > > }
> > > > > >
> > > > > > -   /* FIXME: this is way too early */
> > > > > > -   drm_sched_job_arm(&job->base);
> > > > > > -
> > > > > > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> > > > > >
> > > > > > kref_init(&submit->ref);
> > > > > > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device 
> > > > > > *dev, void *data,
> > > > > >
> > > > > > submit->user_fence = 
> > > > > > dma_fence_get(&submit->base.s_fence->finished);
> > > > > >
> > > > > > +   /* point of no return, we _have_ to submit no matter what */
> > > > > > +   drm_sched_job_arm(&submit->base);
> > > > > > +
> > > > > > /*
> > > > > >  * Allocate an id which can be used by WAIT_FENCE ioctl to 
> > > > > > map back
> > > > > >  * to the underlying fence.
> > > > > > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device 
> > > > > > *dev, void *data,
> > > > > > if (submit->fence_id < 0) {
> > > > > > ret = submit->fence_id = 0;
> > > > > > submit->fence_id = 0;
> > > > > > -   goto out;
> > > > > > }
> > > > > >
> > > > > > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > > struct sync_file *sync_file = 
> > > > > > sync_file_create(submit->user_fence);
> > > > > > if (!sync_file) {
> > > > > > ret = -ENOMEM;
> > > > > > -   goto out;
> > > > > > +   } else {
> > > > > > +   fd_install(out_fence_fd, sync_file->file);
> > > > > > +   args->fence_fd = out_fence_fd;
> > > > > > }
> > > > > > -   fd_install(out_fence_fd, sync_file->file);
> > > > > > -   args->fence_fd = out_fence_fd;
> > > > >
> > > > > I wonder if instead we should (approximately) undo "drm/msm/submit:
> > > > > Simplify out-fence-fd handling" so that the point that it could fail
> > > > > is moved up ahead of the drm_sched_job_arm()?
> > > >
> > > > Hm yeah. Up to you how you want to paint this shed, I think either is 
> > > > fine.
> > > >
> > > > > Also, does the dma_fence_get() work before drm_sched_job_arm()?  From
> > > > > a quick look, it looks like it won't, but I'm still playing catchup
> > > > > and haven't had a chance to look at your entire series.  If it doesn't
> > > > > work before drm_sched_job_arm(), then there is really no way to
> > > > > prevent a error path between the fence-init and job-submit.
> > > >
> > > > Yes. I thought I've checked that I put the _arm() in the right spot,
> > > > but I guess I screwed up and you need the fence before the point where
> > > > I've put the job_arm()? And yes the error path cannot be avoided for
> > > > out-fences, that's what I tried to explain in the commit message.
> > > >
> > > > > But, prior to your series, wouldn't a failure after
> > > > > drm_sched_job_init() but before the job is submitted jus

[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/sched dependency handling and implicit sync fixes (rev2)

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev2)
URL   : https://patchwork.freedesktop.org/series/93415/
State : failure

== Summary ==

Applying: drm/sched: Split drm_sched_job_init
Applying: drm/msm: Fix drm/sched point of no return rules
error: git diff header lacks filename information when removing 1 leading 
pathname component (line 2)
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0002 drm/msm: Fix drm/sched point of no return rules
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Only access SFC_DONE when media domain is not fused off

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Only access SFC_DONE when media domain is not fused off
URL   : https://patchwork.freedesktop.org/series/93468/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10457 -> Patchwork_20782


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/index.html

Known issues


  Here are the changes found in Patchwork_20782 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@kms_chamelium@hdmi-edid-read:
- fi-kbl-7500u:   [PASS][1] -> [FAIL][2] ([i915#3449])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-kbl-7500u/igt@kms_chamel...@hdmi-edid-read.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/fi-kbl-7500u/igt@kms_chamel...@hdmi-edid-read.html

  
 Possible fixes 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][3] ([i915#1888]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@hangcheck:
- fi-icl-u2:  [DMESG-FAIL][5] -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html

  
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#3449]: https://gitlab.freedesktop.org/drm/intel/issues/3449


Participating hosts (37 -> 34)
--

  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10457 -> Patchwork_20782

  CI-20190529: 20190529
  CI_DRM_10457: 7700f858b68060307b0a7d94377a5d8f64000e5d @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6162: 2f32b9e0da5f1ac9529318dd5b836c8cf4d3c441 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20782: 0e9d9b7275f5382258a63683599e60f68ccb1e31 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0e9d9b7275f5 drm/i915: Only access SFC_DONE when media domain is not fused off

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/index.html

Re: [Intel-gfx] [PATCH 1/1] drm/i915: Check if engine has heartbeat when closing a context

2021-08-06 Thread Daniel Vetter

Seen this fly by and figured I dropped a few thoughts in here. At the
likely cost of looking a bit out of whack :-)

On Fri, Aug 6, 2021 at 8:01 PM John Harrison  wrote:
> On 8/2/2021 02:40, Tvrtko Ursulin wrote:
> > On 30/07/2021 19:13, John Harrison wrote:
> >> On 7/30/2021 02:49, Tvrtko Ursulin wrote:
> >>> On 30/07/2021 01:13, John Harrison wrote:
>  On 7/28/2021 17:34, Matthew Brost wrote:
> > If an engine associated with a context does not have a heartbeat,
> > ban it
> > immediately. This is needed for GuC submission as a idle pulse
> > doesn't
> > kick the context off the hardware where it then can check for a
> > heartbeat and ban the context.
> >>>
> >>> Pulse, that is a request with I915_PRIORITY_BARRIER, does not
> >>> preempt a running normal priority context?
> >>>
> >>> Why does it matter then whether or not heartbeats are enabled - when
> >>> heartbeat just ends up sending the same engine pulse (eventually,
> >>> with raising priority)?
> >> The point is that the pulse is pointless. See the rest of my comments
> >> below, specifically "the context will get resubmitted to the hardware
> >> after the pulse completes". To re-iterate...
> >>
> >> Yes, it preempts the context. Yes, it does so whether heartbeats are
> >> enabled or not. But so what? Who cares? You have preempted a context.
> >> It is no longer running on the hardware. BUT IT IS STILL A VALID
> >> CONTEXT.
> >
> > It is valid yes, and it even may be the current ABI so another
> > question is whether it is okay to change that.
> >
> >> The backend scheduler will just resubmit it to the hardware as soon
> >> as the pulse completes. The only reason this works at all is because
> >> of the horrid hack in the execlist scheduler's back end
> >> implementation (in __execlists_schedule_in):
> >>  if (unlikely(intel_context_is_closed(ce) &&
> >>   !intel_engine_has_heartbeat(engine)))
> >>  intel_context_set_banned(ce);
> >
> > Right, is the above code then needed with this patch - when ban is
> > immediately applied on the higher level?
> >
> >> The actual back end scheduler is saying "Is this a zombie context? Is
> >> the heartbeat disabled? Then ban it". No other scheduler backend is
> >> going to have knowledge of zombie context status or of the heartbeat
> >> status. Nor are they going to call back into the higher levels of the
> >> i915 driver to trigger a ban operation. Certainly a hardware
> >> implemented scheduler is not going to be looking at private i915
> >> driver information to decide whether to submit a context or whether
> >> to tell the OS to kill it off instead.
> >>
> >> For persistence to work with a hardware scheduler (or a non-Intel
> >> specific scheduler such as the DRM one), the handling of zombie
> >> contexts, banning, etc. *must* be done entirely in the front end. It
> >> cannot rely on any backend hacks. That means you can't rely on any
> >> fancy behaviour of pulses.
> >>
> >> If you want to ban a context then you must explicitly ban that
> >> context. If you want to ban it at some later point then you need to
> >> track it at the top level as a zombie and then explicitly ban that
> >> zombie at whatever later point.
> >
> > I am still trying to understand it all. If I go by the commit message:
> >
> > """
> > This is needed for GuC submission as a idle pulse doesn't
> > kick the context off the hardware where it then can check for a
> > heartbeat and ban the context.
> > """
> >
> > That did not explain things for me. Sentence does not appear to make
> > sense. Now, it seems "kick off the hardware" is meant as revoke and
> > not just preempt. Which is fine, perhaps just needs to be written more
> > explicitly. But the part of checking for heartbeat after idle pulse
> > does not compute for me. It is the heartbeat which emits idle pulses,
> > not idle pulse emitting heartbeats.
> I am in agreement that the commit message is confusing and does not
> explain either the problem or the solution.
>
>
> >
> >
> > But anyway, I can buy the handling at the front end story completely.
> > It makes sense. We just need to agree that a) it is okay to change the
> > ABI and b) remove the backend check from execlists if it is not needed
> > any longer.
> >
> > And if ABI change is okay then commit message needs to talk about it
> > loudly and clearly.
> I don't think we have a choice. The current ABI is not and cannot ever
> be compatible with any scheduler external to i915. It cannot be
> implemented with a hardware scheduler such as the GuC and it cannot be
> implemented with an external software scheduler such as the DRM one.

So generally on linux we implement helper libraries, which means
massive flexibility everywhere.

https://blog.ffwll.ch/2016/12/midlayers-once-more-with-feeling.html

So it shouldn't be an insurmountable problem to make this happen even
with drm/scheduler, we can patch it up.

Whether that's justified is another question.

> M

Re: [Intel-gfx] [PATCH v5 02/20] drm/msm: Fix drm/sched point of no return rules

2021-08-06 Thread Rob Clark

On Fri, Aug 6, 2021 at 12:11 PM Daniel Vetter  wrote:
>
> On Fri, Aug 6, 2021 at 8:57 PM Rob Clark  wrote:
> >
> > On Fri, Aug 6, 2021 at 11:41 AM Daniel Vetter  
> > wrote:
> > >
> > > On Fri, Aug 6, 2021 at 7:15 PM Rob Clark  wrote:
> > > >
> > > > On Fri, Aug 6, 2021 at 9:42 AM Daniel Vetter  
> > > > wrote:
> > > > >
> > > > > On Fri, Aug 6, 2021 at 12:58 AM Rob Clark  wrote:
> > > > > >
> > > > > > On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter 
> > > > > >  wrote:
> > > > > > >
> > > > > > > Originally drm_sched_job_init was the point of no return, after 
> > > > > > > which
> > > > > > > drivers must submit a job. I've split that up, which allows us to 
> > > > > > > fix
> > > > > > > this issue pretty easily.
> > > > > > >
> > > > > > > Only thing we have to take care of is to not skip to error paths 
> > > > > > > after
> > > > > > > that. Other drivers do this the same for out-fence and similar 
> > > > > > > things.
> > > > > > >
> > > > > > > Fixes: 1d8a5ca436ee ("drm/msm: Conversion to drm scheduler")
> > > > > > > Cc: Rob Clark 
> > > > > > > Cc: Rob Clark 
> > > > > > > Cc: Sean Paul 
> > > > > > > Cc: Sumit Semwal 
> > > > > > > Cc: "Christian König" 
> > > > > > > Cc: linux-arm-...@vger.kernel.org
> > > > > > > Cc: dri-de...@lists.freedesktop.org
> > > > > > > Cc: freedr...@lists.freedesktop.org
> > > > > > > Cc: linux-me...@vger.kernel.org
> > > > > > > Cc: linaro-mm-...@lists.linaro.org
> > > > > > > Signed-off-by: Daniel Vetter 
> > > > > > > ---
> > > > > > >  drivers/gpu/drm/msm/msm_gem_submit.c | 15 +++
> > > > > > >  1 file changed, 7 insertions(+), 8 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> > > > > > > b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > > index 6d6c44f0e1f3..d0ed4ddc509e 100644
> > > > > > > --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > > +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> > > > > > > @@ -52,9 +52,6 @@ static struct msm_gem_submit 
> > > > > > > *submit_create(struct drm_device *dev,
> > > > > > > return ERR_PTR(ret);
> > > > > > > }
> > > > > > >
> > > > > > > -   /* FIXME: this is way too early */
> > > > > > > -   drm_sched_job_arm(&job->base);
> > > > > > > -
> > > > > > > xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> > > > > > >
> > > > > > > kref_init(&submit->ref);
> > > > > > > @@ -883,6 +880,9 @@ int msm_ioctl_gem_submit(struct drm_device 
> > > > > > > *dev, void *data,
> > > > > > >
> > > > > > > submit->user_fence = 
> > > > > > > dma_fence_get(&submit->base.s_fence->finished);
> > > > > > >
> > > > > > > +   /* point of no return, we _have_ to submit no matter what 
> > > > > > > */
> > > > > > > +   drm_sched_job_arm(&submit->base);
> > > > > > > +
> > > > > > > /*
> > > > > > >  * Allocate an id which can be used by WAIT_FENCE ioctl 
> > > > > > > to map back
> > > > > > >  * to the underlying fence.
> > > > > > > @@ -892,17 +892,16 @@ int msm_ioctl_gem_submit(struct drm_device 
> > > > > > > *dev, void *data,
> > > > > > > if (submit->fence_id < 0) {
> > > > > > > ret = submit->fence_id = 0;
> > > > > > > submit->fence_id = 0;
> > > > > > > -   goto out;
> > > > > > > }
> > > > > > >
> > > > > > > -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > > > +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> > > > > > > struct sync_file *sync_file = 
> > > > > > > sync_file_create(submit->user_fence);
> > > > > > > if (!sync_file) {
> > > > > > > ret = -ENOMEM;
> > > > > > > -   goto out;
> > > > > > > +   } else {
> > > > > > > +   fd_install(out_fence_fd, sync_file->file);
> > > > > > > +   args->fence_fd = out_fence_fd;
> > > > > > > }
> > > > > > > -   fd_install(out_fence_fd, sync_file->file);
> > > > > > > -   args->fence_fd = out_fence_fd;
> > > > > >
> > > > > > I wonder if instead we should (approximately) undo "drm/msm/submit:
> > > > > > Simplify out-fence-fd handling" so that the point that it could fail
> > > > > > is moved up ahead of the drm_sched_job_arm()?
> > > > >
> > > > > Hm yeah. Up to you how you want to paint this shed, I think either is 
> > > > > fine.
> > > > >
> > > > > > Also, does the dma_fence_get() work before drm_sched_job_arm()?  
> > > > > > From
> > > > > > a quick look, it looks like it won't, but I'm still playing catchup
> > > > > > and haven't had a chance to look at your entire series.  If it 
> > > > > > doesn't
> > > > > > work before drm_sched_job_arm(), then there is really no way to
> > > > > > prevent a error path between the fence-init and job-submit.
> > > > >
> > > > > Yes. I thought I've checked that I put the _arm() in the right spot,
> > > > > but I guess I screwed up and you need the

[Intel-gfx] ✗ Fi.CI.IGT: failure for remove rcu support from i915_address_space (rev5)

2021-08-06 Thread Patchwork

== Series Details ==

Series: remove rcu support from i915_address_space (rev5)
URL   : https://patchwork.freedesktop.org/series/93314/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10451_full -> Patchwork_20777_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20777_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20777_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20777_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@hangcheck:
- shard-snb:  [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-snb7/igt@i915_selftest@l...@hangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-snb2/igt@i915_selftest@l...@hangcheck.html

  
Known issues


  Here are the changes found in Patchwork_20777_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][3] ([i915#3002])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-apl7/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#1099]) +6 
similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-snb6/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_eio@unwedge-stress:
- shard-skl:  [PASS][5] -> [TIMEOUT][6] ([i915#2369] / [i915#3063])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-skl1/igt@gem_...@unwedge-stress.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-skl9/igt@gem_...@unwedge-stress.html
- shard-snb:  NOTRUN -> [FAIL][7] ([i915#3354])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-snb6/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][8] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-apl6/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-tglb: NOTRUN -> [FAIL][9] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-tglb1/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][10] -> [FAIL][11] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-iclb1/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-iclb5/igt@gem_exec_fair@basic-none-sh...@rcs0.html
- shard-tglb: [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-tglb5/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-tglb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842]) +1 similar 
issue
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-glk6/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-glk1/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_whisper@basic-contexts-all:
- shard-glk:  [PASS][16] -> [DMESG-WARN][17] ([i915#118] / 
[i915#95])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10451/shard-glk7/igt@gem_exec_whis...@basic-contexts-all.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-glk7/igt@gem_exec_whis...@basic-contexts-all.html

  * igt@gem_userptr_blits@unsync-overlap:
- shard-tglb: NOTRUN -> [SKIP][18] ([i915#3297])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-tglb1/igt@gem_userptr_bl...@unsync-overlap.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#3297])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-iclb6/igt@gem_userptr_bl...@unsync-unmap-cycles.html

  * igt@gem_userptr_blits@vma-merge:
- shard-apl:  NOTRUN -> [FAIL][20] ([i915#3318])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-apl1/igt@gem_userptr_bl...@vma-merge.html

  * igt@gen3_render_linear_blits:
- shard-tglb: NOTRUN -> [SKIP][21] ([fdo#109289]) +1 similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20777/shard-tglb1/

Re: [Intel-gfx] [PATCH] drm/i915: Fix syncmap memory leak

2021-08-06 Thread John Harrison


On 8/6/2021 11:29, Matthew Brost wrote:

On Fri, Aug 06, 2021 at 11:23:06AM -0700, John Harrison wrote:

On 7/30/2021 12:53, Matthew Brost wrote:

A small race exists between intel_gt_retire_requests_timeout and
intel_timeline_exit which could result in the syncmap not getting
free'd. Rather than work to hard to seal this race, simply cleanup the

free'd -> freed


Sure.


syncmap on fini.

unreferenced object 0x88813bc53b18 (size 96):
comm "gem_close_race", pid 5410, jiffies 4294917818 (age 1105.600s)
hex dump (first 32 bytes):
  01 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00  
  00 00 00 00 00 00 00 00 6b 6b 6b 6b 06 00 00 00  
backtrace:
  [<120b863a>] __sync_alloc_leaf+0x1e/0x40 [i915]
  [<042f6959>] __sync_set+0x1bb/0x240 [i915]
  [<90f0e90f>] i915_request_await_dma_fence+0x1c7/0x400 [i915]
  [<56a48219>] i915_request_await_object+0x222/0x360 [i915]
  [] i915_gem_do_execbuffer+0x1bd0/0x2250 [i915]
  [<3c9d830f>] i915_gem_execbuffer2_ioctl+0x405/0xce0 [i915]
  [] drm_ioctl_kernel+0xb0/0xf0 [drm]
  [] drm_ioctl+0x305/0x3c0 [drm]
  [<8b0d8986>] __x64_sys_ioctl+0x71/0xb0
  [<76c362a4>] do_syscall_64+0x33/0x80
  [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Matthew Brost 
Fixes: 531958f6f357 ("drm/i915/gt: Track timeline activeness in enter/exit")
Cc: 
---
   drivers/gpu/drm/i915/gt/intel_timeline.c | 9 +
   1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
b/drivers/gpu/drm/i915/gt/intel_timeline.c
index c4a126c8caef..1257f4f11e66 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -127,6 +127,15 @@ static void intel_timeline_fini(struct rcu_head *rcu)
i915_vma_put(timeline->hwsp_ggtt);
i915_active_fini(&timeline->active);
+
+   /*
+* A small race exists between intel_gt_retire_requests_timeout and
+* intel_timeline_exit which could result in the syncmap not getting
+* free'd. Rather than work to hard to seal this race, simply cleanup
+* the syncmap on fini.

What is the race? I'm going round in circles just trying to work out how
intel_gt_retire_requests_timeout is supposed to get to intel_timeline_exit
in the first place.


intel_gt_retire_requests_timeout increments tl->active_count, active_count == 2
intel_timeline_exit is called, returns on atomic_add_unless, active_count == 1
intel_gt_retire_requests_timeout decrements tl->active_count, active_count == 0
i915_syncmap_free is never called, memory leak

Matt

Okay. Think I follow it now.

Seems like the syncmap free should have been in timeline_fini instead of 
timeline_exit in the first place?


Reviewed-by: John Harrison 





Also, free'd -> freed.

John.



+*/
+   i915_syncmap_free(&timeline->sync);
+
kfree(timeline);
   }

[Intel-gfx] [PATCH] drm/i915: Release ctx->syncobj on final put, not on ctx close

2021-08-06 Thread Daniel Vetter

gem context refcounting is another exercise in least locking design it
seems, where most things get destroyed upon context closure (which can
race with anything really). Only the actual memory allocation and the
locks survive while holding a reference.

This tripped up Jason when reimplementing the single timeline feature
in

commit 00dae4d3d35d4f526929633b76e00b0ab4d3970d
Author: Jason Ekstrand 
Date:   Thu Jul 8 10:48:12 2021 -0500

drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)

We could fix the bug by holding ctx->mutex, but it's cleaner to just
make the context object actually invariant over its _entire_ lifetime.

Signed-off-by: Daniel Vetter 
Fixes: 00dae4d3d35d ("drm/i915: Implement SINGLE_TIMELINE with a syncobj (v4)")
Cc: Jason Ekstrand 
Cc: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Cc: Matthew Brost 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Lionel Landwerlin 
Cc: Dave Airlie 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 754b9b8d4981..93ba0197d70a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -940,6 +940,9 @@ void i915_gem_context_release(struct kref *ref)
trace_i915_context_free(ctx);
GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
+   if (ctx->syncobj)
+   drm_syncobj_put(ctx->syncobj);
+
mutex_destroy(&ctx->engines_mutex);
mutex_destroy(&ctx->lut_mutex);
 
@@ -1159,9 +1162,6 @@ static void context_close(struct i915_gem_context *ctx)
if (vm)
i915_vm_close(vm);
 
-   if (ctx->syncobj)
-   drm_syncobj_put(ctx->syncobj);
-
ctx->file_priv = ERR_PTR(-EBADF);
 
/*
-- 
2.32.0

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Release ctx->syncobj on final put, not on ctx close

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Release ctx->syncobj on final put, not on ctx close
URL   : https://patchwork.freedesktop.org/series/93470/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
d1dd9f17231e drm/i915: Release ctx->syncobj on final put, not on ctx close
-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 00dae4d3d35d ("drm/i915: 
Implement SINGLE_TIMELINE with a syncobj (v4)")'
#17: 
commit 00dae4d3d35d4f526929633b76e00b0ab4d3970d

-:62: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 18 lines checked

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Release ctx->syncobj on final put, not on ctx close

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Release ctx->syncobj on final put, not on ctx close
URL   : https://patchwork.freedesktop.org/series/93470/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10457 -> Patchwork_20784


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/index.html

Known issues


  Here are the changes found in Patchwork_20784 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@gt_heartbeat:
- fi-kbl-soraka:  [PASS][1] -> [DMESG-FAIL][2] ([i915#2291] / 
[i915#541])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/fi-kbl-soraka/igt@i915_selftest@live@gt_heartbeat.html

  
 Possible fixes 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][3] ([i915#1888]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@hangcheck:
- fi-icl-u2:  [DMESG-FAIL][5] -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/fi-icl-u2/igt@i915_selftest@l...@hangcheck.html

  
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Participating hosts (37 -> 34)
--

  Missing(3): fi-bdw-samus fi-bsw-cyan bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10457 -> Patchwork_20784

  CI-20190529: 20190529
  CI_DRM_10457: 7700f858b68060307b0a7d94377a5d8f64000e5d @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6162: 2f32b9e0da5f1ac9529318dd5b836c8cf4d3c441 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20784: d1dd9f17231e132d8ea8aeaeb7fffdf45380f3ee @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d1dd9f17231e drm/i915: Release ctx->syncobj on final put, not on ctx close

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/index.html

[Intel-gfx] ✓ Fi.CI.IGT: success for Begin enabling Xe_HP SDV and DG2 platforms (rev10)

2021-08-06 Thread Patchwork

== Series Details ==

Series: Begin enabling Xe_HP SDV and DG2 platforms (rev10)
URL   : https://patchwork.freedesktop.org/series/92135/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10457_full -> Patchwork_20781_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20781_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@gem_userptr_blits@map-fixed-invalidate-overlap@fixed}:
- {shard-rkl}:NOTRUN -> [SKIP][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-rkl-5/igt@gem_userptr_blits@map-fixed-invalidate-over...@fixed.html

  
Known issues


  Here are the changes found in Patchwork_20781_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-snb5/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
- shard-kbl:  [PASS][3] -> [DMESG-WARN][4] ([i915#180]) +2 similar 
issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl1/igt@gem_ctx_isolation@preservation...@vcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-kbl3/igt@gem_ctx_isolation@preservation...@vcs0.html
- shard-skl:  [PASS][5] -> [INCOMPLETE][6] ([i915#198])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-skl1/igt@gem_ctx_isolation@preservation...@vcs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-skl4/igt@gem_ctx_isolation@preservation...@vcs0.html

  * igt@gem_ctx_persistence@legacy-engines-mixed:
- shard-snb:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-mixed.html

  * igt@gem_eio@unwedge-stress:
- shard-snb:  NOTRUN -> [FAIL][8] ([i915#3354])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-snb5/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_endless@dispatch@rcs0:
- shard-tglb: [PASS][9] -> [INCOMPLETE][10] ([i915#3778])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-tglb3/igt@gem_exec_endless@dispa...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-tglb3/igt@gem_exec_endless@dispa...@rcs0.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-glk:  [PASS][11] -> [FAIL][12] ([i915#2842]) +1 similar 
issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-glk5/igt@gem_exec_fair@basic-none-r...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-glk6/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-iclb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-iclb1/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
- shard-kbl:  NOTRUN -> [FAIL][15] ([i915#2842]) +3 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-kbl2/igt@gem_exec_fair@basic-n...@vcs0.html

  * igt@gem_exec_fair@basic-pace@rcs0:
- shard-tglb: [PASS][16] -> [FAIL][17] ([i915#2842])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-tglb1/igt@gem_exec_fair@basic-p...@rcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-tglb7/igt@gem_exec_fair@basic-p...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-kbl:  [PASS][18] -> [SKIP][19] ([fdo#109271])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl7/igt@gem_exec_fair@basic-p...@vcs1.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-kbl3/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][20] -> [FAIL][21] ([i915#2842]) +1 similar 
issue
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl7/igt@gem_exec_fair@basic-p...@vecs0.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20781/shard-kbl3/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_whisper@basic-queues:
- shard-glk:  [PASS][22] -> [DMESG-WARN][23] ([i915#118] / 
[i915#95])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-glk1/igt@gem_exec_whis...@basic-queu

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Only access SFC_DONE when media domain is not fused off

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Only access SFC_DONE when media domain is not fused off
URL   : https://patchwork.freedesktop.org/series/93468/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10457_full -> Patchwork_20782_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20782_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@legacy-engines-mixed:
- shard-snb:  NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-mixed.html

  * igt@gem_ctx_sseu@invalid-args:
- shard-apl:  NOTRUN -> [SKIP][2] ([fdo#109271]) +107 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-apl6/igt@gem_ctx_s...@invalid-args.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][3] -> [TIMEOUT][4] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-tglb2/igt@gem_...@unwedge-stress.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-tglb5/igt@gem_...@unwedge-stress.html
- shard-snb:  NOTRUN -> [FAIL][5] ([i915#3354])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-snb7/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-glk:  [PASS][6] -> [FAIL][7] ([i915#2842])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-glk5/igt@gem_exec_fair@basic-none-r...@rcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-glk3/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-iclb: [PASS][8] -> [FAIL][9] ([i915#2842])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-iclb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-iclb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-iclb2/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace@rcs0:
- shard-tglb: [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-tglb1/igt@gem_exec_fair@basic-p...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-tglb3/igt@gem_exec_fair@basic-p...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842]) +2 similar 
issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl7/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_mmap_gtt@big-copy:
- shard-glk:  [PASS][15] -> [FAIL][16] ([i915#1888] / [i915#307])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-glk3/igt@gem_mmap_...@big-copy.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-glk1/igt@gem_mmap_...@big-copy.html

  * igt@gem_render_copy@x-tiled-to-vebox-yf-tiled:
- shard-kbl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +72 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-kbl1/igt@gem_render_c...@x-tiled-to-vebox-yf-tiled.html

  * igt@gem_softpin@noreloc-s3:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#180])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-apl3/igt@gem_soft...@noreloc-s3.html

  * igt@gen9_exec_parse@allowed-single:
- shard-apl:  [PASS][19] -> [DMESG-WARN][20] ([i915#1436] / 
[i915#716])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-apl1/igt@gen9_exec_pa...@allowed-single.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-apl2/igt@gen9_exec_pa...@allowed-single.html

  * igt@i915_pm_rpm@basic-rte:
- shard-apl:  NOTRUN -> [FAIL][21] ([i915#579])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-apl1/igt@i915_pm_...@basic-rte.html

  * igt@i915_suspend@fence-restore-untiled:
- shard-apl:  [PASS][22] -> [DMESG-WARN][23] ([i915#180]) +1 
similar issue
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-apl7/igt@i915_susp...@fence-restore-untiled.html
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20782/shard-apl8/igt@i915_susp...@fence-restore-untiled.html

  * igt@i915_suspend@forcewake:
- shard-kbl:  [PASS][24] -> [DMESG-WARN][25] ([i915#180]) +4 
similar issue

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Release ctx->syncobj on final put, not on ctx close

2021-08-06 Thread Patchwork

== Series Details ==

Series: drm/i915: Release ctx->syncobj on final put, not on ctx close
URL   : https://patchwork.freedesktop.org/series/93470/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10457_full -> Patchwork_20784_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20784_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20784_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20784_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_ctx_shared@exec-single-timeline@vecs0:
- shard-apl:  NOTRUN -> [DMESG-WARN][1]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-apl2/igt@gem_ctx_shared@exec-single-timel...@vecs0.html

  * igt@gem_ctx_shared@q-in-order@bcs0:
- shard-apl:  [PASS][2] -> [DMESG-WARN][3] +5 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-apl1/igt@gem_ctx_shared@q-in-or...@bcs0.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-apl8/igt@gem_ctx_shared@q-in-or...@bcs0.html

  * igt@gem_ctx_shared@q-out-order@bcs0:
- shard-iclb: [PASS][4] -> [DMESG-WARN][5] +9 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-iclb8/igt@gem_ctx_shared@q-out-or...@bcs0.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-iclb6/igt@gem_ctx_shared@q-out-or...@bcs0.html

  * igt@gem_ctx_shared@q-smoketest-all:
- shard-glk:  [PASS][6] -> [DMESG-WARN][7] +7 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-glk9/igt@gem_ctx_sha...@q-smoketest-all.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-glk1/igt@gem_ctx_sha...@q-smoketest-all.html

  * igt@gem_ctx_shared@single-timeline:
- shard-skl:  [PASS][8] -> [DMESG-WARN][9] +12 similar issues
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-skl8/igt@gem_ctx_sha...@single-timeline.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-skl4/igt@gem_ctx_sha...@single-timeline.html

  * igt@gem_exec_whisper@basic-queues-all:
- shard-tglb: [PASS][10] -> [DMESG-WARN][11] +9 similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-tglb8/igt@gem_exec_whis...@basic-queues-all.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-tglb8/igt@gem_exec_whis...@basic-queues-all.html
- shard-kbl:  NOTRUN -> [DMESG-WARN][12] +1 similar issue
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-kbl1/igt@gem_exec_whis...@basic-queues-all.html

  * igt@gem_exec_whisper@basic-queues-priority-all:
- shard-kbl:  [PASS][13] -> [DMESG-WARN][14] +6 similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl3/igt@gem_exec_whis...@basic-queues-priority-all.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-kbl4/igt@gem_exec_whis...@basic-queues-priority-all.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_ctx_shared@q-independent@bcs0:
- {shard-rkl}:[PASS][15] -> [DMESG-WARN][16] +7 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-rkl-2/igt@gem_ctx_shared@q-independ...@bcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-rkl-2/igt@gem_ctx_shared@q-independ...@bcs0.html

  * igt@gem_ctx_shared@q-smoketest@rcs0:
- {shard-rkl}:NOTRUN -> [DMESG-WARN][17] +1 similar issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-rkl-2/igt@gem_ctx_shared@q-smoket...@rcs0.html

  
Known issues


  Here are the changes found in Patchwork_20784_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
- shard-kbl:  [PASS][18] -> [DMESG-WARN][19] ([i915#180]) +2 
similar issues
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10457/shard-kbl1/igt@gem_ctx_isolation@preservation...@bcs0.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-kbl6/igt@gem_ctx_isolation@preservation...@bcs0.html

  * igt@gem_ctx_persistence@smoketest:
- shard-snb:  NOTRUN -> [SKIP][20] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20784/shard-snb6/igt@gem_ctx_persiste...@smoketest.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PA

Re: [Intel-gfx] [PATCH] drm/i915: Disable bonding on gen12+ platforms

2021-08-06 Thread Matt Roper

On Fri, Aug 06, 2021 at 08:54:59PM +0200, Daniel Vetter wrote:
> On Fri, Aug 6, 2021 at 8:25 PM John Harrison  
> wrote:
> > On 7/28/2021 12:21, Matthew Brost wrote:
> > > Disable bonding on gen12+ platforms aside from ones already supported by
> > > the i915 - TGL, RKL, and ADL-S.
> > >
> > > Signed-off-by: Matthew Brost 
> > > ---
> > >   drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++
> > >   1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > index 05c3ee191710..9c3672bac0e2 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > @@ -446,6 +446,13 @@ set_proto_ctx_engines_bond(struct 
> > > i915_user_extension __user *base, void *data)
> > >   u16 idx, num_bonds;
> > >   int err, n;
> > >
> > > + if (GRAPHICS_VER(i915) >= 12 && !IS_TIGERLAKE(i915) &&
> > > + !IS_ROCKETLAKE(i915) && !IS_ALDERLAKE_S(i915)) {
> > > + drm_dbg(&i915->drm,
> > > + "Bonding on gen12+ aside from TGL, RKL, and ADL_S 
> > > not allowed\n");
> > I would have said not supported rather than not allowed. Either way:
> > Reviewed-by: John Harrison 
> 
> Either is fine with me.
> 
> Acked-by: Daniel Vetter 
> 

Applied to drm-intel-gt-next (with the suggested debug message wording
tweak).  Thanks for the patch and reviews.


Matt

> >
> > > + return -ENODEV;
> > > + }
> > > +
> > >   if (get_user(idx, &ext->virtual_index))
> > >   return -EFAULT;
> > >
> >
> 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

42 matches

Mail list logo