Users/tests relying on the total reset count will start seeing a smaller
number since most of the hangs can be handled by engine reset.
Note that if reset engine x, context a running on engine y will be unaware
and unaffected.
To start the discussion, include just a total engine reset count. If it
intel_guc_reset sounds more like the microcontroller is the one performing
a reset, while in this case is the opposite. intel_reset_guc not only
makes it clearer, it follows the other intel_reset functions available.
Cc: Tvrtko Ursulin
Signed-off-by: Michel Thierry
---
drivers/gpu/drm/i915/i915
*** General ***
Watchdog timeout (or "media engine reset") is a feature that allows
userland applications to enable hang detection on individual batch buffers.
The detection mechanism itself is mostly bound to the hardware and the only
thing that the driver needs to do to support this form of hang
From: Arun Siluvery
This change implements support for per-engine reset as an initial, less
intrusive hang recovery option to be attempted before falling back to the
legacy full GPU reset recovery mode if necessary. This is only supported
from Gen8 onwards.
Hangchecker determines which engines a
This patch adds per engine reset and recovery (TDR) support when GuC is
used to submit workloads to GPU.
In the case of i915 directly submission to ELSP, driver manages hang
detection, recovery and resubmission. With GuC submission these tasks
are shared between driver and GuC. i915 is still respo
Save the watchdog threshold (in us) as part of the engine state.
Signed-off-by: Michel Thierry
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/i915_gpu_error.c | 11 +++
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h
From: Arun Siluvery
Driver maintains count of how many times a given engine is reset, useful to
capture this in error state also. It gives an idea of how engine is coping
up with the workloads it is executing before this error state.
A follow-up patch will provide this information in debugfs.
v
From firmware v8.8, GuC provides the count of media engine resets
(watchdog timeout). This information is available in the GuC shared
context data struct, which resides in the first page of the default
(kernel) lrc context.
Since GuC handled engine resets are transparent for kernel and user,
provi
Check that we can reset specific engines, also check the fallback to
full reset if something didn't work.
v2: rebase.
Signed-off-by: Michel Thierry
---
drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 147 +++
1 file changed, 147 insertions(+)
diff --git a/drivers/gpu/drm
From: Daniele Ceraolo Spurio
The mmio_start offset for the whitelist is the first FORCE_TO_NONPRIV
register the GuC can use to restore the provided whitelist when an
engine reset via GuC (which we still don't support) is triggered.
We're currently adding the mmio_base of the engine to the absolu
From: Arun Siluvery
In preparation for engine reset work update this parameter to handle more
than one type of reset. Default at the moment is still full gpu reset.
Cc: Chris Wilson
Cc: Mika Kuoppala
Signed-off-by: Arun Siluvery
Signed-off-by: Michel Thierry
---
drivers/gpu/drm/i915/i915_pa
Final enablement patch for GPU hang detection using watchdog timeout.
Using the gem_context_setparam ioctl, users can specify the desired
timeout value in microseconds, and the driver will do the conversion to
'timestamps'.
The recommended default watchdog threshold for video engines is 6 us,
From: Arun Siluvery
This feature is made available only from Gen8, for previous gen devices
driver uses legacy full gpu reset.
Cc: Chris Wilson
Cc: Mika Kuoppala
Signed-off-by: Tomas Elf
Signed-off-by: Arun Siluvery
Signed-off-by: Michel Thierry
---
drivers/gpu/drm/i915/i915_params.c | 4 +
Emit the required commands into the ring buffer for starting and
stopping the watchdog timer before/after batch buffer start during
batch buffer submission.
v2: Support watchdog threshold per context engine, merge lri commands,
and move watchdog commands emission to emit_bb_start. Request space of
Before reseting an engine, check if there is an active request, and if
the _hung_ request has completed. In these two cases, the seqno has moved
after hang declaration and we can skip the reset.
Also store the active request so that we only search for it once.
Suggested-by: Chris Wilson
Signed-o
For watchdog / media reset, the firmware must know the address of the shared
data page (the first page of the default context).
This information should be in DWORD 9 of the GUC_CTL structure.
v2: Use guc_ggtt_offset (Chris).
Store the ggtt offset of the default ctx as we needed for
suspend/resume
From: Robert Bragg
Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all
share (more-or-less) the same OA unit design.
Of particular note in comparison to Haswell: some OA unit HW config
state has become per-context state and as a consequence it is somewhat
more complicated to ma
On 27/04/17 05:20 PM, Jason Gunthorpe wrote:
> It seems the most robust: test for iomem, and jump to a slow path
> copy, otherwise inline the kmap and memcpy
>
> Every place doing memcpy from sgl will need that pattern to be
> correct.
Ok, sounds like a good place to start to me. I'll see what
== Series Details ==
Series: Gen8+ engine-reset (rev4)
URL : https://patchwork.freedesktop.org/series/21868/
State : success
== Summary ==
Series 21868v4 Gen8+ engine-reset
https://patchwork.freedesktop.org/api/1.0/series/21868/revisions/4/mbox/
fi-bdw-5557u total:278 pass:267 dwarn:0
On Thu, Apr 27, 2017 at 04:12:43PM -0700, Michel Thierry wrote:
> From: Arun Siluvery
>
> This change implements support for per-engine reset as an initial, less
> intrusive hang recovery option to be attempted before falling back to the
> legacy full GPU reset recovery mode if necessary. This is
On Thu, Apr 27, 2017 at 04:12:52PM -0700, Michel Thierry wrote:
> +#define WA_REG_WR_GUC_RESTORE(addr, val) do { \
> + const int r = guc_wa_add(dev_priv, (addr), (val)); \
> + if (r) \
> + return r; \
> + } while (0)
Try to avoid burying returns insi
On Tue, 2017-04-25 at 09:51 +0200, Maarten Lankhorst wrote:
> On 21-04-17 07:51, Dhinakaran Pandiyan wrote:
> > From: "Pandiyan, Dhinakaran"
> >
> > Use the added helpers to track MST link bandwidth for atomic modesets.
> > Link bw is acquired in the ->atomic_check() phase when CRTCs are being
> >
-Original Message-
From: Ville Syrjälä [mailto:ville.syrj...@linux.intel.com]
Sent: Thursday, April 27, 2017 10:39 PM
To: Lee, Shawn C
Cc: intel-gfx@lists.freedesktop.org; Chiou, Cooper ;
Bride, Jim ; Nikula, Jani ; Vivi,
Rodrigo ; Lin, Ryan
Subject: Re: [Intel-gfx] [PATCH] drm/i915/
101 - 123 of 123 matches
Mail list logo