[Intel-gfx] [PATCH 18/39] drm/i915: Added scheduler support to __wait_request() calls

2015-11-23 Thread John . C . Harrison
From: John Harrison The scheduler can cause batch buffers, and hence requests, to be submitted to the ring out of order and asynchronously to their submission to the driver. Thus at the point of waiting for the completion of a given request, it is not even guaranteed that the request has actually

[Intel-gfx] [PATCH 35/39] drm/i915: GPU priority bumping to prevent starvation

2015-11-23 Thread John . C . Harrison
From: John Harrison If a high priority task was to continuously submit batch buffers to the driver, it could starve out any lower priority task from getting any GPU time at all. To prevent this, the priority of a queued batch buffer is bumped each time it does not get submitted to the hardware.
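The bumping scheme described in this snippet can be sketched in a few lines of C. This is only an illustration of the idea, not the patch's code: `struct sketch_node`, `sketch_pick_and_bump()`, and the two `SKETCH_PRIORITY_*` constants are hypothetical names and values chosen for the example. It assumes signed priorities where a higher value is scheduled first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue entry; not the driver's real scheduler node. */
struct sketch_node {
	int priority;   /* signed: a higher value is scheduled first */
	int submitted;  /* non-zero once handed to the hardware */
};

#define SKETCH_PRIORITY_MAX  1023  /* assumed cap, for illustration only */
#define SKETCH_PRIORITY_BUMP 50    /* assumed increment, for illustration only */

/*
 * Pick the highest-priority queued node, mark it as submitted, then bump
 * the priority of every node that was passed over.
 * Returns the index of the chosen node, or -1 if nothing is queued.
 */
static int sketch_pick_and_bump(struct sketch_node *nodes, size_t count)
{
	int best = -1;
	size_t i;

	assert(nodes != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (nodes[i].submitted)
			continue;
		if (best < 0 || nodes[i].priority > nodes[best].priority)
			best = (int)i;
	}
	if (best < 0)
		return -1;

	nodes[best].submitted = 1;

	/* Everything still queued lost this round, so age it upwards. */
	for (i = 0; i < count; i++) {
		if (nodes[i].submitted)
			continue;
		nodes[i].priority += SKETCH_PRIORITY_BUMP;
		if (nodes[i].priority > SKETCH_PRIORITY_MAX)
			nodes[i].priority = SKETCH_PRIORITY_MAX;
	}
	return best;
}
```

With this kind of ageing, a low-priority entry that keeps losing eventually overtakes freshly queued high-priority work, which is the starvation guarantee the snippet describes.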

[Intel-gfx] [PATCH 17/39] drm/i915: Hook scheduler node clean up into retire requests

2015-11-23 Thread John . C . Harrison
From: John Harrison The scheduler keeps its own lock on various DRM objects in order to guarantee safe access long after the original execbuff IOCTL has completed. This is especially important when pre-emption is enabled as the batch buffer might need to be submitted to the hardware multiple time

[Intel-gfx] [PATCH 25/39] drm/i915: Add sync wait support to scheduler

2015-11-23 Thread John . C . Harrison
From: John Harrison There is a sync framework to allow work for multiple independent systems to be synchronised with each other but without stalling the CPU whether in the application or the driver. This patch adds support for this framework to the GPU scheduler. Batch buffers can now have sync

[Intel-gfx] [PATCH 36/39] drm/i915: Scheduler state dump via debugfs

2015-11-23 Thread John . C . Harrison
From: John Harrison Added a facility for triggering the scheduler state dump via a debugfs entry. v2: New patch in series. For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_debugfs.c | 33 + drivers/gpu/drm/i915/i915_scheduler.c | 9 ++

[Intel-gfx] [PATCH 22/39] drm/i915: Support for 'unflushed' ring idle

2015-11-23 Thread John . C . Harrison
From: John Harrison When the seqno wraps around zero, the entire GPU is forced to be idle for some reason (possibly only to work around issues with hardware semaphores but no-one seems too sure!). This causes a problem if the force idle occurs at an inopportune moment such as in the middle of sub

[Intel-gfx] [PATCH 39/39] drm/i915: Allow scheduler to manage inter-ring object synchronisation

2015-11-23 Thread John . C . Harrison
From: John Harrison The scheduler has always tracked batch buffer dependencies based on DRM object usage. This means that it will not submit a batch on one ring that has outstanding dependencies still executing on other rings. This is exactly the same synchronisation performed by i915_gem_object_

[Intel-gfx] [PATCH 19/39] drm/i915: Added scheduler support to page fault handler

2015-11-23 Thread John . C . Harrison
From: John Harrison GPU page faults can now require scheduler operation in order to complete. For example, in order to free up sufficient memory to handle the fault the handler must wait for a batch buffer to complete that has not even been sent to the hardware yet. Thus EAGAIN no longer means a

[Intel-gfx] [PATCH 29/39] drm/i915: Added debugfs interface to scheduler tuning parameters

2015-11-23 Thread John . C . Harrison
From: John Harrison There are various parameters within the scheduler which can be tuned to improve performance, reduce memory footprint, etc. This change adds support for altering these via debugfs. v2: Updated for priorities now being signed values. Change-Id: I6c26765269ae7173ff4d3a5c20921ea

[Intel-gfx] [PATCH 28/39] drm/i915: Added scheduler queue throttling by DRM file handle

2015-11-23 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now

[Intel-gfx] [RFC 03/37] drm/i915: hangcheck=idle should wake_up_all every time, not just once

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_irq.c | 23 --- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index d280e05..eb55a41 100644 --- a/driv

[Intel-gfx] [RFC 08/37] drm/i915/error: report size in pages for each object dumped

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_gpu_error.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index 52def4e..bafaadd

[Intel-gfx] [RFC 18/37] drm/i915/guc: Fill in (part of?) the ADS whitelist

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_guc_submission.c | 11 +++ drivers/gpu/drm/i915/intel_guc_fwif.h | 6 ++ 2 files changed, 17 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/

[Intel-gfx] [RFC 07/37] drm/i915/error: improve CSB reporting

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 4 +- drivers/gpu/drm/i915/i915_gpu_error.c | 88 --- 2 files changed, 64 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/driver

[Intel-gfx] [RFC 00/37] Preemption support for GPU scheduler

2015-11-23 Thread John . C . Harrison
From: John Harrison Added pre-emption support to the i915 GPU scheduler. Note that this patch series was written by David Gordon. I have simply ported it onto a more recent set of scheduler patches and am uploading it as part of that work so that everything can be viewed at once. Also because Da

[Intel-gfx] [RFC 01/37] drm/i915: update ring space correctly

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/intel_lrc.c| 2 +- drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index

[Intel-gfx] [RFC 26/37] drm/i915/preempt: preemption-related definitions and statistics

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_debugfs.c | 18 drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_scheduler.h | 40 +-- 3 files changed, 57 insertions(+), 2 deletion

[Intel-gfx] [RFC 31/37] drm/i915/preempt: scheduler logic for landing preemptive requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This patch adds the GEM & scheduler logic for detection and first-stage processing of completed preemption requests. Similar to regular batches, they deposit their sequence number in the hardware status page when starting and again when finished, but using different locations so

[Intel-gfx] [RFC 17/37] drm/i915/guc: Add support for GuC ADS (Addition Data Structure)

2015-11-23 Thread John . C . Harrison
From: Dave Gordon The GuC firmware uses this for various purposes; it seems to be required for preemption to work. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_guc_reg.h| 1 + drivers/gpu/drm/i915/i915_guc_submission.c | 57 ++- drivers/gp

[Intel-gfx] [RFC 06/37] drm/i915/error: report ctx id & desc for each request in the queue

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Also decode and output CSB entries, in time order For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 37 +++ 2 files changed, 30 insertions(+), 8 deletions(-) diff

[Intel-gfx] [RFC 11/37] drm/i915/guc: Add a second client, to be used for preemption

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This second client is created with priority KMD_HIGH, and marked as preemptive. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_debugfs.c| 9 + drivers/gpu/drm/i915/i915_guc_submission.c | 15 ++- drivers/gpu/drm/i915/intel_

[Intel-gfx] [RFC 04/37] drm/i915/error: capture execlist state on error

2015-11-23 Thread John . C . Harrison
From: Dave Gordon At present, execlist status/ctx_id and CSBs, not the submission queue For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 9 + drivers/gpu/drm/i915/i915_gpu_error.c | 38 +-- 2 files changed, 45 inserti

[Intel-gfx] [RFC 13/37] drm/i915/guc: Improve action error reporting, add preemption debug

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_debugfs.c| 20 +-- drivers/gpu/drm/i915/i915_guc_submission.c | 32 -- drivers/gpu/drm/i915/intel_guc.h | 12 +-- 3 files changed, 45 in

[Intel-gfx] [RFC 28/37] drm/i915/preempt: scheduler logic for selecting preemptive requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This patch adds the scheduler logic for managing potentially preemptive requests, including validating dependencies and working out when a request can be downgraded to non-preemptive (e.g. when there's nothing ahead for it to preempt). Actually-preemptive requests are still dis

[Intel-gfx] [RFC 12/37] drm/i915/guc: implement submission via REQUEST_PREEMPTION action

2015-11-23 Thread John . C . Harrison
From: Dave Gordon If a batch is submitted via the preemptive (KMD_HIGH-priority) client then instead of ringing the doorbell we dispatch it using the GuC "REQUEST_PREEMPTION" action. Also, we specify "clear work queue" and "clear submit queue" in that request, so the scheduler can reconsider what

[Intel-gfx] [RFC 22/37] drm/i915: track relative-constants-mode per-context not per-device

2015-11-23 Thread John . C . Harrison
From: Dave Gordon 'relative_constants_mode' has always been tracked per-device, but this is wrong in execlists (or GuC) mode, as INSTPM is saved and restored with the logical context, and the per-context value could therefore get out of sync with the tracked value. This patch moves the tracking e

[Intel-gfx] [RFC 16/37] drm/i915/guc: Expose (intel)_lr_context_size()

2015-11-23 Thread John . C . Harrison
From: Dave Gordon The GuC code needs to know the size of a logical context, so we expose get_lr_context_size(), renaming it intel_lr_context__size() to fit the naming conventions for nonstatic functions. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/intel_lrc.c | 4 ++-- dr

[Intel-gfx] [RFC 29/37] drm/i915/preempt: scheduler logic for preventing recursive preemption

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Once a preemptive request has been dispatched to the hardware-layer submission mechanism, the scheduler must not send any further requests to the same ring until the preemption completes. Here we add the logic that ensures that only one preemption per ring can be in progress at o

[Intel-gfx] [RFC 19/37] drm/i915/error: capture errored context based on request context-id

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Context capture hasn't worked for a while now, probably since the introduction of execlists; this patch makes it work again by using a different way of identifying the context of interest. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_gpu_error.c | 7

[Intel-gfx] [RFC 37/37] drm/i915: Added preemption info to various trace points

2015-11-23 Thread John . C . Harrison
From: John Harrison For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_gem.c | 8 +--- drivers/gpu/drm/i915/i915_scheduler.c | 2 +- drivers/gpu/drm/i915/i915_trace.h | 31 +++ 3 files changed, 25 insertions(+), 16 deletions(-) di

[Intel-gfx] [RFC 34/37] drm/i915/preempt: scheduler logic for postprocessing preemptive requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This patch adds the scheduler logic for postprocessing of completed preemption requests. It cleans out both the fence_signal list (dropping references as it goes) and the primary request_list. Requests that didn't complete are put into the 'preempted' state for resubmission by t

[Intel-gfx] [RFC 09/37] drm/i915/error: track, capture & print ringbuffer submission activity

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_gem.c | 11 +-- drivers/gpu/drm/i915/i915_gpu_error.c | 9 + drivers/gpu/drm/i915/intel_ringbuffer.h | 14 ++ 4 files

[Intel-gfx] [RFC 10/37] drm/i915/guc: Tidy up GuC proc/ctx descriptor setup

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_debugfs.c| 2 +- drivers/gpu/drm/i915/i915_guc_submission.c | 49 -- drivers/gpu/drm/i915/intel_guc.h | 16 +- 3 files changed, 28 insertions(+), 39 de

[Intel-gfx] [RFC 02/37] drm/i915: recalculate ring space after reset

2015-11-23 Thread John . C . Harrison
From: Dave Gordon To reinitialise a ringbuffer after a hang (or preemption), we need not only to set both h/w and s/w HEAD and TAIL to 0, but also to clear last_retired_head and recalculate the available space. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/intel_lr

[Intel-gfx] [RFC 14/37] drm/i915/guc: Expose GuC-maintained statistics

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_guc_submission.c | 19 ++- drivers/gpu/drm/i915/intel_guc_fwif.h | 7 ++- 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c

[Intel-gfx] [RFC 05/37] drm/i915/error: capture ringbuffer pointed to by START

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 2 +- drivers/gpu/drm/i915/i915_gpu_error.c | 36 +-- 2 files changed, 27 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/driver

[Intel-gfx] [RFC 21/37] drm/i915/error: add GuC state error capture & decode

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 4 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 110 ++ 2 files changed, 114 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/

[Intel-gfx] [RFC 33/37] drm/i915/preempt: Refactor intel_lr_context_reset()

2015-11-23 Thread John . C . Harrison
From: Dave Gordon After preemption, we need to empty out the ringbuffers associated with preempted requests, so that the scheduler has a clean ring into which to (re-)insert requests (not necessarily in the same order as before they were preempted). So this patch refactors the existing routine i

[Intel-gfx] [RFC 35/37] drm/i915/preempt: update (LRC) ringbuffer-filling code to create preemptive requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This patch refactors the ringbuffer-level code (in execlists/GuC mode only) and enhances it so that it can emit the proper sequence of opcodes for preemption requests. A preemption request is similar to a batch submission, but doesn't actually invoke a batchbuffer, the purpose b

[Intel-gfx] [RFC 24/37] drm/i915/sched: set request 'head' on at start of ring submission

2015-11-23 Thread John . C . Harrison
From: Dave Gordon With the scheduler, request allocation can happen long before the ring is filled in, and in a different order. So for that case, we update the request head at the start of _final (the initialisation on allocation is still useful for the direct-submission mode). For: VIZ-2021 Si

[Intel-gfx] [RFC 27/37] drm/i915/preempt: scheduler logic for queueing preemptive requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon This is the very first stage of the scheduler's preemption logic, where it determines whether a request should be marked as potentially preemptive, at the point where it is added to the scheduler's queue. Subsequent logic will determine how to handle the request on the basis of

[Intel-gfx] [RFC 36/37] drm/i915/preempt: update scheduler parameters to enable preemption

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_params.c| 4 ++-- drivers/gpu/drm/i915/i915_scheduler.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c in

[Intel-gfx] [RFC 30/37] drm/i915/preempt: don't allow nonbatch ctx init when the scheduler is busy

2015-11-23 Thread John . C . Harrison
From: Dave Gordon If the scheduler is busy (e.g. processing a preemption) it will need to be able to acquire the struct_mutex, so we can't allow untracked requests to bypass the scheduler and go directly to the hardware (much confusion will result). Since untracked requests are used only for init

[Intel-gfx] [RFC 32/37] drm/i915/preempt: add hook to catch 'unexpected' ring submissions

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Author: John Harrison Date: Thu Apr 10 10:41:06 2014 +0100 The scheduler needs to know what each seqno that pops out of the ring is referring to. This change adds a hook into the 'submit some random work that got forgotten about' clean up code to inform the scheduler tha

[Intel-gfx] [RFC 25/37] drm/i915/sched: include scheduler state in error capture

2015-11-23 Thread John . C . Harrison
From: Dave Gordon For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gpu_error.c | 5 + 2 files changed, 6 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9bece1e..06dff5a 10

[Intel-gfx] [RFC 15/37] drm/i915: add i915_wait_request() call after i915_add_request_no_flush()

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Per-context initialisation GPU instructions (which are injected directly into the ringbuffer rather than being submitted as a batch) should not be allowed to mix with user-generated batches in the same submission; it will cause confusion for the GuC (which might merge a subseque

[Intel-gfx] [RFC 23/37] drm/i915: set request 'head' on allocation not in add_request()

2015-11-23 Thread John . C . Harrison
From: Dave Gordon The current setting of request 'head' in add_request() isn't useful and has been replaced for purposes of knowing how full the ring is by 'postfix'. So we can instead use 'head' to define and locate the entire range spanned by a request. Pictorially, headpos

[Intel-gfx] [RFC 20/37] drm/i915/error: enhanced error capture of requests

2015-11-23 Thread John . C . Harrison
From: Dave Gordon Record a few more things about the requests outstanding at the time of capture ... For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_drv.h | 6 +- drivers/gpu/drm/i915/i915_gpu_error.c | 20 +++- 2 files changed, 20 insertions(+

[Intel-gfx] [PATCH 00/13] Convert requests to use struct fence

2015-12-11 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH 07/13] drm/i915: Add per context timelines to fence object

2015-12-11 Thread John . C . Harrison
From: John Harrison The fence object used inside the request structure requires a sequence number. Although this is not used by the i915 driver itself, it could potentially be used by non-i915 code if the fence is passed outside of the driver. This is the intention as it allows external kernel dr

[Intel-gfx] [PATCH 04/13] android/sync: Improved debug dump to dmesg

2015-12-11 Thread John . C . Harrison
From: John Harrison The sync code has a facility for dumping current state information via debugfs. It also has a way to re-use the same code for dumping to the kernel log on an internal error. However, the redirection was rather clunky and split the output across multiple prints at arbitrary bou

[Intel-gfx] [PATCH 01/13] staging/android/sync: Support sync points created from dma-fences

2015-12-11 Thread John . C . Harrison
From: Maarten Lankhorst Debug output assumes all sync points are built on top of Android sync points and when we start creating them from dma-fences will NULL ptr deref unless taught about this. v4: Corrected patch ownership. Signed-off-by: Maarten Lankhorst Signed-off-by: Tvrtko Ursulin Cc:

[Intel-gfx] [PATCH 02/13] staging/android/sync: add sync_fence_create_dma

2015-12-11 Thread John . C . Harrison
From: Maarten Lankhorst This allows users of dma fences to create an Android fence. v2: Added kerneldoc. (Tvrtko Ursulin). v4: Updated comments from review feedback by Maarten. Signed-off-by: Maarten Lankhorst Signed-off-by: Tvrtko Ursulin Cc: Maarten Lankhorst Cc: Daniel Vetter Cc: Jesse B

[Intel-gfx] [PATCH 06/13] drm/i915: Removed now redudant parameter to i915_gem_request_completed()

2015-12-11 Thread John . C . Harrison
From: John Harrison The change to the implementation of i915_gem_request_completed() means that the lazy coherency flag is no longer used. This can now be removed to simplify the interface. For: VIZ-5190 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu

[Intel-gfx] [PATCH 08/13] drm/i915: Delay the freeing of requests until retire time

2015-12-11 Thread John . C . Harrison
From: John Harrison The request structure is reference counted. When the count reached zero, the request was immediately freed and all associated objects were unreferenced/unallocated. This meant that the driver mutex lock must be held at the point where the count reaches zero. This was fine while

[Intel-gfx] [PATCH 11/13] android/sync: Fix reversed sense of signaled fence

2015-12-11 Thread John . C . Harrison
From: Peter Lawthers In the 3.14 kernel, a signaled fence was indicated by the status field == 1. In 4.x, a status == 0 indicates signaled, status < 0 indicates error, and status > 0 indicates active. This patch wraps the check for a signaled fence in a function so that callers no longer need t
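The 4.x status convention quoted in this snippet lends itself to a small wrapper. The following is a minimal sketch of that idea only, assuming a plain integer status field; `struct sketch_sync_fence` and `sketch_fence_is_signaled()` are hypothetical names for illustration and are not the helper the patch actually adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sync fence; only the status field matters here. */
struct sketch_sync_fence {
	int status; /* 4.x convention: 0 = signaled, < 0 = error, > 0 = active */
};

/*
 * Hide the status encoding behind one function so that callers do not
 * depend on which value happens to mean "signaled".
 */
static bool sketch_fence_is_signaled(const struct sketch_sync_fence *fence)
{
	assert(fence != NULL);
	return fence->status == 0;
}
```

Keeping the comparison in one place means a future change of encoding, like the 3.14 to 4.x switch described above, would only need to touch the wrapper.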

[Intel-gfx] [PATCH 09/13] drm/i915: Interrupt driven fences

2015-12-11 Thread John . C . Harrison
From: John Harrison The intended usage model for struct fence is that the signalled status should be set on demand rather than polled. That is, there should not be a need for a 'signaled' function to be called every time the status is queried. Instead, 'something' should be done to enable a signal

[Intel-gfx] [PATCH 10/13] drm/i915: Updated request structure tracing

2015-12-11 Thread John . C . Harrison
From: John Harrison Added the '_complete' trace event which occurs when a fence/request is signaled as complete. Also moved the notify event from the IRQ handler code to inside the notify function itself. v3: Added the current ring seqno to the notify trace point. For: VIZ-5190 Signed-off-by: J

[Intel-gfx] [PATCH 05/13] drm/i915: Convert requests to use struct fence

2015-12-11 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH 13/13] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

2015-12-11 Thread John . C . Harrison
From: John Harrison The notify function can be called many times without the seqno changing. A large number of duplicates are to prevent races due to the requirement of not enabling interrupts until requested. However, when interrupts are enabled the IRQ handle can be called multiple times withou

[Intel-gfx] [PATCH 03/13] staging/android/sync: Move sync framework out of staging

2015-12-11 Thread John . C . Harrison
From: John Harrison The sync framework is now used by the i915 driver. Therefore it can be moved out of staging and into the regular tree. Also, the public interfaces can actually be made public and exported. v3: New patch for series. Signed-off-by: John Harrison Signed-off-by: Geoff Miller -

[Intel-gfx] [PATCH 12/13] drm/i915: Add sync framework support to execbuff IOCTL

2015-12-11 Thread John . C . Harrison
From: John Harrison Various projects desire a mechanism for managing dependencies between work items asynchronously. This can also include work items across completely different and independent systems. For example, an application wants to retrieve a frame from a video in device, using it for rende

[Intel-gfx] [PATCH 05/40] drm/i915: Split i915_dem_do_execbuffer() in half

2015-12-11 Thread John . C . Harrison
From: John Harrison Split the execbuffer() function in half. The first half collects and validates all the information required to process the batch buffer. It also does all the object pinning, relocations, active list management, etc - basically anything that must be done upfront before the IOCT

[Intel-gfx] [PATCH 06/40] drm/i915: Cache request pointer in *_submission_final()

2015-12-11 Thread John . C . Harrison
From: Dave Gordon Keep a local copy of the request pointer in the _final() functions rather than dereferencing the params block repeatedly. v3: New patch in series. For: VIZ-1587 Signed-off-by: Dave Gordon Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +

[Intel-gfx] [PATCH 08/40] drm/i915: Start of GPU scheduler

2015-12-11 Thread John . C . Harrison
From: John Harrison Initial creation of scheduler source files. Note that this patch implements most of the scheduler functionality but does not hook it in to the driver yet. It also leaves the scheduler code in 'pass through' mode so that even when it is hooked in, it will not actually do very m

[Intel-gfx] [PATCH 12/40] drm/i915: Added scheduler hook when closing DRM file handles

2015-12-11 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver with submission of batch buffers to the hardware. Thus it is possible for an application to close its DRM file handle while there is still work outstanding. That means the scheduler needs to know about file

[Intel-gfx] [PATCH 17/40] drm/i915: Added tracking/locking of batch buffer objects

2015-12-11 Thread John . C . Harrison
From: John Harrison The scheduler needs to track interdependencies between batch buffers. These are calculated by analysing the object lists of the buffers and looking for commonality. The scheduler also needs to keep those buffers locked long after the initial IOCTL call has returned to user lan

[Intel-gfx] [PATCH 16/40] drm/i915: Keep the reserved space mechanism happy

2015-12-11 Thread John . C . Harrison
From: John Harrison Ring space is reserved when constructing a request to ensure that the subsequent 'add_request()' call cannot fail due to waiting for space on a busy or broken GPU. However, the scheduler jumps in to the middle of the execbuffer process between request creation and request subm

[Intel-gfx] [PATCH 18/40] drm/i915: Hook scheduler node clean up into retire requests

2015-12-11 Thread John . C . Harrison
From: John Harrison The scheduler keeps its own lock on various DRM objects in order to guarantee safe access long after the original execbuff IOCTL has completed. This is especially important when pre-emption is enabled as the batch buffer might need to be submitted to the hardware multiple time

[Intel-gfx] [PATCH 21/40] drm/i915: Added scheduler flush calls to ring throttle and idle functions

2015-12-11 Thread John . C . Harrison
From: John Harrison When requesting that all GPU work is completed, it is now necessary to get the scheduler involved in order to flush out work that is queued and not yet submitted. v2: Updated to add support for flushing the scheduler queue by time stamp rather than just doing a blanket flush. v

[Intel-gfx] [PATCH 19/40] drm/i915: Added scheduler support to __wait_request() calls

2015-12-11 Thread John . C . Harrison
From: John Harrison The scheduler can cause batch buffers, and hence requests, to be submitted to the ring out of order and asynchronously to their submission to the driver. Thus at the point of waiting for the completion of a given request, it is not even guaranteed that the request has actually

[Intel-gfx] [PATCH 24/40] drm/i915: Defer seqno allocation until actual hardware submission time

2015-12-11 Thread John . C . Harrison
From: John Harrison The seqno value is now only used for the final test for completion of a request. It is no longer used to track the request through the software stack. Thus it is no longer necessary to allocate the seqno immediately with the request. Instead, it can be done lazily and left unt

[Intel-gfx] [PATCH 28/40] drm/i915: Added trace points to scheduler

2015-12-11 Thread John . C . Harrison
From: John Harrison Added trace points to the scheduler to track all the various events, node state transitions and other interesting things that occur. v2: Updated for new request completion tracking implementation. v3: Updated for changes to node kill code. Change-Id: I9886390cfc7897bc1faf50

[Intel-gfx] [PATCH 31/40] drm/i915: Added debug state dump facilities to scheduler

2015-12-11 Thread John . C . Harrison
From: John Harrison When debugging batch buffer submission issues, it is useful to be able to see what the current state of the scheduler is. This change adds functions for decoding the internal scheduler state and reporting it. v3: Updated a debug message with the new state_str() function. Cha

[Intel-gfx] [PATCH 29/40] drm/i915: Added scheduler queue throttling by DRM file handle

2015-12-11 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now

[Intel-gfx] [PATCH 33/40] drm/i915: Added scheduler statistic reporting to debugfs

2015-12-11 Thread John . C . Harrison
From: John Harrison It is useful to know what the scheduler is doing for both debugging and performance analysis purposes. This change adds a bunch of counters and such that keep track of various scheduler operations (batches submitted, completed, flush requests, etc.). The data can then be read

[Intel-gfx] [PATCH 32/40] drm/i915: Add early exit to execbuff_final() if insufficient ring space

2015-12-11 Thread John . C . Harrison
From: John Harrison One of the major purposes of the GPU scheduler is to avoid stalling the CPU when the GPU is busy and unable to accept more work. This change adds support to the ring submission code to allow a ring space check to be performed before attempting to submit a batch buffer to the h

[Intel-gfx] [PATCH 00/40] GPU scheduler for i915 driver

2015-12-11 Thread John . C . Harrison
From: John Harrison Implemented a batch buffer submission scheduler for the i915 DRM driver. The general theory of operation is that when batch buffers are submitted to the driver, the execbuffer() code assigns a unique seqno value and then packages up all the information required to execute the

[Intel-gfx] [RFC 24/38] drm/i915/sched: set request 'head' on at start of ring submission

2015-12-11 Thread John . C . Harrison
From: Dave Gordon With the scheduler, request allocation can happen long before the ring is filled in, and in a different order. So for that case, we update the request head at the start of _final (the initialisation on allocation is still useful for the direct-submission mode). v2: Updated to u

[Intel-gfx] [RFC 32/38] drm/i915/preempt: add hook to catch 'unexpected' ring submissions

2015-12-11 Thread John . C . Harrison
From: Dave Gordon Author: John Harrison Date: Thu Apr 10 10:41:06 2014 +0100 The scheduler needs to know what each seqno that pops out of the ring is referring to. This change adds a hook into the 'submit some random work that got forgotten about' clean up code to inform the scheduler tha

[Intel-gfx] [RFC 31/38] drm/i915/preempt: scheduler logic for landing preemptive requests

2015-12-11 Thread John . C . Harrison
From: Dave Gordon This patch adds the GEM & scheduler logic for detection and first-stage processing of completed preemption requests. Similar to regular batches, they deposit their sequence number in the hardware status page when starting and again when finished, but using different locations so

[Intel-gfx] [RFC 36/38] drm/i915/preempt: update (LRC) ringbuffer-filling code to create preemptive requests

2015-12-11 Thread John . C . Harrison
From: Dave Gordon This patch refactors the ringbuffer-level code (in execlists/GuC mode only) and enhances it so that it can emit the proper sequence of opcodes for preemption requests. A preemption request is similar to a batch submission, but doesn't actually invoke a batchbuffer, the purpose b

[Intel-gfx] [RFC 38/38] drm/i915: Added preemption info to various trace points

2015-12-11 Thread John . C . Harrison
From: John Harrison v2: Fixed a typo (and improved the names in general). Updated for changes to notify() code. For: VIZ-2021 Signed-off-by: Dave Gordon --- drivers/gpu/drm/i915/i915_gem.c | 5 +++-- drivers/gpu/drm/i915/i915_scheduler.c | 2 +- drivers/gpu/drm/i915/i915_trace.h |

[Intel-gfx] [RFC 35/38] drm/i915/preempt: Implement mid-batch preemption support

2015-12-11 Thread John . C . Harrison
From: Dave Gordon Batch buffers which have been pre-empted mid-way through execution must be handled separately. Rather than simply re-submitting the batch as a brand new piece of work, the driver only needs to requeue the context. The hardware will take care of picking up where it left off. v2

[Intel-gfx] [RFC 00/38] Preemption support for GPU scheduler

2015-12-11 Thread John . C . Harrison
From: John Harrison Added pre-emption support to the i915 GPU scheduler. Note that this patch series was written by David Gordon. I have simply ported it onto a more recent set of scheduler patches and am uploading it as part of that work so that everything can be viewed at once. Also because Da

[Intel-gfx] [PATCH v5 01/35] drm/i915: Add total count to context status debugfs output

2016-02-18 Thread John . C . Harrison
From: John Harrison When there are lots and lots and even more lots of contexts (e.g. when running with execlists) it is useful to be able to immediately see what the total context count is. v4: Re-typed a variable (review feedback from Joonas) For: VIZ-1587 Signed-off-by: John Harrison Review

[Intel-gfx] [PATCH v5 03/35] drm/i915: Split i915_dem_do_execbuffer() in half

2016-02-18 Thread John . C . Harrison
From: John Harrison Split the execbuffer() function in half. The first half collects and validates all the information required to process the batch buffer. It also does all the object pinning, relocations, active list management, etc - basically anything that must be done upfront before the IOCT

[Intel-gfx] [PATCH v5 00/35] GPU scheduler for i915 driver

2016-02-18 Thread John . C . Harrison
From: John Harrison Implemented a batch buffer submission scheduler for the i915 DRM driver. The general theory of operation is that when batch buffers are submitted to the driver, the execbuffer() code assigns a unique seqno value and then packages up all the information required to execute the

[Intel-gfx] [PATCH v5 02/35] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their submission to the hardware. This basically means splitting the execbuffer() function in half. This change rearranges some code ready for the split to occur. v5: Dropped runtime PM calls as they c

[Intel-gfx] [PATCH v5 07/35] drm/i915: Prepare retire_requests to handle out-of-order seqnos

2016-02-18 Thread John . C . Harrison
From: John Harrison A major point of the GPU scheduler is that it re-orders batch buffers after they have been submitted to the driver. This leads to requests completing out of order. In turn, this means that the retire processing can no longer assume that all completed entries are at the front o

[Intel-gfx] [PATCH v5 11/35] drm/i915: Added scheduler hook into i915_gem_request_notify()

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler needs to know when requests have completed so that it can keep its own internal state up to date and can submit new requests to the hardware from its queue. v2: Updated due to changes in request handling. The operation is now reversed from before. Rather than th

[Intel-gfx] [PATCH v5 08/35] drm/i915: Disable hardware semaphores when GPU scheduler is enabled

2016-02-18 Thread John . C . Harrison
From: John Harrison Hardware semaphores require seqno values to be continuously incrementing. However, the scheduler's reordering of batch buffers means that the seqno values going through the hardware could be out of order. Thus semaphores cannot be used. On the other hand, the scheduler super

[Intel-gfx] [PATCH v5 13/35] drm/i915: Redirect execbuffer_final() via scheduler

2016-02-18 Thread John . C . Harrison
From: John Harrison Updated the execbuffer() code to pass the packaged up batch buffer information to the scheduler rather than calling execbuffer_final() directly. The scheduler queue() code is currently a stub which simply chains on to _final() immediately. For: VIZ-1587 Signed-off-by: John Ha

[Intel-gfx] [PATCH v5 04/35] drm/i915: Cache request pointer in *_submission_final()

2016-02-18 Thread John . C . Harrison
From: Dave Gordon Keep a local copy of the request pointer in the _final() functions rather than dereferencing the params block repeatedly. v3: New patch in series. For: VIZ-1587 Signed-off-by: Dave Gordon Signed-off-by: John Harrison Reviewed-by: Jesse Barnes --- drivers/gpu/drm/i915/i915_

[Intel-gfx] [PATCH v5 05/35] drm/i915: Re-instate request->uniq because it is extremely useful

2016-02-18 Thread John . C . Harrison
From: John Harrison The seqno value cannot always be used when debugging issues via trace points. This is because it can be reset back to start, especially during TDR type tests. Also, when the scheduler arrives the seqno is only valid while a given request is executing on the hardware. While the

[Intel-gfx] [PATCH v5 10/35] drm/i915: Added scheduler hook when closing DRM file handles

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from the submission of batch buffers to the hardware. Thus it is possible for an application to close its DRM file handle while there is still work outstanding. That means the scheduler needs to know about file

[Intel-gfx] [PATCH v5 09/35] drm/i915: Force MMIO flips when scheduler enabled

2016-02-18 Thread John . C . Harrison
From: John Harrison MMIO flips are the preferred mechanism now but more importantly, pipe based flips cause issues for the scheduler. Specifically, submitting work to the rings around the side of the scheduler could cause that work to be lost if the scheduler generates a pre-emption event on that

[Intel-gfx] [PATCH v5 27/35] drm/i915: Added debug state dump facilities to scheduler

2016-02-18 Thread John . C . Harrison
From: John Harrison When debugging batch buffer submission issues, it is useful to be able to see what the current state of the scheduler is. This change adds functions for decoding the internal scheduler state and reporting it. v3: Updated a debug message with the new state_str() function. v4:

[Intel-gfx] [PATCH v5 23/35] drm/i915: Defer seqno allocation until actual hardware submission time

2016-02-18 Thread John . C . Harrison
From: John Harrison The seqno value is now only used for the final test for completion of a request. It is no longer used to track the request through the software stack. Thus it is no longer necessary to allocate the seqno immediately with the request. Instead, it can be done lazily and left unt

[Intel-gfx] [PATCH v5 24/35] drm/i915: Added trace points to scheduler

2016-02-18 Thread John . C . Harrison
From: John Harrison Added trace points to the scheduler to track all the various events, node state transitions and other interesting things that occur. v2: Updated for new request completion tracking implementation. v3: Updated for changes to node kill code. v4: Wrapped some long lines to kee
