[Intel-gfx] [PATCH v5 28/35] drm/i915: Add early exit to execbuff_final() if insufficient ring space

2016-02-18 Thread John . C . Harrison
From: John Harrison One of the major purposes of the GPU scheduler is to avoid stalling the CPU when the GPU is busy and unable to accept more work. This change adds support to the ring submission code to allow a ring space check to be performed before attempting to submit a batch buffer to the h

[Intel-gfx] [PATCH v5 20/35] drm/i915: Add scheduler hook to GPU reset

2016-02-18 Thread John . C . Harrison
From: John Harrison When the watchdog resets the GPU, all interrupts get disabled despite the reference count remaining. As the scheduler probably had interrupts enabled during the reset (it would have been waiting for the bad batch to complete), it must be poked to tell it that the interrupt has

[Intel-gfx] [PATCH v5 21/35] drm/i915: Added a module parameter to allow the scheduler to be disabled

2016-02-18 Thread John . C . Harrison
From: John Harrison It can be useful to be able to disable the GPU scheduler via a module parameter for debugging purposes. v5: Converted from a multi-feature 'overrides' mask to a single 'enable' boolean. Further features (e.g. pre-emption) will now be separate 'enable' booleans added later. [C

[Intel-gfx] [PATCH v5 17/35] drm/i915: Added scheduler support to __wait_request() calls

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler can cause batch buffers, and hence requests, to be submitted to the ring out of order and asynchronously to their submission to the driver. Thus at the point of waiting for the completion of a given request, it is not even guaranteed that the request has actually

[Intel-gfx] [PATCH v5 15/35] drm/i915: Added tracking/locking of batch buffer objects

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler needs to track interdependencies between batch buffers. These are calculated by analysing the object lists of the buffers and looking for commonality. The scheduler also needs to keep those buffers locked long after the initial IOCTL call has returned to user lan

[Intel-gfx] [PATCH v5 31/35] drm/i915: Scheduler state dump via debugfs

2016-02-18 Thread John . C . Harrison
From: John Harrison Added a facility for triggering the scheduler state dump via a debugfs entry. v2: New patch in series. For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_debugfs.c | 33 + drivers/gpu/drm/i915/i915_scheduler.c | 9 ++

[Intel-gfx] [PATCH v5 25/35] drm/i915: Added scheduler queue throttling by DRM file handle

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now

[Intel-gfx] [PATCH v5 22/35] drm/i915: Support for 'unflushed' ring idle

2016-02-18 Thread John . C . Harrison
From: John Harrison When the seqno wraps around zero, the entire GPU is forced to be idle for some reason (possibly only to work around issues with hardware semaphores but no-one seems too sure!). This causes a problem if the force idle occurs at an inopportune moment such as in the middle of sub

[Intel-gfx] [PATCH v5 16/35] drm/i915: Hook scheduler node clean up into retire requests

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler keeps its own lock on various DRM objects in order to guarantee safe access long after the original execbuff IOCTL has completed. This is especially important when pre-emption is enabled as the batch buffer might need to be submitted to the hardware multiple time

[Intel-gfx] [PATCH v5 18/35] drm/i915: Added scheduler support to page fault handler

2016-02-18 Thread John . C . Harrison
From: John Harrison GPU page faults can now require scheduler operation in order to complete. For example, in order to free up sufficient memory to handle the fault the handler must wait for a batch buffer to complete that has not even been sent to the hardware yet. Thus EAGAIN no longer means a

[Intel-gfx] [PATCH v5 12/35] drm/i915: Added deferred work handler for scheduler

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler needs to do interrupt triggered work that is too complex to do in the interrupt handler. Thus it requires a deferred work handler to process such tasks asynchronously. v2: Updated to reduce mutex lock usage. The lock is now only held for the minimum time within

[Intel-gfx] [PATCH 01/20] igt/gem_ctx_param_basic: Updated to support scheduler priority interface

2016-02-18 Thread John . C . Harrison
From: John Harrison The GPU scheduler has added an execution priority level to the context object. There is an IOCTL interface to allow user apps/libraries to set this priority. This patch updates the context parameter IOCTL test to include the new interface. For: VIZ-1587 Signed-off-by: John Har

[Intel-gfx] [PATCH v5 33/35] drm/i915: Add scheduling priority to per-context parameters

2016-02-18 Thread John . C . Harrison
From: Dave Gordon Added an interface for user land applications/libraries/services to set their GPU scheduler priority. This extends the existing context parameter IOCTL interface to add a scheduler priority parameter. The range is +/-1023 with +ve numbers meaning higher priority. Only system pro
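As an illustration of the priority range quoted in the snippet above (+/-1023, with positive values meaning higher priority), a minimal sketch of a range check might look like the following. The names `SKETCH_PRIORITY_MIN`, `SKETCH_PRIORITY_MAX`, `check_priority_sketch()` and `runs_before_sketch()` are hypothetical and are not taken from the patch; the permission rule that the truncated snippet starts to describe is deliberately not modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical constants mirroring the +/-1023 range quoted in the snippet. */
#define SKETCH_PRIORITY_MAX  1023
#define SKETCH_PRIORITY_MIN  (-1023)

/* Returns 0 if the value lies inside the quoted range, -EINVAL otherwise. */
static int check_priority_sketch(int priority)
{
	if (priority < SKETCH_PRIORITY_MIN || priority > SKETCH_PRIORITY_MAX)
		return -EINVAL;
	return 0;
}

/* Positive numbers mean higher priority, so the larger value is preferred. */
static bool runs_before_sketch(int prio_a, int prio_b)
{
	return prio_a > prio_b;
}
```

This only shows the arithmetic of the range; how the real driver stores or enforces the value is not visible from the snippet.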

[Intel-gfx] [PATCH v5 26/35] drm/i915: Added debugfs interface to scheduler tuning parameters

2016-02-18 Thread John . C . Harrison
From: John Harrison There are various parameters within the scheduler which can be tuned to improve performance, reduce memory footprint, etc. This change adds support for altering these via debugfs. v2: Updated for priorities now being signed values. v5: Squashed priority bumping entries into

[Intel-gfx] [PATCH v5 32/35] drm/i915: Enable GPU scheduler by default

2016-02-18 Thread John . C . Harrison
From: John Harrison Now that all the scheduler patches have been applied, it is safe to enable. v5: Updated for new module parameter. For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_params.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drive

[Intel-gfx] [PATCH v5 34/35] drm/i915: Add support for retro-actively banning batch buffers

2016-02-18 Thread John . C . Harrison
From: John Harrison If a given context submits too many hanging batch buffers then it will be banned and no further batch buffers will be accepted for it. However, it is possible that a large number of buffers may already have been accepted and are sat in the scheduler waiting to be executed. Thi

[Intel-gfx] [PATCH v5 19/35] drm/i915: Added scheduler flush calls to ring throttle and idle functions

2016-02-18 Thread John . C . Harrison
From: John Harrison When requesting that all GPU work is completed, it is now necessary to get the scheduler involved in order to flush out work that is queued and not yet submitted. v2: Updated to add support for flushing the scheduler queue by time stamp rather than just doing a blanket flush. v

[Intel-gfx] [PATCH v5 30/35] drm/i915: Add scheduler support functions for TDR

2016-02-18 Thread John . C . Harrison
From: John Harrison The TDR code needs to know what the scheduler is up to in order to work out whether a ring is really hung or not. v4: Removed some unnecessary braces to keep the style checker happy. v5: Removed white space and added documentation. [Joonas Lahtinen] Also updated for new mod

[Intel-gfx] [PATCH v5 29/35] drm/i915: Added scheduler statistic reporting to debugfs

2016-02-18 Thread John . C . Harrison
From: John Harrison It is useful to know what the scheduler is doing for both debugging and performance analysis purposes. This change adds a bunch of counters and such that keep track of various scheduler operations (batches submitted, completed, flush requests, etc.). The data can then be read

[Intel-gfx] [PATCH v5 14/35] drm/i915: Keep the reserved space mechanism happy

2016-02-18 Thread John . C . Harrison
From: John Harrison Ring space is reserved when constructing a request to ensure that the subsequent 'add_request()' call cannot fail due to waiting for space on a busy or broken GPU. However, the scheduler jumps in to the middle of the execbuffer process between request creation and request subm

[Intel-gfx] [PATCH v5 35/35] drm/i915: Allow scheduler to manage inter-ring object synchronisation

2016-02-18 Thread John . C . Harrison
From: John Harrison The scheduler has always tracked batch buffer dependencies based on DRM object usage. This means that it will not submit a batch on one ring that has outstanding dependencies still executing on other rings. This is exactly the same synchronisation performed by i915_gem_object_

[Intel-gfx] [PATCH v6 7/7] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

2016-02-18 Thread John . C . Harrison
From: John Harrison The notify function can be called many times without the seqno changing. A large number of duplicates are to prevent races due to the requirement of not enabling interrupts until requested. However, when interrupts are enabled the IRQ handler can be called multiple times withou

[Intel-gfx] [PATCH v6 4/7] drm/i915: Delay the freeing of requests until retire time

2016-02-18 Thread John . C . Harrison
From: John Harrison The request structure is reference counted. When the count reached zero, the request was immediately freed and all associated objects were unreferenced/unallocated. This meant that the driver mutex lock must be held at the point where the count reaches zero. This was fine while

[Intel-gfx] [PATCH v6 1/7] drm/i915: Convert requests to use struct fence

2016-02-18 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v6 2/7] drm/i915: Removed now redudant parameter to i915_gem_request_completed()

2016-02-18 Thread John . C . Harrison
From: John Harrison The change to the implementation of i915_gem_request_completed() means that the lazy coherency flag is no longer used. This can now be removed to simplify the interface. v6: Updated to newer nightly and resolved conflicts. For: VIZ-5190 Signed-off-by: John Harrison --- dri

[Intel-gfx] [PATCH v6 3/7] drm/i915: Add per context timelines to fence object

2016-02-18 Thread John . C . Harrison
From: John Harrison The fence object used inside the request structure requires a sequence number. Although this is not used by the i915 driver itself, it could potentially be used by non-i915 code if the fence is passed outside of the driver. This is the intention as it allows external kernel dr

[Intel-gfx] [PATCH v6 0/7] Convert requests to use struct fence

2016-02-18 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v6 5/7] drm/i915: Interrupt driven fences

2016-02-18 Thread John . C . Harrison
From: John Harrison The intended usage model for struct fence is that the signalled status should be set on demand rather than polled. That is, there should not be a need for a 'signaled' function to be called every time the status is queried. Instead, 'something' should be done to enable a signal

[Intel-gfx] [PATCH v6 6/7] drm/i915: Updated request structure tracing

2016-02-18 Thread John . C . Harrison
From: John Harrison Added the '_complete' trace event which occurs when a fence/request is signaled as complete. Also moved the notify event from the IRQ handler code to inside the notify function itself. v3: Added the current ring seqno to the notify trace point. v5: Line wrapping to keep the

[Intel-gfx] [PATCH v5 06/35] drm/i915: Start of GPU scheduler

2016-02-18 Thread John . C . Harrison
From: John Harrison Initial creation of scheduler source files. Note that this patch implements most of the scheduler functionality but does not hook it in to the driver yet. It also leaves the scheduler code in 'pass through' mode so that even when it is hooked in, it will not actually do very m

[Intel-gfx] [PATCH 1/1] drm/i915: Add wrapper for context priority interface

2016-04-20 Thread John . C . Harrison
From: John Harrison There is an EGL extension to set execution priority per context. This can be implemented via the i915 per context priority parameter. This patch adds a wrapper to connect the two together in a way that can be updated as necessary without breaking one side or the other. Signed

[Intel-gfx] [PATCH v6 29/34] drm/i915: Enable GPU scheduler by default

2016-04-20 Thread John . C . Harrison
From: John Harrison Now that all the scheduler patches have been applied, it is safe to enable. v5: Updated for new module parameter. For: VIZ-1587 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_params.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drive

[Intel-gfx] [PATCH v6 30/34] drm/i915: Add scheduling priority to per-context parameters

2016-04-20 Thread John . C . Harrison
From: Dave Gordon Added an interface for user land applications/libraries/services to set their GPU scheduler priority. This extends the existing context parameter IOCTL interface to add a scheduler priority parameter. The range is +/-1023 with +ve numbers meaning higher priority. Only system pro

[Intel-gfx] [PATCH v6 31/34] drm/i915: Add support for retro-actively banning batch buffers

2016-04-20 Thread John . C . Harrison
From: John Harrison If a given context submits too many hanging batch buffers then it will be banned and no further batch buffers will be accepted for it. However, it is possible that a large number of buffers may already have been accepted and are sat in the scheduler waiting to be executed. Thi

[Intel-gfx] [PATCH v6 08/34] drm/i915: Force MMIO flips when scheduler enabled

2016-04-20 Thread John . C . Harrison
From: John Harrison MMIO flips are the preferred mechanism now but more importantly, pipe based flips cause issues for the scheduler. Specifically, submitting work to the rings around the side of the scheduler could cause that work to be lost if the scheduler generates a pre-emption event on that

[Intel-gfx] [PATCH v6 24/34] drm/i915: Added scheduler queue throttling by DRM file handle

2016-05-06 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now

[Intel-gfx] [PATCH v6 27/34] drm/i915: Added scheduler statistic reporting to debugfs

2016-05-06 Thread John . C . Harrison
From: John Harrison It is useful to know what the scheduler is doing for both debugging and performance analysis purposes. This change adds a bunch of counters and such that keep track of various scheduler operations (batches submitted, completed, flush requests, etc.). The data can then be read

[Intel-gfx] [PATCH v8 6/6] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

2016-05-12 Thread John . C . Harrison
From: John Harrison The notify function can be called many times without the seqno changing. Some are to prevent races due to the requirement of not enabling interrupts until requested. However, when interrupts are enabled the IRQ handler can be called multiple times without the ring's seqno valu
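The idea described in the snippet above, skipping repeated notify work when the seqno has not moved, can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct engine_sketch`, its fields, and `notify_sketch()` are hypothetical names, and the counter stands in for whatever processing the real notify function performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-engine state; not the driver's real structure. */
struct engine_sketch {
	uint32_t last_irq_seqno;  /* seqno observed on the previous notify call */
	unsigned int scans;       /* how many times the expensive path actually ran */
};

/*
 * Returns true if the (assumed expensive) processing ran, false if the call
 * was skipped because the hardware seqno matches the cached value.
 */
static bool notify_sketch(struct engine_sketch *engine, uint32_t hw_seqno)
{
	if (hw_seqno == engine->last_irq_seqno)
		return false;  /* nothing new has completed since the last call */

	engine->last_irq_seqno = hw_seqno;
	engine->scans++;       /* placeholder for walking the request list */
	return true;
}
```

With a cache like this, back-to-back calls that see the same seqno cost one comparison each rather than a full pass over outstanding requests.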

[Intel-gfx] [PATCH v8 3/6] drm/i915: Removed now redundant parameter to i915_gem_request_completed()

2016-05-12 Thread John . C . Harrison
From: John Harrison The change to the implementation of i915_gem_request_completed() means that the lazy coherency flag is no longer used. This can now be removed to simplify the interface. v6: Updated to newer nightly and resolved conflicts. v7: Updated to newer nightly (lots of ring -> engine

[Intel-gfx] [PATCH v8 0/6] Convert requests to use struct fence

2016-05-12 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v8 5/6] drm/i915: Updated request structure tracing

2016-05-12 Thread John . C . Harrison
From: John Harrison Added the '_complete' trace event which occurs when a fence/request is signaled as complete. Also moved the notify event from the IRQ handler code to inside the notify function itself. v3: Added the current ring seqno to the notify trace point. v5: Line wrapping to keep the

[Intel-gfx] [PATCH v8 2/6] drm/i915: Convert requests to use struct fence

2016-05-12 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v8 1/6] drm/i915: Add per context timelines to fence object

2016-05-12 Thread John . C . Harrison
From: John Harrison The fence object used inside the request structure requires a sequence number. Although this is not used by the i915 driver itself, it could potentially be used by non-i915 code if the fence is passed outside of the driver. This is the intention as it allows external kernel dr

[Intel-gfx] [PATCH v8 4/6] drm/i915: Interrupt driven fences

2016-05-12 Thread John . C . Harrison
From: John Harrison The intended usage model for struct fence is that the signalled status should be set on demand rather than polled. That is, there should not be a need for a 'signaled' function to be called every time the status is queried. Instead, 'something' should be done to enable a signal

[Intel-gfx] [PATCH v7 0/8] Convert requests to use struct fence

2016-04-20 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v7 7/8] drm/i915: Updated request structure tracing

2016-04-20 Thread John . C . Harrison
From: John Harrison Added the '_complete' trace event which occurs when a fence/request is signaled as complete. Also moved the notify event from the IRQ handler code to inside the notify function itself. v3: Added the current ring seqno to the notify trace point. v5: Line wrapping to keep the

[Intel-gfx] [PATCH v7 6/8] drm/i915: Interrupt driven fences

2016-04-20 Thread John . C . Harrison
From: John Harrison The intended usage model for struct fence is that the signalled status should be set on demand rather than polled. That is, there should not be a need for a 'signaled' function to be called every time the status is queried. Instead, 'something' should be done to enable a signal

[Intel-gfx] [PATCH v7 2/8] drm/i915: Removed now redudant parameter to i915_gem_request_completed()

2016-04-20 Thread John . C . Harrison
From: John Harrison The change to the implementation of i915_gem_request_completed() means that the lazy coherency flag is no longer used. This can now be removed to simplify the interface. v6: Updated to newer nightly and resolved conflicts. v7: Updated to newer nightly (lots of ring -> engine

[Intel-gfx] [PATCH v7 5/8] drm/i915: Delay the freeing of requests until retire time

2016-04-20 Thread John . C . Harrison
From: John Harrison The request structure is reference counted. When the count reached zero, the request was immediately freed and all associated objects were unreferenced/unallocated. This meant that the driver mutex lock must be held at the point where the count reaches zero. This was fine while

[Intel-gfx] [PATCH v7 8/8] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

2016-04-20 Thread John . C . Harrison
From: John Harrison The notify function can be called many times without the seqno changing. A large number of duplicates are to prevent races due to the requirement of not enabling interrupts until requested. However, when interrupts are enabled the IRQ handler can be called multiple times withou

[Intel-gfx] [PATCH v7 3/8] drm/i915: Add per context timelines to fence object

2016-04-20 Thread John . C . Harrison
From: John Harrison The fence object used inside the request structure requires a sequence number. Although this is not used by the i915 driver itself, it could potentially be used by non-i915 code if the fence is passed outside of the driver. This is the intention as it allows external kernel dr

[Intel-gfx] [PATCH v7 4/8] drm/i915: Fix clean up of file client list on execbuff failure

2016-04-20 Thread John . C . Harrison
From: John Harrison If an execbuff IOCTL call fails for some reason, it would leave the request in the client list. The request clean up code would remove this but only later on and only after the reference count has dropped to zero. The entire sequence is contained within the driver mutex lock.

[Intel-gfx] [PATCH v7 1/8] drm/i915: Convert requests to use struct fence

2016-04-20 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH v6 03/34] drm/i915: Split i915_dem_do_execbuffer() in half

2016-04-20 Thread John . C . Harrison
From: John Harrison Split the execbuffer() function in half. The first half collects and validates all the information required to process the batch buffer. It also does all the object pinning, relocations, active list management, etc - basically anything that must be done upfront before the IOCT

[Intel-gfx] [PATCH v6 13/34] drm/i915: Keep the reserved space mechanism happy

2016-04-20 Thread John . C . Harrison
From: John Harrison Ring space is reserved when constructing a request to ensure that the subsequent 'add_request()' call cannot fail due to waiting for space on a busy or broken GPU. However, the scheduler jumps in to the middle of the execbuffer process between request creation and request subm

[Intel-gfx] [PATCH v6 14/34] drm/i915: Added tracking/locking of batch buffer objects

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler needs to track interdependencies between batch buffers. These are calculated by analysing the object lists of the buffers and looking for commonality. The scheduler also needs to keep those buffers locked long after the initial IOCTL call has returned to user lan

[Intel-gfx] [PATCH v6 12/34] drm/i915: Redirect execbuffer_final() via scheduler

2016-04-20 Thread John . C . Harrison
From: John Harrison Updated the execbuffer() code to pass the packaged up batch buffer information to the scheduler rather than calling execbuffer_final() directly. The scheduler queue() code is currently a stub which simply chains on to _final() immediately. v6: Updated to newer nightly (lots o

[Intel-gfx] [PATCH v6 20/34] drm/i915: Added a module parameter to allow the scheduler to be disabled

2016-04-20 Thread John . C . Harrison
From: John Harrison It can be useful to be able to disable the GPU scheduler via a module parameter for debugging purposes. v5: Converted from a multi-feature 'overrides' mask to a single 'enable' boolean. Further features (e.g. pre-emption) will now be separate 'enable' booleans added later. [C

[Intel-gfx] [PATCH v6 23/34] drm/i915: Added trace points to scheduler

2016-04-20 Thread John . C . Harrison
From: John Harrison Added trace points to the scheduler to track all the various events, node state transitions and other interesting things that occur. v2: Updated for new request completion tracking implementation. v3: Updated for changes to node kill code. v4: Wrapped some long lines to kee

[Intel-gfx] [PATCH v6 17/34] drm/i915: Added scheduler support to page fault handler

2016-04-20 Thread John . C . Harrison
From: John Harrison GPU page faults can now require scheduler operation in order to complete. For example, in order to free up sufficient memory to handle the fault the handler must wait for a batch buffer to complete that has not even been sent to the hardware yet. Thus EAGAIN no longer means a

[Intel-gfx] [PATCH v6 24/34] drm/i915: Added scheduler queue throttling by DRM file handle

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their subsequent submission to the hardware. This means that an application which is continuously submitting buffers as fast as it can could potentially flood the driver. To prevent this, the driver now

[Intel-gfx] [PATCH v6 16/34] drm/i915: Added scheduler support to __wait_request() calls

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler can cause batch buffers, and hence requests, to be submitted to the ring out of order and asynchronously to their submission to the driver. Thus at the point of waiting for the completion of a given request, it is not even guaranteed that the request has actually

[Intel-gfx] [PATCH v6 19/34] drm/i915: Add scheduler hook to GPU reset

2016-04-20 Thread John . C . Harrison
From: John Harrison When the watchdog resets the GPU, all interrupts get disabled despite the reference count remaining. As the scheduler probably had interrupts enabled during the reset (it would have been waiting for the bad batch to complete), it must be poked to tell it that the interrupt has

[Intel-gfx] [PATCH v6 01/34] drm/i915: Add total count to context status debugfs output

2016-04-20 Thread John . C . Harrison
From: John Harrison When there are lots and lots and even more lots of contexts (e.g. when running with execlists) it is useful to be able to immediately see what the total context count is. v4: Re-typed a variable (review feedback from Joonas) For: VIZ-1587 Signed-off-by: John Harrison Review

[Intel-gfx] [PATCH v6 22/34] drm/i915: Defer seqno allocation until actual hardware submission time

2016-04-20 Thread John . C . Harrison
From: John Harrison The seqno value is now only used for the final test for completion of a request. It is no longer used to track the request through the software stack. Thus it is no longer necessary to allocate the seqno immediately with the request. Instead, it can be done lazily and left unt

[Intel-gfx] [PATCH v6 04/34] drm/i915: Cache request pointer in *_submission_final()

2016-04-20 Thread John . C . Harrison
From: Dave Gordon Keep a local copy of the request pointer in the _final() functions rather than dereferencing the params block repeatedly. v3: New patch in series. v6: Updated to newer nightly (lots of ring -> engine renaming). For: VIZ-1587 Signed-off-by: Dave Gordon Signed-off-by: John Har

[Intel-gfx] [PATCH v6 18/34] drm/i915: Added scheduler flush calls to ring throttle and idle functions

2016-04-20 Thread John . C . Harrison
From: John Harrison When requesting that all GPU work is completed, it is now necessary to get the scheduler involved in order to flush out work that is queued and not yet submitted. v2: Updated to add support for flushing the scheduler queue by time stamp rather than just doing a blanket flush. v

[Intel-gfx] [PATCH v6 09/34] drm/i915: Added scheduler hook when closing DRM file handles

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from the submission of batch buffers to the hardware. Thus it is possible for an application to close its DRM file handle while there is still work outstanding. That means the scheduler needs to know about file

[Intel-gfx] [PATCH v6 26/34] drm/i915: Add early exit to execbuff_final() if insufficient ring space

2016-04-20 Thread John . C . Harrison
From: John Harrison One of the major purposes of the GPU scheduler is to avoid stalling the CPU when the GPU is busy and unable to accept more work. This change adds support to the ring submission code to allow a ring space check to be performed before attempting to submit a batch buffer to the h

[Intel-gfx] [PATCH v6 25/34] drm/i915: Added debugfs interface to scheduler tuning parameters

2016-04-20 Thread John . C . Harrison
From: John Harrison There are various parameters within the scheduler which can be tuned to improve performance, reduce memory footprint, etc. This change adds support for altering these via debugfs. v2: Updated for priorities now being signed values. v5: Squashed priority bumping entries into

[Intel-gfx] [PATCH v6 15/34] drm/i915: Hook scheduler node clean up into retire requests

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler keeps its own lock on various DRM objects in order to guarantee safe access long after the original execbuff IOCTL has completed. This is especially important when pre-emption is enabled as the batch buffer might need to be submitted to the hardware multiple time

[Intel-gfx] [PATCH v6 11/34] drm/i915: Added deferred work handler for scheduler

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler needs to do interrupt triggered work that is too complex to do in the interrupt handler. Thus it requires a deferred work handler to process such tasks asynchronously. v2: Updated to reduce mutex lock usage. The lock is now only held for the minimum time within

[Intel-gfx] [PATCH v6 28/34] drm/i915: Add scheduler support functions for TDR

2016-04-20 Thread John . C . Harrison
From: John Harrison The TDR code needs to know what the scheduler is up to in order to work out whether a ring is really hung or not. v4: Removed some unnecessary braces to keep the style checker happy. v5: Removed white space and added documentation. [Joonas Lahtinen] Also updated for new mod

[Intel-gfx] [PATCH v6 21/34] drm/i915: Support for 'unflushed' ring idle

2016-04-20 Thread John . C . Harrison
From: John Harrison When the seqno wraps around zero, the entire GPU is forced to be idle for some reason (possibly only to work around issues with hardware semaphores but no-one seems too sure!). This causes a problem if the force idle occurs at an inopportune moment such as in the middle of sub

[Intel-gfx] [PATCH v6 07/34] drm/i915: Disable hardware semaphores when GPU scheduler is enabled

2016-04-20 Thread John . C . Harrison
From: John Harrison Hardware semaphores require seqno values to be continuously incrementing. However, the scheduler's reordering of batch buffers means that the seqno values going through the hardware could be out of order. Thus semaphores can not be used. On the other hand, the scheduler super

[Intel-gfx] [PATCH v6 27/34] drm/i915: Added scheduler statistic reporting to debugfs

2016-04-20 Thread John . C . Harrison
From: John Harrison It is useful to know what the scheduler is doing for both debugging and performance analysis purposes. This change adds a bunch of counters and such that keep track of various scheduler operations (batches submitted, completed, flush requests, etc.). The data can then be read

[Intel-gfx] [PATCH v6 05/34] drm/i915: Re-instate request->uniq because it is extremely useful

2016-04-20 Thread John . C . Harrison
From: John Harrison The seqno value cannot always be used when debugging issues via trace points. This is because it can be reset back to start, especially during TDR type tests. Also, when the scheduler arrives the seqno is only valid while a given request is executing on the hardware. While the

[Intel-gfx] [PATCH v6 00/34] GPU scheduler for i915 driver

2016-04-20 Thread John . C . Harrison
From: John Harrison Implemented a batch buffer submission scheduler for the i915 DRM driver. The general theory of operation is that when batch buffers are submitted to the driver, the execbuffer() code assigns a unique seqno value and then packages up all the information required to execute the

[Intel-gfx] [PATCH v6 06/34] drm/i915: Start of GPU scheduler

2016-04-20 Thread John . C . Harrison
From: John Harrison Initial creation of scheduler source files. Note that this patch implements most of the scheduler functionality but does not hook it in to the driver yet. It also leaves the scheduler code in 'pass through' mode so that even when it is hooked in, it will not actually do very m

[Intel-gfx] [PATCH v6 02/34] drm/i915: Prelude to splitting i915_gem_do_execbuffer in two

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler decouples the submission of batch buffers to the driver from their submission to the hardware. This basically means splitting the execbuffer() function in half. This change rearranges some code ready for the split to occur. v5: Dropped runtime PM calls as they c

[Intel-gfx] [PATCH v6 10/34] drm/i915: Added scheduler hook into i915_gem_request_notify()

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler needs to know when requests have completed so that it can keep its own internal state up to date and can submit new requests to the hardware from its queue. v2: Updated due to changes in request handling. The operation is now reversed from before. Rather than th

[Intel-gfx] [PATCH v6 32/34] drm/i915: Allow scheduler to manage inter-ring object synchronisation

2016-04-20 Thread John . C . Harrison
From: John Harrison The scheduler has always tracked batch buffer dependencies based on DRM object usage. This means that it will not submit a batch on one ring that has outstanding dependencies still executing on other rings. This is exactly the same synchronisation performed by i915_gem_object_

[Intel-gfx] [PATCH v6 34/34] drm/i915: Scheduler state dump via debugfs

2016-04-20 Thread John . C . Harrison
From: John Harrison Added a facility for triggering the scheduler state dump via a debugfs entry. v2: New patch in series. v6: Updated to newer nightly (lots of ring -> engine renaming). Updated to use 'to_i915()' instead of dev_private. Converted all enum labels to uppercase. [review feedback

[Intel-gfx] [PATCH 1/2] igt/gem_ctx_param_basic: Updated to support scheduler priority interface

2016-04-20 Thread John . C . Harrison
From: John Harrison The GPU scheduler has added an execution priority level to the context object. There is an IOCTL interface to allow user apps/libraries to set this priority. This patch updates the context parameter IOCTL test to include the new interface. For: VIZ-1587 Signed-off-by: John Har

[Intel-gfx] [PATCH 2/2] igt/gem_scheduler: Add gem_scheduler test

2016-04-20 Thread John . C . Harrison
From: Derek Morton --- lib/Makefile.sources | 2 + lib/igt.h | 1 + lib/igt_bb_factory.c | 391 + lib/igt_bb_factory.h | 47 ++ tests/Makefile.sources | 1 + tests/gem_scheduler.c | 421 +++

[Intel-gfx] [PATCH v6 33/34] drm/i915: Added debug state dump facilities to scheduler

2016-04-20 Thread John . C . Harrison
From: John Harrison When debugging batch buffer submission issues, it is useful to be able to see what the current state of the scheduler is. This change adds functions for decoding the internal scheduler state and reporting it. v3: Updated a debug message with the new state_str() function. v4:

[Intel-gfx] [PATCH 1/7] drm/i915: Convert requests to use struct fence

2016-01-08 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH 5/7] drm/i915: Interrupt driven fences

2016-01-08 Thread John . C . Harrison
From: John Harrison The intended usage model for struct fence is that the signalled status should be set on demand rather than polled. That is, there should not be a need for a 'signaled' function to be called every time the status is queried. Instead, 'something' should be done to enable a signal

[Intel-gfx] [PATCH 3/7] drm/i915: Add per context timelines to fence object

2016-01-08 Thread John . C . Harrison
From: John Harrison The fence object used inside the request structure requires a sequence number. Although this is not used by the i915 driver itself, it could potentially be used by non-i915 code if the fence is passed outside of the driver. This is the intention as it allows external kernel dr

[Intel-gfx] [PATCH 6/7] drm/i915: Updated request structure tracing

2016-01-08 Thread John . C . Harrison
From: John Harrison Added the '_complete' trace event which occurs when a fence/request is signaled as complete. Also moved the notify event from the IRQ handler code to inside the notify function itself. v3: Added the current ring seqno to the notify trace point. v5: Line wrapping to keep the

[Intel-gfx] [PATCH 0/7] Convert requests to use struct fence

2016-01-08 Thread John . C . Harrison
From: John Harrison There is a construct in the Linux kernel called 'struct fence' that is intended to keep track of work that is executed on hardware. I.e. it solves the basic problem that the driver's 'struct drm_i915_gem_request' is trying to address. The request structure does quite a lot more

[Intel-gfx] [PATCH 2/7] drm/i915: Removed now redundant parameter to i915_gem_request_completed()

2016-01-08 Thread John . C . Harrison
From: John Harrison The change to the implementation of i915_gem_request_completed() means that the lazy coherency flag is no longer used. This can now be removed to simplify the interface. For: VIZ-5190 Signed-off-by: John Harrison --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu

[Intel-gfx] [PATCH 7/7] drm/i915: Cache last IRQ seqno to reduce IRQ overhead

2016-01-08 Thread John . C . Harrison
From: John Harrison The notify function can be called many times without the seqno changing. A large number of duplicates are to prevent races due to the requirement of not enabling interrupts until requested. However, when interrupts are enabled the IRQ handler can be called multiple times withou

[Intel-gfx] [PATCH 4/7] drm/i915: Delay the freeing of requests until retire time

2016-01-08 Thread John . C . Harrison
From: John Harrison The request structure is reference counted. When the count reached zero, the request was immediately freed and all associated objects were unreferenced/unallocated. This meant that the driver mutex lock must be held at the point where the count reaches zero. This was fine while

[Intel-gfx] [PATCH v4 04/38] drm/i915: Split i915_gem_do_execbuffer() in half

2016-01-11 Thread John . C . Harrison
From: John Harrison Split the execbuffer() function in half. The first half collects and validates all the information required to process the batch buffer. It also does all the object pinning, relocations, active list management, etc - basically anything that must be done upfront before the IOCT

[Intel-gfx] [PATCH v4 19/38] drm/i915: Added scheduler support to page fault handler

2016-01-11 Thread John . C . Harrison
From: John Harrison GPU page faults can now require scheduler operation in order to complete. For example, in order to free up sufficient memory to handle the fault the handler must wait for a batch buffer to complete that has not even been sent to the hardware yet. Thus EAGAIN no longer means a

[Intel-gfx] [PATCH v4 01/38] drm/i915: Add total count to context status debugfs output

2016-01-11 Thread John . C . Harrison
From: John Harrison When there are lots and lots and even more lots of contexts (e.g. when running with execlists) it is useful to be able to immediately see what the total context count is. v4: Re-typed a variable (review feedback from Joonas) For: VIZ-1587 Signed-off-by: John Harrison Review

[Intel-gfx] [PATCH v4 00/38] GPU scheduler for i915 driver

2016-01-11 Thread John . C . Harrison
From: John Harrison Implemented a batch buffer submission scheduler for the i915 DRM driver. The general theory of operation is that when batch buffers are submitted to the driver, the execbuffer() code assigns a unique seqno value and then packages up all the information required to execute the

[Intel-gfx] [PATCH v4 16/38] drm/i915: Added tracking/locking of batch buffer objects

2016-01-11 Thread John . C . Harrison
From: John Harrison The scheduler needs to track interdependencies between batch buffers. These are calculated by analysing the object lists of the buffers and looking for commonality. The scheduler also needs to keep those buffers locked long after the initial IOCTL call has returned to user lan

[Intel-gfx] [PATCH v4 18/38] drm/i915: Added scheduler support to __wait_request() calls

2016-01-11 Thread John . C . Harrison
From: John Harrison The scheduler can cause batch buffers, and hence requests, to be submitted to the ring out of order and asynchronously to their submission to the driver. Thus at the point of waiting for the completion of a given request, it is not even guaranteed that the request has actually
