Re: [Intel-gfx] [PATCH 20/24] drm/i915/gen9: Add WaEnableChickenDCPR

2016-06-03 Thread Matthew Auld
Reviewed-by: Matthew Auld ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] ✗ Ro.CI.BAT: warning for drm/i915/dsi: fix bxt split screen and color issue

2016-06-03 Thread Patchwork
== Series Details == Series: drm/i915/dsi: fix bxt split screen and color issue URL : https://patchwork.freedesktop.org/series/8232/ State : warning == Summary == Series 8232v1 drm/i915/dsi: fix bxt split screen and color issue http://patchwork.freedesktop.org/api/1.0/series/8232/revisions/1/m

Re: [Intel-gfx] [PATCH 25/25] drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch

2016-06-03 Thread Matthew Auld
What about skl, this also seems to need the WA until A0?

[Intel-gfx] [PATCH 04/21] drm/i915: Make queueing the hangcheck work inline

2016-06-03 Thread Chris Wilson
Since the function is a small wrapper around schedule_delayed_work(), move it inline to remove the function call overhead for the principal caller. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_drv.h | 18 +- drivers/gpu/drm/i915/i915_irq.

[Intel-gfx] Breadcrumbs, again

2016-06-03 Thread Chris Wilson
We have a major bottleneck in waiting with many clients that is impacting customer workloads. This is because we wake up every waiter after the GPU advance for them all to try and identify if they were the lucky one. The classic thundering herd, and the response is to only wake the next in the queu
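
The difference between waking every waiter and waking only the next one in the queue can be modelled in user space. The following is a minimal Python sketch, not the i915 code: the `WaitQueue` class, its method names and `count_wakeups()` are hypothetical, and seqno wraparound is ignored for brevity.

```python
import heapq


class WaitQueue:
    """Hypothetical model: waiters ordered by the seqno they wait for."""

    def __init__(self):
        self._heap = []    # (seqno, name) pairs, smallest seqno first
        self.wakeups = 0   # total number of waiters woken so far

    def add_waiter(self, seqno, name):
        heapq.heappush(self._heap, (seqno, name))

    def gpu_advanced_herd(self, completed):
        """Thundering herd: wake every waiter, each checks its own seqno."""
        self.wakeups += len(self._heap)
        done = sorted(w for w in self._heap if w[0] <= completed)
        self._heap = [w for w in self._heap if w[0] > completed]
        heapq.heapify(self._heap)
        return [name for _, name in done]

    def gpu_advanced_ordered(self, completed):
        """Wake only the waiters at the head of the queue that are complete."""
        done = []
        while self._heap and self._heap[0][0] <= completed:
            _, name = heapq.heappop(self._heap)
            self.wakeups += 1
            done.append(name)
        return done


def count_wakeups(ordered, clients=100):
    """Each client waits on its own seqno; the GPU completes one at a time."""
    queue = WaitQueue()
    for seqno in range(1, clients + 1):
        queue.add_waiter(seqno, f"client{seqno}")
    for completed in range(1, clients + 1):
        if ordered:
            queue.gpu_advanced_ordered(completed)
        else:
            queue.gpu_advanced_herd(completed)
    return queue.wakeups
```

In this model the number of wakeups grows quadratically with the number of clients when everyone is woken, and linearly when only the head of the queue is woken.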

[Intel-gfx] [PATCH 03/21] drm/i915: Remove the dedicated hangcheck workqueue

2016-06-03 Thread Chris Wilson
The queue only ever contains at most one item and has no special flags. It is just a very simple wrapper around the system-wq - a complication with no benefits. v2: Use the system_long_wq as we may wish to capture the error state after detecting the hang - which may take a bit of time. Signed-off

[Intel-gfx] [PATCH 05/21] drm/i915: Separate GPU hang waitqueue from advance

2016-06-03 Thread Chris Wilson
Currently __i915_wait_request uses a per-engine wait_queue_t for the dual purpose of waking after the GPU advances or for waking after an error. In the future, we may add even more wake sources and require greater separation, but for now we can conceptually simplify wakeups by separating the two so

[Intel-gfx] [PATCH 06/21] drm/i915: Slaughter the thundering i915_wait_request herd

2016-06-03 Thread Chris Wilson
One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batch

[Intel-gfx] [PATCH 10/21] drm/i915: Allocate scratch page from stolen

2016-06-03 Thread Chris Wilson
With the last direct CPU access to the scratch page removed, we can now allocate it from our small amount of reserved system pages (stolen memory). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drive

[Intel-gfx] [PATCH 01/21] drm/i915/shrinker: Flush active on objects before counting

2016-06-03 Thread Chris Wilson
As we inspect obj->active to decide how many objects we can shrink (we only shrink idle objects), it helps to flush the active lists first in order to have a more accurate count of available objects. Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_gem_shrin

[Intel-gfx] [PATCH 17/21] drm/i915: Convert trace-irq to the breadcrumb waiter

2016-06-03 Thread Chris Wilson
If we convert the tracing over from direct use of ring->irq_get() and over to the breadcrumb infrastructure, we only have a single user of the ring->irq_get and so we will be able to simplify the driver routines (eliminating the redundant validation and irq refcounting). v2: Move to a signaling fr

[Intel-gfx] [PATCH 02/21] drm/i915: Delay queuing hangcheck to wait-request

2016-06-03 Thread Chris Wilson
We can forgo queuing the hangcheck from the start of every request until we wait upon a request. This reduces the overhead of every request, but may increase the latency of detecting a hang. However, if nothing ever waits upon a hang, did it ever hang? It also improves the robustness of the wa

[Intel-gfx] [PATCH 08/21] drm/i915: Use HWS for seqno tracking everywhere

2016-06-03 Thread Chris Wilson
By using the same address for storing the HWS on every platform, we can remove the platform specific vfuncs and reduce the get-seqno routine to a single read of a cached memory location. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 6 +-- drivers/gpu/drm/i915/i915_

[Intel-gfx] [PATCH 19/21] drm/i915: Move the get/put irq locking into the caller

2016-06-03 Thread Chris Wilson
With only a single callsite for intel_engine_cs->irq_get and ->irq_put, we can reduce the code size by moving the common preamble into the caller, and we can also eliminate the reference counting. For completeness, as we are no longer doing reference counting on irq, rename the get/put vfunctions

[Intel-gfx] [PATCH 11/21] drm/i915: Refactor scratch object allocation for gen2 w/a buffer

2016-06-03 Thread Chris Wilson
The gen2 w/a buffer is stuffed into the same slot as the gen5+ scratch buffer. If we pass in the size we want to allocate for the scratch buffer, both callers can use the same routine. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c| 2 +- drivers/gpu/drm/i915/intel_rin

[Intel-gfx] [PATCH 20/21] drm/i915: Simplify enabling user-interrupts with L3-remapping

2016-06-03 Thread Chris Wilson
Borrow the idea from intel_lrc.c to precompute the mask of interrupts we wish to always enable to avoid having lots of conditionals inside the interrupt enabling. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_ringbuffer.c | 35 +++-- drivers/gpu/drm/i915/
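
The idea of precomputing an always-enabled interrupt mask can be illustrated with a small user-space sketch. This is Python, not driver code, and every name in it is an assumption made for illustration: the bit values, the `Engine` class, `irq_keep_mask` and `irq_enable_mask()` are hypothetical.

```python
# Hypothetical interrupt bits, chosen only for this illustration.
USER_INTERRUPT = 1 << 0
L3_PARITY_ERROR = 1 << 5


class Engine:
    """Hypothetical engine that decides its interrupt mask once, at init."""

    def __init__(self, has_l3_remapping):
        # Evaluate the platform conditional a single time and store the result.
        self.irq_keep_mask = L3_PARITY_ERROR if has_l3_remapping else 0

    def irq_enable_mask(self):
        # The enable path is now branch-free: OR in the precomputed bits.
        return USER_INTERRUPT | self.irq_keep_mask
```

The trade is a few bytes of per-engine state for the removal of a conditional from each enable call.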

[Intel-gfx] [PATCH 18/21] drm/i915: Embed signaling node into the GEM request

2016-06-03 Thread Chris Wilson
Under the assumption that enabling signaling will be a frequent operation, let's preallocate our attachments for signaling inside the request struct (and so benefiting from the slab cache). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/intel

[Intel-gfx] [PATCH 14/21] drm/i915: Only apply one barrier after a breadcrumb interrupt is posted

2016-06-03 Thread Chris Wilson
If we flag the seqno as potentially stale upon receiving an interrupt, we can use that information to reduce the frequency that we apply the heavyweight coherent seqno read (i.e. if we wake up a chain of waiters). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 15

[Intel-gfx] [PATCH 12/21] drm/i915: Add a delay between interrupt and inspecting the final seqno (ilk)

2016-06-03 Thread Chris Wilson
On Ironlake, there is no command nor register to ensure that the write from a MI_STORE command is completed (and coherent on the CPU) before the command parser continues. This means that the ordering between the seqno write and the subsequent user interrupt is undefined (like gen6+). So to ensure t

[Intel-gfx] [PATCH 09/21] drm/i915: Stop mapping the scratch page into CPU space

2016-06-03 Thread Chris Wilson
After the elimination of using the scratch page for Ironlake's breadcrumb, we no longer need to kmap the object. We therefore can move it into the high unmappable space and do not need to force the object to be coherent (i.e. snooped on !llc platforms). Signed-off-by: Chris Wilson --- drivers/gp

[Intel-gfx] [PATCH 16/21] drm/i915: Only query timestamp when measuring elapsed time

2016-06-03 Thread Chris Wilson
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as we only need to compute the elapsed time for a timed wait. v2: Eliminate the unused local variable reducing the function size by 64 bytes (using the storage space on the caller's stack rather than adding to our stack frame). Wr

[Intel-gfx] [PATCH 15/21] drm/i915: Stop setting wraparound seqno on initialisation

2016-06-03 Thread Chris Wilson
We have testcases to ensure that seqno wraparound works fine, so we can forgo forcing everyone to encounter seqno wraparound during early uptime. seqno wraparound incurs a full GPU stall so not forcing it will eliminate one jitter from the early system. Using the testcases, we have very determinist
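
For readers unfamiliar with seqno wraparound: a 32-bit sequence counter eventually wraps to zero, so "has this seqno passed?" cannot be a plain `>=` comparison. A common technique for wrapping counters is to test the sign of the 32-bit difference. The sketch below shows that technique in Python; the function name is hypothetical and this is an illustration, not a copy of the driver's code.

```python
U32_MASK = 0xFFFFFFFF


def seqno_passed(seq1, seq2):
    """Return True if seq1 is at or after seq2, allowing for u32 wraparound.

    Equivalent to checking that the difference, interpreted as a signed
    32-bit integer, is non-negative.
    """
    diff = (seq1 - seq2) & U32_MASK
    return diff < 0x80000000
```

The comparison is only meaningful while the two values are less than 2^31 apart.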

[Intel-gfx] [PATCH 07/21] drm/i915: Spin after waking up for an interrupt

2016-06-03 Thread Chris Wilson
When waiting for an interrupt (waiting for the GPU to complete some work), we know we are the single waiter for the GPU. We also know when the GPU has nearly completed our request (or at least started processing it), so after being woken and we detect that the GPU is almost finished, allow the bott

[Intel-gfx] [PATCH 13/21] drm/i915: Check the CPU cached value of seqno after waking the waiter

2016-06-03 Thread Chris Wilson
If we have multiple waiters, we may find that many complete on the same wake up. If we first inspect the seqno from the CPU cache, we may reduce the number of heavyweight coherent seqno reads we require. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 14 ++ 1 file
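
The cheap-check-first pattern described here can be modelled as follows. This is a user-space Python sketch under stated assumptions, not the i915 implementation: `SeqnoReader`, `read_cached()`, `read_coherent()` and `request_completed()` are hypothetical names, and seqno wraparound is ignored.

```python
class SeqnoReader:
    """Hypothetical model of a seqno with a cheap and an expensive read."""

    def __init__(self):
        self.hw_seqno = 0        # value the "GPU" has most recently written
        self.cached = 0          # possibly stale CPU-side copy
        self.coherent_reads = 0  # how often the expensive path was taken

    def read_cached(self):
        return self.cached

    def read_coherent(self):
        # Expensive path: refresh the cached copy from the hardware value.
        self.coherent_reads += 1
        self.cached = self.hw_seqno
        return self.cached


def request_completed(reader, seqno):
    """Check the cached value first; only then pay for a coherent read."""
    if reader.read_cached() >= seqno:
        return True
    return reader.read_coherent() >= seqno
```

When several waiters complete on the same wakeup, only the first one pays for the coherent read; the others are satisfied by the refreshed cached value.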

[Intel-gfx] [PATCH 21/21] drm/i915: Remove debug noise on detecting fault-injection of missed interrupts

2016-06-03 Thread Chris Wilson
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs for the handling of a "missed interrupt", adding it to the dmesg at INFO is just noise. When it happens for real, we still class it as an ERROR. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_irq.c | 3 --- 1 fi

[Intel-gfx] ✗ Ro.CI.BAT: failure for series starting with [01/21] drm/i915/shrinker: Flush active on objects before counting

2016-06-03 Thread Patchwork
== Series Details == Series: series starting with [01/21] drm/i915/shrinker: Flush active on objects before counting URL : https://patchwork.freedesktop.org/series/8246/ State : failure == Summary == Applying: drm/i915/shrinker: Flush active on objects before counting Applying: drm/i915: Dela

[Intel-gfx] The vma leak fix from yonder

2016-06-03 Thread Chris Wilson
Just to see if anyone is awake this series takes us to the VMA leak fix. Just the tip of the iceberg when it comes to VMA fixes... -Chris

[Intel-gfx] [PATCH 10/62] drm/i915: Allow userspace to request no-error-capture upon GPU hangs

2016-06-03 Thread Chris Wilson
igt likes to inject GPU hangs into its command streams. However, as we expect these hangs, we don't actually want them recorded in the dmesg output or stored in the i915_error_state (usually). To accommodate this, allow userspace to set a flag on the context that any hang emanating from that context

[Intel-gfx] [PATCH 03/62] drm/i915: Remove redundant queue_delayed_work() from throttle ioctl

2016-06-03 Thread Chris Wilson
We know, by design, that whilst the GPU is active (and thus we are throttling) the retire_worker is queued. Therefore attempting to requeue it with queue_delayed_work() is a no-op and we can safely remove it. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 3 --- 1 file changed

[Intel-gfx] [PATCH 15/62] drm/i915: Rename i915_gem_context_reference/unreference()

2016-06-03 Thread Chris Wilson
As these are wrappers around kref_get/kref_put() it is preferable to follow the naming convention and use the same verb get/put in our wrapper names for manipulating a reference to the context. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv

[Intel-gfx] [PATCH 14/62] drm/i915: Rename request reference/unreference to get/put

2016-06-03 Thread Chris Wilson
Now that we derive requests from struct fence, swap over to its nomenclature for references. It's shorter and more idiomatic across the kernel. s/i915_gem_request_reference/i915_gem_request_get/ s/i915_gem_request_unreference/i915_gem_request_put/ Signed-off-by: Chris Wilson --- drivers/gpu/drm

[Intel-gfx] [PATCH 01/62] drm/i915: Only start retire worker when idle

2016-06-03 Thread Chris Wilson
The retire worker is a low frequency task that makes sure we retire outstanding requests if userspace is being lax. We only need to start it once as it remains active until the GPU is idle, so do a cheap test before the more expensive queue_work(). A consequence of this is that we need correct lock

[Intel-gfx] [PATCH 02/62] drm/i915: Do not keep postponing the idle-work

2016-06-03 Thread Chris Wilson
Rather than persistently postponing the idle-work every time somebody calls i915_gem_retire_requests() (potentially ensuring that we never reach the idle state), queue the work the first time we detect all requests are complete. Then if in 100ms, more requests have been queued, we will abort the idl

[Intel-gfx] [PATCH 55/62] drm/i915: i915_vma_move_to_active prep patch

2016-06-03 Thread Chris Wilson
This patch is broken out of the next just to remove the code motion from that patch and make it more readable. What we do here is move the i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three stages (read, write, fenced) together so that future modifications to active handling are a

[Intel-gfx] [PATCH 19/62] drm/i915: Rename drm_gem_object_unreference_unlocked in preparation for lockless free

2016-06-03 Thread Chris Wilson
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 7 +++ drivers/gpu/drm/i915/i915_gem.c | 10 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- drivers/gpu/drm/i915/intel_display.c| 6 +++--- dri

[Intel-gfx] [PATCH 42/62] drm/i915: Simplify calling engine->sync_to

2016-06-03 Thread Chris Wilson
Since requests can no longer be generated as a side-effect of intel_ring_begin(), we know that the seqno will be unchanged during ring-emission. This predictability then means we do not have to check for the seqno wrapping around whilst emitting the semaphore for engine->sync_to(). Signed-off-by:

[Intel-gfx] [PATCH 54/62] drm/i915: Store owning file on the i915_address_space

2016-06-03 Thread Chris Wilson
For the global GTT (and aliasing GTT), the address space is owned by the device (it is a global resource) and so the per-file owner field is NULL. For per-process GTT (where we create an address space per context), each is owned by the opening file. We can use this ownership information to both dis

[Intel-gfx] [PATCH 21/62] drm/i915: Disable waitboosting for mmioflips/semaphores

2016-06-03 Thread Chris Wilson
Since commit a6f766f3975185af66a31a2cea2cd38721645999 Author: Chris Wilson Date: Mon Apr 27 13:41:20 2015 +0100 drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts and commit bcafc4e38b6ad03f48989b7ecaff03845b5b7acf Author: Chris Wilson Date: Mon Apr 27 13:41:21 2015 +0100

[Intel-gfx] [PATCH 18/62] drm/i915: Rename drm_gem_object_unreference in preparation for lockless free

2016-06-03 Thread Chris Wilson
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h | 7 +++ drivers/gpu/drm/i915/i915_gem.c | 26 +- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 4 ++-- drivers/gpu/drm/i915/i915_gem_context.c | 4 ++-- drivers/gpu/drm/

[Intel-gfx] [PATCH 24/62] drm/i915: Convert i915_semaphores_is_enabled over to early sanitize

2016-06-03 Thread Chris Wilson
Rather than recomputing whether semaphores are enabled, we can do that computation once during early initialisation as the i915.semaphores module parameter is now read-only. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 2 +- drivers/gpu/drm/i915/i915_drv.c |

[Intel-gfx] [PATCH 46/62] drm/i915: Refactor blocking waits

2016-06-03 Thread Chris Wilson
Tidy up the for loops that handle waiting for read/write vs read-only access. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 163 +++- 1 file changed, 78 insertions(+), 85 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/

[Intel-gfx] [PATCH 51/62] drm/i915: Move request list retirement to i915_gem_request.c

2016-06-03 Thread Chris Wilson
As the list retirement is now clean of implementation details, we can move it closer to the request management. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 41 - drivers/gpu/drm/i915/i915_gem_request.c | 33 ++

[Intel-gfx] [PATCH 34/62] drm/i915: Simplify request_alloc by returning the allocated request

2016-06-03 Thread Chris Wilson
It is simpler and leads to more readable code through the callstack if the allocation returns the allocated struct through the return value. The importance of this is that it no longer looks like we accidentally allocate requests as side-effect of calling certain functions. Signed-off-by: Chris W

[Intel-gfx] [PATCH 40/62] drm/i915: Remove duplicate golden render state init from execlists

2016-06-03 Thread Chris Wilson
Now that we use the same vfuncs for emitting the batch buffer in both execlists and legacy, the golden render state initialisation is identical between both. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_render_state.c | 23 +-- drivers/gpu/drm/i915/i915_gem_rende

[Intel-gfx] [PATCH 28/62] drm/i915: Rename backpointer from intel_ringbuffer to intel_engine_cs

2016-06-03 Thread Chris Wilson
Having ringbuf->ring point to an engine is confusing, so rename it once again to ring->engine. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_ringbuffer.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/dri

[Intel-gfx] [PATCH 20/62] drm/i915: Disable waitboosting for fence_wait()

2016-06-03 Thread Chris Wilson
We want to restrict waitboosting to known process contexts, where we can track which clients are receiving waitboosts and prevent excessive power wasting. For fence_wait() we do not have any client tracking and so that leaves it open to abuse. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915

[Intel-gfx] [PATCH 09/62] drm/i915: Record the ringbuffer associated with the request

2016-06-03 Thread Chris Wilson
The request tells us where to read the ringbuf from, so use that information to simplify the error capture. If no request was active at the time of the hang, the ring is idle and there is no information inside the ring pertaining to the hang. Note carefully that this will reduce the amount of info

[Intel-gfx] [PATCH 11/62] drm/i915: Clean up GPU hang message

2016-06-03 Thread Chris Wilson
Remove some redundant kernel messages as we deduce a hung GPU and capture the error state. v2: Fix "hang" vs "no progress" message whilst I was there Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_irq.c | 41 ++--- 1 file changed, 26 insertions(+),

[Intel-gfx] [PATCH 44/62] drm/i915: Prepare i915_gem_active for annotations

2016-06-03 Thread Chris Wilson
In the future, we will want to add annotations to the i915_gem_active struct. The API is thus expanded to hide direct access to the contents of i915_gem_active and mediated instead through a number of helpers. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 13 ++-- dr

[Intel-gfx] [PATCH 53/62] drm/i915: Split early global GTT initialisation

2016-06-03 Thread Chris Wilson
Initialising the global GTT is tricky as we wish to use the drm_mm range manager during the modesetting initialisation (to capture stolen allocations from the BIOS) before we actually enable GEM. To overcome this, we currently setup the drm_mm first and then carefully rebind them. Signed-off-by: C

[Intel-gfx] [PATCH 26/62] drm/i915: Rename request->ring to request->engine

2016-06-03 Thread Chris Wilson
In order to disambiguate between the pointer to the intel_engine_cs (called ring) and the intel_ringbuffer (called ringbuf), rename s/ring/engine/. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 3 +-- drivers/gpu/drm/i915/i915_gem.c | 6 ++ dri

[Intel-gfx] [PATCH 36/62] drm/i915: Convert engine->write_tail to operate on a request

2016-06-03 Thread Chris Wilson
If we rewrite the I915_WRITE_TAIL specialisation for the legacy ringbuffer as submitting the request onto the ringbuffer, we can unify the vfunc with both execlists and GuC in the next patch. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_request.c | 5 +-- drivers/gpu/drm/i915/i

[Intel-gfx] [PATCH 16/62] drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup

2016-06-03 Thread Chris Wilson
For symmetry with a forthcoming i915_gem_object_get() and i915_gem_object_put(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h| 18 ++- drivers/gpu/drm/i915/i915_gem.c| 56 +- drivers/gpu/drm/i915/i915_gem_tiling.c | 8 ++-

[Intel-gfx] [PATCH 38/62] drm/i915: Stop passing caller's num_dwords to engine->semaphore.signal()

2016-06-03 Thread Chris Wilson
Rather than pass in the num_dwords that the caller wishes to use after the signal command packet, split the breadcrumb emission into two phases and have both the signal and breadcrumb individually acquire space on the ring. This makes the interface simpler for the reader, and will simplify for pat

[Intel-gfx] [PATCH 29/62] drm/i915: Rename intel_context[engine].ringbuf

2016-06-03 Thread Chris Wilson
Perform s/ringbuf/ring/ on the context struct for consistency with the ring/engine split. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c| 8 drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_gem_context.c| 4 ++-- drivers/gp

[Intel-gfx] [PATCH 08/62] drm/i915: Remove stop-rings debugfs interface

2016-06-03 Thread Chris Wilson
Now that we have (near) universal GPU recovery code, we can inject a real hang from userspace and not need any fakery. Not only does this mean that the testing is far more realistic, but we can simplify the kernel in the process. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.

[Intel-gfx] [PATCH 12/62] drm/i915: Skip capturing an error state if we already have one

2016-06-03 Thread Chris Wilson
As we only ever keep the first error state around, we can avoid some work that can be quite intrusive if we don't record the error the second time around. This does move the race whereby the user could discard one error state as the second is being captured, but that race exists in the current code

[Intel-gfx] [PATCH 05/62] drm/i915: Add background commentary to "waitboosting"

2016-06-03 Thread Chris Wilson
Describe the intent of boosting the GPU frequency to maximum before waiting on the GPU. RPS waitboosting was introduced with commit b29c19b645287f7062e17d70fa4e9781a01a5d88 Author: Chris Wilson Date: Wed Sep 25 17:34:56 2013 +0100 drm/i915: Boost RPS frequency for CPU stalls but lacked a

[Intel-gfx] [PATCH 32/62] drm/i915: Rename intel_pin_and_map_ring()

2016-06-03 Thread Chris Wilson
For more consistent oop-naming, we would use intel_ring_verb, so pick intel_ring_pin() and intel_ring_unpin(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_lrc.c| 4 ++-- drivers/gpu/drm/i915/intel_ringbuffer.c | 38 - drivers/gpu/drm/i915/i

[Intel-gfx] [PATCH 47/62] drm/i915: Rename request->list to link for consistency

2016-06-03 Thread Chris Wilson
We use "list" to denote the list and "link" to denote an element on that list. Rename request->list to match this idiom. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_debugfs.c | 4 ++-- drivers/gpu/drm/i915/i915_gem.c | 10 +- drivers/gpu/drm/i915/i915_gem_reque

[Intel-gfx] [PATCH 50/62] drm/i915: Double check activity before relocations

2016-06-03 Thread Chris Wilson
If the object is active and we need to perform a relocation upon it, we need to take the slow relocation path. Before we do, double check the active requests to see if they have completed. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++- 1 file ch

[Intel-gfx] [PATCH 06/62] drm/i915: Flush the RPS bottom-half when the GPU idles

2016-06-03 Thread Chris Wilson
Make sure that the RPS bottom-half is flushed before we set the idle frequency when we decide the GPU is idle. This should prevent any races with the bottom-half and setting the idle frequency, and ensures that the bottom-half is bounded by the GPU's rpm reference taken for when it is active (i.e.

[Intel-gfx] [PATCH 27/62] drm/i915: Rename request->ringbuf to request->ring

2016-06-03 Thread Chris Wilson
Now that we have disambiguated ring and engine, we can use the clearer and more consistent name for the intel_ringbuffer pointer in the request. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c| 4 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +- drivers/gpu/drm

[Intel-gfx] [PATCH 59/62] drm/i915: Track active vma requests

2016-06-03 Thread Chris Wilson
Hook the vma itself into the i915_gem_request_retire() so that we can accurately track when a solitary vma is inactive (as opposed to having to wait for the entire object to be idle). This improves the interaction when using multiple contexts (with full-ppgtt) and eliminates some frequent list walk

[Intel-gfx] [PATCH 25/62] drm/i915: Unify intel_logical_ring_emit and intel_ring_emit

2016-06-03 Thread Chris Wilson
Both perform the same actions with more or less indirection, so just unify the code. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c| 54 ++--- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 53 ++--- drivers/gpu/drm/i915/i915_gem_gtt.c| 62 ++--- drivers/gpu

[Intel-gfx] [PATCH 17/62] drm/i915: Wrap drm_gem_object_reference in i915_gem_object_get

2016-06-03 Thread Chris Wilson
Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h| 10 +- drivers/gpu/drm/i915/i915_gem.c| 4 ++-- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 3 +-- drivers/gpu/drm/i915/i915_gem_evict.c | 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |

[Intel-gfx] [PATCH 45/62] drm/i915: Mark up i915_gem_active for locking annotation

2016-06-03 Thread Chris Wilson
The future annotations will track the locking used for access to ensure that it is always sufficient. We make the preparations now to present the API ahead and to make sure that GCC can eliminate the unused parameter. Before: 6298417 3619610 696320 10614347 a1f64b vmlinux After: 6298417

[Intel-gfx] [PATCH 58/62] drm/i915: Kill drop_pages()

2016-06-03 Thread Chris Wilson
The drop_pages() function is a dangerous trap in that it can release the passed in object pointer and so unless the caller is aware, it can easily trick us into using the stale object afterwards. Move it into its solitary callsite where we know it is safe. Signed-off-by: Chris Wilson --- drivers

[Intel-gfx] [PATCH 49/62] drm/i915: Refactor activity tracking for requests

2016-06-03 Thread Chris Wilson
With the introduction of requests, we amplified the number of atomic refcounted objects we use and update every execbuffer; from none to several references, and a set of references that need to be changed. We also introduced interesting side-effects in the order of retiring requests and objects. I

[Intel-gfx] [PATCH 57/62] drm/i915: Be more careful when unbinding vma

2016-06-03 Thread Chris Wilson
When we call i915_vma_unbind(), we will wait upon outstanding rendering. This will also trigger a retirement phase, which may update the object lists. If we extend request tracking to the VMA itself (rather than keep it at the encompassing object), then there is a potential that the obj->vma_list

[Intel-gfx] [PATCH 48/62] drm/i915: Remove obsolete i915_gem_object_flush_active()

2016-06-03 Thread Chris Wilson
Since we track requests, and requests are always added to the GPU fully formed, we never have to flush the incomplete request and know that the given request will eventually complete without any further action on our part. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 58 +++-

[Intel-gfx] [PATCH 13/62] drm/i915: Derive GEM requests from dma-fence

2016-06-03 Thread Chris Wilson
dma-buf provides a generic fence class for interoperation between drivers. Internally we use the request structure as a fence, and so with only a little bit of interfacing we can rebase those requests on top of dma-buf fences. This will allow us, in the future, to pass those fences back to userspac

[Intel-gfx] [PATCH 60/62] drm/i915: Release vma when the handle is closed

2016-06-03 Thread Chris Wilson
In order to prevent a leak of the vma on shared objects, we need to hook into the object_close callback to destroy the vma on the object for this file. However, if we destroyed that vma immediately we may cause unexpected application stalls as we try to unbind a busy vma - hence we defer the unbind

[Intel-gfx] [PATCH 52/62] drm/i915: Amalgamate GGTT/ppGTT vma debug list walkers

2016-06-03 Thread Chris Wilson
As we can now have multiple VMA inside the global GTT (with partial mappings, rotations, etc), it is no longer true that there may just be a single GGTT entry and so we should walk the full vma_list to count up the actual usage. In addition to unifying the two walkers, switch from multiplying the o

[Intel-gfx] [PATCH 61/62] drm/i915: Mark the context and address space as closed

2016-06-03 Thread Chris Wilson
When the user closes the context, mark it and the dependent address space as closed. As we use an asynchronous destruct method, this has two purposes. First it allows us to flag the closed context and detect internal errors if we were to create any new objects for it (as it is removed from the user's nam

[Intel-gfx] [PATCH 23/62] drm/i915: Rename ring->virtual_start as ring->vaddr

2016-06-03 Thread Chris Wilson
Just a different colour to better match virtual addresses elsewhere. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_irq.c | 8 drivers/gpu/drm/i915/intel_ringbuffer.c | 9 - drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++-- 3 files changed, 10 insertions(+), 1

[Intel-gfx] [PATCH 62/62] Revert "drm/i915: Clean up associated VMAs on context destruction"

2016-06-03 Thread Chris Wilson
This reverts commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae. The patch was only a stop-gap measure that fixed half the problem - the leak of the fbcon when restarting X. A complete solution required releasing the VMA when the object itself was closed rather than rely on file/process exit. The pre

[Intel-gfx] [PATCH 41/62] drm/i915: Unify legacy/execlists submit_execbuf callbacks

2016-06-03 Thread Chris Wilson
Now that emitting requests is identical between legacy and execlists, we can use the same function to build up the ring for submitting to either engine. (With the exception of i915_switch_contexts(), but in time that will also be handled gracefully.) Signed-off-by: Chris Wilson --- drivers/gpu/d

[Intel-gfx] [PATCH 37/62] drm/i915: Unify request submission

2016-06-03 Thread Chris Wilson
Move request submission from emit_request into its own common vfunc from i915_add_request(). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_request.c| 8 +++--- drivers/gpu/drm/i915/i915_guc_submission.c | 4 +-- drivers/gpu/drm/i915/intel_guc.h | 2 +- drivers/gp

[Intel-gfx] [PATCH 31/62] drm/i915: Rename residual ringbuf parameters

2016-06-03 Thread Chris Wilson
Now that we have a clear ring/engine split and a struct intel_ring, we no longer need the stopgap ringbuf names. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_ringbuffer.c | 66 - drivers/gpu/drm/i915/intel_ringbuffer.h | 6 +-- 2 files changed, 36 i

[Intel-gfx] [PATCH 56/62] drm/i915: Count how many VMA are bound for an object

2016-06-03 Thread Chris Wilson
Since we may have VMA allocated for an object, but we interrupted their binding, there is a disparity between having elements on the obj->vma_list and being bound. i915_gem_obj_bound_any() does this check, but this is not rigorously observed - add an explicit count to make it easier. Signed-off-by:

[Intel-gfx] [PATCH 30/62] drm/i915: Rename struct intel_ringbuffer to struct intel_ring

2016-06-03 Thread Chris Wilson
The state stored in this struct is not only the information about the buffer object, but the ring used to communicate with the hardware. Using buffer here is overly specific and, for me at least, conflates with the notion of buffer objects themselves. Signed-off-by: Chris Wilson --- drivers/gpu/

[Intel-gfx] [PATCH 35/62] drm/i915: Unify legacy/execlists emission of MI_BATCHBUFFER_START

2016-06-03 Thread Chris Wilson
Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly the same thing, add MI_BATCHBUFFER_START to the request's ringbuffer - we need only one vfunc. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 +-- drivers/gpu/drm/i915/i915_gem_render_state

[Intel-gfx] [PATCH 07/62] drm/i915: Remove temporary RPM wakeref assert disables

2016-06-03 Thread Chris Wilson
Now that the last couple of hacks have been removed from the runtime power management users, we can fully enable the asserts. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_drv.h | 7 --- 1 file changed, 7 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/d

[Intel-gfx] [PATCH 39/62] drm/i915: Reuse legacy breadcrumbs + tail emission

2016-06-03 Thread Chris Wilson
As GEN6+ is now a simple variant on the basic breadcrumbs + tail write, reuse the common code. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/intel_ringbuffer.c | 68 + 1 file changed, 27 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/inte

[Intel-gfx] [PATCH 22/62] drm/i915: Treat ringbuffer writes as write to normal memory

2016-06-03 Thread Chris Wilson
Ringbuffers are now being written to either through LLC or WC paths, so treating them as simply iomem is no longer adequate. However, for the older !llc hardware, the hardware is documented as treating the TAIL register update as serialising, so we can relax the barriers when filling the rings (b

[Intel-gfx] [PATCH 43/62] drm/i915: Introduce i915_gem_active for request tracking

2016-06-03 Thread Chris Wilson
In the next patch, request tracking is made more generic and for that we need a new expanded struct and to separate out the logic changes from the mechanical churn, we split out the structure renaming into this patch. v2: Writer's block. Add some spiel about why we track requests. v3: Now i915_gem

[Intel-gfx] [PATCH 04/62] drm/i915: Restore waitboost credit to the synchronous waiter

2016-06-03 Thread Chris Wilson
Ideally, we want to automagically have the GPU respond to the instantaneous load by reclocking itself. However, reclocking occurs relatively slowly, and to the client waiting for a result from the GPU, too late. To compensate and reduce the client latency, we allow the first wait from a client to b

[Intel-gfx] [PATCH 33/62] drm/i915: Remove obsolete engine->gpu_caches_dirty

2016-06-03 Thread Chris Wilson
Space for flushing the GPU cache prior to completing the request is preallocated and so cannot fail. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_context.c| 2 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +--- drivers/gpu/drm/i915/i915_gem_gtt.c| 18

[Intel-gfx] [PATCH 01/38] drm/i915: Combine loops within i915_gem_evict_something

2016-06-03 Thread Chris Wilson
Slight micro-optimisation to combine loops so that gcc is able to optimise the inner-loops concisely. Since we are reviewing the loops, we can update the comments to describe the current state of affairs, in particular the distinction between evicting from the global GTT (which may contain untr

[Intel-gfx] [PATCH 03/38] drm/i915: Double check the active status on the batch pool

2016-06-03 Thread Chris Wilson
We should not rely on obj->active being up to date unless we manually flush it. Instead, we can verify that the next available batch object is idle by looking at its last active request (and checking it for completion). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_batch_pool.c | 1
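
A rough sketch of "look at the last request and check it for completion", under stated assumptions: requests carry a 32-bit sequence number, the hardware reports the last sequence number it completed, and completion is a wraparound-safe comparison of the two. The types and helpers below (demo_request, demo_batch, demo_seqno_passed, demo_batch_is_idle) are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the i915 structures. */
struct demo_request {
    uint32_t seqno; /* sequence number assigned when the request was queued */
};

struct demo_batch {
    const struct demo_request *last_request; /* NULL if never used */
};

/*
 * Wraparound-safe "has a reached b?": the unsigned difference is
 * reinterpreted as signed, which behaves as intended on the usual
 * two's complement targets.
 */
static bool demo_seqno_passed(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

/*
 * Decide idleness from the request itself rather than from a cached
 * "active" flag that may be stale.
 */
static bool demo_batch_is_idle(const struct demo_batch *batch, uint32_t hw_seqno)
{
    if (batch->last_request == NULL)
        return true;
    return demo_seqno_passed(hw_seqno, batch->last_request->seqno);
}
```

The point of the design is that the answer is derived from the request's own completion state at the time of the query, so no separate flush of a cached flag is needed first.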

[Intel-gfx] Tracking VMA

2016-06-03 Thread Chris Wilson
One issue with the current VMA api is that callers do not take ownership of the VMA they pin for their use, and correspondingly never explicitly unpin it. Being able to track the VMA they are using, imo, allows for simpler code that is more easily verified (and is faster and more accurate - less gues

[Intel-gfx] [PATCH 04/38] drm/i915: Remove request retirement before each batch

2016-06-03 Thread Chris Wilson
This reimplements the denial-of-service protection against igt from commit 227f782e4667fc622810bce8be8ccdeee45f89c2 Author: Chris Wilson Date: Thu May 15 10:41:42 2014 +0100 drm/i915: Retire requests before creating a new one and transfers the stall from before each batch into get_pages()

[Intel-gfx] [PATCH 05/38] drm/i915: Remove i915_gem_execbuffer_retire_commands()

2016-06-03 Thread Chris Wilson
Move the single line to the callsite as the name is now misleading, and the purpose is solely to add the request to the execution queue. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/drivers/

[Intel-gfx] [PATCH 09/38] drm/i915: Start passing around i915_vma from execbuffer

2016-06-03 Thread Chris Wilson
During execbuffer we look up the i915_vma in order to reserve them in the VM. However, we then do a double lookup of the vma in order to then pin them, all because we lack the necessary interfaces to operate on i915_vma. v2: Tidy parameter lists to remove one level of redirection in the hot path.

[Intel-gfx] [PATCH 13/38] drm/i915: Move obj->active:5 to obj->flags

2016-06-03 Thread Chris Wilson
We are motivated to avoid using a bitfield for obj->active for a couple of reasons. Firstly, we wish to document our lockless read of obj->active using READ_ONCE inside i915_gem_busy_ioctl() and that requires an integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code when presented
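
To illustrate the first reason: C does not allow taking the address of a bitfield member, so a pointer-based single read such as READ_ONCE cannot be applied to one. Packing the bits into an ordinary integer works around that. The sketch below is a userspace approximation only; the 5-bit width follows the patch title, while the position of the field inside flags, the macro names, and the volatile read standing in for READ_ONCE are all assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: the low 5 bits of flags hold the per-engine active mask. */
#define DEMO_ACTIVE_SHIFT 0
#define DEMO_ACTIVE_BITS  5
#define DEMO_ACTIVE_MASK  (((1ul << DEMO_ACTIVE_BITS) - 1) << DEMO_ACTIVE_SHIFT)

/* Hypothetical stand-in; this is not the i915 object structure. */
struct demo_object {
    unsigned long flags;
};

/* Userspace stand-in for a READ_ONCE-style load: one volatile read. */
static unsigned long demo_read_flags(const struct demo_object *obj)
{
    return *(const volatile unsigned long *)&obj->flags;
}

/* Extract the active mask from a single read of the whole word. */
static unsigned long demo_get_active(const struct demo_object *obj)
{
    return (demo_read_flags(obj) & DEMO_ACTIVE_MASK) >> DEMO_ACTIVE_SHIFT;
}

static void demo_set_active(struct demo_object *obj, unsigned int engine)
{
    assert(engine < DEMO_ACTIVE_BITS);
    obj->flags |= 1ul << (engine + DEMO_ACTIVE_SHIFT);
}

static void demo_clear_active(struct demo_object *obj, unsigned int engine)
{
    assert(engine < DEMO_ACTIVE_BITS);
    obj->flags &= ~(1ul << (engine + DEMO_ACTIVE_SHIFT));
}

static bool demo_is_active(const struct demo_object *obj)
{
    return demo_get_active(obj) != 0;
}
```

The explicit shift and mask also make the generated code predictable, since the compiler sees plain integer operations rather than bitfield accesses.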

[Intel-gfx] [PATCH 06/38] drm/i915: Pad GTT views of exec objects up to user specified size

2016-06-03 Thread Chris Wilson
Our GPUs impose certain requirements upon buffers that depend upon how exactly they are used. Typically this is expressed as that they require a larger surface than would be naively computed by pitch * height. Normally such requirements are hidden away in the userspace driver, but when we accept po

[Intel-gfx] [PATCH 11/38] drm/i915: Make fb_tracking.lock a spinlock

2016-06-03 Thread Chris Wilson
We only need a very lightweight mechanism here as the locking is only used for co-ordinating a bitfield. Also double check that the object is still pinned to the display plane before processing the state change. v2: Move the cheap unlikely tests into the caller Signed-off-by: Chris Wilson ---

[Intel-gfx] [PATCH 07/38] drm/i915: Split insertion/binding of an object into the VM

2016-06-03 Thread Chris Wilson
Split the insertion into the address space's range manager and binding of that object into the GTT to simplify the code flow when pinning a VMA. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 33 +++-- 1 file changed, 15 insertions(+), 18 deletions(

[Intel-gfx] [PATCH 10/38] drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin()

2016-06-03 Thread Chris Wilson
Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for i915_gem_object_ggtt_pin(), spare us the confusion and remove it. Removing it now simplifies later patches to change the i915_vma_pin() (and friends) interface. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_drv.h
