Reviewed-by: Matthew Auld
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
== Series Details ==
Series: drm/i915/dsi: fix bxt split screen and color issue
URL : https://patchwork.freedesktop.org/series/8232/
State : warning
== Summary ==
Series 8232v1 drm/i915/dsi: fix bxt split screen and color issue
http://patchwork.freedesktop.org/api/1.0/series/8232/revisions/1/m
What about skl, this also seems to need the WA until A0?
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principal
caller.
Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
---
drivers/gpu/drm/i915/i915_drv.h | 18 +-
drivers/gpu/drm/i915/i915_irq.
We have a major bottleneck in waiting with many clients that is
impacting customer workloads. This is because we wake up every waiter
after the GPU advances for them all to try and identify if they were the
lucky one. The classic thundering herd, and the response is to only wake
the next in the queu
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.
v2: Use the system_long_wq as we may wish to capture the error state
after detecting the hang - which may take a bit of time.
Signed-off
Currently __i915_wait_request uses a per-engine wait_queue_t for the dual
purpose of waking after the GPU advances or for waking after an error.
In the future, we may add even more wake sources and require greater
separation, but for now we can conceptually simplify wakeups by separating
the two so
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batch
With the last direct CPU access to the scratch page removed, we can now
allocate it from our small amount of reserved system pages (stolen
memory).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drive
As we inspect obj->active to decide how many objects we can shrink (we
only shrink idle objects), it helps to flush the active lists first
in order to have a more accurate count of available objects.
Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
---
drivers/gpu/drm/i915/i915_gem_shrin
If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).
v2: Move to a signaling fr
We can forgo queuing the hangcheck from the start of every request to
until we wait upon a request. This reduces the overhead of every
request, but may increase the latency of detecting a hang. However, if
nothing ever waits upon a hang, did it ever hang? It also improves the
robustness of the wa
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 6 +--
drivers/gpu/drm/i915/i915_
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.
For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions
The gen2 w/a buffer is stuffed into the same slot as the gen5+ scratch
buffer. If we pass in the size we want to allocate for the scratch
buffer, both callers can use the same routine.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_lrc.c| 2 +-
drivers/gpu/drm/i915/intel_rin
Borrow the idea from intel_lrc.c to precompute the mask of interrupts we
wish to always enable to avoid having lots of conditionals inside the
interrupt enabling.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 35 +++--
drivers/gpu/drm/i915/
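The precomputed-mask idea can be sketched in plain C as follows. This is
only an illustration, not the code from the patch: the demo_* names, the
bit definitions and the "set bit means masked" register convention are
all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical interrupt bits; the real i915 register layout differs. */
#define DEMO_IRQ_USER_INTERRUPT  (1u << 0)
#define DEMO_IRQ_CONTEXT_SWITCH  (1u << 1)
#define DEMO_IRQ_PARITY_ERROR    (1u << 2)

struct demo_engine {
	uint32_t irq_keep_mask;   /* bits left unmasked at all times */
	uint32_t irq_enable_mask; /* bits unmasked only while someone waits */
};

/* Run once at init: every platform conditional lives here. */
static void demo_engine_init_irq(struct demo_engine *engine,
				 bool has_context_switch_irq,
				 bool has_parity_irq)
{
	assert(engine);

	engine->irq_enable_mask = DEMO_IRQ_USER_INTERRUPT;
	engine->irq_keep_mask = 0;
	if (has_context_switch_irq)
		engine->irq_keep_mask |= DEMO_IRQ_CONTEXT_SWITCH;
	if (has_parity_irq)
		engine->irq_keep_mask |= DEMO_IRQ_PARITY_ERROR;
}

/* Value for a (hypothetical) mask register while a waiter is present. */
static uint32_t demo_engine_imr_enabled(const struct demo_engine *engine)
{
	return ~(engine->irq_enable_mask | engine->irq_keep_mask);
}

/* Value for the same register once the last waiter has gone. */
static uint32_t demo_engine_imr_disabled(const struct demo_engine *engine)
{
	return ~engine->irq_keep_mask;
}
```

The enable and disable paths then reduce to a single register write with
no per-platform branching.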
Under the assumption that enabling signaling will be a frequent
operation, let's preallocate our attachments for signaling inside the
request struct (and so benefiting from the slab cache).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 1 +
drivers/gpu/drm/i915/intel
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 15
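A minimal userspace sketch of the "flag the seqno as stale on interrupt"
idea, using C11 atomics. It is an assumption-laden illustration rather
than the driver code: the demo_* names are hypothetical, hw_seqno merely
stands in for the hardware status page, and coherent_reads exists only to
count how often the expensive path is taken.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_engine {
	atomic_bool irq_posted;      /* set by the interrupt handler */
	uint32_t cached_seqno;       /* last value observed by the CPU */
	uint32_t hw_seqno;           /* stand-in for the hardware seqno */
	unsigned int coherent_reads; /* number of heavyweight reads issued */
};

static void demo_engine_init(struct demo_engine *e)
{
	atomic_init(&e->irq_posted, false);
	e->cached_seqno = 0;
	e->hw_seqno = 0;
	e->coherent_reads = 0;
}

/* Interrupt path: only note that the cached value may be stale. */
static void demo_irq_handler(struct demo_engine *e)
{
	atomic_store(&e->irq_posted, true);
}

/* Stand-in for the heavyweight coherent read. */
static uint32_t demo_coherent_read(struct demo_engine *e)
{
	e->coherent_reads++;
	return e->hw_seqno;
}

/* Waiter path: pay for the coherent read only after an interrupt. */
static uint32_t demo_get_seqno(struct demo_engine *e)
{
	if (atomic_exchange(&e->irq_posted, false))
		e->cached_seqno = demo_coherent_read(e);
	return e->cached_seqno;
}
```

When one interrupt wakes a chain of waiters, only the first call after
the interrupt takes the expensive path; the rest reuse the cached value.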
On Ironlake, there is no command nor register to ensure that the write
from a MI_STORE command is completed (and coherent on the CPU) before the
command parser continues. This means that the ordering between the seqno
write and the subsequent user interrupt is undefined (like gen6+). So to
ensure t
After the elimination of using the scratch page for Ironlake's
breadcrumb, we no longer need to kmap the object. We therefore can move
it into the high unmappable space and do not need to force the object to
be coherent (i.e. snooped on !llc platforms).
Signed-off-by: Chris Wilson
---
drivers/gp
Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
we only need to compute the elapsed time for a timed wait.
v2: Eliminate the unused local variable reducing the function size by 64
bytes (using the storage space on the caller's stack rather than adding
to our stack frame). Wr
We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall so not forcing it
will eliminate one jitter from the early system. Using the testcases, we
have very determinist
When waiting for an interrupt (waiting for the GPU to complete some
work), we know we are the single waiter for the GPU. We also know when
the GPU has nearly completed our request (or at least started processing
it), so if, after being woken, we detect that the GPU is almost finished,
allow the bott
If we have multiple waiters, we may find that many complete on the same
wake up. If we first inspect the seqno from the CPU cache, we may reduce
the number of heavyweight coherent seqno reads we require.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 14 ++
1 file
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 3 ---
1 fi
== Series Details ==
Series: series starting with [01/21] drm/i915/shrinker: Flush active on objects
before counting
URL : https://patchwork.freedesktop.org/series/8246/
State : failure
== Summary ==
Applying: drm/i915/shrinker: Flush active on objects before counting
Applying: drm/i915: Dela
Just to see if anyone is awake this series takes us to the VMA leak fix.
Just the tip of the iceberg when it comes to VMA fixes...
-Chris
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this
allow userspace to set a flag on the context that any hang emanating
from that context
We know, by design, that whilst the GPU is active (and thus we are
throttling) the retire_worker is queued. Therefore attempting to requeue
it with queue_delayed_work() is a no-op and we can safely remove it.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 3 ---
1 file changed
As these are wrappers around kref_get/kref_put() it is preferable to
follow the naming convention and use the same verb get/put in our
wrapper names for manipulating a reference to the context.
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
---
drivers/gpu/drm/i915/i915_drv
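For readers unfamiliar with the convention, the get/put naming can be
illustrated with a small userspace reference count. This is a sketch
under stated assumptions: demo_context and its helpers are hypothetical,
and a C11 atomic counter stands in for the kernel's struct kref. The
return value of the put helper mirrors kref_put(), which returns 1 when
the object was released.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct demo_context {
	atomic_uint ref; /* stands in for struct kref */
};

/* Allocate a context holding one reference for the caller. */
static struct demo_context *demo_context_create(void)
{
	struct demo_context *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;
	atomic_init(&ctx->ref, 1);
	return ctx;
}

/* "get" acquires a reference and returns the pointer for chaining. */
static struct demo_context *demo_context_get(struct demo_context *ctx)
{
	atomic_fetch_add(&ctx->ref, 1);
	return ctx;
}

/* "put" drops a reference; returns 1 if this call freed the object. */
static int demo_context_put(struct demo_context *ctx)
{
	if (atomic_fetch_sub(&ctx->ref, 1) == 1) {
		free(ctx);
		return 1;
	}
	return 0;
}
```

Using the same verb pair in the wrapper as in the primitive it wraps
makes it obvious at the call site which calls must balance.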
Now that we derive requests from struct fence, swap over to its
nomenclature for references. It's shorter and more idiomatic across the
kernel.
s/i915_gem_request_reference/i915_gem_request_get/
s/i915_gem_request_unreference/i915_gem_request_put/
Signed-off-by: Chris Wilson
---
drivers/gpu/drm
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct lock
Rather than persistently postponing the idle-work every time somebody
calls i915_gem_retire_requests() (potentially ensuring that we never
reach the idle state), queue the work the first time we detect all
requests are complete. Then if in 100ms, more requests have been queued,
we will abort the idl
This patch is broken out of the next just to remove the code motion from
that patch and make it more readable. What we do here is move the
i915_vma_move_to_active() to i915_gem_execbuffer.c and put the three
stages (read, write, fenced) together so that future modifications to
active handling are a
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 7 +++
drivers/gpu/drm/i915/i915_gem.c | 10 +-
drivers/gpu/drm/i915/i915_gem_tiling.c | 2 +-
drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
drivers/gpu/drm/i915/intel_display.c| 6 +++---
dri
Since requests can no longer be generated as a side-effect of
intel_ring_begin(), we know that the seqno will be unchanged during
ring-emission. This predictability then means we do not have to check
for the seqno wrapping around whilst emitting the semaphore for
engine->sync_to().
Signed-off-by:
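As background on what "seqno wrapping around" means here, a wrap-safe
ordering test on 32-bit sequence numbers can be sketched as below. The
helper name demo_seqno_passed() is hypothetical and the snippet is an
illustration of the general technique, not code taken from this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if seq1 is at or after seq2, treating the 32-bit counter
 * as circular. The unsigned subtraction wraps modulo 2^32, and the
 * signed reinterpretation is non-negative when seq1 is less than 2^31
 * steps ahead of seq2.
 */
static bool demo_seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}
```

A comparison of this shape keeps working across the wrap, e.g. seqno 1 is
treated as later than 0xfffffff0.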
For the global GTT (and aliasing GTT), the address space is owned by the
device (it is a global resource) and so the per-file owner field is
NULL. For per-process GTT (where we create an address space per
context), each is owned by the opening file. We can use this ownership
information to both dis
Since
commit a6f766f3975185af66a31a2cea2cd38721645999
Author: Chris Wilson
Date: Mon Apr 27 13:41:20 2015 +0100
drm/i915: Limit ring synchronisation (sw sempahores) RPS boosts
and
commit bcafc4e38b6ad03f48989b7ecaff03845b5b7acf
Author: Chris Wilson
Date: Mon Apr 27 13:41:21 2015 +0100
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h | 7 +++
drivers/gpu/drm/i915/i915_gem.c | 26 +-
drivers/gpu/drm/i915/i915_gem_batch_pool.c | 4 ++--
drivers/gpu/drm/i915/i915_gem_context.c | 4 ++--
drivers/gpu/drm/
Rather than recomputing whether semaphores are enabled, we can do that
computation once during early initialisation as the i915.semaphores
module parameter is now read-only.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 2 +-
drivers/gpu/drm/i915/i915_drv.c |
Tidy up the for loops that handle waiting for read/write vs read-only
access.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 163 +++-
1 file changed, 78 insertions(+), 85 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/
As the list retirement is now clean of implementation details, we can
move it closer to the request management.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 41 -
drivers/gpu/drm/i915/i915_gem_request.c | 33 ++
It is simpler and leads to more readable code through the callstack if
the allocation returns the allocated struct through the return value.
The importance of this is that it no longer looks like we accidentally
allocate requests as a side-effect of calling certain functions.
Signed-off-by: Chris W
Now that we use the same vfuncs for emitting the batch buffer in both
execlists and legacy, the golden render state initialisation is
identical between both.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_render_state.c | 23 +--
drivers/gpu/drm/i915/i915_gem_rende
Having ringbuf->ring point to an engine is confusing, so rename it once
again to ring->engine.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
b/dri
We want to restrict waitboosting to known process contexts, where we can
track which clients are receiving waitboosts and prevent excessive power
wasting. For fence_wait() we do not have any client tracking and so that
leaves it open to abuse.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915
The request tells us where to read the ringbuf from, so use that
information to simplify the error capture. If no request was active at
the time of the hang, the ring is idle and there is no information
inside the ring pertaining to the hang.
Note carefully that this will reduce the amount of info
Remove some redundant kernel messages as we deduce a hung GPU and
capture the error state.
v2: Fix "hang" vs "no progress" message whilst I was there
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 41 ++---
1 file changed, 26 insertions(+),
In the future, we will want to add annotations to the i915_gem_active
struct. The API is thus expanded to hide direct access to the contents
of i915_gem_active and mediated instead through a number of helpers.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 13 ++--
dr
Initialising the global GTT is tricky as we wish to use the drm_mm range
manager during the modesetting initialisation (to capture stolen
allocations from the BIOS) before we actually enable GEM. To overcome
this, we currently setup the drm_mm first and then carefully rebind
them.
Signed-off-by: C
In order to disambiguate between the pointer to the intel_engine_cs
(called ring) and the intel_ringbuffer (called ringbuf), rename
s/ring/engine/.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 3 +--
drivers/gpu/drm/i915/i915_gem.c | 6 ++
dri
If we rewrite the I915_WRITE_TAIL specialisation for the legacy
ringbuffer as submitting the request onto the ringbuffer, we can unify
the vfunc with both execlists and GuC in the next patch.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_request.c | 5 +--
drivers/gpu/drm/i915/i
For symmetry with a forthcoming i915_gem_object_get() and
i915_gem_object_put().
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h| 18 ++-
drivers/gpu/drm/i915/i915_gem.c| 56 +-
drivers/gpu/drm/i915/i915_gem_tiling.c | 8 ++-
Rather than pass in the num_dwords that the caller wishes to use after
the signal command packet, split the breadcrumb emission into two phases
and have both the signal and breadcrumb individually acquire space on
the ring. This makes the interface simpler for the reader, and will
simplify for pat
Perform s/ringbuf/ring/ on the context struct for consistency with the
ring/engine split.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c| 8
drivers/gpu/drm/i915/i915_drv.h| 2 +-
drivers/gpu/drm/i915/i915_gem_context.c| 4 ++--
drivers/gp
Now that we have (near) universal GPU recovery code, we can inject a
real hang from userspace and not need any fakery. Not only does this
mean that the testing is far more realistic, but we can simplify the
kernel in the process.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.
As we only ever keep the first error state around, we can avoid some
work that can be quite intrusive if we don't record the error the second
time around. This does move the race whereby the user could discard one
error state as the second is being captured, but that race exists in the
current code
Describe the intent of boosting the GPU frequency to maximum before
waiting on the GPU.
RPS waitboosting was introduced with
commit b29c19b645287f7062e17d70fa4e9781a01a5d88
Author: Chris Wilson
Date: Wed Sep 25 17:34:56 2013 +0100
drm/i915: Boost RPS frequency for CPU stalls
but lacked a
For more consistent oop-naming, we would use intel_ring_verb, so pick
intel_ring_pin() and intel_ring_unpin().
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_lrc.c| 4 ++--
drivers/gpu/drm/i915/intel_ringbuffer.c | 38 -
drivers/gpu/drm/i915/i
We use "list" to denote the list and "link" to denote an element on that
list. Rename request->list to match this idiom.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
drivers/gpu/drm/i915/i915_gem.c | 10 +-
drivers/gpu/drm/i915/i915_gem_reque
If the object is active and we need to perform a relocation upon it, we
need to take the slow relocation path. Before we do, double check the
active requests to see if they have completed.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 +++-
1 file ch
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e.
Now that we have disambiguated ring and engine, we can use the clearer
and more consistent name for the intel_ringbuffer pointer in the
request.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_context.c| 4 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +-
drivers/gpu/drm
Hook the vma itself into the i915_gem_request_retire() so that we can
accurately track when a solitary vma is inactive (as opposed to having
to wait for the entire object to be idle). This improves the interaction
when using multiple contexts (with full-ppgtt) and eliminates some
frequent list walk
Both perform the same actions with more or less indirection, so just
unify the code.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_context.c| 54 ++---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 53 ++---
drivers/gpu/drm/i915/i915_gem_gtt.c| 62 ++---
drivers/gpu
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h| 10 +-
drivers/gpu/drm/i915/i915_gem.c| 4 ++--
drivers/gpu/drm/i915/i915_gem_dmabuf.c | 3 +--
drivers/gpu/drm/i915/i915_gem_evict.c | 2 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c |
The future annotations will track the locking used for access to ensure
that it is always sufficient. We make the preparations now to present
the API ahead and to make sure that GCC can eliminate the unused
parameter.
Before: 6298417 3619610 696320 10614347 a1f64b vmlinux
After: 6298417
The drop_pages() function is a dangerous trap in that it can release the
passed in object pointer and so unless the caller is aware, it can
easily trick us into using the stale object afterwards. Move it into its
solitary callsite where we know it is safe.
Signed-off-by: Chris Wilson
---
drivers
With the introduction of requests, we amplified the number of atomic
refcounted objects we use and update every execbuffer; from none to
several references, and a set of references that need to be changed. We
also introduced interesting side-effects in the order of retiring
requests and objects.
I
When we call i915_vma_unbind(), we will wait upon outstanding rendering.
This will also trigger a retirement phase, which may update the object
lists. If we extend request tracking to the VMA itself (rather than
keep it at the encompassing object), then there is a potential that the
obj->vma_list
Since we track requests, and requests are always added to the GPU fully
formed, we never have to flush the incomplete request and know that the
given request will eventually complete without any further action on our
part.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 58 +++-
dma-buf provides a generic fence class for interoperation between
drivers. Internally we use the request structure as a fence, and so with
only a little bit of interfacing we can rebase those requests on top of
dma-buf fences. This will allow us, in the future, to pass those fences
back to userspac
In order to prevent a leak of the vma on shared objects, we need to
hook into the object_close callback to destroy the vma on the object for
this file. However, if we destroyed that vma immediately we may cause
unexpected application stalls as we try to unbind a busy vma - hence we
defer the unbind
As we can now have multiple VMA inside the global GTT (with partial
mappings, rotations, etc), it is no longer true that there may just be a
single GGTT entry and so we should walk the full vma_list to count up
the actual usage. In addition to unifying the two walkers, switch from
multiplying the o
When the user closes the context mark it and the dependent address space
as closed. As we use an asynchronous destruct method, this has two purposes.
First it allows us to flag the closed context and detect internal errors if
we try to create any new objects for it (as it is removed from the user's
nam
Just a different colour to better match virtual addresses elsewhere.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_irq.c | 8
drivers/gpu/drm/i915/intel_ringbuffer.c | 9 -
drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++--
3 files changed, 10 insertions(+), 1
This reverts commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae.
The patch was only a stop-gap measure that fixed half the problem - the
leak of the fbcon when restarting X. A complete solution required
releasing the VMA when the object itself was closed rather than rely on
file/process exit. The pre
Now that emitting requests is identical between legacy and execlists, we
can use the same function to build up the ring for submitting to either
engine. (With the exception of i915_switch_contexts(), but in time that
will also be handled gracefully.)
Signed-off-by: Chris Wilson
---
drivers/gpu/d
Move request submission from emit_request into its own common vfunc
from i915_add_request().
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_request.c| 8 +++---
drivers/gpu/drm/i915/i915_guc_submission.c | 4 +--
drivers/gpu/drm/i915/intel_guc.h | 2 +-
drivers/gp
Now that we have a clear ring/engine split and a struct intel_ring, we
no longer need the stopgap ringbuf names.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 66 -
drivers/gpu/drm/i915/intel_ringbuffer.h | 6 +--
2 files changed, 36 i
Since we may have VMA allocated for an object, but we interrupted their
binding, there is a disparity between having elements on the obj->vma_list
and being bound. i915_gem_obj_bound_any() does this check, but this is
not rigorously observed - add an explicit count to make it easier.
Signed-off-by:
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.
Signed-off-by: Chris Wilson
---
drivers/gpu/
Both the ->dispatch_execbuffer and ->emit_bb_start callbacks do exactly
the same thing, add MI_BATCHBUFFER_START to the request's ringbuffer -
we need only one vfunc.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 6 +--
drivers/gpu/drm/i915/i915_gem_render_state
Now that the last couple of hacks have been removed from the runtime
power management users, we can fully enable the asserts.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_drv.h | 7 ---
1 file changed, 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/d
As GEN6+ is now a simple variant on the basic breadcrumbs + tail write,
reuse the common code.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/intel_ringbuffer.c | 68 +
1 file changed, 27 insertions(+), 41 deletions(-)
diff --git a/drivers/gpu/drm/i915/inte
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documented as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (b
In the next patch, request tracking is made more generic and for that we
need a new expanded struct and to separate out the logic changes from
the mechanical churn, we split out the structure renaming into this
patch.
v2: Writer's block. Add some spiel about why we track requests.
v3: Now i915_gem
Ideally, we want to automagically have the GPU respond to the
instantaneous load by reclocking itself. However, reclocking occurs
relatively slowly, and to the client waiting for a result from the GPU,
too late. To compensate and reduce the client latency, we allow the
first wait from a client to b
Space for flushing the GPU cache prior to completing the request is
preallocated and so cannot fail.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_context.c| 2 +-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +---
drivers/gpu/drm/i915/i915_gem_gtt.c| 18
A slight micro-optimisation: combine the loops so that gcc is able to
optimise the inner-loops concisely. Since we are reviewing the loops, we
can update the comments to describe the current state of affairs, in
particular the distinction between evicting from the global GTT (which
may contain untr
We should not rely on obj->active being up to date unless we manually
flush it. Instead, we can verify that the next available batch object is
idle by looking at its last active request (and checking it for
completion).
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_batch_pool.c | 1
One issue with the current VMA api is that callers do not take ownership
of the VMA they pin for their use, and correspondingly never explicitly
unpin it. Being able to track the VMA they are using, imo, allows for
simpler code that is more easily verified (and is faster and more
accurate - less gues
This reimplements the denial-of-service protection against igt from
commit 227f782e4667fc622810bce8be8ccdeee45f89c2
Author: Chris Wilson
Date: Thu May 15 10:41:42 2014 +0100
drm/i915: Retire requests before creating a new one
and transfers the stall from before each batch into get_pages()
Move the single line to the callsite as the name is now misleading, and
the purpose is solely to add the request to the execution queue.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 +
1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/
During execbuffer we look up the i915_vma in order to reserve them in
the VM. However, we then do a double lookup of the vma in order to then
pin them, all because we lack the necessary interfaces to operate on
i915_vma.
v2: Tidy parameter lists to remove one level of redirection in the hot
path.
We are motivated to avoid using a bitfield for obj->active for a couple
of reasons. Firstly, we wish to document our lockless read of obj->active
using READ_ONCE inside i915_gem_busy_ioctl() and that requires an
integral type (i.e. not a bitfield). Secondly, gcc produces abysmal code
when presented
Our GPUs impose certain requirements upon buffers that depend upon how
exactly they are used. Typically this is expressed as that they require
a larger surface than would be naively computed by pitch * height.
Normally such requirements are hidden away in the userspace driver, but
when we accept po
We only need a very lightweight mechanism here as the locking is only
used for co-ordinating a bitfield.
Also double check that the object is still pinned to the display plane
before processing the state change.
v2: Move the cheap unlikely tests into the caller
Signed-off-by: Chris Wilson
---
Split the insertion into the address space's range manager and binding
of that object into the GTT to simplify the code flow when pinning a
VMA.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_gem.c | 33 +++--
1 file changed, 15 insertions(+), 18 deletions(
Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for
i915_gem_object_ggtt_pin(), spare us the confusion and remove it.
Removing it now simplifies later patches to change the i915_vma_pin()
(and friends) interface.
Signed-off-by: Chris Wilson
---
drivers/gpu/drm/i915/i915_drv.h