Re: [Intel-gfx] [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote:
> > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> > > On 10/5/2015 5:36 PM, Dave Gordon wrote:
> > > >On 02/10/15 14:16, Michel Thierry wrote:
> > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > > >>range pt in gen6_for_each_pde").
> > > >>
> > > >>But the static analyzer still complains that, just before we break due
> > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > > >>iter value that is bigger than I915_PDES. Of course, this isn't really
> > > >>a problem since no one uses pt outside the macro. Still, every single
> > > >>new usage of the macro will create a new issue for us to mark as a
> > > >>false positive.
> > > >>
> > > >>Also, Paulo re-started the discussion a while ago [1], but it didn't end
> > > >>up being implemented.
> > > >>
> > > >>In order to "solve" this "problem", this patch takes the ideas from
> > > >>Chris and Dave, but that check would change the desired behavior of the
> > > >>code, because the object (for example pdp->page_directory[iter]) can be
> > > >>null during init/alloc, and C would take this as false, breaking the for
> > > >>loop immediately.
> > > >>
> > > >>This has been already verified with "static analysis tools".
> > > >>
> > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > > >>
> > > >>v2: Make it a single statement, while preventing the common
> > > >>subexpression elimination (Chris)
> > > >>
> > > >>Cc: Paulo Zanoni 
> > > >>Cc: Chris Wilson 
> > > >>Cc: Dave Gordon 
> > > >>Signed-off-by: Michel Thierry 
> 
> > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> > middle and always parses it as a ? (b) : c. If lower-level operators in
> > the middle could split the ternary operator then it would result in
> > parsing fail (since ? without the : is just useless). So lgtm. Someone
> > willing to smack an r-b onto the patch?
> 
> I think it's good enough.
> Reviewed-by: Chris Wilson 

Queued for -next, thanks for the patch.
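
For anyone following the macro discussion, here is a minimal userspace sketch of the idea, not the actual i915 macro. NUM_PDES, struct page_directory, for_each_pde and count_slots are hypothetical stand-ins. It shows the two properties discussed above: the array slot is only read once the index is known to be in range, and the ", 1" inside the ternary keeps a NULL slot from ending the loop.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PDES 4 /* hypothetical stand-in for I915_PDES */

struct page_table { int id; };
struct page_directory { struct page_table *page_table[NUM_PDES]; };

/*
 * The bounds check guards the array read: pt is assigned only when iter
 * is in range. "c ? a, b : d" parses as "c ? (a, b) : d", so the middle
 * operand may be a comma expression. The trailing ", 1" makes the loop
 * condition true even when the slot holds NULL.
 */
#define for_each_pde(pt, pd, iter)                                         \
	for ((iter) = 0;                                                   \
	     (iter) < NUM_PDES ? ((pt) = (pd)->page_table[(iter)]), 1 : 0; \
	     (iter)++)

/* Count visited slots and how many of them are unallocated (NULL). */
static unsigned int count_slots(struct page_directory *pd, unsigned int *nulls)
{
	struct page_table *pt;
	unsigned int iter, visited = 0;

	assert(pd != NULL && nulls != NULL);
	*nulls = 0;
	for_each_pde(pt, pd, iter) {
		visited++;
		if (pt == NULL)
			(*nulls)++;
	}
	return visited;
}
```

With a directory holding two allocated and two NULL slots, count_slots() visits all four slots, which is the behaviour a plain "pt = ..." loop condition would break.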

> Something to consider is that ppgtt_insert is 10x slower than
> ppgtt_clear, and that some workloads (admittedly not 48b!) spend a
> disproportionate amount of time changing PTE. If you have ideas for
> speeding up insertion, feel free to experiment!

Hm, where do we waste all that time? 10x slower is pretty impressive since
on a quick look I can only see the sg table walk as the additional bit of
memory traversals insert does on top of clear ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v6 7/8] drm/i915: BDW clock change support [regression]

2015-10-06 Thread Daniel Vetter
On Tue, Jun 16, 2015 at 04:07:40PM +0300, Jani Nikula wrote:
> On Tue, 16 Jun 2015, Jani Nikula  wrote:
> > On Wed, 03 Jun 2015, Mika Kahola  wrote:
> >> From: Ville Syrjälä 
> >>
> >> Add support for changing cdclk frequency during runtime on BDW. The
> >> procedure is quite a bit different on BDW from the one on HSW, so
> >> add a separate function for it.
> >>
> >> Also with IPS enabled the actual pixel rate mustn't exceed 95% of cdclk,
> >> so take that into account when computing the max pixel rate.
> >>
> >> v2: Grab rps.hw_lock around sandybridge_pcode_write()
> >> v3: Rebase due to power well vs. .global_resources() reordering
> >> v4: Rebased to the latest
> >> v5: Rebased to the latest
> >> v6: Patch order shuffle so that Broadwell CD clock change is
> >> applied before the patch for Haswell CD clock change
> >> v7: Fix for patch style problems
> >>
> >> Signed-off-by: Ville Syrjälä 
> >> Signed-off-by: Mika Kahola 
> >
> > This patch hard hangs my BDW NUC at boot when both DP and HDMI are
> > connected. Either DP or HDMI alone are good, same with hotplugging the
> > other afterwards. Booting to grub with both connected, and unplugging
> > HDMI before loading the kernel also reproduces the issue.
> >
> > It looks like the problem boils down to the BIOS setting up a smaller
> > resolution on the DP display when both are connected, and this patch
> > fails to cope with that on i915 load.
> 
> By "this patch" I obviously refer to
> 
> commit b432e5cfd5e92127ad2dd83bfc3083f1dbce43fb
> Author: Ville Syrjälä 
> Date:   Wed Jun 3 15:45:13 2015 +0300
> 
> drm/i915: BDW clock change support
> 
> and everything works for the commit before that.

Another regression for Jairo to track. Jairo please add the bisect result
to the bugzilla and mark it as bisected too.

Thanks, Daniel
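
For reference while tracking this, the limits described in the quoted commit message can be sketched as below. This is a standalone illustration, not driver code: both function names are hypothetical, and the 450000/540000/675000 kHz defaults are an assumption based on the FIXME comment in the quoted hunk (540 MHz for ULX and 675 MHz for ULT only with extra cooling).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed default maximum cdclk in kHz for a BDW part (hypothetical helper). */
static int bdw_max_cdclk_khz(bool fuse_limit, bool is_ulx, bool is_ult)
{
	if (fuse_limit)
		return 450000;
	if (is_ulx)
		return 450000;
	if (is_ult)
		return 540000;
	return 675000;
}

/* With IPS enabled the pixel rate must not exceed 95% of cdclk. */
static int ips_max_pixel_rate_khz(int cdclk_khz)
{
	assert(cdclk_khz >= 0);
	/* Widen before multiplying so large inputs cannot overflow int. */
	return (int)((long long)cdclk_khz * 95 / 100);
}
```

For a ULT part without the fuse limit this gives a 540000 kHz cdclk and a 513000 kHz pixel rate ceiling with IPS.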

> 
> BR,
> Jani.
> 
> 
> 
> 
> 
> >
> > BR,
> > Jani.
> >
> >
> >
> >>
> >> Author:Ville Syrjälä 
> >> ---
> >>  drivers/gpu/drm/i915/i915_reg.h  |   2 +
> >>  drivers/gpu/drm/i915/intel_display.c | 216 +--
> >>  2 files changed, 208 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> >> index 7213224..0f72c0e 100644
> >> --- a/drivers/gpu/drm/i915/i915_reg.h
> >> +++ b/drivers/gpu/drm/i915/i915_reg.h
> >> @@ -6705,6 +6705,7 @@ enum skl_disp_power_wells {
> >>  #define GEN6_PCODE_READ_RC6VIDS   0x5
> >>  #define GEN6_ENCODE_RC6_VID(mv)   (((mv) - 245) / 5)
> >>  #define GEN6_DECODE_RC6_VID(vids) (((vids) * 5) + 245)
> >> +#define   BDW_PCODE_DISPLAY_FREQ_CHANGE_REQ   0x18
> >>  #define   GEN9_PCODE_READ_MEM_LATENCY 0x6
> >>  #define GEN9_MEM_LATENCY_LEVEL_MASK   0xFF
> >>  #define GEN9_MEM_LATENCY_LEVEL_1_5_SHIFT  8
> >> @@ -7170,6 +7171,7 @@ enum skl_disp_power_wells {
> >>  #define  LCPLL_CLK_FREQ_337_5_BDW (2<<26)
> >>  #define  LCPLL_CLK_FREQ_675_BDW   (3<<26)
> >>  #define  LCPLL_CD_CLOCK_DISABLE   (1<<25)
> >> +#define  LCPLL_ROOT_CD_CLOCK_DISABLE  (1<<24)
> >>  #define  LCPLL_CD2X_CLOCK_DISABLE (1<<23)
> >>  #define  LCPLL_POWER_DOWN_ALLOW   (1<<22)
> >>  #define  LCPLL_CD_SOURCE_FCLK (1<<21)
> >> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> >> index c3f01aa..b1e2069 100644
> >> --- a/drivers/gpu/drm/i915/intel_display.c
> >> +++ b/drivers/gpu/drm/i915/intel_display.c
> >> @@ -5751,7 +5751,22 @@ static void intel_update_max_cdclk(struct drm_device *dev)
> >>  {
> >>struct drm_i915_private *dev_priv = dev->dev_private;
> >>  
> >> -  if (IS_VALLEYVIEW(dev)) {
> >> +  if (IS_BROADWELL(dev))  {
> >> +  /*
> >> +   * FIXME with extra cooling we can allow
> >> +   * 540 MHz for ULX and 675 Mhz for ULT.
> >> +   * How can we know if extra cooling is
> >> +   * available? PCI ID, VTB, something else?
> >> +   */
> >> +  if (I915_READ(FUSE_STRAP) & HSW_CDCLK_LIMIT)
> >> +  dev_priv->max_cdclk_freq = 450000;
> >> +  else if (IS_BDW_ULX(dev))
> >> +  dev_priv->max_cdclk_freq = 450000;
> >> +  else if (IS_BDW_ULT(dev))
> >> +  dev_priv->max_cdclk_freq = 540000;
> >> +  else
> >> +  dev_priv->max_cdclk_freq = 675000;
> >> +  } else if (IS_VALLEYVIEW(dev)) {
> >>dev_priv->max_cdclk_freq = 400000;
> >>} else {
> >>/* otherwise assume cdclk is fixed */
> >> @@ -6621,13 +6636,11 @@ static bool pipe_config_supports_ips(struct drm_i915_private *dev_priv,
> >>return true;
> >>  
> >>/*
> >> -   * FIXME if we compare against max we should then
> >> -   * increase the cdclk frequency when the current
> >> -   * value is too low. The other option is to compare
> >> -   * against the cdclk frequency we're going have post
> >> -   * modeset (ie. on

Re: [Intel-gfx] [PATCH] fixup! drm/i915/skl: Eliminate usage of pipe_wm_parameters from SKL-style WM (v3) [regression]

2015-10-06 Thread Daniel Vetter
Another regression for Jairo to track. Also this one is bisected too
(although not 100% confirmed).
-Daniel

On Fri, Oct 2, 2015 at 8:43 PM, Zanoni, Paulo R wrote:
> On Thu, 2015-10-01 at 16:03 -0700, Matt Roper wrote:
>> Cc: Paulo Zanoni 
>> Signed-off-by: Matt Roper 
>> ---
>> Paulo, I'm not positive that this is the cause of the issues you're seeing,
>> but I did find this unwanted behavior change while re-reading all the SKL
>> watermark code.  Could you give this a try and see if it improves your
>> situation at all?
>
> Thanks for the patch, but unfortunately this doesn't solve the problems
> I'm seeing.
>
> For my normal work activities I'm carrying a patch that reverts the
> following commits:
>
> drm/i915: Calculate watermark configuration during atomic check (v2)
> drm/i915: Don't set plane visible during HW readout if CRTC is off
> drm/i915: Calculate ILK-style watermarks during atomic check (v3)
> drm/i915: Calculate pipe watermarks into CRTC state (v3)
> drm/i915: Refactor ilk_update_wm (v3)
> drm/i915: Drop intel_update_sprite_watermarks
>
> So I guess the sprite update thing is very likely the first bad commit.
> I'm also noticing that the screen stays black for _way_ too much time
> during boot, but I'm not sure it's caused by the watermark series:
> might be something else on -nightly.
>
> Thanks,
> Paulo
>
>>
>>  drivers/gpu/drm/i915/intel_pm.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>> index 3857592..22c97f2 100644
>> --- a/drivers/gpu/drm/i915/intel_pm.c
>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>> @@ -2951,6 +2951,9 @@ skl_get_total_relative_data_rate(const struct intel_crtc_state *cstate)
>>   if (pstate->fb == NULL)
>>   continue;
>>
>> + if (intel_plane->base.type == DRM_PLANE_TYPE_CURSOR)
>> + continue;
>> +
>>   /* packed/uv */
>>   total_data_rate += skl_plane_relative_data_rate(cstate,
>>   pstate,



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 12:09:51PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 10:43:48AM +0100, Chris Wilson wrote:
> > On Tue, Oct 06, 2015 at 10:38:22AM +0200, Daniel Vetter wrote:
> > > On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:
> > > > On 10/5/2015 5:36 PM, Dave Gordon wrote:
> > > > >On 02/10/15 14:16, Michel Thierry wrote:
> > > > >>We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
> > > > >>range pt in gen6_for_each_pde").
> > > > >>
> > > > >>But the static analyzer still complains that, just before we break due
> > > > >>to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
> > > > >>iter value that is bigger than I915_PDES. Of course, this isn't really
> > > > >>a problem since no one uses pt outside the macro. Still, every single
> > > > >>new usage of the macro will create a new issue for us to mark as a
> > > > >>false positive.
> > > > >>
> > > > >>Also, Paulo re-started the discussion a while ago [1], but it didn't end
> > > > >>up being implemented.
> > > > >>
> > > > >>In order to "solve" this "problem", this patch takes the ideas from
> > > > >>Chris and Dave, but that check would change the desired behavior of
> > > > >>the code, because the object (for example pdp->page_directory[iter])
> > > > >>can be null during init/alloc, and C would take this as false, breaking
> > > > >>the for loop immediately.
> > > > >>
> > > > >>This has been already verified with "static analysis tools".
> > > > >>
> > > > >>[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html
> > > > >>
> > > > >>v2: Make it a single statement, while preventing the common
> > > > >>subexpression elimination (Chris)
> > > > >>
> > > > >>Cc: Paulo Zanoni 
> > > > >>Cc: Chris Wilson 
> > > > >>Cc: Dave Gordon 
> > > > >>Signed-off-by: Michel Thierry 
> > 
> > > Yeah, since ?: is a ternary operator parsing implicitly adds the () in the
> > > middle and always parses it as a ? (b) : c. If lower-level operators in
> > > the middle could split the ternary operator then it would result in
> > > parsing fail (since ? without the : is just useless). So lgtm. Someone
> > > willing to smack an r-b onto the patch?
> > 
> > I think it's good enough.
> > Reviewed-by: Chris Wilson 
> 
> Queued for -next, thanks for the patch.
> 
> > Something to consider is that ppgtt_insert is 10x slower than
> > ppgtt_clear, and that some workloads (admittedly not 48b!) spend a
> > disproportionate amount of time changing PTE. If you have ideas for
> > speeding up insertion, feel free to experiment!
> 
> Hm, where do we waste all that time? 10x slower is pretty impressive since
> on a quick look I can only see the sg table walk as the additional bit of
> memory traversals insert does on top of clear ...

The sg_page_iter claims top spot, followed by the memory dereferences
(at a guess, the individual sg struct doesn't seem dense enough to be
cache friendly, or maybe we should just cache-align it?). We can make a
substantial improvement by open-coding the page iter (under the assumption
that we do not have page coalescing) like:

http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=f9baf155c3096058ef9aeeb39b586713291efc56

That, together with eliminating the ivb_pte_encode() indirection, gets us
to only ~5x slower than clearing. It is not pretty (and is rather
cavalier about the last page), but at that point it seems to be memory bound.
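
To make the open-coding idea concrete, here is a rough userspace analogy, not the code behind the link above. struct segment, PTE_VALID, encode_pte and insert_single_page_segments are all hypothetical names. The sketch assumes, as stated above, that there is no page coalescing, so every segment covers exactly one page and the generic per-page iterator collapses into a flat loop with an inlined encode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u
#define PTE_VALID 1ull /* hypothetical "present" bit */

/* Hypothetical stand-in for one scatterlist entry. */
struct segment {
	uint64_t dma_addr;
	uint32_t length;
};

/* Inlined encode, standing in for a direct call instead of a function pointer. */
static inline uint64_t encode_pte(uint64_t addr)
{
	return addr | PTE_VALID;
}

/*
 * Write one PTE per segment. Because each segment is assumed to be a
 * single page, no per-page iterator state is needed.
 * Returns the number of PTEs written.
 */
static size_t insert_single_page_segments(const struct segment *segs,
					  size_t nsegs,
					  uint64_t *ptes, size_t max_ptes)
{
	size_t n;

	assert(segs != NULL && ptes != NULL);
	for (n = 0; n < nsegs && n < max_ptes; n++) {
		assert(segs[n].length == PAGE_SIZE_BYTES);
		ptes[n] = encode_pte(segs[n].dma_addr);
	}
	return n;
}
```

The trade-off is the one noted above: the flat loop is only valid while the one-page-per-segment assumption holds.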

However, in this particular benchmark where inserting ppgtt dominates
the kernel profile, we are GPU bound and improving the insert is lost in
the noise. (Unless we disable the GPU load and assume an infinitely fast
GPU like Skylake!)
-Chris


-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Chris Wilson
Passing cliprects into the kernel for it to re-execute the batch buffer
with different CMD_DRAWRECT died out long ago. As DRI1 support has been
removed from the kernel, we can now simply reject any execbuf trying to
use this "feature".

To keep Daniel happy with the prospect of being able to reuse these
fields in the next decade, continue to ensure that current userspace is
not passing garbage in through the dead fields.

v2: Fix the cliprects_ptr check

Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 154 ++---
 drivers/gpu/drm/i915/intel_lrc.c   |  15 ---
 2 files changed, 31 insertions(+), 138 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 75a0c8b5305b..045a7631faa0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -947,7 +947,21 @@ i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
if (exec->flags & __I915_EXEC_UNKNOWN_FLAGS)
return false;
 
-   return ((exec->batch_start_offset | exec->batch_len) & 0x7) == 0;
+   /* Kernel clipping was a DRI1 misfeature */
+   if (exec->num_cliprects || exec->cliprects_ptr)
+   return false;
+
+   if (exec->DR4 == 0xffffffff) {
+   DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
+   exec->DR4 = 0;
+   }
+   if (exec->DR1 || exec->DR4)
+   return false;
+
+   if ((exec->batch_start_offset | exec->batch_len) & 0x7)
+   return false;
+
+   return true;
 }
 
 static int
@@ -1111,47 +1125,6 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
return 0;
 }
 
-static int
-i915_emit_box(struct drm_i915_gem_request *req,
- struct drm_clip_rect *box,
- int DR1, int DR4)
-{
-   struct intel_engine_cs *ring = req->ring;
-   int ret;
-
-   if (box->y2 <= box->y1 || box->x2 <= box->x1 ||
-   box->y2 <= 0 || box->x2 <= 0) {
-   DRM_ERROR("Bad box %d,%d..%d,%d\n",
- box->x1, box->y1, box->x2, box->y2);
-   return -EINVAL;
-   }
-
-   if (INTEL_INFO(ring->dev)->gen >= 4) {
-   ret = intel_ring_begin(req, 4);
-   if (ret)
-   return ret;
-
-   intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO_I965);
-   intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
-   intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
-   intel_ring_emit(ring, DR4);
-   } else {
-   ret = intel_ring_begin(req, 6);
-   if (ret)
-   return ret;
-
-   intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO);
-   intel_ring_emit(ring, DR1);
-   intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
-   intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
-   intel_ring_emit(ring, DR4);
-   intel_ring_emit(ring, 0);
-   }
-   intel_ring_advance(ring);
-
-   return 0;
-}
-
 static struct drm_i915_gem_object*
 i915_gem_execbuffer_parse(struct intel_engine_cs *ring,
  struct drm_i915_gem_exec_object2 *shadow_exec_entry,
@@ -1210,65 +1183,21 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
   struct drm_i915_gem_execbuffer2 *args,
   struct list_head *vmas)
 {
-   struct drm_clip_rect *cliprects = NULL;
struct drm_device *dev = params->dev;
struct intel_engine_cs *ring = params->ring;
struct drm_i915_private *dev_priv = dev->dev_private;
u64 exec_start, exec_len;
int instp_mode;
u32 instp_mask;
-   int i, ret = 0;
-
-   if (args->num_cliprects != 0) {
-   if (ring != &dev_priv->ring[RCS]) {
-   DRM_DEBUG("clip rectangles are only valid with the render ring\n");
-   return -EINVAL;
-   }
-
-   if (INTEL_INFO(dev)->gen >= 5) {
-   DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
-   return -EINVAL;
-   }
-
-   if (args->num_cliprects > UINT_MAX / sizeof(*cliprects)) {
-   DRM_DEBUG("execbuf with %u cliprects\n",
- args->num_cliprects);
-   return -EINVAL;
-   }
-
-   cliprects = kcalloc(args->num_cliprects,
-   sizeof(*cliprects),
-   GFP_KERNEL);
-   if (cliprects == NULL) {
-   ret = -ENOMEM;
-   goto error;
-   }
-
-   if (copy_from_user(cliprects,
-  

[Intel-gfx] [PATCH 2/2] drm/i915: Drop i915_gem_obj_is_pinned() from set-cache-level

2015-10-06 Thread Chris Wilson
Since the removal of the pin-ioctl, we only care about not changing the
cache level on buffers pinned to the hardware as indicated by
obj->pin_display. So we can safely replace i915_gem_obj_is_pinned()
here with a plain obj->pin_display check. During rebinding, we will perform
sanity checks in case vma->pin_count is erroneously set.

At the same time, we can micro-optimise GTT mmap() behaviour since we
only need to relinquish the mmaps before Sandybridge.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 40 
 1 file changed, 24 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d4a3bdf0c5b6..2b8ed7a2faab 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3629,31 +3629,34 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
struct drm_device *dev = obj->base.dev;
struct i915_vma *vma, *next;
+   bool bound = false;
int ret = 0;
 
if (obj->cache_level == cache_level)
goto out;
 
-   if (i915_gem_obj_is_pinned(obj)) {
-   DRM_DEBUG("can not change the cache level of pinned objects\n");
-   return -EBUSY;
-   }
-
list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+   if (!drm_mm_node_allocated(&vma->node))
+   continue;
+
+   if (vma->pin_count) {
+   DRM_DEBUG("can not change the cache level of pinned objects\n");
+   return -EBUSY;
+   }
+
if (!i915_gem_valid_gtt_space(vma, cache_level)) {
ret = i915_vma_unbind(vma);
if (ret)
return ret;
-   }
+   } else
+   bound = true;
}
 
-   if (i915_gem_obj_bound_any(obj)) {
+   if (bound) {
ret = i915_gem_object_wait_rendering(obj, false);
if (ret)
return ret;
 
-   i915_gem_object_finish_gtt(obj);
-
/* Before SandyBridge, you could not use tiling or fence
 * registers with snooped memory, so relinquish any fences
 * currently pointing to our region in the aperture.
@@ -3664,13 +3667,18 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
return ret;
}
 
-   list_for_each_entry(vma, &obj->vma_list, vma_link)
-   if (drm_mm_node_allocated(&vma->node)) {
-   ret = i915_vma_bind(vma, cache_level,
-   PIN_UPDATE);
-   if (ret)
-   return ret;
-   }
+   /* Access to snoopable pages through the GTT is incoherent. */
+   if (cache_level != I915_CACHE_NONE && !HAS_LLC(dev))
+   i915_gem_release_mmap(obj);
+
+   list_for_each_entry(vma, &obj->vma_list, vma_link) {
+   if (!drm_mm_node_allocated(&vma->node))
+   continue;
+
+   ret = i915_vma_bind(vma, cache_level, PIN_UPDATE);
+   if (ret)
+   return ret;
+   }
}
 
list_for_each_entry(vma, &obj->vma_list, vma_link)
-- 
2.6.0



[Intel-gfx] [PATCH v2] drm/i915: Make the link training test for same voltage smaller

2015-10-06 Thread Ander Conselvan de Oliveira
It makes it slightly easier to read.

v2: Add missing word in patch title. (Ander)
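
The counting logic after this change can be checked in isolation with a small standalone sketch. It is an illustration only: too_many_retries and the -1 sentinel are hypothetical, and the requested voltages are fed in as a plain array instead of being read back from the sink.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SAME_VOLTAGE_TRIES 5

/*
 * Return true if the same voltage is requested often enough in a row
 * that the training loop would give up. Voltages are assumed to be
 * non-negative so that -1 can serve as the "nothing tried yet" value.
 */
static bool too_many_retries(const int *requested, size_t n)
{
	int voltage = -1;
	int voltage_tries = 0;

	assert(requested != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (requested[i] != voltage)
			voltage_tries = 0;
		else if (++voltage_tries == MAX_SAME_VOLTAGE_TRIES)
			return true;

		voltage = requested[i];
	}
	return false;
}
```

As in the patch, the first request for a voltage resets the counter, so the loop gives up on the fifth repeat after that first request.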

Signed-off-by: Ander Conselvan de Oliveira 

---
 drivers/gpu/drm/i915/intel_dp_link_training.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
index 8b20970..ba640b7 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -186,14 +186,13 @@ clock_recovery_voltage_step(struct intel_dp *intel_dp)
break;
 
/* Check to see if we've tried the same voltage 5 times */
-   if (intel_dp_get_train_voltage(intel_dp) == voltage) {
-   ++voltage_tries;
-   if (voltage_tries == 5) {
-   DRM_ERROR("too many voltage retries, give up\n");
-   break;
-   }
-   } else
+   if (intel_dp_get_train_voltage(intel_dp) != voltage) {
voltage_tries = 0;
+   } else if (++voltage_tries == 5) {
+   DRM_ERROR("too many voltage retries, give up\n");
+   break;
+   }
+
voltage = intel_dp_get_train_voltage(intel_dp);
 
/* Update training set as requested by target */
-- 
2.4.3



Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Imre Deak
On Tue, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> On 10/05/2015 09:05 PM, Imre Deak wrote:
> > On Mon, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> >> Mostly reuse what is programmed by pre-os, but in case there is no
> >> pre-os initialization, init the cdclk with the default value.
> >>
> >> Cc: Imre Deak 
> >> Signed-off-by: Shobhit Kumar 
> >> ---
> >>   drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> >>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> >> index 2d3cc82..675c60d 100644
> >> --- a/drivers/gpu/drm/i915/intel_ddi.c
> >> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> >> @@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> >>
> >>cdclk_freq = dev_priv->display.get_display_clock_speed(dev);
> >>dev_priv->skl_boot_cdclk = cdclk_freq;
> >> -  if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> >> -  DRM_ERROR("LCPLL1 is disabled\n");
> >> -  else
> >> -  intel_display_power_get(dev_priv, POWER_DOMAIN_PLLS);
> >> +
> >> +  skl_init_cdclk(dev_priv);
> >
> > How does this prevent changing the clock if BIOS did enable some output?
> > We shouldn't change the clock in that case.
> 
> In that case it will try to re-apply the same clock that BIOS enabled. 
> Not sure if this is allowed, but I checked the cdclock change sequence 
> and it is mostly followed in skl_init_cdclk.
> In my tests where BIOS does enable this, I faced no issues in
> initializing again in driver.

The first step in that sequence:
"Disable all display engine functions using the full mode set disable
sequence on all pipes, ports, and planes."

So the problem is not that the PLL itself may be enabled here (as BIOS
left it), but that some output is also enabled.

> I have noticed that on some pre-os this value is programmed correctly except
> for the decimal part. That causes AUX transactions to fail on SKL. That
> is what triggered this patch actually. So the other way is to fully
> validate the value in get_display_clock_speed instead of only bits [28:26],
> and do the cdclk init only if it is wrong.

I think we'd need to detect at this point if outputs are enabled and
only attempt to work around the above BIOS problem if this is not the
case. Alternatively you could also disable the active outputs as a first
step.

> 
> Regards
> Shobhit
> 
> >
> >>} else if (IS_BROXTON(dev)) {
> >>broxton_init_cdclk(dev);
> >>broxton_ddi_phy_init(dev);
> >
> >
> >




[Intel-gfx] [PATCH 1/3] drm: Track drm_mm nodes with an interval tree

2015-10-06 Thread Chris Wilson
In addition to the last-in/first-out stack for accessing drm_mm nodes,
we occasionally, and in the future often, want to find a drm_mm_node by
address. To do so efficiently we need to track the nodes in an interval
tree - lookups for a particular address will then be O(lg(N)), where N
is the number of nodes in the range manager as opposed to O(N).
Insertion however gains an extra O(lg(N)) step for all nodes
irrespective of whether the interval tree is in use. For future i915
patches, eliminating the linear walk is a significant improvement.
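
To illustrate the lookup cost only, here is a simplified userspace sketch. It does not use the kernel interval tree: it swaps in a binary search over a sorted array of non-overlapping ranges, which gives the same O(lg(N)) address lookup for a static set of nodes. struct range_node and find_node are hypothetical names; start and last are inclusive, matching the it.start/it.last convention in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One allocated range, inclusive on both ends. */
struct range_node {
	uint64_t start;
	uint64_t last;
};

/*
 * Find the node containing addr. nodes[] must be sorted by start and
 * must not overlap. Returns NULL if addr falls into a hole.
 */
static const struct range_node *find_node(const struct range_node *nodes,
					  size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	assert(nodes != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < nodes[mid].start)
			hi = mid;
		else if (addr > nodes[mid].last)
			lo = mid + 1;
		else
			return &nodes[mid];
	}
	return NULL;
}
```

A sorted array makes insertion O(N), which is why a balanced tree is the better fit for a range manager whose nodes come and go.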

Signed-off-by: Chris Wilson 
Cc: dri-de...@lists.freedesktop.org
---
 drivers/gpu/drm/Kconfig  |  1 +
 drivers/gpu/drm/drm_mm.c | 71 
 include/drm/drm_mm.h |  5 
 3 files changed, 54 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 06ae5008c5ed..e25050a5a843 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -12,6 +12,7 @@ menuconfig DRM
select I2C
select I2C_ALGOBIT
select DMA_SHARED_BUFFER
+   select INTERVAL_TREE
help
  Kernel-level support for the Direct Rendering Infrastructure (DRI)
  introduced in XFree86 4.0. If you say Y here, you need to select
diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 04de6fd88f8c..e3acd860f738 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -153,6 +153,10 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
INIT_LIST_HEAD(&node->hole_stack);
list_add(&node->node_list, &hole_node->node_list);
 
+   node->it.start = node->start;
+   node->it.last = node->start + size - 1;
+   interval_tree_insert(&node->it, &mm->interval_tree);
+
BUG_ON(node->start + node->size > adj_end);
 
node->hole_follows = 0;
@@ -178,39 +182,53 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
  */
 int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)
 {
-   struct drm_mm_node *hole;
u64 end = node->start + node->size;
-   u64 hole_start;
-   u64 hole_end;
-
-   BUG_ON(node == NULL);
+   struct interval_tree_node *it;
+   struct drm_mm_node *hole;
+   u64 hole_start, hole_end;
 
/* Find the relevant hole to add our node to */
-   drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
-   if (hole_start > node->start || hole_end < end)
-   continue;
+   it = interval_tree_iter_first(&mm->interval_tree,
+ node->start, (u64)-1);
+   if (it == NULL) {
+   hole = list_last_entry(&mm->head_node.node_list,
+  struct drm_mm_node, node_list);
+   } else {
+   hole = container_of(it, typeof(*hole), it);
+   if (hole->start <= node->start)
+   return -ENOSPC;
+
+   hole = list_last_entry(&hole->node_list,
+  struct drm_mm_node, node_list);
+   }
 
-   node->mm = mm;
-   node->allocated = 1;
+   hole_start = drm_mm_hole_node_start(hole);
+   hole_end = drm_mm_hole_node_end(hole);
+   if (hole_start > node->start || hole_end < end)
+   return -ENOSPC;
 
-   INIT_LIST_HEAD(&node->hole_stack);
-   list_add(&node->node_list, &hole->node_list);
+   node->mm = mm;
+   node->allocated = 1;
 
-   if (node->start == hole_start) {
-   hole->hole_follows = 0;
-   list_del_init(&hole->hole_stack);
-   }
+   INIT_LIST_HEAD(&node->hole_stack);
+   list_add(&node->node_list, &hole->node_list);
 
-   node->hole_follows = 0;
-   if (end != hole_end) {
-   list_add(&node->hole_stack, &mm->hole_stack);
-   node->hole_follows = 1;
-   }
+   node->it.start = node->start;
+   node->it.last = node->start + node->size - 1;
+   interval_tree_insert(&node->it, &mm->interval_tree);
 
-   return 0;
+   if (node->start == hole_start) {
+   hole->hole_follows = 0;
+   list_del_init(&hole->hole_stack);
}
 
-   return -ENOSPC;
+   node->hole_follows = 0;
+   if (end != hole_end) {
+   list_add(&node->hole_stack, &mm->hole_stack);
+   node->hole_follows = 1;
+   }
+
+   return 0;
 }
 EXPORT_SYMBOL(drm_mm_reserve_node);
 
@@ -300,6 +318,10 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
INIT_LIST_HEAD(&node->hole_stack);
list_add(&node->node_list, &hole_node->node_list);
 
+   node->it.start = node->start;
+   node->it.last = node->start + node->size - 1;
+   interval_tree_insert(&node->it, &mm->interval_tree);
+
BUG_ON(node->start < start);

[Intel-gfx] [PATCH 3/3] drm/i915: Add soft-pinning API for execbuffer

2015-10-06 Thread Chris Wilson
Userspace can pass in an offset that it presumes the object is located
at. The kernel will then do its utmost to fit the object into that
location. The assumption is that userspace is handling its own object
locations (for example along with full-ppgtt) and that the kernel will
rarely have to make space for the user's requests.

v2: Fix i915_gem_evict_range() (now evict_for_vma) to handle ordinary
and fixed objects within the same batch

Signed-off-by: Chris Wilson 
Cc: "Daniel, Thomas" 
---
 drivers/gpu/drm/i915/i915_dma.c|  3 ++
 drivers/gpu/drm/i915/i915_drv.h| 10 +++--
 drivers/gpu/drm/i915/i915_gem.c| 68 +-
 drivers/gpu/drm/i915/i915_gem_evict.c  | 61 +++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  9 +++-
 drivers/gpu/drm/i915/i915_trace.h  | 23 ++
 include/uapi/drm/i915_drm.h|  4 +-
 7 files changed, 151 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ab37d1121be8..cd79ef114b8e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -170,6 +170,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
case I915_PARAM_HAS_RESOURCE_STREAMER:
value = HAS_RESOURCE_STREAMER(dev);
break;
+   case I915_PARAM_HAS_EXEC_SOFTPIN:
+   value = 1;
+   break;
default:
DRM_DEBUG("Unknown parameter %d\n", param->param);
return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0ce011a5dc0..7d351d991022 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2778,10 +2778,11 @@ void i915_gem_vma_destroy(struct i915_vma *vma);
 #define PIN_NONBLOCK   (1<<1)
 #define PIN_GLOBAL (1<<2)
 #define PIN_OFFSET_BIAS(1<<3)
-#define PIN_USER   (1<<4)
-#define PIN_UPDATE (1<<5)
-#define PIN_ZONE_4G(1<<6)
-#define PIN_HIGH   (1<<7)
+#define PIN_OFFSET_FIXED (1<<4)
+#define PIN_USER   (1<<5)
+#define PIN_UPDATE (1<<6)
+#define PIN_ZONE_4G(1<<7)
+#define PIN_HIGH   (1<<8)
 #define PIN_OFFSET_MASK (~4095)
 int __must_check
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
@@ -3127,6 +3128,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev,
  unsigned long start,
  unsigned long end,
  unsigned flags);
+int __must_check i915_gem_evict_for_vma(struct i915_vma *vma, unsigned flags);
 int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
 
 /* belongs in i915_gem_gtt.h */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8fe3df0cdcb8..82efd6a6dee0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3334,7 +3334,6 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
struct drm_device *dev = obj->base.dev;
struct drm_i915_private *dev_priv = dev->dev_private;
u64 start, end;
-   u32 search_flag, alloc_flag;
struct i915_vma *vma;
int ret;
 
@@ -3409,30 +3408,53 @@ i915_gem_object_bind_to_vm(struct drm_i915_gem_object *obj,
if (IS_ERR(vma))
goto err_unpin;
 
-   if (flags & PIN_HIGH) {
-   search_flag = DRM_MM_SEARCH_BELOW;
-   alloc_flag = DRM_MM_CREATE_TOP;
+   if (flags & PIN_OFFSET_FIXED) {
+   uint64_t offset = flags & PIN_OFFSET_MASK;
+   if (offset & (alignment - 1) || offset + size > end) {
+   vma = ERR_PTR(-EINVAL);
+   goto err_free_vma;
+   }
+   vma->node.start = offset;
+   vma->node.size = size;
+   vma->node.color = obj->cache_level;
+   ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+   if (ret) {
+   ret = i915_gem_evict_for_vma(vma, flags);
+   if (ret == 0)
+   ret = drm_mm_reserve_node(&vm->mm, &vma->node);
+   }
+   if (ret) {
+   vma = ERR_PTR(ret);
+   goto err_free_vma;
+   }
} else {
-   search_flag = DRM_MM_SEARCH_DEFAULT;
-   alloc_flag = DRM_MM_CREATE_DEFAULT;
-   }
+   u32 search_flag, alloc_flag;
+
+   if (flags & PIN_HIGH) {
+   search_flag = DRM_MM_SEARCH_BELOW;
+   alloc_flag = DRM_MM_CREATE_TOP;
+   } else {
+   search_flag = DRM_MM_SEARCH_DEFAULT;
+   alloc_flag = DRM_MM_CREATE_DEFAULT;
+   }
 
 search_free:
-   ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
- 

[Intel-gfx] [PATCH 2/3] drm/i915: Allow the user to pass a context to any ring

2015-10-06 Thread Chris Wilson
With full-ppgtt, we want the user to have full control over their memory
layout, with a separate instance per context. Forcing them to use a
shared memory layout for !RCS not only duplicates the amount of work we
have to do, but also defeats the memory segregation on offer.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a01c1ebe47ca..19dd6b05ee1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1033,18 +1033,13 @@ static struct intel_context *
 i915_gem_validate_context(struct drm_device *dev, struct drm_file *file,
  struct intel_engine_cs *ring, const u32 ctx_id)
 {
-   struct intel_context *ctx = NULL;
-   struct i915_ctx_hang_stats *hs;
-
-   if (ring->id != RCS && ctx_id != DEFAULT_CONTEXT_HANDLE)
-   return ERR_PTR(-EINVAL);
+   struct intel_context *ctx;
 
ctx = i915_gem_context_get(file->driver_priv, ctx_id);
if (IS_ERR(ctx))
return ctx;
 
-   hs = &ctx->hang_stats;
-   if (hs->banned) {
+   if (ctx->hang_stats.banned) {
DRM_DEBUG("Context %u tried to submit while banned\n", ctx_id);
return ERR_PTR(-EIO);
}
-- 
2.6.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Kumar, Shobhit

On 10/06/2015 04:11 PM, Imre Deak wrote:

On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:

On 10/05/2015 09:05 PM, Imre Deak wrote:

On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:

Mostly reuse what is programmed by pre-os, but in case there is no
pre-os initialization, init the cdclk with the default value.

Cc: Imre Deak 
Signed-off-by: Shobhit Kumar 
---
   drivers/gpu/drm/i915/intel_ddi.c | 6 ++
   1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 2d3cc82..675c60d 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)

cdclk_freq = dev_priv->display.get_display_clock_speed(dev);
dev_priv->skl_boot_cdclk = cdclk_freq;
-   if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
-   DRM_ERROR("LCPLL1 is disabled\n");
-   else
-   intel_display_power_get(dev_priv, POWER_DOMAIN_PLLS);
+
+   skl_init_cdclk(dev_priv);


How does this prevent changing the clock if BIOS did enable some output?
We shouldn't change the clock in that case.


In that case it will try to re-apply the same clock that BIOS enabled.
Not sure if this is allowed, but I checked the cdclock change sequence
and it is mostly followed in skl_init_cdclk.
In my tests where BIOS does enable this, I faced no issues in
initializing again in driver.


The first step in that sequence:
"Disable all display engine functions using the full mode set disable
sequence on all pipes, ports, and planes."


Oh, yeah, I again made the mistake of assuming that the display is not
enabled in the first place. You are right, though it works if I change the
clock again.




So the problem is not that the PLL itself may be enabled here (as BIOS
left it), but that some output is also enabled.


Yes.




I have noticed that on some pre-os this value is programmed correctly except
for the decimal part. That causes AUX transactions to fail on SKL. That
is what actually triggered this patch. So the other way is to validate the
complete value in get_display_clock_speed instead of just bit[28:26], and
do the cdclk init only if it is wrong.


I think we'd need to detect at this point if outputs are enabled and
only attempt to work around the above BIOS problem if this is not the
case. Alternatively you could also disable the active outputs as a first
step.


Ok, let me detect if any output is enabled by BIOS and accordingly 
initialize cdclk.


Regards
Shobhit





Regards
Shobhit




} else if (IS_BROXTON(dev)) {
broxton_init_cdclk(dev);
broxton_ddi_phy_init(dev);










Re: [Intel-gfx] [PATCH 1/3] drm: Track drm_mm nodes with an interval tree

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 11:53:09AM +0100, Chris Wilson wrote:
> In addition to the last-in/first-out stack for accessing drm_mm nodes,
> we occasionally and in the future often want to find a drm_mm_node by an
> address. To do so efficiently we need to track the nodes in an interval
> tree - lookups for a particular address will then be O(lg(N)), where N
> is the number of nodes in the range manager as opposed to O(N).
> Insertion however gains an extra O(lg(N)) step for all nodes
> irrespective of whether the interval tree is in use. For future i915
> patches, eliminating the linear walk is a significant improvement.
> 
> Signed-off-by: Chris Wilson 
> Cc: dri-de...@lists.freedesktop.org

For the vma manager David Herrmann put the interval tree outside of drm_mm.
Whichever way we pick, I think we should be consistent about this.
-Daniel

> ---
>  drivers/gpu/drm/Kconfig  |  1 +
>  drivers/gpu/drm/drm_mm.c | 71 
> 
>  include/drm/drm_mm.h |  5 
>  3 files changed, 54 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 06ae5008c5ed..e25050a5a843 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -12,6 +12,7 @@ menuconfig DRM
>   select I2C
>   select I2C_ALGOBIT
>   select DMA_SHARED_BUFFER
> + select INTERVAL_TREE
>   help
> Kernel-level support for the Direct Rendering Infrastructure (DRI)
> introduced in XFree86 4.0. If you say Y here, you need to select
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 04de6fd88f8c..e3acd860f738 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -153,6 +153,10 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>   INIT_LIST_HEAD(&node->hole_stack);
>   list_add(&node->node_list, &hole_node->node_list);
>  
> + node->it.start = node->start;
> + node->it.last = node->start + size - 1;
> + interval_tree_insert(&node->it, &mm->interval_tree);
> +
>   BUG_ON(node->start + node->size > adj_end);
>  
>   node->hole_follows = 0;
> @@ -178,39 +182,53 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>   */
>  int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)
>  {
> - struct drm_mm_node *hole;
>   u64 end = node->start + node->size;
> - u64 hole_start;
> - u64 hole_end;
> -
> - BUG_ON(node == NULL);
> + struct interval_tree_node *it;
> + struct drm_mm_node *hole;
> + u64 hole_start, hole_end;
>  
>   /* Find the relevant hole to add our node to */
> - drm_mm_for_each_hole(hole, mm, hole_start, hole_end) {
> - if (hole_start > node->start || hole_end < end)
> - continue;
> + it = interval_tree_iter_first(&mm->interval_tree,
> +   node->start, (u64)-1);
> + if (it == NULL) {
> + hole = list_last_entry(&mm->head_node.node_list,
> +struct drm_mm_node, node_list);
> + } else {
> + hole = container_of(it, typeof(*hole), it);
> + if (hole->start <= node->start)
> + return -ENOSPC;
> +
> + hole = list_last_entry(&hole->node_list,
> +struct drm_mm_node, node_list);
> + }
>  
> - node->mm = mm;
> - node->allocated = 1;
> + hole_start = drm_mm_hole_node_start(hole);
> + hole_end = drm_mm_hole_node_end(hole);
> + if (hole_start > node->start || hole_end < end)
> + return -ENOSPC;
>  
> - INIT_LIST_HEAD(&node->hole_stack);
> - list_add(&node->node_list, &hole->node_list);
> + node->mm = mm;
> + node->allocated = 1;
>  
> - if (node->start == hole_start) {
> - hole->hole_follows = 0;
> - list_del_init(&hole->hole_stack);
> - }
> + INIT_LIST_HEAD(&node->hole_stack);
> + list_add(&node->node_list, &hole->node_list);
>  
> - node->hole_follows = 0;
> - if (end != hole_end) {
> - list_add(&node->hole_stack, &mm->hole_stack);
> - node->hole_follows = 1;
> - }
> + node->it.start = node->start;
> + node->it.last = node->start + node->size - 1;
> + interval_tree_insert(&node->it, &mm->interval_tree);
>  
> - return 0;
> + if (node->start == hole_start) {
> + hole->hole_follows = 0;
> + list_del_init(&hole->hole_stack);
>   }
>  
> - return -ENOSPC;
> + node->hole_follows = 0;
> + if (end != hole_end) {
> + list_add(&node->hole_stack, &mm->hole_stack);
> + node->hole_follows = 1;
> + }
> +
> + return 0;
>  }
>  EXPORT_SYMBOL(drm_mm_reserve_node);
>  
> @@ -300,6 +318,10 @@ static void drm_mm_insert_helper_range(struct 

Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> On 10/06/2015 04:11 PM, Imre Deak wrote:
> >On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> >>On 10/05/2015 09:05 PM, Imre Deak wrote:
> >>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> Mostly reuse what is programmed by pre-os, but in case there is no
> pre-os initialization, init the cdclk with the default value.
> 
> Cc: Imre Deak 
> Signed-off-by: Shobhit Kumar 
> ---
>    drivers/gpu/drm/i915/intel_ddi.c | 6 ++
>    1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> index 2d3cc82..675c60d 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> 
>   cdclk_freq = dev_priv->display.get_display_clock_speed(dev);
>   dev_priv->skl_boot_cdclk = cdclk_freq;
> - if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> - DRM_ERROR("LCPLL1 is disabled\n");
> - else
> - intel_display_power_get(dev_priv, POWER_DOMAIN_PLLS);
> +
> + skl_init_cdclk(dev_priv);
> >>>
> >>>How does this prevent changing the clock if BIOS did enable some output?
> >>>We shouldn't change the clock in that case.
> >>
> >>In that case it will try to re-apply the same clock that BIOS enabled.
> >>Not sure if this is allowed, but I checked the cdclock change sequence
> >>and it is mostly followed in skl_init_cdclk.
> >>In my tests where BIOS does enable this, I faced no issues in
> >>initializing again in driver.
> >
> >The first step in that sequence:
> >"Disable all display engine functions using the full mode set disable
> >sequence on all pipes, ports, and planes."
> 
> Oh, yeah, I again made the mistake of assuming that the display is not
> enabled in the first place. You are right, though it works if I change the
> clock again.
> 
> >
> >So the problem is not that the PLL itself may be enabled here (as BIOS
> >left it), but that some output is also enabled.
> 
> Yes.
> 
> >
> >>I have noticed that on some pre-os this value is programmed correctly except
> >>for the decimal part. That causes AUX transactions to fail on SKL. That
> >>is what actually triggered this patch. So the other way is to validate the
> >>complete value in get_display_clock_speed instead of just bit[28:26], and
> >>do the cdclk init only if it is wrong.
> >
> >I think we'd need to detect at this point if outputs are enabled and
> >only attempt to work around the above BIOS problem if this is not the
> >case. Alternatively you could also disable the active outputs as a first
> >step.
> 
> Ok, let me detect if any output is enabled by BIOS and accordingly
> initialize cdclk.

These kinds of fixups should be done after the hw state readout. We
already have sanitize_crtc/pll/encoder functions, probably best if we add
a sanitize_cdclk or similar for this at the very end of the hw state
sanitize sequence.
-Daniel
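
A rough sketch of the decision being discussed, assuming it runs after the hw state readout. Every name here (struct hw_state, enum cdclk_action, sanitize_cdclk) is hypothetical and only illustrates the ordering; it is not existing i915 code. It takes the first option from the thread: leave the clock alone whenever an output is already enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what the hw state readout found. */
struct hw_state {
	bool any_output_enabled;	/* BIOS left a pipe/port/plane active */
	bool cdclk_valid;		/* complete cdclk value checked out */
};

enum cdclk_action { CDCLK_KEEP, CDCLK_REINIT };

/*
 * Hypothetical sanitize step: only reprogram cdclk when the value is
 * wrong and no output is active, since the change sequence starts by
 * disabling all display engine functions.
 */
static enum cdclk_action sanitize_cdclk(const struct hw_state *state)
{
	if (state->cdclk_valid)
		return CDCLK_KEEP;
	if (state->any_output_enabled)
		return CDCLK_KEEP;
	return CDCLK_REINIT;
}

static void example_usage(void)
{
	struct hw_state bad_idle = { false, false };
	struct hw_state bad_busy = { true, false };

	assert(sanitize_cdclk(&bad_idle) == CDCLK_REINIT);
	assert(sanitize_cdclk(&bad_busy) == CDCLK_KEEP);
}
```

The alternative raised earlier in the thread, disabling the active outputs first, would turn the second branch into a disable step followed by a reinit.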
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 11:39:55AM +0100, Chris Wilson wrote:
> Passing cliprects into the kernel for it to re-execute the batch buffer
> with different CMD_DRAWRECT died out long ago. As DRI1 support has been
> removed from the kernel, we can now simply reject any execbuf trying to
> use this "feature".
> 
> To keep Daniel happy with the prospect of being able to reuse these
> fields in the next decade, continue to ensure that current userspace is
> not passing garbage in through the dead fields.
> 
> v2: Fix the cliprects_ptr check
> 
> Signed-off-by: Chris Wilson 
> Cc: Daniel Vetter 

igt subtest seems to be missing to ensure we enforce this. Yay otherwise!
-Daniel
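
For reference, a minimal userspace sketch of the check such a subtest would be exercising. struct exec_args is a hypothetical, trimmed-down stand-in for the execbuffer2 arguments, and the UXA DR4 fixup from the patch is left out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the execbuffer2 arguments. */
struct exec_args {
	uint64_t cliprects_ptr;
	uint32_t num_cliprects;
	uint32_t DR1;
	uint32_t DR4;
	uint32_t batch_start_offset;
	uint32_t batch_len;
};

/* True only if all dead DRI1 fields are zero and the batch is 8-byte aligned. */
static bool check_exec_args(const struct exec_args *exec)
{
	if (exec->num_cliprects || exec->cliprects_ptr)
		return false;
	if (exec->DR1 || exec->DR4)
		return false;
	if ((exec->batch_start_offset | exec->batch_len) & 0x7)
		return false;
	return true;
}

static void example_usage(void)
{
	struct exec_args ok = { 0, 0, 0, 0, 0, 64 };
	struct exec_args stale_ptr = ok;
	struct exec_args unaligned = ok;

	stale_ptr.cliprects_ptr = 0x1000;	/* pointer set, count zero */
	unaligned.batch_len = 60;		/* not a multiple of 8 */

	assert(check_exec_args(&ok));
	assert(!check_exec_args(&stale_ptr));
	assert(!check_exec_args(&unaligned));
}
```

A subtest would presumably set each dead field in turn and expect the execbuf to be rejected.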

> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 154 ++---
>  drivers/gpu/drm/i915/intel_lrc.c   |  15 ---
>  2 files changed, 31 insertions(+), 138 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 75a0c8b5305b..045a7631faa0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -947,7 +947,21 @@ i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
>   if (exec->flags & __I915_EXEC_UNKNOWN_FLAGS)
>   return false;
>  
> - return ((exec->batch_start_offset | exec->batch_len) & 0x7) == 0;
> + /* Kernel clipping was a DRI1 misfeature */
> + if (exec->num_cliprects || exec->cliprects_ptr)
> + return false;
> +
> + if (exec->DR4 == 0xffffffff) {
> + DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
> + exec->DR4 = 0;
> + }
> + if (exec->DR1 || exec->DR4)
> + return false;
> +
> + if ((exec->batch_start_offset | exec->batch_len) & 0x7)
> + return false;
> +
> + return true;
>  }
>  
>  static int
> @@ -1111,47 +1125,6 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
>   return 0;
>  }
>  
> -static int
> -i915_emit_box(struct drm_i915_gem_request *req,
> -   struct drm_clip_rect *box,
> -   int DR1, int DR4)
> -{
> - struct intel_engine_cs *ring = req->ring;
> - int ret;
> -
> - if (box->y2 <= box->y1 || box->x2 <= box->x1 ||
> - box->y2 <= 0 || box->x2 <= 0) {
> - DRM_ERROR("Bad box %d,%d..%d,%d\n",
> -   box->x1, box->y1, box->x2, box->y2);
> - return -EINVAL;
> - }
> -
> - if (INTEL_INFO(ring->dev)->gen >= 4) {
> - ret = intel_ring_begin(req, 4);
> - if (ret)
> - return ret;
> -
> - intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO_I965);
> - intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
> - intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
> - intel_ring_emit(ring, DR4);
> - } else {
> - ret = intel_ring_begin(req, 6);
> - if (ret)
> - return ret;
> -
> - intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO);
> - intel_ring_emit(ring, DR1);
> - intel_ring_emit(ring, (box->x1 & 0xffff) | box->y1 << 16);
> - intel_ring_emit(ring, ((box->x2 - 1) & 0xffff) | (box->y2 - 1) << 16);
> - intel_ring_emit(ring, DR4);
> - intel_ring_emit(ring, 0);
> - }
> - intel_ring_advance(ring);
> -
> - return 0;
> -}
> -
>  static struct drm_i915_gem_object*
>  i915_gem_execbuffer_parse(struct intel_engine_cs *ring,
> struct drm_i915_gem_exec_object2 *shadow_exec_entry,
> @@ -1210,65 +1183,21 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>  struct drm_i915_gem_execbuffer2 *args,
>  struct list_head *vmas)
>  {
> - struct drm_clip_rect *cliprects = NULL;
>   struct drm_device *dev = params->dev;
>   struct intel_engine_cs *ring = params->ring;
>   struct drm_i915_private *dev_priv = dev->dev_private;
>   u64 exec_start, exec_len;
>   int instp_mode;
>   u32 instp_mask;
> - int i, ret = 0;
> -
> - if (args->num_cliprects != 0) {
> - if (ring != &dev_priv->ring[RCS]) {
> - DRM_DEBUG("clip rectangles are only valid with the render ring\n");
> - return -EINVAL;
> - }
> -
> - if (INTEL_INFO(dev)->gen >= 5) {
> - DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
> - return -EINVAL;
> - }
> -
> - if (args->num_cliprects > UINT_MAX / sizeof(*cliprects)) {
> - DRM_DEBUG("execbuf with %u cliprects\n",
> -   args->num_cliprects);
> - return -EINVAL;
> - }
> -
> - cliprects = kcalloc(args->num_cliprects,
> - sizeof(*clipr

Re: [Intel-gfx] [PATCH 1/3] drm: Track drm_mm nodes with an interval tree

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 01:11:56PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 11:53:09AM +0100, Chris Wilson wrote:
> > In addition to the last-in/first-out stack for accessing drm_mm nodes,
> > we occasionally and in the future often want to find a drm_mm_node by an
> > address. To do so efficiently we need to track the nodes in an interval
> > tree - lookups for a particular address will then be O(lg(N)), where N
> > is the number of nodes in the range manager as opposed to O(N).
> > Insertion however gains an extra O(lg(N)) step for all nodes
> > irrespective of whether the interval tree is in use. For future i915
> > patches, eliminating the linear walk is a significant improvement.
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: dri-de...@lists.freedesktop.org
> 
> For the vma manager David Herrmann put the interval tree outside of drm_mm.
> Whichever way we pick, I think we should be consistent about this.

Given that the basis of this patch is that functionality exposed by
drm_mm (i.e. drm_mm_reserve_node) is too slow for our use case (i.e.
there is a measurable perf degradation if we switch over from the mru
stack to using fixed addresses) it makes sense to improve that
functionality. The question is then why the drm_vma_manager didn't use
and improve the existing functionality...
-Chris
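
To put the O(N) versus O(lg(N)) point in concrete terms, here is a small userspace sketch. It is not the kernel interval tree: a sorted array with binary search stands in for it, and struct range_node, find_linear and find_sorted are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an allocated node covering [start, last]. */
struct range_node {
	uint64_t start;
	uint64_t last;	/* inclusive, like start + size - 1 in the patch */
};

/* O(N): walk every node until one contains addr. */
static const struct range_node *
find_linear(const struct range_node *nodes, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= nodes[i].start && addr <= nodes[i].last)
			return &nodes[i];
	return NULL;
}

/* O(lg N): binary search; nodes must be sorted by start and not overlap. */
static const struct range_node *
find_sorted(const struct range_node *nodes, size_t n, uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addr < nodes[mid].start)
			hi = mid;
		else if (addr > nodes[mid].last)
			lo = mid + 1;
		else
			return &nodes[mid];
	}
	return NULL;
}

static void example_usage(void)
{
	static const struct range_node nodes[] = {
		{ 0x0000, 0x0fff }, { 0x2000, 0x3fff }, { 0x8000, 0x8fff },
	};
	size_t n = sizeof(nodes) / sizeof(nodes[0]);

	assert(find_sorted(nodes, n, 0x2abc) == &nodes[1]);
	assert(find_sorted(nodes, n, 0x1000) == NULL);	/* address in a hole */
	assert(find_sorted(nodes, n, 0x8fff) == find_linear(nodes, n, 0x8fff));
}
```

Both lookups return the same node; only the number of nodes visited differs, which is what matters once reservations at fixed addresses become common.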

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)

2015-10-06 Thread Dave Gordon

On 06/10/15 09:38, Daniel Vetter wrote:

On Mon, Oct 05, 2015 at 05:59:50PM +0100, Michel Thierry wrote:

On 10/5/2015 5:36 PM, Dave Gordon wrote:

On 02/10/15 14:16, Michel Thierry wrote:

We tried to fix this in commit fdc454c1484a ("drm/i915: Prevent out of
range pt in gen6_for_each_pde").

But the static analyzer still complains that, just before we break due
to "iter < I915_PDES", we do "pt = (pd)->page_table[iter]" with an
iter value that is bigger than I915_PDES. Of course, this isn't really
a problem since no one uses pt outside the macro. Still, every single
new usage of the macro will create a new issue for us to mark as a
false positive.

Also, Paulo re-started the discussion a while ago [1], but it didn't end up
being implemented.

In order to "solve" this "problem", this patch takes the ideas from
Chris and Dave, but that check would change the desired behavior of the
code, because the object (for example pdp->page_directory[iter]) can be
null during init/alloc, and C would take this as false, breaking the for
loop immediately.

This has been already verified with "static analysis tools".

[1]http://lists.freedesktop.org/archives/intel-gfx/2015-June/068548.html

v2: Make it a single statement, while preventing the common subexpression
elimination (Chris)

Cc: Paulo Zanoni 
Cc: Chris Wilson 
Cc: Dave Gordon 
Signed-off-by: Michel Thierry 
---
  drivers/gpu/drm/i915/i915_gem_gtt.h | 14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 9fbb07d..a216397 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -394,7 +394,8 @@ struct i915_hw_ppgtt {
   */
  #define gen6_for_each_pde(pt, pd, start, length, temp, iter) \
  for (iter = gen6_pde_index(start); \
- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES; \
+ length > 0 && iter < I915_PDES ? \
+(pt = (pd)->page_table[iter]), 1 : 0; \
   iter++, \
   temp = ALIGN(start+1, 1 << GEN6_PDE_SHIFT) - start, \
   temp = min_t(unsigned, temp, length), \
@@ -459,7 +460,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)
   */
  #define gen8_for_each_pde(pt, pd, start, length, temp, iter)\
  for (iter = gen8_pde_index(start); \
- pt = (pd)->page_table[iter], length > 0 && iter < I915_PDES;\
+ length > 0 && iter < I915_PDES ? \
+(pt = (pd)->page_table[iter]), 1 : 0; \
   iter++,\
   temp = ALIGN(start+1, 1 << GEN8_PDE_SHIFT) - start,\
   temp = min(temp, length),\
@@ -467,8 +469,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)

  #define gen8_for_each_pdpe(pd, pdp, start, length, temp, iter)\
  for (iter = gen8_pdpe_index(start); \
- pd = (pdp)->page_directory[iter], \
- length > 0 && (iter < I915_PDPES_PER_PDP(dev)); \
+ length > 0 && (iter < I915_PDPES_PER_PDP(dev)) ? \
+(pd = (pdp)->page_directory[iter]), 1 : 0; \
   iter++,\
   temp = ALIGN(start+1, 1 << GEN8_PDPE_SHIFT) - start,\
   temp = min(temp, length),\
@@ -476,8 +478,8 @@ static inline uint32_t gen6_pde_index(uint32_t addr)

  #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)\
  for (iter = gen8_pml4e_index(start);\
- pdp = (pml4)->pdps[iter], \
- length > 0 && iter < GEN8_PML4ES_PER_PML4; \
+ length > 0 && iter < GEN8_PML4ES_PER_PML4 ? \
+(pdp = (pml4)->pdps[iter]), 1 : 0; \


this won't compile -- see below


Hmm, it compiled (it also got rid of the "analysis tool error" and I didn't
see any behavior change).
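
A minimal standalone sketch of why a condition of this shape compiles. In the C grammar the middle operand of ?: is a full expression, so "c ? (x = v), 1 : 0" groups as "c ? ((x = v), 1) : 0". The names below (NUM_ENTRIES, struct table, for_each_entry, count_visited) are hypothetical stand-ins, not the i915 definitions.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ENTRIES 4	/* hypothetical stand-in for I915_PDES */

struct table {
	int *entry[NUM_ENTRIES];
};

/*
 * The condition groups as
 *   (iter < NUM_ENTRIES) ? ((pt = ...), 1) : 0
 * so pt is only loaded while iter is in range, and a NULL pt still
 * yields 1 and does not end the loop.
 */
#define for_each_entry(pt, tbl, iter) \
	for ((iter) = 0; \
	     (iter) < NUM_ENTRIES ? (pt = (tbl)->entry[iter]), 1 : 0; \
	     (iter)++)

/* Returns the number of iterations; *nulls counts the NULL entries seen. */
static int count_visited(struct table *tbl, int *nulls)
{
	int *pt;
	int iter, visited = 0;

	*nulls = 0;
	for_each_entry(pt, tbl, iter) {
		visited++;
		if (pt == NULL)
			(*nulls)++;
	}
	return visited;
}

static void example_usage(void)
{
	int a = 1, b = 2;
	struct table tbl = { { &a, NULL, &b, NULL } };
	int nulls;

	assert(count_visited(&tbl, &nulls) == NUM_ENTRIES);
	assert(nulls == 2);
}
```

The loop visits every slot even though two of them are NULL, and entry[NUM_ENTRIES] is never read.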




   iter++,\
   temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,\
   temp = min(temp, length),\


The man page for C operators tells us:

Operator                             Associativity
() [] -> .                           left to right
! ~ ++ -- + - (type) * & sizeof      right to left
* / %                                left to right
+ -                                  left to right
<< >>                                left to right
< <= > >=                            left to right
== !=                                left to right
&                                    left to right
^                                    left to right
|                                    left to right
&&                                   left to right
||                                   left to right
?:                                   right to left
= += -= *= /= %= <<= >>= &= ^= |=    right to left
,                                    left to right

So there's a problem with the above code, because the comma o

Re: [Intel-gfx] [PATCH 2/2] drm/i915: Drop i915_gem_obj_is_pinned() from set-cache-level

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 11:39:56AM +0100, Chris Wilson wrote:
> Since the removal of the pin-ioctl, we only care about not changing the
> cache level on buffers pinned to the hardware as indicated by
> obj->pin_display. So we can safely replace i915_gem_obj_is_pinned()
> here with a plain obj->pin_display check. During rebinding, we will perform
> sanity checks in case vma->pin_count is erroneously set.
> 
> At the same time, we can micro-optimise GTT mmap() behaviour since we
> only need to relinquish the mmaps before Sandybridge.

Actual condition is !LLC so would need to be updated (and split out imo).
 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 40 
>  1 file changed, 24 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d4a3bdf0c5b6..2b8ed7a2faab 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3629,31 +3629,34 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  {
>   struct drm_device *dev = obj->base.dev;
>   struct i915_vma *vma, *next;
> + bool bound = false;
>   int ret = 0;
>  
>   if (obj->cache_level == cache_level)
>   goto out;
>  
> - if (i915_gem_obj_is_pinned(obj)) {
> - DRM_DEBUG("can not change the cache level of pinned objects\n");
> - return -EBUSY;
> - }
> -
>   list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> + if (!drm_mm_node_allocated(&vma->node))
> + continue;
> +
> + if (vma->pin_count) {
> + DRM_DEBUG("can not change the cache level of pinned objects\n");
> + return -EBUSY;
> + }
> +
>   if (!i915_gem_valid_gtt_space(vma, cache_level)) {
>   ret = i915_vma_unbind(vma);
>   if (ret)
>   return ret;
> - }
> + } else
> + bound = true;
>   }
>  
> - if (i915_gem_obj_bound_any(obj)) {
> + if (bound) {
>   ret = i915_gem_object_wait_rendering(obj, false);
>   if (ret)
>   return ret;

Shouldn't the below be split out into a separate patch? And maybe for
paranoia keep calling finish_gtt but restrict it to !LLC && snooped like
you do below.
-Daniel

>  
> - i915_gem_object_finish_gtt(obj);
> -
>   /* Before SandyBridge, you could not use tiling or fence
>* registers with snooped memory, so relinquish any fences
>* currently pointing to our region in the aperture.
> @@ -3664,13 +3667,18 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>   return ret;
>   }
>  
> - list_for_each_entry(vma, &obj->vma_list, vma_link)
> - if (drm_mm_node_allocated(&vma->node)) {
> - ret = i915_vma_bind(vma, cache_level,
> - PIN_UPDATE);
> - if (ret)
> - return ret;
> - }
> + /* Access to snoopable pages through the GTT is incoherent. */
> + if (cache_level != I915_CACHE_NONE && !HAS_LLC(dev))
> + i915_gem_release_mmap(obj);
> +
> + list_for_each_entry(vma, &obj->vma_list, vma_link) {
> + if (!drm_mm_node_allocated(&vma->node))
> + continue;
> +
> + ret = i915_vma_bind(vma, cache_level, PIN_UPDATE);
> + if (ret)
> + return ret;
> + }
>   }
>  
>   list_for_each_entry(vma, &obj->vma_list, vma_link)
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[Intel-gfx] [PATCH 3/3] drm/i915: add helpers for platform specific revision id range checks

2015-10-06 Thread Jani Nikula
Revision checks are almost always accompanied by a platform check. (The
exceptions are platform specific code.) Add helpers to check for a
platform and a revision range: IS_SKL_REVID() and IS_BXT_REVID(). In
most places this simplifies and clarifies the code. It will be obvious
that revid macros are used for the correct platform.

This should make it easier to find all the revision checks for
workarounds for each platform, and make it easier to remove them once we
drop support for early hardware revisions.

This should also make it easier to differentiate between Skylake and
Kabylake revision checks when Kabylake support is added.
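
As an illustration only, here is how the inclusive range helpers behave in a userspace sketch. struct mock_dev and enum platform are hypothetical stand-ins for the device and the IS_BROXTON() style platform checks; the REVID values are the ones defined in the patch.

```c
#include <assert.h>
#include <stdbool.h>

enum platform { PLATFORM_SKL, PLATFORM_BXT };	/* hypothetical */

/* Hypothetical stand-in for the device: platform plus PCI revision. */
struct mock_dev {
	enum platform platform;
	int revid;
};

#define REVID_FOREVER	0xff
#define BXT_REVID_A1	0x1
#define BXT_REVID_B0	0x3

/* True if the revision is in [since, until], inclusive at both ends. */
#define IS_REVID(d, since, until) \
	((d)->revid >= (since) && (d)->revid <= (until))

/* Platform check and revision range in one place. */
#define IS_BXT_REVID(d, since, until) \
	((d)->platform == PLATFORM_BXT && IS_REVID(d, since, until))

static void example_usage(void)
{
	struct mock_dev bxt_a1 = { PLATFORM_BXT, BXT_REVID_A1 };
	struct mock_dev bxt_b0 = { PLATFORM_BXT, BXT_REVID_B0 };
	struct mock_dev skl = { PLATFORM_SKL, 0x0 };

	assert(IS_BXT_REVID(&bxt_a1, 0, BXT_REVID_A1));
	assert(!IS_BXT_REVID(&bxt_b0, 0, BXT_REVID_A1));
	assert(IS_BXT_REVID(&bxt_b0, BXT_REVID_B0, REVID_FOREVER));
	assert(!IS_BXT_REVID(&skl, 0, REVID_FOREVER));
}
```

The last assertion is the point of bundling the platform check: a Skylake device never matches a Broxton revision range, whatever its revision id happens to be.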

Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/i915_drv.h| 13 +
 drivers/gpu/drm/i915/i915_gem.c|  2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c |  6 ++--
 drivers/gpu/drm/i915/intel_ddi.c   |  3 +-
 drivers/gpu/drm/i915/intel_dp.c|  4 +--
 drivers/gpu/drm/i915/intel_guc_loader.c|  4 +--
 drivers/gpu/drm/i915/intel_hdmi.c  |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c   | 26 -
 drivers/gpu/drm/i915/intel_pm.c| 29 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c| 46 --
 10 files changed, 69 insertions(+), 66 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9833a2055930..578563ca0d5c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2440,6 +2440,15 @@ struct drm_i915_cmd_table {
 #define INTEL_DEVID(p) (INTEL_INFO(p)->device_id)
 #define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision)
 
+#define REVID_FOREVER  0xff
+/*
+ * Return true if revision is in range [since,until] inclusive.
+ *
+ * Use 0 for open-ended since, and REVID_FOREVER for open-ended until.
+ */
+#define IS_REVID(p, since, until) \
+   (INTEL_REVID(p) >= (since) && INTEL_REVID(p) <= (until))
+
 #define IS_I830(dev)   (INTEL_DEVID(dev) == 0x3577)
 #define IS_845G(dev)   (INTEL_DEVID(dev) == 0x2562)
 #define IS_I85X(dev)   (INTEL_INFO(dev)->is_i85x)
@@ -2508,11 +2517,15 @@ struct drm_i915_cmd_table {
 #define SKL_REVID_E0   0x4
 #define SKL_REVID_F0   0x5
 
+#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, until))
+
 #define BXT_REVID_A0   0x0
 #define BXT_REVID_A1   0x1
 #define BXT_REVID_B0   0x3
 #define BXT_REVID_C0   0x9
 
+#define IS_BXT_REVID(p, since, until) (IS_BROXTON(p) && IS_REVID(p, since, until))
+
 /*
  * The genX designation typically refers to the render engine, so render
  * capability related checks should use IS_GEN, while display and other checks
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fd2d880656b2..8af33a48204f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 * cacheline, whereas normally such cachelines would get
 * invalidated.
 */
-   if (IS_BROXTON(dev) && INTEL_REVID(dev) <= BXT_REVID_A1)
+   if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
return -ENODEV;
 
level = I915_CACHE_LLC;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 863aa5c82466..4bf9aa54c75e 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -161,9 +161,9 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
/* WaRsDisableCoarsePowerGating:skl,bxt */
if (!intel_enable_rc6(dev_priv->dev) ||
-   (IS_BROXTON(dev) && (INTEL_REVID(dev) <= BXT_REVID_A1)) ||
-   (IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)) ||
-   (IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)))
+   IS_BXT_REVID(dev, 0, BXT_REVID_A1) ||
+   (IS_SKL_GT3(dev) && IS_SKL_REVID(dev, 0, SKL_REVID_E0)) ||
+   (IS_SKL_GT4(dev) && IS_SKL_REVID(dev, 0, SKL_REVID_E0)))
data[1] = 0;
else
/* bit 0 and 1 are for Render and Media domain separately */
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index b80e0f5ec5dc..7bcca708393d 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -3247,8 +3247,7 @@ void intel_ddi_init(struct drm_device *dev, enum port port)
 * On BXT A0/A1, sw needs to activate DDIA HPD logic and
 * interrupts to check the external panel connection.
 */
-   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1)
-&& port == PORT_B)
+   if (IS_BXT_REVID(dev, 0, BXT_REVID_A

[Intel-gfx] [PATCH 2/3] drm/i915/bxt: add revision id for A1 stepping and use it

2015-10-06 Thread Jani Nikula
Prefer inclusive ranges for revision checks rather than "below B0". Per
specs A2 is not used, so revid <= A1 matches revid < B0.

Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/i915_drv.h| 1 +
 drivers/gpu/drm/i915/i915_gem.c| 2 +-
 drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
 drivers/gpu/drm/i915/intel_ddi.c   | 2 +-
 drivers/gpu/drm/i915/intel_dp.c| 2 +-
 drivers/gpu/drm/i915/intel_hdmi.c  | 2 +-
 drivers/gpu/drm/i915/intel_lrc.c   | 8 
 drivers/gpu/drm/i915/intel_pm.c| 6 +++---
 drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +++---
 9 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a3b137715604..9833a2055930 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2509,6 +2509,7 @@ struct drm_i915_cmd_table {
 #define SKL_REVID_F0   0x5
 
 #define BXT_REVID_A0   0x0
+#define BXT_REVID_A1   0x1
 #define BXT_REVID_B0   0x3
 #define BXT_REVID_C0   0x9
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f0cfbb9ee12c..fd2d880656b2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, 
void *data,
 * cacheline, whereas normally such cachelines would get
 * invalidated.
 */
-   if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0)
+   if (IS_BROXTON(dev) && INTEL_REVID(dev) <= BXT_REVID_A1)
return -ENODEV;
 
level = I915_CACHE_LLC;
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 036b42bae827..863aa5c82466 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -161,7 +161,7 @@ static int host2guc_sample_forcewake(struct intel_guc *guc,
data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
/* WaRsDisableCoarsePowerGating:skl,bxt */
if (!intel_enable_rc6(dev_priv->dev) ||
-   (IS_BROXTON(dev) && (INTEL_REVID(dev) < BXT_REVID_B0)) ||
+   (IS_BROXTON(dev) && (INTEL_REVID(dev) <= BXT_REVID_A1)) ||
(IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)) ||
(IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)))
data[1] = 0;
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index b25e99a432fb..b80e0f5ec5dc 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -3247,7 +3247,7 @@ void intel_ddi_init(struct drm_device *dev, enum port 
port)
 * On BXT A0/A1, sw needs to activate DDIA HPD logic and
 * interrupts to check the external panel connection.
 */
-   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0)
+   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1)
 && port == PORT_B)
dev_priv->hotplug.irq_port[PORT_A] = intel_dig_port;
else
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 8d34ca7b287a..8baf6fe06313 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -6087,7 +6087,7 @@ intel_dp_init_connector(struct intel_digital_port 
*intel_dig_port,
break;
case PORT_B:
intel_encoder->hpd_pin = HPD_PORT_B;
-   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0))
+   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1))
intel_encoder->hpd_pin = HPD_PORT_A;
break;
case PORT_C:
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c 
b/drivers/gpu/drm/i915/intel_hdmi.c
index 03d85909c6ab..32e9117ee8e3 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -2068,7 +2068,7 @@ void intel_hdmi_init_connector(struct intel_digital_port 
*intel_dig_port,
 * On BXT A0/A1, sw needs to activate DDIA HPD logic and
 * interrupts to check the external panel connection.
 */
-   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0))
+   if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1))
intel_encoder->hpd_pin = HPD_PORT_A;
else
intel_encoder->hpd_pin = HPD_PORT_B;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 825fa7a8df86..acd4fa332a80 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1986,7 +1986,7 @@ static int logical_render_ring_init(struct drm_device 
*dev)
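
A minimal userspace sketch of the inclusive-range idea from the commit message above. The revision ids are the ones defined in the patch; the helper names revid_in_range() and bxt_before_b0() are hypothetical and are not the driver's macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Revision ids as defined in the i915_drv.h hunk of this patch. */
#define BXT_REVID_A0 0x0
#define BXT_REVID_A1 0x1
#define BXT_REVID_B0 0x3

/* Hypothetical helper: true when revid lies in the inclusive
 * range [since, until]. */
static bool revid_in_range(uint8_t revid, uint8_t since, uint8_t until)
{
	assert(since <= until);
	return revid >= since && revid <= until;
}

/* The older "below B0" form of the same check, for comparison. */
static bool bxt_before_b0(uint8_t revid)
{
	return revid < BXT_REVID_B0;
}
```

The two forms agree for every stepping listed in the patch. They differ only for revid 0x2, which the commit message says is not used.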
   

[Intel-gfx] [PATCH 1/3] drm/i915: remove parens around revision ids

2015-10-06 Thread Jani Nikula
Totally unnecessary.

Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/i915_drv.h | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 51eea2951c0f..a3b137715604 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2501,16 +2501,16 @@ struct drm_i915_cmd_table {
 
 #define IS_PRELIMINARY_HW(intel_info) ((intel_info)->is_preliminary)
 
-#define SKL_REVID_A0   (0x0)
-#define SKL_REVID_B0   (0x1)
-#define SKL_REVID_C0   (0x2)
-#define SKL_REVID_D0   (0x3)
-#define SKL_REVID_E0   (0x4)
-#define SKL_REVID_F0   (0x5)
-
-#define BXT_REVID_A0   (0x0)
-#define BXT_REVID_B0   (0x3)
-#define BXT_REVID_C0   (0x9)
+#define SKL_REVID_A0   0x0
+#define SKL_REVID_B0   0x1
+#define SKL_REVID_C0   0x2
+#define SKL_REVID_D0   0x3
+#define SKL_REVID_E0   0x4
+#define SKL_REVID_F0   0x5
+
+#define BXT_REVID_A0   0x0
+#define BXT_REVID_B0   0x3
+#define BXT_REVID_C0   0x9
 
 /*
  * The genX designation typically refers to the render engine, so render
-- 
2.1.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Drop i915_gem_obj_is_pinned() from set-cache-level

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 01:28:07PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 11:39:56AM +0100, Chris Wilson wrote:
> > Since the removal of the pin-ioctl, we only care about not changing the
> > cache level on buffers pinned to the hardware as indicated by
> > obj->pin_display. So we can safely replace i915_gem_object_is_pinned()
> > here with a plain obj->pin_display check. During rebinding, we will run
> > sanity checks in case vma->pin_count is erroneously set.
> > 
> > At the same time, we can micro-optimise GTT mmap() behaviour since we
> > only need to relinquish the mmaps before Sandybridge.
> 
> Actual condition is !LLC so would need to be updated (and split out imo).
>  
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 40 
> > 
> >  1 file changed, 24 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index d4a3bdf0c5b6..2b8ed7a2faab 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3629,31 +3629,34 @@ int i915_gem_object_set_cache_level(struct 
> > drm_i915_gem_object *obj,
> >  {
> > struct drm_device *dev = obj->base.dev;
> > struct i915_vma *vma, *next;
> > +   bool bound = false;
> > int ret = 0;
> >  
> > if (obj->cache_level == cache_level)
> > goto out;
> >  
> > -   if (i915_gem_obj_is_pinned(obj)) {
> > -   DRM_DEBUG("can not change the cache level of pinned objects\n");
> > -   return -EBUSY;
> > -   }
> > -
> > list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
> > +   if (!drm_mm_node_allocated(&vma->node))
> > +   continue;
> > +
> > +   if (vma->pin_count) {
> > +   DRM_DEBUG("can not change the cache level of pinned 
> > objects\n");
> > +   return -EBUSY;
> > +   }
> > +
> > if (!i915_gem_valid_gtt_space(vma, cache_level)) {
> > ret = i915_vma_unbind(vma);
> > if (ret)
> > return ret;
> > -   }
> > +   } else
> > +   bound = true;
> > }
> >  
> > -   if (i915_gem_obj_bound_any(obj)) {
> > +   if (bound) {
> > ret = i915_gem_object_wait_rendering(obj, false);
> > if (ret)
> > return ret;
> 
> Shouldn't the below be split out into a separate patch? And maybe for
> paranoia keep calling finish_gtt but restrict it to !LLC && snooped like
> you do below.

Hmm, I don't have a finish-gtt. The serialisation is based on
release-mmaps (we have to be sure that any concurrent access is
prohibited). So the question is: is i915_gem_release_mmap() a sufficient
barrier and if not, why not. In release-mmap we are revoking the CPU's PTE,
but that can be ordered with the memory accesses, but before we continue
we should be sure that they have been revoked. Paranoia says we should
be moving the mb() we have from outside of release-mmaps into
release-mmaps.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> > On 10/06/2015 04:11 PM, Imre Deak wrote:
> > >On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> > >>On 10/05/2015 09:05 PM, Imre Deak wrote:
> > >>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> > Mostly reuse what is programmed by pre-os, but in case there is no
> > pre-os initialization, init the cdclk with the default value.
> > 
> > Cc: Imre Deak 
> > Signed-off-by: Shobhit Kumar 
> > ---
> >    drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> >    1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> > b/drivers/gpu/drm/i915/intel_ddi.c
> > index 2d3cc82..675c60d 100644
> > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > @@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> > 
> > cdclk_freq = 
> >  dev_priv->display.get_display_clock_speed(dev);
> > dev_priv->skl_boot_cdclk = cdclk_freq;
> > -   if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> > -   DRM_ERROR("LCPLL1 is disabled\n");
> > -   else
> > -   intel_display_power_get(dev_priv, 
> > POWER_DOMAIN_PLLS);
> > +
> > +   skl_init_cdclk(dev_priv);
> > >>>
> > >>>How does this prevent changing the clock if BIOS did enable some output?
> > >>>We shouldn't change the clock in that case.
> > >>
> > >>In that case it will try to re-apply the same clock that BIOS enabled.
> > >>Not sure if this is allowed, but I checked the cdclock change sequence
> > >>and it is mostly followed in skl_init_cdclk.
> > >>In my tests where BIOS does enable this, I faced no issues in
> > >>initializing again in driver.
> > >
> > >The first step in that sequence:
> > >"Disable all display engine functions using the full mode set disable
> > >sequence on all pipes, ports, and planes."
> > 
> > > Oh, yeah, I again made the mistake of assuming that display is not enabled in
> > the first place. You are right, though it works if I change the clock again.
> > 
> > >
> > >So the problem is not that the PLL itself may be enabled here (as BIOS
> > >left it), but that some output is also enabled.
> > 
> > Yes.
> > 
> > >
> > >>I have noticed on some pre-os this value is programmed correctly except
> > > >>for the decimal part. That causes AUX transactions to fail on SKL. That
> > >>is what triggered this patch actually. So other way is to completely
> > >>validate the value in get_display_clock_speed instead of bit[28:26] and
> > >>then if wrong then only do the cdclk init.
> > >
> > >I think we'd need to detect at this point if outputs are enabled and
> > >only attempt to work around the above BIOS problem if this is not the
> > >case. Alternatively you could also disable the active outputs as a first
> > >step.
> > 
> > Ok, let me detect if any output is enabled by BIOS and accordingly
> > initialize cdclk.
> 
> These kinds of fixups should be done after the hw state readout. We
> already have sanitize_crtc/pll/encoder functions, probably best if we add
> a sanitize_cdclk or similar for this at the very end of the hw state
> sanitize sequence.

Can't be done if we already need a somewhat sane cdclk for the
eDP AUX probing and whatnot.

For actually enabling the cdclk for pushing pixels, we wouldn't need
to do anything except actually plug in a calc_cdclk for SKL. No idea
why we're not doing that currently. Some extra care may be needed
due to the eDP DPLL0 usage IIRC.

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Ville Syrjälä
Intel OTC


Re: [Intel-gfx] [PATCH v2] drm/i915: prevent out of range pt in the PDE macros (take 3)

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 12:21:05PM +0100, Dave Gordon wrote:
> ... although I still think my version is (slightly) easier to read.
> Or it could be improved even more by moving the increment of 'iter'
> to the end, making it one line shorter and perhaps helping the
> compiler a little :)
> 
> #define gen8_for_each_pml4e(pdp, pml4, start, length, temp, iter)  \
> for (iter = gen8_pml4e_index(start);   \
>  iter < GEN8_PML4ES_PER_PML4 &&\
> (pdp = (pml4)->pdps[iter], length > 0);\
>  temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT) - start,  \
>  temp = min(temp, length), \
>  start += temp, length -= temp, ++iter)

Shorter, yes, but we may as well take advantage of not using [iter] if
length == 0, so meh.
 
> Or, noting that 'temp' is never used in the generated loop bodies,
> we could eliminate the parameter and make it local to the loop
> header :)
> 
> #define gen8_for_each_pml4e(pdp, pml4, start, length, iter)\
> for (iter = gen8_pml4e_index(start);   \
>  iter < GEN8_PML4ES_PER_PML4 &&\
> (pdp = (pml4)->pdps[iter], length > 0);\
>  ({ u64 temp = ALIGN(start+1, 1ULL << GEN8_PML4E_SHIFT);   \
> temp = min(temp - start, length);  \
> start += temp, length -= temp; }), \
> ++iter)

Removing extraneous parameters from macros is definitely a usability win.
Care to spin a real patch so we can see how it looks in practice?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
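
A minimal userspace sketch of the loop-header pattern discussed in this thread, assuming a simplified table type. The names struct table, NUM_ENTRIES, for_each_entry() and count_visited() are hypothetical and stand in for the real PDE/PML4 macros. The bounds check comes first and && short-circuits, so the array is never read with an out-of-range index. The comma operator keeps a NULL entry from ending the loop early.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ENTRIES 4

struct table {
	int *entries[NUM_ENTRIES];
};

/* Read entries[iter] only after iter has been range-checked. The
 * ", 1" makes the loop condition true even when the entry is NULL. */
#define for_each_entry(entry, tbl, iter)				\
	for ((iter) = 0;						\
	     (iter) < NUM_ENTRIES &&					\
		((entry) = (tbl)->entries[(iter)], 1);			\
	     (iter)++)

/* Walk the whole table; report how many slots were visited and how
 * many of them held NULL. */
static size_t count_visited(const struct table *tbl, size_t *null_seen)
{
	int *entry;
	size_t iter, visited = 0;

	assert(tbl != NULL && null_seen != NULL);
	*null_seen = 0;
	for_each_entry(entry, tbl, iter) {
		visited++;
		if (entry == NULL)
			(*null_seen)++;
	}
	return visited;
}
```

With the condition written as a plain assignment instead, a NULL slot would evaluate to false and stop the walk at that slot, which is the behaviour change the original commit message warns about.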


Re: [Intel-gfx] [PATCH 1/3] drm: Track drm_mm nodes with an interval tree

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 12:19:43PM +0100, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 01:11:56PM +0200, Daniel Vetter wrote:
> > On Tue, Oct 06, 2015 at 11:53:09AM +0100, Chris Wilson wrote:
> > > In addition to the last-in/first-out stack for accessing drm_mm nodes,
> > > we occasionally and in the future often want to find a drm_mm_node by an
> > > address. To do so efficiently we need to track the nodes in an interval
> > > tree - lookups for a particular address will then be O(lg(N)), where N
> > > is the number of nodes in the range manager as opposed to O(N).
> > > Insertion however gains an extra O(lg(N)) step for all nodes
> > > irrespective of whether the interval tree is in use. For future i915
> > > patches, eliminating the linear walk is a significant improvement.
> > > 
> > > Signed-off-by: Chris Wilson 
> > > Cc: dri-de...@lists.freedesktop.org
> > 
> > For the vma manager David Herrmann put the interval tree outside of drm_mm.
> > Whichever way we pick, I think we should be consistent about this.
> 
> Given that the basis of this patch is that functionality exposed by
> drm_mm (i.e. drm_mm_reserve_node) is too slow for our use case (i.e.
> there is a measurable perf degradation if we switch over from the mru
> stack to using fixed addresses) it makes sense to improve that
> functionality. The question is then why the drm_vma_manager didn't use
> and improve the existing functionality...

Yeah I'm trying to volunteer you to add a lookup-function and rework the
vma-manager ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
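
A rough sketch of why an ordered structure helps the address lookup discussed above. This is not the kernel interval tree and not the drm_mm API: it swaps in a sorted array with binary search, which gives the same O(lg(N)) lookup for non-overlapping ranges. struct range_node and lookup_by_address() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an allocated node: [start, start + size). */
struct range_node {
	uint64_t start;
	uint64_t size;
};

/* Find the node containing addr. The nodes must be sorted by start
 * and must not overlap. Each step halves the search window, so the
 * lookup is O(lg(N)) instead of a linear O(N) walk. */
static const struct range_node *
lookup_by_address(const struct range_node *nodes, size_t count, uint64_t addr)
{
	size_t lo = 0, hi = count;

	assert(nodes != NULL || count == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct range_node *n = &nodes[mid];

		if (addr < n->start)
			hi = mid;		/* addr is below this node */
		else if (addr - n->start >= n->size)
			lo = mid + 1;		/* addr is past this node */
		else
			return n;		/* addr falls inside this node */
	}
	return NULL;				/* addr lies in a hole */
}
```

A sorted array makes insertion O(N), which is why a balanced tree is the usual choice when nodes come and go often.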


[Intel-gfx] [PATCH] drm/i915: Move the mb() following release-mmap into release-mmap

2015-10-06 Thread Chris Wilson
As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses and guarantees
serialisation of the code against concurrent access just by calling
i915_gem_release_mmap().

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_gem.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2b8ed7a2faab..642644f12295 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1877,11 +1877,21 @@ out:
 void
 i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 {
+   /* Serialisation between user GTT access and our code depends upon
+* revoking the CPU's PTE whilst the mutex is held. The next user
+* pagefault then has to wait until we release the mutex.
+*/
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
if (!obj->fault_mappable)
return;
 
drm_vma_node_unmap(&obj->base.vma_node,
   obj->base.dev->anon_inode->i_mapping);
+
+   /* Ensure that the CPU's PTE are revoked before we return */
+   mb();
+
obj->fault_mappable = false;
 }
 
@@ -3168,9 +3178,6 @@ static void i915_gem_object_finish_gtt(struct 
drm_i915_gem_object *obj)
if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
return;
 
-   /* Wait for any direct GTT access to complete */
-   mb();
-
old_read_domains = obj->base.read_domains;
old_write_domain = obj->base.write_domain;
 
-- 
2.6.1



Re: [Intel-gfx] [PATCH 3/3] drm/i915: Use a task to cancel the userptr on invalidate_range

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:34:47PM +0100, Chris Wilson wrote:
> Whilst discussing possible ways to trigger an invalidate_range on a
> userptr with an aliased GGTT mmapping (and so cause a struct_mutex
> deadlock), the conclusion is that we can, and we must, prevent any
> possible deadlock by avoiding taking the mutex at all during
> invalidate_range. This has numerous advantages all of which stem from
> avoiding the sleeping function from inside the unknown context. In
> particular, it simplifies the invalidate_range because we no longer
> have to juggle the spinlock/mutex and can just hold the spinlock
> for the entire walk. To compensate, we have to make get_pages a bit more
> complicated in order to serialise with a pending cancel_userptr worker.
> As we hold the struct_mutex, we have no choice but to return EAGAIN and
> hope that the worker is then flushed before we retry after reacquiring
> the struct_mutex.
> 
> The important caveat is that the invalidate_range itself is no longer
> synchronous. There exists a small but definite period in time in which
> the old PTE's pages remain accessible via the GPU. Note however that the
> physical pages themselves are not invalidated by the mmu_notifier, just
> the CPU view of the address space. The impact should be limited to a
> delay in pages being flushed, rather than a possibility of writing to
> the wrong pages. The only race condition that this worsens is remapping
> a userptr active on the GPU where fresh work may still reference the
> old pages due to struct_mutex contention. Given that userspace is racing
> with the GPU, it is fair to say that the results are undefined.
> 
> v2: Only queue (and importantly only take one refcnt) the worker once.
> 
> Signed-off-by: Chris Wilson 
> Cc: Michał Winiarski 
> Cc: Tvrtko Ursulin 
> Reviewed-by: Tvrtko Ursulin 

Pulled in all 3 patches. Btw some pretty kerneldoc explaining the
high-level interactions would be neat for all the userptr stuff ... I'm
totally lost in i915_gem_userptr.c ;-)

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 148 
> +---
>  1 file changed, 61 insertions(+), 87 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
> b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 161f7fbf5b76..1b3b451b6658 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -50,7 +50,6 @@ struct i915_mmu_notifier {
>   struct mmu_notifier mn;
>   struct rb_root objects;
>   struct list_head linear;
> - unsigned long serial;
>   bool has_linear;
>  };
>  
> @@ -59,14 +58,16 @@ struct i915_mmu_object {
>   struct interval_tree_node it;
>   struct list_head link;
>   struct drm_i915_gem_object *obj;
> + struct work_struct work;
>   bool active;
>   bool is_linear;
>  };
>  
> -static unsigned long cancel_userptr(struct drm_i915_gem_object *obj)
> +static void __cancel_userptr__worker(struct work_struct *work)
>  {
> + struct i915_mmu_object *mo = container_of(work, typeof(*mo), work);
> + struct drm_i915_gem_object *obj = mo->obj;
>   struct drm_device *dev = obj->base.dev;
> - unsigned long end;
>  
>   mutex_lock(&dev->struct_mutex);
>   /* Cancel any active worker and force us to re-evaluate gup */
> @@ -89,46 +90,28 @@ static unsigned long cancel_userptr(struct 
> drm_i915_gem_object *obj)
>   dev_priv->mm.interruptible = was_interruptible;
>   }
>  
> - end = obj->userptr.ptr + obj->base.size;
> -
>   drm_gem_object_unreference(&obj->base);
>   mutex_unlock(&dev->struct_mutex);
> -
> - return end;
>  }
>  
> -static void *invalidate_range__linear(struct i915_mmu_notifier *mn,
> -   struct mm_struct *mm,
> -   unsigned long start,
> -   unsigned long end)
> +static unsigned long cancel_userptr(struct i915_mmu_object *mo)
>  {
> - struct i915_mmu_object *mo;
> - unsigned long serial;
> -
> -restart:
> - serial = mn->serial;
> - list_for_each_entry(mo, &mn->linear, link) {
> - struct drm_i915_gem_object *obj;
> -
> - if (mo->it.last < start || mo->it.start > end)
> - continue;
> -
> - obj = mo->obj;
> -
> - if (!mo->active ||
> - !kref_get_unless_zero(&obj->base.refcount))
> - continue;
> -
> - spin_unlock(&mn->lock);
> -
> - cancel_userptr(obj);
> -
> - spin_lock(&mn->lock);
> - if (serial != mn->serial)
> - goto restart;
> + unsigned long end = mo->obj->userptr.ptr + mo->obj->base.size;
> +
> + /* The mmu_object is released late when destroying the
> +  * GEM object so it is entirely possible to gain a
> +  * reference on an object in the process of being freed
> +  * since our serialisation is via the s

[Intel-gfx] [PATCH] drm/i915: Stop discarding GTT cache-domain on unbind vma

2015-10-06 Thread Chris Wilson
Since

commit 43566dedde54f9729113f5f9fde77d53e75e61e9
Author: Chris Wilson 
Date:   Fri Jan 2 16:29:29 2015 +0530

drm/i915: Broaden application of set-domain(GTT)

we allowed objects to be in the GTT domain, but unbound. Therefore
removing the GTT cache domain when removing the GGTT vma is no longer
semantically correct.

An unfortunate side-effect is we lose the wondrously named
i915_gem_object_finish_gtt(), not to be confused with
i915_gem_gtt_finish_object()!

Signed-off-by: Chris Wilson 
Cc: Akash Goel 
Cc: Joonas Lahtinen 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_gem.c | 20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8f498d4d874d..682af2ae3681 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3183,20 +3183,6 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
return 0;
 }
 
-static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
-{
-   /* Force a pagefault for domain tracking on next user access */
-   i915_gem_release_mmap(obj);
-
-   if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
-   return;
-
-   obj->base.read_domains &= ~I915_GEM_DOMAIN_GTT;
-   obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
-
-   trace_i915_gem_object_change_domain(obj);
-}
-
 int i915_vma_unbind(struct i915_vma *vma)
 {
struct drm_i915_gem_object *obj = vma->obj;
@@ -3228,12 +3214,12 @@ int i915_vma_unbind(struct i915_vma *vma)
 */
 
if (vma->is_ggtt && vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
-   i915_gem_object_finish_gtt(obj);
-
-   /* release the fence reg _after_ flushing */
ret = i915_gem_object_put_fence(obj);
if (ret)
return ret;
+
+   /* Force a pagefault for domain tracking on next user access */
+   i915_gem_release_mmap(obj);
}
 
if (!vma->vm->closed) {
-- 
2.6.1



Re: [Intel-gfx] [PATCH] drm/i915: Allow userptr backchannel for passing around GTT mappings

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:36:05PM +0100, Chris Wilson wrote:
> Once userptr becomes part of client API, it is almost a certainty that
> eventually someone will try to create a new object from a mapping of
> another client object, e.g.
> 
> new = vaImport(vaMap(old, &size), size);
> 
> (using a hypothetical API, not meaning to pick on anyone!)
> 
> Since this is actually fairly safe to implement and to allow for a GTT
> mapping (since it is within a single process space and the memory access
> passes the standard permissions test) let us not limit the Client
> possibilities.
> 
> v2: sfelling pixes
> 
> Signed-off-by: Chris Wilson 
> Cc: Gwenole Beauchesne 
> Cc: Michał Winiarski 
> Cc: Tvrtko Ursulin 

This feels like a really big can of worms, since all the apis I've seen
thus far have been rather explicit that you can only import malloc'ed
memory. Also tiling and stuff like this ...
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 48 
> ++---
>  1 file changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
> b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 1b3b451b6658..afd4c2c4cc04 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -795,6 +795,37 @@ static const struct drm_i915_gem_object_ops 
> i915_gem_userptr_ops = {
>   .release = i915_gem_userptr_release,
>  };
>  
> +static struct drm_i915_gem_object *
> +find_object_from_vma(struct drm_device *dev,
> +  struct drm_i915_gem_userptr *args)
> +{
> + struct drm_i915_gem_object *obj = NULL;
> + struct vm_area_struct *vma;
> +
> + down_read(&current->mm->mmap_sem);
> + vma = find_vma(current->mm, args->user_ptr);
> + if (vma == NULL) {
> + obj = ERR_PTR(-EFAULT);
> + goto out;
> + }
> +
> + if (vma->vm_ops != dev->driver->gem_vm_ops)
> + goto out;
> +
> + if (vma->vm_start != args->user_ptr ||
> + vma->vm_end != args->user_ptr + args->user_size) {
> + obj = ERR_PTR(-EINVAL);
> + goto out;
> + }
> +
> + obj = to_intel_bo(vma->vm_private_data);
> + drm_gem_object_reference(&obj->base);
> +
> +out:
> + up_read(&current->mm->mmap_sem);
> + return obj;
> +}
> +
>  /**
>   * Creates a new mm object that wraps some normal memory from the process
>   * context - user memory.
> @@ -802,8 +833,11 @@ static const struct drm_i915_gem_object_ops 
> i915_gem_userptr_ops = {
>   * We impose several restrictions upon the memory being mapped
>   * into the GPU.
>   * 1. It must be page aligned (both start/end addresses, i.e ptr and size).
> - * 2. It must be normal system memory, not a pointer into another map of IO
> - *space (e.g. it must not be a GTT mmapping of another object).
> + * 2. It must either be:
> + *a) normal system memory, not a pointer into another map of IO
> + *   space (e.g. it must not be part of a GTT mmapping of another 
> object).
> + *b) a pointer to the complete GTT mmap of another object in your
> + *   address space.
>   * 3. We only allow a bo as large as we could in theory map into the GTT,
>   *that is we limit the size to the total size of the GTT.
>   * 4. The bo is marked as being snoopable. The backing pages are left
> @@ -853,6 +887,14 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void 
> *data, struct drm_file *file
>   return -ENODEV;
>   }
>  
> + obj = find_object_from_vma(dev, args);
> + if (obj) {
> + if (!IS_ERR(obj))
> + goto out;
> +
> + return PTR_ERR(obj);
> + }
> +
>   obj = i915_gem_object_alloc(dev);
>   if (obj == NULL)
>   return -ENOMEM;
> @@ -874,7 +916,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void 
> *data, struct drm_file *file
>   if (ret == 0)
>   ret = i915_gem_userptr_init__mmu_notifier(obj, args->flags);
>   if (ret == 0)
> - ret = drm_gem_handle_create(file, &obj->base, &handle);
> +out:  ret = drm_gem_handle_create(file, &obj->base, &handle);
>  
>   /* drop reference from allocate - handle holds it now */
>   drm_gem_object_unreference_unlocked(&obj->base);
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH v4] drm/i915: Clean up associated VMAs on context destruction

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 10:48:28AM +0100, Tvrtko Ursulin wrote:
> 
> 
> On 06/10/15 10:34, Chris Wilson wrote:
> >On Tue, Oct 06, 2015 at 11:28:49AM +0200, Daniel Vetter wrote:
> >>On Tue, Oct 06, 2015 at 10:04:31AM +0100, Chris Wilson wrote:
> >>>On Tue, Oct 06, 2015 at 10:58:01AM +0200, Daniel Vetter wrote:
> On Mon, Oct 05, 2015 at 01:26:36PM +0100, Tvrtko Ursulin wrote:
> >From: Tvrtko Ursulin 
> >
> >Prevent leaking VMAs and PPGTT VMs when objects are imported
> >via flink.
> >
> >Scenario is that any VMAs created by the importer will be left
> >dangling after the importer exits, or destroys the PPGTT context
> >with which they are associated.
> >
> >This is caused by object destruction not running when the
> >importer closes the buffer object handle due the reference held
> >by the exporter. This also leaks the VM since the VMA has a
> >reference on it.
> >
> >In practice these leaks can be observed by stopping and starting
> >the X server on a kernel with fbcon compiled in. Every time
> >X server exits another VMA will be leaked against the fbcon's
> >frame buffer object.
> >
> >Also on systems where flink buffer sharing is used extensively,
> >like Android, this leak has even more serious consequences.
> >
> >This version takes a general approach from the earlier work
> >by Rafael Barbalho (drm/i915: Clean-up PPGTT on context
> >destruction) and tries to incorporate the subsequent discussion
> >between Chris Wilson and Daniel Vetter.
> >
> >v2:
> >
> >Removed immediate cleanup on object retire - it was causing a
> >recursive VMA unbind via i915_gem_object_wait_rendering. And
> >it is in fact not even needed since by definition context
> >cleanup worker runs only after the last context reference has
> >been dropped, hence all VMAs against the VM belonging to the
> >context are already on the inactive list.
> >
> >v3:
> >
> >Previous version could deadlock since VMA unbind waits on any
> >rendering on an object to complete. Objects can be busy in a
> >different VM which would mean that the cleanup loop would do
> >the wait with the struct mutex held.
> >
> >This is an even simpler approach where we just unbind VMAs
> >without waiting since we know all VMAs belonging to this VM
> >are idle, and there is nothing in flight, at the point
> >context destructor runs.
> >
> >v4:
> >
> >Double underscore prefix for __i915_vma_unbind_no_wait and a
> >commit message typo fix. (Michel Thierry)
> >
> >Signed-off-by: Tvrtko Ursulin 
> >Testcase: igt/gem_ppgtt.c/flink-and-exit-vma-leak
> >Reviewed-by: Michel Thierry 
> >Cc: Daniel Vetter 
> >Cc: Chris Wilson 
> >Cc: Rafael Barbalho 
> >Cc: Michel Thierry 
> 
> Queued for -next, thanks for the patch.
> >>>
> >>>Please no, it's an awful patch and does not even fix the root cause of
> >>>the leak (that the vma are not closed when the handle is).
> >>
> >>It's a lose-lose situation for me as maintainer (and this holds in general
> >>really, not just for this patch):
> >>- Either I wield my considerable maintainer powers and force proper
> >>   solution, which will piss off a lot of people.
> >>- Or I just merge intermediate stuff and piss off another set of people
> >>   (including likely all our future selves because we're slowly digging a
> >>   tech debt grave).
> >>
> >>What I can't do with maintainer fu is force collaboration, I can only try
> >>to piss off everyone equally. And today the die rolled a "merge".
> >
> >Pity it didn't roll "what's the impact and where is the bugzilla?" :)
> 
> There is this one:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=87729
> 
> And I could raise another one for leaking VMAs on X.org restart? :)

I added a small note to the patch that this isn't everything.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 02:41:44PM +0300, Ville Syrjälä wrote:
> On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> > On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> > > On 10/06/2015 04:11 PM, Imre Deak wrote:
> > > >On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> > > >>On 10/05/2015 09:05 PM, Imre Deak wrote:
> > > >>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> > > >>>>Mostly reuse what is programmed by pre-os, but in case there is no
> > > >>>>pre-os initialization, init the cdclk with the default value.
> > > >>>>
> > > >>>>Cc: Imre Deak 
> > > >>>>Signed-off-by: Shobhit Kumar 
> > > >>>>---
> > > >>>>   drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> > > >>>>   1 file changed, 2 insertions(+), 4 deletions(-)
> > > >>>>
> > > >>>>diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> > > >>>>index 2d3cc82..675c60d 100644
> > > >>>>--- a/drivers/gpu/drm/i915/intel_ddi.c
> > > >>>>+++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > >>>>@@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> > > >>>>
> > > >>>>    cdclk_freq = dev_priv->display.get_display_clock_speed(dev);
> > > >>>>    dev_priv->skl_boot_cdclk = cdclk_freq;
> > > >>>>-   if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> > > >>>>-   DRM_ERROR("LCPLL1 is disabled\n");
> > > >>>>-   else
> > > >>>>-   intel_display_power_get(dev_priv, POWER_DOMAIN_PLLS);
> > > >>>>+
> > > >>>>+   skl_init_cdclk(dev_priv);
> > > >>>
> > > >>>How does this prevent changing the clock if BIOS did enable some 
> > > >>>output?
> > > >>>We shouldn't change the clock in that case.
> > > >>
> > > >>In that case it will try to re-apply the same clock that BIOS enabled.
> > > >>Not sure if this is allowed, but I checked the cdclock change sequence
> > > >>and it is mostly followed in skl_init_cdclk.
> > > >>In my tests where BIOS does enable this, I faced no issues in
> > > >>initializing again in driver.
> > > >
> > > >The first step in that sequence:
> > > >"Disable all display engine functions using the full mode set disable
> > > >sequence on all pipes, ports, and planes."
> > > 
> > > Oh, yeah, I again made the mistake of assuming that display is not enabled in
> > > the first place. You are right, though it works if I change the clock 
> > > again.
> > > 
> > > >
> > > >So the problem is not that the PLL itself may be enabled here (as BIOS
> > > >left it), but that some output is also enabled.
> > > 
> > > Yes.
> > > 
> > > >
> > > >>I have noticed on some pre-os this value is programmed correctly except
> > > >>for the decimal part. That causes AUX transactions to fail on SKL. That
> > > >>is what triggered this patch actually. So other way is to completely
> > > >>validate the value in get_display_clock_speed instead of bit[28:26] and
> > > >>then if wrong then only do the cdclk init.
> > > >
> > > >I think we'd need to detect at this point if outputs are enabled and
> > > >only attempt to work around the above BIOS problem if this is not the
> > > >case. Alternatively you could also disable the active outputs as a first
> > > >step.
> > > 
> > > Ok, let me detect if any output is enabled by BIOS and accordingly
> > > initialize cdclk.
> > 
> > These kinds of fixups should be done after the hw state readout. We
> > already have sanitize_crtc/pll/encoder functions, probably best if we add
> > a sanitize_cdclk or similar for this at the very end of the hw state
> > sanitize sequence.
> 
> Can't be done if we already need a somewhat sane cdclk for the
> eDP AUX probing and whatnot.
> 
> For actually enabling the cdclk for pushing pixels, we wouldn't need
> to do anything except actually plug in a calc_cdclk for SKL. No idea
> why we're not doing that currently. Some extra care may be needed
> due to the eDP DPLL0 usage IIRC.

Hm right, cdclk is in the top-level power domain. Added fun is that with
dmc the firmware is supposed to handle it. Messy :(
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH] drm/i915: Allow userptr backchannel for passing around GTT mappings

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 02:12:31PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:36:05PM +0100, Chris Wilson wrote:
> > Once userptr becomes part of client API, it is almost a certainty that
> > eventually someone will try to create a new object from a mapping of
> > another client object, e.g.
> > 
> > new = vaImport(vaMap(old, &size), size);
> > 
> > (using a hypothetical API, not meaning to pick on anyone!)
> > 
> > Since this is actually fairly safe to implement and to allow for a GTT
> > mapping (since it is within a single process space and the memory access
> > passes the standard permissions test) let us not limit the Client
> > possibilities.
> > 
> > v2: sfelling pixes
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Gwenole Beauchesne 
> > Cc: Michał Winiarski 
> > Cc: Tvrtko Ursulin 
> 
> This feels like a really big can of worms, since all the apis I've seen
> thus far have been rather explicit that you can only import malloc'ed
> memory.

Weirdly no one even brought it up as an issue for AMD_pinned_memory.
Nothing in the spec says you can't
glBufferData(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD,
 ...,
 glMapBuffer(GL_PIXEL_UNPACK_BUFFER, ...));

Not that I would expect it to always work.

> Also tiling and stuff like this ...

Indeed, it is the only way to get tiling right... The only question is
whether every userptr mapping should be a unique bo (this also has the
effect of returning a new handle for an existing userptr mapping to the
same exact address range). Note that the memory address is still
controlled by the earlier bo, so it will always remain linear, and with
the second handle the user still has to obey all the normal userptr
restrictions.

As you can probably guess, Gwenole raised the question of whether this
behaviour would be possible to implement.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 3/3] drm/i915: Use a task to cancel the userptr on invalidate_range

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 02:04:19PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:34:47PM +0100, Chris Wilson wrote:
> > Whilst discussing possible ways to trigger an invalidate_range on a
> > userptr with an aliased GGTT mmapping (and so cause a struct_mutex
> > deadlock), the conclusion is that we can, and we must, prevent any
> > possible deadlock by avoiding taking the mutex at all during
> > invalidate_range. This has numerous advantages all of which stem from
> > avoiding the sleeping function from inside the unknown context. In
> > particular, it simplifies the invalidate_range because we no longer
> > have to juggle the spinlock/mutex and can just hold the spinlock
> > for the entire walk. To compensate, we have to make get_pages a bit more
> > complicated in order to serialise with a pending cancel_userptr worker.
> > As we hold the struct_mutex, we have no choice but to return EAGAIN and
> > hope that the worker is then flushed before we retry after reacquiring
> > the struct_mutex.
> > 
> > The important caveat is that the invalidate_range itself is no longer
> > synchronous. There exists a small but definite period in time in which
> > the old PTE's pages remain accessible via the GPU. Note however that the
> > physical pages themselves are not invalidated by the mmu_notifier, just
> > the CPU view of the address space. The impact should be limited to a
> > delay in pages being flushed, rather than a possibility of writing to
> > the wrong pages. The only race condition that this worsens is remapping
> > a userptr active on the GPU where fresh work may still reference the
> > old pages due to struct_mutex contention. Given that userspace is racing
> > with the GPU, it is fair to say that the results are undefined.
> > 
> > v2: Only queue (and importantly only take one refcnt) the worker once.
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Michał Winiarski 
> > Cc: Tvrtko Ursulin 
> > Reviewed-by: Tvrtko Ursulin 
> 
> Pulled in all 3 patches. Btw some pretty kerneldoc explaining the
> high-level interactions would be neat for all the userptr stuff ... I'm
> totally lost in i915_gem_userptr.c ;-)

Probably because it already has too many verbose comments?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
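
For readers following the v2 note in the quoted commit message ("only queue, and importantly only take one refcnt, the worker once") together with the EAGAIN retry in get_pages, the bookkeeping can be sketched as below. This is a single-threaded illustration only, not the code in i915_gem_userptr.c: struct userptr_sketch and the sketch_* functions are hypothetical names, and the spinlock that the real driver would need around the flag is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, single-threaded model of a userptr object. */
struct userptr_sketch {
	unsigned int refcount;	/* references held on the object */
	bool cancel_queued;	/* a cancel worker is pending */
};

/*
 * Models invalidate_range: queue the cancel worker at most once and take
 * exactly one reference for it. Returns true if the worker was newly queued.
 */
static bool sketch_invalidate(struct userptr_sketch *obj)
{
	if (obj->cancel_queued)
		return false;

	obj->cancel_queued = true;
	obj->refcount++;	/* the pending worker owns this reference */
	return true;
}

/* Models the worker body: do the cancel, then drop the worker's reference. */
static void sketch_cancel_worker(struct userptr_sketch *obj)
{
	obj->cancel_queued = false;
	obj->refcount--;
}

/* Models get_pages: ask the caller to retry while a cancel is pending. */
static int sketch_get_pages(const struct userptr_sketch *obj)
{
	return obj->cancel_queued ? -EAGAIN : 0;
}
```

A second invalidate while the worker is pending neither queues again nor takes another reference, which is the property the v2 note calls out.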


Re: [Intel-gfx] [PATCH 2/3] drm/i915/bxt: add revision id for A1 stepping and use it

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 02:41:15PM +0300, Jani Nikula wrote:
> Prefer inclusive ranges for revision checks rather than "below B0". Per
> specs A2 is not used, so revid <= A1 matches revid < B0.

The w/a db would say UNTIL_B0 etc., so might be easier to check against
it if we keep to the same convention.

> 
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/i915_drv.h| 1 +
>  drivers/gpu/drm/i915/i915_gem.c| 2 +-
>  drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
>  drivers/gpu/drm/i915/intel_ddi.c   | 2 +-
>  drivers/gpu/drm/i915/intel_dp.c| 2 +-
>  drivers/gpu/drm/i915/intel_hdmi.c  | 2 +-
>  drivers/gpu/drm/i915/intel_lrc.c   | 8 
>  drivers/gpu/drm/i915/intel_pm.c| 6 +++---
>  drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +++---
>  9 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a3b137715604..9833a2055930 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2509,6 +2509,7 @@ struct drm_i915_cmd_table {
>  #define SKL_REVID_F0 0x5
>  
>  #define BXT_REVID_A0 0x0
> +#define BXT_REVID_A1 0x1
>  #define BXT_REVID_B0 0x3
>  #define BXT_REVID_C0 0x9
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f0cfbb9ee12c..fd2d880656b2 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, 
> void *data,
>* cacheline, whereas normally such cachelines would get
>* invalidated.
>*/
> - if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0)
> + if (IS_BROXTON(dev) && INTEL_REVID(dev) <= BXT_REVID_A1)
>   return -ENODEV;
>  
>   level = I915_CACHE_LLC;
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
> b/drivers/gpu/drm/i915/i915_guc_submission.c
> index 036b42bae827..863aa5c82466 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -161,7 +161,7 @@ static int host2guc_sample_forcewake(struct intel_guc 
> *guc,
>   data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
>   /* WaRsDisableCoarsePowerGating:skl,bxt */
>   if (!intel_enable_rc6(dev_priv->dev) ||
> - (IS_BROXTON(dev) && (INTEL_REVID(dev) < BXT_REVID_B0)) ||
> + (IS_BROXTON(dev) && (INTEL_REVID(dev) <= BXT_REVID_A1)) ||
>   (IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)) ||
>   (IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)))
>   data[1] = 0;
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> b/drivers/gpu/drm/i915/intel_ddi.c
> index b25e99a432fb..b80e0f5ec5dc 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -3247,7 +3247,7 @@ void intel_ddi_init(struct drm_device *dev, enum port 
> port)
>* On BXT A0/A1, sw needs to activate DDIA HPD logic and
>* interrupts to check the external panel connection.
>*/
> - if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0)
> + if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1)
>&& port == PORT_B)
>   dev_priv->hotplug.irq_port[PORT_A] = intel_dig_port;
>   else
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 8d34ca7b287a..8baf6fe06313 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -6087,7 +6087,7 @@ intel_dp_init_connector(struct intel_digital_port 
> *intel_dig_port,
>   break;
>   case PORT_B:
>   intel_encoder->hpd_pin = HPD_PORT_B;
> - if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0))
> + if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1))
>   intel_encoder->hpd_pin = HPD_PORT_A;
>   break;
>   case PORT_C:
> diff --git a/drivers/gpu/drm/i915/intel_hdmi.c 
> b/drivers/gpu/drm/i915/intel_hdmi.c
> index 03d85909c6ab..32e9117ee8e3 100644
> --- a/drivers/gpu/drm/i915/intel_hdmi.c
> +++ b/drivers/gpu/drm/i915/intel_hdmi.c
> @@ -2068,7 +2068,7 @@ void intel_hdmi_init_connector(struct 
> intel_digital_port *intel_dig_port,
>* On BXT A0/A1, sw needs to activate DDIA HPD logic and
>* interrupts to check the external panel connection.
>*/
> - if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0))
> + if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1))
>   intel_encoder->hpd_pin = HPD_PORT_A;
>   else
>   intel_encoder->hpd_pin = HPD_PORT_B;
> di
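
The claim in the commit message that "revid <= A1 matches revid < B0" can be checked with a small stand-alone sketch. The BXT_REVID_* values are copied from the quoted i915_drv.h hunk; the helper names are made up for this illustration, and the reading that the single disagreeing value 0x2 is the unused A2 stepping is an assumption based on the commit message.

```c
#include <stdbool.h>

/* Revision ids as defined in the quoted i915_drv.h hunk. */
#define BXT_REVID_A0 0x0
#define BXT_REVID_A1 0x1
#define BXT_REVID_B0 0x3
#define BXT_REVID_C0 0x9

/* Exclusive form used before the patch: "below B0". */
static bool bxt_before_b0(unsigned int revid)
{
	return revid < BXT_REVID_B0;
}

/* Inclusive form proposed by the patch: "up to and including A1". */
static bool bxt_up_to_a1(unsigned int revid)
{
	return revid <= BXT_REVID_A1;
}

/* Count the revision ids in [0, limit) on which the two forms disagree. */
static unsigned int bxt_count_mismatches(unsigned int limit)
{
	unsigned int revid, mismatches = 0;

	for (revid = 0; revid < limit; revid++)
		if (bxt_before_b0(revid) != bxt_up_to_a1(revid))
			mismatches++;

	return mismatches;
}
```

The two predicates differ only for revid 0x2, so they are interchangeable as long as no hardware reports that value.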

Re: [Intel-gfx] [PATCH] drm/i915: Stop discarding GTT cache-domain on unbind vma

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 01:02:22PM +0100, Chris Wilson wrote:
> Since
> 
> commit 43566dedde54f9729113f5f9fde77d53e75e61e9
> Author: Chris Wilson 
> Date:   Fri Jan 2 16:29:29 2015 +0530
> 
> drm/i915: Broaden application of set-domain(GTT)
> 
> we allowed objects to be in the GTT domain, but unbound. Therefore
> removing the GTT cache domain when removing the GGTT vma is no longer
> semantically correct.
> 
> An unfortunate side-effect is we lose the wondrously named
> i915_gem_object_finish_gtt(), not to be confused with
> i915_gem_gtt_finish_object()!
> 
> Signed-off-by: Chris Wilson 
> Cc: Akash Goel 
> Cc: Joonas Lahtinen 
> Cc: Tvrtko Ursulin 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 20 +++-
>  1 file changed, 3 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 8f498d4d874d..682af2ae3681 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3183,20 +3183,6 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
>   return 0;
>  }
>  
> -static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> -{
> - /* Force a pagefault for domain tracking on next user access */
> - i915_gem_release_mmap(obj);
> -
> - if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
> - return;
> -
> - obj->base.read_domains &= ~I915_GEM_DOMAIN_GTT;
> - obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
> -
> - trace_i915_gem_object_change_domain(obj);
> -}
> -
>  int i915_vma_unbind(struct i915_vma *vma)
>  {
>   struct drm_i915_gem_object *obj = vma->obj;
> @@ -3228,12 +3214,12 @@ int i915_vma_unbind(struct i915_vma *vma)
>*/
>  
>   if (vma->is_ggtt && vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
> - i915_gem_object_finish_gtt(obj);
> -
> - /* release the fence reg _after_ flushing */
>   ret = i915_gem_object_put_fence(obj);
>   if (ret)
>   return ret;
> +
> + /* Force a pagefault for domain tracking on next user access */
> + i915_gem_release_mmap(obj);

Can't put_fence before release_mmap ... I guess we should have a testcase
for this somewhere? Hard to provoke probably ...
-Daniel

>   }
>  
>   if (!vma->vm->closed) {
> -- 
> 2.6.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/6] drm/i915: Eliminate vmap overhead for cmd parser

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:57:10PM +0100, Chris Wilson wrote:
> With a little complexity to handle cmds straddling page boundaries, we
> can completely avoid having to vmap the batch and the shadow batch
> objects whilst running the command parser.
> 
> On ivb i7-3720MQ:
> 
> x11perf -dot before 54.3M, after 53.2M (max 203M)
> glxgears before 7110 fps, after 7300 fps (max 7860 fps)
> 
> Before:
> Time to blt 16384 bytes x  1:  12.400µs, 1.2GiB/s
> Time to blt 16384 bytes x   4096:   3.055µs, 5.0GiB/s
> 
> After:
> Time to blt 16384 bytes x  1:   8.600µs, 1.8GiB/s
> Time to blt 16384 bytes x   4096:   2.456µs, 6.2GiB/s

Numbers for the overall series (or individual patches would be even
better) are needed. I thought you have this neat script now to do that for
an entire series?
-Daniel

> 
> Removing the vmap is mostly a win, except we lose in a few cases where
> the batch size is greater than a page due to the extra complexity (loss
> of a simple cache efficient large copy, and boundary handling).
> 
> v2: Reorder so that we do check oacontrol remaining set at end-of-batch
> 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_cmd_parser.c | 298 
> -
>  1 file changed, 146 insertions(+), 152 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c 
> b/drivers/gpu/drm/i915/i915_cmd_parser.c
> index 237ff6884a22..91e4baa0f2b8 100644
> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c
> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
> @@ -852,100 +852,6 @@ find_reg(const struct drm_i915_reg_descriptor *table,
>   return NULL;
>  }
>  
> -static u32 *vmap_batch(struct drm_i915_gem_object *obj,
> -unsigned start, unsigned len)
> -{
> - int i;
> - void *addr = NULL;
> - struct sg_page_iter sg_iter;
> - int first_page = start >> PAGE_SHIFT;
> - int last_page = (len + start + 4095) >> PAGE_SHIFT;
> - int npages = last_page - first_page;
> - struct page **pages;
> -
> - pages = drm_malloc_ab(npages, sizeof(*pages));
> - if (pages == NULL) {
> - DRM_DEBUG_DRIVER("Failed to get space for pages\n");
> - goto finish;
> - }
> -
> - i = 0;
> - for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 
> first_page) {
> - pages[i++] = sg_page_iter_page(&sg_iter);
> - if (i == npages)
> - break;
> - }
> -
> - addr = vmap(pages, i, 0, PAGE_KERNEL);
> - if (addr == NULL) {
> - DRM_DEBUG_DRIVER("Failed to vmap pages\n");
> - goto finish;
> - }
> -
> -finish:
> - if (pages)
> - drm_free_large(pages);
> - return (u32*)addr;
> -}
> -
> -/* Returns a vmap'd pointer to dest_obj, which the caller must unmap */
> -static u32 *copy_batch(struct drm_i915_gem_object *dest_obj,
> -struct drm_i915_gem_object *src_obj,
> -u32 batch_start_offset,
> -u32 batch_len)
> -{
> - int needs_clflush = 0;
> - void *src_base, *src;
> - void *dst = NULL;
> - int ret;
> -
> - if (batch_len > dest_obj->base.size ||
> - batch_len + batch_start_offset > src_obj->base.size)
> - return ERR_PTR(-E2BIG);
> -
> - if (WARN_ON(dest_obj->pages_pin_count == 0))
> - return ERR_PTR(-ENODEV);
> -
> - ret = i915_gem_obj_prepare_shmem_read(src_obj, &needs_clflush);
> - if (ret) {
> - DRM_DEBUG_DRIVER("CMD: failed to prepare shadow batch\n");
> - return ERR_PTR(ret);
> - }
> -
> - src_base = vmap_batch(src_obj, batch_start_offset, batch_len);
> - if (!src_base) {
> - DRM_DEBUG_DRIVER("CMD: Failed to vmap batch\n");
> - ret = -ENOMEM;
> - goto unpin_src;
> - }
> -
> - ret = i915_gem_object_set_to_cpu_domain(dest_obj, true);
> - if (ret) {
> - DRM_DEBUG_DRIVER("CMD: Failed to set shadow batch to CPU\n");
> - goto unmap_src;
> - }
> -
> - dst = vmap_batch(dest_obj, 0, batch_len);
> - if (!dst) {
> - DRM_DEBUG_DRIVER("CMD: Failed to vmap shadow batch\n");
> - ret = -ENOMEM;
> - goto unmap_src;
> - }
> -
> - src = src_base + offset_in_page(batch_start_offset);
> - if (needs_clflush)
> - drm_clflush_virt_range(src, batch_len);
> -
> - memcpy(dst, src, batch_len);
> -
> -unmap_src:
> - vunmap(src_base);
> -unpin_src:
> - i915_gem_object_unpin_pages(src_obj);
> -
> - return ret ? ERR_PTR(ret) : dst;
> -}
> -
>  /**
>   * i915_needs_cmd_parser() - should a given ring use software command 
> parsing?
>   * @ring: the ring in question
> @@ -1112,16 +1018,35 @@ int i915_parse_cmds(struct intel_engine_cs *ring,
>   u32 batch_len,
>   bool is_master)
>  {
> - u32 *cmd, *batch_base, *batch_end;
> + u32 tmp[128];
> + const struct drm_i915_cmd_descri
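
The "little complexity to handle cmds straddling page boundaries" mentioned in the commit message comes from reading a range that is no longer virtually contiguous. A rough userspace sketch of the idea follows. It is not the patch's implementation (the rest of the diff is cut off above); SKETCH_PAGE_SIZE and copy_from_pages() are names invented here, and plain pointers stand in for whatever per-page mapping the kernel code uses.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Copy len bytes, starting at byte offset start, out of an object whose
 * backing store is an array of separately mapped pages, into a linear
 * buffer. Each iteration copies at most up to the end of the current page,
 * so a command that straddles a page boundary is stitched together from
 * two chunks.
 */
static void copy_from_pages(void *dst, unsigned char *const *pages,
			    size_t start, size_t len)
{
	unsigned char *out = dst;

	while (len) {
		size_t page = start / SKETCH_PAGE_SIZE;
		size_t offset = start % SKETCH_PAGE_SIZE;
		size_t chunk = SKETCH_PAGE_SIZE - offset;

		if (chunk > len)
			chunk = len;

		memcpy(out, pages[page] + offset, chunk);
		out += chunk;
		start += chunk;
		len -= chunk;
	}
}
```

This also shows where the cost noted in the commit message comes from: one large memcpy over a vmap'd range becomes one smaller copy per page touched.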

Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Kumar, Shobhit

On 10/06/2015 05:49 PM, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 02:41:44PM +0300, Ville Syrjälä wrote:
> > On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> > > On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> > > > On 10/06/2015 04:11 PM, Imre Deak wrote:
> > > > >On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> > > > >>On 10/05/2015 09:05 PM, Imre Deak wrote:
> > > > >>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> > > > >>>>Mostly reuse what is programmed by pre-os, but in case there is no
> > > > >>>>pre-os initialization, init the cdclk with the default value.
> > > > >>>>
> > > > >>>>Cc: Imre Deak 
> > > > >>>>Signed-off-by: Shobhit Kumar 
> > > > >>>>---
> > > > >>>>   drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> > > > >>>>   1 file changed, 2 insertions(+), 4 deletions(-)
> > > > >>>>
> > > > >>>>diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> > > > >>>>index 2d3cc82..675c60d 100644
> > > > >>>>--- a/drivers/gpu/drm/i915/intel_ddi.c
> > > > >>>>+++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > > >>>>@@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> > > > >>>>
> > > > >>>>    cdclk_freq = dev_priv->display.get_display_clock_speed(dev);
> > > > >>>>    dev_priv->skl_boot_cdclk = cdclk_freq;
> > > > >>>>-   if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> > > > >>>>-   DRM_ERROR("LCPLL1 is disabled\n");
> > > > >>>>-   else
> > > > >>>>-   intel_display_power_get(dev_priv, POWER_DOMAIN_PLLS);
> > > > >>>>+
> > > > >>>>+   skl_init_cdclk(dev_priv);
> > > > >>>
> > > > >>>How does this prevent changing the clock if BIOS did enable some output?
> > > > >>>We shouldn't change the clock in that case.
> > > > >>
> > > > >>In that case it will try to re-apply the same clock that BIOS enabled.
> > > > >>Not sure if this is allowed, but I checked the cdclock change sequence
> > > > >>and it is mostly followed in skl_init_cdclk.
> > > > >>In my tests where BIOS does enable this, I faced no issues in
> > > > >>initializing again in driver.
> > > > >
> > > > >The first step in that sequence:
> > > > >"Disable all display engine functions using the full mode set disable
> > > > >sequence on all pipes, ports, and planes."
> > > >
> > > > Oh, yeah, I again made the mistake of assuming that display is not enabled in
> > > > the first place. You are right, though it works if I change the clock again.
> > > >
> > > > >
> > > > >So the problem is not that the PLL itself may be enabled here (as BIOS
> > > > >left it), but that some output is also enabled.
> > > >
> > > > Yes.
> > > >
> > > > >
> > > > >>I have noticed on some pre-os this value is programmed correctly except
> > > > >>for the decimal part. That causes AUX transactions to fail on SKL. That
> > > > >>is what triggered this patch actually. So other way is to completely
> > > > >>validate the value in get_display_clock_speed instead of bit[28:26] and
> > > > >>then if wrong then only do the cdclk init.
> > > > >
> > > > >I think we'd need to detect at this point if outputs are enabled and
> > > > >only attempt to work around the above BIOS problem if this is not the
> > > > >case. Alternatively you could also disable the active outputs as a first
> > > > >step.
> > > >
> > > > Ok, let me detect if any output is enabled by BIOS and accordingly
> > > > initialize cdclk.
> > >
> > > These kinds of fixups should be done after the hw state readout. We
> > > already have sanitize_crtc/pll/encoder functions, probably best if we add
> > > a sanitize_cdclk or similar for this at the very end of the hw state
> > > sanitize sequence.
> >
> > Can't be done if we already need a somewhat sane cdclk for the
> > eDP AUX probing and whatnot.
> >
> > For actually enabling the cdclk for pushing pixels, we wouldn't need
> > to do anything except actually plug in a calc_cdclk for SKL. No idea
> > why we're not doing that currently. Some extra care may be needed
> > due to the eDP DPLL0 usage IIRC.
>
> Hm right, cdclk is in the top-level power domain. Added fun is that with
> dmc the firmware is supposed to handle it. Messy :(


Yes, exactly. How about just adding a verify_cdclk and calling it in
get_display_clock_speed to check whether cdclk is programmed correctly,
along with the related DPLL0 VCO settings, for now? If all looks good,
then skip; otherwise initialize. In that case, if we have to initialize,
where do we get the cdclk value to initialize with at this point? Is
there any default in VBT? Or we could go with the minimum by default,
and it can be bumped up later if needed.


Regards
Shobhit


> -Daniel




Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 01:21:25PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 11:39:55AM +0100, Chris Wilson wrote:
> > Passing cliprects into the kernel for it to re-execute the batch buffer
> > with different CMD_DRAWRECT died out long ago. As DRI1 support has been
> > removed from the kernel, we can now simply reject any execbuf trying to
> > use this "feature".
> > 
> > To keep Daniel happy with the prospect of being able to reuse these
> > fields in the next decade, continue to ensure that current userspace is
> > not passing garbage in through the dead fields.
> > 
> > v2: Fix the cliprects_ptr check
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Daniel Vetter 
> 
> igt subtest seems to be missing to ensure we enforce this. Yay otherwise!

Looking at gem_exec_params/cliprects-invalid, presumably all you want to
do is to remove the igt_require(). To be more precise you would need to
add a flag to tell when the wicked witch is dead and you could expect the
execbuf to fail with any cliprect garbage.
-Chris
> -Da 

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915: Stop discarding GTT cache-domain on unbind vma

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 02:40:26PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 01:02:22PM +0100, Chris Wilson wrote:
> > Since
> > 
> > commit 43566dedde54f9729113f5f9fde77d53e75e61e9
> > Author: Chris Wilson 
> > Date:   Fri Jan 2 16:29:29 2015 +0530
> > 
> > drm/i915: Broaden application of set-domain(GTT)
> > 
> > we allowed objects to be in the GTT domain, but unbound. Therefore
> > removing the GTT cache domain when removing the GGTT vma is no longer
> > semantically correct.
> > 
> > An unfortunate side-effect is we lose the wondrously named
> > i915_gem_object_finish_gtt(), not to be confused with
> > i915_gem_gtt_finish_object()!
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Akash Goel 
> > Cc: Joonas Lahtinen 
> > Cc: Tvrtko Ursulin 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 20 +++-
> >  1 file changed, 3 insertions(+), 17 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 8f498d4d874d..682af2ae3681 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -3183,20 +3183,6 @@ i915_gem_object_sync(struct drm_i915_gem_object *obj,
> > return 0;
> >  }
> >  
> > -static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> > -{
> > -   /* Force a pagefault for domain tracking on next user access */
> > -   i915_gem_release_mmap(obj);
> > -
> > -   if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
> > -   return;
> > -
> > -   obj->base.read_domains &= ~I915_GEM_DOMAIN_GTT;
> > -   obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
> > -
> > -   trace_i915_gem_object_change_domain(obj);
> > -}
> > -
> >  int i915_vma_unbind(struct i915_vma *vma)
> >  {
> > struct drm_i915_gem_object *obj = vma->obj;
> > @@ -3228,12 +3214,12 @@ int i915_vma_unbind(struct i915_vma *vma)
> >  */
> >  
> > if (vma->is_ggtt && vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
> > -   i915_gem_object_finish_gtt(obj);
> > -
> > -   /* release the fence reg _after_ flushing */
> > ret = i915_gem_object_put_fence(obj);
> > if (ret)
> > return ret;
> > +
> > +   /* Force a pagefault for domain tracking on next user access */
> > +   i915_gem_release_mmap(obj);
> 
> Can't put_fence before release_mmap ... I guess we should have a testcase
> for this somewhere? Hard to provoke probably ...

Why not? i915_gem_object_put_fence() has to release the mmaps itself if
it has a fence register assigned for the object.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH 1/5] drm/i915: shrinker_control->nr_to_scan is now unsigned long

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:18:25PM +0100, Chris Wilson wrote:
> As the shrinker_control now passes us unsigned long targets, update our
> shrinker functions to match.
> 
> Signed-off-by: Chris Wilson 

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 2 +-
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ec731e6db126..6c807c584d59 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3155,7 +3155,7 @@ i915_gem_object_create_stolen_for_preallocated(struct 
> drm_device *dev,
>  
>  /* i915_gem_shrinker.c */
>  unsigned long i915_gem_shrink(struct drm_i915_private *dev_priv,
> -   long target,
> +   unsigned long target,
> unsigned flags);
>  #define I915_SHRINK_PURGEABLE 0x1
>  #define I915_SHRINK_UNBOUND 0x2
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index f6ecbda2c604..b627d07fad29 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -73,7 +73,7 @@ static bool mutex_is_locked_by(struct mutex *mutex, struct 
> task_struct *task)
>   */
>  unsigned long
>  i915_gem_shrink(struct drm_i915_private *dev_priv,
> - long target, unsigned flags)
> + unsigned long target, unsigned flags)
>  {
>   const struct {
>   struct list_head *list;
> @@ -159,7 +159,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>  unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv)
>  {
>   i915_gem_evict_everything(dev_priv->dev);
> - return i915_gem_shrink(dev_priv, LONG_MAX,
> + return i915_gem_shrink(dev_priv, -1UL,
>  I915_SHRINK_BOUND | I915_SHRINK_UNBOUND);
>  }
>  
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
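
A side note on the LONG_MAX to -1UL change in the quoted i915_gem_shrink_all() hunk: once the target is an unsigned long, -1UL is simply the largest representable value, whereas LONG_MAX would leave the top half of the range unused. The sketch below illustrates that with a stand-in function; shrink_sketch() and shrink_all_sketch() are hypothetical and only model "reclaim up to target", not what i915_gem_shrink() really does.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for a shrinker with an unsigned long target:
 * it "reclaims" as much as is available, capped at the target.
 */
static unsigned long shrink_sketch(unsigned long available,
				   unsigned long target)
{
	return available < target ? available : target;
}

/*
 * "Shrink everything": -1UL wraps around to ULONG_MAX, so no amount of
 * available memory can exceed the target.
 */
static unsigned long shrink_all_sketch(unsigned long available)
{
	return shrink_sketch(available, -1UL);
}
```

With a target of LONG_MAX the same call would stop short whenever more than LONG_MAX units were available, which is the only case where the two constants behave differently.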


[Intel-gfx] [PATCH] drm/i915: Fix kerneldoc for i915_gem_shrink_all

2015-10-06 Thread Daniel Vetter
I've botched this, so let's fix it.

Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_gem_shrinker.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index f6ecbda2c604..674341708033 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -143,7 +143,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 }
 
 /**
- * i915_gem_shrink - Shrink buffer object caches completely
+ * i915_gem_shrink_all - Shrink buffer object caches completely
  * @dev_priv: i915 device
  *
  * This is a simple wraper around i915_gem_shrink() to aggressively shrink all
-- 
2.5.1



Re: [Intel-gfx] [PATCH 2/5] drm/i915: Add a tracepoint for the shrinker

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:18:26PM +0100, Chris Wilson wrote:
> Often it is very useful to know why we suddenly purge vast tracts of
> memory and surprisingly up until now we didn't even have a tracepoint
> for when we shrink our memory.
> 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 ++
>  drivers/gpu/drm/i915/i915_trace.h| 20 
>  2 files changed, 22 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index b627d07fad29..88f66a2586ec 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -85,6 +85,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>   }, *phase;
>   unsigned long count = 0;
>  
> + trace_i915_gem_shrink(dev_priv, target, flags);

Shouldn't we also dump how many pages we actually managed to shrink, i.e.
count (at the end of the function)?

Also we have a slab_start/end tracepoint already, but that one obviously
doesn't cover the internal calls to i915_gem_shrink. Should imo be
mentioned in the commit message.
-Daniel

> +
>   /*
>* As we may completely rewrite the (un)bound list whilst unbinding
>* (due to retiring requests) we have to strictly process only
> diff --git a/drivers/gpu/drm/i915/i915_trace.h 
> b/drivers/gpu/drm/i915/i915_trace.h
> index e6b5c7470ba0..ed7f42f2e740 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -107,6 +107,26 @@ TRACE_EVENT(i915_gem_object_create,
>   TP_printk("obj=%p, size=%u", __entry->obj, __entry->size)
>  );
>  
> +TRACE_EVENT(i915_gem_shrink,
> + TP_PROTO(struct drm_i915_private *i915, unsigned long target, 
> unsigned flags),
> + TP_ARGS(i915, target, flags),
> +
> + TP_STRUCT__entry(
> +  __field(int, dev)
> +  __field(unsigned long, target)
> +  __field(unsigned, flags)
> +  ),
> +
> + TP_fast_assign(
> +__entry->dev = i915->dev->primary->index;
> +__entry->target = target;
> +__entry->flags = flags;
> +),
> +
> + TP_printk("dev=%d, target=%lu, flags=%x",
> +   __entry->dev, __entry->target, __entry->flags)
> +);
> +
>  TRACE_EVENT(i915_vma_bind,
>   TP_PROTO(struct i915_vma *vma, unsigned flags),
>   TP_ARGS(vma, flags),
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 3/5] drm/i915: During shrink_all we only need to idle the GPU

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:18:27PM +0100, Chris Wilson wrote:
> We can forgo an evict-everything here as the shrinker operation itself
> will unbind any vma as required. If we explicitly idle the GPU through a
> switch to the default context, we not only create a request in an
> illegal context (e.g. whilst shrinking during execbuf with a request
> already allocated), but switching to the default context will not free
> up the memory backing the active contexts - unless in the unlikely
> situation that context had already been closed (and just kept arrive by
> being the current context). The saving is near zero and the danger real.
> 
> To compensate for the loss of the forced retire, add a couple of
> retire-requests to i915_gem_shirnk() - this should help free up any
> transitive cache from the requests.
> 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index 88f66a2586ec..2058d162aeb9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -86,6 +86,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>   unsigned long count = 0;
>  
>   trace_i915_gem_shrink(dev_priv, target, flags);
> + i915_gem_retire_requests(dev_priv->dev);
>  
>   /*
>* As we may completely rewrite the (un)bound list whilst unbinding
> @@ -141,6 +142,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>   list_splice(&still_in_list, phase->list);
>   }
>  
> + i915_gem_retire_requests(dev_priv->dev);

I don't really get the justification for the 2nd retire_requests. Also,
isn't the first one only needed for the last patch, to not stall in the
normal shrinker on active objects?

As an aside on blowing up on requests and nested stuff: we could make
alloc_request/request_submit/cancel a lockdep locking pair. That would
catch bogus nesting and locking inversion through the mm subsystem (since
any malloc function is its own lockdep critical section to avoid
deadlocks on GFP_NOFS and friends).

Also splitting out evict_everything into that one-line patch might be good
for -fixes if we have bug reports where this blows up.
-Daniel

> +
>   return count;
>  }
>  
> @@ -160,7 +163,6 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>   */
>  unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv)
>  {
> - i915_gem_evict_everything(dev_priv->dev);
>   return i915_gem_shrink(dev_priv, -1UL,
>  I915_SHRINK_BOUND | I915_SHRINK_UNBOUND);
>  }
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 5/5] drm/i915: Avoid GPU stalls from kswapd

2015-10-06 Thread Daniel Vetter
On Thu, Oct 01, 2015 at 12:18:29PM +0100, Chris Wilson wrote:
> Exclude active GPU pages from the purview of the background shrinker
> (kswapd), as these cause uncontrollable GPU stalls. Given that the
> shrinker is rerun until the freelists are satisfied, we should have
> opportunity in subsequent passes to recover the pages once idle. If the
> machine does run out of memory entirely, we have the forced idling in the
> oom-notifier as a means of releasing all the pages we can before an oom
> is prematurely executed.
> 
> Signed-off-by: Chris Wilson 
> Reviewed-by: Damien Lespiau 

lgtm, but imo we should move the retire_requests from an earlier patch to
this one.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 1 +
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 9 +++--
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 64b8929acdf2..a443310a3598 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -3159,6 +3159,7 @@ unsigned long i915_gem_shrink(struct drm_i915_private 
> *dev_priv,
>  #define I915_SHRINK_PURGEABLE 0x1
>  #define I915_SHRINK_UNBOUND 0x2
>  #define I915_SHRINK_BOUND 0x4
> +#define I915_SHRINK_ACTIVE 0x8
>  unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv);
>  void i915_gem_shrinker_init(struct drm_i915_private *dev_priv);
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index 2058d162aeb9..b2ccb7346899 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -126,6 +126,9 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>   obj->madv != I915_MADV_DONTNEED)
>   continue;
>  
> + if ((flags & I915_SHRINK_ACTIVE) == 0 && obj->active)
> + continue;
> +
>   drm_gem_object_reference(&obj->base);
>  
>   /* For the unbound phase, this should be a no-op! */
> @@ -164,7 +167,9 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>  unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv)
>  {
>   return i915_gem_shrink(dev_priv, -1UL,
> -I915_SHRINK_BOUND | I915_SHRINK_UNBOUND);
> +I915_SHRINK_BOUND |
> +I915_SHRINK_UNBOUND |
> +I915_SHRINK_ACTIVE);
>  }
>  
>  static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
> @@ -217,7 +222,7 @@ i915_gem_shrinker_count(struct shrinker *shrinker, struct 
> shrink_control *sc)
>   count += obj->base.size >> PAGE_SHIFT;
>  
>   list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> - if (obj->pages_pin_count == num_vma_bound(obj))
> + if (!obj->active && obj->pages_pin_count == num_vma_bound(obj))
>   count += obj->base.size >> PAGE_SHIFT;
>   }
>  
> -- 
> 2.6.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 1/6] drm/i915: Eliminate vmap overhead for cmd parser

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 02:44:22PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:57:10PM +0100, Chris Wilson wrote:
> > With a little complexity to handle cmds straddling page boundaries, we
> > can completely avoiding having to vmap the batch and the shadow batch
> > objects whilst running the command parser.
> > 
> > On ivb i7-3720MQ:
> > 
> > x11perf -dot before 54.3M, after 53.2M (max 203M)
> > glxgears before 7110 fps, after 7300 fps (max 7860 fps)
> > 
> > Before:
> > Time to blt 16384 bytes x  1:12.400µs, 1.2GiB/s
> > Time to blt 16384 bytes x   4096: 3.055µs, 5.0GiB/s
> > 
> > After:
> > Time to blt 16384 bytes x  1: 8.600µs, 1.8GiB/s
> > Time to blt 16384 bytes x   4096: 2.456µs, 6.2GiB/s
> 
> Numbers for the overall series (or individual patches would be even
> better) are needed. I thought you have this neat script now to do that for
> an entire series?

Note that numbers on a patch are for the patch unless otherwise stated.
Doubly so since these are numbers from when I first posted it and didn't
have anything else to boost cmdparser perf.

I do and I even added the benchmark to demonstrate one case. Then I
forgot to enable the cmdparser in mesa and so its numbers are bunk.
Fancy scripts still can't eliminate pebkac.

However, it did show that we still get the throughput improvement from
killing the vmap even with the temporary copy.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 06:13:52PM +0530, Kumar, Shobhit wrote:
> On 10/06/2015 05:49 PM, Daniel Vetter wrote:
> >On Tue, Oct 06, 2015 at 02:41:44PM +0300, Ville Syrjälä wrote:
> >>On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> >>>On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> >>>>On 10/06/2015 04:11 PM, Imre Deak wrote:
> >>>>>On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> >>>>>>On 10/05/2015 09:05 PM, Imre Deak wrote:
> >>>>>>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> >>>>>>>>Mostly reuse what is programmed by pre-os, but in case there is no
> >>>>>>>>pre-os initialization, init the cdclk with the default value.
> >>>>>>>>
> >>>>>>>>Cc: Imre Deak
> >>>>>>>>Signed-off-by: Shobhit Kumar
> >>>>>>>>---
> >>>>>>>>   drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> >>>>>>>>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>>>>>>>
> >>>>>>>>diff --git a/drivers/gpu/drm/i915/intel_ddi.c
> >>>>>>>>b/drivers/gpu/drm/i915/intel_ddi.c
> >>>>>>>>index 2d3cc82..675c60d 100644
> >>>>>>>>--- a/drivers/gpu/drm/i915/intel_ddi.c
> >>>>>>>>+++ b/drivers/gpu/drm/i915/intel_ddi.c
> >>>>>>>>@@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device *dev)
> >>>>>>>>
> >>>>>>>>  cdclk_freq =
> >>>>>>>> dev_priv->display.get_display_clock_speed(dev);
> >>>>>>>>  dev_priv->skl_boot_cdclk = cdclk_freq;
> >>>>>>>>- if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> >>>>>>>>- DRM_ERROR("LCPLL1 is disabled\n");
> >>>>>>>>- else
> >>>>>>>>- intel_display_power_get(dev_priv,
> >>>>>>>>POWER_DOMAIN_PLLS);
> >>>>>>>>+
> >>>>>>>>+ skl_init_cdclk(dev_priv);
> >>>>>>>
> >>>>>>>How does this prevent changing the clock if BIOS did enable some
> >>>>>>>output?
> >>>>>>>We shouldn't change the clock in that case.
> >>>>>>
> >>>>>>In that case it will try to re-apply the same clock that BIOS enabled.
> >>>>>>Not sure if this is allowed, but I checked the cdclock change sequence
> >>>>>>and it is mostly followed in skl_init_cdclk.
> >>>>>>In my tests where BIOS does enable this, I faced no issues in
> >>>>>>initializing again in driver.
> >>>>>
> >>>>>The first step in that sequence:
> >>>>>"Disable all display engine functions using the full mode set disable
> >>>>>sequence on all pipes, ports, and planes."
> >>>>
> >>>>Oh, yeah, I again made mistake of assuming that display is not enabled in
> >>>>the first place. You are right, though it works if I change the clock
> >>>>again.
> >>>>
> >>>>>
> >>>>>So the problem is not that the PLL itself may be enabled here (as BIOS
> >>>>>left it), but that some output is also enabled.
> >>>>
> >>>>Yes.
> >>>>
> >>>>>
> >>>>>>I have noticed on some pre-os this value is programmed correctly except
> >>>>>>for the decimal part. That causes AUX transactions to fail on SKl. That
> >>>>>>is what triggered this patch actually. So other way is to completely
> >>>>>>validate the value in get_display_clock_speed instead of bit[28:26] and
> >>>>>>then if wrong then only do the cdclk init.
> >>>>>
> >>>>>I think we'd need to detect at this point if outputs are enabled and
> >>>>>only attempt to work around the above BIOS problem if this is not the
> >>>>>case. Alternatively you could also disable the active outputs as a first
> >>>>>step.
> >>>>
> >>>>Ok, let me detect if any output is enabled by BIOS and accordingly
> >>>>initialize cdclk.
> >>>
> >>>These kind of fixiups should be done after the hw state readout. We
> >>>already have sanitize_crtc/pll/encoder functions, probably best if we add
> >>>a sanitize_cdclk or similar for this at the very end of the hw state
> >>>sanitize sequence.
> >>
> >>Can't be done if we already need a somewhat sane cdclk for the
> >>eDP AUX probing and whatnot.
> >>
> >>For actually enabling the cdclk for pushing pixels, we wouldn't need
> >>to do anything except actually plug ia a calc_cdclk for SKL. No idea
> >>why we're not doing that currently. Some extra care may be needed
> >>due to the eDP DPLL0 usag IIRC.
> >
> >Hm right, cdlck is in the top-level power domain. Added fun is that with
> >dmc the firmware is supposed to handle it. Messy :(
> 
> Yes, exactly. How about just adding verify_cdclk and calling in
> get_display_clock_speed to check if cdclk is programmed correctly along with
> related DPLL0 VCO settings for now. If all looks good, then skip else
> initialize. Now in that case if we have to initialize where do we get the
> cdclock to initialize with at this point ? Any default in VBT ? Or go with
> minimum by default and it can be bumped up later if needed.

Just initialize to the slowest possible value, once we have dynamic cdclk
switching for skl. But that seems to be stuck behind resolving that big
confusion around dmc and the sequence for runtime pm. Without dynamic
cdclk we have to pick the maximum.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
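
The decision discussed in this thread can be summarised as a small sketch. Everything here is hypothetical (the enum, the function name and its three inputs are not i915 symbols); it only encodes the reasoning above: leave cdclk alone while any output is enabled, keep a valid pre-OS value, and otherwise pick the minimum when dynamic cdclk switching exists or the maximum when it does not.

```c
#include <stdbool.h>

/* Hypothetical outcome of a boot-time cdclk check; not an i915 type. */
enum cdclk_action {
	CDCLK_KEEP,      /* leave whatever the pre-OS programmed */
	CDCLK_INIT_MIN,  /* program the slowest supported frequency */
	CDCLK_INIT_MAX,  /* program the fastest supported frequency */
};

static enum cdclk_action
pick_cdclk_action(bool outputs_enabled, bool cdclk_valid, bool dynamic_cdclk)
{
	/* The change sequence requires all pipes, ports and planes off. */
	if (outputs_enabled)
		return CDCLK_KEEP;

	/* Nothing to repair if the pre-OS value already checks out. */
	if (cdclk_valid)
		return CDCLK_KEEP;

	/* Without dynamic switching there is no later chance to go up. */
	return dynamic_cdclk ? CDCLK_INIT_MIN : CDCLK_INIT_MAX;
}
```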

Re: [Intel-gfx] [PATCH] drm/i915: Stop discarding GTT cache-domain on unbind vma

2015-10-06 Thread Daniel Vetter
On Tue, Oct 06, 2015 at 01:46:25PM +0100, Chris Wilson wrote:
> On Tue, Oct 06, 2015 at 02:40:26PM +0200, Daniel Vetter wrote:
> > On Tue, Oct 06, 2015 at 01:02:22PM +0100, Chris Wilson wrote:
> > > Since
> > > 
> > > commit 43566dedde54f9729113f5f9fde77d53e75e61e9
> > > Author: Chris Wilson 
> > > Date:   Fri Jan 2 16:29:29 2015 +0530
> > > 
> > > drm/i915: Broaden application of set-domain(GTT)
> > > 
> > > we allowed objects to be in the GTT domain, but unbound. Therefore
> > > removing the GTT cache domain when removing the GGTT vma is no longer
> > > semantically correct.
> > > 
> > > An unfortunate side-effect is we lose the wondrously named
> > > i915_gem_object_finish_gtt(), not to be confused with
> > > i915_gem_gtt_finish_object()!
> > > 
> > > Signed-off-by: Chris Wilson 
> > > Cc: Akash Goel 
> > > Cc: Joonas Lahtinen 
> > > Cc: Tvrtko Ursulin 
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 20 +++-
> > >  1 file changed, 3 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > > b/drivers/gpu/drm/i915/i915_gem.c
> > > index 8f498d4d874d..682af2ae3681 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3183,20 +3183,6 @@ i915_gem_object_sync(struct drm_i915_gem_object 
> > > *obj,
> > >   return 0;
> > >  }
> > >  
> > > -static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
> > > -{
> > > - /* Force a pagefault for domain tracking on next user access */
> > > - i915_gem_release_mmap(obj);
> > > -
> > > - if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
> > > - return;
> > > -
> > > - obj->base.read_domains &= ~I915_GEM_DOMAIN_GTT;
> > > - obj->base.write_domain &= ~I915_GEM_DOMAIN_GTT;
> > > -
> > > - trace_i915_gem_object_change_domain(obj);
> > > -}
> > > -
> > >  int i915_vma_unbind(struct i915_vma *vma)
> > >  {
> > >   struct drm_i915_gem_object *obj = vma->obj;
> > > @@ -3228,12 +3214,12 @@ int i915_vma_unbind(struct i915_vma *vma)
> > >*/
> > >  
> > >   if (vma->is_ggtt && vma->ggtt_view.type == I915_GGTT_VIEW_NORMAL) {
> > > - i915_gem_object_finish_gtt(obj);
> > > -
> > > - /* release the fence reg _after_ flushing */
> > >   ret = i915_gem_object_put_fence(obj);
> > >   if (ret)
> > >   return ret;
> > > +
> > > + /* Force a pagefault for domain tracking on next user access */
> > > + i915_gem_release_mmap(obj);
> > 
> > Can't put_fence before release_mmap ... I guess we should have a testcase
> > for this somewhere? Hard to provoke probably ...
> 
> Why not? i915_gem_object_put_fence() has to release the mmaps itself if
> it has a fence register assigned for the object.

Oh right, I'd forgotten that put_fence is robust. Looking at this simply
brought up bad memories, since I fixed this kind of bug once, 5 years ago
;-)

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 3/5] drm/i915: During shrink_all we only need to idle the GPU

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 03:00:49PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:18:27PM +0100, Chris Wilson wrote:
> > We can forgo an evict-everything here as the shrinker operation itself
> > will unbind any vma as required. If we explicitly idle the GPU through a
> > switch to the default context, we not only create a request in an
> > illegal context (e.g. whilst shrinking during execbuf with a request
> > already allocated), but switching to the default context will not free
> > up the memory backing the active contexts - unless in the unlikely
> > situation that context had already been closed (and just kept arrive by
> > being the current context). The saving is near zero and the danger real.
> > 
> > To compensate for the loss of the forced retire, add a couple of
> > retire-requests to i915_gem_shirnk() - this should help free up any
> > transitive cache from the requests.
> > 
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/i915_gem_shrinker.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> > b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > index 88f66a2586ec..2058d162aeb9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > @@ -86,6 +86,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> > unsigned long count = 0;
> >  
> > trace_i915_gem_shrink(dev_priv, target, flags);
> > +   i915_gem_retire_requests(dev_priv->dev);
> >  
> > /*
> >  * As we may completely rewrite the (un)bound list whilst unbinding
> > @@ -141,6 +142,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> > list_splice(&still_in_list, phase->list);
> > }
> >  
> > +   i915_gem_retire_requests(dev_priv->dev);
> 
> I dont really get the justification for the 2nd retire_requests. Also
> isn't the first one only needed for the last patch to not stall in the
> normal shrinker on active objects?

No. The first one is just a convenience (putting it first just means we
may get more inactive objects during an inactive-only shrink, though they
will be at the end and so more likely not to be included by the shrinker's
scan-count).

We need an i915_gem_retire_requests() over and above the usual retirement
because execlists is snafu. The second one is to handle a transient
cache of requests which you haven't seen yet, but execlists needs it
anyway in order to unpin itself (since it is not tied into retirement).
 
> Aside for blowing up on requests and nested stuff: We could make
> alloc_request/request_submit/cancel a lockdep locking pair. That would
> catch bogus nesting and locking inversion through the mm subsystem (since
> any malloc function is it's own lockdep critical section to avoid
> deadlocks on GFP_NOFS and friends).

Interesting. That sounds like a clean way to catch reentrancy, something
to think about.

> Also splitting out evict_everything into that one-line patch might be good
> for -fixes if we have bug reports where this blows up.

Corner-case performance issue on top of memory pressure. It's so old no
one will have noticed a regression, and it's already on a slow path that
unless you were analysing traces you probably wouldn't even notice the
degradation.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
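
The alloc/submit/cancel pairing suggested above can be pictured with a small userspace sketch. It is not lockdep and uses no kernel API: struct request_tracker and both helpers are hypothetical, and they only show the acquire/release bookkeeping that would flag a request being allocated while another is still open. Real lockdep also tracks ordering against other locks, which this omits.

```c
#include <stdbool.h>

/* Hypothetical tracker; in the kernel, lockdep would keep this state. */
struct request_tracker {
	int held;        /* requests allocated but not yet submitted/cancelled */
	int violations;  /* nested allocations seen so far */
};

/* "Acquire": note a new request; returns false if one was already open. */
static bool tracker_alloc(struct request_tracker *t)
{
	bool nested = t->held > 0;

	if (nested)
		t->violations++;
	t->held++;
	return !nested;
}

/* "Release": called on both the submit and the cancel path. */
static void tracker_release(struct request_tracker *t)
{
	if (t->held > 0)
		t->held--;
}
```

A shrinker that tried to create a request while execbuf already held one would show up as a failed tracker_alloc() in this model.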


Re: [Intel-gfx] [PATCH 2/5] drm/i915: Add a tracepoint for the shrinker

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 02:54:25PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:18:26PM +0100, Chris Wilson wrote:
> > Often it is very useful to know why we suddenly purge vast tracts of
> > memory and surprisingly up until now we didn't even have a tracepoint
> > for when we shrink our memory.
> > 
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/i915_gem_shrinker.c |  2 ++
> >  drivers/gpu/drm/i915/i915_trace.h| 20 
> >  2 files changed, 22 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> > b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > index b627d07fad29..88f66a2586ec 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> > @@ -85,6 +85,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
> > }, *phase;
> > unsigned long count = 0;
> >  
> > +   trace_i915_gem_shrink(dev_priv, target, flags);
> 
> Shouldn't we also dump how many pages we actually managed to shrink, i.e.
> count (at the end of the functions).

I didn't because I find the double tracepoints annoying, and you already
have the unbinds following.

I guess shrink_begin/shrink_end (to be consistent with wait_begin/_end),
or shrink_start/_end (to be consistent with slab).
 
> Also we have a slab_start/end tracepoint already, but that one obviously
> doesn't cover the internal calls to i915_gem_shrink. Should imo be
> mentioned in the commit message.

Sure, I don't usually watch slab, so I don't have a marker for the
thousand unbinds as to what caused them.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
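
A begin/end pair as discussed above might carry the following fields. This is a plain userspace sketch, assuming the begin event keeps the target and flags from the posted TP_printk() and the end event adds the count that was asked for; the two formatter names are made up, and real tracepoints would be declared with TRACE_EVENT() rather than snprintf().

```c
#include <stdio.h>

/* Hypothetical formatter for the event emitted on entry to the shrinker. */
static int fmt_shrink_begin(char *buf, size_t len,
			    unsigned long target, unsigned flags)
{
	return snprintf(buf, len, "shrink_begin: target=%lu, flags=%x",
			target, flags);
}

/* Hypothetical formatter for the event emitted just before returning. */
static int fmt_shrink_end(char *buf, size_t len, unsigned long count)
{
	return snprintf(buf, len, "shrink_end: count=%lu", count);
}
```

Pairing the two lets a trace reader bracket the unbinds in between and compare the requested target with the pages actually released.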


Re: [Intel-gfx] [PATCH 5/5] drm/i915: Avoid GPU stalls from kswapd

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 03:01:45PM +0200, Daniel Vetter wrote:
> On Thu, Oct 01, 2015 at 12:18:29PM +0100, Chris Wilson wrote:
> > Exclude active GPU pages from the purview of the background shrinker
> > (kswapd), as these cause uncontrollable GPU stalls. Given that the
> > shrinker is rerun until the freelists are satisfied, we should have
> > opportunity in subsequent passes to recover the pages once idle. If the
> > machine does run out of memory entirely, we have the forced idling in the
> > oom-notifier as a means of releasing all the pages we can before an oom
> > is prematurely executed.
> > 
> > Signed-off-by: Chris Wilson 
> > Reviewed-by: Damien Lespiau 
> 
> lgtm, but imo we should move the retire_requests from an earlier patch to
> this one.

I am not convinced. The retire_requests are there for their own reasons
(to cover up cracks elsewhere) and not because we need them for retiring
active objects.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
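
The effect of the new flag in patch 5/5 can be shown with a cut-down userspace model. The flag values are copied from the quoted i915_drv.h hunk; struct fake_obj and both helpers are hypothetical, and the sketch assumes that background callers leave I915_SHRINK_ACTIVE unset while shrink_all sets it.

```c
#include <stdbool.h>

/* Flag values as in the quoted i915_drv.h hunk. */
#define I915_SHRINK_PURGEABLE 0x1
#define I915_SHRINK_UNBOUND   0x2
#define I915_SHRINK_BOUND     0x4
#define I915_SHRINK_ACTIVE    0x8

/* Hypothetical, cut-down object: only what the new check reads. */
struct fake_obj {
	bool active;            /* still in use by the GPU */
	unsigned long pages;
};

/* Mirrors the added test: skip active objects unless ACTIVE is set. */
static bool can_shrink(const struct fake_obj *obj, unsigned flags)
{
	if ((flags & I915_SHRINK_ACTIVE) == 0 && obj->active)
		return false;
	return true;
}

/* Sum the pages of every object the given flags allow us to release. */
static unsigned long count_shrinkable(const struct fake_obj *objs, int n,
				      unsigned flags)
{
	unsigned long count = 0;
	int i;

	for (i = 0; i < n; i++)
		if (can_shrink(&objs[i], flags))
			count += objs[i].pages;

	return count;
}
```

With one active 10-page object and two idle ones, a pass without ACTIVE only sees the idle pages, so it never has to wait for the GPU.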


[Intel-gfx] [PATCH i-g-t] kms_rotation_crc: Excercise page flips with 90 degree rotation

2015-10-06 Thread Tvrtko Ursulin
From: Tvrtko Ursulin 

Do some page flipping on the rotated plane just to exercise
that code path.

Signed-off-by: Tvrtko Ursulin 
Cc: Sonika Jindal 
Cc: Arun R Murthy 
---
There are some, still unofficial, reports that page flipping with
90 degree rotation may be causing a vma->pin_count leak.

But more importantly it looks like it doesn't work at all at the
moment with frame buffers which do not cover the whole plane.
---
 tests/kms_rotation_crc.c | 56 
 1 file changed, 56 insertions(+)

diff --git a/tests/kms_rotation_crc.c b/tests/kms_rotation_crc.c
index cc9847ecc113..fde910250353 100644
--- a/tests/kms_rotation_crc.c
+++ b/tests/kms_rotation_crc.c
@@ -31,6 +31,7 @@ typedef struct {
igt_display_t display;
struct igt_fb fb;
struct igt_fb fb_modeset;
+   struct igt_fb fb_flip;
igt_crc_t ref_crc;
igt_pipe_crc_t *pipe_crc;
igt_rotation_t rotation;
@@ -39,6 +40,7 @@ typedef struct {
unsigned int w, h;
uint32_t override_fmt;
uint64_t override_tiling;
+   unsigned int flip_stress;
 } data_t;
 
 static void
@@ -165,6 +167,16 @@ static void prepare_crtc(data_t *data, igt_output_t 
*output, enum pipe pipe,
  &data->fb);
igt_assert(fb_id);
 
+   if (data->flip_stress) {
+   fb_id = igt_create_fb(data->gfx_fd,
+ w, h,
+ pixel_format,
+ tiling,
+ &data->fb_flip);
+   igt_assert(fb_id);
+   paint_squares(data, mode, data->rotation, plane);
+   }
+
/* Step 1: create a reference CRC for a software-rotated fb */
paint_squares(data, mode, data->rotation, plane);
commit_crtc(data, output, plane);
@@ -187,6 +199,8 @@ static void cleanup_crtc(data_t *data, igt_output_t 
*output, igt_plane_t *plane)
 
igt_remove_fb(data->gfx_fd, &data->fb);
igt_remove_fb(data->gfx_fd, &data->fb_modeset);
+   if (data->fb_flip.fb_id)
+   igt_remove_fb(data->gfx_fd, &data->fb_flip);
 
/* XXX: see the note in prepare_crtc() */
if (!plane->is_primary) {
@@ -202,6 +216,23 @@ static void cleanup_crtc(data_t *data, igt_output_t 
*output, igt_plane_t *plane)
igt_display_commit(display);
 }
 
+static void wait_for_pageflip(int fd)
+{
+   drmEventContext evctx = { .version = DRM_EVENT_CONTEXT_VERSION };
+   struct timeval timeout = { .tv_sec = 0, .tv_usec = 32000 };
+   fd_set fds;
+   int ret;
+
+   /* Wait for pageflip completion, then consume event on fd */
+   FD_ZERO(&fds);
+   FD_SET(fd, &fds);
+   do {
+   ret = select(fd + 1, &fds, NULL, NULL, &timeout);
+   } while (ret < 0 && errno == EINTR);
+   igt_assert_eq(ret, 1);
+   igt_assert(drmHandleEvent(fd, &evctx) == 0);
+}
+
 static void test_plane_rotation(data_t *data, enum igt_plane plane_type)
 {
igt_display_t *display = &data->display;
@@ -245,6 +276,23 @@ static void test_plane_rotation(data_t *data, enum 
igt_plane plane_type)
 &crc_output);
}
 
+   while (data->flip_stress--) {
+   ret = drmModePageFlip(data->gfx_fd,
+ 
output->config.crtc->crtc_id,
+ data->fb_flip.fb_id,
+ DRM_MODE_PAGE_FLIP_EVENT,
+ NULL);
+   igt_assert(ret == 0);
+   wait_for_pageflip(data->gfx_fd);
+   ret = drmModePageFlip(data->gfx_fd,
+ 
output->config.crtc->crtc_id,
+ data->fb.fb_id,
+ DRM_MODE_PAGE_FLIP_EVENT,
+ NULL);
+   igt_assert(ret == 0);
+   wait_for_pageflip(data->gfx_fd);
+   }
+
/*
 * check the rotation state has been reset when the VT
 * mode is restored
@@ -345,6 +393,14 @@ igt_main
test_plane_rotation(&data, IGT_PLANE_PRIMARY);
}
 
+   igt_subtest_f("primary-rotation-90-flip-stress") {
+   igt_require(gen >= 9);
+   data.override_tiling = 0;
+   data.flip_stress = 120;
+   data.rotation = IGT_ROTATION_90;
+   test_plane_rotation(&data, IGT_PLANE_PRIMARY);
+   }
+
igt_fixture {
igt_display_fini(&data.display);
}
-- 
1.9.1


Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 06:13:52PM +0530, Kumar, Shobhit wrote:
> On 10/06/2015 05:49 PM, Daniel Vetter wrote:
> > On Tue, Oct 06, 2015 at 02:41:44PM +0300, Ville Syrjälä wrote:
> >> On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> >>> On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
>  On 10/06/2015 04:11 PM, Imre Deak wrote:
> > On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> >> On 10/05/2015 09:05 PM, Imre Deak wrote:
> >>> On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
>  Mostly reuse what is programmed by pre-os, but in case there is no
>  pre-os initialization, init the cdclk with the default value.
> 
>  Cc: Imre Deak 
>  Signed-off-by: Shobhit Kumar 
>  ---
> drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> 1 file changed, 2 insertions(+), 4 deletions(-)
> 
>  diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
>  b/drivers/gpu/drm/i915/intel_ddi.c
>  index 2d3cc82..675c60d 100644
>  --- a/drivers/gpu/drm/i915/intel_ddi.c
>  +++ b/drivers/gpu/drm/i915/intel_ddi.c
>  @@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device 
>  *dev)
> 
>   cdclk_freq = 
>  dev_priv->display.get_display_clock_speed(dev);
>   dev_priv->skl_boot_cdclk = cdclk_freq;
>  -if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
>  -DRM_ERROR("LCPLL1 is disabled\n");
>  -else
>  -intel_display_power_get(dev_priv, 
>  POWER_DOMAIN_PLLS);
>  +
>  +skl_init_cdclk(dev_priv);
> >>>
> >>> How does this prevent changing the clock if BIOS did enable some 
> >>> output?
> >>> We shouldn't change the clock in that case.
> >>
> >> In that case it will try to re-apply the same clock that BIOS enabled.
> >> Not sure if this is allowed, but I checked the cdclock change sequence
> >> and it is mostly followed in skl_init_cdclk.
> >> In my tests where BIOS does enable this, I faced no issues in
> >> initializing again in driver.
> >
> > The first step in that sequence:
> > "Disable all display engine functions using the full mode set disable
> > sequence on all pipes, ports, and planes."
> 
>  Oh, yeah, I again made mistake of assuming that display is not enabled in
>  the first place. You are right, though it works if I change the clock 
>  again.
> 
> >
> > So the problem is not that the PLL itself may be enabled here (as BIOS
> > left it), but that some output is also enabled.
> 
>  Yes.
> 
> >
> >> I have noticed on some pre-os this value is programmed correctly except
> >> for the decimal part. That causes AUX transactions to fail on SKL. That
> >> is what triggered this patch actually. So other way is to completely
> >> validate the value in get_display_clock_speed instead of bit[28:26] and
> >> then if wrong then only do the cdclk init.
> >
> > I think we'd need to detect at this point if outputs are enabled and
> > only attempt to work around the above BIOS problem if this is not the
> > case. Alternatively you could also disable the active outputs as a first
> > step.
> 
>  Ok, let me detect if any output is enabled by BIOS and accordingly
>  initialize cdclk.
> >>>
> >>> These kinds of fixups should be done after the hw state readout. We
> >>> already have sanitize_crtc/pll/encoder functions, probably best if we add
> >>> a sanitize_cdclk or similar for this at the very end of the hw state
> >>> sanitize sequence.
> >>
> >> Can't be done if we already need a somewhat sane cdclk for the
> >> eDP AUX probing and whatnot.
> >>
> >> For actually enabling the cdclk for pushing pixels, we wouldn't need
> >> to do anything except actually plug in a calc_cdclk for SKL. No idea
> >> why we're not doing that currently. Some extra care may be needed
> >> due to the eDP DPLL0 usage IIRC.
> >
> > Hm right, cdclk is in the top-level power domain. Added fun is that with
> > dmc the firmware is supposed to handle it. Messy :(
> 
> Yes, exactly. How about just adding verify_cdclk and calling in 
> get_display_clock_speed to check if cdclk is programmed correctly along 
> with related DPLL0 VCO settings for now.

I would just keep it somewhere in the init/resume path rather than polluting
.get_display_clock_speed with it. We should be calling
intel_update_cdclk() in those paths somewhere, so doing the fixup just
before that should be sufficient.

> If all looks good, then skip 
> else initialize. Now in that case if we have to initialize where do we 
> get the cdclock to initialize with at this point ? Any default in VBT ? 
> Or go with minimum by default and it can be bumped up later if needed.

Can we figure out the eDP li

Re: [Intel-gfx] [PATCH] drm/i915/skl: Init cdclk in the driver rather than relying on pre-os

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 03:04:28PM +0200, Daniel Vetter wrote:
> On Tue, Oct 06, 2015 at 06:13:52PM +0530, Kumar, Shobhit wrote:
> > On 10/06/2015 05:49 PM, Daniel Vetter wrote:
> > >On Tue, Oct 06, 2015 at 02:41:44PM +0300, Ville Syrjälä wrote:
> > >>On Tue, Oct 06, 2015 at 01:19:52PM +0200, Daniel Vetter wrote:
> > >>>On Tue, Oct 06, 2015 at 04:33:43PM +0530, Kumar, Shobhit wrote:
> > On 10/06/2015 04:11 PM, Imre Deak wrote:
> > >On ti, 2015-10-06 at 15:26 +0530, Kumar, Shobhit wrote:
> > >>On 10/05/2015 09:05 PM, Imre Deak wrote:
> > >>>On ma, 2015-10-05 at 20:52 +0530, Shobhit Kumar wrote:
> > Mostly reuse what is programmed by pre-os, but in case there is no
> > pre-os initialization, init the cdclk with the default value.
> > 
> > Cc: Imre Deak 
> > Signed-off-by: Shobhit Kumar 
> > ---
> >    drivers/gpu/drm/i915/intel_ddi.c | 6 ++
> >    1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> > b/drivers/gpu/drm/i915/intel_ddi.c
> > index 2d3cc82..675c60d 100644
> > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > @@ -2947,10 +2947,8 @@ void intel_ddi_pll_init(struct drm_device 
> > *dev)
> > 
> > cdclk_freq = 
> >  dev_priv->display.get_display_clock_speed(dev);
> > dev_priv->skl_boot_cdclk = cdclk_freq;
> > -   if (!(I915_READ(LCPLL1_CTL) & LCPLL_PLL_ENABLE))
> > -   DRM_ERROR("LCPLL1 is disabled\n");
> > -   else
> > -   intel_display_power_get(dev_priv, 
> > POWER_DOMAIN_PLLS);
> > +
> > +   skl_init_cdclk(dev_priv);
> > >>>
> > >>>How does this prevent changing the clock if BIOS did enable some 
> > >>>output?
> > >>>We shouldn't change the clock in that case.
> > >>
> > >>In that case it will try to re-apply the same clock that BIOS enabled.
> > >>Not sure if this is allowed, but I checked the cdclock change sequence
> > >>and it is mostly followed in skl_init_cdclk.
> > >>In my tests where BIOS does enable this, I faced no issues in
> > >>initializing again in driver.
> > >
> > >The first step in that sequence:
> > >"Disable all display engine functions using the full mode set disable
> > >sequence on all pipes, ports, and planes."
> > 
> > Oh, yeah, I again made mistake of assuming that display is not enabled 
> > in
> > the first place. You are right, though it works if I change the clock 
> > again.
> > 
> > >
> > >So the problem is not that the PLL itself may be enabled here (as BIOS
> > >left it), but that some output is also enabled.
> > 
> > Yes.
> > 
> > >
> > >>I have noticed on some pre-os this value is programmed correctly 
> > >>except
> > >>for the decimal part. That causes AUX transactions to fail on SKL. 
> > >>That
> > >>is what triggered this patch actually. So other way is to completely
> > >>validate the value in get_display_clock_speed instead of bit[28:26] 
> > >>and
> > >>then if wrong then only do the cdclk init.
> > >
> > >I think we'd need to detect at this point if outputs are enabled and
> > >only attempt to work around the above BIOS problem if this is not the
> > >case. Alternatively you could also disable the active outputs as a 
> > >first
> > >step.
> > 
> > Ok, let me detect if any output is enabled by BIOS and accordingly
> > initialize cdclk.
> > >>>
> > >>>These kinds of fixups should be done after the hw state readout. We
> > >>>already have sanitize_crtc/pll/encoder functions, probably best if we add
> > >>>a sanitize_cdclk or similar for this at the very end of the hw state
> > >>>sanitize sequence.
> > >>
> > >>Can't be done if we already need a somewhat sane cdclk for the
> > >>eDP AUX probing and whatnot.
> > >>
> > >>For actually enabling the cdclk for pushing pixels, we wouldn't need
> > >>to do anything except actually plug in a calc_cdclk for SKL. No idea
> > >>why we're not doing that currently. Some extra care may be needed
> > >>due to the eDP DPLL0 usage IIRC.
> > >
> > >Hm right, cdclk is in the top-level power domain. Added fun is that with
> > >dmc the firmware is supposed to handle it. Messy :(
> > 
> > Yes, exactly. How about just adding verify_cdclk and calling in
> > get_display_clock_speed to check if cdclk is programmed correctly along with
> > related DPLL0 VCO settings for now. If all looks good, then skip else
> > initialize. Now in that case if we have to initialize where do we get the
> > cdclock to initialize with at this point ? Any default in VBT ? Or go with
> > minimum by default and it can be bumped up later if needed.
> 
> Just initialize to the slowest p

Re: [Intel-gfx] [PATCH 1/2] drm/i915/skl: Allow universal planes to position

2015-10-06 Thread Tvrtko Ursulin


On 10/04/15 10:07, Sonika Jindal wrote:

Signed-off-by: Sonika Jindal 
Reviewed-by: Matt Roper 
---
  drivers/gpu/drm/i915/intel_display.c |7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index ceb2e61..f0bbc22 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12150,16 +12150,21 @@ intel_check_primary_plane(struct drm_plane *plane,
struct drm_rect *dest = &state->dst;
struct drm_rect *src = &state->src;
const struct drm_rect *clip = &state->clip;
+   bool can_position = false;
int ret;

crtc = crtc ? crtc : plane->crtc;
intel_crtc = to_intel_crtc(crtc);

+   if (INTEL_INFO(dev)->gen >= 9)
+   can_position = true;
+
ret = drm_plane_helper_check_update(plane, crtc, fb,
src, dest, clip,
DRM_PLANE_HELPER_NO_SCALING,
DRM_PLANE_HELPER_NO_SCALING,
-   false, true, &state->visible);
+   can_position, true,
+   &state->visible);
if (ret)
return ret;




I have discovered today that, while this allows SetCrtc and SetPlane 
ioctls to work with frame buffers which do not cover the plane, page 
flips are not that lucky and fail roughly with:


[drm:drm_crtc_check_viewport] Invalid fb size 1080x1080 for CRTC 
viewport 1920x1080+0+0.
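
For reference, the kind of check behind that message can be sketched roughly as below. This is only a simplified stand-in and not the actual drm_crtc_check_viewport(): the struct and function names are made up for illustration, and rotation handling is left out entirely.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the fb and the CRTC viewport. */
struct fb_size {
	unsigned int width, height;
};

struct viewport {
	unsigned int x, y, hdisplay, vdisplay;
};

/*
 * Returns true when the viewport lies fully inside the framebuffer.
 * The subtraction form avoids overflow in x + hdisplay.
 */
static bool viewport_fits(const struct fb_size *fb, const struct viewport *vp)
{
	if (vp->hdisplay > fb->width || vp->x > fb->width - vp->hdisplay)
		return false;
	if (vp->vdisplay > fb->height || vp->y > fb->height - vp->vdisplay)
		return false;
	return true;
}
```

With the numbers from the log, a 1080x1080 fb against a 1920x1080+0+0 viewport, the first test already fails because 1920 > 1080.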


I have posted a quick IGT exerciser for this as "kms_rotation_crc: 
Excercise page flips with 90 degree rotation". May not be that great but 
shows the failure.


I am not that hot on meddling with this code, nor do I feel competent to 
even try on my own at least. :/ Maybe just because the atomic and plane 
related rewrites have been going on for so long, and have multiple 
people involved, it all sounds pretty scary and fragile.


But I think some sort of plan on how to fix this could be in order?

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Fix kerneldoc for i915_gem_shrink_all

2015-10-06 Thread Jani Nikula
On Tue, 06 Oct 2015, Daniel Vetter  wrote:
> I've botched this, so let's fix it.

Ahah, where's the reference to regressing commit, huh?! HUH?! ;)

Botched in

commit eb0b44adc08c0be01a027eb009e9cdadc31e65a2
Author: Daniel Vetter 
Date:   Wed Mar 18 14:47:59 2015 +0100

drm/i915: kerneldoc for i915_gem_shrinker.c


Reviewed-by: Jani Nikula 

>
> Signed-off-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/i915/i915_gem_shrinker.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c 
> b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> index f6ecbda2c604..674341708033 100644
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -143,7 +143,7 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
>  }
>  
>  /**
> - * i915_gem_shrink - Shrink buffer object caches completely
> + * i915_gem_shrink_all - Shrink buffer object caches completely
>   * @dev_priv: i915 device
>   *
>   * This is a simple wraper around i915_gem_shrink() to aggressively shrink 
> all
> -- 
> 2.5.1
>

-- 
Jani Nikula, Intel Open Source Technology Center


Re: [Intel-gfx] [PATCH 2/3] drm/i915/bxt: add revision id for A1 stepping and use it

2015-10-06 Thread Jani Nikula
On Tue, 06 Oct 2015, Ville Syrjälä  wrote:
> On Tue, Oct 06, 2015 at 02:41:15PM +0300, Jani Nikula wrote:
>> Prefer inclusive ranges for revision checks rather than "below B0". Per
>> specs A2 is not used, so revid <= A1 matches revid < B0.
>
> The w/a db would say UNTIL_B0 etc., so might be easier to check against
> it if we keep to the same convention.

So I wanted to double check what the convention is. I picked
WaRsDisableCoarsePowerGating.

KBL - SIWA_FOREVER
BXT - SI_WA_BEFORE(BXT_REV_ID_B0)
SKL - SIWA_UNTIL_SKL_E0

Description "Disable coarse power gating for GT4 until GT F0 stepping."

*rolls eyes*

So is that "until" there inclusive or non-inclusive? The db is
contradicting itself... Cc: Sarah who has also looked at workarounds
recently.

Rodrigo, for one thing, I'll want workarounds for SKL and KBL in
different conditions instead of conflated into SKL!

But what about this non-inclusive end of range? It'll matter in patch
3/3. It's not so much a problem for ranges, but rather for specific
revisions, where you'd have to include a revision not mentioned in the
spec at all, e.g. for B0 only:

IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_C0)

instead of the current proposal:

IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_B0)

I'm not really fond of adding separate macros for checking specific
vs. ranges.
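
To make the two conventions concrete, here is a rough sketch. The helper names are made up, they take a bare revid instead of dev, and the SKL values for B0/C0 are assumed to follow the sequential numbering in i915_drv.h.

```c
#include <stdbool.h>

/* Assumed SKL revision ids (sequential numbering). */
#define SKL_REVID_B0	0x1
#define SKL_REVID_C0	0x2

/* Hypothetical helper: inclusive on both ends, [since, until]. */
static bool revid_inclusive(unsigned int revid, unsigned int since,
			    unsigned int until)
{
	return revid >= since && revid <= until;
}

/* Hypothetical helper: exclusive upper end, [since, before). */
static bool revid_exclusive(unsigned int revid, unsigned int since,
			    unsigned int before)
{
	return revid >= since && revid < before;
}
```

For "B0 only", revid_inclusive(revid, SKL_REVID_B0, SKL_REVID_B0) and revid_exclusive(revid, SKL_REVID_B0, SKL_REVID_C0) accept the same revisions, but the second one has to name C0 even though the workaround text never mentions it.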

Thoughts?

BR,
Jani.





>
>> 
>> Signed-off-by: Jani Nikula 
>> ---
>>  drivers/gpu/drm/i915/i915_drv.h| 1 +
>>  drivers/gpu/drm/i915/i915_gem.c| 2 +-
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
>>  drivers/gpu/drm/i915/intel_ddi.c   | 2 +-
>>  drivers/gpu/drm/i915/intel_dp.c| 2 +-
>>  drivers/gpu/drm/i915/intel_hdmi.c  | 2 +-
>>  drivers/gpu/drm/i915/intel_lrc.c   | 8 
>>  drivers/gpu/drm/i915/intel_pm.c| 6 +++---
>>  drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +++---
>>  9 files changed, 16 insertions(+), 15 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index a3b137715604..9833a2055930 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2509,6 +2509,7 @@ struct drm_i915_cmd_table {
>>  #define SKL_REVID_F0 0x5
>>  
>>  #define BXT_REVID_A0 0x0
>> +#define BXT_REVID_A1 0x1
>>  #define BXT_REVID_B0 0x3
>>  #define BXT_REVID_C0 0x9
>>  
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index f0cfbb9ee12c..fd2d880656b2 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, 
>> void *data,
>>   * cacheline, whereas normally such cachelines would get
>>   * invalidated.
>>   */
>> -if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0)
>> +if (IS_BROXTON(dev) && INTEL_REVID(dev) <= BXT_REVID_A1)
>>  return -ENODEV;
>>  
>>  level = I915_CACHE_LLC;
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index 036b42bae827..863aa5c82466 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -161,7 +161,7 @@ static int host2guc_sample_forcewake(struct intel_guc 
>> *guc,
>>  data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
>>  /* WaRsDisableCoarsePowerGating:skl,bxt */
>>  if (!intel_enable_rc6(dev_priv->dev) ||
>> -(IS_BROXTON(dev) && (INTEL_REVID(dev) < BXT_REVID_B0)) ||
>> +(IS_BROXTON(dev) && (INTEL_REVID(dev) <= BXT_REVID_A1)) ||
>>  (IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)) ||
>>  (IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)))
>>  data[1] = 0;
>> diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
>> b/drivers/gpu/drm/i915/intel_ddi.c
>> index b25e99a432fb..b80e0f5ec5dc 100644
>> --- a/drivers/gpu/drm/i915/intel_ddi.c
>> +++ b/drivers/gpu/drm/i915/intel_ddi.c
>> @@ -3247,7 +3247,7 @@ void intel_ddi_init(struct drm_device *dev, enum port 
>> port)
>>   * On BXT A0/A1, sw needs to activate DDIA HPD logic and
>>   * interrupts to check the external panel connection.
>>   */
>> -if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) < BXT_REVID_B0)
>> +if (IS_BROXTON(dev_priv) && (INTEL_REVID(dev) <= BXT_REVID_A1)
>>   && port == PORT_B)
>>  dev_priv->hotplug.irq_port[PORT_A] = intel_dig_port;
>>  else
>> diff --git a/drivers/gpu/drm/i915/intel_dp.c 
>> b/drivers/gpu/drm/i915/intel_dp.c
>> index 8d34ca7b287a..8baf6fe06313 100644
>> --- a/drivers/gpu/drm/i915/intel_dp.c
>> +++ b/drivers/gpu/drm/i915/intel_dp.c
>> @@ -6087,7 +6087,7 @@ intel_dp_init_connector(struct intel_digital_port 
>

Re: [Intel-gfx] [PATCH 2/3] drm/i915: Allow the user to pass a context to any ring

2015-10-06 Thread Daniel, Thomas
> -Original Message-
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf Of
> Chris Wilson
> Sent: Tuesday, October 6, 2015 11:53 AM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 2/3] drm/i915: Allow the user to pass a context to
> any ring
> 
> With full-ppgtt, we want the user to have full control over their memory
> layout, with a separate instance per context. Forcing them to use a
> shared memory layout for !RCS not only duplicates the amount of work we
> have to do, but also defeats the memory segregation on offer.
> 
> Signed-off-by: Chris Wilson 

Reviewed-by: Thomas Daniel 



Re: [Intel-gfx] [PATCH 3/3] drm/i915: Add soft-pinning API for execbuffer

2015-10-06 Thread Daniel, Thomas
> -Original Message-
> From: Chris Wilson [mailto:ch...@chris-wilson.co.uk]
> Sent: Tuesday, October 6, 2015 11:53 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson; Daniel, Thomas
> Subject: [PATCH 3/3] drm/i915: Add soft-pinning API for execbuffer
> 
> Userspace can pass in an offset that it presumes the object is located
> at. The kernel will then do its utmost to fit the object into that
> location. The assumption is that userspace is handling its own object
> locations (for example along with full-ppgtt) and that the kernel will
> rarely have to make space for the user's requests.
> 
> v2: Fix i915_gem_evict_range() (now evict_for_vma) to handle ordinary
> and fixed objects within the same batch
> 
> Signed-off-by: Chris Wilson 
> Cc: "Daniel, Thomas" 

This didn't apply cleanly to my tree pulled today (after patches 1 and 2 of 
this series).
Are you going to post a rebase?

Cheers,
Thomas.



Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Tvrtko Ursulin


On 06/10/15 11:39, Chris Wilson wrote:

Passing cliprects into the kernel for it to re-execute the batch buffer
with different CMD_DRAWRECT died out long ago. As DRI1 support has been
removed from the kernel, we can now simply reject any execbuf trying to
use this "feature".

To keep Daniel happy with the prospect of being able to reuse these
fields in the next decade, continue to ensure that current userspace is
not passing garbage in through the dead fields.
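
As a rough illustration of "not passing garbage in through the dead fields", such a check could look like the sketch below. The struct is a cut-down stand-in carrying only the two cliprect fields, and the helper name is made up; the real check in the patch may differ.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the two dead execbuffer2 fields. */
struct execbuf_args {
	uint32_t num_cliprects;
	uint64_t cliprects_ptr;
};

/*
 * Reject any non-zero value in the dead fields, so that they stay
 * known-zero and can be given a new meaning later on.
 */
static int check_dead_cliprects(const struct execbuf_args *args)
{
	if (args->num_cliprects != 0 || args->cliprects_ptr != 0)
		return -EINVAL;
	return 0;
}
```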

v2: Fix the cliprects_ptr check

Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 


Don't know anything about the DRI1 history but the removal looks fine to 
me, so:


Reviewed-by: Tvrtko Ursulin 

Would it also make sense to rename the related fields in 
drm_i915_gem_execbuffer2 to reserved?


Regards,

Tvrtko


[Intel-gfx] [PATCH 04/12] drm/i915: Move workaround macros to i915_drv.h

2015-10-06 Thread Mika Kuoppala
The plan is to allow workaround list usage outside of
intel_ringbuffer.c, mainly in intel_pm.c, where we set up an assortment
of workaround registers as part of intel_init_clock_gating().

Move the macros to i915_drv.h and export intel_wa_add().
Remove the WA_WRITE macro, as it has no users as of now.
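
For readers unfamiliar with the list, a minimal standalone sketch of what such an "add" helper does is shown below. The names wa_list, wa_list_add and WA_MAX_REGS are made up, and both the capacity and the -ENOSPC return on overflow are assumptions; the body of intel_wa_add() is only partly visible in this patch.

```c
#include <errno.h>
#include <stdint.h>

#define WA_MAX_REGS 16	/* hypothetical capacity */

struct wa_reg {
	uint32_t addr, mask, value;
};

struct wa_list {
	struct wa_reg reg[WA_MAX_REGS];
	uint32_t count;
};

/* Append one (addr, mask, value) entry; fails when the list is full. */
static int wa_list_add(struct wa_list *w, uint32_t addr, uint32_t mask,
		       uint32_t val)
{
	const uint32_t idx = w->count;

	if (idx >= WA_MAX_REGS)
		return -ENOSPC;

	w->reg[idx].addr = addr;
	w->reg[idx].mask = mask;
	w->reg[idx].value = val;
	w->count++;

	return 0;
}
```

Keeping the mask next to the value is what lets the list be replayed or checked later without touching unrelated bits in the same register.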

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h | 22 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 24 ++--
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1883847..5a04948 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3518,4 +3518,26 @@ static inline void i915_trace_irq_get(struct 
intel_engine_cs *ring,
i915_gem_request_assign(&ring->trace_irq_req, req);
 }
 
+/* Workaround register lists */
+#define WA_REG(addr, mask, val) do { \
+   const int r = intel_wa_add(&dev_priv->lri_workarounds, \
+  (addr), (mask), (val)); \
+   WARN_ON(r); \
+   } while (0)
+
+#define WA_SET_BIT_MASKED(addr, mask) \
+   WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+
+#define WA_CLR_BIT_MASKED(addr, mask) \
+   WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+
+#define WA_SET_FIELD_MASKED(addr, mask, value) \
+   WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+
+#define WA_SET_BIT(addr, mask) WA_REG(addr, mask, I915_READ((addr)) | (mask))
+#define WA_CLR_BIT(addr, mask) WA_REG(addr, mask, I915_READ((addr)) & ~(mask))
+
+int intel_wa_add(struct i915_workarounds *w,
+const u32 addr, const u32 mask, const u32 val);
+
 #endif
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index bc8a8e2..29ae97e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -763,8 +763,8 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request 
*req)
return ret;
 }
 
-static int wa_add(struct i915_workarounds *w,
- const u32 addr, const u32 mask, const u32 val)
+int intel_wa_add(struct i915_workarounds *w,
+const u32 addr, const u32 mask, const u32 val)
 {
const u32 idx = w->count;
 
@@ -780,26 +780,6 @@ static int wa_add(struct i915_workarounds *w,
return 0;
 }
 
-#define WA_REG(addr, mask, val) do { \
-   const int r = wa_add(&dev_priv->lri_workarounds, \
-(addr), (mask), (val)); \
-   WARN_ON(r); \
-   } while (0)
-
-#define WA_SET_BIT_MASKED(addr, mask) \
-   WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
-
-#define WA_CLR_BIT_MASKED(addr, mask) \
-   WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
-
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-   WA_REG(addr, mask, _MASKED_FIELD(mask, value))
-
-#define WA_SET_BIT(addr, mask) WA_REG(addr, mask, I915_READ(addr) | (mask))
-#define WA_CLR_BIT(addr, mask) WA_REG(addr, mask, I915_READ(addr) & ~(mask))
-
-#define WA_WRITE(addr, val) WA_REG(addr, 0xffffffff, val)
-
 static int gen8_init_workarounds(struct intel_engine_cs *ring)
 {
struct drm_device *dev = ring->dev;
-- 
2.1.4



[Intel-gfx] [PATCH 10/12] drm/i915/hsw: Use mmio workarounds in init clock gating

2015-10-06 Thread Mika Kuoppala
For workarounds written in haswell's init clock gating path,
use the mmio workaround list.

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_pm.c | 35 +++
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7e01ef7..b2626c2 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6608,29 +6608,26 @@ static void haswell_init_clock_gating(struct drm_device 
*dev)
ilk_init_lp_watermarks(dev);
 
/* L3 caching of data atomics doesn't work -- disable it. */
-   I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
-   I915_WRITE(HSW_ROW_CHICKEN3,
-  
_MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE));
+   WA_WRITE(MMIO, HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
+   WA_SET_BIT_MASKED(MMIO, HSW_ROW_CHICKEN3,
+ HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE);
 
/* This is required by WaCatErrorRejectionIssue:hsw */
-   I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
-   I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) |
-   GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB);
+   WA_SET_BIT(MMIO, GEN7_SQ_CHICKEN_MBCUNIT_CONFIG,
+  GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB);
 
/* WaVSRefCountFullforceMissDisable:hsw */
-   I915_WRITE(GEN7_FF_THREAD_MODE,
-  I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME);
+   WA_CLR_BIT(MMIO, GEN7_FF_THREAD_MODE, GEN7_FF_VS_REF_CNT_FFME);
 
/* WaDisable_RenderCache_OperationalFlush:hsw */
-   I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+   WA_CLR_BIT_MASKED(MMIO, CACHE_MODE_0_GEN7, RC_OP_FLUSH_ENABLE);
 
/* enable HiZ Raw Stall Optimization */
-   I915_WRITE(CACHE_MODE_0_GEN7,
-  _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE));
+   WA_CLR_BIT_MASKED(MMIO, CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
 
/* WaDisable4x2SubspanOptimization:hsw */
-   I915_WRITE(CACHE_MODE_1,
-  _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE));
+   WA_SET_BIT_MASKED(MMIO, CACHE_MODE_1,
+ PIXEL_SUBSPAN_COLLECT_OPT_DISABLE);
 
/*
 * BSpec recommends 8x4 when MSAA is used,
@@ -6640,19 +6637,17 @@ static void haswell_init_clock_gating(struct drm_device 
*dev)
 * disable bit, which we don't touch here, but it's good
 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 */
-   I915_WRITE(GEN7_GT_MODE,
-  _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4));
+   WA_SET_FIELD_MASKED(MMIO, GEN7_GT_MODE,
+   GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4);
 
/* WaSampleCChickenBitEnable:hsw */
-   I915_WRITE(HALF_SLICE_CHICKEN3,
-  _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE));
+   WA_SET_BIT_MASKED(MMIO, HALF_SLICE_CHICKEN3, HSW_SAMPLE_C_PERFORMANCE);
 
/* WaSwitchSolVfFArbitrationPriority:hsw */
-   I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+   WA_SET_BIT(MMIO, GAM_ECOCHK, HSW_ECOCHK_ARB_PRIO_SOL);
 
/* WaRsPkgCStateDisplayPMReq:hsw */
-   I915_WRITE(CHICKEN_PAR1_1,
-  I915_READ(CHICKEN_PAR1_1) | FORCE_ARB_IDLE_PLANES);
+   WA_SET_BIT(MMIO, CHICKEN_PAR1_1, FORCE_ARB_IDLE_PLANES);
 
lpt_init_clock_gating(dev);
 }
-- 
2.1.4



[Intel-gfx] [PATCH 08/12] drm/i915: Use mmio workaround list for skl/bxt

2015-10-06 Thread Mika Kuoppala
Some registers are, naturally, lost in a gpu reset/suspend cycle.
Other registers, for example those in the display domain, are not subject
to gpu reset, so they retain their contents.

As hang recovery triggers a reset, a recoverable gpu hang can currently
flush out essential workarounds and cause havoc later on.

When register GEN8_GARBCNTL is missing the WaEnableGapsTsvCreditFix:skl,
it can cause random system hangs [1]. This workaround was added in
commit 245d96670d26 ("drm/i915:skl: Add WaEnableGapsTsvCreditFix").
But another set of system hangs was observed, and the failure pattern
indicated that a random gpu hang preceded the system hang [2].
This led to the realization that we lose this workaround and BDW_SCRATCH1
on reset.

Add the workarounds in the skl/bxt init clock gating path to the mmio
workaround list. This exposes these registers to the same testing
mechanism in use with the lri workarounds, gem_workarounds.
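
A rough sketch of such a check, using the same masked comparison as i915_wa_registers in debugfs, is given below. The function name and the idea of passing the read-back values in as an array are made up for illustration; real code would read each register instead.

```c
#include <stddef.h>
#include <stdint.h>

struct wa_reg {
	uint32_t addr, mask, value;
};

/*
 * Count the entries whose masked bits no longer match what was read
 * back, e.g. after a gpu reset. readback[i] is the value read from
 * regs[i].addr.
 */
static size_t wa_count_lost(const struct wa_reg *regs,
			    const uint32_t *readback, size_t count)
{
	size_t lost = 0;

	for (size_t i = 0; i < count; i++) {
		if ((regs[i].value & regs[i].mask) !=
		    (readback[i] & regs[i].mask))
			lost++;
	}

	return lost;
}
```

Only the bits under the mask are compared, so unrelated bits changing in the same register do not count as a lost workaround.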

References: [1] https://bugs.freedesktop.org/show_bug.cgi?id=90854
References: [2] https://bugs.freedesktop.org/show_bug.cgi?id=92315
Testcase: igt/gem_workarounds
Reported-by: Tomi Sarvela 
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_pm.c | 31 +--
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index e97f271..61136e1 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -73,12 +73,11 @@ static void gen9_init_clock_gating(struct drm_device *dev)
struct drm_i915_private *dev_priv = dev->dev_private;
 
/* WaEnableLbsSlaRetryTimerDecrement:skl */
-   I915_WRITE(BDW_SCRATCH1, I915_READ(BDW_SCRATCH1) |
+   WA_SET_BIT(MMIO, BDW_SCRATCH1,
   GEN9_LBS_SLA_RETRY_TIMER_DECREMENT_ENABLE);
 
/* WaDisableKillLogic:bxt,skl */
-   I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-  ECOCHK_DIS_TLB);
+   WA_SET_BIT(MMIO, GAM_ECOCHK, ECOCHK_DIS_TLB);
 }
 
 static void skl_init_clock_gating(struct drm_device *dev)
@@ -89,12 +88,11 @@ static void skl_init_clock_gating(struct drm_device *dev)
 
if (INTEL_REVID(dev) <= SKL_REVID_D0) {
/* WaDisableHDCInvalidation:skl */
-   I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) |
-  BDW_DISABLE_HDC_INVALIDATION);
+   WA_SET_BIT(MMIO, GAM_ECOCHK, BDW_DISABLE_HDC_INVALIDATION);
 
/* WaDisableChickenBitTSGBarrierAckForFFSliceCS:skl */
-   I915_WRITE(FF_SLICE_CS_CHICKEN2,
-  _MASKED_BIT_ENABLE(GEN9_TSG_BARRIER_ACK_DISABLE));
+   WA_SET_BIT_MASKED(MMIO, FF_SLICE_CS_CHICKEN2,
+ GEN9_TSG_BARRIER_ACK_DISABLE);
}
 
/* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
@@ -102,13 +100,12 @@ static void skl_init_clock_gating(struct drm_device *dev)
 */
if (INTEL_REVID(dev) <= SKL_REVID_E0)
/* WaDisableLSQCROPERFforOCL:skl */
-   I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
-  GEN8_LQSC_RO_PERF_DIS);
+   WA_SET_BIT(MMIO, GEN8_L3SQCREG4, GEN8_LQSC_RO_PERF_DIS);
 
/* WaEnableGapsTsvCreditFix:skl */
if (IS_SKYLAKE(dev) && (INTEL_REVID(dev) >= SKL_REVID_C0)) {
-   I915_WRITE(GEN8_GARBCNTL, (I915_READ(GEN8_GARBCNTL) |
-  GEN9_GAPS_TSV_CREDIT_DISABLE));
+   WA_SET_BIT(MMIO, GEN8_GARBCNTL,
+  GEN9_GAPS_TSV_CREDIT_DISABLE);
}
 }
 
@@ -119,25 +116,23 @@ static void bxt_init_clock_gating(struct drm_device *dev)
gen9_init_clock_gating(dev);
 
/* WaDisableSDEUnitClockGating:bxt */
-   I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-  GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
+   WA_SET_BIT(MMIO, GEN8_UCGCTL6, GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
/*
 * FIXME:
 * GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ applies on 3x6 GT SKUs only.
 */
-   I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
-  GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
+   WA_SET_BIT(MMIO, GEN8_UCGCTL6, GEN8_HDCUNIT_CLOCK_GATE_DISABLE_HDCREQ);
 
/* WaStoreMultiplePTEenable:bxt */
/* This is a requirement according to Hardware specification */
if (INTEL_REVID(dev) == BXT_REVID_A0)
-   I915_WRITE(TILECTL, I915_READ(TILECTL) | TILECTL_TLBPF);
+   WA_SET_BIT(MMIO, TILECTL, TILECTL_TLBPF);
 
/* WaSetClckGatingDisableMedia:bxt */
if (INTEL_REVID(dev) == BXT_REVID_A0) {
-   I915_WRITE(GEN7_MISCCPCTL, (I915_READ(GEN7_MISCCPCTL) &
-   ~GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE));
+   WA_CLR_BIT(MMIO, GEN7_MISCCPCTL,
+  GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE);
}
 }
 
-- 
2.1.4


[Intel-gfx] [PATCH 12/12] drm/i915/ivb: Simplify row chicken setup logic

2015-10-06 Thread Mika Kuoppala
We always write ROW_CHICKEN2. Make this clearer
by writing it unconditionally.
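
The equivalence of the two shapes can be sketched in isolation as below. The flag names and functions are made up, and register writes are modelled as bits in a return value rather than real mmio.

```c
#include <stdbool.h>

/* Hypothetical flags recording which register got written. */
#define WROTE_ROW_CHICKEN2	(1u << 0)
#define WROTE_ROW_CHICKEN2_GT2	(1u << 1)

/* Old shape: ROW_CHICKEN2 is written in both branches. */
static unsigned int row_chicken_old(bool is_gt1)
{
	unsigned int written = 0;

	if (is_gt1) {
		written |= WROTE_ROW_CHICKEN2;
	} else {
		/* must write both registers */
		written |= WROTE_ROW_CHICKEN2;
		written |= WROTE_ROW_CHICKEN2_GT2;
	}

	return written;
}

/* New shape: the common write is hoisted out of the conditional. */
static unsigned int row_chicken_new(bool is_gt1)
{
	unsigned int written = WROTE_ROW_CHICKEN2;

	if (!is_gt1)
		written |= WROTE_ROW_CHICKEN2_GT2;

	return written;
}
```

Both functions return the same set of writes for GT1 and non-GT1, which is the point of the cleanup.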

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_pm.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8bc1d3b..ec77e04 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6685,16 +6685,13 @@ static void ivybridge_init_clock_gating(struct 
drm_device *dev)
WA_WRITE(MMIO, GEN7_L3CNTLREG1, GEN7_WA_FOR_GEN7_L3_CONTROL);
WA_WRITE(MMIO, GEN7_L3_CHICKEN_MODE_REGISTER, GEN7_WA_L3_CHICKEN_MODE);
 
-   if (IS_IVB_GT1(dev))
-   WA_SET_BIT_MASKED(MMIO, GEN7_ROW_CHICKEN2,
- DOP_CLOCK_GATING_DISABLE);
-   else {
-   /* must write both registers */
-   WA_SET_BIT_MASKED(MMIO, GEN7_ROW_CHICKEN2,
- DOP_CLOCK_GATING_DISABLE);
+   WA_SET_BIT_MASKED(MMIO, GEN7_ROW_CHICKEN2,
+ DOP_CLOCK_GATING_DISABLE);
+
+   /* must write both registers */
+   if (!IS_IVB_GT1(dev))
WA_SET_BIT_MASKED(MMIO, GEN7_ROW_CHICKEN2_GT2,
  DOP_CLOCK_GATING_DISABLE);
-   }
 
/* WaForceL3Serialization:ivb */
WA_CLR_BIT(MMIO, GEN7_L3SQCREG4, L3SQ_URB_READ_CAM_MATCH_DISABLE);
-- 
2.1.4



[Intel-gfx] [PATCH 01/12] drm/i915: Prepare for multiple workaround lists

2015-10-06 Thread Mika Kuoppala
In preparation for having separate workaround lists
for LRI and MMIO written workarounds, parametrize the
register addition and printing of wa lists.

Cc: Arun Siluvery 
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 39 +
 drivers/gpu/drm/i915/i915_drv.h |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c|  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 19 
 4 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3f2a7a7..af44808 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3094,33 +3094,44 @@ static int i915_shared_dplls_info(struct seq_file *m, 
void *unused)
return 0;
 }
 
-static int i915_wa_registers(struct seq_file *m, void *unused)
+static void print_wa_regs(struct seq_file *m,
+ const struct i915_workarounds *w)
 {
-   int i;
-   int ret;
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
+   int i;
 
-   ret = mutex_lock_interruptible(&dev->struct_mutex);
-   if (ret)
-   return ret;
-
-   intel_runtime_pm_get(dev_priv);
-
-   seq_printf(m, "Workarounds applied: %d\n", dev_priv->workarounds.count);
-   for (i = 0; i < dev_priv->workarounds.count; ++i) {
+   for (i = 0; i < w->count; ++i) {
u32 addr, mask, value, read;
bool ok;
 
-   addr = dev_priv->workarounds.reg[i].addr;
-   mask = dev_priv->workarounds.reg[i].mask;
-   value = dev_priv->workarounds.reg[i].value;
+   addr = w->reg[i].addr;
+   mask = w->reg[i].mask;
+   value = w->reg[i].value;
read = I915_READ(addr);
ok = (value & mask) == (read & mask);
seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
   addr, value, mask, read, ok ? "OK" : "FAIL");
}
+}
+
+static int i915_wa_registers(struct seq_file *m, void *unused)
+{
+   int ret;
+   struct drm_info_node *node = (struct drm_info_node *) m->private;
+   struct drm_device *dev = node->minor->dev;
+   struct drm_i915_private *dev_priv = dev->dev_private;
+
+   ret = mutex_lock_interruptible(&dev->struct_mutex);
+   if (ret)
+   return ret;
+
+   intel_runtime_pm_get(dev_priv);
+
+   seq_printf(m, "Workarounds applied: %d\n",
+  dev_priv->lri_workarounds.count);
+   print_wa_regs(m, &dev_priv->lri_workarounds);
 
intel_runtime_pm_put(dev_priv);
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 51eea29..aa38d1e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1844,7 +1844,7 @@ struct drm_i915_private {
struct intel_shared_dpll shared_dplls[I915_NUM_PLLS];
int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
-   struct i915_workarounds workarounds;
+   struct i915_workarounds lri_workarounds;
 
/* Reclocking support */
bool render_reclock_avail;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 825fa7a..b9c7d23 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1095,7 +1095,7 @@ static int intel_logical_ring_workarounds_emit(struct drm_i915_gem_request *req)
struct intel_ringbuffer *ringbuf = req->ringbuf;
struct drm_device *dev = ring->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
-   struct i915_workarounds *w = &dev_priv->workarounds;
+   struct i915_workarounds *w = &dev_priv->lri_workarounds;
 
if (WARN_ON_ONCE(w->count == 0))
return 0;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c82c74c..71b4fac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -715,7 +715,7 @@ static int intel_ring_workarounds_emit(struct drm_i915_gem_request *req)
struct intel_engine_cs *ring = req->ring;
struct drm_device *dev = ring->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
-   struct i915_workarounds *w = &dev_priv->workarounds;
+   struct i915_workarounds *w = &dev_priv->lri_workarounds;
 
if (WARN_ON_ONCE(w->count == 0))
return 0;
@@ -763,25 +763,26 @@ static int intel_rcs_ctx_init(struct drm_i915_gem_request *req)
return ret;
 }
 
-static int wa_add(struct drm_i915_private *dev_priv,
+static int wa_add(struct i915_workarounds *w,
  const u32 addr, const u32 mask, const u32 val)
 {
-   const u32 idx = dev_pri

[Intel-gfx] [PATCH 11/12] drm/i915/ivb: Use mmio workarounds in init clock gating

2015-10-06 Thread Mika Kuoppala
For workarounds written in ivybridge's init clock gating path,
use the mmio workaround list to ensure proper setup after
reset/resume.

This way we don't lose _3DCHICKEN and GEN7_FF_THREAD_MODE register
contents on reset/suspend.
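
For readers unfamiliar with "masked" registers such as the chicken registers touched here: the upper 16 bits of a written value act as a per-bit write enable for the lower 16 bits. The sketch below models that convention in plain C; the function names are hypothetical stand-ins and the hardware model is a simplification. It also shows why the i915_reg.h hunk in this patch can drop bit 26 from GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC once a masked helper supplies the enable half itself.

```c
#include <stdint.h>

/* Value to write in order to set `bits`: enable mask goes in the high half. */
static uint32_t masked_bit_enable(uint32_t bits)
{
	return (bits << 16) | bits;
}

/* Value to write in order to clear `bits`: mask set, value half left zero. */
static uint32_t masked_bit_disable(uint32_t bits)
{
	return bits << 16;
}

/*
 * Simplified model of what the hardware does on a masked-register write:
 * only the low bits whose mask bit is set are updated, the rest are kept.
 */
static uint32_t masked_reg_write(uint32_t current, uint32_t written)
{
	uint32_t mask = written >> 16;
	uint32_t value = written & 0xffff;

	return (current & ~mask & 0xffff) | (value & mask);
}
```

With this convention no read-modify-write is needed, which is one reason such registers are convenient to replay from a list.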

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_reg.h |  2 +-
 drivers/gpu/drm/i915/intel_pm.c | 83 +++--
 2 files changed, 40 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index a7c9e8c..573e7d9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -5911,7 +5911,7 @@ enum skl_disp_power_wells {
 
 /* GEN7 chicken */
 #define GEN7_COMMON_SLICE_CHICKEN1 0x7010
-# define GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC ((1<<10) | (1<<26))
+# define GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC (1<<10)
 # define GEN9_RHWO_OPTIMIZATION_DISABLE(1<<14)
 #define COMMON_SLICE_CHICKEN2  0x7014
 # define GEN8_CSC2_SBE_VUE_CACHE_CONSERVATIVE  (1<<0)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b2626c2..8bc1d3b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -6370,11 +6370,11 @@ static void cpt_init_clock_gating(struct drm_device *dev)
 * gating for the panel power sequencer or it will fail to
 * start up when no ports are active.
 */
-   I915_WRITE(SOUTH_DSPCLK_GATE_D, PCH_DPLSUNIT_CLOCK_GATE_DISABLE |
-  PCH_DPLUNIT_CLOCK_GATE_DISABLE |
-  PCH_CPUNIT_CLOCK_GATE_DISABLE);
-   I915_WRITE(SOUTH_CHICKEN2, I915_READ(SOUTH_CHICKEN2) |
-  DPLS_EDP_PPS_FIX_DIS);
+   WA_WRITE(MMIO, SOUTH_DSPCLK_GATE_D,
+PCH_DPLSUNIT_CLOCK_GATE_DISABLE |
+PCH_DPLUNIT_CLOCK_GATE_DISABLE |
+PCH_CPUNIT_CLOCK_GATE_DISABLE);
+   WA_SET_BIT(MMIO, SOUTH_CHICKEN2, DPLS_EDP_PPS_FIX_DIS);
/* The below fixes the weird display corruption, a few pixels shifted
 * downward, on (only) LVDS of some HP laptops with IVY.
 */
@@ -6387,12 +6387,12 @@ static void cpt_init_clock_gating(struct drm_device *dev)
val &= ~TRANS_CHICKEN2_FRAME_START_DELAY_MASK;
val &= ~TRANS_CHICKEN2_DISABLE_DEEP_COLOR_COUNTER;
val &= ~TRANS_CHICKEN2_DISABLE_DEEP_COLOR_MODESWITCH;
-   I915_WRITE(TRANS_CHICKEN2(pipe), val);
+   WA_WRITE(MMIO, TRANS_CHICKEN2(pipe), val);
}
/* WADP0ClockGatingDisable */
for_each_pipe(dev_priv, pipe) {
-   I915_WRITE(TRANS_CHICKEN1(pipe),
-  TRANS_CHICKEN1_DP0UNIT_GC_DISABLE);
+   WA_WRITE(MMIO, TRANS_CHICKEN1(pipe),
+TRANS_CHICKEN1_DP0UNIT_GC_DISABLE);
}
 }
 
@@ -6519,7 +6519,7 @@ static void gen7_setup_fixed_func_scheduler(struct drm_i915_private *dev_priv)
reg |= GEN7_FF_VS_SCHED_HW;
reg |= GEN7_FF_DS_SCHED_HW;
 
-   I915_WRITE(GEN7_FF_THREAD_MODE, reg);
+   WA_WRITE(MMIO, GEN7_FF_THREAD_MODE, reg);
 }
 
 static void lpt_init_clock_gating(struct drm_device *dev)
@@ -6659,60 +6659,55 @@ static void ivybridge_init_clock_gating(struct drm_device *dev)
 
ilk_init_lp_watermarks(dev);
 
-   I915_WRITE(ILK_DSPCLK_GATE_D, ILK_VRHUNIT_CLOCK_GATE_DISABLE);
+   WA_WRITE(MMIO, ILK_DSPCLK_GATE_D, ILK_VRHUNIT_CLOCK_GATE_DISABLE);
 
/* WaDisableEarlyCull:ivb */
-   I915_WRITE(_3D_CHICKEN3,
-  _MASKED_BIT_ENABLE(_3D_CHICKEN_SF_DISABLE_OBJEND_CULL));
+   WA_SET_BIT_MASKED(MMIO, _3D_CHICKEN3,
+ _3D_CHICKEN_SF_DISABLE_OBJEND_CULL);
 
/* WaDisableBackToBackFlipFix:ivb */
-   I915_WRITE(IVB_CHICKEN3,
-  CHICKEN3_DGMG_REQ_OUT_FIX_DISABLE |
-  CHICKEN3_DGMG_DONE_FIX_DISABLE);
+   WA_WRITE(MMIO, IVB_CHICKEN3, CHICKEN3_DGMG_REQ_OUT_FIX_DISABLE |
+CHICKEN3_DGMG_DONE_FIX_DISABLE);
 
/* WaDisablePSDDualDispatchEnable:ivb */
if (IS_IVB_GT1(dev))
-   I915_WRITE(GEN7_HALF_SLICE_CHICKEN1,
-  _MASKED_BIT_ENABLE(GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE));
+   WA_SET_BIT_MASKED(MMIO, GEN7_HALF_SLICE_CHICKEN1,
+ GEN7_PSD_SINGLE_PORT_DISPATCH_ENABLE);
 
/* WaDisable_RenderCache_OperationalFlush:ivb */
-   I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE));
+   WA_CLR_BIT_MASKED(MMIO, CACHE_MODE_0_GEN7, RC_OP_FLUSH_ENABLE);
 
/* Apply the WaDisableRHWOOptimizationForRenderHang:ivb workaround. */
-   I915_WRITE(GEN7_COMMON_SLICE_CHICKEN1,
-  GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
+   WA_SET_BIT_MASKED(MMIO, GEN7_COMMON_SLICE_CHICKEN1,
+ GEN7_CSC1_RHWO_OPT_DISABLE_IN_RCC);
 
/* WaApplyL3ControlAndL3ChickenMode:ivb */
-   I915_WRITE(GEN7_L3CNTLREG

[Intel-gfx] [PATCH 07/12] drm/i915: Write mmio workarounds after gpu reset

2015-10-06 Thread Mika Kuoppala
From: Mika Kuoppala 

Rewrite everything in the mmio workaround list right after
gpu reset. This ensures that we start the reinitialization
with proper mmio workarounds in place, before we
start the rings.

This commit just adds the mechanism; the list itself
is still empty. Following commits will add registers.
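
The mechanism can be pictured with a small userspace model: values are recorded once at init, a reset wipes the registers, and replaying the list restores them. Everything below (the fake register file, fake_gpu_reset(), the helper names) is an assumption for illustration and not the driver code.

```c
#include <stdint.h>
#include <string.h>

#define MAX_WA_REGS     32
#define FAKE_MMIO_WORDS 64  /* hypothetical register file size */

struct wa_reg {
	uint32_t addr, value, mask;
};

struct wa_list {
	struct wa_reg reg[MAX_WA_REGS];
	uint32_t count;
};

/* Stand-in for the hardware registers. */
static uint32_t fake_mmio[FAKE_MMIO_WORDS];

static void wa_list_init(struct wa_list *w)
{
	w->count = 0;
}

/* Record one register value; returns 0, or -1 when the list is full. */
static int wa_list_add(struct wa_list *w, uint32_t addr,
		       uint32_t mask, uint32_t value)
{
	if (w->count >= MAX_WA_REGS)
		return -1;

	w->reg[w->count].addr = addr;
	w->reg[w->count].mask = mask;
	w->reg[w->count].value = value;
	w->count++;
	return 0;
}

/* Simulated GPU reset: all register contents are lost. */
static void fake_gpu_reset(void)
{
	memset(fake_mmio, 0, sizeof(fake_mmio));
}

/* Write every recorded value back; returns the number of writes. */
static uint32_t wa_list_write_mmio(const struct wa_list *w)
{
	uint32_t i;

	for (i = 0; i < w->count; i++)
		fake_mmio[w->reg[i].addr % FAKE_MMIO_WORDS] = w->reg[i].value;

	return w->count;
}
```

Calling the replay helper both at init and after the simulated reset mirrors the two call sites this patch adds.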

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h |  8 
 drivers/gpu/drm/i915/i915_irq.c |  2 ++
 drivers/gpu/drm/i915/intel_pm.c | 20 
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
 drivers/gpu/drm/i915/intel_uncore.c | 12 
 5 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ae5b6b3..d41808a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2732,6 +2732,7 @@ void intel_uncore_forcewake_get__locked(struct drm_i915_private *dev_priv,
 void intel_uncore_forcewake_put__locked(struct drm_i915_private *dev_priv,
enum forcewake_domains domains);
 void assert_forcewakes_inactive(struct drm_i915_private *dev_priv);
+void assert_forcewakes_active(struct drm_i915_private *dev_priv);
 static inline bool intel_vgpu_active(struct drm_device *dev)
 {
return to_i915(dev)->vgpu.active;
@@ -3545,7 +3546,14 @@ static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
 #define WA_CLR_BIT(t, addr, mask) \
WA_REG_##t(addr, mask, I915_READ((addr)) & ~(mask))
 
+static inline void intel_wa_init(struct i915_workarounds *w)
+{
+   w->count = 0;
+}
+
 int intel_wa_add(struct i915_workarounds *w,
 const u32 addr, const u32 mask, const u32 val);
 
+void intel_wa_write_mmio(struct drm_i915_private *dev_priv,
+const struct i915_workarounds *w);
 #endif
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 8ca772d..e686f78 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2441,6 +2441,8 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 */
ret = i915_reset(dev);
 
+   intel_wa_write_mmio(dev_priv, &dev_priv->mmio_workarounds);
+
intel_finish_reset(dev);
 
intel_runtime_pm_put(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 60d120c..e97f271 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -52,6 +52,22 @@
 #define INTEL_RC6p_ENABLE  (1<<1)
 #define INTEL_RC6pp_ENABLE (1<<2)
 
+void intel_wa_write_mmio(struct drm_i915_private *dev_priv,
+const struct i915_workarounds *w)
+{
+   int i;
+
+   if (WARN_ON_ONCE(w->count == 0))
+   return;
+
+   assert_forcewakes_active(dev_priv);
+
+   for (i = 0; i < w->count; i++)
+   I915_WRITE_FW(w->reg[i].addr, w->reg[i].value);
+
+   DRM_DEBUG_DRIVER("Number of Workarounds written: %d\n", w->count);
+}
+
 static void gen9_init_clock_gating(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
@@ -6973,8 +6989,12 @@ void intel_init_clock_gating(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
 
+   intel_wa_init(&dev_priv->mmio_workarounds);
+
if (dev_priv->display.init_clock_gating)
dev_priv->display.init_clock_gating(dev);
+
+   intel_wa_write_mmio(dev_priv, &dev_priv->mmio_workarounds);
 }
 
 void intel_suspend_hw(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c9d3489e..3667dd9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1074,7 +1074,7 @@ int init_workarounds_ring(struct intel_engine_cs *ring)
 
WARN_ON(ring->id != RCS);
 
-   dev_priv->lri_workarounds.count = 0;
+   intel_wa_init(&dev_priv->lri_workarounds);
 
if (IS_BROADWELL(dev))
return bdw_init_workarounds(ring);
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index b43c6d0..3f8d1f6 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -524,6 +524,18 @@ void assert_forcewakes_inactive(struct drm_i915_private *dev_priv)
WARN_ON(domain->wake_count);
 }
 
+void assert_forcewakes_active(struct drm_i915_private *dev_priv)
+{
+   struct intel_uncore_forcewake_domain *domain;
+   enum forcewake_domain_id id;
+
+   if (!dev_priv->uncore.funcs.force_wake_get)
+   return;
+
+   for_each_fw_domain(domain, dev_priv, id)
+   WARN_ON(domain->wake_count == 0);
+}
+
 /* We give fast paths for the really cool registers */
 #define NEEDS_FORCE_WAKE(dev_priv, reg) \
 ((reg) < 0x4

[Intel-gfx] [PATCH 09/12] drm/i915/bdw: Use mmio workarounds in init clock gating

2015-10-06 Thread Mika Kuoppala
For workarounds written in broadwell's init clock gating path,
use mmio workaround list.
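
One detail of the diff below is that WaTempDisableDOPClkGating moves into the list replay: a clock gating bit is cleared around the write of one specific register and restored afterwards. A minimal sketch of that save/clear/write/restore bracket follows, using a fake register file and hypothetical register names rather than the real ones.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_MMIO_WORDS 16
#define REG_GATE_CTL    1u         /* hypothetical: holds the gating enable bit */
#define REG_GUARDED     2u         /* hypothetical: must be written with gating off */
#define GATE_ENABLE     (1u << 1)

/* Stand-in for the hardware registers. */
static uint32_t fake_mmio[FAKE_MMIO_WORDS];

/* Records whether gating was still enabled when REG_GUARDED was written. */
static bool guarded_write_saw_gating;

static void fake_write(uint32_t addr, uint32_t val)
{
	if (addr == REG_GUARDED)
		guarded_write_saw_gating =
			(fake_mmio[REG_GATE_CTL] & GATE_ENABLE) != 0;
	fake_mmio[addr] = val;
}

/* Write one register, bracketing the guarded one with gating disabled. */
static void write_with_gating_off(uint32_t addr, uint32_t val)
{
	const bool needs_bracket = (addr == REG_GUARDED);
	uint32_t saved = 0;

	if (needs_bracket) {
		saved = fake_mmio[REG_GATE_CTL];
		fake_write(REG_GATE_CTL, saved & ~GATE_ENABLE);
	}

	fake_write(addr, val);

	if (needs_bracket)
		fake_write(REG_GATE_CTL, saved);  /* restore the original value */
}
```

Keeping the bracket next to the write means the special case is applied on every replay, not only on the first init.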

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h |  3 +++
 drivers/gpu/drm/i915/intel_pm.c | 59 -
 2 files changed, 38 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d41808a..a225d55 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3546,6 +3546,9 @@ static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
 #define WA_CLR_BIT(t, addr, mask) \
WA_REG_##t(addr, mask, I915_READ((addr)) & ~(mask))
 
+#define WA_WRITE(t, addr, val) \
+   WA_REG_##t(addr, 0x, val)
+
 static inline void intel_wa_init(struct i915_workarounds *w)
 {
w->count = 0;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 61136e1..7e01ef7 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -62,9 +62,26 @@ void intel_wa_write_mmio(struct drm_i915_private *dev_priv,
 
assert_forcewakes_active(dev_priv);
 
-   for (i = 0; i < w->count; i++)
+   for (i = 0; i < w->count; i++) {
+   u32 misccpctl;
+
+   /* WaTempDisableDOPClkGating:bdw */
+   const bool need_dop_clkgate =
+   w->reg[i].addr == GEN8_L3SQCREG1 &&
+   IS_BROADWELL(dev_priv->dev);
+
+   if (need_dop_clkgate) {
+   misccpctl = I915_READ(GEN7_MISCCPCTL);
+   I915_WRITE_FW(GEN7_MISCCPCTL,
+ misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+   }
+
I915_WRITE_FW(w->reg[i].addr, w->reg[i].value);
 
+   if (need_dop_clkgate)
+   I915_WRITE_FW(GEN7_MISCCPCTL, misccpctl);
+   }
+
DRM_DEBUG_DRIVER("Number of Workarounds written: %d\n", w->count);
 }
 
@@ -6514,10 +6531,14 @@ static void lpt_init_clock_gating(struct drm_device *dev)
 * disabled when not needed anymore in order to save power.
 */
if (HAS_PCH_LPT_LP(dev))
-   I915_WRITE(SOUTH_DSPCLK_GATE_D,
-  I915_READ(SOUTH_DSPCLK_GATE_D) |
+   WA_SET_BIT(MMIO, SOUTH_DSPCLK_GATE_D,
   PCH_LP_PARTITION_LEVEL_DISABLE);
 
+   /* XXX: This register is volatile with bdw. After
+* reset you get weird values and after writing,
+* the value you get is 0x1000 so read/modify/write
+* is dangerous
+*/
/* WADPOClockGatingDisable:hsw */
I915_WRITE(TRANS_CHICKEN1(PIPE_A),
   I915_READ(TRANS_CHICKEN1(PIPE_A)) |
@@ -6540,52 +6561,42 @@ static void broadwell_init_clock_gating(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = dev->dev_private;
enum pipe pipe;
-   uint32_t misccpctl;
 
ilk_init_lp_watermarks(dev);
 
/* WaSwitchSolVfFArbitrationPriority:bdw */
-   I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL);
+   WA_SET_BIT(MMIO, GAM_ECOCHK, HSW_ECOCHK_ARB_PRIO_SOL);
 
/* WaPsrDPAMaskVBlankInSRD:bdw */
-   I915_WRITE(CHICKEN_PAR1_1,
-  I915_READ(CHICKEN_PAR1_1) | DPA_MASK_VBLANK_SRD);
+   WA_SET_BIT(MMIO, CHICKEN_PAR1_1, DPA_MASK_VBLANK_SRD);
 
/* WaPsrDPRSUnmaskVBlankInSRD:bdw */
for_each_pipe(dev_priv, pipe) {
-   I915_WRITE(CHICKEN_PIPESL_1(pipe),
-  I915_READ(CHICKEN_PIPESL_1(pipe)) |
+   WA_SET_BIT(MMIO, CHICKEN_PIPESL_1(pipe),
   BDW_DPRS_MASK_VBLANK_SRD);
}
 
/* WaVSRefCountFullforceMissDisable:bdw */
/* WaDSRefCountFullforceMissDisable:bdw */
-   I915_WRITE(GEN7_FF_THREAD_MODE,
-  I915_READ(GEN7_FF_THREAD_MODE) &
-  ~(GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME));
+   WA_CLR_BIT(MMIO, GEN7_FF_THREAD_MODE,
+  GEN8_FF_DS_REF_CNT_FFME | GEN7_FF_VS_REF_CNT_FFME);
 
-   I915_WRITE(GEN6_RC_SLEEP_PSMI_CONTROL,
-  _MASKED_BIT_ENABLE(GEN8_RC_SEMA_IDLE_MSG_DISABLE));
+   WA_SET_BIT_MASKED(MMIO, GEN6_RC_SLEEP_PSMI_CONTROL,
+ GEN8_RC_SEMA_IDLE_MSG_DISABLE);
 
/* WaDisableSDEUnitClockGating:bdw */
-   I915_WRITE(GEN8_UCGCTL6, I915_READ(GEN8_UCGCTL6) |
+   WA_SET_BIT(MMIO, GEN8_UCGCTL6,
   GEN8_SDEUNIT_CLOCK_GATE_DISABLE);
 
-   /*
-* WaProgramL3SqcReg1Default:bdw
-* WaTempDisableDOPClkGating:bdw
-*/
-   misccpctl = I915_READ(GEN7_MISCCPCTL);
-   I915_WRITE(GEN7_MISCCPCTL, misccpctl & ~GEN7_DOP_CLOCK_GATE_ENABLE);
-   I915_WRITE(GEN8_L3SQCREG1, BDW_WA_L3SQCREG1_DEFAULT);
-   I915_WRITE(GEN7_MISCCPCTL, misccpctl);
+   /* WaProgramL3SqcReg1Default:bdw */
+   WA_WRITE(MMIO, GEN8_L3SQCREG1, BDW_W

[Intel-gfx] [PATCH 03/12] drm/i915: Don't return inside WA_REG macro

2015-10-06 Thread Mika Kuoppala
It is considered a very bad practice to return inside
a macro. Instead of returning, emit a warning.
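
To illustrate the hazard, here is a small standalone sketch contrasting a macro that hides a return with one that only warns. The counters and helper names are hypothetical; the point is only that the hidden return skips whatever follows the macro in the caller.

```c
#include <stdio.h>

#define MAX_REGS 2  /* tiny capacity so that the third add fails */

static int reg_count;
static int finish_ran;
static int warnings;

/* Returns 0 on success, -1 when the (tiny) list is full. */
static int reg_add(void)
{
	if (reg_count >= MAX_REGS)
		return -1;
	reg_count++;
	return 0;
}

/* Variant 1: control flow hidden inside the macro. */
#define ADD_OR_RETURN() do { \
	const int r = reg_add(); \
	if (r) \
		return r; \
} while (0)

/* Variant 2: report the failure and let the caller continue. */
#define ADD_OR_WARN() do { \
	const int r = reg_add(); \
	if (r) { \
		warnings++; \
		fprintf(stderr, "reg_add failed: %d\n", r); \
	} \
} while (0)

static int setup_with_return(void)
{
	reg_count = 0;
	finish_ran = 0;
	ADD_OR_RETURN();
	ADD_OR_RETURN();
	ADD_OR_RETURN();  /* list is full: returns here, skipping the next line */
	finish_ran = 1;
	return 0;
}

static int setup_with_warn(void)
{
	reg_count = 0;
	finish_ran = 0;
	warnings = 0;
	ADD_OR_WARN();
	ADD_OR_WARN();
	ADD_OR_WARN();    /* list is full: warns, then execution continues */
	finish_ran = 1;
	return 0;
}
```

Nothing at the call site of ADD_OR_RETURN() hints that the function may exit there, which is the readability problem the patch removes.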

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 71b4fac..bc8a8e2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -783,8 +783,7 @@ static int wa_add(struct i915_workarounds *w,
 #define WA_REG(addr, mask, val) do { \
const int r = wa_add(&dev_priv->lri_workarounds, \
 (addr), (mask), (val)); \
-   if (r) \
-   return r; \
+   WARN_ON(r); \
} while (0)
 
 #define WA_SET_BIT_MASKED(addr, mask) \
-- 
2.1.4



[Intel-gfx] [PATCH 02/12] drm/i915: Raise the amount of workarounds one list has

2015-10-06 Thread Mika Kuoppala
As we move towards making mmio register setup use the
workaround list, raise the maximum number of registers
available in a list.

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index aa38d1e..1883847 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1669,7 +1669,7 @@ struct i915_wa_reg {
u32 mask;
 };
 
-#define I915_MAX_WA_REGS 16
+#define I915_MAX_WA_REGS 32
 
 struct i915_workarounds {
struct i915_wa_reg reg[I915_MAX_WA_REGS];
-- 
2.1.4



[Intel-gfx] [PATCH 00/12] MMIO workaround list

2015-10-06 Thread Mika Kuoppala
Hi,

This series was inspired by finding out about
https://bugs.freedesktop.org/show_bug.cgi?id=92315
and that we still lose some of the workarounds after a
reset/suspend cycle. We built a list for LRI-emitted
workarounds in the past to combat this same issue and checked that
contexts retain wa registers with igt/gem_workarounds.

This series expands the same mechanism to keep track of
mmio workarounds across reset/resume and gain test coverage
with gem_workarounds.

So 8/12 is the crux of the series, curing the possible
system hang after any recoverable gpu hang. But I have also
moved a few other gens to use the mmio workaround list as an example,
and also because I noticed that with ivybridge we also lose a few
registers on reset. If this is the path to go, more gens can be
converted.

There was a temptation to move the workaround list code and
also the buildup of lists to a separate file. But as Arun
has/had a large series of WAs in flight, I decided not to.

Mika Kuoppala (12):
  drm/i915: Prepare for multiple workaround lists
  drm/i915: Raise the amount of workarounds one list has
  drm/i915: Don't return inside WA_REG macro
  drm/i915: Move workaround macros to i915_drv.h
  drm/i915: Specify the wa list in WA_* macros
  drm/i915: Introduce mmio workaround list
  drm/i915: Write mmio workarounds after gpu reset
  drm/i915: Use mmio workaround list for skl/bxt
  drm/i915/bdw: Use mmio workarounds in init clock gating
  drm/i915/hsw: Use mmio workarounds in init clock gating
  drm/i915/ivb: Use mmio workarounds in init clock gating
  drm/i915/ivb: Simplify row chicken setup logic

 drivers/gpu/drm/i915/i915_debugfs.c |  41 --
 drivers/gpu/drm/i915/i915_drv.h |  45 ++-
 drivers/gpu/drm/i915/i915_irq.c |   2 +
 drivers/gpu/drm/i915/i915_reg.h |   2 +-
 drivers/gpu/drm/i915/intel_lrc.c|   2 +-
 drivers/gpu/drm/i915/intel_pm.c | 231 +---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 103 ++
 drivers/gpu/drm/i915/intel_uncore.c |  12 ++
 8 files changed, 251 insertions(+), 187 deletions(-)

-- 
2.1.4



Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Dave Gordon

On 06/10/15 11:39, Chris Wilson wrote:

Passing cliprects into the kernel for it to re-execute the batch buffer
with different CMD_DRAWRECT died out long ago. As DRI1 support has been
removed from the kernel, we can now simply reject any execbuf trying to
use this "feature".

To keep Daniel happy with the prospect of being able to reuse these
fields in the next decade, continue to ensure that current userspace is
not passing garbage in through the dead fields.
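
A minimal sketch of the "reject anything in the dead fields" idea, assuming a trimmed-down, hypothetical argument struct rather than the real drm_i915_gem_execbuffer2. The UXA DR4 fixup from the diff is left out on purpose, since its constant is not legible in this copy.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the execbuffer arguments. */
struct exec_args {
	uint32_t batch_start_offset;
	uint32_t batch_len;
	uint32_t dr1;
	uint32_t dr4;
	uint32_t num_cliprects;
	uint64_t cliprects_ptr;
};

static bool exec_args_valid(const struct exec_args *e)
{
	/* Kernel clipping is gone: both cliprect fields must be unused. */
	if (e->num_cliprects || e->cliprects_ptr)
		return false;

	/* The drawrect fields are dead as well and must be zero. */
	if (e->dr1 || e->dr4)
		return false;

	/* OR-ing first checks both values for 8-byte alignment in one test. */
	if ((e->batch_start_offset | e->batch_len) & 0x7)
		return false;

	return true;
}
```

Rejecting nonzero values now is what keeps the fields safe to reuse later: no existing userspace can be relying on garbage being accepted.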

v2: Fix the cliprects_ptr check

Signed-off-by: Chris Wilson 
Cc: Daniel Vetter 
---
  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 154 ++---
  drivers/gpu/drm/i915/intel_lrc.c   |  15 ---
  2 files changed, 31 insertions(+), 138 deletions(-)


This will cause John Harrison & myself a certain amount of pain (because 
the changes collide with the scheduler's reorganisation of the 
execbuffer path), but I'm all in favour of getting rid of the legacy 
crud cluttering up this code, so ...


Reviewed-by: Dave Gordon 

Perhaps we could get rid of relative-constants-mode next :)


diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 75a0c8b5305b..045a7631faa0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -947,7 +947,21 @@ i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
if (exec->flags & __I915_EXEC_UNKNOWN_FLAGS)
return false;

-   return ((exec->batch_start_offset | exec->batch_len) & 0x7) == 0;
+   /* Kernel clipping was a DRI1 misfeature */
+   if (exec->num_cliprects || exec->cliprects_ptr)
+   return false;
+
+   if (exec->DR4 == 0x) {
+   DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
+   exec->DR4 = 0;
+   }
+   if (exec->DR1 || exec->DR4)
+   return false;
+
+   if ((exec->batch_start_offset | exec->batch_len) & 0x7)
+   return false;
+
+   return true;
  }

  static int
@@ -,47 +1125,6 @@ i915_reset_gen7_sol_offsets(struct drm_device *dev,
return 0;
  }

-static int
-i915_emit_box(struct drm_i915_gem_request *req,
- struct drm_clip_rect *box,
- int DR1, int DR4)
-{
-   struct intel_engine_cs *ring = req->ring;
-   int ret;
-
-   if (box->y2 <= box->y1 || box->x2 <= box->x1 ||
-   box->y2 <= 0 || box->x2 <= 0) {
-   DRM_ERROR("Bad box %d,%d..%d,%d\n",
- box->x1, box->y1, box->x2, box->y2);
-   return -EINVAL;
-   }
-
-   if (INTEL_INFO(ring->dev)->gen >= 4) {
-   ret = intel_ring_begin(req, 4);
-   if (ret)
-   return ret;
-
-   intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO_I965);
-   intel_ring_emit(ring, (box->x1 & 0x) | box->y1 << 16);
-   intel_ring_emit(ring, ((box->x2 - 1) & 0x) | (box->y2 - 1) << 16);
-   intel_ring_emit(ring, DR4);
-   } else {
-   ret = intel_ring_begin(req, 6);
-   if (ret)
-   return ret;
-
-   intel_ring_emit(ring, GFX_OP_DRAWRECT_INFO);
-   intel_ring_emit(ring, DR1);
-   intel_ring_emit(ring, (box->x1 & 0x) | box->y1 << 16);
-   intel_ring_emit(ring, ((box->x2 - 1) & 0x) | (box->y2 - 1) << 16);
-   intel_ring_emit(ring, DR4);
-   intel_ring_emit(ring, 0);
-   }
-   intel_ring_advance(ring);
-
-   return 0;
-}
-
  static struct drm_i915_gem_object*
  i915_gem_execbuffer_parse(struct intel_engine_cs *ring,
  struct drm_i915_gem_exec_object2 *shadow_exec_entry,
@@ -1210,65 +1183,21 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
   struct drm_i915_gem_execbuffer2 *args,
   struct list_head *vmas)
  {
-   struct drm_clip_rect *cliprects = NULL;
struct drm_device *dev = params->dev;
struct intel_engine_cs *ring = params->ring;
struct drm_i915_private *dev_priv = dev->dev_private;
u64 exec_start, exec_len;
int instp_mode;
u32 instp_mask;
-   int i, ret = 0;
-
-   if (args->num_cliprects != 0) {
-   if (ring != &dev_priv->ring[RCS]) {
-   DRM_DEBUG("clip rectangles are only valid with the render ring\n");
-   return -EINVAL;
-   }
-
-   if (INTEL_INFO(dev)->gen >= 5) {
-   DRM_DEBUG("clip rectangles are only valid on pre-gen5\n");
-   return -EINVAL;
-   }
-
-   if (args->num_cliprects > UINT_MAX / sizeof(*cliprects)) {
-   DRM_DEBUG("execbuf with %u cliprects\n",
- args->num_cliprects);
-   return -EINVAL;
-  

Re: [Intel-gfx] [PATCH 1/2] drm/i915/skl: Allow universal planes to position

2015-10-06 Thread Matt Roper
On Tue, Oct 06, 2015 at 02:32:47PM +0100, Tvrtko Ursulin wrote:
> 
> On 10/04/15 10:07, Sonika Jindal wrote:
> >Signed-off-by: Sonika Jindal 
> >Reviewed-by: Matt Roper 
> >---
> >  drivers/gpu/drm/i915/intel_display.c |7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> >index ceb2e61..f0bbc22 100644
> >--- a/drivers/gpu/drm/i915/intel_display.c
> >+++ b/drivers/gpu/drm/i915/intel_display.c
> >@@ -12150,16 +12150,21 @@ intel_check_primary_plane(struct drm_plane *plane,
> > struct drm_rect *dest = &state->dst;
> > struct drm_rect *src = &state->src;
> > const struct drm_rect *clip = &state->clip;
> >+bool can_position = false;
> > int ret;
> >
> > crtc = crtc ? crtc : plane->crtc;
> > intel_crtc = to_intel_crtc(crtc);
> >
> >+if (INTEL_INFO(dev)->gen >= 9)
> >+can_position = true;
> >+
> > ret = drm_plane_helper_check_update(plane, crtc, fb,
> > src, dest, clip,
> > DRM_PLANE_HELPER_NO_SCALING,
> > DRM_PLANE_HELPER_NO_SCALING,
> >-false, true, &state->visible);
> >+can_position, true,
> >+&state->visible);
> > if (ret)
> > return ret;
> >
> >
> 
> I have discovered today that, while this allows SetCrtc and SetPlane
> ioctls to work with frame buffers which do not cover the plane, page
> flips are not that lucky and fail roughly with:
> 
> [drm:drm_crtc_check_viewport] Invalid fb size 1080x1080 for CRTC
> viewport 1920x1080+0+0.

Maybe I'm misunderstanding your explanation, but a framebuffer is always
required to fill/cover the plane scanning out of it.  What this patch is
supposed to be allowing is for the primary plane to not cover the entire
CRTC (since that's something that only became possible for Intel
hardware on the gen9+ platforms).  I.e., the primary plane is now
allowed to be positioned and resized to cover a subset of the CRTC area,
just like "sprite" planes have always been able to.

If you've got a 1080x1080 framebuffer, then it's legal to have a
1080x1080 primary plane while running in 1920x1080 mode on SKL/BXT.
However it is not legal to size the primary plane as 1920x1080 and use
this same 1080x1080 framebuffer with any of our interfaces (setplane,
setcrtc, pageflip, or atomic).
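
The two rules above can be written down as a tiny check. This is only a sketch under simplifying assumptions (no scaling, no rotation, hypothetical types and function name); it is not the drm_plane_helper_check_update() logic.

```c
#include <stdbool.h>

struct size {
	int w, h;
};

struct rect {
	int x, y, w, h;
};

/* Hypothetical check; `gen` is the hardware generation number. */
static bool primary_plane_ok(int gen, struct size mode, struct size fb,
			     struct rect plane)
{
	bool fullscreen = plane.x == 0 && plane.y == 0 &&
			  plane.w == mode.w && plane.h == mode.h;

	/* Rule 1: the framebuffer must always cover the plane. */
	if (fb.w < plane.w || fb.h < plane.h)
		return false;

	/* Rule 2: only gen9+ may have a primary plane smaller than the CRTC. */
	if (!fullscreen && gen < 9)
		return false;

	return true;
}
```

With a 1080x1080 framebuffer in 1920x1080 mode, a 1080x1080 plane passes on gen9 while a 1920x1080 plane fails rule 1 on any generation.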

Are you using ioctls/libdrm directly or are you using igt_kms helpers?
IIRC, the IGT helpers will try to be extra helpful and automatically
size the plane to match the framebuffer (unless you override that
behavior), so that might be what's causing the confusion here.


Matt

> 
> I have posted a quick IGT exerciser for this as "kms_rotation_crc:
> Excercise page flips with 90 degree rotation". May not be that great
> but shows the failure.
> 
> I am not that hot on meddling with this code, nor do I feel
> competent to even try on my own at least. :/ Maybe just because the
> atomic and plane related rewrites have been going on for so long,
> and have multiple people involved, it all sounds pretty scary and
> fragile.
> 
> But I think some sort of plan on how to fix this could be in order?
> 
> Regards,
> 
> Tvrtko

-- 
Matt Roper
Graphics Software Engineer
IoTG Platform Enabling & Development
Intel Corporation
(916) 356-2795


Re: [Intel-gfx] [PATCH] drm/i915: Convert hsw_compute_linetime_wm to use in-flight state

2015-10-06 Thread Jani Nikula
On Thu, 01 Oct 2015, Matt Roper  wrote:
> When watermark calculation was moved up to the atomic check phase, the
> code was updated to calculate based on in-flight atomic state rather
> than already-committed state.  However the hsw_compute_linetime_wm()
> didn't get updated and continued to pull values out of the
> currently-committed CRTC state.  On platforms that call this function
> (HSW/BDW only), this will cause problems when we go to enable the CRTC
> since we'll pull the current mode (off) rather than the mode we're
> calculating for and wind up with a divide by zero error.
>
> This was an oversight in commit:
>
> commit a28170f3389f4e42db95e595b0d86384a79de696
> Author: Matt Roper 
> Date:   Thu Sep 24 15:53:16 2015 -0700
>
> drm/i915: Calculate ILK-style watermarks during atomic check (v3)
>
> Cc: Jani Nikula 
> Signed-off-by: Matt Roper 

Matt, current nightly + "drm/i915: Sanitize watermarks after hardware
state readout" + this still oopses my BDW at boot. dmesg attached.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center

[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.3.0-rc4-bisect-00536-g400a6b9 (jani@nukke) (gcc version 4.9.2 (Debian 4.9.2-10) ) #86 SMP Wed Oct 7 01:27:22 EEST 2015
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-rc4-bisect-00536-g400a6b9 root=UUID=181461f1-6b98-43c3-8984-05f81ca9f77e ro log_buf_len=4M drm.debug=14 quiet
[0.00] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100
[0.00] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 0x340 bytes, using 'standard' format.
[0.00] x86/fpu: Using 'eager' FPU context switches.
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x00057fff] usable
[0.00] BIOS-e820: [mem 0x00058000-0x00058fff] reserved
[0.00] BIOS-e820: [mem 0x00059000-0x0009efff] usable
[0.00] BIOS-e820: [mem 0x0009f000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x0010-0xd132efff] usable
[0.00] BIOS-e820: [mem 0xd132f000-0xd1809fff] reserved
[0.00] BIOS-e820: [mem 0xd180a000-0xd9324fff] usable
[0.00] BIOS-e820: [mem 0xd9325000-0xd9383fff] reserved
[0.00] BIOS-e820: [mem 0xd9384000-0xd93a6fff] ACPI data
[0.00] BIOS-e820: [mem 0xd93a7000-0xd9cd6fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xd9cd7000-0xd9fa2fff] reserved
[0.00] BIOS-e820: [mem 0xd9fa3000-0xd9ffefff] type 20
[0.00] BIOS-e820: [mem 0xd9fff000-0xd9ff] usable
[0.00] BIOS-e820: [mem 0xdb00-0xdf7f] reserved
[0.00] BIOS-e820: [mem 0xf800-0xfbff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed03fff] reserved
[0.00] BIOS-e820: [mem 0xfed1c000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xff00-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x00021f7f] usable
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.40 by American Megatrends
[0.00] efi:  ACPI=0xd938b000  ACPI 2.0=0xd938b000  SMBIOS=0xf05b0  MPS=0xfd640
[0.00] SMBIOS 2.8 present.
[0.00] DMI: 
\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x
 
\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x\x/NUC5i3RYB,
 BIOS RYBDWi35.86A.0130.2014.1203.1639 12/03/2014
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] e820: last_pfn = 0x21f800 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed range

Re: [Intel-gfx] [PATCH] drm/i915: Move the mb() following release-mmap into release-mmap

2015-10-06 Thread Tvrtko Ursulin


Hi,

On 06/10/15 12:58, Chris Wilson wrote:

As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses and guarantees
serialisation of the code against concurrent access just by calling
i915_gem_release_mmap().

Signed-off-by: Chris Wilson 
---
  drivers/gpu/drm/i915/i915_gem.c | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2b8ed7a2faab..642644f12295 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1877,11 +1877,21 @@ out:
  void
  i915_gem_release_mmap(struct drm_i915_gem_object *obj)
  {
+   /* Serialisation between user GTT access and our code depends upon
+* revoking the CPU's PTE whilst the mutex is held. The next user
+* pagefault then has to wait until we release the mutex.
+*/
+   lockdep_assert_held(&obj->base.dev->struct_mutex);
+
if (!obj->fault_mappable)
return;

drm_vma_node_unmap(&obj->base.vma_node,
   obj->base.dev->anon_inode->i_mapping);
+
+   /* Ensure that the CPU's PTE are revoked before we return */
+   mb();
+


Would smp_mb() or smp_wmb() not suffice? Is the barrier needed on uniprocessor?

Regards,

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] drm/i915/skl: Allow universal planes to position

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 07:29:54AM -0700, Matt Roper wrote:
> On Tue, Oct 06, 2015 at 02:32:47PM +0100, Tvrtko Ursulin wrote:
> > 
> > On 10/04/15 10:07, Sonika Jindal wrote:
> > >Signed-off-by: Sonika Jindal 
> > >Reviewed-by: Matt Roper 
> > >---
> > >  drivers/gpu/drm/i915/intel_display.c |7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/drivers/gpu/drm/i915/intel_display.c 
> > >b/drivers/gpu/drm/i915/intel_display.c
> > >index ceb2e61..f0bbc22 100644
> > >--- a/drivers/gpu/drm/i915/intel_display.c
> > >+++ b/drivers/gpu/drm/i915/intel_display.c
> > >@@ -12150,16 +12150,21 @@ intel_check_primary_plane(struct drm_plane 
> > >*plane,
> > >   struct drm_rect *dest = &state->dst;
> > >   struct drm_rect *src = &state->src;
> > >   const struct drm_rect *clip = &state->clip;
> > >+  bool can_position = false;
> > >   int ret;
> > >
> > >   crtc = crtc ? crtc : plane->crtc;
> > >   intel_crtc = to_intel_crtc(crtc);
> > >
> > >+  if (INTEL_INFO(dev)->gen >= 9)
> > >+  can_position = true;
> > >+
> > >   ret = drm_plane_helper_check_update(plane, crtc, fb,
> > >   src, dest, clip,
> > >   DRM_PLANE_HELPER_NO_SCALING,
> > >   DRM_PLANE_HELPER_NO_SCALING,
> > >-  false, true, &state->visible);
> > >+  can_position, true,
> > >+  &state->visible);
> > >   if (ret)
> > >   return ret;
> > >
> > >
> > 
> > I have discovered today that, while this allows SetCrtc and SetPlane
> > ioctls to work with frame buffers which do not cover the plane, page
> > flips are not that lucky and fail roughly with:
> > 
> > [drm:drm_crtc_check_viewport] Invalid fb size 1080x1080 for CRTC
> > viewport 1920x1080+0+0.
> 
> Maybe I'm misunderstanding your explanation, but a framebuffer is always
> required to fill/cover the plane scanning out of it.  What this patch is
> supposed to be allowing is for the primary plane to not cover the entire
> CRTC (since that's something that only became possible for Intel
> hardware on the gen9+ platforms).  I.e., the primary plane is now
> allowed to be positioned and resized to cover a subset of the CRTC area,
> just like "sprite" planes have always been able to.
> 
> If you've got a 1080x1080 framebuffer, then it's legal to have a
> 1080x1080 primary plane while running in 1920x1080 mode on SKL/BXT.
> However it is not legal to size the primary plane as 1920x1080 and use
> this same 1080x1080 framebuffer with any of our interfaces (setplane,
> setcrtc, pageflip, or atomic).
> 
> Are you using ioctls/libdrm directly or are you using igt_kms helpers?
> IIRC, the IGT helpers will try to be extra helpful and automatically
> size the plane to match the framebuffer (unless you override that
> behavior), so that might be what's causing the confusion here.

The problem is clear as day in drm_mode_page_flip_ioctl():
	ret = drm_crtc_check_viewport(crtc, crtc->x, crtc->y, &crtc->mode, fb);
	if (ret)
		goto out;

The fix should be easy; just extract the current src coordinates from
the plane state and check those against the new fb size. And then hope
that the plane state is really up to date.

And I'm sure rotated cases will go boom in some other ways. Probably
we should just switch over to using the full plane update for mmio
flips to fix it.
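
A minimal sketch of the check described above, assuming the plane's current
source rectangle is available in whole pixels. The struct and function names
(src_rect, fb_size, check_src_fits_fb) are hypothetical stand-ins and not the
real DRM structures or helpers; the -ENOSPC return value is chosen only to
mirror the failure seen in the page flip path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the plane source rectangle and the size of
 * the new framebuffer; these are not the real DRM structures. */
struct src_rect { unsigned int x, y, w, h; };
struct fb_size { unsigned int width, height; };

/* Return 0 if the source rectangle lies entirely inside the framebuffer,
 * -ENOSPC otherwise. The size test runs first so that the subtractions
 * below cannot wrap around in unsigned arithmetic. */
static int check_src_fits_fb(const struct src_rect *src,
			     const struct fb_size *fb)
{
	assert(src != NULL && fb != NULL);

	if (src->w > fb->width || src->h > fb->height)
		return -ENOSPC;
	if (src->x > fb->width - src->w || src->y > fb->height - src->h)
		return -ENOSPC;
	return 0;
}
```

With a 1080x1080 framebuffer, checking against the 1920x1080 mode fails,
while checking against a 1080x1080 source rectangle at +0+0 passes, which is
the behaviour the page flip path would need for a windowed primary plane.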

> 
> 
> Matt
> 
> > 
> > I have posted a quick IGT exerciser for this as "kms_rotation_crc:
> > Excercise page flips with 90 degree rotation". May not be that great
> > but shows the failure.
> > 
> > I am not that hot on meddling with this code, nor do I feel
> > competent to even try on my own at least. :/ Maybe just because the
> > atomic and plane related rewrites have been going on for so long,
> > and have multiple people involved, it all sounds pretty scary and
> > fragile.
> > 
> > But I think some sort of plan on how to fix this could be in order?
> > 
> > Regards,
> > 
> > Tvrtko
> 
> -- 
> Matt Roper
> Graphics Software Engineer
> IoTG Platform Enabling & Development
> Intel Corporation
> (916) 356-2795

-- 
Ville Syrjälä
Intel OTC


[Intel-gfx] [PATCH 06/12] drm/i915: Introduce mmio workaround list

2015-10-06 Thread Mika Kuoppala
Introduce another workaround list for mmio write type of
workarounds. No users yet.

Cc: Arun Siluvery 
Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 +-
 drivers/gpu/drm/i915/i915_drv.h | 14 ++
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index af44808..0c4e6bc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3095,7 +3095,8 @@ static int i915_shared_dplls_info(struct seq_file *m, 
void *unused)
 }
 
 static void print_wa_regs(struct seq_file *m,
- const struct i915_workarounds *w)
+ const struct i915_workarounds *w,
+ const char *type)
 {
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct drm_device *dev = node->minor->dev;
@@ -3111,8 +3112,8 @@ static void print_wa_regs(struct seq_file *m,
value = w->reg[i].value;
read = I915_READ(addr);
ok = (value & mask) == (read & mask);
-   seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, status: %s\n",
-  addr, value, mask, read, ok ? "OK" : "FAIL");
+   seq_printf(m, "0x%X: 0x%08X, mask: 0x%08X, read: 0x%08x, type: %s, status: %s\n",
+  addr, value, mask, read, type, ok ? "OK" : "FAIL");
}
 }
 
@@ -3130,8 +3131,11 @@ static int i915_wa_registers(struct seq_file *m, void 
*unused)
intel_runtime_pm_get(dev_priv);
 
seq_printf(m, "Workarounds applied: %d\n",
-  dev_priv->lri_workarounds.count);
-   print_wa_regs(m, &dev_priv->lri_workarounds);
+  dev_priv->lri_workarounds.count +
+  dev_priv->mmio_workarounds.count);
+
+   print_wa_regs(m, &dev_priv->lri_workarounds,  " LRI");
+   print_wa_regs(m, &dev_priv->mmio_workarounds, "MMIO");
 
intel_runtime_pm_put(dev_priv);
mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0ed790c..ae5b6b3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1845,6 +1845,7 @@ struct drm_i915_private {
int dpio_phy_iosf_port[I915_NUM_PHYS_VLV];
 
struct i915_workarounds lri_workarounds;
+   struct i915_workarounds mmio_workarounds;
 
/* Reclocking support */
bool render_reclock_avail;
@@ -3519,12 +3520,17 @@ static inline void i915_trace_irq_get(struct 
intel_engine_cs *ring,
 }
 
 /* Workaround register lists */
-#define WA_REG_LRI(addr, mask, val) do { \
-   const int r = intel_wa_add(&dev_priv->lri_workarounds, \
-  (addr), (mask), (val)); \
-   WARN_ON(r); \
+#define WA_REG(wlist, addr, mask, val) do { \
+   const int r = intel_wa_add((wlist), (addr), (mask), (val)); \
+   WARN_ON(r); \
} while (0)
 
+#define WA_REG_LRI(addr, mask, val) \
+   WA_REG(&dev_priv->lri_workarounds, (addr), (mask), (val))
+
+#define WA_REG_MMIO(addr, mask, val) \
+   WA_REG(&dev_priv->mmio_workarounds, (addr), (mask), (val))
+
 #define WA_SET_BIT_MASKED(t, addr, mask) \
WA_REG_##t(addr, (mask), _MASKED_BIT_ENABLE(mask))
 
-- 
2.1.4



[Intel-gfx] [PATCH 05/12] drm/i915: Specify the wa list in WA_* macros

2015-10-06 Thread Mika Kuoppala
In order to prepare for different types of workaround lists,
parametrize the list to which we are adding the workaround register.
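
The selection of a list by name relies on preprocessor token pasting. The
following standalone sketch shows the idea under simplifying assumptions:
wa_list, wa_add(), the global lri_list/mmio_list and MASKED_BIT_ENABLE() are
hypothetical, reduced stand-ins for the driver's structures and helpers, the
second (MMIO) list is included only to show the dispatch, and the macros
evaluate to the return value of wa_add() instead of warning on failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WA_MAX 16

/* Hypothetical, reduced workaround list. */
struct wa_reg { uint32_t addr, mask, value; };
struct wa_list { struct wa_reg reg[WA_MAX]; size_t count; };

static struct wa_list lri_list, mmio_list;

/* Append one entry; returns 0 on success, -1 if the list is full. */
static int wa_add(struct wa_list *w, uint32_t addr, uint32_t mask,
		  uint32_t val)
{
	assert(w != NULL);

	if (w->count >= WA_MAX)
		return -1;
	w->reg[w->count].addr = addr;
	w->reg[w->count].mask = mask;
	w->reg[w->count].value = val;
	w->count++;
	return 0;
}

/* Simplified stand-in for a masked write: mask in the high 16 bits. */
#define MASKED_BIT_ENABLE(a) (((uint32_t)(a) << 16) | (uint32_t)(a))

/* One macro per list type. */
#define WA_REG_LRI(addr, mask, val)  wa_add(&lri_list, (addr), (mask), (val))
#define WA_REG_MMIO(addr, mask, val) wa_add(&mmio_list, (addr), (mask), (val))

/* The first argument is pasted onto WA_REG_, so the caller names the
 * list type (LRI or MMIO) and the preprocessor picks the macro above. */
#define WA_SET_BIT_MASKED(t, addr, mask) \
	WA_REG_##t((addr), (mask), MASKED_BIT_ENABLE(mask))
```

A call such as WA_SET_BIT_MASKED(LRI, 0x20c0u, 0x1u) expands to
WA_REG_LRI(...), so the entry lands in lri_list and mmio_list is untouched.
An unknown list name fails at compile time because no matching WA_REG_<name>
macro or function exists.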

Signed-off-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_drv.h | 20 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 65 +
 2 files changed, 44 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5a04948..0ed790c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3519,23 +3519,25 @@ static inline void i915_trace_irq_get(struct 
intel_engine_cs *ring,
 }
 
 /* Workaround register lists */
-#define WA_REG(addr, mask, val) do { \
+#define WA_REG_LRI(addr, mask, val) do { \
const int r = intel_wa_add(&dev_priv->lri_workarounds, \
   (addr), (mask), (val)); \
WARN_ON(r); \
} while (0)
 
-#define WA_SET_BIT_MASKED(addr, mask) \
-   WA_REG(addr, (mask), _MASKED_BIT_ENABLE(mask))
+#define WA_SET_BIT_MASKED(t, addr, mask) \
+   WA_REG_##t(addr, (mask), _MASKED_BIT_ENABLE(mask))
 
-#define WA_CLR_BIT_MASKED(addr, mask) \
-   WA_REG(addr, (mask), _MASKED_BIT_DISABLE(mask))
+#define WA_CLR_BIT_MASKED(t, addr, mask) \
+   WA_REG_##t(addr, (mask), _MASKED_BIT_DISABLE(mask))
 
-#define WA_SET_FIELD_MASKED(addr, mask, value) \
-   WA_REG(addr, mask, _MASKED_FIELD(mask, value))
+#define WA_SET_FIELD_MASKED(t, addr, mask, value) \
+   WA_REG_##t(addr, mask, _MASKED_FIELD(mask, value))
 
-#define WA_SET_BIT(addr, mask) WA_REG(addr, mask, I915_READ((addr)) | (mask))
-#define WA_CLR_BIT(addr, mask) WA_REG(addr, mask, I915_READ((addr)) & ~(mask))
+#define WA_SET_BIT(t, addr, mask) \
+   WA_REG_##t(addr, mask, I915_READ((addr)) | (mask))
+#define WA_CLR_BIT(t, addr, mask) \
+   WA_REG_##t(addr, mask, I915_READ((addr)) & ~(mask))
 
 int intel_wa_add(struct i915_workarounds *w,
 const u32 addr, const u32 mask, const u32 val);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 29ae97e..c9d3489e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -785,13 +785,13 @@ static int gen8_init_workarounds(struct intel_engine_cs 
*ring)
struct drm_device *dev = ring->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
 
-   WA_SET_BIT_MASKED(INSTPM, INSTPM_FORCE_ORDERING);
+   WA_SET_BIT_MASKED(LRI, INSTPM, INSTPM_FORCE_ORDERING);
 
/* WaDisableAsyncFlipPerfMode:bdw,chv */
-   WA_SET_BIT_MASKED(MI_MODE, ASYNC_FLIP_PERF_DISABLE);
+   WA_SET_BIT_MASKED(LRI, MI_MODE, ASYNC_FLIP_PERF_DISABLE);
 
/* WaDisablePartialInstShootdown:bdw,chv */
-   WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN,
+   WA_SET_BIT_MASKED(LRI, GEN8_ROW_CHICKEN,
  PARTIAL_INSTRUCTION_SHOOTDOWN_DISABLE);
 
/* Use Force Non-Coherent whenever executing a 3D context. This is a
@@ -800,7 +800,7 @@ static int gen8_init_workarounds(struct intel_engine_cs 
*ring)
 */
/* WaForceEnableNonCoherent:bdw,chv */
/* WaHdcDisableFetchWhenMasked:bdw,chv */
-   WA_SET_BIT_MASKED(HDC_CHICKEN0,
+   WA_SET_BIT_MASKED(LRI, HDC_CHICKEN0,
  HDC_DONOT_FETCH_MEM_WHEN_MASKED |
  HDC_FORCE_NON_COHERENT);
 
@@ -812,10 +812,10 @@ static int gen8_init_workarounds(struct intel_engine_cs 
*ring)
 *
 * This optimization is off by default for BDW and CHV; turn it on.
 */
-   WA_CLR_BIT_MASKED(CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
+   WA_CLR_BIT_MASKED(LRI, CACHE_MODE_0_GEN7, HIZ_RAW_STALL_OPT_DISABLE);
 
/* Wa4x4STCOptimizationDisable:bdw,chv */
-   WA_SET_BIT_MASKED(CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
+   WA_SET_BIT_MASKED(LRI, CACHE_MODE_1, GEN8_4x4_STC_OPTIMIZATION_DISABLE);
 
/*
 * BSpec recommends 8x4 when MSAA is used,
@@ -825,7 +825,7 @@ static int gen8_init_workarounds(struct intel_engine_cs 
*ring)
 * disable bit, which we don't touch here, but it's good
 * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM).
 */
-   WA_SET_FIELD_MASKED(GEN7_GT_MODE,
+   WA_SET_FIELD_MASKED(LRI, GEN7_GT_MODE,
GEN6_WIZ_HASHING_MASK,
GEN6_WIZ_HASHING_16x4);
 
@@ -843,16 +843,16 @@ static int bdw_init_workarounds(struct intel_engine_cs 
*ring)
return ret;
 
/* WaDisableThreadStallDopClockGating:bdw (pre-production) */
-   WA_SET_BIT_MASKED(GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
+   WA_SET_BIT_MASKED(LRI, GEN8_ROW_CHICKEN, STALL_DOP_GATING_DISABLE);
 
/* WaDisableDopClockGating:bdw */
-   WA_SET_BIT_MASKED(GEN7_ROW_CHICKEN2,
+   WA_SET_BIT_MASKED(LRI, GEN7_ROW_CHICKEN2,
  DOP_CLOCK_GATING_DISABLE);
 
-   WA_SET_BIT_MASKED(

[Intel-gfx] [PATCH 0/4] lrc lifecycle cleanups

2015-10-06 Thread Nick Hoath
These changes are a result of the requests made in VIZ-4277.
Make the lrc path more like the legacy submission path.
Attach the CPU mappings to vma (un)bind, so that the shrinker
also cleans those up.
Pin the CPU mappings while context is busy (pending bbs), so
that the mappings aren't released/made continuously as this is
an expensive process.

Nick Hoath (4):
  drm/i915: Unify execlist and legacy request life-cycles
  drm/i915: Improve dynamic management/eviction of lrc backing objects
  drm/i915: Add the CPU mapping of the hw context to the pinned items.
  drm/i915: Only update ringbuf address when necessary

 drivers/gpu/drm/i915/i915_debugfs.c |  14 ++--
 drivers/gpu/drm/i915/i915_drv.h |  14 +++-
 drivers/gpu/drm/i915/i915_gem.c |  70 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |   8 ++
 drivers/gpu/drm/i915/i915_irq.c |  81 +---
 drivers/gpu/drm/i915/intel_lrc.c| 131 ++--
 drivers/gpu/drm/i915/intel_lrc.h|   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  71 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   4 -
 9 files changed, 250 insertions(+), 145 deletions(-)

-- 
1.9.1



[Intel-gfx] [PATCH 2/4] drm/i915: Improve dynamic management/eviction of lrc backing objects

2015-10-06 Thread Nick Hoath
Shovel all context related objects through the active queue and obj
management.

- Added callback in vma_(un)bind to add CPU (un)mapping at the same time
  if desired
- Inserted LRC hw context & ringbuf to vma active list

Issue: VIZ-4277
Signed-off-by: Nick Hoath 
---
 drivers/gpu/drm/i915/i915_drv.h |  4 ++
 drivers/gpu/drm/i915/i915_gem.c |  3 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c |  8 
 drivers/gpu/drm/i915/intel_lrc.c| 28 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 71 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 --
 6 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3d217f9..d660ee3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2169,6 +2169,10 @@ struct drm_i915_gem_object {
struct work_struct *work;
} userptr;
};
+
+   /** Support for automatic CPU side mapping of object */
+   int (*mmap)(struct drm_i915_gem_object *obj, bool unmap);
+   void *mappable;
 };
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fc82171..56e0e00 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3262,6 +3262,9 @@ static int __i915_vma_unbind(struct i915_vma *vma, bool 
wait)
if (vma->pin_count)
return -EBUSY;
 
+   if (obj->mmap)
+   obj->mmap(obj, true);
+
BUG_ON(obj->pages == NULL);
 
if (wait) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 620d57e..786ec4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -3495,6 +3495,14 @@ int i915_vma_bind(struct i915_vma *vma, enum 
i915_cache_level cache_level,
 
vma->bound |= bind_flags;
 
+   if (vma->obj->mmap) {
+   ret = vma->obj->mmap(vma->obj, false);
+   if (ret) {
+   i915_vma_unbind(vma);
+   return ret;
+   }
+   }
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e8f5b6c..b807928 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -723,6 +723,18 @@ intel_logical_ring_advance_and_submit(struct 
drm_i915_gem_request *request)
 
intel_logical_ring_advance(request->ringbuf);
 
+   /* Push the hw context on to the active list */
+   i915_vma_move_to_active(
+   i915_gem_obj_to_ggtt(
+   request->ctx->engine[ring->id].state),
+   request);
+
+   /* Push the ringbuf on to the active list */
+   i915_vma_move_to_active(
+   i915_gem_obj_to_ggtt(
+   request->ctx->engine[ring->id].ringbuf->obj),
+   request);
+
request->tail = request->ringbuf->tail;
 
if (intel_ring_stopped(ring))
@@ -1006,10 +1018,15 @@ static int intel_lr_context_do_pin(struct 
intel_engine_cs *ring,
if (ret)
return ret;
 
-   ret = intel_pin_and_map_ringbuffer_obj(ring->dev, ringbuf);
+   ret = i915_gem_obj_ggtt_pin(ringbuf->obj, PAGE_SIZE,
+   PIN_MAPPABLE);
if (ret)
goto unpin_ctx_obj;
 
+   ret = i915_gem_object_set_to_gtt_domain(ringbuf->obj, true);
+   if (ret)
+   goto unpin_rb_obj;
+
ctx_obj->dirty = true;
 
/* Invalidate GuC TLB. */
@@ -1018,6 +1035,8 @@ static int intel_lr_context_do_pin(struct intel_engine_cs 
*ring,
 
return ret;
 
+unpin_rb_obj:
+   i915_gem_object_ggtt_unpin(ringbuf->obj);
 unpin_ctx_obj:
i915_gem_object_ggtt_unpin(ctx_obj);
 
@@ -1052,7 +1071,7 @@ void intel_lr_context_unpin(struct drm_i915_gem_request 
*rq)
if (ctx_obj) {
WARN_ON(!mutex_is_locked(&ring->dev->struct_mutex));
if (--rq->ctx->engine[ring->id].pin_count == 0) {
-   intel_unpin_ringbuffer_obj(ringbuf);
+   i915_gem_object_ggtt_unpin(ringbuf->obj);
i915_gem_object_ggtt_unpin(ctx_obj);
}
}
@@ -2369,7 +2388,7 @@ void intel_lr_context_free(struct intel_context *ctx)
struct intel_engine_cs *ring = ringbuf->ring;
 
if (ctx == ring->default_context) {
-   intel_unpin_ringbuffer_obj(ringbuf);
+   i915_gem_object_ggtt_unpin(ringbuf->obj);
i915_gem_object_ggtt_unpin(ctx_obj);
}
WARN_ON(ctx->engine[ring->id].pin_count);
@@ -2536,5 +2555,8 @@ void intel_lr_context_reset(struct drm_device *d

[Intel-gfx] [PATCH 3/4] drm/i915: Add the CPU mapping of the hw context to the pinned items.

2015-10-06 Thread Nick Hoath
Pin the hw ctx mapping so that it is not mapped/unmapped per bb
when doing GuC submission.

Issue: VIZ-4277
Cc: David Gordon 
Signed-off-by: Nick Hoath 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 --
 drivers/gpu/drm/i915/i915_drv.h |  4 ++-
 drivers/gpu/drm/i915/intel_lrc.c| 56 +++--
 3 files changed, 50 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 3f2a7a7..e68cf5fa 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1970,10 +1970,9 @@ static int i915_context_status(struct seq_file *m, void 
*unused)
 
 static void i915_dump_lrc_obj(struct seq_file *m,
  struct intel_engine_cs *ring,
- struct drm_i915_gem_object *ctx_obj)
+ struct drm_i915_gem_object *ctx_obj,
+ uint32_t *reg_state)
 {
-   struct page *page;
-   uint32_t *reg_state;
int j;
unsigned long ggtt_offset = 0;
 
@@ -1996,17 +1995,13 @@ static void i915_dump_lrc_obj(struct seq_file *m,
return;
}
 
-   page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
-   if (!WARN_ON(page == NULL)) {
-   reg_state = kmap_atomic(page);
-
+   if (!WARN_ON(reg_state == NULL)) {
for (j = 0; j < 0x600 / sizeof(u32) / 4; j += 4) {
seq_printf(m, "\t[0x%08lx] 0x%08x 0x%08x 0x%08x 0x%08x\n",
   ggtt_offset + 4096 + (j * 4),
   reg_state[j], reg_state[j + 1],
   reg_state[j + 2], reg_state[j + 3]);
}
-   kunmap_atomic(reg_state);
}
 
seq_putc(m, '\n');
@@ -2034,7 +2029,8 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
for_each_ring(ring, dev_priv, i) {
if (ring->default_context != ctx)
i915_dump_lrc_obj(m, ring,
- ctx->engine[i].state);
+ ctx->engine[i].state,
+ ctx->engine[i].reg_state);
}
}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d660ee3..b49fd12 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -879,8 +879,10 @@ struct intel_context {
} legacy_hw_ctx;
 
/* Execlists */
-   struct {
+   struct intel_context_engine {
struct drm_i915_gem_object *state;
+   uint32_t *reg_state;
+   struct page *page;
struct intel_ringbuffer *ringbuf;
int pin_count;
} engine[I915_NUM_RINGS];
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b807928..55a4de56 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -360,16 +360,13 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
struct i915_hw_ppgtt *ppgtt = rq->ctx->ppgtt;
struct drm_i915_gem_object *ctx_obj = rq->ctx->engine[ring->id].state;
struct drm_i915_gem_object *rb_obj = rq->ringbuf->obj;
-   struct page *page;
-   uint32_t *reg_state;
+   uint32_t *reg_state = rq->ctx->engine[ring->id].reg_state;
 
BUG_ON(!ctx_obj);
+   WARN_ON(!reg_state);
WARN_ON(!i915_gem_obj_is_pinned(ctx_obj));
WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
 
-   page = i915_gem_object_get_page(ctx_obj, LRC_STATE_PN);
-   reg_state = kmap_atomic(page);
-
reg_state[CTX_RING_TAIL+1] = rq->tail;
reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
 
@@ -385,8 +382,6 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
}
 
-   kunmap_atomic(reg_state);
-
return 0;
 }
 
@@ -1004,7 +999,31 @@ int logical_ring_flush_all_caches(struct 
drm_i915_gem_request *req)
return 0;
 }
 
-static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
+static int intel_mmap_hw_context(struct drm_i915_gem_object *obj,
+   bool unmap)
+{
+   int ret = 0;
+   struct intel_context_engine *ice =
+   (struct intel_context_engine *)obj->mappable;
+   struct page *page;
+   uint32_t *reg_state;
+
+   if (unmap) {
+   kunmap(ice->page);
+   ice->reg_state = NULL;
+   ice->page = NULL;
+   } else {
+   page = i915_gem_object_get_page(obj, LRC_STATE_PN);
+   reg_state = kmap(page);
+   ice->reg_state = reg_state;
+   ice->page = page;
+   }
+   return ret;
+}
+
+static int intel_lr_context

[Intel-gfx] [PATCH 4/4] drm/i915: Only update ringbuf address when necessary

2015-10-06 Thread Nick Hoath
We now only need to update the address of the ringbuf object in the
hw context when it is pinned and the hw context is first CPU mapped.

Issue: VIZ-4277
Cc: David Gordon 
Signed-off-by: Nick Hoath 
---
 drivers/gpu/drm/i915/intel_lrc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 55a4de56..92a0ece 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -368,7 +368,6 @@ static int execlists_update_context(struct 
drm_i915_gem_request *rq)
WARN_ON(!i915_gem_obj_is_pinned(rb_obj));
 
reg_state[CTX_RING_TAIL+1] = rq->tail;
-   reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
 
if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev)) {
/* True 32b PPGTT with dynamic page allocation: update PDP
@@ -1046,6 +1045,9 @@ static int intel_lr_context_do_pin(
if (ret)
goto unpin_rb_obj;
 
+   ctx->engine[ring->id].reg_state[CTX_RING_BUFFER_START+1] =
+   i915_gem_obj_ggtt_offset(ringbuf->obj);
+
ctx_obj->dirty = true;
 
/* Invalidate GuC TLB. */
-- 
1.9.1



[Intel-gfx] [PATCH 1/4] drm/i915: Unify execlist and legacy request life-cycles

2015-10-06 Thread Nick Hoath
There is a desire to simplify the i915 driver by reducing the number of
different code paths introduced by the LRC / execlists support.  As the
execlists request is now part of the gem request it is possible and
desirable to unify the request life-cycles for execlist and legacy
requests.

Added a context complete flag to a request which gets set during the
context switch interrupt.

Added a function i915_gem_request_retireable().  A request is considered
retireable if its seqno passed (i.e. the request has completed) and either
it was never submitted to the ELSP or its context completed.  This ensures
that context save is carried out before the last request for a context is
considered retireable.  retire_requests_ring() now uses
i915_gem_request_retireable() rather than request_complete() when deciding
which requests to retire. Requests that were not waiting for a context
switch interrupt (either as a result of being merged into a following
request or by being a legacy request) will be considered retireable as
soon as their seqno has passed.
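
The rule above can be modelled in isolation. This is a sketch only:
request_model and request_retireable() are hypothetical, reduced stand-ins
for the request structure and i915_gem_request_retireable(), and the
seqno_passed flag stands for the result of the seqno completion check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced model of a request. */
struct request_model {
	bool seqno_passed;   /* the request's seqno has been passed */
	int elsp_submitted;  /* number of times sent to the ELSP */
	bool ctx_complete;   /* set by the context switch interrupt */
};

/* Retireable: completed, and either never submitted to the ELSP
 * (legacy or merged request) or its context has completed. */
static bool request_retireable(const struct request_model *req)
{
	assert(req != NULL);

	return req->seqno_passed &&
	       (!req->elsp_submitted || req->ctx_complete);
}
```

In this model a legacy request (elsp_submitted == 0) is retireable as soon
as its seqno has passed, while a request that went through the ELSP also has
to wait for ctx_complete.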

Removed the extra request reference held for the execlist request.

Removed intel_execlists_retire_requests() and all references to
intel_engine_cs.execlist_retired_req_list.

Moved context unpinning into retire_requests_ring() for now.  Further work
is pending for the context pinning - this patch should allow us to use the
active list to track context and ring buffer objects later.

Changed gen8_cs_irq_handler() so that notify_ring() is called when
contexts complete as well as when a user interrupt occurs so that
notification happens when a request is complete and context save has
finished.

v2: Rebase over the read-read optimisation changes

v3: Reworked IRQ handler after removing IRQ handler cleanup patch

v4: Fixed various pin leaks

Issue: VIZ-4277
Signed-off-by: Thomas Daniel 
Signed-off-by: Nick Hoath 
---
 drivers/gpu/drm/i915/i915_drv.h |  6 +++
 drivers/gpu/drm/i915/i915_gem.c | 67 +--
 drivers/gpu/drm/i915/i915_irq.c | 81 +
 drivers/gpu/drm/i915/intel_lrc.c| 43 +++--
 drivers/gpu/drm/i915/intel_lrc.h|  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 6 files changed, 118 insertions(+), 82 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fbf0ae9..3d217f9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2262,6 +2262,12 @@ struct drm_i915_gem_request {
/** Execlists no. of times this request has been sent to the ELSP */
int elsp_submitted;
 
+   /**
+* Execlists: whether this requests's context has completed after
+* submission to the ELSP
+*/
+   bool ctx_complete;
+
 };
 
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 52642af..fc82171 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1386,6 +1386,24 @@ __i915_gem_request_retire__upto(struct 
drm_i915_gem_request *req)
   typeof(*tmp), list);
 
i915_gem_request_retire(tmp);
+
+   if (i915.enable_execlists) {
+   struct intel_context *ctx = tmp->ctx;
+   struct drm_i915_private *dev_priv =
+   engine->dev->dev_private;
+   unsigned long flags;
+   struct drm_i915_gem_object *ctx_obj =
+   ctx->engine[engine->id].state;
+
+   spin_lock_irqsave(&engine->execlist_lock, flags);
+
+   if (ctx_obj && (ctx != engine->default_context))
+   intel_lr_context_unpin(tmp);
+
+   intel_runtime_pm_put(dev_priv);
+   spin_unlock_irqrestore(&engine->execlist_lock, flags);
+   }
+
} while (tmp != req);
 
WARN_ON(i915_verify_lists(engine->dev));
@@ -2359,6 +2377,12 @@ void i915_vma_move_to_active(struct i915_vma *vma,
list_move_tail(&vma->mm_list, &vma->vm->active_list);
 }
 
+static bool i915_gem_request_retireable(struct drm_i915_gem_request *req)
+{
+   return (i915_gem_request_completed(req, true) &&
+   (!req->elsp_submitted || req->ctx_complete));
+}
+
 static void
 i915_gem_object_retire__write(struct drm_i915_gem_object *obj)
 {
@@ -2829,10 +2853,28 @@ i915_gem_retire_requests_ring(struct intel_engine_cs 
*ring)
   struct drm_i915_gem_request,
   list);
 
-   if (!i915_gem_request_completed(request, true))
+   if (!i915_gem_request_retireable(request))
break;
 
i915_gem_request_retire(request);
+
+   if (i915.enable_execlists) {
+  

Re: [Intel-gfx] [PATCH 1/2] drm/i915/skl: Allow universal planes to position

2015-10-06 Thread Matt Roper
On Tue, Oct 06, 2015 at 05:42:42PM +0300, Ville Syrjälä wrote:
> On Tue, Oct 06, 2015 at 07:29:54AM -0700, Matt Roper wrote:
> > On Tue, Oct 06, 2015 at 02:32:47PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 10/04/15 10:07, Sonika Jindal wrote:
> > > >Signed-off-by: Sonika Jindal 
> > > >Reviewed-by: Matt Roper 
> > > >---
> > > >  drivers/gpu/drm/i915/intel_display.c |7 ++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > >diff --git a/drivers/gpu/drm/i915/intel_display.c 
> > > >b/drivers/gpu/drm/i915/intel_display.c
> > > >index ceb2e61..f0bbc22 100644
> > > >--- a/drivers/gpu/drm/i915/intel_display.c
> > > >+++ b/drivers/gpu/drm/i915/intel_display.c
> > > >@@ -12150,16 +12150,21 @@ intel_check_primary_plane(struct drm_plane 
> > > >*plane,
> > > > struct drm_rect *dest = &state->dst;
> > > > struct drm_rect *src = &state->src;
> > > > const struct drm_rect *clip = &state->clip;
> > > >+bool can_position = false;
> > > > int ret;
> > > >
> > > > crtc = crtc ? crtc : plane->crtc;
> > > > intel_crtc = to_intel_crtc(crtc);
> > > >
> > > >+if (INTEL_INFO(dev)->gen >= 9)
> > > >+can_position = true;
> > > >+
> > > > ret = drm_plane_helper_check_update(plane, crtc, fb,
> > > > src, dest, clip,
> > > > DRM_PLANE_HELPER_NO_SCALING,
> > > > DRM_PLANE_HELPER_NO_SCALING,
> > > >-false, true, 
> > > >&state->visible);
> > > >+can_position, true,
> > > >+&state->visible);
> > > > if (ret)
> > > > return ret;
> > > >
> > > >
> > > 
> > > I have discovered today that, while this allows SetCrtc and SetPlane
> > > ioctls to work with frame buffers which do not cover the plane, page
> > > flips are not that lucky and fail roughly with:
> > > 
> > > [drm:drm_crtc_check_viewport] Invalid fb size 1080x1080 for CRTC
> > > viewport 1920x1080+0+0.
> > 
> > Maybe I'm misunderstanding your explanation, but a framebuffer is always
> > required to fill/cover the plane scanning out of it.  What this patch is
> > supposed to be allowing is for the primary plane to not cover the entire
> > CRTC (since that's something that only became possible for Intel
> > hardware on the gen9+ platforms).  I.e., the primary plane is now
> > > allowed to be positioned and resized to cover a subset of the CRTC area,
> > just like "sprite" planes have always been able to.
> > 
> > If you've got a 1080x1080 framebuffer, then it's legal to have a
> > 1080x1080 primary plane while running in 1920x1080 mode on SKL/BXT.
> > However it is not legal to size the primary plane as 1920x1080 and use
> > this same 1080x1080 framebuffer with any of our interfaces (setplane,
> > setcrtc, pageflip, or atomic).
> > 
> > Are you using ioctls/libdrm directly or are you using igt_kms helpers?
> > IIRC, the IGT helpers will try to be extra helpful and automatically
> > size the plane to match the framebuffer (unless you override that
> > behavior), so that might be what's causing the confusion here.
> 
> The problem is clear as day in drm_mode_page_flip_ioctl():
> ret = drm_crtc_check_viewport(crtc, crtc->x, crtc->y, &crtc->mode, fb);
> if (ret)
>   goto out;
> 
> The fix should be easy; just extract the current src coordinates from
> the plane state and check those against the new fb size. And then hope
> that the plane state is really up to date.

Yep, that's the conclusion we came to once Tvrtko explained what he was
seeing on IRC.  I'm not sure whether non-atomic drivers have enough
state set up by the default helpers to work properly.  Worst case we'll
just assume that a non-atomic driver won't support primary plane
windowing (since none have in the past) and fall back to looking at the
mode for legacy non-atomic drivers.

> 
> And I'm sure rotated cases will go boom in some other ways. Probably
> we should just switch over to using the full plane update for mmio
> flips to fix it.

Yeah; the core looks at a drm_plane->invert_dimensions field that's only
set by omap.  That should probably be updated to look at the state's
rotation on atomic-capable drivers.


Matt

> 
> > 
> > 
> > Matt
> > 
> > > 
> > > I have posted a quick IGT exerciser for this as "kms_rotation_crc:
> > > Excercise page flips with 90 degree rotation". May not be that great
> > > but shows the failure.
> > > 
> > > I am not that hot on meddling with this code, nor do I feel
> > > competent to even try on my own at least. :/ Maybe just because the
> > > atomic and plane related rewrites have been going on for so long,
> > > and have multiple people involved, it all sounds pretty scary and
> > > fragile.
> > > 
> > > But I think some sort of plan on how to fix this could be in order?
> > > 
> > > Regar

Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 03:29:20PM +0100, Dave Gordon wrote:
> On 06/10/15 11:39, Chris Wilson wrote:
> >Passing cliprects into the kernel for it to re-execute the batch buffer
> >with different CMD_DRAWRECT died out long ago. As DRI1 support has been
> >removed from the kernel, we can now simply reject any execbuf trying to
> >use this "feature".
> >
> >To keep Daniel happy with the prospect of being able to reuse these
> >fields in the next decade, continue to ensure that current userspace is
> >not passing garbage in through the dead fields.
> >
> >v2: Fix the cliprects_ptr check
> >
> >Signed-off-by: Chris Wilson 
> >Cc: Daniel Vetter 
> >---
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 154 ++---
> >  drivers/gpu/drm/i915/intel_lrc.c   |  15 ---
> >  2 files changed, 31 insertions(+), 138 deletions(-)
> 
> This will cause John Harrison & myself a certain amount of pain
> (because the changes collide with the scheduler's reorganisation of
> the execbuffer path), but I'm all in favour of getting rid of the
> legacy crud cluttering up this code, so ...

Oh, don't worry I'm completely rewriting it and will nak anything that
adds a cycle of latency to execbuffer.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 1/2] drm/i915: Kill DRI1 cliprects

2015-10-06 Thread Chris Wilson
On Tue, Oct 06, 2015 at 03:19:36PM +0100, Tvrtko Ursulin wrote:
> 
> On 06/10/15 11:39, Chris Wilson wrote:
> >Passing cliprects into the kernel for it to re-execute the batch buffer
> >with different CMD_DRAWRECT died out long ago. As DRI1 support has been
> >removed from the kernel, we can now simply reject any execbuf trying to
> >use this "feature".
> >
> >To keep Daniel happy with the prospect of being able to reuse these
> >fields in the next decade, continue to ensure that current userspace is
> >not passing garbage in through the dead fields.
> >
> >v2: Fix the cliprects_ptr check
> >
> >Signed-off-by: Chris Wilson 
> >Cc: Daniel Vetter 
> 
> Don't know anything about the DRI1 history but the removal looks
> fine to me, so:
> 
> Reviewed-by: Tvrtko Ursulin 
> 
> Would it also make sense to rename the related fields in
> drm_i915_gem_execbuffer2 to reserved?

In our heads, yes. Renaming structs in uapi.h is harder than it looks.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


[Intel-gfx] [PATCH] drm/i915: Set mode->crtc_clock during hardware state readout

2015-10-06 Thread Matt Roper
intel_mode_from_pipe_config() fills in a mode structure from the CRTC
state that was read out of the hardware, but does not set the
.crtc_clock field (it only sets the .clock).  This causes the subsequent
call to drm_calc_timestamping_constants() to complain with messages like
"*ERROR* crtc 21: Can't calculate constants, dotclock = 0!"  Ensuring
.crtc_clock is set as well eliminates this error.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_display.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index bbeb6d3..4e481e3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7752,6 +7752,7 @@ void intel_mode_from_pipe_config(struct drm_display_mode 
*mode,
mode->type = DRM_MODE_TYPE_DRIVER;
 
mode->clock = pipe_config->base.adjusted_mode.crtc_clock;
+   mode->crtc_clock = pipe_config->base.adjusted_mode.crtc_clock;
mode->flags |= pipe_config->base.adjusted_mode.flags;
 
mode->hsync = drm_mode_hsync(mode);
-- 
2.1.4



Re: [Intel-gfx] [PATCH 1/2] drm/i915/skl: Allow universal planes to position

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 08:16:19AM -0700, Matt Roper wrote:
> On Tue, Oct 06, 2015 at 05:42:42PM +0300, Ville Syrjälä wrote:
> > On Tue, Oct 06, 2015 at 07:29:54AM -0700, Matt Roper wrote:
> > > On Tue, Oct 06, 2015 at 02:32:47PM +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > On 10/04/15 10:07, Sonika Jindal wrote:
> > > > >Signed-off-by: Sonika Jindal 
> > > > >Reviewed-by: Matt Roper 
> > > > >---
> > > > >  drivers/gpu/drm/i915/intel_display.c |7 ++-
> > > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > > >
> > > > >diff --git a/drivers/gpu/drm/i915/intel_display.c 
> > > > >b/drivers/gpu/drm/i915/intel_display.c
> > > > >index ceb2e61..f0bbc22 100644
> > > > >--- a/drivers/gpu/drm/i915/intel_display.c
> > > > >+++ b/drivers/gpu/drm/i915/intel_display.c
> > > > >@@ -12150,16 +12150,21 @@ intel_check_primary_plane(struct drm_plane 
> > > > >*plane,
> > > > >   struct drm_rect *dest = &state->dst;
> > > > >   struct drm_rect *src = &state->src;
> > > > >   const struct drm_rect *clip = &state->clip;
> > > > >+  bool can_position = false;
> > > > >   int ret;
> > > > >
> > > > >   crtc = crtc ? crtc : plane->crtc;
> > > > >   intel_crtc = to_intel_crtc(crtc);
> > > > >
> > > > >+  if (INTEL_INFO(dev)->gen >= 9)
> > > > >+  can_position = true;
> > > > >+
> > > > >   ret = drm_plane_helper_check_update(plane, crtc, fb,
> > > > >   src, dest, clip,
> > > > >   DRM_PLANE_HELPER_NO_SCALING,
> > > > >   DRM_PLANE_HELPER_NO_SCALING,
> > > > >-  false, true, 
> > > > >&state->visible);
> > > > >+  can_position, true,
> > > > >+  &state->visible);
> > > > >   if (ret)
> > > > >   return ret;
> > > > >
> > > > >
> > > > 
> > > > I have discovered today that, while this allows SetCrtc and SetPlane
> > > > ioctls to work with frame buffers which do not cover the plane, page
> > > > flips are not that lucky and fail roughly with:
> > > > 
> > > > [drm:drm_crtc_check_viewport] Invalid fb size 1080x1080 for CRTC
> > > > viewport 1920x1080+0+0.
> > > 
> > > Maybe I'm misunderstanding your explanation, but a framebuffer is always
> > > required to fill/cover the plane scanning out of it.  What this patch is
> > > supposed to be allowing is for the primary plane to not cover the entire
> > > CRTC (since that's something that only became possible for Intel
> > > hardware on the gen9+ platforms).  I.e., the primary plane is now
> > > > allowed to be positioned and resized to cover a subset of the CRTC area,
> > > just like "sprite" planes have always been able to.
> > > 
> > > If you've got a 1080x1080 framebuffer, then it's legal to have a
> > > 1080x1080 primary plane while running in 1920x1080 mode on SKL/BXT.
> > > However it is not legal to size the primary plane as 1920x1080 and use
> > > this same 1080x1080 framebuffer with any of our interfaces (setplane,
> > > setcrtc, pageflip, or atomic).
> > > 
> > > Are you using ioctls/libdrm directly or are you using igt_kms helpers?
> > > IIRC, the IGT helpers will try to be extra helpful and automatically
> > > size the plane to match the framebuffer (unless you override that
> > > behavior), so that might be what's causing the confusion here.
> > 
> > The problem is clear as day in drm_mode_page_flip_ioctl():
> > ret = drm_crtc_check_viewport(crtc, crtc->x, crtc->y, &crtc->mode, fb);
> > if (ret)
> > goto out;
> > 
> > The fix should be easy; just extract the current src coordinates from
> > the plane state and check those against the new fb size. And then hope
> > that the plane state is really up to date.
> 
> Yep, that's the conclusion we came to once Tvrtko explained what he was
> seeing on IRC.  I'm not sure whether non-atomic drivers have enough
> state setup by the default helpers to work properly.  Worst case we'll
> just assume that a non-atomic driver won't support primary plane
> windowing (since none have in the past) and fall back to looking at the
> mode for legacy non-atomic drivers.
> 
> > 
> > And I'm sure rotated cases will go boom in some other ways. Probably
> > we should just switch over to using the full plane update for mmio
> > flips to fix it.
> 
> Yeah; the core looks at a drm_plane->invert_dimensions field that's only
> set by omap.  That should probably be updated to look at the state's
> rotation on atomic-capable drivers.

We can just look at the src coordinates. Those always match the fb
orientation.
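
For illustration, the check described above (compare the plane's current src
rectangle against the new fb, instead of comparing the CRTC mode against it)
could look roughly like the sketch below. The struct and function names are
made up for this example and are not the DRM core's types; the only detail
taken from DRM is that plane src coordinates are 16.16 fixed point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real DRM types. */
struct sketch_fb {
	uint32_t width;		/* in pixels */
	uint32_t height;	/* in pixels */
};

struct sketch_plane_state {
	/* Source rectangle in 16.16 fixed point, as DRM plane state uses. */
	uint32_t src_x, src_y;
	uint32_t src_w, src_h;
};

/*
 * Return true if the source rectangle lies entirely inside the framebuffer.
 * The comparisons are done in 64 bits and arranged so they cannot overflow.
 */
static bool sketch_src_fits_fb(const struct sketch_plane_state *st,
			       const struct sketch_fb *fb)
{
	uint64_t fb_w = (uint64_t)fb->width << 16;
	uint64_t fb_h = (uint64_t)fb->height << 16;

	if (st->src_w > fb_w || st->src_x > fb_w - st->src_w)
		return false;
	if (st->src_h > fb_h || st->src_y > fb_h - st->src_h)
		return false;

	return true;
}
```

With a 1080x1080 fb, a 1080x1080 src rectangle at 0,0 passes and a 1920x1080
one does not, regardless of the 1920x1080 mode on the CRTC.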

> 
> 
> Matt
> 
> > 
> > > 
> > > 
> > > Matt
> > > 
> > > > 
> > > > I have posted a quick IGT exerciser for this as "kms_rotation_crc:
> > > > Excercise page flips with 90 degree rotation". May not be that great
> > > > but shows the failure.
> > > > 
> > > > I am not that hot on meddling with this code, nor do I feel
> > > > comp

Re: [Intel-gfx] [PATCH] drm/i915: Set mode->crtc_clock during hardware state readout

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 09:26:31AM -0700, Matt Roper wrote:
> intel_mode_from_pipe_config() fills in a mode structure from the CRTC
> state that was read out of the hardware, but does not set the
> .crtc_clock field (it only sets the .clock).  This causes the subsequent
> call to drm_calc_timestamping_constants() to complain with messages like
> "*ERROR* crtc 21: Can't calculate constants, dotclock = 0!"  Ensuring
> .crtc_clock is set as well eliminates this error.
> 
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index bbeb6d3..4e481e3 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -7752,6 +7752,7 @@ void intel_mode_from_pipe_config(struct 
> drm_display_mode *mode,
>   mode->type = DRM_MODE_TYPE_DRIVER;
>  
>   mode->clock = pipe_config->base.adjusted_mode.crtc_clock;
> + mode->crtc_clock = pipe_config->base.adjusted_mode.crtc_clock;

You should never look at crtc_clock unless you're looking at the
adjusted mode. Are you actually seeing these errors with the current
code? They should have been fixed by:

7f4c62840cc4 drm/i915: Assign hwmode after encoder state readout
0f64614dde17 drm/i915: Fix clock readout when pipes are enabled w/o ports
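
As background on why a zero dotclock produces the quoted error: any frame or
line duration has to be computed by dividing by the pixel clock. The sketch
below is only an illustration of that arithmetic, assuming the clock is in
kHz as in struct drm_display_mode; the struct and function names are
hypothetical and this is not the body of drm_calc_timestamping_constants().

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for the timing fields of a display mode. */
struct sketch_mode {
	int crtc_clock;		/* pixel clock in kHz */
	int crtc_htotal;	/* total pixels per line, including blanking */
	int crtc_vtotal;	/* total lines per frame, including blanking */
};

/*
 * Frame duration in nanoseconds, or 0 when the clock is unusable.
 * pixels_per_frame / (kHz * 1000) seconds == pixels * 1000000 / kHz ns.
 */
static uint64_t sketch_frame_duration_ns(const struct sketch_mode *m)
{
	if (m->crtc_clock <= 0)
		return 0;	/* the "dotclock = 0" case: nothing to divide by */

	return (uint64_t)m->crtc_htotal * m->crtc_vtotal * 1000000 /
	       m->crtc_clock;
}
```

For a 1920x1080@60 mode (htotal 2200, vtotal 1125, 148500 kHz) this gives
roughly 16.67 ms per frame; with crtc_clock left at 0 there is no answer.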

>   mode->flags |= pipe_config->base.adjusted_mode.flags;
>  
>   mode->hsync = drm_mode_hsync(mode);
> -- 
> 2.1.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC


Re: [Intel-gfx] [PATCH 1/5] drm/i915/kbl: Add Kabylake PCI ID

2015-10-06 Thread Vivi, Rodrigo
On Tue, 2015-10-06 at 12:09 +0300, Jani Nikula wrote:
> On Tue, 06 Oct 2015, Rodrigo Vivi  wrote:
> > From: Deepak S 
> > 
> > v2: separate out device info into different GT (Damien)
> > v3: Add is_kabylake to the KBL gt3 structure (Damien)
> > Sort the platforms in older -> newer order (Damien)
> > 
> > Reviewed-by: Damien Lespiau 
> > Signed-off-by: Deepak S 
> > Signed-off-by: Damien Lespiau 
> > Signed-off-by: Rodrigo Vivi 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 33 -
> >  drivers/gpu/drm/i915/i915_drv.h |  2 ++
> >  include/drm/i915_pciids.h   | 29 +
> >  3 files changed, 63 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c 
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 1cb6b82..f42102d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -394,6 +394,34 @@ static const struct intel_device_info 
> > intel_broxton_info = {
> > IVB_CURSOR_OFFSETS,
> >  };
> >  
> > +static const struct intel_device_info intel_kabylake_info = {
> > +   .is_preliminary = 1,
> > +   .is_skylake = 1,
> 
> Now's the time to call the shots, is this really a good idea or not? See
> VLV vs. CHV, we (okay, the royal we) still confuse ourselves with
> IS_VALLEYVIEW.
> 
> Granted, 74 call sites for IS_SKYLAKE(), all of those would need to be
> patched. We'd need something like "is skylake family" including SKL and
> KBL. Some of them might be changed to be more like feature flags, which
> is something we've decided we need to do more anyway.

To be honest I also don't like this approach, just as I didn't like the chv
is vlv one. I always get confused, and I thought about changing
everything many times during the last week, but as you pointed out it would
be a change in many places, and I wasn't sure what others' opinion was since
I never saw any complaint about this patch before, nor about the vlv-chv
ones...

But if you and Daniel prefer I can do a
s/IS_SKYLAKE(dev)/IS_KABYLAKE(dev) || IS_SKYLAKE(dev)
Just let me know...

The IS_SKYLAKE_FAMILY I don't like much because it will still confuse
people because in ark they will probably be 2 different "families" if
they continue with the current scheme they are using so far.

> 
> BR,
> Jani.
> 
> 
> > +   .is_kabylake = 1,
> > +   .gen = 9, .num_pipes = 3,
> > +   .need_gfx_hws = 1, .has_hotplug = 1,
> > +   .ring_mask = RENDER_RING | BSD_RING | BLT_RING | 
> > VEBOX_RING,
> > +   .has_llc = 1,
> > +   .has_ddi = 1,
> > +   .has_fbc = 1,
> > +   GEN_DEFAULT_PIPEOFFSETS,
> > +   IVB_CURSOR_OFFSETS,
> > +};
> > +
> > +static const struct intel_device_info intel_kabylake_gt3_info = {
> > +   .is_preliminary = 1,
> > +   .is_skylake = 1,
> > +   .is_kabylake = 1,
> > +   .gen = 9, .num_pipes = 3,
> > +   .need_gfx_hws = 1, .has_hotplug = 1,
> > +   .ring_mask = RENDER_RING | BSD_RING | BLT_RING | 
> > VEBOX_RING | BSD2_RING,
> > +   .has_llc = 1,
> > +   .has_ddi = 1,
> > +   .has_fbc = 1,
> > +   GEN_DEFAULT_PIPEOFFSETS,
> > +   IVB_CURSOR_OFFSETS,
> > +};
> > +
> >  /*
> >   * Make sure any device matches here are from most specific to 
> > most
> >   * general.  For example, since the Quanta match is based on the 
> > subsystem
> > @@ -434,7 +462,10 @@ static const struct intel_device_info 
> > intel_broxton_info = {
> > INTEL_SKL_GT1_IDS(&intel_skylake_info), \
> > INTEL_SKL_GT2_IDS(&intel_skylake_info), \
> > INTEL_SKL_GT3_IDS(&intel_skylake_gt3_info), \
> > -   INTEL_BXT_IDS(&intel_broxton_info)
> > +   INTEL_BXT_IDS(&intel_broxton_info), \
> > +   INTEL_KBL_GT1_IDS(&intel_kabylake_info),\
> > +   INTEL_KBL_GT2_IDS(&intel_kabylake_info),\
> > +   INTEL_KBL_GT3_IDS(&intel_kabylake_gt3_info)
> >  
> >  static const struct pci_device_id pciidlist[] = {  /
> > * aka */
> > INTEL_PCI_IDS,
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 824e724..f7e9d7e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -765,6 +765,7 @@ struct intel_csr {
> > func(is_valleyview) sep \
> > func(is_haswell) sep \
> > func(is_skylake) sep \
> > +   func(is_kabylake) sep \
> > func(is_preliminary) sep \
> > func(has_fbc) sep \
> > func(has_pipe_cxsr) sep \
> > @@ -2464,6 +2465,7 @@ struct drm_i915_cmd_table {
> >  #define IS_BROADWELL(dev)  (!INTEL_INFO(dev)->is_valleyview 
> > && IS_GEN8(dev))
> >  #define IS_SKYLAKE(dev)(INTEL_INFO(dev)->is_skylake)
> >  #define IS_BROXTON(dev)(!INTEL_INFO(dev)->is_skylake && 
> > IS_GEN9(dev))
> > +#define IS_KABYLAKE(dev)   (INTEL_INFO(dev)->is_kabylake)
> >  #define IS_MOBILE(dev) (INTEL_INFO(dev)->is_mobile)
> >  #define IS_HSW_EARLY_SDV(dev)  (IS_HASWELL(dev) && \
> >  (INTEL_DEVID(dev) & 0xFF00) == 
> > 0x0C00)
> > diff --git a/include/drm/i915_pciids.h b/include/drm/i915_pciids.h
> > index 

Re: [Intel-gfx] [PATCH 3/5] drm/i915/kbl: Kabylake A0 is based on Skylake H0.

2015-10-06 Thread Vivi, Rodrigo
On Tue, 2015-10-06 at 12:24 +0300, Jani Nikula wrote:
> On Tue, 06 Oct 2015, Rodrigo Vivi  wrote:
> > Kabylake is gen 9.5 derivated from Skylake H0 stepping.
> > 
> > So we don't need pre-production Skylake workaround and also
> > firmware loading will use SKL H0 offsets.
> > 
> > Signed-off-by: Rodrigo Vivi 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h | 7 ++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 7374a0d..580c005 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2436,7 +2436,6 @@ struct drm_i915_cmd_table {
> >  })
> >  #define INTEL_INFO(p)  (&__I915__(p)->info)
> >  #define INTEL_DEVID(p) (INTEL_INFO(p)->device_id)
> > -#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision)
> >  
> >  #define IS_I830(dev)   (INTEL_DEVID(dev) == 0x3577)
> >  #define IS_845G(dev)   (INTEL_DEVID(dev) == 0x2562)
> > @@ -2508,6 +2507,9 @@ struct drm_i915_cmd_table {
> >  
> >  #define IS_PRELIMINARY_HW(intel_info) ((intel_info)
> > ->is_preliminary)
> >  
> > +#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision + 
> > \
> > +IS_KABYLAKE(p) ? 7 : 0)
> > +
> 
> I am not fond of this at all. It will be really confusing that
> ->revision is different from INTEL_REVID when checking the 
> workarounds,
> and that you'll be using SKL_REVID_* to match KBL revision
> ids. 

this is exactly one of the reasons why I did this sum in this way so
they never match...
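
A side note on the macro as quoted above: in C, `+` binds tighter than `?:`,
so `revision + IS_KABYLAKE(p) ? 7 : 0` parses as
`(revision + IS_KABYLAKE(p)) ? 7 : 0` and yields either 7 or 0, never a sum.
The two macros below are hypothetical stand-ins (plain integers instead of
the device pointer) that only illustrate the parse; they are not proposed
driver code.

```c
/* Same shape as the quoted definition: the condition is (rev + is_kbl). */
#define SKETCH_REVID_AS_POSTED(rev, is_kbl) \
	((rev) + (is_kbl) ? 7 : 0)

/* With the conditional parenthesized, the offset is added to the revision. */
#define SKETCH_REVID_PARENTHESIZED(rev, is_kbl) \
	((rev) + ((is_kbl) ? 7 : 0))
```

With the first form, a non-Kabylake part at revision 2 evaluates to 7 and
revision 0 evaluates to 0; only the second form gives 2 and 0 for those
inputs, and 7 for a Kabylake part at revision 0.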

> Additionally, we'll probably want to start removing SKL workarounds
> before KBL workarounds.

I believe this is another discussion... On HSW and BDW I remember I was
removing old W/as as they were no longer needed, but on SKL I saw this REVID
and I believed the idea was to leave them there since some devs might
still be using preliminary platforms for other reasons... I don't see a
problem with leaving the old W/as there.

> 
> Others may disagree, but I'd like KBL revid checks be different from
> SKL.
> 
> >  #define SKL_REVID_A0   (0x0)
> >  #define SKL_REVID_B0   (0x1)
> >  #define SKL_REVID_C0   (0x2)
> > @@ -2515,6 +2517,9 @@ struct drm_i915_cmd_table {
> >  #define SKL_REVID_E0   (0x4)
> >  #define SKL_REVID_F0   (0x5)
> >  
> > +/* KBL A0 is based on SKL H0 */
> > +#define KBL_REVID_A0   (0x7)
> 
> You can't compare this against INTEL_REVID() now can you...? Or is 
> this
> not the one in the spec? Confused already.

Yes, this is confusing indeed. It seems that we have many levels of
steppings (according to the platform guys): the platform stepping
returning 0 is our KBL A0, but this corresponds to our internal GPU
stepping H0 (the same as SKL H0).

For DMC firmware loading, for instance, we need to load the firmware for
stepping 7.

So yes, this definition matches BSpec KBL A0.

> 
> BR,
> Jani.
> 
> > +
> >  #define BXT_REVID_A0   (0x0)
> >  #define BXT_REVID_B0   (0x3)
> >  #define BXT_REVID_C0   (0x9)
> > -- 
> > 2.4.3
> > 
> > ___
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 


Re: [Intel-gfx] [PATCH 2/3] drm/i915/bxt: add revision id for A1 stepping and use it

2015-10-06 Thread Vivi, Rodrigo
On Tue, 2015-10-06 at 16:43 +0300, Jani Nikula wrote:
> On Tue, 06 Oct 2015, Ville Syrjälä  wrote:
> > On Tue, Oct 06, 2015 at 02:41:15PM +0300, Jani Nikula wrote:
> > > Prefer inclusive ranges for revision checks rather than "below B0". Per
> > > specs A2 is not used, so revid <= A1 matches revid < B0.
> > 
> > The w/a db would say UNTIL_B0 etc., so might be easier to check against
> > it if we keep to the same convention.
> 
> So I wanted to double check what the convention is. I picked
> WaRsDisableCoarsePowerGating.
> 
> KBL - SIWA_FOREVER
> BXT - SI_WA_BEFORE(BXT_REV_ID_B0)
> SKL - SIWA_UNTIL_SKL_E0
> 
> Description "Disable coarse power gating for GT4 until GT F0 stepping."
> 
> *rolls eyes*
> 
> So is that "until" there inclusive or non-inclusive? The db is
> contradicting itself... Cc: Sarah who has also looked at workarounds
> recently.
> 
> Rodrigo, for one thing, I'll want workarounds for SKL and KBL in
> different conditions instead of conflated into SKL!

I agree with Ville that <= REVID matches the spec in the sense of "until"
a certain stepping, and I like this.

Also, the KBL W/As don't conflict with the SKL W/As in the way they were
derived...

> 
> But what about this non-inclusive end of range? It'll matter in patch
> 3/3. It's not so much a problem for ranges, but rather for specific
> revisions, where you'd have to include a revision not mentioned in the
> spec at all, e.g. for B0 only:
> 
>   IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_C0)
> 
> instead of the current proposal:
> 
>   IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_B0)
> 
> I'm not really fond of adding separate macros for checking specific
> vs. ranges.
> 
> Thoughts?
> 
> BR,
> Jani.
> 
> 
> 
> 
> 
> > 
> > > 
> > > Signed-off-by: Jani Nikula 
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h| 1 +
> > >  drivers/gpu/drm/i915/i915_gem.c| 2 +-
> > >  drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
> > >  drivers/gpu/drm/i915/intel_ddi.c   | 2 +-
> > >  drivers/gpu/drm/i915/intel_dp.c| 2 +-
> > >  drivers/gpu/drm/i915/intel_hdmi.c  | 2 +-
> > >  drivers/gpu/drm/i915/intel_lrc.c   | 8 
> > >  drivers/gpu/drm/i915/intel_pm.c| 6 +++---
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +++---
> > >  9 files changed, 16 insertions(+), 15 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index a3b137715604..9833a2055930 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -2509,6 +2509,7 @@ struct drm_i915_cmd_table {
> > >  #define SKL_REVID_F0 0x5
> > >  
> > >  #define BXT_REVID_A0 0x0
> > > +#define BXT_REVID_A1 0x1
> > >  #define BXT_REVID_B0 0x3
> > >  #define BXT_REVID_C0 0x9
> > >  
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > > b/drivers/gpu/drm/i915/i915_gem.c
> > > index f0cfbb9ee12c..fd2d880656b2 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct 
> > > drm_device *dev, void *data,
> > >* cacheline, whereas normally such cachelines 
> > > would get
> > >* invalidated.
> > >*/
> > > - if (IS_BROXTON(dev) && INTEL_REVID(dev) < 
> > > BXT_REVID_B0)
> > > + if (IS_BROXTON(dev) && INTEL_REVID(dev) <= 
> > > BXT_REVID_A1)
> > >   return -ENODEV;
> > >  
> > >   level = I915_CACHE_LLC;
> > > diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
> > > b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > index 036b42bae827..863aa5c82466 100644
> > > --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> > > @@ -161,7 +161,7 @@ static int host2guc_sample_forcewake(struct 
> > > intel_guc *guc,
> > >   data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
> > >   /* WaRsDisableCoarsePowerGating:skl,bxt */
> > >   if (!intel_enable_rc6(dev_priv->dev) ||
> > > - (IS_BROXTON(dev) && (INTEL_REVID(dev) < 
> > > BXT_REVID_B0)) ||
> > > + (IS_BROXTON(dev) && (INTEL_REVID(dev) <= 
> > > BXT_REVID_A1)) ||
> > >   (IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= 
> > > SKL_REVID_E0)) ||
> > >   (IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= 
> > > SKL_REVID_E0)))
> > >   data[1] = 0;
> > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> > > b/drivers/gpu/drm/i915/intel_ddi.c
> > > index b25e99a432fb..b80e0f5ec5dc 100644
> > > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > @@ -3247,7 +3247,7 @@ void intel_ddi_init(struct drm_device *dev, 
> > > enum port port)
> > >* On BXT A0/A1, sw needs to activate DDIA HPD 
> > > logic and
> > >* interrupts to check the external panel 
> > > connection.
> > >*/
> > > - if (IS_BROXTON(dev_priv) && (IN

Re: [Intel-gfx] [PATCH 1/1] drm/i915/bxt: Set time interval unit to 0.833us

2015-10-06 Thread Imre Deak
On Fri, 2015-09-18 at 23:39 +0530, Sagar Arun Kamble wrote:
> From: Akash Goel 
> 
> Signed-off-by: Ankitprasad Sharma 
> Signed-off-by: Akash Goel 
> Signed-off-by: Sagar Arun Kamble 

The comment about units in gen6_set_rps_thresholds() is outdated, so you
could update that while at it. In any case this looks ok, so:
Reviewed-by: Imre Deak 

> ---
>  drivers/gpu/drm/i915/i915_reg.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 67bf205..6b1998c 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -2802,8 +2802,11 @@ enum skl_disp_power_wells {
>  
>  #define INTERVAL_1_28_US(us) (((us) * 100) >> 7)
>  #define INTERVAL_1_33_US(us) (((us) * 3)   >> 2)
> +#define INTERVAL_0_833_US(us)(((us) * 6) / 5)
>  #define GT_INTERVAL_FROM_US(dev_priv, us) (IS_GEN9(dev_priv) ? \
> - INTERVAL_1_33_US(us) : \
> + (IS_BROXTON(dev_priv) ? \
> + INTERVAL_0_833_US(us) : \
> + INTERVAL_1_33_US(us)) : \
>   INTERVAL_1_28_US(us))
>  
>  /*
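
The conversion macros in the quoted hunk are easy to sanity-check outside the
kernel. The three INTERVAL_* macros below are copied from the hunk; the
selector function is a hypothetical stand-in for GT_INTERVAL_FROM_US() that
takes plain flags instead of dev_priv, so it is a sketch of the selection
logic only.

```c
/* Copied from the quoted i915_reg.h hunk: microseconds -> hardware units. */
#define INTERVAL_1_28_US(us)	(((us) * 100) >> 7)	/* us / 1.28  */
#define INTERVAL_1_33_US(us)	(((us) * 3)   >> 2)	/* us / 1.333 */
#define INTERVAL_0_833_US(us)	(((us) * 6) / 5)	/* us / 0.833 */

/*
 * Hypothetical stand-in for GT_INTERVAL_FROM_US(): Broxton is only
 * distinguished inside the gen9 branch, as in the patch.
 */
static unsigned int sketch_gt_interval_from_us(int is_gen9, int is_broxton,
					       unsigned int us)
{
	if (!is_gen9)
		return INTERVAL_1_28_US(us);

	return is_broxton ? INTERVAL_0_833_US(us) : INTERVAL_1_33_US(us);
}
```

For 1000 us this gives 1200 units on Broxton, 750 on other gen9 parts and
781 (truncated from 781.25) on older platforms.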




Re: [Intel-gfx] [PATCH 2/3] drm/i915/bxt: add revision id for A1 stepping and use it

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 04:43:11PM +0300, Jani Nikula wrote:
> On Tue, 06 Oct 2015, Ville Syrjälä  wrote:
> > On Tue, Oct 06, 2015 at 02:41:15PM +0300, Jani Nikula wrote:
> >> Prefer inclusive ranges for revision checks rather than "below B0". Per
> >> specs A2 is not used, so revid <= A1 matches revid < B0.
> >
> > The w/a db would say UNTIL_B0 etc., so might be easier to check against
> > it if we keep to the same convention.
> 
> So I wanted to double check what the convention is. I picked
> WaRsDisableCoarsePowerGating.
> 
> KBL - SIWA_FOREVER
> BXT - SI_WA_BEFORE(BXT_REV_ID_B0)
> SKL - SIWA_UNTIL_SKL_E0
> 
> Description "Disable coarse power gating for GT4 until GT F0 stepping."
> 
> *rolls eyes*
> 
> So is that "until" there inclusive or non-inclusive? The db is
> contradicting itself...

Hmm. My recollection was that it's exclusive, but now that I look at
your findings and some other workarounds, it does look a bit more like
inclusive.

I would think the exclusive thing would be easier to maintain since
the hsd specifies the stepping in which stuff got fixed, and the
exclusive convention would then have the same stepping listed. Eg. if
the hsd says fixed in E0, but the w/a db says UNTIL_D0, then one is
left wondering about D1+. But perhaps such steppings didn't even exist.

Well, in reality it's all over the place. Eg. looking at the BDW UNTIL_D0
stuff, some are fixed in E0, some are fixed in D0, and at least one was
fixed in B0 according to hsd. So I'm starting to think that the meaning
of the tag depends entirely on the person who pushed the change.
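
To make the two conventions concrete, here is a small sketch of an inclusive
and an exclusive range check. The REVID values are the ones defined in the
quoted i915_drv.h hunks; the helper names are invented for this example and
are not the IS_SKL_REVID() macro from patch 3/3.

```c
#include <stdbool.h>

/* Values as defined in the quoted i915_drv.h hunks. */
#define SKL_REVID_B0	0x1
#define SKL_REVID_C0	0x2
#define BXT_REVID_A1	0x1
#define BXT_REVID_B0	0x3

/* Inclusive convention: matches since <= rev <= until. */
static bool sketch_revid_inclusive(int rev, int since, int until)
{
	return rev >= since && rev <= until;
}

/* Exclusive convention: matches since <= rev < until. */
static bool sketch_revid_exclusive(int rev, int since, int until)
{
	return rev >= since && rev < until;
}
```

"B0 only" is (B0, B0) with the inclusive helper but (B0, C0) with the
exclusive one. For BXT, "<= A1" and "< B0" differ only for revision id 0x2,
which per the patch description is not used.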

> Cc: Sarah who has also looked at workarounds
> recently.
> 
> Rodrigo, for one thing, I'll want workarounds for SKL and KBL in
> different conditions instead of conflated into SKL!
> 
> But what about this non-inclusive end of range? It'll matter in patch
> 3/3. It's not so much a problem for ranges, but rather for specific
> revisions, where you'd have to include a revision not mentioned in the
> spec at all, e.g. for B0 only:
> 
>   IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_C0)
> 
> instead of the current proposal:
> 
>   IS_SKL_REVID(dev, SKL_REVID_B0, SKL_REVID_B0)
> 
> I'm not really fond of adding separate macros for checking specific
> vs. ranges.
> 
> Thoughts?
> 
> BR,
> Jani.
> 
> 
> 
> 
> 
> >
> >> 
> >> Signed-off-by: Jani Nikula 
> >> ---
> >>  drivers/gpu/drm/i915/i915_drv.h| 1 +
> >>  drivers/gpu/drm/i915/i915_gem.c| 2 +-
> >>  drivers/gpu/drm/i915/i915_guc_submission.c | 2 +-
> >>  drivers/gpu/drm/i915/intel_ddi.c   | 2 +-
> >>  drivers/gpu/drm/i915/intel_dp.c| 2 +-
> >>  drivers/gpu/drm/i915/intel_hdmi.c  | 2 +-
> >>  drivers/gpu/drm/i915/intel_lrc.c   | 8 
> >>  drivers/gpu/drm/i915/intel_pm.c| 6 +++---
> >>  drivers/gpu/drm/i915/intel_ringbuffer.c| 6 +++---
> >>  9 files changed, 16 insertions(+), 15 deletions(-)
> >> 
> >> diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> >> b/drivers/gpu/drm/i915/i915_drv.h
> >> index a3b137715604..9833a2055930 100644
> >> --- a/drivers/gpu/drm/i915/i915_drv.h
> >> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >> @@ -2509,6 +2509,7 @@ struct drm_i915_cmd_table {
> >>  #define SKL_REVID_F0  0x5
> >>  
> >>  #define BXT_REVID_A0  0x0
> >> +#define BXT_REVID_A1  0x1
> >>  #define BXT_REVID_B0  0x3
> >>  #define BXT_REVID_C0  0x9
> >>  
> >> diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> >> b/drivers/gpu/drm/i915/i915_gem.c
> >> index f0cfbb9ee12c..fd2d880656b2 100644
> >> --- a/drivers/gpu/drm/i915/i915_gem.c
> >> +++ b/drivers/gpu/drm/i915/i915_gem.c
> >> @@ -3757,7 +3757,7 @@ int i915_gem_set_caching_ioctl(struct drm_device 
> >> *dev, void *data,
> >> * cacheline, whereas normally such cachelines would get
> >> * invalidated.
> >> */
> >> -  if (IS_BROXTON(dev) && INTEL_REVID(dev) < BXT_REVID_B0)
> >> +  if (IS_BROXTON(dev) && INTEL_REVID(dev) <= BXT_REVID_A1)
> >>return -ENODEV;
> >>  
> >>level = I915_CACHE_LLC;
> >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
> >> b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> index 036b42bae827..863aa5c82466 100644
> >> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> @@ -161,7 +161,7 @@ static int host2guc_sample_forcewake(struct intel_guc 
> >> *guc,
> >>data[0] = HOST2GUC_ACTION_SAMPLE_FORCEWAKE;
> >>/* WaRsDisableCoarsePowerGating:skl,bxt */
> >>if (!intel_enable_rc6(dev_priv->dev) ||
> >> -  (IS_BROXTON(dev) && (INTEL_REVID(dev) < BXT_REVID_B0)) ||
> >> +  (IS_BROXTON(dev) && (INTEL_REVID(dev) <= BXT_REVID_A1)) ||
> >>(IS_SKL_GT3(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)) ||
> >>(IS_SKL_GT4(dev) && (INTEL_REVID(dev) <= SKL_REVID_E0)))
> >>data[1] = 0;
> >> diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> >> 

Re: [Intel-gfx] [DMC_REDESIGN_V2 07/14] drm/i915/gen9: Simplify csr loading failure printing.

2015-10-06 Thread Marc Herbert

On 30/09/15 07:28, Imre Deak wrote:
> On Wed, 2015-08-26 at 16:58 +0530, Animesh Manna wrote:
> > -void i915_firmware_load_error_print(const char *fw_path, int err)
> > -{
> > -   DRM_ERROR("failed to load firmware %s (%d)\n", fw_path, err);
> > -
> > -   /*
> > -    * If the reason is not known assume -ENOENT since that's the most
> > -    * usual failure mode.
> > -    */
> > -   if (!err)
> > -   err = -ENOENT;
> > -
> > -   if (!(IS_BUILTIN(CONFIG_DRM_I915) && err == -ENOENT))
> > -   return;
> > -
> > -   DRM_ERROR(
> > - "The driver is built-in, so to load the firmware you need to\n"
> > - "include it either in the kernel (see CONFIG_EXTRA_FIRMWARE) or\n"
> > - "in your initrd/initramfs image.\n");
> > -}
> > -
> 
> The point here was to clarify the reason why the loading failed, since
> that caused quite a confusion. It was a separate function since the same
> could've been called from the GuC loader too. I think the error message
> would be still useful.


Agreed 100%. The code of this function was a bit confusing; however, this
error message has proved very useful "in the field" many times already.
Please preserve the message.





Re: [Intel-gfx] [PATCH 1/5] drm/i915/kbl: Add Kabylake PCI ID

2015-10-06 Thread Ville Syrjälä
On Tue, Oct 06, 2015 at 12:09:17PM +0300, Jani Nikula wrote:
> On Tue, 06 Oct 2015, Rodrigo Vivi  wrote:
> > From: Deepak S 
> >
> > v2: separate out device info into different GT (Damien)
> > > v3: Add is_kabylake to the KBL gt3 structure (Damien)
> > Sort the platforms in older -> newer order (Damien)
> >
> > Reviewed-by: Damien Lespiau 
> > Signed-off-by: Deepak S 
> > Signed-off-by: Damien Lespiau 
> > Signed-off-by: Rodrigo Vivi 
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c | 33 -
> >  drivers/gpu/drm/i915/i915_drv.h |  2 ++
> >  include/drm/i915_pciids.h   | 29 +
> >  3 files changed, 63 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c 
> > b/drivers/gpu/drm/i915/i915_drv.c
> > index 1cb6b82..f42102d 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -394,6 +394,34 @@ static const struct intel_device_info 
> > intel_broxton_info = {
> > IVB_CURSOR_OFFSETS,
> >  };
> >  
> > +static const struct intel_device_info intel_kabylake_info = {
> > +   .is_preliminary = 1,
> > +   .is_skylake = 1,
> 
> Now's the time to call the shots, is this really a good idea or not? See
> VLV vs. CHV, we (okay, the royal we) still confuse ourselves with
> IS_VALLEYVIEW.
> 
> Granted, 74 call sites for IS_SKYLAKE(), all of those would need to be
> patched. We'd need something like "is skylake family" including SKL and
> KBL.

That would be just gen>=9 in most cases, with potentially BXT handled
first as the special case. So maybe not much better than IS_SKYLAKE
being true for KBL.

Maybe it's best to just have IS_KABYLAKE and go through all the
IS_SKYLAKE checks and add KBL where needed (or change to gen>=9 if
that's OK).

I'm pretty sure I once did a patch to split IS_CHERRYVIEW from
IS_VALLEYVIEW, but I'm not sure where I stashed the work. Maybe I
should dig it up to get rid of the bad example?

> Some of them might be changed to be more like feature flags, which
> is something we've decided we need to do more anyway.

I've also thought about adding a chipset/platform enum or something where
the platforms would sit in some sort of logical order. So mostly a .gen
except we could actually tell apart the .5 gens and whatnot. Might make
the code a bit nicer since we would say eg. '>= HSW' instead of
'gen >= 8 || HSW', but maybe it wouldn't do much else for us. VLV/CHV
would still be fairly problematic since the display gen is so different
from the gt gen.
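
A rough sketch of that ordering idea, purely for illustration. Every name
below (sketch_platform, sketch_device_info and the two helpers) is
hypothetical and does not exist in the driver; the point is only that an
enum sorted from older to newer lets a check say ">= HSW" directly, and
lets KBL be told apart from SKL without a shared flag:

```c
#include <stdbool.h>

/* Hypothetical platform list, sorted oldest to newest; not real i915 code. */
enum sketch_platform {
	SKETCH_IVB,
	SKETCH_HSW,
	SKETCH_BDW,
	SKETCH_SKL,
	SKETCH_KBL,
};

/* Hypothetical device info carrying both the platform and the coarse gen. */
struct sketch_device_info {
	enum sketch_platform platform;
	int gen;
};

/* True for the given platform and everything newer than it. */
static inline bool sketch_platform_at_least(const struct sketch_device_info *info,
					    enum sketch_platform p)
{
	return info->platform >= p;
}

/* Exact match, so a KBL check is never satisfied by SKL (or vice versa). */
static inline bool sketch_is_platform(const struct sketch_device_info *info,
				      enum sketch_platform p)
{
	return info->platform == p;
}
```

With something like this, the '>= HSW' case becomes
sketch_platform_at_least(info, SKETCH_HSW), while the VLV/CHV problem stays
unsolved, since one linear order can't express both display and gt gen.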

> 
> BR,
> Jani.
> 
> 
> > +   .is_kabylake = 1,
> > +   .gen = 9, .num_pipes = 3,
> > +   .need_gfx_hws = 1, .has_hotplug = 1,
> > +   .ring_mask = RENDER_RING | BSD_RING | BLT_RING | VEBOX_RING,
> > +   .has_llc = 1,
> > +   .has_ddi = 1,
> > +   .has_fbc = 1,
> > +   GEN_DEFAULT_PIPEOFFSETS,
> > +   IVB_CURSOR_OFFSETS,
> > +};
> > +
> > +static const struct intel_device_info intel_kabylake_gt3_info = {
> > +   .is_preliminary = 1,
> > +   .is_skylake = 1,
> > +   .is_kabylake = 1,
> > +   .gen = 9, .num_pipes = 3,
> > +   .need_gfx_hws = 1, .has_hotplug = 1,
> > +   .ring_mask = RENDER_RING | BSD_RING | BLT_RING | VEBOX_RING | BSD2_RING,
> > +   .has_llc = 1,
> > +   .has_ddi = 1,
> > +   .has_fbc = 1,
> > +   GEN_DEFAULT_PIPEOFFSETS,
> > +   IVB_CURSOR_OFFSETS,
> > +};
> > +
> >  /*
> >   * Make sure any device matches here are from most specific to most
> >   * general.  For example, since the Quanta match is based on the subsystem
> > @@ -434,7 +462,10 @@ static const struct intel_device_info 
> > intel_broxton_info = {
> > INTEL_SKL_GT1_IDS(&intel_skylake_info), \
> > INTEL_SKL_GT2_IDS(&intel_skylake_info), \
> > INTEL_SKL_GT3_IDS(&intel_skylake_gt3_info), \
> > -   INTEL_BXT_IDS(&intel_broxton_info)
> > +   INTEL_BXT_IDS(&intel_broxton_info), \
> > +   INTEL_KBL_GT1_IDS(&intel_kabylake_info),\
> > +   INTEL_KBL_GT2_IDS(&intel_kabylake_info),\
> > +   INTEL_KBL_GT3_IDS(&intel_kabylake_gt3_info)
> >  
> >  static const struct pci_device_id pciidlist[] = {  /* aka */
> > INTEL_PCI_IDS,
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h 
> > b/drivers/gpu/drm/i915/i915_drv.h
> > index 824e724..f7e9d7e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -765,6 +765,7 @@ struct intel_csr {
> > func(is_valleyview) sep \
> > func(is_haswell) sep \
> > func(is_skylake) sep \
> > +   func(is_kabylake) sep \
> > func(is_preliminary) sep \
> > func(has_fbc) sep \
> > func(has_pipe_cxsr) sep \
> > @@ -2464,6 +2465,7 @@ struct drm_i915_cmd_table {
> >  #define IS_BROADWELL(dev)  (!INTEL_INFO(dev)->is_valleyview && 
> > IS_GEN8(dev))
> >  #define IS_SKYLAKE(dev)(INTEL_INFO(dev)->is_skylake)
> >  #define IS_BROXTON(dev)(!INTEL_INFO(dev)->is_skylake && IS_GEN9(dev))
> > +#define IS_KABYLAKE(dev)   (INTEL_INFO(dev)->is_kabylake)
> >  #define IS_MOBILE(dev) (INTEL_INFO(dev)->is_mobile)
> >  #defi

Re: [Intel-gfx] [PATCH 3/5] drm/i915/kbl: Kabylake A0 is based on Skylake H0.

2015-10-06 Thread Rodrigo Vivi
cc'ing Ben to get his opinion...

On Tue, Oct 6, 2015 at 10:43 AM Vivi, Rodrigo 
wrote:

> On Tue, 2015-10-06 at 12:24 +0300, Jani Nikula wrote:
> > On Tue, 06 Oct 2015, Rodrigo Vivi  wrote:
> > > > Kabylake is gen 9.5 derived from Skylake H0 stepping.
> > >
> > > So we don't need pre-production Skylake workaround and also
> > > firmware loading will use SKL H0 offsets.
> > >
> > > Signed-off-by: Rodrigo Vivi 
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h | 7 ++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > b/drivers/gpu/drm/i915/i915_drv.h
> > > index 7374a0d..580c005 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -2436,7 +2436,6 @@ struct drm_i915_cmd_table {
> > >  })
> > >  #define INTEL_INFO(p)  (&__I915__(p)->info)
> > >  #define INTEL_DEVID(p) (INTEL_INFO(p)->device_id)
> > > -#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision)
> > >
> > >  #define IS_I830(dev)   (INTEL_DEVID(dev) == 0x3577)
> > >  #define IS_845G(dev)   (INTEL_DEVID(dev) == 0x2562)
> > > @@ -2508,6 +2507,9 @@ struct drm_i915_cmd_table {
> > >
> > >  #define IS_PRELIMINARY_HW(intel_info) ((intel_info)
> > > ->is_preliminary)
> > >
> > > +#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision +
> > > \
> > > +IS_KABYLAKE(p) ? 7 : 0)
> > > +
> >
> > I am not fond of this at all. It will be really confusing that
> > ->revision is different from INTEL_REVID when checking the
> > workarounds,
> > and that you'll be using SKL_REVID_* to match KBL revision
> > ids.
>
> this is exactly one of the reasons why I did this sum in this way so
> they never match...
>
> > Additionally, we'll probably want to start removing SKL workarounds
> > before KBL workarounds.
>
> I believe this is another discussion... On HSW BDW I remember I was
> removing old Wa as it was no longer needed, but on SKL I saw this REVID
> and I believed the idea was to let them there since some devs might be
> using preliminary platforms yet for other reasons... I don't see a
> problem of letting the old W/a there.
>
> >
> > Others may disagree, but I'd like KBL revid checks be different from
> > SKL.
> >
> > >  #define SKL_REVID_A0   (0x0)
> > >  #define SKL_REVID_B0   (0x1)
> > >  #define SKL_REVID_C0   (0x2)
> > > @@ -2515,6 +2517,9 @@ struct drm_i915_cmd_table {
> > >  #define SKL_REVID_E0   (0x4)
> > >  #define SKL_REVID_F0   (0x5)
> > >
> > > +/* KBL A0 is based on SKL H0 */
> > > +#define KBL_REVID_A0   (0x7)
> >
> > You can't compare this against INTEL_REVID() now can you...? Or is
> > this
> > not the one in the spec? Confused already.
>
> Yes, this is confusing indeed. It seems that we have many levels of
> steppings (according to platform guys) and this platform stepping
> returning 0 is our KBL A0, but this correspond to our internal gpu
> stepping H0 (same going to skl h0).
>
> Like dmc firmware loading for instance we need to load the firmware for
> stepping 7.
>
> So yes, this definition matches BSPec KBL A0.
>
> >
> > BR,
> > Jani.
> >
> > > +
> > >  #define BXT_REVID_A0   (0x0)
> > >  #define BXT_REVID_B0   (0x3)
> > >  #define BXT_REVID_C0   (0x9)
> > > --
> > > 2.4.3
> > >


Re: [Intel-gfx] [PATCH 3/5] drm/i915/kbl: Kabylake A0 is based on Skylake H0.

2015-10-06 Thread Ben Widawsky
On Tue, Oct 06, 2015 at 08:51:13PM +, Rodrigo Vivi wrote:
> cc'ing Ben to get his opinion...
> 

Of course anything is possible wrt the delta of KBL features vs SKL. With the
knowledge we have, we can make a pretty educated guess that there will be no
changes, and with an equally high level of confidence say that if there are
changes, they will be very minor and self contained.

I am in favor of this minimalistic patch myself. I think both the result, and
reduced amount of churn make this patch favorable to the requested alternative.
Some finer comments below.

> On Tue, Oct 6, 2015 at 10:43 AM Vivi, Rodrigo 
> wrote:
> 
> > On Tue, 2015-10-06 at 12:24 +0300, Jani Nikula wrote:
> > > On Tue, 06 Oct 2015, Rodrigo Vivi  wrote:
> > > > > Kabylake is gen 9.5 derived from Skylake H0 stepping.

Don't call it 9.5... some people don't like that... Just say it's a SKL
derivative.

> > > >
> > > > So we don't need pre-production Skylake workaround and also
> > > > firmware loading will use SKL H0 offsets.

In fact we know some of these workarounds to be harmful.

> > > >
> > > > Signed-off-by: Rodrigo Vivi 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.h | 7 ++-
> > > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > index 7374a0d..580c005 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -2436,7 +2436,6 @@ struct drm_i915_cmd_table {
> > > >  })
> > > >  #define INTEL_INFO(p)  (&__I915__(p)->info)
> > > >  #define INTEL_DEVID(p) (INTEL_INFO(p)->device_id)
> > > > -#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision)
> > > >
> > > >  #define IS_I830(dev)   (INTEL_DEVID(dev) == 0x3577)
> > > >  #define IS_845G(dev)   (INTEL_DEVID(dev) == 0x2562)
> > > > @@ -2508,6 +2507,9 @@ struct drm_i915_cmd_table {
> > > >
> > > >  #define IS_PRELIMINARY_HW(intel_info) ((intel_info)
> > > > ->is_preliminary)
> > > >
> > > > +#define INTEL_REVID(p) (__I915__(p)->dev->pdev->revision +
> > > > \
> > > > +IS_KABYLAKE(p) ? 7 : 0)
> > > > +
> > >
> > > I am not fond of this at all. It will be really confusing that
> > > ->revision is different from INTEL_REVID when checking the
> > > workarounds,
> > > and that you'll be using SKL_REVID_* to match KBL revision
> > > ids.
> >
> > this is exactly one of the reasons why I did this sum in this way so
> > they never match...
> >

Jani, I do understand your distaste with this patch. However, I think this is a
very reasonable, and more importantly, readable wart in the code. It's very
obvious what and why the macro does what it does.

> > > Additionally, we'll probably want to start removing SKL workarounds
> > > before KBL workarounds.
> >

I don't see a reason for this. Maybe you have thought of something I haven't?
We're not using any of the early SKL workarounds on KBL as a result of this
patch, so it shouldn't matter.

> > I believe this is another discussion... On HSW BDW I remember I was
> > removing old Wa as it was no longer needed, but on SKL I saw this REVID
> > and I believed the idea was to let them there since some devs might be
> > using preliminary platforms yet for other reasons... I don't see a
> > problem of letting the old W/a there.
> >

I'm very much in favor of killing pre-production support (not that anyone
asked).

> > >
> > > Others may disagree, but I'd like KBL revid checks be different from
> > > SKL.
> > >
> > > >  #define SKL_REVID_A0   (0x0)
> > > >  #define SKL_REVID_B0   (0x1)
> > > >  #define SKL_REVID_C0   (0x2)
> > > > @@ -2515,6 +2517,9 @@ struct drm_i915_cmd_table {
> > > >  #define SKL_REVID_E0   (0x4)
> > > >  #define SKL_REVID_F0   (0x5)
> > > >
> > > > +/* KBL A0 is based on SKL H0 */
> > > > +#define KBL_REVID_A0   (0x7)
> > >
> > > You can't compare this against INTEL_REVID() now can you...? Or is
> > > this
> > > not the one in the spec? Confused already.
> >
> > Yes, this is confusing indeed. It seems that we have many levels of
> > steppings (according to platform guys) and this platform stepping
> > returning 0 is our KBL A0, but this correspond to our internal gpu
> > stepping H0 (same going to skl h0).
> >
> > Like dmc firmware loading for instance we need to load the firmware for
> > stepping 7.
> >
> > So yes, this definition matches BSPec KBL A0.
> >

Maybe amend the comment to say that the actual PCI header has revid 0 (or
whatever it was). With that, it's pretty clear - and yes, it is a value which
can and should be used to compare with INTEL_REVID, but as stated above, the
comparison isn't needed, and if/when it is, we can/should revisit the more
intrusive change you're suggesting.
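
One thing that may be worth double-checking in the INTEL_REVID macro as
quoted: in C, '+' binds tighter than '?:', so "a + b ? 7 : 0" parses as
"(a + b) ? 7 : 0" rather than "a + (b ? 7 : 0)". A minimal standalone
sketch of the difference, using stand-in macros (REVID_AS_QUOTED and
REVID_PARENTHESIZED are made up for illustration and are not the real i915
definitions):

```c
/*
 * Stand-ins for illustration only. 'rev' plays the role of the PCI
 * revision and 'is_kbl' the role of IS_KABYLAKE(p).
 */

/* Same shape as the quoted macro: evaluates to 7 or 0, never rev + 7. */
#define REVID_AS_QUOTED(rev, is_kbl)		((rev) + (is_kbl) ? 7 : 0)

/* Extra parentheses give the offset behaviour the commit message describes. */
#define REVID_PARENTHESIZED(rev, is_kbl)	((rev) + ((is_kbl) ? 7 : 0))
```

With the first form a non-KBL part at revision 2 would report 7, and one at
revision 0 would report 0; with the second form those report 2 and 0, and
KBL revision 0 reports 7.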


[Intel-gfx] [PATCH] drm: Check fb against plane size rather than CRTC mode for pageflip

2015-10-06 Thread Matt Roper
The legacy pageflip ioctl calls drm_crtc_check_viewport() to determine
whether the framebuffer being flipped is big enough to fill the display
it is being flipped to.  However some drivers support "windowing" of
their primary planes (i.e., a primary plane that does not cover the
entire CRTC dimensions); in such situations we can wind up rejecting
valid pageflips of buffers that are smaller than the display mode, but
still large enough to fill the entire primary plane.

What we really want to be comparing against for pageflips is the size of
the primary plane, which can be found in crtc->primary->state for atomic
drivers (and drivers in the process of transitioning to atomic).  There
are no non-atomic drivers that support primary plane windowing at the
moment, so we'll continue to use the current behavior of looking at the
CRTC mode size on drivers that don't have a crtc->primary->state.  We'll
also continue to use the existing logic for SetCrtc, which is the other
callsite for drm_crtc_check_viewport(), since legacy modesets reprogram
the primary plane and remove windowing.

Note that the existing code was checking a crtc->invert_dimensions field
to determine whether the width/height of the mode needed to be swapped.
A bonus of checking plane size is that the source width/height we get
already take rotation into account so that check is no longer necessary
when using the plane size.
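
To illustrate the comparison described above, here is a minimal userspace
sketch, not the kernel code itself. The names sketch_fb,
sketch_fixed_to_uint and sketch_viewport_fits are hypothetical, and all
sizes are taken as unsigned to keep the arithmetic simple. It shows why a
buffer that is too small for the CRTC mode can still be large enough for a
windowed primary plane:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the framebuffer fields that matter here. */
struct sketch_fb {
	unsigned int width;
	unsigned int height;
};

/* Plane source sizes are 16.16 fixed point; keep only the integer part. */
static inline unsigned int sketch_fixed_to_uint(uint32_t fixed)
{
	return fixed >> 16;
}

/* True if a w x h viewport panned to (x, y) lies entirely inside fb. */
static inline bool sketch_viewport_fits(unsigned int w, unsigned int h,
					unsigned int x, unsigned int y,
					const struct sketch_fb *fb)
{
	/* Size checks come first so the subtractions below cannot wrap. */
	if (w > fb->width || h > fb->height)
		return false;

	return x <= fb->width - w && y <= fb->height - h;
}
```

For a 1920x1080 mode and a 960x540 buffer, checking against the mode size
fails while checking against a 960x540 plane size succeeds, which is the
case the pageflip path previously rejected.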

Testcase: igt/universal-plane-gen9-features-pipe-#
Reported-by: Tvrtko Ursulin 
Cc: Tvrtko Ursulin 
Cc: Ville Syrjälä 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/drm_crtc.c | 87 +-
 1 file changed, 71 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index e600a5f..35cd4dc 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -2534,9 +2534,76 @@ void drm_crtc_get_hv_timing(const struct drm_display_mode *mode,
 }
 EXPORT_SYMBOL(drm_crtc_get_hv_timing);
 
+static int check_viewport(int hdisplay,
+ int vdisplay,
+ int x,
+ int y,
+ bool invert_dimensions,
+ const struct drm_framebuffer *fb)
+{
+   if (invert_dimensions)
+   swap(hdisplay, vdisplay);
+
+   if (hdisplay > fb->width ||
+   vdisplay > fb->height ||
+   x > fb->width - hdisplay ||
+   y > fb->height - vdisplay) {
+   DRM_DEBUG_KMS("Invalid fb size %ux%u for CRTC viewport %ux%u+%d+%d%s.\n",
+ fb->width, fb->height, hdisplay, vdisplay, x, y,
+ invert_dimensions ? " (inverted)" : "");
+   return -ENOSPC;
+   }
+
+   return 0;
+}
+
+/**
+ * drm_plane_check_viewport - Checks that a framebuffer is big enough for the
+ * plane's viewport
+ * @plane: Plane that framebuffer will be displayed on
+ * @x: x panning
+ * @y: y panning
+ * @fb: framebuffer to check size of
+ *
+ * Atomic drivers (or transitioning drivers that support proper plane state)
+ * may call this function on any plane.  Non-atomic drivers may only call this
+ * for the primary plane while the CRTC is active (we'll assume that the
+ * primary plane covers the entire CRTC in that case).
+ */
+int drm_plane_check_viewport(const struct drm_plane *plane,
+int x,
+int y,
+const struct drm_framebuffer *fb)
+
+{
+   struct drm_crtc *crtc = plane->crtc;
+   int hdisplay, vdisplay;
+
+   if (WARN_ON(plane->state == NULL &&
+   plane->type != DRM_PLANE_TYPE_PRIMARY))
+   return -EINVAL;
+
+   /*
+* Non-atomic drivers may not have valid plane state to look at.  But
+* those drivers also don't support windowing of the primary plane, so
+* we can fall back to looking at the mode of the owning CRTC.
+*/
+   if (plane->state) {
+   hdisplay = plane->state->src_w >> 16;
+   vdisplay = plane->state->src_h >> 16;
+   } else if (WARN_ON(!crtc)) {
+   hdisplay = vdisplay = 0;
+   } else {
+   drm_crtc_get_hv_timing(&crtc->mode, &hdisplay, &vdisplay);
+   }
+
+   return check_viewport(hdisplay, vdisplay, x, y, false, fb);
+}
+EXPORT_SYMBOL(drm_plane_check_viewport);
+
 /**
  * drm_crtc_check_viewport - Checks that a framebuffer is big enough for the
- * CRTC viewport
+ * CRTC viewport when running in the specified mode
  * @crtc: CRTC that framebuffer will be displayed on
  * @x: x panning
  * @y: y panning
@@ -2553,20 +2620,8 @@ int drm_crtc_check_viewport(const struct drm_crtc *crtc,
 
drm_crtc_get_hv_timing(mode, &hdisplay, &vdisplay);
 
-   if (crtc->invert_dimensions)
-   swap(hdisplay, vdisplay);
-
-   if (hdisplay > fb->width ||
-   vdisplay > fb->height ||
-   x > fb->width 

[Intel-gfx] [PATCH i-g-t] kms_universal_plane: Add gen9-specific test

2015-10-06 Thread Matt Roper
Gen9 adds some new capabilities not present on previous platforms
(primary plane windowing, 90/270 rotation, etc.).  Add a new subtest to
check how these new features interact with the use of the universal
plane API.

For now we just check whether pageflips work as expected in a windowed
setting.  We may want to add some rotation testing in future patches.

Signed-off-by: Matt Roper 
---
 tests/kms_universal_plane.c | 107 
 1 file changed, 107 insertions(+)

diff --git a/tests/kms_universal_plane.c b/tests/kms_universal_plane.c
index b233166..b06b51e 100644
--- a/tests/kms_universal_plane.c
+++ b/tests/kms_universal_plane.c
@@ -54,6 +54,13 @@ typedef struct {
struct igt_fb red_fb, blue_fb;
 } pageflip_test_t;
 
+typedef struct {
+   data_t *data;
+   int x, y;
+   int w, h;
+   struct igt_fb biggreen_fb, smallred_fb, smallblue_fb;
+} gen9_test_t;
+
 static void
 functional_test_init(functional_test_t *test, igt_output_t *output, enum pipe pipe)
 {
@@ -637,6 +644,101 @@ cursor_leak_test_pipe(data_t *data, enum pipe pipe, igt_output_t *output)
 }
 
 static void
+gen9_test_init(gen9_test_t *test, igt_output_t *output, enum pipe pipe)
+{
+   data_t *data = test->data;
+   drmModeModeInfo *mode;
+
+   igt_output_set_pipe(output, pipe);
+
+   mode = igt_output_get_mode(output);
+   test->w = mode->hdisplay / 2;
+   test->h = mode->vdisplay / 2;
+   test->x = mode->hdisplay / 4;
+   test->y = mode->vdisplay / 4;
+
+   /* Initial framebuffer of full CRTC size */
+   igt_create_color_fb(data->drm_fd, mode->hdisplay, mode->vdisplay,
+   DRM_FORMAT_XRGB,
+   LOCAL_DRM_FORMAT_MOD_NONE,
+   0.0, 1.0, 0.0,
+   &test->biggreen_fb);
+
+   /* Framebuffers that only cover a quarter of the CRTC size */
+   igt_create_color_fb(data->drm_fd, test->w, test->h,
+   DRM_FORMAT_XRGB,
+   LOCAL_DRM_FORMAT_MOD_NONE,
+   1.0, 0.0, 0.0,
+   &test->smallred_fb);
+   igt_create_color_fb(data->drm_fd, test->w, test->h,
+   DRM_FORMAT_XRGB,
+   LOCAL_DRM_FORMAT_MOD_NONE,
+   0.0, 0.0, 1.0,
+   &test->smallblue_fb);
+}
+
+static void
+gen9_test_fini(gen9_test_t *test, igt_output_t *output)
+{
+   igt_remove_fb(test->data->drm_fd, &test->biggreen_fb);
+   igt_remove_fb(test->data->drm_fd, &test->smallred_fb);
+   igt_remove_fb(test->data->drm_fd, &test->smallblue_fb);
+
+   igt_output_set_pipe(output, PIPE_ANY);
+   igt_display_commit2(&test->data->display, COMMIT_LEGACY);
+}
+
+/*
+ * Test features specific to gen9+ platforms (i.e., primary plane
+ * windowing)
+ */
+static void
+gen9_test_pipe(data_t *data, enum pipe pipe, igt_output_t *output)
+{
+   gen9_test_t test = { .data = data };
+   igt_plane_t *primary;
+
+   int ret = 0;
+
+   igt_skip_on(data->gen < 9);
+   igt_skip_on(pipe >= data->display.n_pipes);
+
+   igt_output_set_pipe(output, pipe);
+
+   gen9_test_init(&test, output, pipe);
+
+   primary = igt_output_get_plane(output, IGT_PLANE_PRIMARY);
+
+   /* Start with a full-screen primary plane */
+   igt_plane_set_fb(primary, &test.biggreen_fb);
+   igt_display_commit2(&data->display, COMMIT_LEGACY);
+
+   /* Set primary to windowed size/position */
+   igt_plane_set_fb(primary, &test.smallblue_fb);
+   igt_plane_set_position(primary, test.x, test.y);
+   igt_plane_set_size(primary, test.w, test.h);
+   igt_display_commit2(&data->display, COMMIT_UNIVERSAL);
+
+   /*
+* SetPlane update to another framebuffer of the same size
+* should succeed
+*/
+   igt_plane_set_fb(primary, &test.smallred_fb);
+   igt_plane_set_position(primary, test.x, test.y);
+   igt_plane_set_size(primary, test.w, test.h);
+   igt_display_commit2(&data->display, COMMIT_UNIVERSAL);
+
+   /* PageFlip should also succeed */
+   ret = drmModePageFlip(data->drm_fd, output->config.crtc->crtc_id,
+ test.smallblue_fb.fb_id, 0, NULL);
+   igt_assert_eq(ret, 0);
+
+   igt_plane_set_fb(primary, NULL);
+   igt_plane_set_position(primary, 0, 0);
+   gen9_test_fini(&test, output);
+}
+
+static void
 run_tests_for_pipe(data_t *data, enum pipe pipe)
 {
igt_output_t *output;
@@ -660,6 +762,11 @@ run_tests_for_pipe(data_t *data, enum pipe pipe)
  kmstest_pipe_name(pipe))
for_each_connected_output(&data->display, output)
cursor_leak_test_pipe(data, pipe, output);
+
+   igt_subtest_f("universal-plane-gen9-features-pipe-%s",
+ kmstest_pipe_name(pipe))
+   for_each_connected_output(&d

[Intel-gfx] [PATCH] igt/kms_addfb_basic: New subtest to check for fb modifier and tiling mode mismatch

2015-10-06 Thread Vivek Kasireddy
This new subtest will validate a Y-tiled object's tiling mode against
its associated fb modifier.

Cc: Tvrtko Ursulin 
Signed-off-by: Vivek Kasireddy 
---
 tests/kms_addfb_basic.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tests/kms_addfb_basic.c b/tests/kms_addfb_basic.c
index d466e4d..7ca1add 100644
--- a/tests/kms_addfb_basic.c
+++ b/tests/kms_addfb_basic.c
@@ -373,6 +373,15 @@ static void addfb25_ytile(int fd, int gen)
f.handles[0] = gem_bo;
}
 
+   igt_subtest("addfb25-Y-tiled-X-modifier-mismatch") {
+   igt_require(gen >= 9);
+   igt_require_fb_modifiers(fd);
+   gem_set_tiling(fd, gem_bo, I915_TILING_Y, 1024*4);
+
+   f.modifier[0] = LOCAL_I915_FORMAT_MOD_X_TILED;
+   igt_assert(drmIoctl(fd, LOCAL_DRM_IOCTL_MODE_ADDFB2, &f) < 0 && errno == EINVAL);
+   }
+
igt_subtest("addfb25-Y-tiled") {
igt_require_fb_modifiers(fd);
 
-- 
2.4.3


