date:20170111

Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio

2017-01-11 Thread Takashi Iwai

On Wed, 11 Jan 2017 08:39:13 +0100,
Daniel Vetter wrote:
> 
> On Tue, Jan 10, 2017 at 9:49 AM, Takashi Iwai  wrote:
> > On Tue, 10 Jan 2017 09:45:31 +0100,
> >> >-Original Message-
> >> >From: Takashi Iwai [mailto:ti...@suse.de]
> >> >Sent: Tuesday, January 10, 2017 4:19 PM
> >> >To: Yang, Libin 
> >> >Cc: Daniel Vetter ; intel-gfx 
> >> >;
> >> >Nikula, Jani ; alsa-de...@alsa-project.org; 
> >> >Lin,
> >> >Mengdong 
> >> >Subject: Re: [alsa-devel] [PATCH v4 0/3] support DP MST audio
> >> >
> >> >On Mon, 09 Jan 2017 07:22:55 +0100,
> >> >Yang, Libin wrote:
> >> >>
> >> >> Hi Takashi,
> >> >>
> >> >> It seems the patches for DP MST in gfx is not merged into Linus branch.
> >> >>
> >> >> Do we have plan to merge gfx branch manually and review the patches for
> >> >audio? Or we will wait the DP MST patches for i915 merged into Linus 
> >> >branch?
> >> >
> >> >Sorry, this was delayed due to the vacation.
> >> >Now I applied these three patches to topic/hda-dp-mst branch based on
> >> >4.10-rc2, and it was merged to for-next branch.
> >> >
> >> >  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git topic/hda-dp-mst
> >>
> >> Thanks. These audio patches are based on the i915 dp mst patches. Without
> >> I915 driver patches support, it will make errors.
> >
> > Ah yeah, I forgot it.  So I removed it from for-next branch, but
> > topic/hda-dp-mst branch is kept so that it can be merged to i915
> > tree.
> >
> > Let me know if there is any i915 branch I can pull into sound tree.
> 
> Aw, I didn't know that the dependency goes this way round, the dp mst
> patches (I'm not even sure which ones they are) on the i915 are just
> on the general pile. So not anywhere near a place where I can make a
> topic branch.
> 
> I guess what needs to be done now is a cherry-picked list of just the
> patches we need, on top of -rc3, that I can then pull into
> drm-intel.git plus send a pull request for that to Takashi. That means
> the patches are twice in drm-intel.git, but if we cherry-pick
> reference them correctly then that should be all ok and can't really
> be helped.
> 
> Or we just delay the audio side for 4.12, dp mst audio support is 4
> years late anyway, so one more release won't hurt that much ;-)

Well, considering the number of patches, I guess we can do it the other way
round: basically it's fine to apply Libin's latest patches to the drm tree
for 4.11, if that makes things easier.  I don't think these changes will
conflict much with others during 4.11 development.  If they do, I can pull
from some stable point in the drm tree.

Does it work for you?  If yes, feel free to apply these three sound
patches to drm or i915 tree with my ack.

Reviewed-by: Takashi Iwai 


thanks,

Takashi
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 4/6] drm/dp: Introduce DP MST topology manager state to track DP link bw

2017-01-11 Thread Daniel Vetter

On Sat, Jan 07, 2017 at 12:35:36AM +, Pandiyan, Dhinakaran wrote:
> On Thu, 2017-01-05 at 09:24 +0100, Daniel Vetter wrote:
> > On Thu, Jan 05, 2017 at 03:54:54AM +, Pandiyan, Dhinakaran wrote:
> > > On Wed, 2017-01-04 at 19:20 +, Pandiyan, Dhinakaran wrote:
> > > > On Wed, 2017-01-04 at 10:33 +0100, Daniel Vetter wrote:
> > > > > On Tue, Jan 03, 2017 at 01:01:49PM -0800, Dhinakaran Pandiyan wrote:
> > > > > > Link bandwidth is shared between multiple display streams in DP MST
> > > > > > configurations. The DP MST topology manager structure maintains the 
> > > > > > shared
> > > > > > link bandwidth for a primary link directly connected to the GPU. 
> > > > > > For atomic
> > > > > > modesetting drivers, checking if there is sufficient link bandwidth 
> > > > > > for a
> > > > > > mode needs to be done during the atomic_check phase to avoid failed
> > > > > > modesets. Let's encapsulate the available link bw information in a 
> > > > > > state
> > > > > > structure so that bw can be allocated and released atomically for 
> > > > > > each of
> > > > > > the ports sharing the primary link.
> > > > > > 
> > > > > > Signed-off-by: Dhinakaran Pandiyan 
> > > > > 
> > > > > Overall issue with the patch is that dp helpers now have 2 places 
> > > > > where
> > > > > available_slots is stored: One for atomic drivers in ->state, and the
> > > > > legacy one. I think it'd be good to rework the legacy codepaths (i.e.
> > > > > drm_dp_find_vcpi_slots) to use mgr->state->avail_slots, and remove
> > > > > mgr->avail_slots entirely.
> > > > 
> > > > PATCH 2/6 does that. mgr->avail_slots is not updated in the legacy code
> > > > path, so the check turns out to be against mgr->total_slots. So, I did
> > > > just that, albeit explicitly. 
> > 
> > Ah right, I missed that.
> > 
> > > > > > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> > > > > > index fd2d971..7ac5ed6 100644
> > > > > > --- a/include/drm/drm_atomic.h
> > > > > > +++ b/include/drm/drm_atomic.h
> > > > > > @@ -153,6 +153,11 @@ struct __drm_connnectors_state {
> > > > > > struct drm_connector_state *state;
> > > > > >  };
> > > > > >  
> > > > > > +struct __drm_dp_mst_topology_state {
> > > > > > +   struct drm_dp_mst_topology_mgr *ptr;
> > > > > > +   struct drm_dp_mst_topology_state *state;
> > > > > 
> > > > > One way to fix that control inversion I mentioned above is to use 
> > > > > void*
> > > > > > pointers here, and then have callbacks for atomic_destroy and 
> > > > > swap_state
> > > > > on top. A bit more shuffling, but we could then use that for other 
> > > > > driver
> > > > > private objects.
> > > > > 
> > > > > Other option is to stuff it into intel_atomic_state.
> > > 
> > > Hmm... I think I understand what you are saying. The core atomic
> > > functions like swap_state should not be able to alter the topology
> > > manager's current state?
> > > 
> > > Did you mean something like this - https://paste.ubuntu.com/23743485/ ?
> > 
> > Not quite yet, here's what I had in mind as a sketch:
> > 
> > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> > index 2e28fdca9c3d..6ce704b1e900 100644
> > --- a/include/drm/drm_atomic.h
> > +++ b/include/drm/drm_atomic.h
> > @@ -154,6 +154,17 @@ struct __drm_connnectors_state {
> > struct drm_connector_state *state;
> >  };
> >  
> > +struct drm_private_state_funcs {
> > +   void (*swap)(void *obj, void *state);
> > +   void (*destroy_state)(void *state);
> > +};
> > +
> > +struct __drm_private_obj_state {
> > +   struct obj *ptr;
> > +   struct obj_state *state;
> 
> Thanks for the sketch Daniel, I have a couple of questions.
> Should this be void *obj and void *obj_state? 

Yes :-)

> 
> > +   struct drm_private_state_funcs *funcs;
> > +}
> > +
> >  /**
> >   * struct drm_atomic_state - the global state object for atomic updates
> >   * @ref: count of all references to this state (will not be freed until 
> > zero)
> > @@ -178,6 +189,8 @@ struct drm_atomic_state {
> > struct __drm_crtcs_state *crtcs;
> > int num_connector;
> > struct __drm_connnectors_state *connectors;
> > +   int num_private_objs;
> > +   struct __drm_private_obj_state *private_objs;
> >  
> > struct drm_modeset_acquire_ctx *acquire_ctx;
> >  
> > @@ -414,6 +427,19 @@ void drm_state_dump(struct drm_device *dev, struct 
> > drm_printer *p);
> >  (__i)++)   \
> > for_each_if (plane_state)
> >  
> > +/* The magic here is that if obj and obj_state have the right type, then 
> > this
> > + * will automatically cast to the right type. Since we allow any kind of 
> > private
> > + * object mixed into the same array, runtime type casting is done using the
> > + * funcs pointer.
> > + */
> > +#define for_each_private_obj(__state, obj, obj_state, __i, funcs)
> > +   for ((__i) = 0; \
> > +(__i) < (__state)->num_private_objs && 
>
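
For readers following the private-object discussion above, here is a rough userspace model of the idea. It is only a sketch under stated assumptions, not the DRM implementation: `private_state_funcs`, `private_obj_state`, `atomic_state`, `mst_mgr`, `mst_state` and the helper names are hypothetical stand-ins, and the `swap` callback takes a `void **` (unlike the quoted sketch) so that the old state can be handed back to the array entry. The `void *` object and state pointers follow the correction confirmed in the reply.

```c
#include <stdlib.h>

/*
 * Hypothetical userspace model of the private-object idea; none of these
 * names are claimed to match the eventual DRM API.
 */
struct private_state_funcs {
	/* Exchange the object's current state with *state. */
	void (*swap)(void *obj, void **state);
	/* Free a state that is no longer referenced. */
	void (*destroy_state)(void *state);
};

struct private_obj_state {
	void *obj;	/* real type is known only to the funcs callbacks */
	void *state;
	const struct private_state_funcs *funcs;
};

struct atomic_state {
	int num_private_objs;
	struct private_obj_state *private_objs;
};

/* Example private object: an MST manager tracking free time slots. */
struct mst_state {
	int avail_slots;
};

struct mst_mgr {
	struct mst_state *state;
};

static void mst_swap(void *obj, void **state)
{
	struct mst_mgr *mgr = obj;
	struct mst_state *old = mgr->state;

	mgr->state = *state;
	*state = old;
}

static void mst_destroy_state(void *state)
{
	free(state);
}

static const struct private_state_funcs mst_funcs = {
	.swap = mst_swap,
	.destroy_state = mst_destroy_state,
};

/* Core-side helpers: they never look inside obj or state. */
static void swap_private_objs(struct atomic_state *s)
{
	for (int i = 0; i < s->num_private_objs; i++) {
		struct private_obj_state *p = &s->private_objs[i];

		p->funcs->swap(p->obj, &p->state);
	}
}

static void destroy_private_objs(struct atomic_state *s)
{
	for (int i = 0; i < s->num_private_objs; i++) {
		struct private_obj_state *p = &s->private_objs[i];

		p->funcs->destroy_state(p->state);
		p->state = NULL;
	}
}
```

The point of the indirection is that the core loops only ever call through `funcs`, so different kinds of private objects can share one array without the core knowing their types.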

Re: [Intel-gfx] [PATCH] [RFC i-g-t] Test Design to verify mipi enable/disable sequence.

2017-01-11 Thread Daniel Vetter

On Mon, Jan 09, 2017 at 11:00:02AM +0200, Jani Nikula wrote:
> On Sat, 07 Jan 2017, Yadav Jyoti  wrote:
> > From: Jenkins Val 
> >
> 
> This place here is for the commit message, where you should explain
> *why* we need this change.
> 
> Where do you get the XML file? Do you write it manually? How do you
> manage them? The kernel will execute the sequences from the VBT, not
> from your XML file, so you'll have a problem of maintaining XML files
> for each machine you ever run this test on.
> 
> I'm also not thrilled about adding special debug messages that the test
> depends on finding in dmesg. The test also doesn't actually do anything
> to cause the sequences to be run, so you expect some other, undefined
> tests to have been run, the dmesg from that run captured, and saved to a
> file that you feed to this test.
> 
> I think the design is rather fragile.

Also, igt are black-box testcases, they should not assume any specific
implementation. Every time we break that, we are adding api (even if it's
just for tests in debugfs), and that means coordination issues.

On top of that Chris is building up a neat selftest infrastructure which
helps to cover anything which cannot easily be tested using a blackbox
approach.

Furthermore writing the same stuff twice (like the xml and vbt sequence
this test seems to rely on) isn't validation, it's just typing stuff
twice. Real validation tries to verify (preferably orthogonal)
invariants, or at least entirely independent approaches to the implementation.
Another similar case was the color manager testcase, which did not check
functional outcome using crc, but instead checked that the kernel wrote
the right register values in the right places. That's not independent
validation, and hence not really useful as a testcase.

If you want to validate dsi, then either there needs to be some indication
from the sink (on edp we have sink crcs and status flags) that things went
well, or we need a special testing board like chamelium (but that doesn't
do dsi unfortunately). Everything else is already covered by the generic
modeset testcases and the kernel's selftest.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

2017-01-11 Thread Daniel Vetter

On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> disabling an L3SQ optimization that has huge performance implications
> and is unlikely to be necessary for the correct functioning of usual
> graphic workloads.  Userspace is free to re-enable the workaround on
> demand, and is generally in a better position to determine whether the
> workaround is necessary than the DRM is (e.g. only during the
> execution of compute kernels that rely on both L3 fences and HDC R/W
> requests).
> 
> The same workaround seems to apply to BDW (at least to production
> stepping G1) and SKL as well (the internal workaround database claims
> that it does for all steppings, while the BSpec workaround table only
> mentions pre-production steppings), but the DRM doesn't do anything
> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> when it sees fit.  Do the same on KBL platforms.
> 
> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> This is followed by a regression of 35% and 10% respectively for the
> same benchmarks and platform caused by my recent patch series
> switching userspace to use the dataport constant cache instead of the
> sampler to implement uniform pull constant loads, which caused us to
> hit more heavily the L3 cache (and on platforms other than KBL had the
> opposite effect of improving performance of the same two benchmarks).
> The overall effect on KBL of this change combined with the recent
> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
> was affected by the constant cache changes (though it improved as it
> did on other platforms rather than regressing), but is not
> significantly affected by this patch (with statistical significance of
> 5% and sample size 20).
> 
> v2: Drop some more code to avoid unused variable warning.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
> Signed-off-by: Francisco Jerez 
> Cc: Eero Tamminen 
> Cc: Jani Nikula 
> Cc: Mika Kuoppala 
> Cc: beig...@lists.freedesktop.org

Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
for compute kernels? Are the patches for mesa compute/beignet
ready&reviewed?
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_lrc.c| 10 --
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  8 
>  2 files changed, 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
> b/drivers/gpu/drm/i915/intel_lrc.c
> index 6db246a..656e0a3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -970,18 +970,8 @@ static inline int gen8_emit_flush_coherentl3_wa(struct 
> intel_engine_cs *engine,
>   uint32_t *batch,
>   uint32_t index)
>  {
> - struct drm_i915_private *dev_priv = engine->i915;
>   uint32_t l3sqc4_flush = (0x4040 | GEN8_LQSC_FLUSH_COHERENT_LINES);
>  
> - /*
> -  * WaDisableLSQCROPERFforOCL:kbl
> -  * This WA is implemented in skl_init_clock_gating() but since
> -  * this batch updates GEN8_L3SQCREG4 with default value we need to
> -  * set this bit here to retain the WA during flush.
> -  */
> - if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
> - l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS;
> -
>   wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
>  MI_SRM_LRM_GLOBAL_GTT));
>   wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 0971ac3..7cb2ab4 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1095,14 +1095,6 @@ static int kbl_init_workarounds(struct intel_engine_cs 
> *engine)
>   WA_SET_BIT_MASKED(HDC_CHICKEN0,
> HDC_FENCE_DEST_SLM_DISABLE);
>  
> - /* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
> -  * involving this register should also be added to WA batch as required.
> -  */
> - if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
> - /* WaDisableLSQCROPERFforOCL:kbl */
> - I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
> -GEN8_LQSC_RO_PERF_DIS);
> -
>   /* WaToEnableHwFixForPushConstHWBug:kbl */
>   if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
>   WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
> -- 
> 2.10.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio

2017-01-11 Thread Daniel Vetter

On Wed, Jan 11, 2017 at 09:00:27AM +0100, Takashi Iwai wrote:
> On Wed, 11 Jan 2017 08:39:13 +0100,
> Daniel Vetter wrote:
> > 
> > On Tue, Jan 10, 2017 at 9:49 AM, Takashi Iwai  wrote:
> > > On Tue, 10 Jan 2017 09:45:31 +0100,
> > >> >-Original Message-
> > >> >From: Takashi Iwai [mailto:ti...@suse.de]
> > >> >Sent: Tuesday, January 10, 2017 4:19 PM
> > >> >To: Yang, Libin 
> > >> >Cc: Daniel Vetter ; intel-gfx 
> > >> >;
> > >> >Nikula, Jani ; 
> > >> >alsa-de...@alsa-project.org; Lin,
> > >> >Mengdong 
> > >> >Subject: Re: [alsa-devel] [PATCH v4 0/3] support DP MST audio
> > >> >
> > >> >On Mon, 09 Jan 2017 07:22:55 +0100,
> > >> >Yang, Libin wrote:
> > >> >>
> > >> >> Hi Takashi,
> > >> >>
> > >> >> It seems the patches for DP MST in gfx is not merged into Linus 
> > >> >> branch.
> > >> >>
> > >> >> Do we have plan to merge gfx branch manually and review the patches 
> > >> >> for
> > >> >audio? Or we will wait the DP MST patches for i915 merged into Linus 
> > >> >branch?
> > >> >
> > >> >Sorry, this was delayed due to the vacation.
> > >> >Now I applied these three patches to topic/hda-dp-mst branch based on
> > >> >4.10-rc2, and it was merged to for-next branch.
> > >> >
> > >> >  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git topic/hda-dp-mst
> > >>
> > >> Thanks. These audio patches are based on the i915 dp mst patches. Without
> > >> I915 driver patches support, it will make errors.
> > >
> > > Ah yeah, I forgot it.  So I removed it from for-next branch, but
> > > topic/hda-dp-mst branch is kept so that it can be merged to i915
> > > tree.
> > >
> > > Let me know if there is any i915 branch I can pull into sound tree.
> > 
> > > Aw, I didn't know that the dependency goes this way round, the dp mst
> > patches (I'm not even sure which ones they are) on the i915 are just
> > on the general pile. So not anywhere near a place where I can make a
> > topic branch.
> > 
> > I guess what needs to be done now is a cherry-picked list of just the
> > patches we need, on top of -rc3, that I can then pull into
> > drm-intel.git plus send a pull request for that to Takashi. That means
> > the patches are twice in drm-intel.git, but if we cherry-pick
> > reference them correctly then that should be all ok and can't really
> > be helped.
> > 
> > Or we just delay the audio side for 4.12, dp mst audio support is 4
> > > years late anyway, so one more release won't hurt that much ;-)
> 
> Well, considering the number of patches, I guess we can do it the other way
> round: basically it's fine to apply Libin's latest patches to the drm tree
> for 4.11, if that makes things easier.  I don't think these changes will
> conflict much with others during 4.11 development.  If they do, I can pull
> from some stable point in the drm tree.
> 
> Does it work for you?  If yes, feel free to apply these three sound
> patches to drm or i915 tree with my ack.
> 
> Reviewed-by: Takashi Iwai 

Works for me too. Libin, can you pls resend (somehow this thread
disconnected from the patches for me), with Takashi's r-b + ack for
merging through drm-intel?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio

2017-01-11 Thread Yang, Libin

Hi Daniel,

OK, I will resend the patches tomorrow. Thanks.

Hi Takashi,

In case you still need the patches for i915, it is on
git://anongit.freedesktop.org/drm-tip:drm-tip

My patches are:

commit 9a148a96fc3a654ddcf142a7ab7db37b972ba5d8
drm/i915/debugfs: add dp mst info

commit 9935f7fa2854355203e3976762eecfb218079aac
drm/i915: abstract ddi being audio enabled

commit 7f9e77545b92bcb894b8e2be5646535e8ba8da9e
drm/i915: enable dp mst audio

commit 31613268c0a6f7abdb0c19487a084249bcf203ba
drm/i915/audio: extend get_saved_enc() to support more scenarios

commit f55d23be11ed15f493957246f3b81fc530e79d70
drm/i915/audio: extend audio sync rate support for DP MST

And you may still need the patches in gfx to fix the flicker issue, which 
Dhinakaran
can help.

Regards,
Libin



[Intel-gfx] [PATCH] drm/i915/bxt: Add MST support when do DPLL calculation

2017-01-11 Thread Lee, Shawn C

From: "Lee, Shawn C" 

A kernel oops was triggered when a DP MST monitor/hub was connected.
The DP MST patch series is already upstream, so MST should be
supported here as well. With this change, an MST monitor displays
normally on the bxt platform.

Cc: Jani Nikula 
Reviewed-by: Cooper Chiou 
Reviewed-by: Gary C Wang 
Reviewed-by: Ciobanu, Nathan D 
Reviewed-by: Herbert, Marc 
Reviewed-by: Sripada, Radhakrishna 

Signed-off-by: Shawn Lee 
---
 drivers/gpu/drm/i915/intel_dpll_mgr.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index c92a2558beb4..1a1d99d266ed 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -1855,7 +1855,8 @@ bool bxt_ddi_dp_set_dpll_hw_state(int clock,
return NULL;
 
if ((encoder->type == INTEL_OUTPUT_DP ||
-encoder->type == INTEL_OUTPUT_EDP) &&
+encoder->type == INTEL_OUTPUT_EDP ||
+encoder->type == INTEL_OUTPUT_DP_MST ) &&
!bxt_ddi_dp_set_dpll_hw_state(clock, &dpll_hw_state))
return NULL;
 
-- 
1.7.9.5


[Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v2)
---
 lib/scatterlist.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..4fc54801cd29 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
unsigned int offset, unsigned long size,
gfp_t gfp_mask)
 {
-   unsigned int chunks;
+   const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
+   unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
/* compute number of contiguous chunks */
chunks = 1;
-   for (i = 1; i < n_pages; ++i)
-   if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+   seg_len = PAGE_SIZE;
+   for (i = 1; i < n_pages; ++i) {
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
++chunks;
+   seg_len = PAGE_SIZE;
+   } else {
+   seg_len += PAGE_SIZE;
+   }
+   }
 
ret = sg_alloc_table(sgt, chunks, gfp_mask);
if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
/* merging chunks and putting them into the scatterlist */
cur_page = 0;
for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-   unsigned long chunk_size;
+   unsigned int chunk_size;
unsigned int j;
 
/* look for the end of the current chunk */
+   seg_len = PAGE_SIZE;
for (j = cur_page + 1; j < n_pages; ++j)
-   if (page_to_pfn(pages[j]) !=
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[j]) !=
page_to_pfn(pages[j - 1]) + 1)
break;
+   else
+   seg_len += PAGE_SIZE;
 
chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-   sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+   sg_set_page(s, pages[cur_page],
+   min_t(unsigned long, size, chunk_size), offset);
size -= chunk_size;
offset = 0;
cur_page = j;
-- 
2.7.4
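
As a rough illustration of the counting loop in the patch above, here is a userspace sketch. It assumes a 4096-byte page and works on an array of page frame numbers instead of `struct page` pointers; `SKETCH_PAGE_SIZE`, `sketch_max_segment` and `sketch_count_chunks` are hypothetical names, and the cap is passed as a parameter only so that it is easy to exercise with small values.

```c
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Largest page-aligned length that still fits in an unsigned int. */
static unsigned int sketch_max_segment(void)
{
	return UINT_MAX - (UINT_MAX % SKETCH_PAGE_SIZE);
}

/*
 * Count how many scatterlist entries are needed for n_pages pages.
 * A new entry starts when the next page is not physically contiguous
 * with the previous one, or when the current entry has already reached
 * max_segment bytes.
 */
static unsigned int sketch_count_chunks(const unsigned long *pfns,
					unsigned int n_pages,
					unsigned int max_segment)
{
	unsigned int chunks, seg_len, i;

	if (n_pages == 0)
		return 0;

	chunks = 1;
	seg_len = SKETCH_PAGE_SIZE;
	for (i = 1; i < n_pages; i++) {
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			chunks++;
			seg_len = SKETCH_PAGE_SIZE;
		} else {
			seg_len += SKETCH_PAGE_SIZE;
		}
	}

	return chunks;
}
```

Because the cap is rounded down to a page multiple, a segment can only reach it exactly, never step over it, which is presumably why v3 requires page alignment.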


[Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets within a page, unsigned int is
wide enough.

Also convert callers which were using unsigned long locally,
adding the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Stanislawski 
Cc: Matt Porter 
Cc: Alexandre Bounine 
Cc: linux-me...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Marek Szyprowski  (v1)
Reviewed-by: Chris Wilson 
Reviewed-by: Mauro Carvalho Chehab 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c   | 4 ++--
 include/linux/scatterlist.h| 2 +-
 lib/scatterlist.c  | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c 
b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, 
unsigned long vaddr,
 {
struct vb2_dc_buf *buf;
struct frame_vector *vec;
-   unsigned long offset;
+   unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, 
unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;
 
-   offset = vaddr & ~PAGE_MASK;
+   offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c 
b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 * offset within the internal buffer specified by handle parameter.
 */
if (xfer->loc_addr) {
-   unsigned long offset;
+   unsigned int offset;
long pinned;
 
-   offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+   offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
 
page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, 
unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
-   unsigned long offset, unsigned long size,
+   unsigned int offset, unsigned long size,
gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
-   unsigned long offset, unsigned long size,
+   unsigned int offset, unsigned long size,
gfp_t gfp_mask)
 {
unsigned int chunks;
-- 
2.7.4
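
To make the "offsets within a page" argument above concrete, here is a small userspace sketch. It assumes a 4096-byte page; `SKETCH_PAGE_SIZE`, `sketch_offset_in_page` and `sketch_pages_spanned` are hypothetical stand-ins for the kernel's `offset_in_page()` and the `PAGE_ALIGN(...) >> PAGE_SHIFT` arithmetic seen in the rapidio hunk, not the kernel code itself.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Keep only the low bits of the address. The result is always smaller
 * than the page size, so it fits in an unsigned int regardless of how
 * wide the address is.
 */
static unsigned int sketch_offset_in_page(uintptr_t addr)
{
	return (unsigned int)(addr & (SKETCH_PAGE_SIZE - 1));
}

/* Number of pages touched by a buffer of length bytes starting at offset. */
static unsigned long long sketch_pages_spanned(unsigned int offset,
					       unsigned long long length)
{
	return (offset + length + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For example, a 4096-byte buffer that starts 0x234 bytes into a page touches two pages, while the same buffer starting on a page boundary touches one.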


[Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Drivers like i915 benefit from being able to control the maximum
size of the coalesced sg segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function
which gives them that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson 
Reviewed-by: Chris Wilson  (v2)
---
 include/linux/scatterlist.h | 11 ++---
 lib/scatterlist.c   | 59 +++--
 2 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
  size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 4fc54801cd29..df375ff18587 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *an array of pages
- * @sgt:   The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages:   Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size:   Number of valid bytes in the buffer (after offset)
- * @gfp_mask:  GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *  an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:   GFP allocation mask
  *
  *  Description:
  *Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask)
 {
-   const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
int ret;
struct scatterlist *s;
 
+   if (WARN_ON(offset_in_page(max_segment)))
+   return -EINVAL;
+
/* compute number of contiguous chunks */
chunks = 1;
seg_len = PAGE_SIZE;
@@ -444,6 +447,36 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:   GFP allocation mask
+ *
+ *  Description:
+ *Allocate and initialize an sg table from a list of pages. Contiguous
+ *r

[Intel-gfx] [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").
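
For illustration only (not part of the patch): a userspace sketch of the
rounding the segment-size helper performs. The swiotlb_limit parameter is a
stand-in for whatever swiotlb_max_segment() would report, and the 4096-byte
page size is an assumption for the example.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the illustration */

/*
 * swiotlb_limit stands in for the value swiotlb_max_segment() would return;
 * zero is taken to mean "no bounce-buffer limit". The result is always a
 * whole number of pages, as __sg_alloc_table_from_pages requires.
 */
static unsigned int sketch_segment_size(unsigned int swiotlb_limit)
{
	unsigned int size = swiotlb_limit;

	if (size == 0)
		size = UINT_MAX;

	/* Round down to a page boundary. */
	return size - (size % SKETCH_PAGE_SIZE);
}
```

Without a limit the result is UINT_MAX rounded down to a page boundary, and
a limit that is already page aligned passes through unchanged.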

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v4)
---
 drivers/gpu/drm/i915/i915_drv.h | 10 +
 drivers/gpu/drm/i915/i915_gem.c |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 40 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2b325032fedc..a944ff0c5c68 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2594,6 +2594,16 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) < (__iter).max) ||   \
 ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+   unsigned int size = swiotlb_max_segment();
+
+   if (size == 0)
+   size = UINT_MAX;
+
+   return rounddown(size, PAGE_SIZE);
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 13c02015709c..421827069a2f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2248,7 +2248,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
-   unsigned int max_segment;
+   unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
 
@@ -2259,10 +2259,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-   max_segment = swiotlb_max_segment();
-   if (!max_segment)
-   max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-   struct scatterlist *sg;
-   int ret, n;
-
-   *st = kmalloc(sizeof(**st), GFP_KERNEL);
-   if (*st == NULL)
-   return -ENOMEM;
-
-   if (swiotlb_active()) {
-   ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-   if (ret)
-   goto err;
-
-   for_each_sg((*st)->sgl, sg, num_pages, n)
-   sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-   } else {
-   ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-   0, num_pages << PAGE_SHIFT,
-   GFP_KERNEL);
-   if (ret)
-   goto err;
-   }
-
-   return 0;
-
-err:
-   kfree(*st);
-   *st = NULL;
-   return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+  struct page **pvec, int num_pages)
 {
-   struct sg_table *pages;
+   unsigned int max_segment = i915_sg_segment_size();
+   struct sg_table *st;
int ret;
 
-   ret = st_set_pages(&pages, pvec, num_pages);
-   if (ret)
+   st = kmalloc(sizeof(*st), GFP_KERNEL);
+   if (!st)
+   return ERR_PTR(-ENOMEM);
+
+alloc_table:
+   ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+

[Intel-gfx] [PATCH] drm/probe-helpers: Drop locking from poll_enable

2017-01-11 Thread Daniel Vetter

It was only needed to protect the connector_list walking, see

commit 8c4ccc4ab6f64e859d4ff8d7c02c2ed2e956e07f
Author: Daniel Vetter 
Date:   Thu Jul 9 23:44:26 2015 +0200

drm/probe-helper: Grab mode_config.mutex in poll_init/enable

Unfortunately the commit message of that patch fails to mention that
the new locking check was for the connector_list.

But that requirement disappeared in

commit c36a3254f7857f1ad9badbe3578ccc92be541a8e
Author: Daniel Vetter 
Date:   Thu Dec 15 16:58:43 2016 +0100

drm: Convert all helpers to drm_connector_list_iter

and so we can drop this again.

This fixes a locking inversion on nouveau, where the rpm code needs to
re-enable. But in other places the rpm_get() calls are nested within
the big modeset locks.

While at it, also improve the kerneldoc for these two functions a
notch.

v2: Update the kerneldoc even more to explain that these functions
can't be called concurrently, or bad things happen (Chris).

Cc: Dave Airlie 
Reviewed-by: Chris Wilson 
Cc: Chris Wilson 
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/drm_probe_helper.c   | 51 ++--
 drivers/gpu/drm/i915/intel_hotplug.c |  4 +--
 include/drm/drm_crtc_helper.h|  1 -
 3 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 20f48d1e2785..93381454bdf7 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -115,25 +115,28 @@ static int drm_helper_probe_add_cmdline_mode(struct drm_connector *connector)
 
 #define DRM_OUTPUT_POLL_PERIOD (10*HZ)
 /**
- * drm_kms_helper_poll_enable_locked - re-enable output polling.
+ * drm_kms_helper_poll_enable - re-enable output polling.
  * @dev: drm_device
  *
- * This function re-enables the output polling work without
- * locking the mode_config mutex.
+ * This function re-enables the output polling work, after it has been
+ * temporarily disabled using drm_kms_helper_poll_disable(), for example over
+ * suspend/resume.
  *
- * This is like drm_kms_helper_poll_enable() however it is to be
- * called from a context where the mode_config mutex is locked
- * already.
+ * Drivers can call this helper from their device resume implementation. It is
+ * an error to call this when the output polling support has not yet been set
+ * up.
+ *
+ * Note that calls to enable and disable polling must be strictly ordered, which
+ * is automatically the case when they're only called from suspend/resume
+ * callbacks.
  */
-void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
+void drm_kms_helper_poll_enable(struct drm_device *dev)
 {
bool poll = false;
struct drm_connector *connector;
struct drm_connector_list_iter conn_iter;
unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
 
-   WARN_ON(!mutex_is_locked(&dev->mode_config.mutex));
-
if (!dev->mode_config.poll_enabled || !drm_kms_helper_poll)
return;
 
@@ -163,7 +166,7 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
if (poll)
schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
 }
-EXPORT_SYMBOL(drm_kms_helper_poll_enable_locked);
+EXPORT_SYMBOL(drm_kms_helper_poll_enable);
 
 static enum drm_connector_status
 drm_connector_detect(struct drm_connector *connector, bool force)
@@ -290,7 +293,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
 
/* Re-enable polling in case the global poll config changed. */
if (drm_kms_helper_poll != dev->mode_config.poll_running)
-   drm_kms_helper_poll_enable_locked(dev);
+   drm_kms_helper_poll_enable(dev);
 
dev->mode_config.poll_running = drm_kms_helper_poll;
 
@@ -484,8 +487,12 @@ static void output_poll_execute(struct work_struct *work)
  * This function disables the output polling work.
  *
  * Drivers can call this helper from their device suspend implementation. It is
- * not an error to call this even when output polling isn't enabled or arlready
- * disabled.
+ * not an error to call this even when output polling isn't enabled or already
+ * disabled. Polling is re-enabled by calling drm_kms_helper_poll_enable().
+ *
+ * Note that calls to enable and disable polling must be strictly ordered, which
+ * is automatically the case when they're only called from suspend/resume
+ * callbacks.
  */
 void drm_kms_helper_poll_disable(struct drm_device *dev)
 {
@@ -496,24 +503,6 @@ void drm_kms_helper_poll_disable(struct drm_device *dev)
 EXPORT_SYMBOL(drm_kms_helper_poll_disable);
 
 /**
- * drm_kms_helper_poll_enable - re-enable output polling.
- * @dev: drm_device
- *
- * This function re-enables the output polling work.
- *
- * Drivers can call this helper from their device resume implementation. It is
- * an error to call this when the output polling support has not yet been set
- * up.
- */
-void drm_kms_helper_poll_enable(struct drm_devi

Re: [Intel-gfx] [PATCH] drm/i915/bxt: Add MST support when do DPLL calculation

2017-01-11 Thread Jani Nikula

On Wed, 11 Jan 2017, "Lee, Shawn C"  wrote:
> From: "Lee, Shawn C" 
>
> A kernel oops was triggered when a DP MST monitor/hub was connected.

Copy paste the oops to the commit message please. It's *much* easier to
match bug reports and fixes this way.

There's likely a bug report, or several bug reports about this over at
FDO bugzilla. Any Bugzilla: references we should add?

When was this broken? Which commit does this fix? We should use a Fixes:
tag to identify it, so the fix can be backported to appropriate stable
kernels.

BR,
Jani.

> The DP MST patch series is already upstream, so MST should be
> supported here as well. An MST monitor displays normally with this
> change on the bxt platform.
>
> Cc: Jani Nikula 
> Reviewed-by: Cooper Chiou 
> Reviewed-by: Gary C Wang 
> Reviewed-by: Ciobanu, Nathan D 
> Reviewed-by: Herbert, Marc 
> Reviewed-by: Sripada, Radhakrishna 
>
> Signed-off-by: Shawn Lee 
> ---
>  drivers/gpu/drm/i915/intel_dpll_mgr.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> index c92a2558beb4..1a1d99d266ed 100644
> --- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
> +++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> @@ -1855,7 +1855,8 @@ bool bxt_ddi_dp_set_dpll_hw_state(int clock,
>   return NULL;
>  
>   if ((encoder->type == INTEL_OUTPUT_DP ||
> -  encoder->type == INTEL_OUTPUT_EDP) &&
> +  encoder->type == INTEL_OUTPUT_EDP ||
> +  encoder->type == INTEL_OUTPUT_DP_MST ) &&
>   !bxt_ddi_dp_set_dpll_hw_state(clock, &dpll_hw_state))
>   return NULL;

-- 
Jani Nikula, Intel Open Source Technology Center

[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/bxt: Add MST support when do DPLL calculation

2017-01-11 Thread Patchwork

== Series Details ==

Series: drm/i915/bxt: Add MST support when do DPLL calculation
URL   : https://patchwork.freedesktop.org/series/17815/
State : warning

== Summary ==

Series 17815v1 drm/i915/bxt: Add MST support when do DPLL calculation
https://patchwork.freedesktop.org/api/1.0/series/17815/revisions/1/mbox/

Test kms_force_connector_basic:
Subgroup force-edid:
pass   -> DMESG-WARN (fi-snb-2520m)

fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
a6d6930 drm/i915/bxt: Add MST support when do DPLL calculation

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3472/

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Prefer random replacement before eviction search

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 09:47:41AM +0200, Joonas Lahtinen wrote:
> On Tue, 2017-01-10 at 21:55 +, Chris Wilson wrote:
> > Performing an eviction search can be very, very slow especially for a
> > range restricted replacement. For example, a workload like
> > gem_concurrent_blit will populate the entire GTT and then cause aperture
> > thrashing. Since the GTT is a mix of active and inactive tiny objects,
> > we have to search through almost 400k objects before finding anything
> > inside the mappable region, and as this search is required before every
> > operation performance falls off a cliff.
> > 
> > Instead of performing the full search, we do a trial replacement of the
> > node at a random location fitting the specified restrictions. We lose
> > the strict LRU property of the GTT in exchange for avoiding the slow
> > search (several orders of runtime improvement for gem_concurrent_blit
> > 4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
> > (later) mitigated firstly by only doing replacement if we find no
> > freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
> > it starts thrashing (i.e. the random replacement will only occur from the
> > already inactive set of objects).
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Tvrtko Ursulin 
> > Cc: Joonas Lahtinen 
> 
> 
> 
> > +static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
> > +{
> 
> The usual GEM_BUG_ON dance to make sure the inputs make some sense. Or
> are you relying on the upper level callers?

It was static and the callers were checking, but yeah might as well catch
them whilst we think about it.
 
> > +   u64 range, addr;
> > +
> > +   if (align == 0)
> > +   align = I915_GTT_MIN_ALIGNMENT;
> > +
> > +   range = round_down(end - len, align) - round_up(start, align);
> 
> For example this may cause an odd result.
> 
> > @@ -3629,6 +3655,16 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
> >     if (err != -ENOSPC)
> >     return err;
> >  
> > +   /* No free space, pick a slot at random */
> > +   err = i915_gem_gtt_reserve(vm, node,
> > +      size,
> > +      random_offset(start, end, size, alignment),
> 
> I'd pull this to a line above just to make it more humane to read.

> > +      color,
> > +      flags);
> > +   if (err != -ENOSPC)
> > +   return err;
> > +
> > +   /* Randomly selected placement is pinned, do a search */
> >     err = i915_gem_evict_something(vm, size, alignment, color,
> >        start, end, flags);
> >     if (err)
> 
> I'm bit unsure why it would make such a big difference, but if you've
> been running the numbers. Code itself is all good, so this is;


The pathological case we have is

|<-- 256 MiB aperture (64k objects) -->||<-- 1792 MiB unmappable (448k objects) -->|

Now imagine that the eviction LRU is ordered top-down (just because
pathology meets real life), and that we need to evict an object to make
room inside the aperture. The eviction scan then has to walk the 448k
list before it finds one within range. And now imagine that it has to
search for a new hole between every byte inside the memcpy, for several
simultaneous clients.
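
For illustration only: the 64k and 448k object counts in the diagram follow
from dividing each region by the object size. The sketch assumes 4 KiB
objects (taken from the 4KiB-global-gtt test name in the commit message);
the helper name objects_in_region is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (1024ull * 1024ull)

/* Number of objects of obj_size bytes that fill a region of region_mib MiB. */
static uint64_t objects_in_region(uint64_t region_mib, uint64_t obj_size)
{
	return region_mib * MIB / obj_size;
}
```

objects_in_region(256, 4096) is 65536 (64k) and objects_in_region(1792, 4096)
is 458752 (448k), so in the worst case the scan walks seven unmappable
objects for every one that could satisfy an aperture request.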

If there are a few holes in the unmappable region, we also have a
similar problem with hole skipping inside the drm_mm range search. This
is mitigated by using DRM_MM_INSERT_LOW, but only once we have that
support in drm_mm. Right now, the drm_mm search is also having to walk
the MRU rejecting the holes above the full aperture.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

2017-01-11 Thread Patchwork

== Series Details ==

Series: series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
URL   : https://patchwork.freedesktop.org/series/17816/
State : success

== Summary ==

Series 17816v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17816/revisions/1/mbox/


fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
91daf71e drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
653fe3c lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
9fee41b lib/scatterlist: Avoid potential scatterlist entry overflow
354e525 lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3473/

[Intel-gfx] [PATCH] drm/i915: Prefer random replacement before eviction search

2017-01-11 Thread Chris Wilson

Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation, performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
free space and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).
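
For illustration only (not part of the patch): a userspace sketch of picking
an aligned trial offset inside [start, end - len]. The random value is passed
in as rnd so the example is deterministic; the names sketch_random_offset,
align_up and align_down are hypothetical, and align is assumed to be a
non-zero power of two. Callers are assumed to guarantee that the range can
hold the node, as the preconditions in the patch do.

```c
#include <assert.h>
#include <stdint.h>

/* align must be a non-zero power of two. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

static uint64_t align_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/*
 * Return an aligned offset such that [offset, offset + len) lies inside
 * [start, end). rnd supplies the randomness; reducing it modulo the number
 * of usable bytes keeps the result within the aligned window.
 */
static uint64_t sketch_random_offset(uint64_t start, uint64_t end,
				     uint64_t len, uint64_t align,
				     uint64_t rnd)
{
	uint64_t lo = align_up(start, align);
	uint64_t hi = align_down(end - len, align);
	uint64_t range = hi - lo;

	if (range)
		start += rnd % range;

	/* start + (rnd % range) is below hi, so rounding up cannot pass hi. */
	return align_up(start, align);
}
```

Because the lowest candidate rounds up to the first aligned slot and every
other candidate stays below the last aligned slot before rounding, the
returned offset is always aligned and never lets the node run past end.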

v2: Ascii-art, and check preconditions

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 52 +
 1 file changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3ea32f79d86..9aa53bdf5a48 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -3581,12 +3582,38 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
return err;
 }
 
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+   u64 range, addr;
+
+   GEM_BUG_ON(range_overflows(start, len, end));
+   GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+   range = round_down(end - len, align) - round_up(start, align);
+   if (range) {
+   if (sizeof(unsigned long) == sizeof(u64)) {
+   addr = get_random_long();
+   } else {
+   addr = get_random_int();
+   if (range > U32_MAX) {
+   addr <<= 32;
+   addr |= get_random_int();
+   }
+   }
+   div64_u64_rem(addr, range, &addr);
+   start += addr;
+   }
+
+   return round_up(start, align);
+}
+
 int i915_gem_gtt_insert(struct i915_address_space *vm,
struct drm_mm_node *node,
u64 size, u64 alignment, unsigned long color,
u64 start, u64 end, unsigned int flags)
 {
u32 search_flag, alloc_flag;
+   u64 offset;
int err;
 
lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3629,6 +3656,31 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
if (err != -ENOSPC)
return err;
 
+   /* No free space, pick a slot at random.
+*
+* There is a pathological case here using a GTT shared between
+* mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+*
+*|<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+* (64k objects) (448k objects)
+*
+* Now imagine that the eviction LRU is ordered top-down (just because
+* pathology meets real life), and that we need to evict an object to
+* make room inside the aperture. The eviction scan then has to walk
+* the list 448k before it finds one within range. And now imagine that
+* it has to search for a new hole between every byte inside the memcpy,
+* for several simultaneous clients.
+*
+* On a full-ppgtt system, if we have run out of available space, there
+* will be lots and lots of objects in the eviction list!
+*/
+   offset = random_offset(start, end,
+  size, alignment ?: I915_GTT_MIN_ALIGNMENT);
+   err = i915_gem_gtt_reserve(vm, node, size, offset, color, flags);
+   if (err != -ENOSPC)
+   return err;
+
+   /* Randomly selected placement is pinned, do a search */
err = i915_gem_evict_something(vm, size, alignment, color,
   start, end, flags);
if (err)
-- 
2.11.0


[Intel-gfx] [PATCH v3] drm/i915: Prefer random replacement before eviction search

2017-01-11 Thread Chris Wilson

Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation, performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
free space and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditions
v3: Rephrase final sentence in comment to explain why we don't bother with
if (i915_is_ggtt(vm)) for preferring random replacement.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 56 +
 1 file changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3ea32f79d86..b320cdd22f2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -3581,12 +3582,38 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
return err;
 }
 
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+   u64 range, addr;
+
+   GEM_BUG_ON(range_overflows(start, len, end));
+   GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+   range = round_down(end - len, align) - round_up(start, align);
+   if (range) {
+   if (sizeof(unsigned long) == sizeof(u64)) {
+   addr = get_random_long();
+   } else {
+   addr = get_random_int();
+   if (range > U32_MAX) {
+   addr <<= 32;
+   addr |= get_random_int();
+   }
+   }
+   div64_u64_rem(addr, range, &addr);
+   start += addr;
+   }
+
+   return round_up(start, align);
+}
+
 int i915_gem_gtt_insert(struct i915_address_space *vm,
struct drm_mm_node *node,
u64 size, u64 alignment, unsigned long color,
u64 start, u64 end, unsigned int flags)
 {
u32 search_flag, alloc_flag;
+   u64 offset;
int err;
 
lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3629,6 +3656,35 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
if (err != -ENOSPC)
return err;
 
+   /* No free space, pick a slot at random.
+*
+* There is a pathological case here using a GTT shared between
+* mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+*
+*|<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+* (64k objects) (448k objects)
+*
+* Now imagine that the eviction LRU is ordered top-down (just because
+* pathology meets real life), and that we need to evict an object to
+* make room inside the aperture. The eviction scan then has to walk
+* the 448k list before it finds one within range. And now imagine that
+* it has to search for a new hole between every byte inside the memcpy,
+* for several simultaneous clients.
+*
+* On a full-ppgtt system, if we have run out of available space, there
+* will be lots and lots of objects in the eviction list! Again,
+* searching that LRU list may be slow if we are also applying any
+* range restrictions (e.g. restriction to low 4GiB) and so, for
+* simplicity and similarity between different GTT, try the single
+* random replacement first.
+*/
+   offset = random_offset(start, end,
+  size, alignment ?: I915_GTT_MIN_ALIGNMENT);
+   err = i915_gem_gtt_reserve(vm, node, size, offset, color, flags);
+   if (err != -ENOSPC)
+   return err;
+
+   /* Randomly selected placement is pinned, do a search */
err = i915_gem_evict_something(vm, size, alignment, color,
   start, end, flags);
if (err)
-- 
2.11.0


[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/3] drm/i915: Use the MRU stack search after evicting (rev3)

2017-01-11 Thread Patchwork

== Series Details ==

Series: series starting with [1/3] drm/i915: Use the MRU stack search after evicting (rev3)
URL   : https://patchwork.freedesktop.org/series/17784/
State : warning

== Summary ==

Series 17784v3 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17784/revisions/3/mbox/

Test kms_force_connector_basic:
Subgroup force-connector-state:
pass   -> DMESG-WARN (fi-snb-2520m)

fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC 
integration manifest
cee7403 drm/i915: Prefer random replacement before eviction search
df15dc1 drm/i915: Extract reserving space in the GTT to a helper
c4a94c9 drm/i915: Use the MRU stack search after evicting

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3475/

[Intel-gfx] [PATCH v2 3/3] drm/i915: Prefer random replacement before eviction search

2017-01-11 Thread Chris Wilson

Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation, performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (a runtime improvement of more than two orders of magnitude for
gem_concurrent_blit 4KiB-global-gtt, e.g. from 5000s to 20s). The loss of
LRU replacement is (later) mitigated firstly by only doing replacement if
we find no free space and secondly by execbuf doing a PIN_NONBLOCK search
first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).
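
As a rough illustration of the trial placement described above, the
following userspace sketch picks an aligned offset at random inside a
restricted range. It is not the driver code: pick_random_offset() and the
align helpers are hypothetical names, rand() stands in for the kernel's
random number helpers, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Round x up to a multiple of align (assumes x + align does not wrap). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return align_down(x + align - 1, align);
}

/*
 * Hypothetical helper: return an aligned offset such that
 * [offset, offset + len) lies inside [start, end).
 */
static uint64_t pick_random_offset(uint64_t start, uint64_t end,
				   uint64_t len, uint64_t align)
{
	uint64_t lo, hi, range, r;

	/* Preconditions: power-of-two alignment, node fits in the range. */
	assert(align && (align & (align - 1)) == 0);
	assert(len <= end && start <= end - len);

	lo = align_up(start, align);      /* lowest valid offset */
	hi = align_down(end - len, align); /* highest valid offset */
	assert(lo <= hi);

	range = hi - lo;
	if (!range)
		return lo; /* only one candidate position */

	/* Combine two rand() calls; not uniform, but enough for a sketch. */
	r = ((uint64_t)rand() << 31) ^ (uint64_t)rand();

	/* lo and range are multiples of align, so the result stays aligned. */
	return lo + align_down(r % (range + 1), align);
}
```

The caller would then try to reserve the node at the returned offset and
only fall back to the full eviction search if that attempt fails.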

v2: Ascii-art, and check preconditions
v3: Rephrase final sentence in comment to explain why we don't bother with
if (i915_is_ggtt(vm)) for preferring random replacement.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 59 -
 1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 92b907f27986..0c48d6286419 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 
@@ -3605,6 +3606,31 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
return err;
 }
 
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+   u64 range, addr;
+
+   GEM_BUG_ON(range_overflows(start, len, end));
+   GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+   range = round_down(end - len, align) - round_up(start, align);
+   if (range) {
+   if (sizeof(unsigned long) == sizeof(u64)) {
+   addr = get_random_long();
+   } else {
+   addr = get_random_int();
+   if (range > U32_MAX) {
+   addr <<= 32;
+   addr |= get_random_int();
+   }
+   }
+   div64_u64_rem(addr, range, &addr);
+   start += addr;
+   }
+
+   return round_up(start, align);
+}
+
 /**
  * i915_gem_gtt_insert - insert a node into an address_space (GTT)
  * @vm - the &struct i915_address_space
@@ -3626,7 +3652,8 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
  * its @size must then fit entirely within the [@start, @end] bounds. The
  * nodes on either side of the hole must match @color, or else a guard page
  * will be inserted between the two nodes (or the node evicted). If no
- * suitable hole is found, then the LRU list of objects within the GTT
+ * suitable hole is found, first a victim is randomly selected and tested
+ * for eviction, otherwise then the LRU list of objects within the GTT
  * is scanned to find the first set of replacement nodes to create the hole.
  * Those old overlapping nodes are evicted from the GTT (and so must be
  * rebound before any future use). Any node that is current pinned cannot
@@ -3644,6 +3671,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
u64 start, u64 end, unsigned int flags)
 {
u32 search_flag, alloc_flag;
+   u64 offset;
int err;
 
lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3686,6 +3714,35 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
if (err != -ENOSPC)
return err;
 
+   /* No free space, pick a slot at random.
+*
+* There is a pathological case here using a GTT shared between
+* mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+*
+*|<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+* (64k objects) (448k objects)
+*
+* Now imagine that the eviction LRU is ordered top-down (just because
+* pathology meets real life), and that we need to evict an object to
+* make room inside the aperture. The eviction scan then has to walk
+* the 448k list before it finds one within range. And now imagine that
+* it has to search for a new hole between every byte inside the memcpy,
+* for several simultaneous clients.
+*
+* On a full-ppgtt system, if we have run out of available space, there
+* will be lots and lots of objects in the eviction list! Again,
+* searching that LRU list may be slow if we are also applying any
+

[Intel-gfx] [PATCH v2 2/3] drm/i915: Extract reserving space in the GTT to a helper

2017-01-11 Thread Chris Wilson

Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.
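
The shape of such a shared helper can be sketched in userspace as below.
This is a toy model, not the i915 implementation: struct toy_vm and the
toy_* functions are hypothetical stand-ins (for the address space,
drm_mm_reserve_node and i915_gem_evict_for_node respectively), and the
address space is reduced to a small array of slots.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLOTS 8

/* Toy address space: each slot is either free, used, or used and pinned. */
struct toy_vm {
	bool used[TOY_SLOTS];
	bool pinned[TOY_SLOTS];
};

/* Claim [first, first + count) if every slot in it is free. */
static int toy_reserve_node(struct toy_vm *vm, size_t first, size_t count)
{
	size_t i;

	if (count == 0 || first >= TOY_SLOTS || count > TOY_SLOTS - first)
		return -EINVAL;

	for (i = first; i < first + count; i++)
		if (vm->used[i])
			return -ENOSPC;

	for (i = first; i < first + count; i++)
		vm->used[i] = true;

	return 0;
}

/* Free every slot in the range, unless one of them is pinned. */
static int toy_evict_for_node(struct toy_vm *vm, size_t first, size_t count)
{
	size_t i;

	for (i = first; i < first + count; i++)
		if (vm->used[i] && vm->pinned[i])
			return -ENOSPC;

	for (i = first; i < first + count; i++)
		vm->used[i] = false;

	return 0;
}

/* Shared helper: reserve, and on -ENOSPC evict the range and retry once. */
static int toy_gtt_reserve(struct toy_vm *vm, size_t first, size_t count)
{
	int err;

	err = toy_reserve_node(vm, first, count);
	if (err != -ENOSPC)
		return err;

	err = toy_evict_for_node(vm, first, count);
	if (err)
		return err;

	return toy_reserve_node(vm, first, count);
}
```

Callers that previously open-coded the reserve, evict, reserve sequence
would call the single helper instead.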

v2: Kerneldoc

Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
Cc: igvt-g-...@lists.01.org
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h|  5 ++--
 drivers/gpu/drm/i915/i915_gem_evict.c  | 33 --
 drivers/gpu/drm/i915/i915_gem_gtt.c| 51 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h|  5 
 drivers/gpu/drm/i915/i915_gem_stolen.c |  7 ++---
 drivers/gpu/drm/i915/i915_trace.h  | 16 +--
 drivers/gpu/drm/i915/i915_vgpu.c   | 33 --
 drivers/gpu/drm/i915/i915_vma.c| 16 ---
 8 files changed, 105 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 89e0038ea26b..a29d138b6906 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3468,8 +3468,9 @@ int __must_check i915_gem_evict_something(struct 
i915_address_space *vm,
  unsigned cache_level,
  u64 start, u64 end,
  unsigned flags);
-int __must_check i915_gem_evict_for_vma(struct i915_vma *vma,
-   unsigned int flags);
+int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
+struct drm_mm_node *node,
+unsigned int flags);
 int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);
 
 /* belongs in i915_gem_gtt.h */
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 6a5415e31acf..50b4645bf627 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -231,7 +231,8 @@ i915_gem_evict_something(struct i915_address_space *vm,
 
 /**
  * i915_gem_evict_for_vma - Evict vmas to make room for binding a new one
- * @target: address space and range to evict for
+ * @vm: address space to evict from
+ * @target: range (and color) to evict for
  * @flags: additional flags to control the eviction algorithm
  *
  * This function will try to evict vmas that overlap the target node.
@@ -239,18 +240,20 @@ i915_gem_evict_something(struct i915_address_space *vm,
  * To clarify: This is for freeing up virtual address space, not for freeing
  * memory in e.g. the shrinker.
  */
-int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
+int i915_gem_evict_for_node(struct i915_address_space *vm,
+   struct drm_mm_node *target,
+   unsigned int flags)
 {
LIST_HEAD(eviction_list);
struct drm_mm_node *node;
-   u64 start = target->node.start;
-   u64 end = start + target->node.size;
+   u64 start = target->start;
+   u64 end = start + target->size;
struct i915_vma *vma, *next;
bool check_color;
int ret = 0;
 
-   lockdep_assert_held(&target->vm->i915->drm.struct_mutex);
-   trace_i915_gem_evict_vma(target, flags);
+   lockdep_assert_held(&vm->i915->drm.struct_mutex);
+   trace_i915_gem_evict_node(vm, target, flags);
 
/* Retire before we search the active list. Although we have
 * reasonable accuracy in our retirement lists, we may have
@@ -258,18 +261,18 @@ int i915_gem_evict_for_vma(struct i915_vma *target, 
unsigned int flags)
 * retiring.
 */
if (!(flags & PIN_NONBLOCK))
-   i915_gem_retire_requests(target->vm->i915);
+   i915_gem_retire_requests(vm->i915);
 
-   check_color = target->vm->mm.color_adjust;
+   check_color = vm->mm.color_adjust;
if (check_color) {
/* Expand search to cover neighbouring guard pages (or lack!) */
-   if (start > target->vm->start)
+   if (start > vm->start)
start -= I915_GTT_PAGE_SIZE;
-   if (end < target->vm->start + target->vm->total)
+   if (end < vm->start + vm->total)
end += I915_GTT_PAGE_SIZE;
}
 
-   drm_mm_for_each_node_in_range(node, &target->vm->mm, start, end) {
+   drm_mm_for_each_node_in_range(node, &vm->mm, start, end) {
/* If we find any non-objects (!vma), we cannot evict them */
if (node->color == I915_COLOR_UNEVICTABLE) {
ret = -ENOSPC;
@@ -285,12 +288,12 @@ int i915_gem_evict_for_vma(struct i915_vma *target, 
unsigned int flags)
 * those as well to make room for our guard pages.
 */
if (check_color) {
-   if (vma->node.start + vma->node.size == 
target->node.start) {
-   if (vma->node.color == target->node.color)
+   if (vma->node.sta

[Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting

2017-01-11 Thread Chris Wilson

When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implementations (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/gvt/aperture_gm.c |  33 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.c| 121 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.h|   5 ++
 drivers/gpu/drm/i915/i915_vma.c|  40 ++-
 4 files changed, 119 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c 
b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 016227d77dd4..a97d56ea3d83 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -41,47 +41,34 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
 {
struct intel_gvt *gvt = vgpu->gvt;
struct drm_i915_private *dev_priv = gvt->dev_priv;
-   u32 alloc_flag, search_flag;
+   unsigned int flags;
u64 start, end, size;
struct drm_mm_node *node;
-   int retried = 0;
int ret;
 
if (high_gm) {
-   search_flag = DRM_MM_SEARCH_BELOW;
-   alloc_flag = DRM_MM_CREATE_TOP;
node = &vgpu->gm.high_gm_node;
size = vgpu_hidden_sz(vgpu);
start = gvt_hidden_gmadr_base(gvt);
end = gvt_hidden_gmadr_end(gvt);
+   flags = PIN_HIGH;
} else {
-   search_flag = DRM_MM_SEARCH_DEFAULT;
-   alloc_flag = DRM_MM_CREATE_DEFAULT;
node = &vgpu->gm.low_gm_node;
size = vgpu_aperture_sz(vgpu);
start = gvt_aperture_gmadr_base(gvt);
end = gvt_aperture_gmadr_end(gvt);
+   flags = PIN_MAPPABLE;
}
 
mutex_lock(&dev_priv->drm.struct_mutex);
-search_again:
-   ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
- node, size, 4096,
- I915_COLOR_UNEVICTABLE,
- start, end, search_flag,
- alloc_flag);
-   if (ret) {
-   ret = i915_gem_evict_something(&dev_priv->ggtt.base,
-  size, 4096,
-  I915_COLOR_UNEVICTABLE,
-  start, end, 0);
-   if (ret == 0 && ++retried < 3)
-   goto search_again;
-
-   gvt_err("fail to alloc %s gm space from host, retried %d\n",
-   high_gm ? "high" : "low", retried);
-   }
+   ret = i915_gem_gtt_insert(&dev_priv->ggtt.base, node,
+ size, 4096, I915_COLOR_UNEVICTABLE,
+ start, end, flags);
mutex_unlock(&dev_priv->drm.struct_mutex);
+   if (ret)
+   gvt_err("fail to alloc %s gm space from host\n",
+   high_gm ? "high" : "low");
+
return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8aca11f5f446..136f90ba95ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -23,10 +23,13 @@
  *
  */
 
+#include 
 #include 
 #include 
+
 #include 
 #include 
+
 #include "i915_drv.h"
 #include "i915_vgpu.h"
 #include "i915_trace.h"
@@ -2032,7 +2035,6 @@ static int gen6_ppgtt_allocate_page_directories(struct 
i915_hw_ppgtt *ppgtt)
struct i915_address_space *vm = &ppgtt->base;
struct drm_i915_private *dev_priv = ppgtt->base.i915;
struct i915_ggtt *ggtt = &dev_priv->ggtt;
-   bool retried = false;
int ret;
 
/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
@@ -2045,29 +2047,14 @@ static int gen6_ppgtt_allocate_page_directories(struct 
i915_hw_ppgtt *ppgtt)
if (ret)
return ret;
 
-alloc:
-   ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, &ppgtt->node,
- GEN6_PD_SIZE, GEN6_PD_ALIGN,
- I915_COLOR_UNEVICTABLE,
- 0, ggtt->base.total,
- DRM_MM_TOPDOWN);
-   if (ret == -ENOSPC && !retried) {
-   ret = i915_gem_evict_something(&ggtt->base,
-  GEN6_PD_SIZE, GEN6_PD_ALIGN,
-  I915_COLOR_UNEVICTABLE,
-

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [v2,1/3] drm/i915: Use the MRU stack search after evicting

2017-01-11 Thread Patchwork

== Series Details ==

Series: series starting with [v2,1/3] drm/i915: Use the MRU stack search after 
evicting
URL   : https://patchwork.freedesktop.org/series/17822/
State : success

== Summary ==

Series 17822v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17822/revisions/1/mbox/


fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC 
integration manifest
0db2997 drm/i915: Prefer random replacement before eviction search
a2d9bf3 drm/i915: Extract reserving space in the GTT to a helper
be9c2e1 drm/i915: Use the MRU stack search after evicting

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3476/

[Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages to a single entry.
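
To make the effect of the cap concrete, here is a small userspace sketch of
the chunk counting with a segment limit. It only mirrors the idea of the
loop in the patch: count_chunks() is a hypothetical name, the pages are
represented by a plain array of page frame numbers, and a 4096 byte page
size is assumed.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/*
 * Count how many scatterlist entries are needed for n_pages pages.
 * Physically contiguous pages are merged into one entry, but an entry
 * is closed once it has reached max_segment bytes.
 */
static unsigned int count_chunks(const unsigned long *pfn, size_t n_pages,
				 unsigned int max_segment)
{
	unsigned int chunks, seg_len;
	size_t i;

	if (n_pages == 0)
		return 0;

	chunks = 1;
	seg_len = SKETCH_PAGE_SIZE;
	for (i = 1; i < n_pages; i++) {
		if (seg_len >= max_segment || pfn[i] != pfn[i - 1] + 1) {
			/* Start a new entry: limit hit or not contiguous. */
			chunks++;
			seg_len = SKETCH_PAGE_SIZE;
		} else {
			seg_len += SKETCH_PAGE_SIZE;
		}
	}

	return chunks;
}
```

With a limit of two pages, four contiguous pages followed by two more
contiguous pages need three entries rather than two; without the limit the
first entry would simply keep growing.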

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
(Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v2)
Cc: Joonas Lahtinen 
---
 include/linux/scatterlist.h |  6 ++
 lib/scatterlist.c   | 25 +++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..15265bb6e5c3 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
 };
 
 /*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * length in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (0xf000)
+
+/*
  * These macros should be used after a dma_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..24beb0965e69 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
unsigned int offset, unsigned long size,
gfp_t gfp_mask)
 {
-   unsigned int chunks;
+   const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+   unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
/* compute number of contiguous chunks */
chunks = 1;
-   for (i = 1; i < n_pages; ++i)
-   if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+   seg_len = PAGE_SIZE;
+   for (i = 1; i < n_pages; ++i) {
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
++chunks;
+   seg_len = PAGE_SIZE;
+   } else {
+   seg_len += PAGE_SIZE;
+   }
+   }
 
ret = sg_alloc_table(sgt, chunks, gfp_mask);
if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
/* merging chunks and putting them into the scatterlist */
cur_page = 0;
for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-   unsigned long chunk_size;
+   unsigned int chunk_size;
unsigned int j;
 
/* look for the end of the current chunk */
+   seg_len = PAGE_SIZE;
for (j = cur_page + 1; j < n_pages; ++j)
-   if (page_to_pfn(pages[j]) !=
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[j]) !=
page_to_pfn(pages[j - 1]) + 1)
break;
+   else
+   seg_len += PAGE_SIZE;
 
chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-   sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+   sg_set_page(s, pages[cur_page],
+   min_t(unsigned long, size, chunk_size), offset);
size -= chunk_size;
offset = 0;
cur_page = j;
-- 
2.7.4


[Intel-gfx] [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Drivers like i915 benefit from being able to control the maximum
size of the coalesced sg segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function,
which gives them that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson 
Reviewed-by: Chris Wilson  (v2)
Cc: Joonas Lahtinen 
---
 include/linux/scatterlist.h | 11 +
 lib/scatterlist.c   | 58 +++--
 2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 15265bb6e5c3..c897533fb85d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
  size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 24beb0965e69..732c6e516657 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int 
nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *an array of pages
- * @sgt:   The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages:   Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size:   Number of valid bytes in the buffer (after offset)
- * @gfp_mask:  GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *  an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:   GFP allocation mask
  *
  *  Description:
  *Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask)
 {
-   const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
int ret;
struct scatterlist *s;
 
+   if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+   return -EINVAL;
+
/* compute number of contiguous chunks */
chunks = 1;
seg_len = PAGE_SIZE;
@@ -444,6 +447,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:   GFP allocation mask
+ *
+ *  Description:
+ *Allocate and initialize an sg table

[Intel-gfx] [PATCH v6] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

2017-01-11 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)
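
The clamping that the v6 note refers to can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the driver
helper: clamp_segment_size() is a hypothetical name, the limit is passed in
as a parameter instead of being queried from swiotlb, and a 4096 byte page
size and 32-bit unsigned int are assumed.

```c
#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Largest page-aligned value that fits in a 32-bit unsigned int. */
#define SKETCH_MAX_SEGMENT (0xffffffffu & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Turn a reported segment limit into a usable, page-aligned size:
 *   0          -> no limit reported, use the largest aligned value
 *   1..4095    -> treated as one page
 *   otherwise  -> rounded down to a whole number of pages
 */
static unsigned int clamp_segment_size(unsigned int limit)
{
	if (limit == 0)
		return SKETCH_MAX_SEGMENT;

	limit -= limit % SKETCH_PAGE_SIZE;
	if (limit < SKETCH_PAGE_SIZE)
		limit = SKETCH_PAGE_SIZE;

	return limit;
}
```

The result is always non-zero and page aligned, which is what the
max_segment check in __sg_alloc_table_from_pages expects.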

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v4)
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 15 +++
 drivers/gpu/drm/i915/i915_gem.c |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a320675a9e71..5646e48a893b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2598,6 +2598,21 @@ static inline struct scatterlist *__sg_next(struct 
scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) < (__iter).max) ||   \
 ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+   unsigned int size = swiotlb_max_segment();
+
+   if (size == 0)
+   return SCATTERLIST_MAX_SEGMENT;
+
+   size = rounddown(size, PAGE_SIZE);
+   /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+   if (size < PAGE_SIZE)
+   size = PAGE_SIZE;
+
+   return size;
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3bf517e2430a..9312284a31e4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2255,7 +2255,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object 
*obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
-   unsigned int max_segment;
+   unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
 
@@ -2266,10 +2266,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object 
*obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-   max_segment = swiotlb_max_segment();
-   if (!max_segment)
-   max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-   struct scatterlist *sg;
-   int ret, n;
-
-   *st = kmalloc(sizeof(**st), GFP_KERNEL);
-   if (*st == NULL)
-   return -ENOMEM;
-
-   if (swiotlb_active()) {
-   ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-   if (ret)
-   goto err;
-
-   for_each_sg((*st)->sgl, sg, num_pages, n)
-   sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-   } else {
-   ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-   0, num_pages << PAGE_SHIFT,
-   GFP_KERNEL);
-   if (ret)
-   goto err;
-   }
-
-   return 0;
-
-err:
-   kfree(*st);
-   *st = NULL;
-   return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+  struct page **pvec, int num_pages)
 {
-   struct sg_table *pages;
+   unsigned int max_segment = i915_sg_segment_size();
+   struct sg_table *st;
int ret;
 
-   ret = st_set_pages(&pages, pvec, num_pages);
-   if

Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting

2017-01-11 Thread Joonas Lahtinen

On Wed, 2017-01-11 at 11:23 +, Chris Wilson wrote:
> When we evict from the GTT to make room for an object, the hole we
> create is put onto the MRU stack inside the drm_mm range manager. On the
> next search pass, we can speed up a PIN_HIGH allocation by referencing
> that stack for the new hole.
> 
> v2: Pull together the 3 identical implements (ahem, a couple were
> outdated) into a common routine for allocating a node and evicting as
> necessary.
> v3: Detect invalid calls to i915_gem_gtt_insert()
> v4: kerneldoc
> 
> Signed-off-by: Chris Wilson 
> Reviewed-by: Joonas Lahtinen 



> +/**
> + * i915_gem_gtt_insert - insert a node into an address_space (GTT)
> + * @vm - the &struct i915_address_space

mixing &struct and @struct, I guess you meant &struct in later line
too.

> + * @node - the @struct drm_mm_node (typicallay i915_vma.mode)

"typically" and "i915_vma.node"

> + * @size - how much space to allocate inside the GTT,
> + * must be #I915_GTT_PAGE_SIZE aligned
> + * @alignment - required alignment of starting offset, may be 0 but
> + *  if specified, this must be a power-of-two and at least
> + *  #I915_GTT_MIN_ALIGNMENT
> + * @color - color to apply to node
> + * @start - start of any range restriction inside GTT (0 for all),
> + *  must be #I915_GTT_PAGE_SIZE aligned
> + * @end - end of any range restriction inside GTT (U64_MAX for all),
> + *must be #I915_GTT_PAGE_SIZE aligned
> + * @flags - control search and eviction behaviour
> + *
> + * i915_gem_gtt_insert() first searches for an available hole into which
> + * is can insert the node. The hole address is aligned to @alignment and
> + * its @size must then fit entirely within the [@start, @end] bounds. The
> + * nodes on either side of the hole must match @color, or else a guard page
> + * will be inserted between the two nodes (or the node evicted). If no
> + * suitable hole is found, then the LRU list of objects within the GTT
> + * is scanned to find the first set of replacement nodes to create the hole.
> + * Those old overlapping nodes are evicted from the GTT (and so must be
> + * rebound before any future use). Any node that is current pinned cannot

"currently"

> + * be evicted (see i915_vma_pin()). Similar if the node's VMA is currently
> + * active and #PIN_NONBLOCK is specified, that node is also skipped when
> + * searching for an eviction candidate. See i915_gem_evict_something() for
> + * the gory details on the eviction algorithm.
> + *
> + * Returns: 0 on success, -ENOSPC if no suitable hole is found, -EINTR if
> + * asked to wait for eviction and interrupted.
> + */

With those fixed, good to merge.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

2017-01-11 Thread Mika Kuoppala

Daniel Vetter  writes:

> On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
>> The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> disabling an L3SQ optimization that has huge performance implications
>> and is unlikely to be necessary for the correct functioning of usual
>> graphic workloads.  Userspace is free to re-enable the workaround on
>> demand, and is generally in a better position to determine whether the
>> workaround is necessary than the DRM is (e.g. only during the
>> execution of compute kernels that rely on both L3 fences and HDC R/W
>> requests).
>> 
>> The same workaround seems to apply to BDW (at least to production
>> stepping G1) and SKL as well (the internal workaround database claims
>> that it does for all steppings, while the BSpec workaround table only
>> mentions pre-production steppings), but the DRM doesn't do anything
>> beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> when it sees fit.  Do the same on KBL platforms.
>> 
>> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> This is followed by a regression of 35% and 10% respectively for the
>> same benchmarks and platform caused by my recent patch series
>> switching userspace to use the dataport constant cache instead of the
>> sampler to implement uniform pull constant loads, which caused us to
>> hit more heavily the L3 cache (and on platforms other than KBL had the
>> opposite effect of improving performance of the same two benchmarks).
>> The overall effect on KBL of this change combined with the recent
>> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
>> was affected by the constant cache changes (though it improved as it
>> did on other platforms rather than regressing), but is not
>> significantly affected by this patch (with statistical significance of
>> 5% and sample size 20).
>> 
>> v2: Drop some more code to avoid unused variable warning.
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> Signed-off-by: Francisco Jerez 
>> Cc: Eero Tamminen 
>> Cc: Jani Nikula 
>> Cc: Mika Kuoppala 
>> Cc: beig...@lists.freedesktop.org
>
> Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
> for compute kernels? Are the patches for mesa compute/beignet
> ready&reviewed?

This is an explicit setting on kbl/E0 only. So one could argue
that unless they filter based on PCI IDs, things would already
blow up across the skl/kbl population if they forgot
to set it. The whitelisting is in place and looks sane,
so this E0 exception is a wart that got in by me reading the wa
database slavishly without thinking.

-Mika

> -Daniel
>
>> ---
>>  drivers/gpu/drm/i915/intel_lrc.c| 10 --
>>  drivers/gpu/drm/i915/intel_ringbuffer.c |  8 
>>  2 files changed, 18 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c 
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6db246a..656e0a3 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -970,18 +970,8 @@ static inline int gen8_emit_flush_coherentl3_wa(struct 
>> intel_engine_cs *engine,
>>  uint32_t *batch,
>>  uint32_t index)
>>  {
>> -struct drm_i915_private *dev_priv = engine->i915;
>>  uint32_t l3sqc4_flush = (0x4040 | GEN8_LQSC_FLUSH_COHERENT_LINES);
>>  
>> -/*
>> - * WaDisableLSQCROPERFforOCL:kbl
>> - * This WA is implemented in skl_init_clock_gating() but since
>> - * this batch updates GEN8_L3SQCREG4 with default value we need to
>> - * set this bit here to retain the WA during flush.
>> - */
>> -if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
>> -l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS;
>> -
>>  wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
>> MI_SRM_LRM_GLOBAL_GTT));
>>  wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 0971ac3..7cb2ab4 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -1095,14 +1095,6 @@ static int kbl_init_workarounds(struct 
>> intel_engine_cs *engine)
>>  WA_SET_BIT_MASKED(HDC_CHICKEN0,
>>HDC_FENCE_DEST_SLM_DISABLE);
>>  
>> -/* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
>> - * involving this register should also be added to WA batch as required.
>> - */
>> -if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
>> -/* WaDisableLSQCROPERFforOCL:kbl */
>> -I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
>> -   GEN8_LQSC_RO_PERF_DIS);
>> -
>>  /* WaToEnableHwFixForPushConstHWBug:kbl */
>>  if (IS_KBL_REVID(dev_priv, KB

[Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Chris Wilson

When switching between contexts using the aliasing_ppgtt, the VM is
shared. We don't need to reload the PD registers unless they are dirty.

Martin Peres reported an issue that looks like corruption between
Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
Rearrange switch_context to load the aliasing ppgtt on first use").
Switching between the same mm (the aliasing_ppgtt is used for all
contexts in this case) should be a nop, but appears to trigger some
side-effects in the context switch. However, as we know the switch
is redundant in this case, we can skip it and continue to ignore the
issue until somebody feels strong enough to investigate full-ppgtt on
gen7 again!

Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
Reported-by: Martin Peres 
Signed-off-by: Chris Wilson 
Cc: Martin Peres 
---
 drivers/gpu/drm/i915/i915_gem_context.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ed31133b3ce3..86426c1a9534 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -728,10 +728,10 @@ static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static bool
-needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
- struct intel_engine_cs *engine,
- struct i915_gem_context *to)
+needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt, struct intel_engine_cs *engine)
 {
+   struct i915_hw_ppgtt *last_ppgtt;
+
if (!ppgtt)
return false;
 
@@ -740,7 +740,9 @@ needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
return true;
 
/* Same context without new entries, skip */
-   if (engine->legacy_active_context == to &&
+   last_ppgtt =
+   engine->legacy_active_context->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
+   if (last_ppgtt == ppgtt &&
!(intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
return false;
 
@@ -784,7 +786,7 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
if (skip_rcs_switch(ppgtt, engine, to))
return 0;
 
-   if (needs_pd_load_pre(ppgtt, engine, to)) {
+   if (needs_pd_load_pre(ppgtt, engine)) {
/* Older GENs and non render rings still want the load first,
 * "PP_DCLV followed by PP_DIR_BASE register through Load
 * Register Immediate commands in Ring Buffer before submitting
@@ -881,7 +883,7 @@ int i915_switch_context(struct drm_i915_gem_request *req)
struct i915_hw_ppgtt *ppgtt =
to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
 
-   if (needs_pd_load_pre(ppgtt, engine, to)) {
+   if (needs_pd_load_pre(ppgtt, engine)) {
int ret;
 
trace_switch_mm(engine, to);
-- 
2.11.0
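
The decision this patch changes can be sketched in isolation: compare the effective address space of the previous and next contexts, not the contexts themselves, and reload only when they differ or the page directories are dirty for this engine. The sketch below is a simplified model under stated assumptions. The toy_* types and functions are hypothetical, the other conditions in needs_pd_load_pre() are omitted, and the "no previous context" branch is an assumption of the sketch, not something the patch contains.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for i915_hw_ppgtt and i915_gem_context. */
struct toy_ppgtt {
	unsigned int pd_dirty_rings; /* bitmask of engines needing a PD reload */
};

struct toy_context {
	struct toy_ppgtt *ppgtt; /* NULL means "use the aliasing ppgtt" */
};

/* Mirrors the "ctx->ppgtt ?: aliasing_ppgtt" fallback used in the patch. */
static struct toy_ppgtt *toy_effective_ppgtt(const struct toy_context *ctx,
					     struct toy_ppgtt *aliasing)
{
	return (ctx && ctx->ppgtt) ? ctx->ppgtt : aliasing;
}

/* Return true when the PD registers must be loaded before the switch. */
bool toy_needs_pd_load(const struct toy_context *last,
		       const struct toy_context *to,
		       struct toy_ppgtt *aliasing,
		       unsigned int engine_flag)
{
	struct toy_ppgtt *next = toy_effective_ppgtt(to, aliasing);

	if (!next)
		return false; /* no ppgtt at all: nothing to load */

	if (!last)
		return true; /* assumption: nothing loaded yet on this engine */

	/* Same address space and no new entries for this engine: skip. */
	if (toy_effective_ppgtt(last, aliasing) == next &&
	    !(engine_flag & next->pd_dirty_rings))
		return false;

	return true;
}
```

With two contexts that both fall back to the aliasing ppgtt, the comparison sees the same address space and reports that no load is needed unless the dirty bit for the engine is set.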


Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
> Daniel Vetter  writes:
> 
> > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
> >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> >> disabling an L3SQ optimization that has huge performance implications
> >> and is unlikely to be necessary for the correct functioning of usual
> >> graphic workloads.  Userspace is free to re-enable the workaround on
> >> demand, and is generally in a better position to determine whether the
> >> workaround is necessary than the DRM is (e.g. only during the
> >> execution of compute kernels that rely on both L3 fences and HDC R/W
> >> requests).
> >> 
> >> The same workaround seems to apply to BDW (at least to production
> >> stepping G1) and SKL as well (the internal workaround database claims
> >> that it does for all steppings, while the BSpec workaround table only
> >> mentions pre-production steppings), but the DRM doesn't do anything
> >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> >> when it sees fit.  Do the same on KBL platforms.
> >> 
> >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> >> This is followed by a regression of 35% and 10% respectively for the
> >> same benchmarks and platform caused by my recent patch series
> >> switching userspace to use the dataport constant cache instead of the
> >> sampler to implement uniform pull constant loads, which caused us to
> >> hit more heavily the L3 cache (and on platforms other than KBL had the
> >> opposite effect of improving performance of the same two benchmarks).
> >> The overall effect on KBL of this change combined with the recent
> >> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
> >> was affected by the constant cache changes (though it improved as it
> >> did on other platforms rather than regressing), but is not
> >> significantly affected by this patch (with statistical significance of
> >> 5% and sample size 20).
> >> 
> >> v2: Drop some more code to avoid unused variable warning.
> >> 
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
> >> Signed-off-by: Francisco Jerez 
> >> Cc: Eero Tamminen 
> >> Cc: Jani Nikula 
> >> Cc: Mika Kuoppala 
> >> Cc: beig...@lists.freedesktop.org
> >
> > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
> > for compute kernels? Are the patches for mesa compute/beignet
> > ready&reviewed?
> 
> This is explicit setting on kbl/E0 only. So one could argue
> that unless they filter based on PCI-IDs, things would already
> blow up across the skl/kbl population, if they forgot
> to set it. The whitelisting is in place and looks sane
> so this E0 exception is a wart that got in by me reading wa
> database slavishly without thinking.

Add Fixes then?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 02:04:53PM +0200, Joonas Lahtinen wrote:
> On ke, 2017-01-11 at 11:23 +, Chris Wilson wrote:
> > When we evict from the GTT to make room for an object, the hole we
> > create is put onto the MRU stack inside the drm_mm range manager. On the
> > next search pass, we can speed up a PIN_HIGH allocation by referencing
> > that stack for the new hole.
> > 
> > v2: Pull together the 3 identical implements (ahem, a couple were
> > outdated) into a common routine for allocating a node and evicting as
> > necessary.
> > v3: Detect invalid calls to i915_gem_gtt_insert()
> > v4: kerneldoc
> > 
> > Signed-off-by: Chris Wilson 
> > Reviewed-by: Joonas Lahtinen 
> 
> 
> 
> > +/**
> > + * i915_gem_gtt_insert - insert a node into an address_space (GTT)
> > + * @vm - the &struct i915_address_space
> 
> mixing &struct and @struct, I guess you meant &struct in later line
> too.
> 
> > + * @node - the @struct drm_mm_node (typicallay i915_vma.mode)
> 
> "typicallly" and "i915_vma.node"
> 
> > + * @size - how much space to allocate inside the GTT,
> > + * must be #I915_GTT_PAGE_SIZE aligned
> > + * @alignment - required alignment of starting offset, may be 0 but
> > + *  if specified, this must be a power-of-two and at least
> > + *  #I915_GTT_MIN_ALIGNMENT
> > + * @color - color to apply to node
> > + * @start - start of any range restriction inside GTT (0 for all),
> > + *  must be #I915_GTT_PAGE_SIZE aligned
> > + * @end - end of any range restriction inside GTT (U64_MAX for all),
> > + *must be #I915_GTT_PAGE_SIZE aligned
> > + * @flags - control search and eviction behaviour
> > + *
> > + * i915_gem_gtt_insert() first searches for an available hole into which
> > + * is can insert the node. The hole address is aligned to @alignment and
> > + * its @size must then fit entirely within the [@start, @end] bounds. The
> > + * nodes on either side of the hole must match @color, or else a guard page
> > + * will be inserted between the two nodes (or the node evicted). If no
> > + * suitable hole is found, then the LRU list of objects within the GTT
> > + * is scanned to find the first set of replacement nodes to create the 
> > hole.
> > + * Those old overlapping nodes are evicted from the GTT (and so must be
> > + * rebound before any future use). Any node that is current pinned cannot
> 
> "currently"
> 
> > + * be evicted (see i915_vma_pin()). Similar if the node's VMA is currently
> > + * active and #PIN_NONBLOCK is specified, that node is also skipped when
> > + * searching for an eviction candidate. See i915_gem_evict_something() for
> > + * the gory details on the eviction algorithm.
> > + *
> > + * Returns: 0 on success, -ENOSPC if no suitable hole is found, -EINTR if
> > + * asked to wait for eviction and interrupted.
> > + */
> 
> Fit those fixed, good to merge.

Thanks for proof reading.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

2017-01-11 Thread Mika Kuoppala

Chris Wilson  writes:

> On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
>> Daniel Vetter  writes:
>> 
>> > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
>> >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> >> disabling an L3SQ optimization that has huge performance implications
>> >> and is unlikely to be necessary for the correct functioning of usual
>> >> graphic workloads.  Userspace is free to re-enable the workaround on
>> >> demand, and is generally in a better position to determine whether the
>> >> workaround is necessary than the DRM is (e.g. only during the
>> >> execution of compute kernels that rely on both L3 fences and HDC R/W
>> >> requests).
>> >> 
>> >> The same workaround seems to apply to BDW (at least to production
>> >> stepping G1) and SKL as well (the internal workaround database claims
>> >> that it does for all steppings, while the BSpec workaround table only
>> >> mentions pre-production steppings), but the DRM doesn't do anything
>> >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> >> when it sees fit.  Do the same on KBL platforms.
>> >> 
>> >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> >> This is followed by a regression of 35% and 10% respectively for the
>> >> same benchmarks and platform caused by my recent patch series
>> >> switching userspace to use the dataport constant cache instead of the
>> >> sampler to implement uniform pull constant loads, which caused us to
>> >> hit more heavily the L3 cache (and on platforms other than KBL had the
>> >> opposite effect of improving performance of the same two benchmarks).
>> >> The overall effect on KBL of this change combined with the recent
>> >> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
>> >> was affected by the constant cache changes (though it improved as it
>> >> did on other platforms rather than regressing), but is not
>> >> significantly affected by this patch (with statistical significance of
>> >> 5% and sample size 20).
>> >> 
>> >> v2: Drop some more code to avoid unused variable warning.
>> >> 
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> >> Signed-off-by: Francisco Jerez 
>> >> Cc: Eero Tamminen 
>> >> Cc: Jani Nikula 
>> >> Cc: Mika Kuoppala 
>> >> Cc: beig...@lists.freedesktop.org
>> >
>> > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
>> > for compute kernels? Are the patches for mesa compute/beignet
>> > ready&reviewed?
>> 
>> This is explicit setting on kbl/E0 only. So one could argue
>> that unless they filter based on PCI-IDs, things would already
>> blow up across the skl/kbl population, if they forgot
>> to set it. The whitelisting is in place and looks sane
>> so this E0 exception is a wart that got in by me reading wa
>> database slavishly without thinking.
>
> Add Fixes then?

Fixes: a4106a782d11 ("drm/i915/gen9: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround")

Looking at the beignet source, they don't care about this register/bit (yet).

Also we need to get rid of KBL_REVID_E0 as there is no such thing.
Oddly, kbl doesn't follow the logical x0->rev mapping but leaves
holes. Were they afraid of running out of revids or what...

-Mika

> -Chris
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Patchwork

== Series Details ==

Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
URL   : https://patchwork.freedesktop.org/series/17823/
State : failure

== Summary ==

Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/

Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> TIMEOUT    (fi-hsw-4770)
Test gem_close_race:
        Subgroup basic-process:
                pass       -> INCOMPLETE (fi-hsw-4770)
        Subgroup basic-threads:
                pass       -> INCOMPLETE (fi-hsw-4770r)
Test gem_ctx_basic:
                pass       -> INCOMPLETE (fi-ivb-3520m)

fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:14   pass:12   dwarn:0   dfail:0   fail:0   skip:0  
fi-hsw-4770r total:15   pass:14   dwarn:0   dfail:0   fail:0   skip:0  
fi-ivb-3520m total:18   pass:17   dwarn:0   dfail:0   fail:0   skip:0  
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
f58497f drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3478/

[Intel-gfx] [PATCH 0/5] drm/edid: Improve RGB limited range handling a bit

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

While reading the HDMI 2.0 spec I noticed some new things related to
the RGB quantization range stuff, and after cross checking with
CEA-861-F I spotted some other things as well. So I figured I should
pimp up the code a bit.

And since we now have two drivers that deal with this stuff, I decided
to move a bunch of the code to the core to avoid duplicating the code
and having different bugs/features for each driver. I still left the state
computation part in the drivers, but eventually we might want to move that
code into some helper as well.

Entire series available here:
git://github.com/vsyrjala/linux.git hdmi_quant_range_helpers

Ville Syrjälä (5):
  drm/edid: Have drm_edid.h include hdmi.h
  drm/edid: Introduce drm_default_rgb_quant_range()
  drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
  drm/edid: Set AVI infoframe Q even when QS=0
  drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F

 drivers/gpu/drm/drm_edid.c| 64 +++
 drivers/gpu/drm/i915/intel_dp.c   |  4 ++-
 drivers/gpu/drm/i915/intel_hdmi.c | 20 ++--
 drivers/gpu/drm/vc4/vc4_hdmi.c| 18 +--
 include/drm/drm_edid.h| 10 --
 5 files changed, 93 insertions(+), 23 deletions(-)

-- 
2.10.2


[Intel-gfx] [PATCH 1/5] drm/edid: Have drm_edid.h include hdmi.h

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

drm_edid.h depends on hdmi.h on account of enum hdmi_picture_aspect,
so let's just include hdmi.h and drop some useless struct declarations.

Signed-off-by: Ville Syrjälä 
---
 include/drm/drm_edid.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 38eabf65f19d..838eaf2b42e9 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -24,6 +24,7 @@
 #define __DRM_EDID_H__
 
 #include 
+#include 
 
 struct drm_device;
 struct i2c_adapter;
@@ -322,8 +323,6 @@ struct cea_sad {
 struct drm_encoder;
 struct drm_connector;
 struct drm_display_mode;
-struct hdmi_avi_infoframe;
-struct hdmi_vendor_infoframe;
 
 void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid);
 int drm_edid_to_sad(struct edid *edid, struct cea_sad **sads);
-- 
2.10.2


[Intel-gfx] [PATCH 4/5] drm/edid: Set AVI infoframe Q even when QS=0

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

HDMI 2.0 recommends that we set the Q bits in the AVI infoframe
even when the sink does not support quantization range selection (QS=0).
According to CEA-861 we can do that as long as the Q we send matches
the default quantization range for the mode.

Previously I think I had misread the spec as saying that you can't
send a non-zero Q at all when QS=0. But that's not what the spec
actually says.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/drm_edid.c| 8 +++-
 drivers/gpu/drm/i915/intel_hdmi.c | 6 --
 drivers/gpu/drm/vc4/vc4_hdmi.c| 2 +-
 include/drm/drm_edid.h| 1 +
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 548c20250b95..caa2435bac31 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4295,11 +4295,13 @@ EXPORT_SYMBOL(drm_hdmi_avi_infoframe_from_display_mode);
  * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
  *quantization range information
  * @frame: HDMI AVI infoframe
+ * @mode: DRM display mode
  * @rgb_quant_range: RGB quantization range (Q)
  * @rgb_quant_range_selectable: Sink support selectable RGB quantization range (QS)
  */
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+  const struct drm_display_mode *mode,
   enum hdmi_quantization_range rgb_quant_range,
   bool rgb_quant_range_selectable)
 {
@@ -4309,8 +4311,12 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
 *  to the default RGB Quantization Range for the transmitted Picture
 *  unless the Sink indicates support for the Q bit in a Video
 *  Capabilities Data Block."
+*
+* HDMI 2.0 recommends sending non-zero Q when it does match the
+* default RGB quantization range for the mode, even when QS=0.
 */
-   if (rgb_quant_range_selectable)
+   if (rgb_quant_range_selectable ||
+   rgb_quant_range == drm_default_rgb_quant_range(mode))
frame->quantization_range = rgb_quant_range;
else
frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 351f837b09a0..af16b0fa6b69 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -455,17 +455,19 @@ static void intel_hdmi_set_avi_infoframe(struct drm_encoder *encoder,
  const struct intel_crtc_state *crtc_state)
 {
struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder);
+   const struct drm_display_mode *adjusted_mode =
+   &crtc_state->base.adjusted_mode;
union hdmi_infoframe frame;
int ret;
 
ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-  &crtc_state->base.adjusted_mode);
+  adjusted_mode);
if (ret < 0) {
DRM_ERROR("couldn't fill AVI infoframe\n");
return;
}
 
-   drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+   drm_hdmi_avi_infoframe_quant_range(&frame.avi, adjusted_mode,
   crtc_state->limited_color_range ?
   HDMI_QUANTIZATION_RANGE_LIMITED :
   HDMI_QUANTIZATION_RANGE_FULL,
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index a588156b5410..f38fdbac2878 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -356,7 +356,7 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
return;
}
 
-   drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+   drm_hdmi_avi_infoframe_quant_range(&frame.avi, mode,
   vc4_encoder->limited_rgb_range ?
   HDMI_QUANTIZATION_RANGE_LIMITED :
   HDMI_QUANTIZATION_RANGE_FULL,
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index cfad4d89589f..43fb0ac5eb9c 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -347,6 +347,7 @@ drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
  const struct drm_display_mode *mode);
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+  const struct drm_display_mode *mode,
   enum hdmi_quantization_range rgb_quant_range,
   bool rgb_quant_range_selectable);
 
-- 
2.10.2
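
The rule this patch implements can be restated as a small pure function: send the requested Q value when the sink advertises QS, or when the requested value already matches the mode's default; otherwise send "default" (Q=0). The sketch below is a standalone illustration, not the DRM helper. The toy_* enum and functions are hypothetical, and the mode is reduced to its CEA VIC number (0 meaning "not a CEA mode"), following the VIC > 1 test used elsewhere in this series.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's quantization range enum. */
enum toy_quant_range {
	TOY_Q_DEFAULT = 0, /* Q=0: the sink picks the default for the mode */
	TOY_Q_LIMITED = 1,
	TOY_Q_FULL = 2,
};

/* CEA modes other than VIC 1 default to limited range, the rest to full. */
static enum toy_quant_range toy_default_rgb_quant_range(unsigned int vic)
{
	return vic > 1 ? TOY_Q_LIMITED : TOY_Q_FULL;
}

/* Pick the Q value to put into the AVI infoframe. */
enum toy_quant_range toy_avi_q(unsigned int vic,
			       enum toy_quant_range requested,
			       bool sink_qs)
{
	/*
	 * A non-zero Q is allowed when the sink supports selection (QS=1),
	 * or when it only restates the default range for the mode.
	 */
	if (sink_qs || requested == toy_default_rgb_quant_range(vic))
		return requested;

	return TOY_Q_DEFAULT;
}
```

For a sink with QS=0, requesting limited range on a CEA mode such as VIC 16 passes through unchanged, while requesting full range on the same mode falls back to Q=0.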


[Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

Make the code selecting the RGB quantization range a little less magicy
by wrapping it up in a small helper.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/drm_edid.c| 18 ++
 drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
 drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.c|  4 +++-
 include/drm/drm_edid.h|  2 ++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 4ff04aa84dd0..304c583b8000 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
 }
 EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
 
+/**
+ * drm_default_rgb_quant_range - default RGB quantization range
+ * @mode: display mode
+ *
+ * Determine the default RGB quantization range for the mode,
+ * as specified in CEA-861.
+ *
+ * Return: The default RGB quantization range for the mode
+ */
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode)
+{
+   return drm_match_cea_mode(mode) > 1 ?
+   HDMI_QUANTIZATION_RANGE_LIMITED :
+   HDMI_QUANTIZATION_RANGE_FULL;
+}
+EXPORT_SYMBOL(drm_default_rgb_quant_range);
+
 static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
   const u8 *hdmi)
 {
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 343e1d9fa761..d4befbbe834a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
 */
pipe_config->limited_color_range =
-   bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
+   bpp != 18 &&
+   drm_default_rgb_quant_range(adjusted_mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED;
} else {
pipe_config->limited_color_range =
intel_dp->limited_color_range;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 0bcfead14571..19bd13f53729 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
/* See CEA-861-E - 5.1 Default Encoding Parameters */
pipe_config->limited_color_range =
pipe_config->has_hdmi_sink &&
-   drm_match_cea_mode(adjusted_mode) > 1;
+   drm_default_rgb_quant_range(adjusted_mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED;
} else {
pipe_config->limited_color_range =
intel_hdmi->limited_color_range;
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index c4cb2e26de32..d79466a42690 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder *encoder,
csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
VC4_HD_CSC_CTL_ORDER);
 
-   if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
+   if (vc4_encoder->hdmi_monitor &&
+   drm_default_rgb_quant_range(mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED) {
/* CEA VICs other than #1 requre limited range RGB
 * output unless overridden by an AVI infoframe.
 * Apply a colorspace conversion to squash 0-255 down
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 838eaf2b42e9..25cdf5f7a0d8 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const u8 video_code);
 bool drm_detect_hdmi_monitor(struct edid *edid);
 bool drm_detect_monitor_audio(struct edid *edid);
 bool drm_rgb_quant_range_selectable(struct edid *edid);
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode);
 int drm_add_modes_noedid(struct drm_connector *connector,
 int hdisplay, int vdisplay);
 void drm_set_preferred_mode(struct drm_connector *connector,
-- 
2.10.2
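
The helper introduced here encodes one rule: a mode that matches a CEA VIC other than 1 defaults to limited-range RGB, and everything else (VIC 1, or no CEA match at all) defaults to full range. A standalone sketch of that rule follows. The toy_* names are hypothetical, and the VIC is passed in directly as an assumption of the sketch, with 0 standing for "no CEA match", rather than being looked up from a display mode.

```c
/* Hypothetical stand-in for the kernel's quantization range enum. */
enum toy_quant_range {
	TOY_Q_DEFAULT = 0,
	TOY_Q_LIMITED = 1,
	TOY_Q_FULL = 2,
};

/*
 * Default RGB quantization range for a mode, keyed on its CEA VIC.
 * vic == 0 means the mode is not a CEA mode; vic == 1 is the one CEA
 * mode that still defaults to full range.
 */
enum toy_quant_range toy_default_rgb_quant_range(unsigned int vic)
{
	return vic > 1 ? TOY_Q_LIMITED : TOY_Q_FULL;
}
```

Callers in the series then compare the result against the limited-range value instead of open-coding the VIC test.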


[Intel-gfx] [PATCH 5/5] drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

CEA-861-F tells us:
"When transmitting any RGB colorimetry, the Source should set the
 YQ-field to match the RGB Quantization Range being transmitted
 (e.g., when Limited Range RGB, set YQ=0 or when Full Range RGB,
 set YQ=1) and the Sink shall ignore the YQ-field."

So let's go ahead and do that. Perhaps there are sinks that don't
ignore the YQ as they should for RGB?

I wasn't able to find similar text in CEA-861-E, so it would seem
to be a fairly "recent" addition.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/drm_edid.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index caa2435bac31..6ba9a1a6eae4 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4320,6 +4320,20 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
frame->quantization_range = rgb_quant_range;
else
frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
+
+   /*
+* CEA-861-F:
+* "When transmitting any RGB colorimetry, the Source should set the
+*  YQ-field to match the RGB Quantization Range being transmitted
+*  (e.g., when Limited Range RGB, set YQ=0 or when Full Range RGB,
+*  set YQ=1) and the Sink shall ignore the YQ-field."
+*/
+   if (rgb_quant_range == HDMI_QUANTIZATION_RANGE_LIMITED)
+   frame->ycc_quantization_range =
+   HDMI_YCC_QUANTIZATION_RANGE_LIMITED;
+   else
+   frame->ycc_quantization_range =
+   HDMI_YCC_QUANTIZATION_RANGE_FULL;
 }
 EXPORT_SYMBOL(drm_hdmi_avi_infoframe_quant_range);
 
-- 
2.10.2
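
The YQ handling added by this patch is a direct mapping from the RGB range being transmitted, following the CEA-861-F text quoted in the commit message (limited gives YQ=0, full gives YQ=1). A minimal standalone sketch is below. The toy_* enums and function are hypothetical names, with numeric values chosen to match the YQ values in the quoted spec text.

```c
/* Hypothetical stand-ins for the kernel's Q and YQ enums. */
enum toy_quant_range {
	TOY_Q_DEFAULT = 0,
	TOY_Q_LIMITED = 1,
	TOY_Q_FULL = 2,
};

enum toy_ycc_quant_range {
	TOY_YQ_LIMITED = 0, /* YQ=0 */
	TOY_YQ_FULL = 1,    /* YQ=1 */
};

/*
 * Choose YQ to mirror the RGB range actually being transmitted.
 * As in the patch, anything other than limited range maps to full.
 */
enum toy_ycc_quant_range toy_avi_yq(enum toy_quant_range rgb_range)
{
	return rgb_range == TOY_Q_LIMITED ? TOY_YQ_LIMITED : TOY_YQ_FULL;
}
```

Note that the input here is the range being transmitted, not the Q value that ends up in the infoframe, so YQ can be set even when Q is left at its default.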


[Intel-gfx] [PATCH 3/5] drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

Pull the logic to populate the quantization range information
in the AVI infoframe into a small helper. We'll be adding a bit
more logic to it, and having it in a central place seems like a
good idea since it's based on the CEA-861 spec.

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/drm_edid.c| 26 ++
 drivers/gpu/drm/i915/intel_hdmi.c | 13 +
 drivers/gpu/drm/vc4/vc4_hdmi.c| 14 +-
 include/drm/drm_edid.h|  4 
 4 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 304c583b8000..548c20250b95 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4291,6 +4291,32 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 }
 EXPORT_SYMBOL(drm_hdmi_avi_infoframe_from_display_mode);
 
+/**
+ * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
+ *quantization range information
+ * @frame: HDMI AVI infoframe
+ * @rgb_quant_range: RGB quantization range (Q)
+ * @rgb_quant_range_selectable: Sink support selectable RGB quantization range (QS)
+ */
+void
+drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+  enum hdmi_quantization_range rgb_quant_range,
+  bool rgb_quant_range_selectable)
+{
+   /*
+* CEA-861:
+* "A Source shall not send a non-zero Q value that does not correspond
+*  to the default RGB Quantization Range for the transmitted Picture
+*  unless the Sink indicates support for the Q bit in a Video
+*  Capabilities Data Block."
+*/
+   if (rgb_quant_range_selectable)
+   frame->quantization_range = rgb_quant_range;
+   else
+   frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
+}
+EXPORT_SYMBOL(drm_hdmi_avi_infoframe_quant_range);
+
 static enum hdmi_3d_structure
 s3d_structure_from_display_mode(const struct drm_display_mode *mode)
 {
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 19bd13f53729..351f837b09a0 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -465,14 +465,11 @@ static void intel_hdmi_set_avi_infoframe(struct drm_encoder *encoder,
return;
}
 
-   if (intel_hdmi->rgb_quant_range_selectable) {
-   if (crtc_state->limited_color_range)
-   frame.avi.quantization_range =
-   HDMI_QUANTIZATION_RANGE_LIMITED;
-   else
-   frame.avi.quantization_range =
-   HDMI_QUANTIZATION_RANGE_FULL;
-   }
+   drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+  crtc_state->limited_color_range ?
+  HDMI_QUANTIZATION_RANGE_LIMITED :
+  HDMI_QUANTIZATION_RANGE_FULL,
+  intel_hdmi->rgb_quant_range_selectable);
 
intel_write_infoframe(encoder, crtc_state, &frame);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index d79466a42690..a588156b5410 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -356,15 +356,11 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
return;
}
 
-   if (vc4_encoder->rgb_range_selectable) {
-   if (vc4_encoder->limited_rgb_range) {
-   frame.avi.quantization_range =
-   HDMI_QUANTIZATION_RANGE_LIMITED;
-   } else {
-   frame.avi.quantization_range =
-   HDMI_QUANTIZATION_RANGE_FULL;
-   }
-   }
+   drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+  vc4_encoder->limited_rgb_range ?
+  HDMI_QUANTIZATION_RANGE_LIMITED :
+  HDMI_QUANTIZATION_RANGE_FULL,
+  vc4_encoder->rgb_range_selectable);
 
vc4_hdmi_write_infoframe(encoder, &frame);
 }
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 25cdf5f7a0d8..cfad4d89589f 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -345,6 +345,10 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 int
 drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
 const struct drm_display_mode *mode);
+void
+drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+  enum hdmi_quantization_range rgb_quant_range,
+
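
The diff above is cut off in this archive. As a rough illustration of the rule the new helper encodes (send a non-default Q value only when the sink advertises QS), here is a standalone sketch. The type and function names are hypothetical and only model the kernel's hdmi_avi_infoframe field of the same purpose.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the kernel uses enum hdmi_quantization_range. */
enum quant_range {
	QUANT_RANGE_DEFAULT,
	QUANT_RANGE_LIMITED,
	QUANT_RANGE_FULL,
};

/* Reduced model of the AVI infoframe: only the field this helper touches. */
struct avi_frame_model {
	enum quant_range quantization_range;
};

/*
 * Only signal an explicit RGB quantization range when the sink says it
 * can select one (QS bit); otherwise fall back to "default".
 */
static void fill_quant_range(struct avi_frame_model *frame,
			     enum quant_range rgb_quant_range,
			     bool rgb_quant_range_selectable)
{
	if (rgb_quant_range_selectable)
		frame->quantization_range = rgb_quant_range;
	else
		frame->quantization_range = QUANT_RANGE_DEFAULT;
}
```

Centralising this in one helper means both callers shown in the patch (i915 and vc4) reduce to choosing the limited or full range and passing along the sink's QS capability.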

Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Joonas Lahtinen

On Wed, 2017-01-11 at 12:14 +, Chris Wilson wrote:
> When switching between contexts using the aliasing_ppgtt, the VM is
> shared. We don't need to reload the PD registers unless they are dirty.
> 
> Martin Peres reported an issue that looks like corruption between
> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> Rearrange switch_context to load the aliasing ppgtt on first use").
> Switching between the same mm (the aliasing_ppgtt is used for all
> contexts in this case) should be a nop, but appears to trigger some
> side-effects in the context switch. However, as we know the switch
> is redundant in this case, we can skip it and continue to ignore the
> issue until somebody feels strong enough to investigate full-ppgtt on
> gen7 again!
> 
> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
> Reported-by: Martin Peres 
> Signed-off-by: Chris Wilson 
> Cc: Martin Peres 

Code looks good, could use the T-b's to verify.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
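
The patch body is not included in this reply, so the following is only a sketch of the idea in the quoted commit message: skip the page-directory reload when the target address space is the one already loaded and it is not dirty. All structure and function names here are hypothetical and do not correspond to i915 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an address space with a "PD registers stale" flag. */
struct mm_model {
	bool pd_dirty;
};

/* Hypothetical model of an engine that remembers the last loaded mm. */
struct engine_model {
	const struct mm_model *loaded_mm;
	unsigned int pd_loads;	/* counts emitted PD reloads */
};

/* A reload is needed when the mm changes or its page directories are dirty. */
static bool needs_mm_switch(const struct engine_model *engine,
			    const struct mm_model *to)
{
	if (engine->loaded_mm != to)
		return true;

	return to->pd_dirty;
}

static void switch_mm_model(struct engine_model *engine, struct mm_model *to)
{
	if (!needs_mm_switch(engine, to))
		return;		/* same mm, nothing dirty: suppress the switch */

	engine->pd_loads++;	/* stands in for emitting the PD register loads */
	engine->loaded_mm = to;
	to->pd_dirty = false;
}
```

With a single shared aliasing address space, repeated switches collapse to one load until something marks the page directories dirty again.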

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 12:56:03PM -, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> URL   : https://patchwork.freedesktop.org/series/17823/
> State : failure
> 
> == Summary ==
> 
> Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> 
> Test drv_hangman:
> Subgroup error-state-basic:
> pass   -> TIMEOUT(fi-hsw-4770)
> Test gem_close_race:
> Subgroup basic-process:
> pass   -> INCOMPLETE (fi-hsw-4770)
> Subgroup basic-threads:
> pass   -> INCOMPLETE (fi-hsw-4770r)
> Test gem_ctx_basic:
> pass   -> INCOMPLETE (fi-ivb-3520m)

Ooh. That is suitably scary that there is something wrong going on here.
Still think the patch is sane by itself, so suspecting there is
something not meeting the eye here.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [PATCH 3/3] HAX enable guc submission for CI

2017-01-11 Thread Chris Wilson

---
 drivers/gpu/drm/i915/i915_params.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 0e280fbd52f1..1d3766cfc837 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -56,8 +56,8 @@ struct i915_params i915 __read_mostly = {
.verbose_state_checks = 1,
.nuclear_pageflip = 0,
.edp_vswing = 0,
-   .enable_guc_loading = 0,
-   .enable_guc_submission = 0,
+   .enable_guc_loading = 1,
+   .enable_guc_submission = 1,
.guc_log_level = -1,
.enable_dp_mst = true,
.inject_load_failure = 0,
-- 
2.11.0

[Intel-gfx] [PATCH 1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion

2017-01-11 Thread Chris Wilson

Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 78 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h|  3 ++
 drivers/gpu/drm/i915/i915_guc_submission.c |  3 --
 drivers/gpu/drm/i915/intel_guc_loader.c|  7 +--
 drivers/gpu/drm/i915/intel_lrc.c   |  6 ---
 5 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ed99adfd0da..ed120a1e7f93 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -110,6 +110,30 @@ const struct i915_ggtt_view i915_ggtt_view_rotated = {
.type = I915_GGTT_VIEW_ROTATED,
 };
 
+static void gen6_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   /* Note that as an uncached mmio write, this should flush the
+* WCB of the writes into the GGTT before it triggers the invalidate.
+*/
+   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+}
+
+static void guc_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   gen6_ggtt_invalidate(dev_priv);
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+}
+
+static void gmch_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   intel_gtt_chipset_flush();
+}
+
+static inline void i915_ggtt_invalidate(struct drm_i915_private *i915)
+{
+   i915->ggtt.invalidate(i915);
+}
+
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
int enable_ppgtt)
 {
@@ -2307,16 +2331,6 @@ void i915_check_and_clear_faults(struct drm_i915_private *dev_priv)
POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS]));
 }
 
-static void i915_ggtt_flush(struct drm_i915_private *dev_priv)
-{
-   if (INTEL_INFO(dev_priv)->gen < 6) {
-   intel_gtt_chipset_flush();
-   } else {
-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
-   }
-}
-
 void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 {
struct i915_ggtt *ggtt = &dev_priv->ggtt;
@@ -2331,7 +2345,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 
ggtt->base.clear_range(&ggtt->base, ggtt->base.start, ggtt->base.total);
 
-   i915_ggtt_flush(dev_priv);
+   i915_ggtt_invalidate(dev_priv);
 }
 
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
@@ -2370,15 +2384,13 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
  enum i915_cache_level level,
  u32 unused)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
+   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *pte =
-   (gen8_pte_t __iomem *)dev_priv->ggtt.gsm +
-   (offset >> PAGE_SHIFT);
+   (gen8_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT);
 
gen8_set_pte(pte, gen8_pte_encode(addr, level));
 
-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
+   ggtt->invalidate(vm->i915);
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
@@ -2386,7 +2398,6 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 uint64_t start,
 enum i915_cache_level level, u32 unused)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
struct sgt_iter sgt_iter;
gen8_pte_t __iomem *gtt_entries;
@@ -2415,8 +2426,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 * want to flush the TLBs only after we're certain all the PTE updates
 * have finished.
 */
-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
+   ggtt->invalidate(vm->i915);
 }
 
 struct insert_entries {
@@ -2451,15 +2461,13 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  enum i915_cache_level level,
  u32 flags)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
+   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *pte =
-   (gen6_pte_t __iomem *)dev_priv->ggtt.gsm +
-   (offset >> PAGE_SHIFT);
+   (gen6_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT);
 
iowrite
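
The diff is truncated here. To illustrate the design it introduces (a per-GGTT invalidate callback chosen once, so the insert paths no longer open-code the flush), here is a standalone sketch. The names are hypothetical, the counters stand in for the register writes, and how the real driver decides when to install the GuC variant is not shown in this excerpt, so the guc_loaded flag below is an assumption.

```c
#include <stdbool.h>

struct ggtt_model;
typedef void (*ggtt_invalidate_fn)(struct ggtt_model *ggtt);

/* Hypothetical model: counters replace the MMIO writes of the real code. */
struct ggtt_model {
	ggtt_invalidate_fn invalidate;
	unsigned int gmch_flushes;
	unsigned int gen6_flushes;
	unsigned int guc_invalidates;
};

static void gmch_invalidate_model(struct ggtt_model *ggtt)
{
	ggtt->gmch_flushes++;		/* pre-gen6: chipset flush */
}

static void gen6_invalidate_model(struct ggtt_model *ggtt)
{
	ggtt->gen6_flushes++;		/* gen6+: flush-control write */
}

static void guc_invalidate_model(struct ggtt_model *ggtt)
{
	gen6_invalidate_model(ggtt);	/* regular invalidate first */
	ggtt->guc_invalidates++;	/* then the GuC's own TLB */
}

/* Pick the callback once; "guc_loaded" is an assumed selection criterion. */
static void ggtt_model_init(struct ggtt_model *ggtt, int gen, bool guc_loaded)
{
	ggtt->gmch_flushes = 0;
	ggtt->gen6_flushes = 0;
	ggtt->guc_invalidates = 0;

	if (gen < 6)
		ggtt->invalidate = gmch_invalidate_model;
	else if (guc_loaded)
		ggtt->invalidate = guc_invalidate_model;
	else
		ggtt->invalidate = gen6_invalidate_model;
}

/* Insert paths just call the callback after updating the PTEs. */
static void ggtt_model_insert_page(struct ggtt_model *ggtt)
{
	/* the PTE write would happen here */
	ggtt->invalidate(ggtt);
}
```

The benefit of this shape is that callers of the insert functions no longer need to know whether a GuC invalidate is required.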

[Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc

2017-01-11 Thread Chris Wilson

This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).

v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 79 +++---
 drivers/gpu/drm/i915/i915_irq.c|  4 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  5 +-
 3 files changed, 76 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 913d87358972..bdc9e2bc5eb9 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
u32 freespace;
int ret;
 
-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
freespace -= client->wq_rsvd;
if (likely(freespace >= wqi_size)) {
@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
client->no_wq_space++;
ret = -EAGAIN;
}
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 
return ret;
 }
@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request)
 
GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);
 
-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
client->wq_rsvd -= wqi_size;
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 }
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
@@ -534,10 +534,74 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq)
 
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
-   i915_gem_request_submit(rq);
+   __i915_gem_request_submit(rq);
__i915_guc_submit(rq);
 }
 
+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
+{
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *last = port[0].request;
+   unsigned long flags;
+   struct rb_node *rb;
+   bool submit = false;
+
+   spin_lock_irqsave(&engine->timeline->lock, flags);
+   rb = engine->execlist_first;
+   while (rb) {
+   struct drm_i915_gem_request *cursor =
+   rb_entry(rb, typeof(*cursor), priotree.node);
+
+   if (last && cursor->ctx != last->ctx) {
+   if (port != engine->execlist_port)
+   break;
+
+   i915_gem_request_assign(&port->request, last);
+   dma_fence_enable_sw_signaling(&last->fence);
+   port++;
+   }
+
+   rb = rb_next(rb);
+   rb_erase(&cursor->priotree.node, &engine->execlist_queue);
+   RB_CLEAR_NODE(&cursor->priotree.node);
+   cursor->priotree.priority = INT_MAX;
+
+   i915_guc_submit(cursor);
+   last = cursor;
+   submit = true;
+   }
+   if (submit) {
+   i915_gem_request_assign(&port->request, last);
+   dma_fence_enable_sw_signaling(&last->fence);
+   engine->execlist_first = rb;
+   }
+   spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+   return submit;
+}
+
+static void i915_guc_irq_handler(unsigned long data)
+{
+   struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *rq;
+   bool submit;
+
+   do {
+   rq = port[0].request;
+   while (rq && i915_gem_request_completed(rq)) {
+   i915_gem_request_put(rq);
+   rq = port[1].request;
+   port[0].request = rq;
+   port[1].request = NULL;
+   }
+
+   submit = false;
+   if (!port[1].request)
+   submit = i915_guc_dequeue(engine);
+   } while (submit);
+}
+
 /*
  * Everything below here is concerned with setup & teardown, and is
  * therefore not part of the somewhat time-critical batch-submission
@@ -1428,8 +1492,9 @@ int i915_guc_submission_enable(struct drm_i915_private *dev_priv)
for_each_engine(engine, dev_priv, id) {
struct drm_i915_gem_request *rq;
 
-
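
The diff is truncated here. The two-port flow in i915_guc_dequeue() and i915_guc_irq_handler() above can be modelled in plain C as follows. This is a simplified sketch: the pending queue is a plain array instead of an rbtree, and locking, reference counting and fence signalling are left out. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct request_model {
	int ctx;		/* context the request belongs to */
	bool completed;
};

struct engine_model {
	struct request_model *port[2];	/* emulated execlist ports */
	struct request_model **queue;	/* pending requests, priority order */
	size_t queue_len;
	size_t queue_head;		/* next request to dequeue */
	unsigned int submitted;		/* stands in for the GuC submission */
};

/* Fill the ports from the queue, coalescing consecutive same-context requests. */
static bool dequeue_model(struct engine_model *engine)
{
	struct request_model **port = &engine->port[0];
	struct request_model *last = *port;
	bool submit = false;

	while (engine->queue_head < engine->queue_len) {
		struct request_model *rq = engine->queue[engine->queue_head];

		if (last && rq->ctx != last->ctx) {
			if (port != &engine->port[0])
				break;		/* both ports are taken */

			*port = last;
			port++;
		}

		engine->queue_head++;
		engine->submitted++;
		last = rq;
		submit = true;
	}

	if (submit)
		*port = last;

	return submit;
}

/* Retire completed requests from port 0, then refill while port 1 is free. */
static void irq_handler_model(struct engine_model *engine)
{
	bool submit;

	do {
		while (engine->port[0] && engine->port[0]->completed) {
			engine->port[0] = engine->port[1];
			engine->port[1] = NULL;
		}

		submit = false;
		if (!engine->port[1])
			submit = dequeue_model(engine);
	} while (submit);
}
```

Each port tracks only the last request of its context, which is why the handler has to run again after every completion to feed the next context in, as the commit message notes.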

[Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support

2017-01-11 Thread Anusha Srivatsa

The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
is used for both cases.

HuC loading needs to happen before GuC loading. The WOPCM setting must
be done early, before loading either of them.

v2: rebased on top of drm-intel-nightly.
removed if(HAS_GUC()) before the guc call. (D.Gordon)
updated the huc_version number format.
v3: rebased to drm-intel-nightly, changed the file name format to
match the one in the huc package.
Changed dev->dev_private to to_i915()
v4: moved function back to where it was.
changed wait_for_atomic to wait_for.
v5: rebased + comment changes.
v7: rebased.
v8: rebased.
v9: rebased. Changed the year in the copyright message to reflect
the right year. Corrected the comments, removed the unwanted WARN message,
replaced drm_gem_object_unreference() with i915_gem_object_put(). Made the
prototypes in intel_huc.h non-extern.
v10: rebased. Updated the file construction done by HuC. It is similar to
GuC. Adopted the approach used in
https://patchwork.freedesktop.org/patch/104355/
v11: Fixed warnings, removed old declaration.
v12: Changed dev to dev_priv in macro definition.
Corrected comments.
v13: rebased.
v14: rebased on top of drm-tip
v15: rebased. Updated functions intel_huc_load(), intel_huc_init() and
intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents
of intel_huc.h to intel_uc.h
v16: changed SKL_FW_ to SKL_HUC_FW_. Added intel_ prefix to guc_wopcm_size().
Removed unwanted checks in intel_uc.h. Renamed huc_fw in struct intel_huc to
simply fw to avoid redundancy.
v17: rebased.
v18: rebased. Corrected comments.
v19: rebased. Corrected comments. Moved definition to i915_guc_reg.h from
intel_uc.h. Clean DMA_CTRL bits after HuC DMA transfer in huc_ucode_xfer()
instead of guc_ucode_xfer(). Added suitable WARNs to give extra info.

Cc: Arkadiusz Hiler 
Cc: Michal Wajdeczko 
Tested-by: Xiang Haihao 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/Makefile   |   1 +
 drivers/gpu/drm/i915/i915_drv.c |   3 +
 drivers/gpu/drm/i915/i915_drv.h |   2 +
 drivers/gpu/drm/i915/i915_guc_reg.h |   6 +
 drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
 drivers/gpu/drm/i915/intel_huc_loader.c | 265 
 drivers/gpu/drm/i915/intel_uc.h |  14 ++
 7 files changed, 295 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5196509..45ae124 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \
 # general-purpose microcontroller (GuC) support
 i915-y += intel_uc.o \
  intel_guc_loader.o \
+ intel_huc_loader.o \
  i915_guc_submission.o
 
 # autogenerated null render state
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index aefab9a..5a90829 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
if (ret)
goto cleanup_irq;
 
+   intel_huc_init(dev_priv);
intel_guc_init(dev_priv);
 
ret = i915_gem_init(dev_priv);
@@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
i915_gem_fini(dev_priv);
 cleanup_irq:
intel_guc_fini(dev_priv);
+   intel_huc_fini(dev);
drm_irq_uninstall(dev);
intel_teardown_gmbus(dev_priv);
 cleanup_csr:
@@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev)
drain_workqueue(dev_priv->wq);
 
intel_guc_fini(dev_priv);
+   intel_huc_fini(dev);
i915_gem_fini(dev_priv);
intel_fbc_cleanup_cfb(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b84c1d1..2a17df2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2073,6 +2073,7 @@ struct drm_i915_private {
 
struct intel_gvt *gvt;
 
+   struct intel_huc huc;
struct intel_guc guc;
 
struct intel_csr csr;
@@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_GUC(dev_priv)  ((dev_priv)->info.has_guc)
 #define HAS_GUC_UCODE(dev_priv)(HAS_GUC(dev_priv))
 #define HAS_GUC_SCHED(dev_priv)(HAS_GUC(dev_priv))
+#define HAS_HUC_UCODE(dev_priv)(HAS_GUC(dev_priv))
 
 #define HAS_RESOURCE_STREAMER(dev_priv) ((dev_priv)->info.has_resource_streamer)
 
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index 6a0adaf..35cf991 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -61,12 +61,18 @@
 #define   DMA_ADDRESS_SPACE_GTT  (8 << 16)
 #define DMA_COPY_SIZE  _MMIO(0xc310)
 #define DMA_CTRL   _MMIO(0xc314)
+#define   HUC_UKERNEL

Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.

2017-01-11 Thread Daniel Vetter

On Wed, Jan 11, 2017 at 12:24:59PM +, Chris Wilson wrote:
> On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
> > Daniel Vetter  writes:
> > 
> > > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
> > >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> > >> disabling an L3SQ optimization that has huge performance implications
> > >> and is unlikely to be necessary for the correct functioning of usual
> > >> graphic workloads.  Userspace is free to re-enable the workaround on
> > >> demand, and is generally in a better position to determine whether the
> > >> workaround is necessary than the DRM is (e.g. only during the
> > >> execution of compute kernels that rely on both L3 fences and HDC R/W
> > >> requests).
> > >> 
> > >> The same workaround seems to apply to BDW (at least to production
> > >> stepping G1) and SKL as well (the internal workaround database claims
> > >> that it does for all steppings, while the BSpec workaround table only
> > >> mentions pre-production steppings), but the DRM doesn't do anything
> > >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> > >> when it sees fit.  Do the same on KBL platforms.
> > >> 
> > >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> > >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> > >> This is followed by a regression of 35% and 10% respectively for the
> > >> same benchmarks and platform caused by my recent patch series
> > >> switching userspace to use the dataport constant cache instead of the
> > >> sampler to implement uniform pull constant loads, which caused us to
> > >> hit more heavily the L3 cache (and on platforms other than KBL had the
> > >> opposite effect of improving performance of the same two benchmarks).
> > >> The overall effect on KBL of this change combined with the recent
> > >> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
> > >> was affected by the constant cache changes (though it improved as it
> > >> did on other platforms rather than regressing), but is not
> > >> significantly affected by this patch (with statistical significance of
> > >> 5% and sample size 20).
> > >> 
> > >> v2: Drop some more code to avoid unused variable warning.
> > >> 
> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
> > >> Signed-off-by: Francisco Jerez 
> > >> Cc: Eero Tamminen 
> > >> Cc: Jani Nikula 
> > >> Cc: Mika Kuoppala 
> > >> Cc: beig...@lists.freedesktop.org
> > >
> > > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
> > > for compute kernels? Are the patches for mesa compute/beignet
> > > ready&reviewed?
> > 
> > This is explicit setting on kbl/E0 only. So one could argue
> > that unless they filter based on PCI-IDs, things would already
> > blow up across the skl/kbl population, if they forgot
> > to set it. The whitelisting is in place and looks sane
> > so this E0 exception is a wart that got in by me reading wa
> > database slavishly without thinking.
> 
> Add Fixes then?

Yeah, cc: stable would be good to make sure it shows up in all supported
kernels, fast. Otherwise we'll get some good wtf bug reports.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
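
As a rough cross-check of the percentages in the quoted commit message, relative changes compose multiplicatively, not additively. The helper below is a hypothetical sketch, not part of any patch. Fed the rounded figures from the message, it gives about 4.0% for gl_manhattan31 (+60% then -35%) and about 2.6% for gl_4 (+14% then -10%). The 4.6% quoted for the first benchmark presumably comes from unrounded measurements, which is an assumption on my part.

```c
/*
 * Combine two successive relative changes, both given in percent,
 * into a single overall change in percent.
 */
static double combined_change_pct(double first_pct, double second_pct)
{
	double factor = (1.0 + first_pct / 100.0) * (1.0 + second_pct / 100.0);

	return (factor - 1.0) * 100.0;
}
```

For example, combined_change_pct(60.0, -35.0) evaluates 1.60 * 0.65 = 1.04, which is an overall gain of roughly 4%.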

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 01:01:18PM +, Chris Wilson wrote:
> On Wed, Jan 11, 2017 at 12:56:03PM -, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> > URL   : https://patchwork.freedesktop.org/series/17823/
> > State : failure
> > 
> > == Summary ==
> > 
> > Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> > https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> > 
> > Test drv_hangman:
> > Subgroup error-state-basic:
> > pass   -> TIMEOUT(fi-hsw-4770)
> > Test gem_close_race:
> > Subgroup basic-process:
> > pass   -> INCOMPLETE (fi-hsw-4770)
> > Subgroup basic-threads:
> > pass   -> INCOMPLETE (fi-hsw-4770r)
> > Test gem_ctx_basic:
> > pass   -> INCOMPLETE (fi-ivb-3520m)
> 
> Ooh. That is suitably scary that there is something wrong going on here.
> Still think the patch is sane by itself, so suspecting there is
> something not meeting the eye here.

To further demonstrate the bizarreness, they are all *CPU* lockups.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [PATCH] drm/i915/huc: Support HuC authentication

2017-01-11 Thread Anusha Srivatsa

From: Peter Antoine 

HuC authentication is done via a host2guc call: the HuC RSA keys
are sent to the GuC for authentication.

v2: rebased on top of drm-intel-nightly.
changed name format and upped version to 1.7.
v3: rebased on top of drm-intel-nightly.
v4: changed wait_for_atomic to wait_for
v5: rebased.
v7: rebased.
v8: rebased.
v9: rebased. Renamed intel_huc_auh() to intel_guc_auth_huc()
and placed the prototype in intel_guc.h, corrected the comments.
v10: rebased.
v11: rebased.
v12: rebased on top of drm-tip
v13: rebased. Moved intel_guc_auth_huc from i915_guc_submission.c
to intel_uc.c. Updated dev to dev_priv in intel_guc_auth_huc().
Renamed HOST2GUC_ACTION_AUTHENTICATE_HUC to
INTEL_GUC_ACTION_AUTHENTICATE_HUC
v14: rebased.
v15: rebased. Added a newline to the DRM_ERRORs that don't already have one.
v16: rebased. Replaced wait_for with intel_wait_for_register() since
the latter employs sleep optimisations for quick responses, as pointed
out by Chris Wilson.
v17: rebased. Cleaned up intel_guc_auth_huc() by removing checks
already performed in earlier functions. Made comments more descriptive.

Cc: Chris Wilson 
Cc: Arkadiusz Hiler 
Cc: Michal Wajdeczko 
Tested-by: Xiang Haihao 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/intel_guc_fwif.h   |  1 +
 drivers/gpu/drm/i915/intel_guc_loader.c |  2 ++
 drivers/gpu/drm/i915/intel_uc.c | 56 -
 drivers/gpu/drm/i915/intel_uc.h |  1 +
 4 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
index ed1ab40..25691f0 100644
--- a/drivers/gpu/drm/i915/intel_guc_fwif.h
+++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
@@ -505,6 +505,7 @@ enum intel_guc_action {
INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
+   INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
INTEL_GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
INTEL_GUC_ACTION_LIMIT
 };
diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index 3b05232..967ab2f 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -529,6 +529,8 @@ int intel_guc_setup(struct drm_i915_private *dev_priv)
intel_uc_fw_status_repr(guc_fw->fetch_status),
intel_uc_fw_status_repr(guc_fw->load_status));
 
+   intel_guc_auth_huc(dev_priv);
+
if (i915.enable_guc_submission) {
if (i915.guc_log_level >= 0)
gen9_enable_guc_interrupts(dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index c6be352..7dabbe6 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -46,7 +46,7 @@ static bool intel_guc_recv(struct intel_guc *guc, u32 *status)
 int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
 {
struct drm_i915_private *dev_priv = guc_to_i915(guc);
-   u32 status;
+   u32 status = 0;
int i;
int ret;
 
@@ -140,3 +140,57 @@ int intel_guc_log_control(struct intel_guc *guc, u32 control_val)
 
return intel_guc_send(guc, action, ARRAY_SIZE(action));
 }
+
+/**
+ * intel_guc_auth_huc() - authenticate ucode
+ * @dev_priv: the drm_i915_device
+ *
+ * Triggers a HuC fw authentication request to the GuC via intel_guc_action_
+ * authenticate_huc interface.
+ * interface.
+ */
+void intel_guc_auth_huc(struct drm_i915_private *dev_priv)
+{
+   struct intel_guc *guc = &dev_priv->guc;
+   struct intel_huc *huc = &dev_priv->huc;
+   struct i915_vma *vma;
+   int ret;
+   u32 data[2];
+
+   vma = i915_gem_object_ggtt_pin(huc->fw.obj, NULL, 0, 0, 0);
+   if (IS_ERR(vma)) {
+   DRM_DEBUG_DRIVER("failed to pin huc fw object %d\n",
+   (int)PTR_ERR(vma));
+   return;
+   }
+
+
+   /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+
+   /* Specify auth action and where public signature is. */
+   data[0] = INTEL_GUC_ACTION_AUTHENTICATE_HUC;
+   data[1] = i915_ggtt_offset(vma) + huc->fw.rsa_offset;
+
+   ret = intel_guc_send(guc, data, ARRAY_SIZE(data));
+   if (ret) {
+   DRM_ERROR("HuC: GuC did not ack Auth request\n");
+   goto out;
+   }
+
+   /* Check authentication status, it should be done by now */
+   ret = intel_wait_for_register(dev_priv,
+   HUC_STATUS2,
+   HUC_FW_VERIFIED,
+   HUC_FW_VERIFIED,
+   50);
+
+   if (ret) {
+   DRM_ERROR("HuC: Authentication failed\n");
+   goto out;
+   }
+
+   DRM_ERROR("HuC Authentication Suc
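
The patch is cut off at this point. The final step it describes, waiting until a status register shows the "verified" bit, follows the usual masked-poll pattern. The sketch below models that pattern in userspace C with a fake register. The names, the bit position and the poll-count timeout are all hypothetical; the kernel's intel_wait_for_register() takes a timeout in milliseconds and may sleep, which this model does not attempt to reproduce.

```c
#include <stdint.h>

/* Hypothetical bit; the real HUC_FW_VERIFIED value is not shown in this excerpt. */
#define FAKE_FW_VERIFIED (1u << 0)

/* Fake status register that reports "verified" after a number of reads. */
struct fake_status_reg {
	unsigned int reads_until_set;
};

static uint32_t fake_status_read(struct fake_status_reg *reg)
{
	if (reg->reads_until_set == 0)
		return FAKE_FW_VERIFIED;

	reg->reads_until_set--;
	return 0;
}

/*
 * Poll until (register & mask) == value, giving up after max_polls reads.
 * Returns 0 on success and -1 on timeout.
 */
static int wait_for_bits(struct fake_status_reg *reg, uint32_t mask,
			 uint32_t value, unsigned int max_polls)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		if ((fake_status_read(reg) & mask) == value)
			return 0;
	}

	return -1;
}
```

Passing the same constant as both mask and value, as the patch does with HUC_FW_VERIFIED, means "wait until this bit is set".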

Re: [Intel-gfx] [PATCH i-g-t v3] tools: Add intel_dp_compliance for DisplayPort 1.2 compliance automation

2017-01-11 Thread Petri Latvala


Hi

The copyright statements still need the year corrected, and
intel_dp_compliance needs to be added to tools/.gitignore.

Some new comments also:

- Why do some of the prints have \r\n?
- Building intel_dp_compliance should actually be made conditional upon
  HAVE_UDEV



--
Petri Latvala



On Fri, Dec 23, 2016 at 09:47:48AM +0200, Pandiyan, Dhinakaran wrote:
> I have addressed review comments that Petri, Jim had for this patch along 
> with making some small changes for error handling. The functionality is 
> mostly unchanged from Manasi's version.
> 
> -DK
> 
> From: Pandiyan, Dhinakaran
> Sent: Thursday, December 22, 2016 11:41 PM
> To: intel-gfx@lists.freedesktop.org
> Cc: jim.br...@linux.intel.com; Navare, Manasi D; Latvala, Petri; Vlad, Marius C; Daniel Vetter; Pandiyan, Dhinakaran
> Subject: [PATCH i-g-t v3] tools: Add intel_dp_compliance for DisplayPort 1.2 compliance automation
> 
> From: "Navare, Manasi D" 
> 
> This is the userspace component of the Displayport Compliance
> testing software required for compliance testing of the I915
> Display Port driver. This must be running in order to successfully
> complete Display Port compliance testing. This app and the kernel
> code that accompanies it has been written to satify the requirements
> of the Displayport Link CTS 1.2 rev1.1 specification from VESA.
> Note that this application does not support eDP compliance testing.
> This utility has automation support for the Link training tests
> (4.3.1.1 - 4.3.2.3), EDID tests (4.2.2.3 - 4.2.2.6) and Video Pattern
> generation tests (4.3.3.1) from CTS specification 1.2 Rev 1.1.
> 
> This tool supports responding to the hotplug uevents
> sent by the compliance testing unit after each test.
> 
> The Linux DUT running this utility must be in text (console) mode
> and cannot have any other display manager running. Since this uses
> sysfs nodes for kernel interaction, this utility should be run as
> Root. Once this user application is up and running, waiting for
> test requests, the test appliance software on the windows host
> can now be used to execute the compliance tests.
> 
> This app is based on some prior work done in April 2015 (by
> Todd Previte )
> 
> v2:
> * Add mode unset on hotplug uevent on disconnect (Manasi Navare)
> 
> v3:
> Made capitalization consistent
> Reduced line lengths
> Added return value checks
> Changed how GLib is linked
> Fixed build warnings
> 
> Cc: Petri Latvala 
> Cc: Marius Vlad 
> Cc: Daniel Vetter 
> Signed-off-by: Manasi Navare 
> Signed-off-by: Dhinakaran Pandiyan 
> ---
>  tools/Makefile.am   |1 +
>  tools/Makefile.sources  |7 +
>  tools/intel_dp_compliance.c | 1104 
> +++
>  tools/intel_dp_compliance.h |   35 ++
>  tools/intel_dp_compliance_hotplug.c |  123 
>  5 files changed, 1270 insertions(+)
>  create mode 100644 tools/intel_dp_compliance.c
>  create mode 100644 tools/intel_dp_compliance.h
>  create mode 100644 tools/intel_dp_compliance_hotplug.c
> 
> diff --git a/tools/Makefile.am b/tools/Makefile.am
> index 18f86f6..bd8f512 100644
> --- a/tools/Makefile.am
> +++ b/tools/Makefile.am
> @@ -16,6 +16,7 @@ AM_CFLAGS = $(DEBUG_CFLAGS) $(DRM_CFLAGS) 
> $(PCIACCESS_CFLAGS) $(CWARNFLAGS) \
>  LDADD = $(top_builddir)/lib/libintel_tools.la
>  AM_LDFLAGS = -Wl,--as-needed
> 
> +intel_dp_compliance_LDADD = $(top_builddir)/lib/libintel_tools.la 
> $(GLIB_LIBS)
> 
>  # aubdumper
> 
> diff --git a/tools/Makefile.sources b/tools/Makefile.sources
> index e2451ea..e8ce891 100644
> --- a/tools/Makefile.sources
> +++ b/tools/Makefile.sources
> @@ -13,6 +13,7 @@ tools_prog_lists =\
> intel_bios_reader   \
> intel_display_crc   \
> intel_display_poller\
> +   intel_dp_compliance \
> intel_forcewaked\
> intel_gpu_frequency \
> intel_firmware_decode   \
> @@ -56,3 +57,9 @@ intel_l3_parity_SOURCES = \
> intel_l3_parity.h   \
> intel_l3_udev_listener.c
> 
> +intel_dp_compliance_SOURCES = \
> +intel_dp_compliance.c \
> +intel_dp_compliance.h \
> +intel_dp_compliance_hotplug.c \
> +$(NULL)
> +
> diff --git a/tools/intel_dp_compliance.c b/tools/intel_dp_compliance.c
> new file mode 100644
> index 000..df1ca10
> --- /dev/null
> +++ b/tools/intel_dp_compliance.c
> @@ -0,0 +1,1104 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> +

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/edid: Improve RGB limited range handling a bit

2017-01-11 Thread Patchwork

== Series Details ==

Series: drm/edid: Improve RGB limited range handling a bit
URL   : https://patchwork.freedesktop.org/series/17825/
State : success

== Summary ==

Series 17825v1 drm/edid: Improve RGB limited range handling a bit
https://patchwork.freedesktop.org/api/1.0/series/17825/revisions/1/mbox/

Test kms_pipe_crc_basic:
Subgroup suspend-read-crc-pipe-a:
incomplete -> SKIP   (fi-bsw-n3050)
Test pm_rpm:
Subgroup basic-pci-d3-state:
incomplete -> PASS   (fi-byt-n2820)

fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:83   pass:71   dwarn:0   dfail:0   fail:0   skip:11 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

f0350ffa1b2bc16dc49fdc2fce10776d604a1c5f drm-tip: 2017y-01m-11d-12h-34m-12s UTC integration manifest
af78a86 drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F
6e2f2ab drm/edid: Set AVI infoframe Q even when QS=0
ac00d36 drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
d7e4b07 drm/edid: Introduce drm_default_rgb_quant_range()
142daf0 drm/edid: Have drm_edid.h include hdmi.h

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3479/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

2017-01-11 Thread Chris Wilson

It is an error to start a new request on the same timeline (ringbuffer)
as the current one before the current is submitted. If there are two
requests emitting to the ringbuffer at the same time, the operation is
undefined. We can catch this by checking for the timeline having a later
seqno than ours when we come to submit our request.

Currently we have this check at the end of __i915_add_request, but
having an early check as well isolates a failure in the caller versus a
failure in sealing the request (i.e. from inside __i915_add_request
itself). For example, CI is currently tripping over this late assertion
on ctg/ilk:

[  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
[  100.336333] [ cut here ]
[  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
[  100.336347] invalid opcode:  [#1] PREEMPT SMP
[  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic 
snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e 
ptp pps_core [last unloaded: i915]
[  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U  
4.10.0-rc3-CI-CI_DRM_2045+ #1
[  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 
04/22/2009
[  100.336386] task: 88012b738040 task.stack: c956
[  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
[  100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212
[  100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: 0001
[  100.336456] RDX: 8001 RSI: 88012b738860 RDI: 
[  100.336461] RBP: c9563b00 R08: 880133bb8780 R09: 
[  100.336466] R10:  R11:  R12: 88012f53d950
[  100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: 88012f53d960
[  100.336477] FS:  7f0d19da38c0() GS:88013bc0() 
knlGS:
[  100.336483] CS:  0010 DS:  ES:  CR0: 80050033
[  100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: 000406f0
[  100.336496] Call Trace:
[  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
[  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
[  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
[  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
[  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
[  100.33]  drm_ioctl+0x200/0x450
[  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  100.336708]  do_vfs_ioctl+0x90/0x6e0
[  100.336716]  ? up_read+0x1a/0x40
[  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
[  100.336730]  SyS_ioctl+0x3c/0x70
[  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  100.336745] RIP: 0033:0x7f0d187cb357
[  100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: 
0010
[  100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: 7f0d187cb357
[  100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: 0003
[  100.336775] RBP:  R08:  R09: 0022
[  100.336782] R10: 0007 R11: 0246 R12: 0002
[  100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: 7ffe0b2f7d50
[  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 
89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 
0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
[  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: c9563ac0
[  100.336886] ---[ end trace 22b36545479e5eb7 ]---

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_request.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 99056b948eda..1ad47f08e1fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request 
*request, bool flush_caches)
lockdep_assert_held(&request->i915->drm.struct_mutex);
trace_i915_gem_request_add(request);
 
+   /* Make sure that no request gazzumped us - if it was allocated after
+* our i915_gem_request_alloc() and called __i915_add_request() before
+* us, the timeline will hold its seqno which is later than ours.
+*/
+   GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno,
+request->fence.seqno));
+
/*
 * To ensure that this call will not fail, space for its emissions
 * should already have been reserved in the ring buffer. Let the ring
-- 
2.11.0
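
For readers following the assertion in the patch above, here is a minimal standalone sketch of the ordering check it describes. It is an illustration only, not the driver code: it assumes the usual wrap-safe comparison of 32-bit sequence numbers through a signed difference, and the names seqno_passed(), toy_timeline and can_submit() are hypothetical, invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe test: has seq1 reached or passed seq2?
 * The unsigned subtraction wraps modulo 2^32; reading the result as a
 * signed value keeps the ordering correct across the wrap, provided the
 * two numbers are less than 2^31 apart.
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Hypothetical stand-in for a timeline: only the field the check needs. */
struct toy_timeline {
	uint32_t last_submitted_seqno;
};

/*
 * A request may be submitted only while the timeline has not yet reached
 * its seqno. If another request was allocated later but submitted first,
 * the timeline already holds a later seqno and this returns false, which
 * is the condition the early GEM_BUG_ON is meant to catch.
 */
static bool can_submit(const struct toy_timeline *tl, uint32_t request_seqno)
{
	return !seqno_passed(tl->last_submitted_seqno, request_seqno);
}
```

With last_submitted_seqno at 5, can_submit(&tl, 6) holds. Once the timeline has reached 6 or later it no longer does, and the comparison keeps working across the 32-bit wrap: a timeline at 0xfffffffe still accepts seqno 1.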


Re: [Intel-gfx] [PATCH] drm/i915/huc: Support HuC authentication

2017-01-11 Thread Michal Wajdeczko

On Wed, Jan 11, 2017 at 05:36:49AM -0800, Anusha Srivatsa wrote:
> From: Peter Antoine 
> 
> The HuC authentication is done by host2guc call. The HuC RSA keys
> are sent to GuC for authentication.
> 
> v2: rebased on top of drm-intel-nightly.
> changed name format and upped version 1.7.
> v3: rebased on top of drm-intel-nightly.
> v4: changed wait_for_atomic to wait_for
> v5: rebased.
> v7: rebased.
> v8: rebased.
> v9: rebased. Rename intel_huc_auh() to intel_guc_auth_huc()
> and place the prototype in intel_guc.h,correct the comments.
> v10: rebased.
> v11: rebased.
> v12: rebased on top of drm-tip
> v13: rebased. Moved intel_guc_auth_huc from i915_guc_submission.c
> to intel_uc.c.Update dev to dev_priv in intel_guc_auth_huc().
> Renamed HOST2GUC_ACTION_AUTHENTICATE_HUC TO INTEL_GUC_ACTION_
> AUTHENTICATE_HUC
> v14: rebased.
> v15: rebased. Add newline on DRM_ERRORs that don't already have one.
> v16: rebased. Replace wait_for with intel_wait_for_register() since
> the latter employs sleep optimisations for quick responses- as pointed
> out by Chris Wilson.
> v17: rebased. Cleanup the intel_guc_auth_huc() by removing checks
> already performed in earlier functions. Make comments more descriptive.
> 
> Cc: Chris Wilson 
> Cc: Arkadiusz Hiler 
> Cc: Michal Wajdeczko 

There is still a typo in my email ;(

> Tested-by: Xiang Haihao 
> Signed-off-by: Anusha Srivatsa 
> Signed-off-by: Alex Dai 
> Signed-off-by: Peter Antoine 
> ---
>  drivers/gpu/drm/i915/intel_guc_fwif.h   |  1 +
>  drivers/gpu/drm/i915/intel_guc_loader.c |  2 ++
>  drivers/gpu/drm/i915/intel_uc.c | 56 
> -
>  drivers/gpu/drm/i915/intel_uc.h |  1 +
>  4 files changed, 59 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h 
> b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index ed1ab40..25691f0 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -505,6 +505,7 @@ enum intel_guc_action {
>   INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
>   INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
>   INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
> + INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
>   INTEL_GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
>   INTEL_GUC_ACTION_LIMIT
>  };
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c 
> b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 3b05232..967ab2f 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -529,6 +529,8 @@ int intel_guc_setup(struct drm_i915_private *dev_priv)
>   intel_uc_fw_status_repr(guc_fw->fetch_status),
>   intel_uc_fw_status_repr(guc_fw->load_status));
>  
> + intel_guc_auth_huc(dev_priv);
> +
>   if (i915.enable_guc_submission) {
>   if (i915.guc_log_level >= 0)
>   gen9_enable_guc_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> index c6be352..7dabbe6 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -46,7 +46,7 @@ static bool intel_guc_recv(struct intel_guc *guc, u32 
> *status)
>  int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
>  {
>   struct drm_i915_private *dev_priv = guc_to_i915(guc);
> - u32 status;
> + u32 status = 0;

Any reason why this chunk is included in the HuC patch? Merge error?


>   int i;
>   int ret;
>  
> @@ -140,3 +140,57 @@ int intel_guc_log_control(struct intel_guc *guc, u32 
> control_val)
>  
>   return intel_guc_send(guc, action, ARRAY_SIZE(action));
>  }
> +
> +/**
> + * intel_guc_auth_huc() - authenticate ucode
> + * @dev_priv: the drm_i915_device
> + *
> + * Triggers a HuC fw authentication request to the GuC via intel_guc_action_
> + * authenticate_huc interface.
> + * interface.
> + */
> +void intel_guc_auth_huc(struct drm_i915_private *dev_priv)
> +{
> + struct intel_guc *guc = &dev_priv->guc;
> + struct intel_huc *huc = &dev_priv->huc;
> + struct i915_vma *vma;
> + int ret;
> + u32 data[2];
> +
> + vma = i915_gem_object_ggtt_pin(huc->fw.obj, NULL, 0, 0, 0);
> + if (IS_ERR(vma)) {
> + DRM_DEBUG_DRIVER("failed to pin huc fw object %d\n",
> + (int)PTR_ERR(vma));
> + return;
> + }
> +
> +

Leave only one empty line here.


> + /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> + I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> +
> + /* Specify auth action and where public signature is. */
> + data[0] = INTEL_GUC_ACTION_AUTHENTICATE_HUC;
> + data[1] = i915_ggtt_offset(vma) + huc->fw.rsa_offset;
> +
> + ret = intel_guc_send(guc, data, ARRAY_SIZE(data));
> + if (ret) {
> + DRM_ERROR("HuC: GuC did not ack Auth request\n");
> + goto out;
> + }
> +
> + /* Check authentication status, it should be done by now */
> +
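
The status check this patch performs with intel_wait_for_register() follows a common pattern: keep reading a register until (value & mask) equals the expected value, or give up. Below is a minimal userspace sketch of that pattern, under stated assumptions: it budgets a number of polls instead of the 50 ms timeout used in the patch, it does not sleep between reads, and wait_for_masked_value(), fake_status, fake_read() and FAKE_FW_VERIFIED are hypothetical names that exist only in this example.

```c
#include <errno.h>
#include <stdint.h>

/* Callback that returns the current register value (hypothetical). */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) == value.
 * Returns 0 on success, -ETIMEDOUT once max_polls reads have all failed.
 */
static int wait_for_masked_value(read_reg_fn read_reg, void *ctx,
				 uint32_t mask, uint32_t value,
				 unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read_reg(ctx) & mask) == value)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Illustrative "verified" bit; not taken from any hardware definition. */
#define FAKE_FW_VERIFIED (1u << 7)

/* Fake status register that reports the bit after ready_after reads. */
struct fake_status {
	unsigned int reads;
	unsigned int ready_after;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_status *s = ctx;

	s->reads++;
	return s->reads > s->ready_after ? FAKE_FW_VERIFIED : 0;
}
```

Passing the same constant as both mask and value, as the patch does with HUC_FW_VERIFIED, simply means "wait until this bit is set".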

Re: [Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support

2017-01-11 Thread Michal Wajdeczko

On Wed, Jan 11, 2017 at 05:15:14AM -0800, Anusha Srivatsa wrote:
> The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
> is used for both cases.
> 
> HuC loading needs to be before GuC loading. The WOPCM setting must
> be done early before loading any of them.
> 
> v2: rebased on-top of drm-intel-nightly.
> removed if(HAS_GUC()) before the guc call. (D.Gordon)
> update huc_version number of format.
> v3: rebased to drm-intel-nightly, changed the file name format to
> match the one in the huc package.
> Changed dev->dev_private to to_i915()
> v4: moved function back to where it was.
> change wait_for_atomic to wait_for.
> v5: rebased + comment changes.
> v7: rebased.
> v8: rebased.
> v9: rebased. Changed the year in the copyright message to reflect
> the right year.Correct the comments,remove the unwanted WARN message,
> replace drm_gem_object_unreference() with i915_gem_object_put().Make the
> prototypes in intel_huc.h non-extern.
> v10: rebased. Update the file construction done by HuC. It is similar to
> GuC.Adopted the approach used in-
> https://patchwork.freedesktop.org/patch/104355/ 
> v11: Fix warnings remove old declaration
> v12: Change dev to dev_priv in macro definition.
> Corrected comments.
> v13: rebased.
> v14: rebased on top of drm-tip
> v15: rebased. Updated functions intel_huc_load(),intel_huc_init() and
> intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents
> of intel_huc.h to intel_uc.h
> v16: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to guc_wopcm_size().
> Remove unwanted checks in intel_uc.h. Rename huc_fw in struct intel_huc to
> simply fw to avoid redundency.
> v17: rebased.
> v18: rebased. Correct comments.
> v19: rebased. Correct comments. move definition to i915_guc_reg.h from
> intel_uc.h. Clean DMA_CTRL bits after HuC DMA transfer in huc_ucode_xfer()
> instead of guc_ucode_xfer(). Add suitable WARNs to give extra info.
> 
> Cc: Arkadiusz Hiler 
> Cc: Michal Wajdeczko 
> Tested-by: Xiang Haihao 
> Signed-off-by: Anusha Srivatsa 
> Signed-off-by: Alex Dai 
> Signed-off-by: Peter Antoine 
> ---
>  drivers/gpu/drm/i915/Makefile   |   1 +
>  drivers/gpu/drm/i915/i915_drv.c |   3 +
>  drivers/gpu/drm/i915/i915_drv.h |   2 +
>  drivers/gpu/drm/i915/i915_guc_reg.h |   6 +
>  drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
>  drivers/gpu/drm/i915/intel_huc_loader.c | 265 
> 
>  drivers/gpu/drm/i915/intel_uc.h |  14 ++
>  7 files changed, 295 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 5196509..45ae124 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \
>  # general-purpose microcontroller (GuC) support
>  i915-y += intel_uc.o \
> intel_guc_loader.o \
> +   intel_huc_loader.o \
> i915_guc_submission.o
>  
>  # autogenerated null render state
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index aefab9a..5a90829 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>   if (ret)
>   goto cleanup_irq;
>  
> + intel_huc_init(dev_priv);
>   intel_guc_init(dev_priv);
>  
>   ret = i915_gem_init(dev_priv);
> @@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>   i915_gem_fini(dev_priv);
>  cleanup_irq:
>   intel_guc_fini(dev_priv);
> + intel_huc_fini(dev);
>   drm_irq_uninstall(dev);
>   intel_teardown_gmbus(dev_priv);
>  cleanup_csr:
> @@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev)
>   drain_workqueue(dev_priv->wq);
>  
>   intel_guc_fini(dev_priv);
> + intel_huc_fini(dev);

Hmm, still not dev_priv?


>   i915_gem_fini(dev_priv);
>   intel_fbc_cleanup_cfb(dev_priv);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b84c1d1..2a17df2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2073,6 +2073,7 @@ struct drm_i915_private {
>  
>   struct intel_gvt *gvt;
>  
> + struct intel_huc huc;
>   struct intel_guc guc;
>  
>   struct intel_csr csr;
> @@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define HAS_GUC(dev_priv)((dev_priv)->info.has_guc)
>  #define HAS_GUC_UCODE(dev_priv)  (HAS_GUC(dev_priv))
>  #define HAS_GUC_SCHED(dev_priv)  (HAS_GUC(dev_priv))
> +#define HAS_HUC_UCODE(dev_priv)  (HAS_GUC(dev_priv))
>  
>  #define HAS_RESOURCE_STREAMER(dev_priv) 
> ((dev_priv)->info.has_resource_streamer)
>  
> diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
> b/drivers/gpu/drm/i915/i915_guc_reg.h
> index 6a0adaf..35cf991 100644
> --- a/driv

Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

2017-01-11 Thread Mika Kuoppala

Chris Wilson  writes:

> It is an error to start a new request on the same timeline (ringbuffer)
> as the current one before the current is submitted. If there are two
> requests emitting to the ringbuffer at the same time, the operation is
> undefined. We can catch this by checking for the timeline having a later
> seqno than ours when we come to submit our request.
>
> Currently we have this check at the end of __i915_add_request, but
> having an early check as well isolates a failure in the caller versus a
> failure in sealing the request (i.e. from inside __i915_add_request
> itself). For example, CI is currently tripping over this late assertion
> on ctg/ilk:
>
> [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> [  100.336333] [ cut here ]
> [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> [  100.336347] invalid opcode:  [#1] PREEMPT SMP
> [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic 
> snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei 
> e1000e ptp pps_core [last unloaded: i915]
> [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U  
> 4.10.0-rc3-CI-CI_DRM_2045+ #1
> [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 
> 04/22/2009
> [  100.336386] task: 88012b738040 task.stack: c956
> [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> [  100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212
> [  100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: 
> 0001
> [  100.336456] RDX: 8001 RSI: 88012b738860 RDI: 
> 
> [  100.336461] RBP: c9563b00 R08: 880133bb8780 R09: 
> 
> [  100.336466] R10:  R11:  R12: 
> 88012f53d950
> [  100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: 
> 88012f53d960
> [  100.336477] FS:  7f0d19da38c0() GS:88013bc0() 
> knlGS:
> [  100.336483] CS:  0010 DS:  ES:  CR0: 80050033
> [  100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: 
> 000406f0
> [  100.336496] Call Trace:
> [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
> [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
> [  100.33]  drm_ioctl+0x200/0x450
> [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
> [  100.336708]  do_vfs_ioctl+0x90/0x6e0
> [  100.336716]  ? up_read+0x1a/0x40
> [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
> [  100.336730]  SyS_ioctl+0x3c/0x70
> [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  100.336745] RIP: 0033:0x7f0d187cb357
> [  100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [  100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: 
> 7f0d187cb357
> [  100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: 
> 0003
> [  100.336775] RBP:  R08:  R09: 
> 0022
> [  100.336782] R10: 0007 R11: 0246 R12: 
> 0002
> [  100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: 
> 7ffe0b2f7d50
> [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 
> 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 
> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: 
> c9563ac0
> [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
>
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/i915_gem_request.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
> b/drivers/gpu/drm/i915/i915_gem_request.c
> index 99056b948eda..1ad47f08e1fd 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request 
> *request, bool flush_caches)
>   lockdep_assert_held(&request->i915->drm.struct_mutex);
>   trace_i915_gem_request_add(request);
>  
> + /* Make sure that no request gazzumped us - if it was allocated after
> +  * our i915_gem_request_alloc() and called __i915_add_request() before
> +  * us, the timeline will hold its seqno which is later than ours.
> +  */
> + GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno,
> +  request->fence.seqno));
> +
>   /*
>* To ensure that this call will not fail, space for its emissions
>* should already have been reserved in the ring buffer. L

[Intel-gfx] [PATCH v2 2/5] drm/edid: Introduce drm_default_rgb_quant_range()

2017-01-11 Thread ville . syrjala

From: Ville Syrjälä 

Make the code selecting the RGB quantization range a little less magicy
by wrapping it up in a small helper.

v2: s/adjusted_mode/mode in vc4 to make it actually compile

Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/drm_edid.c| 18 ++
 drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
 drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.c|  4 +++-
 include/drm/drm_edid.h|  2 ++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 4ff04aa84dd0..304c583b8000 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
 }
 EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
 
+/**
+ * drm_default_rgb_quant_range - default RGB quantization range
+ * @mode: display mode
+ *
+ * Determine the default RGB quantization range for the mode,
+ * as specified in CEA-861.
+ *
+ * Return: The default RGB quantization range for the mode
+ */
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode)
+{
+   return drm_match_cea_mode(mode) > 1 ?
+   HDMI_QUANTIZATION_RANGE_LIMITED :
+   HDMI_QUANTIZATION_RANGE_FULL;
+}
+EXPORT_SYMBOL(drm_default_rgb_quant_range);
+
 static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
   const u8 *hdmi)
 {
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 343e1d9fa761..d4befbbe834a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
 */
pipe_config->limited_color_range =
-   bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
+   bpp != 18 &&
+   drm_default_rgb_quant_range(adjusted_mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED;
} else {
pipe_config->limited_color_range =
intel_dp->limited_color_range;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c 
b/drivers/gpu/drm/i915/intel_hdmi.c
index 0bcfead14571..19bd13f53729 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder 
*encoder,
/* See CEA-861-E - 5.1 Default Encoding Parameters */
pipe_config->limited_color_range =
pipe_config->has_hdmi_sink &&
-   drm_match_cea_mode(adjusted_mode) > 1;
+   drm_default_rgb_quant_range(adjusted_mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED;
} else {
pipe_config->limited_color_range =
intel_hdmi->limited_color_range;
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index c4cb2e26de32..5d49bf948162 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder 
*encoder,
csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
VC4_HD_CSC_CTL_ORDER);
 
-   if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
+   if (vc4_encoder->hdmi_monitor &&
+   drm_default_rgb_quant_range(mode) ==
+   HDMI_QUANTIZATION_RANGE_LIMITED) {
/* CEA VICs other than #1 requre limited range RGB
 * output unless overridden by an AVI infoframe.
 * Apply a colorspace conversion to squash 0-255 down
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 838eaf2b42e9..25cdf5f7a0d8 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const u8 
video_code);
 bool drm_detect_hdmi_monitor(struct edid *edid);
 bool drm_detect_monitor_audio(struct edid *edid);
 bool drm_rgb_quant_range_selectable(struct edid *edid);
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode);
 int drm_add_modes_noedid(struct drm_connector *connector,
 int hdisplay, int vdisplay);
 void drm_set_preferred_mode(struct drm_connector *connector,
-- 
2.10.2
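
A minimal standalone sketch of the rule the new helper wraps, for illustration only. It assumes the input is a CEA video identification code (VIC), with 0 standing for "not a CEA mode" as in the drm_match_cea_mode() > 1 test shown in the patch. The enum and function names are hypothetical and are not the DRM API.

```c
/* Hypothetical stand-ins for the HDMI quantization range values. */
enum toy_quant_range {
	TOY_QUANT_RANGE_LIMITED,
	TOY_QUANT_RANGE_FULL,
};

/*
 * Default RGB quantization range for a mode, keyed on its VIC:
 *   vic == 0  -> not a CEA mode  -> full range
 *   vic == 1  -> first CEA mode  -> full range
 *   vic  > 1  -> other CEA modes -> limited range
 */
static enum toy_quant_range toy_default_rgb_quant_range(unsigned int vic)
{
	return vic > 1 ? TOY_QUANT_RANGE_LIMITED : TOY_QUANT_RANGE_FULL;
}
```

Keeping the VIC comparison in one place lets each caller compare against a named range value instead of repeating the bare "> 1" test, which is what the three driver hunks above do.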


Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

2017-01-11 Thread Joonas Lahtinen

On Wed, 2017-01-11 at 14:08 +, Chris Wilson wrote:
> It is an error to start a new request on the same timeline (ringbuffer)
> as the current one before the current is submitted. If there are two
> requests emitting to the ringbuffer at the same time, the operation is
> undefined. We can catch this by checking for the timeline having a later
> seqno than ours when we come to submit our request.
> 
> Currently we have this check at the end of __i915_add_request, but
> having an early check as well isolates a failure in the caller versus a
> failure in sealing the request (i.e. from inside __i915_add_request
> itself). For example, CI is currently tripping over this late assertion
> on ctg/ilk:
> 
> [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> [  100.336333] [ cut here ]
> [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> [  100.336347] invalid opcode:  [#1] PREEMPT SMP
> [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic 
> snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei 
> e1000e ptp pps_core [last unloaded: i915]
> [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U  
> 4.10.0-rc3-CI-CI_DRM_2045+ #1
> [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 
> 04/22/2009
> [  100.336386] task: 88012b738040 task.stack: c956
> [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> [  100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212
> [  100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: 
> 0001
> [  100.336456] RDX: 8001 RSI: 88012b738860 RDI: 
> 
> [  100.336461] RBP: c9563b00 R08: 880133bb8780 R09: 
> 
> [  100.336466] R10:  R11:  R12: 
> 88012f53d950
> [  100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: 
> 88012f53d960
> [  100.336477] FS:  7f0d19da38c0() GS:88013bc0() 
> knlGS:
> [  100.336483] CS:  0010 DS:  ES:  CR0: 80050033
> [  100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: 
> 000406f0
> [  100.336496] Call Trace:
> [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
> [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
> [  100.33]  drm_ioctl+0x200/0x450
> [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
> [  100.336708]  do_vfs_ioctl+0x90/0x6e0
> [  100.336716]  ? up_read+0x1a/0x40
> [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
> [  100.336730]  SyS_ioctl+0x3c/0x70
> [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  100.336745] RIP: 0033:0x7f0d187cb357
> [  100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [  100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: 
> 7f0d187cb357
> [  100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: 
> 0003
> [  100.336775] RBP:  R08:  R09: 
> 0022
> [  100.336782] R10: 0007 R11: 0246 R12: 
> 0002
> [  100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: 
> 7ffe0b2f7d50
> [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 
> 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 
> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: 
> c9563ac0
> [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
> 
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 
> Cc: Joonas Lahtinen 

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 03:13:29PM +0100, Michal Wajdeczko wrote:
> > +   vma = i915_gem_object_ggtt_pin(huc_fw->obj, NULL, 0, 0, 0);
> > +   if (IS_ERR(vma)) {
> > +   DRM_DEBUG_DRIVER("pin failed %d\n", (int)PTR_ERR(vma));
> > +   return PTR_ERR(vma);
> > +   }

Just asking a stupid question: Does the HuC have the same limitation as
the GuC on not being able to map certain ranges of the GGTT? From the
earlier discussion on the failures, I got the impression the HuC had the
same limitations.

> > +
> > +   /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> > +   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> > +
> > +   intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> > +
> > +   /* init WOPCM */
> > +   I915_WRITE(GUC_WOPCM_SIZE, intel_guc_wopcm_size(dev_priv));
> > +   I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE |
> > +   HUC_LOADING_AGENT_GUC);
> > +
> > +   /* Set the source address for the uCode */
> > +   offset = i915_ggtt_offset(vma) + huc_fw->header_offset;

If huc does have the same limits as the guc, please use guc_ggtt_offset()
for the extra verification on the address before use.
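
For illustration only, the kind of check meant here could look like the
standalone sketch below. It is not the i915 helper; the two bounds and all
example_* names are assumptions made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bounds, assumed for illustration only. */
#define EXAMPLE_WOPCM_TOP 0x00080000u /* assumed lowest usable offset */
#define EXAMPLE_GGTT_TOP  0xfee00000u /* assumed first unusable offset */

/* True when [offset, offset + size) lies wholly inside the usable window. */
static bool example_fw_range_ok(uint32_t offset, uint32_t size)
{
	if (offset < EXAMPLE_WOPCM_TOP)
		return false;
	if (offset >= EXAMPLE_GGTT_TOP)
		return false;
	/* Compare against the remaining room to avoid 32-bit overflow. */
	return size <= EXAMPLE_GGTT_TOP - offset;
}
```

A loader would run such a check on the pinned address before programming it
as the DMA source, so a bad placement fails early instead of at transfer time.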
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion

2017-01-11 Thread Patchwork

== Series Details ==

Series: series starting with [1/3] drm/i915: Invalidate the guc ggtt TLB upon 
insertion
URL   : https://patchwork.freedesktop.org/series/17829/
State : failure

== Summary ==

Series 17829v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17829/revisions/1/mbox/

Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-skl-6260u)
                pass       -> DMESG-WARN (fi-bxt-t5700)
                pass       -> DMESG-WARN (fi-skl-6700hq)
        Subgroup basic-reload-final:
                pass       -> DMESG-WARN (fi-bxt-j4205)
Test gem_busy:
        Subgroup basic-hang-default:
                pass       -> DMESG-WARN (fi-skl-6770hq)
Test gem_exec_suspend:
        Subgroup basic-s3:
                pass       -> INCOMPLETE (fi-skl-6770hq)
                pass       -> INCOMPLETE (fi-bxt-j4205)
        Subgroup basic-s4-devices:
                pass       -> INCOMPLETE (fi-skl-6700k)
Test kms_force_connector_basic:
        Subgroup force-connector-state:
                pass       -> DMESG-WARN (fi-snb-2520m)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> INCOMPLETE (fi-skl-6260u)
                pass       -> INCOMPLETE (fi-skl-6700hq)
                incomplete -> SKIP       (fi-bsw-n3050)
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                incomplete -> PASS       (fi-byt-n2820)

fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:82   pass:69   dwarn:1   dfail:0   fail:0   skip:11 
fi-bxt-t5700 total:82   pass:68   dwarn:1   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:208  pass:195  dwarn:1   dfail:0   fail:0   skip:11 
fi-skl-6700hq total:208  pass:187  dwarn:1   dfail:0   fail:0   skip:19 
fi-skl-6700k total:83   pass:68   dwarn:3   dfail:0   fail:0   skip:11 
fi-skl-6770hq total:82   pass:77   dwarn:1   dfail:0   fail:0   skip:3  
fi-snb-2520m total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

f0350ffa1b2bc16dc49fdc2fce10776d604a1c5f drm-tip: 2017y-01m-11d-12h-34m-12s UTC 
integration manifest
ad8e4de HAX enable guc submission for CI
b2b1516 drm/i915/scheduler: emulate a scheduler for guc
7ed1066 drm/i915: Invalidate the guc ggtt TLB upon insertion

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3480/

Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 04:19:19PM +0200, Joonas Lahtinen wrote:
> On Wed, 2017-01-11 at 14:08 +, Chris Wilson wrote:
> > It is an error to start a new request on the same timeline (ringbuffer)
> > as the current one before the current is submitted. If there are two
> > requests emitting to the ringbuffer at the same time, the operation is
> > undefined. We can catch this by checking for the timeline having a later
> > seqno than ours when we come to submit our request.
> > 
> > Currently we have this check at the end of __i915_add_request, but
> > having an early check as well isolates a failure in the caller versus a
> > failure in sealing the request (i.e. from inside __i915_add_request
> > itself). For example, CI is currently tripping over this late assertion
> > on ctg/ilk:
> > 
> > [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
> > [  100.336333] ------------[ cut here ]------------
> > [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
> > [  100.336347] invalid opcode:  [#1] PREEMPT SMP
> > [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic 
> > snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei 
> > e1000e ptp pps_core [last unloaded: i915]
> > [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U  
> > 4.10.0-rc3-CI-CI_DRM_2045+ #1
> > [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 
> > 04/22/2009
> > [  100.336386] task: 88012b738040 task.stack: c956
> > [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
> > [  100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212
> > [  100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: 
> > 0001
> > [  100.336456] RDX: 8001 RSI: 88012b738860 RDI: 
> > 
> > [  100.336461] RBP: c9563b00 R08: 880133bb8780 R09: 
> > 
> > [  100.336466] R10:  R11:  R12: 
> > 88012f53d950
> > [  100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: 
> > 88012f53d960
> > [  100.336477] FS:  7f0d19da38c0() GS:88013bc0() 
> > knlGS:
> > [  100.336483] CS:  0010 DS:  ES:  CR0: 80050033
> > [  100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: 
> > 000406f0
> > [  100.336496] Call Trace:
> > [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
> > [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
> > [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
> > [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
> > [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
> > [  100.33]  drm_ioctl+0x200/0x450
> > [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
> > [  100.336708]  do_vfs_ioctl+0x90/0x6e0
> > [  100.336716]  ? up_read+0x1a/0x40
> > [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
> > [  100.336730]  SyS_ioctl+0x3c/0x70
> > [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> > [  100.336745] RIP: 0033:0x7f0d187cb357
> > [  100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: 
> > 0010
> > [  100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: 
> > 7f0d187cb357
> > [  100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: 
> > 0003
> > [  100.336775] RBP:  R08:  R09: 
> > 0022
> > [  100.336782] R10: 0007 R11: 0246 R12: 
> > 0002
> > [  100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: 
> > 7ffe0b2f7d50
> > [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 
> > 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff 
> > <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
> > [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: 
> > c9563ac0
> > [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
> > 
> > Signed-off-by: Chris Wilson 
> > Cc: Tvrtko Ursulin 
> > Cc: Joonas Lahtinen 
> 
> Reviewed-by: Joonas Lahtinen 

Thanks, pushed this promptly since I'm trying to understand the CI
failure.
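
For anyone following along, the check described in the quoted commit message
boils down to a wraparound-safe seqno comparison. A minimal standalone sketch,
with hypothetical example_* types that only stand in for the real timeline and
request structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is at or after b" for 32-bit sequence numbers. */
static bool example_seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

struct example_timeline {
	uint32_t last_submitted_seqno;
};

struct example_request {
	uint32_t seqno;
};

/*
 * True while no request with our seqno or a later one has been submitted
 * on the timeline, i.e. nobody started emitting in the middle of ours.
 */
static bool example_request_is_next(const struct example_timeline *tl,
				    const struct example_request *rq)
{
	return !example_seqno_passed(tl->last_submitted_seqno, rq->seqno);
}
```

Running the same predicate both on entry to the add-request path and again
when sealing the request is what separates a caller bug from a bug inside the
sealing code.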
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [PATCH 03/10] drm/i915/psr: fix blank screen issue for psr2

2017-01-11 Thread vathsala nagaraju

psr1 and psr2 are mutually exclusive, i.e. when psr2 is enabled,
psr1 must be disabled. When psr2 is exited, bit 31 of register
PSR2_CTL must be set to 0, but currently bit 31 of SRD_CTL
(the psr1 control register) is cleared instead.
Also, the PSR2_IDLE state is looked up from SRD_STATUS (a psr1
register) instead of the PSR2_STATUS register. SRD_STATUS holds the
wrong data, resulting in a blank screen.
hsw_enable_source is split into hsw_enable_source_psr1 and
hsw_enable_source_psr2 for easier code review and maintenance,
as suggested by Rodrigo and Jim.
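
As a rough illustration of the exit fix described above (a standalone sketch,
not driver code; the struct and the example_* names are hypothetical and the
two fields merely stand in for SRD_CTL and PSR2_CTL):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PSR_ENABLE_BIT (UINT32_C(1) << 31)

struct example_psr_regs {
	uint32_t psr1_ctl; /* stands in for SRD_CTL */
	uint32_t psr2_ctl; /* stands in for PSR2_CTL */
};

/* Clear the enable bit in the control register of the mode in use. */
static void example_psr_exit(struct example_psr_regs *regs, bool psr2_in_use)
{
	if (psr2_in_use)
		regs->psr2_ctl &= ~EXAMPLE_PSR_ENABLE_BIT;
	else
		regs->psr1_ctl &= ~EXAMPLE_PSR_ENABLE_BIT;
}
```

The point is only that the register being cleared has to follow the active
mode; the other register is left untouched.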

v2: (Rodrigo)
- Rename hsw_enable_source_psr* to intel_enable_source_psr*

v3: (Rodrigo)
- In hsw_psr_disable ,
  1) for psr active case, handle psr2 followed by psr1.
  2) psr inactive case, handle psr2 followed by psr1

v4:(Rodrigo)
- move psr2 restriction(32X20) to match_conditions function
  returning false and fully blocking PSR to a new patch before
  this one.

Cc: Rodrigo Vivi 
Cc: Jim Bride 
Signed-off-by: Vathsala Nagaraju 
Signed-off-by: Patil Deepti 
---
 drivers/gpu/drm/i915/i915_reg.h  |   3 +
 drivers/gpu/drm/i915/intel_psr.c | 122 +--
 2 files changed, 95 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 00970aa..7830e6e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3615,6 +3615,9 @@ enum {
 #define   EDP_PSR2_FRAME_BEFORE_SU_MASK  (0xf<<4)
 #define   EDP_PSR2_IDLE_MASK   0xf
 
+#define EDP_PSR2_STATUS_CTL            _MMIO(0x6f940)
+#define EDP_PSR2_STATUS_STATE_MASK     (0xf<<28)
+
 /* VGA port control */
 #define ADPA   _MMIO(0x61100)
 #define PCH_ADPA   _MMIO(0xe1100)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 707cae8..19c7090 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -261,7 +261,7 @@ static void vlv_psr_activate(struct intel_dp *intel_dp)
   VLV_EDP_PSR_ACTIVE_ENTRY);
 }
 
-static void hsw_psr_enable_source(struct intel_dp *intel_dp)
+static void intel_enable_source_psr1(struct intel_dp *intel_dp)
 {
struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
struct drm_device *dev = dig_port->base.base.dev;
@@ -312,14 +312,29 @@ static void hsw_psr_enable_source(struct intel_dp 
*intel_dp)
val |= EDP_PSR_TP1_TP2_SEL;
 
I915_WRITE(EDP_PSR_CTL, val);
+}
 
-   if (!dev_priv->psr.psr2_support)
-   return;
+static void intel_enable_source_psr2(struct intel_dp *intel_dp)
+{
+   struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
+   struct drm_device *dev = dig_port->base.base.dev;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+   /*
+* Let's respect VBT in case VBT asks a higher idle_frame value.
+* Let's use 6 as the minimum to cover all known cases including
+* the off-by-one issue that HW has in some cases. Also there are
+* cases where sink should be able to train
+* with the 5 or 6 idle patterns.
+*/
+   uint32_t idle_frames = max(6, dev_priv->vbt.psr.idle_frames);
+   uint32_t val = EDP_PSR_ENABLE;
+
+   val |= idle_frames << EDP_PSR_IDLE_FRAME_SHIFT;
 
/* FIXME: selective update is probably totally broken because it doesn't
 * mesh at all with our frontbuffer tracking. And the hw alone isn't
 * good enough. */
-   val = EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
+   val |= EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
 
if (dev_priv->vbt.psr.tp2_tp3_wakeup_time > 5)
val |= EDP_PSR2_TP2_TIME_2500;
@@ -333,6 +348,19 @@ static void hsw_psr_enable_source(struct intel_dp 
*intel_dp)
I915_WRITE(EDP_PSR2_CTL, val);
 }
 
+static void hsw_psr_enable_source(struct intel_dp *intel_dp)
+{
+   struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
+   struct drm_device *dev = dig_port->base.base.dev;
+   struct drm_i915_private *dev_priv = to_i915(dev);
+
+   /* psr1 and psr2 are mutually exclusive.*/
+   if (dev_priv->psr.psr2_support)
+   intel_enable_source_psr2(intel_dp);
+   else
+   intel_enable_source_psr1(intel_dp);
+}
+
 static bool intel_psr_match_conditions(struct intel_dp *intel_dp)
 {
struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
@@ -417,7 +445,10 @@ static void intel_psr_activate(struct intel_dp *intel_dp)
struct drm_device *dev = intel_dig_port->base.base.dev;
struct drm_i915_private *dev_priv = to_i915(dev);
 
-   WARN_ON(I915_READ(EDP_PSR_CTL) & EDP_PSR_ENABLE);
+   if (dev_priv->psr.psr2_support)
+   WARN_ON(I915_READ(EDP_PSR2_CTL) & EDP_PSR2_ENABLE);
+   else
+   WARN_ON(I915_READ(EDP_PSR_CTL) & EDP_PSR_ENABLE);
WARN_ON(dev_priv->psr.active);
lockdep_assert_held(&dev_priv->psr.lock);
 
@@ -468,10 +499,11 @@ void intel_psr_enable

Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG

2017-01-11 Thread Ville Syrjälä

On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote:
> On 05.01.2017 at 12:37, Ville Syrjälä wrote:
> > On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote:
> >> From: Rainer Hochecker 
> >>
> >> This adds fourcc codes for 16bit planes required for DRM buffer
> >> export to mesa.
> >>
> >> Signed-off-by: Rainer Hochecker 
> > Reviewed-by: Ville Syrjälä 
> 
> Good to see some work landing on that part, patch is Acked-by: Christian 
> König .

Has the userspace side of this been reviewed already?

/me wonders if it's safe to push this...

> 
> >
> >> ---
> >>   include/uapi/drm/drm_fourcc.h | 7 +++
> >>   1 file changed, 7 insertions(+)
> >>
> >> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> >> index a5890bf..d230e58 100644
> >> --- a/include/uapi/drm/drm_fourcc.h
> >> +++ b/include/uapi/drm/drm_fourcc.h
> >> @@ -41,10 +41,17 @@ extern "C" {
> >>   /* 8 bpp Red */
> >>   #define DRM_FORMAT_R8 fourcc_code('R', '8', ' ', ' ') /* [7:0] R */
> >>   
> >> +/* 16 bpp Red */
> >> +#define DRM_FORMAT_R16 fourcc_code('R', '1', '6', ' ') /* [15:0] R little endian */
> >> +
> >>   /* 16 bpp RG */
> >>   #define DRM_FORMAT_RG88 fourcc_code('R', 'G', '8', '8') /* [15:0] R:G 8:8 little endian */
> >>   #define DRM_FORMAT_GR88 fourcc_code('G', 'R', '8', '8') /* [15:0] G:R 8:8 little endian */
> >>   
> >> +/* 32 bpp RG */
> >> +#define DRM_FORMAT_RG1616 fourcc_code('R', 'G', '3', '2') /* [31:0] R:G 16:16 little endian */
> >> +#define DRM_FORMAT_GR1616 fourcc_code('G', 'R', '3', '2') /* [31:0] G:R 16:16 little endian */
> >> +
> >>   /* 8 bpp RGB */
> >>   #define DRM_FORMAT_RGB332 fourcc_code('R', 'G', 'B', '8') /* [7:0] R:G:B 3:3:2 */
> >>   #define DRM_FORMAT_BGR233 fourcc_code('B', 'G', 'R', '8') /* [7:0] B:G:R 2:3:3 */
> >> -- 
> >> 2.9.3
> 

-- 
Ville Syrjälä
Intel OTC
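
As background on the codes in the quoted patch: a fourcc packs four
characters into one 32-bit value, first character in the lowest byte. The
sketch below mirrors that packing with a local EXAMPLE_FOURCC macro (a
stand-in written for this example, assuming an ASCII character set) and adds
a hypothetical helper to turn a code back into text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four characters, first character in the least significant byte. */
#define EXAMPLE_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define EXAMPLE_FORMAT_R16 EXAMPLE_FOURCC('R', '1', '6', ' ')

/* Unpack a code into a NUL-terminated four-character string. */
static void example_fourcc_to_str(uint32_t code, char out[5])
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((code >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

With this packing the 'R', '1', '6', ' ' code comes out as 0x20363152, and
the 32 bpp RG format in the patch decodes back to the string "RG32".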

Re: [Intel-gfx] DP compliance failure due to dithering for 18bpp video pattern

2017-01-11 Thread Jani Nikula

On Tue, 10 Jan 2017, Manasi Navare  wrote:
> Hi All,
>
> We are seeing CRC check failures in some of the 18bpp video pattern
> DP Compliance tests causing the tests to fail. On further investigation, it is
> root-caused to dithering that the i915 driver enables in case of 18bpp pipe
> configuration that messes up the CRC and causes the test to fail.

The CTS spec actually accounts for CRC failures caused by dithering and
color space conversions. See section 3.2.1. However, it would be
preferable to be able to automate this.

> Some of the approaches that can solve this problem are:
> 1.  Add a new method in intel_dp.c to request the compliance test state.
> Call this new method in intel_display.c to not enable dithering during a
> compliance test. Issue with this is it makes the general portion of the driver
> compliance aware.
>
> 2.  Move the dithering enable to compute_config methods in all encoder source
> files. Issue: Lot of duplicate code and DP is the only encoder that uses 
> 18bpc.
>
> 3.  Disable dithering at all times in the driver. However this can cause image
> quality issue with 8bpc plane and 6 bit pipe.
>
> Any suggestions on which approach can be implemented in order to pass
> compliance?

I can't find any mention in the specs that we couldn't enable/disable
dithering on the fly. It's PIPE_MISC for BDW+ and PIPE_CONF for the
rest. So I'm wondering about doing...

4. Disable dithering at intel_dp_sink_crc_start() and enable it again
   (according to config->dither) at intel_dp_sink_crc_stop(). It's
   similar to the hsw_disable_ips() and hsw_enable_ips() calls, but
   would have to cover more platforms.
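
To make option 4 concrete, here is a minimal standalone sketch. It models the
register as a plain variable; the bit position, struct and example_* names
are all hypothetical, not the real PIPE_MISC/PIPE_CONF layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_DITHER_ENABLE (UINT32_C(1) << 4) /* hypothetical bit */

struct example_pipe {
	uint32_t misc_reg;  /* stands in for PIPE_MISC / PIPE_CONF */
	bool config_dither; /* what the computed pipe config asked for */
};

/* Entering sink CRC capture: force dithering off. */
static void example_sink_crc_start(struct example_pipe *pipe)
{
	pipe->misc_reg &= ~EXAMPLE_DITHER_ENABLE;
}

/* Leaving sink CRC capture: restore whatever the config requested. */
static void example_sink_crc_stop(struct example_pipe *pipe)
{
	if (pipe->config_dither)
		pipe->misc_reg |= EXAMPLE_DITHER_ENABLE;
	else
		pipe->misc_reg &= ~EXAMPLE_DITHER_ENABLE;
}
```

Restoring from the config rather than from a saved register value keeps the
stop path correct even if a modeset happened while the CRC was running.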

Ville, thoughts on changing dithering on the fly?

BR,
Jani.

>
> Regards
> Manasi
>
>
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center

[Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support

2017-01-11 Thread Anusha Srivatsa

The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
is used for both cases.

HuC loading needs to be before GuC loading. The WOPCM setting must
be done early before loading any of them.

v2: rebased on-top of drm-intel-nightly.
removed if(HAS_GUC()) before the guc call. (D.Gordon)
update huc_version number of format.
v3: rebased to drm-intel-nightly, changed the file name format to
match the one in the huc package.
Changed dev->dev_private to to_i915()
v4: moved function back to where it was.
change wait_for_atomic to wait_for.
v5: rebased + comment changes.
v7: rebased.
v8: rebased.
v9: rebased. Changed the year in the copyright message to reflect
the right year. Correct the comments, remove the unwanted WARN message,
replace drm_gem_object_unreference() with i915_gem_object_put(). Make the
prototypes in intel_huc.h non-extern.
v10: rebased. Update the file construction done by HuC. It is similar to
GuC. Adopted the approach used in
https://patchwork.freedesktop.org/patch/104355/
v11: Fix warnings remove old declaration
v12: Change dev to dev_priv in macro definition.
Corrected comments.
v13: rebased.
v14: rebased on top of drm-tip
v15: rebased. Updated functions intel_huc_load(),intel_huc_init() and
intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents
of intel_huc.h to intel_uc.h
v16: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to guc_wopcm_size().
Remove unwanted checks in intel_uc.h. Rename huc_fw in struct intel_huc
to simply fw to avoid redundancy.
v17: rebased.
v18: rebased. Correct comments.
v19: rebased. Correct comments. Make intel_huc_fini() accept dev_priv
instead of dev like intel_huc_init() and intel_huc_load(). Move definition
to i915_guc_reg.h from intel_uc.h. Clean DMA_CTRL bits after HuC DMA
transfer in huc_ucode_xfer() instead of guc_ucode_xfer(). Add suitable
WARNs to give extra info.

Cc: Arkadiusz Hiler 
Cc: Michal Wajdeczko 
Tested-by: Xiang Haihao 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Alex Dai 
Signed-off-by: Peter Antoine 
---
 drivers/gpu/drm/i915/Makefile   |   1 +
 drivers/gpu/drm/i915/i915_drv.c |   3 +
 drivers/gpu/drm/i915/i915_drv.h |   2 +
 drivers/gpu/drm/i915/i915_guc_reg.h |   6 +
 drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
 drivers/gpu/drm/i915/intel_huc_loader.c | 264 
 drivers/gpu/drm/i915/intel_uc.h |  14 ++
 7 files changed, 294 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5196509..45ae124 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \
 # general-purpose microcontroller (GuC) support
 i915-y += intel_uc.o \
  intel_guc_loader.o \
+ intel_huc_loader.o \
  i915_guc_submission.o
 
 # autogenerated null render state
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index aefab9a..d6f32f4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
if (ret)
goto cleanup_irq;
 
+   intel_huc_init(dev_priv);
intel_guc_init(dev_priv);
 
ret = i915_gem_init(dev_priv);
@@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
i915_gem_fini(dev_priv);
 cleanup_irq:
intel_guc_fini(dev_priv);
+   intel_huc_fini(dev_priv);
drm_irq_uninstall(dev);
intel_teardown_gmbus(dev_priv);
 cleanup_csr:
@@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev)
drain_workqueue(dev_priv->wq);
 
intel_guc_fini(dev_priv);
+   intel_huc_fini(dev_priv);
i915_gem_fini(dev_priv);
intel_fbc_cleanup_cfb(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b84c1d1..2a17df2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2073,6 +2073,7 @@ struct drm_i915_private {
 
struct intel_gvt *gvt;
 
+   struct intel_huc huc;
struct intel_guc guc;
 
struct intel_csr csr;
@@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_GUC(dev_priv)  ((dev_priv)->info.has_guc)
 #define HAS_GUC_UCODE(dev_priv)  (HAS_GUC(dev_priv))
 #define HAS_GUC_SCHED(dev_priv)  (HAS_GUC(dev_priv))
+#define HAS_HUC_UCODE(dev_priv)  (HAS_GUC(dev_priv))
 
 #define HAS_RESOURCE_STREAMER(dev_priv) 
((dev_priv)->info.has_resource_streamer)
 
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h 
b/drivers/gpu/drm/i915/i915_guc_reg.h
index 6a0adaf..35cf991 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -61,12 +61,18 @@
 #define   DMA_ADDRESS_SPACE_GTT  (8 << 16)
 #define DMA_COPY_SIZE

[Intel-gfx] [PATCH] drm/i915/guc: Make sure vma containing firmware is GuC mappable

2017-01-11 Thread Michał Winiarski

Since commit 4741da925fa3 ("drm/i915/guc: Assert that all GGTT offsets used
by the GuC are mappable"), we're asserting that GuC firmware is in the
GuC mappable range.
Except we're not pinning the object with bias, which means it's possible
to trigger this assert. Let's add a proper bias.
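
For readers unfamiliar with the flag-plus-offset idiom used in the change
below: because the bias is page aligned, it can share one word with flag bits
that live in the low 12 bits. A standalone sketch of that idea, where the flag
bit, mask and example_* helpers are hypothetical stand-ins and not the driver
definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_PIN_OFFSET_BIAS (UINT64_C(1) << 3) /* hypothetical flag bit */
#define EXAMPLE_PIN_OFFSET_MASK (~UINT64_C(4095))  /* page-aligned offset bits */

/* Combine the bias flag with a page-aligned minimum offset. */
static uint64_t example_pin_flags(uint64_t bias)
{
	assert((bias & ~EXAMPLE_PIN_OFFSET_MASK) == 0);
	return EXAMPLE_PIN_OFFSET_BIAS | bias;
}

static bool example_has_bias(uint64_t flags)
{
	return (flags & EXAMPLE_PIN_OFFSET_BIAS) != 0;
}

/* Pick a placement: the lowest free offset, but never below the bias. */
static uint64_t example_place(uint64_t lowest_free, uint64_t flags)
{
	uint64_t bias = flags & EXAMPLE_PIN_OFFSET_MASK;

	if (example_has_bias(flags) && lowest_free < bias)
		return bias;
	return lowest_free;
}
```

Without the bias flag the object may land at any free offset, including one
below the window the firmware loader asserts on.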

Cc: Chris Wilson 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Michał Winiarski 
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index aa2b866..5a6ab87 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -360,7 +360,8 @@ static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
 		return ret;
 	}
 
-	vma = i915_gem_object_ggtt_pin(guc_fw->guc_fw_obj, NULL, 0, 0, 0);
+	vma = i915_gem_object_ggtt_pin(guc_fw->guc_fw_obj, NULL, 0, 0,
+				       PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
 	if (IS_ERR(vma)) {
 		DRM_DEBUG_DRIVER("pin failed %d\n", (int)PTR_ERR(vma));
 		return PTR_ERR(vma);
-- 
2.9.3


[Intel-gfx] [PATCH 04/10] drm/i915/psr: disable aux_frame_sync on psr2 exit

2017-01-11 Thread vathsala nagaraju

A screen freeze is observed if AUX_FRAME_SYNC is not disabled
on psr2 exit. AUX_FRAME_SYNC, which psr2 needs, is enabled during
psr2 entry, so it must be disabled on psr2 exit.

v2: rebase

Cc: Rodrigo Vivi 
Cc: Jim Bride 
Signed-off-by: Vathsala Nagaraju 
Signed-off-by: Patil Deepti 
Reviewed-by: Rodrigo Vivi 
---
 drivers/gpu/drm/i915/intel_psr.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 19c7090..52b8c80 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -590,6 +590,11 @@ static void hsw_psr_disable(struct intel_dp *intel_dp)
struct drm_i915_private *dev_priv = to_i915(dev);
 
if (dev_priv->psr.active) {
+   if (dev_priv->psr.aux_frame_sync)
+   drm_dp_dpcd_writeb(&intel_dp->aux,
+   DP_SINK_DEVICE_AUX_FRAME_SYNC_CONF,
+   0);
+
if (dev_priv->psr.psr2_support) {
I915_WRITE(EDP_PSR2_CTL,
I915_READ(EDP_PSR2_CTL) &
@@ -728,6 +733,10 @@ static void intel_psr_exit(struct drm_i915_private 
*dev_priv)
return;
 
if (HAS_DDI(dev_priv)) {
+   if (dev_priv->psr.aux_frame_sync)
+   drm_dp_dpcd_writeb(&intel_dp->aux,
+   DP_SINK_DEVICE_AUX_FRAME_SYNC_CONF,
+   0);
if (dev_priv->psr.psr2_support) {
val = I915_READ(EDP_PSR2_CTL);
WARN_ON(!(val & EDP_PSR2_ENABLE));
-- 
1.9.1


[Intel-gfx] [PATCH 06/10] drm/i915/psr: set CHICKEN_TRANS for psr2

2017-01-11 Thread vathsala nagaraju

As per bspec, CHICKEN_TRANS_EDP bits 12 and 15 must be programmed in
the psr2 enable sequence.
bit 12: Program the Transcoder EDP VSC DIP header with a valid setting for
PSR2 and set CHICKEN_TRANS_EDP (0x420cc) bit 12 for the programmable
header packet.
bit 15: Set CHICKEN_TRANS_EDP (0x420cc) bit 15 if the Y coordinate is supported.
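
A small standalone sketch of the arithmetic involved, for illustration. It
assumes the per-transcoder register macro strides linearly from the A and B
addresses given in this patch; the example_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_CHICKEN_TRANS_A 0x420c0u
#define EXAMPLE_CHICKEN_TRANS_B 0x420c4u
#define EXAMPLE_TRANS_EDP       3u

#define EXAMPLE_VSC_ENABLE_PROG_HEADER  (UINT32_C(1) << 12)
#define EXAMPLE_ADD_VERTICAL_LINE_COUNT (UINT32_C(1) << 15)

/* Register offset for a transcoder, assuming a constant stride (B - A). */
static uint32_t example_chicken_trans_offset(uint32_t trans)
{
	return EXAMPLE_CHICKEN_TRANS_A +
	       trans * (EXAMPLE_CHICKEN_TRANS_B - EXAMPLE_CHICKEN_TRANS_A);
}

/* Value to program: bit 12 always, bit 15 only with Y-coordinate support. */
static uint32_t example_chicken_value(bool y_coord_supported)
{
	uint32_t val = EXAMPLE_VSC_ENABLE_PROG_HEADER;

	if (y_coord_supported)
		val |= EXAMPLE_ADD_VERTICAL_LINE_COUNT;
	return val;
}
```

Under that stride assumption, transcoder index 3 resolves to 0x420cc, which
matches the CHICKEN_TRANS_EDP address quoted in the commit message.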

v2: (Rodrigo)
- move CHICKEN_TRANS_EDP bit set logic right after setup_vsc

v3:(Rodrigo)
- initialize chicken_trans to CHICKEN_TRANS_BIT12 instead of 0

v4:(chris wilson)
- use BIT(12), remove CHICKEN_TRANS_BIT12
- remove unnecessary comments
- update commit message

v5:
- rename bit 12 PSR2_VSC_ENABLE_PROG_HEADER
- rename bit 15 PSR2_ADD_VERTICAL_LINE_COUNT

Cc: Rodrigo Vivi 
Cc: Jim Bride 
Signed-off-by: vathsala nagaraju 
Signed-off-by: Patil Deepti 
---
 drivers/gpu/drm/i915/i915_reg.h  | 7 +++
 drivers/gpu/drm/i915/intel_psr.c | 5 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7830e6e..7a325fb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6449,6 +6449,13 @@ enum {
 #define  BDW_DPRS_MASK_VBLANK_SRD  (1 << 0)
 #define CHICKEN_PIPESL_1(pipe) _MMIO_PIPE(pipe, _CHICKEN_PIPESL_1_A, 
_CHICKEN_PIPESL_1_B)
 
+#define CHICKEN_TRANS_A 0x420c0
+#define CHICKEN_TRANS_B 0x420c4
+#define CHICKEN_TRANS(trans) _MMIO_TRANS(trans, CHICKEN_TRANS_A, CHICKEN_TRANS_B)
+#define TRANS_EDP  3
+#define PSR2_VSC_ENABLE_PROG_HEADER    (1<<12)
+#define PSR2_ADD_VERTICAL_LINE_COUNT   (1<<15)
+
 #define DISP_ARB_CTL   _MMIO(0x45000)
 #define  DISP_FBC_MEMORY_WAKE  (1<<31)
 #define  DISP_TILE_SURFACE_SWIZZLING   (1<<13)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 3cf5cc4..b582220 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -480,6 +480,7 @@ void intel_psr_enable(struct intel_dp *intel_dp)
struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
struct drm_device *dev = intel_dig_port->base.base.dev;
struct drm_i915_private *dev_priv = to_i915(dev);
+   u32 chicken;
 
if (!HAS_PSR(dev_priv)) {
DRM_DEBUG_KMS("PSR not supported on this platform\n");
@@ -505,6 +506,10 @@ void intel_psr_enable(struct intel_dp *intel_dp)
if (HAS_DDI(dev_priv)) {
if (dev_priv->psr.psr2_support) {
skl_psr_setup_su_vsc(intel_dp);
+   chicken = PSR2_VSC_ENABLE_PROG_HEADER;
+   if (dev_priv->psr.y_cord_support)
+   chicken |= PSR2_ADD_VERTICAL_LINE_COUNT;
+   I915_WRITE(CHICKEN_TRANS(TRANS_EDP), chicken);
} else {
/* set up vsc header for psr1 */
hsw_psr_setup_vsc(intel_dp);
-- 
1.9.1


[Intel-gfx] [PATCH 07/10] drm/i915/psr: set PSR_MASK bits for deep sleep

2017-01-11 Thread vathsala nagaraju

Program EDP_PSR_DEBUG_CTL (PSR_MASK) to enable the system
to go to deep sleep while in psr2. PSR2_STATUS bits 31:28
should report the value 8 if the system enters the deep sleep state.

Also, EDP_FRAMES_BEFORE_SU_ENTRY is set to 1; if it is not set,
flickering is observed on the psr2 panel.
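
For illustration, decoding the state field mentioned above could look like
this standalone sketch. The example_* names are hypothetical; the field
position (bits 31:28) and the value 8 are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_STATUS_STATE_SHIFT 28
#define EXAMPLE_STATUS_STATE_MASK  (UINT32_C(0xf) << EXAMPLE_STATUS_STATE_SHIFT)
#define EXAMPLE_STATE_DEEP_SLEEP   8u /* value quoted in the description */

/* Extract the 4-bit state field from a status register value. */
static uint32_t example_psr2_state(uint32_t status)
{
	return (status & EXAMPLE_STATUS_STATE_MASK) >> EXAMPLE_STATUS_STATE_SHIFT;
}

static bool example_in_deep_sleep(uint32_t status)
{
	return example_psr2_state(status) == EXAMPLE_STATE_DEEP_SLEEP;
}
```

A debug check along these lines is one way to confirm that the mask bits
actually let the hardware reach deep sleep.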

v2: (Ilia Mirkin)
- Remove duplicate bit definition 25:27

v3: rebase

v4: rebase

Cc: Rodrigo Vivi 
Cc: Jim Bride 
Signed-off-by: Vathsala Nagaraju 
Signed-off-by: Patil Deepti 
Reviewed-by: Jim Bride 
---
 drivers/gpu/drm/i915/i915_reg.h  | 10 +++---
 drivers/gpu/drm/i915/intel_psr.c | 31 ---
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7a325fb..6ad9f06 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3597,9 +3597,12 @@ enum {
 #define   EDP_PSR_PERF_CNT_MASK    0xff
 
 #define EDP_PSR_DEBUG_CTL  _MMIO(dev_priv->psr_mmio_base + 0x60)
-#define   EDP_PSR_DEBUG_MASK_LPSP  (1<<27)
-#define   EDP_PSR_DEBUG_MASK_MEMUP (1<<26)
-#define   EDP_PSR_DEBUG_MASK_HPD   (1<<25)
+#define   EDP_PSR_DEBUG_MASK_MAX_SLEEP (1<<28)
+#define   EDP_PSR_DEBUG_MASK_LPSP  (1<<27)
+#define   EDP_PSR_DEBUG_MASK_MEMUP (1<<26)
+#define   EDP_PSR_DEBUG_MASK_HPD   (1<<25)
+#define   EDP_PSR_DEBUG_MASK_DISP_REG_WRITE    (1<<16)
+#define   EDP_PSR_DEBUG_EXIT_ON_PIXEL_UNDERRUN (1<<15)
 
 #define EDP_PSR2_CTL   _MMIO(0x6f900)
 #define   EDP_PSR2_ENABLE  (1<<31)
@@ -3614,6 +3617,7 @@ enum {
 #define   EDP_PSR2_FRAME_BEFORE_SU_SHIFT 4
 #define   EDP_PSR2_FRAME_BEFORE_SU_MASK  (0xf<<4)
 #define   EDP_PSR2_IDLE_MASK   0xf
+#define   EDP_FRAMES_BEFORE_SU_ENTRY   (1<<4)
 
 #define EDP_PSR2_STATUS_CTL            _MMIO(0x6f940)
 #define EDP_PSR2_STATUS_STATE_MASK (0xf<<28)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index b582220..f9d620b 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -338,7 +338,9 @@ static void intel_enable_source_psr2(struct intel_dp 
*intel_dp)
/* FIXME: selective update is probably totally broken because it doesn't
 * mesh at all with our frontbuffer tracking. And the hw alone isn't
 * good enough. */
-   val |= EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
+   val |= EDP_PSR2_ENABLE |
+   EDP_SU_TRACK_ENABLE |
+   EDP_FRAMES_BEFORE_SU_ENTRY;
 
if (dev_priv->vbt.psr.tp2_tp3_wakeup_time > 5)
val |= EDP_PSR2_TP2_TIME_2500;
@@ -510,20 +512,27 @@ void intel_psr_enable(struct intel_dp *intel_dp)
if (dev_priv->psr.y_cord_support)
chicken |= PSR2_ADD_VERTICAL_LINE_COUNT;
I915_WRITE(CHICKEN_TRANS(TRANS_EDP), chicken);
+   I915_WRITE(EDP_PSR_DEBUG_CTL,
+  EDP_PSR_DEBUG_MASK_MEMUP |
+  EDP_PSR_DEBUG_MASK_HPD |
+  EDP_PSR_DEBUG_MASK_LPSP |
+  EDP_PSR_DEBUG_MASK_MAX_SLEEP |
+  EDP_PSR_DEBUG_MASK_DISP_REG_WRITE);
} else {
/* set up vsc header for psr1 */
hsw_psr_setup_vsc(intel_dp);
+   /*
+* Per Spec: Avoid continuous PSR exit by masking MEMUP
+* and HPD. also mask LPSP to avoid dependency on other
+* drivers that might block runtime_pm besides
+* preventing  other hw tracking issues now we can rely
+* on frontbuffer tracking.
+*/
+   I915_WRITE(EDP_PSR_DEBUG_CTL,
+  EDP_PSR_DEBUG_MASK_MEMUP |
+  EDP_PSR_DEBUG_MASK_HPD |
+  EDP_PSR_DEBUG_MASK_LPSP);
}
-
-   /*
-* Per Spec: Avoid continuous PSR exit by masking MEMUP and HPD.
-* Also mask LPSP to avoid dependency on other drivers that
-* might block runtime_pm besides preventing other hw tracking
-* issues now we can rely on frontbuffer tracking.
-*/
-   I915_WRITE(EDP_PSR_DEBUG_CTL, EDP_PSR_DEBUG_MASK_MEMUP |
-  EDP_PSR_DEBUG_MASK_HPD | EDP_PSR_DEBUG_MASK_LPSP);
-
/* Enable PSR on the panel */
hsw_psr_enable_sink(intel_dp);
 
-- 
1.9.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code

2017-01-11 Thread Dave Hansen

On 01/10/2017 11:43 PM, Daniel Vetter wrote:
> On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
>> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
>>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
>>> Author: Daniel Vetter 
>>> Date:   Sun Dec 18 14:35:45 2016 +0100
>>>
>>> drm: prevent double-(un)registration for connectors
>>>
>>> Lack of that would perfectly explain that oops ... Otherwise still no idea
>>> what's going wrong.
>> No...  That's not in mainline as far as I can see.  Should I test with
>> it applied?
> Hm, I guess failed to cc: stable that one properly, iirc we decided the
> race fix is too academic and can't be hit in reality ;-)
> 
> Testing would be great. Probably conflicts because we extracted
> drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
> over the diff and then applying with some fudge should take care of that.

It doesn't apply to mainline, with or without the substitution you suggest.

Do you want that commit itself tested from -next?

Re: [Intel-gfx] [PATCH] drm/i915/guc: Make sure vma containing firmware is GuC mappable

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 04:17:39PM +0100, Michał Winiarski wrote:
> Since commit 4741da925fa3 ("drm/i915/guc: Assert that all GGTT offsets used
> by the GuC are mappable"), we're asserting that GuC firmware is in the
> GuC mappable range.
> Except we're not pinning the object with bias, which means it's possible
> to trigger this assert. Let's add a proper bias.
> 
> Cc: Chris Wilson 
> Cc: Daniele Ceraolo Spurio 
> Signed-off-by: Michał Winiarski 

Fits in with the checks we added. If they are correct, so is this fix ;)
Reviewed-by: Chris Wilson 
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Martin Peres


On 11/01/17 14:58, Joonas Lahtinen wrote:
> On ke, 2017-01-11 at 12:14 +, Chris Wilson wrote:
>> When switching between contexts using the aliasing_ppgtt, the VM is
>> shared. We don't need to reload the PD registers unless they are dirty.
>>
>> Martin Peres reported an issue that looks like corruption between
>> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
>> Rearrange switch_context to load the aliasing ppgtt on first use").
>> Switching between the same mm (the aliasing_ppgtt is used for all
>> contexts in this case) should be a nop, but appears to trigger some
>> side-effects in the context switch. However, as we know the switch
>> is redundant in this case, we can skip it and continue to ignore the
>> issue until somebody feels strong enough to investigate full-ppgtt on
>> gen7 again!
>>
>> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
>> Reported-by: Martin Peres 
>> Signed-off-by: Chris Wilson 
>> Cc: Martin Peres 
>
> Code looks good, could use the T-b's to verify.
>
> Reviewed-by: Joonas Lahtinen 
>
> Regards, Joonas

https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072 
was actually the commit that massively improved the corruption I was 
seeing in one benchmark while this patch had no visible impact.


However, my problem was that i915.enable_ppgtt=2 was set in 
/etc/modprobe.d/... and I had completely forgotten about it.


So yeah, now you know that f9326be5f1d3 massively broke enable_ppgtt=2, 
but not sure what you want to do about it.


There is no hurry though, as the defaults are sane.

Sorry for the noise everyone, I hope that my painful manual bisects will 
be useful if someone wants to make the second mode work :)


Martin

Re: [Intel-gfx] DP compliance failure due to dithering for 18bpp video pattern

2017-01-11 Thread Ville Syrjälä

On Wed, Jan 11, 2017 at 05:09:16PM +0200, Jani Nikula wrote:
> On Tue, 10 Jan 2017, Manasi Navare  wrote:
> > Hi All,
> >
> > We are seeing CRC check failures in some of the 18bpp video pattern
> > DP Compliance tests causing the tests to fail. On further investigation,
> > it is root-caused to dithering that the i915 driver enables in case of
> > 18bpp pipe configuration that messes up the CRC and causes the test to fail.
> 
> The CTS spec actually accounts for CRC failures caused by dithering and
> color space conversions. See section 3.2.1. However, it would be
> preferable to be able to automate this.
> 
> > Some of the approaches that can solve this problem are:
> > 1.  Add a new method in intel_dp.c to request the compliance test state.
> > Call this new method in intel_display.c to not enable dithering during a
> > compliance test. Issue with this is it makes the general portion of the
> > driver compliance aware.
> >
> > 2.  Move the dithering enable to compute_config methods in all encoder
> > source files. Issue: Lot of duplicate code and DP is the only encoder that
> > uses 18bpc.
> >
> > 3.  Disable dithering at all times in the driver. However this can cause
> > image quality issue with 8bpc plane and 6 bit pipe.
> >
> > Any suggestions on which approach can be implemented in order to pass
> > compliance?
> 
> I can't find any mention in the specs that we couldn't enable/disable
> dithering on the fly. It's PIPE_MISC for BDW+ and PIPE_CONF for the
> rest. So I'm wondering about doing...
> 
> 4. Disable dithering at intel_dp_sink_crc_start() and enable it again
>(according to config->dither) at intel_dp_sink_crc_stop(). It's
>similar to the hsw_disable_ips() and hsw_enable_ips() calls, but
>would have to cover more platforms.
> 
> Ville, thoughts on changing dithering on the fly?

Should be fine I think.

BTW see 
https://lists.freedesktop.org/archives/intel-gfx/2016-December/115186.html
if you intend to add more crc workaround type of things. There I'm
changing the IPS w/a to force a full modeset because it was the easiest
way to do things, and the current thing is just broken.

-- 
Ville Syrjälä
Intel OTC
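
Editor's sketch of option 4 from the thread above (turn dithering off while sink CRC capture runs, then restore whatever the pipe config asked for). This is a toy model, not i915 code: `struct pipe_state`, `DITHER_ENABLE`, `sink_crc_start()` and `sink_crc_stop()` are hypothetical stand-ins, and the bit position is made up rather than taken from PIPE_MISC or PIPE_CONF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; the real PIPE_MISC/PIPE_CONF layout is not modelled. */
#define DITHER_ENABLE (1u << 4)

struct pipe_state {
	uint32_t misc_reg;  /* stand-in for the register holding the dither bit */
	bool config_dither; /* what the computed pipe config requested */
};

static void pipe_set_dither(struct pipe_state *pipe, bool enable)
{
	assert(pipe);
	if (enable)
		pipe->misc_reg |= DITHER_ENABLE;
	else
		pipe->misc_reg &= ~DITHER_ENABLE;
}

/* CRC capture begins: force dithering off so the CRC is stable. */
static void sink_crc_start(struct pipe_state *pipe)
{
	pipe_set_dither(pipe, false);
}

/* CRC capture ends: go back to the state the config wanted. */
static void sink_crc_stop(struct pipe_state *pipe)
{
	pipe_set_dither(pipe, pipe->config_dither);
}
```

The point of restoring from `config_dither` rather than from the previous register value is that a pipe which never wanted dithering stays undithered after the capture ends.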

Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code

2017-01-11 Thread Daniel Vetter

On Wed, Jan 11, 2017 at 4:24 PM, Dave Hansen  wrote:
> On 01/10/2017 11:43 PM, Daniel Vetter wrote:
>> On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
>>> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
>>>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
>>>> Author: Daniel Vetter 
>>>> Date:   Sun Dec 18 14:35:45 2016 +0100
>>>>
>>>> drm: prevent double-(un)registration for connectors
>>>>
>>>> Lack of that would perfectly explain that oops ... Otherwise still no idea
>>>> what's going wrong.
>>> No...  That's not in mainline as far as I can see.  Should I test with
>>> it applied?
>> Hm, I guess failed to cc: stable that one properly, iirc we decided the
>> race fix is too academic and can't be hit in reality ;-)
>>
>> Testing would be great. Probably conflicts because we extracted
>> drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
>> over the diff and then applying with some fudge should take care of that.
>
> It doesn't apply to mainline, with or without the substitution you suggest.
>
> Do you want that commit itself tested from -next?

Hm, just cherry-picked it on top of Linus' latest 4.10 git, applies
cleanly there. The substitution was for 4.9. I can send you the patch
here, but seems all fine from what I can tell ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 07:24:45AM -0800, Dave Hansen wrote:
> On 01/10/2017 11:43 PM, Daniel Vetter wrote:
> > On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
> >> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
> >>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
> >>> Author: Daniel Vetter 
> >>> Date:   Sun Dec 18 14:35:45 2016 +0100
> >>>
> >>> drm: prevent double-(un)registration for connectors
> >>>
> >>> Lack of that would perfectly explain that oops ... Otherwise still no idea
> >>> what's going wrong.
> >> No...  That's not in mainline as far as I can see.  Should I test with
> >> it applied?
> > Hm, I guess failed to cc: stable that one properly, iirc we decided the
> > race fix is too academic and can't be hit in reality ;-)
> > 
> > Testing would be great. Probably conflicts because we extracted
> > drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
> > over the diff and then applying with some fudge should take care of that.
> 
> It doesn't apply to mainline, with or without the substitution you suggest.

I was hoping that the locking was the real cause here and would be an
easy fix to apply. I did have a look at trying to reorder the DP-MST
worker with driver registration. Hacky to say the least.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
From 30ac9092e934295f12775f03d73170fc480b7fc8 Mon Sep 17 00:00:00 2001
From: Chris Wilson 
Date: Tue, 10 Jan 2017 10:46:25 +
Subject: [PATCH] dp-mst-register

---
 drivers/gpu/drm/i915/intel_dp.c | 12 +++-
 drivers/gpu/drm/i915/intel_dp_mst.c |  9 ++---
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index f0f44cdbe4b4..fc10eb2c8563 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4762,7 +4762,17 @@ intel_dp_connector_register(struct drm_connector *connector)
 		  intel_dp->aux.name, connector->kdev->kobj.name);
 
 	intel_dp->aux.dev = connector->kdev;
-	return drm_dp_aux_register(&intel_dp->aux);
+	ret = drm_dp_aux_register(&intel_dp->aux);
+	if (ret)
+		return ret;
+
+	if (intel_dp->mst_mgr.cbs) {
+		intel_dp->can_mst = true;
+		if (intel_dp->attached_connector)
+			intel_dp->attached_connector->base.status = intel_dp_long_pulse(intel_dp->attached_connector);
+	}
+
+	return 0;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index c93c1999a494..f0a664041dbc 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -582,16 +582,19 @@ intel_dp_mst_encoder_init(struct intel_digital_port *intel_dig_port, int conn_ba
 	struct drm_device *dev = intel_dig_port->base.base.dev;
 	int ret;
 
-	intel_dp->can_mst = true;
+	intel_dp->can_mst = false;
 	intel_dp->mst_mgr.cbs = &mst_cbs;
 
 	/* create encoders */
 	intel_dp_create_fake_mst_encoders(intel_dig_port);
-	ret = drm_dp_mst_topology_mgr_init(&intel_dp->mst_mgr, dev->dev, &intel_dp->aux, 16, 3, conn_base_id);
+	ret = drm_dp_mst_topology_mgr_init(&intel_dp->mst_mgr, dev->dev,
+	&intel_dp->aux, 16, 3,
+	conn_base_id);
 	if (ret) {
-		intel_dp->can_mst = false;
+		intel_dp->mst_mgr.cbs = NULL;
 		return ret;
 	}
+
 	return 0;
 }
 
-- 
2.11.0


Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 05:35:08PM +0200, Martin Peres wrote:
> On 11/01/17 14:58, Joonas Lahtinen wrote:
> >On ke, 2017-01-11 at 12:14 +, Chris Wilson wrote:
> >>When switching between contexts using the aliasing_ppgtt, the VM is
> >>shared. We don't need to reload the PD registers unless they are dirty.
> >>
> >>Martin Peres reported an issue that looks like corruption between
> >>Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> >>Rearrange switch_context to load the aliasing ppgtt on first use").
> >>Switching between the same mm (the aliasing_ppgtt is used for all
> >>contexts in this case) should be a nop, but appears to trigger some
> >>side-effects in the context switch. However, as we know the switch
> >>is redundant in this case, we can skip it and continue to ignore the
> >>issue until somebody feels strong enough to investigate full-ppgtt on
> >>gen7 again!
> >>
> >>Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the 
> >>aliasing ppgtt on first use")
> >>Reported-by: Martin Peres 
> >>Signed-off-by: Chris Wilson 
> >>Cc: Martin Peres 
> >
> >Code looks good, could use the T-b's to verify.
> >
> >Reviewed-by: Joonas Lahtinen 
> >
> >Regards, Joonas
> >
> 
> https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072
> was actually the commit that massively improved the corruption I was
> seeing in one benchmark while this patch had no visible impact.
> 
> However, my problem was that i915.enable_ppgtt=2 was set in
> /etc/modprobe.d/... and I had completely forgotten about it.
> 
> So yeah, now you know that f9326be5f1d3 massively broke
> enable_ppgtt=2, but not sure what you want to do about it.
> 
> There is no hurry though, as the defaults are sane.
> 
> Sorry for the noise everyone, I hope that my painful manual bisects
> will be useful if someone wants to make the second mode work :)

The information is very useful, I've added that the symptoms have only
been seen with full-ppgtt. I don't see any reason to apply this patch now,
so will send it along with the series playing with i915_gem_gtt.c once
the kselftests for it are in.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
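
Editor's sketch of the idea in the commit message quoted above: when every context shares the same aliasing mm, a switch only needs to reload the page-directory registers if the mm changed or is dirty. The patch body is not shown in this thread, so `struct mm`, `struct hw`, `needs_mm_switch()` and `switch_mm()` here are hypothetical names in a toy model, not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>

struct mm {
	bool pd_dirty; /* page directories changed since the last load */
};

struct hw {
	const struct mm *loaded; /* mm whose PD registers are currently loaded */
	int pd_loads;            /* how many times the registers were reloaded */
};

/* A reload is needed only for a different mm or a dirty one. */
static bool needs_mm_switch(const struct hw *hw, const struct mm *to)
{
	return hw->loaded != to || to->pd_dirty;
}

static void switch_mm(struct hw *hw, struct mm *to)
{
	assert(hw && to);
	if (!needs_mm_switch(hw, to))
		return; /* same shared mm, nothing dirty: skip the emission */

	hw->pd_loads++; /* stands in for emitting the PD register loads */
	hw->loaded = to;
	to->pd_dirty = false;
}
```

With one shared `struct mm`, the first switch loads the registers, later switches are skipped, and marking the mm dirty forces exactly one more reload.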

Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG

2017-01-11 Thread Ben Widawsky


On 17-01-11 17:05:04, Ville Syrjälä wrote:

On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote:

Am 05.01.2017 um 12:37 schrieb Ville Syrjälä:
> On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote:
>> From: Rainer Hochecker 
>>
>> This adds fourcc codes for 16bit planes required for DRM buffer
>> export to mesa.
>>
>> Signed-off-by: Rainer Hochecker 
> Reviewed-by: Ville Syrjälä 

Good to see some work landing on that part, patch is Acked-by: Christian
König .


Has the userspace side of this been reviewed already?

/me wonders if it's safe to push this...



I acked the mesa side, and Rainer sent a version 2 which also looked fine to me.
Let me bump that thread...



>
>> ---
>>   include/uapi/drm/drm_fourcc.h | 7 +++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
>> index a5890bf..d230e58 100644
>> --- a/include/uapi/drm/drm_fourcc.h
>> +++ b/include/uapi/drm/drm_fourcc.h
>> @@ -41,10 +41,17 @@ extern "C" {
>>   /* 8 bpp Red */
>>   #define DRM_FORMAT_R8    fourcc_code('R', '8', ' ', ' ') /* [7:0] R */
>>
>> +/* 16 bpp Red */
>> +#define DRM_FORMAT_R16    fourcc_code('R', '1', '6', ' ') /* [15:0] R little endian */
>> +
>>   /* 16 bpp RG */
>>   #define DRM_FORMAT_RG88  fourcc_code('R', 'G', '8', '8') /* [15:0] R:G 8:8 little endian */
>>   #define DRM_FORMAT_GR88  fourcc_code('G', 'R', '8', '8') /* [15:0] G:R 8:8 little endian */
>>
>> +/* 32 bpp RG */
>> +#define DRM_FORMAT_RG1616 fourcc_code('R', 'G', '3', '2') /* [31:0] R:G 16:16 little endian */
>> +#define DRM_FORMAT_GR1616 fourcc_code('G', 'R', '3', '2') /* [31:0] G:R 16:16 little endian */
>> +
>>   /* 8 bpp RGB */
>>   #define DRM_FORMAT_RGB332    fourcc_code('R', 'G', 'B', '8') /* [7:0] R:G:B 3:3:2 */
>>   #define DRM_FORMAT_BGR233    fourcc_code('B', 'G', 'R', '8') /* [7:0] B:G:R 2:3:3 */
>> --
>> 2.9.3



--
Ville Syrjälä
Intel OTC
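
Editor's sketch of how the fourcc codes in the quoted patch turn into 32-bit values. The packing puts the first character in the lowest byte, as the `fourcc_code()` macro in drm_fourcc.h does; this standalone copy uses `uint32_t` instead of the kernel's `__u32` so it builds outside the kernel tree. `fourcc_to_string()` is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* First character lands in bits [7:0], last in bits [31:24]. */
#define fourcc_code(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define DRM_FORMAT_R16    fourcc_code('R', '1', '6', ' ')
#define DRM_FORMAT_RG1616 fourcc_code('R', 'G', '3', '2')
#define DRM_FORMAT_GR1616 fourcc_code('G', 'R', '3', '2')

/* Hypothetical helper: unpack a code back into a printable 4-char string. */
static void fourcc_to_string(uint32_t code, char out[5])
{
	assert(out);
	out[0] = (char)(code & 0xff);
	out[1] = (char)((code >> 8) & 0xff);
	out[2] = (char)((code >> 16) & 0xff);
	out[3] = (char)((code >> 24) & 0xff);
	out[4] = '\0';
}
```

Note that the 32 bpp RG formats are named by total width in the code itself ('3', '2'), while the macro name spells out the per-channel widths (1616).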


Re: [Intel-gfx] [PATCH 1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion

2017-01-11 Thread Tvrtko Ursulin



On 11/01/2017 13:13, Chris Wilson wrote:

Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c| 78 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h|  3 ++
 drivers/gpu/drm/i915/i915_guc_submission.c |  3 --
 drivers/gpu/drm/i915/intel_guc_loader.c|  7 +--
 drivers/gpu/drm/i915/intel_lrc.c   |  6 ---
 5 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ed99adfd0da..ed120a1e7f93 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -110,6 +110,30 @@ const struct i915_ggtt_view i915_ggtt_view_rotated = {
.type = I915_GGTT_VIEW_ROTATED,
 };

+static void gen6_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   /* Note that as an uncached mmio write, this should flush the
+* WCB of the writes into the GGTT before it triggers the invalidate.
+*/
+   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+}
+
+static void guc_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   gen6_ggtt_invalidate(dev_priv);
+   I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+}
+
+static void gmch_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+   intel_gtt_chipset_flush();
+}
+
+static inline void i915_ggtt_invalidate(struct drm_i915_private *i915)
+{
+   i915->ggtt.invalidate(i915);
+}
+
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
int enable_ppgtt)
 {
@@ -2307,16 +2331,6 @@ void i915_check_and_clear_faults(struct drm_i915_private *dev_priv)
POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS]));
 }

-static void i915_ggtt_flush(struct drm_i915_private *dev_priv)
-{
-   if (INTEL_INFO(dev_priv)->gen < 6) {
-   intel_gtt_chipset_flush();
-   } else {
-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
-   }
-}
-
 void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 {
struct i915_ggtt *ggtt = &dev_priv->ggtt;
@@ -2331,7 +2345,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)

ggtt->base.clear_range(&ggtt->base, ggtt->base.start, ggtt->base.total);

-   i915_ggtt_flush(dev_priv);
+   i915_ggtt_invalidate(dev_priv);
 }

 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
@@ -2370,15 +2384,13 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
  enum i915_cache_level level,
  u32 unused)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
+   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen8_pte_t __iomem *pte =
-   (gen8_pte_t __iomem *)dev_priv->ggtt.gsm +
-   (offset >> PAGE_SHIFT);
+   (gen8_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT);

gen8_set_pte(pte, gen8_pte_encode(addr, level));

-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
+   ggtt->invalidate(vm->i915);
 }

 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
@@ -2386,7 +2398,6 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 uint64_t start,
 enum i915_cache_level level, u32 unused)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
struct sgt_iter sgt_iter;
gen8_pte_t __iomem *gtt_entries;
@@ -2415,8 +2426,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 * want to flush the TLBs only after we're certain all the PTE updates
 * have finished.
 */
-   I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-   POSTING_READ(GFX_FLSH_CNTL_GEN6);
+   ggtt->invalidate(vm->i915);
 }

 struct insert_entries {
@@ -2451,15 +2461,13 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  enum i915_cache_level level,
  u32 flags)
 {
-   struct drm_i915_private *dev_priv = vm->i915;
+   struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
gen6_pte_t __iomem *pte =
-   (gen6_pte_t __iomem *)dev_priv->ggtt.gsm +
-   (offset >> PAGE_SHIFT);
+   (gen6_pte_t __iomem *)ggtt->gsm + (offse
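
Editor's sketch of the dispatch pattern the (truncated) patch above introduces: the invalidate routine is picked once and stored as a function pointer, so every insert path calls the same hook instead of open-coding the flush. This is a toy model under stated assumptions: `struct device`, the counters, and `device_select_invalidate()` are hypothetical and stand in for the `ggtt.invalidate` hook and the register writes.

```c
#include <assert.h>
#include <stdbool.h>

struct device;
typedef void (*invalidate_fn)(struct device *dev);

struct device {
	invalidate_fn invalidate; /* chosen once, used by every insert path */
	int flush_writes;         /* counts the plain GGTT flush writes */
	int guc_writes;           /* counts the extra GuC TLB invalidates */
};

static void plain_invalidate(struct device *dev)
{
	dev->flush_writes++;
}

/* GuC variant: do the normal flush, then the additional invalidate. */
static void guc_invalidate(struct device *dev)
{
	plain_invalidate(dev);
	dev->guc_writes++;
}

/* Hypothetical selection step, mirroring the v3 "conditionally" note. */
static void device_select_invalidate(struct device *dev, bool guc_enabled)
{
	dev->invalidate = guc_enabled ? guc_invalidate : plain_invalidate;
}

static void insert_page(struct device *dev)
{
	/* ... the PTE write would go here ... */
	assert(dev->invalidate);
	dev->invalidate(dev);
}
```

Callers of `insert_page()` no longer need to know whether the GuC is in use, which is the motivation given in the commit message.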

[Intel-gfx] [drm-intel:for-linux-next 1/4] htmldocs: drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'vm'

2017-01-11 Thread kbuild test robot

tree:   git://anongit.freedesktop.org/drm-intel for-linux-next
head:   c781c978e784c50dcd7cb312fe17f5281923f55b
commit: e007b19d7ba7424735fd4f17a355b145ae153e4c [1/4] drm/i915: Use the MRU stack search after evicting
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   make[3]: warning: jobserver unavailable: using -j1.  Add '+' to parent make rule.
   include/linux/init.h:1: warning: no structured comments found
   include/linux/kthread.h:26: warning: Excess function parameter '...' description in 'kthread_create'
   kernel/sys.c:1: warning: no structured comments found
   drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found
   include/drm/drm_drv.h:441: warning: No description found for parameter 'firstopen'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'open'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'preclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'postclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'lastclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_ioctl'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_quiescent'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'context_dtor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'set_busid'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_handler'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_preinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_postinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_uninstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_init'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_cleanup'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_open_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_close_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_handle_to_fd'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_fd_to_handle'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_export'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_pin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_unpin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_res_obj'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_get_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vunmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_mmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'vgaarb_irq'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_vm_ops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'major'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'minor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'patchlevel'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'name'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'desc'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'date'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'driver_features'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dev_priv_size'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'num_ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'fops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'legacy_dev_list'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'vm'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'node'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'size'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'alignment'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No descri

[Intel-gfx] ✓ Fi.CI.BAT: success for HuC Loading Patches (rev2)

2017-01-11 Thread Patchwork

== Series Details ==

Series: HuC Loading Patches (rev2)
URL   : https://patchwork.freedesktop.org/series/17499/
State : success

== Summary ==

Series 17499v2 HuC Loading Patches
https://patchwork.freedesktop.org/api/1.0/series/17499/revisions/2/mbox/


fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

a947b47b6c0c947253f44d750512220ecb7c5cf4 drm-tip: 2017y-01m-11d-14h-32m-39s UTC integration manifest
fabcb22 drm/i915/get_params: Add HuC status to getparams
3532699 drm/i915/huc: Support HuC authentication
50c8a56 drm/i915/huc: Add debugfs for HuC loading status check
5d14b30 drm/i915/HuC: Add KBL huC loading Support
6def1fb drm/i915/huc: Add BXT HuC Loading Support
e6713f5 drm/i915/huc: Add HuC fw loading support
061e51b drm/i915/huc: Unified css_header struct for GuC and HuC
6db7e21 drm/i915/guc: Make the GuC fw loading helper functions general

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3482/

[Intel-gfx] [PATCH v3] drm/i915/scheduler: emulate a scheduler for guc

2017-01-11 Thread Chris Wilson

This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).

v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)
v3: Apply lockdep nesting to enabling signaling on the request, using
the pattern we already have in __i915_gem_request_submit();

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 92 +++---
 drivers/gpu/drm/i915/i915_irq.c|  4 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  5 +-
 3 files changed, 89 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 913d87358972..4484591cbf7c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
u32 freespace;
int ret;
 
-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
freespace -= client->wq_rsvd;
if (likely(freespace >= wqi_size)) {
@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request)
client->no_wq_space++;
ret = -EAGAIN;
}
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 
return ret;
 }
@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request)
 
GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);
 
-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
client->wq_rsvd -= wqi_size;
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 }
 
 /* Construct a Work Item and append it to the GuC's Work Queue */
@@ -534,10 +534,87 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq)
 
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
-   i915_gem_request_submit(rq);
+   __i915_gem_request_submit(rq);
__i915_guc_submit(rq);
 }
 
+static void nested_enable_signaling(struct drm_i915_gem_request *rq)
+{
+   if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+&rq->fence.flags))
+   return;
+
+   GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags));
+
+   spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+   intel_engine_enable_signaling(rq);
+   spin_unlock(&rq->lock);
+}
+
+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
+{
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *last = port[0].request;
+   unsigned long flags;
+   struct rb_node *rb;
+   bool submit = false;
+
+   spin_lock_irqsave(&engine->timeline->lock, flags);
+   rb = engine->execlist_first;
+   while (rb) {
+   struct drm_i915_gem_request *cursor =
+   rb_entry(rb, typeof(*cursor), priotree.node);
+
+   if (last && cursor->ctx != last->ctx) {
+   if (port != engine->execlist_port)
+   break;
+
+   i915_gem_request_assign(&port->request, last);
+   nested_enable_signaling(last);
+   port++;
+   }
+
+   rb = rb_next(rb);
+   rb_erase(&cursor->priotree.node, &engine->execlist_queue);
+   RB_CLEAR_NODE(&cursor->priotree.node);
+   cursor->priotree.priority = INT_MAX;
+
+   i915_guc_submit(cursor);
+   last = cursor;
+   submit = true;
+   }
+   if (submit) {
+   i915_gem_request_assign(&port->request, last);
+   nested_enable_signaling(last);
+   engine->execlist_first = rb;
+   }
+   spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+   return submit;
+}
+
+static void i915_guc_irq_handler(unsigned long data)
+{
+   struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *rq;
+   bool submit;
+
+   do {
+   rq = port[0].request;
+   while (rq && i915_gem_request_completed(rq)) {
+   i915_gem_request_put(rq);
+   rq = port[1].request;
+   port[0].request = rq;
+

Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix POWER_DOMAIN_AUDIO refcounting.

2017-01-11 Thread Daniel Vetter

On Thu, Dec 15, 2016 at 03:29:43PM +0100, Maarten Lankhorst wrote:
> If the crtc was brought up with audio before the driver loads,
> then crtc_disable will remove a refcount to audio that doesn't exist
> before.
> 
> Fortunately we already set power domains on readout, so we can just add
> the power domain handling to get_crtc_power_domains, which will update
> the power domains correctly in all cases.
> 
> This was found when testing module reload on CI with the crtc enabled,
> which resulted in the following warn after module reload + modeset:
> 
> [   24.197041] [ cut here ]
> [   24.197075] WARNING: CPU: 0 PID: 99 at 
> drivers/gpu/drm/i915/intel_runtime_pm.c:1790 
> intel_display_power_put+0x134/0x140 [i915]
> [   24.197076] Use count on domain AUDIO is already zero
> [   24.197098] CPU: 0 PID: 99 Comm: kworker/u8:2 Not tainted 
> 4.9.0-CI-Trybot_393+ #1
> [   24.197099] Hardware name:  /NUC6i5SYB, BIOS 
> SYSKLi35.86A.0042.2016.0409.1246 04/09/2016
> [   24.197102] Workqueue: events_unbound async_run_entry_fn
> [   24.197105]  c93c7688 81435b35 c93c76d8 
> 
> [   24.197107]  c93c76c8 8107e4d6 06fe5dc36f28 
> 88025dc30054
> [   24.197109]  88025dc36f28 88025dc3 88025dc3 
> 0015
> [   24.197110] Call Trace:
> [   24.197113]  [] dump_stack+0x67/0x92
> [   24.197116]  [] __warn+0xc6/0xe0
> [   24.197118]  [] warn_slowpath_fmt+0x4a/0x50
> [   24.197149]  [] intel_display_power_put+0x134/0x140 
> [i915]
> [   24.197187]  [] intel_disable_ddi+0x4d/0x80 [i915]
> [   24.197223]  [] intel_encoders_disable.isra.74+0x7f/0x90 
> [i915]
> [   24.197257]  [] haswell_crtc_disable+0x55/0x170 [i915]
> [   24.197292]  [] intel_atomic_commit_tail+0x108/0xfd0 
> [i915]
> [   24.197295]  [] ? __lock_is_held+0x66/0x90
> [   24.197330]  [] intel_atomic_commit+0x429/0x560 [i915]
> [   24.197332]  [] 
> ?drm_atomic_add_affected_connectors+0x56/0xf0
> [   24.197334]  [] drm_atomic_commit+0x46/0x50
> [   24.197336]  [] restore_fbdev_mode+0x147/0x270
> [   24.197337]  [] 
> drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70
> [   24.197339]  [] drm_fb_helper_set_par+0x28/0x50
> [   24.197374]  [] intel_fbdev_set_par+0x13/0x70 [i915]
> [   24.197376]  [] fbcon_init+0x57a/0x600
> [   24.197379]  [] visual_init+0xd1/0x130
> [   24.197381]  [] do_bind_con_driver+0x1bc/0x3a0
> [   24.197384]  [] do_take_over_console+0x111/0x180
> [   24.197386]  [] do_fbcon_takeover+0x52/0xb0
> [   24.197387]  [] fbcon_event_notify+0x723/0x850
> [   24.197390]  [] ?__blocking_notifier_call_chain+0x30/0x70
> [   24.197392]  [] notifier_call_chain+0x34/0xa0
> [   24.197394]  [] __blocking_notifier_call_chain+0x48/0x70
> [   24.197397]  [] blocking_notifier_call_chain+0x11/0x20
> [   24.197398]  [] fb_notifier_call_chain+0x16/0x20
> [   24.197400]  [] register_framebuffer+0x24c/0x330
> [   24.197402]  [] drm_fb_helper_initial_config+0x219/0x3c0
> [   24.197436]  [] intel_fbdev_initial_config+0x13/0x30 
> [i915]
> [   24.197438]  [] async_run_entry_fn+0x34/0x140
> [   24.197440]  [] process_one_work+0x1ec/0x6b0
> [   24.197442]  [] ? process_one_work+0x166/0x6b0
> [   24.197445]  [] worker_thread+0x49/0x490
> [   24.197447]  [] ? process_one_work+0x6b0/0x6b0
> [   24.197448]  [] kthread+0xeb/0x110
> [   24.197451]  [] ? kthread_park+0x60/0x60
> [   24.197453]  [] ret_from_fork+0x27/0x40
> [   24.197476] ---[ end trace bda64b683b8e8162 ]---
> 
> Signed-off-by: Maarten Lankhorst 

Do we still need this with patch 3? I know it'd be nice if we could
faithfully restore any state we can also program, but then that's also a
lot of complexity ...

Otoh patch 3 means we'll stop testing a lot of the fastboot code while
reloading the driver. But then that's been the thing in the past, and as
long as we still boot up we have at least some test coverage of the
fastboot code (I'm mostly concerned about the plane/buffer readout code,
since that's not covered by the state checker).

But for now I'd say let's just go with patch 3 only.
-Daniel
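
For readers following the refcount problem described in the commit message: the warning fires because the disable path drops an AUDIO reference that was never taken when the crtc was already running at driver load. Below is a minimal userspace sketch of that failure mode and of the readout fix. It is not the i915 code; `struct domain_refs`, `domain_get()`, `domain_put()` and `domain_readout()` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-domain use count; not the i915 structures. */
struct domain_refs {
	const char *name;
	int use_count;
};

/* Take one reference on the domain. */
static void domain_get(struct domain_refs *d)
{
	d->use_count++;
}

/*
 * Drop one reference. Returns false (and warns) when the count is already
 * zero, which mirrors the "Use count on domain AUDIO is already zero" case.
 */
static bool domain_put(struct domain_refs *d)
{
	if (d->use_count == 0) {
		fprintf(stderr, "Use count on domain %s is already zero\n",
			d->name);
		return false;
	}
	d->use_count--;
	return true;
}

/*
 * Readout step (assumed shape of the fix): if the hardware was found with
 * audio already enabled, take the reference that the later disable path
 * expects to drop.
 */
static void domain_readout(struct domain_refs *d, bool hw_audio_enabled)
{
	if (hw_audio_enabled)
		domain_get(d);
}
```

Without the readout step the first `domain_put()` underflows; with it, get and put stay balanced across a module reload.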
> ---
>  drivers/gpu/drm/i915/intel_ddi.c | 14 ++
>  drivers/gpu/drm/i915/intel_display.c |  4 
>  drivers/gpu/drm/i915/intel_dp_mst.c  |  9 ++---
>  3 files changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> b/drivers/gpu/drm/i915/intel_ddi.c
> index d808a2ccc29e..8c9ce850760b 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -1835,8 +1835,6 @@ static void intel_enable_ddi(struct intel_encoder 
> *intel_encoder,
>struct drm_connector_state *conn_state)
>  {
>   struct drm_encoder *encoder = &intel_encoder->base;
> - struct drm_crtc *crtc = encoder->crtc;
> - struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>   struct drm_i915_private *dev_priv = to_i915(encoder->dev);
>   enum port port = intel_ddi_get_encoder_port(intel_encoder);
>

Re: [Intel-gfx] [PATCH 3/4] drm/i915: Disable all crtcs during driver unload.

2017-01-11 Thread Daniel Vetter

On Thu, Dec 15, 2016 at 03:29:44PM +0100, Maarten Lankhorst wrote:
> We may keep the crtc's enabled when userspace unsets all framebuffers but
> keeps the crtc active. This exposes a WARN in fbc_global disable, and
> a lot of bugs in our hardware readout code. Solve this by disabling
> all crtc's for now.
> 
> Signed-off-by: Maarten Lankhorst 
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6428588518aa..bb0d7517b678 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -43,6 +43,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "i915_drv.h"
> @@ -1282,6 +1283,10 @@ void i915_driver_unload(struct drm_device *dev)
>  
>   intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
>  
> + drm_modeset_lock_all(dev);
> + drm_atomic_helper_disable_all(dev, dev->mode_config.acquire_ctx);
> + drm_modeset_unlock_all(dev);

Bikeshed: I think we should phase out lock_all and do an explicit acquire
context here. And maybe get a bit better at refactoring the boilerplate
that brings along. But also as-is:

Reviewed-by: Daniel Vetter 

> +
>   i915_driver_unregister(dev_priv);
>  
>   drm_vblank_cleanup(dev);
> -- 
> 2.7.4
> 
> ___
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
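
On the bikeshed above about replacing lock_all with an explicit acquire context: the pattern is "try to take all locks, and on -EDEADLK back off and retry", as in the retry loop of drm_atomic_remove_fb() quoted elsewhere in this archive. The following is a userspace mock of that control flow only. It assumes nothing about the real DRM helpers; `struct mock_ctx` and every `mock_*` function are hypothetical.

```c
#include <errno.h>

/* Hypothetical mock of an acquire context; counts backoffs for illustration. */
struct mock_ctx {
	int contended_left;	/* lock attempts that will still hit -EDEADLK */
	int backoffs;		/* how many times we backed off and retried */
	int locked;		/* 1 while all locks are held */
};

/* Try to take every lock; fails with -EDEADLK while contention remains. */
static int mock_lock_all(struct mock_ctx *ctx)
{
	if (ctx->contended_left > 0) {
		ctx->contended_left--;
		return -EDEADLK;
	}
	ctx->locked = 1;
	return 0;
}

/* Drop whatever is held and record that a retry is about to happen. */
static void mock_backoff(struct mock_ctx *ctx)
{
	ctx->locked = 0;
	ctx->backoffs++;
}

static void mock_drop_locks(struct mock_ctx *ctx)
{
	ctx->locked = 0;
}

/* Example work item: only valid while all locks are held. */
static int mock_disable_all(struct mock_ctx *ctx)
{
	return ctx->locked ? 0 : -EINVAL;
}

/* Run work() with all locks held, retrying the whole step on -EDEADLK. */
static int with_all_locks(struct mock_ctx *ctx,
			  int (*work)(struct mock_ctx *))
{
	int ret;

retry:
	ret = mock_lock_all(ctx);
	if (!ret)
		ret = work(ctx);
	if (ret == -EDEADLK) {
		mock_backoff(ctx);
		goto retry;
	}
	mock_drop_locks(ctx);
	return ret;
}
```

Wrapping the loop in one helper like `with_all_locks()` is one way to cut down the boilerplate the review mentions; whether the kernel should grow such a helper is a separate question.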

Re: [Intel-gfx] [PATCH 4/4] drm: Resurrect atomic rmfb code, v2

2017-01-11 Thread Daniel Vetter

On Thu, Dec 15, 2016 at 03:29:45PM +0100, Maarten Lankhorst wrote:
> From: Daniel Vetter 
> 
> This was somehow lost between v3 and the merged version in Maarten's
> patch merged as:
> 
> commit f2d580b9a8149735cbc4b59c4a8df60173658140
> Author: Maarten Lankhorst 
> Date:   Wed May 4 14:38:26 2016 +0200
> 
> drm/core: Do not preserve framebuffer on rmfb, v4.
> 
> Actual code copied from Maarten's patch, but with the slight change to
> just use dev->mode_config.funcs->atomic_commit to decide whether to
> use the atomic path or not.
> 
> v2:
> - Remove plane->fb assignment, done by drm_atomic_clean_old_fb.
> - Add WARN_ON when atomic_remove_fb fails.
> - Always call drm_atomic_state_put.
> 
> Signed-off-by: Daniel Vetter 
> Signed-off-by: Daniel Vetter 
> Signed-off-by: Maarten Lankhorst 

Would be great if someone else could r-b this, I've proven pretty well
that I don't understand the complexity here :(
-Daniel
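
For anyone picking up the review: the core of the quoted drm_atomic_remove_fb() is a walk over all planes that detaches the framebuffer from each plane still using it and records the touched planes in a bitmask, so that a commit is only issued when the mask is non-zero. A simplified, standalone sketch of that bookkeeping follows. It assumes a flat array of at most 32 planes; `struct plane` and `detach_fb()` are hypothetical and leave out all locking and atomic state handling.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical, simplified plane: only tracks which framebuffer it scans out. */
struct plane {
	const void *fb;
};

/*
 * Detach `fb` from every plane that uses it and return a mask with one bit
 * per touched plane, indexed by position in the array (at most 32 planes).
 */
static uint32_t detach_fb(struct plane *planes, size_t n, const void *fb)
{
	uint32_t plane_mask = 0;

	for (size_t i = 0; i < n && i < 32; i++) {
		if (planes[i].fb != fb)
			continue;
		planes[i].fb = NULL;
		plane_mask |= BIT(i);
	}
	return plane_mask;
}
```

A zero return means no plane referenced the framebuffer, which corresponds to the case where the patch skips the commit.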

> ---
>  drivers/gpu/drm/drm_atomic.c| 64 
> +
>  drivers/gpu/drm/drm_crtc_internal.h |  1 +
>  drivers/gpu/drm/drm_framebuffer.c   |  7 
>  3 files changed, 72 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
> index d1d252261bf1..23a3845542e1 100644
> --- a/drivers/gpu/drm/drm_atomic.c
> +++ b/drivers/gpu/drm/drm_atomic.c
> @@ -2059,6 +2059,70 @@ static void complete_crtc_signaling(struct drm_device 
> *dev,
>   kfree(fence_state);
>  }
>  
> +int drm_atomic_remove_fb(struct drm_framebuffer *fb)
> +{
> + struct drm_modeset_acquire_ctx ctx;
> + struct drm_device *dev = fb->dev;
> + struct drm_atomic_state *state;
> + struct drm_plane *plane;
> + int ret = 0;
> + unsigned plane_mask;
> +
> + state = drm_atomic_state_alloc(dev);
> + if (!state)
> + return -ENOMEM;
> +
> + drm_modeset_acquire_init(&ctx, 0);
> + state->acquire_ctx = &ctx;
> +
> +retry:
> + plane_mask = 0;
> + ret = drm_modeset_lock_all_ctx(dev, &ctx);
> + if (ret)
> + goto unlock;
> +
> + drm_for_each_plane(plane, dev) {
> + struct drm_plane_state *plane_state;
> +
> + if (plane->state->fb != fb)
> + continue;
> +
> + plane_state = drm_atomic_get_plane_state(state, plane);
> + if (IS_ERR(plane_state)) {
> + ret = PTR_ERR(plane_state);
> + goto unlock;
> + }
> +
> + drm_atomic_set_fb_for_plane(plane_state, NULL);
> + ret = drm_atomic_set_crtc_for_plane(plane_state, NULL);
> + if (ret)
> + goto unlock;
> +
> + plane_mask |= BIT(drm_plane_index(plane));
> +
> + plane->old_fb = plane->fb;
> + }
> +
> + if (plane_mask)
> + ret = drm_atomic_commit(state);
> +
> +unlock:
> + if (plane_mask)
> + drm_atomic_clean_old_fb(dev, plane_mask, ret);
> +
> + if (ret == -EDEADLK) {
> + drm_modeset_backoff(&ctx);
> + goto retry;
> + }
> +
> + drm_atomic_state_put(state);
> +
> + drm_modeset_drop_locks(&ctx);
> + drm_modeset_acquire_fini(&ctx);
> +
> + return ret;
> +}
> +
>  int drm_mode_atomic_ioctl(struct drm_device *dev,
> void *data, struct drm_file *file_priv)
>  {
> diff --git a/drivers/gpu/drm/drm_crtc_internal.h 
> b/drivers/gpu/drm/drm_crtc_internal.h
> index cdf6860c9d22..121e250853d2 100644
> --- a/drivers/gpu/drm/drm_crtc_internal.h
> +++ b/drivers/gpu/drm/drm_crtc_internal.h
> @@ -178,6 +178,7 @@ int drm_atomic_get_property(struct drm_mode_object *obj,
>   struct drm_property *property, uint64_t *val);
>  int drm_mode_atomic_ioctl(struct drm_device *dev,
> void *data, struct drm_file *file_priv);
> +int drm_atomic_remove_fb(struct drm_framebuffer *fb);
>  
>  
>  /* drm_plane.c */
> diff --git a/drivers/gpu/drm/drm_framebuffer.c 
> b/drivers/gpu/drm/drm_framebuffer.c
> index cbf0c893f426..c358bf8280a8 100644
> --- a/drivers/gpu/drm/drm_framebuffer.c
> +++ b/drivers/gpu/drm/drm_framebuffer.c
> @@ -770,6 +770,12 @@ void drm_framebuffer_remove(struct drm_framebuffer *fb)
>* in this manner.
>*/
>   if (drm_framebuffer_read_refcount(fb) > 1) {
> + if (dev->mode_config.funcs->atomic_commit) {
> + int ret = drm_atomic_remove_fb(fb);
> + WARN(ret, "atomic remove_fb failed with %i\n", ret);
> + goto out;
> + }
> +
>   drm_modeset_lock_all(dev);
>   /* remove from any CRTC */
>   drm_for_each_crtc(crtc, dev) {
> @@ -787,6 +793,7 @@ void drm_framebuffer_remove(struct drm_framebuffer *fb)
>   drm_modeset_unlock_all(dev);
>   }
>  
> +out:
>   drm_framebuffer_unreference(fb);
>  }
>  EXPORT_SYMBOL(drm_framebuffer_remove);
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engin

Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code

2017-01-11 Thread Dave Hansen

On 01/11/2017 07:39 AM, Daniel Vetter wrote:
> Hm, just cherry-picked it on top of Linus' latest 4.10 git, applies
> cleanly there. The substitution was for 4.9. I can send you the patch
> here, but seems all fine from what I can tell ...

All of the printk's that I added were making it fail to apply.

So, I took a 4.10-rc3 kernel with i915 compiled in (not as a module) and
applied e73ab00e9a0f17, which I grabbed from linux-next.

I'm seeing basically the same behavior that I did before applying
e73ab00e9a0f17.  sysfs_create_dir_ns() fails because of a NULL kobj->parent.

Have you guys tried testing this yourselves?  It seems really easy to
reproduce if you just compile the driver in.

> [1.400797] drm_dev_register(88040c73)::730 cpu: 2
> [1.400860] drm_connector_register(88040c76b000)::382 
> connector->registered: 0 cpu: 1
> [1.400870] sysfs_create_dir_ns()::53 error: -2
> [1.400874] create_dir()::75 error: -2 cpu: 1
> [1.400878] [ cut here ]
> [1.400884] WARNING: CPU: 1 PID: 91 at lib/kobject.c:249 
> kobject_add_internal+0x273/0x330
> [1.400888] kobject_add_internal failed for card0-DP-3 (error: -2 parent: 
> card0)
> [1.400892] Modules linked in:
> [1.400896] CPU: 1 PID: 91 Comm: kworker/1:2 Not tainted 
> 4.10.0-rc3-i915borked-dirty #67
> [1.400900] Hardware name: LENOVO 20F5S7V800/20F5S7V800, BIOS R02ET50W 
> (1.23 ) 09/20/2016
> [1.400906] Workqueue: events_long drm_dp_mst_link_probe_work
> [1.400909] Call Trace:
> [1.400914]  dump_stack+0x67/0x99
> [1.400918]  __warn+0xd1/0xf0
> [1.400922]  warn_slowpath_fmt+0x4f/0x60
> [1.400925]  kobject_add_internal+0x273/0x330
> [1.400927]  kobject_add+0x65/0xb0
> [1.400931]  ? klist_init+0x31/0x40
> [1.400936]  device_add+0x102/0x5d0
> [1.400940]  ? kfree_const+0x22/0x30
> [1.400944]  device_create_groups_vargs+0xd8/0x100
> [1.400947]  device_create_with_groups+0x36/0x40
> [1.400952]  ? vprintk_default+0x29/0x50
> [1.400957]  ? __might_sleep+0x4a/0x90
> [1.400962]  drm_sysfs_connector_add+0x60/0xe0
> [1.400967]  drm_connector_register+0x74/0xd0
> [1.400971]  intel_dp_register_mst_connector+0x41/0x50
> [1.400975]  drm_dp_add_port+0x350/0x450
> [1.400977] drm_connector_register(88040ee6f800)::382 
> connector->registered: 0 cpu: 2
> [1.400982]  ? rcu_early_boot_tests+0x1/0x10
> [1.400986]  ? schedule_timeout+0x1cd/0x390
> [1.400989]  ? __might_sleep+0x4a/0x90
> [1.400992]  ? mutex_lock+0x25/0x50
> [1.400995]  ? drm_dp_mst_wait_tx_reply+0x118/0x1e0
> [1.400996] drm_sysfs_connector_add() connector: 88040ee6f800 kdev: 
> 88040eef9c00
> [1.401002]  ? prepare_to_wait_event+0x120/0x120
> [1.401005]  ? drm_dp_check_mstb_guid+0x3d/0x120
> [1.401008]  drm_dp_send_link_address+0x185/0x1f0
> [1.401012]  drm_dp_check_and_send_link_address+0xad/0xc0
> [1.401015]  drm_dp_mst_link_probe_work+0x57/0xa0
> [1.401018]  process_one_work+0x14b/0x430
> [1.401021]  worker_thread+0x12b/0x4a0
> [1.401025]  kthread+0x10c/0x140
> [1.401027]  ? process_one_work+0x430/0x430
> [1.401030]  ? kthread_create_on_node+0x40/0x40
> [1.401034]  ret_from_fork+0x27/0x40
> [1.401038] ---[ end trace ba43fc250fbf282d ]---
> [1.401041] drm_sysfs_connector_add() connector: 88040c76b000 kdev: 
> fffe
> [1.401043] drm_connector_register(88040c768000)::382 
> connector->registered: 0 cpu: 2
> [1.401050] [drm:drm_sysfs_connector_add] *ERROR* failed to register 
> connector device: -2
> [1.401057] drm_sysfs_connector_add() connector: 88040c768000 kdev: 
> 88040eefa000
> [1.401093] drm_connector_register(88040c768800)::382 
> connector->registered: 0 cpu: 2
> [1.401113] drm_sysfs_connector_add() connector: 88040c768800 kdev: 
> 88040eefa400
> [1.401122] drm_connector_register(88040c769000)::382 
> connector->registered: 0 cpu: 2
> [1.401140] drm_sysfs_connector_add() connector: 88040c769000 kdev: 
> 88040eefa800
> [1.401167] drm_connector_register(88040c769800)::382 
> connector->registered: 0 cpu: 2
> [1.401186] drm_sysfs_connector_add() connector: 88040c769800 kdev: 
> 88040eefac00
> [1.401195] drm_connector_register(88040c76b000)::382 
> connector->registered: 0 cpu: 2


Re: [Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()

2017-01-11 Thread Daniel Vetter

On Wed, Jan 11, 2017 at 02:57:22PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> Make the code selecting the RGB quantization range a little less magicy
> by wrapping it up in a small helper.
> 
> Signed-off-by: Ville Syrjälä 

Needs cc: for driver maintainers, Eric for vc4 here.
-Daniel
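
The rule the helper wraps is small enough to state on its own: a mode that matches a CEA mode with a VIC above 1 defaults to limited range, everything else (no CEA match, or VIC 1) defaults to full range. A standalone sketch of just that decision is below. The enum and function names are illustrative and not the kernel's; `vic` stands for the value drm_match_cea_mode() reports, with 0 meaning "not a CEA mode".

```c
/* Illustrative values only; the kernel has its own quantization range enum. */
enum quant_range {
	QUANT_RANGE_FULL,
	QUANT_RANGE_LIMITED,
};

/*
 * `vic` is the CEA video identification code of the mode, or 0 when the
 * mode does not match any CEA mode.
 */
static enum quant_range default_rgb_quant_range(unsigned int vic)
{
	return vic > 1 ? QUANT_RANGE_LIMITED : QUANT_RANGE_FULL;
}
```

Callers in the quoted patch still add their own conditions on top, for example the bpp != 18 check on DP and the HDMI sink check on HDMI.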

> ---
>  drivers/gpu/drm/drm_edid.c| 18 ++
>  drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
>  drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
>  drivers/gpu/drm/vc4/vc4_hdmi.c|  4 +++-
>  include/drm/drm_edid.h|  2 ++
>  5 files changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> index 4ff04aa84dd0..304c583b8000 100644
> --- a/drivers/gpu/drm/drm_edid.c
> +++ b/drivers/gpu/drm/drm_edid.c
> @@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
>  }
>  EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
>  
> +/**
> + * drm_default_rgb_quant_range - default RGB quantization range
> + * @mode: display mode
> + *
> + * Determine the default RGB quantization range for the mode,
> + * as specified in CEA-861.
> + *
> + * Return: The default RGB quantization range for the mode
> + */
> +enum hdmi_quantization_range
> +drm_default_rgb_quant_range(const struct drm_display_mode *mode)
> +{
> + return drm_match_cea_mode(mode) > 1 ?
> + HDMI_QUANTIZATION_RANGE_LIMITED :
> + HDMI_QUANTIZATION_RANGE_FULL;
> +}
> +EXPORT_SYMBOL(drm_default_rgb_quant_range);
> +
>  static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
>  const u8 *hdmi)
>  {
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 343e1d9fa761..d4befbbe834a 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
>* VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
>*/
>   pipe_config->limited_color_range =
> - bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
> + bpp != 18 &&
> + drm_default_rgb_quant_range(adjusted_mode) ==
> + HDMI_QUANTIZATION_RANGE_LIMITED;
>   } else {
>   pipe_config->limited_color_range =
>   intel_dp->limited_color_range;
> diff --git a/drivers/gpu/drm/i915/intel_hdmi.c 
> b/drivers/gpu/drm/i915/intel_hdmi.c
> index 0bcfead14571..19bd13f53729 100644
> --- a/drivers/gpu/drm/i915/intel_hdmi.c
> +++ b/drivers/gpu/drm/i915/intel_hdmi.c
> @@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder 
> *encoder,
>   /* See CEA-861-E - 5.1 Default Encoding Parameters */
>   pipe_config->limited_color_range =
>   pipe_config->has_hdmi_sink &&
> - drm_match_cea_mode(adjusted_mode) > 1;
> + drm_default_rgb_quant_range(adjusted_mode) ==
> + HDMI_QUANTIZATION_RANGE_LIMITED;
>   } else {
>   pipe_config->limited_color_range =
>   intel_hdmi->limited_color_range;
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index c4cb2e26de32..d79466a42690 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder 
> *encoder,
>   csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
>   VC4_HD_CSC_CTL_ORDER);
>  
> - if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
> + if (vc4_encoder->hdmi_monitor &&
> + drm_default_rgb_quant_range(adjusted_mode) ==
> + HDMI_QUANTIZATION_RANGE_LIMITED) {
>   /* CEA VICs other than #1 requre limited range RGB
>* output unless overridden by an AVI infoframe.
>* Apply a colorspace conversion to squash 0-255 down
> diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
> index 838eaf2b42e9..25cdf5f7a0d8 100644
> --- a/include/drm/drm_edid.h
> +++ b/include/drm/drm_edid.h
> @@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const 
> u8 video_code);
>  bool drm_detect_hdmi_monitor(struct edid *edid);
>  bool drm_detect_monitor_audio(struct edid *edid);
>  bool drm_rgb_quant_range_selectable(struct edid *edid);
> +enum hdmi_quantization_range
> +drm_default_rgb_quant_range(const struct drm_display_mode *mode);
>  int drm_add_modes_noedid(struct drm_connector *connector,
>int hdisplay, int vdisplay);
>  void drm_set_preferred_mode(struct drm_connector *connector,
> -- 
> 2.10.2
> 

-- 
Daniel Vetter
Softwa

Re: [Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()

2017-01-11 Thread Ville Syrjälä

On Wed, Jan 11, 2017 at 05:16:54PM +0100, Daniel Vetter wrote:
> On Wed, Jan 11, 2017 at 02:57:22PM +0200, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä 
> > 
> > Make the code selecting the RGB quantization range a little less magicy
> > by wrapping it up in a small helper.
> > 
> > Signed-off-by: Ville Syrjälä 
> 
> Needs cc: for driver maintainers, Eric for vc4 here.

Eric was cc:d. I was just too lazy to add the cc:s to all the commit
messages, and so I just used --cc on the whole lot.

> -Daniel
> 
> > ---
> >  drivers/gpu/drm/drm_edid.c| 18 ++
> >  drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
> >  drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
> >  drivers/gpu/drm/vc4/vc4_hdmi.c|  4 +++-
> >  include/drm/drm_edid.h|  2 ++
> >  5 files changed, 28 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
> > index 4ff04aa84dd0..304c583b8000 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid 
> > *edid)
> >  }
> >  EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
> >  
> > +/**
> > + * drm_default_rgb_quant_range - default RGB quantization range
> > + * @mode: display mode
> > + *
> > + * Determine the default RGB quantization range for the mode,
> > + * as specified in CEA-861.
> > + *
> > + * Return: The default RGB quantization range for the mode
> > + */
> > +enum hdmi_quantization_range
> > +drm_default_rgb_quant_range(const struct drm_display_mode *mode)
> > +{
> > +   return drm_match_cea_mode(mode) > 1 ?
> > +   HDMI_QUANTIZATION_RANGE_LIMITED :
> > +   HDMI_QUANTIZATION_RANGE_FULL;
> > +}
> > +EXPORT_SYMBOL(drm_default_rgb_quant_range);
> > +
> >  static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
> >const u8 *hdmi)
> >  {
> > diff --git a/drivers/gpu/drm/i915/intel_dp.c 
> > b/drivers/gpu/drm/i915/intel_dp.c
> > index 343e1d9fa761..d4befbbe834a 100644
> > --- a/drivers/gpu/drm/i915/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/intel_dp.c
> > @@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
> >  * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
> >  */
> > pipe_config->limited_color_range =
> > -   bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
> > +   bpp != 18 &&
> > +   drm_default_rgb_quant_range(adjusted_mode) ==
> > +   HDMI_QUANTIZATION_RANGE_LIMITED;
> > } else {
> > pipe_config->limited_color_range =
> > intel_dp->limited_color_range;
> > diff --git a/drivers/gpu/drm/i915/intel_hdmi.c 
> > b/drivers/gpu/drm/i915/intel_hdmi.c
> > index 0bcfead14571..19bd13f53729 100644
> > --- a/drivers/gpu/drm/i915/intel_hdmi.c
> > +++ b/drivers/gpu/drm/i915/intel_hdmi.c
> > @@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder 
> > *encoder,
> > /* See CEA-861-E - 5.1 Default Encoding Parameters */
> > pipe_config->limited_color_range =
> > pipe_config->has_hdmi_sink &&
> > -   drm_match_cea_mode(adjusted_mode) > 1;
> > +   drm_default_rgb_quant_range(adjusted_mode) ==
> > +   HDMI_QUANTIZATION_RANGE_LIMITED;
> > } else {
> > pipe_config->limited_color_range =
> > intel_hdmi->limited_color_range;
> > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > index c4cb2e26de32..d79466a42690 100644
> > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> > @@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct 
> > drm_encoder *encoder,
> > csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
> > VC4_HD_CSC_CTL_ORDER);
> >  
> > -   if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
> > +   if (vc4_encoder->hdmi_monitor &&
> > +   drm_default_rgb_quant_range(adjusted_mode) ==
> > +   HDMI_QUANTIZATION_RANGE_LIMITED) {
> > /* CEA VICs other than #1 requre limited range RGB
> >  * output unless overridden by an AVI infoframe.
> >  * Apply a colorspace conversion to squash 0-255 down
> > diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
> > index 838eaf2b42e9..25cdf5f7a0d8 100644
> > --- a/include/drm/drm_edid.h
> > +++ b/include/drm/drm_edid.h
> > @@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const 
> > u8 video_code);
> >  bool drm_detect_hdmi_monitor(struct edid *edid);
> >  bool drm_detect_monitor_audio(struct edid *edid);
> >  bool drm_rgb_quant_range_selectable(struct edid *edid);
> > +enum hdmi_quantization_range
> > +drm_default_rgb_quant_range(const struct drm_display_mode *mode);
> >  int drm_add_modes_noedid(struct drm_conne

[Intel-gfx] GPU hang with kernel 4.10rc3

2017-01-11 Thread Juergen Gross

With kernel 4.10rc3 running as Xen dm0 I get at each boot:

[   49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell
[1431], reason: Hang on render ring, action: reset
[   49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire
gfx stack, including userspace.
[   49.213700] [drm] Please file a _new_ bug report on
bugs.freedesktop.org against DRI -> DRM/Intel
[   49.213700] [drm] drm/i915 developers can then reassign to the right
component if it's not a kernel issue.
[   49.213700] [drm] The gpu crash dump is required to analyze gpu
hangs, so please always attach it.
[   49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   49.213755] drm/i915: Resetting chip after gpu hang
[   60.213769] drm/i915: Resetting chip after gpu hang
[   71.189737] drm/i915: Resetting chip after gpu hang
[   82.165747] drm/i915: Resetting chip after gpu hang
[   93.205727] drm/i915: Resetting chip after gpu hang

The dump is attached.


Juergen
GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell [1431], reason: Hang on render 
ring, action: reset
Kernel: 4.10.0-rc3-pv+
Time: 1484151498 s 569085 us
Boottime: 39 s 844107 us
Uptime: 34 s 750060 us
is_mobile: no
is_i85x: no
is_i915g: no
is_i945gm: no
is_g33: no
is_g4x: no
is_pineview: no
is_broadwater: no
is_crestline: no
is_ivybridge: no
is_valleyview: no
is_cherryview: no
is_haswell: yes
is_broadwell: no
is_skylake: no
is_broxton: no
is_kabylake: no
is_alpha_support: no
has_64bit_reloc: no
has_csr: no
has_ddi: yes
has_dp_mst: yes
has_fbc: yes
has_fpga_dbg: yes
has_gmbus_irq: yes
has_gmch_display: no
has_guc: no
has_hotplug: yes
has_hw_contexts: yes
has_l3_dpf: yes
has_llc: yes
has_logical_ring_contexts: no
has_overlay: no
has_pipe_cxsr: no
has_pooled_eu: no
has_psr: yes
has_rc6: yes
has_rc6p: no
has_resource_streamer: yes
has_runtime_pm: yes
has_snoop: no
cursor_needs_physical: no
hws_needs_physical: no
overlay_needs_physical: no
supports_tv: no
has_decoupled_mmio: no
Active process (on ring render): gnome-shell [1431]
Reset count: 0
Suspend count: 0
PCI ID: 0x0416
PCI Revision: 0x06
PCI Subsystem: 1028:05bd
IOMMU enabled?: 0
EIR: 0x
IER: 0xfc002529
GTIER: 0x00401821
PGTBL_ER: 0x
FORCEWAKE: 0x0001
DERRMR: 0x
CCID: 0x012f710d
Missed interrupts: 0x
  fence[0] = bbf03300603001
  fence[1] = c3300700bf4003
  fence[2] = d3300f00c34003
  fence[3] = 12f003300d34001
  fence[4] = 230703f01308003
  fence[5] = 2b0003b02309003
  fence[6] = 30c503302b09001
  fence[7] = 
  fence[8] = 
  fence[9] = 
  fence[10] = 
  fence[11] = 
  fence[12] = 
  fence[13] = 
  fence[14] = 
  fence[15] = 
  fence[16] = 
  fence[17] = 
  fence[18] = 
  fence[19] = 
  fence[20] = 
  fence[21] = 
  fence[22] = 
  fence[23] = 
  fence[24] = 
  fence[25] = 
  fence[26] = 
  fence[27] = 
  fence[28] = 
  fence[29] = 
  fence[30] = 
  fence[31] = 
ERROR: 0x0101
DONE_REG: 0xffef
ERR_INT: 0x
render command stream:
  START: 0x1000
  HEAD:  0x0620 [0x05f8]
  TAIL:  0x0668 [0x0630, 0x0668]
  CTL:   0x0001f001
  MODE:  0x4000
  HWS:   0x7fff
  ACTHD: 0x 030c9004
  IPEIR: 0x
  IPEHR: 0xc2c2c2c2
  INSTDONE: 0xffdf
  SC_INSTDONE: 0x
  SAMPLER_INSTDONE[0][0]: 0x
  ROW_INSTDONE[0][0]: 0x
  batch: [0x_030c9000, 0x_030cc000]
  BBADDR: 0x_030c9005
  BB_STATE: 0x
  INSTPS: 0x8201
  INSTPM: 0x6080
  FADDR: 0x 030c9200
  RC PSMI: 0x0010
  FAULT_REG: 0x00c5
  SYNC_0: 0x
  SYNC_1: 0x0002
  SYNC_2: 0x
  GFX_MODE: 0x2a00
  PP_DIR_BASE: 0x7fdf
  seqno: 0x0008
  last_seqno: 0x0009
  waiting: yes
  ring->head: 0x
  ring->tail: 0x0668
  hangcheck: hung [42]
blt command stream:
  START: 0x00022000
  HEAD:  0x0088 [0x]
  TAIL:  0x0088 [0x, 0x]
  CTL:   0x0001f001
  MODE:  0x0200
  HWS:   0x00021000
  ACTHD: 0x 0088
  IPEIR: 0x
  IPEHR: 0x
  INSTDONE: 0xfffe
  BBADDR: 0x_00bc0024
  BB_STATE: 0x
  INSTPS: 0x
  INSTPM: 0x
  FADDR: 0x 00022088
  RC PSMI: 0x0010
  FAULT_REG: 0x
  SYNC_0: 0x0008
  SYNC_1: 0x
  SYNC_2: 0x
  GFX_MODE: 0x0200
  PP_DIR_BASE: 0x7fdf
  seqno: 0x0002
  last_seqno: 0x0002
  waiting: no
  ring->head: 0x
  ring->tail: 0x
  hangcheck: idle [0]
bsd command stream:
  START: 0x00043000
  HEAD:  0x [0x]
  TAIL:  0x [0x, 0x]
  CTL:   0x0001f001
  MODE:  0x0200
  HWS:   0x00042000
  ACTHD: 0x 
  IPEIR: 0x
  IPEHR: 0x
  INSTDONE: 0xfffe
  BBADDR: 0x_
  BB_STATE: 0x
  INSTPS: 0x
  INSTPM: 0x

Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG

2017-01-11 Thread Ville Syrjälä

On Wed, Jan 11, 2017 at 07:44:05AM -0800, Ben Widawsky wrote:
> On 17-01-11 17:05:04, Ville Syrjälä wrote:
> >On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote:
> >> Am 05.01.2017 um 12:37 schrieb Ville Syrjälä:
> >> > On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote:
> >> >> From: Rainer Hochecker 
> >> >>
> >> >> This adds fourcc codes for 16bit planes required for DRM buffer
> >> >> export to mesa.
> >> >>
> >> >> Signed-off-by: Rainer Hochecker 
> >> > Reviewed-by: Ville Syrjälä 
> >>
> >> Good to see some work landing on that part, patch is Acked-by: Christian
> >> König .
> >
> >Has the userspace side of this been reviewed already?
> >
> >/me wonders if it's safe to push this...
> >
> 
> I acked the mesa side, and Rainer sent a version 2 which also looked fine to 
> me.
> Let me bump that thread...

Thanks everyone. I've pushed this patch to drm-misc-next.

> 
> >>
> >> >
> >> >> ---
> >> >>   include/uapi/drm/drm_fourcc.h | 7 +++
> >> >>   1 file changed, 7 insertions(+)
> >> >>
> >> >> diff --git a/include/uapi/drm/drm_fourcc.h 
> >> >> b/include/uapi/drm/drm_fourcc.h
> >> >> index a5890bf..d230e58 100644
> >> >> --- a/include/uapi/drm/drm_fourcc.h
> >> >> +++ b/include/uapi/drm/drm_fourcc.h
> >> >> @@ -41,10 +41,17 @@ extern "C" {
> >> >>   /* 8 bpp Red */
> >> >>   #define DRM_FORMAT_R8 fourcc_code('R', '8', ' ', ' ') /* 
> >> >> [7:0] R */
> >> >>
> >> >> +/* 16 bpp Red */
> >> >> +#define DRM_FORMAT_R16 fourcc_code('R', '1', '6', ' ') /* 
> >> >> [15:0] R little endian */
> >> >> +
> >> >>   /* 16 bpp RG */
> >> >>   #define DRM_FORMAT_RG88   fourcc_code('R', 'G', '8', '8') 
> >> >> /* [15:0] R:G 8:8 little endian */
> >> >>   #define DRM_FORMAT_GR88   fourcc_code('G', 'R', '8', '8') 
> >> >> /* [15:0] G:R 8:8 little endian */
> >> >>
> >> >> +/* 32 bpp RG */
> >> >> +#define DRM_FORMAT_RG1616  fourcc_code('R', 'G', '3', '2') /* 
> >> >> [31:0] R:G 16:16 little endian */
> >> >> +#define DRM_FORMAT_GR1616  fourcc_code('G', 'R', '3', '2') /* 
> >> >> [31:0] G:R 16:16 little endian */
> >> >> +
> >> >>   /* 8 bpp RGB */
> >> >>   #define DRM_FORMAT_RGB332 fourcc_code('R', 'G', 'B', '8') /* 
> >> >> [7:0] R:G:B 3:3:2 */
> >> >>   #define DRM_FORMAT_BGR233 fourcc_code('B', 'G', 'R', '8') /* 
> >> >> [7:0] B:G:R 2:3:3 */
> >> >> --
> >> >> 2.9.3
> >>
> >
> >-- 
> >Ville Syrjälä
> >Intel OTC

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc

2017-01-11 Thread Tvrtko Ursulin



On 11/01/2017 13:13, Chris Wilson wrote:

This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).

v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)


Hm hm... so fence->lock under timeline->lock when we enable signalling, 
while we already have the opposite in the submit_notify->submit_request 
yes? That would mean a per fence lock class, which can't work, or a 
nested annotation on fence->lock? So outside i915?




Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 79 +++---
 drivers/gpu/drm/i915/i915_irq.c|  4 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  5 +-
 3 files changed, 76 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 913d87358972..bdc9e2bc5eb9 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
*request)
u32 freespace;
int ret;

-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
freespace -= client->wq_rsvd;
if (likely(freespace >= wqi_size)) {
@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
*request)
client->no_wq_space++;
ret = -EAGAIN;
}
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);

return ret;
 }
@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request 
*request)

GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);

-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
client->wq_rsvd -= wqi_size;
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 }

 /* Construct a Work Item and append it to the GuC's Work Queue */
@@ -534,10 +534,74 @@ static void __i915_guc_submit(struct drm_i915_gem_request 
*rq)

 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
-   i915_gem_request_submit(rq);
+   __i915_gem_request_submit(rq);
__i915_guc_submit(rq);
 }

+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
+{
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *last = port[0].request;
+   unsigned long flags;
+   struct rb_node *rb;
+   bool submit = false;
+
+   spin_lock_irqsave(&engine->timeline->lock, flags);
+   rb = engine->execlist_first;
+   while (rb) {
+   struct drm_i915_gem_request *cursor =
+   rb_entry(rb, typeof(*cursor), priotree.node);
+
+   if (last && cursor->ctx != last->ctx) {
+   if (port != engine->execlist_port)
+   break;
+
+   i915_gem_request_assign(&port->request, last);
+   dma_fence_enable_sw_signaling(&last->fence);
+   port++;
+   }
+
+   rb = rb_next(rb);
+   rb_erase(&cursor->priotree.node, &engine->execlist_queue);
+   RB_CLEAR_NODE(&cursor->priotree.node);
+   cursor->priotree.priority = INT_MAX;
+
+   i915_guc_submit(cursor);
+   last = cursor;
+   submit = true;
+   }
+   if (submit) {
+   i915_gem_request_assign(&port->request, last);
+   dma_fence_enable_sw_signaling(&last->fence);
+   engine->execlist_first = rb;
+   }
+   spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+   return submit;
+}


It is again tempting me to suggest a single instance of the dequeue 
loop. Maybe something like:


i915_request_dequeue(engine, assign_func, submit_func)
{
...

while (rb) {
...
if () {
...
assign_func();
port++;
}

...

submit_func();
last = cursor;
submit = true;
}
if (submit) {
assign_func();
engine->execlist_first = rb;
}

...
}

execlists_dequeue_assign()
{
i915_gem_request_assign();
}

execlists_dequeue_submit()
{
__i915_gem_request_submit();
}

execlists_dequeue(
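
To make the shape of that suggestion concrete, here is a small userspace
sketch of one dequeue loop shared by several backends through function
pointers. It is not i915 code: struct request, struct engine, the two-port
array and the singly linked queue are simplified stand-ins (the driver uses
an rbtree and holds the timeline lock), and every name below is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_PORTS 2

/* Simplified stand-in for a request: context id plus queue linkage. */
struct request {
	int ctx;
	struct request *next;	/* next request in priority order */
	bool submitted;
};

/* Simplified stand-in for an engine with two submission ports. */
struct engine {
	struct request *queue;			/* highest priority first */
	struct request *port[NUM_PORTS];	/* last request assigned to each port */
};

typedef void (*assign_fn)(struct request **slot, struct request *rq);
typedef void (*submit_fn)(struct request *rq);

/*
 * One loop for every backend: consecutive requests of the same context
 * share a port, and the loop stops once both ports are taken and yet
 * another context shows up.
 */
static bool request_dequeue(struct engine *engine,
			    assign_fn assign, submit_fn submit)
{
	struct request **port = &engine->port[0];
	struct request *last = *port;
	struct request *rq = engine->queue;
	bool submitted = false;

	while (rq) {
		struct request *next = rq->next;

		if (last && rq->ctx != last->ctx) {
			if (port != &engine->port[0])
				break;	/* both ports are in use */

			assign(port, last);
			port++;
		}

		rq->next = NULL;	/* unlink from the queue */
		submit(rq);
		last = rq;
		submitted = true;
		rq = next;
	}
	if (submitted) {
		assign(port, last);
		engine->queue = rq;	/* first request still waiting */
	}

	return submitted;
}

/* Backend-specific pieces, trivial in this sketch. */
static void simple_assign(struct request **slot, struct request *rq)
{
	*slot = rq;
}

static void simple_submit(struct request *rq)
{
	rq->submitted = true;
}
```

A GuC flavour would presumably pass an assign callback that also enables
signaling on the request, while the execlists flavour could pass the plain
one. Whether that indirection is worth it is the open question here.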

Re: [Intel-gfx] [PATCH v3] drm/i915/scheduler: emulate a scheduler for guc

2017-01-11 Thread Tvrtko Ursulin



On 11/01/2017 16:11, Chris Wilson wrote:

This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).

v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)
v3: Apply lockdep nesting to enabling signaling on the request, using
the pattern we already have in __i915_gem_request_submit();

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 92 +++---
 drivers/gpu/drm/i915/i915_irq.c|  4 +-
 drivers/gpu/drm/i915/intel_lrc.c   |  5 +-
 3 files changed, 89 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
b/drivers/gpu/drm/i915/i915_guc_submission.c
index 913d87358972..4484591cbf7c 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
*request)
u32 freespace;
int ret;

-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
freespace -= client->wq_rsvd;
if (likely(freespace >= wqi_size)) {
@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
*request)
client->no_wq_space++;
ret = -EAGAIN;
}
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);

return ret;
 }
@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request 
*request)

GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);

-   spin_lock(&client->wq_lock);
+   spin_lock_irq(&client->wq_lock);
client->wq_rsvd -= wqi_size;
-   spin_unlock(&client->wq_lock);
+   spin_unlock_irq(&client->wq_lock);
 }

 /* Construct a Work Item and append it to the GuC's Work Queue */
@@ -534,10 +534,87 @@ static void __i915_guc_submit(struct drm_i915_gem_request 
*rq)

 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
-   i915_gem_request_submit(rq);
+   __i915_gem_request_submit(rq);
__i915_guc_submit(rq);
 }

+static void nested_enable_signaling(struct drm_i915_gem_request *rq)
+{
+   if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+&rq->fence.flags))
+   return;
+
+   GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags));
+
+   spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING);
+   intel_engine_enable_signaling(rq);
+   spin_unlock(&rq->lock);
+}


Crossed wires. :) Opencoding this works I guess.

I would stick a trace_dma_fence_enable_signal into it to be extra nice.
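
For reference, the first half of nested_enable_signaling() is the usual
"only the first caller does the work" guard. A userspace sketch of that idiom
with C11 atomics follows. Here atomic_fetch_or() stands in for the kernel's
test_and_set_bit(), the locking and the lockdep annotation are left out
entirely, and struct fence_sketch with its fields is invented for the
example.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_ENABLE_SIGNAL_BIT 0	/* hypothetical bit number */

struct fence_sketch {
	atomic_ulong flags;
	int enable_calls;	/* counts how often the slow path ran */
};

static void fence_sketch_init(struct fence_sketch *f)
{
	atomic_init(&f->flags, 0);
	f->enable_calls = 0;
}

/* Returns true only for the caller that actually set the bit. */
static bool enable_signaling_once(struct fence_sketch *f)
{
	unsigned long mask = 1UL << SKETCH_ENABLE_SIGNAL_BIT;
	unsigned long old = atomic_fetch_or(&f->flags, mask);

	if (old & mask)
		return false;	/* someone else already enabled signaling */

	f->enable_calls++;	/* stand-in for the real enable work */
	return true;
}
```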

Regards,

Tvrtko


+
+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
+{
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *last = port[0].request;
+   unsigned long flags;
+   struct rb_node *rb;
+   bool submit = false;
+
+   spin_lock_irqsave(&engine->timeline->lock, flags);
+   rb = engine->execlist_first;
+   while (rb) {
+   struct drm_i915_gem_request *cursor =
+   rb_entry(rb, typeof(*cursor), priotree.node);
+
+   if (last && cursor->ctx != last->ctx) {
+   if (port != engine->execlist_port)
+   break;
+
+   i915_gem_request_assign(&port->request, last);
+   nested_enable_signaling(last);
+   port++;
+   }
+
+   rb = rb_next(rb);
+   rb_erase(&cursor->priotree.node, &engine->execlist_queue);
+   RB_CLEAR_NODE(&cursor->priotree.node);
+   cursor->priotree.priority = INT_MAX;
+
+   i915_guc_submit(cursor);
+   last = cursor;
+   submit = true;
+   }
+   if (submit) {
+   i915_gem_request_assign(&port->request, last);
+   nested_enable_signaling(last);
+   engine->execlist_first = rb;
+   }
+   spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+   return submit;
+}
+
+static void i915_guc_irq_handler(unsigned long data)
+{
+   struct intel_engine_cs *engine = (struct intel_engine_cs *)data;
+   struct execlist_port *port = engine->execlist_port;
+   struct drm_i915_gem_request *rq;
+   bool submit;
+
+   do {
+   rq = port[0].request;
+   while (rq && i915_gem_request_c

Re: [Intel-gfx] GPU hang with kernel 4.10rc3

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 05:33:34PM +0100, Juergen Gross wrote:
> With kernel 4.10rc3 running as Xen dom0 I get at each boot:
> 
> [   49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell
> [1431], reason: Hang on render ring, action: reset
> [   49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire
> gfx stack, including userspace.
> [   49.213700] [drm] Please file a _new_ bug report on
> bugs.freedesktop.org against DRI -> DRM/Intel
> [   49.213700] [drm] drm/i915 developers can then reassign to the right
> component if it's not a kernel issue.
> [   49.213700] [drm] The gpu crash dump is required to analyze gpu
> hangs, so please always attach it.
> [   49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [   49.213755] drm/i915: Resetting chip after gpu hang
> [   60.213769] drm/i915: Resetting chip after gpu hang
> [   71.189737] drm/i915: Resetting chip after gpu hang
> [   82.165747] drm/i915: Resetting chip after gpu hang
> [   93.205727] drm/i915: Resetting chip after gpu hang
> 
> The dump is attached.

That's a nasty one. The first couple of pages of the batchbuffer appear
to be overwritten. (Full of 0xc2c2c2c2, i.e. probably pixel data.) That
may be a concurrent write by either the GPU or CPU, or we may have
incorrectly mapped a set of pages. That it doesn't recover suggests
that the corruption occurs frequently, probably on every request/batch.

Is this a new bug? Bisection would be the fastest way to triage it.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 04:55:46PM +, Tvrtko Ursulin wrote:
> 
> On 11/01/2017 13:13, Chris Wilson wrote:
> >This emulates execlists on top of the GuC in order to defer submission of
> >requests to the hardware. This deferral allows time for high priority
> >requests to gazump their way to the head of the queue, however it nerfs
> >the GuC by converting it back into a simple execlist (where the CPU has
> >to wake up after every request to feed new commands into the GuC).
> >
> >v2: Drop hack status - though iirc there is still a lockdep inversion
> >between fence and engine->timeline->lock (which is impossible as the
> >nesting only occurs on different fences - hopefully just requires some
> >judicious lockdep annotation)
> 
> Hm hm... so fence->lock under timeline->lock when we enable
> signalling, while we already have the opposite in the
> submit_notify->submit_request yes? That would mean a per fence lock
> class, which can't work, or a nested annotation on fence->lock? So
> outside i915?

That was my approach as well, and by heading in that direction, we can
see that it is the same nested signal enabling issue we met when
handling the deferral of the breadcrumb signaler in
i915_gem_request_submit().

> >Signed-off-by: Chris Wilson 
> >---
> > drivers/gpu/drm/i915/i915_guc_submission.c | 79 +++---
> > drivers/gpu/drm/i915/i915_irq.c|  4 +-
> > drivers/gpu/drm/i915/intel_lrc.c   |  5 +-
> > 3 files changed, 76 insertions(+), 12 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c 
> >b/drivers/gpu/drm/i915/i915_guc_submission.c
> >index 913d87358972..bdc9e2bc5eb9 100644
> >--- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
> >*request)
> > u32 freespace;
> > int ret;
> >
> >-spin_lock(&client->wq_lock);
> >+spin_lock_irq(&client->wq_lock);
> > freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size);
> > freespace -= client->wq_rsvd;
> > if (likely(freespace >= wqi_size)) {
> >@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request 
> >*request)
> > client->no_wq_space++;
> > ret = -EAGAIN;
> > }
> >-spin_unlock(&client->wq_lock);
> >+spin_unlock_irq(&client->wq_lock);
> >
> > return ret;
> > }
> >@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request 
> >*request)
> >
> > GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size);
> >
> >-spin_lock(&client->wq_lock);
> >+spin_lock_irq(&client->wq_lock);
> > client->wq_rsvd -= wqi_size;
> >-spin_unlock(&client->wq_lock);
> >+spin_unlock_irq(&client->wq_lock);
> > }
> >
> > /* Construct a Work Item and append it to the GuC's Work Queue */
> >@@ -534,10 +534,74 @@ static void __i915_guc_submit(struct 
> >drm_i915_gem_request *rq)
> >
> > static void i915_guc_submit(struct drm_i915_gem_request *rq)
> > {
> >-i915_gem_request_submit(rq);
> >+__i915_gem_request_submit(rq);
> > __i915_guc_submit(rq);
> > }
> >
> >+static bool i915_guc_dequeue(struct intel_engine_cs *engine)
> >+{
> >+struct execlist_port *port = engine->execlist_port;
> >+struct drm_i915_gem_request *last = port[0].request;
> >+unsigned long flags;
> >+struct rb_node *rb;
> >+bool submit = false;
> >+
> >+spin_lock_irqsave(&engine->timeline->lock, flags);
> >+rb = engine->execlist_first;
> >+while (rb) {
> >+struct drm_i915_gem_request *cursor =
> >+rb_entry(rb, typeof(*cursor), priotree.node);
> >+
> >+if (last && cursor->ctx != last->ctx) {
> >+if (port != engine->execlist_port)
> >+break;
> >+
> >+i915_gem_request_assign(&port->request, last);
> >+dma_fence_enable_sw_signaling(&last->fence);
> >+port++;
> >+}
> >+
> >+rb = rb_next(rb);
> >+rb_erase(&cursor->priotree.node, &engine->execlist_queue);
> >+RB_CLEAR_NODE(&cursor->priotree.node);
> >+cursor->priotree.priority = INT_MAX;
> >+
> >+i915_guc_submit(cursor);
> >+last = cursor;
> >+submit = true;
> >+}
> >+if (submit) {
> >+i915_gem_request_assign(&port->request, last);
> >+dma_fence_enable_sw_signaling(&last->fence);
> >+engine->execlist_first = rb;
> >+}
> >+spin_unlock_irqrestore(&engine->timeline->lock, flags);
> >+
> >+return submit;
> >+}
> 
> It is again tempting me to suggest a single instance of the dequeue
> loop. Maybe something like:
> 
> i915_request_dequeue(engine, assign_func, submit_func)
> {
>   ...
>   
>   while (rb) {
>   ...
>   if () {

Don't forget

if (!cant_merge_func()) { /* which is where I start not lik

[Intel-gfx] [drm-intel:for-linux-next 2/4] htmldocs: drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'offset'

2017-01-11 Thread kbuild test robot

tree:   git://anongit.freedesktop.org/drm-intel for-linux-next
head:   c781c978e784c50dcd7cb312fe17f5281923f55b
commit: 625d988acc28f3fe1d44f3798426561c17387a59 [2/4] drm/i915: Extract 
reserving space in the GTT to a helper
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   make[3]: warning: jobserver unavailable: using -j1.  Add '+' to parent make 
rule.
   include/linux/init.h:1: warning: no structured comments found
   include/linux/kthread.h:26: warning: Excess function parameter '...' 
description in 'kthread_create'
   kernel/sys.c:1: warning: no structured comments found
   drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'firstopen'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'open'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'preclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'postclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'lastclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'dma_ioctl'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'dma_quiescent'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'context_dtor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'set_busid'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'irq_handler'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'irq_preinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'irq_postinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'irq_uninstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'debugfs_init'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'debugfs_cleanup'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_open_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_close_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'prime_handle_to_fd'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'prime_fd_to_handle'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_export'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_import'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_pin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_unpin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_res_obj'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_get_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_import_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_vmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_vunmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_prime_mmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'vgaarb_irq'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_vm_ops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'major'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'minor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'patchlevel'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'name'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'desc'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'date'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'driver_features'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'dev_priv_size'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'num_ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'fops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 
'legacy_dev_list'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for 
parameter 'vm'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for 
parameter 'node'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for 
parameter 'size'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for 
>> parameter 'offset'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description

Re: [Intel-gfx] [RFCv2 01/19] drm/i915: Provide a hook for selftests

2017-01-11 Thread Tvrtko Ursulin



On 20/12/2016 13:07, Chris Wilson wrote:

Some pieces of code are independent of hardware but are very tricky to
exercise through the normal userspace ABI or via debugfs hooks. Being
able to create mock unit tests and execute them through CI is vital.
Start by adding a central point where we can execute unit tests and
a parameter to enable them. This is disabled by default as the
expectation is that these tests will occasionally explode.

To facilitate integration with igt, any parameter beginning with
i915.igt__ is interpreted as a subtest executable independently via
igt/drv_selftest.

Two classes of selftests are recognised: mock unit tests and integration
tests. Mock unit tests are run as soon as the module is loaded, before
the device is probed. At that point there is no driver instantiated and
all hw interactions must be "mocked". This is very useful for writing
universal tests to exercise code not typically run on a broad range of
architectures. Alternatively, you can hook into the late selftests and
run when the device has been instantiated - hw interactions are real.

v2: Add a macro for compiling conditional code for mock objects inside
real objects.
v3: Differentiate between mock unit tests and late integration test.
v4: List the tests in natural order, use igt to sort after modparam.
v5: s/late/live/

Signed-off-by: Chris Wilson 
Reviewed-by: Tvrtko Ursulin  #v1
---
 drivers/gpu/drm/i915/Kconfig.debug |  15 ++
 drivers/gpu/drm/i915/Makefile  |   3 +
 drivers/gpu/drm/i915/i915_pci.c|  19 +-
 drivers/gpu/drm/i915/i915_selftest.h   |  91 +
 .../gpu/drm/i915/selftests/i915_live_selftests.h   |  11 +
 .../gpu/drm/i915/selftests/i915_mock_selftests.h   |  11 +
 drivers/gpu/drm/i915/selftests/i915_selftest.c | 222 +
 7 files changed, 371 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_selftest.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_live_selftests.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_mock_selftests.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_selftest.c

diff --git a/drivers/gpu/drm/i915/Kconfig.debug 
b/drivers/gpu/drm/i915/Kconfig.debug
index 598551dbf62c..de051502e891 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -26,6 +26,7 @@ config DRM_I915_DEBUG
 select DRM_DEBUG_MM if DRM=y
select DRM_DEBUG_MM_SELFTEST
select DRM_I915_SW_FENCE_DEBUG_OBJECTS
+   select DRM_I915_SELFTEST
 default n
 help
   Choose this option to turn on extra driver debugging that may affect
@@ -59,3 +60,17 @@ config DRM_I915_SW_FENCE_DEBUG_OBJECTS
   Recommended for driver developers only.

   If in doubt, say "N".
+
+config DRM_I915_SELFTEST
+   bool "Enable selftests upon driver load"
+   depends on DRM_I915
+   default n
+   help
+ Choose this option to allow the driver to perform selftests upon
+ loading; also requires the i915.selftest=1 module parameter. To
+ exit the module after running the selftests (i.e. to prevent normal
+ module initialisation afterwards) use i915.selftest=-1.
+
+ Recommended for driver developers only.
+
+ If in doubt, say "N".
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5196509e71cf..461aeb44a9ad 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -3,6 +3,7 @@
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.

 subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror
+subdir-ccflags-$(CONFIG_DRM_I915_SELFTEST) := -I$(src) -I$(src)/selftests
 subdir-ccflags-y += \
$(call as-instr,movntdqa (%eax)$(comma)%xmm0,-DCONFIG_AS_MOVNTDQA)

@@ -114,6 +115,8 @@ i915-y += dvo_ch7017.o \

 # Post-mortem debug and GPU hang state capture
 i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
+i915-$(CONFIG_DRM_I915_SELFTEST) += \
+   selftests/i915_selftest.o

 # virtual gpu code
 i915-y += i915_vgpu.o
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 9885458b0fb8..3d416d142573 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -27,6 +27,7 @@
 #include 

 #include "i915_drv.h"
+#include "i915_selftest.h"

 #define GEN_DEFAULT_PIPEOFFSETS \
.pipe_offsets = { PIPE_A_OFFSET, PIPE_B_OFFSET, \
@@ -477,6 +478,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)
 {
struct intel_device_info *intel_info =
(struct intel_device_info *) ent->driver_data;
+   int err;

if (IS_ALPHA_SUPPORT(intel_info) && !i915.alpha_support) {
DRM_INFO("The driver support for your hardware in this kernel 
version is alpha quality\n"
@@ -500,7 +502,17 @@ static int i915_pci_probe(struct pci_dev *pdev, const 
struct pci_device_id *ent)

[Intel-gfx] [PATCH] drm/i915: Detect vma reserved for execbuf in evict-for-node

2017-01-11 Thread Chris Wilson

The vma->exec_list is still the only means we have for both reserving an
object in execbuf, and for constructing the eviction list. So during the
construction of the eviction list, we must treat anything already on the
exec_list as being pinned.
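
As a rough illustration of the check this patch adds, the sketch below models
a vma with a pin count and an exec_list link, and treats it as unevictable if
either says it is in use. The list helpers are a cut-down userspace imitation
of the kernel's list_head; struct vma_sketch and vma_blocks_eviction() are
hypothetical names, not driver code.

```c
#include <stdbool.h>

/* Cut-down imitation of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

struct vma_sketch {
	unsigned int pin_count;
	struct list_head exec_link;	/* non-empty while reserved by execbuf */
};

/* A vma must not be evicted if it is pinned or already on an exec list. */
static bool vma_blocks_eviction(const struct vma_sketch *vma)
{
	return vma->pin_count > 0 || !list_empty(&vma->exec_link);
}
```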

Yes, this sharing of two semantically different lists will be fixed! But
in the meantime, we have the issue that this is tripping up CI since we
started using i915_gem_gtt_reserve_node() + i915_gem_evict_for_node()
from the regular execbuf reservation path in commit 606fec956c0e
("drm/i915: Prefer random replacement before eviction search"):

[  108.424063] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:254!
[  108.424072] invalid opcode:  [#1] PREEMPT SMP
[  108.424079] Modules linked in: snd_hda_intel i915 intel_powerclamp coretemp 
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi 
snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm 
lpc_ich mei sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915]
[  108.424132] CPU: 1 PID: 6865 Comm: gem_cs_tlb Tainted: G U  
4.10.0-rc3-CI-CI_DRM_2049+ #1
[  108.424143] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 
68CCU Ver. F.24 09/13/2013
[  108.424154] task: 88012ae22600 task.stack: c9a14000
[  108.424220] RIP: 0010:i915_gem_evict_for_node+0x237/0x410 [i915]
[  108.424229] RSP: 0018:c9a17a58 EFLAGS: 00010202
[  108.424237] RAX: 5871 RBX: 88012d1ad778 RCX: 
[  108.424246] RDX: 7000 RSI: c9a17a68 RDI: 880127e694d8
[  108.424255] RBP: c9a17aa0 R08: c9a17a68 R09: 
[  108.424264] R10: 0001 R11:  R12: 8000
[  108.424273] R13: c9a17a68 R14: 880127e694d8 R15: a0387330
[  108.424283] FS:  7f8236e3d8c0() GS:880137c4() 
knlGS:
[  108.424293] CS:  0010 DS:  ES:  CR0: 80050033
[  108.424305] CR2: 7f82347a2000 CR3: 00012c866000 CR4: 06e0
[  108.424317] Call Trace:
[  108.424368]  i915_gem_gtt_reserve+0x67/0x80 [i915]
[  108.424424]  __i915_vma_do_pin+0x248/0x620 [i915]
[  108.424487]  ? __i915_vma_do_pin+0x162/0x620 [i915]
[  108.424540]  i915_gem_execbuffer_reserve_vma.isra.8+0x153/0x1f0 [i915]
[  108.424591]  i915_gem_execbuffer_reserve.isra.9+0x40e/0x440 [i915]
[  108.424643]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
[  108.424696]  i915_gem_execbuffer2+0xc0/0x250 [i915]
[  108.424712]  drm_ioctl+0x200/0x450
[  108.424760]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  108.424776]  do_vfs_ioctl+0x90/0x6e0
[  108.424789]  ? up_read+0x1a/0x40
[  108.424800]  ? trace_hardirqs_on_caller+0x122/0x1b0
[  108.424813]  SyS_ioctl+0x3c/0x70
[  108.424828]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  108.424839] RIP: 0033:0x7f8235867357
[  108.424848] RSP: 002b:7ffdc14504c8 EFLAGS: 0246 ORIG_RAX: 
0010
[  108.424866] RAX: ffda RBX: 7ffdc1450600 RCX: 7f8235867357
[  108.424878] RDX: 7ffdc14505a0 RSI: 40406469 RDI: 0003
[  108.424890] RBP:  R08:  R09: 0022
[  108.424903] R10: 0007 R11: 0246 R12: 0002
[  108.424915] R13: 00419101 R14: 7ffdc1450600 R15: 7ffdc14505f0
[  108.424928] Code: 45 b8 8b 4d c0 4c 89 f2 48 89 de ff d0 49 8b 07 4c 8b 45 
b8 48 85 c0 75 dd 65 ff 0d d4 a1 c8 5f 0f 84 47 01 00 00 e9 0d fe ff ff <0f> 0b 
45 31 f6 4c 8b 65 c8 49 8b 04 24 4d 39 ec 49 8d 9c 24 28
[  108.425055] RIP: i915_gem_evict_for_node+0x237/0x410 [i915] RSP: 
c9a17a58

Fixes: 172ae5b4c8c1 ("drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)")
Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction 
search")
Signed-off-by: Chris Wilson 
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c
index 50b4645bf627..a43e44e18042 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -305,7 +305,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
}
 
/* Overlap of objects in the same batch? */
-   if (i915_vma_is_pinned(vma)) {
+   if (i915_vma_is_pinned(vma) || !list_empty(&vma->exec_list)) {
ret = -ENOSPC;
if (vma->exec_entry &&
vma->exec_entry->flags & EXEC_OBJECT_PINNED)
-- 
2.11.0


Re: [Intel-gfx] [RFCv2 14/19] drm/i915: Move uncore selfchecks to live selftest infrastructure

2017-01-11 Thread Matthew Auld

On 20 December 2016 at 13:08, Chris Wilson  wrote:
> Now that the kselftest infrastructure exists, put it to use and add to
> it the existing consistency checks on the fw register lookup tables.
>
> v2: s/tabke/table/
>
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 
Reviewed-by: Matthew Auld 

Re: [Intel-gfx] [RFCv2 15/19] drm/i915: Test all fw tables during mock selftests

2017-01-11 Thread Matthew Auld

On 20 December 2016 at 13:08, Chris Wilson  wrote:
> In addition to just testing the fw table we load, during the initial
> mock testing we can test that all tables are valid (so the testing is
> not limited to just the platforms that load that particular table).
>
> Signed-off-by: Chris Wilson 
> Cc: Tvrtko Ursulin 
Reviewed-by: Matthew Auld 

Re: [Intel-gfx] [RFCv2 16/19] drm/i915: Sanity check all registers for matching fw domains

2017-01-11 Thread Matthew Auld

On 20 December 2016 at 13:08, Chris Wilson  wrote:
> Add a late selftest that walks over all forcewake registers (those below
> 0x4) and checks with intel_uncore_forcewake_for_reg() that the lookup
> exists and we have the matching powerwells.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/selftests/intel_uncore.c | 47 +++
>  1 file changed, 47 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c 
> b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> index c18fddb12d00..c9f90514500f 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> @@ -107,6 +107,49 @@ int intel_uncore_mock_selftests(void)
> return 0;
>  }
>
> +static int intel_uncore_check_forcewake_domains(struct drm_i915_private 
> *dev_priv)
> +{
> +#define FW_RANGE 0x4
> +   unsigned long *valid;
> +   u32 offset;
> +   int err;
> +
> +   valid = kzalloc(BITS_TO_LONGS(FW_RANGE) * sizeof(*valid),
> +   GFP_TEMPORARY);
> +   if (!valid)
> +   return -ENOMEM;
> +
> +   intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +
> +   check_for_unclaimed_mmio(dev_priv);
> +   for (offset = 0; offset < FW_RANGE; offset += 4) {
> +   i915_reg_t reg = { offset };
> +
> +   (void)I915_READ_FW(reg);
> +   if (!check_for_unclaimed_mmio(dev_priv))
> +   set_bit(offset, valid);
> +   }
> +
> +   intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +
> +   for_each_set_bit(offset, valid, FW_RANGE) {
> +   i915_reg_t reg = { offset };
> +
> +   intel_uncore_forcewake_reset(dev_priv, false);
> +   check_for_unclaimed_mmio(dev_priv);
hmm, what do we need this for ?

> +
> +   (void)I915_READ(reg);
> +   if (check_for_unclaimed_mmio(dev_priv)) {
> +   pr_err("Unclaimed mmio read to register 0x%04x\n",
> +  offset);
> +   err = -EINVAL;
> +   }
> +   }
> +
> +   kfree(valid);
> +   return err;
> +}
> +
>  int intel_uncore_live_selftests(struct drm_i915_private *i915)
>  {
> int err;
> @@ -118,5 +161,9 @@ int intel_uncore_live_selftests(struct drm_i915_private *i915)
> if (err)
> return err;
>
> +   err = intel_uncore_check_forcewake_domains(i915);
> +   if (err)
> +   return err;
> +
> return 0;
>  }
> --
> 2.11.0
>

Re: [Intel-gfx] [RFCv2 01/19] drm/i915: Provide a hook for selftests

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 06:17:48PM +0000, Tvrtko Ursulin wrote:
> On 20/12/2016 13:07, Chris Wilson wrote:
> >@@ -522,6 +534,11 @@ static struct pci_driver i915_pci_driver = {
> > static int __init i915_init(void)
> > {
> > bool use_kms = true;
> >+int err;
> >+
> >+err = i915_mock_selftests();
> >+if (err)
> >+return err > 0 ? 0 : err;
> 
> Am I again confused by the return codes? :) Module param of -1 will
> result in i915_mock_selftests returning 1, which here translates to
> 0 so it won't abort the load like it should.

I had to give up on that so that a pass stays silent, and instead leave
the module removal to userspace on success. Returning anything other than
0 causes noise in dmesg. That I can live with after an error during the
selftest, since dmesg should also contain more details on the test failure.

If i915.mock_selftests=-1 then we run the tests and stop. We just leave
the module loaded even though it hasn't bound to any pci devices. :|
igt/drv_selftest and kselftests/gpu/i915.sh then unload the module.
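
The return-code handling discussed above can be sketched in isolation.
This is only an illustration under stated assumptions: init_status() and
should_register_driver() are hypothetical names, not functions from the
patch, and the sketch assumes a positive value means "tests ran and
passed, stop here" while a negative value is a real error.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: maps the selftest result
 * onto the value the module init function returns.
 *   selftest_err == 0 : no early exit requested, continue loading
 *   selftest_err  > 0 : tests ran and passed, stop quietly (init returns 0)
 *   selftest_err  < 0 : a test failed, propagate the error
 */
static int init_status(int selftest_err)
{
	return selftest_err > 0 ? 0 : selftest_err;
}

/* Hypothetical helper: should init go on to register the PCI driver? */
static bool should_register_driver(int selftest_err)
{
	return selftest_err == 0;
}
```

With this mapping a passing mock run leaves the module loaded but unbound,
which is why the userspace runners have to unload it afterwards.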

> >+static void set_default_test_all(struct selftest *st, unsigned long count)
> >+{
> >+unsigned long i;
> >+
> >+for (i = 0; i < count; i++)
> >+if (st[i].enabled)
> >+return;
> >+
> >+for (i = 0; i < count; i++)
> >+st[i].enabled = true;
> >+}
> 
> unsigned int should be enough for everyone! :) (i & count)

Such shortsightedness!

> >+static int run_selftests(const char *name,
> >+ struct selftest *st,
> >+ unsigned long count,
> >+ void *data)
> >+{
> >+int err = 0;
> >+
> 
> If I got it right:
> 
> /* Make sure both live and mock run with the same seed if ran one
> after another. */

Yes, choose the seed once, run every selected test with the same seed.
 
> ? just not sure what happens if user sets zero.

I wasn't sure if 0 was a valid seed, so I wasn't caring too much if the
user did i915.st_random_seed=0. They will see the pr_info() and go
wtf, and hopefully won't do that again.
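
A minimal userspace sketch of that seed selection, assuming 0 is treated
as "no seed chosen". pick_seed(), rng_fn and stub_rng() are hypothetical
names for illustration; in the patch the random value comes from
get_random_int().

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's get_random_int(). */
typedef uint32_t (*rng_fn)(void);

/*
 * Treat 0 as "no seed chosen": keep a non-zero seed from the user,
 * otherwise draw until the generator yields a non-zero value.
 */
static uint32_t pick_seed(uint32_t requested, rng_fn rng)
{
	uint32_t seed = requested;

	while (!seed)
		seed = rng();

	return seed;
}

/* Deterministic stub for illustration: returns 0 first, then 7. */
static uint32_t stub_rng(void)
{
	static int calls;

	return calls++ == 0 ? 0 : 7;
}
```

A user-supplied 0 is therefore indistinguishable from "unset" and gets
replaced by a random seed, which is what the pr_info() would then report.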

> >+while (!i915_selftest.random_seed)
> >+i915_selftest.random_seed = get_random_int();
> >+
> >+i915_selftest.timeout_jiffies =
> >+i915_selftest.timeout_ms ?
> >+msecs_to_jiffies_timeout(i915_selftest.timeout_ms) :
> >+MAX_SCHEDULE_TIMEOUT;
> >+
> >+set_default_test_all(st, count);
> >+
> >+pr_info("i915: Performing %s selftests with st_random_seed=%x and st_timeout=%u\n",
> >+name, i915_selftest.random_seed, i915_selftest.timeout_ms);
> >+
> >+/* Tests are listed in order in i915_*_selftests.h */
> >+for (; count--; st++) {
> >+if (!st->enabled)
> >+continue;
> >+
> >+cond_resched();
> >+if (signal_pending(current))
> >+return -EINTR;
> >+
> >+pr_debug("i915: Running %s\n", st->name);
> >+if (data)
> >+err = st->live(data);
> >+else
> >+err = st->mock();
> >+if (err)
> >+break;
> >+}
> >+
> >+if (WARN(err > 0 || err == -ENOTTY,
> >+ "%s returned %d, conflicting with selftest's magic values!\n",
> >+ st->name, err))
> >+err = -1;
> >+
> >+rcu_barrier();
> 
> Why this?

Paranoia for the tests aborting without the barrier, as we can't rely on
module_unload providing it since we may go on to load the driver as
normal.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [RFCv2 16/19] drm/i915: Sanity check all registers for matching fw domains

2017-01-11 Thread Chris Wilson

On Wed, Jan 11, 2017 at 06:25:59PM +0000, Matthew Auld wrote:
> On 20 December 2016 at 13:08, Chris Wilson  wrote:
> > +   for_each_set_bit(offset, valid, FW_RANGE) {
> > +   i915_reg_t reg = { offset };
> > +
> > +   intel_uncore_forcewake_reset(dev_priv, false);
> > +   check_for_unclaimed_mmio(dev_priv);
> hmm, what do we need this for ?

It clears the debug register before every test - so that we know the
only thing the debug register is complaining about is the I915_READ()
sandwiched in between. The reset is there to ensure that the fw is
turned off and the timer disabled, so that we have a vanilla state
every time with the powerwell off. Hopefully.
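
The clear-then-test pattern can be sketched against simulated hardware.
Everything here is an assumption made for illustration: struct fake_hw,
fake_read() and check_and_clear_unclaimed() are hypothetical and only
mimic a sticky debug flag that is cleared when it is checked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated hardware, purely illustrative. */
struct fake_hw {
	bool unclaimed;   /* sticky "unclaimed access" debug flag */
	uint32_t limit;   /* offsets at or above this are unclaimed */
};

/* Returns whether the sticky flag was set, and clears it. */
static bool check_and_clear_unclaimed(struct fake_hw *hw)
{
	bool was_set = hw->unclaimed;

	hw->unclaimed = false;
	return was_set;
}

/* A read outside the claimed range latches the sticky flag. */
static void fake_read(struct fake_hw *hw, uint32_t offset)
{
	if (offset >= hw->limit)
		hw->unclaimed = true;
}

/* Clear stale state, do one read, then report only what that read caused. */
static bool read_is_unclaimed(struct fake_hw *hw, uint32_t offset)
{
	(void)check_and_clear_unclaimed(hw);
	fake_read(hw, offset);
	return check_and_clear_unclaimed(hw);
}
```

Without the first check, a flag left over from an earlier access would be
blamed on the read under test.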

> > +
> > +   (void)I915_READ(reg);
> > +   if (check_for_unclaimed_mmio(dev_priv)) {

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/edid: Improve RGB limited range handling a bit (rev2)

2017-01-11 Thread Patchwork

== Series Details ==

Series: drm/edid: Improve RGB limited range handling a bit (rev2)
URL   : https://patchwork.freedesktop.org/series/17825/
State : success

== Summary ==

Series 17825v2 drm/edid: Improve RGB limited range handling a bit
https://patchwork.freedesktop.org/api/1.0/series/17825/revisions/2/mbox/


fi-bdw-5557u total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050 total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205 total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700 total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900 total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820 total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770  total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-hsw-4770r total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19 
fi-ivb-3520m total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770  total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21 
fi-skl-6260u total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21 
fi-skl-6770hq total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600  total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32 

b69fc4c941bef6d10750ce3f07daedfffc7017d1 drm-tip: 2017y-01m-11d-17h-30m-02s UTC integration manifest
20e8661 drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F
3d4bb98 drm/edid: Set AVI infoframe Q even when QS=0
0784091 drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
36b475a drm/edid: Introduce drm_default_rgb_quant_range()
e46f7c5 drm/edid: Have drm_edid.h include hdmi.h

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3484/
