date:20170504

[Intel-gfx] ✗ Fi.CI.BAT: failure for Enhancement to intel_dp_aux_backlight driver (rev4)

2017-05-04 Thread Patchwork

== Series Details ==

Series: Enhancement to intel_dp_aux_backlight driver (rev4)
URL   : https://patchwork.freedesktop.org/series/21086/
State : failure

== Summary ==

make: Entering directory '/home/cidrm/kernel'
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CHK include/generated/bounds.h
  CHK include/generated/timeconst.h
  CHK include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CHK kernel/config_data.h
  CC [M]  drivers/gpu/drm/i915/i915_params.o
In file included from ./include/linux/module.h:18:0,
 from ./include/drm/drmP.h:59,
 from drivers/gpu/drm/i915/i915_drv.h:47,
 from drivers/gpu/drm/i915/i915_params.c:26:
drivers/gpu/drm/i915/i915_params.c: In function ‘__check_enable_dpcd_backlight’:
./include/linux/moduleparam.h:344:67: error: return from incompatible pointer type [-Werror=incompatible-pointer-types]
  static inline type __always_unused *__check_##name(void) { return(p); }
   ^
./include/linux/moduleparam.h:396:35: note: in expansion of macro ‘__param_check’
 #define param_check_bool(name, p) __param_check(name, p, bool)
   ^
./include/linux/moduleparam.h:146:2: note: in expansion of macro ‘param_check_bool’
  param_check_##type(name, &(value));   \
  ^
drivers/gpu/drm/i915/i915_params.c:249:1: note: in expansion of macro ‘module_param_named’
 module_param_named(enable_dpcd_backlight, i915.enable_dpcd_backlight, bool, 0600);
 ^
cc1: all warnings being treated as errors
scripts/Makefile.build:294: recipe for target 'drivers/gpu/drm/i915/i915_params.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_params.o] Error 1
scripts/Makefile.build:553: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:553: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:553: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1002: recipe for target 'drivers' failed
make: *** [drivers] Error 2
make: Leaving directory '/home/cidrm/kernel'
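
For readers unfamiliar with this class of failure, here is a minimal userspace sketch of the type check that fired. It is not the kernel macro: PARAM_CHECK and struct params are hypothetical names made up for illustration, and the log does not show how the series actually declared enable_dpcd_backlight. The idea is that the check helper returns a pointer to the declared parameter type, so a variable of any other type makes the return statement a pointer-type mismatch.

```c
#include <stdbool.h>

/*
 * Simplified imitation of the kernel's __param_check() (PARAM_CHECK is a
 * hypothetical name): it emits a helper returning `type *`, so the
 * compiler complains when `p` points at a variable of another type.
 */
#define PARAM_CHECK(name, p, type) \
	static inline type *param_check_##name(void) { return (p); }

/* Hypothetical stand-in for the driver's parameter struct. */
struct params {
	bool enable_dpcd_backlight;	/* bool, matching the declared type */
};

static struct params i915;

/* Compiles cleanly because the field type and the declared type agree. */
PARAM_CHECK(enable_dpcd_backlight, &i915.enable_dpcd_backlight, bool)

/*
 * Had the field been declared with a different type (say int), the helper
 * would return an `int *` from a function declared to return `bool *`,
 * which is the "return from incompatible pointer type" diagnostic above.
 */
```

Under that assumption, the usual remedy is to make the struct field type and the type named in module_param_named() agree.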

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH i-g-t 13/13] tests/gem_exec_nop: Disable headless subtest on cairoless Android

2017-05-04 Thread Petri Latvala

On Wed, Apr 19, 2017 at 01:01:55PM +0200, Arkadiusz Hiler wrote:
> Currently the whole igt_kms.c is disabled while compiling on Android without
> cairo, so this test does not compile.
> 
> There should be a cleaner way to disable only the cairo-dependent parts,
> which should allow us to enable at least some of the KMS tests, but
> that's a bigger rework for another time.
> 
> Signed-off-by: Arkadiusz Hiler 
> ---
>  lib/Android.mk   | 1 +
>  tests/gem_exec_nop.c | 4 
>  2 files changed, 5 insertions(+)
> 
> diff --git a/lib/Android.mk b/lib/Android.mk
> index 31f88be..dc538b8 100644
> --- a/lib/Android.mk
> +++ b/lib/Android.mk
> @@ -38,6 +38,7 @@ ifeq ("${ANDROID_HAS_CAIRO}", "1")
>  LOCAL_C_INCLUDES += $(ANDROID_BUILD_TOP)/external/cairo-1.12.16/src
>  LOCAL_CFLAGS += -DANDROID_HAS_CAIRO=1 -DIGT_DATADIR=\".\" -DIGT_SRCDIR=\".\"
>  else
> +
>  skip_lib_list := \
>  igt_kms.c \
>  igt_kms.h \
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index 66c2fc1..967caef 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -138,6 +138,7 @@ stable_nop_on_ring(int fd, uint32_t handle, unsigned int engine,
>   return n;
>  }
>  
> +#if (!defined(ANDROID)) || (defined(ANDROID) && ANDROID_HAS_CAIRO)


Tautological check for ANDROID being defined. Is it too confusing to
reduce this to

#if !defined(ANDROID) || ANDROID_HAS_CAIRO
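
The reduction can be checked with a small truth table. This is only a sketch: the helper names are made up, plain ints stand in for the preprocessor state, and it assumes ANDROID_HAS_CAIRO evaluates to 0 when it is not defined, as undefined identifiers do in #if expressions.

```c
/*
 * android:   1 if ANDROID is defined, 0 otherwise.
 * has_cairo: value of ANDROID_HAS_CAIRO (0 when undefined).
 */
static int original_cond(int android, int has_cairo)
{
	return !android || (android && has_cairo);
}

static int reduced_cond(int android, int has_cairo)
{
	return !android || has_cairo;
}

/* Returns 1 when both conditions agree for every input combination. */
static int conditions_agree(void)
{
	for (int android = 0; android <= 1; android++)
		for (int has_cairo = 0; has_cairo <= 1; has_cairo++)
			if (original_cond(android, has_cairo) !=
			    reduced_cond(android, has_cairo))
				return 0;
	return 1;
}
```

The second operand of || is only reached when ANDROID is defined, which is why repeating defined(ANDROID) there adds nothing.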




>  #define assert_within_epsilon(x, ref, tolerance) \
>  igt_assert_f((x) <= (1.0 + tolerance) * ref && \
>   (x) >= (1.0 - tolerance) * ref, \
> @@ -178,6 +179,7 @@ static void headless(int fd, uint32_t handle)
>   /* check that the two execution speeds are roughly the same */
>   assert_within_epsilon(n_headless, n_display, 0.1f);
>  }
> +#endif
>  
>  static bool ignore_engine(int fd, unsigned engine)
>  {
> @@ -561,8 +563,10 @@ igt_main
>   igt_subtest("context-sequential")
>   sequential(device, handle, FORKED | CONTEXT, 150);
>  
> +#if (!defined(ANDROID)) || (defined(ANDROID) && ANDROID_HAS_CAIRO)


Likewise.




--
Petri Latvala





>   igt_subtest("headless")
>   headless(device, handle);
> +#endif
>  
>   igt_fixture {
>   igt_stop_hang_detector();
> -- 
> 2.9.3
> 

Re: [Intel-gfx] [maintainer-tools PATCH v3] dim: Add pull request tag template

2017-05-04 Thread Jani Nikula

On Wed, 03 May 2017, Sean Paul  wrote:
> Each pull request is accompanied by a summary that is stored in the git tag
> from which it is generated. These summaries all share the same template with
> headers classifying changes to UAPI, Cross-subsystem, Core, and Drivers. This
> patch adds this template to the tag summary automatically in dim pull-request.
>
> Changes in v2:
>   - Tweaked the template var name s/PULL/TAG/ (Daniel)
> Changes in v3:
>   - Use git tag -F- to ingest template (Jani)
>   - Tweak naming/comments again to hopefully clarify things (Jani)
>
> Signed-off-by: Sean Paul 

Pushed, thanks.

BR,
Jani.

> ---
>  dim | 25 +++--
>  dim.rst |  4 
>  2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/dim b/dim
> index 8937803..baa0b38 100755
> --- a/dim
> +++ b/dim
> @@ -67,6 +67,9 @@ DIM_TEMPLATE_HELLO=${DIM_TEMPLATE_HELLO:-$HOME/.dim.template.hello}
>  # signature pull request template
>  DIM_TEMPLATE_SIGNATURE=${DIM_TEMPLATE_SIGNATURE:-$HOME/.dim.template.signature}
>  
> +# dim pull-request tag summary template
> +DIM_TEMPLATE_TAG_SUMMARY=${DIM_TEMPLATE_TAG_SUMMARY:-$HOME/.dim.template.tagsummary}
> +
>  #
>  # Internal configuration.
>  #
> @@ -1501,6 +1504,24 @@ function dim_tag_next
>  
>  }
>  
> +function prep_pull_tag_summary
> +{
> + if [ -r $DIM_TEMPLATE_TAG_SUMMARY ]; then
> + cat $DIM_TEMPLATE_TAG_SUMMARY
> + else
> + cat <<-EOF
> + UAPI Changes:
> +
> + Cross-subsystem Changes:
> +
> + Core Changes:
> +
> + Driver Changes:
> +
> + EOF
> + fi
> +}
> +
>  # dim_pull_request branch upstream
>  function dim_pull_request
>  {
> @@ -1533,9 +1554,9 @@ function dim_pull_request
>   while git tag -l $tag | grep -q $tag ; do
>   tag="$branch-$today-$((++suffix))"
>   done
> -
>   gitk "$branch@{upstream}" ^$upstream &
> - $DRY git tag -a $tag "$branch@{upstream}"
> + prep_pull_tag_summary | $DRY git tag -F- $tag "$branch@{upstream}"
> + $DRY git tag -a -f $tag
>   $DRY git push $remote $tag
>   prep_pull_mail $req_file $tag
>  
> diff --git a/dim.rst b/dim.rst
> index 3dd19f9..10572f1 100644
> --- a/dim.rst
> +++ b/dim.rst
> @@ -464,6 +464,10 @@ DIM_TEMPLATE_SIGNATURE
>  --
>  Path to a file containing a signature template for pull request mails.
>  
> +DIM_TEMPLATE_TAG_SUMMARY
> +-
> +Path to a file containing the template for dim pull-request tag summaries.
> +
>  dim_alias_
>  -
>  Make  an alias for the subcommand defined as the value. For 
> example,

-- 
Jani Nikula, Intel Open Source Technology Center

Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Calculate vlv/chv intermediate watermarks correctly, v2.

2017-05-04 Thread Maarten Lankhorst

On 03-05-17 at 20:03, Ville Syrjälä wrote:
> On Wed, May 03, 2017 at 06:18:46PM +0200, Maarten Lankhorst wrote:
>> On 03-05-17 at 18:07, Ville Syrjälä wrote:
>>> On Wed, May 03, 2017 at 05:53:34PM +0200, Maarten Lankhorst wrote:
>>>> On 03-05-17 at 16:11, Ville Syrjälä wrote:
>>>>> On Wed, May 03, 2017 at 04:06:37PM +0200, Maarten Lankhorst wrote:
>>>>>> On 03-05-17 at 15:45, Ville Syrjälä wrote:
>>>>>>> On Mon, May 01, 2017 at 03:34:34PM +0200, Maarten Lankhorst wrote:
>>>>>>>> The watermarks it should calculate against are the old optimal watermarks.
>>>>>>>> The currently active crtc watermarks are pure fiction, and are invalid in
>>>>>>>> case of a nonblocking modeset, page flip enabling/disabling planes or any
>>>>>>>> other reason.
>>>>>>>>
>>>>>>>> When the crtc is disabled or during a modeset the intermediate watermarks
>>>>>>>> don't need to be programmed separately, and could be directly assigned
>>>>>>>> to the optimal watermarks.
>>>>>>>>
>>>>>>>> Also rename crtc_state to new_crtc_state, to distinguish it from the old state.
>>>>>>>>
>>>>>>>> Changes since v1:
>>>>>>>> - Use intel_atomic_get_old_crtc_state. (ville)
>>>>>>>>
>>>>>>>> Signed-off-by: Maarten Lankhorst
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/i915/intel_pm.c | 20 ++--
>>>>>>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> index 0f344b1fff45..a09396ee1f3d 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> @@ -1458,16 +1458,24 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state,
>>>>>>>>  
>>>>>>>>  static int vlv_compute_intermediate_wm(struct drm_device *dev,
>>>>>>>>  				       struct intel_crtc *crtc,
>>>>>>>> -				       struct intel_crtc_state *crtc_state)
>>>>>>>> +				       struct intel_crtc_state *new_crtc_state)
>>>>>>>>  {
>>>>>>>> -	struct vlv_wm_state *intermediate = &crtc_state->wm.vlv.intermediate;
>>>>>>>> -	const struct vlv_wm_state *optimal = &crtc_state->wm.vlv.optimal;
>>>>>>>> -	const struct vlv_wm_state *active = &crtc->wm.active.vlv;
>>>>>>>> +	struct vlv_wm_state *intermediate = &new_crtc_state->wm.vlv.intermediate;
>>>>>>>> +	const struct vlv_wm_state *optimal = &new_crtc_state->wm.vlv.optimal;
>>>>>>>> +	const struct intel_crtc_state *old_crtc_state =
>>>>>>>> +		intel_atomic_get_old_crtc_state(new_crtc_state->base.state, crtc);
>>>>>>>> +	const struct vlv_wm_state *active = &old_crtc_state->wm.vlv.optimal;
>>>>>>>>  	int level;
>>>>>>>>  
>>>>>>>> +	if (!new_crtc_state->base.active || drm_atomic_crtc_needs_modeset(&new_crtc_state->base)) {
>>>>>>>> +		*intermediate = *optimal;
>>>>>>>> +
>>>>>>>> +		return 0;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>>  	intermediate->num_levels = min(optimal->num_levels, active->num_levels);
>>>>>>>>  	intermediate->cxsr = optimal->cxsr && active->cxsr &&
>>>>>>>> -		!crtc_state->disable_cxsr;
>>>>>>>> +		!new_crtc_state->disable_cxsr;
>>>>>>> We need to consider disable_cxsr even in the modeset case.
>>>>>> Why is this? crtc_state->disable_cxsr is set if any plane is part of the
>>>>>> crtc during modeset, so it's disabled during modeset already.
>>>>> It's set if any plane is enabling/disabling, which should be quite
>>>>> typical during a modeset.
>>>> Yeah but .initial_watermarks is called during crtc_enable, so cxsr will
>>>> get enabled anyway.
>>> Which is not what we want. CxSR must stay off until the planes have been
>>> enabled.
>>>
>> In that case why is it enabled in .initial_watermarks at all? It should be 
>> in optimize_watermarks then..
> Because we can keep it enabled across the update unless planes are
> getting enabled or disabled.
>
So for the modeset case, computing intermediate watermarks:

*intermediate = *optimal;
if (needs_modeset)
	intermediate->cxsr = false;

if (optimal->cxsr && !intermediate->cxsr)
	new_crtc_state->wm.need_postvbl_update = true;

?
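
Spelled out as a self-contained sketch, the proposal above would look roughly like this. The struct and function names are hypothetical stand-ins, not the i915 types, and this is only my reading of the pseudocode, not a tested driver change.

```c
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for struct vlv_wm_state. */
struct wm_state {
	int num_levels;
	bool cxsr;
};

/* Hypothetical stand-in for the watermark part of the crtc state. */
struct crtc_wm {
	struct wm_state intermediate;
	struct wm_state optimal;
	bool need_postvbl_update;
};

/*
 * Modeset/disabled-crtc path: start from the optimal watermarks, keep
 * CxSR off across a modeset, and request a post-vblank update whenever
 * the optimal state wants CxSR but the intermediate one does not.
 */
static void compute_intermediate_for_modeset(struct crtc_wm *wm,
					     bool needs_modeset)
{
	wm->intermediate = wm->optimal;
	if (needs_modeset)
		wm->intermediate.cxsr = false;

	if (wm->optimal.cxsr && !wm->intermediate.cxsr)
		wm->need_postvbl_update = true;
}
```

With this shape CxSR stays off in the intermediate state during a modeset, and the post-vblank update is what would turn it back on once the planes are up.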


Re: [Intel-gfx] [PATCH RESEND i-g-t 2/2] kms_frontbuffer_tracking: Don't poke compressing status for old cpus

2017-05-04 Thread Petri Latvala

On Wed, Apr 26, 2017 at 03:36:16PM -0300, Paulo Zanoni wrote:
> > I have a feeling I asked this before, but why aren't we just fixing
> > the kernel to report it correctly? For any platform with FBC2 it
> > should be trivial,
> 
> Right, I see there's a reg for that for ILK/SNB.
> 
> > for FBC1 slightly more complicated as you probably
> > have to check each individual tag.
> 
> I didn't check the docs for that.
> 
> Maybe we should change the comment from "early generations are not able
> to report compression status" to something more accurate like "the
> Kernel doesn't report compression status for early generations".
> 
> 
> There are quite a few different ways to solve the problem involved in
> this patch, and some of them would remove the need to check for platform
> generations on the user space side. An example alternative would be to
> always print "Compressing: " and then put "no" when FBC is disabled and
> "unknown" for platforms where we don't know what to print. In fact it's
> still on my TODO list to add a ton more information to i915_fbc_status,
> but I'm not going to work on that soon. And there's always the problem
> with having to sync Kernel and IGT.
> 
> Anyway, the current patch plugs the current hole, so I think further
> improvements to this area can come on top of it:
> 
> Reviewed-by: Paulo Zanoni 
>



Pushed this patch, thanks.


-- 
Petri Latvala

Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread Arkadiusz Hiler

On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > > > Some of these are used by media-sdk; if these entries are missing
> > > > the default will instead be to do everything uncached.
> > > > 
> > > > This patch improves media-sdk performance by up to 60%
> > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > testing, without regressing any other benchmarks.
> > > 
> > > Hey David,
> > > 
> > > I am testing some of the extended MOCS with Mesa and the differences I
> > > see fit in the margins of statistical error.
> > > 
> > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > everything to UNCACHED - and I saw severe performance drop.
> > > 
> > > So here is the question it induced:
> > > 
> > > Have you used the "closest neighbour" from entries available or did you
> > > default to the UNCACHED ones? That could be the culprit.
> > > 
> > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > few synthetic cases - it will require much more fine-tuning and
> > > benchmarking before any final conclusions.
> > 
> > As I mentioned in the commit message, the improvements only manifest
> > themselves for media-sdk workloads (and presumably other workloads
> > that use the same hardware); if you see any performance regressions
> > with these additional entries I'd be interested to know.
> 
> But what is being counter suggested is that there is no reason for these
> mocs entries. If the sdk is just using mocs registers without first
> programming them outside of the kernel abi, then it will be hitting
> uncached memory - and then the only benefit is from simply enabling
> cached access. The kernel ABI is minimalist for a reason, and we want to
> know why we should be adding tables that we need to maintain forever
> (bonus points for making that a consistent interface for hardware for
> years to come).
> -Chris

Thanks for rephrasing - that's exactly what I am concerned with.

Did you just use the MediaSDK as it is - meaning that MOCS entries
beyond the set of the 3 we have defined had been naively utilized?

If that's the case it is probably the cause of the performance
difference - everything beyond "the 3" means UNCACHED.

Can you try changing MediaSDK to only use entries that are already in?
How does the performance differ in that case?
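
To picture the concern, here is a toy lookup. It is a sketch only: the enum, the table contents and mocs_policy() are hypothetical names and values rather than the driver's intel_mocs code, and the table length of 3 is taken from "the 3 we have defined" above.

```c
#include <stddef.h>

/* Hypothetical cache policies; the real entries carry more detail. */
enum cache_policy { POLICY_UNCACHED, POLICY_PTE, POLICY_CACHED };

/* Assumed contents of the three entries the kernel defines. */
static const enum cache_policy defined_entries[] = {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_CACHED,
};

#define NUM_DEFINED (sizeof(defined_entries) / sizeof(defined_entries[0]))

/* Any index past the defined set falls back to uncached. */
static enum cache_policy mocs_policy(size_t index)
{
	if (index < NUM_DEFINED)
		return defined_entries[index];
	return POLICY_UNCACHED;
}
```

If userspace picks an index outside the defined set, it gets the uncached fallback regardless of what it intended, which alone could account for a large performance gap.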

-- 
Cheers,
Arek



Re: [Intel-gfx] [PATCH v2] tests/pm_sseu: Re-enable the test

2017-05-04 Thread Petri Latvala

On Wed, Apr 26, 2017 at 03:28:09AM -0700, Oscar Mateo wrote:
> This test got inadvertently disabled by commit 83884e97 (Restore
> "lib: Open debugfs files for the given DRM device") when the
> initialization order got changed (dbg_init before gem_init).
> 
> v2:
>   - The asserts on fd are useless (Petri)
>   - Deinit in inverse order.
> 
> Cc: Petri Latvala 
> Signed-off-by: Oscar Mateo 


Thanks, pushed with R-b.


Btw, can you do

 git config format.subjectprefix "PATCH i-g-t"

for your future patches?




-- 
Petri Latvala

Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread Tvrtko Ursulin



On 04/05/2017 09:35, Arkadiusz Hiler wrote:
> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
>> But what is being counter suggested is that there is no reason for these
>> mocs entries. If the sdk is just using mocs registers without first
>> programming them outside of the kernel abi, then it will be hitting
>> uncached memory - and then the only benefit is from simply enabling
>> cached access. The kernel ABI is minimalist for a reason, and we want to
>> know why we should be adding tables that we need to maintain forever
>> (bonus points for making that a consistent interface for hardware for
>> years to come).
>> -Chris
>
> Thanks for rephrasing - that's exactly what I am concerned with.
>
> Did you just use the MediaSDK as it is - meaning that MOCS entries
> beyond the set of the 3 we have defined had been naively utilized?
>
> If that's the case it is probably the cause of the performance
> difference - everything beyond "the 3" means UNCACHED.
>
> Can you try changing MediaSDK to only use entries that are already in?
> How does the performance differ in that case?


Alternatively, at the time this was on my plate, Eero had suggested a 
sequence of experiments by basically gradually replicating the default 
UC/WB entries to currently empty slots, starting on GT2 parts and then 
going forward adding the more fine tuned parts.


This would have shown the benefit of fine tuned entries vs basic cached
ones. Unfortunately I never got round to doing this, but it sounded like a
really good approach to me.


I could paste these suggestions here if Eero wouldn't mind? But I am also
not sure if it is still relevant after the effort of exactly documenting
the extended set of entries started.


Regards,

Tvrtko

Re: [Intel-gfx] [PATCH 07/67] drm/i915/cnl: Introduce Cannonlake platform defition.

2017-05-04 Thread Ander Conselvan De Oliveira

On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> Cannonlake is an Intel® Processor containing Intel® HD Graphics
> following Kabylake.
> 
> It is Gen10.
> 
> Let's start by adding the platform definition based on previous
> platforms but yet as alpha_support.
> 
> On following patches we will start adding PCI IDs and the
> platform specific changes.
> 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 3 +++
>  drivers/gpu/drm/i915/i915_pci.c  | 8 
>  drivers/gpu/drm/i915/intel_device_info.c | 1 +
>  3 files changed, 12 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2685f12..a357862 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -887,6 +887,7 @@ enum intel_platform {
>   INTEL_BROXTON,
>   INTEL_KABYLAKE,
>   INTEL_GEMINILAKE,
> + INTEL_CANNONLAKE,
>   INTEL_MAX_PLATFORMS
>  };
>  
> @@ -2751,6 +2752,7 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
>  #define IS_BROXTON(dev_priv) ((dev_priv)->info.platform == INTEL_BROXTON)
>  #define IS_KABYLAKE(dev_priv) ((dev_priv)->info.platform == INTEL_KABYLAKE)
>  #define IS_GEMINILAKE(dev_priv) ((dev_priv)->info.platform == INTEL_GEMINILAKE)
> +#define IS_CANNONLAKE(dev_priv) ((dev_priv)->info.platform == INTEL_CANNONLAKE)
>  #define IS_MOBILE(dev_priv)  ((dev_priv)->info.is_mobile)
>  #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
>   (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
> @@ -2842,6 +2844,7 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
>  #define IS_GEN7(dev_priv)(!!((dev_priv)->info.gen_mask & BIT(6)))
>  #define IS_GEN8(dev_priv)(!!((dev_priv)->info.gen_mask & BIT(7)))
>  #define IS_GEN9(dev_priv)(!!((dev_priv)->info.gen_mask & BIT(8)))
> +#define IS_GEN10(dev_priv)   (!!((dev_priv)->info.gen_mask & BIT(9)))
>  
>  #define IS_LP(dev_priv)  (INTEL_INFO(dev_priv)->is_lp)
>  #define IS_GEN9_LP(dev_priv) (IS_GEN9(dev_priv) && IS_LP(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index f87b0c4..a2a4b2f 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -431,6 +431,14 @@
>   .ring_mask = RENDER_RING | BSD_RING | BLT_RING | VEBOX_RING | BSD2_RING,
>  };
>  
> +static const struct intel_device_info intel_cannonlake_info = {
> + BDW_FEATURES,
> + .is_alpha_support = 1,
> + .platform = INTEL_CANNONLAKE,
> + .gen = 10,
> + .ddb_size = 896,
> +};
> +

I think it makes sense to squash patch 17 with this one. No point in adding
.ddb_size with the wrong value. If there's a reason not to squash, I'd say it's
better to leave this as zero, so that the WARN_ON(ddb_size == 0) in intel_pm.c
will remind us to fix it. With one of these suggestions,

Reviewed-by: Ander Conselvan de Oliveira 
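
The second suggestion relies on zero acting as a "not filled in yet" marker. A rough sketch of that idea, using a hypothetical struct and helper rather than the actual WARN_ON in intel_pm.c:

```c
#include <stdio.h>

/* Hypothetical, reduced stand-in for struct intel_device_info. */
struct device_info {
	int gen;
	unsigned int ddb_size;	/* 0 means "not filled in yet" */
};

/*
 * Returns 1 when ddb_size is usable. When it was left at zero, prints a
 * warning and returns 0, so the missing value cannot go unnoticed.
 */
static int ddb_size_valid(const struct device_info *info)
{
	if (info->ddb_size == 0) {
		fprintf(stderr, "WARN: ddb_size not set for gen %d\n",
			info->gen);
		return 0;
	}
	return 1;
}
```

A plausible but wrong nonzero value would pass such a check silently, which is the argument for leaving the field at zero until the correct number lands.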

>  /*
>   * Make sure any device matches here are from most specific to most
>   * general.  For example, since the Quanta match is based on the subsystem
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 7d01dfe..6b09a82 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -51,6 +51,7 @@
>   PLATFORM_NAME(BROXTON),
>   PLATFORM_NAME(KABYLAKE),
>   PLATFORM_NAME(GEMINILAKE),
> + PLATFORM_NAME(CANNONLAKE),
>  };
>  #undef PLATFORM_NAME
>  

Re: [Intel-gfx] [PATCH 16/67] drm/i915/cnl: Cannonlake has 4 planes (3 sprites) per pipe

2017-05-04 Thread Ander Conselvan De Oliveira

On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> From: James Irwin 
> 
> Issue: VIZ-4525
> 
> Reviewed-by: Damien Lespiau 
> Signed-off-by: James Irwin 
> Signed-off-by: Damien Lespiau 

Reviewed-by: Ander Conselvan de Oliveira 

> ---
>  drivers/gpu/drm/i915/intel_device_info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 6b09a82..3cc8cdb 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -328,7 +328,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>* we don't expose the topmost plane at all to prevent ABI breakage
>* down the line.
>*/
> - if (IS_GEMINILAKE(dev_priv))
> + if (IS_GEN10(dev_priv) || IS_GEMINILAKE(dev_priv))
>   for_each_pipe(dev_priv, pipe)
>   info->num_sprites[pipe] = 3;
>   else if (IS_BROXTON(dev_priv)) {

Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread Eero Tamminen


Hi,

On 04.05.2017 11:53, Tvrtko Ursulin wrote:
> On 04/05/2017 09:35, Arkadiusz Hiler wrote:
>> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
>>> But what is being counter suggested is that there is no reason for these
>>> mocs entries. If the sdk is just using mocs registers without first
>>> programming them outside of the kernel abi, then it will be hitting
>>> uncached memory - and then the only benefit is from simply enabling
>>> cached access. The kernel ABI is minimalist for a reason, and we want to
>>> know why we should be adding tables that we need to maintain forever
>>> (bonus points for making that a consistent interface for hardware for
>>> years to come).
>>> -Chris
>>
>> Thanks for rephrasing - that's exactly what I am concerned with.
>>
>> Did you just use the MediaSDK as it is - meaning that MOCS entries
>> beyond the set of the 3 we have defined had been naively utilized?
>>
>> If that's the case it is probably the cause of the performance
>> difference - everything beyond "the 3" means UNCACHED.
>>
>> Can you try changing MediaSDK to only use entries that are already in?
>> How does the performance differ in that case?
>
> Alternatively, at the time this was on my plate, Eero had suggested a
> sequence of experiments by basically gradually replicating the default
> UC/WB entries to currently empty slots, starting on GT2 parts and then
> going forward adding the more fine tuned parts.
>
> This would have shown the benefit of fine tuned entries vs basic cached
> ones. Unfortunately I never got round to doing this, but it sounded like a
> really good approach to me.
>
> I could paste these suggestions here if Eero wouldn't mind?

Of course I don't mind. :-)

> But I am also
> not sure if it is still relevant after the effort of exactly documenting
> the extended set of entries started.


It's relevant in the sense that we currently don't know whether
there's any actual benefit from the new entries (i.e. was it just an
issue of VPG not using the correct existing entries).


If there is, that would be motivation to investigate impact of them also 
on other workloads.




- Eero


Re: [Intel-gfx] [PATCH] drm/i915/cnp: Backlight support for CNP.

2017-05-04 Thread Jani Nikula

On Wed, 03 May 2017, Anusha Srivatsa  wrote:
> From: Rodrigo Vivi 
>
> Split out BXT and CNP's setup_backlight(), enable_backlight(),
> disable_backlight() and hz_to_pwm() into
> two separate functions instead of reusing the BXT function.
>
> Reuse set_backlight() and get_backlight() since they have
> no reference to the utility pin.
>
> v2: Reuse BXT functions with controller 0 instead of
> redefining it. (Jani).
> Use dev_priv->rawclk_freq instead of getting the value
> from SFUSE_STRAP.
> v3: Avoid setting up the backlight controller along with hooks and
> fully reuse hooks setup as suggested by Jani.
> v4: Clean up commit message.
> v5: Implement per PCH instead per platform.
>
> v6: Introduce a new function for CNP.(Jani and Ville)
>
> v7: Squash all the CNP Backlight support patches into a
> single patch. (Jani)
>
> v8: Correct indentation, remove unneeded blank lines and
> correct mail address (Jani).
>
> Reviewed-by: Jani Nikula 

Yup. What's the plan for merging the series, incl. this patch?

BR,
Jani.


> Suggested-by: Jani Nikula 
> Suggested-by: Ville Syrjala 
> Cc: Ville Syrjala 
> Cc: Jani Nikula 
> Signed-off-by: Anusha Srivatsa 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/intel_panel.c | 88 +++---
>  1 file changed, 83 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
> index 1978bec..8ee61c1 100644
> --- a/drivers/gpu/drm/i915/intel_panel.c
> +++ b/drivers/gpu/drm/i915/intel_panel.c
> @@ -796,6 +796,19 @@ static void bxt_disable_backlight(struct intel_connector *connector)
>   }
>  }
>  
> +static void cnp_disable_backlight(struct intel_connector *connector)
> +{
> + struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> + struct intel_panel *panel = &connector->panel;
> + u32 tmp, val;
> +
> + intel_panel_actually_set_backlight(connector, 0);
> +
> + tmp = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller));
> + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller),
> +tmp & ~BXT_BLC_PWM_ENABLE);
> +}
> +
>  static void pwm_disable_backlight(struct intel_connector *connector)
>  {
>   struct intel_panel *panel = &connector->panel;
> @@ -1076,6 +1089,36 @@ static void bxt_enable_backlight(struct intel_connector *connector)
>   pwm_ctl | BXT_BLC_PWM_ENABLE);
>  }
>  
> +static void cnp_enable_backlight(struct intel_connector *connector)
> +{
> + struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> + struct intel_panel *panel = &connector->panel;
> + enum pipe pipe = intel_get_pipe_from_connector(connector);
> + u32 pwm_ctl, val;
> +
> + pwm_ctl = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller));
> + if (pwm_ctl & BXT_BLC_PWM_ENABLE) {
> + DRM_DEBUG_KMS("backlight already enabled\n");
> + pwm_ctl &= ~BXT_BLC_PWM_ENABLE;
> + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller),
> +pwm_ctl);
> + }
> +
> + I915_WRITE(BXT_BLC_PWM_FREQ(panel->backlight.controller),
> +panel->backlight.max);
> +
> + intel_panel_actually_set_backlight(connector, panel->backlight.level);
> +
> + pwm_ctl = 0;
> + if (panel->backlight.active_low_pwm)
> + pwm_ctl |= BXT_BLC_PWM_POLARITY;
> +
> + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller), pwm_ctl);
> + POSTING_READ(BXT_BLC_PWM_CTL(panel->backlight.controller));
> + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller),
> +pwm_ctl | BXT_BLC_PWM_ENABLE);
> +}
> +
>  static void pwm_enable_backlight(struct intel_connector *connector)
>  {
>   struct intel_panel *panel = &connector->panel;
> @@ -1645,6 +1688,37 @@ bxt_setup_backlight(struct intel_connector *connector, 
> enum pipe unused)
>   return 0;
>  }
>  
> +static int
> +cnp_setup_backlight(struct intel_connector *connector, enum pipe unused)
> +{
> + struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
> + struct intel_panel *panel = &connector->panel;
> + u32 pwm_ctl, val;
> +
> + panel->backlight.controller = dev_priv->vbt.backlight.controller;
> +
> + pwm_ctl = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller));
> +
> + panel->backlight.active_low_pwm = pwm_ctl & BXT_BLC_PWM_POLARITY;
> + panel->backlight.max =
> + I915_READ(BXT_BLC_PWM_FREQ(panel->backlight.controller));
> +
> + if (!panel->backlight.max)
> + panel->backlight.max = get_backlight_max_vbt(connector);
> +
> + if (!panel->backlight.max)
> + return -ENODEV;
> +
> + val = bxt_get_backlight(connector);
> + val = intel_panel_compute_brightness(connector, val);
> + panel->backlight.level = clamp(val, panel->backlight.min,
> +panel->backlight.max);
> +
> + panel->backlight.enabled = pwm_ctl & BXT_

[Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Andy Shevchenko

acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
bytes. Instead we convert them to use uuid_le type. At the same time we
convert current users.

acpi_str_to_uuid() becomes useless after the conversion and it's safe to
get rid of it.

The conversion fixes a potential bug in int340x_thermal as well since
we have to use memcmp() on binary data.
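
The point about memcmp() can be illustrated with a small userspace sketch. It assumes the old int340x code compared UUIDs with a string-style function; the source does not show that code, so treat this as an illustration only. The two 16-byte values below are made up for the example and share a zero byte at index 4.

```c
#include <string.h>

/* Two made-up binary UUIDs: identical up to and including a zero byte
 * at index 4, different afterwards. */
static const unsigned char uuid_a[16] = {
	0xd3, 0x73, 0xd8, 0x7e, 0x00, 0x01, 0x4f, 0x4e,
	0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c
};
static const unsigned char uuid_b[16] = {
	0xd3, 0x73, 0xd8, 0x7e, 0x00, 0x02, 0x4f, 0x4e,
	0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c
};

/* String comparison stops at the first zero byte, so it never looks
 * at the bytes that actually differ. */
static int same_by_strncmp(const unsigned char *a, const unsigned char *b)
{
	return strncmp((const char *)a, (const char *)b, 16) == 0;
}

/* Binary comparison always inspects all 16 bytes. */
static int same_by_memcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, 16) == 0;
}
```

With these inputs same_by_strncmp() reports a match while same_by_memcmp() does not, which is the class of bug a typed UUID compare avoids.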

Cc: Rafael J. Wysocki 
Cc: Mika Westerberg 
Cc: Borislav Petkov 
Cc: Dan Williams 
Cc: Amir Goldstein 
Cc: Jarkko Sakkinen 
Cc: Jani Nikula 
Cc: Ben Skeggs 
Cc: Benjamin Tissoires 
Cc: Joerg Roedel 
Cc: Adrian Hunter 
Cc: Yisen Zhuang 
Cc: Bjorn Helgaas 
Cc: Zhang Rui 
Cc: Felipe Balbi 
Cc: Mathias Nyman 
Cc: Heikki Krogerus 
Cc: Liam Girdwood 
Cc: Mark Brown 
Signed-off-by: Andy Shevchenko 
---
 drivers/acpi/acpi_extlog.c | 10 +++---
 drivers/acpi/bus.c | 29 ++--
 drivers/acpi/nfit/core.c   | 40 +++---
 drivers/acpi/nfit/nfit.h   |  3 +-
 drivers/acpi/utils.c   |  4 +--
 drivers/char/tpm/tpm_crb.c |  9 +++--
 drivers/char/tpm/tpm_ppi.c | 20 +--
 drivers/gpu/drm/i915/intel_acpi.c  | 14 +++-
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +--
 drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c |  9 +++--
 drivers/hid/i2c-hid/i2c-hid.c  |  9 +++--
 drivers/iommu/dmar.c   | 11 +++---
 drivers/mmc/host/sdhci-pci-core.c  |  9 +++--
 drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 
 drivers/pci/pci-acpi.c | 11 +++---
 drivers/pci/pci-label.c|  4 +--
 drivers/thermal/int340x_thermal/int3400_thermal.c  |  8 ++---
 drivers/usb/dwc3/dwc3-pci.c|  6 ++--
 drivers/usb/host/xhci-pci.c|  9 +++--
 drivers/usb/misc/ucsi.c|  2 +-
 drivers/usb/typec/typec_wcove.c|  4 +--
 include/acpi/acpi_bus.h|  9 ++---
 include/linux/acpi.h   |  4 +--
 include/linux/pci-acpi.h   |  2 +-
 sound/soc/intel/skylake/skl-nhlt.c |  7 ++--
 tools/testing/nvdimm/test/iomap.c  |  2 +-
 tools/testing/nvdimm/test/nfit.c   |  2 +-
 27 files changed, 116 insertions(+), 156 deletions(-)

diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
index 502ea4dc2080..69d6140b6afa 100644
--- a/drivers/acpi/acpi_extlog.c
+++ b/drivers/acpi/acpi_extlog.c
@@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, 
unsigned long val,
 
 static bool __init extlog_get_l1addr(void)
 {
-   u8 uuid[16];
+   uuid_le uuid;
acpi_handle handle;
union acpi_object *obj;
 
-   acpi_str_to_uuid(extlog_dsm_uuid, uuid);
-
+   if (uuid_le_to_bin(extlog_dsm_uuid, &uuid))
+   return false;
if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
return false;
-   if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR))
+   if (!acpi_check_dsm(handle, &uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR))
return false;
-   obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV,
+   obj = acpi_evaluate_dsm_typed(handle, &uuid, EXTLOG_DSM_REV,
  EXTLOG_FN_ADDR, NULL, ACPI_TYPE_INTEGER);
if (!obj) {
return false;
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 784bda663d16..e8130a4873e9 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle,
pr_debug("\n");
 }
 
-acpi_status acpi_str_to_uuid(char *str, u8 *uuid)
-{
-   int i;
-   static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21,
-   24, 26, 28, 30, 32, 34};
-
-   if (strlen(str) != 36)
-   return AE_BAD_PARAMETER;
-   for (i = 0; i < 36; i++) {
-   if (i == 8 || i == 13 || i == 18 || i == 23) {
-   if (str[i] != '-')
-   return AE_BAD_PARAMETER;
-   } else if (!isxdigit(str[i]))
-   return AE_BAD_PARAMETER;
-   }
-   for (i = 0; i < 16; i++) {
-   uuid[i] = hex_to_bin(str[opc_map_to_uuid[i]]) << 4;
-   uuid[i] |= hex_to_bin(str[opc_map_to_uuid[i] + 1]);
-   }
-   return AE_OK;
-}
-EXPORT_SYMBOL_GPL(acpi_str_to_uuid);
-
 acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context)
 {
acpi_status status;
struct acpi_object_list input;
union acpi_object in_params[4];
union acpi_object *out_obj;
-   u8 uuid[16];
+   uuid_le uuid;
u32 errors

Re: [Intel-gfx] [PATCH 2/3] drm: Create a format/modifier blob

2017-05-04 Thread Daniel Stone

Hi,

On 3 May 2017 at 06:14, Ben Widawsky  wrote:
> Updated blob layout (Rob, Daniel, Kristian, xerpi)

In terms of the blob as uABI, we've got an implementation inside
Weston which works:
https://git.collabora.com/cgit/user/daniels/weston.git/commit/?h=wip/2017-04/atomic-v11-WIP&id=0a47cb63947e

That was authored by Sergi and reviewed by me. We both think it's
entirely acceptable and future-proof uABI, and it does exactly what we
want. We use it to both allocate with a suitable set of modifiers, as
well as a high-pass filter to avoid assigning FBs to planes which
won't accept the FB modifiers. So this gets my:
Acked-by: Daniel Stone 

And a future revision with the fixups found here would get my R-b.

Cheers,
Daniel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [CI] drm/i915: Use engine->context_pin() to report the intel_ring

2017-05-04 Thread Chris Wilson

Since unifying ringbuffer/execlist submission to use
engine->pin_context, we ensure that the intel_ring is available before
we start constructing the request. We can therefore move the assignment
of the request->ring to the central i915_gem_request_alloc() and not
require it in every engine->request_alloc() callback. Another small step
towards simplification (of the core, but at a cost of handling error
pointers in less important callers of engine->pin_context).

v2: Rearrange a few branches to reduce impact of PTR_ERR() on gcc's code
generation.
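
For readers unfamiliar with the error-pointer convention the patch relies on, here is a minimal userspace sketch. The ERR_PTR/IS_ERR/PTR_ERR helpers are simplified replicas of the kernel ones, and struct ring, context_pin() and request_alloc() are hypothetical stand-ins, not the i915 functions.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace replicas of the kernel helpers: a small negative
 * errno is encoded in the top page of the address space. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct ring { int id; };                 /* hypothetical */
static struct ring the_ring = { .id = 7 };

/* Hypothetical pin callback: returns the ring, or an encoded errno. */
static struct ring *context_pin(int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	return &the_ring;
}

/* Caller pattern: one return value carries both the ring and the error. */
static int request_alloc(int fail, struct ring **out)
{
	struct ring *ring = context_pin(fail);

	if (IS_ERR(ring))
		return (int)PTR_ERR(ring);

	*out = ring;
	return 0;
}
```

The trade-off named in the commit message shows up in request_alloc(): callers that only want an int must now unpack it with PTR_ERR().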

Signed-off-by: Chris Wilson 
Cc: Oscar Mateo 
Cc: Joonas Lahtinen 
Reviewed-by: Oscar Mateo 
---
 drivers/gpu/drm/i915/gvt/scheduler.c |  6 --
 drivers/gpu/drm/i915/i915_gem_request.c  |  9 ++---
 drivers/gpu/drm/i915/i915_perf.c | 13 ++---
 drivers/gpu/drm/i915/intel_engine_cs.c   |  7 ---
 drivers/gpu/drm/i915/intel_lrc.c | 17 -
 drivers/gpu/drm/i915/intel_ringbuffer.c  | 25 +
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  4 ++--
 drivers/gpu/drm/i915/selftests/mock_engine.c |  8 
 8 files changed, 47 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c 
b/drivers/gpu/drm/i915/gvt/scheduler.c
index 1256fe21850b..6ae286cb5804 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -180,6 +180,7 @@ static int dispatch_workload(struct intel_vgpu_workload 
*workload)
struct intel_engine_cs *engine = dev_priv->engine[ring_id];
struct drm_i915_gem_request *rq;
struct intel_vgpu *vgpu = workload->vgpu;
+   struct intel_ring *ring;
int ret;
 
gvt_dbg_sched("ring id %d prepare to dispatch workload %p\n",
@@ -198,8 +199,9 @@ static int dispatch_workload(struct intel_vgpu_workload 
*workload)
 * shadow_ctx pages invalid. So gvt need to pin itself. After update
 * the guest context, gvt can unpin the shadow_ctx safely.
 */
-   ret = engine->context_pin(engine, shadow_ctx);
-   if (ret) {
+   ring = engine->context_pin(engine, shadow_ctx);
+   if (IS_ERR(ring)) {
+   ret = PTR_ERR(ring);
gvt_vgpu_err("fail to pin shadow context\n");
workload->status = ret;
mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
b/drivers/gpu/drm/i915/i915_gem_request.c
index 9074303c..10361c7e3b37 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -551,6 +551,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 {
struct drm_i915_private *dev_priv = engine->i915;
struct drm_i915_gem_request *req;
+   struct intel_ring *ring;
int ret;
 
lockdep_assert_held(&dev_priv->drm.struct_mutex);
@@ -565,9 +566,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 * GGTT space, so do this first before we reserve a seqno for
 * ourselves.
 */
-   ret = engine->context_pin(engine, ctx);
-   if (ret)
-   return ERR_PTR(ret);
+   ring = engine->context_pin(engine, ctx);
+   if (IS_ERR(ring))
+   return ERR_CAST(ring);
+   GEM_BUG_ON(!ring);
 
ret = reserve_seqno(engine);
if (ret)
@@ -633,6 +635,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
req->i915 = dev_priv;
req->engine = engine;
req->ctx = ctx;
+   req->ring = ring;
 
/* No zalloc, must clear what we need by hand */
req->global_seqno = 0;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 060b171480d5..cdac68580cb1 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -744,6 +744,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream 
*stream)
 {
struct drm_i915_private *dev_priv = stream->dev_priv;
struct intel_engine_cs *engine = dev_priv->engine[RCS];
+   struct intel_ring *ring;
int ret;
 
ret = i915_mutex_lock_interruptible(&dev_priv->drm);
@@ -755,9 +756,10 @@ static int oa_get_render_ctx_id(struct i915_perf_stream 
*stream)
 *
 * NB: implied RCS engine...
 */
-   ret = engine->context_pin(engine, stream->ctx);
-   if (ret)
-   goto unlock;
+   ring = engine->context_pin(engine, stream->ctx);
+   mutex_unlock(&dev_priv->drm.struct_mutex);
+   if (IS_ERR(ring))
+   return PTR_ERR(ring);
 
/* Explicitly track the ID (instead of calling i915_ggtt_offset()
 * on the fly) considering the difference with gen8+ and
@@ -766,10 +768,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream 
*stream)
dev_priv->perf.oa.specific_ctx_id =
i915_ggtt_offset(stream->ctx->engine[engine->id].state);
 
-unlock:
-   mutex_unlock(&dev_priv-

Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread Tvrtko Ursulin



On 04/05/2017 10:21, Eero Tamminen wrote:

Hi,

On 04.05.2017 11:53, Tvrtko Ursulin wrote:

On 04/05/2017 09:35, Arkadiusz Hiler wrote:

On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:

But what is being counter suggested is that there is no reason for these
mocs entries. If the sdk is just using mocs registers without first
programming them outside of the kernel abi, then it will be hitting
uncached memory - and then the only benefit is from simply enabling
cached access. The kernel ABI is minimalist for a reason, and we want to
know why we should be adding tables that we need to maintain forever
(bonus points for making that a consistent interface for hardware for
years to come).
-Chris


Thanks for rephrasing - that's exactly what I am concerned with.

Did you just use the MediaSDK as it is - meaning that MOCS entries
beyond the set of the 3 we have defined had been naively utilized?

If that's the case it is probably the cause of the performance
difference - everything beyond "the 3" means UNCACHED.

Can you try changing MediaSDK to only use entries that are already in?
How does the performance differ in that case?


Alternatively, at the time this was on my plate, Eero had suggested a
sequence of experiments by basically gradually replicating the default
UC/WB entries to currently empty slots, starting on GT2 parts and then
going forward adding the more fine tuned parts.

This would have shown the benefit of fine tuned entries vs basic cached
ones. Unfortunately I never got round to doing this, but it sounded like a
really good approach to me.

I could paste these suggestions here if Eero wouldn't mind?


Of course I don't mind. :-)


Excellent, so here is what you wrote to me at that time:

--
You could start by putting first ED_UC line values to other ED_UC lines, 
and the first ED_WB line values to other ED_WB lines.


Then test that against standard kernel and VPG kernel on SKL GT2 
machine, to evaluate LLC settings.


If perf of that looks good, then test same settings also on SKL GT3e, or 
GT4e to evaluate impact of the more fine-tuned eLLC settings in addition 
to LLC ones.


If GT2 results don't look good, try using ED_WB line for all lines that 
have either ED_WB or L3_WB.


If that doesn't look good either, try using ED_UC line for all lines 
that have either ED_UC or L3_UC.


And if even that fails to produce performance-wise good results, we can 
conclude that VPG kernel's fine-tuned MOCS settings are really needed.


Please provide some spreadsheet of the results you get.

(My guess is that the first settings provide almost all of the 
available speedup on GT2, but with eDRAM things aren't that 
straightforward.)

--




But I am also
not sure if it is still relevant after the effort of exactly documenting
the extended set of entries started.


It's relevant in the sense that we currently don't know whether
there's any actual benefit from the new entries (i.e. was it just an
issue of VPG not using the correct existing entries).

If there is, that would be motivation to investigate impact of them also
on other workloads.


There probably is a benefit since it is hard to imagine fine tuned 
entries would otherwise exist. But I agree it makes sense to get a 
complete understanding of relative contribution of individual fine tunings.


Regards,

Tvrtko

Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Jani Nikula

On Thu, 04 May 2017, Andy Shevchenko  wrote:
> diff --git a/drivers/gpu/drm/i915/intel_acpi.c 
> b/drivers/gpu/drm/i915/intel_acpi.c
> index eb638a1e69d2..72bfe6ceadf8 100644
> --- a/drivers/gpu/drm/i915/intel_acpi.c
> +++ b/drivers/gpu/drm/i915/intel_acpi.c
> @@ -15,13 +15,9 @@ static struct intel_dsm_priv {
>   acpi_handle dhandle;
>  } intel_dsm_priv;
>  
> -static const u8 intel_dsm_guid[] = {
> - 0xd3, 0x73, 0xd8, 0x7e,
> - 0xd0, 0xc2,
> - 0x4f, 0x4e,
> - 0xa8, 0x54,
> - 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c
> -};
> +static const uuid_le intel_dsm_guid =
> + UUID_LE(0x7ed873d3, 0xc2d0, 0x4e4f,
> + 0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c);
>  
>  static char *intel_dsm_port_name(u8 id)
>  {
> @@ -80,7 +76,7 @@ static void intel_dsm_platform_mux_info(void)
>   int i;
>   union acpi_object *pkg, *connector_count;
>  
> - pkg = acpi_evaluate_dsm_typed(intel_dsm_priv.dhandle, intel_dsm_guid,
> + pkg = acpi_evaluate_dsm_typed(intel_dsm_priv.dhandle, &intel_dsm_guid,
>   INTEL_DSM_REVISION_ID, INTEL_DSM_FN_PLATFORM_MUX_INFO,
>   NULL, ACPI_TYPE_PACKAGE);
>   if (!pkg) {
> @@ -118,7 +114,7 @@ static bool intel_dsm_pci_probe(struct pci_dev *pdev)
>   if (!dhandle)
>   return false;
>  
> - if (!acpi_check_dsm(dhandle, intel_dsm_guid, INTEL_DSM_REVISION_ID,
> + if (!acpi_check_dsm(dhandle, &intel_dsm_guid, INTEL_DSM_REVISION_ID,
>   1 << INTEL_DSM_FN_PLATFORM_MUX_INFO)) {
>   DRM_DEBUG_KMS("no _DSM method for intel device\n");
>   return false;

The drm/i915 hunk above is

Reviewed-by: Jani Nikula 

and acked for merging via whichever tree is suitable.


BR,
Jani.

-- 
Jani Nikula, Intel Open Source Technology Center
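
One way to check the i915 hunk is to confirm that the UUID_LE() form yields the same 16 bytes as the removed array. The sketch below does that in userspace. The typedef and macro are replicas written from my understanding of include/uapi/linux/uuid.h and should be treated as assumptions; guid_layout_matches() is a hypothetical helper.

```c
#include <string.h>

/* Userspace replica of the kernel type and macro (assumed layout). */
typedef struct { unsigned char b[16]; } uuid_le;

#define UUID_LE(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)		\
((uuid_le)								\
{{ (a) & 0xff, ((a) >> 8) & 0xff, ((a) >> 16) & 0xff, ((a) >> 24) & 0xff, \
   (b) & 0xff, ((b) >> 8) & 0xff,					\
   (c) & 0xff, ((c) >> 8) & 0xff,					\
   (d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }})

/* Returns 1 when the macro form reproduces the removed byte array. */
static int guid_layout_matches(void)
{
	/* Bytes of the array removed by the hunk. */
	static const unsigned char old_bytes[16] = {
		0xd3, 0x73, 0xd8, 0x7e,
		0xd0, 0xc2,
		0x4f, 0x4e,
		0xa8, 0x54,
		0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c
	};
	/* Value added by the hunk; the first three fields are stored
	 * least significant byte first. */
	const uuid_le new_guid =
		UUID_LE(0x7ed873d3, 0xc2d0, 0x4e4f,
			0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c);

	return memcmp(old_bytes, new_guid.b, sizeof(old_bytes)) == 0;
}
```

The first three fields read "backwards" relative to the old array because they are little-endian; the trailing eight bytes are copied in order.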

[Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread Chris Wilson

A good default for garbage entries from the user is to follow the
default setting of the object (i.e. the PTE). Currently they use the
uncached entry, and now the only way to accidentally hit uncached
performance is via explicit use of the uncached MOCS or setting the
object to uncached. Note that these entries are currently undefined in
the ABI and we reserve the right to change them. We originally chose
uncached to eliminate any problem with reducing the caching level in
future, but the object is a much better definition of the minimum
caching level.
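
The idea can be sketched in a few lines of standalone C. Everything below is illustrative: the enum names, the table contents and the register count of 62 are assumptions made for the example, and the register file is a plain array instead of MMIO.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed indices of the three defined entries, and an assumed number
 * of programmable registers. */
enum {
	MOCS_UNCACHED = 0,
	MOCS_PTE      = 1,
	MOCS_CACHED   = 2,
	NUM_DEFINED   = 3,
	NUM_MOCS_REGS = 62
};

/* Placeholder control values, not real hardware encodings. */
static const uint32_t defined_table[NUM_DEFINED] = { 0x10, 0x20, 0x30 };

/* Program the defined entries, then point every undefined slot at the
 * "follow the PTE" entry instead of the uncached one. */
static void program_mocs(uint32_t regs[NUM_MOCS_REGS])
{
	size_t i;

	for (i = 0; i < NUM_DEFINED; i++)
		regs[i] = defined_table[i];

	for (; i < NUM_MOCS_REGS; i++)
		regs[i] = defined_table[MOCS_PTE];
}
```

A user that picks an undefined index then gets whatever caching the object itself asked for, not an unconditional uncached access.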

Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson 
Cc: David Weinehall 
Cc: Arkadiusz Hiler 
Cc: Tvrtko Ursulin 
Cc: sta...@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_mocs.c | 39 +++
 1 file changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_mocs.c 
b/drivers/gpu/drm/i915/intel_mocs.c
index 92e461c68385..e7a7781ca457 100644
--- a/drivers/gpu/drm/i915/intel_mocs.c
+++ b/drivers/gpu/drm/i915/intel_mocs.c
@@ -85,10 +85,7 @@ struct drm_i915_mocs_table {
  *
  * Entries not part of the following tables are undefined as far as
  * userspace is concerned and shouldn't be relied upon.  For the time
- * being they will be implicitly initialized to the strictest caching
- * configuration (uncached) to guarantee forwards compatibility with
- * userspace programs written against more recent kernels providing
- * additional MOCS entries.
+ * being they will be implicitly initialized to follow the PTE.
  *
  * NOTE: These tables MUST start with being uncached and the length
  *   MUST be less than 63 as the last two registers are reserved
@@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs *engine)
   table.table[index].control_value);
 
/*
-* Ok, now set the unused entries to uncached. These entries
+* Ok, now set the unused entries to follow the PTE. These entries
 * are officially undefined and no contract for the contents
 * and settings is given for these entries.
-*
-* Entry 0 in the table is uncached - so we are just writing
-* that value to all the used entries.
 */
for (; index < GEN9_NUM_MOCS_ENTRIES; index++)
I915_WRITE(mocs_register(engine->id, index),
-  table.table[0].control_value);
+  table.table[I915_MOCS_PTE].control_value);
 
return 0;
 }
@@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct 
drm_i915_gem_request *req,
}
 
/*
-* Ok, now set the unused entries to uncached. These entries
+* Ok, now set the unused entries to follow the PTE. These entries
 * are officially undefined and no contract for the contents
 * and settings is given for these entries.
-*
-* Entry 0 in the table is uncached - so we are just writing
-* that value to all the used entries.
 */
for (; index < GEN9_NUM_MOCS_ENTRIES; index++) {
*cs++ = i915_mmio_reg_offset(mocs_register(engine, index));
-   *cs++ = table->table[0].control_value;
+   *cs++ = table->table[I915_MOCS_PTE].control_value;
}
 
*cs++ = MI_NOOP;
@@ -355,18 +346,17 @@ static int emit_mocs_l3cc_table(struct 
drm_i915_gem_request *req,
if (table->size & 0x01) {
/* Odd table size - 1 left over */
*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
-   *cs++ = l3cc_combine(table, 2 * i, 0);
+   *cs++ = l3cc_combine(table, 2 * i, I915_MOCS_PTE);
i++;
}
 
/*
-* Now set the rest of the table to uncached - use entry 0 as
-* this will be uncached. Leave the last pair uninitialised as
-* they are reserved by the hardware.
+* Now set the rest of the table to follow the PTE.
+* Leave the last pair as they are reserved by the hardware.
 */
for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) {
*cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
-   *cs++ = l3cc_combine(table, 0, 0);
+   *cs++ = l3cc_combine(table, I915_MOCS_PTE, I915_MOCS_PTE);
}
 
*cs++ = MI_NOOP;
@@ -402,17 +392,18 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private 
*dev_priv)
 
/* Odd table size - 1 left over */
if (table.size & 0x01) {
-   I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0));
+   I915_WRITE(GEN9_LNCFCMOCS(i),
+  l3cc_combine(&table, 2*i, I915_MOCS_PTE));
i++;
}
 
/*
-* Now set the rest of the table to uncached - use entry 0 as
-* this will be uncached. Leave the last pair as initialised as
-* they are reserved by the hardware.
+* Now set the rest of th

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Use engine->context_pin() to report the intel_ring (rev2)

2017-05-04 Thread Patchwork

== Series Details ==

Series: drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
URL   : https://patchwork.freedesktop.org/series/23884/
State : success

== Summary ==

Series 23884v2 drm/i915: Use engine->context_pin() to report the intel_ring
https://patchwork.freedesktop.org/api/1.0/series/23884/revisions/2/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:431s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:431s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:580s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:512s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:557s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:487s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:486s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:418s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:465s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:567s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:447s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:567s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:458s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:488s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:429s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:398s

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
2cdbe9f drm/i915: Use engine->context_pin() to report the intel_ring

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4617/

Re: [Intel-gfx] [PATCH 2/5] drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp

2017-05-04 Thread kbuild test robot

Hi Daniel,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on next-20170503]
[cannot apply to v4.11]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Daniel-Vetter/vblanke-cleanup-resend/20170504-003948
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=arm 

Note: the linux-review/Daniel-Vetter/vblanke-cleanup-resend/20170504-003948 
HEAD 7d42e23d7949707be44be8720a9eb260534aa4dc builds fine.
  It only hurts bisectability.

All errors (new ones prefixed by >>):

   drivers/gpu//drm/vc4/vc4_crtc.c: In function 'vc4_crtc_get_scanoutpos':
>> drivers/gpu//drm/vc4/vc4_crtc.c:235:6: error: 'in_vblank_irq' undeclared (first use in this function)
     if (in_vblank_irq) {
         ^
   drivers/gpu//drm/vc4/vc4_crtc.c:235:6: note: each undeclared identifier is reported only once for each function it appears in

vim +/in_vblank_irq +235 drivers/gpu//drm/vc4/vc4_crtc.c

   229   * We can't get meaningful readings wrt. scanline position of 
the PV
   230   * and need to make things up in a approximative but consistent 
way.
   231   */
   232  ret |= DRM_SCANOUTPOS_IN_VBLANK;
   233  vblank_lines = mode->vtotal - mode->vdisplay;
   234  
 > 235  if (in_vblank_irq) {
   236  /*
   237   * Assume the irq handler got called close to first
   238   * line of vblank, so PV has about a full vblank

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread Patchwork

== Series Details ==

Series: drm/i915: Set all undefined MOCS entries to follow PTE
URL   : https://patchwork.freedesktop.org/series/23941/
State : success

== Summary ==

Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:574s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:504s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:546s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:480s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:484s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:469s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:463s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:573s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:464s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:572s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:465s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:499s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:428s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:532s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:402s
fi-bdw-gvtdvm failed to collect. IGT log at Patchwork_4618/fi-bdw-gvtdvm/igt.log

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
901f6cd drm/i915: Set all undefined MOCS entries to follow PTE

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4618/

[Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Chris Wilson

hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not
wait forever for stuck hardware. However, it is called from an
irq-disabled context which prevents jiffie from advancing and so the
loop doesn't terminate if the hardware fails. This can then cause NMI
watchdog warnings, such as:

NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi 
x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul 
snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e 
snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp mei_me prime_numbers pps_core 
mei lpc_ich i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: 
i915]
irq event stamp: 13366
hardirqs last  enabled at (13365): [] 
_raw_spin_unlock_irq+0x27/0x50
hardirqs last disabled at (13366): [] 
_raw_spin_lock_irq+0x12/0x50
softirqs last  enabled at (12744): [] 
__do_softirq+0x1d9/0x4c0
softirqs last disabled at (12721): [] irq_exit+0xa9/0xc0
CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U  
4.11.0-rc4-CI-CI_DRM_319+ #1
Hardware name:  /NUC5i5RYB, BIOS 
RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
Workqueue: events_unbound async_run_entry_fn
task: 88024cd32740 task.stack: c9000162c000
RIP: 0010:preempt_count_add+0xe/0xc0
RSP: 0018:c9000162fbd8 EFLAGS: 0082
RAX: 8001 RBX: 000704b96558 RCX: 0002
RDX:  RSI: 81c74f2d RDI: 0001
RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
R13: 3ea0 R14: 0003 R15: a00061f0
FS:  () GS:880256d8() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
Call Trace:
 ? delay_tsc+0x3d/0xc0
 __delay+0xa/0x10
 __const_udelay+0x31/0x40
 snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
 ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
 snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
 azx_stop_chip+0x9/0x10 [snd_hda_codec]
 azx_suspend+0x72/0x220 [snd_hda_intel]
 pci_pm_suspend+0x71/0x140
 dpm_run_callback+0x6f/0x330
 ? pci_pm_freeze+0xe0/0xe0
 __device_suspend+0xf9/0x370
 ? dpm_watchdog_set+0x60/0x60
 async_suspend+0x1a/0x90
 async_run_entry_fn+0x34/0x160
 process_one_work+0x1f4/0x6d0
 ? process_one_work+0x16e/0x6d0
 worker_thread+0x49/0x4a0
 kthread+0x107/0x140
 ? process_one_work+0x6d0/0x6d0
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x2e/0x40

Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
Signed-off-by: Chris Wilson 
Cc: Jeeja KP 
Cc: Vinod Koul 
Cc: Takashi Iwai 
Cc:  # v4.7+
---
 sound/hda/hdac_controller.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c
index ee08c389b4d6..7f8806b03982 100644
--- a/sound/hda/hdac_controller.c
+++ b/sound/hda/hdac_controller.c
@@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus)
 {
 	unsigned long timeout;
 
-	timeout = jiffies + msecs_to_jiffies(100);
+	timeout = 100 * 100; /* 100ms */
 	while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN)
-		&& time_before(jiffies, timeout))
+	       && timeout--)
 		udelay(10);
 
-	timeout = jiffies + msecs_to_jiffies(100);
+	timeout = 100 * 100; /* 100ms */
 	while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN)
-		&& time_before(jiffies, timeout))
+	       && timeout--)
 		udelay(10);
 }
 
-- 
2.11.0
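
The counter-based timeout in the patch above can be illustrated outside the kernel. The following is only a user-space sketch under stated assumptions: dma_still_enabled() and fake_udelay() are hypothetical stand-ins for snd_hdac_chip_readb() and udelay(), and the "hardware" is modelled as permanently stuck. It shows that a loop bounded by an iteration count terminates after a fixed delay budget without consulting any clock that needs interrupts to advance.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the status register read: the bit never clears. */
static bool dma_still_enabled(void)
{
	return true;
}

/* Hypothetical stand-in for udelay(): only records the requested delay. */
static unsigned long elapsed_us;

static void fake_udelay(unsigned long us)
{
	elapsed_us += us;
}

/*
 * Poll until the bit clears or the delay budget is used up.
 * Returns true if the bit cleared, false on timeout.
 */
static bool wait_for_dma_stop(unsigned long budget_us, unsigned long step_us)
{
	unsigned long tries = budget_us / step_us;

	while (dma_still_enabled()) {
		if (tries-- == 0)
			return false;	/* budget exhausted, give up */
		fake_udelay(step_us);
	}
	return true;
}
```

With a 100ms budget and 10us steps this performs 10000 delays and then gives up, which matches the "100 * 100" iteration count used in the patch.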

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Takashi Iwai

On Thu, 04 May 2017 12:18:29 +0200,
Chris Wilson wrote:
> 
> hdac_wait_for_cmd_dmas() uses a jiffies timeout to ensure that we do not
> wait forever for stuck hardware. However, it is called from an
> irq-disabled context, which prevents jiffies from advancing, and so the
> loop doesn't terminate if the hardware fails. This can then cause NMI
> watchdog warnings, such as:
> 
> NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
> Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi 
> x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul 
> snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e 
> snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp mei_me prime_numbers 
> pps_core mei lpc_ich i2c_hid i2c_designware_platform i2c_designware_core 
> [last unloaded: i915]
> irq event stamp: 13366
> hardirqs last  enabled at (13365): [] 
> _raw_spin_unlock_irq+0x27/0x50
> hardirqs last disabled at (13366): [] 
> _raw_spin_lock_irq+0x12/0x50
> softirqs last  enabled at (12744): [] 
> __do_softirq+0x1d9/0x4c0
> softirqs last disabled at (12721): [] irq_exit+0xa9/0xc0
> CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U  
> 4.11.0-rc4-CI-CI_DRM_319+ #1
> Hardware name:  /NUC5i5RYB, BIOS 
> RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
> Workqueue: events_unbound async_run_entry_fn
> task: 88024cd32740 task.stack: c9000162c000
> RIP: 0010:preempt_count_add+0xe/0xc0
> RSP: 0018:c9000162fbd8 EFLAGS: 0082
> RAX: 8001 RBX: 000704b96558 RCX: 0002
> RDX:  RSI: 81c74f2d RDI: 0001
> RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
> R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
> R13: 3ea0 R14: 0003 R15: a00061f0
> FS:  () GS:880256d8() 
> knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
> Call Trace:
>  ? delay_tsc+0x3d/0xc0
>  __delay+0xa/0x10
>  __const_udelay+0x31/0x40
>  snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
>  ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
>  snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
>  azx_stop_chip+0x9/0x10 [snd_hda_codec]
>  azx_suspend+0x72/0x220 [snd_hda_intel]
>  pci_pm_suspend+0x71/0x140
>  dpm_run_callback+0x6f/0x330
>  ? pci_pm_freeze+0xe0/0xe0
>  __device_suspend+0xf9/0x370
>  ? dpm_watchdog_set+0x60/0x60
>  async_suspend+0x1a/0x90
>  async_run_entry_fn+0x34/0x160
>  process_one_work+0x1f4/0x6d0
>  ? process_one_work+0x16e/0x6d0
>  worker_thread+0x49/0x4a0
>  kthread+0x107/0x140
>  ? process_one_work+0x6d0/0x6d0
>  ? kthread_create_on_node+0x40/0x40
>  ret_from_fork+0x2e/0x40
> 
> Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
> Signed-off-by: Chris Wilson 
> Cc: Jeeja KP 
> Cc: Vinod Koul 
> Cc: Takashi Iwai 
> Cc:  # v4.7+

Any reason to submit a different fix from what's attached in the
bugzilla you mentioned?


thanks,

Takashi


> ---
>  sound/hda/hdac_controller.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c
> index ee08c389b4d6..7f8806b03982 100644
> --- a/sound/hda/hdac_controller.c
> +++ b/sound/hda/hdac_controller.c
> @@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus)
>  {
>   unsigned long timeout;
>  
> - timeout = jiffies + msecs_to_jiffies(100);
> + timeout = 100 * 100; /* 100ms */
>   while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN)
> - && time_before(jiffies, timeout))
> +&& timeout--)
>   udelay(10);
>  
> - timeout = jiffies + msecs_to_jiffies(100);
> + timeout = 100 * 100; /* 100ms */
>   while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN)
> - && time_before(jiffies, timeout))
> +&& timeout--)
>   udelay(10);
>  }
>  
> -- 
> 2.11.0
> 

Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Vinod Koul

On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote:
> On Thu, 04 May 2017 12:18:29 +0200,
> Chris Wilson wrote:
> > 
> > hdac_wait_for_cmd_dmas() uses a jiffies timeout to ensure that we do not
> > wait forever for stuck hardware. However, it is called from an
> > irq-disabled context, which prevents jiffies from advancing, and so the
> > loop doesn't terminate if the hardware fails. This can then cause NMI
> > watchdog warnings, such as:
> > 
> > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
> > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi 
> > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul 
> > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic 
> > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp 
> > mei_me prime_numbers pps_core mei lpc_ich i2c_hid i2c_designware_platform 
> > i2c_designware_core [last unloaded: i915]
> > irq event stamp: 13366
> > hardirqs last  enabled at (13365): [] 
> > _raw_spin_unlock_irq+0x27/0x50
> > hardirqs last disabled at (13366): [] 
> > _raw_spin_lock_irq+0x12/0x50
> > softirqs last  enabled at (12744): [] 
> > __do_softirq+0x1d9/0x4c0
> > softirqs last disabled at (12721): [] 
> > irq_exit+0xa9/0xc0
> > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U  
> > 4.11.0-rc4-CI-CI_DRM_319+ #1
> > Hardware name:  /NUC5i5RYB, BIOS 
> > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
> > Workqueue: events_unbound async_run_entry_fn
> > task: 88024cd32740 task.stack: c9000162c000
> > RIP: 0010:preempt_count_add+0xe/0xc0
> > RSP: 0018:c9000162fbd8 EFLAGS: 0082
> > RAX: 8001 RBX: 000704b96558 RCX: 0002
> > RDX:  RSI: 81c74f2d RDI: 0001
> > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
> > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
> > R13: 3ea0 R14: 0003 R15: a00061f0
> > FS:  () GS:880256d8() 
> > knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
> > Call Trace:
> >  ? delay_tsc+0x3d/0xc0
> >  __delay+0xa/0x10
> >  __const_udelay+0x31/0x40
> >  snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
> >  ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
> >  snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
> >  azx_stop_chip+0x9/0x10 [snd_hda_codec]
> >  azx_suspend+0x72/0x220 [snd_hda_intel]
> >  pci_pm_suspend+0x71/0x140
> >  dpm_run_callback+0x6f/0x330
> >  ? pci_pm_freeze+0xe0/0xe0
> >  __device_suspend+0xf9/0x370
> >  ? dpm_watchdog_set+0x60/0x60
> >  async_suspend+0x1a/0x90
> >  async_run_entry_fn+0x34/0x160
> >  process_one_work+0x1f4/0x6d0
> >  ? process_one_work+0x16e/0x6d0
> >  worker_thread+0x49/0x4a0
> >  kthread+0x107/0x140
> >  ? process_one_work+0x6d0/0x6d0
> >  ? kthread_create_on_node+0x40/0x40
> >  ret_from_fork+0x2e/0x40
> > 
> > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
> > Signed-off-by: Chris Wilson 
> > Cc: Jeeja KP 
> > Cc: Vinod Koul 
> > Cc: Takashi Iwai 
> > Cc:  # v4.7+
> 
> Any reason to submit a different fix from what's attached in the
> bugzilla you mentioned?

Probably a race between them :)

Jeeja talked to me earlier today and uploaded the patch where we drop the
locks and still use jiffies.

Takashi,
Do you prefer dropping locks or using loop?

> > ---
> >  sound/hda/hdac_controller.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c
> > index ee08c389b4d6..7f8806b03982 100644
> > --- a/sound/hda/hdac_controller.c
> > +++ b/sound/hda/hdac_controller.c
> > @@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus)
> >  {
> > 	unsigned long timeout;
> >  
> > -	timeout = jiffies + msecs_to_jiffies(100);
> > +	timeout = 100 * 100; /* 100ms */
> > 	while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN)
> > -		&& time_before(jiffies, timeout))
> > +	       && timeout--)
> > 		udelay(10);
> >  
> > -	timeout = jiffies + msecs_to_jiffies(100);
> > +	timeout = 100 * 100; /* 100ms */
> > 	while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN)
> > -		&& time_before(jiffies, timeout))
> > +	       && timeout--)
> > 		udelay(10);
> >  }
> >  
> > -- 
> > 2.11.0
> > 

-- 
~Vinod

Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Takashi Iwai

On Thu, 04 May 2017 12:30:32 +0200,
Vinod Koul wrote:
> 
> On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote:
> > On Thu, 04 May 2017 12:18:29 +0200,
> > Chris Wilson wrote:
> > > 
> > > hdac_wait_for_cmd_dmas() uses a jiffies timeout to ensure that we do not
> > > wait forever for stuck hardware. However, it is called from an
> > > irq-disabled context, which prevents jiffies from advancing, and so the
> > > loop doesn't terminate if the hardware fails. This can then cause NMI
> > > watchdog warnings, such as:
> > > 
> > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
> > > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi 
> > > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul 
> > > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic 
> > > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm 
> > > ptp mei_me prime_numbers pps_core mei lpc_ich i2c_hid 
> > > i2c_designware_platform i2c_designware_core [last unloaded: i915]
> > > irq event stamp: 13366
> > > hardirqs last  enabled at (13365): [] 
> > > _raw_spin_unlock_irq+0x27/0x50
> > > hardirqs last disabled at (13366): [] 
> > > _raw_spin_lock_irq+0x12/0x50
> > > softirqs last  enabled at (12744): [] 
> > > __do_softirq+0x1d9/0x4c0
> > > softirqs last disabled at (12721): [] 
> > > irq_exit+0xa9/0xc0
> > > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U  
> > > 4.11.0-rc4-CI-CI_DRM_319+ #1
> > > Hardware name:  /NUC5i5RYB, BIOS 
> > > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
> > > Workqueue: events_unbound async_run_entry_fn
> > > task: 88024cd32740 task.stack: c9000162c000
> > > RIP: 0010:preempt_count_add+0xe/0xc0
> > > RSP: 0018:c9000162fbd8 EFLAGS: 0082
> > > RAX: 8001 RBX: 000704b96558 RCX: 0002
> > > RDX:  RSI: 81c74f2d RDI: 0001
> > > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
> > > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
> > > R13: 3ea0 R14: 0003 R15: a00061f0
> > > FS:  () GS:880256d8() 
> > > knlGS:
> > > CS:  0010 DS:  ES:  CR0: 80050033
> > > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
> > > Call Trace:
> > >  ? delay_tsc+0x3d/0xc0
> > >  __delay+0xa/0x10
> > >  __const_udelay+0x31/0x40
> > >  snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
> > >  ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
> > >  snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
> > >  azx_stop_chip+0x9/0x10 [snd_hda_codec]
> > >  azx_suspend+0x72/0x220 [snd_hda_intel]
> > >  pci_pm_suspend+0x71/0x140
> > >  dpm_run_callback+0x6f/0x330
> > >  ? pci_pm_freeze+0xe0/0xe0
> > >  __device_suspend+0xf9/0x370
> > >  ? dpm_watchdog_set+0x60/0x60
> > >  async_suspend+0x1a/0x90
> > >  async_run_entry_fn+0x34/0x160
> > >  process_one_work+0x1f4/0x6d0
> > >  ? process_one_work+0x16e/0x6d0
> > >  worker_thread+0x49/0x4a0
> > >  kthread+0x107/0x140
> > >  ? process_one_work+0x6d0/0x6d0
> > >  ? kthread_create_on_node+0x40/0x40
> > >  ret_from_fork+0x2e/0x40
> > > 
> > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
> > > Signed-off-by: Chris Wilson 
> > > Cc: Jeeja KP 
> > > Cc: Vinod Koul 
> > > Cc: Takashi Iwai 
> > > Cc:  # v4.7+
> > 
> > Any reason to submit a different fix from what's attached in the
> > bugzilla you mentioned?
> 
> Probably a race between them :)
> 
> Jeeja talked to me earlier today and uploaded the patch where we drop the
> locks and still use jiffies.
> 
> Takashi,
> Do you prefer dropping locks or using loop?

I prefer dropping the lock.


thanks,

Takashi

[Intel-gfx] ✓ Fi.CI.BAT: success for ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Patchwork

== Series Details ==

Series: ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
URL   : https://patchwork.freedesktop.org/series/23948/
State : success

== Summary ==

Series 23948v1 ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
https://patchwork.freedesktop.org/api/1.0/series/23948/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u   total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:424s
fi-bdw-gvtdvm  total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:428s
fi-bsw-n3050   total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:580s
fi-bxt-j4205   total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:505s
fi-bxt-t5700   total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:543s
fi-byt-j1900   total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:485s
fi-byt-n2820   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:482s
fi-hsw-4770    total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:410s
fi-hsw-4770r   total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650     total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:418s
fi-ivb-3520m   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:488s
fi-ivb-3770    total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:472s
fi-kbl-7500u   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:460s
fi-kbl-7560u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:562s
fi-skl-6260u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:459s
fi-skl-6700hq  total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:570s
fi-skl-6700k   total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:463s
fi-skl-6770hq  total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:483s
fi-skl-gvtdvm  total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600    total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:409s

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
5ebe8bb ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4619/

Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote:
> On Thu, 04 May 2017 12:18:29 +0200,
> Chris Wilson wrote:
> > 
> > hdac_wait_for_cmd_dmas() uses a jiffies timeout to ensure that we do not
> > wait forever for stuck hardware. However, it is called from an
> > irq-disabled context, which prevents jiffies from advancing, and so the
> > loop doesn't terminate if the hardware fails. This can then cause NMI
> > watchdog warnings, such as:
> > 
> > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
> > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi 
> > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul 
> > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic 
> > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp 
> > mei_me prime_numbers pps_core mei lpc_ich i2c_hid i2c_designware_platform 
> > i2c_designware_core [last unloaded: i915]
> > irq event stamp: 13366
> > hardirqs last  enabled at (13365): [] 
> > _raw_spin_unlock_irq+0x27/0x50
> > hardirqs last disabled at (13366): [] 
> > _raw_spin_lock_irq+0x12/0x50
> > softirqs last  enabled at (12744): [] 
> > __do_softirq+0x1d9/0x4c0
> > softirqs last disabled at (12721): [] 
> > irq_exit+0xa9/0xc0
> > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U  
> > 4.11.0-rc4-CI-CI_DRM_319+ #1
> > Hardware name:  /NUC5i5RYB, BIOS 
> > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
> > Workqueue: events_unbound async_run_entry_fn
> > task: 88024cd32740 task.stack: c9000162c000
> > RIP: 0010:preempt_count_add+0xe/0xc0
> > RSP: 0018:c9000162fbd8 EFLAGS: 0082
> > RAX: 8001 RBX: 000704b96558 RCX: 0002
> > RDX:  RSI: 81c74f2d RDI: 0001
> > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
> > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
> > R13: 3ea0 R14: 0003 R15: a00061f0
> > FS:  () GS:880256d8() 
> > knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
> > Call Trace:
> >  ? delay_tsc+0x3d/0xc0
> >  __delay+0xa/0x10
> >  __const_udelay+0x31/0x40
> >  snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
> >  ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
> >  snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
> >  azx_stop_chip+0x9/0x10 [snd_hda_codec]
> >  azx_suspend+0x72/0x220 [snd_hda_intel]
> >  pci_pm_suspend+0x71/0x140
> >  dpm_run_callback+0x6f/0x330
> >  ? pci_pm_freeze+0xe0/0xe0
> >  __device_suspend+0xf9/0x370
> >  ? dpm_watchdog_set+0x60/0x60
> >  async_suspend+0x1a/0x90
> >  async_run_entry_fn+0x34/0x160
> >  process_one_work+0x1f4/0x6d0
> >  ? process_one_work+0x16e/0x6d0
> >  worker_thread+0x49/0x4a0
> >  kthread+0x107/0x140
> >  ? process_one_work+0x6d0/0x6d0
> >  ? kthread_create_on_node+0x40/0x40
> >  ret_from_fork+0x2e/0x40
> > 
> > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
> > Signed-off-by: Chris Wilson 
> > Cc: Jeeja KP 
> > Cc: Vinod Koul 
> > Cc: Takashi Iwai 
> > Cc:  # v4.7+
> 
> Any reason to submit a different fix from what's attached in the
> bugzilla you mentioned?

Because I didn't see it when Marta complained on IRC and suggested
reverting 38b19ed7f81e. There's no advantage either way, but even after
fixing the timeout detection we are still left with the issue that the
hw is stuck and we suffer a 200ms suspend delay. :|
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Calculate vlv/chv intermediate watermarks correctly, v2.

2017-05-04 Thread Ville Syrjälä

On Thu, May 04, 2017 at 10:12:52AM +0200, Maarten Lankhorst wrote:
> Op 03-05-17 om 20:03 schreef Ville Syrjälä:
> > On Wed, May 03, 2017 at 06:18:46PM +0200, Maarten Lankhorst wrote:
> >> Op 03-05-17 om 18:07 schreef Ville Syrjälä:
> >>> On Wed, May 03, 2017 at 05:53:34PM +0200, Maarten Lankhorst wrote:
>  Op 03-05-17 om 16:11 schreef Ville Syrjälä:
> > On Wed, May 03, 2017 at 04:06:37PM +0200, Maarten Lankhorst wrote:
> >> Op 03-05-17 om 15:45 schreef Ville Syrjälä:
> >>> On Mon, May 01, 2017 at 03:34:34PM +0200, Maarten Lankhorst wrote:
>  The watermarks it should calculate against are the old optimal 
>  watermarks.
>  The currently active crtc watermarks are pure fiction, and are 
>  invalid in
>  case of a nonblocking modeset, page flip enabling/disabling planes 
>  or any
>  other reason.
> 
>  When the crtc is disabled or during a modeset the intermediate 
>  watermarks
>  don't need to be programmed separately, and could be directly 
>  assigned
>  to the optimal watermarks.
> 
>  Also rename crtc_state to new_crtc_state, to distinguish it from the 
>  old state.
> 
>  Changes since v1:
>  - Use intel_atomic_get_old_crtc_state. (ville)
> 
>  Signed-off-by: Maarten Lankhorst 
>  ---
>   drivers/gpu/drm/i915/intel_pm.c | 20 ++--
>   1 file changed, 14 insertions(+), 6 deletions(-)
> 
>  diff --git a/drivers/gpu/drm/i915/intel_pm.c 
>  b/drivers/gpu/drm/i915/intel_pm.c
>  index 0f344b1fff45..a09396ee1f3d 100644
>  --- a/drivers/gpu/drm/i915/intel_pm.c
>  +++ b/drivers/gpu/drm/i915/intel_pm.c
>  @@ -1458,16 +1458,24 @@ static void vlv_atomic_update_fifo(struct 
>  intel_atomic_state *state,
>   
>   static int vlv_compute_intermediate_wm(struct drm_device *dev,
>  struct intel_crtc *crtc,
>  -   struct intel_crtc_state 
>  *crtc_state)
>  +   struct intel_crtc_state 
>  *new_crtc_state)
>   {
>  -struct vlv_wm_state *intermediate = 
>  &crtc_state->wm.vlv.intermediate;
>  -const struct vlv_wm_state *optimal = 
>  &crtc_state->wm.vlv.optimal;
>  -const struct vlv_wm_state *active = &crtc->wm.active.vlv;
>  +struct vlv_wm_state *intermediate = 
>  &new_crtc_state->wm.vlv.intermediate;
>  +const struct vlv_wm_state *optimal = 
>  &new_crtc_state->wm.vlv.optimal;
>  +const struct intel_crtc_state *old_crtc_state =
>  +
>  intel_atomic_get_old_crtc_state(new_crtc_state->base.state, crtc);
>  +const struct vlv_wm_state *active = 
>  &old_crtc_state->wm.vlv.optimal;
>   int level;
>   
>  +if (!new_crtc_state->base.active || 
>  drm_atomic_crtc_needs_modeset(&new_crtc_state->base)) {
>  +*intermediate = *optimal;
>  +
>  +return 0;
>  +}
>  +
>   intermediate->num_levels = min(optimal->num_levels, 
>  active->num_levels);
>   intermediate->cxsr = optimal->cxsr && active->cxsr &&
>  -!crtc_state->disable_cxsr;
>  +!new_crtc_state->disable_cxsr;
> >>> We need to consider disable_cxsr even in the modeset case.
> >> Why is this? crtc_state->disable_cxsr is set if any plane is part of 
> >> the crtc during modeset, so it's disabled during modeset already.
> > It's set if any plane is enabling/disabling, which should be quite
> > typical during a modeset.
>  Yeah but .initial_watermarks is called during crtc_enable, so cxsr will 
>  get enabled anyway.
> >>> Which is not what we want. CxSR must stay off until the planes have been
> >>> enabled.
> >>>
> >> In that case why is it enabled in .initial_watermarks at all? It should be 
> >> in optimize_watermarks then..
> > Because we can keep it enabled across the update unless planes are
> > getting enabled or disabled.
> >
> So for the modeset case, computing intermediate watermarks:
> 
> *intermediate = *optimal;
> if (needs_modeset)
>   intermediate->cxsr = false;
> 
> if (optimal->cxsr && !intermediate->cxsr)
>   new_crtc_state->wm.need_postvbl_update = true;
> 
> ?


Or maybe

if (blah) {
	*intermediate = *optimal;
	goto out;
}

// min/max stuff

out:
if (disable_cxsr)
	intermediate->cxsr = false;
if (memcmp(...


-- 
Ville Syrjälä
Intel OTC
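
The control flow suggested above can be sketched as a standalone C function. This is an illustration only, under stated assumptions: struct wm_state and its two fields are simplified, hypothetical stand-ins rather than the i915 struct vlv_wm_state, the "min/max stuff" is reduced to a single min(), and the elided memcmp() is replaced by an explicit field comparison. The point it shows is that applying disable_cxsr after the "out:" label covers both the early-exit (modeset or inactive) path and the normal path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified watermark state (not the real i915 struct). */
struct wm_state {
	int num_levels;
	bool cxsr;
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Fill *intermediate and return true when it differs from *optimal,
 * i.e. when a second, post-vblank update would still be needed.
 */
static bool compute_intermediate(const struct wm_state *old_optimal,
				 const struct wm_state *optimal,
				 bool crtc_active, bool needs_modeset,
				 bool disable_cxsr,
				 struct wm_state *intermediate)
{
	if (!crtc_active || needs_modeset) {
		*intermediate = *optimal;
		goto out;
	}

	/* Stand-in for the "min/max stuff": take the safer of old and new. */
	intermediate->num_levels = min_int(optimal->num_levels,
					   old_optimal->num_levels);
	intermediate->cxsr = optimal->cxsr && old_optimal->cxsr;

out:
	/* Applied on every path, including the modeset early exit. */
	if (disable_cxsr)
		intermediate->cxsr = false;

	return intermediate->num_levels != optimal->num_levels ||
	       intermediate->cxsr != optimal->cxsr;
}
```

In the modeset case with disable_cxsr set, the intermediate state keeps the optimal levels but has cxsr off, so the function reports that a later update is needed to turn cxsr back on.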

Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 10:09:57AM -, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Set all undefined MOCS entries to follow PTE
> URL   : https://patchwork.freedesktop.org/series/23941/
> State : success
> 
> == Summary ==
> 
> Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
> https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/

Pushed, thanks for the kick and the review.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Use engine->context_pin() to report the intel_ring (rev2)

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 09:53:35AM -, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
> URL   : https://patchwork.freedesktop.org/series/23884/
> State : success
> 
> == Summary ==
> 
> Series 23884v2 drm/i915: Use engine->context_pin() to report the intel_ring
> https://patchwork.freedesktop.org/api/1.0/series/23884/revisions/2/mbox/

Contrary to earlier reports, this is the patch I just pushed (not mocs)!
Thanks for the review and prompting me to fix up the request->ring
assignment.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 11:59:53AM +0100, Chris Wilson wrote:
> On Thu, May 04, 2017 at 10:09:57AM -, Patchwork wrote:
> > == Series Details ==
> > 
> > Series: drm/i915: Set all undefined MOCS entries to follow PTE
> > URL   : https://patchwork.freedesktop.org/series/23941/
> > State : success
> > 
> > == Summary ==
> > 
> > Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
> > https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/
> 
> Pushed, thanks for the kick and the review.

Actually, no I didn't. That reply was intended for a different series,
sorry for the scare/noise.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [RFC 2/7] drm/i915: Program gen3- watermarks atomically

2017-05-04 Thread Maarten Lankhorst

With the atomic watermark calculations in place, calculate intermediate
watermark values and update the watermarks atomically.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_drv.h  |   5 ++
 drivers/gpu/drm/i915/intel_drv.h |   2 +-
 drivers/gpu/drm/i915/intel_pm.c  | 103 +--
 3 files changed, 95 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 91b945cd39f9..7af4f908b2cd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1793,6 +1793,10 @@ struct g4x_wm_values {
bool fbc_en;
 };
 
+struct i9xx_wm_values {
+   bool cxsr;
+};
+
 struct skl_ddb_entry {
uint16_t start, end;/* in number of blocks, 'end' is exclusive */
 };
@@ -2422,6 +2426,7 @@ struct drm_i915_private {
struct skl_wm_values skl_hw;
struct vlv_wm_values vlv;
struct g4x_wm_values g4x;
+   struct i9xx_wm_values i9xx;
};
 
uint8_t max_level;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d9e49f2b3c22..73e74fc7383c 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -600,7 +600,7 @@ struct intel_crtc_wm_state {
struct g4x_wm_state optimal;
} g4x;
struct {
-   struct i9xx_wm_state optimal;
+   struct i9xx_wm_state optimal, intermediate;
} i9xx;
};
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0c933cfad02c..c39f63aff4a5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -433,6 +433,8 @@ bool intel_set_memory_cxsr(struct drm_i915_private 
*dev_priv, bool enable)
dev_priv->wm.vlv.cxsr = enable;
else if (IS_G4X(dev_priv))
dev_priv->wm.g4x.cxsr = enable;
+   else if (INTEL_GEN(dev_priv) <= 4)
+   dev_priv->wm.i9xx.cxsr = enable;
mutex_unlock(&dev_priv->wm.wm_mutex);
 
return ret;
@@ -2317,6 +2319,44 @@ static int i9xx_compute_pipe_wm(struct intel_crtc_state 
*crtc_state)
return 0;
 }
 
+static int i9xx_compute_intermediate_wm(struct drm_device *dev,
+  struct intel_crtc *intel_crtc,
+  struct intel_crtc_state *newstate)
+{
+   struct i9xx_wm_state *intermediate = &newstate->wm.i9xx.intermediate;
+   const struct drm_crtc_state *old_drm_state =
+   drm_atomic_get_old_crtc_state(newstate->base.state, 
&intel_crtc->base);
+   const struct i9xx_wm_state *old = 
&to_intel_crtc_state(old_drm_state)->wm.i9xx.optimal;
+   const struct i9xx_wm_state *optimal = &newstate->wm.i9xx.optimal;
+
+   /*
+* Start with the final, target watermarks, then combine with the
+* currently active watermarks to get values that are safe both before
+* and after the vblank.
+*/
+   *intermediate = *optimal;
+   if (newstate->disable_cxsr)
+   intermediate->cxsr = false;
+
+   if (!newstate->base.active ||
+   drm_atomic_crtc_needs_modeset(&newstate->base))
+   goto out;
+
+   intermediate->plane_wm = min(old->plane_wm, optimal->plane_wm);
+   intermediate->sr.plane = min(old->sr.plane, optimal->sr.plane);
+
+out:
+   /*
+* If our intermediate WM are identical to the final WM, then we can
+* omit the post-vblank programming; only update if it's different.
+*/
+   if (newstate->base.active &&
+   memcmp(intermediate, optimal, sizeof(*intermediate)) != 0)
+   newstate->wm.need_postvbl_update = true;
+
+   return 0;
+}
+
 void i9xx_wm_get_hw_state(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = to_i915(dev);
@@ -2345,17 +2385,15 @@ void i9xx_wm_get_hw_state(struct drm_device *dev)
}
 }
 
-static void i9xx_update_wm(struct intel_crtc *crtc)
+static void i9xx_program_watermarks(struct drm_i915_private *dev_priv)
 {
-   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   struct intel_crtc *crtc;
uint32_t fwater_lo;
uint32_t fwater_hi;
int cwm, srwm = -1;
int planea_wm, planeb_wm;
struct intel_crtc *enabled = NULL;
 
-   crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal;
-
crtc = intel_get_crtc_for_plane(dev_priv, 0);
planea_wm = crtc->wm.active.i9xx.plane_wm;
if (intel_crtc_active(crtc))
@@ -2381,7 +2419,7 @@ static void i9xx_update_wm(struct intel_crtc *crtc)
cwm = 2;
 
/* Play safe and disable self-refresh before adjusting watermarks. */
-   intel_set_memory_cxsr(dev_priv, false);
+   _intel_set_memory_cxsr(dev_priv, false);
 
/* Calc sr entries for one plane
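
The merge step performed by i9xx_compute_intermediate_wm() in the patch above can be tried in isolation. The sketch below is not the driver code: struct wm_state flattens the fields of the patch's struct i9xx_wm_state, merge_wm() and wm_equal() are hypothetical names, and the comparison is done field by field instead of with memcmp() so that struct padding bytes never enter into it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flattened, illustrative version of the fields in struct i9xx_wm_state. */
struct wm_state {
	uint16_t plane_wm;
	bool cxsr;
	uint16_t sr_plane;
};

static uint16_t min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

static bool wm_equal(const struct wm_state *a, const struct wm_state *b)
{
	return a->plane_wm == b->plane_wm &&
	       a->cxsr == b->cxsr &&
	       a->sr_plane == b->sr_plane;
}

/* Returns true when a post-vblank update to the optimal values is needed. */
static bool merge_wm(const struct wm_state *old,
		     const struct wm_state *optimal,
		     bool active, bool needs_modeset, bool disable_cxsr,
		     struct wm_state *intermediate)
{
	/* Start from the target values, as the patch does. */
	*intermediate = *optimal;
	if (disable_cxsr)
		intermediate->cxsr = false;

	/* Only combine with the old values for a plain update. */
	if (active && !needs_modeset) {
		intermediate->plane_wm = min_u16(old->plane_wm,
						 optimal->plane_wm);
		intermediate->sr_plane = min_u16(old->sr_plane,
						 optimal->sr_plane);
	}

	return active && !wm_equal(intermediate, optimal);
}
```

When the old and new values already agree in every field that the minimum can change, the intermediate state equals the optimal one and no second programming step is requested.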

[Intel-gfx] [RFC 0/7] drm/i915: Convert gen4- watermarks to atomic.

2017-05-04 Thread Maarten Lankhorst

I've only compile-tested this, and the series depends on
Ville's gen4x watermark conversion, so CI will fail to apply it.

Maarten Lankhorst (7):
  drm/i915: Calculate gen3- watermarks semi-atomically.
  drm/i915: Program gen3- watermarks atomically
  drm/i915: Convert pineview watermarks to atomic
  drm/i915: Calculate gen4 watermarks semiatomically.
  drm/i915: Program gen4 watermarks atomically
  drm/i915: Kill off intel_crtc_active.
  drm/i915: Rip out legacy watermark infrastructure

 drivers/gpu/drm/i915/i915_drv.h  |   6 +-
 drivers/gpu/drm/i915/intel_atomic.c  |   2 -
 drivers/gpu/drm/i915/intel_display.c |  97 +-
 drivers/gpu/drm/i915/intel_drv.h |  18 +-
 drivers/gpu/drm/i915/intel_fbc.c |   2 +-
 drivers/gpu/drm/i915/intel_pm.c  | 635 ++-
 6 files changed, 433 insertions(+), 327 deletions(-)

-- 
2.9.3


[Intel-gfx] [RFC 1/7] drm/i915: Calculate gen3- watermarks semi-atomically.

2017-05-04 Thread Maarten Lankhorst

The gen3 watermark calculations are converted to atomic,
but the wm update calls are still done through the legacy
functions.

This will make it easier to bisect things if they go wrong.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_display.c |   3 +-
 drivers/gpu/drm/i915/intel_drv.h |  14 +++
 drivers/gpu/drm/i915/intel_pm.c  | 231 +--
 3 files changed, 152 insertions(+), 96 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 4991ef2ac77d..c7d295a0895d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -15518,7 +15518,8 @@ intel_modeset_setup_hw_state(struct drm_device *dev)
skl_wm_get_hw_state(dev);
} else if (HAS_PCH_SPLIT(dev_priv)) {
ilk_wm_get_hw_state(dev);
-   }
+   } else if (INTEL_GEN(dev_priv) <= 3 && !IS_PINEVIEW(dev_priv))
+   i9xx_wm_get_hw_state(dev);
 
for_each_intel_crtc(dev, crtc) {
u64 put_domains;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ae9173707959..d9e49f2b3c22 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -546,6 +546,15 @@ struct g4x_wm_state {
bool fbc_en;
 };
 
+struct i9xx_wm_state {
+   uint16_t plane_wm;
+   bool cxsr;
+
+   struct {
+   uint16_t plane;
+   } sr;
+};
+
 struct intel_crtc_wm_state {
union {
struct {
@@ -590,6 +599,9 @@ struct intel_crtc_wm_state {
/* optimal watermarks */
struct g4x_wm_state optimal;
} g4x;
+   struct {
+   struct i9xx_wm_state optimal;
+   } i9xx;
};
 
/*
@@ -828,6 +840,7 @@ struct intel_crtc {
struct intel_pipe_wm ilk;
struct vlv_wm_state vlv;
struct g4x_wm_state g4x;
+   struct i9xx_wm_state i9xx;
} active;
} wm;
 
@@ -1868,6 +1881,7 @@ void gen6_rps_boost(struct drm_i915_private *dev_priv,
unsigned long submitted);
 void intel_queue_rps_boost_for_request(struct drm_i915_gem_request *req);
 void g4x_wm_get_hw_state(struct drm_device *dev);
+void i9xx_wm_get_hw_state(struct drm_device *dev);
 void vlv_wm_get_hw_state(struct drm_device *dev);
 void ilk_wm_get_hw_state(struct drm_device *dev);
 void skl_wm_get_hw_state(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d2cec3249e87..0c933cfad02c 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2226,89 +2226,154 @@ static void i965_update_wm(struct intel_crtc 
*unused_crtc)
 
 #undef FW_WM
 
-static void i9xx_update_wm(struct intel_crtc *unused_crtc)
+static const struct intel_watermark_params *i9xx_get_wm_info(struct 
drm_i915_private *dev_priv,
+struct intel_crtc 
*crtc)
 {
-   struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
-   const struct intel_watermark_params *wm_info;
-   uint32_t fwater_lo;
-   uint32_t fwater_hi;
-   int cwm, srwm = 1;
-   int fifo_size;
-   int planea_wm, planeb_wm;
-   struct intel_crtc *crtc, *enabled = NULL;
+   struct intel_plane *plane = to_intel_plane(crtc->base.primary);
 
if (IS_I945GM(dev_priv))
-   wm_info = &i945_wm_info;
+   return &i945_wm_info;
else if (!IS_GEN2(dev_priv))
-   wm_info = &i915_wm_info;
+   return &i915_wm_info;
+   else if (plane->plane == PLANE_A)
+   return &i830_a_wm_info;
else
-   wm_info = &i830_a_wm_info;
+   return &i830_bc_wm_info;
+}
 
-   fifo_size = dev_priv->display.get_fifo_size(dev_priv, 0);
-   crtc = intel_get_crtc_for_plane(dev_priv, 0);
-   if (intel_crtc_active(crtc)) {
+static int i9xx_compute_pipe_wm(struct intel_crtc_state *crtc_state)
+{
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   struct intel_atomic_state *state =
+   to_intel_atomic_state(crtc_state->base.state);
+   struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal;
+   struct intel_plane *plane = to_intel_plane(crtc->base.primary);
+   const struct drm_plane_state *plane_state = NULL;
+   int fifo_size;
+   const struct intel_watermark_params *wm_info;
+
+   fifo_size = dev_priv->display.get_fifo_size(dev_priv, plane->plane);
+
+   wm_info = i9xx_get_wm_info(dev_priv, crtc);
+
+   wm_state->cxsr = false;
+   memset(&wm_state->sr, 0, sizeof(wm_state->sr));
+
+   if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->

[Intel-gfx] [RFC 3/7] drm/i915: Convert pineview watermarks to atomic

2017-05-04 Thread Maarten Lankhorst

Pineview seems to have different watermarks from the other
platforms, so they are calculated separately.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_drv.h |   3 +-
 drivers/gpu/drm/i915/intel_pm.c  | 134 ++-
 2 files changed, 92 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 73e74fc7383c..62f690c7691e 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -552,7 +552,8 @@ struct i9xx_wm_state {
 
struct {
uint16_t plane;
-   } sr;
+   uint16_t cursor;
+   } sr, hpll;
 };
 
 struct intel_crtc_wm_state {
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c39f63aff4a5..eb1bb8b3f9a6 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -824,13 +824,17 @@ static struct intel_crtc *single_enabled_crtc(struct 
drm_i915_private *dev_priv)
return enabled;
 }
 
-static void pineview_update_wm(struct intel_crtc *unused_crtc)
+static int pnv_compute_pipe_wm(struct intel_crtc_state *crtc_state)
 {
-   struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
-   struct intel_crtc *crtc;
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal;
+   struct intel_plane *plane = to_intel_plane(crtc->base.primary);
+   struct intel_atomic_state *state = 
to_intel_atomic_state(crtc_state->base.state);
+   const struct drm_plane_state *primary_plane_state = NULL;
const struct cxsr_latency *latency;
-   u32 reg;
-   unsigned int wm;
+
+   memset(wm_state, 0, sizeof(*wm_state));
 
latency = intel_get_cxsr_latency(IS_PINEVIEW_G(dev_priv),
 dev_priv->is_ddr3,
@@ -838,60 +842,90 @@ static void pineview_update_wm(struct intel_crtc 
*unused_crtc)
 dev_priv->mem_freq);
if (!latency) {
DRM_DEBUG_KMS("Unknown FSB/MEM found, disable CxSR\n");
-   intel_set_memory_cxsr(dev_priv, false);
-   return;
+
+   return 0;
}
 
-   crtc = single_enabled_crtc(dev_priv);
-   if (crtc) {
-   const struct drm_display_mode *adjusted_mode =
-   &crtc->config->base.adjusted_mode;
+   if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->base)))
+   primary_plane_state = 
__drm_atomic_get_current_plane_state(&state->base, &plane->base);
+
+   if (primary_plane_state) {
const struct drm_framebuffer *fb =
-   crtc->base.primary->state->fb;
+   primary_plane_state->fb;
int cpp = fb->format->cpp[0];
-   int clock = adjusted_mode->crtc_clock;
+   const struct drm_display_mode *adjusted_mode =
+   &crtc_state->base.adjusted_mode;
+   unsigned active_crtcs;
+
+   if (state->modeset)
+   active_crtcs = state->active_crtcs;
+   else
+   active_crtcs = dev_priv->active_crtcs;
+
+   wm_state->cxsr = active_crtcs == drm_crtc_mask(&crtc->base);
+
+   wm_state->sr.plane = 
intel_calculate_wm(adjusted_mode->crtc_clock,
+   &pineview_display_wm,
+   
pineview_display_wm.fifo_size,
+   cpp, 
latency->display_sr);
+
+   wm_state->sr.cursor = 
intel_calculate_wm(adjusted_mode->crtc_clock,
+&pineview_cursor_wm,
+
pineview_display_wm.fifo_size,
+4, latency->cursor_sr);
+
+   wm_state->hpll.plane = 
intel_calculate_wm(adjusted_mode->crtc_clock,
+
&pineview_display_hplloff_wm,
+
pineview_display_hplloff_wm.fifo_size,
+cpp, 
latency->display_hpll_disable);
+
+   wm_state->hpll.cursor = 
intel_calculate_wm(adjusted_mode->crtc_clock,
+ 
&pineview_cursor_hplloff_wm,
+ 
pineview_display_hplloff_wm.fifo_size,
+ 4, 
latency->cursor_hpll_disable);
+
+   DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane 
%d, cursor SR size: %d\n",
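
As a rough illustration of the "can cxsr" decision in the hunk above, and
not part of the patch itself: self-refresh is only considered when the set
of active CRTCs is exactly this one CRTC. The sketch below assumes
drm_crtc_mask() is a single bit at the CRTC's index; crtc_mask() and
can_cxsr() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for drm_crtc_mask(): one bit per CRTC index. */
static unsigned int crtc_mask(unsigned int crtc_index)
{
	assert(crtc_index < 32);
	return 1u << crtc_index;
}

/*
 * Mirrors "active_crtcs == drm_crtc_mask(&crtc->base)" from the patch:
 * true only when this CRTC is active and no other CRTC is.
 */
static bool can_cxsr(unsigned int active_crtcs, unsigned int crtc_index)
{
	return active_crtcs == crtc_mask(crtc_index);
}
```

With two pipes active (mask 0x3) the comparison fails for either pipe, so
cxsr stays off; with only pipe B active (mask 0x2) it succeeds for index 1.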

[Intel-gfx] [RFC 4/7] drm/i915: Calculate gen4 watermarks semiatomically.

2017-05-04 Thread Maarten Lankhorst

Gen4 watermarks are handled the same way as gen3-. Calculate
the optimal watermarks atomically first, and program
them in the legacy helper.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_pm.c | 136 
 1 file changed, 95 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index eb1bb8b3f9a6..c5bdef6281f3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2189,58 +2189,109 @@ static void vlv_optimize_watermarks(struct 
intel_atomic_state *state,
mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
-static void i965_update_wm(struct intel_crtc *unused_crtc)
+static int i965_compute_pipe_wm(struct intel_crtc_state *crtc_state)
 {
-   struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev);
-   struct intel_crtc *crtc;
+   struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   struct intel_atomic_state *state =
+   to_intel_atomic_state(crtc_state->base.state);
+   struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal;
+   struct intel_plane *plane = to_intel_plane(crtc->base.primary);
+   const struct drm_plane_state *primary_plane_state = NULL;
+   const struct drm_plane_state *cursor_plane_state = NULL;
+
+   memset(wm_state, 0, sizeof(*wm_state));
+
+   if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->base)))
+   primary_plane_state = 
__drm_atomic_get_current_plane_state(&state->base, &plane->base);
+
+   if (crtc_state->base.plane_mask & 
BIT(drm_plane_index(crtc->base.cursor)))
+   cursor_plane_state = 
__drm_atomic_get_current_plane_state(&state->base, crtc->base.cursor);
+
+   if (primary_plane_state) {
+   static const int sr_latency_ns = 12000;
+   const struct drm_display_mode *adjusted_mode =
+   &crtc_state->base.adjusted_mode;
+   unsigned active_crtcs;
+   unsigned long entries;
+   bool may_cxsr;
+
+   if (state->modeset)
+   active_crtcs = state->active_crtcs;
+   else
+   active_crtcs = dev_priv->active_crtcs;
+
+   may_cxsr = active_crtcs == drm_crtc_mask(&crtc->base);
+
+   if (may_cxsr && intel_wm_plane_visible(crtc_state, 
to_intel_plane_state(primary_plane_state))) {
+   struct drm_framebuffer *fb = primary_plane_state->fb;
+   unsigned cpp = fb->format->cpp[0];
+
+   entries = intel_wm_method2(adjusted_mode->crtc_clock,
+  adjusted_mode->crtc_htotal,
+  crtc_state->pipe_src_w, cpp,
+  sr_latency_ns / 100);
+   entries = DIV_ROUND_UP(entries, I915_FIFO_LINE_SIZE);
+   if (entries < I965_FIFO_SIZE)
+   wm_state->sr.plane = I965_FIFO_SIZE - entries;
+   else
+   may_cxsr = false;
+
+   DRM_DEBUG_KMS("self-refresh entries: %ld\n", entries);
+   }
+
+   /* No need to use intel_wm_plane_visible here, since cursor. */
+   if (may_cxsr && cursor_plane_state && crtc_state->base.active) {
+   entries = intel_wm_method2(adjusted_mode->crtc_clock,
+  adjusted_mode->crtc_htotal,
+  cursor_plane_state->crtc_w, 
4,
+  sr_latency_ns / 100);
+
+   entries = DIV_ROUND_UP(entries,
+ 
i965_cursor_wm_info.cacheline_size) +
+   i965_cursor_wm_info.guard_size;
+
+   if (entries < i965_cursor_wm_info.fifo_size)
+   wm_state->sr.cursor = 
min(i965_cursor_wm_info.fifo_size - entries,
+ (unsigned 
long)(i965_cursor_wm_info.max_wm));
+   else
+   may_cxsr = false;
+   } else if (may_cxsr)
+   wm_state->sr.cursor = 16;
+
+   wm_state->cxsr = may_cxsr;
+
+   DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane 
%d, cursor SR size: %d\n",
+ yesno(wm_state->cxsr), wm_state->sr.plane, 
wm_state->sr.cursor);
+   }
+
+   return 0;
+}
+
+static void i965_update_wm(struct intel_crtc *crtc)
+{
+   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
int srwm = 1;
int cursor_sr = 16;
-   bool cxsr_enabled;
+   bool cxsr_enabl
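
For readers following the self-refresh arithmetic in i965_compute_pipe_wm()
above, here is a small standalone model of the primary-plane branch. It is
only a sketch under assumptions: wm_method2() is written from memory of how
intel_wm_method2() behaves (latency in 0.1us units, clock in kHz), and the
values 64 and 512 for I915_FIFO_LINE_SIZE and I965_FIFO_SIZE are assumed
rather than taken from this patch. All names in the block are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define FIFO_LINE_SIZE 64   /* assumed value of I915_FIFO_LINE_SIZE */
#define FIFO_SIZE      512  /* assumed value of I965_FIFO_SIZE */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

struct sr_result {
	bool may_cxsr;          /* self-refresh still possible */
	unsigned int sr_plane;  /* watermark level, 0 if not possible */
};

/* Assumed model of intel_wm_method2(): bytes fetched during the latency. */
static unsigned int wm_method2(unsigned int pixel_rate_khz,
			       unsigned int htotal,
			       unsigned int width,
			       unsigned int cpp,
			       unsigned int latency_tenth_us)
{
	unsigned int lines;

	assert(htotal != 0);
	/* Whole scanlines elapsed during the latency, plus one for safety. */
	lines = (latency_tenth_us * pixel_rate_khz) / (htotal * 10000);
	return (lines + 1) * width * cpp;
}

/* Mirrors the primary-plane branch: convert bytes to FIFO lines, compare. */
static struct sr_result compute_sr_plane(unsigned int pixel_rate_khz,
					 unsigned int htotal,
					 unsigned int width,
					 unsigned int cpp)
{
	const unsigned int sr_latency_ns = 12000;
	struct sr_result res = { .may_cxsr = true, .sr_plane = 0 };
	unsigned int entries;

	entries = wm_method2(pixel_rate_khz, htotal, width, cpp,
			     sr_latency_ns / 100);
	entries = DIV_ROUND_UP(entries, FIFO_LINE_SIZE);

	if (entries < FIFO_SIZE)
		res.sr_plane = FIFO_SIZE - entries;
	else
		res.may_cxsr = false;

	return res;
}
```

Under these assumptions, an 800 pixel wide, 4 bytes per pixel plane at
100 MHz with htotal 1000 needs 6400 bytes, or 100 FIFO lines, leaving a
watermark of 412. A 4096 pixel wide plane at 400 MHz with htotal 4200 needs
all 512 lines, so self-refresh is rejected.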

[Intel-gfx] [RFC 6/7] drm/i915: Kill off intel_crtc_active.

2017-05-04 Thread Maarten Lankhorst

Use crtc->active directly instead. This is still not completely
optimal and needs fixing, but it's about as good as using
intel_crtc_active.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_display.c | 19 ---
 drivers/gpu/drm/i915/intel_drv.h |  1 -
 drivers/gpu/drm/i915/intel_fbc.c |  2 +-
 drivers/gpu/drm/i915/intel_pm.c  |  6 +++---
 4 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index c7d295a0895d..8538c0246015 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -948,25 +948,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state 
*crtc_state, int target_clock,
  target_clock, refclk, NULL, best_clock);
 }
 
-bool intel_crtc_active(struct intel_crtc *crtc)
-{
-   /* Be paranoid as we can arrive here with only partial
-* state retrieved from the hardware during setup.
-*
-* We can ditch the adjusted_mode.crtc_clock check as soon
-* as Haswell has gained clock readout/fastboot support.
-*
-* We can ditch the crtc->primary->fb check as soon as we can
-* properly reconstruct framebuffers.
-*
-* FIXME: The intel_crtc->active here should be switched to
-* crtc->state->active once we have proper CRTC states wired up
-* for atomic.
-*/
-   return crtc->active && crtc->base.primary->state->fb &&
-   crtc->config->base.adjusted_mode.crtc_clock;
-}
-
 enum transcoder intel_pipe_to_cpu_transcoder(struct drm_i915_private *dev_priv,
 enum pipe pipe)
 {
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 62f690c7691e..dbe33b7bcf67 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1490,7 +1490,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state 
*crtc_state, int target_clock,
struct dpll *best_clock);
 int chv_calc_dpll_params(int refclk, struct dpll *pll_clock);
 
-bool intel_crtc_active(struct intel_crtc *crtc);
 void hsw_enable_ips(struct intel_crtc *crtc);
 void hsw_disable_ips(struct intel_crtc *crtc);
 enum intel_display_power_domain intel_port_to_power_domain(enum port port);
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index ded2add18b26..a93214d0388e 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -1282,7 +1282,7 @@ void intel_fbc_init_pipe_state(struct drm_i915_private 
*dev_priv)
return;
 
for_each_intel_crtc(&dev_priv->drm, crtc)
-   if (intel_crtc_active(crtc) &&
+   if (crtc->base.state->active &&
crtc->base.primary->state->visible)
dev_priv->fbc.visible_pipes_mask |= (1 << crtc->pipe);
 }
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 969eb11ed5cd..bf2127a3f730 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -814,7 +814,7 @@ static struct intel_crtc *single_enabled_crtc(struct 
drm_i915_private *dev_priv)
struct intel_crtc *crtc, *enabled = NULL;
 
for_each_intel_crtc(&dev_priv->drm, crtc) {
-   if (intel_crtc_active(crtc)) {
+   if (crtc->active) {
if (enabled)
return NULL;
enabled = crtc;
@@ -2486,11 +2486,11 @@ static void i9xx_program_watermarks(struct 
drm_i915_private *dev_priv)
 
crtc = intel_get_crtc_for_plane(dev_priv, 0);
planea_wm = crtc->wm.active.i9xx.plane_wm;
-   if (intel_crtc_active(crtc))
+   if (crtc->active)
enabled = crtc;
 
crtc = intel_get_crtc_for_plane(dev_priv, 1);
-   if (intel_crtc_active(crtc)) {
+   if (crtc->active) {
if (enabled == NULL)
enabled = crtc;
else
-- 
2.9.3
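
To make the behavioural difference of this patch concrete, the sketch below
models the removed intel_crtc_active() predicate next to the plain active
check that replaces it in intel_pm.c. The struct is a hypothetical,
heavily reduced stand-in for the three fields the old helper read; it is
not the driver's data layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced model of the state the old helper inspected. */
struct model_crtc {
	bool active;      /* crtc->active */
	const void *fb;   /* crtc->base.primary->state->fb */
	int crtc_clock;   /* crtc->config->base.adjusted_mode.crtc_clock */
};

static const int dummy_fb; /* any non-NULL address stands for a framebuffer */

static struct model_crtc model(bool active, bool has_fb, int crtc_clock)
{
	struct model_crtc c = {
		.active = active,
		.fb = has_fb ? &dummy_fb : NULL,
		.crtc_clock = crtc_clock,
	};
	return c;
}

/* The removed helper: active, with a primary fb and a non-zero clock. */
static bool old_crtc_active(struct model_crtc c)
{
	return c.active && c.fb != NULL && c.crtc_clock != 0;
}

/* The replacement used in intel_pm.c by this patch: only the active flag. */
static bool new_crtc_active(struct model_crtc c)
{
	return c.active;
}
```

The two predicates agree whenever an active CRTC also has a primary
framebuffer and a valid clock. They differ for an active CRTC without a
framebuffer, which the old helper treated as inactive.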

[Intel-gfx] [RFC 5/7] drm/i915: Program gen4 watermarks atomically

2017-05-04 Thread Maarten Lankhorst

We're already calculating the watermarks correctly, now we have to
program them too.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/intel_pm.c | 25 +++--
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c5bdef6281f3..969eb11ed5cd 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2268,20 +2268,20 @@ static int i965_compute_pipe_wm(struct intel_crtc_state 
*crtc_state)
return 0;
 }
 
-static void i965_update_wm(struct intel_crtc *crtc)
+static void i965_program_watermarks(struct drm_i915_private *dev_priv)
 {
-   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+   struct intel_crtc *crtc;
+   struct i9xx_wm_state *wm_state = NULL;
int srwm = 1;
int cursor_sr = 16;
bool cxsr_enabled = false;
 
-   crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal;
-
-   /* Calc sr entries for one plane configs */
crtc = single_enabled_crtc(dev_priv);
-   if (crtc && crtc->wm.active.i9xx.cxsr) {
-   struct i9xx_wm_state *wm_state = &crtc->wm.active.i9xx;
+   if (crtc)
+   wm_state = &crtc->wm.active.i9xx;
 
+   /* Calc sr entries for one plane configs */
+   if (wm_state && wm_state->cxsr) {
srwm = wm_state->sr.plane;
cursor_sr = wm_state->sr.cursor;
 
@@ -2571,8 +2571,10 @@ static void i9xx_initial_watermarks(struct 
intel_atomic_state *state,
pnv_program_watermarks(dev_priv);
else if (INTEL_INFO(dev_priv)->num_pipes == 1)
i845_program_watermarks(intel_crtc);
-   else
+   else if (INTEL_GEN(dev_priv) < 4)
i9xx_program_watermarks(dev_priv);
+   else
+   i965_program_watermarks(dev_priv);
mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
@@ -2591,8 +2593,10 @@ static void i9xx_optimize_watermarks(struct 
intel_atomic_state *state,
pnv_program_watermarks(dev_priv);
else if (INTEL_INFO(dev_priv)->num_pipes == 1)
i845_program_watermarks(intel_crtc);
-   else
+   else if (INTEL_GEN(dev_priv) < 4)
i9xx_program_watermarks(dev_priv);
+   else
+   i965_program_watermarks(dev_priv);
mutex_unlock(&dev_priv->wm.wm_mutex);
 }
 
@@ -8911,7 +8915,8 @@ void intel_init_pm(struct drm_i915_private *dev_priv)
}
} else if (IS_GEN4(dev_priv)) {
dev_priv->display.compute_pipe_wm = i965_compute_pipe_wm;
-   dev_priv->display.update_wm = i965_update_wm;
+   dev_priv->display.initial_watermarks = i9xx_initial_watermarks;
+   dev_priv->display.optimize_watermarks = 
i9xx_optimize_watermarks;
} else if (IS_GEN3(dev_priv)) {
dev_priv->display.compute_pipe_wm = i9xx_compute_pipe_wm;
dev_priv->display.compute_intermediate_wm = 
i9xx_compute_intermediate_wm;
-- 
2.9.3

[Intel-gfx] [RFC 7/7] drm/i915: Rip out legacy watermark infrastructure

2017-05-04 Thread Maarten Lankhorst

The legacy watermark infrastructure is now unused, so remove it.

Signed-off-by: Maarten Lankhorst 
---
 drivers/gpu/drm/i915/i915_drv.h  |  1 -
 drivers/gpu/drm/i915/intel_atomic.c  |  2 -
 drivers/gpu/drm/i915/intel_display.c | 75 ++--
 drivers/gpu/drm/i915/intel_drv.h |  2 -
 drivers/gpu/drm/i915/intel_pm.c  | 42 
 5 files changed, 3 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7af4f908b2cd..46b317c991f0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -637,7 +637,6 @@ struct drm_i915_display_funcs {
void (*optimize_watermarks)(struct intel_atomic_state *state,
struct intel_crtc_state *cstate);
int (*compute_global_watermarks)(struct drm_atomic_state *state);
-   void (*update_wm)(struct intel_crtc *crtc);
int (*modeset_calc_cdclk)(struct drm_atomic_state *state);
/* Returns the active state of the crtc, and if the crtc is active,
 * fills out the pipe-config with the hw state. */
diff --git a/drivers/gpu/drm/i915/intel_atomic.c 
b/drivers/gpu/drm/i915/intel_atomic.c
index 87b1dd464eee..7a4acaa45edd 100644
--- a/drivers/gpu/drm/i915/intel_atomic.c
+++ b/drivers/gpu/drm/i915/intel_atomic.c
@@ -173,8 +173,6 @@ intel_crtc_duplicate_state(struct drm_crtc *crtc)
crtc_state->update_pipe = false;
crtc_state->disable_lp_wm = false;
crtc_state->disable_cxsr = false;
-   crtc_state->update_wm_pre = false;
-   crtc_state->update_wm_post = false;
crtc_state->fb_changed = false;
crtc_state->fifo_changed = false;
crtc_state->wm.need_postvbl_update = false;
diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 8538c0246015..295e17d0f272 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -4958,9 +4958,6 @@ static void intel_post_plane_update(struct 
intel_crtc_state *old_crtc_state)
 
intel_frontbuffer_flip(to_i915(crtc->base.dev), pipe_config->fb_bits);
 
-   if (pipe_config->update_wm_post && pipe_config->base.active)
-   intel_update_watermarks(crtc);
-
if (old_pri_state) {
struct intel_plane_state *primary_state =
to_intel_plane_state(primary->state);
@@ -5050,8 +5047,6 @@ static void intel_pre_plane_update(struct 
intel_crtc_state *old_crtc_state,
if (dev_priv->display.initial_watermarks != NULL)
dev_priv->display.initial_watermarks(old_intel_state,
 pipe_config);
-   else if (pipe_config->update_wm_pre)
-   intel_update_watermarks(crtc);
 }
 
 static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned 
plane_mask)
@@ -5737,8 +5732,6 @@ static void i9xx_crtc_enable(struct intel_crtc_state 
*pipe_config,
if (dev_priv->display.initial_watermarks != NULL)
dev_priv->display.initial_watermarks(old_intel_state,
 intel_crtc->config);
-   else
-   intel_update_watermarks(intel_crtc);
intel_enable_pipe(intel_crtc);
 
assert_vblank_disabled(crtc);
@@ -5802,9 +5795,6 @@ static void i9xx_crtc_disable(struct intel_crtc_state 
*old_crtc_state,
 
if (!IS_GEN2(dev_priv))
intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false);
-
-   if (!dev_priv->display.initial_watermarks)
-   intel_update_watermarks(intel_crtc);
 }
 
 static void intel_crtc_disable_noatomic(struct drm_crtc *crtc)
@@ -5863,7 +5853,6 @@ static void intel_crtc_disable_noatomic(struct drm_crtc 
*crtc)
encoder->base.crtc = NULL;
 
intel_fbc_disable(intel_crtc);
-   intel_update_watermarks(intel_crtc);
intel_disable_shared_dpll(intel_crtc);
 
domains = intel_crtc->enabled_power_domains;
@@ -10738,40 +10727,6 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 }
 
 
-/**
- * intel_wm_need_update - Check whether watermarks need updating
- * @plane: drm plane
- * @state: new plane state
- *
- * Check current plane state versus the new one to determine whether
- * watermarks need to be recalculated.
- *
- * Returns true or false.
- */
-static bool intel_wm_need_update(struct drm_plane *plane,
-struct drm_plane_state *state)
-{
-   struct intel_plane_state *new = to_intel_plane_state(state);
-   struct intel_plane_state *cur = to_intel_plane_state(plane->state);
-
-   /* Update watermarks on tiling or size changes. */
-   if (new->base.visible != cur->base.visible)
-   return true;
-
-   if (!cur->base.fb || !new->base.fb)
-   return false;
-
-   if (cur->base.fb->modifier != new->base.fb->modifier ||
-   cur->base.rotation

[Intel-gfx] [PATCH] drm/i915: Move the unclaimed mmio detection into the powerwell for KMS

2017-05-04 Thread Chris Wilson

Replace the large comment about requiring the powerwell for
intel_uncore_arm_unclaimed_mmio_detection() by moving the arming of the
mmio error detection into the powerwell held for modesetting. This
also accomplishes the goal of only arming the mmio detection after a
full modeset.

Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Daniel Vetter 
Cc: Ville Syrjälä 
---
 drivers/gpu/drm/i915/intel_display.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c 
b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..14e12e46eda5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12912,8 +12912,16 @@ static void intel_atomic_commit_tail(struct 
drm_atomic_state *state)
 
drm_atomic_helper_commit_hw_done(state);
 
-   if (intel_state->modeset)
+   if (intel_state->modeset) {
+   /* As one of the primary mmio accessors, KMS has a high
+* likelihood of triggering bugs in unclaimed access. After we
+* finish modesetting, see if an error has been flagged, and if
+* so enable debugging for the next modeset - and hope we catch
+* the culprit.
+*/
+   intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
+   }
 
mutex_lock(&dev->struct_mutex);
drm_atomic_helper_cleanup_planes(dev, state);
@@ -12923,19 +12931,6 @@ static void intel_atomic_commit_tail(struct 
drm_atomic_state *state)
 
drm_atomic_state_put(state);
 
-   /* As one of the primary mmio accessors, KMS has a high likelihood
-* of triggering bugs in unclaimed access. After we finish
-* modesetting, see if an error has been flagged, and if so
-* enable debugging for the next modeset - and hope we catch
-* the culprit.
-*
-* XXX note that we assume display power is on at this point.
-* This might hold true now but we need to add pm helper to check
-* unclaimed only when the hardware is on, as atomic commits
-* can happen also when the device is completely off.
-*/
-   intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
-
intel_atomic_helper_free_state(dev_priv);
 }
 
-- 
2.11.0

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()

2017-05-04 Thread Mika Kuoppala

Chris Wilson  writes:

> Typically, there is space available within the ring and if not we have
> to wait (by definition a slow path). Rearrange the code to reduce the
> number of branches and stack size for the hotpath, accommodating a slight
> growth for the wait.
>
> v2: Fix the new assert that packets are not larger than the actual ring.
>
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 
> +
>  1 file changed, 33 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index c46e5439d379..53123c1cfcc5 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct 
> drm_i915_gem_request *request)
>   return 0;
>  }
>  
> -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> +static noinline int wait_for_space(struct drm_i915_gem_request *req, int 
> bytes)
>  {
>   struct intel_ring *ring = req->ring;
>   struct drm_i915_gem_request *target;
> @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request 
> *req, int bytes)
>  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  {
>   struct intel_ring *ring = req->ring;
> - int remain_actual = ring->size - ring->emit;
> - int remain_usable = ring->effective_size - ring->emit;
> - int bytes = num_dwords * sizeof(u32);
> - int total_bytes, wait_bytes;
> - bool need_wrap = false;
> + const unsigned int remain_usable = ring->effective_size - ring->emit;
> + const unsigned int bytes = num_dwords * sizeof(u32);
> + unsigned int need_wrap = 0;
> + unsigned int total_bytes;
>   u32 *cs;
>  
>   total_bytes = bytes + req->reserved_space;
> + GEM_BUG_ON(total_bytes > ring->effective_size);
>  
> - if (unlikely(bytes > remain_usable)) {
> - /*
> -  * Not enough space for the basic request. So need to flush
> -  * out the remainder and then wait for base + reserved.
> -  */
> - wait_bytes = remain_actual + total_bytes;
> - need_wrap = true;
> - } else if (unlikely(total_bytes > remain_usable)) {
> - /*
> -  * The base request will fit but the reserved space
> -  * falls off the end. So we don't need an immediate wrap
> -  * and only need to effectively wait for the reserved
> -  * size space from the start of ringbuffer.
> -  */
> - wait_bytes = remain_actual + req->reserved_space;
> - } else {
> - /* No wrapping required, just waiting. */
> - wait_bytes = total_bytes;
> + if (unlikely(total_bytes > remain_usable)) {
> + const int remain_actual = ring->size - ring->emit;
> +
> + if (bytes > remain_usable) {
> + /*
> +  * Not enough space for the basic request. So need to
> +  * flush out the remainder and then wait for
> +  * base + reserved.
> +  */
> + total_bytes += remain_actual;
> + need_wrap = remain_actual | 1;

Your remain_actual should never reach zero, so forcing the lowest
bit on here, and off again later, seems superfluous.

-Mika

> + } else  {
> + /*
> +  * The base request will fit but the reserved space
> +  * falls off the end. So we don't need an immediate
> +  * wrap and only need to effectively wait for the
> +  * reserved size from the start of ringbuffer.
> +  */
> + total_bytes = req->reserved_space + remain_actual;
> + }
>   }
>  
> - if (wait_bytes > ring->space) {
> - int ret = wait_for_space(req, wait_bytes);
> + if (unlikely(total_bytes > ring->space)) {
> + int ret = wait_for_space(req, total_bytes);
>   if (unlikely(ret))
>   return ERR_PTR(ret);
>   }
>  
>   if (unlikely(need_wrap)) {
> - GEM_BUG_ON(remain_actual > ring->space);
> - GEM_BUG_ON(ring->emit + remain_actual > ring->size);
> + need_wrap &= ~1;
> + GEM_BUG_ON(need_wrap > ring->space);
> + GEM_BUG_ON(ring->emit + need_wrap > ring->size);
>  
>   /* Fill the tail with MI_NOOP */
> - memset(ring->vaddr + ring->emit, 0, remain_actual);
> + memset(ring->vaddr + ring->emit, 0, need_wrap);
>   ring->emit = 0;
> - ring->space -= remain_actual;
> + ring->space -= need_wrap;
>   }
>  
>   GEM_BUG_ON(ring->emit > ring->size - bytes);
> -- 
> 2.11.0
>
> _
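
For reference, the "| 1" trick questioned above can be modelled in
isolation. The sketch assumes the tail length is always even (the ring is
written in dwords), so bit 0 is free to act as a "wrap needed" flag that
keeps the value non-zero even for a zero-length tail. Whether a zero-length
tail can actually occur is exactly the point under review and is not
settled by this example; pack_wrap() and unpack_wrap() are hypothetical
names.

```c
#include <assert.h>

/* Store the tail length with bit 0 set, so the result is never zero. */
static unsigned int pack_wrap(unsigned int remain_actual)
{
	assert((remain_actual & 1) == 0); /* assumed: lengths are even */
	return remain_actual | 1;
}

/* Recover the tail length by clearing the flag bit again. */
static unsigned int unpack_wrap(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}
```

If remain_actual really can never be zero, a plain
"need_wrap = remain_actual" would carry the same information without the
flag bit, which is the simplification suggested in the review.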

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Move the unclaimed mmio detection into the powerwell for KMS

2017-05-04 Thread Patchwork

== Series Details ==

Series: drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
URL   : https://patchwork.freedesktop.org/series/23955/
State : success

== Summary ==

Series 23955v1 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
https://patchwork.freedesktop.org/api/1.0/series/23955/revisions/1/mbox/

Test gem_exec_suspend:
Subgroup basic-s4-devices:
dmesg-warn -> PASS   (fi-kbl-7560u) fdo#100125
Test vgem_basic:
Subgroup sysfs:
incomplete -> PASS   (fi-snb-2600)

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u   total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bdw-gvtdvm  total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:426s
fi-bxt-j4205   total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700   total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:548s
fi-byt-j1900   total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:479s
fi-elk-e7500   total:278  pass:221  dwarn:0   dfail:0   fail:0   skip:57  time:402s
fi-hsw-4770    total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:407s
fi-hsw-4770r   total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650     total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770    total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:489s
fi-kbl-7500u   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:453s
fi-kbl-7560u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:565s
fi-skl-6260u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:453s
fi-skl-6700hq  total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:569s
fi-skl-6700k   total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:459s
fi-skl-6770hq  total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:494s
fi-skl-gvtdvm  total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:528s
fi-snb-2600    total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:400s
fi-bsw-n3050 failed to collect. IGT log at Patchwork_4621/fi-bsw-n3050/igt.log

93dcb17f41bd2025c355f4e2aded42c0fc5a5c5d drm-tip: 2017y-05m-04d-10h-58m-24s UTC integration manifest
03c0c51 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4621/

Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Heikki Krogerus

On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
> 
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
> 
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
> 
> Cc: Rafael J. Wysocki 
> Cc: Mika Westerberg 
> Cc: Borislav Petkov 
> Cc: Dan Williams 
> Cc: Amir Goldstein 
> Cc: Jarkko Sakkinen 
> Cc: Jani Nikula 
> Cc: Ben Skeggs 
> Cc: Benjamin Tissoires 
> Cc: Joerg Roedel 
> Cc: Adrian Hunter 
> Cc: Yisen Zhuang 
> Cc: Bjorn Helgaas 
> Cc: Zhang Rui 
> Cc: Felipe Balbi 
> Cc: Mathias Nyman 
> Cc: Heikki Krogerus 
> Cc: Liam Girdwood 
> Cc: Mark Brown 
> Signed-off-by: Andy Shevchenko 

OK by me, FWIW:

Reviewed-by: Heikki Krogerus 


Thanks,

-- 
heikki

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > Typically, there is space available within the ring and if not we have
> > to wait (by definition a slow path). Rearrange the code to reduce the
> > number of branches and stack size for the hotpath, accommodating a slight
> > growth for the wait.
> >
> > v2: Fix the new assert that packets are not larger than the actual ring.
> >
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 
> > +
> >  1 file changed, 33 insertions(+), 30 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> > b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index c46e5439d379..53123c1cfcc5 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct 
> > drm_i915_gem_request *request)
> > return 0;
> >  }
> >  
> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int 
> > bytes)
> >  {
> > struct intel_ring *ring = req->ring;
> > struct drm_i915_gem_request *target;
> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct 
> > drm_i915_gem_request *req, int bytes)
> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> >  {
> > struct intel_ring *ring = req->ring;
> > -   int remain_actual = ring->size - ring->emit;
> > -   int remain_usable = ring->effective_size - ring->emit;
> > -   int bytes = num_dwords * sizeof(u32);
> > -   int total_bytes, wait_bytes;
> > -   bool need_wrap = false;
> > +   const unsigned int remain_usable = ring->effective_size - ring->emit;
> > +   const unsigned int bytes = num_dwords * sizeof(u32);
> > +   unsigned int need_wrap = 0;
> > +   unsigned int total_bytes;
> > u32 *cs;
> >  
> > total_bytes = bytes + req->reserved_space;
> > +   GEM_BUG_ON(total_bytes > ring->effective_size);
> >  
> > -   if (unlikely(bytes > remain_usable)) {
> > -   /*
> > -* Not enough space for the basic request. So need to flush
> > -* out the remainder and then wait for base + reserved.
> > -*/
> > -   wait_bytes = remain_actual + total_bytes;
> > -   need_wrap = true;
> > -   } else if (unlikely(total_bytes > remain_usable)) {
> > -   /*
> > -* The base request will fit but the reserved space
> > -* falls off the end. So we don't need an immediate wrap
> > -* and only need to effectively wait for the reserved
> > -* size space from the start of ringbuffer.
> > -*/
> > -   wait_bytes = remain_actual + req->reserved_space;
> > -   } else {
> > -   /* No wrapping required, just waiting. */
> > -   wait_bytes = total_bytes;
> > +   if (unlikely(total_bytes > remain_usable)) {
> > +   const int remain_actual = ring->size - ring->emit;
> > +
> > +   if (bytes > remain_usable) {
> > +   /*
> > +* Not enough space for the basic request. So need to
> > +* flush out the remainder and then wait for
> > +* base + reserved.
> > +*/
> > +   total_bytes += remain_actual;
> > +   need_wrap = remain_actual | 1;
> 
> Your remain_actual should never reach zero. So in here
> forcing the lowest bit on, and later off, seems superfluous.

Why can't we fill up to the last byte with commands? remain_actual is
just (size - tail) and we don't force a wrap until emit crosses the
boundary (and not before). We hit remain_actual == 0 in practice.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
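
For readers following the need_wrap discussion above, here is a minimal
sketch of the encoding being debated. It is not taken from the patch: the
helper names encode_wrap() and decode_wrap() are hypothetical, and it assumes
remain_actual is always a multiple of 4 (ring contents are emitted in
dwords), so bit 0 is free to act as a "wrap needed" flag even when
remain_actual == 0.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, named for illustration only.
 * Assumption: remain_actual is dword-aligned, so bit 0 is unused.
 */
static unsigned int encode_wrap(unsigned int remain_actual)
{
	/* Force bit 0 on so the result is non-zero even for remain_actual == 0. */
	return remain_actual | 1;
}

static unsigned int decode_wrap(unsigned int need_wrap)
{
	/* Clear the flag bit again to recover the original byte count. */
	return need_wrap & ~1u;
}
```

With this, encode_wrap(0) is still "true" when tested as a flag, while
decode_wrap() gives back 0 bytes to fill. A plain "need_wrap = remain_actual"
would read as "no wrap" in exactly the remain_actual == 0 case Chris
describes.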

Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping

2017-05-04 Thread Ander Conselvan De Oliveira

On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu:
> > One of the steps for PLL (un)initialization is to (un)map
> > the correspondent DDI that is actually using that PLL.
> > 
> > So, let's do this step following the places already established
> > and used so far, although spec put this as part of PLL
> > initialization sequences.
> > 
> > v2: Use proper prefix on bits names as suggested by Ander.
> > v3: Add missed "~". Without that the logic was inverted
> > so we were disabling interrupts.
> > Credits-to: Clinton
> > Credits-to: Art
> > v4: Spec is getting updated to do DDI -> PLL mapping
> > and clock on in 2 separated reg writes. (Paulo)
> > Also update bits definitions to use space
> > (1 << 1) instead of (1<<1). (Paulo)
> > 
> > Cc: Paulo Zanoni 
> > Cc: Art Runyan 
> > Cc: Clint Taylor 
> > Cc: Ville Syrjälä 
> > Cc: Kahola, Mika 
> > Cc: Ander Conselvan De Oliveira
> > Signed-off-by: Rodrigo Vivi 
> > Reviewed-by: Kahola, Mika 
> > Signed-off-by: Rodrigo Vivi 
> > ---
> >  drivers/gpu/drm/i915/i915_reg.h  |  9 +
> >  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
> >  2 files changed, 29 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > b/drivers/gpu/drm/i915/i915_reg.h
> > index 3cfc65f..dcb8e21 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -8150,6 +8150,15 @@ enum {
> >  #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1,
> > _DPLL1_CFGCR1, _DPLL2_CFGCR1)
> >  #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1,
> > _DPLL1_CFGCR2, _DPLL2_CFGCR2)
> >  
> > +/*
> > + * CNL Clocks
> > + */
> > +#define DPCLKA_CFGCR0  _MMIO(0x6C200)
> > +#define  DPCLKA_CFGCR0_DDI_CLK_OFF(port)   (1 << ((port)+10))
> > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)  (3 <<
> > ((port)*2))
> > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
> > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)  ((pll) <<
> > ((port)*2))
> > +
> >  /* BXT display engine PLL */
> >  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
> >  #define   BXT_DE_PLL_RATIO(x)  (x) /*
> > {60,65,100} * 19.2MHz */
> > diff --git a/drivers/gpu/drm/i915/intel_ddi.c
> > b/drivers/gpu/drm/i915/intel_ddi.c
> > index 0914ad9..2a901bf 100644
> > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct
> > intel_encoder *encoder,
> >  {
> >     struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
> >     enum port port = intel_ddi_get_encoder_port(encoder);
> > +   uint32_t val;
> >  
> >     if (WARN_ON(!pll))
> >     return;
> >  
> > -   if (IS_GEN9_BC(dev_priv)) {
> > -   uint32_t val;
> > +   if (IS_CANNONLAKE(dev_priv)) {
> > +   /* Configure DPCLKA_CFGCR0 to map the DPLL to the
> > DDI. */
> > +   val = I915_READ(DPCLKA_CFGCR0);
> > +   val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> > +   I915_WRITE(DPCLKA_CFGCR0, val);
> 
> A question to the Atomic Lords: don't we need some sort of locking
> around this register since it's used by all ports/clocks? I suppose
> dev_priv->dpll_lock would do...
> 
> Maybe the same would apply for gen9_bc.

If there are modesets happening in parallel for different crtcs, then some
locking is needed. dpll_lock seems like the right call, that's what's used to
avoid the same problem with the enable/disable hooks.

Btw, I think this patch shows why something like [1] might be a good idea.

[1] https://patchwork.freedesktop.org/patch/113598/
> 
> >  
> > +   /*
> > +    * Configure DPCLKA_CFGCR0 to turn on the clock for
> > the DDI.
> > +    * This step and the step before must be done with
> > separate
> > +    * register writes.
> > +    */
> > +   val = I915_READ(DPCLKA_CFGCR0);
> > +   val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> > +    DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));
> > +   I915_WRITE(DPCLKA_CFGCR0, val);
> > +   } else if (IS_GEN9_BC(dev_priv)) {
> >     /* DDI -> PLL mapping  */
> >     val = I915_READ(DPLL_CTRL2);
> >  
> > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct
> > intel_encoder *intel_encoder,
> >     if (dig_port)
> >     intel_display_power_put(dev_priv, dig_port->ddi_io_power_domain);
> >  
> > -   if (IS_GEN9_BC(dev_priv))
> > +   if (IS_CANNONLAKE(dev_priv))
> > +   I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) |
> > +      DPCLKA_CFGCR0_DDI_CLK_OFF(port));
> > +   else if (IS_GEN9_BC(dev_priv))
> >     I915_WRITE(DPLL_CTRL2, (I915_READ(DPLL_CTRL2) |
> >     DPLL_CTRL2_DDI_CLK_OFF(port)
> > ));
> >     else if (INTEL_GEN(dev_priv) < 9)
> 

Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping

2017-05-04 Thread Ville Syrjälä

On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote:
> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
> > Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu:
> > > One of the steps for PLL (un)initialization is to (un)map
> > > the correspondent DDI that is actually using that PLL.
> > > 
> > > So, let's do this step following the places already established
> > > and used so far, although spec put this as part of PLL
> > > initialization sequences.
> > > 
> > > v2: Use proper prefix on bits names as suggested by Ander.
> > > v3: Add missed "~". Without that the logic was inverted
> > > so we were disabling interrupts.
> > > Credits-to: Clinton
> > > Credits-to: Art
> > > v4: Spec is getting updated to do DDI -> PLL mapping
> > > and clock on in 2 separated reg writes. (Paulo)
> > > Also update bits definitions to use space
> > > (1 << 1) instead of (1<<1). (Paulo)
> > > 
> > > Cc: Paulo Zanoni 
> > > Cc: Art Runyan 
> > > Cc: Clint Taylor 
> > > Cc: Ville Syrjälä 
> > > Cc: Kahola, Mika 
> > > Cc: Ander Conselvan De Oliveira
> > > Signed-off-by: Rodrigo Vivi 
> > > Reviewed-by: Kahola, Mika 
> > > Signed-off-by: Rodrigo Vivi 
> > > ---
> > >  drivers/gpu/drm/i915/i915_reg.h  |  9 +
> > >  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
> > >  2 files changed, 29 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > b/drivers/gpu/drm/i915/i915_reg.h
> > > index 3cfc65f..dcb8e21 100644
> > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > @@ -8150,6 +8150,15 @@ enum {
> > >  #define DPLL_CFGCR1(id)  _MMIO_PIPE((id) - SKL_DPLL1,
> > > _DPLL1_CFGCR1, _DPLL2_CFGCR1)
> > >  #define DPLL_CFGCR2(id)  _MMIO_PIPE((id) - SKL_DPLL1,
> > > _DPLL1_CFGCR2, _DPLL2_CFGCR2)
> > >  
> > > +/*
> > > + * CNL Clocks
> > > + */
> > > +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
> > > +#define  DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
> > > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 <<
> > > ((port)*2))
> > > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port)   ((port)*2)
> > > +#define  DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) <<
> > > ((port)*2))
> > > +
> > >  /* BXT display engine PLL */
> > >  #define BXT_DE_PLL_CTL   _MMIO(0x6d000)
> > >  #define   BXT_DE_PLL_RATIO(x) (x) /*
> > > {60,65,100} * 19.2MHz */
> > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c
> > > b/drivers/gpu/drm/i915/intel_ddi.c
> > > index 0914ad9..2a901bf 100644
> > > --- a/drivers/gpu/drm/i915/intel_ddi.c
> > > +++ b/drivers/gpu/drm/i915/intel_ddi.c
> > > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct
> > > intel_encoder *encoder,
> > >  {
> > >   struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
> > >   enum port port = intel_ddi_get_encoder_port(encoder);
> > > + uint32_t val;
> > >  
> > >   if (WARN_ON(!pll))
> > >   return;
> > >  
> > > - if (IS_GEN9_BC(dev_priv)) {
> > > - uint32_t val;
> > > + if (IS_CANNONLAKE(dev_priv)) {
> > > + /* Configure DPCLKA_CFGCR0 to map the DPLL to the
> > > DDI. */
> > > + val = I915_READ(DPCLKA_CFGCR0);
> > > + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> > > + I915_WRITE(DPCLKA_CFGCR0, val);
> > 
> > A question to the Atomic Lords: don't we need some sort of locking
> > around this register since it's used by all ports/clocks? I suppose
> > dev_priv->dpll_lock would do...
> > 
> > Maybe the same would apply for gen9_bc.
> 
> If there are modesets happening in parallel for different crtcs, then some
> locking is needed. dpll_lock seems like the right call, that's what's used to
> avoid the same problem with the enable/disable hooks.

If something is allowing modesets to commit in parallel then probably
the whole world is on fire. Historically connection_mutex has been there
to protect us, but not sure how that goes with nonblocking commits. I
do hope there's still something there to prevent this...

> 
> Btw, I think this patch shows why something like [1] might be a good idea.
> 
> [1] https://patchwork.freedesktop.org/patch/113598/
> > 
> > >  
> > > + /*
> > > +  * Configure DPCLKA_CFGCR0 to turn on the clock for
> > > the DDI.
> > > +  * This step and the step before must be done with
> > > separate
> > > +  * register writes.
> > > +  */
> > > + val = I915_READ(DPCLKA_CFGCR0);
> > > + val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> > > +  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));
> > > + I915_WRITE(DPCLKA_CFGCR0, val);
> > > + } else if (IS_GEN9_BC(dev_priv)) {
> > >   /* DDI -> PLL mapping  */
> > >   val = I915_READ(DPLL_CTRL2);
> > >  
> > > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct
> > > intel_encoder *intel_encoder,
> > >   if (dig_port)
> > >   i

[Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible

2017-05-04 Thread Michal Wajdeczko

We are using some scratch registers in MMIO based send function.
Make their base and count flexible in preparation of upcoming
GuC firmware/hardware changes. While around, change cmd len
parameter verification from WARN_ON to GEM_BUG_ON as we don't
need this all the time.

v2: call out WARN/GEM_BUG change in the commit msg (Daniele)

Signed-off-by: Michal Wajdeczko 
Suggested-by: Daniele Ceraolo Spurio 
Cc: Daniele Ceraolo Spurio 
Cc: Joonas Lahtinen 
Reviewed-by: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/intel_uc.c | 41 ++---
 drivers/gpu/drm/i915/intel_uc.h |  7 +++
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 72f49e6..9d11c42 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv)
__intel_uc_fw_fini(&dev_priv->huc.fw);
 }
 
+static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
+{
+   GEM_BUG_ON(!guc->send_regs.base);
+   GEM_BUG_ON(!guc->send_regs.count);
+   GEM_BUG_ON(i >= guc->send_regs.count);
+
+   return _MMIO(guc->send_regs.base + 4 * i);
+}
+
+static void guc_init_send_regs(struct intel_guc *guc)
+{
+   struct drm_i915_private *dev_priv = guc_to_i915(guc);
+   enum forcewake_domains fw_domains = 0;
+   u32 i;
+
+   guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
+   guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
+
+   for (i = 0; i < guc->send_regs.count; i++) {
+   fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
+   guc_send_reg(guc, i),
+   FW_REG_READ | FW_REG_WRITE);
+   }
+   guc->send_regs.fw_domains = fw_domains;
+}
+
 static int guc_enable_communication(struct intel_guc *guc)
 {
/* XXX: placeholder for alternate setup */
+   guc_init_send_regs(guc);
guc->send = intel_guc_send_mmio;
return 0;
 }
@@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
*action, u32 len)
int i;
int ret;
 
-   if (WARN_ON(len < 1 || len > 15))
-   return -EINVAL;
+   GEM_BUG_ON(!len);
+   GEM_BUG_ON(len > guc->send_regs.count);
 
mutex_lock(&guc->send_mutex);
-   intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER);
+   intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains);
 
dev_priv->guc.action_count += 1;
dev_priv->guc.action_cmd = action[0];
 
for (i = 0; i < len; i++)
-   I915_WRITE(SOFT_SCRATCH(i), action[i]);
+   I915_WRITE(guc_send_reg(guc, i), action[i]);
 
-   POSTING_READ(SOFT_SCRATCH(i - 1));
+   POSTING_READ(guc_send_reg(guc, i - 1));
 
intel_guc_notify(guc);
 
@@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
*action, u32 len)
 * Fast commands should still complete in 10us.
 */
ret = __intel_wait_for_register_fw(dev_priv,
-  SOFT_SCRATCH(0),
+  guc_send_reg(guc, 0),
   INTEL_GUC_RECV_MASK,
   INTEL_GUC_RECV_MASK,
   10, 10, &status);
@@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
*action, u32 len)
}
dev_priv->guc.action_status = status;
 
-   intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
+   intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains);
mutex_unlock(&guc->send_mutex);
 
return ret;
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 097289b..a37a8cc 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -205,6 +205,13 @@ struct intel_guc {
uint64_t submissions[I915_NUM_ENGINES];
uint32_t last_seqno[I915_NUM_ENGINES];
 
+   /* GuC's FW specific registers used in MMIO send */
+   struct {
+   u32 base;
+   u32 count;
+   u32 fw_domains; /* enum forcewake_domains */
+   } send_regs;
+
/* To serialize the intel_guc_send actions */
struct mutex send_mutex;
 
-- 
2.7.4
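
A standalone sketch of the offset arithmetic in guc_send_reg() above, for
illustration only. The struct and function names are hypothetical, the base
value 0x1000 used in the usage note is arbitrary (the real base comes from
SOFT_SCRATCH(0)), and a count of 15 is an assumption inferred from the removed
"len > 15" check. Where the patch uses GEM_BUG_ON, this sketch returns 0 for
an invalid request so it can be exercised in userspace.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the send_regs member added above. */
struct send_regs_sketch {
	uint32_t base;   /* MMIO offset of the first scratch register */
	uint32_t count;  /* number of usable scratch registers */
};

/*
 * Return the MMIO offset of scratch register i, or 0 if the request is
 * invalid. Registers are 32 bits wide, hence the stride of 4 bytes.
 */
static uint32_t send_reg_offset(const struct send_regs_sketch *regs, uint32_t i)
{
	if (!regs->base || !regs->count || i >= regs->count)
		return 0;

	return regs->base + 4 * i;
}
```

With base 0x1000 and count 15, index 0 maps to 0x1000, index 14 (the last
valid one) to 0x1038, and index 15 is rejected.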


Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Joerg Roedel

On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index cbf7763d8091..420d51b286ad 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1808,10 +1808,9 @@ IOMMU_INIT_POST(detect_intel_iommu);
>   * for Directed-IO Architecture Specifiction, Rev 2.2, Section 8.8
>   * "Remapping Hardware Unit Hot Plug".
>   */
> -static u8 dmar_hp_uuid[] = {
> - /* 0000 */ 0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
> - /* 0008 */ 0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
> -};
> +static uuid_le dmar_hp_uuid =
> + UUID_LE(0xD8C1A3A6, 0xBE9B, 0x4C9B,
> + 0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF);
>  
>  /*
>   * Currently there's only one revision and BIOS will not check the revision 
> id,
> @@ -1824,7 +1823,7 @@ static u8 dmar_hp_uuid[] = {
>  
>  static inline bool dmar_detect_dsm(acpi_handle handle, int func)
>  {
> - return acpi_check_dsm(handle, dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << func);
> + return acpi_check_dsm(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << 
> func);
>  }
>  
>  static int dmar_walk_dsm_resource(acpi_handle handle, int func,
> @@ -1843,7 +1842,7 @@ static int dmar_walk_dsm_resource(acpi_handle handle, 
> int func,
>   if (!dmar_detect_dsm(handle, func))
>   return 0;
>  
> - obj = acpi_evaluate_dsm_typed(handle, dmar_hp_uuid, DMAR_DSM_REV_ID,
> + obj = acpi_evaluate_dsm_typed(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID,
> func, NULL, ACPI_TYPE_BUFFER);
>   if (!obj)
>   return -ENODEV;

DMAR part is

Acked-by: Joerg Roedel 
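
As a cross-check of the DMAR hunk quoted above, the following userspace
sketch shows that the UUID_LE() arguments produce the same 16 bytes as the
removed dmar_hp_uuid[] array. It does not use the kernel headers: the struct
and make_uuid_le() are hypothetical stand-ins that assume the usual
little-endian layout (the first three fields stored least significant byte
first, the last eight bytes stored as given).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's uuid_le type. */
struct uuid_le_sketch {
	uint8_t b[16];
};

/* Build the 16-byte value, storing a, b and c least significant byte first. */
static struct uuid_le_sketch make_uuid_le(uint32_t a, uint16_t b, uint16_t c,
					  const uint8_t d[8])
{
	struct uuid_le_sketch u;

	u.b[0] = a & 0xff;
	u.b[1] = (a >> 8) & 0xff;
	u.b[2] = (a >> 16) & 0xff;
	u.b[3] = (a >> 24) & 0xff;
	u.b[4] = b & 0xff;
	u.b[5] = (b >> 8) & 0xff;
	u.b[6] = c & 0xff;
	u.b[7] = (c >> 8) & 0xff;
	memcpy(&u.b[8], d, 8);

	return u;
}

/* Returns 1 if the new initializer matches the old raw byte array. */
static int matches_old_dmar_bytes(void)
{
	static const uint8_t old_bytes[16] = {
		0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
		0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
	};
	static const uint8_t tail[8] = {
		0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
	};
	struct uuid_le_sketch u = make_uuid_le(0xD8C1A3A6, 0xBE9B, 0x4C9B, tail);

	/* Binary data, so compare with memcmp() rather than a string function. */
	return memcmp(u.b, old_bytes, sizeof(old_bytes)) == 0;
}
```

matches_old_dmar_bytes() returns 1 under these assumptions, i.e. the
conversion in the hunk keeps the on-the-wire UUID unchanged.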

Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping

2017-05-04 Thread Ander Conselvan De Oliveira

On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> One of the steps for PLL (un)initialization is to (un)map
> the correspondent DDI that is actually using that PLL.
> 
> So, let's do this step following the places already established
> and used so far, although spec put this as part of PLL
> initialization sequences.
> 
> v2: Use proper prefix on bits names as suggested by Ander.
> v3: Add missed "~". Without that the logic was inverted
> so we were disabling interrupts.
> Credits-to: Clinton
> Credits-to: Art
> v4: Spec is getting updated to do DDI -> PLL mapping
> and clock on in 2 separated reg writes. (Paulo)
> Also update bits definitions to use space
> (1 << 1) instead of (1<<1). (Paulo)
> 
> Cc: Paulo Zanoni 
> Cc: Art Runyan 
> Cc: Clint Taylor 
> Cc: Ville Syrjälä 
> Cc: Kahola, Mika 
> Cc: Ander Conselvan De Oliveira 
> Signed-off-by: Rodrigo Vivi 
> Reviewed-by: Kahola, Mika 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/i915_reg.h  |  9 +
>  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
>  2 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 3cfc65f..dcb8e21 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -8150,6 +8150,15 @@ enum {
>  #define DPLL_CFGCR1(id)  _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, 
> _DPLL2_CFGCR1)
>  #define DPLL_CFGCR2(id)  _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, 
> _DPLL2_CFGCR2)
>  
> +/*
> + * CNL Clocks
> + */
> +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
> +#define  DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10))
> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2))
> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port)   ((port)*2)
> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2))
> +
>  /* BXT display engine PLL */
>  #define BXT_DE_PLL_CTL   _MMIO(0x6d000)
>  #define   BXT_DE_PLL_RATIO(x) (x) /* {60,65,100} * 19.2MHz */
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c 
> b/drivers/gpu/drm/i915/intel_ddi.c
> index 0914ad9..2a901bf 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder 
> *encoder,
>  {
>   struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
>   enum port port = intel_ddi_get_encoder_port(encoder);
> + uint32_t val;
>  
>   if (WARN_ON(!pll))
>   return;
>  
> - if (IS_GEN9_BC(dev_priv)) {
> - uint32_t val;
> + if (IS_CANNONLAKE(dev_priv)) {
> + /* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */
> + val = I915_READ(DPCLKA_CFGCR0);
> + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
> + I915_WRITE(DPCLKA_CFGCR0, val);
>  
> + /*
> +  * Configure DPCLKA_CFGCR0 to turn on the clock for the DDI.
> +  * This step and the step before must be done with separate
> +  * register writes.
> +  */
> + val = I915_READ(DPCLKA_CFGCR0);
> + val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
> +  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));

val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); ?

Or clearing the clock select to zero has no effect here?

Ander
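
To make the question above concrete, here is a small sketch of the bit
arithmetic only. The macros mirror the ones in the quoted patch (with
unsigned types added); the function names and the example values (port 1,
PLL id 2, clock initially gated) are hypothetical. Whether the hardware
tolerates a zeroed select field is exactly the open question and is not
answered by this sketch.

```c
#include <stdint.h>

/* Same bit layout as the DPCLKA_CFGCR0 macros in the quoted patch. */
#define CLK_OFF(port)       (1u << ((port) + 10))
#define CLK_SEL_MASK(port)  (3u << ((port) * 2))
#define CLK_SEL(pll, port)  ((uint32_t)(pll) << ((port) * 2))

/* First write: map the PLL to the DDI. */
static uint32_t step1_map(uint32_t val, uint32_t pll, uint32_t port)
{
	return val | CLK_SEL(pll, port);
}

/* Second write as posted: ungate the clock and also clear the select field. */
static uint32_t step2_as_posted(uint32_t val, uint32_t port)
{
	return val & ~(CLK_OFF(port) | CLK_SEL_MASK(port));
}

/* Alternative: only ungate the clock, leaving the select field alone. */
static uint32_t step2_keep_sel(uint32_t val, uint32_t port)
{
	return val & ~CLK_OFF(port);
}
```

Starting from 0x800 (port 1 gated), step1_map(0x800, 2, 1) gives 0x808.
step2_as_posted(0x808, 1) then yields 0x0, so the PLL selection written in
the first step is gone, while step2_keep_sel(0x808, 1) yields 0x8.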

> + I915_WRITE(DPCLKA_CFGCR0, val);
> + } else if (IS_GEN9_BC(dev_priv)) {
>   /* DDI -> PLL mapping  */
>   val = I915_READ(DPLL_CTRL2);
>  
> @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct 
> intel_encoder *intel_encoder,
>   if (dig_port)
>   intel_display_power_put(dev_priv, 
> dig_port->ddi_io_power_domain);
>  
> - if (IS_GEN9_BC(dev_priv))
> + if (IS_CANNONLAKE(dev_priv))
> + I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) |
> +DPCLKA_CFGCR0_DDI_CLK_OFF(port));
> + else if (IS_GEN9_BC(dev_priv))
>   I915_WRITE(DPLL_CTRL2, (I915_READ(DPLL_CTRL2) |
>   DPLL_CTRL2_DDI_CLK_OFF(port)));
>   else if (INTEL_GEN(dev_priv) < 9)

Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()

2017-05-04 Thread Mika Kuoppala

Chris Wilson  writes:

> On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
>> Chris Wilson  writes:
>> 
>> > Typically, there is space available within the ring and if not we have
>> > to wait (by definition a slow path). Rearrange the code to reduce the
>> > number of branches and stack size for the hotpath, accommodating a slight
>> > growth for the wait.
>> >
>> > v2: Fix the new assert that packets are not larger than the actual ring.
>> >
>> > Signed-off-by: Chris Wilson 
>> > ---
>> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 
>> > +
>> >  1 file changed, 33 insertions(+), 30 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
>> > b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > index c46e5439d379..53123c1cfcc5 100644
>> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct 
>> > drm_i915_gem_request *request)
>> >return 0;
>> >  }
>> >  
>> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
>> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int 
>> > bytes)
>> >  {
>> >struct intel_ring *ring = req->ring;
>> >struct drm_i915_gem_request *target;
>> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct 
>> > drm_i915_gem_request *req, int bytes)
>> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>> >  {
>> >struct intel_ring *ring = req->ring;
>> > -  int remain_actual = ring->size - ring->emit;
>> > -  int remain_usable = ring->effective_size - ring->emit;
>> > -  int bytes = num_dwords * sizeof(u32);
>> > -  int total_bytes, wait_bytes;
>> > -  bool need_wrap = false;
>> > +  const unsigned int remain_usable = ring->effective_size - ring->emit;
>> > +  const unsigned int bytes = num_dwords * sizeof(u32);
>> > +  unsigned int need_wrap = 0;
>> > +  unsigned int total_bytes;
>> >u32 *cs;
>> >  
>> >total_bytes = bytes + req->reserved_space;
>> > +  GEM_BUG_ON(total_bytes > ring->effective_size);
>> >  
>> > -  if (unlikely(bytes > remain_usable)) {
>> > -  /*
>> > -   * Not enough space for the basic request. So need to flush
>> > -   * out the remainder and then wait for base + reserved.
>> > -   */
>> > -  wait_bytes = remain_actual + total_bytes;
>> > -  need_wrap = true;
>> > -  } else if (unlikely(total_bytes > remain_usable)) {
>> > -  /*
>> > -   * The base request will fit but the reserved space
>> > -   * falls off the end. So we don't need an immediate wrap
>> > -   * and only need to effectively wait for the reserved
>> > -   * size space from the start of ringbuffer.
>> > -   */
>> > -  wait_bytes = remain_actual + req->reserved_space;
>> > -  } else {
>> > -  /* No wrapping required, just waiting. */
>> > -  wait_bytes = total_bytes;
>> > +  if (unlikely(total_bytes > remain_usable)) {
>> > +  const int remain_actual = ring->size - ring->emit;
>> > +
>> > +  if (bytes > remain_usable) {
>> > +  /*
>> > +   * Not enough space for the basic request. So need to
>> > +   * flush out the remainder and then wait for
>> > +   * base + reserved.
>> > +   */
>> > +  total_bytes += remain_actual;
>> > +  need_wrap = remain_actual | 1;
>> 
>> Your remain_actual should never reach zero. So in here
>> forcing the lowest bit on, and later off, seems superfluous.
>
> Why can't we fill up to the last byte with commands? remain_actual is
> just (size - tail) and we don't force a wrap until emit crosses the
> boundary (and not before). We hit remain_actual == 0 in practice.
> -Chris

My mistake, was thinking postwrap.

num_dwords and second parameter to wait_for_space should be unsigned.

Reviewed-by: Mika Kuoppala 

Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping

2017-05-04 Thread Maarten Lankhorst

Op 04-05-17 om 14:44 schreef Ville Syrjälä:
> On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote:
>> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
>>> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu:
>>>> One of the steps for PLL (un)initialization is to (un)map
>>>> the correspondent DDI that is actually using that PLL.
>>>>
>>>> So, let's do this step following the places already established
>>>> and used so far, although spec put this as part of PLL
>>>> initialization sequences.
>>>>
>>>> v2: Use proper prefix on bits names as suggested by Ander.
>>>> v3: Add missed "~". Without that the logic was inverted
>>>> so we were disabling interrupts.
>>>> Credits-to: Clinton
>>>> Credits-to: Art
>>>> v4: Spec is getting updated to do DDI -> PLL mapping
>>>> and clock on in 2 separated reg writes. (Paulo)
>>>> Also update bits definitions to use space
>>>> (1 << 1) instead of (1<<1). (Paulo)
>>>>
>>>> Cc: Paulo Zanoni 
>>>> Cc: Art Runyan 
>>>> Cc: Clint Taylor 
>>>> Cc: Ville Syrjälä 
>>>> Cc: Kahola, Mika 
>>>> Cc: Ander Conselvan De Oliveira
>>>> Signed-off-by: Rodrigo Vivi 
>>>> Reviewed-by: Kahola, Mika 
>>>> Signed-off-by: Rodrigo Vivi 
>>>> ---
>>>>  drivers/gpu/drm/i915/i915_reg.h  |  9 +
>>>>  drivers/gpu/drm/i915/intel_ddi.c | 23 ---
>>>>  2 files changed, 29 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>>>> b/drivers/gpu/drm/i915/i915_reg.h
>>>> index 3cfc65f..dcb8e21 100644
>>>> --- a/drivers/gpu/drm/i915/i915_reg.h
>>>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>>>> @@ -8150,6 +8150,15 @@ enum {
>>>>  #define DPLL_CFGCR1(id)   _MMIO_PIPE((id) - SKL_DPLL1,
>>>> _DPLL1_CFGCR1, _DPLL2_CFGCR1)
>>>>  #define DPLL_CFGCR2(id)   _MMIO_PIPE((id) - SKL_DPLL1,
>>>> _DPLL1_CFGCR2, _DPLL2_CFGCR2)
>>>>  
>>>> +/*
>>>> + * CNL Clocks
>>>> + */
>>>> +#define DPCLKA_CFGCR0 _MMIO(0x6C200)
>>>> +#define  DPCLKA_CFGCR0_DDI_CLK_OFF(port)  (1 << ((port)+10))
>>>> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 <<
>>>> ((port)*2))
>>>> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2)
>>>> +#define  DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) <<
>>>> ((port)*2))
>>>> +
>>>>  /* BXT display engine PLL */
>>>>  #define BXT_DE_PLL_CTL _MMIO(0x6d000)
>>>>  #define   BXT_DE_PLL_RATIO(x) (x) /*
>>>> {60,65,100} * 19.2MHz */
>>>> diff --git a/drivers/gpu/drm/i915/intel_ddi.c
>>>> b/drivers/gpu/drm/i915/intel_ddi.c
>>>> index 0914ad9..2a901bf 100644
>>>> --- a/drivers/gpu/drm/i915/intel_ddi.c
>>>> +++ b/drivers/gpu/drm/i915/intel_ddi.c
>>>> @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct
>>>> intel_encoder *encoder,
>>>>  {
>>>>      struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
>>>>      enum port port = intel_ddi_get_encoder_port(encoder);
>>>> +  uint32_t val;
>>>>  
>>>>      if (WARN_ON(!pll))
>>>>          return;
>>>>  
>>>> -  if (IS_GEN9_BC(dev_priv)) {
>>>> -  uint32_t val;
>>>> +  if (IS_CANNONLAKE(dev_priv)) {
>>>> +  /* Configure DPCLKA_CFGCR0 to map the DPLL to the
>>>> DDI. */
>>>> +  val = I915_READ(DPCLKA_CFGCR0);
>>>> +  val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
>>>> +  I915_WRITE(DPCLKA_CFGCR0, val);
>>> A question to the Atomic Lords: don't we need some sort of locking
>>> around this register since it's used by all ports/clocks? I suppose
>>> dev_priv->dpll_lock would do...
>>>
>>> Maybe the same would apply for gen9_bc.
>> If there are modesets happening in parallel for different crtcs, then some
>> locking is needed. dpll_lock seems like the right call, that's what's used to
>> avoid the same problem with the enable/disable hooks.
> If something is allowing modesets to commit in parallel then probably
> the whole world is on fire. Historically connection_mutex has been there
> to protect us, but not sure how that goes with nonblocking commits. I
> do hope there's still something there to prevent this...

During nonblocking modesets we don't hold any locks. It's still possible
that we force serialization through some other means, for example grabbing
all crtc_states might force serialization previously. But I'm not sure this
is guaranteed to happen even for SKL. It might happen when DDB
allocation or cdclk changes, but there's no guarantee during a modeset.

So quite likely you'll need locking here. :)

~Maarten

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

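To make the locking question above concrete, here is a minimal user-space sketch of a read-modify-write on a register shared by all ports. It is an illustration under stated assumptions, not the driver's code: `fake_dpclka_cfgcr0`, `fake_dpll_lock`, `fake_ddi_clk_select()` and `fake_read_cfgcr0()` are hypothetical stand-ins, and a pthread mutex plays the role of dev_priv->dpll_lock. Unlike the quoted hunk, the sketch also clears the port's field before setting it; that is a choice made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Field layout copied from the quoted i915_reg.h hunk: 2 bits per port. */
#define CLK_SEL_SHIFT(port)	((port) * 2)
#define CLK_SEL_MASK(port)	(3u << CLK_SEL_SHIFT(port))
#define CLK_SEL(pll, port)	((uint32_t)(pll) << CLK_SEL_SHIFT(port))

/* Hypothetical stand-ins for the MMIO register and dev_priv->dpll_lock. */
static uint32_t fake_dpclka_cfgcr0;
static pthread_mutex_t fake_dpll_lock = PTHREAD_MUTEX_INITIALIZER;

/* Select a PLL for one port without disturbing the other ports' fields. */
void fake_ddi_clk_select(unsigned int port, unsigned int pll)
{
	uint32_t val;

	assert(pll < 4);	/* the select field is 2 bits wide */

	pthread_mutex_lock(&fake_dpll_lock);
	val = fake_dpclka_cfgcr0;	/* I915_READ() in the driver */
	val &= ~CLK_SEL_MASK(port);
	val |= CLK_SEL(pll, port);
	fake_dpclka_cfgcr0 = val;	/* I915_WRITE() in the driver */
	pthread_mutex_unlock(&fake_dpll_lock);
}

/* Read the whole register back under the same lock. */
uint32_t fake_read_cfgcr0(void)
{
	uint32_t val;

	pthread_mutex_lock(&fake_dpll_lock);
	val = fake_dpclka_cfgcr0;
	pthread_mutex_unlock(&fake_dpll_lock);
	return val;
}
```

Without the lock, two callers working on different ports could both read the old value and the later write would drop the earlier caller's field, which is the hazard raised in the thread.
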
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 03:59:05PM +0300, Mika Kuoppala wrote:
> Chris Wilson  writes:
> 
> > On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote:
> >> Chris Wilson  writes:
> >> 
> >> > Typically, there is space available within the ring and if not we have
> >> > to wait (by definition a slow path). Rearrange the code to reduce the
> >> > number of branches and stack size for the hotpath, accommodating a slight
> >> > growth for the wait.
> >> >
> >> > v2: Fix the new assert that packets are not larger than the actual ring.
> >> >
> >> > Signed-off-by: Chris Wilson 
> >> > ---
> >> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 63 
> >> > +
> >> >  1 file changed, 33 insertions(+), 30 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
> >> > b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > index c46e5439d379..53123c1cfcc5 100644
> >> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct 
> >> > drm_i915_gem_request *request)
> >> >  return 0;
> >> >  }
> >> >  
> >> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
> >> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, 
> >> > int bytes)
> >> >  {
> >> >  struct intel_ring *ring = req->ring;
> >> >  struct drm_i915_gem_request *target;
> >> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct 
> >> > drm_i915_gem_request *req, int bytes)
> >> >  u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
> >> >  {
> >> >  struct intel_ring *ring = req->ring;
> >> > -int remain_actual = ring->size - ring->emit;
> >> > -int remain_usable = ring->effective_size - ring->emit;
> >> > -int bytes = num_dwords * sizeof(u32);
> >> > -int total_bytes, wait_bytes;
> >> > -bool need_wrap = false;
> >> > +const unsigned int remain_usable = ring->effective_size - 
> >> > ring->emit;
> >> > +const unsigned int bytes = num_dwords * sizeof(u32);
> >> > +unsigned int need_wrap = 0;
> >> > +unsigned int total_bytes;
> >> >  u32 *cs;
> >> >  
> >> >  total_bytes = bytes + req->reserved_space;
> >> > +GEM_BUG_ON(total_bytes > ring->effective_size);
> >> >  
> >> > -if (unlikely(bytes > remain_usable)) {
> >> > -/*
> >> > - * Not enough space for the basic request. So need to 
> >> > flush
> >> > - * out the remainder and then wait for base + reserved.
> >> > - */
> >> > -wait_bytes = remain_actual + total_bytes;
> >> > -need_wrap = true;
> >> > -} else if (unlikely(total_bytes > remain_usable)) {
> >> > -/*
> >> > - * The base request will fit but the reserved space
> >> > - * falls off the end. So we don't need an immediate wrap
> >> > - * and only need to effectively wait for the reserved
> >> > - * size space from the start of ringbuffer.
> >> > - */
> >> > -wait_bytes = remain_actual + req->reserved_space;
> >> > -} else {
> >> > -/* No wrapping required, just waiting. */
> >> > -wait_bytes = total_bytes;
> >> > +if (unlikely(total_bytes > remain_usable)) {
> >> > +const int remain_actual = ring->size - ring->emit;
> >> > +
> >> > +if (bytes > remain_usable) {
> >> > +/*
> >> > + * Not enough space for the basic request. So 
> >> > need to
> >> > + * flush out the remainder and then wait for
> >> > + * base + reserved.
> >> > + */
> >> > +total_bytes += remain_actual;
> >> > +need_wrap = remain_actual | 1;
> >> 
> >> Your remain_actual should never reach zero. So in here
> >> forcing the lowest bit on, and later off, seems superfluous.
> >
> > Why can't we fill up to the last byte with commands? remain_actual is
> > just (size - tail) and we don't force a wrap until emit crosses the
> > boundary (and not before). We hit remain_actual == 0 in practice.
> > -Chris
> 
> My mistake, was thinking postwrap.
> 
> num_dwords and second parameter to wait_for_space should be unsigned.

Your predictive algorithm is working fine though. Applied after your
suggestion from patch 1.

Thanks,
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [CI 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()

2017-05-04 Thread Chris Wilson

Typically, there is space available within the ring and if not we have
to wait (by definition a slow path). Rearrange the code to reduce the
number of branches and stack size for the hotpath, accommodating a slight
growth for the wait.

v2: Fix the new assert that packets are not larger than the actual ring.
v3: Make the parameters unsigned as well to match usage.

Signed-off-by: Chris Wilson 
Reviewed-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 67 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 +-
 2 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 47f144b1e3fa..8b427a6151b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1655,7 +1655,8 @@ static int ring_request_alloc(struct drm_i915_gem_request 
*request)
return 0;
 }
 
-static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
+static noinline int wait_for_space(struct drm_i915_gem_request *req,
+  unsigned int bytes)
 {
struct intel_ring *ring = req->ring;
struct drm_i915_gem_request *target;
@@ -1700,52 +1701,56 @@ static int wait_for_space(struct drm_i915_gem_request 
*req, int bytes)
return 0;
 }
 
-u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
+u32 *intel_ring_begin(struct drm_i915_gem_request *req,
+ unsigned int num_dwords)
 {
struct intel_ring *ring = req->ring;
-   int remain_actual = ring->size - ring->emit;
-   int remain_usable = ring->effective_size - ring->emit;
-   int bytes = num_dwords * sizeof(u32);
-   int total_bytes, wait_bytes;
-   bool need_wrap = false;
+   const unsigned int remain_usable = ring->effective_size - ring->emit;
+   const unsigned int bytes = num_dwords * sizeof(u32);
+   unsigned int need_wrap = 0;
+   unsigned int total_bytes;
u32 *cs;
 
total_bytes = bytes + req->reserved_space;
+   GEM_BUG_ON(total_bytes > ring->effective_size);
 
-   if (unlikely(bytes > remain_usable)) {
-   /*
-* Not enough space for the basic request. So need to flush
-* out the remainder and then wait for base + reserved.
-*/
-   wait_bytes = remain_actual + total_bytes;
-   need_wrap = true;
-   } else if (unlikely(total_bytes > remain_usable)) {
-   /*
-* The base request will fit but the reserved space
-* falls off the end. So we don't need an immediate wrap
-* and only need to effectively wait for the reserved
-* size space from the start of ringbuffer.
-*/
-   wait_bytes = remain_actual + req->reserved_space;
-   } else {
-   /* No wrapping required, just waiting. */
-   wait_bytes = total_bytes;
+   if (unlikely(total_bytes > remain_usable)) {
+   const int remain_actual = ring->size - ring->emit;
+
+   if (bytes > remain_usable) {
+   /*
+* Not enough space for the basic request. So need to
+* flush out the remainder and then wait for
+* base + reserved.
+*/
+   total_bytes += remain_actual;
+   need_wrap = remain_actual | 1;
+   } else  {
+   /*
+* The base request will fit but the reserved space
+* falls off the end. So we don't need an immediate
+* wrap and only need to effectively wait for the
+* reserved size from the start of ringbuffer.
+*/
+   total_bytes = req->reserved_space + remain_actual;
+   }
}
 
-   if (wait_bytes > ring->space) {
-   int ret = wait_for_space(req, wait_bytes);
+   if (unlikely(total_bytes > ring->space)) {
+   int ret = wait_for_space(req, total_bytes);
if (unlikely(ret))
return ERR_PTR(ret);
}
 
if (unlikely(need_wrap)) {
-   GEM_BUG_ON(remain_actual > ring->space);
-   GEM_BUG_ON(ring->emit + remain_actual > ring->size);
+   need_wrap &= ~1;
+   GEM_BUG_ON(need_wrap > ring->space);
+   GEM_BUG_ON(ring->emit + need_wrap > ring->size);
 
/* Fill the tail with MI_NOOP */
-   memset(ring->vaddr + ring->emit, 0, remain_actual);
+   memset(ring->vaddr + ring->emit, 0, need_wrap);
ring->emit = 0;
-   ring->space -= remain_actual;
+   ring->space -= need_wrap;
}
 
GEM_BUG_ON(ring->emit > r

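The wrap handling in the patch above, including the remain_actual == 0 case raised in review, can be modelled in a few lines of user-space C. This is a sketch under assumptions rather than the driver code: `struct ring_model`, `ring_bytes_to_wait()` and `wrap_pad_bytes()` are hypothetical names, the model assumes emit <= effective_size, and it assumes byte counts are even so that bit 0 is free to act as a flag.

```c
#include <assert.h>

/* Hypothetical, simplified model of the fields intel_ring_begin() consults. */
struct ring_model {
	unsigned int size;		/* total ring size in bytes */
	unsigned int effective_size;	/* usable size, at most size */
	unsigned int emit;		/* current write offset */
};

/*
 * Return how much free space is needed before emitting 'bytes' while
 * keeping 'reserved' bytes in hand. *need_wrap receives the number of
 * tail bytes to pad with bit 0 forced on, so that a pad length of zero
 * still reads as "wrap required".
 */
unsigned int ring_bytes_to_wait(const struct ring_model *ring,
				unsigned int bytes, unsigned int reserved,
				unsigned int *need_wrap)
{
	const unsigned int remain_usable = ring->effective_size - ring->emit;
	unsigned int total_bytes = bytes + reserved;

	assert(total_bytes <= ring->effective_size);
	*need_wrap = 0;

	if (total_bytes > remain_usable) {
		const unsigned int remain_actual = ring->size - ring->emit;

		if (bytes > remain_usable) {
			/* Pad out the tail, then wait for base + reserved. */
			total_bytes += remain_actual;
			*need_wrap = remain_actual | 1;
		} else {
			/* Base fits; only the reserved part spills over. */
			total_bytes = reserved + remain_actual;
		}
	}

	return total_bytes;
}

/* Strip the flag bit to recover the number of bytes to pad. */
unsigned int wrap_pad_bytes(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}
```

With emit equal to size, remain_actual is 0; `need_wrap` then becomes 1, which still tests true, and `wrap_pad_bytes()` returns 0. That is why the low bit is forced on in this model.
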
[Intel-gfx] [CI 2/3] drm/i915: Report the ring->space from intel_ring_update_space()

2017-05-04 Thread Chris Wilson

Some callers immediately want to know the current ring->space after
calling intel_ring_update_space(), which we can freely provide via the
return parameter.

Signed-off-by: Chris Wilson 
Reviewed-by: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 12 
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e7ef04cc071b..47f144b1e3fa 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -51,9 +51,14 @@ static unsigned int __intel_ring_space(unsigned int head,
return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
-void intel_ring_update_space(struct intel_ring *ring)
+unsigned int intel_ring_update_space(struct intel_ring *ring)
 {
-   ring->space = __intel_ring_space(ring->head, ring->emit, ring->size);
+   unsigned int space;
+
+   space = __intel_ring_space(ring->head, ring->emit, ring->size);
+
+   ring->space = space;
+   return space;
 }
 
 static int
@@ -1658,8 +1663,7 @@ static int wait_for_space(struct drm_i915_gem_request 
*req, int bytes)
 
lockdep_assert_held(&req->i915->drm.struct_mutex);
 
-   intel_ring_update_space(ring);
-   if (ring->space >= bytes)
+   if (intel_ring_update_space(ring) >= bytes)
return 0;
 
/*
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 650ab884d6c8..3e343b09eeb6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -486,7 +486,7 @@ int intel_ring_pin(struct intel_ring *ring,
   struct drm_i915_private *i915,
   unsigned int offset_bias);
 void intel_ring_reset(struct intel_ring *ring, u32 tail);
-void intel_ring_update_space(struct intel_ring *ring);
+unsigned int intel_ring_update_space(struct intel_ring *ring);
 void intel_ring_unpin(struct intel_ring *ring);
 void intel_ring_free(struct intel_ring *ring);
 
-- 
2.11.0


[Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()

2017-05-04 Thread Chris Wilson

Exploit the power-of-two ring size to compute the space across the
wraparound using a mask rather than an if. Convert to unsigned integers
so the operation is well defined.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
Reviewed-by: Mika Kuoppala 
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c 
b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3ce1c87dec46..e7ef04cc071b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -39,12 +39,16 @@
  */
 #define LEGACY_REQUEST_SIZE 200
 
-static int __intel_ring_space(int head, int tail, int size)
+static unsigned int __intel_ring_space(unsigned int head,
+  unsigned int tail,
+  unsigned int size)
 {
-   int space = head - tail;
-   if (space <= 0)
-   space += size;
-   return space - I915_RING_FREE_SPACE;
+   /*
+* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
+* same cacheline, the Head Pointer must not be greater than the Tail
+* Pointer."
+*/
+   return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
 void intel_ring_update_space(struct intel_ring *ring)
@@ -1670,12 +1674,9 @@ static int wait_for_space(struct drm_i915_gem_request 
*req, int bytes)
GEM_BUG_ON(!req->reserved_space);
 
list_for_each_entry(target, &ring->request_list, ring_link) {
-   unsigned space;
-
/* Would completion of this request free enough space? */
-   space = __intel_ring_space(target->postfix, ring->emit,
-  ring->size);
-   if (space >= bytes)
+   if (bytes <= __intel_ring_space(target->postfix,
+   ring->emit, ring->size))
break;
}
 
@@ -1744,11 +1745,11 @@ u32 *intel_ring_begin(struct drm_i915_gem_request *req, 
int num_dwords)
}
 
GEM_BUG_ON(ring->emit > ring->size - bytes);
+   GEM_BUG_ON(ring->space < bytes);
cs = ring->vaddr + ring->emit;
GEM_DEBUG_EXEC(memset(cs, POISON_INUSE, bytes));
ring->emit += bytes;
ring->space -= bytes;
-   GEM_BUG_ON(ring->space < 0);
 
return cs;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h 
b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 600713b29d79..650ab884d6c8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -17,17 +17,6 @@
 #define CACHELINE_BYTES 64
 #define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t))
 
-/*
- * Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use"
- * Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use"
- * Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring 
Buffer Use"
- *
- * "If the Ring Buffer Head Pointer and the Tail Pointer are on the same
- * cacheline, the Head Pointer must not be greater than the Tail
- * Pointer."
- */
-#define I915_RING_FREE_SPACE 64
-
 struct intel_hw_status_page {
struct i915_vma *vma;
u32 *page_addr;
@@ -145,9 +134,9 @@ struct intel_ring {
u32 tail;
u32 emit;
 
-   int space;
-   int size;
-   int effective_size;
+   u32 space;
+   u32 size;
+   u32 effective_size;
 };
 
 struct i915_gem_context;
@@ -548,6 +537,25 @@ assert_ring_tail_valid(const struct intel_ring *ring, 
unsigned int tail)
 */
GEM_BUG_ON(!IS_ALIGNED(tail, 8));
GEM_BUG_ON(tail >= ring->size);
+
+   /*
+* "Ring Buffer Use"
+*  Gen2 BSpec "1. Programming Environment" / 1.4.4.6
+*  Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5
+*  Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5
+* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
+* same cacheline, the Head Pointer must not be greater than the Tail
+* Pointer."
+*
+* We use ring->head as the last known location of the actual RING_HEAD,
+* it may have advanced but in the worst case it is equally the same
+* as ring->head and so we should never program RING_TAIL to advance
+* into the same cacheline as ring->head.
+*/
+#define cacheline(a) round_down(a, CACHELINE_BYTES)
+   GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) &&
+  tail < ring->head);
+#undef cacheline
 }
 
 static inline unsigned int
-- 
2.11.0


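The change in [CI 1/3] above can be compared side by side in a small user-space sketch. The function names `ring_space_branch()` and `ring_space_mask()` are hypothetical; the bodies follow the removed and added lines of the patch, with the power-of-two requirement made explicit as an assertion.

```c
#include <assert.h>

#define CACHELINE_BYTES 64u

/* Pre-patch form, as in the removed lines: branch on wraparound. */
int ring_space_branch(int head, int tail, int size)
{
	int space = head - tail;

	if (space <= 0)
		space += size;
	return space - (int)CACHELINE_BYTES;
}

/* Post-patch form: a mask replaces the branch; size must be a power of two. */
unsigned int ring_space_mask(unsigned int head, unsigned int tail,
			     unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (head - tail - CACHELINE_BYTES) & (size - 1);
}
```

In this sketch the two forms agree when head equals tail or when head is at least one cacheline ahead of tail modulo size. When head is 1 to 63 bytes ahead, the branching form returns a negative value while the masked form wraps to a value just under size, which is the situation the new check in assert_ring_tail_valid() is meant to exclude.
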
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping

2017-05-04 Thread Ville Syrjälä

On Thu, May 04, 2017 at 03:02:07PM +0200, Maarten Lankhorst wrote:
> Op 04-05-17 om 14:44 schreef Ville Syrjälä:
> > On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote:
> >> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote:
> >>> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu:
>  One of the steps for PLL (un)initialization is to (un)map
>  the correspondent DDI that is actually using that PLL.
> 
>  So, let's do this step following the places already established
>  and used so far, although spec put this as part of PLL
>  initialization sequences.
> 
>  v2: Use proper prefix on bits names as suggested by Ander.
>  v3: Add missed "~". Without that the logic was inverted
>  so we were disabling interrupts.
>  Credits-to: Clinton
>  Credits-to: Art
>  v4: Spec is getting updated to do DDI -> PLL mapping
>  and clock on in 2 separated reg writes. (Paulo)
>  Also update bits definitions to use space
>  (1 << 1) instead of (1<<1). (Paulo)
> 
>  Cc: Paulo Zanoni 
>  Cc: Art Runyan 
>  Cc: Clint Taylor 
>  Cc: Ville Syrjälä 
>  Cc: Kahola, Mika 
>  Cc: Ander Conselvan De Oliveira
>  Signed-off-by: Rodrigo Vivi 
>  Reviewed-by: Kahola, Mika 
>  Signed-off-by: Rodrigo Vivi 
>  ---
>   drivers/gpu/drm/i915/i915_reg.h  |  9 +
>   drivers/gpu/drm/i915/intel_ddi.c | 23 ---
>   2 files changed, 29 insertions(+), 3 deletions(-)
> 
>  diff --git a/drivers/gpu/drm/i915/i915_reg.h
>  b/drivers/gpu/drm/i915/i915_reg.h
>  index 3cfc65f..dcb8e21 100644
>  --- a/drivers/gpu/drm/i915/i915_reg.h
>  +++ b/drivers/gpu/drm/i915/i915_reg.h
>  @@ -8150,6 +8150,15 @@ enum {
>   #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1,
>  _DPLL1_CFGCR1, _DPLL2_CFGCR1)
>   #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1,
>  _DPLL1_CFGCR2, _DPLL2_CFGCR2)
>   
>  +/*
>  + * CNL Clocks
>  + */
>  +#define DPCLKA_CFGCR0   _MMIO(0x6C200)
>  +#define  DPCLKA_CFGCR0_DDI_CLK_OFF(port)(1 << ((port)+10))
>  +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)   (3 <<
>  ((port)*2))
>  +#define  DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port)  ((port)*2)
>  +#define  DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)   ((pll) <<
>  ((port)*2))
>  +
>   /* BXT display engine PLL */
>   #define BXT_DE_PLL_CTL  _MMIO(0x6d000)
>   #define   BXT_DE_PLL_RATIO(x)   (x) /*
>  {60,65,100} * 19.2MHz */
>  diff --git a/drivers/gpu/drm/i915/intel_ddi.c
>  b/drivers/gpu/drm/i915/intel_ddi.c
>  index 0914ad9..2a901bf 100644
>  --- a/drivers/gpu/drm/i915/intel_ddi.c
>  +++ b/drivers/gpu/drm/i915/intel_ddi.c
>  @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct
>  intel_encoder *encoder,
>   {
>   struct drm_i915_private *dev_priv = to_i915(encoder-
> > base.dev);
>   enum port port = intel_ddi_get_encoder_port(encoder);
>  +uint32_t val;
>   
>   if (WARN_ON(!pll))
>   return;
>   
>  -if (IS_GEN9_BC(dev_priv)) {
>  -uint32_t val;
>  +if (IS_CANNONLAKE(dev_priv)) {
>  +/* Configure DPCLKA_CFGCR0 to map the DPLL to the
>  DDI. */
>  +val = I915_READ(DPCLKA_CFGCR0);
>  +val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port);
>  +I915_WRITE(DPCLKA_CFGCR0, val);
> >>> A question to the Atomic Lords: don't we need some sort of locking
> >>> around this register since it's used by all ports/clocks? I suppose
> >>> dev_priv->dpll_lock would do...
> >>>
> >>> Maybe the same would apply for gen9_bc.
> >> If there are modesets happening in parallel for different crtcs, then some
> >> locking is needed. dpll_lock seems like the right call, that's what's used 
> >> to
> >> avoid the same problem with the enable/disable hooks.
> > If something is allowing modesets to commit in parallel then probably
> > the whole world is on fire. Historically connection_mutex has been there
> > to protect us, but not sure how that goes with nonblocking commits. I
> > do hope there's still something there to prevent this...
> 
> During nonblocking modesets we don't hold any locks. It's still possible
> that we force serialization through some other means, for example grabbing
> all crtc_states might force serialization previously. But I'm not sure this
> is guaranteed to happen even for SKL. It might happen when DDB
> allocation or cdclk changes, but there's no guarantee during a modeset.
> 
> So quite likely you'll need locking here. :)

Someone just needs to fix things so that modesets are always serialized.
I don't think anyone has actually reviewed the entire driver sufficiently
to allow parallel m

Re: [Intel-gfx] [PATCH 33/67] drm/i915: Configure DPLL's for Cannonlake

2017-05-04 Thread Ander Conselvan De Oliveira

On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> From: "Kahola, Mika" 
> 
> DPLL's are defined in DPCLKA_CFGCR0 register (0x6C200). Let's use these
> definitions when computing dpll's for ddi ports.
> 
> v2: (Rodrigo) Remove register that was defined in another patch with
> fixed name and more bits.
> 
> Signed-off-by: Kahola, Mika 
> Signed-off-by: Rodrigo Vivi 
> ---
>  drivers/gpu/drm/i915/intel_display.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c 
> b/drivers/gpu/drm/i915/intel_display.c
> index 87d2822..4d0ae98 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8850,6 +8850,22 @@ static int haswell_crtc_compute_clock(struct 
> intel_crtc *crtc,
>   return 0;
>  }
>  
> +static void cannonlake_get_ddi_pll(struct drm_i915_private *dev_priv,
> +enum port port,
> +struct intel_crtc_state *pipe_config)
> +{
> + enum intel_dpll_id id;
> + u32 temp;
> +
> + temp = I915_READ(DPCLKA_CFGCR0) & DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port);
> + id = temp >> (port * 2);

Maybe use DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT which was defined in the previous
patch?

Also, might make sense to squash this with the next patch, but anyway,

Reviewed-by: Ander Conselvan de Oliveira 


> +
> + if (WARN_ON(id < SKL_DPLL0 || id > SKL_DPLL2))
> + return;
> +
> + pipe_config->shared_dpll = intel_get_shared_dpll_by_id(dev_priv, id);
> +}
> +
>  static void bxt_get_ddi_pll(struct drm_i915_private *dev_priv,
>   enum port port,
>   struct intel_crtc_state *pipe_config)
> @@ -9037,7 +9053,9 @@ static void haswell_get_ddi_port_state(struct 
> intel_crtc *crtc,
>  
>   port = (tmp & TRANS_DDI_PORT_MASK) >> TRANS_DDI_PORT_SHIFT;
>  
> - if (IS_GEN9_BC(dev_priv))
> + if (IS_CANNONLAKE(dev_priv))
> + cannonlake_get_ddi_pll(dev_priv, port, pipe_config);
> + else if (IS_GEN9_BC(dev_priv))
>   skylake_get_ddi_pll(dev_priv, port, pipe_config);
>   else if (IS_GEN9_LP(dev_priv))
>   bxt_get_ddi_pll(dev_priv, port, pipe_config);

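The review suggestion above, decoding the PLL id with the shift macro rather than an open-coded `port * 2`, might look like the following user-space sketch. The macro names are shortened and `ddi_clk_sel_decode()` is a hypothetical helper; only the 2-bits-per-port layout is taken from the earlier patch, and the bound on `port` is a guard for this sketch rather than a hardware limit.

```c
#include <assert.h>
#include <stdint.h>

/* Same layout as DPCLKA_CFGCR0_DDI_CLK_SEL_{SHIFT,MASK}: 2 bits per port. */
#define DDI_CLK_SEL_SHIFT(port)	((port) * 2)
#define DDI_CLK_SEL_MASK(port)	(3u << DDI_CLK_SEL_SHIFT(port))

/* Extract the PLL select field for one port from a register value. */
unsigned int ddi_clk_sel_decode(uint32_t cfgcr0, unsigned int port)
{
	assert(port < 16);	/* sketch-only guard: field must fit in 32 bits */
	return (cfgcr0 & DDI_CLK_SEL_MASK(port)) >> DDI_CLK_SEL_SHIFT(port);
}
```

Using the same macro for encode and decode keeps the field position defined in one place, so a later layout change cannot leave the two paths out of step.
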
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible

2017-05-04 Thread Jani Nikula

On Thu, 04 May 2017, Michal Wajdeczko  wrote:
> We are using some scratch registers in MMIO based send function.
> Make their base and count flexible in preparation of upcoming
> GuC firmware/hardware changes. While around, change cmd len
> parameter verification from WARN_ON to GEM_BUG_ON as we don't
> need this all the time.

I'm not generally fond of caching the registers like this or adding
_MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here
and there, but here it's hard to see the rationale because you do this
in preparation for something that you're not sharing.

BR,
Jani.

>
> v2: call out WARN/GEM_BUG change in the commit msg (Daniele)
>
> Signed-off-by: Michal Wajdeczko 
> Suggested-by: Daniele Ceraolo Spurio 
> Cc: Daniele Ceraolo Spurio 
> Cc: Joonas Lahtinen 
> Reviewed-by: Daniele Ceraolo Spurio 
> ---
>  drivers/gpu/drm/i915/intel_uc.c | 41 
> ++---
>  drivers/gpu/drm/i915/intel_uc.h |  7 +++
>  2 files changed, 41 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> index 72f49e6..9d11c42 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv)
>   __intel_uc_fw_fini(&dev_priv->huc.fw);
>  }
>  
> +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
> +{
> + GEM_BUG_ON(!guc->send_regs.base);
> + GEM_BUG_ON(!guc->send_regs.count);
> + GEM_BUG_ON(i >= guc->send_regs.count);
> +
> + return _MMIO(guc->send_regs.base + 4 * i);
> +}
> +
> +static void guc_init_send_regs(struct intel_guc *guc)
> +{
> + struct drm_i915_private *dev_priv = guc_to_i915(guc);
> + enum forcewake_domains fw_domains = 0;
> + u32 i;
> +
> + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
> + guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
> +
> + for (i = 0; i < guc->send_regs.count; i++) {
> + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
> + guc_send_reg(guc, i),
> + FW_REG_READ | FW_REG_WRITE);
> + }
> + guc->send_regs.fw_domains = fw_domains;
> +}
> +
>  static int guc_enable_communication(struct intel_guc *guc)
>  {
>   /* XXX: placeholder for alternate setup */
> + guc_init_send_regs(guc);
>   guc->send = intel_guc_send_mmio;
>   return 0;
>  }
> @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const 
> u32 *action, u32 len)
>   int i;
>   int ret;
>  
> - if (WARN_ON(len < 1 || len > 15))
> - return -EINVAL;
> + GEM_BUG_ON(!len);
> + GEM_BUG_ON(len > guc->send_regs.count);
>  
>   mutex_lock(&guc->send_mutex);
> - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER);
> + intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains);
>  
>   dev_priv->guc.action_count += 1;
>   dev_priv->guc.action_cmd = action[0];
>  
>   for (i = 0; i < len; i++)
> - I915_WRITE(SOFT_SCRATCH(i), action[i]);
> + I915_WRITE(guc_send_reg(guc, i), action[i]);
>  
> - POSTING_READ(SOFT_SCRATCH(i - 1));
> + POSTING_READ(guc_send_reg(guc, i - 1));
>  
>   intel_guc_notify(guc);
>  
> @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
> *action, u32 len)
>* Fast commands should still complete in 10us.
>*/
>   ret = __intel_wait_for_register_fw(dev_priv,
> -SOFT_SCRATCH(0),
> +guc_send_reg(guc, 0),
>  INTEL_GUC_RECV_MASK,
>  INTEL_GUC_RECV_MASK,
>  10, 10, &status);
> @@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 
> *action, u32 len)
>   }
>   dev_priv->guc.action_status = status;
>  
> - intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
> + intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains);
>   mutex_unlock(&guc->send_mutex);
>  
>   return ret;
> diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
> index 097289b..a37a8cc 100644
> --- a/drivers/gpu/drm/i915/intel_uc.h
> +++ b/drivers/gpu/drm/i915/intel_uc.h
> @@ -205,6 +205,13 @@ struct intel_guc {
>   uint64_t submissions[I915_NUM_ENGINES];
>   uint32_t last_seqno[I915_NUM_ENGINES];
>  
> + /* GuC's FW specific registers used in MMIO send */
> + struct {
> + u32 base;
> + u32 count;
> + u32 fw_domains; /* enum forcewake_domains */
> + } send_regs;
> +
>   /* To serialize the intel_guc_send actions */
>   struct mutex send_mutex;

-- 
Jani Nikula, Intel Open Source Technology Center

Re: [Intel-gfx] [PATCH 5/5] drm/vblank: Lock down vblank->hwmode more

2017-05-04 Thread Daniel Vetter

On Wed, May 03, 2017 at 05:09:08PM +0300, Ville Syrjälä wrote:
> On Wed, May 03, 2017 at 09:26:38AM +0200, Daniel Vetter wrote:
> > In the previous patch we've implemented hwmode tracking a la i915 for
> > the vblank timestamp calculations. But that was just the basic
> > semantics, i915 has some nice sanity checks to make sure we keep
> > getting this right. Move them over too.
> > 
> > Cc: Ville Syrjälä 
> > Reviewed-by: Neil Armstrong 
> > Signed-off-by: Daniel Vetter 
> > ---
> >  drivers/gpu/drm/drm_irq.c|  8 +++-
> >  drivers/gpu/drm/i915/i915_irq.c  | 10 ++
> >  drivers/gpu/drm/i915/intel_display.c | 11 ++-
> >  3 files changed, 15 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index 89f0928b042a..942183a2aa3c 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -775,8 +775,10 @@ bool drm_calc_vbltimestamp_from_scanoutpos(struct 
> > drm_device *dev,
> > /* If mode timing undefined, just return as no-op:
> >  * Happens during initial modesetting of a crtc.
> >  */
> > -   if (mode->crtc_clock == 0) {
> > +   if (WARN_ON(mode->crtc_clock == 0)) {
> > DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n", pipe);
> > +   WARN_ON(drm_drv_uses_atomic_modeset(dev));
> 
> I would make these _ONCE() otherwise the machine might end up
> practically dead.

Will do.

> > +
> > return false;
> > }
> >  
> > @@ -1338,6 +1340,10 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
> > send_vblank_event(dev, e, seq, &now);
> > }
> > spin_unlock_irqrestore(&dev->event_lock, irqflags);
> > +
> > +   /* Will be reset by the modeset helpers when re-enabling the crtc by
> > +* calling drm_calc_timestamping_constants(). */
> > +   vblank->hwmode.crtc_clock = 0;
> >  }
> >  EXPORT_SYMBOL(drm_crtc_vblank_off);
> 
> Shouldn't we do this in drm_crtc_vblank_reset() as well?
> 
> Hmm. Except we call that after drm_calc_timestamping_constants(). I
> guess we should be able to move the reset() into
> intel_modeset_readout_hw_state(). And possibly move the vblank_on()
> call as well?

Yeah, it'd be nice to clean this stuff up some more, but there's also the
problem that legacy and new drivers call drm_calc_timestamping_constants
at opposite ends of the modeset sequence. Doing more here is a bunch more
work, maybe for the next patch series ...

I don't think we need to call it in _reset, at least at boot-up it should
be 0 already. And for s/r we already shut down the pipe on suspend, so
it's gone through this here.

With the _ONCE nit addressed (and the build breakage I've introduced in this
version fixed), ack from you on the entire series?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

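The _ONCE() request in the thread above is about a check that may fire on every vblank timestamp query. A user-space analogue of a warn-once check is sketched below; it is not the kernel's WARN_ON_ONCE() implementation. `SKETCH_WARN_ON_ONCE`, `warn_once_helper()`, `warnings_printed` and `mode_is_uninitialized()` are hypothetical names, and the macro relies on the GNU C statement-expression extension (GCC or Clang).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts messages actually emitted, so the effect can be observed. */
static unsigned int warnings_printed;

/* Print at most once per 'warned' flag, but always report the condition. */
static bool warn_once_helper(bool cond, bool *warned, const char *expr)
{
	assert(warned != NULL);

	if (cond && !*warned) {
		*warned = true;
		warnings_printed++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

/* One static flag per call site; evaluates to the truth of 'cond'. */
#define SKETCH_WARN_ON_ONCE(cond) ({				\
	static bool warned_;					\
	warn_once_helper(!!(cond), &warned_, #cond);		\
})

/* Modelled on the crtc_clock == 0 check in the quoted drm_irq.c hunk. */
bool mode_is_uninitialized(int crtc_clock)
{
	if (SKETCH_WARN_ON_ONCE(crtc_clock == 0))
		return true;
	return false;
}
```

Calling `mode_is_uninitialized(0)` repeatedly still returns true each time, but only the first call prints, so a persistent condition cannot flood the log.
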
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]

2017-05-04 Thread Joonas Lahtinen

On ke, 2017-05-03 at 12:37 +0100, Chris Wilson wrote:
> Explicitly assign the default priority, and give it a name (macro).
> 
> Signed-off-by: Chris Wilson 



>   kref_init(&ctx->ref);
>   list_add_tail(&ctx->link, &dev_priv->context_list);
>   ctx->i915 = dev_priv;
> + ctx->priority = I915_PRIORITY_DFL;

I915_PRIORITY_DEFAULT would work better.

Reviewed-by: Joonas Lahtinen 

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()

2017-05-04 Thread Patchwork

== Series Details ==

Series: series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()
URL   : https://patchwork.freedesktop.org/series/23958/
State : success

== Summary ==

Series 23958v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23958/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-snb-2600) fdo#100125
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (fi-byt-j1900) fdo#100652

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100652 https://bugs.freedesktop.org/show_bug.cgi?id=100652

fi-bdw-5557u   total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:436s
fi-bdw-gvtdvm  total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:429s
fi-bsw-n3050   total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:576s
fi-bxt-j4205   total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:506s
fi-bxt-t5700   total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:568s
fi-byt-j1900   total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:496s
fi-byt-n2820   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-elk-e7500   total:278  pass:221  dwarn:0   dfail:0   fail:0   skip:57  time:407s
fi-hsw-4770    total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:416s
fi-hsw-4770r   total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:402s
fi-ilk-650     total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:414s
fi-ivb-3520m   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:495s
fi-ivb-3770    total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:565s
fi-skl-6260u   total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:452s
fi-skl-6700hq  total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:583s
fi-skl-6700k   total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:461s
fi-skl-6770hq  total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:489s
fi-skl-gvtdvm  total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:429s
fi-snb-2520m   total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-snb-2600    total:278  pass:248  dwarn:1   dfail:0   fail:0   skip:29  time:415s

1fbac016c8f2c9d4405111f3425f778d2ecdea62 drm-tip: 2017y-05m-04d-12h-52m-01s UTC integration manifest
f1c0df1 drm/i915: Micro-optimise hotpath through intel_ring_begin()
03ee0e5 drm/i915: Report the ring->space from intel_ring_update_space()
eca45ee drm/i915: Avoid the branch in computing intel_ring_space()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4622/

Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Bjorn Helgaas

On Thu, May 4, 2017 at 4:21 AM, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
>
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
>
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
>
> Cc: Rafael J. Wysocki 
> Cc: Mika Westerberg 
> Cc: Borislav Petkov 
> Cc: Dan Williams 
> Cc: Amir Goldstein 
> Cc: Jarkko Sakkinen 
> Cc: Jani Nikula 
> Cc: Ben Skeggs 
> Cc: Benjamin Tissoires 
> Cc: Joerg Roedel 
> Cc: Adrian Hunter 
> Cc: Yisen Zhuang 
> Cc: Bjorn Helgaas 
> Cc: Zhang Rui 
> Cc: Felipe Balbi 
> Cc: Mathias Nyman 
> Cc: Heikki Krogerus 
> Cc: Liam Girdwood 
> Cc: Mark Brown 
> Signed-off-by: Andy Shevchenko 

For the drivers/pci parts:

Acked-by: Bjorn Helgaas 

> ---
>  drivers/acpi/acpi_extlog.c | 10 +++---
>  drivers/acpi/bus.c | 29 ++--
>  drivers/acpi/nfit/core.c   | 40 +++---
>  drivers/acpi/nfit/nfit.h   |  3 +-
>  drivers/acpi/utils.c   |  4 +--
>  drivers/char/tpm/tpm_crb.c |  9 +++--
>  drivers/char/tpm/tpm_ppi.c | 20 +--
>  drivers/gpu/drm/i915/intel_acpi.c  | 14 +++-
>  drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +--
>  drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c |  9 +++--
>  drivers/hid/i2c-hid/i2c-hid.c  |  9 +++--
>  drivers/iommu/dmar.c   | 11 +++---
>  drivers/mmc/host/sdhci-pci-core.c  |  9 +++--
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 
>  drivers/pci/pci-acpi.c | 11 +++---
>  drivers/pci/pci-label.c|  4 +--
>  drivers/thermal/int340x_thermal/int3400_thermal.c  |  8 ++---
>  drivers/usb/dwc3/dwc3-pci.c|  6 ++--
>  drivers/usb/host/xhci-pci.c|  9 +++--
>  drivers/usb/misc/ucsi.c|  2 +-
>  drivers/usb/typec/typec_wcove.c|  4 +--
>  include/acpi/acpi_bus.h|  9 ++---
>  include/linux/acpi.h   |  4 +--
>  include/linux/pci-acpi.h   |  2 +-
>  sound/soc/intel/skylake/skl-nhlt.c |  7 ++--
>  tools/testing/nvdimm/test/iomap.c  |  2 +-
>  tools/testing/nvdimm/test/nfit.c   |  2 +-
>  27 files changed, 116 insertions(+), 156 deletions(-)
>
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index 502ea4dc2080..69d6140b6afa 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, unsigned long val,
>
>  static bool __init extlog_get_l1addr(void)
>  {
> -   u8 uuid[16];
> +   uuid_le uuid;
> acpi_handle handle;
> union acpi_object *obj;
>
> -   acpi_str_to_uuid(extlog_dsm_uuid, uuid);
> -
> +   if (uuid_le_to_bin(extlog_dsm_uuid, &uuid))
> +   return false;
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
> return false;
> -   if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR))
> +   if (!acpi_check_dsm(handle, &uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR))
> return false;
> -   obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV,
> +   obj = acpi_evaluate_dsm_typed(handle, &uuid, EXTLOG_DSM_REV,
>   EXTLOG_FN_ADDR, NULL, ACPI_TYPE_INTEGER);
> if (!obj) {
> return false;
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index 784bda663d16..e8130a4873e9 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle,
> pr_debug("\n");
>  }
>
> -acpi_status acpi_str_to_uuid(char *str, u8 *uuid)
> -{
> -   int i;
> -   static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21,
> -   24, 26, 28, 30, 32, 34};
> -
> -   if (strlen(str) != 36)
> -   return AE_BAD_PARAMETER;
> -   for (i = 0; i < 36; i++) {
> -   if (i == 8 || i == 13 || i == 18 || i == 23) {
> -   if (str[i] != '-')
> -   return AE_BAD_PARAMETER;
> -   } else if (!isxdigit(str[i]))
> -   return AE_BAD_PARAMETER;
> -   }
> -   for (i = 0; i < 16; i++) {
> -   uuid[i] = hex_to_bin(str[opc_map_to_uuid[i]]) << 4;
> -   uuid[i] |= hex_to_bin(str[opc_map_to_uuid[i] + 1]);
> -   }
> -
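The quoted hunk is cut off in this archive. As a rough userspace illustration of what the removed helper did (validate the 36-character form, then pick hex digit pairs in the byte order given by the offset table), here is a minimal sketch. The names sketch_map, sketch_hex_to_bin and sketch_str_to_uuid are hypothetical, and the 0 / -1 return convention is an assumption of this sketch, not the kernel's acpi_status values.

```c
#include <ctype.h>
#include <string.h>

/* Offset of the high hex digit for each output byte; values copied from
 * the opc_map_to_uuid table in the removed function quoted above. */
static const int sketch_map[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21,
				   24, 26, 28, 30, 32, 34};

/* Hypothetical stand-in for the kernel's hex_to_bin(); the caller has
 * already checked that c is a hex digit. */
static int sketch_hex_to_bin(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return tolower((unsigned char)c) - 'a' + 10;
}

/* Returns 0 on success, -1 on malformed input (sketch convention). */
static int sketch_str_to_uuid(const char *str, unsigned char uuid[16])
{
	int i;

	if (strlen(str) != 36)
		return -1;
	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (str[i] != '-')
				return -1;
		} else if (!isxdigit((unsigned char)str[i])) {
			return -1;
		}
	}
	for (i = 0; i < 16; i++) {
		uuid[i] = (unsigned char)(sketch_hex_to_bin(str[sketch_map[i]]) << 4);
		uuid[i] |= (unsigned char)sketch_hex_to_bin(str[sketch_map[i] + 1]);
	}
	return 0;
}
```

The offset table byte-swaps the first three groups of the string, so "00112233-..." yields 0x33, 0x22, 0x11, 0x00 as the first four bytes, while the last two groups are stored in string order.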

Re: [Intel-gfx] [PATCH 8/9] drm/i915: Stop inlining the execlists IRQ handler

2017-05-04 Thread Mika Kuoppala

Chris Wilson  writes:

> As the handler is now quite complex, involving a few atomics, the cost
> of the function preamble is negligible in comparison and so we should
> leave the function out-of-line for better I$.
>
> Signed-off-by: Chris Wilson 

Reviewed-by: Mika Kuoppala 

> ---
>  drivers/gpu/drm/i915/i915_irq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 86ede88daaab..8f60c8045b3e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1353,7 +1353,7 @@ static void snb_gt_irq_handler(struct drm_i915_private *dev_priv,
>   ivybridge_parity_error_irq_handler(dev_priv, gt_iir);
>  }
>  
> -static __always_inline void
> +static void
>  gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  {
>   bool tasklet = false;
> -- 
> 2.11.0
>

Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()

2017-05-04 Thread Michal Wajdeczko

On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote:
> Exploit the power-of-two ring size to compute the space across the
> wraparound using a mask rather than an if. Convert to unsigned integers
> so the operation is well defined.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> Signed-off-by: Chris Wilson 
> Cc: Mika Kuoppala 
> Reviewed-by: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
>  2 files changed, 34 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3ce1c87dec46..e7ef04cc071b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -39,12 +39,16 @@
>   */
>  #define LEGACY_REQUEST_SIZE 200
>  
> -static int __intel_ring_space(int head, int tail, int size)
> +static unsigned int __intel_ring_space(unsigned int head,
> +unsigned int tail,
> +unsigned int size)
>  {
> - int space = head - tail;
> - if (space <= 0)
> - space += size;
> - return space - I915_RING_FREE_SPACE;
> + /*
> +  * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> +  * same cacheline, the Head Pointer must not be greater than the Tail
> +  * Pointer."
> +  */
> + return (head - tail - CACHELINE_BYTES) & (size - 1);

Btw, as you exploit the power-of-two ring size here, maybe it is worth repeating

GEM_BUG_ON(!is_power_of_2(size));

to emphasize this assumption in the code (not only in the commit message)?

-Michal
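
To make the power-of-two dependence concrete, here is a minimal userspace sketch of the masked computation from the patch next to the older branchy form. The sketch_ names are hypothetical, a plain assert() stands in for GEM_BUG_ON(), and the value 64 for CACHELINE_BYTES (and for the old reserved free space) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_CACHELINE_BYTES 64u /* assumed value for illustration */

static bool sketch_is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Masked form: unsigned wraparound plus "& (size - 1)" replaces the branch.
 * This is only a valid modulo when size is a power of two. */
static unsigned int sketch_ring_space(unsigned int head, unsigned int tail,
				      unsigned int size)
{
	assert(sketch_is_power_of_2(size)); /* stands in for GEM_BUG_ON() */
	return (head - tail - SKETCH_CACHELINE_BYTES) & (size - 1);
}

/* Older branchy form, shown for comparison only. */
static int sketch_ring_space_branch(int head, int tail, int size)
{
	int space = head - tail;

	if (space <= 0)
		space += size;
	return space - (int)SKETCH_CACHELINE_BYTES;
}
```

For a 4096-byte ring, both forms give 4032 when head == tail and 3904 when the tail is 128 bytes ahead of the head. They only differ when the branchy form would go negative, which the masked form wraps instead.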


Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread David Weinehall

On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> > On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > > Add a bunch of MOCS entries for gen 9 that were missing from 
> > > > > intel_mocs.
> > > > > Some of these are used by media-sdk; if these entries are missing
> > > > > the default will instead be to do everything uncached.
> > > > > 
> > > > > > This patch improves media-sdk performance by up to 60%
> > > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > > testing, without regressing any other benchmarks.
> > > > 
> > > > Hey David,
> > > > 
> > > > I am testing some of the extended MOCS with Mesa and the differences I
> > > > see fit in the margins of statistical error.
> > > > 
> > > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > > everything to UNCACHED - and I saw severe performance drop.
> > > > 
> > > > So here is the question it induced:
> > > > 
> > > > > Did you use the "closest neighbour" from the available entries, or did
> > > > > you default to the UNCACHED ones? That could be the culprit.
> > > > 
> > > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > > few synthetic cases - it will require much more fine-tuning and
> > > > benchmarking before any final conclusions.
> > > 
> > > > As I mentioned in the commit message, the improvements only manifest
> > > > themselves for media-sdk workloads (and presumably other workloads
> > > > that use the same hardware); if you see any performance regressions
> > > > with these additional entries I'd be interested to know.
> > 
> > > But what is being counter-suggested is that there is no reason for these
> > > mocs entries. If the sdk is just using mocs registers without first
> > programming them outside of the kernel abi, then it will be hitting
> > uncached memory - and then the only benefit is from simply enabling
> > cached access. The kernel ABI is minimalist for a reason, and we want to
> > know why we should be adding tables that we need to maintain forever
> > (bonus points for making that a consistent interface for hardware for
> > years to come).
> > -Chris
> 
> Thanks for rephrasing - that's exactly what I am concerned with.
> 
> Did you just use the MediaSDK as it is - meaning that MOCS entries
> beyond the set of the 3 we have defined had been naively utilized?
> 
> If that's the case it is probably the cause of the performance
> difference - everything beyond "the 3" means UNCACHED.
> 
> Can you try changing MediaSDK to only use entries that are already in?
> How does the performance differ in that case?

We're benchmarking using upstream MediaSDK without changes, since that's
the only thing that's relevant. Customising benchmarks to get better
results isn't really an acceptable solution :)

Obviously fixing MediaSDK upstream is a different story, in case one of
the three pre-defined entries we have turns out to be the best possible
MOCS-settings for that workload.


Kind regards, David

Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote:
> On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote:
> > Exploit the power-of-two ring size to compute the space across the
> > wraparound using a mask rather than an if. Convert to unsigned integers
> > so the operation is well defined.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> > Signed-off-by: Chris Wilson 
> > Cc: Mika Kuoppala 
> > Reviewed-by: Mika Kuoppala 
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
> >  drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
> >  2 files changed, 34 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 3ce1c87dec46..e7ef04cc071b 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -39,12 +39,16 @@
> >   */
> >  #define LEGACY_REQUEST_SIZE 200
> >  
> > -static int __intel_ring_space(int head, int tail, int size)
> > +static unsigned int __intel_ring_space(unsigned int head,
> > +  unsigned int tail,
> > +  unsigned int size)
> >  {
> > -   int space = head - tail;
> > -   if (space <= 0)
> > -   space += size;
> > -   return space - I915_RING_FREE_SPACE;
> > +   /*
> > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> > +* same cacheline, the Head Pointer must not be greater than the Tail
> > +* Pointer."
> > +*/
> > +   return (head - tail - CACHELINE_BYTES) & (size - 1);
> 
> Btw, as you exploit the power-of-two ring size here, maybe it is worth repeating
> 
>   GEM_BUG_ON(!is_power_of_2(size));
> 
> to emphasize this assumption in the code (not only in the commit message)?

I've made the cardinal sin of changing it at the last moment, if I've
broken everything I'm going to blame you :)

Semi-pushed, looks like we're already back in conflict territory.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote:
> On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote:
> > Exploit the power-of-two ring size to compute the space across the
> > wraparound using a mask rather than an if. Convert to unsigned integers
> > so the operation is well defined.
> > 
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> > Signed-off-by: Chris Wilson 
> > Cc: Mika Kuoppala 
> > Reviewed-by: Mika Kuoppala 
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
> >  drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
> >  2 files changed, 34 insertions(+), 25 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 3ce1c87dec46..e7ef04cc071b 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -39,12 +39,16 @@
> >   */
> >  #define LEGACY_REQUEST_SIZE 200
> >  
> > -static int __intel_ring_space(int head, int tail, int size)
> > +static unsigned int __intel_ring_space(unsigned int head,
> > +  unsigned int tail,
> > +  unsigned int size)
> >  {
> > -   int space = head - tail;
> > -   if (space <= 0)
> > -   space += size;
> > -   return space - I915_RING_FREE_SPACE;
> > +   /*
> > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> > +* same cacheline, the Head Pointer must not be greater than the Tail
> > +* Pointer."
> > +*/
> > +   return (head - tail - CACHELINE_BYTES) & (size - 1);
> 
> Btw, as you exploit the power-of-two ring size here, maybe it is worth repeating
> 
>   GEM_BUG_ON(!is_power_of_2(size));
> 
> to emphasize this assumption in the code (not only in the commit message)?

I did check we had an is_power_of_2() check in intel_engine_create_ring.
Might be worth asserting here as well as there's a little disconnect
between the function and ring->size.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread David Weinehall

On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
> A good default for garbage entries from the user is to follow the
> default setting of the object (i.e. the PTE). Currently they use the
> uncached entry, and now the only way to accidentally hit uncached
> performance is via explicit use of the uncached MOCS or setting the
> object to uncached. Note that these entries are currently undefined in
> the ABI and we reserve the right to change them. We originally chose
> uncached to eliminate any problem with reducing the caching level in
> future, but the object is a much better definition of the minimum
> caching level.
> 
> Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS")
> Signed-off-by: Chris Wilson 
> Cc: David Weinehall 
> Cc: Arkadiusz Hiler 
> Cc: Tvrtko Ursulin 
> Cc: sta...@vger.kernel.org

LGTM, and passes our nightly msdk test case.

Tested-by: David Weinehall 
Reviewed-by: David Weinehall 
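
The behaviour change is small enough to sketch in userspace: program the defined table entries as before, then fill every remaining slot with the "follow the PTE" entry instead of entry 0 (uncached). This is only an illustration of the idea; sketch_fill_mocs and its parameters are hypothetical, and the index used for the PTE entry is passed in by the caller because its numeric value is not shown in the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * regs:       destination "register file" (stands in for the MMIO writes)
 * n_regs:     total number of entries to program
 * table:      control values that are defined for userspace
 * table_size: number of defined entries
 * fallback:   index of the table entry used for all undefined slots
 */
static void sketch_fill_mocs(uint32_t *regs, size_t n_regs,
			     const uint32_t *table, size_t table_size,
			     size_t fallback)
{
	size_t i;

	assert(fallback < table_size);

	/* Defined entries are copied through unchanged. */
	for (i = 0; i < table_size && i < n_regs; i++)
		regs[i] = table[i];

	/* Undefined entries inherit the fallback (PTE) setting. */
	for (; i < n_regs; i++)
		regs[i] = table[fallback];
}
```

Passing 0 as the fallback reproduces the old "everything else is uncached" behaviour, given the table's rule that entry 0 is uncached; passing the PTE index reproduces the new one.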

> ---
>  drivers/gpu/drm/i915/intel_mocs.c | 39 +++
>  1 file changed, 15 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c
> index 92e461c68385..e7a7781ca457 100644
> --- a/drivers/gpu/drm/i915/intel_mocs.c
> +++ b/drivers/gpu/drm/i915/intel_mocs.c
> @@ -85,10 +85,7 @@ struct drm_i915_mocs_table {
>   *
>   * Entries not part of the following tables are undefined as far as
>   * userspace is concerned and shouldn't be relied upon.  For the time
> - * being they will be implicitly initialized to the strictest caching
> - * configuration (uncached) to guarantee forwards compatibility with
> - * userspace programs written against more recent kernels providing
> - * additional MOCS entries.
> + * being they will be implicitly initialized to follow the PTE.
>   *
>   * NOTE: These tables MUST start with being uncached and the length
>   *   MUST be less than 63 as the last two registers are reserved
> @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs *engine)
>  table.table[index].control_value);
>  
>   /*
> -  * Ok, now set the unused entries to uncached. These entries
> +  * Ok, now set the unused entries to follow the PTE. These entries
>* are officially undefined and no contract for the contents
>* and settings is given for these entries.
> -  *
> -  * Entry 0 in the table is uncached - so we are just writing
> -  * that value to all the used entries.
>*/
>   for (; index < GEN9_NUM_MOCS_ENTRIES; index++)
>   I915_WRITE(mocs_register(engine->id, index),
> -table.table[0].control_value);
> +table.table[I915_MOCS_PTE].control_value);
>  
>   return 0;
>  }
> @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct drm_i915_gem_request *req,
>   }
>  
>   /*
> -  * Ok, now set the unused entries to uncached. These entries
> +  * Ok, now set the unused entries to follow the PTE. These entries
>* are officially undefined and no contract for the contents
>* and settings is given for these entries.
> -  *
> -  * Entry 0 in the table is uncached - so we are just writing
> -  * that value to all the used entries.
>*/
>   for (; index < GEN9_NUM_MOCS_ENTRIES; index++) {
>   *cs++ = i915_mmio_reg_offset(mocs_register(engine, index));
> - *cs++ = table->table[0].control_value;
> + *cs++ = table->table[I915_MOCS_PTE].control_value;
>   }
>  
>   *cs++ = MI_NOOP;
> @@ -355,18 +346,17 @@ static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req,
>   if (table->size & 0x01) {
>   /* Odd table size - 1 left over */
>   *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
> - *cs++ = l3cc_combine(table, 2 * i, 0);
> + *cs++ = l3cc_combine(table, 2 * i, I915_MOCS_PTE);
>   i++;
>   }
>  
>   /*
> -  * Now set the rest of the table to uncached - use entry 0 as
> -  * this will be uncached. Leave the last pair uninitialised as
> -  * they are reserved by the hardware.
> +  * Now set the rest of the table to follow the PTE.
> +  * Leave the last pair as they are reserved by the hardware.
>*/
>   for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) {
>   *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i));
> - *cs++ = l3cc_combine(table, 0, 0);
> + *cs++ = l3cc_combine(table, I915_MOCS_PTE, I915_MOCS_PTE);
>   }
>  
>   *cs++ = MI_NOOP;
> @@ -402,17 +392,18 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private *dev_priv)
>  
>   /* Odd table size - 1 left over */
>   if (table.size & 0x01) {
> - I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0));
> + I915_WRITE(GEN9_LNCFCMOCS(i),
> +l3cc_combine(&table, 2*i, I915

Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]

2017-05-04 Thread Chris Wilson

On Thu, May 04, 2017 at 04:32:34PM +0300, Joonas Lahtinen wrote:
> On ke, 2017-05-03 at 12:37 +0100, Chris Wilson wrote:
> > Explicitly assign the default priority, and give it a name (macro).
> > 
> > Signed-off-by: Chris Wilson 
> 
> 
> 
> >     kref_init(&ctx->ref);
> >     list_add_tail(&ctx->link, &dev_priv->context_list);
> >     ctx->i915 = dev_priv;
> > +   ctx->priority = I915_PRIORITY_DFL;
> 
> I915_PRIORITY_DEFAULT would work better.

On the one hand I have the symmetry with MIN, DFL, MAX, on the other
hand DFL is plain bizarre.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

[Intel-gfx] [drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us

2017-05-04 Thread Jens Axboe

Hi,

Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
a lot of the below warnings. Things seem to work fine (in fact it seems
faster in general use than previously), but it's a lot of warning spew.

[  764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under evasion is 100 us
[ 1210.063144] [drm] Atomic update on pipe (A) took 152 us, max time under evasion is 100 us
[ 1272.208727] [drm] Atomic update on pipe (A) took 213 us, max time under evasion is 100 us
[ 1308.106266] [drm] Atomic update on pipe (A) took 194 us, max time under evasion is 100 us
[ 1308.439572] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us
[ 1371.905950] [drm] Atomic update on pipe (A) took 135 us, max time under evasion is 100 us
[ 1373.891378] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us
[ 1497.259572] [drm] Atomic update on pipe (A) took 199 us, max time under evasion is 100 us
[ 1497.292922] [drm] Atomic update on pipe (A) took 178 us, max time under evasion is 100 us
[ 1497.326313] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us
[ 1534.106959] [drm] Atomic update on pipe (A) took 223 us, max time under evasion is 100 us
[ 1534.190331] [drm] Atomic update on pipe (A) took 180 us, max time under evasion is 100 us
[ 1680.613275] [drm] Atomic update on pipe (A) took 101 us, max time under evasion is 100 us
[ 1870.783352] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us
[ 2338.083752] [drm] Atomic update on pipe (A) took 225 us, max time under evasion is 100 us
[ 2405.212252] [drm] Atomic update on pipe (A) took 114 us, max time under evasion is 100 us
[ 2421.811125] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us
[ 2426.344151] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us
[ 2439.012088] [drm] Atomic update on pipe (A) took 143 us, max time under evasion is 100 us
[ 2446.011309] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us
[ 2446.142622] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us
[ 2446.542772] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us
[ 2448.243922] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us
[ 2450.042450] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us
[ 2456.575226] [drm] Atomic update on pipe (A) took 131 us, max time under evasion is 100 us
[ 2457.275176] [drm] Atomic update on pipe (A) took 115 us, max time under evasion is 100 us
[ 2464.308098] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us
[ 2569.418646] [drm] Atomic update on pipe (A) took 179 us, max time under evasion is 100 us
[ 2572.302065] [drm] Atomic update on pipe (A) took 133 us, max time under evasion is 100 us
[ 2589.933225] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us
[ 2590.701810] [drm] Atomic update on pipe (A) took 175 us, max time under evasion is 100 us
[ 2606.732899] [drm] Atomic update on pipe (A) took 130 us, max time under evasion is 100 us
[ 2611.732710] [drm] Atomic update on pipe (A) took 147 us, max time under evasion is 100 us
[ 2615.532819] [drm] Atomic update on pipe (A) took 145 us, max time under evasion is 100 us
[ 2654.412509] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us
[ 2657.012470] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us
[ 2714.341971] [drm] Atomic update on pipe (A) took 144 us, max time under evasion is 100 us
[ 2775.486168] [drm] Atomic update on pipe (A) took 138 us, max time under evasion is 100 us
[ 2782.852360] [drm] Atomic update on pipe (A) took 113 us, max time under evasion is 100 us
[ 2795.319781] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us
[ 2818.601093] [drm] Atomic update on pipe (A) took 160 us, max time under evasion is 100 us
[ 2867.998524] [drm] Atomic update on pipe (A) took 167 us, max time under evasion is 100 us
[ 2878.980535] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us
[ 2945.607547] [drm] Atomic update on pipe (A) took 110 us, max time under evasion is 100 us
[ 2957.606588] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=177768 end=177769) time 214 us, min 1431, max 1439, scanline start 1423, end 1442
[ 2958.609128] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us
[ 2960.059591] [drm] Atomic update on pipe (A) took 186 us, max time under evasion is 100 us
[ 2960.658177] [drm] Atomic update on pipe (A) took 181 us, max time under evasion is 100 us
[ 3002.688632] [drm] Atomic update on pipe (A) took 210 us, max time under evasion is 100 us
[ 3021.939015] [drm] Atomic update on pipe (A) took 140 us, max time under evasion

Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Benjamin Tissoires

On May 04 2017 or thereabouts, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
> 
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
> 
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
> 
> Cc: Rafael J. Wysocki 
> Cc: Mika Westerberg 
> Cc: Borislav Petkov 
> Cc: Dan Williams 
> Cc: Amir Goldstein 
> Cc: Jarkko Sakkinen 
> Cc: Jani Nikula 
> Cc: Ben Skeggs 
> Cc: Benjamin Tissoires 
> Cc: Joerg Roedel 
> Cc: Adrian Hunter 
> Cc: Yisen Zhuang 
> Cc: Bjorn Helgaas 
> Cc: Zhang Rui 
> Cc: Felipe Balbi 
> Cc: Mathias Nyman 
> Cc: Heikki Krogerus 
> Cc: Liam Girdwood 
> Cc: Mark Brown 
> Signed-off-by: Andy Shevchenko 
> ---

For i2c-hid:
Acked-by: Benjamin Tissoires 

[Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow

2017-05-04 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
(Joonas Lahtinen)
v5: Simplified loops and use post-increments rather than
pre-increments. Use PAGE_MASK and fix comment typo.
(Andy Shevchenko)

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v2)
Cc: Joonas Lahtinen 
Cc: Andy Shevchenko 
---
 include/linux/scatterlist.h |  6 ++
 lib/scatterlist.c   | 31 ---
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..4768eeeb7054 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
 };
 
 /*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * length in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK)
+
+/*
  * These macros should be used after a dma_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 11f172c383cb..ca4ccd8c80b9 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
unsigned int offset, unsigned long size,
gfp_t gfp_mask)
 {
-   unsigned int chunks;
-   unsigned int i;
-   unsigned int cur_page;
+   const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+   unsigned int chunks, cur_page, seg_len, i;
int ret;
struct scatterlist *s;
 
/* compute number of contiguous chunks */
chunks = 1;
-   for (i = 1; i < n_pages; ++i)
-   if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
-   ++chunks;
+   seg_len = 0;
+   for (i = 1; i < n_pages; i++) {
+   seg_len += PAGE_SIZE;
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
+   chunks++;
+   seg_len = 0;
+   }
+   }
 
ret = sg_alloc_table(sgt, chunks, gfp_mask);
if (unlikely(ret))
@@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
/* merging chunks and putting them into the scatterlist */
cur_page = 0;
for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-   unsigned long chunk_size;
-   unsigned int j;
+   unsigned int j, chunk_size;
 
/* look for the end of the current chunk */
-   for (j = cur_page + 1; j < n_pages; ++j)
-   if (page_to_pfn(pages[j]) !=
+   seg_len = 0;
+   for (j = cur_page + 1; j < n_pages; j++) {
+   seg_len += PAGE_SIZE;
+   if (seg_len >= max_segment ||
+   page_to_pfn(pages[j]) !=
page_to_pfn(pages[j - 1]) + 1)
break;
+   }
 
chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-   sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+   sg_set_page(s, pages[cur_page],
+   min_t(unsigned long, size, chunk_size), offset);
size -= chunk_size;
offset = 0;
cur_page = j;
-- 
2.9.3
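
The chunk-counting loop in this patch can be exercised outside the kernel. The sketch below mirrors it over an array of page frame numbers: a new chunk starts either when the pages stop being physically contiguous or when the running segment length reaches the limit. The SKETCH_ constants (4 KiB pages) and sketch_count_chunks are hypothetical names and values chosen for illustration; only the loop shape is taken from the diff above.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u                            /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1u))
/* Largest page-aligned length that fits an unsigned int length field. */
#define SKETCH_MAX_SEGMENT (UINT_MAX & SKETCH_PAGE_MASK)

/* Count the scatterlist entries needed for n_pages pages, given their
 * page frame numbers and a page-aligned per-entry length limit. */
static unsigned int sketch_count_chunks(const unsigned long *pfns,
					size_t n_pages,
					unsigned int max_segment)
{
	unsigned int chunks = 1;
	unsigned int seg_len = 0;
	size_t i;

	for (i = 1; i < n_pages; i++) {
		seg_len += SKETCH_PAGE_SIZE;
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			chunks++;
			seg_len = 0;
		}
	}
	return chunks;
}
```

With pfns {10, 11, 12, 20, 21} the only split under the default limit is the physical gap, giving two chunks; capping the segment at two pages forces an extra split inside the first run, giving three.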


[Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

2017-05-04 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Drivers like i915 benefit from being able to control the maximum
size of the coalesced sg segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function,
which gives them that control.
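
As a rough illustration of what the max_segment parameter controls, the
chunk-counting loop can be modelled in userspace. This is only a sketch
under stated assumptions, not the kernel code: sketch_count_chunks(),
sketch_max_segment_valid() and SKETCH_PAGE_SIZE are hypothetical names,
and a plain array of page frame numbers stands in for the page array and
page_to_pfn().

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, for illustration only */

/* Mirrors the sanity check: max_segment must be non-zero and page aligned. */
int sketch_max_segment_valid(unsigned int max_segment)
{
	return max_segment != 0 && (max_segment % SKETCH_PAGE_SIZE) == 0;
}

/*
 * Count how many scatterlist entries would be needed. A new chunk starts
 * whenever the next page is not physically contiguous with the previous
 * one, or the current chunk has already grown to max_segment bytes.
 */
unsigned int sketch_count_chunks(const unsigned long *pfns, size_t n_pages,
				 unsigned int max_segment)
{
	unsigned int chunks = 1, seg_len = 0;
	size_t i;

	for (i = 1; i < n_pages; i++) {
		seg_len += SKETCH_PAGE_SIZE;
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			chunks++;
			seg_len = 0;
		}
	}

	return chunks;
}
```

With a small max_segment a physically contiguous run is split into several
entries, while a very large value leaves only the discontiguities as
chunk boundaries.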

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.
v6: Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson 
Reviewed-by: Chris Wilson  (v2)
Cc: Joonas Lahtinen 
---
 include/linux/scatterlist.h | 11 +
 lib/scatterlist.c   | 58 +++--
 2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 4768eeeb7054..4d67a9652c7d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
  size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index ca4ccd8c80b9..73dace1bd5bb 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *an array of pages
- * @sgt:   The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages:   Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size:   Number of valid bytes in the buffer (after offset)
- * @gfp_mask:  GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *  an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:   GFP allocation mask
  *
  *  Description:
  *Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,16 +390,18 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-   struct page **pages, unsigned int n_pages,
-   unsigned int offset, unsigned long size,
-   gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+   unsigned int n_pages, unsigned int offset,
+   unsigned long size, unsigned int max_segment,
+   gfp_t gfp_mask)
 {
-   const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
unsigned int chunks, cur_page, seg_len, i;
int ret;
struct scatterlist *s;
 
+   if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+   return -EINVAL;
+
/* compute number of contiguous chunks */
chunks = 1;
seg_len = 0;
@@ -440,6 +443,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *an array of pages
+ * @sgt:The sg table header to use
+ * @pages:  Pointer to an array of page pointers
+ * @n_pages:Number of pages in the pages array
+ * @offset:  Offset from start of the first page to the start of a buffer
+ * @size:Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:   GFP allocation mask
+ *
+ *  Description:
+ *Allocate and initialize an sg table from a list of pages. Contiguous
+ *

[Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

2017-05-04 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets within a page, unsigned int is
wide enough.

Also convert callers which were using unsigned long locally,
adding the lower_32_bits annotation to make it explicitly
clear what is happening.
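
The reasoning can be checked with a small userspace model. This is a
sketch only: sketch_offset_in_page() and the SKETCH_* macros are
hypothetical stand-ins for the kernel's offset_in_page() and PAGE_MASK,
and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096) /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * The offset within a page is the address with the page-number bits
 * masked off, so it is always smaller than the page size and fits
 * comfortably in an unsigned int, whatever the width of the address.
 */
unsigned int sketch_offset_in_page(uintptr_t addr)
{
	return (unsigned int)(addr & ~SKETCH_PAGE_MASK);
}
```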

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin 
Cc: Masahiro Yamada 
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Stanislawski 
Cc: Matt Porter 
Cc: Alexandre Bounine 
Cc: linux-me...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Marek Szyprowski  (v1)
Reviewed-by: Chris Wilson 
Reviewed-by: Mauro Carvalho Chehab 
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c   | 4 ++--
 include/linux/scatterlist.h| 2 +-
 lib/scatterlist.c  | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 2db0413f5d57..b5009c1649bc 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
struct vb2_dc_buf *buf;
struct frame_vector *vec;
-   unsigned long offset;
+   unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;
 
-   offset = vaddr & ~PAGE_MASK;
+   offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 50b617af81bd..a8b6696ab6cb 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 * offset within the internal buffer specified by handle parameter.
 */
if (xfer->loc_addr) {
-   unsigned long offset;
+   unsigned int offset;
long pinned;
 
-   offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+   offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;
 
page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
-   unsigned long offset, unsigned long size,
+   unsigned int offset, unsigned long size,
gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c6cf82242d65..11f172c383cb 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
-   unsigned long offset, unsigned long size,
+   unsigned int offset, unsigned long size,
gfp_t gfp_mask)
 {
unsigned int chunks;
-- 
2.9.3


[Intel-gfx] [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

2017-05-04 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)

v7: Rebase.
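
The clamping that the v2 and v6 notes describe can be modelled in
userspace as follows. This is a hedged sketch, not the driver code:
sketch_sg_segment_size() takes the swiotlb limit as a parameter instead
of calling swiotlb_max_segment(), and SKETCH_PAGE_SIZE and
SKETCH_MAX_SEGMENT are assumed stand-ins for PAGE_SIZE and
SCATTERLIST_MAX_SEGMENT.

```c
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
/* Largest page-aligned value that fits in an unsigned int (assumption). */
#define SKETCH_MAX_SEGMENT (UINT_MAX - UINT_MAX % SKETCH_PAGE_SIZE)

unsigned int sketch_sg_segment_size(unsigned int swiotlb_limit)
{
	unsigned int size = swiotlb_limit;

	/* Zero means no swiotlb limit applies: use the generic maximum. */
	if (size == 0)
		return SKETCH_MAX_SEGMENT;

	/* Round down to a whole number of pages. */
	size -= size % SKETCH_PAGE_SIZE;

	/* A limit below one page (such as 1) is treated as one page. */
	if (size < SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;

	return size;
}
```

The result is always non-zero and page aligned, which is what the
max_segment sanity check in __sg_alloc_table_from_pages expects.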

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson  (v4)
Cc: Joonas Lahtinen 
---
 drivers/gpu/drm/i915/i915_drv.h | 15 +++
 drivers/gpu/drm/i915/i915_gem.c |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b20ed16da0ad..320c16df1c9c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2676,6 +2676,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 (((__iter).curr += PAGE_SIZE) < (__iter).max) ||   \
 ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+   unsigned int size = swiotlb_max_segment();
+
+   if (size == 0)
+   return SCATTERLIST_MAX_SEGMENT;
+
+   size = rounddown(size, PAGE_SIZE);
+   /* swiotlb_max_segment_size can return 1 byte when it means one page. */
+   if (size < PAGE_SIZE)
+   size = PAGE_SIZE;
+
+   return size;
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f9c6b9b5002c..b2727905ef2b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2336,7 +2336,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
-   unsigned int max_segment;
+   unsigned int max_segment = i915_sg_segment_size();
int ret;
gfp_t gfp;
 
@@ -2347,10 +2347,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-   max_segment = swiotlb_max_segment();
-   if (!max_segment)
-   max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 58ccf8b8ca1c..d003076702ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -399,64 +399,42 @@ struct get_pages_work {
struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-   struct scatterlist *sg;
-   int ret, n;
-
-   *st = kmalloc(sizeof(**st), GFP_KERNEL);
-   if (*st == NULL)
-   return -ENOMEM;
-
-   if (swiotlb_active()) {
-   ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-   if (ret)
-   goto err;
-
-   for_each_sg((*st)->sgl, sg, num_pages, n)
-   sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-   } else {
-   ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-   0, num_pages << PAGE_SHIFT,
-   GFP_KERNEL);
-   if (ret)
-   goto err;
-   }
-
-   return 0;
-
-err:
-   kfree(*st);
-   *st = NULL;
-   return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+  struct page **pvec, int num_pages)
 {
-   struct sg_table *pages;
+   unsigned int max_segment = i915_sg_segment_size();
+   struct sg_table *st;
int ret;
 
-   ret = st_set_pages(&pages, pvec, num_pages

Re: [Intel-gfx] [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf

2017-05-04 Thread Alex Williamson

On Thu, 4 May 2017 03:09:40 +
"Chen, Xiaoguang"  wrote:

> Hi Alex, do you have any comments for this interface?
> 
> >-Original Message-
> >From: intel-gvt-dev [mailto:intel-gvt-dev-boun...@lists.freedesktop.org] On
> >Behalf Of Chen, Xiaoguang
> >Sent: Wednesday, May 03, 2017 9:39 AM
> >To: Gerd Hoffmann 
> >Cc: Tian, Kevin ; intel-gfx@lists.freedesktop.org; 
> >linux-
> >ker...@vger.kernel.org; zhen...@linux.intel.com; alex.william...@redhat.com;
> >Lv, Zhiyuan ; intel-gvt-...@lists.freedesktop.org; 
> >Wang,
> >Zhi A 
> >Subject: RE: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
> >
> >
> >  
> >>-Original Message-
> >>From: Gerd Hoffmann [mailto:kra...@redhat.com]
> >>Sent: Tuesday, May 02, 2017 5:51 PM
> >>To: Chen, Xiaoguang 
> >>Cc: alex.william...@redhat.com; intel-gfx@lists.freedesktop.org;
> >>intel-gvt- d...@lists.freedesktop.org; Wang, Zhi A
> >>; zhen...@linux.intel.com;
> >>linux-ker...@vger.kernel.org; Lv, Zhiyuan ; Tian,
> >>Kevin 
> >>Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the
> >>dmabuf
> >>
> >>On Fr, 2017-04-28 at 17:35 +0800, Xiaoguang Chen wrote:  
> >>> +static size_t intel_vgpu_reg_rw_gvtg(struct intel_vgpu *vgpu, char
> >>> *buf,
> >>> +   size_t count, loff_t *ppos, bool iswrite) {
> >>> +   unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) -
> >>> +   VFIO_PCI_NUM_REGIONS;
> >>> +   loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
> >>> +   int fd;
> >>> +
> >>> +   if (pos >= vgpu->vdev.region[i].size || iswrite) {
> >>> +   gvt_vgpu_err("invalid op or offset for Intel vgpu fd
> >>> region\n");
> >>> +   return -EINVAL;
> >>> +   }
> >>> +
> >>> +   fd = anon_inode_getfd("gvtg", &intel_vgpu_gvtg_ops, vgpu,
> >>> +   O_RDWR | O_CLOEXEC);
> >>> +   if (fd < 0) {
> >>> +   gvt_vgpu_err("create intel vgpu fd failed:%d\n", fd);
> >>> +   return -EINVAL;
> >>> +   }
> >>> +
> >>> +   count = min(count, (size_t)(vgpu->vdev.region[i].size - pos));
> >>> +   memcpy(buf, &fd, count);
> >>> +
> >>> +   return count;
> >>> +}  
> >>
> >>Hmm, that looks like a rather strange way to return a file descriptor.
> >>
> >>What is the reason to not use ioctls on the vfio file handle, like
> >>older version of these patches did?  
> >If I understood correctly, Alex prefers not to change the ioctls on the
> >vfio file handle as the old version did.
> >So I went this way, which makes the smallest change to the general vfio
> >framework: it only adds a subregion definition.

I think I was hoping we could avoid a separate file descriptor
altogether and use a vfio region instead.  However, it was explained
previously why this really needs to be a separate fd and I agree that
using a region to expose an fd is really awkward.  If we're going to
have a separate fd, let's use a device specific ioctl to get it.
Thanks,

Alex

Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible

2017-05-04 Thread Michal Wajdeczko

On Thu, May 04, 2017 at 04:22:15PM +0300, Jani Nikula wrote:
> On Thu, 04 May 2017, Michal Wajdeczko  wrote:
> > We are using some scratch registers in MMIO based send function.
> > Make their base and count flexible in preparation of upcoming
> > GuC firmware/hardware changes. While around, change cmd len
> > parameter verification from WARN_ON to GEM_BUG_ON as we don't
> > need this all the time.
> 
> I'm not generally fond of caching the registers like this or adding
> _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here
> and there, but here it's hard to see the rationale because you do this
> in preparation for something that you're not sharing.
> 

I can't share details atm, but as the commit message says, there will be a
change in both the offsets and the number of scratch registers.

Imho any wrapping around these values can't go into the i915_[guc_]reg.h file,
as that file shall include only raw MMIO definitions, without any extra
logic that is based on GEN or PLATFORM or FW version.

An alternate approach would be, thanks to the already defined virtual function
send(), to create new send_mmio function(s) that are 100% the same as
the old send_mmio except for the offset and count of the scratch registers.

Then we could benefit from the optimal implementation per GEN|PLATFORM|FW,
which can run without reading cached register offsets/count, but at the cost
of extra code that needs to be kept in sync with the original function.
And then someone else could point out that we missed a code-sharing opportunity.

I'm afraid there is no clear winner.
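
For reference, the cached base/count variant being discussed boils down to
something like the following userspace sketch. The names
sketch_send_regs and sketch_send_reg() are hypothetical, assert() stands
in for GEM_BUG_ON(), and the function returns a raw offset rather than an
i915_reg_t.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_send_regs {
	uint32_t base;  /* MMIO offset of the first scratch register */
	uint32_t count; /* number of scratch registers usable for sending */
};

/* Offset of the i-th send register; registers are 4 bytes apart. */
uint32_t sketch_send_reg(const struct sketch_send_regs *regs, uint32_t i)
{
	assert(regs->base != 0);
	assert(regs->count != 0);
	assert(i < regs->count);

	return regs->base + 4 * i;
}
```

The per-platform send_mmio alternative would hard-code base and count in
each copy of the function instead of reading them from such a structure.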

-Michal


> BR,
> Jani.
> 
> >
> > v2: call out WARN/GEM_BUG change in the commit msg (Daniele)
> >
> > Signed-off-by: Michal Wajdeczko 
> > Suggested-by: Daniele Ceraolo Spurio 
> > Cc: Daniele Ceraolo Spurio 
> > Cc: Joonas Lahtinen 
> > Reviewed-by: Daniele Ceraolo Spurio 
> > ---
> >  drivers/gpu/drm/i915/intel_uc.c | 41 
> > ++---
> >  drivers/gpu/drm/i915/intel_uc.h |  7 +++
> >  2 files changed, 41 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_uc.c 
> > b/drivers/gpu/drm/i915/intel_uc.c
> > index 72f49e6..9d11c42 100644
> > --- a/drivers/gpu/drm/i915/intel_uc.c
> > +++ b/drivers/gpu/drm/i915/intel_uc.c
> > @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private 
> > *dev_priv)
> > __intel_uc_fw_fini(&dev_priv->huc.fw);
> >  }
> >  
> > +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
> > +{
> > +   GEM_BUG_ON(!guc->send_regs.base);
> > +   GEM_BUG_ON(!guc->send_regs.count);
> > +   GEM_BUG_ON(i >= guc->send_regs.count);
> > +
> > +   return _MMIO(guc->send_regs.base + 4 * i);
> > +}
> > +
> > +static void guc_init_send_regs(struct intel_guc *guc)
> > +{
> > +   struct drm_i915_private *dev_priv = guc_to_i915(guc);
> > +   enum forcewake_domains fw_domains = 0;
> > +   u32 i;
> > +
> > +   guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
> > +   guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
> > +
> > +   for (i = 0; i < guc->send_regs.count; i++) {
> > +   fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
> > +   guc_send_reg(guc, i),
> > +   FW_REG_READ | FW_REG_WRITE);
> > +   }
> > +   guc->send_regs.fw_domains = fw_domains;
> > +}
> > +
> >  static int guc_enable_communication(struct intel_guc *guc)
> >  {
> > /* XXX: placeholder for alternate setup */
> > +   guc_init_send_regs(guc);
> > guc->send = intel_guc_send_mmio;
> > return 0;
> >  }
> > @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const 
> > u32 *action, u32 len)
> > int i;
> > int ret;
> >  
> > -   if (WARN_ON(len < 1 || len > 15))
> > -   return -EINVAL;
> > +   GEM_BUG_ON(!len);
> > +   GEM_BUG_ON(len > guc->send_regs.count);
> >  
> > mutex_lock(&guc->send_mutex);
> > -   intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER);
> > +   intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains);
> >  
> > dev_priv->guc.action_count += 1;
> > dev_priv->guc.action_cmd = action[0];
> >  
> > for (i = 0; i < len; i++)
> > -   I915_WRITE(SOFT_SCRATCH(i), action[i]);
> > +   I915_WRITE(guc_send_reg(guc, i), action[i]);
> >  
> > -   POSTING_READ(SOFT_SCRATCH(i - 1));
> > +   POSTING_READ(guc_send_reg(guc, i - 1));
> >  
> > intel_guc_notify(guc);
> >  
> > @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const 
> > u32 *action, u32 len)
> >  * Fast commands should still complete in 10us.
> >  */
> > ret = __intel_wait_for_register_fw(dev_priv,
> > -  SOFT_SCRATCH(0),
> > +  guc_send_reg(guc, 0),
> >INTEL_GUC_RECV_MASK,
> >INTEL_GUC_RECV_MASK,
> >10, 10, &status);
> > @@ -450,7 +477,7 @@ i

Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

2017-05-04 Thread Kenneth Graunke

On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote:
> On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> > Thanks for rephrasing - that's exactly what I am concerned with.
> > 
> > Did you just use the MediaSDK as it is - meaning that MOCS entries
> > beyond the set of the 3 we have defined had been naively utilized?
> > 
> > If that's the case it is probably the cause of the performance
> > difference - everything beyond "the 3" means UNCACHED.
> > 
> > Can you try changing MediaSDK to only use entries that are already in?
> > How does the performance differ in that case?
> 
> We're benchmarking using upstream MediaSDK without changes, since that's
> the only thing that's relevant. Customising benchmarks to get better
> results isn't really an acceptable solution :)
> 
> Obviously fixing MediaSDK upstream is a different story, in case one of
> the three pre-defined entries we have turns out to be the best possible
> MOCS-settings for that workload.

You're right about customizing benchmarks, but...

MediaSDK is not a benchmark.  If I'm not mistaken, it's a userspace
driver produced by Intel engineers, one which Intel has the full
capability to change.  What you're saying is that Intel's MediaSDK
engineers are unwilling to change their software to provide better
performance for their Linux users.

That's pretty mental.

We don't warp the core operating system to work around userspace
software simply because they don't want to change it.

This isn't about open vs. closed or internal vs. public projects,
either.  I work on a public userspace driver for Intel graphics.
If I sent a kernel patch, the kernel developers would ask me the
exact same questions, to justify my new additions:

   1.  Is your userspace actually using all these new additions?
   If not, which ones are you using?

   They would ask me to drop anything I wasn't actually using
   yet, because speculatively adding things to the kernel that
   we have to maintain backwards compatibility for has caused
   both kernel and userspace developers a lot of trouble.

   2.  Are you sure that you need them all?  Is there a simpler
   solution - are some existing things good enough?  What's
   the additional benefit of each new addition?

I would have to answer these questions to the satisfaction of the
kernel developers before they would even consider taking my patch.

You keep pointing to your large performance improvement, but all it's
shown is that actually using the GPU cache is faster than having a
broken userspace driver explicitly set everything to uncached.  Many
people have pointed this out.

Arek and Tvrtko have good suggestions.  I don't think you're going
to get anywhere with this until you demonstrate that the new MOCS
entries provide some non-zero value over using the existing WB entry.

Here are a couple more data points:

1. We likely can't implement the documented "MOCS Version 1"
   table as is.

   The kernel exposes existing entries with specific semantics.
   Changing their meaning would introduce a backwards-incompatible
   change that would likely regress the performance of existing
   userspace.  This is almost certainly unacceptable - our customers,
   distro partners, users, and even people like Linus Torvalds will
   suffer and complain loudly.

   We could add the new entries at an offset - i.e. leave the existing
   3 entries, and append the rest after that.  But that would require
   changing userspace that assumes the Windows tables, such as MediaSDK
   (they would have to add 3 to their MOCS indexes).  At which point,
   we're changing them, so...the "runs unaltered" argument falls over.

2. The docs finally contain "recommended MOCS settings" - i.e. where
   to cache various types of objects, and at what age.  However, I
   believe those recommendations can be implemented with 1-2 new table
   entries and a PTE change to be eLLC-only by default.  Most of the
   table is completely unnecessary to implement the recommendations.

   I personally would like to try implementing their recommended
   settings in my driver.  I have not had time yet, but plan to try.

I'm very glad to see the Windows MOCS recommendations documented.
I'd been asking for that information for literally years.  If we'd
gotten it earlier, a lot of mess could have been avoided.  For future
platforms, we may want to coordinate and use the same table.  But
Gen9 has been shipping for ages, and we don't have that luxury.

--Ken


[Intel-gfx] [PATCH 1/9] drm/i915: Replace ten seq_puts() calls by seq_putc()

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 11:04:45 +0200

Some single characters should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.
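
To illustrate why the replacement is safe, here is a userspace sketch of
the two calls. It is not the kernel's seq_file implementation:
struct sketch_seq, sketch_seq_putc() and sketch_seq_puts() are
hypothetical, simplified stand-ins with a small fixed buffer.

```c
#include <stddef.h>
#include <string.h>

struct sketch_seq {
	char buf[64];
	size_t count; /* number of characters stored so far */
};

/* Append one character; no string length has to be computed. */
void sketch_seq_putc(struct sketch_seq *m, char c)
{
	if (m->count + 1 < sizeof(m->buf)) {
		m->buf[m->count++] = c;
		m->buf[m->count] = '\0';
	}
}

/* Append a string; needs strlen() even for a one-character string. */
void sketch_seq_puts(struct sketch_seq *m, const char *s)
{
	size_t len = strlen(s);

	if (m->count + len < sizeof(m->buf)) {
		memcpy(m->buf + m->count, s, len + 1);
		m->count += len;
	}
}
```

In this model both calls leave the same bytes in the buffer for a
one-character string, so the change only avoids the length computation.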

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d689e511744e..f2bda699749a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -190,7 +190,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
seq_printf(m, " , fence: %d%s",
   vma->fence->id,
   i915_gem_active_isset(&vma->last_fence) ? 
"*" : "");
-   seq_puts(m, ")");
+   seq_putc(m, ')');
}
if (obj->stolen)
seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
@@ -2689,7 +2689,7 @@ static int i915_edp_psr_status(struct seq_file *m, void *data)
(stat[pipe] == VLV_EDP_PSR_ACTIVE_SF_UPDATE))
seq_printf(m, " pipe %c", pipe_name(pipe));
}
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
 
/*
 * VLV/CHV PSR has no kind of performance counter
@@ -3176,7 +3176,7 @@ static void intel_scaler_info(struct seq_file *m, struct intel_crtc *intel_crtc)
seq_printf(m, ", scalers[%d]: use=%s, mode=%x",
   i, yesno(sc->in_use), sc->mode);
}
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
} else {
seq_puts(m, "\tNo scalers available on this platform\n");
}
@@ -3384,8 +3384,7 @@ static int i915_engine_info(struct seq_file *m, void *unused)
   w->tsk->comm, w->tsk->pid, w->seqno);
}
spin_unlock_irq(&b->rb_lock);
-
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
}
 
intel_runtime_pm_put(dev_priv);
@@ -3629,7 +3628,7 @@ static void drrs_status_per_crtc(struct seq_file *m,
/* DRRS not supported. Print the VBT parameter*/
seq_puts(m, "\tDRRS Supported : No");
}
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
 }
 
 static int i915_drrs_status(struct seq_file *m, void *unused)
@@ -3764,12 +3763,11 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data)
if (connector->status == connector_status_connected &&
connector->encoder != NULL) {
intel_dp = enc_to_intel_dp(connector->encoder);
-   if (intel_dp->compliance.test_active)
-   seq_puts(m, "1");
-   else
-   seq_puts(m, "0");
-   } else
-   seq_puts(m, "0");
+   seq_putc(m,
+intel_dp->compliance.test_active ? '1' : '0');
+   } else {
+   seq_putc(m, '0');
+   }
}
drm_connector_list_iter_end(&conn_iter);
 
@@ -3823,8 +3821,9 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data)
seq_printf(m, "bpc: %u\n",
   intel_dp->compliance.test_data.bpc);
}
-   } else
-   seq_puts(m, "0");
+   } else {
+   seq_putc(m, '0');
+   }
}
drm_connector_list_iter_end(&conn_iter);
 
@@ -3864,8 +3863,9 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data)
connector->encoder != NULL) {
intel_dp = enc_to_intel_dp(connector->encoder);
seq_printf(m, "%02lx", intel_dp->compliance.test_type);
-   } else
-   seq_puts(m, "0");
+   } else {
+   seq_putc(m, '0');
+   }
}
drm_connector_list_iter_end(&conn_iter);
 
-- 
2.12.2


[Intel-gfx] [PATCH 2/9] drm/i915: Combine five seq_printf() calls in i915_display_info()

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 13:17:10 +0200

Some text was put into a sequence by separate function calls.
Print the same data by two single function calls instead.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f2bda699749a..4adf96be9146 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3191,8 +3191,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
struct drm_connector_list_iter conn_iter;
 
intel_runtime_pm_get(dev_priv);
-   seq_printf(m, "CRTC info\n");
-   seq_printf(m, "-\n");
+   seq_puts(m, "CRTC info\n-\n");
for_each_intel_crtc(dev, crtc) {
bool active;
struct intel_crtc_state *pipe_config;
@@ -3226,9 +3225,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
drm_modeset_unlock(&crtc->base.mutex);
}
 
-   seq_printf(m, "\n");
-   seq_printf(m, "Connector info\n");
-   seq_printf(m, "--\n");
+   seq_puts(m, "\nConnector info\n--\n");
mutex_lock(&dev->mode_config.mutex);
drm_connector_list_iter_begin(dev, &conn_iter);
drm_for_each_connector_iter(connector, &conn_iter)
-- 
2.12.2


[Intel-gfx] [PATCH 6/9] drm/i915: Add spaces for better code readability

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 14:04:38 +0200

Add space characters around operators in several places, following
the Linux coding style.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d9c699d7245e..6f3119d40c50 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2358,7 +2358,7 @@ static int i915_llc(struct seq_file *m, void *data)
 
seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev_priv)));
seq_printf(m, "%s: %lluMB\n", edram ? "eDRAM" : "eLLC",
-  intel_uncore_edram_size(dev_priv)/1024/1024);
+  intel_uncore_edram_size(dev_priv) / 1024 / 1024);
 
return 0;
 }
@@ -4502,7 +4502,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 {
int s_max = 3, ss_max = 4;
int s, ss;
-   u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2];
+   u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2];
 
/* BXT has a single slice and at most 3 subslices. */
if (IS_GEN9_LP(dev_priv)) {
@@ -4512,8 +4512,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
 
for (s = 0; s < s_max; s++) {
s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s));
-   eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
-   eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
+   eu_reg[2 * s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s));
+   eu_reg[2 * s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s));
}
 
eu_mask[0] = GEN9_PGCTL_SSA_EU08_ACK |
@@ -4547,8 +4547,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv,
sseu->subslice_mask |= BIT(ss);
}
 
-   eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] &
-  eu_mask[ss%2]);
+   eu_cnt = 2 * hweight32(eu_reg[2 * s + ss / 2] &
+  eu_mask[ss % 2]);
sseu->eu_total += eu_cnt;
sseu->eu_per_subslice = max_t(unsigned int,
  sseu->eu_per_subslice,
-- 
2.12.2


[Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 14:15:00 +0200

The script "checkpatch.pl" reported the following warning.

WARNING: quoted string split across lines

Thus join the split string at the affected source code place.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 6f3119d40c50..dbd52ea89fb4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m)
 
forcewake_count = 
READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count);
if (forcewake_count) {
-   seq_puts(m, "RC information inaccurate because somebody "
-   "holds a forcewake reference \n");
+   seq_puts(m,
+"RC information inaccurate because somebody holds a 
forcewake reference.\n");
} else {
/* NB: we cannot use forcewake, else we read the wrong values */
while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1))
-- 
2.12.2


[Intel-gfx] [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 13:20:47 +0200

Several strings that contain no format specifiers are printed with
seq_printf(). Use the function "seq_puts" for them instead.

This issue was detected by using the Coccinelle software.
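
As a rough illustration of the difference, here is a minimal userspace sketch.
It is not the kernel's seq_file code: struct sketch_buf, sketch_printf() and
sketch_puts() are hypothetical stand-ins that only mimic the two call styles.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct seq_file: a fixed buffer. */
struct sketch_buf {
	char data[128];
	size_t len;
};

/* Formatted output: the format string is parsed on every call. */
void sketch_printf(struct sketch_buf *b, const char *fmt, ...)
{
	size_t room = sizeof(b->data) - b->len - 1;
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room + 1, fmt, ap);
	va_end(ap);
	if (n > 0)
		b->len += (size_t)n < room ? (size_t)n : room;
}

/* Plain string output: a bounded copy, no format parsing at all. */
void sketch_puts(struct sketch_buf *b, const char *s)
{
	size_t room = sizeof(b->data) - b->len - 1;
	size_t n = strlen(s);

	if (n > room)
		n = room;
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}
```

For a constant string both helpers leave the same bytes in the buffer. The
plain variant skips the format parsing and cannot misread a stray '%'.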

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 4adf96be9146..296108464f2b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -149,7 +149,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object 
*obj)
}
seq_printf(m, " (pinned x %d)", pin_count);
if (obj->pin_display)
-   seq_printf(m, " (display)");
+   seq_puts(m, " (display)");
list_for_each_entry(vma, &obj->vma_list, obj_link) {
if (!drm_mm_node_allocated(&vma->node))
continue;
@@ -581,8 +581,10 @@ static int i915_gem_pageflip_info(struct seq_file *m, void 
*data)
   intel_engine_last_submit(engine),
   intel_engine_get_seqno(engine),
   
i915_gem_request_completed(work->flip_queued_req));
-   } else
-   seq_printf(m, "Flip not associated with any 
ring\n");
+   } else {
+   seq_puts(m,
+"Flip not associated with any ring\n");
+   }
seq_printf(m, "Flip queued on frame %d, (was ready on 
frame %d), now %d\n",
   work->flip_queued_vblank,
   work->flip_ready_vblank,
@@ -2048,7 +2050,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
int ret;
 
if (!i915.enable_execlists) {
-   seq_printf(m, "Logical Ring Contexts are disabled\n");
+   seq_puts(m, "Logical Ring Contexts are disabled\n");
return 0;
}
 
@@ -2402,7 +2404,7 @@ static int i915_guc_load_status_info(struct seq_file *m, 
void *data)
if (!HAS_GUC_UCODE(dev_priv))
return 0;
 
-   seq_printf(m, "GuC firmware status:\n");
+   seq_puts(m, "GuC firmware status:\n");
seq_printf(m, "\tpath: %s\n",
guc_fw->path);
seq_printf(m, "\tfetch: %s\n",
@@ -2510,7 +2512,7 @@ static int i915_guc_info(struct seq_file *m, void *data)
return 0;
}
 
-   seq_printf(m, "Doorbell map:\n");
+   seq_puts(m, "Doorbell map:\n");
seq_printf(m, "\t%*pb\n", GUC_NUM_DOORBELLS, guc->doorbell_bitmap);
seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc->db_cacheline);
 
@@ -2521,7 +2523,7 @@ static int i915_guc_info(struct seq_file *m, void *data)
seq_printf(m, "GuC last action error code: %d\n", guc->action_err);
 
total = 0;
-   seq_printf(m, "\nGuC submissions:\n");
+   seq_puts(m, "\nGuC submissions:\n");
for_each_engine(engine, dev_priv, id) {
u64 submissions = guc->submissions[id];
total += submissions;
@@ -2795,7 +2797,7 @@ static int i915_runtime_pm_status(struct seq_file *m, 
void *unused)
seq_printf(m, "Usage count: %d\n",
   atomic_read(&dev_priv->drm.dev->power.usage_count));
 #else
-   seq_printf(m, "Device Power Management (CONFIG_PM) disabled\n");
+   seq_puts(m, "Device Power Management (CONFIG_PM) disabled\n");
 #endif
seq_printf(m, "PCI device power state: %s [%d]\n",
   pci_power_name(pdev->current_state),
@@ -2914,7 +2916,7 @@ static void intel_encoder_info(struct seq_file *m,
   drm_get_connector_status_name(connector->status));
if (connector->status == connector_status_connected) {
struct drm_display_mode *mode = &crtc->mode;
-   seq_printf(m, ", mode:\n");
+   seq_puts(m, ", mode:\n");
intel_seq_print_mode(m, 2, mode);
} else {
seq_putc(m, '\n');
@@ -2945,7 +2947,7 @@ static void intel_panel_info(struct seq_file *m, struct 
intel_panel *panel)
 {
struct drm_display_mode *mode = panel->fixed_mode;
 
-   seq_printf(m, "\tfixed mode:\n");
+   seq_puts(m, "\tfixed mode:\n");
intel_seq_print_mode(m, 2, mode);
 }
 
@@ -3038,7 +3040,7 @@ static void intel_connector_info(struct seq_file *m,
break;
}
 
-   seq_printf(m, "\tmodes:\n");
+   seq_puts(m, "\tmodes:\n");
list_for_each_entry(mode, &connector->modes, head)
intel_seq_print_mode(m, 2, mode);
 }
@@ -3266,9 +3268,7 @@ static int i915_engine_info(struc

[Intel-gfx] [PATCH 4/9] drm/i915: Delete unnecessary braces in three functions

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 13:40:53 +0200

Drop the curly brackets at some source code places
where a branch consists of a single statement.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index 296108464f2b..bf9a2e8d8c16 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -565,13 +565,13 @@ static int i915_gem_pageflip_info(struct seq_file *m, 
void *data)
u32 addr;
 
pending = atomic_read(&work->pending);
-   if (pending) {
+   if (pending)
seq_printf(m, "Flip ioctl preparing on pipe %c 
(plane %c)\n",
   pipe, plane);
-   } else {
+   else
seq_printf(m, "Flip pending (waiting for vsync) 
on pipe %c (plane %c)\n",
   pipe, plane);
-   }
+
if (work->flip_queued_req) {
struct intel_engine_cs *engine = 
work->flip_queued_req->engine;
 
@@ -3130,13 +3130,11 @@ static void intel_plane_info(struct seq_file *m, struct 
intel_crtc *intel_crtc)
}
 
state = plane->state;
-
-   if (state->fb) {
+   if (state->fb)
drm_get_format_name(state->fb->format->format,
&format_name);
-   } else {
+   else
sprintf(format_name.str, "N/A");
-   }
 
seq_printf(m, "\t--Plane id %d: type=%s, crtc_pos=%4dx%4d, 
crtc_size=%4dx%4d, src_pos=%d.%04ux%d.%04u, src_size=%d.%04ux%d.%04u, 
format=%s, rotation=%s\n",
   plane->base.id,
@@ -4636,13 +4634,12 @@ static int i915_sseu_status(struct seq_file *m, void 
*unused)
 
intel_runtime_pm_get(dev_priv);
 
-   if (IS_CHERRYVIEW(dev_priv)) {
+   if (IS_CHERRYVIEW(dev_priv))
cherryview_sseu_device_status(dev_priv, &sseu);
-   } else if (IS_BROADWELL(dev_priv)) {
+   else if (IS_BROADWELL(dev_priv))
broadwell_sseu_device_status(dev_priv, &sseu);
-   } else if (INTEL_GEN(dev_priv) >= 9) {
+   else if (INTEL_GEN(dev_priv) >= 9)
gen9_sseu_device_status(dev_priv, &sseu);
-   }
 
intel_runtime_pm_put(dev_priv);
 
-- 
2.12.2


[Intel-gfx] [PATCH 8/9] drm/i915: Replace a seq_puts() call by seq_putc() in two functions

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 14:23:32 +0200

Two seq_puts() calls print only a single character (a line break).
Use the function "seq_putc" for them instead.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2aa6b97fd22f..9f64dc3f2d05 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1254,7 +1254,7 @@ static void gen8_dump_pdp(struct i915_hw_ppgtt *ppgtt,
else
seq_puts(m, "  SCRATCH ");
}
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
}
kunmap_atomic(pt_vaddr);
}
@@ -1437,7 +1437,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, 
struct seq_file *m)
else
seq_puts(m, "  SCRATCH ");
}
-   seq_puts(m, "\n");
+   seq_putc(m, '\n');
}
kunmap_atomic(pt_vaddr);
}
-- 
2.12.2


[Intel-gfx] [PATCH 9/9] drm/i915: Combine substrings for two messages in i915_ggtt_probe_hw()

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 14:30:37 +0200

The script "checkpatch.pl" reported the following warning.

WARNING: quoted string split across lines

Thus join the split strings at the affected source code place.

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c 
b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9f64dc3f2d05..508431f42b65 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2905,16 +2905,14 @@ int i915_ggtt_probe_hw(struct drm_i915_private 
*dev_priv)
}
 
if ((ggtt->base.total - 1) >> 32) {
-   DRM_ERROR("We never expected a Global GTT with more than 32bits"
- " of address space! Found %lldM!\n",
+   DRM_ERROR("We never expected a Global GTT with more than 32bits 
of address space! Found %lldM!\n",
  ggtt->base.total >> 20);
ggtt->base.total = 1ULL << 32;
ggtt->mappable_end = min(ggtt->mappable_end, ggtt->base.total);
}
 
if (ggtt->mappable_end > ggtt->base.total) {
-   DRM_ERROR("mappable aperture extends past end of GGTT,"
- " aperture=%llx, total=%llx\n",
+   DRM_ERROR("mappable aperture extends past end of GGTT, 
aperture=%llx, total=%llx\n",
  ggtt->mappable_end, ggtt->base.total);
ggtt->mappable_end = ggtt->base.total;
}
-- 
2.12.2


[Intel-gfx] [PATCH 5/9] drm/i915: Adjust seven checks for null pointers

2017-05-04 Thread SF Markus Elfring

From: Markus Elfring 
Date: Thu, 4 May 2017 13:52:19 +0200

The script “checkpatch.pl” reported messages like the following.

Comparison to NULL could be written …

Thus fix the affected source code places.
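
The two spellings are equivalent in C. A minimal sketch, assuming a
hypothetical struct sketch_obj whose "stolen" member is only named after the
field tested in the patch:

```c
#include <stddef.h>

/* Hypothetical object; only the pointer member matters for this sketch. */
struct sketch_obj {
	void *stolen;
};

/* Explicit comparison, the form that checkpatch.pl flags. */
int sketch_unbacked_explicit(const struct sketch_obj *obj)
{
	return obj->stolen == NULL;
}

/* Logical negation, the form the patch switches to; same truth value. */
int sketch_unbacked_short(const struct sketch_obj *obj)
{
	return !obj->stolen;
}
```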

Signed-off-by: Markus Elfring 
---
 drivers/gpu/drm/i915/i915_debugfs.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index bf9a2e8d8c16..d9c699d7245e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -242,7 +242,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, 
void *data)
if (count == total)
break;
 
-   if (obj->stolen == NULL)
+   if (!obj->stolen)
continue;
 
objects[count++] = obj;
@@ -254,7 +254,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, 
void *data)
if (count == total)
break;
 
-   if (obj->stolen == NULL)
+   if (!obj->stolen)
continue;
 
objects[count++] = obj;
@@ -557,7 +557,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void 
*data)
 
spin_lock_irq(&dev->event_lock);
work = crtc->flip_work;
-   if (work == NULL) {
+   if (!work) {
seq_printf(m, "No flip due on pipe %c (plane %c)\n",
   pipe, plane);
} else {
@@ -3717,7 +3717,7 @@ static ssize_t i915_displayport_test_active_write(struct 
file *file,
continue;
 
if (connector->status == connector_status_connected &&
-   connector->encoder != NULL) {
+   connector->encoder) {
intel_dp = enc_to_intel_dp(connector->encoder);
status = kstrtoint(input_buffer, 10, &val);
if (status < 0)
@@ -3756,7 +3756,7 @@ static int i915_displayport_test_active_show(struct 
seq_file *m, void *data)
continue;
 
if (connector->status == connector_status_connected &&
-   connector->encoder != NULL) {
+   connector->encoder) {
intel_dp = enc_to_intel_dp(connector->encoder);
seq_putc(m,
 intel_dp->compliance.test_active ? '1' : '0');
@@ -3801,7 +3801,7 @@ static int i915_displayport_test_data_show(struct 
seq_file *m, void *data)
continue;
 
if (connector->status == connector_status_connected &&
-   connector->encoder != NULL) {
+   connector->encoder) {
intel_dp = enc_to_intel_dp(connector->encoder);
if (intel_dp->compliance.test_type ==
DP_TEST_LINK_EDID_READ)
@@ -3855,7 +3855,7 @@ static int i915_displayport_test_type_show(struct 
seq_file *m, void *data)
continue;
 
if (connector->status == connector_status_connected &&
-   connector->encoder != NULL) {
+   connector->encoder) {
intel_dp = enc_to_intel_dp(connector->encoder);
seq_printf(m, "%02lx", intel_dp->compliance.test_type);
} else {
-- 
2.12.2


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

2017-05-04 Thread Patchwork

== Series Details ==

Series: series starting with [1/4] lib/scatterlist: Fix offset type in 
sg_alloc_table_from_pages
URL   : https://patchwork.freedesktop.org/series/23969/
State : success

== Summary ==

Series 23969v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23969/revisions/1/mbox/

Test gem_exec_suspend:
Subgroup basic-s4-devices:
dmesg-warn -> PASS   (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:572s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:552s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:486s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:481s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:484s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:456s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:568s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:473s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:500s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:438s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:404s
fi-bdw-gvtdvm failed to collect. IGT log at Patchwork_4623/fi-bdw-gvtdvm/igt.log

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC 
integration manifest
5bb846f drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
54ed0e1 lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
bafac0f lib/scatterlist: Avoid potential scatterlist entry overflow
b5fb37a lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4623/

Re: [Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us

2017-05-04 Thread Ville Syrjälä

On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
> Hi,
> 
> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
> a lot of the below warnings. Things seem to work fine (in fact it seems
> faster in general use than previously), but it's a lot of warning spew.
> 
> [  764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under 
> evasion is 100 us

I tried to optimize this a bit recently but indeed it's still known to be too
slow. Looks like all of that stuff did land in Linus's tree already,
so presumably you have it all already.

I did have some further ideas that should help but I got sidetracked by
other things before I managed to finish the work. I guess I'll need to get
back on that horse and try to finish what I started.

In the meantime, maybe we should just silence this error spew again
until we're more confident about meeting the deadlines. Maarten?

Do you have lockdep enabled BTW? Based on what I've seen lockdep does
seem to be a major contributor to slowness here.

-- 
Ville Syrjälä
Intel OTC

Re: [Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us

2017-05-04 Thread Jens Axboe

On 05/04/2017 11:42 AM, Ville Syrjälä wrote:
> On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
>> Hi,
>>
>> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
>> a lot of the below warnings. Things seem to work fine (in fact it seems
>> faster in general use than previously), but it's a lot of warning spew.
>>
>> [  764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under 
>> evasion is 100 us
> 
> I tried to optimize this a bit recently but indeed it's still known to be too
> slow. Looks like all of that stuff did land in Linus's tree already,
> so presumably you have it all already.

Yes, this is Linus' tree...

> I did have some further ideas that should help but I got sidetracked by
> other things before I managed to finish the work. I guess I'll need to get
> back on that horse and try to finish what I started.
> 
> In the meantime, maybe we should just silence this error spew again
> until we're more confident about meeting the deadlines. Maarten?
> 
> Do you have lockdep enabled BTW? Based on what I've seen lockdep does
> seem to be a major contributor to slowness here.

Nope, running a fairly optimized build on my laptop.

-- 
Jens Axboe


Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE

2017-05-04 Thread Francisco Jerez

David Weinehall  writes:

> On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
>> A good default for garbage entries from the user is to follow the
>> default setting of the object (i.e. the PTE). Currently they use the
>> uncached entry, and now the only way to accidentally hit uncached
>> performance is via explicit use of the uncached MOCS or setting the
>> object to uncached. Note that these entries are currently undefined in
>> the ABI and we reserve the right to change them. We originally chose
>> uncached to eliminate any problem with reducing the caching level in
>> future, but the object is a much better definition of the minimum
>> caching level.
>> 

NAK.  The reason for the default being UC is that it's the only setting
that guarantees full forwards compatibility with any other entry that
might be added in the future.  If you default to PTE on (e)LLC and WB on
L3, userspace will no longer be able to use any newly introduced entry
with stricter coherency guarantees than that (e.g. any L3-uncached
entry) in a backwards-compatible way.  Attempting to do so may break
memory coherency assumptions of the application and lead to misrendering
when run on older kernel versions (which in my judgment is a scarier
failure mode than reduced performance).

My other concern is that this change may make inadvertent use of
undefined MOCS entries extremely difficult to detect in some cases -- UC
gives userspace a pretty obvious (if functionally harmless) indication
that it's got its caching settings wrong, and is a strong motivation for
userspace developers to contribute MOCS table changes to the kernel
instead of blindly making assumptions about them (e.g. that they match
the Android kernel as media-sdk was probably doing).  With this change
checked in, the performance drawback from using media-sdk on an upstream
kernel may have been subtle enough that David would never have bothered
to look into the issue.  People may have started shipping copies of
media-sdk making bogus MOCS table assumptions (with potential
correctness implications), at which point you would have to deal with
userspace regressions anytime the MOCS table is extended in the future.
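
To make the policy under discussion concrete, here is a minimal userspace
sketch of the current behaviour, where every undefined slot is filled with the
value of entry 0 (which the quoted code comment describes as the uncached
entry). The names sketch_fill_mocs() and SKETCH_NUM_ENTRIES, the plain array
in place of register writes, and the table size are all assumptions made for
illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_ENTRIES 8	/* hypothetical; the real table is larger */

/*
 * Copy the defined entries, then fill every remaining slot with the
 * value of entry 0, so undefined indices fall back to that setting.
 */
void sketch_fill_mocs(uint32_t regs[SKETCH_NUM_ENTRIES],
		      const uint32_t *defined, size_t n_defined)
{
	size_t i;

	if (n_defined == 0)
		return;	/* nothing to fall back to */

	for (i = 0; i < n_defined && i < SKETCH_NUM_ENTRIES; i++)
		regs[i] = defined[i];
	for (; i < SKETCH_NUM_ENTRIES; i++)
		regs[i] = defined[0];
}
```

The patch being reviewed would change only the index used in the second loop,
which is why the choice of fallback entry is the whole argument here.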

>> Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS")
>> Signed-off-by: Chris Wilson 
>> Cc: David Weinehall 
>> Cc: Arkadiusz Hiler 
>> Cc: Tvrtko Ursulin 
>> Cc: sta...@vger.kernel.org
>
> LGTM, and passes our nightly msdk test case.
>
> Tested-by: David Weinehall 
> Reviewed-by: David Weinehall 
>
>> ---
>>  drivers/gpu/drm/i915/intel_mocs.c | 39 +++++++++++++++------------------------
>>  1 file changed, 15 insertions(+), 24 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/intel_mocs.c 
>> b/drivers/gpu/drm/i915/intel_mocs.c
>> index 92e461c68385..e7a7781ca457 100644
>> --- a/drivers/gpu/drm/i915/intel_mocs.c
>> +++ b/drivers/gpu/drm/i915/intel_mocs.c
>> @@ -85,10 +85,7 @@ struct drm_i915_mocs_table {
>>   *
>>   * Entries not part of the following tables are undefined as far as
>>   * userspace is concerned and shouldn't be relied upon.  For the time
>> - * being they will be implicitly initialized to the strictest caching
>> - * configuration (uncached) to guarantee forwards compatibility with
>> - * userspace programs written against more recent kernels providing
>> - * additional MOCS entries.
>> + * being they will be implicitly initialized to follow the PTE.
>>   *
>>   * NOTE: These tables MUST start with being uncached and the length
>>   *   MUST be less than 63 as the last two registers are reserved
>> @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs 
>> *engine)
>> table.table[index].control_value);
>>  
>>  /*
>> - * Ok, now set the unused entries to uncached. These entries
>> + * Ok, now set the unused entries to follow the PTE. These entries
>>   * are officially undefined and no contract for the contents
>>   * and settings is given for these entries.
>> - *
>> - * Entry 0 in the table is uncached - so we are just writing
>> - * that value to all the used entries.
>>   */
>>  for (; index < GEN9_NUM_MOCS_ENTRIES; index++)
>>  I915_WRITE(mocs_register(engine->id, index),
>> -   table.table[0].control_value);
>> +   table.table[I915_MOCS_PTE].control_value);
>>  
>>  return 0;
>>  }
>> @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct 
>> drm_i915_gem_request *req,
>>  }
>>  
>>  /*
>> - * Ok, now set the unused entries to uncached. These entries
>> + * Ok, now set the unused entries to follow the PTE. These entries
>>   * are officially undefined and no contract for the contents
>>   * and settings is given for these entries.
>> - *
>> - * Entry 0 in the table is uncached - so we are just writing
>> - * that value to all the used entries.
>>   */
>>  for (; index < GEN9_NUM_MOCS_ENTRIES; index++) {
>>

[Intel-gfx] [PATCH] drm/i915: Fix rawclk readout for g4x

2017-05-04 Thread ville . syrjala

From: Ville Syrjälä 

Turns out our skills in decoding the CLKCFG register weren't good
enough. On this particular elk the answer we got was 400 MHz when
in reality the clock was running at 266 MHz, which then caused us
to program a bogus AUX clock divider that caused all AUX communication
to fail.

Sadly the docs are now in bit heaven, so the fix will have to be based
on empirical evidence. Using another elk machine I was able to frob
the FSB frequency from the BIOS and see how it affects the CLKCFG
register. The machine seems to use a frequency of 266 MHz by default,
and fortunately it still boots even with the 50% CPU overclock that
we get when we bump the FSB up to 400 MHz.

It turns out the actual FSB frequency and the register have no real
link whatsoever. The register value is based on some straps or something,
but fortunately those too can be configured from the BIOS on this board,
although it doesn't seem to respect the settings 100%. In the end I was
able to derive the following relationship:

BIOS FSB / strap | CLKCFG
-------------------------
200              | 0x2
266              | 0x0
333              | 0x4
400              | 0x4

So only the 200 and 400 MHz cases actually match how we're currently
decoding that register. But as the comment next to some of the defines
says, we have been just guessing anyway.

So let's fix things up so that at least the 266 MHz case will work
correctly as that is actually the setting used by both the buggy
machine and my test machine.

The fact that 333 and 400 MHz BIOS settings result in the same register
value is a little disappointing, as that means we can't tell them apart.
However, according to the gmch datasheet for both elk and ctg 400 MHz is
not even a supported FSB frequency, so I'm going to make the assumption
that we should decode it as 333 MHz instead.
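
The resulting decode can be sketched in userspace roughly as follows. This is
an illustration rather than the driver code: the SKETCH_ names and
sketch_hrawclk_mhz() are hypothetical, the field values are copied from the
defines touched by this patch, the return unit (MHz, rounded down) is chosen
for readability, and the 133 MHz fallback is an assumption.

```c
/* CLKCFG FSB field values, copied from the defines touched by the patch. */
#define SKETCH_FSB_MASK      0x7
#define SKETCH_FSB_667       0x3
#define SKETCH_FSB_800       0x2
#define SKETCH_FSB_1067      0x6
#define SKETCH_FSB_1067_ALT  0x0
#define SKETCH_FSB_1333      0x7
#define SKETCH_FSB_1333_ALT  0x4

/* Returns the approximate hrawclk in MHz for a raw CLKCFG register value. */
int sketch_hrawclk_mhz(unsigned int clkcfg)
{
	switch (clkcfg & SKETCH_FSB_MASK) {
	case SKETCH_FSB_667:
		return 166;
	case SKETCH_FSB_800:
		return 200;
	case SKETCH_FSB_1067:
	case SKETCH_FSB_1067_ALT:
		return 266;
	case SKETCH_FSB_1333:
	case SKETCH_FSB_1333_ALT:
		return 333;
	default:
		return 133;	/* assumption: fallback for values not listed */
	}
}
```

With this mapping the three distinct register values from the table above
(0x2, 0x0, 0x4) decode to 200, 266 and 333 MHz respectively.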

Cc: sta...@vger.kernel.org
Cc: Tomi Sarvela 
Reported-by: Tomi Sarvela 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100926
Signed-off-by: Ville Syrjälä 
---
 drivers/gpu/drm/i915/i915_reg.h    | 10 +++++++---
 drivers/gpu/drm/i915/intel_cdclk.c |  6 ++----
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index ee8170cda93e..524fdfda9d45 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3059,10 +3059,14 @@ enum skl_disp_power_wells {
 #define CLKCFG_FSB_667 (3 << 0)/* 
hrawclk 166 */
 #define CLKCFG_FSB_800 (2 << 0)/* 
hrawclk 200 */
 #define CLKCFG_FSB_1067(6 << 0)
/* hrawclk 266 */
+#define CLKCFG_FSB_1067_ALT(0 << 0)/* 
hrawclk 266 */
 #define CLKCFG_FSB_1333(7 << 0)
/* hrawclk 333 */
-/* Note, below two are guess */
-#define CLKCFG_FSB_1600(4 << 0)
/* hrawclk 400 */
-#define CLKCFG_FSB_1600_ALT(0 << 0)/* 
hrawclk 400 */
+/*
+ * Note that on at least on ELK the below value is reported for both
+ * 333 and 400 MHz BIOS FSB setting, but given that the gmch datasheet
+ * lists only 200/266/333 MHz FSB as supported let's decode it as 333 MHz.
+ */
+#define CLKCFG_FSB_1333_ALT(4 << 0)/* 
hrawclk 333 */
 #define CLKCFG_FSB_MASK(7 << 0)
 #define CLKCFG_MEM_533 (1 << 4)
 #define CLKCFG_MEM_667 (2 << 4)
diff --git a/drivers/gpu/drm/i915/intel_cdclk.c 
b/drivers/gpu/drm/i915/intel_cdclk.c
index 763010f8ad89..29792972d55d 100644
--- a/drivers/gpu/drm/i915/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -1808,13 +1808,11 @@ static int g4x_hrawclk(struct drm_i915_private 
*dev_priv)
case CLKCFG_FSB_800:
return 20;
case CLKCFG_FSB_1067:
+   case CLKCFG_FSB_1067_ALT:
return 27;
case CLKCFG_FSB_1333:
+   case CLKCFG_FSB_1333_ALT:
return 33;
-   /* these two are just a guess; one of them might be right */
-   case CLKCFG_FSB_1600:
-   case CLKCFG_FSB_1600_ALT:
-   return 40;
default:
return 13;
}
-- 
2.10.2


[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix rawclk readout for g4x

2017-05-04 Thread Patchwork

== Series Details ==

Series: drm/i915: Fix rawclk readout for g4x
URL   : https://patchwork.freedesktop.org/series/23978/
State : success

== Summary ==

Series 23978v1 drm/i915: Fix rawclk readout for g4x
https://patchwork.freedesktop.org/api/1.0/series/23978/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:579s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:515s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:564s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:480s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:458s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:571s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:454s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:573s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:455s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:492s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:529s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:413s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC 
integration manifest
be10d0a drm/i915: Fix rawclk readout for g4x

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4624/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCHv4 0/3] dma_buf import support for vgem

2017-05-04 Thread Laura Abbott

Hi,

This is v4 of the series to add dma_buf import functions for vgem. This version
primarily focuses on a new approach to an alternate dma_buf attach
after platformdev was removed.

Thanks,
Laura

Laura Abbott (3):
  drm/vgem: Add a dummy platform device
  drm/prime: Introduce drm_gem_prime_import_dev
  drm/vgem: Enable dmabuf import interfaces

 drivers/gpu/drm/drm_prime.c |  30 ++--
 drivers/gpu/drm/vgem/vgem_drv.c | 155 +++-
 drivers/gpu/drm/vgem/vgem_drv.h |   2 +
 include/drm/drm_prime.h |   5 ++
 4 files changed, 154 insertions(+), 38 deletions(-)

-- 
2.7.4


[Intel-gfx] [PATCHv4 2/3] drm/prime: Introduce drm_gem_prime_import_dev

2017-05-04 Thread Laura Abbott


The existing drm_gem_prime_import function uses the underlying
struct device of a drm_device for attaching to a dma_buf. Some drivers
(notably vgem) may not have an underlying device structure. Offer
an alternate function to attach using any available device structure.
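
The shape of the split can be sketched in userspace as below. Everything here
is a hypothetical stand-in (the sketch_ structs and functions are not kernel
types or APIs), and the thin-wrapper form of the original entry point is an
assumption about how the two functions relate.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the kernel types carry far more state. */
struct sketch_device { const char *name; };
struct sketch_drm_device { struct sketch_device *dev; };
struct sketch_dma_buf { struct sketch_device *attached_to; };

/* Core: attach the buffer to whatever device the caller passes in. */
int sketch_import_dev(struct sketch_drm_device *drm,
		      struct sketch_dma_buf *buf,
		      struct sketch_device *attach_dev)
{
	(void)drm;
	if (!attach_dev)
		return -1;	/* nothing to attach to */
	buf->attached_to = attach_dev;
	return 0;
}

/* Original entry point: a thin wrapper that keeps using drm->dev. */
int sketch_import(struct sketch_drm_device *drm, struct sketch_dma_buf *buf)
{
	return sketch_import_dev(drm, buf, drm->dev);
}
```

A driver without an underlying device (drm->dev is NULL in this sketch) cannot
use the wrapper, but it can call the core function with a device of its own.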

Signed-off-by: Laura Abbott 
---
v4: Alternate implementation to take an arbitrary struct device instead of just
a platform device.

This was different enough that I dropped the previous Reviewed-by.
---
 drivers/gpu/drm/drm_prime.c | 30 --
 include/drm/drm_prime.h |  5 +
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 9fb65b7..5ad9a26 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -595,15 +595,18 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
 EXPORT_SYMBOL(drm_gem_prime_handle_to_fd);
 
 /**
- * drm_gem_prime_import - helper library implementation of the import callback
+ * drm_gem_prime_import_dev - core implementation of the import callback
  * @dev: drm_device to import into
  * @dma_buf: dma-buf object to import
+ * @attach_dev: struct device to dma_buf attach
  *
- * This is the implementation of the gem_prime_import functions for GEM drivers
- * using the PRIME helpers.
+ * This is the core of drm_gem_prime_import. It's designed to be called by
+ * drivers who want to use a different device structure than dev->dev for
+ * attaching via dma_buf.
  */
-struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
-   struct dma_buf *dma_buf)
+struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
+   struct dma_buf *dma_buf,
+   struct device *attach_dev)
 {
struct dma_buf_attachment *attach;
struct sg_table *sgt;
@@ -625,7 +628,7 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
if (!dev->driver->gem_prime_import_sg_table)
return ERR_PTR(-EINVAL);
 
-   attach = dma_buf_attach(dma_buf, dev->dev);
+   attach = dma_buf_attach(dma_buf, attach_dev);
if (IS_ERR(attach))
return ERR_CAST(attach);
 
@@ -655,6 +658,21 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
 
return ERR_PTR(ret);
 }
+EXPORT_SYMBOL(drm_gem_prime_import_dev);
+
+/**
+ * drm_gem_prime_import - helper library implementation of the import callback
+ * @dev: drm_device to import into
+ * @dma_buf: dma-buf object to import
+ *
+ * This is the implementation of the gem_prime_import functions for GEM drivers
+ * using the PRIME helpers.
+ */
+struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
+   struct dma_buf *dma_buf)
+{
+   return drm_gem_prime_import_dev(dev, dma_buf, dev->dev);
+}
 EXPORT_SYMBOL(drm_gem_prime_import);
 
 /**
diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h
index 0b2a235..46fd1fb 100644
--- a/include/drm/drm_prime.h
+++ b/include/drm/drm_prime.h
@@ -65,6 +65,11 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev,
   int *prime_fd);
 struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf);
+
+struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev,
+   struct dma_buf *dma_buf,
+   struct device *attach_dev);
+
 int drm_gem_prime_fd_to_handle(struct drm_device *dev,
   struct drm_file *file_priv, int prime_fd, uint32_t *handle);
 struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev,
-- 
2.7.4


[Intel-gfx] [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces

2017-05-04 Thread Laura Abbott


Enable the GEM dma-buf import interfaces in addition to the export
interfaces. This lets vgem be used as a test source for other allocators
(e.g. Ion).
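
Since imported objects are backed by a page array, the fault handler in this
patch has to turn a faulting address into an array index. A user-space
sketch of that arithmetic follows. The DEMO_* constants and both function
names are hypothetical, and the page array is assumed to be zero-based with
exactly pages_for(size) entries.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Number of pages needed to cover size bytes, rounding up. */
static unsigned long pages_for(unsigned long size)
{
	return (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/*
 * Translate a faulting address into an index into a zero-based page array.
 * Returns -1 when the address lies outside the object.
 */
static long fault_page_index(unsigned long vaddr, unsigned long vm_start,
			     unsigned long obj_size)
{
	unsigned long page_offset;

	assert(vaddr >= vm_start);
	page_offset = (vaddr - vm_start) >> DEMO_PAGE_SHIFT;

	/* Valid indices are 0 .. num_pages - 1; num_pages itself is out of range. */
	if (page_offset >= pages_for(obj_size))
		return -1;

	return (long)page_offset;
}
```

The point of the sketch is the boundary case: an offset equal to the page
count must be rejected, because it would index one past the end of the array.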

Reviewed-by: Chris Wilson 
Signed-off-by: Laura Abbott 
---
v4: Use new drm_gem_prime_import_dev function
---
 drivers/gpu/drm/vgem/vgem_drv.c | 136 +++-
 drivers/gpu/drm/vgem/vgem_drv.h |   2 +
 2 files changed, 109 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index d1d98af..c9381d45 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -48,6 +48,11 @@ static void vgem_gem_free_object(struct drm_gem_object *obj)
 {
struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
 
+   drm_free_large(vgem_obj->pages);
+
+   if (obj->import_attach)
+   drm_prime_gem_destroy(obj, vgem_obj->table);
+
drm_gem_object_release(obj);
kfree(vgem_obj);
 }
@@ -58,26 +63,49 @@ static int vgem_gem_fault(struct vm_fault *vmf)
struct drm_vgem_gem_object *obj = vma->vm_private_data;
/* We don't use vmf->pgoff since that has the fake offset */
unsigned long vaddr = vmf->address;
-   struct page *page;
-
-   page = shmem_read_mapping_page(file_inode(obj->base.filp)->i_mapping,
-  (vaddr - vma->vm_start) >> PAGE_SHIFT);
-   if (!IS_ERR(page)) {
-   vmf->page = page;
-   return 0;
-   } else switch (PTR_ERR(page)) {
-   case -ENOSPC:
-   case -ENOMEM:
-   return VM_FAULT_OOM;
-   case -EBUSY:
-   return VM_FAULT_RETRY;
-   case -EFAULT:
-   case -EINVAL:
-   return VM_FAULT_SIGBUS;
-   default:
-   WARN_ON_ONCE(PTR_ERR(page));
-   return VM_FAULT_SIGBUS;
+   int ret;
+   loff_t num_pages;
+   pgoff_t page_offset;
+   page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
+
+   num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
+
+   if (page_offset >= num_pages)
+   return VM_FAULT_SIGBUS;
+
+   if (obj->pages) {
+   get_page(obj->pages[page_offset]);
+   vmf->page = obj->pages[page_offset];
+   ret = 0;
+   } else {
+   struct page *page;
+
+   page = shmem_read_mapping_page(
+   file_inode(obj->base.filp)->i_mapping,
+   page_offset);
+   if (!IS_ERR(page)) {
+   vmf->page = page;
+   ret = 0;
+   } else switch (PTR_ERR(page)) {
+   case -ENOSPC:
+   case -ENOMEM:
+   ret = VM_FAULT_OOM;
+   break;
+   case -EBUSY:
+   ret = VM_FAULT_RETRY;
+   break;
+   case -EFAULT:
+   case -EINVAL:
+   ret = VM_FAULT_SIGBUS;
+   break;
+   default:
+   WARN_ON(PTR_ERR(page));
+   ret = VM_FAULT_SIGBUS;
+   break;
+   }
+
}
+   return ret;
 }
 
 static const struct vm_operations_struct vgem_gem_vm_ops = {
@@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
kfree(vfile);
 }
 
-/* ioctls */
-
-static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
- struct drm_file *file,
- unsigned int *handle,
- unsigned long size)
+static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
+   unsigned long size)
 {
struct drm_vgem_gem_object *obj;
int ret;
@@ -129,8 +153,31 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
return ERR_PTR(-ENOMEM);
 
ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
-   if (ret)
-   goto err_free;
+   if (ret) {
+   kfree(obj);
+   return ERR_PTR(ret);
+   }
+
+   return obj;
+}
+
+static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
+{
+   drm_gem_object_release(&obj->base);
+   kfree(obj);
+}
+
+static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
+ struct drm_file *file,
+ unsigned int *handle,
+ unsigned long size)
+{
+   struct drm_vgem_gem_object *o

[Intel-gfx] [PATCHv4 1/3] drm/vgem: Add a dummy platform device

2017-05-04 Thread Laura Abbott


The vgem driver is currently registered independently of any actual
device. Some uses of the dmabuf APIs require an actual device structure
to do anything. Register a dummy platform device for use with dmabuf.
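
Adding a second resource to the init path means the error labels have to
release things in reverse order of acquisition. A user-space sketch of that
unwinding follows. The helper names and the trace buffer are hypothetical
and exist only to make the order observable; the out/out_unref labels mirror
the ones used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical trace buffer recording which steps ran, for illustration only. */
static char trace[64];

static void note(const char *step) { strcat(trace, step); }

/* Hypothetical acquire/release pairs standing in for the driver's resources. */
static int alloc_drm(int fail)     { if (fail) return -ENOMEM; note("A"); return 0; }
static int register_plat(int fail) { if (fail) return -ENODEV; note("P"); return 0; }
static int register_drm(int fail)  { if (fail) return -EIO;    note("R"); return 0; }
static void unregister_plat(void)  { note("p"); }
static void unref_drm(void)        { note("a"); }

/* fail_at: 0 = no failure, 1..3 = make that step fail. */
static int demo_init(int fail_at)
{
	int ret;

	assert(fail_at >= 0 && fail_at <= 3);
	trace[0] = '\0';

	ret = alloc_drm(fail_at == 1);
	if (ret)
		return ret;		/* nothing acquired yet */

	ret = register_plat(fail_at == 2);
	if (ret)
		goto out;		/* only the DRM device exists */

	ret = register_drm(fail_at == 3);
	if (ret)
		goto out_unref;		/* both exist: drop the platform device first */

	return 0;

out_unref:
	unregister_plat();
out:
	unref_drm();
	return ret;
}
```

After each call, trace lists the steps that ran: upper-case letters mark an
acquisition and lower-case letters the matching release.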

Reviewed-by: Chris Wilson 
Signed-off-by: Laura Abbott 
---
v4: Switch from the now removed platformdev to a static platform device.
---
 drivers/gpu/drm/vgem/vgem_drv.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 9fee38a..d1d98af 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -42,6 +42,8 @@
 #define DRIVER_MAJOR   1
 #define DRIVER_MINOR   0
 
+static struct platform_device *vgem_platform;
+
 static void vgem_gem_free_object(struct drm_gem_object *obj)
 {
struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
@@ -335,11 +337,20 @@ static int __init vgem_init(void)
int ret;
 
vgem_device = drm_dev_alloc(&vgem_driver, NULL);
-   if (IS_ERR(vgem_device)) {
-   ret = PTR_ERR(vgem_device);
+   if (IS_ERR(vgem_device))
+   return PTR_ERR(vgem_device);
+
+   vgem_platform = platform_device_register_simple("vgem",
+   -1, NULL, 0);
+
+   if (IS_ERR(vgem_platform)) {
+   ret = PTR_ERR(vgem_platform);
goto out;
}
 
+   dma_coerce_mask_and_coherent(&vgem_platform->dev,
+   DMA_BIT_MASK(64));
+
ret  = drm_dev_register(vgem_device, 0);
if (ret)
goto out_unref;
@@ -347,13 +358,15 @@ static int __init vgem_init(void)
return 0;
 
 out_unref:
-   drm_dev_unref(vgem_device);
+   platform_device_unregister(vgem_platform);
 out:
+   drm_dev_unref(vgem_device);
return ret;
 }
 
 static void __exit vgem_exit(void)
 {
+   platform_device_unregister(vgem_platform);
drm_dev_unregister(vgem_device);
drm_dev_unref(vgem_device);
 }
-- 
2.7.4


[Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load

2017-05-04 Thread Daniele Ceraolo Spurio

We're currently deleting the GuC logs if the FW fails to load, but those
are still useful to understand why the loading failed. Instead of
deleting them, taking a snapshot allows us to access them after driver
load is completed.
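
The idea of snapshotting the log so that it outlives the original buffer,
then dumping it four 32-bit words per line as the debugfs code does, can be
sketched in user space. DEMO_LOG_SIZE, snapshot_log() and dump_log() are
hypothetical names for illustration; the driver's real GUC_LOG_SIZE and
allocation calls differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical log size in bytes; the driver's real GUC_LOG_SIZE differs. */
#define DEMO_LOG_SIZE 32

/* Copy the live log into heap memory so it outlives the original buffer. */
static uint32_t *snapshot_log(const uint32_t *live)
{
	uint32_t *copy = malloc(DEMO_LOG_SIZE);

	if (copy)
		memcpy(copy, live, DEMO_LOG_SIZE);
	return copy;
}

/* Format the log four 32-bit words per line; returns bytes written or -1. */
static int dump_log(const uint32_t *log, char *out, size_t out_len)
{
	size_t used = 0;
	size_t i;

	assert(log && out);

	for (i = 0; i < DEMO_LOG_SIZE / sizeof(uint32_t); i += 4) {
		int n = snprintf(out + used, out_len - used,
				 "0x%08x 0x%08x 0x%08x 0x%08x\n",
				 (unsigned)log[i], (unsigned)log[i + 1],
				 (unsigned)log[i + 2], (unsigned)log[i + 3]);

		if (n < 0 || (size_t)n >= out_len - used)
			return -1;	/* output buffer too small */
		used += (size_t)n;
	}
	return (int)used;
}
```

Because the snapshot is an independent copy, it can still be dumped after
the live buffer has been cleared or released. The caller owns the copy and
must free() it.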

Cc: Oscar Mateo 
Cc: Michal Wajdeczko 
Signed-off-by: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/i915_debugfs.c   | 36 ---
 drivers/gpu/drm/i915/i915_drv.c   |  3 +++
 drivers/gpu/drm/i915/i915_drv.h   |  6 ++
 drivers/gpu/drm/i915/i915_gpu_error.c | 36 +++
 drivers/gpu/drm/i915/intel_guc_fwif.h | 14 +++---
 drivers/gpu/drm/i915/intel_guc_log.c  | 10 ++
 drivers/gpu/drm/i915/intel_uc.c   |  7 +--
 7 files changed, 84 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 870c470..4ff20fc 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2543,26 +2543,32 @@ static int i915_guc_info(struct seq_file *m, void *data)
 static int i915_guc_log_dump(struct seq_file *m, void *data)
 {
struct drm_i915_private *dev_priv = node_to_i915(m->private);
-   struct drm_i915_gem_object *obj;
-   int i = 0, pg;
-
-   if (!dev_priv->guc.log.vma)
+   u32 *log;
+   int i = 0;
+
+   if (dev_priv->guc.log.vma) {
+   log = i915_gem_object_pin_map(dev_priv->guc.log.vma->obj,
+ I915_MAP_WC);
+   if (IS_ERR(log)) {
+   DRM_ERROR("Failed to pin guc_log vma\n");
+   return -ENOMEM;
+   }
+   } else if (dev_priv->gpu_error.guc_load_fail_log) {
+   log = dev_priv->gpu_error.guc_load_fail_log;
+   } else {
return 0;
-
-   obj = dev_priv->guc.log.vma->obj;
-   for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) {
-   u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg));
-
-   for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4)
-   seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
-  *(log + i), *(log + i + 1),
-  *(log + i + 2), *(log + i + 3));
-
-   kunmap_atomic(log);
}
 
+   for (i = 0; i < GUC_LOG_SIZE / sizeof(u32); i += 4)
+   seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n",
+  *(log + i), *(log + i + 1),
+  *(log + i + 2), *(log + i + 3));
+
seq_putc(m, '\n');
 
+   if (dev_priv->guc.log.vma)
+   i915_gem_object_unpin_map(dev_priv->guc.log.vma->obj);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 452c265..c7cb36c 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1354,6 +1354,9 @@ void i915_driver_unload(struct drm_device *dev)
cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
i915_reset_error_state(dev_priv);
 
+   /* release GuC error log (if any) */
+   i915_guc_load_error_log_free(dev_priv);
+
/* Flush any outstanding unpin_work. */
drain_workqueue(dev_priv->wq);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4588b3e..761c663 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1555,6 +1555,9 @@ struct i915_gpu_error {
/* Protected by the above dev->gpu_error.lock. */
struct i915_gpu_state *first_error;
 
+   /* Log snapshot if GuC errors during load */
+   void *guc_load_fail_log;
+
unsigned long missed_irq_rings;
 
/**
@@ -3687,6 +3690,9 @@ static inline void i915_reset_error_state(struct drm_i915_private *i915)
 
 #endif
 
+void i915_guc_load_error_log_capture(struct drm_i915_private *i915);
+void i915_guc_load_error_log_free(struct drm_i915_private *i915);
+
 const char *i915_cache_level_str(struct drm_i915_private *i915, int type);
 
 /* i915_cmd_parser.c */
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index ec526d9..44a873b 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1809,3 +1809,39 @@ void i915_reset_error_state(struct drm_i915_private *i915)
 
i915_gpu_state_put(error);
 }
+
+void i915_guc_load_error_log_capture(struct drm_i915_private *i915)
+{
+   void *log, *buf;
+   struct i915_vma *vma = i915->guc.log.vma;
+
+   if (i915->gpu_error.guc_load_fail_log || !vma)
+   return;
+
+   /*
+* the vma should be already pinned and mapped for log runtime
+* management but let's play safe
+*/
+   log = i915_gem_object_pin_map(vma->obj, I915_MAP_WC);
+   if (IS_ERR(log)) {
+   DRM_ERROR("Failed to pin guc_log vma\n");
+   return;
+

[Intel-gfx] ✓ Fi.CI.BAT: success for dma_buf import support for vgem (rev2)

2017-05-04 Thread Patchwork

== Series Details ==

Series: dma_buf import support for vgem (rev2)
URL   : https://patchwork.freedesktop.org/series/23824/
State : success

== Summary ==

Series 23824v2 dma_buf import support for vgem
https://patchwork.freedesktop.org/api/1.0/series/23824/revisions/2/mbox/

Test gem_exec_flush:
Subgroup basic-batch-kernel-default-uc:
pass   -> FAIL   (fi-snb-2600) fdo#17

fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17

fi-bdw-5557u  total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:431s
fi-bdw-gvtdvm total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050  total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:584s
fi-bxt-j4205  total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:508s
fi-bxt-t5700  total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:553s
fi-byt-j1900  total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:492s
fi-byt-n2820  total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:480s
fi-hsw-4770   total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r  total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650    total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m  total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770   total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u  total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:460s
fi-kbl-7560u  total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:560s
fi-skl-6260u  total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:566s
fi-skl-6700k  total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:453s
fi-skl-6770hq total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:498s
fi-skl-gvtdvm total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m  total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-snb-2600   total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29  time:417s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
0e6a5c5 drm/vgem: Enable dmabuf import interfaces
36b39d3 drm/prime: Introduce drm_gem_prime_import_dev
d231a4f drm/vgem: Add a dummy platform device

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4625/