[Intel-gfx] ✗ Fi.CI.BAT: failure for Enhancement to intel_dp_aux_backlight driver (rev4)
== Series Details ==

Series: Enhancement to intel_dp_aux_backlight driver (rev4)
URL   : https://patchwork.freedesktop.org/series/21086/
State : failure

== Summary ==

make: Entering directory '/home/cidrm/kernel'
  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  CHK     include/generated/bounds.h
  CHK     include/generated/timeconst.h
  CHK     include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CHK     kernel/config_data.h
  CC [M]  drivers/gpu/drm/i915/i915_params.o
In file included from ./include/linux/module.h:18:0,
                 from ./include/drm/drmP.h:59,
                 from drivers/gpu/drm/i915/i915_drv.h:47,
                 from drivers/gpu/drm/i915/i915_params.c:26:
drivers/gpu/drm/i915/i915_params.c: In function ‘__check_enable_dpcd_backlight’:
./include/linux/moduleparam.h:344:67: error: return from incompatible pointer type [-Werror=incompatible-pointer-types]
 static inline type __always_unused *__check_##name(void) { return(p); }
                                                                  ^
./include/linux/moduleparam.h:396:35: note: in expansion of macro ‘__param_check’
 #define param_check_bool(name, p) __param_check(name, p, bool)
                                   ^
./include/linux/moduleparam.h:146:2: note: in expansion of macro ‘param_check_bool’
 param_check_##type(name, &(value)); \
 ^
drivers/gpu/drm/i915/i915_params.c:249:1: note: in expansion of macro ‘module_param_named’
 module_param_named(enable_dpcd_backlight, i915.enable_dpcd_backlight, bool, 0600);
 ^
cc1: all warnings being treated as errors
scripts/Makefile.build:294: recipe for target 'drivers/gpu/drm/i915/i915_params.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_params.o] Error 1
scripts/Makefile.build:553: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:553: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:553: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1002: recipe for target 'drivers' failed
make: *** [drivers] Error 2
make: Leaving directory '/home/cidrm/kernel'

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
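
For readers of the log above: the failure comes from the type check that module_param_named() performs. param_check_bool() expands to a helper that returns the parameter's address as a bool *, so the build breaks when the backing field has some other type. The following is only a standalone sketch of that mechanism. PARAM_CHECK, struct demo_params and its fields are hypothetical stand-ins and not the kernel's macros, and the guess that i915.enable_dpcd_backlight was declared with a non-bool type is inferred from the log alone.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's __param_check(): the generated
 * helper returns `p` as `type *`, so the compiler complains about any `p`
 * that does not point to `type`.
 */
#define PARAM_CHECK(name, p, type) \
	static inline type *param_check_##name(void) { return (p); }

/* Illustrative parameter block; the field names are assumptions. */
struct demo_params {
	int  enable_something;       /* int-typed parameter */
	bool enable_dpcd_backlight;  /* declared bool so the bool check passes */
};

static struct demo_params demo = { 0, false };

/* Compiles cleanly: &demo.enable_dpcd_backlight is a bool *. */
PARAM_CHECK(enable_dpcd_backlight, &demo.enable_dpcd_backlight, bool)

/* Declared type and field type match here as well. */
PARAM_CHECK(enable_something, &demo.enable_something, int)

/*
 * Uncommenting the next line reproduces the class of error in the log:
 * an int * is returned from a function declared to return bool *.
 *
 * PARAM_CHECK(mismatch, &demo.enable_something, bool)
 */

/* Small self-check: each helper hands back the address it was given. */
void demo_self_check(void)
{
	assert(param_check_enable_dpcd_backlight() == &demo.enable_dpcd_backlight);
	assert(param_check_enable_something() == &demo.enable_something);
}
```

If the assumption holds, the usual remedy is to make the declared parameter type and the field type agree, either by declaring the field bool or by passing the field's real type to module_param_named().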
Re: [Intel-gfx] [PATCH i-g-t 13/13] tests/gem_exec_nop: Disable headless subtest on cairoless Android
On Wed, Apr 19, 2017 at 01:01:55PM +0200, Arkadiusz Hiler wrote:
> Currently whole igt_kms.c is disabled while compiling on Android without
> cairo, so this test does not compile.
>
> There should be a cleaner way to disable only cairo dependent parts
> which should allow us to enable at least some of the KMS tests, but
> that's a bigger rework for another time.
>
> Signed-off-by: Arkadiusz Hiler
> ---
>  lib/Android.mk       | 1 +
>  tests/gem_exec_nop.c | 4 ++++
>  2 files changed, 5 insertions(+)
>
> diff --git a/lib/Android.mk b/lib/Android.mk
> index 31f88be..dc538b8 100644
> --- a/lib/Android.mk
> +++ b/lib/Android.mk
> @@ -38,6 +38,7 @@ ifeq ("${ANDROID_HAS_CAIRO}", "1")
>      LOCAL_C_INCLUDES += $(ANDROID_BUILD_TOP)/external/cairo-1.12.16/src
>      LOCAL_CFLAGS += -DANDROID_HAS_CAIRO=1 -DIGT_DATADIR=\".\" -DIGT_SRCDIR=\".\"
>  else
> +
>  skip_lib_list := \
>      igt_kms.c \
>      igt_kms.h \
> diff --git a/tests/gem_exec_nop.c b/tests/gem_exec_nop.c
> index 66c2fc1..967caef 100644
> --- a/tests/gem_exec_nop.c
> +++ b/tests/gem_exec_nop.c
> @@ -138,6 +138,7 @@ stable_nop_on_ring(int fd, uint32_t handle, unsigned int engine,
>  	return n;
>  }
>
> +#if (!defined(ANDROID)) || (defined(ANDROID) && ANDROID_HAS_CAIRO)

Tautological check for ANDROID being defined. Is it too confusing to
reduce this to

#if !defined(ANDROID) || ANDROID_HAS_CAIRO

?

>  #define assert_within_epsilon(x, ref, tolerance) \
>  	igt_assert_f((x) <= (1.0 + tolerance) * ref && \
>  		     (x) >= (1.0 - tolerance) * ref, \
> @@ -178,6 +179,7 @@ static void headless(int fd, uint32_t handle)
>  	/* check that the two execution speeds are roughly the same */
>  	assert_within_epsilon(n_headless, n_display, 0.1f);
>  }
> +#endif
>
>  static bool ignore_engine(int fd, unsigned engine)
>  {
> @@ -561,8 +563,10 @@ igt_main
>  	igt_subtest("context-sequential")
>  		sequential(device, handle, FORKED | CONTEXT, 150);
>
> +#if (!defined(ANDROID)) || (defined(ANDROID) && ANDROID_HAS_CAIRO)

Likewise.

--
Petri Latvala

>  	igt_subtest("headless")
>  		headless(device, handle);
> +#endif
>
>  	igt_fixture {
>  		igt_stop_hang_detector();
> --
> 2.9.3
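
The review comment in this message rests on a small piece of boolean algebra: (!A) || (A && B) is the same as !A || B. A minimal sketch that checks this follows, first as a truth table and then with the preprocessor itself. The helper names and the ORIGINAL_FORM / REDUCED_FORM macros are made up for illustration. Note that an undefined ANDROID_HAS_CAIRO evaluates to 0 inside #if.

```c
#include <assert.h>
#include <stdbool.h>

/* Boolean model of the guard in the patch: (!A) || (A && B). */
static bool original_guard(bool android, bool has_cairo)
{
	return (!android) || (android && has_cairo);
}

/* Boolean model of the guard suggested in review: !A || B. */
static bool reduced_guard(bool android, bool has_cairo)
{
	return !android || has_cairo;
}

/* Returns true when both forms agree on all four input combinations. */
bool guards_are_equivalent(void)
{
	for (int a = 0; a <= 1; a++)
		for (int b = 0; b <= 1; b++)
			if (original_guard(a, b) != reduced_guard(a, b))
				return false;
	return true;
}

/* The same comparison, evaluated by the preprocessor for this build. */
#if (!defined(ANDROID)) || (defined(ANDROID) && ANDROID_HAS_CAIRO)
#define ORIGINAL_FORM 1
#else
#define ORIGINAL_FORM 0
#endif

#if !defined(ANDROID) || ANDROID_HAS_CAIRO
#define REDUCED_FORM 1
#else
#define REDUCED_FORM 0
#endif

/* Fails the build if the two guards ever select different code. */
static_assert(ORIGINAL_FORM == REDUCED_FORM, "guards disagree");
```

Building the file with no extra flags, with -DANDROID, and with -DANDROID -DANDROID_HAS_CAIRO=1 exercises the three configurations that matter for the patch.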
Re: [Intel-gfx] [maintainer-tools PATCH v3] dim: Add pull request tag template
On Wed, 03 May 2017, Sean Paul wrote:
> Each pull request is accompanied by a summary that is stored in the git tag
> from which it is generated. These summaries all share the same template with
> headers classifying changes to UAPI, Cross-subsystem, Core, and Drivers. This
> patch adds this template to the tag summary automatically in dim pull-request.
>
> Changes in v2:
> - Tweaked the template var name s/PULL/TAG/ (Daniel)
> Changes in v3:
> - Use git tag -F- to ingest template (Jani)
> - Tweak naming/comments again to hopefully clarify things (Jani)
>
> Signed-off-by: Sean Paul

Pushed, thanks.

BR,
Jani.

> ---
>  dim     | 25 +++++++++++++++++++++++--
>  dim.rst |  4 ++++
>  2 files changed, 27 insertions(+), 2 deletions(-)
>
> diff --git a/dim b/dim
> index 8937803..baa0b38 100755
> --- a/dim
> +++ b/dim
> @@ -67,6 +67,9 @@ DIM_TEMPLATE_HELLO=${DIM_TEMPLATE_HELLO:-$HOME/.dim.template.hello}
>  # signature pull request template
>  DIM_TEMPLATE_SIGNATURE=${DIM_TEMPLATE_SIGNATURE:-$HOME/.dim.template.signature}
>
> +# dim pull-request tag summary template
> +DIM_TEMPLATE_TAG_SUMMARY=${DIM_TEMPLATE_TAG_SUMMARY:-$HOME/.dim.template.tagsummary}
> +
>  #
>  # Internal configuration.
>  #
> @@ -1501,6 +1504,24 @@ function dim_tag_next
>
>  }
>
> +function prep_pull_tag_summary
> +{
> +	if [ -r $DIM_TEMPLATE_TAG_SUMMARY ]; then
> +		cat $DIM_TEMPLATE_TAG_SUMMARY
> +	else
> +		cat <<-EOF
> +		UAPI Changes:
> +
> +		Cross-subsystem Changes:
> +
> +		Core Changes:
> +
> +		Driver Changes:
> +
> +		EOF
> +	fi
> +}
> +
>  # dim_pull_request branch upstream
>  function dim_pull_request
>  {
> @@ -1533,9 +1554,9 @@ function dim_pull_request
>  	while git tag -l $tag | grep -q $tag ; do
>  		tag="$branch-$today-$((++suffix))"
>  	done
> -
>  	gitk "$branch@{upstream}" ^$upstream &
> -	$DRY git tag -a $tag "$branch@{upstream}"
> +	prep_pull_tag_summary | $DRY git tag -F- $tag "$branch@{upstream}"
> +	$DRY git tag -a -f $tag
>  	$DRY git push $remote $tag
>  	prep_pull_mail $req_file $tag
>
> diff --git a/dim.rst b/dim.rst
> index 3dd19f9..10572f1 100644
> --- a/dim.rst
> +++ b/dim.rst
> @@ -464,6 +464,10 @@ DIM_TEMPLATE_SIGNATURE
>  ----------------------
>  Path to a file containing a signature template for pull request mails.
>
> +DIM_TEMPLATE_TAG_SUMMARY
> +------------------------
> +Path to a file containing the template for dim pull-request tag summaries.
> +
>  dim_alias_
>  ----------
>  Make an alias for the subcommand defined as the value. For example,

--
Jani Nikula, Intel Open Source Technology Center
Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Calculate vlv/chv intermediate watermarks correctly, v2.
On 03-05-17 at 20:03, Ville Syrjälä wrote:
> On Wed, May 03, 2017 at 06:18:46PM +0200, Maarten Lankhorst wrote:
>> On 03-05-17 at 18:07, Ville Syrjälä wrote:
>>> On Wed, May 03, 2017 at 05:53:34PM +0200, Maarten Lankhorst wrote:
>>>> On 03-05-17 at 16:11, Ville Syrjälä wrote:
>>>>> On Wed, May 03, 2017 at 04:06:37PM +0200, Maarten Lankhorst wrote:
>>>>>> On 03-05-17 at 15:45, Ville Syrjälä wrote:
>>>>>>> On Mon, May 01, 2017 at 03:34:34PM +0200, Maarten Lankhorst wrote:
>>>>>>>> The watermarks it should calculate against are the old optimal watermarks.
>>>>>>>> The currently active crtc watermarks are pure fiction, and are invalid in
>>>>>>>> case of a nonblocking modeset, page flip enabling/disabling planes or any
>>>>>>>> other reason.
>>>>>>>>
>>>>>>>> When the crtc is disabled or during a modeset the intermediate watermarks
>>>>>>>> don't need to be programmed separately, and could be directly assigned
>>>>>>>> to the optimal watermarks.
>>>>>>>>
>>>>>>>> Also rename crtc_state to new_crtc_state, to distinguish it from the old state.
>>>>>>>>
>>>>>>>> Changes since v1:
>>>>>>>> - Use intel_atomic_get_old_crtc_state. (ville)
>>>>>>>>
>>>>>>>> Signed-off-by: Maarten Lankhorst
>>>>>>>> ---
>>>>>>>>  drivers/gpu/drm/i915/intel_pm.c | 20 ++++++++++++++------
>>>>>>>>  1 file changed, 14 insertions(+), 6 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> index 0f344b1fff45..a09396ee1f3d 100644
>>>>>>>> --- a/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> +++ b/drivers/gpu/drm/i915/intel_pm.c
>>>>>>>> @@ -1458,16 +1458,24 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state,
>>>>>>>>
>>>>>>>>  static int vlv_compute_intermediate_wm(struct drm_device *dev,
>>>>>>>>  				       struct intel_crtc *crtc,
>>>>>>>> -				       struct intel_crtc_state *crtc_state)
>>>>>>>> +				       struct intel_crtc_state *new_crtc_state)
>>>>>>>>  {
>>>>>>>> -	struct vlv_wm_state *intermediate = &crtc_state->wm.vlv.intermediate;
>>>>>>>> -	const struct vlv_wm_state *optimal = &crtc_state->wm.vlv.optimal;
>>>>>>>> -	const struct vlv_wm_state *active = &crtc->wm.active.vlv;
>>>>>>>> +	struct vlv_wm_state *intermediate = &new_crtc_state->wm.vlv.intermediate;
>>>>>>>> +	const struct vlv_wm_state *optimal = &new_crtc_state->wm.vlv.optimal;
>>>>>>>> +	const struct intel_crtc_state *old_crtc_state =
>>>>>>>> +		intel_atomic_get_old_crtc_state(new_crtc_state->base.state, crtc);
>>>>>>>> +	const struct vlv_wm_state *active = &old_crtc_state->wm.vlv.optimal;
>>>>>>>>  	int level;
>>>>>>>>
>>>>>>>> +	if (!new_crtc_state->base.active || drm_atomic_crtc_needs_modeset(&new_crtc_state->base)) {
>>>>>>>> +		*intermediate = *optimal;
>>>>>>>> +
>>>>>>>> +		return 0;
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>>  	intermediate->num_levels = min(optimal->num_levels, active->num_levels);
>>>>>>>>  	intermediate->cxsr = optimal->cxsr && active->cxsr &&
>>>>>>>> -		!crtc_state->disable_cxsr;
>>>>>>>> +		!new_crtc_state->disable_cxsr;
>>>>>>> We need to consider disable_cxsr even in the modeset case.
>>>>>> Why is this? crtc_state->disable_cxsr is set if any plane is part of the
>>>>>> crtc during modeset, so it's disabled during modeset already.
>>>>> It's set if any plane is enabling/disabling, which should be quite
>>>>> typical during a modeset.
>>>> Yeah but .initial_watermarks is called during crtc_enable, so cxsr will
>>>> get enabled anyway.
>>> Which is not what we want. CxSR must stay off until the planes have been
>>> enabled.
>>>
>> In that case why is it enabled in .initial_watermarks at all? It should be
>> in optimize_watermarks then..
> Because we can keep it enabled across the update unless planes are
> getting enabled or disabled.
>
So for the modeset case, computing intermediate watermarks:

*intermediate = *optimal;
if (needs_modeset)
	intermediate->cxsr = false;
if (optimal->cxsr && !intermediate->cxsr)
	new_crtc_state->wm.need_postvbl_update = true;

?
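
To make the proposal at the end of this message easier to follow, here is a minimal, self-contained sketch of it. Everything in it is a simplified stand-in: struct wm_state_sketch, struct crtc_state_sketch and compute_intermediate_wm_sketch() are hypothetical names, the per-level merging done by the real function is left out, and the non-modeset branch only mirrors the two assignments visible in the quoted diff. It should not be read as the code that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical stand-ins for the i915 watermark structures. */
struct wm_state_sketch {
	int  num_levels;
	bool cxsr;
};

struct crtc_state_sketch {
	bool active;
	bool needs_modeset;
	bool disable_cxsr;
	bool need_postvbl_update;
	struct wm_state_sketch optimal;
	struct wm_state_sketch intermediate;
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * old_optimal plays the role of the old crtc state's optimal watermarks,
 * which the patch uses in place of the "active" watermarks.
 */
int compute_intermediate_wm_sketch(struct crtc_state_sketch *new_state,
				   const struct wm_state_sketch *old_optimal)
{
	struct wm_state_sketch *intermediate;
	const struct wm_state_sketch *optimal;

	assert(new_state && old_optimal);
	intermediate = &new_state->intermediate;
	optimal = &new_state->optimal;

	if (!new_state->active || new_state->needs_modeset) {
		/* Nothing to merge with: start from the optimal values. */
		*intermediate = *optimal;

		/* Keep CxSR off across a modeset, as requested in review. */
		if (new_state->needs_modeset)
			intermediate->cxsr = false;

		/* Remember to switch to the optimal values after vblank. */
		if (optimal->cxsr && !intermediate->cxsr)
			new_state->need_postvbl_update = true;

		return 0;
	}

	/* Plain update: take the safe combination of old and new. */
	intermediate->num_levels = min_int(optimal->num_levels,
					   old_optimal->num_levels);
	intermediate->cxsr = optimal->cxsr && old_optimal->cxsr &&
			     !new_state->disable_cxsr;
	return 0;
}
```

In this sketch a modeset with CxSR-capable optimal watermarks leaves intermediate.cxsr false and sets need_postvbl_update, which is the behaviour the question mark in the message is asking about.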
Re: [Intel-gfx] [PATCH RESEND i-g-t 2/2] kms_frontbuffer_tracking: Don't poke compressing status for old cpus
On Wed, Apr 26, 2017 at 03:36:16PM -0300, Paulo Zanoni wrote:
> > I have a feeling I asked this before, but why aren't we just fixing
> > the kernel to report it correctly? For any platform with FBC2 it
> > should be trivial,
>
> Right, I see there's a reg for that for ILK/SNB.
>
> > for FBC1 slightly more complicated as you probably
> > have to check each individual tag.
>
> I didn't check the docs for that.
>
> Maybe we should change the comment from "early generations are not able
> to report compression status" to something more accurate like "the
> Kernel doesn't report compression status for early generations".
>
>
> There are quite a few different ways to solve the problem involved in
> this patch, and some of them would remove the need to check for platform
> generations in the user space side. An example alternative would be to
> always print "Compressing: " and then put "no" when FBC is disabled and
> "unknown" for platforms where we don't know what to print. In fact it's
> still on my TODO list to add a ton more information to i915_fbc_status,
> but I'm not going to work on that soon. And there's always the problem
> with having to sync Kernel and IGT.
>
> Anyway, the current patch plugs the current hole, so I think further
> improvements to this area can come on top of it:
>
> Reviewed-by: Paulo Zanoni
>

Pushed this patch, thanks.

--
Petri Latvala
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > > > Some of these are used by media-sdk; if these entries are missing
> > > > the default will instead be to do everything uncached.
> > > >
> > > > This patch improves media-sdk performance by up to 60%
> > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > testing, without regressing any other benchmarks.
> > >
> > > Hey David,
> > >
> > > I am testing some of the extended MOCS with Mesa and the differences I
> > > see fit in the margins of statistical error.
> > >
> > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > everything to UNCACHED - and I saw severe performance drop.
> > >
> > > So here is the question it induced:
> > >
> > > Have you used the "closest neighbour" from entries available or did you
> > > default to the UNCACHED ones? That could be the culprit.
> > >
> > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > few synthetic cases - it will require much more fine-tuning and
> > > benchmarking before any final conclusions.
> >
> > As I mentioned in the commit message, the improvements only manifest
> > themselves for media-sdk workloads (and presumably other workloads
> > that use the same hardware); if you see any performance regressions
> > with these additional entries I'd be interested to know.
>
> But what is being counter suggested is that there is no reason for these
> mocs entries. If the sdk is just using mocs registers without first
> programming them outside of the kernel abi, then it will be hitting
> uncached memory - and then the only benefit is from simply enabling
> cached access. The kernel ABI is minimalist for a reason, and we want to
> know why we should be adding tables that we need to maintain forever
> (bonus points for making that a consistent interface for hardware for
> years to come).
> -Chris

Thanks for rephrasing - that's exactly what I am concerned with.

Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the
set of the 3 we have defined had been naively utilized?

If that's the case it is probably the cause of the performance difference -
everything beyond "the 3" means UNCACHED.

Can you try changing MediaSDK to only use entries that are already in?
How does the performance differ in that case?

--
Cheers,
Arek
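
The concern raised in this message can be pictured with a small lookup sketch: userspace picks a MOCS index, and any index the kernel has not programmed behaves as uncached. The sketch below only illustrates that statement from the thread. The enum, the table contents, the slot count and effective_policy() are all assumptions made up for the example and do not reproduce the i915 table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache policies, named for illustration only. */
enum cache_policy_sketch {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_WRITEBACK,
};

/* Assumed number of index slots userspace may select from. */
#define NUM_MOCS_SLOTS_SKETCH 64

/* "The 3" entries the thread says are defined; the contents are assumed. */
static const enum cache_policy_sketch defined_entries[] = {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_WRITEBACK,
};

#define NUM_DEFINED_SKETCH \
	(sizeof(defined_entries) / sizeof(defined_entries[0]))

/*
 * Policy a buffer effectively gets for a given index: a defined slot
 * returns its entry, every other slot falls back to uncached.
 */
enum cache_policy_sketch effective_policy(size_t index)
{
	assert(index < NUM_MOCS_SLOTS_SKETCH);

	if (index < NUM_DEFINED_SKETCH)
		return defined_entries[index];

	return POLICY_UNCACHED;
}
```

Under these assumptions, a media stack that selects, say, index 10 gets uncached behaviour, so a large speedup after programming that slot would not by itself show that the fine-tuned entry beats the existing write-back one. That is the comparison being asked for above.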
Re: [Intel-gfx] [PATCH v2] tests/pm_sseu: Re-enable the test
On Wed, Apr 26, 2017 at 03:28:09AM -0700, Oscar Mateo wrote:
> This test got inadvertently disabled by commit 83884e97 (Restore
> "lib: Open debugfs files for the given DRM device") when the
> initialization order got changed (dbg_init before gem_init).
>
> v2:
>   - The asserts on fd are useless (Petri)
>   - Deinit in inverse order.
>
> Cc: Petri Latvala
> Signed-off-by: Oscar Mateo

Thanks, pushed with R-b.

Btw, can you do

git config format.subjectprefix "PATCH i-g-t"

for your future patches?

--
Petri Latvala
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On 04/05/2017 09:35, Arkadiusz Hiler wrote:
> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
>> On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
>>> On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
>>>> On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
>>>>> Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
>>>>> Some of these are used by media-sdk; if these entries are missing
>>>>> the default will instead be to do everything uncached.
>>>>>
>>>>> This patch improves media-sdk performance by up to 60%
>>>>> with the (admittedly synthetic) benchmarks we use in our nightly
>>>>> testing, without regressing any other benchmarks.
>>>>
>>>> Hey David,
>>>>
>>>> I am testing some of the extended MOCS with Mesa and the differences I
>>>> see fit in the margins of statistical error.
>>>>
>>>> Odd, I thought, so to make sure I haven't messed up anything in the
>>>> process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
>>>> everything to UNCACHED - and I saw severe performance drop.
>>>>
>>>> So here is the question it induced:
>>>>
>>>> Have you used the "closest neighbour" from entries available or did you
>>>> default to the UNCACHED ones? That could be the culprit.
>>>>
>>>> Note: I have tested MOCS for VB and Render Target only, and only in a
>>>> few synthetic cases - it will require much more fine-tuning and
>>>> benchmarking before any final conclusions.
>>>
>>> As I mentioned in the commit message, the improvements only manifest
>>> themselves for media-sdk workloads (and presumably other workloads
>>> that use the same hardware); if you see any performance regressions
>>> with these additional entries I'd be interested to know.
>>
>> But what is being counter suggested is that there is no reason for these
>> mocs entries. If the sdk is just using mocs registers without first
>> programming them outside of the kernel abi, then it will be hitting
>> uncached memory - and then the only benefit is from simply enabling
>> cached access. The kernel ABI is minimalist for a reason, and we want to
>> know why we should be adding tables that we need to maintain forever
>> (bonus points for making that a consistent interface for hardware for
>> years to come).
>> -Chris
>
> Thanks for rephrasing - that's exactly what I am concerned with.
>
> Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the
> set of the 3 we have defined had been naively utilized?
>
> If that's the case it is probably the cause of the performance difference -
> everything beyond "the 3" means UNCACHED.
>
> Can you try changing MediaSDK to only use entries that are already in?
> How does the performance differ in that case?

Alternatively, at the time this was on my plate, Eero had suggested a
sequence of experiments by basically gradually replicating the default
UC/WB entries to currently empty slots, starting on GT2 parts and then
going forward adding the more fine tuned parts. This would have shown
the benefit of fine tuned entries vs basic cached ones.

Unfortunately I never got round to doing this, but it sounded like a really
good approach to me. I could paste these suggestions here if Eero wouldn't
mind?

But I am also not sure if it is still relevant after the effort of
exactly documenting the extended set of entries started.

Regards,

Tvrtko
Re: [Intel-gfx] [PATCH 07/67] drm/i915/cnl: Introduce Cannonlake platform defition.
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> Cannonlake is an Intel® Processor containing Intel® HD Graphics
> following Kabylake.
>
> It is Gen10.
>
> Let's start by adding the platform definition based on previous
> platforms but yet as alpha_support.
>
> On following patches we will start adding PCI IDs and the
> platform specific changes.
>
> Signed-off-by: Rodrigo Vivi
> ---
>  drivers/gpu/drm/i915/i915_drv.h          | 3 +++
>  drivers/gpu/drm/i915/i915_pci.c          | 8 ++++++++
>  drivers/gpu/drm/i915/intel_device_info.c | 1 +
>  3 files changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2685f12..a357862 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -887,6 +887,7 @@ enum intel_platform {
>  	INTEL_BROXTON,
>  	INTEL_KABYLAKE,
>  	INTEL_GEMINILAKE,
> +	INTEL_CANNONLAKE,
>  	INTEL_MAX_PLATFORMS
>  };
>
> @@ -2751,6 +2752,7 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
>  #define IS_BROXTON(dev_priv)	((dev_priv)->info.platform == INTEL_BROXTON)
>  #define IS_KABYLAKE(dev_priv)	((dev_priv)->info.platform == INTEL_KABYLAKE)
>  #define IS_GEMINILAKE(dev_priv)	((dev_priv)->info.platform == INTEL_GEMINILAKE)
> +#define IS_CANNONLAKE(dev_priv)	((dev_priv)->info.platform == INTEL_CANNONLAKE)
>  #define IS_MOBILE(dev_priv)	((dev_priv)->info.is_mobile)
>  #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
>  				    (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
> @@ -2842,6 +2844,7 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
>  #define IS_GEN7(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(6)))
>  #define IS_GEN8(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(7)))
>  #define IS_GEN9(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(8)))
> +#define IS_GEN10(dev_priv)	(!!((dev_priv)->info.gen_mask & BIT(9)))
>
>  #define IS_LP(dev_priv)	(INTEL_INFO(dev_priv)->is_lp)
>  #define IS_GEN9_LP(dev_priv)	(IS_GEN9(dev_priv) && IS_LP(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index f87b0c4..a2a4b2f 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -431,6 +431,14 @@
>  	.ring_mask = RENDER_RING | BSD_RING | BLT_RING | VEBOX_RING | BSD2_RING,
>  };
>
> +static const struct intel_device_info intel_cannonlake_info = {
> +	BDW_FEATURES,
> +	.is_alpha_support = 1,
> +	.platform = INTEL_CANNONLAKE,
> +	.gen = 10,
> +	.ddb_size = 896,
> +};
> +

I think it makes sense to squash patch 17 with this one. No point in adding
.ddb_size with the wrong value. If there's a reason not to squash, I'd say it
is better to leave this as zero, so that the WARN_ON(ddb_size == 0) in
intel_pm.c will remind us to fix it.

With one of these suggestions,

Reviewed-by: Ander Conselvan de Oliveira

>  /*
>   * Make sure any device matches here are from most specific to most
>   * general. For example, since the Quanta match is based on the subsystem
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 7d01dfe..6b09a82 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -51,6 +51,7 @@
>  	PLATFORM_NAME(BROXTON),
>  	PLATFORM_NAME(KABYLAKE),
>  	PLATFORM_NAME(GEMINILAKE),
> +	PLATFORM_NAME(CANNONLAKE),
>  };
>  #undef PLATFORM_NAME
>
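
Two details of the patch quoted above are easy to miss: the IS_GENn() macros test bit n - 1 of gen_mask (BIT(9) for Gen10), and the review suggests leaving .ddb_size at zero so that an existing warning flags it. A small sketch of both ideas follows. It uses invented names (struct device_info_sketch, init_gen_mask(), is_gen(), ddb_size_needs_fixing()), and the way gen_mask is derived from gen is an assumption read off the IS_GENn() definitions in the diff rather than taken from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_SKETCH(n) (UINT32_C(1) << (n))

/* Hypothetical, trimmed-down device description. */
struct device_info_sketch {
	unsigned int gen;
	uint32_t gen_mask;
	unsigned int ddb_size;
};

/* Assumed convention, matching IS_GEN9 = BIT(8) and IS_GEN10 = BIT(9). */
void init_gen_mask(struct device_info_sketch *info)
{
	assert(info->gen >= 1 && info->gen <= 32);
	info->gen_mask = BIT_SKETCH(info->gen - 1);
}

/* Equivalent of an IS_GENn() check for an arbitrary generation. */
bool is_gen(const struct device_info_sketch *info, unsigned int gen)
{
	assert(gen >= 1 && gen <= 32);
	return (info->gen_mask & BIT_SKETCH(gen - 1)) != 0;
}

/*
 * Mirrors the review suggestion: a zero ddb_size means "not filled in
 * yet" and should trip a warning instead of passing silently.
 */
bool ddb_size_needs_fixing(const struct device_info_sketch *info)
{
	return info->ddb_size == 0;
}
```

With .gen = 10 and .ddb_size left at zero, is_gen(&info, 10) is true and ddb_size_needs_fixing(&info) is also true, which is the reminder effect the reviewer is after.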
Re: [Intel-gfx] [PATCH 16/67] drm/i915/cnl: Cannonlake has 4 planes (3 sprites) per pipe
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> From: James Irwin
>
> Issue: VIZ-4525
>
> Reviewed-by: Damien Lespiau
> Signed-off-by: James Irwin
> Signed-off-by: Damien Lespiau

Reviewed-by: Ander Conselvan de Oliveira

> ---
>  drivers/gpu/drm/i915/intel_device_info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 6b09a82..3cc8cdb 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -328,7 +328,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>  	 * we don't expose the topmost plane at all to prevent ABI breakage
>  	 * down the line.
>  	 */
> -	if (IS_GEMINILAKE(dev_priv))
> +	if (IS_GEN10(dev_priv) || IS_GEMINILAKE(dev_priv))
>  		for_each_pipe(dev_priv, pipe)
>  			info->num_sprites[pipe] = 3;
>  	else if (IS_BROXTON(dev_priv)) {
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
Hi,

On 04.05.2017 11:53, Tvrtko Ursulin wrote:
> On 04/05/2017 09:35, Arkadiusz Hiler wrote:
>> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
>>> But what is being counter suggested is that there is no reason for these
>>> mocs entries. If the sdk is just using mocs registers without first
>>> programming them outside of the kernel abi, then it will be hitting
>>> uncached memory - and then the only benefit is from simply enabling
>>> cached access. The kernel ABI is minimalist for a reason, and we want to
>>> know why we should be adding tables that we need to maintain forever
>>> (bonus points for making that a consistent interface for hardware for
>>> years to come).
>>> -Chris
>>
>> Thanks for rephrasing - that's exactly what I am concerned with.
>>
>> Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the
>> set of the 3 we have defined had been naively utilized?
>>
>> If that's the case it is probably the cause of the performance difference -
>> everything beyond "the 3" means UNCACHED.
>>
>> Can you try changing MediaSDK to only use entries that are already in?
>> How does the performance differ in that case?
>
> Alternatively, at the time this was on my plate, Eero had suggested a
> sequence of experiments by basically gradually replicating the default
> UC/WB entries to currently empty slots, starting on GT2 parts and then
> going forward adding the more fine tuned parts. This would have shown
> the benefit of fine tuned entries vs basic cached ones.
>
> Unfortunately I never got round to doing this, but it sounded like a really
> good approach to me. I could paste these suggestions here if Eero wouldn't
> mind?

Of course I don't mind. :-)

> But I am also not sure if it is still relevant after the effort of
> exactly documenting the extended set of entries started.

It's relevant in the sense that we currently don't know whether there's
any actual benefit from the new entries (i.e. was it just an issue of
VPG not using the correct existing entries).

If there is, that would be motivation to investigate impact of them
also on other workloads.

	- Eero
Re: [Intel-gfx] [PATCH] drm/i915/cnp: Backlight support for CNP.
On Wed, 03 May 2017, Anusha Srivatsa wrote:
> From: Rodrigo Vivi
>
> Split out BXT and CNP's setup_backlight(), enable_backlight(),
> disable_backlight() and hz_to_pwm() into
> two separate functions instead of reusing BXT function.
>
> Reuse set_backlight() and get_backlight() since they have
> no reference to the utility pin.
>
> v2: Reuse BXT functions with controller 0 instead of
>     redefining it. (Jani).
>     Use dev_priv->rawclk_freq instead of getting the value
>     from SFUSE_STRAP.
> v3: Avoid setup backlight controller along with hooks and
>     fully reuse hooks setup as suggested by Jani.
> v4: Clean up commit message.
> v5: Implement per PCH instead per platform.
>
> v6: Introduce a new function for CNP. (Jani and Ville)
>
> v7: Squash all the CNP Backlight support patches into a
>     single patch. (Jani)
>
> v8: Correct indentation, remove unneeded blank lines and
>     correct mail address (Jani).
>
> Reviewed-by: Jani Nikula

Yup.

What's the plan for merging the series, incl. this patch?

BR,
Jani.

> Suggested-by: Jani Nikula > Suggested-by: Ville Syrjala > Cc: Ville Syrjala > Cc: Jani Nikula > Signed-off-by: Anusha Srivatsa > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/intel_panel.c | 88 > +++--- > 1 file changed, 83 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_panel.c > b/drivers/gpu/drm/i915/intel_panel.c > index 1978bec..8ee61c1 100644 > --- a/drivers/gpu/drm/i915/intel_panel.c > +++ b/drivers/gpu/drm/i915/intel_panel.c > @@ -796,6 +796,19 @@ static void bxt_disable_backlight(struct intel_connector > *connector) > } > } > > +static void cnp_disable_backlight(struct intel_connector *connector) > +{ > + struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > + struct intel_panel *panel = &connector->panel; > + u32 tmp, val; > + > + intel_panel_actually_set_backlight(connector, 0); > + > + tmp = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller)); > + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller), > +tmp & ~BXT_BLC_PWM_ENABLE); > +} > + > static void pwm_disable_backlight(struct intel_connector *connector) > { > struct intel_panel *panel = &connector->panel; > @@ -1076,6 +1089,36 @@ static void bxt_enable_backlight(struct > intel_connector *connector) > pwm_ctl | BXT_BLC_PWM_ENABLE); > } > > +static void cnp_enable_backlight(struct intel_connector *connector) > +{ > + struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > + struct intel_panel *panel = &connector->panel; > + enum pipe pipe = intel_get_pipe_from_connector(connector); > + u32 pwm_ctl, val; > + > + pwm_ctl = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller)); > + if (pwm_ctl & BXT_BLC_PWM_ENABLE) { > + DRM_DEBUG_KMS("backlight already enabled\n"); > + pwm_ctl &= ~BXT_BLC_PWM_ENABLE; > + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller), > +pwm_ctl); > + } > + > + I915_WRITE(BXT_BLC_PWM_FREQ(panel->backlight.controller), > +panel->backlight.max); > + > + intel_panel_actually_set_backlight(connector, 
panel->backlight.level); > + > + pwm_ctl = 0; > + if (panel->backlight.active_low_pwm) > + pwm_ctl |= BXT_BLC_PWM_POLARITY; > + > + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller), pwm_ctl); > + POSTING_READ(BXT_BLC_PWM_CTL(panel->backlight.controller)); > + I915_WRITE(BXT_BLC_PWM_CTL(panel->backlight.controller), > +pwm_ctl | BXT_BLC_PWM_ENABLE); > +} > + > static void pwm_enable_backlight(struct intel_connector *connector) > { > struct intel_panel *panel = &connector->panel; > @@ -1645,6 +1688,37 @@ bxt_setup_backlight(struct intel_connector *connector, > enum pipe unused) > return 0; > } > > +static int > +cnp_setup_backlight(struct intel_connector *connector, enum pipe unused) > +{ > + struct drm_i915_private *dev_priv = to_i915(connector->base.dev); > + struct intel_panel *panel = &connector->panel; > + u32 pwm_ctl, val; > + > + panel->backlight.controller = dev_priv->vbt.backlight.controller; > + > + pwm_ctl = I915_READ(BXT_BLC_PWM_CTL(panel->backlight.controller)); > + > + panel->backlight.active_low_pwm = pwm_ctl & BXT_BLC_PWM_POLARITY; > + panel->backlight.max = > + I915_READ(BXT_BLC_PWM_FREQ(panel->backlight.controller)); > + > + if (!panel->backlight.max) > + panel->backlight.max = get_backlight_max_vbt(connector); > + > + if (!panel->backlight.max) > + return -ENODEV; > + > + val = bxt_get_backlight(connector); > + val = intel_panel_compute_brightness(connector, val); > + panel->backlight.level = clamp(val, panel->backlight.min, > +panel->backlight.max); > + > + panel->backlight.enabled = pwm_ctl & BXT_
[Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16 bytes. Instead we convert them to use uuid_le type. At the same time we convert current users. acpi_str_to_uuid() becomes useless after the conversion and it's safe to get rid of it. The conversion fixes a potential bug in int340x_thermal as well since we have to use memcmp() on binary data. Cc: Rafael J. Wysocki Cc: Mika Westerberg Cc: Borislav Petkov Cc: Dan Williams Cc: Amir Goldstein Cc: Jarkko Sakkinen Cc: Jani Nikula Cc: Ben Skeggs Cc: Benjamin Tissoires Cc: Joerg Roedel Cc: Adrian Hunter Cc: Yisen Zhuang Cc: Bjorn Helgaas Cc: Zhang Rui Cc: Felipe Balbi Cc: Mathias Nyman Cc: Heikki Krogerus Cc: Liam Girdwood Cc: Mark Brown Signed-off-by: Andy Shevchenko --- drivers/acpi/acpi_extlog.c | 10 +++--- drivers/acpi/bus.c | 29 ++-- drivers/acpi/nfit/core.c | 40 +++--- drivers/acpi/nfit/nfit.h | 3 +- drivers/acpi/utils.c | 4 +-- drivers/char/tpm/tpm_crb.c | 9 +++-- drivers/char/tpm/tpm_ppi.c | 20 +-- drivers/gpu/drm/i915/intel_acpi.c | 14 +++- drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +-- drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c | 9 +++-- drivers/hid/i2c-hid/i2c-hid.c | 9 +++-- drivers/iommu/dmar.c | 11 +++--- drivers/mmc/host/sdhci-pci-core.c | 9 +++-- drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 drivers/pci/pci-acpi.c | 11 +++--- drivers/pci/pci-label.c| 4 +-- drivers/thermal/int340x_thermal/int3400_thermal.c | 8 ++--- drivers/usb/dwc3/dwc3-pci.c| 6 ++-- drivers/usb/host/xhci-pci.c| 9 +++-- drivers/usb/misc/ucsi.c| 2 +- drivers/usb/typec/typec_wcove.c| 4 +-- include/acpi/acpi_bus.h| 9 ++--- include/linux/acpi.h | 4 +-- include/linux/pci-acpi.h | 2 +- sound/soc/intel/skylake/skl-nhlt.c | 7 ++-- tools/testing/nvdimm/test/iomap.c | 2 +- tools/testing/nvdimm/test/nfit.c | 2 +- 27 files changed, 116 insertions(+), 156 deletions(-) diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c index 502ea4dc2080..69d6140b6afa 100644 --- a/drivers/acpi/acpi_extlog.c +++ 
b/drivers/acpi/acpi_extlog.c @@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, unsigned long val, static bool __init extlog_get_l1addr(void) { - u8 uuid[16]; + uuid_le uuid; acpi_handle handle; union acpi_object *obj; - acpi_str_to_uuid(extlog_dsm_uuid, uuid); - + if (uuid_le_to_bin(extlog_dsm_uuid, &uuid)) + return false; if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle))) return false; - if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR)) + if (!acpi_check_dsm(handle, &uuid, EXTLOG_DSM_REV, 1 << EXTLOG_FN_ADDR)) return false; - obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV, + obj = acpi_evaluate_dsm_typed(handle, &uuid, EXTLOG_DSM_REV, EXTLOG_FN_ADDR, NULL, ACPI_TYPE_INTEGER); if (!obj) { return false; diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 784bda663d16..e8130a4873e9 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle, pr_debug("\n"); } -acpi_status acpi_str_to_uuid(char *str, u8 *uuid) -{ - int i; - static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21, - 24, 26, 28, 30, 32, 34}; - - if (strlen(str) != 36) - return AE_BAD_PARAMETER; - for (i = 0; i < 36; i++) { - if (i == 8 || i == 13 || i == 18 || i == 23) { - if (str[i] != '-') - return AE_BAD_PARAMETER; - } else if (!isxdigit(str[i])) - return AE_BAD_PARAMETER; - } - for (i = 0; i < 16; i++) { - uuid[i] = hex_to_bin(str[opc_map_to_uuid[i]]) << 4; - uuid[i] |= hex_to_bin(str[opc_map_to_uuid[i] + 1]); - } - return AE_OK; -} -EXPORT_SYMBOL_GPL(acpi_str_to_uuid); - acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context) { acpi_status status; struct acpi_object_list input; union acpi_object in_params[4]; union acpi_object *out_obj; - u8 uuid[16]; + uuid_le uuid; u32 errors
Re: [Intel-gfx] [PATCH 2/3] drm: Create a format/modifier blob
Hi, On 3 May 2017 at 06:14, Ben Widawsky wrote: > Updated blob layout (Rob, Daniel, Kristian, xerpi) In terms of the blob as uABI, we've got an implementation inside Weston which works: https://git.collabora.com/cgit/user/daniels/weston.git/commit/?h=wip/2017-04/atomic-v11-WIP&id=0a47cb63947e That was authored by Sergi and reviewed by me. We both think it's entirely acceptable and future-proof uABI, and it does exactly what we want. We use it to both allocate with a suitable set of modifiers, as well as a high-pass filter to avoid assigning FBs to planes which won't accept the FB modifiers. So this gets my: Acked-by: Daniel Stone And a future revision with the fixups found here would get my R-b. Cheers, Daniel ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [CI] drm/i915: Use engine->context_pin() to report the intel_ring
Since unifying ringbuffer/execlist submission to use engine->pin_context, we ensure that the intel_ring is available before we start constructing the request. We can therefore move the assignment of the request->ring to the central i915_gem_request_alloc() and not require it in every engine->request_alloc() callback. Another small step towards simplification (of the core, but at a cost of handling error pointers in less important callers of engine->pin_context). v2: Rearrange a few branches to reduce impact of PTR_ERR() on gcc's code generation. Signed-off-by: Chris Wilson Cc: Oscar Mateo Cc: Joonas Lahtinen Reviewed-by: Oscar Mateo --- drivers/gpu/drm/i915/gvt/scheduler.c | 6 -- drivers/gpu/drm/i915/i915_gem_request.c | 9 ++--- drivers/gpu/drm/i915/i915_perf.c | 13 ++--- drivers/gpu/drm/i915/intel_engine_cs.c | 7 --- drivers/gpu/drm/i915/intel_lrc.c | 17 - drivers/gpu/drm/i915/intel_ringbuffer.c | 25 + drivers/gpu/drm/i915/intel_ringbuffer.h | 4 ++-- drivers/gpu/drm/i915/selftests/mock_engine.c | 8 8 files changed, 47 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c index 1256fe21850b..6ae286cb5804 100644 --- a/drivers/gpu/drm/i915/gvt/scheduler.c +++ b/drivers/gpu/drm/i915/gvt/scheduler.c @@ -180,6 +180,7 @@ static int dispatch_workload(struct intel_vgpu_workload *workload) struct intel_engine_cs *engine = dev_priv->engine[ring_id]; struct drm_i915_gem_request *rq; struct intel_vgpu *vgpu = workload->vgpu; + struct intel_ring *ring; int ret; gvt_dbg_sched("ring id %d prepare to dispatch workload %p\n", @@ -198,8 +199,9 @@ static int dispatch_workload(struct intel_vgpu_workload *workload) * shadow_ctx pages invalid. So gvt need to pin itself. After update * the guest context, gvt can unpin the shadow_ctx safely. 
*/ - ret = engine->context_pin(engine, shadow_ctx); - if (ret) { + ring = engine->context_pin(engine, shadow_ctx); + if (IS_ERR(ring)) { + ret = PTR_ERR(ring); gvt_vgpu_err("fail to pin shadow context\n"); workload->status = ret; mutex_unlock(&dev_priv->drm.struct_mutex); diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c index 9074303c..10361c7e3b37 100644 --- a/drivers/gpu/drm/i915/i915_gem_request.c +++ b/drivers/gpu/drm/i915/i915_gem_request.c @@ -551,6 +551,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, { struct drm_i915_private *dev_priv = engine->i915; struct drm_i915_gem_request *req; + struct intel_ring *ring; int ret; lockdep_assert_held(&dev_priv->drm.struct_mutex); @@ -565,9 +566,10 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, * GGTT space, so do this first before we reserve a seqno for * ourselves. */ - ret = engine->context_pin(engine, ctx); - if (ret) - return ERR_PTR(ret); + ring = engine->context_pin(engine, ctx); + if (IS_ERR(ring)) + return ERR_CAST(ring); + GEM_BUG_ON(!ring); ret = reserve_seqno(engine); if (ret) @@ -633,6 +635,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine, req->i915 = dev_priv; req->engine = engine; req->ctx = ctx; + req->ring = ring; /* No zalloc, must clear what we need by hand */ req->global_seqno = 0; diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 060b171480d5..cdac68580cb1 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -744,6 +744,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) { struct drm_i915_private *dev_priv = stream->dev_priv; struct intel_engine_cs *engine = dev_priv->engine[RCS]; + struct intel_ring *ring; int ret; ret = i915_mutex_lock_interruptible(&dev_priv->drm); @@ -755,9 +756,10 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) * * NB: implied RCS engine... 
*/ - ret = engine->context_pin(engine, stream->ctx); - if (ret) - goto unlock; + ring = engine->context_pin(engine, stream->ctx); + mutex_unlock(&dev_priv->drm.struct_mutex); + if (IS_ERR(ring)) + return PTR_ERR(ring); /* Explicitly track the ID (instead of calling i915_ggtt_offset() * on the fly) considering the difference with gen8+ and @@ -766,10 +768,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) dev_priv->perf.oa.specific_ctx_id = i915_ggtt_offset(stream->ctx->engine[engine->id].state); -unlock: - mutex_unlock(&dev_priv-
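The hunks above switch callers from checking an integer return to checking an error pointer (IS_ERR(), PTR_ERR(), ERR_CAST()). As a rough illustration of that idiom only, here is a minimal userspace sketch. Every sketch_* name is hypothetical, and the helpers merely mimic the kernel's err.h helpers; they are not the kernel implementation or the i915 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: like the kernel, reserve the top 4095 addresses for errno codes. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (stand-in for ERR_PTR()). */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer value lies in the reserved error range (stand-in for IS_ERR()). */
static inline bool sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer (stand-in for PTR_ERR()). */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical stand-in for struct intel_ring. */
struct sketch_ring {
	int size;
};

static struct sketch_ring sketch_the_ring = { .size = 4096 };

/* Hypothetical pin callback: returns the ring on success, an error pointer on failure. */
static struct sketch_ring *sketch_context_pin(bool fail)
{
	if (fail)
		return sketch_err_ptr(-ENOMEM);
	return &sketch_the_ring;
}

/* Caller pattern from the patch: test the pointer, convert to errno only on the error path. */
static int sketch_request_alloc(bool fail, struct sketch_ring **out)
{
	struct sketch_ring *ring = sketch_context_pin(fail);

	if (sketch_is_err(ring))
		return (int)sketch_ptr_err(ring);
	assert(ring != NULL); /* mirrors the GEM_BUG_ON(!ring) in the patch */

	*out = ring;
	return 0;
}
```

The point of the design is that a single return value carries either the object or the reason it could not be provided, so the central allocator can assign the ring without a second lookup.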
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On 04/05/2017 10:21, Eero Tamminen wrote:

Hi,

On 04.05.2017 11:53, Tvrtko Ursulin wrote:

On 04/05/2017 09:35, Arkadiusz Hiler wrote:

On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:

But what is being counter suggested is that there is no reason for these mocs entries. If the sdk is just using mocs registers without first programming them outside of the kernel abi, then it will be hitting uncached memory - and then the only benefit is from simply enabling cached access. The kernel ABI is minimalist for a reason, and we want to know why we should be adding tables that we need to maintain forever (bonus points for making that a consistent interface for hardware for years to come). -Chris

Thanks for rephrasing - that's exactly what I am concerned with. Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the set of the 3 we have defined had been naively utilized? If that's the case it is probably the cause of the performance difference - everything beyond "the 3" means UNCACHED. Can you try changing MediaSDK to only use entries that are already in? How does the performance differ in that case?

Alternatively, at the time this was on my plate, Eero had suggested a sequence of experiments by basically gradually replicating the default UC/WB entries to currently empty slots, starting on GT2 parts and then going forward adding the more fine tuned parts. This would have shown the benefit of fine tuned entries vs basic cached ones. Unfortunately I never got round to doing this, but it sounded like a really good approach to me. I could paste these suggestions here if Eero wouldn't mind?

Of course I don't mind. :-)

Excellent, so here is what you wrote to me at that time:

--

You could start by putting first ED_UC line values to other ED_UC lines, and the first ED_WB line values to other ED_WB lines. Then test that against standard kernel and VPG kernel on SKL GT2 machine, to evaluate LLC settings.
If perf of that looks good, then test same settings also on SKL GT3e, or GT4e to evaluate impact of the more fine-tuned eLLC settings in addition to LLC ones.

If GT2 results don't look good, try using ED_WB line for all lines that have either ED_WB or L3_WB. If that doesn't look good either, try using ED_UC line for all lines that have either ED_UC or L3_UC.

And if even that fails to produce performance-wise good results, we can conclude that VPG kernel's fine-tuned MOCS settings are really needed.

Please provide some spreadsheet of the results you get.

(My guess is that the first settings provide almost all of the available speedup on GT2, but with eDRAM things aren't that straightforward.)

--

But I am also not sure if it is still relevant after the effort of exactly documenting the extended set of entries started.

It's relevant in the sense that we currently don't know whether there's any actual benefit from the new entries (i.e. was it just an issue of VPG not using the correct existing entries). If there is, that would be motivation to investigate impact of them also on other workloads.

There probably is a benefit since it is hard to imagine fine tuned entries would otherwise exist. But I agree it makes sense to get a complete understanding of relative contribution of individual fine tunings.

Regards,

Tvrtko
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, 04 May 2017, Andy Shevchenko wrote: > diff --git a/drivers/gpu/drm/i915/intel_acpi.c > b/drivers/gpu/drm/i915/intel_acpi.c > index eb638a1e69d2..72bfe6ceadf8 100644 > --- a/drivers/gpu/drm/i915/intel_acpi.c > +++ b/drivers/gpu/drm/i915/intel_acpi.c > @@ -15,13 +15,9 @@ static struct intel_dsm_priv { > acpi_handle dhandle; > } intel_dsm_priv; > > -static const u8 intel_dsm_guid[] = { > - 0xd3, 0x73, 0xd8, 0x7e, > - 0xd0, 0xc2, > - 0x4f, 0x4e, > - 0xa8, 0x54, > - 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c > -}; > +static const uuid_le intel_dsm_guid = > + UUID_LE(0x7ed873d3, 0xc2d0, 0x4e4f, > + 0xa8, 0x54, 0x0f, 0x13, 0x17, 0xb0, 0x1c, 0x2c); > > static char *intel_dsm_port_name(u8 id) > { > @@ -80,7 +76,7 @@ static void intel_dsm_platform_mux_info(void) > int i; > union acpi_object *pkg, *connector_count; > > - pkg = acpi_evaluate_dsm_typed(intel_dsm_priv.dhandle, intel_dsm_guid, > + pkg = acpi_evaluate_dsm_typed(intel_dsm_priv.dhandle, &intel_dsm_guid, > INTEL_DSM_REVISION_ID, INTEL_DSM_FN_PLATFORM_MUX_INFO, > NULL, ACPI_TYPE_PACKAGE); > if (!pkg) { > @@ -118,7 +114,7 @@ static bool intel_dsm_pci_probe(struct pci_dev *pdev) > if (!dhandle) > return false; > > - if (!acpi_check_dsm(dhandle, intel_dsm_guid, INTEL_DSM_REVISION_ID, > + if (!acpi_check_dsm(dhandle, &intel_dsm_guid, INTEL_DSM_REVISION_ID, > 1 << INTEL_DSM_FN_PLATFORM_MUX_INFO)) { > DRM_DEBUG_KMS("no _DSM method for intel device\n"); > return false; The drm/i915 hunk above is Reviewed-by: Jani Nikula and acked for merging via whichever tree is suitable. BR, Jani. -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
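The removed byte array and the new UUID_LE() initializer in the hunk above describe the same 16 bytes: the first three fields of the textual UUID are stored byte-swapped, the rest in order. The following userspace sketch shows that mapping, reusing the offset table from the acpi_str_to_uuid() helper that the original patch in this thread deletes. The sketch_* names are hypothetical, and this is not the kernel's uuid_le_to_bin().

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * String offset of the hex pair that supplies each output byte. The first
 * three UUID fields (8, 4 and 4 hex digits) are read in reverse byte order;
 * the remaining eight bytes are read in order.
 */
static const int sketch_le_offsets[16] = {
	6, 4, 2, 0, 11, 9, 16, 14, 19, 21, 24, 26, 28, 30, 32, 34
};

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int sketch_hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Parse "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 little-endian bytes. */
static int sketch_uuid_le_parse(const char *str, uint8_t out[16])
{
	size_t i;

	if (strlen(str) != 36)
		return -1;

	/* Validate the layout first: dashes at fixed positions, hex elsewhere. */
	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (str[i] != '-')
				return -1;
		} else if (!isxdigit((unsigned char)str[i])) {
			return -1;
		}
	}

	for (i = 0; i < 16; i++) {
		int hi = sketch_hex_val(str[sketch_le_offsets[i]]);
		int lo = sketch_hex_val(str[sketch_le_offsets[i] + 1]);

		assert(hi >= 0 && lo >= 0); /* guaranteed by the validation loop */
		out[i] = (uint8_t)((hi << 4) | lo);
	}
	return 0;
}
```

Parsing "7ed873d3-c2d0-4e4f-a854-0f1317b01c2c" with this sketch should give the bytes of the removed intel_dsm_guid[] array, which is why the conversion in the hunk does not change the value passed to _DSM.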
[Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
A good default for garbage entries from the user is to follow the default setting of the object (i.e. the PTE). Currently they use the uncached entry, and now the only way to accidentally hit uncached performance is via explicit use of the uncached MOCS or setting the object to uncached. Note that these entries are currently undefined in the ABI and we reserve the right to change them. We originally chose uncached to eliminate any problem with reducing the caching level in future, but the object is a much better definition of the minimum caching level. Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS") Signed-off-by: Chris Wilson Cc: David Weinehall Cc: Arkadiusz Hiler Cc: Tvrtko Ursulin Cc: sta...@vger.kernel.org --- drivers/gpu/drm/i915/intel_mocs.c | 39 +++ 1 file changed, 15 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_mocs.c b/drivers/gpu/drm/i915/intel_mocs.c index 92e461c68385..e7a7781ca457 100644 --- a/drivers/gpu/drm/i915/intel_mocs.c +++ b/drivers/gpu/drm/i915/intel_mocs.c @@ -85,10 +85,7 @@ struct drm_i915_mocs_table { * * Entries not part of the following tables are undefined as far as * userspace is concerned and shouldn't be relied upon. For the time - * being they will be implicitly initialized to the strictest caching - * configuration (uncached) to guarantee forwards compatibility with - * userspace programs written against more recent kernels providing - * additional MOCS entries. + * being they will be implicitly initialized to follow the PTE. * * NOTE: These tables MUST start with being uncached and the length * MUST be less than 63 as the last two registers are reserved @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs *engine) table.table[index].control_value); /* -* Ok, now set the unused entries to uncached. These entries +* Ok, now set the unused entries to follow the PTE. 
These entries * are officially undefined and no contract for the contents * and settings is given for these entries. -* -* Entry 0 in the table is uncached - so we are just writing -* that value to all the used entries. */ for (; index < GEN9_NUM_MOCS_ENTRIES; index++) I915_WRITE(mocs_register(engine->id, index), - table.table[0].control_value); + table.table[I915_MOCS_PTE].control_value); return 0; } @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct drm_i915_gem_request *req, } /* -* Ok, now set the unused entries to uncached. These entries +* Ok, now set the unused entries to follow the PTE. These entries * are officially undefined and no contract for the contents * and settings is given for these entries. -* -* Entry 0 in the table is uncached - so we are just writing -* that value to all the used entries. */ for (; index < GEN9_NUM_MOCS_ENTRIES; index++) { *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); - *cs++ = table->table[0].control_value; + *cs++ = table->table[I915_MOCS_PTE].control_value; } *cs++ = MI_NOOP; @@ -355,18 +346,17 @@ static int emit_mocs_l3cc_table(struct drm_i915_gem_request *req, if (table->size & 0x01) { /* Odd table size - 1 left over */ *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); - *cs++ = l3cc_combine(table, 2 * i, 0); + *cs++ = l3cc_combine(table, 2 * i, I915_MOCS_PTE); i++; } /* -* Now set the rest of the table to uncached - use entry 0 as -* this will be uncached. Leave the last pair uninitialised as -* they are reserved by the hardware. +* Now set the rest of the table to follow the PTE. +* Leave the last pair as they are reserved by the hardware. 
*/ for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) { *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); - *cs++ = l3cc_combine(table, 0, 0); + *cs++ = l3cc_combine(table, I915_MOCS_PTE, I915_MOCS_PTE); } *cs++ = MI_NOOP; @@ -402,17 +392,18 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private *dev_priv) /* Odd table size - 1 left over */ if (table.size & 0x01) { - I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0)); + I915_WRITE(GEN9_LNCFCMOCS(i), + l3cc_combine(&table, 2*i, I915_MOCS_PTE)); i++; } /* -* Now set the rest of the table to uncached - use entry 0 as -* this will be uncached. Leave the last pair as initialised as -* they are reserved by the hardware. +* Now set the rest of th
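To make the intent of the hunks above concrete, here is a small userspace sketch of the fill pattern: program the defined entries, then point every remaining slot at a chosen fallback entry instead of entry 0. All sketch_* names and the numeric index values are assumptions made for illustration; this is not the i915 register-programming code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed index values, for illustration only. */
enum {
	SKETCH_MOCS_UNCACHED = 0,
	SKETCH_MOCS_PTE = 1
};

/*
 * Copy the defined table into the register image, then fill every slot past
 * the end of the table with the value of the fallback entry.
 */
static void sketch_fill_mocs(uint32_t *regs, size_t num_regs,
			     const uint32_t *table, size_t table_size,
			     size_t fallback)
{
	size_t index;

	assert(fallback < table_size && table_size <= num_regs);

	for (index = 0; index < table_size; index++)
		regs[index] = table[index];

	/* Undefined entries: follow the fallback (the PTE entry in the patch). */
	for (; index < num_regs; index++)
		regs[index] = table[fallback];
}
```

Passing SKETCH_MOCS_UNCACHED as the fallback corresponds to the behaviour before the patch, and SKETCH_MOCS_PTE to the behaviour after it.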
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
== Series Details ==

Series: drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
URL   : https://patchwork.freedesktop.org/series/23884/
State : success

== Summary ==

Series 23884v2 drm/i915: Use engine->context_pin() to report the intel_ring
https://patchwork.freedesktop.org/api/1.0/series/23884/revisions/2/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:431s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:431s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:580s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:512s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:557s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:487s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:486s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:406s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:418s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:465s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:567s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:447s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:567s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:458s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:488s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:429s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:536s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:398s

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
2cdbe9f drm/i915: Use engine->context_pin() to report the intel_ring

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4617/
Re: [Intel-gfx] [PATCH 2/5] drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp
Hi Daniel,

[auto build test ERROR on drm/drm-next]
[also build test ERROR on next-20170503]
[cannot apply to v4.11]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/vblanke-cleanup-resend/20170504-003948
base:   git://people.freedesktop.org/~airlied/linux.git drm-next
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm

Note: the linux-review/Daniel-Vetter/vblanke-cleanup-resend/20170504-003948 HEAD 7d42e23d7949707be44be8720a9eb260534aa4dc builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   drivers/gpu//drm/vc4/vc4_crtc.c: In function 'vc4_crtc_get_scanoutpos':
>> drivers/gpu//drm/vc4/vc4_crtc.c:235:6: error: 'in_vblank_irq' undeclared (first use in this function)
     if (in_vblank_irq) {
         ^
   drivers/gpu//drm/vc4/vc4_crtc.c:235:6: note: each undeclared identifier is reported only once for each function it appears in

vim +/in_vblank_irq +235 drivers/gpu//drm/vc4/vc4_crtc.c

   229		 * We can't get meaningful readings wrt. scanline position of the PV
   230		 * and need to make things up in a approximative but consistent way.
   231		 */
   232		ret |= DRM_SCANOUTPOS_IN_VBLANK;
   233		vblank_lines = mode->vtotal - mode->vdisplay;
   234
 > 235		if (in_vblank_irq) {
   236			/*
   237			 * Assume the irq handler got called close to first
   238			 * line of vblank, so PV has about a full vblank

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE
== Series Details ==

Series: drm/i915: Set all undefined MOCS entries to follow PTE
URL   : https://patchwork.freedesktop.org/series/23941/
State : success

== Summary ==

Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:574s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:504s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:546s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:480s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:484s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:491s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:469s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:463s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:573s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:464s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:572s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:465s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:499s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:428s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:532s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:402s
fi-bdw-gvtdvm failed to collect. IGT log at Patchwork_4618/fi-bdw-gvtdvm/igt.log

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
901f6cd drm/i915: Set all undefined MOCS entries to follow PTE

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4618/
[Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not wait forever for stuck hardware. However, it is called from an irq-disabled context which prevents jiffie from advancing and so the loop doesn't terminate if the hardware fails. This can then cause NMI watchdog warnings, such as:

NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp mei_me prime_numbers pps_core mei lpc_ich i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915]
irq event stamp: 13366
hardirqs last enabled at (13365): [] _raw_spin_unlock_irq+0x27/0x50
hardirqs last disabled at (13366): [] _raw_spin_lock_irq+0x12/0x50
softirqs last enabled at (12744): [] __do_softirq+0x1d9/0x4c0
softirqs last disabled at (12721): [] irq_exit+0xa9/0xc0
CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U 4.11.0-rc4-CI-CI_DRM_319+ #1
Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
Workqueue: events_unbound async_run_entry_fn
task: 88024cd32740 task.stack: c9000162c000
RIP: 0010:preempt_count_add+0xe/0xc0
RSP: 0018:c9000162fbd8 EFLAGS: 0082
RAX: 8001 RBX: 000704b96558 RCX: 0002
RDX: RSI: 81c74f2d RDI: 0001
RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071
R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa
R13: 3ea0 R14: 0003 R15: a00061f0
FS: () GS:880256d8() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0
Call Trace:
 ? delay_tsc+0x3d/0xc0
 __delay+0xa/0x10
 __const_udelay+0x31/0x40
 snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core]
 ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel]
 snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core]
 azx_stop_chip+0x9/0x10 [snd_hda_codec]
 azx_suspend+0x72/0x220 [snd_hda_intel]
 pci_pm_suspend+0x71/0x140
 dpm_run_callback+0x6f/0x330
 ? pci_pm_freeze+0xe0/0xe0
 __device_suspend+0xf9/0x370
 ? dpm_watchdog_set+0x60/0x60
 async_suspend+0x1a/0x90
 async_run_entry_fn+0x34/0x160
 process_one_work+0x1f4/0x6d0
 ? process_one_work+0x16e/0x6d0
 worker_thread+0x49/0x4a0
 kthread+0x107/0x140
 ? process_one_work+0x6d0/0x6d0
 ? kthread_create_on_node+0x40/0x40
 ret_from_fork+0x2e/0x40

Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419
Signed-off-by: Chris Wilson
Cc: Jeeja KP
Cc: Vinod Koul
Cc: Takashi Iwai
Cc: # v4.7+
---
 sound/hda/hdac_controller.c | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c
index ee08c389b4d6..7f8806b03982 100644
--- a/sound/hda/hdac_controller.c
+++ b/sound/hda/hdac_controller.c
@@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus)
 {
 	unsigned long timeout;
 
-	timeout = jiffies + msecs_to_jiffies(100);
+	timeout = 100 * 100; /* 100ms */
 	while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN)
-		&& time_before(jiffies, timeout))
+	       && timeout--)
 		udelay(10);
 
-	timeout = jiffies + msecs_to_jiffies(100);
+	timeout = 100 * 100; /* 100ms */
 	while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN)
-		&& time_before(jiffies, timeout))
+	       && timeout--)
 		udelay(10);
 }
 
-- 
2.11.0
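The patch above bounds the wait by counting iterations (100 * 100 polls with a 10 us delay each, i.e. roughly 100 ms) rather than by comparing against a clock that may not advance with interrupts disabled. As a userspace sketch of the same idea, with a fake register and a fake delay standing in for snd_hdac_chip_readb() and udelay(); all sketch_* names are hypothetical, and unlike the patched function this version reports whether the bit cleared:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t (*sketch_read_fn)(void *ctx);
typedef void (*sketch_delay_fn)(unsigned int usecs);

/*
 * Poll until (reg & mask) == 0 or the retry budget is used up. The bound is
 * a plain counter, so it holds even if no timer tick is ever delivered.
 * Returns true if the bit cleared, false on timeout.
 */
static bool sketch_wait_bit_clear(sketch_read_fn read_reg, void *ctx,
				  uint8_t mask, unsigned long tries,
				  unsigned int step_us, sketch_delay_fn delay)
{
	while (read_reg(ctx) & mask) {
		if (tries-- == 0)
			return false;
		delay(step_us);
	}
	return true;
}

/* Fake hardware: reports the busy bit for busy_reads reads, or forever if stuck. */
struct sketch_fake_reg {
	bool stuck;
	unsigned long busy_reads;
};

static uint8_t sketch_fake_read(void *ctx)
{
	struct sketch_fake_reg *reg = ctx;

	if (reg->stuck)
		return 0x02;
	if (reg->busy_reads) {
		reg->busy_reads--;
		return 0x02;
	}
	return 0x00;
}

/* Fake delay: only accumulates the time we would have spent busy-waiting. */
static unsigned long sketch_delay_total_us;

static void sketch_fake_delay(unsigned int usecs)
{
	sketch_delay_total_us += usecs;
}
```

With a budget of 100 * 100 tries and a 10 us step, a permanently stuck fake register makes the helper give up after 100000 us of accumulated delay, matching the "100ms" comment in the patch.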
Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
On Thu, 04 May 2017 12:18:29 +0200, Chris Wilson wrote: > > hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not > wait forever for stuck hardware. However, it is called from an > irq-disabled context which prevents jiffie from advancing and so the > loop doesn't terminate if the hardware fails. This can then cause NMI > watchdog warnings, such as: > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul > snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e > snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp mei_me prime_numbers > pps_core mei lpc_ich i2c_hid i2c_designware_platform i2c_designware_core > [last unloaded: i915] > irq event stamp: 13366 > hardirqs last enabled at (13365): [] > _raw_spin_unlock_irq+0x27/0x50 > hardirqs last disabled at (13366): [] > _raw_spin_lock_irq+0x12/0x50 > softirqs last enabled at (12744): [] > __do_softirq+0x1d9/0x4c0 > softirqs last disabled at (12721): [] irq_exit+0xa9/0xc0 > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U > 4.11.0-rc4-CI-CI_DRM_319+ #1 > Hardware name: /NUC5i5RYB, BIOS > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > Workqueue: events_unbound async_run_entry_fn > task: 88024cd32740 task.stack: c9000162c000 > RIP: 0010:preempt_count_add+0xe/0xc0 > RSP: 0018:c9000162fbd8 EFLAGS: 0082 > RAX: 8001 RBX: 000704b96558 RCX: 0002 > RDX: RSI: 81c74f2d RDI: 0001 > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071 > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa > R13: 3ea0 R14: 0003 R15: a00061f0 > FS: () GS:880256d8() > knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0 > Call Trace: > ? delay_tsc+0x3d/0xc0 > __delay+0xa/0x10 > __const_udelay+0x31/0x40 > snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] > ? 
azx_dev_disconnect+0x20/0x20 [snd_hda_intel] > snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] > azx_stop_chip+0x9/0x10 [snd_hda_codec] > azx_suspend+0x72/0x220 [snd_hda_intel] > pci_pm_suspend+0x71/0x140 > dpm_run_callback+0x6f/0x330 > ? pci_pm_freeze+0xe0/0xe0 > __device_suspend+0xf9/0x370 > ? dpm_watchdog_set+0x60/0x60 > async_suspend+0x1a/0x90 > async_run_entry_fn+0x34/0x160 > process_one_work+0x1f4/0x6d0 > ? process_one_work+0x16e/0x6d0 > worker_thread+0x49/0x4a0 > kthread+0x107/0x140 > ? process_one_work+0x6d0/0x6d0 > ? kthread_create_on_node+0x40/0x40 > ret_from_fork+0x2e/0x40 > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 > Signed-off-by: Chris Wilson > Cc: Jeeja KP > Cc: Vinod Koul > Cc: Takashi Iwai > Cc: # v4.7+ Any reason to submit a different fix from what's attached in the bugzilla you mentioned? thanks, Takashi > --- > sound/hda/hdac_controller.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c > index ee08c389b4d6..7f8806b03982 100644 > --- a/sound/hda/hdac_controller.c > +++ b/sound/hda/hdac_controller.c > @@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus) > { > unsigned long timeout; > > - timeout = jiffies + msecs_to_jiffies(100); > + timeout = 100 * 100; /* 100ms */ > while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN) > - && time_before(jiffies, timeout)) > +&& timeout--) > udelay(10); > > - timeout = jiffies + msecs_to_jiffies(100); > + timeout = 100 * 100; /* 100ms */ > while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN) > - && time_before(jiffies, timeout)) > +&& timeout--) > udelay(10); > } > > -- > 2.11.0 > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote: > On Thu, 04 May 2017 12:18:29 +0200, > Chris Wilson wrote: > > > > hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not > > wait forever for stuck hardware. However, it is called from an > > irq-disabled context which prevents jiffie from advancing and so the > > loop doesn't terminate if the hardware fails. This can then cause NMI > > watchdog warnings, such as: > > > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 > > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi > > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul > > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic > > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp > > mei_me prime_numbers pps_core mei lpc_ich i2c_hid i2c_designware_platform > > i2c_designware_core [last unloaded: i915] > > irq event stamp: 13366 > > hardirqs last enabled at (13365): [] > > _raw_spin_unlock_irq+0x27/0x50 > > hardirqs last disabled at (13366): [] > > _raw_spin_lock_irq+0x12/0x50 > > softirqs last enabled at (12744): [] > > __do_softirq+0x1d9/0x4c0 > > softirqs last disabled at (12721): [] > > irq_exit+0xa9/0xc0 > > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U > > 4.11.0-rc4-CI-CI_DRM_319+ #1 > > Hardware name: /NUC5i5RYB, BIOS > > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > > Workqueue: events_unbound async_run_entry_fn > > task: 88024cd32740 task.stack: c9000162c000 > > RIP: 0010:preempt_count_add+0xe/0xc0 > > RSP: 0018:c9000162fbd8 EFLAGS: 0082 > > RAX: 8001 RBX: 000704b96558 RCX: 0002 > > RDX: RSI: 81c74f2d RDI: 0001 > > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071 > > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa > > R13: 3ea0 R14: 0003 R15: a00061f0 > > FS: () GS:880256d8() > > knlGS: > > CS: 0010 DS: ES: CR0: 80050033 > > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0 > > Call Trace: > > ? 
delay_tsc+0x3d/0xc0 > > __delay+0xa/0x10 > > __const_udelay+0x31/0x40 > > snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] > > ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] > > snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] > > azx_stop_chip+0x9/0x10 [snd_hda_codec] > > azx_suspend+0x72/0x220 [snd_hda_intel] > > pci_pm_suspend+0x71/0x140 > > dpm_run_callback+0x6f/0x330 > > ? pci_pm_freeze+0xe0/0xe0 > > __device_suspend+0xf9/0x370 > > ? dpm_watchdog_set+0x60/0x60 > > async_suspend+0x1a/0x90 > > async_run_entry_fn+0x34/0x160 > > process_one_work+0x1f4/0x6d0 > > ? process_one_work+0x16e/0x6d0 > > worker_thread+0x49/0x4a0 > > kthread+0x107/0x140 > > ? process_one_work+0x6d0/0x6d0 > > ? kthread_create_on_node+0x40/0x40 > > ret_from_fork+0x2e/0x40 > > > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 > > Signed-off-by: Chris Wilson > > Cc: Jeeja KP > > Cc: Vinod Koul > > Cc: Takashi Iwai > > Cc: # v4.7+ > > Any reason to submit a different fix from what's attached in the > bugzilla you mentioned? probably a race between then :) Jeeja talked to me earlier today and uploaded the patch where we drop the locks and still use jiffies. Takashi, Do you prefer dropping locks or using loop? 
> > --- > > sound/hda/hdac_controller.c | 8 > > 1 file changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/sound/hda/hdac_controller.c b/sound/hda/hdac_controller.c > > index ee08c389b4d6..7f8806b03982 100644 > > --- a/sound/hda/hdac_controller.c > > +++ b/sound/hda/hdac_controller.c > > @@ -85,14 +85,14 @@ static void hdac_wait_for_cmd_dmas(struct hdac_bus *bus) > > { > > unsigned long timeout; > > > > - timeout = jiffies + msecs_to_jiffies(100); > > + timeout = 100 * 100; /* 100ms */ > > while ((snd_hdac_chip_readb(bus, RIRBCTL) & AZX_RBCTL_DMA_EN) > > - && time_before(jiffies, timeout)) > > + && timeout--) > > udelay(10); > > > > - timeout = jiffies + msecs_to_jiffies(100); > > + timeout = 100 * 100; /* 100ms */ > > while ((snd_hdac_chip_readb(bus, CORBCTL) & AZX_CORBCTL_RUN) > > - && time_before(jiffies, timeout)) > > + && timeout--) > > udelay(10); > > } > > > > -- > > 2.11.0 > > -- ~Vinod ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
On Thu, 04 May 2017 12:30:32 +0200, Vinod Koul wrote: > > On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote: > > On Thu, 04 May 2017 12:18:29 +0200, > > Chris Wilson wrote: > > > > > > hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not > > > wait forever for stuck hardware. However, it is called from an > > > irq-disabled context which prevents jiffie from advancing and so the > > > loop doesn't terminate if the hardware fails. This can then cause NMI > > > watchdog warnings, such as: > > > > > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 > > > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi > > > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul > > > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic > > > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm > > > ptp mei_me prime_numbers pps_core mei lpc_ich i2c_hid > > > i2c_designware_platform i2c_designware_core [last unloaded: i915] > > > irq event stamp: 13366 > > > hardirqs last enabled at (13365): [] > > > _raw_spin_unlock_irq+0x27/0x50 > > > hardirqs last disabled at (13366): [] > > > _raw_spin_lock_irq+0x12/0x50 > > > softirqs last enabled at (12744): [] > > > __do_softirq+0x1d9/0x4c0 > > > softirqs last disabled at (12721): [] > > > irq_exit+0xa9/0xc0 > > > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U > > > 4.11.0-rc4-CI-CI_DRM_319+ #1 > > > Hardware name: /NUC5i5RYB, BIOS > > > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > > > Workqueue: events_unbound async_run_entry_fn > > > task: 88024cd32740 task.stack: c9000162c000 > > > RIP: 0010:preempt_count_add+0xe/0xc0 > > > RSP: 0018:c9000162fbd8 EFLAGS: 0082 > > > RAX: 8001 RBX: 000704b96558 RCX: 0002 > > > RDX: RSI: 81c74f2d RDI: 0001 > > > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071 > > > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa > > > R13: 3ea0 R14: 0003 R15: a00061f0 > > > FS: () GS:880256d8() > > > knlGS: > > > CS: 0010 DS: ES: CR0: 80050033 
> > > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0 > > > Call Trace: > > > ? delay_tsc+0x3d/0xc0 > > > __delay+0xa/0x10 > > > __const_udelay+0x31/0x40 > > > snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] > > > ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] > > > snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] > > > azx_stop_chip+0x9/0x10 [snd_hda_codec] > > > azx_suspend+0x72/0x220 [snd_hda_intel] > > > pci_pm_suspend+0x71/0x140 > > > dpm_run_callback+0x6f/0x330 > > > ? pci_pm_freeze+0xe0/0xe0 > > > __device_suspend+0xf9/0x370 > > > ? dpm_watchdog_set+0x60/0x60 > > > async_suspend+0x1a/0x90 > > > async_run_entry_fn+0x34/0x160 > > > process_one_work+0x1f4/0x6d0 > > > ? process_one_work+0x16e/0x6d0 > > > worker_thread+0x49/0x4a0 > > > kthread+0x107/0x140 > > > ? process_one_work+0x6d0/0x6d0 > > > ? kthread_create_on_node+0x40/0x40 > > > ret_from_fork+0x2e/0x40 > > > > > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 > > > Signed-off-by: Chris Wilson > > > Cc: Jeeja KP > > > Cc: Vinod Koul > > > Cc: Takashi Iwai > > > Cc: # v4.7+ > > > > Any reason to submit a different fix from what's attached in the > > bugzilla you mentioned? > > probably a race between then :) > > Jeeja talked to me earlier today and uploaded the patch where we drop the > locks and still use jiffies. > > Takashi, > Do you prefer dropping locks or using loop? I prefer dropping the lock. thanks, Takashi ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
== Series Details ==

Series: ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
URL : https://patchwork.freedesktop.org/series/23948/
State : success

== Summary ==

Series 23948v1 ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
https://patchwork.freedesktop.org/api/1.0/series/23948/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0  dfail:0  fail:0  skip:11  time:424s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8  dfail:0  fail:0  skip:14  time:428s
fi-bsw-n3050     total:278  pass:242  dwarn:0  dfail:0  fail:0  skip:36  time:580s
fi-bxt-j4205     total:278  pass:259  dwarn:0  dfail:0  fail:0  skip:19  time:505s
fi-bxt-t5700     total:278  pass:258  dwarn:0  dfail:0  fail:0  skip:20  time:543s
fi-byt-j1900     total:278  pass:254  dwarn:0  dfail:0  fail:0  skip:24  time:485s
fi-byt-n2820     total:278  pass:250  dwarn:0  dfail:0  fail:0  skip:28  time:482s
fi-hsw-4770      total:278  pass:262  dwarn:0  dfail:0  fail:0  skip:16  time:410s
fi-hsw-4770r     total:278  pass:262  dwarn:0  dfail:0  fail:0  skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0  dfail:0  fail:0  skip:50  time:418s
fi-ivb-3520m     total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:488s
fi-ivb-3770      total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:472s
fi-kbl-7500u     total:278  pass:260  dwarn:0  dfail:0  fail:0  skip:18  time:460s
fi-kbl-7560u     total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:562s
fi-skl-6260u     total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:459s
fi-skl-6700hq    total:278  pass:261  dwarn:0  dfail:0  fail:0  skip:17  time:570s
fi-skl-6700k     total:278  pass:256  dwarn:4  dfail:0  fail:0  skip:18  time:463s
fi-skl-6770hq    total:278  pass:268  dwarn:0  dfail:0  fail:0  skip:10  time:483s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0  dfail:0  fail:0  skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0  dfail:0  fail:0  skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0  dfail:0  fail:0  skip:29  time:409s

ade10dd3713e82daa22a6cd3524510f65f1dd86e drm-tip: 2017y-05m-04d-08h-03m-03s UTC integration manifest
5ebe8bb ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4619/
Re: [Intel-gfx] [PATCH] ALSA: hda: Use loop counter for hdac_wait_for_cmd_dmas() timeout
On Thu, May 04, 2017 at 12:25:26PM +0200, Takashi Iwai wrote: > On Thu, 04 May 2017 12:18:29 +0200, > Chris Wilson wrote: > > > > hdac_wait_for_cmd_dmas() uses a jiffie timeout to ensure that we do not > > wait forever for stuck hardware. However, it is called from an > > irq-disabled context which prevents jiffie from advancing and so the > > loop doesn't terminate if the hardware fails. This can then cause NMI > > watchdog warnings, such as: > > > > NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 > > Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi > > x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul > > crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic > > ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core snd_pcm ptp > > mei_me prime_numbers pps_core mei lpc_ich i2c_hid i2c_designware_platform > > i2c_designware_core [last unloaded: i915] > > irq event stamp: 13366 > > hardirqs last enabled at (13365): [] > > _raw_spin_unlock_irq+0x27/0x50 > > hardirqs last disabled at (13366): [] > > _raw_spin_lock_irq+0x12/0x50 > > softirqs last enabled at (12744): [] > > __do_softirq+0x1d9/0x4c0 > > softirqs last disabled at (12721): [] > > irq_exit+0xa9/0xc0 > > CPU: 3 PID: 10443 Comm: kworker/u8:11 Tainted: G U > > 4.11.0-rc4-CI-CI_DRM_319+ #1 > > Hardware name: /NUC5i5RYB, BIOS > > RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 > > Workqueue: events_unbound async_run_entry_fn > > task: 88024cd32740 task.stack: c9000162c000 > > RIP: 0010:preempt_count_add+0xe/0xc0 > > RSP: 0018:c9000162fbd8 EFLAGS: 0082 > > RAX: 8001 RBX: 000704b96558 RCX: 0002 > > RDX: RSI: 81c74f2d RDI: 0001 > > RBP: c9000162fc08 R08: bbcc90cc R09: 23c7b071 > > R10: 827901a8 R11: 88024cd32740 R12: 000704b92baa > > R13: 3ea0 R14: 0003 R15: a00061f0 > > FS: () GS:880256d8() > > knlGS: > > CS: 0010 DS: ES: CR0: 80050033 > > CR2: 7f90f84a5144 CR3: 03e0f000 CR4: 003406e0 > > Call Trace: > > ? 
delay_tsc+0x3d/0xc0 > > __delay+0xa/0x10 > > __const_udelay+0x31/0x40 > > snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] > > ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] > > snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] > > azx_stop_chip+0x9/0x10 [snd_hda_codec] > > azx_suspend+0x72/0x220 [snd_hda_intel] > > pci_pm_suspend+0x71/0x140 > > dpm_run_callback+0x6f/0x330 > > ? pci_pm_freeze+0xe0/0xe0 > > __device_suspend+0xf9/0x370 > > ? dpm_watchdog_set+0x60/0x60 > > async_suspend+0x1a/0x90 > > async_run_entry_fn+0x34/0x160 > > process_one_work+0x1f4/0x6d0 > > ? process_one_work+0x16e/0x6d0 > > worker_thread+0x49/0x4a0 > > kthread+0x107/0x140 > > ? process_one_work+0x6d0/0x6d0 > > ? kthread_create_on_node+0x40/0x40 > > ret_from_fork+0x2e/0x40 > > > > Fixes: 38b19ed7f81e ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 > > Signed-off-by: Chris Wilson > > Cc: Jeeja KP > > Cc: Vinod Koul > > Cc: Takashi Iwai > > Cc: # v4.7+ > > Any reason to submit a different fix from what's attached in the > bugzilla you mentioned? Because I didn't see it when Marta complained on irc and suggested reverting 38b19ed7f81e. There's no advantage either way, but even after fixing the timeout detection we are still left with the issue that the hw is stuck and suffer a 200ms suspend delay. :| -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 4/4] drm/i915: Calculate vlv/chv intermediate watermarks correctly, v2.
On Thu, May 04, 2017 at 10:12:52AM +0200, Maarten Lankhorst wrote: > Op 03-05-17 om 20:03 schreef Ville Syrjälä: > > On Wed, May 03, 2017 at 06:18:46PM +0200, Maarten Lankhorst wrote: > >> Op 03-05-17 om 18:07 schreef Ville Syrjälä: > >>> On Wed, May 03, 2017 at 05:53:34PM +0200, Maarten Lankhorst wrote: > Op 03-05-17 om 16:11 schreef Ville Syrjälä: > > On Wed, May 03, 2017 at 04:06:37PM +0200, Maarten Lankhorst wrote: > >> Op 03-05-17 om 15:45 schreef Ville Syrjälä: > >>> On Mon, May 01, 2017 at 03:34:34PM +0200, Maarten Lankhorst wrote: > The watermarks it should calculate against are the old optimal > watermarks. > The currently active crtc watermarks are pure fiction, and are > invalid in > case of a nonblocking modeset, page flip enabling/disabling planes > or any > other reason. > > When the crtc is disabled or during a modeset the intermediate > watermarks > don't need to be programmed separately, and could be directly > assigned > to the optimal watermarks. > > Also rename crtc_state to new_crtc_state, to distinguish it from the > old state. > > Changes since v1: > - Use intel_atomic_get_old_crtc_state. 
(ville) > > Signed-off-by: Maarten Lankhorst > --- > drivers/gpu/drm/i915/intel_pm.c | 20 ++-- > 1 file changed, 14 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c > b/drivers/gpu/drm/i915/intel_pm.c > index 0f344b1fff45..a09396ee1f3d 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -1458,16 +1458,24 @@ static void vlv_atomic_update_fifo(struct > intel_atomic_state *state, > > static int vlv_compute_intermediate_wm(struct drm_device *dev, > struct intel_crtc *crtc, > - struct intel_crtc_state > *crtc_state) > + struct intel_crtc_state > *new_crtc_state) > { > -struct vlv_wm_state *intermediate = > &crtc_state->wm.vlv.intermediate; > -const struct vlv_wm_state *optimal = > &crtc_state->wm.vlv.optimal; > -const struct vlv_wm_state *active = &crtc->wm.active.vlv; > +struct vlv_wm_state *intermediate = > &new_crtc_state->wm.vlv.intermediate; > +const struct vlv_wm_state *optimal = > &new_crtc_state->wm.vlv.optimal; > +const struct intel_crtc_state *old_crtc_state = > + > intel_atomic_get_old_crtc_state(new_crtc_state->base.state, crtc); > +const struct vlv_wm_state *active = > &old_crtc_state->wm.vlv.optimal; > int level; > > +if (!new_crtc_state->base.active || > drm_atomic_crtc_needs_modeset(&new_crtc_state->base)) { > +*intermediate = *optimal; > + > +return 0; > +} > + > intermediate->num_levels = min(optimal->num_levels, > active->num_levels); > intermediate->cxsr = optimal->cxsr && active->cxsr && > -!crtc_state->disable_cxsr; > +!new_crtc_state->disable_cxsr; > >>> We need to consider disable_cxsr even in the modeset case. > >> Why is this? crtc_state->disable_cxsr is set if any plane is part of > >> the crtc during modeset, so it's disabled during modeset already. > > It's set if any plane is enabling/disabling, which should be quite > > typical during a modeset. > Yeah but .initial_watermarks is called during crtc_enable, so cxsr will > get enabled anyway. 
> >>> Which is not what we want. CxSR must stay off until the planes have been
> >>> enabled.
> >>>
> >> In that case why is it enabled in .initial_watermarks at all? It should be
> >> in optimize_watermarks then..
> > Because we can keep it enabled across the update unless planes are
> > getting enabled or disabled.
> >
> So for the modeset case, computing intermediate watermarks:
>
> *intermediate = *optimal;
> if (needs_modeset)
> 	intermediate->cxsr = false;
>
> if (optimal->cxsr && !intermediate->cxsr)
> 	new_crtc_state->wm.need_postvbl_update = true;
>
> ?

Or maybe

	if (blah) {
		*intermediate = *optimal;
		goto out;
	}

	// min/max stuff

out:
	if (disable_cxsr)
		intermediate->cxsr = false;
	if (memcmp(...

-- 
Ville Syrjälä
Intel OTC
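The two variants discussed above can be put into one small, self-contained C sketch. Everything here is an assumption made for illustration: `wm_state` and `crtc_update` are simplified, hypothetical structures rather than the i915 ones, and combining old and new values with `min` simply follows the patches in this thread. The sketch compares fields one by one instead of using `memcmp()`, only to avoid depending on struct padding in a standalone program.

```c
#include <stdbool.h>

/* Hypothetical, simplified watermark state (not the i915 structs). */
struct wm_state {
	unsigned short plane_wm;
	unsigned short sr_plane;
	bool cxsr;
};

/* Hypothetical summary of what the atomic update does to the crtc. */
struct crtc_update {
	bool active;			/* crtc is active in the new state */
	bool needs_modeset;		/* full modeset, no old state to honour */
	bool disable_cxsr;		/* planes are being enabled or disabled */
	bool need_postvbl_update;	/* output: reprogram after the vblank */
};

static unsigned short min_u16(unsigned short a, unsigned short b)
{
	return a < b ? a : b;
}

static bool wm_equal(const struct wm_state *a, const struct wm_state *b)
{
	return a->plane_wm == b->plane_wm &&
	       a->sr_plane == b->sr_plane &&
	       a->cxsr == b->cxsr;
}

static void compute_intermediate_wm(const struct wm_state *old_optimal,
				    const struct wm_state *optimal,
				    struct crtc_update *upd,
				    struct wm_state *intermediate)
{
	/* Start from the target values. */
	*intermediate = *optimal;

	/* Without a modeset, pick values that are safe for old and new state. */
	if (upd->active && !upd->needs_modeset) {
		intermediate->plane_wm = min_u16(old_optimal->plane_wm,
						 optimal->plane_wm);
		intermediate->sr_plane = min_u16(old_optimal->sr_plane,
						 optimal->sr_plane);
		intermediate->cxsr = old_optimal->cxsr && optimal->cxsr;
	}

	/* CxSR has to stay off while planes are changing, modeset or not. */
	if (upd->disable_cxsr)
		intermediate->cxsr = false;

	/* Post-vblank programming is only needed if the values differ. */
	upd->need_postvbl_update = upd->active &&
				   !wm_equal(intermediate, optimal);
}
```

Applying `disable_cxsr` after the early copy is the part Ville asks for: the modeset path then cannot re-enable CxSR before the planes are up, and the final comparison schedules the post-vblank update that turns it back on.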
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 10:09:57AM -, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Set all undefined MOCS entries to follow PTE
> URL : https://patchwork.freedesktop.org/series/23941/
> State : success
>
> == Summary ==
>
> Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
> https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/

Pushed, thanks for the kick and the review.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
On Thu, May 04, 2017 at 09:53:35AM -, Patchwork wrote:
> == Series Details ==
>
> Series: drm/i915: Use engine->context_pin() to report the intel_ring (rev2)
> URL : https://patchwork.freedesktop.org/series/23884/
> State : success
>
> == Summary ==
>
> Series 23884v2 drm/i915: Use engine->context_pin() to report the intel_ring
> https://patchwork.freedesktop.org/api/1.0/series/23884/revisions/2/mbox/

Contrary to earlier reports, this is the patch I just pushed (not mocs)!
Thanks for the review and prompting me to fix up the request->ring
assignment.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 11:59:53AM +0100, Chris Wilson wrote:
> On Thu, May 04, 2017 at 10:09:57AM -, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915: Set all undefined MOCS entries to follow PTE
> > URL : https://patchwork.freedesktop.org/series/23941/
> > State : success
> >
> > == Summary ==
> >
> > Series 23941v1 drm/i915: Set all undefined MOCS entries to follow PTE
> > https://patchwork.freedesktop.org/api/1.0/series/23941/revisions/1/mbox/
>
> Pushed, thanks for the kick and the review.

Actually, no I didn't. That reply was intended for a different series,
sorry for the scare/noise.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [RFC 2/7] drm/i915: Program gen3- watermarks atomically
With the atomic watermark calculations calculate intermediary watermark values and update the watermarks atomically. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/i915_drv.h | 5 ++ drivers/gpu/drm/i915/intel_drv.h | 2 +- drivers/gpu/drm/i915/intel_pm.c | 103 +-- 3 files changed, 95 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 91b945cd39f9..7af4f908b2cd 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1793,6 +1793,10 @@ struct g4x_wm_values { bool fbc_en; }; +struct i9xx_wm_values { + bool cxsr; +}; + struct skl_ddb_entry { uint16_t start, end;/* in number of blocks, 'end' is exclusive */ }; @@ -2422,6 +2426,7 @@ struct drm_i915_private { struct skl_wm_values skl_hw; struct vlv_wm_values vlv; struct g4x_wm_values g4x; + struct i9xx_wm_values i9xx; }; uint8_t max_level; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index d9e49f2b3c22..73e74fc7383c 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -600,7 +600,7 @@ struct intel_crtc_wm_state { struct g4x_wm_state optimal; } g4x; struct { - struct i9xx_wm_state optimal; + struct i9xx_wm_state optimal, intermediate; } i9xx; }; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 0c933cfad02c..c39f63aff4a5 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -433,6 +433,8 @@ bool intel_set_memory_cxsr(struct drm_i915_private *dev_priv, bool enable) dev_priv->wm.vlv.cxsr = enable; else if (IS_G4X(dev_priv)) dev_priv->wm.g4x.cxsr = enable; + else if (INTEL_GEN(dev_priv) <= 4) + dev_priv->wm.i9xx.cxsr = enable; mutex_unlock(&dev_priv->wm.wm_mutex); return ret; @@ -2317,6 +2319,44 @@ static int i9xx_compute_pipe_wm(struct intel_crtc_state *crtc_state) return 0; } +static int i9xx_compute_intermediate_wm(struct drm_device *dev, + struct intel_crtc *intel_crtc, + struct 
intel_crtc_state *newstate) +{ + struct i9xx_wm_state *intermediate = &newstate->wm.i9xx.intermediate; + const struct drm_crtc_state *old_drm_state = + drm_atomic_get_old_crtc_state(newstate->base.state, &intel_crtc->base); + const struct i9xx_wm_state *old = &to_intel_crtc_state(old_drm_state)->wm.i9xx.optimal; + const struct i9xx_wm_state *optimal = &newstate->wm.i9xx.optimal; + + /* +* Start with the final, target watermarks, then combine with the +* currently active watermarks to get values that are safe both before +* and after the vblank. +*/ + *intermediate = *optimal; + if (newstate->disable_cxsr) + intermediate->cxsr = false; + + if (!newstate->base.active || + drm_atomic_crtc_needs_modeset(&newstate->base)) + goto out; + + intermediate->plane_wm = min(old->plane_wm, optimal->plane_wm); + intermediate->sr.plane = min(old->sr.plane, optimal->sr.plane); + +out: + /* +* If our intermediate WM are identical to the final WM, then we can +* omit the post-vblank programming; only update if it's different. 
+*/ + if (newstate->base.active && + memcmp(intermediate, optimal, sizeof(*intermediate)) != 0) + newstate->wm.need_postvbl_update = true; + + return 0; +} + void i9xx_wm_get_hw_state(struct drm_device *dev) { struct drm_i915_private *dev_priv = to_i915(dev); @@ -2345,17 +2385,15 @@ void i9xx_wm_get_hw_state(struct drm_device *dev) } } -static void i9xx_update_wm(struct intel_crtc *crtc) +static void i9xx_program_watermarks(struct drm_i915_private *dev_priv) { - struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct intel_crtc *crtc; uint32_t fwater_lo; uint32_t fwater_hi; int cwm, srwm = -1; int planea_wm, planeb_wm; struct intel_crtc *enabled = NULL; - crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal; - crtc = intel_get_crtc_for_plane(dev_priv, 0); planea_wm = crtc->wm.active.i9xx.plane_wm; if (intel_crtc_active(crtc)) @@ -2381,7 +2419,7 @@ static void i9xx_update_wm(struct intel_crtc *crtc) cwm = 2; /* Play safe and disable self-refresh before adjusting watermarks. */ - intel_set_memory_cxsr(dev_priv, false); + _intel_set_memory_cxsr(dev_priv, false); /* Calc sr entries for one plane
[Intel-gfx] [RFC 0/7] drm/i915: Convert gen4- watermarks to atomic.
I've only compile-tested this, and the series depends on Ville's gen4x
watermark conversion, so CI will fail to apply it.

Maarten Lankhorst (7):
  drm/i915: Calculate gen3- watermarks semi-atomically.
  drm/i915: Program gen3- watermarks atomically
  drm/i915: Convert pineview watermarks to atomic
  drm/i915: Calculate gen4 watermarks semiatomically.
  drm/i915: Program gen4 watermarks atomically
  drm/i915: Kill off intel_crtc_active.
  drm/i915: Rip out legacy watermark infrastructure

 drivers/gpu/drm/i915/i915_drv.h      |   6 +-
 drivers/gpu/drm/i915/intel_atomic.c  |   2 -
 drivers/gpu/drm/i915/intel_display.c |  97 +-
 drivers/gpu/drm/i915/intel_drv.h     |  18 +-
 drivers/gpu/drm/i915/intel_fbc.c     |   2 +-
 drivers/gpu/drm/i915/intel_pm.c      | 635 ++-
 6 files changed, 433 insertions(+), 327 deletions(-)

-- 
2.9.3
[Intel-gfx] [RFC 1/7] drm/i915: Calculate gen3- watermarks semi-atomically.
The gen3 watermark calculations are converted to atomic, but the wm update calls are still done through the legacy functions. This will make it easier to bisect things if they go wrong. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_display.c | 3 +- drivers/gpu/drm/i915/intel_drv.h | 14 +++ drivers/gpu/drm/i915/intel_pm.c | 231 +-- 3 files changed, 152 insertions(+), 96 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 4991ef2ac77d..c7d295a0895d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -15518,7 +15518,8 @@ intel_modeset_setup_hw_state(struct drm_device *dev) skl_wm_get_hw_state(dev); } else if (HAS_PCH_SPLIT(dev_priv)) { ilk_wm_get_hw_state(dev); - } + } else if (INTEL_GEN(dev_priv) <= 3 && !IS_PINEVIEW(dev_priv)) + i9xx_wm_get_hw_state(dev); for_each_intel_crtc(dev, crtc) { u64 put_domains; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index ae9173707959..d9e49f2b3c22 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -546,6 +546,15 @@ struct g4x_wm_state { bool fbc_en; }; +struct i9xx_wm_state { + uint16_t plane_wm; + bool cxsr; + + struct { + uint16_t plane; + } sr; +}; + struct intel_crtc_wm_state { union { struct { @@ -590,6 +599,9 @@ struct intel_crtc_wm_state { /* optimal watermarks */ struct g4x_wm_state optimal; } g4x; + struct { + struct i9xx_wm_state optimal; + } i9xx; }; /* @@ -828,6 +840,7 @@ struct intel_crtc { struct intel_pipe_wm ilk; struct vlv_wm_state vlv; struct g4x_wm_state g4x; + struct i9xx_wm_state i9xx; } active; } wm; @@ -1868,6 +1881,7 @@ void gen6_rps_boost(struct drm_i915_private *dev_priv, unsigned long submitted); void intel_queue_rps_boost_for_request(struct drm_i915_gem_request *req); void g4x_wm_get_hw_state(struct drm_device *dev); +void i9xx_wm_get_hw_state(struct drm_device *dev); void vlv_wm_get_hw_state(struct drm_device *dev); 
void ilk_wm_get_hw_state(struct drm_device *dev); void skl_wm_get_hw_state(struct drm_device *dev); diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index d2cec3249e87..0c933cfad02c 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2226,89 +2226,154 @@ static void i965_update_wm(struct intel_crtc *unused_crtc) #undef FW_WM -static void i9xx_update_wm(struct intel_crtc *unused_crtc) +static const struct intel_watermark_params *i9xx_get_wm_info(struct drm_i915_private *dev_priv, +struct intel_crtc *crtc) { - struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev); - const struct intel_watermark_params *wm_info; - uint32_t fwater_lo; - uint32_t fwater_hi; - int cwm, srwm = 1; - int fifo_size; - int planea_wm, planeb_wm; - struct intel_crtc *crtc, *enabled = NULL; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); if (IS_I945GM(dev_priv)) - wm_info = &i945_wm_info; + return &i945_wm_info; else if (!IS_GEN2(dev_priv)) - wm_info = &i915_wm_info; + return &i915_wm_info; + else if (plane->plane == PLANE_A) + return &i830_a_wm_info; else - wm_info = &i830_a_wm_info; + return &i830_bc_wm_info; +} - fifo_size = dev_priv->display.get_fifo_size(dev_priv, 0); - crtc = intel_get_crtc_for_plane(dev_priv, 0); - if (intel_crtc_active(crtc)) { +static int i9xx_compute_pipe_wm(struct intel_crtc_state *crtc_state) +{ + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct intel_atomic_state *state = + to_intel_atomic_state(crtc_state->base.state); + struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); + const struct drm_plane_state *plane_state = NULL; + int fifo_size; + const struct intel_watermark_params *wm_info; + + fifo_size = dev_priv->display.get_fifo_size(dev_priv, plane->plane); + + wm_info = i9xx_get_wm_info(dev_priv, crtc); + + 
wm_state->cxsr = false; + memset(&wm_state->sr, 0, sizeof(wm_state->sr)); + + if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->
[Intel-gfx] [RFC 3/7] drm/i915: Convert pineview watermarks to atomic
Pineview seems to have different watermarks from the other platforms and are calculated separately. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_drv.h | 3 +- drivers/gpu/drm/i915/intel_pm.c | 134 ++- 2 files changed, 92 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 73e74fc7383c..62f690c7691e 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -552,7 +552,8 @@ struct i9xx_wm_state { struct { uint16_t plane; - } sr; + uint16_t cursor; + } sr, hpll; }; struct intel_crtc_wm_state { diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index c39f63aff4a5..eb1bb8b3f9a6 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -824,13 +824,17 @@ static struct intel_crtc *single_enabled_crtc(struct drm_i915_private *dev_priv) return enabled; } -static void pineview_update_wm(struct intel_crtc *unused_crtc) +static int pnv_compute_pipe_wm(struct intel_crtc_state *crtc_state) { - struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev); - struct intel_crtc *crtc; + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); + struct intel_atomic_state *state = to_intel_atomic_state(crtc_state->base.state); + const struct drm_plane_state *primary_plane_state = NULL; const struct cxsr_latency *latency; - u32 reg; - unsigned int wm; + + memset(wm_state, 0, sizeof(*wm_state)); latency = intel_get_cxsr_latency(IS_PINEVIEW_G(dev_priv), dev_priv->is_ddr3, @@ -838,60 +842,90 @@ static void pineview_update_wm(struct intel_crtc *unused_crtc) dev_priv->mem_freq); if (!latency) { DRM_DEBUG_KMS("Unknown FSB/MEM found, disable CxSR\n"); - intel_set_memory_cxsr(dev_priv, false); - return; + + return 0; } - 
crtc = single_enabled_crtc(dev_priv); - if (crtc) { - const struct drm_display_mode *adjusted_mode = - &crtc->config->base.adjusted_mode; + if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->base))) + primary_plane_state = __drm_atomic_get_current_plane_state(&state->base, &plane->base); + + if (primary_plane_state) { const struct drm_framebuffer *fb = - crtc->base.primary->state->fb; + primary_plane_state->fb; int cpp = fb->format->cpp[0]; - int clock = adjusted_mode->crtc_clock; + const struct drm_display_mode *adjusted_mode = + &crtc_state->base.adjusted_mode; + unsigned active_crtcs; + + if (state->modeset) + active_crtcs = state->active_crtcs; + else + active_crtcs = dev_priv->active_crtcs; + + wm_state->cxsr = active_crtcs == drm_crtc_mask(&crtc->base); + + wm_state->sr.plane = intel_calculate_wm(adjusted_mode->crtc_clock, + &pineview_display_wm, + pineview_display_wm.fifo_size, + cpp, latency->display_sr); + + wm_state->sr.cursor = intel_calculate_wm(adjusted_mode->crtc_clock, +&pineview_cursor_wm, + pineview_display_wm.fifo_size, +4, latency->cursor_sr); + + wm_state->hpll.plane = intel_calculate_wm(adjusted_mode->crtc_clock, + &pineview_display_hplloff_wm, + pineview_display_hplloff_wm.fifo_size, +cpp, latency->display_hpll_disable); + + wm_state->hpll.cursor = intel_calculate_wm(adjusted_mode->crtc_clock, + &pineview_cursor_hplloff_wm, + pineview_display_hplloff_wm.fifo_size, + 4, latency->cursor_hpll_disable); + + DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane %d, cursor SR size: %d\n",
[Intel-gfx] [RFC 4/7] drm/i915: Calculate gen4 watermarks semiatomically.
Gen4 watermark is handled same as gen3-. Calculate the optimal watermarks atomically first, and program it in the legacy helper. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_pm.c | 136 1 file changed, 95 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index eb1bb8b3f9a6..c5bdef6281f3 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2189,58 +2189,109 @@ static void vlv_optimize_watermarks(struct intel_atomic_state *state, mutex_unlock(&dev_priv->wm.wm_mutex); } -static void i965_update_wm(struct intel_crtc *unused_crtc) +static int i965_compute_pipe_wm(struct intel_crtc_state *crtc_state) { - struct drm_i915_private *dev_priv = to_i915(unused_crtc->base.dev); - struct intel_crtc *crtc; + struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc); + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct intel_atomic_state *state = + to_intel_atomic_state(crtc_state->base.state); + struct i9xx_wm_state *wm_state = &crtc_state->wm.i9xx.optimal; + struct intel_plane *plane = to_intel_plane(crtc->base.primary); + const struct drm_plane_state *primary_plane_state = NULL; + const struct drm_plane_state *cursor_plane_state = NULL; + + memset(wm_state, 0, sizeof(*wm_state)); + + if (crtc_state->base.plane_mask & BIT(drm_plane_index(&plane->base))) + primary_plane_state = __drm_atomic_get_current_plane_state(&state->base, &plane->base); + + if (crtc_state->base.plane_mask & BIT(drm_plane_index(crtc->base.cursor))) + cursor_plane_state = __drm_atomic_get_current_plane_state(&state->base, crtc->base.cursor); + + if (primary_plane_state) { + static const int sr_latency_ns = 12000; + const struct drm_display_mode *adjusted_mode = + &crtc_state->base.adjusted_mode; + unsigned active_crtcs; + unsigned long entries; + bool may_cxsr; + + if (state->modeset) + active_crtcs = state->active_crtcs; + else + active_crtcs = dev_priv->active_crtcs; + + 
may_cxsr = active_crtcs == drm_crtc_mask(&crtc->base); + + if (may_cxsr && intel_wm_plane_visible(crtc_state, to_intel_plane_state(primary_plane_state))) { + struct drm_framebuffer *fb = primary_plane_state->fb; + unsigned cpp = fb->format->cpp[0]; + + entries = intel_wm_method2(adjusted_mode->crtc_clock, + adjusted_mode->crtc_htotal, + crtc_state->pipe_src_w, cpp, + sr_latency_ns / 100); + entries = DIV_ROUND_UP(entries, I915_FIFO_LINE_SIZE); + if (entries < I965_FIFO_SIZE) + wm_state->sr.plane = I965_FIFO_SIZE - entries; + else + may_cxsr = false; + + DRM_DEBUG_KMS("self-refresh entries: %ld\n", entries); + } + + /* No need to use intel_wm_plane_visible here, since cursor. */ + if (may_cxsr && cursor_plane_state && crtc_state->base.active) { + entries = intel_wm_method2(adjusted_mode->crtc_clock, + adjusted_mode->crtc_htotal, + cursor_plane_state->crtc_w, 4, + sr_latency_ns / 100); + + entries = DIV_ROUND_UP(entries, + i965_cursor_wm_info.cacheline_size) + + i965_cursor_wm_info.guard_size; + + if (entries < i965_cursor_wm_info.fifo_size) + wm_state->sr.cursor = min(i965_cursor_wm_info.fifo_size - entries, + (unsigned long)(i965_cursor_wm_info.max_wm)); + else + may_cxsr = false; + } else if (may_cxsr) + wm_state->sr.cursor = 16; + + wm_state->cxsr = may_cxsr; + + DRM_DEBUG_KMS("FIFO watermarks - can cxsr: %s, display plane %d, cursor SR size: %d\n", + yesno(wm_state->cxsr), wm_state->sr.plane, wm_state->sr.cursor); + } + + return 0; +} + +static void i965_update_wm(struct intel_crtc *crtc) +{ + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); int srwm = 1; int cursor_sr = 16; - bool cxsr_enabled; + bool cxsr_enabl
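For readers following the self-refresh branch of i965_compute_pipe_wm() above, the core arithmetic can be sketched in isolation. This is a minimal sketch and not the driver code: the struct and function names are hypothetical, the byte count stands in for the result of intel_wm_method2(), and the line size and FIFO size are plain parameters rather than the kernel's I915_FIFO_LINE_SIZE and I965_FIFO_SIZE constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Same rounding helper the quoted patch uses from the kernel headers. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical result type: whether self-refresh stays possible, and the watermark. */
struct sr_wm_result {
	bool may_cxsr;
	unsigned long wm;
};

/*
 * Convert a byte requirement into FIFO entries and derive the watermark.
 * If the requirement does not fit strictly inside the FIFO, self-refresh
 * is ruled out, mirroring the "else may_cxsr = false" branch in the patch.
 */
static struct sr_wm_result sr_plane_wm(unsigned long bytes,
				       unsigned long line_size,
				       unsigned long fifo_size)
{
	struct sr_wm_result res = { false, 0 };
	unsigned long entries = DIV_ROUND_UP(bytes, line_size);

	if (entries < fifo_size) {
		res.may_cxsr = true;
		res.wm = fifo_size - entries;
	}

	return res;
}
```

With a 64-byte line and a 512-entry FIFO (illustrative values only), a 6400-byte requirement rounds to 100 entries and leaves a watermark of 412, while a requirement that fills the whole FIFO disables self-refresh.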
[Intel-gfx] [RFC 6/7] drm/i915: Kill off intel_crtc_active.
Use crtc->active directly instead. This is still not completely optimal and needs fixing, but it's about as good as using intel_crtc_active. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_display.c | 19 --- drivers/gpu/drm/i915/intel_drv.h | 1 - drivers/gpu/drm/i915/intel_fbc.c | 2 +- drivers/gpu/drm/i915/intel_pm.c | 6 +++--- 4 files changed, 4 insertions(+), 24 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index c7d295a0895d..8538c0246015 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -948,25 +948,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state *crtc_state, int target_clock, target_clock, refclk, NULL, best_clock); } -bool intel_crtc_active(struct intel_crtc *crtc) -{ - /* Be paranoid as we can arrive here with only partial -* state retrieved from the hardware during setup. -* -* We can ditch the adjusted_mode.crtc_clock check as soon -* as Haswell has gained clock readout/fastboot support. -* -* We can ditch the crtc->primary->fb check as soon as we can -* properly reconstruct framebuffers. -* -* FIXME: The intel_crtc->active here should be switched to -* crtc->state->active once we have proper CRTC states wired up -* for atomic. 
-*/ - return crtc->active && crtc->base.primary->state->fb && - crtc->config->base.adjusted_mode.crtc_clock; -} - enum transcoder intel_pipe_to_cpu_transcoder(struct drm_i915_private *dev_priv, enum pipe pipe) { diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 62f690c7691e..dbe33b7bcf67 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -1490,7 +1490,6 @@ bool bxt_find_best_dpll(struct intel_crtc_state *crtc_state, int target_clock, struct dpll *best_clock); int chv_calc_dpll_params(int refclk, struct dpll *pll_clock); -bool intel_crtc_active(struct intel_crtc *crtc); void hsw_enable_ips(struct intel_crtc *crtc); void hsw_disable_ips(struct intel_crtc *crtc); enum intel_display_power_domain intel_port_to_power_domain(enum port port); diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c index ded2add18b26..a93214d0388e 100644 --- a/drivers/gpu/drm/i915/intel_fbc.c +++ b/drivers/gpu/drm/i915/intel_fbc.c @@ -1282,7 +1282,7 @@ void intel_fbc_init_pipe_state(struct drm_i915_private *dev_priv) return; for_each_intel_crtc(&dev_priv->drm, crtc) - if (intel_crtc_active(crtc) && + if (crtc->base.state->active && crtc->base.primary->state->visible) dev_priv->fbc.visible_pipes_mask |= (1 << crtc->pipe); } diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 969eb11ed5cd..bf2127a3f730 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -814,7 +814,7 @@ static struct intel_crtc *single_enabled_crtc(struct drm_i915_private *dev_priv) struct intel_crtc *crtc, *enabled = NULL; for_each_intel_crtc(&dev_priv->drm, crtc) { - if (intel_crtc_active(crtc)) { + if (crtc->active) { if (enabled) return NULL; enabled = crtc; @@ -2486,11 +2486,11 @@ static void i9xx_program_watermarks(struct drm_i915_private *dev_priv) crtc = intel_get_crtc_for_plane(dev_priv, 0); planea_wm = crtc->wm.active.i9xx.plane_wm; - if 
(intel_crtc_active(crtc)) + if (crtc->active) enabled = crtc; crtc = intel_get_crtc_for_plane(dev_priv, 1); - if (intel_crtc_active(crtc)) { + if (crtc->active) { if (enabled == NULL) enabled = crtc; else -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [RFC 5/7] drm/i915: Program gen4 watermarks atomically
We're already calculating the watermarks correctly, now we have to program them too. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/intel_pm.c | 25 +++-- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index c5bdef6281f3..969eb11ed5cd 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2268,20 +2268,20 @@ static int i965_compute_pipe_wm(struct intel_crtc_state *crtc_state) return 0; } -static void i965_update_wm(struct intel_crtc *crtc) +static void i965_program_watermarks(struct drm_i915_private *dev_priv) { - struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); + struct intel_crtc *crtc; + struct i9xx_wm_state *wm_state = NULL; int srwm = 1; int cursor_sr = 16; bool cxsr_enabled = false; - crtc->wm.active.i9xx = crtc->config->wm.i9xx.optimal; - - /* Calc sr entries for one plane configs */ crtc = single_enabled_crtc(dev_priv); - if (crtc && crtc->wm.active.i9xx.cxsr) { - struct i9xx_wm_state *wm_state = &crtc->wm.active.i9xx; + if (crtc) + wm_state = &crtc->wm.active.i9xx; + /* Calc sr entries for one plane configs */ + if (wm_state && wm_state->cxsr) { srwm = wm_state->sr.plane; cursor_sr = wm_state->sr.cursor; @@ -2571,8 +2571,10 @@ static void i9xx_initial_watermarks(struct intel_atomic_state *state, pnv_program_watermarks(dev_priv); else if (INTEL_INFO(dev_priv)->num_pipes == 1) i845_program_watermarks(intel_crtc); - else + else if (INTEL_GEN(dev_priv) < 4) i9xx_program_watermarks(dev_priv); + else + i965_program_watermarks(dev_priv); mutex_unlock(&dev_priv->wm.wm_mutex); } @@ -2591,8 +2593,10 @@ static void i9xx_optimize_watermarks(struct intel_atomic_state *state, pnv_program_watermarks(dev_priv); else if (INTEL_INFO(dev_priv)->num_pipes == 1) i845_program_watermarks(intel_crtc); - else + else if (INTEL_GEN(dev_priv) < 4) i9xx_program_watermarks(dev_priv); + else + i965_program_watermarks(dev_priv); 
 mutex_unlock(&dev_priv->wm.wm_mutex); } @@ -8911,7 +8915,8 @@ void intel_init_pm(struct drm_i915_private *dev_priv) } } else if (IS_GEN4(dev_priv)) { dev_priv->display.compute_pipe_wm = i965_compute_pipe_wm; - dev_priv->display.update_wm = i965_update_wm; + dev_priv->display.initial_watermarks = i9xx_initial_watermarks; + dev_priv->display.optimize_watermarks = i9xx_optimize_watermarks; } else if (IS_GEN3(dev_priv)) { dev_priv->display.compute_pipe_wm = i9xx_compute_pipe_wm; dev_priv->display.compute_intermediate_wm = i9xx_compute_intermediate_wm; -- 2.9.3
[Intel-gfx] [RFC 7/7] drm/i915: Rip out legacy watermark infrastructure
The legacy watermark infrastructure is now unused, so remove it. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/i915_drv.h | 1 - drivers/gpu/drm/i915/intel_atomic.c | 2 - drivers/gpu/drm/i915/intel_display.c | 75 ++-- drivers/gpu/drm/i915/intel_drv.h | 2 - drivers/gpu/drm/i915/intel_pm.c | 42 5 files changed, 3 insertions(+), 119 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7af4f908b2cd..46b317c991f0 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -637,7 +637,6 @@ struct drm_i915_display_funcs { void (*optimize_watermarks)(struct intel_atomic_state *state, struct intel_crtc_state *cstate); int (*compute_global_watermarks)(struct drm_atomic_state *state); - void (*update_wm)(struct intel_crtc *crtc); int (*modeset_calc_cdclk)(struct drm_atomic_state *state); /* Returns the active state of the crtc, and if the crtc is active, * fills out the pipe-config with the hw state. */ diff --git a/drivers/gpu/drm/i915/intel_atomic.c b/drivers/gpu/drm/i915/intel_atomic.c index 87b1dd464eee..7a4acaa45edd 100644 --- a/drivers/gpu/drm/i915/intel_atomic.c +++ b/drivers/gpu/drm/i915/intel_atomic.c @@ -173,8 +173,6 @@ intel_crtc_duplicate_state(struct drm_crtc *crtc) crtc_state->update_pipe = false; crtc_state->disable_lp_wm = false; crtc_state->disable_cxsr = false; - crtc_state->update_wm_pre = false; - crtc_state->update_wm_post = false; crtc_state->fb_changed = false; crtc_state->fifo_changed = false; crtc_state->wm.need_postvbl_update = false; diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 8538c0246015..295e17d0f272 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -4958,9 +4958,6 @@ static void intel_post_plane_update(struct intel_crtc_state *old_crtc_state) intel_frontbuffer_flip(to_i915(crtc->base.dev), pipe_config->fb_bits); - if (pipe_config->update_wm_post && 
pipe_config->base.active) - intel_update_watermarks(crtc); - if (old_pri_state) { struct intel_plane_state *primary_state = to_intel_plane_state(primary->state); @@ -5050,8 +5047,6 @@ static void intel_pre_plane_update(struct intel_crtc_state *old_crtc_state, if (dev_priv->display.initial_watermarks != NULL) dev_priv->display.initial_watermarks(old_intel_state, pipe_config); - else if (pipe_config->update_wm_pre) - intel_update_watermarks(crtc); } static void intel_crtc_disable_planes(struct drm_crtc *crtc, unsigned plane_mask) @@ -5737,8 +5732,6 @@ static void i9xx_crtc_enable(struct intel_crtc_state *pipe_config, if (dev_priv->display.initial_watermarks != NULL) dev_priv->display.initial_watermarks(old_intel_state, intel_crtc->config); - else - intel_update_watermarks(intel_crtc); intel_enable_pipe(intel_crtc); assert_vblank_disabled(crtc); @@ -5802,9 +5795,6 @@ static void i9xx_crtc_disable(struct intel_crtc_state *old_crtc_state, if (!IS_GEN2(dev_priv)) intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, false); - - if (!dev_priv->display.initial_watermarks) - intel_update_watermarks(intel_crtc); } static void intel_crtc_disable_noatomic(struct drm_crtc *crtc) @@ -5863,7 +5853,6 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc) encoder->base.crtc = NULL; intel_fbc_disable(intel_crtc); - intel_update_watermarks(intel_crtc); intel_disable_shared_dpll(intel_crtc); domains = intel_crtc->enabled_power_domains; @@ -10738,40 +10727,6 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc, } -/** - * intel_wm_need_update - Check whether watermarks need updating - * @plane: drm plane - * @state: new plane state - * - * Check current plane state versus the new one to determine whether - * watermarks need to be recalculated. - * - * Returns true or false. 
- */ -static bool intel_wm_need_update(struct drm_plane *plane, -struct drm_plane_state *state) -{ - struct intel_plane_state *new = to_intel_plane_state(state); - struct intel_plane_state *cur = to_intel_plane_state(plane->state); - - /* Update watermarks on tiling or size changes. */ - if (new->base.visible != cur->base.visible) - return true; - - if (!cur->base.fb || !new->base.fb) - return false; - - if (cur->base.fb->modifier != new->base.fb->modifier || - cur->base.rotation
[Intel-gfx] [PATCH] drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
Replace the large comment about requiring the powerwell for
intel_uncore_arm_unclaimed_mmio_detection() by moving the arming of the
mmio error detection into the powerwell held for modesetting. Thereby
also accomplishing the goal of only arming the mmio detection after a
full modeset.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Cc: Daniel Vetter
Cc: Ville Syrjälä
---
 drivers/gpu/drm/i915/intel_display.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 85b9e2f521a0..14e12e46eda5 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12912,8 +12912,16 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_hw_done(state);
 
-	if (intel_state->modeset)
+	if (intel_state->modeset) {
+		/* As one of the primary mmio accessors, KMS has a high
+		 * likelihood of triggering bugs in unclaimed access. After we
+		 * finish modesetting, see if an error has been flagged, and if
+		 * so enable debugging for the next modeset - and hope we catch
+		 * the culprit.
+		 */
+		intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
 		intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
+	}
 
 	mutex_lock(&dev->struct_mutex);
 	drm_atomic_helper_cleanup_planes(dev, state);
@@ -12923,19 +12931,6 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_state_put(state);
 
-	/* As one of the primary mmio accessors, KMS has a high likelihood
-	 * of triggering bugs in unclaimed access. After we finish
-	 * modesetting, see if an error has been flagged, and if so
-	 * enable debugging for the next modeset - and hope we catch
-	 * the culprit.
-	 *
-	 * XXX note that we assume display power is on at this point.
-	 * This might hold true now but we need to add pm helper to check
-	 * unclaimed only when the hardware is on, as atomic commits
-	 * can happen also when the device is completely off.
-	 */
-	intel_uncore_arm_unclaimed_mmio_detection(dev_priv);
-
 	intel_atomic_helper_free_state(dev_priv);
 }
 
-- 
2.11.0
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Chris Wilson writes: > Typically, there is space available within the ring and if not we have > to wait (by definition a slow path). Rearrange the code to reduce the > number of branches and stack size for the hotpath, accomodating a slight > growth for the wait. > > v2: Fix the new assert that packets are not larger than the actual ring. > > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/intel_ringbuffer.c | 63 > + > 1 file changed, 33 insertions(+), 30 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > b/drivers/gpu/drm/i915/intel_ringbuffer.c > index c46e5439d379..53123c1cfcc5 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct > drm_i915_gem_request *request) > return 0; > } > > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int > bytes) > { > struct intel_ring *ring = req->ring; > struct drm_i915_gem_request *target; > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct drm_i915_gem_request > *req, int bytes) > u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) > { > struct intel_ring *ring = req->ring; > - int remain_actual = ring->size - ring->emit; > - int remain_usable = ring->effective_size - ring->emit; > - int bytes = num_dwords * sizeof(u32); > - int total_bytes, wait_bytes; > - bool need_wrap = false; > + const unsigned int remain_usable = ring->effective_size - ring->emit; > + const unsigned int bytes = num_dwords * sizeof(u32); > + unsigned int need_wrap = 0; > + unsigned int total_bytes; > u32 *cs; > > total_bytes = bytes + req->reserved_space; > + GEM_BUG_ON(total_bytes > ring->effective_size); > > - if (unlikely(bytes > remain_usable)) { > - /* > - * Not enough space for the basic request. So need to flush > - * out the remainder and then wait for base + reserved. 
> - */ > - wait_bytes = remain_actual + total_bytes; > - need_wrap = true; > - } else if (unlikely(total_bytes > remain_usable)) { > - /* > - * The base request will fit but the reserved space > - * falls off the end. So we don't need an immediate wrap > - * and only need to effectively wait for the reserved > - * size space from the start of ringbuffer. > - */ > - wait_bytes = remain_actual + req->reserved_space; > - } else { > - /* No wrapping required, just waiting. */ > - wait_bytes = total_bytes; > + if (unlikely(total_bytes > remain_usable)) { > + const int remain_actual = ring->size - ring->emit; > + > + if (bytes > remain_usable) { > + /* > + * Not enough space for the basic request. So need to > + * flush out the remainder and then wait for > + * base + reserved. > + */ > + total_bytes += remain_actual; > + need_wrap = remain_actual | 1;

Your remain_actual should never reach zero. So in here
forcing the lowest bit on, and later off, seems superfluous.

-Mika

> + } else { > + /* > + * The base request will fit but the reserved space > + * falls off the end. So we don't need an immediate > + * wrap and only need to effectively wait for the > + * reserved size from the start of ringbuffer. 
> + */ > + total_bytes = req->reserved_space + remain_actual; > + } > } > > - if (wait_bytes > ring->space) { > - int ret = wait_for_space(req, wait_bytes); > + if (unlikely(total_bytes > ring->space)) { > + int ret = wait_for_space(req, total_bytes); > if (unlikely(ret)) > return ERR_PTR(ret); > } > > if (unlikely(need_wrap)) { > - GEM_BUG_ON(remain_actual > ring->space); > - GEM_BUG_ON(ring->emit + remain_actual > ring->size); > + need_wrap &= ~1; > + GEM_BUG_ON(need_wrap > ring->space); > + GEM_BUG_ON(ring->emit + need_wrap > ring->size); > > /* Fill the tail with MI_NOOP */ > - memset(ring->vaddr + ring->emit, 0, remain_actual); > + memset(ring->vaddr + ring->emit, 0, need_wrap); > ring->emit = 0; > - ring->space -= remain_actual; > + ring->space -= need_wrap; > } > > GEM_BUG_ON(ring->emit > ring->size - bytes); > -- > 2.11.0 > > _
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
== Series Details ==

Series: drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
URL   : https://patchwork.freedesktop.org/series/23955/
State : success

== Summary ==

Series 23955v1 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS
https://patchwork.freedesktop.org/api/1.0/series/23955/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125
Test vgem_basic:
        Subgroup sysfs:
                incomplete -> PASS       (fi-snb-2600) fdo#100125

https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:426s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:548s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:479s
fi-elk-e7500     total:278  pass:221  dwarn:0   dfail:0   fail:0   skip:57  time:402s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:407s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:494s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:489s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:453s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:565s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:453s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:569s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:459s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:494s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:528s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:400s
fi-bsw-n3050 failed to collect. IGT log at Patchwork_4621/fi-bsw-n3050/igt.log

93dcb17f41bd2025c355f4e2aded42c0fc5a5c5d drm-tip: 2017y-05m-04d-10h-58m-24s UTC integration manifest
03c0c51 drm/i915: Move the unclaimed mmio detection into the powerwell for KMS

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4621/
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
>
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
>
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
>
> Cc: Rafael J. Wysocki
> Cc: Mika Westerberg
> Cc: Borislav Petkov
> Cc: Dan Williams
> Cc: Amir Goldstein
> Cc: Jarkko Sakkinen
> Cc: Jani Nikula
> Cc: Ben Skeggs
> Cc: Benjamin Tissoires
> Cc: Joerg Roedel
> Cc: Adrian Hunter
> Cc: Yisen Zhuang
> Cc: Bjorn Helgaas
> Cc: Zhang Rui
> Cc: Felipe Balbi
> Cc: Mathias Nyman
> Cc: Heikki Krogerus
> Cc: Liam Girdwood
> Cc: Mark Brown
> Signed-off-by: Andy Shevchenko

OK by me, FWIW:

Reviewed-by: Heikki Krogerus

Thanks,

-- 
heikki
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote: > Chris Wilson writes: > > > Typically, there is space available within the ring and if not we have > > to wait (by definition a slow path). Rearrange the code to reduce the > > number of branches and stack size for the hotpath, accomodating a slight > > growth for the wait. > > > > v2: Fix the new assert that packets are not larger than the actual ring. > > > > Signed-off-by: Chris Wilson > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 63 > > + > > 1 file changed, 33 insertions(+), 30 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index c46e5439d379..53123c1cfcc5 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct > > drm_i915_gem_request *request) > > return 0; > > } > > > > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) > > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int > > bytes) > > { > > struct intel_ring *ring = req->ring; > > struct drm_i915_gem_request *target; > > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct > > drm_i915_gem_request *req, int bytes) > > u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) > > { > > struct intel_ring *ring = req->ring; > > - int remain_actual = ring->size - ring->emit; > > - int remain_usable = ring->effective_size - ring->emit; > > - int bytes = num_dwords * sizeof(u32); > > - int total_bytes, wait_bytes; > > - bool need_wrap = false; > > + const unsigned int remain_usable = ring->effective_size - ring->emit; > > + const unsigned int bytes = num_dwords * sizeof(u32); > > + unsigned int need_wrap = 0; > > + unsigned int total_bytes; > > u32 *cs; > > > > total_bytes = bytes + req->reserved_space; > > + GEM_BUG_ON(total_bytes > ring->effective_size); > > > > - if (unlikely(bytes > 
remain_usable)) { > > - /* > > -* Not enough space for the basic request. So need to flush > > -* out the remainder and then wait for base + reserved. > > -*/ > > - wait_bytes = remain_actual + total_bytes; > > - need_wrap = true; > > - } else if (unlikely(total_bytes > remain_usable)) { > > - /* > > -* The base request will fit but the reserved space > > -* falls off the end. So we don't need an immediate wrap > > -* and only need to effectively wait for the reserved > > -* size space from the start of ringbuffer. > > -*/ > > - wait_bytes = remain_actual + req->reserved_space; > > - } else { > > - /* No wrapping required, just waiting. */ > > - wait_bytes = total_bytes; > > + if (unlikely(total_bytes > remain_usable)) { > > + const int remain_actual = ring->size - ring->emit; > > + > > + if (bytes > remain_usable) { > > + /* > > +* Not enough space for the basic request. So need to > > +* flush out the remainder and then wait for > > +* base + reserved. > > +*/ > > + total_bytes += remain_actual; > > + need_wrap = remain_actual | 1;
>
> Your remain_actual should never reach zero. So in here
> forcing the lowest bit on, and later off, seems superfluous.

Why can't we fill up to the last byte with commands? remain_actual is
just (size - tail) and we don't force a wrap until emit crosses the
boundary (and not before). We hit remain_actual == 0 in practice.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
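The point under discussion (why need_wrap carries "remain_actual | 1" rather than the bare length) can be illustrated outside the driver. The following is a minimal sketch under stated assumptions, not the i915 code: struct toy_ring and the helper names are hypothetical, and it assumes ring offsets are multiples of four bytes so the low bit of the tail length is always free to serve as a flag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct intel_ring; only the fields used here. */
struct toy_ring {
	unsigned char *vaddr;
	unsigned int size;	/* total bytes in the ring */
	unsigned int emit;	/* next write offset */
	unsigned int space;	/* bytes currently free */
};

/*
 * Pack "a wrap is needed" and the tail length into one word. Setting the
 * low bit keeps the value non-zero even when the tail length is zero.
 */
static unsigned int encode_wrap(unsigned int remain_actual)
{
	return remain_actual | 1;
}

/* Recover the tail length by masking the flag bit off again. */
static unsigned int decode_wrap(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}

/* Zero-fill the tail and restart at offset 0, as the quoted hunk does. */
static void toy_wrap(struct toy_ring *ring, unsigned int need_wrap)
{
	unsigned int tail;

	if (!need_wrap)
		return;

	tail = decode_wrap(need_wrap);
	assert(tail <= ring->space);
	assert(ring->emit + tail <= ring->size);

	memset(ring->vaddr + ring->emit, 0, tail);
	ring->emit = 0;
	ring->space -= tail;
}
```

With emit sitting exactly at the end of the ring, the tail length is zero; a plain "need_wrap = remain_actual" would then read as "no wrap" and emit would never be reset, which is the case Chris describes as occurring in practice.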
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote: > Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu: > > One of the steps for PLL (un)initialization is to (un)map > > the correspondent DDI that is actually using that PLL. > > > > So, let's do this step following the places already stablished > > and used so far, although spec put this as part of PLL > > initialization sequences. > > > > v2: Use proper prefix on bits names as suggested by Ander. > > v3: Add missed "~". Without that the logic was inverted > > so we were disabling interrupts. > > Credits-to: Clinton > > Credits-to: Art > > v4: Spec is getting updated to do DDI -> PLL mapping > > and clock on in 2 separated reg writes. (Paulo) > > Also update bits definitions to use space > > (1 << 1) instead of (1<<1). (Paulo) > > > > Cc: Paulo Zanoni > > Cc: Art Runyan > > Cc: Clint Taylor > > Cc: Ville Syrjälä > > Cc: Kahola, Mika > > Cc: Ander Conselvan De Oliveira > m> > > Signed-off-by: Rodrigo Vivi > > Reviewed-by: Kahola, Mika > > Signed-off-by: Rodrigo Vivi > > --- > > drivers/gpu/drm/i915/i915_reg.h | 9 + > > drivers/gpu/drm/i915/intel_ddi.c | 23 --- > > 2 files changed, 29 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > > b/drivers/gpu/drm/i915/i915_reg.h > > index 3cfc65f..dcb8e21 100644 > > --- a/drivers/gpu/drm/i915/i915_reg.h > > +++ b/drivers/gpu/drm/i915/i915_reg.h > > @@ -8150,6 +8150,15 @@ enum { > > #define DPLL_CFGCR1(id)_MMIO_PIPE((id) - SKL_DPLL1, > > _DPLL1_CFGCR1, _DPLL2_CFGCR1) > > #define DPLL_CFGCR2(id)_MMIO_PIPE((id) - SKL_DPLL1, > > _DPLL1_CFGCR2, _DPLL2_CFGCR2) > > > > +/* > > + * CNL Clocks > > + */ > > +#define DPCLKA_CFGCR0 _MMIO(0x6C200) > > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10)) > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << > > ((port)*2)) > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2) > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << > > ((port)*2)) > > + > > /* BXT display 
engine PLL */ > > #define BXT_DE_PLL_CTL _MMIO(0x6d000) > > #define BXT_DE_PLL_RATIO(x) (x) /* > > {60,65,100} * 19.2MHz */ > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c > > b/drivers/gpu/drm/i915/intel_ddi.c > > index 0914ad9..2a901bf 100644 > > --- a/drivers/gpu/drm/i915/intel_ddi.c > > +++ b/drivers/gpu/drm/i915/intel_ddi.c > > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct > > intel_encoder *encoder, > > { > > struct drm_i915_private *dev_priv = to_i915(encoder- > > > base.dev); > > > > enum port port = intel_ddi_get_encoder_port(encoder); > > + uint32_t val; > > > > if (WARN_ON(!pll)) > > return; > > > > - if (IS_GEN9_BC(dev_priv)) { > > - uint32_t val; > > + if (IS_CANNONLAKE(dev_priv)) { > > + /* Configure DPCLKA_CFGCR0 to map the DPLL to the > > DDI. */ > > + val = I915_READ(DPCLKA_CFGCR0); > > + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); > > + I915_WRITE(DPCLKA_CFGCR0, val); > > A question to the Atomic Lords: don't we need some sort of locking > around this register since it's used by all ports/clocks? I suppose > dev_priv->dpll_lock would do... > > Maybe the same would apply for gen9_bc. If there are modesets happening in parallel for different crtcs, then some locking is needed. dpll_lock seems like the right call, that's what's used to avoid the same problem with the enable/disable hooks. Btw, I think this patch shows why something like [1] might be a good idea. [1] https://patchwork.freedesktop.org/patch/113598/ > > > > > + /* > > + * Configure DPCLKA_CFGCR0 to turn on the clock for > > the DDI. > > + * This step and the step before must be done with > > separate > > + * register writes. 
> > + */ > > + val = I915_READ(DPCLKA_CFGCR0); > > + val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) | > > + DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)); > > + I915_WRITE(DPCLKA_CFGCR0, val); > > + } else if (IS_GEN9_BC(dev_priv)) { > > /* DDI -> PLL mapping */ > > val = I915_READ(DPLL_CTRL2); > > > > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct > > intel_encoder *intel_encoder, > > if (dig_port) > > intel_display_power_put(dev_priv, dig_port- > > > ddi_io_power_domain); > > > > > > - if (IS_GEN9_BC(dev_priv)) > > + if (IS_CANNONLAKE(dev_priv)) > > + I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) | > > + DPCLKA_CFGCR0_DDI_CLK_OFF(port)); > > + else if (IS_GEN9_BC(dev_priv)) > > I915_WRITE(DPLL_CTRL2, (I915_READ(DPLL_CTRL2) | > > DPLL_CTRL2_DDI_CLK_OFF(port) > > )); > > else if (INTEL_GEN(dev_priv) < 9) > > __
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote: > On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote: > > Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu: > > > One of the steps for PLL (un)initialization is to (un)map > > > the correspondent DDI that is actually using that PLL. > > > > > > So, let's do this step following the places already stablished > > > and used so far, although spec put this as part of PLL > > > initialization sequences. > > > > > > v2: Use proper prefix on bits names as suggested by Ander. > > > v3: Add missed "~". Without that the logic was inverted > > > so we were disabling interrupts. > > > Credits-to: Clinton > > > Credits-to: Art > > > v4: Spec is getting updated to do DDI -> PLL mapping > > > and clock on in 2 separated reg writes. (Paulo) > > > Also update bits definitions to use space > > > (1 << 1) instead of (1<<1). (Paulo) > > > > > > Cc: Paulo Zanoni > > > Cc: Art Runyan > > > Cc: Clint Taylor > > > Cc: Ville Syrjälä > > > Cc: Kahola, Mika > > > Cc: Ander Conselvan De Oliveira > > m> > > > Signed-off-by: Rodrigo Vivi > > > Reviewed-by: Kahola, Mika > > > Signed-off-by: Rodrigo Vivi > > > --- > > > drivers/gpu/drm/i915/i915_reg.h | 9 + > > > drivers/gpu/drm/i915/intel_ddi.c | 23 --- > > > 2 files changed, 29 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > > > b/drivers/gpu/drm/i915/i915_reg.h > > > index 3cfc65f..dcb8e21 100644 > > > --- a/drivers/gpu/drm/i915/i915_reg.h > > > +++ b/drivers/gpu/drm/i915/i915_reg.h > > > @@ -8150,6 +8150,15 @@ enum { > > > #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, > > > _DPLL1_CFGCR1, _DPLL2_CFGCR1) > > > #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, > > > _DPLL1_CFGCR2, _DPLL2_CFGCR2) > > > > > > +/* > > > + * CNL Clocks > > > + */ > > > +#define DPCLKA_CFGCR0_MMIO(0x6C200) > > > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10)) > > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)(3 << > > 
> ((port)*2)) > > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2) > > > +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)((pll) << > > > ((port)*2)) > > > + > > > /* BXT display engine PLL */ > > > #define BXT_DE_PLL_CTL _MMIO(0x6d000) > > > #define BXT_DE_PLL_RATIO(x)(x) /* > > > {60,65,100} * 19.2MHz */ > > > diff --git a/drivers/gpu/drm/i915/intel_ddi.c > > > b/drivers/gpu/drm/i915/intel_ddi.c > > > index 0914ad9..2a901bf 100644 > > > --- a/drivers/gpu/drm/i915/intel_ddi.c > > > +++ b/drivers/gpu/drm/i915/intel_ddi.c > > > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct > > > intel_encoder *encoder, > > > { > > > struct drm_i915_private *dev_priv = to_i915(encoder- > > > > base.dev); > > > > > > enum port port = intel_ddi_get_encoder_port(encoder); > > > + uint32_t val; > > > > > > if (WARN_ON(!pll)) > > > return; > > > > > > - if (IS_GEN9_BC(dev_priv)) { > > > - uint32_t val; > > > + if (IS_CANNONLAKE(dev_priv)) { > > > + /* Configure DPCLKA_CFGCR0 to map the DPLL to the > > > DDI. */ > > > + val = I915_READ(DPCLKA_CFGCR0); > > > + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); > > > + I915_WRITE(DPCLKA_CFGCR0, val); > > > > A question to the Atomic Lords: don't we need some sort of locking > > around this register since it's used by all ports/clocks? I suppose > > dev_priv->dpll_lock would do... > > > > Maybe the same would apply for gen9_bc. > > If there are modesets happening in parallel for different crtcs, then some > locking is needed. dpll_lock seems like the right call, that's what's used to > avoid the same problem with the enable/disable hooks. If something is allowing modesets to commit in parallel then probably the whole world is on fire. Historically connection_mutex has been there to protect us, but not sure how that goes with nonblocking commits. I do hope there's still something there to prevents this... > > Btw, I think this patch shows why something like [1] might be a good idea. 
> > [1] https://patchwork.freedesktop.org/patch/113598/ > > > > > > > > + /* > > > + * Configure DPCLKA_CFGCR0 to turn on the clock for > > > the DDI. > > > + * This step and the step before must be done with > > > separate > > > + * register writes. > > > + */ > > > + val = I915_READ(DPCLKA_CFGCR0); > > > + val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) | > > > + DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)); > > > + I915_WRITE(DPCLKA_CFGCR0, val); > > > + } else if (IS_GEN9_BC(dev_priv)) { > > > /* DDI -> PLL mapping */ > > > val = I915_READ(DPLL_CTRL2); > > > > > > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct > > > intel_encoder *intel_encoder, > > > if (dig_port) > > > i
[Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
We are using some scratch registers in MMIO based send function.
Make their base and count flexible in preparation of upcoming
GuC firmware/hardware changes. While around, change cmd len
parameter verification from WARN_ON to GEM_BUG_ON as we don't
need this all the time.

v2: call out WARN/GEM_BUG change in the commit msg (Daniele)

Signed-off-by: Michal Wajdeczko
Suggested-by: Daniele Ceraolo Spurio
Cc: Daniele Ceraolo Spurio
Cc: Joonas Lahtinen
Reviewed-by: Daniele Ceraolo Spurio
---
 drivers/gpu/drm/i915/intel_uc.c | 41 ++---
 drivers/gpu/drm/i915/intel_uc.h | 7 +++
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 72f49e6..9d11c42 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv)
 	__intel_uc_fw_fini(&dev_priv->huc.fw);
 }
 
+static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i)
+{
+	GEM_BUG_ON(!guc->send_regs.base);
+	GEM_BUG_ON(!guc->send_regs.count);
+	GEM_BUG_ON(i >= guc->send_regs.count);
+
+	return _MMIO(guc->send_regs.base + 4 * i);
+}
+
+static void guc_init_send_regs(struct intel_guc *guc)
+{
+	struct drm_i915_private *dev_priv = guc_to_i915(guc);
+	enum forcewake_domains fw_domains = 0;
+	u32 i;
+
+	guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0));
+	guc->send_regs.count = SOFT_SCRATCH_COUNT - 1;
+
+	for (i = 0; i < guc->send_regs.count; i++) {
+		fw_domains |= intel_uncore_forcewake_for_reg(dev_priv,
+					guc_send_reg(guc, i),
+					FW_REG_READ | FW_REG_WRITE);
+	}
+	guc->send_regs.fw_domains = fw_domains;
+}
+
 static int guc_enable_communication(struct intel_guc *guc)
 {
 	/* XXX: placeholder for alternate setup */
+	guc_init_send_regs(guc);
 	guc->send = intel_guc_send_mmio;
 	return 0;
 }
@@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	int i;
 	int ret;
 
-	if (WARN_ON(len < 1 || len > 15))
-		return -EINVAL;
+	GEM_BUG_ON(!len);
+	GEM_BUG_ON(len > guc->send_regs.count);
 
 	mutex_lock(&guc->send_mutex);
-	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER);
+	intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains);
 
 	dev_priv->guc.action_count += 1;
 	dev_priv->guc.action_cmd = action[0];
 
 	for (i = 0; i < len; i++)
-		I915_WRITE(SOFT_SCRATCH(i), action[i]);
+		I915_WRITE(guc_send_reg(guc, i), action[i]);
 
-	POSTING_READ(SOFT_SCRATCH(i - 1));
+	POSTING_READ(guc_send_reg(guc, i - 1));
 
 	intel_guc_notify(guc);
 
@@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	 * Fast commands should still complete in 10us.
 	 */
 	ret = __intel_wait_for_register_fw(dev_priv,
-					   SOFT_SCRATCH(0),
+					   guc_send_reg(guc, 0),
 					   INTEL_GUC_RECV_MASK,
 					   INTEL_GUC_RECV_MASK,
 					   10, 10, &status);
@@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 *action, u32 len)
 	}
 	dev_priv->guc.action_status = status;
 
-	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER);
+	intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains);
 	mutex_unlock(&guc->send_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h
index 097289b..a37a8cc 100644
--- a/drivers/gpu/drm/i915/intel_uc.h
+++ b/drivers/gpu/drm/i915/intel_uc.h
@@ -205,6 +205,13 @@ struct intel_guc {
 	uint64_t submissions[I915_NUM_ENGINES];
 	uint32_t last_seqno[I915_NUM_ENGINES];
 
+	/* GuC's FW specific registers used in MMIO send */
+	struct {
+		u32 base;
+		u32 count;
+		u32 fw_domains; /* enum forcewake_domains */
+	} send_regs;
+
 	/* To serialize the intel_guc_send actions */
 	struct mutex send_mutex;
 
-- 
2.7.4

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
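For readers following the register arithmetic in this patch, here is a minimal userspace sketch of the base/count scheme. It is not the driver code: `SKETCH_SCRATCH_BASE` and `SKETCH_SCRATCH_COUNT` are hypothetical values (the count is picked so the limit matches the old `len > 15` check), and plain `assert()` stands in for `GEM_BUG_ON()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not taken from the patch. */
#define SKETCH_SCRATCH_BASE   0xc180u
#define SKETCH_SCRATCH_COUNT  16u

struct send_regs_sketch {
	uint32_t base;   /* MMIO offset of the first send register */
	uint32_t count;  /* number of registers usable for one message */
};

void send_regs_init(struct send_regs_sketch *r)
{
	r->base = SKETCH_SCRATCH_BASE;
	r->count = SKETCH_SCRATCH_COUNT - 1;  /* mirrors SOFT_SCRATCH_COUNT - 1 */
}

/* Offset of the i-th send register: 32-bit registers, 4 bytes apart. */
uint32_t send_reg_offset(const struct send_regs_sketch *r, uint32_t i)
{
	assert(r->base);
	assert(r->count);
	assert(i < r->count);
	return r->base + 4 * i;
}

/* Length check corresponding to the two GEM_BUG_ON()s in the send path. */
int send_len_valid(const struct send_regs_sketch *r, uint32_t len)
{
	return len >= 1 && len <= r->count;
}
```

With these assumed values, a message may use at most 15 registers, the same bound the removed WARN_ON enforced.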
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 04, 2017 at 12:21:51PM +0300, Andy Shevchenko wrote:
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index cbf7763d8091..420d51b286ad 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1808,10 +1808,9 @@ IOMMU_INIT_POST(detect_intel_iommu);
>   * for Directed-IO Architecture Specifiction, Rev 2.2, Section 8.8
>   * "Remapping Hardware Unit Hot Plug".
>   */
> -static u8 dmar_hp_uuid[] = {
> -	/* */0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
> -	/* 0008 */0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
> -};
> +static uuid_le dmar_hp_uuid =
> +	UUID_LE(0xD8C1A3A6, 0xBE9B, 0x4C9B,
> +		0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF);
>  
>  /*
>   * Currently there's only one revision and BIOS will not check the revision id,
> @@ -1824,7 +1823,7 @@ static u8 dmar_hp_uuid[] = {
>  
>  static inline bool dmar_detect_dsm(acpi_handle handle, int func)
>  {
> -	return acpi_check_dsm(handle, dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << func);
> +	return acpi_check_dsm(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID, 1 << func);
>  }
>  
>  static int dmar_walk_dsm_resource(acpi_handle handle, int func,
> @@ -1843,7 +1842,7 @@ static int dmar_walk_dsm_resource(acpi_handle handle, int func,
>  	if (!dmar_detect_dsm(handle, func))
>  		return 0;
>  
> -	obj = acpi_evaluate_dsm_typed(handle, dmar_hp_uuid, DMAR_DSM_REV_ID,
> +	obj = acpi_evaluate_dsm_typed(handle, &dmar_hp_uuid, DMAR_DSM_REV_ID,
>  				      func, NULL, ACPI_TYPE_BUFFER);
>  	if (!obj)
>  		return -ENODEV;

DMAR part is

Acked-by: Joerg Roedel
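The quoted hunk swaps a raw byte array for a UUID_LE() initializer, so it is worth checking that the 16 bytes stay the same. A small userspace sketch of that check follows. It does not use the kernel headers: `uuid_le_sketch` and `UUID_LE_SKETCH` are hypothetical stand-ins that mimic the little-endian field layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's uuid_le type. */
typedef struct { uint8_t b[16]; } uuid_le_sketch;

/*
 * Mimics the little-endian layout: the first three fields are stored
 * least-significant byte first, the last eight bytes are stored as given.
 */
#define UUID_LE_SKETCH(a, b, c, d0, d1, d2, d3, d4, d5, d6, d7)      \
	((uuid_le_sketch){{                                            \
		(a) & 0xff, ((a) >> 8) & 0xff,                         \
		((a) >> 16) & 0xff, ((a) >> 24) & 0xff,                \
		(b) & 0xff, ((b) >> 8) & 0xff,                         \
		(c) & 0xff, ((c) >> 8) & 0xff,                         \
		(d0), (d1), (d2), (d3), (d4), (d5), (d6), (d7) }})

/* The byte array removed by the quoted hunk. */
static const uint8_t old_dmar_hp_uuid[16] = {
	0xA6, 0xA3, 0xC1, 0xD8, 0x9B, 0xBE, 0x9B, 0x4C,
	0x91, 0xBF, 0xC3, 0xCB, 0x81, 0xFC, 0x5D, 0xAF
};

/* Returns 1 when the new initializer encodes the same 16 bytes. */
int dmar_uuid_layout_matches(void)
{
	uuid_le_sketch u = UUID_LE_SKETCH(0xD8C1A3A6, 0xBE9B, 0x4C9B,
					  0x91, 0xBF, 0xC3, 0xCB,
					  0x81, 0xFC, 0x5D, 0xAF);

	return memcmp(u.b, old_dmar_hp_uuid, sizeof(u.b)) == 0;
}
```

The first three fields are byte-swapped relative to how they are written, which is why 0xD8C1A3A6 corresponds to the leading bytes A6 A3 C1 D8 of the old array.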
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote: > One of the steps for PLL (un)initialization is to (un)map > the correspondent DDI that is actually using that PLL. > > So, let's do this step following the places already stablished > and used so far, although spec put this as part of PLL > initialization sequences. > > v2: Use proper prefix on bits names as suggested by Ander. > v3: Add missed "~". Without that the logic was inverted > so we were disabling interrupts. > Credits-to: Clinton > Credits-to: Art > v4: Spec is getting updated to do DDI -> PLL mapping > and clock on in 2 separated reg writes. (Paulo) > Also update bits definitions to use space > (1 << 1) instead of (1<<1). (Paulo) > > Cc: Paulo Zanoni > Cc: Art Runyan > Cc: Clint Taylor > Cc: Ville Syrjälä > Cc: Kahola, Mika > Cc: Ander Conselvan De Oliveira > Signed-off-by: Rodrigo Vivi > Reviewed-by: Kahola, Mika > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/i915_reg.h | 9 + > drivers/gpu/drm/i915/intel_ddi.c | 23 --- > 2 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 3cfc65f..dcb8e21 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -8150,6 +8150,15 @@ enum { > #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, > _DPLL2_CFGCR1) > #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, > _DPLL2_CFGCR2) > > +/* > + * CNL Clocks > + */ > +#define DPCLKA_CFGCR0_MMIO(0x6C200) > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)(3 << ((port)*2)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)((pll) << ((port)*2)) > + > /* BXT display engine PLL */ > #define BXT_DE_PLL_CTL _MMIO(0x6d000) > #define BXT_DE_PLL_RATIO(x)(x) /* {60,65,100} * > 19.2MHz */ > diff --git a/drivers/gpu/drm/i915/intel_ddi.c > 
b/drivers/gpu/drm/i915/intel_ddi.c > index 0914ad9..2a901bf 100644 > --- a/drivers/gpu/drm/i915/intel_ddi.c > +++ b/drivers/gpu/drm/i915/intel_ddi.c > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder > *encoder, > { > struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); > enum port port = intel_ddi_get_encoder_port(encoder); > + uint32_t val; > > if (WARN_ON(!pll)) > return; > > - if (IS_GEN9_BC(dev_priv)) { > - uint32_t val; > + if (IS_CANNONLAKE(dev_priv)) { > + /* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */ > + val = I915_READ(DPCLKA_CFGCR0); > + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); > + I915_WRITE(DPCLKA_CFGCR0, val); > > + /* > + * Configure DPCLKA_CFGCR0 to turn on the clock for the DDI. > + * This step and the step before must be done with separate > + * register writes. > + */ > + val = I915_READ(DPCLKA_CFGCR0); > + val &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) | > + DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)); val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); ? Or clearing the clock select to zero has no effect here? Ander > + I915_WRITE(DPCLKA_CFGCR0, val); > + } else if (IS_GEN9_BC(dev_priv)) { > /* DDI -> PLL mapping */ > val = I915_READ(DPLL_CTRL2); > > @@ -1763,7 +1777,10 @@ static void intel_ddi_post_disable(struct > intel_encoder *intel_encoder, > if (dig_port) > intel_display_power_put(dev_priv, > dig_port->ddi_io_power_domain); > > - if (IS_GEN9_BC(dev_priv)) > + if (IS_CANNONLAKE(dev_priv)) > + I915_WRITE(DPCLKA_CFGCR0, I915_READ(DPCLKA_CFGCR0) | > +DPCLKA_CFGCR0_DDI_CLK_OFF(port)); > + else if (IS_GEN9_BC(dev_priv)) > I915_WRITE(DPLL_CTRL2, (I915_READ(DPLL_CTRL2) | > DPLL_CTRL2_DDI_CLK_OFF(port))); > else if (INTEL_GEN(dev_priv) < 9) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
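To make Ander's question concrete, here is a sketch of the bit arithmetic only, run on a plain variable instead of the real register. The macros are taken from the quoted i915_reg.h hunk; `clk_select_as_posted` and `clk_select_preserving` are hypothetical helper names, and the second one is just one possible reading of the suggestion, not a confirmed fix.

```c
#include <assert.h>
#include <stdint.h>

/* Bit definitions taken from the quoted i915_reg.h hunk. */
#define DPCLKA_CFGCR0_DDI_CLK_OFF(port)       (1 << ((port) + 10))
#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)  (3 << ((port) * 2))
#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)  ((pll) << ((port) * 2))

/* Sequence as posted: the second write also clears the select field. */
uint32_t clk_select_as_posted(uint32_t reg, unsigned int pll,
			      unsigned int port)
{
	reg |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port);     /* first write */
	reg &= ~(DPCLKA_CFGCR0_DDI_CLK_OFF(port) |
		 DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port));   /* second write */
	return reg;
}

/* Assumed alternative: clear the field, set it, then only ungate. */
uint32_t clk_select_preserving(uint32_t reg, unsigned int pll,
			       unsigned int port)
{
	reg &= ~DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port);
	reg |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port);     /* first write */
	reg &= ~DPCLKA_CFGCR0_DDI_CLK_OFF(port);         /* second write */
	return reg;
}
```

Starting from 0x800 (port 1 gated) with pll 2, the posted sequence ends at 0 while the alternative ends at 0x8. Whether a zeroed select field matters on hardware is exactly the open question above; the sketch only shows that the value written differs.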
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Chris Wilson writes: > On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote: >> Chris Wilson writes: >> >> > Typically, there is space available within the ring and if not we have >> > to wait (by definition a slow path). Rearrange the code to reduce the >> > number of branches and stack size for the hotpath, accomodating a slight >> > growth for the wait. >> > >> > v2: Fix the new assert that packets are not larger than the actual ring. >> > >> > Signed-off-by: Chris Wilson >> > --- >> > drivers/gpu/drm/i915/intel_ringbuffer.c | 63 >> > + >> > 1 file changed, 33 insertions(+), 30 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c >> > b/drivers/gpu/drm/i915/intel_ringbuffer.c >> > index c46e5439d379..53123c1cfcc5 100644 >> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c >> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c >> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct >> > drm_i915_gem_request *request) >> >return 0; >> > } >> > >> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) >> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, int >> > bytes) >> > { >> >struct intel_ring *ring = req->ring; >> >struct drm_i915_gem_request *target; >> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct >> > drm_i915_gem_request *req, int bytes) >> > u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) >> > { >> >struct intel_ring *ring = req->ring; >> > - int remain_actual = ring->size - ring->emit; >> > - int remain_usable = ring->effective_size - ring->emit; >> > - int bytes = num_dwords * sizeof(u32); >> > - int total_bytes, wait_bytes; >> > - bool need_wrap = false; >> > + const unsigned int remain_usable = ring->effective_size - ring->emit; >> > + const unsigned int bytes = num_dwords * sizeof(u32); >> > + unsigned int need_wrap = 0; >> > + unsigned int total_bytes; >> >u32 *cs; >> > >> >total_bytes = bytes + req->reserved_space; >> > + 
GEM_BUG_ON(total_bytes > ring->effective_size); >> > >> > - if (unlikely(bytes > remain_usable)) { >> > - /* >> > - * Not enough space for the basic request. So need to flush >> > - * out the remainder and then wait for base + reserved. >> > - */ >> > - wait_bytes = remain_actual + total_bytes; >> > - need_wrap = true; >> > - } else if (unlikely(total_bytes > remain_usable)) { >> > - /* >> > - * The base request will fit but the reserved space >> > - * falls off the end. So we don't need an immediate wrap >> > - * and only need to effectively wait for the reserved >> > - * size space from the start of ringbuffer. >> > - */ >> > - wait_bytes = remain_actual + req->reserved_space; >> > - } else { >> > - /* No wrapping required, just waiting. */ >> > - wait_bytes = total_bytes; >> > + if (unlikely(total_bytes > remain_usable)) { >> > + const int remain_actual = ring->size - ring->emit; >> > + >> > + if (bytes > remain_usable) { >> > + /* >> > + * Not enough space for the basic request. So need to >> > + * flush out the remainder and then wait for >> > + * base + reserved. >> > + */ >> > + total_bytes += remain_actual; >> > + need_wrap = remain_actual | 1; >> >> Your remain_actual should never reach zero. So in here >> forcing the lowest bit on, and later off, seems superfluous. > > Why can't we fill up to the last byte with commands? remain_actual is > just (size - tail) and we don't force a wrap until emit crosses the > boundary (and not before). We hit remain_actual == 0 in practice. > -Chris My mistake, was thinking postwrap. num_dwords and second parameter to wait_for_space should be unsigned. Reviewed-by: Mika Kuoppala ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
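The `remain_actual | 1` point from this exchange can be shown in isolation. This is a sketch, assuming byte counts always have bit 0 clear (in the quoted code they are multiples of sizeof(u32)); `encode_wrap` and `decode_wrap` are hypothetical names.

```c
#include <assert.h>

/*
 * Encode "wrap needed, with this many tail bytes to fill" in one value.
 * Setting bit 0 keeps the value non-zero even when remain_actual == 0,
 * so the same variable can still be tested as a flag.
 */
unsigned int encode_wrap(unsigned int remain_actual)
{
	return remain_actual | 1;
}

/* Recover the byte count; relies on remain_actual having bit 0 clear. */
unsigned int decode_wrap(unsigned int need_wrap)
{
	return need_wrap & ~1u;
}
```

Without the extra bit, a ring filled exactly to its last byte would give need_wrap == 0 and the wrap would be skipped, which is the remain_actual == 0 case Chris describes.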
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
Op 04-05-17 om 14:44 schreef Ville Syrjälä: > On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote: >> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote: >>> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu: One of the steps for PLL (un)initialization is to (un)map the correspondent DDI that is actually using that PLL. So, let's do this step following the places already stablished and used so far, although spec put this as part of PLL initialization sequences. v2: Use proper prefix on bits names as suggested by Ander. v3: Add missed "~". Without that the logic was inverted so we were disabling interrupts. Credits-to: Clinton Credits-to: Art v4: Spec is getting updated to do DDI -> PLL mapping and clock on in 2 separated reg writes. (Paulo) Also update bits definitions to use space (1 << 1) instead of (1<<1). (Paulo) Cc: Paulo Zanoni Cc: Art Runyan Cc: Clint Taylor Cc: Ville Syrjälä Cc: Kahola, Mika Cc: Ander Conselvan De Oliveira >>> m> Signed-off-by: Rodrigo Vivi Reviewed-by: Kahola, Mika Signed-off-by: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_reg.h | 9 + drivers/gpu/drm/i915/intel_ddi.c | 23 --- 2 files changed, 29 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 3cfc65f..dcb8e21 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8150,6 +8150,15 @@ enum { #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR1, _DPLL2_CFGCR1) #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, _DPLL1_CFGCR2, _DPLL2_CFGCR2) +/* + * CNL Clocks + */ +#define DPCLKA_CFGCR0 _MMIO(0x6C200) +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port) (1 << ((port)+10)) +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << ((port)*2)) +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port)((port)*2) +#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << ((port)*2)) + /* BXT display engine PLL */ #define BXT_DE_PLL_CTL_MMIO(0x6d000) #define BXT_DE_PLL_RATIO(x) (x) /* 
{60,65,100} * 19.2MHz */ diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c index 0914ad9..2a901bf 100644 --- a/drivers/gpu/drm/i915/intel_ddi.c +++ b/drivers/gpu/drm/i915/intel_ddi.c @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct intel_encoder *encoder, { struct drm_i915_private *dev_priv = to_i915(encoder- > base.dev); enum port port = intel_ddi_get_encoder_port(encoder); + uint32_t val; if (WARN_ON(!pll)) return; - if (IS_GEN9_BC(dev_priv)) { - uint32_t val; + if (IS_CANNONLAKE(dev_priv)) { + /* Configure DPCLKA_CFGCR0 to map the DPLL to the DDI. */ + val = I915_READ(DPCLKA_CFGCR0); + val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); + I915_WRITE(DPCLKA_CFGCR0, val); >>> A question to the Atomic Lords: don't we need some sort of locking >>> around this register since it's used by all ports/clocks? I suppose >>> dev_priv->dpll_lock would do... >>> >>> Maybe the same would apply for gen9_bc. >> If there are modesets happening in parallel for different crtcs, then some >> locking is needed. dpll_lock seems like the right call, that's what's used to >> avoid the same problem with the enable/disable hooks. > If something is allowing modesets to commit in parallel then probably > the whole world is on fire. Historically connection_mutex has been there > to protect us, but not sure how that goes with nonblocking commits. I > do hope there's still something there to prevents this... During nonblocking modesets we don't hold any locks. It's still possible that we force serialization through some other means, for example grabbing all crtc_states might force serialization previously. But I'm not sure this is guaranteed to happen even for SKL. It might happen for when DDB allocation or cdclk changes but there's no guarantee during modeset. So quite likely you'll need locking here. :) ~Maarten ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
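As a rough illustration of the locking being discussed, the sketch below performs each read-modify-write of a shared register under one mutex. It is a userspace model, not i915 code: `struct fake_display`, `fake_rmw` and `fake_map_and_ungate` are hypothetical, a pthread mutex stands in for dev_priv->dpll_lock, and a plain variable stands in for DPCLKA_CFGCR0.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_DDI_CLK_OFF(port)       (1u << ((port) + 10))
#define SKETCH_DDI_CLK_SEL_MASK(port)  (3u << ((port) * 2))
#define SKETCH_DDI_CLK_SEL(pll, port)  ((uint32_t)(pll) << ((port) * 2))

/* Hypothetical stand-ins: a fake register plus the lock guarding it. */
struct fake_display {
	pthread_mutex_t dpll_lock;
	uint32_t dpclka_cfgcr0;
};

/* One read-modify-write of the shared register, done under the lock. */
void fake_rmw(struct fake_display *d, uint32_t clear, uint32_t set)
{
	pthread_mutex_lock(&d->dpll_lock);
	d->dpclka_cfgcr0 = (d->dpclka_cfgcr0 & ~clear) | set;
	pthread_mutex_unlock(&d->dpll_lock);
}

/* Two separate writes per port: map the PLL, then ungate the clock. */
void fake_map_and_ungate(struct fake_display *d, unsigned int pll,
			 unsigned int port)
{
	fake_rmw(d, SKETCH_DDI_CLK_SEL_MASK(port),
		 SKETCH_DDI_CLK_SEL(pll, port));
	fake_rmw(d, SKETCH_DDI_CLK_OFF(port), 0);
}
```

Because every port shares the one register, an unlocked read-modify-write from a parallel modeset could overwrite another port's bits. Holding the lock across the read and the write is what prevents that in this model.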
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
On Thu, May 04, 2017 at 03:59:05PM +0300, Mika Kuoppala wrote: > Chris Wilson writes: > > > On Thu, May 04, 2017 at 03:11:45PM +0300, Mika Kuoppala wrote: > >> Chris Wilson writes: > >> > >> > Typically, there is space available within the ring and if not we have > >> > to wait (by definition a slow path). Rearrange the code to reduce the > >> > number of branches and stack size for the hotpath, accomodating a slight > >> > growth for the wait. > >> > > >> > v2: Fix the new assert that packets are not larger than the actual ring. > >> > > >> > Signed-off-by: Chris Wilson > >> > --- > >> > drivers/gpu/drm/i915/intel_ringbuffer.c | 63 > >> > + > >> > 1 file changed, 33 insertions(+), 30 deletions(-) > >> > > >> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > >> > b/drivers/gpu/drm/i915/intel_ringbuffer.c > >> > index c46e5439d379..53123c1cfcc5 100644 > >> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > >> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > >> > @@ -1654,7 +1654,7 @@ static int ring_request_alloc(struct > >> > drm_i915_gem_request *request) > >> > return 0; > >> > } > >> > > >> > -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) > >> > +static noinline int wait_for_space(struct drm_i915_gem_request *req, > >> > int bytes) > >> > { > >> > struct intel_ring *ring = req->ring; > >> > struct drm_i915_gem_request *target; > >> > @@ -1702,49 +1702,52 @@ static int wait_for_space(struct > >> > drm_i915_gem_request *req, int bytes) > >> > u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) > >> > { > >> > struct intel_ring *ring = req->ring; > >> > -int remain_actual = ring->size - ring->emit; > >> > -int remain_usable = ring->effective_size - ring->emit; > >> > -int bytes = num_dwords * sizeof(u32); > >> > -int total_bytes, wait_bytes; > >> > -bool need_wrap = false; > >> > +const unsigned int remain_usable = ring->effective_size - > >> > ring->emit; > >> > +const unsigned int bytes = num_dwords * 
sizeof(u32); > >> > +unsigned int need_wrap = 0; > >> > +unsigned int total_bytes; > >> > u32 *cs; > >> > > >> > total_bytes = bytes + req->reserved_space; > >> > +GEM_BUG_ON(total_bytes > ring->effective_size); > >> > > >> > -if (unlikely(bytes > remain_usable)) { > >> > -/* > >> > - * Not enough space for the basic request. So need to > >> > flush > >> > - * out the remainder and then wait for base + reserved. > >> > - */ > >> > -wait_bytes = remain_actual + total_bytes; > >> > -need_wrap = true; > >> > -} else if (unlikely(total_bytes > remain_usable)) { > >> > -/* > >> > - * The base request will fit but the reserved space > >> > - * falls off the end. So we don't need an immediate wrap > >> > - * and only need to effectively wait for the reserved > >> > - * size space from the start of ringbuffer. > >> > - */ > >> > -wait_bytes = remain_actual + req->reserved_space; > >> > -} else { > >> > -/* No wrapping required, just waiting. */ > >> > -wait_bytes = total_bytes; > >> > +if (unlikely(total_bytes > remain_usable)) { > >> > +const int remain_actual = ring->size - ring->emit; > >> > + > >> > +if (bytes > remain_usable) { > >> > +/* > >> > + * Not enough space for the basic request. So > >> > need to > >> > + * flush out the remainder and then wait for > >> > + * base + reserved. > >> > + */ > >> > +total_bytes += remain_actual; > >> > +need_wrap = remain_actual | 1; > >> > >> Your remain_actual should never reach zero. So in here > >> forcing the lowest bit on, and later off, seems superfluous. > > > > Why can't we fill up to the last byte with commands? remain_actual is > > just (size - tail) and we don't force a wrap until emit crosses the > > boundary (and not before). We hit remain_actual == 0 in practice. > > -Chris > > My mistake, was thinking postwrap. > > num_dwords and second parameter to wait_for_space should be unsigned. You predictive algorithm is working fine though. Applied after your suggestion from patch 1. 
Thanks, -Chris -- Chris Wilson, Intel Open Source Technology Centre
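The wait-size logic reviewed in this thread has three cases, which can be sketched outside the driver as follows. This is a model under stated assumptions: `struct ring_dims` and `ring_wait_bytes` are hypothetical names, and only the arithmetic from the quoted hunk is reproduced, not the waiting or the MI_NOOP fill.

```c
#include <assert.h>

/* Hypothetical container for the three ring fields the logic reads. */
struct ring_dims {
	unsigned int size;            /* total bytes in the ring */
	unsigned int effective_size;  /* bytes usable for commands */
	unsigned int emit;            /* current write offset */
};

/*
 * Returns how many bytes must be free before the request can proceed.
 * *need_wrap is 0 when no immediate wrap is needed; otherwise it holds
 * the tail bytes to pad, with bit 0 set so it is never zero.
 */
unsigned int ring_wait_bytes(const struct ring_dims *ring,
			     unsigned int bytes, unsigned int reserved,
			     unsigned int *need_wrap)
{
	const unsigned int remain_usable = ring->effective_size - ring->emit;
	unsigned int total_bytes = bytes + reserved;

	*need_wrap = 0;
	if (total_bytes > remain_usable) {
		const unsigned int remain_actual = ring->size - ring->emit;

		if (bytes > remain_usable) {
			/* Base request does not fit: pad the tail and wrap. */
			total_bytes += remain_actual;
			*need_wrap = remain_actual | 1;
		} else {
			/* Only the reserved space falls off the end. */
			total_bytes = reserved + remain_actual;
		}
	}
	return total_bytes;
}
```

The common case takes a single comparison and never computes remain_actual, which is the branch reduction the commit message describes.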
[Intel-gfx] [CI 3/3] drm/i915: Micro-optimise hotpath through intel_ring_begin()
Typically, there is space available within the ring and if not we have to wait (by definition a slow path). Rearrange the code to reduce the number of branches and stack size for the hotpath, accomodating a slight growth for the wait. v2: Fix the new assert that packets are not larger than the actual ring. v3: Make the parameters unsigned as well to make usage. Signed-off-by: Chris Wilson Reviewed-by: Mika Kuoppala --- drivers/gpu/drm/i915/intel_ringbuffer.c | 67 ++--- drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +- 2 files changed, 38 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 47f144b1e3fa..8b427a6151b2 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1655,7 +1655,8 @@ static int ring_request_alloc(struct drm_i915_gem_request *request) return 0; } -static int wait_for_space(struct drm_i915_gem_request *req, int bytes) +static noinline int wait_for_space(struct drm_i915_gem_request *req, + unsigned int bytes) { struct intel_ring *ring = req->ring; struct drm_i915_gem_request *target; @@ -1700,52 +1701,56 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes) return 0; } -u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords) +u32 *intel_ring_begin(struct drm_i915_gem_request *req, + unsigned int num_dwords) { struct intel_ring *ring = req->ring; - int remain_actual = ring->size - ring->emit; - int remain_usable = ring->effective_size - ring->emit; - int bytes = num_dwords * sizeof(u32); - int total_bytes, wait_bytes; - bool need_wrap = false; + const unsigned int remain_usable = ring->effective_size - ring->emit; + const unsigned int bytes = num_dwords * sizeof(u32); + unsigned int need_wrap = 0; + unsigned int total_bytes; u32 *cs; total_bytes = bytes + req->reserved_space; + GEM_BUG_ON(total_bytes > ring->effective_size); - if (unlikely(bytes > remain_usable)) { - /* -* Not enough space 
for the basic request. So need to flush -* out the remainder and then wait for base + reserved. -*/ - wait_bytes = remain_actual + total_bytes; - need_wrap = true; - } else if (unlikely(total_bytes > remain_usable)) { - /* -* The base request will fit but the reserved space -* falls off the end. So we don't need an immediate wrap -* and only need to effectively wait for the reserved -* size space from the start of ringbuffer. -*/ - wait_bytes = remain_actual + req->reserved_space; - } else { - /* No wrapping required, just waiting. */ - wait_bytes = total_bytes; + if (unlikely(total_bytes > remain_usable)) { + const int remain_actual = ring->size - ring->emit; + + if (bytes > remain_usable) { + /* +* Not enough space for the basic request. So need to +* flush out the remainder and then wait for +* base + reserved. +*/ + total_bytes += remain_actual; + need_wrap = remain_actual | 1; + } else { + /* +* The base request will fit but the reserved space +* falls off the end. So we don't need an immediate +* wrap and only need to effectively wait for the +* reserved size from the start of ringbuffer. +*/ + total_bytes = req->reserved_space + remain_actual; + } } - if (wait_bytes > ring->space) { - int ret = wait_for_space(req, wait_bytes); + if (unlikely(total_bytes > ring->space)) { + int ret = wait_for_space(req, total_bytes); if (unlikely(ret)) return ERR_PTR(ret); } if (unlikely(need_wrap)) { - GEM_BUG_ON(remain_actual > ring->space); - GEM_BUG_ON(ring->emit + remain_actual > ring->size); + need_wrap &= ~1; + GEM_BUG_ON(need_wrap > ring->space); + GEM_BUG_ON(ring->emit + need_wrap > ring->size); /* Fill the tail with MI_NOOP */ - memset(ring->vaddr + ring->emit, 0, remain_actual); + memset(ring->vaddr + ring->emit, 0, need_wrap); ring->emit = 0; - ring->space -= remain_actual; + ring->space -= need_wrap; } GEM_BUG_ON(ring->emit > r
[Intel-gfx] [CI 2/3] drm/i915: Report the ring->space from intel_ring_update_space()
Some callers immediately want to know the current ring->space after
calling intel_ring_update_space(), which we can freely provide via the
return parameter.

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 12
 drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e7ef04cc071b..47f144b1e3fa 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -51,9 +51,14 @@ static unsigned int __intel_ring_space(unsigned int head,
 	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
-void intel_ring_update_space(struct intel_ring *ring)
+unsigned int intel_ring_update_space(struct intel_ring *ring)
 {
-	ring->space = __intel_ring_space(ring->head, ring->emit, ring->size);
+	unsigned int space;
+
+	space = __intel_ring_space(ring->head, ring->emit, ring->size);
+
+	ring->space = space;
+	return space;
 }
 
 static int
@@ -1658,8 +1663,7 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
 
 	lockdep_assert_held(&req->i915->drm.struct_mutex);
 
-	intel_ring_update_space(ring);
-	if (ring->space >= bytes)
+	if (intel_ring_update_space(ring) >= bytes)
 		return 0;
 
 	/*
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 650ab884d6c8..3e343b09eeb6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -486,7 +486,7 @@ int intel_ring_pin(struct intel_ring *ring,
 		   struct drm_i915_private *i915,
 		   unsigned int offset_bias);
 void intel_ring_reset(struct intel_ring *ring, u32 tail);
-void intel_ring_update_space(struct intel_ring *ring);
+unsigned int intel_ring_update_space(struct intel_ring *ring);
 void intel_ring_unpin(struct intel_ring *ring);
 void intel_ring_free(struct intel_ring *ring);
 
-- 
2.11.0
[Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
Exploit the power-of-two ring size to compute the space across the
wraparound using a mask rather than an if. Convert to unsigned integers
so the operation is well defined.

References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Reviewed-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
 2 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3ce1c87dec46..e7ef04cc071b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -39,12 +39,16 @@
  */
 #define LEGACY_REQUEST_SIZE 200
 
-static int __intel_ring_space(int head, int tail, int size)
+static unsigned int __intel_ring_space(unsigned int head,
+				       unsigned int tail,
+				       unsigned int size)
 {
-	int space = head - tail;
-	if (space <= 0)
-		space += size;
-	return space - I915_RING_FREE_SPACE;
+	/*
+	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
+	 * same cacheline, the Head Pointer must not be greater than the Tail
+	 * Pointer."
+	 */
+	return (head - tail - CACHELINE_BYTES) & (size - 1);
 }
 
 void intel_ring_update_space(struct intel_ring *ring)
@@ -1670,12 +1674,9 @@ static int wait_for_space(struct drm_i915_gem_request *req, int bytes)
 	GEM_BUG_ON(!req->reserved_space);
 
 	list_for_each_entry(target, &ring->request_list, ring_link) {
-		unsigned space;
-
 		/* Would completion of this request free enough space? */
-		space = __intel_ring_space(target->postfix, ring->emit,
-					   ring->size);
-		if (space >= bytes)
+		if (bytes <= __intel_ring_space(target->postfix,
+						ring->emit, ring->size))
 			break;
 	}
 
@@ -1744,11 +1745,11 @@ u32 *intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 	}
 
 	GEM_BUG_ON(ring->emit > ring->size - bytes);
+	GEM_BUG_ON(ring->space < bytes);
 	cs = ring->vaddr + ring->emit;
 	GEM_DEBUG_EXEC(memset(cs, POISON_INUSE, bytes));
 	ring->emit += bytes;
 	ring->space -= bytes;
-	GEM_BUG_ON(ring->space < 0);
 
 	return cs;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 600713b29d79..650ab884d6c8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -17,17 +17,6 @@
 #define CACHELINE_BYTES 64
 #define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(uint32_t))
 
-/*
- * Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use"
- * Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use"
- * Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use"
- *
- * "If the Ring Buffer Head Pointer and the Tail Pointer are on the same
- * cacheline, the Head Pointer must not be greater than the Tail
- * Pointer."
- */
-#define I915_RING_FREE_SPACE 64
-
 struct intel_hw_status_page {
 	struct i915_vma *vma;
 	u32 *page_addr;
@@ -145,9 +134,9 @@ struct intel_ring {
 	u32 tail;
 	u32 emit;
 
-	int space;
-	int size;
-	int effective_size;
+	u32 space;
+	u32 size;
+	u32 effective_size;
 };
 
 struct i915_gem_context;
@@ -548,6 +537,25 @@ assert_ring_tail_valid(const struct intel_ring *ring, unsigned int tail)
 	 */
 	GEM_BUG_ON(!IS_ALIGNED(tail, 8));
 	GEM_BUG_ON(tail >= ring->size);
+
+	/*
+	 * "Ring Buffer Use"
+	 *	Gen2 BSpec "1. Programming Environment" / 1.4.4.6
+	 *	Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5
+	 *	Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5
+	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
+	 * same cacheline, the Head Pointer must not be greater than the Tail
+	 * Pointer."
+	 *
+	 * We use ring->head as the last known location of the actual RING_HEAD,
+	 * it may have advanced but in the worst case it is equally the same
+	 * as ring->head and so we should never program RING_TAIL to advance
+	 * into the same cacheline as ring->head.
+	 */
+#define cacheline(a) round_down(a, CACHELINE_BYTES)
+	GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) &&
+		   tail < ring->head);
+#undef cacheline
 }
 
 static inline unsigned int
-- 
2.11.0
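For readers following the arithmetic in this patch, here is a minimal userspace sketch (not driver code) that puts the old branching form next to the new masked form. The names ring_space_branch, ring_space_mask and count_mismatches are made up for illustration, and the assert() only stands in for whatever check the driver uses. The comparison skips inputs where the signed form goes negative, since the masked form wraps there instead.

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_BYTES 64u

/* Pre-patch form: signed arithmetic with an explicit wraparound branch. */
static int ring_space_branch(int head, int tail, int size)
{
	int space = head - tail;

	if (space <= 0)
		space += size;
	return space - (int)CACHELINE_BYTES;
}

/* Post-patch form: unsigned arithmetic, wraparound folded into a mask. */
static uint32_t ring_space_mask(uint32_t head, uint32_t tail, uint32_t size)
{
	/* The mask only acts as a modulo for a non-zero power-of-two size. */
	assert(size != 0 && (size & (size - 1)) == 0);
	return (head - tail - CACHELINE_BYTES) & (size - 1);
}

/*
 * Hypothetical helper: walk every 8-byte aligned head/tail pair and count
 * the cases where the old result is non-negative but differs from the new.
 */
static unsigned int count_mismatches(uint32_t size)
{
	unsigned int bad = 0;

	for (uint32_t head = 0; head < size; head += 8) {
		for (uint32_t tail = 0; tail < size; tail += 8) {
			int old = ring_space_branch((int)head, (int)tail,
						    (int)size);

			if (old >= 0 &&
			    (uint32_t)old != ring_space_mask(head, tail, size))
				bad++;
		}
	}
	return bad;
}
```

Calling count_mismatches(4096) is one way to convince yourself the two forms agree for a 4 KiB ring wherever the old form reported a usable amount of space.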
Re: [Intel-gfx] [PATCH 32/67] drm/i915/cnl: DDI - PLL mapping
On Thu, May 04, 2017 at 03:02:07PM +0200, Maarten Lankhorst wrote: > Op 04-05-17 om 14:44 schreef Ville Syrjälä: > > On Thu, May 04, 2017 at 03:35:51PM +0300, Ander Conselvan De Oliveira wrote: > >> On Fri, 2017-04-07 at 18:12 -0300, Paulo Zanoni wrote: > >>> Em Qui, 2017-04-06 às 12:15 -0700, Rodrigo Vivi escreveu: > One of the steps for PLL (un)initialization is to (un)map > the correspondent DDI that is actually using that PLL. > > So, let's do this step following the places already stablished > and used so far, although spec put this as part of PLL > initialization sequences. > > v2: Use proper prefix on bits names as suggested by Ander. > v3: Add missed "~". Without that the logic was inverted > so we were disabling interrupts. > Credits-to: Clinton > Credits-to: Art > v4: Spec is getting updated to do DDI -> PLL mapping > and clock on in 2 separated reg writes. (Paulo) > Also update bits definitions to use space > (1 << 1) instead of (1<<1). (Paulo) > > Cc: Paulo Zanoni > Cc: Art Runyan > Cc: Clint Taylor > Cc: Ville Syrjälä > Cc: Kahola, Mika > Cc: Ander Conselvan De Oliveira m> > Signed-off-by: Rodrigo Vivi > Reviewed-by: Kahola, Mika > Signed-off-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/i915_reg.h | 9 + > drivers/gpu/drm/i915/intel_ddi.c | 23 --- > 2 files changed, 29 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > b/drivers/gpu/drm/i915/i915_reg.h > index 3cfc65f..dcb8e21 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -8150,6 +8150,15 @@ enum { > #define DPLL_CFGCR1(id) _MMIO_PIPE((id) - SKL_DPLL1, > _DPLL1_CFGCR1, _DPLL2_CFGCR1) > #define DPLL_CFGCR2(id) _MMIO_PIPE((id) - SKL_DPLL1, > _DPLL1_CFGCR2, _DPLL2_CFGCR2) > > +/* > + * CNL Clocks > + */ > +#define DPCLKA_CFGCR0 _MMIO(0x6C200) > +#define DPCLKA_CFGCR0_DDI_CLK_OFF(port)(1 << ((port)+10)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port) (3 << > ((port)*2)) > +#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port) ((port)*2) > 
+#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port) ((pll) << > ((port)*2)) > + > /* BXT display engine PLL */ > #define BXT_DE_PLL_CTL _MMIO(0x6d000) > #define BXT_DE_PLL_RATIO(x) (x) /* > {60,65,100} * 19.2MHz */ > diff --git a/drivers/gpu/drm/i915/intel_ddi.c > b/drivers/gpu/drm/i915/intel_ddi.c > index 0914ad9..2a901bf 100644 > --- a/drivers/gpu/drm/i915/intel_ddi.c > +++ b/drivers/gpu/drm/i915/intel_ddi.c > @@ -1621,13 +1621,27 @@ static void intel_ddi_clk_select(struct > intel_encoder *encoder, > { > struct drm_i915_private *dev_priv = to_i915(encoder- > > base.dev); > enum port port = intel_ddi_get_encoder_port(encoder); > +uint32_t val; > > if (WARN_ON(!pll)) > return; > > -if (IS_GEN9_BC(dev_priv)) { > -uint32_t val; > +if (IS_CANNONLAKE(dev_priv)) { > +/* Configure DPCLKA_CFGCR0 to map the DPLL to the > DDI. */ > +val = I915_READ(DPCLKA_CFGCR0); > +val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll->id, port); > +I915_WRITE(DPCLKA_CFGCR0, val); > >>> A question to the Atomic Lords: don't we need some sort of locking > >>> around this register since it's used by all ports/clocks? I suppose > >>> dev_priv->dpll_lock would do... > >>> > >>> Maybe the same would apply for gen9_bc. > >> If there are modesets happening in parallel for different crtcs, then some > >> locking is needed. dpll_lock seems like the right call, that's what's used > >> to > >> avoid the same problem with the enable/disable hooks. > > If something is allowing modesets to commit in parallel then probably > > the whole world is on fire. Historically connection_mutex has been there > > to protect us, but not sure how that goes with nonblocking commits. I > > do hope there's still something there to prevents this... > > During nonblocking modesets we don't hold any locks. It's still possible > that we force serialization through some other means, for example grabbing > all crtc_states might force serialization previously. But I'm not sure this > is guaranteed to happen even for SKL. 
It might happen for when DDB > allocation or cdclk changes but there's no guarantee during modeset. > > So quite likely you'll need locking here. :) Someone just need to fix things so that modesets are always serialized. I don't think anyone has actually reviewd the entire driver sufficiently to allow parallel m
Re: [Intel-gfx] [PATCH 33/67] drm/i915: Configure DPLL's for Cannonlake
On Thu, 2017-04-06 at 12:15 -0700, Rodrigo Vivi wrote:
> From: "Kahola, Mika"
> 
> DPLL's are defined in DPCLKA_CFGCR0 register (0x6C200). Let's use these
> definitions when computing dpll's for ddi ports.
> 
> v2: (Rodrigo) Remove register that was defined in another patch with
>     fixed name and more bits.
> 
> Signed-off-by: Kahola, Mika
> Signed-off-by: Rodrigo Vivi
> ---
>  drivers/gpu/drm/i915/intel_display.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_display.c
> b/drivers/gpu/drm/i915/intel_display.c
> index 87d2822..4d0ae98 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -8850,6 +8850,22 @@ static int haswell_crtc_compute_clock(struct
> intel_crtc *crtc,
>  	return 0;
>  }
>  
> +static void cannonlake_get_ddi_pll(struct drm_i915_private *dev_priv,
> +				   enum port port,
> +				   struct intel_crtc_state *pipe_config)
> +{
> +	enum intel_dpll_id id;
> +	u32 temp;
> +
> +	temp = I915_READ(DPCLKA_CFGCR0) & DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port);
> +	id = temp >> (port * 2);

Maybe use DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT which was defined in the previous
patch?

Also, might make sense to squash this with the next patch, but anyway,

Reviewed-by: Ander Conselvan de Oliveira

> +
> +	if (WARN_ON(id < SKL_DPLL0 || id > SKL_DPLL2))
> +		return;
> +
> +	pipe_config->shared_dpll = intel_get_shared_dpll_by_id(dev_priv, id);
> +}
> +
>  static void bxt_get_ddi_pll(struct drm_i915_private *dev_priv,
>  			    enum port port,
>  			    struct intel_crtc_state *pipe_config)
> @@ -9037,7 +9053,9 @@ static void haswell_get_ddi_port_state(struct
> intel_crtc *crtc,
>  
>  	port = (tmp & TRANS_DDI_PORT_MASK) >> TRANS_DDI_PORT_SHIFT;
>  
> -	if (IS_GEN9_BC(dev_priv))
> +	if (IS_CANNONLAKE(dev_priv))
> +		cannonlake_get_ddi_pll(dev_priv, port, pipe_config);
> +	else if (IS_GEN9_BC(dev_priv))
>  		skylake_get_ddi_pll(dev_priv, port, pipe_config);
>  	else if (IS_GEN9_LP(dev_priv))
>  		bxt_get_ddi_pll(dev_priv, port, pipe_config);
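To make the review comment concrete, a small standalone sketch of decoding the PLL id with the mask and shift macros together rather than an open-coded (port * 2). The macro shapes follow the DPCLKA_CFGCR0 definitions quoted in the 32/67 thread, with unsigned literals added here; the helper names and the clear-before-set step in the encoder are assumptions of this sketch, not something taken from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the quoted i915_reg.h additions, made unsigned for the sketch. */
#define DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)	(3u << ((port) * 2))
#define DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port)	((port) * 2)
#define DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port)	((uint32_t)(pll) << ((port) * 2))

/* Hypothetical helper: place a 2-bit PLL id into the field for one port. */
static uint32_t ddi_clk_sel_encode(uint32_t val, unsigned int pll,
				   unsigned int port)
{
	assert(pll <= 3);	/* the field is two bits wide */

	/* Clearing the old selection first is this sketch's own choice. */
	val &= ~DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port);
	val |= DPCLKA_CFGCR0_DDI_CLK_SEL(pll, port);
	return val;
}

/* Hypothetical helper: read the PLL id back using MASK and SHIFT as a pair. */
static unsigned int ddi_clk_sel_decode(uint32_t val, unsigned int port)
{
	return (val & DPCLKA_CFGCR0_DDI_CLK_SEL_MASK(port)) >>
		DPCLKA_CFGCR0_DDI_CLK_SEL_SHIFT(port);
}
```

Keeping the shift behind the macro means the decode stays correct if the field layout is ever redefined in one place.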
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
On Thu, 04 May 2017, Michal Wajdeczko wrote: > We are using some scratch registers in MMIO based send function. > Make their base and count flexible in preparation of upcoming > GuC firmware/hardware changes. While around, change cmd len > parameter verification from WARN_ON to GEM_BUG_ON as we don't > need this all the time. I'm not generally fond of caching the registers like this or adding _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here and there, but here it's hard to see the rationale because you do this in preparation for something that we you're not sharing. BR, Jani. > > v2: call out WARN/GEM_BUG change in the commit msg (Daniele) > > Signed-off-by: Michal Wajdeczko > Suggested-by: Daniele Ceraolo Spurio > Cc: Daniele Ceraolo Spurio > Cc: Joonas Lahtinen > Reviewed-by: Daniele Ceraolo Spurio > --- > drivers/gpu/drm/i915/intel_uc.c | 41 > ++--- > drivers/gpu/drm/i915/intel_uc.h | 7 +++ > 2 files changed, 41 insertions(+), 7 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c > index 72f49e6..9d11c42 100644 > --- a/drivers/gpu/drm/i915/intel_uc.c > +++ b/drivers/gpu/drm/i915/intel_uc.c > @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private *dev_priv) > __intel_uc_fw_fini(&dev_priv->huc.fw); > } > > +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i) > +{ > + GEM_BUG_ON(!guc->send_regs.base); > + GEM_BUG_ON(!guc->send_regs.count); > + GEM_BUG_ON(i >= guc->send_regs.count); > + > + return _MMIO(guc->send_regs.base + 4 * i); > +} > + > +static void guc_init_send_regs(struct intel_guc *guc) > +{ > + struct drm_i915_private *dev_priv = guc_to_i915(guc); > + enum forcewake_domains fw_domains = 0; > + u32 i; > + > + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0)); > + guc->send_regs.count = SOFT_SCRATCH_COUNT - 1; > + > + for (i = 0; i < guc->send_regs.count; i++) { > + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, > + guc_send_reg(guc, i), > + 
FW_REG_READ | FW_REG_WRITE); > + } > + guc->send_regs.fw_domains = fw_domains; > +} > + > static int guc_enable_communication(struct intel_guc *guc) > { > /* XXX: placeholder for alternate setup */ > + guc_init_send_regs(guc); > guc->send = intel_guc_send_mmio; > return 0; > } > @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > u32 *action, u32 len) > int i; > int ret; > > - if (WARN_ON(len < 1 || len > 15)) > - return -EINVAL; > + GEM_BUG_ON(!len); > + GEM_BUG_ON(len > guc->send_regs.count); > > mutex_lock(&guc->send_mutex); > - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER); > + intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains); > > dev_priv->guc.action_count += 1; > dev_priv->guc.action_cmd = action[0]; > > for (i = 0; i < len; i++) > - I915_WRITE(SOFT_SCRATCH(i), action[i]); > + I915_WRITE(guc_send_reg(guc, i), action[i]); > > - POSTING_READ(SOFT_SCRATCH(i - 1)); > + POSTING_READ(guc_send_reg(guc, i - 1)); > > intel_guc_notify(guc); > > @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *action, u32 len) >* Fast commands should still complete in 10us. 
>*/ > ret = __intel_wait_for_register_fw(dev_priv, > -SOFT_SCRATCH(0), > +guc_send_reg(guc, 0), > INTEL_GUC_RECV_MASK, > INTEL_GUC_RECV_MASK, > 10, 10, &status); > @@ -450,7 +477,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const u32 > *action, u32 len) > } > dev_priv->guc.action_status = status; > > - intel_uncore_forcewake_put(dev_priv, FORCEWAKE_BLITTER); > + intel_uncore_forcewake_put(dev_priv, guc->send_regs.fw_domains); > mutex_unlock(&guc->send_mutex); > > return ret; > diff --git a/drivers/gpu/drm/i915/intel_uc.h b/drivers/gpu/drm/i915/intel_uc.h > index 097289b..a37a8cc 100644 > --- a/drivers/gpu/drm/i915/intel_uc.h > +++ b/drivers/gpu/drm/i915/intel_uc.h > @@ -205,6 +205,13 @@ struct intel_guc { > uint64_t submissions[I915_NUM_ENGINES]; > uint32_t last_seqno[I915_NUM_ENGINES]; > > + /* GuC's FW specific registers used in MMIO send */ > + struct { > + u32 base; > + u32 count; > + u32 fw_domains; /* enum forcewake_domains */ > + } send_regs; > + > /* To serialize the intel_guc_send actions */ > struct mutex send_mutex; -- Jani Nikula, Intel Open Source Technology Center ___ Intel-gfx mailing list
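As a rough illustration of the base-plus-index addressing in guc_send_reg(), here is a standalone sketch with plain integers in place of i915_reg_t and assert() in place of GEM_BUG_ON(). The struct and function names, and the base value used in any call, are hypothetical; only the "base + 4 * i" arithmetic and the three sanity checks come from the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the base/count pair cached in struct intel_guc. */
struct send_regs {
	uint32_t base;	/* MMIO offset of the first scratch register */
	uint32_t count;	/* number of 32-bit registers usable for sending */
};

/* Offset of the i-th send register; registers are 4 bytes apart. */
static uint32_t send_reg_offset(const struct send_regs *regs, uint32_t i)
{
	assert(regs->base);
	assert(regs->count);
	assert(i < regs->count);

	return regs->base + 4 * i;
}
```

With a made-up base of 0x1000 and a count of 15, index 3 would land at 0x100c; the bounds check is what ties the message length to the configured count.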
Re: [Intel-gfx] [PATCH 5/5] drm/vblank: Lock down vblank->hwmode more
On Wed, May 03, 2017 at 05:09:08PM +0300, Ville Syrjälä wrote:
> On Wed, May 03, 2017 at 09:26:38AM +0200, Daniel Vetter wrote:
> > In the previous patch we've implemented hwmode tracking a la i915 for
> > the vblank timestamp calculations. But that was just the basic
> > semantics, i915 has some nice sanity checks to make sure we keep
> > getting this right. Move them over too.
> > 
> > Cc: Ville Syrjälä
> > Reviewed-by: Neil Armstrong
> > Signed-off-by: Daniel Vetter
> > ---
> >  drivers/gpu/drm/drm_irq.c            | 8 +++-
> >  drivers/gpu/drm/i915/i915_irq.c      | 10 ++
> >  drivers/gpu/drm/i915/intel_display.c | 11 ++-
> >  3 files changed, 15 insertions(+), 14 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> > index 89f0928b042a..942183a2aa3c 100644
> > --- a/drivers/gpu/drm/drm_irq.c
> > +++ b/drivers/gpu/drm/drm_irq.c
> > @@ -775,8 +775,10 @@ bool drm_calc_vbltimestamp_from_scanoutpos(struct
> > drm_device *dev,
> >  	/* If mode timing undefined, just return as no-op:
> >  	 * Happens during initial modesetting of a crtc.
> >  	 */
> > -	if (mode->crtc_clock == 0) {
> > +	if (WARN_ON(mode->crtc_clock == 0)) {
> >  		DRM_DEBUG("crtc %u: Noop due to uninitialized mode.\n", pipe);
> > +		WARN_ON(drm_drv_uses_atomic_modeset(dev));
> 
> I would make these _ONCE() otherwise the machine might end up
> practically dead.

Will do.

> > +
> >  		return false;
> >  	}
> >  
> > @@ -1338,6 +1340,10 @@ void drm_crtc_vblank_off(struct drm_crtc *crtc)
> >  		send_vblank_event(dev, e, seq, &now);
> >  	}
> >  	spin_unlock_irqrestore(&dev->event_lock, irqflags);
> > +
> > +	/* Will be reset by the modeset helpers when re-enabling the crtc by
> > +	 * calling drm_calc_timestamping_constants(). */
> > +	vblank->hwmode.crtc_clock = 0;
> >  }
> >  EXPORT_SYMBOL(drm_crtc_vblank_off);
> 
> Shouldn't we do this in drm_crtc_vblank_reset() as well?
> 
> Hmm. Except we call that after drm_calc_timestamping_constants(). I
> guess we should be able to move the reset() into
> intel_modeset_readout_hw_state(). And possibly move the vblank_on()
> call as well?

Yeah, it'd be nice to clean this stuff up some more, but there's also the
problem that legacy and new drivers call drm_calc_timestamping_constants
at opposite ends of the modeset sequence. Doing more here is a bunch more
work, maybe for the next patch series ...

I don't think we need to call it in _reset, at least at boot-up it should
be 0 already. And for s/r we already shut down the pipe on suspend, so
it's gone through this here.

With the _ONCE nit addressed (and the build breakage I've introduced in
this version fixed), ack from you on the entire series?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
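The _ONCE() request is easy to picture with a toy userspace model: a check that sits on a frequently hit path and fails every time should log once, not on every hit. The sketch below is not the kernel's WARN_ON_ONCE(); the function names are hypothetical and the "already warned" flag is passed in explicitly, whereas the kernel macro keeps that state per call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical warn-once helper: reports the first time cond is true,
 * stays quiet afterwards, and always returns cond so it can sit in an if.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	assert(warned != NULL);

	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Query a mode whose crtc_clock is 0 n times; count warnings emitted. */
static int warnings_for_queries(int n)
{
	bool warned = false;
	int crtc_clock = 0;
	int fired = 0;

	for (int i = 0; i < n; i++) {
		bool before = warned;

		if (warn_on_once(crtc_clock == 0, &warned,
				 "uninitialized mode"))
			fired += (warned && !before);
	}
	return fired;
}
```

A thousand failing queries produce a single message here, which is the behaviour being asked for on the timestamp path.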
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]
On Wed, 2017-05-03 at 12:37 +0100, Chris Wilson wrote:
> Explicitly assign the default priority, and give it a name (macro).
> 
> Signed-off-by: Chris Wilson

> 	kref_init(&ctx->ref);
> 	list_add_tail(&ctx->link, &dev_priv->context_list);
> 	ctx->i915 = dev_priv;
> +	ctx->priority = I915_PRIORITY_DFL;

I915_PRIORITY_DEFAULT would work better.

Reviewed-by: Joonas Lahtinen

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()
== Series Details ==

Series: series starting with [CI,1/3] drm/i915: Avoid the branch in computing intel_ring_space()
URL   : https://patchwork.freedesktop.org/series/23958/
State : success

== Summary ==

Series 23958v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23958/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                pass       -> DMESG-WARN (fi-snb-2600) fdo#100125
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                dmesg-warn -> PASS       (fi-byt-j1900) fdo#100652

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125
fdo#100652 https://bugs.freedesktop.org/show_bug.cgi?id=100652

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:436s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:429s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:576s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:506s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:568s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:496s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-elk-e7500     total:278  pass:221  dwarn:0   dfail:0   fail:0   skip:57  time:407s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:416s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:402s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:414s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:495s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:565s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:452s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:583s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:461s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:489s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:429s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:535s
fi-snb-2600      total:278  pass:248  dwarn:1   dfail:0   fail:0   skip:29  time:415s

1fbac016c8f2c9d4405111f3425f778d2ecdea62 drm-tip: 2017y-05m-04d-12h-52m-01s UTC integration manifest
f1c0df1 drm/i915: Micro-optimise hotpath through intel_ring_begin()
03ee0e5 drm/i915: Report the ring->space from intel_ring_update_space()
eca45ee drm/i915: Avoid the branch in computing intel_ring_space()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4622/
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On Thu, May 4, 2017 at 4:21 AM, Andy Shevchenko wrote: > acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16 > bytes. Instead we convert them to use uuid_le type. At the same time we > convert current users. > > acpi_str_to_uuid() becomes useless after the conversion and it's safe to > get rid of it. > > The conversion fixes a potential bug in int340x_thermal as well since > we have to use memcmp() on binary data. > > Cc: Rafael J. Wysocki > Cc: Mika Westerberg > Cc: Borislav Petkov > Cc: Dan Williams > Cc: Amir Goldstein > Cc: Jarkko Sakkinen > Cc: Jani Nikula > Cc: Ben Skeggs > Cc: Benjamin Tissoires > Cc: Joerg Roedel > Cc: Adrian Hunter > Cc: Yisen Zhuang > Cc: Bjorn Helgaas > Cc: Zhang Rui > Cc: Felipe Balbi > Cc: Mathias Nyman > Cc: Heikki Krogerus > Cc: Liam Girdwood > Cc: Mark Brown > Signed-off-by: Andy Shevchenko For the drivers/pci parts: Acked-by: Bjorn Helgaas > --- > drivers/acpi/acpi_extlog.c | 10 +++--- > drivers/acpi/bus.c | 29 ++-- > drivers/acpi/nfit/core.c | 40 > +++--- > drivers/acpi/nfit/nfit.h | 3 +- > drivers/acpi/utils.c | 4 +-- > drivers/char/tpm/tpm_crb.c | 9 +++-- > drivers/char/tpm/tpm_ppi.c | 20 +-- > drivers/gpu/drm/i915/intel_acpi.c | 14 +++- > drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +-- > drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c | 9 +++-- > drivers/hid/i2c-hid/i2c-hid.c | 9 +++-- > drivers/iommu/dmar.c | 11 +++--- > drivers/mmc/host/sdhci-pci-core.c | 9 +++-- > drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 > drivers/pci/pci-acpi.c | 11 +++--- > drivers/pci/pci-label.c| 4 +-- > drivers/thermal/int340x_thermal/int3400_thermal.c | 8 ++--- > drivers/usb/dwc3/dwc3-pci.c| 6 ++-- > drivers/usb/host/xhci-pci.c| 9 +++-- > drivers/usb/misc/ucsi.c| 2 +- > drivers/usb/typec/typec_wcove.c| 4 +-- > include/acpi/acpi_bus.h| 9 ++--- > include/linux/acpi.h | 4 +-- > include/linux/pci-acpi.h | 2 +- > sound/soc/intel/skylake/skl-nhlt.c | 7 ++-- > tools/testing/nvdimm/test/iomap.c | 2 +- > 
tools/testing/nvdimm/test/nfit.c | 2 +- > 27 files changed, 116 insertions(+), 156 deletions(-) > > diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c > index 502ea4dc2080..69d6140b6afa 100644 > --- a/drivers/acpi/acpi_extlog.c > +++ b/drivers/acpi/acpi_extlog.c > @@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, > unsigned long val, > > static bool __init extlog_get_l1addr(void) > { > - u8 uuid[16]; > + uuid_le uuid; > acpi_handle handle; > union acpi_object *obj; > > - acpi_str_to_uuid(extlog_dsm_uuid, uuid); > - > + if (uuid_le_to_bin(extlog_dsm_uuid, &uuid)) > + return false; > if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle))) > return false; > - if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << > EXTLOG_FN_ADDR)) > + if (!acpi_check_dsm(handle, &uuid, EXTLOG_DSM_REV, 1 << > EXTLOG_FN_ADDR)) > return false; > - obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV, > + obj = acpi_evaluate_dsm_typed(handle, &uuid, EXTLOG_DSM_REV, > EXTLOG_FN_ADDR, NULL, > ACPI_TYPE_INTEGER); > if (!obj) { > return false; > diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c > index 784bda663d16..e8130a4873e9 100644 > --- a/drivers/acpi/bus.c > +++ b/drivers/acpi/bus.c > @@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle, > pr_debug("\n"); > } > > -acpi_status acpi_str_to_uuid(char *str, u8 *uuid) > -{ > - int i; > - static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21, > - 24, 26, 28, 30, 32, 34}; > - > - if (strlen(str) != 36) > - return AE_BAD_PARAMETER; > - for (i = 0; i < 36; i++) { > - if (i == 8 || i == 13 || i == 18 || i == 23) { > - if (str[i] != '-') > - return AE_BAD_PARAMETER; > - } else if (!isxdigit(str[i])) > - return AE_BAD_PARAMETER; > - } > - for (i = 0; i < 16; i++) { > - uuid[i] = hex_to_bin(str[opc_map_to_uuid[i]]) << 4; > - uuid[i] |= hex_to_bin(str[opc_map_to_uuid[i] + 1]); > - } > -
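For anyone wondering what the removed acpi_str_to_uuid() actually did, here is a userspace sketch of the same string-to-binary mapping. The index table and the format checks are copied from the quoted hunk; hex_val(), str_to_uuid_le(), uuid_byte() and uuid_str_equal() are names invented for this example, and this is not the kernel's uuid_le_to_bin() implementation. The last helper hints at why comparing the 16 binary bytes with memcmp() is more robust than comparing strings.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit; callers validate the input beforehand. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return tolower((unsigned char)c) - 'a' + 10;
}

/* Parse "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"; 0 on success, -1 on error. */
static int str_to_uuid_le(const char *str, uint8_t uuid[16])
{
	/* String offsets of each output byte, as in the removed helper. */
	static const int map[16] = { 6, 4, 2, 0, 11, 9, 16, 14,
				     19, 21, 24, 26, 28, 30, 32, 34 };
	int i;

	if (strlen(str) != 36)
		return -1;
	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (str[i] != '-')
				return -1;
		} else if (!isxdigit((unsigned char)str[i])) {
			return -1;
		}
	}
	for (i = 0; i < 16; i++)
		uuid[i] = (uint8_t)((hex_val(str[map[i]]) << 4) |
				    hex_val(str[map[i] + 1]));
	return 0;
}

/* Spot-check helper: byte i of the parsed UUID, or -1 on malformed input. */
static int uuid_byte(const char *str, int i)
{
	uint8_t uuid[16];

	assert(i >= 0 && i < 16);
	return str_to_uuid_le(str, uuid) ? -1 : uuid[i];
}

/* Compare two UUID strings by their binary form, ignoring hex digit case. */
static int uuid_str_equal(const char *a, const char *b)
{
	uint8_t ua[16], ub[16];

	if (str_to_uuid_le(a, ua) || str_to_uuid_le(b, ub))
		return 0;
	return memcmp(ua, ub, sizeof(ua)) == 0;
}
```

The first three groups come out byte-swapped and the last two stay in string order, which is the layout the quoted table encodes.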
Re: [Intel-gfx] [PATCH 8/9] drm/i915: Stop inlining the execlists IRQ handler
Chris Wilson writes:

> As the handler is now quite complex, involving a few atomics, the cost
> of the function preamble is negligible in comparison and so we should
> leave the function out-of-line for better I$.
>
> Signed-off-by: Chris Wilson

Reviewed-by: Mika Kuoppala

> ---
>  drivers/gpu/drm/i915/i915_irq.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 86ede88daaab..8f60c8045b3e 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1353,7 +1353,7 @@ static void snb_gt_irq_handler(struct drm_i915_private
> *dev_priv,
>  		ivybridge_parity_error_irq_handler(dev_priv, gt_iir);
>  }
>  
> -static __always_inline void
> +static void
>  gen8_cs_irq_handler(struct intel_engine_cs *engine, u32 iir, int test_shift)
>  {
>  	bool tasklet = false;
> -- 
> 2.11.0
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote:
> Exploit the power-of-two ring size to compute the space across the
> wraparound using a mask rather than a if. Convert to unsigned integers
> so the operation is well defined.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=99671
> Signed-off-by: Chris Wilson
> Cc: Mika Kuoppala
> Reviewed-by: Mika Kuoppala
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 36 -
>  2 files changed, 34 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 3ce1c87dec46..e7ef04cc071b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -39,12 +39,16 @@
>   */
>  #define LEGACY_REQUEST_SIZE 200
>  
> -static int __intel_ring_space(int head, int tail, int size)
> +static unsigned int __intel_ring_space(unsigned int head,
> +				       unsigned int tail,
> +				       unsigned int size)
>  {
> -	int space = head - tail;
> -	if (space <= 0)
> -		space += size;
> -	return space - I915_RING_FREE_SPACE;
> +	/*
> +	 * "If the Ring Buffer Head Pointer and the Tail Pointer are on the
> +	 * same cacheline, the Head Pointer must not be greater than the Tail
> +	 * Pointer."
> +	 */
> +	return (head - tail - CACHELINE_BYTES) & (size - 1);

Btw, as you exploit the power-of-two ring size here, maybe it is worth
repeating

	GEM_BUG_ON(!is_power_of_2(size));

to emphasise this assumption in the code (not only in the commit message)?

-Michal
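A quick standalone check of why the power-of-two assumption matters for the masked form: "x & (size - 1)" only equals "x % size" when size is a non-zero power of two. The helper names are hypothetical, is_pow2() is a plain C rendering of the usual bit trick rather than the kernel's is_power_of_2(), and assert() stands in for GEM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for 1, 2, 4, 8, ...; false for 0 and everything else. */
static bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Wrap an offset into the ring, refusing sizes the mask cannot handle. */
static uint32_t wrap_offset(uint32_t x, uint32_t size)
{
	assert(is_pow2(size));
	return x & (size - 1);
}

/* Does masking agree with a real modulo over a few multiples of size? */
static bool mask_matches_modulo(uint32_t size)
{
	for (uint32_t x = 0; x < 4 * size; x++)
		if ((x & (size - 1)) != (x % size))
			return false;
	return true;
}
```

For a size such as 48 the mask silently gives wrong answers (16 & 47 is 0, not 16), which is the kind of mistake an assertion at the call site would catch early.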
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote:
> On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> > On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > > Add a bunch of MOCS entries for gen 9 that were missing from
> > > > > intel_mocs.
> > > > > Some of these are used by media-sdk; if these entries are missing
> > > > > the default will instead be to do everything uncached.
> > > > > 
> > > > > This patch improves media-sdk performance by up to 60%
> > > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > > testing, without regressing any other benchmarks.
> > > > 
> > > > Hey David,
> > > > 
> > > > I am testing some of the extended MOCS with Mesa and the differences I
> > > > see fit in the margins of statistical error.
> > > > 
> > > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > > everything to UNCACHED - and I saw severe performance drop.
> > > > 
> > > > So here is the question it induced:
> > > > 
> > > > Have you used the "closest neighbour" from entries available or did you
> > > > default to the UNCACHED ones? That could be the culprit.
> > > > 
> > > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > > few synthetic cases - it will require much more fine-tuning and
> > > > benchmarking before any final conclusions.
> > > 
> > > As I mentioned in the commit message, the improvements only manifest
> > > themselves for media-sdk workloads (and presumably other workloads
> > > that use the same hardware); if you see any performance regressions
> > > with these additional entries I'd be interested to know.
> > 
> > But what is being counter suggested is that there is no reason for these
> > mocs entries. If the sdk is just using mocs registers without first
> > programming them outside of the kernel abi, then it will be hitting
> > uncached memory - and then the only benefit is from simply enabling
> > cached access. The kernel ABI is minimalist for a reason, and we want to
> > know why we should be adding tables that we need to maintain forever
> > (bonus points for making that a consistent interface for hardware for
> > years to come).
> > -Chris
> 
> Thanks for rephrasing - that's exactly what I am concerned with.
> 
> Did you just use the MediaSDK as it is - meaning that MOCS entries
> beyond the set of the 3 we have defined had been naively utilized?
> 
> If that's the case it is probably the cause of the performance
> difference - everything beyond "the 3" means UNCACHED.
> 
> Can you try changing MediaSDK to only use entries that are already in?
> How does the performance differ in that case?

We're benchmarking using upstream MediaSDK without changes, since that's
the only thing that's relevant. Customising benchmarks to get better
results isn't really an acceptable solution :)

Obviously fixing MediaSDK upstream is a different story, in case one of
the three pre-defined entries we have turns out to be the best possible
MOCS-settings for that workload.

Kind regards, David
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote: > On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote: > > Exploit the power-of-two ring size to compute the space across the > > wraparound using a mask rather than an if. Convert to unsigned integers > > so the operation is well defined. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 > > Signed-off-by: Chris Wilson > > Cc: Mika Kuoppala > > Reviewed-by: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++-- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 36 > > - > > 2 files changed, 34 insertions(+), 25 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 3ce1c87dec46..e7ef04cc071b 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -39,12 +39,16 @@ > > */ > > #define LEGACY_REQUEST_SIZE 200 > > > > -static int __intel_ring_space(int head, int tail, int size) > > +static unsigned int __intel_ring_space(unsigned int head, > > + unsigned int tail, > > + unsigned int size) > > { > > - int space = head - tail; > > - if (space <= 0) > > - space += size; > > - return space - I915_RING_FREE_SPACE; > > + /* > > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the > > +* same cacheline, the Head Pointer must not be greater than the Tail > > +* Pointer." > > +*/ > > + return (head - tail - CACHELINE_BYTES) & (size - 1); > > Btw, as you exploit power-of-two ring size here, maybe it is worth repeating > > GEM_BUG_ON(!is_power_of_2(size)); > > to emphasise this assumption in the code (not only in the commit message)? I've made the cardinal sin of changing it at the last moment; if I've broken everything I'm going to blame you :) Semi-pushed, looks like we're already back in conflict territory.
-Chris -- Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [CI 1/3] drm/i915: Avoid the branch in computing intel_ring_space()
On Thu, May 04, 2017 at 04:17:13PM +0200, Michal Wajdeczko wrote: > On Thu, May 04, 2017 at 02:08:44PM +0100, Chris Wilson wrote: > > Exploit the power-of-two ring size to compute the space across the > > wraparound using a mask rather than an if. Convert to unsigned integers > > so the operation is well defined. > > > > References: https://bugs.freedesktop.org/show_bug.cgi?id=99671 > > Signed-off-by: Chris Wilson > > Cc: Mika Kuoppala > > Reviewed-by: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++-- > > drivers/gpu/drm/i915/intel_ringbuffer.h | 36 > > - > > 2 files changed, 34 insertions(+), 25 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c > > b/drivers/gpu/drm/i915/intel_ringbuffer.c > > index 3ce1c87dec46..e7ef04cc071b 100644 > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c > > @@ -39,12 +39,16 @@ > > */ > > #define LEGACY_REQUEST_SIZE 200 > > > > -static int __intel_ring_space(int head, int tail, int size) > > +static unsigned int __intel_ring_space(unsigned int head, > > + unsigned int tail, > > + unsigned int size) > > { > > - int space = head - tail; > > - if (space <= 0) > > - space += size; > > - return space - I915_RING_FREE_SPACE; > > + /* > > +* "If the Ring Buffer Head Pointer and the Tail Pointer are on the > > +* same cacheline, the Head Pointer must not be greater than the Tail > > +* Pointer." > > +*/ > > + return (head - tail - CACHELINE_BYTES) & (size - 1); > > Btw, as you exploit power-of-two ring size here, maybe it is worth repeating > > GEM_BUG_ON(!is_power_of_2(size)); > > to emphasise this assumption in the code (not only in the commit message)? I did check we had an is_power_of_2() check in intel_engine_create_ring. Might be worth asserting here as well as there's a little disconnect between the function and ring->size.
-Chris -- Chris Wilson, Intel Open Source Technology Centre
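For readers following the thread, here is a minimal userspace sketch of the two computations in the quoted diff, so the effect of the mask can be checked outside the kernel. The names ring_space_branch and ring_space_mask are hypothetical, and using 64 bytes for both I915_RING_FREE_SPACE and CACHELINE_BYTES is an assumption made for illustration; this is not the driver's code.

```c
#include <assert.h>

/* Assumed for illustration: the reserved gap is one 64-byte cacheline. */
#define RING_RESERVED_BYTES 64u

/* Branching form, modelled on the lines the quoted diff removes. */
static int ring_space_branch(int head, int tail, int size)
{
	int space = head - tail;

	if (space <= 0)
		space += size;
	return space - (int)RING_RESERVED_BYTES;
}

/* Branchless form, modelled on the added line; size must be a power of two. */
static unsigned int ring_space_mask(unsigned int head, unsigned int tail,
				    unsigned int size)
{
	/* Userspace stand-in for the suggested GEM_BUG_ON(!is_power_of_2(size)). */
	assert(size != 0 && (size & (size - 1)) == 0);

	/* Unsigned subtraction wraps modulo 2^N; the mask folds it into the ring. */
	return (head - tail - RING_RESERVED_BYTES) & (size - 1);
}
```

In this sketch the two forms agree when head equals tail, when head is behind tail, and when head leads tail by at least the reserved gap. When head leads by less than the gap, the signed form returns a negative value while the masked form wraps to a large one, so they are not interchangeable for every input.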
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote: > A good default for garbage entries from the user is to follow the > default setting of the object (i.e. the PTE). Currently they use the > uncached entry, and now the only way to accidentally hit uncached > performance is via explicit use of the uncached MOCS or setting the > object to uncached. Note that these entries are currently undefined in > the ABI and we reserve the right to change them. We originally chose > uncached to eliminate any problem with reducing the caching level in > future, but the object is a much better definition of the minimum > caching level. > > Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS") > Signed-off-by: Chris Wilson > Cc: David Weinehall > Cc: Arkadiusz Hiler > Cc: Tvrtko Ursulin > Cc: sta...@vger.kernel.org LGTM, and passes our nightly msdk test case. Tested-by: David Weinehall Reviewed-by: David Weinehall > --- > drivers/gpu/drm/i915/intel_mocs.c | 39 > +++ > 1 file changed, 15 insertions(+), 24 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_mocs.c > b/drivers/gpu/drm/i915/intel_mocs.c > index 92e461c68385..e7a7781ca457 100644 > --- a/drivers/gpu/drm/i915/intel_mocs.c > +++ b/drivers/gpu/drm/i915/intel_mocs.c > @@ -85,10 +85,7 @@ struct drm_i915_mocs_table { > * > * Entries not part of the following tables are undefined as far as > * userspace is concerned and shouldn't be relied upon. For the time > - * being they will be implicitly initialized to the strictest caching > - * configuration (uncached) to guarantee forwards compatibility with > - * userspace programs written against more recent kernels providing > - * additional MOCS entries. > + * being they will be implicitly initialized to follow the PTE. 
> * > * NOTE: These tables MUST start with being uncached and the length > * MUST be less than 63 as the last two registers are reserved > @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs > *engine) > table.table[index].control_value); > > /* > - * Ok, now set the unused entries to uncached. These entries > + * Ok, now set the unused entries to follow the PTE. These entries >* are officially undefined and no contract for the contents >* and settings is given for these entries. > - * > - * Entry 0 in the table is uncached - so we are just writing > - * that value to all the used entries. >*/ > for (; index < GEN9_NUM_MOCS_ENTRIES; index++) > I915_WRITE(mocs_register(engine->id, index), > -table.table[0].control_value); > +table.table[I915_MOCS_PTE].control_value); > > return 0; > } > @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct > drm_i915_gem_request *req, > } > > /* > - * Ok, now set the unused entries to uncached. These entries > + * Ok, now set the unused entries to follow the PTE. These entries >* are officially undefined and no contract for the contents >* and settings is given for these entries. > - * > - * Entry 0 in the table is uncached - so we are just writing > - * that value to all the used entries. >*/ > for (; index < GEN9_NUM_MOCS_ENTRIES; index++) { > *cs++ = i915_mmio_reg_offset(mocs_register(engine, index)); > - *cs++ = table->table[0].control_value; > + *cs++ = table->table[I915_MOCS_PTE].control_value; > } > > *cs++ = MI_NOOP; > @@ -355,18 +346,17 @@ static int emit_mocs_l3cc_table(struct > drm_i915_gem_request *req, > if (table->size & 0x01) { > /* Odd table size - 1 left over */ > *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); > - *cs++ = l3cc_combine(table, 2 * i, 0); > + *cs++ = l3cc_combine(table, 2 * i, I915_MOCS_PTE); > i++; > } > > /* > - * Now set the rest of the table to uncached - use entry 0 as > - * this will be uncached. 
Leave the last pair uninitialised as > - * they are reserved by the hardware. > + * Now set the rest of the table to follow the PTE. > + * Leave the last pair as they are reserved by the hardware. >*/ > for (; i < GEN9_NUM_MOCS_ENTRIES / 2; i++) { > *cs++ = i915_mmio_reg_offset(GEN9_LNCFCMOCS(i)); > - *cs++ = l3cc_combine(table, 0, 0); > + *cs++ = l3cc_combine(table, I915_MOCS_PTE, I915_MOCS_PTE); > } > > *cs++ = MI_NOOP; > @@ -402,17 +392,18 @@ void intel_mocs_init_l3cc_table(struct drm_i915_private > *dev_priv) > > /* Odd table size - 1 left over */ > if (table.size & 0x01) { > - I915_WRITE(GEN9_LNCFCMOCS(i), l3cc_combine(&table, 2*i, 0)); > + I915_WRITE(GEN9_LNCFCMOCS(i), > +l3cc_combine(&table, 2*i, I915
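The change discussed in this patch can be pictured with a small userspace sketch: defined table entries are copied as-is, and every remaining slot is pointed at the PTE entry rather than at entry 0 (uncached). The function name fill_mocs_image, the entry count of 62 and the PTE index of 1 are assumptions for illustration only, and the control values in any caller are made up; none of this is taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Both constants are assumptions for this sketch, not driver values. */
#define SKETCH_NUM_MOCS_ENTRIES 62
#define SKETCH_MOCS_PTE 1

/*
 * Build a register image from a table of n defined control values.
 * Slots past the defined entries follow the PTE entry, so an undefined
 * index inherits the object's own caching setting.
 */
static void fill_mocs_image(uint32_t *regs, const uint32_t *table, size_t n)
{
	size_t i;

	/* The table must be long enough to contain the PTE entry itself. */
	assert(n > SKETCH_MOCS_PTE);

	for (i = 0; i < n && i < SKETCH_NUM_MOCS_ENTRIES; i++)
		regs[i] = table[i];
	for (; i < SKETCH_NUM_MOCS_ENTRIES; i++)
		regs[i] = table[SKETCH_MOCS_PTE];
}
```

With a three-entry table, slots 3 onwards all receive the value stored at the PTE index, which is the behaviour the commit message describes for "garbage entries from the user".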
Re: [Intel-gfx] [PATCH 5/9] drm/i915: Use a define for the default priority [0]
On Thu, May 04, 2017 at 04:32:34PM +0300, Joonas Lahtinen wrote: > On Wed, 2017-05-03 at 12:37 +0100, Chris Wilson wrote: > > Explicitly assign the default priority, and give it a name (macro). > > > > Signed-off-by: Chris Wilson > > > > > kref_init(&ctx->ref); > > list_add_tail(&ctx->link, &dev_priv->context_list); > > ctx->i915 = dev_priv; > > + ctx->priority = I915_PRIORITY_DFL; > > I915_PRIORITY_DEFAULT would work better. On the one hand I have the symmetry with MIN, DFL, MAX, on the other hand DFL is plain bizarre. -Chris -- Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
Hi, Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get a lot of the below warnings. Things seem to work fine (in fact it seems faster in general use than previously), but it's a lot of warning spew. [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under evasion is 100 us [ 1210.063144] [drm] Atomic update on pipe (A) took 152 us, max time under evasion is 100 us [ 1272.208727] [drm] Atomic update on pipe (A) took 213 us, max time under evasion is 100 us [ 1308.106266] [drm] Atomic update on pipe (A) took 194 us, max time under evasion is 100 us [ 1308.439572] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us [ 1371.905950] [drm] Atomic update on pipe (A) took 135 us, max time under evasion is 100 us [ 1373.891378] [drm] Atomic update on pipe (A) took 202 us, max time under evasion is 100 us [ 1497.259572] [drm] Atomic update on pipe (A) took 199 us, max time under evasion is 100 us [ 1497.292922] [drm] Atomic update on pipe (A) took 178 us, max time under evasion is 100 us [ 1497.326313] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 1534.106959] [drm] Atomic update on pipe (A) took 223 us, max time under evasion is 100 us [ 1534.190331] [drm] Atomic update on pipe (A) took 180 us, max time under evasion is 100 us [ 1680.613275] [drm] Atomic update on pipe (A) took 101 us, max time under evasion is 100 us [ 1870.783352] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 2338.083752] [drm] Atomic update on pipe (A) took 225 us, max time under evasion is 100 us [ 2405.212252] [drm] Atomic update on pipe (A) took 114 us, max time under evasion is 100 us [ 2421.811125] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2426.344151] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us [ 2439.012088] [drm] Atomic update on pipe (A) took 143 us, max time under evasion is 100 us [ 
2446.011309] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us [ 2446.142622] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2446.542772] [drm] Atomic update on pipe (A) took 137 us, max time under evasion is 100 us [ 2448.243922] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2450.042450] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2456.575226] [drm] Atomic update on pipe (A) took 131 us, max time under evasion is 100 us [ 2457.275176] [drm] Atomic update on pipe (A) took 115 us, max time under evasion is 100 us [ 2464.308098] [drm] Atomic update on pipe (A) took 112 us, max time under evasion is 100 us [ 2569.418646] [drm] Atomic update on pipe (A) took 179 us, max time under evasion is 100 us [ 2572.302065] [drm] Atomic update on pipe (A) took 133 us, max time under evasion is 100 us [ 2589.933225] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2590.701810] [drm] Atomic update on pipe (A) took 175 us, max time under evasion is 100 us [ 2606.732899] [drm] Atomic update on pipe (A) took 130 us, max time under evasion is 100 us [ 2611.732710] [drm] Atomic update on pipe (A) took 147 us, max time under evasion is 100 us [ 2615.532819] [drm] Atomic update on pipe (A) took 145 us, max time under evasion is 100 us [ 2654.412509] [drm] Atomic update on pipe (A) took 157 us, max time under evasion is 100 us [ 2657.012470] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2714.341971] [drm] Atomic update on pipe (A) took 144 us, max time under evasion is 100 us [ 2775.486168] [drm] Atomic update on pipe (A) took 138 us, max time under evasion is 100 us [ 2782.852360] [drm] Atomic update on pipe (A) took 113 us, max time under evasion is 100 us [ 2795.319781] [drm] Atomic update on pipe (A) took 188 us, max time under evasion is 100 us [ 2818.601093] [drm] Atomic update on pipe (A) 
took 160 us, max time under evasion is 100 us [ 2867.998524] [drm] Atomic update on pipe (A) took 167 us, max time under evasion is 100 us [ 2878.980535] [drm] Atomic update on pipe (A) took 163 us, max time under evasion is 100 us [ 2945.607547] [drm] Atomic update on pipe (A) took 110 us, max time under evasion is 100 us [ 2957.606588] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=177768 end=177769) time 214 us, min 1431, max 1439, scanline start 1423, end 1442 [ 2958.609128] [drm] Atomic update on pipe (A) took 168 us, max time under evasion is 100 us [ 2960.059591] [drm] Atomic update on pipe (A) took 186 us, max time under evasion is 100 us [ 2960.658177] [drm] Atomic update on pipe (A) took 181 us, max time under evasion is 100 us [ 3002.688632] [drm] Atomic update on pipe (A) took 210 us, max time under evasion is 100 us [ 3021.939015] [drm] Atomic update on pipe (A) took 140 us, max time under evasion
Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API
On May 04 2017 or thereabouts, Andy Shevchenko wrote: > acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16 > bytes. Instead we convert them to use uuid_le type. At the same time we > convert current users. > > acpi_str_to_uuid() becomes useless after the conversion and it's safe to > get rid of it. > > The conversion fixes a potential bug in int340x_thermal as well since > we have to use memcmp() on binary data. > > Cc: Rafael J. Wysocki > Cc: Mika Westerberg > Cc: Borislav Petkov > Cc: Dan Williams > Cc: Amir Goldstein > Cc: Jarkko Sakkinen > Cc: Jani Nikula > Cc: Ben Skeggs > Cc: Benjamin Tissoires > Cc: Joerg Roedel > Cc: Adrian Hunter > Cc: Yisen Zhuang > Cc: Bjorn Helgaas > Cc: Zhang Rui > Cc: Felipe Balbi > Cc: Mathias Nyman > Cc: Heikki Krogerus > Cc: Liam Girdwood > Cc: Mark Brown > Signed-off-by: Andy Shevchenko > --- For i2c-hid: Acked-by: Benjamin Tissoires
[Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
From: Tvrtko Ursulin Since the scatterlist length field is an unsigned int, make sure that sg_alloc_table_from_pages does not overflow it while coalescing pages to a single entry. v2: Drop reference to future use. Use UINT_MAX. v3: max_segment must be page aligned. v4: Do not rely on compiler to optimise out the rounddown. (Joonas Lahtinen) v5: Simplified loops and use post-increments rather than pre-increments. Use PAGE_MASK and fix comment typo. (Andy Shevchenko) Signed-off-by: Tvrtko Ursulin Cc: Masahiro Yamada Cc: linux-ker...@vger.kernel.org Reviewed-by: Chris Wilson (v2) Cc: Joonas Lahtinen Cc: Andy Shevchenko --- include/linux/scatterlist.h | 6 ++ lib/scatterlist.c | 31 --- 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index c981bee1a3ae..4768eeeb7054 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -21,6 +21,12 @@ struct scatterlist { }; /* + * Since the above length field is an unsigned int, below we define the maximum + * length in bytes that can be stored in one scatterlist entry. + */ +#define SCATTERLIST_MAX_SEGMENT (UINT_MAX & PAGE_MASK) + +/* * These macros should be used after a dma_map_sg call has been done * to get bus addresses of each of the SG entries and their lengths.
* You should only work with the number of sg entries dma_map_sg diff --git a/lib/scatterlist.c b/lib/scatterlist.c index 11f172c383cb..ca4ccd8c80b9 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -394,17 +394,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, unsigned int offset, unsigned long size, gfp_t gfp_mask) { - unsigned int chunks; - unsigned int i; - unsigned int cur_page; + const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT; + unsigned int chunks, cur_page, seg_len, i; int ret; struct scatterlist *s; /* compute number of contiguous chunks */ chunks = 1; - for (i = 1; i < n_pages; ++i) - if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) - ++chunks; + seg_len = 0; + for (i = 1; i < n_pages; i++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) { + chunks++; + seg_len = 0; + } + } ret = sg_alloc_table(sgt, chunks, gfp_mask); if (unlikely(ret)) @@ -413,17 +418,21 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, /* merging chunks and putting them into the scatterlist */ cur_page = 0; for_each_sg(sgt->sgl, s, sgt->orig_nents, i) { - unsigned long chunk_size; - unsigned int j; + unsigned int j, chunk_size; /* look for the end of the current chunk */ - for (j = cur_page + 1; j < n_pages; ++j) - if (page_to_pfn(pages[j]) != + seg_len = 0; + for (j = cur_page + 1; j < n_pages; j++) { + seg_len += PAGE_SIZE; + if (seg_len >= max_segment || + page_to_pfn(pages[j]) != page_to_pfn(pages[j - 1]) + 1) break; + } chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset; - sg_set_page(s, pages[cur_page], min(size, chunk_size), offset); + sg_set_page(s, pages[cur_page], + min_t(unsigned long, size, chunk_size), offset); size -= chunk_size; offset = 0; cur_page = j; -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
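The chunk-counting loop added by this patch can be tried in userspace with a minimal sketch that works on plain page frame numbers instead of struct page pointers. The function name count_chunks and the fixed 4096-byte page size are assumptions for illustration; the loop body follows the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Count the scatterlist entries needed for n_pages page frame numbers.
 * A new entry starts when the run stops being physically contiguous or
 * when the current entry would reach max_segment bytes.
 */
static unsigned int count_chunks(const unsigned long *pfns, size_t n_pages,
				 unsigned int max_segment)
{
	unsigned int chunks = 1, seg_len = 0;
	size_t i;

	assert(n_pages > 0 && max_segment >= SKETCH_PAGE_SIZE);

	for (i = 1; i < n_pages; i++) {
		seg_len += SKETCH_PAGE_SIZE;
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			chunks++;
			seg_len = 0;
		}
	}
	return chunks;
}
```

For the frames 10, 11, 12, 13, 20, 21, a large max_segment yields two entries (split only at the gap), a limit of two pages yields three, and a limit of one page yields one entry per page.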
[Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
From: Tvrtko Ursulin Drivers like i915 benefit from being able to control the maximum size of the sg coalesced segment while building the scatter-gather list. Introduce and export the __sg_alloc_table_from_pages function which will give them that control. v2: Reorder parameters. (Chris Wilson) v3: Fix incomplete reordering in v2. v4: max_segment needs to be page aligned. v5: Rebase. v6: Rebase. Signed-off-by: Tvrtko Ursulin Cc: Masahiro Yamada Cc: linux-ker...@vger.kernel.org Cc: Chris Wilson Reviewed-by: Chris Wilson (v2) Cc: Joonas Lahtinen --- include/linux/scatterlist.h | 11 + lib/scatterlist.c | 58 +++-- 2 files changed, 52 insertions(+), 17 deletions(-) diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 4768eeeb7054..4d67a9652c7d 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *); int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int, struct scatterlist *, gfp_t, sg_alloc_fn *); int sg_alloc_table(struct sg_table *, unsigned int, gfp_t); -int sg_alloc_table_from_pages(struct sg_table *sgt, - struct page **pages, unsigned int n_pages, - unsigned int offset, unsigned long size, - gfp_t gfp_mask); +int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, + unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + gfp_t gfp_mask); +int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, + unsigned int n_pages, unsigned int offset, + unsigned long size, gfp_t gfp_mask); size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf, size_t buflen, off_t skip, bool to_buffer); diff --git a/lib/scatterlist.c b/lib/scatterlist.c index ca4ccd8c80b9..73dace1bd5bb 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask) EXPORT_SYMBOL(sg_alloc_table); /** - *
sg_alloc_table_from_pages - Allocate and initialize an sg table from - *an array of pages - * @sgt: The sg table header to use - * @pages: Pointer to an array of page pointers - * @n_pages: Number of pages in the pages array - * @offset: Offset from start of the first page to the start of a buffer - * @size: Number of valid bytes in the buffer (after offset) - * @gfp_mask: GFP allocation mask + * __sg_alloc_table_from_pages - Allocate and initialize an sg table from + * an array of pages + * @sgt:The sg table header to use + * @pages: Pointer to an array of page pointers + * @n_pages:Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size:Number of valid bytes in the buffer (after offset) + * @max_segment: Maximum size of a scatterlist node in bytes (page aligned) + * @gfp_mask: GFP allocation mask * * Description: *Allocate and initialize an sg table from a list of pages. Contiguous @@ -389,16 +390,18 @@ EXPORT_SYMBOL(sg_alloc_table); * Returns: * 0 on success, negative error on failure */ -int sg_alloc_table_from_pages(struct sg_table *sgt, - struct page **pages, unsigned int n_pages, - unsigned int offset, unsigned long size, - gfp_t gfp_mask) +int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, + unsigned int n_pages, unsigned int offset, + unsigned long size, unsigned int max_segment, + gfp_t gfp_mask) { - const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT; unsigned int chunks, cur_page, seg_len, i; int ret; struct scatterlist *s; + if (WARN_ON(!max_segment || offset_in_page(max_segment))) + return -EINVAL; + /* compute number of contiguous chunks */ chunks = 1; seg_len = 0; @@ -440,6 +443,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, return 0; } +EXPORT_SYMBOL(__sg_alloc_table_from_pages); + +/** + * sg_alloc_table_from_pages - Allocate and initialize an sg table from + *an array of pages + * @sgt:The sg table header to use + * @pages: Pointer to an array 
of page pointers + * @n_pages:Number of pages in the pages array + * @offset: Offset from start of the first page to the start of a buffer + * @size:Number of valid bytes in the buffer (after offset) + * @gfp_mask: GFP allocation mask + * + * Description: + *Allocate and initialize an sg table from a list of pages. Contiguous + *
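The max_segment check that the quoted hunk adds to __sg_alloc_table_from_pages can be illustrated with a small userspace sketch. The names sketch_check_max_segment and SKETCH_MAX_SEGMENT and the 4096-byte page size are assumptions for illustration; the kernel version uses WARN_ON and offset_in_page, which are replaced here by a plain mask test.

```c
#include <errno.h>
#include <limits.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* Largest page-aligned length that fits in an unsigned int. */
#define SKETCH_MAX_SEGMENT (UINT_MAX & ~(SKETCH_PAGE_SIZE - 1u))

/* Reject a zero or non-page-aligned segment limit, as the hunk does. */
static int sketch_check_max_segment(unsigned int max_segment)
{
	if (max_segment == 0 || (max_segment & (SKETCH_PAGE_SIZE - 1u)) != 0)
		return -EINVAL;
	return 0;
}
```

A caller with no special constraint would pass the page-aligned maximum, while a driver with a DMA limit would pass a smaller multiple of the page size.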
[Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
From: Tvrtko Ursulin Scatterlist entries have an unsigned int for the offset so correct the sg_alloc_table_from_pages function accordingly. Since these are offsets within a page, unsigned int is wide enough. Also converts callers which were using unsigned long locally with the lower_32_bits annotation to make it explicitly clear what is happening. v2: Use offset_in_page. (Chris Wilson) Signed-off-by: Tvrtko Ursulin Cc: Masahiro Yamada Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Stanislawski Cc: Matt Porter Cc: Alexandre Bounine Cc: linux-me...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Acked-by: Marek Szyprowski (v1) Reviewed-by: Chris Wilson Reviewed-by: Mauro Carvalho Chehab --- drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++-- drivers/rapidio/devices/rio_mport_cdev.c | 4 ++-- include/linux/scatterlist.h| 2 +- lib/scatterlist.c | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c index 2db0413f5d57..b5009c1649bc 100644 --- a/drivers/media/v4l2-core/videobuf2-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c @@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr, { struct vb2_dc_buf *buf; struct frame_vector *vec; - unsigned long offset; + unsigned int offset; int n_pages, i; int ret = 0; struct sg_table *sgt; @@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr, buf->dev = dev; buf->dma_dir = dma_dir; - offset = vaddr & ~PAGE_MASK; + offset = lower_32_bits(offset_in_page(vaddr)); vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE); if (IS_ERR(vec)) { ret = PTR_ERR(vec); diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c index 50b617af81bd..a8b6696ab6cb 100644 --- a/drivers/rapidio/devices/rio_mport_cdev.c +++ b/drivers/rapidio/devices/rio_mport_cdev.c @@ -876,10
+876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode, * offset within the internal buffer specified by handle parameter. */ if (xfer->loc_addr) { - unsigned long offset; + unsigned int offset; long pinned; - offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK; + offset = lower_32_bits(offset_in_page(xfer->loc_addr)); nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT; page_list = kmalloc_array(nr_pages, diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index cb3c8fe6acd7..c981bee1a3ae 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int, int sg_alloc_table(struct sg_table *, unsigned int, gfp_t); int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, unsigned int n_pages, - unsigned long offset, unsigned long size, + unsigned int offset, unsigned long size, gfp_t gfp_mask); size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf, diff --git a/lib/scatterlist.c b/lib/scatterlist.c index c6cf82242d65..11f172c383cb 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table); */ int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages, unsigned int n_pages, - unsigned long offset, unsigned long size, + unsigned int offset, unsigned long size, gfp_t gfp_mask) { unsigned int chunks; -- 2.9.3 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
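As a quick illustration of why an unsigned int is wide enough for the offset, the following userspace sketch reproduces the two calculations touched by the callers above: the offset of an address within its page, and the number of pages a buffer spans. The names sketch_offset_in_page and pages_spanned and the 4 KiB page size are assumptions for illustration, not kernel definitions.

```c
#include <stdint.h>

/* Assumed 4 KiB pages for this sketch. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE (1ul << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* The offset within a page is always below the page size, so it fits. */
static unsigned int sketch_offset_in_page(uintptr_t addr)
{
	return (unsigned int)(addr & ~SKETCH_PAGE_MASK);
}

/* Pages touched by a buffer of length bytes starting at addr. */
static unsigned long pages_spanned(uintptr_t addr, unsigned long length)
{
	unsigned long bytes = length + sketch_offset_in_page(addr);

	/* Round up to a whole number of pages, then convert to a count. */
	return (bytes + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

A buffer of 0x20 bytes starting 0x10 bytes before a page boundary spans two pages, even though it is far smaller than one page.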
[Intel-gfx] [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
From: Tvrtko Ursulin With the addition of __sg_alloc_table_from_pages we can control the maximum coallescing size and eliminate a separate path for allocating backing store here. Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") this enables more compact sg lists to be created and so has a beneficial effect on workloads with many and/or large objects of this class. v2: * Rename helper to i915_sg_segment_size and fix swiotlb override. * Commit message update. v3: * Actually include the swiotlb override fix. v4: * Regroup parameters a bit. (Chris Wilson) v5: * Rebase for swiotlb_max_segment. * Add DMA map failure handling as in abb0deacb5a6 ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping"). v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen) v7: Rebase. Signed-off-by: Tvrtko Ursulin Cc: Chris Wilson Cc: linux-ker...@vger.kernel.org Reviewed-by: Chris Wilson (v4) Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_drv.h | 15 +++ drivers/gpu/drm/i915/i915_gem.c | 6 +-- drivers/gpu/drm/i915/i915_gem_userptr.c | 79 - 3 files changed, 45 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b20ed16da0ad..320c16df1c9c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2676,6 +2676,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg) (((__iter).curr += PAGE_SIZE) < (__iter).max) || \ ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0)) +static inline unsigned int i915_sg_segment_size(void) +{ + unsigned int size = swiotlb_max_segment(); + + if (size == 0) + return SCATTERLIST_MAX_SEGMENT; + + size = rounddown(size, PAGE_SIZE); + /* swiotlb_max_segment_size can return 1 byte when it means one page. 
*/ + if (size < PAGE_SIZE) + size = PAGE_SIZE; + + return size; +} + static inline const struct intel_device_info * intel_info(const struct drm_i915_private *dev_priv) { diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f9c6b9b5002c..b2727905ef2b 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2336,7 +2336,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) struct sgt_iter sgt_iter; struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ - unsigned int max_segment; + unsigned int max_segment = i915_sg_segment_size(); int ret; gfp_t gfp; @@ -2347,10 +2347,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj) GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS); GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS); - max_segment = swiotlb_max_segment(); - if (!max_segment) - max_segment = rounddown(UINT_MAX, PAGE_SIZE); - st = kmalloc(sizeof(*st), GFP_KERNEL); if (st == NULL) return ERR_PTR(-ENOMEM); diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 58ccf8b8ca1c..d003076702ad 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -399,64 +399,42 @@ struct get_pages_work { struct task_struct *task; }; -#if IS_ENABLED(CONFIG_SWIOTLB) -#define swiotlb_active() swiotlb_nr_tbl() -#else -#define swiotlb_active() 0 -#endif - -static int -st_set_pages(struct sg_table **st, struct page **pvec, int num_pages) -{ - struct scatterlist *sg; - int ret, n; - - *st = kmalloc(sizeof(**st), GFP_KERNEL); - if (*st == NULL) - return -ENOMEM; - - if (swiotlb_active()) { - ret = sg_alloc_table(*st, num_pages, GFP_KERNEL); - if (ret) - goto err; - - for_each_sg((*st)->sgl, sg, num_pages, n) - sg_set_page(sg, pvec[n], PAGE_SIZE, 0); - } else { - ret = sg_alloc_table_from_pages(*st, pvec, num_pages, - 0, num_pages << PAGE_SHIFT, - GFP_KERNEL); - if (ret) - goto err; 
- } - - return 0; - -err: - kfree(*st); - *st = NULL; - return ret; -} - static struct sg_table * -__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj, -struct page **pvec, int num_pages) +__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj, + struct page **pvec, int num_pages) { - struct sg_table *pages; + unsigned int max_segment = i915_sg_segment_size(); + struct sg_table *st; int ret; - ret = st_set_pages(&pages, pvec, num_pages
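The clamping done by the new i915_sg_segment_size() helper can be exercised in userspace with the sketch below. The swiotlb limit is passed in as a parameter rather than read from swiotlb_max_segment(), and the name sketch_segment_size and the 4096-byte page size are assumptions for illustration.

```c
#include <limits.h>

/* Assumed page size for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Turn a swiotlb segment limit into a usable, page-aligned maximum:
 *   0             -> no limit, use the largest page-aligned unsigned int
 *   below a page  -> one page (the limit may be reported as 1)
 *   otherwise     -> the limit rounded down to a page multiple
 */
static unsigned int sketch_segment_size(unsigned int swiotlb_max)
{
	unsigned int size = swiotlb_max;

	if (size == 0)
		return UINT_MAX & ~(SKETCH_PAGE_SIZE - 1u);

	size -= size % SKETCH_PAGE_SIZE;	/* round down to a page multiple */
	if (size < SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;

	return size;
}
```

The result always satisfies the page-alignment requirement of the max_segment parameter, which is why the helper rounds down before applying the one-page floor.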
Re: [Intel-gfx] [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf
On Thu, 4 May 2017 03:09:40 + "Chen, Xiaoguang" wrote: > Hi Alex, do you have any comments for this interface? > > >-Original Message- > >From: intel-gvt-dev [mailto:intel-gvt-dev-boun...@lists.freedesktop.org] On > >Behalf Of Chen, Xiaoguang > >Sent: Wednesday, May 03, 2017 9:39 AM > >To: Gerd Hoffmann > >Cc: Tian, Kevin ; intel-gfx@lists.freedesktop.org; > >linux- > >ker...@vger.kernel.org; zhen...@linux.intel.com; alex.william...@redhat.com; > >Lv, Zhiyuan ; intel-gvt-...@lists.freedesktop.org; > >Wang, > >Zhi A > >Subject: RE: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the dmabuf > > > > > > > >>-Original Message- > >>From: Gerd Hoffmann [mailto:kra...@redhat.com] > >>Sent: Tuesday, May 02, 2017 5:51 PM > >>To: Chen, Xiaoguang > >>Cc: alex.william...@redhat.com; intel-gfx@lists.freedesktop.org; > >>intel-gvt- d...@lists.freedesktop.org; Wang, Zhi A > >>; zhen...@linux.intel.com; > >>linux-ker...@vger.kernel.org; Lv, Zhiyuan ; Tian, > >>Kevin > >>Subject: Re: [RFC PATCH 6/6] drm/i915/gvt: support QEMU getting the > >>dmabuf > >> > >>On Fr, 2017-04-28 at 17:35 +0800, Xiaoguang Chen wrote: > >>> +static size_t intel_vgpu_reg_rw_gvtg(struct intel_vgpu *vgpu, char > >>> *buf, > >>> + size_t count, loff_t *ppos, bool iswrite) { > >>> + unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - > >>> + VFIO_PCI_NUM_REGIONS; > >>> + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK; > >>> + int fd; > >>> + > >>> + if (pos >= vgpu->vdev.region[i].size || iswrite) { > >>> + gvt_vgpu_err("invalid op or offset for Intel vgpu fd > >>> region\n"); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + fd = anon_inode_getfd("gvtg", &intel_vgpu_gvtg_ops, vgpu, > >>> + O_RDWR | O_CLOEXEC); > >>> + if (fd < 0) { > >>> + gvt_vgpu_err("create intel vgpu fd failed:%d\n", fd); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + count = min(count, (size_t)(vgpu->vdev.region[i].size - pos)); > >>> + memcpy(buf, &fd, count); > >>> + > >>> + return count; > >>> +} > >> > >>Hmm, that looks like a 
rather strange way to return a file descriptor.
> >>
> >>What is the reason to not use ioctls on the vfio file handle, like
> >>older version of these patches did?
> >If I understood correctly, Alex prefers not to change the ioctls on the
> >vfio file handle like the old version did.
> >So I used this way, the smallest change to the general vfio framework: only
> >adding a subregion definition.

I think I was hoping we could avoid a separate file descriptor altogether and use a vfio region instead. However, it was explained previously why this really needs to be a separate fd and I agree that using a region to expose an fd is really awkward. If we're going to have a separate fd, let's use a device specific ioctl to get it. Thanks,

Alex
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 2/3] drm/i915/guc: Make scratch register base and count flexible
On Thu, May 04, 2017 at 04:22:15PM +0300, Jani Nikula wrote:
> On Thu, 04 May 2017, Michal Wajdeczko wrote:
> > We are using some scratch registers in MMIO based send function.
> > Make their base and count flexible in preparation of upcoming
> > GuC firmware/hardware changes. While around, change cmd len
> > parameter verification from WARN_ON to GEM_BUG_ON as we don't
> > need this all the time.
>
> I'm not generally fond of caching the registers like this or adding
> _MMIO() wrapping outside of i915_reg.h. Sure, we have some of that here
> and there, but here it's hard to see the rationale because you do this
> in preparation for something that you're not sharing.
>

I can't share details atm, but as the commit message says, there will be a change in both offsets and number of scratch registers. Imho any wrapping around these values can't go to the i915_[guc_]reg.h file as that file shall include only raw MMIO definitions, without any extra logic that is based on GEN or PLATFORM or FW version.

An alternate approach would be, thanks to the already defined virtual function send(), to create new send_mmio function(s) that will be 100% the same as the old send_mmio except for the offset and count of the scratch registers. Then we can benefit from the most optimal implementation per GEN|PLATFORM|FW that can run without reading cached regs offsets/count, but at the cost of extra code that needs to be maintained to stay in sync with the original function. And then someone else can point out that we missed a code sharing opportunity.

I'm afraid there is no clear winner.

-Michal

> BR,
> Jani.
> > > > > v2: call out WARN/GEM_BUG change in the commit msg (Daniele) > > > > Signed-off-by: Michal Wajdeczko > > Suggested-by: Daniele Ceraolo Spurio > > Cc: Daniele Ceraolo Spurio > > Cc: Joonas Lahtinen > > Reviewed-by: Daniele Ceraolo Spurio > > --- > > drivers/gpu/drm/i915/intel_uc.c | 41 > > ++--- > > drivers/gpu/drm/i915/intel_uc.h | 7 +++ > > 2 files changed, 41 insertions(+), 7 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_uc.c > > b/drivers/gpu/drm/i915/intel_uc.c > > index 72f49e6..9d11c42 100644 > > --- a/drivers/gpu/drm/i915/intel_uc.c > > +++ b/drivers/gpu/drm/i915/intel_uc.c > > @@ -260,9 +260,36 @@ void intel_uc_fini_fw(struct drm_i915_private > > *dev_priv) > > __intel_uc_fw_fini(&dev_priv->huc.fw); > > } > > > > +static inline i915_reg_t guc_send_reg(struct intel_guc *guc, u32 i) > > +{ > > + GEM_BUG_ON(!guc->send_regs.base); > > + GEM_BUG_ON(!guc->send_regs.count); > > + GEM_BUG_ON(i >= guc->send_regs.count); > > + > > + return _MMIO(guc->send_regs.base + 4 * i); > > +} > > + > > +static void guc_init_send_regs(struct intel_guc *guc) > > +{ > > + struct drm_i915_private *dev_priv = guc_to_i915(guc); > > + enum forcewake_domains fw_domains = 0; > > + u32 i; > > + > > + guc->send_regs.base = i915_mmio_reg_offset(SOFT_SCRATCH(0)); > > + guc->send_regs.count = SOFT_SCRATCH_COUNT - 1; > > + > > + for (i = 0; i < guc->send_regs.count; i++) { > > + fw_domains |= intel_uncore_forcewake_for_reg(dev_priv, > > + guc_send_reg(guc, i), > > + FW_REG_READ | FW_REG_WRITE); > > + } > > + guc->send_regs.fw_domains = fw_domains; > > +} > > + > > static int guc_enable_communication(struct intel_guc *guc) > > { > > /* XXX: placeholder for alternate setup */ > > + guc_init_send_regs(guc); > > guc->send = intel_guc_send_mmio; > > return 0; > > } > > @@ -407,19 +434,19 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > > u32 *action, u32 len) > > int i; > > int ret; > > > > - if (WARN_ON(len < 1 || len > 15)) > > - return -EINVAL; > > + 
GEM_BUG_ON(!len); > > + GEM_BUG_ON(len > guc->send_regs.count); > > > > mutex_lock(&guc->send_mutex); > > - intel_uncore_forcewake_get(dev_priv, FORCEWAKE_BLITTER); > > + intel_uncore_forcewake_get(dev_priv, guc->send_regs.fw_domains); > > > > dev_priv->guc.action_count += 1; > > dev_priv->guc.action_cmd = action[0]; > > > > for (i = 0; i < len; i++) > > - I915_WRITE(SOFT_SCRATCH(i), action[i]); > > + I915_WRITE(guc_send_reg(guc, i), action[i]); > > > > - POSTING_READ(SOFT_SCRATCH(i - 1)); > > + POSTING_READ(guc_send_reg(guc, i - 1)); > > > > intel_guc_notify(guc); > > > > @@ -428,7 +455,7 @@ int intel_guc_send_mmio(struct intel_guc *guc, const > > u32 *action, u32 len) > > * Fast commands should still complete in 10us. > > */ > > ret = __intel_wait_for_register_fw(dev_priv, > > - SOFT_SCRATCH(0), > > + guc_send_reg(guc, 0), > >INTEL_GUC_RECV_MASK, > >INTEL_GUC_RECV_MASK, > >10, 10, &status); > > @@ -450,7 +477,7 @@ i
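The register arithmetic in the quoted patch can be tried outside the kernel. The following is only a userspace sketch of what `guc_send_reg()` and the new length checks appear to compute: the `send_regs` struct and both helper names are hypothetical, and failed checks are reported through a return value here, whereas the patch uses `GEM_BUG_ON`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the send_regs fields in the patch. */
struct send_regs {
	uint32_t base;   /* byte offset of the first scratch register */
	uint32_t count;  /* number of scratch registers usable for sending */
};

/*
 * Compute the byte offset of scratch register i, mirroring
 * "base + 4 * i" from the patch. Returns false where the patch
 * would trigger GEM_BUG_ON (unset base/count or index out of range).
 */
static bool send_reg_offset(const struct send_regs *regs, uint32_t i,
			    uint32_t *offset)
{
	assert(regs != NULL && offset != NULL);

	if (regs->base == 0 || regs->count == 0 || i >= regs->count)
		return false;

	*offset = regs->base + 4u * i;
	return true;
}

/* A command length is acceptable when 1 <= len <= count. */
static bool send_len_valid(const struct send_regs *regs, uint32_t len)
{
	assert(regs != NULL);

	return len >= 1 && len <= regs->count;
}
```

With `count` set to 15 this accepts the same lengths as the old `len < 1 || len > 15` check; the base value used when calling it is up to the caller and is not taken from the driver headers.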
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote: > On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote: > > Thanks for rephrasing - that's exactly what I am concerned with. > > > > Did you just use the MediaSDK as it is - meaning that MOCS entries > > beyond the set of the 3 we have defined had been naively utilized? > > > > If that's the case it is probably the cause of the performance > > difference - everything beyond "the 3" means UNCACHED. > > > > Can you try changing MediaSDK to only use entries that are already in? > > How the performance differs in that case? > > We're benchmarking using upstream MediaSDK without changes, since that's > the only thing that's relevant. Customising benchmarks to get better > results isn't really an acceptable solution :) > > Obviously fixing MediaSDK upstream is a different story, in case one of > the three pre-defined entries we have turns out to be the best possible > MOCS-settings for that workload. You're right about customizing benchmarks, but... MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace driver produced by Intel engineers, one which Intel has the full capability to change. What you're saying is that Intel's MediaSDK engineers are unwilling to change their software to provide better performance for their Linux users. That's pretty mental. We don't warp the core operating system to work around userspace software simply because they don't want to change it. This isn't about open vs. closed or internal vs. public projects, either. I work on a public userspace driver for Intel graphics. If I sent a kernel patch, the kernel developers would ask me the exact same questions, to justify my new additions: 1. Is your userspace actually using all these new additions? If not, which ones are you using? 
They would ask me to drop anything I wasn't actually using yet, because speculatively adding things to the kernel that we have to maintain backwards compatibility for has caused both kernel and userspace developers a lot of trouble. 2. Are you sure that you need them all? Is there a simpler solution - are some existing things good enough? What's the additional benefit of each new addition? I would have to answer these questions to the satisfaction of the kernel developers before they would even consider taking my patch. You keep pointing to your large performance improvement, but all it's shown is that actually using the GPU cache is faster than having a broken userspace driver explicitly set everything to uncached. Many people have pointed this out. Arek and Tvrtko have good suggestions. I don't think you're going to get anywhere with this until you demonstrate that the new MOCS entries provide some non-zero value over using the existing WB entry. Here are a couple more data points: 1. We likely can't implement the documented "MOCS Version 1" table as is. The kernel exposes existing entries with specific semantics. Changing their meaning would introduce a backwards-incompatible change that would likely regress the performance of existing userspace. This is almost certainly unacceptable - our customers, distro partners, users, and even people like Linus Torvalds will suffer and complain loudly. We could add the new entries at an offset - i.e. leave the existing 3 entries, and append the rest after that. But that would require changing userspace that assumes the Windows tables, such as MediaSDK (they would have to add 3 to their MOCS indexes). At which point, we're changing them, so...the "runs unaltered" argument falls over. 2. The docs finally contain "recommended MOCS settings" - i.e. where to cache various types of objects, and at what age. 
However, I believe those recommendations can be implemented with 1-2 new table entries and a PTE change to be eLLC-only by default. Most of the table is completely unnecessary to implement the recommendations. I personally would like to try implementing their recommended settings in my driver. I have not had time yet, but plan to try. I'm very glad to see the Windows MOCS recommendations documented. I'd been asking for that information for literally years. If we'd gotten it earlier, a lot of mess could have been avoided. For future platforms, we may want to coordinate and use the same table. But Gen9 has been shipping for ages, and we don't have that luxury. --Ken
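The "append at an offset" option from the first data point can be pictured with a tiny index translation. This is a sketch only: the constant and both helpers are hypothetical names, and the value 3 is simply the number of existing entries mentioned in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption from the message: the kernel already exposes three entries. */
#define EXISTING_MOCS_ENTRIES 3u

/* True if the index refers to one of the entries that must keep its meaning. */
static bool is_existing_mocs_index(uint32_t kernel_index)
{
	return kernel_index < EXISTING_MOCS_ENTRIES;
}

/*
 * Translate an index into the newly documented table to the index it
 * would get if that table were appended after the existing entries.
 */
static uint32_t appended_mocs_index(uint32_t new_table_index)
{
	assert(new_table_index <= UINT32_MAX - EXISTING_MOCS_ENTRIES);

	return EXISTING_MOCS_ENTRIES + new_table_index;
}
```

Under this scheme existing userspace keeps indexes 0 to 2 unchanged, while userspace written against the other table has to apply the translation, which is the change to MediaSDK that the message points out.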
[Intel-gfx] [PATCH 1/9] drm/i915: Replace ten seq_puts() calls by seq_putc()
From: Markus Elfring Date: Thu, 4 May 2017 11:04:45 +0200 Some single characters should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index d689e511744e..f2bda699749a 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -190,7 +190,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) seq_printf(m, " , fence: %d%s", vma->fence->id, i915_gem_active_isset(&vma->last_fence) ? "*" : ""); - seq_puts(m, ")"); + seq_putc(m, ')'); } if (obj->stolen) seq_printf(m, " (stolen: %08llx)", obj->stolen->start); @@ -2689,7 +2689,7 @@ static int i915_edp_psr_status(struct seq_file *m, void *data) (stat[pipe] == VLV_EDP_PSR_ACTIVE_SF_UPDATE)) seq_printf(m, " pipe %c", pipe_name(pipe)); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); /* * VLV/CHV PSR has no kind of performance counter @@ -3176,7 +3176,7 @@ static void intel_scaler_info(struct seq_file *m, struct intel_crtc *intel_crtc) seq_printf(m, ", scalers[%d]: use=%s, mode=%x", i, yesno(sc->in_use), sc->mode); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } else { seq_puts(m, "\tNo scalers available on this platform\n"); } @@ -3384,8 +3384,7 @@ static int i915_engine_info(struct seq_file *m, void *unused) w->tsk->comm, w->tsk->pid, w->seqno); } spin_unlock_irq(&b->rb_lock); - - seq_puts(m, "\n"); + seq_putc(m, '\n'); } intel_runtime_pm_put(dev_priv); @@ -3629,7 +3628,7 @@ static void drrs_status_per_crtc(struct seq_file *m, /* DRRS not supported. 
Print the VBT parameter*/ seq_puts(m, "\tDRRS Supported : No"); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } static int i915_drrs_status(struct seq_file *m, void *unused) @@ -3764,12 +3763,11 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data) if (connector->status == connector_status_connected && connector->encoder != NULL) { intel_dp = enc_to_intel_dp(connector->encoder); - if (intel_dp->compliance.test_active) - seq_puts(m, "1"); - else - seq_puts(m, "0"); - } else - seq_puts(m, "0"); + seq_putc(m, +intel_dp->compliance.test_active ? '1' : '0'); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(&conn_iter); @@ -3823,8 +3821,9 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data) seq_printf(m, "bpc: %u\n", intel_dp->compliance.test_data.bpc); } - } else - seq_puts(m, "0"); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(&conn_iter); @@ -3864,8 +3863,9 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data) connector->encoder != NULL) { intel_dp = enc_to_intel_dp(connector->encoder); seq_printf(m, "%02lx", intel_dp->compliance.test_type); - } else - seq_puts(m, "0"); + } else { + seq_putc(m, '0'); + } } drm_connector_list_iter_end(&conn_iter); -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/9] drm/i915: Combine five seq_printf() calls in i915_display_info()
From: Markus Elfring Date: Thu, 4 May 2017 13:17:10 +0200 Some text was put into a sequence by separate function calls. Print the same data by two single function calls instead. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index f2bda699749a..4adf96be9146 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -3191,8 +3191,7 @@ static int i915_display_info(struct seq_file *m, void *unused) struct drm_connector_list_iter conn_iter; intel_runtime_pm_get(dev_priv); - seq_printf(m, "CRTC info\n"); - seq_printf(m, "-\n"); + seq_puts(m, "CRTC info\n-\n"); for_each_intel_crtc(dev, crtc) { bool active; struct intel_crtc_state *pipe_config; @@ -3226,9 +3225,7 @@ static int i915_display_info(struct seq_file *m, void *unused) drm_modeset_unlock(&crtc->base.mutex); } - seq_printf(m, "\n"); - seq_printf(m, "Connector info\n"); - seq_printf(m, "--\n"); + seq_puts(m, "\nConnector info\n--\n"); mutex_lock(&dev->mode_config.mutex); drm_connector_list_iter_begin(dev, &conn_iter); drm_for_each_connector_iter(connector, &conn_iter) -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/9] drm/i915: Add spaces for better code readability
From: Markus Elfring Date: Thu, 4 May 2017 14:04:38 +0200 Use space characters at some source code places according to the Linux coding style convention. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index d9c699d7245e..6f3119d40c50 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2358,7 +2358,7 @@ static int i915_llc(struct seq_file *m, void *data) seq_printf(m, "LLC: %s\n", yesno(HAS_LLC(dev_priv))); seq_printf(m, "%s: %lluMB\n", edram ? "eDRAM" : "eLLC", - intel_uncore_edram_size(dev_priv)/1024/1024); + intel_uncore_edram_size(dev_priv) / 1024 / 1024); return 0; } @@ -4502,7 +4502,7 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, { int s_max = 3, ss_max = 4; int s, ss; - u32 s_reg[s_max], eu_reg[2*s_max], eu_mask[2]; + u32 s_reg[s_max], eu_reg[2 * s_max], eu_mask[2]; /* BXT has a single slice and at most 3 subslices. 
*/ if (IS_GEN9_LP(dev_priv)) { @@ -4512,8 +4512,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, for (s = 0; s < s_max; s++) { s_reg[s] = I915_READ(GEN9_SLICE_PGCTL_ACK(s)); - eu_reg[2*s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s)); - eu_reg[2*s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s)); + eu_reg[2 * s] = I915_READ(GEN9_SS01_EU_PGCTL_ACK(s)); + eu_reg[2 * s + 1] = I915_READ(GEN9_SS23_EU_PGCTL_ACK(s)); } eu_mask[0] = GEN9_PGCTL_SSA_EU08_ACK | @@ -4547,8 +4547,8 @@ static void gen9_sseu_device_status(struct drm_i915_private *dev_priv, sseu->subslice_mask |= BIT(ss); } - eu_cnt = 2 * hweight32(eu_reg[2*s + ss/2] & - eu_mask[ss%2]); + eu_cnt = 2 * hweight32(eu_reg[2 * s + ss / 2] & + eu_mask[ss % 2]); sseu->eu_total += eu_cnt; sseu->eu_per_subslice = max_t(unsigned int, sseu->eu_per_subslice, -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
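The index expressions that this patch respaces are easy to misread, so here is a small userspace sketch of what they select. Helper names are hypothetical, `popcount32()` stands in for the kernel's `hweight32()`, and the mask values used when calling it are up to the caller rather than the real register bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Count set bits; userspace stand-in for hweight32(). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;  /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Two EU registers per slice; each covers a pair of subslices. */
static unsigned int eu_reg_index(unsigned int s, unsigned int ss)
{
	return 2 * s + ss / 2;
}

/* Even and odd subslices use different masks within one register. */
static unsigned int eu_mask_index(unsigned int ss)
{
	return ss % 2;
}

/* Mirrors "2 * hweight32(eu_reg[2 * s + ss / 2] & eu_mask[ss % 2])". */
static unsigned int eu_count(const uint32_t *eu_reg, const uint32_t *eu_mask,
			     unsigned int s, unsigned int ss)
{
	assert(eu_reg != NULL && eu_mask != NULL);

	return 2 * popcount32(eu_reg[eu_reg_index(s, ss)] &
			      eu_mask[eu_mask_index(ss)]);
}
```

With `s_max = 3` and `ss_max = 4` as in the quoted function, the largest register index is `2 * 2 + 3 / 2 = 5`, which fits the `eu_reg[2 * s_max]` array.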
[Intel-gfx] [PATCH 7/9] drm/i915: Combine substrings for a message in gen6_drpc_info()
From: Markus Elfring Date: Thu, 4 May 2017 14:15:00 +0200 The script "checkpatch.pl" pointed information out like the following. WARNING: quoted string split across lines Thus fix the affected source code place. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 6f3119d40c50..dbd52ea89fb4 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -1529,8 +1529,8 @@ static int gen6_drpc_info(struct seq_file *m) forcewake_count = READ_ONCE(dev_priv->uncore.fw_domain[FW_DOMAIN_ID_RENDER].wake_count); if (forcewake_count) { - seq_puts(m, "RC information inaccurate because somebody " - "holds a forcewake reference \n"); + seq_puts(m, +"RC information inaccurate because somebody holds a forcewake reference.\n"); } else { /* NB: we cannot use forcewake, else we read the wrong values */ while (count++ < 50 && (I915_READ_NOTRACE(FORCEWAKE_ACK) & 1)) -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
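For readers unfamiliar with the checkpatch warning: adjacent string literals are concatenated by the C compiler, so joining a split literal does not change the resulting string by itself. The sketch below checks that in userspace, using the message text from the diff above; the variable and helper names are hypothetical. It also shows that the string in the patch as posted is not byte-identical to the old one, because the trailing " \n" becomes ".\n".

```c
#include <assert.h>
#include <string.h>

/* The message as it was written before the patch: two adjacent literals. */
static const char split_msg[] =
	"RC information inaccurate because somebody "
	"holds a forcewake reference \n";

/* The same text written as one literal. */
static const char joined_msg[] =
	"RC information inaccurate because somebody holds a forcewake reference \n";

/* The text as it appears on the added line of the patch. */
static const char patched_msg[] =
	"RC information inaccurate because somebody holds a forcewake reference.\n";

/* Returns nonzero when both strings have identical contents. */
static int same_text(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	return strcmp(a, b) == 0;
}
```

Comparing `split_msg` with `joined_msg` reports identical text, while comparing either with `patched_msg` does not, so the patch changes the debugfs output slightly in addition to silencing the warning.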
[Intel-gfx] [PATCH 3/9] drm/i915: Replace 14 seq_printf() calls by seq_puts()
From: Markus Elfring Date: Thu, 4 May 2017 13:20:47 +0200 Some strings which did not contain data format specifications should be put into a sequence. Thus use the corresponding function "seq_puts". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 4adf96be9146..296108464f2b 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -149,7 +149,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj) } seq_printf(m, " (pinned x %d)", pin_count); if (obj->pin_display) - seq_printf(m, " (display)"); + seq_puts(m, " (display)"); list_for_each_entry(vma, &obj->vma_list, obj_link) { if (!drm_mm_node_allocated(&vma->node)) continue; @@ -581,8 +581,10 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) intel_engine_last_submit(engine), intel_engine_get_seqno(engine), i915_gem_request_completed(work->flip_queued_req)); - } else - seq_printf(m, "Flip not associated with any ring\n"); + } else { + seq_puts(m, +"Flip not associated with any ring\n"); + } seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n", work->flip_queued_vblank, work->flip_ready_vblank, @@ -2048,7 +2050,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused) int ret; if (!i915.enable_execlists) { - seq_printf(m, "Logical Ring Contexts are disabled\n"); + seq_puts(m, "Logical Ring Contexts are disabled\n"); return 0; } @@ -2402,7 +2404,7 @@ static int i915_guc_load_status_info(struct seq_file *m, void *data) if (!HAS_GUC_UCODE(dev_priv)) return 0; - seq_printf(m, "GuC firmware status:\n"); + seq_puts(m, "GuC firmware status:\n"); seq_printf(m, "\tpath: %s\n", guc_fw->path); seq_printf(m, "\tfetch: %s\n", @@ -2510,7 +2512,7 @@ static int i915_guc_info(struct seq_file *m, void *data) 
return 0; } - seq_printf(m, "Doorbell map:\n"); + seq_puts(m, "Doorbell map:\n"); seq_printf(m, "\t%*pb\n", GUC_NUM_DOORBELLS, guc->doorbell_bitmap); seq_printf(m, "Doorbell next cacheline: 0x%x\n\n", guc->db_cacheline); @@ -2521,7 +2523,7 @@ static int i915_guc_info(struct seq_file *m, void *data) seq_printf(m, "GuC last action error code: %d\n", guc->action_err); total = 0; - seq_printf(m, "\nGuC submissions:\n"); + seq_puts(m, "\nGuC submissions:\n"); for_each_engine(engine, dev_priv, id) { u64 submissions = guc->submissions[id]; total += submissions; @@ -2795,7 +2797,7 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused) seq_printf(m, "Usage count: %d\n", atomic_read(&dev_priv->drm.dev->power.usage_count)); #else - seq_printf(m, "Device Power Management (CONFIG_PM) disabled\n"); + seq_puts(m, "Device Power Management (CONFIG_PM) disabled\n"); #endif seq_printf(m, "PCI device power state: %s [%d]\n", pci_power_name(pdev->current_state), @@ -2914,7 +2916,7 @@ static void intel_encoder_info(struct seq_file *m, drm_get_connector_status_name(connector->status)); if (connector->status == connector_status_connected) { struct drm_display_mode *mode = &crtc->mode; - seq_printf(m, ", mode:\n"); + seq_puts(m, ", mode:\n"); intel_seq_print_mode(m, 2, mode); } else { seq_putc(m, '\n'); @@ -2945,7 +2947,7 @@ static void intel_panel_info(struct seq_file *m, struct intel_panel *panel) { struct drm_display_mode *mode = panel->fixed_mode; - seq_printf(m, "\tfixed mode:\n"); + seq_puts(m, "\tfixed mode:\n"); intel_seq_print_mode(m, 2, mode); } @@ -3038,7 +3040,7 @@ static void intel_connector_info(struct seq_file *m, break; } - seq_printf(m, "\tmodes:\n"); + seq_puts(m, "\tmodes:\n"); list_for_each_entry(mode, &connector->modes, head) intel_seq_print_mode(m, 2, mode); } @@ -3266,9 +3268,7 @@ static int i915_engine_info(struc
[Intel-gfx] [PATCH 4/9] drm/i915: Delete unnecessary braces in three functions
From: Markus Elfring Date: Thu, 4 May 2017 13:40:53 +0200 Do not use curly brackets at some source code places where a single statement should be sufficient. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 19 --- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 296108464f2b..bf9a2e8d8c16 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -565,13 +565,13 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) u32 addr; pending = atomic_read(&work->pending); - if (pending) { + if (pending) seq_printf(m, "Flip ioctl preparing on pipe %c (plane %c)\n", pipe, plane); - } else { + else seq_printf(m, "Flip pending (waiting for vsync) on pipe %c (plane %c)\n", pipe, plane); - } + if (work->flip_queued_req) { struct intel_engine_cs *engine = work->flip_queued_req->engine; @@ -3130,13 +3130,11 @@ static void intel_plane_info(struct seq_file *m, struct intel_crtc *intel_crtc) } state = plane->state; - - if (state->fb) { + if (state->fb) drm_get_format_name(state->fb->format->format, &format_name); - } else { + else sprintf(format_name.str, "N/A"); - } seq_printf(m, "\t--Plane id %d: type=%s, crtc_pos=%4dx%4d, crtc_size=%4dx%4d, src_pos=%d.%04ux%d.%04u, src_size=%d.%04ux%d.%04u, format=%s, rotation=%s\n", plane->base.id, @@ -4636,13 +4634,12 @@ static int i915_sseu_status(struct seq_file *m, void *unused) intel_runtime_pm_get(dev_priv); - if (IS_CHERRYVIEW(dev_priv)) { + if (IS_CHERRYVIEW(dev_priv)) cherryview_sseu_device_status(dev_priv, &sseu); - } else if (IS_BROADWELL(dev_priv)) { + else if (IS_BROADWELL(dev_priv)) broadwell_sseu_device_status(dev_priv, &sseu); - } else if (INTEL_GEN(dev_priv) >= 9) { + else if (INTEL_GEN(dev_priv) >= 9) gen9_sseu_device_status(dev_priv, &sseu); - } intel_runtime_pm_put(dev_priv); -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
[Intel-gfx] [PATCH 8/9] drm/i915: Replace a seq_puts() call by seq_putc() in two functions
From: Markus Elfring Date: Thu, 4 May 2017 14:23:32 +0200 Two single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 2aa6b97fd22f..9f64dc3f2d05 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1254,7 +1254,7 @@ static void gen8_dump_pdp(struct i915_hw_ppgtt *ppgtt, else seq_puts(m, " SCRATCH "); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } kunmap_atomic(pt_vaddr); } @@ -1437,7 +1437,7 @@ static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m) else seq_puts(m, " SCRATCH "); } - seq_puts(m, "\n"); + seq_putc(m, '\n'); } kunmap_atomic(pt_vaddr); } -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 9/9] drm/i915: Combine substrings for two messages in i915_ggtt_probe_hw()
From: Markus Elfring Date: Thu, 4 May 2017 14:30:37 +0200 The script "checkpatch.pl" pointed information out like the following. WARNING: quoted string split across lines Thus fix the affected source code place. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 9f64dc3f2d05..508431f42b65 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -2905,16 +2905,14 @@ int i915_ggtt_probe_hw(struct drm_i915_private *dev_priv) } if ((ggtt->base.total - 1) >> 32) { - DRM_ERROR("We never expected a Global GTT with more than 32bits" - " of address space! Found %lldM!\n", + DRM_ERROR("We never expected a Global GTT with more than 32bits of address space! Found %lldM!\n", ggtt->base.total >> 20); ggtt->base.total = 1ULL << 32; ggtt->mappable_end = min(ggtt->mappable_end, ggtt->base.total); } if (ggtt->mappable_end > ggtt->base.total) { - DRM_ERROR("mappable aperture extends past end of GGTT," - " aperture=%llx, total=%llx\n", + DRM_ERROR("mappable aperture extends past end of GGTT, aperture=%llx, total=%llx\n", ggtt->mappable_end, ggtt->base.total); ggtt->mappable_end = ggtt->base.total; } -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/9] drm/i915: Adjust seven checks for null pointers
From: Markus Elfring Date: Thu, 4 May 2017 13:52:19 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The script “checkpatch.pl” pointed information out like the following. Comparison to NULL could be written … Thus fix affected source code places. Signed-off-by: Markus Elfring --- drivers/gpu/drm/i915/i915_debugfs.c | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index bf9a2e8d8c16..d9c699d7245e 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -242,7 +242,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (count == total) break; - if (obj->stolen == NULL) + if (!obj->stolen) continue; objects[count++] = obj; @@ -254,7 +254,7 @@ static int i915_gem_stolen_list_info(struct seq_file *m, void *data) if (count == total) break; - if (obj->stolen == NULL) + if (!obj->stolen) continue; objects[count++] = obj; @@ -557,7 +557,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data) spin_lock_irq(&dev->event_lock); work = crtc->flip_work; - if (work == NULL) { + if (!work) { seq_printf(m, "No flip due on pipe %c (plane %c)\n", pipe, plane); } else { @@ -3717,7 +3717,7 @@ static ssize_t i915_displayport_test_active_write(struct file *file, continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); status = kstrtoint(input_buffer, 10, &val); if (status < 0) @@ -3756,7 +3756,7 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); seq_putc(m, intel_dp->compliance.test_active ? 
'1' : '0'); @@ -3801,7 +3801,7 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); if (intel_dp->compliance.test_type == DP_TEST_LINK_EDID_READ) @@ -3855,7 +3855,7 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data) continue; if (connector->status == connector_status_connected && - connector->encoder != NULL) { + connector->encoder) { intel_dp = enc_to_intel_dp(connector->encoder); seq_printf(m, "%02lx", intel_dp->compliance.test_type); } else { -- 2.12.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
== Series Details ==

Series: series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
URL   : https://patchwork.freedesktop.org/series/23969/
State : success

== Summary ==

Series 23969v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/23969/revisions/1/mbox/

Test gem_exec_suspend:
        Subgroup basic-s4-devices:
                dmesg-warn -> PASS       (fi-kbl-7560u) fdo#100125

fdo#100125 https://bugs.freedesktop.org/show_bug.cgi?id=100125

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:430s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:572s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:513s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:552s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:486s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:481s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:405s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:416s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:484s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:459s
fi-kbl-7560u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:566s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:456s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:568s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:473s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:500s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:438s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:531s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:404s

fi-bdw-gvtdvm failed to collect. IGT log at Patchwork_4623/fi-bdw-gvtdvm/igt.log

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
5bb846f drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
54ed0e1 lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
bafac0f lib/scatterlist: Avoid potential scatterlist entry overflow
b5fb37a lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4623/
Re: [Intel-gfx] [drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
> Hi,
>
> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
> a lot of the below warnings. Things seem to work fine (in fact it seems
> faster in general use than previously), but it's a lot of warning spew.
>
> [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under
> evasion is 100 us

I tried to optimize this a bit recently but indeed it's still known to be too
slow. Looks like all of that stuff did land in Linus's tree already,
so presumably you have it all already.

I did have some further ideas that should help but I got sidetracked by
other things before I managed to finish the work. I guess I'll need to get
back on that horse and try to finish what I started.

In the meantime, maybe we should just silence this error spew again
until we're more confident about meeting the deadlines. Maarten?

Do you have lockdep enabled BTW? Based on what I've seen lockdep does
seem to be a major contributor to slowness here.

-- 
Ville Syrjälä
Intel OTC
Re: [Intel-gfx] [drm] Atomic update on pipe (A) took 119 us, max time under evasion is 100 us
On 05/04/2017 11:42 AM, Ville Syrjälä wrote:
> On Thu, May 04, 2017 at 09:26:09AM -0600, Jens Axboe wrote:
>> Hi,
>>
>> Running current -git on my laptop (20FB, X1 Carbon gen4, skylake), I get
>> a lot of the below warnings. Things seem to work fine (in fact it seems
>> faster in general use than previously), but it's a lot of warning spew.
>>
>> [ 764.877978] [drm] Atomic update on pipe (A) took 156 us, max time under
>> evasion is 100 us
>
> I tried to optimize this a bit recently but indeed it's still known to be too
> slow. Looks like all of that stuff did land in Linus's tree already,
> so presumably you have it all already.

Yes, this is Linus' tree...

> I did have some further ideas that should help but I got sidetracked by
> other things before I managed to finish the work. I guess I'll need to get
> back on that horse and try to finish what I started.
>
> In the meantime, maybe we should just silence this error spew again
> until we're more confident about meeting the deadlines. Maarten?
>
> Do you have lockdep enabled BTW? Based on what I've seen lockdep does
> seem to be a major contributor to slowness here.

Nope, running a fairly optimized build on my laptop.

-- 
Jens Axboe
Re: [Intel-gfx] [PATCH] drm/i915: Set all undefined MOCS entries to follow PTE
David Weinehall writes:

> On Thu, May 04, 2017 at 10:51:29AM +0100, Chris Wilson wrote:
>> A good default for garbage entries from the user is to follow the
>> default setting of the object (i.e. the PTE). Currently they use the
>> uncached entry, and now the only way to accidentally hit uncached
>> performance is via explicit use of the uncached MOCS or setting the
>> object to uncached. Note that these entries are currently undefined in
>> the ABI and we reserve the right to change them. We originally chose
>> uncached to eliminate any problem with reducing the caching level in
>> future, but the object is a much better definition of the minimum
>> caching level.
>>

NAK. The reason for the default being UC is that it's the only setting that guarantees full forwards compatibility with any other entry that might be added in the future. If you default to PTE on (e)LLC and WB on L3, userspace will no longer be able to use any newly introduced entry with stricter coherency guarantees than that (e.g. any L3-uncached entry) in a backwards-compatible way. Attempting to do so may break memory coherency assumptions of the application and lead to misrendering when run on older kernel versions (which in my judgment is a scarier failure mode than reduced performance).

My other concern is that this change may make inadvertent use of undefined MOCS entries extremely difficult to detect in some cases -- UC gives userspace a pretty obvious (if functionally harmless) indication that it's got its caching settings wrong, and is a strong motivation for userspace developers to contribute MOCS table changes to the kernel instead of blindly making assumptions about them (e.g. that they match the Android kernel as media-sdk was probably doing). With this change checked in, the performance drawback from using media-sdk on an upstream kernel may have been subtle enough that David would never have bothered to look into the issue.
People may have started shipping copies of media-sdk making bogus MOCS table assumptions (with potential correctness implications), at which point you would have to deal with userspace regressions anytime the MOCS table is extended in the future. >> Fixes: 3bbaba0ceaa2 ("drm/i915: Added Programming of the MOCS") >> Signed-off-by: Chris Wilson >> Cc: David Weinehall >> Cc: Arkadiusz Hiler >> Cc: Tvrtko Ursulin >> Cc: sta...@vger.kernel.org > > LGTM, and passes our nightly msdk test case. > > Tested-by: David Weinehall > Reviewed-by: David Weinehall > >> --- >> drivers/gpu/drm/i915/intel_mocs.c | 39 >> +++ >> 1 file changed, 15 insertions(+), 24 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/intel_mocs.c >> b/drivers/gpu/drm/i915/intel_mocs.c >> index 92e461c68385..e7a7781ca457 100644 >> --- a/drivers/gpu/drm/i915/intel_mocs.c >> +++ b/drivers/gpu/drm/i915/intel_mocs.c >> @@ -85,10 +85,7 @@ struct drm_i915_mocs_table { >> * >> * Entries not part of the following tables are undefined as far as >> * userspace is concerned and shouldn't be relied upon. For the time >> - * being they will be implicitly initialized to the strictest caching >> - * configuration (uncached) to guarantee forwards compatibility with >> - * userspace programs written against more recent kernels providing >> - * additional MOCS entries. >> + * being they will be implicitly initialized to follow the PTE. >> * >> * NOTE: These tables MUST start with being uncached and the length >> * MUST be less than 63 as the last two registers are reserved >> @@ -249,16 +246,13 @@ int intel_mocs_init_engine(struct intel_engine_cs >> *engine) >> table.table[index].control_value); >> >> /* >> - * Ok, now set the unused entries to uncached. These entries >> + * Ok, now set the unused entries to follow the PTE. These entries >> * are officially undefined and no contract for the contents >> * and settings is given for these entries. 
>> - * >> - * Entry 0 in the table is uncached - so we are just writing >> - * that value to all the used entries. >> */ >> for (; index < GEN9_NUM_MOCS_ENTRIES; index++) >> I915_WRITE(mocs_register(engine->id, index), >> - table.table[0].control_value); >> + table.table[I915_MOCS_PTE].control_value); >> >> return 0; >> } >> @@ -295,16 +289,13 @@ static int emit_mocs_control_table(struct >> drm_i915_gem_request *req, >> } >> >> /* >> - * Ok, now set the unused entries to uncached. These entries >> + * Ok, now set the unused entries to follow the PTE. These entries >> * are officially undefined and no contract for the contents >> * and settings is given for these entries. >> - * >> - * Entry 0 in the table is uncached - so we are just writing >> - * that value to all the used entries. >> */ >> for (; index < GEN9_NUM_MOCS_ENTRIES; index++) { >>
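The behavioural difference under discussion is small in code terms: after the defined table entries are programmed, every remaining slot is filled from one fallback entry, and the patch switches that fallback from entry 0 (uncached) to the PTE entry. A minimal user-space sketch of that loop follows. It uses a plain array in place of the MOCS registers, and the names `NUM_MOCS_ENTRIES`, `MOCS_UNCACHED`, `MOCS_PTE` and `fill_mocs` as well as all values are hypothetical stand-ins for illustration, not the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch, not the driver's real values. */
#define NUM_MOCS_ENTRIES 8   /* plays the role of GEN9_NUM_MOCS_ENTRIES */
#define MOCS_UNCACHED    0   /* entry 0 of the table is the uncached one */
#define MOCS_PTE         1   /* plays the role of I915_MOCS_PTE */

/*
 * Copy the defined entries into regs[], then program every remaining
 * slot with the control value of the entry at fallback_index.
 * Assumes table_size <= NUM_MOCS_ENTRIES and fallback_index < table_size.
 */
static void fill_mocs(uint32_t regs[NUM_MOCS_ENTRIES],
                      const uint32_t *table, size_t table_size,
                      size_t fallback_index)
{
    size_t index;

    /* Defined entries are written as they are. */
    for (index = 0; index < table_size; index++)
        regs[index] = table[index];

    /* Undefined entries all alias the chosen fallback entry. */
    for (; index < NUM_MOCS_ENTRIES; index++)
        regs[index] = table[fallback_index];
}
```

Passing `MOCS_UNCACHED` as the fallback models the behaviour before the patch, and `MOCS_PTE` models the behaviour after it. The objection in the reply concerns which of the two is the safer contract for userspace, not the loop itself.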
[Intel-gfx] [PATCH] drm/i915: Fix rawclk readout for g4x
From: Ville Syrjälä

Turns out our skills in decoding the CLKCFG register weren't good enough. On this particular elk the answer we got was 400 MHz when in reality the clock was running at 266 MHz, which then caused us to program a bogus AUX clock divider that caused all AUX communication to fail.

Sadly the docs are now in bit heaven, so the fix will have to be based on empirical evidence. Using another elk machine I was able to frob the FSB frequency from the BIOS and see how it affects the CLKCFG register. The machine seems to use a frequency of 266 MHz by default, and fortunately it still boots even with the 50% CPU overclock that we get when we bump the FSB up to 400 MHz.

It turns out the actual FSB frequency and the register have no real link whatsoever. The register value is based on some straps or something, but fortunately those too can be configured from the BIOS on this board, although it doesn't seem to respect the settings 100%. In the end I was able to derive the following relationship:

BIOS FSB / strap | CLKCFG
-------------------------
200              | 0x2
266              | 0x0
333              | 0x4
400              | 0x4

So only the 200 and 400 MHz cases actually match how we're currently decoding that register. But as the comment next to some of the defines says, we have been just guessing anyway.

So let's fix things up so that at least the 266 MHz case will work correctly as that is actually the setting used by both the buggy machine and my test machine.

The fact that 333 and 400 MHz BIOS settings result in the same register value is a little disappointing, as that means we can't tell them apart. However, according to the gmch datasheet for both elk and ctg 400 MHz is not even a supported FSB frequency, so I'm going to make the assumption that we should decode it as 333 MHz instead.
Cc: sta...@vger.kernel.org Cc: Tomi Sarvela Reported-by: Tomi Sarvela Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100926 Signed-off-by: Ville Syrjälä --- drivers/gpu/drm/i915/i915_reg.h| 10 +++--- drivers/gpu/drm/i915/intel_cdclk.c | 6 ++ 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index ee8170cda93e..524fdfda9d45 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3059,10 +3059,14 @@ enum skl_disp_power_wells { #define CLKCFG_FSB_667 (3 << 0)/* hrawclk 166 */ #define CLKCFG_FSB_800 (2 << 0)/* hrawclk 200 */ #define CLKCFG_FSB_1067(6 << 0) /* hrawclk 266 */ +#define CLKCFG_FSB_1067_ALT(0 << 0)/* hrawclk 266 */ #define CLKCFG_FSB_1333(7 << 0) /* hrawclk 333 */ -/* Note, below two are guess */ -#define CLKCFG_FSB_1600(4 << 0) /* hrawclk 400 */ -#define CLKCFG_FSB_1600_ALT(0 << 0)/* hrawclk 400 */ +/* + * Note that on at least on ELK the below value is reported for both + * 333 and 400 MHz BIOS FSB setting, but given that the gmch datasheet + * lists only 200/266/333 MHz FSB as supported let's decode it as 333 MHz. 
+ */ +#define CLKCFG_FSB_1333_ALT(4 << 0)/* hrawclk 333 */ #define CLKCFG_FSB_MASK(7 << 0) #define CLKCFG_MEM_533 (1 << 4) #define CLKCFG_MEM_667 (2 << 4) diff --git a/drivers/gpu/drm/i915/intel_cdclk.c b/drivers/gpu/drm/i915/intel_cdclk.c index 763010f8ad89..29792972d55d 100644 --- a/drivers/gpu/drm/i915/intel_cdclk.c +++ b/drivers/gpu/drm/i915/intel_cdclk.c @@ -1808,13 +1808,11 @@ static int g4x_hrawclk(struct drm_i915_private *dev_priv) case CLKCFG_FSB_800: return 20; case CLKCFG_FSB_1067: + case CLKCFG_FSB_1067_ALT: return 27; case CLKCFG_FSB_1333: + case CLKCFG_FSB_1333_ALT: return 33; - /* these two are just a guess; one of them might be right */ - case CLKCFG_FSB_1600: - case CLKCFG_FSB_1600_ALT: - return 40; default: return 13; } -- 2.10.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
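To make the resulting decode easier to follow, here is a small stand-alone sketch of the mapping after the patch. The `CLKCFG_FSB_*` values are copied from the hunk above. The function name `hrawclk_mhz` is hypothetical, and it returns plain MHz rather than the driver's own return units. Field values whose handling is not visible in this excerpt return -1 here, which is an assumption of the sketch and not what `g4x_hrawclk()` does.

```c
#include <stdint.h>

/* FSB field of CLKCFG, values as they appear in the patch. */
#define CLKCFG_FSB_667       (3 << 0)  /* hrawclk 166 */
#define CLKCFG_FSB_800       (2 << 0)  /* hrawclk 200 */
#define CLKCFG_FSB_1067      (6 << 0)  /* hrawclk 266 */
#define CLKCFG_FSB_1067_ALT  (0 << 0)  /* hrawclk 266 */
#define CLKCFG_FSB_1333      (7 << 0)  /* hrawclk 333 */
#define CLKCFG_FSB_1333_ALT  (4 << 0)  /* hrawclk 333; also read back at 400 */
#define CLKCFG_FSB_MASK      (7 << 0)

/* Hypothetical helper: decode the FSB field into a raw clock in MHz. */
static int hrawclk_mhz(uint32_t clkcfg)
{
    switch (clkcfg & CLKCFG_FSB_MASK) {
    case CLKCFG_FSB_667:
        return 166;
    case CLKCFG_FSB_800:
        return 200;
    case CLKCFG_FSB_1067:
    case CLKCFG_FSB_1067_ALT:   /* 0x0: the value seen on both elk machines */
        return 266;
    case CLKCFG_FSB_1333:
    case CLKCFG_FSB_1333_ALT:   /* 0x4: 333 and 400 cannot be told apart */
        return 333;
    default:
        return -1;              /* not covered by this excerpt */
    }
}
```

With this mapping a register value of 0x0 decodes to 266 MHz, which is the case that previously came out as 400 MHz and produced the bogus AUX clock divider.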
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Fix rawclk readout for g4x
== Series Details ==

Series: drm/i915: Fix rawclk readout for g4x
URL   : https://patchwork.freedesktop.org/series/23978/
State : success

== Summary ==

Series 23978v1 drm/i915: Fix rawclk readout for g4x
https://patchwork.freedesktop.org/api/1.0/series/23978/revisions/1/mbox/

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:432s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:579s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:515s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:564s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:494s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:483s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:409s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:420s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:480s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:487s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:458s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:571s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:454s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:573s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:455s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:492s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:430s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:529s
fi-snb-2600      total:278  pass:249  dwarn:0   dfail:0   fail:0   skip:29  time:413s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip: 2017y-05m-04d-15h-00m-33s UTC integration manifest
be10d0a drm/i915: Fix rawclk readout for g4x

== Logs ==

For more details see:
https://intel-gfx-ci.01.org/CI/Patchwork_4624/
[Intel-gfx] [PATCHv4 0/3] dma_buf import support for vgem
Hi,

This is v4 of the series to add dma_buf import functions for vgem. This version primarily focuses on adding a new approach for an alternate dma_buf attach after platformdev was removed.

Thanks,
Laura

Laura Abbott (3):
  drm/vgem: Add a dummy platform device
  drm/prime: Introduce drm_gem_prime_import_dev
  drm/vgem: Enable dmabuf import interfaces

 drivers/gpu/drm/drm_prime.c     | 30 ++--
 drivers/gpu/drm/vgem/vgem_drv.c | 155 +++-
 drivers/gpu/drm/vgem/vgem_drv.h | 2 +
 include/drm/drm_prime.h         | 5 ++
 4 files changed, 154 insertions(+), 38 deletions(-)

-- 
2.7.4
[Intel-gfx] [PATCHv4 2/3] drm/prime: Introduce drm_gem_prime_import_dev
The existing drm_gem_prime_import function uses the underlying struct device of a drm_device for attaching to a dma_buf. Some drivers (notably vgem) may not have an underlying device structure. Offer an alternate function to attach using any available device structure. Signed-off-by: Laura Abbott --- v4: Alternate implementation to take an arbitrary struct dev instead of just a platform device. This was different enough that I dropped the previous Reviewed-by --- drivers/gpu/drm/drm_prime.c | 30 -- include/drm/drm_prime.h | 5 + 2 files changed, 29 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 9fb65b7..5ad9a26 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -595,15 +595,18 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev, EXPORT_SYMBOL(drm_gem_prime_handle_to_fd); /** - * drm_gem_prime_import - helper library implementation of the import callback + * drm_gem_prime_import_dev - core implementation of the import callback * @dev: drm_device to import into * @dma_buf: dma-buf object to import + * @attach_dev: struct device to dma_buf attach * - * This is the implementation of the gem_prime_import functions for GEM drivers - * using the PRIME helpers. + * This is the core of drm_gem_prime_import. It's designed to be called by + * drivers who want to use a different device structure than dev->dev for + * attaching via dma_buf. 
*/ -struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, + struct dma_buf *dma_buf, + struct device *attach_dev) { struct dma_buf_attachment *attach; struct sg_table *sgt; @@ -625,7 +628,7 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, if (!dev->driver->gem_prime_import_sg_table) return ERR_PTR(-EINVAL); - attach = dma_buf_attach(dma_buf, dev->dev); + attach = dma_buf_attach(dma_buf, attach_dev); if (IS_ERR(attach)) return ERR_CAST(attach); @@ -655,6 +658,21 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, return ERR_PTR(ret); } +EXPORT_SYMBOL(drm_gem_prime_import_dev); + +/** + * drm_gem_prime_import - helper library implementation of the import callback + * @dev: drm_device to import into + * @dma_buf: dma-buf object to import + * + * This is the implementation of the gem_prime_import functions for GEM drivers + * using the PRIME helpers. + */ +struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, + struct dma_buf *dma_buf) +{ + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); +} EXPORT_SYMBOL(drm_gem_prime_import); /** diff --git a/include/drm/drm_prime.h b/include/drm/drm_prime.h index 0b2a235..46fd1fb 100644 --- a/include/drm/drm_prime.h +++ b/include/drm/drm_prime.h @@ -65,6 +65,11 @@ int drm_gem_prime_handle_to_fd(struct drm_device *dev, int *prime_fd); struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf); + +struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, + struct dma_buf *dma_buf, + struct device *attach_dev); + int drm_gem_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd, uint32_t *handle); struct dma_buf *drm_gem_dmabuf_export(struct drm_device *dev, -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
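The shape of this change is a common one: the existing entry point becomes a thin wrapper around a core function that takes the attach device explicitly. The sketch below mimics that shape in user space with mock types so it can be compiled outside the kernel. Everything in it, including `mock_prime_import_dev`, `mock_prime_import`, the struct layouts and the -1 error code, is a hypothetical illustration and not the DRM API.

```c
#include <stddef.h>

/* Mock types for this sketch; they only mirror the shape of the kernel's. */
struct device { const char *name; };
struct drm_device { struct device *dev; };  /* dev may be NULL, as for vgem */
struct dma_buf { const struct device *attached_to; };

/* Core helper: attach the buffer to whichever device the caller names. */
static int mock_prime_import_dev(struct drm_device *drm, struct dma_buf *buf,
                                 struct device *attach_dev)
{
    (void)drm;                      /* unused in this reduced sketch */
    if (!attach_dev)
        return -1;                  /* mock error: nothing to attach to */
    buf->attached_to = attach_dev;
    return 0;
}

/* Existing entry point: keeps its signature and still uses drm->dev. */
static int mock_prime_import(struct drm_device *drm, struct dma_buf *buf)
{
    return mock_prime_import_dev(drm, buf, drm->dev);
}
```

Callers that have a usable `dev->dev` keep calling the wrapper unchanged, while a driver without one can pass some other device to the core function directly.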
[Intel-gfx] [PATCHv4 3/3] drm/vgem: Enable dmabuf import interfaces
Enable the GEM dma-buf import interfaces in addition to the export interfaces. This lets vgem be used as a test source for other allocators (e.g. Ion). Reviewed-by: Chris Wilson Signed-off-by: Laura Abbott --- v4: Use new drm_gem_prime_import_dev function --- drivers/gpu/drm/vgem/vgem_drv.c | 136 +++- drivers/gpu/drm/vgem/vgem_drv.h | 2 + 2 files changed, 109 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index d1d98af..c9381d45 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -48,6 +48,11 @@ static void vgem_gem_free_object(struct drm_gem_object *obj) { struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); + drm_free_large(vgem_obj->pages); + + if (obj->import_attach) + drm_prime_gem_destroy(obj, vgem_obj->table); + drm_gem_object_release(obj); kfree(vgem_obj); } @@ -58,26 +63,49 @@ static int vgem_gem_fault(struct vm_fault *vmf) struct drm_vgem_gem_object *obj = vma->vm_private_data; /* We don't use vmf->pgoff since that has the fake offset */ unsigned long vaddr = vmf->address; - struct page *page; - - page = shmem_read_mapping_page(file_inode(obj->base.filp)->i_mapping, - (vaddr - vma->vm_start) >> PAGE_SHIFT); - if (!IS_ERR(page)) { - vmf->page = page; - return 0; - } else switch (PTR_ERR(page)) { - case -ENOSPC: - case -ENOMEM: - return VM_FAULT_OOM; - case -EBUSY: - return VM_FAULT_RETRY; - case -EFAULT: - case -EINVAL: - return VM_FAULT_SIGBUS; - default: - WARN_ON_ONCE(PTR_ERR(page)); - return VM_FAULT_SIGBUS; + int ret; + loff_t num_pages; + pgoff_t page_offset; + page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; + + num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); + + if (page_offset > num_pages) + return VM_FAULT_SIGBUS; + + if (obj->pages) { + get_page(obj->pages[page_offset]); + vmf->page = obj->pages[page_offset]; + ret = 0; + } else { + struct page *page; + + page = shmem_read_mapping_page( + file_inode(obj->base.filp)->i_mapping, + 
page_offset); + if (!IS_ERR(page)) { + vmf->page = page; + ret = 0; + } else switch (PTR_ERR(page)) { + case -ENOSPC: + case -ENOMEM: + ret = VM_FAULT_OOM; + break; + case -EBUSY: + ret = VM_FAULT_RETRY; + break; + case -EFAULT: + case -EINVAL: + ret = VM_FAULT_SIGBUS; + break; + default: + WARN_ON(PTR_ERR(page)); + ret = VM_FAULT_SIGBUS; + break; + } + } + return ret; } static const struct vm_operations_struct vgem_gem_vm_ops = { @@ -114,12 +142,8 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) kfree(vfile); } -/* ioctls */ - -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, - struct drm_file *file, - unsigned int *handle, - unsigned long size) +static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, + unsigned long size) { struct drm_vgem_gem_object *obj; int ret; @@ -129,8 +153,31 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, return ERR_PTR(-ENOMEM); ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); - if (ret) - goto err_free; + if (ret) { + kfree(obj); + return ERR_PTR(ret); + } + + return obj; +} + +static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) +{ + drm_gem_object_release(&obj->base); + kfree(obj); +} + +static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, + struct drm_file *file, + unsigned int *handle, + unsigned long size) +{ + struct drm_vgem_gem_object *o
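The reworked fault handler first turns the faulting address into a page index and rejects indices outside the object before touching `obj->pages`. A stand-alone sketch of that arithmetic is shown below. The names `fault_page_index`, `SKETCH_PAGE_SHIFT` and `SKETCH_PAGE_SIZE` are hypothetical, and a 4 KiB page size is assumed. Note that this sketch rejects an index equal to the page count (`>=`), whereas the hunk as posted compares with `>`.

```c
#include <stddef.h>

/* Assumed 4 KiB pages; hypothetical stand-ins for PAGE_SHIFT and PAGE_SIZE. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Translate a faulting address into a page index within an object of
 * obj_size bytes that is mapped starting at vm_start.
 * Returns the index, or -1 if the address lies beyond the object.
 */
static long fault_page_index(unsigned long vaddr, unsigned long vm_start,
                             unsigned long obj_size)
{
    unsigned long page_offset = (vaddr - vm_start) >> SKETCH_PAGE_SHIFT;
    /* Round the object size up to whole pages, like DIV_ROUND_UP. */
    unsigned long num_pages =
        (obj_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

    if (page_offset >= num_pages)
        return -1;                  /* would index past the page array */

    return (long)page_offset;
}
```

For a two-page object mapped at 0x10000, an access at 0x11fff lands in page 1, and an access at 0x12000 is rejected.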
[Intel-gfx] [PATCHv4 1/3] drm/vgem: Add a dummy platform device
The vgem driver is currently registered independent of any actual device. Some usage of the dmabuf APIs require an actual device structure to do anything. Register a dummy platform device for use with dmabuf. Reviewed-by: Chris Wilson Signed-off-by: Laura Abbott --- v4: Switch from the now removed platformdev to a static platform device. --- drivers/gpu/drm/vgem/vgem_drv.c | 19 --- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index 9fee38a..d1d98af 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -42,6 +42,8 @@ #define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 +static struct platform_device *vgem_platform; + static void vgem_gem_free_object(struct drm_gem_object *obj) { struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); @@ -335,11 +337,20 @@ static int __init vgem_init(void) int ret; vgem_device = drm_dev_alloc(&vgem_driver, NULL); - if (IS_ERR(vgem_device)) { - ret = PTR_ERR(vgem_device); + if (IS_ERR(vgem_device)) + return PTR_ERR(vgem_device); + + vgem_platform = platform_device_register_simple("vgem", + -1, NULL, 0); + + if (!vgem_platform) { + ret = -ENODEV; goto out; } + dma_coerce_mask_and_coherent(&vgem_platform->dev, + DMA_BIT_MASK(64)); + ret = drm_dev_register(vgem_device, 0); if (ret) goto out_unref; @@ -347,13 +358,15 @@ static int __init vgem_init(void) return 0; out_unref: - drm_dev_unref(vgem_device); + platform_device_unregister(vgem_platform); out: + drm_dev_unref(vgem_device); return ret; } static void __exit vgem_exit(void) { + platform_device_unregister(vgem_platform); drm_dev_unregister(vgem_device); drm_dev_unref(vgem_device); } -- 2.7.4 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
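The error path in `vgem_init()` after this patch releases resources in the reverse order of acquisition: the later label unregisters the platform device and then falls through to the label that drops the DRM device. The user-space sketch below reproduces only that control flow and records the order of the release calls in a string. The names `init_sketch`, `release` and `unwind_log`, and the error codes, are hypothetical.

```c
#include <stddef.h>
#include <string.h>

static char unwind_log[64];     /* records the order of release calls */

static void release(const char *what)
{
    strcat(unwind_log, what);
}

/*
 * fail_at: 0 = everything succeeds,
 *          1 = creating the platform device fails,
 *          2 = registering the DRM device fails.
 */
static int init_sketch(int fail_at)
{
    int ret;

    unwind_log[0] = '\0';

    /* Step 1: the DRM device is allocated (assumed to succeed here). */

    /* Step 2: platform device. Only the DRM device exists so far. */
    if (fail_at == 1) {
        ret = -1;
        goto out;
    }

    /* Step 3: registration. Both resources exist and must be released. */
    if (fail_at == 2) {
        ret = -2;
        goto out_unregister;
    }

    return 0;

out_unregister:
    release("platform;");       /* acquired last, released first */
out:
    release("drm;");
    return ret;
}
```

Calling `init_sketch(2)` leaves `"platform;drm;"` in `unwind_log`, while `init_sketch(1)` leaves only `"drm;"`, because the platform device was never created on that path.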
[Intel-gfx] [RFC] drm/i915/guc: capture GuC logs if FW fails to load
We're currently deleting the GuC logs if the FW fails to load, but those are still useful to understand why the loading failed. Instead of deleting them, taking a snapshot allows us to access them after driver load is completed. Cc: Oscar Mateo Cc: Michal Wajdeczko Signed-off-by: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/i915_debugfs.c | 36 --- drivers/gpu/drm/i915/i915_drv.c | 3 +++ drivers/gpu/drm/i915/i915_drv.h | 6 ++ drivers/gpu/drm/i915/i915_gpu_error.c | 36 +++ drivers/gpu/drm/i915/intel_guc_fwif.h | 14 +++--- drivers/gpu/drm/i915/intel_guc_log.c | 10 ++ drivers/gpu/drm/i915/intel_uc.c | 7 +-- 7 files changed, 84 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 870c470..4ff20fc 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -2543,26 +2543,32 @@ static int i915_guc_info(struct seq_file *m, void *data) static int i915_guc_log_dump(struct seq_file *m, void *data) { struct drm_i915_private *dev_priv = node_to_i915(m->private); - struct drm_i915_gem_object *obj; - int i = 0, pg; - - if (!dev_priv->guc.log.vma) + u32 *log; + int i = 0; + + if (dev_priv->guc.log.vma) { + log = i915_gem_object_pin_map(dev_priv->guc.log.vma->obj, + I915_MAP_WC); + if (IS_ERR(log)) { + DRM_ERROR("Failed to pin guc_log vma\n"); + return -ENOMEM; + } + } else if (dev_priv->gpu_error.guc_load_fail_log) { + log = dev_priv->gpu_error.guc_load_fail_log; + } else { return 0; - - obj = dev_priv->guc.log.vma->obj; - for (pg = 0; pg < obj->base.size / PAGE_SIZE; pg++) { - u32 *log = kmap_atomic(i915_gem_object_get_page(obj, pg)); - - for (i = 0; i < PAGE_SIZE / sizeof(u32); i += 4) - seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", - *(log + i), *(log + i + 1), - *(log + i + 2), *(log + i + 3)); - - kunmap_atomic(log); } + for (i = 0; i < GUC_LOG_SIZE / sizeof(u32); i += 4) + seq_printf(m, "0x%08x 0x%08x 0x%08x 0x%08x\n", + *(log + i), *(log + i + 1), + *(log + i + 2), 
*(log + i + 3)); + seq_putc(m, '\n'); + if (dev_priv->guc.log.vma) + i915_gem_object_unpin_map(dev_priv->guc.log.vma->obj); + return 0; } diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 452c265..c7cb36c 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1354,6 +1354,9 @@ void i915_driver_unload(struct drm_device *dev) cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); i915_reset_error_state(dev_priv); + /* release GuC error log (if any) */ + i915_guc_load_error_log_free(dev_priv); + /* Flush any outstanding unpin_work. */ drain_workqueue(dev_priv->wq); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4588b3e..761c663 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1555,6 +1555,9 @@ struct i915_gpu_error { /* Protected by the above dev->gpu_error.lock. */ struct i915_gpu_state *first_error; + /* Log snapshot if GuC errors during load */ + void *guc_load_fail_log; + unsigned long missed_irq_rings; /** @@ -3687,6 +3690,9 @@ static inline void i915_reset_error_state(struct drm_i915_private *i915) #endif +void i915_guc_load_error_log_capture(struct drm_i915_private *i915); +void i915_guc_load_error_log_free(struct drm_i915_private *i915); + const char *i915_cache_level_str(struct drm_i915_private *i915, int type); /* i915_cmd_parser.c */ diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index ec526d9..44a873b 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -1809,3 +1809,39 @@ void i915_reset_error_state(struct drm_i915_private *i915) i915_gpu_state_put(error); } + +void i915_guc_load_error_log_capture(struct drm_i915_private *i915) +{ + void *log, *buf; + struct i915_vma *vma = i915->guc.log.vma; + + if (i915->gpu_error.guc_load_fail_log || !vma) + return; + + /* +* the vma should be already pinned and mapped for log 
runtime +* management but let's play safe +*/ + log = i915_gem_object_pin_map(vma->obj, I915_MAP_WC); + if (IS_ERR(log)) { + DRM_ERROR("Failed to pin guc_log vma\n"); + return; +
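The debugfs dump in this patch prints the log buffer as rows of four 32-bit words, whether the words come from the live vma or from the saved snapshot. A small user-space sketch of that formatting is given below. It writes into a caller-supplied string instead of a `seq_file`, and the function name `dump_words` and its error convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format 32-bit words from log[] into out, four per line, in the same
 * "0x%08x 0x%08x 0x%08x 0x%08x" layout as the debugfs dump.
 * A trailing group of fewer than four words is ignored.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if out is too small.
 */
static int dump_words(const uint32_t *log, size_t words,
                      char *out, size_t out_size)
{
    size_t i, used = 0;

    if (out_size > 0)
        out[0] = '\0';

    for (i = 0; i + 3 < words; i += 4) {
        int n = snprintf(out + used, out_size - used,
                         "0x%08x 0x%08x 0x%08x 0x%08x\n",
                         (unsigned int)log[i], (unsigned int)log[i + 1],
                         (unsigned int)log[i + 2], (unsigned int)log[i + 3]);

        /* snprintf reports the length it needed; detect truncation. */
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    return (int)used;
}
```

Each complete row is 44 characters long including the newline, so a snapshot of N words needs roughly 11 * N bytes of text.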
[Intel-gfx] ✓ Fi.CI.BAT: success for dma_buf import support for vgem (rev2)
== Series Details ==

Series: dma_buf import support for vgem (rev2)
URL   : https://patchwork.freedesktop.org/series/23824/
State : success

== Summary ==

Series 23824v2 dma_buf import support for vgem
https://patchwork.freedesktop.org/api/1.0/series/23824/revisions/2/mbox/

Test gem_exec_flush:
        Subgroup basic-batch-kernel-default-uc:
                pass       -> FAIL       (fi-snb-2600) fdo#17

fdo#17 https://bugs.freedesktop.org/show_bug.cgi?id=17

fi-bdw-5557u     total:278  pass:267  dwarn:0   dfail:0   fail:0   skip:11  time:431s
fi-bdw-gvtdvm    total:278  pass:256  dwarn:8   dfail:0   fail:0   skip:14  time:425s
fi-bsw-n3050     total:278  pass:242  dwarn:0   dfail:0   fail:0   skip:36  time:584s
fi-bxt-j4205     total:278  pass:259  dwarn:0   dfail:0   fail:0   skip:19  time:508s
fi-bxt-t5700     total:278  pass:258  dwarn:0   dfail:0   fail:0   skip:20  time:553s
fi-byt-j1900     total:278  pass:254  dwarn:0   dfail:0   fail:0   skip:24  time:492s
fi-byt-n2820     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:480s
fi-hsw-4770      total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:411s
fi-hsw-4770r     total:278  pass:262  dwarn:0   dfail:0   fail:0   skip:16  time:408s
fi-ilk-650       total:278  pass:228  dwarn:0   dfail:0   fail:0   skip:50  time:415s
fi-ivb-3520m     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:493s
fi-ivb-3770      total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:464s
fi-kbl-7500u     total:278  pass:260  dwarn:0   dfail:0   fail:0   skip:18  time:460s
fi-kbl-7560u     total:278  pass:267  dwarn:1   dfail:0   fail:0   skip:10  time:560s
fi-skl-6260u     total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:455s
fi-skl-6700hq    total:278  pass:261  dwarn:0   dfail:0   fail:0   skip:17  time:566s
fi-skl-6700k     total:278  pass:256  dwarn:4   dfail:0   fail:0   skip:18  time:453s
fi-skl-6770hq    total:278  pass:268  dwarn:0   dfail:0   fail:0   skip:10  time:498s
fi-skl-gvtdvm    total:278  pass:265  dwarn:0   dfail:0   fail:0   skip:13  time:431s
fi-snb-2520m     total:278  pass:250  dwarn:0   dfail:0   fail:0   skip:28  time:539s
fi-snb-2600      total:278  pass:248  dwarn:0   dfail:0   fail:1   skip:29  time:417s

369880c1680bf9bde467a40d2a03d3ad32341281 drm-tip:
2017y-05m-04d-15h-00m-33s UTC integration manifest 0e6a5c5 drm/vgem: Enable dmabuf import interfaces 36b39d3 drm/prime: Introduce drm_gem_prime_import_dev d231a4f drm/vgem: Add a dummy platform device == Logs == For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_4625/