[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/tgl: Wa_14011059788 (rev4)
== Series Details == Series: drm/i915/tgl: Wa_14011059788 (rev4) URL : https://patchwork.freedesktop.org/series/75990/ State : success == Summary == CI Bug Log - changes from CI_DRM_8335_full -> Patchwork_17393_full Summary --- **SUCCESS** No regressions found. Known issues Here are the changes found in Patchwork_17393_full that come from known issues: ### IGT changes ### Issues hit * igt@i915_module_load@reload-with-fault-injection: - shard-kbl: [PASS][1] -> [INCOMPLETE][2] ([i915#1373]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-kbl2/igt@i915_module_l...@reload-with-fault-injection.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-kbl1/igt@i915_module_l...@reload-with-fault-injection.html * igt@i915_suspend@debugfs-reader: - shard-skl: [PASS][3] -> [INCOMPLETE][4] ([i915#69]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-skl5/igt@i915_susp...@debugfs-reader.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-skl6/igt@i915_susp...@debugfs-reader.html * igt@kms_big_fb@linear-32bpp-rotate-0: - shard-kbl: [PASS][5] -> [FAIL][6] ([i915#1119] / [i915#93] / [i915#95]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-kbl6/igt@kms_big...@linear-32bpp-rotate-0.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-kbl3/igt@kms_big...@linear-32bpp-rotate-0.html - shard-apl: [PASS][7] -> [FAIL][8] ([i915#1119] / [i915#95]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-apl2/igt@kms_big...@linear-32bpp-rotate-0.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-apl2/igt@kms_big...@linear-32bpp-rotate-0.html * igt@kms_cursor_crc@pipe-a-cursor-suspend: - shard-kbl: [PASS][9] -> [DMESG-WARN][10] ([i915#180]) +3 similar issues [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-kbl4/igt@kms_cursor_...@pipe-a-cursor-suspend.html [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-kbl1/igt@kms_cursor_...@pipe-a-cursor-suspend.html * igt@kms_cursor_crc@pipe-c-cursor-suspend: - shard-skl: [PASS][11] -> [INCOMPLETE][12] ([i915#300]) [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-skl2/igt@kms_cursor_...@pipe-c-cursor-suspend.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-skl1/igt@kms_cursor_...@pipe-c-cursor-suspend.html * igt@kms_cursor_legacy@cursor-vs-flip-toggle: - shard-hsw: [PASS][13] -> [FAIL][14] ([i915#57]) [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-hsw4/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-hsw6/igt@kms_cursor_leg...@cursor-vs-flip-toggle.html * igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-legacy: - shard-skl: [PASS][15] -> [FAIL][16] ([IGT#5]) [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-skl3/igt@kms_cursor_leg...@flip-vs-cursor-busy-crc-legacy.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-skl6/igt@kms_cursor_leg...@flip-vs-cursor-busy-crc-legacy.html * igt@kms_dp_dsc@basic-dsc-enable-edp: - shard-iclb: [PASS][17] -> [SKIP][18] ([fdo#109349]) [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-iclb2/igt@kms_dp_...@basic-dsc-enable-edp.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-iclb7/igt@kms_dp_...@basic-dsc-enable-edp.html * igt@kms_draw_crc@draw-method-rgb565-blt-xtiled: - shard-glk: [PASS][19] -> [FAIL][20] ([i915#52] / [i915#54]) [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-glk8/igt@kms_draw_...@draw-method-rgb565-blt-xtiled.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-glk1/igt@kms_draw_...@draw-method-rgb565-blt-xtiled.html * igt@kms_flip_tiling@flip-to-x-tiled: - shard-skl: [PASS][21] -> [FAIL][22] ([i915#167]) [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-skl10/igt@kms_flip_til...@flip-to-x-tiled.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-skl5/igt@kms_flip_til...@flip-to-x-tiled.html * igt@kms_plane@plane-panning-bottom-right-suspend-pipe-b-planes: - shard-apl: [PASS][23] -> [DMESG-WARN][24] ([i915#180]) [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8335/shard-apl6/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17393/shard-apl3/igt@kms_pl...@plane-panning-bottom-right-suspend-pipe-b-planes.html * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc: - shard-skl: [PASS][25] -> [FAIL][26] ([fdo#108145] / [i915#265]) [25]
Re: [Intel-gfx] [PATCH 37/59] drm/cirrus: Move to drm/tiny
On Wed, Apr 15, 2020 at 09:40:12AM +0200, Daniel Vetter wrote: > Because it is. Indeed. Acked-by: Gerd Hoffmann take care, Gerd ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 59/59] drm/bochs: Remove explicit drm_connector_register
On Wed, Apr 15, 2020 at 09:40:34AM +0200, Daniel Vetter wrote:
> This is leftovers from the old drm_driver->load callback
> upside-down issues. It doesn't do anything for not-hotplugged
> connectors since drm_dev_register takes care of that.
>
> Signed-off-by: Daniel Vetter
> Cc: Gerd Hoffmann
> Cc: virtualizat...@lists.linux-foundation.org

Acked-by: Gerd Hoffmann
Re: [Intel-gfx] [PATCH 05/18] drm/i915/display/display: Prefer drm_WARN_ON over WARN_ON
Pankaj, the subject line is identical to patch 4, please update. Imre, one question inline for you. On Mon, 06 Apr 2020, Pankaj Bharadiya wrote: > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c > b/drivers/gpu/drm/i915/display/intel_display_power.c > index 433e5a81dd4d..5475f989df4c 100644 > --- a/drivers/gpu/drm/i915/display/intel_display_power.c > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c > @@ -1850,22 +1850,29 @@ static u64 __async_put_domains_mask(struct > i915_power_domains *power_domains) > static bool > assert_async_put_domain_masks_disjoint(struct i915_power_domains > *power_domains) > { > - return !WARN_ON(power_domains->async_put_domains[0] & > - power_domains->async_put_domains[1]); > + struct drm_i915_private *i915 = container_of(power_domains, > + struct drm_i915_private, > + power_domains); > + return !drm_WARN_ON(&i915->drm, power_domains->async_put_domains[0] & > + power_domains->async_put_domains[1]); > } Do we want to depend on struct i915_power_domains being a struct drm_i915_private member via container_of? BR, Jani. 
> > static bool > __async_put_domains_state_ok(struct i915_power_domains *power_domains) > { > + struct drm_i915_private *i915 = container_of(power_domains, > + struct drm_i915_private, > + power_domains); > enum intel_display_power_domain domain; > bool err = false; > > err |= !assert_async_put_domain_masks_disjoint(power_domains); > - err |= WARN_ON(!!power_domains->async_put_wakeref != > -!!__async_put_domains_mask(power_domains)); > + err |= drm_WARN_ON(&i915->drm, !!power_domains->async_put_wakeref != > +!!__async_put_domains_mask(power_domains)); > > for_each_power_domain(domain, __async_put_domains_mask(power_domains)) > - err |= WARN_ON(power_domains->domain_use_count[domain] != 1); > + err |= drm_WARN_ON(&i915->drm, > +power_domains->domain_use_count[domain] != > 1); > > return !err; > } > @@ -2107,11 +2114,14 @@ static void > queue_async_put_domains_work(struct i915_power_domains *power_domains, >intel_wakeref_t wakeref) > { > - WARN_ON(power_domains->async_put_wakeref); > + struct drm_i915_private *i915 = container_of(power_domains, > + struct drm_i915_private, > + power_domains); > + drm_WARN_ON(&i915->drm, power_domains->async_put_wakeref); > power_domains->async_put_wakeref = wakeref; > - WARN_ON(!queue_delayed_work(system_unbound_wq, > - &power_domains->async_put_work, > - msecs_to_jiffies(100))); > + drm_WARN_ON(&i915->drm, !queue_delayed_work(system_unbound_wq, > + > &power_domains->async_put_work, > + msecs_to_jiffies(100))); > } > > static void > @@ -4318,6 +4328,9 @@ __set_power_wells(struct i915_power_domains > *power_domains, > const struct i915_power_well_desc *power_well_descs, > int power_well_count) > { > + struct drm_i915_private *i915 = container_of(power_domains, > + struct drm_i915_private, > + power_domains); > u64 power_well_ids = 0; > int i; > > @@ -4337,8 +4350,8 @@ __set_power_wells(struct i915_power_domains > *power_domains, > if (id == DISP_PW_ID_NONE) > continue; > > - WARN_ON(id >= sizeof(power_well_ids) * 8); > - 
WARN_ON(power_well_ids & BIT_ULL(id)); > + drm_WARN_ON(&i915->drm, id >= sizeof(power_well_ids) * 8); > + drm_WARN_ON(&i915->drm, power_well_ids & BIT_ULL(id)); > power_well_ids |= BIT_ULL(id); > } -- Jani Nikula, Intel Open Source Graphics Center
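For readers following the container_of question in this review, here is a minimal userspace sketch of what the conversion relies on. The structure names (device_sketch, power_domains_sketch) and the simplified container_of macro are hypothetical stand-ins, not the i915 definitions; the kernel macro also performs type checking. The point it illustrates is that the pointer recovery is only valid while the inner structure is embedded by value in the outer one, which is the dependency being asked about.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily trimmed stand-in for the power domains state. */
struct power_domains_sketch {
	unsigned long long async_put_domains[2];
};

/* Hypothetical outer structure that embeds the state by value. */
struct device_sketch {
	int id;
	struct power_domains_sketch power_domains;
};

/*
 * Recover the outer structure from a pointer to its embedded member.
 * This is only valid if pd really is the power_domains member of a
 * device_sketch; a standalone power_domains_sketch would yield garbage.
 */
static struct device_sketch *to_device(struct power_domains_sketch *pd)
{
	return container_of(pd, struct device_sketch, power_domains);
}

/* Returns 1 when the two async-put masks share no bits, 0 otherwise. */
static int masks_disjoint(struct power_domains_sketch *pd)
{
	return (pd->async_put_domains[0] & pd->async_put_domains[1]) == 0;
}
```

If the inner structure were ever allocated separately or moved into a different parent, every to_device() caller would silently compute a wrong pointer, which is why passing the outer pointer explicitly is the usual alternative.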
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for In order to readout DP SDPs, refactors the handling of DP SDPs (rev11)
== Series Details ==

Series: In order to readout DP SDPs, refactors the handling of DP SDPs (rev11)
URL : https://patchwork.freedesktop.org/series/72853/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
33516c7eda02 video/hdmi: Add Unpack only function for DRM infoframe
5eae0ec2d8ae drm/i915/dp: Read out DP SDPs
3df2cc567e5a drm: Add logging function for DP VSC SDP
6cccd45e8119 drm/i915: Include HDMI DRM infoframe in the crtc state dump
ca01029436e6 drm/i915: Include DP HDR Metadata Infoframe SDP in the crtc state dump
673d595572d2 drm/i915: Include DP VSC SDP in the crtc state dump
baa9bbad6bef drm/i915: Program DP SDPs with computed configs
beeabdf3a48b drm/i915: Add state readout for DP HDR Metadata Infoframe SDP
3f0c456bcfa9 drm/i915: Add state readout for DP VSC SDP
-:83: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'name' - possible side-effects?
#83: FILE: drivers/gpu/drm/i915/display/intel_display.c:13735:
+#define PIPE_CONF_CHECK_DP_VSC_SDP(name) do { \
+	if (!current_config->has_psr && !pipe_config->has_psr && \
+	    !intel_compare_dp_vsc_sdp(&current_config->infoframes.name, \
+				      &pipe_config->infoframes.name)) { \
+		pipe_config_dp_vsc_sdp_mismatch(dev_priv, fastset, __stringify(name), \
+						&current_config->infoframes.name, \
+						&pipe_config->infoframes.name); \
+		ret = false; \
+	} \
+} while (0)

total: 0 errors, 0 warnings, 1 checks, 75 lines checked
e78863a17e96 drm/i915: Fix enabled infoframe states of lspcon
b8c74d06c9ea drm/i915: Program DP SDPs on pipe updates
07856a13f253 drm/i915: Stop sending DP SDPs on ddi disable
9a2ad5a32a45 drm/i915/dp: Add compute routine for DP PSR VSC SDP
a5db16ec932c drm/i915/psr: Use new DP VSC SDP compute routine on PSR
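As background on the MACRO_ARG_REUSE check reported here: checkpatch flags macros that expand an argument more than once, because an argument with a side effect would then run more than once. The sketch below uses hypothetical names (MAX_REUSE, max_once, next_value) that are not from the series. In the flagged macro the argument appears to be a structure member name rather than an expression, so this looks like an advisory CHECK and not a bug; that reading is an assumption.

```c
#include <assert.h>

/* Hypothetical macro that expands its first argument twice. */
#define MAX_REUSE(a, b) ((a) > (b) ? (a) : (b))

/* Function form: each argument is evaluated exactly once. */
static int max_once(int a, int b)
{
	return a > b ? a : b;
}

static int calls;

/* Stand-in for an argument expression that has a side effect. */
static int next_value(void)
{
	return ++calls;
}

/* Counts how often next_value() runs when passed through the macro. */
static int count_macro_evaluations(void)
{
	calls = 0;
	(void)MAX_REUSE(next_value(), 0);
	return calls;
}

/* Counts how often next_value() runs when passed to the function. */
static int count_function_evaluations(void)
{
	calls = 0;
	(void)max_once(next_value(), 0);
	return calls;
}
```

Comparing count_macro_evaluations() with count_function_evaluations() shows the extra evaluation the check warns about; a plain identifier or member name as the argument cannot trigger it.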
Re: [Intel-gfx] [PATCH v2] drm/i915: Fix ref->mutex deadlock in i915_active_wait()
Quoting Sultan Alsawaf (2020-04-20 18:42:16)
> On Mon, Apr 20, 2020 at 12:02:39PM +0300, Joonas Lahtinen wrote:
> > I think the patch should be dropped for now before the issue is
> > properly addressed. Either by backporting the mainline fixes or if
> > those are too big and there indeed is a smaller alternative patch
> > that is properly reviewed. But the above patch is not, at least yet.
>
> Why should a fix for a bona-fide issue be dropped due to political reasons? This
> doesn't make sense to me. This just hurts miserable i915 users even more. If my
> patch is going to be dropped, it should be replaced by a different fix at the
> same time.

There's no politics involved. It's all about doing the due diligence that we're fixing upstream bugs, and we're fixing them in a way that does not cause regressions to other users.

Without being able to reproduce a bug against vanilla kernel, there's too high of a risk that the patch that was developed will only work on the downstream kernel it was developed for. That happens for the best of the developers, and that is exactly why the process is in place, to avoid human error. So no politics, just due diligence.

If you could provide bug reproduction instructions by filing a bug, we can make forward progress in solving this issue. After assessing the severity of the bug and the amount of users involved, it will be prioritized accordingly. That is the most efficient way to get attention to a bug.

> Also, the mainline fixes just *happen* to fix this deadlock by removing the
> mutex lock from the path in question and creating multiple other bugs in the
> process that had to be addressed with "Fixes:" commits. The regression potential
> was too high to include those patches for a "stable" kernel, so I made this
> patch which fixes the issue in the simplest way possible.

The thing is that it may be that the patch fixes the exact issue you have at hand in the downstream kernel you are testing against. But in doing so it may as well break other use cases for other users of vanilla kernel. That is what we're trying to avoid.

With the reproduction instructions, it'll be possible to check which kernel versions are affected, and after applying a fix to make sure that the bug is gone from those versions. And if the reproduction can be trivialized to a test, we can introduce a regression check to CI.

A patch that claims to fix a deadlock in upstream kernel should include that splat from upstream kernel, not a speculated chain.

Again, this is just the regular due diligence, because we have made errors in the past. It is for those self-made errors we know not to merge fixes too quickly before we are able to reproduce the error and make sure it is gone.

It's not about where the patch came from, it's about avoiding errors.

> We put this patch into
> Ubuntu now as well, because praying for a response from i915 maintainers while
> the 20.04 release was on the horizon was not an option.
>
> > There is another similar thread where there's jumping to
> > conclusions and doing ad-hoc patches for already fixed issues:
> >
> > https://lore.kernel.org/dri-devel/20200414144309.GB2082@sultan-box.localdomain/
>
> Maybe this wouldn't have happened if I had received a proper response for that
> issue on gitlab from the get-go... Instead I got the run-around from Chris
> claiming that it wasn't an i915 bug:
>
> https://gitlab.freedesktop.org/drm/intel/issues/1599
>
> > I appreciate enthusiasm to provide fixes to i915 but we should
> > continue to do the regular due diligence to make sure we're properly
> > fixing bugs in upstream kernels. And when fixing them, to make
> > sure we're not simply papering over them for a single use case.
> >
> > It would be preferred to file a bug for the seen issues,
> > describing how to reproduce them with vanilla upstream kernels:
> >
> > https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
>
> gitlab.freedesktop.org/drm/intel is where bugs go to be neglected, as noted
> above. I really see no reason to send anything there anymore, when the vast
> majority of community-sourced bug reports go ignored.

In the above bug, you claim to be booting vanilla kernel but the splat clearly says "5.4.28-7-g64bb42e80256-dirty", so the developer correctly requested to bisect the error between 5.4.27 and 5.4.28 vanilla kernels, which you seem to have ignored and simply jumped to provide a patch.

Apologies if it feels like the bugs do not get enough attention, but we do our best to act on the reported bugs. You can best guarantee that your bug is getting the attention by providing all the details requested in the above link.

Without that information, it'll be hard to assess the severity of the bug. Above bug is missing critical pieces of information which help us in assessing the severity:

1. Is the bug reproducible on drm-tip?
2. How to reproduce?
3. How often does it reproduce?
4. Which hardware?

If that information is mi
[Intel-gfx] ✗ Fi.CI.BAT: failure for In order to readout DP SDPs, refactors the handling of DP SDPs (rev11)
== Series Details == Series: In order to readout DP SDPs, refactors the handling of DP SDPs (rev11) URL : https://patchwork.freedesktop.org/series/72853/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8339 -> Patchwork_17395 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_17395 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_17395, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17395/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_17395: ### IGT changes ### Possible regressions * igt@i915_selftest@live@mman: - fi-snb-2600:[PASS][1] -> [FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8339/fi-snb-2600/igt@i915_selftest@l...@mman.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17395/fi-snb-2600/igt@i915_selftest@l...@mman.html Known issues Here are the changes found in Patchwork_17395 that come from known issues: ### IGT changes ### Issues hit * igt@i915_selftest@live@gt_pm: - fi-apl-guc: [PASS][3] -> [DMESG-FAIL][4] ([i915#1751]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8339/fi-apl-guc/igt@i915_selftest@live@gt_pm.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17395/fi-apl-guc/igt@i915_selftest@live@gt_pm.html Possible fixes * igt@i915_selftest@live@gt_pm: - fi-icl-u2: [DMESG-FAIL][5] -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8339/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17395/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [i915#1751]: https://gitlab.freedesktop.org/drm/intel/issues/1751 Participating hosts (48 -> 41) -- Missing(7): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-hsw-4770 fi-byt-clapper fi-bdw-samus 
Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8339 -> Patchwork_17395 CI-20190529: 20190529 CI_DRM_8339: aff3f45feee4ed28b07e06bacac28c72c5315e37 @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17395: a5db16ec932c85ce6de749e953448d54a6b5e40f @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == a5db16ec932c drm/i915/psr: Use new DP VSC SDP compute routine on PSR 9a2ad5a32a45 drm/i915/dp: Add compute routine for DP PSR VSC SDP 07856a13f253 drm/i915: Stop sending DP SDPs on ddi disable b8c74d06c9ea drm/i915: Program DP SDPs on pipe updates e78863a17e96 drm/i915: Fix enabled infoframe states of lspcon 3f0c456bcfa9 drm/i915: Add state readout for DP VSC SDP beeabdf3a48b drm/i915: Add state readout for DP HDR Metadata Infoframe SDP baa9bbad6bef drm/i915: Program DP SDPs with computed configs 673d595572d2 drm/i915: Include DP VSC SDP in the crtc state dump ca01029436e6 drm/i915: Include DP HDR Metadata Infoframe SDP in the crtc state dump 6cccd45e8119 drm/i915: Include HDMI DRM infoframe in the crtc state dump 3df2cc567e5a drm: Add logging function for DP VSC SDP 5eae0ec2d8ae drm/i915/dp: Read out DP SDPs 33516c7eda02 video/hdmi: Add Unpack only function for DRM infoframe == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17395/index.html
Re: [Intel-gfx] [PATCH 15/18] drm/i915/i915_drv: Prefer drm_WARN_ON over WARN_ON
On Mon, 06 Apr 2020, Pankaj Bharadiya wrote:
> struct drm_device specific drm_WARN* macros include device information
> in the backtrace, so we know what device the warnings originate from.
>
> Prefer drm_WARN_ON over WARN_ON.
>
> Signed-off-by: Pankaj Bharadiya
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e9ee4daa9320..be33cab6403d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1647,7 +1647,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  #define HAS_DISPLAY(dev_priv) (INTEL_INFO(dev_priv)->pipe_mask != 0)
>
>  /* Only valid when HAS_DISPLAY() is true */
> -#define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)
> +#define INTEL_DISPLAY_ENABLED(dev_priv) \
> +	(drm_WARN_ON(&dev_priv->drm, !HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)

Needs parens around the dev_priv macro argument.

BR,
Jani.

>
>  static inline bool intel_vtd_active(void)
>  {

-- 
Jani Nikula, Intel Open Source Graphics Center
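To illustrate the review comment about parenthesizing the macro argument: a macro parameter is substituted textually, so operators in the expansion can bind more tightly than the operators inside the caller's expression. The sketch below is a userspace illustration with hypothetical names (TWICE_UNSAFE, TWICE_SAFE, DRM_OF and the *_sketch structures); none of them are i915 definitions.

```c
#include <assert.h>

/* Hypothetical, trimmed stand-ins for the real structures. */
struct drm_device_sketch {
	int id;
};

struct i915_sketch {
	struct drm_device_sketch drm;
};

/* Arithmetic illustration: the unparenthesized form changes the result. */
#define TWICE_UNSAFE(x) (x * 2)
#define TWICE_SAFE(x)   ((x) * 2)

/*
 * Parenthesized member access. Without the inner parentheses,
 * DRM_OF(base + 1) would expand to (&base + 1->drm), which does not
 * compile because -> would bind to the literal 1.
 */
#define DRM_OF(dev_priv) (&(dev_priv)->drm)

/* Returns the id of the second element, reached via pointer arithmetic. */
static int second_device_id(struct i915_sketch *base)
{
	return DRM_OF(base + 1)->id;
}
```

With a plain identifier as the argument both forms behave the same, which is why a missing pair of parentheses tends to go unnoticed until a caller passes a compound expression.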
[Intel-gfx] [PATCH] drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help conserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffy], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_rps.c | 39 + 1 file changed, 18 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 4dcfae16a7ce..b94f92b7860d 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -57,7 +57,7 @@ static u32 rps_pm_mask(struct intel_rps *rps, u8 val) if (val < rps->max_freq_softlimit) mask |= GEN6_PM_RP_UP_EI_EXPIRED | GEN6_PM_RP_UP_THRESHOLD; - mask &= READ_ONCE(rps->pm_events); + mask &= rps->pm_events; return rps_pm_sanitize_mask(rps, ~mask); } @@ -70,19 +70,9 @@ static void rps_reset_ei(struct intel_rps *rps) static void rps_enable_interrupts(struct intel_rps *rps) { struct intel_gt *gt = rps_to_gt(rps); - u32 events; rps_reset_ei(rps); - if (IS_VALLEYVIEW(gt->i915)) - /* WaGsvRC0ResidencyMethod:vlv */ - events = GEN6_PM_RP_UP_EI_EXPIRED; - else - events = (GEN6_PM_RP_UP_THRESHOLD | - GEN6_PM_RP_DOWN_THRESHOLD | - GEN6_PM_RP_DOWN_TIMEOUT); - WRITE_ONCE(rps->pm_events, events); - spin_lock_irq(&gt->irq_lock); gen6_gt_pm_enable_irq(gt, rps->pm_events); spin_unlock_irq(&gt->irq_lock); @@ -919,12 +909,10 @@ static bool gen9_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, 
GEN6_RC_VIDEO_FREQ, GEN9_FREQUENCY(rps->rp1_freq)); - /* 1 second timeout */ - intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, - GT_INTERVAL_FROM_US(i915, 100)); - intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 0xa); + rps->pm_events = GEN6_PM_RP_UP_THRESHOLD | GEN6_PM_RP_DOWN_THRESHOLD; + return rps_reset(rps); } @@ -935,12 +923,10 @@ static bool gen8_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, GEN6_RC_VIDEO_FREQ, HSW_FREQUENCY(rps->rp1_freq)); - /* NB: Docs say 1s, and 100 - which aren't equivalent */ - intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, - 1 / 128); /* 1 second timeout */ - intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 10); + rps->pm_events = GEN6_PM_RP_UP_THRESHOLD | GEN6_PM_RP_DOWN_THRESHOLD; + return rps_reset(rps); } @@ -952,6 +938,10 @@ static bool gen6_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, 5); intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 10); + rps->pm_events = (GEN6_PM_RP_UP_THRESHOLD | + GEN6_PM_RP_DOWN_THRESHOLD | + GEN6_PM_RP_DOWN_TIMEOUT); + return rps_reset(rps); } @@ -1037,6 +1027,10 @@ static bool chv_rps_enable(struct intel_rps *rps) GEN6_RP_UP_BUSY_AVG | GEN6_RP_DOWN_IDLE_AVG); + rps->pm_events = (GEN6_PM_RP_UP_THRESHOLD | + GEN6_PM_RP_DOWN_THRESHOLD | + GEN6_PM_RP_DOWN_TIMEOUT); + /* Setting Fixed Bias */ vlv_punit_get(i915); @@ -1135,6 +1129,9 @@ static bool vlv_rps_enable(struct intel_rps *rps) GEN6_RP_UP_BUSY_AVG | GEN6_RP_DOWN_IDLE_CONT); + /* WaGsvRC0ResidencyMethod:vlv */ + rps->pm_events = GEN6_PM_RP_UP_EI_EXPIRED; + vlv_punit_get(i915); /* Setting Fixed Bias */ @@ -1469,7 +1466,7 @@ static void rps_work(struct work_struct *work) u32 pm_iir = 0; spin_lock_irq(&gt->irq_lock); - pm_iir = fetch_and_zero(&rps->pm_iir) & READ_ONCE(rps->pm_events); + pm_iir = fetch_and_zero(&rps->pm_iir) & rps->pm_events; client_boost = atomic_read(&rps->num_waiters); spin_unlock_irq(&gt->irq_lock); @@ -1572,7 +1569,7 @@ void gen6_rps_irq_handler(struct 
intel_rps *rps, u32 pm_iir) struct intel_gt *gt = rps_to_gt(rps); u32 events; - events = pm_iir & READ_ONCE(rps->pm_events); + e
Re: [Intel-gfx] [PATCH 17/18] drm/i915/pm: Prefer drm_WARN_ON over WARN_ON
On Mon, 06 Apr 2020, Pankaj Bharadiya wrote: > struct drm_device specific drm_WARN* macros include device information > in the backtrace, so we know what device the warnings originate from. > > Prefer drm_WARN_ON over WARN_ON. > > Conversion is done with below semantic patch: > > @@ > identifier func, T; > @@ > func(...) { > ... > struct intel_crtc *T = ...; > <+... > -WARN_ON( > +drm_WARN_ON(T->base.dev, > ...) > ...+> > > } > > @@ > identifier func, T; > @@ > func(struct intel_crtc_state *T,...) { > <+... > -WARN_ON( > +drm_WARN_ON(T->uapi.crtc->dev, > ...) > ...+> > > } > > Signed-off-by: Pankaj Bharadiya > --- > drivers/gpu/drm/i915/intel_pm.c | 57 ++--- > 1 file changed, 32 insertions(+), 25 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c > index 8375054ba27d..b2d22fdaf3db 100644 > --- a/drivers/gpu/drm/i915/intel_pm.c > +++ b/drivers/gpu/drm/i915/intel_pm.c > @@ -1464,8 +1464,8 @@ static int g4x_compute_intermediate_wm(struct > intel_crtc_state *new_crtc_state) > max(optimal->wm.plane[plane_id], > active->wm.plane[plane_id]); > > - WARN_ON(intermediate->wm.plane[plane_id] > > - g4x_plane_fifo_size(plane_id, G4X_WM_LEVEL_NORMAL)); > + drm_WARN_ON(crtc->base.dev, intermediate->wm.plane[plane_id] > > + g4x_plane_fifo_size(plane_id, G4X_WM_LEVEL_NORMAL)); > } > > intermediate->sr.plane = max(optimal->sr.plane, > @@ -1482,21 +1482,25 @@ static int g4x_compute_intermediate_wm(struct > intel_crtc_state *new_crtc_state) > intermediate->hpll.fbc = max(optimal->hpll.fbc, >active->hpll.fbc); > > - WARN_ON((intermediate->sr.plane > > - g4x_plane_fifo_size(PLANE_PRIMARY, G4X_WM_LEVEL_SR) || > - intermediate->sr.cursor > > - g4x_plane_fifo_size(PLANE_CURSOR, G4X_WM_LEVEL_SR)) && > - intermediate->cxsr); > - WARN_ON((intermediate->sr.plane > > - g4x_plane_fifo_size(PLANE_PRIMARY, G4X_WM_LEVEL_HPLL) || > - intermediate->sr.cursor > > - g4x_plane_fifo_size(PLANE_CURSOR, G4X_WM_LEVEL_HPLL)) && > - intermediate->hpll_en); > - > - 
WARN_ON(intermediate->sr.fbc > g4x_fbc_fifo_size(1) && > - intermediate->fbc_en && intermediate->cxsr); > - WARN_ON(intermediate->hpll.fbc > g4x_fbc_fifo_size(2) && > - intermediate->fbc_en && intermediate->hpll_en); > + drm_WARN_ON(crtc->base.dev, > + (intermediate->sr.plane > > + g4x_plane_fifo_size(PLANE_PRIMARY, G4X_WM_LEVEL_SR) || > + intermediate->sr.cursor > > + g4x_plane_fifo_size(PLANE_CURSOR, G4X_WM_LEVEL_SR)) && > + intermediate->cxsr); > + drm_WARN_ON(crtc->base.dev, > + (intermediate->sr.plane > > + g4x_plane_fifo_size(PLANE_PRIMARY, G4X_WM_LEVEL_HPLL) || > + intermediate->sr.cursor > > + g4x_plane_fifo_size(PLANE_CURSOR, G4X_WM_LEVEL_HPLL)) && > + intermediate->hpll_en); > + > + drm_WARN_ON(crtc->base.dev, > + intermediate->sr.fbc > g4x_fbc_fifo_size(1) && > + intermediate->fbc_en && intermediate->cxsr); > + drm_WARN_ON(crtc->base.dev, > + intermediate->hpll.fbc > g4x_fbc_fifo_size(2) && > + intermediate->fbc_en && intermediate->hpll_en); Please add an i915 local variable and use &i915->drm. 
> > out: > /* > @@ -1748,11 +1752,11 @@ static int vlv_compute_fifo(struct intel_crtc_state > *crtc_state) > fifo_left -= plane_extra; > } > > - WARN_ON(active_planes != 0 && fifo_left != 0); > + drm_WARN_ON(crtc->base.dev, active_planes != 0 && fifo_left != 0); > > /* give it all to the first plane if none are active */ > if (active_planes == 0) { > - WARN_ON(fifo_left != fifo_size); > + drm_WARN_ON(crtc->base.dev, fifo_left != fifo_size); > fifo_state->plane[PLANE_PRIMARY] = fifo_left; > } > > @@ -4154,7 +4158,8 @@ skl_plane_downscale_amount(const struct > intel_crtc_state *crtc_state, > uint_fixed_16_16_t fp_w_ratio, fp_h_ratio; > uint_fixed_16_16_t downscale_h, downscale_w; > > - if (WARN_ON(!intel_wm_plane_visible(crtc_state, plane_state))) > + if (drm_WARN_ON(crtc_state->uapi.crtc->dev, > + !intel_wm_plane_visible(crtc_state, plane_state))) > return u32_to_fixed16(0); > > /* > @@ -4815,7 +4820,7 @@ intel_get_linetime_us(const struct intel_crtc_state > *crtc_state) > > pixel_rate = crtc_state->pixel_rate; > > - if (WARN_ON(pixel_rate == 0)) > + if (drm_WARN_ON(crtc_state->uapi.crtc->dev, pixel_rate == 0)) > return u32_to_fixed16(0); > > crtc_htotal =
Re: [Intel-gfx] [PATCH 18/18] drm/i915/runtime_pm: Prefer drm_WARN* over WARN*
Imre, please check the one question inline. On Mon, 06 Apr 2020, Pankaj Bharadiya wrote: > struct drm_device specific drm_WARN* macros include device information > in the backtrace, so we know what device the warnings originate from. > > Prefer drm_WARN* over WARN*. > > Conversion is done with below semantic patch: > > @@ > identifier func, T; > @@ > func(struct intel_runtime_pm *T,...) { > + struct drm_i915_private *i915 = container_of(T, struct drm_i915_private, > runtime_pm); > <+... > ( > -WARN( > +drm_WARN(&i915->drm, > ...) > | > -WARN_ON( > +drm_WARN_ON(&i915->drm, > ...) > | > -WARN_ONCE( > +drm_WARN_ONCE(&i915->drm, > ...) > | > -WARN_ON_ONCE( > +drm_WARN_ON_ONCE(&i915->drm, > ...) > ) > ...+> > > } > > Signed-off-by: Pankaj Bharadiya > --- > drivers/gpu/drm/i915/intel_runtime_pm.c | 39 ++--- > 1 file changed, 28 insertions(+), 11 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c > b/drivers/gpu/drm/i915/intel_runtime_pm.c > index ad719c9602af..31ccd0559c55 100644 > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c > @@ -116,6 +116,9 @@ track_intel_runtime_pm_wakeref(struct intel_runtime_pm > *rpm) > static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm, >depot_stack_handle_t stack) > { > + struct drm_i915_private *i915 = container_of(rpm, > + struct drm_i915_private, > + runtime_pm); Is this a dependency we want to add? Should struct intel_runtime_pm be allowed to be elsewhere than struct i915? BR, Jani. 
> unsigned long flags, n; > bool found = false; > > @@ -134,9 +137,9 @@ static void untrack_intel_runtime_pm_wakeref(struct > intel_runtime_pm *rpm, > } > spin_unlock_irqrestore(&rpm->debug.lock, flags); > > - if (WARN(!found, > - "Unmatched wakeref (tracking %lu), count %u\n", > - rpm->debug.count, atomic_read(&rpm->wakeref_count))) { > + if (drm_WARN(&i915->drm, !found, > + "Unmatched wakeref (tracking %lu), count %u\n", > + rpm->debug.count, atomic_read(&rpm->wakeref_count))) { > char *buf; > > buf = kmalloc(PAGE_SIZE, GFP_NOWAIT | __GFP_NOWARN); > @@ -355,10 +358,14 @@ intel_runtime_pm_release(struct intel_runtime_pm *rpm, > int wakelock) > static intel_wakeref_t __intel_runtime_pm_get(struct intel_runtime_pm *rpm, > bool wakelock) > { > + struct drm_i915_private *i915 = container_of(rpm, > + struct drm_i915_private, > + runtime_pm); > int ret; > > ret = pm_runtime_get_sync(rpm->kdev); > - WARN_ONCE(ret < 0, "pm_runtime_get_sync() failed: %d\n", ret); > + drm_WARN_ONCE(&i915->drm, ret < 0, > + "pm_runtime_get_sync() failed: %d\n", ret); > > intel_runtime_pm_acquire(rpm, wakelock); > > @@ -539,6 +546,9 @@ void intel_runtime_pm_put(struct intel_runtime_pm *rpm, > intel_wakeref_t wref) > */ > void intel_runtime_pm_enable(struct intel_runtime_pm *rpm) > { > + struct drm_i915_private *i915 = container_of(rpm, > + struct drm_i915_private, > + runtime_pm); > struct device *kdev = rpm->kdev; > > /* > @@ -565,7 +575,8 @@ void intel_runtime_pm_enable(struct intel_runtime_pm *rpm) > > pm_runtime_dont_use_autosuspend(kdev); > ret = pm_runtime_get_sync(kdev); > - WARN(ret < 0, "pm_runtime_get_sync() failed: %d\n", ret); > + drm_WARN(&i915->drm, ret < 0, > + "pm_runtime_get_sync() failed: %d\n", ret); > } else { > pm_runtime_use_autosuspend(kdev); > } > @@ -580,11 +591,14 @@ void intel_runtime_pm_enable(struct intel_runtime_pm > *rpm) > > void intel_runtime_pm_disable(struct intel_runtime_pm *rpm) > { > + struct drm_i915_private *i915 = container_of(rpm, > + struct 
drm_i915_private, > + runtime_pm); > struct device *kdev = rpm->kdev; > > /* Transfer rpm ownership back to core */ > - WARN(pm_runtime_get_sync(kdev) < 0, > - "Failed to pass rpm ownership back to core\n"); > + drm_WARN(&i915->drm, pm_runtime_get_sync(kdev) < 0, > + "Failed to pass rpm ownership back to core\n"); > > pm_runtime_dont_use_autosuspend(kdev); > > @@ -594,12 +608,15 @@ void intel_runtime_pm_disable(struct intel_runtime_pm > *rpm) > > void intel_runtime_pm_driver_release(struct intel_runtime_pm *rpm) > { > + struct drm_i915_private *i915 = container_of(rpm, > +
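For readers following Jani's question about the new dependency, here is a minimal userspace sketch of what container_of() does. The structure names below are simplified, hypothetical stand-ins rather than the real driver definitions, and the macro is a plain-C approximation of the kernel one. The point is that the pointer arithmetic is only valid when the runtime_pm object really is embedded in the outer structure, which is exactly the coupling being asked about.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation of the kernel's container_of(): step back from a
 * pointer to an embedded member to the structure that contains it. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-ins for the driver structures. */
struct drm_device { int id; };
struct runtime_pm { int wakeref_count; };

struct i915_private {
	struct drm_device drm;
	struct runtime_pm runtime_pm;	/* embedded by value, not a pointer */
};

/* Recover the owning device from the embedded runtime_pm member.
 * Only valid if rpm really is the runtime_pm field of an i915_private. */
static struct drm_device *rpm_to_drm(struct runtime_pm *rpm)
{
	struct i915_private *i915;

	assert(rpm != NULL);
	i915 = container_of(rpm, struct i915_private, runtime_pm);
	return &i915->drm;
}
```

If a struct runtime_pm were ever allocated on its own, rpm_to_drm() would compute a pointer into unrelated memory without any compile-time warning.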
Re: [Intel-gfx] [PATCH 14/18] drm/i915/gem: Prefer drm_WARN* over WARN*
On Mon, 06 Apr 2020, Pankaj Bharadiya wrote: > struct drm_device specific drm_WARN* macros include device information > in the backtrace, so we know what device the warnings originate from. > > Prefer drm_WARN* over WARN* at places where struct drm_device pointer > can be extracted. I'd like to have Chris' ack on this. BR, Jani. > > Signed-off-by: Pankaj Bharadiya > --- > drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- > drivers/gpu/drm/i915/gem/i915_gem_phys.c | 3 ++- > drivers/gpu/drm/i915/gem/i915_gem_userptr.c| 2 +- > 3 files changed, 4 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > index 9d11bad74e9a..d910eb9b77ef 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c > @@ -1440,7 +1440,7 @@ eb_relocate_entry(struct i915_execbuffer *eb, > err = i915_vma_bind(target->vma, > target->vma->obj->cache_level, > PIN_GLOBAL, NULL); > - if (WARN_ONCE(err, > + if (drm_WARN_ONCE(&i915->drm, err, > "Unexpected failure to bind target VMA!")) > return err; > } > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c > b/drivers/gpu/drm/i915/gem/i915_gem_phys.c > index 7fe9831aa9ba..4c1c7232b024 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c > @@ -27,7 +27,8 @@ static int i915_gem_object_get_pages_phys(struct > drm_i915_gem_object *obj) > void *dst; > int i; > > - if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj))) > + if (drm_WARN_ON(obj->base.dev, > + i915_gem_object_needs_bit17_swizzle(obj))) > return -EINVAL; > > /* > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > index 7ffd7afeb7a5..8b0708708671 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c > @@ -235,7 +235,7 @@ i915_gem_userptr_init__mmu_notifier(struct > 
drm_i915_gem_object *obj, > if (flags & I915_USERPTR_UNSYNCHRONIZED) > return capable(CAP_SYS_ADMIN) ? 0 : -EPERM; > > - if (WARN_ON(obj->userptr.mm == NULL)) > + if (drm_WARN_ON(obj->base.dev, obj->userptr.mm == NULL)) > return -EINVAL; > > mn = i915_mmu_notifier_find(obj->userptr.mm); -- Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
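The conversions in this patch rely on the warning macro evaluating to its condition, so it can sit directly inside an if. A rough userspace sketch of that pattern follows. DEV_WARN_ON, dev_warn_on, struct device and check_mm are all hypothetical names made up for illustration; this is not the drm_WARN_ON implementation, only the same "report with device context, then return the condition" shape.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct drm_device. */
struct device { const char *name; };

/* Report a tripped condition together with the device name, then hand the
 * condition back so the caller can branch on it. */
static bool dev_warn_on(const struct device *dev, bool cond,
			const char *expr, const char *file, int line)
{
	assert(dev != NULL);
	if (cond)
		fprintf(stderr, "%s: WARN_ON(%s) at %s:%d\n",
			dev->name, expr, file, line);
	return cond;
}

#define DEV_WARN_ON(dev, cond) \
	dev_warn_on((dev), (cond), #cond, __FILE__, __LINE__)

/* Mirrors the pattern in the patch: warn, then bail out with an error. */
static int check_mm(const struct device *dev, const void *mm)
{
	if (DEV_WARN_ON(dev, mm == NULL))
		return -22;	/* numeric value of -EINVAL on Linux */
	return 0;
}
```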
Re: [Intel-gfx] [PATCH] drm/i915/selftests: Show the pstate limits on any failure to reset min
Chris Wilson writes:

> We want to see the pstate limits whenever we fail to set the minimum
> frequency as that may help for debugging.
>
> Signed-off-by: Chris Wilson

Reviewed-by: Mika Kuoppala

> ---
>  drivers/gpu/drm/i915/gt/selftest_rps.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c
> index d7cd673550ef..e0a791eac752 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_rps.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c
> @@ -238,6 +238,7 @@ int live_rps_control(void *arg)
>  	pr_err("%s: could not set minimum frequency [%x], only %x!\n",
>  	       engine->name, rps->min_freq, read_cagf(rps));
>  	igt_spinner_end(&spin);
> +	show_pstate_limits(rps);
>  	err = -EINVAL;
>  	break;
>  }
> @@ -278,6 +279,7 @@ int live_rps_control(void *arg)
>  if (limit == rps->min_freq) {
>  	pr_err("%s: GPU throttled to minimum!\n",
>  	       engine->name);
> +	show_pstate_limits(rps);
>  	err = -ENODEV;
>  	break;
>  }
> --
> 2.20.1
Re: [Intel-gfx] [PATCH 00/18] Prefer drm_WARN* over WARN*
On Mon, 06 Apr 2020, Pankaj Bharadiya wrote:
> Now we have struct drm_device specific drm_WARN* macros which include
> device information in the backtrace, so we know what device the
> warnings originate from.
>
> This series converts WARN* with drm_WARN* where struct drm_device
> pointer can be extracted.

I think I pushed all the patches that I didn't comment on separately,
and that still applied.

BR,
Jani.

--
Jani Nikula, Intel Open Source Graphics Center
Re: [Intel-gfx] [PATCH] drm/i915/selftests: Show the pstate limits on any failure to reset min
Quoting Mika Kuoppala (2020-04-21 09:28:41)
> Chris Wilson writes:
>
> > We want to see the pstate limits whenever we fail to set the minimum
> > frequency as that may help for debugging.
> >
> > Signed-off-by: Chris Wilson
>
> Reviewed-by: Mika Kuoppala

It was a nice idea but for soraka, it reports 0x00ff, which means
unlimited [min = 0, max = ff]. Grr.

Thanks for the review, it may help for some systems!
-Chris
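To illustrate why a reading of 0x00ff is unhelpful, here is a small sketch that splits such a value into a minimum and a maximum. The field layout (high byte is the minimum, low byte is the maximum) is an assumption inferred only from the "[min = 0, max = ff]" remark above, and decode_limits and limits_are_open are hypothetical helpers, not functions from the selftest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct pstate_limits {
	uint8_t min;
	uint8_t max;
};

/* Assumed layout, inferred from the 0x00ff example only:
 * bits 15:8 hold the minimum, bits 7:0 hold the maximum. */
static struct pstate_limits decode_limits(uint16_t raw)
{
	struct pstate_limits l;

	l.min = (uint8_t)(raw >> 8);
	l.max = (uint8_t)(raw & 0xff);
	return l;
}

/* The full range 0..0xff means no limit is being applied at all, so the
 * value says nothing about why the frequency request was not honoured. */
static bool limits_are_open(struct pstate_limits l)
{
	return l.min == 0 && l.max == 0xff;
}
```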
[Intel-gfx] [PATCH xf86-video-intel v4] Sync i915_pciids upto 8717c6b7414f
Import the kernel's i915_pciids.h, up to: commit 8717c6b7414ffb890672276dccc284c23078ac0e Author: Lee Shawn C Date: Tue Dec 10 23:04:15 2019 +0800 drm/i915/cml: Separate U series pci id from origianl list. Signed-off-by: Liwei Song --- V3 -> V4: Add missed PINEVIEW V2 -> V3: Add 4 new info blocks and add sound support for them. Change since V1: replace old definition in intel_module.c and dri3-test.c --- src/i915_pciids.h | 265 -- src/intel_module.c| 93 ++- src/sna/gen9_render.c | 48 test/dri3-test.c | 3 +- 4 files changed, 346 insertions(+), 63 deletions(-) diff --git a/src/i915_pciids.h b/src/i915_pciids.h index fd965ffbb92e..1d2c12219f44 100644 --- a/src/i915_pciids.h +++ b/src/i915_pciids.h @@ -108,8 +108,10 @@ INTEL_VGA_DEVICE(0x2e42, info), /* B43_G */ \ INTEL_VGA_DEVICE(0x2e92, info) /* B43_G.1 */ -#define INTEL_PINEVIEW_IDS(info) \ - INTEL_VGA_DEVICE(0xa001, info), \ +#define INTEL_PINEVIEW_G_IDS(info) \ + INTEL_VGA_DEVICE(0xa001, info) + +#define INTEL_PINEVIEW_M_IDS(info) \ INTEL_VGA_DEVICE(0xa011, info) #define INTEL_IRONLAKE_D_IDS(info) \ @@ -166,7 +168,18 @@ #define INTEL_IVB_Q_IDS(info) \ INTEL_QUANTA_VGA_DEVICE(info) /* Quanta transcode */ +#define INTEL_HSW_ULT_GT1_IDS(info) \ + INTEL_VGA_DEVICE(0x0A02, info), /* ULT GT1 desktop */ \ + INTEL_VGA_DEVICE(0x0A0A, info), /* ULT GT1 server */ \ + INTEL_VGA_DEVICE(0x0A0B, info), /* ULT GT1 reserved */ \ + INTEL_VGA_DEVICE(0x0A06, info) /* ULT GT1 mobile */ + +#define INTEL_HSW_ULX_GT1_IDS(info) \ + INTEL_VGA_DEVICE(0x0A0E, info) /* ULX GT1 mobile */ + #define INTEL_HSW_GT1_IDS(info) \ + INTEL_HSW_ULT_GT1_IDS(info), \ + INTEL_HSW_ULX_GT1_IDS(info), \ INTEL_VGA_DEVICE(0x0402, info), /* GT1 desktop */ \ INTEL_VGA_DEVICE(0x040a, info), /* GT1 server */ \ INTEL_VGA_DEVICE(0x040B, info), /* GT1 reserved */ \ @@ -175,20 +188,26 @@ INTEL_VGA_DEVICE(0x0C0A, info), /* SDV GT1 server */ \ INTEL_VGA_DEVICE(0x0C0B, info), /* SDV GT1 reserved */ \ INTEL_VGA_DEVICE(0x0C0E, info), /* SDV GT1 reserved */ \ - 
INTEL_VGA_DEVICE(0x0A02, info), /* ULT GT1 desktop */ \ - INTEL_VGA_DEVICE(0x0A0A, info), /* ULT GT1 server */ \ - INTEL_VGA_DEVICE(0x0A0B, info), /* ULT GT1 reserved */ \ INTEL_VGA_DEVICE(0x0D02, info), /* CRW GT1 desktop */ \ INTEL_VGA_DEVICE(0x0D0A, info), /* CRW GT1 server */ \ INTEL_VGA_DEVICE(0x0D0B, info), /* CRW GT1 reserved */ \ INTEL_VGA_DEVICE(0x0D0E, info), /* CRW GT1 reserved */ \ INTEL_VGA_DEVICE(0x0406, info), /* GT1 mobile */ \ INTEL_VGA_DEVICE(0x0C06, info), /* SDV GT1 mobile */ \ - INTEL_VGA_DEVICE(0x0A06, info), /* ULT GT1 mobile */ \ - INTEL_VGA_DEVICE(0x0A0E, info), /* ULX GT1 mobile */ \ INTEL_VGA_DEVICE(0x0D06, info) /* CRW GT1 mobile */ +#define INTEL_HSW_ULT_GT2_IDS(info) \ + INTEL_VGA_DEVICE(0x0A12, info), /* ULT GT2 desktop */ \ + INTEL_VGA_DEVICE(0x0A1A, info), /* ULT GT2 server */ \ + INTEL_VGA_DEVICE(0x0A1B, info), /* ULT GT2 reserved */ \ + INTEL_VGA_DEVICE(0x0A16, info) /* ULT GT2 mobile */ + +#define INTEL_HSW_ULX_GT2_IDS(info) \ + INTEL_VGA_DEVICE(0x0A1E, info) /* ULX GT2 mobile */ \ + #define INTEL_HSW_GT2_IDS(info) \ + INTEL_HSW_ULT_GT2_IDS(info), \ + INTEL_HSW_ULX_GT2_IDS(info), \ INTEL_VGA_DEVICE(0x0412, info), /* GT2 desktop */ \ INTEL_VGA_DEVICE(0x041a, info), /* GT2 server */ \ INTEL_VGA_DEVICE(0x041B, info), /* GT2 reserved */ \ @@ -197,9 +216,6 @@ INTEL_VGA_DEVICE(0x0C1A, info), /* SDV GT2 server */ \ INTEL_VGA_DEVICE(0x0C1B, info), /* SDV GT2 reserved */ \ INTEL_VGA_DEVICE(0x0C1E, info), /* SDV GT2 reserved */ \ - INTEL_VGA_DEVICE(0x0A12, info), /* ULT GT2 desktop */ \ - INTEL_VGA_DEVICE(0x0A1A, info), /* ULT GT2 server */ \ - INTEL_VGA_DEVICE(0x0A1B, info), /* ULT GT2 reserved */ \ INTEL_VGA_DEVICE(0x0D12, info), /* CRW GT2 desktop */ \ INTEL_VGA_DEVICE(0x0D1A, info), /* CRW GT2 server */ \ INTEL_VGA_DEVICE(0x0D1B, info), /* CRW GT2 reserved */ \ @@ -207,11 +223,17 @@ INTEL_VGA_DEVICE(0x0416, info), /* GT2 mobile */ \ INTEL_VGA_DEVICE(0x0426, info), /* GT2 mobile */ \ INTEL_VGA_DEVICE(0x0C16, info), /* SDV GT2 mobile */ 
\ - INTEL_VGA_DEVICE(0x0A16, info), /* ULT GT2 mobile */ \ - INTEL_VGA_DEVICE(0x0A1E, info), /* ULX GT2 mobile */ \ INTEL_VGA_DEVICE(0x0D16, info) /* CRW GT2 mobile */ +#define INTEL_HSW_ULT_GT3_IDS(info) \ + INTEL_VGA_DEVICE(0x0A22, info), /* ULT GT3 desktop */ \ + INTEL_VGA_DEVICE(0x0A2A, info), /* ULT GT3 server */ \ + INTEL_VGA_DEVICE(0x0A2B, info), /* ULT GT3 reserved */ \ + INTEL_VGA_DEVICE(0x0A26, info), /* ULT GT3 mobile
[Intel-gfx] [PATCH v3] drm/i915/gt: Poison residual state [HWSP] across resume.
Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: Tvrtko Ursulin Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Venkata Ramana Nayana Cc: Daniele Ceraolo Spurio --- drivers/gpu/drm/i915/gt/intel_lrc.c | 16 +++- drivers/gpu/drm/i915/gt/intel_timeline.c | 17 - 2 files changed, 31 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 34f67eb9bfa1..248db89fd293 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3649,6 +3649,18 @@ static void reset_csb_pointers(struct intel_engine_cs *engine) static void execlists_sanitize(struct intel_engine_cs *engine) { + /* +* Poison residual state on resume, in case the suspend didn't! +* +* We have to assume that across suspend/resume (or other loss +* of control) that the contents of our pinned buffers has been +* lost, replaced by garbage. Since this doesn't always happen, +* let's poison such state so that we more quickly spot when +* we falsely assume it has been preserved. 
+*/ + if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) + memset(engine->status_page.addr, POISON_INUSE, PAGE_SIZE); + reset_csb_pointers(engine); } @@ -4539,6 +4551,8 @@ static void execlists_shutdown(struct intel_engine_cs *engine) static void execlists_release(struct intel_engine_cs *engine) { + engine->sanitize = NULL; /* no longer in control, nothing to sanitize */ + execlists_shutdown(engine); intel_engine_cleanup_common(engine); @@ -4550,7 +4564,6 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine) { /* Default vfuncs which can be overriden by each engine. */ - engine->sanitize = execlists_sanitize; engine->resume = execlists_resume; engine->cops = &execlists_context_ops; @@ -4666,6 +4679,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) execlists->csb_size = GEN11_CSB_ENTRIES; /* Finally, take ownership and responsibility for cleanup! */ + engine->sanitize = execlists_sanitize; engine->release = execlists_release; return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 3779c2ae0d65..373cedd45ddd 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -337,6 +337,13 @@ int intel_timeline_pin(struct intel_timeline *tl) return 0; } +static void intel_timeline_reset_seqno(struct intel_timeline *tl) +{ + /* Must be pinned to be writable, and no requests in flight. */ + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + WRITE_ONCE(*(u32 *)tl->hwsp_seqno, tl->seqno); +} + void intel_timeline_enter(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; @@ -365,8 +372,16 @@ void intel_timeline_enter(struct intel_timeline *tl) return; spin_lock(&timelines->lock); - if (!atomic_fetch_inc(&tl->active_count)) + if (!atomic_fetch_inc(&tl->active_count)) { + /* +* The HWSP is volatile, and may have been lost while inactive, +* e.g. across suspend/resume. 
Be paranoid, and ensure that +* the HWSP value matches our seqno so we don't proclaim +* the next request as already complete. +*/ + intel_timeline_reset_seqno(tl); list_add_tail(&tl->link, &timelines->active_list); + } spin_unlock(&timelines->lock); } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
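The failure mode described in the new comment can be reproduced in a few lines of userspace C. This is only a sketch under simplifying assumptions: struct timeline, poison_status_page, timeline_reset_seqno and seqno_passed are hypothetical stand-ins for the driver's structures and helpers, and the status page is an ordinary array. The poison byte 0x5a matches the kernel's POISON_INUSE value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES	4096
#define POISON_BYTE	0x5a	/* same value as the kernel's POISON_INUSE */

/* Hypothetical, simplified timeline: our own bookkeeping plus the slot in
 * the status page where completed seqnos are reported. */
struct timeline {
	uint32_t seqno;			/* last seqno we handed out */
	volatile uint32_t *hwsp_seqno;	/* where completion is reported */
};

/* Simulate losing the page contents across suspend/resume. */
static void poison_status_page(void *page)
{
	assert(page != NULL);
	memset(page, POISON_BYTE, PAGE_BYTES);
}

/* Rewrite the slot so that it agrees with our bookkeeping again. */
static void timeline_reset_seqno(struct timeline *tl)
{
	*tl->hwsp_seqno = tl->seqno;
}

/* A request counts as complete once the reported seqno has reached it;
 * the signed difference keeps the comparison correct across wraparound. */
static int seqno_passed(const struct timeline *tl, uint32_t seqno)
{
	return (int32_t)(*tl->hwsp_seqno - seqno) >= 0;
}
```

With the slot poisoned to 0x5a5a5a5a, a request with a small seqno looks as if it had already completed, which is the false completion the reset on timeline enter is meant to prevent.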
[Intel-gfx] [CI 1/2] drm/i915/gt: Trace RPS events
Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_rps.c | 48 ++--- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 4dcfae16a7ce..c4da41a90891 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -72,6 +72,9 @@ static void rps_enable_interrupts(struct intel_rps *rps) struct intel_gt *gt = rps_to_gt(rps); u32 events; + GT_TRACE(gt, "interrupts:on rps->pm_events: %x, rps_pm_mask:%x\n", +rps->pm_events, rps_pm_mask(rps, rps->last_freq)); + rps_reset_ei(rps); if (IS_VALLEYVIEW(gt->i915)) @@ -140,6 +143,7 @@ static void rps_disable_interrupts(struct intel_rps *rps) cancel_work_sync(&rps->work); rps_reset_interrupts(rps); + GT_TRACE(gt, "interrupts:off\n"); } static const struct cparams { @@ -581,6 +585,10 @@ static void rps_set_power(struct intel_rps *rps, int new_power) if (IS_VALLEYVIEW(i915)) goto skip_hw_write; + GT_TRACE(rps_to_gt(rps), +"changing power mode [%d], up %d%% @ %dus, down %d%% @ %dus\n", +new_power, threshold_up, ei_up, threshold_down, ei_down); + set(uncore, GEN6_RP_UP_EI, GT_INTERVAL_FROM_US(i915, ei_up)); set(uncore, GEN6_RP_UP_THRESHOLD, GT_INTERVAL_FROM_US(i915, ei_up * threshold_up / 100)); @@ -645,6 +653,8 @@ static void gen6_rps_set_thresholds(struct intel_rps *rps, u8 val) void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive) { + GT_TRACE(rps_to_gt(rps), "mark interactive: %s\n", yesno(interactive)); + mutex_lock(&rps->power.mutex); if (interactive) { if (!rps->power.interactive++ && READ_ONCE(rps->active)) @@ -672,6 +682,9 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val) GEN6_AGGRESSIVE_TURBO); set(uncore, GEN6_RPNSWREQ, swreq); + GT_TRACE(rps_to_gt(rps), "set val:%x, freq:%d, swreq:%x\n", +val, 
intel_gpu_freq(rps, val), swreq); + return 0; } @@ -684,6 +697,9 @@ static int vlv_rps_set(struct intel_rps *rps, u8 val) err = vlv_punit_write(i915, PUNIT_REG_GPU_FREQ_REQ, val); vlv_punit_put(i915); + GT_TRACE(rps_to_gt(rps), "set val:%x, freq:%d\n", +val, intel_gpu_freq(rps, val)); + return err; } @@ -717,6 +733,8 @@ void intel_rps_unpark(struct intel_rps *rps) if (!rps->enabled) return; + GT_TRACE(rps_to_gt(rps), "unpark:%x\n", rps->cur_freq); + /* * Use the user's desired frequency as a guide, but for better * performance, jump directly to RPe as our starting frequency. @@ -784,6 +802,8 @@ void intel_rps_park(struct intel_rps *rps) */ rps->cur_freq = max_t(int, round_down(rps->cur_freq - 1, 2), rps->min_freq); + + GT_TRACE(rps_to_gt(rps), "park:%x\n", rps->cur_freq); } void intel_rps_boost(struct i915_request *rq) @@ -800,6 +820,9 @@ void intel_rps_boost(struct i915_request *rq) !dma_fence_is_signaled_locked(&rq->fence)) { set_bit(I915_FENCE_FLAG_BOOST, &rq->fence.flags); + GT_TRACE(rps_to_gt(rps), "boost fence:%llx:%llx\n", +rq->fence.context, rq->fence.seqno); + if (!atomic_fetch_inc(&rps->num_waiters) && READ_ONCE(rps->cur_freq) < rps->boost_freq) schedule_work(&rps->work); @@ -895,6 +918,7 @@ static void gen6_rps_init(struct intel_rps *rps) static bool rps_reset(struct intel_rps *rps) { struct drm_i915_private *i915 = rps_to_i915(rps); + /* force a reset */ rps->power.mode = -1; rps->last_freq = -1; @@ -1215,11 +1239,17 @@ void intel_rps_enable(struct intel_rps *rps) if (!rps->enabled) return; - drm_WARN_ON(&i915->drm, rps->max_freq < rps->min_freq); - drm_WARN_ON(&i915->drm, rps->idle_freq > rps->max_freq); + GT_TRACE(rps_to_gt(rps), +"min:%x, max:%x, freq:[%d, %d]\n", +rps->min_freq, rps->max_freq, +intel_gpu_freq(rps, rps->min_freq), +intel_gpu_freq(rps, rps->max_freq)); - drm_WARN_ON(&i915->drm, rps->efficient_freq < rps->min_freq); - drm_WARN_ON(&i915->drm, rps->efficient_freq > rps->max_freq); + GEM_BUG_ON(rps->max_freq < rps->min_freq); + 
GEM_BUG_ON(rps->idle_freq > rps->max_freq); + + GEM_BUG_ON(rps->efficient_freq < rps->min_freq); + GEM_BUG_ON(rps->efficient_freq > rps->max_freq); } static void gen6_rps_disable(struct intel_rps *rps) @@ -1487,6 +1517,12 @@ stati
[Intel-gfx] [CI 2/2] drm/i915/gt: Use the RPM config register to determine clk frequencies
For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson Cc: Andi Shyti Cc: Mika Kuoppala Reviewed-by: Andi Shyti --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 27 ++--- drivers/gpu/drm/i915/gt/intel_gt.c| 3 + .../gpu/drm/i915/gt/intel_gt_clock_utils.c| 102 ++ .../gpu/drm/i915/gt/intel_gt_clock_utils.h| 27 + drivers/gpu/drm/i915/gt/intel_gt_pm.c | 3 + drivers/gpu/drm/i915/gt/intel_gt_types.h | 9 +- drivers/gpu/drm/i915/gt/intel_rps.c | 36 --- drivers/gpu/drm/i915/gt/selftest_rps.c| 7 +- drivers/gpu/drm/i915/i915_debugfs.c | 34 +++--- drivers/gpu/drm/i915/i915_reg.h | 25 - 11 files changed, 205 insertions(+), 69 deletions(-) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_clock_utils.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 6f112d8f80ca..ce24a4ee9591 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -91,6 +91,7 @@ gt-y += \ gt/intel_ggtt.o \ gt/intel_ggtt_fencing.o \ gt/intel_gt.o \ + gt/intel_gt_clock_utils.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c index aab30d908072..d4e3b4c0c48f 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c @@ -10,6 +10,7 @@ #include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt.h" +#include "intel_gt_clock_utils.h" #include "intel_llc.h" #include "intel_rc6.h" #include "intel_rps.h" @@ -394,21 +395,23 @@ static int frequency_show(struct seq_file *m, void *unused) seq_printf(m, "RPDECLIMIT: 0x%08x\n", rpdeclimit); seq_printf(m, "RPNSWREQ: %dMHz\n", reqf); seq_printf(m, 
"CAGF: %dMHz\n", cagf); - seq_printf(m, "RP CUR UP EI: %d (%dus)\n", - rpupei, GT_PM_INTERVAL_TO_US(i915, rpupei)); - seq_printf(m, "RP CUR UP: %d (%dus)\n", - rpcurup, GT_PM_INTERVAL_TO_US(i915, rpcurup)); - seq_printf(m, "RP PREV UP: %d (%dus)\n", - rpprevup, GT_PM_INTERVAL_TO_US(i915, rpprevup)); + seq_printf(m, "RP CUR UP EI: %d (%dns)\n", + rpupei, intel_gt_pm_interval_to_ns(gt, rpupei)); + seq_printf(m, "RP CUR UP: %d (%dns)\n", + rpcurup, intel_gt_pm_interval_to_ns(gt, rpcurup)); + seq_printf(m, "RP PREV UP: %d (%dns)\n", + rpprevup, intel_gt_pm_interval_to_ns(gt, rpprevup)); seq_printf(m, "Up threshold: %d%%\n", rps->power.up_threshold); - seq_printf(m, "RP CUR DOWN EI: %d (%dus)\n", - rpdownei, GT_PM_INTERVAL_TO_US(i915, rpdownei)); - seq_printf(m, "RP CUR DOWN: %d (%dus)\n", - rpcurdown, GT_PM_INTERVAL_TO_US(i915, rpcurdown)); - seq_printf(m, "RP PREV DOWN: %d (%dus)\n", - rpprevdown, GT_PM_INTERVAL_TO_US(i915, rpprevdown)); + seq_printf(m, "RP CUR DOWN EI: %d (%dns)\n", + rpdownei, intel_gt_pm_interval_to_ns(gt, rpdownei)); + seq_printf(m, "RP CUR DOWN: %d (%dns)\n", + rpcurdown, + intel_gt_pm_interval_to_ns(gt, rpcurdown)); + seq_printf(m, "RP PREV DOWN: %d (%dns)\n", + rpprevdown, + intel_gt_pm_interval_to_ns(gt, rpprevdown)); seq_printf(m, "Down threshold: %d%%\n", rps->power.down_threshold); diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 1c99cc72305a..d9cf8194c997 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -7,6 +7,7 @@ #include "i915_drv.h" #include "intel_context.h" #include "intel_gt.h" +#include "intel_gt_clock_utils.h" #include "intel_gt_pm.h" #include "intel_gt_requests.h" #include "intel_mocs.h" @@ -576,6 +577,8 @@ int intel_gt_init(struct intel_gt *gt) */ intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL); + intel_gt_init_clock_frequency(gt); + err = intel_gt_init_scratch(gt, IS_GEN(gt->i915, 2) ? 
SZ_256K : SZ_4K); if (err) goto out_fw; diff --git a/drivers/gpu/drm/i915/gt/intel_gt_clock_utils.c b/drivers/gpu/d
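The unit conversion this patch is about (hardware clock intervals to and from nanoseconds) comes down to one multiplication and one division. The sketch below shows that arithmetic in isolation. The function names, the explicit clock_hz parameter and the round-up behaviour are assumptions for illustration; the patch's own helpers take a struct intel_gt and read the frequency from there, and their exact rounding is not visible in the truncated diff.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ull

/* Convert a number of clock ticks into nanoseconds, rounding up.
 * clock_hz is the (assumed known) frequency of the counting clock. */
static uint64_t interval_to_ns(uint32_t clock_hz, uint32_t count)
{
	assert(clock_hz != 0);
	return ((uint64_t)count * NSEC_PER_SEC + clock_hz - 1) / clock_hz;
}

/* Convert nanoseconds into clock ticks, rounding up. The multiplication
 * can overflow for very large ns values; this sketch ignores that. */
static uint64_t ns_to_interval(uint32_t clock_hz, uint64_t ns)
{
	return (ns * clock_hz + NSEC_PER_SEC - 1) / NSEC_PER_SEC;
}
```

For example, with a 19.2 MHz clock (a value chosen purely for illustration), 192 ticks correspond to exactly 10000 ns.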
Re: [Intel-gfx] [PATCH v2 5/5] uaccess: Rename user_access_begin/end() to user_full_access_begin/end()
On Tue, Apr 21, 2020 at 03:49:19AM +0100, Al Viro wrote:
> The only source I'd been able to find speaks of >= 60 cycles
> (and possibly much more) for non-pipelined coprocessor instructions;
> the list of such does contain loads and stores to a bunch of registers.
> However, the register in question (p15/c3) has only store mentioned there,
> so loads might be cheap; no obvious reasons for those to be slow.
> That's a question to arm folks, I'm afraid... rmk?

I have no information on that; instruction timings are not defined at
architecture level (architecture reference manual), nor do I find
information in the CPU technical reference manual (which would be
specific to the CPU). Instruction timings tend to be implementation
dependent.

I've always consulted Will Deacon when I've needed to know whether an
instruction is expensive or not.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up
Re: [Intel-gfx] [PATCH 05/18] drm/i915/display/display: Prefer drm_WARN_ON over WARN_ON
On Tue, Apr 21, 2020 at 10:53:12AM +0300, Jani Nikula wrote: > > Pankaj, the subject line is identical to patch 4, please update. > > Imre, one question inline for you. > > On Mon, 06 Apr 2020, Pankaj Bharadiya > wrote: > > diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c > > b/drivers/gpu/drm/i915/display/intel_display_power.c > > index 433e5a81dd4d..5475f989df4c 100644 > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c > > @@ -1850,22 +1850,29 @@ static u64 __async_put_domains_mask(struct > > i915_power_domains *power_domains) > > static bool > > assert_async_put_domain_masks_disjoint(struct i915_power_domains > > *power_domains) > > { > > - return !WARN_ON(power_domains->async_put_domains[0] & > > - power_domains->async_put_domains[1]); > > + struct drm_i915_private *i915 = container_of(power_domains, > > +struct drm_i915_private, > > +power_domains); > > + return !drm_WARN_ON(&i915->drm, power_domains->async_put_domains[0] & > > + power_domains->async_put_domains[1]); > > } > > Do we want to depend on struct i915_power_domains being a struct > drm_i915_private member via container_of? It looks ok to me, there is only one i915_power_domains struct per device. > BR, > Jani. 
> > > > > static bool > > __async_put_domains_state_ok(struct i915_power_domains *power_domains) > > { > > + struct drm_i915_private *i915 = container_of(power_domains, > > +struct drm_i915_private, > > +power_domains); > > enum intel_display_power_domain domain; > > bool err = false; > > > > err |= !assert_async_put_domain_masks_disjoint(power_domains); > > - err |= WARN_ON(!!power_domains->async_put_wakeref != > > - !!__async_put_domains_mask(power_domains)); > > + err |= drm_WARN_ON(&i915->drm, !!power_domains->async_put_wakeref != > > + !!__async_put_domains_mask(power_domains)); > > > > for_each_power_domain(domain, __async_put_domains_mask(power_domains)) > > - err |= WARN_ON(power_domains->domain_use_count[domain] != 1); > > + err |= drm_WARN_ON(&i915->drm, > > + power_domains->domain_use_count[domain] != > > 1); > > > > return !err; > > } > > @@ -2107,11 +2114,14 @@ static void > > queue_async_put_domains_work(struct i915_power_domains *power_domains, > > intel_wakeref_t wakeref) > > { > > - WARN_ON(power_domains->async_put_wakeref); > > + struct drm_i915_private *i915 = container_of(power_domains, > > +struct drm_i915_private, > > +power_domains); > > + drm_WARN_ON(&i915->drm, power_domains->async_put_wakeref); > > power_domains->async_put_wakeref = wakeref; > > - WARN_ON(!queue_delayed_work(system_unbound_wq, > > - &power_domains->async_put_work, > > - msecs_to_jiffies(100))); > > + drm_WARN_ON(&i915->drm, !queue_delayed_work(system_unbound_wq, > > + > > &power_domains->async_put_work, > > + msecs_to_jiffies(100))); > > } > > > > static void > > @@ -4318,6 +4328,9 @@ __set_power_wells(struct i915_power_domains > > *power_domains, > > const struct i915_power_well_desc *power_well_descs, > > int power_well_count) > > { > > + struct drm_i915_private *i915 = container_of(power_domains, > > +struct drm_i915_private, > > +power_domains); > > u64 power_well_ids = 0; > > int i; > > > > @@ -4337,8 +4350,8 @@ __set_power_wells(struct i915_power_domains > > 
*power_domains, > > if (id == DISP_PW_ID_NONE) > > continue; > > > > - WARN_ON(id >= sizeof(power_well_ids) * 8); > > - WARN_ON(power_well_ids & BIT_ULL(id)); > > + drm_WARN_ON(&i915->drm, id >= sizeof(power_well_ids) * 8); > > + drm_WARN_ON(&i915->drm, power_well_ids & BIT_ULL(id)); > > power_well_ids |= BIT_ULL(id); > > } > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
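The two checks converted at the end of this hunk implement a common idiom: use a 64-bit mask to detect duplicate IDs, after first making sure each ID fits in the mask. A standalone sketch of that idiom follows. ids_are_unique is a hypothetical helper written for illustration; unlike __set_power_wells() it returns a result instead of warning, and it does not skip a "none" ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if every id fits in the 64-bit mask and none repeats. */
static bool ids_are_unique(const unsigned int *ids, size_t count)
{
	uint64_t seen = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		/* Shifting by 64 or more would be undefined, so check first. */
		if (ids[i] >= sizeof(seen) * 8)
			return false;
		/* Bit already set: this id was seen earlier in the list. */
		if (seen & (UINT64_C(1) << ids[i]))
			return false;
		seen |= UINT64_C(1) << ids[i];
	}
	return true;
}
```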
[Intel-gfx] [PATCH v4] drm/i915/gt: Poison residual state [HWSP] across resume.
Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: Tvrtko Ursulin Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Venkata Ramana Nayana Cc: Daniele Ceraolo Spurio --- Reset in sanitize, for we may attempt to park the engine before using any timelines. --- drivers/gpu/drm/i915/gt/intel_lrc.c | 23 ++- drivers/gpu/drm/i915/gt/intel_timeline.c | 17 - drivers/gpu/drm/i915/gt/intel_timeline.h | 2 ++ 3 files changed, 40 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 34f67eb9bfa1..d42a9d6767d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -3649,7 +3649,26 @@ static void reset_csb_pointers(struct intel_engine_cs *engine) static void execlists_sanitize(struct intel_engine_cs *engine) { + /* +* Poison residual state on resume, in case the suspend didn't! +* +* We have to assume that across suspend/resume (or other loss +* of control) that the contents of our pinned buffers has been +* lost, replaced by garbage. Since this doesn't always happen, +* let's poison such state so that we more quickly spot when +* we falsely assume it has been preserved. +*/ + if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM)) + memset(engine->status_page.addr, POISON_INUSE, PAGE_SIZE); + reset_csb_pointers(engine); + + /* +* The kernel_context HWSP is stored in the status_page. 
As above, +* that may be lost on resume/initialisation, and so we need to +* reset the value in the HWSP. +*/ + intel_timeline_reset_seqno(engine->kernel_context->timeline); } static void enable_error_interrupt(struct intel_engine_cs *engine) @@ -4539,6 +4558,8 @@ static void execlists_shutdown(struct intel_engine_cs *engine) static void execlists_release(struct intel_engine_cs *engine) { + engine->sanitize = NULL; /* no longer in control, nothing to sanitize */ + execlists_shutdown(engine); intel_engine_cleanup_common(engine); @@ -4550,7 +4571,6 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine) { /* Default vfuncs which can be overriden by each engine. */ - engine->sanitize = execlists_sanitize; engine->resume = execlists_resume; engine->cops = &execlists_context_ops; @@ -4666,6 +4686,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) execlists->csb_size = GEN11_CSB_ENTRIES; /* Finally, take ownership and responsibility for cleanup! */ + engine->sanitize = execlists_sanitize; engine->release = execlists_release; return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 3779c2ae0d65..29a39e44fa36 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -337,6 +337,13 @@ int intel_timeline_pin(struct intel_timeline *tl) return 0; } +void intel_timeline_reset_seqno(const struct intel_timeline *tl) +{ + /* Must be pinned to be writable, and no requests in flight. 
*/ + GEM_BUG_ON(!atomic_read(&tl->pin_count)); + WRITE_ONCE(*(u32 *)tl->hwsp_seqno, tl->seqno); +} + void intel_timeline_enter(struct intel_timeline *tl) { struct intel_gt_timelines *timelines = &tl->gt->timelines; @@ -365,8 +372,16 @@ void intel_timeline_enter(struct intel_timeline *tl) return; spin_lock(&timelines->lock); - if (!atomic_fetch_inc(&tl->active_count)) + if (!atomic_fetch_inc(&tl->active_count)) { + /* +* The HWSP is volatile, and may have been lost while inactive, +* e.g. across suspend/resume. Be paranoid, and ensure that +* the HWSP value matches our seqno so we don't proclaim +* the next request as already complete. +*/ + intel_timeline_reset_seqno(tl); list_add_tail(&tl->link, &timelines->active_list); + } spin_unlock(&timelines->lock); } diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.h b/drivers/gpu/drm/i915/gt/intel_timeline.h index f5b7eade3809..c8e59a333182 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.h +++ b/drivers/gpu/drm/i915/gt/intel_timeline.h @@ -84,6 +84,8 @@ int
Re: [Intel-gfx] [PATCH xf86-video-intel v4] Sync i915_pciids upto 8717c6b7414f
Quoting Liwei Song (2020-04-21 09:41:28) > +static const struct intel_device_info intel_cannonlake_info = { > + .gen = 0115, .gen = 0120 /* 10 */ > +}; > + > +static const struct intel_device_info intel_icelake_info = { > + .gen = 0116, .gen = 0130 /* 11 */ > +}; > + > +static const struct intel_device_info intel_elkhartlake_info = { > + .gen = 0117, .gen = 0131 > +}; > + > +static const struct intel_device_info intel_tigerlake_info = { > + .gen = 0120, .gen = 0140 /* 12 */ You definitely do not want to feed them through the gen9 assembler. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/5] drm: report dp downstream port type as a subconnector property
On 2020-04-15 at 13:01:21 +0300, Jani Nikula wrote: > On Tue, 07 Apr 2020, Jeevan B wrote: > > From: Oleg Vasilev > > > > Currently, downstream port type is only reported in debugfs. This > > information should be considered important since it reflects the actual > > physical connector type. Some userspace (e.g. window compositors) > > may want to show this info to a user. > > > > The 'subconnector' property is already utilized for DVI-I and TV-out for > > reporting connector subtype. > > > > The initial motivation for this feature came from i2c test [1]. > > It is supposed to be skipped on VGA connectors, but it cannot > > detect VGA over DP and fails instead. > > > > v2: > > - Ville: utilized drm_dp_is_branch() > > - Ville: implement DP 1.0 downstream type info > > - Replaced create_dp_properties with add_dp_subconnector_property > > - Added dp_set_subconnector_property helper > > > > v4: > > - Ville: add DP1.0 best assumption about subconnector > > - Ville: assume DVI is DVI-D > > - Ville: reuse Writeback enum value for Virtual subconnector > > - Renamed #defines: HDMI -> HDMIA, DP -> DisplayPort > > > > v5: rebase > > > > [1]: https://bugs.freedesktop.org/show_bug.cgi?id=104097 > > > > Cc: Ville Syrjälä > > Cc: intel-gfx@lists.freedesktop.org > > Signed-off-by: Jeevan B > > Signed-off-by: Oleg Vasilev > > Reviewed-by: Emil Velikov > > Link: > > https://patchwork.freedesktop.org/patch/msgid/20190829114854.1539-3-oleg.vasi...@intel.com > > High level looks fine to me, please see some nitpicks inline. > > BR, > Jani. 
> > > > --- > > drivers/gpu/drm/drm_connector.c | 49 -- > > drivers/gpu/drm/drm_dp_helper.c | 77 > > + > > include/drm/drm_connector.h | 3 ++ > > include/drm/drm_dp_helper.h | 8 + > > include/drm/drm_mode_config.h | 6 > > include/uapi/drm/drm_mode.h | 21 ++- > > 6 files changed, 154 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_connector.c > > b/drivers/gpu/drm/drm_connector.c > > index b1099e1..b6972d1 100644 > > --- a/drivers/gpu/drm/drm_connector.c > > +++ b/drivers/gpu/drm/drm_connector.c > > @@ -844,7 +844,7 @@ static const struct drm_prop_enum_list > > drm_dvi_i_select_enum_list[] = { > > DRM_ENUM_NAME_FN(drm_get_dvi_i_select_name, drm_dvi_i_select_enum_list) > > > > static const struct drm_prop_enum_list drm_dvi_i_subconnector_enum_list[] > > = { > > - { DRM_MODE_SUBCONNECTOR_Unknown, "Unknown" }, /* DVI-I and TV-out */ > > + { DRM_MODE_SUBCONNECTOR_Unknown, "Unknown" }, /* DVI-I, TV-out and > > DP */ > > { DRM_MODE_SUBCONNECTOR_DVID, "DVI-D" }, /* DVI-I */ > > { DRM_MODE_SUBCONNECTOR_DVIA, "DVI-A" }, /* DVI-I */ > > }; > > @@ -861,7 +861,7 @@ static const struct drm_prop_enum_list > > drm_tv_select_enum_list[] = { > > DRM_ENUM_NAME_FN(drm_get_tv_select_name, drm_tv_select_enum_list) > > > > static const struct drm_prop_enum_list drm_tv_subconnector_enum_list[] = { > > - { DRM_MODE_SUBCONNECTOR_Unknown, "Unknown" }, /* DVI-I and TV-out */ > > + { DRM_MODE_SUBCONNECTOR_Unknown, "Unknown" }, /* DVI-I, TV-out and > > DP */ > > { DRM_MODE_SUBCONNECTOR_Composite, "Composite" }, /* TV-out */ > > { DRM_MODE_SUBCONNECTOR_SVIDEO,"SVIDEO"}, /* TV-out */ > > { DRM_MODE_SUBCONNECTOR_Component, "Component" }, /* TV-out */ > > @@ -870,6 +870,19 @@ static const struct drm_prop_enum_list > > drm_tv_subconnector_enum_list[] = { > > DRM_ENUM_NAME_FN(drm_get_tv_subconnector_name, > > drm_tv_subconnector_enum_list) > > > > +static const struct drm_prop_enum_list drm_dp_subconnector_enum_list[] = { > > + { DRM_MODE_SUBCONNECTOR_Unknown, "Unknown" }, /* 
DVI-I, TV-out > > and DP */ > > + { DRM_MODE_SUBCONNECTOR_VGA, "VGA" }, /* DP */ > > + { DRM_MODE_SUBCONNECTOR_DVID,"DVI-D" }, /* DP */ > > + { DRM_MODE_SUBCONNECTOR_HDMIA, "HDMI" }, /* DP */ > > + { DRM_MODE_SUBCONNECTOR_DisplayPort, "DP"}, /* DP */ > > + { DRM_MODE_SUBCONNECTOR_Wireless,"Wireless" }, /* DP */ > > + { DRM_MODE_SUBCONNECTOR_Native, "Native"}, /* DP */ > > +}; > > + > > +DRM_ENUM_NAME_FN(drm_get_dp_subconnector_name, > > +drm_dp_subconnector_enum_list) > > + > > static const struct drm_prop_enum_list hdmi_colorspaces[] = { > > /* For Default case, driver will set the colorspace */ > > { DRM_MODE_COLORIMETRY_DEFAULT, "Default" }, > > @@ -1186,6 +1199,14 @@ static const struct drm_prop_enum_list > > dp_colorspaces[] = { > > * can also expose this property to external outputs, in which case they > > * must support "None", which should be the default (since external screens > > * have a built-in scaler). > > + * > > + * subconnector: > > + * This property is used by DVI-I, TVout and DisplayPort to indicate > > different > > + * connector subtypes. Enum values more or less match with those
Re: [Intel-gfx] [PATCH 2/5] drm/i915: utilize subconnector property for DP
On 2020-04-15 at 13:01:59 +0300, Jani Nikula wrote: > On Tue, 07 Apr 2020, Jeevan B wrote: > > From: Oleg Vasilev > > > > Since DP-specific information is stored in driver's structures, every > > driver needs to implement subconnector property by itself. > > > > v2: updates to match previous commit changes > > > > v3: rebase > > > > Cc: Ville Syrjälä > > Cc: intel-gfx@lists.freedesktop.org > > Signed-off-by: Jeevan B > > Signed-off-by: Oleg Vasilev > > Reviewed-by: Emil Velikov > > Tested-by: Oleg Vasilev > > Link: > > https://patchwork.freedesktop.org/patch/msgid/20190829114854.1539-4-oleg.vasi...@intel.com > > You're not supposed to add the Link: tag yourself. I will do the necessary change. > > Reviewed-by: Jani Nikula Thanks Jeevan B > > > > --- > > drivers/gpu/drm/i915/display/intel_dp.c | 8 > > 1 file changed, 8 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > > b/drivers/gpu/drm/i915/display/intel_dp.c > > index db6ae8e..ba443e1 100644 > > --- a/drivers/gpu/drm/i915/display/intel_dp.c > > +++ b/drivers/gpu/drm/i915/display/intel_dp.c > > @@ -6155,6 +6155,11 @@ intel_dp_detect(struct drm_connector *connector, > > */ > > intel_display_power_flush_work(dev_priv); > > > > + if (!intel_dp_is_edp(intel_dp)) > > + drm_dp_set_subconnector_property(connector, > > +status, > > +intel_dp->dpcd, > > +intel_dp->downstream_ports); > > return status; > > } > > > > @@ -7211,6 +7216,9 @@ intel_dp_add_properties(struct intel_dp *intel_dp, > > struct drm_connector *connect > > struct drm_i915_private *dev_priv = to_i915(connector->dev); > > enum port port = dp_to_dig_port(intel_dp)->base.port; > > > > + if (!intel_dp_is_edp(intel_dp)) > > + drm_mode_add_dp_subconnector_property(connector); > > + > > if (!IS_G4X(dev_priv) && port != PORT_A) > > intel_attach_force_audio_property(connector); > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 18/18] drm/i915/runtime_pm: Prefer drm_WARN* over WARN*
On Tue, Apr 21, 2020 at 11:28:24AM +0300, Jani Nikula wrote: > > Imre, please check the one question inline. > > On Mon, 06 Apr 2020, Pankaj Bharadiya > wrote: > > struct drm_device specific drm_WARN* macros include device information > > in the backtrace, so we know what device the warnings originate from. > > > > Prefer drm_WARN* over WARN*. > > > > Conversion is done with below semantic patch: > > > > @@ > > identifier func, T; > > @@ > > func(struct intel_runtime_pm *T,...) { > > + struct drm_i915_private *i915 = container_of(T, struct drm_i915_private, > > runtime_pm); > > <+... > > ( > > -WARN( > > +drm_WARN(&i915->drm, > > ...) > > | > > -WARN_ON( > > +drm_WARN_ON(&i915->drm, > > ...) > > | > > -WARN_ONCE( > > +drm_WARN_ONCE(&i915->drm, > > ...) > > | > > -WARN_ON_ONCE( > > +drm_WARN_ON_ONCE(&i915->drm, > > ...) > > ) > > ...+> > > > > } > > > > Signed-off-by: Pankaj Bharadiya > > --- > > drivers/gpu/drm/i915/intel_runtime_pm.c | 39 ++--- > > 1 file changed, 28 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c > > b/drivers/gpu/drm/i915/intel_runtime_pm.c > > index ad719c9602af..31ccd0559c55 100644 > > --- a/drivers/gpu/drm/i915/intel_runtime_pm.c > > +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c > > @@ -116,6 +116,9 @@ track_intel_runtime_pm_wakeref(struct intel_runtime_pm > > *rpm) > > static void untrack_intel_runtime_pm_wakeref(struct intel_runtime_pm *rpm, > > depot_stack_handle_t stack) > > { > > + struct drm_i915_private *i915 = container_of(rpm, > > +struct drm_i915_private, > > +runtime_pm); > > Is this a dependency we want to add? Should struct intel_runtime_pm be > allowed to be elsewhere than struct i915? For convenience a pointer to intel_runtime_pm is stored in intel_uncore and intel_wakeref, but there is only one instance of it. So looks ok to me to use container_of() on the pointer. > > BR, > Jani. 
> > > unsigned long flags, n; > > bool found = false; > > > > @@ -134,9 +137,9 @@ static void untrack_intel_runtime_pm_wakeref(struct > > intel_runtime_pm *rpm, > > } > > spin_unlock_irqrestore(&rpm->debug.lock, flags); > > > > - if (WARN(!found, > > -"Unmatched wakeref (tracking %lu), count %u\n", > > -rpm->debug.count, atomic_read(&rpm->wakeref_count))) { > > + if (drm_WARN(&i915->drm, !found, > > +"Unmatched wakeref (tracking %lu), count %u\n", > > +rpm->debug.count, atomic_read(&rpm->wakeref_count))) { > > char *buf; > > > > buf = kmalloc(PAGE_SIZE, GFP_NOWAIT | __GFP_NOWARN); > > @@ -355,10 +358,14 @@ intel_runtime_pm_release(struct intel_runtime_pm > > *rpm, int wakelock) > > static intel_wakeref_t __intel_runtime_pm_get(struct intel_runtime_pm *rpm, > > bool wakelock) > > { > > + struct drm_i915_private *i915 = container_of(rpm, > > +struct drm_i915_private, > > +runtime_pm); > > int ret; > > > > ret = pm_runtime_get_sync(rpm->kdev); > > - WARN_ONCE(ret < 0, "pm_runtime_get_sync() failed: %d\n", ret); > > + drm_WARN_ONCE(&i915->drm, ret < 0, > > + "pm_runtime_get_sync() failed: %d\n", ret); > > > > intel_runtime_pm_acquire(rpm, wakelock); > > > > @@ -539,6 +546,9 @@ void intel_runtime_pm_put(struct intel_runtime_pm *rpm, > > intel_wakeref_t wref) > > */ > > void intel_runtime_pm_enable(struct intel_runtime_pm *rpm) > > { > > + struct drm_i915_private *i915 = container_of(rpm, > > +struct drm_i915_private, > > +runtime_pm); > > struct device *kdev = rpm->kdev; > > > > /* > > @@ -565,7 +575,8 @@ void intel_runtime_pm_enable(struct intel_runtime_pm > > *rpm) > > > > pm_runtime_dont_use_autosuspend(kdev); > > ret = pm_runtime_get_sync(kdev); > > - WARN(ret < 0, "pm_runtime_get_sync() failed: %d\n", ret); > > + drm_WARN(&i915->drm, ret < 0, > > +"pm_runtime_get_sync() failed: %d\n", ret); > > } else { > > pm_runtime_use_autosuspend(kdev); > > } > > @@ -580,11 +591,14 @@ void intel_runtime_pm_enable(struct intel_runtime_pm > > *rpm) > > > > void 
intel_runtime_pm_disable(struct intel_runtime_pm *rpm) > > { > > + struct drm_i915_private *i915 = container_of(rpm, > > +struct drm_i915_private, > > +runtime_pm); > > struct device *kdev = rpm->kdev; > > > > /* Transfer rpm ownership back to core */ > > - WARN(pm_runtime_get_sync(kdev) < 0, > > -"Failed to pass rpm ownership back to
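The container_of() conversion discussed in this thread can be sketched in plain C. This is a minimal userspace illustration, not the i915 code: the macro below is a simplified stand-in for the kernel's container_of(), and struct fake_i915, struct fake_runtime_pm and rpm_to_i915() are hypothetical names. It relies on the point made in the reply, namely that the runtime_pm instance only ever lives embedded in the device structure.

```c
#include <stddef.h>

/*
 * Simplified userspace stand-in for the kernel's container_of();
 * the kernel version adds compile-time type checking.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced stand-ins for the real i915 structures. */
struct fake_runtime_pm {
	int wakeref_count;
};

struct fake_i915 {
	int id;
	struct fake_runtime_pm runtime_pm; /* the single embedded instance */
};

/* Recover the owning device from a pointer to its embedded runtime_pm. */
static struct fake_i915 *rpm_to_i915(struct fake_runtime_pm *rpm)
{
	return container_of(rpm, struct fake_i915, runtime_pm);
}
```

The conversion is only valid when the pointer really refers to the member embedded in the outer structure, which is the concern raised in the review question.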
Re: [Intel-gfx] [PATCH xf86-video-intel v4] Sync i915_pciids upto 8717c6b7414f
On 4/21/20 17:29, Chris Wilson wrote: > Quoting Liwei Song (2020-04-21 09:41:28) >> +static const struct intel_device_info intel_cannonlake_info = { >> + .gen = 0115, > .gen = 0120 /* 10 */ > >> +}; >> + >> +static const struct intel_device_info intel_icelake_info = { >> + .gen = 0116, > .gen = 0130 /* 11 */ > >> +}; >> + >> +static const struct intel_device_info intel_elkhartlake_info = { >> + .gen = 0117, > .gen = 0131 > >> +}; >> + >> +static const struct intel_device_info intel_tigerlake_info = { >> + .gen = 0120, > .gen = 0140 /* 12 */ > > You definitely do not want to feed them through the gen9 assembler. Thanks, will modify it in v5, but could you explain more here how we decide the gen number, I really do not have too much knowledge here, I used to think it is an increasing number like some other definitions. Thanks, Liwei. > -Chris > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/3] drm/i915: Add per ctx batchbuffer wa for timestamp
Chris Wilson writes: > Quoting Mika Kuoppala (2020-04-17 15:44:27) >> Restoration of a previous timestamp can collide >> with updating the timestamp, causing a value corruption. >> >> Combat this issue by using indirect ctx bb to >> modify the context image during restoring process. >> >> For render engine, we can preload value into >> scratch register. From which we then do the actual >> write with LRR. LRR is faster and thus less error prone. >> For other engines, no scratch is available so we >> must do a more complex sequence of sync and async LRMs. >> As the LRM is slower, the probability of racy write >> rises and thus we still see corruption sometimes. >> >> References: HSDES#16010904313 >> Testcase: igt/i915_selftest/gt_lrc >> Suggested-by: Joseph Koston >> Cc: Chris Wilson >> Signed-off-by: Mika Kuoppala >> >> bug on fix >> --- >> drivers/gpu/drm/i915/gt/intel_context_types.h | 3 + >> drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 3 +- >> drivers/gpu/drm/i915/gt/intel_lrc.c | 196 ++ >> drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 1 + >> 4 files changed, 165 insertions(+), 38 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h >> b/drivers/gpu/drm/i915/gt/intel_context_types.h >> index 07cb83a0d017..c7573d565f58 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h >> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h >> @@ -70,6 +70,9 @@ struct intel_context { >> >> u32 *lrc_reg_state; >> u64 lrc_desc; >> + >> + u32 ctx_bb_offset; >> + >> u32 tag; /* cookie passed to HW to track this context on submission >> */ >> >> /* Time on GPU as tracked by the hw. */ >> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h >> b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h >> index f04214a54f75..0c2adb4078a7 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h >> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h >> @@ -138,7 +138,7 @@ >> */ >> #define MI_LOAD_REGISTER_IMM(x) MI_INSTR(0x22, 2*(x)-1) >> /* Gen11+. 
addr = base + (ctx_restore ? offset & GENMASK(12,2) : offset) */ >> -#define MI_LRI_CS_MMIO (1<<19) >> +#define MI_LRI_LRM_CS_MMIO (1<<19) >> #define MI_LRI_FORCE_POSTED (1<<12) >> #define MI_LOAD_REGISTER_IMM_MAX_REGS (126) >> #define MI_STORE_REGISTER_MEM MI_INSTR(0x24, 1) >> @@ -155,6 +155,7 @@ >> #define MI_FLUSH_DW_USE_PPGTT (0<<2) >> #define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 1) >> #define MI_LOAD_REGISTER_MEM_GEN8 MI_INSTR(0x29, 2) >> +#define MI_LRM_ASYNC (1<<21) >> #define MI_LOAD_REGISTER_REG MI_INSTR(0x2A, 1) >> #define MI_BATCH_BUFFER MI_INSTR(0x30, 1) >> #define MI_BATCH_NON_SECURE (1) >> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c >> b/drivers/gpu/drm/i915/gt/intel_lrc.c >> index 6fbad5e2343f..531884b9050c 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c >> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c >> @@ -234,7 +234,7 @@ static void execlists_init_reg_state(u32 *reg_state, >> const struct intel_ring *ring, >> bool close); >> static void >> -__execlists_update_reg_state(const struct intel_context *ce, >> +__execlists_update_reg_state(struct intel_context *ce, >> const struct intel_engine_cs *engine, >> u32 head); >> >> @@ -537,7 +537,7 @@ static void set_offsets(u32 *regs, >> if (flags & POSTED) >> *regs |= MI_LRI_FORCE_POSTED; >> if (INTEL_GEN(engine->i915) >= 11) >> - *regs |= MI_LRI_CS_MMIO; >> + *regs |= MI_LRI_LRM_CS_MMIO; >> regs++; >> >> GEM_BUG_ON(!count); >> @@ -3142,8 +3142,152 @@ static void execlists_context_unpin(struct >> intel_context *ce) >> i915_gem_object_unpin_map(ce->state->obj); >> } >> >> +static u32 intel_lr_indirect_ctx_offset(const struct intel_engine_cs >> *engine) >> +{ >> + u32 indirect_ctx_offset; >> + >> + switch (INTEL_GEN(engine->i915)) { >> + default: >> + MISSING_CASE(INTEL_GEN(engine->i915)); >> + fallthrough; >> + case 12: >> + indirect_ctx_offset = >> + GEN12_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; >> + break; >> + case 11: >> + indirect_ctx_offset = >> + GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; >> + break; >> + 
case 10: >> + indirect_ctx_offset = >> + GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; >> + break; >> + case 9: >> + indirect_ctx_offset = >> + GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; >> + break; >> + case 8: >> +
Re: [Intel-gfx] [PATCH xf86-video-intel v4] Sync i915_pciids upto 8717c6b7414f
Quoting Liwei Song (2020-04-21 10:59:04) > > > On 4/21/20 17:29, Chris Wilson wrote: > > Quoting Liwei Song (2020-04-21 09:41:28) > >> +static const struct intel_device_info intel_cannonlake_info = { > >> + .gen = 0115, > > .gen = 0120 /* 10 */ > > > >> +}; > >> + > >> +static const struct intel_device_info intel_icelake_info = { > >> + .gen = 0116, > > .gen = 0130 /* 11 */ > > > >> +}; > >> + > >> +static const struct intel_device_info intel_elkhartlake_info = { > >> + .gen = 0117, > > .gen = 0131 > > > >> +}; > >> + > >> +static const struct intel_device_info intel_tigerlake_info = { > >> + .gen = 0120, > > .gen = 0140 /* 12 */ > > > > You definitely do not want to feed them through the gen9 assembler. > > Thanks, will modify it in v5, but could you explain more here how we > decide the gen number, I really do not have too much knowledge here, > I used to think it is an increasing number like some other definitions. It's octal, with the lowest octal digit being a revision number (so we can quickly tell big core from little core, mostly, but it can carry whatever else makes sense for that gen). -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
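To make the encoding concrete, here is a minimal C sketch. It is not taken from xf86-video-intel: the helper names gen_major(), gen_minor() and gen_encode() are invented for illustration. It assumes only what the explanation above states, that the .gen value is written in octal and that its lowest octal digit is a revision within the generation.

```c
#include <assert.h>

/* Hypothetical helpers, not part of the driver. */

/* Major generation: everything above the lowest octal digit. */
static unsigned int gen_major(unsigned int gen)
{
	return gen >> 3;
}

/* Revision within the generation: the lowest octal digit (0..7). */
static unsigned int gen_minor(unsigned int gen)
{
	return gen & 7;
}

/* Build a gen value from its two parts, e.g. (11, 1) gives 0131. */
static unsigned int gen_encode(unsigned int major, unsigned int minor)
{
	assert(minor < 8); /* must fit in one octal digit */
	return (major << 3) | minor;
}
```

Under this reading, 0115 decodes as generation 9, revision 5, which would explain the remark about the gen9 assembler, while 0120, 0130 and 0140 decode as generations 10, 11 and 12.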
[Intel-gfx] ✗ Fi.CI.BUILD: failure for Sync i915_pciids upto 8717c6b7414f (rev4)
== Series Details == Series: Sync i915_pciids upto 8717c6b7414f (rev4) URL : https://patchwork.freedesktop.org/series/76080/ State : failure == Summary == Applying: Sync i915_pciids upto 8717c6b7414f error: sha1 information is lacking or useless (src/intel_module.c). error: could not build fake ancestor hint: Use 'git am --show-current-patch=diff' to see the failed patch Patch failed at 0001 Sync i915_pciids upto 8717c6b7414f When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 15/18] drm/i915/i915_drv: Prefer drm_WARN_ON over WARN_ON
On Tue, Apr 21, 2020 at 11:24:39AM +0300, Jani Nikula wrote: > On Mon, 06 Apr 2020, Pankaj Bharadiya > wrote: > > struct drm_device specific drm_WARN* macros include device information > > in the backtrace, so we know what device the warnings originate from. > > > > Prefer drm_WARN_ON over WARN_ON. > > > > Signed-off-by: Pankaj Bharadiya > > --- > > drivers/gpu/drm/i915/i915_drv.h | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h > > b/drivers/gpu/drm/i915/i915_drv.h > > index e9ee4daa9320..be33cab6403d 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.h > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > @@ -1647,7 +1647,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, > > #define HAS_DISPLAY(dev_priv) (INTEL_INFO(dev_priv)->pipe_mask != 0) > > > > /* Only valid when HAS_DISPLAY() is true */ > > -#define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), > > !i915_modparams.disable_display) > > +#define INTEL_DISPLAY_ENABLED(dev_priv) \ > > + (drm_WARN_ON(&dev_priv->drm, !HAS_DISPLAY(dev_priv)), > > !i915_modparams.disable_display) > > Needs parens around the dev_priv macro argument. Yeah, missed it. Thanks for pointing out. Thanks, Pankaj > > BR, > Jani. > > > > > static inline bool intel_vtd_active(void) > > { > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
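The review comment about parentheses can be illustrated with a small sketch. The types and the TO_DRM() macro below are hypothetical stand-ins, not the i915 definitions; the only point shown is why a macro argument that is dereferenced with '->' should be wrapped in parentheses.

```c
/* Hypothetical stand-ins for struct drm_device / struct drm_i915_private. */
struct fake_drm {
	int minor;
};

struct fake_i915 {
	struct fake_drm drm;
	int pipe_mask;
};

/* Parenthesised: the whole argument is evaluated before '->' is applied. */
#define TO_DRM(dev_priv) (&(dev_priv)->drm)

/*
 * Without the inner parentheses, TO_DRM(devs + 1) would expand to
 * &devs + 1->drm, which does not compile because '->' binds more
 * tightly than '+'.
 */
static struct fake_drm *second_drm(struct fake_i915 *devs)
{
	return TO_DRM(devs + 1);
}
```

With a plain identifier as the argument both spellings behave the same, so the missing parentheses only show up once a caller passes an expression.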
Re: [Intel-gfx] [PATCH 00/18] Prefer drm_WARN* over WARN*
On Tue, Apr 21, 2020 at 11:35:08AM +0300, Jani Nikula wrote: > On Mon, 06 Apr 2020, Pankaj Bharadiya > wrote: > > Now we have struct drm_device specific drm_WARN* macros which include > > device information in the backtrace, so we know what device the > > warnings originate from. > > > > This series converts WARN* with drm_WARN* where struct drm_device > > pointer can be extracted. > > I think I pushed all the patches that I didn't comment on separately, > and that still applied. Thank you Jani. Will rework on pending patches and resubmit. Thanks, Pankaj > > BR, > Jani. > > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [CI,1/2] drm/i915/gt: Trace RPS events
== Series Details == Series: series starting with [CI,1/2] drm/i915/gt: Trace RPS events URL : https://patchwork.freedesktop.org/series/76251/ State : failure == Summary == CALLscripts/checksyscalls.sh CALLscripts/atomic/check-atomics.sh DESCEND objtool CHK include/generated/compile.h CC [M] drivers/gpu/drm/i915/gt/intel_rps.o In file included from ./drivers/gpu/drm/i915/gt/uc/intel_guc.h:9:0, from ./drivers/gpu/drm/i915/gt/uc/intel_uc.h:9, from ./drivers/gpu/drm/i915/gt/intel_gt_types.h:16, from ./drivers/gpu/drm/i915/i915_drv.h:82, from drivers/gpu/drm/i915/gt/intel_rps.c:9: drivers/gpu/drm/i915/gt/intel_rps.c: In function ‘gen9_rps_enable’: drivers/gpu/drm/i915/gt/intel_rps.c:951:10: error: implicit declaration of function ‘GT_INTERVAL_FROM_US’; did you mean ‘NTP_INTERVAL_FREQ’? [-Werror=implicit-function-declaration] GT_INTERVAL_FROM_US(i915, 100)); ^ ./drivers/gpu/drm/i915/intel_uncore.h:378:57: note: in definition of macro ‘intel_uncore_write_fw’ #define intel_uncore_write_fw(...) __raw_uncore_write32(__VA_ARGS__) ^~~ drivers/gpu/drm/i915/gt/intel_rps.c:951:30: error: ‘i915’ undeclared (first use in this function); did you mean ‘to_i915’? GT_INTERVAL_FROM_US(i915, 100)); ^ ./drivers/gpu/drm/i915/intel_uncore.h:378:57: note: in definition of macro ‘intel_uncore_write_fw’ #define intel_uncore_write_fw(...) __raw_uncore_write32(__VA_ARGS__) ^~~ drivers/gpu/drm/i915/gt/intel_rps.c:951:30: note: each undeclared identifier is reported only once for each function it appears in GT_INTERVAL_FROM_US(i915, 100)); ^ ./drivers/gpu/drm/i915/intel_uncore.h:378:57: note: in definition of macro ‘intel_uncore_write_fw’ #define intel_uncore_write_fw(...) 
__raw_uncore_write32(__VA_ARGS__) ^~~ cc1: all warnings being treated as errors scripts/Makefile.build:266: recipe for target 'drivers/gpu/drm/i915/gt/intel_rps.o' failed make[4]: *** [drivers/gpu/drm/i915/gt/intel_rps.o] Error 1 scripts/Makefile.build:488: recipe for target 'drivers/gpu/drm/i915' failed make[3]: *** [drivers/gpu/drm/i915] Error 2 scripts/Makefile.build:488: recipe for target 'drivers/gpu/drm' failed make[2]: *** [drivers/gpu/drm] Error 2 scripts/Makefile.build:488: recipe for target 'drivers/gpu' failed make[1]: *** [drivers/gpu] Error 2 Makefile:1722: recipe for target 'drivers' failed make: *** [drivers] Error 2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help conserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffy], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_rps.c | 41 + 1 file changed, 18 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 4dcfae16a7ce..785cd58fba76 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -57,7 +57,7 @@ static u32 rps_pm_mask(struct intel_rps *rps, u8 val) if (val < rps->max_freq_softlimit) mask |= GEN6_PM_RP_UP_EI_EXPIRED | GEN6_PM_RP_UP_THRESHOLD; - mask &= READ_ONCE(rps->pm_events); + mask &= rps->pm_events; return rps_pm_sanitize_mask(rps, ~mask); } @@ -70,19 +70,9 @@ static void rps_reset_ei(struct intel_rps *rps) static void rps_enable_interrupts(struct intel_rps *rps) { struct intel_gt *gt = rps_to_gt(rps); - u32 events; rps_reset_ei(rps); - if (IS_VALLEYVIEW(gt->i915)) - /* WaGsvRC0ResidencyMethod:vlv */ - events = GEN6_PM_RP_UP_EI_EXPIRED; - else - events = (GEN6_PM_RP_UP_THRESHOLD | - GEN6_PM_RP_DOWN_THRESHOLD | - GEN6_PM_RP_DOWN_TIMEOUT); - WRITE_ONCE(rps->pm_events, events); - spin_lock_irq(&gt->irq_lock); gen6_gt_pm_enable_irq(gt, rps->pm_events); spin_unlock_irq(&gt->irq_lock); @@ -120,8 +110,6 @@ static void rps_disable_interrupts(struct intel_rps *rps) { struct intel_gt *gt = 
rps_to_gt(rps); - WRITE_ONCE(rps->pm_events, 0); - intel_uncore_write(gt->uncore, GEN6_PMINTRMSK, rps_pm_sanitize_mask(rps, ~0u)); @@ -919,12 +907,10 @@ static bool gen9_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, GEN6_RC_VIDEO_FREQ, GEN9_FREQUENCY(rps->rp1_freq)); - /* 1 second timeout */ - intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, - GT_INTERVAL_FROM_US(i915, 100)); - intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 0xa); + rps->pm_events = GEN6_PM_RP_UP_THRESHOLD | GEN6_PM_RP_DOWN_THRESHOLD; + return rps_reset(rps); } @@ -935,12 +921,10 @@ static bool gen8_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, GEN6_RC_VIDEO_FREQ, HSW_FREQUENCY(rps->rp1_freq)); - /* NB: Docs say 1s, and 100 - which aren't equivalent */ - intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, - 1 / 128); /* 1 second timeout */ - intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 10); + rps->pm_events = GEN6_PM_RP_UP_THRESHOLD | GEN6_PM_RP_DOWN_THRESHOLD; + return rps_reset(rps); } @@ -952,6 +936,10 @@ static bool gen6_rps_enable(struct intel_rps *rps) intel_uncore_write_fw(uncore, GEN6_RP_DOWN_TIMEOUT, 5); intel_uncore_write_fw(uncore, GEN6_RP_IDLE_HYSTERSIS, 10); + rps->pm_events = (GEN6_PM_RP_UP_THRESHOLD | + GEN6_PM_RP_DOWN_THRESHOLD | + GEN6_PM_RP_DOWN_TIMEOUT); + return rps_reset(rps); } @@ -1037,6 +1025,10 @@ static bool chv_rps_enable(struct intel_rps *rps) GEN6_RP_UP_BUSY_AVG | GEN6_RP_DOWN_IDLE_AVG); + rps->pm_events = (GEN6_PM_RP_UP_THRESHOLD | + GEN6_PM_RP_DOWN_THRESHOLD | + GEN6_PM_RP_DOWN_TIMEOUT); + /* Setting Fixed Bias */ vlv_punit_get(i915); @@ -1135,6 +1127,9 @@ static bool vlv_rps_enable(struct intel_rps *rps) GEN6_RP_UP_BUSY_AVG | GEN6_RP_DOWN_IDLE_CONT); + /* WaGsvRC0ResidencyMethod:vlv */ + rps->pm_events = GEN6_PM_RP_UP_EI_EXPIRED; + vlv_punit_get(i915); /* Setting Fixed Bias */ @@ -1469,7 +1464,7 @@ static void rps_work(struct work_struct *work) u32 pm_iir = 0; spin_lock_irq(&gt->irq_lock); - pm_iir = 
fetch_and_zero(&rps->pm_iir) & READ_ONCE(rps->pm_events); + pm_iir = fetch_and_zero(&rps->pm_iir) & rps->pm_events; client_boost =
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT (rev2)
== Series Details == Series: drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT (rev2) URL : https://patchwork.freedesktop.org/series/76216/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8342 -> Patchwork_17396 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_17396 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_17396, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_17396: ### IGT changes ### Possible regressions * igt@i915_selftest@live@gt_pm: - fi-cml-s: [PASS][1] -> [DMESG-FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cml-s/igt@i915_selftest@live@gt_pm.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-cml-s/igt@i915_selftest@live@gt_pm.html - fi-cfl-guc: [PASS][3] -> [DMESG-FAIL][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cfl-guc/igt@i915_selftest@live@gt_pm.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-cfl-guc/igt@i915_selftest@live@gt_pm.html - fi-skl-6700k2: [PASS][5] -> [DMESG-FAIL][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-6700k2/igt@i915_selftest@live@gt_pm.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-skl-6700k2/igt@i915_selftest@live@gt_pm.html - fi-bsw-n3050: [PASS][7] -> [DMESG-FAIL][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-bsw-n3050/igt@i915_selftest@live@gt_pm.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-bsw-n3050/igt@i915_selftest@live@gt_pm.html - fi-skl-guc: [PASS][9] -> [DMESG-FAIL][10] [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-guc/igt@i915_selftest@live@gt_pm.html [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-skl-guc/igt@i915_selftest@live@gt_pm.html - fi-kbl-x1275: [PASS][11] -> [DMESG-FAIL][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-x1275/igt@i915_selftest@live@gt_pm.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-kbl-x1275/igt@i915_selftest@live@gt_pm.html - fi-bsw-kefka: [PASS][13] -> [DMESG-FAIL][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-bsw-kefka/igt@i915_selftest@live@gt_pm.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-bsw-kefka/igt@i915_selftest@live@gt_pm.html - fi-cfl-8700k: [PASS][15] -> [DMESG-FAIL][16] [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cfl-8700k/igt@i915_selftest@live@gt_pm.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-cfl-8700k/igt@i915_selftest@live@gt_pm.html - fi-bsw-nick:[PASS][17] -> [DMESG-FAIL][18] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-bsw-nick/igt@i915_selftest@live@gt_pm.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-bsw-nick/igt@i915_selftest@live@gt_pm.html - fi-skl-lmem:[PASS][19] -> [DMESG-FAIL][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-lmem/igt@i915_selftest@live@gt_pm.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-skl-lmem/igt@i915_selftest@live@gt_pm.html - fi-apl-guc: [PASS][21] -> [DMESG-FAIL][22] [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-apl-guc/igt@i915_selftest@live@gt_pm.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-apl-guc/igt@i915_selftest@live@gt_pm.html - fi-snb-2520m: [PASS][23] -> [DMESG-FAIL][24] [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-snb-2520m/igt@i915_selftest@live@gt_pm.html [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-snb-2520m/igt@i915_selftest@live@gt_pm.html - fi-kbl-8809g: [PASS][25] -> [DMESG-FAIL][26] [25]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-8809g/igt@i915_selftest@live@gt_pm.html [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-kbl-8809g/igt@i915_selftest@live@gt_pm.html - fi-kbl-r: [PASS][27] -> [DMESG-FAIL][28] [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-r/igt@i915_selftest@live@gt_pm.html [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17396/fi-kbl-r/igt@i915_selftest@live@gt_pm.html - fi-bdw-5557u: [PASS][29] -> [DMESG-FAIL][30] [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-bdw-5557u/igt@i915_selftest@live@gt_pm.html [3
[Intel-gfx] [PATCH] drm/i915/gt: Apply a small scalefactor for the gpu:ring ratio
Due to the latency from impedance mismatch on memory access, as the GPU gets faster we need to increase the frequency of the ring to offset. In effect, we are fixing the topmost frequency selection, and scaling down faster, so that at low frequencies we will be using lower ring frequencies and conserving power (and in the process make the scaling fairer, and aim for a linear increase in performance with frequency). Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/gt/intel_llc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_llc.c b/drivers/gpu/drm/i915/gt/intel_llc.c index e3f637b3650e..27196c303cbe 100644 --- a/drivers/gpu/drm/i915/gt/intel_llc.c +++ b/drivers/gpu/drm/i915/gt/intel_llc.c @@ -89,7 +89,7 @@ static void calc_ia_freq(struct intel_llc *llc, * ring_freq = 2 * GT. ring_freq is in 100MHz units * No floor required for ring frequency on SKL. */ - ring_freq = gpu_freq; + ring_freq = consts->max_gpu_freq - mult_frac(diff, 5, 4); } else if (INTEL_GEN(i915) >= 8) { /* max(2 * GT, DDR). NB: GT is 50MHz units */ ring_freq = max(consts->min_ring_freq, gpu_freq); -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
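The scaling described in the commit message above can be sketched in userspace C. This is only an illustration under one loud assumption: the hunk does not show how `diff` is computed, so the sketch guesses that it is the distance below the maximum GPU frequency (max_gpu_freq - gpu_freq). mult_frac_int() reproduces the arithmetic of the kernel's mult_frac() macro; ring_freq_scaled() is a hypothetical name, not a function in intel_llc.c.

```c
#include <assert.h>

/*
 * Same arithmetic as the kernel's mult_frac(x, n, d): computes x * n / d
 * without forming the full intermediate product x * n.
 */
static int mult_frac_int(int x, int n, int d)
{
	assert(d != 0);
	return (x / d) * n + ((x % d) * n) / d;
}

/*
 * Hypothetical helper: ring frequency chosen for a given GPU frequency.
 * ASSUMPTION (not visible in the posted hunk): diff = max_gpu_freq - gpu_freq.
 */
static int ring_freq_scaled(int max_gpu_freq, int gpu_freq)
{
	int diff = max_gpu_freq - gpu_freq;

	assert(diff >= 0);
	/* The top frequency stays fixed; below it the ring drops 5 for every 4 */
	return max_gpu_freq - mult_frac_int(diff, 5, 4);
}
```

Under that assumption, with max_gpu_freq = 24 the ring stays at 24 when the GPU runs at 24, drops to 19 at a GPU frequency of 20, and to 14 at 16, which matches the "fix the topmost selection, scale down faster" wording of the commit message.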
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Poison residual state [HWSP] across resume. (rev5)
== Series Details == Series: drm/i915/gt: Poison residual state [HWSP] across resume. (rev5) URL : https://patchwork.freedesktop.org/series/76100/ State : success == Summary == CI Bug Log - changes from CI_DRM_8342 -> Patchwork_17399 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/index.html Known issues Here are the changes found in Patchwork_17399 that come from known issues: ### IGT changes ### Issues hit * igt@i915_selftest@live@gt_pm: - fi-apl-guc: [PASS][1] -> [DMESG-FAIL][2] ([i915#1751]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-apl-guc/igt@i915_selftest@live@gt_pm.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/fi-apl-guc/igt@i915_selftest@live@gt_pm.html * igt@kms_chamelium@dp-edid-read: - fi-kbl-7500u: [PASS][3] -> [FAIL][4] ([i915#976]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html Possible fixes * igt@i915_selftest@live@gt_pm: - fi-glk-dsi: [DMESG-FAIL][5] ([i915#1751]) -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html Warnings * igt@i915_selftest@live@gt_pm: - fi-tgl-y: [DMESG-FAIL][7] ([i915#1744]) -> [DMESG-FAIL][8] ([i915#1759]) [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-tgl-y/igt@i915_selftest@live@gt_pm.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/fi-tgl-y/igt@i915_selftest@live@gt_pm.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). 
[i915#1744]: https://gitlab.freedesktop.org/drm/intel/issues/1744 [i915#1751]: https://gitlab.freedesktop.org/drm/intel/issues/1751 [i915#1759]: https://gitlab.freedesktop.org/drm/intel/issues/1759 [i915#666]: https://gitlab.freedesktop.org/drm/intel/issues/666 [i915#976]: https://gitlab.freedesktop.org/drm/intel/issues/976 Participating hosts (48 -> 42) -- Missing(6): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8342 -> Patchwork_17399 CI-20190529: 20190529 CI_DRM_8342: 17407a9f61a0ee402254522e391a626acc4375ec @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17399: fb5af459320991b4c67465de380617bc24bb1a2e @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == fb5af4593209 drm/i915/gt: Poison residual state [HWSP] across resume. == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17399/index.html
[Intel-gfx] [PATCH 03/24] Revert "drm/i915/gem: Drop relocation slowpath"
This reverts commit 7dc8f1143778 ("drm/i915/gem: Drop relocation slowpath"). We need the slowpath relocation for taking ww-mutex inside the page fault handler, and we will take this mutex when pinning all objects. Cc: Chris Wilson Cc: Matthew Auld Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 239 +- 1 file changed, 235 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 042916ad3629..40a6bc89c2b8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1531,7 +1531,9 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev) * we would try to acquire the struct mutex again. Obviously * this is bad and so lockdep complains vehemently. */ - copied = __copy_from_user(r, urelocs, count * sizeof(r[0])); + pagefault_disable(); + copied = __copy_from_user_inatomic(r, urelocs, count * sizeof(r[0])); + pagefault_enable(); if (unlikely(copied)) { remain = -EFAULT; goto out; @@ -1579,6 +1581,236 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev) return remain; } +static int +eb_relocate_vma_slow(struct i915_execbuffer *eb, struct eb_vma *ev) +{ + const struct drm_i915_gem_exec_object2 *entry = ev->exec; + struct drm_i915_gem_relocation_entry *relocs = + u64_to_ptr(typeof(*relocs), entry->relocs_ptr); + unsigned int i; + int err; + + for (i = 0; i < entry->relocation_count; i++) { + u64 offset = eb_relocate_entry(eb, ev, &relocs[i]); + + if ((s64)offset < 0) { + err = (int)offset; + goto err; + } + } + err = 0; +err: + reloc_cache_reset(&eb->reloc_cache); + return err; +} + +static int check_relocations(const struct drm_i915_gem_exec_object2 *entry) +{ + const char __user *addr, *end; + unsigned long size; + char __maybe_unused c; + + size = entry->relocation_count; + if (size == 0) + return 0; + + if (size > N_RELOC(ULONG_MAX)) + return -EINVAL; + 
+ addr = u64_to_user_ptr(entry->relocs_ptr); + size *= sizeof(struct drm_i915_gem_relocation_entry); + if (!access_ok(addr, size)) + return -EFAULT; + + end = addr + size; + for (; addr < end; addr += PAGE_SIZE) { + int err = __get_user(c, addr); + if (err) + return err; + } + return __get_user(c, end - 1); +} + +static int eb_copy_relocations(const struct i915_execbuffer *eb) +{ + struct drm_i915_gem_relocation_entry *relocs; + const unsigned int count = eb->buffer_count; + unsigned int i; + int err; + + for (i = 0; i < count; i++) { + const unsigned int nreloc = eb->exec[i].relocation_count; + struct drm_i915_gem_relocation_entry __user *urelocs; + unsigned long size; + unsigned long copied; + + if (nreloc == 0) + continue; + + err = check_relocations(&eb->exec[i]); + if (err) + goto err; + + urelocs = u64_to_user_ptr(eb->exec[i].relocs_ptr); + size = nreloc * sizeof(*relocs); + + relocs = kvmalloc_array(size, 1, GFP_KERNEL); + if (!relocs) { + err = -ENOMEM; + goto err; + } + + /* copy_from_user is limited to < 4GiB */ + copied = 0; + do { + unsigned int len = + min_t(u64, BIT_ULL(31), size - copied); + + if (__copy_from_user((char *)relocs + copied, +(char __user *)urelocs + copied, +len)) + goto end; + + copied += len; + } while (copied < size); + + /* +* As we do not update the known relocation offsets after +* relocating (due to the complexities in lock handling), +* we need to mark them as invalid now so that we force the +* relocation processing next time. Just in case the target +* object is evicted and then rebound into its old +* presumed_offset before the next execbuffer - if that +* happened we would make the mistake of assuming that the +* relocations were valid. +*/ + if (!user_access_begin
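The bounded copy loop in eb_copy_relocations() above (copy at most BIT_ULL(31) bytes per call until everything is transferred) can be modelled in plain userspace C. This is a sketch only: memcpy() stands in for __copy_from_user(), copy_in_chunks() and max_chunk are hypothetical names, and the fault handling of the real code is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `size` bytes from src to dst in pieces of at most `max_chunk` bytes.
 * Returns the number of pieces used, so the chunking is observable.
 * memcpy() is a stand-in for the kernel's user copy, which can fail.
 */
static size_t copy_in_chunks(void *dst, const void *src,
			     size_t size, size_t max_chunk)
{
	size_t copied = 0;
	size_t chunks = 0;

	assert(max_chunk > 0);

	while (copied < size) {
		size_t len = size - copied;

		if (len > max_chunk)
			len = max_chunk;

		memcpy((char *)dst + copied, (const char *)src + copied, len);
		copied += len;
		chunks++;
	}

	return chunks;
}
```

The patch uses a do/while because a zero relocation count is filtered out earlier; the sketch uses a plain while so that a zero-length copy is also harmless.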
[Intel-gfx] [PATCH 10/24] drm/i915: Add ww context handling to context_barrier_task
This is required if we want to pass a ww context in intel_context_pin and gen6_ppgtt_pin(). Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 55 ++- .../drm/i915/gem/selftests/i915_gem_context.c | 22 +++- 2 files changed, 48 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 022716f05e91..9fe387ac9850 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1096,6 +1096,7 @@ I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault); static int context_barrier_task(struct i915_gem_context *ctx, intel_engine_mask_t engines, bool (*skip)(struct intel_context *ce, void *data), + int (*pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data), int (*emit)(struct i915_request *rq, void *data), void (*task)(void *data), void *data) @@ -1103,6 +1104,7 @@ static int context_barrier_task(struct i915_gem_context *ctx, struct context_barrier_task *cb; struct i915_gem_engines_iter it; struct i915_gem_engines *e; + struct i915_gem_ww_ctx ww; struct intel_context *ce; int err = 0; @@ -1140,10 +1142,21 @@ static int context_barrier_task(struct i915_gem_context *ctx, if (skip && skip(ce, data)) continue; - rq = intel_context_create_request(ce); + i915_gem_ww_ctx_init(&ww, true); +retry: + err = intel_context_pin(ce); + if (err) + goto err; + + if (pin) + err = pin(ce, &ww, data); + if (err) + goto err_unpin; + + rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - break; + goto err_unpin; } err = 0; @@ -1153,6 +1166,16 @@ static int context_barrier_task(struct i915_gem_context *ctx, err = i915_active_add_request(&cb->base, rq); i915_request_add(rq); +err_unpin: + intel_context_unpin(ce); +err: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + if (err) break; } @@ -1208,6 +1231,17 @@ 
static void set_ppgtt_barrier(void *data) i915_vm_close(old); } +static int pin_ppgtt_update(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data) +{ + struct i915_address_space *vm = ce->vm; + + if (!HAS_LOGICAL_RING_CONTEXTS(vm->i915)) + /* ppGTT is not part of the legacy context image */ + return gen6_ppgtt_pin(i915_vm_to_ppgtt(vm)); + + return 0; +} + static int emit_ppgtt_update(struct i915_request *rq, void *data) { struct i915_address_space *vm = rq->context->vm; @@ -1264,20 +1298,10 @@ static int emit_ppgtt_update(struct i915_request *rq, void *data) static bool skip_ppgtt_update(struct intel_context *ce, void *data) { - if (!test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) - return true; - if (HAS_LOGICAL_RING_CONTEXTS(ce->engine->i915)) - return false; - - if (!atomic_read(&ce->pin_count)) - return true; - - /* ppGTT is not part of the legacy context image */ - if (gen6_ppgtt_pin(i915_vm_to_ppgtt(ce->vm))) - return true; - - return false; + return !ce->state; + else + return !atomic_read(&ce->pin_count); } static int set_ppgtt(struct drm_i915_file_private *file_priv, @@ -1328,6 +1352,7 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv, */ err = context_barrier_task(ctx, ALL_ENGINES, skip_ppgtt_update, + pin_ppgtt_update, emit_ppgtt_update, set_ppgtt_barrier, old); diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 42edbd0f3c14..78356031ec61 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -1903,8 +1903,8 @@ static int mock_context_barrier(void *arg) return -ENOMEM; counter = 0; - err = context_barrier_task(ctx, 0, -
Re: [Intel-gfx] [PATCH 01/59] drm: Add devm_drm_dev_alloc macro
On Mon, Apr 20, 2020 at 3:37 PM Thomas Zimmermann wrote: > > Hi > > On 15.04.20 at 09:39, Daniel Vetter wrote: > > Add a new macro helper to combine the usual init sequence in drivers, > > consisting of a kzalloc + devm_drm_dev_init + drmm_add_final_kfree > > triplet. This allows us to remove the rather unsightly > > drmm_add_final_kfree from all currently merged drivers. > > > > The kerneldoc is only added for this new function. Existing kerneldoc > > and examples will be updated at the very end, since once all drivers > > are converted over to devm_drm_dev_alloc we can unexport a lot of > > interim functions and make the documentation for driver authors a lot > > cleaner and less confusing. There will be only one true way to > > initialize a drm_device at the end of this, which is going to be > > devm_drm_dev_alloc. > > > > v2: > > - Actually explain what this is for in the commit message (Sam) > > - Fix checkpatch issues (Sam) > > > > Acked-by: Noralf Trønnes > > Cc: Noralf Trønnes > > Reviewed-by: Sam Ravnborg > > Cc: Sam Ravnborg > > Cc: Paul Kocialkowski > > Cc: Laurent Pinchart > > Signed-off-by: Daniel Vetter Thanks for taking a look, some questions on your suggestions below. > Sorry for being late. A number of nits are listed below. In any case: > > Reviewed-by: Thomas Zimmermann > > Best regards > Thomas > > > --- > > drivers/gpu/drm/drm_drv.c | 23 +++ > > include/drm/drm_drv.h | 33 + > > 2 files changed, 56 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > > index 1bb4f636b83c..8e1813d2a12e 100644 > > --- a/drivers/gpu/drm/drm_drv.c > > +++ b/drivers/gpu/drm/drm_drv.c > > @@ -739,6 +739,29 @@ int devm_drm_dev_init(struct device *parent, > > } > > EXPORT_SYMBOL(devm_drm_dev_init); > > > > +void *__devm_drm_dev_alloc(struct device *parent, struct drm_driver > > *driver, > > +size_t size, size_t offset) > > Maybe rename 'offset' to 'dev_offset' to make the relationship clear.
Hm, I see the point of this (and the dev_field below, although I'd go with dev_member there for some consistency with other macros using offset_of or container_of), but I'm not sure about the dev_ prefix. Drivers use that sometimes for the struct device *, and usage for struct drm_device * is also very inconsistent. I've seen ddev, drm, dev and base (that one only for embedded structs ofc). So not sure which prefix to pick, aside from dev_ seems the most confusing. Got ideas? > > +{ > > + void *container; > > + struct drm_device *drm; > > + int ret; > > + > > + container = kzalloc(size, GFP_KERNEL); > > + if (!container) > > + return ERR_PTR(-ENOMEM); > > + > > + drm = container + offset; > > While convenient, I somewhat dislike the use of void* variables. I'd use > unsigned char* for container and do an explicit cast to struct > drm_device* here. I thought ever since C89 the explicit recommendation for untyped pointer math has been void *, and no longer char *, with the spec being explicit that void * pointer math works exactly like char *. So not clear on why you think char * is preferred here. I'm also not aware of any other kernel code that casts to char * for untyped pointer math. So unless you have some supporting evidence, I'll skip this one, ok? 
Thanks, Daniel > > + ret = devm_drm_dev_init(parent, drm, driver); > > + if (ret) { > > + kfree(container); > > + return ERR_PTR(ret); > > + } > > + drmm_add_final_kfree(drm, container); > > + > > + return container; > > +} > > +EXPORT_SYMBOL(__devm_drm_dev_alloc); > > + > > /** > > * drm_dev_alloc - Allocate new DRM device > > * @driver: DRM driver to allocate device for > > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h > > index e7c6ea261ed1..f07f15721254 100644 > > --- a/include/drm/drm_drv.h > > +++ b/include/drm/drm_drv.h > > @@ -626,6 +626,39 @@ int devm_drm_dev_init(struct device *parent, > > struct drm_device *dev, > > struct drm_driver *driver); > > > > +void *__devm_drm_dev_alloc(struct device *parent, struct drm_driver > > *driver, > > +size_t size, size_t offset); > > + > > +/** > > + * devm_drm_dev_alloc - Resource managed allocation of a &drm_device > > instance > > + * @parent: Parent device object > > + * @driver: DRM driver > > + * @type: the type of the struct which contains struct &drm_device > > + * @member: the name of the &drm_device within @type. > > + * > > + * This allocates and initializes a new DRM device. No device registration > > is done. > > + * Call drm_dev_register() to advertise the device to user space and > > register it > > + * with other core subsystems. This should be done last in the device > > + * initialization sequence to make sure userspace can't access an > > inconsistent > > + * state. > > + * > > + * The initial ref-count of the
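To make the size/offset discussion in this thread concrete, here is a minimal userspace sketch of the "allocate the container, find the embedded member by offset" pattern. All names (fake_drm_device, fake_driver_priv, fake_dev_alloc, fake_dev_alloc_typed) are hypothetical stand-ins and calloc() replaces kzalloc(); the body of the real macro is not shown in the quoted text, so this is not a copy of it. On the pointer-type question: ISO C does not define arithmetic on void *; GCC and Clang accept it as an extension that treats the element size as 1, which is what kernel code relies on. The sketch uses unsigned char * so that it also builds as strict ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_device */
struct fake_drm_device {
	int registered;
};

/* Hypothetical driver structure embedding the device, not at offset 0 */
struct fake_driver_priv {
	int irq;
	struct fake_drm_device drm;
};

/*
 * Allocate a zeroed container of `size` bytes and initialise the device
 * embedded at `offset`. Returns the container, or NULL on failure.
 */
static void *fake_dev_alloc(size_t size, size_t offset)
{
	unsigned char *container;
	struct fake_drm_device *drm;

	assert(offset + sizeof(*drm) <= size);

	container = calloc(1, size);
	if (!container)
		return NULL;

	/* Byte-wise pointer math, valid in ISO C for character pointers */
	drm = (struct fake_drm_device *)(container + offset);
	drm->registered = 0;

	return container;
}

/* Typed wrapper: derives size and offset from the type and member name */
#define fake_dev_alloc_typed(type, member) \
	((type *)fake_dev_alloc(sizeof(type), offsetof(type, member)))
```

A caller would write `struct fake_driver_priv *priv = fake_dev_alloc_typed(struct fake_driver_priv, drm);` and get back the outer structure with the embedded device already initialised.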
[Intel-gfx] [PATCH 14/24] drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
As a preparation step for full object locking and wait/wound handling during pin and object mapping, ensure that we always pass the ww context in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this happens. This also requires changing the order of eb_parse slightly, to ensure we pass ww at a point where we could still handle -EDEADLK safely. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/display/intel_display.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 4 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 137 ++ drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 4 +- drivers/gpu/drm/i915/gt/gen6_ppgtt.h | 4 +- drivers/gpu/drm/i915/gt/intel_context.c | 65 ++--- drivers/gpu/drm/i915/gt/intel_context.h | 13 ++ drivers/gpu/drm/i915/gt/intel_context_types.h | 3 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_gt.c| 2 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 5 +- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- drivers/gpu/drm/i915/gt/intel_ring.c | 10 +- drivers/gpu/drm/i915/gt/intel_ring.h | 3 +- .../gpu/drm/i915/gt/intel_ring_submission.c | 15 +- drivers/gpu/drm/i915/gt/intel_timeline.c | 12 +- drivers/gpu/drm/i915/gt/intel_timeline.h | 3 +- drivers/gpu/drm/i915/gt/mock_engine.c | 3 +- drivers/gpu/drm/i915/gt/selftest_timeline.c | 4 +- drivers/gpu/drm/i915/gt/uc/intel_guc.c| 2 +- drivers/gpu/drm/i915/i915_drv.h | 13 +- drivers/gpu/drm/i915/i915_gem.c | 11 +- drivers/gpu/drm/i915/i915_vma.c | 13 +- drivers/gpu/drm/i915/i915_vma.h | 13 +- 24 files changed, 213 insertions(+), 132 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 1064e34e42cd..0c3ca46c6cc7 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3449,7 +3449,7 @@ initial_plane_vma(struct drm_i915_private *i915, if (IS_ERR(vma)) goto err_obj; - if (i915_ggtt_pin(vma, 0, PIN_MAPPABLE | PIN_OFFSET_FIXED | base)) + if 
(i915_ggtt_pin(vma, NULL, 0, PIN_MAPPABLE | PIN_OFFSET_FIXED | base)) goto err_obj; if (i915_gem_object_is_tiled(obj) && diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 9fe387ac9850..6a0180ac8101 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1144,7 +1144,7 @@ static int context_barrier_task(struct i915_gem_context *ctx, i915_gem_ww_ctx_init(&ww, true); retry: - err = intel_context_pin(ce); + err = intel_context_pin_ww(ce, &ww); if (err) goto err; @@ -1237,7 +1237,7 @@ static int pin_ppgtt_update(struct intel_context *ce, struct i915_gem_ww_ctx *ww if (!HAS_LOGICAL_RING_CONTEXTS(vm->i915)) /* ppGTT is not part of the legacy context image */ - return gen6_ppgtt_pin(i915_vm_to_ppgtt(vm)); + return gen6_ppgtt_pin(i915_vm_to_ppgtt(vm), ww); return 0; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 6abe37b7933d..28cf45e9d9c6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -424,16 +424,17 @@ eb_pin_vma(struct i915_execbuffer *eb, pin_flags |= PIN_GLOBAL; /* Attempt to reuse the current location if available */ - if (unlikely(i915_vma_pin(vma, 0, 0, pin_flags))) { + /* TODO: Add -EDEADLK handling here */ + if (unlikely(i915_vma_pin_ww(vma, &eb->ww, 0, 0, pin_flags))) { if (entry->flags & EXEC_OBJECT_PINNED) return false; /* Failing that pick any _free_ space if suitable */ - if (unlikely(i915_vma_pin(vma, - entry->pad_to_size, - entry->alignment, - eb_pin_flags(entry, ev->flags) | - PIN_USER | PIN_NOEVICT))) + if (unlikely(i915_vma_pin_ww(vma, &eb->ww, +entry->pad_to_size, +entry->alignment, +eb_pin_flags(entry, ev->flags) | +PIN_USER | PIN_NOEVICT))) return false; } @@ -575,7 +576,7 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache, obj->cache_level != I915_CACHE_NONE); } -static int 
eb_reserve_vma(const struc
[Intel-gfx] [PATCH 05/24] drm/i915: Remove locking from i915_gem_object_prepare_read/write
Execbuffer submission will perform its own WW locking, and we cannot rely on the implicit lock there. This also makes it clear that the GVT code will get a lockdep splat when multiple batchbuffer shadows need to be performed in the same instance, fix that up. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_domain.c| 20 ++- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 13 ++-- drivers/gpu/drm/i915/gem/i915_gem_object.h| 1 - .../gpu/drm/i915/gem/selftests/huge_pages.c | 5 - .../i915/gem/selftests/i915_gem_coherency.c | 14 + .../drm/i915/gem/selftests/i915_gem_context.c | 12 --- drivers/gpu/drm/i915/gt/intel_renderstate.c | 5 - drivers/gpu/drm/i915/gvt/cmd_parser.c | 9 - drivers/gpu/drm/i915/i915_gem.c | 20 +-- 9 files changed, 70 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index c0acfc97fae3..8ebceebd11b0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -576,19 +576,17 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, if (!i915_gem_object_has_struct_page(obj)) return -ENODEV; - ret = i915_gem_object_lock_interruptible(obj, NULL); - if (ret) - return ret; + assert_object_held(obj); ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT); if (ret) - goto err_unlock; + return ret; ret = i915_gem_object_pin_pages(obj); if (ret) - goto err_unlock; + return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ || !static_cpu_has(X86_FEATURE_CLFLUSH)) { @@ -616,8 +614,6 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj, err_unpin: i915_gem_object_unpin_pages(obj); -err_unlock: - i915_gem_object_unlock(obj); return ret; } @@ -630,20 +626,18 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, if (!i915_gem_object_has_struct_page(obj)) return -ENODEV; - ret = i915_gem_object_lock_interruptible(obj, NULL); - if (ret) - return 
ret; + assert_object_held(obj); ret = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE | I915_WAIT_ALL, MAX_SCHEDULE_TIMEOUT); if (ret) - goto err_unlock; + return ret; ret = i915_gem_object_pin_pages(obj); if (ret) - goto err_unlock; + return ret; if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE || !static_cpu_has(X86_FEATURE_CLFLUSH)) { @@ -680,7 +674,5 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj, err_unpin: i915_gem_object_unpin_pages(obj); -err_unlock: - i915_gem_object_unlock(obj); return ret; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index bc5371d8da0a..2c7c0f4142aa 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1003,11 +1003,14 @@ static void reloc_cache_reset(struct reloc_cache *cache) vaddr = unmask_page(cache->vaddr); if (cache->vaddr & KMAP) { + struct drm_i915_gem_object *obj = + (struct drm_i915_gem_object *)cache->node.mm; if (cache->vaddr & CLFLUSH_AFTER) mb(); kunmap_atomic(vaddr); - i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm); + i915_gem_object_finish_access(obj); + i915_gem_object_unlock(obj); } else { struct i915_ggtt *ggtt = cache_to_ggtt(cache); @@ -1042,10 +1045,16 @@ static void *reloc_kmap(struct drm_i915_gem_object *obj, unsigned int flushes; int err; - err = i915_gem_object_prepare_write(obj, &flushes); + err = i915_gem_object_lock_interruptible(obj, NULL); if (err) return ERR_PTR(err); + err = i915_gem_object_prepare_write(obj, &flushes); + if (err) { + i915_gem_object_unlock(obj); + return ERR_PTR(err); + } + BUILD_BUG_ON(KMAP & CLFLUSH_FLAGS); BUILD_BUG_ON((KMAP | CLFLUSH_FLAGS) & PAGE_MASK); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 5103067269b0..11b8e27
[Intel-gfx] [PATCH 20/24] drm/i915: Use ww pinning for intel_context_create_request()
We want to get rid of intel_context_pin(), convert intel_context_create_request() first. :) Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_context.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index fe9fff5a63b1..e148e2d69ae1 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -449,15 +449,25 @@ int intel_context_prepare_remote_request(struct intel_context *ce, struct i915_request *intel_context_create_request(struct intel_context *ce) { + struct i915_gem_ww_ctx ww; struct i915_request *rq; int err; - err = intel_context_pin(ce); - if (unlikely(err)) - return ERR_PTR(err); + i915_gem_ww_ctx_init(&ww, true); +retry: + err = intel_context_pin_ww(ce, &ww); + if (!err) { + rq = i915_request_create(ce); + intel_context_unpin(ce); + } else if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } else { + rq = ERR_PTR(err); + } - rq = i915_request_create(ce); - intel_context_unpin(ce); + i915_gem_ww_ctx_fini(&ww); if (IS_ERR(rq)) return rq; -- 2.26.1
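The init / pin / backoff / retry / fini shape used in this patch can be modelled in userspace C. Everything prefixed fake_ is a hypothetical stand-in for the i915 functions, and the "contended" counter simply forces a fixed number of -EDEADLK results. One detail worth noting: as far as the posted hunk shows, rq is left unassigned when the backoff itself fails, so the sketch returns a plain error code that is set on every path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct i915_gem_ww_ctx */
struct fake_ww_ctx {
	int backoffs;
};

/* Hypothetical stand-in for struct intel_context */
struct fake_context {
	int contended;	/* how many pin attempts report -EDEADLK */
	int pin_count;
};

static void fake_ww_init(struct fake_ww_ctx *ww)
{
	ww->backoffs = 0;
}

/* Model of the backoff step: drop held locks, wait, report success */
static int fake_ww_backoff(struct fake_ww_ctx *ww)
{
	ww->backoffs++;
	return 0;
}

static void fake_ww_fini(struct fake_ww_ctx *ww)
{
	(void)ww;
}

static int fake_pin_ww(struct fake_context *ce, struct fake_ww_ctx *ww)
{
	(void)ww;
	if (ce->contended > 0) {
		ce->contended--;
		return -EDEADLK;
	}
	ce->pin_count++;
	return 0;
}

static void fake_unpin(struct fake_context *ce)
{
	assert(ce->pin_count > 0);
	ce->pin_count--;
}

/* Returns 0 or a negative errno; *backoffs reports how often we retried */
static int fake_create_request(struct fake_context *ce, int *backoffs)
{
	struct fake_ww_ctx ww;
	int err;

	fake_ww_init(&ww);
retry:
	err = fake_pin_ww(ce, &ww);
	if (!err) {
		/* request creation would happen here, while pinned */
		fake_unpin(ce);
	} else if (err == -EDEADLK) {
		err = fake_ww_backoff(&ww);
		if (!err)
			goto retry;
	}

	*backoffs = ww.backoffs;
	fake_ww_fini(&ww);
	return err;
}
```

With a context that is contended twice, the loop backs off twice, succeeds on the third pin attempt, and leaves the pin count balanced.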
[Intel-gfx] [PATCH 24/24] drm/i915: Ensure we hold the pin mutex
Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_renderstate.c | 2 +- drivers/gpu/drm/i915/i915_vma.c | 9 - drivers/gpu/drm/i915/i915_vma.h | 1 + 3 files changed, 10 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.c b/drivers/gpu/drm/i915/gt/intel_renderstate.c index b954d0807b4b..357207bf5d7c 100644 --- a/drivers/gpu/drm/i915/gt/intel_renderstate.c +++ b/drivers/gpu/drm/i915/gt/intel_renderstate.c @@ -207,7 +207,7 @@ int intel_renderstate_init(struct intel_renderstate *so, if (err) goto err_context; - err = i915_vma_pin(so->vma, 0, 0, PIN_GLOBAL | PIN_HIGH); + err = i915_vma_pin_ww(so->vma, &so->ww, 0, 0, PIN_GLOBAL | PIN_HIGH); if (err) goto err_context; diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 98476f09f3f0..bf8ec7175acb 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -868,6 +868,8 @@ int i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, #ifdef CONFIG_PROVE_LOCKING if (debug_locks && lockdep_is_held(&vma->vm->i915->drm.struct_mutex)) WARN_ON(!ww); + if (debug_locks && ww && vma->resv) + assert_vma_held(vma); #endif BUILD_BUG_ON(PIN_GLOBAL != I915_VMA_GLOBAL_BIND); @@ -1008,8 +1010,13 @@ int i915_ggtt_pin(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, GEM_BUG_ON(!i915_vma_is_ggtt(vma)); + WARN_ON(!ww && vma->resv && dma_resv_held(vma->resv)); + do { - err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL); + if (ww) + err = i915_vma_pin_ww(vma, ww, 0, align, flags | PIN_GLOBAL); + else + err = i915_vma_pin(vma, 0, align, flags | PIN_GLOBAL); if (err != -ENOSPC) { if (!err) { err = i915_vma_wait_for_bind(vma); diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h index 2e3779a8a437..d937ce950481 100644 --- a/drivers/gpu/drm/i915/i915_vma.h +++ b/drivers/gpu/drm/i915/i915_vma.h @@ -242,6 +242,7 @@ i915_vma_pin_ww(struct i915_vma *vma, struct i915_gem_ww_ctx *ww, static inline int 
__must_check i915_vma_pin(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) { + WARN_ON_ONCE(vma->resv && dma_resv_held(vma->resv)); return i915_vma_pin_ww(vma, NULL, size, alignment, flags); } -- 2.26.1
[Intel-gfx] [PATCH 11/24] drm/i915: Nuke arguments to eb_pin_engine
Those arguments are already set as eb.file and eb.args, so kill off the extra arguments. This will allow us to move eb_pin_engine() to after we reserved all BO's. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 17 +++-- 1 file changed, 7 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index a1b3d1fa1402..d432451608c9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2458,11 +2458,10 @@ static void eb_unpin_engine(struct i915_execbuffer *eb) } static unsigned int -eb_select_legacy_ring(struct i915_execbuffer *eb, - struct drm_file *file, - struct drm_i915_gem_execbuffer2 *args) +eb_select_legacy_ring(struct i915_execbuffer *eb) { struct drm_i915_private *i915 = eb->i915; + struct drm_i915_gem_execbuffer2 *args = eb->args; unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK; if (user_ring_id != I915_EXEC_BSD && @@ -2477,7 +2476,7 @@ eb_select_legacy_ring(struct i915_execbuffer *eb, unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK; if (bsd_idx == I915_EXEC_BSD_DEFAULT) { - bsd_idx = gen8_dispatch_bsd_engine(i915, file); + bsd_idx = gen8_dispatch_bsd_engine(i915, eb->file); } else if (bsd_idx >= I915_EXEC_BSD_RING1 && bsd_idx <= I915_EXEC_BSD_RING2) { bsd_idx >>= I915_EXEC_BSD_SHIFT; @@ -2502,18 +2501,16 @@ eb_select_legacy_ring(struct i915_execbuffer *eb, } static int -eb_pin_engine(struct i915_execbuffer *eb, - struct drm_file *file, - struct drm_i915_gem_execbuffer2 *args) +eb_pin_engine(struct i915_execbuffer *eb) { struct intel_context *ce; unsigned int idx; int err; if (i915_gem_context_user_engines(eb->gem_context)) - idx = args->flags & I915_EXEC_RING_MASK; + idx = eb->args->flags & I915_EXEC_RING_MASK; else - idx = eb_select_legacy_ring(eb, file, args); + idx = eb_select_legacy_ring(eb); ce = i915_gem_context_get_engine(eb->gem_context, idx); if 
(IS_ERR(ce)) @@ -2812,7 +2809,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_destroy; - err = eb_pin_engine(&eb, file, args); + err = eb_pin_engine(&eb); if (unlikely(err)) goto err_context; -- 2.26.1
[Intel-gfx] [PATCH 23/24] drm/i915: Add ww locking to pin_to_display_plane
Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_domain.c | 65 -- drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + 2 files changed, 49 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c index 8ebceebd11b0..c0d153284984 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c @@ -37,6 +37,12 @@ void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj) i915_gem_object_unlock(obj); } +void i915_gem_object_flush_if_display_locked(struct drm_i915_gem_object *obj) +{ + if (i915_gem_object_is_framebuffer(obj)) + __i915_gem_object_flush_for_display(obj); +} + /** * Moves a single object to the WC read, and possibly write domain. * @obj: object to act on @@ -197,18 +203,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj, if (ret) return ret; - ret = i915_gem_object_lock_interruptible(obj, NULL); - if (ret) - return ret; - /* Always invalidate stale cachelines */ if (obj->cache_level != cache_level) { i915_gem_object_set_cache_coherency(obj, cache_level); obj->cache_dirty = true; } - i915_gem_object_unlock(obj); - /* The cache-level will be applied when each vma is rebound. 
*/ return i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE | @@ -255,6 +255,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, struct drm_i915_gem_caching *args = data; struct drm_i915_gem_object *obj; enum i915_cache_level level; + struct i915_gem_ww_ctx ww; int ret = 0; switch (args->caching) { @@ -293,7 +294,18 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data, goto out; } - ret = i915_gem_object_set_cache_level(obj, level); + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(obj, &ww); + if (!ret) + ret = i915_gem_object_set_cache_level(obj, level); + + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); out: i915_gem_object_put(obj); @@ -313,6 +325,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, unsigned int flags) { struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct i915_gem_ww_ctx ww; struct i915_vma *vma; int ret; @@ -320,6 +333,11 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, if (HAS_LMEM(i915) && !i915_gem_object_is_lmem(obj)) return ERR_PTR(-EINVAL); + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(obj, &ww); + if (ret) + goto err; /* * The display engine is not coherent with the LLC cache on gen6. As * a result, we make sure that the pinning that is about to occur is @@ -334,7 +352,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, HAS_WT(i915) ? 
I915_CACHE_WT : I915_CACHE_NONE); if (ret) - return ERR_PTR(ret); + goto err; /* * As the user may map the buffer once pinned in the display plane @@ -347,18 +365,31 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, vma = ERR_PTR(-ENOSPC); if ((flags & PIN_MAPPABLE) == 0 && (!view || view->type == I915_GGTT_VIEW_NORMAL)) - vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, - flags | - PIN_MAPPABLE | - PIN_NONBLOCK); - if (IS_ERR(vma)) - vma = i915_gem_object_ggtt_pin(obj, view, 0, alignment, flags); - if (IS_ERR(vma)) - return vma; + vma = i915_gem_object_ggtt_pin_ww(obj, &ww, view, 0, alignment, + flags | PIN_MAPPABLE | + PIN_NONBLOCK); + if (IS_ERR(vma) && vma != ERR_PTR(-EDEADLK)) + vma = i915_gem_object_ggtt_pin_ww(obj, &ww, view, 0, + alignment, flags); + if (IS_ERR(vma)) { + ret = PTR_ERR(vma); + goto err; + } vma->display_alignment = max_t(u64, vma->display_alignment, alignment); - i915_gem_object_flush_if_display(obj); + i915_gem_object_flush_if_display_loc
[Intel-gfx] [PATCH 15/24] drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.
This is the last part outside of selftests that still don't use the correct lock ordering of timeline->mutex vs resv_lock. With gem fixed, there are a few places that still get locking wrong: - gvt/scheduler.c - i915_perf.c - Most if not all selftests. Changes since v1: - Add intel_engine_pm_get/put() calls to fix use-after-free when using intel_engine_get_pool(). Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_client_blt.c| 80 +++-- .../gpu/drm/i915/gem/i915_gem_object_blt.c| 156 +++--- .../gpu/drm/i915/gem/i915_gem_object_blt.h| 3 + 3 files changed, 165 insertions(+), 74 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 2f1d8150256b..6d2f6ac500dc 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -156,6 +156,7 @@ static void clear_pages_worker(struct work_struct *work) struct clear_pages_work *w = container_of(work, typeof(*w), work); struct drm_i915_gem_object *obj = w->sleeve->vma->obj; struct i915_vma *vma = w->sleeve->vma; + struct i915_gem_ww_ctx ww; struct i915_request *rq; struct i915_vma *batch; int err = w->dma.error; @@ -171,17 +172,20 @@ static void clear_pages_worker(struct work_struct *work) obj->read_domains = I915_GEM_GPU_DOMAINS; obj->write_domain = 0; - err = i915_vma_pin(vma, 0, 0, PIN_USER); - if (unlikely(err)) + i915_gem_ww_ctx_init(&ww, false); + intel_engine_pm_get(w->ce->engine); +retry: + err = intel_context_pin_ww(w->ce, &ww); + if (err) goto out_signal; - batch = intel_emit_vma_fill_blt(w->ce, vma, w->value); + batch = intel_emit_vma_fill_blt(w->ce, vma, &ww, w->value); if (IS_ERR(batch)) { err = PTR_ERR(batch); - goto out_unpin; + goto out_ctx; } - rq = intel_context_create_request(w->ce); + rq = i915_request_create(w->ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); goto out_batch; @@ -223,9 +227,19 @@ static void clear_pages_worker(struct work_struct *work) i915_request_add(rq); 
out_batch: intel_emit_vma_release(w->ce, batch); -out_unpin: - i915_vma_unpin(vma); +out_ctx: + intel_context_unpin(w->ce); out_signal: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + + i915_vma_unpin(w->sleeve->vma); + intel_engine_pm_put(w->ce->engine); + if (unlikely(err)) { dma_fence_set_error(&w->dma, err); dma_fence_signal(&w->dma); @@ -233,6 +247,45 @@ static void clear_pages_worker(struct work_struct *work) } } +static int pin_wait_clear_pages_work(struct clear_pages_work *w, +struct intel_context *ce) +{ + struct i915_vma *vma = w->sleeve->vma; + struct i915_gem_ww_ctx ww; + int err; + + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_gem_object_lock(vma->obj, &ww); + if (err) + goto out; + + err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER); + if (unlikely(err)) + goto out; + + err = i915_sw_fence_await_reservation(&w->wait, + vma->obj->base.resv, NULL, + true, I915_FENCE_TIMEOUT, + I915_FENCE_GFP); + if (err) + goto err_unpin_vma; + + dma_resv_add_excl_fence(vma->obj->base.resv, &w->dma); + +err_unpin_vma: + if (err) + i915_vma_unpin(vma); +out: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + return err; +} + static int __i915_sw_fence_call clear_pages_work_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) @@ -286,18 +339,9 @@ int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj, dma_fence_init(&work->dma, &clear_pages_work_ops, &fence_lock, 0, 0); i915_sw_fence_init(&work->wait, clear_pages_work_notify); - i915_gem_object_lock(obj, NULL); - err = i915_sw_fence_await_reservation(&work->wait, - obj->base.resv, NULL, - true, I915_FENCE_TIMEOUT, - I915_FENCE_GFP); - if (err < 0) { + err = pin_wait_clear_pages_work(work, ce); + if (err < 0) dma_fence_set_error(&work->dma, err); - } else { - dma_resv_add_e
[Intel-gfx] [PATCH 13/24] drm/i915: Rework intel_context pinning to do everything outside of pin_mutex
Instead of doing everything inside of pin_mutex, we move all pinning outside. Because i915_active has its own reference counting and pinning is also having the same issues vs mutexes, we make sure everything is pinned first, so the pinning in i915_active only needs to bump refcounts. This allows us to take pin refcounts correctly all the time. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_context.c | 232 +++--- drivers/gpu/drm/i915/gt/intel_context_types.h | 4 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 34 ++- .../gpu/drm/i915/gt/intel_ring_submission.c | 13 +- drivers/gpu/drm/i915/gt/mock_engine.c | 13 +- 5 files changed, 190 insertions(+), 106 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index e4aece20bc80..c039e87a46c4 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -93,79 +93,6 @@ static void intel_context_active_release(struct intel_context *ce) i915_active_release(&ce->active); } -int __intel_context_do_pin(struct intel_context *ce) -{ - int err; - - if (unlikely(!test_bit(CONTEXT_ALLOC_BIT, &ce->flags))) { - err = intel_context_alloc_state(ce); - if (err) - return err; - } - - err = i915_active_acquire(&ce->active); - if (err) - return err; - - if (mutex_lock_interruptible(&ce->pin_mutex)) { - err = -EINTR; - goto out_release; - } - - if (unlikely(intel_context_is_closed(ce))) { - err = -ENOENT; - goto out_unlock; - } - - if (likely(!atomic_add_unless(&ce->pin_count, 1, 0))) { - err = intel_context_active_acquire(ce); - if (unlikely(err)) - goto out_unlock; - - err = ce->ops->pin(ce); - if (unlikely(err)) - goto err_active; - - CE_TRACE(ce, "pin ring:{start:%08x, head:%04x, tail:%04x}\n", -i915_ggtt_offset(ce->ring->vma), -ce->ring->head, ce->ring->tail); - - smp_mb__before_atomic(); /* flush pin before it is visible */ - atomic_inc(&ce->pin_count); - } - - GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! 
*/ - GEM_BUG_ON(i915_active_is_idle(&ce->active)); - goto out_unlock; - -err_active: - intel_context_active_release(ce); -out_unlock: - mutex_unlock(&ce->pin_mutex); -out_release: - i915_active_release(&ce->active); - return err; -} - -void intel_context_unpin(struct intel_context *ce) -{ - if (!atomic_dec_and_test(&ce->pin_count)) - return; - - CE_TRACE(ce, "unpin\n"); - ce->ops->unpin(ce); - - /* -* Once released, we may asynchronously drop the active reference. -* As that may be the only reference keeping the context alive, -* take an extra now so that it is not freed before we finish -* dereferencing it. -*/ - intel_context_get(ce); - intel_context_active_release(ce); - intel_context_put(ce); -} - static int __context_pin_state(struct i915_vma *vma) { unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; @@ -225,6 +152,138 @@ static void __ring_retire(struct intel_ring *ring) i915_active_release(&ring->vma->active); } +static int intel_context_pre_pin(struct intel_context *ce) +{ + int err; + + CE_TRACE(ce, "active\n"); + + err = __ring_active(ce->ring); + if (err) + return err; + + err = intel_timeline_pin(ce->timeline); + if (err) + goto err_ring; + + if (!ce->state) + return 0; + + err = __context_pin_state(ce->state); + if (err) + goto err_timeline; + + + return 0; + +err_timeline: + intel_timeline_unpin(ce->timeline); +err_ring: + __ring_retire(ce->ring); + return err; +} + +static void intel_context_post_unpin(struct intel_context *ce) +{ + if (ce->state) + __context_unpin_state(ce->state); + + intel_timeline_unpin(ce->timeline); + __ring_retire(ce->ring); +} + +int __intel_context_do_pin(struct intel_context *ce) +{ + bool handoff = false; + void *vaddr; + int err = 0; + + if (unlikely(!test_bit(CONTEXT_ALLOC_BIT, &ce->flags))) { + err = intel_context_alloc_state(ce); + if (err) + return err; + } + + /* +* We always pin the context/ring/timeline here, to ensure a pin +* refcount for __intel_context_active(), which prevent a lock +* inversion of 
ce->pin_mutex vs dma_resv_lock(). +*/ + err = intel_context_pre_pin(ce); + if (err) + return err; + +
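The idea described in the commit message (pin the resources first, then only bump a refcount under the mutex) might be sketched as below. The diff above is truncated, so this is a guess at the general shape and not the patch's code; every `mock_*` name is hypothetical, and the mutex is modelled as a plain flag.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct intel_context. */
struct mock_context {
	int pin_count;     /* number of users of the pinned context */
	int resource_pins; /* pins held on ring/timeline/state */
	int mutex_held;    /* 1 while the mock pin_mutex is held */
};

/* May sleep or take other locks, so it must run outside pin_mutex. */
static int mock_pre_pin(struct mock_context *ce)
{
	if (ce->mutex_held)
		return -1; /* would be a lock inversion */
	ce->resource_pins++;
	return 0;
}

static void mock_post_unpin(struct mock_context *ce)
{
	ce->resource_pins--;
}

static int mock_context_pin(struct mock_context *ce)
{
	bool handoff = false;
	int err;

	/* Phase 1: pin everything before taking the mutex. */
	err = mock_pre_pin(ce);
	if (err)
		return err;

	/* Phase 2: under the mutex, only the refcount changes. */
	ce->mutex_held = 1;
	if (ce->pin_count++ == 0)
		handoff = true; /* first pin keeps the pre-pinned resources */
	ce->mutex_held = 0;

	/* Later pins did not need the extra resource pin; drop it. */
	if (!handoff)
		mock_post_unpin(ce);
	return 0;
}

static void mock_context_unpin(struct mock_context *ce)
{
	if (--ce->pin_count == 0)
		mock_post_unpin(ce);
}
```

The cost is a redundant pin/unpin pair on every pin after the first, traded for never calling into the pinning code with the mutex held.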
[Intel-gfx] [PATCH 18/24] drm/i915: Dirty hack to fix selftests locking inversion
Some i915 selftests still use i915_vma_lock() as the inner lock, with the intel_timeline->mutex taken by intel_context_create_request() as the outer lock. Fortunately, this is not an issue for selftests. They should be fixed, but we can move ahead and clean up lockdep now.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/gt/intel_context.c | 12
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 64948386630f..fe9fff5a63b1 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -459,6 +459,18 @@ struct i915_request *intel_context_create_request(struct intel_context *ce)
 	rq = i915_request_create(ce);
 	intel_context_unpin(ce);
 
+	if (IS_ERR(rq))
+		return rq;
+
+	/*
+	 * timeline->mutex should be the inner lock, but is used as outer lock.
+	 * Hack around this to shut up lockdep in selftests..
+	 */
+	lockdep_unpin_lock(&ce->timeline->mutex, rq->cookie);
+	mutex_release(&ce->timeline->mutex.dep_map, _RET_IP_);
+	mutex_acquire(&ce->timeline->mutex.dep_map, SINGLE_DEPTH_NESTING, 0, _RET_IP_);
+	rq->cookie = lockdep_pin_lock(&ce->timeline->mutex);
+
 	return rq;
 }
 
-- 
2.26.1
[Intel-gfx] [PATCH 21/24] drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v2.
Make sure vma_lock is not used as inner lock when kernel context is used, and add ww handling where appropriate. Signed-off-by: Maarten Lankhorst --- .../i915/gem/selftests/i915_gem_coherency.c | 26 ++-- .../drm/i915/gem/selftests/i915_gem_mman.c| 41 ++- drivers/gpu/drm/i915/selftests/i915_request.c | 18 +--- 3 files changed, 57 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c index 99f8466a108a..d93b7d9ad174 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_coherency.c @@ -199,25 +199,25 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) i915_gem_object_lock(ctx->obj, NULL); err = i915_gem_object_set_to_gtt_domain(ctx->obj, true); - i915_gem_object_unlock(ctx->obj); if (err) - return err; + goto out_unlock; vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0); - if (IS_ERR(vma)) - return PTR_ERR(vma); + if (IS_ERR(vma)) { + err = PTR_ERR(vma); + goto out_unlock; + } rq = intel_engine_create_kernel_request(ctx->engine); if (IS_ERR(rq)) { - i915_vma_unpin(vma); - return PTR_ERR(rq); + err = PTR_ERR(rq); + goto out_unpin; } cs = intel_ring_begin(rq, 4); if (IS_ERR(cs)) { - i915_request_add(rq); - i915_vma_unpin(vma); - return PTR_ERR(cs); + err = PTR_ERR(cs); + goto out_rq; } if (INTEL_GEN(ctx->engine->i915) >= 8) { @@ -238,14 +238,16 @@ static int gpu_set(struct context *ctx, unsigned long offset, u32 v) } intel_ring_advance(rq, cs); - i915_vma_lock(vma); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); - i915_vma_unlock(vma); - i915_vma_unpin(vma); +out_rq: i915_request_add(rq); +out_unpin: + i915_vma_unpin(vma); +out_unlock: + i915_gem_object_unlock(ctx->obj); return err; } diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c 
index eec58da734bd..c8b9343cc88c 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c @@ -528,31 +528,42 @@ static int make_obj_busy(struct drm_i915_gem_object *obj) for_each_uabi_engine(engine, i915) { struct i915_request *rq; struct i915_vma *vma; + struct i915_gem_ww_ctx ww; int err; vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL); if (IS_ERR(vma)) return PTR_ERR(vma); - err = i915_vma_pin(vma, 0, 0, PIN_USER); + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_gem_object_lock(obj, &ww); + if (!err) + err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_USER); if (err) - return err; + goto err; rq = intel_engine_create_kernel_request(engine); if (IS_ERR(rq)) { - i915_vma_unpin(vma); - return PTR_ERR(rq); + err = PTR_ERR(rq); + goto err_unpin; } - i915_vma_lock(vma); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); - i915_vma_unlock(vma); i915_request_add(rq); +err_unpin: i915_vma_unpin(vma); +err: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); if (err) return err; } @@ -1000,6 +1011,7 @@ static int __igt_mmap_gpu(struct drm_i915_private *i915, for_each_uabi_engine(engine, i915) { struct i915_request *rq; struct i915_vma *vma; + struct i915_gem_ww_ctx ww; vma = i915_vma_instance(obj, engine->kernel_context->vm, NULL); if (IS_ERR(vma)) { @@ -1007,9 +1019,13 @@ static int __igt_mmap_gpu(struct drm_i915_private *i915, goto out_unmap; } - err = i915_vma_pin(vma, 0, 0, PIN_USER); + i915_gem
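The gpu_set() conversion above replaces early returns with a ladder of cleanup labels, so the object lock stays held across the whole operation and is released on one path. A minimal userspace sketch of that unwinding style follows; `mock_state`, `mock_do_work` and the `fail_at` parameter are hypothetical and exist only to exercise the error paths.

```c
#include <errno.h>

/* Hypothetical state; each field models one acquired resource. */
struct mock_state {
	int locked;   /* object lock held */
	int pinned;   /* vma pinned */
	int requests; /* requests submitted so far */
};

/* fail_at selects which step fails (0 means no failure). */
static int mock_do_work(struct mock_state *s, int fail_at)
{
	int err = 0;

	s->locked = 1; /* step 1: take the object lock */
	if (fail_at == 1) {
		err = -EINVAL;
		goto out_unlock;
	}

	s->pinned = 1; /* step 2: pin while the lock is held */
	if (fail_at == 2) {
		err = -ENOMEM;
		goto out_unpin;
	}

	s->requests++; /* step 3: build and submit the request */

	/* Labels release resources in reverse order of acquisition. */
out_unpin:
	s->pinned = 0;
out_unlock:
	s->locked = 0;
	return err;
}
```

The success path falls through the same labels, which is why the lock and pin are always dropped exactly once.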
[Intel-gfx] [PATCH 02/24] drm/i915/gt: Move the batch buffer pool from the engine to the gt
From: Chris Wilson Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson Cc: Maarten Lankhorst Cc: Tvrtko Ursulin Signed-off-by: Maarten Lankhorst Link: https://patchwork.freedesktop.org/patch/msgid/20200416071804.30187-1-ch...@chris-wilson.co.uk --- drivers/gpu/drm/i915/Makefile | 2 +- .../gpu/drm/i915/gem/i915_gem_client_blt.c| 1 - .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 20 ++-- .../gpu/drm/i915/gem/i915_gem_object_blt.c| 18 +-- .../gpu/drm/i915/gem/i915_gem_object_blt.h| 1 - drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 - drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 - drivers/gpu/drm/i915/gt/intel_engine_pool.h | 34 -- drivers/gpu/drm/i915/gt/intel_engine_types.h | 8 -- drivers/gpu/drm/i915/gt/intel_gt.c| 3 + ...l_engine_pool.c => intel_gt_buffer_pool.c} | 111 -- .../gpu/drm/i915/gt/intel_gt_buffer_pool.h| 38 ++ ...l_types.h => intel_gt_buffer_pool_types.h} | 15 ++- drivers/gpu/drm/i915/gt/intel_gt_types.h | 11 ++ drivers/gpu/drm/i915/gt/mock_engine.c | 2 - drivers/gpu/drm/i915/i915_debugfs.c | 4 + 16 files changed, 160 insertions(+), 114 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gt/intel_engine_pool.h rename drivers/gpu/drm/i915/gt/{intel_engine_pool.c => intel_gt_buffer_pool.c} (53%) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h rename drivers/gpu/drm/i915/gt/{intel_engine_pool_types.h => 
intel_gt_buffer_pool_types.h} (54%) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 6f112d8f80ca..79f8b5d07c9f 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -86,11 +86,11 @@ gt-y += \ gt/intel_engine_cs.o \ gt/intel_engine_heartbeat.o \ gt/intel_engine_pm.o \ - gt/intel_engine_pool.o \ gt/intel_engine_user.o \ gt/intel_ggtt.o \ gt/intel_ggtt_fencing.o \ gt/intel_gt.o \ + gt/intel_gt_buffer_pool.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 0598e5382a1d..3a146aa2593b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -6,7 +6,6 @@ #include "i915_drv.h" #include "gt/intel_context.h" #include "gt/intel_engine_pm.h" -#include "gt/intel_engine_pool.h" #include "i915_gem_client_blt.h" #include "i915_gem_object_blt.h" diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 517898aa634c..042916ad3629 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -15,8 +15,8 @@ #include "gem/i915_gem_ioctls.h" #include "gt/intel_context.h" -#include "gt/intel_engine_pool.h" #include "gt/intel_gt.h" +#include "gt/intel_gt_buffer_pool.h" #include "gt/intel_gt_pm.h" #include "gt/intel_ring.h" @@ -1194,13 +1194,13 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, unsigned int len) { struct reloc_cache *cache = &eb->reloc_cache; - struct intel_engine_pool_node *pool; + struct intel_gt_buffer_pool_node *pool; struct i915_request *rq; struct i915_vma *batch; u32 *cmd; int err; - pool = intel_engine_get_pool(eb->engine, PAGE_SIZE); + pool = intel_gt_get_buffer_pool(eb->engine->gt, PAGE_SIZE); if (IS_ERR(pool)) return PTR_ERR(pool); @@ -1229,7 +1229,7 @@ static int __reloc_gpu_alloc(struct 
i915_execbuffer *eb, goto err_unpin; } - err = intel_engine_pool_mark_active(pool, rq); + err = intel_gt_buffer_pool_mark_active(pool, rq); if (err) goto err_request; @@ -1270,7 +1270,7 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb, err_unmap: i915_gem_object_unpin_map(pool->obj); out_pool: - intel_engine_pool_put(pool); + intel_gt_buffer_pool_put(pool); return err; } @@ -1887,7 +1887,7 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, static int eb_parse(struct i915_execbuffer *eb) { s
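The commit message argues for a buffer cache that is pruned by age and not emptied whenever the device idles. A toy model of such a cache, assuming a fixed number of slots and a caller-supplied clock, could look like this; none of the `mock_*` names come from the patch, and the real pool tracks GEM objects, not bare sizes.

```c
#include <stddef.h>

#define MOCK_POOL_SLOTS 4

/* Hypothetical pool node: one cached buffer. */
struct mock_pool_node {
	size_t size;       /* capacity of the cached buffer */
	unsigned long age; /* time of last release */
	int in_use;
	int valid;         /* slot holds a buffer */
};

struct mock_buffer_pool {
	struct mock_pool_node nodes[MOCK_POOL_SLOTS];
};

/* Reuse an idle cached buffer that is large enough, else take a free slot. */
static struct mock_pool_node *
mock_pool_get(struct mock_buffer_pool *p, size_t size)
{
	int i;

	for (i = 0; i < MOCK_POOL_SLOTS; i++) {
		struct mock_pool_node *n = &p->nodes[i];

		if (n->valid && !n->in_use && n->size >= size) {
			n->in_use = 1;
			return n;
		}
	}
	for (i = 0; i < MOCK_POOL_SLOTS; i++) {
		struct mock_pool_node *n = &p->nodes[i];

		if (!n->valid) {
			n->valid = 1;
			n->in_use = 1;
			n->size = size;
			n->age = 0;
			return n;
		}
	}
	return NULL; /* pool exhausted */
}

/* Return a buffer to the cache and record when it became idle. */
static void mock_pool_put(struct mock_pool_node *n, unsigned long now)
{
	n->in_use = 0;
	n->age = now;
}

/* Timer callback: free idle buffers older than max_age; returns the count. */
static int mock_pool_prune(struct mock_buffer_pool *p, unsigned long now,
			   unsigned long max_age)
{
	int i, freed = 0;

	for (i = 0; i < MOCK_POOL_SLOTS; i++) {
		struct mock_pool_node *n = &p->nodes[i];

		if (n->valid && !n->in_use && now - n->age >= max_age) {
			n->valid = 0;
			freed++;
		}
	}
	return freed;
}
```

Because pruning depends only on elapsed time, a workload that idles briefly between submissions keeps hitting the cache.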
[Intel-gfx] [PATCH 08/24] drm/i915: Use per object locking in execbuf, v8.
Now that we changed execbuf submission slightly to allow us to do all pinning in one place, we can now simply add ww versions on top of struct_mutex. All we have to do is a separate path for -EDEADLK handling, which needs to unpin all gem bo's before dropping the lock, then starting over. This finally allows us to do parallel submission, but because not all of the pinning code uses the ww ctx yet, we cannot completely drop struct_mutex yet. Changes since v1: - Keep struct_mutex for now. :( Changes since v2: - Make sure we always lock the ww context in slowpath. Changes since v3: - Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be done on normal unlock path. - Unconditionally release vmas and context. Changes since v4: - Rebased on top of struct_mutex reduction. Changes since v5: - Remove training wheels. Changes since v6: - Fix accidentally broken -ENOSPC handling. Changes since v7: - Handle gt buffer pool better. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 330 ++ drivers/gpu/drm/i915/i915_gem.c | 6 + drivers/gpu/drm/i915/i915_gem.h | 1 + 3 files changed, 195 insertions(+), 142 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index b05dcd492e25..a1b3d1fa1402 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -249,6 +249,8 @@ struct i915_execbuffer { /** list of vma that have execobj.relocation_count */ struct list_head relocs; + struct i915_gem_ww_ctx ww; + /** * Track the most recently used object for relocations, as we * frequently have to perform multiple relocations within the same @@ -267,14 +269,18 @@ struct i915_execbuffer { struct i915_request *rq; u32 *rq_cmd; unsigned int rq_size; + struct intel_gt_buffer_pool_node *pool; } reloc_cache; + struct intel_gt_buffer_pool_node *reloc_pool; /** relocation pool for -EDEADLK handling */ + u64 invalid_flags; /** Set of 
execobj.flags that are invalid */ u32 context_flags; /** Set of execobj.flags to insert from the ctx */ u32 batch_start_offset; /** Location within object of batch */ u32 batch_len; /** Length of batch within object */ u32 batch_flags; /** Flags composed for emit_bb_start() */ + struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */ /** * Indicate either the size of the hastable used to resolve @@ -441,24 +447,18 @@ eb_pin_vma(struct i915_execbuffer *eb, return !eb_vma_misplaced(entry, vma, ev->flags); } -static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags) -{ - GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN)); - - if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE)) - __i915_vma_unpin_fence(vma); - - __i915_vma_unpin(vma); -} - static inline void eb_unreserve_vma(struct eb_vma *ev) { if (!(ev->flags & __EXEC_OBJECT_HAS_PIN)) return; - __eb_unreserve_vma(ev->vma, ev->flags); ev->flags &= ~__EXEC_OBJECT_RESERVED; + + if (unlikely(ev->flags & __EXEC_OBJECT_HAS_FENCE)) + __i915_vma_unpin_fence(ev->vma); + + __i915_vma_unpin(ev->vma); } static int @@ -552,16 +552,6 @@ eb_add_vma(struct i915_execbuffer *eb, eb->batch = ev; } - - if (eb_pin_vma(eb, entry, ev)) { - if (entry->offset != vma->node.start) { - entry->offset = vma->node.start | UPDATE; - eb->args->flags |= __EXEC_HAS_RELOC; - } - } else { - eb_unreserve_vma(ev); - list_add_tail(&ev->bind_link, &eb->unbound); - } } static inline int use_cpu_reloc(const struct reloc_cache *cache, @@ -646,10 +636,6 @@ static int eb_reserve(struct i915_execbuffer *eb) * This avoid unnecessary unbinding of later objects in order to make * room for the earlier objects *unless* we need to defragment. 
*/ - - if (mutex_lock_interruptible(&eb->i915->drm.struct_mutex)) - return -EINTR; - pass = 0; do { list_for_each_entry(ev, &eb->unbound, bind_link) { @@ -657,8 +643,8 @@ static int eb_reserve(struct i915_execbuffer *eb) if (err) break; } - if (!(err == -ENOSPC || err == -EAGAIN)) - break; + if (err != -ENOSPC) + return err; /* Resort *all* the objects into priority order */ INIT_LIST_HEAD(&eb->unbound); @@ -688,13 +674,6 @@ static int eb_reserve(struct i915_execbuffer *eb) } list_splice_
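The -EDEADLK handling described in the commit message (unpin all BOs before dropping the locks, then start over) can be illustrated with a small userspace model. This is a sketch under stated assumptions: `mock_obj` and `mock_lock_and_pin_all` are hypothetical, contention is simulated with a one-shot flag, and a real ww mutex backoff would also sleep on the contended lock before retrying.

```c
#include <errno.h>

/* Hypothetical object; contend_once makes the next lock attempt fail. */
struct mock_obj {
	int locked;
	int pinned;
	int contend_once;
};

/* Lock and pin every object; on -EDEADLK release everything and restart. */
static int mock_lock_and_pin_all(struct mock_obj *objs, int n, int *restarts)
{
	int i, err;

retry:
	err = 0;
	for (i = 0; i < n; i++) {
		if (objs[i].contend_once) {
			objs[i].contend_once = 0;
			err = -EDEADLK;
			break;
		}
		objs[i].locked = 1;
		objs[i].pinned = 1;
	}

	if (err == -EDEADLK) {
		/* Unpin first, then drop the lock, for each object taken so far. */
		while (i-- > 0) {
			objs[i].pinned = 0;
			objs[i].locked = 0;
		}
		(*restarts)++;
		goto retry;
	}
	return err;
}
```

Nothing acquired in the failed pass survives into the retry, so the second pass starts from a clean state.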
[Intel-gfx] [PATCH 01/24] perf/core: Only copy-to-user after completely unlocking all locks, v3.
We inadvertently create a dependency on mmap_sem with a whole chain. This breaks any user who wants to take a lock and call rcu_barrier(), while also taking that lock inside mmap_sem: <4> [604.892532] == <4> [604.892534] WARNING: possible circular locking dependency detected <4> [604.892536] 5.6.0-rc7-CI-Patchwork_17096+ #1 Tainted: G U <4> [604.892537] -- <4> [604.892538] kms_frontbuffer/2595 is trying to acquire lock: <4> [604.892540] 8264a558 (rcu_state.barrier_mutex){+.+.}, at: rcu_barrier+0x23/0x190 <4> [604.892547] but task is already holding lock: <4> [604.892547] 888484716050 (reservation_ww_class_mutex){+.+.}, at: i915_gem_object_pin_to_display_plane+0x89/0x270 [i915] <4> [604.892592] which lock already depends on the new lock. <4> [604.892593] the existing dependency chain (in reverse order) is: <4> [604.892594] -> #6 (reservation_ww_class_mutex){+.+.}: <4> [604.892597]__ww_mutex_lock.constprop.15+0xc3/0x1090 <4> [604.892598]ww_mutex_lock+0x39/0x70 <4> [604.892600]dma_resv_lockdep+0x10e/0x1f5 <4> [604.892602]do_one_initcall+0x58/0x300 <4> [604.892604]kernel_init_freeable+0x17b/0x1dc <4> [604.892605]kernel_init+0x5/0x100 <4> [604.892606]ret_from_fork+0x24/0x50 <4> [604.892607] -> #5 (reservation_ww_class_acquire){+.+.}: <4> [604.892609]dma_resv_lockdep+0xec/0x1f5 <4> [604.892610]do_one_initcall+0x58/0x300 <4> [604.892610]kernel_init_freeable+0x17b/0x1dc <4> [604.892611]kernel_init+0x5/0x100 <4> [604.892612]ret_from_fork+0x24/0x50 <4> [604.892613] -> #4 (&mm->mmap_sem#2){}: <4> [604.892615]__might_fault+0x63/0x90 <4> [604.892617]_copy_to_user+0x1e/0x80 <4> [604.892619]perf_read+0x200/0x2b0 <4> [604.892621]vfs_read+0x96/0x160 <4> [604.892622]ksys_read+0x9f/0xe0 <4> [604.892623]do_syscall_64+0x4f/0x220 <4> [604.892624]entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [604.892625] -> #3 (&cpuctx_mutex){+.+.}: <4> [604.892626]__mutex_lock+0x9a/0x9c0 <4> [604.892627]perf_event_init_cpu+0xa4/0x140 <4> [604.892629]perf_event_init+0x19d/0x1cd <4> 
[604.892630]start_kernel+0x362/0x4e4 <4> [604.892631]secondary_startup_64+0xa4/0xb0 <4> [604.892631] -> #2 (pmus_lock){+.+.}: <4> [604.892633]__mutex_lock+0x9a/0x9c0 <4> [604.892633]perf_event_init_cpu+0x6b/0x140 <4> [604.892635]cpuhp_invoke_callback+0x9b/0x9d0 <4> [604.892636]_cpu_up+0xa2/0x140 <4> [604.892637]do_cpu_up+0x61/0xa0 <4> [604.892639]smp_init+0x57/0x96 <4> [604.892639]kernel_init_freeable+0x87/0x1dc <4> [604.892640]kernel_init+0x5/0x100 <4> [604.892642]ret_from_fork+0x24/0x50 <4> [604.892642] -> #1 (cpu_hotplug_lock.rw_sem){}: <4> [604.892643]cpus_read_lock+0x34/0xd0 <4> [604.892644]rcu_barrier+0xaa/0x190 <4> [604.892645]kernel_init+0x21/0x100 <4> [604.892647]ret_from_fork+0x24/0x50 <4> [604.892647] -> #0 (rcu_state.barrier_mutex){+.+.}: <4> [604.892649]__lock_acquire+0x1328/0x15d0 <4> [604.892650]lock_acquire+0xa7/0x1c0 <4> [604.892651]__mutex_lock+0x9a/0x9c0 <4> [604.892652]rcu_barrier+0x23/0x190 <4> [604.892680]i915_gem_object_unbind+0x29d/0x3f0 [i915] <4> [604.892707]i915_gem_object_pin_to_display_plane+0x141/0x270 [i915] <4> [604.892737]intel_pin_and_fence_fb_obj+0xec/0x1f0 [i915] <4> [604.892767]intel_plane_pin_fb+0x3f/0xd0 [i915] <4> [604.892797]intel_prepare_plane_fb+0x13b/0x5c0 [i915] <4> [604.892798]drm_atomic_helper_prepare_planes+0x85/0x110 <4> [604.892827]intel_atomic_commit+0xda/0x390 [i915] <4> [604.892828]drm_atomic_helper_set_config+0x57/0xa0 <4> [604.892830]drm_mode_setcrtc+0x1c4/0x720 <4> [604.892830]drm_ioctl_kernel+0xb0/0xf0 <4> [604.892831]drm_ioctl+0x2e1/0x390 <4> [604.892833]ksys_ioctl+0x7b/0x90 <4> [604.892835]__x64_sys_ioctl+0x11/0x20 <4> [604.892835]do_syscall_64+0x4f/0x220 <4> [604.892836]entry_SYSCALL_64_after_hwframe+0x49/0xbe <4> [604.892837] Changes since v1: - Use (*values)[n++] in perf_read_one(). Changes since v2: - Centrally allocate values. 
Signed-off-by: Maarten Lankhorst fixup perf patch Signed-off-by: Maarten Lankhorst --- kernel/events/core.c | 45 +--- 1 file changed, 21 insertions(+), 24 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index c8f65daee1f9..b33b99fceecb 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -5102,20 +5102,16 @@ static int __perf_read_group_add(struct perf_event *leader, } static int perf_read_group(struct perf_event *event, - u64 read_format, char __user *buf) + u64 r
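The approach named in the subject line (gather the values while locked, copy them out only after unlocking) can be shown with a userspace analogue. The names below are hypothetical: `mock_copy_to_user` merely imitates copy_to_user() with memcpy(), and the lock is a flag so the ordering can be checked.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_MAX_VALUES 4

/* Hypothetical event state protected by a mock lock. */
struct mock_event {
	int lock_held;
	uint64_t counters[MOCK_MAX_VALUES];
	size_t nr;
};

/* Stand-in for copy_to_user(): refuses to run while the lock is held. */
static int mock_copy_to_user(const struct mock_event *ev, void *dst,
			     const void *src, size_t len)
{
	if (ev->lock_held)
		return -EDEADLK;
	memcpy(dst, src, len);
	return 0;
}

/* Returns the number of bytes copied, or a negative errno. */
static int mock_perf_read(struct mock_event *ev, uint64_t *user_buf,
			  size_t user_len)
{
	uint64_t values[MOCK_MAX_VALUES];
	size_t n = 0, i;

	/* Collect into a local buffer while the lock is held. */
	ev->lock_held = 1;
	for (i = 0; i < ev->nr && i < MOCK_MAX_VALUES; i++)
		values[n++] = ev->counters[i];
	ev->lock_held = 0;

	/* Only now touch user memory, which may fault and take other locks. */
	if (user_len < n * sizeof(values[0]))
		return -ENOSPC;
	if (mock_copy_to_user(ev, user_buf, values, n * sizeof(values[0])))
		return -EFAULT;
	return (int)(n * sizeof(values[0]));
}
```

Keeping the copy outside the locked region is what removes the lock's dependency on mmap_sem in the lockdep chain quoted above.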
[Intel-gfx] [PATCH 17/24] drm/i915: Convert i915_perf to ww locking as well
We have the ordering of timeline->mutex vs resv_lock wrong; convert the i915_pin_vma and intel_context_pin as well to future-proof this. We may need future changes to make this more transaction-like and get down to a single i915_gem_ww_ctx, but for now this should work.

Signed-off-by: Maarten Lankhorst
---
 drivers/gpu/drm/i915/i915_perf.c | 57 +++-
 1 file changed, 42 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 5cde3e4e7be6..ca154466cac5 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1195,24 +1195,39 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream)
 	struct i915_gem_engines_iter it;
 	struct i915_gem_context *ctx = stream->ctx;
 	struct intel_context *ce;
-	int err;
+	struct i915_gem_ww_ctx ww;
+	int err = -ENODEV;
 
 	for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
 		if (ce->engine != stream->engine) /* first match! */
 			continue;
 
-		/*
-		 * As the ID is the gtt offset of the context's vma we
-		 * pin the vma to ensure the ID remains fixed.
-		 */
-		err = intel_context_pin(ce);
-		if (err == 0) {
-			stream->pinned_ctx = ce;
-			break;
-		}
+		err = 0;
+		break;
 	}
 	i915_gem_context_unlock_engines(ctx);
 
+	if (err)
+		return ERR_PTR(err);
+
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	/*
+	 * As the ID is the gtt offset of the context's vma we
+	 * pin the vma to ensure the ID remains fixed.
+	 */
+	err = intel_context_pin_ww(ce, &ww);
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+	i915_gem_ww_ctx_fini(&ww);
+
+	if (err)
+		return ERR_PTR(err);
+
+	stream->pinned_ctx = ce;
 	return stream->pinned_ctx;
 }
 
@@ -1927,15 +1942,22 @@ emit_oa_config(struct i915_perf_stream *stream,
 {
 	struct i915_request *rq;
 	struct i915_vma *vma;
+	struct i915_gem_ww_ctx ww;
 	int err;
 
 	vma = get_oa_vma(stream, oa_config);
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	i915_gem_ww_ctx_init(&ww, true);
+retry:
+	err = i915_gem_object_lock(vma->obj, &ww);
+	if (err)
+		goto err;
+
+	err = i915_vma_pin_ww(vma, &ww, 0, 0, PIN_GLOBAL | PIN_HIGH);
 	if (err)
-		goto err_vma_put;
+		goto err;
 
 	intel_engine_pm_get(ce->engine);
 	rq = i915_request_create(ce);
@@ -1957,11 +1979,9 @@ emit_oa_config(struct i915_perf_stream *stream,
 		goto err_add_request;
 	}
 
-	i915_vma_lock(vma);
 	err = i915_request_await_object(rq, vma->obj, 0);
 	if (!err)
 		err = i915_vma_move_to_active(vma, rq, 0);
-	i915_vma_unlock(vma);
 	if (err)
 		goto err_add_request;
 
@@ -1975,7 +1995,14 @@ emit_oa_config(struct i915_perf_stream *stream,
 	i915_request_add(rq);
 err_vma_unpin:
 	i915_vma_unpin(vma);
-err_vma_put:
+err:
+	if (err == -EDEADLK) {
+		err = i915_gem_ww_ctx_backoff(&ww);
+		if (!err)
+			goto retry;
+	}
+
+	i915_gem_ww_ctx_fini(&ww);
 	i915_vma_put(vma);
 	return err;
 }
-- 
2.26.1
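The init / retry / backoff / fini sequence used in this patch recurs throughout the series, so a standalone model may help. The sketch below is a userspace approximation with hypothetical `mock_*` helpers; it reproduces only the control flow, not the behaviour of i915_gem_ww_ctx or of ww mutexes.

```c
#include <errno.h>

/* Hypothetical stand-in for struct i915_gem_ww_ctx. */
struct mock_ww_ctx {
	int contended_left; /* lock attempts that will still report -EDEADLK */
	int backoffs;       /* number of backoffs performed */
	int locked;         /* 1 while the mock lock is held */
};

static void mock_ww_init(struct mock_ww_ctx *ww, int contended)
{
	ww->contended_left = contended;
	ww->backoffs = 0;
	ww->locked = 0;
}

/* Fails with -EDEADLK while simulated contention remains. */
static int mock_ww_lock(struct mock_ww_ctx *ww)
{
	if (ww->contended_left > 0) {
		ww->contended_left--;
		return -EDEADLK;
	}
	ww->locked = 1;
	return 0;
}

/* Drops whatever is held so the whole sequence can be retried. */
static int mock_ww_backoff(struct mock_ww_ctx *ww)
{
	ww->locked = 0;
	ww->backoffs++;
	return 0;
}

static void mock_ww_fini(struct mock_ww_ctx *ww)
{
	ww->locked = 0;
}

/* Same shape as the retry loops in the diff: lock, work, backoff, fini. */
static int mock_pin_with_retry(struct mock_ww_ctx *ww, int contended,
			       int *pinned)
{
	int err;

	mock_ww_init(ww, contended);
retry:
	err = mock_ww_lock(ww);
	if (!err)
		*pinned = 1; /* the real work happens with the lock held */

	if (err == -EDEADLK) {
		err = mock_ww_backoff(ww);
		if (!err)
			goto retry;
	}
	mock_ww_fini(ww);
	return err;
}
```

Any error other than -EDEADLK falls straight through to fini and is returned to the caller, which matches how the hunks above treat non-deadlock failures.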
[Intel-gfx] [PATCH 19/24] drm/i915/selftests: Fix locking inversion in lrc selftest.
This function does not use intel_context_create_request, so it has to use the same locking order as normal code. This is required to shut up lockdep in selftests. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/selftest_lrc.c | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 6f5e35afe1b2..cb4471dfa6b3 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -4311,6 +4311,7 @@ static int __live_lrc_state(struct intel_engine_cs *engine, { struct intel_context *ce; struct i915_request *rq; + struct i915_gem_ww_ctx ww; enum { RING_START_IDX = 0, RING_TAIL_IDX, @@ -4325,7 +4326,11 @@ static int __live_lrc_state(struct intel_engine_cs *engine, if (IS_ERR(ce)) return PTR_ERR(ce); - err = intel_context_pin(ce); + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_gem_object_lock(scratch->obj, &ww); + if (!err) + err = intel_context_pin_ww(ce, &ww); if (err) goto err_put; @@ -4354,11 +4359,9 @@ static int __live_lrc_state(struct intel_engine_cs *engine, *cs++ = i915_ggtt_offset(scratch) + RING_TAIL_IDX * sizeof(u32); *cs++ = 0; - i915_vma_lock(scratch); err = i915_request_await_object(rq, scratch->obj, true); if (!err) err = i915_vma_move_to_active(scratch, rq, EXEC_OBJECT_WRITE); - i915_vma_unlock(scratch); i915_request_get(rq); i915_request_add(rq); @@ -4395,6 +4398,12 @@ static int __live_lrc_state(struct intel_engine_cs *engine, err_unpin: intel_context_unpin(ce); err_put: + if (err == -EDEADLK) { + err = i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); intel_context_put(ce); return err; } -- 2.26.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 12/24] drm/i915: Pin engine before pinning all objects, v3.
We want to lock all gem objects, including the engine context objects, so rework the throttling to make this possible. We now throttle only once, but may call eb_pin_engine while acquiring objects, which means we have to drop the lock to wait. If we don't have to throttle we can still take the fastpath; otherwise we take the slowpath and wait for the throttle request while unlocked. The engine has to be pinned as the first step, otherwise gpu relocations won't work. Changes since v1: - Only need to get a throttled request in the fastpath, no need for a global flag any more. - Always free the waited request correctly. Changes since v2: - Use intel_engine_pm_get()/put() to keep the engine pool alive during EDEADLK handling. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 183 -- 1 file changed, 127 insertions(+), 56 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index d432451608c9..6abe37b7933d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -55,7 +55,8 @@ enum { #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE) #define __EXEC_HAS_RELOC BIT(31) -#define __EXEC_INTERNAL_FLAGS (~0u << 31) +#define __EXEC_ENGINE_PINNED BIT(30) +#define __EXEC_INTERNAL_FLAGS (~0u << 30) #define UPDATE PIN_OFFSET_FIXED #define BATCH_OFFSET_BIAS (256*1024) @@ -292,6 +293,9 @@ struct i915_execbuffer { }; static int eb_parse(struct i915_execbuffer *eb); +static struct i915_request *eb_pin_engine(struct i915_execbuffer *eb, + bool throttle); +static void eb_unpin_engine(struct i915_execbuffer *eb); static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) { @@ -918,7 +922,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) } } -static void eb_release_vmas(const struct i915_execbuffer *eb, bool final) +static void eb_release_vmas(struct i915_execbuffer
*eb, bool final) { const unsigned int count = eb->buffer_count; unsigned int i; @@ -935,6 +939,8 @@ static void eb_release_vmas(const struct i915_execbuffer *eb, bool final) if (final) i915_vma_put(vma); } + + eb_unpin_engine(eb); } static void eb_destroy(const struct i915_execbuffer *eb) @@ -1758,7 +1764,8 @@ static int eb_prefault_relocations(const struct i915_execbuffer *eb) return 0; } -static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb) +static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb, + struct i915_request *rq) { bool have_copy = false; struct eb_vma *ev; @@ -1774,6 +1781,21 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb) eb_release_vmas(eb, false); i915_gem_ww_ctx_fini(&eb->ww); + if (rq) { + /* nonblocking is always false */ + if (i915_request_wait(rq, I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT) < 0) { + i915_request_put(rq); + rq = NULL; + + err = -EINTR; + goto err_relock; + } + + i915_request_put(rq); + rq = NULL; + } + /* * We take 3 passes through the slowpatch. 
* @@ -1797,14 +1819,25 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb) err = 0; } - flush_workqueue(eb->i915->mm.userptr_wq); + if (!err) + flush_workqueue(eb->i915->mm.userptr_wq); +err_relock: i915_gem_ww_ctx_init(&eb->ww, true); if (err) goto out; /* reacquire the objects */ repeat_validate: + rq = eb_pin_engine(eb, false); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto err; + } + + /* We didn't throttle, should be NULL */ + GEM_WARN_ON(rq); + err = eb_validate_vmas(eb); if (err) goto err; @@ -1868,14 +1901,47 @@ static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb) } } + if (rq) + i915_request_put(rq); + return err; } static int eb_relocate_parse(struct i915_execbuffer *eb) { int err; + struct i915_request *rq = NULL; + bool throttle = true; retry: + rq = eb_pin_engine(eb, throttle); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + rq = NULL; + if (err != -EDEADLK) + return err; + + goto err; + } + + if (rq) { + bool nonblock = eb->file->filp->f_flags & O_NONBLOCK; + +
[Intel-gfx] [PATCH 16/24] drm/i915: Kill last user of intel_context_create_request outside of selftests
Instead of using intel_context_create_request(), use intel_context_pin() and i915_request_create() directly. Now all those calls are gone outside of selftests. :) Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 43 ++--- 1 file changed, 29 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index adddc5c93b48..51a0e114c367 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1746,6 +1746,7 @@ static int engine_wa_list_verify(struct intel_context *ce, const struct i915_wa *wa; struct i915_request *rq; struct i915_vma *vma; + struct i915_gem_ww_ctx ww; unsigned int i; u32 *results; int err; @@ -1758,29 +1759,34 @@ static int engine_wa_list_verify(struct intel_context *ce, return PTR_ERR(vma); intel_engine_pm_get(ce->engine); - rq = intel_context_create_request(ce); - intel_engine_pm_put(ce->engine); + i915_gem_ww_ctx_init(&ww, false); +retry: + err = i915_gem_object_lock(vma->obj, &ww); + if (err == 0) + err = intel_context_pin_ww(ce, &ww); + if (err) + goto err_pm; + + rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - goto err_vma; + goto err_unpin; } - i915_vma_lock(vma); err = i915_request_await_object(rq, vma->obj, true); if (err == 0) err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE); - i915_vma_unlock(vma); - if (err) { - i915_request_add(rq); - goto err_vma; - } - - err = wa_list_srm(rq, wal, vma); - if (err) - goto err_vma; + if (err == 0) + err = wa_list_srm(rq, wal, vma); i915_request_get(rq); + if (err) + i915_request_set_error_once(rq, err); i915_request_add(rq); + + if (err) + goto err_rq; + if (i915_request_wait(rq, 0, HZ / 5) < 0) { err = -ETIME; goto err_rq; @@ -1805,7 +1811,16 @@ static int engine_wa_list_verify(struct intel_context *ce, err_rq: i915_request_put(rq); -err_vma: +err_unpin: + intel_context_unpin(ce); +err_pm: + if (err == -EDEADLK) { + err
= i915_gem_ww_ctx_backoff(&ww); + if (!err) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); + intel_engine_pm_put(ce->engine); i915_vma_unpin(vma); i915_vma_put(vma); return err; -- 2.26.1
[Intel-gfx] [PATCH 09/24] drm/i915: Use ww locking in intel_renderstate.
We want to start using ww locking in intel_context_pin, for this we need to lock multiple objects, and the single i915_gem_object_lock is not enough. Convert to using ww-waiting, and make sure we always pin intel_context_state, even if we don't have a renderstate object. Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gt/intel_gt.c | 21 +++--- drivers/gpu/drm/i915/gt/intel_renderstate.c | 79 ++--- drivers/gpu/drm/i915/gt/intel_renderstate.h | 9 ++- 3 files changed, 72 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 342a9558751c..b5f216bb266d 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -408,21 +408,20 @@ static int __engines_record_defaults(struct intel_gt *gt) /* We must be able to switch to something! */ GEM_BUG_ON(!engine->kernel_context); - err = intel_renderstate_init(&so, engine); - if (err) - goto out; - ce = intel_context_create(engine); if (IS_ERR(ce)) { err = PTR_ERR(ce); goto out; } - rq = intel_context_create_request(ce); + err = intel_renderstate_init(&so, ce); + if (err) + goto err; + + rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - intel_context_put(ce); - goto out; + goto err_fini; } err = intel_engine_emit_ctx_wa(rq); @@ -436,9 +435,13 @@ static int __engines_record_defaults(struct intel_gt *gt) err_rq: requests[id] = i915_request_get(rq); i915_request_add(rq); - intel_renderstate_fini(&so); - if (err) +err_fini: + intel_renderstate_fini(&so, ce); +err: + if (err) { + intel_context_put(ce); goto out; + } } /* Flush the default context image to memory, and enable powersaving. 
*/ diff --git a/drivers/gpu/drm/i915/gt/intel_renderstate.c b/drivers/gpu/drm/i915/gt/intel_renderstate.c index ca533d98d14d..5e980642e479 100644 --- a/drivers/gpu/drm/i915/gt/intel_renderstate.c +++ b/drivers/gpu/drm/i915/gt/intel_renderstate.c @@ -27,6 +27,7 @@ #include "i915_drv.h" #include "intel_renderstate.h" +#include "gt/intel_context.h" #include "intel_ring.h" static const struct intel_renderstate_rodata * @@ -74,10 +75,9 @@ static int render_state_setup(struct intel_renderstate *so, u32 *d; int ret; - i915_gem_object_lock(so->vma->obj, NULL); ret = i915_gem_object_prepare_write(so->vma->obj, &needs_clflush); if (ret) - goto out_unlock; + return ret; d = kmap_atomic(i915_gem_object_get_dirty_page(so->vma->obj, 0)); @@ -158,8 +158,6 @@ static int render_state_setup(struct intel_renderstate *so, ret = 0; out: i915_gem_object_finish_access(so->vma->obj); -out_unlock: - i915_gem_object_unlock(so->vma->obj); return ret; err: @@ -171,33 +169,47 @@ static int render_state_setup(struct intel_renderstate *so, #undef OUT_BATCH int intel_renderstate_init(struct intel_renderstate *so, - struct intel_engine_cs *engine) + struct intel_context *ce) { - struct drm_i915_gem_object *obj; + struct intel_engine_cs *engine = ce->engine; + struct drm_i915_gem_object *obj = NULL; int err; memset(so, 0, sizeof(*so)); so->rodata = render_state_get_rodata(engine); - if (!so->rodata) - return 0; + if (so->rodata) { + if (so->rodata->batch_items * 4 > PAGE_SIZE) + return -EINVAL; + + obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE); + if (IS_ERR(obj)) + return PTR_ERR(obj); + + so->vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL); + if (IS_ERR(so->vma)) { + err = PTR_ERR(so->vma); + goto err_obj; + } + } - if (so->rodata->batch_items * 4 > PAGE_SIZE) - return -EINVAL; + i915_gem_ww_ctx_init(&so->ww, true); +retry: + err = intel_context_pin(ce); + if (err) + goto err_fini; - obj = i915_gem_object_create_internal(engine->i915, PAGE_SIZE); - if (IS_ERR(obj)) - 
return PTR_ERR(obj); + /* return early if there's nothing to setup */ + if (!err && !so->rodata) + return 0; - so->vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL); - if (
[Intel-gfx] [PATCH 06/24] drm/i915: Parse command buffer earlier in eb_relocate(slow)
We want to introduce backoff logic, but we need to lock the pool object as well for command parsing. Because of this, we will need backoff logic for the engine pool obj, move the batch validation up slightly to eb_lookup_vmas, and the actual command parsing in a separate function which can get called from execbuf relocation fast and slowpath. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 68 ++- 1 file changed, 37 insertions(+), 31 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 2c7c0f4142aa..59c5d3cc2d39 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -290,6 +290,8 @@ struct i915_execbuffer { struct eb_vma_array *array; }; +static int eb_parse(struct i915_execbuffer *eb); + static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) { return intel_engine_requires_cmd_parser(eb->engine) || @@ -873,6 +875,7 @@ static struct i915_vma *eb_lookup_vma(struct i915_execbuffer *eb, u32 handle) static int eb_lookup_vmas(struct i915_execbuffer *eb) { + struct drm_i915_private *i915 = eb->i915; unsigned int batch = eb_batch_index(eb); unsigned int i; int err = 0; @@ -886,18 +889,37 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb) vma = eb_lookup_vma(eb, eb->exec[i].handle); if (IS_ERR(vma)) { err = PTR_ERR(vma); - break; + goto err; } err = eb_validate_vma(eb, &eb->exec[i], vma); if (unlikely(err)) { i915_vma_put(vma); - break; + goto err; } eb_add_vma(eb, i, batch, vma); } + if (unlikely(eb->batch->flags & EXEC_OBJECT_WRITE)) { + drm_dbg(&i915->drm, + "Attempting to use self-modifying batch buffer\n"); + return -EINVAL; + } + + if (range_overflows_t(u64, + eb->batch_start_offset, eb->batch_len, + eb->batch->vma->size)) { + drm_dbg(&i915->drm, "Attempting to use out-of-bounds batch\n"); + return -EINVAL; + } + + if (eb->batch_len == 0) + eb->batch_len = eb->batch->vma->size - 
eb->batch_start_offset; + + return 0; + +err: eb->vma[i].vma = NULL; return err; } @@ -1737,7 +1759,7 @@ static int eb_prefault_relocations(const struct i915_execbuffer *eb) return 0; } -static noinline int eb_relocate_slow(struct i915_execbuffer *eb) +static noinline int eb_relocate_parse_slow(struct i915_execbuffer *eb) { bool have_copy = false; struct eb_vma *ev; @@ -1788,6 +1810,11 @@ static noinline int eb_relocate_slow(struct i915_execbuffer *eb) } } + /* as last step, parse the command buffer */ + err = eb_parse(eb); + if (err) + goto err; + /* * Leave the user relocations as are, this is the painfully slow path, * and we want to avoid the complication of dropping the lock whilst @@ -1820,7 +1847,7 @@ static noinline int eb_relocate_slow(struct i915_execbuffer *eb) return err; } -static int eb_relocate(struct i915_execbuffer *eb) +static int eb_relocate_parse(struct i915_execbuffer *eb) { int err; @@ -1840,11 +1867,11 @@ static int eb_relocate(struct i915_execbuffer *eb) list_for_each_entry(ev, &eb->relocs, reloc_link) { if (eb_relocate_vma(eb, ev)) - return eb_relocate_slow(eb); + return eb_relocate_parse_slow(eb); } } - return 0; + return eb_parse(eb); } static int eb_move_to_gpu(struct i915_execbuffer *eb) @@ -2775,7 +2802,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (unlikely(err)) goto err_context; - err = eb_relocate(&eb); + err = eb_relocate_parse(&eb); if (err) { /* * If the user expects the execobject.offset and @@ -2788,33 +2815,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, goto err_vma; } - if (unlikely(eb.batch->flags & EXEC_OBJECT_WRITE)) { - drm_dbg(&i915->drm, - "Attempting to use self-modifying batch buffer\n"); - err = -EINVAL; - goto err_vma; - } - - if (range_overflows_t(u64, - eb.batch_start_offset, eb.batch_len, - eb.batch->vma->size)) { - drm_dbg(&i915->drm, "Attempting to use out-of-bounds batch\n"); - err = -EINVAL; -
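The batch bounds validation that this patch moves into eb_lookup_vmas() can be illustrated with a small user-space sketch. The helper name toy_batch_len() is hypothetical, and the exact boundary treatment (whether a start offset equal to the object size is rejected) is an assumption rather than a statement about the kernel's range_overflows_t() macro. What the sketch does show is the overflow-safe form of the comparison and the "length 0 means the rest of the object" default.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not i915 API. Returns the effective batch length,
 * or -EINVAL if [start, start + len) does not fit inside an object of
 * `size` bytes. Sizes are assumed to stay below INT64_MAX in this sketch.
 */
static int64_t toy_batch_len(uint64_t start, uint64_t len, uint64_t size)
{
	/*
	 * Written as a subtraction so that start + len can never wrap:
	 * once start <= size is known, size - start is well defined.
	 */
	if (start > size || len > size - start)
		return -EINVAL;

	/* A length of zero selects everything from start to the end. */
	if (len == 0)
		len = size - start;

	return (int64_t)len;
}
```

The naive test `start + len > size` would accept a huge start offset whose sum wraps around to a small value, which is why the comparison is arranged around `size - start` instead.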
[Intel-gfx] [PATCH 04/24] drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory eviction. We don't use it yet, but lets start adding the definition first. To use it, we have to pass a non-NULL ww to gem_object_lock, and don't unlock directly. It is done in i915_gem_ww_ctx_fini. Changes since v1: - Change ww_ctx and obj order in locking functions (Jonas Lahtinen) Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/display/intel_display.c | 4 +- .../gpu/drm/i915/gem/i915_gem_client_blt.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 4 +- drivers/gpu/drm/i915/gem/i915_gem_domain.c| 10 ++-- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 4 +- drivers/gpu/drm/i915/gem/i915_gem_object.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_object.h| 38 +++--- .../gpu/drm/i915/gem/i915_gem_object_blt.c| 2 +- .../gpu/drm/i915/gem/i915_gem_object_types.h | 9 drivers/gpu/drm/i915/gem/i915_gem_pm.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_tiling.c| 2 +- .../gpu/drm/i915/gem/selftests/huge_pages.c | 2 +- .../i915/gem/selftests/i915_gem_client_blt.c | 2 +- .../i915/gem/selftests/i915_gem_coherency.c | 10 ++-- .../drm/i915/gem/selftests/i915_gem_context.c | 4 +- .../drm/i915/gem/selftests/i915_gem_mman.c| 4 +- .../drm/i915/gem/selftests/i915_gem_phys.c| 2 +- drivers/gpu/drm/i915/gt/intel_gt.c| 2 +- .../gpu/drm/i915/gt/selftest_workarounds.c| 2 +- drivers/gpu/drm/i915/gvt/cmd_parser.c | 2 +- drivers/gpu/drm/i915/i915_gem.c | 52 +-- drivers/gpu/drm/i915/i915_gem.h | 11 drivers/gpu/drm/i915/selftests/i915_gem.c | 41 +++ drivers/gpu/drm/i915/selftests/i915_vma.c | 2 +- .../drm/i915/selftests/intel_memory_region.c | 2 +- 26 files changed, 175 insertions(+), 44 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index adb08a00bb57..1064e34e42cd 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -2309,7 +2309,7 @@ 
intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, void intel_unpin_fb_vma(struct i915_vma *vma, unsigned long flags) { - i915_gem_object_lock(vma->obj); + i915_gem_object_lock(vma->obj, NULL); if (flags & PLANE_HAS_FENCE) i915_vma_unpin_fence(vma); i915_gem_object_unpin_from_display_plane(vma); @@ -16958,7 +16958,7 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb, if (!intel_fb->frontbuffer) return -ENOMEM; - i915_gem_object_lock(obj); + i915_gem_object_lock(obj, NULL); tiling = i915_gem_object_get_tiling(obj); stride = i915_gem_object_get_stride(obj); i915_gem_object_unlock(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c index 3a146aa2593b..2f1d8150256b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_client_blt.c @@ -286,7 +286,7 @@ int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj, dma_fence_init(&work->dma, &clear_pages_work_ops, &fence_lock, 0, 0); i915_sw_fence_init(&work->wait, clear_pages_work_notify); - i915_gem_object_lock(obj); + i915_gem_object_lock(obj, NULL); err = i915_sw_fence_await_reservation(&work->wait, obj->base.resv, NULL, true, I915_FENCE_TIMEOUT, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 11d9135cf21a..022716f05e91 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -113,7 +113,7 @@ static void lut_close(struct i915_gem_context *ctx) continue; rcu_read_unlock(); - i915_gem_object_lock(obj); + i915_gem_object_lock(obj, NULL); list_for_each_entry(lut, &obj->lut_list, obj_link) { if (lut->ctx != ctx) continue; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 7db5a793739d..cfadccfc2990 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -128,7 +128,7 
@@ static int i915_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_dire if (err) return err; - err = i915_gem_object_lock_interruptible(obj); + err = i915_gem_object_lock_interruptible(obj, NULL); if (err) goto out; @@ -149,7 +149,7 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct if (err)
[Intel-gfx] [PATCH 07/24] Revert "drm/i915/gem: Split eb_vma into its own allocation"
This reverts commit 0f1dd02295f35dcdcbaafcbcbbec0753884ab974. This conflicts with the ww mutex handling, which needs to drop the references after gpu submission anyway, because otherwise we may risk unlocking a BO after first freeing it. Signed-off-by: Maarten Lankhorst --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 131 -- 1 file changed, 58 insertions(+), 73 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 59c5d3cc2d39..b05dcd492e25 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -40,11 +40,6 @@ struct eb_vma { u32 handle; }; -struct eb_vma_array { - struct kref kref; - struct eb_vma vma[]; -}; - enum { FORCE_CPU_RELOC = 1, FORCE_GTT_RELOC, @@ -57,6 +52,7 @@ enum { #define __EXEC_OBJECT_NEEDS_MAPBIT(29) #define __EXEC_OBJECT_NEEDS_BIAS BIT(28) #define __EXEC_OBJECT_INTERNAL_FLAGS (~0u << 28) /* all of the above */ +#define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE) #define __EXEC_HAS_RELOC BIT(31) #define __EXEC_INTERNAL_FLAGS (~0u << 31) @@ -287,7 +283,6 @@ struct i915_execbuffer { */ int lut_size; struct hlist_head *buckets; /** ht for relocation handles */ - struct eb_vma_array *array; }; static int eb_parse(struct i915_execbuffer *eb); @@ -299,62 +294,8 @@ static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb) eb->args->batch_len); } -static struct eb_vma_array *eb_vma_array_create(unsigned int count) -{ - struct eb_vma_array *arr; - - arr = kvmalloc(struct_size(arr, vma, count), GFP_KERNEL | __GFP_NOWARN); - if (!arr) - return NULL; - - kref_init(&arr->kref); - arr->vma[0].vma = NULL; - - return arr; -} - -static inline void eb_unreserve_vma(struct eb_vma *ev) -{ - struct i915_vma *vma = ev->vma; - - if (unlikely(ev->flags & __EXEC_OBJECT_HAS_FENCE)) - __i915_vma_unpin_fence(vma); - - if (ev->flags & __EXEC_OBJECT_HAS_PIN) - __i915_vma_unpin(vma); - - ev->flags &= 
~(__EXEC_OBJECT_HAS_PIN | - __EXEC_OBJECT_HAS_FENCE); -} - -static void eb_vma_array_destroy(struct kref *kref) -{ - struct eb_vma_array *arr = container_of(kref, typeof(*arr), kref); - struct eb_vma *ev = arr->vma; - - while (ev->vma) { - eb_unreserve_vma(ev); - i915_vma_put(ev->vma); - ev++; - } - - kvfree(arr); -} - -static void eb_vma_array_put(struct eb_vma_array *arr) -{ - kref_put(&arr->kref, eb_vma_array_destroy); -} - static int eb_create(struct i915_execbuffer *eb) { - /* Allocate an extra slot for use by the command parser + sentinel */ - eb->array = eb_vma_array_create(eb->buffer_count + 2); - if (!eb->array) - return -ENOMEM; - - eb->vma = eb->array->vma; - if (!(eb->args->flags & I915_EXEC_HANDLE_LUT)) { unsigned int size = 1 + ilog2(eb->buffer_count); @@ -388,10 +329,8 @@ static int eb_create(struct i915_execbuffer *eb) break; } while (--size); - if (unlikely(!size)) { - eb_vma_array_put(eb->array); + if (unlikely(!size)) return -ENOMEM; - } eb->lut_size = size; } else { @@ -502,6 +441,26 @@ eb_pin_vma(struct i915_execbuffer *eb, return !eb_vma_misplaced(entry, vma, ev->flags); } +static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags) +{ + GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN)); + + if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE)) + __i915_vma_unpin_fence(vma); + + __i915_vma_unpin(vma); +} + +static inline void +eb_unreserve_vma(struct eb_vma *ev) +{ + if (!(ev->flags & __EXEC_OBJECT_HAS_PIN)) + return; + + __eb_unreserve_vma(ev->vma, ev->flags); + ev->flags &= ~__EXEC_OBJECT_RESERVED; +} + static int eb_validate_vma(struct i915_execbuffer *eb, struct drm_i915_gem_exec_object2 *entry, @@ -944,13 +903,31 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle) } } +static void eb_release_vmas(const struct i915_execbuffer *eb) +{ + const unsigned int count = eb->buffer_count; + unsigned int i; + + for (i = 0; i < count; i++) { + struct eb_vma *ev = &eb->vma[i]; + struct i915_vma *vma = ev->vma; + + if (!vma) + 
break; + + eb->vma[i].vma = NULL; + + if (ev->flags & __EXEC_OBJECT_HAS_PIN) + __eb_unreserve_vma(vma, ev->flags); + + i915_vma_put(vma); + } +} + static v
[Intel-gfx] [PATCH 22/24] drm/i915: Add ww locking to vm_fault_gtt
Signed-off-by: Maarten Lankhorst --- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 51 +++- 1 file changed, 33 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index b39c24dae64e..e35e8d0b6938 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -283,37 +283,46 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) struct intel_runtime_pm *rpm = &i915->runtime_pm; struct i915_ggtt *ggtt = &i915->ggtt; bool write = area->vm_flags & VM_WRITE; + struct i915_gem_ww_ctx ww; intel_wakeref_t wakeref; struct i915_vma *vma; pgoff_t page_offset; int srcu; int ret; - /* Sanity check that we allow writing into this object */ - if (i915_gem_object_is_readonly(obj) && write) - return VM_FAULT_SIGBUS; - /* We don't use vmf->pgoff since that has the fake offset */ page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT; trace_i915_gem_object_fault(obj, page_offset, true, write); - ret = i915_gem_object_pin_pages(obj); + wakeref = intel_runtime_pm_get(rpm); + + i915_gem_ww_ctx_init(&ww, true); +retry: + ret = i915_gem_object_lock(obj, &ww); if (ret) - goto err; + goto err_rpm; - wakeref = intel_runtime_pm_get(rpm); + /* Sanity check that we allow writing into this object */ + if (i915_gem_object_is_readonly(obj) && write) { + ret = -EFAULT; + goto err_rpm; + } - ret = intel_gt_reset_trylock(ggtt->vm.gt, &srcu); + ret = i915_gem_object_pin_pages(obj); if (ret) goto err_rpm; + ret = intel_gt_reset_trylock(ggtt->vm.gt, &srcu); + if (ret) + goto err_pages; + /* Now pin it into the GTT as needed */ - vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, - PIN_MAPPABLE | - PIN_NONBLOCK /* NOWARN */ | - PIN_NOEVICT); - if (IS_ERR(vma)) { + vma = i915_gem_object_ggtt_pin_ww(obj, &ww, NULL, 0, 0, + PIN_MAPPABLE | + PIN_NONBLOCK /* NOWARN */ | + PIN_NOEVICT); + if (IS_ERR(vma) && vma != ERR_PTR(-EDEADLK)) { /* Use a partial view if it is bigger than available space */ 
struct i915_ggtt_view view = compute_partial_view(obj, page_offset, MIN_CHUNK_PAGES); @@ -328,11 +337,11 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) * all hope that the hardware is able to track future writes. */ - vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); - if (IS_ERR(vma)) { + vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags); + if (IS_ERR(vma) && vma != ERR_PTR(-EDEADLK)) { flags = PIN_MAPPABLE; view.type = I915_GGTT_VIEW_PARTIAL; - vma = i915_gem_object_ggtt_pin(obj, &view, 0, 0, flags); + vma = i915_gem_object_ggtt_pin_ww(obj, &ww, &view, 0, 0, flags); } /* The entire mappable GGTT is pinned? Unexpected! */ @@ -389,10 +398,16 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) __i915_vma_unpin(vma); err_reset: intel_gt_reset_unlock(ggtt->vm.gt, srcu); +err_pages: + i915_gem_object_unpin_pages(obj); err_rpm: + if (ret == -EDEADLK) { + ret = i915_gem_ww_ctx_backoff(&ww); + if (!ret) + goto retry; + } + i915_gem_ww_ctx_fini(&ww); intel_runtime_pm_put(rpm, wakeref); - i915_gem_object_unpin_pages(obj); -err: return i915_error_to_vmf_fault(ret); } -- 2.26.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
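The `IS_ERR(vma) && vma != ERR_PTR(-EDEADLK)` tests in the patch above rely on error codes being encoded in the pointer value itself. A rough user-space model of that convention follows; the toy_* helpers are hypothetical stand-ins written for illustration, assuming the usual scheme where the top 4095 addresses are reserved for negative errno values. The sketch shows why -EDEADLK must skip the partial-view fallback: it is a request to back off and retry, not a sign that the mapping did not fit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for ERR_PTR / PTR_ERR / IS_ERR. */
#define TOY_MAX_ERRNO 4095

static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long toy_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

static bool toy_is_err(const void *p)
{
	/* Error pointers occupy the last TOY_MAX_ERRNO addresses. */
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/*
 * Fall back to a smaller (partial) mapping only for real failures.
 * -EDEADLK has to reach the backoff/retry code untouched instead.
 */
static bool toy_should_try_partial(const void *vma)
{
	return toy_is_err(vma) && toy_ptr_err(vma) != -EDEADLK;
}
```

With this model, an -ENOSPC result triggers the fallback, while both a valid pointer and an -EDEADLK result leave it alone; in the second case the error is carried to the retry label.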
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT (rev3)
== Series Details == Series: drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT (rev3) URL : https://patchwork.freedesktop.org/series/76216/ State : success == Summary == CI Bug Log - changes from CI_DRM_8342 -> Patchwork_17400 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17400/index.html Known issues Here are the changes found in Patchwork_17400 that come from known issues: ### IGT changes ### Issues hit * igt@i915_pm_rpm@module-reload: - fi-kbl-guc: [PASS][1] -> [SKIP][2] ([fdo#109271]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-guc/igt@i915_pm_...@module-reload.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17400/fi-kbl-guc/igt@i915_pm_...@module-reload.html Possible fixes * igt@i915_selftest@live@gt_pm: - fi-glk-dsi: [DMESG-FAIL][3] ([i915#1751]) -> [PASS][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17400/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html Warnings * igt@i915_selftest@live@gt_pm: - fi-tgl-y: [DMESG-FAIL][5] ([i915#1744]) -> [DMESG-FAIL][6] ([i915#1759]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-tgl-y/igt@i915_selftest@live@gt_pm.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17400/fi-tgl-y/igt@i915_selftest@live@gt_pm.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). 
[fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1744]: https://gitlab.freedesktop.org/drm/intel/issues/1744 [i915#1751]: https://gitlab.freedesktop.org/drm/intel/issues/1751 [i915#1759]: https://gitlab.freedesktop.org/drm/intel/issues/1759 Participating hosts (48 -> 42) -- Missing(6): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8342 -> Patchwork_17400 CI-20190529: 20190529 CI_DRM_8342: 17407a9f61a0ee402254522e391a626acc4375ec @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17400: 4ca7f34aac20b70957dc30b6f20d2ca232836b43 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 4ca7f34aac20 drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUT == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17400/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH xf86-video-intel v5] Sync i915_pciids up to 8717c6b7414f
Import the kernel's i915_pciids.h, up to: commit 8717c6b7414ffb890672276dccc284c23078ac0e Author: Lee Shawn C Date: Tue Dec 10 23:04:15 2019 +0800 drm/i915/cml: Separate U series pci id from origianl list. Signed-off-by: Liwei Song --- V4 -> V5: adjust gen number V3 -> V4: Add missed PINEVIEW V2 -> V3: Add 4 new info blocks and add sound support for them. Change since V1: replace old definition in intel_module.c and dri3-test.c --- src/i915_pciids.h | 265 -- src/intel_module.c| 93 ++- src/sna/gen9_render.c | 48 test/dri3-test.c | 3 +- 4 files changed, 346 insertions(+), 63 deletions(-) diff --git a/src/i915_pciids.h b/src/i915_pciids.h index fd965ffbb92e..1d2c12219f44 100644 --- a/src/i915_pciids.h +++ b/src/i915_pciids.h @@ -108,8 +108,10 @@ INTEL_VGA_DEVICE(0x2e42, info), /* B43_G */ \ INTEL_VGA_DEVICE(0x2e92, info) /* B43_G.1 */ -#define INTEL_PINEVIEW_IDS(info) \ - INTEL_VGA_DEVICE(0xa001, info), \ +#define INTEL_PINEVIEW_G_IDS(info) \ + INTEL_VGA_DEVICE(0xa001, info) + +#define INTEL_PINEVIEW_M_IDS(info) \ INTEL_VGA_DEVICE(0xa011, info) #define INTEL_IRONLAKE_D_IDS(info) \ @@ -166,7 +168,18 @@ #define INTEL_IVB_Q_IDS(info) \ INTEL_QUANTA_VGA_DEVICE(info) /* Quanta transcode */ +#define INTEL_HSW_ULT_GT1_IDS(info) \ + INTEL_VGA_DEVICE(0x0A02, info), /* ULT GT1 desktop */ \ + INTEL_VGA_DEVICE(0x0A0A, info), /* ULT GT1 server */ \ + INTEL_VGA_DEVICE(0x0A0B, info), /* ULT GT1 reserved */ \ + INTEL_VGA_DEVICE(0x0A06, info) /* ULT GT1 mobile */ + +#define INTEL_HSW_ULX_GT1_IDS(info) \ + INTEL_VGA_DEVICE(0x0A0E, info) /* ULX GT1 mobile */ + #define INTEL_HSW_GT1_IDS(info) \ + INTEL_HSW_ULT_GT1_IDS(info), \ + INTEL_HSW_ULX_GT1_IDS(info), \ INTEL_VGA_DEVICE(0x0402, info), /* GT1 desktop */ \ INTEL_VGA_DEVICE(0x040a, info), /* GT1 server */ \ INTEL_VGA_DEVICE(0x040B, info), /* GT1 reserved */ \ @@ -175,20 +188,26 @@ INTEL_VGA_DEVICE(0x0C0A, info), /* SDV GT1 server */ \ INTEL_VGA_DEVICE(0x0C0B, info), /* SDV GT1 reserved */ \ INTEL_VGA_DEVICE(0x0C0E, info), /* SDV GT1 
reserved */ \ - INTEL_VGA_DEVICE(0x0A02, info), /* ULT GT1 desktop */ \ - INTEL_VGA_DEVICE(0x0A0A, info), /* ULT GT1 server */ \ - INTEL_VGA_DEVICE(0x0A0B, info), /* ULT GT1 reserved */ \ INTEL_VGA_DEVICE(0x0D02, info), /* CRW GT1 desktop */ \ INTEL_VGA_DEVICE(0x0D0A, info), /* CRW GT1 server */ \ INTEL_VGA_DEVICE(0x0D0B, info), /* CRW GT1 reserved */ \ INTEL_VGA_DEVICE(0x0D0E, info), /* CRW GT1 reserved */ \ INTEL_VGA_DEVICE(0x0406, info), /* GT1 mobile */ \ INTEL_VGA_DEVICE(0x0C06, info), /* SDV GT1 mobile */ \ - INTEL_VGA_DEVICE(0x0A06, info), /* ULT GT1 mobile */ \ - INTEL_VGA_DEVICE(0x0A0E, info), /* ULX GT1 mobile */ \ INTEL_VGA_DEVICE(0x0D06, info) /* CRW GT1 mobile */ +#define INTEL_HSW_ULT_GT2_IDS(info) \ + INTEL_VGA_DEVICE(0x0A12, info), /* ULT GT2 desktop */ \ + INTEL_VGA_DEVICE(0x0A1A, info), /* ULT GT2 server */ \ + INTEL_VGA_DEVICE(0x0A1B, info), /* ULT GT2 reserved */ \ + INTEL_VGA_DEVICE(0x0A16, info) /* ULT GT2 mobile */ + +#define INTEL_HSW_ULX_GT2_IDS(info) \ + INTEL_VGA_DEVICE(0x0A1E, info) /* ULX GT2 mobile */ \ + #define INTEL_HSW_GT2_IDS(info) \ + INTEL_HSW_ULT_GT2_IDS(info), \ + INTEL_HSW_ULX_GT2_IDS(info), \ INTEL_VGA_DEVICE(0x0412, info), /* GT2 desktop */ \ INTEL_VGA_DEVICE(0x041a, info), /* GT2 server */ \ INTEL_VGA_DEVICE(0x041B, info), /* GT2 reserved */ \ @@ -197,9 +216,6 @@ INTEL_VGA_DEVICE(0x0C1A, info), /* SDV GT2 server */ \ INTEL_VGA_DEVICE(0x0C1B, info), /* SDV GT2 reserved */ \ INTEL_VGA_DEVICE(0x0C1E, info), /* SDV GT2 reserved */ \ - INTEL_VGA_DEVICE(0x0A12, info), /* ULT GT2 desktop */ \ - INTEL_VGA_DEVICE(0x0A1A, info), /* ULT GT2 server */ \ - INTEL_VGA_DEVICE(0x0A1B, info), /* ULT GT2 reserved */ \ INTEL_VGA_DEVICE(0x0D12, info), /* CRW GT2 desktop */ \ INTEL_VGA_DEVICE(0x0D1A, info), /* CRW GT2 server */ \ INTEL_VGA_DEVICE(0x0D1B, info), /* CRW GT2 reserved */ \ @@ -207,11 +223,17 @@ INTEL_VGA_DEVICE(0x0416, info), /* GT2 mobile */ \ INTEL_VGA_DEVICE(0x0426, info), /* GT2 mobile */ \ INTEL_VGA_DEVICE(0x0C16, info), /* 
SDV GT2 mobile */ \ - INTEL_VGA_DEVICE(0x0A16, info), /* ULT GT2 mobile */ \ - INTEL_VGA_DEVICE(0x0A1E, info), /* ULX GT2 mobile */ \ INTEL_VGA_DEVICE(0x0D16, info) /* CRW GT2 mobile */ +#define INTEL_HSW_ULT_GT3_IDS(info) \ + INTEL_VGA_DEVICE(0x0A22, info), /* ULT GT3 desktop */ \ + INTEL_VGA_DEVICE(0x0A2A, info), /* ULT GT3 server */ \ + INTEL_VGA_DEVICE(0x0A2B, info), /* ULT GT3 reserved */ \ + INTEL_VGA_DEVICE(
Re: [Intel-gfx] [PATCH 24/59] drm/tidss: Don't use drm_device->dev_private
On 15/04/2020 10:39, Daniel Vetter wrote: Upcasting using a container_of macro is more typesafe, faster and easier for the compiler to optimize. Tested-by: Jyri Sarha Acked-by: Sam Ravnborg Signed-off-by: Daniel Vetter Cc: Jyri Sarha Cc: Tomi Valkeinen --- drivers/gpu/drm/tidss/tidss_crtc.c | 16 drivers/gpu/drm/tidss/tidss_drv.c | 2 -- drivers/gpu/drm/tidss/tidss_drv.h | 2 ++ drivers/gpu/drm/tidss/tidss_irq.c | 12 ++-- drivers/gpu/drm/tidss/tidss_kms.c | 2 +- drivers/gpu/drm/tidss/tidss_plane.c | 6 +++--- 6 files changed, 20 insertions(+), 20 deletions(-) Reviewed-by: Tomi Valkeinen Tomi -- Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
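The upcast described in the commit message above can be sketched in userspace C. This is a minimal illustration, not the tidss code: the struct layouts are cut-down stand-ins, the irq_mask field is simplified to an int, and the container_of() shown here is a simplified version of the kernel macro (the real one adds compile-time type checks). Only the to_tidss name and the embedded ddev member come from the driver.

```c
#include <stddef.h>

/*
 * Simplified userspace stand-in for the kernel's container_of().
 * Assumption: no type checking, unlike the real kernel macro.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-in for struct drm_device (hypothetical layout). */
struct drm_device {
	void *dev_private;	/* legacy untyped back-pointer */
};

/* Cut-down stand-in for struct tidss_device (hypothetical layout). */
struct tidss_device {
	int irq_mask;		/* simplified type for illustration */
	struct drm_device ddev;	/* embedded in the driver struct */
};

/*
 * Upcast from the embedded drm_device to the driver structure.
 * This is pointer arithmetic at a compile-time constant offset,
 * so no memory load through dev_private is needed.
 */
static inline struct tidss_device *to_tidss(struct drm_device *dev)
{
	return container_of(dev, struct tidss_device, ddev);
}
```

Calling to_tidss(&tidss.ddev) on an instance returns the address of that same instance. A void * such as dev_private would accept any pointer type silently, whereas here the member name and struct type are spelled out at the call site.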
Re: [Intel-gfx] [PATCH 23/59] drm/tidss: Use devm_drm_dev_alloc
On 15/04/2020 10:39, Daniel Vetter wrote: Already using devm_drm_dev_init, so very simple replacement. Tested-by: Jyri Sarha Acked-by: Sam Ravnborg Signed-off-by: Daniel Vetter Cc: Jyri Sarha Cc: Tomi Valkeinen --- drivers/gpu/drm/tidss/tidss_drv.c | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) Reviewed-by: Tomi Valkeinen Tomi -- Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Re: [Intel-gfx] [PATCH xf86-video-intel v5] Sync i915_pciids upto 8717c6b7414f
Quoting Liwei Song (2020-04-21 11:19:55) > Import the kernel's i915_pciids.h, up to: > > commit 8717c6b7414ffb890672276dccc284c23078ac0e > Author: Lee Shawn C > Date: Tue Dec 10 23:04:15 2019 +0800 > > drm/i915/cml: Separate U series pci id from origianl list. > > Signed-off-by: Liwei Song Fixed the remaining addition of gen10+ to gen9_render.c (we can't reuse the same assembler, so we need to rework lots; a decade on and extracting those paths from mesa looks even less likely.) and pushed. -Chris
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/gt: Apply a small scalefactor for the gpu:ring ratio
== Series Details == Series: drm/i915/gt: Apply a small scalefactor for the gpu:ring ratio URL : https://patchwork.freedesktop.org/series/76254/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8342 -> Patchwork_17401 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_17401 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_17401, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_17401: ### IGT changes ### Possible regressions * igt@i915_selftest@live@gt_timelines: - fi-snb-2520m: [PASS][1] -> [FAIL][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-snb-2520m/igt@i915_selftest@live@gt_timelines.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-snb-2520m/igt@i915_selftest@live@gt_timelines.html Known issues Here are the changes found in Patchwork_17401 that come from known issues: ### IGT changes ### Issues hit * igt@i915_selftest@live@perf: - fi-bwr-2160:[PASS][3] -> [INCOMPLETE][4] ([i915#489]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-bwr-2160/igt@i915_selftest@l...@perf.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-bwr-2160/igt@i915_selftest@l...@perf.html Possible fixes * igt@i915_selftest@live@gt_pm: - fi-icl-u2: [DMESG-FAIL][5] -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-icl-u2/igt@i915_selftest@live@gt_pm.html - fi-glk-dsi: [DMESG-FAIL][7] ([i915#1751]) -> [PASS][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html Warnings * igt@i915_selftest@live@gt_pm: - fi-tgl-y: [DMESG-FAIL][9] ([i915#1744]) -> [DMESG-FAIL][10] ([i915#1759]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-tgl-y/igt@i915_selftest@live@gt_pm.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-tgl-y/igt@i915_selftest@live@gt_pm.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [i915#1744]: https://gitlab.freedesktop.org/drm/intel/issues/1744 [i915#1751]: https://gitlab.freedesktop.org/drm/intel/issues/1751 [i915#1759]: https://gitlab.freedesktop.org/drm/intel/issues/1759 [i915#489]: https://gitlab.freedesktop.org/drm/intel/issues/489 Participating hosts (48 -> 43) -- Additional (1): fi-kbl-7560u Missing(6): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8342 -> Patchwork_17401 CI-20190529: 20190529 CI_DRM_8342: 17407a9f61a0ee402254522e391a626acc4375ec @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17401: 3d9b0e649f1e28e6858a9dda048773dea350 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 3d9b0e649f1e drm/i915/gt: Apply a small scalefactor for the gpu:ring ratio == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/index.html
Re: ✗ Fi.CI.BAT: failure for drm/i915/gt: Apply a small scalefactor for the gpu:ring ratio
Quoting Patchwork (2020-04-21 12:37:30) > ### IGT changes ### > > Possible regressions > > * igt@i915_selftest@live@gt_timelines: > - fi-snb-2520m: [PASS][1] -> [FAIL][2] >[1]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-snb-2520m/igt@i915_selftest@live@gt_timelines.html >[2]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-snb-2520m/igt@i915_selftest@live@gt_timelines.html > > Possible fixes > > * igt@i915_selftest@live@gt_pm: > - fi-icl-u2: [DMESG-FAIL][5] -> [PASS][6] >[5]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-icl-u2/igt@i915_selftest@live@gt_pm.html >[6]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-icl-u2/igt@i915_selftest@live@gt_pm.html > - fi-glk-dsi: [DMESG-FAIL][7] ([i915#1751]) -> [PASS][8] >[7]: > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html >[8]: > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17401/fi-glk-dsi/igt@i915_selftest@live@gt_pm.html Mostly looks like an improvement. -Chris
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/24] perf/core: Only copy-to-user after completely unlocking all locks, v3.
== Series Details == Series: series starting with [01/24] perf/core: Only copy-to-user after completely unlocking all locks, v3. URL : https://patchwork.freedesktop.org/series/76255/ State : warning == Summary == $ dim checkpatch origin/drm-tip 8184ca928432 perf/core: Only copy-to-user after completely unlocking all locks, v3. -:17: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #17: <4> [604.892540] 8264a558 (rcu_state.barrier_mutex){+.+.}, at: rcu_barrier+0x23/0x190 -:106: WARNING:BAD_SIGN_OFF: Duplicate signature #106: Signed-off-by: Maarten Lankhorst -:180: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #180: FILE: kernel/events/core.c:5174: +__perf_read(struct perf_event *event, char __user *buf, + size_t count, u64 *values) total: 0 errors, 2 warnings, 1 checks, 106 lines checked 0b81340ad042 drm/i915/gt: Move the batch buffer pool from the engine to the gt -:293: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating? #293: deleted file mode 100644 -:607: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1 #607: FILE: drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h:1: +/* -:608: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead #608: FILE: drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.h:2: + * SPDX-License-Identifier: MIT total: 0 errors, 3 warnings, 0 checks, 583 lines checked d7f858dee647 Revert "drm/i915/gem: Drop relocation slowpath" -:78: WARNING:LINE_SPACING: Missing a blank line after declarations #78: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1628: + int err = __get_user(c, addr); + if (err) total: 0 errors, 1 warnings, 0 checks, 257 lines checked 82487a6c7f3a drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2. 
-:506: WARNING:LONG_LINE: line over 100 characters #506: FILE: drivers/gpu/drm/i915/i915_gem.c:1341: + while ((obj = list_first_entry_or_null(&ww->obj_list, struct drm_i915_gem_object, obj_link))) { total: 0 errors, 1 warnings, 0 checks, 481 lines checked ce3feac9662e drm/i915: Remove locking from i915_gem_object_prepare_read/write 55d5a8b6f2d6 drm/i915: Parse command buffer earlier in eb_relocate(slow) 4a45c519254f Revert "drm/i915/gem: Split eb_vma into its own allocation" d1dbc9276ee0 drm/i915: Use per object locking in execbuf, v8. -:659: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided #659: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2751: + eb.reloc_pool = eb.batch_pool = NULL; total: 0 errors, 0 warnings, 1 checks, 669 lines checked 7d204f6c17a2 drm/i915: Use ww locking in intel_renderstate. -:10: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #10: Convert to using ww-waiting, and make sure we always pin intel_context_state, total: 0 errors, 1 warnings, 0 checks, 209 lines checked aeeaf6783820 drm/i915: Add ww context handling to context_barrier_task -:19: WARNING:LONG_LINE: line over 100 characters #19: FILE: drivers/gpu/drm/i915/gem/i915_gem_context.c:1099: + int (*pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data), total: 0 errors, 1 warnings, 0 checks, 146 lines checked 5c0318aeb2b3 drm/i915: Nuke arguments to eb_pin_engine 96f229c38fbc drm/i915: Pin engine before pinning all objects, v3. 
fdf30680d0d1 drm/i915: Rework intel_context pinning to do everything outside of pin_mutex -:125: CHECK:LINE_SPACING: Please don't use multiple blank lines #125: FILE: drivers/gpu/drm/i915/gt/intel_context.c:176: + + -:338: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #338: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:3180: + *vaddr = i915_gem_object_pin_map(ce->state->obj, + i915_coherent_map_type(ce->engine->i915) | total: 0 errors, 0 warnings, 2 checks, 435 lines checked 8301fcfaf872 drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin. -:95: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #95: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:595: + err = i915_vma_pin_ww(vma, &eb->ww, entry->pad_to_size, entry->alignment, -:203: WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a separate line #203: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2240: +* hsw should have this fixed, but bdw mucks it up again. */ total: 0 errors, 1 warnings, 1 checks, 833 lines checked cfd0b4d678ca drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2. 7f0541d81726 drm/i915: Kill last user of intel_context_create_request outside of selftests 4063cad86fed drm/i915: Convert i915_perf to ww locking as well 9e9cce8167f6 drm/i915: Dirty hack to fix selftests locking inversion 09c03a
[Intel-gfx] ✗ Fi.CI.BUILD: failure for Sync i915_pciids upto 8717c6b7414f (rev5)
== Series Details == Series: Sync i915_pciids upto 8717c6b7414f (rev5) URL : https://patchwork.freedesktop.org/series/76080/ State : failure == Summary == Applying: Sync i915_pciids upto 8717c6b7414f error: sha1 information is lacking or useless (src/intel_module.c). error: could not build fake ancestor hint: Use 'git am --show-current-patch=diff' to see the failed patch Patch failed at 0001 Sync i915_pciids upto 8717c6b7414f When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort".
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/24] perf/core: Only copy-to-user after completely unlocking all locks, v3.
== Series Details == Series: series starting with [01/24] perf/core: Only copy-to-user after completely unlocking all locks, v3. URL : https://patchwork.freedesktop.org/series/76255/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8342 -> Patchwork_17402 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_17402 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_17402, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_17402: ### IGT changes ### Possible regressions * igt@gem_render_tiled_blits@basic: - fi-pnv-d510:[PASS][1] -> [DMESG-WARN][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-pnv-d510/igt@gem_render_tiled_bl...@basic.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-pnv-d510/igt@gem_render_tiled_bl...@basic.html - fi-gdg-551: [PASS][3] -> [DMESG-WARN][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-gdg-551/igt@gem_render_tiled_bl...@basic.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-gdg-551/igt@gem_render_tiled_bl...@basic.html - fi-blb-e6850: [PASS][5] -> [DMESG-WARN][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-blb-e6850/igt@gem_render_tiled_bl...@basic.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-blb-e6850/igt@gem_render_tiled_bl...@basic.html * igt@i915_module_load@reload: - fi-byt-j1900: [PASS][7] -> [FAIL][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-byt-j1900/igt@i915_module_l...@reload.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-byt-j1900/igt@i915_module_l...@reload.html - fi-hsw-4770:[PASS][9] -> [INCOMPLETE][10] [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-hsw-4770/igt@i915_module_l...@reload.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-hsw-4770/igt@i915_module_l...@reload.html * igt@i915_selftest@live@gt_pm: - fi-cml-s: [PASS][11] -> [DMESG-FAIL][12] [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cml-s/igt@i915_selftest@live@gt_pm.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-cml-s/igt@i915_selftest@live@gt_pm.html - fi-cfl-guc: [PASS][13] -> [DMESG-FAIL][14] [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cfl-guc/igt@i915_selftest@live@gt_pm.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-cfl-guc/igt@i915_selftest@live@gt_pm.html - fi-skl-6700k2: [PASS][15] -> [DMESG-WARN][16] [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-6700k2/igt@i915_selftest@live@gt_pm.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-skl-6700k2/igt@i915_selftest@live@gt_pm.html - fi-skl-guc: [PASS][17] -> [DMESG-FAIL][18] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-guc/igt@i915_selftest@live@gt_pm.html [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-skl-guc/igt@i915_selftest@live@gt_pm.html - fi-kbl-x1275: [PASS][19] -> [DMESG-WARN][20] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-x1275/igt@i915_selftest@live@gt_pm.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-kbl-x1275/igt@i915_selftest@live@gt_pm.html - fi-cfl-8700k: [PASS][21] -> [DMESG-FAIL][22] [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-cfl-8700k/igt@i915_selftest@live@gt_pm.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-cfl-8700k/igt@i915_selftest@live@gt_pm.html - fi-skl-lmem:[PASS][23] -> [DMESG-WARN][24] [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-skl-lmem/igt@i915_selftest@live@gt_pm.html [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-skl-lmem/igt@i915_selftest@live@gt_pm.html - fi-kbl-8809g: [PASS][25] -> [DMESG-FAIL][26] [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-8809g/igt@i915_selftest@live@gt_pm.html [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-kbl-8809g/igt@i915_selftest@live@gt_pm.html - fi-kbl-r: [PASS][27] -> [DMESG-FAIL][28] [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8342/fi-kbl-r/igt@i915_selftest@live@gt_pm.html [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17402/fi-kbl-r/igt@i915_selftest@live@gt_pm.html - fi-kbl-7500u: [PASS][29] -> [DME
[Intel-gfx] [PATCH] drm/i915/selftests: Show the full scaling curve on failure
If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_rps.c | 36 ++ 1 file changed, 36 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index e0a791eac752..f8c416ab8539 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -484,11 +484,29 @@ int live_rps_frequency_cs(void *arg) if (!scaled_within(max.freq * min.count, min.freq * max.count, 2, 3)) { + int f; + pr_err("%s: CS did not scale with frequency! scaled min:%llu, max:%llu\n", engine->name, max.freq * min.count, min.freq * max.count); show_pcu_config(rps); + + for (f = min.freq + 1; f <= rps->max_freq; f++) { + int act = f; + u64 count; + + count = measure_cs_frequency_at(rps, engine, &act); + if (act < f) + break; + + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", + engine->name, + act, intel_gpu_freq(rps, act), count, + (int)DIV64_U64_ROUND_CLOSEST(100 * min.freq * count, +act * min.count)); + } + err = -EINVAL; } @@ -593,11 +611,29 @@ int live_rps_frequency_srm(void *arg) if (!scaled_within(max.freq * min.count, min.freq * max.count, 1, 2)) { + int f; + pr_err("%s: CS did not scale with frequency! 
scaled min:%llu, max:%llu\n", engine->name, max.freq * min.count, min.freq * max.count); show_pcu_config(rps); + + for (f = min.freq + 1; f <= rps->max_freq; f++) { + int act = f; + u64 count; + + count = measure_frequency_at(rps, cntr, &act); + if (act < f) + break; + + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", + engine->name, + act, intel_gpu_freq(rps, act), count, + (int)DIV64_U64_ROUND_CLOSEST(100 * min.freq * count, +act * min.count)); + } + err = -EINVAL; } -- 2.20.1
Re: [Intel-gfx] [PULL] drm-misc-next
Hi Dave, Daniel, just a friendly reminder to merge these changes. They don't seem to be in the upstream tree yet. Best regards Thomas On 14.04.20 at 11:07, Thomas Zimmermann wrote: > Hi Dave, Daniel, > > with 5.7-rc1 being tagged, here's the first PR for drm-next-misc for what > will become Linux 5.8. > > Best regards > Thomas > > > drm-misc-next-2020-04-14: > drm-misc-next for 5.8: > > UAPI Changes: > > - drm: error out with EBUSY when device has existing master > - drm: rework SET_MASTER and DROP_MASTER perm handling > > Cross-subsystem Changes: > > - fbdev: savage: fix -Wextra build warning > - video: omap2: Use scnprintf() for avoiding potential buffer overflow > > Core Changes: > > - Remove drm_pci.h > - drm_pci_{alloc/free}() are now legacy > - Introduce managed DRM resources > - Allow drivers to subclass struct drm_framebuffer > - Introduce struct drm_afbc_framebuffer and helpers > - fbdev: remove return value from generic fbdev setup > - Introduce simple-encoder helper > - vram-helpers: set fence on plane > - dp_mst: ACT timeout improvements > - dp_mst: Remove drm_dp_mst_has_audio() > - TTM: ttm_trace_dma_{map/unmap}() cleanups > - dma-buf: add flag for PCIP2P support > - EDID: Various improvements > - Encoder: cleanup semantics of possible_clones and possible_crtcs > - VBLANK documentation updates > - Writeback documentation updates > > Driver Changes: > > - Convert several drivers to i2c_new_client_device() > - Drop explicit drm_mode_config_cleanup() calls from drivers > - Auto-release device structures with drmm_add_final_kfree() > - Init fbdev console after registering DRM device > - Make various .debugfs functions return 0 unconditionally; ignore errors > - video: Use scnprintf() to avoid buffer overflows > - Convert drivers to simple encoders > > - drm/amdgpu: note that we can handle peer2peer DMA-buf > - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 > - drm/kirin: Revert change to register connectors > - drm/lima: Add optional devfreq 
and cooling device support > - drm/lima: Various improvements wrt. task handling > - drm/panel: nt39016: Support multiple modes and 50Hz > - drm/panel: Support Leadtek LTK050H3146W > - drm/rockchip: Add support for afbc > - drm/virtio: Various cleanups > - drm/hisilicon/hibmc: Enforce 128-byte stride alignment > - drm/qxl: Fix notify port address of cursor ring buffer > - drm/sun4i: Improvements to format handling > - drm/bridge: dw-hdmi: Various improvements > > The following changes since commit c2556238120bce8be37670e145226c12870a9e5a: > > Merge branch 'feature/staging_sm5' of > git://people.freedesktop.org/~sroland/linux into drm-next (2020-03-25 > 15:45:45 +1000) > > are available in the Git repository at: > > git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-2020-04-14 > > for you to fetch changes up to 14d0066b845971db7d0ef03c86fefe4d5bf2: > > drm: kirin: Revert change to add register connect helper functions > (2020-04-13 01:46:02 +) > > > drm-misc-next for 5.8: > > UAPI Changes: > > - drm: error out with EBUSY when device has existing master > - drm: rework SET_MASTER and DROP_MASTER perm handling > > Cross-subsystem Changes: > > - fbdev: savage: fix -Wextra build warning > - video: omap2: Use scnprintf() for avoiding potential buffer overflow > > Core Changes: > > - Remove drm_pci.h > - drm_pci_{alloc/free}() are now legacy > - Introduce managed DRM resources > - Allow drivers to subclass struct drm_framebuffer > - Introduce struct drm_afbc_framebuffer and helpers > - fbdev: remove return value from generic fbdev setup > - Introduce simple-encoder helper > - vram-helpers: set fence on plane > - dp_mst: ACT timeout improvements > - dp_mst: Remove drm_dp_mst_has_audio() > - TTM: ttm_trace_dma_{map/unmap}() cleanups > - dma-buf: add flag for PCIP2P support > - EDID: Various improvements > - Encoder: cleanup semantics of possible_clones and possible_crtcs > - VBLANK documentation updates > - Writeback documentation updates > > Driver Changes: > > 
- Convert several drivers to i2c_new_client_device() > - Drop explicit drm_mode_config_cleanup() calls from drivers > - Auto-release device structures with drmm_add_final_kfree() > - Init fbdev console after registering DRM device > - Make various .debugfs functions return 0 unconditionally; ignore errors > - video: Use scnprintf() to avoid buffer overflows > - Convert drivers to simple encoders > > - drm/amdgpu: note that we can handle peer2peer DMA-buf > - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 > - drm/kirin: Revert change to register connectors > - drm/lima: Add optional devfreq and cooling device support > - drm/lima: Various improvements wrt. task handling > - drm/panel: nt39016: Support multiple modes
[Intel-gfx] [PATCH] drm/i915/selftests: Show the full scaling curve on failure
If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_rps.c | 40 ++ 1 file changed, 40 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index e0a791eac752..395265121e43 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -484,11 +484,31 @@ int live_rps_frequency_cs(void *arg) if (!scaled_within(max.freq * min.count, min.freq * max.count, 2, 3)) { + int f; + pr_err("%s: CS did not scale with frequency! scaled min:%llu, max:%llu\n", engine->name, max.freq * min.count, min.freq * max.count); show_pcu_config(rps); + + for (f = min.freq + 1; f <= rps->max_freq; f++) { + int act = f; + u64 count; + + count = measure_cs_frequency_at(rps, engine, &act); + if (act < f) + break; + + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", + engine->name, + act, intel_gpu_freq(rps, act), count, + (int)DIV64_U64_ROUND_CLOSEST(100 * min.freq * count, +act * min.count)); + + f = act; /* may skip ahead [pcu granularity] */ + } + err = -EINVAL; } @@ -593,11 +613,31 @@ int live_rps_frequency_srm(void *arg) if (!scaled_within(max.freq * min.count, min.freq * max.count, 1, 2)) { + int f; + pr_err("%s: CS did not scale with frequency! 
scaled min:%llu, max:%llu\n", engine->name, max.freq * min.count, min.freq * max.count); show_pcu_config(rps); + + for (f = min.freq + 1; f <= rps->max_freq; f++) { + int act = f; + u64 count; + + count = measure_frequency_at(rps, cntr, &act); + if (act < f) + break; + + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", + engine->name, + act, intel_gpu_freq(rps, act), count, + (int)DIV64_U64_ROUND_CLOSEST(100 * min.freq * count, +act * min.count)); + + f = act; /* may skip ahead [pcu granularity] */ + } + err = -EINVAL; } -- 2.20.1
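The loop added by the patch above can be sketched in userspace C to show what the printed percentage means and why this revision adds "f = act". This is a sketch under stated assumptions, not the selftest itself: fake_measure() is a hypothetical model in which the hardware grants only even frequency steps up to hw_max and the counter scales perfectly linearly, and div_round_closest() stands in for the kernel's DIV64_U64_ROUND_CLOSEST().

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's DIV64_U64_ROUND_CLOSEST(). */
static uint64_t div_round_closest(uint64_t n, uint64_t d)
{
	return (n + d / 2) / d;
}

/*
 * Percentage of ideal linear scaling relative to the minimum point:
 * 100 means the counter grew in exact proportion to the frequency.
 */
static int scaling_percent(uint64_t min_freq, uint64_t min_count,
			   uint64_t act, uint64_t count)
{
	return (int)div_round_closest(100 * min_freq * count,
				      act * min_count);
}

/*
 * Hypothetical hardware model (an assumption for illustration): only
 * even steps are granted, capped at hw_max. *act is updated to the
 * step actually granted; the return value is a linear counter in kHz.
 */
static uint64_t fake_measure(int *act, int hw_max)
{
	int granted = (*act + 1) & ~1;	/* round up to the next even step */

	if (granted > hw_max)
		granted = hw_max;
	*act = granted;
	return (uint64_t)granted * 1000;
}

/* Walk the curve as the patch does; returns the number of samples taken. */
static int sample_curve(int min_freq, int max_freq, int hw_max,
			int *steps, int cap)
{
	int n = 0;

	for (int f = min_freq + 1; f <= max_freq && n < cap; f++) {
		int act = f;

		(void)fake_measure(&act, hw_max);
		if (act < f)
			break;		/* hardware would not go any higher */

		steps[n++] = act;
		f = act;		/* may skip ahead [pcu granularity] */
	}
	return n;
}
```

With min_freq 3, max_freq 11 and hw_max 10, this model samples steps 4, 6, 8 and 10 once each and then stops. Without the "f = act" line, as in the earlier revision of the patch, steps 6, 8 and 10 would each be measured twice, because the odd requests in between are rounded up to the same granted step.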
Re: [Intel-gfx] [PATCH 25/59] drm/tidss: Delete tidss->saved_state
On 15/04/2020 10:40, Daniel Vetter wrote: Not used anymore since the switch to suspend/resume helpers. Tested-by: Jyri Sarha Acked-by: Sam Ravnborg Signed-off-by: Daniel Vetter Cc: Jyri Sarha Cc: Tomi Valkeinen --- drivers/gpu/drm/tidss/tidss_drv.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/tidss/tidss_drv.h b/drivers/gpu/drm/tidss/tidss_drv.h index b23cd95c8d78..3b0a3d87b7c4 100644 --- a/drivers/gpu/drm/tidss/tidss_drv.h +++ b/drivers/gpu/drm/tidss/tidss_drv.h @@ -29,8 +29,6 @@ struct tidss_device { spinlock_t wait_lock; /* protects the irq masks */ dispc_irq_t irq_mask; /* enabled irqs in addition to wait_list */ - - struct drm_atomic_state *saved_state; }; #define to_tidss(__dev) container_of(__dev, struct tidss_device, ddev) Reviewed-by: Tomi Valkeinen Tomi -- Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
[Intel-gfx] [PATCH 1/5] drm/i915: Make define for lrc state offset
More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_context_sseu.c | 3 +-- drivers/gpu/drm/i915/gt/intel_lrc.c | 8 drivers/gpu/drm/i915/gt/intel_lrc.h | 1 + drivers/gpu/drm/i915/gt/selftest_lrc.c | 14 +++--- drivers/gpu/drm/i915/i915_perf.c | 2 +- 5 files changed, 14 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context_sseu.c b/drivers/gpu/drm/i915/gt/intel_context_sseu.c index 57a30956c922..487299cb91f2 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_context_sseu.c @@ -25,8 +25,7 @@ static int gen8_emit_rpcs_config(struct i915_request *rq, return PTR_ERR(cs); offset = i915_ggtt_offset(ce->state) + -LRC_STATE_PN * PAGE_SIZE + -CTX_R_PWR_CLK_STATE * 4; +LRC_STATE_OFFSET + CTX_R_PWR_CLK_STATE * 4; *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT; *cs++ = lower_32_bits(offset); diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 34f67eb9bfa1..6a4fa7c6176a 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1161,7 +1161,7 @@ static void restore_default_state(struct intel_context *ce, if (engine->pinned_default_state) memcpy(regs, /* skip restoring the vanilla PPHWSP */ - engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE, + engine->pinned_default_state + LRC_STATE_OFFSET, engine->context_size - PAGE_SIZE); execlists_init_reg_state(regs, ce, engine, ce->ring, false); @@ -3136,7 +3136,7 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine) static void execlists_context_unpin(struct intel_context *ce) { - check_redzone((void *)ce->lrc_reg_state - LRC_STATE_PN * PAGE_SIZE, + check_redzone((void *)ce->lrc_reg_state - LRC_STATE_OFFSET, ce->engine); i915_gem_object_unpin_map(ce->state->obj); @@ -3183,7 +3183,7 @@ __execlists_context_pin(struct intel_context *ce, return 
PTR_ERR(vaddr); ce->lrc_desc = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE; - ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE; + ce->lrc_reg_state = vaddr + LRC_STATE_OFFSET; __execlists_update_reg_state(ce, engine, ce->ring->tail); return 0; @@ -4846,7 +4846,7 @@ populate_lr_context(struct intel_context *ce, * The second page of the context object contains some registers which * must be set up prior to the first execution. */ - execlists_init_reg_state(vaddr + LRC_STATE_PN * PAGE_SIZE, + execlists_init_reg_state(vaddr + LRC_STATE_OFFSET, ce, engine, ring, inhibit); ret = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h index dfbc214e14f5..91fd8e452d9b 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h @@ -90,6 +90,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine); #define LRC_PPHWSP_SZ (1) /* After the PPHWSP we have the logical state for the context */ #define LRC_STATE_PN (LRC_PPHWSP_PN + LRC_PPHWSP_SZ) +#define LRC_STATE_OFFSET (LRC_STATE_PN * PAGE_SIZE) /* Space within PPHWSP reserved to be used as scratch */ #define LRC_PPHWSP_SCRATCH 0x34 diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 6f5e35afe1b2..32d2b0850dec 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -4131,7 +4131,7 @@ static int live_lrc_layout(void *arg) err = PTR_ERR(hw); break; } - hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw); + hw += LRC_STATE_OFFSET / sizeof(*hw); execlists_init_reg_state(memset(lrc, POISON_INUSE, PAGE_SIZE), engine->kernel_context, @@ -4284,7 +4284,7 @@ static int live_lrc_fixed(void *arg) err = PTR_ERR(hw); break; } - hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw); + hw += LRC_STATE_OFFSET / sizeof(*hw); for (t = tbl; t->name; t++) { int dw = find_offset(hw, t->reg); @@ -4870,7 +4870,7 @@ store_context(struct intel_context *ce, struct i915_vma *scratch) x = 0; 
dw = 0; hw = ce->engine->pinned_default_state; - hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw); + hw += LRC_STATE_OFFSET / sizeof(*hw); do { u32 len = hw[dw] & 0x7f; @@ -5023,7 +5023,7 @@
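The arithmetic behind the new define can be checked in isolation. What follows is a minimal standalone sketch, assuming a 4 KiB page and LRC_PPHWSP_PN equal to 0 (neither value is visible in the quoted hunks); the SKETCH_ prefix and the helper names are illustrative and are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: not visible in the quoted hunks. */
#define SKETCH_PAGE_SIZE           4096u
#define SKETCH_LRC_PPHWSP_PN       0u

/* Mirrors the defines shown in the patch context. */
#define SKETCH_LRC_PPHWSP_SZ       1u
#define SKETCH_LRC_STATE_PN        (SKETCH_LRC_PPHWSP_PN + SKETCH_LRC_PPHWSP_SZ)
#define SKETCH_LRC_STATE_OFFSET    (SKETCH_LRC_STATE_PN * SKETCH_PAGE_SIZE)
#define SKETCH_CTX_R_PWR_CLK_STATE (0x42 + 1)

/*
 * Byte offset of one register-state dword, measured from the start of
 * the context object (the "LRC_STATE_OFFSET + index * 4" pattern).
 */
static uint32_t sketch_reg_state_byte_offset(uint32_t dword_index)
{
	/* Guard against wrapping the 32-bit offset. */
	assert(dword_index <=
	       (UINT32_MAX - SKETCH_LRC_STATE_OFFSET) / sizeof(uint32_t));

	return SKETCH_LRC_STATE_OFFSET +
	       dword_index * (uint32_t)sizeof(uint32_t);
}

/*
 * Number of u32 elements to skip when walking a u32 pointer, matching
 * the "hw += LRC_STATE_OFFSET / sizeof(*hw)" pattern in the selftests.
 */
static size_t sketch_reg_state_dword_skip(void)
{
	return SKETCH_LRC_STATE_OFFSET / sizeof(uint32_t);
}
```

Under these assumptions the byte form and the dword form describe the same location; the define only removes the repeated multiplication by the page size at each call site.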
[Intel-gfx] [PATCH 4/5] drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL
Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_lrc.c | 73 +++-- drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 1 + drivers/gpu/drm/i915/i915_reg.h | 1 + 3 files changed, 71 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index cc4d1967d00b..efa0f33577a7 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -514,7 +514,7 @@ static void set_offsets(u32 *regs, #define REG16(x) \ (((x) >> 9) | BIT(7) | BUILD_BUG_ON_ZERO(x >= 0x1)), \ (((x) >> 2) & 0x7f) -#define END(x) 0, (x) +#define END(total_state_size) 0, (total_state_size) { const u32 base = engine->mmio_base; @@ -922,8 +922,63 @@ static const u8 gen12_rcs_offsets[] = { NOP(6), LRI(1, 0), REG(0x0c8), + NOP(3+9+1), + + LRI(51, POSTED), + REG16(0x588), + REG16(0x588), + REG16(0x588), + REG16(0x588), + REG16(0x588), + REG16(0x588), + REG(0x028), + REG(0x09c), + REG(0x0c0), + REG(0x178), + REG(0x17c), + REG16(0x358), + REG(0x170), + REG(0x150), + REG(0x154), + REG(0x158), + REG16(0x41c), + REG16(0x600), + REG16(0x604), + REG16(0x608), + REG16(0x60c), + REG16(0x610), + REG16(0x614), + REG16(0x618), + REG16(0x61c), + REG16(0x620), + REG16(0x624), + REG16(0x628), + REG16(0x62c), + REG16(0x630), + REG16(0x634), + REG16(0x638), + REG16(0x63c), + REG16(0x640), + REG16(0x644), + REG16(0x648), + REG16(0x64c), + REG16(0x650), + REG16(0x654), + REG16(0x658), + REG16(0x65c), + REG16(0x660), + REG16(0x664), + REG16(0x668), + REG16(0x66c), + REG16(0x670), + REG16(0x674), + REG16(0x678), + REG16(0x67c), + REG(0x068), + REG(0x084), + NOP(1), - END(80) + END(185) }; #undef END @@ -3207,7 +3262,7 @@ gen12_emit_timestamp_wa_lrm(struct intel_context *ce, u32 *cs) } static u32 * -gen12_emit_timestamp_wa_lrr(struct intel_context *ce, u32 *cs) 
+gen12_emit_render_ctx_wa(struct intel_context *ce, u32 *cs) { const u32 lrc_offset = i915_ggtt_offset(ce->state) + LRC_STATE_OFFSET; @@ -3227,6 +3282,16 @@ gen12_emit_timestamp_wa_lrr(struct intel_context *ce, u32 *cs) *cs++ = scratch_reg; *cs++ = i915_mmio_reg_offset(RING_CTX_TIMESTAMP(0)); + *cs++ = MI_LOAD_REGISTER_MEM_GEN8 | + MI_SRM_LRM_GLOBAL_GTT | MI_LRI_LRM_CS_MMIO; + *cs++ = scratch_reg; + *cs++ = lrc_offset + CTX_CMD_BUF_CCTL * sizeof(u32); + *cs++ = 0; + + *cs++ = MI_LOAD_REGISTER_REG | MI_LRI_LRM_CS_MMIO; + *cs++ = scratch_reg; + *cs++ = i915_mmio_reg_offset(RING_CMD_BUF_CCTL(0)); + return cs; } @@ -3290,7 +3355,7 @@ gen12_setup_timestamp_ctx_wa(struct intel_context *ce) fn = gen12_emit_timestamp_wa_lrm; if (ce->engine->class == RENDER_CLASS) - fn = gen12_emit_timestamp_wa_lrr; + fn = gen12_emit_render_ctx_wa; setup_indirect_ctx_bb(ce, fn); } diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h index bb444614f33b..6c81a3a815ac 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h @@ -27,6 +27,7 @@ #define CTX_PDP0_UDW (0x30 + 1) #define CTX_PDP0_LDW (0x32 + 1) #define CTX_R_PWR_CLK_STATE(0x42 + 1) +#define CTX_CMD_BUF_CCTL (0xB6 + 1) #define GEN9_CTX_RING_MI_MODE 0x54 diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 4a1965467374..0ef30e3cdd3f 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2657,6 +2657,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define RING_DMA_FADD_UDW(base)_MMIO((base) + 0x60) /* gen8+ */ #define RING_INSTPM(base) _MMIO((base) + 0xc0) #define RING_MI_MODE(base) _MMIO((base) + 0x9c) +#define RING_CMD_BUF_CCTL(base) _MMIO((base) + 0x84) #define INSTPS _MMIO(0x2070) /* 965+ only */ #define GEN4_INSTDONE1 _MMIO(0x207c) /* 965+ only, aka INSTDONE_2 on SNB */ #define ACTHD_I965 _MMIO(0x2074) -- 2.17.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
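The seven dwords this patch appends to the render workaround batch can be sketched outside the kernel. The block below is a rough illustration only: the MI_INSTR() and MI_SRM_LRM_GLOBAL_GTT encodings are written from memory of intel_gpu_commands.h and are not shown in the quoted diff, so treat them as assumptions; the scratch register is passed in as a parameter because its definition is not visible here, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings (recalled from the i915 headers, not in the diff). */
#define MI_INSTR(opcode, flags)   (((uint32_t)(opcode) << 23) | (flags))
#define MI_SRM_LRM_GLOBAL_GTT     (1u << 22)

/* Values taken from the series itself. */
#define MI_LRI_LRM_CS_MMIO        (1u << 19)
#define MI_LOAD_REGISTER_MEM_GEN8 MI_INSTR(0x29, 2)
#define MI_LOAD_REGISTER_REG      MI_INSTR(0x2A, 1)
#define CTX_CMD_BUF_CCTL          (0xB6 + 1)
#define RING_CMD_BUF_CCTL_OFFSET  0x84u /* RING_CMD_BUF_CCTL(0) */

/*
 * Emit "load scratch from the saved context image, then copy scratch
 * into CMD_BUF_CCTL". Returns the advanced command pointer.
 */
static uint32_t *sketch_emit_cmd_buf_cctl_reload(uint32_t *cs,
						 uint32_t lrc_offset,
						 uint32_t scratch_reg)
{
	assert(cs != NULL);

	/* LRM: memory -> scratch register (4 dwords). */
	*cs++ = MI_LOAD_REGISTER_MEM_GEN8 |
		MI_SRM_LRM_GLOBAL_GTT | MI_LRI_LRM_CS_MMIO;
	*cs++ = scratch_reg;
	*cs++ = lrc_offset + CTX_CMD_BUF_CCTL * (uint32_t)sizeof(uint32_t);
	*cs++ = 0; /* upper 32 bits of the address */

	/* LRR: scratch register -> CMD_BUF_CCTL (3 dwords). */
	*cs++ = MI_LOAD_REGISTER_REG | MI_LRI_LRM_CS_MMIO;
	*cs++ = scratch_reg;
	*cs++ = RING_CMD_BUF_CCTL_OFFSET;

	return cs;
}
```

The source address is the CTX_CMD_BUF_CCTL slot in the context image, which is why the patch also extends the gen12 register layout table so that slot is described.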
[Intel-gfx] [PATCH 3/5] drm/i915: Add live selftests for indirect ctx batchbuffers
Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at any granularity of one lri, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_lrc.c | 156 - 1 file changed, 155 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 32d2b0850dec..32c4096b627b 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -5363,6 +5363,159 @@ static int live_lrc_isolation(void *arg) return err; } +static int ctx_bb_submit_req(struct intel_context *ce) +{ + struct i915_request *rq; + int err; + + rq = intel_context_create_request(ce); + if (IS_ERR(rq)) + return PTR_ERR(rq); + + i915_request_get(rq); + i915_request_add(rq); + + err = i915_request_wait(rq, 0, HZ / 5); + if (err < 0) + pr_err("%s: request not completed!\n", rq->engine->name); + + i915_request_put(rq); + + return 0; +} + +#define CTX_BB_CANARY_OFFSET (3*1024) +#define CTX_BB_CANARY_INDEX (CTX_BB_CANARY_OFFSET/sizeof(u32)) + +static u32 * +emit_ctx_bb_canary(struct intel_context *ce, u32 *cs) +{ + const u32 ring_start_reg = i915_mmio_reg_offset(RING_START(0)); + const u32 srm = MI_STORE_REGISTER_MEM_GEN8 | + MI_SRM_LRM_GLOBAL_GTT | MI_LRI_LRM_CS_MMIO; + + *cs++ = srm; + *cs++ = ring_start_reg; + *cs++ = i915_ggtt_offset(ce->state) + + ce->ctx_bb_offset + CTX_BB_CANARY_OFFSET; + *cs++ = 0; + 
+ return cs; +} + +static void +ctx_bb_setup(struct intel_context *ce) +{ + u32 *cs = context_indirect_bb(ce); + + cs[CTX_BB_CANARY_INDEX] = 0xdeadf00d; + + setup_indirect_ctx_bb(ce, emit_ctx_bb_canary); +} + +static bool check_ring_start(struct intel_context *ce) +{ + const u32 * const ctx_bb = (void *)(ce->lrc_reg_state) - + LRC_STATE_PN * PAGE_SIZE + ce->ctx_bb_offset; + + if (ctx_bb[CTX_BB_CANARY_INDEX] == ce->lrc_reg_state[CTX_RING_START]) + return true; + + pr_err("ring start mismatch: canary 0x%08x vs state 0x%08x\n", + ctx_bb[CTX_BB_CANARY_INDEX], + ce->lrc_reg_state[CTX_RING_START]); + + return false; +} + +static int ctx_bb_check(struct intel_context *ce) +{ + int err; + + err = ctx_bb_submit_req(ce); + if (err) + return err; + + if (!check_ring_start(ce)) + return -EINVAL; + + return 0; +} + +static int __per_ctx_bb(struct intel_engine_cs *engine) +{ + struct intel_context *ce1, *ce2; + int err = 0; + + ce1 = intel_context_create(engine); + ce2 = intel_context_create(engine); + + err = intel_context_pin(ce1); + if (err) + return err; + + err = intel_context_pin(ce2); + if (err) { + intel_context_put(ce1); + return err; + } + + /* We use the already reserved extra page in context state */ + if (!ce1->ctx_bb_offset) { + GEM_BUG_ON(ce2->ctx_bb_offset); + GEM_BUG_ON(INTEL_GEN(engine->i915) == 12); + goto out; + } + + /* In order to test that our per context bb is truly per context, +* and executes at the intended spot on context restoring process, +* make the batch store the ring start value to memory. +* As ring start is restored apriori of starting the indirect ctx bb and +* as it will be different for each context, it fits to this purpose. 
+*/ + ctx_bb_setup(ce1); + ctx_bb_setup(ce2); + + err = ctx_bb_check(ce1); + if (err) + goto out; + + err = ctx_bb_check(ce2); +out: + intel_context_unpin(ce2); + intel_context_put(ce2); + + intel_context_unpin(ce1); + intel_context_put(ce1); + + return err; +} + +static int live_lrc_indirect_ctx_bb(void *arg) +{ + struct intel_gt *gt = arg; + struct intel_engine_cs *engine; + enum intel_engine_id id; + int err = 0; + + for_each_engine(engine, gt, id) { + + intel_engine_pm_get(engine); + err = __per_ctx_bb(engine); + intel_engine_pm_put(engine); + + if (err) + break; + +
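The canary logic of this selftest can be modelled on the host without any hardware. The sketch below is a simplified simulation under stated assumptions: the indirect ctx bb is taken to occupy one 4 KiB page, the "hardware" is a plain function that either runs the batch or does not, and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTX_BB_CANARY_OFFSET (3 * 1024)
#define CTX_BB_CANARY_INDEX  (CTX_BB_CANARY_OFFSET / sizeof(uint32_t))
#define SKETCH_BB_DWORDS     1024u       /* assumes a 4 KiB bb page */
#define SKETCH_CANARY_POISON 0xdeadf00du

/* Stand-in for one context: its ring start and its bb page. */
struct sketch_ctx {
	uint32_t ring_start;
	uint32_t bb[SKETCH_BB_DWORDS];
};

/* Like ctx_bb_setup(): poison the canary slot before submission. */
static void sketch_ctx_setup(struct sketch_ctx *c, uint32_t ring_start)
{
	assert(CTX_BB_CANARY_INDEX < SKETCH_BB_DWORDS);

	c->ring_start = ring_start;
	c->bb[CTX_BB_CANARY_INDEX] = SKETCH_CANARY_POISON;
}

/*
 * Stand-in for a context restore. If the indirect ctx bb runs, it
 * stores the already restored ring start into the canary slot.
 */
static void sketch_ctx_restore(struct sketch_ctx *c, bool bb_runs)
{
	if (bb_runs)
		c->bb[CTX_BB_CANARY_INDEX] = c->ring_start;
}

/* Like check_ring_start(): the canary must equal the ring start. */
static bool sketch_check_ring_start(const struct sketch_ctx *c)
{
	return c->bb[CTX_BB_CANARY_INDEX] == c->ring_start;
}
```

Using two contexts with different ring start values is what lets the check distinguish "the batch ran for this context" from "some batch ran", which is the point the comment in __per_ctx_bb() makes.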
[Intel-gfx] [PATCH 2/5] drm/i915: Add per ctx batchbuffer wa for timestamp
Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. For render engine, we can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone. For other engines, no scratch is available so we must do a more complex sequence of sync and async LRMs. As the LRM is slower, the probablity of racy write raises and thus we still see corruption sometimes. v2: tidying (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston Cc: Chris Wilson Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/intel_context_types.h | 3 + drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 3 +- drivers/gpu/drm/i915/gt/intel_lrc.c | 205 ++ drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 1 + 4 files changed, 174 insertions(+), 38 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 07cb83a0d017..c7573d565f58 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -70,6 +70,9 @@ struct intel_context { u32 *lrc_reg_state; u64 lrc_desc; + + u32 ctx_bb_offset; + u32 tag; /* cookie passed to HW to track this context on submission */ /* Time on GPU as tracked by the hw. */ diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index f04214a54f75..0c2adb4078a7 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -138,7 +138,7 @@ */ #define MI_LOAD_REGISTER_IMM(x)MI_INSTR(0x22, 2*(x)-1) /* Gen11+. addr = base + (ctx_restore ? 
offset & GENMASK(12,2) : offset) */ -#define MI_LRI_CS_MMIO (1<<19) +#define MI_LRI_LRM_CS_MMIO (1<<19) #define MI_LRI_FORCE_POSTED (1<<12) #define MI_LOAD_REGISTER_IMM_MAX_REGS (126) #define MI_STORE_REGISTER_MEMMI_INSTR(0x24, 1) @@ -155,6 +155,7 @@ #define MI_FLUSH_DW_USE_PPGTT(0<<2) #define MI_LOAD_REGISTER_MEM MI_INSTR(0x29, 1) #define MI_LOAD_REGISTER_MEM_GEN8 MI_INSTR(0x29, 2) +#define MI_LRM_ASYNC (1<<21) #define MI_LOAD_REGISTER_REGMI_INSTR(0x2A, 1) #define MI_BATCH_BUFFERMI_INSTR(0x30, 1) #define MI_BATCH_NON_SECURE (1) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index 6a4fa7c6176a..cc4d1967d00b 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -234,7 +234,7 @@ static void execlists_init_reg_state(u32 *reg_state, const struct intel_ring *ring, bool close); static void -__execlists_update_reg_state(const struct intel_context *ce, +__execlists_update_reg_state(struct intel_context *ce, const struct intel_engine_cs *engine, u32 head); @@ -537,7 +537,7 @@ static void set_offsets(u32 *regs, if (flags & POSTED) *regs |= MI_LRI_FORCE_POSTED; if (INTEL_GEN(engine->i915) >= 11) - *regs |= MI_LRI_CS_MMIO; + *regs |= MI_LRI_LRM_CS_MMIO; regs++; GEM_BUG_ON(!count); @@ -3142,8 +3142,161 @@ static void execlists_context_unpin(struct intel_context *ce) i915_gem_object_unpin_map(ce->state->obj); } +static u32 intel_lr_indirect_ctx_offset(const struct intel_engine_cs *engine) +{ + u32 indirect_ctx_offset; + + switch (INTEL_GEN(engine->i915)) { + default: + MISSING_CASE(INTEL_GEN(engine->i915)); + fallthrough; + case 12: + indirect_ctx_offset = + GEN12_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; + break; + case 11: + indirect_ctx_offset = + GEN11_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; + break; + case 10: + indirect_ctx_offset = + GEN10_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; + break; + case 9: + indirect_ctx_offset = + GEN9_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; + break; + case 8: + indirect_ctx_offset 
= + GEN8_CTX_RCS_INDIRECT_CTX_OFFSET_DEFAULT; + break; + } + + return indirect_ctx_offset; +} + +static u32 * +gen12_emit_timestamp_wa_lrm(struct intel_context *ce, u32 *cs) +{ + const u32 lrc_offset = i915_ggtt_offset(ce->state) + + LRC_STATE_OFFSET; + const u32 lrm = MI_LOAD_REGISTER_MEM_GEN8 | + MI_
[Intel-gfx] [PATCH 5/5] drm/i915: Split ctx timestamp selftest into two
We use different workarounds for render engine than for other engines.
Split the selftest according to these types so that we get error
rates per workaround.

Signed-off-by: Mika Kuoppala
---
 drivers/gpu/drm/i915/gt/selftest_lrc.c | 26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c
index 32c4096b627b..7daee5ca7d3b 100644
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -4749,10 +4749,9 @@ static int __lrc_timestamp(const struct lrc_timestamp *arg, bool preempt)
 	return err;
 }
 
-static int live_lrc_timestamp(void *arg)
+static int __live_lrc_timestamp(struct intel_gt *gt, bool rcs)
 {
 	struct lrc_timestamp data = {};
-	struct intel_gt *gt = arg;
 	enum intel_engine_id id;
 	const u32 poison[] = {
 		0,
@@ -4774,6 +4773,12 @@ static int live_lrc_timestamp(void *arg)
 		unsigned long heartbeat;
 		int i, err = 0;
 
+		if (rcs && data.engine->class != RENDER_CLASS)
+			continue;
+
+		if (!rcs && data.engine->class == RENDER_CLASS)
+			continue;
+
 		engine_heartbeat_disable(data.engine, &heartbeat);
 
 		for (i = 0; i < ARRAY_SIZE(data.ce); i++) {
@@ -4825,6 +4830,20 @@ static int live_lrc_timestamp(void *arg)
 	return 0;
 }
 
+static int live_lrc_timestamp_rcs(void *arg)
+{
+	struct intel_gt *gt = arg;
+
+	return __live_lrc_timestamp(gt, true);
+}
+
+static int live_lrc_timestamp_xcs(void *arg)
+{
+	struct intel_gt *gt = arg;
+
+	return __live_lrc_timestamp(gt, false);
+}
+
 static struct i915_vma *
 create_user_vma(struct i915_address_space *vm, unsigned long size)
 {
@@ -5748,7 +5767,8 @@ int intel_lrc_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_lrc_state),
 		SUBTEST(live_lrc_gpr),
 		SUBTEST(live_lrc_indirect_ctx_bb),
-		SUBTEST(live_lrc_timestamp),
+		SUBTEST(live_lrc_timestamp_rcs),
+		SUBTEST(live_lrc_timestamp_xcs),
 		SUBTEST(live_lrc_garbage),
 		SUBTEST(live_pphwsp_runtime),
 		SUBTEST(live_lrc_isolation),
-- 
2.17.1
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Add live selftests for indirect ctx batchbuffers
Mika Kuoppala writes: > Indirect ctx batchbuffers are a hw feature of which > batch can be run, by hardware, during context restoration stage. > Driver can setup a batchbuffer and also an offset into the > context image. When context image is marshalled from > memory to registers, and when the offset from the start of > context register state is equal of what driver pre-determined, > batch will run. So one can manipulate context restoration > process at any granularity of one lri, given some This is wrong, it is granularity of cacheline. -Mika > limitations, as you need to have rudimentaries in place > before you can run a batch. > > Add selftest which will write the ring start register > to a canary spot. This will test that hardware will run a > batchbuffer for the context in question. > > Signed-off-by: Mika Kuoppala > --- > drivers/gpu/drm/i915/gt/selftest_lrc.c | 156 - > 1 file changed, 155 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c > b/drivers/gpu/drm/i915/gt/selftest_lrc.c > index 32d2b0850dec..32c4096b627b 100644 > --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c > +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c > @@ -5363,6 +5363,159 @@ static int live_lrc_isolation(void *arg) > return err; > } > > +static int ctx_bb_submit_req(struct intel_context *ce) > +{ > + struct i915_request *rq; > + int err; > + > + rq = intel_context_create_request(ce); > + if (IS_ERR(rq)) > + return PTR_ERR(rq); > + > + i915_request_get(rq); > + i915_request_add(rq); > + > + err = i915_request_wait(rq, 0, HZ / 5); > + if (err < 0) > + pr_err("%s: request not completed!\n", rq->engine->name); > + > + i915_request_put(rq); > + > + return 0; > +} > + > +#define CTX_BB_CANARY_OFFSET (3*1024) > +#define CTX_BB_CANARY_INDEX (CTX_BB_CANARY_OFFSET/sizeof(u32)) > + > +static u32 * > +emit_ctx_bb_canary(struct intel_context *ce, u32 *cs) > +{ > + const u32 ring_start_reg = i915_mmio_reg_offset(RING_START(0)); > + const u32 srm = MI_STORE_REGISTER_MEM_GEN8 
| > + MI_SRM_LRM_GLOBAL_GTT | MI_LRI_LRM_CS_MMIO; > + > + *cs++ = srm; > + *cs++ = ring_start_reg; > + *cs++ = i915_ggtt_offset(ce->state) + > + ce->ctx_bb_offset + CTX_BB_CANARY_OFFSET; > + *cs++ = 0; > + > + return cs; > +} > + > +static void > +ctx_bb_setup(struct intel_context *ce) > +{ > + u32 *cs = context_indirect_bb(ce); > + > + cs[CTX_BB_CANARY_INDEX] = 0xdeadf00d; > + > + setup_indirect_ctx_bb(ce, emit_ctx_bb_canary); > +} > + > +static bool check_ring_start(struct intel_context *ce) > +{ > + const u32 * const ctx_bb = (void *)(ce->lrc_reg_state) - > + LRC_STATE_PN * PAGE_SIZE + ce->ctx_bb_offset; > + > + if (ctx_bb[CTX_BB_CANARY_INDEX] == ce->lrc_reg_state[CTX_RING_START]) > + return true; > + > + pr_err("ring start mismatch: canary 0x%08x vs state 0x%08x\n", > +ctx_bb[CTX_BB_CANARY_INDEX], > +ce->lrc_reg_state[CTX_RING_START]); > + > + return false; > +} > + > +static int ctx_bb_check(struct intel_context *ce) > +{ > + int err; > + > + err = ctx_bb_submit_req(ce); > + if (err) > + return err; > + > + if (!check_ring_start(ce)) > + return -EINVAL; > + > + return 0; > +} > + > +static int __per_ctx_bb(struct intel_engine_cs *engine) > +{ > + struct intel_context *ce1, *ce2; > + int err = 0; > + > + ce1 = intel_context_create(engine); > + ce2 = intel_context_create(engine); > + > + err = intel_context_pin(ce1); > + if (err) > + return err; > + > + err = intel_context_pin(ce2); > + if (err) { > + intel_context_put(ce1); > + return err; > + } > + > + /* We use the already reserved extra page in context state */ > + if (!ce1->ctx_bb_offset) { > + GEM_BUG_ON(ce2->ctx_bb_offset); > + GEM_BUG_ON(INTEL_GEN(engine->i915) == 12); > + goto out; > + } > + > + /* In order to test that our per context bb is truly per context, > + * and executes at the intended spot on context restoring process, > + * make the batch store the ring start value to memory. 
> + * As ring start is restored apriori of starting the indirect ctx bb and > + * as it will be different for each context, it fits to this purpose. > + */ > + ctx_bb_setup(ce1); > + ctx_bb_setup(ce2); > + > + err = ctx_bb_check(ce1); > + if (err) > + goto out; > + > + err = ctx_bb_check(ce2); > +out: > + intel_context_unpin(ce2); > + intel_context_put(ce2); > + > + intel_context_unpin(ce1); > + intel_context_put(ce1); > + > + return err; > +} > + > +static int live_lrc_indirect_ctx_bb(void *arg) > +{ > + struct intel_gt *gt = arg; > + struct intel_engine_cs *engine; > + enum intel_engine_id id; > + int err = 0; > +
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL
Quoting Mika Kuoppala (2020-04-21 14:16:32)
> -	END(80)
> +	END(185)

Round up to the next cacheline (192) for safety paranoia.
-Chris
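The rounding suggested above is ordinary align-up arithmetic on a dword count. A minimal sketch, assuming a 64-byte cacheline (16 dwords); the cacheline size is not stated in the thread, and 185 also rounds to 192 for 32 or 64 dword alignment, so the example does not pin that value down. The function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: 64-byte cacheline, i.e. 16 dwords of 4 bytes each. */
#define SKETCH_CACHELINE_DWORDS 16u

/* Round a dword count up to the next multiple of align (align > 0). */
static uint32_t sketch_round_up_dwords(uint32_t dwords, uint32_t align)
{
	assert(align != 0);

	return (dwords + align - 1) / align * align;
}
```

With the value from the patch, sketch_round_up_dwords(185, SKETCH_CACHELINE_DWORDS) gives 192, the figure named in the reply.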
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Split ctx timestamp selftest into two
Quoting Mika Kuoppala (2020-04-21 14:16:33)
> @@ -4774,6 +4773,12 @@ static int live_lrc_timestamp(void *arg)
> 		unsigned long heartbeat;
> 		int i, err = 0;
>
> +		if (rcs && data.engine->class != RENDER_CLASS)
> +			continue;
> +
> +		if (!rcs && data.engine->class == RENDER_CLASS)
> +			continue;

At least have a bit of finesse and do

	if (!(class & BIT(data.engine->engine->class)))
		continue;
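The mask-based filter suggested above replaces the two boolean branches with one test against a bitmask of engine classes. A small standalone sketch of that idea, with a hypothetical class numbering (render as class 0) and illustrative names throughout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BIT(n) (1u << (n))

/* Hypothetical numbering, chosen only for this illustration. */
enum sketch_engine_class {
	SKETCH_RENDER_CLASS = 0,
	SKETCH_VIDEO_DECODE_CLASS,
	SKETCH_VIDEO_ENHANCEMENT_CLASS,
	SKETCH_COPY_ENGINE_CLASS,
};

/* One mask per subtest: render only, or everything except render. */
#define SKETCH_RCS_MASK SKETCH_BIT(SKETCH_RENDER_CLASS)
#define SKETCH_XCS_MASK (~SKETCH_RCS_MASK)

/* True when the engine's class is part of the requested mask. */
static bool sketch_class_selected(uint32_t class_mask,
				  enum sketch_engine_class engine_class)
{
	assert((unsigned int)engine_class < 32);

	return (class_mask & SKETCH_BIT(engine_class)) != 0;
}
```

A caller would then skip an engine with a single `if (!sketch_class_selected(mask, cls)) continue;`, and further splits (for example per copy engine) need only a new mask rather than another boolean parameter.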
Re: [Intel-gfx] [PULL] drm-misc-next
On Tue, Apr 21, 2020 at 2:46 PM Thomas Zimmermann wrote: > > Hi Dave, Daniel, > > just a friendly reminder to merge these changes. They don't seem to be > in the upstream tree yet. Dave noticed and pinged me on irc that there's some changes in core mm in this one. That patch is correctly acked by akpm, but stuff like that should be highlighted in the pull summary. Essentially anything outside of what's officially maintainer by drm.git needs a special highlight (and double-checking by maintainers that all the right acks are there). I'm assuming Dave is just going to edit the merge commit himself and doesn't need a new pull request. -Daniel > > Best regards > Thomas > > Am 14.04.20 um 11:07 schrieb Thomas Zimmermann: > > Hi Dave, Daniel, > > > > with 5.7-rc1 being tagged, here's the first PR for drm-next-misc for what > > will become Linux 5.8. > > > > Best regards > > Thomas > > > > > > drm-misc-next-2020-04-14: > > drm-misc-next for 5.8: > > > > UAPI Changes: > > > > - drm: error out with EBUSY when device has existing master > > - drm: rework SET_MASTER and DROP_MASTER perm handling > > > > Cross-subsystem Changes: > > > > - fbdev: savage: fix -Wextra build warning > > - video: omap2: Use scnprintf() for avoiding potential buffer overflow > > > > Core Changes: > > > > - Remove drm_pci.h > > - drm_pci_{alloc/free)() are now legacy > > - Introduce managed DRM resourcesA > > - Allow drivers to subclass struct drm_framebuffer > > - Introduce struct drm_afbc_framebuffer and helpers > > - fbdev: remove return value from generic fbdev setup > > - Introduce simple-encoder helper > > - vram-helpers: set fence on plane > > - dp_mst: ACT timeout improvements > > - dp_mst: Remove drm_dp_mst_has_audio() > > - TTM: ttm_trace_dma_{map/unmap}() cleanups > > - dma-buf: add flag for PCIP2P support > > - EDID: Various improvements > > - Encoder: cleanup semantics of possible_clones and possible_crtcs > > - VBLANK documentation updates > > - Writeback documentation updates > > > > 
Driver Changes: > > > > - Convert several drivers to i2c_new_client_device() > > - Drop explicit drm_mode_config_cleanup() calls from drivers > > - Auto-release device structures with drmm_add_final_kfree() > > - Init bfdev console after registering DRM device > > - Make various .debugfs functions return 0 unconditionally; ignore errors > > - video: Use scnprintf() to avoid buffer overflows > > - Convert drivers to simple encoders > > > > - drm/amdgpu: note that we can handle peer2peer DMA-buf > > - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 > > - drm/kirin: Revert change to register connectors > > - drm/lima: Add optional devfreq and cooling device support > > - drm/lima: Various improvements wrt. task handling > > - drm/panel: nt39016: Support multiple modes and 50Hz > > - drm/panel: Support Leadtek LTK050H3146W > > - drm/rockchip: Add support for afbc > > - drm/virtio: Various cleanups > > - drm/hisilicon/hibmc: Enforce 128-byte stride alignment > > - drm/qxl: Fix notify port address of cursor ring buffer > > - drm/sun4i: Improvements to format handling > > - drm/bridge: dw-hdmi: Various improvements > > > > The following changes since commit c2556238120bce8be37670e145226c12870a9e5a: > > > > Merge branch 'feature/staging_sm5' of > > git://people.freedesktop.org/~sroland/linux into drm-next (2020-03-25 > > 15:45:45 +1000) > > > > are available in the Git repository at: > > > > git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-2020-04-14 > > > > for you to fetch changes up to 14d0066b845971db7d0ef03c86fefe4d5bf2: > > > > drm: kirin: Revert change to add register connect helper functions > > (2020-04-13 01:46:02 +) > > > > > > drm-misc-next for 5.8: > > > > UAPI Changes: > > > > - drm: error out with EBUSY when device has existing master > > - drm: rework SET_MASTER and DROP_MASTER perm handling > > > > Cross-subsystem Changes: > > > > - fbdev: savage: fix -Wextra build warning > > - video: omap2: Use scnprintf() for avoiding potential 
buffer overflow > > > > Core Changes: > > > > - Remove drm_pci.h > > - drm_pci_{alloc/free)() are now legacy > > - Introduce managed DRM resourcesA > > - Allow drivers to subclass struct drm_framebuffer > > - Introduce struct drm_afbc_framebuffer and helpers > > - fbdev: remove return value from generic fbdev setup > > - Introduce simple-encoder helper > > - vram-helpers: set fence on plane > > - dp_mst: ACT timeout improvements > > - dp_mst: Remove drm_dp_mst_has_audio() > > - TTM: ttm_trace_dma_{map/unmap}() cleanups > > - dma-buf: add flag for PCIP2P support > > - EDID: Various improvements > > - Encoder: cleanup semantics of possible_clones and possible_crtcs > > - VBLANK documentation updates > > - Writeback documentation updates > > > > Driver Changes: > > > > - Convert several
[Intel-gfx] [PATCH] drm/i915/gt: Make the slice:unslice ratio request explicit for RPS
In RPS, we have the option to only specify the unslice [ring] clock
ratio and for the pcu to derive the slice [gpu] clock ratio from its
magic table. We also have the option to tell the pcu to use our
requested gpu clock ratio, and for it to try and throttle the unslice
and slice ratios separately.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index 4dcfae16a7ce..07321e1b22f6 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -662,14 +662,17 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val)
 	struct drm_i915_private *i915 = rps_to_i915(rps);
 	u32 swreq;
 
-	if (INTEL_GEN(i915) >= 9)
-		swreq = GEN9_FREQUENCY(val);
-	else if (IS_HASWELL(i915) || IS_BROADWELL(i915))
+	if (INTEL_GEN(i915) >= 9) {
+		swreq = 0x2; /* only throttle slice, not unslice */
+		swreq |= val << 14; /* slice [gpu] ratio */
+		swreq |= val << 23; /* unslice [ring] ratio */
+	} else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) {
 		swreq = HSW_FREQUENCY(val);
-	else
+	} else {
 		swreq = (GEN6_FREQUENCY(val) |
 			 GEN6_OFFSET(0) |
 			 GEN6_AGGRESSIVE_TURBO);
+	}
 	set(uncore, GEN6_RPNSWREQ, swreq);
 
 	return 0;
-- 
2.20.1
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Split ctx timestamp selftest into two
Chris Wilson writes:

> Quoting Mika Kuoppala (2020-04-21 14:16:33)
>> @@ -4774,6 +4773,12 @@ static int live_lrc_timestamp(void *arg)
>> 		unsigned long heartbeat;
>> 		int i, err = 0;
>>
>> +		if (rcs && data.engine->class != RENDER_CLASS)
>> +			continue;
>> +
>> +		if (!rcs && data.engine->class == RENDER_CLASS)
>> +			continue;
>
> At least have a bit of finesse and do
> 	if (!(class & BIT(data.engine->engine->class)))

I looked at the engine mask and I knew there must be a better way.
-Mika

> 		continue;
Re: [Intel-gfx] [PATCH] drm/i915/gt: Make the slice:unslice ratio request explicit for RPS
Quoting Chris Wilson (2020-04-21 14:45:12)
> In RPS, we have the option to only specify the unslice [ring] clock
> ratio and for the pcu to derive the slice [gpu] clock ratio from its
> magic table. We also have the option to tell the pcu to use our
> requested gpu clock ratio, and for it to try and throttle the unslice
> and slice ratios separately.
>
> Signed-off-by: Chris Wilson
> Cc: Mika Kuoppala
> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index 4dcfae16a7ce..07321e1b22f6 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -662,14 +662,17 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val)
> 	struct drm_i915_private *i915 = rps_to_i915(rps);
> 	u32 swreq;
>
> -	if (INTEL_GEN(i915) >= 9)
> -		swreq = GEN9_FREQUENCY(val);
> -	else if (IS_HASWELL(i915) || IS_BROADWELL(i915))
> +	if (INTEL_GEN(i915) >= 9) {
> +		swreq = 0x2; /* only throttle slice, not unslice */

0x0 == use implicit slice ratio
0x1 == use explicit slice ratio
0x2 == use separate throttling

Not sure if 0x2 actually was implemented in the end.
-Chris
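The request word built by the patch, together with the three mode values listed in this reply, can be expressed as a small packing helper. This is only a sketch: the shifts (14 for the slice ratio, 23 for the unslice ratio) and the mode values come from the thread, while the enum names and the function name are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Mode values as listed in the reply; the names are made up here. */
enum sketch_rps_mode {
	SKETCH_RPS_IMPLICIT_SLICE    = 0x0, /* pcu derives the slice ratio */
	SKETCH_RPS_EXPLICIT_SLICE    = 0x1, /* use the requested slice ratio */
	SKETCH_RPS_SEPARATE_THROTTLE = 0x2, /* throttle the two separately */
};

/* Pack mode, slice [gpu] ratio and unslice [ring] ratio into one word. */
static uint32_t sketch_rps_swreq(enum sketch_rps_mode mode,
				 uint8_t slice, uint8_t unslice)
{
	assert((unsigned int)mode <= SKETCH_RPS_SEPARATE_THROTTLE);

	return (uint32_t)mode |
	       ((uint32_t)slice << 14) |
	       ((uint32_t)unslice << 23);
}
```

The patch itself passes the same val for both ratios with mode 0x2; keeping the two ratios as separate parameters just makes the field layout explicit.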
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/selftests: Show the full scaling curve on failure (rev2)
== Series Details == Series: drm/i915/selftests: Show the full scaling curve on failure (rev2) URL : https://patchwork.freedesktop.org/series/76260/ State : success == Summary == CI Bug Log - changes from CI_DRM_8343 -> Patchwork_17404 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17404/index.html Known issues Here are the changes found in Patchwork_17404 that come from known issues: ### IGT changes ### Possible fixes * igt@i915_selftest@live@gt_pm: - fi-icl-u2: [DMESG-FAIL][1] -> [PASS][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17404/fi-icl-u2/igt@i915_selftest@live@gt_pm.html * igt@kms_chamelium@dp-edid-read: - fi-kbl-7500u: [FAIL][3] ([i915#976]) -> [PASS][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17404/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html Warnings * igt@i915_pm_rpm@module-reload: - fi-kbl-x1275: [FAIL][5] ([i915#62]) -> [SKIP][6] ([fdo#109271]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-kbl-x1275/igt@i915_pm_...@module-reload.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17404/fi-kbl-x1275/igt@i915_pm_...@module-reload.html [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62 [i915#976]: https://gitlab.freedesktop.org/drm/intel/issues/976 Participating hosts (48 -> 43) -- Additional (2): fi-kbl-7560u fi-bwr-2160 Missing(7): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8343 -> Patchwork_17404 CI-20190529: 20190529 CI_DRM_8343: a5f7098d36b9370b08717c04d894d01c7cb4320b @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: 
a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17404: 593387feb3845295b8806386545bbc4b67c6718a @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 593387feb384 drm/i915/selftests: Show the full scaling curve on failure == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17404/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/gt: Make the slice:unslice ratio request explicit for RPS
Quoting Chris Wilson (2020-04-21 14:50:51) > Quoting Chris Wilson (2020-04-21 14:45:12) > > In RPS, we have the option to only specify the unslice [ring] clock > > ratio and for the pcu to derive the slice [gpu] clock ratio from its > > magic table. We also have the option to tell the pcu to use our > > requested gpu clock ratio, and for it to try and throttle the unslice > > and slice ratios separately. > > > > Signed-off-by: Chris Wilson > > Cc: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/gt/intel_rps.c | 11 +++ > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c > > b/drivers/gpu/drm/i915/gt/intel_rps.c > > index 4dcfae16a7ce..07321e1b22f6 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_rps.c > > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > > @@ -662,14 +662,17 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val) > > struct drm_i915_private *i915 = rps_to_i915(rps); > > u32 swreq; > > > > - if (INTEL_GEN(i915) >= 9) > > - swreq = GEN9_FREQUENCY(val); > > - else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) > > + if (INTEL_GEN(i915) >= 9) { > > + swreq = 0x2; /* only throttle slice, not unslice */ > > 0x0 == use implicit slice ratio > 0x1 == use explicit slice ratio > 0x2 == use separate throttling > > Not sure if 0x2 actually was implemented in the end. That being said, 0x2 seems to be doing better. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/5] drm/i915: Make define for lrc state offset
== Series Details == Series: series starting with [1/5] drm/i915: Make define for lrc state offset URL : https://patchwork.freedesktop.org/series/76262/ State : warning == Summary == $ dim checkpatch origin/drm-tip 52efd944a36d drm/i915: Make define for lrc state offset c968fc0ee7cf drm/i915: Add per ctx batchbuffer wa for timestamp -:51: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV) #51: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:141: +#define MI_LRI_LRM_CS_MMIO (1<<19) ^ -:59: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV) #59: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:158: +#define MI_LRM_ASYNC (1<<21) ^ -:200: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV) #200: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:3256: +(I915_GTT_PAGE_SIZE - 4)/sizeof(*cs)); ^ -:253: CHECK:BRACES: Blank lines aren't necessary before a close brace '}' #253: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:3321: + + } total: 0 errors, 0 warnings, 4 checks, 273 lines checked 2f150abce7d5 drm/i915: Add live selftests for indirect ctx batchbuffers -:52: CHECK:SPACING: spaces preferred around that '*' (ctx:VxV) #52: FILE: drivers/gpu/drm/i915/gt/selftest_lrc.c:5387: +#define CTX_BB_CANARY_OFFSET (3*1024) ^ -:53: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV) #53: FILE: drivers/gpu/drm/i915/gt/selftest_lrc.c:5388: +#define CTX_BB_CANARY_INDEX (CTX_BB_CANARY_OFFSET/sizeof(u32)) ^ -:167: CHECK:BRACES: Blank lines aren't necessary after an open brace '{' #167: FILE: drivers/gpu/drm/i915/gt/selftest_lrc.c:5502: + for_each_engine(engine, gt, id) { + total: 0 errors, 0 warnings, 3 checks, 171 lines checked 7a108f5f4ddb drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL -:23: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses #23: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:517: +#define END(total_state_size) 0, (total_state_size) -:31: CHECK:SPACING: spaces preferred around that '+' (ctx:VxV) #31: FILE: 
drivers/gpu/drm/i915/gt/intel_lrc.c:925: + NOP(3+9+1), ^ -:31: CHECK:SPACING: spaces preferred around that '+' (ctx:VxV) #31: FILE: drivers/gpu/drm/i915/gt/intel_lrc.c:925: + NOP(3+9+1), ^ total: 1 errors, 0 warnings, 2 checks, 118 lines checked c99d6bd66f5b drm/i915: Split ctx timestamp selftest into two ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/gt: Make the slice:unslice ratio request explicit for RPS
Chris Wilson writes: > Quoting Chris Wilson (2020-04-21 14:45:12) >> In RPS, we have the option to only specify the unslice [ring] clock >> ratio and for the pcu to derive the slice [gpu] clock ratio from its >> magic table. We also have the option to tell the pcu to use our >> requested gpu clock ratio, and for it to try and throttle the unslice >> and slice ratios separately. >> >> Signed-off-by: Chris Wilson >> Cc: Mika Kuoppala >> --- >> drivers/gpu/drm/i915/gt/intel_rps.c | 11 +++ >> 1 file changed, 7 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c >> b/drivers/gpu/drm/i915/gt/intel_rps.c >> index 4dcfae16a7ce..07321e1b22f6 100644 >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c >> @@ -662,14 +662,17 @@ static int gen6_rps_set(struct intel_rps *rps, u8 val) >> struct drm_i915_private *i915 = rps_to_i915(rps); >> u32 swreq; >> >> - if (INTEL_GEN(i915) >= 9) >> - swreq = GEN9_FREQUENCY(val); >> - else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) >> + if (INTEL_GEN(i915) >= 9) { >> + swreq = 0x2; /* only throttle slice, not unslice */ > > 0x0 == use implicit slice ratio > 0x1 == use explicit slice ratio > 0x2 == use separate throttling Care to enum/define these and add as parameter to GEN9_FREQUENCY? Also if there is any bspec link, add a reference. Thanks, -Mika > > Not sure if 0x2 actually was implemented in the end. > -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
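As a rough illustration of the enum/define request above, the three mode values listed in the thread could be named as follows. This is only a sketch: the identifiers `rps_slice_ratio_mode`, `RPS_SLICE_RATIO_*` and `rps_swreq_mode()` are hypothetical and do not come from the patch or from i915; only the values 0x0, 0x1 and 0x2 and their descriptions are taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical names for the mode values quoted in the thread.
 * Only the numeric values and their meanings come from the discussion.
 */
enum rps_slice_ratio_mode {
	RPS_SLICE_RATIO_IMPLICIT = 0x0,	/* use implicit slice ratio */
	RPS_SLICE_RATIO_EXPLICIT = 0x1,	/* use explicit slice ratio */
	RPS_SLICE_RATIO_SEPARATE = 0x2,	/* use separate throttling */
};

/* Return the mode bits that would seed the software request word. */
static inline uint32_t rps_swreq_mode(enum rps_slice_ratio_mode mode)
{
	assert((unsigned int)mode <= RPS_SLICE_RATIO_SEPARATE);
	return (uint32_t)mode;
}
```

With names like these, the `swreq = 0x2;` assignment in the patch would read as `swreq = rps_swreq_mode(RPS_SLICE_RATIO_SEPARATE);`, which documents the intent without touching GEN9_FREQUENCY().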
Re: [Intel-gfx] [PATCH] drm/i915/selftests: Show the full scaling curve on failure
Chris Wilson writes: > If we detect that the RPS end points do not scale perfectly, take the > time to measure all the in between values as well. We are aborting the > test, so we might as well spend the available time gathering critical > debug information instead. > > Signed-off-by: Chris Wilson > Cc: Mika Kuoppala > --- > drivers/gpu/drm/i915/gt/selftest_rps.c | 36 ++ > 1 file changed, 36 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c > b/drivers/gpu/drm/i915/gt/selftest_rps.c > index e0a791eac752..f8c416ab8539 100644 > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c > @@ -484,11 +484,29 @@ int live_rps_frequency_cs(void *arg) > if (!scaled_within(max.freq * min.count, > min.freq * max.count, > 2, 3)) { > + int f; > + > pr_err("%s: CS did not scale with frequency! scaled > min:%llu, max:%llu\n", > engine->name, > max.freq * min.count, > min.freq * max.count); > show_pcu_config(rps); > + > + for (f = min.freq + 1; f <= rps->max_freq; f++) { > + int act = f; > + u64 count; > + > + count = measure_cs_frequency_at(rps, engine, > &act); > + if (act < f) > + break; > + No gripes but in here I ponder would you like to break after the info. Reviewed-by: Mika Kuoppala > + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", > + engine->name, > + act, intel_gpu_freq(rps, act), count, > + (int)DIV64_U64_ROUND_CLOSEST(100 * > min.freq * count, > + act * > min.count)); > + } > + > err = -EINVAL; > } > > @@ -593,11 +611,29 @@ int live_rps_frequency_srm(void *arg) > if (!scaled_within(max.freq * min.count, > min.freq * max.count, > 1, 2)) { > + int f; > + > pr_err("%s: CS did not scale with frequency! 
scaled > min:%llu, max:%llu\n", > engine->name, > max.freq * min.count, > min.freq * max.count); > show_pcu_config(rps); > + > + for (f = min.freq + 1; f <= rps->max_freq; f++) { > + int act = f; > + u64 count; > + > + count = measure_frequency_at(rps, cntr, &act); > + if (act < f) > + break; > + > + pr_info("%s: %x:%uMHz: %lluKHz [%d%%]\n", > + engine->name, > + act, intel_gpu_freq(rps, act), count, > + (int)DIV64_U64_ROUND_CLOSEST(100 * > min.freq * count, > + act * > min.count)); > + } > + > err = -EINVAL; > } > > -- > 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
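For anyone reading the bracketed percentage printed by the new pr_info(), here is a minimal userspace sketch of the same arithmetic. It assumes that DIV64_U64_ROUND_CLOSEST() rounds to the nearest integer by adding half the divisor before dividing; `div_round_closest()` and `scaling_percent()` are stand-in names for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's DIV64_U64_ROUND_CLOSEST() */
static uint64_t div_round_closest(uint64_t dividend, uint64_t divisor)
{
	assert(divisor != 0);
	return (dividend + divisor / 2) / divisor;
}

/*
 * Percentage of ideal scaling at frequency step 'act': the measured
 * count per frequency step, relative to the count per step at the
 * minimum frequency. 100 means the count scaled linearly.
 */
static int scaling_percent(uint64_t min_freq, uint64_t min_count,
			   uint64_t act, uint64_t count)
{
	return (int)div_round_closest(100 * min_freq * count,
				      act * min_count);
}
```

For example, doubling the frequency step while the count only rises from 300 to 450 gives 75, so the printed curve shows at which step the response starts to fall away from linear.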
Re: [Intel-gfx] [PATCH 01/59] drm: Add devm_drm_dev_alloc macro
Hi On 21.04.20 at 12:45, Daniel Vetter wrote: > On Mon, Apr 20, 2020 at 3:37 PM Thomas Zimmermann wrote: >> >> Hi >> >> On 15.04.20 at 09:39, Daniel Vetter wrote: >>> Add a new macro helper to combine the usual init sequence in drivers, >>> consisting of a kzalloc + devm_drm_dev_init + drmm_add_final_kfree >>> triplet. This allows us to remove the rather unsightly >>> drmm_add_final_kfree from all currently merged drivers. >>> >>> The kerneldoc is only added for this new function. Existing kerneldoc >>> and examples will be updated at the very end, since once all drivers >>> are converted over to devm_drm_dev_alloc we can unexport a lot of >>> interim functions and make the documentation for driver authors a lot >>> cleaner and less confusing. There will be only one true way to >>> initialize a drm_device at the end of this, which is going to be >>> devm_drm_dev_alloc. >>> >>> v2: >>> - Actually explain what this is for in the commit message (Sam) >>> - Fix checkpatch issues (Sam) >>> >>> Acked-by: Noralf Trønnes >>> Cc: Noralf Trønnes >>> Reviewed-by: Sam Ravnborg >>> Cc: Sam Ravnborg >>> Cc: Paul Kocialkowski >>> Cc: Laurent Pinchart >>> Signed-off-by: Daniel Vetter > > Thanks for taking a look, some questions on your suggestions below. > >> Sorry for being late. A number of nits are listed below.
In any case: >> >> Reviewed-by: Thomas Zimmermann >> >> Best regards >> Thomas >> >>> --- >>> drivers/gpu/drm/drm_drv.c | 23 +++ >>> include/drm/drm_drv.h | 33 + >>> 2 files changed, 56 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c >>> index 1bb4f636b83c..8e1813d2a12e 100644 >>> --- a/drivers/gpu/drm/drm_drv.c >>> +++ b/drivers/gpu/drm/drm_drv.c >>> @@ -739,6 +739,29 @@ int devm_drm_dev_init(struct device *parent, >>> } >>> EXPORT_SYMBOL(devm_drm_dev_init); >>> >>> +void *__devm_drm_dev_alloc(struct device *parent, struct drm_driver >>> *driver, >>> +size_t size, size_t offset) >> >> Maybe rename 'offset' to 'dev_offset' to make the relationship clear. > > Hm, I see the point of this (and the dev_field below, although I'd go > with dev_member there for some consistency with other macros using > offset_of or container_of), but I'm not sure about the dev_ prefix. > Drivers use that sometimes for the struct device *, and usage for > struct drm_device * is also very inconsistent. I've seen ddev, drm, > dev and base (that one only for embedded structs ofc). So not sure > which prefix to pick, aside from dev_ seems the most confusing. Got > ideas? We have pdev for the PCI device, dev for the abstract device, and things like mdev for struct mga_device in mgag200. So I'd go with ddev. I don't like drm, because it could be anything in DRM. I guess struct drm_driver is more 'drm' than struct drm_device. But all of this is bikeshedding. It's probably best to keep the patch as-is, and maybe rename variables later if we ever find consensus on the naming. > >>> +{ >>> + void *container; >>> + struct drm_device *drm; >>> + int ret; >>> + >>> + container = kzalloc(size, GFP_KERNEL); >>> + if (!container) >>> + return ERR_PTR(-ENOMEM); >>> + >>> + drm = container + offset; >> >> While convenient, I somewhat dislike the use of void* variables. I'd use >> unsigned char* for container and do an explicit cast to struct >> drm_device* here.
> > I thought ever since C89 the explicit recommendation for untyped > pointer math has been void *, and no longer char *, with the spec > being explicit that void * pointer math works exactly like char *. So From how I understand the C spec, I think it's the other way around. I had to look up the sections from the C11 spec: Sec 6.5.6 Additive operators 2 For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.) About void it says that it's not a complete type. Sec 6.2.5 Types 19 The void type comprises an empty set of values; it is an incomplete object type that cannot be completed. Arithmetic on void* is a gcc extension AFAIK. > not clear on why you think char * is preferred here. I'm also not > aware of any other kernel code that casts to char * for untyped > pointer math. So unless you have some supporting evidence, I'll skip > this one, ok? I'm really just bikeshedding on things I'd have done differently. Best regards Thomas > > Thanks, Daniel > >>> + ret = devm_drm_dev_init(parent, drm, driver); >>> + if (ret) { >>> + kfree(container); >>> + return ERR_PTR(ret); >>> + } >>> + drmm_add_final_kfree(drm, container); >>> + >>> + return container; >>> +} >>> +EXPORT_SYMBOL(__devm_drm_dev_alloc); >>> + >>> /** >>> * drm_dev_alloc - Allocate new DRM device >>> * @driver: DRM driver to allocate device
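To make the unsigned char * alternative discussed above concrete, here is a small userspace sketch. It is not the kernel function: `fake_drm_device`, `fake_driver_priv` and `alloc_container()` are hypothetical stand-ins, and calloc() takes the place of kzalloc(). It only shows that the offset arithmetic can be written on a complete object type, as the quoted C11 6.5.6 text requires, with an explicit cast to the embedded structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct drm_device */
struct fake_drm_device {
	int registered;
};

/* Hypothetical driver structure embedding the device */
struct fake_driver_priv {
	int id;
	struct fake_drm_device ddev;
};

/*
 * Allocate a zeroed container of 'size' bytes and store a pointer to
 * the device embedded at 'offset' in *out. Returns the container, or
 * NULL on allocation failure.
 */
static void *alloc_container(size_t size, size_t offset,
			     struct fake_drm_device **out)
{
	unsigned char *container;

	assert(offset + sizeof(**out) <= size);

	container = calloc(1, size);
	if (!container)
		return NULL;

	/* unsigned char is a complete object type, so this is ISO C */
	*out = (struct fake_drm_device *)(container + offset);
	return container;
}
```

A caller would pass `sizeof(struct fake_driver_priv)` and `offsetof(struct fake_driver_priv, ddev)`, which mirrors what a wrapper macro around `__devm_drm_dev_alloc()` would presumably compute. With GCC's extension the void * form behaves the same way, so the difference is about portability and taste, as both sides of the thread note.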
Re: [Intel-gfx] [PULL] drm-misc-next
Hi On 21.04.20 at 15:41, Daniel Vetter wrote: > On Tue, Apr 21, 2020 at 2:46 PM Thomas Zimmermann wrote: >> >> Hi Dave, Daniel, >> >> just a friendly reminder to merge these changes. They don't seem to be >> in the upstream tree yet. > > Dave noticed and pinged me on irc that there's some changes in core mm > in this one. That patch is correctly acked by akpm, but stuff like > that should be highlighted in the pull summary. Essentially anything > outside of what's officially maintained by drm.git needs a special > highlight (and double-checking by maintainers that all the right acks > are there). I see. I'll be more careful next time. Best regards Thomas > > I'm assuming Dave is just going to edit the merge commit himself and > doesn't need a new pull request. > -Daniel > >> >> Best regards >> Thomas >> >> On 14.04.20 at 11:07, Thomas Zimmermann wrote: >>> Hi Dave, Daniel, >>> >>> with 5.7-rc1 being tagged, here's the first PR for drm-next-misc for what >>> will become Linux 5.8. >>> >>> Best regards >>> Thomas >>> >>> >>> drm-misc-next-2020-04-14: >>> drm-misc-next for 5.8: >>> >>> UAPI Changes: >>> >>> - drm: error out with EBUSY when device has existing master >>> - drm: rework SET_MASTER and DROP_MASTER perm handling >>> >>> Cross-subsystem Changes: >>> >>> - fbdev: savage: fix -Wextra build warning >>> - video: omap2: Use scnprintf() for avoiding potential buffer overflow >>> >>> Core Changes: >>> >>> - Remove drm_pci.h >>> - drm_pci_{alloc/free}() are now legacy >>> - Introduce managed DRM resources >>> - Allow drivers to subclass struct drm_framebuffer >>> - Introduce struct drm_afbc_framebuffer and helpers >>> - fbdev: remove return value from generic fbdev setup >>> - Introduce simple-encoder helper >>> - vram-helpers: set fence on plane >>> - dp_mst: ACT timeout improvements >>> - dp_mst: Remove drm_dp_mst_has_audio() >>> - TTM: ttm_trace_dma_{map/unmap}() cleanups >>> - dma-buf: add flag for PCIP2P support >>> - EDID: Various improvements >>> -
Encoder: cleanup semantics of possible_clones and possible_crtcs >>> - VBLANK documentation updates >>> - Writeback documentation updates >>> >>> Driver Changes: >>> >>> - Convert several drivers to i2c_new_client_device() >>> - Drop explicit drm_mode_config_cleanup() calls from drivers >>> - Auto-release device structures with drmm_add_final_kfree() >>> - Init fbdev console after registering DRM device >>> - Make various .debugfs functions return 0 unconditionally; ignore errors >>> - video: Use scnprintf() to avoid buffer overflows >>> - Convert drivers to simple encoders >>> >>> - drm/amdgpu: note that we can handle peer2peer DMA-buf >>> - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 >>> - drm/kirin: Revert change to register connectors >>> - drm/lima: Add optional devfreq and cooling device support >>> - drm/lima: Various improvements wrt. task handling >>> - drm/panel: nt39016: Support multiple modes and 50Hz >>> - drm/panel: Support Leadtek LTK050H3146W >>> - drm/rockchip: Add support for afbc >>> - drm/virtio: Various cleanups >>> - drm/hisilicon/hibmc: Enforce 128-byte stride alignment >>> - drm/qxl: Fix notify port address of cursor ring buffer >>> - drm/sun4i: Improvements to format handling >>> - drm/bridge: dw-hdmi: Various improvements >>> >>> The following changes since commit c2556238120bce8be37670e145226c12870a9e5a: >>> >>> Merge branch 'feature/staging_sm5' of >>> git://people.freedesktop.org/~sroland/linux into drm-next (2020-03-25 >>> 15:45:45 +1000) >>> >>> are available in the Git repository at: >>> >>> git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-next-2020-04-14 >>> >>> for you to fetch changes up to 14d0066b845971db7d0ef03c86fefe4d5bf2: >>> >>> drm: kirin: Revert change to add register connect helper functions >>> (2020-04-13 01:46:02 +) >>> >>> >>> drm-misc-next for 5.8: >>> >>> UAPI Changes: >>> >>> - drm: error out with EBUSY when device has existing master >>> - drm: rework SET_MASTER and DROP_MASTER perm
handling >>> >>> Cross-subsystem Changes: >>> >>> - fbdev: savage: fix -Wextra build warning >>> - video: omap2: Use scnprintf() for avoiding potential buffer overflow >>> >>> Core Changes: >>> >>> - Remove drm_pci.h >>> - drm_pci_{alloc/free}() are now legacy >>> - Introduce managed DRM resources >>> - Allow drivers to subclass struct drm_framebuffer >>> - Introduce struct drm_afbc_framebuffer and helpers >>> - fbdev: remove return value from generic fbdev setup >>> - Introduce simple-encoder helper >>> - vram-helpers: set fence on plane >>> - dp_mst: ACT timeout improvements >>> - dp_mst: Remove drm_dp_mst_has_audio() >>> - TTM: ttm_trace_dma_{map/unmap}() cleanups >>> - dma-buf: add flag for PCIP2P support >>> - EDID: Various improvements >>> - Encoder: cleanup semantics of possible_clones a
Re: [Intel-gfx] [PATCH] drm/i915/selftests: Show the full scaling curve on failure
Quoting Mika Kuoppala (2020-04-21 15:00:08) > Chris Wilson writes: > > > If we detect that the RPS end points do not scale perfectly, take the > > time to measure all the in between values as well. We are aborting the > > test, so we might as well spend the available time gathering critical > > debug information instead. > > > > Signed-off-by: Chris Wilson > > Cc: Mika Kuoppala > > --- > > drivers/gpu/drm/i915/gt/selftest_rps.c | 36 ++ > > 1 file changed, 36 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c > > b/drivers/gpu/drm/i915/gt/selftest_rps.c > > index e0a791eac752..f8c416ab8539 100644 > > --- a/drivers/gpu/drm/i915/gt/selftest_rps.c > > +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c > > @@ -484,11 +484,29 @@ int live_rps_frequency_cs(void *arg) > > if (!scaled_within(max.freq * min.count, > > min.freq * max.count, > > 2, 3)) { > > + int f; > > + > > pr_err("%s: CS did not scale with frequency! scaled > > min:%llu, max:%llu\n", > > engine->name, > > max.freq * min.count, > > min.freq * max.count); > > show_pcu_config(rps); > > + > > + for (f = min.freq + 1; f <= rps->max_freq; f++) { > > + int act = f; > > + u64 count; > > + > > + count = measure_cs_frequency_at(rps, engine, > > &act); > > + if (act < f) > > + break; > > + > > No gripes but in here I ponder would you like to break after the info. It just means we've repeated ourselves. So meh, it could be useful it could be noise. This is just extra info and interesting point is the curve, so it's not really critical if we skip a repeated line. Or so I believe. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/gt: Make the slice:unslice ratio request explicit for RPS
Quoting Mika Kuoppala (2020-04-21 14:56:46) > Chris Wilson writes: > > > Quoting Chris Wilson (2020-04-21 14:45:12) > >> In RPS, we have the option to only specify the unslice [ring] clock > >> ratio and for the pcu to derive the slice [gpu] clock ratio from its > >> magic table. We also have the option to tell the pcu to use our > >> requested gpu clock ratio, and for it to try and throttle the unslice > >> and slice ratios separately. > >> > >> Signed-off-by: Chris Wilson > >> Cc: Mika Kuoppala > >> --- > >> drivers/gpu/drm/i915/gt/intel_rps.c | 11 +++ > >> 1 file changed, 7 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c > >> b/drivers/gpu/drm/i915/gt/intel_rps.c > >> index 4dcfae16a7ce..07321e1b22f6 100644 > >> --- a/drivers/gpu/drm/i915/gt/intel_rps.c > >> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > >> @@ -662,14 +662,17 @@ static int gen6_rps_set(struct intel_rps *rps, u8 > >> val) > >> struct drm_i915_private *i915 = rps_to_i915(rps); > >> u32 swreq; > >> > >> - if (INTEL_GEN(i915) >= 9) > >> - swreq = GEN9_FREQUENCY(val); > >> - else if (IS_HASWELL(i915) || IS_BROADWELL(i915)) > >> + if (INTEL_GEN(i915) >= 9) { > >> + swreq = 0x2; /* only throttle slice, not unslice */ > > > > 0x0 == use implicit slice ratio > > 0x1 == use explicit slice ratio > > 0x2 == use separate throttling > > Care to enum/define these and add as parameter to GEN9_FREQUENCY? It would not be a parameter to GEN9_FREQUENCY as that gets used elsewhere. You know my opinion on single use magic macros, only useful for obfuscating code. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/5] drm/i915: Make define for lrc state offset
== Series Details == Series: series starting with [1/5] drm/i915: Make define for lrc state offset URL : https://patchwork.freedesktop.org/series/76262/ State : failure == Summary == CI Bug Log - changes from CI_DRM_8343 -> Patchwork_17405 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_17405 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_17405, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_17405: ### IGT changes ### Possible regressions * igt@i915_selftest@live@workarounds: - fi-byt-j1900: [PASS][1] -> [FAIL][2] +1 similar issue [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-byt-j1900/igt@i915_selftest@l...@workarounds.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/fi-byt-j1900/igt@i915_selftest@l...@workarounds.html Known issues Here are the changes found in Patchwork_17405 that come from known issues: ### IGT changes ### Possible fixes * igt@i915_selftest@live@gt_pm: - fi-apl-guc: [DMESG-FAIL][3] ([i915#1751]) -> [PASS][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-apl-guc/igt@i915_selftest@live@gt_pm.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/fi-apl-guc/igt@i915_selftest@live@gt_pm.html * igt@kms_chamelium@dp-edid-read: - fi-kbl-7500u: [FAIL][5] ([i915#976]) -> [PASS][6] [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/fi-kbl-7500u/igt@kms_chamel...@dp-edid-read.html Warnings * igt@i915_pm_rpm@module-reload: - fi-kbl-x1275: [FAIL][7] ([i915#62]) -> [SKIP][8] ([fdo#109271]) [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-kbl-x1275/igt@i915_pm_...@module-reload.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/fi-kbl-x1275/igt@i915_pm_...@module-reload.html * igt@i915_selftest@live@gt_pm: - fi-icl-u2: [DMESG-FAIL][9] -> [DMESG-FAIL][10] ([i915#1754]) [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8343/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/fi-icl-u2/igt@i915_selftest@live@gt_pm.html [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [i915#1751]: https://gitlab.freedesktop.org/drm/intel/issues/1751 [i915#1754]: https://gitlab.freedesktop.org/drm/intel/issues/1754 [i915#62]: https://gitlab.freedesktop.org/drm/intel/issues/62 [i915#976]: https://gitlab.freedesktop.org/drm/intel/issues/976 Participating hosts (48 -> 43) -- Additional (2): fi-kbl-7560u fi-bwr-2160 Missing(7): fi-cml-u2 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-byt-clapper fi-bdw-samus Build changes - * CI: CI-20190529 -> None * Linux: CI_DRM_8343 -> Patchwork_17405 CI-20190529: 20190529 CI_DRM_8343: a5f7098d36b9370b08717c04d894d01c7cb4320b @ git://anongit.freedesktop.org/gfx-ci/linux IGT_5602: a8fcccd15dcc2dd409edd23785a2d6f6e85fb682 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_17405: c99d6bd66f5b1b45e20e46b04c1b35b38aaf7af3 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == c99d6bd66f5b drm/i915: Split ctx timestamp selftest into two 7a108f5f4ddb drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTL 2f150abce7d5 drm/i915: Add live selftests for indirect ctx batchbuffers c968fc0ee7cf drm/i915: Add per ctx batchbuffer wa for timestamp 52efd944a36d drm/i915: Make define for lrc state offset == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17405/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915/selftests: Disable C-states when measuring RPS frequency response
Let's isolate the impact of cpu frequency selection on determining the GPU throughput in response to selection of RPS frequencies. For real systems, we do have to be concerned with the impact of integrating c-states, p-states and rp-states, but for the sake of proving whether or not RPS works, one baby step at a time. Signed-off-by: Chris Wilson Cc: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_rps.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/selftest_rps.c b/drivers/gpu/drm/i915/gt/selftest_rps.c index 395265121e43..e2afc2003caa 100644 --- a/drivers/gpu/drm/i915/gt/selftest_rps.c +++ b/drivers/gpu/drm/i915/gt/selftest_rps.c @@ -3,6 +3,7 @@ * Copyright © 2020 Intel Corporation */ +#include #include #include "intel_engine_pm.h" @@ -14,6 +15,9 @@ #include "selftests/igt_spinner.h" #include "selftests/librapl.h" +/* Try to isolate the impact of cstates from determining frequency response */ +#define CPU_LATENCY 0 /* -1 to disable pm_qos, 0 to disable cstates */ + static void dummy_rps_work(struct work_struct *wrk) { } @@ -406,6 +410,7 @@ int live_rps_frequency_cs(void *arg) struct intel_gt *gt = arg; struct intel_rps *rps = &gt->rps; struct intel_engine_cs *engine; + struct pm_qos_request qos; enum intel_engine_id id; int err = 0; @@ -421,6 +426,9 @@ int live_rps_frequency_cs(void *arg) if (INTEL_GEN(gt->i915) < 8) /* for CS simplicity */ return 0; + if (CPU_LATENCY >= 0) + cpu_latency_qos_add_request(&qos, CPU_LATENCY); + intel_gt_pm_wait_for_idle(gt); saved_work = rps->work.func; rps->work.func = dummy_rps_work; @@ -527,6 +535,9 @@ int live_rps_frequency_cs(void *arg) intel_gt_pm_wait_for_idle(gt); rps->work.func = saved_work; + if (CPU_LATENCY >= 0) + cpu_latency_qos_remove_request(&qos); + return err; } @@ -536,6 +547,7 @@ int live_rps_frequency_srm(void *arg) struct intel_gt *gt = arg; struct intel_rps *rps = &gt->rps; struct intel_engine_cs *engine; + struct pm_qos_request qos; enum intel_engine_id id; int err = 0; @@ -551,6 
+563,9 @@ int live_rps_frequency_srm(void *arg) if (INTEL_GEN(gt->i915) < 8) /* for CS simplicity */ return 0; + if (CPU_LATENCY >= 0) + cpu_latency_qos_add_request(&qos, CPU_LATENCY); + intel_gt_pm_wait_for_idle(gt); saved_work = rps->work.func; rps->work.func = dummy_rps_work; @@ -656,6 +671,9 @@ int live_rps_frequency_srm(void *arg) intel_gt_pm_wait_for_idle(gt); rps->work.func = saved_work; + if (CPU_LATENCY >= 0) + cpu_latency_qos_remove_request(&qos); + return err; } -- 2.20.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/5] drm/i915: Split ctx timestamp selftest into two
We use different workarounds for render engine than for other engines. Split the selftest according to these types so that we get error rates per workaround. Signed-off-by: Mika Kuoppala --- drivers/gpu/drm/i915/gt/selftest_lrc.c | 23 --- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c index 32c4096b627b..dd260496876c 100644 --- a/drivers/gpu/drm/i915/gt/selftest_lrc.c +++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c @@ -4749,10 +4749,9 @@ static int __lrc_timestamp(const struct lrc_timestamp *arg, bool preempt) return err; } -static int live_lrc_timestamp(void *arg) +static int __live_lrc_timestamp(struct intel_gt *gt, unsigned long class_filter) { struct lrc_timestamp data = {}; - struct intel_gt *gt = arg; enum intel_engine_id id; const u32 poison[] = { 0, @@ -4774,6 +4773,9 @@ static int live_lrc_timestamp(void *arg) unsigned long heartbeat; int i, err = 0; + if (!(class_filter & BIT(data.engine->class))) + continue; + engine_heartbeat_disable(data.engine, &heartbeat); for (i = 0; i < ARRAY_SIZE(data.ce); i++) { @@ -4825,6 +4827,20 @@ static int live_lrc_timestamp(void *arg) return 0; } +static int live_lrc_timestamp_rcs(void *arg) +{ + struct intel_gt *gt = arg; + + return __live_lrc_timestamp(gt, BIT(RENDER_CLASS)); +} + +static int live_lrc_timestamp_xcs(void *arg) +{ + struct intel_gt *gt = arg; + + return __live_lrc_timestamp(gt, ~BIT(RENDER_CLASS)); +} + static struct i915_vma * create_user_vma(struct i915_address_space *vm, unsigned long size) { @@ -5748,7 +5764,8 @@ int intel_lrc_live_selftests(struct drm_i915_private *i915) SUBTEST(live_lrc_state), SUBTEST(live_lrc_gpr), SUBTEST(live_lrc_indirect_ctx_bb), - SUBTEST(live_lrc_timestamp), + SUBTEST(live_lrc_timestamp_rcs), + SUBTEST(live_lrc_timestamp_xcs), SUBTEST(live_lrc_garbage), SUBTEST(live_pphwsp_runtime), SUBTEST(live_lrc_isolation), -- 2.17.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org 
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
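The class_filter test in the patch above can be tried out in userspace with a small sketch like the one below. The `BIT()` macro and the `engine_class` enum are redefined locally for illustration; the numbering here is an assumption made for the example and should not be read as the authoritative i915 definitions. `class_selected()` is a hypothetical helper that mirrors the `class_filter & BIT(data.engine->class)` check.

```c
#include <assert.h>
#include <stdbool.h>

#define BIT(n) (1UL << (n))

/* Illustrative class numbering, assumed for this sketch only */
enum engine_class {
	RENDER_CLASS = 0,
	VIDEO_DECODE_CLASS = 1,
	VIDEO_ENHANCEMENT_CLASS = 2,
	COPY_ENGINE_CLASS = 3,
};

/* True if engines of class 'cls' are selected by 'class_filter'. */
static bool class_selected(unsigned long class_filter, enum engine_class cls)
{
	assert((unsigned int)cls < 8 * sizeof(class_filter));
	return (class_filter & BIT(cls)) != 0;
}
```

Passing `BIT(RENDER_CLASS)` selects only the render engine, while `~BIT(RENDER_CLASS)` selects every other class, which is how the rcs and xcs subtests partition the engines between them.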