Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio
On Wed, 11 Jan 2017 08:39:13 +0100,
Daniel Vetter wrote:
>
> On Tue, Jan 10, 2017 at 9:49 AM, Takashi Iwai wrote:
> > On Tue, 10 Jan 2017 09:45:31 +0100,
> >> >-----Original Message-----
> >> >From: Takashi Iwai [mailto:ti...@suse.de]
> >> >Sent: Tuesday, January 10, 2017 4:19 PM
> >> >To: Yang, Libin
> >> >Cc: Daniel Vetter; intel-gfx; Nikula, Jani;
> >> >alsa-de...@alsa-project.org; Lin, Mengdong
> >> >Subject: Re: [alsa-devel] [PATCH v4 0/3] support DP MST audio
> >> >
> >> >On Mon, 09 Jan 2017 07:22:55 +0100,
> >> >Yang, Libin wrote:
> >> >>
> >> >> Hi Takashi,
> >> >>
> >> >> It seems the patches for DP MST in gfx is not merged into Linus branch.
> >> >>
> >> >> Do we have plan to merge gfx branch manually and review the patches for
> >> >> audio? Or we will wait the DP MST patches for i915 merged into Linus
> >> >> branch?
> >> >
> >> >Sorry, this was delayed due to the vacation.
> >> >Now I applied these three patches to topic/hda-dp-mst branch based on
> >> >4.10-rc2, and it was merged to for-next branch.
> >> >
> >> >  git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git topic/hda-dp-mst
> >>
> >> Thanks. These audio patches are based on the i915 dp mst patches. Without
> >> I915 driver patches support, it will make errors.
> >
> > Ah yeah, I forgot it.  So I removed it from for-next branch, but
> > topic/hda-dp-mst branch is kept so that it can be merged to i915
> > tree.
> >
> > Let me know if there is any i915 branch I can pull into sound tree.
>
> Aw, I didn't know that the dependency goes this way round, the dp mst
> patches (I'm not even sure which ones they are) on the i915 are just
> on the general pile. So not anywhere near a place where I can make a
> topic branch.
>
> I guess what needs to be done now is a cherry-picked list of just the
> patches we need, on top of -rc3, that I can then pull into
> drm-intel.git plus send a pull request for that to Takashi. That means
> the patches are twice in drm-intel.git, but if we cherry-pick
> reference them correctly then that should be all ok and can't really
> be helped.
>
> Or we just delay the audio side for 4.12, dp mst audio support is 4
> years late anyway, so one more release won't hurt that much ;-)

Well, thinking of the amount of patches, I guess we can do other way
round: basically it's fine to apply Libin's latest patches to drm tree
for 4.11, if it makes things easier.  I don't think we'll have a big
conflict with these changes for others during 4.11 development.  If
any, I can pull some of stable point from drm tree.

Does it work for you?  If yes, feel free to apply these three sound
patches to drm or i915 tree with my ack.

Reviewed-by: Takashi Iwai

thanks,

Takashi
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 4/6] drm/dp: Introduce DP MST topology manager state to track DP link bw
On Sat, Jan 07, 2017 at 12:35:36AM +, Pandiyan, Dhinakaran wrote:
> On Thu, 2017-01-05 at 09:24 +0100, Daniel Vetter wrote:
> > On Thu, Jan 05, 2017 at 03:54:54AM +, Pandiyan, Dhinakaran wrote:
> > > On Wed, 2017-01-04 at 19:20 +, Pandiyan, Dhinakaran wrote:
> > > > On Wed, 2017-01-04 at 10:33 +0100, Daniel Vetter wrote:
> > > > > On Tue, Jan 03, 2017 at 01:01:49PM -0800, Dhinakaran Pandiyan wrote:
> > > > > > Link bandwidth is shared between multiple display streams in DP MST
> > > > > > configurations. The DP MST topology manager structure maintains the
> > > > > > shared link bandwidth for a primary link directly connected to the
> > > > > > GPU. For atomic modesetting drivers, checking if there is sufficient
> > > > > > link bandwidth for a mode needs to be done during the atomic_check
> > > > > > phase to avoid failed modesets. Let's encapsulate the available link
> > > > > > bw information in a state structure so that bw can be allocated and
> > > > > > released atomically for each of the ports sharing the primary link.
> > > > > >
> > > > > > Signed-off-by: Dhinakaran Pandiyan
> > > > >
> > > > > Overall issue with the patch is that dp helpers now have 2 places where
> > > > > available_slots is stored: One for atomic drivers in ->state, and the
> > > > > legacy one. I think it'd be good to rework the legacy codepaths (i.e.
> > > > > drm_dp_find_vcpi_slots) to use mgr->state->avail_slots, and remove
> > > > > mgr->avail_slots entirely.
> > > >
> > > > PATCH 2/6 does that. mgr->avail_slots is not updated in the legacy code
> > > > path, so the check turns out to be against mgr->total_slots. So, I did
> > > > just that, albeit explicitly.

Ah right, I missed that.

> > > > > > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> > > > > > index fd2d971..7ac5ed6 100644
> > > > > > --- a/include/drm/drm_atomic.h
> > > > > > +++ b/include/drm/drm_atomic.h
> > > > > > @@ -153,6 +153,11 @@ struct __drm_connnectors_state {
> > > > > >  	struct drm_connector_state *state;
> > > > > >  };
> > > > > >
> > > > > > +struct __drm_dp_mst_topology_state {
> > > > > > +	struct drm_dp_mst_topology_mgr *ptr;
> > > > > > +	struct drm_dp_mst_topology_state *state;
> > > > >
> > > > > One way to fix that control inversion I mentioned above is to use void*
> > > > > pointers here, and then have callbacks for atomic_destroy and swap_state
> > > > > on top. A bit more shuffling, but we could then use that for other
> > > > > driver private objects.
> > > > >
> > > > > Other option is to stuff it into intel_atomic_state.
> > >
> > > Hmm... I think I understand what you are saying. The core atomic
> > > functions like swap_state should not be able to alter the topology
> > > manager's current state?
> > >
> > > Did you mean something like this - https://paste.ubuntu.com/23743485/ ?
> >
> > Not quite yet, here's what I had in mind as a sketch:
> >
> > diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
> > index 2e28fdca9c3d..6ce704b1e900 100644
> > --- a/include/drm/drm_atomic.h
> > +++ b/include/drm/drm_atomic.h
> > @@ -154,6 +154,17 @@ struct __drm_connnectors_state {
> >  	struct drm_connector_state *state;
> >  };
> >
> > +struct drm_private_state_funcs {
> > +	void (*swap)(void *obj, void *state);
> > +	void (*destroy_state)(void *state);
> > +};
> > +
> > +struct __drm_private_obj_state {
> > +	struct obj *ptr;
> > +	struct obj_state *state;
>
> Thanks for the sketch Daniel, I have a couple of questions.
> Should this be void *obj and void *obj_state?

Yes :-)

> > +	struct drm_private_state_funcs *funcs;
> > +}
> > +
> >  /**
> >   * struct drm_atomic_state - the global state object for atomic updates
> >   * @ref: count of all references to this state (will not be freed until zero)
> > @@ -178,6 +189,8 @@ struct drm_atomic_state {
> >  	struct __drm_crtcs_state *crtcs;
> >  	int num_connector;
> >  	struct __drm_connnectors_state *connectors;
> > +	int num_private_objs;
> > +	struct __drm_private_obj_state *private_objs;
> >
> >  	struct drm_modeset_acquire_ctx *acquire_ctx;
> >
> > @@ -414,6 +427,19 @@ void drm_state_dump(struct drm_device *dev, struct drm_printer *p);
> >  	     (__i)++)							\
> >  		for_each_if (plane_state)
> >
> > +/* The magic here is that if obj and obj_state have the right type, then this
> > + * will automatically cast to the right type. Since we allow any kind of private
> > + * object mixed into the same array, runtime type casting is done using the
> > + * funcs pointer.
> > + */
> > +#define for_each_private_obj(__state, obj, obj_state, __i, funcs)
> > +	for ((__i) = 0; \
> > +	     (__i) < (__state)->num_private_objs &&
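For readers following the thread, the void-pointer idea discussed in the message above can be tried out in plain user-space C. This is only an illustrative sketch under stated assumptions, not the DRM implementation: every sketch_* name is hypothetical, the "manager" object is a stand-in with a single field, and the swap callback takes a void ** (rather than the void * in the quoted sketch) so that it can hand the old state back to the caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Callbacks supplied by the owner of a private object (hypothetical names). */
struct sketch_private_state_funcs {
	void (*swap)(void *obj, void **state);
	void (*destroy_state)(void *state);
};

/* One slot of the generic array: the core only sees opaque pointers. */
struct sketch_private_obj_state {
	void *obj;
	void *obj_state;
	const struct sketch_private_state_funcs *funcs;
};

/* Stand-in for a driver-private object with a slot budget. */
struct sketch_mgr_state {
	int avail_slots;
};

struct sketch_mgr {
	struct sketch_mgr_state *state;
};

/* Install the new state on the object and hand the old one back. */
static void sketch_mgr_swap(void *obj, void **state)
{
	struct sketch_mgr *mgr = obj;
	struct sketch_mgr_state *old = mgr->state;

	mgr->state = *state;
	*state = old;
}

static void sketch_mgr_destroy_state(void *state)
{
	free(state);
}

static const struct sketch_private_state_funcs sketch_mgr_funcs = {
	.swap = sketch_mgr_swap,
	.destroy_state = sketch_mgr_destroy_state,
};

/* Generic commit step: swap every new state in without knowing its type. */
static void sketch_swap_private_objs(struct sketch_private_obj_state *objs,
				     int num)
{
	int i;

	assert(objs != NULL || num == 0);
	for (i = 0; i < num; i++)
		objs[i].funcs->swap(objs[i].obj, &objs[i].obj_state);
}

/* Generic cleanup step: destroy whatever state is left in the array. */
static void sketch_clear_private_objs(struct sketch_private_obj_state *objs,
				      int num)
{
	int i;

	for (i = 0; i < num; i++) {
		objs[i].funcs->destroy_state(objs[i].obj_state);
		objs[i].obj_state = NULL;
	}
}
```

The two generic loops never look inside obj or obj_state; only the callbacks cast them back to a concrete type, which is the point of routing everything through the funcs pointer.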
Re: [Intel-gfx] [PATCH] [RFC i-g-t] Test Design to verify mipi enable/disable sequence.
On Mon, Jan 09, 2017 at 11:00:02AM +0200, Jani Nikula wrote:
> On Sat, 07 Jan 2017, Yadav Jyoti wrote:
> > From: Jenkins Val
> >
>
> This place here is for the commit message, where you should explain
> *why* we need this change.
>
> Where do you get the XML file? Do you write it manually? How do you
> manage them? The kernel will execute the sequences from the VBT, not
> from your XML file, so you'll have a problem of maintaining XML files
> for each machine you ever run this test on.
>
> I'm also not thrilled about adding special debug messages that the test
> depends on finding in dmesg. The test also doesn't actually do anything
> to cause the sequences to be run, so you expect some other, undefined
> tests to have been run, the dmesg from that run captured, and saved to a
> file that you feed to this test.
>
> I think the design is rather fragile.

Also, igt are black-box testcases, they should not assume any specific
implementation. Every time we break that, we are adding api (even if it's
just for tests in debugfs), and that means coordination issues. On top of
that Chris is building up a neat selftest infrastructure which helps to
cover anything which cannot easily be tested using a blackbox approach.

Furthermore writing the same stuff twice (like the xml and vbt sequence this
test seems to rely on) isn't validation, it's just typing stuff twice. Real
validation tries to verify (preferably orthogonal) invariants, or at least
entirely independent approaches to the implementation. Another similar case
was the color manager testcase, which did not check functional outcome using
crc, but instead checked that the kernel wrote the right register values in
the right places. That's not independent validation, and hence not really
useful as a testcase.

If you want to validate dsi, then either there needs to be some indication
from the sink (on edp we have sink crcs and status flags) that things went
well, or we need a special testing board like chamelium (but that doesn't do
dsi unfortunately). Everything else is already covered by the generic
modeset testcases and the kernel's selftest.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> disabling an L3SQ optimization that has huge performance implications
> and is unlikely to be necessary for the correct functioning of usual
> graphic workloads.  Userspace is free to re-enable the workaround on
> demand, and is generally in a better position to determine whether the
> workaround is necessary than the DRM is (e.g. only during the
> execution of compute kernels that rely on both L3 fences and HDC R/W
> requests).
>
> The same workaround seems to apply to BDW (at least to production
> stepping G1) and SKL as well (the internal workaround database claims
> that it does for all steppings, while the BSpec workaround table only
> mentions pre-production steppings), but the DRM doesn't do anything
> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> when it sees fit.  Do the same on KBL platforms.
>
> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> This is followed by a regression of 35% and 10% respectively for the
> same benchmarks and platform caused by my recent patch series
> switching userspace to use the dataport constant cache instead of the
> sampler to implement uniform pull constant loads, which caused us to
> hit more heavily the L3 cache (and on platforms other than KBL had the
> opposite effect of improving performance of the same two benchmarks).
> The overall effect on KBL of this change combined with the recent
> userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
> was affected by the constant cache changes (though it improved as it
> did on other platforms rather than regressing), but is not
> significantly affected by this patch (with statistical significance of
> 5% and sample size 20).
>
> v2: Drop some more code to avoid unused variable warning.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
> Signed-off-by: Francisco Jerez
> Cc: Eero Tamminen
> Cc: Jani Nikula
> Cc: Mika Kuoppala
> Cc: beig...@lists.freedesktop.org

Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
for compute kernels? Are the patches for mesa compute/beignet
ready&reviewed?
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 10 ----------
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  8 --------
>  2 files changed, 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 6db246a..656e0a3 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -970,18 +970,8 @@ static inline int gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine,
>  						uint32_t *batch,
>  						uint32_t index)
>  {
> -	struct drm_i915_private *dev_priv = engine->i915;
>  	uint32_t l3sqc4_flush = (0x4040 | GEN8_LQSC_FLUSH_COHERENT_LINES);
>
> -	/*
> -	 * WaDisableLSQCROPERFforOCL:kbl
> -	 * This WA is implemented in skl_init_clock_gating() but since
> -	 * this batch updates GEN8_L3SQCREG4 with default value we need to
> -	 * set this bit here to retain the WA during flush.
> -	 */
> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
> -		l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS;
> -
>  	wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
>  				   MI_SRM_LRM_GLOBAL_GTT));
>  	wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 0971ac3..7cb2ab4 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1095,14 +1095,6 @@ static int kbl_init_workarounds(struct intel_engine_cs *engine)
>  		WA_SET_BIT_MASKED(HDC_CHICKEN0,
>  				  HDC_FENCE_DEST_SLM_DISABLE);
>
> -	/* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
> -	 * involving this register should also be added to WA batch as required.
> -	 */
> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
> -		/* WaDisableLSQCROPERFforOCL:kbl */
> -		I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
> -			   GEN8_LQSC_RO_PERF_DIS);
> -
>  	/* WaToEnableHwFixForPushConstHWBug:kbl */
>  	if (IS_KBL_REVID(dev_priv, KBL_REVID_C0, REVID_FOREVER))
>  		WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
> --
> 2.10.2

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio
On Wed, Jan 11, 2017 at 09:00:27AM +0100, Takashi Iwai wrote:
> Well, thinking of the amount of patches, I guess we can do other way
> round: basically it's fine to apply Libin's latest patches to drm tree
> for 4.11, if it makes things easier.  I don't think we'll have a big
> conflict with these changes for others during 4.11 development.  If
> any, I can pull some of stable point from drm tree.
>
> Does it work for you?  If yes, feel free to apply these three sound
> patches to drm or i915 tree with my ack.
>
> Reviewed-by: Takashi Iwai

Works for me too. Libin, can you pls resend (somehow this thread
disconnected from the patches for me), with Takashi's r-b + ack for merging
through drm-intel?

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [alsa-devel] [PATCH v4 0/3] support DP MST audio
Hi Daniel,

OK, I will resend the patches tomorrow. Thanks.

Hi Takashi,

In case you still need the patches for i915, they are on
git://anongit.freedesktop.org/drm-tip:drm-tip

My patches are:

commit 9a148a96fc3a654ddcf142a7ab7db37b972ba5d8
    drm/i915/debugfs: add dp mst info
commit 9935f7fa2854355203e3976762eecfb218079aac
    drm/i915: abstract ddi being audio enabled
commit 7f9e77545b92bcb894b8e2be5646535e8ba8da9e
    drm/i915: enable dp mst audio
commit 31613268c0a6f7abdb0c19487a084249bcf203ba
    drm/i915/audio: extend get_saved_enc() to support more scenarios
commit f55d23be11ed15f493957246f3b81fc530e79d70
    drm/i915/audio: extend audio sync rate support for DP MST

And you may still need the patches in gfx to fix the flicker issue, which
Dhinakaran can help with.

Regards,
Libin

>-----Original Message-----
>From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel Vetter
>Sent: Wednesday, January 11, 2017 4:35 PM
>To: Takashi Iwai
>Cc: Daniel Vetter; Yang, Libin; intel-gfx; Nikula, Jani;
>alsa-de...@alsa-project.org; Lin, Mengdong
>Subject: Re: [alsa-devel] [PATCH v4 0/3] support DP MST audio
>
>Works for me too. Libin, can you pls resend (somehow this thread
>disconnected from the patches for me), with Takashi's r-b + ack for merging
>through drm-intel?
>
>Thanks, Daniel
>--
>Daniel Vetter
>Software Engineer, Intel Corporation
>http://blog.ffwll.ch
[Intel-gfx] [PATCH] drm/i915/bxt: Add MST support when do DPLL calculation
From: "Lee, Shawn C"

A kernel oops was triggered when a DP MST monitor/hub was connected.
The DP MST patch series is already upstream, so MST should be
supported here as well. With this change an MST monitor displays
normally on the bxt platform.

Cc: Jani Nikula
Reviewed-by: Cooper Chiou
Reviewed-by: Gary C Wang
Reviewed-by: Ciobanu, Nathan D
Reviewed-by: Herbert, Marc
Reviewed-by: Sripada, Radhakrishna
Signed-off-by: Shawn Lee
---
 drivers/gpu/drm/i915/intel_dpll_mgr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index c92a2558beb4..1a1d99d266ed 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -1855,7 +1855,8 @@ bool bxt_ddi_dp_set_dpll_hw_state(int clock,
 		return NULL;

 	if ((encoder->type == INTEL_OUTPUT_DP ||
-	     encoder->type == INTEL_OUTPUT_EDP) &&
+	     encoder->type == INTEL_OUTPUT_EDP ||
+	     encoder->type == INTEL_OUTPUT_DP_MST ) &&
 	    !bxt_ddi_dp_set_dpll_hw_state(clock, &dpll_hw_state))
 		return NULL;

-- 
1.7.9.5
[Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow
From: Tvrtko Ursulin

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coalescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v2)
---
 lib/scatterlist.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..4fc54801cd29 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
-	unsigned int chunks;
+	const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
+	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	seg_len = PAGE_SIZE;
+	for (i = 1; i < n_pages; ++i) {
+		if (seg_len >= max_segment ||
+		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
 			++chunks;
+			seg_len = PAGE_SIZE;
+		} else {
+			seg_len += PAGE_SIZE;
+		}
+	}

 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
 	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-		unsigned long chunk_size;
+		unsigned int chunk_size;
 		unsigned int j;

 		/* look for the end of the current chunk */
+		seg_len = PAGE_SIZE;
 		for (j = cur_page + 1; j < n_pages; ++j)
-			if (page_to_pfn(pages[j]) !=
+			if (seg_len >= max_segment ||
+			    page_to_pfn(pages[j]) !=
 			    page_to_pfn(pages[j - 1]) + 1)
 				break;
+			else
+				seg_len += PAGE_SIZE;

 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+		sg_set_page(s, pages[cur_page],
+			    min_t(unsigned long, size, chunk_size), offset);
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
-- 
2.7.4
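As a reading aid, the chunk-counting loop from the patch above can be exercised outside the kernel. The following is a minimal user-space sketch, assuming a 4096-byte page and taking page frame numbers directly as an array; the sketch_* names and SKETCH_PAGE_SIZE are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Count how many scatterlist entries would be needed for n_pages pages.
 * A new entry starts when the next page is not physically contiguous, or
 * when the current entry has already grown to max_segment bytes.
 */
static unsigned int sketch_count_chunks(const unsigned long *pfns,
					size_t n_pages,
					unsigned int max_segment)
{
	unsigned int chunks = 1;
	unsigned int seg_len = SKETCH_PAGE_SIZE;
	size_t i;

	/* Same precondition as the patch: a page-aligned, non-zero limit. */
	assert(max_segment >= SKETCH_PAGE_SIZE);
	assert(max_segment % SKETCH_PAGE_SIZE == 0);

	if (n_pages == 0)
		return 0;

	for (i = 1; i < n_pages; ++i) {
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			++chunks;
			seg_len = SKETCH_PAGE_SIZE;
		} else {
			seg_len += SKETCH_PAGE_SIZE;
		}
	}

	return chunks;
}
```

Because seg_len is compared against the limit before it is incremented, it can never grow past max_segment, so with a page-aligned limit no entry length can wrap around an unsigned int.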
[Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
From: Tvrtko Ursulin

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets within a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: Pawel Osciak
Cc: Marek Szyprowski
Cc: Kyungmin Park
Cc: Tomasz Stanislawski
Cc: Matt Porter
Cc: Alexandre Bounine
Cc: linux-me...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Marek Szyprowski (v1)
Reviewed-by: Chris Wilson
Reviewed-by: Mauro Carvalho Chehab
---
 drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
 drivers/rapidio/devices/rio_mport_cdev.c       | 4 ++--
 include/linux/scatterlist.h                    | 2 +-
 lib/scatterlist.c                              | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 {
 	struct vb2_dc_buf *buf;
 	struct frame_vector *vec;
-	unsigned long offset;
+	unsigned int offset;
 	int n_pages, i;
 	int ret = 0;
 	struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
 	buf->dev = dev;
 	buf->dma_dir = dma_dir;

-	offset = vaddr & ~PAGE_MASK;
+	offset = lower_32_bits(offset_in_page(vaddr));
 	vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
 	if (IS_ERR(vec)) {
 		ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
 	 * offset within the internal buffer specified by handle parameter.
 	 */
 	if (xfer->loc_addr) {
-		unsigned long offset;
+		unsigned int offset;
 		long pinned;

-		offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+		offset = lower_32_bits(offset_in_page(xfer->loc_addr));
 		nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;

 		page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask);

 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
  */
 int sg_alloc_table_from_pages(struct sg_table *sgt,
 	struct page **pages, unsigned int n_pages,
-	unsigned long offset, unsigned long size,
+	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
 	unsigned int chunks;
-- 
2.7.4
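To illustrate why an unsigned int is wide enough for the offset in the patch above, here is a small user-space sketch of the offset-in-page computation and of the page count derived from it in the rapidio hunk. It assumes 4096-byte pages; the sketch_* helpers are hypothetical stand-ins written for this note, not the kernel's offset_in_page() or PAGE_ALIGN() macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed: 4096-byte pages */
#define SKETCH_PAGE_SIZE ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Offset of addr within its page; by construction below the page size. */
static unsigned int sketch_offset_in_page(uintptr_t addr)
{
	uintptr_t off = addr & (SKETCH_PAGE_SIZE - 1);

	assert(off < SKETCH_PAGE_SIZE);	/* so it always fits an unsigned int */
	return (unsigned int)off;
}

/* Number of pages touched by a buffer of length bytes starting at addr. */
static size_t sketch_nr_pages(uintptr_t addr, size_t length)
{
	size_t span = length + sketch_offset_in_page(addr);

	return (span + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}
```

The masked value can never reach the page size, so narrowing it to unsigned int loses nothing even when the original address is 64 bits wide.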
[Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
From: Tvrtko Ursulin

Drivers like i915 benefit from being able to control the maximum
size of the coalesced sg segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function,
which gives them that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson
Reviewed-by: Chris Wilson (v2)
---
 include/linux/scatterlist.h | 11 ++---
 lib/scatterlist.c           | 59 +++--
 2 files changed, 53 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);
 
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 4fc54801cd29..df375ff18587 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);
 
 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:	Offset from start of the first page to the start of a buffer
- * @size:	Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *				 an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:	 Offset from start of the first page to the start of a buffer
+ * @size:	 Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:	 GFP allocation mask
  *
  * Description:
  *	Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = rounddown(UINT_MAX, PAGE_SIZE);
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
 	struct scatterlist *s;
 
+	if (WARN_ON(offset_in_page(max_segment)))
+		return -EINVAL;
+
 	/* compute number of contiguous chunks */
 	chunks = 1;
 	seg_len = PAGE_SIZE;
@@ -444,6 +447,36 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 
 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:	 Offset from start of the first page to the start of a buffer
+ * @size:	 Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ * Description:
+ *	Allocate and initialize an sg table from a list of pages. Contiguous
+ *	r
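To make the effect of the max_segment parameter concrete, here is a minimal userspace sketch of the "compute number of contiguous chunks" step. It is not the kernel implementation: count_segments(), SKETCH_PAGE_SIZE and the plain array of page frame numbers are assumptions made for illustration, and only the rule itself (physically contiguous pages are merged until the segment would exceed a page-aligned limit) is taken from the patch description.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/*
 * Count how many scatterlist segments a run of pages would need when
 * physically contiguous pages are merged, but no segment may grow
 * beyond max_segment bytes.
 *
 * pfns[] holds page frame numbers (hypothetical stand-in for struct page).
 * Returns -1 if max_segment is zero or not a multiple of the page size,
 * mirroring the alignment check the patch adds.
 */
static long count_segments(const unsigned long *pfns, size_t n_pages,
			   unsigned int max_segment)
{
	unsigned int seg_len;
	long chunks;
	size_t i;

	assert(pfns != NULL || n_pages == 0);

	if (max_segment == 0 || max_segment % SKETCH_PAGE_SIZE)
		return -1;
	if (n_pages == 0)
		return 0;

	chunks = 1;
	seg_len = SKETCH_PAGE_SIZE;
	for (i = 1; i < n_pages; i++) {
		/* start a new segment on a discontinuity or when full */
		if (pfns[i] != pfns[i - 1] + 1 ||
		    max_segment - seg_len < SKETCH_PAGE_SIZE) {
			chunks++;
			seg_len = SKETCH_PAGE_SIZE;
		} else {
			seg_len += SKETCH_PAGE_SIZE;
		}
	}

	return chunks;
}
```

With a limit of one page every page becomes its own segment, which is what the old SWIOTLB path in i915 built by hand; with a large limit only physical discontinuities split the list.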
[Intel-gfx] [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
From: Tvrtko Ursulin

With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

Signed-off-by: Tvrtko Ursulin
Cc: Chris Wilson
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v4)
---
 drivers/gpu/drm/i915/i915_drv.h         | 10 +
 drivers/gpu/drm/i915/i915_gem.c         |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 40 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2b325032fedc..a944ff0c5c68 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2594,6 +2594,16 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 	     (((__iter).curr += PAGE_SIZE) < (__iter).max) ||		\
 	     ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))
 
+static inline unsigned int i915_sg_segment_size(void)
+{
+	unsigned int size = swiotlb_max_segment();
+
+	if (size == 0)
+		size = UINT_MAX;
+
+	return rounddown(size, PAGE_SIZE);
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 13c02015709c..421827069a2f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2248,7 +2248,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = i915_sg_segment_size();
 	int ret;
 	gfp_t gfp;
 
@@ -2259,10 +2259,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_segment();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
 	struct task_struct *task;
 };
 
-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-	struct scatterlist *sg;
-	int ret, n;
-
-	*st = kmalloc(sizeof(**st), GFP_KERNEL);
-	if (*st == NULL)
-		return -ENOMEM;
-
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
-
-	return 0;
-
-err:
-	kfree(*st);
-	*st = NULL;
-	return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-			     struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+			       struct page **pvec, int num_pages)
 {
-	struct sg_table *pages;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
 	int ret;
 
-	ret = st_set_pages(&pages, pvec, num_pages);
-	if (ret)
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (!st)
+		return ERR_PTR(-ENOMEM);
+
+alloc_table:
+	ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
+					  0, num_pages << PAGE_SHIFT,
+
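The segment-size rule that i915_sg_segment_size() introduces can be tried out in userspace with the sketch below. It is only an illustration under stated assumptions: sketch_sg_segment_size() and SKETCH_PAGE_SIZE are hypothetical names, and the bounce_limit argument stands in for the value the kernel helper obtains from swiotlb_max_segment().

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/*
 * bounce_limit stands in for the value reported by the bounce-buffer
 * layer: 0 means "no limit", anything else is a byte count.
 */
static unsigned int sketch_sg_segment_size(unsigned int bounce_limit)
{
	unsigned int size = bounce_limit ? bounce_limit : UINT_MAX;

	/* round down to a whole number of pages */
	size -= size % SKETCH_PAGE_SIZE;

	/* postcondition: the result is always page aligned */
	assert(size % SKETCH_PAGE_SIZE == 0);
	return size;
}
```

Note that in this sketch a limit smaller than one page rounds down to 0, so a caller would have to treat that result as unusable rather than pass it on as a segment size.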
[Intel-gfx] [PATCH] drm/probe-helpers: Drop locking from poll_enable
It was only needed to protect the connector_list walking, see

commit 8c4ccc4ab6f64e859d4ff8d7c02c2ed2e956e07f
Author: Daniel Vetter
Date:   Thu Jul 9 23:44:26 2015 +0200

    drm/probe-helper: Grab mode_config.mutex in poll_init/enable

Unfortunately the commit message of that patch fails to mention that the
new locking check was for the connector_list.

But that requirement disappeared in

commit c36a3254f7857f1ad9badbe3578ccc92be541a8e
Author: Daniel Vetter
Date:   Thu Dec 15 16:58:43 2016 +0100

    drm: Convert all helpers to drm_connector_list_iter

and so we can drop this again.

This fixes a locking inversion on nouveau, where the rpm code needs to
re-enable. But in other places the rpm_get() calls are nested within the
big modeset locks.

While at it, also improve the kerneldoc for these two functions a notch.

v2: Update the kerneldoc even more to explain that these functions
can't be called concurrently, or bad things happen (Chris).

Cc: Dave Airlie
Reviewed-by: Chris Wilson
Cc: Chris Wilson
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/drm_probe_helper.c   | 51 ++--
 drivers/gpu/drm/i915/intel_hotplug.c |  4 +--
 include/drm/drm_crtc_helper.h        |  1 -
 3 files changed, 22 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 20f48d1e2785..93381454bdf7 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -115,25 +115,28 @@ static int drm_helper_probe_add_cmdline_mode(struct drm_connector *connector)
 
 #define DRM_OUTPUT_POLL_PERIOD (10*HZ)
 /**
- * drm_kms_helper_poll_enable_locked - re-enable output polling.
+ * drm_kms_helper_poll_enable - re-enable output polling.
  * @dev: drm_device
  *
- * This function re-enables the output polling work without
- * locking the mode_config mutex.
+ * This function re-enables the output polling work, after it has been
+ * temporarily disabled using drm_kms_helper_poll_disable(), for example over
+ * suspend/resume.
  *
- * This is like drm_kms_helper_poll_enable() however it is to be
- * called from a context where the mode_config mutex is locked
- * already.
+ * Drivers can call this helper from their device resume implementation. It is
+ * an error to call this when the output polling support has not yet been set
+ * up.
+ *
+ * Note that calls to enable and disable polling must be strictly ordered, which
+ * is automatically the case when they're only call from suspend/resume
+ * callbacks.
  */
-void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
+void drm_kms_helper_poll_enable(struct drm_device *dev)
 {
 	bool poll = false;
 	struct drm_connector *connector;
 	struct drm_connector_list_iter conn_iter;
 	unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
 
-	WARN_ON(!mutex_is_locked(&dev->mode_config.mutex));
-
 	if (!dev->mode_config.poll_enabled || !drm_kms_helper_poll)
 		return;
 
@@ -163,7 +166,7 @@ void drm_kms_helper_poll_enable_locked(struct drm_device *dev)
 	if (poll)
 		schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
 }
-EXPORT_SYMBOL(drm_kms_helper_poll_enable_locked);
+EXPORT_SYMBOL(drm_kms_helper_poll_enable);
 
 static enum drm_connector_status
 drm_connector_detect(struct drm_connector *connector, bool force)
@@ -290,7 +293,7 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
 
 	/* Re-enable polling in case the global poll config changed. */
 	if (drm_kms_helper_poll != dev->mode_config.poll_running)
-		drm_kms_helper_poll_enable_locked(dev);
+		drm_kms_helper_poll_enable(dev);
 
 	dev->mode_config.poll_running = drm_kms_helper_poll;
 
@@ -484,8 +487,12 @@ static void output_poll_execute(struct work_struct *work)
  * This function disables the output polling work.
  *
  * Drivers can call this helper from their device suspend implementation. It is
- * not an error to call this even when output polling isn't enabled or arlready
- * disabled.
+ * not an error to call this even when output polling isn't enabled or already
+ * disabled. Polling is re-enabled by calling drm_kms_helper_poll_enable().
+ *
+ * Note that calls to enable and disable polling must be strictly ordered, which
+ * is automatically the case when they're only call from suspend/resume
+ * callbacks.
  */
 void drm_kms_helper_poll_disable(struct drm_device *dev)
 {
@@ -496,24 +503,6 @@ void drm_kms_helper_poll_disable(struct drm_device *dev)
 EXPORT_SYMBOL(drm_kms_helper_poll_disable);
 
 /**
- * drm_kms_helper_poll_enable - re-enable output polling.
- * @dev: drm_device
- *
- * This function re-enables the output polling work.
- *
- * Drivers can call this helper from their device resume implementation. It is
- * an error to call this when the output polling support has not yet been set
- * up.
- */
-void drm_kms_helper_poll_enable(struct drm_devi
Re: [Intel-gfx] [PATCH] drm/i915/bxt: Add MST support when do DPLL calculation
On Wed, 11 Jan 2017, "Lee, Shawn C" wrote:
> From: "Lee, Shawn C"
>
> Kernel oops was trigger by DP MST monitor/hub connected.

Copy paste the oops to the commit message please. It's *much* easier to
match bug reports and fixes this way.

There's likely a bug report, or several bug reports about this over at
FDO bugzilla. Any Bugzilla: references we should add?

When was this broken? Which commit does this fix? We should use a Fixes:
tag to identify it, so the fix can be backported to appropriate stable
kernels.

BR,
Jani.

> DP MST series patch already upstream and MST should
> be support also. MST monitor will display normally with this
> change on bxt platform.
>
> Cc: Jani Nikula
> Reviewed-by: Cooper Chiou
> Reviewed-by: Gary C Wang
> Reviewed-by: Ciobanu, Nathan D
> Reviewed-by: Herbert, Marc
> Reviewed-by: Sripada, Radhakrishna
>
> Signed-off-by: Shawn Lee
> ---
>  drivers/gpu/drm/i915/intel_dpll_mgr.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c
> b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> index c92a2558beb4..1a1d99d266ed 100644
> --- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
> +++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
> @@ -1855,7 +1855,8 @@ bool bxt_ddi_dp_set_dpll_hw_state(int clock,
>  		return NULL;
>
>  	if ((encoder->type == INTEL_OUTPUT_DP ||
> -	     encoder->type == INTEL_OUTPUT_EDP) &&
> +	     encoder->type == INTEL_OUTPUT_EDP ||
> +	     encoder->type == INTEL_OUTPUT_DP_MST ) &&
>  	    !bxt_ddi_dp_set_dpll_hw_state(clock, &dpll_hw_state))
>  		return NULL;

--
Jani Nikula, Intel Open Source Technology Center
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: warning for drm/i915/bxt: Add MST support when do DPLL calculation
== Series Details ==

Series: drm/i915/bxt: Add MST support when do DPLL calculation
URL   : https://patchwork.freedesktop.org/series/17815/
State : warning

== Summary ==

Series 17815v1 drm/i915/bxt: Add MST support when do DPLL calculation
https://patchwork.freedesktop.org/api/1.0/series/17815/revisions/1/mbox/

Test kms_force_connector_basic:
        Subgroup force-edid:
                pass       -> DMESG-WARN (fi-snb-2520m)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
a6d6930 drm/i915/bxt: Add MST support when do DPLL calculation

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3472/
Re: [Intel-gfx] [PATCH 3/3] drm/i915: Prefer random replacement before eviction search
On Wed, Jan 11, 2017 at 09:47:41AM +0200, Joonas Lahtinen wrote:
> On ti, 2017-01-10 at 21:55 +, Chris Wilson wrote:
> > Performing an eviction search can be very, very slow especially for a
> > range restricted replacement. For example, a workload like
> > gem_concurrent_blit will populate the entire GTT and then cause aperture
> > thrashing. Since the GTT is a mix of active and inactive tiny objects,
> > we have to search through almost 400k objects before finding anything
> > inside the mappable region, and as this search is required before every
> > operation performance falls off a cliff.
> >
> > Instead of performing the full search, we do a trial replacement of the
> > node at a random location fitting the specified restrictions. We lose
> > the strict LRU property of the GTT in exchange for avoiding the slow
> > search (several orders of runtime improvement for gem_concurrent_blit
> > 4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
> > (later) mitigated firstly by only doing replacement if we find no
> > freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
> > it starts thrashing (i.e. the random replacement will only occur from the
> > already inactive set of objects).
> >
> > Signed-off-by: Chris Wilson
> > Cc: Tvrtko Ursulin
> > Cc: Joonas Lahtinen
>
> > +static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
> > +{
>
> The usual GEM_BUG_ON dance to make sure the inputs make some sense. Or
> are you relying on the upper level callers?

It was static and the callers were checking, but yeah might as well
catch them whilst we think about it.

> > +	u64 range, addr;
> > +
> > +	if (align == 0)
> > +		align = I915_GTT_MIN_ALIGNMENT;
> > +
> > +	range = round_down(end - len, align) - round_up(start, align);
>
> For example this may cause an odd result.
>
> > @@ -3629,6 +3655,16 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
> >  	if (err != -ENOSPC)
> >  		return err;
> >
> > +	/* No free space, pick a slot at random */
> > +	err = i915_gem_gtt_reserve(vm, node,
> > +				   size,
> > +				   random_offset(start, end, size, alignment),
>
> I'd pull this to a line above just to make it more humane to read.
>
> > +				   color,
> > +				   flags);
> > +	if (err != -ENOSPC)
> > +		return err;
> > +
> > +	/* Randomly selected placement is pinned, do a search */
> >  	err = i915_gem_evict_something(vm, size, alignment, color,
> >  				       start, end, flags);
> >  	if (err)
>
> I'm bit unsure why it would make such a big difference, but if you've
> been running the numbers. Code itself is all good, so this is;

The pathological case we have is

  |<-- 256 MiB aperture (64k objects) -->||<-- 1792 MiB unmappable (448k objects) -->|

Now imagine that the eviction LRU is ordered top-down (just because
pathology meets real life), and that we need to evict an object to make
room inside the aperture. The eviction scan then has to walk the 448k
list before it finds one within range. And now imagine that it has to
search for a new hole between every byte inside the memcpy, for several
simultaneous clients.

If there are a few holes in the unmappable region, we also have a
similar problem with hole skipping inside the drm_mm range search. This
is mitigated by using DRM_MM_INSERT_LOW, but only once we have that
support in drm_mm. Right now, the drm_mm search also has to walk the
MRU, rejecting the holes above the full aperture.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
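For readers who want to experiment with the offset selection being reviewed here, the following is a minimal userspace sketch of picking a random, aligned placement inside [start, end - len]. It is an approximation under stated assumptions, not the i915 code: round_up_pow2(), round_down_pow2() and random_u64() are hypothetical helpers, rand() stands in for the kernel's random number source, and the asserts play the role of the GEM_BUG_ON precondition checks asked for in the review.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers; align must be a non-zero power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Up to 64 pseudo-random bits assembled from rand(); not cryptographic. */
static uint64_t random_u64(void)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 4; i++)
		r = (r << 16) ^ (uint64_t)(rand() & 0xffff);
	return r;
}

/*
 * Pick an aligned offset such that [offset, offset + len) lies inside
 * [start, end). The preconditions are asserted rather than handled.
 */
static uint64_t random_offset(uint64_t start, uint64_t end,
			      uint64_t len, uint64_t align)
{
	uint64_t lo, hi, range;

	assert(align && (align & (align - 1)) == 0);
	assert(start <= end && len <= end - start);

	lo = round_up_pow2(start, align);
	hi = round_down_pow2(end - len, align);
	assert(lo <= hi);

	range = hi - lo;
	if (range)
		lo += random_u64() % range;

	/* lo < hi and hi is aligned, so rounding up cannot pass hi */
	return round_up_pow2(lo, align);
}
```

The point of the design is that one cheap random probe replaces the long LRU walk: if the probed slot turns out to be pinned, the caller still falls back to the full eviction search.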
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
== Series Details ==

Series: series starting with [1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
URL   : https://patchwork.freedesktop.org/series/17816/
State : success

== Summary ==

Series 17816v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17816/revisions/1/mbox/

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
91daf71e drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
653fe3c lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
9fee41b lib/scatterlist: Avoid potential scatterlist entry overflow
354e525 lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3473/
[Intel-gfx] [PATCH] drm/i915: Prefer random replacement before eviction search
Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditions

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 52 +
 1 file changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3ea32f79d86..9aa53bdf5a48 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */
 
 #include
+#include
 #include
 #include
 
@@ -3581,12 +3582,38 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
 	return err;
 }
 
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+	u64 range, addr;
+
+	GEM_BUG_ON(range_overflows(start, len, end));
+	GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+	range = round_down(end - len, align) - round_up(start, align);
+	if (range) {
+		if (sizeof(unsigned long) == sizeof(u64)) {
+			addr = get_random_long();
+		} else {
+			addr = get_random_int();
+			if (range > U32_MAX) {
+				addr <<= 32;
+				addr |= get_random_int();
+			}
+		}
+		div64_u64_rem(addr, range, &addr);
+		start += addr;
+	}
+
+	return round_up(start, align);
+}
+
 int i915_gem_gtt_insert(struct i915_address_space *vm,
 			struct drm_mm_node *node,
 			u64 size, u64 alignment, unsigned long color,
 			u64 start, u64 end, unsigned int flags)
 {
 	u32 search_flag, alloc_flag;
+	u64 offset;
 	int err;
 
 	lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3629,6 +3656,31 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 	if (err != -ENOSPC)
 		return err;
 
+	/* No free space, pick a slot at random.
+	 *
+	 * There is a pathological case here using a GTT shared between
+	 * mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+	 *
+	 *    |<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+	 *         (64k objects)             (448k objects)
+	 *
+	 * Now imagine that the eviction LRU is ordered top-down (just because
+	 * pathology meets real life), and that we need to evict an object to
+	 * make room inside the aperture. The eviction scan then has to walk
+	 * the list 448k before it finds one within range. And now imagine that
+	 * it has to search for a new hole between every byte inside the memcpy,
+	 * for several simultaneous clients.
+	 *
+	 * On a full-ppgtt system, if we have run out of available space, there
+	 * will be lots and lots of objects in the eviction list!
+	 */
+	offset = random_offset(start, end,
+			       size, alignment ?: I915_GTT_MIN_ALIGNMENT);
+	err = i915_gem_gtt_reserve(vm, node, size, offset, color, flags);
+	if (err != -ENOSPC)
+		return err;
+
+	/* Randomly selected placement is pinned, do a search */
 	err = i915_gem_evict_something(vm, size, alignment, color,
 				       start, end, flags);
 	if (err)
--
2.11.0
[Intel-gfx] [PATCH v3] drm/i915: Prefer random replacement before eviction search
Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditions
v3: Rephrase final sentence in comment to explain why we don't bother with
if (i915_is_ggtt(vm)) for preferring random replacement.

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 56 +
 1 file changed, 56 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a3ea32f79d86..b320cdd22f2f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */
 
 #include
+#include
 #include
 #include
 
@@ -3581,12 +3582,38 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
 	return err;
 }
 
+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+	u64 range, addr;
+
+	GEM_BUG_ON(range_overflows(start, len, end));
+	GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+	range = round_down(end - len, align) - round_up(start, align);
+	if (range) {
+		if (sizeof(unsigned long) == sizeof(u64)) {
+			addr = get_random_long();
+		} else {
+			addr = get_random_int();
+			if (range > U32_MAX) {
+				addr <<= 32;
+				addr |= get_random_int();
+			}
+		}
+		div64_u64_rem(addr, range, &addr);
+		start += addr;
+	}
+
+	return round_up(start, align);
+}
+
 int i915_gem_gtt_insert(struct i915_address_space *vm,
 			struct drm_mm_node *node,
 			u64 size, u64 alignment, unsigned long color,
 			u64 start, u64 end, unsigned int flags)
 {
 	u32 search_flag, alloc_flag;
+	u64 offset;
 	int err;
 
 	lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3629,6 +3656,35 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 	if (err != -ENOSPC)
 		return err;
 
+	/* No free space, pick a slot at random.
+	 *
+	 * There is a pathological case here using a GTT shared between
+	 * mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+	 *
+	 *    |<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+	 *         (64k objects)             (448k objects)
+	 *
+	 * Now imagine that the eviction LRU is ordered top-down (just because
+	 * pathology meets real life), and that we need to evict an object to
+	 * make room inside the aperture. The eviction scan then has to walk
+	 * the 448k list before it finds one within range. And now imagine that
+	 * it has to search for a new hole between every byte inside the memcpy,
+	 * for several simultaneous clients.
+	 *
+	 * On a full-ppgtt system, if we have run out of available space, there
+	 * will be lots and lots of objects in the eviction list! Again,
+	 * searching that LRU list may be slow if we are also applying any
+	 * range restrictions (e.g. restriction to low 4GiB) and so, for
+	 * simplicity and similarilty between different GTT, try the single
+	 * random replacement first.
+	 */
+	offset = random_offset(start, end,
+			       size, alignment ?: I915_GTT_MIN_ALIGNMENT);
+	err = i915_gem_gtt_reserve(vm, node, size, offset, color, flags);
+	if (err != -ENOSPC)
+		return err;
+
+	/* Randomly selected placement is pinned, do a search */
 	err = i915_gem_evict_something(vm, size, alignment, color,
 				       start, end, flags);
 	if (err)
--
2.11.0
[Intel-gfx] ✗ Fi.CI.BAT: warning for series starting with [1/3] drm/i915: Use the MRU stack search after evicting (rev3)
== Series Details ==

Series: series starting with [1/3] drm/i915: Use the MRU stack search after evicting (rev3)
URL   : https://patchwork.freedesktop.org/series/17784/
State : warning

== Summary ==

Series 17784v3 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17784/revisions/3/mbox/

Test kms_force_connector_basic:
        Subgroup force-connector-state:
                pass       -> DMESG-WARN (fi-snb-2520m)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
cee7403 drm/i915: Prefer random replacement before eviction search
df15dc1 drm/i915: Extract reserving space in the GTT to a helper
c4a94c9 drm/i915: Use the MRU stack search after evicting

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3475/
[Intel-gfx] [PATCH v2 3/3] drm/i915: Prefer random replacement before eviction search
Performing an eviction search can be very, very slow especially for a range restricted replacement. For example, a workload like gem_concurrent_blit will populate the entire GTT and then cause aperture thrashing. Since the GTT is a mix of active and inactive tiny objects, we have to search through almost 400k objects before finding anything inside the mappable region, and as this search is required before every operation performance falls off a cliff. Instead of performing the full search, we do a trial replacement of the node at a random location fitting the specified restrictions. We lose the strict LRU property of the GTT in exchange for avoiding the slow search (several orders of runtime improvement for gem_concurrent_blit 4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is (later) mitigated firstly by only doing replacement if we find no freespace and secondly by execbuf doing a PIN_NONBLOCK search first before it starts thrashing (i.e. the random replacement will only occur from the already inactive set of objects). v2: Ascii-art, and check preconditionst v3: Rephrase final sentence in comment to explain why we don't both with if (i915_is_ggtt(vm)) for preferring random replacement. 
Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 59 -
 1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 92b907f27986..0c48d6286419 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -24,6 +24,7 @@
  */

 #include
+#include
 #include
 #include

@@ -3605,6 +3606,31 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
 	return err;
 }

+static u64 random_offset(u64 start, u64 end, u64 len, u64 align)
+{
+	u64 range, addr;
+
+	GEM_BUG_ON(range_overflows(start, len, end));
+	GEM_BUG_ON(round_up(start, align) > round_down(end - len, align));
+
+	range = round_down(end - len, align) - round_up(start, align);
+	if (range) {
+		if (sizeof(unsigned long) == sizeof(u64)) {
+			addr = get_random_long();
+		} else {
+			addr = get_random_int();
+			if (range > U32_MAX) {
+				addr <<= 32;
+				addr |= get_random_int();
+			}
+		}
+		div64_u64_rem(addr, range, &addr);
+		start += addr;
+	}
+
+	return round_up(start, align);
+}
+
 /**
  * i915_gem_gtt_insert - insert a node into an address_space (GTT)
  * @vm - the &struct i915_address_space
@@ -3626,7 +3652,8 @@ int i915_gem_gtt_reserve(struct i915_address_space *vm,
  * its @size must then fit entirely within the [@start, @end] bounds. The
  * nodes on either side of the hole must match @color, or else a guard page
  * will be inserted between the two nodes (or the node evicted). If no
- * suitable hole is found, then the LRU list of objects within the GTT
+ * suitable hole is found, first a victim is randomly selected and tested
+ * for eviction, otherwise then the LRU list of objects within the GTT
  * is scanned to find the first set of replacement nodes to create the hole.
  * Those old overlapping nodes are evicted from the GTT (and so must be
  * rebound before any future use). Any node that is current pinned cannot
@@ -3644,6 +3671,7 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 			u64 start, u64 end, unsigned int flags)
 {
 	u32 search_flag, alloc_flag;
+	u64 offset;
 	int err;

 	lockdep_assert_held(&vm->i915->drm.struct_mutex);
@@ -3686,6 +3714,35 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
 	if (err != -ENOSPC)
 		return err;

+	/* No free space, pick a slot at random.
+	 *
+	 * There is a pathological case here using a GTT shared between
+	 * mmap and GPU (i.e. ggtt/aliasing_ppgtt but not full-ppgtt):
+	 *
+	 *    |<-- 256 MiB aperture -->||<-- 1792 MiB unmappable -->|
+	 *         (64k objects)             (448k objects)
+	 *
+	 * Now imagine that the eviction LRU is ordered top-down (just because
+	 * pathology meets real life), and that we need to evict an object to
+	 * make room inside the aperture. The eviction scan then has to walk
+	 * the 448k list before it finds one within range. And now imagine that
+	 * it has to search for a new hole between every byte inside the memcpy,
+	 * for several simultaneous clients.
+	 *
+	 * On a full-ppgtt system, if we have run out of available space, there
+	 * will be lots and lots of objects in the eviction list! Again,
+	 * searching that LRU list may be slow if we are also applying any
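
A minimal userspace sketch of the offset selection done by random_offset()
in the patch above, for anyone who wants to poke at it outside the kernel.
It assumes a power-of-two alignment and takes the random value as a
parameter (rnd) instead of calling get_random_long(); pick_offset() and the
two rounding helpers are illustrative names, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Round x down to a multiple of align (align must be a power of two). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/*
 * Pick an aligned offset such that [offset, offset + len) lies inside
 * [start, end). The asserts stand in for the GEM_BUG_ON() precondition
 * checks of the original; rnd stands in for the kernel's random source.
 */
static uint64_t pick_offset(uint64_t start, uint64_t end, uint64_t len,
			    uint64_t align, uint64_t rnd)
{
	uint64_t lo, hi, range;

	assert(align && (align & (align - 1)) == 0);
	assert(len <= end && start <= end - len);

	lo = round_up_pow2(start, align);
	hi = round_down_pow2(end - len, align);
	assert(lo <= hi);

	/* Reduce the random value into the span of valid starting points. */
	range = hi - lo;
	if (range)
		start += rnd % range;

	return round_up_pow2(start, align);
}
```

With start = 0, end = 1 MiB, len = 4 KiB, align = 4 KiB and rnd = 12345, the
unaligned pick is 12345 and the returned offset is 16384. When only one
aligned slot fits, range is zero and that slot is returned regardless of rnd.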
[Intel-gfx] [PATCH v2 2/3] drm/i915: Extract reserving space in the GTT to a helper
Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its own
routine so that it can be shared rather than duplicated.

v2: Kerneldoc

Signed-off-by: Chris Wilson
Cc: Joonas Lahtinen
Cc: igvt-g-...@lists.01.org
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
 drivers/gpu/drm/i915/i915_gem_evict.c  | 33 --
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 51 ++
 drivers/gpu/drm/i915/i915_gem_gtt.h    |  5
 drivers/gpu/drm/i915/i915_gem_stolen.c |  7 ++---
 drivers/gpu/drm/i915/i915_trace.h      | 16 +--
 drivers/gpu/drm/i915/i915_vgpu.c       | 33 --
 drivers/gpu/drm/i915/i915_vma.c        | 16 ---
 8 files changed, 105 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 89e0038ea26b..a29d138b6906 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3468,8 +3468,9 @@ int __must_check i915_gem_evict_something(struct i915_address_space *vm,
 					  unsigned cache_level,
 					  u64 start, u64 end,
 					  unsigned flags);
-int __must_check i915_gem_evict_for_vma(struct i915_vma *vma,
-					unsigned int flags);
+int __must_check i915_gem_evict_for_node(struct i915_address_space *vm,
+					 struct drm_mm_node *node,
+					 unsigned int flags);
 int i915_gem_evict_vm(struct i915_address_space *vm, bool do_idle);

 /* belongs in i915_gem_gtt.h */
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 6a5415e31acf..50b4645bf627 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -231,7 +231,8 @@ i915_gem_evict_something(struct i915_address_space *vm,

 /**
  * i915_gem_evict_for_vma - Evict vmas to make room for binding a new one
- * @target: address space and range to evict for
+ * @vm: address space to evict from
+ * @target: range (and color) to evict for
  * @flags: additional flags to control the eviction algorithm
  *
  * This function will try to evict vmas that overlap the target node.
@@ -239,18 +240,20 @@ i915_gem_evict_something(struct i915_address_space *vm,
  * To clarify: This is for freeing up virtual address space, not for freeing
  * memory in e.g. the shrinker.
  */
-int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
+int i915_gem_evict_for_node(struct i915_address_space *vm,
+			    struct drm_mm_node *target,
+			    unsigned int flags)
 {
 	LIST_HEAD(eviction_list);
 	struct drm_mm_node *node;
-	u64 start = target->node.start;
-	u64 end = start + target->node.size;
+	u64 start = target->start;
+	u64 end = start + target->size;
 	struct i915_vma *vma, *next;
 	bool check_color;
 	int ret = 0;

-	lockdep_assert_held(&target->vm->i915->drm.struct_mutex);
-	trace_i915_gem_evict_vma(target, flags);
+	lockdep_assert_held(&vm->i915->drm.struct_mutex);
+	trace_i915_gem_evict_node(vm, target, flags);

 	/* Retire before we search the active list. Although we have
 	 * reasonable accuracy in our retirement lists, we may have
@@ -258,18 +261,18 @@ int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
 	 * retiring.
 	 */
 	if (!(flags & PIN_NONBLOCK))
-		i915_gem_retire_requests(target->vm->i915);
+		i915_gem_retire_requests(vm->i915);

-	check_color = target->vm->mm.color_adjust;
+	check_color = vm->mm.color_adjust;
 	if (check_color) {
 		/* Expand search to cover neighbouring guard pages (or lack!) */
-		if (start > target->vm->start)
+		if (start > vm->start)
 			start -= I915_GTT_PAGE_SIZE;
-		if (end < target->vm->start + target->vm->total)
+		if (end < vm->start + vm->total)
 			end += I915_GTT_PAGE_SIZE;
 	}

-	drm_mm_for_each_node_in_range(node, &target->vm->mm, start, end) {
+	drm_mm_for_each_node_in_range(node, &vm->mm, start, end) {
 		/* If we find any non-objects (!vma), we cannot evict them */
 		if (node->color == I915_COLOR_UNEVICTABLE) {
 			ret = -ENOSPC;
@@ -285,12 +288,12 @@ int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
 		 * those as well to make room for our guard pages.
 		 */
 		if (check_color) {
-			if (vma->node.start + vma->node.size == target->node.start) {
-				if (vma->node.color == target->node.color)
+			if (vma->node.sta
[Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting
When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implementations (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc

Signed-off-by: Chris Wilson
Reviewed-by: Joonas Lahtinen
---
 drivers/gpu/drm/i915/gvt/aperture_gm.c |  33 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 121 +++--
 drivers/gpu/drm/i915/i915_gem_gtt.h    |   5 ++
 drivers/gpu/drm/i915/i915_vma.c        |  40 ++-
 4 files changed, 119 insertions(+), 80 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 016227d77dd4..a97d56ea3d83 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -41,47 +41,34 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
 	struct drm_i915_private *dev_priv = gvt->dev_priv;
-	u32 alloc_flag, search_flag;
+	unsigned int flags;
 	u64 start, end, size;
 	struct drm_mm_node *node;
-	int retried = 0;
 	int ret;

 	if (high_gm) {
-		search_flag = DRM_MM_SEARCH_BELOW;
-		alloc_flag = DRM_MM_CREATE_TOP;
 		node = &vgpu->gm.high_gm_node;
 		size = vgpu_hidden_sz(vgpu);
 		start = gvt_hidden_gmadr_base(gvt);
 		end = gvt_hidden_gmadr_end(gvt);
+		flags = PIN_HIGH;
 	} else {
-		search_flag = DRM_MM_SEARCH_DEFAULT;
-		alloc_flag = DRM_MM_CREATE_DEFAULT;
 		node = &vgpu->gm.low_gm_node;
 		size = vgpu_aperture_sz(vgpu);
 		start = gvt_aperture_gmadr_base(gvt);
 		end = gvt_aperture_gmadr_end(gvt);
+		flags = PIN_MAPPABLE;
 	}

 	mutex_lock(&dev_priv->drm.struct_mutex);
-search_again:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
-						  node, size, 4096,
-						  I915_COLOR_UNEVICTABLE,
-						  start, end, search_flag,
-						  alloc_flag);
-	if (ret) {
-		ret = i915_gem_evict_something(&dev_priv->ggtt.base,
-					       size, 4096,
-					       I915_COLOR_UNEVICTABLE,
-					       start, end, 0);
-		if (ret == 0 && ++retried < 3)
-			goto search_again;
-
-		gvt_err("fail to alloc %s gm space from host, retried %d\n",
-				high_gm ? "high" : "low", retried);
-	}
+	ret = i915_gem_gtt_insert(&dev_priv->ggtt.base, node,
+				  size, 4096, I915_COLOR_UNEVICTABLE,
+				  start, end, flags);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
+	if (ret)
+		gvt_err("fail to alloc %s gm space from host\n",
+			high_gm ? "high" : "low");
+
 	return ret;
 }

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8aca11f5f446..136f90ba95ab 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -23,10 +23,13 @@
  *
  */

+#include
 #include
 #include
+
 #include
 #include
+
 #include "i915_drv.h"
 #include "i915_vgpu.h"
 #include "i915_trace.h"
@@ -2032,7 +2035,6 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	struct i915_address_space *vm = &ppgtt->base;
 	struct drm_i915_private *dev_priv = ppgtt->base.i915;
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
-	bool retried = false;
 	int ret;

 	/* PPGTT PDEs reside in the GGTT and consists of 512 entries. The
@@ -2045,29 +2047,14 @@ static int gen6_ppgtt_allocate_page_directories(struct i915_hw_ppgtt *ppgtt)
 	if (ret)
 		return ret;

-alloc:
-	ret = drm_mm_insert_node_in_range_generic(&ggtt->base.mm, &ppgtt->node,
-						  GEN6_PD_SIZE, GEN6_PD_ALIGN,
-						  I915_COLOR_UNEVICTABLE,
-						  0, ggtt->base.total,
-						  DRM_MM_TOPDOWN);
-	if (ret == -ENOSPC && !retried) {
-		ret = i915_gem_evict_something(&ggtt->base,
-					       GEN6_PD_SIZE, GEN6_PD_ALIGN,
-					       I915_COLOR_UNEVICTABLE,
-
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [v2,1/3] drm/i915: Use the MRU stack search after evicting
== Series Details ==

Series: series starting with [v2,1/3] drm/i915: Use the MRU stack search after evicting
URL   : https://patchwork.freedesktop.org/series/17822/
State : success

== Summary ==

Series 17822v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17822/revisions/1/mbox/

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
0db2997 drm/i915: Prefer random replacement before eviction search
a2d9bf3 drm/i915: Extract reserving space in the GTT to a helper
be9c2e1 drm/i915: Use the MRU stack search after evicting

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3476/
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v4] lib/scatterlist: Avoid potential scatterlist entry overflow
From: Tvrtko Ursulin

Since the scatterlist length field is an unsigned int, make sure that
sg_alloc_table_from_pages does not overflow it while coalescing pages
into a single entry.

v2: Drop reference to future use. Use UINT_MAX.
v3: max_segment must be page aligned.
v4: Do not rely on compiler to optimise out the rounddown.
    (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v2)
Cc: Joonas Lahtinen
---
 include/linux/scatterlist.h |  6 ++
 lib/scatterlist.c           | 25 +++--
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..15265bb6e5c3 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -21,6 +21,12 @@ struct scatterlist {
 };

 /*
+ * Since the above length field is an unsigned int, below we define the maximum
+ * lenght in bytes that can be stored in one scatterlist entry.
+ */
+#define SCATTERLIST_MAX_SEGMENT (0xf000)
+
+/*
  * These macros should be used after a dma_map_sg call has been done
  * to get bus addresses of each of the SG entries and their lengths.
  * You should only work with the number of sg entries dma_map_sg
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..24beb0965e69 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	unsigned int offset, unsigned long size,
 	gfp_t gfp_mask)
 {
-	unsigned int chunks;
+	const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
+	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

 	/* compute number of contiguous chunks */
 	chunks = 1;
-	for (i = 1; i < n_pages; ++i)
-		if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+	seg_len = PAGE_SIZE;
+	for (i = 1; i < n_pages; ++i) {
+		if (seg_len >= max_segment ||
+		    page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
 			++chunks;
+			seg_len = PAGE_SIZE;
+		} else {
+			seg_len += PAGE_SIZE;
+		}
+	}

 	ret = sg_alloc_table(sgt, chunks, gfp_mask);
 	if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
 	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
-		unsigned long chunk_size;
+		unsigned int chunk_size;
 		unsigned int j;

 		/* look for the end of the current chunk */
+		seg_len = PAGE_SIZE;
 		for (j = cur_page + 1; j < n_pages; ++j)
-			if (page_to_pfn(pages[j]) !=
+			if (seg_len >= max_segment ||
+			    page_to_pfn(pages[j]) !=
 			    page_to_pfn(pages[j - 1]) + 1)
 				break;
+			else
+				seg_len += PAGE_SIZE;

 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
-		sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+		sg_set_page(s, pages[cur_page],
+			    min_t(unsigned long, size, chunk_size), offset);
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
--
2.7.4
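
The chunk-counting loop in the patch above can be tried in userspace with a
small sketch like the following. It works on an array of page frame numbers
instead of struct page pointers; count_chunks() and SKETCH_PAGE_SIZE are
illustrative names, and the 4 KiB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Count how many scatterlist entries are needed for n_pages pages when
 * physically contiguous pages are merged, but no merged entry may grow
 * beyond max_segment bytes (which must be a non-zero multiple of the
 * page size).
 */
static unsigned int count_chunks(const unsigned long *pfns, size_t n_pages,
				 unsigned int max_segment)
{
	unsigned int chunks = 1;
	unsigned int seg_len = SKETCH_PAGE_SIZE;
	size_t i;

	assert(n_pages > 0);
	assert(max_segment && max_segment % SKETCH_PAGE_SIZE == 0);

	for (i = 1; i < n_pages; i++) {
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			/* Segment is full or the run is broken: new entry. */
			chunks++;
			seg_len = SKETCH_PAGE_SIZE;
		} else {
			seg_len += SKETCH_PAGE_SIZE;
		}
	}

	return chunks;
}
```

Five contiguous pages with an 8 KiB cap need three entries (2 + 2 + 1
pages), whereas with a large cap they collapse into one.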
[Intel-gfx] [PATCH v5] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
From: Tvrtko Ursulin

Drivers like i915 benefit from being able to control the maximum
size of the coalesced sg segment while building the scatter-gather
list.

Introduce and export the __sg_alloc_table_from_pages function
which will give them that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.
v4: max_segment needs to be page aligned.
v5: Rebase.

Signed-off-by: Tvrtko Ursulin
Cc: Masahiro Yamada
Cc: linux-ker...@vger.kernel.org
Cc: Chris Wilson
Reviewed-by: Chris Wilson (v2)
Cc: Joonas Lahtinen
---
 include/linux/scatterlist.h | 11 +
 lib/scatterlist.c           | 58 +++--
 2 files changed, 52 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 15265bb6e5c3..c897533fb85d 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -267,10 +267,13 @@ void sg_free_table(struct sg_table *);
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+			      unsigned int n_pages, unsigned int offset,
+			      unsigned long size, gfp_t gfp_mask);

 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 24beb0965e69..732c6e516657 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
 EXPORT_SYMBOL(sg_alloc_table);

 /**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- *			       an array of pages
- * @sgt:	The sg table header to use
- * @pages:	Pointer to an array of page pointers
- * @n_pages:	Number of pages in the pages array
- * @offset:     Offset from start of the first page to the start of a buffer
- * @size:       Number of valid bytes in the buffer (after offset)
- * @gfp_mask:	GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *				 an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @gfp_mask:	 GFP allocation mask
  *
  *  Description:
  *    Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,18 +390,20 @@ EXPORT_SYMBOL(sg_alloc_table);
  * Returns:
  *   0 on success, negative error on failure
  */
-int sg_alloc_table_from_pages(struct sg_table *sgt,
-	struct page **pages, unsigned int n_pages,
-	unsigned int offset, unsigned long size,
-	gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+				unsigned int n_pages, unsigned int offset,
+				unsigned long size, unsigned int max_segment,
+				gfp_t gfp_mask)
 {
-	const unsigned int max_segment = SCATTERLIST_MAX_SEGMENT;
 	unsigned int seg_len, chunks;
 	unsigned int i;
 	unsigned int cur_page;
 	int ret;
 	struct scatterlist *s;

+	if (WARN_ON(!max_segment || offset_in_page(max_segment)))
+		return -EINVAL;
+
 	/* compute number of contiguous chunks */
 	chunks = 1;
 	seg_len = PAGE_SIZE;
@@ -444,6 +447,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

 	return 0;
 }
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ *			       an array of pages
+ * @sgt:	 The sg table header to use
+ * @pages:	 Pointer to an array of page pointers
+ * @n_pages:	 Number of pages in the pages array
+ * @offset:      Offset from start of the first page to the start of a buffer
+ * @size:        Number of valid bytes in the buffer (after offset)
+ * @gfp_mask:	 GFP allocation mask
+ *
+ *  Description:
+ *    Allocate and initialize an sg table
[Intel-gfx] [PATCH v6] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
From: Tvrtko Ursulin

With the addition of __sg_alloc_table_from_pages we can control
the maximum coalescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
 * Rename helper to i915_sg_segment_size and fix swiotlb override.
 * Commit message update.

v3:
 * Actually include the swiotlb override fix.

v4:
 * Regroup parameters a bit. (Chris Wilson)

v5:
 * Rebase for swiotlb_max_segment.
 * Add DMA map failure handling as in abb0deacb5a6
   ("drm/i915: Fallback to single PAGE_SIZE segments for DMA remapping").

v6: Handle swiotlb_max_segment() returning 1. (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin
Cc: Chris Wilson
Cc: linux-ker...@vger.kernel.org
Reviewed-by: Chris Wilson (v4)
Cc: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_drv.h         | 15 +++
 drivers/gpu/drm/i915/i915_gem.c         |  6 +--
 drivers/gpu/drm/i915/i915_gem_userptr.c | 79 -
 3 files changed, 45 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a320675a9e71..5646e48a893b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2598,6 +2598,21 @@ static inline struct scatterlist *__sg_next(struct scatterlist *sg)
 	     (((__iter).curr += PAGE_SIZE) < (__iter).max) ||		\
 	     ((__iter) = __sgt_iter(__sg_next((__iter).sgp), false), 0))

+static inline unsigned int i915_sg_segment_size(void)
+{
+	unsigned int size = swiotlb_max_segment();
+
+	if (size == 0)
+		return SCATTERLIST_MAX_SEGMENT;
+
+	size = rounddown(size, PAGE_SIZE);
+	/* swiotlb_max_segment_size can return 1 byte when it means one page. */
+	if (size < PAGE_SIZE)
+		size = PAGE_SIZE;
+
+	return size;
+}
+
 static inline const struct intel_device_info *
 intel_info(const struct drm_i915_private *dev_priv)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3bf517e2430a..9312284a31e4 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2255,7 +2255,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct sgt_iter sgt_iter;
 	struct page *page;
 	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	unsigned int max_segment = i915_sg_segment_size();
 	int ret;
 	gfp_t gfp;

@@ -2266,10 +2266,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

-	max_segment = swiotlb_max_segment();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
 		return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 6a8fa085b74e..95b62b9c5cd6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,64 +390,42 @@ struct get_pages_work {
 	struct task_struct *task;
 };

-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
-static int
-st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
-{
-	struct scatterlist *sg;
-	int ret, n;
-
-	*st = kmalloc(sizeof(**st), GFP_KERNEL);
-	if (*st == NULL)
-		return -ENOMEM;
-
-	if (swiotlb_active()) {
-		ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
-		if (ret)
-			goto err;
-
-		for_each_sg((*st)->sgl, sg, num_pages, n)
-			sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
-	} else {
-		ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
-						0, num_pages << PAGE_SHIFT,
-						GFP_KERNEL);
-		if (ret)
-			goto err;
-	}
-
-	return 0;
-
-err:
-	kfree(*st);
-	*st = NULL;
-	return ret;
-}
-
 static struct sg_table *
-__i915_gem_userptr_set_pages(struct drm_i915_gem_object *obj,
-			     struct page **pvec, int num_pages)
+__i915_gem_userptr_alloc_pages(struct drm_i915_gem_object *obj,
+			       struct page **pvec, int num_pages)
 {
-	struct sg_table *pages;
+	unsigned int max_segment = i915_sg_segment_size();
+	struct sg_table *st;
 	int ret;

-	ret = st_set_pages(&pages, pvec, num_pages);
-	if
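
The clamping done by i915_sg_segment_size() in the patch above can be
checked in isolation with a sketch like this one. It takes the value that
swiotlb_max_segment() would return as a parameter; sg_segment_size(),
SKETCH_PAGE_SIZE and SKETCH_MAX_SEGMENT are illustrative names, and both the
4 KiB page size and the "largest page-aligned unsigned int" upper bound are
assumptions made for this sketch.

```c
#include <limits.h>

#define SKETCH_PAGE_SIZE   4096u
/* Assumed upper bound: the largest page-aligned value an unsigned int holds. */
#define SKETCH_MAX_SEGMENT (UINT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Turn the swiotlb limit into a usable max_segment:
 *   0             -> no swiotlb limit, use the scatterlist maximum
 *   below a page  -> at least one page (covers the "returns 1" case)
 *   anything else -> rounded down to a page multiple
 */
static unsigned int sg_segment_size(unsigned int swiotlb_max)
{
	unsigned int size = swiotlb_max;

	if (size == 0)
		return SKETCH_MAX_SEGMENT;

	size -= size % SKETCH_PAGE_SIZE;	/* rounddown(size, PAGE_SIZE) */
	if (size < SKETCH_PAGE_SIZE)
		size = SKETCH_PAGE_SIZE;

	return size;
}
```

Every return value is a non-zero multiple of the page size, which is what
the WARN_ON() check in __sg_alloc_table_from_pages() requires of
max_segment.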
Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting
On Wed, 2017-01-11 at 11:23 +, Chris Wilson wrote:
> When we evict from the GTT to make room for an object, the hole we
> create is put onto the MRU stack inside the drm_mm range manager. On the
> next search pass, we can speed up a PIN_HIGH allocation by referencing
> that stack for the new hole.
>
> v2: Pull together the 3 identical implements (ahem, a couple were
> outdated) into a common routine for allocating a node and evicting as
> necessary.
> v3: Detect invalid calls to i915_gem_gtt_insert()
> v4: kerneldoc
>
> Signed-off-by: Chris Wilson
> Reviewed-by: Joonas Lahtinen

> +/**
> + * i915_gem_gtt_insert - insert a node into an address_space (GTT)
> + * @vm - the &struct i915_address_space

Mixing &struct and @struct; I guess you meant &struct in the later line too.

> + * @node - the @struct drm_mm_node (typicallay i915_vma.mode)

"typically" and "i915_vma.node"

> + * @size - how much space to allocate inside the GTT,
> + *         must be #I915_GTT_PAGE_SIZE aligned
> + * @alignment - required alignment of starting offset, may be 0 but
> + *              if specified, this must be a power-of-two and at least
> + *              #I915_GTT_MIN_ALIGNMENT
> + * @color - color to apply to node
> + * @start - start of any range restriction inside GTT (0 for all),
> + *          must be #I915_GTT_PAGE_SIZE aligned
> + * @end - end of any range restriction inside GTT (U64_MAX for all),
> + *        must be #I915_GTT_PAGE_SIZE aligned
> + * @flags - control search and eviction behaviour
> + *
> + * i915_gem_gtt_insert() first searches for an available hole into which
> + * is can insert the node. The hole address is aligned to @alignment and
> + * its @size must then fit entirely within the [@start, @end] bounds. The
> + * nodes on either side of the hole must match @color, or else a guard page
> + * will be inserted between the two nodes (or the node evicted). If no
> + * suitable hole is found, then the LRU list of objects within the GTT
> + * is scanned to find the first set of replacement nodes to create the hole.
> + * Those old overlapping nodes are evicted from the GTT (and so must be
> + * rebound before any future use). Any node that is current pinned cannot

"currently"

> + * be evicted (see i915_vma_pin()). Similar if the node's VMA is currently
> + * active and #PIN_NONBLOCK is specified, that node is also skipped when
> + * searching for an eviction candidate. See i915_gem_evict_something() for
> + * the gory details on the eviction algorithm.
> + *
> + * Returns: 0 on success, -ENOSPC if no suitable hole is found, -EINTR if
> + * asked to wait for eviction and interrupted.
> + */

With those fixed, good to merge.

Regards, Joonas
--
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
Daniel Vetter writes:

> On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
>> The WaDisableLSQCROPERFforOCL workaround has the side effect of
>> disabling an L3SQ optimization that has huge performance implications
>> and is unlikely to be necessary for the correct functioning of usual
>> graphic workloads. Userspace is free to re-enable the workaround on
>> demand, and is generally in a better position to determine whether the
>> workaround is necessary than the DRM is (e.g. only during the
>> execution of compute kernels that rely on both L3 fences and HDC R/W
>> requests).
>>
>> The same workaround seems to apply to BDW (at least to production
>> stepping G1) and SKL as well (the internal workaround database claims
>> that it does for all steppings, while the BSpec workaround table only
>> mentions pre-production steppings), but the DRM doesn't do anything
>> beyond whitelisting the L3SQCREG4 register so userspace can enable it
>> when it sees fit. Do the same on KBL platforms.
>>
>> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
>> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
>> This is followed by a regression of 35% and 10% respectively for the
>> same benchmarks and platform caused by my recent patch series
>> switching userspace to use the dataport constant cache instead of the
>> sampler to implement uniform pull constant loads, which caused us to
>> hit more heavily the L3 cache (and on platforms other than KBL had the
>> opposite effect of improving performance of the same two benchmarks).
>> The overall effect on KBL of this change combined with the recent
>> userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
>> was affected by the constant cache changes (though it improved as it
>> did on other platforms rather than regressing), but is not
>> significantly affected by this patch (with statistical significance of
>> 5% and sample size 20).
>>
>> v2: Drop some more code to avoid unused variable warning.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
>> Signed-off-by: Francisco Jerez
>> Cc: Eero Tamminen
>> Cc: Jani Nikula
>> Cc: Mika Kuoppala
>> Cc: beig...@lists.freedesktop.org
>
> Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
> for compute kernels? Are the patches for mesa compute/beignet
> ready&reviewed?

This is an explicit setting on kbl/E0 only. So one could argue
that unless they filter based on PCI-IDs, things would already
blow up across the skl/kbl population if they forgot to set it.

The whitelisting is in place and looks sane, so this E0 exception
is a wart that got in by me reading the wa database slavishly
without thinking.

-Mika

> -Daniel
>
>> ---
>>  drivers/gpu/drm/i915/intel_lrc.c| 10 --
>>  drivers/gpu/drm/i915/intel_ringbuffer.c | 8
>>  2 files changed, 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>> b/drivers/gpu/drm/i915/intel_lrc.c
>> index 6db246a..656e0a3 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -970,18 +970,8 @@ static inline int gen8_emit_flush_coherentl3_wa(struct
>> intel_engine_cs *engine,
>>  						uint32_t *batch,
>>  						uint32_t index)
>>  {
>> -	struct drm_i915_private *dev_priv = engine->i915;
>>  	uint32_t l3sqc4_flush = (0x4040 | GEN8_LQSC_FLUSH_COHERENT_LINES);
>>
>> -	/*
>> -	 * WaDisableLSQCROPERFforOCL:kbl
>> -	 * This WA is implemented in skl_init_clock_gating() but since
>> -	 * this batch updates GEN8_L3SQCREG4 with default value we need to
>> -	 * set this bit here to retain the WA during flush.
>> -	 */
>> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
>> -		l3sqc4_flush |= GEN8_LQSC_RO_PERF_DIS;
>> -
>>  	wa_ctx_emit(batch, index, (MI_STORE_REGISTER_MEM_GEN8 |
>>  				   MI_SRM_LRM_GLOBAL_GTT));
>>  	wa_ctx_emit_reg(batch, index, GEN8_L3SQCREG4);
>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> index 0971ac3..7cb2ab4 100644
>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>> @@ -1095,14 +1095,6 @@ static int kbl_init_workarounds(struct
>> intel_engine_cs *engine)
>>  		WA_SET_BIT_MASKED(HDC_CHICKEN0,
>>  				  HDC_FENCE_DEST_SLM_DISABLE);
>>
>> -	/* GEN8_L3SQCREG4 has a dependency with WA batch so any new changes
>> -	 * involving this register should also be added to WA batch as required.
>> -	 */
>> -	if (IS_KBL_REVID(dev_priv, 0, KBL_REVID_E0))
>> -		/* WaDisableLSQCROPERFforOCL:kbl */
>> -		I915_WRITE(GEN8_L3SQCREG4, I915_READ(GEN8_L3SQCREG4) |
>> -			   GEN8_LQSC_RO_PERF_DIS);
>> -
>>  	/* WaToEnableHwFixForPushConstHWBug:kbl */
>>  	if (IS_KBL_REVID(dev_priv, KB
[Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
When switching between contexts using the aliasing_ppgtt, the VM is
shared. We don't need to reload the PD registers unless they are dirty.

Martin Peres reported an issue that looks like corruption between
Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
Rearrange switch_context to load the aliasing ppgtt on first use").
Switching between the same mm (the aliasing_ppgtt is used for all
contexts in this case) should be a nop, but appears to trigger some
side-effects in the context switch. However, as we know the switch
is redundant in this case, we can skip it and continue to ignore the
issue until somebody feels strong enough to investigate full-ppgtt on
gen7 again!

Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use")
Reported-by: Martin Peres
Signed-off-by: Chris Wilson
Cc: Martin Peres
---
 drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ed31133b3ce3..86426c1a9534 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -728,10 +728,10 @@ static inline bool skip_rcs_switch(struct i915_hw_ppgtt *ppgtt,
 }
 
 static bool
-needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
-		  struct intel_engine_cs *engine,
-		  struct i915_gem_context *to)
+needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt, struct intel_engine_cs *engine)
 {
+	struct i915_hw_ppgtt *last_ppgtt;
+
 	if (!ppgtt)
 		return false;
 
@@ -740,7 +740,9 @@ needs_pd_load_pre(struct i915_hw_ppgtt *ppgtt,
 		return true;
 
 	/* Same context without new entries, skip */
-	if (engine->legacy_active_context == to &&
+	last_ppgtt =
+		engine->legacy_active_context->ppgtt ?: engine->i915->mm.aliasing_ppgtt;
+	if (last_ppgtt == ppgtt &&
 	    !(intel_engine_flag(engine) & ppgtt->pd_dirty_rings))
 		return false;
 
@@ -784,7 +786,7 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 	if (skip_rcs_switch(ppgtt, engine, to))
 		return 0;
 
-	if (needs_pd_load_pre(ppgtt, engine, to)) {
+	if (needs_pd_load_pre(ppgtt, engine)) {
 		/* Older GENs and non render rings still want the load first,
 		 * "PP_DCLV followed by PP_DIR_BASE register through Load
 		 * Register Immediate commands in Ring Buffer before submitting
@@ -881,7 +883,7 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 		struct i915_hw_ppgtt *ppgtt =
 			to->ppgtt ?: req->i915->mm.aliasing_ppgtt;
 
-		if (needs_pd_load_pre(ppgtt, engine, to)) {
+		if (needs_pd_load_pre(ppgtt, engine)) {
 			int ret;
 
 			trace_switch_mm(engine, to);
-- 
2.11.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
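For anyone reading the archive without the driver source to hand, the idea of the patch (compare the address space, not the context) can be sketched in plain user-space C. This is only an illustration under stated assumptions: the struct and function names below are hypothetical stand-ins rather than the i915 types, the real needs_pd_load_pre() has further checks that are not modelled, and treating "no previous context" as "load needed" is a choice made for this sketch, not something the patch states.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the i915 structures; only the fields this
 * check reads are modelled. */
struct ppgtt {
	unsigned int pd_dirty_rings;	/* one bit per engine needing a PD reload */
};

struct context {
	struct ppgtt *ppgtt;		/* NULL: context runs in the aliasing ppgtt */
};

struct engine {
	unsigned int flag;		/* this engine's bit in pd_dirty_rings */
	const struct context *active;	/* last context loaded on this engine */
	struct ppgtt *aliasing_ppgtt;	/* VM shared by contexts without their own */
};

/* Resolve the VM a context actually runs in. */
static struct ppgtt *context_vm(const struct engine *e,
				const struct context *ctx)
{
	return ctx->ppgtt ? ctx->ppgtt : e->aliasing_ppgtt;
}

/* Compare address spaces, not contexts, when deciding whether to reload. */
static bool needs_pd_load(const struct engine *e, const struct ppgtt *next)
{
	if (!next)
		return false;

	/* Assumption of this sketch: with no previous context, always load. */
	if (!e->active)
		return true;

	/* Same VM and no new page-directory entries for this engine: skip. */
	if (context_vm(e, e->active) == next &&
	    !(e->flag & next->pd_dirty_rings))
		return false;

	return true;
}
```

With two contexts that both fall back to the aliasing ppgtt, needs_pd_load() reports false until that VM's dirty bit for the engine is set, which is the redundant switch_mm the commit message wants to suppress.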
Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote: > Daniel Vetter writes: > > > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote: > >> The WaDisableLSQCROPERFforOCL workaround has the side effect of > >> disabling an L3SQ optimization that has huge performance implications > >> and is unlikely to be necessary for the correct functioning of usual > >> graphic workloads. Userspace is free to re-enable the workaround on > >> demand, and is generally in a better position to determine whether the > >> workaround is necessary than the DRM is (e.g. only during the > >> execution of compute kernels that rely on both L3 fences and HDC R/W > >> requests). > >> > >> The same workaround seems to apply to BDW (at least to production > >> stepping G1) and SKL as well (the internal workaround database claims > >> that it does for all steppings, while the BSpec workaround table only > >> mentions pre-production steppings), but the DRM doesn't do anything > >> beyond whitelisting the L3SQCREG4 register so userspace can enable it > >> when it sees fit. Do the same on KBL platforms. > >> > >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%, > >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master -- > >> This is followed by a regression of 35% and 10% respectively for the > >> same benchmarks and platform caused by my recent patch series > >> switching userspace to use the dataport constant cache instead of the > >> sampler to implement uniform pull constant loads, which caused us to > >> hit more heavily the L3 cache (and on platforms other than KBL had the > >> opposite effect of improving performance of the same two benchmarks). > >> The overall effect on KBL of this change combined with the recent > >> userspace change is respectively 4.6% and 2.6%. 
SynMark2 OglShMapPcf > >> was affected by the constant cache changes (though it improved as it > >> did on other platforms rather than regressing), but is not > >> significantly affected by this patch (with statistical significance of > >> 5% and sample size 20). > >> > >> v2: Drop some more code to avoid unused variable warning. > >> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256 > >> Signed-off-by: Francisco Jerez > >> Cc: Eero Tamminen > >> Cc: Jani Nikula > >> Cc: Mika Kuoppala > >> Cc: beig...@lists.freedesktop.org > > > > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom > > for compute kernels? Are the patches for mesa compute/beignet > > ready&reviewed? > > This is explicit setting on kbl/E0 only. So one could argue > that unless they filter based on PCI-IDs, things would already > blow up across the skl/kbl population, if they forgot > to set it. The whitelisting is in place and looks sane > so this E0 exception is a wart that got in by me reading wa > database slavishly without thinking. Add Fixes then? -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2 1/3] drm/i915: Use the MRU stack search after evicting
On Wed, Jan 11, 2017 at 02:04:53PM +0200, Joonas Lahtinen wrote: > On ke, 2017-01-11 at 11:23 +, Chris Wilson wrote: > > When we evict from the GTT to make room for an object, the hole we > > create is put onto the MRU stack inside the drm_mm range manager. On the > > next search pass, we can speed up a PIN_HIGH allocation by referencing > > that stack for the new hole. > > > > v2: Pull together the 3 identical implements (ahem, a couple were > > outdated) into a common routine for allocating a node and evicting as > > necessary. > > v3: Detect invalid calls to i915_gem_gtt_insert() > > v4: kerneldoc > > > > Signed-off-by: Chris Wilson > > Reviewed-by: Joonas Lahtinen > > > > > +/** > > + * i915_gem_gtt_insert - insert a node into an address_space (GTT) > > + * @vm - the &struct i915_address_space > > mixing &struct and @struct, I guess you meant &struct in later line > too. > > > + * @node - the @struct drm_mm_node (typicallay i915_vma.mode) > > "typicallly" and "i915_vma.node" > > > + * @size - how much space to allocate inside the GTT, > > + * must be #I915_GTT_PAGE_SIZE aligned > > + * @alignment - required alignment of starting offset, may be 0 but > > + * if specified, this must be a power-of-two and at least > > + * #I915_GTT_MIN_ALIGNMENT > > + * @color - color to apply to node > > + * @start - start of any range restriction inside GTT (0 for all), > > + * must be #I915_GTT_PAGE_SIZE aligned > > + * @end - end of any range restriction inside GTT (U64_MAX for all), > > + *must be #I915_GTT_PAGE_SIZE aligned > > + * @flags - control search and eviction behaviour > > + * > > + * i915_gem_gtt_insert() first searches for an available hole into which > > + * is can insert the node. The hole address is aligned to @alignment and > > + * its @size must then fit entirely within the [@start, @end] bounds. The > > + * nodes on either side of the hole must match @color, or else a guard page > > + * will be inserted between the two nodes (or the node evicted). 
If no > > + * suitable hole is found, then the LRU list of objects within the GTT > > + * is scanned to find the first set of replacement nodes to create the > > hole. > > + * Those old overlapping nodes are evicted from the GTT (and so must be > > + * rebound before any future use). Any node that is current pinned cannot > > "currently" > > > + * be evicted (see i915_vma_pin()). Similar if the node's VMA is currently > > + * active and #PIN_NONBLOCK is specified, that node is also skipped when > > + * searching for an eviction candidate. See i915_gem_evict_something() for > > + * the gory details on the eviction algorithm. > > + * > > + * Returns: 0 on success, -ENOSPC if no suitable hole is found, -EINTR if > > + * asked to wait for eviction and interrupted. > > + */ > > Fit those fixed, good to merge. Thanks for proof reading. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
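The argument constraints listed in the quoted kerneldoc (page-aligned size and range, power-of-two alignment of at least the minimum, the node fitting inside [start, end]) can be written down as one predicate. What follows is a hedged user-space sketch, not the driver's code: the function name is hypothetical, the two constants are assumed to be 4096 purely for illustration, and rejecting a zero size and accepting UINT64_MAX as the "no upper limit" value for end are assumptions of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define GTT_PAGE_SIZE		4096ull	/* assumed; stands in for I915_GTT_PAGE_SIZE */
#define GTT_MIN_ALIGNMENT	4096ull	/* assumed; stands in for I915_GTT_MIN_ALIGNMENT */

static bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

static bool page_aligned(uint64_t v)
{
	return (v & (GTT_PAGE_SIZE - 1)) == 0;
}

/* Returns true when the arguments satisfy the documented constraints. */
static bool gtt_insert_args_valid(uint64_t size, uint64_t alignment,
				  uint64_t start, uint64_t end)
{
	/* Assumption: an empty node is never a valid request. */
	if (size == 0 || !page_aligned(size))
		return false;

	/* alignment may be 0, otherwise a power of two >= the minimum */
	if (alignment != 0 &&
	    (!is_pow2(alignment) || alignment < GTT_MIN_ALIGNMENT))
		return false;

	if (!page_aligned(start))
		return false;

	/* Assumption: UINT64_MAX is accepted as "no upper limit". */
	if (end != UINT64_MAX && !page_aligned(end))
		return false;

	/* The node has to fit inside [start, end]. */
	if (start >= end || size > end - start)
		return false;

	return true;
}
```

Keeping the checks in a single predicate makes it cheap to assert on entry, which is roughly what "detect invalid calls" in the v3 note amounts to.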
Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
Chris Wilson writes: > On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote: >> Daniel Vetter writes: >> >> > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote: >> >> The WaDisableLSQCROPERFforOCL workaround has the side effect of >> >> disabling an L3SQ optimization that has huge performance implications >> >> and is unlikely to be necessary for the correct functioning of usual >> >> graphic workloads. Userspace is free to re-enable the workaround on >> >> demand, and is generally in a better position to determine whether the >> >> workaround is necessary than the DRM is (e.g. only during the >> >> execution of compute kernels that rely on both L3 fences and HDC R/W >> >> requests). >> >> >> >> The same workaround seems to apply to BDW (at least to production >> >> stepping G1) and SKL as well (the internal workaround database claims >> >> that it does for all steppings, while the BSpec workaround table only >> >> mentions pre-production steppings), but the DRM doesn't do anything >> >> beyond whitelisting the L3SQCREG4 register so userspace can enable it >> >> when it sees fit. Do the same on KBL platforms. >> >> >> >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%, >> >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master -- >> >> This is followed by a regression of 35% and 10% respectively for the >> >> same benchmarks and platform caused by my recent patch series >> >> switching userspace to use the dataport constant cache instead of the >> >> sampler to implement uniform pull constant loads, which caused us to >> >> hit more heavily the L3 cache (and on platforms other than KBL had the >> >> opposite effect of improving performance of the same two benchmarks). >> >> The overall effect on KBL of this change combined with the recent >> >> userspace change is respectively 4.6% and 2.6%. 
SynMark2 OglShMapPcf >> >> was affected by the constant cache changes (though it improved as it >> >> did on other platforms rather than regressing), but is not >> >> significantly affected by this patch (with statistical significance of >> >> 5% and sample size 20). >> >> >> >> v2: Drop some more code to avoid unused variable warning. >> >> >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256 >> >> Signed-off-by: Francisco Jerez >> >> Cc: Eero Tamminen >> >> Cc: Jani Nikula >> >> Cc: Mika Kuoppala >> >> Cc: beig...@lists.freedesktop.org >> > >> > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom >> > for compute kernels? Are the patches for mesa compute/beignet >> > ready&reviewed? >> >> This is explicit setting on kbl/E0 only. So one could argue >> that unless they filter based on PCI-IDs, things would already >> blow up across the skl/kbl population, if they forgot >> to set it. The whitelisting is in place and looks sane >> so this E0 exception is a wart that got in by me reading wa >> database slavishly without thinking. > > Add Fixes then? Fixes: a4106a782d11 ("drm/i915/gen9: Add WaFlushCoherentL3CacheLinesAtContextSwitch workaround") Looking at beignet source, they don't care about this register/bit (yet). Also we need to get rid of KBL_REVID_E0 as there is no such thing. Oddly kbl doesnt follow the logical x0->rev mapping but leave holes. Were they afraid of running out of revids or what... -Mika > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
== Series Details ==

Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
URL   : https://patchwork.freedesktop.org/series/17823/
State : failure

== Summary ==

Series 17823v1 drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/

Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> TIMEOUT    (fi-hsw-4770)
Test gem_close_race:
        Subgroup basic-process:
                pass       -> INCOMPLETE (fi-hsw-4770)
        Subgroup basic-threads:
                pass       -> INCOMPLETE (fi-hsw-4770r)
Test gem_ctx_basic:
                pass       -> INCOMPLETE (fi-ivb-3520m)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:14   pass:12   dwarn:0   dfail:0   fail:0   skip:0
fi-hsw-4770r     total:15   pass:14   dwarn:0   dfail:0   fail:0   skip:0
fi-ivb-3520m     total:18   pass:17   dwarn:0   dfail:0   fail:0   skip:0
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

abf5260be6dda4ade94e8edf66e133260083f29b drm-tip: 2017y-01m-10d-23h-42m-21s UTC integration manifest
f58497f drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3478/
[Intel-gfx] [PATCH 0/5] drm/edid: Improve RGB limited range handling a bit
From: Ville Syrjälä

While reading the HDMI 2.0 spec I noticed some new things related to
the RGB quantization range stuff, and after cross checking with
CEA-861-F I spotted some other things as well. So I figured I should
pimp up the code a bit.

And since we now have two drivers that deal with this stuff, I decided
to move a bunch of the code to the core to avoid duplicating the code
and having different bugs/features for each driver. I still left the
state computation part in the drivers, but eventually we might want to
move that code into some helper as well.

Entire series available here:
git://github.com/vsyrjala/linux.git hdmi_quant_range_helpers

Ville Syrjälä (5):
  drm/edid: Have drm_edid.h include hdmi.h
  drm/edid: Introduce drm_default_rgb_quant_range()
  drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
  drm/edid: Set AVI infoframe Q even when QS=0
  drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F

 drivers/gpu/drm/drm_edid.c        | 64 +++
 drivers/gpu/drm/i915/intel_dp.c   |  4 ++-
 drivers/gpu/drm/i915/intel_hdmi.c | 20 ++--
 drivers/gpu/drm/vc4/vc4_hdmi.c    | 18 +--
 include/drm/drm_edid.h            | 10 --
 5 files changed, 93 insertions(+), 23 deletions(-)

-- 
2.10.2
[Intel-gfx] [PATCH 1/5] drm/edid: Have drm_edid.h include hdmi.h
From: Ville Syrjälä

drm_edid.h depends on hdmi.h on account of enum hdmi_picture_aspect,
so let's just include hdmi.h and drop some useless struct declarations.

Signed-off-by: Ville Syrjälä
---
 include/drm/drm_edid.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 38eabf65f19d..838eaf2b42e9 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -24,6 +24,7 @@
 #define __DRM_EDID_H__
 
 #include
+#include
 
 struct drm_device;
 struct i2c_adapter;
@@ -322,8 +323,6 @@ struct cea_sad {
 struct drm_encoder;
 struct drm_connector;
 struct drm_display_mode;
-struct hdmi_avi_infoframe;
-struct hdmi_vendor_infoframe;
 
 void drm_edid_to_eld(struct drm_connector *connector, struct edid *edid);
 int drm_edid_to_sad(struct edid *edid, struct cea_sad **sads);
-- 
2.10.2
[Intel-gfx] [PATCH 4/5] drm/edid: Set AVI infoframe Q even when QS=0
From: Ville Syrjälä

HDMI 2.0 recommends that we set the Q bits in the AVI infoframe
even when the sink does not support quantization range selection (QS=0).
According to CEA-861 we can do that as long as the Q we send matches
the default quantization range for the mode.

Previously I think I had misread the spec as saying that you can't
send a non-zero Q at all when QS=0. But that's not what the spec
actually says.

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_edid.c        | 8 +++++++-
 drivers/gpu/drm/i915/intel_hdmi.c | 6 ++++--
 drivers/gpu/drm/vc4/vc4_hdmi.c    | 2 +-
 include/drm/drm_edid.h            | 1 +
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 548c20250b95..caa2435bac31 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4295,11 +4295,13 @@ EXPORT_SYMBOL(drm_hdmi_avi_infoframe_from_display_mode);
  * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
  *                                        quantization range information
  * @frame: HDMI AVI infoframe
+ * @mode: DRM display mode
  * @rgb_quant_range: RGB quantization range (Q)
  * @rgb_quant_range_selectable: Sink support selectable RGB quantization range (QS)
  */
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   const struct drm_display_mode *mode,
 				   enum hdmi_quantization_range rgb_quant_range,
 				   bool rgb_quant_range_selectable)
 {
@@ -4309,8 +4311,12 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
 	 * to the default RGB Quantization Range for the transmitted Picture
 	 * unless the Sink indicates support for the Q bit in a Video
 	 * Capabilities Data Block."
+	 *
+	 * HDMI 2.0 recommends sending non-zero Q when it does match the
+	 * default RGB quantization range for the mode, even when QS=0.
 	 */
-	if (rgb_quant_range_selectable)
+	if (rgb_quant_range_selectable ||
+	    rgb_quant_range == drm_default_rgb_quant_range(mode))
 		frame->quantization_range = rgb_quant_range;
 	else
 		frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 351f837b09a0..af16b0fa6b69 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -455,17 +455,19 @@ static void intel_hdmi_set_avi_infoframe(struct drm_encoder *encoder,
 					 const struct intel_crtc_state *crtc_state)
 {
 	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder);
+	const struct drm_display_mode *adjusted_mode =
+		&crtc_state->base.adjusted_mode;
 	union hdmi_infoframe frame;
 	int ret;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       &crtc_state->base.adjusted_mode);
+						       adjusted_mode);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
 	}
 
-	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi, adjusted_mode,
 					   crtc_state->limited_color_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
 					   HDMI_QUANTIZATION_RANGE_FULL,
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index a588156b5410..f38fdbac2878 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -356,7 +356,7 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 		return;
 	}
 
-	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi, mode,
 					   vc4_encoder->limited_rgb_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
 					   HDMI_QUANTIZATION_RANGE_FULL,
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index cfad4d89589f..43fb0ac5eb9c 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -347,6 +347,7 @@ drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
 					    const struct drm_display_mode *mode);
 void
 drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   const struct drm_display_mode *mode,
 				   enum hdmi_quantization_range rgb_quant_range,
 				   bool rgb_quant_range_selectable);
 
-- 
2.10.2
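The rule this patch implements fits in one small function: send the real Q value when the sink advertises QS, or when the value happens to equal the default for the mode; otherwise fall back to "default". Below is a minimal stand-alone sketch of that decision. The enum and function names are hypothetical and chosen for the example (they are not the kernel's), and the mode's default range is passed in as a plain argument instead of being derived from a display mode.

```c
#include <stdbool.h>

/* Hypothetical mirror of the Q field: 0 means "default for the mode". */
enum q_range { Q_DEFAULT = 0, Q_LIMITED = 1, Q_FULL = 2 };

/*
 * range:        the RGB range actually being transmitted
 * mode_default: the range CEA-861 implies for the current mode
 * qs:           sink advertises a selectable quantization range (QS=1)
 */
static enum q_range avi_q(enum q_range range, enum q_range mode_default,
			  bool qs)
{
	/* An explicit Q is allowed with QS=1, or when it changes nothing. */
	if (qs || range == mode_default)
		return range;

	/* Otherwise a non-default Q must not be sent. */
	return Q_DEFAULT;
}
```

The only behavioural difference from the helper as first introduced in this series is the second half of the condition: with QS=0, a range equal to the mode default is now sent explicitly instead of as Q_DEFAULT.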
[Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()
From: Ville Syrjälä

Make the code selecting the RGB quantization range a little less magicy
by wrapping it up in a small helper.

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_edid.c        | 18 ++++++++++++++++++
 drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
 drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.c    |  4 +++-
 include/drm/drm_edid.h            |  2 ++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 4ff04aa84dd0..304c583b8000 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
 }
 EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
 
+/**
+ * drm_default_rgb_quant_range - default RGB quantization range
+ * @mode: display mode
+ *
+ * Determine the default RGB quantization range for the mode,
+ * as specified in CEA-861.
+ *
+ * Return: The default RGB quantization range for the mode
+ */
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode)
+{
+	return drm_match_cea_mode(mode) > 1 ?
+		HDMI_QUANTIZATION_RANGE_LIMITED :
+		HDMI_QUANTIZATION_RANGE_FULL;
+}
+EXPORT_SYMBOL(drm_default_rgb_quant_range);
+
 static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
 					   const u8 *hdmi)
 {
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 343e1d9fa761..d4befbbe834a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 		 * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
 		 */
 		pipe_config->limited_color_range =
-			bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
+			bpp != 18 &&
+			drm_default_rgb_quant_range(adjusted_mode) ==
+			HDMI_QUANTIZATION_RANGE_LIMITED;
 	} else {
 		pipe_config->limited_color_range =
 			intel_dp->limited_color_range;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 0bcfead14571..19bd13f53729 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 		/* See CEA-861-E - 5.1 Default Encoding Parameters */
 		pipe_config->limited_color_range =
 			pipe_config->has_hdmi_sink &&
-			drm_match_cea_mode(adjusted_mode) > 1;
+			drm_default_rgb_quant_range(adjusted_mode) ==
+			HDMI_QUANTIZATION_RANGE_LIMITED;
 	} else {
 		pipe_config->limited_color_range =
 			intel_hdmi->limited_color_range;
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index c4cb2e26de32..d79466a42690 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder *encoder,
 	csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
 				VC4_HD_CSC_CTL_ORDER);
 
-	if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
+	if (vc4_encoder->hdmi_monitor &&
+	    drm_default_rgb_quant_range(adjusted_mode) ==
+	    HDMI_QUANTIZATION_RANGE_LIMITED) {
 		/* CEA VICs other than #1 requre limited range RGB
 		 * output unless overridden by an AVI infoframe.
 		 * Apply a colorspace conversion to squash 0-255 down
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 838eaf2b42e9..25cdf5f7a0d8 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const u8 video_code);
 bool drm_detect_hdmi_monitor(struct edid *edid);
 bool drm_detect_monitor_audio(struct edid *edid);
 bool drm_rgb_quant_range_selectable(struct edid *edid);
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode);
 int drm_add_modes_noedid(struct drm_connector *connector,
 			 int hdisplay, int vdisplay);
 void drm_set_preferred_mode(struct drm_connector *connector,
-- 
2.10.2
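To make the "less magicy" point concrete, here is a rough user-space sketch of the helper and of the two automatic-range decisions the i915 hunks build on top of it. It is an illustration only: the names are hypothetical, and the mode is reduced to its CEA VIC as an integer (0 for a non-CEA mode, which is what drm_match_cea_mode() returns), so none of the DRM structures are involved.

```c
#include <stdbool.h>

enum rgb_range { RGB_LIMITED, RGB_FULL };

/*
 * vic: CEA video identification code of the mode, 0 if it is not a CEA
 * mode. VIC 1 (640x480) and non-CEA modes default to full range, every
 * other CEA mode to limited range.
 */
static enum rgb_range default_rgb_range(int vic)
{
	return vic > 1 ? RGB_LIMITED : RGB_FULL;
}

/* HDMI, automatic mode: limited range only when driving an HDMI sink. */
static bool hdmi_auto_limited(int vic, bool has_hdmi_sink)
{
	return has_hdmi_sink && default_rgb_range(vic) == RGB_LIMITED;
}

/* DP, automatic mode: 18 bpp is always treated as full range. */
static bool dp_auto_limited(int vic, int bpp)
{
	return bpp != 18 && default_rgb_range(vic) == RGB_LIMITED;
}
```

The callers keep their own extra condition (HDMI sink, bpp) and only the shared "VIC > 1" test moves behind a name, which matches the cover letter's remark that the state computation stays in the drivers for now.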
[Intel-gfx] [PATCH 5/5] drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F
From: Ville Syrjälä

CEA-861-F tells us:
"When transmitting any RGB colorimetry, the Source should set the
 YQ-field to match the RGB Quantization Range being transmitted
 (e.g., when Limited Range RGB, set YQ=0 or when Full Range RGB,
 set YQ=1) and the Sink shall ignore the YQ-field."

So let's go ahead and do that. Perhaps there are sinks that don't
ignore the YQ as they should for RGB?

I wasn't able to find similar text in CEA-861-E, so it would seem
to be a fairly "recent" addition.

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_edid.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index caa2435bac31..6ba9a1a6eae4 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4320,6 +4320,20 @@ drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
 		frame->quantization_range = rgb_quant_range;
 	else
 		frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
+
+	/*
+	 * CEA-861-F:
+	 * "When transmitting any RGB colorimetry, the Source should set the
+	 * YQ-field to match the RGB Quantization Range being transmitted
+	 * (e.g., when Limited Range RGB, set YQ=0 or when Full Range RGB,
+	 * set YQ=1) and the Sink shall ignore the YQ-field."
+	 */
+	if (rgb_quant_range == HDMI_QUANTIZATION_RANGE_LIMITED)
+		frame->ycc_quantization_range =
+			HDMI_YCC_QUANTIZATION_RANGE_LIMITED;
+	else
+		frame->ycc_quantization_range =
+			HDMI_YCC_QUANTIZATION_RANGE_FULL;
 }
 EXPORT_SYMBOL(drm_hdmi_avi_infoframe_quant_range);
 
-- 
2.10.2
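The YQ mapping quoted from CEA-861-F is a two-way lookup, sketched here outside the kernel for reference. The enum and function names are hypothetical; the numeric YQ values (0 for limited, 1 for full) are the ones given in the quoted text. Note that, unlike Q, the value does not depend on whether the sink advertises QS.

```c
/* RGB range being transmitted (hypothetical names for this sketch). */
enum rgb_range { RGB_LIMITED, RGB_FULL };

/* YQ field values as given in the CEA-861-F quote. */
enum yq_value { YQ_LIMITED = 0, YQ_FULL = 1 };

/* For RGB output, YQ simply mirrors the RGB range on the wire. */
static enum yq_value yq_for_rgb(enum rgb_range range)
{
	return range == RGB_LIMITED ? YQ_LIMITED : YQ_FULL;
}
```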
[Intel-gfx] [PATCH 3/5] drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
From: Ville Syrjälä

Pull the logic to populate the quantization range information
in the AVI infoframe into a small helper. We'll be adding a bit
more logic to it, and having it in a central place seems like a
good idea since it's based on the CEA-861 spec.

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_edid.c        | 26 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_hdmi.c | 13 +++++--------
 drivers/gpu/drm/vc4/vc4_hdmi.c    | 14 +++++---------
 include/drm/drm_edid.h            |  4 ++++
 4 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 304c583b8000..548c20250b95 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -4291,6 +4291,32 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 }
 EXPORT_SYMBOL(drm_hdmi_avi_infoframe_from_display_mode);
 
+/**
+ * drm_hdmi_avi_infoframe_quant_range() - fill the HDMI AVI infoframe
+ *                                        quantization range information
+ * @frame: HDMI AVI infoframe
+ * @rgb_quant_range: RGB quantization range (Q)
+ * @rgb_quant_range_selectable: Sink support selectable RGB quantization range (QS)
+ */
+void
+drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   enum hdmi_quantization_range rgb_quant_range,
+				   bool rgb_quant_range_selectable)
+{
+	/*
+	 * CEA-861:
+	 * "A Source shall not send a non-zero Q value that does not correspond
+	 * to the default RGB Quantization Range for the transmitted Picture
+	 * unless the Sink indicates support for the Q bit in a Video
+	 * Capabilities Data Block."
+	 */
+	if (rgb_quant_range_selectable)
+		frame->quantization_range = rgb_quant_range;
+	else
+		frame->quantization_range = HDMI_QUANTIZATION_RANGE_DEFAULT;
+}
+EXPORT_SYMBOL(drm_hdmi_avi_infoframe_quant_range);
+
 static enum hdmi_3d_structure
 s3d_structure_from_display_mode(const struct drm_display_mode *mode)
 {
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 19bd13f53729..351f837b09a0 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -465,14 +465,11 @@ static void intel_hdmi_set_avi_infoframe(struct drm_encoder *encoder,
 		return;
 	}
 
-	if (intel_hdmi->rgb_quant_range_selectable) {
-		if (crtc_state->limited_color_range)
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_LIMITED;
-		else
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_FULL;
-	}
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   crtc_state->limited_color_range ?
+					   HDMI_QUANTIZATION_RANGE_LIMITED :
+					   HDMI_QUANTIZATION_RANGE_FULL,
+					   intel_hdmi->rgb_quant_range_selectable);
 
 	intel_write_infoframe(encoder, crtc_state, &frame);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index d79466a42690..a588156b5410 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -356,15 +356,11 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 		return;
 	}
 
-	if (vc4_encoder->rgb_range_selectable) {
-		if (vc4_encoder->limited_rgb_range) {
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_LIMITED;
-		} else {
-			frame.avi.quantization_range =
-				HDMI_QUANTIZATION_RANGE_FULL;
-		}
-	}
+	drm_hdmi_avi_infoframe_quant_range(&frame.avi,
+					   vc4_encoder->limited_rgb_range ?
+					   HDMI_QUANTIZATION_RANGE_LIMITED :
+					   HDMI_QUANTIZATION_RANGE_FULL,
+					   vc4_encoder->rgb_range_selectable);
 
 	vc4_hdmi_write_infoframe(encoder, &frame);
 }
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 25cdf5f7a0d8..cfad4d89589f 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -345,6 +345,10 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 int
 drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
 					    const struct drm_display_mode *mode);
+void
+drm_hdmi_avi_infoframe_quant_range(struct hdmi_avi_infoframe *frame,
+				   enum hdmi_quantization_range rgb_quant_range,
+
Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
On ke, 2017-01-11 at 12:14 +, Chris Wilson wrote:
> When switching between contexts using the aliasing_ppgtt, the VM is
> shared. We don't need to reload the PD registers unless they are dirty.
> 
> Martin Peres reported an issue that looks like corruption between
> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> Rearrange switch_context to load the aliasing ppgtt on first use").
> Switching between the same mm (the aliasing_ppgtt is used for all
> contexts in this case) should be a nop, but appears to trigger some
> side-effects in the context switch. However, as we know the switch
> is redundant in this case, we can skip it and continue to ignore the
> issue until somebody feels strong enough to investigate full-ppgtt on
> gen7 again!
> 
> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the aliasing
> ppgtt on first use")
> Reported-by: Martin Peres
> Signed-off-by: Chris Wilson
> Cc: Martin Peres

Code looks good, could use the T-b's to verify.

Reviewed-by: Joonas Lahtinen

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
On Wed, Jan 11, 2017 at 12:56:03PM -, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
> URL   : https://patchwork.freedesktop.org/series/17823/
> State : failure
> 
> == Summary ==
> 
> Series 17823v1 drm/i915: Suppress switch_mm emission between the same
> aliasing_ppgtt
> https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> 
> Test drv_hangman:
>         Subgroup error-state-basic:
>                 pass       -> TIMEOUT    (fi-hsw-4770)
> Test gem_close_race:
>         Subgroup basic-process:
>                 pass       -> INCOMPLETE (fi-hsw-4770)
>         Subgroup basic-threads:
>                 pass       -> INCOMPLETE (fi-hsw-4770r)
> Test gem_ctx_basic:
>                 pass       -> INCOMPLETE (fi-ivb-3520m)

Ooh. That is suitably scary that there is something wrong going on here.
Still think the patch is sane by itself, so suspecting there is
something not meeting the eye here.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [PATCH 3/3] HAX enable guc submission for CI
---
 drivers/gpu/drm/i915/i915_params.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index 0e280fbd52f1..1d3766cfc837 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -56,8 +56,8 @@ struct i915_params i915 __read_mostly = {
 	.verbose_state_checks = 1,
 	.nuclear_pageflip = 0,
 	.edp_vswing = 0,
-	.enable_guc_loading = 0,
-	.enable_guc_submission = 0,
+	.enable_guc_loading = 1,
+	.enable_guc_submission = 1,
 	.guc_log_level = -1,
 	.enable_dp_mst = true,
 	.inject_load_failure = 0,
-- 
2.11.0
[Intel-gfx] [PATCH 1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion
Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Daniel Vetter
---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 78 +++---
 drivers/gpu/drm/i915/i915_gem_gtt.h        | 3 ++
 drivers/gpu/drm/i915/i915_guc_submission.c | 3 --
 drivers/gpu/drm/i915/intel_guc_loader.c    | 7 +--
 drivers/gpu/drm/i915/intel_lrc.c           | 6 ---
 5 files changed, 57 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0ed99adfd0da..ed120a1e7f93 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -110,6 +110,30 @@ const struct i915_ggtt_view i915_ggtt_view_rotated = {
 	.type = I915_GGTT_VIEW_ROTATED,
 };
 
+static void gen6_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+	/* Note that as an uncached mmio write, this should flush the
+	 * WCB of the writes into the GGTT before it triggers the invalidate.
+	 */
+	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+}
+
+static void guc_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+	gen6_ggtt_invalidate(dev_priv);
+	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
+}
+
+static void gmch_ggtt_invalidate(struct drm_i915_private *dev_priv)
+{
+	intel_gtt_chipset_flush();
+}
+
+static inline void i915_ggtt_invalidate(struct drm_i915_private *i915)
+{
+	i915->ggtt.invalidate(i915);
+}
+
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 				int enable_ppgtt)
 {
@@ -2307,16 +2331,6 @@ void i915_check_and_clear_faults(struct drm_i915_private *dev_priv)
 		POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS]));
 }
 
-static void i915_ggtt_flush(struct drm_i915_private *dev_priv)
-{
-	if (INTEL_INFO(dev_priv)->gen < 6) {
-		intel_gtt_chipset_flush();
-	} else {
-		I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-		POSTING_READ(GFX_FLSH_CNTL_GEN6);
-	}
-}
-
 void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 {
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
@@ -2331,7 +2345,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv)
 
 	ggtt->base.clear_range(&ggtt->base, ggtt->base.start, ggtt->base.total);
 
-	i915_ggtt_flush(dev_priv);
+	i915_ggtt_invalidate(dev_priv);
 }
 
 int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj,
@@ -2370,15 +2384,13 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 				  enum i915_cache_level level,
 				  u32 unused)
 {
-	struct drm_i915_private *dev_priv = vm->i915;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen8_pte_t __iomem *pte =
-		(gen8_pte_t __iomem *)dev_priv->ggtt.gsm +
-		(offset >> PAGE_SHIFT);
+		(gen8_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT);
 
 	gen8_set_pte(pte, gen8_pte_encode(addr, level));
 
-	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+	ggtt->invalidate(vm->i915);
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
@@ -2386,7 +2398,6 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 				     uint64_t start,
 				     enum i915_cache_level level, u32 unused)
 {
-	struct drm_i915_private *dev_priv = vm->i915;
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	struct sgt_iter sgt_iter;
 	gen8_pte_t __iomem *gtt_entries;
@@ -2415,8 +2426,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	 * want to flush the TLBs only after we're certain all the PTE updates
 	 * have finished.
 	 */
-	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
-	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+	ggtt->invalidate(vm->i915);
 }
 
 struct insert_entries {
@@ -2451,15 +2461,13 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
 				  enum i915_cache_level level,
 				  u32 flags)
 {
-	struct drm_i915_private *dev_priv = vm->i915;
+	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen6_pte_t __iomem *pte =
-		(gen6_pte_t __iomem *)dev_priv->ggtt.gsm +
-		(offset >> PAGE_SHIFT);
+		(gen6_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT);
 
 	iowrite
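The pattern in the patch above, where the insert paths call a per-device invalidate hook instead of writing the flush registers themselves, can be sketched in isolation. This is a minimal illustration under stated assumptions: the sketch_* names are hypothetical, counters stand in for the MMIO writes, and the way the hook is selected (a simple guc_in_use flag) is an assumption, since the part of the patch that installs the hook is not visible here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_ggtt;
typedef void (*sketch_invalidate_fn)(struct sketch_ggtt *ggtt);

/* Hypothetical stand-in for the GGTT state; counters replace register writes. */
struct sketch_ggtt {
	sketch_invalidate_fn invalidate;	/* hook used by every insert path */
	unsigned int ggtt_flushes;	/* stands in for GFX_FLSH_CNTL_GEN6 writes */
	unsigned int guc_flushes;	/* stands in for GEN8_GTCR writes */
};

static void sketch_gen6_invalidate(struct sketch_ggtt *ggtt)
{
	ggtt->ggtt_flushes++;
}

/* The GuC variant does the normal flush and then the GuC TLB invalidate. */
static void sketch_guc_invalidate(struct sketch_ggtt *ggtt)
{
	sketch_gen6_invalidate(ggtt);
	ggtt->guc_flushes++;
}

/* Assumption: the hook is chosen once, based on whether the GuC is in use. */
static void sketch_select_invalidate(struct sketch_ggtt *ggtt, bool guc_in_use)
{
	ggtt->invalidate = guc_in_use ? sketch_guc_invalidate
				      : sketch_gen6_invalidate;
}

/* Callers no longer care which flavour of invalidate is required. */
static void sketch_insert_page(struct sketch_ggtt *ggtt)
{
	assert(ggtt->invalidate != NULL);
	/* ... the PTE write would happen here ... */
	ggtt->invalidate(ggtt);
}
```

The point of the design is that the decision is made in one place, so callers such as the execlists or GuC loader code do not need their own invalidate calls.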
[Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc
This emulates execlists on top of the GuC in order to defer submission of requests to the hardware. This deferral allows time for high priority requests to gazump their way to the head of the queue, however it nerfs the GuC by converting it back into a simple execlist (where the CPU has to wake up after every request to feed new commands into the GuC). v2: Drop hack status - though iirc there is still a lockdep inversion between fence and engine->timeline->lock (which is impossible as the nesting only occurs on different fences - hopefully just requires some judicious lockdep annotation) Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_guc_submission.c | 79 +++--- drivers/gpu/drm/i915/i915_irq.c| 4 +- drivers/gpu/drm/i915/intel_lrc.c | 5 +- 3 files changed, 76 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 913d87358972..bdc9e2bc5eb9 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) u32 freespace; int ret; - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size); freespace -= client->wq_rsvd; if (likely(freespace >= wqi_size)) { @@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) client->no_wq_space++; ret = -EAGAIN; } - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); return ret; } @@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request) GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size); - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); client->wq_rsvd -= wqi_size; - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); } /* Construct a Work Item and append it to the GuC's Work Queue */ @@ -534,10 +534,74 @@ static void __i915_guc_submit(struct 
drm_i915_gem_request *rq) static void i915_guc_submit(struct drm_i915_gem_request *rq) { - i915_gem_request_submit(rq); + __i915_gem_request_submit(rq); __i915_guc_submit(rq); } +static bool i915_guc_dequeue(struct intel_engine_cs *engine) +{ + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *last = port[0].request; + unsigned long flags; + struct rb_node *rb; + bool submit = false; + + spin_lock_irqsave(&engine->timeline->lock, flags); + rb = engine->execlist_first; + while (rb) { + struct drm_i915_gem_request *cursor = + rb_entry(rb, typeof(*cursor), priotree.node); + + if (last && cursor->ctx != last->ctx) { + if (port != engine->execlist_port) + break; + + i915_gem_request_assign(&port->request, last); + dma_fence_enable_sw_signaling(&last->fence); + port++; + } + + rb = rb_next(rb); + rb_erase(&cursor->priotree.node, &engine->execlist_queue); + RB_CLEAR_NODE(&cursor->priotree.node); + cursor->priotree.priority = INT_MAX; + + i915_guc_submit(cursor); + last = cursor; + submit = true; + } + if (submit) { + i915_gem_request_assign(&port->request, last); + dma_fence_enable_sw_signaling(&last->fence); + engine->execlist_first = rb; + } + spin_unlock_irqrestore(&engine->timeline->lock, flags); + + return submit; +} + +static void i915_guc_irq_handler(unsigned long data) +{ + struct intel_engine_cs *engine = (struct intel_engine_cs *)data; + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *rq; + bool submit; + + do { + rq = port[0].request; + while (rq && i915_gem_request_completed(rq)) { + i915_gem_request_put(rq); + rq = port[1].request; + port[0].request = rq; + port[1].request = NULL; + } + + submit = false; + if (!port[1].request) + submit = i915_guc_dequeue(engine); + } while (submit); +} + /* * Everything below here is concerned with setup & teardown, and is * therefore not part of the somewhat time-critical batch-submission @@ -1428,8 +1492,9 @@ int i915_guc_submission_enable(struct 
drm_i915_private *dev_priv) for_each_engine(engine, dev_priv, id) { struct drm_i915_gem_request *rq; -
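The retire loop in i915_guc_irq_handler() above treats the two execlist ports as a tiny queue: when the request in port[0] has completed, port[1] moves down and the second slot becomes free for a new dequeue. A minimal standalone sketch of just that step follows; the sketch_* types are hypothetical stand-ins, and a plain counter replaces the request reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a request and an execlist port. */
struct sketch_request {
	bool completed;
	unsigned int refcount;
};

struct sketch_port {
	struct sketch_request *request;
};

/*
 * Retire completed requests from a two-entry port array, shifting port[1]
 * into port[0] each time. Returns true when port[1] is free, i.e. when the
 * caller may try to dequeue more work.
 */
static bool sketch_retire_ports(struct sketch_port port[2])
{
	struct sketch_request *rq = port[0].request;

	while (rq && rq->completed) {
		assert(rq->refcount > 0);
		rq->refcount--;		/* stands in for dropping the reference */
		rq = port[1].request;
		port[0].request = rq;
		port[1].request = NULL;
	}

	return port[1].request == NULL;
}
```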
[Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support
The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
is used for both cases.

HuC loading needs to be before GuC loading. The WOPCM setting must
be done early before loading any of them.

v2: rebased on-top of drm-intel-nightly.
    removed if(HAS_GUC()) before the guc call. (D.Gordon)
    update huc_version number of format.
v3: rebased to drm-intel-nightly, changed the file name format to
    match the one in the huc package.
    Changed dev->dev_private to to_i915()
v4: moved function back to where it was.
    change wait_for_atomic to wait_for.
v5: rebased + comment changes.
v7: rebased.
v8: rebased.
v9: rebased. Changed the year in the copyright message to reflect
    the right year. Correct the comments, remove the unwanted WARN message,
    replace drm_gem_object_unreference() with i915_gem_object_put(). Make the
    prototypes in intel_huc.h non-extern.
v10: rebased. Update the file construction done by HuC. It is similar to
     GuC. Adopted the approach used in
     https://patchwork.freedesktop.org/patch/104355/
v11: Fix warnings, remove old declaration
v12: Change dev to dev_priv in macro definition.
     Corrected comments.
v13: rebased.
v14: rebased on top of drm-tip
v15: rebased. Updated functions intel_huc_load(), intel_huc_init() and
     intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents
     of intel_huc.h to intel_uc.h
v16: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to guc_wopcm_size().
     Remove unwanted checks in intel_uc.h. Rename huc_fw in struct intel_huc
     to simply fw to avoid redundancy.
v17: rebased.
v18: rebased. Correct comments.
v19: rebased. Correct comments. move definition to i915_guc_reg.h from
     intel_uc.h. Clean DMA_CTRL bits after HuC DMA transfer in
     huc_ucode_xfer() instead of guc_ucode_xfer(). Add suitable WARNs to
     give extra info.
Cc: Arkadiusz Hiler Cc: Michal Wajdeczko Tested-by: Xiang Haihao Signed-off-by: Anusha Srivatsa Signed-off-by: Alex Dai Signed-off-by: Peter Antoine --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/i915_drv.c | 3 + drivers/gpu/drm/i915/i915_drv.h | 2 + drivers/gpu/drm/i915/i915_guc_reg.h | 6 + drivers/gpu/drm/i915/intel_guc_loader.c | 7 +- drivers/gpu/drm/i915/intel_huc_loader.c | 265 drivers/gpu/drm/i915/intel_uc.h | 14 ++ 7 files changed, 295 insertions(+), 3 deletions(-) create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 5196509..45ae124 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \ # general-purpose microcontroller (GuC) support i915-y += intel_uc.o \ intel_guc_loader.o \ + intel_huc_loader.o \ i915_guc_submission.o # autogenerated null render state diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index aefab9a..5a90829 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev) if (ret) goto cleanup_irq; + intel_huc_init(dev_priv); intel_guc_init(dev_priv); ret = i915_gem_init(dev_priv); @@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev) i915_gem_fini(dev_priv); cleanup_irq: intel_guc_fini(dev_priv); + intel_huc_fini(dev); drm_irq_uninstall(dev); intel_teardown_gmbus(dev_priv); cleanup_csr: @@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev) drain_workqueue(dev_priv->wq); intel_guc_fini(dev_priv); + intel_huc_fini(dev); i915_gem_fini(dev_priv); intel_fbc_cleanup_cfb(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index b84c1d1..2a17df2 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2073,6 +2073,7 @@ struct drm_i915_private { 
struct intel_gvt *gvt; + struct intel_huc huc; struct intel_guc guc; struct intel_csr csr; @@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv) #define HAS_GUC(dev_priv) ((dev_priv)->info.has_guc) #define HAS_GUC_UCODE(dev_priv)(HAS_GUC(dev_priv)) #define HAS_GUC_SCHED(dev_priv)(HAS_GUC(dev_priv)) +#define HAS_HUC_UCODE(dev_priv)(HAS_GUC(dev_priv)) #define HAS_RESOURCE_STREAMER(dev_priv) ((dev_priv)->info.has_resource_streamer) diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h index 6a0adaf..35cf991 100644 --- a/drivers/gpu/drm/i915/i915_guc_reg.h +++ b/drivers/gpu/drm/i915/i915_guc_reg.h @@ -61,12 +61,18 @@ #define DMA_ADDRESS_SPACE_GTT (8 << 16) #define DMA_COPY_SIZE _MMIO(0xc310) #define DMA_CTRL _MMIO(0xc314) +#define HUC_UKERNEL
Re: [Intel-gfx] [PATCHv2] drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround.
On Wed, Jan 11, 2017 at 12:24:59PM +, Chris Wilson wrote:
> On Wed, Jan 11, 2017 at 02:07:37PM +0200, Mika Kuoppala wrote:
> > Daniel Vetter writes:
> >
> > > On Mon, Jan 09, 2017 at 01:07:56PM -0800, Francisco Jerez wrote:
> > >> The WaDisableLSQCROPERFforOCL workaround has the side effect of
> > >> disabling an L3SQ optimization that has huge performance implications
> > >> and is unlikely to be necessary for the correct functioning of usual
> > >> graphic workloads. Userspace is free to re-enable the workaround on
> > >> demand, and is generally in a better position to determine whether the
> > >> workaround is necessary than the DRM is (e.g. only during the
> > >> execution of compute kernels that rely on both L3 fences and HDC R/W
> > >> requests).
> > >>
> > >> The same workaround seems to apply to BDW (at least to production
> > >> stepping G1) and SKL as well (the internal workaround database claims
> > >> that it does for all steppings, while the BSpec workaround table only
> > >> mentions pre-production steppings), but the DRM doesn't do anything
> > >> beyond whitelisting the L3SQCREG4 register so userspace can enable it
> > >> when it sees fit. Do the same on KBL platforms.
> > >>
> > >> Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
> > >> and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
> > >> This is followed by a regression of 35% and 10% respectively for the
> > >> same benchmarks and platform caused by my recent patch series
> > >> switching userspace to use the dataport constant cache instead of the
> > >> sampler to implement uniform pull constant loads, which caused us to
> > >> hit more heavily the L3 cache (and on platforms other than KBL had the
> > >> opposite effect of improving performance of the same two benchmarks).
> > >> The overall effect on KBL of this change combined with the recent
> > >> userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
> > >> was affected by the constant cache changes (though it improved as it
> > >> did on other platforms rather than regressing), but is not
> > >> significantly affected by this patch (with statistical significance of
> > >> 5% and sample size 20).
> > >>
> > >> v2: Drop some more code to avoid unused variable warning.
> > >>
> > >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
> > >> Signed-off-by: Francisco Jerez
> > >> Cc: Eero Tamminen
> > >> Cc: Jani Nikula
> > >> Cc: Mika Kuoppala
> > >> Cc: beig...@lists.freedesktop.org
> > >
> > > Don't we need some userspace flag/opt-in scheme to avoid stuff going boom
> > > for compute kernels? Are the patches for mesa compute/beignet
> > > ready&reviewed?
> >
> > This is explicit setting on kbl/E0 only. So one could argue
> > that unless they filter based on PCI-IDs, things would already
> > blow up across the skl/kbl population, if they forgot
> > to set it. The whitelisting is in place and looks sane
> > so this E0 exception is a wart that got in by me reading wa
> > database slavishly without thinking.
>
> Add Fixes then?

Yeah, cc: stable would be good to make sure it shows up in all supported
kernels, fast. Otherwise we'll get some good wtf bug reports.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
On Wed, Jan 11, 2017 at 01:01:18PM +, Chris Wilson wrote:
> On Wed, Jan 11, 2017 at 12:56:03PM -, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915: Suppress switch_mm emission between the same
> > aliasing_ppgtt
> > URL   : https://patchwork.freedesktop.org/series/17823/
> > State : failure
> >
> > == Summary ==
> >
> > Series 17823v1 drm/i915: Suppress switch_mm emission between the same
> > aliasing_ppgtt
> > https://patchwork.freedesktop.org/api/1.0/series/17823/revisions/1/mbox/
> >
> > Test drv_hangman:
> >         Subgroup error-state-basic:
> >                 pass       -> TIMEOUT    (fi-hsw-4770)
> > Test gem_close_race:
> >         Subgroup basic-process:
> >                 pass       -> INCOMPLETE (fi-hsw-4770)
> >         Subgroup basic-threads:
> >                 pass       -> INCOMPLETE (fi-hsw-4770r)
> > Test gem_ctx_basic:
> >                 pass       -> INCOMPLETE (fi-ivb-3520m)
>
> Ooh. That is suitably scary that there is something wrong going on here.
> Still think the patch is sane by itself, so suspecting there is
> something not meeting the eye here.

To further demonstrate the bizarreness, they are all *CPU* lockups.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] [PATCH] drm/i915/huc: Support HuC authentication
From: Peter Antoine

The HuC authentication is done by host2guc call. The HuC RSA keys
are sent to GuC for authentication.

v2: rebased on top of drm-intel-nightly.
    changed name format and upped version 1.7.
v3: rebased on top of drm-intel-nightly.
v4: changed wait_for_atomic to wait_for
v5: rebased.
v7: rebased.
v8: rebased.
v9: rebased. Rename intel_huc_auh() to intel_guc_auth_huc()
    and place the prototype in intel_guc.h, correct the comments.
v10: rebased.
v11: rebased.
v12: rebased on top of drm-tip
v13: rebased. Moved intel_guc_auth_huc from i915_guc_submission.c
     to intel_uc.c. Update dev to dev_priv in intel_guc_auth_huc().
     Renamed HOST2GUC_ACTION_AUTHENTICATE_HUC to
     INTEL_GUC_ACTION_AUTHENTICATE_HUC
v14: rebased.
v15: rebased. Add newline on DRM_ERRORs that don't already have one.
v16: rebased. Replace wait_for with intel_wait_for_register() since
     the latter employs sleep optimisations for quick responses, as pointed
     out by Chris Wilson.
v17: rebased. Cleanup the intel_guc_auth_huc() by removing checks
     already performed in earlier functions. Make comments more descriptive.
Cc: Chris Wilson Cc: Arkadiusz Hiler Cc: Michal Wajdeczko Tested-by: Xiang Haihao Signed-off-by: Anusha Srivatsa Signed-off-by: Alex Dai Signed-off-by: Peter Antoine --- drivers/gpu/drm/i915/intel_guc_fwif.h | 1 + drivers/gpu/drm/i915/intel_guc_loader.c | 2 ++ drivers/gpu/drm/i915/intel_uc.c | 56 - drivers/gpu/drm/i915/intel_uc.h | 1 + 4 files changed, 59 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h index ed1ab40..25691f0 100644 --- a/drivers/gpu/drm/i915/intel_guc_fwif.h +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h @@ -505,6 +505,7 @@ enum intel_guc_action { INTEL_GUC_ACTION_ENTER_S_STATE = 0x501, INTEL_GUC_ACTION_EXIT_S_STATE = 0x502, INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003, + INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000, INTEL_GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000, INTEL_GUC_ACTION_LIMIT }; diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c index 3b05232..967ab2f 100644 --- a/drivers/gpu/drm/i915/intel_guc_loader.c +++ b/drivers/gpu/drm/i915/intel_guc_loader.c @@ -529,6 +529,8 @@ int intel_guc_setup(struct drm_i915_private *dev_priv) intel_uc_fw_status_repr(guc_fw->fetch_status), intel_uc_fw_status_repr(guc_fw->load_status)); + intel_guc_auth_huc(dev_priv); + if (i915.enable_guc_submission) { if (i915.guc_log_level >= 0) gen9_enable_guc_interrupts(dev_priv); diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c index c6be352..7dabbe6 100644 --- a/drivers/gpu/drm/i915/intel_uc.c +++ b/drivers/gpu/drm/i915/intel_uc.c @@ -46,7 +46,7 @@ static bool intel_guc_recv(struct intel_guc *guc, u32 *status) int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len) { struct drm_i915_private *dev_priv = guc_to_i915(guc); - u32 status; + u32 status = 0; int i; int ret; @@ -140,3 +140,57 @@ int intel_guc_log_control(struct intel_guc *guc, u32 control_val) return intel_guc_send(guc, action, ARRAY_SIZE(action)); } + +/** + * 
intel_guc_auth_huc() - authenticate ucode + * @dev_priv: the drm_i915_device + * + * Triggers a HuC fw authentication request to the GuC via intel_guc_action_ + * authenticate_huc interface. + * interface. + */ +void intel_guc_auth_huc(struct drm_i915_private *dev_priv) +{ + struct intel_guc *guc = &dev_priv->guc; + struct intel_huc *huc = &dev_priv->huc; + struct i915_vma *vma; + int ret; + u32 data[2]; + + vma = i915_gem_object_ggtt_pin(huc->fw.obj, NULL, 0, 0, 0); + if (IS_ERR(vma)) { + DRM_DEBUG_DRIVER("failed to pin huc fw object %d\n", + (int)PTR_ERR(vma)); + return; + } + + + /* Invalidate GuC TLB to let GuC take the latest updates to GTT. */ + I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE); + + /* Specify auth action and where public signature is. */ + data[0] = INTEL_GUC_ACTION_AUTHENTICATE_HUC; + data[1] = i915_ggtt_offset(vma) + huc->fw.rsa_offset; + + ret = intel_guc_send(guc, data, ARRAY_SIZE(data)); + if (ret) { + DRM_ERROR("HuC: GuC did not ack Auth request\n"); + goto out; + } + + /* Check authentication status, it should be done by now */ + ret = intel_wait_for_register(dev_priv, + HUC_STATUS2, + HUC_FW_VERIFIED, + HUC_FW_VERIFIED, + 50); + + if (ret) { + DRM_ERROR("HuC: Authentication failed\n"); + goto out; + } + + DRM_ERROR("HuC Authentication Suc
Re: [Intel-gfx] [PATCH i-g-t v3] tools: Add intel_dp_compliance for DisplayPort 1.2 compliance automation
Hi The copyright statements still need the year corrected. intel_dp_compliance needs to be added to tools/.gitignore Some new comments also: - Why do some of the prints have \r\n? - Building intel_dp_compliance should actually be made conditional upon HAVE_UDEV -- Petri Latvala On Fri, Dec 23, 2016 at 09:47:48AM +0200, Pandiyan, Dhinakaran wrote: > I have addressed review comments that Petri, Jim had for this patch along > with making some small changes for error handling. The functionality is > mostly unchanged from Manasi's version. > > -DK > > From: Pandiyan, Dhinakaran > Sent: Thursday, December 22, 2016 11:41 PM > To: intel-gfx@lists.freedesktop.org > Cc: jim.br...@linux.intel.com; Navare, Manasi D; Latvala, Petri; Vlad, Marius > C; Daniel Vetter; Pandiyan, Dhinakaran > Subject: [PATCH i-g-t v3] tools: Add intel_dp_compliance for DisplayPort 1.2 > compliance automation > > From: "Navare, Manasi D" > > This is the userspace component of the Displayport Compliance > testing software required for compliance testing of the I915 > Display Port driver. This must be running in order to successfully > complete Display Port compliance testing. This app and the kernel > code that accompanies it has been written to satify the requirements > of the Displayport Link CTS 1.2 rev1.1 specification from VESA. > Note that this application does not support eDP compliance testing. > This utility has an automation support for the Link training tests > (4.3.1.1. - 4.3.2.3), EDID tests (4.2.2.3 > - 4.2.2.6) and Video Pattern generation tests (4.3.3.1) from CTS > specification 1.2 Rev 1.1. > > This tool has the support for responding to the hotplug uevents > sent by compliance testting unit after each test. > > The Linux DUT running this utility must be in text (console) mode > and cannot have any other display manager running. Since this uses > sysfs nodes for kernel interaction, this utility should be run as > Root. 
Once this user application is up and running, waiting for > test requests, the test appliance software on the windows host > can now be used to execute the compliance tests. > > This app is based on some prior work done in April 2015 (by > Todd Previte ) > > v2: > * Add mode unset on hotplug uevent on disconnect (Manasi Navare) > > v3: > Made capitalization consistent > Reduced line lengths > Added return value checks > Changed how GLib is linked > Fixed build warnings > > Cc: Petri Latvala > Cc: Marius Vlad > Cc: Daniel Vetter > Signed-off-by: Manasi Navare > Signed-off-by: Dhinakaran Pandiyan > --- > tools/Makefile.am |1 + > tools/Makefile.sources |7 + > tools/intel_dp_compliance.c | 1104 > +++ > tools/intel_dp_compliance.h | 35 ++ > tools/intel_dp_compliance_hotplug.c | 123 > 5 files changed, 1270 insertions(+) > create mode 100644 tools/intel_dp_compliance.c > create mode 100644 tools/intel_dp_compliance.h > create mode 100644 tools/intel_dp_compliance_hotplug.c > > diff --git a/tools/Makefile.am b/tools/Makefile.am > index 18f86f6..bd8f512 100644 > --- a/tools/Makefile.am > +++ b/tools/Makefile.am > @@ -16,6 +16,7 @@ AM_CFLAGS = $(DEBUG_CFLAGS) $(DRM_CFLAGS) > $(PCIACCESS_CFLAGS) $(CWARNFLAGS) \ > LDADD = $(top_builddir)/lib/libintel_tools.la > AM_LDFLAGS = -Wl,--as-needed > > +intel_dp_compliance_LDADD = $(top_builddir)/lib/libintel_tools.la > $(GLIB_LIBS) > > # aubdumper > > diff --git a/tools/Makefile.sources b/tools/Makefile.sources > index e2451ea..e8ce891 100644 > --- a/tools/Makefile.sources > +++ b/tools/Makefile.sources > @@ -13,6 +13,7 @@ tools_prog_lists =\ > intel_bios_reader \ > intel_display_crc \ > intel_display_poller\ > + intel_dp_compliance \ > intel_forcewaked\ > intel_gpu_frequency \ > intel_firmware_decode \ > @@ -56,3 +57,9 @@ intel_l3_parity_SOURCES = \ > intel_l3_parity.h \ > intel_l3_udev_listener.c > > +intel_dp_compliance_SOURCES = \ > +intel_dp_compliance.c \ > +intel_dp_compliance.h \ > +intel_dp_compliance_hotplug.c \ > +$(NULL) > 
+ > diff --git a/tools/intel_dp_compliance.c b/tools/intel_dp_compliance.c > new file mode 100644 > index 000..df1ca10 > --- /dev/null > +++ b/tools/intel_dp_compliance.c > @@ -0,0 +1,1104 @@ > +/* > + * Copyright © 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > +
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/edid: Improve RGB limited range handling a bit
== Series Details ==

Series: drm/edid: Improve RGB limited range handling a bit
URL   : https://patchwork.freedesktop.org/series/17825/
State : success

== Summary ==

Series 17825v1 drm/edid: Improve RGB limited range handling a bit
https://patchwork.freedesktop.org/api/1.0/series/17825/revisions/1/mbox/

Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                incomplete -> SKIP       (fi-bsw-n3050)
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                incomplete -> PASS       (fi-byt-n2820)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:83   pass:71   dwarn:0   dfail:0   fail:0   skip:11
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

f0350ffa1b2bc16dc49fdc2fce10776d604a1c5f drm-tip: 2017y-01m-11d-12h-34m-12s UTC integration manifest
af78a86 drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F
6e2f2ab drm/edid: Set AVI infoframe Q even when QS=0
ac00d36 drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
d7e4b07 drm/edid: Introduce drm_default_rgb_quant_range()
142daf0 drm/edid: Have drm_edid.h include hdmi.h

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3479/
[Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle
It is an error to start a new request on the same timeline (ringbuffer)
as the current one before the current is submitted. If there are two
requests emitting to the ringbuffer at the same time, the operation is
undefined. We can catch this by checking for the timeline having a later
seqno than ours when we come to submit our request.

Currently we have this check at the end of __i915_add_request, but
having an early check as well isolates a failure in the caller versus a
failure in sealing the request (i.e. from inside __i915_add_request
itself). For example, CI is currently tripping over this late assertion
on ctg/ilk:

[ 100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
[ 100.336333] [ cut here ]
[ 100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
[ 100.336347] invalid opcode: [#1] PREEMPT SMP
[ 100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
[ 100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U 4.10.0-rc3-CI-CI_DRM_2045+ #1
[ 100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
[ 100.336386] task: 88012b738040 task.stack: c956
[ 100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
[ 100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212
[ 100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: 0001
[ 100.336456] RDX: 8001 RSI: 88012b738860 RDI:
[ 100.336461] RBP: c9563b00 R08: 880133bb8780 R09:
[ 100.336466] R10: R11: R12: 88012f53d950
[ 100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: 88012f53d960
[ 100.336477] FS: 7f0d19da38c0() GS:88013bc0() knlGS:
[ 100.336483] CS: 0010 DS: ES: CR0: 80050033
[ 100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: 000406f0
[ 100.336496] Call Trace:
[ 100.336527] i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
[ 100.336559] i915_gem_evict_vm+0x202/0x2b0 [i915]
[ 100.336590] i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
[ 100.336623] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
[ 100.336656] i915_gem_execbuffer2+0xc0/0x250 [i915]
[ 100.33] drm_ioctl+0x200/0x450
[ 100.336697] ? i915_gem_execbuffer+0x330/0x330 [i915]
[ 100.336708] do_vfs_ioctl+0x90/0x6e0
[ 100.336716] ? up_read+0x1a/0x40
[ 100.336723] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 100.336730] SyS_ioctl+0x3c/0x70
[ 100.336738] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 100.336745] RIP: 0033:0x7f0d187cb357
[ 100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: 0010
[ 100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: 7f0d187cb357
[ 100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: 0003
[ 100.336775] RBP: R08: R09: 0022
[ 100.336782] R10: 0007 R11: 0246 R12: 0002
[ 100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: 7ffe0b2f7d50
[ 100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
[ 100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: c9563ac0
[ 100.336886] ---[ end trace 22b36545479e5eb7 ]---

Signed-off-by: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
---
 drivers/gpu/drm/i915/i915_gem_request.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 99056b948eda..1ad47f08e1fd 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	lockdep_assert_held(&request->i915->drm.struct_mutex);
 	trace_i915_gem_request_add(request);
 
+	/* Make sure that no request gazzumped us - if it was allocated after
+	 * our i915_gem_request_alloc() and called __i915_add_request() before
+	 * us, the timeline will hold its seqno which is later than ours.
+	 */
+	GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno,
+				     request->fence.seqno));
+
 	/*
 	 * To ensure that this call will not fail, space for its emissions
 	 * should already have been reserved in the ring buffer. Let the ring
-- 
2.11.0
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
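The early assertion in the patch above relies on comparing two 32-bit sequence numbers in a way that stays correct when the counter wraps. As a rough userspace sketch of that idea (the names seqno_passed() and submission_order_ok() are illustrative stand-ins, not the driver's own symbols, and the signed cast assumes the usual two's-complement behaviour):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative stand-in for a wraparound-safe seqno comparison:
 * returns true when seq1 is at or after seq2 in 32-bit sequence space.
 * The unsigned subtraction wraps, and the signed cast turns "slightly
 * ahead" into a small positive value even across the 0xffffffff -> 0 edge.
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/*
 * Hypothetical helper mirroring the condition the patch asserts on:
 * the ordering is fine only while the timeline's last submitted seqno
 * has not yet reached ours.
 */
static bool submission_order_ok(uint32_t last_submitted, uint32_t ours)
{
	return !seqno_passed(last_submitted, ours);
}
```

With this sketch, a timeline whose last submitted seqno is 4 accepts a request with seqno 5, while a last submitted seqno of 5 or 6 would be the situation the new GEM_BUG_ON is meant to catch.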
Re: [Intel-gfx] [PATCH] drm/i915/huc: Support HuC authentication
On Wed, Jan 11, 2017 at 05:36:49AM -0800, Anusha Srivatsa wrote:
> From: Peter Antoine
>
> The HuC authentication is done by host2guc call. The HuC RSA keys
> are sent to GuC for authentication.
>
> v2: rebased on top of drm-intel-nightly.
>     changed name format and upped version 1.7.
> v3: rebased on top of drm-intel-nightly.
> v4: changed wait_for_automic to wait_for
> v5: rebased.
> v7: rebased.
> v8: rebased.
> v9: rebased. Rename intel_huc_auh() to intel_guc_auth_huc()
>     and place the prototype in intel_guc.h,correct the comments.
> v10: rebased.
> v11: rebased.
> v12: rebased on top of drm-tip
> v13: rebased. Moved intel_guc_auth_huc from i915_guc_submission.c
>      to intel_uc.c.Update dev to dev_priv in intel_guc_auth_huc().
>      Renamed HOST2GUC_ACTION_AUTHENTICATE_HUC TO INTEL_GUC_ACTION_
>      AUTHENTICATE_HUC
> v14: rebased.
> v15: rebased. Add newline on DRM_ERRORs that already dont have one.
> v16: rebased. Replace wait_for with intel_wait_for_register() since
>      the latter employs sleep optimisations for quick responses- as pointed
>      out by Chris Wilson.
> v17: rebased. Cleanup the intel_guc_auth_huc() by removing checks
>      already performed in earlier functions. Make comments more descriptive.
>
> Cc: Chris Wilson
> Cc: Arkadiusz Hiler
> Cc: Michal Wajdeczko

There is still a typo in my email ;(

> Tested-by: Xiang Haihao
> Signed-off-by: Anusha Srivatsa
> Signed-off-by: Alex Dai
> Signed-off-by: Peter Antoine
> ---
>  drivers/gpu/drm/i915/intel_guc_fwif.h   |  1 +
>  drivers/gpu/drm/i915/intel_guc_loader.c |  2 ++
>  drivers/gpu/drm/i915/intel_uc.c         | 56 -
>  drivers/gpu/drm/i915/intel_uc.h         |  1 +
>  4 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_guc_fwif.h b/drivers/gpu/drm/i915/intel_guc_fwif.h
> index ed1ab40..25691f0 100644
> --- a/drivers/gpu/drm/i915/intel_guc_fwif.h
> +++ b/drivers/gpu/drm/i915/intel_guc_fwif.h
> @@ -505,6 +505,7 @@ enum intel_guc_action {
>  	INTEL_GUC_ACTION_ENTER_S_STATE = 0x501,
>  	INTEL_GUC_ACTION_EXIT_S_STATE = 0x502,
>  	INTEL_GUC_ACTION_SLPC_REQUEST = 0x3003,
> +	INTEL_GUC_ACTION_AUTHENTICATE_HUC = 0x4000,
>  	INTEL_GUC_ACTION_UK_LOG_ENABLE_LOGGING = 0x0E000,
>  	INTEL_GUC_ACTION_LIMIT
>  };
> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
> index 3b05232..967ab2f 100644
> --- a/drivers/gpu/drm/i915/intel_guc_loader.c
> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
> @@ -529,6 +529,8 @@ int intel_guc_setup(struct drm_i915_private *dev_priv)
>  		intel_uc_fw_status_repr(guc_fw->fetch_status),
>  		intel_uc_fw_status_repr(guc_fw->load_status));
>
> +	intel_guc_auth_huc(dev_priv);
> +
>  	if (i915.enable_guc_submission) {
>  		if (i915.guc_log_level >= 0)
>  			gen9_enable_guc_interrupts(dev_priv);
> diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
> index c6be352..7dabbe6 100644
> --- a/drivers/gpu/drm/i915/intel_uc.c
> +++ b/drivers/gpu/drm/i915/intel_uc.c
> @@ -46,7 +46,7 @@ static bool intel_guc_recv(struct intel_guc *guc, u32 *status)
>  int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
>  {
>  	struct drm_i915_private *dev_priv = guc_to_i915(guc);
> -	u32 status;
> +	u32 status = 0;

Any reason why this chunk is included in the HuC patch? Merge error?

>  	int i;
>  	int ret;
>
> @@ -140,3 +140,57 @@ int intel_guc_log_control(struct intel_guc *guc, u32 control_val)
>
>  	return intel_guc_send(guc, action, ARRAY_SIZE(action));
>  }
> +
> +/**
> + * intel_guc_auth_huc() - authenticate ucode
> + * @dev_priv: the drm_i915_device
> + *
> + * Triggers a HuC fw authentication request to the GuC via intel_guc_action_
> + * authenticate_huc interface.
> + * interface.
> + */
> +void intel_guc_auth_huc(struct drm_i915_private *dev_priv)
> +{
> +	struct intel_guc *guc = &dev_priv->guc;
> +	struct intel_huc *huc = &dev_priv->huc;
> +	struct i915_vma *vma;
> +	int ret;
> +	u32 data[2];
> +
> +	vma = i915_gem_object_ggtt_pin(huc->fw.obj, NULL, 0, 0, 0);
> +	if (IS_ERR(vma)) {
> +		DRM_DEBUG_DRIVER("failed to pin huc fw object %d\n",
> +				(int)PTR_ERR(vma));
> +		return;
> +	}
> +
> +

Leave only one line

> +	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> +	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> +
> +	/* Specify auth action and where public signature is. */
> +	data[0] = INTEL_GUC_ACTION_AUTHENTICATE_HUC;
> +	data[1] = i915_ggtt_offset(vma) + huc->fw.rsa_offset;
> +
> +	ret = intel_guc_send(guc, data, ARRAY_SIZE(data));
> +	if (ret) {
> +		DRM_ERROR("HuC: GuC did not ack Auth request\n");
> +		goto out;
> +	}
> +
> +	/* Check authentication status, it should be done by now */
> +
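The quoted function sends the GuC a two-dword request: the action code, then the GGTT address of the RSA signature inside the pinned HuC firmware object. A minimal standalone sketch of just that message layout follows; build_huc_auth_request() and its parameters are hypothetical names for illustration, and only the 0x4000 action value is taken from the quoted intel_guc_fwif.h hunk.

```c
#include <stdint.h>

/* Action code as it appears in the quoted intel_guc_fwif.h hunk. */
#define INTEL_GUC_ACTION_AUTHENTICATE_HUC 0x4000u

/*
 * Hypothetical helper (not part of the patch): fill the two-dword
 * request with the action code and the address of the signature,
 * i.e. the firmware object's GGTT offset plus the RSA offset in it.
 */
static void build_huc_auth_request(uint32_t fw_ggtt_offset,
				   uint32_t rsa_offset,
				   uint32_t data[2])
{
	data[0] = INTEL_GUC_ACTION_AUTHENTICATE_HUC;
	data[1] = fw_ggtt_offset + rsa_offset;
}
```

In the patch itself the equivalent two assignments are made inline before the array is handed to intel_guc_send().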
Re: [Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support
On Wed, Jan 11, 2017 at 05:15:14AM -0800, Anusha Srivatsa wrote:
> The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
> is used for both cases.
>
> HuC loading needs to be before GuC loading. The WOPCM setting must
> be done early before loading any of them.
>
> v2: rebased on-top of drm-intel-nightly.
>     removed if(HAS_GUC()) before the guc call. (D.Gordon)
>     update huc_version number of format.
> v3: rebased to drm-intel-nightly, changed the file name format to
>     match the one in the huc package.
>     Changed dev->dev_private to to_i915()
> v4: moved function back to where it was.
>     change wait_for_atomic to wait_for.
> v5: rebased + comment changes.
> v7: rebased.
> v8: rebased.
> v9: rebased. Changed the year in the copyright message to reflect
>     the right year.Correct the comments,remove the unwanted WARN message,
>     replace drm_gem_object_unreference() with i915_gem_object_put().Make the
>     prototypes in intel_huc.h non-extern.
> v10: rebased. Update the file construction done by HuC. It is similar to
>      GuC.Adopted the approach used in-
>      https://patchwork.freedesktop.org/patch/104355/
> v11: Fix warnings remove old declaration
> v12: Change dev to dev_priv in macro definition.
>      Corrected comments.
> v13: rebased.
> v14: rebased on top of drm-tip
> v15: rebased. Updated functions intel_huc_load(),intel_huc_init() and
>      intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents
>      of intel_huc.h to intel_uc.h
> v16: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to guc_wopcm_size().
>      Remove unwanted checks in intel_uc.h. Rename huc_fw in struct intel_huc to
>      simply fw to avoid redundency.
> v17: rebased.
> v18: rebased. Correct comments.
> v19: rebased. Correct comments. move definition to i915_guc_reg.h from
>      intel_uc.h. Clean DMA_CTRL bits after HuC DMA transfer in huc_ucode_xfer()
>      instead of guc_ucode_xfer(). Add suitable WARNs to give extra info.
>
> Cc: Arkadiusz Hiler
> Cc: Michal Wajdeczko
> Tested-by: Xiang Haihao
> Signed-off-by: Anusha Srivatsa
> Signed-off-by: Alex Dai
> Signed-off-by: Peter Antoine
> ---
>  drivers/gpu/drm/i915/Makefile           |   1 +
>  drivers/gpu/drm/i915/i915_drv.c         |   3 +
>  drivers/gpu/drm/i915/i915_drv.h         |   2 +
>  drivers/gpu/drm/i915/i915_guc_reg.h     |   6 +
>  drivers/gpu/drm/i915/intel_guc_loader.c |   7 +-
>  drivers/gpu/drm/i915/intel_huc_loader.c | 265
>  drivers/gpu/drm/i915/intel_uc.h         |  14 ++
>  7 files changed, 295 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 5196509..45ae124 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \
>  # general-purpose microcontroller (GuC) support
>  i915-y += intel_uc.o \
>  	  intel_guc_loader.o \
> +	  intel_huc_loader.o \
>  	  i915_guc_submission.o
>
>  # autogenerated null render state
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index aefab9a..5a90829 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>  	if (ret)
>  		goto cleanup_irq;
>
> +	intel_huc_init(dev_priv);
>  	intel_guc_init(dev_priv);
>
>  	ret = i915_gem_init(dev_priv);
> @@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
>  	i915_gem_fini(dev_priv);
>  cleanup_irq:
>  	intel_guc_fini(dev_priv);
> +	intel_huc_fini(dev);
>  	drm_irq_uninstall(dev);
>  	intel_teardown_gmbus(dev_priv);
>  cleanup_csr:
> @@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev)
>  	drain_workqueue(dev_priv->wq);
>
>  	intel_guc_fini(dev_priv);
> +	intel_huc_fini(dev);

Hmm, still not dev_priv?

>  	i915_gem_fini(dev_priv);
>  	intel_fbc_cleanup_cfb(dev_priv);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b84c1d1..2a17df2 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2073,6 +2073,7 @@ struct drm_i915_private {
>
>  	struct intel_gvt *gvt;
>
> +	struct intel_huc huc;
>  	struct intel_guc guc;
>
>  	struct intel_csr csr;
> @@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv)
>  #define HAS_GUC(dev_priv)	((dev_priv)->info.has_guc)
>  #define HAS_GUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
>  #define HAS_GUC_SCHED(dev_priv)	(HAS_GUC(dev_priv))
> +#define HAS_HUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
>
>  #define HAS_RESOURCE_STREAMER(dev_priv) ((dev_priv)->info.has_resource_streamer)
>
> diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
> index 6a0adaf..35cf991 100644
> --- a/driv
Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle
Chris Wilson writes: > It is an error to start a new request on the same timeline (ringbuffer) > as the current one before the current is submitted. If there are two > requests emitting to the ringbuffer at the same time, the operation is > undefined. We can catch this by checking for the timeline having a later > seqno than ours when we come to submit out request. > > Currently we have this check at the end of __i915_add_request, but > having an early check as well isolates a failure in the caller versus a > failure in sealing the request (i.e. from inside __i915_add_request > itself). For example, CI is currently tripping over this late assertion > on ctg/ilk: > > [ 100.329399] [IGT] gem_cs_tlb: starting subtest basic-default > [ 100.336333] [ cut here ] > [ 100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908! > [ 100.336347] invalid opcode: [#1] PREEMPT SMP > [ 100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic > snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei > e1000e ptp pps_core [last unloaded: i915] > [ 100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U > 4.10.0-rc3-CI-CI_DRM_2045+ #1 > [ 100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) > 04/22/2009 > [ 100.336386] task: 88012b738040 task.stack: c956 > [ 100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915] > [ 100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212 > [ 100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: > 0001 > [ 100.336456] RDX: 8001 RSI: 88012b738860 RDI: > > [ 100.336461] RBP: c9563b00 R08: 880133bb8780 R09: > > [ 100.336466] R10: R11: R12: > 88012f53d950 > [ 100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: > 88012f53d960 > [ 100.336477] FS: 7f0d19da38c0() GS:88013bc0() > knlGS: > [ 100.336483] CS: 0010 DS: ES: CR0: 80050033 > [ 100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: > 000406f0 > [ 100.336496] Call Trace: > [ 100.336527] i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915] > [ 
100.336559] i915_gem_evict_vm+0x202/0x2b0 [i915] > [ 100.336590] i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915] > [ 100.336623] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915] > [ 100.336656] i915_gem_execbuffer2+0xc0/0x250 [i915] > [ 100.33] drm_ioctl+0x200/0x450 > [ 100.336697] ? i915_gem_execbuffer+0x330/0x330 [i915] > [ 100.336708] do_vfs_ioctl+0x90/0x6e0 > [ 100.336716] ? up_read+0x1a/0x40 > [ 100.336723] ? trace_hardirqs_on_caller+0x122/0x1b0 > [ 100.336730] SyS_ioctl+0x3c/0x70 > [ 100.336738] entry_SYSCALL_64_fastpath+0x1c/0xb1 > [ 100.336745] RIP: 0033:0x7f0d187cb357 > [ 100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: > 0010 > [ 100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: > 7f0d187cb357 > [ 100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: > 0003 > [ 100.336775] RBP: R08: R09: > 0022 > [ 100.336782] R10: 0007 R11: 0246 R12: > 0002 > [ 100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: > 7ffe0b2f7d50 > [ 100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c > 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> > 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be > [ 100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: > c9563ac0 > [ 100.336886] ---[ end trace 22b36545479e5eb7 ]--- > > Signed-off-by: Chris Wilson > Cc: Tvrtko Ursulin > Cc: Joonas Lahtinen Reviewed-by: Mika Kuoppala > --- > drivers/gpu/drm/i915/i915_gem_request.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_gem_request.c > b/drivers/gpu/drm/i915/i915_gem_request.c > index 99056b948eda..1ad47f08e1fd 100644 > --- a/drivers/gpu/drm/i915/i915_gem_request.c > +++ b/drivers/gpu/drm/i915/i915_gem_request.c > @@ -851,6 +851,13 @@ void __i915_add_request(struct drm_i915_gem_request > *request, bool flush_caches) > lockdep_assert_held(&request->i915->drm.struct_mutex); > trace_i915_gem_request_add(request); > > + /* Make sure that no request gazzumped us - if it was allocated after > + 
* our i915_gem_request_alloc() and called __i915_add_request() before > + * us, the timeline will hold its seqno which is later than ours. > + */ > + GEM_BUG_ON(i915_seqno_passed(timeline->last_submitted_seqno, > + request->fence.seqno)); > + > /* >* To ensure that this call will not fail, space for its emissions >* should already have been reserved in the ring buffer. L
[Intel-gfx] [PATCH v2 2/5] drm/edid: Introduce drm_default_rgb_quant_range()
From: Ville Syrjälä

Make the code selecting the RGB quantization range a little less magicy
by wrapping it up in a small helper.

v2: s/adjusted_mode/mode in vc4 to make it actually compile

Signed-off-by: Ville Syrjälä
---
 drivers/gpu/drm/drm_edid.c        | 18 ++
 drivers/gpu/drm/i915/intel_dp.c   |  4 +++-
 drivers/gpu/drm/i915/intel_hdmi.c |  3 ++-
 drivers/gpu/drm/vc4/vc4_hdmi.c    |  4 +++-
 include/drm/drm_edid.h            |  2 ++
 5 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 4ff04aa84dd0..304c583b8000 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
 }
 EXPORT_SYMBOL(drm_rgb_quant_range_selectable);
 
+/**
+ * drm_default_rgb_quant_range - default RGB quantization range
+ * @mode: display mode
+ *
+ * Determine the default RGB quantization range for the mode,
+ * as specified in CEA-861.
+ *
+ * Return: The default RGB quantization range for the mode
+ */
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode)
+{
+	return drm_match_cea_mode(mode) > 1 ?
+		HDMI_QUANTIZATION_RANGE_LIMITED :
+		HDMI_QUANTIZATION_RANGE_FULL;
+}
+EXPORT_SYMBOL(drm_default_rgb_quant_range);
+
 static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
 					   const u8 *hdmi)
 {
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 343e1d9fa761..d4befbbe834a 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 		 * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry
 		 */
 		pipe_config->limited_color_range =
-			bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1;
+			bpp != 18 &&
+			drm_default_rgb_quant_range(adjusted_mode) ==
+			HDMI_QUANTIZATION_RANGE_LIMITED;
 	} else {
 		pipe_config->limited_color_range =
 			intel_dp->limited_color_range;
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index 0bcfead14571..19bd13f53729 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 		/* See CEA-861-E - 5.1 Default Encoding Parameters */
 		pipe_config->limited_color_range =
 			pipe_config->has_hdmi_sink &&
-			drm_match_cea_mode(adjusted_mode) > 1;
+			drm_default_rgb_quant_range(adjusted_mode) ==
+			HDMI_QUANTIZATION_RANGE_LIMITED;
 	} else {
 		pipe_config->limited_color_range =
 			intel_hdmi->limited_color_range;
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index c4cb2e26de32..5d49bf948162 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder *encoder,
 	csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR,
 				VC4_HD_CSC_CTL_ORDER);
 
-	if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) {
+	if (vc4_encoder->hdmi_monitor &&
+	    drm_default_rgb_quant_range(mode) ==
+	    HDMI_QUANTIZATION_RANGE_LIMITED) {
 		/* CEA VICs other than #1 requre limited range RGB
 		 * output unless overridden by an AVI infoframe.
 		 * Apply a colorspace conversion to squash 0-255 down
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 838eaf2b42e9..25cdf5f7a0d8 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const u8 video_code);
 bool drm_detect_hdmi_monitor(struct edid *edid);
 bool drm_detect_monitor_audio(struct edid *edid);
 bool drm_rgb_quant_range_selectable(struct edid *edid);
+enum hdmi_quantization_range
+drm_default_rgb_quant_range(const struct drm_display_mode *mode);
 int drm_add_modes_noedid(struct drm_connector *connector,
 			 int hdisplay, int vdisplay);
 void drm_set_preferred_mode(struct drm_connector *connector,
-- 
2.10.2
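The rule the new helper wraps is small: any CEA mode other than VIC 1 defaults to limited-range RGB, and everything else (VIC 1 or a non-CEA mode) defaults to full range. A self-contained sketch of that decision, taking the VIC directly instead of a struct drm_display_mode; the enum and function names below are illustrative stand-ins and not the kernel's:

```c
#include <stdint.h>

/* Illustrative stand-in for the kernel's quantization range enum. */
enum quant_range {
	QUANT_RANGE_LIMITED,
	QUANT_RANGE_FULL,
};

/*
 * vic: CEA video identification code of the mode, with 0 assumed to
 * mean "not a CEA mode" (which is what a failed match would yield).
 */
static enum quant_range default_rgb_quant_range(uint8_t vic)
{
	return vic > 1 ? QUANT_RANGE_LIMITED : QUANT_RANGE_FULL;
}
```

Callers can then compare against a named value rather than repeating the bare "> 1" test, which is the readability gain the commit message describes.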
Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle
On ke, 2017-01-11 at 14:08 +, Chris Wilson wrote: > It is an error to start a new request on the same timeline (ringbuffer) > as the current one before the current is submitted. If there are two > requests emitting to the ringbuffer at the same time, the operation is > undefined. We can catch this by checking for the timeline having a later > seqno than ours when we come to submit out request. > > Currently we have this check at the end of __i915_add_request, but > having an early check as well isolates a failure in the caller versus a > failure in sealing the request (i.e. from inside __i915_add_request > itself). For example, CI is currently tripping over this late assertion > on ctg/ilk: > > [ 100.329399] [IGT] gem_cs_tlb: starting subtest basic-default > [ 100.336333] [ cut here ] > [ 100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908! > [ 100.336347] invalid opcode: [#1] PREEMPT SMP > [ 100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic > snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei > e1000e ptp pps_core [last unloaded: i915] > [ 100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U > 4.10.0-rc3-CI-CI_DRM_2045+ #1 > [ 100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) > 04/22/2009 > [ 100.336386] task: 88012b738040 task.stack: c956 > [ 100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915] > [ 100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212 > [ 100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: > 0001 > [ 100.336456] RDX: 8001 RSI: 88012b738860 RDI: > > [ 100.336461] RBP: c9563b00 R08: 880133bb8780 R09: > > [ 100.336466] R10: R11: R12: > 88012f53d950 > [ 100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: > 88012f53d960 > [ 100.336477] FS: 7f0d19da38c0() GS:88013bc0() > knlGS: > [ 100.336483] CS: 0010 DS: ES: CR0: 80050033 > [ 100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: > 000406f0 > [ 100.336496] Call Trace: > [ 100.336527] 
i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915] > [ 100.336559] i915_gem_evict_vm+0x202/0x2b0 [i915] > [ 100.336590] i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915] > [ 100.336623] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915] > [ 100.336656] i915_gem_execbuffer2+0xc0/0x250 [i915] > [ 100.33] drm_ioctl+0x200/0x450 > [ 100.336697] ? i915_gem_execbuffer+0x330/0x330 [i915] > [ 100.336708] do_vfs_ioctl+0x90/0x6e0 > [ 100.336716] ? up_read+0x1a/0x40 > [ 100.336723] ? trace_hardirqs_on_caller+0x122/0x1b0 > [ 100.336730] SyS_ioctl+0x3c/0x70 > [ 100.336738] entry_SYSCALL_64_fastpath+0x1c/0xb1 > [ 100.336745] RIP: 0033:0x7f0d187cb357 > [ 100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: > 0010 > [ 100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: > 7f0d187cb357 > [ 100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: > 0003 > [ 100.336775] RBP: R08: R09: > 0022 > [ 100.336782] R10: 0007 R11: 0246 R12: > 0002 > [ 100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: > 7ffe0b2f7d50 > [ 100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c > 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> > 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be > [ 100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: > c9563ac0 > [ 100.336886] ---[ end trace 22b36545479e5eb7 ]--- > > Signed-off-by: Chris Wilson > Cc: Tvrtko Ursulin > Cc: Joonas Lahtinen Reviewed-by: Joonas Lahtinen Regards, Joonas -- Joonas Lahtinen Open Source Technology Center Intel Corporation ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support
On Wed, Jan 11, 2017 at 03:13:29PM +0100, Michal Wajdeczko wrote:
> > +	vma = i915_gem_object_ggtt_pin(huc_fw->obj, NULL, 0, 0, 0);
> > +	if (IS_ERR(vma)) {
> > +		DRM_DEBUG_DRIVER("pin failed %d\n", (int)PTR_ERR(vma));
> > +		return PTR_ERR(vma);
> > +	}

Just asking a stupid question: Does the HuC have the same limitation as
the GuC on not being able to map certain ranges of the GuC? From the
earlier discussion on the failures, I got the impression the HuC had the
same limitations.

> > +
> > +	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
> > +	I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE);
> > +
> > +	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> > +
> > +	/* init WOPCM */
> > +	I915_WRITE(GUC_WOPCM_SIZE, intel_guc_wopcm_size(dev_priv));
> > +	I915_WRITE(DMA_GUC_WOPCM_OFFSET, GUC_WOPCM_OFFSET_VALUE |
> > +			HUC_LOADING_AGENT_GUC);
> > +
> > +	/* Set the source address for the uCode */
> > +	offset = i915_ggtt_offset(vma) + huc_fw->header_offset;

If huc does have the same limits as the guc, please use
guc_ggtt_offset() for the extra verification on the address before use.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
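The "extra verification" asked for above amounts to checking that the whole firmware range sits inside the window the microcontroller can address before the offset is programmed. A rough sketch of such a check follows; the window struct, the function name and any bounds passed to it are hypothetical, and the real limits would have to come from the driver, not from this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of an addressable window, [lo, hi). */
struct fw_addr_window {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Returns true when [offset, offset + size) lies entirely inside the
 * window. The comparison is arranged so that offset + size is never
 * computed, which avoids 32-bit overflow near the top of the range.
 */
static bool fw_range_ok(const struct fw_addr_window *w,
			uint32_t offset, uint32_t size)
{
	if (offset < w->lo || offset > w->hi)
		return false;
	return size <= w->hi - offset;
}
```

A loader could reject, or assert on, any pinned address for which such a check fails rather than handing it to the DMA engine.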
[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion
== Series Details ==

Series: series starting with [1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion
URL   : https://patchwork.freedesktop.org/series/17829/
State : failure

== Summary ==

Series 17829v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/17829/revisions/1/mbox/

Test drv_module_reload:
        Subgroup basic-reload:
                pass       -> DMESG-WARN (fi-skl-6260u)
                pass       -> DMESG-WARN (fi-bxt-t5700)
                pass       -> DMESG-WARN (fi-skl-6700hq)
        Subgroup basic-reload-final:
                pass       -> DMESG-WARN (fi-bxt-j4205)
Test gem_busy:
        Subgroup basic-hang-default:
                pass       -> DMESG-WARN (fi-skl-6770hq)
Test gem_exec_suspend:
        Subgroup basic-s3:
                pass       -> INCOMPLETE (fi-skl-6770hq)
                pass       -> INCOMPLETE (fi-bxt-j4205)
        Subgroup basic-s4-devices:
                pass       -> INCOMPLETE (fi-skl-6700k)
Test kms_force_connector_basic:
        Subgroup force-connector-state:
                pass       -> DMESG-WARN (fi-snb-2520m)
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-a:
                pass       -> INCOMPLETE (fi-skl-6260u)
                pass       -> INCOMPLETE (fi-skl-6700hq)
                incomplete -> SKIP       (fi-bsw-n3050)
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                incomplete -> PASS       (fi-byt-n2820)

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:82   pass:69   dwarn:1   dfail:0   fail:0   skip:11
fi-bxt-t5700     total:82   pass:68   dwarn:1   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:208  pass:195  dwarn:1   dfail:0   fail:0   skip:11
fi-skl-6700hq    total:208  pass:187  dwarn:1   dfail:0   fail:0   skip:19
fi-skl-6700k     total:83   pass:68   dwarn:3   dfail:0   fail:0   skip:11
fi-skl-6770hq    total:82   pass:77   dwarn:1   dfail:0   fail:0   skip:3
fi-snb-2520m     total:246  pass:214  dwarn:1   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

f0350ffa1b2bc16dc49fdc2fce10776d604a1c5f drm-tip: 2017y-01m-11d-12h-34m-12s UTC integration manifest
ad8e4de HAX enable guc submission for CI
b2b1516 drm/i915/scheduler: emulate a scheduler for guc
7ed1066 drm/i915: Invalidate the guc ggtt TLB upon insertion

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3480/
Re: [Intel-gfx] [PATCH] drm/i915: Add a sanity check that no request is submitted in the middle
On Wed, Jan 11, 2017 at 04:19:19PM +0200, Joonas Lahtinen wrote: > On ke, 2017-01-11 at 14:08 +, Chris Wilson wrote: > > It is an error to start a new request on the same timeline (ringbuffer) > > as the current one before the current is submitted. If there are two > > requests emitting to the ringbuffer at the same time, the operation is > > undefined. We can catch this by checking for the timeline having a later > > seqno than ours when we come to submit out request. > > > > Currently we have this check at the end of __i915_add_request, but > > having an early check as well isolates a failure in the caller versus a > > failure in sealing the request (i.e. from inside __i915_add_request > > itself). For example, CI is currently tripping over this late assertion > > on ctg/ilk: > > > > [ 100.329399] [IGT] gem_cs_tlb: starting subtest basic-default > > [ 100.336333] [ cut here ] > > [ 100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908! > > [ 100.336347] invalid opcode: [#1] PREEMPT SMP > > [ 100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic > > snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei > > e1000e ptp pps_core [last unloaded: i915] > > [ 100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G U > > 4.10.0-rc3-CI-CI_DRM_2045+ #1 > > [ 100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) > > 04/22/2009 > > [ 100.336386] task: 88012b738040 task.stack: c956 > > [ 100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915] > > [ 100.336445] RSP: 0018:c9563ac0 EFLAGS: 00010212 > > [ 100.336451] RAX: 5d52 RBX: 880133bb84c0 RCX: > > 0001 > > [ 100.336456] RDX: 8001 RSI: 88012b738860 RDI: > > > > [ 100.336461] RBP: c9563b00 R08: 880133bb8780 R09: > > > > [ 100.336466] R10: R11: R12: > > 88012f53d950 > > [ 100.336472] R13: 88012a2b0af8 R14: 88012a5b0008 R15: > > 88012f53d960 > > [ 100.336477] FS: 7f0d19da38c0() GS:88013bc0() > > knlGS: > > [ 100.336483] CS: 0010 DS: ES: CR0: 80050033 > 
> [ 100.336488] CR2: 7f0d17706000 CR3: 00012aa3e000 CR4: > > 000406f0 > > [ 100.336496] Call Trace: > > [ 100.336527] i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915] > > [ 100.336559] i915_gem_evict_vm+0x202/0x2b0 [i915] > > [ 100.336590] i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915] > > [ 100.336623] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915] > > [ 100.336656] i915_gem_execbuffer2+0xc0/0x250 [i915] > > [ 100.33] drm_ioctl+0x200/0x450 > > [ 100.336697] ? i915_gem_execbuffer+0x330/0x330 [i915] > > [ 100.336708] do_vfs_ioctl+0x90/0x6e0 > > [ 100.336716] ? up_read+0x1a/0x40 > > [ 100.336723] ? trace_hardirqs_on_caller+0x122/0x1b0 > > [ 100.336730] SyS_ioctl+0x3c/0x70 > > [ 100.336738] entry_SYSCALL_64_fastpath+0x1c/0xb1 > > [ 100.336745] RIP: 0033:0x7f0d187cb357 > > [ 100.336750] RSP: 002b:7ffe0b2f7c28 EFLAGS: 0246 ORIG_RAX: > > 0010 > > [ 100.336761] RAX: ffda RBX: 7ffe0b2f7d60 RCX: > > 7f0d187cb357 > > [ 100.336768] RDX: 7ffe0b2f7d00 RSI: 40406469 RDI: > > 0003 > > [ 100.336775] RBP: R08: R09: > > 0022 > > [ 100.336782] R10: 0007 R11: 0246 R12: > > 0002 > > [ 100.336789] R13: 00419101 R14: 7ffe0b2f7d60 R15: > > 7ffe0b2f7d50 > > [ 100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff > > 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff > > <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be > > [ 100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: > > c9563ac0 > > [ 100.336886] ---[ end trace 22b36545479e5eb7 ]--- > > > > Signed-off-by: Chris Wilson > > Cc: Tvrtko Ursulin > > Cc: Joonas Lahtinen > > Reviewed-by: Joonas Lahtinen Thanks, pushed this promptly since I'm trying to understand the CI failure. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 03/10] drm/i915/psr: fix blank screen issue for psr2
Psr1 and psr2 are mutually exclusive, i.e. when psr2 is enabled, psr1
should be disabled. When psr2 is exited, bit 31 of reg PSR2_CTL must be
set to 0, but currently bit 31 of SRD_CTL (psr1 control register) is set
to 0. Also, PSR2_IDLE state is looked up from SRD_STATUS (psr1 register)
instead of PSR2_STATUS register, which has wrong data, resulting in a
blank screen.

hsw_enable_source is split into hsw_enable_source_psr1 and
hsw_enable_source_psr2 for easier code review and maintenance, as
suggested by Rodrigo and Jim.

v2: (Rodrigo)
- Rename hsw_enable_source_psr* to intel_enable_source_psr*

v3: (Rodrigo)
- In hsw_psr_disable,
  1) for psr active case, handle psr2 followed by psr1.
  2) psr inactive case, handle psr2 followed by psr1

v4: (Rodrigo)
- move psr2 restriction(32X20) to match_conditions function
  returning false and fully blocking PSR to a new patch before
  this one.

Cc: Rodrigo Vivi
Cc: Jim Bride
Signed-off-by: Vathsala Nagaraju
Signed-off-by: Patil Deepti
---
 drivers/gpu/drm/i915/i915_reg.h  |   3 +
 drivers/gpu/drm/i915/intel_psr.c | 122 +--
 2 files changed, 95 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 00970aa..7830e6e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3615,6 +3615,9 @@ enum {
 #define EDP_PSR2_FRAME_BEFORE_SU_MASK	(0xf<<4)
 #define EDP_PSR2_IDLE_MASK		0xf
 
+#define EDP_PSR2_STATUS_CTL		_MMIO(0x6f940)
+#define EDP_PSR2_STATUS_STATE_MASK	(0xf<<28)
+
 /* VGA port control */
 #define ADPA			_MMIO(0x61100)
 #define PCH_ADPA		_MMIO(0xe1100)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 707cae8..19c7090 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -261,7 +261,7 @@ static void vlv_psr_activate(struct intel_dp *intel_dp)
 		   VLV_EDP_PSR_ACTIVE_ENTRY);
 }
 
-static void hsw_psr_enable_source(struct intel_dp *intel_dp)
+static void intel_enable_source_psr1(struct intel_dp *intel_dp)
 {
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
 	struct drm_device *dev = dig_port->base.base.dev;
@@ -312,14 +312,29 @@ static void hsw_psr_enable_source(struct intel_dp *intel_dp)
 		val |= EDP_PSR_TP1_TP2_SEL;
 
 	I915_WRITE(EDP_PSR_CTL, val);
+}
 
-	if (!dev_priv->psr.psr2_support)
-		return;
+static void intel_enable_source_psr2(struct intel_dp *intel_dp)
+{
+	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
+	struct drm_device *dev = dig_port->base.base.dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	/*
+	 * Let's respect VBT in case VBT asks a higher idle_frame value.
+	 * Let's use 6 as the minimum to cover all known cases including
+	 * the off-by-one issue that HW has in some cases. Also there are
+	 * cases where sink should be able to train
+	 * with the 5 or 6 idle patterns.
+	 */
+	uint32_t idle_frames = max(6, dev_priv->vbt.psr.idle_frames);
+	uint32_t val = EDP_PSR_ENABLE;
+
+	val |= idle_frames << EDP_PSR_IDLE_FRAME_SHIFT;
 
 	/* FIXME: selective update is probably totally broken because it doesn't
 	 * mesh at all with our frontbuffer tracking. And the hw alone isn't
 	 * good enough. */
-	val = EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
+	val |= EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
 
 	if (dev_priv->vbt.psr.tp2_tp3_wakeup_time > 5)
 		val |= EDP_PSR2_TP2_TIME_2500;
@@ -333,6 +348,19 @@ static void hsw_psr_enable_source(struct intel_dp *intel_dp)
 	I915_WRITE(EDP_PSR2_CTL, val);
 }
 
+static void hsw_psr_enable_source(struct intel_dp *intel_dp)
+{
+	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
+	struct drm_device *dev = dig_port->base.base.dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+
+	/* psr1 and psr2 are mutually exclusive.*/
+	if (dev_priv->psr.psr2_support)
+		intel_enable_source_psr2(intel_dp);
+	else
+		intel_enable_source_psr1(intel_dp);
+}
+
 static bool intel_psr_match_conditions(struct intel_dp *intel_dp)
 {
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
@@ -417,7 +445,10 @@ static void intel_psr_activate(struct intel_dp *intel_dp)
 	struct drm_device *dev = intel_dig_port->base.base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
-	WARN_ON(I915_READ(EDP_PSR_CTL) & EDP_PSR_ENABLE);
+	if (dev_priv->psr.psr2_support)
+		WARN_ON(I915_READ(EDP_PSR2_CTL) & EDP_PSR2_ENABLE);
+	else
+		WARN_ON(I915_READ(EDP_PSR_CTL) & EDP_PSR_ENABLE);
 	WARN_ON(dev_priv->psr.active);
 	lockdep_assert_held(&dev_priv->psr.lock);
 
@@ -468,10 +499,11 @@ void intel_psr_enable
Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG
On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote:
> On 05.01.2017 at 12:37, Ville Syrjälä wrote:
> > On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote:
> >> From: Rainer Hochecker
> >>
> >> This adds fourcc codes for 16bit planes required for DRM buffer
> >> export to mesa.
> >>
> >> Signed-off-by: Rainer Hochecker
> > Reviewed-by: Ville Syrjälä
>
> Good to see some work landing on that part, patch is Acked-by: Christian
> König.

Has the userspace side of this been reviewed already? /me wonders if
it's safe to push this...

> >> ---
> >>  include/uapi/drm/drm_fourcc.h | 7 +++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> >> index a5890bf..d230e58 100644
> >> --- a/include/uapi/drm/drm_fourcc.h
> >> +++ b/include/uapi/drm/drm_fourcc.h
> >> @@ -41,10 +41,17 @@ extern "C" {
> >>  /* 8 bpp Red */
> >>  #define DRM_FORMAT_R8		fourcc_code('R', '8', ' ', ' ') /* [7:0] R */
> >>
> >> +/* 16 bpp Red */
> >> +#define DRM_FORMAT_R16		fourcc_code('R', '1', '6', ' ') /* [15:0] R little endian */
> >> +
> >>  /* 16 bpp RG */
> >>  #define DRM_FORMAT_RG88	fourcc_code('R', 'G', '8', '8') /* [15:0] R:G 8:8 little endian */
> >>  #define DRM_FORMAT_GR88	fourcc_code('G', 'R', '8', '8') /* [15:0] G:R 8:8 little endian */
> >>
> >> +/* 32 bpp RG */
> >> +#define DRM_FORMAT_RG1616	fourcc_code('R', 'G', '3', '2') /* [31:0] R:G 16:16 little endian */
> >> +#define DRM_FORMAT_GR1616	fourcc_code('G', 'R', '3', '2') /* [31:0] G:R 16:16 little endian */
> >> +
> >>  /* 8 bpp RGB */
> >>  #define DRM_FORMAT_RGB332	fourcc_code('R', 'G', 'B', '8') /* [7:0] R:G:B 3:3:2 */
> >>  #define DRM_FORMAT_BGR233	fourcc_code('B', 'G', 'R', '8') /* [7:0] B:G:R 2:3:3 */
> >> --
> >> 2.9.3

-- 
Ville Syrjälä
Intel OTC
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
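For anyone checking the new codes by hand: fourcc_code() packs its four
characters into a 32-bit value with the first character in the lowest
byte. Below is a minimal standalone sketch of that packing. It assumes
uint32_t in place of the kernel's __u32 so it builds outside the tree,
and fourcc_to_str() is a hypothetical helper added only for
illustration; it is not part of drm_fourcc.h.

```c
#include <stdint.h>

/* Same packing as fourcc_code() in drm_fourcc.h; uint32_t stands in for __u32. */
#define fourcc_code(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* The two new codes proposed in the patch above. */
#define DRM_FORMAT_R16		fourcc_code('R', '1', '6', ' ')
#define DRM_FORMAT_RG1616	fourcc_code('R', 'G', '3', '2')

/* Hypothetical helper: unpack a code into a NUL-terminated 4-character string. */
static void fourcc_to_str(uint32_t code, char out[5])
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((code >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

Note that the DRM_FORMAT_RG1616 name and its packed characters differ
("RG32"), so a debug print of the code will show the latter.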
Re: [Intel-gfx] DP compliance failure due to dithering for 18bpp video pattern
On Tue, 10 Jan 2017, Manasi Navare wrote:
> Hi All,
>
> We are seeing CRC check failures in some of the 18bpp video pattern
> DP Compliance tests causing the tests to fail. On further investigation, it is
> rootcaused to dithering that the i915 driver enables in case of 18bpp pipe
> configuration that messes up the CRC and causes the test to fail.

The CTS spec actually accounts for CRC failures caused by dithering and
color space conversions. See section 3.2.1. However, it would be
preferable to be able to automate this.

> Some of the approaches that can solve this problem are:
> 1. Add a new method in intel_dp.c to request the compliance test state.
> Call this new method in intel_display.c to not enable dithering during a
> compliance test. Issue with this is it makes the general portion of the driver
> compliance aware.
>
> 2. Move the dithering enable to compute_config methods in all encoder source
> files. Issue: Lot of duplicate code and DP is the only encoder that uses
> 18bpc.
>
> 3. Disable dithering at all times in the driver. However this can cause image
> quality issue with 8bpc plane and 6 bit pipe.
>
> Any suggestions on which approach can be implemented in order to pass
> compliance?

I can't find any mention in the specs that we couldn't enable/disable
dithering on the fly. It's PIPE_MISC for BDW+ and PIPE_CONF for the
rest. So I'm wondering about doing...

4. Disable dithering at intel_dp_sink_crc_start() and enable it again
   (according to config->dither) at intel_dp_sink_crc_stop(). It's
   similar to the hsw_disable_ips() and hsw_enable_ips() calls, but
   would have to cover more platforms.

Ville, thoughts on changing dithering on the fly?

BR,
Jani.

>
> Regards
> Manasi

-- 
Jani Nikula, Intel Open Source Technology Center
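Option 4 above amounts to a disable-then-restore pattern around the
CRC capture. A rough sketch of that pattern follows, under heavy
assumptions: struct pipe_state and both functions are hypothetical
stand-ins, not i915 code, and the real change would program PIPE_MISC
or PIPE_CONF rather than flip a bool.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pipe config plus the programmed dither bit. */
struct pipe_state {
	bool cfg_dither;	/* what the computed config asked for (config->dither) */
	bool hw_dither;		/* what is currently programmed in the hardware */
	bool crc_active;	/* a sink CRC capture is in progress */
};

/* Analogue of intel_dp_sink_crc_start(): dithering would perturb the CRC. */
static void sink_crc_start(struct pipe_state *p)
{
	p->hw_dither = false;
	p->crc_active = true;
}

/* Analogue of intel_dp_sink_crc_stop(): restore according to the config. */
static void sink_crc_stop(struct pipe_state *p)
{
	p->crc_active = false;
	p->hw_dither = p->cfg_dither;
}
```

Restoring from the config rather than from a saved hardware value keeps
the stop path correct even if a modeset changed the config while the
capture was running.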
[Intel-gfx] [PATCH] drm/i915/huc: Add HuC fw loading support
The HuC loading process is similar to GuC. The intel_uc_fw_fetch()
is used for both cases.

HuC loading needs to be before GuC loading. The WOPCM setting must
be done early before loading any of them.

v2: rebased on-top of drm-intel-nightly.
    removed if(HAS_GUC()) before the guc call. (D.Gordon)
    update huc_version number of format.
v3: rebased to drm-intel-nightly, changed the file name format to
    match the one in the huc package.
    Changed dev->dev_private to to_i915()
v4: moved function back to where it was.
    change wait_for_atomic to wait_for.
v5: rebased + comment changes.
v7: rebased.
v8: rebased.
v9: rebased. Changed the year in the copyright message to reflect
    the right year. Correct the comments, remove the unwanted WARN
    message, replace drm_gem_object_unreference() with
    i915_gem_object_put(). Make the prototypes in intel_huc.h
    non-extern.
v10: rebased. Update the file construction done by HuC. It is similar
    to GuC. Adopted the approach used in
    https://patchwork.freedesktop.org/patch/104355/
v11: Fix warnings, remove old declaration.
v12: Change dev to dev_priv in macro definition.
    Corrected comments.
v13: rebased.
v14: rebased on top of drm-tip.
v15: rebased. Updated functions intel_huc_load(), intel_huc_init() and
    intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved
    contents of intel_huc.h to intel_uc.h.
v16: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to
    guc_wopcm_size(). Remove unwanted checks in intel_uc.h. Rename
    huc_fw in struct intel_huc to simply fw to avoid redundancy.
v17: rebased.
v18: rebased. Correct comments.
v19: rebased. Correct comments. Make intel_huc_fini() accept dev_priv
    instead of dev like intel_huc_init() and intel_huc_load(). Move
    definition to i915_guc_reg.h from intel_uc.h. Clean DMA_CTRL bits
    after HuC DMA transfer in huc_ucode_xfer() instead of
    guc_ucode_xfer(). Add suitable WARNs to give extra info.

Cc: Arkadiusz Hiler
Cc: Michal Wajdeczko
Tested-by: Xiang Haihao
Signed-off-by: Anusha Srivatsa
Signed-off-by: Alex Dai
Signed-off-by: Peter Antoine
---
 drivers/gpu/drm/i915/Makefile           | 1 +
 drivers/gpu/drm/i915/i915_drv.c         | 3 +
 drivers/gpu/drm/i915/i915_drv.h         | 2 +
 drivers/gpu/drm/i915/i915_guc_reg.h     | 6 +
 drivers/gpu/drm/i915/intel_guc_loader.c | 7 +-
 drivers/gpu/drm/i915/intel_huc_loader.c | 264
 drivers/gpu/drm/i915/intel_uc.h         | 14 ++
 7 files changed, 294 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_huc_loader.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 5196509..45ae124 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -57,6 +57,7 @@ i915-y += i915_cmd_parser.o \
 # general-purpose microcontroller (GuC) support
 i915-y += intel_uc.o \
 	  intel_guc_loader.o \
+	  intel_huc_loader.o \
 	  i915_guc_submission.o
 
 # autogenerated null render state
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index aefab9a..d6f32f4 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -599,6 +599,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 	if (ret)
 		goto cleanup_irq;
 
+	intel_huc_init(dev_priv);
 	intel_guc_init(dev_priv);
 
 	ret = i915_gem_init(dev_priv);
@@ -627,6 +628,7 @@ static int i915_load_modeset_init(struct drm_device *dev)
 	i915_gem_fini(dev_priv);
 cleanup_irq:
 	intel_guc_fini(dev_priv);
+	intel_huc_fini(dev_priv);
 	drm_irq_uninstall(dev);
 	intel_teardown_gmbus(dev_priv);
 cleanup_csr:
@@ -1314,6 +1316,7 @@ void i915_driver_unload(struct drm_device *dev)
 	drain_workqueue(dev_priv->wq);
 
 	intel_guc_fini(dev_priv);
+	intel_huc_fini(dev_priv);
 	i915_gem_fini(dev_priv);
 	intel_fbc_cleanup_cfb(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b84c1d1..2a17df2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2073,6 +2073,7 @@ struct drm_i915_private {
 
 	struct intel_gvt *gvt;
 
+	struct intel_huc huc;
 	struct intel_guc guc;
 
 	struct intel_csr csr;
@@ -2847,6 +2848,7 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_GUC(dev_priv)	((dev_priv)->info.has_guc)
 #define HAS_GUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
 #define HAS_GUC_SCHED(dev_priv)	(HAS_GUC(dev_priv))
+#define HAS_HUC_UCODE(dev_priv)	(HAS_GUC(dev_priv))
 
 #define HAS_RESOURCE_STREAMER(dev_priv) ((dev_priv)->info.has_resource_streamer)
 
diff --git a/drivers/gpu/drm/i915/i915_guc_reg.h b/drivers/gpu/drm/i915/i915_guc_reg.h
index 6a0adaf..35cf991 100644
--- a/drivers/gpu/drm/i915/i915_guc_reg.h
+++ b/drivers/gpu/drm/i915/i915_guc_reg.h
@@ -61,12 +61,18 @@
 #define   DMA_ADDRESS_SPACE_GTT		  (8 << 16)
 #define DMA_COPY_SIZE
[Intel-gfx] [PATCH] drm/i915/guc: Make sure vma containing firmware is GuC mappable
Since commit 4741da925fa3 ("drm/i915/guc: Assert that all GGTT offsets used
by the GuC are mappable"), we're asserting that GuC firmware is in the
GuC mappable range.
Except we're not pinning the object with bias, which means it's possible
to trigger this assert. Let's add a proper bias.

Cc: Chris Wilson
Cc: Daniele Ceraolo Spurio
Signed-off-by: Michał Winiarski
---
 drivers/gpu/drm/i915/intel_guc_loader.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
index aa2b866..5a6ab87 100644
--- a/drivers/gpu/drm/i915/intel_guc_loader.c
+++ b/drivers/gpu/drm/i915/intel_guc_loader.c
@@ -360,7 +360,8 @@ static int guc_ucode_xfer(struct drm_i915_private *dev_priv)
 		return ret;
 	}
 
-	vma = i915_gem_object_ggtt_pin(guc_fw->guc_fw_obj, NULL, 0, 0, 0);
+	vma = i915_gem_object_ggtt_pin(guc_fw->guc_fw_obj, NULL, 0, 0,
+				       PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
 	if (IS_ERR(vma)) {
 		DRM_DEBUG_DRIVER("pin failed %d\n", (int)PTR_ERR(vma));
 		return PTR_ERR(vma);
-- 
2.9.3
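To make the effect of the bias concrete, here is a small standalone
sketch of the idea, not the i915 allocator. It assumes the pin flags
carry a page-aligned minimum offset in their upper bits next to a flag
bit; the PIN_OFFSET_BIAS bit value, place() and guc_mappable() are
illustrative assumptions, and the wopcm_top value used when trying it
out is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K	4096u
#define PIN_OFFSET_BIAS	(1u << 3)		/* illustrative flag bit, not the i915 value */
#define PIN_OFFSET_MASK	(~(PAGE_SIZE_4K - 1))	/* page-aligned part carries the offset */

/* Hypothetical placement: lowest page-aligned offset that honours the bias. */
static uint32_t place(uint32_t first_free, uint32_t flags)
{
	uint32_t start = first_free;

	if (flags & PIN_OFFSET_BIAS) {
		uint32_t bias = flags & PIN_OFFSET_MASK;

		if (start < bias)
			start = bias;
	}
	/* round up to the next page boundary */
	return (start + PAGE_SIZE_4K - 1) & PIN_OFFSET_MASK;
}

/* The spirit of the assert: offsets below the WOPCM top are not usable by the GuC. */
static bool guc_mappable(uint32_t offset, uint32_t wopcm_top)
{
	return offset >= wopcm_top;
}
```

Without the bias the object may land at offset 0 and fail the check;
with `PIN_OFFSET_BIAS | wopcm_top` the placement can never fall below
wopcm_top.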
[Intel-gfx] [PATCH 04/10] drm/i915/psr: disable aux_frame_sync on psr2 exit
Screen freeze observed if AUX_FRAME_SYNC is not disabled on psr2 exit.
AUX_FRAME_SYNC needed for psr2 is enabled during psr2 entry. It must be
disabled on psr2 exit.

v2: rebase

Cc: Rodrigo Vivi
Cc: Jim Bride
Signed-off-by: Vathsala Nagaraju
Signed-off-by: Patil Deepti
Reviewed-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/intel_psr.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 19c7090..52b8c80 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -590,6 +590,11 @@ static void hsw_psr_disable(struct intel_dp *intel_dp)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	if (dev_priv->psr.active) {
+		if (dev_priv->psr.aux_frame_sync)
+			drm_dp_dpcd_writeb(&intel_dp->aux,
+					DP_SINK_DEVICE_AUX_FRAME_SYNC_CONF,
+					0);
+
 		if (dev_priv->psr.psr2_support) {
 			I915_WRITE(EDP_PSR2_CTL,
 				   I915_READ(EDP_PSR2_CTL) &
@@ -728,6 +733,10 @@ static void intel_psr_exit(struct drm_i915_private *dev_priv)
 		return;
 
 	if (HAS_DDI(dev_priv)) {
+		if (dev_priv->psr.aux_frame_sync)
+			drm_dp_dpcd_writeb(&intel_dp->aux,
+					DP_SINK_DEVICE_AUX_FRAME_SYNC_CONF,
+					0);
 		if (dev_priv->psr.psr2_support) {
 			val = I915_READ(EDP_PSR2_CTL);
 			WARN_ON(!(val & EDP_PSR2_ENABLE));
-- 
1.9.1
[Intel-gfx] [PATCH 06/10] drm/i915/psr: set CHICKEN_TRANS for psr2
As per bspec, CHICKEN_TRANS_EDP bits 12 and 15 must be programmed in
the psr2 enable sequence.

bit 12: Program Transcoder EDP VSC DIP header with a valid setting for
        PSR2 and set CHICKEN_TRANS_EDP(0x420cc) bit 12 for programmable
        header packet.
bit 15: Set CHICKEN_TRANS_EDP(0x420cc) bit 15 if Y coordinate is
        supported.

v2: (Rodrigo)
- move CHICKEN_TRANS_EDP bit set logic right after setup_vsc

v3: (Rodrigo)
- initialize chicken_trans to CHICKEN_TRANS_BIT12 instead of 0

v4: (Chris Wilson)
- use BIT(12), remove CHICKEN_TRANS_BIT12
- remove unnecessary comments
- update commit message

v5:
- rename bit 12 PSR2_VSC_ENABLE_PROG_HEADER
- rename bit 15 PSR2_ADD_VERTICAL_LINE_COUNT

Cc: Rodrigo Vivi
Cc: Jim Bride
Signed-off-by: vathsala nagaraju
Signed-off-by: Patil Deepti
---
 drivers/gpu/drm/i915/i915_reg.h  | 7 +++
 drivers/gpu/drm/i915/intel_psr.c | 5 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7830e6e..7a325fb 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6449,6 +6449,13 @@ enum {
 #define   BDW_DPRS_MASK_VBLANK_SRD	(1 << 0)
 #define CHICKEN_PIPESL_1(pipe) _MMIO_PIPE(pipe, _CHICKEN_PIPESL_1_A, _CHICKEN_PIPESL_1_B)
 
+#define CHICKEN_TRANS_A		0x420c0
+#define CHICKEN_TRANS_B		0x420c4
+#define CHICKEN_TRANS(trans) _MMIO_TRANS(trans, CHICKEN_TRANS_A, CHICKEN_TRANS_B)
+#define TRANS_EDP 3
+#define PSR2_VSC_ENABLE_PROG_HEADER	(1<<12)
+#define PSR2_ADD_VERTICAL_LINE_COUNT	(1<<15)
+
 #define DISP_ARB_CTL	_MMIO(0x45000)
 #define  DISP_FBC_MEMORY_WAKE		(1<<31)
 #define  DISP_TILE_SURFACE_SWIZZLING	(1<<13)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 3cf5cc4..b582220 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -480,6 +480,7 @@ void intel_psr_enable(struct intel_dp *intel_dp)
 	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
 	struct drm_device *dev = intel_dig_port->base.base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
+	u32 chicken;
 
 	if (!HAS_PSR(dev_priv)) {
 		DRM_DEBUG_KMS("PSR not supported on this platform\n");
@@ -505,6 +506,10 @@ void intel_psr_enable(struct intel_dp *intel_dp)
 	if (HAS_DDI(dev_priv)) {
 		if (dev_priv->psr.psr2_support) {
 			skl_psr_setup_su_vsc(intel_dp);
+			chicken = PSR2_VSC_ENABLE_PROG_HEADER;
+			if (dev_priv->psr.y_cord_support)
+				chicken |= PSR2_ADD_VERTICAL_LINE_COUNT;
+			I915_WRITE(CHICKEN_TRANS(TRANS_EDP), chicken);
 		} else {
 			/* set up vsc header for psr1 */
 			hsw_psr_setup_vsc(intel_dp);
-- 
1.9.1
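The value written to CHICKEN_TRANS(TRANS_EDP) in the hunk above can be
checked in isolation. A minimal sketch, reusing the two bit names from
the patch; psr2_chicken_value() is a hypothetical helper, not a
function in intel_psr.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR2_VSC_ENABLE_PROG_HEADER	(1u << 12)
#define PSR2_ADD_VERTICAL_LINE_COUNT	(1u << 15)

/* Hypothetical helper mirroring the logic added to intel_psr_enable(). */
static uint32_t psr2_chicken_value(bool y_coord_support)
{
	/* bit 12 is always set in the PSR2 enable sequence */
	uint32_t chicken = PSR2_VSC_ENABLE_PROG_HEADER;

	/* bit 15 only when the sink supports the Y coordinate */
	if (y_coord_support)
		chicken |= PSR2_ADD_VERTICAL_LINE_COUNT;

	return chicken;
}
```

Note that the patch writes this value directly rather than doing a
read-modify-write, so any other bits in the register are cleared.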
[Intel-gfx] [PATCH 07/10] drm/i915/psr: set PSR_MASK bits for deep sleep
Program EDP_PSR_DEBUG_CTL (PSR_MASK) to enable system to go to deep
sleep while in psr2. PSR2_STATUS bit 31:28 should report value 8, if
system enters deep sleep state. Also, EDP_FRAMES_BEFORE_SU_ENTRY is set
to 1; if not set, flickering is observed on psr2 panel.

v2: (Ilia Mirkin)
- Remove duplicate bit definition 25:27

v3: rebase

v4: rebase

Cc: Rodrigo Vivi
Cc: Jim Bride
Signed-off-by: Vathsala Nagaraju
Signed-off-by: Patil Deepti
Reviewed-by: Jim Bride
---
 drivers/gpu/drm/i915/i915_reg.h  | 10 +++---
 drivers/gpu/drm/i915/intel_psr.c | 31 ---
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 7a325fb..6ad9f06 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3597,9 +3597,12 @@ enum {
 #define   EDP_PSR_PERF_CNT_MASK		0xff
 
 #define EDP_PSR_DEBUG_CTL		_MMIO(dev_priv->psr_mmio_base + 0x60)
-#define   EDP_PSR_DEBUG_MASK_LPSP	(1<<27)
-#define   EDP_PSR_DEBUG_MASK_MEMUP	(1<<26)
-#define   EDP_PSR_DEBUG_MASK_HPD	(1<<25)
+#define   EDP_PSR_DEBUG_MASK_MAX_SLEEP	(1<<28)
+#define   EDP_PSR_DEBUG_MASK_LPSP	(1<<27)
+#define   EDP_PSR_DEBUG_MASK_MEMUP	(1<<26)
+#define   EDP_PSR_DEBUG_MASK_HPD	(1<<25)
+#define   EDP_PSR_DEBUG_MASK_DISP_REG_WRITE	(1<<16)
+#define   EDP_PSR_DEBUG_EXIT_ON_PIXEL_UNDERRUN	(1<<15)
 
 #define EDP_PSR2_CTL			_MMIO(0x6f900)
 #define   EDP_PSR2_ENABLE		(1<<31)
@@ -3614,6 +3617,7 @@ enum {
 #define   EDP_PSR2_FRAME_BEFORE_SU_SHIFT 4
 #define   EDP_PSR2_FRAME_BEFORE_SU_MASK	(0xf<<4)
 #define   EDP_PSR2_IDLE_MASK		0xf
+#define   EDP_FRAMES_BEFORE_SU_ENTRY	(1<<4)
 
 #define EDP_PSR2_STATUS_CTL		_MMIO(0x6f940)
 #define EDP_PSR2_STATUS_STATE_MASK	(0xf<<28)
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index b582220..f9d620b 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -338,7 +338,9 @@ static void intel_enable_source_psr2(struct intel_dp *intel_dp)
 	/* FIXME: selective update is probably totally broken because it doesn't
 	 * mesh at all with our frontbuffer tracking. And the hw alone isn't
 	 * good enough. */
-	val |= EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE;
+	val |= EDP_PSR2_ENABLE |
+		EDP_SU_TRACK_ENABLE |
+		EDP_FRAMES_BEFORE_SU_ENTRY;
 
 	if (dev_priv->vbt.psr.tp2_tp3_wakeup_time > 5)
 		val |= EDP_PSR2_TP2_TIME_2500;
@@ -510,20 +512,27 @@ void intel_psr_enable(struct intel_dp *intel_dp)
 			if (dev_priv->psr.y_cord_support)
 				chicken |= PSR2_ADD_VERTICAL_LINE_COUNT;
 			I915_WRITE(CHICKEN_TRANS(TRANS_EDP), chicken);
+			I915_WRITE(EDP_PSR_DEBUG_CTL,
+				   EDP_PSR_DEBUG_MASK_MEMUP |
+				   EDP_PSR_DEBUG_MASK_HPD |
+				   EDP_PSR_DEBUG_MASK_LPSP |
+				   EDP_PSR_DEBUG_MASK_MAX_SLEEP |
+				   EDP_PSR_DEBUG_MASK_DISP_REG_WRITE);
 		} else {
 			/* set up vsc header for psr1 */
 			hsw_psr_setup_vsc(intel_dp);
+			/*
+			 * Per Spec: Avoid continuous PSR exit by masking MEMUP
+			 * and HPD. also mask LPSP to avoid dependency on other
+			 * drivers that might block runtime_pm besides
+			 * preventing other hw tracking issues now we can rely
+			 * on frontbuffer tracking.
+			 */
+			I915_WRITE(EDP_PSR_DEBUG_CTL,
+				   EDP_PSR_DEBUG_MASK_MEMUP |
+				   EDP_PSR_DEBUG_MASK_HPD |
+				   EDP_PSR_DEBUG_MASK_LPSP);
 		}
-
-		/*
-		 * Per Spec: Avoid continuous PSR exit by masking MEMUP and HPD.
-		 * Also mask LPSP to avoid dependency on other drivers that
-		 * might block runtime_pm besides preventing other hw tracking
-		 * issues now we can rely on frontbuffer tracking.
-		 */
-		I915_WRITE(EDP_PSR_DEBUG_CTL, EDP_PSR_DEBUG_MASK_MEMUP |
-			   EDP_PSR_DEBUG_MASK_HPD | EDP_PSR_DEBUG_MASK_LPSP);
-
 		/* Enable PSR on the panel */
 		hsw_psr_enable_sink(intel_dp);
 
-- 
1.9.1
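The commit message says PSR2_STATUS bits 31:28 should read 8 once the
system is in deep sleep. A small sketch of how that field could be
decoded when verifying the change; the mask matches
EDP_PSR2_STATUS_STATE_MASK from the series, while the shift macro, the
PSR2_STATE_DEEP_SLEEP name and both helpers are assumptions made for
this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define EDP_PSR2_STATUS_STATE_SHIFT	28		/* assumed name; the patch only defines the mask */
#define EDP_PSR2_STATUS_STATE_MASK	(0xfu << 28)
#define PSR2_STATE_DEEP_SLEEP		8u		/* value from the commit message; name is hypothetical */

/* Extract the 4-bit state field from a raw PSR2_STATUS register value. */
static uint32_t psr2_state(uint32_t status)
{
	return (status & EDP_PSR2_STATUS_STATE_MASK) >> EDP_PSR2_STATUS_STATE_SHIFT;
}

/* True when the state field reports deep sleep. */
static bool psr2_in_deep_sleep(uint32_t status)
{
	return psr2_state(status) == PSR2_STATE_DEEP_SLEEP;
}
```

In the driver the raw value would come from something like
I915_READ(EDP_PSR2_STATUS_CTL); here it is just a parameter so the
decode can be tested on its own.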
Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
On 01/10/2017 11:43 PM, Daniel Vetter wrote:
> On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
>> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
>>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
>>> Author: Daniel Vetter
>>> Date: Sun Dec 18 14:35:45 2016 +0100
>>>
>>> drm: prevent double-(un)registration for connectors
>>>
>>> Lack of that would perfectly explain that oops ... Otherwise still no idea
>>> what's going wrong.
>> No... That's not in mainline as far as I can see. Should I test with
>> it applied?
> Hm, I guess failed to cc: stable that one properly, iirc we decided the
> race fix is too academic and can't be hit in reality ;-)
>
> Testing would be great. Probably conflicts because we extracted
> drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
> over the diff and then applying with some fudge should take care of that.

It doesn't apply to mainline, with or without the substitution you suggest.

Do you want that commit itself tested from -next?
Re: [Intel-gfx] [PATCH] drm/i915/guc: Make sure vma containing firmware is GuC mappable
On Wed, Jan 11, 2017 at 04:17:39PM +0100, Michał Winiarski wrote:
> Since commit 4741da925fa3 ("drm/i915/guc: Assert that all GGTT offsets used
> by the GuC are mappable"), we're asserting that GuC firmware is in the
> GuC mappable range.
> Except we're not pinning the object with bias, which means it's possible
> to trigger this assert. Let's add a proper bias.
>
> Cc: Chris Wilson
> Cc: Daniele Ceraolo Spurio
> Signed-off-by: Michał Winiarski

Fits in with the checks we added. If they are correct, so is this fix ;)
Reviewed-by: Chris Wilson
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
On 11/01/17 14:58, Joonas Lahtinen wrote:
> On Wed, 2017-01-11 at 12:14 +, Chris Wilson wrote:
>> When switching between contexts using the aliasing_ppgtt, the VM is
>> shared. We don't need to reload the PD registers unless they are dirty.
>>
>> Martin Peres reported an issue that looks like corruption between
>> Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
>> Rearrange switch_context to load the aliasing ppgtt on first use").
>> Switching between the same mm (the aliasing_ppgtt is used for all
>> contexts in this case) should be a nop, but appears to trigger some
>> side-effects in the context switch. However, as we know the switch
>> is redundant in this case, we can skip it and continue to ignore the
>> issue until somebody feels strong enough to investigate full-ppgtt on
>> gen7 again!
>>
>> Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the
>> aliasing ppgtt on first use")
>> Reported-by: Martin Peres
>> Signed-off-by: Chris Wilson
>> Cc: Martin Peres
>
> Code looks good, could use the T-b's to verify.
>
> Reviewed-by: Joonas Lahtinen
>
> Regards, Joonas

https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072
was actually the commit that massively improved the corruption I was
seeing in one benchmark while this patch had no visible impact.

However, my problem was that i915.enable_ppgtt=2 was set in
/etc/modprobe.d/... and I had completely forgotten about it.

So yeah, now you know that f9326be5f1d3 massively broke
enable_ppgtt=2, but not sure what you want to do about it.

There is no hurry though, as the defaults are sane.

Sorry for the noise everyone, I hope that my painful manual bisects
will be useful if someone wants to make the second mode work :)

Martin
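For readers following the quoted commit message, the rule it describes
(skip the PD reload when switching between the same mm, unless the page
directories are dirty) can be sketched as a small predicate. Everything
here is hypothetical: struct ppgtt, its pd_dirty field and
needs_pd_load() are stand-ins for illustration and are not taken from
the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an address space with a dirty marker. */
struct ppgtt {
	bool pd_dirty;	/* page directories changed since the last load */
};

/* Decide whether a context switch has to reload the PD registers. */
static bool needs_pd_load(const struct ppgtt *from, const struct ppgtt *to)
{
	if (to == NULL)
		return false;	/* nothing to load */

	if (from == to && !to->pd_dirty)
		return false;	/* same shared VM, registers already valid */

	return true;
}
```

With the aliasing ppgtt every context points at the same mm, so under
this rule only a dirty page directory would trigger a reload.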
Re: [Intel-gfx] DP compliance failure due to dithering for 18bpp video pattern
On Wed, Jan 11, 2017 at 05:09:16PM +0200, Jani Nikula wrote:
> On Tue, 10 Jan 2017, Manasi Navare wrote:
> > Hi All,
> >
> > We are seeing CRC check failures in some of the 18bpp video pattern
> > DP Compliance tests causing the tests to fail. On further investigation, it is
> > rootcaused to dithering that the i915 driver enables in case of 18bpp pipe
> > configuration that messes up the CRC and causes the test to fail.
>
> The CTS spec actually accounts for CRC failures caused by dithering and
> color space conversions. See section 3.2.1. However, it would be
> preferable to be able to automate this.
>
> > Some of the approaches that can solve this problem are:
> > 1. Add a new method in intel_dp.c to request the compliance test state.
> > Call this new method in intel_display.c to not enable dithering during a
> > compliance test. Issue with this is it makes the general portion of the driver
> > compliance aware.
> >
> > 2. Move the dithering enable to compute_config methods in all encoder source
> > files. Issue: Lot of duplicate code and DP is the only encoder that uses
> > 18bpc.
> >
> > 3. Disable dithering at all times in the driver. However this can cause image
> > quality issue with 8bpc plane and 6 bit pipe.
> >
> > Any suggestions on which approach can be implemented in order to pass
> > compliance?
>
> I can't find any mention in the specs that we couldn't enable/disable
> dithering on the fly. It's PIPE_MISC for BDW+ and PIPE_CONF for the
> rest. So I'm wondering about doing...
>
> 4. Disable dithering at intel_dp_sink_crc_start() and enable it again
>    (according to config->dither) at intel_dp_sink_crc_stop(). It's
>    similar to the hsw_disable_ips() and hsw_enable_ips() calls, but
>    would have to cover more platforms.
>
> Ville, thoughts on changing dithering on the fly?

Should be fine I think.

BTW see
https://lists.freedesktop.org/archives/intel-gfx/2016-December/115186.html
if you intend to add more crc workaround type of things. There I'm
changing the IPS w/a to force a full modeset because it was the easiest
way to do things, and the current thing is just broken.

-- 
Ville Syrjälä
Intel OTC
Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
On Wed, Jan 11, 2017 at 4:24 PM, Dave Hansen wrote:
> On 01/10/2017 11:43 PM, Daniel Vetter wrote:
>> On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
>>> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
>>>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
>>>> Author: Daniel Vetter
>>>> Date: Sun Dec 18 14:35:45 2016 +0100
>>>>
>>>> drm: prevent double-(un)registration for connectors
>>>>
>>>> Lack of that would perfectly explain that oops ... Otherwise still no idea
>>>> what's going wrong.
>>> No... That's not in mainline as far as I can see. Should I test with
>>> it applied?
>> Hm, I guess failed to cc: stable that one properly, iirc we decided the
>> race fix is too academic and can't be hit in reality ;-)
>>
>> Testing would be great. Probably conflicts because we extracted
>> drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
>> over the diff and then applying with some fudge should take care of that.
>
> It doesn't apply to mainline, with or without the substitution you suggest.
>
> Do you want that commit itself tested from -next?

Hm, just cherry-picked it on top of Linus' latest 4.10 git, applies
cleanly there. The substitution was for 4.9. I can send you the patch
here, but seems all fine from what I can tell ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
On Wed, Jan 11, 2017 at 07:24:45AM -0800, Dave Hansen wrote:
> On 01/10/2017 11:43 PM, Daniel Vetter wrote:
> > On Tue, Jan 10, 2017 at 08:52:47AM -0800, Dave Hansen wrote:
> >> On 01/10/2017 02:31 AM, Daniel Vetter wrote:
> >>> commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e
> >>> Author: Daniel Vetter
> >>> Date: Sun Dec 18 14:35:45 2016 +0100
> >>>
> >>> drm: prevent double-(un)registration for connectors
> >>>
> >>> Lack of that would perfectly explain that oops ... Otherwise still no idea
> >>> what's going wrong.
> >> No... That's not in mainline as far as I can see. Should I test with
> >> it applied?
> > Hm, I guess failed to cc: stable that one properly, iirc we decided the
> > race fix is too academic and can't be hit in reality ;-)
> >
> > Testing would be great. Probably conflicts because we extracted
> > drm_connector.c only recently, but running s/drm_connector\.c/drm_crtc.c/
> > over the diff and then applying with some fudge should take care of that.
>
> It doesn't apply to mainline, with or without the substitution you suggest.

I was hoping that the locking was the real cause here and would be an
easy fix to apply. I did have a look at trying to reorder the DP-MST
worker with driver registration. Hacky to say the least.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

From 30ac9092e934295f12775f03d73170fc480b7fc8 Mon Sep 17 00:00:00 2001
From: Chris Wilson
Date: Tue, 10 Jan 2017 10:46:25 +
Subject: [PATCH] dp-mst-register

---
 drivers/gpu/drm/i915/intel_dp.c     | 12 +++-
 drivers/gpu/drm/i915/intel_dp_mst.c | 9 ++---
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index f0f44cdbe4b4..fc10eb2c8563 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -4762,7 +4762,17 @@ intel_dp_connector_register(struct drm_connector *connector)
 		      intel_dp->aux.name, connector->kdev->kobj.name);
 
 	intel_dp->aux.dev = connector->kdev;
-	return drm_dp_aux_register(&intel_dp->aux);
+	ret = drm_dp_aux_register(&intel_dp->aux);
+	if (ret)
+		return ret;
+
+	if (intel_dp->mst_mgr.cbs) {
+		intel_dp->can_mst = true;
+		if (intel_dp->attached_connector)
+			intel_dp->attached_connector->base.status = intel_dp_long_pulse(intel_dp->attached_connector);
+	}
+
+	return 0;
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index c93c1999a494..f0a664041dbc 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -582,16 +582,19 @@ intel_dp_mst_encoder_init(struct intel_digital_port *intel_dig_port, int conn_ba
 	struct drm_device *dev = intel_dig_port->base.base.dev;
 	int ret;
 
-	intel_dp->can_mst = true;
+	intel_dp->can_mst = false;
 	intel_dp->mst_mgr.cbs = &mst_cbs;
 
 	/* create encoders */
 	intel_dp_create_fake_mst_encoders(intel_dig_port);
-	ret = drm_dp_mst_topology_mgr_init(&intel_dp->mst_mgr, dev->dev, &intel_dp->aux, 16, 3, conn_base_id);
+	ret = drm_dp_mst_topology_mgr_init(&intel_dp->mst_mgr, dev->dev,
+					   &intel_dp->aux, 16, 3,
+					   conn_base_id);
 	if (ret) {
-		intel_dp->can_mst = false;
+		intel_dp->mst_mgr.cbs = NULL;
 		return ret;
 	}
+
 	return 0;
 }
 
-- 
2.11.0
Re: [Intel-gfx] [PATCH] drm/i915: Suppress switch_mm emission between the same aliasing_ppgtt
On Wed, Jan 11, 2017 at 05:35:08PM +0200, Martin Peres wrote:
> On 11/01/17 14:58, Joonas Lahtinen wrote:
> >On Wed, 2017-01-11 at 12:14 +, Chris Wilson wrote:
> >>When switching between contexts using the aliasing_ppgtt, the VM is
> >>shared. We don't need to reload the PD registers unless they are dirty.
> >>
> >>Martin Peres reported an issue that looks like corruption between
> >>Haswell context switches, bisecting to commit f9326be5f1d3 ("drm/i915:
> >>Rearrange switch_context to load the aliasing ppgtt on first use").
> >>Switching between the same mm (the aliasing_ppgtt is used for all
> >>contexts in this case) should be a nop, but appears to trigger some
> >>side-effects in the context switch. However, as we know the switch
> >>is redundant in this case, we can skip it and continue to ignore the
> >>issue until somebody feels strong enough to investigate full-ppgtt on
> >>gen7 again!
> >>
> >>Fixes: f9326be5f1d3 ("drm/i915: Rearrange switch_context to load the
> >>aliasing ppgtt on first use")
> >>Reported-by: Martin Peres
> >>Signed-off-by: Chris Wilson
> >>Cc: Martin Peres
> >
> >Code looks good, could use the T-b's to verify.
> >
> >Reviewed-by: Joonas Lahtinen
> >
> >Regards, Joonas
> >
>
> https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=for-mupuf&id=cfe8f1043b45896af23e4a978020fe20e90c5072
> was actually the commit that massively improved the corruption I was
> seeing in one benchmark while this patch had no visible impact.
>
> However, my problem was that i915.enable_ppgtt=2 was set in
> /etc/modprobe.d/... and I had completely forgotten about it.
>
> So yeah, now you know that f9326be5f1d3 massively broke
> enable_ppgtt=2, but not sure what you want to do about it.
>
> There is no hurry though, as the defaults are sane.
>
> Sorry for the noise everyone, I hope that my painful manual bisects
> will be useful if someone wants to make the second mode work :)

The information is very useful, I've added that the symptoms have only
been seen with full-ppgtt. I don't see any reason to apply this patch
now, so will send it along with the series playing with i915_gem_gtt.c
once the kselftests for it are in.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
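For anyone skimming the thread, the "skip the switch when the mm is unchanged" idea from the quoted commit message can be sketched in plain C. This is only a userspace model under assumptions: the toy_* types, the pd_dirty flag and the pd_loads counter are made up for illustration and are not the i915 structures, and the actual patch may well look different.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the i915 types. */
struct toy_mm {
	bool pd_dirty;			/* page directories changed since the last load */
};

struct toy_engine {
	const struct toy_mm *last_mm;	/* address space currently loaded */
	unsigned int pd_loads;		/* how often the PD registers were reloaded */
};

/*
 * Switch the engine to @mm. Returns true if the PD registers had to be
 * (re)loaded, false if the switch was redundant and could be skipped.
 */
static bool toy_switch_mm(struct toy_engine *engine, struct toy_mm *mm)
{
	/* Same VM as last time and nothing dirty: emit nothing. */
	if (engine->last_mm == mm && !mm->pd_dirty)
		return false;

	engine->pd_loads++;
	engine->last_mm = mm;
	mm->pd_dirty = false;
	return true;
}
```

With a single aliasing address space shared by every context, only the first switch (or a switch after the page directories were dirtied) would reload anything in this model.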
Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG
On 17-01-11 17:05:04, Ville Syrjälä wrote:
> On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote:
> > On 05.01.2017 at 12:37, Ville Syrjälä wrote:
> > > On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote:
> > >> From: Rainer Hochecker
> > >>
> > >> This adds fourcc codes for 16bit planes required for DRM buffer
> > >> export to mesa.
> > >>
> > >> Signed-off-by: Rainer Hochecker
> > > Reviewed-by: Ville Syrjälä
> >
> > Good to see some work landing on that part, patch is Acked-by: Christian
> > König .
>
> Has the userspace side of this been reviewed already? /me wonders if
> it's safe to push this...

I acked the mesa side, and Rainer sent a version 2 which also looked fine
to me. Let me bump that thread...

> > >
> > >> ---
> > >>  include/uapi/drm/drm_fourcc.h | 7 +++
> > >>  1 file changed, 7 insertions(+)
> > >>
> > >> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> > >> index a5890bf..d230e58 100644
> > >> --- a/include/uapi/drm/drm_fourcc.h
> > >> +++ b/include/uapi/drm/drm_fourcc.h
> > >> @@ -41,10 +41,17 @@ extern "C" {
> > >>  /* 8 bpp Red */
> > >>  #define DRM_FORMAT_R8		fourcc_code('R', '8', ' ', ' ') /* [7:0] R */
> > >>
> > >> +/* 16 bpp Red */
> > >> +#define DRM_FORMAT_R16		fourcc_code('R', '1', '6', ' ') /* [15:0] R little endian */
> > >> +
> > >>  /* 16 bpp RG */
> > >>  #define DRM_FORMAT_RG88	fourcc_code('R', 'G', '8', '8') /* [15:0] R:G 8:8 little endian */
> > >>  #define DRM_FORMAT_GR88	fourcc_code('G', 'R', '8', '8') /* [15:0] G:R 8:8 little endian */
> > >>
> > >> +/* 32 bpp RG */
> > >> +#define DRM_FORMAT_RG1616	fourcc_code('R', 'G', '3', '2') /* [31:0] R:G 16:16 little endian */
> > >> +#define DRM_FORMAT_GR1616	fourcc_code('G', 'R', '3', '2') /* [31:0] G:R 16:16 little endian */
> > >> +
> > >>  /* 8 bpp RGB */
> > >>  #define DRM_FORMAT_RGB332	fourcc_code('R', 'G', 'B', '8') /* [7:0] R:G:B 3:3:2 */
> > >>  #define DRM_FORMAT_BGR233	fourcc_code('B', 'G', 'R', '8') /* [7:0] B:G:R 2:3:3 */
> > >> --
> > >> 2.9.3

> -- 
> Ville Syrjälä
> Intel OTC
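As a side note on what the new defines evaluate to: fourcc_code() packs four ASCII characters into a 32-bit value, first character in the lowest byte. The sketch below re-declares that packing under a toy_ prefix so it builds outside the kernel tree; the TOY_* names and the toy_fourcc_to_str() helper are illustrative assumptions, not part of the patch.

```c
#include <stdint.h>

/*
 * Same packing as fourcc_code() in include/uapi/drm/drm_fourcc.h,
 * re-declared here (with a toy_ prefix) so this builds in userspace.
 */
#define toy_fourcc_code(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Character tags copied from the quoted patch. */
#define TOY_FORMAT_R16		toy_fourcc_code('R', '1', '6', ' ')
#define TOY_FORMAT_RG1616	toy_fourcc_code('R', 'G', '3', '2')
#define TOY_FORMAT_GR1616	toy_fourcc_code('G', 'R', '3', '2')

/* Hypothetical helper: unpack a code into a NUL-terminated 4-char tag. */
static void toy_fourcc_to_str(uint32_t code, char out[5])
{
	for (int i = 0; i < 4; i++)
		out[i] = (char)((code >> (8 * i)) & 0xff);
	out[4] = '\0';
}
```

Note that the 16:16 formats are tagged "RG32" and "GR32" (the total bits per pixel), while the define names use the per-channel widths.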
Re: [Intel-gfx] [PATCH 1/3] drm/i915: Invalidate the guc ggtt TLB upon insertion
On 11/01/2017 13:13, Chris Wilson wrote: Move the GuC invalidation of its ggtt TLB to where we perform the ggtt modification rather than proliferate it into all the callers of the insert (which may or may not in fact have to do the insertion). v2: Just do the guc invalidate unconditionally, (afaict) it has no impact without the guc loaded on gen8+ v3: Conditionally invalidate the guc - just in case that register has not been validated for other modes. Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin Cc: Daniel Vetter --- drivers/gpu/drm/i915/i915_gem_gtt.c| 78 +++--- drivers/gpu/drm/i915/i915_gem_gtt.h| 3 ++ drivers/gpu/drm/i915/i915_guc_submission.c | 3 -- drivers/gpu/drm/i915/intel_guc_loader.c| 7 +-- drivers/gpu/drm/i915/intel_lrc.c | 6 --- 5 files changed, 57 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 0ed99adfd0da..ed120a1e7f93 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -110,6 +110,30 @@ const struct i915_ggtt_view i915_ggtt_view_rotated = { .type = I915_GGTT_VIEW_ROTATED, }; +static void gen6_ggtt_invalidate(struct drm_i915_private *dev_priv) +{ + /* Note that as an uncached mmio write, this should flush the +* WCB of the writes into the GGTT before it triggers the invalidate. 
+*/ + I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); +} + +static void guc_ggtt_invalidate(struct drm_i915_private *dev_priv) +{ + gen6_ggtt_invalidate(dev_priv); + I915_WRITE(GEN8_GTCR, GEN8_GTCR_INVALIDATE); +} + +static void gmch_ggtt_invalidate(struct drm_i915_private *dev_priv) +{ + intel_gtt_chipset_flush(); +} + +static inline void i915_ggtt_invalidate(struct drm_i915_private *i915) +{ + i915->ggtt.invalidate(i915); +} + int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv, int enable_ppgtt) { @@ -2307,16 +2331,6 @@ void i915_check_and_clear_faults(struct drm_i915_private *dev_priv) POSTING_READ(RING_FAULT_REG(dev_priv->engine[RCS])); } -static void i915_ggtt_flush(struct drm_i915_private *dev_priv) -{ - if (INTEL_INFO(dev_priv)->gen < 6) { - intel_gtt_chipset_flush(); - } else { - I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); - POSTING_READ(GFX_FLSH_CNTL_GEN6); - } -} - void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv) { struct i915_ggtt *ggtt = &dev_priv->ggtt; @@ -2331,7 +2345,7 @@ void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv) ggtt->base.clear_range(&ggtt->base, ggtt->base.start, ggtt->base.total); - i915_ggtt_flush(dev_priv); + i915_ggtt_invalidate(dev_priv); } int i915_gem_gtt_prepare_pages(struct drm_i915_gem_object *obj, @@ -2370,15 +2384,13 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm, enum i915_cache_level level, u32 unused) { - struct drm_i915_private *dev_priv = vm->i915; + struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm); gen8_pte_t __iomem *pte = - (gen8_pte_t __iomem *)dev_priv->ggtt.gsm + - (offset >> PAGE_SHIFT); + (gen8_pte_t __iomem *)ggtt->gsm + (offset >> PAGE_SHIFT); gen8_set_pte(pte, gen8_pte_encode(addr, level)); - I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); - POSTING_READ(GFX_FLSH_CNTL_GEN6); + ggtt->invalidate(vm->i915); } static void gen8_ggtt_insert_entries(struct i915_address_space *vm, @@ -2386,7 +2398,6 @@ static void 
gen8_ggtt_insert_entries(struct i915_address_space *vm, uint64_t start, enum i915_cache_level level, u32 unused) { - struct drm_i915_private *dev_priv = vm->i915; struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm); struct sgt_iter sgt_iter; gen8_pte_t __iomem *gtt_entries; @@ -2415,8 +2426,7 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm, * want to flush the TLBs only after we're certain all the PTE updates * have finished. */ - I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); - POSTING_READ(GFX_FLSH_CNTL_GEN6); + ggtt->invalidate(vm->i915); } struct insert_entries { @@ -2451,15 +2461,13 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm, enum i915_cache_level level, u32 flags) { - struct drm_i915_private *dev_priv = vm->i915; + struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm); gen6_pte_t __iomem *pte = - (gen6_pte_t __iomem *)dev_priv->ggtt.gsm + - (offset >> PAGE_SHIFT); + (gen6_pte_t __iomem *)ggtt->gsm + (offse
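The shape of the change described in the commit message (one invalidate hook chosen up front and called from the insertion paths, rather than from every caller) can be modelled in plain C. This is a userspace sketch under assumptions: the counters stand in for the MMIO writes to GFX_FLSH_CNTL_GEN6 and GEN8_GTCR, and toy_ggtt_set_guc() is a hypothetical helper, not a function taken from the patch.

```c
/* Userspace model only; names echo the patch but this is not driver code. */
struct toy_ggtt;
typedef void (*toy_invalidate_fn)(struct toy_ggtt *ggtt);

struct toy_ggtt {
	toy_invalidate_fn invalidate;	/* chosen once, used by every insert */
	unsigned int gfx_flush_writes;	/* stands in for GFX_FLSH_CNTL_GEN6 */
	unsigned int guc_tlb_writes;	/* stands in for GEN8_GTCR */
	unsigned long pte[8];
};

static void toy_gen6_invalidate(struct toy_ggtt *ggtt)
{
	ggtt->gfx_flush_writes++;
}

static void toy_guc_invalidate(struct toy_ggtt *ggtt)
{
	/* The GuC variant does the normal flush, then its own TLB invalidate. */
	toy_gen6_invalidate(ggtt);
	ggtt->guc_tlb_writes++;
}

/* Hypothetical helper: pick the hook when GuC use is turned on or off. */
static void toy_ggtt_set_guc(struct toy_ggtt *ggtt, int guc_enabled)
{
	ggtt->invalidate = guc_enabled ? toy_guc_invalidate
				       : toy_gen6_invalidate;
}

/* Callers only insert; the invalidate happens here. idx must be < 8. */
static void toy_ggtt_insert_page(struct toy_ggtt *ggtt, unsigned int idx,
				 unsigned long addr)
{
	ggtt->pte[idx] = addr;
	ggtt->invalidate(ggtt);
}
```

The point of the indirection is that callers such as the GuC loader no longer need to remember the extra invalidate; whether the real code selects the hook exactly this way is not visible in the truncated diff above.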
[Intel-gfx] [drm-intel:for-linux-next 1/4] htmldocs: drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for parameter 'vm'
tree: git://anongit.freedesktop.org/drm-intel for-linux-next head: c781c978e784c50dcd7cb312fe17f5281923f55b commit: e007b19d7ba7424735fd4f17a355b145ae153e4c [1/4] drm/i915: Use the MRU stack search after evicting reproduce: make htmldocs All warnings (new ones prefixed by >>): make[3]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule. include/linux/init.h:1: warning: no structured comments found include/linux/kthread.h:26: warning: Excess function parameter '...' description in 'kthread_create' kernel/sys.c:1: warning: no structured comments found drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found include/drm/drm_drv.h:441: warning: No description found for parameter 'firstopen' include/drm/drm_drv.h:441: warning: No description found for parameter 'open' include/drm/drm_drv.h:441: warning: No description found for parameter 'preclose' include/drm/drm_drv.h:441: warning: No description found for parameter 'postclose' include/drm/drm_drv.h:441: warning: No description found for parameter 'lastclose' include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_ioctl' include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_quiescent' include/drm/drm_drv.h:441: warning: No description found for parameter 'context_dtor' include/drm/drm_drv.h:441: warning: No description found for parameter 'set_busid' include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_handler' include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_preinstall' include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_postinstall' include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_uninstall' include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_init' include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_cleanup' include/drm/drm_drv.h:441: warning: No description found for parameter 
'gem_open_object' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_close_object' include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_handle_to_fd' include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_fd_to_handle' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_export' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_pin' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_unpin' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_res_obj' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_get_sg_table' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import_sg_table' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vmap' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vunmap' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_mmap' include/drm/drm_drv.h:441: warning: No description found for parameter 'vgaarb_irq' include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_vm_ops' include/drm/drm_drv.h:441: warning: No description found for parameter 'major' include/drm/drm_drv.h:441: warning: No description found for parameter 'minor' include/drm/drm_drv.h:441: warning: No description found for parameter 'patchlevel' include/drm/drm_drv.h:441: warning: No description found for parameter 'name' include/drm/drm_drv.h:441: warning: No description found for parameter 'desc' include/drm/drm_drv.h:441: warning: No description found for parameter 'date' include/drm/drm_drv.h:441: warning: No description found for parameter 'driver_features' include/drm/drm_drv.h:441: warning: No description found for 
parameter 'dev_priv_size' include/drm/drm_drv.h:441: warning: No description found for parameter 'ioctls' include/drm/drm_drv.h:441: warning: No description found for parameter 'num_ioctls' include/drm/drm_drv.h:441: warning: No description found for parameter 'fops' include/drm/drm_drv.h:441: warning: No description found for parameter 'legacy_dev_list' >> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for >> parameter 'vm' >> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for >> parameter 'node' >> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for >> parameter 'size' >> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No description found for >> parameter 'alignment' >> drivers/gpu/drm/i915/i915_gem_gtt.c:3594: warning: No descri
[Intel-gfx] ✓ Fi.CI.BAT: success for HuC Loading Patches (rev2)
== Series Details ==

Series: HuC Loading Patches (rev2)
URL   : https://patchwork.freedesktop.org/series/17499/
State : success

== Summary ==

Series 17499v2 HuC Loading Patches
https://patchwork.freedesktop.org/api/1.0/series/17499/revisions/2/mbox/

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

a947b47b6c0c947253f44d750512220ecb7c5cf4 drm-tip: 2017y-01m-11d-14h-32m-39s UTC integration manifest
fabcb22 drm/i915/get_params: Add HuC status to getparams
3532699 drm/i915/huc: Support HuC authentication
50c8a56 drm/i915/huc: Add debugfs for HuC loading status check
5d14b30 drm/i915/HuC: Add KBL huC loading Support
6def1fb drm/i915/huc: Add BXT HuC Loading Support
e6713f5 drm/i915/huc: Add HuC fw loading support
061e51b drm/i915/huc: Unified css_header struct for GuC and HuC
6db7e21 drm/i915/guc: Make the GuC fw loading helper functions general

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3482/
[Intel-gfx] [PATCH v3] drm/i915/scheduler: emulate a scheduler for guc
This emulates execlists on top of the GuC in order to defer submission of requests to the hardware. This deferral allows time for high priority requests to gazump their way to the head of the queue, however it nerfs the GuC by converting it back into a simple execlist (where the CPU has to wake up after every request to feed new commands into the GuC). v2: Drop hack status - though iirc there is still a lockdep inversion between fence and engine->timeline->lock (which is impossible as the nesting only occurs on different fences - hopefully just requires some judicious lockdep annotation) v3: Apply lockdep nesting to enabling signaling on the request, using the pattern we already have in __i915_gem_request_submit(); Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_guc_submission.c | 92 +++--- drivers/gpu/drm/i915/i915_irq.c| 4 +- drivers/gpu/drm/i915/intel_lrc.c | 5 +- 3 files changed, 89 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 913d87358972..4484591cbf7c 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) u32 freespace; int ret; - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size); freespace -= client->wq_rsvd; if (likely(freespace >= wqi_size)) { @@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) client->no_wq_space++; ret = -EAGAIN; } - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); return ret; } @@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request) GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size); - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); client->wq_rsvd -= wqi_size; - spin_unlock(&client->wq_lock); + 
spin_unlock_irq(&client->wq_lock); } /* Construct a Work Item and append it to the GuC's Work Queue */ @@ -534,10 +534,87 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq) static void i915_guc_submit(struct drm_i915_gem_request *rq) { - i915_gem_request_submit(rq); + __i915_gem_request_submit(rq); __i915_guc_submit(rq); } +static void nested_enable_signaling(struct drm_i915_gem_request *rq) +{ + if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, +&rq->fence.flags)) + return; + + GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags)); + + spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING); + intel_engine_enable_signaling(rq); + spin_unlock(&rq->lock); +} + +static bool i915_guc_dequeue(struct intel_engine_cs *engine) +{ + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *last = port[0].request; + unsigned long flags; + struct rb_node *rb; + bool submit = false; + + spin_lock_irqsave(&engine->timeline->lock, flags); + rb = engine->execlist_first; + while (rb) { + struct drm_i915_gem_request *cursor = + rb_entry(rb, typeof(*cursor), priotree.node); + + if (last && cursor->ctx != last->ctx) { + if (port != engine->execlist_port) + break; + + i915_gem_request_assign(&port->request, last); + nested_enable_signaling(last); + port++; + } + + rb = rb_next(rb); + rb_erase(&cursor->priotree.node, &engine->execlist_queue); + RB_CLEAR_NODE(&cursor->priotree.node); + cursor->priotree.priority = INT_MAX; + + i915_guc_submit(cursor); + last = cursor; + submit = true; + } + if (submit) { + i915_gem_request_assign(&port->request, last); + nested_enable_signaling(last); + engine->execlist_first = rb; + } + spin_unlock_irqrestore(&engine->timeline->lock, flags); + + return submit; +} + +static void i915_guc_irq_handler(unsigned long data) +{ + struct intel_engine_cs *engine = (struct intel_engine_cs *)data; + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *rq; + bool submit; + + do { + 
rq = port[0].request; + while (rq && i915_gem_request_completed(rq)) { + i915_gem_request_put(rq); + rq = port[1].request; + port[0].request = rq; +
Re: [Intel-gfx] [PATCH 2/4] drm/i915: Fix POWER_DOMAIN_AUDIO refcounting.
On Thu, Dec 15, 2016 at 03:29:43PM +0100, Maarten Lankhorst wrote: > If the crtc was brought up with audio before the driver loads, > then crtc_disable will remove a refcount to audio that doesn't exist > before. > > Fortunately we already set power domains on readout, so we can just add > the power domain handling to get_crtc_power_domains, which will update > the power domains correctly in all cases. > > This was found when testing module reload on CI with the crtc enabled, > which resulted in the following warn after module reload + modeset: > > [ 24.197041] [ cut here ] > [ 24.197075] WARNING: CPU: 0 PID: 99 at > drivers/gpu/drm/i915/intel_runtime_pm.c:1790 > intel_display_power_put+0x134/0x140 [i915] > [ 24.197076] Use count on domain AUDIO is already zero > [ 24.197098] CPU: 0 PID: 99 Comm: kworker/u8:2 Not tainted > 4.9.0-CI-Trybot_393+ #1 > [ 24.197099] Hardware name: /NUC6i5SYB, BIOS > SYSKLi35.86A.0042.2016.0409.1246 04/09/2016 > [ 24.197102] Workqueue: events_unbound async_run_entry_fn > [ 24.197105] c93c7688 81435b35 c93c76d8 > > [ 24.197107] c93c76c8 8107e4d6 06fe5dc36f28 > 88025dc30054 > [ 24.197109] 88025dc36f28 88025dc3 88025dc3 > 0015 > [ 24.197110] Call Trace: > [ 24.197113] [] dump_stack+0x67/0x92 > [ 24.197116] [] __warn+0xc6/0xe0 > [ 24.197118] [] warn_slowpath_fmt+0x4a/0x50 > [ 24.197149] [] intel_display_power_put+0x134/0x140 > [i915] > [ 24.197187] [] intel_disable_ddi+0x4d/0x80 [i915] > [ 24.197223] [] intel_encoders_disable.isra.74+0x7f/0x90 > [i915] > [ 24.197257] [] haswell_crtc_disable+0x55/0x170 [i915] > [ 24.197292] [] intel_atomic_commit_tail+0x108/0xfd0 > [i915] > [ 24.197295] [] ? 
__lock_is_held+0x66/0x90 > [ 24.197330] [] intel_atomic_commit+0x429/0x560 [i915] > [ 24.197332] [] > ?drm_atomic_add_affected_connectors+0x56/0xf0 > [ 24.197334] [] drm_atomic_commit+0x46/0x50 > [ 24.197336] [] restore_fbdev_mode+0x147/0x270 > [ 24.197337] [] > drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70 > [ 24.197339] [] drm_fb_helper_set_par+0x28/0x50 > [ 24.197374] [] intel_fbdev_set_par+0x13/0x70 [i915] > [ 24.197376] [] fbcon_init+0x57a/0x600 > [ 24.197379] [] visual_init+0xd1/0x130 > [ 24.197381] [] do_bind_con_driver+0x1bc/0x3a0 > [ 24.197384] [] do_take_over_console+0x111/0x180 > [ 24.197386] [] do_fbcon_takeover+0x52/0xb0 > [ 24.197387] [] fbcon_event_notify+0x723/0x850 > [ 24.197390] [] ?__blocking_notifier_call_chain+0x30/0x70 > [ 24.197392] [] notifier_call_chain+0x34/0xa0 > [ 24.197394] [] __blocking_notifier_call_chain+0x48/0x70 > [ 24.197397] [] blocking_notifier_call_chain+0x11/0x20 > [ 24.197398] [] fb_notifier_call_chain+0x16/0x20 > [ 24.197400] [] register_framebuffer+0x24c/0x330 > [ 24.197402] [] drm_fb_helper_initial_config+0x219/0x3c0 > [ 24.197436] [] intel_fbdev_initial_config+0x13/0x30 > [i915] > [ 24.197438] [] async_run_entry_fn+0x34/0x140 > [ 24.197440] [] process_one_work+0x1ec/0x6b0 > [ 24.197442] [] ? process_one_work+0x166/0x6b0 > [ 24.197445] [] worker_thread+0x49/0x490 > [ 24.197447] [] ? process_one_work+0x6b0/0x6b0 > [ 24.197448] [] kthread+0xeb/0x110 > [ 24.197451] [] ? kthread_park+0x60/0x60 > [ 24.197453] [] ret_from_fork+0x27/0x40 > [ 24.197476] ---[ end trace bda64b683b8e8162 ]--- > > Signed-off-by: Maarten Lankhorst Do we still need this with patch 3? I know it'd be nice if we could faithfully restore any state we can also program, but then that's also a lot of complexity ... Otoh patch 3 means we'll stop testing a lot of the fastboot code while reloading the driver. 
But then that's been the thing in the past, and as long as we still boot
up we have at least some test coverage of the fastboot code (I'm mostly
concerned about the plane/buffer readout code, since that's not covered
by the state checker).

But for now I'd say let's just go with patch 3 only.
-Daniel

> ---
>  drivers/gpu/drm/i915/intel_ddi.c     | 14 ++
>  drivers/gpu/drm/i915/intel_display.c |  4
>  drivers/gpu/drm/i915/intel_dp_mst.c  |  9 ++---
>  3 files changed, 8 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
> index d808a2ccc29e..8c9ce850760b 100644
> --- a/drivers/gpu/drm/i915/intel_ddi.c
> +++ b/drivers/gpu/drm/i915/intel_ddi.c
> @@ -1835,8 +1835,6 @@ static void intel_enable_ddi(struct intel_encoder *intel_encoder,
>  				 struct drm_connector_state *conn_state)
>  {
>  	struct drm_encoder *encoder = &intel_encoder->base;
> -	struct drm_crtc *crtc = encoder->crtc;
> -	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	struct drm_i915_private *dev_priv = to_i915(encoder->dev);
>  	enum port port = intel_ddi_get_encoder_port(intel_encoder);
>
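The imbalance behind the "Use count on domain AUDIO is already zero" warning in the quoted commit message can be reproduced with a tiny refcount model. This is a sketch under assumptions: the toy_* names are invented, and the has_audio argument to toy_readout() simply models whether readout accounts for the audio domain, which is the part the quoted patch says it adds to get_crtc_power_domains.

```c
#include <stdbool.h>

enum toy_domain { TOY_DOMAIN_PIPE_A, TOY_DOMAIN_AUDIO, TOY_DOMAIN_COUNT };

struct toy_power {
	int use_count[TOY_DOMAIN_COUNT];
	int underflow_warnings;		/* counts "already zero" WARNs */
};

static void toy_power_get(struct toy_power *p, enum toy_domain d)
{
	p->use_count[d]++;
}

static void toy_power_put(struct toy_power *p, enum toy_domain d)
{
	if (p->use_count[d] == 0) {
		/* Mirrors "Use count on domain ... is already zero". */
		p->underflow_warnings++;
		return;
	}
	p->use_count[d]--;
}

/* Readout: take references for what the hardware already has enabled. */
static void toy_readout(struct toy_power *p, bool crtc_active, bool has_audio)
{
	if (!crtc_active)
		return;
	toy_power_get(p, TOY_DOMAIN_PIPE_A);
	if (has_audio)
		toy_power_get(p, TOY_DOMAIN_AUDIO);
}

/* Disable: drop the references an enabled crtc is expected to hold. */
static void toy_crtc_disable(struct toy_power *p, bool has_audio)
{
	if (has_audio)
		toy_power_put(p, TOY_DOMAIN_AUDIO);
	toy_power_put(p, TOY_DOMAIN_PIPE_A);
}
```

If readout ignores audio while disable drops an audio reference, the model hits the underflow path once; when readout takes the audio reference the counts balance.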
Re: [Intel-gfx] [PATCH 3/4] drm/i915: Disable all crtcs during driver unload.
On Thu, Dec 15, 2016 at 03:29:44PM +0100, Maarten Lankhorst wrote:
> We may keep the crtc's enabled when userspace unsets all framebuffers but
> keeps the crtc active. This exposes a WARN in fbc_global disable, and
> a lot of bugs in our hardware readout code. Solve this by disabling
> all crtc's for now.
>
> Signed-off-by: Maarten Lankhorst
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 6428588518aa..bb0d7517b678 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -43,6 +43,7 @@
>
>  #include
>  #include
> +#include
>  #include
>
>  #include "i915_drv.h"
> @@ -1282,6 +1283,10 @@ void i915_driver_unload(struct drm_device *dev)
>
>  	intel_display_power_get(dev_priv, POWER_DOMAIN_INIT);
>
> +	drm_modeset_lock_all(dev);
> +	drm_atomic_helper_disable_all(dev, dev->mode_config.acquire_ctx);
> +	drm_modeset_unlock_all(dev);

Bikeshed: I think we should phase out lock_all and do an explicit acquire
context here. And maybe get a bit better at refactoring the boilerplate
that brings along.

But also as-is:

Reviewed-by: Daniel Vetter

> +
>  	i915_driver_unregister(dev_priv);
>
>  	drm_vblank_cleanup(dev);
> --
> 2.7.4

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [Intel-gfx] [PATCH 4/4] drm: Resurrect atomic rmfb code, v2
On Thu, Dec 15, 2016 at 03:29:45PM +0100, Maarten Lankhorst wrote: > From: Daniel Vetter > > This was somehow lost between v3 and the merged version in Maarten's > patch merged as: > > commit f2d580b9a8149735cbc4b59c4a8df60173658140 > Author: Maarten Lankhorst > Date: Wed May 4 14:38:26 2016 +0200 > > drm/core: Do not preserve framebuffer on rmfb, v4. > > Actual code copied from Maarten's patch, but with the slight change to > just use dev->mode_config.funcs->atomic_commit to decide whether to > use the atomic path or not. > > v2: > - Remove plane->fb assignment, done by drm_atomic_clean_old_fb. > - Add WARN_ON when atomic_remove_fb fails. > - Always call drm_atomic_state_put. > > Signed-off-by: Daniel Vetter > Signed-off-by: Daniel Vetter > Signed-off-by: Maarten Lankhorst Would be great if someone else could r-b this, I've proven pretty well that I don't understand the complexity here :( -Daniel > --- > drivers/gpu/drm/drm_atomic.c| 64 > + > drivers/gpu/drm/drm_crtc_internal.h | 1 + > drivers/gpu/drm/drm_framebuffer.c | 7 > 3 files changed, 72 insertions(+) > > diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c > index d1d252261bf1..23a3845542e1 100644 > --- a/drivers/gpu/drm/drm_atomic.c > +++ b/drivers/gpu/drm/drm_atomic.c > @@ -2059,6 +2059,70 @@ static void complete_crtc_signaling(struct drm_device > *dev, > kfree(fence_state); > } > > +int drm_atomic_remove_fb(struct drm_framebuffer *fb) > +{ > + struct drm_modeset_acquire_ctx ctx; > + struct drm_device *dev = fb->dev; > + struct drm_atomic_state *state; > + struct drm_plane *plane; > + int ret = 0; > + unsigned plane_mask; > + > + state = drm_atomic_state_alloc(dev); > + if (!state) > + return -ENOMEM; > + > + drm_modeset_acquire_init(&ctx, 0); > + state->acquire_ctx = &ctx; > + > +retry: > + plane_mask = 0; > + ret = drm_modeset_lock_all_ctx(dev, &ctx); > + if (ret) > + goto unlock; > + > + drm_for_each_plane(plane, dev) { > + struct drm_plane_state *plane_state; > + > + if 
(plane->state->fb != fb) > + continue; > + > + plane_state = drm_atomic_get_plane_state(state, plane); > + if (IS_ERR(plane_state)) { > + ret = PTR_ERR(plane_state); > + goto unlock; > + } > + > + drm_atomic_set_fb_for_plane(plane_state, NULL); > + ret = drm_atomic_set_crtc_for_plane(plane_state, NULL); > + if (ret) > + goto unlock; > + > + plane_mask |= BIT(drm_plane_index(plane)); > + > + plane->old_fb = plane->fb; > + } > + > + if (plane_mask) > + ret = drm_atomic_commit(state); > + > +unlock: > + if (plane_mask) > + drm_atomic_clean_old_fb(dev, plane_mask, ret); > + > + if (ret == -EDEADLK) { > + drm_modeset_backoff(&ctx); > + goto retry; > + } > + > + drm_atomic_state_put(state); > + > + drm_modeset_drop_locks(&ctx); > + drm_modeset_acquire_fini(&ctx); > + > + return ret; > +} > + > int drm_mode_atomic_ioctl(struct drm_device *dev, > void *data, struct drm_file *file_priv) > { > diff --git a/drivers/gpu/drm/drm_crtc_internal.h > b/drivers/gpu/drm/drm_crtc_internal.h > index cdf6860c9d22..121e250853d2 100644 > --- a/drivers/gpu/drm/drm_crtc_internal.h > +++ b/drivers/gpu/drm/drm_crtc_internal.h > @@ -178,6 +178,7 @@ int drm_atomic_get_property(struct drm_mode_object *obj, > struct drm_property *property, uint64_t *val); > int drm_mode_atomic_ioctl(struct drm_device *dev, > void *data, struct drm_file *file_priv); > +int drm_atomic_remove_fb(struct drm_framebuffer *fb); > > > /* drm_plane.c */ > diff --git a/drivers/gpu/drm/drm_framebuffer.c > b/drivers/gpu/drm/drm_framebuffer.c > index cbf0c893f426..c358bf8280a8 100644 > --- a/drivers/gpu/drm/drm_framebuffer.c > +++ b/drivers/gpu/drm/drm_framebuffer.c > @@ -770,6 +770,12 @@ void drm_framebuffer_remove(struct drm_framebuffer *fb) >* in this manner. 
>*/ > if (drm_framebuffer_read_refcount(fb) > 1) { > + if (dev->mode_config.funcs->atomic_commit) { > + int ret = drm_atomic_remove_fb(fb); > + WARN(ret, "atomic remove_fb failed with %i\n", ret); > + goto out; > + } > + > drm_modeset_lock_all(dev); > /* remove from any CRTC */ > drm_for_each_crtc(crtc, dev) { > @@ -787,6 +793,7 @@ void drm_framebuffer_remove(struct drm_framebuffer *fb) > drm_modeset_unlock_all(dev); > } > > +out: > drm_framebuffer_unreference(fb); > } > EXPORT_SYMBOL(drm_framebuffer_remove); > -- > 2.7.4 > -- Daniel Vetter Software Engin
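The retry label in the quoted drm_atomic_remove_fb() follows the usual acquire-context pattern: if taking the locks reports -EDEADLK, back off and start over. A stripped-down userspace model of just that control flow is sketched below; the toy_* functions are stand-ins invented for illustration and only imitate the shape of the drm_modeset_* calls in the patch.

```c
#include <errno.h>

/* Toy acquire context: 'contended' is how many lock attempts will still fail. */
struct toy_ctx {
	int contended;
	int backoffs;
	int commits;
};

/* Stand-in for taking all modeset locks under an acquire context. */
static int toy_lock_all(struct toy_ctx *ctx)
{
	if (ctx->contended > 0) {
		ctx->contended--;
		return -EDEADLK;
	}
	return 0;
}

/* Stand-in for dropping held locks and waiting for the contended one. */
static void toy_backoff(struct toy_ctx *ctx)
{
	ctx->backoffs++;
}

/* Lock, commit, and restart from the top whenever locking hits -EDEADLK. */
static int toy_remove_fb(struct toy_ctx *ctx)
{
	int ret;

retry:
	ret = toy_lock_all(ctx);
	if (ret == 0)
		ctx->commits++;		/* the atomic commit would go here */

	if (ret == -EDEADLK) {
		toy_backoff(ctx);
		goto retry;
	}

	return ret;
}
```

In the real function the state built so far is also reused across retries and the old-fb bookkeeping is cleaned up before backing off; the model leaves that out.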
Re: [Intel-gfx] 4.10-rc2 oops in DRM connector code
On 01/11/2017 07:39 AM, Daniel Vetter wrote: > Hm, just cherry-picked it on top of Linus' latest 4.10 git, applies > cleanly there. The substituation was for 4.9. I can send you the patch > here, but seems all fine from what I can tell ... All of the printk's that I added were making it fail to apply. So, I took a 4.10-rc3 kernel with i915 compiled in (not as a module) and applied e73ab00e9a0f17, which I grabbed from linux-next. I'm seeing basically the same behavior that I did before applying e73ab00e9a0f17. sysfs_create_dir_ns() fails because of a NULL kobj->parent. Have you guys tried testing this yourselves? It seems really easy to reproduce if you just compile the driver in. > [1.400797] drm_dev_register(88040c73)::730 cpu: 2 > [1.400860] drm_connector_register(88040c76b000)::382 > connector->registered: 0 cpu: 1 > [1.400870] sysfs_create_dir_ns()::53 error: -2 > [1.400874] create_dir()::75 error: -2 cpu: 1 > [1.400878] [ cut here ] > [1.400884] WARNING: CPU: 1 PID: 91 at lib/kobject.c:249 > kobject_add_internal+0x273/0x330 > [1.400888] kobject_add_internal failed for card0-DP-3 (error: -2 parent: > card0) > [1.400892] Modules linked in: > [1.400896] CPU: 1 PID: 91 Comm: kworker/1:2 Not tainted > 4.10.0-rc3-i915borked-dirty #67 > [1.400900] Hardware name: LENOVO 20F5S7V800/20F5S7V800, BIOS R02ET50W > (1.23 ) 09/20/2016 > [1.400906] Workqueue: events_long drm_dp_mst_link_probe_work > [1.400909] Call Trace: > [1.400914] dump_stack+0x67/0x99 > [1.400918] __warn+0xd1/0xf0 > [1.400922] warn_slowpath_fmt+0x4f/0x60 > [1.400925] kobject_add_internal+0x273/0x330 > [1.400927] kobject_add+0x65/0xb0 > [1.400931] ? klist_init+0x31/0x40 > [1.400936] device_add+0x102/0x5d0 > [1.400940] ? kfree_const+0x22/0x30 > [1.400944] device_create_groups_vargs+0xd8/0x100 > [1.400947] device_create_with_groups+0x36/0x40 > [1.400952] ? vprintk_default+0x29/0x50 > [1.400957] ? 
__might_sleep+0x4a/0x90 > [1.400962] drm_sysfs_connector_add+0x60/0xe0 > [1.400967] drm_connector_register+0x74/0xd0 > [1.400971] intel_dp_register_mst_connector+0x41/0x50 > [1.400975] drm_dp_add_port+0x350/0x450 > [1.400977] drm_connector_register(88040ee6f800)::382 > connector->registered: 0 cpu: 2 > [1.400982] ? rcu_early_boot_tests+0x1/0x10 > [1.400986] ? schedule_timeout+0x1cd/0x390 > [1.400989] ? __might_sleep+0x4a/0x90 > [1.400992] ? mutex_lock+0x25/0x50 > [1.400995] ? drm_dp_mst_wait_tx_reply+0x118/0x1e0 > [1.400996] drm_sysfs_connector_add() connector: 88040ee6f800 kdev: > 88040eef9c00 > [1.401002] ? prepare_to_wait_event+0x120/0x120 > [1.401005] ? drm_dp_check_mstb_guid+0x3d/0x120 > [1.401008] drm_dp_send_link_address+0x185/0x1f0 > [1.401012] drm_dp_check_and_send_link_address+0xad/0xc0 > [1.401015] drm_dp_mst_link_probe_work+0x57/0xa0 > [1.401018] process_one_work+0x14b/0x430 > [1.401021] worker_thread+0x12b/0x4a0 > [1.401025] kthread+0x10c/0x140 > [1.401027] ? process_one_work+0x430/0x430 > [1.401030] ? 
kthread_create_on_node+0x40/0x40 > [1.401034] ret_from_fork+0x27/0x40 > [1.401038] ---[ end trace ba43fc250fbf282d ]--- > [1.401041] drm_sysfs_connector_add() connector: 88040c76b000 kdev: > fffe > [1.401043] drm_connector_register(88040c768000)::382 > connector->registered: 0 cpu: 2 > [1.401050] [drm:drm_sysfs_connector_add] *ERROR* failed to register > connector device: -2 > [1.401057] drm_sysfs_connector_add() connector: 88040c768000 kdev: > 88040eefa000 > [1.401093] drm_connector_register(88040c768800)::382 > connector->registered: 0 cpu: 2 > [1.401113] drm_sysfs_connector_add() connector: 88040c768800 kdev: > 88040eefa400 > [1.401122] drm_connector_register(88040c769000)::382 > connector->registered: 0 cpu: 2 > [1.401140] drm_sysfs_connector_add() connector: 88040c769000 kdev: > 88040eefa800 > [1.401167] drm_connector_register(88040c769800)::382 > connector->registered: 0 cpu: 2 > [1.401186] drm_sysfs_connector_add() connector: 88040c769800 kdev: > 88040eefac00 > [1.401195] drm_connector_register(88040c76b000)::382 > connector->registered: 0 cpu: 2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()
On Wed, Jan 11, 2017 at 02:57:22PM +0200, ville.syrj...@linux.intel.com wrote: > From: Ville Syrjälä > > Make the code selecting the RGB quantization range a little less magicy > by wrapping it up in a small helper. > > Signed-off-by: Ville Syrjälä Needs cc: for driver maintainers, Eric for vc4 here. -Daniel > --- > drivers/gpu/drm/drm_edid.c| 18 ++ > drivers/gpu/drm/i915/intel_dp.c | 4 +++- > drivers/gpu/drm/i915/intel_hdmi.c | 3 ++- > drivers/gpu/drm/vc4/vc4_hdmi.c| 4 +++- > include/drm/drm_edid.h| 2 ++ > 5 files changed, 28 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index 4ff04aa84dd0..304c583b8000 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid *edid) > } > EXPORT_SYMBOL(drm_rgb_quant_range_selectable); > > +/** > + * drm_default_rgb_quant_range - default RGB quantization range > + * @mode: display mode > + * > + * Determine the default RGB quantization range for the mode, > + * as specified in CEA-861. > + * > + * Return: The default RGB quantization range for the mode > + */ > +enum hdmi_quantization_range > +drm_default_rgb_quant_range(const struct drm_display_mode *mode) > +{ > + return drm_match_cea_mode(mode) > 1 ? 
> + HDMI_QUANTIZATION_RANGE_LIMITED : > + HDMI_QUANTIZATION_RANGE_FULL; > +} > +EXPORT_SYMBOL(drm_default_rgb_quant_range); > + > static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector, > const u8 *hdmi) > { > diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c > index 343e1d9fa761..d4befbbe834a 100644 > --- a/drivers/gpu/drm/i915/intel_dp.c > +++ b/drivers/gpu/drm/i915/intel_dp.c > @@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder, >* VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry >*/ > pipe_config->limited_color_range = > - bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1; > + bpp != 18 && > + drm_default_rgb_quant_range(adjusted_mode) == > + HDMI_QUANTIZATION_RANGE_LIMITED; > } else { > pipe_config->limited_color_range = > intel_dp->limited_color_range; > diff --git a/drivers/gpu/drm/i915/intel_hdmi.c > b/drivers/gpu/drm/i915/intel_hdmi.c > index 0bcfead14571..19bd13f53729 100644 > --- a/drivers/gpu/drm/i915/intel_hdmi.c > +++ b/drivers/gpu/drm/i915/intel_hdmi.c > @@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder > *encoder, > /* See CEA-861-E - 5.1 Default Encoding Parameters */ > pipe_config->limited_color_range = > pipe_config->has_hdmi_sink && > - drm_match_cea_mode(adjusted_mode) > 1; > + drm_default_rgb_quant_range(adjusted_mode) == > + HDMI_QUANTIZATION_RANGE_LIMITED; > } else { > pipe_config->limited_color_range = > intel_hdmi->limited_color_range; > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > index c4cb2e26de32..d79466a42690 100644 > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > @@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct drm_encoder > *encoder, > csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR, > VC4_HD_CSC_CTL_ORDER); > > - if (vc4_encoder->hdmi_monitor && drm_match_cea_mode(mode) > 1) { > + if (vc4_encoder->hdmi_monitor && > + 
drm_default_rgb_quant_range(adjusted_mode) == > + HDMI_QUANTIZATION_RANGE_LIMITED) { > /* CEA VICs other than #1 requre limited range RGB >* output unless overridden by an AVI infoframe. >* Apply a colorspace conversion to squash 0-255 down > diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h > index 838eaf2b42e9..25cdf5f7a0d8 100644 > --- a/include/drm/drm_edid.h > +++ b/include/drm/drm_edid.h > @@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const > u8 video_code); > bool drm_detect_hdmi_monitor(struct edid *edid); > bool drm_detect_monitor_audio(struct edid *edid); > bool drm_rgb_quant_range_selectable(struct edid *edid); > +enum hdmi_quantization_range > +drm_default_rgb_quant_range(const struct drm_display_mode *mode); > int drm_add_modes_noedid(struct drm_connector *connector, >int hdisplay, int vdisplay); > void drm_set_preferred_mode(struct drm_connector *connector, > -- > 2.10.2 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Softwa
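Editorial note: the rule the new helper wraps (CEA modes other than VIC 1 default to limited range, everything else to full range) can be sketched in standalone C. This is only an illustration under stated assumptions: `match_cea_vic()` and its three-entry table are hypothetical stand-ins for `drm_match_cea_mode()`, and the enum and struct are local simplifications, not the kernel's definitions.

```c
#include <stddef.h>

/* Local stand-in for the kernel's enum hdmi_quantization_range. */
enum quant_range { QUANT_RANGE_LIMITED, QUANT_RANGE_FULL };

/* Simplified mode description (assumption: the real struct has many more fields). */
struct mode { int hdisplay, vdisplay, vrefresh; };

/* Hypothetical lookup: returns the CEA VIC for a mode, or 0 if it is not a CEA mode. */
static int match_cea_vic(const struct mode *m)
{
	static const struct { struct mode mode; int vic; } table[] = {
		{ {  640,  480, 60 },  1 },	/* VIC 1: 640x480p60 */
		{ { 1280,  720, 60 },  4 },	/* VIC 4: 1280x720p60 */
		{ { 1920, 1080, 60 }, 16 },	/* VIC 16: 1920x1080p60 */
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (table[i].mode.hdisplay == m->hdisplay &&
		    table[i].mode.vdisplay == m->vdisplay &&
		    table[i].mode.vrefresh == m->vrefresh)
			return table[i].vic;
	}
	return 0;
}

/* Same decision as the helper in the patch: VIC > 1 means limited range. */
static enum quant_range default_rgb_quant_range(const struct mode *m)
{
	return match_cea_vic(m) > 1 ? QUANT_RANGE_LIMITED : QUANT_RANGE_FULL;
}
```

Callers then compare the result against the limited-range value, as the i915 and vc4 hunks do, instead of repeating the `> 1` check themselves.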
Re: [Intel-gfx] [PATCH 2/5] drm/edid: Introduce drm_default_rgb_quant_range()
On Wed, Jan 11, 2017 at 05:16:54PM +0100, Daniel Vetter wrote: > On Wed, Jan 11, 2017 at 02:57:22PM +0200, ville.syrj...@linux.intel.com wrote: > > From: Ville Syrjälä > > > > Make the code selecting the RGB quantization range a little less magicy > > by wrapping it up in a small helper. > > > > Signed-off-by: Ville Syrjälä > > Needs cc: for driver maintainers, Eric for vc4 here. Eric was cc:d. I was just too lazy to add the cc:s to all the commit messages, and so I just used --cc on the whole lot. > -Daniel > > > --- > > drivers/gpu/drm/drm_edid.c| 18 ++ > > drivers/gpu/drm/i915/intel_dp.c | 4 +++- > > drivers/gpu/drm/i915/intel_hdmi.c | 3 ++- > > drivers/gpu/drm/vc4/vc4_hdmi.c| 4 +++- > > include/drm/drm_edid.h| 2 ++ > > 5 files changed, 28 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > > index 4ff04aa84dd0..304c583b8000 100644 > > --- a/drivers/gpu/drm/drm_edid.c > > +++ b/drivers/gpu/drm/drm_edid.c > > @@ -3768,6 +3768,24 @@ bool drm_rgb_quant_range_selectable(struct edid > > *edid) > > } > > EXPORT_SYMBOL(drm_rgb_quant_range_selectable); > > > > +/** > > + * drm_default_rgb_quant_range - default RGB quantization range > > + * @mode: display mode > > + * > > + * Determine the default RGB quantization range for the mode, > > + * as specified in CEA-861. > > + * > > + * Return: The default RGB quantization range for the mode > > + */ > > +enum hdmi_quantization_range > > +drm_default_rgb_quant_range(const struct drm_display_mode *mode) > > +{ > > + return drm_match_cea_mode(mode) > 1 ? 
> > + HDMI_QUANTIZATION_RANGE_LIMITED : > > + HDMI_QUANTIZATION_RANGE_FULL; > > +} > > +EXPORT_SYMBOL(drm_default_rgb_quant_range); > > + > > static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector, > >const u8 *hdmi) > > { > > diff --git a/drivers/gpu/drm/i915/intel_dp.c > > b/drivers/gpu/drm/i915/intel_dp.c > > index 343e1d9fa761..d4befbbe834a 100644 > > --- a/drivers/gpu/drm/i915/intel_dp.c > > +++ b/drivers/gpu/drm/i915/intel_dp.c > > @@ -1713,7 +1713,9 @@ intel_dp_compute_config(struct intel_encoder *encoder, > > * VESA DisplayPort Ver.1.2a - 5.1.1.1 Video Colorimetry > > */ > > pipe_config->limited_color_range = > > - bpp != 18 && drm_match_cea_mode(adjusted_mode) > 1; > > + bpp != 18 && > > + drm_default_rgb_quant_range(adjusted_mode) == > > + HDMI_QUANTIZATION_RANGE_LIMITED; > > } else { > > pipe_config->limited_color_range = > > intel_dp->limited_color_range; > > diff --git a/drivers/gpu/drm/i915/intel_hdmi.c > > b/drivers/gpu/drm/i915/intel_hdmi.c > > index 0bcfead14571..19bd13f53729 100644 > > --- a/drivers/gpu/drm/i915/intel_hdmi.c > > +++ b/drivers/gpu/drm/i915/intel_hdmi.c > > @@ -1330,7 +1330,8 @@ bool intel_hdmi_compute_config(struct intel_encoder > > *encoder, > > /* See CEA-861-E - 5.1 Default Encoding Parameters */ > > pipe_config->limited_color_range = > > pipe_config->has_hdmi_sink && > > - drm_match_cea_mode(adjusted_mode) > 1; > > + drm_default_rgb_quant_range(adjusted_mode) == > > + HDMI_QUANTIZATION_RANGE_LIMITED; > > } else { > > pipe_config->limited_color_range = > > intel_hdmi->limited_color_range; > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > > index c4cb2e26de32..d79466a42690 100644 > > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > > @@ -463,7 +463,9 @@ static void vc4_hdmi_encoder_mode_set(struct > > drm_encoder *encoder, > > csc_ctl = VC4_SET_FIELD(VC4_HD_CSC_CTL_ORDER_BGR, > > VC4_HD_CSC_CTL_ORDER); > > > > - if (vc4_encoder->hdmi_monitor && 
drm_match_cea_mode(mode) > 1) { > > + if (vc4_encoder->hdmi_monitor && > > + drm_default_rgb_quant_range(adjusted_mode) == > > + HDMI_QUANTIZATION_RANGE_LIMITED) { > > /* CEA VICs other than #1 requre limited range RGB > > * output unless overridden by an AVI infoframe. > > * Apply a colorspace conversion to squash 0-255 down > > diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h > > index 838eaf2b42e9..25cdf5f7a0d8 100644 > > --- a/include/drm/drm_edid.h > > +++ b/include/drm/drm_edid.h > > @@ -441,6 +441,8 @@ enum hdmi_picture_aspect drm_get_cea_aspect_ratio(const > > u8 video_code); > > bool drm_detect_hdmi_monitor(struct edid *edid); > > bool drm_detect_monitor_audio(struct edid *edid); > > bool drm_rgb_quant_range_selectable(struct edid *edid); > > +enum hdmi_quantization_range > > +drm_default_rgb_quant_range(const struct drm_display_mode *mode); > > int drm_add_modes_noedid(struct drm_conne
[Intel-gfx] GPU hang with kernel 4.10rc3
With kernel 4.10rc3 running as Xen dom0 I get at each boot:

[ 49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell [1431], reason: Hang on render ring, action: reset
[ 49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 49.213700] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 49.213700] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 49.213700] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 49.213755] drm/i915: Resetting chip after gpu hang
[ 60.213769] drm/i915: Resetting chip after gpu hang
[ 71.189737] drm/i915: Resetting chip after gpu hang
[ 82.165747] drm/i915: Resetting chip after gpu hang
[ 93.205727] drm/i915: Resetting chip after gpu hang

The dump is attached.

Juergen

GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell [1431], reason: Hang on render ring, action: reset
Kernel: 4.10.0-rc3-pv+
Time: 1484151498 s 569085 us
Boottime: 39 s 844107 us
Uptime: 34 s 750060 us
is_mobile: no
is_i85x: no
is_i915g: no
is_i945gm: no
is_g33: no
is_g4x: no
is_pineview: no
is_broadwater: no
is_crestline: no
is_ivybridge: no
is_valleyview: no
is_cherryview: no
is_haswell: yes
is_broadwell: no
is_skylake: no
is_broxton: no
is_kabylake: no
is_alpha_support: no
has_64bit_reloc: no
has_csr: no
has_ddi: yes
has_dp_mst: yes
has_fbc: yes
has_fpga_dbg: yes
has_gmbus_irq: yes
has_gmch_display: no
has_guc: no
has_hotplug: yes
has_hw_contexts: yes
has_l3_dpf: yes
has_llc: yes
has_logical_ring_contexts: no
has_overlay: no
has_pipe_cxsr: no
has_pooled_eu: no
has_psr: yes
has_rc6: yes
has_rc6p: no
has_resource_streamer: yes
has_runtime_pm: yes
has_snoop: no
cursor_needs_physical: no
hws_needs_physical: no
overlay_needs_physical: no
supports_tv: no
has_decoupled_mmio: no
Active process (on ring render): gnome-shell [1431]
Reset count: 0
Suspend count: 0
PCI ID: 0x0416
PCI Revision: 0x06
PCI Subsystem: 1028:05bd
IOMMU enabled?: 0
EIR: 0x
IER: 0xfc002529
GTIER: 0x00401821
PGTBL_ER: 0x
FORCEWAKE: 0x0001
DERRMR: 0x
CCID: 0x012f710d
Missed interrupts: 0x
  fence[0] = bbf03300603001
  fence[1] = c3300700bf4003
  fence[2] = d3300f00c34003
  fence[3] = 12f003300d34001
  fence[4] = 230703f01308003
  fence[5] = 2b0003b02309003
  fence[6] = 30c503302b09001
  fence[7] =
  fence[8] =
  fence[9] =
  fence[10] =
  fence[11] =
  fence[12] =
  fence[13] =
  fence[14] =
  fence[15] =
  fence[16] =
  fence[17] =
  fence[18] =
  fence[19] =
  fence[20] =
  fence[21] =
  fence[22] =
  fence[23] =
  fence[24] =
  fence[25] =
  fence[26] =
  fence[27] =
  fence[28] =
  fence[29] =
  fence[30] =
  fence[31] =
ERROR: 0x0101
DONE_REG: 0xffef
ERR_INT: 0x
render command stream:
  START: 0x1000
  HEAD: 0x0620 [0x05f8]
  TAIL: 0x0668 [0x0630, 0x0668]
  CTL: 0x0001f001
  MODE: 0x4000
  HWS: 0x7fff
  ACTHD: 0x 030c9004
  IPEIR: 0x
  IPEHR: 0xc2c2c2c2
  INSTDONE: 0xffdf
  SC_INSTDONE: 0x
  SAMPLER_INSTDONE[0][0]: 0x
  ROW_INSTDONE[0][0]: 0x
  batch: [0x_030c9000, 0x_030cc000]
  BBADDR: 0x_030c9005
  BB_STATE: 0x
  INSTPS: 0x8201
  INSTPM: 0x6080
  FADDR: 0x 030c9200
  RC PSMI: 0x0010
  FAULT_REG: 0x00c5
  SYNC_0: 0x
  SYNC_1: 0x0002
  SYNC_2: 0x
  GFX_MODE: 0x2a00
  PP_DIR_BASE: 0x7fdf
  seqno: 0x0008
  last_seqno: 0x0009
  waiting: yes
  ring->head: 0x
  ring->tail: 0x0668
  hangcheck: hung [42]
blt command stream:
  START: 0x00022000
  HEAD: 0x0088 [0x]
  TAIL: 0x0088 [0x, 0x]
  CTL: 0x0001f001
  MODE: 0x0200
  HWS: 0x00021000
  ACTHD: 0x 0088
  IPEIR: 0x
  IPEHR: 0x
  INSTDONE: 0xfffe
  BBADDR: 0x_00bc0024
  BB_STATE: 0x
  INSTPS: 0x
  INSTPM: 0x
  FADDR: 0x 00022088
  RC PSMI: 0x0010
  FAULT_REG: 0x
  SYNC_0: 0x0008
  SYNC_1: 0x
  SYNC_2: 0x
  GFX_MODE: 0x0200
  PP_DIR_BASE: 0x7fdf
  seqno: 0x0002
  last_seqno: 0x0002
  waiting: no
  ring->head: 0x
  ring->tail: 0x
  hangcheck: idle [0]
bsd command stream:
  START: 0x00043000
  HEAD: 0x [0x]
  TAIL: 0x [0x, 0x]
  CTL: 0x0001f001
  MODE: 0x0200
  HWS: 0x00042000
  ACTHD: 0x
  IPEIR: 0x
  IPEHR: 0x
  INSTDONE: 0xfffe
  BBADDR: 0x_
  BB_STATE: 0x
  INSTPS: 0x
  INSTPM: 0x
Re: [Intel-gfx] [PATCH v6] drm: add fourcc codes for 16bit R and RG
On Wed, Jan 11, 2017 at 07:44:05AM -0800, Ben Widawsky wrote: > On 17-01-11 17:05:04, Ville Syrjälä wrote: > >On Thu, Jan 05, 2017 at 02:45:37PM +0100, Christian König wrote: > >> Am 05.01.2017 um 12:37 schrieb Ville Syrjälä: > >> > On Wed, Jan 04, 2017 at 07:38:55PM +0100, Rainer Hochecker wrote: > >> >> From: Rainer Hochecker > >> >> > >> >> This adds fourcc codes for 16bit planes required for DRM buffer > >> >> export to mesa. > >> >> > >> >> Signed-off-by: Rainer Hochecker > >> > Reviewed-by: Ville Syrjälä > >> > >> Good to see some work landing on that part, patch is Acked-by: Christian > >> König . > > > >Has the userspace side of this been reviewed already? > > > >/me wonders if it's safe to push this... > > > > I acked the mesa side, and Rainer sent a version 2 which also looked fine to > me. > Let me bump that thread... Thanks everyone. I've pushed this patch to drm-misc-next. > > >> > >> > > >> >> --- > >> >> include/uapi/drm/drm_fourcc.h | 7 +++ > >> >> 1 file changed, 7 insertions(+) > >> >> > >> >> diff --git a/include/uapi/drm/drm_fourcc.h > >> >> b/include/uapi/drm/drm_fourcc.h > >> >> index a5890bf..d230e58 100644 > >> >> --- a/include/uapi/drm/drm_fourcc.h > >> >> +++ b/include/uapi/drm/drm_fourcc.h > >> >> @@ -41,10 +41,17 @@ extern "C" { > >> >> /* 8 bpp Red */ > >> >> #define DRM_FORMAT_R8 fourcc_code('R', '8', ' ', ' ') /* > >> >> [7:0] R */ > >> >> > >> >> +/* 16 bpp Red */ > >> >> +#define DRM_FORMAT_R16 fourcc_code('R', '1', '6', ' ') /* > >> >> [15:0] R little endian */ > >> >> + > >> >> /* 16 bpp RG */ > >> >> #define DRM_FORMAT_RG88 fourcc_code('R', 'G', '8', '8') > >> >> /* [15:0] R:G 8:8 little endian */ > >> >> #define DRM_FORMAT_GR88 fourcc_code('G', 'R', '8', '8') > >> >> /* [15:0] G:R 8:8 little endian */ > >> >> > >> >> +/* 32 bpp RG */ > >> >> +#define DRM_FORMAT_RG1616 fourcc_code('R', 'G', '3', '2') /* > >> >> [31:0] R:G 16:16 little endian */ > >> >> +#define DRM_FORMAT_GR1616 fourcc_code('G', 'R', '3', '2') /* > >> >> [31:0] 
G:R 16:16 little endian */ > >> >> + > >> >> /* 8 bpp RGB */ > >> >> #define DRM_FORMAT_RGB332 fourcc_code('R', 'G', 'B', '8') /* > >> >> [7:0] R:G:B 3:3:2 */ > >> >> #define DRM_FORMAT_BGR233 fourcc_code('B', 'G', 'R', '8') /* > >> >> [7:0] B:G:R 2:3:3 */ > >> >> -- > >> >> 2.9.3 > >> > > > >-- > >Ville Syrjälä > >Intel OTC -- Ville Syrjälä Intel OTC
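Editorial note: for readers unfamiliar with how these codes are built, the sketch below shows how `fourcc_code()` packs four characters into a 32-bit value, least significant byte first. It assumes a userspace build, so `uint32_t` stands in for the kernel's `__u32`; `fourcc_to_str()` is a hypothetical helper added here for illustration and is not part of the patch.

```c
#include <stdint.h>

/* Same packing as the uapi macro, with uint32_t standing in for __u32. */
#define fourcc_code(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* The codes added by the patch under discussion. */
#define DRM_FORMAT_R16    fourcc_code('R', '1', '6', ' ')
#define DRM_FORMAT_RG1616 fourcc_code('R', 'G', '3', '2')
#define DRM_FORMAT_GR1616 fourcc_code('G', 'R', '3', '2')

/* Hypothetical helper: unpack a code into a NUL-terminated string for logging. */
static void fourcc_to_str(uint32_t code, char out[5])
{
	out[0] = (char)(code & 0xff);
	out[1] = (char)((code >> 8) & 0xff);
	out[2] = (char)((code >> 16) & 0xff);
	out[3] = (char)((code >> 24) & 0xff);
	out[4] = '\0';
}
```

Note that the macro names say "1616" while the packed characters spell "RG32" and "GR32"; the four characters are just an identifier and carry no layout meaning of their own.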
Re: [Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc
On 11/01/2017 13:13, Chris Wilson wrote: This emulates execlists on top of the GuC in order to defer submission of > requests to the hardware. This deferral allows time for high priority requests to gazump their way to the head of the queue, however it nerfs the GuC by converting it back into a simple execlist (where the CPU has to wake up after every request to feed new commands into the GuC). v2: Drop hack status - though iirc there is still a lockdep inversion between fence and engine->timeline->lock (which is impossible as the nesting only occurs on different fences - hopefully just requires some judicious lockdep annotation) Hm hm... so fence->lock under timeline->lock when we enable signalling, while we already have the opposite in the submit_notify->submit_request yes? That would mean a per fence lock class, which can't work, or a nested annotation on fence->lock? So outside i915? Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_guc_submission.c | 79 +++--- drivers/gpu/drm/i915/i915_irq.c| 4 +- drivers/gpu/drm/i915/intel_lrc.c | 5 +- 3 files changed, 76 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 913d87358972..bdc9e2bc5eb9 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) u32 freespace; int ret; - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size); freespace -= client->wq_rsvd; if (likely(freespace >= wqi_size)) { @@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) client->no_wq_space++; ret = -EAGAIN; } - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); return ret; } @@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request) GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < 
wqi_size); - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); client->wq_rsvd -= wqi_size; - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); } /* Construct a Work Item and append it to the GuC's Work Queue */ @@ -534,10 +534,74 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq) static void i915_guc_submit(struct drm_i915_gem_request *rq) { - i915_gem_request_submit(rq); + __i915_gem_request_submit(rq); __i915_guc_submit(rq); } +static bool i915_guc_dequeue(struct intel_engine_cs *engine) +{ + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *last = port[0].request; + unsigned long flags; + struct rb_node *rb; + bool submit = false; + + spin_lock_irqsave(&engine->timeline->lock, flags); + rb = engine->execlist_first; + while (rb) { + struct drm_i915_gem_request *cursor = + rb_entry(rb, typeof(*cursor), priotree.node); + + if (last && cursor->ctx != last->ctx) { + if (port != engine->execlist_port) + break; + + i915_gem_request_assign(&port->request, last); + dma_fence_enable_sw_signaling(&last->fence); + port++; + } + + rb = rb_next(rb); + rb_erase(&cursor->priotree.node, &engine->execlist_queue); + RB_CLEAR_NODE(&cursor->priotree.node); + cursor->priotree.priority = INT_MAX; + + i915_guc_submit(cursor); + last = cursor; + submit = true; + } + if (submit) { + i915_gem_request_assign(&port->request, last); + dma_fence_enable_sw_signaling(&last->fence); + engine->execlist_first = rb; + } + spin_unlock_irqrestore(&engine->timeline->lock, flags); + + return submit; +} It is again tempting me to suggest a single instance of the dequeue loop. Maybe something like: i915_request_dequeue(engine, assign_func, submit_func) { ... while (rb) { ... if () { ... assign_func(); port++; } ... submit_func(); last = cursor; submit = true; } if (submit) { assign_func(); engine->execlists_first = rb; } ... 
} execlists_dequeue_assign() { i915_gem_request_assign(); } execlists_dequeue_submit() { __i915_gem_request_submit(); } execlists_dequeue(
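Editorial note: the shared-loop idea suggested above can be sketched in standalone C. This is a simplified model, not the i915 code: a priority-sorted array replaces the rbtree, there is no locking, and every name (`struct request`, `struct port`, `dequeue`, `port_assign`, `mark_submitted`) is hypothetical. Only the control flow (fill at most two ports, switch ports when the context changes, stop at a second context change) follows the loop in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

struct request { int ctx; bool submitted; };
struct port { struct request *request; };

/* Backend-specific hooks, as in the suggestion: one to assign, one to submit. */
typedef void (*assign_fn)(struct port *port, struct request *rq);
typedef void (*submit_fn)(struct request *rq);

static void port_assign(struct port *port, struct request *rq)
{
	port->request = rq;
}

static void mark_submitted(struct request *rq)
{
	rq->submitted = true;
}

/*
 * Walk the priority-sorted queue and submit requests until both ports
 * are claimed. Returns how many requests were consumed from the queue.
 */
static size_t dequeue(struct port ports[2], struct request *queue,
		      size_t count, assign_fn assign, submit_fn submit)
{
	struct port *port = ports;
	struct request *last = ports[0].request;
	size_t i;

	for (i = 0; i < count; i++) {
		struct request *cursor = &queue[i];

		if (last && cursor->ctx != last->ctx) {
			/* Already on the second port: nothing more fits. */
			if (port != ports)
				break;

			assign(port, last);
			port++;
		}

		submit(cursor);
		last = cursor;
	}

	/* Record the final request of the current run in its port. */
	if (i)
		assign(port, last);

	return i;
}
```

With a queue of contexts 1, 1, 2, 2, 3 and empty ports, the loop submits the first four requests, leaves the last request of context 1 in the first port and the last request of context 2 in the second, and stops before context 3. An execlists backend and a GuC backend would differ only in the two callbacks they pass in.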
Re: [Intel-gfx] [PATCH v3] drm/i915/scheduler: emulate a scheduler for guc
On 11/01/2017 16:11, Chris Wilson wrote: This emulates execlists on top of the GuC in order to defer submission of requests to the hardware. This deferral allows time for high priority requests to gazump their way to the head of the queue, however it nerfs the GuC by converting it back into a simple execlist (where the CPU has to wake up after every request to feed new commands into the GuC). v2: Drop hack status - though iirc there is still a lockdep inversion between fence and engine->timeline->lock (which is impossible as the nesting only occurs on different fences - hopefully just requires some judicious lockdep annotation) v3: Apply lockdep nesting to enabling signaling on the request, using the pattern we already have in __i915_gem_request_submit(); Signed-off-by: Chris Wilson Cc: Tvrtko Ursulin --- drivers/gpu/drm/i915/i915_guc_submission.c | 92 +++--- drivers/gpu/drm/i915/i915_irq.c| 4 +- drivers/gpu/drm/i915/intel_lrc.c | 5 +- 3 files changed, 89 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c index 913d87358972..4484591cbf7c 100644 --- a/drivers/gpu/drm/i915/i915_guc_submission.c +++ b/drivers/gpu/drm/i915/i915_guc_submission.c @@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) u32 freespace; int ret; - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size); freespace -= client->wq_rsvd; if (likely(freespace >= wqi_size)) { @@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request *request) client->no_wq_space++; ret = -EAGAIN; } - spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); return ret; } @@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request *request) GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size); - spin_lock(&client->wq_lock); + spin_lock_irq(&client->wq_lock); client->wq_rsvd -= wqi_size; - 
spin_unlock(&client->wq_lock); + spin_unlock_irq(&client->wq_lock); } /* Construct a Work Item and append it to the GuC's Work Queue */ @@ -534,10 +534,87 @@ static void __i915_guc_submit(struct drm_i915_gem_request *rq) static void i915_guc_submit(struct drm_i915_gem_request *rq) { - i915_gem_request_submit(rq); + __i915_gem_request_submit(rq); __i915_guc_submit(rq); } +static void nested_enable_signaling(struct drm_i915_gem_request *rq) +{ + if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, +&rq->fence.flags)) + return; + + GEM_BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags)); + + spin_lock_nested(&rq->lock, SINGLE_DEPTH_NESTING); + intel_engine_enable_signaling(rq); + spin_unlock(&rq->lock); +} Crossed wires. :) Opencoding this works I guess. I would stick a trace_dma_fence_enable_signal into it to be extra nice. Regards, Tvrtko + +static bool i915_guc_dequeue(struct intel_engine_cs *engine) +{ + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *last = port[0].request; + unsigned long flags; + struct rb_node *rb; + bool submit = false; + + spin_lock_irqsave(&engine->timeline->lock, flags); + rb = engine->execlist_first; + while (rb) { + struct drm_i915_gem_request *cursor = + rb_entry(rb, typeof(*cursor), priotree.node); + + if (last && cursor->ctx != last->ctx) { + if (port != engine->execlist_port) + break; + + i915_gem_request_assign(&port->request, last); + nested_enable_signaling(last); + port++; + } + + rb = rb_next(rb); + rb_erase(&cursor->priotree.node, &engine->execlist_queue); + RB_CLEAR_NODE(&cursor->priotree.node); + cursor->priotree.priority = INT_MAX; + + i915_guc_submit(cursor); + last = cursor; + submit = true; + } + if (submit) { + i915_gem_request_assign(&port->request, last); + nested_enable_signaling(last); + engine->execlist_first = rb; + } + spin_unlock_irqrestore(&engine->timeline->lock, flags); + + return submit; +} + +static void i915_guc_irq_handler(unsigned long data) +{ + struct 
intel_engine_cs *engine = (struct intel_engine_cs *)data; + struct execlist_port *port = engine->execlist_port; + struct drm_i915_gem_request *rq; + bool submit; + + do { + rq = port[0].request; + while (rq && i915_gem_request_c
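Editorial note: the first half of `nested_enable_signaling()` is a "do this at most once" guard built on an atomic test-and-set of a flag bit. A rough userspace analogue using C11 atomics is sketched below. It is an assumption-laden illustration: `struct fence`, `FLAG_ENABLE_SIGNAL` and `enable_signaling_once()` are hypothetical names, and the lockdep nesting annotation (`spin_lock_nested()`) has no userspace equivalent and is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_ENABLE_SIGNAL (1u << 0)

struct fence {
	atomic_uint flags;
	int enable_calls;	/* counts how often the real work ran */
};

/*
 * Atomically set the flag and inspect its previous value. Only the
 * caller that flipped the bit from 0 to 1 performs the enabling work.
 * Returns true if this call was the one that enabled signaling.
 */
static bool enable_signaling_once(struct fence *f)
{
	unsigned int old = atomic_fetch_or(&f->flags, FLAG_ENABLE_SIGNAL);

	if (old & FLAG_ENABLE_SIGNAL)
		return false;

	f->enable_calls++;	/* stand-in for the actual enabling step */
	return true;
}
```

A tracepoint such as the one suggested in the review would sit next to the enabling step, after the early return, so that it fires once per fence.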
Re: [Intel-gfx] GPU hang with kernel 4.10rc3
On Wed, Jan 11, 2017 at 05:33:34PM +0100, Juergen Gross wrote:
> With kernel 4.10rc3 running as Xen dom0 I get at each boot:
>
> [ 49.213697] [drm] GPU HANG: ecode 7:0:0x3d1d3d3d, in gnome-shell [1431], reason: Hang on render ring, action: reset
> [ 49.213699] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [ 49.213700] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [ 49.213700] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [ 49.213700] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> [ 49.213701] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 49.213755] drm/i915: Resetting chip after gpu hang
> [ 60.213769] drm/i915: Resetting chip after gpu hang
> [ 71.189737] drm/i915: Resetting chip after gpu hang
> [ 82.165747] drm/i915: Resetting chip after gpu hang
> [ 93.205727] drm/i915: Resetting chip after gpu hang
>
> The dump is attached.

That's a nasty one. The first couple of pages of the batchbuffer appear to be overwritten (full of 0xc2c2c2c2, i.e. probably pixel data). That may be a concurrent write by either the GPU or CPU, or we may have incorrectly mapped a set of pages. That it doesn't recover suggests that the corruption occurs frequently, probably on every request/batch.

Is this a new bug? Bisection would be the fastest way to triage it.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Re: [Intel-gfx] [PATCH 2/3] drm/i915/scheduler: emulate a scheduler for guc
On Wed, Jan 11, 2017 at 04:55:46PM +, Tvrtko Ursulin wrote: > > On 11/01/2017 13:13, Chris Wilson wrote: > >This emulates execlists on top of the GuC in order to defer submission of > > requests to the hardware. This deferral allows time for high priority > >requests to gazump their way to the head of the queue, however it nerfs > >the GuC by converting it back into a simple execlist (where the CPU has > >to wake up after every request to feed new commands into the GuC). > > > >v2: Drop hack status - though iirc there is still a lockdep inversion > >between fence and engine->timeline->lock (which is impossible as the > >nesting only occurs on different fences - hopefully just requires some > >judicious lockdep annotation) > > Hm hm... so fence->lock under timeline->lock when we enable > signalling, while we already have the opposite in the > submit_notify->submit_request yes? That would mean a per fence lock > class, which can't work, or a nested annotation on fence->lock? So > outside i915? That was my approach as well, and by heading in that direction, we can see that it is the same nested signal enabling issue we met when handling the deferral of the breadcrumb signaler in i915_gem_request_submit(). 
> >Signed-off-by: Chris Wilson > >--- > > drivers/gpu/drm/i915/i915_guc_submission.c | 79 > > +++--- > > drivers/gpu/drm/i915/i915_irq.c| 4 +- > > drivers/gpu/drm/i915/intel_lrc.c | 5 +- > > 3 files changed, 76 insertions(+), 12 deletions(-) > > > >diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c > >b/drivers/gpu/drm/i915/i915_guc_submission.c > >index 913d87358972..bdc9e2bc5eb9 100644 > >--- a/drivers/gpu/drm/i915/i915_guc_submission.c > >+++ b/drivers/gpu/drm/i915/i915_guc_submission.c > >@@ -350,7 +350,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request > >*request) > > u32 freespace; > > int ret; > > > >-spin_lock(&client->wq_lock); > >+spin_lock_irq(&client->wq_lock); > > freespace = CIRC_SPACE(client->wq_tail, desc->head, client->wq_size); > > freespace -= client->wq_rsvd; > > if (likely(freespace >= wqi_size)) { > >@@ -360,7 +360,7 @@ int i915_guc_wq_reserve(struct drm_i915_gem_request > >*request) > > client->no_wq_space++; > > ret = -EAGAIN; > > } > >-spin_unlock(&client->wq_lock); > >+spin_unlock_irq(&client->wq_lock); > > > > return ret; > > } > >@@ -372,9 +372,9 @@ void i915_guc_wq_unreserve(struct drm_i915_gem_request > >*request) > > > > GEM_BUG_ON(READ_ONCE(client->wq_rsvd) < wqi_size); > > > >-spin_lock(&client->wq_lock); > >+spin_lock_irq(&client->wq_lock); > > client->wq_rsvd -= wqi_size; > >-spin_unlock(&client->wq_lock); > >+spin_unlock_irq(&client->wq_lock); > > } > > > > /* Construct a Work Item and append it to the GuC's Work Queue */ > >@@ -534,10 +534,74 @@ static void __i915_guc_submit(struct > >drm_i915_gem_request *rq) > > > > static void i915_guc_submit(struct drm_i915_gem_request *rq) > > { > >-i915_gem_request_submit(rq); > >+__i915_gem_request_submit(rq); > > __i915_guc_submit(rq); > > } > > > >+static bool i915_guc_dequeue(struct intel_engine_cs *engine) > >+{ > >+struct execlist_port *port = engine->execlist_port; > >+struct drm_i915_gem_request *last = port[0].request; > >+unsigned long flags; > >+struct rb_node *rb; > 
>+bool submit = false; > >+ > >+spin_lock_irqsave(&engine->timeline->lock, flags); > >+rb = engine->execlist_first; > >+while (rb) { > >+struct drm_i915_gem_request *cursor = > >+rb_entry(rb, typeof(*cursor), priotree.node); > >+ > >+if (last && cursor->ctx != last->ctx) { > >+if (port != engine->execlist_port) > >+break; > >+ > >+i915_gem_request_assign(&port->request, last); > >+dma_fence_enable_sw_signaling(&last->fence); > >+port++; > >+} > >+ > >+rb = rb_next(rb); > >+rb_erase(&cursor->priotree.node, &engine->execlist_queue); > >+RB_CLEAR_NODE(&cursor->priotree.node); > >+cursor->priotree.priority = INT_MAX; > >+ > >+i915_guc_submit(cursor); > >+last = cursor; > >+submit = true; > >+} > >+if (submit) { > >+i915_gem_request_assign(&port->request, last); > >+dma_fence_enable_sw_signaling(&last->fence); > >+engine->execlist_first = rb; > >+} > >+spin_unlock_irqrestore(&engine->timeline->lock, flags); > >+ > >+return submit; > >+} > > It is again tempting me to suggest a single instance of the dequeue > loop. Maybe something like: > > i915_request_dequeue(engine, assign_func, submit_func) > { > ... > > while (rb) { > ... > if () { Don't forget if (!cant_merge_func()) { /* which is where I start not lik
[Intel-gfx] [drm-intel:for-linux-next 2/4] htmldocs: drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'offset'
tree:   git://anongit.freedesktop.org/drm-intel for-linux-next
head:   c781c978e784c50dcd7cb312fe17f5281923f55b
commit: 625d988acc28f3fe1d44f3798426561c17387a59 [2/4] drm/i915: Extract reserving space in the GTT to a helper
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   make[3]: warning: jobserver unavailable: using -j1. Add '+' to parent make rule.
   include/linux/init.h:1: warning: no structured comments found
   include/linux/kthread.h:26: warning: Excess function parameter '...' description in 'kthread_create'
   kernel/sys.c:1: warning: no structured comments found
   drivers/dma-buf/seqno-fence.c:1: warning: no structured comments found
   include/drm/drm_drv.h:441: warning: No description found for parameter 'firstopen'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'open'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'preclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'postclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'lastclose'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_ioctl'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dma_quiescent'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'context_dtor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'set_busid'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_handler'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_preinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_postinstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'irq_uninstall'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_init'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'debugfs_cleanup'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_open_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_close_object'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_handle_to_fd'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'prime_fd_to_handle'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_export'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_pin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_unpin'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_res_obj'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_get_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_import_sg_table'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_vunmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_prime_mmap'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'vgaarb_irq'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'gem_vm_ops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'major'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'minor'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'patchlevel'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'name'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'desc'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'date'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'driver_features'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'dev_priv_size'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'num_ioctls'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'fops'
   include/drm/drm_drv.h:441: warning: No description found for parameter 'legacy_dev_list'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'vm'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'node'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'size'
>> drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description found for parameter 'offset'
   drivers/gpu/drm/i915/i915_gem_gtt.c:3586: warning: No description
Re: [Intel-gfx] [RFCv2 01/19] drm/i915: Provide a hook for selftests
On 20/12/2016 13:07, Chris Wilson wrote: Some pieces of code are independent of hardware but are very tricky to exercise through the normal userspace ABI or via debugfs hooks. Being able to create mock unit tests and execute them through CI is vital. Start by adding a central point where we can execute unit tests and a parameter to enable them. This is disabled by default as the expectation is that these tests will occasionally explode. To facilitate integration with igt, any parameter beginning with i915.igt__ is interpreted as a subtest executable independently via igt/drv_selftest. Two classes of selftests are recognised: mock unit tests and integration tests. Mock unit tests are run as soon as the module is loaded, before the device is probed. At that point there is no driver instantiated and all hw interactions must be "mocked". This is very useful for writing universal tests to exercise code not typically run on a broad range of architectures. Alternatively, you can hook into the late selftests and run when the device has been instantiated - hw interactions are real. v2: Add a macro for compiling conditional code for mock objects inside real objects. v3: Differentiate between mock unit tests and late integration test. v4: List the tests in natural order, use igt to sort after modparam. 
v5: s/late/live/ Signed-off-by: Chris Wilson Reviewed-by: Tvrtko Ursulin #v1 --- drivers/gpu/drm/i915/Kconfig.debug | 15 ++ drivers/gpu/drm/i915/Makefile | 3 + drivers/gpu/drm/i915/i915_pci.c| 19 +- drivers/gpu/drm/i915/i915_selftest.h | 91 + .../gpu/drm/i915/selftests/i915_live_selftests.h | 11 + .../gpu/drm/i915/selftests/i915_mock_selftests.h | 11 + drivers/gpu/drm/i915/selftests/i915_selftest.c | 222 + 7 files changed, 371 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_selftest.h create mode 100644 drivers/gpu/drm/i915/selftests/i915_live_selftests.h create mode 100644 drivers/gpu/drm/i915/selftests/i915_mock_selftests.h create mode 100644 drivers/gpu/drm/i915/selftests/i915_selftest.c diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug index 598551dbf62c..de051502e891 100644 --- a/drivers/gpu/drm/i915/Kconfig.debug +++ b/drivers/gpu/drm/i915/Kconfig.debug @@ -26,6 +26,7 @@ config DRM_I915_DEBUG select DRM_DEBUG_MM if DRM=y select DRM_DEBUG_MM_SELFTEST select DRM_I915_SW_FENCE_DEBUG_OBJECTS + select DRM_I915_SELFTEST default n help Choose this option to turn on extra driver debugging that may affect @@ -59,3 +60,17 @@ config DRM_I915_SW_FENCE_DEBUG_OBJECTS Recommended for driver developers only. If in doubt, say "N". + +config DRM_I915_SELFTEST + bool "Enable selftests upon driver load" + depends on DRM_I915 + default n + help + Choose this option to allow the driver to perform selftests upon + loading; also requires the i915.selftest=1 module parameter. To + exit the module after running the selftests (i.e. to prevent normal + module initialisation afterwards) use i915.selftest=-1. + + Recommended for driver developers only. + + If in doubt, say "N". 
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 5196509e71cf..461aeb44a9ad 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -3,6 +3,7 @@ # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher. subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror +subdir-ccflags-$(CONFIG_DRM_I915_SELFTEST) := -I$(src) -I$(src)/selftests subdir-ccflags-y += \ $(call as-instr,movntdqa (%eax)$(comma)%xmm0,-DCONFIG_AS_MOVNTDQA) @@ -114,6 +115,8 @@ i915-y += dvo_ch7017.o \ # Post-mortem debug and GPU hang state capture i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o +i915-$(CONFIG_DRM_I915_SELFTEST) += \ + selftests/i915_selftest.o # virtual gpu code i915-y += i915_vgpu.o diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 9885458b0fb8..3d416d142573 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -27,6 +27,7 @@ #include #include "i915_drv.h" +#include "i915_selftest.h" #define GEN_DEFAULT_PIPEOFFSETS \ .pipe_offsets = { PIPE_A_OFFSET, PIPE_B_OFFSET, \ @@ -477,6 +478,7 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { struct intel_device_info *intel_info = (struct intel_device_info *) ent->driver_data; + int err; if (IS_ALPHA_SUPPORT(intel_info) && !i915.alpha_support) { DRM_INFO("The driver support for your hardware in this kernel version is alpha quality\n" @@ -500,7 +502,17 @@ static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
[Intel-gfx] [PATCH] drm/i915: Detect vma reserved for execbuf in evict-for-node
The vma->exec_list is still the only means we have for both reserving an object in execbuf, and for constructing the eviction list. So during the construction of the eviction list, we must treat anything already on the exec_list as being pinned. Yes, this sharing of two semantically different lists will be fixed! But in the meantime, we have the issue that this is tripping up CI since we started using i915_gem_gtt_reserve_node() + i915_gem_evict_for_node() from the regular execbuf reservation path in commit 606fec956c0e ("drm/i915: Prefer random replacement before eviction search"): [ 108.424063] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:254! [ 108.424072] invalid opcode: [#1] PREEMPT SMP [ 108.424079] Modules linked in: snd_hda_intel i915 intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core mei_me snd_pcm lpc_ich mei sdhci_pci sdhci mmc_core e1000e ptp pps_core [last unloaded: i915] [ 108.424132] CPU: 1 PID: 6865 Comm: gem_cs_tlb Tainted: G U 4.10.0-rc3-CI-CI_DRM_2049+ #1 [ 108.424143] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013 [ 108.424154] task: 88012ae22600 task.stack: c9a14000 [ 108.424220] RIP: 0010:i915_gem_evict_for_node+0x237/0x410 [i915] [ 108.424229] RSP: 0018:c9a17a58 EFLAGS: 00010202 [ 108.424237] RAX: 5871 RBX: 88012d1ad778 RCX: [ 108.424246] RDX: 7000 RSI: c9a17a68 RDI: 880127e694d8 [ 108.424255] RBP: c9a17aa0 R08: c9a17a68 R09: [ 108.424264] R10: 0001 R11: R12: 8000 [ 108.424273] R13: c9a17a68 R14: 880127e694d8 R15: a0387330 [ 108.424283] FS: 7f8236e3d8c0() GS:880137c4() knlGS: [ 108.424293] CS: 0010 DS: ES: CR0: 80050033 [ 108.424305] CR2: 7f82347a2000 CR3: 00012c866000 CR4: 06e0 [ 108.424317] Call Trace: [ 108.424368] i915_gem_gtt_reserve+0x67/0x80 [i915] [ 108.424424] __i915_vma_do_pin+0x248/0x620 [i915] [ 108.424487] ? 
__i915_vma_do_pin+0x162/0x620 [i915] [ 108.424540] i915_gem_execbuffer_reserve_vma.isra.8+0x153/0x1f0 [i915] [ 108.424591] i915_gem_execbuffer_reserve.isra.9+0x40e/0x440 [i915] [ 108.424643] i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915] [ 108.424696] i915_gem_execbuffer2+0xc0/0x250 [i915] [ 108.424712] drm_ioctl+0x200/0x450 [ 108.424760] ? i915_gem_execbuffer+0x330/0x330 [i915] [ 108.424776] do_vfs_ioctl+0x90/0x6e0 [ 108.424789] ? up_read+0x1a/0x40 [ 108.424800] ? trace_hardirqs_on_caller+0x122/0x1b0 [ 108.424813] SyS_ioctl+0x3c/0x70 [ 108.424828] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 108.424839] RIP: 0033:0x7f8235867357 [ 108.424848] RSP: 002b:7ffdc14504c8 EFLAGS: 0246 ORIG_RAX: 0010 [ 108.424866] RAX: ffda RBX: 7ffdc1450600 RCX: 7f8235867357 [ 108.424878] RDX: 7ffdc14505a0 RSI: 40406469 RDI: 0003 [ 108.424890] RBP: R08: R09: 0022 [ 108.424903] R10: 0007 R11: 0246 R12: 0002 [ 108.424915] R13: 00419101 R14: 7ffdc1450600 R15: 7ffdc14505f0 [ 108.424928] Code: 45 b8 8b 4d c0 4c 89 f2 48 89 de ff d0 49 8b 07 4c 8b 45 b8 48 85 c0 75 dd 65 ff 0d d4 a1 c8 5f 0f 84 47 01 00 00 e9 0d fe ff ff <0f> 0b 45 31 f6 4c 8b 65 c8 49 8b 04 24 4d 39 ec 49 8d 9c 24 28 [ 108.425055] RIP: i915_gem_evict_for_node+0x237/0x410 [i915] RSP: c9a17a58 Fixes: 172ae5b4c8c1 ("drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)") Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search") Signed-off-by: Chris Wilson Cc: Joonas Lahtinen --- drivers/gpu/drm/i915/i915_gem_evict.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c index 50b4645bf627..a43e44e18042 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -305,7 +305,7 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, } /* Overlap of objects in the same batch? 
*/ - if (i915_vma_is_pinned(vma)) { + if (i915_vma_is_pinned(vma) || !list_empty(&vma->exec_list)) { ret = -ENOSPC; if (vma->exec_entry && vma->exec_entry->flags & EXEC_OBJECT_PINNED) -- 2.11.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
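For illustration only, and not part of the patch: the one-line change above makes the eviction search treat a vma that is already linked on some exec_list exactly like a pinned one. Below is a minimal userspace sketch of that check. All of `mock_list`, `mock_vma` and `mock_may_evict()` are hypothetical stand-ins, not the i915 structures; only the shape of the condition is taken from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's circular struct list_head. */
struct mock_list {
	struct mock_list *next, *prev;
};

static void mock_list_init(struct mock_list *l)
{
	l->next = l->prev = l;
}

static bool mock_list_empty(const struct mock_list *l)
{
	return l->next == l;
}

/* Link 'entry' directly after 'head'. */
static void mock_list_add(struct mock_list *entry, struct mock_list *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Hypothetical, heavily reduced model of a vma. */
struct mock_vma {
	unsigned int pin_count;     /* non-zero: explicitly pinned */
	struct mock_list exec_list; /* linked: reserved by execbuf or queued for eviction */
};

static void mock_vma_init(struct mock_vma *vma)
{
	vma->pin_count = 0;
	mock_list_init(&vma->exec_list);
}

/*
 * Mirrors the patched condition: a vma is off limits if it is pinned
 * OR if it already sits on an exec_list. Returns 0 when the vma may be
 * queued for eviction, -ENOSPC otherwise.
 */
static int mock_may_evict(const struct mock_vma *vma)
{
	assert(vma != NULL);

	if (vma->pin_count || !mock_list_empty(&vma->exec_list))
		return -ENOSPC;

	return 0;
}
```

In this model, a vma that execbuf has already reserved (linked onto a list but with pin_count still zero) is refused, which is the case the original pinned-only test missed.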
Re: [Intel-gfx] [RFCv2 14/19] drm/i915: Move uncore selfchecks to live selftest infrastructure
On 20 December 2016 at 13:08, Chris Wilson wrote:
> Now that the kselftest infrastructure exists, put it to use and add to
> it the existing consistency checks on the fw register lookup tables.
>
> v2: s/tabke/table/
>
> Signed-off-by: Chris Wilson
> Cc: Tvrtko Ursulin

Reviewed-by: Matthew Auld
Re: [Intel-gfx] [RFCv2 15/19] drm/i915: Test all fw tables during mock selftests
On 20 December 2016 at 13:08, Chris Wilson wrote:
> In addition to just testing the fw table we load, during the initial
> mock testing we can test that all tables are valid (so the testing is
> not limited to just the platforms that load that particular table).
>
> Signed-off-by: Chris Wilson
> Cc: Tvrtko Ursulin

Reviewed-by: Matthew Auld
Re: [Intel-gfx] [RFCv2 16/19] drm/i915: Sanity check all registers for matching fw domains
On 20 December 2016 at 13:08, Chris Wilson wrote:
> Add a late selftest that walks over all forcewake registers (those below
> 0x4) and checks intel_uncore_forcewake_for_reg() that the look
> exists and we having the matching powerwells.
>
> Signed-off-by: Chris Wilson
> ---
>  drivers/gpu/drm/i915/selftests/intel_uncore.c | 47 +++
>  1 file changed, 47 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> index c18fddb12d00..c9f90514500f 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c
> @@ -107,6 +107,49 @@ int intel_uncore_mock_selftests(void)
>  	return 0;
>  }
>
> +static int intel_uncore_check_forcewake_domains(struct drm_i915_private *dev_priv)
> +{
> +#define FW_RANGE 0x4
> +	unsigned long *valid;
> +	u32 offset;
> +	int err;
> +
> +	valid = kzalloc(BITS_TO_LONGS(FW_RANGE) * sizeof(*valid),
> +			GFP_TEMPORARY);
> +	if (!valid)
> +		return -ENOMEM;
> +
> +	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
> +
> +	check_for_unclaimed_mmio(dev_priv);
> +	for (offset = 0; offset < FW_RANGE; offset += 4) {
> +		i915_reg_t reg = { offset };
> +
> +		(void)I915_READ_FW(reg);
> +		if (!check_for_unclaimed_mmio(dev_priv))
> +			set_bit(offset, valid);
> +	}
> +
> +	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> +
> +	for_each_set_bit(offset, valid, FW_RANGE) {
> +		i915_reg_t reg = { offset };
> +
> +		intel_uncore_forcewake_reset(dev_priv, false);
> +		check_for_unclaimed_mmio(dev_priv);

hmm, what do we need this for ?

> +
> +		(void)I915_READ(reg);
> +		if (check_for_unclaimed_mmio(dev_priv)) {
> +			pr_err("Unclaimed mmio read to register 0x%04x\n",
> +			       offset);
> +			err = -EINVAL;
> +		}
> +	}
> +
> +	kfree(valid);
> +	return err;
> +}
> +
>  int intel_uncore_live_selftests(struct drm_i915_private *i915)
>  {
>  	int err;
> @@ -118,5 +161,9 @@ int intel_uncore_live_selftests(struct drm_i915_private *i915)
>  	if (err)
>  		return err;
>
> +	err = intel_uncore_check_forcewake_domains(i915);
> +	if (err)
> +		return err;
> +
>  	return 0;
>  }
> --
> 2.11.0
Re: [Intel-gfx] [RFCv2 01/19] drm/i915: Provide a hook for selftests
On Wed, Jan 11, 2017 at 06:17:48PM +, Tvrtko Ursulin wrote:
> On 20/12/2016 13:07, Chris Wilson wrote:
> >@@ -522,6 +534,11 @@ static struct pci_driver i915_pci_driver = {
> > static int __init i915_init(void)
> > {
> > 	bool use_kms = true;
> >+	int err;
> >+
> >+	err = i915_mock_selftests();
> >+	if (err)
> >+		return err > 0 ? 0 : err;
>
> Am I again confused by the return codes? :) Module param of -1 will
> result in i915_mock_selftests returning 1, which here translates to
> 0 so it won't abort the load like it should.

I had to give up on that for silent passing and do the remove from
userspace on success instead. Returning anything other than 0 causes
noise in dmesg. That I can live with after an error during the selftest,
since dmesg should also contain more details on the test failure.

If i915.mock_selftests=-1 then we run the tests and stop. We just leave
the module loaded even though it hasn't bound to any pci devices. :|
igt/drv_selftest and kselftests/gpu/i915.sh then unload the module.

> >+static void set_default_test_all(struct selftest *st, unsigned long count)
> >+{
> >+	unsigned long i;
> >+
> >+	for (i = 0; i < count; i++)
> >+		if (st[i].enabled)
> >+			return;
> >+
> >+	for (i = 0; i < count; i++)
> >+		st[i].enabled = true;
> >+}
>
> unsigned int should be enough for everyone! :) (i & count)

Such shortsightedness!

> >+static int run_selftests(const char *name,
> >+			 struct selftest *st,
> >+			 unsigned long count,
> >+			 void *data)
> >+{
> >+	int err = 0;
> >+
>
> If I got it right:
>
> /* Make sure both live and mock run with the same seed if ran one
> after another. */

Yes, choose the seed once, run every selected test with the same seed.

> ? just not sure what happens if user sets zero.

I wasn't sure if 0 was a valid seed, so I wasn't caring too much if the
user did i915.st_random_seed=0. They will see the pr_info() and go wtf,
and hopefully won't do that again.

> >+	while (!i915_selftest.random_seed)
> >+		i915_selftest.random_seed = get_random_int();
> >+
> >+	i915_selftest.timeout_jiffies =
> >+		i915_selftest.timeout_ms ?
> >+		msecs_to_jiffies_timeout(i915_selftest.timeout_ms) :
> >+		MAX_SCHEDULE_TIMEOUT;
> >+
> >+	set_default_test_all(st, count);
> >+
> >+	pr_info("i915: Performing %s selftests with st_random_seed=%x and st_timeout=%u\n",
> >+		name, i915_selftest.random_seed, i915_selftest.timeout_ms);
> >+
> >+	/* Tests are listed in order in i915_*_selftests.h */
> >+	for (; count--; st++) {
> >+		if (!st->enabled)
> >+			continue;
> >+
> >+		cond_resched();
> >+		if (signal_pending(current))
> >+			return -EINTR;
> >+
> >+		pr_debug("i915: Running %s\n", st->name);
> >+		if (data)
> >+			err = st->live(data);
> >+		else
> >+			err = st->mock();
> >+		if (err)
> >+			break;
> >+	}
> >+
> >+	if (WARN(err > 0 || err == -ENOTTY,
> >+		 "%s returned %d, conflicting with selftest's magic values!\n",
> >+		 st->name, err))
> >+		err = -1;
> >+
> >+	rcu_barrier();
>
> Why this?

Paranoia for the tests aborting without the barrier, as we can't rely on
module_unload providing it since we may go on to load the driver as
normal.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
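For readers following the control flow discussed above, here is a rough userspace sketch of the runner logic: pick a non-zero seed once, enable every test if none was selected, run the enabled tests in table order, stop at the first failure, and let the caller map a positive result to 0. It is only an approximation under stated assumptions. Every `mock_*` name is hypothetical, the two sample tests and the toy rng exist purely so the sketch is self-contained, and details of the quoted patch (timeouts, signals, the -ENOTTY check, rcu_barrier()) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced version of a selftest table entry. */
struct mock_selftest {
	const char *name;
	bool enabled;
	int (*run)(void);
};

/* Keep asking the rng until the seed is non-zero; a set seed is kept. */
static unsigned int mock_pick_seed(unsigned int seed, unsigned int (*rng)(void))
{
	while (!seed)
		seed = rng();
	return seed;
}

/* If no test was selected explicitly, select all of them. */
static void mock_set_default_test_all(struct mock_selftest *st, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (st[i].enabled)
			return;

	for (i = 0; i < count; i++)
		st[i].enabled = true;
}

/* Run enabled tests in table order, stopping at the first failure. */
static int mock_run_selftests(struct mock_selftest *st, size_t count)
{
	int err = 0;
	size_t i;

	assert(st != NULL || count == 0);
	mock_set_default_test_all(st, count);

	for (i = 0; i < count; i++) {
		if (!st[i].enabled)
			continue;
		err = st[i].run();
		if (err)
			break;
	}

	/* Positive values are reserved for the runner, not for tests. */
	if (err > 0)
		err = -1;
	return err;
}

/* Caller-side translation quoted from i915_init(): positive means "ok". */
static int mock_init_result(int err)
{
	return err > 0 ? 0 : err;
}

/* Sample tests and a toy rng, present only to exercise the sketch. */
static int mock_calls;
static int mock_test_pass(void) { mock_calls++; return 0; }
static int mock_test_fail(void) { mock_calls++; return -22; }
static unsigned int mock_rng_state;
static unsigned int mock_rng(void) { return mock_rng_state++; }
```

With nothing enabled up front, both entries of a two-test table run; with exactly one entry enabled, only that entry runs and its error is what the caller sees.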
Re: [Intel-gfx] [RFCv2 16/19] drm/i915: Sanity check all registers for matching fw domains
On Wed, Jan 11, 2017 at 06:25:59PM +, Matthew Auld wrote:
> On 20 December 2016 at 13:08, Chris Wilson wrote:
> > +	for_each_set_bit(offset, valid, FW_RANGE) {
> > +		i915_reg_t reg = { offset };
> > +
> > +		intel_uncore_forcewake_reset(dev_priv, false);
> > +		check_for_unclaimed_mmio(dev_priv);
> hmm, what do we need this for ?

It clears the debug register before every test - so that we know the
only thing the debug register is complaining about is the I915_READ()
sandwiched in between. The reset is there to ensure that the fw is
turned off and the timer disabled, so that we have a vanilla state every
time with the powerwell off. Hopefully.

> > +
> > +		(void)I915_READ(reg);
> > +		if (check_for_unclaimed_mmio(dev_priv)) {

-- 
Chris Wilson, Intel Open Source Technology Centre
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/edid: Improve RGB limited range handling a bit (rev2)
== Series Details ==

Series: drm/edid: Improve RGB limited range handling a bit (rev2)
URL   : https://patchwork.freedesktop.org/series/17825/
State : success

== Summary ==

Series 17825v2 drm/edid: Improve RGB limited range handling a bit
https://patchwork.freedesktop.org/api/1.0/series/17825/revisions/2/mbox/

fi-bdw-5557u     total:246  pass:232  dwarn:0   dfail:0   fail:0   skip:14
fi-bsw-n3050     total:246  pass:207  dwarn:0   dfail:0   fail:0   skip:39
fi-bxt-j4205     total:246  pass:224  dwarn:0   dfail:0   fail:0   skip:22
fi-bxt-t5700     total:82   pass:69   dwarn:0   dfail:0   fail:0   skip:12
fi-byt-j1900     total:246  pass:219  dwarn:0   dfail:0   fail:0   skip:27
fi-byt-n2820     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-hsw-4770      total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-hsw-4770r     total:246  pass:227  dwarn:0   dfail:0   fail:0   skip:19
fi-ivb-3520m     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-ivb-3770      total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-kbl-7500u     total:246  pass:225  dwarn:0   dfail:0   fail:0   skip:21
fi-skl-6260u     total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-skl-6700hq    total:246  pass:226  dwarn:0   dfail:0   fail:0   skip:20
fi-skl-6700k     total:246  pass:222  dwarn:3   dfail:0   fail:0   skip:21
fi-skl-6770hq    total:246  pass:233  dwarn:0   dfail:0   fail:0   skip:13
fi-snb-2520m     total:246  pass:215  dwarn:0   dfail:0   fail:0   skip:31
fi-snb-2600      total:246  pass:214  dwarn:0   dfail:0   fail:0   skip:32

b69fc4c941bef6d10750ce3f07daedfffc7017d1 drm-tip: 2017y-01m-11d-17h-30m-02s UTC integration manifest
20e8661 drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F
3d4bb98 drm/edid: Set AVI infoframe Q even when QS=0
0784091 drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range()
36b475a drm/edid: Introduce drm_default_rgb_quant_range()
e46f7c5 drm/edid: Have drm_edid.h include hdmi.h

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3484/