[Intel-gfx] PCH fifo underrun in 3.18
Just updated my thinkpad (x230, ivy bridge platform) to 3.18 and boot fails with the error 'PCH fifo underrun'. This is under fedora 21 with the fedora kernel version 3.18.3-201

Platform information:

[jon@localhost]~% lspci
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4)
00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation QM77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
02:00.0 System peripheral: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
03:00.0 Network controller: Intel Corporation Centrino Advanced-N 6205 [Taylor Peak] (rev 34)
-------
[jon@localhost]~% cat /proc/cpuinfo
processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 58
model name: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
stepping: 9
microcode: 0x1b
cpu MHz: 1300.585
cache size: 3072 KB
physical id: 0
siblings: 4
core id: 0
cpu cores: 2
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 13
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bugs:
bogomips: 4988.55
clflush size: 64
cache_alignment: 64
address sizes: 36 bits physical, 48 bits virtual
power management:
-------

Jon

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Fwd: PCH fifo underrun in 3.18
Update. 3.18.5 seems to have fixed the issue.

Jon

Forwarded Message
Subject: PCH fifo underrun in 3.18
Date: Thu, 29 Jan 2015 13:21:53 -1000
From: jon
To: daniel.vet...@intel.com, Intel-gfx@lists.freedesktop.org

Just updated my thinkpad (x230, ivy bridge platform) to 3.18 and boot fails with the error 'PCH fifo underrun'. This is under fedora 21 with the fedora kernel version 3.18.3-201

[snip platform information quoted in the original report above]

Jon

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] Fwd: Fwd: PCH fifo underrun in 3.18
Scratch that, still broke.

Forwarded Message
Subject: Fwd: PCH fifo underrun in 3.18
Date: Fri, 06 Feb 2015 15:11:34 -1000
From: jon
To: Intel-gfx@lists.freedesktop.org

Update. 3.18.5 seems to have fixed the issue.

Jon

Forwarded Message
Subject: PCH fifo underrun in 3.18
Date: Thu, 29 Jan 2015 13:21:53 -1000
From: jon
To: daniel.vet...@intel.com, Intel-gfx@lists.freedesktop.org

Just updated my thinkpad (x230, ivy bridge platform) to 3.18 and boot fails with the error 'PCH fifo underrun'. This is under fedora 21 with the fedora kernel version 3.18.3-201

[snip platform information quoted in the original report above]

Jon

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] PCH fifo underrun in 3.18
Under what product should I submit the bug? drm.debug=14 doesn't show much.

[jon@localhost]~% dmesg | grep drm
[1.936241] [drm] Initialized drm 1.1.0 20060810
[2.193043] [drm] Memory usable by graphics device = 2048M
[2.193046] [drm] Replacing VGA console driver
[2.18] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[2.20] [drm] Driver supports precise vblank timestamp query.
[2.287949] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to bit banging on pin 5
[2.297614] fbcon: inteldrmfb (fb0) is primary device
[2.576664] [drm:cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
[2.57] [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun

Jon

On 02/08/2015 11:18 PM, Jani Nikula wrote:
> On Fri, 30 Jan 2015, jon wrote:
>> Just updated my thinkpad (x230, ivy bridge platform) to 3.18 and boot fails with the error 'PCH fifo underrun'. This is under fedora 21 with the fedora kernel version 3.18.3-201
>
> Please file a bug on [1], attach a dmesg from boot to the problem with drm.debug=14 module parameter set. Reference this thread.
>
> Thanks,
> Jani.
>
> [1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel

--
Jon

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 03/11] drm/i915: Allow the sysadmin to override security mitigations
> -Original Message- > From: Chris Wilson > Sent: Sunday, January 10, 2021 7:04 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Bloomfield, Jon > ; Vivi, Rodrigo ; > sta...@vger.kernel.org > Subject: [PATCH 03/11] drm/i915: Allow the sysadmin to override security > mitigations > > The clear-residuals mitigation is a relatively heavy hammer and under some > circumstances the user may wish to forgo the context isolation in order > to meet some performance requirement. Introduce a generic module > parameter to allow selectively enabling/disabling different mitigations. > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1858 > Fixes: 47f8253d2b89 ("drm/i915/gen7: Clear all EU/L3 residual contexts") > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Jon Bloomfield > Cc: Rodrigo Vivi > Cc: sta...@vger.kernel.org # v5.7 > --- Reviewed-by: Jon Bloomfield ? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [RFC PATCH 00/97] Basic GuC submission support in the i915
> -Original Message- > From: Martin Peres > Sent: Tuesday, May 11, 2021 1:06 AM > To: Daniel Vetter > Cc: Jason Ekstrand ; Brost, Matthew > ; intel-gfx ; > dri-devel ; Ursulin, Tvrtko > ; Ekstrand, Jason ; > Ceraolo Spurio, Daniele ; Bloomfield, Jon > ; Vetter, Daniel ; > Harrison, John C > Subject: Re: [RFC PATCH 00/97] Basic GuC submission support in the i915 > > On 10/05/2021 19:33, Daniel Vetter wrote: > > On Mon, May 10, 2021 at 3:55 PM Martin Peres > wrote: > >> > >> On 10/05/2021 02:11, Jason Ekstrand wrote: > >>> On May 9, 2021 12:12:36 Martin Peres wrote: > >>> > >>>> Hi, > >>>> > >>>> On 06/05/2021 22:13, Matthew Brost wrote: > >>>>> Basic GuC submission support. This is the first bullet point in the > >>>>> upstreaming plan covered in the following RFC [1]. > >>>>> > >>>>> At a very high level the GuC is a piece of firmware which sits between > >>>>> the i915 and the GPU. It offloads some of the scheduling of contexts > >>>>> from the i915 and programs the GPU to submit contexts. The i915 > >>>>> communicates with the GuC and the GuC communicates with the > GPU. > >>>> > >>>> May I ask what will GuC command submission do that execlist > won't/can't > >>>> do? And what would be the impact on users? Even forgetting the > troubled > >>>> history of GuC (instability, performance regression, poor level of user > >>>> support, 6+ years of trying to upstream it...), adding this much code > >>>> and doubling the amount of validation needed should come with a > >>>> rationale making it feel worth it... and I am not seeing here. Would you > >>>> mind providing the rationale behind this work? > >>>> > >>>>> > >>>>> GuC submission will be disabled by default on all current upstream > >>>>> platforms behind a module parameter - enable_guc. A value of 3 will > >>>>> enable submission and HuC loading via the GuC. GuC submission > should > >>>>> work on all gen11+ platforms assuming the GuC firmware is present. > >>>> > >>>> What is the plan here when it comes to keeping support for execlist? I > >>>> am afraid that landing GuC support in Linux is the first step towards > >>>> killing the execlist, which would force users to use proprietary > >>>> firmwares that even most Intel engineers have little influence over. > >>>> Indeed, if "drm/i915/guc: Disable semaphores when using GuC > scheduling" > >>>> which states "Disable semaphores when using GuC scheduling as > semaphores > >>>> are broken in the current GuC firmware." is anything to go by, it means > >>>> that even Intel developers seem to prefer working around the GuC > >>>> firmware, rather than fixing it. > >>> > >>> Yes, landing GuC support may be the first step in removing execlist > >>> support. The inevitable reality is that GPU scheduling is coming and > >>> likely to be there only path in the not-too-distant future. (See also > >>> the ongoing thread with AMD about fences.) I'm not going to pass > >>> judgement on whether or not this is a good thing. I'm just reading the > >>> winds and, in my view, this is where things are headed for good or ill. > >>> > >>> In answer to the question above, the answer to "what do we gain from > >>> GuC?" may soon be, "you get to use your GPU." We're not there yet > and, > >>> again, I'm not necessarily advocating for it, but that is likely where > >>> things are headed. > >> > >> This will be a sad day, especially since it seems fundamentally opposed > >> with any long-term support, on top of taking away user freedom to > >> fix/tweak their system when Intel won't. 
> >> > >>> A firmware-based submission model isn't a bad design IMO and, aside > from > >>> the firmware freedom issues, I think there are actual advantages to the > >>> model. Immediately, it'll unlock a few features like parallel submission > >>> (more on that in a bit) and long-running compute because they're > >>> implemented in GuC and the work to implement them properly in the > >>> execlist scheduler is highly non-trivial. Longer t
Re: [Intel-gfx] [PATCH 07/11] drm/i915: Add i915_gem_context_is_full_ppgtt
> -Original Message- > From: Tvrtko Ursulin > Sent: Thursday, September 2, 2021 9:42 AM > To: Daniel Vetter > Cc: Daniel Vetter ; DRI Development de...@lists.freedesktop.org>; Intel Graphics Development g...@lists.freedesktop.org>; Maarten Lankhorst > ; Vetter, Daniel > ; Bloomfield, Jon ; > Chris Wilson ; Joonas Lahtinen > ; Thomas Hellström > ; Auld, Matthew > ; Landwerlin, Lionel G > ; Dave Airlie ; Jason > Ekstrand > Subject: Re: [Intel-gfx] [PATCH 07/11] drm/i915: Add > i915_gem_context_is_full_ppgtt > > > On 02/09/2021 16:22, Daniel Vetter wrote: > > On Thu, Sep 02, 2021 at 03:54:36PM +0100, Tvrtko Ursulin wrote: > >> On 02/09/2021 15:20, Daniel Vetter wrote: > >>> And use it anywhere we have open-coded checks for ctx->vm that really > >>> only check for full ppgtt. > >>> > >>> Plus for paranoia add a GEM_BUG_ON that checks it's really only set > >>> when we have full ppgtt, just in case. gem_context->vm is different > >>> since it's NULL in ggtt mode, unlike intel_context->vm or gt->vm, > >>> which is always set. > >>> > >>> v2: 0day found a testcase that I missed. > >>> > >>> Reviewed-by: Maarten Lankhorst > >>> Signed-off-by: Daniel Vetter > >>> Cc: Jon Bloomfield > >>> Cc: Chris Wilson > >>> Cc: Maarten Lankhorst > >>> Cc: Joonas Lahtinen > >>> Cc: Daniel Vetter > >>> Cc: "Thomas Hellström" > >>> Cc: Matthew Auld > >>> Cc: Lionel Landwerlin > >>> Cc: Dave Airlie > >>> Cc: Jason Ekstrand > >>> --- > >>>drivers/gpu/drm/i915/gem/i915_gem_context.c | 2 +- > >>>drivers/gpu/drm/i915/gem/i915_gem_context.h | 7 +++ > >>>drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c| 2 +- > >>>drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +++--- > >>>4 files changed, 12 insertions(+), 5 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > >>> index 7a566fb7cca4..1eec85944c1f 100644 > >>> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > >>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > >>> @@ -1566,7 +1566,7 @@ static int get_ppgtt(struct > drm_i915_file_private *file_priv, > >>> int err; > >>> u32 id; > >>> - if (!rcu_access_pointer(ctx->vm)) > >>> + if (!i915_gem_context_is_full_ppgtt(ctx)) > >> > >> It reads a bit wrong because GEM context cannot *be* full ppggt. It can > be > >> associated with a VM which is or isn't full ppgtt. So a test on a VM > >> retrieved from a context is semnntically more correct. Perhaps you want > to > >> consider adding a helper to that effect instead? It could mean splitting > >> into two helpers (getter + test) or maybe just renaming would work. Like > >> i915_gem_context_has_full_ppgtt_vm(ctx)? > > > > The pointer isn't set when the driver/context isn't running in full ppgtt > > mode. This is why I've added the GEM_BUG_ON to check we're not > breaking > > any invariants. So yeah it is a full ppgtt context or it's not, that is > > indeed the question here. > > > > I'm happy to bikeshed the naming, but I don't see how your suggestion is > > an improvement. > > I think the pointer being set or not is implementation detail, for > instance we could have it always set just like it is in intel_context. > > I simply think GEM context *isn't* full ppgtt, but the VM is. And since > GEM context *points* to a VM, *has* is the right verb in my mind. You > did not write why do you not see has as more correct than is so I don't > want to be guessing too much. FWIW, I agree with Tvrtko. i915_gem_context_is_full_ppgtt is incorrect grammar. It IS a bike shed, but, hey it'll live for a while.
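As a reference point for the naming discussion, here is roughly what the helper in the patch looks like, reconstructed from the hunks quoted above (not necessarily the final merged form); ctx->vm, GEM_BUG_ON() and HAS_FULL_PPGTT() are existing i915 definitions, and the bikeshed would only change the name, e.g. to i915_gem_context_has_full_ppgtt_vm():

/*
 * Sketch of the proposed helper, pieced together from the quoted hunks.
 */
static inline bool
i915_gem_context_is_full_ppgtt(struct i915_gem_context *ctx)
{
	bool full;

	/* The GEM context's vm pointer is only set in full ppgtt mode. */
	full = rcu_access_pointer(ctx->vm);

	/* Paranoia: the pointer must never be set without full ppgtt. */
	GEM_BUG_ON(full && !HAS_FULL_PPGTT(ctx->i915));

	return full;
}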
Re: [Intel-gfx] [PATCH 1/9] drm/doc/rfc: i915 DG1 uAPI
> -Original Message- > From: Auld, Matthew > Sent: Monday, April 26, 2021 2:39 AM > To: intel-gfx@lists.freedesktop.org > Cc: Joonas Lahtinen ; Thomas Hellström > ; Ceraolo Spurio, Daniele > ; Lionel Landwerlin > ; Bloomfield, Jon > ; Justen, Jordan L ; > Vetter, Daniel ; Kenneth Graunke > ; Jason Ekstrand ; Dave > Airlie ; dri-de...@lists.freedesktop.org; mesa- > d...@lists.freedesktop.org; Daniel Vetter ; Dave > Airlie > Subject: [PATCH 1/9] drm/doc/rfc: i915 DG1 uAPI > > Add an entry for the new uAPI needed for DG1. Also add the overall > upstream plan, including some notes for the TTM conversion. > > v2(Daniel): > - include the overall upstreaming plan > - add a note for mmap, there are differences here for TTM vs i915 > - bunch of other suggestions from Daniel > v3: > (Daniel) > - add a note for set/get caching stuff > - add some more docs for existing query and extensions stuff > - add an actual code example for regions query > - bunch of other stuff > (Jason) > - uAPI change(!): > - try a simpler design with the placements extension > - rather than have a generic setparam which can cover multiple > use cases, have each extension be responsible for one thing > only > v4: > (Daniel) > - add some more notes for ttm conversion > - bunch of other stuff > (Jason) > - uAPI change(!): > - drop all the extra rsvd members for the region_query and > region_info, just keep the bare minimum needed for padding > > Signed-off-by: Matthew Auld > Cc: Joonas Lahtinen > Cc: Thomas Hellström > Cc: Daniele Ceraolo Spurio > Cc: Lionel Landwerlin > Cc: Jon Bloomfield > Cc: Jordan Justen > Cc: Daniel Vetter > Cc: Kenneth Graunke > Cc: Jason Ekstrand > Cc: Dave Airlie > Cc: dri-de...@lists.freedesktop.org > Cc: mesa-...@lists.freedesktop.org > Acked-by: Daniel Vetter > Acked-by: Dave Airlie > --- Acked-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
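For readers without the series in front of them, a minimal userspace sketch of the memory-region query this uAPI adds, using the struct and ioctl names from the proposal as they appear in i915_drm.h; the query_regions() wrapper itself is only an illustration:

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Two-call pattern: the first ioctl reports the required length, the
 * second fills in the region list (system + device-local memory). */
static struct drm_i915_query_memory_regions *query_regions(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_MEMORY_REGIONS,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_memory_regions *info;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return NULL;

	info = calloc(1, item.length);
	item.data_ptr = (uintptr_t)info;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query)) {
		free(info);
		return NULL;
	}

	/* info->regions[i].region.{memory_class,memory_instance} and
	 * probed_size are what the placements extension consumes. */
	return info;
}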
Re: [Intel-gfx] [PATCH] drm/i915: Prevent bonded requests from overtaking each other on preemption
> -Original Message- > From: Intel-gfx On Behalf Of Tvrtko > Ursulin > Sent: Friday, September 20, 2019 8:12 AM > To: Chris Wilson ; intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Prevent bonded requests from > overtaking each other on preemption > > > On 20/09/2019 15:57, Chris Wilson wrote: > > Quoting Chris Wilson (2019-09-20 09:36:24) > >> Force bonded requests to run on distinct engines so that they cannot be > >> shuffled onto the same engine where timeslicing will reverse the order. > >> A bonded request will often wait on a semaphore signaled by its master, > >> creating an implicit dependency -- if we ignore that implicit dependency > >> and allow the bonded request to run on the same engine and before its > >> master, we will cause a GPU hang. > > > > Thinking more, it should not directly cause a GPU hang, as the stuck request > > should be timesliced away, and each preemption should be enough to keep > > hangcheck at bay (though we have evidence it may not). So at best it runs > > at half-speed, at worst a third (if my model is correct). > > But I think it is still correct to do since we don't have the coupling > information on re-submit. Hm.. but don't we need to prevent slave from > changing engines as well? Unless I'm missing something, the proposal here is to set the engines in stone at first submission, and never change them? If so, that does sound overly restrictive, and will prevent any kind of rebalancing as workloads (of varying slave counts) come and go. During the original design it was called out that the workloads should be pre-empted atomically. That allows the entire bonding mask to be re-evaluated at every context switch and so we can then rebalance. Still not easy to achieve I agree :-( > > Regards, > > Tvrtko > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Prevent bonded requests from overtaking each other on preemption
> -Original Message- > From: Chris Wilson > Sent: Friday, September 20, 2019 9:04 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org; Tvrtko Ursulin > Subject: RE: [Intel-gfx] [PATCH] drm/i915: Prevent bonded requests from > overtaking each other on preemption > > Quoting Bloomfield, Jon (2019-09-20 16:50:57) > > > -Original Message- > > > From: Intel-gfx On Behalf Of > Tvrtko > > > Ursulin > > > Sent: Friday, September 20, 2019 8:12 AM > > > To: Chris Wilson ; > > > intel-gfx@lists.freedesktop.org > > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Prevent bonded requests from > > > overtaking each other on preemption > > > > > > > > > On 20/09/2019 15:57, Chris Wilson wrote: > > > > Quoting Chris Wilson (2019-09-20 09:36:24) > > > >> Force bonded requests to run on distinct engines so that they cannot be > > > >> shuffled onto the same engine where timeslicing will reverse the order. > > > >> A bonded request will often wait on a semaphore signaled by its master, > > > >> creating an implicit dependency -- if we ignore that implicit > > > >> dependency > > > >> and allow the bonded request to run on the same engine and before its > > > >> master, we will cause a GPU hang. > > > > > > > > Thinking more, it should not directly cause a GPU hang, as the stuck > request > > > > should be timesliced away, and each preemption should be enough to > keep > > > > hangcheck at bay (though we have evidence it may not). So at best it > > > > runs > > > > at half-speed, at worst a third (if my model is correct). > > > > > > But I think it is still correct to do since we don't have the coupling > > > information on re-submit. Hm.. but don't we need to prevent slave from > > > changing engines as well? > > > > Unless I'm missing something, the proposal here is to set the engines in > > stone > at first submission, and never change them? > > For submission here, think execution (submission to actual HW). (We have > 2 separate phases that all like to be called submit()!) > > > If so, that does sound overly restrictive, and will prevent any kind of > rebalancing as workloads (of varying slave counts) come and go. > > We are only restricting this request, not the contexts. We still have > balancing overall, just not instantaneous balancing if we timeslice out > of this request -- we put it back onto the "same" engine and not another. > Which is in some ways is less than ideal, although strictly we are only > saying don't put it back onto an engine we have earmarked for our bonded > request, and so we avoid contending with our parallel request reducing > that to serial (and often bad) behaviour. > > [So at the end of this statement, I'm more happy with the restriction ;] > > > During the original design it was called out that the workloads should be > > pre- > empted atomically. That allows the entire bonding mask to be re-evaluated at > every context switch and so we can then rebalance. Still not easy to achieve I > agree :-( > > The problem with that statement is that atomic implies a global > scheduling decision. Blood, sweat and tears. Agreed - It isn't fun. Perhaps it doesn't matter anyway. Once GuC is offloading the scheduling it should be able to do a little more wrt rebalancing. Let's make it a GuC headache instead. > > Of course, with your endless scheme, scheduling is all in the purview of > the user :) Hey, don't tarnish me with that brush. I don't like it either. Actually, it's your scheme technically. 
I just asked for a way to enable HPC workloads, and you enthusiastically offered heartbeats & non-persistence. So shall history be written :-)

> -Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] drm connectors, tegra, and the web they weave (was Re: [PATCH 58/59] drm/todo: Add new debugfs todo)
On 18/06/2019 16:19, Greg Kroah-Hartman wrote: > On Fri, Jun 14, 2019 at 10:36:14PM +0200, Daniel Vetter wrote: >> Greg is busy already, but maybe he won't do everything ... >> >> Cc: Greg Kroah-Hartman >> Signed-off-by: Daniel Vetter >> --- >> Documentation/gpu/todo.rst | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst >> index 9717540ee28f..026e55c517e1 100644 >> --- a/Documentation/gpu/todo.rst >> +++ b/Documentation/gpu/todo.rst >> @@ -375,6 +375,9 @@ There's a bunch of issues with it: >>this (together with the drm_minor->drm_device move) would allow us to >> remove >>debugfs_init. >> >> +- Drop the return code and error checking from all debugfs functions. Greg >> KH is >> + working on this already. > > > Part of this work was to try to delete drm_debugfs_remove_files(). > > There are only 4 files that currently still call this function: > drivers/gpu/drm/tegra/dc.c > drivers/gpu/drm/tegra/dsi.c > drivers/gpu/drm/tegra/hdmi.c > drivers/gpu/drm/tegra/sor.c > > For dc.c, the driver wants to add debugfs files to the struct drm_crtc > debugfs directory. Which is fine, but it has to do some special memory > allocation to get the debugfs callback to point not to the struct > drm_minor pointer, but rather the drm_crtc structure. > > So, to remove this call, I need to remove this special memory allocation > and to do that, I need to somehow be able to cast from drm_minor back to > the drm_crtc structure being used in this driver. And I can't figure > how they are related at all. > > Any pointers here (pun intended) would be appreciated. > > For the other 3 files, the situation is much the same, but I need to get > from a 'struct drm_minor' pointer to a 'struct drm_connector' pointer. > > I could just "open code" a bunch of calls to debugfs_create_file() for > these drivers, which would solve this issue, but in a more "non-drm" > way. Is it worth to just do that instead of overthinking the whole > thing and trying to squish it into the drm "model" of drm debugfs calls? > > Either way, who can test these changes? I can't even build the tegra > driver without digging up an arm64 cross-compiler, and can't test it as > I have no hardware at all. We can definitely compile and boot test these no problem. In fact anything that lands in -next we will boot test. However, I can do some quick sanity if you have something to test. Thierry may have more specific Tegra DRM tests. Cheers Jon -- nvpublic ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/ehl: Add missing VECS engine
> -Original Message- > From: Intel-gfx On Behalf Of Tvrtko > Ursulin > Sent: Tuesday, June 25, 2019 10:22 PM > To: Ceraolo Spurio, Daniele ; Roper, > Matthew D ; Souza, Jose > Cc: intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915/ehl: Add missing VECS engine > > > On 25/06/2019 22:48, Daniele Ceraolo Spurio wrote: > > On 6/25/19 8:26 AM, Matt Roper wrote: > >> On Fri, Jun 14, 2019 at 03:17:39PM -0700, Matt Roper wrote: > >>> On Fri, Jun 14, 2019 at 02:37:49PM -0700, José Roberto de Souza wrote: > EHL can have up to one VECS(video enhancement) engine, so add it to > the device_info. > >>> > >>> Bspec 29150 has a footnote on VEbox that indicates "Pass-through only, > >>> no VEbox processing logic." That note seems a bit vague, but I think I > >>> saw some more detailed info in the past somewhere that indicated the > >>> VECS command streamer is still technically present but doesn't actually > >>> do any video enhancement on EHL; it just passes content through to SFC. > >>> > >>> I'm not terribly plugged into the media side of the world, so I'm not > >>> sure if we want to expose VECS to userspace if it's basically a noop and > >>> doesn't do what it normally does on other platforms. Bspec page 5229 > >>> implies that SFC can be fed directly by the decode engine without going > >>> through VEBOX, so I'm not sure if media userspace would ever have a use > >>> for the passthrough-only VECS streamer. > >>> > >>> We should probably ask someone on the media team what their thoughts > are > >>> on this. > >> > >> Since the media team confirmed that there is indeed a use case for a > >> passthrough-only VECS, > >> > >> Reviewed-by: Matt Roper > >> > > > > A bit late for a question, but how does userspace know that this is just > > a pass-through VECS? Are we expecting them to switch based on platform > > instead of just using the kernel API? IMO it'd be better to hide the > > engine in the query ioctl by default and only show it if userspace > > passes an appropriate flag, otherwise legacy apps could try to submit > > VECS-specific commands to the engine. > > I have a patch which would enable this, guess it's time to send it.. > > If we go this route (hide the engine by default), this patch would need > to add a new capability flag. But what to call it? > I915_VIDEO_ENHANCE_CLASS_PASSTHROUGH? I don't get the legacy userspace problem. It's not realistic to expect userspace to 'just work' on new platforms. Instructions are added and removed at each new gen, and umd needs to know that. The capabilities were not really designed to insulate the umd's from making device specific decisions. If umd can easily deduce a capability from the device-id, it should be doing so. The capabilities will become unwieldy very quickly if we start poking in new ones for every change in hardware shape. Caps should be limited to things that cannot be easily deduced, or that can be different on different SKU's of the same device (e.g. fusing). The SFC cap is needed because we don't expose explicit engine numbers to userspace, so it has no easy way to determine which VDBox's have SFC and which do not. Here, we're talking about an easily deducible feature. > > Regards, > > Tvrtko > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
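For background on what userspace can already discover: engine class/instance and per-engine capability bits are reported through DRM_I915_QUERY_ENGINE_INFO, so a pass-through-VECS capability flag, if one were added, would surface the same way. A rough sketch (the dump_engines() helper is illustrative only, and the pass-through capability discussed above does not exist at the time of writing):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Print each engine's class:instance and capability bits. */
static void dump_engines(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_ENGINE_INFO,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	struct drm_i915_query_engine_info *info;
	unsigned int i;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return;

	info = calloc(1, item.length);
	item.data_ptr = (uintptr_t)info;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) == 0) {
		for (i = 0; i < info->num_engines; i++) {
			const struct drm_i915_engine_info *e = &info->engines[i];

			/* e.g. I915_ENGINE_CLASS_VIDEO_ENHANCE for VECS;
			 * capabilities is where an SFC-style flag shows up. */
			printf("class %u instance %u caps 0x%llx\n",
			       e->engine.engine_class,
			       e->engine.engine_instance,
			       (unsigned long long)e->capabilities);
		}
	}

	free(info);
}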
Re: [Intel-gfx] [PATCH] drm/i915/guc: Request RP0 before loading firmware
On 12/20/2021 3:52 PM, Sundaresan, Sujaritha wrote: On 12/16/2021 3:30 PM, Vinay Belgaumkar wrote: By default, GT (and GuC) run at RPn. Requesting for RP0 before firmware load can speed up DMA and HuC auth as well. In addition to writing to 0xA008, we also need to enable swreq in 0xA024 so that Punit will pay heed to our request. SLPC will restore the frequency back to RPn after initialization, but we need to manually do that for the non-SLPC path. We don't need a manual override in the SLPC disabled case, just use the intel_rps_set function to ensure consistent RPS state. Signed-off-by: Vinay Belgaumkar --- drivers/gpu/drm/i915/gt/intel_rps.c | 59 +++ drivers/gpu/drm/i915/gt/intel_rps.h | 2 + drivers/gpu/drm/i915/gt/uc/intel_uc.c | 9 drivers/gpu/drm/i915/i915_reg.h | 4 ++ 4 files changed, 74 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 07ff7ba7b2b7..d576b34c7d6f 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -2226,6 +2226,65 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps) return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); } +static void intel_rps_set_manual(struct intel_rps *rps, bool enable) +{ + struct intel_uncore *uncore = rps_to_uncore(rps); + u32 state = enable ? GEN9_RPSWCTL_ENABLE : GEN9_RPSWCTL_DISABLE; + + /* Allow punit to process software requests */ + intel_uncore_write(uncore, GEN6_RP_CONTROL, state); +} Was there a specific reason to remove the set/clear timer functions ? Replying on behalf of Vinay Belguamkar: We are now using the intel_rps_set() function which handles more state update in the correct rps path. We also obtain an rps lock which guarantees not clobbering rps data. The set/clear timers were being done when we were modifying the frequency outside of the rps paths. rps_set_manual is now only called in the SLPC path where the rps timers are not even running. + +void intel_rps_raise_unslice(struct intel_rps *rps) +{ + struct intel_uncore *uncore = rps_to_uncore(rps); + u32 rp0_unslice_req; + + mutex_lock(&rps->lock); + + if (rps_uses_slpc(rps)) { + /* RP limits have not been initialized yet for SLPC path */ + rp0_unslice_req = ((intel_rps_read_state_cap(rps) >> 0) + & 0xff) * GEN9_FREQ_SCALER; + + intel_rps_set_manual(rps, true); + intel_uncore_write(uncore, GEN6_RPNSWREQ, + ((rp0_unslice_req << + GEN9_SW_REQ_UNSLICE_RATIO_SHIFT) | + GEN9_IGNORE_SLICE_RATIO)); + intel_rps_set_manual(rps, false); + } else { + intel_rps_set(rps, rps->rp0_freq); + } + + mutex_unlock(&rps->lock); +} + +void intel_rps_lower_unslice(struct intel_rps *rps) +{ + struct intel_uncore *uncore = rps_to_uncore(rps); + u32 rpn_unslice_req; + + mutex_lock(&rps->lock); + + if (rps_uses_slpc(rps)) { + /* RP limits have not been initialized yet for SLPC path */ + rpn_unslice_req = ((intel_rps_read_state_cap(rps) >> 16) + & 0xff) * GEN9_FREQ_SCALER; + + intel_rps_set_manual(rps, true); + intel_uncore_write(uncore, GEN6_RPNSWREQ, + ((rpn_unslice_req << + GEN9_SW_REQ_UNSLICE_RATIO_SHIFT) | + GEN9_IGNORE_SLICE_RATIO)); + intel_rps_set_manual(rps, false); + } else { + intel_rps_set(rps, rps->min_freq); + } + + mutex_unlock(&rps->lock); +} + Small function name nitpick maybe unslice_freq ? Just a suggestion. 
/* External interface for intel_ips.ko */ static struct drm_i915_private __rcu *ips_mchdev; diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h index aee12f37d38a..c6d76a3d1331 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.h +++ b/drivers/gpu/drm/i915/gt/intel_rps.h @@ -45,6 +45,8 @@ u32 intel_rps_get_rpn_frequency(struct intel_rps *rps); u32 intel_rps_read_punit_req(struct intel_rps *rps); u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps); u32 intel_rps_read_state_cap(struct intel_rps *rps); +void intel_rps_raise_unslice(struct intel_rps *rps); +void intel_rps_lower_unslice(struct intel_rps *rps); void gen5_rps_irq_handler(struct intel_rps *rps); void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 2fef3b0bbe95..3693c4e7dad0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -8,6 +8,7 @@ #include "intel_guc.h" #include "intel_guc_ads.h" #include "intel_guc_submission.h" +#include "gt/intel_rps.h" #include "intel_uc.h" #include "i915_drv.h" @@ -462,6 +463,8 @@ static int __uc_init_hw(struct intel_uc *uc) else attempts = 1; + intel_rps_raise_unslice(&uc_to_gt(uc)->rps); +
Re: [Intel-gfx] Small bar recovery vs compressed content on DG2
+@Vetter, Daniel Let's not start re-inventing this on the fly again. That's how we got into trouble in the past. The SAS/Whitepaper does currently require the SMEM+LMEM placement for mappable, for good reasons. We cannot 'always migrate to mappable in the fault handler'. Or at least, this is not as trivial as it is to write in a sentence due to the need to spill out other active objects, and all the usual challenges with context synchronization etc. It is possible, perhaps with a lot of care, but it is challenging to guarantee, easy to break, and not needed for 99.9% of software. We are trying to simplify our driver stack. If we need a special mechanism for debug, we should devise a special mechanism, not throw out the general LMEM+SMEM requirement. Are there any identified first-class clients that require such access, or is it only debugging tools? If only debug, then why can't the tool use a copy engine submission to access the data in place? Or perhaps a bespoke ioctl to access this via the KMD (and kmd submitted copy-engine BB)? Thanks, Jon > -Original Message- > From: Thomas Hellström > Sent: Thursday, March 17, 2022 2:35 AM > To: Joonas Lahtinen ; Bloomfield, Jon > ; Intel Graphics Development g...@lists.freedesktop.org>; Auld, Matthew ; C, > Ramalingam > Subject: Re: Small bar recovery vs compressed content on DG2 > > On Thu, 2022-03-17 at 10:43 +0200, Joonas Lahtinen wrote: > > Quoting Thomas Hellström (2022-03-16 09:25:16) > > > Hi! > > > > > > Do we somehow need to clarify in the headers the semantics for > > > this? > > > > > > From my understanding when discussing the CCS migration series > > > with > > > Ram, the kernel will never do any resolving (compressing / > > > decompressing) migrations or evictions which basically implies the > > > following: > > > > > > *) Compressed data must have LMEM only placement, otherwise the > GPU > > > would read garbage if accessing from SMEM. > > > > This has always been the case, so it should be documented in the uAPI > > headers and kerneldocs. > > > > > *) Compressed data can't be assumed to be mappable by the CPU, > > > because > > > in order to ensure that on small BAR, the placement needs to be > > > LMEM+SMEM. > > > > Not strictly true, as we could always migrate to the mappable region > > in > > the CPU fault handler. Will need the same set of tricks as with > > limited > > mappable GGTT in past. > > In addition to Matt's reply: > > Yes, if there is sufficient space. I'm not sure we want to complicate > this to migrate only part of the buffer to mappable on a fault basis? > Otherwise this is likely to fail. > > One option is to allow cpu-mapping from SYSTEM like TTM is doing for > evicted buffers, even if SYSTEM is not in the placement list, and then > migrate back to LMEM for gpu access. > > But can user-space even interpret the compressed data when CPU- > mapping? > without access to the CCS metadata? > > > > > > *) Neither can compressed data be part of a CAPTURE buffer, because > > > that > > > requires the data to be CPU-mappable. > > > > Especially this will be too big of a limitation which we can't really > > afford > > when it comes to debugging. > > Same here WRT user-space interpretation. > > This will become especially tricky on small BAR, because either we need > to fit all compressed buffers in the mappable portion, or be able to > blit the contents of the capture buffers from within the fence > signalling critical section, which will require a lot of work I guess. 
> > /Thomas > > > > > > Regards, Joonas > > > > > Are we (and user-mode drivers) OK with these restrictions, or do we > > > need > > > to rethink? > > > > > > Thanks, > > > > > > Thomas > > > > > > >
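To make the placement discussion concrete, this is approximately what the SMEM+LMEM versus LMEM-only choice looks like at object-creation time with the placements extension; the struct and flag names are as in i915_drm.h, while the create_mappable_bo() wrapper is made up for illustration and says nothing about the final small-BAR policy:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Create a buffer the kernel may place in either device-local or system
 * memory -- the combination required for CPU-mappable objects in the
 * small-BAR discussion above. A compressed, LMEM-only buffer would
 * simply drop the SYSTEM entry. */
static uint32_t create_mappable_bo(int fd, uint64_t size)
{
	struct drm_i915_gem_memory_class_instance placements[] = {
		{ .memory_class = I915_MEMORY_CLASS_DEVICE, .memory_instance = 0 },
		{ .memory_class = I915_MEMORY_CLASS_SYSTEM, .memory_instance = 0 },
	};
	struct drm_i915_gem_create_ext_memory_regions regions = {
		.base.name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
		.num_regions = 2,
		.regions = (uintptr_t)placements,
	};
	struct drm_i915_gem_create_ext create = {
		.size = size,
		.extensions = (uintptr_t)&regions,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
		return 0;

	return create.handle;
}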
Re: [Intel-gfx] Small bar recovery vs compressed content on DG2
@Thomas Hellström - I agree :-) My question was really to @Joonas Lahtinen, who was saying we could always migrate in the CPU fault handler. I am pushing back on that unless we have no choice. It's the very complication we were trying to avoid with the current SAS. If that's what's needed, then so be it. But I'm asking whether we can instead handle this specially, instead of adding generic complexity to the primary code paths. Jon > -Original Message- > From: Thomas Hellström > Sent: Friday, March 18, 2022 2:48 AM > To: Bloomfield, Jon ; Joonas Lahtinen > ; Intel Graphics Development g...@lists.freedesktop.org>; Auld, Matthew ; C, > Ramalingam ; Vetter, Daniel > > Subject: Re: Small bar recovery vs compressed content on DG2 > > Hi, > > On Thu, 2022-03-17 at 18:21 +, Bloomfield, Jon wrote: > > +@Vetter, Daniel > > > > Let's not start re-inventing this on the fly again. That's how we got > > into trouble in the past. The SAS/Whitepaper does currently require > > the SMEM+LMEM placement for mappable, for good reasons. > > Just to avoid any misunderstandings here: > > We have two hard requirements from Arch that clash, main problem is > compressed bos can't be captured on error with current designs. > > From an engineering point of view we can do little more than list > options available to resolve this and whether they are hard or not so > hard to implemement. But IMHO Arch needs to agree on what's got to > give. > > Thanks, > Thomas > > > > > > We cannot 'always migrate to mappable in the fault handler'. Or at > > least, this is not as trivial as it is to write in a sentence due to > > the need to spill out other active objects, and all the usual > > challenges with context synchronization etc. It is possible, perhaps > > with a lot of care, but it is challenging to guarantee, easy to > > break, and not needed for 99.9% of software. We are trying to > > simplify our driver stack. > > > > If we need a special mechanism for debug, we should devise a special > > mechanism, not throw out the general LMEM+SMEM requirement. Are > there > > any identified first-class clients that require such access, or is it > > only debugging tools? > > > > If only debug, then why can't the tool use a copy engine submission > > to access the data in place? Or perhaps a bespoke ioctl to access > > this via the KMD (and kmd submitted copy-engine BB)? > > > > Thanks, > > > > Jon > > > > > -Original Message- > > > From: Thomas Hellström > > > Sent: Thursday, March 17, 2022 2:35 AM > > > To: Joonas Lahtinen ; Bloomfield, > > > Jon > > > ; Intel Graphics Development > > g...@lists.freedesktop.org>; Auld, Matthew ; > > > C, > > > Ramalingam > > > Subject: Re: Small bar recovery vs compressed content on DG2 > > > > > > On Thu, 2022-03-17 at 10:43 +0200, Joonas Lahtinen wrote: > > > > Quoting Thomas Hellström (2022-03-16 09:25:16) > > > > > Hi! > > > > > > > > > > Do we somehow need to clarify in the headers the semantics for > > > > > this? > > > > > > > > > > From my understanding when discussing the CCS migration series > > > > > with > > > > > Ram, the kernel will never do any resolving (compressing / > > > > > decompressing) migrations or evictions which basically implies > > > > > the > > > > > following: > > > > > > > > > > *) Compressed data must have LMEM only placement, otherwise the > > > GPU > > > > > would read garbage if accessing from SMEM. > > > > > > > > This has always been the case, so it should be documented in the > > > > uAPI > > > > headers and kerneldocs. 
> > > > > > > > > *) Compressed data can't be assumed to be mappable by the CPU, > > > > > because > > > > > in order to ensure that on small BAR, the placement needs to be > > > > > LMEM+SMEM. > > > > > > > > Not strictly true, as we could always migrate to the mappable > > > > region > > > > in > > > > the CPU fault handler. Will need the same set of tricks as with > > > > limited > > > > mappable GGTT in past. > > > > > > In addition to Matt's reply: > > > > > > Yes, if there is sufficient space. I'm not sure we want to > >
Re: [Intel-gfx] [PATCH] drm/i915/adlp: Remove require_force_probe protection
Assuming the whitespace cleanup requested below is completed: Acked-by: Jon Bloomfield > -Original Message- > From: Intel-gfx On Behalf Of > Rodrigo Vivi > Sent: Tuesday, November 16, 2021 2:33 PM > To: Taylor, Clinton A > Cc: Intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915/adlp: Remove > require_force_probe protection > > On Mon, Nov 15, 2021 at 03:53:45PM -0800, clinton.a.tay...@intel.com > wrote: > > From: Clint Taylor > > > > drm/i915/adlp: Remove require_force_probe protection > > > > Removing force probe protection from ADL_P platform. Did > > not observe warnings, errors, flickering or any visual > > defects while doing ordinary tasks like browsing and > > editing documents in a two monitor setup. > > some strange alignment here... please remove the extra > tabs here. > > > > > For more info drm-tip idle run results : > > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip.html? > > hmm... I could swear that I had seen the ADL-P green there a few > days ago as well... But right now I couldn't see ADL-P there... > > So that fails on having a *reliable* green CI picture... > Any idea why that is down at this moment? > > > > > Cc: Rodrigo Vivi > > > > Signed-off-by: Clint Taylor > > --- > > drivers/gpu/drm/i915/i915_pci.c | 1 - > > 1 file changed, 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c > b/drivers/gpu/drm/i915/i915_pci.c > > index 4c7fcc5f9a97..af9f4988bd88 100644 > > --- a/drivers/gpu/drm/i915/i915_pci.c > > +++ b/drivers/gpu/drm/i915/i915_pci.c > > @@ -970,7 +970,6 @@ static const struct intel_device_info adl_p_info = { > > GEN12_FEATURES, > > XE_LPD_FEATURES, > > PLATFORM(INTEL_ALDERLAKE_P), > > - .require_force_probe = 1, > > .cpu_transcoder_mask = BIT(TRANSCODER_A) | > BIT(TRANSCODER_B) | > >BIT(TRANSCODER_C) | BIT(TRANSCODER_D) | > >BIT(TRANSCODER_DSI_0) | > BIT(TRANSCODER_DSI_1), > > -- > > 2.33.1 > >
Re: [Intel-gfx] [PATCH v3 0/4] GuC HWCONFIG with documentation
Acked-by: Jon Bloomfield > -Original Message- > From: Intel-gfx On Behalf Of > Jordan Justen > Sent: Tuesday, February 8, 2022 1:05 PM > To: intel-gfx > Cc: dri-devel > Subject: [Intel-gfx] [PATCH v3 0/4] GuC HWCONFIG with documentation > > This is John/Rodrigo's 2 patches with some minor changes, and I added > 2 patches. > > "drm/i915/uapi: Add query for hwconfig blob" was changed: > > * Rename DRM_I915_QUERY_HWCONFIG_TABLE to > DRM_I915_QUERY_HWCONFIG_BLOB >as requested by Joonas. > > * Reword commit message > > * I added Acked-by to this patch, but this only applies in the >context of this version of the patchset. If my changes are >rejected, then please *do not* add my Acked-by to the other series. > >In particular, I do not want my Acked-by on the patch if the patch >mentions the HWCONFIG format, but is not willing to add that to the >actual uAPI. > >I also do not want my Acked-by on it if it mentions "consolidation" >of this data. Since we are dealing with open source projects (aside >from GuC), this doesn't seem appropriate. > > "drm/i915/uapi: Add struct drm_i915_query_hwconfig_blob_item" adds a > struct to the uAPI and documents the return value for > DRM_I915_QUERY_HWCONFIG_BLOB. (Except, keys / values are still > deferred to the PRM.) > > "drm/i915/guc: Verify hwconfig blob matches supported format" does the > simple verification of the blob to make sure it matches what the uAPI > documents. > > v2: > * Fix -Werror errors. > * Rebase to drm-intel/for-linux-next instead of >drm-intel/for-linux-next-gt, as this seems to be what CI wants. > * Fix u32 -> __u32. (Sorry, I was first testing in Mesa tree.) > * Add commit message for "Verify hwconfig blob" patch as requested by >Tvrtko. > * Reword text added to i915_drm.h as requested by Tvrtko. (Attempting >to indicate the overall blob ends right at the last blob item.) > > v3: > * Add several changes suggested by Tvrtko in the "Verify hwconfig >blob", along with some tweaks to i915_drm.h from the feedback for >the same patch. > > John Harrison (1): > drm/i915/guc: Add fetch of hwconfig table > > Jordan Justen (2): > drm/i915/uapi: Add struct drm_i915_query_hwconfig_blob_item > drm/i915/guc: Verify hwconfig blob matches supported format > > Rodrigo Vivi (1): > drm/i915/uapi: Add query for hwconfig blob > > drivers/gpu/drm/i915/Makefile | 1 + > .../gpu/drm/i915/gt/uc/abi/guc_actions_abi.h | 1 + > .../gpu/drm/i915/gt/uc/abi/guc_errors_abi.h | 4 + > drivers/gpu/drm/i915/gt/uc/intel_guc.h| 3 + > .../gpu/drm/i915/gt/uc/intel_guc_hwconfig.c | 201 ++ > .../gpu/drm/i915/gt/uc/intel_guc_hwconfig.h | 19 ++ > drivers/gpu/drm/i915/gt/uc/intel_uc.c | 6 + > drivers/gpu/drm/i915/i915_query.c | 23 ++ > include/uapi/drm/i915_drm.h | 33 +++ > 9 files changed, 291 insertions(+) > create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.c > create mode 100644 drivers/gpu/drm/i915/gt/uc/intel_guc_hwconfig.h > > -- > 2.34.1
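For completeness, a sketch of how userspace might consume the new query as documented in this series: probe the blob size with DRM_I915_QUERY_HWCONFIG_BLOB, then walk the (key, length, data[]) items. The key meanings are deferred to the PRM, so only the framing is shown, and the dump_hwconfig() helper is illustrative:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Walk the blob as a sequence of
 * struct drm_i915_query_hwconfig_blob_item { __u32 key, length, data[]; },
 * where length is in dwords per the proposal. */
static void dump_hwconfig(int fd)
{
	struct drm_i915_query_item item = {
		.query_id = DRM_I915_QUERY_HWCONFIG_BLOB,
	};
	struct drm_i915_query query = {
		.num_items = 1,
		.items_ptr = (uintptr_t)&item,
	};
	uint32_t *blob, *p, *end;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return; /* no blob available, e.g. no GuC */

	blob = calloc(1, item.length);
	item.data_ptr = (uintptr_t)blob;

	if (ioctl(fd, DRM_IOCTL_I915_QUERY, &query) == 0) {
		p = blob;
		end = blob + item.length / sizeof(uint32_t);
		while (p + 2 <= end) {
			uint32_t key = p[0], len = p[1];

			if (p + 2 + len > end)
				break; /* malformed item */
			printf("key %u: %u dwords\n", key, len);
			p += 2 + len; /* skip over data[len] */
		}
	}

	free(blob);
}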
Re: [Intel-gfx] [PATCH 20/30] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, October 2, 2019 4:20 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Winiarski, Michal > ; Bloomfield, Jon > Subject: [PATCH 20/30] drm/i915: Cancel non-persistent contexts on close > > Normally, we rely on our hangcheck to prevent persistent batches from > hogging the GPU. However, if the user disables hangcheck, this mechanism > breaks down. Despite our insistence that this is unsafe, the users are > equally insistent that they want to use endless batches and will disable > the hangcheck mechanism. We are looking at perhaps replacing hangcheck > with a softer mechanism, that sends a pulse down the engine to check if > it is well. We can use the same preemptive pulse to flush an active > persistent context off the GPU upon context close, preventing resources > being lost and unkillable requests remaining on the GPU after process > termination. To avoid changing the ABI and accidentally breaking > existing userspace, we make the persistence of a context explicit and > enable it by default (matching current ABI). Userspace can opt out of > persistent mode (forcing requests to be cancelled when the context is > closed by process termination or explicitly) by a context parameter. To > facilitate existing use-cases of disabling hangcheck, if the modparam is > disabled (i915.enable_hangcheck=0), we disable persistence mode by > default. (Note, one of the outcomes for supporting endless mode will be > the removal of hangchecking, at which point opting into persistent mode > will be mandatory, or maybe the default perhaps controlled by cgroups.) > > Testcase: igt/gem_ctx_persistence > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Michał Winiarski > Cc: Jon Bloomfield > Reviewed-by: Jon Bloomfield > --- > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/gem/i915_gem_context.c | 122 ++ > drivers/gpu/drm/i915/gem/i915_gem_context.h | 15 +++ > .../gpu/drm/i915/gem/i915_gem_context_types.h | 1 + > .../gpu/drm/i915/gem/selftests/mock_context.c | 2 + > .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 56 > .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 14 ++ > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 +- > drivers/gpu/drm/i915/i915_priolist_types.h| 1 + > include/uapi/drm/i915_drm.h | 15 +++ > 10 files changed, 228 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index 1f87c70a2842..561fa2bb3006 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -76,6 +76,7 @@ gt-y += \ > gt/intel_breadcrumbs.o \ > gt/intel_context.o \ > gt/intel_engine_cs.o \ > + gt/intel_engine_heartbeat.o \ > gt/intel_engine_pm.o \ > gt/intel_engine_pool.o \ > gt/intel_engine_sysfs.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index 0ab416887fc2..e832238a72e5 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > +static int > +set_persistence(struct i915_gem_context *ctx, > + const struct drm_i915_gem_context_param *args) > +{ > + if (args->size) > + return -EINVAL; > + > + if (args->value) { > + i915_gem_context_set_persistence(ctx); > + return 0; > + } > + > + /* To cancel a context we use "preempt-to-idle" */ > + if (!(ctx->i915->caps.scheduler & > 
I915_SCHEDULER_CAP_PREEMPTION)) > + return -ENODEV; > + > + i915_gem_context_clear_persistence(ctx); > + return 0; > +} Hmmn. Given that disabling hangcheck is an explicit operation, and we already change the default setting, can't we make it a hard requirement that persistence requires hangcheck? You should not really be able to opt back in to persistence if hangcheck is disabled. In fact you could just test for hangcheck when deciding whether to kill the context, and force-kill if it is off - that way if hangcheck is disabled after a context starts it will get cleaned up. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
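For context, the opt-out under discussion ends up looking like this from userspace; the sketch below uses I915_CONTEXT_PARAM_PERSISTENCE as introduced by the patch, and the set_nonpersistent() wrapper is just an example:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Opt a context out of persistence: outstanding requests are cancelled
 * when the context is closed rather than left running. */
static int set_nonpersistent(int fd, uint32_t ctx_id)
{
	struct drm_i915_gem_context_param p = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_PERSISTENCE,
		.value = 0, /* 0 = cancel on close; 1 = legacy persistent mode */
	};

	/* Per the patch, clearing persistence needs preemption; the kernel
	 * rejects the opt-out with ENODEV otherwise. */
	return ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
}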
Re: [Intel-gfx] [PATCH 21/30] drm/i915: Replace hangcheck by heartbeats
> -----Original Message-----
> From: Chris Wilson
> Sent: Wednesday, October 2, 2019 4:20 AM
> To: intel-gfx@lists.freedesktop.org
> Cc: Chris Wilson ; Joonas Lahtinen ; Ursulin, Tvrtko ; Bloomfield, Jon
> Subject: [PATCH 21/30] drm/i915: Replace hangcheck by heartbeats
>
> Replace sampling the engine state every so often with a periodic
> heartbeat request to measure the health of an engine. This is coupled
> with the forced-preemption to allow long running requests to survive so
> long as they do not block other users.
>
> The heartbeat interval can be adjusted per-engine using,
>
> /sys/class/drm/card0/engine/*/heartbeat_interval_ms

Not true for discrete :-) Perhaps:

/sys/class/drm/card<n>/engine/*/heartbeat_interval_ms

And again in the code below.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
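For reference, adjusting the proposed attribute is a plain sysfs write; the sketch below assumes the per-card, per-engine layout from the patch, with the card index and engine name left as parameters since, as noted, card0 is not a safe assumption on multi-GPU systems:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Set the heartbeat interval for one engine, e.g.
 * set_heartbeat_ms(0, "rcs0", 2500). Path follows the sysfs layout
 * proposed in this series. */
static int set_heartbeat_ms(unsigned int card, const char *engine,
			    unsigned int ms)
{
	char path[128], val[16];
	int fd, len, ret;

	snprintf(path, sizeof(path),
		 "/sys/class/drm/card%u/engine/%s/heartbeat_interval_ms",
		 card, engine);
	len = snprintf(val, sizeof(val), "%u", ms);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ret = write(fd, val, len) == len ? 0 : -1;
	close(fd);
	return ret;
}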
Re: [Intel-gfx] [PATCH v2] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, October 2, 2019 7:23 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Winiarski, Michal > ; Bloomfield, Jon > Subject: [PATCH v2] drm/i915: Cancel non-persistent contexts on close > > Normally, we rely on our hangcheck to prevent persistent batches from > hogging the GPU. However, if the user disables hangcheck, this mechanism > breaks down. Despite our insistence that this is unsafe, the users are > equally insistent that they want to use endless batches and will disable > the hangcheck mechanism. We are looking at perhaps replacing hangcheck > with a softer mechanism, that sends a pulse down the engine to check if > it is well. We can use the same preemptive pulse to flush an active > persistent context off the GPU upon context close, preventing resources > being lost and unkillable requests remaining on the GPU after process > termination. To avoid changing the ABI and accidentally breaking > existing userspace, we make the persistence of a context explicit and > enable it by default (matching current ABI). Userspace can opt out of > persistent mode (forcing requests to be cancelled when the context is > closed by process termination or explicitly) by a context parameter. To > facilitate existing use-cases of disabling hangcheck, if the modparam is > disabled (i915.enable_hangcheck=0), we disable persistence mode by > default. (Note, one of the outcomes for supporting endless mode will be > the removal of hangchecking, at which point opting into persistent mode > will be mandatory, or maybe the default perhaps controlled by cgroups.) > > v2: Check for hangchecking at context termination, so that we are not > left with undying contexts from a craft user. s/craft/crafty/ unless we only care about sailors. Otherwise: Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 20/30] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, October 2, 2019 7:24 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > > Subject: RE: [PATCH 20/30] drm/i915: Cancel non-persistent contexts on close > > Quoting Bloomfield, Jon (2019-10-02 14:52:32) > > > > > > > > Hmmn. Given that disabling hangcheck is an explicit operation, and we > already change the default setting, can't we make it a hard requirement that > persistence requires hangcheck? You should not really be able to opt back in > to > persistence if hangcheck is disabled. In fact you could just test for > hangcheck > when deciding whether to kill the context, and force-kill if it is off - that > way if > hangcheck is disabled after a context starts it will get cleaned up. > > Just great, now I got to update the igt to treat i915.enable_hangcheck > as API! > -Chris Don't blame me ;-) I'm in damage limitation mode. I'd prefer we didn't have persistence at all. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
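For illustration, what treating i915.enable_hangcheck as API might look like on the igt side: read the modparam and use it as the expected default for persistence. A hedged sketch, not the actual igt helper:

/*
 * Hedged sketch: derive the expected default persistence from the
 * hangcheck modparam, which this patch turns into de-facto ABI.
 */
#include <stdbool.h>
#include <stdio.h>

static bool hangcheck_enabled(void)
{
        FILE *f = fopen("/sys/module/i915/parameters/enable_hangcheck", "r");
        char buf[8] = "";

        if (!f)
                return true;    /* parameter absent: assume the default (on) */
        if (!fgets(buf, sizeof(buf), f))
                buf[0] = '\0';
        fclose(f);
        return buf[0] == 'Y' || buf[0] == '1';
}

/* expected default: contexts are persistent iff hangcheck is enabled */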
Re: [Intel-gfx] [PATCH v2 3/5] drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET
> -Original Message- > From: Intel-gfx On Behalf Of Abdiel > Janulgue > Sent: Monday, October 7, 2019 2:19 AM > To: intel-gfx@lists.freedesktop.org > Cc: Auld, Matthew > Subject: [Intel-gfx] [PATCH v2 3/5] drm/i915: Introduce > DRM_I915_GEM_MMAP_OFFSET > > This is really just an alias of mmap_gtt. Add a new CPU mmap > implementation that allows multiple fault handlers that depends on > the object's backing pages. > > Note that we multiplex mmap_gtt and mmap_offset through the same ioctl, > and use the zero extending behaviour of drm to differentiate between > them, when we inspect the flags. > > Signed-off-by: Abdiel Janulgue > Signed-off-by: Matthew Auld > Cc: Joonas Lahtinen > --- > drivers/gpu/drm/i915/gem/i915_gem_mman.c | 36 +-- > .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 ++ > drivers/gpu/drm/i915/i915_drv.c | 2 +- > include/uapi/drm/i915_drm.h | 28 +++ > 4 files changed, 66 insertions(+), 3 deletions(-) How does the label 'offset' fit into this API if it's really about multiple fault handlers? Could do with a much better description here I think. Who would use this, and why, would help a lot. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
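As I read the series, the intended flow is to ask the kernel for a fake offset for the object and then mmap() the drm fd at that offset, with the flags selecting the fault handler. A rough sketch of that flow follows; the struct layout and the reuse of the mmap_gtt ioctl number are assumptions taken from the commit message, not the final ABI:

/*
 * Hedged sketch of the two-step flow: fetch a "fake" offset for a GEM
 * handle, then mmap() the drm fd at that offset.  The struct layout is
 * assumed; the ioctl number reuses mmap_gtt since the commit message
 * says the two are multiplexed through the same ioctl.
 */
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/i915_drm.h>

struct local_i915_gem_mmap_offset {     /* assumed layout */
        uint32_t handle;
        uint32_t pad;
        uint64_t offset;        /* out: fake offset to pass to mmap() */
        uint64_t flags;         /* selects the fault handler / caching mode */
};
#define LOCAL_IOCTL_I915_GEM_MMAP_OFFSET \
        DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_GTT, \
                 struct local_i915_gem_mmap_offset)

static void *gem_mmap_offset(int drm_fd, uint32_t handle, uint64_t size,
                             uint64_t flags)
{
        struct local_i915_gem_mmap_offset arg = {
                .handle = handle,
                .flags = flags,
        };

        if (ioctl(drm_fd, LOCAL_IOCTL_I915_GEM_MMAP_OFFSET, &arg))
                return MAP_FAILED;

        return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                    drm_fd, arg.offset);
}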
Re: [Intel-gfx] [PATCH i-g-t] igt: Use COND_BBEND for busy spinning on gen9
> -Original Message- > From: Chris Wilson > Sent: Wednesday, November 13, 2019 11:47 AM > To: intel-gfx@lists.freedesktop.org > Cc: Bloomfield, Jon ; Lahtinen, Joonas > ; Vivi, Rodrigo ; > Kuoppala, Mika ; Mika Kuoppala > > Subject: [PATCH i-g-t] igt: Use COND_BBEND for busy spinning on gen9 > > From: Jon Bloomfield > > gen9+ introduces a cmdparser for the BLT engine which copies the > incoming BB to a kmd owned buffer for submission (to prevent changes > being made after the bb has been safely scanned). This breaks the > spin functionality because it relies on changing the submitted spin > buffers in order to terminate them. > > Instead, for gen9+, we change the semantics by introducing a COND_BB_END > into the infinite loop, to wait until a memory flag (in anothe bo) is > cleared. > > v2: Correct nop length to avoid overwriting bb_end instr when using > a dependency bo (cork) > v3: fix conflicts on igt_dummyload (Mika) > v4: s/bool running/uint32_t running, fix r->delta (Mika) > v5: remove overzealous assert (Mika) > v6: rebase on top of lib changes (Mika) > v7: rework on top of public igt lib changes (Mika) > v8: rebase > v9: simplify by using bb end as conditional (Chris) > > Signed-off-by: Jon Bloomfield (v2) > Cc: Joonas Lahtinen > Cc: Rodrigo Vivi > Cc: Mika Kuoppala > Signed-off-by: Mika Kuoppala > --- > lib/i830_reg.h | 2 +- > lib/igt_dummyload.c | 24 +++- > lib/intel_reg.h | 3 +++ > tests/i915/gem_double_irq_loop.c| 2 -- > tests/i915/gem_write_read_ring_switch.c | 2 +- > 5 files changed, 28 insertions(+), 5 deletions(-) > > diff --git a/lib/i830_reg.h b/lib/i830_reg.h > index a57691c7e..410202567 100644 > --- a/lib/i830_reg.h > +++ b/lib/i830_reg.h > @@ -43,7 +43,7 @@ > /* broadwater flush bits */ > #define BRW_MI_GLOBAL_SNAPSHOT_RESET (1 << 3) > > -#define MI_COND_BATCH_BUFFER_END (0x36<<23 | 1) > +#define MI_COND_BATCH_BUFFER_END (0x36 << 23) > #define MI_DO_COMPARE(1<<21) > > #define MI_BATCH_BUFFER_END (0xA << 23) > diff --git a/lib/igt_dummyload.c b/lib/igt_dummyload.c > index c079bd045..93b8b6bc6 100644 > --- a/lib/igt_dummyload.c > +++ b/lib/igt_dummyload.c > @@ -208,7 +208,29 @@ emit_recursive_batch(igt_spin_t *spin, >* trouble. See https://bugs.freedesktop.org/show_bug.cgi?id=102262 >*/ > if (!(opts->flags & IGT_SPIN_FAST)) > - cs += 1000; > + cs += 960; > + > + /* > + * When using a cmdparser, the batch is copied into a read only > location > + * and validated. We are then unable to alter the executing batch, > + * breaking the older *spin->condition = MI_BB_END termination. > + * Instead we can use a conditional MI_BB_END here that looks at > + * the user's copy of the batch and terminates when they modified it. > + */ > + if (gen >= 9) { > + r = &relocs[batch->relocation_count++]; > + > + r->presumed_offset = 0; > + r->target_handle = batch->handle; > + r->offset = (cs + 2 - batch_start) * sizeof(*cs); > + r->read_domains = I915_GEM_DOMAIN_COMMAND; > + r->delta = (spin->condition - batch_start) * sizeof(*cs); > + > + *cs++ = MI_COND_BATCH_BUFFER_END | 1 << 21 | 5 << 12 | > 2; > + *cs++ = MI_BATCH_BUFFER_END; > + *cs++ = r->delta; > + *cs++ = 0; > + } > > /* recurse */ > r = &relocs[batch->relocation_count++]; > diff --git a/lib/intel_reg.h b/lib/intel_reg.h > index 069440cb7..10ca760a2 100644 > --- a/lib/intel_reg.h > +++ b/lib/intel_reg.h > @@ -2593,6 +2593,9 @@ SOFTWARE OR THE USE OR OTHER DEALINGS IN > THE SOFTWARE. 
> #define MI_BATCH_BUFFER ((0x30 << 23) | 1) > #define MI_BATCH_BUFFER_START(0x31 << 23) > #define MI_BATCH_BUFFER_END (0xA << 23) > +#define MI_COND_BATCH_BUFFER_END (0x36 << 23) > +#define MI_DO_COMPARE (1<<21) > + > #define MI_BATCH_NON_SECURE (1) > #define MI_BATCH_NON_SECURE_I965 (1 << 8) > #define MI_BATCH_NON_SECURE_HSW (1<<13) /* Additional > bit for RCS */ > diff --git a/tests/i915/gem_double_irq_loop.c > b/tests/i915/gem_double_irq_loop.c > index b326fc58a..f17f61c19 100644 > --- a/tests/i915/gem_double_irq_loop.c > +++ b/tests/i915/gem_double_irq_loop.c > @@ -52,8 +52,6 @@ static drm_intel_bo *target_buff
Re: [Intel-gfx] drm-next + i915 CVE yolo merge
> -Original Message- > From: Daniel Vetter > Sent: Thursday, November 14, 2019 12:13 AM > To: Dave Airlie > Cc: Jani Nikula ; Bloomfield, Jon > ; Joonas Lahtinen > ; Chris Wilson ; > Stephen Rothwell ; dri-devel de...@lists.freedesktop.org>; Intel Graphics Development g...@lists.freedesktop.org>; Linus Torvalds > Subject: Re: drm-next + i915 CVE yolo merge > > On Thu, Nov 14, 2019 at 2:33 AM Dave Airlie wrote: > > The landing of the i915 CVE fixes into Linus tree has created a bit of > > a mess in linux-next and downstream in drm-next trees. > > > > I talked to Daniel and he had talked to Joonas a bit, and I decided to > > go with what Daniel describes as the YOLO merge, where I just solve it > > and pray, and everyone else verifies/fixes it. > > > > In my favour I've been reading these patches for a couple of months > > now and applied them to a lot of places, so I'm quite familiar with > > what they are doing. > > > > The worst culprit was the RC6 ctx corruption fix since the whole of > > rc6 got refactored in next. However I also had access to a version of > > this patch Jon wrote on drm-tip a couple of weeks ago. > > > > I took that patch, applied it and fixed it up on top of drm-next. Then > > I backmerged the commit that also went into Linus' tree. Then I > > removed any evidence of the RC6 patch from Linus' tree and left the > > newer version pieces in place. The other stuff mostly merged fine and > > the results looked fine, but I'd definitely think everyone at Intel > > should be staring at it, and my dinq tip resolutions ASAP and > > hopefully it goes into CI and comes out smelling of something good. > > Imre should look at the RC6 fix - He did all the hard work on that, including the rebases I sent to Dave. I was just a proxy :-) > > Let me know if it's all horrible asap, > > Add Martin and Arek for CI results. The yolo stuff landed in > CI_DRM_7340, did we break anything in there? From a quick look seems > all ok. Jon ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Thursday, July 25, 2019 4:17 PM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Ursulin, Tvrtko ; > Bloomfield, Jon > Subject: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Replace sampling the engine state every so often with a periodic > heartbeat request to measure the health of an engine. This is coupled > with the forced-preemption to allow long running requests to survive so > long as they do not block other users. Can you explain why we would need this at all if we have forced-preemption? Forced preemption guarantees that an engine cannot interfere with the timely execution of other contexts. If it hangs, but nothing else wants to use the engine then do we care? Power, obviously. But I'm not everything can be pre-empted. If a compute context is running on an engine, and no other contexts require that engine, then is it right to nuke it just because it's busy in a long running thread? > > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Tvrtko Ursulin > Cc: Jon Bloomfield > --- > Note, strictly this still requires struct_mutex-free retirement for the > corner case where the idle-worker is ineffective and we get a backlog of > requests on the kernel ring. Or if the legacy ringbuffer is full... > When we are able to retire from outside of struct_mutex we can do the > full idle-barrier and idle-work from here. > --- > drivers/gpu/drm/i915/Kconfig.profile | 11 + > drivers/gpu/drm/i915/Makefile | 2 +- > drivers/gpu/drm/i915/gem/i915_gem_pm.c| 2 - > drivers/gpu/drm/i915/gt/intel_engine.h| 32 -- > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 10 +- > .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 99 + > .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 17 + > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +- > drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 +- > drivers/gpu/drm/i915/gt/intel_gt.c| 1 - > drivers/gpu/drm/i915/gt/intel_gt.h| 4 - > drivers/gpu/drm/i915/gt/intel_gt_pm.c | 2 - > drivers/gpu/drm/i915/gt/intel_gt_types.h | 9 - > drivers/gpu/drm/i915/gt/intel_hangcheck.c | 360 -- > drivers/gpu/drm/i915/gt/intel_reset.c | 3 +- > drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 4 - > drivers/gpu/drm/i915/i915_debugfs.c | 86 - > drivers/gpu/drm/i915/i915_drv.c | 6 +- > drivers/gpu/drm/i915/i915_drv.h | 1 - > drivers/gpu/drm/i915/i915_gpu_error.c | 37 +- > drivers/gpu/drm/i915/i915_gpu_error.h | 2 - > drivers/gpu/drm/i915/i915_params.c| 5 - > drivers/gpu/drm/i915/i915_params.h| 1 - > 23 files changed, 151 insertions(+), 562 deletions(-) > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h > delete mode 100644 drivers/gpu/drm/i915/gt/intel_hangcheck.c > > diff --git a/drivers/gpu/drm/i915/Kconfig.profile > b/drivers/gpu/drm/i915/Kconfig.profile > index 3184e8491333..aafb57f84169 100644 > --- a/drivers/gpu/drm/i915/Kconfig.profile > +++ b/drivers/gpu/drm/i915/Kconfig.profile > @@ -37,3 +37,14 @@ config DRM_I915_PREEMPT_TIMEOUT > to execute. > > May be 0 to disable the timeout. > + > +config DRM_I915_HEARTBEAT_INTERVAL > + int "Interval between heartbeat pulses (ms)" > + default 2500 # microseconds > + help > + While active the driver uses a periodic request, a heartbeat, to > + check the wellness of the GPU and to regularly flush state changes > + (idle barriers). > + > + May be 0 to disable heartbeats and therefore disable automatic GPU > + hang detection. 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index 524516251a40..18201852d68d 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -73,10 +73,10 @@ gt-y += \ > gt/intel_breadcrumbs.o \ > gt/intel_context.o \ > gt/intel_engine_cs.o \ > + gt/intel_engine_heartbeat.o \ > gt/intel_engine_pm.o \ > gt/intel_gt.o \ > gt/intel_gt_pm.o \ > - gt/intel_hangcheck.o \ > gt/intel_lrc.o \ > gt/intel_renderstate.o \ > gt/intel_reset.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c > b/drivers/gpu/drm/i915/gem/i915_gem_pm.c > index b5561cbdc5ea..a5a0aefcc04c 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c > @@ -170,8 +170,6 @@ void i915_gem_suspend(struct drm_i9
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Thursday, July 25, 2019 4:28 PM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Quoting Bloomfield, Jon (2019-07-26 00:21:47) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Thursday, July 25, 2019 4:17 PM > > > To: intel-gfx@lists.freedesktop.org > > > Cc: Chris Wilson ; Joonas Lahtinen > > > ; Ursulin, Tvrtko > ; > > > Bloomfield, Jon > > > Subject: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > Replace sampling the engine state every so often with a periodic > > > heartbeat request to measure the health of an engine. This is coupled > > > with the forced-preemption to allow long running requests to survive so > > > long as they do not block other users. > > > > Can you explain why we would need this at all if we have forced-preemption? > > Forced preemption guarantees that an engine cannot interfere with the > timely > > execution of other contexts. If it hangs, but nothing else wants to use the > engine > > then do we care? > > We may not have something else waiting to use the engine, but we may > have users waiting for the response where we need to detect the GPU hang > to prevent an infinite wait / stuck processes and infinite power drain. I'm not sure I buy that logic. Being able to pre-empt doesn't imply it will ever end. As written a context can sit forever, apparently making progress but never actually returning a response to the user. If the user isn't happy with the progress they will kill the process. So we haven't solved the user responsiveness here. All we've done is eliminated the potential to run one class of otherwise valid workload. Same argument goes for power. Just because it yields when other contexts want to run doesn't mean it won't consume lots of power indefinitely. I can equally write a CPU program to burn lots of power, forever, and it won't get nuked. TDR made sense when it was the only way to ensure contexts could always make forward progress. But force-preemption does everything we need to ensure that as far as I can tell. > > There is also the secondary task of flushing idle barriers. > > > Power, obviously. But I'm not everything can be pre-empted. If a compute > > context is running on an engine, and no other contexts require that engine, > > then is it right to nuke it just because it's busy in a long running > > thread? > > Yes. Unless you ask that we implement GPU-isolation where not even the > kernel is allowed to use a particular set of engines. > -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
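To make the mechanism being argued over concrete, here is a standalone model of the heartbeat escalation (not the driver code, just a sketch of the logic the patch describes): emit a low-priority pulse each interval, bump its priority if it has not run by the next interval, and declare the engine hung only once a maximum-priority pulse is still denied preemption.

/*
 * Hedged, standalone model of the heartbeat escalation (not i915 code).
 */
#include <stdbool.h>
#include <stdio.h>

enum prio { PRIO_MIN, PRIO_NORMAL, PRIO_MAX };

struct engine_model {
        bool heartbeat_pending;         /* previous pulse not yet completed */
        enum prio heartbeat_prio;
};

/* returns true if the engine should be declared hung and reset */
static bool heartbeat_tick(struct engine_model *e, bool pulse_completed)
{
        if (!e->heartbeat_pending || pulse_completed) {
                /* all is well: emit the next low-priority pulse */
                e->heartbeat_pending = true;
                e->heartbeat_prio = PRIO_MIN;
                return false;
        }

        if (e->heartbeat_prio < PRIO_MAX) {
                /* not serviced yet: escalate and allow one more interval */
                e->heartbeat_prio++;
                return false;
        }

        /* a maximum-priority pulse was denied preemption: engine is hung */
        return true;
}

int main(void)
{
        struct engine_model e = { 0 };

        /* model a context that never yields: pulses never complete */
        for (int tick = 0; tick < 5; tick++)
                printf("tick %d: reset=%d\n", tick,
                       heartbeat_tick(&e, false));
        return 0;
}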
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Thursday, July 25, 2019 4:52 PM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Quoting Bloomfield, Jon (2019-07-26 00:41:49) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Thursday, July 25, 2019 4:28 PM > > > To: Bloomfield, Jon ; intel- > > > g...@lists.freedesktop.org > > > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > > > > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > Quoting Bloomfield, Jon (2019-07-26 00:21:47) > > > > > -Original Message- > > > > > From: Chris Wilson > > > > > Sent: Thursday, July 25, 2019 4:17 PM > > > > > To: intel-gfx@lists.freedesktop.org > > > > > Cc: Chris Wilson ; Joonas Lahtinen > > > > > ; Ursulin, Tvrtko > > > ; > > > > > Bloomfield, Jon > > > > > Subject: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > > > > > Replace sampling the engine state every so often with a periodic > > > > > heartbeat request to measure the health of an engine. This is coupled > > > > > with the forced-preemption to allow long running requests to survive > > > > > so > > > > > long as they do not block other users. > > > > > > > > Can you explain why we would need this at all if we have forced- > preemption? > > > > Forced preemption guarantees that an engine cannot interfere with the > > > timely > > > > execution of other contexts. If it hangs, but nothing else wants to use > > > > the > > > engine > > > > then do we care? > > > > > > We may not have something else waiting to use the engine, but we may > > > have users waiting for the response where we need to detect the GPU hang > > > to prevent an infinite wait / stuck processes and infinite power drain. > > > > I'm not sure I buy that logic. Being able to pre-empt doesn't imply it will > > ever end. As written a context can sit forever, apparently making progress > > but never actually returning a response to the user. If the user isn't happy > > with the progress they will kill the process. So we haven't solved the > > user responsiveness here. All we've done is eliminated the potential to > > run one class of otherwise valid workload. > > Indeed, one of the conditions I have in mind for endless is rlimits. The > user + admin should be able to specify that a context not exceed so much > runtime, and if we ever get a scheduler, we can write that as a budget > (along with deadlines). Agreed, an rlimit solution would work better, if there was an option for 'unlimited'. The specific reason I poked was to try to find a solution to the long-running compute workload scenarios. In those cases there is no fwd progress on the ring/BB, but the kernel will still be making fwd progress. The challenge here is that it's not easy for compiler to guarantee to fold a user kernel into something that fit any specific time boundary. It depends on the user kernel structure. A thread could take several seconds (or possibly minutes) to complete. That's not automatically bad. We need a solution to that specific problem, and while I agree that it would be nice to detect errant contexts and kick them out, that relies on an accurate classification of 'errant' and 'safe'. The response needs to be proportional to the problem. > > > Same argument goes for power. Just because it yields when other contexts > > want to run doesn't mean it won't consume lots of power indefinitely. 
I can > > equally write a CPU program to burn lots of power, forever, and it won't get > > nuked. > > I agree, and continue to dislike letting hogs have free reign. But even with this change a BB hog can still have free rein to consume power, as long as it pauses its unsavoury habit whenever the heartbeat request comes snooping on the engine. What we're effectively saying with this is that it's ok for a context to spin in a BB, but not ok to spin in an EU kernel. Why would we differentiate when both can burn the same power? So we haven't solved this problem, but continue to victimize legitimate code. > > > TDR made sense when it was the only way to ensure contexts could always > > make forward progress. But force-preemption does everything we need to >
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Friday, July 26, 2019 1:11 PM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Quoting Bloomfield, Jon (2019-07-26 16:00:06) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Thursday, July 25, 2019 4:52 PM > > > To: Bloomfield, Jon ; intel- > > > g...@lists.freedesktop.org > > > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > > > > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > Quoting Bloomfield, Jon (2019-07-26 00:41:49) > > > > > -Original Message- > > > > > From: Chris Wilson > > > > > Sent: Thursday, July 25, 2019 4:28 PM > > > > > To: Bloomfield, Jon ; intel- > > > > > g...@lists.freedesktop.org > > > > > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > > > > > > > > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > > > > > Quoting Bloomfield, Jon (2019-07-26 00:21:47) > > > > > > > -Original Message- > > > > > > > From: Chris Wilson > > > > > > > Sent: Thursday, July 25, 2019 4:17 PM > > > > > > > To: intel-gfx@lists.freedesktop.org > > > > > > > Cc: Chris Wilson ; Joonas Lahtinen > > > > > > > ; Ursulin, Tvrtko > > > > > ; > > > > > > > Bloomfield, Jon > > > > > > > Subject: [PATCH] drm/i915: Replace hangcheck by heartbeats > > > > > > > > > > > > > > Replace sampling the engine state every so often with a periodic > > > > > > > heartbeat request to measure the health of an engine. This is > coupled > > > > > > > with the forced-preemption to allow long running requests to > survive so > > > > > > > long as they do not block other users. > > > > > > > > > > > > Can you explain why we would need this at all if we have forced- > > > preemption? > > > > > > Forced preemption guarantees that an engine cannot interfere with > the > > > > > timely > > > > > > execution of other contexts. If it hangs, but nothing else wants to > > > > > > use > the > > > > > engine > > > > > > then do we care? > > > > > > > > > > We may not have something else waiting to use the engine, but we may > > > > > have users waiting for the response where we need to detect the GPU > hang > > > > > to prevent an infinite wait / stuck processes and infinite power > > > > > drain. > > > > > > > > I'm not sure I buy that logic. Being able to pre-empt doesn't imply it > > > > will > > > > ever end. As written a context can sit forever, apparently making > > > > progress > > > > but never actually returning a response to the user. If the user isn't > > > > happy > > > > with the progress they will kill the process. So we haven't solved the > > > > user responsiveness here. All we've done is eliminated the potential to > > > > run one class of otherwise valid workload. > > > > > > Indeed, one of the conditions I have in mind for endless is rlimits. The > > > user + admin should be able to specify that a context not exceed so much > > > runtime, and if we ever get a scheduler, we can write that as a budget > > > (along with deadlines). > > > > Agreed, an rlimit solution would work better, if there was an option for > 'unlimited'. > > > > The specific reason I poked was to try to find a solution to the > > long-running compute workload scenarios. In those cases there is no fwd > > progress on the ring/BB, but the kernel will still be making fwd progress. 
> > The > > challenge here is that it's not easy for compiler to guarantee to fold a > > user > > kernel into something that fit any specific time boundary. It depends on the > > user kernel structure. A thread could take several seconds (or possibly > > minutes) to complete. That's not automatically bad. > > > > We need a solution to that specific problem, and while I agree that it would > be nice to detect errant contexts and kick them ou
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
Hmmn. We're still on orthogonal perspectives as far as our previous arguments stand. But it doesn't matter because while thinking through your replies, I realized there is one argument in favour, which trumps all my previous arguments against this patch - it makes things deterministic. Without this patch (or hangcheck), whether a context gets nuked depends on what else is running. And that's a recipe for confused support emails. So I retract my other arguments, thanks for staying with me :-) BTW: TDR==Timeout-Detection and Reset. Essentially hangcheck and recovery. > -Original Message- > From: Chris Wilson > Sent: Friday, July 26, 2019 2:30 PM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Quoting Bloomfield, Jon (2019-07-26 21:58:38) > > > From: Chris Wilson > > > It's no more often than before, you have to fail to advance within an > > > interval, and then deny preemption request. > > > > It's entrapment. You are creating an artificial workload for the context to > impede. Before that artificial workload was injected, the context would have > run to completion, the world would be at peace (and the user would have her > new bitcoin). Instead she stays poor because the DoS police launched an > illegal > sting on her otherwise genius operation. > > It's housekeeping; it's the cost of keeping the engine alive. This > argument is why CPU-isolation came about (aiui). > > > > > I do like the rlimit suggestion, but until we have that, just disabling > > > > TDR > feels > > > like a better stop-gap than nuking innocent compute contexts just in case > they > > > might block something. > > > > > > They're not innocent though :-p > > > > They are innocent until proven guilty :-) > > > > > > > > Disabling hangcheck (no idea why you confuse that with the recovery > > > procedure) makes igt unhappy, but they are able to do that today with > > > the modparam. This patch makes it easier to move that to an engine > > > parameter, but to disable it entirely you still need to be able to reset > > > the GPU on demand (suspend, eviction, oom). Without hangcheck we need > to > > > evaluate all MAX_SCHEDULE_TIMEOUT waits and supply a reset-on- > timeout > > > along critical paths. > > > -Chris > > > > I don't think I'm confusing hang-check with the recovery. I've talked about > TDR, which to me is a periodic hangcheck, combined with a recovery by engine > reset. I don't argue against being able to reset, just against the blunt > classification that hangcheck itself provides. > > > > TDR was originally looking for regressive workloads that were not making > forward progress to protect against DoS. But it was always a very blunt tool. > It's > never been able to differentiate long running, but otherwise valid, compute > workloads from genuine BB hangs, but that was fine at the time, and as you say > users could always switch the modparam. > > To be honest, I still have no idea what TDR is. But I take it that you > agree that we're only talking about hangcheck :) What I think you are > missing out on is that we have some more or less essential (itself > depending on client behaviour) housekeeping that goes along side it. > > My claim is that without a guarantee of isolation, anything else that > wants to use that engine will need the background housekeeping. 
(And if > we don't go as far as performing complete isolation, I expect userspace > is going to need the kernel to cleanup as they go along, as they are > unlikely to be prepared to do the system maintenance themselves.) > > > Now we have more emphasis on compute we need a solution that doesn't > involve a modparam. This was specifically requested by the compute team - > they know that they can flip the tdr switch, but that means their workload > will > only run if user modifies the system. That's hardly ideal. > > It means they can adjust things to their admins' hearts' content, and > it's a udev rule away from setting permissions to allow the compute group > to freely reconfigure the settings. > > > Without the rlimit concept I don't think we can't prevent power hogs > whatever we do, any more than the core kernel can prevent CPU power hogs. > So, if we can prevent a workload from blocking other contexts, then it is > unhelpful to continue either with the blunt tool that TDR is, or the similarly > blunt heartbeat. If
Re: [Intel-gfx] [PATCH] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Joonas Lahtinen > Sent: Monday, July 29, 2019 5:50 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org; Chris Wilson > Cc: Ursulin, Tvrtko > Subject: RE: [PATCH] drm/i915: Replace hangcheck by heartbeats > > Quoting Chris Wilson (2019-07-29 12:45:52) > > Quoting Joonas Lahtinen (2019-07-29 10:26:47) > > > Ok, so just confirming here. The plan is still to have userspace set a > > > per context (or per request) time limit for expected completion of a > > > request. This will be useful for the media workloads that consume > > > deterministic amount of time for correct bitstream. And the userspace > > > wants to be notified much quicker than the generic hangcheck time if > > > the operation failed due to corrupt bitstream. > > > > > > This time limit can be set to infinite by compute workloads. > > > > That only provides a cap on the context itself. We need to make sure that proposals such as the above are compatible with GuC. The nice thing about the heartbeat is that it relies on a more or less standard request/context and so should be compatible with any back end. > > Yes. > > > We also have the > > criteria that is something else has been selected to run on the GPU, you > > have to allow preemption within a certain period or else you will be > > shot. > > This is what I meant ... > > > > Then, in parallel to that, we have cgroups or system wide configuration > > > for maximum allowed timeslice per process/context. That means that a > > > long-running workload must pre-empt at that granularity. > > ... with this. > > > Not quite. It must preempt within a few ms of being asked, that is a > > different problem to the timeslice granularity (which is when we ask it > > to switch, or if not due to a high priority request earlier). It's a QoS > > issue for the other context. Setting that timeout is hard, we can allow > > a context to select its own timeout, or define it via sysfs/cgroups, but > > because it depends on the running context, it causes another context to > > fail in non-trivial ways. The GPU is simply not as preemptible as one > > would like. > > Right, I was only thinking about the pre-emption delay, maybe I chose my > words wrong. Basically what the admin wants to control is exactly what > you wrote, how long it can take from pre-emption request to completion. > This is probably useful as a CONTEXT_GETPARAM for userspace to consider. > They might decide how many loops to run without MI_ARB_CHECK in > non-pre-emptible sections. Dunno. > > That parameter configures the QoS level of the system, how fast a high > priority requests gets to run on the hardware. > > > Fwiw, I was thinking the next step would be to put per-engine controls > > in sysfs, then cross the cgroups bridge. I'm not sure my previous plan > > of exposing per-context parameters for timeslice/preemption is that > > usable. > > The prefered frequency of how often a context would like to be scheduled > on the hardware, makes sense as context setparam. Compute workloads are > probably indifferent and something related to media most likely wants to > run at some multiple of video FPS. > > I guess the userspace could really only request being run more > frequently than the default, and in exchange it would receive less > execution time per each slice. We probably want to control the upper > bound of the frequency. 
> > > > That pre-emption/hearbeat should happen regardless if others contexts are > > > requesting the hardware or not, because better start recovery of a hung > > > task as soon as it misbehaves. > > > > I concur, but Jon would like the opposite to allow for uncooperative > > compute kernels that simply block preemption forever. I think for the I wasn't asking that :-) What I was originally asking for is to allow a compute workload to run forever IF no other contexts need to run. Don't launch a pre-emptive strike, only kill it if it actually blocks a real workload. But then I realized that this is bad for deterministic behaviour. So retracted the ask. Using the heartbeat to test the workload for pre-emptability is a good solution because it ensures a workload always fails quickly, or never fails. > > extreme Jon wants, something like CPU-isolation fits better, where the > > important client owns an engine all to itself and the kernel is not even > > allowed to do housekeeping on that engine. (We would turn off time- > > slicing, preemption timers, etc on that engine and basically run it in > > submission order.) Agreed, isolation is really the only way we could permit a workload to hog an engine indefinitely. This would be beneficial to some of the RTOS use cases in particular. > > Makes sense to me. > > Regards, Joonas ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Tuesday, August 6, 2019 6:47 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Winiarski, Michal > ; Bloomfield, Jon > Subject: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Normally, we rely on our hangcheck to prevent persistent batches from > hogging the GPU. However, if the user disables hangcheck, this mechanism > breaks down. Despite our insistence that this is unsafe, the users are > equally insistent that they want to use endless batches and will disable > the hangcheck mechanism. We are looking are perhaps replacing hangcheck > with a softer mechanism, that sends a pulse down the engine to check if > it is well. We can use the same preemptive pulse to flush an active > persistent context off the GPU upon context close, preventing resources > being lost and unkillable requests remaining on the GPU, after process > termination. To avoid changing the ABI and accidentally breaking > existing userspace, we make the persistence of a context explicit and > enable it by default. Userspace can opt out of persistent mode (forcing > requests to be cancelled when the context is closed by process > termination or explicitly) by a context parameter, or to facilitate > existing use-cases by disabling hangcheck (i915.enable_hangcheck=0). > (Note, one of the outcomes for supporting endless mode will be the > removal of hangchecking, at which point opting into persistent mode will > be mandatory, or maybe the default.) > > Testcase: igt/gem_ctx_persistence > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Michał Winiarski > Cc: Jon Bloomfield > > --- > Same sort of caveats as for hangcheck, a few corner cases need > struct_mutex and some preallocation. 
> --- > drivers/gpu/drm/i915/Makefile | 3 +- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 79 +++ > drivers/gpu/drm/i915/gem/i915_gem_context.h | 15 > .../gpu/drm/i915/gem/i915_gem_context_types.h | 1 + > .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 53 + > .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 14 > include/uapi/drm/i915_drm.h | 15 > 7 files changed, 179 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index a1016858d014..42417d87496e 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -71,9 +71,10 @@ obj-y += gt/ > gt-y += \ > gt/intel_breadcrumbs.o \ > gt/intel_context.o \ > - gt/intel_engine_pool.o \ > gt/intel_engine_cs.o \ > + gt/intel_engine_heartbeat.o \ > gt/intel_engine_pm.o \ > + gt/intel_engine_pool.o \ > gt/intel_gt.o \ > gt/intel_gt_pm.o \ > gt/intel_hangcheck.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index 64f7a533e886..5718b74f95b7 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -70,6 +70,7 @@ > #include > > #include "gt/intel_lrc_reg.h" > +#include "gt/intel_engine_heartbeat.h" > > #include "i915_gem_context.h" > #include "i915_globals.h" > @@ -373,6 +374,45 @@ void i915_gem_context_release(struct kref *ref) > queue_work(i915->wq, &i915->contexts.free_work); > } > > +static void kill_context(struct i915_gem_context *ctx) > +{ > + struct i915_gem_engines_iter it; > + struct intel_engine_cs *engine; > + intel_engine_mask_t tmp, active; > + struct intel_context *ce; > + > + if (i915_gem_context_is_banned(ctx)) > + return; > + > + i915_gem_context_set_banned(ctx); > + > + active = 0; > + for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) { > + struct i915_request *rq; > + > + if (!ce->ring) > + continue; > + > + rq = i915_active_request_get_unlocked(&ce->ring->timeline- > >last_request); > + if (!rq) > + continue; > + > + active |= rq->engine->mask; > + i915_request_put(rq); > + } > + i915_gem_context_unlock_engines(ctx); > + > + /* > + * Send a "high priority pulse" down the engine to cause the > + * current request to be momentarily preempted. (If it fails to > +
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, August 7, 2019 6:23 AM > To: intel-gfx@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > ; Bloomfield, Jon > Subject: Re: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Quoting Chris Wilson (2019-08-06 14:47:25) > > @@ -433,6 +482,8 @@ __create_context(struct drm_i915_private *i915) > > > > i915_gem_context_set_bannable(ctx); > > i915_gem_context_set_recoverable(ctx); > > + if (i915_modparams.enable_hangcheck) > > + i915_gem_context_set_persistence(ctx); > > I am not fond of this, but from a pragmatic point of view, this does > prevent the issue Jon raised: HW resources being pinned indefinitely > past process termination. > > I don't like it because we cannot perform the operation cleanly > everywhere, it requires preemption first and foremost (with a cooperating > submission backend) and per-engine reset. The alternative is to try and > do a full GPU reset if the context is still active. For the sake of the > issue raised, I think that (full reset on older HW) is required. > > That we are baking in a change of ABI due to an unsafe modparam is meh. > There are a few more corner cases to deal with before endless just > works. But since it is being used in the wild, I'm not sure we can wait > for userspace to opt-in, or wait for cgroups. However, since users are > being encouraged to disable hangcheck, should we extend the concept of > persistence to also mean "survives hangcheck"? -- though it should be a > separate parameter, and I'm not sure how exactly to protect it from the > heartbeat reset without giving gross privileges to the context. (CPU > isolation is nicer from the pov where we can just give up and not even > worry about the engine. If userspace can request isolation, it has the > privilege to screw up.) > -Chris Ok, so your concern is supporting non-persistence on older hardware that lacks preemption and per-engine reset. Why is that strictly required? Can't we simply make it dependent on the features needed to do it well, and if your hardware cannot, then the advice is not to disable hangcheck? I'm doubtful that anyone would attempt an HPC type workload on an IVB. I'm not sure I understand your "survives hangcheck" idea. You mean instead of simply disabling hangcheck, just opt in to persistence and have that also prevent hangcheck? Isn't that the wrong way around, since persistence is what is happening today? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, August 7, 2019 7:14 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Quoting Bloomfield, Jon (2019-08-07 15:04:16) > > Ok, so your concern is supporting non-persistence on older non-preempting, > engine-reset capable, hardware. Why is that strictly required? Can't we simply > make it dependent on the features needed to do it well, and if your hardware > cannot, then the advice is not to disable hangcheck? I'm doubtful that anyone > would attempt a HPC type workload on n IVB. > > Our advice was not to disable hangcheck :) > > With the cat out of the bag, my concern is dotting the Is and crossing > the Ts. Fixing up the error handling path to the reset isn't all that > bad. But I'm not going to advertise the persistence-parameter support > unless we have a clean solution, and we can advise that compute > workloads are better handled with new hardware. > > > I'm not sure I understand your "survives hangcheck" idea. You mean instead > of simply disabling hangcheck, just opt in to persistence and have that also > prevent hangcheck? Isn't that the wrong way around, since persistence is what > is happening today? > > Persistence is the clear-and-present danger. I'm just trying to sketch a > path for endless support, trying to ask myself questions such as: Is the > persistence parameter still required? What other parameters make sense? > Does anything less than CPU-esque isolation make sense? :) > -Chris I personally liked your persistence idea :-) Isolation doesn't really solve the problem in this case. So, customer enables isolation for a HPC workload. That workload hangs, and user hits ctrl-C. We still have a hung workload and the next job in the queue still can't run. Also, Isolation is kind of meaningless when there is only one engine that's capable of running your workload. On Gen9, pretty much every type of workload requires some RCS involvement, and RCS is where the compute workloads need to run. So isolation hasn't helped us. I'd settle for umd opting in to non-persistence and not providing the automatic association with hangcheck. That at least ensures well behaved umd's don't kill the system. We didn't explore the idea of terminating orphaned contexts though (where none of their resources are referenced by any other contexts). Is there a reason why this is not feasible? In the case of compute (certainly HPC) workloads, there would be no compositor taking the output so this might be a solution. Jon ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, August 7, 2019 8:08 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Quoting Bloomfield, Jon (2019-08-07 15:33:51) > [skip to end] > > We didn't explore the idea of terminating orphaned contexts though > (where none of their resources are referenced by any other contexts). Is > there a reason why this is not feasible? In the case of compute (certainly > HPC) workloads, there would be no compositor taking the output so this > might be a solution. > > Sounds easier said than done. We have to go through each request and > determine it if has an external reference (or if the object holding the > reference has an external reference) to see if the output would be > visible to a third party. Sounds like a conservative GC :| > (Coming to that conclusion suggests that we should structure the request > tracking to make reparenting easier.) > > We could take a pid-1 approach and move all the orphan timelines over to > a new parent purely responsible for them. That honestly doesn't seem to > achieve anything. (We are still stuck with tasks on the GPU and no way > to kill them.) > > In comparison, persistence is a rarely used "feature" and cleaning up on > context close fits in nicely with the process model. It just works as > most users/clients would expect. (Although running in non-persistent > by default hasn't show anything to explode on the desktop, it's too easy > to construct scenarios where persistence turns out to be an advantage, > particularly with chains of clients (the compositor model).) Between the > two modes, we should have most bases covered, it's hard to argue for a > third way (that is until someone has a usecase!) > -Chris Ok, makes sense. Thanks. But have we converged on a decision :-) As I said, requiring compute umd optin should be ok for the immediate HPC issue, but I'd personally argue that it's valid to change the contract for hangcheck=0 and switch the default to non-persistent. Jon ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Wednesday, August 7, 2019 9:51 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Quoting Bloomfield, Jon (2019-08-07 16:29:55) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Wednesday, August 7, 2019 8:08 AM > > > To: Bloomfield, Jon ; intel- > > > g...@lists.freedesktop.org > > > Cc: Joonas Lahtinen ; Winiarski, Michal > > > > > > Subject: RE: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > > > > > Quoting Bloomfield, Jon (2019-08-07 15:33:51) > > > [skip to end] > > > > We didn't explore the idea of terminating orphaned contexts though > > > (where none of their resources are referenced by any other contexts). Is > > > there a reason why this is not feasible? In the case of compute (certainly > > > HPC) workloads, there would be no compositor taking the output so this > > > might be a solution. > > > > > > Sounds easier said than done. We have to go through each request and > > > determine it if has an external reference (or if the object holding the > > > reference has an external reference) to see if the output would be > > > visible to a third party. Sounds like a conservative GC :| > > > (Coming to that conclusion suggests that we should structure the request > > > tracking to make reparenting easier.) > > > > > > We could take a pid-1 approach and move all the orphan timelines over to > > > a new parent purely responsible for them. That honestly doesn't seem to > > > achieve anything. (We are still stuck with tasks on the GPU and no way > > > to kill them.) > > > > > > In comparison, persistence is a rarely used "feature" and cleaning up on > > > context close fits in nicely with the process model. It just works as > > > most users/clients would expect. (Although running in non-persistent > > > by default hasn't show anything to explode on the desktop, it's too easy > > > to construct scenarios where persistence turns out to be an advantage, > > > particularly with chains of clients (the compositor model).) Between the > > > two modes, we should have most bases covered, it's hard to argue for a > > > third way (that is until someone has a usecase!) > > > -Chris > > > > Ok, makes sense. Thanks. > > > > But have we converged on a decision :-) > > > > As I said, requiring compute umd optin should be ok for the immediate HPC > issue, but I'd personally argue that it's valid to change the contract for > hangcheck=0 and switch the default to non-persistent. 
> > Could you tender > > diff --git a/runtime/os_interface/linux/drm_neo.cpp > b/runtime/os_interface/linux/drm_neo.cpp > index 31deb68b..8a9af363 100644 > --- a/runtime/os_interface/linux/drm_neo.cpp > +++ b/runtime/os_interface/linux/drm_neo.cpp > @@ -141,11 +141,22 @@ void Drm::setLowPriorityContextParam(uint32_t > drmContextId) { > UNRECOVERABLE_IF(retVal != 0); > } > > +void setNonPersistent(uint32_t drmContextId) { > +drm_i915_gem_context_param gcp = {}; > +gcp.ctx_id = drmContextId; > +gcp.param = 0xb; /* I915_CONTEXT_PARAM_PERSISTENCE; */ > + > +ioctl(DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &gcp); > +} > + > uint32_t Drm::createDrmContext() { > drm_i915_gem_context_create gcc = {}; > auto retVal = ioctl(DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &gcc); > UNRECOVERABLE_IF(retVal != 0); > > +/* enable cleanup of resources on process termination */ > +setNonPersistent(gcc.ctx_id); > + > return gcc.ctx_id; > } > > to interested parties? > -Chris Yes, that's exactly what I had in mind. I think it's enough to resolve the HPC challenges. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Friday, August 9, 2019 4:35 PM > To: intel-gfx@lists.freedesktop.org > Cc: Joonas Lahtinen ; Winiarski, Michal > ; Bloomfield, Jon > Subject: Re: [PATCH 5/5] drm/i915: Cancel non-persistent contexts on close > > Quoting Chris Wilson (2019-08-06 14:47:25) > > Normally, we rely on our hangcheck to prevent persistent batches from > > hogging the GPU. However, if the user disables hangcheck, this mechanism > > breaks down. Despite our insistence that this is unsafe, the users are > > equally insistent that they want to use endless batches and will disable > > the hangcheck mechanism. We are looking are perhaps replacing hangcheck > > with a softer mechanism, that sends a pulse down the engine to check if > > it is well. We can use the same preemptive pulse to flush an active > > persistent context off the GPU upon context close, preventing resources > > being lost and unkillable requests remaining on the GPU, after process > > termination. To avoid changing the ABI and accidentally breaking > > existing userspace, we make the persistence of a context explicit and > > enable it by default. Userspace can opt out of persistent mode (forcing > > requests to be cancelled when the context is closed by process > > termination or explicitly) by a context parameter, or to facilitate > > existing use-cases by disabling hangcheck (i915.enable_hangcheck=0). > > (Note, one of the outcomes for supporting endless mode will be the > > removal of hangchecking, at which point opting into persistent mode will > > be mandatory, or maybe the default.) > > For the record, I've finally run into examples of desktop clients > exiting before their rendering is shown. No longer hypothetical. > -Chris Can you share any details - Might be useful for testing. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 27/28] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Monday, August 26, 2019 12:22 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Winiarski, Michal > ; Bloomfield, Jon > Subject: [PATCH 27/28] drm/i915: Cancel non-persistent contexts on close > > Normally, we rely on our hangcheck to prevent persistent batches from > hogging the GPU. However, if the user disables hangcheck, this mechanism > breaks down. Despite our insistence that this is unsafe, the users are > equally insistent that they want to use endless batches and will disable > the hangcheck mechanism. We are looking are perhaps replacing hangcheck "looking at"? > with a softer mechanism, that sends a pulse down the engine to check if > it is well. We can use the same preemptive pulse to flush an active > persistent context off the GPU upon context close, preventing resources > being lost and unkillable requests remaining on the GPU after process > termination. To avoid changing the ABI and accidentally breaking > existing userspace, we make the persistence of a context explicit and > enable it by default (matching current ABI). Userspace can opt out of > persistent mode (forcing requests to be cancelled when the context is > closed by process termination or explicitly) by a context parameter. To > facilitate existing use-cases of disabling hangcheck, if the modparam is > disabled (i915.enable_hangcheck=0), we disable peristence mode by > default. (Note, one of the outcomes for supporting endless mode will be > the removal of hangchecking, at which point opting into persistent mode > will be mandatory, or maybe the default perhaps controlled by cgroups.) > > Testcase: igt/gem_ctx_persistence > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Michał Winiarski > Cc: Jon Bloomfield > --- > drivers/gpu/drm/i915/Makefile | 3 +- > drivers/gpu/drm/i915/gem/i915_gem_context.c | 122 ++ > drivers/gpu/drm/i915/gem/i915_gem_context.h | 15 +++ > .../gpu/drm/i915/gem/i915_gem_context_types.h | 1 + > .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 54 > .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 14 ++ > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 2 +- > drivers/gpu/drm/i915/i915_priolist_types.h| 1 + > include/uapi/drm/i915_drm.h | 15 +++ > 9 files changed, 225 insertions(+), 2 deletions(-) > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c > create mode 100644 drivers/gpu/drm/i915/gt/intel_engine_heartbeat.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index 658b930d34a8..eaa74e000985 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -76,8 +76,9 @@ gt-y += \ > gt/intel_breadcrumbs.o \ > gt/intel_context.o \ > gt/intel_engine_cs.o \ > - gt/intel_engine_pool.o \ > + gt/intel_engine_heartbeat.o \ > gt/intel_engine_pm.o \ > + gt/intel_engine_pool.o \ > gt/intel_engine_user.o \ > gt/intel_gt.o \ > gt/intel_gt_irq.o \ > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c > b/drivers/gpu/drm/i915/gem/i915_gem_context.c > index bd9397669332..5520a896e701 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c > @@ -70,6 +70,7 @@ > #include > > #include "gt/intel_lrc_reg.h" > +#include "gt/intel_engine_heartbeat.h" > #include "gt/intel_engine_user.h" > > #include "i915_gem_context.h" > @@ -375,6 +376,78 @@ void i915_gem_context_release(struct kref *ref) > queue_work(i915->wq, &i915->contexts.free_work); > } > > +static inline struct i915_gem_engines * > 
+__context_engines_static(struct i915_gem_context *ctx) > +{ > + return rcu_dereference_protected(ctx->engines, true); > +} > + > +static void kill_context(struct i915_gem_context *ctx) > +{ > + intel_engine_mask_t tmp, active, reset; > + struct intel_gt *gt = &ctx->i915->gt; > + struct i915_gem_engines_iter it; > + struct intel_engine_cs *engine; > + struct intel_context *ce; > + > + /* > + * If we are already banned, it was due to a guilty request causing > + * a reset and the entire context being evicted from the GPU. > + */ > + if (i915_gem_context_is_banned(ctx)) > + return; > + > + i915_gem_context_set_banned(ctx); > + > + /* > + * Map the user's engine back to the actual engines; one virtual > + * engine will be mapped to multiple engines, and using ctx-&g
Re: [Intel-gfx] [PATCH 28/28] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Monday, August 26, 2019 12:22 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Ursulin, Tvrtko ; > Bloomfield, Jon > Subject: [PATCH 28/28] drm/i915: Replace hangcheck by heartbeats > > Replace sampling the engine state every so often with a periodic > heartbeat request to measure the health of an engine. This is coupled > with the forced-preemption to allow long running requests to survive so > long as they do not block other users. > > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Tvrtko Ursulin > Cc: Jon Bloomfield > --- > drivers/gpu/drm/i915/Kconfig.profile | 11 + > drivers/gpu/drm/i915/Makefile | 1 - > drivers/gpu/drm/i915/display/intel_display.c | 2 +- > drivers/gpu/drm/i915/gem/i915_gem_object.h| 1 - > drivers/gpu/drm/i915/gem/i915_gem_pm.c| 2 - > drivers/gpu/drm/i915/gt/intel_engine.h| 32 -- > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 10 +- > .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 113 +- > .../gpu/drm/i915/gt/intel_engine_heartbeat.h | 5 + > drivers/gpu/drm/i915/gt/intel_engine_pm.c | 5 +- > drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 +- > drivers/gpu/drm/i915/gt/intel_gt.c| 1 - > drivers/gpu/drm/i915/gt/intel_gt.h| 4 - > drivers/gpu/drm/i915/gt/intel_gt_pm.c | 2 - > drivers/gpu/drm/i915/gt/intel_gt_types.h | 9 - > drivers/gpu/drm/i915/gt/intel_hangcheck.c | 361 -- > drivers/gpu/drm/i915/gt/intel_reset.c | 3 +- > drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 4 - > drivers/gpu/drm/i915/i915_debugfs.c | 87 - > drivers/gpu/drm/i915/i915_drv.c | 3 - > drivers/gpu/drm/i915/i915_drv.h | 1 - > drivers/gpu/drm/i915/i915_getparam.c | 3 +- > drivers/gpu/drm/i915/i915_gpu_error.c | 33 +- > drivers/gpu/drm/i915/i915_gpu_error.h | 2 - > drivers/gpu/drm/i915/i915_params.c| 6 +- > drivers/gpu/drm/i915/i915_priolist_types.h| 6 + > 26 files changed, 159 insertions(+), 562 deletions(-) > delete mode 100644 drivers/gpu/drm/i915/gt/intel_hangcheck.c > > diff --git a/drivers/gpu/drm/i915/Kconfig.profile > b/drivers/gpu/drm/i915/Kconfig.profile > index 3184e8491333..aafb57f84169 100644 > --- a/drivers/gpu/drm/i915/Kconfig.profile > +++ b/drivers/gpu/drm/i915/Kconfig.profile > @@ -37,3 +37,14 @@ config DRM_I915_PREEMPT_TIMEOUT > to execute. > > May be 0 to disable the timeout. > + > +config DRM_I915_HEARTBEAT_INTERVAL > + int "Interval between heartbeat pulses (ms)" > + default 2500 # microseconds > + help > + While active the driver uses a periodic request, a heartbeat, to > + check the wellness of the GPU and to regularly flush state changes > + (idle barriers). > + > + May be 0 to disable heartbeats and therefore disable automatic GPU > + hang detection. 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index eaa74e000985..d7286720de83 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -84,7 +84,6 @@ gt-y += \ > gt/intel_gt_irq.o \ > gt/intel_gt_pm.o \ > gt/intel_gt_pm_irq.o \ > - gt/intel_hangcheck.o \ > gt/intel_lrc.o \ > gt/intel_renderstate.o \ > gt/intel_reset.o \ > diff --git a/drivers/gpu/drm/i915/display/intel_display.c > b/drivers/gpu/drm/i915/display/intel_display.c > index 6e74c33f2ec4..e008016d864c 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -14306,7 +14306,7 @@ static void intel_plane_unpin_fb(struct > intel_plane_state *old_plane_state) > static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj) > { > struct i915_sched_attr attr = { > - .priority = I915_PRIORITY_DISPLAY, > + .priority = I915_USER_PRIORITY(I915_PRIORITY_DISPLAY), > }; > > i915_gem_object_wait_priority(obj, 0, &attr); > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h > b/drivers/gpu/drm/i915/gem/i915_gem_object.h > index a78af25dce36..967c30737dc5 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h > @@ -429,6 +429,5 @@ int i915_gem_object_wait(struct > drm_i915_gem_object *obj, > int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, > unsigned int flags, > const struct i915_sched_
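To make the heartbeat mechanism above concrete, here is a small, self-contained sketch of the escalation idea: each tick either emits a new low-priority pulse, raises the priority of a pulse that has not yet retired, or, once even a maximum-priority pulse makes no progress, declares the engine hung. The names and the exact escalation steps are illustrative, not the driver's actual code.

#include <stdbool.h>

enum pulse_prio {
	PULSE_PRIO_MIN,		/* normal heartbeat priority */
	PULSE_PRIO_HEARTBEAT,	/* first escalation step */
	PULSE_PRIO_BARRIER,	/* maximum: must preempt anything preemptible */
};

struct heartbeat_state {
	bool previous_pulse_done;	/* set when the last pulse retires (not shown) */
	enum pulse_prio prio;		/* priority of the outstanding pulse */
};

/* Called every heartbeat interval; returning true means "reset the engine". */
static bool heartbeat_tick(struct heartbeat_state *hb)
{
	if (hb->previous_pulse_done) {
		/* Engine is alive: emit the next pulse at minimum priority. */
		hb->prio = PULSE_PRIO_MIN;
		hb->previous_pulse_done = false;
		return false;
	}

	/* No progress since the last tick: escalate before giving up. */
	if (hb->prio < PULSE_PRIO_BARRIER) {
		hb->prio++;
		return false;
	}

	/* Even a maximum-priority pulse failed to complete: engine is hung. */
	return true;
}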
Re: [Intel-gfx] [PATCH 28/28] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Monday, August 26, 2019 9:57 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Ursulin, Tvrtko > > Subject: RE: [PATCH 28/28] drm/i915: Replace hangcheck by heartbeats > > > Isn't engine->heartbeat now NULL in some cases? > > engine->heartbeat, the worker > vs > engine->last_heartbeat > Doh! > Maybe, > > struct intel_engine_heartbeat { > work; > systole; > }; > > > > > diff --git a/drivers/gpu/drm/i915/i915_getparam.c > > > b/drivers/gpu/drm/i915/i915_getparam.c > > > index 5d9101376a3d..e6c351080593 100644 > > > --- a/drivers/gpu/drm/i915/i915_getparam.c > > > +++ b/drivers/gpu/drm/i915/i915_getparam.c > > > @@ -78,8 +78,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void > > > *data, > > > return -ENODEV; > > > break; > > > case I915_PARAM_HAS_GPU_RESET: > > > - value = i915_modparams.enable_hangcheck && > > > - intel_has_gpu_reset(i915); > > > + value = intel_has_gpu_reset(i915); > > > > Don't understand this tweak. We haven't really changed the essence of > hangcheck, just improved it, so why do we change this get_param? > > I deleted the modparam in earlier patches. But anticipated you would > object... Ok, I see. But then shouldn't we just be checking the new param for a non-zero timeout? That would then be equivalent. Or, it seems fair to conclude that this never made sense, but then it really ought to be a separate patch to remove the association between HAS_GPU_RESET and hangcheck. > -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
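The "check the new param for a non-zero timeout" alternative raised above could, as a rough sketch, tie the getparam to the heartbeat interval introduced in patch 28/28 instead of the removed modparam. This is only an illustration of the idea, not a complete patch, and assumes the Kconfig symbol CONFIG_DRM_I915_HEARTBEAT_INTERVAL from that patch:

	case I915_PARAM_HAS_GPU_RESET:
		value = CONFIG_DRM_I915_HEARTBEAT_INTERVAL &&
			intel_has_gpu_reset(i915);
		break;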
Re: [Intel-gfx] [PATCH 29/36] drm/i915: Replace hangcheck by heartbeats
> -Original Message- > From: Chris Wilson > Sent: Thursday, August 29, 2019 1:12 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Ursulin, Tvrtko ; > Bloomfield, Jon > Subject: [PATCH 29/36] drm/i915: Replace hangcheck by heartbeats > > Replace sampling the engine state every so often with a periodic > heartbeat request to measure the health of an engine. This is coupled > with the forced-preemption to allow long running requests to survive so > long as they do not block other users. > > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Tvrtko Ursulin > Cc: Jon Bloomfield > --- Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 28/36] drm/i915: Cancel non-persistent contexts on close
> -Original Message- > From: Chris Wilson > Sent: Thursday, August 29, 2019 1:12 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Joonas Lahtinen > ; Winiarski, Michal > ; Bloomfield, Jon > Subject: [PATCH 28/36] drm/i915: Cancel non-persistent contexts on close > > Normally, we rely on our hangcheck to prevent persistent batches from > hogging the GPU. However, if the user disables hangcheck, this mechanism > breaks down. Despite our insistence that this is unsafe, the users are > equally insistent that they want to use endless batches and will disable > the hangcheck mechanism. We are looking at perhaps replacing hangcheck > with a softer mechanism, that sends a pulse down the engine to check if > it is well. We can use the same preemptive pulse to flush an active > persistent context off the GPU upon context close, preventing resources > being lost and unkillable requests remaining on the GPU after process > termination. To avoid changing the ABI and accidentally breaking > existing userspace, we make the persistence of a context explicit and > enable it by default (matching current ABI). Userspace can opt out of > persistent mode (forcing requests to be cancelled when the context is > closed by process termination or explicitly) by a context parameter. To > facilitate existing use-cases of disabling hangcheck, if the modparam is > disabled (i915.enable_hangcheck=0), we disable persistence mode by > default. (Note, one of the outcomes for supporting endless mode will be > the removal of hangchecking, at which point opting into persistent mode > will be mandatory, or maybe the default perhaps controlled by cgroups.) > > Testcase: igt/gem_ctx_persistence > Signed-off-by: Chris Wilson > Cc: Joonas Lahtinen > Cc: Michał Winiarski > Cc: Jon Bloomfield > --- Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915/gen7: Clear all EU/L3 residual contexts
Reducing audience as this series is of high interest externally. I fully agree with Joonas' suggestion here, and we have been looking at doing just that. But can we iterate as a follow up patch series? Putting in the infra to support igt assembly from source will take a little time (igt assembler doesn't like the source right now, so it looks like it will need updating), and we are under pressure to get this security fix out. Jon > -Original Message- > From: Joonas Lahtinen > Sent: Friday, January 31, 2020 1:52 AM > To: Abodunrin, Akeem G ; Wilson, Chris P > ; Phillips, D Scott ; > Vetter, Daniel ; Stewart, David C > ; dri-de...@lists.freedesktop.org; Balestrieri, > Francesco ; intel-gfx@lists.freedesktop.org; > Nikula, Jani ; Bloomfield, Jon > ; Kuoppala, Mika ; > Aran, Omer ; Pathi, Pragyansri > ; Kumar Valsan, Prathap > ; Dutt, Sudeep ; > Luck, Tony > Subject: Re: [PATCH 2/2] drm/i915/gen7: Clear all EU/L3 residual contexts > > Quoting Akeem G Abodunrin (2020-01-30 18:57:21) > > From: Prathap Kumar Valsan > > > > On gen7 and gen7.5 devices, there could be leftover data residuals in > > EU/L3 from the retiring context. This patch introduces workaround to clear > > that residual contexts, by submitting a batch buffer with dedicated HW > > context to the GPU with ring allocation for each context switching. > > > > This security mitigation change does not trigger any performance > > regression. Performance is on par with current mainline/drm-tip. > > > > Signed-off-by: Mika Kuoppala > > Signed-off-by: Prathap Kumar Valsan > > Signed-off-by: Akeem G Abodunrin > > Cc: Chris Wilson > > Cc: Balestrieri Francesco > > Cc: Bloomfield Jon > > Cc: Dutt Sudeep > > --- > > drivers/gpu/drm/i915/Makefile | 1 + > > drivers/gpu/drm/i915/gt/gen7_renderclear.c| 535 ++ > > drivers/gpu/drm/i915/gt/gen7_renderclear.h| 15 + > > drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 17 +- > > .../gpu/drm/i915/gt/intel_ring_submission.c | 3 +- > > drivers/gpu/drm/i915/i915_utils.h | 5 + > > 6 files changed, 572 insertions(+), 4 deletions(-) > > create mode 100644 drivers/gpu/drm/i915/gt/gen7_renderclear.c > > create mode 100644 drivers/gpu/drm/i915/gt/gen7_renderclear.h > > > > diff --git a/drivers/gpu/drm/i915/Makefile > b/drivers/gpu/drm/i915/Makefile > > index 3c88d7d8c764..f96bae664a03 100644 > > --- a/drivers/gpu/drm/i915/Makefile > > +++ b/drivers/gpu/drm/i915/Makefile > > @@ -78,6 +78,7 @@ gt-y += \ > > gt/debugfs_gt.o \ > > gt/debugfs_gt_pm.o \ > > gt/gen6_ppgtt.o \ > > + gt/gen7_renderclear.o \ > > gt/gen8_ppgtt.o \ > > gt/intel_breadcrumbs.o \ > > gt/intel_context.o \ > > diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c > b/drivers/gpu/drm/i915/gt/gen7_renderclear.c > > new file mode 100644 > > index ..a6f5f1602e33 > > --- /dev/null > > +++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c > > @@ -0,0 +1,535 @@ > > +// SPDX-License-Identifier: MIT > > +/* > > + * Copyright © 2019 Intel Corporation > > + */ > > + > > +#include "gen7_renderclear.h" > > +#include "i915_drv.h" > > +#include "i915_utils.h" > > +#include "intel_gpu_commands.h" > > + > > +#define MAX_URB_ENTRIES 64 > > +#define STATE_SIZE (4 * 1024) > > +#define GT3_INLINE_DATA_DELAYS 0x1E00 > > +#define batch_advance(Y, CS) GEM_BUG_ON((Y)->end != (CS)) > > + > > +/* > > + * Media CB Kernel for gen7 devices > > + * TODO: Add comments to kernel, indicating what each array of hex does > or > > + * include header file, which has assembly source and support in igt to be > > + * able to generate kernel in this same format > > + */ > > Having the original 
> source code for the kernels in IGT is the > best way to proceed. The kernels should also be split into > separate files which can be generated from IGT and copied > over as-is for easy verification. > > Regards, Joonas ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Include asm sources for {ivb, hsw}_clear_kernel.c
> -Original Message- > From: Intel-gfx On Behalf Of > Rodrigo Vivi > Sent: Wednesday, June 10, 2020 1:18 PM > To: intel-gfx@lists.freedesktop.org > Cc: Alexandre Oliva ; Nikula, Jani ; > sta...@vger.kernel.org; Chris Wilson > Subject: [Intel-gfx] [PATCH] drm/i915: Include asm sources for {ivb, > hsw}_clear_kernel.c > > Alexandre Oliva has recently removed these files from Linux Libre > with concerns that the sources weren't available. > > The sources are available on IGT repository, and only open source > tools are used to generate the {ivb,hsw}_clear_kernel.c files. > > However, the remaining concern from Alexandre Oliva was around > GPL license and the source not been present when distributing > the code. > > So, it looks like 2 alternatives are possible, the use of > linux-firmware.git repository to store the blob or making sure > that the source is also present in our tree. Since the goal > is to limit the i915 firmware to only the micro-controller blobs > let's make sure that we do include the asm sources here in our tree. > > Btw, I tried to have some diligence here and make sure that the > asms that these commits are adding are truly the source for > the mentioned files: > > igt$ ./scripts/generate_clear_kernel.sh -g ivb \ > -m ~/mesa/build/src/intel/tools/i965_asm > Output file not specified - using default file "ivb-cb_assembled" > > Generating gen7 CB Kernel assembled file "ivb_clear_kernel.c" > for i915 driver... > > igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/ivb_clear_kernel.c \ > ivb_clear_kernel.c > > < * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:29:32 AM UTC > > * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:00:54 AM PDT > 61c61 > < }; > > }; > \ No newline at end of file > > igt$ ./scripts/generate_clear_kernel.sh -g hsw \ > -m ~/mesa/build/src/intel/tools/i965_asm > Output file not specified - using default file "hsw-cb_assembled" > > Generating gen7.5 CB Kernel assembled file "hsw_clear_kernel.c" > for i915 driver... > > igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/hsw_clear_kernel.c \ > hsw_clear_kernel.c > 5c5 > < * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:30:13 AM UTC > > * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:01:42 AM PDT > 61c61 > < }; > > }; > \ No newline at end of file > > Used IGT and Mesa master repositories from Fri Jun 5 2020) > IGT: 53e8c878a6fb ("tests/kms_chamelium: Force reprobe after replugging > the connector") > Mesa: 5d13c7477eb1 ("radv: set keep_statistic_info with > RADV_DEBUG=shaderstats") > Mesa built with: meson build -D platforms=drm,x11 -D dri-drivers=i965 \ > -D gallium-drivers=iris -D prefix=/usr \ >-D libdir=/usr/lib64/ -Dtools=intel \ >-Dkulkan-drivers=intel && ninja -C build > > v2: Header clean-up and include build instructions in a readme (Chris) > Modified commit message to respect check-patch > > Reference: http://www.fsfla.org/pipermail/linux-libre/2020- > June/003374.html > Reference: http://www.fsfla.org/pipermail/linux-libre/2020- > June/003375.html > Fixes: 47f8253d2b89 ("drm/i915/gen7: Clear all EU/L3 residual contexts") > Cc: # v5.7+ > Cc: Alexandre Oliva > Cc: Prathap Kumar Valsan > Cc: Akeem G Abodunrin > Cc: Mika Kuoppala > Cc: Chris Wilson > Cc: Jani Nikula > Cc: Joonas Lahtinen > Signed-off-by: Rodrigo Vivi Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
> -Original Message- > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > Sent: Wednesday, December 6, 2017 7:38 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Bloomfield, Jon > ; Harrison, John C ; > Ursulin, Tvrtko ; Joonas Lahtinen > ; Daniel Vetter > Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a > and error capture > > Since capturing the error state requires fiddling around with the GGTT > to read arbitrary buffers and is itself run under stop_machine(), it > deadlocks the machine (effectively a hard hang) when run in conjunction > with Broxton's VTd workaround to serialize GGTT access. > > v2: Store the ERR_PTR in first_error so that the error can be reported > to the user via sysfs. > > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT") > Signed-off-by: Chris Wilson > Cc: Jon Bloomfield > Cc: John Harrison > Cc: Tvrtko Ursulin > Cc: Joonas Lahtinen > Cc: Daniel Vetter It's a real shame to lose error capture on BXT. Can we wrap stop_machine to make it recursive ? Something like... static cpumask_t sm_mask; struct sm_args { cpu_stop_fn_t *fn; void *data; }; void do_recursive_stop(void *sm_arg_data) { struct sm_arg *args = sm_arg_data; /* We're stopped - flag the fact to prevent recursion */ cpumask_set_cpu(smp_processor_id(), &sm_mask); args->fn(args->data); /* Re-enable recursion */ cpumask_clear_cpu(smp_processor_id(), &sm_mask); } void recursive_stop_machine(cpu_stop_fn_t fn, void *data) { if (cpumask_test_cpu(smp_processor_id(), &sm_mask)) { /* We were already stopped, so can just call directly */ fn(data); } else { /* Our CPU is not currently stopped */ struct sm_args *args = {fn, data}; stop_machine(do_recursive_stop, args, NULL); } } ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture
> -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of Bloomfield, Jon > Sent: Wednesday, December 6, 2017 9:01 AM > To: Chris Wilson ; intel-gfx@lists.freedesktop.org > Cc: Daniel Vetter > Subject: Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from > Broxton's vtd w/a and error capture > > > -Original Message- > > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > > Sent: Wednesday, December 6, 2017 7:38 AM > > To: intel-gfx@lists.freedesktop.org > > Cc: Chris Wilson ; Bloomfield, Jon > > ; Harrison, John C > ; > > Ursulin, Tvrtko ; Joonas Lahtinen > > ; Daniel Vetter > > Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd > w/a > > and error capture > > > > Since capturing the error state requires fiddling around with the GGTT > > to read arbitrary buffers and is itself run under stop_machine(), it > > deadlocks the machine (effectively a hard hang) when run in conjunction > > with Broxton's VTd workaround to serialize GGTT access. > > > > v2: Store the ERR_PTR in first_error so that the error can be reported > > to the user via sysfs. > > > > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT") > > Signed-off-by: Chris Wilson > > Cc: Jon Bloomfield > > Cc: John Harrison > > Cc: Tvrtko Ursulin > > Cc: Joonas Lahtinen > > Cc: Daniel Vetter > > It's a real shame to lose error capture on BXT. Can we wrap stop_machine to > make it recursive ? > > Something like... > > static cpumask_t sm_mask; > > struct sm_args { > cpu_stop_fn_t *fn; > void *data; > }; > > void do_recursive_stop(void *sm_arg_data) > { > struct sm_arg *args = sm_arg_data; > > /* We're stopped - flag the fact to prevent recursion */ > cpumask_set_cpu(smp_processor_id(), &sm_mask); > > args->fn(args->data); > > /* Re-enable recursion */ > cpumask_clear_cpu(smp_processor_id(), &sm_mask); > } > > void recursive_stop_machine(cpu_stop_fn_t fn, void *data) > { > if (cpumask_test_cpu(smp_processor_id(), &sm_mask)) { > /* We were already stopped, so can just call directly */ > fn(data); > } > else { > /* Our CPU is not currently stopped */ > struct sm_args *args = {fn, data}; > stop_machine(do_recursive_stop, args, NULL); > } > } ... I think a single bool is sufficient in place of the cpumask, since it is set and cleared within stop_machine - I started out trying to set/clear outside. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
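Pulling the two messages above together, a corrected, single-flag version of the sketch (fixing the struct initialisation and the argument passed to stop_machine()) might look like this. Names are illustrative and error handling is elided; it only shows the recursion guard, not a final patch.

#include <linux/stop_machine.h>

static bool machine_already_stopped;	/* only ever touched under stop_machine() */

struct sm_args {
	cpu_stop_fn_t fn;
	void *data;
};

static int do_recursive_stop(void *arg)
{
	struct sm_args *args = arg;
	int ret;

	/* We are stopped - flag the fact to short-circuit recursion. */
	machine_already_stopped = true;
	ret = args->fn(args->data);
	machine_already_stopped = false;

	return ret;
}

static int recursive_stop_machine(cpu_stop_fn_t fn, void *data)
{
	struct sm_args args = { .fn = fn, .data = data };

	/* Already inside stop_machine(): call the function directly. */
	if (machine_already_stopped)
		return fn(data);

	return stop_machine(do_recursive_stop, &args, NULL);
}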
Re: [Intel-gfx] [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean any second VCS instance
> -Original Message- > From: Intel-gfx On Behalf Of > Joonas Lahtinen > Sent: Wednesday, April 18, 2018 3:43 AM > To: Intel-gfx@lists.freedesktop.org; Tvrtko Ursulin > Subject: Re: [Intel-gfx] [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean > any second VCS instance > > Quoting Tvrtko Ursulin (2018-04-18 12:33:42) > > From: Tvrtko Ursulin > > > > Currently our driver assumes BSD2 means hardware engine instance > number > > two. This does not work for Icelake parts with two VCS engines, but which > > are hardware instances 0 and 2, and not 0 and 1 as with previous parts. > > > > This makes the second engine not discoverable via HAS_BSD2 get param, > nor > > it can be targetted by execbuf. > > > > While we are working on the next generation execbuf put in a hack which > > allows discovery and access to this second VCS engine using legacy ABI. > > > > Signed-off-by: Tvrtko Ursulin > > Cc: Chris Wilson > > Cc: Jon Bloomfield > > Cc: Tony Ye > > --- > > Compile tested only. > > > > Also, one could argue if this is just a temporary hack or could actually > > make sense to have this permanently in. It feels like the ABI semantics of > > BSD2 are blurry, or at least could be re-blurred for Gen11. > > --- > > drivers/gpu/drm/i915/i915_drv.c| 8 +++- > > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 +- > > 2 files changed, 16 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > b/drivers/gpu/drm/i915/i915_drv.c > > index b7dbeba72dec..a185366d9beb 100644 > > --- a/drivers/gpu/drm/i915/i915_drv.c > > +++ b/drivers/gpu/drm/i915/i915_drv.c > > @@ -331,7 +331,13 @@ static int i915_getparam_ioctl(struct drm_device > *dev, void *data, > > value = !!dev_priv->engine[VECS]; > > break; > > case I915_PARAM_HAS_BSD2: > > - value = !!dev_priv->engine[VCS2]; > > + /* > > +* FIXME: Temporary hack for Icelake. > > +* > > +* Make semantics of HAS_BSD2 "has second", or "has two" > VDBOXes > > +* instead of "has VDBOX 2nd hardware instance". > > +*/ > > + value = dev_priv->engine[VCS2] || dev_priv->engine[VCS3]; > > There can be no temporary hacks for the uAPI... You either sign yourself > up to keep it working indefinitely or don't :) > > Regards, Joonas This doesn't really change the API does it? In fact I'd argue this is simply fixing a breakage in the API wrt to previous devices. It makes no sense to expose holes in the engine map to userspace. If a device has two useable VCS engines, HAS_BSD2 should reflect that, and the second engine (wherever it sits physically), should be addressable through the legacy BSD2 execbuf interface. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
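For reference, a hedged sketch of how userspace exercises the legacy ABI being discussed: discover the second video engine via I915_PARAM_HAS_BSD2, then steer an execbuf at it with the existing ring-selection flags. Only the flag plumbing is shown; building the execbuffer itself is out of scope here.

#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static int has_bsd2(int fd)
{
	int value = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_HAS_BSD2,
		.value = &value,
	};

	if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return 0;

	return value;
}

static void target_second_vcs(struct drm_i915_gem_execbuffer2 *execbuf)
{
	/* Select the BSD ring class, second instance. */
	execbuf->flags &= ~(I915_EXEC_RING_MASK | I915_EXEC_BSD_MASK);
	execbuf->flags |= I915_EXEC_BSD | I915_EXEC_BSD_RING2;
}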
Re: [Intel-gfx] [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean any second VCS instance
> -Original Message- > From: Tvrtko Ursulin > Sent: Wednesday, April 18, 2018 2:34 AM > To: Intel-gfx@lists.freedesktop.org > Cc: tursu...@ursulin.net; Ursulin, Tvrtko ; Chris > Wilson ; Bloomfield, Jon > ; Ye, Tony > Subject: [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean any second > VCS instance > > From: Tvrtko Ursulin > > Currently our driver assumes BSD2 means hardware engine instance number > two. This does not work for Icelake parts with two VCS engines, but which > are hardware instances 0 and 2, and not 0 and 1 as with previous parts. > > This makes the second engine not discoverable via HAS_BSD2 get param, nor > it can be targetted by execbuf. > > While we are working on the next generation execbuf put in a hack which > allows discovery and access to this second VCS engine using legacy ABI. > > Signed-off-by: Tvrtko Ursulin > Cc: Chris Wilson > Cc: Jon Bloomfield > Cc: Tony Ye I would advocate this patch being merged while the new execbuf API is being developed. Currently there is no way to submit to 2 engine skus with non-sequential engine id's. This doesn't introduce a new ABI, and there is no reason that I can see that the new execbuf solution couldn't be made backward compatible with this. Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean any second VCS instance
> -Original Message- > From: Tvrtko Ursulin > Sent: Friday, April 20, 2018 9:56 AM > To: Bloomfield, Jon ; Tvrtko Ursulin > ; Intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean > any second VCS instance > > > On 20/04/2018 15:19, Bloomfield, Jon wrote: > >> -Original Message- > >> From: Tvrtko Ursulin > >> Sent: Wednesday, April 18, 2018 2:34 AM > >> To: Intel-gfx@lists.freedesktop.org > >> Cc: tursu...@ursulin.net; Ursulin, Tvrtko ; Chris > >> Wilson ; Bloomfield, Jon > >> ; Ye, Tony > >> Subject: [PATCH] drm/i915/icl: Adjust BSD2 semantics to mean any second > >> VCS instance > >> > >> From: Tvrtko Ursulin > >> > >> Currently our driver assumes BSD2 means hardware engine instance > number > >> two. This does not work for Icelake parts with two VCS engines, but which > >> are hardware instances 0 and 2, and not 0 and 1 as with previous parts. > >> > >> This makes the second engine not discoverable via HAS_BSD2 get param, > nor > >> it can be targetted by execbuf. > >> > >> While we are working on the next generation execbuf put in a hack which > >> allows discovery and access to this second VCS engine using legacy ABI. > >> > >> Signed-off-by: Tvrtko Ursulin > >> Cc: Chris Wilson > >> Cc: Jon Bloomfield > >> Cc: Tony Ye > > I would advocate this patch being merged while the new execbuf API is > being > > developed. Currently there is no way to submit to 2 engine skus with non- > sequential > > engine id's. This doesn't introduce a new ABI, and there is no reason that I > can see > > that the new execbuf solution couldn't be made backward compatible with > this. > > It is a bit of a awkward period to commit to this permanently because it > only solves a subset of problem space and that makes it a hard sell in > that context. > > If there was legacy userspace which ran on 2 VCS Gen11 then maybe, but > otherwise I think best is just wait for the new execbuf API. Or in fact > would there be _any_ upstream userspace using this before the new > execbuf API happens? > Fair point. Will you be physically inhibiting this legacy ABI for gen11? If you intend to allow it it's worth merging, because right now it simply doesn't work. If you will never allow the legacy ABI, and will forcibly prevent it (hardcode HAS_BSD2==0, for gen11+), then I agree we may as well carry the patch as a delta until the new execbuf lands. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v8 12/35] drm/i915: Implement the HDCP2.2 support for DP
I'm not formally reviewing this series, but while glancing at it, I noticed > -Original Message- > From: Intel-gfx On Behalf Of > Ramalingam C > Sent: Tuesday, November 27, 2018 2:43 AM > To: intel-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; > daniel.vet...@ffwll.ch; Winkler, Tomas > Subject: [Intel-gfx] [PATCH v8 12/35] drm/i915: Implement the HDCP2.2 > support for DP > > Implements the DP adaptation specific HDCP2.2 functions. > > These functions perform the DPCD read and write for communicating the > HDCP2.2 auth message back and forth. > > v2: > wait for cp_irq is merged with this patch. Rebased. > v3: > wait_queue is used for wait for cp_irq [Chris Wilson] > v4: > Style fixed. > %s/PARING/PAIRING > Few style fixes [Uma] > v5: > Lookup table for DP HDCP2.2 msg details [Daniel]. > Extra lines are removed. > v6: > Rebased. > v7: > Fixed some regression introduced at v5. [Ankit] > Macro HDCP_2_2_RX_CAPS_VERSION_VAL is reused [Uma] > Converted a function to inline [Uma] > %s/uintxx_t/uxx > v8: > Error due to the sinks are reported as DEBUG logs. > Adjust to the new mei interface. > > Signed-off-by: Ramalingam C > Signed-off-by: Ankit K Nautiyal > Reviewed-by: Uma Shankar > --- > drivers/gpu/drm/i915/intel_dp.c | 338 > ++ > drivers/gpu/drm/i915/intel_drv.h | 7 + > drivers/gpu/drm/i915/intel_hdcp.c | 6 + > 3 files changed, 351 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_dp.c > b/drivers/gpu/drm/i915/intel_dp.c > index ecc4706db7dc..1cc82e490999 100644 > --- a/drivers/gpu/drm/i915/intel_dp.c > +++ b/drivers/gpu/drm/i915/intel_dp.c > @@ -31,6 +31,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -5347,6 +5348,27 @@ void intel_dp_encoder_suspend(struct > intel_encoder *intel_encoder) > pps_unlock(intel_dp); > } > > + > +static struct hdcp2_dp_msg_data *get_hdcp2_dp_msg_data(u8 msg_id) > +{ > + int i; > + > + for (i = 0; i < sizeof(hdcp2_msg_data); i++) Should be sizeof(hdcp2_msg_data) / sizeof(*hdcp2_msg_data) or equivalent. Has the failure return been tested? > + if (hdcp2_msg_data[i].msg_id == msg_id) > + return &hdcp2_msg_data[i]; > + > + return NULL; > +} > + > -- > 2.7.4 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
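A sketch of the fix suggested above, iterating over the number of table entries rather than the table's size in bytes by using the kernel's ARRAY_SIZE() helper (the hdcp2_msg_data table itself is as defined in the patch under review):

static struct hdcp2_dp_msg_data *get_hdcp2_dp_msg_data(u8 msg_id)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(hdcp2_msg_data); i++)
		if (hdcp2_msg_data[i].msg_id == msg_id)
			return &hdcp2_msg_data[i];

	return NULL;
}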
Re: [Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9
On 5/4/2017 9:51 AM, Kenneth Graunke wrote: On Thursday, May 4, 2017 7:47:21 AM PDT David Weinehall wrote: On Thu, May 04, 2017 at 10:35:33AM +0200, Arkadiusz Hiler wrote: Thanks for rephrasing - that's exactly what I am concerned with. Did you just use the MediaSDK as it is - meaning that MOCS entries beyond the set of the 3 we have defined had been naively utilized? If that's the case it is probably the cause of the performance difference - everything beyond "the 3" means UNCACHED. Can you try changing MediaSDK to only use entries that are already in? How the performance differs in that case? We're benchmarking using upstream MediaSDK without changes, since that's the only thing that's relevant. Customising benchmarks to get better results isn't really an acceptable solution :) Obviously fixing MediaSDK upstream is a different story, in case one of the three pre-defined entries we have turns out to be the best possible MOCS-settings for that workload. You're right about customizing benchmarks, but... MediaSDK is not a benchmark. If I'm not mistaken, it's a userspace driver produced by Intel engineers, one which Intel has the full capability to change. What you're saying is that Intel's MediaSDK engineers are unwilling to change their software to provide better performance for their Linux users. That's pretty mental. We don't warp the core operating system to work around userspace software simply because they don't want to change it. Agreed, that isn't the intention. This isn't about open vs. closed or internal vs. public projects, either. I work on a public userspace driver for Intel graphics. If I sent a kernel patch, the kernel developers would ask me the exact same questions, to justify my new additions: 1. Is your userspace actually using all these new additions? If not, which ones are you using? They would ask me to drop anything I wasn't actually using yet, because speculatively adding things to the kernel that we have to maintain backwards compatibility for has caused both kernel and userspace developers a lot of trouble. 2. Are you sure that you need them all? Is there a simpler solution - are some existing things good enough? What's the additional benefit of each new addition? I would have to answer these questions to the satisfaction of the kernel developers before they would even consider taking my patch. You keep pointing to your large performance improvement, but all it's shown is that actually using the GPU cache is faster than having a broken userspace driver explicitly set everything to uncached. Many people have pointed this out. Arek and Tvrtko have good suggestions. I don't think you're going to get anywhere with this until you demonstrate that the new MOCS entries provide some non-zero value over using the existing WB entries Here are a couple more data points: 1. We likely can't implement the documented "MOCS Version 1" table as is. The kernel exposes existing entries with specific semantics. Changing their meaning would introduce a backwards-incompatible change that would likely regress the performance of existing userspace. This is almost certainly unacceptable - our customers, distro partners, users, and even people like Linus Torvalds will suffer and complain loudly. The way the existing entries are exposed should not be changing. We could add the new entries at an offset - i.e. leave the existing 3 entries, and append the rest after that. 
But that would require changing userspace that assumes the Windows tables, such as MediaSDK (they would have to add 3 to their MOCS indexes). At which point, we're changing them, so...the "runs unaltered" argument falls over. The intention is certainly that the existing 3 entries already in use by open source user mode stack are not changed. This is additive to what is there. The BSpec should be reflecting that - if not that is an error. The Intel UMDs will change to support this. 2. The docs finally contain "recommended MOCS settings" - i.e. where to cache various types of objects, and at what age. However, I believe those recommendations can be implemented with 1-2 new table entries and a PTE change to be eLLC-only by default. Most of the table is completely unnecessary to implement the recommendations. I personally would like to try implementing their recommended settings in my driver. I have not had time yet, but plan to try. I'm very glad to see the Windows MOCS recommendations documented. I'd been asking for that information for literally years. If we'd gotten it earlier, a lot of mess could have been avoided. For future platforms, we may want to coordinate and use the same table. But Gen9 has been shipping for ages, and we don't have that luxury. I would hope that is not the case and the change can be made for gen9, as it should have no impact on e
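For reference, "the 3" entries discussed above are the ones fixed by the existing uAPI enum, sketched below from include/uapi/drm/i915_drm.h; the point being argued is that any new gen9 entries would have to be appended after these rather than redefining them.

enum i915_mocs_table_index {
	I915_MOCS_UNCACHED,	/* not cached anywhere; CPU/GPU coherency not required */
	I915_MOCS_PTE,		/* cacheability follows the PTE / set-caching API */
	I915_MOCS_CACHED,	/* cached in all GPU caches available on the platform */
};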
Re: [Intel-gfx] [PATCH 3/6] drm/i915/gtt: Disable read-only support under GVT
> -Original Message- > From: Chris Wilson > Sent: Thursday, July 12, 2018 11:53 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Zhenyu Wang > ; Bloomfield, Jon ; > Joonas Lahtinen ; Matthew Auld > > Subject: [PATCH 3/6] drm/i915/gtt: Disable read-only support under GVT > > GVT is not propagating the PTE bits, and is always setting the > read-write bit, thus breaking read-only support. > > Signed-off-by: Chris Wilson > Cc: Zhenyu Wang > Cc: Jon Bloomfield > Cc: Joonas Lahtinen > Cc: Matthew Auld > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 6c0b438afe46..8e70a45b8a90 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -1662,8 +1662,12 @@ static struct i915_hw_ppgtt > *gen8_ppgtt_create(struct drm_i915_private *i915) > 1ULL << 48 : > 1ULL << 32; > > - /* From bdw, there is support for read-only pages in the PPGTT */ > - ppgtt->vm.has_read_only = true; > + /* > + * From bdw, there is support for read-only pages in the PPGTT. > + * > + * XXX GVT is not setting honouring the PTE bits. > + */ > + ppgtt->vm.has_read_only = !intel_vgpu_active(i915); > > i915_address_space_init(&ppgtt->vm, i915); > > -- > 2.18.0 Is there a blocker that prevents gvt respecting this bit? I can't think of an obvious reason why it would be a bad thing to support. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/6] drm/i915/gtt: Disable read-only support under GVT
> -Original Message- > From: Chris Wilson > Sent: Friday, July 13, 2018 1:06 AM > To: Bloomfield, Jon ; Zhenyu Wang > > Cc: intel-gfx@lists.freedesktop.org; Zhenyu Wang > ; Joonas Lahtinen > ; Matthew Auld > > Subject: Re: [PATCH 3/6] drm/i915/gtt: Disable read-only support under GVT > > Quoting Zhenyu Wang (2018-07-13 03:03:10) > > On 2018.07.12 20:36:03 +, Bloomfield, Jon wrote: > > > > -Original Message- > > > > From: Chris Wilson > > > > Sent: Thursday, July 12, 2018 11:53 AM > > > > To: intel-gfx@lists.freedesktop.org > > > > Cc: Chris Wilson ; Zhenyu Wang > > > > ; Bloomfield, Jon > ; > > > > Joonas Lahtinen ; Matthew Auld > > > > > > > > Subject: [PATCH 3/6] drm/i915/gtt: Disable read-only support under > GVT > > > > > > > > GVT is not propagating the PTE bits, and is always setting the > > > > read-write bit, thus breaking read-only support. > > > > > > > > Signed-off-by: Chris Wilson > > > > Cc: Zhenyu Wang > > > > Cc: Jon Bloomfield > > > > Cc: Joonas Lahtinen > > > > Cc: Matthew Auld > > > > --- > > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 8 ++-- > > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > index 6c0b438afe46..8e70a45b8a90 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > @@ -1662,8 +1662,12 @@ static struct i915_hw_ppgtt > > > > *gen8_ppgtt_create(struct drm_i915_private *i915) > > > > 1ULL << 48 : > > > > 1ULL << 32; > > > > > > > > - /* From bdw, there is support for read-only pages in the PPGTT */ > > > > - ppgtt->vm.has_read_only = true; > > > > + /* > > > > +* From bdw, there is support for read-only pages in the PPGTT. > > > > +* > > > > +* XXX GVT is not setting honouring the PTE bits. > > > > +*/ > > > > + ppgtt->vm.has_read_only = !intel_vgpu_active(i915); > > > > > > > > i915_address_space_init(&ppgtt->vm, i915); > > > > > > > > -- > > > > 2.18.0 > > > > > > Is there a blocker that prevents gvt respecting this bit? I can't think > > > of an > obvious > > > reason why it would be a bad thing to support. > > > > I don't think any blocker for gvt support, we can respect that bit when > shadowing. > > But we need capability check on host gvt when that support is ready. > > Another cap bit required, so ack on both sides? > -Chris I see. Not as permanent disable, just more plumbing needed. I'm happy then :-) Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
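A hedged sketch of what the extra capability check mentioned above might look like once GVT-g can honour the read-only bit while shadowing; intel_vgpu_has_read_only_ptes() is a hypothetical helper standing in for whatever capability bit the GVT interface eventually grows.

static bool ppgtt_supports_read_only(struct drm_i915_private *i915)
{
	if (!intel_vgpu_active(i915))
		return true;	/* bare metal: bdw+ PPGTT PTEs honour the RO bit */

	return intel_vgpu_has_read_only_ptes(i915);	/* hypothetical cap query */
}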
Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
>>>>>>> may be significant.) > >>>>>>>>> > >>>>>>>>> If we go back thinking about the containers use case, then it > >>>>>>>>> transpires that even though the "or" policy does prevent one > >>>>>>>>> container > >>>>>>>>> from affecting the other from one angle, it also prevents one > >>>>>>>>> container from exercising the feature unless all containers > >>>>>>>>> co-operate. > >>>>>>>>> > >>>>>>>>> As such, we can view the original problem statement where we > have an > >>>>>>>>> issue if not everyone co-operates, as conceptually the same just > >>>>>>>>> from > >>>>>>>>> an opposite angle. (Rather than one container incurring the > >>>>>>>>> increased > >>>>>>>>> cost of context switches to the rest, we would have one > container > >>>>>>>>> preventing the optimized slice configuration to the other.) > >>>>>>>>> > >>>>>>>>> From this follows that both proposals require complete > >>>>>>>>> co-operation > >>>>>>>>> from all running userspace to avoid complete control of the > feature. > >>>>>>>>> > >>>>>>>>> Since the balance between the benefit of optimized slice > >>>>>>>>> configuration > >>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased > >>>>>>>>> context switch times, cannot be know by the driver (barring > >>>>>>>>> venturing > >>>>>>>>> into the heuristics territory), that is another reason why I find > >>>>>>>>> the > >>>>>>>>> "or" policy in the driver questionable. > >>>>>>>>> > >>>>>>>>> We can also ask a question of - If we go with the "or" policy, why > >>>>>>>>> require N per-context ioctls to modify the global GPU > configuration > >>>>>>>>> and not instead add a global driver ioctl to modify the state? > >>>>>>>>> > >>>>>>>>> If a future hardware requires, or enables, the per-context > behaviour > >>>>>>>>> in a more efficient way, we could then revisit the problem space. > >>>>>>>>> > >>>>>>>>> In the mean time I see the "or" policy solution as adding some > ABI > >>>>>>>>> which doesn't do anything for many use cases without any way > for the > >>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at > least > >>>>>>>>> enables the sysadmin to make a decision. Here I am thinking > about a > >>>>>>>>> random client environment where not all userspace co- > operates, > >>>>>>>>> but for > >>>>>>>>> instance user is running the feature aware media stack, and > >>>>>>>>> non-feature aware OpenCL/3d stack. > >>>>>>>>> > >>>>>>>>> I guess the complete story boils down to - is the master sysfs > knob > >>>>>>>>> really a problem in container use cases. > >>>>>>>>> > >>>>>>>>> Regards, > >>>>>>>>> > >>>>>>>>> Tvrtko > >>>>>>>> Hey Tvrtko, > >>>>>>>> > >>>>>>>> Thanks for summarizing a bunch of discussions. > >>>>>>>> Essentially I agree with every you wrote above. > >>>>>>>> > >>>>>>>> If we have a global setting (determined by the OR policy), what's > the > >>>>>>>> point of per context settings? > >>>>>>>> > >>>>>>>> In Dmitry's scenario, all userspace applications will work > >>>>>>>> together to > >>>>>>>> reach the consensus so it sounds like we're reimplementing the > policy > >>>>>>>> that is already existing in userspace. > >>>>>>>> > >>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully > somebody else > >>>>>>>> than me pick one or the other :) > >>>>>>> I'll just mention the voting/consensus approach to see if anyone > else > >>>>>>> likes it. > >>>>>>> > >>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, > dontcare, large } > >>>>>>> (or some other abstract names). 
> >>>>>> Yeah, the param name should have the word _HINT_ in it when it's > not a > >>>>>> definitive set. > >>>>>> > >>>>>> There's no global setter across containers, only a scenario when > >>>>>> everyone agrees or not. Tallying up the votes and going with a > majority > >>>>>> vote might be an option, too. > >>>>>> > >>>>>> Regards, Joonas > >>>>> Trying to test the "everyone agrees" approach here. > >>>> It's not everyone agrees, but the greater good. > >>> > >>> I'm looking forward to the definition of the greater good :) > >>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just > >>> stepping into it. > >> > >> I am not sure that "small, dontcare, large" models brings much. No one > >> would probably set "dontcare" since we have to default new contexts to > >> large to be compatible. > >> > >> Don't know, I still prefer the master knob option. Honestly don't yet > >> see the containers use case as a problem. There is always a master > >> domain in any system where the knob can be enabled if the customers on > >> the system are such to warrant it. On mixed systems enable it or not > >> someone always suffers. And with the knob we are free to add heuristics > >> later, keep the uapi, and just default the knob to on. > > > > Master knob effectively means dead code behind a switch, that's not very > > upstream friendly. > > Hey at least I wasn't proposing a modparam! :))) > > Yes it is not the best software practice, upstream or not, however I am > trying to be pragmatical here and choose the simplest, smallest, good > enough, and least headache inducing in the future solution. > > One way of of looking at the master switch could be like tune your > system for XYZ - change CPU frequency governor, disable SATA link > saving, allow i915 media optimized mode. Some live code out, some dead > code in. > > But perhaps discussion is moot since we don't have userspace anyway. > > Regards, > > Tvrtko I'm surprised by the "no deadcode behind knobs comment". We do this all the time - "display=0" anyone? Or "enable_cmdparser=false"? Allowing user space to reduce EU performance for the system as a whole is not a great idea imho. Only sysadmin should have the right to let that happen, and an admin switch (I WOULD go with a modparam personally) is a good way to ensure that we can get the optimal media configuration for dedicated media systems, while for general systems we get the best overall performance (not just favouring media). Why would we want to over engineer this? Jon ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
Gratuitous top posting to re-kick the thread. For Gen11 we can't have an on/off switch anyway (media simply won't run with an incompatible slice config), so let's agree on an API to allow userland to select the slice configuration for its contexts, for Gen11 only. I'd prefer a simple {MediaConfig/GeneralConfig} type context param but I really don't care that much. For gen9/10 it's arguable whether we need this at all. The effect on media workloads varies, but I'm guessing normal users (outside a transcode type environment) will never know they're missing anything. Either way, we can continue discussing while we progress the gen11 solution. Jon > -Original Message- > From: Intel-gfx On Behalf Of > Bloomfield, Jon > Sent: Wednesday, July 18, 2018 9:44 AM > To: Tvrtko Ursulin ; Joonas Lahtinen > ; Chris Wilson ; > Landwerlin, Lionel G ; intel- > g...@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let > users > set sseu configs
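As a strawman for the Gen11-only context parameter suggested above, a userspace sketch loosely modelled on the I915_CONTEXT_PARAM_SSEU proposal from this series might look like the following. The field values are placeholders; a real caller would derive the masks from the device topology.

#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static int context_request_media_sseu(int fd, uint32_t ctx_id)
{
	struct drm_i915_gem_context_param_sseu sseu;
	struct drm_i915_gem_context_param arg;

	memset(&sseu, 0, sizeof(sseu));
	sseu.engine.engine_class = I915_ENGINE_CLASS_RENDER;
	sseu.engine.engine_instance = 0;
	/* Placeholder masks: derive these from the device in real code. */
	sseu.slice_mask = 0x1;
	sseu.subslice_mask = 0x7;
	sseu.min_eus_per_subslice = 8;
	sseu.max_eus_per_subslice = 8;

	memset(&arg, 0, sizeof(arg));
	arg.ctx_id = ctx_id;
	arg.param = I915_CONTEXT_PARAM_SSEU;
	arg.size = sizeof(sseu);
	arg.value = (uintptr_t)&sseu;

	return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);
}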
Re: [Intel-gfx] [RFC] drm/i915: Engine discovery query
> -Original Message- > From: Tvrtko Ursulin [mailto:tursu...@ursulin.net] > Sent: Wednesday, March 14, 2018 7:06 AM > To: Intel-gfx@lists.freedesktop.org > Cc: tursu...@ursulin.net; Ursulin, Tvrtko ; Chris > Wilson ; Bloomfield, Jon > ; Rogozhkin, Dmitry V > ; Landwerlin, Lionel G > ; Joonas Lahtinen > > Subject: [RFC] drm/i915: Engine discovery query > > From: Tvrtko Ursulin > > Engine discovery query allows userspace to enumerate engines, probe their > configuration features, all without needing to maintain the internal PCI > ID based database. > > A new query for the generic i915 query ioctl is added named > DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure > drm_i915_query_engine_info. The address of latter should be passed to the > kernel in the query.data_ptr field, and should be large enough for the > kernel to fill out all known engines as struct drm_i915_engine_info > elements trailing the query. > > As with other queries, setting the item query length to zero allows > userspace to query minimum required buffer size. > > Enumerated engines have common type mask which can be used to query all > hardware engines, versus engines userspace can submit to using the execbuf > uAPI. > > Engines also have capabilities which are per engine class namespace of > bits describing features not present on all engine instances. > > Signed-off-by: Tvrtko Ursulin > Cc: Chris Wilson > Cc: Jon Bloomfield > Cc: Dmitry Rogozhkin > Cc: Lionel Landwerlin > Cc: Joonas Lahtinen > --- > This is RFC for now since it is not very usable before the new execbuf API > or virtual engine queue submission gets closer. > > In this version I have added capability of distinguishing between hardware > engines and ABI engines. This is to account for probable future use cases > like virtualization, where guest might only see one engine instance, but > will want to know overall hardware capabilities in order to tune its > behaviour. The patch looks good, but... I can't see any use for exposing the unreachable engines. Can you elaborate on why a umd running in a VM would need to know about the engines assigned to other VM's ? Is it even desirable to leak the physical capabilities to VM's ? In general a VM would have a very limited view of the underlying hardware. It's unlikely to even be capable of discovering true h/w engine counts. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
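For context, a hedged sketch of how userspace would consume the proposed query: a first pass with length zero asks the kernel for the required buffer size, a second pass fills it. Structure and item names follow the RFC (DRM_I915_QUERY_ENGINE_INFO, struct drm_i915_query_engine_info) and may differ in detail from what eventually lands.

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

static struct drm_i915_query_engine_info *query_engines(int fd)
{
	struct drm_i915_query_engine_info *info;
	struct drm_i915_query_item item;
	struct drm_i915_query query;

	memset(&item, 0, sizeof(item));
	item.query_id = DRM_I915_QUERY_ENGINE_INFO;

	memset(&query, 0, sizeof(query));
	query.num_items = 1;
	query.items_ptr = (uintptr_t)&item;

	/* First pass: length == 0 asks the kernel for the required size. */
	if (drmIoctl(fd, DRM_IOCTL_I915_QUERY, &query) || item.length <= 0)
		return NULL;

	info = calloc(1, item.length);
	if (!info)
		return NULL;

	/* Second pass: actually fill the buffer. */
	item.data_ptr = (uintptr_t)info;
	if (drmIoctl(fd, DRM_IOCTL_I915_QUERY, &query)) {
		free(info);
		return NULL;
	}

	return info;
}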
Re: [Intel-gfx] [RFC 0/8] Force preemption
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of Jeff McGee > Sent: Thursday, March 22, 2018 12:09 PM > To: Tvrtko Ursulin > Cc: Kondapally, Kalyan ; intel- > g...@lists.freedesktop.org; b...@bwidawsk.net > Subject: Re: [Intel-gfx] [RFC 0/8] Force preemption > > On Thu, Mar 22, 2018 at 05:41:57PM +, Tvrtko Ursulin wrote: > > > > On 22/03/2018 16:01, Jeff McGee wrote: > > >On Thu, Mar 22, 2018 at 03:57:49PM +, Tvrtko Ursulin wrote: > > >> > > >>On 22/03/2018 14:34, Jeff McGee wrote: > > >>>On Thu, Mar 22, 2018 at 09:28:00AM +, Chris Wilson wrote: > > Quoting Tvrtko Ursulin (2018-03-22 09:22:55) > > > > > >On 21/03/2018 17:26, jeff.mc...@intel.com wrote: > > >>From: Jeff McGee > > >> > > >>Force preemption uses engine reset to enforce a limit on the time > > >>that a request targeted for preemption can block. This feature is > > >>a requirement in automotive systems where the GPU may be > shared by > > >>clients of critically high priority and clients of low priority that > > >>may not have been curated to be preemption friendly. There may > be > > >>more general applications of this feature. I'm sharing as an RFC to > > >>stimulate that discussion and also to get any technical feedback > > >>that I can before submitting to the product kernel that needs this. > > >>I have developed the patches for ease of rebase, given that this is > > >>for the moment considered a non-upstreamable feature. It would > be > > >>possible to refactor hangcheck to fully incorporate force > preemption > > >>as another tier of patience (or impatience) with the running > request. > > > > > >Sorry if it was mentioned elsewhere and I missed it - but does this > work > > >only with stateless clients - or in other words, what would happen to > > >stateful clients which would be force preempted? Or the answer is > we > > >don't care since they are misbehaving? > > > > They get notified of being guilty for causing a gpu reset; three strikes > > and they are out (banned from using the gpu) using the current rules. > > This is a very blunt hammer that requires the rest of the system to be > > robust; one might argue time spent making the system robust would > be > > better served making sure that the timer never expired in the first > place > > thereby eliminating the need for a forced gpu reset. > > -Chris > > >>> > > >>>Yes, for simplication the policy applied to force preempted contexts > > >>>is the same as for hanging contexts. It is known that this feature > > >>>should not be required in a fully curated system. It's a requirement > > >>>if end user will be alllowed to install 3rd party apps to run in the > > >>>non-critical domain. > > >> > > >>My concern is whether it safe to call this force _preemption_, while > > >>it is not really expected to work as preemption from the point of > > >>view of preempted context. I may be missing some angle here, but I > > >>think a better name would include words like maximum request > > >>duration or something. > > >> > > >>I can see a difference between allowed maximum duration when there > > >>is something else pending, and when it isn't, but I don't > > >>immediately see that we should consider this distinction for any > > >>real benefit? > > >> > > >>So should the feature just be "maximum request duration"? This would > > >>perhaps make it just a special case of hangcheck, which ignores head > > >>progress, or whatever we do in there. > > >> > > >>Regards, > > >> > > >>Tvrtko > > > > > >I think you might be unclear about how this works. 
We're not starting a > > >preemption to see if we can cleanly remove a request who has begun to > > >exceed its normal time slice, i.e. hangcheck. This is about bounding > > >the time that a normal preemption can take. So first start preemption > > >in response to higher-priority request arrival, then wait for preemption > > >to complete within a certain amount of time. If it does not, resort to > > >reset. > > > > > >So it's really "force the resolution of a preemption", shortened to > > >"force preemption". > > > > You are right, I veered off in my thinking and ended up with > > something different. :) > > > > I however still think the name is potentially misleading, since the > > request/context is not getting preempted. It is getting effectively > > killed (sooner or later, directly or indirectly). > > > > Maybe that is OK for the specific use case when everything is only > > broken and not malicious. > > > > In a more general purpose system it would be a bit random when > > something would work, and when it wouldn't, depending on system > > setup and even timings. > > > > Hm, maybe you don't even really benefit from the standard three > > strikes and you are out policy, and for this specific use case, you > > should just kill it straight away. If it couldn't be preempted once, > > why pay
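To restate the mechanism being described in code form, here is a small, self-contained sketch of "force the resolution of a preemption": an expiry is armed when preemption is requested, and if the outgoing request has not yielded by the deadline the engine is reset. All names and the timer plumbing are illustrative, not the RFC's actual implementation.

#include <stdbool.h>
#include <stdint.h>

struct fpreempt_state {
	bool preemption_pending;	/* a higher-priority request is waiting */
	uint64_t deadline_ns;		/* when to give up and reset the engine */
};

/* Called when a higher-priority request triggers a preemption. */
static void force_preempt_begin(struct fpreempt_state *s,
				uint64_t now_ns, uint64_t timeout_ns)
{
	s->preemption_pending = true;
	s->deadline_ns = now_ns + timeout_ns;
}

/* Called from the preemption-completed interrupt. */
static void force_preempt_complete(struct fpreempt_state *s)
{
	s->preemption_pending = false;
}

/* Called from a timer tick: returns true if the engine must be reset. */
static bool force_preempt_expired(const struct fpreempt_state *s,
				  uint64_t now_ns)
{
	return s->preemption_pending && now_ns >= s->deadline_ns;
}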
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Prevent writing into a read-only object via a GGTT mmap
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 5:00 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Bloomfield, Jon > ; Joonas Lahtinen > ; Matthew Auld > > Subject: [PATCH 3/5] drm/i915: Prevent writing into a read-only object via a > GGTT mmap > > If the user has created a read-only object, they should not be allowed > to circumvent the write protection by using a GGTT mmapping. Deny it. > > Also most machines do not support read-only GGTT PTEs, so again we have > to reject attempted writes. Fortunately, this is known a priori, so we > can at least reject in the call to create the mmap with backup in the > fault handler. This is a little draconian as we could blatantly ignore > the write protection on the pages, but it is far simply to keep the > readonly object pure. (It is easier to lift a restriction than to impose > it later!) Are you sure this is necessary? I assumed you would just create a ro IA mapping to the page, irrespective of the ability of ggtt. It feels wrong to disallow mapping a read-only object to the CPU as read-only. With ppgtt the presence of an unprotected mapping in the ggtt should be immune from tampering in the GT, so only the cpu mapping should really matter. > > Signed-off-by: Chris Wilson > Cc: Jon Bloomfield > Cc: Joonas Lahtinen > Cc: Matthew Auld > --- ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Prevent writing into a read-only object via a GGTT mmap
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 8:00 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Matthew Auld > > Subject: RE: [PATCH 3/5] drm/i915: Prevent writing into a read-only object via > a GGTT mmap > > Quoting Bloomfield, Jon (2018-06-14 15:53:13) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Thursday, June 14, 2018 5:00 AM > > > To: intel-gfx@lists.freedesktop.org > > > Cc: Chris Wilson ; Bloomfield, Jon > > > ; Joonas Lahtinen > > > ; Matthew Auld > > > > > > Subject: [PATCH 3/5] drm/i915: Prevent writing into a read-only object via > a > > > GGTT mmap > > > > > > If the user has created a read-only object, they should not be allowed > > > to circumvent the write protection by using a GGTT mmapping. Deny it. > > > > > > Also most machines do not support read-only GGTT PTEs, so again we > have > > > to reject attempted writes. Fortunately, this is known a priori, so we > > > can at least reject in the call to create the mmap with backup in the > > > fault handler. This is a little draconian as we could blatantly ignore > > > the write protection on the pages, but it is far simply to keep the > > > readonly object pure. (It is easier to lift a restriction than to impose > > > it later!) > > Are you sure this is necessary? I assumed you would just create a ro IA > > mapping to the page, irrespective of the ability of ggtt. > > You are thinking of the CPU mmap? The GTT mmap offers a linear view of > the tiled object. It would be very wrong for us to bypass the PROT_READ > protection of a user page by accessing it via the GTT. No, I was thinking of gtt mmap. That requires both GTT and IA PTE mappings right? One to map the object into the GTT, and then a second to point the IA at the aperture. Why wouldn't marking the IA mapping RO protect the object if the GT cannot reach the GTT mapping (from user batches). > > > It feels wrong to > > disallow mapping a read-only object to the CPU as read-only. With ppgtt > > the presence of an unprotected mapping in the ggtt should be immune > > from tampering in the GT, so only the cpu mapping should really matter. > > And the CPU mapping has its protection bits on the IA PTE. > -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 3/5] drm/i915: Prevent writing into a read-only object via a GGTT mmap
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 8:22 AM > To: Bloomfield, Jon ; intel- > g...@lists.freedesktop.org > Cc: Joonas Lahtinen ; Matthew Auld > > Subject: RE: [PATCH 3/5] drm/i915: Prevent writing into a read-only object via > a GGTT mmap > > Quoting Bloomfield, Jon (2018-06-14 16:06:40) > > > -Original Message- > > > From: Chris Wilson > > > Sent: Thursday, June 14, 2018 8:00 AM > > > To: Bloomfield, Jon ; intel- > > > g...@lists.freedesktop.org > > > Cc: Joonas Lahtinen ; Matthew Auld > > > > > > Subject: RE: [PATCH 3/5] drm/i915: Prevent writing into a read-only > object via > > > a GGTT mmap > > > > > > Quoting Bloomfield, Jon (2018-06-14 15:53:13) > > > > > -Original Message- > > > > > From: Chris Wilson > > > > > Sent: Thursday, June 14, 2018 5:00 AM > > > > > To: intel-gfx@lists.freedesktop.org > > > > > Cc: Chris Wilson ; Bloomfield, Jon > > > > > ; Joonas Lahtinen > > > > > ; Matthew Auld > > > > > > > > > > Subject: [PATCH 3/5] drm/i915: Prevent writing into a read-only > object via > > > a > > > > > GGTT mmap > > > > > > > > > > If the user has created a read-only object, they should not be allowed > > > > > to circumvent the write protection by using a GGTT mmapping. Deny > it. > > > > > > > > > > Also most machines do not support read-only GGTT PTEs, so again we > > > have > > > > > to reject attempted writes. Fortunately, this is known a priori, so we > > > > > can at least reject in the call to create the mmap with backup in the > > > > > fault handler. This is a little draconian as we could blatantly ignore > > > > > the write protection on the pages, but it is far simply to keep the > > > > > readonly object pure. (It is easier to lift a restriction than to > > > > > impose > > > > > it later!) > > > > Are you sure this is necessary? I assumed you would just create a ro IA > > > > mapping to the page, irrespective of the ability of ggtt. > > > > > > You are thinking of the CPU mmap? The GTT mmap offers a linear view of > > > the tiled object. It would be very wrong for us to bypass the PROT_READ > > > protection of a user page by accessing it via the GTT. > > No, I was thinking of gtt mmap. That requires both GTT and IA PTE > mappings > > right? One to map the object into the GTT, and then a second to point the > > IA at the aperture. Why wouldn't marking the IA mapping RO protect the > > object if the GT cannot reach the GTT mapping (from user batches). > > Hmm. I keep forgetting that we can get at the vma from mmap(), because > that's hidden away elsewhere and only see i915_gem_fault() on a daily > basis. Hmm, is legal to read a PROT_READ-only vma is PROT_WRITE is > requested, or are meant to return -EINVAL? > -Chris That's a trickier question :-) My instinct in -EINVAL if the user specifies PROT_WRITE, but allowed if they only PROT_READ, and ppgtt is enabled (including aliased) so that userspace can't see the gtt mapping from the GT. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
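A minimal sketch of the policy Jon describes above, assuming a hypothetical object_is_readonly flag (the series itself keys off the object's read-only state and the vma flags; the helper name below is made up for illustration):

#include <linux/mm.h>
#include <linux/errno.h>

/* Called from the mmap path: refuse a writable CPU mapping of a
 * read-only object up front instead of faulting later with SIGBUS.
 * A PROT_READ-only mapping is still allowed. */
static int check_readonly_mmap(bool object_is_readonly,
			       const struct vm_area_struct *vma)
{
	if (object_is_readonly && (vma->vm_flags & VM_WRITE))
		return -EINVAL;

	return 0;
}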
Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent writing into a read-only object via a GGTT mmap
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 9:07 AM > To: intel-gfx@lists.freedesktop.org > Cc: dri-de...@lists.freedesktop.org; Chris Wilson ; > Bloomfield, Jon ; Joonas Lahtinen > ; Matthew Auld > ; David Herrmann > > Subject: [PATCH v2] drm/i915: Prevent writing into a read-only object via a > GGTT mmap > > If the user has created a read-only object, they should not be allowed > to circumvent the write protection by using a GGTT mmapping. Deny it. > > Also most machines do not support read-only GGTT PTEs, so again we have > to reject attempted writes. Fortunately, this is known a priori, so we > can at least reject in the call to create the mmap (with a sanity check > in the fault handler). > > v2: Check the vma->vm_flags during mmap() to allow readonly access. > > Signed-off-by: Chris Wilson > Cc: Jon Bloomfield > Cc: Joonas Lahtinen > Cc: Matthew Auld > Cc: David Herrmann Shame about the BUG_ON, but probably overkill to add code to suppress the RO flag just for mmap. Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 5/5] drm/i915/userptr: Enable read-only support on gen8+
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 5:00 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Bloomfield, Jon > ; Joonas Lahtinen > > Subject: [PATCH 5/5] drm/i915/userptr: Enable read-only support on gen8+ > > On gen8 and onwards, we can mark GPU accesses through the ppGTT as > being > read-only, that is cause any GPU write onto that page to be discarded > (not triggering a fault). This is all that we need to finally support > the read-only flag for userptr! > > Testcase: igt/gem_userptr_blits/readonly* > Signed-off-by: Chris Wilson > Cc: Jon Bloomfield > Cc: Joonas Lahtinen > --- Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
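The hardware mechanism the commit message relies on can be illustrated with a simplified gen8 PTE encoder: a read-only GPU mapping is just a present PTE without the RW bit. This is a sketch, not the driver's actual function; the name and the constant names are invented, though the bit usage mirrors the _PAGE_PRESENT | _PAGE_RW encoding visible in the gen8 patch quoted later in this archive.

#include <linux/types.h>

typedef u64 gen8_pte_t;

#define GEN8_PAGE_PRESENT	(1ULL << 0)
#define GEN8_PAGE_RW		(1ULL << 1)

/* On gen8+ ppGTT, clearing the RW bit makes the GPU discard writes to
 * the page rather than fault, which is what makes a read-only userptr
 * mapping safe to hand to untrusted batches. */
static gen8_pte_t gen8_pte_encode_sketch(u64 addr, bool readonly)
{
	gen8_pte_t pte = addr | GEN8_PAGE_PRESENT;

	if (!readonly)
		pte |= GEN8_PAGE_RW;

	return pte;
}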
Re: [Intel-gfx] [PATCH 4/5] drm/i915: Reject attempted pwrites into a read-only object
> -Original Message- > From: Chris Wilson > Sent: Thursday, June 14, 2018 5:00 AM > To: intel-gfx@lists.freedesktop.org > Cc: Chris Wilson ; Bloomfield, Jon > ; Joonas Lahtinen > ; Matthew Auld > > Subject: [PATCH 4/5] drm/i915: Reject attempted pwrites into a read-only > object > > If the user created a read-only object, they should not be allowed to > circumvent the write protection using the pwrite ioctl. > > Signed-off-by: Chris Wilson > Cc: Jon Bloomfield > Cc: Joonas Lahtinen > Cc: Matthew Auld > --- Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT
BXT requires accesses to the GTT (i.e. PTE updates) to be serialized when IOMMU is enabled. This patch guarantees this by wrapping all updates in stop_machine and using a flushing read to guarantee that the GTT writes have reached their destination before restarting.

Signed-off-by: Jon Bloomfield
Signed-off-by: John Harrison
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 106
 1 file changed, 106 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7c769d7..6360d92 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2191,6 +2191,100 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 		gen8_set_pte(&gtt_base[i], scratch_pte);
 }
 
+#ifdef CONFIG_INTEL_IOMMU
+struct insert_page {
+	struct i915_address_space *vm;
+	dma_addr_t addr;
+	u64 offset;
+	enum i915_cache_level level;
+};
+
+static int gen8_ggtt_insert_page__cb(void *_arg)
+{
+	struct insert_page *arg = _arg;
+
+	struct drm_i915_private *dev_priv = arg->vm->i915;
+
+	gen8_ggtt_insert_page(arg->vm, arg->addr,
+			      arg->offset, arg->level, 0);
+
+	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+
+	return 0;
+}
+
+static void gen8_ggtt_insert_page__BKL(struct i915_address_space *vm,
+				       dma_addr_t addr,
+				       u64 offset,
+				       enum i915_cache_level level,
+				       u32 unused)
+{
+	struct insert_page arg = { vm, addr, offset, level };
+
+	stop_machine(gen8_ggtt_insert_page__cb, &arg, NULL);
+}
+
+
+struct insert_entries {
+	struct i915_address_space *vm;
+	struct sg_table *st;
+	u64 start;
+	enum i915_cache_level level;
+};
+
+static int gen8_ggtt_insert_entries__cb(void *_arg)
+{
+	struct insert_entries *arg = _arg;
+
+	struct drm_i915_private *dev_priv = arg->vm->i915;
+
+	gen8_ggtt_insert_entries(arg->vm, arg->st,
+				 arg->start, arg->level, 0);
+
+	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+
+	return 0;
+}
+
+static void gen8_ggtt_insert_entries__BKL(struct i915_address_space *vm,
+					  struct sg_table *st,
+					  u64 start,
+					  enum i915_cache_level level,
+					  u32 unused)
+{
+	struct insert_entries arg = { vm, st, start, level };
+
+	stop_machine(gen8_ggtt_insert_entries__cb, &arg, NULL);
+}
+
+struct clear_range {
+	struct i915_address_space *vm;
+	u64 start;
+	u64 length;
+};
+
+static int gen8_ggtt_clear_range__cb(void *_arg)
+{
+	struct clear_range *arg = _arg;
+
+	struct drm_i915_private *dev_priv = arg->vm->i915;
+
+	gen8_ggtt_clear_range(arg->vm, arg->start, arg->length);
+
+	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+
+	return 0;
+}
+
+static void gen8_ggtt_clear_range__BKL(struct i915_address_space *vm,
+				       u64 start,
+				       u64 length)
+{
+	struct clear_range arg = { vm, start, length };
+	stop_machine(gen8_ggtt_clear_range__cb, &arg, NULL);
+}
+#endif
+
 static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  u64 start, u64 length)
 {
@@ -2789,6 +2883,18 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ggtt->base.insert_entries = gen8_ggtt_insert_entries;
 
+#ifdef CONFIG_INTEL_IOMMU
+	/* Serialize GTT updates on BXT if VT-d is on. */
+	if (IS_BROXTON(dev_priv) && intel_iommu_gfx_mapped) {
+		ggtt->base.insert_entries = gen8_ggtt_insert_entries__BKL;
+		ggtt->base.insert_page    = gen8_ggtt_insert_page__BKL;
+		if (!USES_FULL_PPGTT(dev_priv) ||
+		    intel_scanout_needs_vtd_wa(dev_priv)) {
+			ggtt->base.clear_range = gen8_ggtt_clear_range__BKL;
+		}
+	}
+#endif
+
 	ggtt->invalidate = gen6_ggtt_invalidate;
 
 	return ggtt_probe_common(ggtt, size);
-- 
2.7.4
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
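For context on the mechanism the patch leans on: stop_machine() runs its callback with every other online CPU parked in a known busy-wait state, so nothing else can start an aperture access while the callback writes PTEs and then drains the GAM fifo with an uncached read. A generic sketch of the pattern, with illustrative names:

#include <linux/stop_machine.h>

struct pte_update { /* whatever the update needs */ };

static int update_ptes_cb(void *data)
{
	struct pte_update *upd = data;

	/* 1) write the GGTT PTEs described by 'upd' */
	/* 2) uncached MMIO read from any GAM register, so the PTE writes
	 *    have drained before the other CPUs are released */
	(void)upd;
	return 0;
}

static void update_ptes_serialized(struct pte_update *upd)
{
	/* NULL cpumask: run on any CPU, all others held until we return */
	stop_machine(update_ptes_cb, upd, NULL);
}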
Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT
> -Original Message- > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > Sent: Monday, May 22, 2017 1:05 PM > To: Bloomfield, Jon > Cc: intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > On Mon, May 22, 2017 at 11:07:25AM -0700, Jon Bloomfield wrote: > > BXT requires accesses to the GTT (i.e. PTE updates) to be serialized > > when IOMMU is enabled. > > Serialised with what, since all writes are serialized already? Fair cop guv. I'll reword. > > The reason is that you need to explain the hw model you are protecting, for > example do clears need to be protected? > > > This patch guarantees this by wrapping all updates in stop_machine and > > using a flushing read to guarantee that the GTT writes have reached > > their destination before restarting. > > If you mention which patch you are reinstating (for a new problem) and cc > the author, he might point out what has changed in the meantime. I don't understand. I'm not re-instating any patches to my knowledge, so it's a bit hard to cc the author. > > Note, the flush here is not about ensuring the GTT writes reach their > destination. > > > Signed-off-by: Jon Bloomfield > > If you are the author and sender, what is John's s-o-b doing afterwards? This patch was previously signed off by John. > > > Signed-off-by: John Harrison > > --- > > drivers/gpu/drm/i915/i915_gem_gtt.c | 106 > > > > 1 file changed, 106 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > > b/drivers/gpu/drm/i915/i915_gem_gtt.c > > index 7c769d7..6360d92 100644 > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > @@ -2191,6 +2191,100 @@ static void gen8_ggtt_clear_range(struct > i915_address_space *vm, > > gen8_set_pte(>t_base[i], scratch_pte); } > > > > +#ifdef CONFIG_INTEL_IOMMU > > +struct insert_page { > > + struct i915_address_space *vm; > > + dma_addr_t addr; > > + u64 offset; > > + enum i915_cache_level level; > > +}; > > + > > +static int gen8_ggtt_insert_page__cb(void *_arg) { > > + struct insert_page *arg = _arg; > > + > > + struct drm_i915_private *dev_priv = arg->vm->i915; > > + > > + gen8_ggtt_insert_page(arg->vm, arg->addr, > > + arg->offset, arg->level, 0); > > + > > + POSTING_READ(GFX_FLSH_CNTL_GEN6); > > This is now just a call to i915_ggtt_invalidate() because we are now also > responsible for invalidating the guc tlbs as well as the chipset. > And more importantly it is already done by gen8_ggtt_insert_page. > > All the POSTING_READ(GFX_FLSH_CNTL_GEN6) are spurious. Are you sure - The purpose of the register read is to ensure that all the PTE writes are flushed from the h/w queue before we restart the machine. It is critical that all the PTE writes have left this queue before any other accesses are allowed to begin. Isn't the invalidate a posted write ? If so, it won't drain the queue. Even if the invalidate is guaranteed to effect this pipeline flish, the clear_page path doesn't call invalidate, so it's certainly required there. > > > static void gen6_ggtt_clear_range(struct i915_address_space *vm, > > u64 start, u64 length) > > { > > @@ -2789,6 +2883,18 @@ static int gen8_gmch_probe(struct i915_ggtt > > *ggtt) > > > > ggtt->base.insert_entries = gen8_ggtt_insert_entries; > > > > +#ifdef CONFIG_INTEL_IOMMU > > + /* Serialize GTT updates on BXT if VT-d is on. */ > > + if (IS_BROXTON(dev_priv) && intel_iommu_gfx_mapped) { > > Move to a header and don't ifdef out the users. 
> A small cost in object size for the benefit of keeping these ifdefs out of code. Move what to a header ? You mean create a macro for the test, the whole block, or something else ? I was following the pattern used elsewhere in the code in the vain hope that by following established convention we might avoid bike-shedding. Every single use of intel_iommu_gfx_mapped in this file is protected by the #ifdef. > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT
> -Original Message- > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > Sent: Tuesday, May 23, 2017 12:33 AM > To: Bloomfield, Jon > Cc: intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > On Mon, May 22, 2017 at 10:18:31PM +, Bloomfield, Jon wrote: > > > -Original Message- > > > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > > > Sent: Monday, May 22, 2017 1:05 PM > > > To: Bloomfield, Jon > > > Cc: intel-gfx@lists.freedesktop.org > > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > > > > > On Mon, May 22, 2017 at 11:07:25AM -0700, Jon Bloomfield wrote: > > > > BXT requires accesses to the GTT (i.e. PTE updates) to be serialized > > > > when IOMMU is enabled. > > > > > > Serialised with what, since all writes are serialized already? > > > > Fair cop guv. I'll reword. > > > > > > > > The reason is that you need to explain the hw model you are protecting, > for > > > example do clears need to be protected? > > > > > > > This patch guarantees this by wrapping all updates in stop_machine and > > > > using a flushing read to guarantee that the GTT writes have reached > > > > their destination before restarting. > > > > > > If you mention which patch you are reinstating (for a new problem) and > cc > > > the author, he might point out what has changed in the meantime. > > > > I don't understand. I'm not re-instating any patches to my knowledge, so > it's a bit hard to cc the author. > > Please then review history before submitting *copied* code. > > > > Note, the flush here is not about ensuring the GTT writes reach their > > > destination. > > > > > > > Signed-off-by: Jon Bloomfield > > > > > > If you are the author and sender, what is John's s-o-b doing afterwards? > > This patch was previously signed off by John. > > > > > > > > > Signed-off-by: John Harrison > > > > --- > > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 106 > > > > > > > > 1 file changed, 106 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > index 7c769d7..6360d92 100644 > > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > > @@ -2191,6 +2191,100 @@ static void gen8_ggtt_clear_range(struct > > > i915_address_space *vm, > > > > gen8_set_pte(>t_base[i], scratch_pte); } > > > > > > > > +#ifdef CONFIG_INTEL_IOMMU > > > > +struct insert_page { > > > > + struct i915_address_space *vm; > > > > + dma_addr_t addr; > > > > + u64 offset; > > > > + enum i915_cache_level level; > > > > +}; > > > > + > > > > +static int gen8_ggtt_insert_page__cb(void *_arg) { > > > > + struct insert_page *arg = _arg; > > > > + > > > > + struct drm_i915_private *dev_priv = arg->vm->i915; > > > > + > > > > + gen8_ggtt_insert_page(arg->vm, arg->addr, > > > > + arg->offset, arg->level, 0); > > > > + > > > > + POSTING_READ(GFX_FLSH_CNTL_GEN6); > > > > > > This is now just a call to i915_ggtt_invalidate() because we are now also > > > responsible for invalidating the guc tlbs as well as the chipset. > > > And more importantly it is already done by gen8_ggtt_insert_page. > > > > > > All the POSTING_READ(GFX_FLSH_CNTL_GEN6) are spurious. > > > > Are you sure - The purpose of the register read is to ensure that all the > > PTE > writes are flushed from the h/w queue > > before we restart the machine. It is critical that all the PTE writes have > > left > this queue before any other accesses are > > allowed to begin. 
> > Isn't the invalidate a posted write ? If so, it won't drain the queue. > > Even if the invalidate is guaranteed to effect this pipeline flush, the > > clear_page path doesn't call invalidate, so it's certainly required there. > > It's an uncached mmio write. It is strongly ordered and serial. Besides > if you feel it is wrong, fix the bug at source.
Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT
> -Original Message- > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > Sent: Tuesday, May 23, 2017 1:41 AM > To: Bloomfield, Jon > Cc: intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > On Mon, May 22, 2017 at 11:07:25AM -0700, Jon Bloomfield wrote: > > BXT requires accesses to the GTT (i.e. PTE updates) to be serialized > > when IOMMU is enabled. This patch guarantees this by wrapping all > > updates in stop_machine and using a flushing read to guarantee that > > the GTT writes have reached their destination before restarting. > > > > Testcase? igt/gem_concurrent_blit shows the failure. I was using a combination of tests, run in parallel to hit this bug. I need to get hold of a system again to re-run. Are you saying you have also repro'd the bug just with gem_concurrent_blit or just suggesting that it might be a good candidate ? I'll also re-try without the flushing read, but I'm wary of removing this unless I can understand why the mmio write has the same effect. It might be luck. > -Chris > > -- > Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT
> -Original Message- > From: Bloomfield, Jon > Sent: Tuesday, May 23, 2017 7:28 AM > To: 'Chris Wilson' > Cc: intel-gfx@lists.freedesktop.org > Subject: RE: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > > -Original Message- > > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > > Sent: Tuesday, May 23, 2017 12:33 AM > > To: Bloomfield, Jon > > Cc: intel-gfx@lists.freedesktop.org > > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Serialize GTT Updates on BXT > > It's an uncached mmio write. It is strongly ordered and serial. Besides > > if you feel it is wrong, fix the bug at source. > Strongly ordered is not enough, nor is being uncached - that just ensures the PTE writes have left the CPU. We need to ensure they have left the GAM before we allow any following Aperture accesses to reach the GAM. On the other hand it may be that the write to the flushctl reg will explicitly cause the h/w to clear the fifo. I'll check with hw. The h/w team has confirmed that the flushing write is not sufficient to avoid the bug. In their words [almost]: "You need the read. The [FLSH_CTRL write] will invalidate the GTLB. However, it won't ensure an [Aperture] read isn't in the pipe." So the mmio read needs to stay, and we're left with where to issue it. I placed it in the _cb function because it is only needed for BXT and doing it there avoids any (admittedly small) extra overhead for other devices. But I have no strong opinion on this, so if you really want it to go into the main function I'll move it. I do think it should be conditional though. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH] drm/i915: Serialize GTT/Aperture accesses on BXT
BXT has a H/W issue with IOMMU which can lead to system hangs when Aperture accesses are queued within the GAM behind GTT Accesses. This patch avoids the condition by wrapping all GTT updates in stop_machine and using a flushing read prior to restarting the machine.

The stop_machine guarantees no new Aperture accesses can begin while the PTE writes are being emitted. The flushing read ensures that any following Aperture accesses cannot begin until the PTE writes have been cleared out of the GAM's fifo.

Only FOLLOWING Aperture accesses need to be separated from in flight PTE updates. PTE Writes may follow tightly behind already in flight Aperture accesses, so no flushing read is required at the start of a PTE update sequence.

This issue was reproduced by running igt/gem_readwrite and igt/gem_render_copy simultaneously from different processes, each in a tight loop, with INTEL_IOMMU enabled.

This patch was originally published as: drm/i915: Serialize GTT Updates on BXT

v2: Move bxt/iommu detection into static function
    Remove #ifdef CONFIG_INTEL_IOMMU protection
    Make function names more reflective of purpose
    Move flushing read into static function

Signed-off-by: Jon Bloomfield
Cc: John Harrison
Cc: Chris Wilson
Cc: Daniel Vetter
Cc: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_drv.h     |   9 +++
 drivers/gpu/drm/i915/i915_gem_gtt.c | 114
 2 files changed, 123 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 17883a8..b2783b2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3006,6 +3006,15 @@ static inline bool intel_scanout_needs_vtd_wa(struct drm_i915_private *dev_priv)
 	return false;
 }
 
+static inline bool intel_gtt_update_needs_vtd_wa(struct drm_i915_private *dev_priv)
+{
+#ifdef CONFIG_INTEL_IOMMU
+	if (IS_BROXTON(dev_priv) && intel_iommu_gfx_mapped)
+		return true;
+#endif
+	return false;
+}
+
 int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 				int enable_ppgtt);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7c769d7..ab91178 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2191,6 +2191,110 @@ static void gen8_ggtt_clear_range(struct i915_address_space *vm,
 	gen8_set_pte(&gtt_base[i], scratch_pte);
 }
 
+static void bxt_vtd_gtt_wa(struct drm_i915_private *dev_priv)
+{
+	/*
+	 * Make sure the internal GAM fifo has been cleared of all GTT
+	 * Writes before exiting stop_machine(). This guarantees that
+	 * any Aperture accesses waiting to start in another process
+	 * cannot backup behind the GTT Writes causing a hang.
+	 * The register can be any arbitrary GAM register.
+	 */
+	POSTING_READ(GFX_FLSH_CNTL_GEN6);
+}
+
+struct insert_page {
+	struct i915_address_space *vm;
+	dma_addr_t addr;
+	u64 offset;
+	enum i915_cache_level level;
+};
+
+static int bxt_vtd_ggtt_insert_page__cb(void *_arg)
+{
+	struct insert_page *arg = _arg;
+
+	struct drm_i915_private *dev_priv = arg->vm->i915;
+
+	gen8_ggtt_insert_page(arg->vm, arg->addr,
+			      arg->offset, arg->level, 0);
+
+	bxt_vtd_gtt_wa(dev_priv);
+
+	return 0;
+}
+
+static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
+					  dma_addr_t addr,
+					  u64 offset,
+					  enum i915_cache_level level,
+					  u32 unused)
+{
+	struct insert_page arg = { vm, addr, offset, level };
+
+	stop_machine(bxt_vtd_ggtt_insert_page__cb, &arg, NULL);
+}
+
+
+struct insert_entries {
+	struct i915_address_space *vm;
+	struct sg_table *st;
+	u64 start;
+	enum i915_cache_level level;
+};
+
+static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
+{
+	struct insert_entries *arg = _arg;
+
+	struct drm_i915_private *dev_priv = arg->vm->i915;
+
+	gen8_ggtt_insert_entries(arg->vm, arg->st,
+				 arg->start, arg->level, 0);
+
+	bxt_vtd_gtt_wa(dev_priv);
+
+	return 0;
+}
+
+static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
+					     struct sg_table *st,
+					     u64 start,
+					     enum i915_cache_level level,
+					     u32 unused)
+{
+	struct insert_entries arg = { vm, st, start, level };
+
+	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
+}
+
+struct clear_range {
+	struct i915_address_space *
Re: [Intel-gfx] [PATCH] drm/i915/bxt: Enable VBT based BL control for DP
> -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of Imre Deak > Sent: Tuesday, June 6, 2017 5:34 AM > To: Jani Nikula > Cc: intel-gfx@lists.freedesktop.org; Mustaffa, Mustamin B > > Subject: Re: [Intel-gfx] [PATCH] drm/i915/bxt: Enable VBT based BL control > for DP > > On Tue, Jun 06, 2017 at 12:24:30PM +0300, Jani Nikula wrote: > > On Tue, 06 Jun 2017, Mustamin B Mustaffa > wrote: > > > [...] > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c > b/drivers/gpu/drm/i915/intel_dp.c > > > index d1670b8..124f58b 100644 > > > --- a/drivers/gpu/drm/i915/intel_dp.c > > > +++ b/drivers/gpu/drm/i915/intel_dp.c > > > @@ -591,13 +591,8 @@ bxt_power_sequencer_idx(struct intel_dp > *intel_dp) > > > /* We should never land here with regular DP ports */ > > > WARN_ON(!is_edp(intel_dp)); > > > > > > - /* > > > - * TODO: BXT has 2 PPS instances. The correct port->PPS instance > > > - * mapping needs to be retrieved from VBT, for now just hard-code > to > > > - * use instance #0 always. > > > - */ > > > if (!intel_dp->pps_reset) > > > - return 0; > > > + return dev_priv->vbt.backlight.controller; > > > > So the existing code around here looks a bit convoluted, not least > > because now pretty much all PPS access first does > > > > - intel_pps_get_registers(), which calls > > - bxt_power_sequencer_idx(), which calls > > - intel_dp_init_panel_power_sequencer_registers(), which calls > > - intel_pps_get_registers()... > > > > With your change, for controller == 1 and pps_reset == true, the first > > time the registers are needed, we'll initialize the correct controller 1 > > registers, but controller 0 registers are returned. From there on, we'll > > keep returning controller 1 registers until pps_reset is set to true > > again. > > > > Cc: Imre as author of commits 78597996370c ("drm/i915/bxt: Fix PPS lost > > state after suspend breaking eDP link training") and 8e8232d51878 > > ("drm/i915: Deduplicate PPS register retrieval") which I think create > > the loop. > > A loop would be prevented by the pps->reset check, but agree the code > isn't nice, I guess I overlooked this when I wrote it. To make things > clearer we could factor out a helper from > intel_dp_init_panel_power_sequencer_registers() that takes pps_registers > and call this helper from bxt_power_sequencer_idx(). > > So how about something like the following: Just checking what the intention is here because your proposed change ommits the VBT fix... Are you going to post these changes as a new baseline for Mustaffa's patch ? Or are you asking Mustaffa to fold these changes into his patch ? Hoping it's the former :-) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915/bxt: Enable VBT based BL control for DP
> -Original Message- > From: Jani Nikula [mailto:jani.nik...@linux.intel.com] > Sent: Wednesday, June 7, 2017 1:16 AM > To: Deak, Imre ; Bloomfield, Jon > > Cc: intel-gfx@lists.freedesktop.org; Mustaffa, Mustamin B > > Subject: Re: [Intel-gfx] [PATCH] drm/i915/bxt: Enable VBT based BL control > for DP > > On Tue, 06 Jun 2017, Imre Deak wrote: > > On Tue, Jun 06, 2017 at 05:58:43PM +0300, Bloomfield, Jon wrote: > >> > -Original Message- > >> > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On > Behalf > >> > Of Imre Deak > >> > Sent: Tuesday, June 6, 2017 5:34 AM > >> > To: Jani Nikula > >> > Cc: intel-gfx@lists.freedesktop.org; Mustaffa, Mustamin B > >> > > >> > Subject: Re: [Intel-gfx] [PATCH] drm/i915/bxt: Enable VBT based BL > control > >> > for DP > >> > > >> > On Tue, Jun 06, 2017 at 12:24:30PM +0300, Jani Nikula wrote: > >> > > On Tue, 06 Jun 2017, Mustamin B Mustaffa > >> > wrote: > >> > > > [...] > >> > > > diff --git a/drivers/gpu/drm/i915/intel_dp.c > >> > b/drivers/gpu/drm/i915/intel_dp.c > >> > > > index d1670b8..124f58b 100644 > >> > > > --- a/drivers/gpu/drm/i915/intel_dp.c > >> > > > +++ b/drivers/gpu/drm/i915/intel_dp.c > >> > > > @@ -591,13 +591,8 @@ bxt_power_sequencer_idx(struct intel_dp > >> > *intel_dp) > >> > > > /* We should never land here with regular DP ports */ > >> > > > WARN_ON(!is_edp(intel_dp)); > >> > > > > >> > > > -/* > >> > > > - * TODO: BXT has 2 PPS instances. The correct port->PPS > instance > >> > > > - * mapping needs to be retrieved from VBT, for now just > hard-code > >> > to > >> > > > - * use instance #0 always. > >> > > > - */ > >> > > > if (!intel_dp->pps_reset) > >> > > > -return 0; > >> > > > +return dev_priv->vbt.backlight.controller; > >> > > > >> > > So the existing code around here looks a bit convoluted, not least > >> > > because now pretty much all PPS access first does > >> > > > >> > > - intel_pps_get_registers(), which calls > >> > > - bxt_power_sequencer_idx(), which calls > >> > > - intel_dp_init_panel_power_sequencer_registers(), which calls > >> > > - intel_pps_get_registers()... > >> > > > >> > > With your change, for controller == 1 and pps_reset == true, the first > >> > > time the registers are needed, we'll initialize the correct controller > >> > > 1 > >> > > registers, but controller 0 registers are returned. From there on, > >> > > we'll > >> > > keep returning controller 1 registers until pps_reset is set to true > >> > > again. > >> > > > >> > > Cc: Imre as author of commits 78597996370c ("drm/i915/bxt: Fix PPS > lost > >> > > state after suspend breaking eDP link training") and 8e8232d51878 > >> > > ("drm/i915: Deduplicate PPS register retrieval") which I think create > >> > > the loop. > >> > > >> > A loop would be prevented by the pps->reset check, but agree the code > >> > isn't nice, I guess I overlooked this when I wrote it. To make things > >> > clearer we could factor out a helper from > >> > intel_dp_init_panel_power_sequencer_registers() that takes > pps_registers > >> > and call this helper from bxt_power_sequencer_idx(). > >> > > >> > So how about something like the following: > >> > >> Just checking what the intention is here because your proposed change > >> ommits the VBT fix... Are you going to post these changes as a new > >> baseline for Mustaffa's patch ? Or are you asking Mustaffa to fold > >> these changes into his patch ? Hoping it's the former :-) > > > > The change is just to make the code clearer, unrelated to the VBT fix, > > so it should be a separate patch. 
I don't mind doing this as a follow-up > > to Mustaffa's patchset. What his patch here would need is just to return > > the correct index from bxt_power_sequencer_idx() in all cases. > > I think we might need to backport Mustaffa's patch to stable so we need > to do that first as a standalone change. After it has been fixed > according to Imre's and my feedback. Oh, and I'd still like someone(tm) > to check if the PPS-PWM mapping is fixed 1:1, or can we cross connect > them? Yes, I was having doubts about this yesterday too. I can find nothing the BSpec to indicate any relationship at all between PWM and PPS. Mustamin is from IOTG and this for a specific product they're working on. The correct fix is probably to extend VBT to include a separate PPS select field and then key off that. As this is a new product, there should be no issues with updating the VBT (I hope), but how does that sit with you guys ? Hardcoding is certainly plain wrong, even if all current released products do need 0. > > I just involved Imre here because the existing code is, I think, > unnecessarily hard to follow. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
> -Original Message- > From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of > Daniel Vetter > Sent: Saturday, June 08, 2013 10:22 PM > To: Carsten Emde > Cc: Chris Wilson; Jesse Barnes; Intel Graphics Development; Bloomfield, Jon; > Steven Rostedt; Christoph Mathys; stable > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Workaround incoherence > between fences and LLC across multiple CPUs > > On Sat, Jun 8, 2013 at 10:41 PM, Carsten Emde wrote: > > On 04/08/2013 11:54 AM, Daniel Vetter wrote: > >> > >> On Fri, Apr 05, 2013 at 09:03:27AM -0700, Jesse Barnes wrote: > >>> > >>> On Thu, 4 Apr 2013 21:31:03 +0100 > >>> Chris Wilson wrote: > >>> > >>>> In order to fully serialize access to the fenced region and the > >>>> update to the fence register we need to take extreme measures on > >>>> SNB+, and manually flush writes to memory prior to writing the > >>>> fence register in conjunction with the memory barriers placed around > the register write. > >>>> > >>>> Fixes i-g-t/gem_fence_thrash > >>>> > >>>> v2: Bring a bigger gun > >>>> v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) > >>>> v4: Remove changes for working generations. > >>>> v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. > >>>> v6: Rewrite comments to ellide forgotten history. > >>>> > >>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 > >>>> Signed-off-by: Chris Wilson > >>>> Cc: Jon Bloomfield > >>>> Tested-by: Jon Bloomfield (v2) > >>>> Cc: sta...@vger.kernel.org > >>>> --- > >>>> drivers/gpu/drm/i915/i915_gem.c | 28 > +++- > >>>> 1 file changed, 23 insertions(+), 5 deletions(-) > >>>> > >>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c > >>>> b/drivers/gpu/drm/i915/i915_gem.c index fa4ea1a..8f7739e 100644 > >>>> --- a/drivers/gpu/drm/i915/i915_gem.c > >>>> +++ b/drivers/gpu/drm/i915/i915_gem.c > >>>> @@ -2689,17 +2689,35 @@ static inline int fence_number(struct > >>>> drm_i915_private *dev_priv, > >>>> return fence - dev_priv->fence_regs; > >>>> } > >>>> > >>>> +static void i915_gem_write_fence__ipi(void *data) { > >>>> + wbinvd(); > >>>> +} > >>>> + > >>>> static void i915_gem_object_update_fence(struct > >>>> drm_i915_gem_object *obj, > >>>> struct drm_i915_fence_reg > >>>> *fence, > >>>> bool enable) > >>>> { > >>>> - struct drm_i915_private *dev_priv = obj->base.dev->dev_private; > >>>> - int reg = fence_number(dev_priv, fence); > >>>> - > >>>> - i915_gem_write_fence(obj->base.dev, reg, enable ? obj : NULL); > >>>> + struct drm_device *dev = obj->base.dev; > >>>> + struct drm_i915_private *dev_priv = dev->dev_private; > >>>> + int fence_reg = fence_number(dev_priv, fence); > >>>> + > >>>> + /* In order to fully serialize access to the fenced region and > >>>> +* the update to the fence register we need to take extreme > >>>> +* measures on SNB+. In theory, the write to the fence register > >>>> +* flushes all memory transactions before, and coupled with the > >>>> +* mb() placed around the register write we serialise all memory > >>>> +* operations with respect to the changes in the tiler. Yet, on > >>>> +* SNB+ we need to take a step further and emit an explicit > >>>> wbinvd() > >>>> +* on each processor in order to manually flush all memory > >>>> +* transactions before updating the fence register. > >>>> +*/ > >>>> + if (HAS_LLC(obj->base.dev)) > >>>> + on_each_cpu(i915_gem_write_fence__ipi, NULL, 1); > >>>> + i915_gem_write_fence(dev, fence_reg, enable ? obj : NULL); > >>>> > >>>> if (enable) { > >>>> - obj->fence_reg = reg; > >>>> +
Re: [Intel-gfx] [PATCH] drm/i915: Workaround incoherence between fences and LLC across multiple CPUs
> -Original Message- > From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of > Daniel Vetter > Sent: Monday, June 10, 2013 3:38 PM > To: Bloomfield, Jon > Cc: Carsten Emde; Chris Wilson; Jesse Barnes; Intel Graphics Development; > Steven Rostedt; Christoph Mathys; stable > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Workaround incoherence > between fences and LLC across multiple CPUs > > On Mon, Jun 10, 2013 at 3:17 PM, Bloomfield, Jon > wrote: > >> -Original Message- > >> From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On > >> Behalf Of Daniel Vetter > >> Sent: Saturday, June 08, 2013 10:22 PM > >> To: Carsten Emde > >> Cc: Chris Wilson; Jesse Barnes; Intel Graphics Development; > >> Bloomfield, Jon; Steven Rostedt; Christoph Mathys; stable > >> Subject: Re: [Intel-gfx] [PATCH] drm/i915: Workaround incoherence > >> between fences and LLC across multiple CPUs > >> > >> On Sat, Jun 8, 2013 at 10:41 PM, Carsten Emde > wrote: > >> > On 04/08/2013 11:54 AM, Daniel Vetter wrote: > >> >> > >> >> On Fri, Apr 05, 2013 at 09:03:27AM -0700, Jesse Barnes wrote: > >> >>> > >> >>> On Thu, 4 Apr 2013 21:31:03 +0100 Chris Wilson > >> >>> wrote: > >> >>> > >> >>>> In order to fully serialize access to the fenced region and the > >> >>>> update to the fence register we need to take extreme measures on > >> >>>> SNB+, and manually flush writes to memory prior to writing the > >> >>>> fence register in conjunction with the memory barriers placed > >> >>>> around > >> the register write. > >> >>>> > >> >>>> Fixes i-g-t/gem_fence_thrash > >> >>>> > >> >>>> v2: Bring a bigger gun > >> >>>> v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) > >> >>>> v4: Remove changes for working generations. > >> >>>> v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. > >> >>>> v6: Rewrite comments to ellide forgotten history. > >> >>>> > >> >>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 > >> >>>> Signed-off-by: Chris Wilson > >> >>>> Cc: Jon Bloomfield > >> >>>> Tested-by: Jon Bloomfield (v2) > >> >>>> Cc: sta...@vger.kernel.org > >> >>>> --- > >> >>>> drivers/gpu/drm/i915/i915_gem.c | 28 > >> +++- > >> >>>> 1 file changed, 23 insertions(+), 5 deletions(-) > >> >>>> > >> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c > >> >>>> b/drivers/gpu/drm/i915/i915_gem.c index fa4ea1a..8f7739e 100644 > >> >>>> --- a/drivers/gpu/drm/i915/i915_gem.c > >> >>>> +++ b/drivers/gpu/drm/i915/i915_gem.c > >> >>>> @@ -2689,17 +2689,35 @@ static inline int fence_number(struct > >> >>>> drm_i915_private *dev_priv, > >> >>>> return fence - dev_priv->fence_regs; > >> >>>> } > >> >>>> > >> >>>> +static void i915_gem_write_fence__ipi(void *data) { > >> >>>> + wbinvd(); > >> >>>> +} > >> >>>> + > >> >>>> static void i915_gem_object_update_fence(struct > >> >>>> drm_i915_gem_object *obj, > >> >>>> struct > >> >>>> drm_i915_fence_reg *fence, > >> >>>> bool enable) > >> >>>> { > >> >>>> - struct drm_i915_private *dev_priv = obj->base.dev- > >dev_private; > >> >>>> - int reg = fence_number(dev_priv, fence); > >> >>>> - > >> >>>> - i915_gem_write_fence(obj->base.dev, reg, enable ? obj : > NULL); > >> >>>> + struct drm_device *dev = obj->base.dev; > >> >>>> + struct drm_i915_private *dev_priv = dev->dev_private; > >> >>>> + int fence_reg = fence_number(dev_priv, fence); > >> >>>> + > >> >>>> + /* In order to fully serialize access to the fenced region and &
[Intel-gfx] broken eDP device types
Hi, I'm trying to work out some bugs on an Asus T100TA, which is a Baytrail-T platform. Current code in use is 3.15_rc4. In general I have been having problems with the backlight control. For some reason the kernel is detecting the panel as a VGA device and intel_crt.c does not load intel_backlight, so I hacked something together real quick and ended up getting something that actually changes PWM registers, but this had no effect on actual screen brightness.

Without any real solid theory as to why PWM is not doing anything, I started wondering why exactly the kernel thinks the panel is VGA, since it is kind of unlikely the panel is analog, especially considering Baytrail-T does not have any analog interfaces. So I nabbed the VBT to take a look. The whole thing is here if you're interested: http://pastebin.com/crht1nDU

Pretty sure the relevant portion is:

EFP device info:
 Device type: 0x1400 (unknown)
 Port: 0x15 (unknown)
 DDC pin: 0x04
 Dock port: 0x00 (N/A)
 HDMI compatible? No
 Info: HDMI certified
 Aux channel: 0x00
 Dongle detect: 0x00

The VBT does include tables for LVDS/eDP operation of the panel. It seems just the device type is fubar. So my questions are: why is the type screwed up? What would the windows driver do upon seeing that, and what is the best way to override it in the kernel?

The practical impact here I think will be during sleep, as taking down the eDP link is unlikely to work so long as the driver thinks it is a CRT.

Also I notice that the backlight block contains these values:

 I2C slave addr: 0x58
 I2C command: 0xaa

Which are not used in the linux driver. Is this something the windows driver actually does? Any plans to implement i2c backlight control in i915?

Regards,

Jon Pry
jon...@gmail.com
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] broken eDP device types
Do you know if the DSI patch set is being maintained? I noticed it is not integrated into drm-intel-next, the patches don't apply cleanly to anything, and there has been no activity in about a month on them. -Jon On Sun, May 11, 2014 at 1:45 PM, Daniel Vetter wrote: > Asus T100 has a mipi dsi panel, and we don't yet have the proper > support for that merged. Hopefully that will get adressed in 3.16. See > > https://bugzilla.kernel.org/show_bug.cgi?id=74381 > > for the overall progress. > -Daniel > > > On Sun, May 11, 2014 at 6:27 AM, Jon Pry wrote: >> Hi, >> >>I'm trying to work out some bugs on Asus T100TA which is Baytrail-T >> platform. Current code in use is 3.15_rc4. In general I have been >> having problems with the backlight control. For some reason the kernel >> is detecting the panel as a VGA device and intel_crt.c does not load >> intel_backlight, so I hacked something together real quick and ended >> up getting something that actual changes PWM registers, but this had >> no effect on actual screen brightness. >> >>Without any real solid theory as to why PWM is not doing anything. >> I started wondering why exactly the kernel thinks the panel is VGA >> since it is kind of unlikely the panel is analog especially >> considering Baytrail-T does not have any analog interfaces. So I >> nabbed the VBT to take a look. >> >> Whole thing is here if your interested http://pastebin.com/crht1nDU >> >> Pretty sure the relevant portion is: >> >> EFP device info: >> Device type: 0x1400 (unknown) >> Port: 0x15 (unknown) >> DDC pin: 0x04 >> Dock port: 0x00 (N/A) >> HDMI compatible? No >> Info: HDMI certified >> Aux channel: 0x00 >> Dongle detect: 0x00 >> >> The VBT does include tables for LVDS/eDP operation of the panel. It >> seems just the device type is fubar. So my questions are, why is the >> type screwed up? What would windows driver do upon seeing that, and >> what is the best way to override it in the kernel? >> >> The practical impact here I think will be during sleep. As taking down >> the eDP link is unlikely to work so long as driver thinks it is a CRT. >> >> Also I notice that the backlight block contains these values: >> >> I2C slave addr: 0x58 >> I2C command: 0xaa >> >> Which are not used in the linux driver. Is this something the windows >> driver actually does? Any plans to implement i2c backlight control in >> i915? >> >> Regards, >> >> Jon Pry >> jon...@gmail.com >> ___ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Introduce vblank work function
What's the status of this patch ? I can't find any subsequent mention of it, but we currently use it in one of our Android development trees. I'm trying to work out whether to retain it or replace it. Was it killed off, or is it still in the pipeline ? Thanks. > -Original Message- > From: intel-gfx-bounces+jon.bloomfield=intel@lists.freedesktop.org > [mailto:intel-gfx-bounces+jon.bloomfield=intel@lists.freedesktop.org] > On Behalf Of Chris Wilson > Sent: Friday, July 05, 2013 9:49 AM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH] drm/i915: Introduce vblank work function > > Throughout the driver, we have a number of pieces of code that must wait > for a vblank before we can update some state. Often these could be run > asynchronously since they are merely freeing a resource no long referenced > by a double-buffered registered. So we introduce a vblank worker upon > which we queue various tasks to be run after the next vvblank. > > This will be used in the next patches to avoid unnecessary stalls when > updating registers and for freeing resources. > > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/intel_display.c | 79 > > drivers/gpu/drm/i915/intel_drv.h | 8 +++- > 2 files changed, 86 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_display.c > b/drivers/gpu/drm/i915/intel_display.c > index b70ce4e..dff3b93 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -795,6 +795,74 @@ void intel_wait_for_vblank(struct drm_device *dev, > int pipe) > DRM_DEBUG_KMS("vblank wait timed out\n"); } > > +struct intel_crtc_vblank_task { > + struct list_head list; > + void (*func)(struct intel_crtc *, void *data); > + void *data; > +}; > + > +static void intel_crtc_vblank_work_fn(struct work_struct *_work) { > + struct intel_crtc_vblank_work *work = > + container_of(_work, struct intel_crtc_vblank_work, work); > + struct intel_crtc *crtc = > + container_of(work, struct intel_crtc, vblank_work); > + struct list_head tasks; > + > + intel_wait_for_vblank(crtc->base.dev, crtc->pipe); > + > + mutex_lock(&crtc->vblank_work.mutex); > + list_replace_init(&work->tasks, &tasks); > + mutex_unlock(&crtc->vblank_work.mutex); > + > + while (!list_empty(&tasks)) { > + struct intel_crtc_vblank_task *task > + = list_first_entry(&tasks, > +struct intel_crtc_vblank_task, > +list); > + > + task->func(crtc, task->data); > + list_del(&task->list); > + kfree(task); > + } > +} > + > +static int intel_crtc_add_vblank_task(struct intel_crtc *crtc, > + bool single, > + void (*func)(struct intel_crtc *, > +void *data), > + void *data) > +{ > + struct intel_crtc_vblank_task *task; > + struct intel_crtc_vblank_work *work = &crtc->vblank_work; > + > + task = kzalloc(sizeof *task, GFP_KERNEL); > + if (task == NULL) > + return -ENOMEM; > + task->func = func; > + task->data = data; > + > + mutex_lock(&work->mutex); > + if (list_empty(&work->tasks)) { > + schedule_work(&work->work); > + } else if (single) { > + struct intel_crtc_vblank_task *old; > + list_for_each_entry(old, &work->tasks, list) { > + if (task->func == func && task->data == data) { > + func = NULL; > + break; > + } > + } > + } > + if (func) > + list_add(&task->list, &work->tasks); > + else > + kfree(task); > + mutex_unlock(&work->mutex); > + > + return 0; > +} > + > /* > * intel_wait_for_pipe_off - wait for pipe to turn off > * @dev: drm device > @@ -2835,6 +2903,8 @@ static void intel_crtc_wait_for_pending_flips(struct > drm_crtc *crtc) > if (crtc->fb == NULL) > return; > 
> + flush_work(&to_intel_crtc(crtc)->vblank_work.work); > + > WARN_ON(waitqueue_active(&dev_priv->pending_flip_queue)); > > wait_event(dev_priv->pending_flip_queue, > @@ -9097,6 +9167,10 @@ static void intel_crtc_init(struct drm_device *dev, > int pipe) > if (intel_crtc == NULL) > return; > > + mutex_init(&intel_crtc->vblank_work.mutex); > + INIT_LIST_HEAD(&intel_crtc->vblank_work.tasks); > + INIT_WORK(&intel_crtc->vblank_work.work, > intel_crtc_vblank_work_fn); > + > drm_crtc_init(dev, &intel_crtc->base, &intel_crtc_funcs); > > if (drm_mode_crtc_set_gamma_size(&intel_crtc->base, 256)) @@ - > 10296,11 +10370,16 @@ void intel_modeset_cleanup(struct drm_de
Re: [Intel-gfx] [PATCH] drm/i915: Introduce vblank work function
Ok thanks. To add weight to it becoming official in some form, we're using it for various deferred operations:

 drm/i915: Make plane switching asynchronous
 drm/i915: Asynchronously unpin the old framebuffer after the next vblank

They aren't my patches but I believe they should be upstreamed in the near future. The claim is that these give a noticeable performance boost. I'll leave it in and hope it becomes official.

> -Original Message- > From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel > Vetter > Sent: Friday, December 06, 2013 12:07 PM > To: Bloomfield, Jon > Cc: Chris Wilson; intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Introduce vblank work function > > On Fri, Dec 06, 2013 at 10:44:15AM +, Bloomfield, Jon wrote: > > What's the status of this patch ? I can't find any subsequent mention > > of it, but we currently use it in one of our Android development > > trees. I'm trying to work out whether to retain it or replace it. > > > > Was it killed off, or is it still in the pipeline ? > > Stalled atm I think. The overall concept of a vblank worker/timer support > code is still useful imo. I think I've written up all the bikesheds Chris&I > discussed on irc in my other reply. > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
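A sketch of how the deferred-unpin use case mentioned above would sit on top of the intel_crtc_add_vblank_task() helper from the quoted patch. unpin_old_fb(), queue_old_fb_unpin() and old_obj are illustrative names, not code from either series; the intel_unpin_fb_obj() call is used schematically for "release the old scanout pin".

/* Runs on the vblank worker, i.e. after the next vblank on this crtc,
 * so the display engine has finished scanning out of the old buffer. */
static void unpin_old_fb(struct intel_crtc *crtc, void *data)
{
	struct drm_i915_gem_object *old_obj = data;

	intel_unpin_fb_obj(old_obj);
}

/* At flip time: queue the unpin instead of blocking for the vblank.
 * 'single' == true collapses duplicate func/data pairs queued before
 * the vblank fires. */
static void queue_old_fb_unpin(struct intel_crtc *crtc,
			       struct drm_i915_gem_object *old_obj)
{
	intel_crtc_add_vblank_task(crtc, true, unpin_old_fb, old_obj);
}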
Re: [Intel-gfx] [PATCH 1/2] drm/i915: Wait for completion of pending flips when starved of fences
> -Original Message- > From: intel-gfx-boun...@lists.freedesktop.org [mailto:intel-gfx- > boun...@lists.freedesktop.org] On Behalf Of Chris Wilson > Sent: Monday, January 20, 2014 10:18 AM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Wait for completion of pending > flips when starved of fences > > On older generations (gen2, gen3) the GPU requires fences for many > operations, such as blits. The display hardware also requires fences for > scanouts and this leads to a situation where an arbitrary number of fences > may be pinned by old scanouts following a pageflip but before we have > executed the unpin workqueue. This is unpredictable by userspace and leads > to random EDEADLK when submitting an otherwise benign execbuffer. > However, we can detect when we have an outstanding flip and so cause > userspace to wait upon their completion before finally declaring that the > system is starved of fences. This is really no worse than forcing the GPU to > stall waiting for older execbuffer to retire and release their fences before > we > can reallocate them for the next execbuffer. > > v2: move the test for a pending fb unpin to a common routine for later reuse > during eviction > > Reported-and-tested-by: di...@gmx.net > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696 > Signed-off-by: Chris Wilson Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Repeat evictions whilst pageflip completions are outstanding
> From: intel-gfx-boun...@lists.freedesktop.org [mailto:intel-gfx- > boun...@lists.freedesktop.org] On Behalf Of Chris Wilson > Sent: Monday, January 20, 2014 10:18 AM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH 2/2] drm/i915: Repeat evictions whilst pageflip > completions are outstanding > > Since an old pageflip will keep its scanout buffer object pinned until it has > executed its unpin task on the common workqueue, we can clog up our > GGTT with stale pinned objects. As we cannot flush those workqueues > without dropping our locks, we have to resort to falling back to userspace > and telling them to repeat the operation in order to have a chance to run our > workqueues and free up the required memory. If we fail, then we are forced > to report ENOSPC back to userspace causing the operation to fail and best- > case scenario is that it introduces temporary corruption. > > Signed-off-by: Chris Wilson Reviewed-by: Jon Bloomfield ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Wait and retry if there is no space in the aperture mappable area
> On Tue, Oct 22, 2013 at 01:15:32PM +0100, Chris Wilson wrote: > > On Tue, Oct 22, 2013 at 12:04:17PM +, Siluvery, Arun wrote: > > > From: "Siluvery, Arun" > > > > > > When a mapping is requested and if there is no space the mapping > > > fails and the region is not physically backed. This results in > > > signal 7(SIGBUS), code 2 (BUS_ADRERR) when it is actually accessed. > > > This patch handles this error, continues to wait and retries to find > > > space. > > > > Eh, no. The line before will remove everything from the aperture that > > is unpinned. Throwing an evict_everything in here breaks reservations, > > so I think you are just papering over a bug. > > If we want to fix this for real (i.e. allow userspace to reliably map stuff, > mabye even bigger than the aperture) we need to fall back to suballocating > tile-row aligned strides of the buffer (maybe pick the tile row multiply in > between 1M-2M). Until that's done userspace can't rely on gtt mmaps relibly > working for large buffers. The current heuristics we're using is half of the > mappable space, but that's probably a bit too optimistic. > -Daniel Is calling i915_gem_evict_everything() actually dangerous ? Despite its name, it appears to only evict unpinned buffers. Or am I missing something ? What it does do is update the status of buffers which are no longer in use on the ring by calling i915_gem_retire_requests(). So from what I can tell (from a 10 minute trawl of the code admittedly) all this patch is doing is getting a more up to date view of the GTT demands so that we only fail with ENOSPC if there are no pinned buffers which can now be unpinned. It doesn't address our underlying issue - userspace should still handle ENOSPC gracefully. However it certainly seems to be improving things considerably, so is beneficial if it really is a safe thing to do. Jon - Intel Corporation (UK) Limited Registered No. 1134945 (England) Registered Office: Pipers Way, Swindon SN3 1RJ VAT No: 860 2173 47 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Wait and retry if there is no space in the aperture mappable area
Thanks Daniel, sermon noted. I hadn't twigged that we were pinning buffers to the mapable GTT region which didn't really need to be there. Do we definitely never need to modify or interrogate the hw contexts from the CPU ? What about for debug ? Jon > -Original Message- > From: daniel.vet...@ffwll.ch [mailto:daniel.vet...@ffwll.ch] On Behalf Of > Daniel Vetter > Sent: Friday, November 01, 2013 4:57 PM > To: Chris Wilson > Cc: Bloomfield, Jon; intel-gfx > Subject: Re: [Intel-gfx] [PATCH] drm/i915: Wait and retry if there is no space > in the aperture mappable area > > Dragging this discussion back onto the mailing list. > > The Intel GenX kernel team is massively distributed over mutliple continents > and lots fo different business groups within Intel, and a few interested > parties outside of Intel. We need to work hard and actively to make sure > everyone is in the loop, that interested parties can jump into a discussion or > at least learn a bit by following it. > > Which means _please_ _always_ cc mailing list. Either the public intel-gfx > list > or when discussing confidential topics (and they can't be redacted out > without causing confusion) the internal gfx-internal-devel mailing list. > > /end of sermon, thanks for your attention > > Short recap for newcomers: Jon's group is hitting reliability issues in stress > testing where gtt mmappings fail with SIGBUS. The frequency sharply > increased when they started to support more outputs/sprites and so > squarely points at mappable gtt fragmentation due to pinned buffers. > > On Fri, Nov 1, 2013 at 1:47 PM, Chris Wilson > wrote: > > On Fri, Nov 01, 2013 at 12:02:07PM +, Bloomfield, Jon wrote: > >> > > > That statement is false. > >> > > Fair enough, but why is it false ? > >> > > >> > You often have buffers larger than the mappable aperture. At > >> > present, such mappings fail (they should pruned out earlier, but in > >> > the fault they would give a SIGBUS). That limit is oft requested to be > raised by magic. > >> But a mapping like this should fail when the user asks to map the buffer, > not when they subsequently access it. It can never succeed, so tell the user > immediately. > >> > >> > > Userspace fails when accessing a 90MB mapping. > >> > > Don't understand how a trampoline would simplify this - > >> > > trampoline as in > >> > vectored jump ? > >> > > Don't understand the 'educate' comment - they are simply mapping > >> > > a > >> > 90MB buffer and trying to use it. Why should that be slow in the > >> > general case ? > >> > > >> > Failing to map a 90MiB buffer is a kernel bug, pure and simple. > >> > Failing to map a 900MiB buffer is a user education issue. (It could > >> > be done, but there are at least a dozen ways the user could do the > >> > operation differently that will be > >> > faster.) > >> Why a kernel bug ? If there are too many surfaces pinned for display (or > sufficiently fragmented), is it guaranteed that a 90MB map should still > succeed ? > >> > >> > The borderline is whether the user expects a 128MiB mapping to > >> > always work. Worst case scenario (without a > >> > trampoline) is that requires the outputs to be disabled first i.e. > >> > process deadlock which leads to a system deadlock with anything like X. > >> Again, how would a trampoline help ? > > > > All I am stating here is that your idea can^Wwill lead to a system > > deadlock. NAK. > > I agree with Chris, implicit waiting for a random pin count to drop is a > deadlock nightmare. 
We'd need at least an explicit unpin_lru where we track > all soon-to-be-unpinned objects. And then we need to walk that list in > evict_something as a last resort. > > Still resolving the locking inversion this causes between gem and kms code > won't be fun at all, and we still haven't gained any solid guarantees for > userspace. The mappable gtt can still be easily fragmented, it's just a notch > less likely to cause problems. > > We need a better solution, and we need one with actual guarantees for > userspace to depend upon. > > So here's my idea to solve this for real: > - split long-term pinned objects into two classes: Those that might need to be > accessed through the mappable gtt mmio window (like scanout > buffers) and those that don't (like hw contexts). > - Pin objects that are never accessed through the mappable gtt by the cpu to > the
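[Editor's note] For readers following the proposal above, here is a toy sketch of the two-class pinning idea in standalone C. All names, sizes and the bump allocators are made up for illustration; real code would go through the driver's range allocator and handle eviction, which this deliberately ignores.

/*
 * Toy illustration of the two-class split: objects that never need CPU
 * access through the GTT mmio window (hw contexts, ring buffers) are kept
 * out of the mappable range, so the mappable aperture stays unfragmented
 * for gtt mmaps of scanout and similar buffers.
 */
#include <stdbool.h>
#include <stdint.h>

#define GTT_TOTAL    (2048ULL << 20)   /* 2 GiB of GTT address space  */
#define GTT_MAPPABLE (256ULL << 20)    /* 256 MiB CPU-mappable window */

struct toy_vma {
        uint64_t start;
        uint64_t size;
        bool needs_cpu_mmap;   /* scanout: true, hw context: false */
};

/* Two trivial bump allocators, one per region. */
static uint64_t next_low;                 /* grows up inside the mappable window */
static uint64_t next_high = GTT_TOTAL;    /* grows down from the top, unmappable */

static bool toy_pin(struct toy_vma *vma)
{
        if (vma->needs_cpu_mmap) {
                if (next_low + vma->size > GTT_MAPPABLE)
                        return false;     /* mappable window exhausted */
                vma->start = next_low;
                next_low += vma->size;
        } else {
                if (vma->size > next_high - GTT_MAPPABLE)
                        return false;     /* unmappable region exhausted */
                next_high -= vma->size;
                vma->start = next_high;
        }
        return true;
}

The point is only that anything pinned with needs_cpu_mmap == false never competes for the mappable window, which is what the hw-context question at the top of this mail is getting at.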
Re: [Intel-gfx] [PATCH 20/62] drm/i915/bdw: Add GTT functions
> -Original Message- > From: intel-gfx-boun...@lists.freedesktop.org [mailto:intel-gfx- > boun...@lists.freedesktop.org] On Behalf Of Ben Widawsky > Sent: Sunday, November 03, 2013 4:07 AM > To: Intel GFX > Cc: Daniel Vetter; Ben Widawsky; Widawsky, Benjamin > Subject: [Intel-gfx] [PATCH 20/62] drm/i915/bdw: Add GTT functions > > With the PTE clarifications, the bind and clear functions can now be added for > gen8. > > v2: Use for_each_sg_pages in gen8_ggtt_insert_entries. > > v3: Drop dev argument to pte encode functions, upstream lost it. Also rebase > on top of the scratch page movement. > > v4: Rebase on top of the new address space vfuncs. > > v5: Add the bool use_scratch argument to clear_range and the bool valid > argument to the PTE encode function to follow upstream changes. > > Signed-off-by: Ben Widawsky (v1) > Signed-off-by: Daniel Vetter > --- > drivers/gpu/drm/i915/i915_gem_gtt.c | 88 > +++-- > 1 file changed, 85 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index 8bf2184..df992dc 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -58,6 +58,15 @@ typedef uint64_t gen8_gtt_pte_t; > #define HSW_WB_ELLC_LLC_AGE0 > HSW_CACHEABILITY_CONTROL(0xb) > #define HSW_WT_ELLC_LLC_AGE0 > HSW_CACHEABILITY_CONTROL(0x6) > > +static inline gen8_gtt_pte_t gen8_pte_encode(dma_addr_t addr, > + enum i915_cache_level level, > + bool valid) > +{ > + gen8_gtt_pte_t pte = valid ? _PAGE_PRESENT | _PAGE_RW : 0; > + pte |= addr; > + return pte; > +} > + > static gen6_gtt_pte_t snb_pte_encode(dma_addr_t addr, >enum i915_cache_level level, >bool valid) > @@ -576,6 +585,56 @@ int i915_gem_gtt_prepare_object(struct > drm_i915_gem_object *obj) > return 0; > } > > +static inline void gen8_set_pte(void __iomem *addr, gen8_gtt_pte_t pte) > +{ #ifdef writeq > + writeq(pte, addr); > +#else > + iowrite32((u32)pte, addr); > + iowrite32(pte >> 32, addr + 4); > +#endif > +} > + > +static void gen8_ggtt_insert_entries(struct i915_address_space *vm, > + struct sg_table *st, > + unsigned int first_entry, > + enum i915_cache_level level) > +{ > + struct drm_i915_private *dev_priv = vm->dev->dev_private; > + gen8_gtt_pte_t __iomem *gtt_entries = > + (gen8_gtt_pte_t __iomem *)dev_priv->gtt.gsm + > first_entry; > + int i = 0; > + struct sg_page_iter sg_iter; > + dma_addr_t addr; > + > + for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) { > + addr = sg_dma_address(sg_iter.sg) + > + (sg_iter.sg_pgoffset << PAGE_SHIFT); > + gen8_set_pte(>t_entries[i], > + gen8_pte_encode(addr, level, true)); > + i++; > + } > + > + /* XXX: This serves as a posting read to make sure that the PTE has > + * actually been updated. There is some concern that even though > + * registers and PTEs are within the same BAR that they are > potentially > + * of NUMA access patterns. Therefore, even with the way we > assume > + * hardware should work, we must keep this posting read for > paranoia. > + */ > + if (i != 0) > + WARN_ON(readl(>t_entries[i-1]) > + != gen8_pte_encode(addr, level, true)); Comparing a u32 with a 64-bit page-table entry ? ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
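[Editor's note] On the read-back question at the end: a sketch of a like-for-like 64-bit posting read, mirroring the #ifdef writeq split that gen8_set_pte uses in the quoted hunk. The helper name is made up; this is illustrative kernel-style C, not the actual patch.

#include <linux/io.h>
#include <linux/types.h>

static inline u64 example_read_pte64(void __iomem *addr)
{
#ifdef readq
        return readq(addr);
#else
        u64 lo = readl(addr);
        u64 hi = readl(addr + 4);

        return lo | (hi << 32);
#endif
}

/* The paranoia check could then compare full 64-bit values:
 *      WARN_ON(example_read_pte64(&gtt_entries[i - 1]) !=
 *              gen8_pte_encode(addr, level, true));
 */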
Re: [Intel-gfx] PROBLEM: i915 modesetting - weird offset graphics (v2.6.37-rc1-27-gff8b16d)
On Thu, 2010-11-04 at 13:41 +, Chris Wilson wrote: > On Thu, 04 Nov 2010 09:37:01 -0400, Jon Masters > wrote: > > Also, the intel-gfx list is moderated with auto-rejection. Do I really > > need to subscribe to the list just to get a mail through? > > A few of the Intel gfx engineers are at Plumbers. If you tell Jesse about > it, he may be able to convince the powers-that-be to change the policy. I didn't see this mail in time. I did show a few folks the corruption, ran the kernel with drm.debug=4, etc. and checked the modelines. xrandr and the kernel seem to report the same modes each time, but obviously the display is shifted. I see various changes in git (everything from EDID reading to actual modeline changes in the i915 code - various weird margin settings seem to exist that might well cause some offset in the display) that could be responsible, so I will bisect it now for you. My concern with the email list is that it's listed in the MAINTAINERS file, but it's essentially of no value to bug reports unless they are already subscribed to it (moderation is fine, even auto reject after 3 days - a standard mm feature, but I feel auto-reject is a little too harsh). Needless to say, I'm subscribed now :) Thanks, Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] PROBLEM: i915 modesetting - weird offset graphics (v2.6.37-rc1-27-gff8b16d)
On Thu, 2010-11-04 at 18:14 -0400, Jon Masters wrote: > On Thu, 2010-11-04 at 13:41 +, Chris Wilson wrote: > > On Thu, 04 Nov 2010 09:37:01 -0400, Jon Masters > > wrote: > > > Also, the intel-gfx list is moderated with auto-rejection. Do I really > > > need to subscribe to the list just to get a mail through? > > > > A few of the Intel gfx engineers are at Plumbers. If you tell Jesse about > > it, he may be able to convince the powers-that-be to change the policy. > > I didn't see this mail in time. I did show a few folks the corruption, > ran the kernel with drm.debug=4, etc. and checked the modelines. xrandr > and the kernel seem to report the same modes each time, but obviously > the display is shifted. I see various changes in git (everything from > EDID reading to actual modeline changes in the i915 code - various weird > margin settings seem to exist that might well cause some offset in the > display) that could be responsible, so I will bisect it now for you. Bisection is running. Should post something over the weekend. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] PROBLEM: i915 modesetting - weird offset graphics (v2.6.37-rc1-27-gff8b16d)
On Thu, 2010-11-04 at 03:29 -0400, Jon Masters wrote: > On Thu, 2010-11-04 at 03:25 -0400, Jon Masters wrote: > > > I upgraded the userspace on my EeePC 1015PEM netbook to the latest > > Fedora (F15) rawhide and booted a custom kernel in order to test both > > the RC1 and also see if the now in-staging brcm80211 behaves better (it > > doesn't, it still can't survive suspend at all, separate issue though). > > > > The xrandr output is identical in both case, showing a mode of 1024x600 > > being the one that X is currently using. Booting with i915 modesetting > > disabled does avoid the weird offset but of course X doesn't start. I > > can bisect this if there isn't already some suggestions from the > > audience, in particular is to what debug information you need. > > > > Kernel: v2.6.37-rc1-27-gff8b16d > > > > Photos: > > http://www.flickr.com/photos/jonmasters/5145351220/ > > http://www.flickr.com/photos/jonmasters/5144748543/ > > Video: > http://www.youtube.com/watch?v=43jc9_7n3kU > > I'll bring it in to Plumbers in a few hours. I took it to Plumbers and pinged a few people in person, but nobody was all that interested. I am bisecting this at the moment, a few builds in. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [bisected] offset display bug in i915
Hi Chris, The following patch that you recently committed breaks my ASUS Eee PC 1015PEM by causing the display to be offset by about 1 inch (a few centimeters) when the mode is (re)set during boot. I previously posted both photographs and video of the problem in another "PROBLEM" thread. Here is the offending commit found through bisection: 219adae138513bae20b256f1946b9cb3b75ca05c is the first bad commit commit 219adae138513bae20b256f1946b9cb3b75ca05c Author: Chris Wilson Date: Thu Sep 16 23:05:10 2010 +0100 drm/i915: Cache LVDS EDID We assume that the panel is permenantly connected and that the EDID data is consistent from boot, so simply cache the whole EDID for the panel. Signed-off-by: Chris Wilson I'll look at the patch. Either the EDID is *not* consistent (in which case, why are we not seeing other bugs like this?) or there is something specific to this system or panel used. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 05:18 -0500, Jon Masters wrote: > The following patch that you recently committed breaks my ASUS Eee PC > 1015PEM by causing the display to be offset by about 1 inch (a few > centimeters) when the mode is (re)set during boot. I previously posted > both photographs and video of the problem in another "PROBLEM" thread. Here's my bisection log, for the record. git bisect start # good: [f6f94e2ab1b33f0082ac22d71f66385a60d8157f] Linux 2.6.36 git bisect good f6f94e2ab1b33f0082ac22d71f66385a60d8157f # bad: [c8ddb2713c624f432fa5fe3c7ecffcdda46ea0d4] Linux 2.6.37-rc1 git bisect bad c8ddb2713c624f432fa5fe3c7ecffcdda46ea0d4 # good: [33081adf8b89d5a716d7e1c60171768d39795b39] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 git bisect good 33081adf8b89d5a716d7e1c60171768d39795b39 # bad: [00ebb6382b8d9c7c15b5f8ad230670d8161d38dd] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc git bisect bad 00ebb6382b8d9c7c15b5f8ad230670d8161d38dd # good: [520045db940a381d2bee1c1b2179f7921b40fb10] Merge branches 'upstream/xenfs' and 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen git bisect good 520045db940a381d2bee1c1b2179f7921b40fb10 # bad: [60ee6d5faf5f7920ba88b82c072864596f5b88af] ipmi: fix __init and __exit attribute locations git bisect bad 60ee6d5faf5f7920ba88b82c072864596f5b88af # bad: [e6b46ee712b92db1cc2449cf4f65bc635366cad4] Merge branch 'drm-vmware-next' into drm-core-next git bisect bad e6b46ee712b92db1cc2449cf4f65bc635366cad4 # good: [f7abfe8b281991c66406c42c1a6c6c9ee0daa0ff] drm/i915: Fix an overlay regression from 7e7d76c git bisect good f7abfe8b281991c66406c42c1a6c6c9ee0daa0ff # bad: [9e0ae53404700f1e4ae1f33b0ff92948ae0e509d] drm/i915: Don't overwrite the returned error-code git bisect bad 9e0ae53404700f1e4ae1f33b0ff92948ae0e509d # bad: [881f47b64723f4d697084533491a489e3e74b10f] drm/i915: add a new BSD ring buffer for Sandybridge git bisect bad 881f47b64723f4d697084533491a489e3e74b10f # good: [f899fc64cda8569d0529452aafc0da31c042df2e] drm/i915: use GMBUS to manage i2c links git bisect good f899fc64cda8569d0529452aafc0da31c042df2e # bad: [b84d5f0c22914d37d709add54c66e741c404fa56] drm/i915: Inline i915_gem_ring_retire_request() git bisect bad b84d5f0c22914d37d709add54c66e741c404fa56 # bad: [f49f0586191fe16140410db0a46d43bdc690d6af] drm/i915: Actually set the reset bit in i965_reset. git bisect bad f49f0586191fe16140410db0a46d43bdc690d6af # bad: [219adae138513bae20b256f1946b9cb3b75ca05c] drm/i915: Cache LVDS EDID git bisect bad 219adae138513bae20b256f1946b9cb3b75ca05c # good: [e9e5f8e8d373e72f5c39dafde1ce110fc7082118] Merge branch 'drm-intel-fixes' into HEAD git bisect good e9e5f8e8d373e72f5c39dafde1ce110fc7082118 Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 10:27 +, Chris Wilson wrote: > On Mon, 08 Nov 2010 05:18:32 -0500, Jon Masters > wrote: > > Hi Chris, > > > > The following patch that you recently committed breaks my ASUS Eee PC > > 1015PEM by causing the display to be offset by about 1 inch (a few > > centimeters) when the mode is (re)set during boot. I previously posted > > both photographs and video of the problem in another "PROBLEM" thread. > > [snip] > > > I'll look at the patch. Either the EDID is *not* consistent (in which > > case, why are we not seeing other bugs like this?) or there is something > > specific to this system or panel used. > > Interesting. Or even downright bizarre! Can you disable the caching > (hopefully the patch is still revertible) and save the EDID > (/sys/class/drm/card0-LVDS-1/edid) periodically? Might be simplest just to > dump the EDID every time we probe the LVDS if it has changed. Already onto it. It's 6am here so I'll sleep soon ;) but I just applied the following patch and booted it/logged into X without the problem (apologies if evolution corrupts it, I usually send patches with mutt): >From c2fed0d864935603b1d71c71ae53d197f52e346f Mon Sep 17 00:00:00 2001 From: Jon Masters Date: Mon, 8 Nov 2010 05:39:42 -0500 Subject: [PATCH] drm/i915: revert LVDS EDID cacheing The previous commit v2.6.36-rc5-173-g219adae changed the Intel i915 driver to cache the EDID result obtained from the panel, but in so doing broke some systems by causing weird modesetting problems. Signed-off-by: Jon Masters --- drivers/gpu/drm/i915/intel_lvds.c | 19 +++ 1 files changed, 11 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c index f1a6499..78153df 100644 --- a/drivers/gpu/drm/i915/intel_lvds.c +++ b/drivers/gpu/drm/i915/intel_lvds.c @@ -44,7 +44,7 @@ struct intel_lvds { struct intel_encoder base; - struct edid *edid; + bool edid_good; int fitting_mode; u32 pfit_control; @@ -479,12 +479,14 @@ static int intel_lvds_get_modes(struct drm_connector *connector) { struct intel_lvds *intel_lvds = intel_attached_lvds(connector); struct drm_device *dev = connector->dev; + struct drm_i915_private *dev_priv = dev->dev_private; struct drm_display_mode *mode; - if (intel_lvds->edid) { - drm_mode_connector_update_edid_property(connector, - intel_lvds->edid); - return drm_add_edid_modes(connector, intel_lvds->edid); + if (intel_lvds->edid_good) { + int ret = intel_ddc_get_modes(connector, + &dev_priv->gmbus[GMBUS_PORT_PANEL].adapter); + if (ret) + return ret; } mode = drm_mode_duplicate(dev, intel_lvds->fixed_mode); @@ -937,10 +939,11 @@ void intel_lvds_init(struct drm_device *dev) * Attempt to get the fixed panel mode from DDC. Assume that the * preferred mode is the right one. 
*/ - intel_lvds->edid = drm_get_edid(connector, - &dev_priv->gmbus[pin].adapter); + intel_lvds->edid_good = true; + if (!intel_ddc_get_modes(connector, &dev_priv->gmbus[GMBUS_PORT_PANEL].adapter)) + intel_lvds->edid_good = false; - if (!intel_lvds->edid) { + if (!intel_lvds->edid_good) { /* Didn't get an EDID, so * Set wide sync ranges so we get all modes * handed to valid_mode for checking -- 1.7.3.2 Here is the EDID output after booting: [...@monticello ~]$ hexdump /sys/class/drm/card0-LVDS-1/edid 000 ff00 00ff 6422 03e9 8544 0001 010 141c 0301 1680 780d 860a 9426 5157 2790 020 4f21 0054 0101 0101 0101 0101 0101 030 0101 0101 0101 1194 b000 5840 2019 2335 040 0045 81dc 1900 1416 d800 5840 2026 050 235d 0415 81dc fe00 060 fe00 070 0100 f200 080 Sounds like you are suggesting that I hack up the intel_lvds to just printk the EDID when it is read? I can look at that, probably after sleeping. Perhaps you can also send me a patch to get what you want? I'm on IRC, and pingable later on today if you want me to try stuff. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
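[Editor's note] For anyone decoding that hexdump by hand, here is a minimal standalone C sketch that pulls the preferred mode out of the first 18-byte detailed timing descriptor of an EDID 1.3 block (illustrative only, not kernel or edid-decode code). Run against this panel's EDID it gives 1024x600, a 45.000 MHz clock and 1200x625 totals, which matches the decode posted further down the thread.

#include <stdint.h>
#include <stdio.h>

static void print_preferred_mode(const uint8_t edid[128])
{
        const uint8_t *d = &edid[54];   /* detailed timing descriptor 1 */
        unsigned int clock_khz = (d[0] | d[1] << 8) * 10;
        unsigned int hactive = d[2] | ((d[4] & 0xf0) << 4);
        unsigned int hblank  = d[3] | ((d[4] & 0x0f) << 8);
        unsigned int vactive = d[5] | ((d[7] & 0xf0) << 4);
        unsigned int vblank  = d[6] | ((d[7] & 0x0f) << 8);

        printf("%ux%u, clock %u.%03u MHz, htotal %u, vtotal %u\n",
               hactive, vactive, clock_khz / 1000, clock_khz % 1000,
               hactive + hblank, vactive + vblank);
}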
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 05:54 -0500, Jon Masters wrote: > Here is the EDID output after booting: > > [...@monticello ~]$ hexdump /sys/class/drm/card0-LVDS-1/edid > 000 ff00 00ff 6422 03e9 8544 0001 > 010 141c 0301 1680 780d 860a 9426 5157 2790 > 020 4f21 0054 0101 0101 0101 0101 0101 > 030 0101 0101 0101 1194 b000 5840 2019 2335 > 040 0045 81dc 1900 1416 d800 5840 2026 > 050 235d 0415 81dc fe00 > 060 fe00 > 070 0100 f200 > 080 As I mentioned on IRC, I'm familiar with how I2C works electrically, and therefore EDID implementation as a concept, but I am not really a graphics hacker so I wasn't aware that you prefer edid-decode :) Here is a decoded version of the output: Extracted contents: header: 00 ff ff ff ff ff ff 00 serial number: 22 64 e9 03 44 85 01 00 1c 14 version: 01 03 basic params:80 16 0d 78 0a chroma info: 86 26 94 57 51 90 27 21 4f 54 established: 00 00 00 standard:01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 descriptor 1:94 11 00 b0 40 58 19 20 35 23 45 00 dc 81 00 00 00 19 descriptor 2:16 14 00 d8 40 58 26 20 5d 23 15 04 dc 81 00 00 00 00 descriptor 3:00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 descriptor 4:00 00 00 fe 00 00 00 00 00 00 00 00 00 01 00 00 00 00 extensions: 00 checksum:f2 Manufacturer: HSD Model 3e9 Serial Number 99652 Made week 28 of 2010 EDID version: 1.3 Digital display Maximum image size: 22 cm x 13 cm Gamma: 2.20 Supported color formats: RGB 4:4:4, YCrCb 4:2:2 First detailed timing is preferred timing Established timings supported: Standard timings supported: Detailed mode: Clock 45.000 MHz, 220 mm x 129 mm 1024 1077 1112 1200 hborder 0 600 604 609 625 vborder 0 -hsync -vsync Detailed mode: Clock 51.420 MHz, 220 mm x 129 mm 1024 1117 1152 1240 hborder 0 600 617 622 638 vborder 0 -hsync -vsync analog composite ASCII string: ASCII string: Checksum: 0xf2 EDID block does NOT conform to EDID 1.3! Missing name descriptor Missing monitor ranges If you send me a hacked up patch that dumps out whatever you want in the kernel ringbugger, I'll build and run it for you. Presumably you are expecting some of the reported geometry to vary, etc. 
Here is the output from xrandr FWIW (but as I mentioned, this is happening way before X gets its hands on the scene): [...@monticello ~]$ xrandr --verbose Screen 0: minimum 320 x 200, current 1024 x 600, maximum 4096 x 4096 LVDS1 connected 1024x600+0+0 (0x43) normal (normal left inverted right x axis y axis) 220mm x 129mm Identifier: 0x41 Timestamp: 337852 Subpixel: horizontal rgb Gamma: 1.0:1.0:1.0 Brightness: 1.0 Clones: CRTC: 1 CRTCs: 1 Transform: 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 1.00 filter: EDID: 00002264e90344850100 1c14010380160d780a86269457519027 214f540001010101010101010101 010101010101941100b0405819203523 4500dc810019161400d840582620 5d231504dc8100fe 00fe 000100f2 BACKLIGHT: 10 (0x000a) range: (0,10) Backlight: 10 (0x000a) range: (0,10) scaling mode: Full aspect supported: None Full Center Full aspect 1024x600 (0x43) 45.0MHz -HSync -VSync *current +preferred h: width 1024 start 1077 end 1112 total 1200 skew0 clock 37.5KHz v: height 600 start 604 end 609 total 625 clock 60.0Hz 1024x600 (0x44) 51.4MHz -HSync -VSync h: width 1024 start 1117 end 1152 total 1240 skew0 clock 41.5KHz v: height 600 start 601 end 606 total 638 clock 65.0Hz 800x600 (0x45) 40.0MHz +HSync +VSync h: width 800 start 840 end 968 total 1056 skew0 clock 37.9KHz v: height 600 start 601 end 605 total 628 clock 60.3Hz 800x600 (0x46) 36.0MHz +HSync +VSync h: width 800 start 824 end 896 total 1024 skew0 clock 35.2KHz v: height 600 start 601 end 603 total 625 clock 56.2Hz 640x480 (0x47) 25.2MHz -HSync -VSync h: width 640 start 656 end 752 total 800 skew0 clock 31.5KHz v: height 480 start 490 end 492 total 525 clock 59.9Hz VGA1 disconnected (normal left inverted right x axis y axis) Identifier: 0x42 Timestamp: 337852 Subpixel: unknown Clones: CR
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 06:22 -0500, Jon Masters wrote: > On Mon, 2010-11-08 at 05:54 -0500, Jon Masters wrote: > > > Here is the EDID output after booting: > > > > [...@monticello ~]$ hexdump /sys/class/drm/card0-LVDS-1/edid > > 000 ff00 00ff 6422 03e9 8544 0001 > > 010 141c 0301 1680 780d 860a 9426 5157 2790 > > 020 4f21 0054 0101 0101 0101 0101 0101 > > 030 0101 0101 0101 1194 b000 5840 2019 2335 > > 040 0045 81dc 1900 1416 d800 5840 2026 > > 050 235d 0415 81dc fe00 > > 060 fe00 > > 070 0100 f200 > > 080 > > As I mentioned on IRC, I'm familiar with how I2C works electrically, and > therefore EDID implementation as a concept, but I am not really a > graphics hacker so I wasn't aware that you prefer edid-decode :) > > Here is a decoded version of the output: And here is an old file I had with the output when running a broken kernel with the cacheing enabled: Screen 0: minimum 320 x 200, current 1024 x 600, maximum 4096 x 4096 LVDS1 connected 1024x600+0+0 (0x43) normal (normal left inverted right x axis y axis) 220mm x 129mm Identifier: 0x41 Timestamp: 1402374 Subpixel: horizontal rgb Gamma: 1.0:1.0:1.0 Brightness: 1.0 Clones: CRTC: 1 CRTCs: 1 Transform: 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 1.00 filter: EDID: 00002264e90344850100 1c14010380160d780a86269457519027 214f540001010101010101010101 010101010101941100b0405819203523 4500dc810019161400d840582620 5d231504dc8100fe 00fe 000100f2 BACKLIGHT: 10 (0x000a) range: (0,10) Backlight: 10 (0x000a) range: (0,10) scaling mode: Full aspect supported: None Full Center Full aspect 1024x600 (0x43) 45.0MHz -HSync -VSync *current +preferred h: width 1024 start 1077 end 1112 total 1200 skew0 clock 37.5KHz v: height 600 start 604 end 609 total 625 clock 60.0Hz 1024x600 (0x44) 51.4MHz -HSync -VSync h: width 1024 start 1117 end 1152 total 1240 skew0 clock 41.5KHz v: height 600 start 601 end 606 total 638 clock 65.0Hz 800x600 (0x45) 40.0MHz +HSync +VSync h: width 800 start 840 end 968 total 1056 skew0 clock 37.9KHz v: height 600 start 601 end 605 total 628 clock 60.3Hz 800x600 (0x46) 36.0MHz +HSync +VSync h: width 800 start 824 end 896 total 1024 skew0 clock 35.2KHz v: height 600 start 601 end 603 total 625 clock 56.2Hz 640x480 (0x47) 25.2MHz -HSync -VSync h: width 640 start 656 end 752 total 800 skew0 clock 31.5KHz v: height 480 start 490 end 492 total 525 clock 59.9Hz VGA1 disconnected (normal left inverted right x axis y axis) Identifier: 0x42 Timestamp: 1402374 Subpixel: unknown Clones: CRTCs: 0 1 Transform: 1.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 1.00 filter: Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 11:26 +, James Courtier-Dutton wrote: > On 8 November 2010 10:27, Chris Wilson wrote: > > On Mon, 08 Nov 2010 05:18:32 -0500, Jon Masters > > wrote: > >> Hi Chris, > >> > >> The following patch that you recently committed breaks my ASUS Eee PC > >> 1015PEM by causing the display to be offset by about 1 inch (a few > >> centimeters) when the mode is (re)set during boot. I previously posted > >> both photographs and video of the problem in another "PROBLEM" thread. > > > > [snip] > > > >> I'll look at the patch. Either the EDID is *not* consistent (in which > >> case, why are we not seeing other bugs like this?) or there is something > >> specific to this system or panel used. > > > > Interesting. Or even downright bizarre! Can you disable the caching > > (hopefully the patch is still revertible) and save the EDID > > (/sys/class/drm/card0-LVDS-1/edid) periodically? Might be simplest just to > > dump the EDID every time we probe the LVDS if it has changed. > I have observed major corruption issues for kernel 2.6.37. Sad to hear, but none of what you are describing is related to this bug. This bug has to do specifically with a small patch that added cacheing of EDID panel data under the assumption that it wouldn't change. It only affects your display, not linker, library objects, file systems, or in fact anything other than getting geometry and location on screen right. > Going back to 2.6.36 stable restores things to normality. > I unfortunately do not have time to track down the problem, but I > thought I would mention it as it could be causing your problem. > My problems were: > Corrupting the library cache so that .so files failed to load any more. > Corrupting .o files during compilation so that the linker did not even > recognize them as .o files. > Corrupted data between a DVB card and the recording on the HD. > I am just mentioning it, because the cause might in fact be somewhere else. You might have a bad compiler, be switching distro packages, who knows. Perhaps you can file a detailed bug report explaining your problems on the bugzilla.kernel.org, or followup with more detail (to a new thread). Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 06:29 -0500, Jon Masters wrote: > On Mon, 2010-11-08 at 06:22 -0500, Jon Masters wrote: > > On Mon, 2010-11-08 at 05:54 -0500, Jon Masters wrote: > > > > > Here is the EDID output after booting: > > > > > > [...@monticello ~]$ hexdump /sys/class/drm/card0-LVDS-1/edid > > > 000 ff00 00ff 6422 03e9 8544 0001 > > > 010 141c 0301 1680 780d 860a 9426 5157 2790 > > > 020 4f21 0054 0101 0101 0101 0101 0101 > > > 030 0101 0101 0101 1194 b000 5840 2019 2335 > > > 040 0045 81dc 1900 1416 d800 5840 2026 > > > 050 235d 0415 81dc fe00 > > > 060 fe00 > > > 070 0100 f200 > > > 080 > > > > As I mentioned on IRC, I'm familiar with how I2C works electrically, and > > therefore EDID implementation as a concept, but I am not really a > > graphics hacker so I wasn't aware that you prefer edid-decode :) > > > > Here is a decoded version of the output: > > And here is an old file I had with the output when running a broken > kernel with the cacheing enabled: > > Screen 0: minimum 320 x 200, current 1024 x 600, maximum 4096 x 4096 > LVDS1 connected 1024x600+0+0 (0x43) normal (normal left inverted right x > axis y axis) 220mm x 129mm > Identifier: 0x41 > Timestamp: 1402374 > Subpixel: horizontal rgb > Gamma: 1.0:1.0:1.0 > Brightness: 1.0 > Clones: > CRTC: 1 > CRTCs: 1 > Transform: 1.00 0.00 0.00 > 0.00 1.00 0.00 > 0.00 0.00 1.00 > filter: > EDID: > 00002264e90344850100 > 1c14010380160d780a86269457519027 > 214f540001010101010101010101 > 010101010101941100b0405819203523 > 4500dc810019161400d840582620 > 5d231504dc8100fe > 00fe > 000100f2 > BACKLIGHT: 10 (0x000a) range: (0,10) > Backlight: 10 (0x000a) range: (0,10) > scaling mode: Full aspect > supported: None Full Center Full aspect > 1024x600 (0x43) 45.0MHz -HSync -VSync *current +preferred > h: width 1024 start 1077 end 1112 total 1200 skew0 clock > 37.5KHz > v: height 600 start 604 end 609 total 625 clock > 60.0Hz > 1024x600 (0x44) 51.4MHz -HSync -VSync > h: width 1024 start 1117 end 1152 total 1240 skew0 clock > 41.5KHz > v: height 600 start 601 end 606 total 638 clock > 65.0Hz > 800x600 (0x45) 40.0MHz +HSync +VSync > h: width 800 start 840 end 968 total 1056 skew0 clock > 37.9KHz > v: height 600 start 601 end 605 total 628 clock > 60.3Hz > 800x600 (0x46) 36.0MHz +HSync +VSync > h: width 800 start 824 end 896 total 1024 skew0 clock > 35.2KHz > v: height 600 start 601 end 603 total 625 clock > 56.2Hz > 640x480 (0x47) 25.2MHz -HSync -VSync > h: width 640 start 656 end 752 total 800 skew0 clock > 31.5KHz > v: height 480 start 490 end 492 total 525 clock > 59.9Hz > VGA1 disconnected (normal left inverted right x axis y axis) > Identifier: 0x42 > Timestamp: 1402374 > Subpixel: unknown > Clones: > CRTCs: 0 1 > Transform: 1.00 0.00 0.00 > 0.00 1.00 0.00 > 0.00 0.00 1.00 > filter: FWIW I don't recall seeing any reported difference in xrandr. I think what I need is a hacked up debug patch if you can spin me one. Night! Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
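[Editor's note] The throwaway debug patch being asked for here amounts to dumping the EDID bytes to the kernel log each time the panel is probed. A hedged sketch follows; the function name and placement are hypothetical (it would sit somewhere in intel_lvds.c), but the helpers used are standard kernel/DRM ones.

static void example_dump_lvds_edid(struct drm_connector *connector,
                                   struct i2c_adapter *adapter)
{
        struct edid *edid = drm_get_edid(connector, adapter);

        if (!edid)
                return;
        print_hex_dump_bytes("lvds edid: ", DUMP_PREFIX_OFFSET,
                             edid, EDID_LENGTH);
        kfree(edid);
}

Diffing successive dumps in dmesg would show directly whether the panel's EDID ever changes between probes, which is the assumption the caching patch relies on.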
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 11:33 +, James Courtier-Dutton wrote: > On 8 November 2010 10:54, Jon Masters wrote: > > > > Here is the EDID output after booting: > > > > [...@monticello ~]$ hexdump /sys/class/drm/card0-LVDS-1/edid > > 000 ff00 00ff 6422 03e9 8544 0001 > > 010 141c 0301 1680 780d 860a 9426 5157 2790 > > 020 4f21 0054 0101 0101 0101 0101 0101 > > 030 0101 0101 0101 1194 b000 5840 2019 2335 > > 040 0045 81dc 1900 1416 d800 5840 2026 > > 050 235d 0415 81dc fe00 > > 060 fe00 > > 070 0100 f200 > > 080 > > > hexdump -C is a little easier to read and compare with other hexdumps. Yea, but I already ran edid-decode for Chris. Chris: I suggest your idea of hacking up edid-decode to also take hexdumps is a good idea for the future. That way you can feed it the output from xrandr, or from people who like me don't realize that it's preferred to post the interpreted output. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
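[Editor's note] In the meantime, converting a pasted hex blob (for example the EDID property that xrandr --verbose prints) back into raw bytes is a few lines of C. A rough sketch, assuming the input is only hex digits and whitespace (hexdump offset columns would need stripping first); this is not edid-decode's actual code.

#include <ctype.h>
#include <stdio.h>

int main(void)
{
        int c, hi = -1;

        while ((c = getchar()) != EOF) {
                int v;

                if (!isxdigit(c))
                        continue;       /* skip whitespace and punctuation */
                v = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
                if (hi < 0) {
                        hi = v;
                } else {
                        putchar((hi << 4) | v);
                        hi = -1;
                }
        }
        return 0;
}

The raw bytes it writes to stdout can then be handed to edid-decode for the same report as reading the binary file from sysfs.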
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 12:13 +, Chris Wilson wrote: > On Mon, 08 Nov 2010 06:29:09 -0500, Jon Masters > wrote: > > On Mon, 2010-11-08 at 06:22 -0500, Jon Masters wrote: > > > As I mentioned on IRC, I'm familiar with how I2C works electrically, and > > > therefore EDID implementation as a concept, but I am not really a > > > graphics hacker so I wasn't aware that you prefer edid-decode :) > > > > > > Here is a decoded version of the output: > > > > And here is an old file I had with the output when running a broken > > kernel with the cacheing enabled: > > They look to be identical. So it is not necessarily the caching that is > the issue, but that we now no longer do an essential step whilst bringing > up the lvds. Can you just check that the reported EDIDs are the same for > when it is behaving correctly and when the display is offset? diff says they are the same. Here's the output from a broken kernel: $ cat edid_broken_hexdump.txt 000 ff00 00ff 6422 03e9 8544 0001 010 141c 0301 1680 780d 860a 9426 5157 2790 020 4f21 0054 0101 0101 0101 0101 0101 030 0101 0101 0101 1194 b000 5840 2019 2335 040 0045 81dc 1900 1416 d800 5840 2026 050 235d 0415 81dc fe00 060 fe00 070 0100 f200 080 $ cat edid_broken_decode.txt Extracted contents: header: 00 ff ff ff ff ff ff 00 serial number: 22 64 e9 03 44 85 01 00 1c 14 version: 01 03 basic params:80 16 0d 78 0a chroma info: 86 26 94 57 51 90 27 21 4f 54 established: 00 00 00 standard:01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 descriptor 1:94 11 00 b0 40 58 19 20 35 23 45 00 dc 81 00 00 00 19 descriptor 2:16 14 00 d8 40 58 26 20 5d 23 15 04 dc 81 00 00 00 00 descriptor 3:00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00 00 00 00 descriptor 4:00 00 00 fe 00 00 00 00 00 00 00 00 00 01 00 00 00 00 extensions: 00 checksum:f2 Manufacturer: HSD Model 3e9 Serial Number 99652 Made week 28 of 2010 EDID version: 1.3 Digital display Maximum image size: 22 cm x 13 cm Gamma: 2.20 Supported color formats: RGB 4:4:4, YCrCb 4:2:2 First detailed timing is preferred timing Established timings supported: Standard timings supported: Detailed mode: Clock 45.000 MHz, 220 mm x 129 mm 1024 1077 1112 1200 hborder 0 600 604 609 625 vborder 0 -hsync -vsync Detailed mode: Clock 51.420 MHz, 220 mm x 129 mm 1024 1117 1152 1240 hborder 0 600 617 622 638 vborder 0 -hsync -vsync analog composite ASCII string: ASCII string: Checksum: 0xf2 EDID block does NOT conform to EDID 1.3! Missing name descriptor Missing monitor ranges > The secondary change in the commit was to use drm_get_edid() directly > during device detection which delayed setting the EDID property on the > connection until get_modes(). Does this make a difference? > > diff --git a/drivers/gpu/drm/i915/intel_lvds.c > b/drivers/gpu/drm/i915/intel_lvds > index f1a6499..17bcb7d 100644 > --- a/drivers/gpu/drm/i915/intel_lvds.c > +++ b/drivers/gpu/drm/i915/intel_lvds.c > @@ -940,7 +940,10 @@ void intel_lvds_init(struct drm_device *dev) > intel_lvds->edid = drm_get_edid(connector, > &dev_priv->gmbus[pin].adapter); > > - if (!intel_lvds->edid) { > + if (intel_lvds->edid) { > + drm_mode_connector_update_edid_property(connector, > + intel_lvds->edid); > + } else { Nope. This makes no difference at all. Got anything else up your sleeve? Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [bisected] offset display bug in i915
On Mon, 2010-11-08 at 12:13 +, Chris Wilson wrote: > On Mon, 08 Nov 2010 06:29:09 -0500, Jon Masters > wrote: > > On Mon, 2010-11-08 at 06:22 -0500, Jon Masters wrote: > > > As I mentioned on IRC, I'm familiar with how I2C works electrically, and > > > therefore EDID implementation as a concept, but I am not really a > > > graphics hacker so I wasn't aware that you prefer edid-decode :) > > > > > > Here is a decoded version of the output: > > > > And here is an old file I had with the output when running a broken > > kernel with the cacheing enabled: > > They look to be identical. So it is not necessarily the caching that is > the issue, but that we now no longer do an essential step whilst bringing > up the lvds. Can you just check that the reported EDIDs are the same for > when it is behaving correctly and when the display is offset? > > The secondary change in the commit was to use drm_get_edid() directly > during device detection which delayed setting the EDID property on the > connection until get_modes(). Does this make a difference? > > diff --git a/drivers/gpu/drm/i915/intel_lvds.c > b/drivers/gpu/drm/i915/intel_lvds > index f1a6499..17bcb7d 100644 > --- a/drivers/gpu/drm/i915/intel_lvds.c > +++ b/drivers/gpu/drm/i915/intel_lvds.c > @@ -940,7 +940,10 @@ void intel_lvds_init(struct drm_device *dev) > intel_lvds->edid = drm_get_edid(connector, > &dev_priv->gmbus[pin].adapter); > > - if (!intel_lvds->edid) { > + if (intel_lvds->edid) { > + drm_mode_connector_update_edid_property(connector, > + intel_lvds->edid); > + } else { I tried this, which didn't work. Then I added this (ignore stupid evolution formatting - convenient for email backlog, but not mutt): diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c index 17bcb7d..a0bb443 100644 --- a/drivers/gpu/drm/i915/intel_lvds.c +++ b/drivers/gpu/drm/i915/intel_lvds.c @@ -462,6 +462,14 @@ intel_lvds_detect(struct drm_connector *connector, bool force) { struct drm_device *dev = connector->dev; enum drm_connector_status status = connector_status_connected; + /* JCM - added this */ + struct drm_i915_private *dev_priv = dev->dev_private; + u8 pin = GMBUS_PORT_PANEL; + struct edid *edid = drm_get_edid(connector, &dev_priv->gmbus[pin].adapter); + + /* I was asked to free in here */ + if (edid) + kfree(edid); /* ACPI lid methods were generally unreliable in this generation, so * don't even bother. Which should have made the edid always get freed on detect. That didn't work either. I might wind up learning something about the DRM stack. Anyway. Here's the reg dumper output. 
First, for a busticated kernel: [...@constitution ~]$ cat intel_regs_broken.txt DCC: 0x00200020 (single channel, XOR randomization: enabled, XOR bit: 11) CHDECMISC: 0x4264 (XOR bank, ch2 enh disabled, ch1 enh disabled, ch0 enh enabled, flex disabled, ep not present) C0DRB0: 0x00200020 (0x0020) C0DRB1: 0x00200020 (0x0020) C0DRB2: 0x00200020 (0x0020) C0DRB3: 0x00880020 (0x0020) C1DRB0: 0x (0x) C1DRB1: 0x (0x) C1DRB2: 0x (0x) C1DRB3: 0x (0x) C0DRA01: 0x0088 (0x0088) C0DRA23: 0x (0x) C1DRA01: 0x (0x) C1DRA23: 0x (0x) PGETBL_CTL: 0x0001 VCLK_DIVISOR_VGA0: 0x00200074 (n = 32, m1 = 0, m2 = 52) VCLK_DIVISOR_VGA1: 0x00200074 (n = 32, m1 = 0, m2 = 52) VCLK_POST_DIV: 0x00800100 (vga0 p1 = 2, p2 = 2, vga1 p1 = 3, p2 = 2) DPLL_TEST: 0x00010001 () CACHE_MODE_0: 0x6820 D_STATE: 0x000b DSPCLK_GATE_D: 0x (clock gates disabled:) RENCLK_GATE_D1: 0x RENCLK_GATE_D2: 0x SDVOB: 0x0030 (disabled, pipe A, stall disabled, not detected) SDVOC: 0x0030 (disabled, pipe A, stall disabled, not detected) SDVOUDI: 0x DSPARB: 0x1d9c DSPFW1: 0xf1830f0f DSPFW2: 0x030f DSPFW3: 0x6700018f ADPA: 0x40008c18 (disabled, pipe B, +hsync, +vsync) LVDS: 0xc0308300 (enabled, pipe B, 18 bit, 1 channel) DVOA: 0x (disabled, pipe A, no stall, -hsync, -vsync) DVOB: 0x0030 (disabled, pipe A, no stall, -hsync, -vsync) DVOC: 0x0030 (disabled, pipe A, no stall, -hsync, -vsyn
[Intel-gfx] [Fwd: Re: [PATCH] drm/i915: Fix LVDS fixed-mode regression from 219adae1]
--- Begin Message --- On Mon, 2010-11-08 at 23:24 +, Chris Wilson wrote: > Commit 219adae1 cached the EDID found during LVDS init, but in the > process prevented the init routine from discovering the preferred > fixed-mode for the panel. This was causing us to guess the correct mode, > which sometimes is wide of the mark. > > Reported-by: Jon Masters > Signed-off-by: Chris Wilson I have tested this patch, and my LVDS panel display on this Eee PC 1015PEM is now working correctly :) Obviously I suspect others will be affected in due course, so hopefully this can make the next 2.6.37 RC. Please add my: Tested-by: Jon Masters To this patch also. Jon. --- End Message --- ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
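[Editor's note] For context on what the fix has to restore: per the commit message, the caching change stopped the init routine from discovering the panel's preferred fixed mode, so the driver guessed. In DRM terms that discovery amounts to walking the modes parsed from the EDID and duplicating the one flagged preferred. A hedged sketch of that step, not Chris's actual patch:

static struct drm_display_mode *
example_pick_fixed_mode(struct drm_device *dev,
                        struct drm_connector *connector)
{
        struct drm_display_mode *scan;

        list_for_each_entry(scan, &connector->probed_modes, head) {
                if (scan->type & DRM_MODE_TYPE_PREFERRED)
                        return drm_mode_duplicate(dev, scan);
        }
        return NULL;    /* caller falls back to other mode sources */
}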
Re: [Intel-gfx] PROBLEM: i915 modesetting - weird offset graphics (v2.6.37-rc1-27-gff8b16d)
On Thu, 2010-11-11 at 09:02 +0100, Maciej Rutecki wrote: > On czwartek, 4 listopada 2010 o 08:25:58 Jon Masters wrote: > > Folks, > > > > I upgraded the userspace on my EeePC 1015PEM netbook to the latest > > Fedora (F15) rawhide and booted a custom kernel in order to test both > > the RC1 and also see if the now in-staging brcm80211 behaves better (it > > doesn't, it still can't survive suspend at all, separate issue though). > > > > The xrandr output is identical in both case, showing a mode of 1024x600 > > being the one that X is currently using. Booting with i915 modesetting > > disabled does avoid the weird offset but of course X doesn't start. I > > can bisect this if there isn't already some suggestions from the > > audience, in particular is to what debug information you need. > > > > Kernel: v2.6.37-rc1-27-gff8b16d > > > > Photos: > > http://www.flickr.com/photos/jonmasters/5145351220/ > > http://www.flickr.com/photos/jonmasters/5144748543/ > > > > Said system will likely be at plumbers in the am... > > > I created a Bugzilla entry at > https://bugzilla.kernel.org/show_bug.cgi?id=22712 > for your bug report, please add your address to the CC list in there, thanks! Thanks. I requested the k.o BZ to send me a new password over the weekend, which was delayed by a few hours and then I didn't get around to this. FWIW, the bug has a fix on the intel-gfx list. Jon. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v5 1/7] drm/i915: Implement a framework for batch buffer pools
> -Original Message- > From: Nguyen, Michael H > Sent: Wednesday, November 26, 2014 9:54 PM > To: intel-gfx@lists.freedesktop.org > Cc: Bloomfield, Jon; Volkin, Bradley D > Subject: [PATCH v5 1/7] drm/i915: Implement a framework for batch buffer > pools > > From: Brad Volkin > > This adds a small module for managing a pool of batch buffers. > The only current use case is for the command parser, as described in the > kerneldoc in the patch. The code is simple, but separating it out makes it > easier to change the underlying algorithms and to extend to future use cases > should they arise. > > The interface is simple: init to create an empty pool, fini to clean it up, > get to > obtain a new buffer. Note that all buffers are expected to be inactive before > cleaning up the pool. > > Locking is currently based on the caller holding the struct_mutex. > We already do that in the places where we will use the batch pool for the > command parser. > > v2: > - s/BUG_ON/WARN_ON/ for locking assertions > - Remove the cap on pool size > - Switch from alloc/free to init/fini > > v3: > - Idiomatic looping structure in _fini > - Correct handling of purged objects > - Don't return a buffer that's too much larger than needed > > v4: > - Rebased to latest -nightly > > v5: > - Remove _put() function and clean up comments to match > > v6: > - Move purged check inside the loop (danvet, from v4 1/7 feedback) > > Issue: VIZ-4719 > Signed-off-by: Brad Volkin > --- > Documentation/DocBook/drm.tmpl | 5 + > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/i915_drv.h| 15 +++ > drivers/gpu/drm/i915/i915_gem.c| 1 + > drivers/gpu/drm/i915/i915_gem_batch_pool.c | 152 > + > 5 files changed, 174 insertions(+) > create mode 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c > > diff --git a/Documentation/DocBook/drm.tmpl > b/Documentation/DocBook/drm.tmpl index 944e8ed..b5943d4 100644 > --- a/Documentation/DocBook/drm.tmpl > +++ b/Documentation/DocBook/drm.tmpl > @@ -4028,6 +4028,11 @@ int num_ioctls; > !Idrivers/gpu/drm/i915/i915_cmd_parser.c > > > +Batchbuffer Pools > +!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool > +!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c > + > + > Logical Rings, Logical Ring Contexts and Execlists > !Pdrivers/gpu/drm/i915/intel_lrc.c Logical Rings, Logical Ring Contexts and > Execlists !Idrivers/gpu/drm/i915/intel_lrc.c > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e4083e4..c8dbb37d 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -19,6 +19,7 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o > > # GEM code > i915-y += i915_cmd_parser.o \ > + i915_gem_batch_pool.o \ > i915_gem_context.o \ > i915_gem_render_state.o \ > i915_gem_debug.o \ > diff --git a/drivers/gpu/drm/i915/i915_drv.h > b/drivers/gpu/drm/i915/i915_drv.h index c4f2cb6..b12a062 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1154,6 +1154,12 @@ struct intel_l3_parity { > int which_slice; > }; > > +struct i915_gem_batch_pool { > + struct drm_device *dev; > + struct list_head active_list; > + struct list_head inactive_list; > +}; > + > struct i915_gem_mm { > /** Memory allocator for GTT stolen memory */ > struct drm_mm stolen; > @@ -1885,6 +1891,8 @@ struct drm_i915_gem_object { > /** Used in execbuf to temporarily hold a ref */ > struct list_head obj_exec_link; > > + struct list_head batch_pool_list; > + > /** >* This is set if the object is on the active lists (has pending >* rendering and so a non-zero 
seqno), and is not set if it i s on @@ - > 2849,6 +2857,13 @@ void i915_destroy_error_state(struct drm_device > *dev); void i915_get_extra_instdone(struct drm_device *dev, uint32_t > *instdone); const char *i915_cache_level_str(struct drm_i915_private *i915, > int type); > > +/* i915_gem_batch_pool.c */ > +void i915_gem_batch_pool_init(struct drm_device *dev, > + struct i915_gem_batch_pool *pool); void > +i915_gem_batch_pool_fini(struct i915_gem_batch_pool *pool); struct > +drm_i915_gem_object* i915_gem_batch_pool_get(struct > i915_gem_batch_pool > +*pool, size_t size); > + > /* i915_cmd_parser.c */ > int i915_cmd_parser_get_version(void); > int i915_cmd_parser_init_ring(struct intel_engine_cs *ring); diff --git > a/drivers/gpu/drm/i915/i915
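[Editor's note] A usage sketch for the pool interface added here, based on the prototypes in the quoted patch (i915_gem_batch_pool_init/_get/_fini). The caller below is hypothetical and the ERR_PTR error convention is an assumption; the real consumer is the command parser mentioned in the commit message.

static struct drm_i915_gem_object *
example_get_shadow_batch(struct i915_gem_batch_pool *pool, size_t batch_len)
{
        struct drm_i915_gem_object *shadow;

        /* Locking is based on the caller already holding dev->struct_mutex,
         * per the commit message. */
        shadow = i915_gem_batch_pool_get(pool, batch_len);
        if (IS_ERR(shadow))
                return shadow;  /* assumed error convention */

        /* ... copy and validate the userspace batch into 'shadow' ... */

        return shadow;
}

Pool lifetime is bracketed by i915_gem_batch_pool_init(dev, &pool) at setup and i915_gem_batch_pool_fini(&pool) at teardown, with all buffers expected to be inactive again before fini.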
Re: [Intel-gfx] [PATCH] drm/i915: Add MI_SET_APPID cmd to cmd parser tables
> -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of michael.h.ngu...@intel.com > Sent: Friday, November 21, 2014 5:36 PM > To: intel-gfx@lists.freedesktop.org > Cc: Vetter, Daniel; Barnes, Jesse > Subject: [Intel-gfx] [PATCH] drm/i915: Add MI_SET_APPID cmd to cmd parser > tables > > From: "Michael H. Nguyen" > > Was missing > > Issue: VIZ-4701 > Signed-off-by: Michael H. Nguyen > --- > drivers/gpu/drm/i915/i915_cmd_parser.c | 14 +++--- > drivers/gpu/drm/i915/i915_reg.h| 3 +++ > 2 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c > b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 22c992a..364aff7 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -152,6 +152,7 @@ static const struct drm_i915_cmd_descriptor > render_cmds[] = { > CMD( MI_PREDICATE, SMI,F, 1, S ), > CMD( MI_TOPOLOGY_FILTER, SMI,F, 1, S ), > CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ), > + CMD( MI_SET_APPID, SMI,F, 1, S ), > CMD( MI_SET_CONTEXT, SMI, !F, 0xFF, R ), > CMD( MI_URB_CLEAR, SMI, !F, 0xFF, S ), > CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3F, B, > @@ -210,6 +211,7 @@ static const struct drm_i915_cmd_descriptor > hsw_render_cmds[] = { > CMD( MI_SET_PREDICATE, SMI,F, 1, S ), > CMD( MI_RS_CONTROL,SMI,F, 1, S ), > CMD( MI_URB_ATOMIC_ALLOC, SMI,F, 1, S ), > + CMD( MI_SET_APPID, SMI,F, 1, S ), > CMD( MI_RS_CONTEXT,SMI,F, 1, S ), > CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ), > CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ), > @@ -229,6 +231,7 @@ static const struct drm_i915_cmd_descriptor > hsw_render_cmds[] = { > > static const struct drm_i915_cmd_descriptor video_cmds[] = { > CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), > + CMD( MI_SET_APPID, SMI,F, 1, S ), > CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, > .bits = {{ > .offset = 0, > @@ -272,6 +275,7 @@ static const struct drm_i915_cmd_descriptor > video_cmds[] = { > > static const struct drm_i915_cmd_descriptor vecs_cmds[] = { > CMD( MI_ARB_ON_OFF,SMI,F, 1, R ), > + CMD( MI_SET_APPID, SMI,F, 1, S ), > CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B, > .bits = {{ > .offset = 0, > @@ -481,13 +485,17 @@ static u32 gen7_bsd_get_cmd_length_mask(u32 > cmd_header) > u32 client = (cmd_header & INSTR_CLIENT_MASK) >> > INSTR_CLIENT_SHIFT; > u32 subclient = > (cmd_header & INSTR_SUBCLIENT_MASK) >> > INSTR_SUBCLIENT_SHIFT; > + u32 op = (cmd_header & INSTR_26_TO_24_MASK) >> > INSTR_26_TO_24_SHIFT; > > if (client == INSTR_MI_CLIENT) > return 0x3F; > else if (client == INSTR_RC_CLIENT) { > - if (subclient == INSTR_MEDIA_SUBCLIENT) > - return 0xFFF; > - else > + if (subclient == INSTR_MEDIA_SUBCLIENT) { > + if (op == 6) > + return 0x; > + else > + return 0xFFF; > + } else > return 0xFF; > } > > diff --git a/drivers/gpu/drm/i915/i915_reg.h > b/drivers/gpu/drm/i915/i915_reg.h index 7a77cd5..c881d88 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -204,6 +204,8 @@ > #define INSTR_SUBCLIENT_SHIFT 27 > #define INSTR_SUBCLIENT_MASK0x1800 > #define INSTR_MEDIA_SUBCLIENT 0x2 > +#define INSTR_26_TO_24_MASK 0x700 > +#define INSTR_26_TO_24_SHIFT 24 > > /* > * Memory interface instructions used by the kernel @@ -233,6 +235,7 @@ > #define MI_BATCH_BUFFER_END MI_INSTR(0x0a, 0) > #define MI_SUSPEND_FLUSH MI_INSTR(0x0b, 0) > #define MI_SUSPEND_FLUSH_EN(1<<0) > +#define MI_SET_APPID MI_INSTR(0x0e, 0) > #define MI_OVERLAY_FLIP MI_INSTR(0x11, 0) > #define MI_OVERLAY_CONTINUE(0x0<<21) > #define MI_OVERLAY_O
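[Editor's note] The length-mask change is easier to read with the bitfields written out. Below is a standalone sketch of the selection logic in the quoted hunk, with the opcode taken from bits 26:24 as the new macro's name indicates; the client and subclient encodings (MI = 0 and RC = 3 in bits 31:29, media subclient = 2 in bits 28:27) are the ones the parser's masks imply, and the function name is made up for illustration.

#include <stdint.h>

static uint32_t example_bsd_length_mask(uint32_t header)
{
        uint32_t client    = header >> 29;              /* bits 31:29 */
        uint32_t subclient = (header >> 27) & 0x3;      /* bits 28:27 */
        uint32_t op        = (header >> 24) & 0x7;      /* bits 26:24 */

        if (client == 0x0)                      /* MI client */
                return 0x3F;
        if (client == 0x3) {                    /* RC client */
                if (subclient == 0x2)           /* media subclient */
                        return op == 6 ? 0xFFFF : 0xFFF;
                return 0xFF;
        }
        return 0;       /* unknown client: let the caller flag an error */
}

So only media-subclient commands with op == 6 get the wider 16-bit length field; everything else keeps the previous masks.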
Re: [Intel-gfx] [PATCH v6 1/5] drm/i915: Implement a framework for batch buffer pools
> -Original Message- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of Daniel Vetter > Sent: Tuesday, December 09, 2014 1:18 PM > To: Nguyen, Michael H > Cc: Brad Volkin; intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH v6 1/5] drm/i915: Implement a framework for > batch buffer pools > > On Mon, Dec 08, 2014 at 02:33:46PM -0800, michael.h.ngu...@intel.com > wrote: > > From: Brad Volkin > > > > This adds a small module for managing a pool of batch buffers. > > The only current use case is for the command parser, as described in > > the kerneldoc in the patch. The code is simple, but separating it out > > makes it easier to change the underlying algorithms and to extend to > > future use cases should they arise. > > > > The interface is simple: init to create an empty pool, fini to clean > > it up, get to obtain a new buffer. Note that all buffers are expected > > to be inactive before cleaning up the pool. > > > > Locking is currently based on the caller holding the struct_mutex. > > We already do that in the places where we will use the batch pool for > > the command parser. > > > > v2: > > - s/BUG_ON/WARN_ON/ for locking assertions > > - Remove the cap on pool size > > - Switch from alloc/free to init/fini > > > > v3: > > - Idiomatic looping structure in _fini > > - Correct handling of purged objects > > - Don't return a buffer that's too much larger than needed > > > > v4: > > - Rebased to latest -nightly > > > > v5: > > - Remove _put() function and clean up comments to match > > > > v6: > > - Move purged check inside the loop (danvet, from v4 1/7 feedback) > > > > v7: > > - Use single list instead of two. (Chris W) > > - s/active_list/cache_list > > - Squashed in debug patches (Chris W) > > drm/i915: Add a batch pool debugfs file > > > > It provides some useful information about the buffers in > > the global command parser batch pool. > > > > v2: rebase on global pool instead of per-ring pools > > v3: rebase > > > > drm/i915: Add batch pool details to i915_gem_objects debugfs > > > > To better account for the potentially large memory consumption > > of the batch pool. 
> > > > Issue: VIZ-4719 > > Signed-off-by: Brad Volkin > > --- > > Documentation/DocBook/drm.tmpl | 5 ++ > > drivers/gpu/drm/i915/Makefile | 1 + > > drivers/gpu/drm/i915/i915_debugfs.c| 71 ++-- > > drivers/gpu/drm/i915/i915_drv.h| 21 + > > drivers/gpu/drm/i915/i915_gem.c| 1 + > > drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132 > > + > > 6 files changed, 222 insertions(+), 9 deletions(-) create mode > > 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c > > > > diff --git a/Documentation/DocBook/drm.tmpl > > b/Documentation/DocBook/drm.tmpl index 85287cb..022923a 100644 > > --- a/Documentation/DocBook/drm.tmpl > > +++ b/Documentation/DocBook/drm.tmpl > > @@ -4029,6 +4029,11 @@ int num_ioctls; > > !Idrivers/gpu/drm/i915/i915_cmd_parser.c > > > > > > +Batchbuffer Pools > > +!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool > > +!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c > > + > > + > > Logical Rings, Logical Ring Contexts and > > Execlists !Pdrivers/gpu/drm/i915/intel_lrc.c Logical Rings, > > Logical Ring Contexts and Execlists > > !Idrivers/gpu/drm/i915/intel_lrc.c > > diff --git a/drivers/gpu/drm/i915/Makefile > > b/drivers/gpu/drm/i915/Makefile index e4083e4..c8dbb37d 100644 > > --- a/drivers/gpu/drm/i915/Makefile > > +++ b/drivers/gpu/drm/i915/Makefile > > @@ -19,6 +19,7 @@ i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o > > > > # GEM code > > i915-y += i915_cmd_parser.o \ > > + i915_gem_batch_pool.o \ > > i915_gem_context.o \ > > i915_gem_render_state.o \ > > i915_gem_debug.o \ > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c > > b/drivers/gpu/drm/i915/i915_debugfs.c > > index d0e445e..3c3bf98 100644 > > --- a/drivers/gpu/drm/i915/i915_debugfs.c > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c > > @@ -359,6 +359,33 @@ static int per_file_stats(int id, void *ptr, void > *data) > > return 0; > > } > > > > +#define print_file_stats(m, name, stats) \ > > + seq_printf(m, "%s: %u objects, %zu bytes (%zu active, %zu inactive, > %zu global, %zu shared, %zu unbound)\n", \ > > + name, \ > > + stats.count, \ > > + stats.total, \ > > + stats.active, \ > > + stats.inactive, \ > > + stats.global, \ > > + stats.shared, \ > > + stats.unbound) "print_file_stats" might be better named "print_obj_stats" or similar. As it stands there is nothing in its name to suggest its purpose is to describe an object. > > + > > +static void print_batch_pool_stats(struct seq_file *m, > > + struct drm_i915_private *dev_priv) { > > + struct drm_i915_gem_object *obj; >
Re: [Intel-gfx] [PATCH v6 1/5] drm/i915: Implement a framework for batch buffer pools
> -Original Message- > From: Nguyen, Michael H > Sent: Wednesday, December 10, 2014 4:34 PM > To: Bloomfield, Jon > Cc: Daniel Vetter; intel-gfx@lists.freedesktop.org > Subject: Re: [Intel-gfx] [PATCH v6 1/5] drm/i915: Implement a framework for > batch buffer pools > > Hi Jon, > > On 12/10/2014 03:06 AM, Bloomfield, Jon wrote: > >> -Original Message- > >> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On > >> Behalf Of Daniel Vetter > >> Sent: Tuesday, December 09, 2014 1:18 PM > >> To: Nguyen, Michael H > >> Cc: Brad Volkin; intel-gfx@lists.freedesktop.org > >> Subject: Re: [Intel-gfx] [PATCH v6 1/5] drm/i915: Implement a > >> framework for batch buffer pools > >> > >> On Mon, Dec 08, 2014 at 02:33:46PM -0800, michael.h.ngu...@intel.com > >> wrote: > >>> From: Brad Volkin > >>> > >>> This adds a small module for managing a pool of batch buffers. > >>> The only current use case is for the command parser, as described in > >>> the kerneldoc in the patch. The code is simple, but separating it > >>> out makes it easier to change the underlying algorithms and to > >>> extend to future use cases should they arise. > >>> > >>> The interface is simple: init to create an empty pool, fini to clean > >>> it up, get to obtain a new buffer. Note that all buffers are > >>> expected to be inactive before cleaning up the pool. > >>> > >>> Locking is currently based on the caller holding the struct_mutex. > >>> We already do that in the places where we will use the batch pool > >>> for the command parser. > >>> > >>> v2: > >>> - s/BUG_ON/WARN_ON/ for locking assertions > >>> - Remove the cap on pool size > >>> - Switch from alloc/free to init/fini > >>> > >>> v3: > >>> - Idiomatic looping structure in _fini > >>> - Correct handling of purged objects > >>> - Don't return a buffer that's too much larger than needed > >>> > >>> v4: > >>> - Rebased to latest -nightly > >>> > >>> v5: > >>> - Remove _put() function and clean up comments to match > >>> > >>> v6: > >>> - Move purged check inside the loop (danvet, from v4 1/7 feedback) > >>> > >>> v7: > >>> - Use single list instead of two. (Chris W) > >>> - s/active_list/cache_list > >>> - Squashed in debug patches (Chris W) > >>>drm/i915: Add a batch pool debugfs file > >>> > >>>It provides some useful information about the buffers in > >>>the global command parser batch pool. > >>> > >>>v2: rebase on global pool instead of per-ring pools > >>>v3: rebase > >>> > >>>drm/i915: Add batch pool details to i915_gem_objects debugfs > >>> > >>>To better account for the potentially large memory consumption > >>>of the batch pool. 
> >>> > >>> Issue: VIZ-4719 > >>> Signed-off-by: Brad Volkin > >>> --- > >>> Documentation/DocBook/drm.tmpl | 5 ++ > >>> drivers/gpu/drm/i915/Makefile | 1 + > >>> drivers/gpu/drm/i915/i915_debugfs.c| 71 ++-- > >>> drivers/gpu/drm/i915/i915_drv.h| 21 + > >>> drivers/gpu/drm/i915/i915_gem.c| 1 + > >>> drivers/gpu/drm/i915/i915_gem_batch_pool.c | 132 > >>> + > >>> 6 files changed, 222 insertions(+), 9 deletions(-) create mode > >>> 100644 drivers/gpu/drm/i915/i915_gem_batch_pool.c > >>> > >>> diff --git a/Documentation/DocBook/drm.tmpl > >>> b/Documentation/DocBook/drm.tmpl index 85287cb..022923a 100644 > >>> --- a/Documentation/DocBook/drm.tmpl > >>> +++ b/Documentation/DocBook/drm.tmpl > >>> @@ -4029,6 +4029,11 @@ int num_ioctls; > >>> !Idrivers/gpu/drm/i915/i915_cmd_parser.c > >>> > >>> > >>> +Batchbuffer Pools > >>> +!Pdrivers/gpu/drm/i915/i915_gem_batch_pool.c batch pool > >>> +!Idrivers/gpu/drm/i915/i915_gem_batch_pool.c > >>> + > >>> + > >>> Logical Rings, Logical Ring Contexts and > >>> Execlists !Pdrivers/gpu/drm/i915/intel_lrc.c Lo