[Intel-gfx] [bugzilla-dae...@bugzilla.kernel.org: [Bug 213519] New: WARNING on system reboot in: drivers/gpu/drm/i915/intel_runtime_pm.c:635 intel_runtime_pm_driver_release]

2021-06-21 Thread Bjorn Helgaas
- Forwarded message from bugzilla-dae...@bugzilla.kernel.org -

Date: Mon, 21 Jun 2021 02:50:09 +
From: bugzilla-dae...@bugzilla.kernel.org
To: bj...@helgaas.com
Subject: [Bug 213519] New: WARNING on system reboot in:
drivers/gpu/drm/i915/intel_runtime_pm.c:635 
intel_runtime_pm_driver_release
Message-ID: 

https://bugzilla.kernel.org/show_bug.cgi?id=213519

Bug ID: 213519
   Summary: WARNING on system reboot in:
drivers/gpu/drm/i915/intel_runtime_pm.c:635
intel_runtime_pm_driver_release
   Product: Drivers
   Version: 2.5
Kernel Version: 5.12.12
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PCI
  Assignee: drivers_...@kernel-bugs.osdl.org
  Reporter: j-c...@westvi.com
Regression: No

Created attachment 297517
  --> https://bugzilla.kernel.org/attachment.cgi?id=297517&action=edit
Contents of 'warning' stack trace, etc.

As mentioned in summary - warning message in this routine at system reboot. Try
as I might, I cannot include the text of the warning directly here in the
description without losing carriage returns, so I include it as a text
attachment.

- End forwarded message -

[Attachment contents below]

[  239.019148] [ cut here ]
[  239.024226] i915 :00:02.0: i915 raw-wakerefs=1 wakelocks=1 on cleanup
[  239.031561] WARNING: CPU: 4 PID: 2484 at 
drivers/gpu/drm/i915/intel_runtime_pm.c:635 
intel_runtime_pm_driver_release+0x4f/0x60
[  239.043974] Modules linked in: mei_wdt x86_pkg_temp_thermal 
ghash_clmulni_intel mei_me mei cryptd
[  239.053656] CPU: 4 PID: 2484 Comm: reboot Not tainted 5.12.12 #1
[  239.060236] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./NUC-8665UE, BIOS P1.50 06/04/2021
[  239.070766] RIP: 0010:intel_runtime_pm_driver_release+0x4f/0x60
[  239.077256] Code: 10 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 59 8f 11 00 41 
89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 f8 25 7d b0 e8 06 e8 67 00 <0f> 0b 5b 
41 5c 41 5d 5d c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 48
[  239.097700] RSP: 0018:b8c682f3bd30 EFLAGS: 00010286
[  239.103422] RAX:  RBX: 0001 RCX: b0af01e8
[  239.85] RDX:  RSI: dfff RDI: b0a401e0
[  239.118850] RBP: b8c682f3bd48 R08:  R09: b8c682f3bb08
[  239.126617] R10: b8c682f3bb00 R11: b0b20228 R12: 0001
[  239.134390] R13: 978680d114b0 R14: 97868197eae8 R15: fee1dead
[  239.142203] FS:  7f741a182580() GS:9789dc50() 
knlGS:
[  239.151044] CS:  0010 DS:  ES:  CR0: 80050033
[  239.157318] CR2: 0169f4c8 CR3: 00019cf14003 CR4: 003706e0
[  239.165098] DR0:  DR1:  DR2: 
[  239.172874] DR3:  DR6: fffe0ff0 DR7: 0400
[  239.180658] Call Trace:
[  239.183346]  i915_driver_shutdown+0xcf/0xe0
[  239.187920]  i915_pci_shutdown+0x10/0x20
[  239.192181]  pci_device_shutdown+0x35/0x60
[  239.196629]  device_shutdown+0x156/0x1b0
[  239.200827]  __do_sys_reboot.cold+0x2f/0x5b
[  239.205410]  __x64_sys_reboot+0x16/0x20
[  239.209586]  do_syscall_64+0x38/0x50
[  239.213399]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  239.218837] RIP: 0033:0x7f741a0a9bc3
[  239.222740] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 
1f 44 00 00 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 00 
f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 c2 0c 00 f7 d8
[  239.243228] RSP: 002b:7ffcc2a16488 EFLAGS: 0206 ORIG_RAX: 
00a9
[  239.251503] RAX: ffda RBX: 7ffcc2a165d8 RCX: 7f741a0a9bc3
[  239.259304] RDX: 01234567 RSI: 28121969 RDI: fee1dead
[  239.267105] RBP: 0004 R08:  R09: 0169e2e0
[  239.274926] R10: fd06 R11: 0206 R12: 
[  239.282719] R13: 0001 R14:  R15: 0001
[  239.290433] ---[ end trace cd9d07db38ec6618 ]---

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [bugzilla-dae...@bugzilla.kernel.org: [Bug 213519] New: WARNING on system reboot in: drivers/gpu/drm/i915/intel_runtime_pm.c:635 intel_runtime_pm_driver_release]

2021-06-21 Thread Bjorn Helgaas
[+cc Joel (reporter)]

On Mon, Jun 21, 2021 at 04:50:14PM -0500, Bjorn Helgaas wrote:
> - Forwarded message from bugzilla-dae...@bugzilla.kernel.org -
> 
> Date: Mon, 21 Jun 2021 02:50:09 +
> From: bugzilla-dae...@bugzilla.kernel.org
> To: bj...@helgaas.com
> Subject: [Bug 213519] New: WARNING on system reboot in:
>   drivers/gpu/drm/i915/intel_runtime_pm.c:635 
> intel_runtime_pm_driver_release
> Message-ID: 
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=213519
> 
> Bug ID: 213519
>Summary: WARNING on system reboot in:
> drivers/gpu/drm/i915/intel_runtime_pm.c:635
> intel_runtime_pm_driver_release
>Product: Drivers
>Version: 2.5
> Kernel Version: 5.12.12
>   Hardware: x86-64
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: PCI
>   Assignee: drivers_...@kernel-bugs.osdl.org
>   Reporter: j-c...@westvi.com
> Regression: No
> 
> Created attachment 297517
>   --> https://bugzilla.kernel.org/attachment.cgi?id=297517&action=edit
> Contents of 'warning' stack trace, etc.
> 
> As mentioned in summary - warning message in this routine at system reboot. 
> Try
> as I might, I cannot include the text of the warning directly here in the
> description without losing carriage returns, so I include it as a text
> attachment.
> 
> - End forwarded message -
> 
> [Attachment contents below]
> 
> [  239.019148] [ cut here ]
> [  239.024226] i915 :00:02.0: i915 raw-wakerefs=1 wakelocks=1 on cleanup
> [  239.031561] WARNING: CPU: 4 PID: 2484 at 
> drivers/gpu/drm/i915/intel_runtime_pm.c:635 
> intel_runtime_pm_driver_release+0x4f/0x60
> [  239.043974] Modules linked in: mei_wdt x86_pkg_temp_thermal 
> ghash_clmulni_intel mei_me mei cryptd
> [  239.053656] CPU: 4 PID: 2484 Comm: reboot Not tainted 5.12.12 #1
> [  239.060236] Hardware name: To Be Filled By O.E.M. To Be Filled By 
> O.E.M./NUC-8665UE, BIOS P1.50 06/04/2021
> [  239.070766] RIP: 0010:intel_runtime_pm_driver_release+0x4f/0x60
> [  239.077256] Code: 10 4c 8b 6f 50 4d 85 ed 75 03 4c 8b 2f e8 59 8f 11 00 41 
> 89 d8 44 89 e1 4c 89 ea 48 89 c6 48 c7 c7 f8 25 7d b0 e8 06 e8 67 00 <0f> 0b 
> 5b 41 5c 41 5d 5d c3 0f 1f 84 00 00 00 00 00 55 48 89 e5 48
> [  239.097700] RSP: 0018:b8c682f3bd30 EFLAGS: 00010286
> [  239.103422] RAX:  RBX: 0001 RCX: 
> b0af01e8
> [  239.85] RDX:  RSI: dfff RDI: 
> b0a401e0
> [  239.118850] RBP: b8c682f3bd48 R08:  R09: 
> b8c682f3bb08
> [  239.126617] R10: b8c682f3bb00 R11: b0b20228 R12: 
> 0001
> [  239.134390] R13: 978680d114b0 R14: 97868197eae8 R15: 
> fee1dead
> [  239.142203] FS:  7f741a182580() GS:9789dc50() 
> knlGS:
> [  239.151044] CS:  0010 DS:  ES:  CR0: 80050033
> [  239.157318] CR2: 0169f4c8 CR3: 00019cf14003 CR4: 
> 003706e0
> [  239.165098] DR0:  DR1:  DR2: 
> 
> [  239.172874] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [  239.180658] Call Trace:
> [  239.183346]  i915_driver_shutdown+0xcf/0xe0
> [  239.187920]  i915_pci_shutdown+0x10/0x20
> [  239.192181]  pci_device_shutdown+0x35/0x60
> [  239.196629]  device_shutdown+0x156/0x1b0
> [  239.200827]  __do_sys_reboot.cold+0x2f/0x5b
> [  239.205410]  __x64_sys_reboot+0x16/0x20
> [  239.209586]  do_syscall_64+0x38/0x50
> [  239.213399]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  239.218837] RIP: 0033:0x7f741a0a9bc3
> [  239.222740] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 
> 1f 44 00 00 89 fa be 69 19 12 28 bf ad de e1 fe b8 a9 00 00 00 0f 05 <48> 3d 
> 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 71 c2 0c 00 f7 d8
> [  239.243228] RSP: 002b:7ffcc2a16488 EFLAGS: 0206 ORIG_RAX: 
> 00a9
> [  239.251503] RAX: ffda RBX: 7ffcc2a165d8 RCX: 
> 7f741a0a9bc3
> [  239.259304] RDX: 01234567 RSI: 28121969 RDI: 
> fee1dead
> [  239.267105] RBP: 0004 R08:  R09: 
> 0169e2e0
> [  239.274926] R10: fd06 R11: 0206 R12: 
> 
> [  239.282719] R13: 0001 R14:  R15: 
> 0001
> [  239.290433] ---[ end trace cd9d07db38ec6618 ]---
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] docs: power: convert docs to ReST and rename to *.rst

2019-06-14 Thread Bjorn Helgaas
On Fri, Jun 14, 2019 at 02:36:31PM -0600, Jonathan Corbet wrote:
> On Thu, 13 Jun 2019 07:10:36 -0300
> Mauro Carvalho Chehab  wrote:
> 
> > Convert the PM documents to ReST, in order to allow them to
> > build with Sphinx.
> > 
> > The conversion is actually:
> >   - add blank lines and identation in order to identify paragraphs;
> >   - fix tables markups;
> >   - add some lists markups;
> >   - mark literal blocks;
> >   - adjust title markups.
> > 
> > At its new index.rst, let's add a :orphan: while this is not linked to
> > the main index.rst file, in order to avoid build warnings.
> > 
> > Signed-off-by: Mauro Carvalho Chehab 
> > Acked-by: Mark Brown 
> > Acked-by: Bjorn Helgaas 
> > Acked-by: Srivatsa S. Bhat (VMware) 
> 
> So I can't apply this one due to conflicts in include/linux/pci.h.  Bjorn,
> perhaps the easiest thing is for you to take this one through your tree?

OK, I applied this to pci/docs for v5.3.  I applied this entire patch,
but if you would prefer that I only apply the PCI-related parts, let
me know and I'll split those out.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/2] drm/i915: Add support for LMEM PCIe resizable bar

2022-06-17 Thread Bjorn Helgaas
[+cc Christian, author of pci_resize_resource(), Sergei, author of
rebalancing patches]

Hi Lucas,

On Fri, Jun 17, 2022 at 11:44:41AM -0700, Lucas De Marchi wrote:
> Cc'ing intel-pci, lkml, Bjorn
> 
> On Fri, Jun 17, 2022 at 11:32:37AM +0300, Jani Nikula wrote:
> > On Thu, 16 Jun 2022, priyanka.dandam...@intel.com wrote:
> > > From: Akeem G Abodunrin 
> > > 
> > > Add support for the local memory PICe resizable bar, so that
> > > local memory can be resized to the maximum size supported by the device,
> > > and mapped correctly to the PCIe memory bar. It is usual that GPU
> > > devices expose only 256MB BARs primarily to be compatible with 32-bit
> > > systems. So, those devices cannot claim larger memory BAR windows size due
> > > to the system BIOS limitation. With this change, it would be possible to
> > > reprogram the windows of the bridge directly above the requesting device
> > > on the same BAR type.
> 
> There is a big caveat here that this may be too late as other drivers
> may have already mapped their BARs - so probably too late in the pci scan
> for it to be effective. In fact, after using this for a while, it seems
> to fail too often, particularly on CFL systems.

Help me understand the "too late" part.  Do you mean that there is
enough available space for the max BAR size, but it's fragmented and
therefore not usable?  And that if we could do something earlier,
before drivers have claimed their devices, we might be able to compact
the BARs of other devices to make a larger contiguous available space?

That is theoretically possible, but I think the current
pci_resize_resource() only supports resizing of the specified BAR and
any upstream bridge windows.  I don't think it supports moving BARs of
other devices.

Sergei did some nice work that might help with this situation because
it can move BARs around more generally.  It hasn't quite achieved
critical mass yet, but maybe this would help get there:

  
https://lore.kernel.org/linux-pci/20201218174011.340514-1-s.miroshniche...@yadro.com/

If I understand Sergei's series correctly, this rebalancing actually
cannot be done during enumeration because we only move BARs if a
driver for the device indicates that it supports it, so there would be
no requirement to do this early.

> Do we have any alternative to be done in the PCI subsystem during the
> scan?  There is other work in progress to allow i915 to use the rest of
> the device memory even with a smaller BAR, but it would be better if we
> can improve our chances of succeeding the resize.

> > > Signed-off-by: Akeem G Abodunrin 
> > > Signed-off-by: Michał Winiarski 
> > > Cc: Stuart Summers 
> > > Cc: Michael J Ruhl 
> > > Cc: Prathap Kumar Valsan 
> > > Signed-off-by: Priyanka Dandamudi 
> > > Reviewed-by: Matthew Auld 
> > 
> > Please see https://lore.kernel.org/r/87pmj8vesm@intel.com
> > 
> > > ---
> > >  drivers/gpu/drm/i915/i915_driver.c | 92 ++
> > >  1 file changed, 92 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_driver.c 
> > > b/drivers/gpu/drm/i915/i915_driver.c
> > > index d26dcca7e654..4bdb471cb2e2 100644
> > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > @@ -303,6 +303,95 @@ static void sanitize_gpu(struct drm_i915_private 
> > > *i915)
> > >   __intel_gt_reset(to_gt(i915), ALL_ENGINES);
> > >  }
> > > 
> > > +static void __release_bars(struct pci_dev *pdev)
> > > +{
> > > + int resno;
> > > +
> > > + for (resno = PCI_STD_RESOURCES; resno < PCI_STD_RESOURCE_END; resno++) {
> > > + if (pci_resource_len(pdev, resno))
> > > + pci_release_resource(pdev, resno);
> > > + }
> > > +}
> > > +
> > > +static void
> > > +__resize_bar(struct drm_i915_private *i915, int resno, resource_size_t 
> > > size)
> > > +{
> > > + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
> > > + int bar_size = pci_rebar_bytes_to_size(size);
> > > + int ret;
> > > +
> > > + __release_bars(pdev);
> > > +
> > > + ret = pci_resize_resource(pdev, resno, bar_size);
> > > + if (ret) {
> > > + drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
> > > +  resno, 1 << bar_size, ERR_PTR(ret));
> > > + return;
> > > + }
> > > +
> > > + drm_info(&i915->drm, "BAR%d resized to %dM\n", resno, 1 << bar_size);
> > > +}
> > > +
> > > +/* BAR size starts from 1MB - 2^20 */
> > > +#define BAR_SIZE_SHIFT 20
> > > +static resource_size_t
> > > +__lmem_rebar_size(struct drm_i915_private *i915, int resno)
> > > +{
> > > + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
> > > + u32 rebar = pci_rebar_get_possible_sizes(pdev, resno);
> > > + resource_size_t size;
> > > +
> > > + if (!rebar)
> > > + return 0;
> > > +
> > > + size = 1ULL << (__fls(rebar) + BAR_SIZE_SHIFT);
> > > +
> > > + if (size <= pci_resource_len(pdev, resno))
> > > + return 0;
> > > +
> > > + return size;
> > > +}
> > > +
> > > +#define LMEM_BAR_NUM 2
> > > +static void i915

Re: [Intel-gfx] [PATCH] x86/quirks: Fix logic to apply quirk once

2021-12-29 Thread Bjorn Helgaas
On Fri, Dec 17, 2021 at 10:13:13PM -0800, Lucas De Marchi wrote:
> When using QFLAG_APPLY_ONCE we make sure the quirk is applied only once.

Maybe "called" only once, since you're about to add a distinction
between "called" and "applied"?

I'm not really sure the concept of QFLAG_APPLY_ONCE, QFLAG_APPLIED,
QFLAG_DONE is general purpose enough to be handled at the level of
check_dev_quirk().

We don't have anything like that for the regular PCI fixups (see
pci_do_fixups()).  If a regular fixup needed something like that, it
would use a static local variable.  Maybe that would be simpler
overall here, too, since the quirk would be *always* called for
matching devices, and the "one-time" logic would be encapsulated in
the quirk itself where it's more obvious?

> This is useful when it's enough one device to trigger a certain
> condition or when the resource in each that applies is global to the
> system rather than local to the device.
> 
> However we call the quirk handler based on vendor, class, and device,
> allowing the specific handler to do additional filtering. In that case
> check_dev_quirk() may incorrectly mark the quirk as applied when it's
> not. This is particularly bad for intel_graphics_quirks() that uses
> PCI_ANY_ID and then compares with a long list of devices. This hasn't
> been problematic so far because those devices are integrated GPUs and
> there can only be one in the system.  However as Intel starts to
> release discrete cards, this condition is no longer true and we fail to
> reserve the stolen memory (for the integrated gpu) depending on the bus
> topology: if the traversal finds the discrete card first, for which
> there is no system stolen memory, we will fail to reserve it for the
> integrated card.

s/integrated gpu/integrated GPU/ (to match previous use)

> This fixes the stolen memory reservation for an Alderlake-P system with
> one additional DG2. In this system we have:

DG2?

>   - 00:01.0 Bridge
> `- 03:00.0 DG2
>   - Alderklake-P's integrated graphics

s/Alderklake-P/Alderlake-P/

Might be nice to include the integrated GPU PCI address to be parallel
with the bridge and DG2.

> Since we do a depth-first traversal, when we call the handler because of
> DG2 we were marking it as already being applied and never reserving the
> stolen memory for Alderlake-P.
> 
> Here we change the quirk fucntions to return bool in case it applied a
> quirk so we only flag it as applied when that really happened. This only
> makes a difference for quirks using QFLAG_APPLY_ONCE, so all the others
> simply returns true in order to avoid unnecessary complication.

s/fucntions/functions/
s/returns true/return true/

I would consider splitting this into two patches:

  1) Change the quirk signature, make them all return "true", and
  update check_dev_quirk().  This would have no functional impact.

  2) Update intel_graphics_quirks() to return "false" when it doesn't
  reserve the stolen memory.

Then the important change will be in a small patch by itself and will
be easier to understand and revert if that should be necessary.

> Signed-off-by: Lucas De Marchi 
> ---
>  arch/x86/kernel/early-quirks.c | 75 ++
>  1 file changed, 49 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index 391a4e2b8604..5d235fe2a07a 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -28,7 +28,7 @@
>  #include 
>  #include 
>  
> -static void __init fix_hypertransport_config(int num, int slot, int func)
> +static bool __init fix_hypertransport_config(int num, int slot, int func)
>  {
>   u32 htcfg;
>   /*
> @@ -51,10 +51,10 @@ static void __init fix_hypertransport_config(int num, int 
> slot, int func)
>   }
>   }
>  
> -
> + return true;
>  }
>  
> -static void __init via_bugs(int  num, int slot, int func)
> +static bool __init via_bugs(int  num, int slot, int func)
>  {
>  #ifdef CONFIG_GART_IOMMU
>   if ((max_pfn > MAX_DMA32_PFN ||  force_iommu) &&
> @@ -63,8 +63,12 @@ static void __init via_bugs(int  num, int slot, int func)
>  "Looks like a VIA chipset. Disabling IOMMU."
>  " Override with iommu=allowed\n");
>   gart_iommu_aperture_disabled = 1;
> +
> + return true;
>   }
>  #endif
> +
> + return false;
>  }
>  
>  #ifdef CONFIG_ACPI
> @@ -77,7 +81,7 @@ static int __init nvidia_hpet_check(struct 
> acpi_table_header *header)
>  #endif /* CONFIG_X86_IO_APIC */
>  #endif /* CONFIG_ACPI */
>  
> -static void __init nvidia_bugs(int num, int slot, int func)
> +static bool __init nvidia_bugs(int num, int slot, int func)
>  {
>  #ifdef CONFIG_ACPI
>  #ifdef CONFIG_X86_IO_APIC
> @@ -86,7 +90,7 @@ static void __init nvidia_bugs(int num, int slot, int func)
>* Nvidia graphics cards with PCI ports on secondary buses.
>*/
>   if (num)
> - retu

Re: [Intel-gfx] [PATCH v2 1/2] x86/quirks: Fix logic to apply quirk once

2022-01-06 Thread Bjorn Helgaas
On Wed, Jan 05, 2022 at 04:36:53PM -0800, Lucas De Marchi wrote:
> When using QFLAG_APPLY_ONCE we make sure the quirk is called only once.
> This is useful when it's enough one device to trigger a certain
> condition or when the resource in each that applies is global to the
> system rather than local to the device.
> 
> However we call the quirk handler based on vendor, class, and device,
> allowing the specific handler to do additional filtering. In that case
> check_dev_quirk() may incorrectly mark the quirk as applied when it's
> not: the quirk was called, but may not have been applied due to the
> additional filter.
> 
> This is particularly bad for intel_graphics_quirks() that uses
> PCI_ANY_ID and then compares with a long list of devices. This hasn't
> been problematic so far because those devices are integrated GPUs and
> there can only be one in the system.  However as Intel starts to
> release discrete cards, this condition is no longer true and we fail to
> reserve the stolen memory (for the integrated GPU) depending on the bus
> topology: if the traversal finds the discrete card first, for which
> there is no system stolen memory, we will fail to reserve it for the
> integrated card.
> 
> This fixes the stolen memory reservation for an Alderlake-P system with
> one additional Intel discrete GPU (DG2 in this case, but applies for
> any of them). In this system we have:
> 
>   - 00:01.0 Bridge
> `- 03:00.0 DG2
>   - 00:02.0 Alderlake-P's integrated GPU
> 
> Since we do a depth-first traversal, when we call the handler because of
> DG2 we were marking it as already being applied and never reserving the
> stolen memory for Alderlake-P.
> 
> Since there are just a few quirks using the QFLAG_APPLY_ONCE logic and
> that is even the only flag, just use a static local variable in the
> quirk function itself. This allows to mark the quirk as applied only
> when it really is. As pointed out by Bjorn Helgaas, this is also more in
> line with the PCI fixups as done by pci_do_fixups().
> 
> Signed-off-by: Lucas De Marchi 
> ---
> 
> v2: instead of changing all quirks to return if it was successfully
> applied, remove the flag infra and use a static local variable to mark
> quirks already applied (suggested by Bjorn Helgaas).
> 
>  arch/x86/kernel/early-quirks.c | 60 --
>  1 file changed, 36 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index 391a4e2b8604..102ecd0a910e 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -57,12 +57,18 @@ static void __init fix_hypertransport_config(int num, int 
> slot, int func)
>  static void __init via_bugs(int  num, int slot, int func)
>  {
>  #ifdef CONFIG_GART_IOMMU
> + static bool quirk_applied __initdata;
> +
> + if (quirk_applied)
> + return;

IMO this probably is better than using QFLAG_APPLY_ONCE, etc.

But this patch has the mechanical changes related to QFLAG_APPLY_ONCE,
which I think are useful but not very interesting and not a fix for
something that's broken, mixed together with the *important* change
that actually fixes a problem on Alderlake.

Those two things need to be separate patches.  A patch that fixes a
problem should be as small as possible so it's easy to understand and
backport.

The subject line of this patch doesn't say anything at all about
Alderlake.  Splitting into two patches will make the subject lines
more useful.

Bjorn


Re: [Intel-gfx] [PATCH v2 2/2] x86/quirks: better wrap quirk conditions

2022-01-06 Thread Bjorn Helgaas
On Wed, Jan 05, 2022 at 04:36:54PM -0800, Lucas De Marchi wrote:
> Remove extra parenthesis and wrap lines so it's easier to read what are
> the conditions being checked. The call to the hook also had an extra
> indentation: remove here to conform to coding style.

It's nice when your subject lines are consistent.  These look like:

  x86/quirks: Fix logic to apply quirk once
  x86/quirks: better wrap quirk conditions

The second isn't capitalized like the first.  Obviously if you split
the first patch, you'll have three subject lines, and one will mention
Alderlake.

Bjorn


Re: [Intel-gfx] [PATCH v2 1/2] x86/quirks: Fix logic to apply quirk once

2022-01-06 Thread Bjorn Helgaas
On Thu, Jan 06, 2022 at 02:30:47PM -0800, Lucas De Marchi wrote:
> On Thu, Jan 06, 2022 at 03:58:50PM -0600, Bjorn Helgaas wrote:
> > On Wed, Jan 05, 2022 at 04:36:53PM -0800, Lucas De Marchi wrote:
> > > When using QFLAG_APPLY_ONCE we make sure the quirk is called only once.
> > > This is useful when it's enough one device to trigger a certain
> > > condition or when the resource in each that applies is global to the
> > > system rather than local to the device.
> > > 
> > > However we call the quirk handler based on vendor, class, and device,
> > > allowing the specific handler to do additional filtering. In that case
> > > check_dev_quirk() may incorrectly mark the quirk as applied when it's
> > > not: the quirk was called, but may not have been applied due to the
> > > additional filter.
> > > 
> > > This is particularly bad for intel_graphics_quirks() that uses
> > > PCI_ANY_ID and then compares with a long list of devices. This hasn't
> > > been problematic so far because those devices are integrated GPUs and
> > > there can only be one in the system.  However as Intel starts to
> > > release discrete cards, this condition is no longer true and we fail to
> > > reserve the stolen memory (for the integrated GPU) depending on the bus
> > > topology: if the traversal finds the discrete card first, for which
> > > there is no system stolen memory, we will fail to reserve it for the
> > > integrated card.
> > > 
> > > This fixes the stolen memory reservation for an Alderlake-P system with
> > > one additional Intel discrete GPU (DG2 in this case, but applies for
> > > any of them). In this system we have:
> > > 
> > >   - 00:01.0 Bridge
> > > `- 03:00.0 DG2
> > >   - 00:02.0 Alderlake-P's integrated GPU
> > > 
> > > Since we do a depth-first traversal, when we call the handler because of
> > > DG2 we were marking it as already being applied and never reserving the
> > > stolen memory for Alderlake-P.
> > > 
> > > Since there are just a few quirks using the QFLAG_APPLY_ONCE logic and
> > > that is even the only flag, just use a static local variable in the
> > > quirk function itself. This allows to mark the quirk as applied only
> > > when it really is. As pointed out by Bjorn Helgaas, this is also more in
> > > line with the PCI fixups as done by pci_do_fixups().
> > > 
> > > Signed-off-by: Lucas De Marchi 
> > > ---
> > > 
> > > v2: instead of changing all quirks to return if it was successfully
> > > applied, remove the flag infra and use a static local variable to mark
> > > quirks already applied (suggested by Bjorn Helgaas).
> > > 
> > >  arch/x86/kernel/early-quirks.c | 60 --
> > >  1 file changed, 36 insertions(+), 24 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/early-quirks.c 
> > > b/arch/x86/kernel/early-quirks.c
> > > index 391a4e2b8604..102ecd0a910e 100644
> > > --- a/arch/x86/kernel/early-quirks.c
> > > +++ b/arch/x86/kernel/early-quirks.c
> > > @@ -57,12 +57,18 @@ static void __init fix_hypertransport_config(int num, 
> > > int slot, int func)
> > >  static void __init via_bugs(int  num, int slot, int func)
> > >  {
> > >  #ifdef CONFIG_GART_IOMMU
> > > + static bool quirk_applied __initdata;
> > > +
> > > + if (quirk_applied)
> > > + return;
> > 
> > IMO this probably is better than using QFLAG_APPLY_ONCE, etc.
> > 
> > But this patch has the mechanical changes related to QFLAG_APPLY_ONCE,
> > which I think are useful but not very interesting and not a fix for
> > something that's broken, mixed together with the *important* change
> > that actually fixes a problem on Alderlake.
> 
> In general I agree with the statement and also ask people to follow that
> logic, but here I thought it was small enough to be considered as a
> whole.  Here is what I could do to split it up further in a way that is
> not going in a different direction:
> 
> 1) add the static local only to intel graphics quirk  and remove the
> flag from this item
> 2 and 3) add the static local to other functions and remove the flag
> from those items
> 4) remove the flag from the table, the defines and its usage.
> 5) fix the coding style (to be clear, it's already wrong, not
> something wrong introduced here... maybe could be squashed in (4)?)

I would make this two patches:

  1) Add static locals to all QFLAG_APPLY_ONCE quirks and remove
 QFLAG_APPLY_ONCE etc.  This patch is mechanical and shouldn't
 break or fix anything.

  2) Update the intel graphics quirk to change the way it sets the
 static local.  Make the subject be something about fixing the
 problem with Intel integrated + discrete graphics.


Re: [Intel-gfx] [PATCH v3 1/3] x86/quirks: Replace QFLAG_APPLY_ONCE with static locals

2022-01-07 Thread Bjorn Helgaas
On Fri, Jan 07, 2022 at 01:05:14PM -0800, Lucas De Marchi wrote:
> The flags are only used to mark a quirk to be called once and nothing
> else. Also, that logic may not be appropriate if the quirk wants to
> do additional filtering and set quirk ass applied by itself.

s/ass applied/as applied/

> So replace the uses of QFLAG_APPLY_ONCE with static local variables in
> the few quirks that use this logic and remove all the flags logic.
> 
> Signed-off-by: Lucas De Marchi 

Reviewed-by: Bjorn Helgaas 

> ---
> 
> v3: Keep in this patch only the mechanical change to move from
> QFLAG_APPLY_ONCE to static locals. Differently than v2, we don't try to
> set quirk_applied in nvidia_bugs() and via_bugs() only when the
> global resource is set - this patch is a mechanical move only and
> shouldn't change any behavior. For intel_graphics_quirks(), we will move
> it to the right place in a subsequent patch.
> 
>  arch/x86/kernel/early-quirks.c | 55 +-
>  1 file changed, 34 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index 391a4e2b8604..8b689c2b8cc7 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -57,6 +57,13 @@ static void __init fix_hypertransport_config(int num, int 
> slot, int func)
>  static void __init via_bugs(int  num, int slot, int func)
>  {
>  #ifdef CONFIG_GART_IOMMU
> + static bool quirk_applied __initdata;
> +
> + if (quirk_applied)
> + return;
> +
> + quirk_applied = true;
> +
>   if ((max_pfn > MAX_DMA32_PFN ||  force_iommu) &&
>   !gart_iommu_aperture_allowed) {
>   printk(KERN_INFO
> @@ -81,6 +88,13 @@ static void __init nvidia_bugs(int num, int slot, int func)
>  {
>  #ifdef CONFIG_ACPI
>  #ifdef CONFIG_X86_IO_APIC
> + static bool quirk_applied __initdata;
> +
> + if (quirk_applied)
> + return;
> +
> + quirk_applied = true;
> +
>   /*
>* Only applies to Nvidia root ports (bus 0) and not to
>* Nvidia graphics cards with PCI ports on secondary buses.
> @@ -587,10 +601,16 @@ intel_graphics_stolen(int num, int slot, int func,
>  
>  static void __init intel_graphics_quirks(int num, int slot, int func)
>  {
> + static bool quirk_applied __initdata;
>   const struct intel_early_ops *early_ops;
>   u16 device;
>   int i;
>  
> + if (quirk_applied)
> + return;
> +
> + quirk_applied = true;
> +
>   device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
>  
>   for (i = 0; i < ARRAY_SIZE(intel_early_ids); i++) {
> @@ -673,37 +693,33 @@ static void __init apple_airport_reset(int bus, int 
> slot, int func)
>   early_iounmap(mmio, BCM4331_MMIO_SIZE);
>  }
>  
> -#define QFLAG_APPLY_ONCE 0x1
> -#define QFLAG_APPLIED0x2
> -#define QFLAG_DONE   (QFLAG_APPLY_ONCE|QFLAG_APPLIED)
>  struct chipset {
>   u32 vendor;
>   u32 device;
>   u32 class;
>   u32 class_mask;
> - u32 flags;
>   void (*f)(int num, int slot, int func);
>  };
>  
>  static struct chipset early_qrk[] __initdata = {
>   { PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
> -   PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, QFLAG_APPLY_ONCE, nvidia_bugs },
> +   PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, nvidia_bugs },
>   { PCI_VENDOR_ID_VIA, PCI_ANY_ID,
> -   PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, QFLAG_APPLY_ONCE, via_bugs },
> +   PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, via_bugs },
>   { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB,
> -   PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, fix_hypertransport_config },
> +   PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, fix_hypertransport_config },
>   { PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_IXP400_SMBUS,
> -   PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs },
> +   PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, ati_bugs },
>   { PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
> -   PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs_contd },
> +   PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, ati_bugs_contd },
>   { PCI_VENDOR_ID_INTEL, 0x3403, PCI_CLASS_BRIDGE_HOST,
> -   PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
> +   PCI_BASE_CLASS_BRIDGE, intel_remapping_check },
>   { PCI_VENDOR_ID_INTEL, 0x3405, PCI_CLASS_BRIDGE_HOST,
> -   PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
> +   PCI_BASE_CLASS_BRIDGE, intel_remapping_check },
>   { PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
> -   PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
> +   PCI_BASE_CLASS_BRIDGE, intel_remapping_check }

Re: [Intel-gfx] [PATCH v3 3/3] x86/quirks: Fix stolen detection with integrated + discrete GPU

2022-01-07 Thread Bjorn Helgaas
On Fri, Jan 07, 2022 at 01:05:16PM -0800, Lucas De Marchi wrote:
> early_pci_scan_bus() does a depth-first traversal, possibly calling
> the quirk functions for each device based on vendor, device and class
> from early_qrk table. intel_graphics_quirks() however uses PCI_ANY_ID
> and does additional filtering in the quirk.
> 
> If there is an Intel integrated + discrete GPU the quirk may be called
> first for the discrete GPU based on the PCI topology. Then we will fail
> to reserve the system stolen memory for the integrated GPU, because we
> will already have marked the quirk as "applied".
> 
> This was reproduced in a setup with Alderlake-P (integrated) + DG2
> (discrete), with the following PCI topology:
> 
>   - 00:01.0 Bridge
> `- 03:00.0 DG2
>   - 00:02.0 Integrated GPU
> 
> Move the setting of quirk_applied in intel_graphics_quirks() so it's
> mark as applied only when we find the integrated GPU based on the
> intel_early_ids table.
> 
> Signed-off-by: Lucas De Marchi 

I don't know the details of stolen memory, but the implementation of
this quirk looks good to me.  Very nice that it's now very clear
exactly what the change is.

> ---
> 
> v3: now that we do the refactor before the fix, we can do a single line
> change to fix intel_graphics_quirks(). Also, we don't change
> intel_graphics_stolen() anymore as we did in v2: we don't have to check
> other devices anymore if there was a previous match causing
> intel_graphics_stolen() to be called (there can only be one integrated
> GPU reserving the stolen memory).
> 
>  arch/x86/kernel/early-quirks.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index df34963e23bf..932f9087c324 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -609,8 +609,6 @@ static void __init intel_graphics_quirks(int num, int 
> slot, int func)
>   if (quirk_applied)
>   return;
>  
> - quirk_applied = true;
> -
>   device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
>  
>   for (i = 0; i < ARRAY_SIZE(intel_early_ids); i++) {
> @@ -623,6 +621,8 @@ static void __init intel_graphics_quirks(int num, int 
> slot, int func)
>  
>   intel_graphics_stolen(num, slot, func, early_ops);
>  
> + quirk_applied = true;
> +
>   return;
>   }
>  }
> -- 
> 2.34.1
> 


Re: [Intel-gfx] [PATCH v3 3/3] x86/quirks: Fix stolen detection with integrated + discrete GPU

2022-01-10 Thread Bjorn Helgaas
On Mon, Jan 10, 2022 at 12:11:36PM -0500, Rodrigo Vivi wrote:
> On Fri, Jan 07, 2022 at 08:57:32PM -0600, Bjorn Helgaas wrote:
> > On Fri, Jan 07, 2022 at 01:05:16PM -0800, Lucas De Marchi wrote:
> > > early_pci_scan_bus() does a depth-first traversal, possibly calling
> > > the quirk functions for each device based on vendor, device and class
> > > from early_qrk table. intel_graphics_quirks() however uses PCI_ANY_ID
> > > and does additional filtering in the quirk.
> > > 
> > > If there is an Intel integrated + discrete GPU the quirk may be called
> > > first for the discrete GPU based on the PCI topology. Then we will fail
> > > to reserve the system stolen memory for the integrated GPU, because we
> > > will already have marked the quirk as "applied".
> > > 
> > > This was reproduced in a setup with Alderlake-P (integrated) + DG2
> > > (discrete), with the following PCI topology:
> > > 
> > >   - 00:01.0 Bridge
> > > `- 03:00.0 DG2
> > >   - 00:02.0 Integrated GPU
> > > 
> > > Move the setting of quirk_applied in intel_graphics_quirks() so it's
> > > mark as applied only when we find the integrated GPU based on the
> > > intel_early_ids table.
> > > 
> > > Signed-off-by: Lucas De Marchi 
> > 
> > I don't know the details of stolen memory, but the implementation of
> > this quirk looks good to me.  Very nice that it's now very clear
> > exactly what the change is.
> 
> 
> Reviewed-by: Rodrigo Vivi 
> 
> Bjorn, ack to merge through drm-intel?

This is a bit of a shared area between the x86 folks and me, but
certainly no objection from me.

Acked-by: Bjorn Helgaas 

> > > ---
> > > 
> > > v3: now that we do the refactor before the fix, we can do a single line
> > > change to fix intel_graphics_quirks(). Also, we don't change
> > > intel_graphics_stolen() anymore as we did in v2: we don't have to check
> > > other devices anymore if there was a previous match causing
> > > intel_graphics_stolen() to be called (there can only be one integrated
> > > GPU reserving the stolen memory).
> > > 
> > >  arch/x86/kernel/early-quirks.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/kernel/early-quirks.c 
> > > b/arch/x86/kernel/early-quirks.c
> > > index df34963e23bf..932f9087c324 100644
> > > --- a/arch/x86/kernel/early-quirks.c
> > > +++ b/arch/x86/kernel/early-quirks.c
> > > @@ -609,8 +609,6 @@ static void __init intel_graphics_quirks(int num, int 
> > > slot, int func)
> > >   if (quirk_applied)
> > >   return;
> > >  
> > > - quirk_applied = true;
> > > -
> > >   device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
> > >  
> > >   for (i = 0; i < ARRAY_SIZE(intel_early_ids); i++) {
> > > @@ -623,6 +621,8 @@ static void __init intel_graphics_quirks(int num, int 
> > > slot, int func)
> > >  
> > >   intel_graphics_stolen(num, slot, func, early_ops);
> > >  
> > > + quirk_applied = true;
> > > +
> > >   return;
> > >   }
> > >  }
> > > -- 
> > > 2.34.1
> > > 


Re: [Intel-gfx] [PATCH v4] x86/quirks: Replace QFLAG_APPLY_ONCE with static locals

2022-01-12 Thread Bjorn Helgaas
On Wed, Jan 12, 2022 at 03:30:43PM -0800, Lucas De Marchi wrote:
> The flags are only used to mark a quirk to be called once and nothing
> else. Also, that logic may not be appropriate if the quirk wants to
> do additional filtering and set quirk as applied by itself.
> 
> So replace the uses of QFLAG_APPLY_ONCE with static local variables in
> the few quirks that use this logic and remove all the flags logic.
> 
> Signed-off-by: Lucas De Marchi 
> Reviewed-by: Bjorn Helgaas 

Only occurred to me now, but another, less intrusive approach would be
to just remove QFLAG_APPLY_ONCE from intel_graphics_quirks() and do
its bookkeeping internally, e.g.,

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 391a4e2b8604..7b655004e5fd 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -587,10 +587,14 @@ intel_graphics_stolen(int num, int slot, int func,
 
 static void __init intel_graphics_quirks(int num, int slot, int func)
 {
+   static bool stolen __initdata = false;
const struct intel_early_ops *early_ops;
u16 device;
int i;
 
+   if (stolen)
+   return;
+
device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
 
for (i = 0; i < ARRAY_SIZE(intel_early_ids); i++) {
@@ -602,6 +606,7 @@ static void __init intel_graphics_quirks(int num, int slot, 
int func)
early_ops = (typeof(early_ops))driver_data;
 
intel_graphics_stolen(num, slot, func, early_ops);
+   stolen = true;
 
return;
}
@@ -703,7 +708,7 @@ static struct chipset early_qrk[] __initdata = {
{ PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
  PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
{ PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, PCI_ANY_ID,
- QFLAG_APPLY_ONCE, intel_graphics_quirks },
+ 0, intel_graphics_quirks },
/*
 * HPET on the current version of the Baytrail platform has accuracy
 * problems: it will halt in deep idle state - so we disable it.


Re: [Intel-gfx] [PATCH v4] x86/quirks: Replace QFLAG_APPLY_ONCE with static locals

2022-01-12 Thread Bjorn Helgaas
On Wed, Jan 12, 2022 at 04:21:28PM -0800, Lucas De Marchi wrote:
> On Wed, Jan 12, 2022 at 06:08:05PM -0600, Bjorn Helgaas wrote:
> > On Wed, Jan 12, 2022 at 03:30:43PM -0800, Lucas De Marchi wrote:
> > > The flags are only used to mark a quirk to be called once and nothing
> > > else. Also, that logic may not be appropriate if the quirk wants to
> > > do additional filtering and set quirk as applied by itself.
> > > 
> > > So replace the uses of QFLAG_APPLY_ONCE with static local variables in
> > > the few quirks that use this logic and remove all the flags logic.
> > > 
> > > Signed-off-by: Lucas De Marchi 
> > > Reviewed-by: Bjorn Helgaas 
> > 
> > Only occurred to me now, but another, less intrusive approach would be
> > to just remove QFLAG_APPLY_ONCE from intel_graphics_quirks() and do
> > its bookkeeping internally, e.g.,
> 
> that is actually what I suggested after your comment in v2: this would
> be the first patch with "minimal fix". But then to keep it consistent
> with the other calls to follow up with additional patches on top
> converting them as well.  Maybe what I wrote wasn't clear in the
> direction? Copying it here:
> 
>   1) add the static local only to intel graphics quirk  and remove the
>   flag from this item
>   2 and 3) add the static local to other functions and remove the flag
>   from those items
>   4) remove the flag from the table, the defines and its usage.
>   5) fix the coding style (to be clear, it's already wrong, not
>   something wrong introduced here... maybe could be squashed in (4)?)

Oh, sorry, I guess I just skimmed over that without really
comprehending it.

Although the patch below is basically just 1 from above and doesn't
require any changes to the other functions or the flags themselves
(2-4 above).

> > diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> > index 391a4e2b8604..7b655004e5fd 100644
> > --- a/arch/x86/kernel/early-quirks.c
> > +++ b/arch/x86/kernel/early-quirks.c
> > @@ -587,10 +587,14 @@ intel_graphics_stolen(int num, int slot, int func,
> > 
> > static void __init intel_graphics_quirks(int num, int slot, int func)
> > {
> > +   static bool stolen __initdata = false;
> > const struct intel_early_ops *early_ops;
> > u16 device;
> > int i;
> > 
> > +   if (stolen)
> > +   return;
> > +
> > device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
> > 
> > for (i = 0; i < ARRAY_SIZE(intel_early_ids); i++) {
> > @@ -602,6 +606,7 @@ static void __init intel_graphics_quirks(int num, int 
> > slot, int func)
> > early_ops = (typeof(early_ops))driver_data;
> > 
> > intel_graphics_stolen(num, slot, func, early_ops);
> > +   stolen = true;
> > 
> > return;
> > }
> > @@ -703,7 +708,7 @@ static struct chipset early_qrk[] __initdata = {
> > { PCI_VENDOR_ID_INTEL, 0x3406, PCI_CLASS_BRIDGE_HOST,
> >   PCI_BASE_CLASS_BRIDGE, 0, intel_remapping_check },
> > { PCI_VENDOR_ID_INTEL, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, PCI_ANY_ID,
> > - QFLAG_APPLY_ONCE, intel_graphics_quirks },
> > + 0, intel_graphics_quirks },
> > /*
> >  * HPET on the current version of the Baytrail platform has accuracy
> >  * problems: it will halt in deep idle state - so we disable it.


Re: [Intel-gfx] [PATCH v5 1/5] x86/quirks: Fix stolen detection with integrated + discrete GPU

2022-01-18 Thread Bjorn Helgaas
On Tue, Jan 18, 2022 at 06:26:48PM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 08:36:56AM -0800, Lucas De Marchi wrote:
> > I had the impression the subject/title should be imperative, with it
> > more relaxed in the body. It seems we have one more difference among
> > subsystems and I will adapt on next submissions to x86.
> 
> We have written it down properly, in case it explains it better:
> 
> "The tip tree maintainers set value on following these rules, especially
> on the request to write changelogs in imperative mood and not
> impersonating code or the execution of it. This is not just a whim of
> the maintainers. Changelogs written in abstract words are more precise
> and tend to be less confusing than those written in the form of novels."
> 
> from Documentation/process/maintainer-tip.rst

Thanks for writing this down!  I do the same for PCI.  I suspect this
is a pretty conservative style that would be acceptable tree-wide even
if not required everywhere.

> > Although in the review Bjorn suggested just splitting the commit, it was
> > also mentioned that the PCI subsystem has no such logic in its
> > equivalent pci_do_fixups(): a quirk/fixup needing that should instead
> > use a static local.
> 
> Why?

I don't really care much one way or the other.  I think the simplest
approach is to remove QFLAG_APPLY_ONCE from intel_graphics_quirks()
and do nothing else, as I suggested here:

  https://lore.kernel.org/r/20220113000805.GA295089@bhelgaas

Unfortunately that didn't occur to me until I'd already suggested more
complicated things that no longer seem worthwhile to me.

The static variable might be ugly, but it does seem to be what
intel_graphics_quirks() wants -- a "do this at most once per system
but we don't know exactly which device" situation.

> There's perfectly nice ->flags there for exactly stuff like that. static
> vars are ugly and should be avoided if possible.

Bjorn


Re: [Intel-gfx] [PATCH v5 1/5] x86/quirks: Fix stolen detection with integrated + discrete GPU

2022-01-18 Thread Bjorn Helgaas
On Tue, Jan 18, 2022 at 07:37:29PM +0100, Borislav Petkov wrote:
> On Tue, Jan 18, 2022 at 11:58:53AM -0600, Bjorn Helgaas wrote:
> > I don't really care much one way or the other.  I think the simplest
> > approach is to remove QFLAG_APPLY_ONCE from intel_graphics_quirks()
> > and do nothing else, as I suggested here:
> > 
> >   https://lore.kernel.org/r/20220113000805.GA295089@bhelgaas
> > 
> > Unfortunately that didn't occur to me until I'd already suggested more
> > complicated things that no longer seem worthwhile to me.
> > 
> > The static variable might be ugly, but it does seem to be what
> > intel_graphics_quirks() wants -- a "do this at most once per system
> > but we don't know exactly which device" situation.
> 
> I see.
> 
> Yeah, keeping it solely inside intel_graphics_quirks() and maybe with a
> comment ontop, why it is done, is simple. I guess if more quirks need
> this once-thing people might have to consider a more sensible scheme - I
> was just objecting to sprinkling those static vars everywhere.
> 
> But your call. :)

Haha :)  I was hoping not to touch it myself because I think this
whole stolen memory thing is kind of nasty.  It's not clear to me why
we need it at all, or why we have to keep all this device-specific
logic in the kernel, or why it has to be an early quirk as opposed to
a regular PCI quirk.  We had a thread [1] about it a while ago but I
don't think anything got resolved.

But to try to make forward progress, I applied patch 1/5 (actually,
the updated one from [2]) to my pci/misc branch with the updated
commit log and code comments below.

IMO it's probably not worth doing patches 2-5 since I don't think
they fix anything and I'm not sure it improves readability.  I'm sorry
because it was my mistake to encourage that route initially.

I think we could still get this into v5.17 since it's small and the
merge window is still open.

[1] 
https://lore.kernel.org/linux-pci/20201104120506.172447-1-tejaskumarx.surendrakumar.upadh...@intel.com/
[2] https://lore.kernel.org/r/20220118190558.2ququ4vdfjuahicm@ldmartin-desk2

commit 7329e2608d04 ("x86/gpu: Reserve stolen memory for first integrated Intel 
GPU")
Author: Lucas De Marchi 
Date:   Thu Jan 13 16:28:39 2022 -0800

x86/gpu: Reserve stolen memory for first integrated Intel GPU

"Stolen memory" is memory set aside for use by an Intel integrated GPU.
The intel_graphics_quirks() early quirk reserves this memory when it is
called for a GPU that appears in the intel_early_ids[] table of integrated
GPUs.

Previously intel_graphics_quirks() was marked as QFLAG_APPLY_ONCE, so it
was called only for the first Intel GPU found.  If a discrete GPU happened
to be enumerated first, intel_graphics_quirks() was called for it but not
for any integrated GPU found later.  Therefore, stolen memory for such an
integrated GPU was never reserved.

For example, this problem occurs in this Alderlake-P (integrated) + DG2
(discrete) topology where the DG2 is found first, but stolen memory is
associated with the integrated GPU:

  - 00:01.0 Bridge
`- 03:00.0 DG2 discrete GPU
  - 00:02.0 Integrated GPU (with stolen memory)

Remove the QFLAG_APPLY_ONCE flag and call intel_graphics_quirks() for every
Intel GPU.  Reserve stolen memory for the first GPU that appears in
intel_early_ids[].

[bhelgaas: commit log, add code comment, squash in
https://lore.kernel.org/r/20220118190558.2ququ4vdfjuahicm@ldmartin-desk2]
Link: 
https://lore.kernel.org/r/20220114002843.2083382-1-lucas.demar...@intel.com
Signed-off-by: Lucas De Marchi 
Signed-off-by: Bjorn Helgaas 
Cc: sta...@vger.kernel.org

diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 391a4e2b8604..8690fab95ae4 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -515,6 +515,7 @@ static const struct intel_early_ops gen11_early_ops 
__initconst = {
.stolen_size = gen9_stolen_size,
 };
 
+/* Intel integrated GPUs for which we need to reserve "stolen memory" */
 static const struct pci_device_id intel_early_ids[] __initconst = {
INTEL_I830_IDS(&i830_early_ops),
INTEL_I845G_IDS(&i845_early_ops),
@@ -591,6 +592,13 @@ static void __init intel_graphics_quirks(int num, int 
slot, int func)
u16 device;
int i;
 
+   /*
+* Reserve "stolen memory" for an integrated GPU.  If we've already
+* found one, there's nothing to do for other (discrete) GPUs.
+*/
+   if (resource_size(&intel_graphics_stolen_res))
+   return;
+
device = read_pci_config_16(num, slot, func, PCI_DEVICE_ID);
 
for (i = 0;

Re: [Intel-gfx] [PATCH v5 1/5] x86/quirks: Fix stolen detection with integrated + discrete GPU

2022-01-19 Thread Bjorn Helgaas
On Wed, Jan 19, 2022 at 12:30:04PM -0800, Lucas De Marchi wrote:
> On Tue, Jan 18, 2022 at 02:01:45PM -0600, Bjorn Helgaas wrote:

> > Haha :)  I was hoping not to touch it myself because I think this
> > whole stolen memory thing is kind of nasty.  It's not clear to me why
> > we need it at all, or why we have to keep all this device-specific
> > logic in the kernel, or why it has to be an early quirk as opposed to
> > a regular PCI quirk.  We had a thread [1] about it a while ago but I
> > don't think anything got resolved.
> 
> I was reading that thread again and thinking what we could do to try to
> resolve this. I will reply on that thread.

Great!  I hope there's some way around this.

> > But to try to make forward progress, I applied patch 1/5 (actually,
> > the updated one from [2]) to my pci/misc branch with the updated
> > commit log and code comments below.
> 
> thanks. I found the wording in the title odd as when I read "first" it
> gives me the impression it's saying there could be more, which is not
> possible.

I said "first integrated GPU" because Linux doesn't control what
devices are in the system; it just has to deal with whatever it finds.
All one can tell from the code is that if we find one or more devices
that appear in intel_early_ids[], we reserve stolen memory for the
first such device.

System-specific knowledge might tell you that there should only be one
integrated GPU, but there's no constraint like that in Linux.

Bjorn


Re: [Intel-gfx] [PATCH 1/2] x86/gpu: include drm/i915_pciids.h directly in early quirks

2022-03-11 Thread Bjorn Helgaas
On Fri, Mar 11, 2022 at 12:06:38PM +0200, Jani Nikula wrote:
> early-quirks.c is the only user of drm/i915_drm.h that also needs
> drm/i915_pciids.h. Include the masses of PCI ID macros only where
> needed.
> 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> Cc: x...@kernel.org
> Signed-off-by: Jani Nikula 

FWIW:

Acked-by: Bjorn Helgaas 

> ---
> 
> I'm hoping to merge this via drm-intel with the other patch.
> ---
>  arch/x86/kernel/early-quirks.c | 1 +
>  include/drm/i915_drm.h | 1 -
>  2 files changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
> index bd6dad83c65b..805596736e20 100644
> --- a/arch/x86/kernel/early-quirks.c
> +++ b/arch/x86/kernel/early-quirks.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
> index 6722005884db..afbf3ef5643e 100644
> --- a/include/drm/i915_drm.h
> +++ b/include/drm/i915_drm.h
> @@ -26,7 +26,6 @@
>  #ifndef _I915_DRM_H_
>  #define _I915_DRM_H_
>  
> -#include 
>  #include 
>  
>  /* For use by IPS driver */
> -- 
> 2.30.2
> 


[Intel-gfx] [GIT PULL] PCI fixes for v5.17

2022-01-20 Thread Bjorn Helgaas
The following changes since commit fa55b7dcdc43c1aa1ba12bca9d2dd4318c2a0dbf:

  Linux 5.16-rc1 (2021-11-14 13:56:52 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git 
tags/pci-v5.17-fixes-1

for you to fetch changes up to 9c494ca4d3a535f9ca11ad6af1813983c1c6cbdd:

  x86/gpu: Reserve stolen memory for first integrated Intel GPU (2022-01-18 
14:19:06 -0600)


PCI fixes:

  - Reserve "stolen memory" for integrated Intel GPU, even if it's not
the first GPU to be enumerated (Lucas De Marchi)


Lucas De Marchi (1):
  x86/gpu: Reserve stolen memory for first integrated Intel GPU

 arch/x86/kernel/early-quirks.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)


Re: [Intel-gfx] [PATCH 5/7] PM: sleep: core: Rename DPM_FLAG_NEVER_SKIP

2020-04-10 Thread Bjorn Helgaas
On Fri, Apr 10, 2020 at 05:56:13PM +0200, Rafael J. Wysocki wrote:
> From: "Rafael J. Wysocki" 
> 
> Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which
> matches its purpose more closely.
> 
> No functional impact.
> 
> Signed-off-by: Rafael J. Wysocki 

Acked-by: Bjorn Helgaas  # for PCI parts

> ---
>  Documentation/driver-api/pm/devices.rst|  6 +++---
>  Documentation/power/pci.rst| 10 +-
>  drivers/base/power/main.c  |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|  2 +-
>  drivers/gpu/drm/i915/intel_runtime_pm.c|  2 +-
>  drivers/gpu/drm/radeon/radeon_kms.c|  2 +-
>  drivers/misc/mei/pci-me.c  |  2 +-
>  drivers/misc/mei/pci-txe.c |  2 +-
>  drivers/net/ethernet/intel/e1000e/netdev.c |  2 +-
>  drivers/net/ethernet/intel/igb/igb_main.c  |  2 +-
>  drivers/net/ethernet/intel/igc/igc_main.c  |  2 +-
>  drivers/pci/pcie/portdrv_pci.c |  2 +-
>  include/linux/pm.h |  6 +++---
>  13 files changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/driver-api/pm/devices.rst 
> b/Documentation/driver-api/pm/devices.rst
> index f66c7b9126ea..4ace0eba4506 100644
> --- a/Documentation/driver-api/pm/devices.rst
> +++ b/Documentation/driver-api/pm/devices.rst
> @@ -361,9 +361,9 @@ the phases are: ``prepare``, ``suspend``, 
> ``suspend_late``, ``suspend_noirq``.
>   runtime PM disabled.

Minor question about a preceding paragraph that ends:

  In that case, the ``->complete`` callback will be invoked directly
  after the ``->prepare`` callback and is entirely responsible for
  putting the device into a consistent state as appropriate.

What does" a consistent state as appropriate" mean?  I know this is
generic documentation at a high level, so maybe there's no good
explanation for "consistent state," but I don't know what to imagine
there.

And what does "as appropriate" mean?  Would it change the meaning to
drop those two words, or are there situations where it's not
appropriate to put the device into a consistent state?  Or maybe it's
just that the type of device determines what the consistent state is?

>   This feature also can be controlled by device drivers by using the
> - ``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
> - management flags.  [Typically, they are set at the time the driver is
> - probed against the device in question by passing them to the
> + ``DPM_FLAG_NO_DIRECT_COMPLETE`` and ``DPM_FLAG_SMART_PREPARE`` driver
> + power management flags.  [Typically, they are set at the time the driver
> + is probed against the device in question by passing them to the
>   :c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
>   these flags is set, the PM core will not apply the direct-complete
>   procedure described above to the given device and, consequenty, to any

s/consequenty/consequently/

Drive-by comment: I looked for a definition of "direct-complete".  The
closest I found is a couple paragraphs above this, where it says "Note
that this direct-complete procedure ...," but that leaves me to try to
reconstruct the definition from the preceding text.

AFAICT, going to freeze, standby, or memory sleep includes these
callbacks:

  ->prepare
  ->suspend
  ->suspend_late
  ->suspend_noirq
  ->complete (not mentioned in the list of phases)

And "direct-complete" means we skip the suspend, suspend_late,
and suspend_noirq callbacks so we only use these:

  ->prepare
  ->complete

And apparently we skip those callbacks for device X if ->prepare() for
X and all its descendents returns a positive value AND they are all
runtime-suspended, except if a driver for X or a descendent sets
DPM_FLAG_NO_DIRECT_COMPLETE.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [RFC PATCH 00/17] Drop uses of pci_read_config_*() return value

2020-08-02 Thread Bjorn Helgaas
On Sun, Aug 02, 2020 at 08:46:48PM +0200, Borislav Petkov wrote:
> On Sun, Aug 02, 2020 at 07:28:00PM +0200, Saheed Bolarinwa wrote:
> > Because the value ~0 has a meaning to some drivers and only
> 
> No, ~0 means that the PCI read failed. For *every* PCI device I know.

Wait, I'm not convinced yet.  I know that if a PCI read fails, you
normally get ~0 data because the host bridge fabricates it to complete
the CPU load.

But what guarantees that a PCI config register cannot contain ~0?
If there's something about that in the spec I'd love to know where it
is because it would simplify a lot of things.

I don't think we should merge any of these patches as-is.  If we *do*
want to go this direction, we at least need some kind of macro or
function that tests for ~0 so we have a clue about what's happening
and can grep for it.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V3 00/29] PCI: deprecate pci_get_bus_and_slot()

2017-11-29 Thread Bjorn Helgaas
Hi Sinan,

On Mon, Nov 27, 2017 at 11:57:37AM -0500, Sinan Kaya wrote:
> Deprecate pci_get_bus_and_slot() in favor of pci_get_domain_bus_and_slot()
> in order to remove domain 0 assumptions in the kernel.
> 
> pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
> where a PCI device is present. This restricts the device drivers to be
> reused for other domain numbers.
> 
> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
> extract the domain number. Other places, use the actual domain number from
> the device.
> 
> Changes from v2:
> * commit text cleanups
> * remove implicit busfn assignments and use PCI_DEVFN call in place.
> * change storage type for local copy of pci_domain_nr() value to int
> 
> Sinan Kaya (29):
>   alpha/PCI: deprecate pci_get_bus_and_slot()
>   powerpc/PCI: deprecate pci_get_bus_and_slot()
>   x86/PCI: deprecate pci_get_bus_and_slot()
>   ata: deprecate pci_get_bus_and_slot()
>   agp: nvidia: deprecate pci_get_bus_and_slot()
>   edd: deprecate pci_get_bus_and_slot()
>   ibft: deprecate pci_get_bus_and_slot()
>   drm/gma500: deprecate pci_get_bus_and_slot()
>   drm/i915: deprecate pci_get_bus_and_slot()
>   drm/nouveau: deprecate pci_get_bus_and_slot()
>   Drivers: ide: deprecate pci_get_bus_and_slot()
>   iommu/amd: deprecate pci_get_bus_and_slot()
>   powerpc/powermac: deprecate pci_get_bus_and_slot()
>   bnx2x: deprecate pci_get_bus_and_slot()
>   pch_gbe: deprecate pci_get_bus_and_slot()
>   PCI: cpqhp: deprecate pci_get_bus_and_slot()
>   PCI: ibmphp: deprecate pci_get_bus_and_slot()
>   PCI/quirks: deprecate pci_get_bus_and_slot()
>   PCI/syscall: deprecate pci_get_bus_and_slot()
>   xen: deprecate pci_get_bus_and_slot()
>   openprom: deprecate pci_get_bus_and_slot()
>   [media] atomisp: deprecate pci_get_bus_and_slot()
>   staging: rts5208: remove rtsx_read_pci_cfg_byte()
>   backlight: deprecate pci_get_bus_and_slot()
>   video: fbdev: intelfb: deprecate pci_get_bus_and_slot()
>   video: fbdev: nvidia: deprecate pci_get_bus_and_slot()
>   video: fbdev: riva: deprecate pci_get_bus_and_slot()
>   i7300_idle: remove unused file
>   PCI: remove pci_get_bus_and_slot() function

Thanks for doing this work!

I see other maintainers are picking up some of these.  I'll wait until
later in the cycle and pick up any remaining ones and the PCI core
parts.

If you repost the series for any reason, please capitalize the first
word of the changelog summary to match the drivers/pci convention (for
non PCI patches, follow *their* convention, of course) and put the
acks/reviewed-by tags after your signed-off-by.  I use this order:

  Reported-by:
  Tested-by:
  Signed-off-by: (author)
  Signed-off-by: (chain)
  Reviewed-by:
  Acked-by:
  Cc: sta...@vger.kernel.org# 3.4+
  Cc: (other)

But don't bother reposting the series just for this reason; I can fix
these up myself.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v1] ACPI: Switch to use generic UUID API

2017-05-04 Thread Bjorn Helgaas
On Thu, May 4, 2017 at 4:21 AM, Andy Shevchenko
 wrote:
> acpi_evaluate_dsm() and friends take a pointer to a raw buffer of 16
> bytes. Instead we convert them to use uuid_le type. At the same time we
> convert current users.
>
> acpi_str_to_uuid() becomes useless after the conversion and it's safe to
> get rid of it.
>
> The conversion fixes a potential bug in int340x_thermal as well since
> we have to use memcmp() on binary data.
>
> Cc: Rafael J. Wysocki 
> Cc: Mika Westerberg 
> Cc: Borislav Petkov 
> Cc: Dan Williams 
> Cc: Amir Goldstein 
> Cc: Jarkko Sakkinen 
> Cc: Jani Nikula 
> Cc: Ben Skeggs 
> Cc: Benjamin Tissoires 
> Cc: Joerg Roedel 
> Cc: Adrian Hunter 
> Cc: Yisen Zhuang 
> Cc: Bjorn Helgaas 
> Cc: Zhang Rui 
> Cc: Felipe Balbi 
> Cc: Mathias Nyman 
> Cc: Heikki Krogerus 
> Cc: Liam Girdwood 
> Cc: Mark Brown 
> Signed-off-by: Andy Shevchenko 

For the drivers/pci parts:

Acked-by: Bjorn Helgaas 

> ---
>  drivers/acpi/acpi_extlog.c | 10 +++---
>  drivers/acpi/bus.c | 29 ++--
>  drivers/acpi/nfit/core.c   | 40 
> +++---
>  drivers/acpi/nfit/nfit.h   |  3 +-
>  drivers/acpi/utils.c   |  4 +--
>  drivers/char/tpm/tpm_crb.c |  9 +++--
>  drivers/char/tpm/tpm_ppi.c | 20 +--
>  drivers/gpu/drm/i915/intel_acpi.c  | 14 +++-
>  drivers/gpu/drm/nouveau/nouveau_acpi.c | 20 +--
>  drivers/gpu/drm/nouveau/nvkm/subdev/mxm/base.c |  9 +++--
>  drivers/hid/i2c-hid/i2c-hid.c  |  9 +++--
>  drivers/iommu/dmar.c   | 11 +++---
>  drivers/mmc/host/sdhci-pci-core.c  |  9 +++--
>  drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 15 
>  drivers/pci/pci-acpi.c | 11 +++---
>  drivers/pci/pci-label.c|  4 +--
>  drivers/thermal/int340x_thermal/int3400_thermal.c  |  8 ++---
>  drivers/usb/dwc3/dwc3-pci.c|  6 ++--
>  drivers/usb/host/xhci-pci.c|  9 +++--
>  drivers/usb/misc/ucsi.c|  2 +-
>  drivers/usb/typec/typec_wcove.c|  4 +--
>  include/acpi/acpi_bus.h|  9 ++---
>  include/linux/acpi.h   |  4 +--
>  include/linux/pci-acpi.h   |  2 +-
>  sound/soc/intel/skylake/skl-nhlt.c |  7 ++--
>  tools/testing/nvdimm/test/iomap.c  |  2 +-
>  tools/testing/nvdimm/test/nfit.c   |  2 +-
>  27 files changed, 116 insertions(+), 156 deletions(-)
>
> diff --git a/drivers/acpi/acpi_extlog.c b/drivers/acpi/acpi_extlog.c
> index 502ea4dc2080..69d6140b6afa 100644
> --- a/drivers/acpi/acpi_extlog.c
> +++ b/drivers/acpi/acpi_extlog.c
> @@ -182,17 +182,17 @@ static int extlog_print(struct notifier_block *nb, 
> unsigned long val,
>
>  static bool __init extlog_get_l1addr(void)
>  {
> -   u8 uuid[16];
> +   uuid_le uuid;
> acpi_handle handle;
> union acpi_object *obj;
>
> -   acpi_str_to_uuid(extlog_dsm_uuid, uuid);
> -
> +   if (uuid_le_to_bin(extlog_dsm_uuid, &uuid))
> +   return false;
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
> return false;
> -   if (!acpi_check_dsm(handle, uuid, EXTLOG_DSM_REV, 1 << 
> EXTLOG_FN_ADDR))
> +   if (!acpi_check_dsm(handle, &uuid, EXTLOG_DSM_REV, 1 << 
> EXTLOG_FN_ADDR))
> return false;
> -   obj = acpi_evaluate_dsm_typed(handle, uuid, EXTLOG_DSM_REV,
> +   obj = acpi_evaluate_dsm_typed(handle, &uuid, EXTLOG_DSM_REV,
>   EXTLOG_FN_ADDR, NULL, 
> ACPI_TYPE_INTEGER);
> if (!obj) {
> return false;
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index 784bda663d16..e8130a4873e9 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -196,42 +196,19 @@ static void acpi_print_osc_error(acpi_handle handle,
> pr_debug("\n");
>  }
>
> -acpi_status acpi_str_to_uuid(char *str, u8 *uuid)
> -{
> -   int i;
> -   static int opc_map_to_uuid[16] = {6, 4, 2, 0, 11, 9, 16, 14, 19, 21,
> -   24, 26, 28, 30, 32, 34};
> -
> -   if (strlen(str) != 36)
> -   return AE_BAD_PARAMETER;
> -   for (i = 0; i < 36; i++) {
> -   if (i == 8 || i == 13 || i == 18 || i == 23) {
&g

Re: [Intel-gfx] [PATCH] pci: Add a few new IDs for Intel GPU "spurious interrupt" quirk

2018-09-26 Thread Bjorn Helgaas
[+cc Intel DRM maintainers, etc]

On Wed, Sep 26, 2018 at 08:14:01AM -0700, Bin Meng wrote:
> Add more PCI IDs to the Intel GPU "spurious interrupt" quirk table,
> which are known to break.

Do you have a reference for this?  Any public bug reports, bugzilla,
Intel spec reference or errata?  "Which are known to break" is pretty
vague.

> See commit f67fd55fa96f ("PCI: Add quirk for still enabled interrupts
> on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add new
> ID for Intel GPU "spurious interrupt" quirk") for some history.
> 
> Based on current findings, it is highly possible that all Intel
> 1st/2nd/3rd generation Core processors' IGD has such quirk.

Can you include a reference to these "current findings"?  I assume you
have bug reports that include the device IDs you're adding?  If not,
how did you build this list of new IDs?

The function comment added by f67fd55fa96f ("PCI: Add quirk for still
enabled interrupts on Intel Sandy Bridge GPUs") suggests that this is
actually a BIOS issue, not a hardware erratum, i.e., I don't see
anything there that suggests a hardware defect.

But there must be a hole somewhere -- the kernel can't be expected to
disable interrupts in device-specific ways when there's no driver
loaded.  Maybe it's simply a BIOS defect or maybe there's some
interrupt or _PRT-related setup we're missing.

> Signed-off-by: Bin Meng 
> Cc:  # v3.4+
> ---
> 
>  drivers/pci/quirks.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 6bc27b7..c0673a7 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3190,7 +3190,11 @@ static void disable_igfx_irq(struct pci_dev *dev)
>  
>   pci_iounmap(dev, regs);
>  }
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0042, disable_igfx_irq);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0046, disable_igfx_irq);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x004a, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0106, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
>  
> -- 
> 2.7.4
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] pci: Add a few new IDs for Intel GPU "spurious interrupt" quirk

2018-10-03 Thread Bjorn Helgaas
On Thu, Sep 27, 2018 at 10:10:07AM +0800, Bin Meng wrote:
> On Thu, Sep 27, 2018 at 12:57 AM Bjorn Helgaas  wrote:
> > On Wed, Sep 26, 2018 at 08:14:01AM -0700, Bin Meng wrote:
> > > Add more PCI IDs to the Intel GPU "spurious interrupt" quirk table,
> > > which are known to break.
> >
> > Do you have a reference for this?  Any public bug reports, bugzilla,
> > Intel spec reference or errata?  "Which are known to break" is pretty
> > vague.
> 
> Sorry I used wrong words and should have been clearer. These devices
> are validated to be broken. The test I used is very simple, just
> unplug the VGA cable and plug it again, and "spurious interrupt" will
> be seen on the interrupt line of the IGD device. I was not aware of
> any public bugs filed to Intel, nor seen any errata from Intel.

The original commit, f67fd55fa96f ("PCI: Add quirk for still enabled
interrupts on Intel Sandy Bridge GPUs"), says some systems "crash"
(not sure if that means an oops or an actual crash that requires a
reboot) and on other systems, Linux disables the shared interrupt
line.  I assume disabling the interrupt line keeps devices using that
line from working, but does not directly cause a crash.

What specific symptom do you see here?  I think it might be useful to
collect details, e.g., dmesg logs, /proc/interrupts contents, output
of "sudo lspci -vv", etc., for the systems you're quirking here.  I'm
hoping we can eventually figure out a solution that doesn't require a
quirk for every new GPU, and maybe that info will help find it.

> > > See commit f67fd55fa96f ("PCI: Add quirk for still enabled interrupts
> > > on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add new
> > > ID for Intel GPU "spurious interrupt" quirk") for some history.
> > >
> > > Based on current findings, it is highly possible that all Intel
> > > 1st/2nd/3rd generation Core processors' IGD has such quirk.
> >
> > Can you include a reference to these "current findings"?  I assume you
> > have bug reports that include the device IDs you're adding?  If not,
> > how did you build this list of new IDs?
> 
> By "current findings" I mean given the IDs we have here, plus previous
> one added by Thomas, it's highly possible this VGA BIOS bug exists in
> every 1st/2nd/3rd generation Core processors.
> 
> > The function comment added by f67fd55fa96f ("PCI: Add quirk for still
> > enabled interrupts on Intel Sandy Bridge GPUs") suggests that this is
> > actually a BIOS issue, not a hardware erratum, i.e., I don't see
> > anything there that suggests a hardware defect.
> >
> > But there must be a hole somewhere -- the kernel can't be expected to
> > disable interrupts in device-specific ways when there's no driver
> > loaded.  Maybe it's simply a BIOS defect or maybe there's some
> > interrupt or _PRT-related setup we're missing.
> 
> It's a pure VGA BIOS bug, not the BIOS bug or _PRT etc. The VGA BIOS
> forgot to turn off the interrupt on these devices.

If this is a VGA BIOS defect, it's not very likely that it will
magically be fixed for all new Intel GPUs, so in effect it sounds like
we need to update this list of quirks in Linux every time a new Intel
GPU comes out.  That prospect is a little daunting.

Do you happen to know if Windows has the same problem?  I.e., if you
boot an old version of Windows with a new GPU, and unplug the VGA
cable, does Windows crash?  If Windows can figure out how to handle
that situation gracefully, Linux should be able to do it, too.

> > > Signed-off-by: Bin Meng 
> > > Cc:  # v3.4+
> > > ---
> > >
> > >  drivers/pci/quirks.c | 4 
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 6bc27b7..c0673a7 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -3190,7 +3190,11 @@ static void disable_igfx_irq(struct pci_dev *dev)
> > >
> > >   pci_iounmap(dev, regs);
> > >  }
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0042, disable_igfx_irq);
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0046, disable_igfx_irq);
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x004a, disable_igfx_irq);
> > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
> > > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0106, disable_igfx_irq);
> > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
> > >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
> > >
> > > --
> 
> Regards,
> Bin
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] pci: Add a few new IDs for Intel GPU "spurious interrupt" quirk

2018-10-09 Thread Bjorn Helgaas
On Mon, Oct 08, 2018 at 05:44:08PM +0800, Bin Meng wrote:
> On Thu, Oct 4, 2018 at 4:12 AM Bjorn Helgaas  wrote:
> > On Thu, Sep 27, 2018 at 10:10:07AM +0800, Bin Meng wrote:
> > > On Thu, Sep 27, 2018 at 12:57 AM Bjorn Helgaas  wrote:
> > > > On Wed, Sep 26, 2018 at 08:14:01AM -0700, Bin Meng wrote:
> > > > > Add more PCI IDs to the Intel GPU "spurious interrupt" quirk table,
> > > > > which are known to break.
> > > >
> > > > Do you have a reference for this?  Any public bug reports, bugzilla,
> > > > Intel spec reference or errata?  "Which are known to break" is pretty
> > > > vague.
> > >
> > > Sorry I used wrong words and should have been clearer. These devices
> > > are validated to be broken. The test I used is very simple, just
> > > unplug the VGA cable and plug it again, and "spurious interrupt" will
> > > be seen on the interrupt line of the IGD device. I was not aware of
> > > any public bugs filed to Intel, nor seen any errata from Intel.
> >
> > The original commit, f67fd55fa96f ("PCI: Add quirk for still enabled
> > interrupts on Intel Sandy Bridge GPUs"), says some systems "crash"
> > (not sure if that means an oops or an actual crash that requires a
> > reboot) and on other systems, Linux disables the shared interrupt
> > line.  I assume disabling the interrupt line keeps devices using that
> > line from working, but does not directly cause a crash.
> >
> 
> Correct, disable the shared interrupt line keeps all devices using
> that line from working, which is current kernel's behavior w/o this
> quirk handling: it disables the (shared) interrupt line after 100.000+
> generated interrupts. But the side effect is that other devices become
> unusable after that (eg: USB devices which share the same interrupt
> line with the Intel GPU). That's why the original commit, f67fd55fa96f
> ("PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge
> GPUs") disables the GPU's interrupt directly, which should really be
> done by the VGA BIOS itself (a buggy VBIOS!).
> 
> > What specific symptom do you see here?  I think it might be useful to
> > collect details, e.g., dmesg logs, /proc/interrupts contents, output
> > of "sudo lspci -vv", etc., for the systems you're quirking here.  I'm
> > hoping we can eventually figure out a solution that doesn't require a
> > quirk for every new GPU, and maybe that info will help find it.
> 
> The symptom was described briefly in the original commit f67fd55fa96f
> too, that disables the (shared) interrupt line after 100.000+
> generated interrupts (can be observed via /proc/interrupts).
> 
> > > > > See commit f67fd55fa96f ("PCI: Add quirk for still enabled interrupts
> > > > > on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add new
> > > > > ID for Intel GPU "spurious interrupt" quirk") for some history.
> > > > >
> > > > > Based on current findings, it is highly possible that all Intel
> > > > > 1st/2nd/3rd generation Core processors' IGD has such quirk.
> > > >
> > > > Can you include a reference to these "current findings"?  I assume you
> > > > have bug reports that include the device IDs you're adding?  If not,
> > > > how did you build this list of new IDs?
> > >
> > > By "current findings" I mean given the IDs we have here, plus previous
> > > one added by Thomas, it's highly possible this VGA BIOS bug exists in
> > > every 1st/2nd/3rd generation Core processors.
> > >
> > > > The function comment added by f67fd55fa96f ("PCI: Add quirk for still
> > > > enabled interrupts on Intel Sandy Bridge GPUs") suggests that this is
> > > > actually a BIOS issue, not a hardware erratum, i.e., I don't see
> > > > anything there that suggests a hardware defect.
> > > >
> > > > But there must be a hole somewhere -- the kernel can't be expected to
> > > > disable interrupts in device-specific ways when there's no driver
> > > > loaded.  Maybe it's simply a BIOS defect or maybe there's some
> > > > interrupt or _PRT-related setup we're missing.
> > >
> > > It's a pure VGA BIOS bug, not the BIOS bug or _PRT etc. The VGA BIOS
> > > forgot to turn off the interrupt on these devices.
> >
> > If this is a VGA BIOS defect, it's not very l

Re: [Intel-gfx] [PATCH] pci: Add a few new IDs for Intel GPU "spurious interrupt" quirk

2018-10-11 Thread Bjorn Helgaas
On Thu, Oct 11, 2018 at 03:11:01PM +0800, Bin Meng wrote:
> On Wed, Oct 10, 2018 at 1:02 AM Bjorn Helgaas  wrote:
> > On Mon, Oct 08, 2018 at 05:44:08PM +0800, Bin Meng wrote:
> > > On Thu, Oct 4, 2018 at 4:12 AM Bjorn Helgaas  wrote:
> > > > On Thu, Sep 27, 2018 at 10:10:07AM +0800, Bin Meng wrote:
> > > > > On Thu, Sep 27, 2018 at 12:57 AM Bjorn Helgaas  
> > > > > wrote:
> > > > > > On Wed, Sep 26, 2018 at 08:14:01AM -0700, Bin Meng wrote:
> > > > > > > Add more PCI IDs to the Intel GPU "spurious interrupt" quirk 
> > > > > > > table,
> > > > > > > which are known to break.
> > > > > >
> > > > > > Do you have a reference for this?  Any public bug reports, bugzilla,
> > > > > > Intel spec reference or errata?  "Which are known to break" is 
> > > > > > pretty
> > > > > > vague.
> > > > >
> > > > > Sorry I used wrong words and should have been clearer. These devices
> > > > > are validated to be broken. The test I used is very simple, just
> > > > > unplug the VGA cable and plug it again, and "spurious interrupt" will
> > > > > be seen on the interrupt line of the IGD device. I was not aware of
> > > > > any public bugs filed to Intel, nor seen any errata from Intel.
> > > >
> > > > The original commit, f67fd55fa96f ("PCI: Add quirk for still enabled
> > > > interrupts on Intel Sandy Bridge GPUs"), says some systems "crash"
> > > > (not sure if that means an oops or an actual crash that requires a
> > > > reboot) and on other systems, Linux disables the shared interrupt
> > > > line.  I assume disabling the interrupt line keeps devices using that
> > > > line from working, but does not directly cause a crash.
> > > >
> > >
> > > Correct, disable the shared interrupt line keeps all devices using
> > > that line from working, which is current kernel's behavior w/o this
> > > quirk handling: it disables the (shared) interrupt line after 100.000+
> > > generated interrupts. But the side effect is that other devices become
> > > unusable after that (eg: USB devices which share the same interrupt
> > > line with the Intel GPU). That's why the original commit, f67fd55fa96f
> > > ("PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge
> > > GPUs") disables the GPU's interrupt directly, which should really be
> > > done by the VGA BIOS itself (a buggy VBIOS!).
> > >
> > > > What specific symptom do you see here?  I think it might be useful to
> > > > collect details, e.g., dmesg logs, /proc/interrupts contents, output
> > > > of "sudo lspci -vv", etc., for the systems you're quirking here.  I'm
> > > > hoping we can eventually figure out a solution that doesn't require a
> > > > quirk for every new GPU, and maybe that info will help find it.
> > >
> > > The symptom was described briefly in the original commit f67fd55fa96f
> > > too, that disables the (shared) interrupt line after 100.000+
> > > generated interrupts (can be observed via /proc/interrupts).
> > >
> > > > > > > See commit f67fd55fa96f ("PCI: Add quirk for still enabled 
> > > > > > > interrupts
> > > > > > > on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add 
> > > > > > > new
> > > > > > > ID for Intel GPU "spurious interrupt" quirk") for some history.
> > > > > > >
> > > > > > > Based on current findings, it is highly possible that all Intel
> > > > > > > 1st/2nd/3rd generation Core processors' IGD has such quirk.
> > > > > >
> > > > > > Can you include a reference to these "current findings"?  I assume 
> > > > > > you
> > > > > > have bug reports that include the device IDs you're adding?  If not,
> > > > > > how did you build this list of new IDs?
> > > > >
> > > > > By "current findings" I mean given the IDs we have here, plus previous
> > > > > one added by Thomas, it's highly possible this VGA BIOS bug exists in
> > > > > every 1st/2nd/3rd generation Core processors.
> > > > >
> > > > > > The functio

Re: [Intel-gfx] [PATCH 10/12] pci: use for_each_if

2018-07-09 Thread Bjorn Helgaas
On Mon, Jul 09, 2018 at 10:36:48AM +0200, Daniel Vetter wrote:
> Avoids the inverted condition compared to the open-coded version.
> 
> Signed-off-by: Daniel Vetter 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org

Acked-by: Bjorn Helgaas 

I assume you'll merge this with the rest of the series.

> ---
>  include/linux/pci.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 340029b2fb38..1484471ed048 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -601,7 +601,7 @@ static inline bool pci_is_bridge(struct pci_dev *dev)
>  
>  #define for_each_pci_bridge(dev, bus)\
>   list_for_each_entry(dev, &bus->devices, bus_list)   \
> - if (!pci_is_bridge(dev)) {} else
> + for_each_if (pci_is_bridge(dev))
>  
>  static inline struct pci_dev *pci_upstream_bridge(struct pci_dev *dev)
>  {
> -- 
> 2.18.0
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 01/29] pci: Decouple quirks.c from i915_reg.h

2015-11-25 Thread Bjorn Helgaas
Hi Ville,

On Wed, Nov 04, 2015 at 11:19:49PM +0200, ville.syrj...@linux.intel.com wrote:
> From: Ville Syrjälä 
> 
> i915 register defines are going to become type safe, so going forward
> the register defines can't be used as straight numbers. Since quirks.c
> needs just a few extra register defines from i915_reg.h, decouple the
> two by defining the required registers locally in quirks.c. This was
> already done for a few other igpu related registers.
> 
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Ville Syrjälä 

I haven't seen the rest of this series, but this patch is OK by me,
and I assume you'll merge this along with the whole series.  Please
adjust the subject line to be:

  PCI: Decouple quirks.c from i915_reg.h

Acked-by: Bjorn Helgaas 

> ---
>  drivers/pci/quirks.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index b03373f..78a70fb 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -3404,7 +3404,9 @@ static int reset_intel_82599_sfp_virtfn(struct pci_dev 
> *dev, int probe)
>   return 0;
>  }
>  
> -#include "../gpu/drm/i915/i915_reg.h"
> +#define SOUTH_CHICKEN2   0xc2004
> +#define PCH_PP_STATUS0xc7200
> +#define PCH_PP_CONTROL   0xc7204
>  #define MSG_CTL  0x45010
>  #define NSDE_PWR_STATE   0xd0100
>  #define IGD_OPERATION_TIMEOUT1 /* set timeout 10 seconds */
> -- 
> 2.4.10
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v4] PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval

2015-11-30 Thread Bjorn Helgaas
Hi Imre,

> PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval

Please capitalize the first word, e.g., "Tune ..."

Maybe "Tune down non-fatal runtime suspend error messages" would be
more to the point.  Or maybe "retryable," since it sounds like that's
what's supposed to happen.


On Sat, Nov 28, 2015 at 10:34:24AM +0200, Imre Deak wrote:
> The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
> suspend hooks as errors, but they still show up as errors in dmesg. Tune
> them down.

I looked briefly for the place in the runtime PM core where we handle
EBUSY and EAGAIN specially, but I didn't find it.  A pointer would be
helpful.

> One problem caused by this was noticed by Daniel: the i915 driver
> returns EAGAIN to signal a temporary failure to suspend and as a request
> towards the RPM core for scheduling a suspend again. This is a normal
> event, but the resulting error message flags a breakage during the
> driver's automated testing which parses dmesg and picks up the error.
> 
> v2:
> - fix compile breake when CONFIG_PM_SLEEP=n (0-day builder)
> v3:
> - instead of modifying the suspend_report_result() helper to disinguish
>   between the runtime and system suspend case, inline the error
>   printing, it's not used anywhere else (Rafael)
> v4:
> - don't refer to log levels as flags in code comment (Rafael)
> - use pr_debug(), pr_err() instead of the corresponding printk() (Rafael)

The version history is very useful during development but not after
merging, so I prefer it to go after the "---" marker so it doesn't get
included in the permanent changelog.

> Reported-by: Daniel Vetter 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92992
> CC: Bjorn Helgaas 
> CC: Rafael J. Wysocki 
> Signed-off-by: Imre Deak 
> ---
>  drivers/pci/pci-driver.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 4446fcb..32a9947 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1146,9 +1146,21 @@ static int pci_pm_runtime_suspend(struct device *dev)
>   pci_dev->state_saved = false;
>   pci_dev->no_d3cold = false;
>   error = pm->runtime_suspend(dev);
> - suspend_report_result(pm->runtime_suspend, error);
> - if (error)
> + if (error) {
> + /*
> +  * -EBUSY and -EAGAIN is used to request the runtime PM core
> +  * to schedule a new suspend, so log the event only with debug
> +  * log level.
> +  */
> + if (error == -EBUSY || error == -EAGAIN)
> + pr_debug("%s(): %pF returns %d\n", __func__,
> +  pm->runtime_suspend, error);
> + else
> + pr_err("%s(): %pF returns %d\n", __func__,
> +pm->runtime_suspend, error);

This looks fine to me in principle.

We have the device pointer, so please use dev_printk(KERN_DEBUG) and
dev_err().  If you're OK with changing the semantics so the debug
message is only emitted when dynamically enabled, you could use
dev_dbg().

I don't know what a user is supposed to do with a message like:

  pci_pm_runtime_suspend(): ata_port_runtime_suspend+0x0/0x200 returns -12

It feels a little bit ... internal.  Using dev_err() will help anchor
it to a specific device.  It's not related to a specific PC in the
function, so maybe use %pf instead of %pF to get rid of the function
offset and size.  And the message doesn't say anything about what the
return code *means*; maybe something like:

  ata_piix :00:14.0: can't suspend now (ata_port_runtime_suspend returned 
-11)
  ata_piix :00:14.0: can't suspend (ata_port_runtime_suspend returned -12)

> +
>   return error;
> + }
>   if (!pci_dev->d3cold_allowed)
>   pci_dev->no_d3cold = true;
>  
> -- 
> 2.5.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] PCI / PM: Tune down retryable runtime suspend error messages

2015-12-01 Thread Bjorn Helgaas
On Wed, Dec 02, 2015 at 02:54:45AM +0100, Rafael J. Wysocki wrote:
> On Monday, November 30, 2015 09:02:55 PM Imre Deak wrote:
> > The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
> > suspend hooks as errors, but they still show up as errors in dmesg. Tune
> > them down. See rpm_suspend() for details of handling these return values.
> > 
> > Note that we use dev_dbg() for the retryable retvals, so after this
> > change you'll need either CONFIG_DYNAMIC_DEBUG or CONFIG_PCI_DEBUG
> > for them to show up in the log.
> > 
> > One problem caused by this was noticed by Daniel: the i915 driver
> > returns EAGAIN to signal a temporary failure to suspend and as a request
> > towards the RPM core for scheduling a suspend again. This is a normal
> > event, but the resulting error message flags a breakage during the
> > driver's automated testing which parses dmesg and picks up the error.
> > 
> > Reported-by: Daniel Vetter 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92992
> > CC: Bjorn Helgaas 
> > CC: Rafael J. Wysocki 
> > Signed-off-by: Imre Deak 
> 
> Hi Bjorn,
> 
> Are you going to handle this one or should I take care of it?

Why don't you take it, since you're the PM guru :)

Acked-by: Bjorn Helgaas 

Thanks for all your work, Imre!

> > ---
> > 
> > v2:
> > - fix compile breake when CONFIG_PM_SLEEP=n (0-day builder)
> > v3:
> > - instead of modifying the suspend_report_result() helper to disinguish
> >   between the runtime and system suspend case, inline the error
> >   printing, it's not used anywhere else (Rafael)
> > v4:
> > - don't refer to log levels as flags in code comment (Rafael)
> > - use pr_debug(), pr_err() instead of the corresponding printk() (Rafael)
> > v5:
> > - clarify commit message (Bjorn)
> > - use dev_dbg, dev_err instead of pr_debug, pr_err (Bjorn)
> > - use %pf in printk format instead of %pF (Bjorn)
> > - make the debug/error messages more meaningful (Bjorn)
> > ---
> >  drivers/pci/pci-driver.c | 16 ++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> > index 4446fcb..d7ffd66 100644
> > --- a/drivers/pci/pci-driver.c
> > +++ b/drivers/pci/pci-driver.c
> > @@ -1146,9 +1146,21 @@ static int pci_pm_runtime_suspend(struct device *dev)
> > pci_dev->state_saved = false;
> > pci_dev->no_d3cold = false;
> > error = pm->runtime_suspend(dev);
> > -   suspend_report_result(pm->runtime_suspend, error);
> > -   if (error)
> > +   if (error) {
> > +   /*
> > +* -EBUSY and -EAGAIN is used to request the runtime PM core
> > +* to schedule a new suspend, so log the event only with debug
> > +* log level.
> > +*/
> > +   if (error == -EBUSY || error == -EAGAIN)
> > +   dev_dbg(dev, "can't suspend now (%pf returned %d)\n",
> > +   pm->runtime_suspend, error);
> > +   else
> > +   dev_err(dev, "can't suspend (%pf returned %d)\n",
> > +   pm->runtime_suspend, error);
> > +
> > return error;
> > +   }
> > if (!pci_dev->d3cold_allowed)
> > pci_dev->no_d3cold = true;
> >  
> > 
> 
> -- 
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-03 Thread Bjorn Helgaas
The purpose of this series is to:

  - Fix the "BAR 6: [??? 0x flags 0x2] has bogus alignment"
messages reported by Linus [1], Andy [2], and others.

  - Move arch-specific shadow ROM location knowledge, e.g.,
0xC-0xD, from PCI core to arch code.

  - Fix the ia64 and MIPS Loongson 3 oddity of keeping virtual
addresses in shadow ROM struct resource (resources should always
contain *physical* addresses).

  - Remove now-unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
flags.

This series is based on v4.5-rc1, and it's available on my
pci/resource git branch (along with a couple tiny unrelated patches)
at [3].

Bjorn


[1] 
http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
[2] 
http://lkml.kernel.org/r/calcetrv+rwnpzxyl8uvnsragu-6cczd_cc9pfjt2nctjplz...@mail.gmail.com
[3] 
https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/resource


---

Bjorn Helgaas (12):
  PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
  PCI: Don't assign or reassign immutable resources
  PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
  PCI: Set ROM shadow location in arch code, not in PCI core
  PCI: Clean up pci_map_rom() whitespace
  ia64/PCI: Use temporary struct resource * to avoid repetition
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM 
resource
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  PCI: Simplify sysfs ROM cleanup


 arch/ia64/pci/fixup.c  |   21 +++--
 arch/ia64/sn/kernel/io_acpi_init.c |   22 ++
 arch/ia64/sn/kernel/io_init.c  |   51 --
 arch/mips/pci/fixup-loongson3.c|   19 +---
 arch/x86/pci/fixup.c   |   21 +++--
 drivers/pci/pci-sysfs.c|   13 +-
 drivers/pci/remove.c   |1 
 drivers/pci/rom.c  |   83 +++-
 drivers/pci/setup-res.c|6 +++
 include/linux/ioport.h |4 --
 10 files changed, 111 insertions(+), 130 deletions(-)
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 02/12] PCI: Don't assign or reassign immutable resources

2016-03-03 Thread Bjorn Helgaas
IORESOURCE_PCI_FIXED means the resource can't be moved, so if it's set,
don't bother trying to assign or reassign the resource.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/setup-res.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 604011e..66c4d8f 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -276,6 +276,9 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
resource_size_t align, size;
int ret;
 
+   if (res->flags & IORESOURCE_PCI_FIXED)
+   return 0;
+
res->flags |= IORESOURCE_UNSET;
align = pci_resource_alignment(dev, res);
if (!align) {
@@ -321,6 +324,9 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, 
resource_size_t addsiz
resource_size_t new_size;
int ret;
 
+   if (res->flags & IORESOURCE_PCI_FIXED)
+   return 0;
+
flags = res->flags;
res->flags |= IORESOURCE_UNSET;
if (!res->parent) {

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 03/12] PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy

2016-03-03 Thread Bjorn Helgaas
If we're using a RAM shadow copy instead of the ROM BAR, we don't need to
touch the ROM BAR enable bit.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/rom.c |   16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/rom.c b/drivers/pci/rom.c
index 9eaca39..5da8d06 100644
--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -24,13 +24,17 @@
  */
 int pci_enable_rom(struct pci_dev *pdev)
 {
-   struct resource *res = pdev->resource + PCI_ROM_RESOURCE;
+   struct resource *res = &pdev->resource[PCI_ROM_RESOURCE];
struct pci_bus_region region;
u32 rom_addr;
 
if (!res->flags)
return -1;
 
+   /* Nothing to enable if we're using a shadow copy in RAM */
+   if (res->flags & IORESOURCE_ROM_SHADOW)
+   return 0;
+
pcibios_resource_to_bus(pdev->bus, ®ion, res);
pci_read_config_dword(pdev, pdev->rom_base_reg, &rom_addr);
rom_addr &= ~PCI_ROM_ADDRESS_MASK;
@@ -49,7 +53,12 @@ EXPORT_SYMBOL_GPL(pci_enable_rom);
  */
 void pci_disable_rom(struct pci_dev *pdev)
 {
+   struct resource *res = &pdev->resource[PCI_ROM_RESOURCE];
u32 rom_addr;
+
+   if (res->flags & IORESOURCE_ROM_SHADOW)
+   return;
+
pci_read_config_dword(pdev, pdev->rom_base_reg, &rom_addr);
rom_addr &= ~PCI_ROM_ADDRESS_ENABLE;
pci_write_config_dword(pdev, pdev->rom_base_reg, rom_addr);
@@ -154,7 +163,6 @@ void __iomem *pci_map_rom(struct pci_dev *pdev, size_t 
*size)
if (!rom) {
/* restore enable if ioremap fails */
if (!(res->flags & (IORESOURCE_ROM_ENABLE |
-   IORESOURCE_ROM_SHADOW |
IORESOURCE_ROM_COPY)))
pci_disable_rom(pdev);
return NULL;
@@ -186,8 +194,8 @@ void pci_unmap_rom(struct pci_dev *pdev, void __iomem *rom)
 
iounmap(rom);
 
-   /* Disable again before continuing, leave enabled if pci=rom */
-   if (!(res->flags & (IORESOURCE_ROM_ENABLE | IORESOURCE_ROM_SHADOW)))
+   /* Disable again before continuing */
+   if (!(res->flags & IORESOURCE_ROM_ENABLE))
pci_disable_rom(pdev);
 }
 EXPORT_SYMBOL(pci_unmap_rom);

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 01/12] PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED

2016-03-03 Thread Bjorn Helgaas
A shadow copy of an option ROM is placed by the BIOS as a fixed address.
Set IORESOURCE_PCI_FIXED to indicate that we can't move the shadow copy.
This prevents warnings like the following when we assign resources:

  BAR 6: [??? 0x flags 0x2] has bogus alignment

This warning is emitted by pdev_sort_resources(), which already ignores
IORESOURCE_PCI_FIXED resources.

Link: 
http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
Signed-off-by: Bjorn Helgaas 
---
 arch/ia64/pci/fixup.c |3 ++-
 arch/x86/pci/fixup.c  |3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/pci/fixup.c b/arch/ia64/pci/fixup.c
index fc505d5..fd9ceff 100644
--- a/arch/ia64/pci/fixup.c
+++ b/arch/ia64/pci/fixup.c
@@ -61,7 +61,8 @@ static void pci_fixup_video(struct pci_dev *pdev)
if (!vga_default_device() || pdev == vga_default_device()) {
pci_read_config_word(pdev, PCI_COMMAND, &config);
if (config & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
-   pdev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_SHADOW;
+   pdev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_SHADOW |
+   IORESOURCE_PCI_FIXED;
dev_printk(KERN_DEBUG, &pdev->dev, "Video device with 
shadowed ROM\n");
}
}
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 0ae7e9f..dac027c 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -336,7 +336,8 @@ static void pci_fixup_video(struct pci_dev *pdev)
if (!vga_default_device() || pdev == vga_default_device()) {
pci_read_config_word(pdev, PCI_COMMAND, &config);
if (config & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
-   pdev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_SHADOW;
+   pdev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_SHADOW |
+   IORESOURCE_PCI_FIXED;
dev_printk(KERN_DEBUG, &pdev->dev, "Video device with 
shadowed ROM\n");
}
}

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 04/12] PCI: Set ROM shadow location in arch code, not in PCI core

2016-03-03 Thread Bjorn Helgaas
IORESOURCE_ROM_SHADOW means there is a copy of a device's option ROM in
RAM.  The existence of such a copy and its location are arch-specific.
Previously the IORESOURCE_ROM_SHADOW flag was set in arch code, but the
0xC-0xD location was hard-coded into the PCI core.

If we're using a shadow copy in RAM, disable the ROM BAR and release the
address space it was consuming.  Move the location information from the PCI
core to the arch code that sets IORESOURCE_ROM_SHADOW.  Save the location
of the RAM copy in the struct resource for PCI_ROM_RESOURCE.

After this change, pci_map_rom() will call pci_assign_resource() and
pci_enable_rom() for these IORESOURCE_ROM_SHADOW resources, which we did
not do before.  This is safe because:

  - pci_assign_resource() will do nothing because the resource is marked
IORESOURCE_PCI_FIXED, which means we can't move it, and

  - pci_enable_rom() will not turn on the ROM BAR's enable bit because the
resource is marked IORESOURCE_ROM_SHADOW, which means it is in RAM
rather than in PCI memory space.

Storing the location in the struct resource means "lspci" will show the
shadow location, not the value from the ROM BAR.

Signed-off-by: Bjorn Helgaas 
---
 arch/ia64/pci/fixup.c  |   22 --
 arch/x86/pci/fixup.c   |   22 --
 drivers/pci/rom.c  |   11 ---
 include/linux/ioport.h |2 +-
 4 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/arch/ia64/pci/fixup.c b/arch/ia64/pci/fixup.c
index fd9ceff..41caa99 100644
--- a/arch/ia64/pci/fixup.c
+++ b/arch/ia64/pci/fixup.c
@@ -17,14 +17,14 @@
  *
  * The standard boot ROM sequence for an x86 machine uses the BIOS
  * to select an initial video card for boot display. This boot video
- * card will have it's BIOS copied to C in system RAM.
+ * card will have its BIOS copied to 0xC in system RAM.
  * IORESOURCE_ROM_SHADOW is used to associate the boot video
  * card with this copy. On laptops this copy has to be used since
  * the main ROM may be compressed or combined with another image.
  * See pci_map_rom() for use of this flag. Before marking the device
  * with IORESOURCE_ROM_SHADOW check if a vga_default_device is already set
- * by either arch cde or vga-arbitration, if so only apply the fixup to this
- * already determined primary video card.
+ * by either arch code or vga-arbitration; if so only apply the fixup to this
+ * already-determined primary video card.
  */
 
 static void pci_fixup_video(struct pci_dev *pdev)
@@ -32,6 +32,7 @@ static void pci_fixup_video(struct pci_dev *pdev)
struct pci_dev *bridge;
struct pci_bus *bus;
u16 config;
+   struct resource *res;
 
if ((strcmp(ia64_platform_name, "dig") != 0)
&& (strcmp(ia64_platform_name, "hpzx1")  != 0))
@@ -61,9 +62,18 @@ static void pci_fixup_video(struct pci_dev *pdev)
if (!vga_default_device() || pdev == vga_default_device()) {
pci_read_config_word(pdev, PCI_COMMAND, &config);
if (config & (PCI_COMMAND_IO | PCI_COMMAND_MEMORY)) {
-   pdev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_SHADOW |
-   IORESOURCE_PCI_FIXED;
-   dev_printk(KERN_DEBUG, &pdev->dev, "Video device with 
shadowed ROM\n");
+   res = &pdev->resource[PCI_ROM_RESOURCE];
+
+   pci_disable_rom(pdev);
+   if (res->parent)
+   release_resource(res);
+
+   res->start = 0xC;
+   res->end = res->start + 0x2 - 1;
+   res->flags = IORESOURCE_MEM | IORESOURCE_ROM_SHADOW |
+IORESOURCE_PCI_FIXED;
+   dev_info(&pdev->dev, "Video device with shadowed ROM at 
%pR\n",
+res);
}
}
 }
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index dac027c..b7de192 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -297,14 +297,14 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL,  
PCI_DEVICE_ID_INTEL_MCH_PC1,pcie_r
  *
  * The standard boot ROM sequence for an x86 machine uses the BIOS
  * to select an initial video card for boot display. This boot video
- * card will have it's BIOS copied to C in system RAM.
+ * card will have its BIOS copied to 0xC in system RAM.
  * IORESOURCE_ROM_SHADOW is used to associate the boot video
  * card with this copy. On laptops this copy has to be used since
  * the main ROM may be compressed or combined with another image.
  * See pci_map_rom() for use of this flag. Before marking the device
  * with IORESOURCE_ROM_SHADOW check if a vga_default_device is already set
- * by either arch cde or vga-arbitration, if so only ap

[Intel-gfx] [PATCH v1 06/12] ia64/PCI: Use temporary struct resource * to avoid repetition

2016-03-03 Thread Bjorn Helgaas
Use a temporary struct resource pointer to avoid needless repetition of
"dev->resource[idx]".  No functional change intended.

Signed-off-by: Bjorn Helgaas 
---
 arch/ia64/sn/kernel/io_acpi_init.c |   10 +
 arch/ia64/sn/kernel/io_init.c  |   39 
 2 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/arch/ia64/sn/kernel/io_acpi_init.c 
b/arch/ia64/sn/kernel/io_acpi_init.c
index 0640739..815c291 100644
--- a/arch/ia64/sn/kernel/io_acpi_init.c
+++ b/arch/ia64/sn/kernel/io_acpi_init.c
@@ -429,6 +429,7 @@ sn_acpi_slot_fixup(struct pci_dev *dev)
void __iomem *addr;
struct pcidev_info *pcidev_info = NULL;
struct sn_irq_info *sn_irq_info = NULL;
+   struct resource *res;
size_t image_size, size;
 
if (sn_acpi_get_pcidev_info(dev, &pcidev_info, &sn_irq_info)) {
@@ -446,14 +447,13 @@ sn_acpi_slot_fixup(struct pci_dev *dev)
addr = 
ioremap(pcidev_info->pdi_pio_mapped_addr[PCI_ROM_RESOURCE],
   size);
image_size = pci_get_rom_size(dev, addr, size);
-   dev->resource[PCI_ROM_RESOURCE].start = (unsigned long) addr;
-   dev->resource[PCI_ROM_RESOURCE].end =
-   (unsigned long) addr + image_size - 1;
-   dev->resource[PCI_ROM_RESOURCE].flags |= 
IORESOURCE_ROM_BIOS_COPY;
+   res = &dev->resource[PCI_ROM_RESOURCE];
+   res->start = (unsigned long) addr;
+   res->end = (unsigned long) addr + image_size - 1;
+   res->flags |= IORESOURCE_ROM_BIOS_COPY;
}
sn_pci_fixup_slot(dev, pcidev_info, sn_irq_info);
 }
-
 EXPORT_SYMBOL(sn_acpi_slot_fixup);
 
 
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 1be65eb..40c0263 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -150,7 +150,8 @@ void
 sn_io_slot_fixup(struct pci_dev *dev)
 {
int idx;
-   unsigned long addr, end, size, start;
+   struct resource *res;
+   unsigned long addr, size;
struct pcidev_info *pcidev_info;
struct sn_irq_info *sn_irq_info;
int status;
@@ -175,33 +176,31 @@ sn_io_slot_fixup(struct pci_dev *dev)
 
/* Copy over PIO Mapped Addresses */
for (idx = 0; idx <= PCI_ROM_RESOURCE; idx++) {
-
-   if (!pcidev_info->pdi_pio_mapped_addr[idx]) {
+   if (!pcidev_info->pdi_pio_mapped_addr[idx])
continue;
-   }
 
-   start = dev->resource[idx].start;
-   end = dev->resource[idx].end;
-   size = end - start;
-   if (size == 0) {
+   res = &dev->resource[idx];
+
+   size = res->end - res->start;
+   if (size == 0)
continue;
-   }
+
addr = pcidev_info->pdi_pio_mapped_addr[idx];
addr = ((addr << 4) >> 4) | __IA64_UNCACHED_OFFSET;
-   dev->resource[idx].start = addr;
-   dev->resource[idx].end = addr + size;
+   res->start = addr;
+   res->end = addr + size;
 
/*
 * if it's already in the device structure, remove it before
 * inserting
 */
-   if (dev->resource[idx].parent && 
dev->resource[idx].parent->child)
-   release_resource(&dev->resource[idx]);
+   if (res->parent && res->parent->child)
+   release_resource(res);
 
-   if (dev->resource[idx].flags & IORESOURCE_IO)
-   insert_resource(&ioport_resource, &dev->resource[idx]);
+   if (res->flags & IORESOURCE_IO)
+   insert_resource(&ioport_resource, res);
else
-   insert_resource(&iomem_resource, &dev->resource[idx]);
+   insert_resource(&iomem_resource, res);
/*
 * If ROM, set the actual ROM image size, and mark as
 * shadowed in PROM.
@@ -213,17 +212,13 @@ sn_io_slot_fixup(struct pci_dev *dev)
rom = ioremap(pci_resource_start(dev, PCI_ROM_RESOURCE),
  size + 1);
image_size = pci_get_rom_size(dev, rom, size + 1);
-   dev->resource[PCI_ROM_RESOURCE].end =
-   dev->resource[PCI_ROM_RESOURCE].start +
-   image_size - 1;
-   dev->resource[PCI_ROM_RESOURCE].flags |=
-IORESOURCE_ROM_BIOS_COPY;
+   res->end = res->sta

[Intel-gfx] [PATCH v1 08/12] ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource

2016-03-03 Thread Bjorn Helgaas
A struct resource contains CPU physical addresses, not virtual addresses.
But sn_acpi_slot_fixup() and sn_io_slot_fixup() stored the virtual address
of a shadow ROM copy in the resource.  To compensate, pci_map_rom() had a
special case that returned the resource address directly rather than
calling ioremap() on it.

When we're using a shadow copy in RAM or PROM, disable the ROM BAR and
release the address space it was consuming.

Store the CPU physical (not virtual) address in the shadow ROM resource,
and mark the resource as IORESOURCE_ROM_SHADOW so we use the normal
pci_map_rom() path that ioremaps the copy.

Signed-off-by: Bjorn Helgaas 
---
 arch/ia64/sn/kernel/io_acpi_init.c |   18 +++---
 arch/ia64/sn/kernel/io_init.c  |   17 +
 2 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/sn/kernel/io_acpi_init.c 
b/arch/ia64/sn/kernel/io_acpi_init.c
index 815c291..231234c 100644
--- a/arch/ia64/sn/kernel/io_acpi_init.c
+++ b/arch/ia64/sn/kernel/io_acpi_init.c
@@ -430,7 +430,7 @@ sn_acpi_slot_fixup(struct pci_dev *dev)
struct pcidev_info *pcidev_info = NULL;
struct sn_irq_info *sn_irq_info = NULL;
struct resource *res;
-   size_t image_size, size;
+   size_t size;
 
if (sn_acpi_get_pcidev_info(dev, &pcidev_info, &sn_irq_info)) {
panic("%s:  Failure obtaining pcidev_info for %s\n",
@@ -444,13 +444,17 @@ sn_acpi_slot_fixup(struct pci_dev *dev)
 * of the shadowed copy, and the actual length of the ROM image.
 */
size = pci_resource_len(dev, PCI_ROM_RESOURCE);
-   addr = 
ioremap(pcidev_info->pdi_pio_mapped_addr[PCI_ROM_RESOURCE],
-  size);
-   image_size = pci_get_rom_size(dev, addr, size);
+
res = &dev->resource[PCI_ROM_RESOURCE];
-   res->start = (unsigned long) addr;
-   res->end = (unsigned long) addr + image_size - 1;
-   res->flags |= IORESOURCE_ROM_BIOS_COPY;
+
+   pci_disable_rom(dev);
+   if (res->parent)
+   release_resource(res);
+
+   res->start = pcidev_info->pdi_pio_mapped_addr[PCI_ROM_RESOURCE];
+   res->end = res->start + size - 1;
+   res->flags = IORESOURCE_MEM | IORESOURCE_ROM_SHADOW |
+IORESOURCE_PCI_FIXED;
}
sn_pci_fixup_slot(dev, pcidev_info, sn_irq_info);
 }
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 0227e20..c15a41e 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -185,8 +185,7 @@ sn_io_slot_fixup(struct pci_dev *dev)
if (size == 0)
continue;
 
-   res->start = ioremap(pcidev_info->pdi_pio_mapped_addr[idx],
-size + 1);
+   res->start = pcidev_info->pdi_pio_mapped_addr[idx];
res->end = addr + size;
 
/*
@@ -201,18 +200,12 @@ sn_io_slot_fixup(struct pci_dev *dev)
else
insert_resource(&iomem_resource, res);
/*
-* If ROM, set the actual ROM image size, and mark as
-* shadowed in PROM.
+* If ROM, mark as shadowed in PROM.
 */
if (idx == PCI_ROM_RESOURCE) {
-   size_t image_size;
-   void __iomem *rom;
-
-   rom = ioremap(pci_resource_start(dev, PCI_ROM_RESOURCE),
- size + 1);
-   image_size = pci_get_rom_size(dev, rom, size + 1);
-   res->end = res->start + image_size - 1;
-   res->flags |= IORESOURCE_ROM_BIOS_COPY;
+   pci_disable_rom(dev);
+   res->flags = IORESOURCE_MEM | IORESOURCE_ROM_SHADOW |
+IORESOURCE_PCI_FIXED;
}
}
 

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 05/12] PCI: Clean up pci_map_rom() whitespace

2016-03-03 Thread Bjorn Helgaas
Remove unnecessary indentation in pci_map_rom().  This is logically part of
the previous patch; I split it out to make the critical changes in that
patch more obvious.  No functional change intended.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/rom.c |   37 ++---
 1 file changed, 18 insertions(+), 19 deletions(-)

diff --git a/drivers/pci/rom.c b/drivers/pci/rom.c
index 80e82b1..2a07f34 100644
--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -128,25 +128,24 @@ void __iomem *pci_map_rom(struct pci_dev *pdev, size_t 
*size)
loff_t start;
void __iomem *rom;
 
-   if (res->flags &
-   (IORESOURCE_ROM_COPY | IORESOURCE_ROM_BIOS_COPY)) {
-   *size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
-   return (void __iomem *)(unsigned long)
-   pci_resource_start(pdev, PCI_ROM_RESOURCE);
-   } else {
-   /* assign the ROM an address if it doesn't have one */
-   if (res->parent == NULL &&
-   pci_assign_resource(pdev, PCI_ROM_RESOURCE))
-   return NULL;
-   start = pci_resource_start(pdev, PCI_ROM_RESOURCE);
-   *size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
-   if (*size == 0)
-   return NULL;
-
-   /* Enable ROM space decodes */
-   if (pci_enable_rom(pdev))
-   return NULL;
-   }
+   if (res->flags & (IORESOURCE_ROM_COPY | IORESOURCE_ROM_BIOS_COPY)) {
+   *size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
+   return (void __iomem *)(unsigned long)
+   pci_resource_start(pdev, PCI_ROM_RESOURCE);
+   }
+
+   /* assign the ROM an address if it doesn't have one */
+   if (res->parent == NULL && pci_assign_resource(pdev, PCI_ROM_RESOURCE))
+   return NULL;
+
+   start = pci_resource_start(pdev, PCI_ROM_RESOURCE);
+   *size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
+   if (*size == 0)
+   return NULL;
+
+   /* Enable ROM space decodes */
+   if (pci_enable_rom(pdev))
+   return NULL;
 
rom = ioremap(start, *size);
if (!rom) {

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 07/12] ia64/PCI: Use ioremap() instead of open-coded equivalent

2016-03-03 Thread Bjorn Helgaas
Depositing __IA64_UNCACHED_OFFSET in the upper address bits is essentially
equivalent to ioremap(): it converts a CPU physical address to a virtual
address using the ia64 uncacheable identity map.

Call ioremap() instead of doing the phys-to-virt conversion manually with
__IA64_UNCACHED_OFFSET.

Note that this makes it obvious that (a) we're putting a virtual address in
a struct resource, and (b) we're passing a virtual address to ioremap()
below in the PCI_ROM_RESOURCE case.  These are both pre-existing problems
that I'll resolve next.

Signed-off-by: Bjorn Helgaas 
---
 arch/ia64/sn/kernel/io_init.c |5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 40c0263..0227e20 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -185,9 +185,8 @@ sn_io_slot_fixup(struct pci_dev *dev)
if (size == 0)
continue;
 
-   addr = pcidev_info->pdi_pio_mapped_addr[idx];
-   addr = ((addr << 4) >> 4) | __IA64_UNCACHED_OFFSET;
-   res->start = addr;
+   res->start = ioremap(pcidev_info->pdi_pio_mapped_addr[idx],
+size + 1);
res->end = addr + size;
 
/*

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 11/12] PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY

2016-03-03 Thread Bjorn Helgaas
The IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY bits are unused.
Remove them and code that depends on them.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/remove.c   |1 -
 drivers/pci/rom.c  |   31 +--
 include/linux/ioport.h |2 --
 3 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 8a280e9..6b66329 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -7,7 +7,6 @@ static void pci_free_resources(struct pci_dev *dev)
 {
int i;
 
-   pci_cleanup_rom(dev);
for (i = 0; i < PCI_NUM_RESOURCES; i++) {
struct resource *res = dev->resource + i;
if (res->parent)
diff --git a/drivers/pci/rom.c b/drivers/pci/rom.c
index 2a07f34..06663d3 100644
--- a/drivers/pci/rom.c
+++ b/drivers/pci/rom.c
@@ -128,12 +128,6 @@ void __iomem *pci_map_rom(struct pci_dev *pdev, size_t 
*size)
loff_t start;
void __iomem *rom;
 
-   if (res->flags & (IORESOURCE_ROM_COPY | IORESOURCE_ROM_BIOS_COPY)) {
-   *size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
-   return (void __iomem *)(unsigned long)
-   pci_resource_start(pdev, PCI_ROM_RESOURCE);
-   }
-
/* assign the ROM an address if it doesn't have one */
if (res->parent == NULL && pci_assign_resource(pdev, PCI_ROM_RESOURCE))
return NULL;
@@ -150,8 +144,7 @@ void __iomem *pci_map_rom(struct pci_dev *pdev, size_t 
*size)
rom = ioremap(start, *size);
if (!rom) {
/* restore enable if ioremap fails */
-   if (!(res->flags & (IORESOURCE_ROM_ENABLE |
-   IORESOURCE_ROM_COPY)))
+   if (!(res->flags & IORESOURCE_ROM_ENABLE))
pci_disable_rom(pdev);
return NULL;
}
@@ -177,9 +170,6 @@ void pci_unmap_rom(struct pci_dev *pdev, void __iomem *rom)
 {
struct resource *res = &pdev->resource[PCI_ROM_RESOURCE];
 
-   if (res->flags & (IORESOURCE_ROM_COPY | IORESOURCE_ROM_BIOS_COPY))
-   return;
-
iounmap(rom);
 
/* Disable again before continuing */
@@ -189,25 +179,6 @@ void pci_unmap_rom(struct pci_dev *pdev, void __iomem *rom)
 EXPORT_SYMBOL(pci_unmap_rom);
 
 /**
- * pci_cleanup_rom - free the ROM copy created by pci_map_rom_copy
- * @pdev: pointer to pci device struct
- *
- * Free the copied ROM if we allocated one.
- */
-void pci_cleanup_rom(struct pci_dev *pdev)
-{
-   struct resource *res = &pdev->resource[PCI_ROM_RESOURCE];
-
-   if (res->flags & IORESOURCE_ROM_COPY) {
-   kfree((void *)(unsigned long)res->start);
-   res->flags |= IORESOURCE_UNSET;
-   res->flags &= ~IORESOURCE_ROM_COPY;
-   res->start = 0;
-   res->end = 0;
-   }
-}
-
-/**
  * pci_platform_rom - provides a pointer to any ROM image provided by the
  * platform
  * @pdev: pointer to pci device struct
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 2cf1667..29a6deb 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -99,8 +99,6 @@ struct resource {
 /* PCI ROM control bits (IORESOURCE_BITS) */
 #define IORESOURCE_ROM_ENABLE  (1<<0)  /* ROM is enabled, same as 
PCI_ROM_ADDRESS_ENABLE */
 #define IORESOURCE_ROM_SHADOW  (1<<1)  /* Use RAM image, not ROM BAR */
-#define IORESOURCE_ROM_COPY(1<<2)  /* ROM is alloc'd copy, 
resource field overlaid */
-#define IORESOURCE_ROM_BIOS_COPY   (1<<3)  /* ROM is BIOS copy, resource 
field overlaid */
 
 /* PCI control bits.  Shares IORESOURCE_BITS with above PCI ROM.  */
 #define IORESOURCE_PCI_FIXED   (1<<4)  /* Do not move resource */

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 09/12] MIPS: Loongson 3: Use temporary struct resource * to avoid repetition

2016-03-03 Thread Bjorn Helgaas
Use a temporary struct resource pointer to avoid needless repetition of
"pdev->resource[PCI_ROM_RESOURCE]".  No functional change intended.

Signed-off-by: Bjorn Helgaas 
---
 arch/mips/pci/fixup-loongson3.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/mips/pci/fixup-loongson3.c b/arch/mips/pci/fixup-loongson3.c
index d708ae4..b66b1eb 100644
--- a/arch/mips/pci/fixup-loongson3.c
+++ b/arch/mips/pci/fixup-loongson3.c
@@ -40,20 +40,20 @@ int __init pcibios_map_irq(const struct pci_dev *dev, u8 
slot, u8 pin)
 
 static void pci_fixup_radeon(struct pci_dev *pdev)
 {
-   if (pdev->resource[PCI_ROM_RESOURCE].start)
+   struct resource *res = &pdev->resource[PCI_ROM_RESOURCE];
+
+   if (res->start)
return;
 
if (!loongson_sysconf.vgabios_addr)
return;
 
-   pdev->resource[PCI_ROM_RESOURCE].start =
-   loongson_sysconf.vgabios_addr;
-   pdev->resource[PCI_ROM_RESOURCE].end   =
-   loongson_sysconf.vgabios_addr + 256*1024 - 1;
-   pdev->resource[PCI_ROM_RESOURCE].flags |= IORESOURCE_ROM_COPY;
+   res->start = loongson_sysconf.vgabios_addr;
+   res->end   = res->start + 256*1024 - 1;
+   res->flags |= IORESOURCE_ROM_COPY;
 
dev_info(&pdev->dev, "BAR %d: assigned %pR for Radeon ROM\n",
-   PCI_ROM_RESOURCE, &pdev->resource[PCI_ROM_RESOURCE]);
+PCI_ROM_RESOURCE, res);
 }
 
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_ATI, PCI_ANY_ID,

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 10/12] MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource

2016-03-03 Thread Bjorn Helgaas
Loongson 3 used the IORESOURCE_ROM_COPY flag for its ROM resource.  There
are two problems with this:

  - When IORESOURCE_ROM_COPY is set, pci_map_rom() assumes the resource
contains virtual addresses, so it doesn't ioremap the resource.  This
implies loongson_sysconf.vgabios_addr is a virtual address.  That's a
problem because resources should contain CPU *physical* addresses not
virtual addresses.

  - When IORESOURCE_ROM_COPY is set, pci_cleanup_rom() calls kfree() on the
resource.  We did not kmalloc() the loongson_sysconf.vgabios_addr area,
so it is incorrect to kfree() it.

If we're using a shadow copy in RAM for the Loongson 3 VGA BIOS area,
disable the ROM BAR and release the address space it was consuming.

Use IORESOURCE_ROM_SHADOW instead of IORESOURCE_ROM_COPY.  This means the
struct resource contains CPU physical addresses, and pci_map_rom() will
ioremap() it as needed.

Signed-off-by: Bjorn Helgaas 
---
 arch/mips/pci/fixup-loongson3.c |9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/mips/pci/fixup-loongson3.c b/arch/mips/pci/fixup-loongson3.c
index b66b1eb..c015ca1 100644
--- a/arch/mips/pci/fixup-loongson3.c
+++ b/arch/mips/pci/fixup-loongson3.c
@@ -48,9 +48,14 @@ static void pci_fixup_radeon(struct pci_dev *pdev)
if (!loongson_sysconf.vgabios_addr)
return;
 
-   res->start = loongson_sysconf.vgabios_addr;
+   pci_disable_rom(pdev);
+   if (res->parent)
+   release_resource(res);
+
+   res->start = virt_to_phys(loongson_sysconf.vgabios_addr);
res->end   = res->start + 256*1024 - 1;
-   res->flags |= IORESOURCE_ROM_COPY;
+   res->flags = IORESOURCE_MEM | IORESOURCE_ROM_SHADOW |
+IORESOURCE_PCI_FIXED;
 
dev_info(&pdev->dev, "BAR %d: assigned %pR for Radeon ROM\n",
 PCI_ROM_RESOURCE, res);

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v1 12/12] PCI: Simplify sysfs ROM cleanup

2016-03-03 Thread Bjorn Helgaas
The value of pdev->rom_attr is the definitive indicator of the fact that
we're created a sysfs attribute.  Check that rather than rom_size, which is
only used incidentally when deciding whether to create a sysfs attribute.

Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/pci-sysfs.c |   13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 95d9e7b..a4cfa52 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1409,7 +1409,7 @@ int __must_check pci_create_sysfs_dev_files(struct 
pci_dev *pdev)
return 0;
 
 err_rom_file:
-   if (rom_size) {
+   if (pdev->rom_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, pdev->rom_attr);
kfree(pdev->rom_attr);
pdev->rom_attr = NULL;
@@ -1447,8 +1447,6 @@ static void pci_remove_capabilities_sysfs(struct pci_dev 
*dev)
  */
 void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
 {
-   int rom_size = 0;
-
if (!sysfs_initialized)
return;
 
@@ -1461,18 +1459,13 @@ void pci_remove_sysfs_dev_files(struct pci_dev *pdev)
 
pci_remove_resource_files(pdev);
 
-   if (pci_resource_len(pdev, PCI_ROM_RESOURCE))
-   rom_size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
-   else if (pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW)
-   rom_size = 0x2;
-
-   if (rom_size && pdev->rom_attr) {
+   if (pdev->rom_attr) {
sysfs_remove_bin_file(&pdev->dev.kobj, pdev->rom_attr);
kfree(pdev->rom_attr);
+   pdev->rom_attr = NULL;
}
 
pci_remove_firmware_label_files(pdev);
-
 }
 
 static int __init pci_sysfs_init(void)

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-08 Thread Bjorn Helgaas
On Thu, Mar 03, 2016 at 10:53:50AM -0600, Bjorn Helgaas wrote:
> The purpose of this series is to:
> 
>   - Fix the "BAR 6: [??? 0x flags 0x2] has bogus alignment"
> messages reported by Linus [1], Andy [2], and others.
> 
>   - Move arch-specific shadow ROM location knowledge, e.g.,
> 0xC-0xD, from PCI core to arch code.
> 
>   - Fix the ia64 and MIPS Loongson 3 oddity of keeping virtual
> addresses in shadow ROM struct resource (resources should always
> contain *physical* addresses).
> 
>   - Remove now-unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
> flags.
> 
> This series is based on v4.5-rc1, and it's available on my
> pci/resource git branch (along with a couple tiny unrelated patches)
> at [3].
> 
> Bjorn
> 
> 
> [1] 
> http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
> [2] 
> http://lkml.kernel.org/r/calcetrv+rwnpzxyl8uvnsragu-6cczd_cc9pfjt2nctjplz...@mail.gmail.com
> [3] 
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/resource
> 
> 
> ---
> 
> Bjorn Helgaas (12):
>   PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
>   PCI: Don't assign or reassign immutable resources
>   PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
>   PCI: Set ROM shadow location in arch code, not in PCI core
>   PCI: Clean up pci_map_rom() whitespace
>   ia64/PCI: Use temporary struct resource * to avoid repetition
>   ia64/PCI: Use ioremap() instead of open-coded equivalent
>   ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM 
> resource
>   MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
>   MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow 
> ROM resource
>   PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>   PCI: Simplify sysfs ROM cleanup
> 
> 
>  arch/ia64/pci/fixup.c  |   21 +++--
>  arch/ia64/sn/kernel/io_acpi_init.c |   22 ++
>  arch/ia64/sn/kernel/io_init.c  |   51 --
>  arch/mips/pci/fixup-loongson3.c|   19 +---
>  arch/x86/pci/fixup.c   |   21 +++--
>  drivers/pci/pci-sysfs.c|   13 +-
>  drivers/pci/remove.c   |1 
>  drivers/pci/rom.c  |   83 
> +++-
>  drivers/pci/setup-res.c|6 +++
>  include/linux/ioport.h |4 --
>  10 files changed, 111 insertions(+), 130 deletions(-)

I applied this series to pci/resource for v4.6.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-11 Thread Bjorn Helgaas
On Fri, Mar 11, 2016 at 01:16:09PM -0800, Andy Lutomirski wrote:
> On Tue, Mar 8, 2016 at 9:45 AM, Bjorn Helgaas  wrote:
> > On Thu, Mar 03, 2016 at 10:53:50AM -0600, Bjorn Helgaas wrote:
> >> The purpose of this series is to:
> >>
> >>   - Fix the "BAR 6: [??? 0x flags 0x2] has bogus alignment"
> >> messages reported by Linus [1], Andy [2], and others.
> >>
> >>   - Move arch-specific shadow ROM location knowledge, e.g.,
> >> 0xC-0xD, from PCI core to arch code.
> >>
> >>   - Fix the ia64 and MIPS Loongson 3 oddity of keeping virtual
> >> addresses in shadow ROM struct resource (resources should always
> >> contain *physical* addresses).
> >>
> >>   - Remove now-unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
> >> flags.
> >>
> >> This series is based on v4.5-rc1, and it's available on my
> >> pci/resource git branch (along with a couple tiny unrelated patches)
> >> at [3].
> >>
> >> Bjorn
> >>
> >>
> >> [1] 
> >> http://lkml.kernel.org/r/ca+55afyvmftbb0oz_yx8+eqoejnzgtcsysj9quhepdz9bhd...@mail.gmail.com
> >> [2] 
> >> http://lkml.kernel.org/r/calcetrv+rwnpzxyl8uvnsragu-6cczd_cc9pfjt2nctjplz...@mail.gmail.com
> >> [3] 
> >> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=pci/resource
> >>
> >>
> >> ---
> >>
> >> Bjorn Helgaas (12):
> >>   PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
> >>   PCI: Don't assign or reassign immutable resources
> >>   PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
> >>   PCI: Set ROM shadow location in arch code, not in PCI core
> >>   PCI: Clean up pci_map_rom() whitespace
> >>   ia64/PCI: Use temporary struct resource * to avoid repetition
> >>   ia64/PCI: Use ioremap() instead of open-coded equivalent
> >>   ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM 
> >> resource
> >>   MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
> >>   MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in 
> >> shadow ROM resource
> >>   PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
> >>   PCI: Simplify sysfs ROM cleanup
> >>
> >>
> >>  arch/ia64/pci/fixup.c  |   21 +++--
> >>  arch/ia64/sn/kernel/io_acpi_init.c |   22 ++
> >>  arch/ia64/sn/kernel/io_init.c  |   51 --
> >>  arch/mips/pci/fixup-loongson3.c|   19 +---
> >>  arch/x86/pci/fixup.c   |   21 +++--
> >>  drivers/pci/pci-sysfs.c|   13 +-
> >>  drivers/pci/remove.c   |1
> >>  drivers/pci/rom.c  |   83 
> >> +++-
> >>  drivers/pci/setup-res.c|6 +++
> >>  include/linux/ioport.h |4 --
> >>  10 files changed, 111 insertions(+), 130 deletions(-)
> >
> > I applied this series to pci/resource for v4.6.
> 
> This gets rid of all the warnings for me until I try to read my i915
> device's rom using sysfs.  Then I get:
> 
> i915 :00:02.0: Invalid PCI ROM header signature: expecting 0xaa55,
> got 0x
> 
> So I suspect that something is still subtly wrong -- I'd imagine that
> this should either work or the intialization code should detect that
> there is no usable ROM and not expose it.
> 
> (To be clear, there's no regression here.)

Hmmm.  Thanks for testing this.  As you say, I think this is the way
it's always been, but it does seem non-intuitive.

That "Invalid PCI ROM header signature" warning comes from
pci_get_rom_size().  We don't call that at enumeration-time; we only
call it later when somebody tries to read the ROM via sysfs:

  pci_bus_add_device
pci_fixup_device(pci_fixup_final)
  pci_fixup_video # final fixup
res->flags = MEM | SHADOW | PCI_FIXED
pci_create_sysfs_dev_files
  if (SHADOW)
rom_size = 0x2# oops, I should have fixed this too
  if (rom_size)
attr->read = pci_read_rom
sysfs_create_bin_file # sysfs "rom" file

  pci_read_rom# sysfs "rom" read function
pci_map_rom
  ioremap
  pci_get_rom_size
dev_err("Invalid PCI ROM header signature")
memcpy_fromio
pci_unmap_rom

I think this is the

Re: [Intel-gfx] [PATCH v1 00/12] PCI: Rework shadow ROM handling

2016-03-12 Thread Bjorn Helgaas
On Thu, Mar 03, 2016 at 10:53:50AM -0600, Bjorn Helgaas wrote:
> The purpose of this series is to:
> ...

>   - Move arch-specific shadow ROM location knowledge, e.g.,
> 0xC-0xD, from PCI core to arch code.
> ...

> Bjorn Helgaas (12):
>   PCI: Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED
>   PCI: Don't assign or reassign immutable resources
>   PCI: Don't enable/disable ROM BAR if we're using a RAM shadow copy
>   PCI: Set ROM shadow location in arch code, not in PCI core

I propose to add the patch below at this point in the series.

>   PCI: Clean up pci_map_rom() whitespace
>   ia64/PCI: Use temporary struct resource * to avoid repetition
>   ia64/PCI: Use ioremap() instead of open-coded equivalent
>   ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM 
> resource
>   MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
>   MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow 
> ROM resource
>   PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
>   PCI: Simplify sysfs ROM cleanup

commit ac0c302a919ba7b68dbf274babdc08c83df6f532
Author: Bjorn Helgaas 
Date:   Sat Mar 12 05:48:08 2016 -0600

PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs

When pci_create_sysfs_dev_files() created the "rom" sysfs file, it set the
sysfs file size to the actual size of a ROM BAR, or if there was no ROM BAR
but the platform provided a shadow copy in RAM, to 0x2.  0x2 is an
arch-specific length that should not be baked into the PCI core.

Every place that sets IORESOURCE_ROM_SHADOW also sets the size of the
    PCI_ROM_RESOURCE, so use the resource length always.

Signed-off-by: Bjorn Helgaas 

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 95d9e7b..51d4dad 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1356,7 +1356,7 @@ error:
 int __must_check pci_create_sysfs_dev_files(struct pci_dev *pdev)
 {
int retval;
-   int rom_size = 0;
+   int rom_size;
struct bin_attribute *attr;
 
if (!sysfs_initialized)
@@ -1373,12 +1373,8 @@ int __must_check pci_create_sysfs_dev_files(struct 
pci_dev *pdev)
if (retval)
goto err_config_file;
 
-   if (pci_resource_len(pdev, PCI_ROM_RESOURCE))
-   rom_size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
-   else if (pdev->resource[PCI_ROM_RESOURCE].flags & IORESOURCE_ROM_SHADOW)
-   rom_size = 0x2;
-
/* If the device has a ROM, try to expose it in sysfs. */
+   rom_size = pci_resource_len(pdev, PCI_ROM_RESOURCE);
if (rom_size) {
attr = kzalloc(sizeof(*attr), GFP_ATOMIC);
if (!attr) {
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH V3 09/29] drm/i915: deprecate pci_get_bus_and_slot()

2018-02-16 Thread Bjorn Helgaas
On Mon, Nov 27, 2017 at 11:57:46AM -0500, Sinan Kaya wrote:
> pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
> where a PCI device is present. This restricts the device drivers to be
> reused for other domain numbers.
> 
> Getting ready to remove pci_get_bus_and_slot() function in favor of
> pci_get_domain_bus_and_slot().
> 
> Extract the domain number from drm_device and pass it into
> pci_get_domain_bus_and_slot() function.
> 
> Signed-off-by: Sinan Kaya 

I don't know what happened to this, and it didn't make it into
v4.16-rc1.  I applied it to pci/deprecate-get-bus-and-slot for v4.17
along with the patch that actually removes pci_get_bus_and_slot().

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 9f45cfe..5a8cb79 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -419,7 +419,10 @@ static int i915_getparam(struct drm_device *dev, void 
> *data,
>  
>  static int i915_get_bridge_dev(struct drm_i915_private *dev_priv)
>  {
> - dev_priv->bridge_dev = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
> + int domain = pci_domain_nr(dev_priv->drm.pdev->bus);
> +
> + dev_priv->bridge_dev =
> + pci_get_domain_bus_and_slot(domain, 0, PCI_DEVFN(0, 0));
>   if (!dev_priv->bridge_dev) {
>   DRM_ERROR("bridge device not found\n");
>   return -1;
> -- 
> 1.9.1
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 4/9] PCI/VGA: Improve the default VGA device selection

2023-07-25 Thread Bjorn Helgaas
On Mon, Jul 24, 2023 at 08:16:18PM +0800, suijingfeng wrote:
> On 2023/7/20 03:32, Bjorn Helgaas wrote:
> > > 2) It does not take the PCI Bar may get relocated into consideration.
> > > 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> > > 4) It is device-agnostic, thus it has to waste the effort to iterate all
> > > of the PCI Bar to find the VRAM aperture.
> > > 5) It has invented lots of methods to determine which one is the default
> > > boot device, but this is still a policy because it doesn't give the
> > > user a choice to override.
> > I don't think we need a list of*potential*  problems.  We need an
> > example of the specific problem this will solve, i.e., what currently
> > does not work?
> 
> 
> This version do allow the arbitration service works on non-x86 arch,
> which also allow me remove a arch-specific workaround.
> I will give more detail at the next version.

Yes.  This part I think we want.

> But I want to provide one more drawback of vgaarb here:
> 
> (6) It does not works for non VGA-compatible PCI(e) display controllers.
> 
> Because, currently, vgaarb deal with PCI VGA compatible devices only.
> 
> See another my patch set [1] for more elaborate discussion.
> 
> It also ignore PCI_CLASS_NOT_DEFINED_VGA as Maciej puts it[2].
> 
> While my approach do not required the display controller to be
> VGA-compatible to enjoy the arbitration service.

I think vgaarb is really only for dealing with the problem of the
legacy VGA address space routing.  For example, there may be VGA
devices that require the [pci 0xa-0xb] range but they don't
describe that via a BAR.  There may also be VGA option ROMs that
depend on that range so they can initialize the device.

The [pci 0xa-0xb] range can only be routed to one device at a
time, and vgaarb is what takes care of that by manipulating the VGA
Enable bits in bridges.

I don't think we should extend vgaarb to deal with non-VGA GPUs in
general, i.e., I don't think it should be concerned with devices and
option ROMs that do not require the [pci 0xa-0xb] range.

I think a strict reading of the PCI Class Code spec would be that only
devices with Programming Interface  b can depend on that
legacy range.

If that's what vgaarb currently enforces, great.  If it currently
deals with more than just  b devices, and there's some value
in restricting it to only  b, we could try that, but I would
suggest doing that in a tiny patch all by itself.  Then if we trip
over a problem, it's easy to bisect and revert it.

> [1] https://patchwork.freedesktop.org/patch/546690/?series=120548&rev=1
> 
> [2] https://lkml.org/lkml/2023/6/18/315
> 


Re: [Intel-gfx] [PATCH v3 0/9] PCI/VGA: Improve the default VGA device selection

2023-07-25 Thread Bjorn Helgaas
On Mon, Jul 24, 2023 at 08:47:48PM +0800, suijingfeng wrote:
> On 2023/7/20 03:32, Bjorn Helgaas wrote:
> > "drm/loongson: Add an implement for ..." also solves a problem, but it
> > lacks a commit log, so I don't know what the problem is.
> 
> I have already telling you one yeas ago.

The patch itself must be self-contained, including a commit log that
justifies the change.  You may have told me a year ago, but that
doesn't help somebody looking at these changes next year.

Bjorn


Re: [Intel-gfx] [PATCH v3 4/4] PCI/VGA: introduce is_boot_device function callback to vga_client_register

2023-06-09 Thread Bjorn Helgaas
On Fri, Jun 09, 2023 at 10:27:39AM +0800, Sui Jingfeng wrote:
> On 2023/6/9 03:19, Bjorn Helgaas wrote:
> > On Thu, Jun 08, 2023 at 07:43:22PM +0800, Sui Jingfeng wrote:
> > > From: Sui Jingfeng 
> > > 
> > > The vga_is_firmware_default() function is arch-dependent, which doesn't
> > > sound right. At least, it also works on the Mips and LoongArch platforms.
> > > Tested with the drm/amdgpu and drm/radeon drivers. However, it's difficult
> > > to enumerate all arch-driver combinations. I'm wrong if there is only one
> > > exception.
> > > 
> > > With the observation that device drivers typically have better knowledge
> > > about which PCI bar contains the firmware framebuffer, which could avoid
> > > the need to iterate all of the PCI BARs.
> > > 
> > > But as a PCI function at pci/vgaarb.c, vga_is_firmware_default() is
> > > probably not suitable to make such an optimization for a specific device.
> > > 
> > > There are PCI display controllers that don't have a dedicated VRAM bar,
> > > this function will lose its effectiveness in such a case. Luckily, the
> > > device driver can provide an accurate workaround.
> > > 
> > > Therefore, this patch introduces a callback that allows the device driver
> > > to tell the VGAARB if the device is the default boot device. This patch
> > > only intends to introduce the mechanism, while the implementation is left
> > > to the device driver authors. Also honor the comment: "Clients have two
> > > callback mechanisms they can use"
> > s/bar/BAR/ (several)
> > 
> > Nothing here uses the callback.  I don't want to merge this until we
> > have a user.
> 
> This is chicken and egg question.
> 
> If you could help get this merge first, I will show you the first user.
> 
> > I'm not sure why the device driver should know whether its device is
> > the default boot device.
> 
> It's not that the device driver should know,
> 
> but it's about that the device driver has the right to override.
> 
> Device driver may have better approach to identify the default boot
> device.

The way we usually handle this is to include the new callback in the
same series as the first user of it.  That has two benefits:
(1) everybody can review the whole picture and possibly suggest
different approaches, and (2) when we merge the infrastructure,
we also merge a user of it at the same time, so the whole thing can be
tested and we don't end up with unused code.

Bjorn


Re: [Intel-gfx] [PATCH V2] PCI: Move VMD ASPM/LTR fix to PCI quirk

2023-06-09 Thread Bjorn Helgaas
On Fri, Jun 09, 2023 at 03:09:26PM -0700, David E. Box wrote:
> Hi Bjorn,
> 
> On Thu, 2023-06-08 at 15:52 -0500, Bjorn Helgaas wrote:
> > On Tue, Apr 11, 2023 at 02:33:23PM -0700, David E. Box wrote:
> > > In commit f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and
> > > LTR") the VMD driver calls pci_enabled_link_state as a callback from
> > > pci_bus_walk. Both will acquire the pci_bus_sem lock leading to a lockdep
> > > warning. Instead of doing the pci_bus_walk, move the fix to quirks.c using
> > > DECLARE_PCI_FIXUP_FINAL.

> > > +#define VMD_DEVICE_LTR 0x1003  /* 3145728 ns */
> > 
> > It would be nice to know how this value was derived.  But I know we
> > had this hard-coded value before, so it's not new with this patch.
> 
> Do you mean to show the multiplier that determines that value or to
> say why this particular number was chosen? For the latter, it the
> largest that could be set (given the multipier options) that will
> allow the SoC to get to it's lowest power state. And it's the same
> so far on all the SoCs covered by the VMD driver.

Oh, sorry, I meant "why this number was chosen".  PCIe r6.0, sec
7.8.2, says this capability allows software to provide "platform
latency information," so I assume this is somehow dependent on
platform, but I really don't understand the details of how LTR works,
and we didn't have an explanation before, so this was just a "if you
happen to know, it might be useful here" comment.

> > > +static void quirk_intel_vmd(struct pci_dev *pdev)
> > 
> > I think this quirk could possibly stay in
> > drivers/pci/controller/vmd.c, couldn't it?  It has a lot of
> > VMD-specific knowledge that it would nice to contain in vmd.c.
> 
> I may have misunderstood your comment on V1 then. But you suggested
> that this would be typically done as PCI_FIXUP so that the PCI core
> could call it and we could avoid the locking issue that was seen
> while walking the bus in vmd.c.

Right, I think it makes sense to be a DECLARE_PCI_FIXUP_CLASS_FINAL(),
but I was thinking that it could be implemented in vmd.c and still be
called by the PCI core.

But now I'm uncertain since vmd.c can be compiled as a module, and I'm
not sure how that could work, since pci_fixup_device() calls things in
the __start_pci_fixups_final[] table, and I don't see how loading a
module would insert the fixup entry into that table.

So maybe it needs to be in quirks.c after all.

I think my only remaining questions here are about how to identify
devices below VMD and the order of enabling ASPM states vs setting
LTR.

Bjorn


Re: [Intel-gfx] [PATCH v7 6/8] PCI/VGA: Introduce is_boot_device function callback to vga_client_register

2023-06-29 Thread Bjorn Helgaas
On Thu, Jun 22, 2023 at 01:08:15PM +0800, Sui Jingfeng wrote:
> Hi,
> 
> 
> A nouveau developer(Lyude) from redhat send me a R-B,
> 
> Thanks for the developers of nouveau project.
> 
> 
> Please allow me add a link[1] here.
> 
> 
> [1] 
> https://lore.kernel.org/all/0afadc69f99a36bc9d03ecf54ff25859dbc10e28.ca...@redhat.com/

1) Thanks for this.  If you post another version of this series,
   please pick up Lyude's Reviewed-by and include it in the relevant
   patches (as long as you haven't made significant changes to the
   code Lyude reviewed).  Whoever applies this should automatically
   pick up Reviewed-by/Ack/etc that are replies to the version being
   applied, but they won't go through previous revisions to find them.

2) Please mention the commit to which the series applies.  I tried to
   apply this on v6.4-rc1, but it doesn't apply cleanly.

3) Thanks for including cover letters in your postings.  Please
   include a little changelog in the cover letter so we know what
   changed between v6 and v7, etc.

4) Right now we're in the middle of the v6.5 merge window, so new
   content, e.g., this series, is too late for v6.5.  Most
   maintainers, including me, wait to merge new content until the
   merge window closes and a new -rc1 is tagged.  This merge window
   should close on July 9, and people will start merging content for
   v6.6, typically based on v6.5-rc1.

Bjorn


Re: [Intel-gfx] [PATCH v7 6/8] PCI/VGA: Introduce is_boot_device function callback to vga_client_register

2023-06-30 Thread Bjorn Helgaas
On Fri, Jun 30, 2023 at 10:14:11AM +0800, suijingfeng wrote:
> On 2023/6/30 01:44, Limonciello, Mario wrote:
> > > On 2023/6/29 23:54, Bjorn Helgaas wrote:
> > > > On Thu, Jun 22, 2023 at 01:08:15PM +0800, Sui Jingfeng wrote:

> > > > 4) Right now we're in the middle of the v6.5 merge window, so new
> > > >  content, e.g., this series, is too late for v6.5.  Most
> > > >  maintainers, including me, wait to merge new content until the
> > > >  merge window closes and a new -rc1 is tagged.  This merge window
> > > >  should close on July 9, and people will start merging content for
> > > >  v6.6, typically based on v6.5-rc1.
> > > 
> > > Would you will merge all of the patches in this series (e.g. including
> > > the patch for drm/amdgpu(7/8) and drm/radeon(8/8)) ?
> > > 
> > > Or just part of them?

The bulk of this series is drivers/pci changes, so typically I would
merge all the patches after getting Acked-by tags from the other
subsystems (DRM and VFIO).

> Is it possible to merge the PCI/VGA part as fast as possible,
> especially the PATCH-0006 PCI/VGA: Introduce is_boot_device function
> callback to vga_client_register

We're in the middle of the v6.5 merge window, so it's too late to add
things to v6.5-rc1.  The most likely path for new material like this
would be to queue it for v6.6, which means I would merge it after
v6.5-rc1 is tagged (that tag will probably happen on July 9).

It would then be in -next until the v6.6 merge window opens (likely in
September), when it would be merged into Linus' tree.

If the series fixes a regression or other major defect, it's
*possible* to merge things earlier, so they appear in v6.5.  But this
series doesn't seem to fall in that category, so I think v6.6 is a
more realistic target.

Merging for v6.6 would include both the PCI parts and the DRM parts at
the same time, so hopefully that addresses your dependency concerns.

I suggest that you wait until v6.5-rc1, rebase your patches so they
apply cleanly on that tag, collect all the Reviewed-by and Acked-by
tags, include them in your commit logs, and then repost them.

Bjorn


Re: [Intel-gfx] [PATCH v3 4/9] PCI/VGA: Improve the default VGA device selection

2023-07-19 Thread Bjorn Helgaas
[+cc linux-pci (please cc in the future since the bulk of this patch
is in drivers/pci/)]

On Wed, Jul 12, 2023 at 12:43:05AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> Currently, the strategy of selecting the default boot on a multiple video
> card coexistence system is not perfect. Potential problems are:
> 
> 1) This function is a no-op on non-x86 architectures.

Which function in particular is a no-op for non-x86?

> 2) It does not take the PCI Bar may get relocated into consideration.
> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>of the PCI Bar to find the VRAM aperture.
> 5) It has invented lots of methods to determine which one is the default
>boot device, but this is still a policy because it doesn't give the
>user a choice to override.

I don't think we need a list of *potential* problems.  We need an
example of the specific problem this will solve, i.e., what currently
does not work?

The drm/ast and maybe drm/loongson patches are the only ones that use
the new callback, so I assume there are real problems with those
drivers.

CONFIG_DRM_AST is a tristate.  We're talking about identifying the
boot-time console device.  So if CONFIG_DRM_AST=m, I guess we don't
get the benefit of the new callback unless the module gets loaded?

> Also honor the comment: "Clients have *TWO* callback mechanisms they
> can use"

This refers to the existing vga_client_register() function comment:

   * vga_client_register - register or unregister a VGA arbitration client
   * @pdev: pci device of the VGA client
   * @set_decode: vga decode change callback
   *
   * Clients have two callback mechanisms they can use.
   *
   * @set_decode callback: If a client can disable its GPU VGA resource, it
   * will get a callback from this to set the encode/decode state.

and the fact that struct vga_device currently only contains *one*
callback function pointer:

  unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);

Adding the .is_primary_gpu() callback does mean there will now be two
callbacks, as the comment says, but I think it's just confusing to
mention this in the commit log, so I would just remove it.

> @@ -1509,13 +1543,24 @@ static int pci_notify(struct notifier_block *nb, 
> unsigned long action,
>* cases of hotplugable vga cards.
>*/
>  
> - if (action == BUS_NOTIFY_ADD_DEVICE)
> + switch (action) {
> + case BUS_NOTIFY_ADD_DEVICE:
>   notify = vga_arbiter_add_pci_device(pdev);
> - else if (action == BUS_NOTIFY_DEL_DEVICE)
> + if (notify)
> + vga_arbiter_notify_clients();
> + break;
> + case BUS_NOTIFY_DEL_DEVICE:
>   notify = vga_arbiter_del_pci_device(pdev);
> + if (notify)
> + vga_arbiter_notify_clients();
> + break;
> + case BUS_NOTIFY_BOUND_DRIVER:
> + vga_arbiter_do_arbitration(pdev);
> + break;
> + default:
> + break;
> + }

Changing from if/else to switch makes the patch bigger than necessary
for no real benefit and obscures what is really changing.

Bjorn


Re: [Intel-gfx] [PATCH v3 0/9] PCI/VGA: Improve the default VGA device selection

2023-07-19 Thread Bjorn Helgaas
[+cc linux-pci]

On Wed, Jul 12, 2023 at 12:43:01AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> Currently, the default VGA device selection is not perfect. Potential
> problems are:
> 
> 1) This function is a no-op on non-x86 architectures.
> 2) It does not take the PCI Bar may get relocated into consideration.
> 3) It is not effective for the PCI device without a dedicated VRAM Bar.
> 4) It is device-agnostic, thus it has to waste the effort to iterate all
>of the PCI Bar to find the VRAM aperture.
> 5) It has invented lots of methods to determine which one is the default
>boot device on a multiple video card coexistence system. But this is
>still a policy because it doesn't give the user a choice to override.
> 
> With the observation that device drivers or video aperture helpers may
> have better knowledge about which PCI bar contains the firmware FB,
> 
> This patch tries to solve the above problems by introducing a function
> callback to the vga_client_register() function interface. DRM device
> drivers for the PCI device need to register the is_boot_device() function
> callback during the driver loading time. Once the driver binds the device
> successfully, VRAARB will call back to the driver. This gives the device
> drivers a chance to provide accurate boot device identification. Which in
> turn unlock the abitration service to non-x86 architectures. A device
> driver can also pass a NULL pointer to keep the original behavior.

I skimmed all these patches, but the only one that seems to mention an
actual problem being solved, i.e., something that doesn't work for an
end user, is "drm/ast: Register as a vga client ..."  Maybe
"drm/loongson: Add an implement for ..." also solves a problem, but it
lacks a commit log, so I don't know what the problem is.

In the future, can you please cc: linux-pci with the entire series?
In this posting, only the patches that touched drivers/pci/vgaarb.c
went to linux-pci, but those aren't enough to make sense of the series
as a whole.

>   video/aperture: Add a helper to detect if an aperture contains
> firmware FB
>   video/aperture: Add a helper for determining if an unmoved aperture
> contain FB
>   PCI/VGA: Switch to aperture_contain_firmware_fb_nonreloc()

Since this subject includes the function name (which is nice!), it
would also be helpful if the "Add a helper ..." subject included the
same function name.

>   PCI/VGA: Improve the default VGA device selection

If you can make this subject any more specific, that would be useful.
There's more to say about that patch, so I'll respond there.

>   drm/amdgpu: Implement the is_primary_gpu callback of
> vga_client_register()
>   drm/radeon: Add an implement for the is_primary_gpu function callback
>   drm/i915: Add an implement for the is_primary_gpu hook
>   drm/ast: Register as a vga client to vgaarb by calling
> vga_client_register()
>   drm/loongson: Add an implement for the is_primary_gpu function
> callback

There's unnecessary variation in the subject lines (and the commit
logs) of these patches.  If they all do the same thing but in
different drivers, it's useful if the patches all *look* the same.

You might be able to write these subjects as
"Implement .is_primary_gpu() callback" for brevity.

Bjorn


Re: [Intel-gfx] [PATCH v3 3/9] PCI/VGA: Switch to aperture_contain_firmware_fb_nonreloc()

2023-07-19 Thread Bjorn Helgaas
[+cc linux-pci; I don't apply or ack PCI patches unless they appear there]

On Wed, Jul 12, 2023 at 12:43:04AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> The observation behind this is that we should avoid accessing the global
> screen_info directly. Call the aperture_contain_firmware_fb_nonreloc()
> function to implement the detection of whether an aperture contains the
> firmware FB.

Because it's better to access the global screen_info from
aperture_contain_firmware_fb_nonreloc()?  The reasoning here is not
super clear to me.

> This patch helps to decouple the determination from the implementation.
> Or, in other words, we intend to make the determination opaque to the
> caller. The determination may choose to be arch-dependent or
> arch-independent. But vgaarb, as a consumer of the determination,
> shouldn't care how the does determination is implemented.

"how the determination ..."  (drop the "does")

Are you saying that aperture_contain_firmware_fb_nonreloc() might be
arch-dependent?  Are there multiple callers?  Or does this just move
code from one place to a more appropriate place?

> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/pci/vgaarb.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index bf96e085751d..953daf731b2c 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -14,6 +14,7 @@
>  #define vgaarb_info(dev, fmt, arg...)dev_info(dev, "vgaarb: " fmt, 
> ##arg)
>  #define vgaarb_err(dev, fmt, arg...) dev_err(dev, "vgaarb: " fmt, ##arg)
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -26,7 +27,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -558,20 +558,11 @@ void vga_put(struct pci_dev *pdev, unsigned int rsrc)
>  }
>  EXPORT_SYMBOL(vga_put);
>  
> +/* Select the device owning the boot framebuffer if there is one */
>  static bool vga_is_firmware_default(struct pci_dev *pdev)
>  {
>  #if defined(CONFIG_X86) || defined(CONFIG_IA64)
> - u64 base = screen_info.lfb_base;
> - u64 size = screen_info.lfb_size;
>   struct resource *r;
> - u64 limit;
> -
> - /* Select the device owning the boot framebuffer if there is one */
> -
> - if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
> - base |= (u64)screen_info.ext_lfb_base << 32;
> -
> - limit = base + size;
>  
>   /* Does firmware framebuffer belong to us? */
>   pci_dev_for_each_resource(pdev, r) {
> @@ -581,10 +572,8 @@ static bool vga_is_firmware_default(struct pci_dev *pdev)
>   if (!r->start || !r->end)
>   continue;
>  
> - if (base < r->start || limit >= r->end)
> - continue;
> -
> - return true;
> + if (aperture_contain_firmware_fb_nonreloc(r->start, r->end))
> + return true;
>   }
>  #endif
>   return false;
> -- 
> 2.25.1
> 


Re: [Intel-gfx] [PATCH v3 1/9] video/aperture: Add a helper to detect if an aperture contains firmware FB

2023-07-19 Thread Bjorn Helgaas
On Wed, Jul 12, 2023 at 12:43:02AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> This patch adds the aperture_contain_firmware_fb() function to do the
> determination. Unfortunately, due to the fact that the apertures list
> will be freed dynamically, the location and size information of the
> firmware FB will be lost after dedicated drivers call
> aperture_remove_conflicting_devices(),
> aperture_remove_conflicting_pci_devices() or
> aperture_remove_all_conflicting_devices() functions

> We solve this problem by introducing two static variables that record the
> firmware framebuffer's start addrness and end addrness. It assumes that the
> system has only one active firmware framebuffer driver at a time. We don't
> use the global structure screen_info here, because PCI resources may get
> reallocated (the VRAM BAR could be moved) during the kernel boot stage.

s/addrness/address/ (twice)


Re: [Intel-gfx] [PATCH v2 1/2] vgaarb: various coding style and comments fix

2023-06-06 Thread Bjorn Helgaas
Match the subject line style:

  $ git log --oneline drivers/pci/vgaarb.c
  f321c35feaee PCI/VGA: Replace full MIT license text with SPDX identifier
  d5109fe4d1ec PCI/VGA: Use unsigned format string to print lock counts
  4e6c91847a7f PCI/VGA: Log bridge control messages when adding devices
  dc593fd48abb PCI/VGA: Remove empty vga_arb_device_card_gone()
  ...

Subject line should be a summary of the commit log, not just "various
style fixes".  This one needs to say something about
vga_str_to_iostate().

On Mon, Jun 05, 2023 at 04:58:30AM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> To keep consistent with vga_iostate_to_str() function, the third argument
> of vga_str_to_iostate() function should be 'unsigned int *'.
> 
> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/pci/vgaarb.c   | 29 +++--
>  include/linux/vgaarb.h |  8 +++-
>  2 files changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 5a696078b382..e40e6e5e5f03 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -61,7 +61,6 @@ static bool vga_arbiter_used;
>  static DEFINE_SPINLOCK(vga_lock);
>  static DECLARE_WAIT_QUEUE_HEAD(vga_wait_queue);
>  
> -
>  static const char *vga_iostate_to_str(unsigned int iostate)
>  {
>   /* Ignore VGA_RSRC_IO and VGA_RSRC_MEM */
> @@ -77,10 +76,12 @@ static const char *vga_iostate_to_str(unsigned int 
> iostate)
>   return "none";
>  }
>  
> -static int vga_str_to_iostate(char *buf, int str_size, int *io_state)
> +static int vga_str_to_iostate(char *buf, int str_size, unsigned int 
> *io_state)
>  {
> - /* we could in theory hand out locks on IO and mem
> -  * separately to userspace but it can cause deadlocks */
> + /*
> +  * we could in theory hand out locks on IO and mem
> +  * separately to userspace but it can cause deadlocks
> +  */

Omit all the comment formatting changes.  They are distractions from the
vga_str_to_iostate() parameter change.

I think this patch should be the single line change to the
vga_str_to_iostate() prototype so it matches the callers.

If you want to do the other comment formatting changes, they're fine,
but they should be all together in a separate patch that clearly
doesn't change the generated code.

Bjorn


Re: [Intel-gfx] [PATCH v3 1/4] PCI/VGA: tidy up the code and comment format

2023-06-08 Thread Bjorn Helgaas
Capitalize subject to match ("Tidy ...")

On Thu, Jun 08, 2023 at 07:43:19PM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> This patch replaces the leading space with a tab and removes the double
> blank line, no functional change.

Can you move this to the end of the series?  The functional changes
are more likely to be backported, and I think the backport may be a
little easier without the cleanup in the middle.

>   /* we could in theory hand out locks on IO and mem
> -  * separately to userspace but it can cause deadlocks */
> +  * separately to userspace but it can cause deadlocks
> +  */

Since you're touching this anyway, can you update it to the
conventional multi-line comment style:

  /*
   * We could in theory ...
   */

And capitalize "We", add a period at end, and rewrap to fill 78
columns or so?  Same for other comments below.

> +++ b/include/linux/vgaarb.h
> @@ -23,9 +23,7 @@
>   * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>   * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>   * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> - * DEALINGS
> - * IN THE SOFTWARE.
> - *
> + * DEALINGS IN THE SOFTWARE.
>   */

Can you make a separate patch to replace this entire copyright notice
with the appropriate SPDX-License-Identifier header?
Documentation/process/license-rules.rst has details.

Bjorn


Re: [Intel-gfx] [PATCH v3 3/4] PCI/VGA: only deal with VGA class devices

2023-06-08 Thread Bjorn Helgaas
Start with verb and capitalize to match ("Deal only with ...")

On Thu, Jun 08, 2023 at 07:43:21PM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> vgaarb only deal with the VGA devcie(pdev->class == 0x0300), so replace the
> pci_get_subsys() function with pci_get_class(). Filter the non pci display
> device out. There no need to process the non display PCI device.

s/non pci/non-PCI/
s/non display/non-display/

This is fine, and I'll merge this, but someday I would like to get rid
of the bus_register_notifier() and call vga_arbiter_add_pci_device()
and vga_arbiter_del_pci_device() directly from the PCI core.

Or if you wanted to do that now, that would be even better :)

Bjorn

> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/pci/vgaarb.c | 22 --
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 7f0347f4f6fd..b0bf4952a95d 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -756,10 +756,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev 
> *pdev)
>   struct pci_dev *bridge;
>   u16 cmd;
>  
> - /* Only deal with VGA class devices */
> - if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> - return false;
> -
>   /* Allocate structure */
>   vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
>   if (vgadev == NULL) {
> @@ -1499,7 +1495,9 @@ static int pci_notify(struct notifier_block *nb, 
> unsigned long action,
>   struct pci_dev *pdev = to_pci_dev(dev);
>   bool notify = false;
>  
> - vgaarb_dbg(dev, "%s\n", __func__);
> + /* Only deal with VGA class devices */
> + if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> + return 0;
>  
>   /* For now we're only intereted in devices added and removed. I didn't
>* test this thing here, so someone needs to double check for the
> @@ -1509,6 +1507,8 @@ static int pci_notify(struct notifier_block *nb, 
> unsigned long action,
>   else if (action == BUS_NOTIFY_DEL_DEVICE)
>   notify = vga_arbiter_del_pci_device(pdev);
>  
> + vgaarb_dbg(dev, "%s: action = %lu\n", __func__, action);
> +
>   if (notify)
>   vga_arbiter_notify_clients();
>   return 0;
> @@ -1533,8 +1533,8 @@ static struct miscdevice vga_arb_device = {
>  
>  static int __init vga_arb_device_init(void)
>  {
> + struct pci_dev *pdev = NULL;
>   int rc;
> - struct pci_dev *pdev;
>  
>   rc = misc_register(&vga_arb_device);
>   if (rc < 0)
> @@ -1545,11 +1545,13 @@ static int __init vga_arb_device_init(void)
>   /* We add all PCI devices satisfying VGA class in the arbiter by
>* default
>*/
> - pdev = NULL;
> - while ((pdev =
> - pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> -PCI_ANY_ID, pdev)) != NULL)
> + while (1) {
> + pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
> + if (!pdev)
> + break;
> +
>   vga_arbiter_add_pci_device(pdev);
> + };
>  
>   pr_info("loaded\n");
>   return rc;
> -- 
> 2.25.1
> 


Re: [Intel-gfx] [PATCH v3 4/4] PCI/VGA: introduce is_boot_device function callback to vga_client_register

2023-06-08 Thread Bjorn Helgaas
On Thu, Jun 08, 2023 at 07:43:22PM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng 
> 
> The vga_is_firmware_default() function is arch-dependent, which doesn't
> sound right. At least, it also works on the Mips and LoongArch platforms.
> Tested with the drm/amdgpu and drm/radeon drivers. However, it's difficult
> to enumerate all arch-driver combinations. I'm wrong if there is only one
> exception.
> 
> With the observation that device drivers typically have better knowledge
> about which PCI bar contains the firmware framebuffer, which could avoid
> the need to iterate all of the PCI BARs.
> 
> But as a PCI function at pci/vgaarb.c, vga_is_firmware_default() is
> probably not suitable to make such an optimization for a specific device.
> 
> There are PCI display controllers that don't have a dedicated VRAM bar,
> this function will lose its effectiveness in such a case. Luckily, the
> device driver can provide an accurate workaround.
> 
> Therefore, this patch introduces a callback that allows the device driver
> to tell the VGAARB if the device is the default boot device. This patch
> only intends to introduce the mechanism, while the implementation is left
> to the device driver authors. Also honor the comment: "Clients have two
> callback mechanisms they can use"

s/bar/BAR/ (several)

Nothing here uses the callback.  I don't want to merge this until we
have a user.

I'm not sure why the device driver should know whether its device is
the default boot device.

> Signed-off-by: Sui Jingfeng 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  2 +-
>  drivers/gpu/drm/i915/display/intel_vga.c   |  3 +--
>  drivers/gpu/drm/nouveau/nouveau_vga.c  |  2 +-
>  drivers/gpu/drm/radeon/radeon_device.c |  2 +-
>  drivers/pci/vgaarb.c   | 22 ++
>  drivers/vfio/pci/vfio_pci_core.c   |  2 +-
>  include/linux/vgaarb.h |  8 +---
>  7 files changed, 28 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 5c7d40873ee2..7a096f2d5c16 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3960,7 +3960,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>   /* this will fail for cards that aren't VGA class devices, just
>* ignore it */
>   if ((adev->pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA)
> - vga_client_register(adev->pdev, amdgpu_device_vga_set_decode);
> + vga_client_register(adev->pdev, amdgpu_device_vga_set_decode, 
> NULL);
>  
>   px = amdgpu_device_supports_px(ddev);
>  
> diff --git a/drivers/gpu/drm/i915/display/intel_vga.c 
> b/drivers/gpu/drm/i915/display/intel_vga.c
> index 286a0bdd28c6..98d7d4dffe9f 100644
> --- a/drivers/gpu/drm/i915/display/intel_vga.c
> +++ b/drivers/gpu/drm/i915/display/intel_vga.c
> @@ -115,7 +115,6 @@ intel_vga_set_decode(struct pci_dev *pdev, bool 
> enable_decode)
>  
>  int intel_vga_register(struct drm_i915_private *i915)
>  {
> -
>   struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>   int ret;
>  
> @@ -127,7 +126,7 @@ int intel_vga_register(struct drm_i915_private *i915)
>* then we do not take part in VGA arbitration and the
>* vga_client_register() fails with -ENODEV.
>*/
> - ret = vga_client_register(pdev, intel_vga_set_decode);
> + ret = vga_client_register(pdev, intel_vga_set_decode, NULL);
>   if (ret && ret != -ENODEV)
>   return ret;
>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_vga.c 
> b/drivers/gpu/drm/nouveau/nouveau_vga.c
> index f8bf0ec26844..162b4f4676c7 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_vga.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_vga.c
> @@ -92,7 +92,7 @@ nouveau_vga_init(struct nouveau_drm *drm)
>   return;
>   pdev = to_pci_dev(dev->dev);
>  
> - vga_client_register(pdev, nouveau_vga_set_decode);
> + vga_client_register(pdev, nouveau_vga_set_decode, NULL);
>  
>   /* don't register Thunderbolt eGPU with vga_switcheroo */
>   if (pci_is_thunderbolt_attached(pdev))
> diff --git a/drivers/gpu/drm/radeon/radeon_device.c 
> b/drivers/gpu/drm/radeon/radeon_device.c
> index afbb3a80c0c6..71f2ff39d6a1 100644
> --- a/drivers/gpu/drm/radeon/radeon_device.c
> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> @@ -1425,7 +1425,7 @@ int radeon_device_init(struct radeon_device *rdev,
>   /* if we have > 1 VGA cards, then disable the radeon VGA resources */
>   /* this will fail for cards that aren't VGA class devices, just
>* ignore it */
> - vga_client_register(rdev->pdev, radeon_vga_set_decode);
> + vga_client_register(rdev->pdev, radeon_vga_set_decode, NULL);
>  
>   if (rdev->flags & RADEON_IS_PX)
>   runtime = true;
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index b0bf4952a95d..d3dab61e0ef2 100644
> --- a/drivers/pci/vgaarb.c
> 

Re: [Intel-gfx] [PATCH V2] PCI: Move VMD ASPM/LTR fix to PCI quirk

2023-06-08 Thread Bjorn Helgaas
On Tue, Apr 11, 2023 at 02:33:23PM -0700, David E. Box wrote:
> In commit f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and
> LTR") the VMD driver calls pci_enabled_link_state as a callback from
> pci_bus_walk. Both will acquire the pci_bus_sem lock leading to a lockdep
> warning. Instead of doing the pci_bus_walk, move the fix to quirks.c using
> DECLARE_PCI_FIXUP_FINAL.

s/pci_enabled_link_state/pci_enable_link_state/

Add "()" after pci_enable_link_state() and pci_bus_walk() to make it
obvious they're functions.

> ...
> +++ b/drivers/pci/quirks.c
> @@ -6023,3 +6023,75 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2d, 
> dpc_log_size);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a2f, dpc_log_size);
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
>  #endif
> +
> +#ifdef CONFIG_VMD
> +/*
> + * Enable ASPM on the PCIE root ports under VMD and set the default LTR of 
> the
> + * storage devices on platforms where these values are not configured by 
> BIOS.
> + * This is needed for laptops, which require these settings for proper power
> + * management of the SoC.

s/PCIE/PCIe/ to match spec usage.

> + */
> +#define VMD_DEVICE_LTR   0x1003  /* 3145728 ns */

It would be nice to know how this value was derived.  But I know we
had this hard-coded value before, so it's not new with this patch.

> +static void quirk_intel_vmd(struct pci_dev *pdev)

I think this quirk could possibly stay in
drivers/pci/controller/vmd.c, couldn't it?  It has a lot of
VMD-specific knowledge that it would nice to contain in vmd.c.

> +{
> + struct pci_dev *parent;
> + u16 ltr = VMD_DEVICE_LTR;

I don't think "ltr" is an improvement over using "VMD_DEVICE_LTR"
below.

> + u32 ltr_reg;
> + int pos;
> +
> + /* Check in VMD domain */
> + if (pci_domain_nr(pdev->bus) < 0x1)
> + return;

If in vmd.c, maybe could identify devices under a VMD by checking
pdev->bus->ops as vmd_acpi_find_companion() does?

> + /* Get Root Port */
> + parent = pci_upstream_bridge(pdev);
> + if (!parent || parent->vendor != PCI_VENDOR_ID_INTEL)
> + return;
> +
> + /* Get VMD Host Bridge */
> + parent = to_pci_dev(parent->dev.parent);
> + if (!parent)
> + return;
> +
> + /* Get RAID controller */
> + parent = to_pci_dev(parent->dev.parent);
> + if (!parent)
> + return;
> +
> + switch (parent->device) {
> + case 0x467f:
> + case 0x4c3d:
> + case 0xa77f:
> + case 0x7d0b:
> + case 0xad0b:
> + case 0x9a0b:
> + break;
> + default:
> + return;
> + }
> +
> + pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);

Seems like you would want to set LTR *before* enabling the link
states?

> + pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
> + if (!pos)
> + return;
> +
> + /* Skip if the max snoop LTR is non-zero, indicating BIOS has set it */
> + pci_read_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT,  + if (!!(ltr_reg & (PCI_LTR_VALUE_MASK | PCI_LTR_SCALE_MASK)))
> + return;
> +
> + /*
> +  * Set the LTR values to the maximum required by the platform to
> +  * allow the deepest power management savings. Write as a DWORD where
> +  * the lower word is the max snoop latency and the upper word is the
> +  * max non-snoop latency.

This comment suggests that the LTR value is platform-dependent, which
is what I would expect, but the code and the hard-coded VMD_DEVICE_LTR
value don't have any platform dependencies.  Again, nothing new in
*this* patch; that's true in the current tree, too.

> + ltr_reg = (ltr << 16) | ltr;
> + pci_write_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, ltr_reg);
> + pci_info(pdev, "LTR set by VMD PCI quick\n");

s/quick/quirk/

> +
> +}
> +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_ANY_ID, PCI_ANY_ID,
> +   PCI_CLASS_STORAGE_EXPRESS, 0, quirk_intel_vmd);
> +#endif
> -- 
> 2.34.1
> 


Re: [PATCH v3 1/2] pm: runtime: Simplify pm_runtime_get_if_active() usage

2024-01-22 Thread Bjorn Helgaas
On Mon, Jan 22, 2024 at 01:41:21PM +0200, Sakari Ailus wrote:
> There are two ways to opportunistically increment a device's runtime PM
> usage count, calling either pm_runtime_get_if_active() or
> pm_runtime_get_if_in_use(). The former has an argument to tell whether to
> ignore the usage count or not, and the latter simply calls the former with
> ign_usage_count set to false. The other users that want to ignore the
> usage_count will have to explitly set that argument to true which is a bit
> cumbersome.

s/explitly/explicitly/

> To make this function more practical to use, remove the ign_usage_count
> argument from the function. The main implementation is renamed as
> pm_runtime_get_conditional().
> 
> Signed-off-by: Sakari Ailus 
> Reviewed-by: Alex Elder  # drivers/net/ipa/ipa_smp2p.c
> Reviewed-by: Laurent Pinchart 
> Acked-by: Takashi Iwai  # sound/
> Reviewed-by: Jacek Lawrynowicz  # 
> drivers/accel/ivpu/
> Acked-by: Rodrigo Vivi  # drivers/gpu/drm/i915/
> Reviewed-by: Rodrigo Vivi 

Acked-by: Bjorn Helgaas  # drivers/pci/

> -EXPORT_SYMBOL_GPL(pm_runtime_get_if_active);
> +EXPORT_SYMBOL_GPL(pm_runtime_get_conditional);

If pm_runtime_get_conditional() is exported, shouldn't it also be
documented in Documentation/power/runtime_pm.rst?

But I'm dubious about exporting it because
__intel_runtime_pm_get_if_active() is the only caller, and you end up
with the same pattern there that we have before this series in the PM
core.  Why can't intel_runtime_pm.c be updated to use
pm_runtime_get_if_active() or pm_runtime_get_if_in_use() directly, and
make pm_runtime_get_conditional() static?

> +++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
> @@ -246,7 +246,7 @@ static intel_wakeref_t 
> __intel_runtime_pm_get_if_active(struct intel_runtime_pm
>* function, since the power state is undefined. This applies
>* atm to the late/early system suspend/resume handlers.
>*/
> - if (pm_runtime_get_if_active(rpm->kdev, ignore_usecount) <= 0)
> + if (pm_runtime_get_conditional(rpm->kdev, ignore_usecount) <= 0)
>   return 0;
>   }

Bjorn


Re: [PATCH v4 1/3] pm: runtime: Simplify pm_runtime_get_if_active() usage

2024-01-23 Thread Bjorn Helgaas
On Tue, Jan 23, 2024 at 11:56:42AM +0200, Sakari Ailus wrote:
> There are two ways to opportunistically increment a device's runtime PM
> usage count, calling either pm_runtime_get_if_active() or
> pm_runtime_get_if_in_use(). The former has an argument to tell whether to
> ignore the usage count or not, and the latter simply calls the former with
> ign_usage_count set to false. The other users that want to ignore the
> usage_count will have to explitly set that argument to true which is a bit
> cumbersome.
> 
> To make this function more practical to use, remove the ign_usage_count
> argument from the function. The main implementation is renamed as
> pm_runtime_get_conditional().
> 
> Signed-off-by: Sakari Ailus 
> Reviewed-by: Alex Elder  # drivers/net/ipa/ipa_smp2p.c
> Reviewed-by: Laurent Pinchart 
> Acked-by: Takashi Iwai  # sound/
> Reviewed-by: Jacek Lawrynowicz  # 
> drivers/accel/ivpu/
> Acked-by: Rodrigo Vivi  # drivers/gpu/drm/i915/
> Reviewed-by: Rodrigo Vivi 

Acked-by: Bjorn Helgaas  # drivers/pci/

- Previous PM history uses "PM: " in the subject lines (not "pm: ").

- I don't know whether it's feasible, but it would be nice if the
  intel_pm_runtime_pm.c rework could be done in one shot instead of
  being split between patches 1/3 and 2/3.

  Maybe it could be a preliminary patch that uses the existing
  if_active/if_in_use interfaces, followed by the trivial if_active
  updates in this patch.  I think that would make the history easier
  to read than having the transitory pm_runtime_get_conditional() in
  the middle.

- Similarly, it would be nice if pm_runtime_get_conditional() never
  had to be published in pm_runtime.h, instead of being temporarily
  added there by this patch and then immediately made private by 2/3.
  Maybe that's not practical, I dunno.

Bjorn


Re: [PATCH v4 1/3] pm: runtime: Simplify pm_runtime_get_if_active() usage

2024-01-23 Thread Bjorn Helgaas
On Tue, Jan 23, 2024 at 08:44:04PM +, Sakari Ailus wrote:
> On Tue, Jan 23, 2024 at 11:24:23AM -0600, Bjorn Helgaas wrote:
> ...

> > - I don't know whether it's feasible, but it would be nice if the
> >   intel_pm_runtime_pm.c rework could be done in one shot instead of
> >   being split between patches 1/3 and 2/3.
> > 
> >   Maybe it could be a preliminary patch that uses the existing
> >   if_active/if_in_use interfaces, followed by the trivial if_active
> >   updates in this patch.  I think that would make the history easier
> >   to read than having the transitory pm_runtime_get_conditional() in
> >   the middle.
> 
> I think I'd merge the two patches. The second patch is fairly small, after
> all, and both deal with largely the same code.

I'm not sure which two patches you mean, but the fact that two patches
deal with largely the same code is not necessarily an argument for
merging them.  From a reviewing perspective, it's nice if a patch like
1/3, where it's largely mechanical and easy to review, is separated
from patches that make more substantive changes.

That's why I think it'd be nice if the "interesting"
intel_pm_runtime_pm.c changes were all in the same patch, and ideally,
if that patch *only* touched intel_pm_runtime_pm.c.

Bjorn


Re: [Intel-gfx] [PATCH] PCI/ASPM: pci_enable_link_state: Add argument to acquire bus lock

2023-03-22 Thread Bjorn Helgaas
Hi David,

On Tue, Mar 21, 2023 at 04:38:49PM -0700, David E. Box wrote:
> The VMD driver calls pci_enabled_link_state as a callback from
> pci_bus_walk. Both will acquire the pci_bus_sem lock leading to a lockdep
> warning. Add an argument to pci_enable_link_state to set whether the lock
> should be acquired. In the VMD driver, set the argument to false since the
> lock will already be obtained by pci_bus_walk.
> 
> Reported-by: Ville Syrjälä 
> Fixes: de82f60f9c86 ("PCI/ASPM: Add pci_enable_link_state()")

This means "if your kernel includes de82f60f9c86, you probably want to
backport this fix to it."  But that's not the case here.  This patch
is not fixing an issue with de82f60f9c86, so I don't think there's a
reason to include a "Fixes" line.

This patch is adding functionality that is only needed by some other
patch, and it should be part of a series that also includes the patch
that uses it to make sure they go together.

Nit: I prefer to add "()" after function names in the commit log to
make it more obvious that they're functions and help distinguish them
from variables.

> Link: https://lore.kernel.org/linux-pci/zbjko%2fifuniws...@intel.com/
> Signed-off-by: David E. Box 
> ---
>  drivers/pci/controller/vmd.c | 2 +-
>  drivers/pci/pcie/aspm.c  | 9 ++---
>  include/linux/pci.h  | 5 +++--
>  3 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 990630ec57c6..45aa35744eae 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -737,7 +737,7 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void 
> *userdata)
>   if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
>   return 0;
>  
> - pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
> + pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL, false);
>  
>   pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
>   if (!pos)
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 66d7514ca111..5b5a600bb864 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1147,8 +1147,9 @@ EXPORT_SYMBOL(pci_disable_link_state);
>   *
>   * @pdev: PCI device
>   * @state: Mask of ASPM link states to enable
> + * @sem: Boolean to acquire/release pci_bus_sem
>   */
> -int pci_enable_link_state(struct pci_dev *pdev, int state)
> +int pci_enable_link_state(struct pci_dev *pdev, int state, bool sem)
>  {
>   struct pcie_link_state *link = pcie_aspm_get_link(pdev);
>  
> @@ -1165,7 +1166,8 @@ int pci_enable_link_state(struct pci_dev *pdev, int 
> state)
>   return -EPERM;
>   }
>  
> - down_read(&pci_bus_sem);
> + if (sem)
> + down_read(&pci_bus_sem);
>   mutex_lock(&aspm_lock);
>   link->aspm_default = 0;
>   if (state & PCIE_LINK_STATE_L0S)
> @@ -1186,7 +1188,8 @@ int pci_enable_link_state(struct pci_dev *pdev, int 
> state)
>   link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
>   pcie_set_clkpm(link, policy_to_clkpm_state(link));
>   mutex_unlock(&aspm_lock);
> - up_read(&pci_bus_sem);
> + if (sem)
> + up_read(&pci_bus_sem);
>  
>   return 0;
>  }
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fafd8020c6d7..a6f9f24b39fd 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1707,7 +1707,7 @@ extern bool pcie_ports_native;
>  #ifdef CONFIG_PCIEASPM
>  int pci_disable_link_state(struct pci_dev *pdev, int state);
>  int pci_disable_link_state_locked(struct pci_dev *pdev, int state);
> -int pci_enable_link_state(struct pci_dev *pdev, int state);
> +int pci_enable_link_state(struct pci_dev *pdev, int state, bool sem);
>  void pcie_no_aspm(void);
>  bool pcie_aspm_support_enabled(void);
>  bool pcie_aspm_enabled(struct pci_dev *pdev);
> @@ -1716,7 +1716,8 @@ static inline int pci_disable_link_state(struct pci_dev 
> *pdev, int state)
>  { return 0; }
>  static inline int pci_disable_link_state_locked(struct pci_dev *pdev, int 
> state)
>  { return 0; }
> -static inline int pci_enable_link_state(struct pci_dev *pdev, int state)
> +static inline int
> +pci_enable_link_state(struct pci_dev *pdev, int state, bool sem)
>  { return 0; }
>  static inline void pcie_no_aspm(void) { }
>  static inline bool pcie_aspm_support_enabled(void) { return false; }
> -- 
> 2.34.1
> 


Re: [Intel-gfx] [PATCH] PCI/ASPM: pci_enable_link_state: Add argument to acquire bus lock

2023-03-22 Thread Bjorn Helgaas
On Wed, Mar 22, 2023 at 03:45:01PM -0500, Bjorn Helgaas wrote:
> Hi David,
> 
> On Tue, Mar 21, 2023 at 04:38:49PM -0700, David E. Box wrote:
> > The VMD driver calls pci_enabled_link_state as a callback from
> > pci_bus_walk. Both will acquire the pci_bus_sem lock leading to a lockdep
> > warning. Add an argument to pci_enable_link_state to set whether the lock
> > should be acquired. In the VMD driver, set the argument to false since the
> > lock will already be obtained by pci_bus_walk.
> > 
> > Reported-by: Ville Syrjälä 
> > Fixes: de82f60f9c86 ("PCI/ASPM: Add pci_enable_link_state()")
> 
> This means "if your kernel includes de82f60f9c86, you probably want to
> backport this fix to it."  But that's not the case here.  This patch
> is not fixing an issue with de82f60f9c86, so I don't think there's a
> reason to include a "Fixes" line.

Oops, sorry, forgot to engage brain before hitting "send".

I think the "Fixes" line should reference f492edb40b54 ("PCI: vmd: Add
quirk to configure PCIe ASPM and LTR") instead, since that's where the
locking problem started.

> This patch is adding functionality that is only needed by some other
> patch, and it should be part of a series that also includes the patch
> that uses it to make sure they go together.

And I see that the use *is* included in this patch.  But I don't
really like this pattern:

  vmd_probe
vmd_enable_domain
  vmd->bus = pci_create_root_bus(...);
  pci_scan_child_bus(vmd->bus);
  pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);

because pci_walk_bus() makes locking complicated (as this issue shows)
and it doesn't work for hot-added devices (I don't know if that's an
issue for VMD, but the pattern gets copied to places where it *is*).

Normally vmd_pm_enable_quirk() would be done by making it an actual
DECLARE_PCI_FIXUP_HEADER() or DECLARE_PCI_FIXUP_FINAL(), so it would
be called automatically by the PCI core when a new device is
enumerated.  Would that work here?  If it would, I don't think you'd
need to add the extra flag to pci_enable_link_state().

> > Link: https://lore.kernel.org/linux-pci/zbjko%2fifuniws...@intel.com/
> > Signed-off-by: David E. Box 
> > ---
> >  drivers/pci/controller/vmd.c | 2 +-
> >  drivers/pci/pcie/aspm.c  | 9 ++---
> >  include/linux/pci.h  | 5 +++--
> >  3 files changed, 10 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > index 990630ec57c6..45aa35744eae 100644
> > --- a/drivers/pci/controller/vmd.c
> > +++ b/drivers/pci/controller/vmd.c
> > @@ -737,7 +737,7 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, 
> > void *userdata)
> > if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
> > return 0;
> >  
> > -   pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
> > +   pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL, false);
> >  
> > pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
> > if (!pos)
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 66d7514ca111..5b5a600bb864 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -1147,8 +1147,9 @@ EXPORT_SYMBOL(pci_disable_link_state);
> >   *
> >   * @pdev: PCI device
> >   * @state: Mask of ASPM link states to enable
> > + * @sem: Boolean to acquire/release pci_bus_sem
> >   */
> > -int pci_enable_link_state(struct pci_dev *pdev, int state)
> > +int pci_enable_link_state(struct pci_dev *pdev, int state, bool sem)
> >  {
> > struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> >  
> > @@ -1165,7 +1166,8 @@ int pci_enable_link_state(struct pci_dev *pdev, int 
> > state)
> > return -EPERM;
> > }
> >  
> > -   down_read(&pci_bus_sem);
> > +   if (sem)
> > +   down_read(&pci_bus_sem);
> > mutex_lock(&aspm_lock);
> > link->aspm_default = 0;
> > if (state & PCIE_LINK_STATE_L0S)
> > @@ -1186,7 +1188,8 @@ int pci_enable_link_state(struct pci_dev *pdev, int 
> > state)
> > link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > pcie_set_clkpm(link, policy_to_clkpm_state(link));
> > mutex_unlock(&aspm_lock);
> > -   up_read(&pci_bus_sem);
> > +   if (sem)
> > +   up_read(&pci_bus_sem);
> >  
> > return 0;
> >  }
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index fafd8020c6d7.

Re: [PATCH 0/8] drm/i915/pciids: PCI ID macro cleanups

2024-05-15 Thread Bjorn Helgaas
On Fri, May 10, 2024 at 04:55:07PM +0300, Jani Nikula wrote:
> On Fri, 10 May 2024, Jani Nikula  wrote:
> > This is a spin-off from [1], including just the PCI ID macro cleanups,
> > as well as adding a bunch more cleanups.
> >
> > BR,
> > Jani.
> >
> > [1] https://lore.kernel.org/all/cover.1715086509.git.jani.nik...@intel.com/
> >
> >
> > Jani Nikula (8):
> >   drm/i915/pciids: add INTEL_PNV_IDS(), use acronym
> >   drm/i915/pciids: add INTEL_ILK_IDS(), use acronym
> >   drm/i915/pciids: add INTEL_SNB_IDS()
> >   drm/i915/pciids: add INTEL_IVB_IDS()
> >   drm/i915/pciids: don't include WHL/CML PCI IDs in CFL
> >   drm/i915/pciids: remove 11 from INTEL_ICL_IDS()
> >   drm/i915/pciids: remove 12 from INTEL_TGL_IDS()
> >   drm/i915/pciids: don't include RPL-U PCI IDs in RPL-P
> >
> >  arch/x86/kernel/early-quirks.c| 19 +++---
> 
> Bjorn, ack for merging this via drm-intel-next?

Sorry, I had ignored this because I didn't think it affected any PCI
stuff.  This is fine with me:

Acked-by: Bjorn Helgaas 

But since you asked :), I'll gripe again about the fact that this
intel_early_ids[] list needs continual maintenance, which is not the
way things are supposed to work.  Long thread about it here:

https://lore.kernel.org/linux-pci/20201104120506.172447-1-tejaskumarx.surendrakumar.upadh...@intel.com/t/#u

> >  .../drm/i915/display/intel_display_device.c   | 20 +++---
> >  drivers/gpu/drm/i915/i915_pci.c   | 13 ++--
> >  drivers/gpu/drm/i915/intel_device_info.c  |  3 +-
> >  include/drm/i915_pciids.h | 67 ---
> >  5 files changed, 71 insertions(+), 51 deletions(-)
> 
> -- 
> Jani Nikula, Intel


Re: [core-for-CI PATCH] PCI: Make PCI cfg_access_lock lockdep key a singleton

2024-05-30 Thread Bjorn Helgaas
[+cc linux-pci, this is a follow-up to 7e89efc6e9e4, which appeared in
v6.10-rc1 via the PCI tree]

On Wed, May 29, 2024 at 02:49:01PM +0300, Imre Deak wrote:
> From: Dan Williams 
> 
> The new lockdep annotation for cfg_access_lock naively registered a new
> key per device. This is overkill and leads to warnings on hash
> collisions at dynamic registration time:
> 
>  WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:1226 
> lockdep_register_key+0xb0/0x240
>  RIP: 0010:lockdep_register_key+0xb0/0x240
>  [..]
>  Call Trace:
>   
>   ? __warn+0x8c/0x190
>   ? lockdep_register_key+0xb0/0x240
>   ? report_bug+0x1f8/0x200
>   ? handle_bug+0x3c/0x70
>   ? exc_invalid_op+0x18/0x70
>   ? asm_exc_invalid_op+0x1a/0x20
>   ? lockdep_register_key+0xb0/0x240
>   pci_device_add+0x14b/0x560
>   ? pci_setup_device+0x42e/0x6a0
>   pci_scan_single_device+0xa7/0xd0
>   p2sb_scan_and_cache_devfn+0xc/0x90
>   p2sb_fs_init+0x15f/0x170
> 
> Switch to a shared static key for all instances.
> 
> Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
> Reported-by: Jani Saarinen 
> Closes: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14834/bat-apl-1/boot0.txt
> Cc: Dave Jiang 
> Cc: Bjorn Helgaas 
> Signed-off-by: Dan Williams 
> ---
>  drivers/pci/probe.c | 7 ---
>  include/linux/pci.h | 1 -
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 8e696e547565c..15168881ec941 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2533,6 +2533,8 @@ static void pci_set_msi_domain(struct pci_dev *dev)
>   dev_set_msi_domain(&dev->dev, d);
>  }
>  
> +static struct lock_class_key cfg_access_key;
> +
>  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  {
>   int ret;
> @@ -2546,9 +2548,8 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus 
> *bus)
>   dev->dev.dma_mask = &dev->dma_mask;
>   dev->dev.dma_parms = &dev->dma_parms;
>   dev->dev.coherent_dma_mask = 0xull;
> - lockdep_register_key(&dev->cfg_access_key);
> - lockdep_init_map(&dev->cfg_access_lock, dev_name(&dev->dev),
> -  &dev->cfg_access_key, 0);
> + lockdep_init_map(&dev->cfg_access_lock, "&dev->cfg_access_lock",
> +  &cfg_access_key, 0);
>  
>   dma_set_max_seg_size(&dev->dev, 65536);
>   dma_set_seg_boundary(&dev->dev, 0x);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fb004fd4e8890..5bece7fd11f88 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -413,7 +413,6 @@ struct pci_dev {
>   struct resource driver_exclusive_resource;   /* driver exclusive 
> resource ranges */
>  
>   boolmatch_driver;   /* Skip attaching driver */
> - struct lock_class_key cfg_access_key;
>   struct lockdep_map cfg_access_lock;
>  
>   unsigned inttransparent:1;  /* Subtractive decode bridge */
> -- 
> 2.43.3
> 


Re: [PATCH] PCI: Make PCI cfg_access_lock lockdep key a singleton

2024-05-30 Thread Bjorn Helgaas
[+cc Alison, Imre, et al.  IIUC this patch didn't help the similar
issue reported by Imre at
https://lore.kernel.org/r/zlxp5otnsapid...@ideak-desk.fi.intel.com,
but just FYI]

On Tue, May 28, 2024 at 04:22:59PM -0700, Dan Williams wrote:
> The new lockdep annotation for cfg_access_lock naively registered a new
> key per device. This is overkill and leads to warnings on hash
> collisions at dynamic registration time:
> 
>  WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:1226 
> lockdep_register_key+0xb0/0x240
>  RIP: 0010:lockdep_register_key+0xb0/0x240
>  [..]
>  Call Trace:
>   
>   ? __warn+0x8c/0x190
>   ? lockdep_register_key+0xb0/0x240
>   ? report_bug+0x1f8/0x200
>   ? handle_bug+0x3c/0x70
>   ? exc_invalid_op+0x18/0x70
>   ? asm_exc_invalid_op+0x1a/0x20
>   ? lockdep_register_key+0xb0/0x240
>   pci_device_add+0x14b/0x560
>   ? pci_setup_device+0x42e/0x6a0
>   pci_scan_single_device+0xa7/0xd0
>   p2sb_scan_and_cache_devfn+0xc/0x90
>   p2sb_fs_init+0x15f/0x170
> 
> Switch to a shared static key for all instances.
> 
> Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
> Reported-by: Jani Saarinen 
> Closes: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14834/bat-apl-1/boot0.txt
> Cc: Dave Jiang 
> Cc: Bjorn Helgaas 
> Signed-off-by: Dan Williams 

Applied with Alison's reviewed-by and Jani's tested-by to for-linus
for v6.10, thanks!



> ---
> Hi Bjorn,
> 
> One more fallout from the cfg_access_lock lockdep annotation. This one
> still wants a Tested-by from Jani before merging, but wanted to make you
> aware of it in case similar reports make their way to you in the
> meantime.
> 
>  drivers/pci/probe.c |7 ---
>  include/linux/pci.h |1 -
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 8e696e547565..15168881ec94 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2533,6 +2533,8 @@ static void pci_set_msi_domain(struct pci_dev *dev)
>   dev_set_msi_domain(&dev->dev, d);
>  }
>  
> +static struct lock_class_key cfg_access_key;
> +
>  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  {
>   int ret;
> @@ -2546,9 +2548,8 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus 
> *bus)
>   dev->dev.dma_mask = &dev->dma_mask;
>   dev->dev.dma_parms = &dev->dma_parms;
>   dev->dev.coherent_dma_mask = 0xull;
> - lockdep_register_key(&dev->cfg_access_key);
> - lockdep_init_map(&dev->cfg_access_lock, dev_name(&dev->dev),
> -  &dev->cfg_access_key, 0);
> + lockdep_init_map(&dev->cfg_access_lock, "&dev->cfg_access_lock",
> +  &cfg_access_key, 0);
>  
>   dma_set_max_seg_size(&dev->dev, 65536);
>   dma_set_seg_boundary(&dev->dev, 0x);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fb004fd4e889..5bece7fd11f8 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -413,7 +413,6 @@ struct pci_dev {
>   struct resource driver_exclusive_resource;   /* driver exclusive 
> resource ranges */
>  
>   boolmatch_driver;   /* Skip attaching driver */
> - struct lock_class_key cfg_access_key;
>   struct lockdep_map cfg_access_lock;
>  
>   unsigned inttransparent:1;  /* Subtractive decode bridge */
> 


Re: [core-for-CI PATCH] PCI: Make PCI cfg_access_lock lockdep key a singleton

2024-05-30 Thread Bjorn Helgaas
On Wed, May 29, 2024 at 02:49:01PM +0300, Imre Deak wrote:
> From: Dan Williams 
> 
> The new lockdep annotation for cfg_access_lock naively registered a new
> key per device. This is overkill and leads to warnings on hash
> collisions at dynamic registration time:
> 
>  WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:1226 
> lockdep_register_key+0xb0/0x240
>  RIP: 0010:lockdep_register_key+0xb0/0x240
>  [..]
>  Call Trace:
>   
>   ? __warn+0x8c/0x190
>   ? lockdep_register_key+0xb0/0x240
>   ? report_bug+0x1f8/0x200
>   ? handle_bug+0x3c/0x70
>   ? exc_invalid_op+0x18/0x70
>   ? asm_exc_invalid_op+0x1a/0x20
>   ? lockdep_register_key+0xb0/0x240
>   pci_device_add+0x14b/0x560
>   ? pci_setup_device+0x42e/0x6a0
>   pci_scan_single_device+0xa7/0xd0
>   p2sb_scan_and_cache_devfn+0xc/0x90
>   p2sb_fs_init+0x15f/0x170
> 
> Switch to a shared static key for all instances.
> 
> Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
> Reported-by: Jani Saarinen 
> Closes: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14834/bat-apl-1/boot0.txt
> Cc: Dave Jiang 
> Cc: Bjorn Helgaas 
> Signed-off-by: Dan Williams 

FWIW, this patch has been applied to the PCI tree for v6.10:
https://lore.kernel.org/r/20240530173659.GA553323@bhelgaas

> ---
>  drivers/pci/probe.c | 7 ---
>  include/linux/pci.h | 1 -
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 8e696e547565c..15168881ec941 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2533,6 +2533,8 @@ static void pci_set_msi_domain(struct pci_dev *dev)
>   dev_set_msi_domain(&dev->dev, d);
>  }
>  
> +static struct lock_class_key cfg_access_key;
> +
>  void pci_device_add(struct pci_dev *dev, struct pci_bus *bus)
>  {
>   int ret;
> @@ -2546,9 +2548,8 @@ void pci_device_add(struct pci_dev *dev, struct pci_bus 
> *bus)
>   dev->dev.dma_mask = &dev->dma_mask;
>   dev->dev.dma_parms = &dev->dma_parms;
>   dev->dev.coherent_dma_mask = 0xull;
> - lockdep_register_key(&dev->cfg_access_key);
> - lockdep_init_map(&dev->cfg_access_lock, dev_name(&dev->dev),
> -  &dev->cfg_access_key, 0);
> + lockdep_init_map(&dev->cfg_access_lock, "&dev->cfg_access_lock",
> +  &cfg_access_key, 0);
>  
>   dma_set_max_seg_size(&dev->dev, 65536);
>   dma_set_seg_boundary(&dev->dev, 0x);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index fb004fd4e8890..5bece7fd11f88 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -413,7 +413,6 @@ struct pci_dev {
>   struct resource driver_exclusive_resource;   /* driver exclusive 
> resource ranges */
>  
>   boolmatch_driver;   /* Skip attaching driver */
> - struct lock_class_key cfg_access_key;
>   struct lockdep_map cfg_access_lock;
>  
>   unsigned inttransparent:1;  /* Subtractive decode bridge */
> -- 
> 2.43.3
> 


Re: [PATCH v1 1/1] treewide: Align match_string() with sysfs_match_string()

2024-06-04 Thread Bjorn Helgaas
On Sun, Jun 02, 2024 at 06:57:12PM +0300, Andy Shevchenko wrote:
> Make two APIs look similar. Hence convert match_string() to be
> a 2-argument macro. In order to avoid unneeded churn, convert
> all users as well. There is no functional change intended.

Looks nice, thanks for doing this.

> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index ac6293c24976..2d317c7e1cea 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -210,7 +210,7 @@ void pcie_ecrc_get_policy(char *str)
>  {
>   int i;
>  
> - i = match_string(ecrc_policy_str, ARRAY_SIZE(ecrc_policy_str), str);
> + i = match_string(ecrc_policy_str, str);
>   if (i < 0)
>   return;
>  

Acked-by: Bjorn Helgaas# drivers/pci/

> +++ b/mm/vmpressure.c
> @@ -388,7 +388,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg,
>  
>   /* Find required level */
>   token = strsep(&spec, ",");
> - ret = match_string(vmpressure_str_levels, VMPRESSURE_NUM_LEVELS, token);
> + ret = match_string(vmpressure_str_levels, token);

VMPRESSURE_NUM_LEVELS looks like it's no longer used?

>   if (ret < 0)
>   goto out;
>   level = ret;
> @@ -396,7 +396,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg,
>   /* Find optional mode */
>   token = strsep(&spec, ",");
>   if (token) {
> - ret = match_string(vmpressure_str_modes, VMPRESSURE_NUM_MODES, 
> token);
> + ret = match_string(vmpressure_str_modes, token);

Ditto.

>   if (ret < 0)
>   goto out;
>   mode = ret;


Re: [Intel-gfx] [PATCH] PCI: Add new PCI id for Intel GPU interrupt quirk

2014-04-25 Thread Bjorn Helgaas
[+cc Daniel, Jani, intel-gfx, dri-devel, -cc stable]

On Mon, Apr 07, 2014 at 03:10:32PM +0200, Thomas Jarosch wrote:
> After a CPU upgrade while keeping the same mainboard,
> we faced "spurious interrupt" problems again.
> 
> It turned out that the new CPU also featured a
> new GPU with a different PCI id.
> 
> -> Add this PCI id to the quirk table. Probably all other
> Intel GPU PCI ids are affected, too, but I don't want
> to add them without a test system.

Daniel, Jani, et al, do we need a better solution to this?  Is there
a more general way to solve this than by tripping over affected machines
one-by-one?  Could this be done in the driver rather than in a quirk?

> Signed-off-by: Thomas Jarosch 
> Tested-by: Thomas Jarosch 
> ---
>  drivers/pci/quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index e729206..0feb4a3 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2954,6 +2954,7 @@ static void disable_igfx_irq(struct pci_dev *dev)
>  }
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
>  
>  /*
>   * PCI devices which are on Intel chips can skip the 10ms delay
> -- 
> 1.8.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] PCI: Add new PCI id for Intel GPU interrupt quirk

2014-04-25 Thread Bjorn Helgaas
On Fri, Apr 25, 2014 at 11:35:21AM -0600, Bjorn Helgaas wrote:
> [+cc Daniel, Jani, intel-gfx, dri-devel, -cc stable]
> 
> On Mon, Apr 07, 2014 at 03:10:32PM +0200, Thomas Jarosch wrote:
> > After a CPU upgrade while keeping the same mainboard,
> > we faced "spurious interrupt" problems again.
> > 
> > It turned out that the new CPU also featured a
> > new GPU with a different PCI id.
> > 
> > -> Add this PCI id to the quirk table. Probably all other
> > Intel GPU PCI ids are affected, too, but I don't want
> > to add them without a test system.
> 
> Daniel, Jani, et al, do we need a better solution to this?  Is there
> a more general way to solve this than by tripping over affected machines
> one-by-one?  Could this be done in the driver rather than in a quirk?

I guess it can't be done in the driver because the problem happens even if
the driver isn't loaded, per f67fd55fa96f ("PCI: Add quirk for still
enabled interrupts on Intel Sandy Bridge GPUs").

> > Signed-off-by: Thomas Jarosch 
> > Tested-by: Thomas Jarosch 
> > ---
> >  drivers/pci/quirks.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index e729206..0feb4a3 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2954,6 +2954,7 @@ static void disable_igfx_irq(struct pci_dev *dev)
> >  }
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
> >  
> >  /*
> >   * PCI devices which are on Intel chips can skip the 10ms delay
> > -- 
> > 1.8.1.4
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-02-10 Thread Bjorn Helgaas
On Sun, Feb 9, 2014 at 6:25 AM, Paul Bolle  wrote:
> On Sun, 2014-02-09 at 13:15 +, Steven Newbury wrote:
>> PCI resource allocation is undergoing some changes at the moment, it's
>> definitely a bug if the Flush Page isn't getting allocated.  I'm looking
>> forward to hopefully getting pci_bus_alloc_resource_fit() behaviour in
>> mainline, it will provide much better resource allocation in the 32 bit
>> PCI address space, and prevent problems like this from cropping up.
>>
>> See Yinghai Lu's for-pci-res-alloc branch.
>>
>> I've been carrying the changes in my local tree, but right now the
>> upstream PCI changes are quite extensive.  He's planning on rebasing the
>> branch soon.
>
> Does this mean I might be better of not bisecting this just yet? Or are
> these changes targeted at v3.15 (or later)?

Yinghai's changes would probably be in v3.15 or later.

Can you open a kernel.org bugzilla report and attach complete dmesg
logs of the working and broken kernels to it?  There might be more
useful resource-related messages from the PCI core.

I wouldn't start bisecting yet, but if you're in the mood, this
commit: 96702be56037 "Merge branch 'pci/resource' into next" looks
like a good place to start, so you could try the pre-merge commit:
04f982beb900 "Merge branch 'pci/msi' into next".  If 04f982beb900 is
good, there are only about 15 commits on the pci/resource branch to
look at.

Thanks for the report!

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-06 Thread Bjorn Helgaas
On Thu, Mar 6, 2014 at 1:25 PM, Paul Bolle  wrote:
> Bjorn Helgaas schreef op ma 10-02-2014 om 14:33 [-0700]:
>> Can you open a kernel.org bugzilla report and attach complete dmesg
>> logs of the working and broken kernels to it?  There might be more
>> useful resource-related messages from the PCI core.
>
> That took me quite a bit longer than I hoped, but I finally opened a
> report at
> https://bugzilla.kernel.org/show_bug.cgi?id=71611 .
>
> Note that the dmesg's are identical (up to that error). Are you still
> interested?

Yep, I wouldn't mind seeing the dmesg log to compare with the
/proc/iomem contents you posted earlier.  Since they're identical,
there might not be a clue, but you never know, and it would be really
great to squash this regression before v3.14.

>> I wouldn't start bisecting yet, but if you're in the mood, this
>> commit: 96702be56037 "Merge branch 'pci/resource' into next" looks
>> like a good place to start, so you could try the pre-merge commit:
>> 04f982beb900 "Merge branch 'pci/msi' into next".  If 04f982beb900 is
>> good, there are only about 15 commits on the pci/resource branch to
>> look at.
>
> I hope to do a bisect in the next few days. If that bisect pinpoints an
> interesting commit that might just be in time for v3.14.
>
>
> Paul Bolle
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-07 Thread Bjorn Helgaas
On Fri, Mar 7, 2014 at 2:48 AM, Paul Bolle  wrote:
> Bjorn Helgaas schreef op ma 10-02-2014 om 14:33 [-0700]:
>> I wouldn't start bisecting yet, but if you're in the mood, this
>> commit: 96702be56037 "Merge branch 'pci/resource' into next" looks
>> like a good place to start, so you could try the pre-merge commit:
>> 04f982beb900 "Merge branch 'pci/msi' into next".  If 04f982beb900 is
>> good, there are only about 15 commits on the pci/resource branch to
>> look at.
>
> This might end up not being relevant. And this is surely documented
> somewhere, but anyhow:
> - what git magic returns the hashes of the 15 commits that merge commit
>   96702be56037 added to the tree; and

"git show 96702be56037" gives:

commit 96702be560374ee7e7139a34cab03554129abbb4
Merge: 04f982beb900 d56dbf5bab8c
...

04f982beb900 is the previous HEAD, d56dbf5bab8c is the head of the
branch merged by this commit.  "git log 04f982beb900..96702be56037"
shows the commits merged.

> - how can I use the list of those hashes to limit the range of commits
>   to do a git bisect?

I'm not a git bisect expert, but I *think* you should be able to do
something like this:

git bisect start
git bisect bad 96702be56037
git bisect good 04f982beb900

(assuming you've verified that 96702be56037 really *is* bad and
04f982beb900 really *is* good), and git should checkout something in
the middle and you can build and test it, then use "git bisect good"
or "git bisect bad" depending on the result.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-07 Thread Bjorn Helgaas
On Thu, Mar 6, 2014 at 1:25 PM, Paul Bolle  wrote:
> Bjorn Helgaas schreef op ma 10-02-2014 om 14:33 [-0700]:
>> Can you open a kernel.org bugzilla report and attach complete dmesg
>> logs of the working and broken kernels to it?  There might be more
>> useful resource-related messages from the PCI core.
>
> That took me quite a bit longer than I hoped, but I finally opened a
> report at
> https://bugzilla.kernel.org/show_bug.cgi?id=71611 .
>
> Note that the dmesg's are identical (up to that error). Are you still
> interested?

Thanks for attaching the dmesg log.  I also attached the iomem
contents you previously posted.

This is likely a tangent from the AGP issue, but I don't understand
this iomem diff between v3.13.2 and v3.14-rc1:

  -8000-801f : PCI Bus :02
  -8020-8027 : :00:02.1

The first line is a bridge window to [bus 02], and the second is an
MMIO region for one of your AGP devices (00:02.0 is VGA and 00:02.1
seems to be some related device).

The bridge to [bus 02] is 00:1c.0, and your dmesg shows:

  pci :00:1c.0: PCI bridge to [bus 02]
  pci :00:1c.0:   bridge window [io  0x6000-0x6fff]
  pci :00:1c.0:   bridge window [mem 0xa010-0xa01f]

The [mem 0xa010-0xa01f] window appears in the dmesg log and in
both iomem captures; that makes good sense.  The 8000-801f
window is in only the v3.13.2 iomem, and it *should* appear in the
v3.13.2 dmesg log but apparently doesn't?

Likewise, the 00:02.1 resource appears only in the v3.13.2 iomem, and
it should also be mentioned in the v3.13.2 dmesg log.  The v3.14-rc5
dmesg log shows an unassigned resource there:

  pci :00:02.1: reg 0x10: [mem 0x-0x0007]

and says we were unable to assign anything.  But apparently v3.13.2
*did* assign something, and the dmesg log should show that.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-07 Thread Bjorn Helgaas
On Fri, Mar 07, 2014 at 06:16:49PM +0100, Paul Bolle wrote:
> Bjorn Helgaas schreef op vr 07-03-2014 om 09:55 [-0700]:
> > On Fri, Mar 7, 2014 at 2:48 AM, Paul Bolle  wrote:
> > > This might end up not being relevant. And this is surely documented
> > > somewhere, but anyhow:
> > > - what git magic returns the hashes of the 15 commits that merge commit
> > >   96702be56037 added to the tree; and
> > 
> > "git show 96702be56037" gives:
> > 
> > commit 96702be560374ee7e7139a34cab03554129abbb4
> > Merge: 04f982beb900 d56dbf5bab8c
> > ...
> > 
> > 04f982beb900 is the previous HEAD, d56dbf5bab8c is the head of the
> > branch merged by this commit.  "git log 04f982beb900..96702be56037"
> > shows the commits merged.
> 
> Thanks. Fairly obvious, actually. Not sure why I didn't think of this
> myself.
> 
> > > - how can I use the list of those hashes to limit the range of commits
> > >   to do a git bisect?
> > 
> > I'm not a git bisect expert, but I *think* you should be able to do
> > something like this:
> > 
> > git bisect start
> > git bisect bad 96702be56037
> > git bisect good 04f982beb900
> > 
> > (assuming you've verified that 96702be56037 really *is* bad and
> > 04f982beb900 really *is* good), and git should checkout something in
> > the middle and you can build and test it, then use "git bisect good"
> > or "git bisect bad" depending on the result.
> 
> Makes sense. Thanks again. 04f982beb900 appears to be good. So if
> 96702be56037 turns out to be bad bisecting might not turn into the
> ordeal it usually is. (On this machine, with my workflow, bisecting an
> v3.x..v3.x+1-rcy range takes a few days.)

It seems quite possible that I broke pci_bus_alloc_resource(), which could
cause an allocation failure like this.  

If you have a chance to try it, here's a debug patch against v3.14-rc5.  It
should apply cleanly to 96702be56037.  If you can try it, please attach the
dmesg log to the bugzilla.

Bjorn


diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 5c85350f4c3d..0dbba6c7c001 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -997,6 +997,7 @@ static int intel_alloc_chipset_flush_resource(void)
ret = pci_bus_alloc_resource(intel_private.bridge_dev->bus, 
&intel_private.ifp_resource, PAGE_SIZE,
 PAGE_SIZE, PCIBIOS_MIN_MEM, 0,
 pcibios_align_resource, 
intel_private.bridge_dev);
+   dev_info(&intel_private.bridge_dev->dev, "pci_bus_alloc ret %d\n", ret);
 
return ret;
 }
@@ -1007,6 +1008,7 @@ static void intel_i915_setup_chipset_flush(void)
u32 temp;
 
pci_read_config_dword(intel_private.bridge_dev, I915_IFPADDR, &temp);
+   dev_info(&intel_private.bridge_dev->dev, "I915_IFPADDR %#010x\n", temp);
if (!(temp & 0x1)) {
intel_alloc_chipset_flush_resource();
intel_private.resource_valid = 1;
@@ -1022,6 +1024,7 @@ static void intel_i915_setup_chipset_flush(void)
if (ret)
intel_private.resource_valid = 0;
}
+   dev_info(&intel_private.bridge_dev->dev, "ifp_resource %pR\n", 
&intel_private.ifp_resource);
 }
 
 static void intel_i965_g33_setup_chipset_flush(void)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 00660cc502c5..1c6d75ae34d9 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -146,24 +146,31 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, 
struct resource *res,
 
type_mask |= IORESOURCE_IO | IORESOURCE_MEM;
 
+   dev_info(&bus->dev, "%s: alloc %pR size %#llx from bus region 
[%#010llx-%#010llx]\n", __func__, res, (long long) size, (long long) 
region->start, (long long) region->end);
pci_bus_for_each_resource(bus, r, i) {
if (!r)
continue;
 
/* type_mask must match */
-   if ((res->flags ^ r->flags) & type_mask)
+   if ((res->flags ^ r->flags) & type_mask) {
+   dev_info(&bus->dev, "%s: %pR: wrong type (%#lx %#lx 
mask %#x)\n", __func__, r, res->flags, r->flags, type_mask);
continue;
+   }
 
/* We cannot allocate a non-prefetching resource
   from a pre-fetching area */
if ((r->flags & IORESOURCE_PREFETCH) &&
-   !(res->flags & IORESOURCE_PREFETCH))
+   !(res->flags & IORESOURCE_PREFETCH)) {
+   

Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-07 Thread Bjorn Helgaas
On Fri, Mar 7, 2014 at 2:03 PM, Paul Bolle  wrote:
> On Fri, 2014-03-07 at 13:40 -0700, Bjorn Helgaas wrote:
>> It seems quite possible that I broke pci_bus_alloc_resource(), which could
>> cause an allocation failure like this.
>>
>> If you have a chance to try it, here's a debug patch against v3.14-rc5.  It
>> should apply cleanly to 96702be56037.  If you can try it, please attach the
>> dmesg log to the bugzilla.
>
> That ThinkPad X41 is now building 96702be56037. Once that build is
> finished and tested I'll try your debug patch (on top of v3.14-rc5, see
> later). It might take some time to finish both builds and test boots.

> My v3.13 based builds don't have INTEL_GTT set! My v3.14-rcy based
> builds do. I have not yet investigated why that is.

I think that's OK.  CONFIG_INTEL_GTT was added after v3.13
(00fe639a56b4 "drm/i915: Make AGP support optional").  It looks like
in v3.13, intel-gtt.o was built if CONFIG_AGP_INTEL=y (or =m), which
you probably do have (see drivers/char/agp/Makefile).

> Too bad drivers/pci/bus.o is built in by definition. If only one could
> build a kernel without rebuilding all modules. Or is there some way to
> actually do that?

You should be able to "make bzImage" and get just the kernel.  But
there might be module loading issues if the modules don't exactly
match the kernel.  I think it's possible to turn that checking off,
but I don't do it often enough to remember details.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-08 Thread Bjorn Helgaas
On Fri, Mar 7, 2014 at 1:40 PM, Bjorn Helgaas  wrote:
> It seems quite possible that I broke pci_bus_alloc_resource(), which could
> cause an allocation failure like this.
>
> If you have a chance to try it, here's a debug patch against v3.14-rc5.  It
> should apply cleanly to 96702be56037.  If you can try it, please attach the
> dmesg log to the bugzilla.

Paul verified that I *did* break this.  More details in the bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=71691

The problem is basically that I used resource_size() to figure out
whether there's any available space.  resource_size() is res->end -
res->start + 1, so applying it to [mem 0x-0x] returns
zero in a kernel 32-bit resource addresses, i.e., with
CONFIG_PHYS_ADDR_T_64BIT=n.

Paul, I assume you have CONFIG_PHYS_ADDR_T_64BIT=n (which is perfectly
legal); let me know if otherwise.

  pci_bus :00: pci_bus_alloc_from_region: alloc [mem 0x]
size 0x1000 from bus region [0x-0x]
  pci_bus :00: pci_bus_alloc_from_region: [mem
0x-0x]: no space (avail [mem 0x-0x])

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-10 Thread Bjorn Helgaas
[+cc linux-pci]

On Sat, Mar 08, 2014 at 03:44:37PM +0100, Paul Bolle wrote:
> Bjorn Helgaas schreef op za 08-03-2014 om 07:12 [-0700]:
> > I assume you have CONFIG_PHYS_ADDR_T_64BIT=n (which is perfectly
> > legal); let me know if otherwise.
> 
> $ grep CONFIG_PHYS_ADDR_T_64BIT /boot/config-3.14.0-0.rc5.1.local2.fc20.i686 
> # CONFIG_PHYS_ADDR_T_64BIT is not set
> 
> So, yes, CONFIG_PHYS_ADDR_T_64BIT=n here.

Thanks.  Can you try the patch below?  I think it should fix the problem.


PCI: Don't check resource_size() in pci_bus_alloc_resource()

From: Bjorn Helgaas 

When resource_size_t is 32 bits wide, e.g., when CONFIG_PHYS_ADDR_T_64BIT
is not defined, resource_size() on [mem 0x-0x] returns 0
because (r->end - r->start + 1) overflows.

Therefore, we can't use "resource_size() == 0" to decide that allocation
from this resource will fail.  allocate_resource() should fail anyway if it
can't satisfy the address constraints, so we should just depend on that.

A [mem 0x-0x] bus resource is obviously not really valid,
but we do fall back to it as a default when we don't have information about
host bridge apertures.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=71611
Fixes: f75b99d5a77d PCI: Enforce bus address limits in resource allocation
Signed-off-by: Bjorn Helgaas 
---
 drivers/pci/bus.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 00660cc502c5..38901665c770 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -162,8 +162,6 @@ static int pci_bus_alloc_from_region(struct pci_bus *bus, 
struct resource *res,
 
avail = *r;
pci_clip_resource_to_region(bus, &avail, region);
-   if (!resource_size(&avail))
-   continue;
 
/*
 * "min" is typically PCIBIOS_MIN_IO or PCIBIOS_MIN_MEM to
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-10 Thread Bjorn Helgaas
[+cc Rafael]

On Mon, Mar 10, 2014 at 5:45 PM, Paul Bolle  wrote:
> Bjorn Helgaas schreef op ma 10-03-2014 om 12:24 [-0600]:
>> Thanks.  Can you try the patch below?  I think it should fix the problem.
>>
>>
>> PCI: Don't check resource_size() in pci_bus_alloc_resource()
>>
>> From: Bjorn Helgaas 
>>
>> When resource_size_t is 32 bits wide, e.g., when CONFIG_PHYS_ADDR_T_64BIT
>> is not defined, resource_size() on [mem 0x-0x] returns 0
>> because (r->end - r->start + 1) overflows.
>>
>> Therefore, we can't use "resource_size() == 0" to decide that allocation
>> from this resource will fail.  allocate_resource() should fail anyway if it
>> can't satisfy the address constraints, so we should just depend on that.
>>
>> A [mem 0x-0x] bus resource is obviously not really valid,
>> but we do fall back to it as a default when we don't have information about
>> host bridge apertures.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=71611
>> Fixes: f75b99d5a77d PCI: Enforce bus address limits in resource allocation
>> Signed-off-by: Bjorn Helgaas 
>> ---
>>  drivers/pci/bus.c |2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>> index 00660cc502c5..38901665c770 100644
>> --- a/drivers/pci/bus.c
>> +++ b/drivers/pci/bus.c
>> @@ -162,8 +162,6 @@ static int pci_bus_alloc_from_region(struct pci_bus 
>> *bus, struct resource *res,
>>
>>   avail = *r;
>>   pci_clip_resource_to_region(bus, &avail, region);
>> - if (!resource_size(&avail))
>> - continue;
>>
>>   /*
>>* "min" is typically PCIBIOS_MIN_IO or PCIBIOS_MIN_MEM to
>
> I've applied your patch on top of v3.14-rc6. The boot warning is gone.
> And /proc/iomem once again has the lines:
> [...]
> 8000-801f : PCI Bus :02
> 8020-8027 : :00:02.1
> 8028-80280fff : Intel Flush Page
> [...]
>
> Which is what we want, don't we?

Yep, that part looks good.

> A bit of doubt is caused by two new boot time messages:
> pnp 00:00: unknown resource type 10 in _CRS
> pnp 00:00: can't evaluate _CRS: 1
>
> But I haven't yet tried v3.14-rc6 without your patch, so these might be
> unrelated. They're unclear to me, anyway.

Hmm, this is definitely concerning.  Resource type 10 is
ACPI_RESOURCE_TYPE_FIXED_MEMORY32, which we do handle (unless it's
length 0).  And your previous dmesg logs (at
https://bugzilla.kernel.org/show_bug.cgi?id=71611) don't show anything
like that.

Can you attach an acpidump and the dmesg log with my patch to the bugzilla?

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] agp/intel: can't ioremap flush page - no chipset flushing

2014-03-10 Thread Bjorn Helgaas
On Mon, Mar 10, 2014 at 6:15 PM, Paul Bolle  wrote:
> On Mon, 2014-03-10 at 18:07 -0600, Bjorn Helgaas wrote:
>> On Mon, Mar 10, 2014 at 5:45 PM, Paul Bolle  wrote:
>> > A bit of doubt is caused by two new boot time messages:
>> > pnp 00:00: unknown resource type 10 in _CRS
>> > pnp 00:00: can't evaluate _CRS: 1
>> >
>> > But I haven't yet tried v3.14-rc6 without your patch, so these might be
>> > unrelated. They're unclear to me, anyway.
>>
>> Hmm, this is definitely concerning.  Resource type 10 is
>> ACPI_RESOURCE_TYPE_FIXED_MEMORY32, which we do handle (unless it's
>> length 0).  And your previous dmesg logs (at
>> https://bugzilla.kernel.org/show_bug.cgi?id=71611) don't show anything
>> like that.
>>
>> Can you attach an acpidump and the dmesg log with my patch to the bugzilla?
>
> It's getting rather late over here. So I'll try (quite) a few hours
> later. But acpidump doesn't ring any bells right now. Any hints?

There's acpidump info here: https://01.org/linux-acpi/utilities .  You
don't need to extract the binary tables or anything; just attach the
text dump to the bugzilla.

It would be very interesting to try v3.14-rc6 without my patch.  I
hope it has the same "unknown resource type" messages, because I don't
see how my patch could be related to them.  There were no recent
PNP-related changes, so then I start to worry about some sort of
memory corruption from an unrelated patch.  But I hope that's not the
case.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] ACPI: Fix acpi_evaluate_object() return value check

2014-02-04 Thread Bjorn Helgaas
On Wed, Jan 22, 2014 at 8:42 PM, Yijing Wang  wrote:
> Since acpi_evaluate_object() returns acpi_status and not plain int,
> ACPI_FAILURE() should be used for checking its return value. Also
> add some detailed debug info when acpi_evaluate_object() failed.
>
> Reviewed-by: Jani Nikula 
> Acked-by: Bjorn Helgaas 
> Signed-off-by: Yijing Wang 
> ---
> v4->v5: Add some detailed debug info for acpi_evaluate_object()
> failure suggested by Bjorn.
> v3->v4: Fix spell error, add Jani Nikula reviewed-by.
> v2->v3: Fix compile error pointed out by Hanjun.
> v1->v2: Add CC to related subsystem MAINTAINERS
> ---
>  drivers/gpu/drm/i915/intel_acpi.c  |   33 ---
>  drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |   13 ++---
>  drivers/gpu/drm/nouveau/nouveau_acpi.c |   25 +++---
>  drivers/pci/pci-label.c|   10 +--
>  4 files changed, 54 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_acpi.c 
> b/drivers/gpu/drm/i915/intel_acpi.c
> index dfff090..e7b526b 100644
> --- a/drivers/gpu/drm/i915/intel_acpi.c
> +++ b/drivers/gpu/drm/i915/intel_acpi.c
> @@ -31,11 +31,13 @@ static const u8 intel_dsm_guid[] = {
>  static int intel_dsm(acpi_handle handle, int func)
>  {
> struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
> +   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
> struct acpi_object_list input;
> union acpi_object params[4];
> union acpi_object *obj;
> u32 result;
> -   int ret = 0;
> +   acpi_status status;
> +   int ret;
>
> input.count = 4;
> input.pointer = params;
> @@ -50,10 +52,14 @@ static int intel_dsm(acpi_handle handle, int func)
> params[3].package.count = 0;
> params[3].package.elements = NULL;
>
> -   ret = acpi_evaluate_object(handle, "_DSM", &input, &output);
> -   if (ret) {
> -   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
> -   return ret;
> +   status = acpi_evaluate_object(handle, "_DSM", &input, &output);
> +   if (ACPI_FAILURE(status)) {
> +   acpi_get_name(handle, ACPI_FULL_PATHNAME, &string);
> +   DRM_DEBUG_DRIVER(
> +   "failed to evaluate _DSM for %s, exit status %u\n",
> +   (char *)string.pointer, (unsigned int)status);
> +   kfree(string.pointer);
> +   return -EINVAL;

I said "too bad there isn't an *easy* way" to include more
information.  IMHO this is too ugly and error-prone to use
consistently.  And if you are going to add more information, why did
you only do it for some of the calls and not others?

I considered adding a %p extension to print the pathname; I don't know
if that's worthwhile or not.  I think it would be ideal if we had a
struct device and could use dev_info(), and then a way to connect the
struct device with an ACPI path, like maybe a dmesg note when we
create the struct device corresponding to an ACPI Device node.

Bjorn

> }
>
> obj = (union acpi_object *)output.pointer;
> @@ -138,10 +144,12 @@ static char *intel_dsm_mux_type(u8 type)
>  static void intel_dsm_platform_mux_info(void)
>  {
> struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
> +   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
> struct acpi_object_list input;
> union acpi_object params[4];
> union acpi_object *pkg;
> -   int i, ret;
> +   acpi_status status;
> +   int i;
>
> input.count = 4;
> input.pointer = params;
> @@ -156,10 +164,15 @@ static void intel_dsm_platform_mux_info(void)
> params[3].package.count = 0;
> params[3].package.elements = NULL;
>
> -   ret = acpi_evaluate_object(intel_dsm_priv.dhandle, "_DSM", &input,
> -  &output);
> -   if (ret) {
> -   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
> +   acpi_status = acpi_evaluate_object(intel_dsm_priv.dhandle,
> +   "_DSM", &input, &output);
> +   if (ACPI_FAILURE(status)) {
> +   acpi_get_name(intel_dsm_priv.dhandle,
> +   ACPI_FULL_PATHNAME, &string);
> +   DRM_DEBUG_DRIVER(
> +   "failed to evaluate _DSM for %s, exit status %u\n",
> +   (char *)string.pointer, (unsigned int)status);
> +   kfree(string.pointer);
> goto

Re: [Intel-gfx] [PATCH v4] ACPI: Fix acpi_evaluate_object() return value check

2014-02-04 Thread Bjorn Helgaas
On Mon, Jan 20, 2014 at 7:46 PM, Yijing Wang  wrote:
> Since acpi_evaluate_object() returns acpi_status and not plain int,
> ACPI_FAILURE() should be used for checking its return value.
>
> Reviewed-by: Jani Nikula 
> Signed-off-by: Yijing Wang 
> ---
> v3->v4: Fix spell error, add Jani Nikula reviewed-by.
> v2->v3: Fix compile error pointed out by Hanjun.
> v1->v2: Add CC to related subsystem MAINTAINERS
> ---
>  drivers/gpu/drm/i915/intel_acpi.c  |   24 
> ++--
>  drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |9 +
>  drivers/gpu/drm/nouveau/nouveau_acpi.c |   23 +--
>  drivers/pci/pci-label.c|9 ++---

For the drivers/pci/pci-label.c part,

Acked-by: Bjorn Helgaas 

> +   status = acpi_evaluate_object(handle, "_DSM", &input, &output);
> +   if (ACPI_FAILURE(status)) {
> +   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %s\n",
> +   acpi_format_exception(status));

It's too bad there isn't an easy way to produce more informative error
messages, e.g., by including a namespace path or something.  A message
like:

failed to evaluate _DSM: A requested entity is not found

is only useful if there's enough context to figure out what's going on.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] ACPI: Fix acpi_evaluate_object() return value check

2014-02-04 Thread Bjorn Helgaas
On Fri, Jan 24, 2014 at 8:36 AM, Rafael J. Wysocki  wrote:
> On Friday, January 24, 2014 07:54:29 AM Bjorn Helgaas wrote:
>> On Thu, Jan 23, 2014 at 5:33 PM, Rafael J. Wysocki  
>> wrote:
>> > On Thursday, January 23, 2014 11:21:01 AM Bjorn Helgaas wrote:
>> >> On Wed, Jan 22, 2014 at 8:42 PM, Yijing Wang  
>> >> wrote:
>> >> > Since acpi_evaluate_object() returns acpi_status and not plain int,
>> >> > ACPI_FAILURE() should be used for checking its return value. Also
>> >> > add some detailed debug info when acpi_evaluate_object() failed.
>> >> >
>> >> > Reviewed-by: Jani Nikula 
>> >> > Acked-by: Bjorn Helgaas 
>> >> > Signed-off-by: Yijing Wang 
>> >> > ---
>> >> > v4->v5: Add some detailed debug info for acpi_evaluate_object()
>> >> > failure suggested by Bjorn.
>> >> > v3->v4: Fix spell error, add Jani Nikula reviewed-by.
>> >> > v2->v3: Fix compile error pointed out by Hanjun.
>> >> > v1->v2: Add CC to related subsystem MAINTAINERS
>> >> > ---
>> >> >  drivers/gpu/drm/i915/intel_acpi.c  |   33 
>> >> > ---
>> >> >  drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |   13 ++---
>> >> >  drivers/gpu/drm/nouveau/nouveau_acpi.c |   25 
>> >> > +++---
>> >> >  drivers/pci/pci-label.c|   10 +--
>> >> >  4 files changed, 54 insertions(+), 27 deletions(-)
>> >> >
>> >> > diff --git a/drivers/gpu/drm/i915/intel_acpi.c 
>> >> > b/drivers/gpu/drm/i915/intel_acpi.c
>> >> > index dfff090..e7b526b 100644
>> >> > --- a/drivers/gpu/drm/i915/intel_acpi.c
>> >> > +++ b/drivers/gpu/drm/i915/intel_acpi.c
>> >> > @@ -31,11 +31,13 @@ static const u8 intel_dsm_guid[] = {
>> >> >  static int intel_dsm(acpi_handle handle, int func)
>> >> >  {
>> >> > struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
>> >> > +   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
>> >> > struct acpi_object_list input;
>> >> > union acpi_object params[4];
>> >> > union acpi_object *obj;
>> >> > u32 result;
>> >> > -   int ret = 0;
>> >> > +   acpi_status status;
>> >> > +   int ret;
>> >> >
>> >> > input.count = 4;
>> >> > input.pointer = params;
>> >> > @@ -50,10 +52,14 @@ static int intel_dsm(acpi_handle handle, int func)
>> >> > params[3].package.count = 0;
>> >> > params[3].package.elements = NULL;
>> >> >
>> >> > -   ret = acpi_evaluate_object(handle, "_DSM", &input, &output);
>> >> > -   if (ret) {
>> >> > -   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
>> >> > -   return ret;
>> >> > +   status = acpi_evaluate_object(handle, "_DSM", &input, &output);
>> >> > +   if (ACPI_FAILURE(status)) {
>> >> > +   acpi_get_name(handle, ACPI_FULL_PATHNAME, &string);
>> >> > +   DRM_DEBUG_DRIVER(
>> >> > +   "failed to evaluate _DSM for %s, exit status 
>> >> > %u\n",
>> >> > +   (char *)string.pointer, (unsigned int)status);
>> >> > +   kfree(string.pointer);
>> >> > +   return -EINVAL;
>> >>
>> >> I said "too bad there isn't an *easy* way" to include more
>> >> information.  IMHO this is too ugly and error-prone to use
>> >> consistently.  And if you are going to add more information, why did
>> >> you only do it for some of the calls and not others?
>> >>
>> >> I considered adding a %p extension to print the pathname; I don't know
>> >> if that's worthwhile or not.  I think it would be ideal if we had a
>> >> struct device and could use dev_info(), and then a way to connect the
>> >> struct device with an ACPI path, like maybe a dmesg note when we
>> >> create the struct device corresponding to an ACPI Device node.
>> >
>> > Well, we can generally print something like that from pci_acpi_setup().
>> >
>> > What about the below?  Wouldn't it generate too much output on some 
>> > systems?
>>
>> Yeah, that probably would generate an awful lot of output.  I was just
>> hoping to avoid treating ACPI pathnames as first-class objects.  What
>> do you think about a %p extension?  I played with that once, but I
>> seem to have lost the patch.
>
> Well, it may be worth doing.  However, that information is readily available 
> from
> sysfs anyway, you only need to follow the firmware_node link in the PCI 
> device's
> sysfs directory and read the path attribute from there.  For example, on my
> system:
>
> $ cat /sys/devices/pci:00/:00:1c.4/:0b:00.0/firmware_node/path
> \_SB_.PCI0.RP05.PXSX

That's perfect.  If we had a struct device, we could just use
dev_info() for these messages.  But I have no idea how hard it would
be to get at the struct device.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v5] ACPI: Fix acpi_evaluate_object() return value check

2014-02-04 Thread Bjorn Helgaas
On Thu, Jan 23, 2014 at 5:33 PM, Rafael J. Wysocki  wrote:
> On Thursday, January 23, 2014 11:21:01 AM Bjorn Helgaas wrote:
>> On Wed, Jan 22, 2014 at 8:42 PM, Yijing Wang  wrote:
>> > Since acpi_evaluate_object() returns acpi_status and not plain int,
>> > ACPI_FAILURE() should be used for checking its return value. Also
>> > add some detailed debug info when acpi_evaluate_object() failed.
>> >
>> > Reviewed-by: Jani Nikula 
>> > Acked-by: Bjorn Helgaas 
>> > Signed-off-by: Yijing Wang 
>> > ---
>> > v4->v5: Add some detailed debug info for acpi_evaluate_object()
>> > failure suggested by Bjorn.
>> > v3->v4: Fix spell error, add Jani Nikula reviewed-by.
>> > v2->v3: Fix compile error pointed out by Hanjun.
>> > v1->v2: Add CC to related subsystem MAINTAINERS
>> > ---
>> >  drivers/gpu/drm/i915/intel_acpi.c  |   33 
>> > ---
>> >  drivers/gpu/drm/nouveau/core/subdev/mxm/base.c |   13 ++---
>> >  drivers/gpu/drm/nouveau/nouveau_acpi.c |   25 +++---
>> >  drivers/pci/pci-label.c|   10 +--
>> >  4 files changed, 54 insertions(+), 27 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/intel_acpi.c 
>> > b/drivers/gpu/drm/i915/intel_acpi.c
>> > index dfff090..e7b526b 100644
>> > --- a/drivers/gpu/drm/i915/intel_acpi.c
>> > +++ b/drivers/gpu/drm/i915/intel_acpi.c
>> > @@ -31,11 +31,13 @@ static const u8 intel_dsm_guid[] = {
>> >  static int intel_dsm(acpi_handle handle, int func)
>> >  {
>> > struct acpi_buffer output = { ACPI_ALLOCATE_BUFFER, NULL };
>> > +   struct acpi_buffer string = { ACPI_ALLOCATE_BUFFER, NULL };
>> > struct acpi_object_list input;
>> > union acpi_object params[4];
>> > union acpi_object *obj;
>> > u32 result;
>> > -   int ret = 0;
>> > +   acpi_status status;
>> > +   int ret;
>> >
>> > input.count = 4;
>> > input.pointer = params;
>> > @@ -50,10 +52,14 @@ static int intel_dsm(acpi_handle handle, int func)
>> > params[3].package.count = 0;
>> > params[3].package.elements = NULL;
>> >
>> > -   ret = acpi_evaluate_object(handle, "_DSM", &input, &output);
>> > -   if (ret) {
>> > -   DRM_DEBUG_DRIVER("failed to evaluate _DSM: %d\n", ret);
>> > -   return ret;
>> > +   status = acpi_evaluate_object(handle, "_DSM", &input, &output);
>> > +   if (ACPI_FAILURE(status)) {
>> > +   acpi_get_name(handle, ACPI_FULL_PATHNAME, &string);
>> > +   DRM_DEBUG_DRIVER(
>> > +   "failed to evaluate _DSM for %s, exit status %u\n",
>> > +   (char *)string.pointer, (unsigned int)status);
>> > +   kfree(string.pointer);
>> > +   return -EINVAL;
>>
>> I said "too bad there isn't an *easy* way" to include more
>> information.  IMHO this is too ugly and error-prone to use
>> consistently.  And if you are going to add more information, why did
>> you only do it for some of the calls and not others?
>>
>> I considered adding a %p extension to print the pathname; I don't know
>> if that's worthwhile or not.  I think it would be ideal if we had a
>> struct device and could use dev_info(), and then a way to connect the
>> struct device with an ACPI path, like maybe a dmesg note when we
>> create the struct device corresponding to an ACPI Device node.
>
> Well, we can generally print something like that from pci_acpi_setup().
>
> What about the below?  Wouldn't it generate too much output on some systems?

Yeah, that probably would generate an awful lot of output.  I was just
hoping to avoid treating ACPI pathnames as first-class objects.  What
do you think about a %p extension?  I played with that once, but I
seem to have lost the patch.

> ---
>  drivers/pci/pci-acpi.c |2 ++
>  1 file changed, 2 insertions(+)
>
> Index: linux-pm/drivers/pci/pci-acpi.c
> ===
> --- linux-pm.orig/drivers/pci/pci-acpi.c
> +++ linux-pm/drivers/pci/pci-acpi.c
> @@ -330,6 +330,8 @@ static void pci_acpi_setup(struct device
> if (!adev)
> return;
>
> +   acpi_handle_info(adev->handle, "bound to %s\n", dev_name(dev));
> +
> pci_acpi_add_pm_notifier(adev, pci_dev);
> if (!adev->wakeup.flags.valid)
> return;
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] git pull] drm for v4.1-rc1

2015-04-21 Thread Bjorn Helgaas
On Tue, Apr 21, 2015 at 11:07 AM, Linus Torvalds
 wrote:
> Hmm. The odd Intel PCI resource mess is back.
>
> Or maybe it never went away.
>
> I get these when suspending. Things *work*, but it's really spamming
> my logs a fair bit:
>
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   pci_bus :01: Allocating resources
>   pci_bus :02: Allocating resources
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>
> That resource is complete garbage. "flags 0x2" is not even a valid
> flag value. I'm *guessing* it might be IORESOURCE_ROM_SHADOW, but if
> that is valid, then it should also have have had the IORESOURCE_MEM
> bit, and it doesn't.
>
> (The low 8 bits of the resource flags depend on the high bits, which
> is why I say that I'm "guessing" at that ROM_SHADOW bit. It could be
> something else, like a IORESOURCE_MEM_CACHEABLE PnP bit - but that
> should not show up for PCI, and "BAR 6" is normally the ROM resource,
> so the ROM_SHADOW bit makes some sense.
>
> The only place that sets IORESOURCE_ROM_SHADOW that I find is the x86
> pci_fixup_video() function. That one checks for PCI_COMMAND_IO |
> PCI_COMMAND_MEMORY in the PCI command word, though. Why are the other
> bits not set?
>
> Both i915/dri people and PCI people on the Cc. This warning does *not*
> happen at bootup, but only at suspend time. So my suspicion is that
> somebody messes with the PCI ROM resource, and disables it or
> something, but the IORESOURCE_ROM_SHADOW never gets cleared. And then
> because res->flags is non-zero, the PCI scanning code doesn't ignore
> the resource.
>
> Just before the whole bogus alignment check, the PCI code does
>
> if (!(r->flags) || r->parent)
> continue;
>
> (don't ask me about the odd parenthesis) which *should* have
> triggered, but that IORESOURCE_ROM_SHADOW bit screws things up.
>
> Anybody?

I'll look into this.  It's been around a long time, but hasn't
percolated to the top of my list until now.

https://bugzilla.kernel.org/show_bug.cgi?id=16063 says "echo 1
>/sys/bus/pci/rescan" is also a reproducer.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] git pull] drm for v4.1-rc1

2015-04-23 Thread Bjorn Helgaas
[+cc Matthew]

On Tue, Apr 21, 2015 at 09:07:45AM -0700, Linus Torvalds wrote:
> Hmm. The odd Intel PCI resource mess is back.
> 
> Or maybe it never went away.
> 
> I get these when suspending. Things *work*, but it's really spamming
> my logs a fair bit:
> 
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   pci_bus :01: Allocating resources
>   pci_bus :02: Allocating resources
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
>   i915 :00:02.0: BAR 6: [??? 0x flags 0x2] has bogus alignment
> 
> That resource is complete garbage. "flags 0x2" is not even a valid
> flag value. I'm *guessing* it might be IORESOURCE_ROM_SHADOW, but if
> that is valid, then it should also have have had the IORESOURCE_MEM
> bit, and it doesn't.

Your i915 does not have a ROM BAR in hardware.  If the default video
device has no ROM BAR, pci_fixup_video() sets IORESOURCE_ROM_SHADOW
even though the resource flags are zero because the BAR itself doesn't
exist.

If IORESOURCE_ROM_SHADOW is set, pci_map_rom() assumes there's a
shadow ROM image at 0xC.  Is there a shadow image even if the
device itself doesn't have a ROM BAR?

We could fabricate a resource even if the BAR doesn't exist, e.g.,
"flags = IORESOURCE_MEM | ... | IORESOURCE_ROM_SHADOW", but that
would be ugly and error-prone in other places that use the ROM.

Matthew added dev->rom for ROM images supplied by the platform
(84c1b80e3263 ("PCI: Add support for non-BAR ROMs")).  A shadow
image seems like a similar thing.  I think it would be cleaner to get
rid of IORESOURCE_ROM_SHADOW altogether and instead set "dev->rom =
0xC" if there's a shadow image, e.g.:

  int pcibios_add_device(struct pci_dev *dev)
  {
if (dev-is-default-vga-device) {
  dev->rom = 0xC;
  dev->romlen = 0x2;
}

pa_data = boot_params.hdr.setup_data;
while (pa_data) {
  ...
  if (data->type == SETUP_PCI) {
rom = (struct pci_setup_rom *)data;

if (dev-is-rom-dev) {
  dev->rom = ...
  dev->romlen = rom->pcilen;
}
  }
}

But the rules for figuring out which image to use seem ...
complicated.

Bjorn
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] PCI: Add another ID for Intel GPU "spurious interrupt" quirk

2014-09-05 Thread Bjorn Helgaas
On Fri, Aug 08, 2014 at 03:54:04PM +0200, Thomas Jarosch wrote:
> New Intel G3258 CPU, new MSI board, same problem:
> The GPU interrupt fired like crazy on monitor unplug.
> 
> lspci output:
> 00:02.0 VGA compatible controller: Intel Corporation Device 0402 (rev 06)
> Subsystem: Micro-Star International Co., Ltd. Device 7817
> Flags: bus master, fast devsel, latency 0, IRQ 11
> 
> Signed-off-by: Thomas Jarosch 
> Tested-by: Thomas Jarosch 
> CC: sta...@vger.kernel.org  # v3.4+

Dropping for now, please resend if necessary after resolving Daniel's
question.

> ---
>  drivers/pci/quirks.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 80c2d01..d2ff39d 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2940,6 +2940,7 @@ static void disable_igfx_irq(struct pci_dev *dev)
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
>  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0402, disable_igfx_irq);
>  
>  /*
>   * PCI devices which are on Intel chips can skip the 10ms delay
> -- 
> 1.9.3
> 
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [PATCH 6/6] drm/i915/pm: Use pci_dev->skip_bus_pm for hibernate vs. D3 workaround

2024-09-25 Thread Bjorn Helgaas
On Wed, Sep 25, 2024 at 05:45:26PM +0300, Ville Syrjala wrote:
> From: Ville Syrjälä 
> 
> On some older laptops we have to leave the device in D0
> during hibernation, or else the BIOS just hangs and never
> finishes the hibernation.
> 
> Currently we are achieving that by skipping the
> pci_set_power_state(D3). However we also need to call
> pci_save_state() ahead of time, or else
> pci_pm_suspend_noirq() will do the pci_set_power_state(D3)
> anyway.
> 
> This is all rather ugly, and might cause us to deviate from
> standard pci pm behaviour in unknown ways since we always
> call pci_save_state() for any kind of suspend operation.
> 
> Stop calling pci_save_state()+pci_set_power_state() entirely
> (apart from the switcheroo paths) and instead set
> pci_dev->skip_bus_pm=true to prevent the D3 during hibernation
> on old machines. Apart from that we'll just let the normal
> pci pm code take care of everything for us.
> 
> Cc: Bjorn Helgaas 
> Cc: "Rafael J. Wysocki" 
> Cc: Rodrigo Vivi 
> Cc: linux-...@vger.kernel.org
> Cc: intel-gfx@lists.freedesktop.org
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/gpu/drm/i915/i915_driver.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_driver.c 
> b/drivers/gpu/drm/i915/i915_driver.c
> index c3e7225ea1ba..05948d00a874 100644
> --- a/drivers/gpu/drm/i915/i915_driver.c
> +++ b/drivers/gpu/drm/i915/i915_driver.c
> @@ -1123,13 +1123,9 @@ static int i915_drm_suspend_noirq(struct drm_device 
> *dev, bool hibernation)
>* Lenovo Thinkpad X301, X61s, X60, T60, X41
>* Fujitsu FSC S7110
>* Acer Aspire 1830T
> -  *
> -  * pci_save_state() is needed to prevent driver/pci from
> -  * automagically putting the device into D3.
>*/
> - pci_save_state(pdev);
> - if (!(hibernation && GRAPHICS_VER(dev_priv) < 6))
> - pci_set_power_state(pdev, PCI_D3hot);
> + if (hibernation && GRAPHICS_VER(dev_priv) < 6)
> + pdev->skip_bus_pm = true;

.skip_bus_pm was previously strictly internal to
drivers/pci/pci-driver.c.  Not sure about using it outside
drivers/pci/; would want Rafael to chime in.

>   return 0;
>  }
> @@ -1137,6 +1133,7 @@ static int i915_drm_suspend_noirq(struct drm_device 
> *dev, bool hibernation)
>  int i915_driver_suspend_switcheroo(struct drm_i915_private *i915,
>  pm_message_t state)
>  {
> + struct pci_dev *pdev = to_pci_dev(i915->drm.dev);
>   int error;
>  
>   if (drm_WARN_ON_ONCE(&i915->drm, state.event != PM_EVENT_SUSPEND &&
> @@ -1158,6 +1155,9 @@ int i915_driver_suspend_switcheroo(struct 
> drm_i915_private *i915,
>   if (error)
>   return error;
>  
> + pci_save_state(pdev);
> + pci_set_power_state(pdev, PCI_D3hot);
> +
>   return 0;
>  }
>  
> -- 
> 2.44.2
> 


Re: [PATCH 4/6] drm/i915/pm: Move the hibernate+D3 quirk stuff into noirq() pm hooks

2024-09-25 Thread Bjorn Helgaas
On Wed, Sep 25, 2024 at 05:45:24PM +0300, Ville Syrjala wrote:
> From: Ville Syrjälä 
> 
> If the driver doesn't call pci_save_state() driver/pci will
> normally save+power manage the device from the _noirq() pm
> hooks.

drivers/pci.  Or maybe just mention the PM hook names specifically.

> We can't let that happen as some old BIOSes fail to hibernate
> when the device is in D3. However, we can get a bit closer to
> the standard behaviour by doing our explicit pci_save_state()
> and pci_set_power_state() stuff from driver provided _noirq()
> hooks as well.
> 
> This results in a change of behaviur where we no longer go
> into D3 at the end of .freeze_late(), so when it comes time
> to .thaw() we'll already be in D0, and thus we can drop the
> explicit pci_set_power_state(D0) call.

s/behaviur/behaviour/

> Presumable swictcheroo suspend will want to go into D3 so
> call the _noirq() stuff from the switcheroo suspend hook,
> and since we dropped the pci_set_power_state(D0) from
> .resume_early() we'll need to add one back into the
> swtcheroo resume hook.

s/swictcheroo/switcheroo/
s/swtcheroo/switcheroo/

Or maybe just use the actual function names here too.  That saves
time for me, at least, because it points me at exactly where to look.

Bjorn


Re: [PATCH 1/6] PCI/PM: Respect pci_dev->skip_bus_pm in the .poweroff() path

2024-09-30 Thread Bjorn Helgaas
On Thu, Sep 26, 2024 at 07:03:16PM +0300, Ville Syrjälä wrote:
> On Wed, Sep 25, 2024 at 02:28:42PM -0500, Bjorn Helgaas wrote:
> > On Wed, Sep 25, 2024 at 05:45:21PM +0300, Ville Syrjala wrote:
> > > From: Ville Syrjälä 
> > > 
> > > On some older laptops i915 needs to leave the GPU in
> > > D0 when hibernating the system, or else the BIOS
> > > hangs somewhere. Currently that is achieved by calling
> > > pci_save_state() ahead of time, which then skips the
> > > whole pci_prepare_to_sleep() stuff.

> > If there's a general requirement to leave all devices in D0 when
> > hibernating, it would be nice to have have some documentation like an
> > ACPI spec reference.
> 
> No, IIRC the ACPI spec even says that you must (or at least
> should) put devices into D3. But the buggy BIOS on some old
> laptops keels over when you do that. Hence we need this quirk.

Can we include a reference to this part of the ACPI spec and some
details on which laptops have this issue?

I'm a little bit wary of changing the PCI core in a generic-looking
way on the basis of some unspecified buggy old BIOS.  That feels like
something we're likely to break in the future.

> > Or if this is some i915-specific thing, maybe a pointer to history
> > like a lore or bugzilla reference.
> 
> The two relevant commits I can find are:
> 
> commit 54875571bbfd ("drm/i915: apply the PCI_D0/D3 hibernation
> workaround everywhere on pre GEN6")
> commit ab3be73fa7b4 ("drm/i915: gen4: work around hang during
> hibernation")

Thanks, this feels like important history to include somewhere.

> > IIUC this is a cleanup that doesn't fix any known problem?  The
> > overall diffstat doesn't make it look like a simplification, although
> > it might certainly be cleaner somehow:
> 
> My main concern is that using pci_save_state() might cause the pci
> code to deviate from the normal path in more ways than just skipping
> the D0->D3 transition. And then we might end up constantly chasing
> after driver/pci changes in order to match its behaviour.
> 
> Not to mention that having the pci_save_state() in the driver code
> is clearly confusing a bunch of our developers.

I'm all in favor of removing pci_save_state() from drivers when
possible.  I take it that this doesn't fix a functional issue.

Bjorn


Re: [PATCH 1/6] PCI/PM: Respect pci_dev->skip_bus_pm in the .poweroff() path

2024-09-25 Thread Bjorn Helgaas
On Wed, Sep 25, 2024 at 05:45:21PM +0300, Ville Syrjala wrote:
> From: Ville Syrjälä 
> 
> On some older laptops i915 needs to leave the GPU in
> D0 when hibernating the system, or else the BIOS
> hangs somewhere. Currently that is achieved by calling
> pci_save_state() ahead of time, which then skips the
> whole pci_prepare_to_sleep() stuff.

IIUC this refers to pci_pm_suspend_noirq(), which has this:

  if (!pci_dev->state_saved) {
pci_save_state(pci_dev);
if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
  pci_prepare_to_sleep(pci_dev);
  }

Would be good if the commit log included the name of the function
where pci_prepare_to_sleep() is skipped.

If there's a general requirement to leave all devices in D0 when
hibernating, it would be nice to have have some documentation like an
ACPI spec reference.

Or if this is some i915-specific thing, maybe a pointer to history
like a lore or bugzilla reference.

> It feels to me that this approach could lead to unintended
> side effects as it causes the pci code to deviate from the
> standard path in various ways. In order to keep i915
> behaviour more standard it seems preferrable to use
> pci_dev->skip_bus_pm here. Duplicate the relevant logic
> from pci_pm_suspend_noirq() in pci_pm_poweroff_noirq().
> 
> It also looks like the current code is may put the parent
> bridge into D3 despite leaving the device in D0. Though
> perhaps the host bridge (which is where the integrated
> GPU lives) always has subordinates, which would make
> this a non-issue for i915. But maybe this could be a
> problem for other devices. Utilizing skip_bus_pm will
> make the behaviour of leaving the bridge in D0 a bit
> more explicit if nothing else.

s/is may/may/

Rewrap to fill 75 columns.  Could apply to all patches in the series.

Will need an ack from Rafael, author of:

  d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue")
  3e26c5feed2a ("PCI: PM: Skip devices in D0 for suspend-to-idle")

which added .skip_bus_pm and its use in pci_pm_suspend_noirq().

IIUC this is a cleanup that doesn't fix any known problem?  The
overall diffstat doesn't make it look like a simplification, although
it might certainly be cleaner somehow:

> drivers/gpu/drm/i915/i915_driver.c | 121 +++------
> drivers/pci/pci-driver.c   |  16 +++-
> 2 files changed, 94 insertions(+), 43 deletions(-)

> Cc: Bjorn Helgaas 
> Cc: "Rafael J. Wysocki" 
> Cc: Rodrigo Vivi 
> Cc: linux-...@vger.kernel.org
> Cc: intel-gfx@lists.freedesktop.org
> Signed-off-by: Ville Syrjälä 
> ---
>  drivers/pci/pci-driver.c | 16 +++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index f412ef73a6e4..ef436895939c 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1142,6 +1142,8 @@ static int pci_pm_poweroff(struct device *dev)
>   struct pci_dev *pci_dev = to_pci_dev(dev);
>   const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>  
> + pci_dev->skip_bus_pm = false;
> +
>   if (pci_has_legacy_pm_support(pci_dev))
>   return pci_legacy_suspend(dev, PMSG_HIBERNATE);
>  
> @@ -1206,9 +1208,21 @@ static int pci_pm_poweroff_noirq(struct device *dev)
>   return error;
>   }
>  
> - if (!pci_dev->state_saved && !pci_has_subordinate(pci_dev))
> + if (!pci_dev->state_saved && !pci_dev->skip_bus_pm &&
> + !pci_has_subordinate(pci_dev))
>   pci_prepare_to_sleep(pci_dev);
>  
> + if (pci_dev->current_state == PCI_D0) {
> + pci_dev->skip_bus_pm = true;
> + /*
> +  * Per PCI PM r1.2, table 6-1, a bridge must be in D0 if any
> +  * downstream device is in D0, so avoid changing the power state
> +  * of the parent bridge by setting the skip_bus_pm flag for it.
> +  */
> + if (pci_dev->bus->self)
> + pci_dev->bus->self->skip_bus_pm = true;
> + }
> +
>   /*
>* The reason for doing this here is the same as for the analogous code
>* in pci_pm_suspend_noirq().
> -- 
> 2.44.2
> 


Re: Regression on linux-next (next-20250221)

2025-02-26 Thread Bjorn Helgaas
On Wed, Feb 26, 2025 at 07:36:19PM +, Borah, Chaitanya Kumar wrote:
> Hello Bjorn,
> 
> Hope you are doing well. I am Chaitanya from the linux graphics team in Intel.
> 
> This mail is regarding a regression we are seeing in our CI runs[1] on 
> linux-next repository.
> 
> Since the version next-20250221 [2], we are seeing that some of the machines 
> in our CI are unable to connect through ssh (and therefore unable to 
> participate).
> Looking at the logs we see this.
> 
> `
> [5.838496] e1000e: Intel(R) PRO/1000 Network Driver
> [5.838515] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
> [5.838737] e1000e :01:00.0: Disabling ASPM  L1
> [5.840055] e1000e :01:00.0: probe with driver e1000e failed with 
> error -12
> `
> After bisecting the tree, the following patch [3] seems to be the first "bad"
> commit
> 
> `````````
> commit 7d90d8d2bb1bfff8b33acbb6f815cba6f5250fad
> Author: Bjorn Helgaas mailto:bhelg...@google.com
> Date:   Fri Feb 14 18:03:00 2025 -0600
> 
>     PCI: Avoid pointless capability searches
> 
>     Many of the save/restore functions in the pci_save_state() and
>     pci_restore_state() paths depend on both a PCI capability of the device 
> and
>     a pci_cap_saved_state structure to hold the configuration data, and they
>     skip the operation if either is missing.
> `
> 
> We verified that if we revert the patch the issue is not seen.

Sorry about this; this patch was dropped in next-20250224

Bjorn