Re: Debian SID kernel doesn't boot on PowerBook 3400c

2021-08-03 Thread Christophe Leroy




On 02/08/2021 at 19:32, Stan Johnson wrote:

On 8/2/21 8:41 AM, Christophe Leroy wrote:



On 31/07/2021 at 20:24, Stan Johnson wrote:

Hi Christophe,

On 7/31/21 9:58 AM, Christophe Leroy wrote:

Stan Johnson wrote:


Hello,

The current Debian SID kernel will not boot on a PowerBook 3400c running
the latest version of Debian SID. If booted using the BootX extension,
the kernel hangs immediately:

"Welcome to Linux, kernel 5.10.0-8-powerpc"

If booted from Mac OS, the Mac OS screen hangs.

Booting also hangs if the "No video driver" option is selected in BootX;
"No video driver" causes "video=ofonly" to be passed to the kernel.

This is the current command line that I'm using in BootX:
root=/dev/sda13 video=chips65550:vmode:14,cmode:16

Kernel v5.9 works as expected.

The config file I'm using is attached.

Here are the results of a git bisect, marking v5.9 as "good" and the
most current kernel as "bad":

$ cd linux
$ git remote update
$ git bisect reset
$ git bisect start
$ git bisect bad
$ git bisect good v5.9

Note: "bad" -> hangs at boot; "good" -> boots to login prompt

   1) 5.11.0-rc5-pmac-00034-g684da7628d9 (bad)
   2) 5.10.0-rc3-pmac-00383-gbb9dd3ce617 (good)
   3) 5.10.0-pmac-06637-g2911ed9f47b (good)
  Note: I had to disable SMP to build this kernel.
   4) 5.10.0-pmac-10584-g9805529ec54 (good)
  Note: I had to disable SMP to build this kernel.
   5) 5.10.0-pmac-12577-g8552d28e140 (bad)
   6) 5.10.0-pmac-11576-g8a5be36b930 (bad)
   7) 5.10.0-pmac-11044-gbe695ee29e8 (good)
  Note: I had to disable SMP to build this kernel.
   8) 5.10.0-rc2-pmac-00288-g59d512e4374 (bad)
   9) 5.10.0-rc2-pmac-00155-gc3d35ddd1ec (good)
10) 5.10.0-rc2-pmac-00221-g7049b288ea8 (good)
11) 5.10.0-rc2-pmac-00254-g4b74a35fc7e (bad)
12) 5.10.0-rc2-pmac-00237-ged22bb8d39f (good)
13) 5.10.0-rc2-pmac-00245-g87b57ea7e10 (good)
14) 5.10.0-rc2-pmac-00249-gf10881a46f8 (bad)
15) 5.10.0-rc2-pmac-00247-gf8a4b277c3c (good)
16) 5.10.0-rc2-pmac-00248-gdb972a3787d (bad)

db972a3787d12b1ce9ba7a31ec376d8a79e04c47 is the first bad commit


Not sure this is really the root of the problem.

Can you try again without CONFIG_VMAP_STACK ?

Thanks
Christophe
...



With CONFIG_VMAP_STACK=y, 5.11.0-rc5-pmac-00034-g684da7628d9 hangs at
boot on the PB 3400c.

Without CONFIG_VMAP_STACK, 5.11.0-rc5-pmac-00034-g684da7628d9 boots as
expected.

I didn't re-build the Debian SID kernel, though I confirmed that the
Debian config file for 5.10.0-8-powerpc includes CONFIG_VMAP_STACK=y.
It's not clear whether removing CONFIG_VMAP_STACK would be appropriate
for other powerpc systems.

Please let me know why removing CONFIG_VMAP_STACK fixed the problem on
the PB 3400c. Should CONFIG_HAVE_ARCH_VMAP_STACK also be removed?



When CONFIG_HAVE_ARCH_VMAP_STACK is selected by the architecture,
CONFIG_VMAP_STACK  is selected by default.

The point is that your config has CONFIG_ADB_PMU.

A bug with VMAP stack was detected during the 5.9 release cycle for
platforms selecting CONFIG_ADB_PMU. Because fixing the bug was a heavy
change, we preferred at that time to disable VMAP stack, so VMAP stack
was deselected for CONFIG_ADB_PMU by commit
4a133eb351ccc275683ad49305d0b04dde903733.

Then as a second step, the proper fix was implemented and then VMAP
stack was enabled again by the commit you bisected.

Taking into account that the problem disappears for you when you
manually deselect VMAP stacks, it means the problem is not the fix
itself, but the fact that VMAP stacks are now enabled by default.

We need to understand why VMAP stack doesn't work on your platform and,
more than that, why it doesn't boot at all with VMAP stack.

Could you send me the dmesg output of your system when it properly boots ?

Did you check with kernel 5.13 ?

Thanks
Christophe



Christophe,

Thanks for your response. It looks like I never tested v5.13 (I was
originally just reporting that the default Debian SID kernel,
5.10.0-8-powerpc, hangs at boot on the PB 3400c).

So I rebuilt the stock v5.13 from kernel.org using Finn's
dot-config-powermac-5.13, which got changed slightly at compilation (see
dot-config-v5.13-pmac, attached). It has CONFIG_VMAP_STACK and
CONFIG_ADB_PMU set, and it booted, but there were multiple memory
errors. So it looks like the hang-at-boot problem was fixed sometime
after v5.11, but there are now memory errors (similar to Wallstreet).

With CONFIG_VMAP_STACK not set (CONFIG_ADB_PMU is still set), the
.config file turns into the attached dot-config-v5.13-pmac_NO_VMAP. And
there were still memory errors (dmesg output attached).

The memory errors may be a completely unrelated issue, since they occur
regardless of the CONFIG_VMAP_STACK setting.

To help rule out a hardware issue, I confirmed that memory errors don't
occur with v5.8.2 (dmesg output attached).

A useful git bisect might be possible if CONFIG_VMAP_STACK is disabled
for each build. I would need to determine where the memory errors
started (v5.9, v5.10, v5.11, or v5.12). There is the complication t

Re: [PATCH] powerpc/kexec: blacklist functions called in real mode for kprobe

2021-08-03 Thread Michael Ellerman
On Wed, 14 Jul 2021 18:17:58 +0530, Hari Bathini wrote:
> As kprobe does not handle events happening in real mode, blacklist the
> functions that only get called in real mode or in kexec sequence with
> MMU turned off.

Applied to powerpc/next.

[1/1] powerpc/kexec: blacklist functions called in real mode for kprobe
  https://git.kernel.org/powerpc/c/8119cefd9a29b71997e62b762932d23499ba4896

cheers


Re: [PATCH] powerpc/stacktrace: Include linux/delay.h

2021-08-03 Thread Michael Ellerman
On Thu, 29 Jul 2021 20:01:03 +0200, Michal Suchanek wrote:
> commit 7c6986ade69e ("powerpc/stacktrace: Fix spurious "stale" traces in 
> raise_backtrace_ipi()")
> introduces udelay() call without including the linux/delay.h header.
> This may happen to work on master but the header that declares the
> function should be included nonetheless.

Applied to powerpc/next.

[1/1] powerpc/stacktrace: Include linux/delay.h
  https://git.kernel.org/powerpc/c/135462ae7692a824e5b63299178684fca3a366e6

cheers


Re: [PATCH v5 0/2] cpuidle/pseries: cleanup of the CEDE0 latency fixup code

2021-08-03 Thread Michael Ellerman
On Mon, 19 Jul 2021 12:03:17 +0530, Gautham R. Shenoy wrote:
> 
> Hi,
> 
> This is the v5 of the patchset to fixup CEDE0 latency only from
> POWER10 onwards.
> 
> 
> [...]

Applied to powerpc/next.

[1/2] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards
  https://git.kernel.org/powerpc/c/7cbd631d4decfd78f8a17196dce9abcd4e7e1e94
[2/2] cpuidle/pseries: Do not cap the CEDE0 latency in fixup_cede0_latency()
  https://git.kernel.org/powerpc/c/4bceb03859c1a508669ce542c649c8d4f5d4bd93

cheers


Re: [PATCH 2/3] selftests/powerpc: Add test for real address error handling

2021-08-03 Thread Michael Ellerman
Ganesh Goudar  writes:
> Add test for real address or control memory address access
> error handling, using NX-GZIP engine.
>
> The error is injected by accessing the control memory address
> using illegal instruction, on successful handling the process
> attempting to access control memory address using illegal
> instruction receives SIGBUS.
>
> Signed-off-by: Ganesh Goudar 
> ---
>  tools/testing/selftests/powerpc/Makefile  |  3 +-
>  tools/testing/selftests/powerpc/mce/Makefile  |  6 +++
>  .../selftests/powerpc/mce/inject-ra-err.c | 42 +++
>  .../selftests/powerpc/mce/inject-ra-err.sh| 19 +
>  4 files changed, 69 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/powerpc/mce/Makefile
>  create mode 100644 tools/testing/selftests/powerpc/mce/inject-ra-err.c
>  create mode 100755 tools/testing/selftests/powerpc/mce/inject-ra-err.sh

This breaks the selftests build:

  https://github.com/ruscur/linux-ci/runs/3204665920?check_suite_focus=true

  make[2]: Entering directory '/linux/tools/testing/selftests/powerpc/mce'
  powerpc-linux-gnu-gcc -std=gnu99 -O2 -Wall -Werror -DGIT_VERSION='"77349a6"' -I/linux/tools/testing/selftests/powerpc/include inject-ra-err.c -o /output/kselftest/powerpc/mce/inject-ra-err
  Error: inject-ra-err.c:11:25: fatal error: asm/vas-api.h: No such file or directory

cheers


[PATCH v2 6/6] PCI: Drop duplicated tracking of a pci_dev's bound driver

2021-08-03 Thread Uwe Kleine-König
Currently the driver bound to a given PCI device is tracked twice. Now
that all users of the PCI-specific one (struct pci_dev::driver) are
updated to use an access macro (pci_driver_of_dev()), change the macro to
use the information from the driver core and remove the driver member
from struct pci_dev.

Signed-off-by: Uwe Kleine-König 
---
 drivers/pci/pci-driver.c | 4 
 include/linux/pci.h  | 3 +--
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 740d5bf5d411..5d950eb476e2 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -305,12 +305,10 @@ static long local_pci_probe(void *_ddi)
 * its remove routine.
 */
pm_runtime_get_sync(dev);
-   pci_dev->driver = pci_drv;
rc = pci_drv->probe(pci_dev, ddi->id);
if (!rc)
return rc;
if (rc < 0) {
-   pci_dev->driver = NULL;
pm_runtime_put_sync(dev);
return rc;
}
@@ -376,7 +374,6 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
  * @pci_dev: PCI device being probed
  *
  * returns 0 on success, else error.
- * side-effect: pci_dev->driver is set to drv when drv claims pci_dev.
  */
 static int __pci_device_probe(struct pci_driver *drv, struct pci_dev *pci_dev)
 {
@@ -451,7 +448,6 @@ static int pci_device_remove(struct device *dev)
pm_runtime_put_noidle(dev);
}
pcibios_free_irq(pci_dev);
-   pci_dev->driver = NULL;
pci_iov_remove(pci_dev);
 
/* Undo the runtime PM settings in local_pci_probe() */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 778f3b5e6f23..f44ab76e216f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -342,7 +342,6 @@ struct pci_dev {
u16 pcie_flags_reg; /* Cached PCIe Capabilities Register */
unsigned long   *dma_alias_mask;/* Mask of enabled devfn aliases */
 
-   struct pci_driver *driver;  /* Driver bound to this device */
u64 dma_mask;   /* Mask of the bits of bus address this
   device implements.  Normally this is
   0x.  You only need to change
@@ -887,7 +886,7 @@ struct pci_driver {
 };
 
 #define to_pci_driver(drv) container_of(drv, struct pci_driver, driver)
-#define pci_driver_of_dev(pdev) ((pdev)->driver)
+#define pci_driver_of_dev(pdev) ((pdev)->dev.driver ? to_pci_driver((pdev)->dev.driver) : NULL)
 
 /**
  * PCI_DEVICE - macro used to describe a specific PCI device
-- 
2.30.2
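
The final form of the macro can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: the struct definitions are stripped-down stand-ins that keep only the fields the macro touches, container_of() is re-implemented with offsetof(), and demo_pci_driver_of_dev() is a hypothetical helper added for illustration. It shows that the bound struct pci_driver can be recovered from the generic dev.driver pointer alone, and that an unbound device yields NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields needed by the macro are modelled here). */
struct device_driver { const char *name; };
struct device { struct device_driver *driver; };
struct pci_driver { const char *name; struct device_driver driver; };
struct pci_dev { struct device dev; };

/* User-space re-implementation of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define to_pci_driver(drv) container_of(drv, struct pci_driver, driver)

/* Same expression as the macro introduced by the patch. */
#define pci_driver_of_dev(pdev) \
	((pdev)->dev.driver ? to_pci_driver((pdev)->dev.driver) : NULL)

/* Hypothetical demo: bind, look up, unbind. */
void demo_pci_driver_of_dev(void)
{
	struct pci_driver drv = { .name = "demo", .driver = { .name = "demo" } };
	struct pci_dev pdev = { .dev = { .driver = NULL } };

	/* Unbound device: no driver to report. */
	assert(pci_driver_of_dev(&pdev) == NULL);

	/* The driver core only stores the embedded struct device_driver. */
	pdev.dev.driver = &drv.driver;
	assert(pci_driver_of_dev(&pdev) == &drv);

	/* Unbinding clears dev.driver; no second pointer needs resetting. */
	pdev.dev.driver = NULL;
	assert(pci_driver_of_dev(&pdev) == NULL);
}
```

Calling demo_pci_driver_of_dev() from a main() function exercises all three cases. The point of the design is visible in the last step: with a single source of truth there is nothing left to keep in sync on probe failure or remove.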



[PATCH v2 4/6] PCI: Provide wrapper to access a pci_dev's bound driver

2021-08-03 Thread Uwe Kleine-König
Which driver a device is bound to is available twice: In struct
pci_dev::dev->driver and in struct pci_dev::driver. To get rid of the
duplication introduce a wrapper to access struct pci_dev's driver
member. Once all users are converted the wrapper can be changed to
calculate the driver using pci_dev::dev->driver.

Signed-off-by: Uwe Kleine-König 
---
 include/linux/pci.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377ca8f6..778f3b5e6f23 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -887,6 +887,7 @@ struct pci_driver {
 };
 
 #define to_pci_driver(drv) container_of(drv, struct pci_driver, driver)
+#define pci_driver_of_dev(pdev) ((pdev)->driver)
 
 /**
  * PCI_DEVICE - macro used to describe a specific PCI device
-- 
2.30.2



[PATCH v2 0/6] PCI: Drop duplicated tracking of a pci_dev's bound driver

2021-08-03 Thread Uwe Kleine-König
Hello,

changes since v1 
(https://lore.kernel.org/linux-pci/20210729203740.1377045-1-u.kleine-koe...@pengutronix.de):

- New patch to simplify drivers/pci/xen-pcifront.c, spotted and
  suggested by Boris Ostrovsky
- Fix a possible NULL pointer dereference I introduced in xen-pcifront.c
- A few whitespace improvements
- Add a commit log to patch #6 (formerly #5)

I also expanded the audience for patches #4 and #6 to allow affected
people to actually see the changes to their drivers.

Interdiff can be found below.

The idea is still the same: After a few cleanups (#1 - #3) a new macro
is introduced abstracting access to struct pci_dev->driver. All users
are then converted to use this and in the last patch the macro is
changed to make use of struct pci_dev::dev->driver to get rid of the
duplicated tracking.

Best regards
Uwe

Uwe Kleine-König (6):
  PCI: Simplify pci_device_remove()
  PCI: Drop useless check from pci_device_probe()
  xen/pci: Drop some checks that are always true
  PCI: Provide wrapper to access a pci_dev's bound driver
  PCI: Adapt all code locations to not use struct pci_dev::driver directly
  PCI: Drop duplicated tracking of a pci_dev's bound driver

 arch/powerpc/include/asm/ppc-pci.h|  3 +-
 arch/powerpc/kernel/eeh_driver.c  | 12 ++--
 arch/x86/events/intel/uncore.c|  2 +-
 arch/x86/kernel/probe_roms.c  |  2 +-
 drivers/bcma/host_pci.c   |  6 +-
 drivers/crypto/hisilicon/qm.c |  2 +-
 drivers/crypto/qat/qat_common/adf_aer.c   |  2 +-
 drivers/message/fusion/mptbase.c  |  4 +-
 drivers/misc/cxl/guest.c  | 21 +++
 drivers/misc/cxl/pci.c| 25 
 .../ethernet/hisilicon/hns3/hns3_ethtool.c|  2 +-
 .../ethernet/marvell/prestera/prestera_pci.c  |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.c |  2 +-
 .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  2 +-
 drivers/pci/iov.c | 23 ---
 drivers/pci/pci-driver.c  | 48 +++
 drivers/pci/pci.c | 10 ++--
 drivers/pci/pcie/err.c| 35 ++-
 drivers/pci/xen-pcifront.c| 60 ---
 drivers/ssb/pcihost_wrapper.c |  7 ++-
 drivers/usb/host/xhci-pci.c   |  3 +-
 include/linux/pci.h   |  2 +-
 22 files changed, 145 insertions(+), 130 deletions(-)

Range-diff against v1:
1:  7d97605df363 = 1:  8ba6e9faa18c PCI: Simplify pci_device_remove()
2:  aec84c688d0f = 2:  d8a7dc52091f PCI: Drop useless check from 
pci_device_probe()
-:   > 3:  f4b78aa41776 xen/pci: Drop some checks that are always 
true
3:  e6f933f532c9 = 4:  50f3daa64170 PCI: Provide wrapper to access a pci_dev's 
bound driver
4:  d678a2924143 ! 5:  21cbd3f180a1 PCI: Adapt all code locations to not use 
struct pci_dev::driver directly
@@ drivers/message/fusion/mptbase.c: mpt_device_driver_register(struct 
mpt_pci_driv
 -  id = ioc->pcidev->driver ?
 -  ioc->pcidev->driver->id_table : NULL;
 +  struct pci_driver *pdrv = pci_driver_of_dev(ioc->pcidev);
-+  id = pdrv ?  pdrv->id_table : NULL;
++  id = pdrv ? pdrv->id_table : NULL;
if (dd_cbfunc->probe)
dd_cbfunc->probe(ioc->pcidev, id);
 }
@@ drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: static void 
hns3_get_drvinfo
}
  
 -  strncpy(drvinfo->driver, h->pdev->driver->name,
--  sizeof(drvinfo->driver));
-+  strncpy(drvinfo->driver, pci_driver_of_dev(h->pdev)->name, 
sizeof(drvinfo->driver));
++  strncpy(drvinfo->driver, pci_driver_of_dev(h->pdev)->name,
+   sizeof(drvinfo->driver));
drvinfo->driver[sizeof(drvinfo->driver) - 1] = '\0';
  
-   strncpy(drvinfo->bus_info, pci_name(h->pdev),
 
  ## drivers/net/ethernet/marvell/prestera/prestera_pci.c ##
 @@ drivers/net/ethernet/marvell/prestera/prestera_pci.c: static int 
prestera_fw_load(struct prestera_fw *fw)
@@ drivers/pci/xen-pcifront.c: static pci_ers_result_t 
pcifront_common_process(int
  
pcidev = pci_get_domain_bus_and_slot(domain, bus, devfn);
 -  if (!pcidev || !pcidev->driver) {
-+  pdrv = pci_driver_of_dev(pcidev);
-+  if (!pcidev || !pdrv) {
++  if (!pcidev || !(pdrv = pci_driver_of_dev(pcidev))) {
dev_err(&pdev->xdev->dev, "device or AER driver is NULL\n");
pci_dev_put(pcidev);
return result;
}
 -  pdrv = pcidev->driver;
  
-   if (pdrv) {
-   if (pdrv->err_handler && pdrv->err_handler->error_detected) {
+   if (pdrv->err_handler && pdrv->err_handler->error_detected) {
+   pci_dbg(pcidev, "trying to call AER service\n");
 
  ## drivers/ssb/pcihost_wrapper.c ##
 @@ driver

[PATCH v2 5/6] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-03 Thread Uwe Kleine-König
This prepares removing the driver member of struct pci_dev which holds the
same information as struct pci_dev::dev->driver.

Signed-off-by: Uwe Kleine-König 
---
 arch/powerpc/include/asm/ppc-pci.h|  3 +-
 arch/powerpc/kernel/eeh_driver.c  | 12 ---
 arch/x86/events/intel/uncore.c|  2 +-
 arch/x86/kernel/probe_roms.c  |  2 +-
 drivers/bcma/host_pci.c   |  6 ++--
 drivers/crypto/hisilicon/qm.c |  2 +-
 drivers/crypto/qat/qat_common/adf_aer.c   |  2 +-
 drivers/message/fusion/mptbase.c  |  4 +--
 drivers/misc/cxl/guest.c  | 21 +--
 drivers/misc/cxl/pci.c| 25 +++--
 .../ethernet/hisilicon/hns3/hns3_ethtool.c|  2 +-
 .../ethernet/marvell/prestera/prestera_pci.c  |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/pci.c |  2 +-
 .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  2 +-
 drivers/pci/iov.c | 23 +++-
 drivers/pci/pci-driver.c  | 28 ---
 drivers/pci/pci.c | 10 +++---
 drivers/pci/pcie/err.c| 35 ++-
 drivers/pci/xen-pcifront.c|  3 +-
 drivers/ssb/pcihost_wrapper.c |  7 ++--
 drivers/usb/host/xhci-pci.c   |  3 +-
 21 files changed, 112 insertions(+), 84 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
index 2b9edbf6e929..63938c156c57 100644
--- a/arch/powerpc/include/asm/ppc-pci.h
+++ b/arch/powerpc/include/asm/ppc-pci.h
@@ -57,7 +57,8 @@ void eeh_sysfs_remove_device(struct pci_dev *pdev);
 
 static inline const char *eeh_driver_name(struct pci_dev *pdev)
 {
-   return (pdev && pdev->driver) ? pdev->driver->name : "";
+   struct pci_driver *pdrv;
+   return (pdev && (pdrv = pci_driver_of_dev(pdev))) ? pdrv->name : "";
 }
 
 #endif /* CONFIG_EEH */
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 3eff6a4888e7..0fc712a8775e 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -104,13 +104,14 @@ static bool eeh_edev_actionable(struct eeh_dev *edev)
  */
 static inline struct pci_driver *eeh_pcid_get(struct pci_dev *pdev)
 {
-   if (!pdev || !pdev->driver)
+   struct pci_driver *pdrv;
+   if (!pdev || !(pdrv = pci_driver_of_dev(pdev)))
return NULL;
 
-   if (!try_module_get(pdev->driver->driver.owner))
+   if (!try_module_get(pdrv->driver.owner))
return NULL;
 
-   return pdev->driver;
+   return pdrv;
 }
 
 /**
@@ -122,10 +123,11 @@ static inline struct pci_driver *eeh_pcid_get(struct pci_dev *pdev)
  */
 static inline void eeh_pcid_put(struct pci_dev *pdev)
 {
-   if (!pdev || !pdev->driver)
+   struct pci_driver *pdrv;
+   if (!pdev || !(pdrv = pci_driver_of_dev(pdev)))
return;
 
-   module_put(pdev->driver->driver.owner);
+   module_put(pdrv->driver.owner);
 }
 
 /**
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 9bf4dbbc26e2..14eb141b6cf2 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1176,7 +1176,7 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 * PCI slot and func to indicate the uncore box.
 */
if (id->driver_data & ~0x) {
-   struct pci_driver *pci_drv = pdev->driver;
+   struct pci_driver *pci_drv = pci_driver_of_dev(pdev);
 
pmu = uncore_pci_find_dev_pmu(pdev, pci_drv->id_table);
if (pmu == NULL)
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 9e1def3744f2..55bfafec9684 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -80,7 +80,7 @@ static struct resource video_rom_resource = {
  */
 static bool match_id(struct pci_dev *pdev, unsigned short vendor, unsigned short device)
 {
-   struct pci_driver *drv = pdev->driver;
+   struct pci_driver *drv = pci_driver_of_dev(pdev);
const struct pci_device_id *id;
 
if (pdev->vendor == vendor && pdev->device == device)
diff --git a/drivers/bcma/host_pci.c b/drivers/bcma/host_pci.c
index 69c10a7b7c61..f9bfae87ebd9 100644
--- a/drivers/bcma/host_pci.c
+++ b/drivers/bcma/host_pci.c
@@ -161,6 +161,7 @@ static int bcma_host_pci_probe(struct pci_dev *dev,
   const struct pci_device_id *id)
 {
struct bcma_bus *bus;
+   struct pci_driver *pdrv;
int err = -ENOMEM;
const char *name;
u32 val;
@@ -176,8 +177,9 @@ static int bcma_host_pci_probe(struct pci_dev *dev,
goto err_kfree_bus;
 
name = dev_name(&dev->dev);
-   if (dev->driver && dev->driver->name)
-   name = dev->driver->name;
+   pdrv = pci_driver

Re: [PATCH v5 0/2] cpuidle/pseries: cleanup of the CEDE0 latency fixup code

2021-08-03 Thread Michael Ellerman
Michael Ellerman  writes:
> On Mon, 19 Jul 2021 12:03:17 +0530, Gautham R. Shenoy wrote:
>> 
>> Hi,
>> 
>> This is the v5 of the patchset to fixup CEDE0 latency only from
>> POWER10 onwards.
>> 
>> 
>> [...]
>
> Applied to powerpc/next.
>
> [1/2] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards
>   
> https://git.kernel.org/powerpc/c/7cbd631d4decfd78f8a17196dce9abcd4e7e1e94
> [2/2] cpuidle/pseries: Do not cap the CEDE0 latency in fixup_cede0_latency()
>   
> https://git.kernel.org/powerpc/c/4bceb03859c1a508669ce542c649c8d4f5d4bd93

First commit had a bad fixes tag, so now these are:

[1/2] cpuidle/pseries: Fixup CEDE0 latency only for POWER10 onwards
  https://git.kernel.org/powerpc/c/50741b70b0cbbafbd9199f5180e66c0c53783a4a
[2/2] cpuidle/pseries: Do not cap the CEDE0 latency in fixup_cede0_latency()
  https://git.kernel.org/powerpc/c/71737a6c2a8f801622d2b71567d1ec1e4c5b40b8

cheers


[PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock

2021-08-03 Thread John Ogness
kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
during cpu roundup. This will conflict with the printk cpulock.
Therefore, a CPU must ensure that it is not holding the printk
cpulock when calling kgdb_cpu_enter(). If it is, it must allow its
printk context to complete first.

A new helper function kgdb_roundup_delay() is introduced for kgdb
to determine if it is holding the printk cpulock. If so, a flag is
set so that when the printk cpulock is released, kgdb will be
re-triggered for that CPU.

Signed-off-by: John Ogness 
---
 arch/powerpc/include/asm/smp.h |  1 +
 arch/powerpc/kernel/kgdb.c | 10 +++-
 arch/powerpc/kernel/smp.c  |  5 
 arch/x86/kernel/kgdb.c |  9 ---
 include/linux/kgdb.h   |  3 +++
 include/linux/printk.h |  8 ++
 kernel/debug/debug_core.c  | 45 --
 kernel/printk/printk.c | 12 +
 8 files changed, 70 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 03b3d010cbab..eec452e647b3 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -58,6 +58,7 @@ struct smp_ops_t {
 
 extern int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
 extern int smp_send_safe_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
+extern void smp_send_debugger_break_cpu(unsigned int cpu);
 extern void smp_send_debugger_break(void);
 extern void start_secondary_resume(void);
 extern void smp_generic_give_timebase(void);
diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index bdee7262c080..d57d37497862 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -120,11 +120,19 @@ int kgdb_skipexception(int exception, struct pt_regs *regs)
 
 static int kgdb_debugger_ipi(struct pt_regs *regs)
 {
-   kgdb_nmicallback(raw_smp_processor_id(), regs);
+   int cpu = raw_smp_processor_id();
+
+   if (!kgdb_roundup_delay(cpu))
+   kgdb_nmicallback(cpu, regs);
return 0;
 }
 
 #ifdef CONFIG_SMP
+void kgdb_roundup_cpu(unsigned int cpu)
+{
+   smp_send_debugger_break_cpu(cpu);
+}
+
 void kgdb_roundup_cpus(void)
 {
smp_send_debugger_break();
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 447b78a87c8f..816d7f09bbf9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -582,6 +582,11 @@ static void debugger_ipi_callback(struct pt_regs *regs)
debugger_ipi(regs);
 }
 
+void smp_send_debugger_break_cpu(unsigned int cpu)
+{
+   smp_send_nmi_ipi(cpu, debugger_ipi_callback, 100);
+}
+
 void smp_send_debugger_break(void)
 {
smp_send_nmi_ipi(NMI_IPI_ALL_OTHERS, debugger_ipi_callback, 100);
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index 3a43a2dee658..37bd37cdf2b6 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -502,9 +502,12 @@ static int kgdb_nmi_handler(unsigned int cmd, struct pt_regs *regs)
if (atomic_read(&kgdb_active) != -1) {
/* KGDB CPU roundup */
cpu = raw_smp_processor_id();
-   kgdb_nmicallback(cpu, regs);
-   set_bit(cpu, was_in_debug_nmi);
-   touch_nmi_watchdog();
+
+   if (!kgdb_roundup_delay(cpu)) {
+   kgdb_nmicallback(cpu, regs);
+   set_bit(cpu, was_in_debug_nmi);
+   touch_nmi_watchdog();
+   }
 
return NMI_HANDLED;
}
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 258cdde8d356..9bca0d98db5a 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -212,6 +212,8 @@ extern void kgdb_call_nmi_hook(void *ignored);
  */
 extern void kgdb_roundup_cpus(void);
 
+extern void kgdb_roundup_cpu(unsigned int cpu);
+
 /**
  * kgdb_arch_set_pc - Generic call back to the program counter
  * @regs: Current &struct pt_regs.
@@ -365,5 +367,6 @@ extern void kgdb_free_init_mem(void);
 #define dbg_late_init()
 static inline void kgdb_panic(const char *msg) {}
 static inline void kgdb_free_init_mem(void) { }
+static inline void kgdb_roundup_cpu(unsigned int cpu) {}
 #endif /* ! CONFIG_KGDB */
 #endif /* _KGDB_H_ */
diff --git a/include/linux/printk.h b/include/linux/printk.h
index ac738d1d9934..974ea2c99749 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -280,10 +280,18 @@ static inline void dump_stack(void)
 extern int __printk_cpu_trylock(void);
 extern void __printk_wait_on_cpu_lock(void);
 extern void __printk_cpu_unlock(void);
+extern bool kgdb_roundup_delay(unsigned int cpu);
+
 #else
+
 #define __printk_cpu_trylock() 1
 #define __printk_wait_on_cpu_lock()
 #define __printk_cpu_unlock()
+
+static inline bool kgdb_roundup_delay(unsigned int cpu)
+{
+   return false

[PATCH 03/38] powerpc: Replace deprecated CPU-hotplug functions.

2021-08-03 Thread Sebastian Andrzej Siewior
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: kvm-...@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior 
---
 arch/powerpc/kernel/rtasd.c   |  4 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c  | 10 +-
 arch/powerpc/platforms/powernv/idle.c |  4 ++--
 arch/powerpc/platforms/powernv/opal-imc.c |  8 
 4 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 8561dfb33f241..32ee17753eb4a 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -429,7 +429,7 @@ static void rtas_event_scan(struct work_struct *w)
 
do_event_scan();
 
-   get_online_cpus();
+   cpus_read_lock();
 
/* raw_ OK because just using CPU as starting point. */
cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
@@ -451,7 +451,7 @@ static void rtas_event_scan(struct work_struct *w)
schedule_delayed_work_on(cpu, &event_scan_work,
__round_jiffies_relative(event_scan_delay, cpu));
 
-   put_online_cpus();
+   cpus_read_unlock();
 }
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index be8ef1c5b1bfb..fcf4760a3a0ea 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -137,23 +137,23 @@ long int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int target,
  * exist in the system. We use a counter of VMs to track this.
  *
  * One of the operations we need to block is onlining of secondaries, so we
- * protect hv_vm_count with get/put_online_cpus().
+ * protect hv_vm_count with cpus_read_lock/unlock().
  */
 static atomic_t hv_vm_count;
 
 void kvm_hv_vm_activated(void)
 {
-   get_online_cpus();
+   cpus_read_lock();
atomic_inc(&hv_vm_count);
-   put_online_cpus();
+   cpus_read_unlock();
 }
 EXPORT_SYMBOL_GPL(kvm_hv_vm_activated);
 
 void kvm_hv_vm_deactivated(void)
 {
-   get_online_cpus();
+   cpus_read_lock();
atomic_dec(&hv_vm_count);
-   put_online_cpus();
+   cpus_read_unlock();
 }
 EXPORT_SYMBOL_GPL(kvm_hv_vm_deactivated);
 
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 528a7e0cf83aa..aa27689b832db 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -199,12 +199,12 @@ static ssize_t store_fastsleep_workaround_applyonce(struct device *dev,
 */
power7_fastsleep_workaround_exit = false;
 
-   get_online_cpus();
+   cpus_read_lock();
primary_thread_mask = cpu_online_cores_map();
on_each_cpu_mask(&primary_thread_mask,
pnv_fastsleep_workaround_apply,
&err, 1);
-   put_online_cpus();
+   cpus_read_unlock();
if (err) {
pr_err("fastsleep_workaround_applyonce change failed while running pnv_fastsleep_workaround_apply");
goto fail;
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 7824cc364bc40..ba02a75c14102 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -186,7 +186,7 @@ static void disable_nest_pmu_counters(void)
int nid, cpu;
const struct cpumask *l_cpumask;
 
-   get_online_cpus();
+   cpus_read_lock();
for_each_node_with_cpus(nid) {
l_cpumask = cpumask_of_node(nid);
cpu = cpumask_first_and(l_cpumask, cpu_online_mask);
@@ -195,7 +195,7 @@ static void disable_nest_pmu_counters(void)
opal_imc_counters_stop(OPAL_IMC_COUNTERS_NEST,
   get_hard_smp_processor_id(cpu));
}
-   put_online_cpus();
+   cpus_read_unlock();
 }
 
 static void disable_core_pmu_counters(void)
@@ -203,7 +203,7 @@ static void disable_core_pmu_counters(void)
cpumask_t cores_map;
int cpu, rc;
 
-   get_online_cpus();
+   cpus_read_lock();
/* Disable the IMC Core functions */
cores_map = cpu_online_cores_map();
for_each_cpu(cpu, &cores_map) {
@@ -213,7 +213,7 @@ static void disable_core_pmu_counters(void)
pr_err("%s: Failed to stop Core (cpu = %d)\n",
__FUNCTION__, cpu);
}
-   put_online_cpus();
+   cpus_read_unlock();
 }
 
 int get_max_nest_dev(void)
-- 
2.32.0
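
The claim that behavior is unchanged can be illustrated with a small user-space sketch. It is an illustration under stated assumptions, not kernel code: cpus_read_lock() and cpus_read_unlock() are mocked with a plain reader counter, hv_vm_count is a plain int rather than an atomic_t, and the vm_activated_old(), vm_activated_new() and demo_hotplug_rename() helpers are hypothetical names. The deprecated functions are modelled as thin wrappers that forward to the new names, which is what "map directly" means in the commit message.

```c
#include <assert.h>

/* Hypothetical stand-in for the hotplug read-side lock: it only counts
 * how many read-side sections are currently open. */
static int hotplug_readers;
static int hv_vm_count;

void cpus_read_lock(void)   { hotplug_readers++; }
void cpus_read_unlock(void) { hotplug_readers--; }

/* Deprecated names, modelled as thin wrappers around the new ones. */
static inline void get_online_cpus(void) { cpus_read_lock(); }
static inline void put_online_cpus(void) { cpus_read_unlock(); }

/* Before the patch: protected section entered via the deprecated names. */
void vm_activated_old(void)
{
	get_online_cpus();
	hv_vm_count++;
	put_online_cpus();
}

/* After the patch: same section, official names. */
void vm_activated_new(void)
{
	cpus_read_lock();
	hv_vm_count++;
	cpus_read_unlock();
}

/* Hypothetical demo: both variants leave the lock balanced and have the
 * same effect on the protected counter. */
void demo_hotplug_rename(void)
{
	vm_activated_old();
	assert(hotplug_readers == 0 && hv_vm_count == 1);
	vm_activated_new();
	assert(hotplug_readers == 0 && hv_vm_count == 2);
}
```

Since the old names add no logic of their own, replacing each call site one-for-one, as the patch does, cannot change lock ordering or nesting.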



Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock

2021-08-03 Thread Daniel Thompson
On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
> during cpu roundup. This will conflict with the printk cpulock.

When the full vision is realized what will be the purpose of the printk
cpulock?

I'm asking largely because its current role is actively unhelpful
w.r.t. kdb. It is possible that cautious use of in_dbg_master() might
be a better (and safer) solution. However it sounds like there is a
larger role planned for the printk cpulock...


> Therefore, a CPU must ensure that it is not holding the printk
> cpulock when calling kgdb_cpu_enter(). If it is, it must allow its
> printk context to complete first.
> 
> A new helper function kgdb_roundup_delay() is introduced for kgdb
> to determine if it is holding the printk cpulock. If so, a flag is
> set so that when the printk cpulock is released, kgdb will be
> re-triggered for that CPU.
> 
> Signed-off-by: John Ogness 
> ---
>  arch/powerpc/include/asm/smp.h |  1 +
>  arch/powerpc/kernel/kgdb.c | 10 +++-
>  arch/powerpc/kernel/smp.c  |  5 
>  arch/x86/kernel/kgdb.c |  9 ---
>  include/linux/kgdb.h   |  3 +++
>  include/linux/printk.h |  8 ++
>  kernel/debug/debug_core.c  | 45 --
>  kernel/printk/printk.c | 12 +
>  8 files changed, 70 insertions(+), 23 deletions(-)
> 
> [...]
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 3d0c933937b4..1b546e117f10 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -44,6 +44,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -214,6 +215,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, 
> int write,
>  #ifdef CONFIG_SMP
>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>  static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
> +static unsigned int kgdb_cpu = -1;

Is this the flag to provoke retriggering? It appears to be a write-only
variable (at least in this patch). How is it consumed?


Daniel.


>  /**
>   * __printk_wait_on_cpu_lock() - Busy wait until the printk cpu-reentrant
> @@ -325,6 +327,16 @@ void __printk_cpu_unlock(void)
>  -1); /* LMM(__printk_cpu_unlock:B) */
>  }
>  EXPORT_SYMBOL(__printk_cpu_unlock);
> +
> +bool kgdb_roundup_delay(unsigned int cpu)
> +{
> + if (cpu != atomic_read(&printk_cpulock_owner))
> + return false;
> +
> + kgdb_cpu = cpu;
> + return true;
> +}
> +EXPORT_SYMBOL(kgdb_roundup_delay);
>  #endif /* CONFIG_SMP */
>  
>  /* Number of registered extended console drivers. */
> -- 
> 2.20.1
> 


[RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE

2021-08-03 Thread Aneesh Kumar K.V
With a shared mapping, even though we are unmapping a large range, the kernel
will force a TLB flush with the ptl lock held to avoid the race mentioned in
commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and memory
freeing parts").
This results in the kernel issuing a high number of TLB flushes even for a
large range. This can be improved by making sure the kernel switches to a
PID-based flush if it is unmapping a 2M range.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/book3s64/radix_tlb.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c 
b/arch/powerpc/mm/book3s64/radix_tlb.c
index aefc100d79a7..21d0f098e43b 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
  * invalidating a full PID, so it has a far lower threshold to change from
  * individual page flushes to full-pid flushes.
  */
-static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
+static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
 static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = 
POWER9_TLB_SETS_RADIX * 2;
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
@@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct 
mm_struct *mm,
if (fullmm)
flush_pid = true;
else if (type == FLUSH_TYPE_GLOBAL)
-   flush_pid = nr_pages > tlb_single_page_flush_ceiling;
+   flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
else
flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
/*
@@ -1335,7 +1335,7 @@ static void __radix__flush_tlb_range_psize(struct 
mm_struct *mm,
if (fullmm)
flush_pid = true;
else if (type == FLUSH_TYPE_GLOBAL)
-   flush_pid = nr_pages > tlb_single_page_flush_ceiling;
+   flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
else
flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
 
@@ -1505,7 +1505,7 @@ void do_h_rpt_invalidate_prt(unsigned long pid, unsigned 
long lpid,
continue;
 
nr_pages = (end - start) >> def->shift;
-   flush_pid = nr_pages > tlb_single_page_flush_ceiling;
+   flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
 
/*
 * If the number of pages spanning the range is above
-- 
2.31.1
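
For readers following the threshold change above, here is a minimal user-space sketch of the before/after decision for FLUSH_TYPE_GLOBAL. It is not kernel code: the sketch_* names are hypothetical, and the 64K base page size (so that a 2M range is 32 pages) is an assumption that the patch description does not state.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 64K base pages, so a 2M (PMD_SIZE) range spans 32 pages. */
#define SKETCH_PAGE_SHIFT 16UL
#define SKETCH_PMD_SIZE   (2UL << 20)

/* Number of base pages covered by [start, end). */
static unsigned long sketch_nr_pages(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start) >> SKETCH_PAGE_SHIFT;
}

/* Rule before the patch: ceiling of 33 with a strict comparison. */
static bool sketch_flush_pid_old(unsigned long nr_pages)
{
	return nr_pages > 33;
}

/* Rule after the patch: ceiling of 32 with an inclusive comparison. */
static bool sketch_flush_pid_new(unsigned long nr_pages)
{
	return nr_pages >= 32;
}
```

Under that assumption, a range of exactly 2M (32 pages) stays on per-page flushes with the old rule and is upgraded to a full-PID flush with the new one; ranges of 34 pages or more behave the same either way.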



[PATCH] powerpc/kexec: fix for_each_child.cocci warning

2021-08-03 Thread Julia Lawall
From: kernel test robot 

for_each_node_by_type should have of_node_put() before return.

Generated by: scripts/coccinelle/iterators/for_each_child.cocci

CC: Sumera Priyadarsini 
Reported-by: kernel test robot 
Signed-off-by: kernel test robot 
---

The code seems to have been this way since the beginning of time.  Perhaps
the of_node_put() is a no-op for this code?

 core_64.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -64,8 +64,10 @@ int default_machine_kexec_prepare(struct
begin = image->segment[i].mem;
end = begin + image->segment[i].memsz;

-   if ((begin < high) && (end > low))
+   if ((begin < high) && (end > low)) {
+   of_node_put(node);
return -ETXTBSY;
+   }
}
}
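
The rule behind the warning above can be illustrated with a small user-space model. This is only a sketch: struct sketch_node and the sketch_* helpers are hypothetical stand-ins for struct device_node, of_node_get()/of_node_put() and the for_each_node_by_type() iterator, which holds a reference on the current node and drops it when it advances. Leaving the loop early therefore skips the drop unless the caller does it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_node: a refcount, a key, a link. */
struct sketch_node {
	int refcount;
	unsigned long base;
	struct sketch_node *next;
};

static struct sketch_node *sketch_get(struct sketch_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void sketch_put(struct sketch_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Iterator step: take a reference on the next node, drop the one on prev. */
static struct sketch_node *sketch_next(struct sketch_node *head,
				       struct sketch_node *prev)
{
	struct sketch_node *n = prev ? prev->next : head;

	sketch_get(n);
	sketch_put(prev);
	return n;
}

/* Returns -1 at the first node whose base equals busy, 0 if none matches. */
static int sketch_scan(struct sketch_node *head, unsigned long busy,
		       int put_on_exit)
{
	struct sketch_node *n;

	for (n = sketch_next(head, NULL); n; n = sketch_next(head, n)) {
		if (n->base == busy) {
			if (put_on_exit)
				sketch_put(n);	/* balances the iterator's get */
			return -1;
		}
	}
	return 0;
}
```

When the loop runs to completion every count returns to zero. An early return without the put leaves the matching node's count at one, and with the put it is balanced again. Whether the leaked reference matters in practice for this particular code path is the open question raised in the note above.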



[PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)

2021-08-03 Thread Christophe Leroy
When a DSI (Data Storage Interrupt) is taken while in NAP mode,
r11 doesn't survive the call to power_save_ppc32_restore().

So use r1 instead of r11 as they both contain the virtual stack
pointer at that point.

Reported-by: Finn Thain 
Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_book3s_32.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/head_book3s_32.S 
b/arch/powerpc/kernel/head_book3s_32.S
index 764edd860ed4..68e5c0a7e99d 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -300,7 +300,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
EXCEPTION_PROLOG_1
EXCEPTION_PROLOG_2 INTERRUPT_DATA_STORAGE DataAccess handle_dar_dsisr=1
prepare_transfer_to_handler
-   lwz r5, _DSISR(r11)
+   lwz r5, _DSISR(r1)
andis.  r0, r5, DSISR_DABRMATCH@h
bne-1f
bl  do_page_fault
-- 
2.25.0



Re: [PATCH printk v1 03/10] kgdb: delay roundup if holding printk cpulock

2021-08-03 Thread John Ogness
On 2021-08-03, Daniel Thompson  wrote:
> On Tue, Aug 03, 2021 at 03:18:54PM +0206, John Ogness wrote:
>> kgdb makes use of its own cpulock (@dbg_master_lock, @kgdb_active)
>> during cpu roundup. This will conflict with the printk cpulock.
>
> When the full vision is realized what will be the purpose of the printk
> cpulock?
>
> I'm asking largely because its current role is actively unhelpful
> w.r.t. kdb. It is possible that cautious use of in_dbg_master() might
> be a better (and safer) solution. However it sounds like there is a
> larger role planned for the printk cpulock...

The printk cpulock is used as a synchronization mechanism for
implementing atomic consoles, which need to be able to safely interrupt
the console write() activity at any time and immediately continue with
their own printing. The ultimate goal is to move all console printing
into per-console dedicated kthreads, so the primary function of the
printk cpulock is really to immediately _stop_ the CPU/kthread
performing write() in order to allow write_atomic() (from any context on
any CPU) to safely and reliably take over.

Atomic consoles are actually quite similar to the kgdb_io ops. For
example, comparing:

serial8250_console_write_atomic() + serial8250_console_putchar_locked()

with

serial8250_put_poll_char()

The difference is that serial8250_console_write_atomic() is line-based
and synchronizing with serial8250_console_write() so that if the kernel
crashes while outputting to the console, write() can be interrupted by
write_atomic() and cleanly formatted crash data can be output.

Also serial8250_put_poll_char() is calling into __pm_runtime_resume(),
which includes a spinlock and possibly sleeping. This would not be
acceptable for atomic consoles. Although, as Andy pointed out [0], I
will need to figure out how to deal with suspended consoles. Or just
implement a policy that registered atomic consoles may never be
suspended.

I had not considered merging kgdb_io ops with atomic console ops. But
now that I look at it more closely, there may be some useful overlap. I
will consider this. Thank you for this idea.

>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 3d0c933937b4..1b546e117f10 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -214,6 +215,7 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, 
>> int write,
>>  #ifdef CONFIG_SMP
>>  static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1);
>>  static atomic_t printk_cpulock_nested = ATOMIC_INIT(0);
>> +static unsigned int kgdb_cpu = -1;
>
> Is this the flag to provoke retriggering? It appears to be a write-only
> variable (at least in this patch). How is it consumed?

Critical catch! Thank you. I am quite unhappy to see these hunks were
accidentally dropped when generating this series.

@@ -3673,6 +3675,9 @@ EXPORT_SYMBOL(__printk_cpu_trylock);
  */
 void __printk_cpu_unlock(void)
 {
+   bool trigger_kgdb = false;
+   unsigned int cpu;
+
if (atomic_read(&printk_cpulock_nested)) {
atomic_dec(&printk_cpulock_nested);
return;
@@ -3683,6 +3688,12 @@ void __printk_cpu_unlock(void)
 * LMM(__printk_cpu_unlock:A)
 */
 
+   cpu = smp_processor_id();
+   if (kgdb_cpu == cpu) {
+   trigger_kgdb = true;
+   kgdb_cpu = -1;
+   }
+
/*
 * Guarantee loads and stores from this CPU when it was the
 * lock owner are visible to the next lock owner. This pairs
@@ -3703,6 +3714,21 @@ void __printk_cpu_unlock(void)
 */
atomic_set_release(&printk_cpulock_owner,
   -1); /* LMM(__printk_cpu_unlock:B) */
+
+   if (trigger_kgdb) {
+   pr_warn("re-triggering kgdb roundup for CPU#%d\n", cpu);
+   kgdb_roundup_cpu(cpu);
+   }
 }
 EXPORT_SYMBOL(__printk_cpu_unlock);

John Ogness

[0] https://lore.kernel.org/lkml/yqlkaexs9mpme...@smile.fi.intel.com
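
The delay-and-retrigger logic discussed in this thread can be modelled in user space as follows. This is a single-threaded sketch under stated assumptions, not the kernel implementation: the sketch_* names are hypothetical, the counter stands in for the call to kgdb_roundup_cpu(), and lock nesting and memory ordering are ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NO_CPU (-1)

static int sketch_lock_owner = SKETCH_NO_CPU;	/* models printk_cpulock_owner */
static int sketch_pending_cpu = SKETCH_NO_CPU;	/* models kgdb_cpu */
static int sketch_roundups;			/* counts re-triggered roundups */

static void sketch_lock(int cpu)
{
	assert(sketch_lock_owner == SKETCH_NO_CPU);
	sketch_lock_owner = cpu;
}

/* Returns true if the roundup must be delayed because cpu owns the lock. */
static bool sketch_roundup_delay(int cpu)
{
	if (cpu != sketch_lock_owner)
		return false;

	sketch_pending_cpu = cpu;	/* remember to retry after unlock */
	return true;
}

/* Release the lock, then replay a roundup that was deferred for this CPU. */
static void sketch_unlock(int cpu)
{
	bool trigger = false;

	assert(sketch_lock_owner == cpu);
	if (sketch_pending_cpu == cpu) {
		trigger = true;
		sketch_pending_cpu = SKETCH_NO_CPU;
	}

	sketch_lock_owner = SKETCH_NO_CPU;

	if (trigger)
		sketch_roundups++;	/* stands in for kgdb_roundup_cpu(cpu) */
}
```

The point of the model is that the pending flag is consumed only in the unlock path, which is the part the hunks quoted above restore. Without that consumer the flag is set and never read.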


[PATCH v4] soc: fsl: qe: convert QE interrupt controller to platform_device

2021-08-03 Thread Maxim Kochetkov
Since 5.13 QE's ucc nodes can't get interrupts from devicetree:

ucc@2000 {
cell-index = <1>;
reg = <0x2000 0x200>;
interrupts = <32>;
interrupt-parent = <&qeic>;
};

fw_devlink now expects the driver to create and probe a struct device
for the interrupt controller.

So let's convert this driver to a simple platform_device with probe().
Also use the platform_get_ and devm_ families of functions to get/allocate
resources and drop unused .compatible = "qeic".

[1] - 
https://lore.kernel.org/lkml/CAGETcx9PiX==mlxb9po8myyk6u2vhpvwtmsa5nkd-ywh5xh...@mail.gmail.com
Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by 
default""")
Signed-off-by: Maxim Kochetkov 
Reported-by: kernel test robot 
Reported-by: Dan Carpenter 
---
 drivers/soc/fsl/qe/qe_ic.c | 75 ++
 1 file changed, 44 insertions(+), 31 deletions(-)

diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/soc/fsl/qe/qe_ic.c
index 3f711c1a0996..e710d554425d 100644
--- a/drivers/soc/fsl/qe/qe_ic.c
+++ b/drivers/soc/fsl/qe/qe_ic.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -404,41 +405,40 @@ static void qe_ic_cascade_muxed_mpic(struct irq_desc 
*desc)
chip->irq_eoi(&desc->irq_data);
 }
 
-static void __init qe_ic_init(struct device_node *node)
+static int qe_ic_init(struct platform_device *pdev)
 {
+   struct device *dev = &pdev->dev;
void (*low_handler)(struct irq_desc *desc);
void (*high_handler)(struct irq_desc *desc);
struct qe_ic *qe_ic;
-   struct resource res;
-   u32 ret;
+   struct resource *res;
+   struct device_node *node = pdev->dev.of_node;
 
-   ret = of_address_to_resource(node, 0, &res);
-   if (ret)
-   return;
+   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (res == NULL) {
+   dev_err(dev, "no memory resource defined\n");
+   return -ENODEV;
+   }
 
-   qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
+   qe_ic = devm_kzalloc(dev, sizeof(*qe_ic), GFP_KERNEL);
if (qe_ic == NULL)
-   return;
+   return -ENOMEM;
 
-   qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
-  &qe_ic_host_ops, qe_ic);
-   if (qe_ic->irqhost == NULL) {
-   kfree(qe_ic);
-   return;
+   qe_ic->regs = devm_ioremap(dev, res->start, resource_size(res));
+   if (qe_ic->regs == NULL) {
+   dev_err(dev, "failed to ioremap() registers\n");
+   return -ENODEV;
}
 
-   qe_ic->regs = ioremap(res.start, resource_size(&res));
-
qe_ic->hc_irq = qe_ic_irq_chip;
 
-   qe_ic->virq_high = irq_of_parse_and_map(node, 0);
-   qe_ic->virq_low = irq_of_parse_and_map(node, 1);
+   qe_ic->virq_high = platform_get_irq(pdev, 0);
+   qe_ic->virq_low = platform_get_irq(pdev, 1);
 
-   if (!qe_ic->virq_low) {
-   printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
-   kfree(qe_ic);
-   return;
+   if (qe_ic->virq_low < 0) {
+   return -ENODEV;
}
+
if (qe_ic->virq_high != qe_ic->virq_low) {
low_handler = qe_ic_cascade_low;
high_handler = qe_ic_cascade_high;
@@ -447,6 +447,13 @@ static void __init qe_ic_init(struct device_node *node)
high_handler = NULL;
}
 
+   qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
+  &qe_ic_host_ops, qe_ic);
+   if (qe_ic->irqhost == NULL) {
+   dev_err(dev, "failed to add irq domain\n");
+   return -ENODEV;
+   }
+
qe_ic_write(qe_ic->regs, QEIC_CICR, 0);
 
irq_set_handler_data(qe_ic->virq_low, qe_ic);
@@ -456,20 +463,26 @@ static void __init qe_ic_init(struct device_node *node)
irq_set_handler_data(qe_ic->virq_high, qe_ic);
irq_set_chained_handler(qe_ic->virq_high, high_handler);
}
+   return 0;
 }
+static const struct of_device_id qe_ic_ids[] = {
+   { .compatible = "fsl,qe-ic"},
+   { .type = "qeic"},
+   {},
+};
 
-static int __init qe_ic_of_init(void)
+static struct platform_driver qe_ic_driver =
 {
-   struct device_node *np;
+   .driver = {
+   .name   = "qe-ic",
+   .of_match_table = qe_ic_ids,
+   },
+   .probe  = qe_ic_init,
+};
 
-   np = of_find_compatible_node(NULL, NULL, "fsl,qe-ic");
-   if (!np) {
-   np = of_find_node_by_type(NULL, "qeic");
-   if (!np)
-   return -ENODEV;
-   }
-   qe_ic_init(np);
-   of_node_put(np);
+static int __init qe_ic_of_init(void)
+{
+   platform_driver_register(&qe_ic_driver);
return

Re: [PATCH v2] arch: vdso: remove if-conditionals of $(c-gettimeofday-y)

2021-08-03 Thread Catalin Marinas
On Sat, Jul 31, 2021 at 03:00:20PM +0900, Masahiro Yamada wrote:
> arm, arm64, csky, mips, powerpc always select GENERIC_GETTIMEOFDAY,
> hence $(gettimeofday-y) never becomes empty.
> 
> riscv conditionally selects GENERIC_GETTIMEOFDAY when MMU=y && 64BIT=y,
> but arch/riscv/kernel/vdso/vgettimeofday.o is built only under that
> condition. So, you can always define CFLAGS_vgettimeofday.o
> 
> Remove all the meaningless conditionals.
> 
> Signed-off-by: Masahiro Yamada 

Acked-by: Catalin Marinas 


[Bug 213961] New: Oops while loading radeon driver

2021-08-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213961

Bug ID: 213961
   Summary: Oops while loading radeon driver
   Product: Platform Specific/Hardware
   Version: 2.5
Kernel Version: 5.14-rc4
  Hardware: PPC-32
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PPC-32
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: riese...@lxtec.de
Regression: No

Created attachment 298183
  --> https://bugzilla.kernel.org/attachment.cgi?id=298183&action=edit
Boot proto

Please find the boot proto attached. The virtual console freezes and X doesn't
start. A remote ssh login still gives access to the system.

Thanks in advance

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v5] pseries: prevent free CPU ids to be reused on another node

2021-08-03 Thread Nathan Lynch
Laurent Dufour  writes:
> V5:
>  - Rework code structure
>  - Reintroduce the capability to reuse other node's ids.

OK. While I preferred v4, where we would fail an add rather than allow
CPU IDs to appear to "travel" between nodes, this change is a net
improvement.

Reviewed-by: Nathan Lynch 


[powerpc:merge] BUILD SUCCESS 98b7252a619bc485763d8e7b9f446f7cc3b992e8

2021-08-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
merge
branch HEAD: 98b7252a619bc485763d8e7b9f446f7cc3b992e8  Automatic merge of 
'master' into merge (2021-08-02 11:50)

elapsed time: 873m

configs tested: 108
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
arc         axs101_defconfig
sh          se7750_defconfig
arc         haps_hs_smp_defconfig
xtensa      iss_defconfig
mips        gpr_defconfig
riscv       allmodconfig
arm         pxa3xx_defconfig
sh          se7751_defconfig
arm         assabet_defconfig
arm         pxa910_defconfig
powerpc     mpc8315_rdb_defconfig
s390        zfcpdump_defconfig
mips        rt305x_defconfig
powerpc     g5_defconfig
arc         hsdk_defconfig
sparc64     alldefconfig
mips        e55_defconfig
powerpc     xes_mpc85xx_defconfig
powerpc     powernv_defconfig
m68k        mvme16x_defconfig
mips        rbtx49xx_defconfig
powerpc     ksi8560_defconfig
sh          defconfig
arm         moxart_defconfig
sh          kfr2r09_defconfig
arm         palmz72_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
x86_64      allnoconfig
m68k        allmodconfig
m68k        defconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
s390        allmodconfig
parisc      allyesconfig
s390        defconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     allyesconfig
powerpc     allmodconfig
powerpc     allnoconfig
x86_64      randconfig-a002-20210803
x86_64      randconfig-a004-20210803
x86_64      randconfig-a006-20210803
x86_64      randconfig-a003-20210803
x86_64      randconfig-a001-20210803
x86_64      randconfig-a005-20210803
i386        randconfig-a004-20210803
i386        randconfig-a005-20210803
i386        randconfig-a002-20210803
i386        randconfig-a006-20210803
i386        randconfig-a001-20210803
i386        randconfig-a003-20210803
i386        randconfig-a012-20210803
i386        randconfig-a011-20210803
i386        randconfig-a015-20210803
i386        randconfig-a013-20210803
i386        randconfig-a014-20210803
i386        randconfig-a016-20210803
i386        randconfig-a012-20210802
i386        randconfig-a011-20210802
i386        randconfig-a015-20210802
i386        randconfig-a013-20210802
i386        randconfig-a014-20210802
i386        randconfig-a016-20210802
riscv       nommu_k210_defconfig
riscv       allyesconfig
riscv       nommu_virt_defconfig
riscv       allnoconfig
riscv       defconfig
riscv       rv32_defconfig
um          x86_64_defconfig
um          i386_defconfig
x86_64      allyesconfig
x86_64      rhel-8.3-kselftests
x86_64      defconfig
x86_64      rhel-

Re: [PATCH v5] pseries/drmem: update LMBs after LPM

2021-08-03 Thread Nathan Lynch
Laurent Dufour  writes:
> V5:
>  - Reword the commit's description to address Nathan's comments.

Thanks. Still don't like the global variable usage but:

Reviewed-by: Nathan Lynch 


Re: [PATCH v5] pseries/drmem: update LMBs after LPM

2021-08-03 Thread Laurent Dufour

On 03/08/2021 at 19:32, Nathan Lynch wrote:
> Laurent Dufour  writes:
>> V5:
>>   - Reword the commit's description to address Nathan's comments.
>
> Thanks. Still don't like the global variable usage but:
>
> Reviewed-by: Nathan Lynch 

Thanks Nathan,

I don't like the global variable usage either, but I can't see any smarter way
to achieve that.


Laurent.


Re: [PATCH v5] pseries: prevent free CPU ids to be reused on another node

2021-08-03 Thread Laurent Dufour

On 03/08/2021 at 18:54, Nathan Lynch wrote:
> Laurent Dufour  writes:
>> V5:
>>   - Rework code structure
>>   - Reintroduce the capability to reuse other node's ids.
>
> OK. While I preferred v4, where we would fail an add rather than allow
> CPU IDs to appear to "travel" between nodes, this change is a net
> improvement.
>
> Reviewed-by: Nathan Lynch 

Thanks Nathan,

Regarding the reuse of other nodes' free CPU ids, with this patch the kernel
does its best to prevent that. Instead of failing to add new CPUs, I think it's
better to reuse free CPU ids of other nodes; otherwise, only a reboot would
allow the CPU add operation to succeed.


Laurent.


[PATCH] cpuidle: pseries: Mark pseries_idle_proble() as __init

2021-08-03 Thread Nathan Chancellor
After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only
for POWER10 onwards"), pseries_idle_probe() is no longer inlined when
compiling with clang, which causes a modpost warning:

WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in
reference from the function pseries_idle_probe() to the function
.init.text:fixup_cede0_latency()
The function pseries_idle_probe() references
the function __init fixup_cede0_latency().
This is often because pseries_idle_probe lacks a __init
annotation or the annotation of fixup_cede0_latency is wrong.

pseries_idle_probe() is a non-init function, which calls
fixup_cede0_latency(), which is an init function, explaining the
mismatch. pseries_idle_probe() is only called from
pseries_processor_idle_init(), which is an init function, so mark
pseries_idle_probe() as __init so there is no more warning.

Fixes: 054e44ba99ae ("cpuidle: pseries: Add function to parse extended CEDE 
records")
Signed-off-by: Nathan Chancellor 
---
 drivers/cpuidle/cpuidle-pseries.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpuidle/cpuidle-pseries.c 
b/drivers/cpuidle/cpuidle-pseries.c
index bba449b77641..7e7ab5597d7a 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -403,7 +403,7 @@ static void __init fixup_cede0_latency(void)
  * pseries_idle_probe()
  * Choose state table for shared versus dedicated partition
  */
-static int pseries_idle_probe(void)
+static int __init pseries_idle_probe(void)
 {
 
if (cpuidle_disable != IDLE_NO_OVERRIDE)

base-commit: a6cae77f1bc89368a4e2822afcddc45c3062d499
-- 
2.33.0.rc0



Re: [PATCH v4] soc: fsl: qe: convert QE interrupt controller to platform_device

2021-08-03 Thread Saravana Kannan
On Tue, Aug 3, 2021 at 4:33 AM Maxim Kochetkov  wrote:
>
> Since 5.13 QE's ucc nodes can't get interrupts from devicetree:
>
> ucc@2000 {
> cell-index = <1>;
> reg = <0x2000 0x200>;
> interrupts = <32>;
> interrupt-parent = <&qeic>;
> };
>
> fw_devlink now expects the driver to create and probe a struct device
> for the interrupt controller.
>
> So let's convert this driver to a simple platform_device with probe().
> Also use the platform_get_ and devm_ families of functions to get/allocate
> resources and drop unused .compatible = "qeic".

Yes, please!

Acked-by: Saravana Kannan 

-Saravana

>
> [1] - 
> https://lore.kernel.org/lkml/CAGETcx9PiX==mlxb9po8myyk6u2vhpvwtmsa5nkd-ywh5xh...@mail.gmail.com
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Fixes: ea718c699055 ("Revert "Revert "driver core: Set fw_devlink=on by 
> default""")
> Signed-off-by: Maxim Kochetkov 
> Reported-by: kernel test robot 
> Reported-by: Dan Carpenter 
> ---
>  drivers/soc/fsl/qe/qe_ic.c | 75 ++
>  1 file changed, 44 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/soc/fsl/qe/qe_ic.c b/drivers/soc/fsl/qe/qe_ic.c
> index 3f711c1a0996..e710d554425d 100644
> --- a/drivers/soc/fsl/qe/qe_ic.c
> +++ b/drivers/soc/fsl/qe/qe_ic.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -404,41 +405,40 @@ static void qe_ic_cascade_muxed_mpic(struct irq_desc 
> *desc)
> chip->irq_eoi(&desc->irq_data);
>  }
>
> -static void __init qe_ic_init(struct device_node *node)
> +static int qe_ic_init(struct platform_device *pdev)
>  {
> +   struct device *dev = &pdev->dev;
> void (*low_handler)(struct irq_desc *desc);
> void (*high_handler)(struct irq_desc *desc);
> struct qe_ic *qe_ic;
> -   struct resource res;
> -   u32 ret;
> +   struct resource *res;
> +   struct device_node *node = pdev->dev.of_node;
>
> -   ret = of_address_to_resource(node, 0, &res);
> -   if (ret)
> -   return;
> +   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +   if (res == NULL) {
> +   dev_err(dev, "no memory resource defined\n");
> +   return -ENODEV;
> +   }
>
> -   qe_ic = kzalloc(sizeof(*qe_ic), GFP_KERNEL);
> +   qe_ic = devm_kzalloc(dev, sizeof(*qe_ic), GFP_KERNEL);
> if (qe_ic == NULL)
> -   return;
> +   return -ENOMEM;
>
> -   qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
> -  &qe_ic_host_ops, qe_ic);
> -   if (qe_ic->irqhost == NULL) {
> -   kfree(qe_ic);
> -   return;
> +   qe_ic->regs = devm_ioremap(dev, res->start, resource_size(res));
> +   if (qe_ic->regs == NULL) {
> +   dev_err(dev, "failed to ioremap() registers\n");
> +   return -ENODEV;
> }
>
> -   qe_ic->regs = ioremap(res.start, resource_size(&res));
> -
> qe_ic->hc_irq = qe_ic_irq_chip;
>
> -   qe_ic->virq_high = irq_of_parse_and_map(node, 0);
> -   qe_ic->virq_low = irq_of_parse_and_map(node, 1);
> +   qe_ic->virq_high = platform_get_irq(pdev, 0);
> +   qe_ic->virq_low = platform_get_irq(pdev, 1);
>
> -   if (!qe_ic->virq_low) {
> -   printk(KERN_ERR "Failed to map QE_IC low IRQ\n");
> -   kfree(qe_ic);
> -   return;
> +   if (qe_ic->virq_low < 0) {
> +   return -ENODEV;
> }
> +
> if (qe_ic->virq_high != qe_ic->virq_low) {
> low_handler = qe_ic_cascade_low;
> high_handler = qe_ic_cascade_high;
> @@ -447,6 +447,13 @@ static void __init qe_ic_init(struct device_node *node)
> high_handler = NULL;
> }
>
> +   qe_ic->irqhost = irq_domain_add_linear(node, NR_QE_IC_INTS,
> +  &qe_ic_host_ops, qe_ic);
> +   if (qe_ic->irqhost == NULL) {
> +   dev_err(dev, "failed to add irq domain\n");
> +   return -ENODEV;
> +   }
> +
> qe_ic_write(qe_ic->regs, QEIC_CICR, 0);
>
> irq_set_handler_data(qe_ic->virq_low, qe_ic);
> @@ -456,20 +463,26 @@ static void __init qe_ic_init(struct device_node *node)
> irq_set_handler_data(qe_ic->virq_high, qe_ic);
> irq_set_chained_handler(qe_ic->virq_high, high_handler);
> }
> +   return 0;
>  }
> +static const struct of_device_id qe_ic_ids[] = {
> +   { .compatible = "fsl,qe-ic"},
> +   { .type = "qeic"},
> +   {},
> +};
>
> -static int __init qe_ic_of_init(void)
> +static struct platform_driver qe_ic_driver =
>  {
> -   struct device_node *np;
> +   .driver = {
> +   .name   = "qe-ic",
> +   .of_match_table = qe_ic_ids,
> +   },
> +   .probe  = qe_ic_init,
> +};
>
> - 

Re: Debian SID kernel doesn't boot on PowerBook 3400c

2021-08-03 Thread Finn Thain
On Tue, 3 Aug 2021, Stan Johnson wrote:

> 
> I'm not sure of the issue you are referencing. If it's the Wallstreet 
> issue, I believe we were waiting to hear back from you regarding the 
> memory errors that crop up with CONFIG_VMAP_STACK=y and mem >464M. 
> Finn, if that is not correct, please let me know.
> 

No, it's not correct. I sent a message dated 3 Aug 2021 with a patch from 
Christophe. I also sent (privately) a message with instructions for 
testing that patch. I will resend these now.


Re: Debian SID kernel doesn't boot on PowerBook 3400c

2021-08-03 Thread Finn Thain


On Tue, 3 Aug 2021, Christophe Leroy wrote:

> 
> Looks like the memory errors are linked to KUAP (Kernel Userspace Access 
> Protection). Based on the places the problems happen, I don't think 
> there are any invalid accesses, so there must be something wrong in the 
> KUAP logic, probably linked to some interrupts happening in kernel mode 
> while the KUAP window is opened. And because it is not selected by default 
> on book3s/32 until 5.14, probably nobody ever tested it in a real 
> environment before you.
> 
> I think the issue may be linked to commit 
> https://github.com/linuxppc/linux/commit/c16728835 which happened 
> between 5.12 and 5.13.

The messages, "Kernel attempted to write user page (c6207c) - exploit 
attempt? (uid: 0)", appear in the console logs generated by v5.13. Those 
logs come from the Powerbook G3 discussion in the other thread. Could that 
be the same bug?


Re: [PATCH] cpuidle: pseries: Mark pseries_idle_proble() as __init

2021-08-03 Thread Michael Ellerman
Nathan Chancellor  writes:
> After commit 7cbd631d4dec ("cpuidle: pseries: Fixup CEDE0 latency only
> for POWER10 onwards"), pseries_idle_probe() is no longer inlined when
> compiling with clang, which causes a modpost warning:
>
> WARNING: modpost: vmlinux.o(.text+0xc86a54): Section mismatch in
> reference from the function pseries_idle_probe() to the function
> .init.text:fixup_cede0_latency()
> The function pseries_idle_probe() references
> the function __init fixup_cede0_latency().
> This is often because pseries_idle_probe lacks a __init
> annotation or the annotation of fixup_cede0_latency is wrong.
>
> pseries_idle_probe() is a non-init function, which calls
> fixup_cede0_latency(), which is an init function, explaining the
> mismatch. pseries_idle_probe() is only called from
> pseries_processor_idle_init(), which is an init function, so mark
> pseries_idle_probe() as __init so there is no more warning.
>
> Fixes: 054e44ba99ae ("cpuidle: pseries: Add function to parse extended CEDE 
> records")
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/cpuidle/cpuidle-pseries.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

I don't see this in my builds for some reason, but I guess toolchain or
config differences probably explain it.

Regardless, the patch is correct so I'll pick it up, thanks.

cheers

> diff --git a/drivers/cpuidle/cpuidle-pseries.c 
> b/drivers/cpuidle/cpuidle-pseries.c
> index bba449b77641..7e7ab5597d7a 100644
> --- a/drivers/cpuidle/cpuidle-pseries.c
> +++ b/drivers/cpuidle/cpuidle-pseries.c
> @@ -403,7 +403,7 @@ static void __init fixup_cede0_latency(void)
>   * pseries_idle_probe()
>   * Choose state table for shared versus dedicated partition
>   */
> -static int pseries_idle_probe(void)
> +static int __init pseries_idle_probe(void)
>  {
>  
>   if (cpuidle_disable != IDLE_NO_OVERRIDE)
>
> base-commit: a6cae77f1bc89368a4e2822afcddc45c3062d499
> -- 
> 2.33.0.rc0


[PATCH] powerpc: Always inline radix_enabled() to fix build failure

2021-08-03 Thread Jordan Niethe
This is the same as commit acdad8fb4a15 ("powerpc: Force inlining of
mmu_has_feature to fix build failure") but for radix_enabled().  The
config in the linked bugzilla causes the following build failure:

LD  .tmp_vmlinux.kallsyms1
powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function 
`.__ptep_set_access_flags':
pgtable.c:(.text+0x17c): undefined reference to `.radix__ptep_set_access_flags'
powerpc64-linux-ld: arch/powerpc/mm/pageattr.o: in function `.change_page_attr':
pageattr.c:(.text+0xc0): undefined reference to `.radix__flush_tlb_kernel_range'
powerpc64-linux-ld: arch/powerpc/mm/pageattr.o: in function `.set_page_attr':
pageattr.c:(.text+0x228): undefined reference to 
`.radix__flush_tlb_kernel_range'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/mmu_context.o:(.toc+0x0): 
undefined reference to `mmu_pid_bits'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/mmu_context.o:(.toc+0x8): 
undefined reference to `mmu_base_pid'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.pmd_hugepage_update':
pgtable.c:(.text+0x98): undefined reference to `.radix__pmd_hugepage_update'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.do_serialize':
pgtable.c:(.text+0xdc): undefined reference to `.exit_lazy_flush_tlb'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.pmdp_set_access_flags':
pgtable.c:(.text+0x258): undefined reference to `.radix__ptep_set_access_flags'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.pmdp_invalidate':
pgtable.c:(.text+0x4a8): undefined reference to `.radix__flush_pmd_tlb_range'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.pmdp_huge_get_and_clear_full':
pgtable.c:(.text+0x510): undefined reference to 
`.radix__pmdp_huge_get_and_clear'
powerpc64-linux-ld: pgtable.c:(.text+0x550): undefined reference to 
`.radix__flush_pmd_tlb_range'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.mmu_cleanup_all':
pgtable.c:(.text+0x674): undefined reference to `.radix__mmu_cleanup_all'
powerpc64-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.ptep_modify_prot_commit':
pgtable.c:(.text+0xdf8): undefined reference to 
`.radix__ptep_modify_prot_commit'
powerpc64-linux-ld: arch/powerpc/lib/code-patching.o: in function 
`.patch_instruction':
code-patching.c:(.text+0x318): undefined reference to `.radix__map_kernel_page'
powerpc64-linux-ld: code-patching.c:(.text+0x498): undefined reference to 
`.radix__flush_tlb_kernel_range'
powerpc64-linux-ld: kernel/fork.o: in function `.dup_mm':
fork.c:(.text+0x2138): undefined reference to `.radix__flush_tlb_mm'
powerpc64-linux-ld: mm/memory.o: in function `.unmap_page_range':
memory.c:(.text+0x305c): undefined reference to `.radix__tlb_flush'
powerpc64-linux-ld: mm/memory.o: in function `.do_wp_page':
memory.c:(.text+0x36cc): undefined reference to `.radix__flush_tlb_page'
powerpc64-linux-ld: mm/memory.o: in function `.do_set_pmd':
memory.c:(.text+0x42f8): undefined reference to 
`.radix__pgtable_trans_huge_deposit'
powerpc64-linux-ld: mm/memory.o: in function `.__handle_mm_fault':
memory.c:(.text+0x6af8): undefined reference to `.radix__flush_tlb_page'
powerpc64-linux-ld: mm/mprotect.o: in function `.change_protection':
mprotect.c:(.text+0x274): undefined reference to `.radix__flush_tlb_range'
powerpc64-linux-ld: mm/mremap.o: in function `.flush_tlb_range':
mremap.c:(.text+0x500): undefined reference to `.radix__flush_tlb_range'
powerpc64-linux-ld: mm/pgtable-generic.o: in function `.ptep_clear_flush':
pgtable-generic.c:(.text+0xb0): undefined reference to `.radix__flush_tlb_page'
powerpc64-linux-ld: mm/pgtable-generic.o: in function `.pmdp_huge_clear_flush':
pgtable-generic.c:(.text+0x160): undefined reference to 
`.radix__pmdp_huge_get_and_clear'
powerpc64-linux-ld: pgtable-generic.c:(.text+0x198): undefined reference to 
`.radix__flush_pmd_tlb_range'
powerpc64-linux-ld: mm/rmap.o: in function `.try_to_unmap_one':
rmap.c:(.text+0x1d60): undefined reference to `.radix__flush_tlb_range'
powerpc64-linux-ld: mm/rmap.o: in function `.try_to_migrate_one':
rmap.c:(.text+0x222c): undefined reference to `.radix__flush_tlb_range'
powerpc64-linux-ld: mm/vmalloc.o: in function `.flush_tlb_kernel_range':
vmalloc.c:(.text+0x5a8): undefined reference to `.radix__flush_tlb_kernel_range'
powerpc64-linux-ld: mm/hugetlb.o: in function `.hugetlb_cow':
hugetlb.c:(.text+0x53b0): undefined reference to `.radix__flush_hugetlb_page'
powerpc64-linux-ld: mm/hugetlb.o: in function `.hugetlb_change_protection':
hugetlb.c:(.text+0x6558): undefined reference to 
`.radix__flush_hugetlb_tlb_range'
powerpc64-linux-ld: mm/hugetlb.o: in function `.hugetlb_unshare_all_pmds':
hugetlb.c:(.text+0x70f0): undefined reference to 
`.radix__flush_hugetlb_tlb_range'
powerpc64-linux-ld: mm/huge_memory.o: in function `.pgtable_trans_huge_deposit':
huge_memory.c:(.text+0x6b0): undefined reference to 
`.radix__pgtable_trans_huge_deposit'
powerpc64-linux-ld: mm/huge

[powerpc:next] BUILD SUCCESS a6cae77f1bc89368a4e2822afcddc45c3062d499

2021-08-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: a6cae77f1bc89368a4e2822afcddc45c3062d499  powerpc/stacktrace: Include linux/delay.h

elapsed time: 725m

configs tested: 154
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm         defconfig
arm64       allyesconfig
arm64       defconfig
arm         allyesconfig
arm         allmodconfig
i386        randconfig-c001-20210803
mips        ip28_defconfig
arm         cerfcube_defconfig
arm         neponset_defconfig
arc         nsimosci_defconfig
arm         hackkit_defconfig
powerpc     obs600_defconfig
m68k        atari_defconfig
mips        rs90_defconfig
mips        jmr3927_defconfig
mips        ip32_defconfig
powerpc     xes_mpc85xx_defconfig
arm         s3c6400_defconfig
arm         xcep_defconfig
arm         vf610m4_defconfig
ia64        alldefconfig
powerpc     tqm8541_defconfig
nios2       3c120_defconfig
arm         pleb_defconfig
powerpc     pseries_defconfig
arm         trizeps4_defconfig
arm         imote2_defconfig
arm         dove_defconfig
h8300       alldefconfig
sh          se7619_defconfig
powerpc     bluestone_defconfig
mips        rm200_defconfig
powerpc     ppc64e_defconfig
powerpc     ppc6xx_defconfig
sh          sh7710voipgw_defconfig
powerpc     allnoconfig
x86_64      alldefconfig
powerpc     mpc512x_defconfig
openrisc    simple_smp_defconfig
powerpc     ppa8548_defconfig
s390        alldefconfig
sh          se7343_defconfig
h8300       h8300h-sim_defconfig
ia64        gensparse_defconfig
sh          apsh4ad0a_defconfig
mips        rb532_defconfig
arc         axs103_defconfig
sh          se7722_defconfig
arm         socfpga_defconfig
mips        db1xxx_defconfig
powerpc     lite5200b_defconfig
arm         omap1_defconfig
sparc       sparc32_defconfig
powerpc     mpc8540_ads_defconfig
powerpc     ebony_defconfig
powerpc     bamboo_defconfig
powerpc     kilauea_defconfig
mips        bcm63xx_defconfig
powerpc     mpc8315_rdb_defconfig
xtensa      generic_kc705_defconfig
mips        ip27_defconfig
powerpc64   defconfig
arm         sama5_defconfig
sh          sdk7786_defconfig
arc         vdk_hs38_smp_defconfig
ia64        allmodconfig
ia64        defconfig
ia64        allyesconfig
x86_64      allnoconfig
m68k        allmodconfig
m68k        defconfig
m68k        allyesconfig
nios2       defconfig
arc         allyesconfig
nds32       allnoconfig
nds32       defconfig
nios2       allyesconfig
csky        defconfig
alpha       defconfig
alpha       allyesconfig
xtensa      allyesconfig
h8300       allyesconfig
arc         defconfig
sh          allmodconfig
parisc      defconfig
s390        allyesconfig
s390        allmodconfig
parisc      allyesconfig
s390        defconfig
i386        allyesconfig
sparc       allyesconfig
sparc       defconfig
i386        defconfig
mips        allyesconfig
mips        allmodconfig
powerpc     allyesconfig
powerpc     allmodconfig
x86_64      randconfig-a002-20210803
x86_64      randconfig-a004-20210803
x86_64      randconfig-a006-20210803
x86_64   randconfig

Re: Debian SID kernel doesn't boot on PowerBook 3400c

2021-08-03 Thread Stan Johnson
On 8/3/21 4:08 AM, Christophe Leroy wrote:
> 
> 
> Le 02/08/2021 à 19:32, Stan Johnson a écrit :
>> On 8/2/21 8:41 AM, Christophe Leroy wrote:
>>>
>>> ...
>
> Can you try again without CONFIG_VMAP_STACK ?
>
> Thanks
> Christophe
> ...


 With CONFIG_VMAP_STACK=y, 5.11.0-rc5-pmac-00034-g684da7628d9 hangs at
 boot on the PB 3400c.

 Without CONFIG_VMAP_STACK, 5.11.0-rc5-pmac-00034-g684da7628d9 boots as
 expected.

 I didn't re-build the Debian SID kernel, though I confirmed that the
 Debian config file for 5.10.0-8-powerpc includes CONFIG_VMAP_STACK=y.
 It's not clear whether removing CONFIG_VMAP_STACK would be appropriate
 for other powerpc systems.

 Please let me know why removing CONFIG_VMAP_STACK fixed the problem on
 the PB 3400c. Should CONFIG_HAVE_ARCH_VMAP_STACK also be removed?

>>>
>>> When CONFIG_HAVE_ARCH_VMAP_STACK is selected by the architecture,
>>> CONFIG_VMAP_STACK  is selected by default.
>>>
>>> The point is that your config has CONFIG_ADB_PMU.
>>>
>>> A bug with VMAP stack was detected during the 5.9 release cycle for
>>> platforms selecting CONFIG_ADB_PMU. Because fixing the bug was a heavy
>>> change, we preferred at that time to disable VMAP stack, so VMAP stack
>>> was deselected for CONFIG_ADB_PMU by commit
>>> 4a133eb351ccc275683ad49305d0b04dde903733.
>>>
>>> Then as a second step, the proper fix was implemented and then VMAP
>>> stack was enabled again by the commit you bisected.
>>>
>>> Taking into account that the problem disappears for you when you
>>> manually deselect VMAP stacks, it means the problem is not the fix
>>> itself, but the fact that VMAP stacks are now enabled by default.
>>>
>>> We need to understand why VMAP stack doesn't work on your platform and,
>>> more than that, why it doesn't boot at all with VMAP stack.
>>>
>>> Could you send me the dmesg output of your system when it properly
>>> boots ?
>>>
>>> Did you check with kernel 5.13 ?
>>>
>>> Thanks
>>> Christophe
>>>
>>
>> Christophe,
>>
>> Thanks for your response. It looks like I never tested v5.13 (I was
>> originally just reporting that the default Debian SID kernel,
>> 5.10.0-8-powerpc, hangs at boot on the PB 3400c).
>>
>> So I rebuilt the stock v5.13 from kernel.org using Finn's
>> dot-config-powermac-5.13, which got changed slightly at compilation (see
>> dot-config-v5.13-pmac, attached). It has CONFIG_VMAP_STACK and
>> CONFIG_ADB_PMU set, and it booted, but there were multiple memory
>> errors. So it looks like the hang-at-boot problem was fixed sometime
>> after v5.11, but there are now memory errors (similar to Wallstreet).
>>
>> With CONFIG_VMAP_STACK not set (CONFIG_ADB_PMU is still set), the
>> .config file turns into the attached dot-config-v5.13-pmac_NO_VMAP. And
>> there were still memory errors (dmesg output attached).
>>
>> The memory errors may be a completely unrelated issue, since they occur
>> regardless of the CONFIG_VMAP_STACK setting.
>>
>> To help rule out a hardware issue, I confirmed that memory errors don't
>> occur with v5.8.2 (dmesg output attached).
>>
>> A useful git bisect might be possible if CONFIG_VMAP_STACK is disabled
>> for each build. I would need to determine where the memory errors
>> started (v5.9, v5.10, v5.11, or v5.12). There is the complication that
>> (at least) several v5.10 kernels won't compile if SMP is set, so I might
>> need to disable that everywhere as well, assuming the SMP fix didn't
>> cause the memory errors.
>>
> 
> Thanks a lot for the information.
> 
> Looks like the memory errors are linked to KUAP (Kernel Userspace Access
> Protection). Based on the places where the problems happen, I don't think
> there is any invalid access, so there must be something wrong in the
> KUAP logic, probably linked to some interrupts happening in kernel mode
> while the KUAP window is open. And because KUAP is not selected by
> default on book3s/32 until 5.14, probably nobody ever tested it in a real
> environment before you.
> 
> I think the issue may be linked to commit
> https://github.com/linuxppc/linux/commit/c16728835 which happened
> between 5.12 and 5.13. Would be nice if you could confirm that 5.12
> doesn't have the problem (At the same time maybe you can see if 5.12
> also boots OK with CONFIG_VMAP_STACK)

On the PB 3400c:
1) v5.12 with CONFIG_VMAP_STACK -- hangs at boot; see attached config.
2) v5.12 without CONFIG_VMAP_STACK -- did not hang at boot, but hung at
"Run /sbin/init as init process" (I tested it twice; there were no
errors logged); see attached config and serial console log.
3) v5.11 with CONFIG_VMAP_STACK -- hangs at boot, no output at serial
console (see attached config).
4) v5.11 without CONFIG_VMAP_STACK -- no errors (confirms earlier
result); see attached config and dmesg output.

The PB 3400C has a 240 MHz 603e and 144M memory. It's a text-only system
running Debian SID with sysvinit (X Windows and systemd would run too
slowly here).

Please note that the issue o

[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()

2021-08-03 Thread Xianting Tian
As is well known, an hvc backend can register its operations with the hvc
framework. The operations include put_chars(), get_chars() and so on.

Some hvc backends may do DMA in their operations, e.g. put_chars() of
virtio-console. But the hvc framework may pass DMA-incapable memory to
put_chars() under a specific configuration, which is explained in commit
c4baad5029 ("virtio-console: avoid DMA from stack"):
1, c[] is on stack,
   hvc_console_print():
	char c[N_OUTBUF] __ALIGNED__;
	cons_ops[index]->put_chars(vtermnos[index], c, i);
2, ch is on stack,
   static void hvc_poll_put_char(,,char ch)
   {
	struct tty_struct *tty = driver->ttys[0];
	struct hvc_struct *hp = tty->driver_data;
	int n;

	do {
		n = hp->ops->put_chars(hp->vtermno, &ch, 1);
	} while (n <= 0);
   }

Commit c4baad5029 is the fix that avoids DMA from stack memory, which
is passed to virtio-console by the hvc framework in the code above. But I
think the fix is aggressive: it unconditionally uses kmemdup() to allocate
a new buffer from the kmalloc area and copies the data, whether or not the
memory is already in the kmalloc area. More importantly, it would be better
fixed in the hvc framework, by changing it to never pass stack memory to the
put_chars() function in the first place. Otherwise, we will face the same
issue if a new hvc backend using DMA is added in the future.

To avoid lock contention on hp->outbuf, we create a new buffer,
hp->hvc_con_outbuf, which is aligned at least to N_OUTBUF, and use it
in the two cases above.

With the patch, we can remove the fix c4baad5029.
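
The idea described above can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the patch itself:
struct console_port, port_alloc(), port_free(), port_put_char() and
backend_put_chars() are hypothetical names, calloc() stands in for
kzalloc(), N_OUTBUF is assumed to be 16, and the lock that the patch
takes around the buffer is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N_OUTBUF 16 /* assumed buffer size for this sketch */

/* Hypothetical stand-in for struct hvc_struct. */
struct console_port {
	char *con_outbuf;     /* heap buffer handed to the backend */
	const char *last_buf; /* records the pointer the backend received */
	char last_char;       /* records the character the backend received */
};

/* Hypothetical backend; a real one might start DMA from buf. */
static int backend_put_chars(struct console_port *p, const char *buf, int count)
{
	p->last_buf = buf;
	p->last_char = buf[0];
	return count;
}

/* Allocate the port and its output buffer once, up front. */
static struct console_port *port_alloc(void)
{
	struct console_port *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->con_outbuf = calloc(1, N_OUTBUF);
	if (!p->con_outbuf) {
		free(p);
		return NULL;
	}
	return p;
}

static void port_free(struct console_port *p)
{
	free(p->con_outbuf);
	free(p);
}

/*
 * Copy the character into the heap buffer first, so the backend never
 * sees the address of the on-stack parameter ch.
 */
static int port_put_char(struct console_port *p, char ch)
{
	assert(p && p->con_outbuf);
	p->con_outbuf[0] = ch;
	return backend_put_chars(p, p->con_outbuf, 1);
}
```

The point of the design is that the copy happens once, in the framework,
so individual backends do not each need their own bounce buffer.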

Signed-off-by: Xianting Tian 
Tested-by: Xianting Tian 
---
 drivers/tty/hvc/hvc_console.c | 30 --
 drivers/tty/hvc/hvc_console.h |  2 ++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 5bb8c4e44..e5862989c 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -151,9 +151,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] =
 static void hvc_console_print(struct console *co, const char *b,
 			      unsigned count)
 {
-	char c[N_OUTBUF] __ALIGNED__;
+	char *c;
 	unsigned i = 0, n = 0;
 	int r, donecr = 0, index = co->index;
+	unsigned long flags;
+	struct hvc_struct *hp;
 
 	/* Console access attempt outside of acceptable console range. */
 	if (index >= MAX_NR_HVC_CONSOLES)
@@ -163,6 +165,13 @@ static void hvc_console_print(struct console *co, const char *b,
 	if (vtermnos[index] == -1)
 		return;
 
+	list_for_each_entry(hp, &hvc_structs, next)
+		if (hp->vtermno == vtermnos[index])
+			break;
+
+	c = hp->hvc_con_outbuf;
+
+	spin_lock_irqsave(&hp->hvc_con_lock, flags);
 	while (count > 0 || i > 0) {
 		if (count > 0 && i < sizeof(c)) {
 			if (b[n] == '\n' && !donecr) {
@@ -191,6 +200,7 @@ static void hvc_console_print(struct console *co, const char *b,
 			}
 		}
 	}
+	spin_unlock_irqrestore(&hp->hvc_con_lock, flags);
 	hvc_console_flush(cons_ops[index], vtermnos[index]);
 }
 
@@ -878,9 +888,15 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch)
 	struct tty_struct *tty = driver->ttys[0];
 	struct hvc_struct *hp = tty->driver_data;
 	int n;
+	unsigned long flags;
+	char *c;
 
+	c = hp->hvc_con_outbuf;
 	do {
-		n = hp->ops->put_chars(hp->vtermno, &ch, 1);
+		spin_lock_irqsave(&hp->hvc_con_lock, flags);
+		c[0] = ch;
+		n = hp->ops->put_chars(hp->vtermno, c, 1);
+		spin_unlock_irqrestore(&hp->hvc_con_lock, flags);
 	} while (n <= 0);
 }
 #endif
@@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data,
 	hp->outbuf_size = outbuf_size;
 	hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))];
 
+	/*
+	 * hvc_con_outbuf is guaranteed to be aligned at least to the
+	 * size(N_OUTBUF) by kmalloc().
+	 */
+	hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL);
+	if (!hp->hvc_con_outbuf)
+		return ERR_PTR(-ENOMEM);
+
+	spin_lock_init(&hp->hvc_con_lock);
+
 	tty_port_init(&hp->port);
 	hp->port.ops = &hvc_port_ops;
 
diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index 18d005814..8972c52de 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -48,6 +48,8 @@ struct hvc_struct {
 	struct work_struct tty_resize;
 	struct list_head next;
 	unsigned long flags;
+	char *hvc_con_outbuf;
+	spinlock_t hvc_con_lock;
 };
 
 /* implemented by a low level driver */
-- 
2.17.1
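
One detail in the hvc_console_print() hunk may deserve a second look: the
unchanged context line still tests "i < sizeof(c)", but the patch changes
c from a char array to a char pointer, so sizeof(c) would now be the size
of a pointer rather than N_OUTBUF. The standalone sketch below shows the
difference. N_OUTBUF is assumed to be 16 here, and the helper names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define N_OUTBUF 16 /* assumed value for illustration */

/* sizeof applied to an array yields the array's size in bytes. */
static size_t limit_with_array(void)
{
	char c[N_OUTBUF];

	return sizeof(c);
}

/* sizeof applied to a pointer yields the pointer's own size. */
static size_t limit_with_pointer(void)
{
	static char storage[N_OUTBUF];
	char *c = storage;

	return sizeof(c);
}

/* Using the constant keeps the bound independent of how c is declared. */
static size_t limit_with_constant(void)
{
	return N_OUTBUF;
}
```

If that reading is right, comparing against N_OUTBUF directly would keep
the loop bound the same as before the conversion.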



[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()

2021-08-03 Thread Xianting Tian
The hvc framework will never pass stack memory to the put_chars() function,
so the call to kmemdup() is unnecessary. Remove it.

This reverts commit c4baad5029 ("virtio-console: avoid DMA from stack").

Signed-off-by: Xianting Tian 
---
 drivers/char/virtio_console.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 7eaf303a7..4ed3ffb1d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count)
 {
 	struct port *port;
 	struct scatterlist sg[1];
-	void *data;
-	int ret;
 
 	if (unlikely(early_put_chars))
 		return early_put_chars(vtermno, buf, count);
@@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count)
 	if (!port)
 		return -EPIPE;
 
-	data = kmemdup(buf, count, GFP_ATOMIC);
-	if (!data)
-		return -ENOMEM;
-
-	sg_init_one(sg, data, count);
-	ret = __send_to_port(port, sg, 1, count, data, false);
-	kfree(data);
-	return ret;
+	sg_init_one(sg, buf, count);
+	return __send_to_port(port, sg, 1, count, (void *)buf, false);
 }
 
 /*
-- 
2.17.1



Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)

2021-08-03 Thread Finn Thain
On Tue, 3 Aug 2021, Christophe Leroy wrote:

> When a DSI (Data Storage Interrupt) is taken while in NAP mode, r11 
> doesn't survive the call to power_save_ppc32_restore().
> 
> So use r1 instead of r11 as they both contain the virtual stack pointer 
> at that point.
> 
> Reported-by: Finn Thain 
> Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")

Regarding that 'Fixes' tag, this patch has not fixed the failure below, 
unfortunately. But there appear to be several bugs in play here. Can you 
tell us which failure mode is associated with the bug addressed by this 
patch?

------------[ cut here ]------------
kernel BUG at arch/powerpc/kernel/interrupt.c:49!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
NIP:  c0011474 LR: c0011464 CTR: 
REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
MSR:  00021032   CR: 2400446c  XER: 2000

GPR00: c001604c e2f75f00 ca284a60   a5205eb0 0008 0020
GPR08: ffc0 0001 501200d9 ce030005 ca285010 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24:  ffc0 0020 0008 a5205eb0  e2f75f40 00ae
NIP [c0011474] system_call_exception+0x60/0x164
LR [c0011464] system_call_exception+0x50/0x164
Call Trace:
[e2f75f00] [9000] 0x9000 (unreliable)
[e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa69d6cb0
NIP:  a69d6cb0 LR: a69d6c3c CTR: 
REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
MSR:  d032   CR: 2400446c  XER: 2000

GPR00: 00ae a5205de0 a5687ca0   a5205eb0 0008 0020
GPR08: ffc0 401201ea 401200d9  c158f230 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24: afb72fc8  0001 a5205f30 afb733dc  a6b85ff4 a5205eb0
NIP [a69d6cb0] 0xa69d6cb0
LR [a69d6c3c] 0xa69d6c3c
--- interrupt: c00
Instruction dump:
7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
817e0084 931e0088 69690002 5529fffe <0f09> 69694000 552997fe 0f09
---[ end trace c66c6c3c44806276 ]---


Re: [PATCH v3 31/41] powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE

2021-08-03 Thread Finn Thain


On Tue, 3 Aug 2021, Stan Johnson wrote:

> Attached you will find the following six files:
> 
> 1) config-5.13-patched_VMAP.txt
> 2) config-5.13-patched_NO_VMAP.txt
> 3) pb3400c-console-5.13-patched_VMAP.txt (using config 1)
> 4) pb3400c-console-5.13-patched_NO_VMAP.txt (using config 2)
> 5) ws-console-5.13-patched_VMAP.txt (using config 1)
> 6) ws-console-5.13-patched_NO_VMAP.txt (using config 2)
> 

Thanks!

> The command lines in BootX were as follows:
> 
> PB 3400c:
> root=/dev/sda13 console=ttyS0 video=chips65550:vmode:14,cmode:16
> 
> Wallstreet:
> root=/dev/sda12 console=ttyS0 video=ofonly
> 
> Notes:
> 
> For 3), the patch seems to have fixed the "hang-at-boot" at the Mac
> OS screen for the PB 3400c. 

I doubt that. I suspect that this is an unrelated failure that only 
affects the Powerbook 3400 and only intermittently. I say that because 
you've also observed this failure in v5.11.

So we should probably ignore this early-boot failure for the moment. Stan, 
if it happens again, please reboot and retry. That may allow us to make 
progress on the other bugs.

> After a successful boot, I didn't see any errors until I accessed the 
> system via ssh. In an ssh window, I entered "dmesg" (no errors) followed 
> by "ls -Rail /usr/bin", and while that was running, the errors appeared. 

Since Stan has a yahoo email address that isn't allowed past the spam 
filter, I'll paste that portion of the console log he sent --

Kernel attempted to write user page (78a930) - exploit attempt? (uid: 1000)
------------[ cut here ]------------
Bug: Write fault blocked by KUAP!
WARNING: CPU: 0 PID: 1619 at arch/powerpc/mm/fault.c:230 do_page_fault+0x484/0x720
Modules linked in:
CPU: 0 PID: 1619 Comm: sshd Not tainted 5.13.0-pmac-VMAP #10
NIP:  c001b780 LR: c001b780 CTR: 
REGS: cb981bc0 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
MSR:  00021032   CR: 24942424  XER: 

GPR00: c001b780 cb981c80 c151c1e0 0021 3bff 085b 0027 c8eb644c
GPR08: 0023    24942424 0076f8c8  000186a0
GPR16: afab5544 afab5540 afab553c afab5538 afab5534 afab5530 0004 0078a934
GPR24:   0078a970 0200 c1497b60 0078a930 0300 cb981cc0
NIP [c001b780] do_page_fault+0x484/0x720
LR [c001b780] do_page_fault+0x484/0x720
Call Trace:
[cb981c80] [c001b780] do_page_fault+0x484/0x720 (unreliable)
[cb981cb0] [c000424c] DataAccess_virt+0xd4/0xe4
--- interrupt: 300 at __copy_tofrom_user+0x110/0x20c
NIP:  c001f9bc LR: c0172b04 CTR: 0001
REGS: cb981cc0 TRAP: 0300   Not tainted  (5.13.0-pmac-VMAP)
MSR:  9032   CR: 442444e8  XER: 2000
DAR: 0078a930 DSISR: 0a00
GPR00:  cb981d80 c151c1e0 0078a930 cb981db8 0004 0078a92c 0100
GPR08: 0122 10c279a1 1000 c1800034 242444e2 0076f8c8  000186a0
GPR16: afab5544 afab5540 afab553c afab5538 afab5534 afab5530 0004 0078a934
GPR24:   0078a970 0078a930 cb981dac cb981dac 0001 0004
NIP [c001f9bc] __copy_tofrom_user+0x110/0x20c
LR [c0172b04] core_sys_select+0x3e8/0x594
--- interrupt: 300
[cb981d80] [c0172960] core_sys_select+0x244/0x594 (unreliable)
[cb981ee0] [c0172d98] kern_select+0xe8/0x158
[cb981f30] [c001604c] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa7a4f388
NIP:  a7a4f388 LR: a7a4f35c CTR: 
REGS: cb981f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
MSR:  d032   CR: 240044e2  XER: 2000

GPR00: 008e afab54e0 a73cc7d0 000c 0078a930 0078a970  
GPR08: 0004   a79e45b0 28004462 0076f8c8  000186a0
GPR16: afab5544 afab5540 afab553c afab5538 afab5534 afab5530 0004 00770490
GPR24: afab552f 0004  0078a930  00771734 a7b2fff4 00798cb0
NIP [a7a4f388] 0xa7a4f388
LR [a7a4f35c] 0xa7a4f35c
--- interrupt: c00
Instruction dump:
3884aa30 3863012c 4807685d 807f0080 48042e41 2f83 419e0148 3c80c079
3c60c076 38841b6c 38630174 4801f701 <0fe0> 386b 4bfffe30 3c80c06b
---[ end trace c6ec12d4725e6f89 ]---

> I'll enter the same commands for the other three boots. It may be 
> important that I didn't see errors until there was significant network 
> access.
> 
> For 4), the PB 3400c also booted normally. Errors started after
> logging in via ssh when I entered "dmesg". To be consistent with the
> first test, I followed that with "ls -Rail /usr/bin" and saw more
> errors. A normal reboot ("shutdown -r now") caused even more errors.
> 

Here's the relevant portion of that log:

Kernel attempted to write user page (ba3bc0) - exploit attempt? (uid: 1000)
------------[ cut here ]------------
Bug: Write fault blocked by KUAP!
WARNING: CPU: 0 PID: 1609 at arch/powerpc/mm/fault.c:230 do_page_fault+0x484/0x720
Modules linked in:
CPU: 0 PID: 1609 Comm: bash Not tainted 5.13.0-pmac-NO_VMAP #11
NIP:  c001b780 LR: c001b780 CTR: 
REGS: c3c5bba0 TRAP: 0700   Not tainted  (5.13.0-pmac-NO_VMAP)
MSR:  00021032   CR: 24442424  XER: 

GPR00: c001b780 c3c5bc60 c3842ca0 0021 3bff 08

[PATCH printk v1 00/10] printk: introduce atomic consoles and sync mode

2021-08-03 Thread John Ogness
Hi,

This is the next part of our printk-rework effort (points 3 and
4 of the LPC 2019 summary [0]).

Here the concept of "atomic consoles" is introduced through  a
new (optional) write_atomic() callback for console drivers. This
callback must be implemented as an NMI-safe variant of the
write() callback, meaning that it can function from any context
without relying on questionable tactics such as ignoring locking
and also without relying on the synchronization of console
semaphore.

As an example of how such an atomic console can look like, this
series implements write_atomic() for the 8250 UART driver.

This series also introduces a new console printing mode called
"sync mode" that is only activated when the kernel is about to
end (such as panic, oops, shutdown, reboot). Sync mode can only
be activated if atomic consoles are available. A system without
registered atomic consoles will be unaffected by this series.

When in sync mode, the console printing behavior becomes:

- only consoles implementing write_atomic() will be called

- printing occurs within vprintk_store() instead of
  console_unlock(), since the console semaphore is irrelevant
  for atomic consoles

For systems that have registered atomic consoles, this series
improves the reliability of seeing crash messages by using new
locking techniques rather than "ignoring locks and hoping for
the best". In particular, atomic consoles rely on the
CPU-reentrant spinlock (i.e. the printk cpulock) for
synchronizing console output.
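
To make the term "CPU-reentrant spinlock" concrete, here is a minimal
userspace sketch using C11 atomics. It is an assumption-laden model and
not the code from this series: struct cpu_lock and the cpu_lock_*()
helpers are hypothetical names, the CPU id is passed in by the caller,
and the sketch only tries once instead of spinning with interrupts
disabled as a kernel lock would.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical reentrant lock keyed by CPU id; not the kernel's code. */
struct cpu_lock {
	atomic_int owner; /* CPU id of the holder, or NO_OWNER */
	int nested;       /* extra acquisitions made by the owner */
};

static void cpu_lock_init(struct cpu_lock *l)
{
	atomic_init(&l->owner, NO_OWNER);
	l->nested = 0;
}

/* Try once; returns true if the given CPU now holds the lock. */
static bool cpu_lock_try(struct cpu_lock *l, int cpu)
{
	int expected = NO_OWNER;

	if (atomic_load(&l->owner) == cpu) {
		l->nested++; /* same CPU re-entering: only count it */
		return true;
	}
	return atomic_compare_exchange_strong(&l->owner, &expected, cpu);
}

/* Undo one acquisition; the lock is freed when nesting reaches zero. */
static void cpu_lock_release(struct cpu_lock *l, int cpu)
{
	assert(atomic_load(&l->owner) == cpu);
	if (l->nested > 0) {
		l->nested--;
		return;
	}
	atomic_store(&l->owner, NO_OWNER);
}
```

The reentrancy is what lets a CPU that already holds the lock (for
example, one interrupted by an NMI in the middle of printing) take it
again without deadlocking against itself, while other CPUs stay excluded.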

John Ogness

[0] https://lore.kernel.org/lkml/87k1acz5rx@linutronix.de/

John Ogness (10):
  printk: relocate printk cpulock functions
  printk: rename printk cpulock API and always disable interrupts
  kgdb: delay roundup if holding printk cpulock
  printk: relocate printk_delay()
  printk: call boot_delay_msec() in printk_delay()
  printk: use seqcount_latch for console_seq
  console: add write_atomic interface
  printk: introduce kernel sync mode
  kdb: if available, only use atomic consoles for output mirroring
  serial: 8250: implement write_atomic

 arch/powerpc/include/asm/smp.h |   1 +
 arch/powerpc/kernel/kgdb.c |  10 +-
 arch/powerpc/kernel/smp.c  |   5 +
 arch/x86/kernel/kgdb.c |   9 +-
 drivers/tty/serial/8250/8250.h |  47 ++-
 drivers/tty/serial/8250/8250_core.c|  17 +-
 drivers/tty/serial/8250/8250_fsl.c |   9 +
 drivers/tty/serial/8250/8250_ingenic.c |   7 +
 drivers/tty/serial/8250/8250_mtk.c |  29 +-
 drivers/tty/serial/8250/8250_port.c|  92 ++--
 drivers/tty/serial/8250/Kconfig|   1 +
 include/linux/console.h|  32 ++
 include/linux/kgdb.h   |   3 +
 include/linux/printk.h |  57 +--
 include/linux/serial_8250.h|   5 +
 kernel/debug/debug_core.c  |  45 +-
 kernel/debug/kdb/kdb_io.c  |  16 +
 kernel/printk/printk.c | 554 +
 lib/Kconfig.debug  |   3 +
 lib/dump_stack.c   |   4 +-
 lib/nmi_backtrace.c|   4 +-
 21 files changed, 684 insertions(+), 266 deletions(-)


base-commit: 23d8adcf8022b9483605531d8985f5b77533cb3a
-- 
2.20.1



Re: [PATCH printk v1 00/10] printk: introduce atomic consoles and sync mode

2021-08-03 Thread Andy Shevchenko
On Tue, Aug 03, 2021 at 03:18:51PM +0206, John Ogness wrote:
> Hi,
> 
> This is the next part of our printk-rework effort (points 3 and
> 4 of the LPC 2019 summary [0]).
> 
> Here the concept of "atomic consoles" is introduced through  a
> new (optional) write_atomic() callback for console drivers. This
> callback must be implemented as an NMI-safe variant of the
> write() callback, meaning that it can function from any context
> without relying on questionable tactics such as ignoring locking
> and also without relying on the synchronization of console
> semaphore.
> 
> As an example of how such an atomic console can look like, this
> series implements write_atomic() for the 8250 UART driver.
> 
> This series also introduces a new console printing mode called
> "sync mode" that is only activated when the kernel is about to
> end (such as panic, oops, shutdown, reboot). Sync mode can only
> be activated if atomic consoles are available. A system without
> registered atomic consoles will be unaffected by this series.
> 
> When in sync mode, the console printing behavior becomes:
> 
> - only consoles implementing write_atomic() will be called
> 
> - printing occurs within vprintk_store() instead of
>   console_unlock(), since the console semaphore is irrelevant
>   for atomic consoles
> 
> For systems that have registered atomic consoles, this series
> improves the reliability of seeing crash messages by using new
> locking techniques rather than "ignoring locks and hoping for
> the best". In particular, atomic consoles rely on the
> CPU-reentrant spinlock (i.e. the printk cpulock) for
> synchronizing console output.

If the console is runtime suspended, who will bring it up?
Does it mean that this callback can't be implemented on consoles that
do runtime suspend (some of the 8250 ones currently, for example)?

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v2 5/6] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-03 Thread Boris Ostrovsky


On 8/3/21 6:01 AM, Uwe Kleine-König wrote:
> This prepares for removing the driver member of struct pci_dev, which
> holds the same information as struct pci_dev::dev->driver.
>
> Signed-off-by: Uwe Kleine-König 
> ---
>  arch/powerpc/include/asm/ppc-pci.h|  3 +-
>  arch/powerpc/kernel/eeh_driver.c  | 12 ---
>  arch/x86/events/intel/uncore.c|  2 +-
>  arch/x86/kernel/probe_roms.c  |  2 +-
>  drivers/bcma/host_pci.c   |  6 ++--
>  drivers/crypto/hisilicon/qm.c |  2 +-
>  drivers/crypto/qat/qat_common/adf_aer.c   |  2 +-
>  drivers/message/fusion/mptbase.c  |  4 +--
>  drivers/misc/cxl/guest.c  | 21 +--
>  drivers/misc/cxl/pci.c| 25 +++--
>  .../ethernet/hisilicon/hns3/hns3_ethtool.c|  2 +-
>  .../ethernet/marvell/prestera/prestera_pci.c  |  2 +-
>  drivers/net/ethernet/mellanox/mlxsw/pci.c |  2 +-
>  .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  2 +-
>  drivers/pci/iov.c | 23 +++-
>  drivers/pci/pci-driver.c  | 28 ---
>  drivers/pci/pci.c | 10 +++---
>  drivers/pci/pcie/err.c| 35 ++-
>  drivers/pci/xen-pcifront.c|  3 +-
>  drivers/ssb/pcihost_wrapper.c |  7 ++--
>  drivers/usb/host/xhci-pci.c   |  3 +-
>  21 files changed, 112 insertions(+), 84 deletions(-)


For Xen bits:

Reviewed-by: Boris Ostrovsky 




[PATCH 00/38] Replace deprecated CPU-hotplug

2021-08-03 Thread Sebastian Andrzej Siewior
This is a tree wide replacement of the deprecated CPU hotplug functions
which are only wrappers around the actual functions.

Each patch is independent and can be picked up by the relevant maintainer.
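
As a rough illustration of what "only wrappers around the actual
functions" means, the sketch below models the pattern in userspace C. The
cover letter does not name the functions; the pair shown here,
get_online_cpus()/put_online_cpus() forwarding to
cpus_read_lock()/cpus_read_unlock(), is an assumption about which
interfaces are meant. The lock bodies are stand-ins that only count
readers, and the two walk_cpus_*() callers are hypothetical.

```c
#include <assert.h>

/* Stand-ins so the sketch compiles; they only count active readers. */
static int hotplug_readers;

static void cpus_read_lock(void)
{
	hotplug_readers++;
}

static void cpus_read_unlock(void)
{
	assert(hotplug_readers > 0);
	hotplug_readers--;
}

/* Deprecated-style wrappers: they simply forward to the functions above. */
static inline void get_online_cpus(void)
{
	cpus_read_lock();
}

static inline void put_online_cpus(void)
{
	cpus_read_unlock();
}

/* Hypothetical caller before the conversion. */
static int walk_cpus_old_style(void)
{
	int seen;

	get_online_cpus();
	seen = hotplug_readers; /* work that needs a stable CPU set */
	put_online_cpus();
	return seen;
}

/* The same caller after the conversion: identical behaviour. */
static int walk_cpus_new_style(void)
{
	int seen;

	cpus_read_lock();
	seen = hotplug_readers;
	cpus_read_unlock();
	return seen;
}
```

Because the wrapper adds nothing, each call site can be converted on its
own, which is consistent with the patches being independent.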

Cc: Alexander Shishkin 
Cc: Amit Kucheria 
Cc: Andrew Morton 
Cc: Andy Lutomirski 
Cc: Arnaldo Carvalho de Melo 
Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Ben Segall 
Cc: Borislav Petkov 
Cc: cgro...@vger.kernel.org
Cc: Christian Borntraeger 
Cc: coresi...@lists.linaro.org
Cc: Daniel Bristot de Oliveira 
Cc: Daniel Jordan 
Cc: Daniel Lezcano 
Cc: Dave Hansen 
Cc: Davidlohr Bueso 
Cc: "David S. Miller" 
Cc: Dietmar Eggemann 
Cc: Gonglei 
Cc: Greg Kroah-Hartman 
Cc: Guenter Roeck 
Cc: Hans de Goede 
Cc: Heiko Carstens 
Cc: Herbert Xu 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Ingo Molnar 
Cc: Jakub Kicinski 
Cc: Jason Wang 
Cc: Jean Delvare 
Cc: Jiri Kosina 
Cc: Jiri Olsa 
Cc: Joe Lawrence 
Cc: Joel Fernandes 
Cc: Johannes Weiner 
Cc: John Stultz 
Cc: Jonathan Corbet 
Cc: Josh Poimboeuf 
Cc: Josh Triplett 
Cc: Julian Wiedmann 
Cc: Juri Lelli 
Cc: Karol Herbst 
Cc: Karsten Graul 
Cc: kvm-...@vger.kernel.org
Cc: Lai Jiangshan 
Cc: Len Brown 
Cc: Len Brown 
Cc: Leo Yan 
Cc: linux-a...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-cry...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-e...@vger.kernel.org
Cc: linux-hw...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-r...@vger.kernel.org
Cc: linux-s...@vger.kernel.org
Cc: live-patch...@vger.kernel.org
Cc: Mark Gross 
Cc: Mark Rutland 
Cc: Mathieu Desnoyers 
Cc: Mathieu Poirier 
Cc: Mel Gorman 
Cc: Michael Ellerman 
Cc: "Michael S. Tsirkin" 
Cc: Mike Leach 
Cc: Mike Travis 
Cc: Miroslav Benes 
Cc: Namhyung Kim 
Cc: net...@vger.kernel.org
Cc: nouv...@lists.freedesktop.org
Cc: "Paul E. McKenney" 
Cc: Paul Mackerras 
Cc: Pavel Machek 
Cc: Pekka Paalanen 
Cc: Peter Zijlstra 
Cc: Petr Mladek 
Cc: platform-driver-...@vger.kernel.org
Cc: "Rafael J. Wysocki" 
Cc: r...@vger.kernel.org
Cc: Robin Holt 
Cc: Song Liu 
Cc: Srinivas Pandruvada 
Cc: Steffen Klassert 
Cc: Stephen Boyd 
Cc: Steven Rostedt 
Cc: Steve Wahl 
Cc: Stuart Hayes 
Cc: Suzuki K Poulose 
Cc: Tejun Heo 
Cc: Thomas Bogendoerfer 
Cc: Thomas Gleixner 
Cc: Tony Luck 
Cc: Vasily Gorbik 
Cc: Vincent Guittot 
Cc: Viresh Kumar 
Cc: virtualizat...@lists.linux-foundation.org
Cc: x...@kernel.org
Cc: Zefan Li 
Cc: Zhang Rui 

Sebastian



Re: [PATCH v2 4/6] PCI: Provide wrapper to access a pci_dev's bound driver

2021-08-03 Thread Andy Shevchenko
On Tue, Aug 03, 2021 at 12:01:48PM +0200, Uwe Kleine-König wrote:
> Which driver a device is bound to is available twice: In struct
> pci_dev::dev->driver and in struct pci_dev::driver. To get rid of the
> duplication introduce a wrapper to access struct pci_dev's driver
> member. Once all users are converted the wrapper can be changed to
> calculate the driver using pci_dev::dev->driver.

...

>  #define  to_pci_driver(drv) container_of(drv, struct pci_driver, driver)
> +#define pci_driver_of_dev(pdev) ((pdev)->driver)

Seems like above is (mis)using TAB instead of space after #define. Not sure if
it's good to have them different.
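
For readers following the series, here is a minimal userspace sketch of the accessor idea. The structs are cut-down stand-ins and are assumptions, not the kernel definitions; only to_pci_driver() and pci_driver_of_dev() mirror the quoted hunk, and pci_driver_of_dev_final() and demo_bind() are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption: only the
 * members needed for the illustration are present). */
struct device_driver { const char *name; };
struct device { struct device_driver *driver; };
struct pci_driver { const char *name; struct device_driver driver; };
struct pci_dev { struct pci_driver *driver; struct device dev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define to_pci_driver(drv) container_of(drv, struct pci_driver, driver)

/* Step 1 (as in the quoted hunk): hide the direct member access. */
#define pci_driver_of_dev(pdev) ((pdev)->driver)

/* Possible step 2 (hypothetical name): derive the driver from
 * dev.driver so that pci_dev::driver could be dropped. */
static inline struct pci_driver *pci_driver_of_dev_final(struct pci_dev *pdev)
{
	return pdev->dev.driver ? to_pci_driver(pdev->dev.driver) : NULL;
}

/* Hypothetical helper: record the binding in both places, which is the
 * duplication the series wants to get rid of. */
static inline void demo_bind(struct pci_dev *pdev, struct pci_driver *pdrv)
{
	pdev->driver = pdrv;
	pdev->dev.driver = pdrv ? &pdrv->driver : NULL;
}
```

Once every caller goes through the accessor, only its body has to change; the callers stay untouched.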

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v2 5/6] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-03 Thread Andy Shevchenko
On Tue, Aug 03, 2021 at 12:01:49PM +0200, Uwe Kleine-König wrote:
> This prepares removing the driver member of struct pci_dev which holds the
> same information than struct pci_dev::dev->driver.

...

> + struct pci_driver *pdrv;

A blank line is missing here and everywhere else. I don't remember whether
checkpatch complains about this.

> + return (pdev && (pdrv = pci_driver_of_dev(pdev))) ? pdrv->name : "";

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v2 5/6] PCI: Adapt all code locations to not use struct pci_dev::driver directly

2021-08-03 Thread Ido Schimmel
On Tue, Aug 03, 2021 at 12:01:49PM +0200, Uwe Kleine-König wrote:
> This prepares removing the driver member of struct pci_dev which holds the
> same information than struct pci_dev::dev->driver.
> 
> Signed-off-by: Uwe Kleine-König 
> ---
>  arch/powerpc/include/asm/ppc-pci.h|  3 +-
>  arch/powerpc/kernel/eeh_driver.c  | 12 ---
>  arch/x86/events/intel/uncore.c|  2 +-
>  arch/x86/kernel/probe_roms.c  |  2 +-
>  drivers/bcma/host_pci.c   |  6 ++--
>  drivers/crypto/hisilicon/qm.c |  2 +-
>  drivers/crypto/qat/qat_common/adf_aer.c   |  2 +-
>  drivers/message/fusion/mptbase.c  |  4 +--
>  drivers/misc/cxl/guest.c  | 21 +--
>  drivers/misc/cxl/pci.c| 25 +++--
>  .../ethernet/hisilicon/hns3/hns3_ethtool.c|  2 +-
>  .../ethernet/marvell/prestera/prestera_pci.c  |  2 +-
>  drivers/net/ethernet/mellanox/mlxsw/pci.c |  2 +-
>  .../ethernet/netronome/nfp/nfp_net_ethtool.c  |  2 +-
>  drivers/pci/iov.c | 23 +++-
>  drivers/pci/pci-driver.c  | 28 ---
>  drivers/pci/pci.c | 10 +++---
>  drivers/pci/pcie/err.c| 35 ++-
>  drivers/pci/xen-pcifront.c|  3 +-
>  drivers/ssb/pcihost_wrapper.c |  7 ++--
>  drivers/usb/host/xhci-pci.c   |  3 +-
>  21 files changed, 112 insertions(+), 84 deletions(-)

For mlxsw:

Tested-by: Ido Schimmel 


Re: [PATCH 00/38] Replace deprecated CPU-hotplug

2021-08-03 Thread Hans de Goede
Hi Sebastian,

On 8/3/21 4:15 PM, Sebastian Andrzej Siewior wrote:
> This is a tree wide replacement of the deprecated CPU hotplug functions
> which are only wrappers around the actual functions.
> 
> Each patch is independent and can be picked up by the relevant maintainer.

OK; and I take it that this is then also the plan for merging these?

FWIW I'm fine with the drivers/platform/x86 patch going upstream
through some other tree if it's easier to keep the set together ...

Regards,

Hans






Re: [PATCH 00/38] Replace deprecated CPU-hotplug

2021-08-03 Thread Sebastian Andrzej Siewior
On 2021-08-03 17:30:40 [+0200], Hans de Goede wrote:
> Hi Sebastian,
Hi Hans,

> On 8/3/21 4:15 PM, Sebastian Andrzej Siewior wrote:
> > This is a tree wide replacement of the deprecated CPU hotplug functions
> > which are only wrappers around the actual functions.
> > 
> > Each patch is independent and can be picked up by the relevant maintainer.
> 
> OK; and I take it that this is then also the plan for merging these?
> 
> FWIW I'm fine with the drivers/platform/x86 patch going upstream
> through some other tree if it's easier to keep the set together ...

There is no need to keep that set together since each patch is
independent. Please merge it through your tree.

> Regards,
> 
> Hans

Sebastian


Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE

2021-08-03 Thread Nicholas Piggin
Excerpts from Aneesh Kumar K.V's message of August 4, 2021 12:37 am:
> With shared mapping, even though we are unmapping a large range, the kernel
> will force a TLB flush with ptl lock held to avoid the race mentioned in
> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and
> memory freeing parts").
> This results in the kernel issuing a high number of TLB flushes even for a
> large range. This can be improved by making sure the kernel switches to a
> pid based flush if the kernel is unmapping a 2M range.

It would be good to have a bit more description here.

In any patch that changes a heuristic like this, I would like to see 
some justification or reasoning that could be refuted or used as a 
supporting argument if we ever wanted to change the heuristic later.
Ideally with some of the obvious downsides listed as well.

This "improves" things here, but what if it hurts things elsewhere? How 
would we come in later and decide to change it back?

THP flushes for example, I think now they'll do PID flushes (if they 
have to be broadcast, which they will tend to be when khugepaged does
them). So now that might increase jitter for THP and cause it to be a
loss for more workloads.

So where do you notice this? What's the benefit?

> 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/mm/book3s64/radix_tlb.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
> index aefc100d79a7..21d0f098e43b 100644
> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
>   * invalidating a full PID, so it has a far lower threshold to change from
>   * individual page flushes to full-pid flushes.
>   */
> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
>  static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
>  
>  static inline void __radix__flush_tlb_range(struct mm_struct *mm,
> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>   if (fullmm)
>   flush_pid = true;
>   else if (type == FLUSH_TYPE_GLOBAL)
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;

Arguably >= is nicer than > here, but this shouldn't be in the same 
patch as the value change.

>   else
>   flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;

And it should change everything to be consistent. Although I'm not sure 
it's worth changing even though I highly doubt any administrator would
be tweaking this.
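
To make the combined effect of the two changes concrete, here is a small sketch. It assumes a 2M range and 64K base pages, which is my reading of the changelog and not something stated in the patch; all names are hypothetical.

```c
#include <stdbool.h>

/* Assumed values for illustration: a 2M PMD-sized range and 64K base pages. */
#define DEMO_PMD_SIZE   (2UL << 20)
#define DEMO_PAGE_SHIFT 16

static inline unsigned long demo_nr_pages(unsigned long start, unsigned long end)
{
	return (end - start) >> DEMO_PAGE_SHIFT;
}

/* Heuristic before the patch: ceiling 33, strict comparison. */
static inline bool flush_pid_old(unsigned long nr_pages)
{
	return nr_pages > 33;
}

/* Heuristic after the patch: ceiling 32, inclusive comparison. */
static inline bool flush_pid_new(unsigned long nr_pages)
{
	return nr_pages >= 32;
}
```

Under those assumptions a 2M range is exactly 32 pages, which stays on per-page flushes before the patch and becomes a PID flush after it. The value change and the operator change each move the cut-over point by one page, which is one more reason to keep them in separate patches.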

Thanks,
Nick

>   /*
> @@ -1335,7 +1335,7 @@ static void __radix__flush_tlb_range_psize(struct mm_struct *mm,
>   if (fullmm)
>   flush_pid = true;
>   else if (type == FLUSH_TYPE_GLOBAL)
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>   else
>   flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
>  
> @@ -1505,7 +1505,7 @@ void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
>   continue;
>  
>   nr_pages = (end - start) >> def->shift;
> - flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> + flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>  
>   /*
>* If the number of pages spanning the range is above
> -- 
> 2.31.1
> 
> 


Re: [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32

2021-08-03 Thread Christophe Leroy

Gentle ping Michael ?

Le 25/06/2021 à 16:41, Christophe Leroy a écrit :

Hi Michael,

What happened to this series ? It has been flagged 'under review' in Patchwork since mid April but I 
never saw it in next-test.


Thanks
Christophe

Le 12/04/2021 à 18:26, Christophe Leroy a écrit :

powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.

For catching simple conditions like a variable having value 0, this
is efficient because it does the test and the trap at the same time.
But most conditions used with BUG_ON or WARN_ON are more complex and
force GCC to format the condition into a 0 or 1 value in a register.
This will usually require 2 to 3 instructions.

The most efficient solution would be to use __builtin_trap() because
GCC is able to optimise the use of the different trap instructions
based on the requested condition, but this is complex if not
impossible for the following reasons:
- __builtin_trap() is a non-recoverable instruction, so it can't be
used for WARN_ON
- Knowing which line of code generated the trap would require the
analysis of DWARF information. This is not a feature we have today.

As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
the way WARN_ON() is implemented is suboptimal. That commit also
mentions an issue with 'long long' condition. It fixed it for
WARN_ON() but the same problem still exists today with BUG_ON() on
PPC32. It will be fixed by using the generic implementation.

By using the generic implementation, gcc will naturally generate a
branch to the unconditional trap generated by BUG().

As modern powerpc implementations have zero-cycle branches,
that's even more efficient.

And for the functions using WARN_ON() and its return, the test
on return from WARN_ON() is now also used for the WARN_ON() itself.

On PPC64 we don't want it because we want to be able to use CFAR
register to track how we entered the code that trapped. The CFAR
register would be clobbered by the branch.

A simple test function:

unsigned long test9w(unsigned long a, unsigned long b)
{
	if (WARN_ON(!b))
		return 0;
	return a / b;
}

Before the patch:

046c :
 46c:    7c 89 00 34 cntlzw  r9,r4
 470:    55 29 d9 7e rlwinm  r9,r9,27,5,31
 474:    0f 09 00 00 twnei   r9,0
 478:    2c 04 00 00 cmpwi   r4,0
 47c:    41 82 00 0c beq 488 
 480:    7c 63 23 96 divwu   r3,r3,r4
 484:    4e 80 00 20 blr

 488:    38 60 00 00 li  r3,0
 48c:    4e 80 00 20 blr

After the patch:

0468 :
 468:    2c 04 00 00 cmpwi   r4,0
 46c:    41 82 00 0c beq 478 
 470:    7c 63 23 96 divwu   r3,r3,r4
 474:    4e 80 00 20 blr

 478:    0f e0 00 00 twui    r0,0
 47c:    38 60 00 00 li  r3,0
 480:    4e 80 00 20 blr

So we see before the patch we need 3 instructions on the likely path
to handle the WARN_ON(). With the patch the trap goes on the unlikely
path.
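
For anyone who wants to experiment outside the kernel, here is a rough userspace sketch of a generic-style WARN_ON() that returns the truth value of its condition, so that the caller's own test doubles as the warning test. DEMO_WARN_ON(), demo_warn_on() and demo_warn_count are hypothetical stand-ins, not the kernel implementation, and the kernel's unlikely() hint is left out.

```c
#include <stdio.h>

static unsigned int demo_warn_count;

/* Report the warning out of line and hand the condition back to the caller. */
static inline int demo_warn_on(int cond, const char *file, int line)
{
	if (cond) {
		demo_warn_count++;
		fprintf(stderr, "WARNING at %s:%d\n", file, line);
	}
	return cond;
}

/* Evaluates to non-zero when the condition is true, like WARN_ON(). */
#define DEMO_WARN_ON(cond) demo_warn_on(!!(cond), __FILE__, __LINE__)

unsigned long test9w(unsigned long a, unsigned long b)
{
	if (DEMO_WARN_ON(!b))
		return 0;
	return a / b;
}
```

Because the macro is an ordinary conditional, the compiler is free to move the reporting code to the unlikely path, which is the effect visible in the "after" disassembly.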

See below the difference at the entry of system_call_exception where
we have several BUG_ON(), although it is less impressive.

With the patch:

 :
   0:    81 6a 00 84 lwz r11,132(r10)
   4:    90 6a 00 88 stw r3,136(r10)
   8:    71 60 00 02 andi.   r0,r11,2
   c:    41 82 00 70 beq 7c 
  10:    71 60 40 00 andi.   r0,r11,16384
  14:    41 82 00 6c beq 80 
  18:    71 6b 80 00 andi.   r11,r11,32768
  1c:    41 82 00 68 beq 84 
  20:    94 21 ff e0 stwu    r1,-32(r1)
  24:    93 e1 00 1c stw r31,28(r1)
  28:    7d 8c 42 e6 mftb    r12
...
  7c:    0f e0 00 00 twui    r0,0
  80:    0f e0 00 00 twui    r0,0
  84:    0f e0 00 00 twui    r0,0

Without the patch:

 :
   0:    94 21 ff e0 stwu    r1,-32(r1)
   4:    93 e1 00 1c stw r31,28(r1)
   8:    90 6a 00 88 stw r3,136(r10)
   c:    81 6a 00 84 lwz r11,132(r10)
  10:    69 60 00 02 xori    r0,r11,2
  14:    54 00 ff fe rlwinm  r0,r0,31,31,31
  18:    0f 00 00 00 twnei   r0,0
  1c:    69 60 40 00 xori    r0,r11,16384
  20:    54 00 97 fe rlwinm  r0,r0,18,31,31
  24:    0f 00 00 00 twnei   r0,0
  28:    69 6b 80 00 xori    r11,r11,32768
  2c:    55 6b 8f fe rlwinm  r11,r11,17,31,31
  30:    0f 0b 00 00 twnei   r11,0
  34:    7d 8c 42 e6 mftb    r12

Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/bug.h | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..101dea4eec8d 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -68,7 +68,11 @@
  BUG_ENTRY("twi 31, 0, 0", 0);    \
  unreachable();    \
  } while (0)
+#define HAVE_ARCH_BUG
+
+#define __W

[PATCH] powerpc: Remove MSR_PR check in interrupt_exit_{user/kernel}_prepare()

2021-08-03 Thread Christophe Leroy
In those hot functions that are called at every interrupt, any saved
cycle is worth it.

interrupt_exit_user_prepare() and interrupt_exit_kernel_prepare() are
called from three places:
- From entry_32.S
- From interrupt_64.S
- From interrupt_exit_user_restart() and interrupt_exit_kernel_restart()

In entry_32.S, they are unambiguously called based on MSR_PR:

interrupt_return:
	lwz	r4,_MSR(r1)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	andi.	r0,r4,MSR_PR
	beq	.Lkernel_interrupt_return
	bl	interrupt_exit_user_prepare
...
.Lkernel_interrupt_return:
	bl	interrupt_exit_kernel_prepare

In interrupt_64.S, that's similar:

interrupt_return_\srr\():
	ld	r4,_MSR(r1)
	andi.	r0,r4,MSR_PR
	beq	interrupt_return_\srr\()_kernel
interrupt_return_\srr\()_user: /* make backtraces match the _kernel variant */
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	interrupt_exit_user_prepare
...
interrupt_return_\srr\()_kernel:
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	interrupt_exit_kernel_prepare

In interrupt_exit_user_restart() and interrupt_exit_kernel_restart(),
MSR_PR is verified respectively by BUG_ON(!user_mode(regs)) and
BUG_ON(user_mode(regs)) prior to calling interrupt_exit_user_prepare()
and interrupt_exit_kernel_prepare().

The verifications in interrupt_exit_user_prepare() and
interrupt_exit_kernel_prepare() are therefore useless and can be removed.
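
The argument can be modelled in a few lines of plain C. This is only a sketch: the demo_* names are hypothetical, and DEMO_MSR_PR is assumed to be 0x4000 for the purpose of the demo.

```c
#define DEMO_MSR_PR 0x4000UL	/* assumed value of the problem-state bit */

struct demo_regs { unsigned long msr; };

enum demo_path { DEMO_USER, DEMO_KERNEL };

static enum demo_path demo_exit_user_prepare(struct demo_regs *regs)
{
	/* A check for "MSR_PR set" here could never fire: the only
	 * caller gets here after testing that very bit. */
	(void)regs;
	return DEMO_USER;
}

static enum demo_path demo_exit_kernel_prepare(struct demo_regs *regs)
{
	/* Likewise, "MSR_PR clear" is already guaranteed by the caller. */
	(void)regs;
	return DEMO_KERNEL;
}

/* Mirrors the andi./beq dispatch done by the assembly callers. */
static enum demo_path demo_interrupt_return(struct demo_regs *regs)
{
	if (regs->msr & DEMO_MSR_PR)
		return demo_exit_user_prepare(regs);
	return demo_exit_kernel_prepare(regs);
}
```

Every route into the two prepare functions has the same shape, so the bit is tested once by the caller and a second test in the callee only costs cycles.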

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/interrupt.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 21bbd615ca41..f26caf911ab5 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -465,7 +465,6 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs)
 
if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x))
BUG_ON(!(regs->msr & MSR_RI));
-   BUG_ON(!(regs->msr & MSR_PR));
BUG_ON(arch_irq_disabled_regs(regs));
CT_WARN_ON(ct_state() == CONTEXT_USER);
 
@@ -499,7 +498,6 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x) &&
unlikely(!(regs->msr & MSR_RI)))
unrecoverable_exception(regs);
-   BUG_ON(regs->msr & MSR_PR);
/*
 * CT_WARN_ON comes here via program_check_exception,
 * so avoid recursion.
-- 
2.25.0



Re: undefined reference to `.radix__create_section_mapping'

2021-08-03 Thread Christophe Leroy

Hi Randy,

Le 04/08/2021 à 04:40, Randy Dunlap a écrit :

On 7/31/21 11:22 AM, kernel test robot wrote:

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   c7d102232649226a6958a4942cf13cff4f7c
commit: fe3dc333d2ed50c9764d281869d87bae0d795ce5 powerpc/mmu: Don't duplicate 
radix_enabled()
date:   3 months ago
config: powerpc64-randconfig-r013-20210731 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 10.3.0
reproduce (this is a W=1 build):
 wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross

 chmod +x ~/bin/make.cross
 # 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe3dc333d2ed50c9764d281869d87bae0d795ce5 


 git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
 git fetch --no-tags linus master
 git checkout fe3dc333d2ed50c9764d281869d87bae0d795ce5
 # save the attached .config to linux build tree
 mkdir build_dir
 COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross O=build_dir ARCH=powerpc 
SHELL=/bin/bash


If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

    powerpc-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.create_section_mapping':

(.meminit.text+0x3c): undefined reference to `.radix__create_section_mapping'

    powerpc-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.remove_section_mapping':

(.meminit.text+0x90): undefined reference to `.radix__remove_section_mapping'


In the randconfig file:
# CONFIG_PPC_RADIX_MMU is not set

It is default y, but maybe that is not strong enough?
I.e., should it be selected by PPC_BOOK3S_64?

Changing the config to PPC_RADIX_MMU=y fixes the build errors.

Or should arch/powerpc/mm/book3s64/pgtable.c be modified to handle
the case of PPC_RADIX_MMU is not set?



Can you test 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210804013724.514468-1-jniet...@gmail.com/ ?


Thanks
Christophe


Re: [PATCH] powerpc: Always inline radix_enabled() to fix build failure

2021-08-03 Thread Christophe Leroy




Le 04/08/2021 à 03:37, Jordan Niethe a écrit :

This is the same as commit acdad8fb4a15 ("powerpc: Force inlining of
mmu_has_feature to fix build failure") but for radix_enabled().  The
config in the linked bugzilla causes the following build failure:

LD  .tmp_vmlinux.kallsyms1
powerpc64-linux-ld: arch/powerpc/mm/pgtable.o: in function 
`.__ptep_set_access_flags':
pgtable.c:(.text+0x17c): undefined reference to `.radix__ptep_set_access_flags'

This is due to radix_enabled() not being inlined. See extract from building 
with -Winline:

In file included from arch/powerpc/include/asm/lppaca.h:46,
  from arch/powerpc/include/asm/paca.h:17,
  from arch/powerpc/include/asm/current.h:13,
  from include/linux/thread_info.h:23,
  from include/asm-generic/preempt.h:5,
  from ./arch/powerpc/include/generated/asm/preempt.h:1,
  from include/linux/preempt.h:78,
  from include/linux/spinlock.h:51,
  from include/linux/mmzone.h:8,
  from include/linux/gfp.h:6,
  from arch/powerpc/mm/pgtable.c:21:
arch/powerpc/include/asm/book3s/64/pgtable.h: In function 
'__ptep_set_access_flags':
arch/powerpc/include/asm/mmu.h:327:20: error: inlining failed in call to 
'radix_enabled': call is unlikely and code size would grow [-Werror=inline]

The code relies on constant folding of MMU_FTRS_POSSIBLE at buildtime
and elimination of non possible parts of code at compile time. For this
to work radix_enabled() must be inlined so make it __always_inline.
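
A simplified model of the constant folding described above, as a sketch only: the demo_* names and the feature bit value are hypothetical, and the always_inline attribute is the GCC/Clang spelling of what the kernel wraps in __always_inline.

```c
#include <stdbool.h>

#define DEMO_FTR_TYPE_RADIX 0x00000040UL	/* hypothetical bit value */
#define DEMO_FTRS_POSSIBLE  0x0UL		/* radix compiled out in this config */

static unsigned long demo_cur_features;		/* runtime feature mask */
static unsigned int demo_radix_calls;

static inline __attribute__((always_inline))
bool demo_mmu_has_feature(unsigned long feature)
{
	/* Folds to false at compile time when the feature is not possible. */
	return (DEMO_FTRS_POSSIBLE & demo_cur_features & feature) != 0;
}

static inline __attribute__((always_inline)) bool demo_radix_enabled(void)
{
	return demo_mmu_has_feature(DEMO_FTR_TYPE_RADIX);
}

/* In the kernel the radix helpers are only built when radix is configured;
 * here the function exists so that the sketch links on its own. */
static void demo_radix_path(void)
{
	demo_radix_calls++;
}

static int demo_set_access_flags(void)
{
	if (demo_radix_enabled()) {	/* constant false once inlined */
		demo_radix_path();
		return 1;
	}
	return 0;			/* hash path */
}
```

If the compiler were allowed to emit demo_radix_enabled() as a real call, the branch could no longer be proven dead at compile time and the reference to the radix helper would survive to link time, which is the failure shown in the build log.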


Thanks for looking at that. I also got a few notifications of that problem from the kernel test robot but 
I haven't looked at it yet.


https://lkml.org/lkml/2021/7/31/257
https://lkml.org/lkml/2021/7/25/271



Link: https://bugzilla.kernel.org/show_bug.cgi?id=213803
Reported-by: Erhard F. 
Suggested-by: Michael Ellerman 
Signed-off-by: Jordan Niethe 
---
  arch/powerpc/include/asm/mmu.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 27016b98ecb2..8abe8e42e045 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -324,7 +324,7 @@ static inline void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
  }
  #endif /* !CONFIG_DEBUG_VM */
  
-static inline bool radix_enabled(void)
+static __always_inline bool radix_enabled(void)
  {
return mmu_has_feature(MMU_FTR_TYPE_RADIX);
  }



[Bug 213961] Oops while loading radeon driver

2021-08-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213961

Christophe Leroy (christophe.le...@csgroup.eu) changed:

   What|Removed |Added

 CC||christophe.le...@csgroup.eu

--- Comment #1 from Christophe Leroy (christophe.le...@csgroup.eu) ---
There is not much that can be done with the log provided.

Could you please rebuild your kernel with CONFIG_DEBUG_INFO so that we get the
names of the functions involved in the BUG report?

Could you also provide your .config and a full dump of 'dmesg'?

Thanks

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

Re: undefined reference to `.radix__create_section_mapping'

2021-08-03 Thread Randy Dunlap

On 8/3/21 10:31 PM, Christophe Leroy wrote:

Hi Randy,

Le 04/08/2021 à 04:40, Randy Dunlap a écrit :

On 7/31/21 11:22 AM, kernel test robot wrote:

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   c7d102232649226a6958a4942cf13cff4f7c
commit: fe3dc333d2ed50c9764d281869d87bae0d795ce5 powerpc/mmu: Don't duplicate 
radix_enabled()
date:   3 months ago
config: powerpc64-randconfig-r013-20210731 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 10.3.0
reproduce (this is a W=1 build):
 wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
 chmod +x ~/bin/make.cross
 # 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe3dc333d2ed50c9764d281869d87bae0d795ce5
 git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
 git fetch --no-tags linus master
 git checkout fe3dc333d2ed50c9764d281869d87bae0d795ce5
 # save the attached .config to linux build tree
 mkdir build_dir
 COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross 
O=build_dir ARCH=powerpc SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

    powerpc-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.create_section_mapping':

(.meminit.text+0x3c): undefined reference to `.radix__create_section_mapping'

    powerpc-linux-ld: arch/powerpc/mm/book3s64/pgtable.o: in function 
`.remove_section_mapping':

(.meminit.text+0x90): undefined reference to `.radix__remove_section_mapping'


In the randconfig file:
# CONFIG_PPC_RADIX_MMU is not set

It is default y, but maybe that is not strong enough?
I.e., should it be selected by PPC_BOOK3S_64?

Changing the config to PPC_RADIX_MMU=y fixes the build errors.

Or should arch/powerpc/mm/book3s64/pgtable.c be modified to handle
the case of PPC_RADIX_MMU is not set?



Can you test 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20210804013724.514468-1-jniet...@gmail.com/
 ?


Hi Christophe,

Yes, that builds without a problem. Thanks for the pointer.

Acked-by: Randy Dunlap  # build-tested

--
~Randy


Re: [PATCH] powerpc/32: Fix critical and debug interrupts on BOOKE

2021-08-03 Thread Christophe Leroy

Hi Radu,

Le 07/07/2021 à 07:55, Christophe Leroy a écrit :

32 bits BOOKE have special interrupts for debug and other
critical events.


Were you able to test this patch ?

Thanks
Christophe




When handling those interrupts, dedicated registers are saved
in the stack frame in addition to the standard registers, leading
to a shift of the pt_regs struct.

Since commit db297c3b07af ("powerpc/32: Don't save thread.regs on
interrupt entry"), the pt_regs struct is expected to be at the
same place all the time.

Instead of handling a special struct in addition to pt_regs, just
add those special registers to struct pt_regs.
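
To illustrate the layout idea outside the kernel, here is a shortened sketch. The struct is hypothetical: uint32_t stands in for the PPC32 'unsigned long', most members are omitted, and DEMO_REGS_OFFSET() is a made-up helper, not a kernel macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened layout; uint32_t stands in for the PPC32
 * "unsigned long". */
struct demo_pt_regs {
	uint32_t gpr[32];
	uint32_t nip;
	uint32_t msr;
	uint32_t pad[2];	/* keep the base part a multiple of 16 bytes */
	struct {		/* extra BookE state, must be a multiple of 16 bytes */
		uint32_t mas0, mas1, mas2, mas3, mas6, mas7;
		uint32_t srr0, srr1, csrr0, csrr1, dsrr0, dsrr1;
	};
};

/* Hypothetical helper: one offset per field is valid for every interrupt
 * level because the frame layout is now always the same. */
#define DEMO_REGS_OFFSET(field) offsetof(struct demo_pt_regs, field)

_Static_assert(sizeof(struct demo_pt_regs) % 16 == 0,
	       "frame must preserve 16-byte stack alignment");
```

With the extra registers inside the struct, the common members sit at the same offsets whether the frame was built by a standard, critical or debug interrupt, so no shifted copy of the layout has to be handled.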

Reported-by: Radu Rendec 
Signed-off-by: Christophe Leroy 
Fixes: db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry")
Cc: sta...@vger.kernel.org
---
  arch/powerpc/include/asm/ptrace.h | 16 
  arch/powerpc/kernel/asm-offsets.c | 31 ++-
  arch/powerpc/kernel/head_booke.h  | 27 +++
  3 files changed, 33 insertions(+), 41 deletions(-)

diff --git a/arch/powerpc/include/asm/ptrace.h 
b/arch/powerpc/include/asm/ptrace.h
index 3e5d470a6155..14422e851494 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -70,6 +70,22 @@ struct pt_regs
unsigned long __pad[4]; /* Maintain 16 byte interrupt stack 
alignment */
};
  #endif
+#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
+   struct { /* Must be a multiple of 16 bytes */
+   unsigned long mas0;
+   unsigned long mas1;
+   unsigned long mas2;
+   unsigned long mas3;
+   unsigned long mas6;
+   unsigned long mas7;
+   unsigned long srr0;
+   unsigned long srr1;
+   unsigned long csrr0;
+   unsigned long csrr1;
+   unsigned long dsrr0;
+   unsigned long dsrr1;
+   };
+#endif
  };
  #endif
  
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index a47eefa09bcb..5bee245d832b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -309,24 +309,21 @@ int main(void)
STACK_PT_REGS_OFFSET(STACK_REGS_IAMR, iamr);
  #endif
  
-#if defined(CONFIG_PPC32)
-#if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
-   DEFINE(EXC_LVL_SIZE, STACK_EXC_LVL_FRAME_SIZE);
-   DEFINE(MAS0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas0));
+#if defined(CONFIG_PPC32) && defined(CONFIG_BOOKE)
+   STACK_PT_REGS_OFFSET(MAS0, mas0);
/* we overload MMUCR for 44x on MAS0 since they are mutually exclusive 
*/
-   DEFINE(MMUCR, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas0));
-   DEFINE(MAS1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas1));
-   DEFINE(MAS2, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas2));
-   DEFINE(MAS3, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas3));
-   DEFINE(MAS6, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas6));
-   DEFINE(MAS7, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
mas7));
-   DEFINE(_SRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
srr0));
-   DEFINE(_SRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
srr1));
-   DEFINE(_CSRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
csrr0));
-   DEFINE(_CSRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
csrr1));
-   DEFINE(_DSRR0, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
dsrr0));
-   DEFINE(_DSRR1, STACK_INT_FRAME_SIZE+offsetof(struct exception_regs, 
dsrr1));
-#endif
+   STACK_PT_REGS_OFFSET(MMUCR, mas0);
+   STACK_PT_REGS_OFFSET(MAS1, mas1);
+   STACK_PT_REGS_OFFSET(MAS2, mas2);
+   STACK_PT_REGS_OFFSET(MAS3, mas3);
+   STACK_PT_REGS_OFFSET(MAS6, mas6);
+   STACK_PT_REGS_OFFSET(MAS7, mas7);
+   STACK_PT_REGS_OFFSET(_SRR0, srr0);
+   STACK_PT_REGS_OFFSET(_SRR1, srr1);
+   STACK_PT_REGS_OFFSET(_CSRR0, csrr0);
+   STACK_PT_REGS_OFFSET(_CSRR1, csrr1);
+   STACK_PT_REGS_OFFSET(_DSRR0, dsrr0);
+   STACK_PT_REGS_OFFSET(_DSRR1, dsrr1);
  #endif
  
  	/* About the CPU features table */

diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
index 87b806e8eded..e5503420b6c6 100644
--- a/arch/powerpc/kernel/head_booke.h
+++ b/arch/powerpc/kernel/head_booke.h
@@ -168,20 +168,18 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_EMB_HV)
  /* only on e500mc */
  #define DBG_STACK_BASEdbgirq_ctx
  
-#define EXC_LVL_FRAME_OVERHEAD	(THREAD_SIZE - INT_FRAME_SIZE - EXC_LVL_SIZE)

-
  #ifdef CONFIG_SMP
  #define BOOKE_LOAD_EXC_LEVEL_STACK(level) \
mfspr   r8,SPRN_PIR;\
slwi    r8,r8,2;\
addis   r8,r8,level##_STACK_BASE@ha;\
lwz r8,level##_STACK_BASE@l(r8);\
- 

Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)

2021-08-03 Thread Christophe Leroy




Le 04/08/2021 à 06:04, Finn Thain a écrit :

On Tue, 3 Aug 2021, Christophe Leroy wrote:


When a DSI (Data Storage Interrupt) is taken while in NAP mode, r11
doesn't survive the call to power_save_ppc32_restore().

So use r1 instead of r11 as they both contain the virtual stack pointer
at that point.

Reported-by: Finn Thain 
Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")


Regarding that 'Fixes' tag, this patch has not fixed the failure below,
unfortunately. But there appears to be several bugs in play here. Can you
tell us which failure mode is associated with the bug addressed by this
patch?



This is unrelated to the failure below. This patch is related to the bisect you did that pointed to 
4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE")


I think the starting point should be to (manually) apply the patch on top of that commit in 
order to check that the bug that led to that commit being identified as the 'first bad commit' is now gone.


The BUG below is likely something completely different.

And the other bug involving KUAP write is also something else to be 
investigated separately.



[ cut here ]
kernel BUG at arch/powerpc/kernel/interrupt.c:49!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
NIP:  c0011474 LR: c0011464 CTR: 
REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
MSR:  00021032   CR: 2400446c  XER: 2000

GPR00: c001604c e2f75f00 ca284a60   a5205eb0 0008 0020
GPR08: ffc0 0001 501200d9 ce030005 ca285010 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24:  ffc0 0020 0008 a5205eb0  e2f75f40 00ae
NIP [c0011474] system_call_exception+0x60/0x164
LR [c0011464] system_call_exception+0x50/0x164
Call Trace:
[e2f75f00] [9000] 0x9000 (unreliable)
[e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa69d6cb0
NIP:  a69d6cb0 LR: a69d6c3c CTR: 
REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
MSR:  d032   CR: 2400446c  XER: 2000

GPR00: 00ae a5205de0 a5687ca0   a5205eb0 0008 0020
GPR08: ffc0 401201ea 401200d9  c158f230 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24: afb72fc8  0001 a5205f30 afb733dc  a6b85ff4 a5205eb0
NIP [a69d6cb0] 0xa69d6cb0
LR [a69d6c3c] 0xa69d6c3c
--- interrupt: c00
Instruction dump:
7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
817e0084 931e0088 69690002 5529fffe <0f09> 69694000 552997fe 0f09
---[ end trace c66c6c3c44806276 ]---



[powerpc:next-test] BUILD SUCCESS c09f799bf0fdc0eeb44db87df07a7a9632c3420c

2021-08-03 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: c09f799bf0fdc0eeb44db87df07a7a9632c3420c  pseries/drmem: update LMBs after LPM

elapsed time: 1007m

configs tested: 100
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm        defconfig
arm64      allyesconfig
arm64      defconfig
arm        allyesconfig
arm        allmodconfig
i386       randconfig-c001-20210803
powerpc    tqm8541_defconfig
nios2      3c120_defconfig
arm        pleb_defconfig
powerpc    pseries_defconfig
arm        trizeps4_defconfig
arm        imote2_defconfig
powerpc    bluestone_defconfig
mips       rm200_defconfig
powerpc    ppc64e_defconfig
powerpc    ppc6xx_defconfig
sh         sh7710voipgw_defconfig
powerpc    allnoconfig
powerpc    mpc8540_ads_defconfig
powerpc    ebony_defconfig
powerpc    bamboo_defconfig
powerpc    kilauea_defconfig
mips       bcm63xx_defconfig
powerpc    mpc8315_rdb_defconfig
xtensa     generic_kc705_defconfig
mips       ip27_defconfig
powerpc64  defconfig
arm        sama5_defconfig
sh         sdk7786_defconfig
arc        vdk_hs38_smp_defconfig
x86_64     allnoconfig
ia64       allmodconfig
ia64       defconfig
ia64       allyesconfig
m68k       allmodconfig
m68k       defconfig
m68k       allyesconfig
nds32      defconfig
nios2      allyesconfig
csky       defconfig
alpha      defconfig
alpha      allyesconfig
xtensa     allyesconfig
h8300      allyesconfig
arc        defconfig
sh         allmodconfig
parisc     defconfig
s390       allyesconfig
s390       allmodconfig
parisc     allyesconfig
s390       defconfig
i386       allyesconfig
sparc      allyesconfig
sparc      defconfig
i386       defconfig
nios2      defconfig
arc        allyesconfig
nds32      allnoconfig
mips       allyesconfig
mips       allmodconfig
powerpc    allyesconfig
powerpc    allmodconfig
x86_64     randconfig-a002-20210803
x86_64     randconfig-a004-20210803
x86_64     randconfig-a006-20210803
x86_64     randconfig-a003-20210803
x86_64     randconfig-a001-20210803
x86_64     randconfig-a005-20210803
i386       randconfig-a004-20210802
i386       randconfig-a005-20210802
i386       randconfig-a002-20210802
i386       randconfig-a006-20210802
i386       randconfig-a001-20210802
i386       randconfig-a003-20210802
i386       randconfig-a012-20210803
i386       randconfig-a011-20210803
i386       randconfig-a015-20210803
i386       randconfig-a013-20210803
i386       randconfig-a014-20210803
i386       randconfig-a016-20210803
riscv      nommu_k210_defconfig
riscv      allyesconfig
riscv      nommu_virt_defconfig
riscv      allnoconfig
riscv      defconfig
riscv      rv32_defconfig
riscv      allmodconfig
um         x86_64_defconfig
um         i386_defconfig
x86_64     allyesconfig
x86_64     rhel-8.3-kselftests
x86_64     defconfig
x86_64     rhel-8.3
x86_64     kexec

clang tested configs:
x86_64     randconfig-a012-20210803
x86_64     randconfig-a016-20210803
x86_64     randconfig-a013-20210803
x86_64     randconfig-a011-20210803
x86_64     randconfig-a014-20210803
x86_64     randconfig-a015-20210803

---
0-DAY

Re: [PATCH] powerpc/32s: Fix napping restore in data storage interrupt (DSI)

2021-08-03 Thread Christophe Leroy

Hi Nic,

I think I'll need your help on that one.

On 04/08/2021 at 08:07, Christophe Leroy wrote:



On 04/08/2021 at 06:04, Finn Thain wrote:

On Tue, 3 Aug 2021, Christophe Leroy wrote:


...


[ cut here ]
kernel BUG at arch/powerpc/kernel/interrupt.c:49!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1859 Comm: xfce4-session Not tainted 5.13.0-pmac-VMAP #10
NIP:  c0011474 LR: c0011464 CTR: 
REGS: e2f75e40 TRAP: 0700   Not tainted  (5.13.0-pmac-VMAP)
MSR:  00021032   CR: 2400446c  XER: 2000

GPR00: c001604c e2f75f00 ca284a60   a5205eb0 0008 0020
GPR08: ffc0 0001 501200d9 ce030005 ca285010 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24:  ffc0 0020 0008 a5205eb0  e2f75f40 00ae
NIP [c0011474] system_call_exception+0x60/0x164
LR [c0011464] system_call_exception+0x50/0x164
Call Trace:
[e2f75f00] [9000] 0x9000 (unreliable)
[e2f75f30] [c001604c] ret_from_syscall+0x0/0x28
--- interrupt: c00 at 0xa69d6cb0
NIP:  a69d6cb0 LR: a69d6c3c CTR: 
REGS: e2f75f40 TRAP: 0c00   Not tainted  (5.13.0-pmac-VMAP)
MSR:  d032   CR: 2400446c  XER: 2000

GPR00: 00ae a5205de0 a5687ca0   a5205eb0 0008 0020
GPR08: ffc0 401201ea 401200d9  c158f230 00c1f778  
GPR16: 00945b20 009402f8 0001 a6b87550 a51fd000 afb73220 a6b22c78 a6a6aecc
GPR24: afb72fc8  0001 a5205f30 afb733dc  a6b85ff4 a5205eb0
NIP [a69d6cb0] 0xa69d6cb0
LR [a69d6c3c] 0xa69d6c3c
--- interrupt: c00
Instruction dump:
7cdb3378 93810020 7cbc2b78 93a10024 7c9d2378 93e1002c 7d3f4b78 4800d629
817e0084 931e0088 69690002 5529fffe <0f09> 69694000 552997fe 0f09
---[ end trace c66c6c3c44806276 ]---



We get a BUG at arch/powerpc/kernel/interrupt.c:49, which means MSR_RI is not set, yet the c00 
interrupt frame shows MSR_RI properly set. So what is going on?
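
For reference, the MSR value printed in the c00 frame above (d032) can be decoded with a few lines of C. This is only a minimal userspace sketch, not kernel code: the masks are the 32-bit PowerPC MSR_RI and MSR_PR bit values, the helper names are made up for illustration, and the assumption that syscall entry checks RI first and PR second is inferred from the instruction dump (xori with 0x0002, then with 0x4000).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit PowerPC MSR bit masks. */
#define MSR_RI 0x00000002u /* recoverable interrupt */
#define MSR_PR 0x00004000u /* problem state, i.e. user mode */

/* Hypothetical helper: true if every bit of mask is set in msr. */
static bool msr_has(uint32_t msr, uint32_t mask)
{
	assert(mask != 0); /* an empty mask would make the check meaningless */
	return (msr & mask) == mask;
}

/*
 * Hypothetical helper: the saved MSR of a frame coming from a user
 * syscall is expected to have both RI and PR set (assumed order of the
 * checks: RI, then PR).
 */
static bool syscall_frame_msr_ok(uint32_t msr)
{
	return msr_has(msr, MSR_RI) && msr_has(msr, MSR_PR);
}
```

Calling syscall_frame_msr_ok(0xd032) returns true, which matches the observation that the value shown in the c00 frame should not trip either check.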


Thanks
Christophe


Re: Debian SID kernel doesn't boot on PowerBook 3400c

2021-08-03 Thread Christophe Leroy




On 04/08/2021 at 02:34, Finn Thain wrote:


On Tue, 3 Aug 2021, Christophe Leroy wrote:



Looks like the memory errors are linked to KUAP (Kernel Userspace Access
Protection). Based on the places where the problems happen, I don't think
there is any invalid access, so there must be something wrong in the
KUAP logic, probably linked to some interrupts happening in kernel mode
while the KUAP window is open. And because KUAP is not selected by default
on book3s/32 until 5.14, probably nobody ever tested it in a real
environment before you.

I think the issue may be linked to commit
https://github.com/linuxppc/linux/commit/c16728835 which happened
between 5.12 and 5.13.


The messages, "Kernel attempted to write user page (c6207c) - exploit
attempt? (uid: 0)", appear in the console logs generated by v5.13. Those
logs come from the Powerbook G3 discussion in the other thread. Could that
be the same bug?



Yes, most likely.

So you confirm this appears with 5.13 and not with 5.12?

Can you check whether they happen at commit c16728835?
Can you check that they DO NOT happen at the preceding commit, c16728835~?

Could you test without CONFIG_PPC_KUAP?
Could you test with both CONFIG_PPC_KUAP and CONFIG_PPC_KUAP_DEBUG?

Thanks
Christophe


Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE

2021-08-03 Thread Nicholas Piggin
Excerpts from Nicholas Piggin's message of August 4, 2021 3:14 pm:
> Excerpts from Aneesh Kumar K.V's message of August 4, 2021 12:37 am:
>> With shared mapping, even though we are unmapping a large range, the kernel
>> will force a TLB flush with ptl lock held to avoid the race mentioned in
>> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and 
>> memory freeing parts")
>> This results in the kernel issuing a high number of TLB flushes even for a
>> large range. This can be improved by making sure the kernel switches to a
>> PID-based flush if the kernel is unmapping a 2M range.
> 
> It would be good to have a bit more description here.
> 
> In any patch that changes a heuristic like this, I would like to see 
> some justification or reasoning that could be refuted or used as a 
> supporting argument if we ever wanted to change the heuristic later.
> Ideally with some of the obvious downsides listed as well.
> 
> This "improves" things here, but what if it hurt things elsewhere, how 
> would we come in later and decide to change it back?
> 
> THP flushes for example, I think now they'll do PID flushes (if they 
> have to be broadcast, which they will tend to be when khugepaged does
> them). So now that might increase jitter for THP and cause it to be a
> loss for more workloads.
> 
> So where do you notice this? What's the benefit?

For that matter, I wonder if we shouldn't do something like this 
(untested) so the low level batch flush has visibility to the high
level flush range.

x86 could use this too AFAIKS, just needs to pass the range a bit
further down, but in practice I'm not sure it would ever really
matter for them because the 2MB level exceeds the single page flush
ceiling for 4k pages unlike powerpc with 64k pages. But in corner
cases where the unmap crossed a bunch of small vmas or the ceiling
was increased then in theory it could be of use.
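
To put numbers on the page-size point, here is a small userspace sketch in C. It is not the kernel code: the helper names are hypothetical, and the ceiling of 33 is simply borrowed from the powerpc value quoted in this thread and applied to both page sizes for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: number of base pages covered by range_bytes. */
static unsigned long range_to_pages(unsigned long range_bytes,
				    unsigned int page_shift)
{
	assert(page_shift < 8 * sizeof(unsigned long)); /* keep the shift defined */
	return range_bytes >> page_shift;
}

/*
 * Hypothetical helper: choose the full-process (PID) flush once the
 * page count exceeds the ceiling, mirroring an "nr_pages > ceiling" test.
 */
static bool want_pid_flush(unsigned long range_bytes,
			   unsigned int page_shift,
			   unsigned long ceiling)
{
	return range_to_pages(range_bytes, page_shift) > ceiling;
}
```

With 4k pages (page_shift 12) a 2MB range is 512 pages, so want_pid_flush(2UL << 20, 12, 33) is true. With 64k pages (page_shift 16) the same range is only 32 pages, so want_pid_flush(2UL << 20, 16, 33) is false. The same helper also shows why the size of the entire range matters: a 16-page batch stays below the ceiling, while the 64MB range it belongs to does not.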

Subject: [PATCH v1] mm/mmu_gather: provide archs with the entire range that is
 to be flushed, not just the particular gather

This allows archs to optimise flushing heuristics better, in the face of
flush operations forcing smaller flush batches. For example, an
architecture may choose a more costly per-page invalidation for small
ranges of pages with the assumption that the full TLB flush cost would
be more expensive in terms of refills. However if a very large range is
forced into flushing small ranges, the faster full-process flush may
have been the better choice.

---
 arch/powerpc/mm/book3s64/radix_tlb.c | 33
 fs/exec.c                            |  3 ++-
 include/asm-generic/tlb.h            |  9
 include/linux/mm_types.h             |  3 ++-
 mm/hugetlb.c                         |  2 +-
 mm/madvise.c                         |  6 ++---
 mm/memory.c                          |  4 ++--
 mm/mmap.c                            |  2 +-
 mm/mmu_gather.c                      | 10 ++---
 mm/oom_kill.c                        |  2 +-
 10 files changed, 47 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index aefc100d79a7..e1072d85d72e 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -1110,12 +1110,13 @@ static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
 static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
 
 static inline void __radix__flush_tlb_range(struct mm_struct *mm,
-					    unsigned long start, unsigned long end)
+					    unsigned long start, unsigned long end,
+					    unsigned long entire_range)
 {
 	unsigned long pid;
 	unsigned int page_shift = mmu_psize_defs[mmu_virtual_psize].shift;
 	unsigned long page_size = 1UL << page_shift;
-	unsigned long nr_pages = (end - start) >> page_shift;
+	unsigned long entire_nr_pages = entire_range >> page_shift;
 	bool fullmm = (end == TLB_FLUSH_ALL);
 	bool flush_pid, flush_pwc = false;
 	enum tlb_flush_type type;
@@ -1133,9 +1134,9 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
 	if (fullmm)
 		flush_pid = true;
 	else if (type == FLUSH_TYPE_GLOBAL)
-		flush_pid = nr_pages > tlb_single_page_flush_ceiling;
+		flush_pid = entire_nr_pages > tlb_single_page_flush_ceiling;
 	else
-		flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
+		flush_pid = entire_nr_pages > tlb_local_single_page_flush_ceiling;
 	/*
 	 * full pid flush already does the PWC flush. if it is not full pid
 	 * flush check the range is more than PMD and force a pwc flush
@@ -1220,7 +1221,7 @@ void radix__flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
return radix__flush_hugetl

Re: [RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE

2021-08-03 Thread Michael Ellerman
Nicholas Piggin  writes:
> Excerpts from Aneesh Kumar K.V's message of August 4, 2021 12:37 am:
>> With shared mapping, even though we are unmapping a large range, the kernel
>> will force a TLB flush with ptl lock held to avoid the race mentioned in
>> commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb flushing and 
>> memory freeing parts")
>> This results in the kernel issuing a high number of TLB flushes even for a
>> large range. This can be improved by making sure the kernel switches to a
>> PID-based flush if the kernel is unmapping a 2M range.
>
> It would be good to have a bit more description here.
>
> In any patch that changes a heuristic like this, I would like to see 
> some justification or reasoning that could be refuted or used as a 
> supporting argument if we ever wanted to change the heuristic later.
> Ideally with some of the obvious downsides listed as well.
>
> This "improves" things here, but what if it hurt things elsewhere, how 
> would we come in later and decide to change it back?
>
> THP flushes for example, I think now they'll do PID flushes (if they 
> have to be broadcast, which they will tend to be when khugepaged does
> them). So now that might increase jitter for THP and cause it to be a
> loss for more workloads.
>
> So where do you notice this? What's the benefit?

Ack. Needs some numbers and supporting evidence.

>> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
>> index aefc100d79a7..21d0f098e43b 100644
>> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
>> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
>> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
>>   * invalidating a full PID, so it has a far lower threshold to change from
>>   * individual page flushes to full-pid flushes.
>>   */
>> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
>> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
>>  static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
>>  
>>  static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>>  	if (fullmm)
>>  		flush_pid = true;
>>  	else if (type == FLUSH_TYPE_GLOBAL)
>> -		flush_pid = nr_pages > tlb_single_page_flush_ceiling;
>> +		flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>
> Arguably >= is nicer than > here, but this shouldn't be in the same 
> patch as the value change.
>
>>  	else
>>  		flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
>
> And it should change everything to be consistent. Although I'm not sure 
> it's worth changing even though I highly doubt any administrator would
> be tweaking this.
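
To make the combined effect of the two changes in the quoted hunk concrete (ceiling 33 to 32, and '>' to '>='), here is a small userspace sketch in C. The function names are hypothetical and the constants are copied from the hunk; it only models the FLUSH_TYPE_GLOBAL comparison, nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Rule before the quoted hunk: ceiling 33, strict comparison. */
static bool pid_flush_before(unsigned long nr_pages)
{
	return nr_pages > 33;
}

/* Rule after the quoted hunk: ceiling 32, inclusive comparison. */
static bool pid_flush_after(unsigned long nr_pages)
{
	return nr_pages >= 32;
}

/*
 * Hypothetical helper: smallest page count in [0, limit) for which the
 * rule selects a PID flush, or limit if it never does in that interval.
 */
static unsigned long first_pid_flush(bool (*rule)(unsigned long),
				     unsigned long limit)
{
	unsigned long n;

	assert(rule != NULL);
	for (n = 0; n < limit; n++)
		if (rule(n))
			return n;
	return limit;
}
```

Under these assumptions the old rule first selects a PID flush at 34 pages and the new one at 32 pages. Assuming 64k pages, a 2M range is exactly 32 pages, so only the new rule upgrades it; the two changes together move the switch-over point by two pages, which is one reason to evaluate them as separate patches.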

This made me look at how an administrator tweaks these thresholds, and
AFAICS the answer is "recompile the kernel"?

It looks like x86 has a debugfs file for tlb_single_page_flush_ceiling,
but we don't. I guess we meant to copy that but never did?

So at the moment both thresholds could just be #defines.

Making them tweakable at runtime would be nice, it would give us an
escape hatch if we ever hit a workload in production that wants a
different value.

cheers