Re: linux-next: build warning after merge of the akpm-current tree

2020-11-05 Thread Stephen Rothwell
Hi Anand,

On Thu, 5 Nov 2020 18:42:23 +1100 "Anand K. Mistry"  wrote:
>
> How would I go about fixing this? Send a new (v2), fixed patch to the
> mailing list? I'm not that familiar with how patches get merged
> through the branches.

Since this is in Andrew's quilt series, either a v2, or an incremental
patch (to wherever the original went - including cc'ing Andrew).  If
you send a v2, he will probably turn it into an incremental patch and
then squash it back before sending it to Linus.

-- 
Cheers,
Stephen Rothwell




Re: linux-next: build warning after merge of the akpm-current tree

2020-11-05 Thread Stephen Rothwell
Hi Anand,

On Thu, 5 Nov 2020 19:00:11 +1100 Stephen Rothwell  
wrote:
>
> On Thu, 5 Nov 2020 18:42:23 +1100 "Anand K. Mistry"  
> wrote:
> >
> > How would I go about fixing this? Send a new (v2), fixed patch to the
> > mailing list? I'm not that familiar with how patches get merged
> > through the branches.  
> 
> Since this is in Andrew's quilt series, either a v2, or an incremental
> patch (to wherever the original went - including cc'ing Andrew).  If
> you send a v2, he will probably turn it into an incremental patch and
> then squash it back before sending it to Linus.

And if you cc me as well, I will add it to the copy of Andrew's series
that I have in linux-next (so I won't have to worry about the warning
until Andrew gets around to sending out a new version of his quilt
series).
-- 
Cheers,
Stephen Rothwell




Re: [PATCH] USB: serial: mos7720: fix parallel-port state restore

2020-11-05 Thread Johan Hovold
On Wed, Nov 04, 2020 at 05:59:10PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Nov 04, 2020 at 05:47:27PM +0100, Johan Hovold wrote:
> > The parallel-port restore operation is called when a driver claims the
> > port and is supposed to restore the provided state (e.g. saved when
> > releasing the port).
> > 
> > Fixes: b69578df7e98 ("USB: usbserial: mos7720: add support for parallel 
> > port on moschip 7715")
> > Cc: stable  # 2.6.35
> > Signed-off-by: Johan Hovold 
> > ---
> >  drivers/usb/serial/mos7720.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/usb/serial/mos7720.c b/drivers/usb/serial/mos7720.c
> > index 5eed1078fac8..5a5d2a95070e 100644
> > --- a/drivers/usb/serial/mos7720.c
> > +++ b/drivers/usb/serial/mos7720.c
> > @@ -639,6 +639,8 @@ static void parport_mos7715_restore_state(struct 
> > parport *pp,
> > spin_unlock(&release_lock);
> > return;
> > }
> > +   mos_parport->shadowDCR = s->u.pc.ctr;
> > +   mos_parport->shadowECR = s->u.pc.ecr;
> 
> Wow that's old code.  I'm guessing no one uses these devices really :(

Possibly, but this would still work as long as you don't switch parallel
port driver without disconnecting the mos7715 device in between.

> Anyway, nice work:
> 
> Reviewed-by: Greg Kroah-Hartman 

Thanks, now applied for -next.

Johan


Re: [PATCH v2] ARM: dts: exynos: Add a placeholder for a MAC address

2020-11-05 Thread Anand Moon
Hi Marek,

On Mon, 2 Nov 2020 at 21:53, Marek Szyprowski  wrote:
>
> Hi Anand,
>
> On 01.11.2020 15:07, Anand Moon wrote:
> > Hi Lukasz,
> >
> > On Thu, 1 Oct 2020 at 19:25, Łukasz Stelmach  wrote:
> >> Add a placeholder for a MAC address. A bootloader may fill it
> >> to set the MAC address and override EEPROM settings.
> >>
> >> Signed-off-by: Łukasz Stelmach 
> >> ---
> >> Changes in v2:
> >>   - use local-mac-address and leave mac-address to be added by a bootloader
> >>
> >>   arch/arm/boot/dts/exynos5422-odroidxu3.dts | 18 ++
> >>   1 file changed, 18 insertions(+)
> >>
> >> diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3.dts 
> >> b/arch/arm/boot/dts/exynos5422-odroidxu3.dts
> >> index db0bc17a667b..d0f6ac5fa79d 100644
> >> --- a/arch/arm/boot/dts/exynos5422-odroidxu3.dts
> >> +++ b/arch/arm/boot/dts/exynos5422-odroidxu3.dts
> >> @@ -70,3 +70,21 @@ &pwm {
> >>   &usbdrd_dwc3_1 {
> >>  dr_mode = "peripheral";
> >>   };
> >> +
> >> +&usbhost2 {
> >> +   #address-cells = <1>;
> >> +   #size-cells = <0>;
> >> +
> >> +   hub@1 {
> >> +   compatible = "usb8087,0024";
> >> +   reg = <1>;
> >> +   #address-cells = <1>;
> >> +   #size-cells = <0>;
> >> +
> >> +   ethernet: usbether@1 {
> >> +   compatible = "usb0c45,6310";
> >> +   reg = <1>;
> >> +   local-mac-address = [00 00 00 00 00 00]; /* Filled 
> >> in by a bootloader */
> >> +   };
> >> +   };
> >> +};
> >> --
> >> 2.26.2
> >>
> > Thanks for this patch, can you share some example on how to set the
> > mac address via u-boot bootargs
>
> A little bit hacky script to set permanent board unique MAC address:
>
> # setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b u2
> *0x1016; setexp.b u3 *0x1017; setenv ethaddr
> 0:0:${u0}:${u1}:${u2}:${u3}; setenv usbethaddr ${ethaddr};
>
OK this command worked for me.

> Then if there is proper ethernet0 alias set, u-boot will then
> automatically save the configured MAC address to the device tree. I've
> just checked this on recent u-boot v2020.10 and Odroid U3 board.
>
> Lukasz will send updated patch soon (with proper alias entry).
>
> If you want to hack setting MAC address manually, this will work with
> the current patch:
>
> # setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b u2
> *0x1016; setexp.b u3 *0x1017; fdt addr ${fdtaddr}; fdt set
> /soc/usb@1211/hub@1/usbether@1 local-mac-address [ 0 0 ${u0} ${u1}
> ${u2} ${u3} ]
>

So do we need a similar patch for u-boot?
I am getting the following error on Odroid U3+ and U-Boot 2020.10:

Odroid #  setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b
u2 *0x1016; setexp.b u3 *0x1017; fdt addr ${fdtaddr}; fdt set
/soc/usb@1211/hub@1/usbether@1 local-mac-address [ 0 0 ${u0} ${u1}
${u2} ${u3} ]
No FDT memory address configured. Please configure
the FDT address via "fdt addr " command.
Aborting!

I also added these commands to boot.scr, but I still observe the failure:

mmc0(part 0) is current device
Scanning mmc 0:1...
Found U-Boot script /boot/boot.scr
969 bytes read in 5 ms (188.5 KiB/s)
## Executing script at 4200
7341440 bytes read in 265 ms (26.4 MiB/s)
53875 bytes read in 56 ms (939.5 KiB/s)
7964187 bytes read in 285 ms (26.6 MiB/s)
libfdt fdt_path_offset() returned FDT_ERR_NOTFOUND
Kernel image @ 0x4100 [ 0x00 - 0x700580 ]
## Flattened Device Tree blob at 4080
   Booting using the fdt blob at 0x4080
   Loading Ramdisk to 4f867000, end 461b ... OK
   Loading Device Tree to 4f856000, end 4f866272 ... OK
Best Regards
-Anand

> > also can you update this patch for exynos5422-odroidxu3-lite.dts and
> > exynos4412-odroidu3.dts.
>
> Also odroid-x2 and odroid-xu. Lukasz will take care of them.
>
> Best regards
>
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>


REPOST: Questions about driver development and Geniatech X9320 DVB-S/S2 PCIe Tuner

2020-11-05 Thread m . bertens




-------- Original Message --------
Subject: Questions about driver development and Geniatech X9320
DVB-S/S2 PCIe Tuner

Date: 2020-11-02 10:38
From: m.bert...@pe2mbs.nl
To: Linux Media 

Hi,

I'm trying to get a Geniatech X9320 DVB-S/S2 PCIe Quad Tuner card working,
but I'm new to Linux kernel driver development.

It has a hardware configuration similar to the TechnoTrend TT-connect
S2-4600:

* D720201    USB host controller
* cy68013A   USB bridge (4x)
* m88ds3103  Demodulator (4x)
* m88ts2022  Tuner (4x)

As far as I can tell, all of these components are supported by Linux, but
the drivers are for another device.

I have looked at the dw2102.c driver source code and I see that firmware
is loaded. My question is: where does this firmware go? Into the cy68013A
or the m88ds3103 chips, or somewhere else?

I looked at the Geniatech site, but even though they say there is a Linux
driver, the actual implementation is not in the archive.

I hope there is someone who can help me get this card working under
Linux.



Kind regards,


Marc Bertens-Nguyen
PE2MBS


Re: [PATCH] applesmc: Re-work SMC comms v2

2020-11-05 Thread Henrik Rydberg

Hi Brad,

Great to see this effort, it is certainly an area which could be 
improved. After having seen several generations of Macbooks while 
modifying much of that code, it became clear that the SMC communication 
got refreshed a few times over the years. Every tiny change had to be 
tested on all machines, or kept separate for a particular generation, or 
something would break.


I have not followed the back story here, but I imagine the need has 
arisen because of a new refresh, and so this patch only needs to 
strictly apply to a new generation. I would therefore advise that you 
write the patch in that way, reducing the actual change to zero for 
earlier generations. It also makes it easier to test the effect of the 
new approach on older systems. I should be able to help testing on a 
2008 and 2011 model once we get to that stage.


Thanks,
Henrik

On 2020-11-05 08:26, Brad Campbell wrote:

Commit fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()") introduced
an issue whereby communication with the SMC became unreliable with write
errors like :

[  120.378614] applesmc: send_byte(0x00, 0x0300) fail: 0x40
[  120.378621] applesmc: LKSB: write data fail
[  120.512782] applesmc: send_byte(0x00, 0x0300) fail: 0x40
[  120.512787] applesmc: LKSB: write data fail

The original code appeared to be timing sensitive and was not reliable with
the timing changes in the aforementioned commit.

This patch re-factors the SMC communication to remove the timing
dependencies and restore function with the changes previously committed.

v2 : Address logic and coding style

Reported-by: Andreas Kemnade 
Fixes: fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()")
Signed-off-by: Brad Campbell 

---
diff --git a/drivers/hwmon/applesmc.c b/drivers/hwmon/applesmc.c
index a18887990f4a..de890f3ec12f 100644
--- a/drivers/hwmon/applesmc.c
+++ b/drivers/hwmon/applesmc.c
@@ -42,6 +42,11 @@
  
  #define APPLESMC_MAX_DATA_LENGTH 32
  
+/* Apple SMC status bits */

+#define SMC_STATUS_AWAITING_DATA  BIT(0) /* SMC has data waiting */
+#define SMC_STATUS_IB_CLOSED  BIT(1) /* Will ignore any input */
+#define SMC_STATUS_BUSY   BIT(2) /* Command in progress */
+
  /* wait up to 128 ms for a status change. */
  #define APPLESMC_MIN_WAIT 0x0010
  #define APPLESMC_RETRY_WAIT   0x0100
@@ -151,65 +156,69 @@ static unsigned int key_at_index;
  static struct workqueue_struct *applesmc_led_wq;
  
  /*

- * wait_read - Wait for a byte to appear on SMC port. Callers must
- * hold applesmc_lock.
+ * Wait for specific status bits with a mask on the SMC
+ * Used before and after writes, and before reads
   */
-static int wait_read(void)
+
+static int wait_status(u8 val, u8 mask)
  {
unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
u8 status;
int us;
  
  	for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {

-   usleep_range(us, us * 16);
status = inb(APPLESMC_CMD_PORT);
-   /* read: wait for smc to settle */
-   if (status & 0x01)
+   if ((status & mask) == val)
return 0;
/* timeout: give up */
if (time_after(jiffies, end))
break;
-   }
-
-   pr_warn("wait_read() fail: 0x%02x\n", status);
+   usleep_range(us, us * 16);
+   }
return -EIO;
  }
  
  /*

- * send_byte - Write to SMC port, retrying when necessary. Callers
+ * send_byte_data - Write to SMC data port. Callers
   * must hold applesmc_lock.
+ * Parameter skip must be true on the last write of any
+ * command or it'll time out.
   */
-static int send_byte(u8 cmd, u16 port)
+
+static int send_byte_data(u8 cmd, u16 port, bool skip)
  {
-   u8 status;
-   int us;
-   unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
+   int ret;
  
+	ret = wait_status(SMC_STATUS_BUSY, SMC_STATUS_BUSY | SMC_STATUS_IB_CLOSED);

+   if (ret)
+   return ret;
outb(cmd, port);
-   for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
-   usleep_range(us, us * 16);
-   status = inb(APPLESMC_CMD_PORT);
-   /* write: wait for smc to settle */
-   if (status & 0x02)
-   continue;
-   /* ready: cmd accepted, return */
-   if (status & 0x04)
-   return 0;
-   /* timeout: give up */
-   if (time_after(jiffies, end))
-   break;
-   /* busy: long wait and resend */
-   udelay(APPLESMC_RETRY_WAIT);
-   outb(cmd, port);
-   }
+   return wait_status(skip ? 0 : SMC_STATUS_BUSY, SMC_STATUS_BUSY);
+}
  
-	pr_warn("send_byte(0x%02x, 0x%04x) fail: 0x%02x\n", cmd, port, status);

-   return -EIO;
+static int send_byte(u8 cmd, u16 port)
+{
+   return send_byte_data(cmd, port, false);
  }
  
+

Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback

2020-11-05 Thread Ioana Ciornei
On Wed, Nov 04, 2020 at 11:27:00PM +0200, Vladimir Oltean wrote:
> On Wed, Nov 04, 2020 at 06:57:17PM +0200, Ioana Ciornei wrote:
> > From: Ioana Ciornei 
> > 
> > Implement the .ndo_start_xmit() callback for the switch port interfaces.
> > For each of the switch ports, gather the corresponding queue
> > destination ID (QDID) necessary for Tx enqueueing.
> > 
> > We'll reserve 64 bytes for software annotations, where we keep a skb
> > backpointer used on the Tx confirmation side for releasing the allocated
> > memory. At the moment, we only support linear skbs.
> > 
> > Signed-off-by: Ioana Ciornei 
> > ---
> > @@ -554,8 +560,10 @@ static int dpaa2_switch_port_open(struct net_device 
> > *netdev)
> > struct ethsw_core *ethsw = port_priv->ethsw_data;
> > int err;
> >  
> > -   /* No need to allow Tx as control interface is disabled */
> > -   netif_tx_stop_all_queues(netdev);
> > +   if (!dpaa2_switch_has_ctrl_if(port_priv->ethsw_data)) {
> > +   /* No need to allow Tx as control interface is disabled */
> > +   netif_tx_stop_all_queues(netdev);
> 
> Personal taste probably, but you could remove the braces here.

Usually checkpatch complains about this kind of thing but not this time.
Maybe it takes into account the comment as well..

I'll remove the braces.

> 
> > +   }
> >  
> > /* Explicitly set carrier off, otherwise
> >  * netif_carrier_ok() will return true and cause 'ip link show'
> > @@ -610,15 +618,6 @@ static int dpaa2_switch_port_stop(struct net_device 
> > *netdev)
> > return 0;
> >  }
> >  
> > +static netdev_tx_t dpaa2_switch_port_tx(struct sk_buff *skb,
> > +   struct net_device *net_dev)
> > +{
> > +   struct ethsw_port_priv *port_priv = netdev_priv(net_dev);
> > +   struct ethsw_core *ethsw = port_priv->ethsw_data;
> > +   int retries = DPAA2_SWITCH_SWP_BUSY_RETRIES;
> > +   struct dpaa2_fd fd;
> > +   int err;
> > +
> > +   if (!dpaa2_switch_has_ctrl_if(ethsw))
> > +   goto err_free_skb;
> > +
> > +   if (unlikely(skb_headroom(skb) < DPAA2_SWITCH_NEEDED_HEADROOM)) {
> > +   struct sk_buff *ns;
> > +
> > +   ns = skb_realloc_headroom(skb, DPAA2_SWITCH_NEEDED_HEADROOM);
> 
> Looks like this passion for skb_realloc_headroom runs in the company?

Not really, ocelot and sja1105 are safe :)

> Few other drivers use it, and Claudiu just had a bug with that in gianfar.
> Luckily what saves you from the same bug is the skb_unshare from right below.
> Maybe you could use skb_cow_head and simplify this a bit?
> 
> > +   if (unlikely(!ns)) {
> > +   netdev_err(net_dev, "Error reallocating skb 
> > headroom\n");
> > +   goto err_free_skb;
> > +   }
> > +   dev_kfree_skb(skb);
> 
> Please use dev_consume_skb_any here, as it's not error path. Or, if you
> use skb_cow_head, only the skb data will be reallocated, not the skb
> structure itself, so there will be no consume_skb in that case at all,
> another reason to simplify.

Ok, I can try that.

The way dpaa2-eth deals with this now is to just create an S/G FD when the
headroom is less than what's necessary, so no skb_realloc_headroom() or
skb_cow_head(). But I agree that it's best to make it as simple as
possible.

> 
> > +   skb = ns;
> > +   }
> > +
> > +   /* We'll be holding a back-reference to the skb until Tx confirmation */
> > +   skb = skb_unshare(skb, GFP_ATOMIC);
> > +   if (unlikely(!skb)) {
> > +   /* skb_unshare() has already freed the skb */
> > +   netdev_err(net_dev, "Error copying the socket buffer\n");
> > +   goto err_exit;
> > +   }
> > +
> > +   if (skb_is_nonlinear(skb)) {
> > +   netdev_err(net_dev, "No support for non-linear SKBs!\n");
> 
> Rate-limit maybe?

Yep, that probably should be rate-limited.

> And what is the reason for no non-linear skb's? Too much code to copy
> from dpaa2-eth?

Once this is out of staging, dpaa2-eth and dpaa2-switch could share
the Tx/Rx code path so, as you said, I just didn't want to duplicate
everything if it's not specifically needed.

> > diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h 
> > b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > index bd24be2c6308..b267c04e2008 100644
> > --- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > +++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.h
> > @@ -66,6 +66,19 @@
> >   */
> >  #define DPAA2_SWITCH_SWP_BUSY_RETRIES  1000
> >  
> > +/* Hardware annotation buffer size */
> > +#define DPAA2_SWITCH_HWA_SIZE  64
> > +/* Software annotation buffer size */
> > +#define DPAA2_SWITCH_SWA_SIZE  64
> > +
> > +#define DPAA2_SWITCH_TX_BUF_ALIGN  64
> 
> Could you align all of these to the "1000" from DPAA2_SWITCH_SWP_BUSY_RETRIES?
> 

Sure.

> > +
> > +#define DPAA2_SWITCH_TX_DATA_OFFSET \
> > +   (DPAA2_SWITCH_HWA_SIZE + DPAA2_SWITCH_SWA_SIZE)
> > +
> > +#define DPAA2_SWITCH_NEEDED_HEADROOM \
> > +   (DPAA2_SWITCH_TX_DATA_OFFSET + DPAA2_SWITCH_TX_BUF_AL

Re: [PATCH] applesmc: Re-work SMC comms v2

2020-11-05 Thread Andreas Kemnade
On Thu, 5 Nov 2020 18:26:24 +1100
Brad Campbell  wrote:

> Commit fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()") introduced
> an issue whereby communication with the SMC became unreliable with write
> errors like :
> 
> [  120.378614] applesmc: send_byte(0x00, 0x0300) fail: 0x40
> [  120.378621] applesmc: LKSB: write data fail
> [  120.512782] applesmc: send_byte(0x00, 0x0300) fail: 0x40
> [  120.512787] applesmc: LKSB: write data fail
> 
> The original code appeared to be timing sensitive and was not reliable with
> the timing changes in the aforementioned commit.
> 
> This patch re-factors the SMC communication to remove the timing 
> dependencies and restore function with the changes previously committed.
> 
> v2 : Address logic and coding style
> 
> Reported-by: Andreas Kemnade 
> Fixes: fff2d0f701e6 ("hwmon: (applesmc) avoid overlong udelay()")
> Signed-off-by: Brad Campbell 
> 
Still works here:
Tested-by: Andreas Kemnade  # MacBookAir6,2
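
For readers following the thread, the reworked wait_status() in the patch
polls the SMC status port until the masked status equals an expected value.
Below is a minimal userspace sketch of that polling pattern, not the driver
code itself. The callback type, the fake device and the max_polls parameter
are assumptions made for illustration; the real driver reads the port with
inb(), sleeps between polls with usleep_range() and returns -EIO on timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Status bits as named in the patch. */
#define SMC_STATUS_AWAITING_DATA 0x01u /* SMC has data waiting */
#define SMC_STATUS_IB_CLOSED     0x02u /* will ignore any input */
#define SMC_STATUS_BUSY          0x04u /* command in progress */

/* Hypothetical stand-in for inb(APPLESMC_CMD_PORT). */
typedef uint8_t (*read_status_fn)(void *ctx);

/*
 * Poll until (status & mask) == val or the attempts run out.
 * Returns 0 on success and -1 on timeout. The sleep between polls
 * that the driver performs is omitted in this sketch.
 */
static int wait_status_sketch(read_status_fn read_status, void *ctx,
			      uint8_t val, uint8_t mask, int max_polls)
{
	/* val must be a subset of mask, otherwise it can never match. */
	assert(read_status != NULL && (val & ~mask) == 0);

	for (int i = 0; i < max_polls; i++) {
		if ((read_status(ctx) & mask) == val)
			return 0;
	}
	return -1;
}

/* Fake device: input buffer stays closed for a few polls, then opens. */
struct fake_smc {
	int polls_until_ready;
};

static uint8_t fake_read_status(void *ctx)
{
	struct fake_smc *smc = ctx;

	if (smc->polls_until_ready > 0) {
		smc->polls_until_ready--;
		return SMC_STATUS_BUSY | SMC_STATUS_IB_CLOSED;
	}
	return SMC_STATUS_BUSY;
}
```

Passing both the expected value and the mask lets one helper cover the
"busy and input buffer open" check before a data write as well as the
"no longer busy" check after the last write.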


Re: [PATCH 08/36] tty: tty_ldisc: Fix some kernel-doc related misdemeanours

2020-11-05 Thread Jiri Slaby

On 04. 11. 20, 20:35, Lee Jones wrote:

  - Functions must follow directly on from their headers
  - Demote non-conforming kernel-doc header
  - Ensure notes have unique section names
  - Provide missing description for 'reinit'

Fixes the following W=1 kernel build warning(s):

  drivers/tty/tty_ldisc.c:158: warning: cannot understand function prototype: 
'int tty_ldisc_autoload = IS_BUILTIN(CONFIG_LDISC_AUTOLOAD); '
  drivers/tty/tty_ldisc.c:199: warning: Function parameter or member 'ld' not 
described in 'tty_ldisc_put'
  drivers/tty/tty_ldisc.c:260: warning: duplicate section name 'Note'
  drivers/tty/tty_ldisc.c:717: warning: Function parameter or member 'reinit' 
not described in 'tty_ldisc_hangup'

Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Signed-off-by: Lee Jones 
---
  drivers/tty/tty_ldisc.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
index fe37ec331289b..aced2bf6173be 100644
--- a/drivers/tty/tty_ldisc.c
+++ b/drivers/tty/tty_ldisc.c
@@ -190,7 +189,7 @@ static struct tty_ldisc *tty_ldisc_get(struct tty_struct 
*tty, int disc)
return ld;
  }
  
-/**

+/*
   *tty_ldisc_put   -   release the ldisc


Having tty_ldisc_get in kernel-doc while tty_ldisc_put is not doesn't make 
much sense. What is tty_ldisc_put missing to conform to kernel-doc?


thanks,
--
js
suse labs
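
For context on the question above: the W=1 warning quoted in the patch says
that parameter 'ld' is not described in 'tty_ldisc_put'. Below is a sketch of
what a conforming kernel-doc header generally looks like. The struct and
function are hypothetical stand-ins so that the example compiles on its own;
they are not the tty code, and the description text is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted line discipline. */
struct ldisc_sketch {
	int refcount;
};

/**
 * ldisc_sketch_put - release a reference to the sketch ldisc
 * @ld: line discipline to release
 *
 * Every parameter gets an "@name:" line directly after the one-line
 * summary, and the comment sits immediately above the function.
 *
 * Return: the number of references that remain.
 */
static int ldisc_sketch_put(struct ldisc_sketch *ld)
{
	assert(ld != NULL && ld->refcount > 0);
	return --ld->refcount;
}
```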


Re: [PATCH 1/5] gpio: tps65910: use regmap accessors

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Michał Mirosław wrote:

> On Wed, Nov 04, 2020 at 02:43:31PM +, Lee Jones wrote:
> > On Thu, 01 Oct 2020, Lee Jones wrote:
> > > On Wed, 30 Sep 2020, Linus Walleij wrote:
> > > > On Sun, Sep 27, 2020 at 1:59 AM Michał Mirosław 
> > > >  wrote:
> > > > > Use regmap accessors directly for register manipulation - removing one
> > > > > layer of abstraction.
> > > > >
> > > > > Signed-off-by: Michał Mirosław 
> > > > Reviewed-by: Linus Walleij 
> > > > 
> > > > I suppose it is easiest that Lee apply all patches to the MFD tree?
> > > Yes, that's fine.
> > I think this patch is orthogonal right?
> > 
> > Not sure why it needs to go in via MFD.
> [...]
> 
> Patch 4 assumes all previous patches are applied (or there will be
> build breakage).

Okay, no problem.

Linus, do you want a PR?

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[RFC v3 6/7] KVM: X86: Expose PKS to guest and userspace

2020-11-05 Thread Chenyi Qiang
Existence of PKS is enumerated via CPUID.(EAX=7H,ECX=0):ECX[31]. It is
enabled by setting CR4.PKS when long mode is active. PKS is only
implemented when EPT is enabled and requires the support of VM_{ENTRY,
EXIT}_LOAD_IA32_PKRS currently.
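
As a rough illustration of the enumeration rule above, the following sketch
tests bit 31 of ECX from CPUID leaf 7, subleaf 0, and combines it with the two
conditions the patch checks before exposing the feature. The struct, the
function names and the boolean parameters are hypothetical and chosen for
this example; they are not KVM's helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register values returned by one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* PKS is enumerated by CPUID.(EAX=7H,ECX=0):ECX[31]. */
static bool leaf7_has_pks(const struct cpuid_regs *leaf7_sub0)
{
	assert(leaf7_sub0 != NULL);
	return (leaf7_sub0->ecx >> 31) & 1u;
}

/*
 * Mirrors the condition in the patch: expose PKS only when EPT is
 * enabled, the VM-entry/VM-exit "load IA32_PKRS" controls exist, and
 * the host CPU enumerates the feature.
 */
static bool pks_exposable(const struct cpuid_regs *leaf7_sub0,
			  bool ept_enabled, bool has_load_pkrs_controls)
{
	return ept_enabled && has_load_pkrs_controls &&
	       leaf7_has_pks(leaf7_sub0);
}
```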

Signed-off-by: Chenyi Qiang 
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c|  3 ++-
 arch/x86/kvm/vmx/vmx.c  | 15 ---
 arch/x86/kvm/x86.c  |  9 +++--
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ba313c76a1b5..20b2a8be3591 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -100,7 +100,8 @@
  | X86_CR4_PGE | X86_CR4_PCE | X86_CR4_OSFXSR | 
X86_CR4_PCIDE \
  | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \
  | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \
- | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP))
+ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \
+ | X86_CR4_PKS))
 
 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR)
 
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 06a278b3701d..4062b83091b9 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -390,7 +390,8 @@ void kvm_set_cpu_caps(void)
F(AVX512VBMI) | F(LA57) | F(PKU) | 0 /*OSPKE*/ | F(RDPID) |
F(AVX512_VPOPCNTDQ) | F(UMIP) | F(AVX512_VBMI2) | F(GFNI) |
F(VAES) | F(VPCLMULQDQ) | F(AVX512_VNNI) | F(AVX512_BITALG) |
-   F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/
+   F(CLDEMOTE) | F(MOVDIRI) | F(MOVDIR64B) | 0 /*WAITPKG*/ |
+   0 /*PKS*/
);
/* Set LA57 based on hardware capability. */
if (cpuid_ecx(7) & F(LA57))
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e18fdd8ee36a..67af89ed9bb0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3186,7 +3186,7 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
}
 
/*
-* SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+* SMEP/SMAP/PKU/PKS is disabled if CPU is in non-paging mode in
 * hardware.  To emulate this behavior, SMEP/SMAP/PKU needs
 * to be manually disabled when guest switches to non-paging
 * mode.
@@ -3194,10 +3194,11 @@ int vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long 
cr4)
 * If !enable_unrestricted_guest, the CPU is always running
 * with CR0.PG=1 and CR4 needs to be modified.
 * If enable_unrestricted_guest, the CPU automatically
-* disables SMEP/SMAP/PKU when the guest sets CR0.PG=0.
+* disables SMEP/SMAP/PKU/PKS when the guest sets CR0.PG=0.
 */
if (!is_paging(vcpu))
-   hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
+   hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE |
+   X86_CR4_PKS);
}
 
vmcs_writel(CR4_READ_SHADOW, cr4);
@@ -7342,6 +7343,14 @@ static __init void vmx_set_cpu_caps(void)
if (vmx_pt_mode_is_host_guest())
kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
 
+   /*
+* PKS is not yet implemented for shadow paging.
+* If not support VM_{ENTRY, EXIT}_LOAD_IA32_PKRS,
+* don't expose the PKS as well.
+*/
+   if (enable_ept && cpu_has_load_ia32_pkrs())
+   kvm_cpu_cap_check_and_set(X86_FEATURE_PKS);
+
if (vmx_umip_emulated())
kvm_cpu_cap_set(X86_FEATURE_UMIP);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f5ede41bf9e6..5b157ff27dca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -982,7 +982,8 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
unsigned long old_cr4 = kvm_read_cr4(vcpu);
unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE |
   X86_CR4_SMEP;
-   unsigned long mmu_role_bits = pdptr_bits | X86_CR4_SMAP | X86_CR4_PKE;
+   unsigned long mmu_role_bits = pdptr_bits | X86_CR4_SMAP | X86_CR4_PKE |
+ X86_CR4_PKS;
 
if (kvm_valid_cr4(vcpu, cr4))
return 1;
@@ -1213,7 +1214,7 @@ static const u32 msrs_to_save_all[] = {
MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
-   MSR_IA32_UMWAIT_CONTROL,
+   MSR_IA32_UMWAIT_CONTROL, MSR_IA32_PKRS,
 
MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
@@ -5718,6 +5719,10 @@ 

[RFC v3 3/7] KVM: MMU: Rename the pkru to pkr

2020-11-05 Thread Chenyi Qiang
PKRU represents the PKU register utilized in the protection key rights
check for user pages. Protection Keys for Supervisor Pages (PKS) extends
the protection key architecture to cover supervisor pages.

Rename the *pkru* related variables and functions to *pkr* which stands
for both of the PKRU and PKRS. It makes sense because both registers
have the same format. PKS and PKU can also share the same bitmap to
cache the conditions where protection key checks are needed.
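
Because both registers share one layout (16 keys, two bits per key), the same
extraction works for PKRU and PKRS. The sketch below shows that decoding on
its own, using the shift-and-mask expression from permission_fault(). The
function and macro names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKR_AD 0x1u /* access disable: bit 0 of a key's field */
#define PKR_WD 0x2u /* write disable: bit 1 of a key's field */

/* Return the two-bit field for one protection key (0..15). */
static uint32_t pkr_bits_for_key(uint32_t pkr, unsigned int pkey)
{
	assert(pkey < 16);
	return (pkr >> (pkey * 2)) & 3u;
}

static bool pkr_access_disabled(uint32_t pkr, unsigned int pkey)
{
	return (pkr_bits_for_key(pkr, pkey) & PKR_AD) != 0;
}

static bool pkr_write_disabled(uint32_t pkr, unsigned int pkey)
{
	return (pkr_bits_for_key(pkr, pkey) & PKR_WD) != 0;
}
```

Whether a set WD bit actually blocks a given access depends on further state
that the MMU code folds into pkr_mask; the helpers here only decode the
register value.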

Signed-off-by: Chenyi Qiang 
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.h  | 12 ++--
 arch/x86/kvm/mmu/mmu.c  | 18 +-
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d44858b69353..7567952febd9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -382,7 +382,7 @@ struct kvm_mmu {
* with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
* Each domain has 2 bits which are ANDed with AD and WD from PKRU.
*/
-   u32 pkru_mask;
+   u32 pkr_mask;
 
u64 *pae_root;
u64 *lm_root;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 9c4a9c8e43d9..a77bd20c83f9 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -190,8 +190,8 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, 
struct kvm_mmu *mmu,
u32 errcode = PFERR_PRESENT_MASK;
 
WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK));
-   if (unlikely(mmu->pkru_mask)) {
-   u32 pkru_bits, offset;
+   if (unlikely(mmu->pkr_mask)) {
+   u32 pkr_bits, offset;
 
/*
* PKRU defines 32 bits, there are 16 domains and 2
@@ -199,15 +199,15 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, 
struct kvm_mmu *mmu,
* index of the protection domain, so pte_pkey * 2 is
* is the index of the first bit for the domain.
*/
-   pkru_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+   pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
 
/* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
offset = (pfec & ~1) +
((pte_access & PT_USER_MASK) << (PFERR_RSVD_BIT - 
PT_USER_SHIFT));
 
-   pkru_bits &= mmu->pkru_mask >> offset;
-   errcode |= -pkru_bits & PFERR_PK_MASK;
-   fault |= (pkru_bits != 0);
+   pkr_bits &= mmu->pkr_mask >> offset;
+   errcode |= -pkr_bits & PFERR_PK_MASK;
+   fault |= (pkr_bits != 0);
}
 
return -(u32)fault & errcode;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1f96adff8dc4..d22c0813e4b9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4301,20 +4301,20 @@ static void update_permission_bitmask(struct kvm_vcpu 
*vcpu,
 * away both AD and WD.  For all reads or if the last condition holds, WD
 * only will be masked away.
 */
-static void update_pkru_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
bool ept)
 {
unsigned bit;
bool wp;
 
if (ept) {
-   mmu->pkru_mask = 0;
+   mmu->pkr_mask = 0;
return;
}
 
/* PKEY is enabled only if CR4.PKE and EFER.LMA are both set. */
if (!kvm_read_cr4_bits(vcpu, X86_CR4_PKE) || !is_long_mode(vcpu)) {
-   mmu->pkru_mask = 0;
+   mmu->pkr_mask = 0;
return;
}
 
@@ -4348,7 +4348,7 @@ static void update_pkru_bitmask(struct kvm_vcpu *vcpu, 
struct kvm_mmu *mmu,
/* PKRU.WD stops write access. */
pkey_bits |= (!!check_write) << 1;
 
-   mmu->pkru_mask |= (pkey_bits & 3) << pfec;
+   mmu->pkr_mask |= (pkey_bits & 3) << pfec;
}
 }
 
@@ -4370,7 +4370,7 @@ static void paging64_init_context_common(struct kvm_vcpu 
*vcpu,
 
reset_rsvds_bits_mask(vcpu, context);
update_permission_bitmask(vcpu, context, false);
-   update_pkru_bitmask(vcpu, context, false);
+   update_pkr_bitmask(vcpu, context, false);
update_last_nonleaf_level(vcpu, context);
 
MMU_WARN_ON(!is_pae(vcpu));
@@ -4400,7 +4400,7 @@ static void paging32_init_context(struct kvm_vcpu *vcpu,
 
reset_rsvds_bits_mask(vcpu, context);
update_permission_bitmask(vcpu, context, false);
-   update_pkru_bitmask(vcpu, context, false);
+   update_pkr_bitmask(vcpu, context, false);
update_last_nonleaf_level(vcpu, context);
 
context->page_fault = paging32_page_fault;
@@ -4519,7 +4519,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
}
 
update_permission_bitmask(vcpu, context, false);
-   update_pkru_bitmask(vcpu, context,

[RFC v3 7/7] KVM: VMX: Enable PKS for nested VM

2020-11-05 Thread Chenyi Qiang
The PKS MSR is passed through to the guest directly. Configure the MSR to
match the L0/L1 settings so that a nested VM runs PKS properly.

Signed-off-by: Chenyi Qiang 
---
 arch/x86/kvm/vmx/nested.c | 37 +++--
 arch/x86/kvm/vmx/vmcs12.c |  2 ++
 arch/x86/kvm/vmx/vmcs12.h |  6 +-
 arch/x86/kvm/vmx/vmx.c| 10 ++
 arch/x86/kvm/vmx/vmx.h|  1 +
 5 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 2266d98ace6f..a0b0f6fc7808 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -653,6 +653,12 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct 
kvm_vcpu *vcpu,
MSR_IA32_PRED_CMD,
MSR_TYPE_W);
 
+   if (!msr_write_intercepted_l01(vcpu, MSR_IA32_PKRS))
+   nested_vmx_disable_intercept_for_msr(
+   msr_bitmap_l1, msr_bitmap_l0,
+   MSR_IA32_PKRS,
+   MSR_TYPE_R | MSR_TYPE_W);
+
kvm_vcpu_unmap(vcpu, &to_vmx(vcpu)->nested.msr_bitmap_map, false);
 
return true;
@@ -2439,6 +2445,10 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, 
struct vmcs12 *vmcs12)
if (kvm_mpx_supported() && vmx->nested.nested_run_pending &&
(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
+
+   if (vmx->nested.nested_run_pending &&
+   (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS))
+   vmcs_write64(GUEST_IA32_PKRS, vmcs12->guest_ia32_pkrs);
}
 
if (nested_cpu_has_xsaves(vmcs12))
@@ -2527,6 +2537,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct 
vmcs12 *vmcs12,
if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
vmcs_write64(GUEST_BNDCFGS, vmx->nested.vmcs01_guest_bndcfgs);
+
+   if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+   (!vmx->nested.nested_run_pending ||
+!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS)))
+   vmcs_write64(GUEST_IA32_PKRS, vmx->nested.vmcs01_guest_pkrs);
vmx_set_rflags(vcpu, vmcs12->guest_rflags);
 
/* EXCEPTION_BITMAP and CR0_GUEST_HOST_MASK should basically be the
@@ -2867,6 +2882,10 @@ static int nested_vmx_check_host_state(struct kvm_vcpu 
*vcpu,
   vmcs12->host_ia32_perf_global_ctrl)))
return -EINVAL;
 
+   if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PKRS) &&
+   CC(!kvm_pkrs_valid(vmcs12->host_ia32_pkrs)))
+   return -EINVAL;
+
 #ifdef CONFIG_X86_64
ia32e = !!(vcpu->arch.efer & EFER_LMA);
 #else
@@ -3016,6 +3035,10 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
if (nested_check_guest_non_reg_state(vmcs12))
return -EINVAL;
 
+   if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS) &&
+   CC(!kvm_pkrs_valid(vmcs12->guest_ia32_pkrs)))
+   return -EINVAL;
+
return 0;
 }
 
@@ -3326,6 +3349,9 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
if (kvm_mpx_supported() &&
!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+   if (kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+   !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PKRS))
+   vmx->nested.vmcs01_guest_pkrs = vmcs_read64(GUEST_IA32_PKRS);
 
/*
 * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and*
@@ -3929,6 +3955,7 @@ static bool is_vmcs12_ext_field(unsigned long field)
case GUEST_IDTR_BASE:
case GUEST_PENDING_DBG_EXCEPTIONS:
case GUEST_BNDCFGS:
+   case GUEST_IA32_PKRS:
return true;
default:
break;
@@ -3980,6 +4007,8 @@ static void sync_vmcs02_to_vmcs12_rare(struct kvm_vcpu *vcpu,
vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
if (kvm_mpx_supported())
vmcs12->guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
+   if (guest_cpuid_has(vcpu, X86_FEATURE_PKS))
+   vmcs12->guest_ia32_pkrs = vmcs_read64(GUEST_IA32_PKRS);
 
vmx->nested.need_sync_vmcs02_to_vmcs12_rare = false;
 }
@@ -4215,6 +4244,9 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL,
 vmcs12->host_ia32_perf_global_ctrl));
 
+   if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PKRS)
+   vmcs_write64(GUEST_IA32_PKRS, vmcs12->host_ia32_pkrs);
+
/* Set L1 segment info according to Intel SDM

[kvm-unit-tests PATCH] x86: Add tests for PKS

2020-11-05 Thread Chenyi Qiang
This unit test exercises the KVM support for Protection Keys for
Supervisor Pages (PKS). If CR4.PKS is set in long mode, supervisor
pkeys are checked in addition to normal paging protections, and access
or write can be disabled via an MSR update, without TLB flushes when
permissions change.
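
For illustration, a minimal sketch of how the MSR value for one key can
be built, assuming the layout in which key i owns bit 2*i
(access-disable) and bit 2*i+1 (write-disable) of IA32_PKRS. The helper
names are hypothetical and are not part of kvm-unit-tests; for pkey 2
they produce the pkrs_ad (0x10) and pkrs_wd (0x20) constants used in
x86/pks.c.

```c
#include <assert.h>
#include <stdint.h>

#define PKR_AD_BIT 0x1u  /* access-disable bit within a key's 2-bit pair */
#define PKR_WD_BIT 0x2u  /* write-disable bit within a key's 2-bit pair */

/* Hypothetical helper: PKRS value that disables all data access for pkey. */
static inline uint32_t pkrs_access_disable(unsigned int pkey)
{
    assert(pkey < 16);
    return (uint32_t)PKR_AD_BIT << (pkey * 2);
}

/* Hypothetical helper: PKRS value that disables writes for pkey. */
static inline uint32_t pkrs_write_disable(unsigned int pkey)
{
    assert(pkey < 16);
    return (uint32_t)PKR_WD_BIT << (pkey * 2);
}
```

Writing either value to the MSR changes the permissions for every page
tagged with that key at once, which is why no TLB flush is needed.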

Signed-off-by: Chenyi Qiang 
---
 lib/x86/msr.h   |   1 +
 lib/x86/processor.h |   2 +
 x86/Makefile.x86_64 |   1 +
 x86/pks.c   | 146 
 x86/unittests.cfg   |   5 ++
 5 files changed, 155 insertions(+)
 create mode 100644 x86/pks.c

diff --git a/lib/x86/msr.h b/lib/x86/msr.h
index 6ef5502..e36934b 100644
--- a/lib/x86/msr.h
+++ b/lib/x86/msr.h
@@ -209,6 +209,7 @@
 #define MSR_IA32_EBL_CR_POWERON0x002a
 #define MSR_IA32_FEATURE_CONTROL0x003a
 #define MSR_IA32_TSC_ADJUST0x003b
+#define MSR_IA32_PKRS  0x06e1
 
 #define FEATURE_CONTROL_LOCKED (1<<0)
 #define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX   (1<<1)
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 74a2498..985fdd0 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -50,6 +50,7 @@
 #define X86_CR4_SMEP   0x0010
 #define X86_CR4_SMAP   0x0020
 #define X86_CR4_PKE0x0040
+#define X86_CR4_PKS0x0100
 
 #define X86_EFLAGS_CF0x0001
 #define X86_EFLAGS_FIXED 0x0002
@@ -157,6 +158,7 @@ static inline u8 cpuid_maxphyaddr(void)
 #defineX86_FEATURE_RDPID   (CPUID(0x7, 0, ECX, 22))
 #defineX86_FEATURE_SPEC_CTRL   (CPUID(0x7, 0, EDX, 26))
 #defineX86_FEATURE_ARCH_CAPABILITIES   (CPUID(0x7, 0, EDX, 29))
+#defineX86_FEATURE_PKS (CPUID(0x7, 0, ECX, 31))
 #defineX86_FEATURE_NX  (CPUID(0x8001, 0, EDX, 20))
 #defineX86_FEATURE_RDPRU   (CPUID(0x8008, 0, EBX, 4))
 
diff --git a/x86/Makefile.x86_64 b/x86/Makefile.x86_64
index af61d85..3a353df 100644
--- a/x86/Makefile.x86_64
+++ b/x86/Makefile.x86_64
@@ -20,6 +20,7 @@ tests += $(TEST_DIR)/tscdeadline_latency.flat
 tests += $(TEST_DIR)/intel-iommu.flat
 tests += $(TEST_DIR)/vmware_backdoors.flat
 tests += $(TEST_DIR)/rdpru.flat
+tests += $(TEST_DIR)/pks.flat
 
 include $(SRCDIR)/$(TEST_DIR)/Makefile.common
 
diff --git a/x86/pks.c b/x86/pks.c
new file mode 100644
index 000..a3044cf
--- /dev/null
+++ b/x86/pks.c
@@ -0,0 +1,146 @@
+#include "libcflat.h"
+#include "x86/desc.h"
+#include "x86/processor.h"
+#include "x86/vm.h"
+#include "x86/msr.h"
+
+#define CR0_WP_MASK  (1UL << 16)
+#define PTE_PKEY_BIT 59
+#define SUPER_BASE(1 << 24)
+#define SUPER_VAR(v)  (*((__typeof__(&(v))) (((unsigned long)&v) + SUPER_BASE)))
+
+volatile int pf_count = 0;
+volatile unsigned save;
+volatile unsigned test;
+
+static void set_cr0_wp(int wp)
+{
+unsigned long cr0 = read_cr0();
+
+cr0 &= ~CR0_WP_MASK;
+if (wp)
+cr0 |= CR0_WP_MASK;
+write_cr0(cr0);
+}
+
+void do_pf_tss(unsigned long error_code);
+void do_pf_tss(unsigned long error_code)
+{
+printf("#PF handler, error code: 0x%lx\n", error_code);
+pf_count++;
+save = test;
+wrmsr(MSR_IA32_PKRS, 0);
+}
+
+extern void pf_tss(void);
+
+asm ("pf_tss: \n\t"
+#ifdef __x86_64__
+// no task on x86_64, save/restore caller-save regs
+"push %rax; push %rcx; push %rdx; push %rsi; push %rdi\n"
+"push %r8; push %r9; push %r10; push %r11\n"
+"mov 9*8(%rsp), %rdi\n"
+#endif
+"call do_pf_tss \n\t"
+#ifdef __x86_64__
+"pop %r11; pop %r10; pop %r9; pop %r8\n"
+"pop %rdi; pop %rsi; pop %rdx; pop %rcx; pop %rax\n"
+#endif
+"add $"S", %"R "sp\n\t" // discard error code
+"iret"W" \n\t"
+"jmp pf_tss\n\t"
+);
+
+static void init_test(void)
+{
+pf_count = 0;
+
+invlpg(&test);
+invlpg(&SUPER_VAR(test));
+wrmsr(MSR_IA32_PKRS, 0);
+set_cr0_wp(0);
+}
+
+int main(int ac, char **av)
+{
+unsigned long i;
+unsigned int pkey = 0x2;
+unsigned int pkrs_ad = 0x10;
+unsigned int pkrs_wd = 0x20;
+
+if (!this_cpu_has(X86_FEATURE_PKS)) {
+printf("PKS not enabled\n");
+return report_summary();
+}
+
+setup_vm();
+setup_alt_stack();
+set_intr_alt_stack(14, pf_tss);
+wrmsr(MSR_EFER, rdmsr(MSR_EFER) | EFER_LMA);
+
+// First 16MB are user pages
+for (i = 0; i < SUPER_BASE; i += PAGE_SIZE) {
+*get_pte(phys_to_virt(read_cr3()), phys_to_virt(i)) |= ((unsigned long)pkey << PTE_PKEY_BIT);
+invlpg((void *)i);
+}
+
+// Present the same 16MB as supervisor pages in the 16MB-32MB range
+for (i = SUPER_BASE; i < 2 * SUPER_BASE; i += PAGE_SIZE) {
+*get_pte(phys_to_virt(read_cr3()), phys_to_virt(i)) &= ~SUPER_BASE;
+*get_pte(phys_to_virt(read_cr3()), phys_to_virt(i)) &= ~PT_USER_MASK;
+*get_pte(phys_to_virt(read_cr3()), phys_to_virt(i)) |= ((unsigned long)pkey << PTE_PKEY_BIT);
+invlpg((voi

[RFC v3 0/7] KVM: PKS Virtualization support

2020-11-05 Thread Chenyi Qiang
Protection Keys for Supervisor Pages (PKS) is a feature that extends the
Protection Keys architecture to support thread-specific permission
restrictions on supervisor pages.

PKS works similarly to an existing feature named PKU (which protects
user pages). Both perform an additional check after all legacy access
permission checks are done. If the check is violated, a #PF occurs and
the PFEC.PK bit is set. PKS introduces the IA32_PKRS MSR to manage
supervisor protection key rights. The MSR contains 16 pairs of ADi and
WDi bits. Each pair applies to the group of pages that share the same
key, which is set in the leaf paging-structure entries (bits[62:59]).
Currently, IA32_PKRS is not supported by the XSAVES architecture.
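
The check described above can be sketched roughly as follows. This is a
simplified model, not KVM code: the function names are hypothetical, AD
is assumed to be the low bit and WD the high bit of each pair, and WD is
assumed to apply to supervisor writes only when CR0.WP = 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PKEY_SHIFT 59
#define PTE_PKEY_MASK  0xfu

/* Protection key stored in bits 62:59 of a leaf paging-structure entry. */
static inline unsigned int pte_pkey(uint64_t pte)
{
    return (unsigned int)((pte >> PTE_PKEY_SHIFT) & PTE_PKEY_MASK);
}

/* Extract the (WD << 1 | AD) pair that IA32_PKRS holds for one key. */
static inline uint32_t pkr_bits(uint32_t pkrs, unsigned int pkey)
{
    assert(pkey < 16);
    return (pkrs >> (pkey * 2)) & 3;
}

/* Does PKRS deny this supervisor data access to the page mapped by pte? */
static inline bool pks_denies(uint32_t pkrs, uint64_t pte,
                              bool is_write, bool cr0_wp)
{
    uint32_t bits = pkr_bits(pkrs, pte_pkey(pte));
    bool ad = bits & 1;  /* access disabled: blocks reads and writes */
    bool wd = bits & 2;  /* write disabled: blocks writes under CR0.WP */

    return ad || (wd && is_write && cr0_wp);
}
```

A denied access would be reported as a #PF with PFEC.PK set; pages
tagged with other keys are unaffected by the same PKRS value.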

This patchset adds the virtualization of PKS in KVM. It implements PKS
CPUID enumeration, vmentry/vmexit configuration, MSR exposure, nested
support, etc. Currently, PKS is not yet supported for shadow paging.

Detailed information about PKS can be found in the latest Intel 64 and
IA-32 Architectures Software Developer's Manual.

---

Changelogs:

v2->v3:
- No functional changes since the last submission
- rebase on the latest PKS kernel support:
  https://lore.kernel.org/lkml/20201102205320.1458656-1-ira.we...@intel.com/
- add MSR_IA32_PKRS to the vmx_possible_passthrough_msrs[]
- RFC v2:
  https://lore.kernel.org/lkml/20201014021157.18022-1-chenyi.qi...@intel.com/

v1->v2:
- rebase on the latest PKS kernel support:
  https://github.com/weiny2/linux-kernel/tree/pks-rfc-v3
- add a kvm-unit-tests for PKS
- add the check in kvm_init_msr_list for PKRS
- place the X86_CR4_PKS in mmu_role_bits in kvm_set_cr4
- add the support to expose VM_{ENTRY, EXIT}_LOAD_IA32_PKRS in nested
  VMX MSR
- RFC v1:
  https://lore.kernel.org/lkml/20200807084841.7112-1-chenyi.qi...@intel.com/

---

Chenyi Qiang (7):
  KVM: VMX: Introduce PKS VMCS fields
  KVM: VMX: Expose IA32_PKRS MSR
  KVM: MMU: Rename the pkru to pkr
  KVM: MMU: Refactor pkr_mask to cache condition
  KVM: MMU: Add support for PKS emulation
  KVM: X86: Expose PKS to guest and userspace
  KVM: VMX: Enable PKS for nested VM

 arch/x86/include/asm/kvm_host.h | 13 ++---
 arch/x86/include/asm/pkeys.h|  1 +
 arch/x86/include/asm/vmx.h  |  6 +++
 arch/x86/kvm/cpuid.c|  3 +-
 arch/x86/kvm/mmu.h  | 36 +++--
 arch/x86/kvm/mmu/mmu.c  | 78 +++-
 arch/x86/kvm/vmx/capabilities.h |  6 +++
 arch/x86/kvm/vmx/nested.c   | 38 +-
 arch/x86/kvm/vmx/vmcs.h |  1 +
 arch/x86/kvm/vmx/vmcs12.c   |  2 +
 arch/x86/kvm/vmx/vmcs12.h   |  6 ++-
 arch/x86/kvm/vmx/vmx.c  | 91 +++--
 arch/x86/kvm/vmx/vmx.h  |  3 +-
 arch/x86/kvm/x86.c  |  9 +++-
 arch/x86/kvm/x86.h  |  6 +++
 arch/x86/mm/pkeys.c |  6 +++
 include/linux/pkeys.h   |  4 ++
 17 files changed, 240 insertions(+), 69 deletions(-)

-- 
2.17.1



[RFC v3 1/7] KVM: VMX: Introduce PKS VMCS fields

2020-11-05 Thread Chenyi Qiang
PKS(Protection Keys for Supervisor Pages) is a feature that extends the
Protection Key architecture to support thread-specific permission
restrictions on supervisor pages.

A new PKS MSR (PKRS) is defined in the kernel to support PKS; it holds a
set of permissions associated with each protection domain.

Two VMCS fields {HOST,GUEST}_IA32_PKRS are introduced in
{host,guest}-state area to store the value of PKRS.

Every VM exit saves PKRS into guest-state area.
If VM_EXIT_LOAD_IA32_PKRS = 1, VM exit loads PKRS from the host-state
area.
If VM_ENTRY_LOAD_IA32_PKRS = 1, VM entry loads PKRS from the guest-state
area.
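
The save/load rules above can be modelled in a few lines. This is a toy
sketch for illustration only: the struct and function names are
hypothetical and do not correspond to KVM data structures or to how the
hardware is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the PKRS-related VMCS state. */
struct pkrs_vmcs_model {
    uint32_t guest_pkrs;   /* GUEST_IA32_PKRS field */
    uint32_t host_pkrs;    /* HOST_IA32_PKRS field */
    bool entry_load_pkrs;  /* VM_ENTRY_LOAD_IA32_PKRS control */
    bool exit_load_pkrs;   /* VM_EXIT_LOAD_IA32_PKRS control */
};

/* VM entry: load PKRS from the guest-state area only if the control is set. */
static inline uint32_t model_vm_entry(const struct pkrs_vmcs_model *v,
                                      uint32_t cpu_pkrs)
{
    assert(v);
    return v->entry_load_pkrs ? v->guest_pkrs : cpu_pkrs;
}

/* VM exit: always save PKRS, then load the host value if the control is set. */
static inline uint32_t model_vm_exit(struct pkrs_vmcs_model *v,
                                     uint32_t cpu_pkrs)
{
    assert(v);
    v->guest_pkrs = cpu_pkrs;
    return v->exit_load_pkrs ? v->host_pkrs : cpu_pkrs;
}
```

Both functions return the PKRS value the CPU would hold after the
transition. Note the asymmetry: the save on exit is unconditional, while
both loads are gated by a control bit.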

Signed-off-by: Chenyi Qiang 
Reviewed-by: Jim Mattson 
---
 arch/x86/include/asm/vmx.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index f8ba5289ecb0..5472859e21b0 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
 #define VM_EXIT_CLEAR_BNDCFGS   0x0080
 #define VM_EXIT_PT_CONCEAL_PIP 0x0100
 #define VM_EXIT_CLEAR_IA32_RTIT_CTL0x0200
+#define VM_EXIT_LOAD_IA32_PKRS 0x2000
 
 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR  0x00036dff
 
@@ -107,6 +108,7 @@
 #define VM_ENTRY_LOAD_BNDCFGS   0x0001
 #define VM_ENTRY_PT_CONCEAL_PIP0x0002
 #define VM_ENTRY_LOAD_IA32_RTIT_CTL0x0004
+#define VM_ENTRY_LOAD_IA32_PKRS0x0040
 
 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x11ff
 
@@ -243,12 +245,16 @@ enum vmcs_field {
GUEST_BNDCFGS_HIGH  = 0x2813,
GUEST_IA32_RTIT_CTL = 0x2814,
GUEST_IA32_RTIT_CTL_HIGH= 0x2815,
+   GUEST_IA32_PKRS = 0x2818,
+   GUEST_IA32_PKRS_HIGH= 0x2819,
HOST_IA32_PAT   = 0x2c00,
HOST_IA32_PAT_HIGH  = 0x2c01,
HOST_IA32_EFER  = 0x2c02,
HOST_IA32_EFER_HIGH = 0x2c03,
HOST_IA32_PERF_GLOBAL_CTRL  = 0x2c04,
HOST_IA32_PERF_GLOBAL_CTRL_HIGH = 0x2c05,
+   HOST_IA32_PKRS  = 0x2c06,
+   HOST_IA32_PKRS_HIGH = 0x2c07,
PIN_BASED_VM_EXEC_CONTROL   = 0x4000,
CPU_BASED_VM_EXEC_CONTROL   = 0x4002,
EXCEPTION_BITMAP= 0x4004,
-- 
2.17.1



[RFC v3 2/7] KVM: VMX: Expose IA32_PKRS MSR

2020-11-05 Thread Chenyi Qiang
Protection Keys for Supervisor Pages (PKS) uses IA32_PKRS MSR (PKRS) at
index 0x6E1 to allow software to manage supervisor protection key
rights. For performance reasons, the PKRS intercept is disabled so that
the guest can access PKRS without VM exits.
PKS introduces dedicated control fields in the VMCS to switch PKRS, which
only do the restore part. In addition, every VM exit saves PKRS into
the guest-state area in the VMCS, while VM entry does not save the host
value, on the expectation that the host won't change the MSR often. Update
the host's value in the VMCS manually if the MSR has been changed by the
kernel since the last time the VMCS was run.
The function get_current_pkrs() in arch/x86/mm/pkeys.c exports the
per-cpu variable pkrs_cache to avoid frequent rdmsr of PKRS.

Signed-off-by: Chenyi Qiang 
---
 arch/x86/include/asm/pkeys.h|  1 +
 arch/x86/kvm/vmx/capabilities.h |  6 +++
 arch/x86/kvm/vmx/nested.c   |  1 +
 arch/x86/kvm/vmx/vmcs.h |  1 +
 arch/x86/kvm/vmx/vmx.c  | 66 -
 arch/x86/kvm/vmx/vmx.h  |  2 +-
 arch/x86/kvm/x86.h  |  6 +++
 arch/x86/mm/pkeys.c |  6 +++
 include/linux/pkeys.h   |  4 ++
 9 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index f84351b4ac7c..4cc5ed49ae75 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -142,6 +142,7 @@ u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int flags);
 #ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
 int pks_key_alloc(const char *const pkey_user, int flags);
 void pks_key_free(int pkey);
+u32 get_current_pkrs(void);
 
 void pks_mk_noaccess(int pkey);
 void pks_mk_readonly(int pkey);
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 3a1861403d73..1cadeaaf9985 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -103,6 +103,12 @@ static inline bool cpu_has_load_perf_global_ctrl(void)
   (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL);
 }
 
+static inline bool cpu_has_load_ia32_pkrs(void)
+{
+   return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PKRS) &&
+  (vmcs_config.vmexit_ctrl & VM_EXIT_LOAD_IA32_PKRS);
+}
+
 static inline bool cpu_has_vmx_mpx(void)
 {
return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_BNDCFGS) &&
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 89af692deb7e..2266d98ace6f 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -250,6 +250,7 @@ static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
dest->ds_sel = src->ds_sel;
dest->es_sel = src->es_sel;
 #endif
+   dest->pkrs = src->pkrs;
 }
 
 static void vmx_switch_vmcs(struct kvm_vcpu *vcpu, struct loaded_vmcs *vmcs)
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 1472c6c376f7..b5ba6407c5e1 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -40,6 +40,7 @@ struct vmcs_host_state {
 #ifdef CONFIG_X86_64
u16   ds_sel, es_sel;
 #endif
+   u32   pkrs;
 };
 
 struct vmcs_controls_shadow {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 47b8357b9751..e18fdd8ee36a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -166,6 +166,7 @@ static u32 vmx_possible_passthrough_msrs[MAX_POSSIBLE_PASSTHROUGH_MSRS] = {
MSR_CORE_C3_RESIDENCY,
MSR_CORE_C6_RESIDENCY,
MSR_CORE_C7_RESIDENCY,
+   MSR_IA32_PKRS,
 };
 
 /*
@@ -1201,6 +1202,7 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 #endif
unsigned long fs_base, gs_base;
u16 fs_sel, gs_sel;
+   u32 host_pkrs;
int i;
 
vmx->req_immediate_exit = false;
@@ -1233,6 +1235,20 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 */
host_state->ldt_sel = kvm_read_ldt();
 
+   /*
+* Update the host pkrs vmcs field before vcpu runs.
+* The setting of VM_EXIT_LOAD_IA32_PKRS can ensure
+* kvm_cpu_cap_has(X86_FEATURE_PKS) &&
+* guest_cpuid_has(vcpu, X86_FEATURE_VMX).
+*/
+   if (vm_exit_controls_get(vmx) & VM_EXIT_LOAD_IA32_PKRS) {
+   host_pkrs = get_current_pkrs();
+   if (unlikely(host_pkrs != host_state->pkrs)) {
+   vmcs_write64(HOST_IA32_PKRS, host_pkrs);
+   host_state->pkrs = host_pkrs;
+   }
+   }
+
 #ifdef CONFIG_X86_64
savesegment(ds, host_state->ds_sel);
savesegment(es, host_state->es_sel);
@@ -1920,6 +1936,13 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
else
msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
break;
+   case MSR_IA32_PKRS:
+   if (!kvm_cpu_cap_has(X86_FEATURE_PKS) ||
+   (!msr_info->host_ini

Re: [PATCH 12/36] tty: tty_io: Fix some kernel-doc issues

2020-11-05 Thread Jiri Slaby

On 04. 11. 20, 20:35, Lee Jones wrote:

Demote non-conformant headers and supply some missing descriptions.

Fixes the following W=1 kernel build warning(s):

  drivers/tty/tty_io.c:218: warning: Function parameter or member 'file' not described in 'tty_free_file'
  drivers/tty/tty_io.c:566: warning: Function parameter or member 'exit_session' not described in '__tty_hangup'
  drivers/tty/tty_io.c:1077: warning: Function parameter or member 'tty' not described in 'tty_send_xchar'
  drivers/tty/tty_io.c:1077: warning: Function parameter or member 'ch' not described in 'tty_send_xchar'
  drivers/tty/tty_io.c:1155: warning: Function parameter or member 'file' not described in 'tty_driver_lookup_tty'
  drivers/tty/tty_io.c:1508: warning: Function parameter or member 'tty' not described in 'release_tty'
  drivers/tty/tty_io.c:1508: warning: Function parameter or member 'idx' not described in 'release_tty'
  drivers/tty/tty_io.c:2973: warning: Function parameter or member 'driver' not described in 'alloc_tty_struct'
  drivers/tty/tty_io.c:2973: warning: Function parameter or member 'idx' not described in 'alloc_tty_struct'

Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: Nick Holloway 
Cc: -- 
Cc: Marko Kohtala 
Cc: Bill Hawes 
Cc: "C. Scott Ananian" 
Cc: Russell King 
Cc: Andrew Morton 
Signed-off-by: Lee Jones 
---
  drivers/tty/tty_io.c | 10 +++---
  1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 88b00c47b606e..f50286fb080da 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2961,7 +2965,7 @@ static struct device *tty_get_device(struct tty_struct *tty)
  }
  
  
-/**

+/*
   *alloc_tty_struct
   *
   *This subroutine allocates and initializes a tty structure.


Why do you randomly sometimes fix kernel-doc and sometimes remove
functions from kernel-doc? What's the rule? For example,
alloc_tty_struct is among the ones I would like to see fixed instead of
removed from kernel-doc.


thanks,
--
js
suse labs


[RFC v3 4/7] KVM: MMU: Refactor pkr_mask to cache condition

2020-11-05 Thread Chenyi Qiang
The pkr_mask bitmap currently indicates whether protection key checks
are needed for user pages. It is indexed by page fault error code bits
[4:1], with PFEC.RSVD replaced by the ACC_USER_MASK from the page tables.
Refactor it by reverting to the use of PFEC.RSVD. After that, PKS and PKU
can share the same bitmap.
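
As a rough illustration of the refactored scheme, the sketch below
builds such a bitmap and looks it up. It is a standalone model loosely
following update_pkr_bitmask() and permission_fault(), not the kernel
code itself; build_pkr_mask() and pk_fault() are hypothetical names, and
the CR4/EFER conditions are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page-fault error code bits (architectural positions). */
#define PFERR_WRITE_MASK (1u << 1)
#define PFERR_USER_MASK  (1u << 2)
#define PFERR_RSVD_MASK  (1u << 3)
#define PFERR_FETCH_MASK (1u << 4)

/* Build the 32-bit mask: 16 domains of 2 bits, one per PFEC[4:1] value. */
static inline uint32_t build_pkr_mask(bool cr0_wp)
{
    uint32_t mask = 0;

    for (unsigned int bit = 0; bit < 16; ++bit) {
        unsigned int pfec = bit << 1;
        bool ff = pfec & PFERR_FETCH_MASK;
        bool uf = pfec & PFERR_USER_MASK;
        bool wf = pfec & PFERR_WRITE_MASK;
        bool rsvdf = pfec & PFERR_RSVD_MASK;
        /* Key checks apply to data accesses that are not rsvd faults. */
        bool check_pkey = !ff && !rsvdf;
        /* WD matters only for writes that are user accesses or CR0.WP = 1. */
        bool check_write = check_pkey && wf && (uf || cr0_wp);
        uint32_t pkey_bits = (uint32_t)check_pkey |
                             ((uint32_t)check_write << 1);

        mask |= pkey_bits << pfec;
    }
    return mask;
}

/* pkr_bits is the (WD << 1 | AD) pair for the page's key. */
static inline bool pk_fault(uint32_t mask, unsigned int pfec, uint32_t pkr_bits)
{
    unsigned int offset = pfec & ~1u;  /* clear the present bit */

    assert(offset < 32);
    return (pkr_bits & (mask >> offset) & 3) != 0;
}
```

Because the index no longer encodes the page's U/S bit, the caller
decides which register (PKRU or PKRS) supplies pkr_bits, and the same
mask serves both.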

Signed-off-by: Chenyi Qiang 
---
 arch/x86/kvm/mmu.h | 10 ++
 arch/x86/kvm/mmu/mmu.c | 16 ++--
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a77bd20c83f9..8f05f7c0f6df 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -199,11 +199,13 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
* index of the protection domain, so pte_pkey * 2 is
* is the index of the first bit for the domain.
*/
-   pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+   if (pte_access & PT_USER_MASK)
+   pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+   else
+   pkr_bits = 0;
 
-   /* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
-   offset = (pfec & ~1) +
-   ((pte_access & PT_USER_MASK) << (PFERR_RSVD_BIT - PT_USER_SHIFT));
+   /* clear present bit */
+   offset = (pfec & ~1);
 
pkr_bits &= mmu->pkr_mask >> offset;
errcode |= -pkr_bits & PFERR_PK_MASK;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d22c0813e4b9..39afa865dc1a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4322,21 +4322,25 @@ static void update_pkr_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 
for (bit = 0; bit < ARRAY_SIZE(mmu->permissions); ++bit) {
unsigned pfec, pkey_bits;
-   bool check_pkey, check_write, ff, uf, wf, pte_user;
+   bool check_pkey, check_write, ff, uf, wf, rsvdf;
 
pfec = bit << 1;
ff = pfec & PFERR_FETCH_MASK;
uf = pfec & PFERR_USER_MASK;
wf = pfec & PFERR_WRITE_MASK;
 
-   /* PFEC.RSVD is replaced by ACC_USER_MASK. */
-   pte_user = pfec & PFERR_RSVD_MASK;
+   /*
+* PFERR_RSVD_MASK bit is not set if the
+* access is subject to PK restrictions.
+*/
+   rsvdf = pfec & PFERR_RSVD_MASK;
 
/*
-* Only need to check the access which is not an
-* instruction fetch and is to a user page.
+* need to check the access which is not an
+* instruction fetch and is not a rsvd fault.
 */
-   check_pkey = (!ff && pte_user);
+   check_pkey = (!ff && !rsvdf);
+
/*
 * write access is controlled by PKRU if it is a
 * user access or CR0.WP = 1.
-- 
2.17.1



[RFC v3 5/7] KVM: MMU: Add support for PKS emulation

2020-11-05 Thread Chenyi Qiang
Use pkr_mask to cache the conditions under which protection key checks
for supervisor pages are needed. A page is a supervisor page when the
U/S flag is 0 in at least one paging-structure entry controlling its
translation; for such pages, PKRS enforces the access-rights check.
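
The supervisor-page rule above can be expressed as a small helper. This
is an illustrative sketch only; is_supervisor_page() is a hypothetical
name, and KVM derives the same information from the accumulated
pte_access bits rather than from an array of entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PT_USER_MASK (UINT64_C(1) << 2)  /* U/S flag of a paging entry */

/*
 * entries[] holds every paging-structure entry used by one translation
 * (for example PML4E, PDPTE, PDE and PTE).
 */
static inline bool is_supervisor_page(const uint64_t *entries, size_t levels)
{
    assert(entries && levels > 0);
    for (size_t i = 0; i < levels; ++i) {
        if (!(entries[i] & PT_USER_MASK))
            return true;   /* U/S = 0 at any level: supervisor page */
    }
    return false;          /* U/S = 1 at every level: user page */
}
```

A true result means the key rights come from PKRS; otherwise they come
from PKRU.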

Signed-off-by: Chenyi Qiang 
---
 arch/x86/include/asm/kvm_host.h |  8 +++---
 arch/x86/kvm/mmu.h  | 12 ++---
 arch/x86/kvm/mmu/mmu.c  | 44 +
 3 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7567952febd9..ba313c76a1b5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -377,10 +377,10 @@ struct kvm_mmu {
u8 permissions[16];
 
/*
-   * The pkru_mask indicates if protection key checks are needed.  It
-   * consists of 16 domains indexed by page fault error code bits [4:1],
-   * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
-   * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
+   * The pkr_mask indicates if protection key checks are needed.
+   * It consists of 16 domains indexed by page fault error code
+   * bits[4:1]. Each domain has 2 bits which are ANDed with AD
+   * and WD from PKRU/PKRS.
*/
u32 pkr_mask;
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 8f05f7c0f6df..a3629f7b7499 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -192,15 +192,19 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
WARN_ON(pfec & (PFERR_PK_MASK | PFERR_RSVD_MASK));
if (unlikely(mmu->pkr_mask)) {
u32 pkr_bits, offset;
+   u64 pkrs;
 
/*
-   * PKRU defines 32 bits, there are 16 domains and 2
-   * attribute bits per domain in pkru.  pte_pkey is the
-   * index of the protection domain, so pte_pkey * 2 is
-   * is the index of the first bit for the domain.
+   * PKRU and PKRS both define 32 bits. There are 16 domains
+   * and 2 attribute bits per domain in them. pte_pkey is the
+   * index of the protection domain, so pte_pkey * 2 is the
+   * index of the first bit for the domain. The choice of
+   * PKRU and PKRS is determined by the accessed pages.
*/
if (pte_access & PT_USER_MASK)
pkr_bits = (vcpu->arch.pkru >> (pte_pkey * 2)) & 3;
+   else if (!kvm_get_msr(vcpu, MSR_IA32_PKRS, &pkrs))
+   pkr_bits = (pkrs >> (pte_pkey * 2)) & 3;
else
pkr_bits = 0;
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 39afa865dc1a..e5758911bb12 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4278,28 +4278,29 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
 }
 
 /*
-* PKU is an additional mechanism by which the paging controls access to
-* user-mode addresses based on the value in the PKRU register.  Protection
-* key violations are reported through a bit in the page fault error code.
+* Protection Keys (PKEY) is an additional mechanism by which
+* the paging controls access to user-mode/supervisor-mode address
+* based on the values in PKEY registers (PKRU/PKRS). Protection key
+* violations are reported through a bit in the page fault error code.
 * Unlike other bits of the error code, the PK bit is not known at the
 * call site of e.g. gva_to_gpa; it must be computed directly in
-* permission_fault based on two bits of PKRU, on some machine state (CR4,
-* CR0, EFER, CPL), and on other bits of the error code and the page tables.
+* permission_fault based on two bits of PKRU/PKRS, on some machine
+* state (CR4, CR0, EFER, CPL), and on other bits of the error code
+* and the page tables.
 *
 * In particular the following conditions come from the error code, the
 * page tables and the machine state:
-* - PK is always zero unless CR4.PKE=1 and EFER.LMA=1
+* - PK is always zero unless CR4.PKE=1/CR4.PKS=1 and EFER.LMA=1
 * - PK is always zero if RSVD=1 (reserved bit set) or F=1 (instruction fetch)
-* - PK is always zero if U=0 in the page tables
-* - PKRU.WD is ignored if CR0.WP=0 and the access is a supervisor access.
+* - (PKRU/PKRS).WD is ignored if CR0.WP=0 and the access is a supervisor access.
 *
-* The PKRU bitmask caches the result of these four conditions.  The error
-* code (minus the P bit) and the page table's U bit form an index into the
-* PKRU bitmask.  Two bits of the PKRU bitmask are then extracted and ANDed
-* with the two bits of the PKRU register corresponding to the protection key.
-* For the first three conditions above the bits will be 00, thus masking
-* away both AD and WD.  For all reads or if the last condition holds, WD
-* onl

Re: [PATCH] applesmc: Re-work SMC comms v2

2020-11-05 Thread Andreas Kemnade
On Thu, 5 Nov 2020 08:56:04 +0100
Henrik Rydberg  wrote:

> Hi Brad,
> 
> Great to see this effort, it is certainly an area which could be 
> improved. After having seen several generations of Macbooks while 
> modifying much of that code, it became clear that the SMC communication 
> got refreshed a few times over the years. Every tiny change had to be 
> tested on all machines, or kept separate for a particular generation, or 
> something would break.
> 
> I have not followed the back story here, but I imagine the need has 
> arisen because of a new refresh, and so this patch only needs to 
> strictly apply to a new generation. I would therefore advice that you 
> write the patch in that way, reducing the actual change to zero for 
> earlier generations. It also makes it easier to test the effect of the 
> new approach on older systems. I should be able to help testing on a 
> 2008 and 2011 model once we get to that stage.
> 
Well, the issue has arisen because of a change in kernel to make clang
happy. So it is not a new Apple device causing trouble.

Regards,
Andreas


Re: [PATCH v13 0/3] scsi: ufs: Add Host Performance Booster Support

2020-11-05 Thread Can Guo

On 2020-11-03 12:40, Daejun Park wrote:

Changelog:

v12 -> v13
1. Cleanup codes by comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.



If you have changed the code based on the comments left on Google Gerrit,
here is:

Reviewed-by: Can Guo


v11 -> v12
1. Fixed to return error value when HPB fails to initialize pinned active
region.
2. Fixed to disable HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Fixed to change proper sub-region state when region is already evicted.


v10 -> v11
Add a newline at end the last line on Kconfig file.

v9 -> v10
1. Fixed 64-bit division error
2. Fixed problems commented on in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization
3. Fixed problems commented on in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use a Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to corresponding flash
memory addresses. Mobile storage devices typically have RAM of
constrained size, and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory.

The current version only supports the DCM (device control mode).
This patch consists of 3 parts to support HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported features, the data structure for the HPB
is initialized according to the device information.

A read I/O in the active sub-region where the map is cached is changed to
HPB READ by the HPB.

The HPB manages the L2P map using information received from the
device. For an active sub-region, the HPB caches the map through a
ufshpb_map request. For an inactive region, the HPB discards the L2P map.
When a write I/O occurs in an active sub-region, the associated dirty
bitmap entry is marked dirty to prevent stale reads.
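
The read/write rule above can be pictured with a toy model. This is only
a sketch: the struct, the function names and the 64-entry size are
invented for illustration and do not correspond to the structures in
ufshpb.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ENTRIES 64u  /* arbitrary size chosen for this sketch */

struct subregion_model {
    bool active;     /* L2P map for this sub-region is cached on the host */
    uint64_t dirty;  /* one bit per entry; set when the mapping may be stale */
};

/* A write to an active sub-region invalidates the cached mapping entry. */
static inline void model_write(struct subregion_model *sr, unsigned int entry)
{
    assert(entry < MODEL_ENTRIES);
    if (sr->active)
        sr->dirty |= UINT64_C(1) << entry;
}

/* A read may become HPB READ only if the entry is cached and still clean. */
static inline bool model_can_use_hpb_read(const struct subregion_model *sr,
                                          unsigned int entry)
{
    assert(entry < MODEL_ENTRIES);
    return sr->active && !(sr->dirty & (UINT64_C(1) << entry));
}
```

Reads that fail the check would simply be issued as normal READ
commands, so a stale cache entry costs performance rather than
correctness.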

HPB is shown to have a performance improvement of 58 - 67% for random read
workload. [1]

This patch series is based on the 5.11/scsi-queue branch.

[1]:
https://www.usenix.org/conference/hotstorage17/program/presentation/jeong

Daejun park (3):
 scsi: ufs: Introduce HPB feature
 scsi: ufs: L2P map management for HPB read
 scsi: ufs: Prepare HPB read for cached sub-region

 drivers/scsi/ufs/Kconfig |9 +
 drivers/scsi/ufs/Makefile|1 +
 drivers/scsi/ufs/ufs-sysfs.c |   18 +
 drivers/scsi/ufs/ufs.h   |   49 +
 drivers/scsi/ufs/ufshcd.c|   53 ++
 drivers/scsi/ufs/ufshcd.h|   23 +-
 drivers/scsi/ufs/ufshpb.c| 1784 +
 drivers/scsi/ufs/ufshpb.h|  230 +
 8 files changed, 2166 insertions(+), 1 deletion(-)
 created mode 100644 drivers/scsi/ufs/ufshpb.c
 created mode 100644 drivers/scsi/ufs/ufshpb.h


Re: [PATCH v2] ARM: dts: exynos: Add a placeholder for a MAC address

2020-11-05 Thread Marek Szyprowski
Hi Anand,

On 05.11.2020 09:06, Anand Moon wrote:
> On Mon, 2 Nov 2020 at 21:53, Marek Szyprowski  
> wrote:
>> On 01.11.2020 15:07, Anand Moon wrote:
>>> On Thu, 1 Oct 2020 at 19:25, Łukasz Stelmach  wrote:
 Add a placeholder for a MAC address. A bootloader may fill it
 to set the MAC address and override EEPROM settings.

 Signed-off-by: Łukasz Stelmach 
 ---
 Changes in v2:
- use local-mac-address and leave mac-address to be added by a 
 bootloader

arch/arm/boot/dts/exynos5422-odroidxu3.dts | 18 ++
1 file changed, 18 insertions(+)

 diff --git a/arch/arm/boot/dts/exynos5422-odroidxu3.dts 
 b/arch/arm/boot/dts/exynos5422-odroidxu3.dts
 index db0bc17a667b..d0f6ac5fa79d 100644
 --- a/arch/arm/boot/dts/exynos5422-odroidxu3.dts
 +++ b/arch/arm/boot/dts/exynos5422-odroidxu3.dts
 @@ -70,3 +70,21 @@ &pwm {
&usbdrd_dwc3_1 {
   dr_mode = "peripheral";
};
 +
 +&usbhost2 {
 +   #address-cells = <1>;
 +   #size-cells = <0>;
 +
 +   hub@1 {
 +   compatible = "usb8087,0024";
 +   reg = <1>;
 +   #address-cells = <1>;
 +   #size-cells = <0>;
 +
 +   ethernet: usbether@1 {
 +   compatible = "usb0c45,6310";
 +   reg = <1>;
 +   local-mac-address = [00 00 00 00 00 00]; /* Filled 
 in by a bootloader */
 +   };
 +   };
 +};
 --
 2.26.2

>>> Thanks for this patch, can you share some example on how to set the
>>> mac address via u-boot bootargs
>> A little bit hacky script to set permanent board unique MAC address:
>>
>> # setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b u2
>> *0x1016; setexp.b u3 *0x1017; setenv ethaddr
>> 0:0:${u0}:${u1}:${u2}:${u3}; setenv usbethaddr ${ethaddr};
>>
> OK this command worked for me.
>
>> Then if there is proper ethernet0 alias set, u-boot will then
>> automatically save the configured MAC address to the device tree. I've
>> just check this on recent u-boot v2020.10 and Odroid U3 board.
>>
>> Lukasz will send updated patch soon (with proper alias entry).
>>
>> If you want to hack setting MAC address manually, this will work with
>> the current patch:
>>
>> # setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b u2
>> *0x1016; setexp.b u3 *0x1017; fdt addr ${fdtaddr}; fdt set
>> /soc/usb@1211/hub@1/usbether@1 local-mac-address [ 0 0 ${u0} ${u1}
>> ${u2} ${u3} ]
>>
> So do we need a similar patch for u-boot ?

I'm not sure that this ethaddr hack/workaround should be added to
mainline U-Boot. Some other Exynos-based boards have a proper MAC address
stored in EEPROM (for example Odroid XU4/HC1). I would leave it to the
users to add it manually if it is really needed for now.

> I am getting following error on Odroid U3+ and U-Boot 2020.10
>
> Odroid #  setexp.b u0 *0x1014; setexp.b u1 *0x1015; setexp.b
> u2 *0x1016; setexp.b u3 *0x1017; fdt addr ${fdtaddr}; fdt set
> /soc/usb@1211/hub@1/usbether@1 local-mac-address [ 0 0 ${u0} ${u1}
> ${u2} ${u3} ]
> No FDT memory address configured. Please configure
> the FDT address via "fdt addr " command.
> Aborting!
>
> Also added these command to boot.scr but still observing the failure

You need to use the proper env variable for setting the fdt address (the
"fdt addr ${fdtaddr}" command). Some versions use ${fdt_addr}, others
use ${fdtaddr}. Please check which one is used for loading the dtb and
adjust the script.
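
For reference, the ethaddr string that the quoted script assembles (two zero bytes followed by four bytes read from memory) can be sketched in plain C. This is only a userspace illustration: the function name is made up here, and printing the bytes as unpadded hex is an assumption about how the script's variables expand.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Build a MAC address string of the form "0:0:u0:u1:u2:u3" from four
 * board-unique bytes, mirroring the layout used by the quoted script.
 * The bytes are printed as unpadded hex (an assumption, see above).
 * Returns the string length, or -1 if the buffer is too small.
 */
int format_board_mac(char *buf, size_t len, const uint8_t id[4])
{
	int n = snprintf(buf, len, "0:0:%x:%x:%x:%x",
			 (unsigned)id[0], (unsigned)id[1],
			 (unsigned)id[2], (unsigned)id[3]);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The first two bytes are fixed at zero exactly as in the script; whether that is a sensible prefix for a given network is outside the scope of this sketch.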

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: Re: Re: [PATCH 0/2] drivers/tty: delete break after return or goto

2020-11-05 Thread Greg Kroah-Hartman
On Thu, Nov 05, 2020 at 03:34:55PM +0800, Bernard wrote:
> From: Greg Kroah-Hartman 
> 
> Date: 2020-11-04 19:59:03
> To:  Bernard 
> Cc:  Jiri Slaby ,Shawn Guo ,Sascha 
> Hauer ,Pengutronix Kernel Team 
> ,Fabio Estevam ,NXP Linux Team 
> ,linux-kernel@vger.kernel.org,linux-ser...@vger.kernel.org,linux-arm-ker...@lists.infradead.org,opensource.ker...@vivo.com
> Subject: Re: Re: [PATCH 0/2] drivers/tty: delete break after return or goto
> >On Wed, Nov 04, 2020 at 07:17:56PM +0800, Bernard wrote:
> >> 
> >> 
> >> From: Greg Kroah-Hartman 
> >> Date: 2020-11-04 19:02:53
> >> To:  Bernard Zhao 
> >> Cc:  Jiri Slaby ,Shawn Guo 
> >> ,Sascha Hauer ,Pengutronix 
> >> Kernel Team ,Fabio Estevam ,NXP 
> >> Linux Team 
> >> ,linux-kernel@vger.kernel.org,linux-ser...@vger.kernel.org,linux-arm-ker...@lists.infradead.org,opensource.ker...@vivo.com
> >> Subject: Re: [PATCH 0/2] drivers/tty: delete break after return or goto
> >> >On Wed, Nov 04, 2020 at 02:53:29AM -0800, Bernard Zhao wrote:
> >> >> This patch series optimises code like:
> >> >> {
> >> >> case XXX:
> >> >> 	return XXX;
> >> >> 	break; //The break is meaningless, so just delete it.
> >> >> case YYY:
> >> >> 	goto YYY;
> >> >> 	break; //The break is meaningless, so just delete it.
> >> >> ..
> >> >> }
> >> >> 
> >> >> Signed-off-by: Bernard Zhao 
> >> >> 
> >> >> ---
> >> >> Bernard Zhao (2):
> >> >>   drivers/tty/nozomi.c: delete no use break after goto
> >> >>   drivers/tty/serial/imx.c: delete no use break after return
> >> >
> >> >That is not the subject of the patches you sent out, what broke?
> >> 
> >> Hi:
> >> 
> >> I am sorry that I am a little confused:
> >> The patch series's subject is "drivers/tty: delete break after return or
> >> goto"
> >> and the blurb is:
> >> This patch series optimises code like:
> >> {
> >> case XXX:
> >>    return XXX;
> >>    break; //The break is meaningless, so just delete it.
> >> case YYY:
> >>    goto YYY;
> >>    break; //The break is meaningless, so just delete it.
> >> ..
> >> }
> >> Lastly, the modified files are:
> >> Bernard Zhao (2):
> >>   drivers/tty/nozomi.c: delete no use break after goto
> >>   drivers/tty/serial/imx.c: delete no use break after return
> >> 
> >> Is there something wrong that I didn't catch?
> >
> >The above lines do not match up with the subject lines of the patches
> >you sent out, so something went wrong.
> 
> 
> Hi, Greg:
> 
> Sorry to bother you.
> I am a newcomer to the community, and this is my first time submitting a 
> patch series.

You might want to start in the drivers/staging/ part of the kernel to
get your bearings and work out these types of things.  It's "easier"
there as the code there needs lots of work and it's set up to handle new
developers like yourself.

> I am sorry that I still don't understand: "The above lines do not match up
> with the subject lines of the patches you sent out, so something went wrong."
> I compared my patch series with other people's patch series, as shown in the
> picture below, and they look the same.
> The only difference is that I made a signature here, so is this the issue
> that you mean?

The output of the git command that caused those lines to be written was
taken from the subject lines of the patches in your tree.  Yet the
subject lines of the patches you emailed us did not match that at all,
so what you sent is not what you actually had here when you generated
that cover letter.  So something went wrong with your process.

Try deleting all patch files in the directory and generating them again,
and then emailing the series to yourself to verify that everything
matches up properly.

hope this helps,

greg k-h


Re: [PATCH 1/3] dt-bindings: media: i2c: Add OV8865 bindings documentation

2020-11-05 Thread Sakari Ailus
Hi Paul,

On Wed, Nov 04, 2020 at 11:26:43AM +0100, Paul Kocialkowski wrote:
> Hi Sakari and thanks for the review!
> 
> On Tue 03 Nov 20, 01:24, Sakari Ailus wrote:
> > On Fri, Oct 23, 2020 at 07:54:04PM +0200, Paul Kocialkowski wrote:
> > > This introduces YAML bindings documentation for the OV8865
> > > image sensor.
> > > 
> > > Co-developed-by: Kévin L'hôpital 
> > > Signed-off-by: Kévin L'hôpital 
> > > Signed-off-by: Paul Kocialkowski 
> > > ---
> > >  .../bindings/media/i2c/ovti,ov8865.yaml   | 124 ++
> > >  1 file changed, 124 insertions(+)
> > >  create mode 100644 
> > > Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml
> > > 
> > > diff --git a/Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml 
> > > b/Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml
> > > new file mode 100644
> > > index ..807f1a94afae
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/media/i2c/ovti,ov8865.yaml
> > > @@ -0,0 +1,124 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/media/i2c/ovti,ov8865.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: OmniVision OV8865 Image Sensor Device Tree Bindings
> > > +
> > > +maintainers:
> > > +  - Paul Kocialkowski 
> > > +
> > > +properties:
> > > +  compatible:
> > > +const: ovti,ov8865
> > > +
> > > +  reg:
> > > +maxItems: 1
> > > +
> > > +  clocks:
> > > +items:
> > > +  - description: EXTCLK Clock
> > > +
> > > +  clock-names:
> > > +items:
> > > +  - const: extclk
> > 
> > Is this needed with a single clock?
> 
> Yes I think so: we grab the clock with devm_clk_get which takes a name string
> that matches the clock-names property.

That argument may be NULL.

> 
> > And... shouldn't this also come with assigned-clock-rates etc., to set the
> > clock frequency?
> 
> I'm a bit confused why we would need to do that in the device-tree rather than
> setting the clock rate with clk_set_rate in the driver, like any other driver
> does. I think this was discussed before (on the initial ov8865 series) and the
> conclusion was that there is no particular reason for media i2c drivers to
> behave differently. So I believe this is the correct approach.

I'm not exactly sure about that conclusion.

You can use clk_set_rate() if you get the frequency from DT, but we
recently did conclude that camera sensor drivers can expect to get the
frequency indicated by the assigned-clock-rates property.

In other words, the driver may not be specific to a particular board and
SoC you have.

Please also read Documentation/driver-api/media/camera-sensor.rst .

> 
> > > +
> > > +  dvdd-supply:
> > > +description: Digital Domain Power Supply
> > > +
> > > +  avdd-supply:
> > > +description: Analog Domain Power Supply (internal AVDD is used if 
> > > missing)
> > > +
> > > +  dovdd-supply:
> > > +description: I/O Domain Power Supply
> > > +
> > > +  powerdown-gpios:
> > > +maxItems: 1
> > > +description: Power Down Pin GPIO Control (active low)
> > > +
> > > +  reset-gpios:
> > > +maxItems: 1
> > > +description: Reset Pin GPIO Control (active low)
> > > +
> > > +  port:
> > > +type: object
> > > +description: Input port, connect to a MIPI CSI-2 receiver
> > > +
> > > +properties:
> > > +  endpoint:
> > > +type: object
> > > +
> > > +properties:
> > > +  remote-endpoint: true
> > > +
> > > +  bus-type:
> > > +const: 4
> > > +
> > > +  clock-lanes:
> > > +maxItems: 1
> > 
> > I believe you can drop clock-lanes and bus-type; these are both constants.
> 
> I don't understand why bus-type should be dropped because it is constant:
> if bus-type is set to something else, the driver will definitely not probe
> since we're requesting V4L2_MBUS_CSI2_DPHY for v4l2_fwnode_endpoint_parse.
> So I think it's quite important for the bindings to reflect this.

This driver is for a particular device that has MIPI CSI-2 on D-PHY as the
data bus. You can assume that in the driver.

> 
> > I presume the device does not support lane remapping?
> 
> That's correct so this is indeed not something we can configure.
> But shouldn't we instead specify clock-lanes = <0> as a const rather than
> getting rid of it?

Why would you put redundant information to DT?

> 
> > Could you also add link-frequencies, to list which frequencies are known to
> > be good?
> 
> Ah right, I had missed it. I'm a bit unsure about what I should do with the
> information from the driver though: should I refuse to use link frequencies
> that are not in the list?

Yes, please.
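
A minimal sketch of what refusing unlisted link frequencies could look like, written as plain userspace C rather than actual driver code. The function name and the idea of passing the DT list as an array are assumptions made for illustration; a real driver would get the list from the parsed endpoint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if the frequency the driver wants to use appears in
 * the list taken from the link-frequencies property. An empty list means
 * nothing is allowed.
 */
bool link_freq_allowed(const uint64_t *dt_freqs, size_t n, uint64_t wanted)
{
	for (size_t i = 0; i < n; i++) {
		if (dt_freqs[i] == wanted)
			return true;
	}

	return false;
}
```

With this shape, probe would fail (or the mode would be rejected) whenever the check returns false, instead of silently running the bus at an untested rate.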

-- 
Regards,

Sakari Ailus


Re: [PATCH v4 2/4] mfd: Support ROHM BD9576MUF and BD9573MUF

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Vaittinen, Matti wrote:

> 
> On Thu, 2020-11-05 at 08:46 +0200, Matti Vaittinen wrote:
> > Morning Lee,
> > 
> > Thanks for taking a look at this :) I see most of the comments being
> > valid. There's two I would like to clarify though...
> > 
> > On Wed, 2020-11-04 at 15:51 +, Lee Jones wrote:
> > > On Wed, 28 Oct 2020, Matti Vaittinen wrote:
> > > 
> > > > Add core support for ROHM BD9576MUF and BD9573MUF PMICs which are
> > > > mainly used to power the R-Car series processors.
> > > > 
> > > > > Signed-off-by: Matti Vaittinen
> > > > ---
> > > > +   unsigned int chip_type;
> > > > +
> > > > +   chip_type = (unsigned int)(uintptr_t)
> > > > +   of_device_get_match_data(&i2c->dev);
> > > 
> > > Not overly keen on this casting.
> > > 
> > > Why not just leave it as (uintptr_t)?
> > 
> > > I didn't do so because on x86_64 the address width is probably 64 bits
> > > whereas the unsigned int is likely to be 32 bits. So the assignment
> > > will crop half of the value. It does not really matter as values are
> > > small - but I would be surprised if no compilers/analyzers emitted a
> > > warning.
> > 
> > I must admit I am not 100% sure though. I sure can change this if you
> > know it better?

What if you used 'long', which I believe changed with the
architecture's bus width in Linux?

> > > What happens when you don't cast to (uintptr_t) first?
> > 
> > > On some systems at least the gcc will warn:
> > > > warning: cast from pointer to integer of different size
> > > > [-Wpointer-to-int-cast]
> > > 
> > > I am pretty sure I did end up with this double casting via trial and
> > > error :)

It's not uncommon. :)
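
The double cast being discussed can be illustrated outside the kernel. The sketch below uses made-up enum values in place of the ROHM_CHIP_TYPE_* constants and plain functions in place of the of_device_id match data, so it only shows the casting pattern itself.

```c
#include <stdint.h>

/* Hypothetical chip identifiers, standing in for ROHM_CHIP_TYPE_* */
enum demo_chip_type {
	DEMO_CHIP_A = 1,
	DEMO_CHIP_B = 2,
};

/* Stash a small integer in a pointer-sized match-data slot. */
const void *demo_chip_to_data(enum demo_chip_type type)
{
	return (const void *)(uintptr_t)type;
}

/*
 * Recover it: pointer -> uintptr_t is a same-width conversion, and the
 * outer cast makes the narrowing to unsigned int explicit, which is
 * what avoids the pointer-to-int-cast warning on 64-bit builds.
 */
unsigned int demo_data_to_chip(const void *data)
{
	return (unsigned int)(uintptr_t)data;
}
```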

> > > > +static const struct of_device_id bd957x_of_match[] = {
> > > > +   {
> > > > +   .compatible = "rohm,bd9576",
> > > > +   .data = (void *)ROHM_CHIP_TYPE_BD9576,
> > > > +   },
> > > > +   {
> > > 
> > > You could put the 2 lines above on a single line.
> > 
> > Braces? I put braces on separate lines on purpose. Been doing this
> > after we had this discussion:
> > 
> > https://lore.kernel.org/lkml/20180705055226.GJ496@dell/
> > https://lore.kernel.org/lkml/20180706070559.GW496@dell/
> > 
> > ;)
> > 
> > > I can change it if you wish/feel it is important - not a point I feel
> > > like fighting over ;)
> > 
> 
> Ah. I guess you meant:
> static const struct of_device_id bd957x_of_match[] = {
> 	{ .compatible = "rohm,bd9576", .data = (void *)ROHM_CHIP_TYPE_BD9576, },
> 	{ .compatible = "rohm,bd9573", .data = (void *)ROHM_CHIP_TYPE_BD9573, },
> 	{},
> };

This would be better, yes.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH] ARM: dts: exynos: Assign a fixed index to mmc devices on exynos4412 based ODROID boards

2020-11-05 Thread Krzysztof Kozlowski
On Wed, Nov 04, 2020 at 02:44:10PM +0100, Marek Szyprowski wrote:
> On 04.11.2020 14:13, Marek Szyprowski wrote:
> > On 04.11.2020 14:06, Markus Reichl wrote:
> >> Am 04.11.20 um 13:25 schrieb Marek Szyprowski:
> >>> On 04.11.2020 11:25, Markus Reichl wrote:
> >>>> Recently introduced async probe on mmc devices can shuffle block IDs.
> >>>> Pin them to fixed values to ease booting in environments where UUIDs
> >>>> are not practical.
> >>>> Use newly introduced aliases for mmcblk devices from [1].
> >>>>
> >>>> [1]
> >>>> https://patchwork.kernel.org/patch/11747669/
> >>>>
> >>>> Signed-off-by: Markus Reichl
> >>>> ---
> >>>>  arch/arm/boot/dts/exynos4412-odroid-common.dtsi | 5 +
> >>>>  1 file changed, 5 insertions(+)
> >>>>
> >>>> diff --git a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
> >>>> b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
> >>>> index a5c1ce1e396c..aa10d5bc7e1c 100644
> >>>> --- a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
> >>>> +++ b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
> >>>> @@ -13,6 +13,11 @@
> >>>>  #include "exynos-mfc-reserved-memory.dtsi"
> >>>>
> >>>>  / {
> >>>> +    aliases {
> >>>> +        mmc0 = &sdhci_2;
> >>>> +        mmc1 = &mshc_0;
> >>> Like in the OdroidXU3-family patch, I would use 0 for the eMMC (mshc_0)
> >>> and 2 for the SD-card (sdhci_2).
> >> How to deal then with sdhci_0 (from exynos4.dtsi) vc. mshc_0 (from
> >> exynos4412.dts)?
> > sdhci_0 and mshc_0 both operate on the same physical MMC0 bus, so this
> > is not an issue. They cannot be used simultaneously. The latter is just
> > faster, the first one has been left there mainly for the software
> > compatibility.
> 
> I've thought a bit more on this and I would simply prefer to add generic 
> MMC aliases to the top-level Exynos dtsi files (3250, 4210, 4412, 5250, 
> 5410, 5420) to keep Linux logical MMC bus numbers in sync with the HW 
> bus numbers on all boards.

I like this approach - I don't see much benefit of having different
numbering between boards of the same SoC.

Let's match old U-Boot behavior (I assume that people switched to PARTUUID
around the v4.0 mixup, so they should not be affected).

Best regards,
Krzysztof



Re: [PATCH v4 2/4] mfd: Support ROHM BD9576MUF and BD9573MUF

2020-11-05 Thread Lee Jones
On Wed, 04 Nov 2020, Lee Jones wrote:

> On Wed, 28 Oct 2020, Matti Vaittinen wrote:
> 
> > Add core support for ROHM BD9576MUF and BD9573MUF PMICs which are
> > mainly used to power the R-Car series processors.
> > 
> > Signed-off-by: Matti Vaittinen 
> > ---
> >  drivers/mfd/Kconfig  |  11 +++
> >  drivers/mfd/Makefile |   1 +
> >  drivers/mfd/rohm-bd9576.c| 130 +++
> >  include/linux/mfd/rohm-bd957x.h  |  59 ++
> >  include/linux/mfd/rohm-generic.h |   2 +
> >  5 files changed, 203 insertions(+)
> >  create mode 100644 drivers/mfd/rohm-bd9576.c
> >  create mode 100644 include/linux/mfd/rohm-bd957x.h

[...]

> > +static const struct regmap_range volatile_ranges[] = {
> > +   {
> > +   .range_min = BD957X_REG_SMRB_ASSERT,
> > +   .range_max = BD957X_REG_SMRB_ASSERT,
> > +   },
> > +   {
> 
> The way you space your braces is not consistent.
> 
> > +   .range_min = BD957X_REG_PMIC_INTERNAL_STAT,
> > +   .range_max = BD957X_REG_PMIC_INTERNAL_STAT,
> > +   },
> > +   {
> > +   .range_min = BD957X_REG_INT_THERM_STAT,
> > +   .range_max = BD957X_REG_INT_THERM_STAT,
> > +   },
> > +   {
> > +   .range_min = BD957X_REG_INT_OVP_STAT,
> > +   .range_max = BD957X_REG_INT_SYS_STAT,
> > +   }, {
> > +   .range_min = BD957X_REG_INT_MAIN_STAT,
> > +   .range_max = BD957X_REG_INT_MAIN_STAT,
> > +   },
> > +};

Don't forget about this.

I would prefer to have the braces on the same line (even if it means
you have to change an extra line when editing), but I'm not 100% dead
set on it.  Consistency however, I am.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v5 06/17] virt: acrn: Introduce VM management interfaces

2020-11-05 Thread Greg Kroah-Hartman
On Thu, Nov 05, 2020 at 03:35:45PM +0800, Shuo A Liu wrote:
> On Thu  5.Nov'20 at  7:29:07 +0100, Greg Kroah-Hartman wrote:
> > On Thu, Nov 05, 2020 at 11:10:29AM +0800, Shuo A Liu wrote:
> > > On Wed  4.Nov'20 at 20:02:35 +0100, Greg Kroah-Hartman wrote:
> > > > On Mon, Oct 19, 2020 at 02:17:52PM +0800, shuo.a@intel.com wrote:
> > > > > --- /dev/null
> > > > > +++ b/include/uapi/linux/acrn.h
> > > > > @@ -0,0 +1,56 @@
> > > > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > > > > +/*
> > > > > + * Userspace interface for /dev/acrn_hsm - ACRN Hypervisor Service Module
> > > > > + *
> > > > > + * This file can be used by applications that need to communicate with the HSM
> > > > > + * via the ioctl interface.
> > > > > + */
> > > > > +
> > > > > +#ifndef _UAPI_ACRN_H
> > > > > +#define _UAPI_ACRN_H
> > > > > +
> > > > > +#include 
> > > > > +
> > > > > +/**
> > > > > + * struct acrn_vm_creation - Info to create a User VM
> > > > > + * @vmid:          User VM ID returned from the hypervisor
> > > > > + * @reserved0:     Reserved
> > > > > + * @vcpu_num:      Number of vCPU in the VM. Return from hypervisor.
> > > > > + * @reserved1:     Reserved
> > > > > + * @uuid:          UUID of the VM. Pass to hypervisor directly.
> > > > > + * @vm_flag:       Flag of the VM creating. Pass to hypervisor directly.
> > > > > + * @ioreq_buf:     Service VM GPA of I/O request buffer. Pass to
> > > > > + *                 hypervisor directly.
> > > > > + * @cpu_affinity:  CPU affinity of the VM. Pass to hypervisor directly.
> > > > > + * @reserved2:     Reserved
> > > >
> > > > Reserved and must be 0?
> > > 
> > > Not a must.
> > 
> > That's guaranteed to come back and bite you in the end.
> 
> OK. I can fill them with zero before passing them to hypervisor.
> 
> > You all have read the "how to write a good api" document, right?
> 
> Is it Documentation/driver-api/ioctl.rst? Or i missed..

That's one good document, but no, not what I was referring to.  I was
thinking of Documentation/process/adding-syscalls.rst, which is what you
are doing here implicitly with these new ioctls (every ioctl is a brand
new syscall.)

> > > > What are they reserved for?
> > > >
> > > > Same for all of the reserved fields, why?
> > > 
> > > Some reserved fields are to map layout in the hypervisor side, others
> > > are for future use.
> > 
> > ioctls should not have these, again, please read the documentation.  If
> > you need something new in the future, just make a new ioctl.
> 
> OK. I will remove some reserved fields for scalability.

"scalability" should have nothing to do with any of this, right?  What
am I missing?

> Though i can
> keep some reserved fields for alignment (and to keep same data structure
> layout with the hypervisor), right?
> Documentation/driver-api/ioctl.rst says that explicit reserved fields
> could be used.

If you need alignment, yes, that is fine, but that's not what you are
saying these are for.  And if you need alignment, why not move things
around so they are properly aligned.
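
If explicit padding does stay in the structure for alignment, one common convention is to reject any argument whose padding is not zero. The sketch below is a userspace illustration of that check only; the structure layout and names are hypothetical and are not the ACRN definitions from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical ioctl argument with explicit padding kept for alignment. */
struct demo_vm_creation {
	uint16_t vmid;
	uint16_t reserved0;	/* padding, must be zero */
	uint32_t vm_flag;
	uint64_t cpu_affinity;
};

/*
 * Refuse arguments with non-zero padding. Checking this from day one
 * is what keeps the padding bits free of garbage from old binaries.
 */
int demo_check_reserved(const struct demo_vm_creation *arg)
{
	if (arg->reserved0 != 0)
		return -EINVAL;

	return 0;
}
```

Reordering the members so that no padding is needed at all, as suggested above, removes the need for the check entirely.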

And this structure has nothing to do with the hypervisor structure,
that's an internal-kernel structure, not a userspace-visible thing if you
are doing things correctly.

As an example of all of this type of review and conversation, please
refer to the review of the recent nitro_enclaves code that got merged.
All of the discussions there about ioctls are also relevant here.

thanks,

greg k-h


Re: [PATCH] usb/mos7720: process deferred urbs in a workqueue

2020-11-05 Thread Johan Hovold
On Wed, Nov 04, 2020 at 04:13:07PM -0800, Davidlohr Bueso wrote:
> On Wed, 04 Nov 2020, Johan Hovold wrote:
> 
> >Hmm. I took a closer look at the parport code and it seems the current
> >implementation is already racy but that removing the tasklet is going to
> >widen that window.
> >
> >Those register writes in restore() should be submitted before any
> >later requests. Perhaps setting a flag and flushing the work in
> >parport_prologue() could work?
> 
> Ah, I see and agree. Considering work is only deferred from restore_state()
> I don't even think we need a flag, no? We can let parport_prologue()
> just flush_work() unconditionally (right before taking the disc_mutex)
> which for the most part will be idle anyway. The flush_work() also becomes
> saner now that we'll stop rescheduling work in send_deferred_urbs().

A flag isn't strictly needed, no, but it could be used to avoid some of
the flush_work() overhead for every parport callback. The restore-state
work will typically only be queued once.
 
> Also, but not strictly related to this. What do you think of deferring all
> work in write_parport_reg_nonblock() unconditionally? I'd like to avoid
> that mutex_trylock() because eventually I'll be re-adding a warn in the
> locking code, but that would also simplify the code done here in the
> nonblocking irq write. I'm not at all familiar with parport, but I would
> think that restore_state context would not care.

Sounds good to me. As long as the state is restored before submitting
further requests we should be fine. That would even allow getting rid of
write_parport_reg_nonblock() as you can restore the state using
synchronous calls from the worker thread. Should simplify things quite a
bit.

> >On the other hand, the restore() implementation looks broken in that it
> >doesn't actually restore the provided state. I'll go fix that up.
> 
> How did this thing ever work?

The shadow registers are initialised at probe so as long as you don't
switch to a different parallel-port driver without disconnecting the
mos7715 in between it works.

Johan


Re: [PATCH v2 1/4] dt-bindings: soc: imx8m: add DT Binding doc for soc unique ID

2020-11-05 Thread Krzysztof Kozlowski
On Thu, Nov 05, 2020 at 03:26:26PM +0800, Alice Guo wrote:
> Add DT Binding doc for the Unique ID of i.MX 8M series.
> 
> Signed-off-by: Alice Guo 
> ---
>  .../devicetree/bindings/arm/fsl.yaml  | 33 +++
>  1 file changed, 33 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml 
> b/Documentation/devicetree/bindings/arm/fsl.yaml
> index e4db0f9ed664..0419f078502b 100644
> --- a/Documentation/devicetree/bindings/arm/fsl.yaml
> +++ b/Documentation/devicetree/bindings/arm/fsl.yaml
> @@ -901,6 +901,39 @@ properties:
>- fsl,s32v234-evb   # S32V234-EVB2 Customer Evaluation 
> Board
>- const: fsl,s32v234
> 
> +  soc:
> +description:
> +  i.MX8M Family SoC must provide a soc node in the root of the device 
> tree,
> +  representing the System-on-Chip since these test chips are rather 
> complex.
> +type: object
> +properties:
> +  compatible:
> +oneOf:
> +  - items:
> +  - const: fsl,imx8mm-soc
> +  - const: simple-bus
> +  - items:
> +  - const: fsl,imx8mn-soc
> +  - const: simple-bus
> +  - items:
> +  - const: fsl,imx8mp-soc
> +  - const: simple-bus
> +  - items:
> +  - const: fsl,imx8mq-soc
> +  - const: simple-bus
> +
> +  nvmem-cells:
> +maxItems: 1
> +description: Phandle to the SOC Unique ID provided by a nvmem node
> +
> +  nvmem-cells-names:
> +const: soc_unique_id
> +
> +required:
> +  - compatible
> +  - nvmem-cells
> +  - nvmem-cell-names
> +

Did you actually test it? I see multiple errors with this patch.
fsl-ls1012a-frdm.dt.yaml: /: soc:compatible: ['simple-bus'] is not valid under any of the given schemas

Best regards,
Krzysztof



Re: [RFC 6/9] staging: dpaa2-switch: add .ndo_start_xmit() callback

2020-11-05 Thread Ioana Ciornei
On Thu, Nov 05, 2020 at 02:04:39AM +0100, Andrew Lunn wrote:
> > +static int dpaa2_switch_build_single_fd(struct ethsw_core *ethsw,
> > +   struct sk_buff *skb,
> > +   struct dpaa2_fd *fd)
> > +{
> > +   struct device *dev = ethsw->dev;
> > +   struct sk_buff **skbh;
> > +   dma_addr_t addr;
> > +   u8 *buff_start;
> > +   void *hwa;
> > +
> > +   buff_start = PTR_ALIGN(skb->data - DPAA2_SWITCH_TX_DATA_OFFSET -
> > +  DPAA2_SWITCH_TX_BUF_ALIGN,
> > +  DPAA2_SWITCH_TX_BUF_ALIGN);
> > +
> > +   /* Clear FAS to have consistent values for TX confirmation. It is
> > +* located in the first 8 bytes of the buffer's hardware annotation
> > +* area
> > +*/
> > +   hwa = buff_start + DPAA2_SWITCH_SWA_SIZE;
> > +   memset(hwa, 0, 8);
> > +
> > +   /* Store a backpointer to the skb at the beginning of the buffer
> > +* (in the private data area) such that we can release it
> > +* on Tx confirm
> > +*/
> > +   skbh = (struct sk_buff **)buff_start;
> > +   *skbh = skb;
> 
> Where is the TX confirm which uses this stored pointer. I don't see it
> in this file.
> 

The Tx confirm - dpaa2_switch_tx_conf() - is added in patch 5/9.

> It can be expensive to store pointer like this in buffers used for
> DMA.

Yes, it is. But the hardware does not give us any other indication that
a packet was actually sent so that we can move ahead with consuming the
initial skb.
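
To make the mechanism concrete, here is a rough userspace sketch of the backpointer scheme from the quoted code: compute an aligned start address in the headroom before the frame data, store the owner pointer there, and read it back on confirmation. The constants and function names are stand-ins chosen for this example, not the driver's values.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_TX_BUF_ALIGN   64u	/* stand-in for DPAA2_SWITCH_TX_BUF_ALIGN */
#define DEMO_TX_DATA_OFFSET 64u	/* stand-in for DPAA2_SWITCH_TX_DATA_OFFSET */

/* Round addr up to the next multiple of align (align is a power of two). */
static uintptr_t demo_align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/* Aligned start of the private area located in the headroom before data. */
unsigned char *demo_buff_start(unsigned char *data)
{
	uintptr_t p = (uintptr_t)data - DEMO_TX_DATA_OFFSET - DEMO_TX_BUF_ALIGN;

	return (unsigned char *)demo_align_up(p, DEMO_TX_BUF_ALIGN);
}

/* Store the owner (the skb in the driver) at the start of the buffer. */
void demo_store_owner(unsigned char *buff_start, void *owner)
{
	memcpy(buff_start, &owner, sizeof(owner));
}

/* On Tx confirmation, recover the owner from the same location. */
void *demo_load_owner(const unsigned char *buff_start)
{
	void *owner;

	memcpy(&owner, buff_start, sizeof(owner));
	return owner;
}
```

The lookup on confirmation is a single read from a known offset, which is the property being weighed against the cache maintenance cost raised in the review.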

> It has to be flushed out of the cache here as part of the
> send. Then the TX complete needs to invalidate and then read it back
> into the cache. Or you use coherent memory which is just slow.
> 
> It can be cheaper to keep a parallel ring in cacheable memory which
> never gets flushed.

I'm afraid I don't really understand your suggestion. In this parallel
ring I would keep the skb pointers of all frames which are in-flight?
Then, when a packet is received on the Tx confirmation queue I would
have to loop over the parallel ring and determine somehow which skb
this packet was initially associated with. Isn't this even more expensive?

Ioana



Re: [PATCH 27/36] tty: synclinkmp: Mark never checked 'readval' as __always_unused

2020-11-05 Thread Jiri Slaby

On 04. 11. 20, 20:35, Lee Jones wrote:

Fixes the following W=1 kernel build warning(s):

  drivers/tty/synclinkmp.c: In function ‘init_adapter’:
  drivers/tty/synclinkmp.c:5167:6: warning: variable ‘readval’ set but not used [-Wunused-but-set-variable]

Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: pau...@microgate.com
Signed-off-by: Lee Jones 
---
  drivers/tty/synclinkmp.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/synclinkmp.c b/drivers/tty/synclinkmp.c
index 0ca738f61a35b..75f494bfdcbed 100644
--- a/drivers/tty/synclinkmp.c
+++ b/drivers/tty/synclinkmp.c
@@ -5165,7 +5165,7 @@ static bool init_adapter(SLMP_INFO *info)
  
  	/* Set BIT30 of Local Control Reg 0x50 to reset SCA */

volatile u32 *MiscCtrl = (u32 *)(info->lcr_base + 0x50);
-   u32 readval;
+   u32 __always_unused readval;


Why not just remove readval completely as in other cases?

And the loop can be turned into ndelay:

/*
 * Force at least 170ns delay before clearing
 * reset bit. Each read from LCR takes at least
 * 30ns so 10 times for 300ns to be safe.
 */
for(i=0;i<10;i++)
readval = *MiscCtrl;
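
A rough sketch of the suggested shape, written so that it builds in userspace: demo_ndelay() is a stand-in for the kernel's ndelay() and merely records the requested delay, and the 300 ns figure is taken from the comment above rather than from any datasheet.

```c
#include <stdint.h>

#define DEMO_RESET_BIT     (1u << 30)	/* BIT30 of the local control register */
#define DEMO_RESET_HOLD_NS 300UL	/* from the comment: 170 ns min, 300 ns to be safe */

/* Last delay requested, so the stand-in below has an observable effect. */
unsigned long demo_last_delay_ns;

/* Userspace stand-in for ndelay(); it does not actually wait. */
void demo_ndelay(unsigned long nsecs)
{
	demo_last_delay_ns = nsecs;
}

/* Set the reset bit, hold it for the required time, then clear it. */
void demo_reset_pulse(volatile uint32_t *misc_ctrl)
{
	*misc_ctrl |= DEMO_RESET_BIT;
	demo_ndelay(DEMO_RESET_HOLD_NS);
	*misc_ctrl &= ~DEMO_RESET_BIT;
}
```

With an explicit delay there is no dummy read left, so the readval variable and its annotation can go away altogether.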


thanks,
--
js
suse labs


[PATCH] perf_event_open.2: Update man page with recent changes

2020-11-05 Thread Namhyung Kim
From: Namhyung Kim 

There are lots of changes as usual.  I've tried to fill some missing
bits in the man page but it'd be nice if you could take a look and put
more info there.

Signed-off-by: Namhyung Kim 
---
 man2/perf_event_open.2 | 262 -
 1 file changed, 260 insertions(+), 2 deletions(-)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index 72afafb50..e86adfa41 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -247,8 +247,15 @@ struct perf_event_attr {
due to exec */
   use_clockid:  1,  /* use clockid for time fields */
   context_switch :  1,  /* context switch data */
+  write_backward :  1,  /* Write ring buffer from end to beginning */
+  namespaces :  1,  /* include namespaces data */
+  ksymbol:  1,  /* include ksymbol events */
+  bpf_event  :  1,  /* include bpf events */
+  aux_output :  1,  /* generate AUX records instead of events */
+  cgroup :  1,  /* include cgroup events */
+  text_poke  :  1,  /* include text poke events */
 
-  __reserved_1   : 37;
+  __reserved_1   : 30;
 
 union {
 __u32 wakeup_events;/* wakeup every n events */
@@ -854,6 +861,20 @@ is set higher than zero then the register
 values returned are those captured by
 hardware at the time of the sampled
 instruction's retirement.
+.TP
+.BR PERF_SAMPLE_PHYS_ADDR " (since Linux 4.13)"
+.\" commit fc7ce9c74c3ad232b084d80148654f926d01ece7
+Records physical address of data like in
+.B PERF_SAMPLE_ADDR .
+.TP
+.BR PERF_SAMPLE_CGROUP " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+Records (perf_event) cgroup id of the process.
+This corresponds to the
+.I id
+field in the
+.B PERF_RECORD_CGROUP
+event.
 .RE
 .TP
 .IR "read_format"
@@ -1189,6 +1210,47 @@ information even with strict
 .I perf_event_paranoid
 settings.
 .TP
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit 9ecda41acb971ebd07c8fb35faf24005c0baea12
+This makes the kernel write the ring buffer from end to beginning.
+This is to support reading from an overwritable ring buffer.
+.TP
+.IR "namespaces" " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This enables the generation of
+.B PERF_RECORD_NAMESPACES
+records when a task is entering to a new namespace.  Each namespace has a
+combination of device and inode numbers.
+.TP
+.IR "ksymbol" " (since Linux 5.0)"
+.\" commit 76193a94522f1d4edf2447a536f3f796ce56343b
+This enables the generation of
+.B PERF_RECORD_KSYMBOL
+records when new kernel symbols are registered or unregistered.
+This is for analyzing dynamic kernel functions like eBPF.
+.TP
+.IR "bpf_event" " (since Linux 5.0)"
+.\" commit 6ee52e2a3fe4ea35520720736e6791df1fb67106
+This enables the generation of
+.B PERF_RECORD_BPF_EVENT
+records when an eBPF program is loaded or unloaded.
+.IR "auxevent" " (since Linux 5.4)"
+.\" commit ab43762ef010967e4ccd53627f70a2eecbeafefb
+This allows normal (non-AUX) events to generate data for AUX events
+if the hardware supports it.
+.IR "cgroup" " (since Linux 5.7)"
+.\" commit 96aaab686505c449e24d76e76507290dcc30e008
+This enables the generation of
+.B PERF_RECORD_CGROUP
+records when a new cgroup is created (and activated).
+.TP
+.IR "text_poke" " (since Linux 5.8)"
+.\" commit e17d43b93e544f5016c0251d2074c15568d5d963
+This enables the generation of
+.B PERF_RECORD_TEXT_POKE
+records when there are changes to the kernel text (i.e. self-modifying
+code).
+.TP
 .IR "wakeup_events" ", " "wakeup_watermark"
 This union sets how many samples
 .RI ( wakeup_events )
@@ -2101,7 +2163,7 @@ struct {
 u64nr;  /* if PERF_SAMPLE_CALLCHAIN */
 u64ips[nr]; /* if PERF_SAMPLE_CALLCHAIN */
 u32size;/* if PERF_SAMPLE_RAW */
-char  data[size];   /* if PERF_SAMPLE_RAW */
+char   data[size];  /* if PERF_SAMPLE_RAW */
 u64bnr; /* if PERF_SAMPLE_BRANCH_STACK */
 struct perf_branch_entry lbr[bnr];
 /* if PERF_SAMPLE_BRANCH_STACK */
@@ -2118,6 +2180,8 @@ struct {
 u64abi; /* if PERF_SAMPLE_REGS_INTR */
 u64regs[weight(mask)];
 /* if PERF_SAMPLE_REGS_INTR */
+u64phys_addr;   /* if PERF_SAMPLE_PHYS_ADDR */
+u64cgroup;  /* if PERF_SAMPLE_CGROUP */
 };
 .EE
 .in
@@ -2744,6 +2808,200 @@ or next (if switching out) process on the CPU.
 The thread ID of the previous (if switching in)
 or next (if switching out) thread on the CPU.
 .RE
+.TP
+.BR PERF_RECORD_NAMESPACES " (since Linux 4.11)"
+.\" commit e422267322cd319e2695a535e47c5b1feeac45eb
+This record includes various namespace information of a process.
+.IP
+.in +4n
+.EX
+struct {
+    struct perf_event_header header;
+    u32 pid;
+    u32 tid;
+    u64 nr_namespaces;
+    struct { u64 dev, inode } [nr_namespaces];
+    struct sample_id sample_id;
+

Re: [LKP] Re: [mm/gup] a308c71bf1: stress-ng.vm-splice.ops_per_sec -95.6% regression

2020-11-05 Thread Xing Zhengjun




On 11/5/2020 2:29 AM, Linus Torvalds wrote:

On Mon, Nov 2, 2020 at 1:15 AM kernel test robot  wrote:


Greeting,

FYI, we noticed a -95.6% regression of stress-ng.vm-splice.ops_per_sec due to 
commit:

commit: a308c71bf1e6e19cc2e4ced31853ee0fc7cb439a ("mm/gup: Remove enfornced COW 
mechanism")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


Note that this is just the reverse of the previous 2000% improvement
reported by the test robot here:

 https://lore.kernel.org/lkml/20200611040453.GK12456@shao2-debian/

and the explanation seems to remain the same:

 
https://lore.kernel.org/lkml/cag48ez1v1b4x5lgfya6nvi33-twwqna_dc5jgfvosqqhdn_...@mail.gmail.com/

IOW, this is testing a special case (zero page lookup) that the "force
COW" patches happened to turn into a regular case (COW creating a
regular page from the zero page).

The question is whether we should care about the zero page for gup_fast lookup.

If we do care, then the proper fix is likely simply to allow the zero
page in fast-gup, the same way we already do in slow-gup.

ENTIRELY UNTESTED PATCH ATTACHED.

Rong - mind testing this? I don't think the zero-page _should_ be
something that real loads care about, but hey, maybe people do want to
do things like splice zeroes very efficiently..


I tested the patch; the regression still exists.

=
tbox_group/testcase/rootfs/kconfig/compiler/nr_threads/disk/testtime/class/cpufreq_governor/ucode:

lkp-csl-2sp5/stress-ng/debian-10.4-x86_64-20200603.cgz/x86_64-rhel-8.3/gcc-9/100%/1HDD/30s/pipe/performance/0x5002f01

commit:
  1a0cf26323c80e2f1c58fc04f15686de61bfab0c
  a308c71bf1e6e19cc2e4ced31853ee0fc7cb439a
  da5ba9980aa2211c1e2a89fc814abab2fea6f69d (debug patch)

1a0cf26323c80e2f a308c71bf1e6e19cc2e4ced3185 da5ba9980aa2211c1e2a89fc814
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
 3.406e+09            -95.6%   1.49e+08           -96.4%  1.213e+08        stress-ng.vm-splice.ops
 1.135e+08            -95.6%    4965911           -96.4%    4041777        stress-ng.vm-splice.ops_per_sec




And note the "untested" part of the patch. It _looks_ fairly obvious,
but maybe I'm missing something.

 Linus





--
Zhengjun Xing


Linux Kernel Code of Conduct Committee: October 2020 report

2020-11-05 Thread Greg KH
Despite our previously hoped-for timely release of these reports that
were mentioned last time:
https://lore.kernel.org/lkml/20200103105614.gc1047...@kroah.com/
that hasn't happened, so here's the report for the first 10 months of
2020.  I will work to do better on this in the future, my apologies.

Linux Kernel Code of Conduct Committee: October 2020

In the period of January 1, 2020 through October 31, 2020 the Committee
received the following reports:
  - Unacceptable behavior or comments in email: 1
  - Unacceptable comments in github repo by non-community members: 1
  - Unacceptable comments toward a company: 1

The result of the investigation:
  - Education and coaching: 1
  - Locking of github repo for any comments: 1
  - Clarification that the Code of Conduct covers conduct related to
individual developers only: 1

We would like to thank the Linux kernel community members who have
supported the adoption of the Code of Conduct and who continue to uphold
the professional standards of our community.  If you have questions
about this report, please write to .



The website at https://www.kernel.org/code-of-conduct.html has a list of
this, and other past Code Of Conduct Committee reports.

thanks,

greg k-h


Re: [PATCH v1 4/4] powernv/memtrace: don't abuse memory hot(un)plug infrastructure for memory allocations

2020-11-05 Thread David Hildenbrand


> On 05.11.2020 at 03:53, Michael Ellerman wrote:
> 
> David Hildenbrand  writes:
>> Let's use alloc_contig_pages() for allocating memory and remove the
>> linear mapping manually via arch_remove_linear_mapping(). Mark all pages
>> PG_offline, such that they will definitely not get touched - e.g.,
>> when hibernating. When freeing memory, try to revert what we did.
>> The original idea was discussed in:
>> https://lkml.kernel.org/r/48340e96-7e6b-736f-9e23-d3111b915...@redhat.com
>> This is similar to CONFIG_DEBUG_PAGEALLOC handling on other
>> architectures, whereby only single pages are unmapped from the linear
>> mapping. Let's mimic what memory hot(un)plug would do with the linear
>> mapping.
>> We now need MEMORY_HOTPLUG and CONTIG_ALLOC as dependencies.
>> Simple test under QEMU TCG (10GB RAM, single NUMA node):
>> sh-5.0# mount -t debugfs none /sys/kernel/debug/
>> sh-5.0# cat /sys/devices/system/memory/block_size_bytes
>> 4000
>> sh-5.0# echo 0x4000 > /sys/kernel/debug/powerpc/memtrace/enable
>> [   71.052836][  T356] memtrace: Allocated trace memory on node 0 at 
>> 0x8000
>> sh-5.0# echo 0x8000 > /sys/kernel/debug/powerpc/memtrace/enable
>> [   75.424302][  T356] radix-mmu: Mapped 
>> 0x8000-0xc000 with 64.0 KiB pages
>> [   75.430549][  T356] memtrace: Freed trace memory back on node 0
>> [   75.604520][  T356] memtrace: Allocated trace memory on node 0 at 
>> 0x8000
>> sh-5.0# echo 0x1 > /sys/kernel/debug/powerpc/memtrace/enable
>> [   80.418835][  T356] radix-mmu: Mapped 
>> 0x8000-0x0001 with 64.0 KiB pages
>> [   80.430493][  T356] memtrace: Freed trace memory back on node 0
>> [   80.433882][  T356] memtrace: Failed to allocate trace memory on node 0
>> sh-5.0# echo 0x4000 > /sys/kernel/debug/powerpc/memtrace/enable
>> [   91.920158][  T356] memtrace: Allocated trace memory on node 0 at 
>> 0x8000
> 
> I gave this a quick spin on a real machine, seems to work OK.
> 
> I don't have the actual memtrace tools setup to do an actual trace, will
> try and get someone to test that also.
> 
> One observation is that previously the memory was zeroed when enabling
> the memtrace, whereas now it's not.
> 
> eg, before:
> 
> # hexdump -C /sys/kernel/debug/powerpc/memtrace//trace 
>   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> 1000
> 
> whereas after:
> 
> # hexdump -C /sys/kernel/debug/powerpc/memtrace//trace
>   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> 0080  e0 fd 43 00 00 00 00 00  e0 fd 43 00 00 00 00 00  |..C...C.|
> 0090  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> 0830  98 bf 39 00 00 00 00 00  98 bf 39 00 00 00 00 00  |..9...9.|
> 0840  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> 08a0  b0 c8 47 00 00 00 00 00  b0 c8 47 00 00 00 00 00  |..G...G.|
> 08b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> ...
> 0f70  78 53 49 7d 00 00 29 2e  88 00 92 41 01 00 49 39  |xSI}..)A..I9|
> 0f80  b4 07 4a 7d 28 f8 00 7d  00 48 08 7c 0c 00 c2 40  |..J}(..}.H.|...@|
> 0f90  2d f9 40 7d f0 ff c2 40  b4 07 0a 7d 00 48 8a 7f  |-.@}...@...}.H..|
> 0fa0  70 fe 9e 41 cc ff ff 4b  00 00 00 60 00 00 00 60  |p..A...K...`...`|
> 0fb0  01 00 00 48 00 00 00 60  00 00 a3 2f 0c fd 9e 40  |...H...`.../...@|
> 0fc0  00 00 a2 3c 00 00 a5 e8  00 00 62 3c 00 00 63 e8  |...<..b<..c.|
> 0fd0  01 00 20 39 83 02 80 38  00 00 3c 99 01 00 00 48  |.. 9...8..<....H|
> 0fe0  00 00 00 60 e4 fc ff 4b  00 00 80 38 78 fb e3 7f  |...`...K...8x...|
> 0ff0  01 00 00 48 00 00 00 60  2c fe ff 4b 00 00 00 60  |...H...`,..K...`|
> 1000
> 
> 
> That's a nice way for root to read kernel memory, so we should probably
> add a __GFP_ZERO or memset in there somewhere.

Thanks for catching that! I will have a look on Monday at whether
alloc_contig_pages() already handles __GFP_ZERO properly so we can use it;
otherwise I'll fix that.

I don't recall that memory hotunplug does any zeroing - that's why I didn't add
any explicit zeroing. It could be that you were just lucky in your experiment - I
assume we already leak kernel memory.

Thanks!
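
As a userspace analogy only (this is not the memtrace code, and every name below is hypothetical), the two options mentioned above, zeroing at allocation time versus an explicit memset, can be sketched like this. calloc() stands in for an allocator that zeroes on request, and malloc() plus memset() stands in for zeroing after the fact; whether alloc_contig_pages() honours __GFP_ZERO is exactly the open question and is not answered here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a buffer that will later be exposed to a
 * reader, and clear it explicitly so that no stale contents leak out.
 */
static unsigned char *toy_alloc_zeroed_explicit(size_t size)
{
	unsigned char *buf = malloc(size);

	if (!buf)
		return NULL;
	memset(buf, 0, size);	/* analogous to an explicit memset() */
	return buf;
}

/* Hypothetical helper: let the allocator do the zeroing. */
static unsigned char *toy_alloc_zeroed_calloc(size_t size)
{
	return calloc(1, size);	/* analogous to asking for zeroed memory */
}

/* Returns 1 if every byte of the buffer is zero, 0 otherwise. */
static int toy_is_zeroed(const unsigned char *buf, size_t size)
{
	assert(buf != NULL);
	for (size_t i = 0; i < size; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}
```

Either variant guarantees that a later hexdump of the buffer shows only zeroes until something is actually written to it.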

> cheers



Re: [RFC 0/2] perf/core: Invoke pmu::sched_task callback for cpu events

2020-11-05 Thread Stephane Eranian
On Mon, Nov 2, 2020 at 6:52 AM Namhyung Kim  wrote:
>
> Hello,
>
> It was reported that system-wide events with precise_ip set have a lot
> of unknown symbols on Intel machines.  Depending on the system load I
> can see more than 30% of total symbols are not resolved (actually
> don't have DSO mappings).
>
> I found that it only happens when large PEBS is enabled - using call-graph or the
> frequency mode will disable it and give valid results.  I've verified
> it by checking whether intel_pmu_pebs_sched_task() is called, like below:
>
>   # perf probe -a intel_pmu_pebs_sched_task
>
>   # perf stat -a -e probe:intel_pmu_pebs_sched_task \
>   >   perf record -a -e cycles:ppp -c 11 sleep 1
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 2.625 MB perf.data (10345 samples) ]
>
>Performance counter stats for 'system wide':
>
>  0  probe:intel_pmu_pebs_sched_task
>
>2.157533991 seconds time elapsed
>
>
> Looking at the code, I found out that the pmu::sched_task callback was
> changed recently so that it's called only for task events.  So cpu events
> with large PEBS didn't flush the buffer, and the samples were later
> attributed to unrelated tasks, resulting in unresolved symbols.
>
> This patch reverts it and keeps the optimization for task events.
> While at it, I also found the context switch callback was not enabled
> for cpu events from the beginning.  So I've added it too.  With this
> applied, I can see the above callbacks are hit as expected and perf
> report has valid symbols.
>
This is a serious bug that impacts many kernel versions as soon as
multi-entry PEBS is activated by the kernel in system-wide mode.
I remember this was working in the past, so it must have been broken by
some code refactoring, optimization, or extension of sched_task
to other features. PEBS must be flushed on context switch in per-cpu
mode; otherwise you may report samples in locations that do not belong
to the process in which they are processed. PEBS does not tag samples
with PID/TID.
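
To make the flushing argument concrete, here is a toy userspace model. It is not kernel code and all names (toy_cpu, toy_flush, toy_switch, ...) are hypothetical. It assumes only what is stated above: buffered samples carry no PID/TID, so each one is attributed to whichever task is current when the buffer is drained.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BUF_CAP 8
#define TOY_OUT_CAP 64

/* One buffered hardware sample: an address only, no PID/TID tag. */
struct toy_sample { unsigned long ip; };

/* A drained sample as the profiler would report it. */
struct toy_record { unsigned long ip; int pid; };

struct toy_cpu {
	int current_pid;			/* task currently on this CPU */
	struct toy_sample buf[TOY_BUF_CAP];	/* multi-entry sample buffer */
	size_t nr_buffered;
	struct toy_record out[TOY_OUT_CAP];	/* attributed records */
	size_t nr_out;
};

/* Drain the buffer, attributing every entry to the current task. */
static void toy_flush(struct toy_cpu *cpu)
{
	for (size_t i = 0; i < cpu->nr_buffered; i++) {
		assert(cpu->nr_out < TOY_OUT_CAP);
		cpu->out[cpu->nr_out].ip = cpu->buf[i].ip;
		cpu->out[cpu->nr_out].pid = cpu->current_pid;
		cpu->nr_out++;
	}
	cpu->nr_buffered = 0;
}

/* Record one sample; drain first if the buffer is full. */
static void toy_sample_hit(struct toy_cpu *cpu, unsigned long ip)
{
	if (cpu->nr_buffered == TOY_BUF_CAP)
		toy_flush(cpu);
	cpu->buf[cpu->nr_buffered++].ip = ip;
}

/* Context switch; flush_first models the flush callback being invoked. */
static void toy_switch(struct toy_cpu *cpu, int next_pid, int flush_first)
{
	if (flush_first)
		toy_flush(cpu);
	cpu->current_pid = next_pid;
}

/*
 * Fixed scenario: pid 100 takes two samples below 0x2000, then pid 200
 * takes one at 0x2000. Returns how many records ended up with the wrong pid.
 */
static size_t toy_misattributed(int flush_on_switch)
{
	struct toy_cpu cpu = { .current_pid = 100 };
	size_t wrong = 0;

	toy_sample_hit(&cpu, 0x1000);
	toy_sample_hit(&cpu, 0x1008);
	toy_switch(&cpu, 200, flush_on_switch);
	toy_sample_hit(&cpu, 0x2000);
	toy_flush(&cpu);

	for (size_t i = 0; i < cpu.nr_out; i++) {
		int expected = cpu.out[i].ip < 0x2000 ? 100 : 200;

		if (cpu.out[i].pid != expected)
			wrong++;
	}
	return wrong;
}
```

In this model, skipping the flush on the switch leaves the two samples taken by pid 100 in the buffer, and they are later charged to pid 200, which is the kind of misattribution that shows up as unresolved symbols.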


Re: [PATCH v5 0/4] DCMI BT656 parallel bus mode support

2020-11-05 Thread Sakari Ailus
Hi Hugues,

On Wed, Nov 04, 2020 at 06:32:08PM +0100, Hugues Fruchet wrote:
> Add support of BT656 embedded synchronization bus.
> This mode makes it possible to save the hardware synchro lines hsync & vsync
> by replacing them with synchro codes embedded in data stream.
> Add "bus-type" property and make it required so that there is no
> ambiguity between parallel mode (bus-type=5) and BT656 mode (bus-type=6).

Thanks for the update.

Regarding the last two patches, which tree are they intended to go to?
Something other than media? I can also take them if that's OK with the
maintainer of the "right" tree.

-- 
Kind regards,

Sakari Ailus


Re: [PATCH v4 1/2] kunit: Support for Parameterized Testing

2020-11-05 Thread Marco Elver
On Thu, 5 Nov 2020 at 08:32, Arpitha Raghunandan <98.a...@gmail.com> wrote:
>
> On 28/10/20 12:51 am, Marco Elver wrote:
> > On Tue, 27 Oct 2020 at 18:47, Arpitha Raghunandan <98.a...@gmail.com> wrote:
> >>
> >> Implementation of support for parameterized testing in KUnit.
> >> This approach requires the creation of a test case using the
> >> KUNIT_CASE_PARAM macro that accepts a generator function as input.
> >> This generator function should return the next parameter given the
> >> previous parameter in parameterized tests. It also provides
> >> a macro to generate common-case generators.
> >>
> >> Signed-off-by: Arpitha Raghunandan <98.a...@gmail.com>
> >> Co-developed-by: Marco Elver 
> >> Signed-off-by: Marco Elver 
> >> ---
> >> Changes v3->v4:
> >> - Rename kunit variables
> >> - Rename generator function helper macro
> >> - Add documentation for generator approach
> >> - Display test case name in case of failure along with param index
> >> Changes v2->v3:
> >> - Modification of generator macro and method
> >> Changes v1->v2:
> >> - Use of a generator method to access test case parameters
> >>
> >>  include/kunit/test.h | 34 ++
> >>  lib/kunit/test.c | 21 -
> >>  2 files changed, 54 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/kunit/test.h b/include/kunit/test.h
> >> index 9197da792336..ec2307ee9bb0 100644
> >> --- a/include/kunit/test.h
> >> +++ b/include/kunit/test.h
> >> @@ -107,6 +107,13 @@ struct kunit;
> >>   *
> >>   * @run_case: the function representing the actual test case.
> >>   * @name: the name of the test case.
> >> + * @generate_params: the generator function for parameterized tests.
> >> + *
> >> + * The generator function is used to lazily generate a series of
> >> + * arbitrarily typed values that fit into a void*. The argument @prev
> >> + * is the previously returned value, which should be used to derive the
> >> + * next value; @prev is set to NULL on the initial generator call.
> >> + * When no more values are available, the generator must return NULL.
> >>   *
> >
> > Hmm, should this really be the first paragraph? I think it should be
> > the paragraph before "Example:" maybe. But then that paragraph should
> > refer to generate_params e.g. "The generator function @generate_params
> > is used to ".
> >
> > The other option you have is to move this paragraph to the kernel-doc
> > comment for KUNIT_CASE_PARAM, which seems to be missing a kernel-doc
> > comment.
> >
> >>   * A test case is a function with the signature,
> >>   * ``void (*)(struct kunit *)``
> >> @@ -141,6 +148,7 @@ struct kunit;
> >>  struct kunit_case {
> >> void (*run_case)(struct kunit *test);
> >> const char *name;
> >> +   void* (*generate_params)(void *prev);
> >>
> >> /* private: internal use only. */
> >> bool success;
> >> @@ -162,6 +170,9 @@ static inline char *kunit_status_to_string(bool status)
> >>   * &struct kunit_case for an example on how to use it.
> >>   */
> >>  #define KUNIT_CASE(test_name) { .run_case = test_name, .name = #test_name 
> >> }
> >
> > I.e. create a new kernel-doc comment for KUNIT_CASE_PARAM here, and
> > simply move the paragraph describing the generator protocol into that
> > comment.
> >
> >> +#define KUNIT_CASE_PARAM(test_name, gen_params)\
> >> +   { .run_case = test_name, .name = #test_name,\
> >> + .generate_params = gen_params }
> >>
> >>  /**
> >>   * struct kunit_suite - describes a related collection of &struct 
> >> kunit_case
> >> @@ -208,6 +219,15 @@ struct kunit {
> >> const char *name; /* Read only after initialization! */
> >> char *log; /* Points at case log after initialization */
> >> struct kunit_try_catch try_catch;
> >> +   /* param_value points to test case parameters in parameterized 
> >> tests */
> >
> > Hmm, not quite: param_value is the current parameter value for a test
> > case. Most likely it's a pointer, but it doesn't need to be.
> >
> >> +   void *param_value;
> >> +   /*
> >> +* param_index stores the index of the parameter in
> >> +* parameterized tests. param_index + 1 is printed
> >> +* to indicate the parameter that causes the test
> >> +* to fail in case of test failure.
> >> +*/
> >
> > I think this comment needs to be reformatted, because you can use at
> > least 80 cols per line. (If you use vim, visual select
> > and do 'gq'.)
> >
> >> +   int param_index;
> >> /*
> >>  * success starts as true, and may only be set to false during a
> >>  * test case; thus, it is safe to update this across multiple
> >> @@ -1742,4 +1762,18 @@ do {
> >> \
> >> fmt,   
> >> \
> >>   
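
The generator protocol described in the quoted kernel-doc (the first call passes NULL, each later call passes the previously returned value, and NULL is returned when the values run out) can be sketched in plain userspace C as follows. This is only an illustration under those stated rules, not the KUnit implementation; the toy_* names, the parameter table, and the runner loop are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter type: an input and the expected square. */
struct toy_param { int input; int expected; };

static const struct toy_param toy_params[] = {
	{ 1, 1 }, { 2, 4 }, { 3, 9 },
};

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Generator following the protocol: prev == NULL yields the first value,
 * otherwise the value after prev; NULL signals that no values remain.
 */
static void *toy_gen_params(void *prev)
{
	const struct toy_param *p = prev;
	const struct toy_param *next = p ? p + 1 : toy_params;

	if (next >= toy_params + TOY_ARRAY_SIZE(toy_params))
		return NULL;
	return (void *)next;
}

/* Hypothetical check run once per parameter; returns 1 on pass. */
static int toy_square_ok(const void *param)
{
	const struct toy_param *p = param;

	assert(p != NULL);
	return p->input * p->input == p->expected;
}

/*
 * Hypothetical runner: returns 0 if every parameter passes, otherwise the
 * 1-based index of the first failing parameter (index + 1, as the patch
 * description says is printed on failure).
 */
static int toy_run_case(void *(*gen)(void *prev),
			int (*check)(const void *param))
{
	int index = 0;

	for (void *param = gen(NULL); param; param = gen(param), index++)
		if (!check(param))
			return index + 1;
	return 0;
}
```

A caller would pass the generator and the check together, for example toy_run_case(toy_gen_params, toy_square_ok), much as a test case pairs its run function with a generator.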

Re: [PATCH] ARM: dts: exynos: Assign a fixed index to mmc devices on exynos4412 based ODROID boards

2020-11-05 Thread Markus Reichl

Hi Marek,

on rk3399 the proposed ordering [1] is according to base address in DT.

[1]
https://patchwork.kernel.org/patch/11881427

On 04.11.20 at 14:44, Marek Szyprowski wrote:

On 04.11.2020 14:13, Marek Szyprowski wrote:

On 04.11.2020 14:06, Markus Reichl wrote:

On 04.11.20 at 13:25, Marek Szyprowski wrote:

On 04.11.2020 11:25, Markus Reichl wrote:

Recently introduced async probe on mmc devices can shuffle block IDs.
Pin them to fixed values to ease booting in environments where UUIDs
are not practical.
Use newly introduced aliases for mmcblk devices from [1].

[1]
https://patchwork.kernel.org/patch/11747669/

Signed-off-by: Markus Reichl 
---
   arch/arm/boot/dts/exynos4412-odroid-common.dtsi | 5 +
   1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
index a5c1ce1e396c..aa10d5bc7e1c 100644
--- a/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
+++ b/arch/arm/boot/dts/exynos4412-odroid-common.dtsi
@@ -13,6 +13,11 @@
   #include "exynos-mfc-reserved-memory.dtsi"
     / {
+    aliases {
+    mmc0 = &sdhci_2;
+    mmc1 = &mshc_0;

Like in the OdroidXU3-family patch, I would use 0 for the eMMC (mshc_0)
and 2 for the SD-card (sdhci_2).

How to deal then with sdhci_0 (from exynos4.dtsi) vs. mshc_0 (from
exynos4412.dts)?

sdhci_0 and mshc_0 both operate on the same physical MMC0 bus, so this
is not an issue. They cannot be used simultaneously. The latter is just
faster, the first one has been left there mainly for the software
compatibility.


I've thought a bit more on this and I would simply prefer to add generic
MMC aliases to the top-level Exynos dtsi files (3250, 4210, 4412, 5250,
5410, 5420) to keep Linux logical MMC bus numbers in sync with the HW
bus numbers on all boards.

Best regards



Regards,
--
Markus Reichl




Re: [PATCH] applesmc: Re-work SMC comms v2

2020-11-05 Thread Brad Campbell
On 5/11/20 6:56 pm, Henrik Rydberg wrote:
> Hi Brad,
> 
> Great to see this effort, it is certainly an area which could be improved. 
> After having seen several generations of Macbooks while modifying much of 
> that code, it became clear that the SMC communication got refreshed a few 
> times over the years. Every tiny change had to be tested on all machines, or 
> kept separate for a particular generation, or something would break.
> 
> I have not followed the back story here, but I imagine the need has arisen 
> because of a new refresh, and so this patch only needs to strictly apply to a 
> new generation. I would therefore advise that you write the patch in that 
> way, reducing the actual change to zero for earlier generations. It also 
> makes it easier to test the effect of the new approach on older systems. I 
> should be able to help testing on a 2008 and 2011 model once we get to that 
> stage.

G'day Henrik,

Unfortunately I didn't make these changes to accommodate a "new generation". 
Changes made in kernel 5.9 broke it on my machine, and when looking into why I 
didn't identify any obvious causes, so I re-worked some of the comms.

I can't guarantee it won't break older machines which is why I've asked for 
help testing it. I only have a MacbookPro 11,1 and an iMac 12,2. It fixes both 
of those.

Help testing would be much appreciated.

Regards,
Brad


Re: [PATCH v3 2/6] dt-bindings: pci: add the samsung,exynos-pcie binding

2020-11-05 Thread Marek Szyprowski
Hi Rob,

On 04.11.2020 22:35, Rob Herring wrote:
> On Thu, Oct 29, 2020 at 02:40:13PM +0100, Marek Szyprowski wrote:
>> Add dt-bindings for the Samsung Exynos PCIe controller (Exynos5433
>> variant). Based on the text dt-binding posted by Jaehoon Chung.
>>
>> Signed-off-by: Marek Szyprowski 
>> Reviewed-by: Krzysztof Kozlowski 
>> ---
>>   .../bindings/pci/samsung,exynos-pcie.yaml | 119 ++
>>   1 file changed, 119 insertions(+)
>>   create mode 100644 
>> Documentation/devicetree/bindings/pci/samsung,exynos-pcie.yaml

>> ...

>> +  num-viewport:
>> +const: 3
> I'm confused why you need this. This is only used with the iATU except
> for keystone. Platforms like Exynos with their own child bus config
> space accessors don't have an iATU.

Frankly I have no idea, I don't know much about the PCI internals. After 
rebasing onto your latest DW PCI changes I noticed the following 
warning message:

exynos-pcie 1570.pcie: Resources exceed number of ATU entries (2)

Here is a complete log:

# dmesg | grep pci
ehci-pci: EHCI PCI platform driver
ohci-pci: OHCI PCI platform driver
exynos-pcie 1570.pcie: host bridge /soc@0/pcie@1570 ranges:
exynos-pcie 1570.pcie:   IO 0x000c001000..0x000c010fff -> 
0x00
exynos-pcie 1570.pcie:  MEM 0x000c011000..0x000ffe -> 
0x000c011000
exynos-pcie 1570.pcie: Resources exceed number of ATU entries (2)
exynos-pcie 1570.pcie: Link up
exynos-pcie 1570.pcie: PCI host bridge to bus :00
pci_bus :00: root bus resource [bus 00-ff]
pci_bus :00: root bus resource [io  0x-0x]
pci_bus :00: root bus resource [mem 0x0c011000-0x0ffe]
pci :00:00.0: [144d:a5e3] type 01 class 0x060400
pci :00:00.0: PME# supported from D0 D3hot D3cold
pci :01:00.0: [14e4:43e9] type 00 class 0x028000
pci :01:00.0: reg 0x10: [mem 0x-0x7fff 64bit]
pci :01:00.0: reg 0x18: [mem 0x-0x003f 64bit]
pci :01:00.0: supports D1 D2
pci :01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci :00:00.0: BAR 14: assigned [mem 0x0c20-0x0c7f]
pci :01:00.0: BAR 2: assigned [mem 0x0c40-0x0c7f 64bit]
pci :01:00.0: BAR 0: assigned [mem 0x0c20-0x0c207fff 64bit]
pci :00:00.0: PCI bridge to [bus 01-ff]
pci :00:00.0:   bridge window [mem 0x0c20-0x0c7f]
pci :00:00.0: MSI quirk detected; MSI disabled
pcieport :00:00.0: PME: Signaling with IRQ 97
brcmfmac: brcmf_fw_alloc_request: using brcm/brcmfmac4358-pcie for chip 
BCM4358/1

When I increased the number of viewports, the warning went away.

If this is not the proper solution, I will remove it.

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH 35/36] tty: synclink: Mark disposable variables as __always_unused

2020-11-05 Thread Jiri Slaby

On 04. 11. 20, 20:35, Lee Jones wrote:

Fixes the following W=1 kernel build warning(s):

  drivers/tty/synclink.c: In function ‘usc_reset’:
  drivers/tty/synclink.c:5571:6: warning: variable ‘readval’ set but not used 
[-Wunused-but-set-variable]
  drivers/tty/synclink.c: In function ‘mgsl_load_pci_memory’:
  drivers/tty/synclink.c:7267:16: warning: variable ‘Dummy’ set but not used 
[-Wunused-but-set-variable]

Cc: Greg Kroah-Hartman 
Cc: Jiri Slaby 
Cc: pau...@microgate.com
Signed-off-by: Lee Jones 
---
  drivers/tty/synclink.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/synclink.c b/drivers/tty/synclink.c
index c8324d58ef564..8ed64b1e7c378 100644
--- a/drivers/tty/synclink.c
+++ b/drivers/tty/synclink.c
@@ -5568,7 +5568,7 @@ static void usc_load_txfifo( struct mgsl_struct *info )
  static void usc_reset( struct mgsl_struct *info )
  {
int i;
-   u32 readval;
+   u32 __always_unused readval;


The same as in synclinkmp.

  
  	/* Set BIT30 of Misc Control Register */

/* (Local Control Register 0x50) to force reset of USC. */
@@ -7264,7 +7264,7 @@ static void mgsl_load_pci_memory( char* TargetPtr, const 
char* SourcePtr,
  
  	unsigned short Intervalcount = count / PCI_LOAD_INTERVAL;

unsigned short Index;
-   unsigned long Dummy;
+   unsigned long __always_unused Dummy;


You can kill it completely.

  
  	for ( Index = 0 ; Index < Intervalcount ; Index++ )

{



thanks,
--
js
suse labs


Re: [PATCH 34/36] tty: serial: pmac_zilog: Make disposable variable __always_unused

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 05. 11. 20, 8:04, Christophe Leroy wrote:
> > 
> > 
> > Le 04/11/2020 à 20:35, Lee Jones a écrit :
> > > Fixes the following W=1 kernel build warning(s):
> > > 
> > >   drivers/tty/serial/pmac_zilog.h:365:58: warning: variable
> > > ‘garbage’ set but not used [-Wunused-but-set-variable]
> > 
> > Explain how you are fixing this warning.
> > 
> > Setting  __always_unused is usually not the good solution for fixing
> > this warning, but here I guess this is likely the good solution. But it
> > should be explained why.

There are normally 3 ways to fix this warning:

 - Start using/checking the variable/result
 - Remove the variable
 - Mark it as __{always,maybe}_unused

The latter just tells the compiler that not checking the resultant
value is intentional.  There are some functions (as Jiri mentions
below) which are marked as '__must_check' which *require* a dummy
(garbage) variable to be used.
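
For illustration, the three options can be sketched in userspace C. This is not the pmac_zilog code: toy_read_data() is a hypothetical stand-in for a register read with a side effect, and TOY_ALWAYS_UNUSED assumes that the kernel's __always_unused boils down to the GCC/Clang "unused" attribute.

```c
#include <assert.h>

/* Assumption: userspace equivalent of the kernel's __always_unused. */
#define TOY_ALWAYS_UNUSED __attribute__((__unused__))

static int toy_reads;	/* counts how often the "register" was read */

/* Hypothetical register read; the read itself is the desired side effect. */
static unsigned char toy_read_data(void)
{
	toy_reads++;
	return 0xff;
}

/* Option 1: actually use the value that was read. */
static int toy_fifo_had_data(void)
{
	unsigned char val = toy_read_data();

	return val != 0;
}

/* Option 2: remove the variable; the bare call still performs the read. */
static void toy_clear_fifo_no_var(void)
{
	toy_read_data();
	toy_read_data();
	toy_read_data();
}

/* Option 3: keep the variable and mark the unused result as intentional. */
static void toy_clear_fifo_annotated(void)
{
	volatile unsigned char TOY_ALWAYS_UNUSED garbage;

	garbage = toy_read_data();
	garbage = toy_read_data();
	garbage = toy_read_data();
}

/* A function whose result the compiler insists on being consumed. */
static int __attribute__((__warn_unused_result__)) toy_must_check(void)
{
	return 0;
}

/* Here a dummy variable is needed, and the annotation documents why. */
static void toy_ignore_must_check(void)
{
	int TOY_ALWAYS_UNUSED ret = toy_must_check();

	assert(toy_reads >= 0);	/* keeps the helper non-trivial for the demo */
}
```

Options 2 and 3 perform the same number of reads; they differ only in whether a placeholder variable survives in the source.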

> Or, why is the "garbage =" needed in the first place? read_zsdata is not
> defined with __warn_unused_result__.

I used '__always_unused' here for fear of breaking something.

However, if it's safe to remove it, then all the better.

> And even if it was, would (void)!read_zsdata(port) fix it?

That's hideous. :D

*Much* better to just use '__always_unused' in that use-case.

> > > Cc: Greg Kroah-Hartman 
> > > Cc: Jiri Slaby 
> > > Cc: Michael Ellerman 
> > > Cc: Benjamin Herrenschmidt 
> > > Cc: Paul Mackerras 
> > > Cc: linux-ser...@vger.kernel.org
> > > Cc: linuxppc-...@lists.ozlabs.org
> > > Signed-off-by: Lee Jones 
> > > ---
> > >   drivers/tty/serial/pmac_zilog.h | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/tty/serial/pmac_zilog.h
> > > b/drivers/tty/serial/pmac_zilog.h
> > > index bb874e76810e0..968aec7c1cf82 100644
> > > --- a/drivers/tty/serial/pmac_zilog.h
> > > +++ b/drivers/tty/serial/pmac_zilog.h
> > > @@ -362,7 +362,7 @@ static inline void zssync(struct uart_pmac_port
> > > *port)
> > >   /* Misc macros */
> > >   #define ZS_CLEARERR(port)    (write_zsreg(port, 0, ERR_RES))
> > > -#define ZS_CLEARFIFO(port)   do { volatile unsigned char garbage; \
> > > +#define ZS_CLEARFIFO(port)   do { volatile unsigned char
> > > __always_unused garbage; \
> > >    garbage = read_zsdata(port); \
> > >    garbage = read_zsdata(port); \
> > >    garbage = read_zsdata(port); \
> > > 
> 
> thanks,

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v3 02/24] dt-bindings: introduce silabs,wfx.yaml

2020-11-05 Thread Jérôme Pouiller
On Wednesday 4 November 2020 20:15:54 CET Rob Herring wrote:
> On Wed, 04 Nov 2020 16:51:45 +0100, Jerome Pouiller wrote:
> > From: Jérôme Pouiller 
> >
> > Signed-off-by: Jérôme Pouiller 
> > ---
> >  .../bindings/net/wireless/silabs,wfx.yaml | 131 ++
> >  1 file changed, 131 insertions(+)
> >  create mode 100644 
> > Documentation/devicetree/bindings/net/wireless/silabs,wfx.yaml
> >
> 
> 
> My bot found errors running 'make dt_binding_check' on your patch:
> 
> yamllint warnings/errors:
> 
> dtschema/dtc warnings/errors:
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/net/wireless/silabs,wfx.yaml:
>  'additionalProperties' is a required property
> /builds/robherring/linux-dt-review/Documentation/devicetree/bindings/net/wireless/silabs,wfx.yaml:
>  ignoring, error in schema:
> warning: no schema found in file: 
> ./Documentation/devicetree/bindings/net/wireless/silabs,wfx.yaml
> 
> 
> See https://patchwork.ozlabs.org/patch/1394182
> 
> The base for the patch is generally the last rc1. Any dependencies
> should be noted.
> 
> If you already ran 'make dt_binding_check' and didn't see the above
> error(s), then make sure 'yamllint' is installed and dt-schema is up to
> date:
> 
> pip3 install dtschema --upgrade

Weird, I don't have any error. Yet, yamllint is installed (1.24.2-1) and 
pip says that dtschema is up-to-date (2020.8.2.dev2+gd63b653).

I have also tried after rebasing onto v5.10-rc2, then onto v5.10-rc1, without 
success.


-- 
Jérôme Pouiller




Re: [PATCH 31/36] powerpc: asm: hvconsole: Move 'hvc_vio_init_early's prototype to shared location

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Christophe Leroy wrote:

> 
> 
> Le 04/11/2020 à 20:35, Lee Jones a écrit :
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/tty/hvc/hvc_vio.c:385:13: warning: no previous prototype for 
> > ‘hvc_vio_init_early’ [-Wmissing-prototypes]
> >   385 | void __init hvc_vio_init_early(void)
> >   | ^~
> > 
> > Cc: Michael Ellerman 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: linuxppc-...@lists.ozlabs.org
> > Signed-off-by: Lee Jones 
> > ---
> >   arch/powerpc/include/asm/hvconsole.h | 3 +++
> >   arch/powerpc/platforms/pseries/pseries.h | 3 ---
> >   arch/powerpc/platforms/pseries/setup.c   | 1 +
> >   3 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/hvconsole.h 
> > b/arch/powerpc/include/asm/hvconsole.h
> > index 999ed5ac90531..936a1ee1ac786 100644
> > --- a/arch/powerpc/include/asm/hvconsole.h
> > +++ b/arch/powerpc/include/asm/hvconsole.h
> > @@ -24,5 +24,8 @@
> >   extern int hvc_get_chars(uint32_t vtermno, char *buf, int count);
> >   extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count);
> > +/* Provided by HVC VIO */
> > +extern void hvc_vio_init_early(void);
> > +
> 
> Declaring a prototype 'extern' is pointless. Don't add new misuse of 'extern' 
> keyword.

No new code (misuse or otherwise) is being added in this patch.

It's just moved from one place to another.

I can also strip out 'extern' if it's preferred.
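
As a small illustration of the point under discussion (a sketch with a hypothetical function name, not the hvconsole header): in C, a file-scope function declaration without a storage-class specifier has its linkage determined as if 'extern' had been written, so the two declarations below are compatible and refer to the same function.

```c
#include <assert.h>

/* With the keyword ... */
extern int toy_init_early(int x);

/* ... and without it: same function, same external linkage. */
int toy_init_early(int x);

/* The single definition satisfies both declarations. */
int toy_init_early(int x)
{
	assert(x >= 0);
	return x + 1;
}
```

Dropping the keyword from a prototype therefore changes nothing for callers; it is purely a style matter.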

> >   #endif /* __KERNEL__ */
> >   #endif /* _PPC64_HVCONSOLE_H */
> > diff --git a/arch/powerpc/platforms/pseries/pseries.h 
> > b/arch/powerpc/platforms/pseries/pseries.h
> > index 13fa370a87e4e..7be5b054dfc36 100644
> > --- a/arch/powerpc/platforms/pseries/pseries.h
> > +++ b/arch/powerpc/platforms/pseries/pseries.h
> > @@ -43,9 +43,6 @@ extern void pSeries_final_fixup(void);
> >   /* Poweron flag used for enabling auto ups restart */
> >   extern unsigned long rtas_poweron_auto;
> > -/* Provided by HVC VIO */
> > -extern void hvc_vio_init_early(void);
> > -
> >   /* Dynamic logical Partitioning/Mobility */
> >   extern void dlpar_free_cc_nodes(struct device_node *);
> >   extern void dlpar_free_cc_property(struct property *);
> > diff --git a/arch/powerpc/platforms/pseries/setup.c 
> > b/arch/powerpc/platforms/pseries/setup.c
> > index 633c45ec406da..6999b83f06612 100644
> > --- a/arch/powerpc/platforms/pseries/setup.c
> > +++ b/arch/powerpc/platforms/pseries/setup.c
> > @@ -71,6 +71,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include "pseries.h"
> >   #include "../../../../drivers/pci/pci.h"
> > 
> 
> Christophe

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[PATCH] drm/ingenic: ipu: Search for scaling coefs up to 102% of the screen

2020-11-05 Thread Paul Cercueil
Increase the scaled image's theoretical width/height until we find a
configuration that has valid scaling coefficients, up to 102% of the
screen's resolution. This makes sure that we can scale from almost
every resolution possible at the cost of a very small distortion.
The CRTC_W / CRTC_H are not modified.

This algorithm was already in place but would not try to go above the
screen's resolution, and as a result would only work if the CRTC_W /
CRTC_H were smaller than the screen resolution. It will now try until it
reaches 102% of the screen's resolution.

Signed-off-by: Paul Cercueil 
---
 drivers/gpu/drm/ingenic/ingenic-ipu.c | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c 
b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index fc8c6e970ee3..e52777ef85fd 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -516,7 +516,7 @@ static void ingenic_ipu_plane_atomic_update(struct 
drm_plane *plane,
 static int ingenic_ipu_plane_atomic_check(struct drm_plane *plane,
  struct drm_plane_state *state)
 {
-   unsigned int num_w, denom_w, num_h, denom_h, xres, yres;
+   unsigned int num_w, denom_w, num_h, denom_h, xres, yres, max_w, max_h;
struct ingenic_ipu *ipu = plane_to_ingenic_ipu(plane);
struct drm_crtc *crtc = state->crtc ?: plane->state->crtc;
struct drm_crtc_state *crtc_state;
@@ -558,19 +558,26 @@ static int ingenic_ipu_plane_atomic_check(struct 
drm_plane *plane,
xres = state->src_w >> 16;
yres = state->src_h >> 16;
 
-   /* Adjust the coefficients until we find a valid configuration */
-   for (denom_w = xres, num_w = state->crtc_w;
-num_w <= crtc_state->mode.hdisplay; num_w++)
+   /*
+* Increase the scaled image's theorical width/height until we find a
+* configuration that has valid scaling coefficients, up to 102% of the
+* screen's resolution. This makes sure that we can scale from almost
+* every resolution possible at the cost of a very small distorsion.
+* The CRTC_W / CRTC_H are not modified.
+*/
+   max_w = crtc_state->mode.hdisplay * 102 / 100;
+   max_h = crtc_state->mode.vdisplay * 102 / 100;
+
+   for (denom_w = xres, num_w = state->crtc_w; num_w <= max_w; num_w++)
if (!reduce_fraction(&num_w, &denom_w))
break;
-   if (num_w > crtc_state->mode.hdisplay)
+   if (num_w > max_w)
return -EINVAL;
 
-   for (denom_h = yres, num_h = state->crtc_h;
-num_h <= crtc_state->mode.vdisplay; num_h++)
+   for (denom_h = yres, num_h = state->crtc_h; num_h <= max_h; num_h++)
if (!reduce_fraction(&num_h, &denom_h))
break;
-   if (num_h > crtc_state->mode.vdisplay)
+   if (num_h > max_h)
return -EINVAL;
 
ipu->num_w = num_w;
-- 
2.28.0
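
For readers following the search loop, here is a minimal standalone sketch of the idea in plain C. It is not the driver code: find_scale(), gcd_u() and MAX_COEF are hypothetical names, and the rule that a ratio is valid when its reduced numerator is at most 31 is an assumption made purely for illustration.

```c
#include <assert.h>

/* Hypothetical limit on the reduced numerator; an assumption for this sketch. */
#define MAX_COEF 31u

/* Greatest common divisor (Euclid). */
static unsigned int gcd_u(unsigned int a, unsigned int b)
{
	while (b) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Search for a valid num/denom ratio, starting from the requested output
 * size 'dst' and growing it up to 'pct' percent of the screen size.
 * Returns 0 and fills num/denom on success, -1 if nothing fits.
 */
static int find_scale(unsigned int src, unsigned int dst, unsigned int screen,
		      unsigned int pct, unsigned int *num, unsigned int *denom)
{
	unsigned int max = screen * pct / 100;
	unsigned int n;

	assert(src != 0 && dst != 0);

	for (n = dst; n <= max; n++) {
		unsigned int d = gcd_u(n, src);

		if (n / d <= MAX_COEF) {
			*num = n / d;
			*denom = src / d;
			return 0;
		}
	}
	return -1;
}
```

With a 100-pixel source shown 99 pixels wide on a 99-pixel screen, a 100% cap finds nothing because 99/100 does not reduce, while a 102% cap allows 100/100, which reduces to 1/1.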



Re: [PATCH 31/36] powerpc: asm: hvconsole: Move 'hvc_vio_init_early's prototype to shared location

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Michael Ellerman wrote:

> Lee Jones  writes:
> > Fixes the following W=1 kernel build warning(s):
> >
> >  drivers/tty/hvc/hvc_vio.c:385:13: warning: no previous prototype for 
> > ‘hvc_vio_init_early’ [-Wmissing-prototypes]
> >  385 | void __init hvc_vio_init_early(void)
> >  | ^~
> >
> > Cc: Michael Ellerman 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Paul Mackerras 
> > Cc: linuxppc-...@lists.ozlabs.org
> > Signed-off-by: Lee Jones 
> > ---
> >  arch/powerpc/include/asm/hvconsole.h | 3 +++
> >  arch/powerpc/platforms/pseries/pseries.h | 3 ---
> >  arch/powerpc/platforms/pseries/setup.c   | 1 +
> >  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> Acked-by: Michael Ellerman 

Thanks.

> > diff --git a/arch/powerpc/include/asm/hvconsole.h 
> > b/arch/powerpc/include/asm/hvconsole.h
> > index 999ed5ac90531..936a1ee1ac786 100644
> > --- a/arch/powerpc/include/asm/hvconsole.h
> > +++ b/arch/powerpc/include/asm/hvconsole.h
> > @@ -24,5 +24,8 @@
> >  extern int hvc_get_chars(uint32_t vtermno, char *buf, int count);
> >  extern int hvc_put_chars(uint32_t vtermno, const char *buf, int count);
> >  
> > +/* Provided by HVC VIO */
> > +extern void hvc_vio_init_early(void);
> 
> extern isn't needed, but don't feel you need to respin just to drop it.

That's fine.  I don't mind re-spinning.
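
As a small illustration of the point about extern: in C, a function declaration at file scope has external linkage by default, so the two prototypes below declare the same function. The name early_init_demo() is made up for this sketch.

```c
#include <assert.h>

/* Both prototypes are equivalent: function declarations are extern by default. */
extern int early_init_demo(void);
int early_init_demo(void);

/* Counts how many times the function has been called. */
static int init_calls;

/* Definition matching both prototypes above. */
int early_init_demo(void)
{
	return ++init_calls;
}
```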

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [RFC] perf evlist: Warn if event group has mixed sw/hw events

2020-11-05 Thread Stephane Eranian
On Mon, Oct 26, 2020 at 7:19 AM Namhyung Kim  wrote:
>
> I found that order of events in a group impacts performance during the
> open.  If a group has a software event as a leader and has other
> hardware events, the lead needs to be moved to a hardware context.
> This includes RCU synchronization which takes about 20 msec on my
> system.  And this is just for a single group, so total time increases
> in proportion to the number of event groups and the number of cpus.
>
> On my 36 cpu system, opening 3 groups system-wide takes more than 2
> seconds.  You can see and compare it easily with the following:
>
>   $ time ./perf stat -a -e '{cs,cycles},{cs,cycles},{cs,cycles}' sleep 1
>   ...
>1.006333430 seconds time elapsed
>
>   real  0m3.969s
>   user  0m0.089s
>   sys   0m0.074s
>
>   $ time ./perf stat -a -e '{cycles,cs},{cycles,cs},{cycles,cs}' sleep 1
>   ...
>1.006755292 seconds time elapsed
>
>   real  0m1.144s
>   user  0m0.067s
>   sys   0m0.083s
>
> This patch just added a warning before running it.  I'd really want to
> fix the kernel if possible but don't have a good idea.  Thoughts?
>
This is a problem for us. This has caused problems on our systems with
perf command taking much longer than expected and firing timeouts.

The cost of perf_event_open() should not be so dependent on the order
of the events in a group. The penalty incurred by synchronize_rcu()
is very large and likely does not scale too well. Scalability may not
only be impacted by the number of CPUs of the machine. I am not an expert
at RCU but it seems it exposes perf_event_open() to penalties caused
by other subsystem operations. I am wondering if there would be a
different way of handling the change of group type that would avoid
the high cost of synchronize_rcu().
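
To make the ordering issue concrete, here is a rough userspace sketch of the kind of check such a warning could rely on: flag a group whose leader is a software event but which also contains hardware events. The struct demo_event layout and the function name are hypothetical and are not the perf tool's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified event record; not perf's struct evsel. */
struct demo_event {
	const char *name;
	bool is_software;
};

/*
 * A group is an array of events whose first entry is the leader.
 * Returns true when a software leader is followed by at least one
 * hardware event, i.e. the slow ordering such as {cs,cycles}.
 */
static bool group_has_sw_leader_with_hw(const struct demo_event *grp, size_t n)
{
	size_t i;

	if (n == 0 || !grp[0].is_software)
		return false;

	for (i = 1; i < n; i++)
		if (!grp[i].is_software)
			return true;

	return false;
}
```

Under this sketch {cs,cycles} is flagged while {cycles,cs} is not, which matches the fast and slow cases timed above.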


> Signed-off-by: Namhyung Kim 
> ---
>  tools/perf/builtin-record.c |  2 +
>  tools/perf/builtin-stat.c   |  2 +
>  tools/perf/builtin-top.c|  2 +
>  tools/perf/util/evlist.c| 78 +
>  tools/perf/util/evlist.h|  1 +
>  5 files changed, 85 insertions(+)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index adf311d15d3d..c0b08cacbae0 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -912,6 +912,8 @@ static int record__open(struct record *rec)
>
> perf_evlist__config(evlist, opts, &callchain_param);
>
> +   evlist__warn_mixed_group(evlist);
> +
> evlist__for_each_entry(evlist, pos) {
>  try_again:
> if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index b01af171d94f..d5d4e02bda69 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -738,6 +738,8 @@ static int __run_perf_stat(int argc, const char **argv, 
> int run_idx)
> if (affinity__setup(&affinity) < 0)
> return -1;
>
> +   evlist__warn_mixed_group(evsel_list);
> +
> evlist__for_each_cpu (evsel_list, i, cpu) {
> affinity__set(&affinity, cpu);
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 7c64134472c7..9ad319cea948 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1027,6 +1027,8 @@ static int perf_top__start_counters(struct perf_top 
> *top)
>
> perf_evlist__config(evlist, opts, &callchain_param);
>
> +   evlist__warn_mixed_group(evlist);
> +
> evlist__for_each_entry(evlist, counter) {
>  try_again:
> if (evsel__open(counter, top->evlist->core.cpus,
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 8bdf3d2c907c..02cff39e509e 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "parse-events.h"
>  #include 
> @@ -1980,3 +1981,80 @@ struct evsel *evlist__find_evsel(struct evlist 
> *evlist, int idx)
> }
> return NULL;
>  }
> +
> +static int *sw_types;
> +static int nr_sw_types;
> +
> +static void collect_software_pmu_types(void)
> +{
> +   const char *known_sw_pmu[] = {
> +   "software", "tracepoint", "breakpoint", "kprobe", "uprobe", 
> "msr"
> +   };
> +   DIR *dir;
> +   struct dirent *d;
> +   char path[PATH_MAX];
> +   int i;
> +
> +   if (sw_types != NULL)
> +   return;
> +
> +   nr_sw_types = ARRAY_SIZE(known_sw_pmu);
> +   sw_types = calloc(nr_sw_types, sizeof(int));
> +   if (sw_types == NULL) {
> +   pr_err("Memory allocation failed!\n");
> +   return;
> +   }
> +
> +   dir = opendir("/sys/bus/event_source/devices");
> +   while ((d = readdir(dir)) != NULL) {
> +   for (i = 0; i < nr_sw_types; i++) {
> +   if (strcmp(d->d_name, known_sw_pmu[i]))
> +   continue;
> +
> +   snp


Re: [PATCH 27/36] tty: synclinkmp: Mark never checked 'readval' as __always_unused

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 04. 11. 20, 20:35, Lee Jones wrote:
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/tty/synclinkmp.c: In function ‘init_adapter’:
> >   drivers/tty/synclinkmp.c:5167:6: warning: variable ‘readval’ set but not 
> > used [-Wunused-but-set-variable]
> > 
> > Cc: Greg Kroah-Hartman 
> > Cc: Jiri Slaby 
> > Cc: pau...@microgate.com
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/tty/synclinkmp.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/tty/synclinkmp.c b/drivers/tty/synclinkmp.c
> > index 0ca738f61a35b..75f494bfdcbed 100644
> > --- a/drivers/tty/synclinkmp.c
> > +++ b/drivers/tty/synclinkmp.c
> > @@ -5165,7 +5165,7 @@ static bool init_adapter(SLMP_INFO *info)
> > /* Set BIT30 of Local Control Reg 0x50 to reset SCA */
> > volatile u32 *MiscCtrl = (u32 *)(info->lcr_base + 0x50);
> > -   u32 readval;
> > +   u32 __always_unused readval;
> 
> Why not just remove readval completely as in other cases?

Because I don't know what the result would be.

Will the read still happen, or will the compiler optimise it away?

My changes should not affect any of the instructions, i.e. the register
read must still take place.

> And the loop can be turned into ndelay:
> 
> /*
>  * Force at least 170ns delay before clearing
>  * reset bit. Each read from LCR takes at least
>  * 30ns so 10 times for 300ns to be safe.
>  */
> for(i=0;i<10;i++)
> readval = *MiscCtrl;

Again, since I can't test this, I do not want this patch to contain
any functional changes.  AFAIC, the 10 register reads must still
happen after this patch is applied.
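
On the question of whether the reads would survive dropping the variable: in standard C an access through a volatile-qualified lvalue is a side effect, so a compiler is expected to keep each load even when the value is discarded. The sketch below shows that form in userspace. delay_by_reads() is a hypothetical helper, and it says nothing about how this particular driver or toolchain behaves, which would still need testing on hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Issue 'count' reads of a register purely for their timing side effect.
 * The (void) cast discards the value; the volatile qualifier is what
 * obliges the compiler to perform each load.
 * Returns the number of reads issued.
 */
static int delay_by_reads(volatile uint32_t *reg, int count)
{
	int i;

	for (i = 0; i < count; i++)
		(void)*reg;

	return i;
}
```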

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 08/14] media: sunxi: Add support for the A31 MIPI CSI-2 controller

2020-11-05 Thread Sakari Ailus
Hi Paul,

On Fri, Oct 23, 2020 at 07:45:40PM +0200, Paul Kocialkowski wrote:
> The A31 MIPI CSI-2 controller is a dedicated MIPI CSI-2 controller
> found on Allwinner SoCs such as the A31 and V3/V3s.
> 
> It is a standalone block, connected to the CSI controller on one side
> and to the MIPI D-PHY block on the other. It has a dedicated address
> space, interrupt line and clock.
> 
> Currently, the MIPI CSI-2 controller is hard-tied to a specific CSI
> controller (CSI0) but newer SoCs (such as the V5) may allow switching
> MIPI CSI-2 controllers between CSI controllers.
> 
> It is represented as a V4L2 subdev to the CSI controller and takes a
> MIPI CSI-2 sensor as its own subdev, all using the fwnode graph and
> media controller API.
> 
> Signed-off-by: Paul Kocialkowski 
> ---
>  drivers/media/platform/sunxi/Kconfig  |   1 +
>  drivers/media/platform/sunxi/Makefile |   1 +
>  .../platform/sunxi/sun6i-mipi-csi2/Kconfig|  11 +
>  .../platform/sunxi/sun6i-mipi-csi2/Makefile   |   4 +
>  .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c   | 635 ++
>  .../sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h   | 116 
>  6 files changed, 768 insertions(+)
>  create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig
>  create mode 100644 drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile
>  create mode 100644 
> drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c
>  create mode 100644 
> drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.h
> 
> diff --git a/drivers/media/platform/sunxi/Kconfig 
> b/drivers/media/platform/sunxi/Kconfig
> index 7151cc249afa..9684e07454ad 100644
> --- a/drivers/media/platform/sunxi/Kconfig
> +++ b/drivers/media/platform/sunxi/Kconfig
> @@ -2,3 +2,4 @@
>  
>  source "drivers/media/platform/sunxi/sun4i-csi/Kconfig"
>  source "drivers/media/platform/sunxi/sun6i-csi/Kconfig"
> +source "drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig"
> diff --git a/drivers/media/platform/sunxi/Makefile 
> b/drivers/media/platform/sunxi/Makefile
> index fc537c9f5ca9..887a7cae8fca 100644
> --- a/drivers/media/platform/sunxi/Makefile
> +++ b/drivers/media/platform/sunxi/Makefile
> @@ -2,5 +2,6 @@
>  
>  obj-y+= sun4i-csi/
>  obj-y+= sun6i-csi/
> +obj-y+= sun6i-mipi-csi2/
>  obj-y+= sun8i-di/
>  obj-y+= sun8i-rotate/
> diff --git a/drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig 
> b/drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig
> new file mode 100644
> index ..7033bda483b4
> --- /dev/null
> +++ b/drivers/media/platform/sunxi/sun6i-mipi-csi2/Kconfig
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +config VIDEO_SUN6I_MIPI_CSI2
> + tristate "Allwinner A31 MIPI CSI-2 Controller Driver"
> + depends on VIDEO_V4L2 && COMMON_CLK
> + depends on ARCH_SUNXI || COMPILE_TEST
> + select MEDIA_CONTROLLER
> + select VIDEO_V4L2_SUBDEV_API
> + select REGMAP_MMIO
> + select V4L2_FWNODE
> + help
> +Support for the Allwinner A31 MIPI CSI-2 Controller.
> diff --git a/drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile 
> b/drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile
> new file mode 100644
> index ..14e4e03818b5
> --- /dev/null
> +++ b/drivers/media/platform/sunxi/sun6i-mipi-csi2/Makefile
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +sun6i-mipi-csi2-y += sun6i_mipi_csi2.o
> +
> +obj-$(CONFIG_VIDEO_SUN6I_MIPI_CSI2) += sun6i-mipi-csi2.o
> diff --git a/drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c 
> b/drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c
> new file mode 100644
> index ..ce89c35f5b86
> --- /dev/null
> +++ b/drivers/media/platform/sunxi/sun6i-mipi-csi2/sun6i_mipi_csi2.c
> @@ -0,0 +1,635 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright 2020 Bootlin
> + * Author: Paul Kocialkowski 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "sun6i_mipi_csi2.h"
> +
> +#define MODULE_NAME  "sun6i-mipi-csi2"
> +
> +/* Core */
> +
> +static irqreturn_t sun6i_mipi_csi2_isr(int irq, void *dev_id)
> +{
> + struct sun6i_mipi_csi2_dev *cdev = (struct sun6i_mipi_csi2_dev *)dev_id;

Unnecessary casting from void *.
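
A small sketch of the point, in plain C outside the kernel: a void * converts implicitly to any object pointer type, so the assignment needs no cast. The struct demo_dev type and demo_isr() are hypothetical stand-ins for the driver's types.

```c
#include <assert.h>

/* Hypothetical device structure standing in for the driver's private data. */
struct demo_dev {
	int pending;
};

/* Handler-style function taking its context as void *, like an IRQ handler. */
static int demo_isr(int irq, void *dev_id)
{
	struct demo_dev *cdev = dev_id;	/* implicit conversion, no cast needed */

	(void)irq;
	return cdev->pending;
}
```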

> + struct regmap *regmap = cdev->regmap;
> + u32 pending;
> +
> + WARN_ONCE(1, MODULE_NAME
> +   ": Unsolicited interrupt, an error likely occurred!\n");
> +
> + regmap_read(regmap, SUN6I_MIPI_CSI2_CH_INT_PD_REG, &pending);
> + regmap_write(regmap, SUN6I_MIPI_CSI2_CH_INT_PD_REG, pending);
> +
> + /*
> +  * The interrupt can be used to catch transmission errors.
> +  * However, we currently lack plumbing for reporting that to the
> +  * A31 CSI controller driver.
> +  */
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int su

Re: [Linux-stm32] [BUG] Error applying setting, reverse things back on lot of devices

2020-11-05 Thread Ahmad Fatoum
Hello Alex,

On 11/4/20 11:50 AM, Alexandre Torgue wrote:
>> Boot up with v5.10-rc2 + your cf1ad559a2 + &pmic { regulators { 
>> vref_ddr-supply = <&reg_5v2>; }
> 
> Just to know, Did you test v5.10-rc2 + vref_ddr-supply = <&reg_5v2>; ? (which 
> seems to correspond to the patch I sent for DK/EV STM32 boards)

I did yes, it masks the issue and allows the system to boot.

-- 
Pengutronix e.K.   | |
Steuerwalder Str. 21   | http://www.pengutronix.de/  |
31137 Hildesheim, Germany  | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCH 11/14] dt-bindings: media: i2c: Add A83T MIPI CSI-2 bindings documentation

2020-11-05 Thread Sakari Ailus
Hi Paul,

On Fri, Oct 23, 2020 at 07:45:43PM +0200, Paul Kocialkowski wrote:
> This introduces YAML bindings documentation for the A83T MIPI CSI-2
> controller.
> 
> Signed-off-by: Paul Kocialkowski 
> ---
>  .../media/allwinner,sun8i-a83t-mipi-csi2.yaml | 158 ++
>  1 file changed, 158 insertions(+)
>  create mode 100644 
> Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml
> 
> diff --git 
> a/Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml 
> b/Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml
> new file mode 100644
> index ..2384ae4e7be0
> --- /dev/null
> +++ 
> b/Documentation/devicetree/bindings/media/allwinner,sun8i-a83t-mipi-csi2.yaml
> @@ -0,0 +1,158 @@
> +# SPDX-License-Identifier: GPL-2.0
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/media/allwinner,sun8i-a83t-mipi-csi2.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Allwinner A83T MIPI CSI-2 Device Tree Bindings
> +
> +maintainers:
> +  - Paul Kocialkowski 
> +
> +properties:
> +  compatible:
> +const: allwinner,sun8i-a83t-mipi-csi2
> +
> +  reg:
> +maxItems: 1
> +
> +  interrupts:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Bus Clock
> +  - description: Module Clock
> +  - description: MIPI-specific Clock
> +  - description: Misc CSI Clock
> +
> +  clock-names:
> +items:
> +  - const: bus
> +  - const: mod
> +  - const: mipi
> +  - const: misc
> +
> +  resets:
> +maxItems: 1
> +
> +  # See ./video-interfaces.txt for details
> +  ports:
> +type: object
> +
> +properties:
> +  port@0:
> +type: object
> +description: Input port, connect to a MIPI CSI-2 sensor
> +
> +properties:
> +  reg:
> +const: 0
> +
> +  endpoint:
> +type: object
> +
> +properties:
> +  remote-endpoint: true
> +
> +  bus-type:
> +const: 4

Again, if this is D-PHY only, you can remove this.

> +
> +  clock-lanes:
> +maxItems: 1
> +
> +  data-lanes:
> +minItems: 1
> +maxItems: 4

Does the device support lane reordering? If not, you can remove
clock-lanes.

> +
> +required:
> +  - bus-type
> +  - data-lanes
> +  - remote-endpoint
> +
> +additionalProperties: false
> +
> +required:
> +  - endpoint
> +
> +additionalProperties: false
> +
> +  port@1:
> +type: object
> +description: Output port, connect to a CSI controller
> +
> +properties:
> +  reg:
> +const: 1
> +
> +  endpoint:
> +type: object
> +
> +properties:
> +  remote-endpoint: true
> +
> +  bus-type:
> +const: 4

Is it a MIPI CSI-2 D-PHY -> MIPI CSI-2 D-PHY device? I call that "cable".
:-)

> +
> +additionalProperties: false
> +
> +required:
> +  - endpoint
> +
> +additionalProperties: false
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +  - clocks
> +  - clock-names
> +  - resets
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +#include 
> +#include 
> +#include 
> +
> +mipi_csi2: mipi-csi2@1cb1000 {
> +compatible = "allwinner,sun8i-a83t-mipi-csi2";
> +reg = <0x01cb1000 0x1000>;
> +interrupts = ;
> +clocks = <&ccu CLK_BUS_CSI>,
> + <&ccu CLK_CSI_SCLK>,
> + <&ccu CLK_MIPI_CSI>,
> + <&ccu CLK_CSI_MISC>;
> +clock-names = "bus", "mod", "mipi", "misc";
> +resets = <&ccu RST_BUS_CSI>;
> +
> +ports {
> +#address-cells = <1>;
> +#size-cells = <0>;
> +
> +mipi_csi2_in: port@0 {
> +reg = <0>;
> +
> +mipi_csi2_in_ov8865: endpoint {
> +bus-type = <4>; /* MIPI CSI-2 D-PHY */
> +clock-lanes = <0>;
> +data-lanes = <1 2 3 4>;
> +
> +remote-endpoint = <&ov8865_out_mipi_csi2>;
> +};
> +};
> +
> +mipi_csi2_out: port@1 {
> +reg = <1>;
> +
> +mipi_csi2_out_csi: endpoint {
> +bus-type = <4>; /* MIPI CSI-2 D-PHY */
> +remote-endpoint = <&csi_in_mipi_csi2>;
> +};
> +};
> +};
> +};
> +
> +...

-- 
Regards,

Sakari Ailus


Re: [PATCH v2 0/4] media: meson: Add support for the Amlogic GE2D Accelerator Unit

2020-11-05 Thread Neil Armstrong
Hi Linus,

On 05/11/2020 08:53, Linus Walleij wrote:
> Hi Neil,
> 
> this is just a drive-by question and I'm looping in Todd in the hopes for
> a discussion or clarification.
> 
> On Fri, Oct 30, 2020 at 3:37 PM Neil Armstrong  
> wrote:
> 
>> The GE2D is a 2D accelerator with various features like configurable blitter
>> with alpha blending, frame rotation, scaling, format conversion and 
>> colorspace
>> conversion.
>>
>> The driver implements a Memory2Memory VB2 V4L2 streaming device permitting:
>> - 0, 90, 180, 270deg rotation
>> - horizontal/vertical flipping
>> - source cropping
>> - destination compositing
>> - 32bit/24bit/16bit format conversion
>>
>> This adds the support for the GE2D version found in the AXG SoCs Family.
> 
> We are starting to see a bunch of these really nicely abstracted blitters
> and other 2D-accelerators now.

The current blitting functionality is limited to non-alpha blitting since
no standard CIDs are available for this, but we could totally try to find
common CIDs to describe the possible Alpha Blending properties for these
2D accelerators.

> 
> Is stuff like Android going to pick up and use this to blit and blend
> generic buffers?

I'm not sure this is Google's plan right now, but maybe it should be doable.

> 
> Or is this in essence a camera and/or video out accelerator thing?

No, it's really a blitter & scaler, rotate & format converter, like
the Samsung and Rockchip drivers, and somehow the Allwinner rotate driver.
Amlogic mainly uses it to copy frames being displayed for encoding,
or to convert frames from the HDMI RX on their TV SoCs.

> 
> The placement of this driver in drivers/media makes me think that
> it is for cameras or video output, but the functionality is actually
> quite generic.

It's really a memory-2-memory driver, like the video codecs, it's a separate
class than the camera & video output drivers.

> 
> I've been half-guessing that userspace like Android actually mostly
> use GPUs to composit their graphics, but IIUC this can sometimes be
> used for 2D compositing, and when used will often be quicker and/or
> more energy efficient than using a GPU for the same task.

Well drm-hwcomposer can already use the DRM universal planes and the virtual
writeback connector when available for compositing.
But this kind of driver can be really useful for display rotation for example
when the DRM driver doesn't support it.

Honestly I don't understand the Android graphics stack enough to formally answer
this question, but if it can be used, this kind of driver is much faster and
much simpler than a GPU for simple blitting and rotation.
And since they support DMA-BUF, they can totally be used in a modern graphics
pipeline.

Maybe someone could answer ? Maybe drm-hwcomposer could be extended for that ?

I know there is an issue opened in GloDroid for that:
https://github.com/GloDroid/glodroid_manifest/issues/66

For the record, I use this driver to accelerate the LVGL flush to display on the
AXG SoCs lacking a GPU, this by using DMA-BUF and DRM atomic modesetting with
the DRM LVGL display driver I submitted (and tweaked for V4L2 M2M):
https://github.com/lvgl/lv_drivers/blob/master/display/drm.c

Neil

> 
> Yours,
> Linus Walleij
> 



Re: [PATCH 2/4] clk: qcom: Add SDX55 GCC support

2020-11-05 Thread Manivannan Sadhasivam
On Wed, Nov 04, 2020 at 06:23:37PM -0800, Stephen Boyd wrote:
> Quoting Manivannan Sadhasivam (2020-10-28 00:42:30)
> > From: Naveen Yadav 
> > 
> > Add Global Clock Controller (GCC) support for SDX55 SoCs from Qualcomm.
> > 
> > Signed-off-by: Naveen Yadav 
> > [mani: converted to parent_data, commented critical clocks, cleanups]
> > Signed-off-by: Manivannan Sadhasivam 
> > ---
> >  drivers/clk/qcom/Kconfig |8 +
> >  drivers/clk/qcom/Makefile|1 +
> >  drivers/clk/qcom/gcc-sdx55.c | 1667 ++
> >  3 files changed, 1676 insertions(+)
> >  create mode 100644 drivers/clk/qcom/gcc-sdx55.c
> > 
> > diff --git a/drivers/clk/qcom/Kconfig b/drivers/clk/qcom/Kconfig
> > index 3a965bd326d5..dbca8debc06f 100644
> > --- a/drivers/clk/qcom/Kconfig
> > +++ b/drivers/clk/qcom/Kconfig
> > @@ -502,4 +502,12 @@ config KRAITCC
> >   Support for the Krait CPU clocks on Qualcomm devices.
> >   Say Y if you want to support CPU frequency scaling.
> >  
> > +config GCC_SDX55
> 
> Please sort instead of add at end.
> 
> > +   tristate "SDX55 Global Clock Controller"
> > +   depends on ARM
> 
> Why?
> 

Not needed, will remove.

> > +   help
> > + Support for the global clock controller on SDX55 devices.
> > + Say Y if you want to use peripheral devices such as UART,
> > + SPI, I2C, USB, SD/UFS, PCIe etc.
> > +
> >  endif
> > diff --git a/drivers/clk/qcom/Makefile b/drivers/clk/qcom/Makefile
> > index 11ae86febe87..3e27d67f95aa 100644
> > --- a/drivers/clk/qcom/Makefile
> > +++ b/drivers/clk/qcom/Makefile
> > @@ -75,3 +75,4 @@ obj-$(CONFIG_SPMI_PMIC_CLKDIV) += clk-spmi-pmic-div.o
> >  obj-$(CONFIG_KPSS_XCC) += kpss-xcc.o
> >  obj-$(CONFIG_QCOM_HFPLL) += hfpll.o
> >  obj-$(CONFIG_KRAITCC) += krait-cc.o
> > +obj-$(CONFIG_GCC_SDX55) += gcc-sdx55.o
> 
> Please sort this instead of add at end.
> 
> > diff --git a/drivers/clk/qcom/gcc-sdx55.c b/drivers/clk/qcom/gcc-sdx55.c
> > new file mode 100644
> > index ..75831c829202
> > --- /dev/null
> > +++ b/drivers/clk/qcom/gcc-sdx55.c
> > @@ -0,0 +1,1667 @@
> > +

[...]

> > +static const struct clk_div_table post_div_table_lucid_even[] = {
> > +   { 0x0, 1 },
> > +   { 0x1, 2 },
> > +   { 0x3, 4 },
> > +   { 0x7, 8 },
> > +   { }
> > +};
> 
> I think this table is common to all lucid plls? Maybe we can push it
> into the clk_ops somehow and stop duplicating it here?
> 

Are you referring to lucid plls in this driver? Because, this table is
not common for other SoCs. And I don't think having this way introduces
any overhead, so I'd prefer keeping it as it is.

> > +
> > +static struct clk_alpha_pll_postdiv gpll0_out_even = {
> > +   .offset = 0x0,
> > +   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID],
> > +   .post_div_shift = 8,
> > +   .post_div_table = post_div_table_lucid_even,
> > +   .num_post_div = ARRAY_SIZE(post_div_table_lucid_even),
> > +   .width = 4,
> > +   .clkr.hw.init = &(struct clk_init_data){
> > +   .name = "gpll0_out_even",
> > +   .parent_data = &(const struct clk_parent_data){
> > +   .fw_name = "gpll0",
> > +   },
> 
> If this is gpll0 in this file, then this should be a clk_hws pointer
> instead and directly pointing to the parent.
> 

Ack

> > +   .num_parents = 1,
> > +   .ops = &clk_alpha_pll_postdiv_lucid_ops,
> > +   },
> > +};
> > +
> > +static struct clk_alpha_pll gpll4 = {
> > +   .offset = 0x76000,
> > +   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID],
> > +   .vco_table = lucid_vco,
> > +   .num_vco = ARRAY_SIZE(lucid_vco),
> > +   .clkr = {
> > +   .enable_reg = 0x6d000,
> > +   .enable_mask = BIT(4),
> > +   .hw.init = &(struct clk_init_data){
> > +   .name = "gpll4",
> > +   .parent_data = &(const struct clk_parent_data){
> > +   .fw_name = "bi_tcxo",
> > +   },
> > +   .num_parents = 1,
> > +   .ops = &clk_alpha_pll_fixed_lucid_ops,
> > +   },
> > +   },
> > +};
> > +
> > +static struct clk_alpha_pll_postdiv gpll4_out_even = {
> > +   .offset = 0x76000,
> > +   .regs = clk_alpha_pll_regs[CLK_ALPHA_PLL_TYPE_LUCID],
> > +   .post_div_shift = 8,
> > +   .post_div_table = post_div_table_lucid_even,
> > +   .num_post_div = ARRAY_SIZE(post_div_table_lucid_even),
> > +   .width = 4,
> > +   .clkr.hw.init = &(struct clk_init_data){
> > +   .name = "gpll4_out_even",
> > +   .parent_data = &(const struct clk_parent_data){
> > +   .fw_name = "gpll4",
> 
> If this is gpll4 in this file, then this should be a clk_hws pointer
> instead and directly pointing to the parent.
> 

Ack

> > +   },
> > +   .num_parents = 1,
> > + 

Re: [PATCH 12/36] tty: tty_io: Fix some kernel-doc issues

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 04. 11. 20, 20:35, Lee Jones wrote:
> > Demote non-conformant headers and supply some missing descriptions.
> > 
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/tty/tty_io.c:218: warning: Function parameter or member 'file' 
> > not described in 'tty_free_file'
> >   drivers/tty/tty_io.c:566: warning: Function parameter or member 
> > 'exit_session' not described in '__tty_hangup'
> >   drivers/tty/tty_io.c:1077: warning: Function parameter or member 'tty' 
> > not described in 'tty_send_xchar'
> >   drivers/tty/tty_io.c:1077: warning: Function parameter or member 'ch' not 
> > described in 'tty_send_xchar'
> >   drivers/tty/tty_io.c:1155: warning: Function parameter or member 'file' 
> > not described in 'tty_driver_lookup_tty'
> >   drivers/tty/tty_io.c:1508: warning: Function parameter or member 'tty' 
> > not described in 'release_tty'
> >   drivers/tty/tty_io.c:1508: warning: Function parameter or member 'idx' 
> > not described in 'release_tty'
> >   drivers/tty/tty_io.c:2973: warning: Function parameter or member 'driver' 
> > not described in 'alloc_tty_struct'
> >   drivers/tty/tty_io.c:2973: warning: Function parameter or member 'idx' 
> > not described in 'alloc_tty_struct'
> > 
> > Cc: Greg Kroah-Hartman 
> > Cc: Jiri Slaby 
> > Cc: Nick Holloway 
> > Cc: -- 
> > Cc: Marko Kohtala 
> > Cc: Bill Hawes 
> > Cc: "C. Scott Ananian" 
> > Cc: Russell King 
> > Cc: Andrew Morton 
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/tty/tty_io.c | 10 +++---
> >   1 file changed, 7 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> > index 88b00c47b606e..f50286fb080da 100644
> > --- a/drivers/tty/tty_io.c
> > +++ b/drivers/tty/tty_io.c
> > @@ -2961,7 +2965,7 @@ static struct device *tty_get_device(struct 
> > tty_struct *tty)
> >   }
> > -/**
> > +/*
> >*alloc_tty_struct
> >*
> >*This subroutine allocates and initializes a tty structure.
> 
> Why do you randomly sometimes fix kernel-doc and sometimes remove functions
> from kernel-doc? What's the rule?

The decision is made quickly (I am fixing literally 1000's of these),
but the process is definitely not random.

If there has been little or no attempt to document the function, it
gets demoted.  If the developer has had a good crack at providing
descriptions and/or the header is just suffering with a little
incompleteness/doc-rot, then I'll fix it up.

Here for example, no attempt was made to provide any proper
documentation.
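
For reference, the difference between the two forms looks roughly like this. The function names are invented for the example; only the comment layout (the /** opener, the "name() - summary" line, the @param entries and a Return: section) follows the kernel-doc convention.

```c
#include <assert.h>

/**
 * demo_clamp() - limit a value to a range
 * @val: value to clamp
 * @lo: lowest value allowed
 * @hi: highest value allowed
 *
 * Every parameter is described, so a kernel-doc header is appropriate.
 *
 * Return: @val limited to the range [@lo, @hi].
 */
static int demo_clamp(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/*
 * demo_double
 *
 * Plain comment: no parameter descriptions, so it opens with a single
 * asterisk and the kernel-doc tooling ignores it.
 */
static int demo_double(int val)
{
	return val * 2;
}
```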

> For example, alloc_tty_struct is among the
> ones, I would like to see fixed instead of removed from kernel-doc.

There is nothing stopping anyone from providing said descriptions and
promoting it back up to kernel-doc.  If you have good reasons for it
to be properly documented with kernel-doc, then it should also be
referenced from /Documentation using the kernel-doc:: notation.

Also see: scripts/find-unused-docs.sh

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 08/36] tty: tty_ldisc: Fix some kernel-doc related misdemeanours

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 04. 11. 20, 20:35, Lee Jones wrote:
> >   - Functions must follow directly on from their headers
> >   - Demote non-conforming kernel-doc header
> >   - Ensure notes have unique section names
> >   - Provide missing description for 'reinit'
> > 
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/tty/tty_ldisc.c:158: warning: cannot understand function 
> > prototype: 'int tty_ldisc_autoload = IS_BUILTIN(CONFIG_LDISC_AUTOLOAD); '
> >   drivers/tty/tty_ldisc.c:199: warning: Function parameter or member 'ld' 
> > not described in 'tty_ldisc_put'
> >   drivers/tty/tty_ldisc.c:260: warning: duplicate section name 'Note'
> >   drivers/tty/tty_ldisc.c:717: warning: Function parameter or member 
> > 'reinit' not described in 'tty_ldisc_hangup'
> > 
> > Cc: Greg Kroah-Hartman 
> > Cc: Jiri Slaby 
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/tty/tty_ldisc.c | 10 +-
> >   1 file changed, 5 insertions(+), 5 deletions(-)
> > 
> > diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> > index fe37ec331289b..aced2bf6173be 100644
> > --- a/drivers/tty/tty_ldisc.c
> > +++ b/drivers/tty/tty_ldisc.c
> > @@ -190,7 +189,7 @@ static struct tty_ldisc *tty_ldisc_get(struct 
> > tty_struct *tty, int disc)
> > return ld;
> >   }
> > -/**
> > +/*
> >*tty_ldisc_put   -   release the ldisc
> 
> Having tty_ldisc_get in kernel-doc, while tty_ldisc_put not doesn't make
> much sense. What's missing to tty_ldisc_put to conform to kernel-doc?

Where are they in kernel-doc?  I don't see any references.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v2 1/8] clk: at91: sama7g5: fix compilation error

2020-11-05 Thread Tudor.Ambarus
On 11/4/20 7:45 PM, Claudiu Beznea wrote:
> pmc_data_allocate() has been changed. pmc_data_free() was removed.
> Adapt the code taking this into consideration. With this the programmable
> clocks were also saved in sama7g5_pmc so that they could be later
> referenced.
> 
> Fixes: cb783bbbcf54 ("clk: at91: sama7g5: add clock support for sama7g5")
> Signed-off-by: Claudiu Beznea 

Reviewed-by: Tudor Ambarus 
Tested-by: Tudor Ambarus 

> ---
>  drivers/clk/at91/sama7g5.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/clk/at91/sama7g5.c b/drivers/clk/at91/sama7g5.c
> index 0db2ab3eca14..a092a940baa4 100644
> --- a/drivers/clk/at91/sama7g5.c
> +++ b/drivers/clk/at91/sama7g5.c
> @@ -838,7 +838,7 @@ static void __init sama7g5_pmc_setup(struct device_node 
> *np)
>   sama7g5_pmc = pmc_data_allocate(PMC_I2S1_MUX + 1,
>   nck(sama7g5_systemck),
>   nck(sama7g5_periphck),
> - nck(sama7g5_gck));
> + nck(sama7g5_gck), 8);
>   if (!sama7g5_pmc)
>   return;
>  
> @@ -980,6 +980,8 @@ static void __init sama7g5_pmc_setup(struct device_node 
> *np)
>   sama7g5_prog_mux_table);
>   if (IS_ERR(hw))
>   goto err_free;
> +
> + sama7g5_pmc->pchws[i] = hw;
>   }
>  
>   for (i = 0; i < ARRAY_SIZE(sama7g5_systemck); i++) {
> @@ -1052,7 +1054,7 @@ static void __init sama7g5_pmc_setup(struct device_node 
> *np)
>   kfree(alloc_mem);
>   }
>  
> - pmc_data_free(sama7g5_pmc);
> + kfree(sama7g5_pmc);
>  }
>  
>  /* Some clks are used for a clocksource */
> 



Re: [PATCH 34/36] tty: serial: pmac_zilog: Make disposable variable __always_unused

2020-11-05 Thread Jiri Slaby

On 05. 11. 20, 9:36, Lee Jones wrote:
> On Thu, 05 Nov 2020, Jiri Slaby wrote:
>
> > On 05. 11. 20, 8:04, Christophe Leroy wrote:
> > >
> > > On 04/11/2020 at 20:35, Lee Jones wrote:
> > > > Fixes the following W=1 kernel build warning(s):
> > > >
> > > >    drivers/tty/serial/pmac_zilog.h:365:58: warning: variable
> > > > ‘garbage’ set but not used [-Wunused-but-set-variable]
> > >
> > > Explain how you are fixing this warning.
> > >
> > > Setting __always_unused is usually not the good solution for fixing
> > > this warning, but here I guess this is likely the good solution. But it
> > > should be explained why.
>
> There are normally 3 ways to fix this warning;
>
>   - Start using/checking the variable/result
>   - Remove the variable
>   - Mark it as __{always,maybe}_unused
>
> The latter just tells the compiler that not checking the resultant
> value is intentional.  There are some functions (as Jiri mentions
> below) which are marked as '__must_check' which *require* a dummy
> (garbage) variable to be used.
>
> > Or, why is the "garbage =" needed in the first place? read_zsdata is not
> > defined with __warn_unused_result__.
>
> I used '__always_unused' here for fear of breaking something.
>
> However, if it's safe to remove it, then all the better.

Yes please -- this "garbage" is one of the examples of volatile misuses.
If readb didn't work on volatile pointer, marking the return variable as
volatile wouldn't save it.

> > And even if it was, would (void)!read_zsdata(port) fix it?
>
> That's hideous. :D

Sure, marking reads as must_check would be insane.

> *Much* better to just use '__always_unused' in that use-case.

Then using a dummy variable to fool must_check must mean must_check is
used incorrectly, no :)? But there are always exceptions…

thanks,
--
js
suse labs
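
The three options listed in the thread can be sketched in plain userspace C.
This is an illustration under stated assumptions, not the pmac_zilog code:
fake_data_reg, read_reg() and the drain_*() helpers are hypothetical, and
the GCC/Clang attribute is spelled out directly because the kernel's
__always_unused macro is defined as __attribute__((__unused__)).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a device data register; hypothetical, for illustration. */
static volatile uint8_t fake_data_reg = 0x5a;
static unsigned int reads_issued;

/* The read has a side effect (it is counted), like draining a FIFO. */
static uint8_t read_reg(void)
{
	reads_issued++;
	return fake_data_reg;
}

/* Option 1: start using the value that was read. */
uint8_t drain_and_return_last(int n)
{
	uint8_t last = 0;

	for (int i = 0; i < n; i++)
		last = read_reg();
	return last;
}

/* Option 2: remove the variable and discard the result explicitly. */
void drain_discard(int n)
{
	for (int i = 0; i < n; i++)
		(void)read_reg();
}

/* Option 3: keep the variable but tell the compiler it is unused on purpose. */
void drain_marked_unused(int n)
{
	uint8_t garbage __attribute__((unused));

	for (int i = 0; i < n; i++)
		garbage = read_reg();
}

/* Accessor so callers can confirm the reads really happened. */
unsigned int reads_so_far(void)
{
	assert(reads_issued >= 0u);
	return reads_issued;
}
```

All three variants issue the same number of reads; only the treatment of
the result differs. Option 2 is what Jiri asks for above. The
(void)!read_zsdata(port) form mentioned in the thread only matters for
functions marked warn_unused_result, where GCC does not accept a bare
(void) cast as "using" the result.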


[PATCH v21 00/19] per memcg lru lock

2020-11-05 Thread Alex Shi
This version is rebased on next/master 20201104, with many of Johannes's
Acks and some changes made in response to Johannes's comments. It also adds
a new patch,
v21-0006-mm-rmap-stop-store-reordering-issue-on-page-mapp.patch, to support
v21-0007.

This patchset follows the 2 memcg VM_WARN_ON_ONCE_PAGE patches which were
added to the -mm tree yesterday.

Many thanks to Hugh Dickins, Alexander Duyck and Johannes Weiner for their
line-by-line review.

This patchset now has 3 parts:
1, some code cleanup and minor optimization as preparation.
2, use TestClearPageLRU as the precondition for page isolation.
3, replace the per node lru_lock with a per memcg per node lru_lock.

Currently there is one lru_lock per node, pgdat->lru_lock, guarding the
lru lists, even though the lru lists were moved into memcg a long time ago.
A per node lru_lock clearly does not scale: pages in every memcg have to
compete with each other for the single lru_lock. This patchset replaces the
per node lru lock with a per lruvec/memcg lru_lock to guard the lru lists,
which makes them scalable across memcgs and gives a performance gain.

Currently lru_lock guards both the lru list and the page's lru bit, which
is fine. But if we want to take the specific lruvec lock for a page, we need
to pin down the page's lruvec/memcg while locking. Just taking the lruvec
lock first can be undermined by the page's memcg charge/migration. To fix
this, we pull the clearing of the page's lru bit out and use it as the
pin-down action that blocks memcg changes. That is the reason for the new
atomic func TestClearPageLRU. So isolating a page now needs both actions:
TestClearPageLRU and holding the lru_lock.

The typical user is isolate_migratepages_block() in compaction.c: it has
to take the lru bit before the lru lock, which serializes page isolation
against memcg page charge/migration, since that changes the page's lruvec
and hence the lru_lock it lives under.
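
The "clear the lru bit first, then lock" ordering can be modelled in
userspace with C11 atomics. This is a minimal sketch under loud
assumptions: demo_page, demo_lruvec and every demo_*() function are
hypothetical stand-ins, the spinlock is a toy, and the real kernel paths
(isolate_lru_page(), memcg move) carry far more state. It only shows why
whoever wins the test-and-clear can trust page->lruvec while taking the
lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_PG_LRU 0x1u

struct demo_lruvec {
	atomic_flag lock;	/* toy spinlock standing in for lru_lock */
	int nr_pages;
};

struct demo_page {
	atomic_uint flags;
	struct demo_lruvec *lruvec;	/* changed by a "memcg move" */
};

void demo_lruvec_init(struct demo_lruvec *l, int nr_pages)
{
	atomic_flag_clear(&l->lock);
	l->nr_pages = nr_pages;
}

void demo_page_init(struct demo_page *p, struct demo_lruvec *l)
{
	atomic_init(&p->flags, DEMO_PG_LRU);
	p->lruvec = l;
}

/* Atomically clear the lru bit; true only for the caller that cleared it. */
static bool demo_test_clear_lru(struct demo_page *p)
{
	unsigned int old = atomic_fetch_and(&p->flags, ~DEMO_PG_LRU);

	return (old & DEMO_PG_LRU) != 0;
}

static void demo_lock(struct demo_lruvec *l)
{
	while (atomic_flag_test_and_set(&l->lock))
		;	/* spin */
}

static void demo_unlock(struct demo_lruvec *l)
{
	atomic_flag_clear(&l->lock);
}

/* Isolation: win the lru bit first, only then look up and lock the lruvec. */
bool demo_isolate_page(struct demo_page *p)
{
	struct demo_lruvec *l;

	if (!demo_test_clear_lru(p))
		return false;	/* someone else owns the page right now */

	l = p->lruvec;		/* stable: movers need the bit we hold */
	demo_lock(l);
	l->nr_pages--;
	demo_unlock(l);
	return true;
}

/* A memcg move must win the same bit, so it cannot race with isolation. */
bool demo_move_page(struct demo_page *p, struct demo_lruvec *to)
{
	struct demo_lruvec *from;

	if (!demo_test_clear_lru(p))
		return false;

	from = p->lruvec;
	assert(from != to);
	demo_lock(from);
	from->nr_pages--;
	demo_unlock(from);

	p->lruvec = to;
	demo_lock(to);
	to->nr_pages++;
	demo_unlock(to);

	atomic_fetch_or(&p->flags, DEMO_PG_LRU);	/* back on an lru list */
	return true;
}
```

Once a page has been isolated in this model, both a second isolation and
a move fail until something sets the bit again, which is the
serialization the cover letter describes.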

The above solution was suggested by Johannes Weiner, and this patchset is
based on his new memcg charge path. (Hugh Dickins tested it and contributed
much code, from the compaction fix to general code polish, thanks a lot!)

Daniel Jordan's testing shows a 62% improvement on a modified readtwice case
on his 2P * 10 core * 2 HT broadwell box on v18, which does not differ much
from this v20.
https://lore.kernel.org/lkml/20200915165807.kpp7uhiw7l3lo...@ca-dmjordan1.us.oracle.com/

Thanks to Hugh Dickins and Konstantin Khlebnikov, who both brought up this
idea 8 years ago, and to the others who gave comments as well: Daniel Jordan,
Mel Gorman, Shakeel Butt, Matthew Wilcox, Alexander Duyck etc.

Thanks for testing support from Intel 0day and Rong Chen, Fengguang Wu,
and Yun Wang. Hugh Dickins also shared his kbuild-swap case. Thanks!


Alex Shi (16):
  mm/thp: move lru_add_page_tail func to huge_memory.c
  mm/thp: use head for head page in lru_add_page_tail
  mm/thp: Simplify lru_add_page_tail()
  mm/thp: narrow lru locking
  mm/vmscan: remove unnecessary lruvec adding
  mm/rmap: stop store reordering issue on page->mapping
  mm/memcg: add debug checking in lock_page_memcg
  mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn
  mm/lru: move lock into lru_note_cost
  mm/vmscan: remove lruvec reget in move_pages_to_lru
  mm/mlock: remove lru_lock on TestClearPageMlocked
  mm/mlock: remove __munlock_isolate_lru_page
  mm/lru: introduce TestClearPageLRU
  mm/compaction: do page isolation first in compaction
  mm/swap.c: serialize memcg changes in pagevec_lru_move_fn
  mm/lru: replace pgdat lru_lock with lruvec lock

Alexander Duyck (1):
  mm/lru: introduce the relock_page_lruvec function

Hugh Dickins (2):
  mm: page_idle_get_page() does not need lru_lock
  mm/lru: revise the comments of lru_lock

 Documentation/admin-guide/cgroup-v1/memcg_test.rst |  15 +-
 Documentation/admin-guide/cgroup-v1/memory.rst |  21 +--
 Documentation/trace/events-kmem.rst|   2 +-
 Documentation/vm/unevictable-lru.rst   |  22 +--
 include/linux/memcontrol.h | 110 +++
 include/linux/mm_types.h   |   2 +-
 include/linux/mmzone.h |   6 +-
 include/linux/page-flags.h |   1 +
 include/linux/swap.h   |   4 +-
 mm/compaction.c|  94 +++---
 mm/filemap.c   |   4 +-
 mm/huge_memory.c   |  45 +++--
 mm/memcontrol.c|  79 +++-
 mm/mlock.c |  63 ++-
 mm/mmzone.c|   1 +
 mm/page_alloc.c|   1 -
 mm/page_idle.c |   4 -
 mm/rmap.c  |  11 +-
 mm/swap.c  | 208 -
 mm/vmscan.c| 207 ++--
 mm/workingset.c

Re: [PATCH 08/36] tty: tty_ldisc: Fix some kernel-doc related misdemeanours

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Lee Jones wrote:

> On Thu, 05 Nov 2020, Jiri Slaby wrote:
> 
> > On 04. 11. 20, 20:35, Lee Jones wrote:
> > >   - Functions must follow directly on from their headers
> > >   - Demote non-conforming kernel-doc header
> > >   - Ensure notes have unique section names
> > >   - Provide missing description for 'reinit'
> > > 
> > > Fixes the following W=1 kernel build warning(s):
> > > 
> > >   drivers/tty/tty_ldisc.c:158: warning: cannot understand function 
> > > prototype: 'int tty_ldisc_autoload = IS_BUILTIN(CONFIG_LDISC_AUTOLOAD); '
> > >   drivers/tty/tty_ldisc.c:199: warning: Function parameter or member 'ld' 
> > > not described in 'tty_ldisc_put'
> > >   drivers/tty/tty_ldisc.c:260: warning: duplicate section name 'Note'
> > >   drivers/tty/tty_ldisc.c:717: warning: Function parameter or member 
> > > 'reinit' not described in 'tty_ldisc_hangup'
> > > 
> > > Cc: Greg Kroah-Hartman 
> > > Cc: Jiri Slaby 
> > > Signed-off-by: Lee Jones 
> > > ---
> > >   drivers/tty/tty_ldisc.c | 10 +-
> > >   1 file changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/tty/tty_ldisc.c b/drivers/tty/tty_ldisc.c
> > > index fe37ec331289b..aced2bf6173be 100644
> > > --- a/drivers/tty/tty_ldisc.c
> > > +++ b/drivers/tty/tty_ldisc.c
> > > @@ -190,7 +189,7 @@ static struct tty_ldisc *tty_ldisc_get(struct 
> > > tty_struct *tty, int disc)
> > >   return ld;
> > >   }
> > > -/**
> > > +/*
> > >*  tty_ldisc_put   -   release the ldisc
> > 
> > Having tty_ldisc_get in kernel-doc, while tty_ldisc_put not doesn't make
> > much sense. What's missing to tty_ldisc_put to conform to kernel-doc?
> 
> Where are they in kernel-doc?  I don't see any references.

Also:

 $ ./scripts/find-unused-docs.sh drivers/tty/
 The following files contain kerneldoc comments for exported functions that are 
not used in the formatted documentation
 drivers/tty/n_tracesink.c
 drivers/tty/tty_baudrate.c
 drivers/tty/serial/8250/8250_port.c
 drivers/tty/serdev/core.c
 drivers/tty/vt/keyboard.c
 drivers/tty/vt/selection.c
 drivers/tty/vt/consolemap.c
 drivers/tty/vt/vt.c
 drivers/tty/tty_jobctrl.c
 drivers/tty/tty_buffer.c
 drivers/tty/n_tty.c
 drivers/tty/hvc/hvc_console.c
 drivers/tty/tty_ioctl.c
 drivers/tty/sysrq.c
 drivers/tty/tty_ldisc.c<-
 drivers/tty/tty_io.c
 drivers/tty/tty_port.c

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


[PATCH v21 15/19] mm/compaction: do page isolation first in compaction

2020-11-05 Thread Alex Shi
Currently, compaction takes the lru_lock and then does page isolation,
which works fine with pgdat->lru_lock, since any page isolation has to
compete for that lru_lock. If we want to change to a memcg lru_lock, we
have to isolate the page before taking the lru_lock, so that isolation
blocks changes to the page's memcg, which rely on page isolation too. Then
we can safely use the per memcg lru_lock later.

The new page isolation uses the previously introduced TestClearPageLRU() +
pgdat lru locking, which will be changed to the memcg lru lock later.

Hugh Dickins fixed the following bugs in an early version of this patch:

Fix lots of crashes under compaction load: isolate_migratepages_block()
must clean up appropriately when rejecting a page, setting PageLRU again
if it had been cleared; and a put_page() after get_page_unless_zero()
cannot safely be done while holding locked_lruvec - it may turn out to
be the final put_page(), which will take an lruvec lock when PageLRU.
And move __isolate_lru_page_prepare back after get_page_unless_zero to
make trylock_page() safe:
trylock_page() is not safe to use at this time: its setting PG_locked
can race with the page being freed or allocated ("Bad page"), and can
also erase flags being set by one of those "sole owners" of a freshly
allocated page who use non-atomic __SetPageFlag().

Suggested-by: Johannes Weiner 
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/swap.h |  2 +-
 mm/compaction.c  | 42 +-
 mm/vmscan.c  | 43 ++-
 3 files changed, 56 insertions(+), 31 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 5e1e967c225f..596bc2f4d9b0 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -356,7 +356,7 @@ extern void lru_cache_add_inactive_or_unevictable(struct 
page *page,
 extern unsigned long zone_reclaimable_pages(struct zone *zone);
 extern unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
gfp_t gfp_mask, nodemask_t *mask);
-extern int __isolate_lru_page(struct page *page, isolate_mode_t mode);
+extern int __isolate_lru_page_prepare(struct page *page, isolate_mode_t mode);
 extern unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
  unsigned long nr_pages,
  gfp_t gfp_mask,
diff --git a/mm/compaction.c b/mm/compaction.c
index ee1f8439369e..7b1cf48884dd 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -886,6 +886,7 @@ static bool too_many_isolated(pg_data_t *pgdat)
if (!valid_page && IS_ALIGNED(low_pfn, pageblock_nr_pages)) {
if (!cc->ignore_skip_hint && get_pageblock_skip(page)) {
low_pfn = end_pfn;
+   page = NULL;
goto isolate_abort;
}
valid_page = page;
@@ -967,6 +968,21 @@ static bool too_many_isolated(pg_data_t *pgdat)
if (!(cc->gfp_mask & __GFP_FS) && page_mapping(page))
goto isolate_fail;
 
+   /*
+* Be careful not to clear PageLRU until after we're
+* sure the page is not being freed elsewhere -- the
+* page release code relies on it.
+*/
+   if (unlikely(!get_page_unless_zero(page)))
+   goto isolate_fail;
+
+   if (__isolate_lru_page_prepare(page, isolate_mode) != 0)
+   goto isolate_fail_put;
+
+   /* Try isolate the page */
+   if (!TestClearPageLRU(page))
+   goto isolate_fail_put;
+
/* If we already hold the lock, we can skip some rechecking */
if (!locked) {
locked = compact_lock_irqsave(&pgdat->lru_lock,
@@ -979,10 +995,6 @@ static bool too_many_isolated(pg_data_t *pgdat)
goto isolate_abort;
}
 
-   /* Recheck PageLRU and PageCompound under lock */
-   if (!PageLRU(page))
-   goto isolate_fail;
-
/*
 * Page become compound since the non-locked check,
 * and it's on LRU. It can only be a THP so the order
@@ -990,16 +1002,13 @@ static bool too_many_isolated(pg_data_t *pgdat)
 */
if (unlikely(PageCompound(page) && !cc->alloc_contig)) {
low_pfn += compound_nr(page) - 1;
-   goto isolate_fail;
+   SetPageLRU(page);
+   goto isolate_

[PATCH v21 05/19] mm/vmscan: remove unnecessary lruvec adding

2020-11-05 Thread Alex Shi
We don't have to add a freeable page to the lru and then remove it again.
This change saves a couple of actions and makes the move clearer.

The SetPageLRU needs to be kept before put_page_testzero for list
integrity, otherwise:

  #0 move_pages_to_lru             #1 release_pages
  if !put_page_testzero
                                   if (put_page_testzero())
                                     !PageLRU //skip lru_lock
    SetPageLRU()
    list_add(&page->lru,)
                                     list_add(&page->lru,)

[a...@linux-foundation.org: coding style fixes]
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Tejun Heo 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/vmscan.c | 38 +-
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 12a4873942e2..b9935668d121 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1852,26 +1852,30 @@ static unsigned noinline_for_stack 
move_pages_to_lru(struct lruvec *lruvec,
while (!list_empty(list)) {
page = lru_to_page(list);
VM_BUG_ON_PAGE(PageLRU(page), page);
+   list_del(&page->lru);
if (unlikely(!page_evictable(page))) {
-   list_del(&page->lru);
spin_unlock_irq(&pgdat->lru_lock);
putback_lru_page(page);
spin_lock_irq(&pgdat->lru_lock);
continue;
}
-   lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
+   /*
+* The SetPageLRU needs to be kept here for list integrity.
+* Otherwise:
+*   #0 move_pages_to_lru #1 release_pages
+*   if !put_page_testzero
+*if (put_page_testzero())
+*  !PageLRU //skip lru_lock
+* SetPageLRU()
+* list_add(&page->lru,)
+*list_add(&page->lru,)
+*/
SetPageLRU(page);
-   lru = page_lru(page);
 
-   nr_pages = thp_nr_pages(page);
-   update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
-   list_move(&page->lru, &lruvec->lists[lru]);
-
-   if (put_page_testzero(page)) {
+   if (unlikely(put_page_testzero(page))) {
__ClearPageLRU(page);
__ClearPageActive(page);
-   del_page_from_lru_list(page, lruvec, lru);
 
if (unlikely(PageCompound(page))) {
spin_unlock_irq(&pgdat->lru_lock);
@@ -1879,11 +1883,19 @@ static unsigned noinline_for_stack 
move_pages_to_lru(struct lruvec *lruvec,
spin_lock_irq(&pgdat->lru_lock);
} else
list_add(&page->lru, &pages_to_free);
-   } else {
-   nr_moved += nr_pages;
-   if (PageActive(page))
-   workingset_age_nonresident(lruvec, nr_pages);
+
+   continue;
}
+
+   lruvec = mem_cgroup_page_lruvec(page, pgdat);
+   lru = page_lru(page);
+   nr_pages = thp_nr_pages(page);
+
+   update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
+   list_add(&page->lru, &lruvec->lists[lru]);
+   nr_moved += nr_pages;
+   if (PageActive(page))
+   workingset_age_nonresident(lruvec, nr_pages);
}
 
/*
-- 
1.8.3.1



Re: [RFT PATCH v2 7/8] gpio: exar: switch to using regmap

2020-11-05 Thread Bartosz Golaszewski
On Wed, Nov 4, 2020 at 9:35 PM Andy Shevchenko
 wrote:
>
> On Wed, Nov 4, 2020 at 9:34 PM Bartosz Golaszewski  wrote:
>
> ...
>
> > +static const struct regmap_config exar_regmap_config = {
> > +   .name   = "exar-gpio",
> > +   .reg_bits   = 8,
> > +   .val_bits   = 8,
> > +};
>
> Looking at gpio-pca953xx regmap conversion I'm wondering shouldn't you
> provide a callback to define volatile registers (such as GPIO input
> bits)?
>
> --
> With Best Regards,
> Andy Shevchenko

I think this was done in pca953x due to weird calculations of banks
and registers. For a rather simple driver like this one I don't think
this is needed.

Bartosz


[PATCH v21 12/19] mm/mlock: remove lru_lock on TestClearPageMlocked

2020-11-05 Thread Alex Shi
In the func munlock_vma_page, the comments mentioned that lru_lock is
needed for serialization with split_huge_pages. But the page must be
PageLocked, as must pages in the split_huge_page series of funcs. Thus
PageLocked is enough to serialize both funcs.

Furthermore, Hugh Dickins pointed out: before splitting in
split_huge_page_to_list, the page was unmap_page()ed to remove the pmd/ptes
which protect the page from munlock. Thus there is no need to guard
__split_huge_page_tail for mlock clean; just keep the lru_lock there for
isolation purposes.

LKP found a preempt issue with __mod_zone_page_state, which needs to be
changed to mod_zone_page_state. Thanks!

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Kirill A. Shutemov 
Cc: Vlastimil Babka 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/mlock.c | 26 +-
 1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 884b1216da6a..796c726a0407 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -187,40 +187,24 @@ static void __munlock_isolation_failed(struct page *page)
 unsigned int munlock_vma_page(struct page *page)
 {
int nr_pages;
-   pg_data_t *pgdat = page_pgdat(page);
 
/* For try_to_munlock() and to serialize with page migration */
BUG_ON(!PageLocked(page));
-
VM_BUG_ON_PAGE(PageTail(page), page);
 
-   /*
-* Serialize with any parallel __split_huge_page_refcount() which
-* might otherwise copy PageMlocked to part of the tail pages before
-* we clear it in the head page. It also stabilizes thp_nr_pages().
-*/
-   spin_lock_irq(&pgdat->lru_lock);
-
if (!TestClearPageMlocked(page)) {
/* Potentially, PTE-mapped THP: do not skip the rest PTEs */
-   nr_pages = 1;
-   goto unlock_out;
+   return 0;
}
 
nr_pages = thp_nr_pages(page);
-   __mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
+   mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
 
-   if (__munlock_isolate_lru_page(page, true)) {
-   spin_unlock_irq(&pgdat->lru_lock);
+   if (!isolate_lru_page(page))
__munlock_isolated_page(page);
-   goto out;
-   }
-   __munlock_isolation_failed(page);
-
-unlock_out:
-   spin_unlock_irq(&pgdat->lru_lock);
+   else
+   __munlock_isolation_failed(page);
 
-out:
return nr_pages - 1;
 }
 
-- 
1.8.3.1



[PATCH v21 03/19] mm/thp: Simplify lru_add_page_tail()

2020-11-05 Thread Alex Shi
Simplify lru_add_page_tail(), there are actually only two cases possible:
split_huge_page_to_list(), with list supplied and head isolated from lru
by its caller; or split_huge_page(), with NULL list and head on lru -
because when head is racily isolated from lru, the isolator's reference
will stop the split from getting any further than its page_ref_freeze().

So decide between the two cases by "list", but add VM_WARN_ON()s to
verify that they match our lru expectations.

[Hugh Dickins: rewrite commit log]
Signed-off-by: Alex Shi 
Reviewed-by: Kirill A. Shutemov 
Acked-by: Hugh Dickins 
Cc: Kirill A. Shutemov 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: Mika Penttilä 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/huge_memory.c | 20 ++--
 1 file changed, 6 insertions(+), 14 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 60726eb26840..79318d7f7d5d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2356,24 +2356,16 @@ static void lru_add_page_tail(struct page *head, struct 
page *tail,
VM_BUG_ON_PAGE(PageLRU(tail), head);
lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
 
-   if (!list)
-   SetPageLRU(tail);
-
-   if (likely(PageLRU(head)))
-   list_add_tail(&tail->lru, &head->lru);
-   else if (list) {
+   if (list) {
/* page reclaim is reclaiming a huge page */
+   VM_WARN_ON(PageLRU(head));
get_page(tail);
list_add_tail(&tail->lru, list);
} else {
-   /*
-* Head page has not yet been counted, as an hpage,
-* so we must account for each subpage individually.
-*
-* Put tail on the list at the correct position
-* so they all end up in order.
-*/
-   add_page_to_lru_list_tail(tail, lruvec, page_lru(tail));
+   /* head is still on lru (and we have it frozen) */
+   VM_WARN_ON(!PageLRU(head));
+   SetPageLRU(tail);
+   list_add_tail(&tail->lru, &head->lru);
}
 }
 
-- 
1.8.3.1



[PATCH v21 18/19] mm/lru: introduce the relock_page_lruvec function

2020-11-05 Thread Alex Shi
From: Alexander Duyck 

Use this new function to replace repeated identical code; no functional
change.

When testing for relock we can avoid the need for RCU locking if we simply
compare the page pgdat and memcg pointers versus those that the lruvec is
holding. By doing this we can avoid the extra pointer walks and accesses of
the memory cgroup.

In addition we can avoid the checks entirely if lruvec is currently NULL.
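
The relock idea can be sketched outside the kernel as follows. All names
here are hypothetical, and the model is simplified: each demo_page points
straight at its demo_lruvec, whereas the patch compares the page's pgdat
and memcg against the ones the held lruvec belongs to. The point
illustrated is only the control flow: keep the lock when the next page
belongs to the lruvec already held, skip the comparison when nothing is
held, and otherwise drop and retake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy lruvec: "locked" models lru_lock, lock_count counts acquisitions. */
struct demo_lruvec {
	bool locked;
	unsigned int lock_count;
};

struct demo_page {
	struct demo_lruvec *lruvec;
};

static struct demo_lruvec *demo_lock_page_lruvec(struct demo_page *page)
{
	struct demo_lruvec *lruvec = page->lruvec;

	assert(!lruvec->locked);	/* no recursive locking in this model */
	lruvec->locked = true;
	lruvec->lock_count++;
	return lruvec;
}

static void demo_unlock_page_lruvec(struct demo_lruvec *lruvec)
{
	assert(lruvec->locked);
	lruvec->locked = false;
}

/* Don't lock again if the page's lruvec is the one already held. */
struct demo_lruvec *demo_relock_page_lruvec(struct demo_page *page,
					    struct demo_lruvec *locked_lruvec)
{
	if (locked_lruvec) {
		if (locked_lruvec == page->lruvec)
			return locked_lruvec;

		demo_unlock_page_lruvec(locked_lruvec);
	}

	return demo_lock_page_lruvec(page);
}

/* Walk a batch of pages; returns how many times a lock had to be taken. */
unsigned int demo_walk_pages(struct demo_page *pages, size_t n)
{
	struct demo_lruvec *locked = NULL;
	unsigned int acquisitions = 0;

	for (size_t i = 0; i < n; i++) {
		struct demo_lruvec *next;

		next = demo_relock_page_lruvec(&pages[i], locked);
		if (next != locked)
			acquisitions++;
		locked = next;
	}

	if (locked)
		demo_unlock_page_lruvec(locked);
	return acquisitions;
}
```

For a batch whose pages are grouped by lruvec, the number of lock
acquisitions is the number of runs rather than the number of pages, which
is where the saving comes from.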

Signed-off-by: Alexander Duyck 
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Johannes Weiner 
Cc: Andrew Morton 
Cc: Thomas Gleixner 
Cc: Andrey Ryabinin 
Cc: Matthew Wilcox 
Cc: Mel Gorman 
Cc: Konstantin Khlebnikov 
Cc: Hugh Dickins 
Cc: Tejun Heo 
Cc: linux-kernel@vger.kernel.org
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/memcontrol.h | 52 ++
 mm/mlock.c | 11 +-
 mm/swap.c  | 33 +++--
 mm/vmscan.c| 12 ++-
 4 files changed, 62 insertions(+), 46 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 6ecb08ff4ad1..ba4050154fea 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -660,6 +660,22 @@ static inline struct lruvec *mem_cgroup_lruvec(struct 
mem_cgroup *memcg,
 
 struct lruvec *mem_cgroup_page_lruvec(struct page *, struct pglist_data *);
 
+static inline bool lruvec_holds_page_lru_lock(struct page *page,
+ struct lruvec *lruvec)
+{
+   pg_data_t *pgdat = page_pgdat(page);
+   const struct mem_cgroup *memcg;
+   struct mem_cgroup_per_node *mz;
+
+   if (mem_cgroup_disabled())
+   return lruvec == &pgdat->__lruvec;
+
+   mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+   memcg = page->mem_cgroup ? : root_mem_cgroup;
+
+   return lruvec->pgdat == pgdat && mz->memcg == memcg;
+}
+
 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
 
 struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm);
@@ -1221,6 +1237,14 @@ static inline struct lruvec 
*mem_cgroup_page_lruvec(struct page *page,
return &pgdat->__lruvec;
 }
 
+static inline bool lruvec_holds_page_lru_lock(struct page *page,
+ struct lruvec *lruvec)
+{
+   pg_data_t *pgdat = page_pgdat(page);
+
+   return lruvec == &pgdat->__lruvec;
+}
+
 static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
 {
return NULL;
@@ -1663,6 +1687,34 @@ static inline void unlock_page_lruvec_irqrestore(struct 
lruvec *lruvec,
spin_unlock_irqrestore(&lruvec->lru_lock, flags);
 }
 
+/* Don't lock again iff page's lruvec locked */
+static inline struct lruvec *relock_page_lruvec_irq(struct page *page,
+   struct lruvec *locked_lruvec)
+{
+   if (locked_lruvec) {
+   if (lruvec_holds_page_lru_lock(page, locked_lruvec))
+   return locked_lruvec;
+
+   unlock_page_lruvec_irq(locked_lruvec);
+   }
+
+   return lock_page_lruvec_irq(page);
+}
+
+/* Don't lock again iff page's lruvec locked */
+static inline struct lruvec *relock_page_lruvec_irqsave(struct page *page,
+   struct lruvec *locked_lruvec, unsigned long *flags)
+{
+   if (locked_lruvec) {
+   if (lruvec_holds_page_lru_lock(page, locked_lruvec))
+   return locked_lruvec;
+
+   unlock_page_lruvec_irqrestore(locked_lruvec, *flags);
+   }
+
+   return lock_page_lruvec_irqsave(page, flags);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
diff --git a/mm/mlock.c b/mm/mlock.c
index ab164a675c25..55b3b3672977 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -277,16 +277,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct 
zone *zone)
 * so we can spare the get_page() here.
 */
if (TestClearPageLRU(page)) {
-   struct lruvec *new_lruvec;
-
-   new_lruvec = mem_cgroup_page_lruvec(page,
-   page_pgdat(page));
-   if (new_lruvec != lruvec) {
-   if (lruvec)
-   unlock_page_lruvec_irq(lruvec);
-   lruvec = lock_page_lruvec_irq(page);
-   }
-
+   lruvec = relock_page_lruvec_irq(page, lruvec);
del_page_from_lru_list(page, lruvec,
page_lru(page));
continue;
diff --git a/mm/swap.c b/mm/swap.c
index ed033f7c4f2d..c593ba596dea 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -210,19 +210,12 @@ stati

[PATCH v21 17/19] mm/lru: replace pgdat lru_lock with lruvec lock

2020-11-05 Thread Alex Shi
This patch moves the per node lru_lock into lruvec, thus bringing an
lru_lock to each memcg per node. So on a large machine, memcgs no longer
have to suffer from per node pgdat->lru_lock contention. Each can go
fast with its own lru_lock.

After moving the memcg charge before lru insertion, page isolation can
serialize the page's memcg, so the per memcg lruvec lock is stable and can
replace the per node lru lock.

In func isolate_migratepages_block, compact_unlock_should_abort and
lock_page_lruvec_irqsave are open coded to work with compact_control.
Also add a debug func in locking which may give some clues if something
gets out of hand.

Daniel Jordan's testing shows a 62% improvement on a modified readtwice case
on his 2P * 10 core * 2 HT broadwell box.
https://lore.kernel.org/lkml/20200915165807.kpp7uhiw7l3lo...@ca-dmjordan1.us.oracle.com/

On a large machine with memcg enabled but not used, looking up the page's
lruvec walks a few pointers, which may increase lru_lock holding time and
cause a slight regression.

Hugh Dickins helped polish the patch, thanks!

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Cc: Rong Chen 
Cc: Hugh Dickins 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Yang Shi 
Cc: Matthew Wilcox 
Cc: Konstantin Khlebnikov 
Cc: Tejun Heo 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
Cc: cgro...@vger.kernel.org
---
 include/linux/memcontrol.h |  58 +++
 include/linux/mmzone.h |   3 +-
 mm/compaction.c|  56 ++
 mm/huge_memory.c   |  11 ++---
 mm/memcontrol.c|  73 ++--
 mm/mlock.c |  22 ++---
 mm/mmzone.c|   1 +
 mm/page_alloc.c|   1 -
 mm/swap.c  | 116 ++---
 mm/vmscan.c|  55 ++---
 10 files changed, 270 insertions(+), 126 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0f4dd7829fb2..6ecb08ff4ad1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -666,6 +666,19 @@ static inline struct lruvec *mem_cgroup_lruvec(struct 
mem_cgroup *memcg,
 
 struct mem_cgroup *get_mem_cgroup_from_page(struct page *page);
 
+struct lruvec *lock_page_lruvec(struct page *page);
+struct lruvec *lock_page_lruvec_irq(struct page *page);
+struct lruvec *lock_page_lruvec_irqsave(struct page *page,
+   unsigned long *flags);
+
+#ifdef CONFIG_DEBUG_VM
+void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page);
+#else
+static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+{
+}
+#endif
+
 static inline
 struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){
return css ? container_of(css, struct mem_cgroup, css) : NULL;
@@ -1233,6 +1246,31 @@ static inline void mem_cgroup_put(struct mem_cgroup 
*memcg)
 {
 }
 
+static inline struct lruvec *lock_page_lruvec(struct page *page)
+{
+   struct pglist_data *pgdat = page_pgdat(page);
+
+   spin_lock(&pgdat->__lruvec.lru_lock);
+   return &pgdat->__lruvec;
+}
+
+static inline struct lruvec *lock_page_lruvec_irq(struct page *page)
+{
+   struct pglist_data *pgdat = page_pgdat(page);
+
+   spin_lock_irq(&pgdat->__lruvec.lru_lock);
+   return &pgdat->__lruvec;
+}
+
+static inline struct lruvec *lock_page_lruvec_irqsave(struct page *page,
+   unsigned long *flagsp)
+{
+   struct pglist_data *pgdat = page_pgdat(page);
+
+   spin_lock_irqsave(&pgdat->__lruvec.lru_lock, *flagsp);
+   return &pgdat->__lruvec;
+}
+
 static inline struct mem_cgroup *
 mem_cgroup_iter(struct mem_cgroup *root,
struct mem_cgroup *prev,
@@ -1476,6 +1514,10 @@ static inline void count_memcg_page_event(struct page 
*page,
 void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx)
 {
 }
+
+static inline void lruvec_memcg_debug(struct lruvec *lruvec, struct page *page)
+{
+}
 #endif /* CONFIG_MEMCG */
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
@@ -1605,6 +1647,22 @@ static inline struct lruvec *parent_lruvec(struct lruvec 
*lruvec)
return mem_cgroup_lruvec(memcg, lruvec_pgdat(lruvec));
 }
 
+static inline void unlock_page_lruvec(struct lruvec *lruvec)
+{
+   spin_unlock(&lruvec->lru_lock);
+}
+
+static inline void unlock_page_lruvec_irq(struct lruvec *lruvec)
+{
+   spin_unlock_irq(&lruvec->lru_lock);
+}
+
+static inline void unlock_page_lruvec_irqrestore(struct lruvec *lruvec,
+   unsigned long flags)
+{
+   spin_unlock_irqrestore(&lruvec->lru_lock, flags);
+}
+
 #ifdef CONFIG_CGROUP_WRITEBACK
 
 struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fb3bf696c05e..0afba4ea2a21 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -276,6 +276,8 @@ enum lruve

[PATCH v21 13/19] mm/mlock: remove __munlock_isolate_lru_page

2020-11-05 Thread Alex Shi
The func has only one caller, so remove it to clean up and simplify the
code.

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Hugh Dickins 
Cc: Kirill A. Shutemov 
Cc: Vlastimil Babka 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/mlock.c | 31 +--
 1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 796c726a0407..d487aa864e86 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -106,26 +106,6 @@ void mlock_vma_page(struct page *page)
 }
 
 /*
- * Isolate a page from LRU with optional get_page() pin.
- * Assumes lru_lock already held and page already pinned.
- */
-static bool __munlock_isolate_lru_page(struct page *page, bool getpage)
-{
-   if (PageLRU(page)) {
-   struct lruvec *lruvec;
-
-   lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
-   if (getpage)
-   get_page(page);
-   ClearPageLRU(page);
-   del_page_from_lru_list(page, lruvec, page_lru(page));
-   return true;
-   }
-
-   return false;
-}
-
-/*
  * Finish munlock after successful page isolation
  *
  * Page must be locked. This is a wrapper for try_to_munlock()
@@ -296,9 +276,16 @@ static void __munlock_pagevec(struct pagevec *pvec, struct 
zone *zone)
 * We already have pin from follow_page_mask()
 * so we can spare the get_page() here.
 */
-   if (__munlock_isolate_lru_page(page, false))
+   if (PageLRU(page)) {
+   struct lruvec *lruvec;
+
+   ClearPageLRU(page);
+   lruvec = mem_cgroup_page_lruvec(page,
+   page_pgdat(page));
+   del_page_from_lru_list(page, lruvec,
+   page_lru(page));
continue;
-   else
+   } else
__munlock_isolation_failed(page);
} else {
delta_munlocked++;
-- 
1.8.3.1



[PATCH v21 09/19] mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn

2020-11-05 Thread Alex Shi
Fold the PGROTATED event accounting into the pagevec_move_tail_fn()
callback, as the other callbacks used with pagevec_lru_move_fn() do.
This removes the pagevec_move_tail() wrapper.
All users of pagevec_lru_move_fn() now call it the same way, so its 3rd
parameter is no longer needed.

This only simplifies the call path. No functional change.
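
For illustration only, not part of the patch: a minimal userspace sketch
of the callback shape described above. The names page_model,
pgrotated_model, move_tail_fn_model and lru_move_fn_model are
hypothetical stand-ins, not kernel symbols. The counter is bumped inside
the callback, so the batch helper needs no opaque third argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and the PGROTATED vm event. */
struct page_model {
	int on_lru;
	int unevictable;
	int nr_pages;
};

static unsigned long pgrotated_model;

/* New-style callback: no opaque argument, the event is counted in place. */
static void move_tail_fn_model(struct page_model *page)
{
	if (page->on_lru && !page->unevictable)
		pgrotated_model += page->nr_pages;
}

/* Batch helper: every caller passes only the batch and the callback. */
static void lru_move_fn_model(struct page_model *pages, size_t n,
			      void (*move_fn)(struct page_model *page))
{
	for (size_t i = 0; i < n; i++)
		move_fn(&pages[i]);
}
```

With the accumulator gone, a wrapper whose only job was to declare the
counter and flush it afterwards has nothing left to do.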

[l...@intel.com: found a build issue in the original patch, thanks]
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 65 ++-
 1 file changed, 23 insertions(+), 42 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 8a578381c2fc..ce8c97146e0d 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -204,8 +204,7 @@ int get_kernel_page(unsigned long start, int write, struct 
page **pages)
 EXPORT_SYMBOL_GPL(get_kernel_page);
 
 static void pagevec_lru_move_fn(struct pagevec *pvec,
-   void (*move_fn)(struct page *page, struct lruvec *lruvec, void *arg),
-   void *arg)
+   void (*move_fn)(struct page *page, struct lruvec *lruvec))
 {
int i;
struct pglist_data *pgdat = NULL;
@@ -224,7 +223,7 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
}
 
lruvec = mem_cgroup_page_lruvec(page, pgdat);
-   (*move_fn)(page, lruvec, arg);
+   (*move_fn)(page, lruvec);
}
if (pgdat)
spin_unlock_irqrestore(&pgdat->lru_lock, flags);
@@ -232,35 +231,22 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
pagevec_reinit(pvec);
 }
 
-static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec,
-void *arg)
+static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
 {
-   int *pgmoved = arg;
-
if (PageLRU(page) && !PageUnevictable(page)) {
del_page_from_lru_list(page, lruvec, page_lru(page));
ClearPageActive(page);
add_page_to_lru_list_tail(page, lruvec, page_lru(page));
-   (*pgmoved) += thp_nr_pages(page);
+   __count_vm_events(PGROTATED, thp_nr_pages(page));
}
 }
 
 /*
- * pagevec_move_tail() must be called with IRQ disabled.
- * Otherwise this may cause nasty races.
- */
-static void pagevec_move_tail(struct pagevec *pvec)
-{
-   int pgmoved = 0;
-
-   pagevec_lru_move_fn(pvec, pagevec_move_tail_fn, &pgmoved);
-   __count_vm_events(PGROTATED, pgmoved);
-}
-
-/*
  * Writeback is about to end against a page which has been marked for immediate
  * reclaim.  If it still appears to be reclaimable, move it to the tail of the
  * inactive list.
+ *
+ * rotate_reclaimable_page() must disable IRQs, to prevent nasty races.
  */
 void rotate_reclaimable_page(struct page *page)
 {
@@ -273,7 +259,7 @@ void rotate_reclaimable_page(struct page *page)
local_lock_irqsave(&lru_rotate.lock, flags);
pvec = this_cpu_ptr(&lru_rotate.pvec);
if (!pagevec_add(pvec, page) || PageCompound(page))
-   pagevec_move_tail(pvec);
+   pagevec_lru_move_fn(pvec, pagevec_move_tail_fn);
local_unlock_irqrestore(&lru_rotate.lock, flags);
}
 }
@@ -315,8 +301,7 @@ void lru_note_cost_page(struct page *page)
  page_is_file_lru(page), thp_nr_pages(page));
 }
 
-static void __activate_page(struct page *page, struct lruvec *lruvec,
-   void *arg)
+static void __activate_page(struct page *page, struct lruvec *lruvec)
 {
if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
int lru = page_lru_base_type(page);
@@ -340,7 +325,7 @@ static void activate_page_drain(int cpu)
struct pagevec *pvec = &per_cpu(lru_pvecs.activate_page, cpu);
 
if (pagevec_count(pvec))
-   pagevec_lru_move_fn(pvec, __activate_page, NULL);
+   pagevec_lru_move_fn(pvec, __activate_page);
 }
 
 static bool need_activate_page_drain(int cpu)
@@ -358,7 +343,7 @@ static void activate_page(struct page *page)
pvec = this_cpu_ptr(&lru_pvecs.activate_page);
get_page(page);
if (!pagevec_add(pvec, page) || PageCompound(page))
-   pagevec_lru_move_fn(pvec, __activate_page, NULL);
+   pagevec_lru_move_fn(pvec, __activate_page);
local_unlock(&lru_pvecs.lock);
}
 }
@@ -374,7 +359,7 @@ static void activate_page(struct page *page)
 
page = compound_head(page);
spin_lock_irq(&pgdat->lru_lock);
-   __activate_page(page, mem_cgroup_page_lruvec(page, pgdat), NULL);
+   __activate_page(page, mem_cgroup_page_lruvec(page, pgdat));
spin_unlock_irq(&pgdat->lru_lock);
 }
 #endif
@@ -525,8 +510,7 @@ void lru_cache_add_inactive_or_unevictable(struct page 
*page,
  * be write it out by flusher threads as t

[PATCH v21 08/19] mm/memcg: add debug checking in lock_page_memcg

2020-11-05 Thread Alex Shi
Add a debug check to lock_page_memcg() so that we get an alarm if
anything goes wrong here.

Suggested-by: Johannes Weiner 
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/memcontrol.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b2aa3b73ab82..157b745031a4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2121,6 +2121,12 @@ struct mem_cgroup *lock_page_memcg(struct page *page)
if (unlikely(!memcg))
return NULL;
 
+#ifdef CONFIG_PROVE_LOCKING
+   local_irq_save(flags);
+   might_lock(&memcg->move_lock);
+   local_irq_restore(flags);
+#endif
+
if (atomic_read(&memcg->moving_account) <= 0)
return memcg;
 
-- 
1.8.3.1



[PATCH v21 01/19] mm/thp: move lru_add_page_tail func to huge_memory.c

2020-11-05 Thread Alex Shi
The function is only used in huge_memory.c, so defining it in another
file under a CONFIG_TRANSPARENT_HUGEPAGE guard just looks weird.

Let's move it to the THP code, and make it static as Hugh Dickins suggested.

Signed-off-by: Alex Shi 
Reviewed-by: Kirill A. Shutemov 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/swap.h |  2 --
 mm/huge_memory.c | 30 ++
 mm/swap.c| 33 -
 3 files changed, 30 insertions(+), 35 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 667935c0dbd4..5e1e967c225f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -338,8 +338,6 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
  unsigned int nr_pages);
 extern void lru_note_cost_page(struct page *);
 extern void lru_cache_add(struct page *);
-extern void lru_add_page_tail(struct page *page, struct page *page_tail,
-struct lruvec *lruvec, struct list_head *head);
 extern void mark_page_accessed(struct page *);
 extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 08a183f6c3ab..8f16e991f7cc 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2348,6 +2348,36 @@ static void remap_page(struct page *page, unsigned int 
nr)
}
 }
 
+static void lru_add_page_tail(struct page *page, struct page *page_tail,
+   struct lruvec *lruvec, struct list_head *list)
+{
+   VM_BUG_ON_PAGE(!PageHead(page), page);
+   VM_BUG_ON_PAGE(PageCompound(page_tail), page);
+   VM_BUG_ON_PAGE(PageLRU(page_tail), page);
+   lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
+
+   if (!list)
+   SetPageLRU(page_tail);
+
+   if (likely(PageLRU(page)))
+   list_add_tail(&page_tail->lru, &page->lru);
+   else if (list) {
+   /* page reclaim is reclaiming a huge page */
+   get_page(page_tail);
+   list_add_tail(&page_tail->lru, list);
+   } else {
+   /*
+* Head page has not yet been counted, as an hpage,
+* so we must account for each subpage individually.
+*
+* Put page_tail on the list at the correct position
+* so they all end up in order.
+*/
+   add_page_to_lru_list_tail(page_tail, lruvec,
+ page_lru(page_tail));
+   }
+}
+
 static void __split_huge_page_tail(struct page *head, int tail,
struct lruvec *lruvec, struct list_head *list)
 {
diff --git a/mm/swap.c b/mm/swap.c
index 29220174433b..8a578381c2fc 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -977,39 +977,6 @@ void __pagevec_release(struct pagevec *pvec)
 }
 EXPORT_SYMBOL(__pagevec_release);
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-/* used by __split_huge_page_refcount() */
-void lru_add_page_tail(struct page *page, struct page *page_tail,
-  struct lruvec *lruvec, struct list_head *list)
-{
-   VM_BUG_ON_PAGE(!PageHead(page), page);
-   VM_BUG_ON_PAGE(PageCompound(page_tail), page);
-   VM_BUG_ON_PAGE(PageLRU(page_tail), page);
-   lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
-
-   if (!list)
-   SetPageLRU(page_tail);
-
-   if (likely(PageLRU(page)))
-   list_add_tail(&page_tail->lru, &page->lru);
-   else if (list) {
-   /* page reclaim is reclaiming a huge page */
-   get_page(page_tail);
-   list_add_tail(&page_tail->lru, list);
-   } else {
-   /*
-* Head page has not yet been counted, as an hpage,
-* so we must account for each subpage individually.
-*
-* Put page_tail on the list at the correct position
-* so they all end up in order.
-*/
-   add_page_to_lru_list_tail(page_tail, lruvec,
- page_lru(page_tail));
-   }
-}
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-
 static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 void *arg)
 {
-- 
1.8.3.1



Re: [PATCH v4 2/4] mfd: Support ROHM BD9576MUF and BD9573MUF

2020-11-05 Thread Vaittinen, Matti

On Thu, 2020-11-05 at 08:21 +, Lee Jones wrote:
> On Thu, 05 Nov 2020, Vaittinen, Matti wrote:
> 
> > On Thu, 2020-11-05 at 08:46 +0200, Matti Vaittinen wrote:
> > > Morning Lee,
> > > 
> > > Thanks for taking a look at this :) I see most of the comments
> > > being
> > > valid. There's two I would like to clarify though...
> > > 
> > > On Wed, 2020-11-04 at 15:51 +, Lee Jones wrote:
> > > > On Wed, 28 Oct 2020, Matti Vaittinen wrote:
> > > > 
> > > > > Add core support for ROHM BD9576MUF and BD9573MUF PMICs which
> > > > > are
> > > > > mainly used to power the R-Car series processors.
> > > > > 
> > > > > Signed-off-by: Matti Vaittinen <
> > > > > matti.vaitti...@fi.rohmeurope.com
> > > > > ---
> > > > > + unsigned int chip_type;
> > > > > +
> > > > > + chip_type = (unsigned int)(uintptr_t)
> > > > > + of_device_get_match_data(&i2c->dev);
> > > > 
> > > > Not overly keen on this casting.
> > > > 
> > > > Why not just leave it as (uintptr_t)?
> > > 
> > > I didn't do so because on x86_64 the address width is probably 64
> > > bits
> > > whereas the unsigned int is likely to be 32 bits. So the
> > > assignment
> > > will crop half of the value. It does not really matter as values
> > > are
> > > > small - but I would be surprised if no compilers/analyzers
> > > emitted a
> > > warning.
> > > 
> > > I must admit I am not 100% sure though. I sure can change this if
> > > you
> > > know it better?
> 
> What if you used 'long', which I believe changed with the
> architecture's bus width in Linux?

I think this is exactly what uintptr_t was created for: to provide a type
which guarantees that converting a pointer to an integer and back works.

I guess I can change the

unsigned int chip_type;

to uintptr_t and get away with single cast if it looks better to you.
For me the double cast does not look that bad when it allows use of
native int size variable - but in this case it's really just a matter
of taste. Both should work fine.
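
To make the two options concrete, here is a small standalone sketch of
the double cast (illustration only, not the driver code; the names
chip_type_model, chip_to_match_data and match_data_to_chip are made up).
It assumes the usual behaviour of mainstream compilers, where a small
integer survives the trip through a pointer-sized slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chip identifiers, for illustration only. */
enum chip_type_model { CHIP_A_MODEL = 1, CHIP_B_MODEL = 2 };

/* Store a small integer in a pointer-sized match-data slot. */
static const void *chip_to_match_data(enum chip_type_model type)
{
	return (const void *)(uintptr_t)type;
}

/* Double cast: pointer -> uintptr_t first, then narrow explicitly. */
static unsigned int match_data_to_chip(const void *data)
{
	return (unsigned int)(uintptr_t)data;
}
```

The inner cast is the pointer-to-integer conversion; the outer one only
states that the narrowing to unsigned int is intended.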

I'll cook v5.

--Matti



[PATCH v21 19/19] mm/lru: revise the comments of lru_lock

2020-11-05 Thread Alex Shi
From: Hugh Dickins 

Since we changed pgdat->lru_lock to lruvec->lru_lock, it's time to fix
the now-incorrect comments in the code. Also fix some comments that
still refer to the ancient zone->lru_lock.

I struggled to understand the comment above move_pages_to_lru() (surely
it never calls page_referenced()), and eventually realized that most of
it had got separated from shrink_active_list(): move that comment back.

Signed-off-by: Hugh Dickins 
Signed-off-by: Alex Shi 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: Tejun Heo 
Cc: Andrey Ryabinin 
Cc: Jann Horn 
Cc: Mel Gorman 
Cc: Johannes Weiner 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: cgro...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
---
 Documentation/admin-guide/cgroup-v1/memcg_test.rst | 15 ++--
 Documentation/admin-guide/cgroup-v1/memory.rst | 21 +--
 Documentation/trace/events-kmem.rst|  2 +-
 Documentation/vm/unevictable-lru.rst   | 22 +---
 include/linux/mm_types.h   |  2 +-
 include/linux/mmzone.h |  3 +-
 mm/filemap.c   |  4 +--
 mm/rmap.c  |  4 +--
 mm/vmscan.c| 41 --
 9 files changed, 50 insertions(+), 64 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memcg_test.rst 
b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
index 3f7115e07b5d..0b9f91589d3d 100644
--- a/Documentation/admin-guide/cgroup-v1/memcg_test.rst
+++ b/Documentation/admin-guide/cgroup-v1/memcg_test.rst
@@ -133,18 +133,9 @@ Under below explanation, we assume 
CONFIG_MEM_RES_CTRL_SWAP=y.
 
 8. LRU
 ==
-Each memcg has its own private LRU. Now, its handling is under global
-   VM's control (means that it's handled under global pgdat->lru_lock).
-   Almost all routines around memcg's LRU is called by global LRU's
-   list management functions under pgdat->lru_lock.
-
-   A special function is mem_cgroup_isolate_pages(). This scans
-   memcg's private LRU and call __isolate_lru_page() to extract a page
-   from LRU.
-
-   (By __isolate_lru_page(), the page is removed from both of global and
-   private LRU.)
-
+   Each memcg has its own vector of LRUs (inactive anon, active anon,
+   inactive file, active file, unevictable) of pages from each node,
+   each LRU handled under a single lru_lock for that memcg and node.
 
 9. Typical Tests.
 =
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst 
b/Documentation/admin-guide/cgroup-v1/memory.rst
index 12757e63b26c..24450696579f 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -285,20 +285,17 @@ When oom event notifier is registered, event will be 
delivered.
 2.6 Locking
 ---
 
-   lock_page_cgroup()/unlock_page_cgroup() should not be called under
-   the i_pages lock.
+Lock order is as follows:
 
-   Other lock order is following:
+  Page lock (PG_locked bit of page->flags)
+mm->page_table_lock or split pte_lock
+  lock_page_memcg (memcg->move_lock)
+mapping->i_pages lock
+  lruvec->lru_lock.
 
-   PG_locked.
- mm->page_table_lock
- pgdat->lru_lock
-  lock_page_cgroup.
-
-  In many cases, just lock_page_cgroup() is called.
-
-  per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
-  pgdat->lru_lock, it has no lock of its own.
+Per-node-per-memcgroup LRU (cgroup's private LRU) is guarded by
+lruvec->lru_lock; PG_lru bit of page->flags is cleared before
+isolating a page from its LRU under lruvec->lru_lock.
 
 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM)
 ---
diff --git a/Documentation/trace/events-kmem.rst 
b/Documentation/trace/events-kmem.rst
index 555484110e36..68fa75247488 100644
--- a/Documentation/trace/events-kmem.rst
+++ b/Documentation/trace/events-kmem.rst
@@ -69,7 +69,7 @@ When pages are freed in batch, the also mm_page_free_batched 
is triggered.
 Broadly speaking, pages are taken off the LRU lock in bulk and
 freed in batch with a page list. Significant amounts of activity here could
 indicate that the system is under memory pressure and can also indicate
-contention on the zone->lru_lock.
+contention on the lruvec->lru_lock.
 
 4. Per-CPU Allocator Activity
 =
diff --git a/Documentation/vm/unevictable-lru.rst 
b/Documentation/vm/unevictable-lru.rst
index 17d0861b0f1d..0e1490524f53 100644
--- a/Documentation/vm/unevictable-lru.rst
+++ b/Documentation/vm/unevictable-lru.rst
@@ -33,7 +33,7 @@ reclaim in Linux.  The problems have been observed at 
customer sites on large
 memory x86_64 systems.
 
 To illustrate this with an example, a non-NUMA x86_64 platform with 128GB of
-main memory will have over 32 million 4k pages in a single zone.  When a large
+main memory 

[PATCH v21 07/19] mm: page_idle_get_page() does not need lru_lock

2020-11-05 Thread Alex Shi
From: Hugh Dickins 

It is necessary for page_idle_get_page() to recheck PageLRU() after
get_page_unless_zero(), but holding lru_lock around that serves no
useful purpose, and adds to lru_lock contention: delete it.

See https://lore.kernel.org/lkml/20150504031722.GA2768@blaptop for the
discussion that led to lru_lock there; but __page_set_anon_rmap() now
uses WRITE_ONCE(), and I see no other risk in page_idle_clear_pte_refs()
using rmap_walk() (beyond the risk of racing PageAnon->PageKsm, mostly
but not entirely prevented by page_count() check in ksm.c's
write_protect_page(): that risk being shared with page_referenced() and
not helped by lru_lock).
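
For illustration only, not part of the patch: a userspace sketch of the
check, pin, recheck pattern that remains once the lock is gone. It uses
C11 atomics, and page_model, get_unless_zero_model, put_model and
idle_get_page_model are hypothetical names rather than kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page: a refcount and an "on LRU" flag. */
struct page_model {
	atomic_int refcount;
	atomic_bool lru;
};

/* Take a reference only if the count is not already zero. */
static bool get_unless_zero_model(struct page_model *page)
{
	int old = atomic_load(&page->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void put_model(struct page_model *page)
{
	atomic_fetch_sub(&page->refcount, 1);
}

/* Lock-free lookup: check the flag, pin the page, then recheck the flag. */
static struct page_model *idle_get_page_model(struct page_model *page)
{
	if (!page || !atomic_load(&page->lru) || !get_unless_zero_model(page))
		return NULL;

	if (!atomic_load(&page->lru)) {
		put_model(page);
		return NULL;
	}
	return page;
}
```

The recheck after taking the reference is what the commit message calls
necessary; a lock around it would not make the answer any more stable.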

Signed-off-by: Hugh Dickins 
Signed-off-by: Alex Shi 
Cc: Andrew Morton 
Cc: Vladimir Davydov 
Cc: Vlastimil Babka 
Cc: Minchan Kim 
Cc: Alex Shi 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/page_idle.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 057c61df12db..64e5344a992c 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -32,19 +32,15 @@
 static struct page *page_idle_get_page(unsigned long pfn)
 {
struct page *page = pfn_to_online_page(pfn);
-   pg_data_t *pgdat;
 
if (!page || !PageLRU(page) ||
!get_page_unless_zero(page))
return NULL;
 
-   pgdat = page_pgdat(page);
-   spin_lock_irq(&pgdat->lru_lock);
if (unlikely(!PageLRU(page))) {
put_page(page);
page = NULL;
}
-   spin_unlock_irq(&pgdat->lru_lock);
return page;
 }
 
-- 
1.8.3.1



[PATCH v21 06/19] mm/rmap: stop store reordering issue on page->mapping

2020-11-05 Thread Alex Shi
Hugh Dickins and Minchan Kim observed a long-standing issue, which was
discussed here, but the fix mentioned there was never actually applied:
https://lore.kernel.org/lkml/20150504031722.GA2768@blaptop/
The store reordering may cause a problem in this scenario:

        CPU 0                                     CPU 1
do_anonymous_page
  page_add_new_anon_rmap()
    page->mapping = anon_vma + PAGE_MAPPING_ANON
  lru_cache_add_inactive_or_unevictable()
    spin_lock(lruvec->lock)
    SetPageLRU()
    spin_unlock(lruvec->lock)
                                          /* idle tracking judged it as LRU
                                           * page so pass the page in
                                           * page_idle_clear_pte_refs
                                           */
                                          page_idle_clear_pte_refs
                                            rmap_walk
                                              if PageAnon(page)

Johannes gave a detailed explanation of how the store reordering could
cause trouble:
The concern is that the SetPageLRU store may get reordered before the
'page->mapping' store. CPU 1 would then read page->mapping after
observing PageLRU set on the page, and could see one of the following:

1. anon_vma + PAGE_MAPPING_ANON

   That's the in-order scenario and is fine.

2. NULL

   That's possible if the page->mapping store gets reordered to occur
   after SetPageLRU. That's fine too because we check for it.

3. anon_vma without the PAGE_MAPPING_ANON bit

   That would be a problem and could lead to all kinds of undesirable
   behavior including crashes and data corruption.

   Is it possible? AFAICT the compiler is allowed to tear the store to
   page->mapping and I don't see anything that would prevent it.

That said, I also don't see how the reader testing PageLRU under the
lru_lock would prevent that in the first place. AFAICT we need that
WRITE_ONCE() around the page->mapping assignment.
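
For illustration only, not part of the patch: a userspace sketch of the
idea. WRITE_ONCE_MODEL is a simplified stand-in for the kernel macro (a
plain volatile store; the real WRITE_ONCE() is more elaborate), and
page_model, PAGE_MAPPING_ANON_MODEL and the helper names are
hypothetical. It relies on __typeof__, a GCC/Clang extension, and on the
compiler not splitting a volatile store to an aligned pointer-sized
object, which is an assumption rather than a C standard guarantee.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_MAPPING_ANON_MODEL 0x1UL

/* Simplified stand-in for WRITE_ONCE(): one volatile store of the value. */
#define WRITE_ONCE_MODEL(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct page_model {
	void *mapping;
};

static void set_anon_mapping_model(struct page_model *page, void *anon_vma)
{
	/* Compute the tagged value first, then publish it in one store. */
	void *tagged = (void *)((uintptr_t)anon_vma + PAGE_MAPPING_ANON_MODEL);

	WRITE_ONCE_MODEL(page->mapping, tagged);
}

static int mapping_is_anon_model(const struct page_model *page)
{
	return ((uintptr_t)page->mapping & PAGE_MAPPING_ANON_MODEL) != 0;
}
```

The point is that a reader sees either the old value or the fully tagged
pointer, never the untagged anon_vma pointer of case 3.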

Signed-off-by: Alex Shi 
Cc: Johannes Weiner 
Cc: Andrew Morton 
Cc: Hugh Dickins 
Cc: Matthew Wilcox 
Cc: Minchan Kim 
Cc: Vladimir Davydov 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@kvack.org
---
 mm/rmap.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 1b84945d655c..078d54da59d4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1054,8 +1054,13 @@ static void __page_set_anon_rmap(struct page *page,
if (!exclusive)
anon_vma = anon_vma->root;
 
+   /*
+* Prevent page->mapping from pointing to an anon_vma without
+* the PAGE_MAPPING_ANON bit set.  This could happen if the
+* compiler stores anon_vma and then adds PAGE_MAPPING_ANON to it.
+*/
anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
-   page->mapping = (struct address_space *) anon_vma;
+   WRITE_ONCE(page->mapping, (struct address_space *) anon_vma);
page->index = linear_page_index(vma, address);
 }
 
-- 
1.8.3.1



[PATCH v21 14/19] mm/lru: introduce TestClearPageLRU

2020-11-05 Thread Alex Shi
Currently lru_lock still guards both the lru list and the page's lru bit,
and that's ok. But if we want to use a specific lruvec lock for the page,
we need to pin down the page's lruvec/memcg while locking. Just taking the
lruvec lock first may be undermined by the page's memcg charge/migration.
To fix this problem, we clear the lru bit outside of the lock and use
that as the pin-down action that blocks page isolation while the memcg
is changing.

So the standard steps of page isolation are now as follows:
1, get_page(); #pin the page so it cannot be freed
2, TestClearPageLRU(); #block other isolation, e.g. by a memcg change
3, spin_lock on lru_lock; #serialize lru list access
4, delete page from lru list;

This patch starts with the first part: TestClearPageLRU, which combines
the PageLRU check and ClearPageLRU into a single macro function,
TestClearPageLRU. This function will be used as the page isolation
precondition, to prevent other isolations elsewhere. As a result there
may now be !PageLRU pages on the lru list, so the corresponding BUG()
checks need to be removed.

There are 2 rules for the lru bit now:
1, the lru bit still indicates whether a page is on the lru list, except
   that at some temporary moments (while isolating) the page may have no
   lru bit while it is still on the lru list. But the page must still be
   on the lru list whenever the lru bit is set.
2, the lru bit has to be cleared before the page is deleted from the lru
   list.
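
For illustration only, not part of the patch: a userspace sketch of
steps 1 and 2 using C11 atomics. The names page_model, PG_LRU_MODEL,
test_clear_lru_model and isolate_model are hypothetical; the kernel's
TestClearPageLRU() comes from the page-flags macros, not from this code.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PG_LRU_MODEL (1UL << 0)

struct page_model {
	atomic_ulong flags;
	atomic_int refcount;
};

/* Atomically clear the bit and report whether it was set beforehand. */
static bool test_clear_lru_model(struct page_model *page)
{
	unsigned long old = atomic_fetch_and(&page->flags, ~PG_LRU_MODEL);

	return (old & PG_LRU_MODEL) != 0;
}

/* Steps 1 and 2: pin the page, then try to win the lru bit. */
static int isolate_model(struct page_model *page)
{
	atomic_fetch_add(&page->refcount, 1);
	if (test_clear_lru_model(page))
		return 0;	/* caller goes on to lock and delete from the list */

	atomic_fetch_sub(&page->refcount, 1);	/* lost the race: drop the pin */
	return -EBUSY;
}
```

Only one of several racing isolators can see the bit set, so the winner
owns the page until it puts the bit back or frees the page.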

As Andrew Morton mentioned, this change would dirty the cacheline for a
page that isn't on the LRU. But the loss would be acceptable according to
Rong Chen's report:
https://lore.kernel.org/lkml/20200304090301.GB5972@shao2-debian/

Suggested-by: Johannes Weiner 
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Vladimir Davydov 
Cc: Andrew Morton 
Cc: linux-kernel@vger.kernel.org
Cc: cgro...@vger.kernel.org
Cc: linux...@kvack.org
---
 include/linux/page-flags.h |  1 +
 mm/mlock.c |  3 +--
 mm/vmscan.c| 39 +++
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 291dc247dc79..6426f2f03611 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -335,6 +335,7 @@ static inline void page_init_poison(struct page *page, 
size_t size)
 PAGEFLAG(Dirty, dirty, PF_HEAD) TESTSCFLAG(Dirty, dirty, PF_HEAD)
__CLEARPAGEFLAG(Dirty, dirty, PF_HEAD)
 PAGEFLAG(LRU, lru, PF_HEAD) __CLEARPAGEFLAG(LRU, lru, PF_HEAD)
+   TESTCLEARFLAG(LRU, lru, PF_HEAD)
 PAGEFLAG(Active, active, PF_HEAD) __CLEARPAGEFLAG(Active, active, PF_HEAD)
TESTCLEARFLAG(Active, active, PF_HEAD)
 PAGEFLAG(Workingset, workingset, PF_HEAD)
diff --git a/mm/mlock.c b/mm/mlock.c
index d487aa864e86..7b0e6334be6f 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -276,10 +276,9 @@ static void __munlock_pagevec(struct pagevec *pvec, struct 
zone *zone)
 * We already have pin from follow_page_mask()
 * so we can spare the get_page() here.
 */
-   if (PageLRU(page)) {
+   if (TestClearPageLRU(page)) {
struct lruvec *lruvec;
 
-   ClearPageLRU(page);
lruvec = mem_cgroup_page_lruvec(page,
page_pgdat(page));
del_page_from_lru_list(page, lruvec,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cb2f6256a7d6..ab7a0104d1e1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1542,7 +1542,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone 
*zone,
  */
 int __isolate_lru_page(struct page *page, isolate_mode_t mode)
 {
-   int ret = -EINVAL;
+   int ret = -EBUSY;
 
/* Only take pages on the LRU. */
if (!PageLRU(page))
@@ -1552,8 +1552,6 @@ int __isolate_lru_page(struct page *page, isolate_mode_t 
mode)
if (PageUnevictable(page) && !(mode & ISOLATE_UNEVICTABLE))
return ret;
 
-   ret = -EBUSY;
-
/*
 * To minimise LRU disruption, the caller can indicate that it only
 * wants to isolate pages it will be able to operate on without
@@ -1600,8 +1598,10 @@ int __isolate_lru_page(struct page *page, isolate_mode_t 
mode)
 * sure the page is not being freed elsewhere -- the
 * page release code relies on it.
 */
-   ClearPageLRU(page);
-   ret = 0;
+   if (TestClearPageLRU(page))
+   ret = 0;
+   else
+   put_page(page);
}
 
return ret;
@@ -1667,8 +1667,6 @@ static unsigned long isolate_lru_pages(unsigned long 
nr_to_scan,
page = lru_to_page(src);
prefetchw_prev_lru_page(page, src, flags);
 
-   VM_BUG_ON_PAGE(!PageLRU(page), page);
-
nr_pages = compound_nr(page);
   

[PATCH v21 16/19] mm/swap.c: serialize memcg changes in pagevec_lru_move_fn

2020-11-05 Thread Alex Shi
Hugh Dickins found a memcg change bug in the original version:
If we want to change the pgdat->lru_lock to memcg's lruvec lock, we have
to serialize mem_cgroup_move_account during pagevec_lru_move_fn. The
possible bad scenario would look like:

        cpu 0                                   cpu 1
lruvec = mem_cgroup_page_lruvec()
                                        if (!isolate_lru_page())
                                                mem_cgroup_move_account

spin_lock_irqsave(&lruvec->lru_lock <== wrong lock.

So we need TestClearPageLRU to block isolate_lru_page(), which serializes
the memcg change, and as a consequence the PageLRU checks in the move_fn
callees can be removed.

__pagevec_lru_add_fn() is different from the others, because the pages
it deals with are, by definition, not yet on the lru.  TestClearPageLRU
is not needed and would not work, so __pagevec_lru_add() goes its own
way.
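
For illustration only, not part of the patch: a userspace sketch of the
loop shape described above, with the lru bit taken before the callback
and restored after it. page_model, move_fn_model and
lru_move_batch_model are hypothetical names, and the locking that the
real pagevec_lru_move_fn() does is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
	atomic_bool lru;
	int moved;
};

static void move_fn_model(struct page_model *page)
{
	page->moved++;
}

/* Returns the number of pages the callback was actually run on. */
static size_t lru_move_batch_model(struct page_model **pages, size_t n,
				   void (*move_fn)(struct page_model *page))
{
	size_t done = 0;

	for (size_t i = 0; i < n; i++) {
		struct page_model *page = pages[i];

		/* Skip pages whose lru bit someone else already holds. */
		if (!atomic_exchange(&page->lru, false))
			continue;

		move_fn(page);		/* callback no longer rechecks the bit */
		atomic_store(&page->lru, true);
		done++;
	}
	return done;
}
```

Because the helper already owns the bit when it calls move_fn, the
callbacks can drop their own PageLRU tests, as the diff below does.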

Reported-by: Hugh Dickins 
Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c | 44 +++-
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 2681d9023998..1838a9535703 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -222,8 +222,14 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
spin_lock_irqsave(&pgdat->lru_lock, flags);
}
 
+   /* block memcg migration during page moving between lru */
+   if (!TestClearPageLRU(page))
+   continue;
+
lruvec = mem_cgroup_page_lruvec(page, pgdat);
(*move_fn)(page, lruvec);
+
+   SetPageLRU(page);
}
if (pgdat)
spin_unlock_irqrestore(&pgdat->lru_lock, flags);
@@ -233,7 +239,7 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 
 static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec)
 {
-   if (PageLRU(page) && !PageUnevictable(page)) {
+   if (!PageUnevictable(page)) {
del_page_from_lru_list(page, lruvec, page_lru(page));
ClearPageActive(page);
add_page_to_lru_list_tail(page, lruvec, page_lru(page));
@@ -306,7 +312,7 @@ void lru_note_cost_page(struct page *page)
 
 static void __activate_page(struct page *page, struct lruvec *lruvec)
 {
-   if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
+   if (!PageActive(page) && !PageUnevictable(page)) {
int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);
 
@@ -362,7 +368,8 @@ static void activate_page(struct page *page)
 
page = compound_head(page);
spin_lock_irq(&pgdat->lru_lock);
-   __activate_page(page, mem_cgroup_page_lruvec(page, pgdat));
+   if (PageLRU(page))
+   __activate_page(page, mem_cgroup_page_lruvec(page, pgdat));
spin_unlock_irq(&pgdat->lru_lock);
 }
 #endif
@@ -519,9 +526,6 @@ static void lru_deactivate_file_fn(struct page *page, 
struct lruvec *lruvec)
bool active;
int nr_pages = thp_nr_pages(page);
 
-   if (!PageLRU(page))
-   return;
-
if (PageUnevictable(page))
return;
 
@@ -562,7 +566,7 @@ static void lru_deactivate_file_fn(struct page *page, 
struct lruvec *lruvec)
 
 static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
 {
-   if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
+   if (PageActive(page) && !PageUnevictable(page)) {
int lru = page_lru_base_type(page);
int nr_pages = thp_nr_pages(page);
 
@@ -579,7 +583,7 @@ static void lru_deactivate_fn(struct page *page, struct 
lruvec *lruvec)
 
 static void lru_lazyfree_fn(struct page *page, struct lruvec *lruvec)
 {
-   if (PageLRU(page) && PageAnon(page) && PageSwapBacked(page) &&
+   if (PageAnon(page) && PageSwapBacked(page) &&
!PageSwapCache(page) && !PageUnevictable(page)) {
bool active = PageActive(page);
int nr_pages = thp_nr_pages(page);
@@ -1021,7 +1025,29 @@ static void __pagevec_lru_add_fn(struct page *page, 
struct lruvec *lruvec)
  */
 void __pagevec_lru_add(struct pagevec *pvec)
 {
-   pagevec_lru_move_fn(pvec, __pagevec_lru_add_fn);
+   int i;
+   struct pglist_data *pgdat = NULL;
+   struct lruvec *lruvec;
+   unsigned long flags = 0;
+
+   for (i = 0; i < pagevec_count(pvec); i++) {
+   struct page *page = pvec->pages[i];
+   struct pglist_data *pagepgdat = page_pgdat(page);
+
+   if (pagepgdat != pgdat) {
+   if (pgdat)
+   spin_unlock_irqrestore(&pgdat->lru_lock, flags);
+   pgdat = pagepgdat;
+   spin_lock_irqsave(&pgdat->lru_lock, flags);
+   }
+
+   lruvec =

[PATCH v21 10/19] mm/lru: move lock into lru_note_cost

2020-11-05 Thread Alex Shi
We have to move lru_lock into lru_note_cost(), since it walks up the memcg
tree, in preparation for the future replacement with a per-lruvec
lru_lock. It's a bit ugly and may cost a bit more locking, but the benefit
from per-memcg locking should cover the loss.
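
For illustration only, not part of the patch: a userspace sketch of why
the lock has to live inside the loop once each level has its own lock.
It models the future per-lruvec arrangement, whereas this patch still
takes pgdat->lru_lock at every level. lruvec_model, lock_model,
unlock_model and note_cost_model are hypothetical names, and the
atomic_flag spinlock merely stands in for lru_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lruvec_model {
	atomic_flag lock;		/* tiny spinlock standing in for lru_lock */
	unsigned long file_cost;
	unsigned long anon_cost;
	struct lruvec_model *parent;	/* NULL at the root */
};

static void lock_model(struct lruvec_model *l)
{
	while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire))
		;	/* spin until the holder releases the flag */
}

static void unlock_model(struct lruvec_model *l)
{
	atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* Walk up the tree, taking and dropping each level's own lock in turn. */
static void note_cost_model(struct lruvec_model *l, int file, unsigned long nr)
{
	do {
		lock_model(l);
		if (file)
			l->file_cost += nr;
		else
			l->anon_cost += nr;
		unlock_model(l);
	} while ((l = l->parent));
}
```

A caller that held only the leaf's lock could not protect the parents,
which is why the callers stop taking the lock around the call.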

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Johannes Weiner 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/swap.c   | 3 +++
 mm/vmscan.c | 4 +---
 mm/workingset.c | 2 --
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index ce8c97146e0d..2681d9023998 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -268,7 +268,9 @@ void lru_note_cost(struct lruvec *lruvec, bool file, 
unsigned int nr_pages)
 {
do {
unsigned long lrusize;
+   struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 
+   spin_lock_irq(&pgdat->lru_lock);
/* Record cost event */
if (file)
lruvec->file_cost += nr_pages;
@@ -292,6 +294,7 @@ void lru_note_cost(struct lruvec *lruvec, bool file, 
unsigned int nr_pages)
lruvec->file_cost /= 2;
lruvec->anon_cost /= 2;
}
+   spin_unlock_irq(&pgdat->lru_lock);
} while ((lruvec = parent_lruvec(lruvec)));
 }
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b9935668d121..d771f812e983 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1973,19 +1973,17 @@ static int current_may_throttle(void)
&stat, false);
 
spin_lock_irq(&pgdat->lru_lock);
-
move_pages_to_lru(lruvec, &page_list);
 
__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
-   lru_note_cost(lruvec, file, stat.nr_pageout);
item = current_is_kswapd() ? PGSTEAL_KSWAPD : PGSTEAL_DIRECT;
if (!cgroup_reclaim(sc))
__count_vm_events(item, nr_reclaimed);
__count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
__count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
-
spin_unlock_irq(&pgdat->lru_lock);
 
+   lru_note_cost(lruvec, file, stat.nr_pageout);
mem_cgroup_uncharge_list(&page_list);
free_unref_page_list(&page_list);
 
diff --git a/mm/workingset.c b/mm/workingset.c
index 130348cbf40a..a915a812c363 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -381,9 +381,7 @@ void workingset_refault(struct page *page, void *shadow)
if (workingset) {
SetPageWorkingset(page);
/* XXX: Move to lru_cache_add() when it supports new vs putback */
-   spin_lock_irq(&page_pgdat(page)->lru_lock);
lru_note_cost_page(page);
-   spin_unlock_irq(&page_pgdat(page)->lru_lock);
inc_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file);
}
 out:
-- 
1.8.3.1
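
Editorial sketch: the locking shape this patch moves into lru_note_cost()
can be pictured in user space as "take the lock, update one level, drop
the lock, then step to the parent". The code below is only an analogy
under stated assumptions: struct example_node, example_note_cost() and the
per-node pthread mutex are hypothetical stand-ins, and the patch itself
still takes the per-node pgdat->lru_lock rather than a per-level lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one level of the memcg/lruvec hierarchy. */
struct example_node {
	pthread_mutex_t lock;        /* protects cost at this level only */
	unsigned long cost;
	struct example_node *parent; /* NULL at the root */
};

/*
 * Charge nr to the node and to every ancestor. The lock is taken and
 * released inside the loop, so the caller must not already hold it and
 * no two levels are ever locked at the same time.
 */
static void example_note_cost(struct example_node *node, unsigned long nr)
{
	assert(node != NULL);

	do {
		pthread_mutex_lock(&node->lock);
		node->cost += nr;
		pthread_mutex_unlock(&node->lock);
	} while ((node = node->parent));
}
```

Because the callee now does its own locking, callers such as the vmscan.c
and workingset.c hunks above have to call it outside their own critical
section, which is what the diff does.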



[PATCH v21 02/19] mm/thp: use head for head page in lru_add_page_tail

2020-11-05 Thread Alex Shi
Since the first parameter is only used for the head page, it's better to
make that explicit.

Signed-off-by: Alex Shi 
Reviewed-by: Kirill A. Shutemov 
Reviewed-by: Matthew Wilcox (Oracle) 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Andrew Morton 
Cc: Johannes Weiner 
Cc: Matthew Wilcox 
Cc: Hugh Dickins 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/huge_memory.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8f16e991f7cc..60726eb26840 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2348,33 +2348,32 @@ static void remap_page(struct page *page, unsigned int nr)
}
 }
 
-static void lru_add_page_tail(struct page *page, struct page *page_tail,
+static void lru_add_page_tail(struct page *head, struct page *tail,
struct lruvec *lruvec, struct list_head *list)
 {
-   VM_BUG_ON_PAGE(!PageHead(page), page);
-   VM_BUG_ON_PAGE(PageCompound(page_tail), page);
-   VM_BUG_ON_PAGE(PageLRU(page_tail), page);
+   VM_BUG_ON_PAGE(!PageHead(head), head);
+   VM_BUG_ON_PAGE(PageCompound(tail), head);
+   VM_BUG_ON_PAGE(PageLRU(tail), head);
lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock);
 
if (!list)
-   SetPageLRU(page_tail);
+   SetPageLRU(tail);
 
-   if (likely(PageLRU(page)))
-   list_add_tail(&page_tail->lru, &page->lru);
+   if (likely(PageLRU(head)))
+   list_add_tail(&tail->lru, &head->lru);
else if (list) {
/* page reclaim is reclaiming a huge page */
-   get_page(page_tail);
-   list_add_tail(&page_tail->lru, list);
+   get_page(tail);
+   list_add_tail(&tail->lru, list);
} else {
/*
 * Head page has not yet been counted, as an hpage,
 * so we must account for each subpage individually.
 *
-* Put page_tail on the list at the correct position
+* Put tail on the list at the correct position
 * so they all end up in order.
 */
-   add_page_to_lru_list_tail(page_tail, lruvec,
- page_lru(page_tail));
+   add_page_to_lru_list_tail(tail, lruvec, page_lru(tail));
}
 }
 
-- 
1.8.3.1



[PATCH v21 11/19] mm/vmscan: remove lruvec reget in move_pages_to_lru

2020-11-05 Thread Alex Shi
Isolated page shouldn't be recharged by memcg since the memcg
migration isn't possible at the time. All pages were isolated
from the same lruvec (and isolation inhibits memcg migration).
So remove unnecessary regetting.

Thanks to Alexander Duyck for pointing this out.

Signed-off-by: Alex Shi 
Acked-by: Hugh Dickins 
Acked-by: Johannes Weiner 
Cc: Alexander Duyck 
Cc: Andrew Morton 
Cc: Konstantin Khlebnikov 
Cc: Michal Hocko 
Cc: Hugh Dickins 
Cc: Johannes Weiner 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: cgro...@vger.kernel.org
---
 mm/vmscan.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d771f812e983..cb2f6256a7d6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1887,7 +1887,12 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
continue;
}
 
-   lruvec = mem_cgroup_page_lruvec(page, pgdat);
+   /*
+* All pages were isolated from the same lruvec (and isolation
+* inhibits memcg migration).
+*/
+   VM_BUG_ON_PAGE(mem_cgroup_page_lruvec(page, page_pgdat(page))
+   != lruvec, page);
lru = page_lru(page);
nr_pages = thp_nr_pages(page);
 
-- 
1.8.3.1



[PATCH v21 04/19] mm/thp: narrow lru locking

2020-11-05 Thread Alex Shi
lru_lock and page cache xa_lock have no obvious reason to be taken
one way round or the other: until now, lru_lock has been taken before
page cache xa_lock, when splitting a THP; but nothing else takes them
together.  Reverse that ordering: let's narrow the lru locking - but
leave local_irq_disable to block interrupts throughout, like before.

Hugh Dickins' point: split_huge_page_to_list() was already silly, to be
using the _irqsave variant: it's just been taking sleeping locks, so
would already be broken if entered with interrupts enabled.  So we
can save passing flags argument down to __split_huge_page().

Why change the lock ordering here? That was hard to decide. One reason:
when this series reaches per-memcg lru locking, it relies on the THP's
memcg to be stable when taking the lru_lock: that is now done after the
THP's refcount has been frozen, which ensures page memcg cannot change.

Another reason: previously, lock_page_memcg()'s move_lock was presumed
to nest inside lru_lock; but now lru_lock must nest inside (page cache
lock inside) move_lock, so it becomes possible to use lock_page_memcg()
to stabilize page memcg before taking its lru_lock.  That is not the
mechanism used in this series, but it is an option we want to keep open.

[Hugh Dickins: rewrite commit log]
Signed-off-by: Alex Shi 
Reviewed-by: Kirill A. Shutemov 
Acked-by: Hugh Dickins 
Cc: Hugh Dickins 
Cc: Kirill A. Shutemov 
Cc: Andrea Arcangeli 
Cc: Johannes Weiner 
Cc: Matthew Wilcox 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/huge_memory.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 79318d7f7d5d..b70ec0c6076b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2435,7 +2435,7 @@ static void __split_huge_page_tail(struct page *head, int tail,
 }
 
 static void __split_huge_page(struct page *page, struct list_head *list,
-   pgoff_t end, unsigned long flags)
+   pgoff_t end)
 {
struct page *head = compound_head(page);
pg_data_t *pgdat = page_pgdat(head);
@@ -2445,8 +2445,6 @@ static void __split_huge_page(struct page *page, struct list_head *list,
unsigned int nr = thp_nr_pages(head);
int i;
 
-   lruvec = mem_cgroup_page_lruvec(head, pgdat);
-
/* complete memcg works before add pages to LRU */
mem_cgroup_split_huge_fixup(head);
 
@@ -2458,6 +2456,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
xa_lock(&swap_cache->i_pages);
}
 
+   /* prevent PageLRU to go away from under us, and freeze lru stats */
+   spin_lock(&pgdat->lru_lock);
+
+   lruvec = mem_cgroup_page_lruvec(head, pgdat);
+
for (i = nr - 1; i >= 1; i--) {
__split_huge_page_tail(head, i, lruvec, list);
/* Some pages can be beyond i_size: drop them from page cache */
@@ -2477,6 +2480,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
}
 
ClearPageCompound(head);
+   spin_unlock(&pgdat->lru_lock);
+   /* Caller disabled irqs, so they are still disabled here */
 
split_page_owner(head, nr);
 
@@ -2494,8 +2499,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
page_ref_add(head, 2);
xa_unlock(&head->mapping->i_pages);
}
-
-   spin_unlock_irqrestore(&pgdat->lru_lock, flags);
+   local_irq_enable();
 
remap_page(head, nr);
 
@@ -2641,12 +2645,10 @@ bool can_split_huge_page(struct page *page, int *pextra_pins)
 int split_huge_page_to_list(struct page *page, struct list_head *list)
 {
struct page *head = compound_head(page);
-   struct pglist_data *pgdata = NODE_DATA(page_to_nid(head));
struct deferred_split *ds_queue = get_deferred_split_queue(head);
struct anon_vma *anon_vma = NULL;
struct address_space *mapping = NULL;
int count, mapcount, extra_pins, ret;
-   unsigned long flags;
pgoff_t end;
 
VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
@@ -2707,9 +2709,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
unmap_page(head);
VM_BUG_ON_PAGE(compound_mapcount(head), head);
 
-   /* prevent PageLRU to go away from under us, and freeze lru stats */
-   spin_lock_irqsave(&pgdata->lru_lock, flags);
-
+   /* block interrupt reentry in xa_lock and spinlock */
+   local_irq_disable();
if (mapping) {
XA_STATE(xas, &mapping->i_pages, page_index(head));
 
@@ -2739,7 +2740,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
__dec_lruvec_page_state(head, NR_FILE_THPS);
}
 
-   __split_huge_page(page, list, end, flags);
+   __split_huge_page(page, list, end);
ret = 0;
} 

Re: [PATCH v4 2/4] mfd: Support ROHM BD9576MUF and BD9573MUF

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Vaittinen, Matti wrote:

> 
> On Thu, 2020-11-05 at 08:21 +, Lee Jones wrote:
> > On Thu, 05 Nov 2020, Vaittinen, Matti wrote:
> > 
> > > On Thu, 2020-11-05 at 08:46 +0200, Matti Vaittinen wrote:
> > > > Morning Lee,
> > > > 
> > > > > Thanks for taking a look at this :) I see most of the comments being
> > > > > valid. There's two I would like to clarify though...
> > > > 
> > > > On Wed, 2020-11-04 at 15:51 +, Lee Jones wrote:
> > > > > On Wed, 28 Oct 2020, Matti Vaittinen wrote:
> > > > > 
> > > > > > > Add core support for ROHM BD9576MUF and BD9573MUF PMICs which are
> > > > > > > mainly used to power the R-Car series processors.
> > > > > > 
> > > > > > Signed-off-by: Matti Vaittinen <
> > > > > > matti.vaitti...@fi.rohmeurope.com
> > > > > > ---
> > > > > > +   unsigned int chip_type;
> > > > > > +
> > > > > > +   chip_type = (unsigned int)(uintptr_t)
> > > > > > +   of_device_get_match_data(&i2c->dev);
> > > > > 
> > > > > Not overly keen on this casting.
> > > > > 
> > > > > Why not just leave it as (uintptr_t)?
> > > > 
> > > > > I didn't do so because on x86_64 the address width is probably 64 bits
> > > > > whereas the unsigned int is likely to be 32 bits. So the assignment
> > > > > will crop half of the value. It does not really matter as values are
> > > > > small - but I would be surprised if no compilers/analyzers emitted a
> > > > > warning.
> > > > > 
> > > > > I must admit I am not 100% sure though. I sure can change this if you
> > > > > know it better?
> > 
> > What if you used 'long', which I believe changed with the
> > architecture's bus width in Linux?
> 
> I think this is exactly what uintptr_t was created for. To provide type
> which assures a pointer conversion to integer and back works.
> 
> I guess I can change the
> 
> unsigned int chip_type;
> 
> to uintptr_t and get away with single cast if it looks better to you.
> For me the double cast does not look that bad when it allows use of
> native int size variable - but in this case it's really just a matter
> of taste. Both should work fine.

I do see people casting to uintptr and placing the result into a long.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog
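
Editorial sketch: the double cast discussed in this thread can be tried
outside the kernel. The snippet below is a minimal user-space
illustration, not the driver's code: the enum values and the
example_pack()/example_unpack() helpers are hypothetical, and the void
pointer merely stands in for the value that of_device_get_match_data()
would return.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chip identifiers standing in for the driver's match data. */
enum example_chip_type {
	EXAMPLE_CHIP_BD9576 = 1,
	EXAMPLE_CHIP_BD9573 = 2,
};

/* The round trip below relies on uintptr_t being able to hold a pointer. */
static_assert(sizeof(uintptr_t) >= sizeof(void *),
	      "uintptr_t must be able to hold a pointer");

/* Store a small integer in a pointer-sized slot, like a .data field. */
static const void *example_pack(unsigned int chip_type)
{
	return (const void *)(uintptr_t)chip_type;
}

/*
 * Double cast: pointer to uintptr_t (same width, no truncation), then an
 * explicit narrowing to unsigned int so the narrowing is visible to
 * compilers and static analyzers instead of being implicit.
 */
static unsigned int example_unpack(const void *match_data)
{
	return (unsigned int)(uintptr_t)match_data;
}
```

Keeping the variable as uintptr_t (or long, as suggested above) drops the
second cast at the price of a wider variable; for small enumerated values
both forms yield the same number.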


Re: [PATCH v4 2/4] mfd: Support ROHM BD9576MUF and BD9573MUF

2020-11-05 Thread Vaittinen, Matti

On Thu, 2020-11-05 at 08:23 +, Lee Jones wrote:
> On Wed, 04 Nov 2020, Lee Jones wrote:
> 
> > On Wed, 28 Oct 2020, Matti Vaittinen wrote:
> > 
> > > Add core support for ROHM BD9576MUF and BD9573MUF PMICs which are
> > > mainly used to power the R-Car series processors.
> > > 
> > > > Signed-off-by: Matti Vaittinen
> > > ---
> > >  drivers/mfd/Kconfig  |  11 +++
> > >  drivers/mfd/Makefile |   1 +
> > >  drivers/mfd/rohm-bd9576.c| 130
> > > +++
> > >  include/linux/mfd/rohm-bd957x.h  |  59 ++
> > >  include/linux/mfd/rohm-generic.h |   2 +
> > >  5 files changed, 203 insertions(+)
> > >  create mode 100644 drivers/mfd/rohm-bd9576.c
> > >  create mode 100644 include/linux/mfd/rohm-bd957x.h
> 
> [...]
> 
> > > +static const struct regmap_range volatile_ranges[] = {
> > > + {
> > > + .range_min = BD957X_REG_SMRB_ASSERT,
> > > + .range_max = BD957X_REG_SMRB_ASSERT,
> > > + },
> > > + {
> > 
> > The way you space your braces is not consistent.
> > 
> > > + .range_min = BD957X_REG_PMIC_INTERNAL_STAT,
> > > + .range_max = BD957X_REG_PMIC_INTERNAL_STAT,
> > > + },
> > > + {
> > > + .range_min = BD957X_REG_INT_THERM_STAT,
> > > + .range_max = BD957X_REG_INT_THERM_STAT,
> > > + },
> > > + {
> > > + .range_min = BD957X_REG_INT_OVP_STAT,
> > > + .range_max = BD957X_REG_INT_SYS_STAT,
> > > + }, {
> > > + .range_min = BD957X_REG_INT_MAIN_STAT,
> > > + .range_max = BD957X_REG_INT_MAIN_STAT,
> > > + },
> > > +};
> 
> Don't forget about this.
> 
> I would prefer to have the braces on the same line (even if it means
> you have to change an extra line when editing), but I'm not 100% dead
> set on it.  Consistency however, I am.
> 

I won't forget. I intended to write that I was Ok with all the other
comments. Maybe I forgot though. Anyways, I'll fix the inconsistency -
thanks for pointing it out!

--Matti



RE: [PATCH bpf v2 1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

2020-11-05 Thread David Laight
From: Daniel Xu
> Sent: 05 November 2020 02:26
...
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -35,17 +35,22 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src,
>   goto byte_at_a_time;
> 
>   while (max >= sizeof(unsigned long)) {
> - unsigned long c, data;
> + unsigned long c, data, mask, *out;
> 
>   /* Fall back to byte-at-a-time if we get a page fault */
>   unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);

It's not related to this change, but since both addresses
are aligned (checked earlier) a page fault on the word read
is fatal.

David
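
Editorial sketch: one way to read the remark above is that a naturally
aligned word never straddles a page boundary, so if the aligned word read
faults, its first byte sits in the same inaccessible page and a
byte-at-a-time retry would fault right away as well. The user-space check
below illustrates only the arithmetic; the 4096-byte page size and both
helper names are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for illustration; real systems may differ. */
#define EXAMPLE_PAGE_SIZE 4096u

/* The argument only holds because a page is a whole number of words. */
static_assert(EXAMPLE_PAGE_SIZE % sizeof(unsigned long) == 0,
	      "page size must be a multiple of the word size");

/* True if the len bytes starting at addr touch more than one page. */
static bool example_spans_two_pages(uintptr_t addr, size_t len)
{
	return addr / EXAMPLE_PAGE_SIZE != (addr + len - 1) / EXAMPLE_PAGE_SIZE;
}

/* True if any word-aligned word inside [start, end) crosses a page. */
static bool example_any_aligned_word_crosses(uintptr_t start, uintptr_t end)
{
	const size_t word = sizeof(unsigned long);
	uintptr_t addr = (start + word - 1) / word * word; /* round up */

	for (; addr + word <= end; addr += word)
		if (example_spans_two_pages(addr, word))
			return true;
	return false;
}
```

An unaligned read of the same size can cross a boundary, which is why the
alignment check earlier in the function matters for this reasoning.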




Re: [PATCH 35/36] tty: synclink: Mark disposable variables as __always_unused

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 04. 11. 20, 20:35, Lee Jones wrote:
> > Fixes the following W=1 kernel build warning(s):
> > 
> >   drivers/tty/synclink.c: In function ‘usc_reset’:
> >   drivers/tty/synclink.c:5571:6: warning: variable ‘readval’ set but not used [-Wunused-but-set-variable]
> >   drivers/tty/synclink.c: In function ‘mgsl_load_pci_memory’:
> >   drivers/tty/synclink.c:7267:16: warning: variable ‘Dummy’ set but not used [-Wunused-but-set-variable]
> > 
> > Cc: Greg Kroah-Hartman 
> > Cc: Jiri Slaby 
> > Cc: pau...@microgate.com
> > Signed-off-by: Lee Jones 
> > ---
> >   drivers/tty/synclink.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/tty/synclink.c b/drivers/tty/synclink.c
> > index c8324d58ef564..8ed64b1e7c378 100644
> > --- a/drivers/tty/synclink.c
> > +++ b/drivers/tty/synclink.c
> > @@ -5568,7 +5568,7 @@ static void usc_load_txfifo( struct mgsl_struct *info )
> >   static void usc_reset( struct mgsl_struct *info )
> >   {
> > int i;
> > -   u32 readval;
> > +   u32 __always_unused readval;
> 
> The same as in synclinkmp.
> 
> > /* Set BIT30 of Misc Control Register */
> > /* (Local Control Register 0x50) to force reset of USC. */
> > @@ -7264,7 +7264,7 @@ static void mgsl_load_pci_memory( char* TargetPtr, const char* SourcePtr,
> > unsigned short Intervalcount = count / PCI_LOAD_INTERVAL;
> > unsigned short Index;
> > -   unsigned long Dummy;
> > +   unsigned long __always_unused Dummy;
> 
> You can kill it completely.

Great.  Will fix.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH 34/36] tty: serial: pmac_zilog: Make disposable variable __always_unused

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Jiri Slaby wrote:

> On 05. 11. 20, 9:36, Lee Jones wrote:
> > On Thu, 05 Nov 2020, Jiri Slaby wrote:
> > 
> > > On 05. 11. 20, 8:04, Christophe Leroy wrote:
> > > > 
> > > > 
> > > > On 04/11/2020 at 20:35, Lee Jones wrote:
> > > > > Fixes the following W=1 kernel build warning(s):
> > > > > 
> > > > >    drivers/tty/serial/pmac_zilog.h:365:58: warning: variable ‘garbage’ set but not used [-Wunused-but-set-variable]
> > > > 
> > > > Explain how you are fixing this warning.
> > > > 
> > > > Setting  __always_unused is usually not the good solution for fixing
> > > > this warning, but here I guess this is likely the good solution. But it
> > > > should be explained why.
> > 
> > There are normally 3 ways to fix this warning:
> > 
> >   - Start using/checking the variable/result
> >   - Remove the variable
> >   - Mark it as __{always,maybe}_unused
> > 
> > The last just tells the compiler that not checking the resultant
> > value is intentional.  There are some functions (as Jiri mentions
> > below) which are marked as '__must_check' which *require* a dummy
> > (garbage) variable to be used.
> > 
> > > Or, why is the "garbage =" needed in the first place? read_zsdata is not
> > > defined with __warn_unused_result__.
> > 
> > I used '__always_unused' here for fear of breaking something.
> > 
> > However, if it's safe to remove it, then all the better.
> 
> Yes please -- this "garbage" is one of the examples of volatile misuses. If
> readb didn't work on volatile pointer, marking the return variable as
> volatile wouldn't save it.
> 
> > > And even if it was, would (void)!read_zsdata(port) fix it?
> > 
> > That's hideous. :D
> 
> Sure, marking reads as must_check would be insane.
> 
> > *Much* better to just use '__always_unused' in that use-case.
> 
> Then using a dummy variable to fool must_check must mean must_check is used
> incorrectly, no :)? But there are always exceptions…

Agreed on all points.

Will fix.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog
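
Editorial sketch: the three options listed in this thread, side by side in
user-space C. Everything named example_* is hypothetical, and the
"register" is an ordinary variable rather than real hardware. The
attribute spelling is the GCC/Clang one that the kernel's __always_unused
macro expands to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register and a read counter. */
static volatile uint8_t example_reg = 0x5a;
static unsigned int example_reads;

static_assert(sizeof(example_reg) == 1, "the fake register is one byte wide");

/* Reading counts as the side effect we care about. */
static uint8_t example_read_reg(void)
{
	example_reads++;
	return example_reg;
}

/* Option 1: actually use the value that was read. */
static int example_reg_is_ready(void)
{
	return (example_read_reg() & 0x02) != 0;
}

/* Option 2: drop the variable and call purely for the side effect. */
static void example_flush_plain(void)
{
	example_read_reg();
}

/* Option 3: keep the dummy variable but mark it as intentionally unused. */
static void example_flush_marked(void)
{
	uint8_t garbage __attribute__((unused));

	garbage = example_read_reg();
}
```

Option 2 is what the thread settles on for this driver; option 3 mostly
earns its keep when the callee is marked __must_check and the result
really is irrelevant.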


Re: [PATCH v5 06/17] virt: acrn: Introduce VM management interfaces

2020-11-05 Thread Shuo A Liu

On Thu  5.Nov'20 at  9:26:39 +0100, Greg Kroah-Hartman wrote:
> On Thu, Nov 05, 2020 at 03:35:45PM +0800, Shuo A Liu wrote:
> > On Thu  5.Nov'20 at  7:29:07 +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Nov 05, 2020 at 11:10:29AM +0800, Shuo A Liu wrote:
> > > > On Wed  4.Nov'20 at 20:02:35 +0100, Greg Kroah-Hartman wrote:
> > > > > On Mon, Oct 19, 2020 at 02:17:52PM +0800, shuo.a@intel.com wrote:
> > > > > > --- /dev/null
> > > > > > +++ b/include/uapi/linux/acrn.h
> > > > > > @@ -0,0 +1,56 @@
> > > > > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > > > > > +/*
> > > > > > + * Userspace interface for /dev/acrn_hsm - ACRN Hypervisor Service Module
> > > > > > + *
> > > > > > + * This file can be used by applications that need to communicate with the HSM
> > > > > > + * via the ioctl interface.
> > > > > > + */
> > > > > > +
> > > > > > +#ifndef _UAPI_ACRN_H
> > > > > > +#define _UAPI_ACRN_H
> > > > > > +
> > > > > > +#include 
> > > > > > +
> > > > > > +/**
> > > > > > + * struct acrn_vm_creation - Info to create a User VM
> > > > > > + * @vmid:  User VM ID returned from the hypervisor
> > > > > > + * @reserved0: Reserved
> > > > > > + * @vcpu_num:  Number of vCPU in the VM. Return from hypervisor.
> > > > > > + * @reserved1: Reserved
> > > > > > + * @uuid:  UUID of the VM. Pass to hypervisor directly.
> > > > > > + * @vm_flag:   Flag of the VM creating. Pass to hypervisor directly.
> > > > > > + * @ioreq_buf: Service VM GPA of I/O request buffer. Pass to
> > > > > > + * hypervisor directly.
> > > > > > + * @cpu_affinity:  CPU affinity of the VM. Pass to hypervisor directly.
> > > > > > + * @reserved2: Reserved
> > > > >
> > > > > Reserved and must be 0?
> > > >
> > > > Not a must.
> > >
> > > That's guaranteed to come back and bite you in the end.
> >
> > OK. I can fill them with zero before passing them to hypervisor.
> >
> > > You all have read the "how to write a good api" document, right?
> >
> > Is it Documentation/driver-api/ioctl.rst? Or i missed..
>
> That's one good document, but no, not what I was referring to.  I was
> thinking of Documentation/process/adding-syscalls.rst, which is what you
> are doing here implicitly with these new ioctls (every ioctl is a brand
> new syscall.)

I will read it as well. Thanks.

> > > > > What are they reserved for?
> > > > >
> > > > > Same for all of the reserved fields, why?
> > > >
> > > > Some reserved fields are to map layout in the hypervisor side, others
> > > > are for future use.
> > >
> > > ioctls should not have these, again, please read the documentation.  If
> > > you need something new in the future, just make a new ioctl.
> >
> > OK. I will remove some reserved fields for scalability.
>
> "scalability" should have nothing to do with any of this, right?  What
> am I missing?

Sorry, i meant reserved fields for future use.

> > Though i can
> > keep some reserved fields for alignment (and to keep same data structure
> > layout with the hypervisor), right?
> > Documentation/driver-api/ioctl.rst says that explicit reserved fields
> > could be used.
>
> If you need alignment, yes, that is fine, but that's not what you are
> saying these are for.  And if you need alignment, why not move things
> around so they are properly aligned.
>
> And this structure has nothing to do with the hypervisor structure,
> that's a internal-kernel structure, not a userspace-visable thing if you
> are doing things correctly.

It's the same structure with the one in hypervisor. HSM driver
doesn't maintain the VM much, it just pass the data for VM creation from
userspace to hypervisor.

> As an example of all of this type of review and conversation, please
> refer to the review of the recent nitro_enclaves code that got merged.
> All of the discussions there about ioctls are also relevant here.

I will. Thanks very much.

Thanks
shuo
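
Editorial sketch: the "reserved and must be 0" point from the review can
be shown with a small user-space example. The structure below is
hypothetical and deliberately not the ACRN layout; the field names, the
flag mask and example_validate() are assumptions made for illustration.
It shows explicit padding that keeps the layout free of hidden holes, plus
a check that rejects non-zero reserved fields and unknown flag bits so
that they can be given a meaning later without breaking old callers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument; NOT the real acrn_vm_creation layout. */
struct example_vm_args {
	uint16_t vmid;
	uint16_t reserved0; /* explicit padding, must be zero */
	uint32_t vcpu_num;
	uint64_t flags;
};

/* Every byte is covered by a named field, so there are no hidden holes. */
static_assert(sizeof(struct example_vm_args) == 16, "unexpected padding");
static_assert(offsetof(struct example_vm_args, flags) == 8,
	      "flags must sit on an 8-byte boundary");

/* Flag bits this hypothetical interface currently understands. */
#define EXAMPLE_KNOWN_FLAGS UINT64_C(0x3)

/* Reject anything the current interface does not define. */
static int example_validate(const struct example_vm_args *args)
{
	if (args->reserved0 != 0)
		return -EINVAL;
	if (args->flags & ~EXAMPLE_KNOWN_FLAGS)
		return -EINVAL;
	return 0;
}
```

If reserved fields are accepted with arbitrary contents, user space may
start passing garbage in them and the fields can never be reused, which
is one reading of the "come back and bite you" warning above.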


Re: Kernel 5.10-rc1 not mounting NAND flash (Bisected to d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits"))

2020-11-05 Thread Christophe Leroy



Quoting Miquel Raynal :


> Hi Christophe,
>
> Christophe Leroy wrote on Wed, 4 Nov 2020 19:37:57 +0100:
>
>> Hi Miquel,
>>
>> On 04/11/2020 at 18:38, Miquel Raynal wrote:
>>> Hi Christophe,
>>>
>>> Christophe Leroy wrote on Wed, 04 Nov 2020 18:33:53 +0100:
>>>
>>>> Hi Miquel,
>>>>
>>>> I'm unable to boot 5.10-rc1 on my boards. I get the following error:
>>>>
>>>> [4.125811] nand: device found, Manufacturer ID: 0xad, Chip ID: 0x76
>>>> [4.131992] nand: Hynix NAND 64MiB 3,3V 8-bit
>>>> [4.136173] nand: 64 MiB, SLC, erase size: 16 KiB, page size: 512, OOB size: 16
>>>> [4.143534] [ cut here ]
>>>> [4.147934] Unsupported ECC algorithm!
>>>> [4.152142] WARNING: CPU: 0 PID: 1 at drivers/mtd/nand/raw/nand_base.c:5244 nand_scan_with_ids+0x1260/0x1640
>>>> ...
>>>> [4.332052] ---[ end trace e3a36f62cae4ac56 ]---
>>>> [4.336882] gpio-nand: probe of c000.nand failed with error -22
>>>>
>>>> Bisected to commit d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits")
>>>>
>>>> My first impression is that with that change, the value set in chip->ecc.algo
>>>> by gpio_nand_probe() in drivers/mtd/nand/raw/gpio.c gets overwritten in rawnand_dt_init()
>>>>
>>>> The following change fixes the problem, though I'm not sure it is the right fix. Can you have a look ?
>>>>
>>>> diff --git a/drivers/mtd/nand/raw/nand_base.c b/drivers/mtd/nand/raw/nand_base.c
>>>> index 1f0d542d5923..aa74797cf2da 100644
>>>> --- a/drivers/mtd/nand/raw/nand_base.c
>>>> +++ b/drivers/mtd/nand/raw/nand_base.c
>>>> @@ -5032,7 +5032,8 @@ static int rawnand_dt_init(struct nand_chip *chip)
>>>>    chip->ecc.engine_type = nand->ecc.defaults.engine_type;
>>>>
>>>>    chip->ecc.placement = nand->ecc.user_conf.placement;
>>>> -  chip->ecc.algo = nand->ecc.user_conf.algo;
>>>> +  if (chip->ecc.algo == NAND_ECC_ALGO_UNKNOWN)
>>>> +  chip->ecc.algo = nand->ecc.user_conf.algo;
>>>>    chip->ecc.strength = nand->ecc.user_conf.strength;
>>>>    chip->ecc.size = nand->ecc.user_conf.step_size;
>>>>
>>>> ---
>>>>
>>>> Thanks
>>>> Christophe
>>>
>>> Sorry for introducing this issue, I didn't have the time to send the
>>> Fixes PR yet but I think this issue has been solved already. Could
>>> you please try with a recent linux-next?
>>>
>>
>> Sorry, same problem with "Linux version 5.10.0-rc2-next-20201104"
>
> Can you please give this patch a try?
>
> ---8<---
>
> Author: Miquel Raynal 
> Date:   Thu Nov 5 08:44:48 2020 +0100
>
> mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()
>
> While forcing a Hamming software ECC looks clearly wrong, let's just
> fix the situation for now and move these lines to the ->attach_chip()
> hook which gets executed after the user input parsing and NAND chip
> discovery.
>
> Fixes: d7157ff49a5b ("mtd: rawnand: Use the ECC framework user input parsing bits")
> Signed-off-by: Miquel Raynal 
>
> diff --git a/drivers/mtd/nand/raw/gpio.c b/drivers/mtd/nand/raw/gpio.c
> index 3bd847ccc3f3..6feab847f5e0 100644
> --- a/drivers/mtd/nand/raw/gpio.c
> +++ b/drivers/mtd/nand/raw/gpio.c
> @@ -161,8 +161,15 @@ static int gpio_nand_exec_op(struct nand_chip *chip,
> return ret;
>  }
>
> +static int gpio_nand_attach_chip(struct nand_chip *chip)
> +{
> +   chip->ecc.mode = NAND_ECC_SOFT;
> +   chip->ecc.algo = NAND_ECC_HAMMING;
> +}
> +
>  static const struct nand_controller_ops gpio_nand_ops = {
> .exec_op = gpio_nand_exec_op,
> +   .attach_chip = gpio_nand_attach_chip,
>  };
>
>  #ifdef CONFIG_OF
> @@ -342,8 +349,6 @@ static int gpio_nand_probe(struct platform_device *pdev)
> gpiomtd->base.ops = &gpio_nand_ops;
>
> nand_set_flash_node(chip, pdev->dev.of_node);
> -   chip->ecc.mode  = NAND_ECC_SOFT;
> -   chip->ecc.algo  = NAND_ECC_HAMMING;
> chip->options   = gpiomtd->plat.options;
> chip->controller= &gpiomtd->base;


Works with the following:

diff --git a/drivers/mtd/nand/raw/gpio.c b/drivers/mtd/nand/raw/gpio.c
index 4ec0a1e10867..66d3f1eb788c 100644
--- a/drivers/mtd/nand/raw/gpio.c
+++ b/drivers/mtd/nand/raw/gpio.c
@@ -161,8 +161,17 @@ static int gpio_nand_exec_op(struct nand_chip *chip,
return ret;
 }

+static int gpio_nand_attach_chip(struct nand_chip *chip)
+{
+   chip->ecc.engine_type= NAND_ECC_ENGINE_TYPE_SOFT;
+   chip->ecc.algo   = NAND_ECC_ALGO_HAMMING;
+
+   return 0;
+}
+
 static const struct nand_controller_ops gpio_nand_ops = {
.exec_op = gpio_nand_exec_op,
+   .attach_chip = gpio_nand_attach_chip,
 };

 #ifdef CONFIG_OF
@@ -342,8 +351,6 @@ static int gpio_nand_probe(struct platform_device *pdev)
gpiomtd->base.ops = &gpio_nand_ops;

nand_set_flash_node(chip, pdev->dev.of_node);
-   chip->ecc.engine_type= NAND_ECC_ENGINE_TYPE_SOFT;
-   chip->ecc.algo   = NAND_ECC_ALGO_HAMMING;
chip->options= gpiomtd->plat.options;
chip->controller = &gpiomtd->base;

---
Christophe


[PATCH] habanalabs: move HW dirty check to a proper location

2020-11-05 Thread Oded Gabbay
From: Ofir Bitton 

Driver must verify if HW is dirty before trying to fetch preboot
information. Hence, we move this validation to a prior stage of
the boot sequence.

Signed-off-by: Ofir Bitton 
Reviewed-by: Oded Gabbay 
Signed-off-by: Oded Gabbay 
---
 drivers/misc/habanalabs/common/device.c   |  7 
 drivers/misc/habanalabs/common/habanalabs.h   |  6 +--
 .../misc/habanalabs/common/habanalabs_drv.c   | 12 +++---
 drivers/misc/habanalabs/common/pci.c  | 23 +--
 drivers/misc/habanalabs/gaudi/gaudi.c | 35 -
 drivers/misc/habanalabs/goya/goya.c   | 38 ++-
 6 files changed, 63 insertions(+), 58 deletions(-)

diff --git a/drivers/misc/habanalabs/common/device.c b/drivers/misc/habanalabs/common/device.c
index 59308a612b36..348faf31668b 100644
--- a/drivers/misc/habanalabs/common/device.c
+++ b/drivers/misc/habanalabs/common/device.c
@@ -1278,13 +1278,6 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
 
hl_debugfs_add_device(hdev);
 
-   if (hdev->asic_funcs->get_hw_state(hdev) == HL_DEVICE_HW_STATE_DIRTY) {
-   dev_info(hdev->dev,
-   "H/W state is dirty, must reset before initializing\n");
-   hdev->asic_funcs->halt_engines(hdev, true);
-   hdev->asic_funcs->hw_fini(hdev, true);
-   }
-
/*
 * From this point, in case of an error, add char devices and create
 * sysfs nodes as part of the error flow, to allow debugging.
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index 9c7594d0ca07..42988f12fb00 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -917,7 +917,6 @@ struct hl_asic_funcs {
size_t max_size);
int (*send_cpu_message)(struct hl_device *hdev, u32 *msg,
u16 len, u32 timeout, long *result);
-   enum hl_device_hw_state (*get_hw_state)(struct hl_device *hdev);
int (*pci_bars_map)(struct hl_device *hdev);
int (*init_iatu)(struct hl_device *hdev);
u32 (*rreg)(struct hl_device *hdev, u32 reg);
@@ -1901,6 +1900,7 @@ struct hl_device {
u8  hard_reset_on_fw_events;
u8  bmc_enable;
u8  rl_enable;
+   u8  reset_on_preboot_fail;
 };
 
 
@@ -2148,9 +2148,7 @@ int hl_pci_set_inbound_region(struct hl_device *hdev, u8 region,
struct hl_inbound_pci_region *pci_region);
 int hl_pci_set_outbound_region(struct hl_device *hdev,
struct hl_outbound_pci_region *pci_region);
-int hl_pci_init(struct hl_device *hdev, u32 cpu_boot_status_reg,
-   u32 cpu_security_boot_status_reg, u32 boot_err0_reg,
-   u32 preboot_ver_timeout);
+int hl_pci_init(struct hl_device *hdev);
 void hl_pci_fini(struct hl_device *hdev);
 
 long hl_get_frequency(struct hl_device *hdev, u32 pll_index, bool curr);
diff --git a/drivers/misc/habanalabs/common/habanalabs_drv.c b/drivers/misc/habanalabs/common/habanalabs_drv.c
index aac798f3296e..6bbb6bca6860 100644
--- a/drivers/misc/habanalabs/common/habanalabs_drv.c
+++ b/drivers/misc/habanalabs/common/habanalabs_drv.c
@@ -234,20 +234,20 @@ int hl_device_open_ctrl(struct inode *inode, struct file *filp)
 
 static void set_driver_behavior_per_device(struct hl_device *hdev)
 {
-   hdev->mmu_enable = 1;
hdev->cpu_enable = 1;
-   hdev->fw_loading = 1;
+   hdev->fw_loading = FW_TYPE_ALL_TYPES;
hdev->cpu_queues_enable = 1;
hdev->heartbeat = 1;
+   hdev->mmu_enable = 1;
hdev->clock_gating_mask = ULONG_MAX;
-
-   hdev->reset_pcilink = 0;
-   hdev->axi_drain = 0;
hdev->sram_scrambler_enable = 1;
hdev->dram_scrambler_enable = 1;
hdev->bmc_enable = 1;
hdev->hard_reset_on_fw_events = 1;
-   hdev->fw_loading = FW_TYPE_ALL_TYPES;
+   hdev->reset_on_preboot_fail = 1;
+
+   hdev->reset_pcilink = 0;
+   hdev->axi_drain = 0;
 }
 
 /*
diff --git a/drivers/misc/habanalabs/common/pci.c b/drivers/misc/habanalabs/common/pci.c
index 02152d85cf19..923b2606e29f 100644
--- a/drivers/misc/habanalabs/common/pci.c
+++ b/drivers/misc/habanalabs/common/pci.c
@@ -338,20 +338,12 @@ static int hl_pci_set_dma_mask(struct hl_device *hdev)
 /**
  * hl_pci_init() - PCI initialization code.
  * @hdev: Pointer to hl_device structure.
- * @cpu_boot_status_reg: status register of the device's CPU
- * @cpu_security_boot_status_reg: status register of device's CPU security
- *configuration
- * @boot_err0_reg: boot error register of the device's CPU
- * @preboot_ver_timeout: how much to wait before bailing out on reading
- *   the preboot version
  *
  * Set DMA masks, initialize the PCI controller and map t

Regression: QCA6390 fails with "mm/page_alloc: place pages to tail in __free_pages_core()"

2020-11-05 Thread Kalle Valo
(changing the subject, adding more lists and people)

Pavel Procopiuc  writes:

> Op 04.11.2020 om 10:12 schreef Kalle Valo:
>> Yeah, it is unfortunately time consuming but it is the best way to get
>> bottom of this.
>
> I have found the commit that breaks things for me, it's
> 7fef431be9c9ac255838a9578331567b9dba4477 mm/page_alloc: place pages to
> tail in __free_pages_core()
>
> I've reverted it on top of the 5.10-rc2 and ath11k driver loads fine
> and I have wifi working.

Oh, very interesting. Thanks a lot for the bisection, otherwise we would
have never found out what's causing this.

David & mm folks: Pavel noticed that his QCA6390 Wi-Fi 6 device (driver
ath11k) failed on v5.10-rc1. After bisecting he found that the commit
below causes the regression. I have not been able to reproduce this and
for me QCA6390 works fine. I don't know if this needs a specific kernel
configuration or what's the difference between our setups.

Any ideas what might cause this and how to fix it?

Full discussion: 
http://lists.infradead.org/pipermail/ath11k/2020-November/000501.html

commit 7fef431be9c9ac255838a9578331567b9dba4477
Author: David Hildenbrand 
AuthorDate: Thu Oct 15 20:09:35 2020 -0700
Commit: Linus Torvalds 
CommitDate: Fri Oct 16 11:11:18 2020 -0700

mm/page_alloc: place pages to tail in __free_pages_core()

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


Re: [PATCH v3 1/6] MIPS: Loongson64: Do not write the read only field LPA of CP0_CONFIG3

2020-11-05 Thread Tiezhu Yang

On 11/05/2020 01:57 PM, Huacai Chen wrote:

Hi, Tiezhu,

On Wed, Nov 4, 2020 at 11:51 AM Tiezhu Yang  wrote:

On 11/04/2020 10:00 AM, Huacai Chen wrote:

Hi, Tiezhu,

On Tue, Nov 3, 2020 at 3:13 PM Tiezhu Yang  wrote:

The field LPA of the CP0_CONFIG3 register is read-only on Loongson64, so the
write operations are meaningless; remove them.

Signed-off-by: Tiezhu Yang 
---

v2: No changes
v3: No changes

   arch/mips/include/asm/mach-loongson64/kernel-entry-init.h | 8 
   arch/mips/loongson64/numa.c   | 3 ---
   2 files changed, 11 deletions(-)

diff --git a/arch/mips/include/asm/mach-loongson64/kernel-entry-init.h b/arch/mips/include/asm/mach-loongson64/kernel-entry-init.h
index 87a5bfb..e4d77f4 100644
--- a/arch/mips/include/asm/mach-loongson64/kernel-entry-init.h
+++ b/arch/mips/include/asm/mach-loongson64/kernel-entry-init.h
@@ -19,10 +19,6 @@
  .macro  kernel_entry_setup
  .set    push
  .set    mips64
-   /* Set LPA on LOONGSON3 config3 */
-   mfc0    t0, CP0_CONFIG3
-   or  t0, (0x1 << 7)
-   mtc0    t0, CP0_CONFIG3

Sorry for the late response, I have the same worry as Jiaxun. As you
know, Loongson's user manuals are not always correct, but the original
code that comes from Loongson is usually better. So, my opinion is "Don't
change it if it doesn't break anything".

Hi Huacai,

Thanks for your reply. I have confirmed with the Loongson user manuals and
the hardware designers that the CP0_CONFIG3 register is read-only.

Without this patch, the related kernel code is meaningless; with
this patch, the code reflects reality.

Thanks,
Tiezhu

Then you should at least test your code on a Loongson-3A R1 two-way machine.


Hi Huacai,

Thanks for your opinion.

I found a 3A1000 machine to test on. The result is that CP0 Config3
is read-only, which is consistent with the user manual: the LPA field
of CP0 Config3 cannot be written and its reset default value is 1.

So this patch has no problem.

Thanks,
Tiezhu



Huacai

Huacai


  /* Set ELPA on LOONGSON3 pagegrain */
  mfc0    t0, CP0_PAGEGRAIN
  or  t0, (0x1 << 29)
@@ -54,10 +50,6 @@
  .macro  smp_slave_setup
  .set    push
  .set    mips64
-   /* Set LPA on LOONGSON3 config3 */
-   mfc0    t0, CP0_CONFIG3
-   or  t0, (0x1 << 7)
-   mtc0    t0, CP0_CONFIG3
  /* Set ELPA on LOONGSON3 pagegrain */
  mfc0    t0, CP0_PAGEGRAIN
  or  t0, (0x1 << 29)
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index cf9459f..c7e3cced 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -40,9 +40,6 @@ static void enable_lpa(void)
  unsigned long value;

  value = __read_32bit_c0_register($16, 3);
-   value |= 0x0080;
-   __write_32bit_c0_register($16, 3, value);
-   value = __read_32bit_c0_register($16, 3);
  pr_info("CP0_Config3: CP0 16.3 (0x%lx)\n", value);

  value = __read_32bit_c0_register($5, 1);
--
2.1.0
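
The test described in this thread (write the LPA bit, read the register back, see that nothing changed) can be illustrated in plain userspace C. This is only a sketch under stated assumptions: struct model_reg, model_read(), model_write() and field_is_writable() are hypothetical names that model a register in software; they are not kernel or MIPS APIs and do not touch CP0. The only value taken from the patch is the LPA bit, (0x1 << 7).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONFIG3_LPA (UINT32_C(1) << 7) /* bit the removed code tried to set */

/* Hypothetical software model of a register with read-only bits. */
struct model_reg {
	uint32_t value;   /* current contents */
	uint32_t ro_mask; /* bits that ignore writes */
};

static uint32_t model_read(const struct model_reg *reg)
{
	assert(reg);
	return reg->value;
}

/* A write only lands in bits that are not marked read-only. */
static void model_write(struct model_reg *reg, uint32_t val)
{
	assert(reg);
	reg->value = (reg->value & reg->ro_mask) | (val & ~reg->ro_mask);
}

/* Flip the given bits, read back, restore; report whether anything changed. */
static bool field_is_writable(struct model_reg *reg, uint32_t mask)
{
	uint32_t before = model_read(reg);
	bool changed;

	model_write(reg, before ^ mask);
	changed = model_read(reg) != before;
	model_write(reg, before);
	return changed;
}
```

With a register modelled as fully read-only and LPA set at reset, field_is_writable() reports false for CONFIG3_LPA, which is the situation the 3A1000 test describes: the write sequence has no effect, so dropping it should not change behaviour.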





Re: [PATCH v2 4/4] soc: imx8m: change to use platform driver

2020-11-05 Thread Krzysztof Kozlowski
On Thu, Nov 05, 2020 at 03:26:29PM +0800, Alice Guo wrote:
> Directly reading the ocotp register depends on the bootloader enabling the
> ocotp clk, which is not always the case, so change to use the nvmem API.
> Using the nvmem API requires supporting deferred probe, and thus change
> soc-imx8m.c to use a platform driver.
> 
> The other reason is that directly reading the ocotp register causes a kexec
> kernel hang, because the first kernel disables unused clks after it boots
> up, and so the ocotp clk will be disabled even if the bootloader enabled
> it. When kexec'ing a kernel, the ocotp clk needs to be enabled before
> reading the ocotp registers, and the nvmem API with platform driver
> support can accomplish this.
> 
> Signed-off-by: Alice Guo 
> ---
>  drivers/soc/imx/soc-imx8m.c | 75 +
>  1 file changed, 42 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/soc/imx/soc-imx8m.c b/drivers/soc/imx/soc-imx8m.c
> index cc57a384d74d..83f3297509be 100644
> --- a/drivers/soc/imx/soc-imx8m.c
> +++ b/drivers/soc/imx/soc-imx8m.c
> @@ -5,6 +5,8 @@
> 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -29,7 +31,7 @@
> 
>  struct imx8_soc_data {
>   char *name;
> - u32 (*soc_revision)(void);
> + u32 (*soc_revision)(struct device *dev);
>  };
> 
>  static u64 soc_uid;
> @@ -50,12 +52,15 @@ static u32 imx8mq_soc_revision_from_atf(void)
>  static inline u32 imx8mq_soc_revision_from_atf(void) { return 0; };
>  #endif
> 
> -static u32 __init imx8mq_soc_revision(void)
> +static u32 __init imx8mm_soc_uid(struct device *dev);
> +
> +static u32 __init imx8mq_soc_revision(struct device *dev)
>  {
>   struct device_node *np;
>   void __iomem *ocotp_base;
>   u32 magic;
>   u32 rev;
> + int ret = 0;
> 
>   np = of_find_compatible_node(NULL, NULL, "fsl,imx8mq-ocotp");
>   if (!np)
> @@ -75,9 +80,9 @@ static u32 __init imx8mq_soc_revision(void)
>   rev = REV_B1;
>   }
> 
> - soc_uid = readl_relaxed(ocotp_base + OCOTP_UID_HIGH);
> - soc_uid <<= 32;
> - soc_uid |= readl_relaxed(ocotp_base + OCOTP_UID_LOW);
> + ret = imx8mm_soc_uid(dev);
> + if (ret)
> + return ret;
> 
>   iounmap(ocotp_base);
>   of_node_put(np);
> @@ -85,33 +90,20 @@ static u32 __init imx8mq_soc_revision(void)
>   return rev;
>  }
> 
> -static void __init imx8mm_soc_uid(void)
> +static u32 __init imx8mm_soc_uid(struct device *dev)
>  {
> - void __iomem *ocotp_base;
> - struct device_node *np;
> - u32 offset = of_machine_is_compatible("fsl,imx8mp") ?
> -  IMX8MP_OCOTP_UID_OFFSET : 0;
> -
> - np = of_find_compatible_node(NULL, NULL, "fsl,imx8mm-ocotp");
> - if (!np)
> - return;
> -
> - ocotp_base = of_iomap(np, 0);
> - WARN_ON(!ocotp_base);
> -
> - soc_uid = readl_relaxed(ocotp_base + OCOTP_UID_HIGH + offset);
> - soc_uid <<= 32;
> - soc_uid |= readl_relaxed(ocotp_base + OCOTP_UID_LOW + offset);
> + int ret = 0;
> 
> - iounmap(ocotp_base);
> - of_node_put(np);
> + ret = nvmem_cell_read_u64(dev, "soc_unique_id", &soc_uid);
> + return ret;
>  }
> 
> -static u32 __init imx8mm_soc_revision(void)
> +static u32 __init imx8mm_soc_revision(struct device *dev)
>  {
>   struct device_node *np;
>   void __iomem *anatop_base;
>   u32 rev;
> + int ret;
> 
>   np = of_find_compatible_node(NULL, NULL, "fsl,imx8mm-anatop");
>   if (!np)
> @@ -125,7 +117,9 @@ static u32 __init imx8mm_soc_revision(void)
>   iounmap(anatop_base);
>   of_node_put(np);
> 
> - imx8mm_soc_uid();
> + ret = imx8mm_soc_uid(dev);
> + if (ret)
> + return ret;

I think this breaks old existing DTBs. If applied on a separate branch
from the DTS patches, it will cause bisect regressions. Regardless of that,
all out-of-tree (customer) DTBs will start failing here as well.

Best regards,
Krzysztof


> 
>   return rev;
>  }
> @@ -151,19 +145,20 @@ static const struct imx8_soc_data imx8mp_soc_data = {
>  };
> 
>  static __maybe_unused const struct of_device_id imx8_soc_match[] = {
> - { .compatible = "fsl,imx8mq", .data = &imx8mq_soc_data, },
> - { .compatible = "fsl,imx8mm", .data = &imx8mm_soc_data, },
> - { .compatible = "fsl,imx8mn", .data = &imx8mn_soc_data, },
> - { .compatible = "fsl,imx8mp", .data = &imx8mp_soc_data, },
> + { .compatible = "fsl,imx8mq-soc", .data = &imx8mq_soc_data, },
> + { .compatible = "fsl,imx8mm-soc", .data = &imx8mm_soc_data, },
> + { .compatible = "fsl,imx8mn-soc", .data = &imx8mn_soc_data, },
> + { .compatible = "fsl,imx8mp-soc", .data = &imx8mp_soc_data, },
>   { }
>  };
> +MODULE_DEVICE_TABLE(of, imx8_soc_match);
> 
>  #define imx8_revision(soc_rev) \
>   soc_rev ? \
>   kasprintf(GFP_KERNEL, "%d.%d", (soc_rev >> 4) & 0xf,  soc_rev & 0xf) : \
>   "unknown"
> 
> -static int __init imx8_soc_init(void)
> +static int imx8_soc_init_prob

Re: [PATCH] clk-si5341: Support NVM programming through sysfs

2020-11-05 Thread Mike Looijmans



Met vriendelijke groet / kind regards,

Mike Looijmans
System Expert


TOPIC Embedded Products B.V.
Materiaalweg 4, 5681 RJ Best
The Netherlands

T: +31 (0) 499 33 69 69
E: mike.looijm...@topicproducts.com
W: www.topicproducts.com

On 05-11-2020 02:48, Stephen Boyd wrote:

Quoting Mike Looijmans (2020-11-03 06:17:41)

Export an attribute program_nvm_bank that when read reports the current
bank value. To program the chip's current state into NVM, write the
magic value 0xC7 into this attribute.

Signed-off-by: Mike Looijmans 
---


Any chance this can be done through the nvmem framework?


This part doesn't fit. The purpose is to store the current state of the clock 
chip into its non-volatile storage so it boots up with that configuration the 
next POR. Main use case is that some vendors initialize PLLs only in a 
bootloader and thus need the clock running at boot. Or it might just be to 
save on that 300ms initialization time.


Having said that, the clock chip does have some "scratch" areas that'd be 
useful as NVMEM storage. That'd be for a separate patch.


For this device to be NVMEM compatible, nvmem would need to have a sort of 
transaction model, where you write several values and then "commit" them all 
to NVM in one call. The nvmem framework wasn't intended for that I think.





  drivers/clk/clk-si5341.c | 73 
  1 file changed, 73 insertions(+)

diff --git a/drivers/clk/clk-si5341.c b/drivers/clk/clk-si5341.c
index e0446e66fa64..4e025a5ea2b7 100644
--- a/drivers/clk/clk-si5341.c
+++ b/drivers/clk/clk-si5341.c
@@ -1199,6 +1205,69 @@ static const struct regmap_config si5341_regmap_config = {
 .volatile_table = &si5341_regmap_volatile,
  };
  
+static ssize_t program_nvm_bank_show(struct device *dev,

+   struct device_attribute *attr, char *buf)
+{
+   struct i2c_client *client = to_i2c_client(dev);
+   struct clk_si5341 *data = i2c_get_clientdata(client);
+   unsigned int regval;
+   int ret;
+
+   ret = regmap_read(data->regmap, SI5341_ACTIVE_NVM_BANK, &regval);
+   if (ret)
+   return ret;
+
+   return sprintf(buf, "%#x\n", regval);
+}
+
+static ssize_t program_nvm_bank_store(struct device *dev,
+   struct device_attribute *attr,
+   const char *buf,
+   size_t count)
+{
+   struct clk_si5341 *data = i2c_get_clientdata(to_i2c_client(dev));
+   int ret;
+   unsigned int value;
+   unsigned int timeout;
+
+   ret = kstrtouint(buf, 0, &value);
+   if (ret < 0)
+   return ret;
+
+   /* Write the magic value to this attribute to program the NVM */
+   if (value != SI5341_SI5341_NVM_WRITE_COOKIE)
+   return -EINVAL;
+
+   ret = regmap_write(data->regmap, SI5341_NVM_WRITE,
+   SI5341_SI5341_NVM_WRITE_COOKIE);
+   if (ret)
+   return ret;
+
+   /* Wait for SI5341_DEVICE_READY register to become 0x0f */
+   for (timeout = 1; timeout; --timeout) {
+   ret = regmap_read(data->regmap, SI5341_DEVICE_READY, &value);


This is regmap_read_poll_timeout()?


Yes, indeed.




+   if (ret)
+   return ret;
+
+   if (value == 0x0f)
+   break;
+   }
+
+   return count;
+}
+
+static DEVICE_ATTR_RW(program_nvm_bank);
+
+static struct attribute *si5341_sysfs_entries[] = {
+   &dev_attr_program_nvm_bank.attr,
+   NULL,
+};
+
+static struct attribute_group si5341_attr_group = {
+   .name   = NULL, /* put in device directory */
+   .attrs  = si5341_sysfs_entries,
+};


If not nvmem framework, then this needs to be documented in
Documentation/ABI/


Okay, will do.





+
  static int si5341_dt_parse_dt(struct i2c_client *client,
 struct clk_si5341_output_config *config)
  {
@@ -1544,6 +1613,10 @@ static int si5341_probe(struct i2c_client *client,
 for (i = 0; i < data->num_synth; ++i)
  devm_kfree(&client->dev, (void *)synth_clock_names[i]);
  
+   err = sysfs_create_group(&client->dev.kobj, &si5341_attr_group);

+   if (err)
+   dev_err(&client->dev, "failed to create sysfs entries\n");
+


Cool, I as a user would do what in this situation? The error message
seems sort of worthless.



It's not critical for the driver to be able to register this. So I could just 
silently ignore the error.


M.
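
For reference, the shape of the bounded polling that the review asks for (regmap_read_poll_timeout() in the kernel) can be sketched in plain userspace C. This is an illustration under stated assumptions, not the driver change: poll_device_ready(), read_fn, struct fake_dev and fake_read() are hypothetical names, the read callback stands in for regmap_read(), and the sketch counts reads instead of sleeping between them as the kernel macro does. The ready value 0x0f is the one from the patch comment.

```c
#include <assert.h>
#include <errno.h>

#define DEVICE_READY_VALUE 0x0f /* value the patch waits for */

/* Hypothetical read callback standing in for regmap_read(). */
typedef int (*read_fn)(void *ctx, unsigned int *val);

/*
 * Poll until the register reads DEVICE_READY_VALUE.
 * Returns 0 on success, the read error if a read fails,
 * or -ETIMEDOUT once max_tries reads have been used up.
 */
static int poll_device_ready(read_fn read, void *ctx, unsigned int max_tries)
{
	unsigned int val;
	int ret;

	assert(read);
	while (max_tries--) {
		ret = read(ctx, &val);
		if (ret)
			return ret;
		if (val == DEVICE_READY_VALUE)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Hypothetical fake device: reports ready after a number of reads. */
struct fake_dev {
	unsigned int reads_until_ready;
};

static int fake_read(void *ctx, unsigned int *val)
{
	struct fake_dev *dev = ctx;

	if (dev->reads_until_ready > 0) {
		dev->reads_until_ready--;
		*val = 0;
	} else {
		*val = DEVICE_READY_VALUE;
	}
	return 0;
}
```

One design point the sketch makes explicit: running out of attempts is reported as -ETIMEDOUT instead of falling through to the success path, so a caller can tell "device became ready" apart from "gave up waiting".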


Re: linux-next: build failure after merge of the mfd tree

2020-11-05 Thread Lee Jones
On Thu, 05 Nov 2020, Michał Mirosław wrote:

> On Thu, Nov 05, 2020 at 12:50:27PM +1100, Stephen Rothwell wrote:
> > Hi all,
> > 
> > After merging the mfd tree, today's linux-next build (arm
> > multi_v7_defconfig) failed like this:
> > 
> > drivers/gpio/gpio-tps65910.c: In function 'tps65910_gpio_get':
> > drivers/gpio/gpio-tps65910.c:31:2: error: implicit declaration of function 'tps65910_reg_read' [-Werror=implicit-function-declaration]
> >31 |  tps65910_reg_read(tps65910, TPS65910_GPIO0 + offset, &val);
> >   |  ^
> > drivers/gpio/gpio-tps65910.c: In function 'tps65910_gpio_set':
> > drivers/gpio/gpio-tps65910.c:46:3: error: implicit declaration of function 'tps65910_reg_set_bits' [-Werror=implicit-function-declaration]
> >46 |   tps65910_reg_set_bits(tps65910, TPS65910_GPIO0 + offset,
> >   |   ^
> > drivers/gpio/gpio-tps65910.c:49:3: error: implicit declaration of function 'tps65910_reg_clear_bits' [-Werror=implicit-function-declaration]
> >49 |   tps65910_reg_clear_bits(tps65910, TPS65910_GPIO0 + offset,
> >   |   ^~~
> > 
> > Caused by commit
> > 
> >   23feb2c3367c ("mfd: tps65910: Clean up after switching to regmap")
> > 
> > I have used the version of the mfd tree from next-20201104 for today.
> 
> Hi,
> 
> It's missing a patch for gpio part [1].
> 
> [1] https://lkml.org/lkml/2020/9/26/398

I'm aware of it.  Just waiting for Linus' reply.

-- 
Lee Jones [李琼斯]
Senior Technical Lead - Developer Services
Linaro.org │ Open source software for Arm SoCs
Follow Linaro: Facebook | Twitter | Blog


Re: [PATCH v3 03/11] gpio: raspberrypi-exp: Release firmware handle on unbind

2020-11-05 Thread Bartosz Golaszewski
On Wed, Nov 4, 2020 at 11:39 AM Nicolas Saenz Julienne
 wrote:
>
> Use devm_rpi_firmware_get() so as to make sure we release RPi's firmware
> interface when unbinding the device.
>
> Signed-off-by: Nicolas Saenz Julienne 
>
> ---
>
> Changes since v2:
>  - Use devm_rpi_firmware_get(), instead of remove function
>
>  drivers/gpio/gpio-raspberrypi-exp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpio/gpio-raspberrypi-exp.c b/drivers/gpio/gpio-raspberrypi-exp.c
> index bb100e0124e6..64a552ecc2ad 100644
> --- a/drivers/gpio/gpio-raspberrypi-exp.c
> +++ b/drivers/gpio/gpio-raspberrypi-exp.c
> @@ -208,7 +208,7 @@ static int rpi_exp_gpio_probe(struct platform_device *pdev)
> return -ENOENT;
> }
>
> -   fw = rpi_firmware_get(fw_node);
> +   fw = devm_rpi_firmware_get(&pdev->dev, fw_node);
> of_node_put(fw_node);
> if (!fw)
> return -EPROBE_DEFER;
> --
> 2.29.1
>

Acked-by: Bartosz Golaszewski 


Re: [PATCH] KVM: x86: use positive error values for msr emulation that causes #GP

2020-11-05 Thread Maxim Levitsky
On Thu, 2020-11-05 at 07:14 +0100, Pankaj Gupta wrote:
> > Recent introduction of the userspace msr filtering added code that uses
> > negative error codes for cases that result in either #GP delivery to
> > the guest, or handled by the userspace msr filtering.
> > 
> > This breaks an assumption that a negative error code returned from the
> > msr emulation code is a semi-fatal error which should be returned
> > to userspace via KVM_RUN ioctl and usually kill the guest.
> > 
> > Fix this by reusing the already existing KVM_MSR_RET_INVALID error code,
> > and by adding a new KVM_MSR_RET_FILTERED error code for the
> > userspace filtered msrs.
> > 
> > Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace")
> > Reported-by: Qian Cai 
> > Signed-off-by: Maxim Levitsky 
> > ---
> >  arch/x86/kvm/x86.c | 29 +++--
> >  arch/x86/kvm/x86.h |  8 +++-
> >  2 files changed, 22 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 397f599b20e5a..537130d78b2af 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -255,11 +255,10 @@ static struct kmem_cache *x86_emulator_cache;
> > 
> >  /*
> >   * When called, it means the previous get/set msr reached an invalid msr.
> > - * Return 0 if we want to ignore/silent this failed msr access, or 1 if we want
> > - * to fail the caller.
> > + * Return true if we want to ignore/silent this failed msr access.
> >   */
> > -static int kvm_msr_ignored_check(struct kvm_vcpu *vcpu, u32 msr,
> > -u64 data, bool write)
> > +static bool kvm_msr_ignored_check(struct kvm_vcpu *vcpu, u32 msr,
> > + u64 data, bool write)
> >  {
> > const char *op = write ? "wrmsr" : "rdmsr";
> > 
> > @@ -267,12 +266,11 @@ static int kvm_msr_ignored_check(struct kvm_vcpu *vcpu, u32 msr,
> > if (report_ignored_msrs)
> > vcpu_unimpl(vcpu, "ignored %s: 0x%x data 0x%llx\n",
> > op, msr, data);
> > -   /* Mask the error */
> > -   return 0;
> > +   return true;
> > } else {
> > vcpu_debug_ratelimited(vcpu, "unhandled %s: 0x%x data 
> > 0x%llx\n",
> >op, msr, data);
> > -   return -ENOENT;
> > +   return false;
> > }
> >  }
> > 
> > @@ -1416,7 +1414,8 @@ static int do_get_msr_feature(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> > if (r == KVM_MSR_RET_INVALID) {
> > /* Unconditionally clear the output for simplicity */
> > *data = 0;
> > -   r = kvm_msr_ignored_check(vcpu, index, 0, false);
> > +   if (kvm_msr_ignored_check(vcpu, index, 0, false))
> > +   r = 0;
> > }
> > 
> > if (r)
> > @@ -1540,7 +1539,7 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data,
> > struct msr_data msr;
> > 
> > if (!host_initiated && !kvm_msr_allowed(vcpu, index, 
> > KVM_MSR_FILTER_WRITE))
> > -   return -EPERM;
> > +   return KVM_MSR_RET_FILTERED;
> > 
> > switch (index) {
> > case MSR_FS_BASE:
> > @@ -1581,7 +1580,8 @@ static int kvm_set_msr_ignored_check(struct kvm_vcpu *vcpu,
> > int ret = __kvm_set_msr(vcpu, index, data, host_initiated);
> > 
> > if (ret == KVM_MSR_RET_INVALID)
> > -   ret = kvm_msr_ignored_check(vcpu, index, data, true);
> > +   if (kvm_msr_ignored_check(vcpu, index, data, true))
> > +   ret = 0;
> > 
> > return ret;
> >  }
> > @@ -1599,7 +1599,7 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
> > int ret;
> > 
> > if (!host_initiated && !kvm_msr_allowed(vcpu, index, 
> > KVM_MSR_FILTER_READ))
> > -   return -EPERM;
> > +   return KVM_MSR_RET_FILTERED;
> > 
> > msr.index = index;
> > msr.host_initiated = host_initiated;
> > @@ -1618,7 +1618,8 @@ static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu,
> > if (ret == KVM_MSR_RET_INVALID) {
> > /* Unconditionally clear *data for simplicity */
> > *data = 0;
> > -   ret = kvm_msr_ignored_check(vcpu, index, 0, false);
> > +   if (kvm_msr_ignored_check(vcpu, index, 0, false))
> > +   ret = 0;
> > }
> > 
> > return ret;
> > @@ -1662,9 +1663,9 @@ static int complete_emulated_wrmsr(struct kvm_vcpu *vcpu)
> >  static u64 kvm_msr_reason(int r)
> >  {
> > switch (r) {
> > -   case -ENOENT:
> > +   case KVM_MSR_RET_INVALID:
> > return KVM_MSR_EXIT_REASON_UNKNOWN;
> > -   case -EPERM:
> > +   case KVM_MSR_RET_FILTERED:
> > return KVM_MSR_EXIT_REASON_FIL
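
The return-value convention that the quoted commit message describes (negative errno is semi-fatal and goes back to userspace, positive codes mean a guest-visible #GP or a userspace-handled filtered access) can be modelled in a few lines of standalone C. This is a sketch only: the numeric values of MSR_RET_INVALID and MSR_RET_FILTERED are assumed for illustration, since the x86.h hunk is not visible above, and classify_msr_result(), apply_ignore_policy() and enum msr_outcome are hypothetical names, not KVM functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed values, for illustration only; the x86.h hunk is not shown above. */
#define MSR_RET_INVALID  2
#define MSR_RET_FILTERED 3

enum msr_outcome {
	MSR_OK,           /* access succeeded */
	MSR_INJECT_GP,    /* positive code: #GP, or hand to the userspace filter */
	MSR_EXIT_TO_USER, /* negative errno: return it from KVM_RUN */
};

/* The sign of the return value alone decides how the caller reacts. */
static enum msr_outcome classify_msr_result(int r)
{
	if (r < 0)
		return MSR_EXIT_TO_USER;
	if (r > 0)
		return MSR_INJECT_GP;
	return MSR_OK;
}

/* Mirrors the patched callers: an ignored invalid access becomes success. */
static int apply_ignore_policy(int r, bool ignore_msrs)
{
	/* Only the codes modelled in this sketch are expected here. */
	assert(r <= 0 || r == MSR_RET_INVALID || r == MSR_RET_FILTERED);

	if (r == MSR_RET_INVALID && ignore_msrs)
		return 0;
	return r;
}
```

Keeping the two positive codes distinct from errno values is what lets a filtered access stay distinguishable from a real internal failure such as -ENOMEM.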

RE: [PATCH net-next] net: x25_asy: Delete the x25_asy driver

2020-11-05 Thread David Laight
From: Xie He
> Sent: 05 November 2020 07:35
> 
> This driver transports LAPB (X.25 link layer) frames over TTY links.

I don't remember any requests to run LAPB over anything other
than synchronous links when I was writing LAPB implementation(s)
back in the mid 1980's.

If you need to run 'comms over async uart links' there
are better options.

I wonder what the actual use case was?

David



