date:20180822

Re: [RFC PATCH 1/2] dt-bindings: pwm: imx: Allow switching PWM output between PWM and GPIO

2018-08-22 Thread Michal Vokáč


On 22.8.2018 08:14, Lothar Waßmann wrote:

Michal Vokáč  wrote:


Output of the PWM block of i.MX SoCs is always zero volts when the block
is disabled. This can cause issues when inverted PWM polarity is needed.
With inverted polarity a duty cycle = 0% corresponds to solid high level
on the output. If the PWM is disabled its output instantly goes to solid
zero which corresponds to duty cycle = 100%.

To have a truly inverted PWM output configure the PWM pad as a GPIO
with pull-up. Then switch the pad to PWM output whenever non-zero
duty cycle is needed.

Signed-off-by: Michal Vokáč 
---
  Documentation/devicetree/bindings/pwm/imx-pwm.txt | 44 +++
  1 file changed, 44 insertions(+)

diff --git a/Documentation/devicetree/bindings/pwm/imx-pwm.txt b/Documentation/devicetree/bindings/pwm/imx-pwm.txt
index c61bdf8..3b1bc4c 100644
--- a/Documentation/devicetree/bindings/pwm/imx-pwm.txt
+++ b/Documentation/devicetree/bindings/pwm/imx-pwm.txt
@@ -14,6 +14,12 @@ See the clock consumer binding,
Documentation/devicetree/bindings/clock/clock-bindings.txt
  - interrupts: The interrupt for the pwm controller
  
+Optional properties:

+- pinctrl: For i.MX27 and newer SoCs. Add extra pinctrl to configure the PWM
+  pin to gpio function.  It allows control over the pin output level when the
+  PWM block is disabled. This is meant to be used if inverted polarity of the
+  PWM signal is required. See "Inverted PWM output" section below.
+
  Example:
  
  pwm1: pwm@53fb4000 {

@@ -25,3 +31,41 @@ pwm1: pwm@53fb4000 {
clock-names = "ipg", "per";
interrupts = <61>;
  };
+
+Inverted PWM output
+---
+
+The i.MX SoC has such limitation that whenever a pad is configured as a PWM
+output, the output level is always zero volts when the PWM block is disabled.
+The zero output level is actively driven by the output stage of the PWM block
+and can not be overridden by pull-up. It also does not matter what PWM polarity
+a PWM client (e.g. backlight) requested.
+
+To gain control of the PWM output level in disabled state two pinctrl states
+can be used. The "default" state and the "pwm" state. In the default state the


The "default" function of a PWM is to deliver a PWM signal. So it is
more sensible to me to have the PWM function as "default" and a "gpio"
function as alternative state.


Yes, I totally agree that using "default" for PWM and "gpio" as the
alternative function seems more sensible. That is actually how I started.
Then I realized that that way you end up with the PWM pad set to zero
until the first call of imx_pwm_apply_v2 where you can select the GPIO
function. On my system that first call is made by pwm-backlight more than
3s after pinctrl init.

I suggested using the "default" state for the GPIO function as the only
way to get a truly inverted PWM output all the time from power-up to
power-down.

In my opinion it is up to the DT author what pad configuration he uses for
each pinctrl function as he knows what the HW really needs. I see that this
approach is kind of controversial but I hope that with good documentation
this would not be a problem. And as I wrote in the intro, it is absolutely
optional. If you do not need it, you do not use it.


+PWM output is configured as a GPIO with pull-up. In the "pwm" state the output
+is configured as a PWM output. This setup assures that the PWM output is at
+the required level that corresponds to duty cycle = 0 when PWM is disabled.
+E.g. at boot.
+
+Example:
+
+&pwm1 {
+   pinctrl-names = "default", "pwm";
+   pinctrl-0 = <&pinctrl_backlight_gpio>;
+   pinctrl-1 = <&pinctrl_backlight_pwm>;
+};
+
+pinctrl_backlight_gpio: pwm1grp-gpio {
+   fsl,pins = <
+   /* GPIO with 22kOhm pull-up */
+   MX6QDL_PAD_GPIO_9__GPIO1_IO09   0xF008
+   >;
+};
+
+pinctrl_backlight_pwm: pwm1grp-pwm {
+   fsl,pins = <
+   /* PWM output */
+   MX6QDL_PAD_GPIO_9__PWM1_OUT 0x8
+   >;
+};
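
For illustration only, the state selection discussed in this thread can be
sketched in plain C as below. This is not the driver code from patch 2/2.
select_pad_state(), pad_state_name() and the enum are hypothetical names, and
the rule "mux to PWM only while a non-zero duty cycle is generated" is an
assumption taken from the commit message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two pinctrl states of the example binding. */
enum pad_state { PAD_STATE_GPIO, PAD_STATE_PWM };

/*
 * Pick the pad function for a requested PWM state.
 * Assumption: the pad is muxed to the PWM block only while a non-zero
 * duty cycle is generated; otherwise the GPIO pull-up holds the line.
 */
static enum pad_state select_pad_state(bool enabled, unsigned int duty_ns)
{
	if (enabled && duty_ns > 0)
		return PAD_STATE_PWM;
	return PAD_STATE_GPIO;
}

/* Name of the pinctrl state a driver would look up for that choice. */
static const char *pad_state_name(enum pad_state s)
{
	assert(s == PAD_STATE_GPIO || s == PAD_STATE_PWM);
	return s == PAD_STATE_PWM ? "pwm" : "default";
}
```

With this rule the pad stays in the "default" (GPIO) state from pinctrl init
until the first request with a non-zero duty cycle, which is the boot-time
behaviour argued for above.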

Re: [PATCH v9 22/22] s390: doc: detailed specifications for AP virtualization

2018-08-22 Thread Harald Freudenberger

On 21.08.2018 17:53, Cornelia Huck wrote:
> On Tue, 21 Aug 2018 11:00:00 +0200
> Harald Freudenberger  wrote:
>
>> On 20.08.2018 18:03, Cornelia Huck wrote:
>>> On Mon, 13 Aug 2018 17:48:19 -0400
>>> Tony Krowiak  wrote:
 +* AP Instructions:
 +
 +  There are three AP instructions:
 +
 +  * NQAP: to enqueue an AP command-request message to a queue
 +  * DQAP: to dequeue an AP command-reply message from a queue
 +  * PQAP: to administer the queues  
>>> So, NQAP/DQAP need usage domains, while PQAP needs a control domain? Or
>>> is it that all of them need usage domains, but PQAP can target a control
>>> domain as well?
>>>
>>> [I don't want to dive deeply into the AP architecture here, just far
>>> enough to really understand the design implications.]  
>> Well, to be honest, nobody ever tried this under Linux. Theoretically
>> one should be able to send a CPRB to a usage domain where inside
>> the CPRB another domain (the control domain) is addressed. However,
>> as of now I am only aware of applications controlling the same usage
>> domain. I don't know any application which is able to address another
>> control domain and I am not sure if the zcrypt device driver would
>> handle such a CPRB correctly. NQAP, DQAP and PQAP always address
>> a usage domain. But the CPRB sent down the pipe via NQAP may
>> address some control thing on another domain. I am not sure which
>> code does the sorting out here, or where. There are two candidates:
>> the firmware layer in the CEC and the crypto card code.
> OK, so it's possible as by the architecture, but at least Linux does
> not (currently) do it?
>
> Perhaps we should simply not overthink that whole control domain
> thingy :) It's mostly yet another knob, and as long as the design does
> not go against the general architecture, it's probably fine, I guess.
Well, sooner or later this has to work. Yesterday we tested the control
domain thing by trying to pull some simple data from a 'controlled' domain
to the TKE - doesn't work with a Linux LPAR. I will investigate the details in
the next weeks. However, long-term it should be possible to run scenarios
like having one KVM guest control all the domains used by other KVM guests.
With respect to the KVM vfio driver, currently there should be just the
rule that for a guest the control domain mask should be equal to or a superset
of the usage domain mask. This is by convention as the architecture is
not so clear here, but this is enforced in every place which deals with
usage and control domains (SE, TKE).

regards Harald Freudenberger
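
As a rough sketch of the rule above (control domain mask equal to or a
superset of the usage domain mask), the check could look like this in plain C.
The function name is hypothetical, and modeling a mask as one 64-bit word is
an assumption made only to keep the example short; it is not the vfio driver
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: a domain mask is modeled as one 64-bit word with one bit
 * per domain. The real AP masks are wider and use their own bit numbering.
 */
static bool control_mask_covers_usage(uint64_t usage, uint64_t control)
{
	/* Every bit set in the usage mask must also be set in the control mask. */
	return (usage & ~control) == 0;
}
```

A configuration would be rejected whenever this returns false, that is,
whenever a guest could use a domain it is not also allowed to control.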

Re: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry().

2018-08-22 Thread Michal Hocko

On Wed 22-08-18 06:07:40, Tetsuo Handa wrote:
> On 2018/08/03 15:16, Michal Hocko wrote:
[...]
> >> Now that Roman's cgroup aware OOM killer patchset will be dropped from linux-next.git,
> >> linux-next.git will get the sleeping point removed. Please send this patch to linux-next.git.
> > 
> > I still haven't heard any explicit confirmation that the patch works for
> > your workload. Should I beg for it? Or you simply do not want to have
> > your stamp on the patch? If yes, I can live with that but this playing
> > hide and catch is not really a lot of fun.
> > 
> 
> I noticed that the patch has not been sent to linux-next.git yet.
> Please send to linux-next.git without my stamp on the patch.

I plan to do so after merge window closes.
-- 
Michal Hocko
SUSE Labs

Re: [PATCH] IB/ucm: fix UCM link error

2018-08-22 Thread Arnd Bergmann

On Wed, Aug 22, 2018 at 5:18 AM Jason Gunthorpe  wrote:
>
> On Tue, Aug 21, 2018 at 04:20:44PM +0200, Arnd Bergmann wrote:
> > Building UCM with CONFIG_INFINIBAND_USER_ACCESS=m results in a
> > set of link errors including:
> >
> > drivers/infiniband/core/ucm.o: In function `ib_ucm_event_handler':
> > ucm.c:(.text+0x6dc): undefined reference to `ib_copy_path_rec_to_user'
> > drivers/infiniband/core/ucma.o: In function `ucma_event_handler':
> > ucma.c:(.text+0xdc0): undefined reference to `ib_copy_ah_attr_to_user'
> >
> > To get it to build-test again, this makes the option itself a
> > tristate, which lets Kconfig figure out the dependency correctly.
> >
> > Fixes: 486edfb1039d ("IB/ucm: Fix compiling ucm.c")
> > Signed-off-by: Arnd Bergmann 
> > ---
> >  drivers/infiniband/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Applied to for-rc
>
> But that fixes line isn't right is it?
>
> Should it be
>
> commit 7a8690ed6f5346f6738971892205e91d39b6b901
> Author: Leon Romanovsky 
> Date:   Wed May 23 08:22:11 2018 +0300
>
> RDMA/ucm: Mark UCM interface as BROKEN
>
> Which added the config in the first place??

The commit I cited is the one that caused the build regression for me.

The first one (7a8690ed6f) caused the interface to disappear,
the second one (486edfb1039d) brought it back in a way that
fails in some random configurations.

   Arnd

Re: [PATCH] soc: ti: pm33xx: Enable DS0 for the platforms on which it is functional

2018-08-22 Thread Johan Hovold

On Wed, Aug 22, 2018 at 11:02:31AM +0530, Keerthy wrote:
> Enable DS0 for only those platforms on which it is functional
> 
> Signed-off-by: Keerthy 
> ---
>  arch/arm/mach-omap2/pm33xx-core.c| 5 +
>  drivers/soc/ti/pm33xx.c  | 9 +
>  include/linux/platform_data/pm33xx.h | 2 ++
>  3 files changed, 16 insertions(+)
> 
> diff --git a/arch/arm/mach-omap2/pm33xx-core.c b/arch/arm/mach-omap2/pm33xx-core.c
> index f4971e4..f0f6e8e 100644
> --- a/arch/arm/mach-omap2/pm33xx-core.c
> +++ b/arch/arm/mach-omap2/pm33xx-core.c
> @@ -135,6 +135,11 @@ static int am43xx_suspend(unsigned int state, int (*fn)(unsigned long),
>  {
>   int ret = 0;
>  
> + if (!(args & WFI_FLAG_DEEP_SLEEP0)) {
> + pr_err("DS0 mode not supported\n");
> + return -ENOTSUPP;
> + }
> +
>   amx3_pre_suspend_common();
>   scu_power_mode(scu_base, SCU_PM_POWEROFF);
>   ret = cpu_suspend(args, fn);
> diff --git a/drivers/soc/ti/pm33xx.c b/drivers/soc/ti/pm33xx.c
> index d0dab32..53238d7 100644
> --- a/drivers/soc/ti/pm33xx.c
> +++ b/drivers/soc/ti/pm33xx.c
> @@ -324,6 +324,15 @@ static int am33xx_pm_probe(struct platform_device *pdev)
>   suspend_wfi_flags |= WFI_FLAG_SAVE_EMIF;
>   suspend_wfi_flags |= WFI_FLAG_WAKE_M3;
>  
> + /*
> +  * Deep Sleep0 mode is currently functional only on am437x-gp-evm,
> +  * am33xx-evm and boneblack family. Hence set the DS0 flag
> +  */
> + if (of_machine_is_compatible("ti,am437x-gp-evm") ||
> + of_machine_is_compatible("ti,am335x-bone-black") ||
> + of_machine_is_compatible("ti,am335x-evm"))
> + suspend_wfi_flags |= WFI_FLAG_DEEP_SLEEP0;

What about other (out-of-tree) machines which support DS0 and which
this change would break?

I think this needs to be a blacklist if anything.

Please also expand in the commit message why you think this is needed.

Last, what tree is this against? There's no am43xx_suspend() in
linux-next (and you add compatibles above for am33xx too).

Thanks,
Johan
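
To make the blacklist suggestion concrete, here is a minimal user-space
sketch. Everything in it is hypothetical: the board string in the list is a
placeholder, the flag value is assumed, and a plain strcmp() stands in for
of_machine_is_compatible(). It only shows the shape of "enabled unless known
broken", not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define WFI_FLAG_DEEP_SLEEP0	(1u << 0)	/* value assumed for the sketch */

/* Hypothetical list of boards known NOT to support DS0. */
static const char *const ds0_broken_boards[] = {
	"vendor,board-without-ds0",	/* placeholder, not a real compatible */
};

/* Stand-in for a chain of of_machine_is_compatible() checks. */
static bool ds0_is_broken(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(ds0_broken_boards) / sizeof(ds0_broken_boards[0]); i++)
		if (strcmp(compatible, ds0_broken_boards[i]) == 0)
			return true;
	return false;
}

/* DS0 stays enabled for every board that is not explicitly listed. */
static unsigned int ds0_flags_for(const char *compatible)
{
	return ds0_is_broken(compatible) ? 0 : WFI_FLAG_DEEP_SLEEP0;
}
```

With this shape an unknown out-of-tree board keeps DS0, which is the
behaviour a whitelist would take away.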

Re: [PATCH] soc: ti: pm33xx: Enable DS0 for the platforms on which it is functional

2018-08-22 Thread Johan Hovold

On Wed, Aug 22, 2018 at 09:34:09AM +0200, Johan Hovold wrote:
> On Wed, Aug 22, 2018 at 11:02:31AM +0530, Keerthy wrote:
> > Enable DS0 for only those platforms on which it is functional
> > 
> > Signed-off-by: Keerthy 
> > ---
> >  arch/arm/mach-omap2/pm33xx-core.c| 5 +
> >  drivers/soc/ti/pm33xx.c  | 9 +
> >  include/linux/platform_data/pm33xx.h | 2 ++
> >  3 files changed, 16 insertions(+)
> > 
> > diff --git a/arch/arm/mach-omap2/pm33xx-core.c b/arch/arm/mach-omap2/pm33xx-core.c
> > index f4971e4..f0f6e8e 100644
> > --- a/arch/arm/mach-omap2/pm33xx-core.c
> > +++ b/arch/arm/mach-omap2/pm33xx-core.c
> > @@ -135,6 +135,11 @@ static int am43xx_suspend(unsigned int state, int (*fn)(unsigned long),
> >  {
> > int ret = 0;
> >  
> > +   if (!(args & WFI_FLAG_DEEP_SLEEP0)) {
> > +   pr_err("DS0 mode not supported\n");
> > +   return -ENOTSUPP;
> > +   }
> > +
> > amx3_pre_suspend_common();
> > scu_power_mode(scu_base, SCU_PM_POWEROFF);
> > ret = cpu_suspend(args, fn);
> > diff --git a/drivers/soc/ti/pm33xx.c b/drivers/soc/ti/pm33xx.c
> > index d0dab32..53238d7 100644
> > --- a/drivers/soc/ti/pm33xx.c
> > +++ b/drivers/soc/ti/pm33xx.c
> > @@ -324,6 +324,15 @@ static int am33xx_pm_probe(struct platform_device *pdev)
> > suspend_wfi_flags |= WFI_FLAG_SAVE_EMIF;
> > suspend_wfi_flags |= WFI_FLAG_WAKE_M3;
> >  
> > +   /*
> > +* Deep Sleep0 mode is currently functional only on am437x-gp-evm,
> > +* am33xx-evm and boneblack family. Hence set the DS0 flag
> > +*/
> > +   if (of_machine_is_compatible("ti,am437x-gp-evm") ||
> > +   of_machine_is_compatible("ti,am335x-bone-black") ||
> > +   of_machine_is_compatible("ti,am335x-evm"))
> > +   suspend_wfi_flags |= WFI_FLAG_DEEP_SLEEP0;
> 
> > What about other (out-of-tree) machines which support DS0 and which
> > this change would break?
> 
> I think this needs to be a blacklist if anything.
> 
> Please also expand in the commit message why you think this is needed.
> 
> Last, what tree is this against? There's no am43xx_suspend() in
> linux-next (and you add compatibles above for am33xx too).

Sorry, there is indeed an am43xx_suspend(), but you are adding
compatibles for am33xx which use am33xx_suspend().

Johan

[PATCH 14/14] ata: ahci_xgene: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_xgene.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_xgene.c b/drivers/ata/ahci_xgene.c
index ad58da7..7e157e1 100644
--- a/drivers/ata/ahci_xgene.c
+++ b/drivers/ata/ahci_xgene.c
@@ -759,7 +759,7 @@ static int xgene_ahci_probe(struct platform_device *pdev)
  &xgene_ahci_v2_port_info };
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 11/14] ata: ahci_st: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Cc: Patrice Chotard 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_st.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_st.c b/drivers/ata/ahci_st.c
index bc345f2..21c5c44 100644
--- a/drivers/ata/ahci_st.c
+++ b/drivers/ata/ahci_st.c
@@ -156,7 +156,7 @@ static int st_ahci_probe(struct platform_device *pdev)
if (!drv_data)
return -ENOMEM;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
hpriv->plat_data = drv_data;
-- 
2.7.4

[PATCH 13/14] ata: ahci_tegra: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Cc: Thierry Reding 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_tegra.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_tegra.c b/drivers/ata/ahci_tegra.c
index 64d8484..004f260 100644
--- a/drivers/ata/ahci_tegra.c
+++ b/drivers/ata/ahci_tegra.c
@@ -494,7 +494,7 @@ static int tegra_ahci_probe(struct platform_device *pdev)
int ret;
unsigned int i;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 05/14] ata: ahci_dm816: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_dm816.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_dm816.c b/drivers/ata/ahci_dm816.c
index fbd827c..89509c3 100644
--- a/drivers/ata/ahci_dm816.c
+++ b/drivers/ata/ahci_dm816.c
@@ -148,7 +148,7 @@ static int ahci_dm816_probe(struct platform_device *pdev)
struct ahci_host_priv *hpriv;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 09/14] ata: ahci_qoriq: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_qoriq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_qoriq.c b/drivers/ata/ahci_qoriq.c
index cfdef4d..ce59253 100644
--- a/drivers/ata/ahci_qoriq.c
+++ b/drivers/ata/ahci_qoriq.c
@@ -250,7 +250,7 @@ static int ahci_qoriq_probe(struct platform_device *pdev)
struct resource *res;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 01/14] ata: ahci-platform: add reset control support and the flag to specify using reset

2018-08-22 Thread Kunihiko Hayashi

Add support to get and control a list of resets for the device
as optional and shared. These resets must be kept de-asserted until
the device is enabled.

The resets are specified as shared because some SoCs, like the UniPhier
series, share reset controls among all AHCI controller instances.

However, as Thierry pointed out,
https://www.spinics.net/lists/linux-ide/msg55357.html
some hardware-specific drivers already use their own resets,
and the common reset handling could end up controlling the same resets twice.

So this patch adds a flag to ahci_platform_get_resources() indicating
whether to get these resources, currently resets only, and existing
drivers pass 0 for this flag.

Suggested-by: Hans de Goede 
Cc: Thierry Reding 
Signed-off-by: Kunihiko Hayashi 
---
 .../devicetree/bindings/ata/ahci-platform.txt  |  1 +
 drivers/ata/ahci.h |  1 +
 drivers/ata/ahci_platform.c|  3 +-
 drivers/ata/libahci_platform.c | 35 ++
 include/linux/ahci_platform.h  |  4 ++-
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/ata/ahci-platform.txt b/Documentation/devicetree/bindings/ata/ahci-platform.txt
index 6637666..5d5bd45 100644
--- a/Documentation/devicetree/bindings/ata/ahci-platform.txt
+++ b/Documentation/devicetree/bindings/ata/ahci-platform.txt
@@ -29,6 +29,7 @@ compatible:
 Optional properties:
 - dma-coherent  : Present if dma operations are coherent
 - clocks: a list of phandle + clock specifier pairs
+- resets: a list of phandle + reset specifier pairs
 - target-supply : regulator for SATA target power
 - phys  : reference to the SATA PHY node
 - phy-names : must be "sata-phy"
diff --git a/drivers/ata/ahci.h b/drivers/ata/ahci.h
index 1609eba..6a1515f 100644
--- a/drivers/ata/ahci.h
+++ b/drivers/ata/ahci.h
@@ -350,6 +350,7 @@ struct ahci_host_priv {
u32 em_msg_type;/* EM message type */
boolgot_runtime_pm; /* Did we do pm_runtime_get? */
struct clk  *clks[AHCI_MAX_CLKS]; /* Optional */
+   struct reset_control*rsts;  /* Optional */
struct regulator**target_pwrs;  /* Optional */
/*
     * If platform uses PHYs. There is a 1:1 relation between the port number and
diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
index 564570e..46f0bd7 100644
--- a/drivers/ata/ahci_platform.c
+++ b/drivers/ata/ahci_platform.c
@@ -43,7 +43,8 @@ static int ahci_probe(struct platform_device *pdev)
struct ahci_host_priv *hpriv;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev,
+   AHCI_PLATFORM_GET_RESETS);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
diff --git a/drivers/ata/libahci_platform.c b/drivers/ata/libahci_platform.c
index 8fbb532..c92c10d 100644
--- a/drivers/ata/libahci_platform.c
+++ b/drivers/ata/libahci_platform.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "ahci.h"
 
 static void ahci_host_stop(struct ata_host *host);
@@ -195,7 +196,8 @@ EXPORT_SYMBOL_GPL(ahci_platform_disable_regulators);
  * following order:
  * 1) Regulator
  * 2) Clocks (through ahci_platform_enable_clks)
- * 3) Phys
+ * 3) Resets
+ * 4) Phys
  *
  * If resource enabling fails at any point the previous enabled resources
  * are disabled in reverse order.
@@ -215,12 +217,19 @@ int ahci_platform_enable_resources(struct ahci_host_priv *hpriv)
if (rc)
goto disable_regulator;
 
-   rc = ahci_platform_enable_phys(hpriv);
+   rc = reset_control_deassert(hpriv->rsts);
if (rc)
goto disable_clks;
 
+   rc = ahci_platform_enable_phys(hpriv);
+   if (rc)
+   goto disable_resets;
+
return 0;
 
+disable_resets:
+   reset_control_assert(hpriv->rsts);
+
 disable_clks:
ahci_platform_disable_clks(hpriv);
 
@@ -238,13 +247,16 @@ EXPORT_SYMBOL_GPL(ahci_platform_enable_resources);
  * This function disables all ahci_platform managed resources in the
  * following order:
  * 1) Phys
- * 2) Clocks (through ahci_platform_disable_clks)
- * 3) Regulator
+ * 2) Resets
+ * 3) Clocks (through ahci_platform_disable_clks)
+ * 4) Regulator
  */
 void ahci_platform_disable_resources(struct ahci_host_priv *hpriv)
 {
ahci_platform_disable_phys(hpriv);
 
+   reset_control_assert(hpriv->rsts);
+
ahci_platform_disable_clks(hpriv);
 
ahci_platform_disable_regulators(hpriv);
@@ -332,6 +344,7 @@ static int ahci_platform_get_regulator(struct ahci_host_priv *hpriv, u32 port,
 /**
  * ahci_platform_get_resources - Get platform resources
  * @pdev: platform device to get resources for
+ * @flags: bitmap representing the resource to get
  *
  * This function
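
The enable ordering documented in the patch above (regulator, clocks, resets,
PHYs, with rollback in reverse order on failure) can be modeled in user space
roughly as follows. struct fake_host and the helper names are hypothetical
stand-ins for the kernel helpers; only the goto-based unwinding shape is taken
from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Resources in the order the patch enables them. */
enum res { RES_REGULATOR, RES_CLKS, RES_RESETS, RES_PHYS, RES_COUNT };

/* Hypothetical stand-in for struct ahci_host_priv. */
struct fake_host {
	bool on[RES_COUNT];
	int fail_at;	/* index of the resource whose enable fails, or -1 */
};

static int enable_one(struct fake_host *h, enum res r)
{
	if ((int)r == h->fail_at)
		return -1;
	h->on[r] = true;
	return 0;
}

static void disable_one(struct fake_host *h, enum res r)
{
	h->on[r] = false;
}

/* Enable everything in order; on failure undo what was already enabled. */
static int enable_resources(struct fake_host *h)
{
	int rc;

	assert(h);

	rc = enable_one(h, RES_REGULATOR);
	if (rc)
		return rc;

	rc = enable_one(h, RES_CLKS);
	if (rc)
		goto disable_regulator;

	rc = enable_one(h, RES_RESETS);	/* models reset_control_deassert() */
	if (rc)
		goto disable_clks;

	rc = enable_one(h, RES_PHYS);
	if (rc)
		goto disable_resets;

	return 0;

disable_resets:
	disable_one(h, RES_RESETS);	/* models reset_control_assert() */
disable_clks:
	disable_one(h, RES_CLKS);
disable_regulator:
	disable_one(h, RES_REGULATOR);
	return rc;
}
```

Placing the reset step between clocks and PHYs means a PHY failure now also
has to put the resets back, which is why the new disable_resets label sits
above disable_clks.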

[PATCH 06/14] ata: ahci_imx: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_imx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_imx.c b/drivers/ata/ahci_imx.c
index 6822e2f..b00799d 100644
--- a/drivers/ata/ahci_imx.c
+++ b/drivers/ata/ahci_imx.c
@@ -1127,7 +1127,7 @@ static int imx_ahci_probe(struct platform_device *pdev)
return ret;
}
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 08/14] ata: ahci_mvebu: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_mvebu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_mvebu.c b/drivers/ata/ahci_mvebu.c
index 72d90b4..f9cb51b 100644
--- a/drivers/ata/ahci_mvebu.c
+++ b/drivers/ata/ahci_mvebu.c
@@ -158,7 +158,7 @@ static int ahci_mvebu_probe(struct platform_device *pdev)
const struct mbus_dram_target_info *dram;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 04/14] ata: ahci_da850: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_da850.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_da850.c b/drivers/ata/ahci_da850.c
index 9b34dff..ebaa657 100644
--- a/drivers/ata/ahci_da850.c
+++ b/drivers/ata/ahci_da850.c
@@ -171,7 +171,7 @@ static int ahci_da850_probe(struct platform_device *pdev)
u32 mpy;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 10/14] ata: ahci_seattle: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_seattle.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_seattle.c b/drivers/ata/ahci_seattle.c
index 1d31c0c..e57b6f9 100644
--- a/drivers/ata/ahci_seattle.c
+++ b/drivers/ata/ahci_seattle.c
@@ -164,7 +164,7 @@ static int ahci_seattle_probe(struct platform_device *pdev)
int rc;
struct ahci_host_priv *hpriv;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 12/14] ata: ahci_sunxi: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Cc: Maxime Ripard 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_sunxi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_sunxi.c b/drivers/ata/ahci_sunxi.c
index b264374..631610b 100644
--- a/drivers/ata/ahci_sunxi.c
+++ b/drivers/ata/ahci_sunxi.c
@@ -181,7 +181,7 @@ static int ahci_sunxi_probe(struct platform_device *pdev)
struct ahci_host_priv *hpriv;
int rc;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 07/14] ata: ahci_mtk: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Cc: Matthias Brugger 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_mtk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_mtk.c b/drivers/ata/ahci_mtk.c
index 0ae6971..8bc1a26 100644
--- a/drivers/ata/ahci_mtk.c
+++ b/drivers/ata/ahci_mtk.c
@@ -142,7 +142,7 @@ static int mtk_ahci_probe(struct platform_device *pdev)
if (!plat)
return -ENOMEM;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[PATCH 02/14] ata: ahci_brcm: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

Now that ahci_platform_get_resources() takes a flag as its second argument
to indicate whether to acquire the optional resources, pass 0 as the
initial value of that argument.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_brcm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_brcm.c b/drivers/ata/ahci_brcm.c
index ea43081..f3d5577 100644
--- a/drivers/ata/ahci_brcm.c
+++ b/drivers/ata/ahci_brcm.c
@@ -425,7 +425,7 @@ static int brcm_ahci_probe(struct platform_device *pdev)
 
brcm_sata_phys_enable(priv);
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
hpriv->plat_data = priv;
-- 
2.7.4

Re: [PATCH v9 22/22] s390: doc: detailed specifications for AP virtualization

2018-08-22 Thread Cornelia Huck

On Tue, 21 Aug 2018 20:54:49 +0200
Halil Pasic  wrote:

> On 08/20/2018 10:16 PM, Tony Krowiak wrote:
> >> Does the SIE complain if you specify a control
> >> domain that the host does not have access to (I'd guess so)?  
> > 
> > The SIE does not complain if you specify a domain to which the host - or a
> > lower level guest - does not have access. The firmware performs a logical
> > AND of the guest's and host's - or lower level guest's - APMs, AQMs and ADMs
> 
> Rather a bit-wise AND, I guess (of the same type masks corresponding to Guest 1 and
> Guest 2). The result of a logical AND is a logical value (true or false) as
> far as I remember.
> 
> > to create effective masks EAPM, EAQM and EADM. Only devices corresponding to
> > the bits set in the EAPM, EAQM and EADM will be accessible by the guest.  
> 
> I'm not sure what is the intended meaning of 'the SIE complains'. If it means
> getting out of SIE (when interpreting, let's say, an NQAP under the discussed
> circumstances) with some sort of error code, I think Tony's answer, 'SIE does not complain',
> makes a lot of sense. It's the guest that is trying to stretch further than
> the blanket reaches, and it's the guest that needs to be educated on this fact.

Yep, that's what I meant. If the hypervisor can call the SIE with that
config, but the guest gets an error if it tries to use something that
it cannot use, that's fine.
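
For readers following along, the effective-mask computation described in the
quoted text can be sketched like this. The struct and function names are
hypothetical, and each mask is modeled as a single 64-bit word purely for
brevity; the real masks are wider and the combination is done by firmware,
not by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: each mask is one 64-bit word, one bit per adapter/domain. */
struct ap_masks {
	uint64_t apm;	/* adapters */
	uint64_t aqm;	/* usage domains */
	uint64_t adm;	/* control domains */
};

/* Bit-wise AND per mask type: a bit survives only if both levels set it. */
static struct ap_masks effective_masks(struct ap_masks guest, struct ap_masks host)
{
	struct ap_masks e = {
		.apm = guest.apm & host.apm,
		.aqm = guest.aqm & host.aqm,
		.adm = guest.adm & host.adm,
	};
	return e;
}

/* True if bit n (0..63) is set in the mask. */
static bool mask_has(uint64_t mask, unsigned int n)
{
	assert(n < 64);
	return (mask >> n) & 1;
}
```

Setting up a guest with bits the host lacks is therefore not an error in this
model; those bits simply drop out of the effective masks.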

[PATCH 00/14] ata: ahci-platform: add reset control support except for existing drivers

2018-08-22 Thread Kunihiko Hayashi

Add support to get and control a list of resets for the device, and
add a flag indicating whether to use the resets. Existing drivers
pass 0 for this flag.

This series solves the issue of the previous patch [1] that was already
reverted [2].
[1] https://www.spinics.net/lists/linux-ide/msg55299.html
[2] https://www.spinics.net/lists/linux-ide/msg55379.html
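
As a rough user-space illustration of the flag semantics: a caller that
passes 0 gets no common reset handling, while the generic platform driver
opts in with AHCI_PLATFORM_GET_RESETS. The macro value, struct fake_hpriv and
get_resources() below are assumptions made for the sketch, not the code in
libahci_platform.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AHCI_PLATFORM_GET_RESETS	0x01u	/* value assumed for the sketch */

/* Hypothetical stand-in for struct ahci_host_priv. */
struct fake_hpriv {
	bool has_resets;
};

/* Acquire the optional resets only when the caller asked for them. */
static void get_resources(struct fake_hpriv *hpriv, unsigned int flags)
{
	assert(hpriv != NULL);
	hpriv->has_resets = (flags & AHCI_PLATFORM_GET_RESETS) != 0;
}
```

Because the hardware-specific drivers pass 0, their own reset handling stays
the only one touching the reset lines.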

Kunihiko Hayashi (14):
  ata: ahci-platform: add reset control support and the flag to specify
using reset
  ata: ahci_brcm: add second argument of ahci_platform_get_resources()
  ata: ahci_ceva: add second argument of ahci_platform_get_resources()
  ata: ahci_da850: add second argument of ahci_platform_get_resources()
  ata: ahci_dm816: add second argument of ahci_platform_get_resources()
  ata: ahci_imx: add second argument of ahci_platform_get_resources()
  ata: ahci_mtk: add second argument of ahci_platform_get_resources()
  ata: ahci_mvebu: add second argument of ahci_platform_get_resources()
  ata: ahci_qoriq: add second argument of ahci_platform_get_resources()
  ata: ahci_seattle: add second argument of
ahci_platform_get_resources()
  ata: ahci_st: add second argument of ahci_platform_get_resources()
  ata: ahci_sunxi: add second argument of ahci_platform_get_resources()
  ata: ahci_tegra: add second argument of ahci_platform_get_resources()
  ata: ahci_xgene: add second argument of ahci_platform_get_resources()

 .../devicetree/bindings/ata/ahci-platform.txt  |  1 +
 drivers/ata/ahci.h |  1 +
 drivers/ata/ahci_brcm.c|  2 +-
 drivers/ata/ahci_ceva.c|  2 +-
 drivers/ata/ahci_da850.c   |  2 +-
 drivers/ata/ahci_dm816.c   |  2 +-
 drivers/ata/ahci_imx.c |  2 +-
 drivers/ata/ahci_mtk.c |  2 +-
 drivers/ata/ahci_mvebu.c   |  2 +-
 drivers/ata/ahci_platform.c|  3 +-
 drivers/ata/ahci_qoriq.c   |  2 +-
 drivers/ata/ahci_seattle.c |  2 +-
 drivers/ata/ahci_st.c  |  2 +-
 drivers/ata/ahci_sunxi.c   |  2 +-
 drivers/ata/ahci_tegra.c   |  2 +-
 drivers/ata/ahci_xgene.c   |  2 +-
 drivers/ata/libahci_platform.c | 35 ++
 include/linux/ahci_platform.h  |  4 ++-
 18 files changed, 49 insertions(+), 21 deletions(-)

-- 
2.7.4

[PATCH 03/14] ata: ahci_ceva: add second argument of ahci_platform_get_resources()

2018-08-22 Thread Kunihiko Hayashi

A flag that indicates whether to acquire the optional resources is
added as the second argument of ahci_platform_get_resources().
Pass 0 as its initial value.

Cc: Hans de Goede 
Signed-off-by: Kunihiko Hayashi 
---
 drivers/ata/ahci_ceva.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/ahci_ceva.c b/drivers/ata/ahci_ceva.c
index 5ecc9d4..dc78c98 100644
--- a/drivers/ata/ahci_ceva.c
+++ b/drivers/ata/ahci_ceva.c
@@ -213,7 +213,7 @@ static int ceva_ahci_probe(struct platform_device *pdev)
 
cevapriv->ahci_pdev = pdev;
 
-   hpriv = ahci_platform_get_resources(pdev);
+   hpriv = ahci_platform_get_resources(pdev, 0);
if (IS_ERR(hpriv))
return PTR_ERR(hpriv);
 
-- 
2.7.4

[RESEND PATCH v2] acpi/processor: Fix the return value of acpi_processor_ids_walk()

2018-08-22 Thread Dou Liyang

The ACPI driver should make sure all the processor IDs in the ACPI namespace
are unique. The driver performs a depth-first walk of the namespace tree
and calls acpi_processor_ids_walk() to check for duplicate IDs.

But acpi_processor_ids_walk() returns the wrong value. Once a
processor has been checked, it returns true, which causes the walk to stop
immediately, so the other processors are never checked.

Replace the value with AE_OK, which is the standard acpi_status value.
And don't abort the namespace walk even on error.

Fixes: 8c8cb30f49b8 ("acpi/processor: Implement DEVICE operator for processor enumeration")
Signed-off-by: Dou Liyang 
---
Changelog:
  v1 --> v2:
   - Fix the check against duplicate IDs suggested by Rafael.
  
  Currently, duplicate IDs have only been found on the Ivb42 machine, and we
  added this check in linux-4.9. But we introduced a bug in linux-4.12 with
  commit 8c8cb30f49b8.

  To resolve the bug, I first removed the check [1], because Linux compares
  the incoming ID with the present processors when a physical CPU is hot-added
  and avoids using duplicate IDs.

  But it seems we should consider all the possible processors. So, with this
  patch, processors with the same IDs will never be hot-plugged.

[1] https://lkml.org/lkml/2018/5/28/213
---
 drivers/acpi/acpi_processor.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 449d86d39965..a59870ccd5ca 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -643,7 +643,7 @@ static acpi_status __init 
acpi_processor_ids_walk(acpi_handle handle,
 
status = acpi_get_type(handle, &acpi_type);
if (ACPI_FAILURE(status))
-   return false;
+   return_ACPI_STATUS(status);
 
switch (acpi_type) {
case ACPI_TYPE_PROCESSOR:
@@ -663,11 +663,12 @@ static acpi_status __init 
acpi_processor_ids_walk(acpi_handle handle,
}
 
processor_validated_ids_update(uid);
-   return true;
+   return AE_OK;
 
 err:
+   /* Exit on error, but don't abort the namespace walk */
acpi_handle_info(handle, "Invalid processor object\n");
-   return false;
+   return AE_OK;
 
 }
 
-- 
2.14.3
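
The return-value problem described in the commit message can be modeled in
userspace. The sketch below is only an illustration under stated assumptions:
walk_ids(), status_t and ST_OK are hypothetical stand-ins, not ACPICA code, and
the only property modeled is that a non-zero status from the callback ends the
walk early.

```c
#include <stddef.h>

/* Hypothetical stand-in for acpi_status; only "zero means OK" matters here. */
typedef unsigned int status_t;
#define ST_OK 0u

typedef status_t (*visit_fn)(int id, void *ctx);

/*
 * Visit every id in order and stop at the first callback that returns a
 * non-zero status. Returns the number of objects visited.
 * This models the convention from the commit message, not ACPICA itself.
 */
static size_t walk_ids(const int *ids, size_t n, visit_fn fn, void *ctx)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		visited++;
		if (fn(ids[i], ctx) != ST_OK)
			break;
	}
	return visited;
}

/* Buggy callback: "return true" is non-zero, so the walk ends after one object. */
static status_t visit_buggy(int id, void *ctx)
{
	(void)id;
	(*(size_t *)ctx)++;
	return 1; /* true */
}

/* Fixed callback: always report OK so the remaining objects are still checked. */
static status_t visit_fixed(int id, void *ctx)
{
	(void)id;
	(*(size_t *)ctx)++;
	return ST_OK;
}
```

With four objects, the buggy callback is invoked once and the fixed one four
times, which matches the behaviour the patch describes.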

[PATCH] x86/kvm/vmx: Fix GPF on reading vmentry_l1d_flush

2018-08-22 Thread MINOURA Makoto / 箕浦真



When EPT is not enabled, reading
/sys/module/kvm_intel/parameters/vmentry_l1d_flush causes a
general protection fault in vmentry_l1d_flush_get() due to an
access beyond the end of the array vmentry_l1d_param[].

Signed-off-by: Minoura Makoto 
---
 arch/x86/include/asm/vmx.h | 1 +
 arch/x86/kvm/vmx.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 95f9107449bf..c4b834b05178 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -581,6 +581,7 @@ enum vmx_l1d_flush_state {
VMENTER_L1D_FLUSH_NEVER,
VMENTER_L1D_FLUSH_COND,
VMENTER_L1D_FLUSH_ALWAYS,
+   VMENTER_L1D_FLUSH_PARAM_MAX = VMENTER_L1D_FLUSH_ALWAYS,
VMENTER_L1D_FLUSH_EPT_DISABLED,
VMENTER_L1D_FLUSH_NOT_REQUIRED,
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 1519f030fd73..155ba2a9139f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -204,6 +204,8 @@ static const struct {
{"never",   VMENTER_L1D_FLUSH_NEVER},
{"cond",VMENTER_L1D_FLUSH_COND},
{"always",  VMENTER_L1D_FLUSH_ALWAYS},
+   {"ept-disabled", VMENTER_L1D_FLUSH_EPT_DISABLED},
+   {"not-required", VMENTER_L1D_FLUSH_NOT_REQUIRED},
 };
 
 #define L1D_CACHE_ORDER 4
@@ -286,7 +288,7 @@ static int vmentry_l1d_flush_parse(const char *s)
unsigned int i;
 
if (s) {
-   for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) {
+   for (i = 0; i <= VMENTER_L1D_FLUSH_PARAM_MAX; i++) {
if (sysfs_streq(s, vmentry_l1d_param[i].option))
return vmentry_l1d_param[i].cmd;
}
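
The idea of the patch (name every state in the table, but only accept the
user-settable ones when parsing) can be sketched in plain C. All identifiers
below (l1d_state, l1d_names, l1d_parse, l1d_state_name) are hypothetical and
only mirror the shape of the kernel code; strcmp() is used in place of
sysfs_streq(), and the extra range check in l1d_state_name() is an addition of
this sketch, not part of the patch.

```c
#include <stddef.h>
#include <string.h>

enum l1d_state {
	L1D_AUTO,
	L1D_NEVER,
	L1D_COND,
	L1D_ALWAYS,
	L1D_PARAM_MAX = L1D_ALWAYS,	/* last value a user may request */
	L1D_EPT_DISABLED,		/* internal state, read-only */
	L1D_NOT_REQUIRED,		/* internal state, read-only */
	L1D_NR_STATES
};

/* One name per state, so reading any state never indexes past the table. */
static const char *const l1d_names[L1D_NR_STATES] = {
	[L1D_AUTO]         = "auto",
	[L1D_NEVER]        = "never",
	[L1D_COND]         = "cond",
	[L1D_ALWAYS]       = "always",
	[L1D_EPT_DISABLED] = "ept-disabled",
	[L1D_NOT_REQUIRED] = "not-required",
};

/* Read side: returns NULL instead of reading out of bounds. */
static const char *l1d_state_name(enum l1d_state s)
{
	if ((unsigned int)s >= L1D_NR_STATES)
		return NULL;
	return l1d_names[s];
}

/* Write side: only entries up to L1D_PARAM_MAX can be selected by name. */
static int l1d_parse(const char *s)
{
	if (!s)
		return -1;
	for (int i = 0; i <= L1D_PARAM_MAX; i++) {
		if (strcmp(s, l1d_names[i]) == 0)
			return i;
	}
	return -1;
}
```

In this model "always" parses to a settable state, "ept-disabled" is rejected
on write, and reading the EPT-disabled state still yields a valid string.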

Re: [RFC v2 2/2] mm/memory_hotplug: Shrink spanned pages when offlining memory

2018-08-22 Thread Oscar Salvador

On Tue, Aug 21, 2018 at 03:17:10PM +0200, David Hildenbrand wrote:
> > add_device_memory is in charge of
> 
> I wouldn't use the terminology of onlining/offlining here. That applies
> rather to memory that is exposed to the rest of the system (e.g. buddy
> allocator, has underlying memory block devices). I guess it is rather a
> pure setup/teardown of that device memory.

Hi David,

I am not sure if you are referring to:

"
a) calling either arch_add_memory() or add_pages(), depending on whether
   we want a linear mapping
b) online the memory sections that correspond to the pfn range
c) calling move_pfn_range_to_zone() being zone ZONE_DEVICE to
   expand zone/pgdat spanned pages and initialize its pages
"

Well, that is partially true.
I mean, in order to make this work, we need to offline/online the memory
sections, because shrink_pages will rely on that from now on.
That is what we do when onlining/offlining pages, but since device memory
does not go through the "official" channels, we need to do it there
as well.

Sure I can use another terminology, but since that is what
offline/online_mem_sections do, I just came up with that.

> I would really like to see the mem_hotplug_begin/end also getting moved
> inside add_device_memory()/del_device_memory(). (just like for
> add/remove_memory)
> 
> I wonder if kasan_ stuff actually requires this lock, or if it could
> also be somehow moved inside add_device_memory/del_device_memory.

Yes, that was my first approach, but then I saw that the kasan stuff is being
handled within those locks, so I was not sure and I backed off, leaving the
mem_hotplug_begin/end where they were.

Maybe Jerome can shed some light, and we can just handle the kasan stuff
out of the locks.

> Maybe shorten that a bit
> 
> "HMM/devm memory does not have IORESOURCE_SYSTEM_RAM set. They use
>  devm_request_mem_region/devm_release_mem_region to add/release a
>  resource. Just back off here."

Uhm, fair enough.

> Any reason for these indirections?

I wanted to hide the internals in the memory_hotplug code.
I thought about removing them, but I finally left them.
If people think that we are better off without them, I can just
remove them.

> I guess for readability, this patch could be split up into several
> patches. E.g. factoring out of add_device_memory/del_device_memory,
> release_mem_region_adjustable change ...

Yes, really true.
But I wanted first to gather feedback mainly from HMM/devm people to see
if they saw an outright bug within the series because I am not so
familiar with that part of the code.

Feedback from Jerome/Dan will be appreciated as well to see if this is a good
direction.

But you are right, in the end, this will have to be split up into several
parts to ease the review.

Thanks for reviewing this David!
I will try to address your concerns.

Thanks 
-- 
Oscar Salvador
SUSE L3

Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()

2018-08-22 Thread Byungchul Park

On Wed, Aug 22, 2018 at 09:07:23AM +0200, Johannes Berg wrote:
> On Wed, 2018-08-22 at 14:47 +0900, Byungchul Park wrote:
> > On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> > > On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> > > 
> > > > That should've been adjusted as well when Ingo reverted Cross-release.
> > > 
> > > I can't really say.
> > 
> > What do you mean?
> 
> I haven't followed any of this, so I just don't know.
> 
> > > > It would be much easier to add each pair, acquire/release, before
> > > > wait_for_completion() in both flush_workqueue() and flush_work() than
> > > > reverting the whole commit.
> > > 
> > > The commit doesn't do much more than this though.
> > 
> > That commit also named the lockdep_map for wq/work in a better way.
> 
> What do you mean?

Ah, it is not important. I just mentioned that I changed the lock names a bit
when initializing lockdep_map instances, as suggested by Ingo. But it is
no problem even if you revert the whole thing. I was just letting you know. ;)

> > > > What's lacking is only lockdep annotations for wait_for_completion().
> > > 
> > > No, I disagree. Like I said before, we need the lockdep annotations on
> > 
> > You seem to be confused. I was talking about wait_for_completion() in
> > both flush_workqueue() and flush_work(). Without
> > the wait_for_completion()s, nothing matters wrt what you are concerning.
> 
> Yes and no.
> 
> You're basically saying if we don't get to do a wait_for_completion(),
> then we don't need any lockdep annotation. I'm saying this isn't true.

Strictly speaking, no. But I'm just talking about the case in the wq flush code.

> Consider the following case:
> 
> work_function()
> {
>   mutex_lock(&mutex);
>   mutex_unlock(&mutex);
> }
> 
> other_function()
> {
>   queue_work(&my_wq, &work);
> 
>   if (common_case) {
>           schedule_and_wait_for_something_that_takes_a_long_time();
>   }
> 
>   mutex_lock(&mutex);
>   flush_workqueue(&my_wq);
>   mutex_unlock(&mutex);
> }
> 
> 
> Clearly this code is broken, right?
> 
> However, you'll almost never get lockdep to indicate that, because of
> the "if (common_case)".

Sorry, I don't follow you. What is the problem with the example? Please
give a deadlock example.

> My argument basically is that the lockdep annotations in the workqueue
> code should be entirely independent of the actual need to call
> wait_for_completion().

No. Lockdep annotations always have to do with either wait_for_something or a
self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..

> Therefore, the commit should be reverted regardless of any cross-release

No. That is necessary only when the wait_for_completion() cannot be
tracked automatically by cross-release when checking dependencies.

This might be the key to understanding you. Could you explain in more detail
why you think lockdep annotations are independent of the actual need to call
wait_for_completion() (or wait_for_something_else), hopefully with a
deadlock example?

> work (that I neither know and thus don't understand right now), since it
> makes workqueue code rely on lockdep for the completion, whereas we

Using wait_for_completion(), right?

> really want to have annotations here even when we didn't actually need
> to wait_for_completion().

Please give an example of a deadlock even w/o wait_for_completion().

> 
> johannes

Byungchul
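
One way to read the quoted work_function()/other_function() example is as a
question of which lock-order edges get recorded. The userspace sketch below is
a toy model of that reading only: dep_graph, dep_acquire() and the model_*()
helpers are hypothetical, it tracks just two lock classes, and it only detects
a direct A->B / B->A inversion. It is not how lockdep or the workqueue code is
implemented.

```c
#include <stdbool.h>

enum lock_class { CLASS_MUTEX, CLASS_WQ, NR_CLASSES };

struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES]; /* edge[a][b]: b acquired while a held */
	bool held[NR_CLASSES];
	bool inversion;                    /* set once both a->b and b->a exist */
};

/* Record "every held class -> c" and flag an inversion if c -> held exists. */
static void dep_acquire(struct dep_graph *g, enum lock_class c)
{
	for (int h = 0; h < NR_CLASSES; h++) {
		if (!g->held[h] || h == (int)c)
			continue;
		g->edge[h][c] = true;
		if (g->edge[c][h])
			g->inversion = true;
	}
	g->held[c] = true;
}

static void dep_release(struct dep_graph *g, enum lock_class c)
{
	g->held[c] = false;
}

/* The work item runs "inside" the workqueue class and takes the mutex. */
static void model_work_function(struct dep_graph *g)
{
	dep_acquire(g, CLASS_WQ);
	dep_acquire(g, CLASS_MUTEX);
	dep_release(g, CLASS_MUTEX);
	dep_release(g, CLASS_WQ);
}

/*
 * The flusher holds the mutex and flushes the workqueue. If annotate_always
 * is false, the workqueue class is only touched when the flush has to wait.
 */
static void model_other_function(struct dep_graph *g, bool work_still_pending,
				 bool annotate_always)
{
	dep_acquire(g, CLASS_MUTEX);
	if (annotate_always || work_still_pending) {
		dep_acquire(g, CLASS_WQ);
		dep_release(g, CLASS_WQ);
	}
	dep_release(g, CLASS_MUTEX);
}
```

In this model, when the work has already finished (the "common_case" path),
the inversion is reported only if the flush is annotated unconditionally.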

Re: [PATCH v2 0/2] mm: soft-offline: fix race against page allocation

2018-08-22 Thread Michal Hocko

On Wed 22-08-18 01:37:48, Naoya Horiguchi wrote:
> On Wed, Aug 15, 2018 at 03:43:34PM -0700, Andrew Morton wrote:
> > On Tue, 17 Jul 2018 14:32:30 +0900 Naoya Horiguchi 
> >  wrote:
> > 
> > > I've updated the patchset based on feedback:
> > > 
> > > - updated comments (from Andrew),
> > > - moved calling set_hwpoison_free_buddy_page() from mm/migrate.c to 
> > > mm/memory-failure.c,
> > >   which is necessary to check the return code of 
> > > set_hwpoison_free_buddy_page(),
> > > - lkp bot reported a build error when only 1/2 is applied.
> > > 
> > >   >mm/memory-failure.c: In function 'soft_offline_huge_page':
> > >   > >> mm/memory-failure.c:1610:8: error: implicit declaration of function
> > >   > 'set_hwpoison_free_buddy_page'; did you mean 'is_free_buddy_page'?
> > >   > [-Werror=implicit-function-declaration]
> > >   >if (set_hwpoison_free_buddy_page(page))
> > >   >^~~~
> > >   >is_free_buddy_page
> > >   >cc1: some warnings being treated as errors
> > > 
> > >   set_hwpoison_free_buddy_page() is defined in 2/2, so we can't use it
> > >   in 1/2. Simply doing 
> > > s/set_hwpoison_free_buddy_page/!TestSetPageHWPoison/
> > >   will fix this.
> > > 
> > > v1: https://lkml.org/lkml/2018/7/12/968
> > > 
> > 
> > Quite a bit of discussion on these two, but no actual acks or
> > review-by's?
> 
> Really sorry for late response.
> Xishi provided feedback on previous version, but no final ack/reviewed-by.
> This fix should work on the reported issue, but rewriting soft-offlining
> without PageHWPoison flag would be the better fix (no actual patch yet.)

If we can go with the latter then I would obviously prefer that. I cannot
promise to work on the patch though. I can help with reviewing, of
course.

If this is important enough that people are hitting the issue in normal
workloads then sure, let's go with the simple fix and continue on top of
that.
-- 
Michal Hocko
SUSE Labs

Re: [PATCH v2] mfd: arizona: Correct calling of runtime_put_sync

2018-08-22 Thread Charles Keepax

On Tue, Aug 21, 2018 at 07:52:44PM +0530, sapthagiri.bara...@gmail.com wrote:
> From: Sapthagiri Baratam 
> 
> Don't call runtime_put_sync when clk32k_ref is ARIZONA_32KZ_MCLK2
> as there is no corresponding runtime_get_sync call.
> 
> MCLK1 is not in the AoD power domain so if it is used as 32kHz clock
> source we need to hold a runtime PM reference to keep the device from
> going into low power mode.
> 
> fixes: cdd8da8cc66b ("mfd: arizona: Add gating of external MCLKn clocks")
> Signed-off-by: Sapthagiri Baratam 
> ---

Acked-by: Charles Keepax 

Thanks,
Charles

RE: [PATCH V5 00/10] mmc: add support for sdhci 4.0

2018-08-22 Thread 张春艳

On Thu, 16 Aug 2018 at 15:54, Chunyan Zhang  wrote:
>
> From the SD host controller version 4.0 on, SDHCI implementation either
> is version 3 compatible or version 4 mode. This patch-set covers those
> changes which are common for SDHCI 4.0 version, regardless of whether
> they are used with SD or eMMC storage devices.
>
> This patchset also added a new sdhci driver for Spreadtrum's controller
> which supports v4.0 mode.
>
> This patchset has been tested on Spreadtrum's mobile phone, emmc can be
> initialized, mounted, read and written, with these changes for common
> sdhci framework and sdhci-sprd driver.
>
> Changes from V4:
> * Addressed Adrian's comments:
> - Enable v4 mode in __sdhci_read_caps() and sdhci_init() instead of 
> sdhci_do_reset();
> - Move the added member 'v4_mode' to following with other bools;
> - Add more comments in the added function sdhci_config_dma();
> - Instead of enabling auto-CMD23 in init, enabled it only if receiving sbc 
> from
>   cards and the argument is suitable for host to deal with;
> - Make the addition of the SDHCI_SPEC_4xx defines a separate patch;
> - Disable auto-CMD23 if stuff bits is set in the argument of CMD23 in 
> sdhci_request().
>
> * For V4 mode, SDMA also can use auto-CMD23, adjusted host->flags in 
> sdhci_setup_host().
>
> Previous patch series:
> v4: https://lkml.org/lkml/2018/7/23/269
> v3: https://lkml.org/lkml/2018/7/8/239
> v2: https://lkml.org/lkml/2018/6/14/936
> v1: https://lkml.org/lkml/2018/6/8/108
>
> Chunyan Zhang (10):
>   mmc: sdhci: Add version V4 definition
>   mmc: sdhci: Add sd host v4 mode
>   mmc: sdhci: Change SDMA address register for v4 mode
>   mmc: sdhci: Add ADMA2 64-bit addressing support for V4 mode
>   mmc: sdhci: Add 32-bit block count support for v4 mode
>   mmc: sdhci: Disable auto-CMD23 if stuff bits is set in CMD23 argument
>   mmc: sdhci: Add Auto CMD Auto Select support
>   mmc: sdhci: SDMA may use Auto-CMD23 in v4 mode
>   mmc: sdhci-sprd: Add Spreadtrum's initial host controller
>   dt-bindings: sdhci-sprd: Add bindings for the sdhci-sprd controller
>
>  .../devicetree/bindings/mmc/sdhci-sprd.txt |  41 ++
>  drivers/mmc/host/Kconfig   |  13 +
>  drivers/mmc/host/Makefile  |   1 +
>  drivers/mmc/host/sdhci-sprd.c  | 464 
> +
>  drivers/mmc/host/sdhci.c   | 251 ---
>  drivers/mmc/host/sdhci.h   |  22 +-
>  6 files changed, 741 insertions(+), 51 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/mmc/sdhci-sprd.txt
>  create mode 100644 drivers/mmc/host/sdhci-sprd.c
>
> --
> 2.7.4
>

[PATCH] KVM: s390: vsie: Consolidate CRYCB validation

2018-08-22 Thread Pierre Morel

Currently when shadowing the CRYCB on SIE entrance, the validation
tests the following:
- accept only FORMAT1 or FORMAT2
- test if MSAext3 facility (76) is installed
- accept the CRYCB if no keys are used
- verify that the CRYCB format1 is inside a page
- verify that the CRYCB origin is not 0

This is not following the architecture.

On SIE entrance, the CRYCB must be validated before accepting
any of its entries.

Let's do the validation in the right order and also correctly
verify the FORMAT2 CRYCB.

The testing of facility MSAext3 (76) is not useful as it is
already tested by kvm_crypto_init() to set FORMAT1.

The testing of a null CRYCB origin must be done whatever
the format of the guest3 CRYCB is.

The CRYCB must be contained inside a page, but the CRYCB size
depends on the CRYCB format.
Let's test what the guest2 initialized; we cannot trust it to have
done things right.

Signed-off-by: Pierre Morel 
---
 arch/s390/kvm/vsie.c | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index a2b28cd..35c3907 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -158,28 +158,43 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct 
vsie_page *vsie_page)
scb_s->crycbd = 0;
if (!(crycbd_o & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
return 0;
-   /* format-1 is supported with message-security-assist extension 3 */
-   if (!test_kvm_facility(vcpu->kvm, 76))
-   return 0;
+   /*
+* If APIE is set or if the CRYCB format is FORMAT1 or FORMAT2 with
+* APXA installed, the machine checks the validity of crycb origin.
+* KVM kvm_s390_crypto_init() makes sure that FORMAT2 is only used
+* if APXA is installed.
+* The guest2 hypervisor could have set APIE and Format2 so let's
+* test all these points.
+* We here have always a CRYCB FORMAT1 or FORMAT2 (FORMAT0 was
+* refused in previous test).
+*/
+   if (!crycb_addr)
+   return set_validity_icpt(scb_s, 0x0039U);
+
+   if ((crycbd_o & 0x03) == CRYCB_FORMAT1)
+   if ((crycb_addr & PAGE_MASK) !=
+  ((crycb_addr + 128) & PAGE_MASK))
+   return set_validity_icpt(scb_s, 0x003CU);
+
+   if ((crycbd_o & 0x03) == CRYCB_FORMAT2)
+   if ((crycb_addr & PAGE_MASK) !=
+  ((crycb_addr + 256) & PAGE_MASK))
+   return set_validity_icpt(scb_s, 0x003CU);
+
/* we may only allow it if enabled for guest 2 */
ecb3_flags = scb_o->ecb3 & vcpu->arch.sie_block->ecb3 &
 (ECB3_AES | ECB3_DEA);
if (!ecb3_flags)
return 0;
 
-   if ((crycb_addr & PAGE_MASK) != ((crycb_addr + 128) & PAGE_MASK))
-   return set_validity_icpt(scb_s, 0x003CU);
-   else if (!crycb_addr)
-   return set_validity_icpt(scb_s, 0x0039U);
-
/* copy only the wrapping keys */
if (read_guest_real(vcpu, crycb_addr + 72,
vsie_page->crycb.dea_wrapping_key_mask, 56))
return set_validity_icpt(scb_s, 0x0035U);
 
scb_s->ecb3 |= ecb3_flags;
-   scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT1 |
-   CRYCB_FORMAT2;
+   /* Set the shadow CRYCB format to format 2 */
+   scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT2;
 
/* xor both blocks in one run */
b1 = (unsigned long *) vsie_page->crycb.dea_wrapping_key_mask;
-- 
2.7.4
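
The format-dependent "must be inside a page" test from the diff can be tried
out in isolation. This is a minimal sketch: the 128- and 256-byte sizes are
taken from the patch, while crycb_crosses_page(), SK_PAGE_SIZE and the 4096-byte
page size are assumptions of the sketch. It deliberately mirrors the patch's
comparison of addr and addr + size, so a block that ends exactly on a page
boundary is also reported as crossing.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_PAGE_SIZE UINT64_C(4096)          /* assumed page size */
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/*
 * Same shape as the check in the diff:
 *   (addr & PAGE_MASK) != ((addr + size) & PAGE_MASK)
 * Returns true when addr and addr + size fall into different pages.
 */
static bool crycb_crosses_page(uint64_t addr, uint64_t size)
{
	return (addr & SK_PAGE_MASK) != ((addr + size) & SK_PAGE_MASK);
}
```

For example, a block at 0x1f00 passes with the 128-byte size but fails with
the 256-byte size, which is why a single hard-coded 128 is not enough once
both formats are accepted.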

Re: [RFC v2 2/2] mm/memory_hotplug: Shrink spanned pages when offlining memory

2018-08-22 Thread David Hildenbrand

On 22.08.2018 09:50, Oscar Salvador wrote:
> On Tue, Aug 21, 2018 at 03:17:10PM +0200, David Hildenbrand wrote:
>>> add_device_memory is in charge of
>>
>> I wouldn't use the terminology of onlining/offlining here. That applies
>> rather to memory that is exposed to the rest of the system (e.g. buddy
>> allocator, has underlying memory block devices). I guess it is rather a
>> pure setup/teardown of that device memory.
> 
> Hi David,
> 
> I am not sure if you are referring to:
> 
> "
> a) calling either arch_add_memory() or add_pages(), depending on whether
>we want a linear mapping
> b) online the memory sections that correspond to the pfn range
> c) calling move_pfn_range_to_zone() being zone ZONE_DEVICE to
>expand zone/pgdat spanned pages and initialize its pages
> "
> 
> Well, that is partially true.
> I mean, in order to make this work, we need to offline/online the memory
> sections, because shrink_pages will rely on that from now on.
> That is what we do when onlining/offlining pages, but since device memory
> does not go through the "official" channels, we need to do it there
> as well.
> 
> Sure I can use another terminology, but since that is what
> offline/online_mem_sections do, I just came up with that.
> 

Okay, got it, so it is basically "mark the sections as online/offline".

>> I would really like to see the mem_hotplug_begin/end also getting moved
>> inside add_device_memory()/del_device_memory(). (just like for
>> add/remove_memory)
>>
>> I wonder if kasan_ stuff actually requires this lock, or if it could
>> also be somehow moved inside add_device_memory/del_device_memory.
> 
> Yes, that was my first approach, but then I saw that the kasan stuff is being
> handled within those locks, so I was not sure and I backed off, leaving the
> mem_hotplug_begin/end where they were.
> 
> Maybe Jerome can shed some light, and we can just handle the kasan stuff
> out of the locks.
> 
>> Maybe shorten that a bit
>>
>> "HMM/devm memory does not have IORESOURCE_SYSTEM_RAM set. They use
>>  devm_request_mem_region/devm_release_mem_region to add/release a
>>  resource. Just back off here."
> 
> Uhm, fair enough.
> 
>> Any reason for these indirections?
> 
> I wanted to hide the internals in the memory_hotplug code.
> I thought about removing them, but I finally left them.
> If people think that we are better off without them, I can just
> remove them.

I don't see a need for that. (Everyone following the functions has to go
via one indirection that just passes on parameters.) It is also not done
for other functions (e.g. add_memory).

> 
>> I guess for readability, this patch could be split up into several
>> patches. E.g. factoring out of add_device_memory/del_device_memory,
>> release_mem_region_adjustable change ...
> 
> Yes, really true.
> But I wanted first to gather feedback mainly from HMM/devm people to see
> if they saw an outright bug within the series because I am not so
> familiar with that part of the code.
> 
> Feedback from Jerome/Dan will be appreciated as well to see if this is a good
> direction.

Yes, they probably know best how this all fits together.

> 
> But you are right, in the end, this will have to be split up into several
> parts to ease the review.
> 
> Thanks for reviewing this David!
> I will try to address your concerns.
> 
> Thanks 
> 


-- 

Thanks,

David / dhildenb

Re: SEV guest regression in 4.18

2018-08-22 Thread Borislav Petkov

Dropping Pavel as it bounces.

On Tue, Aug 21, 2018 at 11:07:38AM -0500, Brijesh Singh wrote:
> The tsc_early_init() is called before setup_arch() -> init_mem_mapping.

Ok, I see it, thanks for explaining.

So back to your original ideas - I'm wondering whether we should define
a chunk of memory which the hypervisor and guest can share and thus
communicate over... Something à la SEV-ES, also with a strictly defined
layout, and put all those variables there. And then the guest can map it
decrypted.

There might be something similar though, I dunno.

Maybe Paolo has a better idea...

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
--

Re: [PATCH] staging: rtl8188eu: Fix spelling mistake

2018-08-22 Thread Dan Carpenter

On Tue, Aug 21, 2018 at 07:14:28AM +0530, Bhaskar Singh wrote:
> This patch fixes spelling mistakes in TODO.
> 

Btw, it helps when you say which word you're changing, otherwise it
takes a while to spot the difference.  We changed "HGz" to "GHz".

Probably someone smarter than I am would have spotted it faster...

regards,
dan carpenter

Re: [PATCH] soc: ti: pm33xx: Enable DS0 for the platforms on which it is functional

2018-08-22 Thread J, KEERTHY





On 8/22/2018 1:07 PM, Johan Hovold wrote:

On Wed, Aug 22, 2018 at 09:34:09AM +0200, Johan Hovold wrote:

On Wed, Aug 22, 2018 at 11:02:31AM +0530, Keerthy wrote:

Enable DS0 for only those platforms on which it is functional

Signed-off-by: Keerthy 
---
  arch/arm/mach-omap2/pm33xx-core.c| 5 +
  drivers/soc/ti/pm33xx.c  | 9 +
  include/linux/platform_data/pm33xx.h | 2 ++
  3 files changed, 16 insertions(+)

diff --git a/arch/arm/mach-omap2/pm33xx-core.c 
b/arch/arm/mach-omap2/pm33xx-core.c
index f4971e4..f0f6e8e 100644
--- a/arch/arm/mach-omap2/pm33xx-core.c
+++ b/arch/arm/mach-omap2/pm33xx-core.c
@@ -135,6 +135,11 @@ static int am43xx_suspend(unsigned int state, int 
(*fn)(unsigned long),
  {
int ret = 0;
  
+	if (!(args & WFI_FLAG_DEEP_SLEEP0)) {

+   pr_err("DS0 mode not supported\n");
+   return -ENOTSUPP;
+   }
+
amx3_pre_suspend_common();
scu_power_mode(scu_base, SCU_PM_POWEROFF);
ret = cpu_suspend(args, fn);
diff --git a/drivers/soc/ti/pm33xx.c b/drivers/soc/ti/pm33xx.c
index d0dab32..53238d7 100644
--- a/drivers/soc/ti/pm33xx.c
+++ b/drivers/soc/ti/pm33xx.c
@@ -324,6 +324,15 @@ static int am33xx_pm_probe(struct platform_device *pdev)
suspend_wfi_flags |= WFI_FLAG_SAVE_EMIF;
suspend_wfi_flags |= WFI_FLAG_WAKE_M3;
  
+	/*

+* Deep Sleep0 mode is currently functional only on am437x-gp-evm,
+* am33xx-evm and boneblack family. Hence set the DS0 flag
+*/
+   if (of_machine_is_compatible("ti,am437x-gp-evm") ||
+   of_machine_is_compatible("ti,am335x-bone-black") ||
+   of_machine_is_compatible("ti,am335x-evm"))
+   suspend_wfi_flags |= WFI_FLAG_DEEP_SLEEP0;


What about other (out-of-tree) machines which support DS0 and which
this change would break?

I think this needs to be a blacklist if anything.

Please also expand in the commit message why you think this is needed.


Currently, when one does echo mem > /sys/power/state on unsupported
machines, there can be a crash or a hang. So bail out with a message.




Last, what tree is this against? There's no am43xx_suspend() in
linux-next (and you add compatibles above for am33xx too).


Sorry, there is indeed an am43xx_suspend(), but you are adding
compatibles for am33xx which use am33xx_suspend().


am33xx_pm_probe is a common probe function for both am33 and am43.
AFAIK, in the am33 family, am335x-evm and am335x-bone-black support Deep
Sleep mode. In the am43 family, only am437x-gp-evm supports it at the moment.


Can you let me know of other am33 machines that support DS0 mode?
I could have simply used the ti,am33xx compatible, which covers the entire am33
family, but then am33xx-bone (bone white) does not support this mode.




Johan
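
The whitelist behaviour under discussion can be sketched in userspace. The
three compatible strings are the ones from the patch; everything else
(SK_FLAG_DEEP_SLEEP0 and its bit value, probe_flags(), suspend_check(), and the
use of the userspace ENOTSUP in place of the kernel's ENOTSUPP) is a
hypothetical stand-in for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit; the real WFI_FLAG_DEEP_SLEEP0 value is not shown in the thread. */
#define SK_FLAG_DEEP_SLEEP0 (1u << 0)

/* Boards the patch lists as having working DS0. */
static const char *const ds0_boards[] = {
	"ti,am437x-gp-evm",
	"ti,am335x-bone-black",
	"ti,am335x-evm",
};

/* Probe time: set the DS0 flag only for whitelisted boards. */
static unsigned int probe_flags(const char *compatible)
{
	unsigned int flags = 0;

	for (size_t i = 0; i < sizeof(ds0_boards) / sizeof(ds0_boards[0]); i++) {
		if (strcmp(compatible, ds0_boards[i]) == 0)
			flags |= SK_FLAG_DEEP_SLEEP0;
	}
	return flags;
}

/* Suspend time: refuse early instead of crashing or hanging. */
static int suspend_check(unsigned int flags)
{
	return (flags & SK_FLAG_DEEP_SLEEP0) ? 0 : -ENOTSUP;
}
```

Any board that is not listed, including an out-of-tree board with working DS0,
gets the error in this model, which is the concern behind the blacklist
suggestion.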

Re: [PATCH] KVM: s390: vsie: Consolidate CRYCB validation

2018-08-22 Thread David Hildenbrand

On 22.08.2018 10:08, Pierre Morel wrote:
> Currently when shadowing the CRYCB on SIE entrance, the validation
> tests the following:
> - accept only FORMAT1 or FORMAT2
> - test if MSAext facility (76) is installed
> - accept the CRYCB if no keys are used
> - verifies that the CRYCB format1 is inside a page
> - verifies that the CRYCB origin is not 0
> 
> This is not following the architecture.

I have to trust you on that :)

> 
> On SIE entrance, the CRYCB must be validated before accepting
> any of its entries.
> 
> Let's do the validation in the right order and also verify
> correctly the FORMAT2 CRYCB.

With which facility was FORMAT2 introduced?

Does MSA3 imply that FORMAT2 can be used? (even if AP is absent)

FORMAT2 is backwards compatible with FORMAT1.

> 
> The testing of facility MSAext3 (76) is not useful as it is
> already tested by kvm_crypto_init() to set FORMAT1.

Indeed, having FORMAT1 in g1 implies that.

> 
> The testing of a null CRYCB origin must be done whatever
> the format of the guest3 CRYCB is.
> 
> The CRYCB must be contained inside a page, but the CRYCB size
> depends on the CRYCB format.
> Let's test what the guest2 initialized; we cannot trust it to have
> done things right.
> 
> Signed-off-by: Pierre Morel 
> ---
>  arch/s390/kvm/vsie.c | 35 +--
>  1 file changed, 25 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
> index a2b28cd..35c3907 100644
> --- a/arch/s390/kvm/vsie.c
> +++ b/arch/s390/kvm/vsie.c
> @@ -158,28 +158,43 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct 
> vsie_page *vsie_page)
>   scb_s->crycbd = 0;
>   if (!(crycbd_o & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
>   return 0;
> - /* format-1 is supported with message-security-assist extension 3 */
> - if (!test_kvm_facility(vcpu->kvm, 76))
> - return 0;
> + /*
> +  * If APIE is set or if the CRYCB format is FORMAT1 or FORMAT2 with
> +  * APXA installed, the machine checks the validity of crycb origin.
> +  * KVM kvm_s390_crypto_init() makes sure that FORMAT2 is only used
> +  * if APXA is installed.
> +  * The guest2 hypervisor could have set APIE and Format2 so let's
> +  * test all these points.
> +  * We here have always a CRYCB FORMAT1 or FORMAT2 (FORMAT0 was
> +  * refused in previous test).

Can you shorten that comment and leave out all stuff to be added next?
(APIE, APXA ...). I guess this whole comment is to be left out of this
patch.

> +  */
> + if (!crycb_addr)
> + return set_validity_icpt(scb_s, 0x0039U);
> +
> + if ((crycbd_o & 0x03) == CRYCB_FORMAT1)

Can you define CRYCB_FORMAT_MASK instead of using 0x03?

> + if ((crycb_addr & PAGE_MASK) !=
> +((crycb_addr + 128) & PAGE_MASK))

Please add one space in front of the second line to indent it properly.

> + return set_validity_icpt(scb_s, 0x003CU);
> +
> + if ((crycbd_o & 0x03) == CRYCB_FORMAT2)
> + if ((crycb_addr & PAGE_MASK) !=
> +((crycb_addr + 256) & PAGE_MASK))

ditto

> + return set_validity_icpt(scb_s, 0x003CU);
> +
>   /* we may only allow it if enabled for guest 2 */
>   ecb3_flags = scb_o->ecb3 & vcpu->arch.sie_block->ecb3 &
>(ECB3_AES | ECB3_DEA);
>   if (!ecb3_flags)
>   return 0;
>  
> - if ((crycb_addr & PAGE_MASK) != ((crycb_addr + 128) & PAGE_MASK))
> - return set_validity_icpt(scb_s, 0x003CU);
> - else if (!crycb_addr)
> - return set_validity_icpt(scb_s, 0x0039U);
> -
>   /* copy only the wrapping keys */
>   if (read_guest_real(vcpu, crycb_addr + 72,
>   vsie_page->crycb.dea_wrapping_key_mask, 56))
>   return set_validity_icpt(scb_s, 0x0035U);
>  
>   scb_s->ecb3 |= ecb3_flags;
> - scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT1 |
> - CRYCB_FORMAT2;
> + /* Set the shadow CRYCB format to format 2 */
I don't consider this comment helpful (CRYCB_FORMAT2 below is at least
obvious to me) - CRYCB_FORMAT2 implies CRYCB_FORMAT1 (what the existing
code did not consider)

> + scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT2;
>  
>   /* xor both blocks in one run */
>   b1 = (unsigned long *) vsie_page->crycb.dea_wrapping_key_mask;
> 

Thanks for looking into this.

-- 

Thanks,

David / dhildenb

Re: [PATCH] x86/kvm/vmx: Fix GPF on reading vmentry_l1d_flush

2018-08-22 Thread Jinpu Wang

> From: MINOURA Makoto / 箕浦 真 
> Date: Wed, Aug 22, 2018 at 9:50 AM
> Subject: [PATCH] x86/kvm/vmx: Fix GPF on reading vmentry_l1d_flush
> To: 
> Cc: 
>
>
>
> When EPT is not enabled, reading
> /sys/module/kvm_intel/parameters/vmentry_l1d_flush causes
> general protection fault in vmentry_l1d_flush_get() due to
> access beyond the end of the array vmentry_l1d_param[].
>
> Signed-off-by: Minoura Makoto 
> ---
>  arch/x86/include/asm/vmx.h | 1 +
>  arch/x86/kvm/vmx.c | 4 +++-
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
> index 95f9107449bf..c4b834b05178 100644
> --- a/arch/x86/include/asm/vmx.h
> +++ b/arch/x86/include/asm/vmx.h
> @@ -581,6 +581,7 @@ enum vmx_l1d_flush_state {
> VMENTER_L1D_FLUSH_NEVER,
> VMENTER_L1D_FLUSH_COND,
> VMENTER_L1D_FLUSH_ALWAYS,
> +   VMENTER_L1D_FLUSH_PARAM_MAX = VMENTER_L1D_FLUSH_ALWAYS,
> VMENTER_L1D_FLUSH_EPT_DISABLED,
> VMENTER_L1D_FLUSH_NOT_REQUIRED,
>  };
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 1519f030fd73..155ba2a9139f 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -204,6 +204,8 @@ static const struct {
> {"never",   VMENTER_L1D_FLUSH_NEVER},
> {"cond",VMENTER_L1D_FLUSH_COND},
> {"always",  VMENTER_L1D_FLUSH_ALWAYS},
> +   {"ept-disabled", VMENTER_L1D_FLUSH_EPT_DISABLED},
> +   {"not-required", VMENTER_L1D_FLUSH_NOT_REQUIRED},
>  };
>
>  #define L1D_CACHE_ORDER 4
> @@ -286,7 +288,7 @@ static int vmentry_l1d_flush_parse(const char *s)
> unsigned int i;
>
> if (s) {
> -   for (i = 0; i < ARRAY_SIZE(vmentry_l1d_param); i++) {
> +   for (i = 0; i <= VMENTER_L1D_FLUSH_PARAM_MAX; i++) {
> if (sysfs_streq(s, vmentry_l1d_param[i].option))
> return vmentry_l1d_param[i].cmd;
> }
Easy to reproduce. Thanks.
Tested-by: Jack Wang 

--
Jack Wang
Linux Kernel Developer

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin

Tel:   +49 30 577 008  042
Fax:  +49 30 577 008 299
Email:jinpu.w...@profitbricks.com
URL:  https://www.profitbricks.de

Sitz der Gesellschaft: Berlin
Registergericht: Amtsgericht Charlottenburg, HRB 125506 B
Geschäftsführer: Achim Weiss, Matthias Steinberg, Christoph Steffens
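
The out-of-bounds read described in the patch above can be shown in miniature. What follows is a minimal userspace sketch, not the kernel code: every `sketch_` name is hypothetical, and the "auto" entry is an assumption because it lies outside the quoted hunk. It illustrates the two halves of the fix: the name table covers every enum value the getter may have to print, while parsing stops at the last value a user is allowed to request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the flush states; the last two are only set internally. */
enum sketch_l1d_state {
	SKETCH_L1D_AUTO,	/* assumed entry, not visible in the quoted hunk */
	SKETCH_L1D_NEVER,
	SKETCH_L1D_COND,
	SKETCH_L1D_ALWAYS,
	SKETCH_L1D_PARAM_MAX = SKETCH_L1D_ALWAYS,
	SKETCH_L1D_EPT_DISABLED,
	SKETCH_L1D_NOT_REQUIRED,
};

static const struct {
	const char *option;
	enum sketch_l1d_state cmd;
} sketch_l1d_param[] = {
	{"auto",         SKETCH_L1D_AUTO},
	{"never",        SKETCH_L1D_NEVER},
	{"cond",         SKETCH_L1D_COND},
	{"always",       SKETCH_L1D_ALWAYS},
	{"ept-disabled", SKETCH_L1D_EPT_DISABLED},
	{"not-required", SKETCH_L1D_NOT_REQUIRED},
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Getter: never index past the table, whatever the current state is. */
static const char *sketch_l1d_name(unsigned int state)
{
	if (state >= SKETCH_ARRAY_SIZE(sketch_l1d_param))
		return "unknown";
	return sketch_l1d_param[state].option;
}

/* Parser: only entries up to PARAM_MAX may be requested by the user. */
static int sketch_l1d_parse(const char *s)
{
	unsigned int i;

	assert(s != NULL);
	for (i = 0; i <= SKETCH_L1D_PARAM_MAX; i++) {
		if (strcmp(s, sketch_l1d_param[i].option) == 0)
			return (int)sketch_l1d_param[i].cmd;
	}
	return -1;
}
```

In this sketch the extra bounds check in the getter is a belt-and-braces addition; the patch itself relies on the table being extended.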

[PATCH v4 0/4] Fix debug macros and their usages

2018-08-22 Thread Nishad Kamdar

This patchset fixes the four debug macros N_MSG, ERR_MSG, INIT_MSG and
IRQ_MSG. Each patch fixes one particular macro and its usages.

For N_MSG, replaces printk with dev_ without __func__ or __LINE__
or current->comm and current->pid. Removes the do {} while(0) loop for
the single statement macro.

For ERR_MSG and IRQ_MSG, makes the same changes as those for N_MSG, but
further drops the macros and replaces their usages with dev_.

Removes the INIT_MSG macro and its usages.

Changes in v4:
  - Create multiple patches, one for each type of macro being
deleted/changed.

Changes in v3:
  - Replace usages of ERR_MSG and IRQ_MSG with dev_err() in code itself.
  - Remove all INIT_MSG usages.
  - Drop ERR_MSG, INIT_MSG and IRQ_MSG from dbg.h.

Changes in v2:
  - Replace printk with dev_.
  - Remove __func__, __LINE__, current->comm, current->pid from arguments.
  - Remove the do {} while(0) loop from these macros.
  - Modify commit message to include other changes.
-
Nishad Kamdar (4):
  staging: mt7621-mmc: Fix debug macro N_MSG
  staging: mt7621-mmc: Fix debug macro ERR_MSG and its usages
  staging: mt7621-mmc: Remove macro INIT_MSG and its usages
  staging: mt7621-mmc: Fix debug macro IRQ_MSG and its usages

 drivers/staging/mt7621-mmc/dbg.h |  36 +--
 drivers/staging/mt7621-mmc/sd.c  | 180 ---
 2 files changed, 121 insertions(+), 95 deletions(-)

-- 
2.17.1

[PATCH v4 1/4] staging: mt7621-mmc: Fix debug macro N_MSG

2018-08-22 Thread Nishad Kamdar

This patch fixes the debug macro N_MSG. Replaces printk with
dev_ without __func__ or __LINE__ or current->comm and
current->pid. Removes the do {} while(0) loop for the single
statement macro. Issue found by checkpatch.

Signed-off-by: Nishad Kamdar 
---
 drivers/staging/mt7621-mmc/dbg.h | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/mt7621-mmc/dbg.h b/drivers/staging/mt7621-mmc/dbg.h
index 2f2c56b73987..c56fb896617a 100644
--- a/drivers/staging/mt7621-mmc/dbg.h
+++ b/drivers/staging/mt7621-mmc/dbg.h
@@ -104,13 +104,10 @@ do { \
 
 #define N_MSG(evt, fmt, args...)
 /*
-do {\
-if ((DBG_EVT_##evt) & sd_debug_zone[host->id]) { \
-printk(KERN_ERR TAG"%d -> "fmt" <- %s() : L<%d> PID<%s><0x%x>\n", \
-host->id,  ##args , __FUNCTION__, __LINE__, current->comm, current->pid);  \
-} \
-} while(0)
-*/
+ *if ((DBG_EVT_##evt) & sd_debug_zone[host->id]) { \
+ *dev_err(mmc_dev(host->mmc), "%d -> " fmt "\n", host->id, ##args) \
+ *}
+ */
 
 #define ERR_MSG(fmt, args...) \
 do { \
-- 
2.17.1

Re: [PATCH] KVM: s390: vsie: Consolidate CRYCB validation

2018-08-22 Thread Pierre Morel


On 22/08/2018 10:25, David Hildenbrand wrote:

On 22.08.2018 10:08, Pierre Morel wrote:

Currently when shadowing the CRYCB on SIE entrance, the validation
tests the following:
- accept only FORMAT1 or FORMAT2
- test if MSAext facility (76) is installed
- accept the CRYCB if no keys are used
- verifies that the CRYCB format1 is inside a page
- verifies that the CRYCB origin is not 0

This is not following the architecture.

I have to trust you on that :)


On SIE entrance, the CRYCB must be validated before accepting
any of its entries.

Let's do the validation in the right order and also verify
correctly the FORMAT2 CRYCB.

With which facility was FORMAT2 introduced?

With APXA.
KVM initialization sets up the CRYCB format according to the presence
of APXA: FORMAT2 if APXA is present, FORMAT1 otherwise.



Does MSA3 imply that FORMAT2 can be used? (even if AP is absent)


Not exactly.
If AP is absent, FORMAT2 may still be defined, independently of MSA3, but the SIE
silently ignores bit 30, i.e. it uses FORMAT1 instead.



FORMAT2 is backwards compatible to FORMAT1,


For what MSA3 implies, yes.




The testing of facility MSAext3 (76) is not useful as it is
already tested by kvm_crypto_init() to set FORMAT1.

Indeed, having FORMAT1 in g1 implies that.


The testing of a null CRYCB origin must be done whatever
the format of the guest3 CRYCB is.

The CRYCB must be contained inside a page, but the CRYCB size
depends on the CRYCB format.
Let's test what guest2 initialized; we cannot trust it to have
done things right.

Signed-off-by: Pierre Morel 
---
  arch/s390/kvm/vsie.c | 35 +--
  1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index a2b28cd..35c3907 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -158,28 +158,43 @@ static int shadow_crycb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
scb_s->crycbd = 0;
if (!(crycbd_o & vcpu->arch.sie_block->crycbd & CRYCB_FORMAT1))
return 0;
-   /* format-1 is supported with message-security-assist extension 3 */
-   if (!test_kvm_facility(vcpu->kvm, 76))
-   return 0;
+   /*
+* If APIE is set or if the CRYCB Format is FORMAT1 or FORMAT2 with
+* APXA installed, the machine checks the validity of crycb origin.
+* KVM kvm_s390_crypto_init() makes sure that FORMAT2 is only used
+* if APXA is installed.
+* The guest2 hypervisor could have set APIE and Format2 so let's
+* test all these points.
+* We here have always a CRYCB FORMAT1 or FORMAT2 (FORMAT0 was
+* refused in previous test).

Can you shorten that comment and leave out all stuff to be added next?
(APIE, APXA ...). I guess this whole comment is to be left out of this
patch.

OK



+*/
+   if (!crycb_addr)
+   return set_validity_icpt(scb_s, 0x0039U);
+
+   if ((crycbd_o & 0x03) == CRYCB_FORMAT1)

Can you instead of 0x03 define CRYCB_FORMAT_MASK

OK




+   if ((crycb_addr & PAGE_MASK) !=
+  ((crycb_addr + 128) & PAGE_MASK))

please add one space in front of the second line to properly indent

yes



+   return set_validity_icpt(scb_s, 0x003CU);
+
+   if ((crycbd_o & 0x03) == CRYCB_FORMAT2)
+   if ((crycb_addr & PAGE_MASK) !=
+  ((crycb_addr + 256) & PAGE_MASK))

ditto

yes :)




+   return set_validity_icpt(scb_s, 0x003CU);
+
/* we may only allow it if enabled for guest 2 */
ecb3_flags = scb_o->ecb3 & vcpu->arch.sie_block->ecb3 &
 (ECB3_AES | ECB3_DEA);
if (!ecb3_flags)
return 0;
  
-	if ((crycb_addr & PAGE_MASK) != ((crycb_addr + 128) & PAGE_MASK))

-   return set_validity_icpt(scb_s, 0x003CU);
-   else if (!crycb_addr)
-   return set_validity_icpt(scb_s, 0x0039U);
-
/* copy only the wrapping keys */
if (read_guest_real(vcpu, crycb_addr + 72,
vsie_page->crycb.dea_wrapping_key_mask, 56))
return set_validity_icpt(scb_s, 0x0035U);
  
  	scb_s->ecb3 |= ecb3_flags;

-   scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT1 |
-   CRYCB_FORMAT2;
+   /* Set the shadow CRYCB format to format 2 */

I don't consider this comment helpful (CRYCB_FORMAT2 below is at least
obvious to me) - CRYCB_FORMAT2 implies CRYCB_FORMAT1 (what the existing
code did not consider)


OK, but I will still keep the simplification below.




+   scb_s->crycbd = ((__u32)(__u64) &vsie_page->crycb) | CRYCB_FORMAT2;
  
  	/* xor both blocks in one run */

b1 = (unsigned long *) vsie_page->crycb.dea_wrapping_key_mask;


Thanks for looking into this.


Thanks for the comments

best regards,

Pierre

--
Pierre Morel
Linux/KVM/QEMU in Böblingen - Germany
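
The validation order discussed in this thread (null origin first, then a format-dependent page-containment check) can be sketched in plain C. This is a minimal userspace illustration under stated assumptions, not the vsie code: the `sketch_` names are hypothetical, the format encodings 0x01 and 0x03 are assumed from the `crycbd_o & 0x03` comparison in the patch, and the sizes 128 and 256 are taken from the quoted hunk, which is still under review in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   UINT64_C(4096)
#define SKETCH_PAGE_MASK   (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_FORMAT_MASK 0x03u	/* assumed: low two bits select the format */
#define SKETCH_FORMAT1     0x01u
#define SKETCH_FORMAT2     0x03u

/* Same expression as in the patch: compare the page of addr and of addr + len. */
static int sketch_crosses_page(uint64_t addr, uint64_t len)
{
	assert(len > 0);
	return (addr & SKETCH_PAGE_MASK) != ((addr + len) & SKETCH_PAGE_MASK);
}

/* Returns 0 if accepted, otherwise the validity code used in the patch. */
static unsigned int sketch_validate_crycb(uint32_t crycbd, uint64_t crycb_addr)
{
	uint64_t len;

	/* The origin is checked first, whatever the format is. */
	if (!crycb_addr)
		return 0x0039;

	switch (crycbd & SKETCH_FORMAT_MASK) {
	case SKETCH_FORMAT1:
		len = 128;
		break;
	case SKETCH_FORMAT2:
		len = 256;
		break;
	default:
		return 0;	/* other formats are not handled by this sketch */
	}

	return sketch_crosses_page(crycb_addr, len) ? 0x003C : 0;
}
```

Note that with this expression a block ending exactly at the end of a page is reported as crossing, because `addr + len` is the first byte after the block.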

[PATCH v4 2/4] staging: mt7621-mmc: Fix debug macro ERR_MSG and its usages

2018-08-22 Thread Nishad Kamdar

Replace all usages of ERR_MSG with dev_ without __func__
or __LINE__ or current->comm and current->pid. Remove the do {}
while(0) loop for the single statement macro. Drop ERR_MSG from dbg.h.
Issue found by checkpatch.

Signed-off-by: Nishad Kamdar 
---
 drivers/staging/mt7621-mmc/dbg.h |   6 --
 drivers/staging/mt7621-mmc/sd.c  | 128 ++-
 2 files changed, 90 insertions(+), 44 deletions(-)

diff --git a/drivers/staging/mt7621-mmc/dbg.h b/drivers/staging/mt7621-mmc/dbg.h
index c56fb896617a..71295df59ed0 100644
--- a/drivers/staging/mt7621-mmc/dbg.h
+++ b/drivers/staging/mt7621-mmc/dbg.h
@@ -109,12 +109,6 @@ do { \
  *}
  */
 
-#define ERR_MSG(fmt, args...) \
-do { \
-   printk(KERN_ERR TAG"%d -> "fmt" <- %s() : L<%d> PID<%s><0x%x>\n", \
-  host->id,  ##args, __FUNCTION__, __LINE__, current->comm, current->pid); \
-} while (0);
-
 #if 1
 //defined CONFIG_MTK_MMC_CD_POLL
 #define INIT_MSG(fmt, args...)
diff --git a/drivers/staging/mt7621-mmc/sd.c b/drivers/staging/mt7621-mmc/sd.c
index 04d23cc7cd4a..6b2c72fc61f2 100644
--- a/drivers/staging/mt7621-mmc/sd.c
+++ b/drivers/staging/mt7621-mmc/sd.c
@@ -466,7 +466,8 @@ static void msdc_set_mclk(struct msdc_host *host, int ddr, unsigned int hz)
//u8  clksrc = hw->clk_src;
 
if (!hz) { // set mmc system clock to 0 ?
-   //ERR_MSG("set mclk to 0!!!");
+   //dev_err(mmc_dev(host->mmc), "%d -> set mclk to 0!!!\n",
+   //host->id);
msdc_reset_hw(host);
return;
}
@@ -521,7 +522,7 @@ static void msdc_abort_data(struct msdc_host *host)
 {
struct mmc_command *stop = host->mrq->stop;
 
-   ERR_MSG("Need to Abort.");
+   dev_err(mmc_dev(host->mmc), "%d -> Need to Abort.\n", host->id);
 
msdc_reset_hw(host);
msdc_clr_fifo(host);
@@ -530,7 +531,8 @@ static void msdc_abort_data(struct msdc_host *host)
// need to check FIFO count 0 ?
 
if (stop) {  /* try to stop, but may not success */
-   ERR_MSG("stop when abort CMD<%d>", stop->opcode);
+   dev_err(mmc_dev(host->mmc), "%d -> stop when abort CMD<%d>\n",
+   host->id, stop->opcode);
(void)msdc_do_command(host, stop, 0, CMD_TIMEOUT);
}
 
@@ -688,13 +690,17 @@ static void msdc_pm(pm_message_t state, void *data)
 
} else if (evt == PM_EVENT_RESUME || evt == PM_EVENT_USER_RESUME) {
if (!host->suspend) {
-   //ERR_MSG("warning: already resume");
+   //dev_err(mmc_dev(host->mmc),
+   //"%d -> warning: already resume\n",
+   //host->id);
return;
}
 
/* No PM resume when USR suspend */
if (evt == PM_EVENT_RESUME && host->pm_state.event == PM_EVENT_USER_SUSPEND) {
-   ERR_MSG("PM Resume when in USR Suspend");   /* won't happen. */
+   dev_err(mmc_dev(host->mmc),
+   "%d -> PM Resume when in USR Suspend\n",
+   host->id); /* won't happen. */
return;
}
 
@@ -812,7 +818,9 @@ static unsigned int msdc_command_start(struct msdc_host   *host,
break;
 
if (time_after(jiffies, tmo)) {
-   ERR_MSG("XXX cmd_busy timeout: before CMD<%d>", opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX cmd_busy timeout: before CMD<%d>\n",
+   host->id, opcode);
cmd->error = -ETIMEDOUT;
msdc_reset_hw(host);
goto end;
@@ -823,7 +831,9 @@ static unsigned int msdc_command_start(struct msdc_host   *host,
if (!sdc_is_busy())
break;
if (time_after(jiffies, tmo)) {
-   ERR_MSG("XXX sdc_busy timeout: before CMD<%d>", opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX sdc_busy timeout: before CMD<%d>\n",
+   host->id, opcode);
cmd->error = -ETIMEDOUT;
msdc_reset_hw(host);
goto end;
@@ -862,7 +872,9 @@ static unsigned int msdc_command_resp(struct msdc_host   *host,
 
spin_unlock(&host->lock);
if (!wait_for_completion_timeout(&host->cmd_done, 10 * timeout)) {
-   ERR_MSG("XXX CMD<%d> wait_for_completion timeout ARG<0x%.8x>", opcode, cmd->arg);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> wait_for_completion timeout 
ARG

Re: [PATCH] soc: ti: pm33xx: Enable DS0 for the platforms on which it is functional

2018-08-22 Thread Johan Hovold

On Wed, Aug 22, 2018 at 01:50:29PM +0530, J, KEERTHY wrote:
> 
> 
> On 8/22/2018 1:07 PM, Johan Hovold wrote:
> > On Wed, Aug 22, 2018 at 09:34:09AM +0200, Johan Hovold wrote:
> >> On Wed, Aug 22, 2018 at 11:02:31AM +0530, Keerthy wrote:
> >>> Enable DS0 for only those platforms on which it is functional
> >>>
> >>> Signed-off-by: Keerthy 
> >>> ---
> >>>   arch/arm/mach-omap2/pm33xx-core.c| 5 +
> >>>   drivers/soc/ti/pm33xx.c  | 9 +
> >>>   include/linux/platform_data/pm33xx.h | 2 ++
> >>>   3 files changed, 16 insertions(+)
> >>>
> >>> diff --git a/arch/arm/mach-omap2/pm33xx-core.c b/arch/arm/mach-omap2/pm33xx-core.c
> >>> index f4971e4..f0f6e8e 100644
> >>> --- a/arch/arm/mach-omap2/pm33xx-core.c
> >>> +++ b/arch/arm/mach-omap2/pm33xx-core.c
> >>> @@ -135,6 +135,11 @@ static int am43xx_suspend(unsigned int state, int (*fn)(unsigned long),
> >>>   {
> >>>   int ret = 0;
> >>>   
> >>> + if (!(args & WFI_FLAG_DEEP_SLEEP0)) {
> >>> + pr_err("DS0 mode not supported\n");
> >>> + return -ENOTSUPP;
> >>> + }
> >>> +
> >>>   amx3_pre_suspend_common();
> >>>   scu_power_mode(scu_base, SCU_PM_POWEROFF);
> >>>   ret = cpu_suspend(args, fn);
> >>> diff --git a/drivers/soc/ti/pm33xx.c b/drivers/soc/ti/pm33xx.c
> >>> index d0dab32..53238d7 100644
> >>> --- a/drivers/soc/ti/pm33xx.c
> >>> +++ b/drivers/soc/ti/pm33xx.c
> >>> @@ -324,6 +324,15 @@ static int am33xx_pm_probe(struct platform_device *pdev)
> >>>   suspend_wfi_flags |= WFI_FLAG_SAVE_EMIF;
> >>>   suspend_wfi_flags |= WFI_FLAG_WAKE_M3;
> >>>   
> >>> + /*
> >>> +  * Deep Sleep0 mode is currently functional only on am437x-gp-evm,
> >>> +  * am33xx-evm and boneblack family. Hence set the DS0 flag
> >>> +  */
> >>> + if (of_machine_is_compatible("ti,am437x-gp-evm") ||
> >>> + of_machine_is_compatible("ti,am335x-bone-black") ||
> >>> + of_machine_is_compatible("ti,am335x-evm"))
> >>> + suspend_wfi_flags |= WFI_FLAG_DEEP_SLEEP0;
> >>
> >> What about other (out-of-tree) machines which support DS0 and which
> >> this change would break?
> >>
> >> I think this needs to be a blacklist if anything.
> >>
> >> Please also expand in the commit message why you think this is needed.
> 
> Currently when one does echo mem > /sys/power/state on unsupported 
> machines there can be a crash or a hang. So bail out with a message.

Yes, but why is this unsupported on some machines? Which machines, and
why? Your commit messages should be self-contained and hold the
information needed to determine whether your patch makes sense in the
first place.

> >> Last, what tree is this against? There's no am43xx_suspend() in
> >> linux-next (and you add compatibles above for am33xx too).
> > 
> > Sorry, there is indeed an am43xx_suspend(), but you are adding
> > compatibles for am33xx which use am33xx_suspend().
> 
> am33xx_pm_probe is a common probe function for both am33 and am43.

Yes, but you add a check for your new flag only to am43xx_suspend(), not
to am33xx_suspend() which is used by the am33xx compatibles you add.

> AFAIK for the am33 family, am335x-evm and am335x-bone-black support Deep 
> Sleep mode. For the am43 family, am437x-gp-evm alone supports it at the moment.

But these are development boards (EVKs), not SoC families (or
chip revisions). What about all the products built by TI customers who
have bought these SoCs?

> Can you let me know of other am33 machines that support DS0 mode?

I have a customer who uses DS0, whose DTS is not yet in mainline, and
whose setup this patch would break, for example.

> I could have simply used ti,am33xx compatible which covers entire am33 
> family but then am33xx-bone (bone white) does not support this mode.

Yes, and a blacklist would make much more sense for something like this
if we're talking about specific boards.

Also note that your patch doesn't even handle bone-white as you didn't
add a check to am33xx_suspend() as I pointed out above.

Johan
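
The allow-list versus deny-list argument in this reply can be illustrated with a small userspace sketch. It is not the pm33xx driver: all `sketch_` names and the "example," compatible strings are hypothetical, and matching a single string is a simplification of what a real machine-compatible check does against the root node's compatible list. The point it shows is how each policy treats a board that is in neither list, such as an out-of-tree product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Boards named in the patch as known to work. */
static const char *const sketch_allow[] = {
	"ti,am437x-gp-evm",
	"ti,am335x-bone-black",
	"ti,am335x-evm",
};

/* Hypothetical board known not to work; the string is made up for the sketch. */
static const char *const sketch_deny[] = {
	"example,board-without-ds0",
};

static bool sketch_in_list(const char *compat, const char *const *list, size_t n)
{
	size_t i;

	assert(compat != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(compat, list[i]) == 0)
			return true;
	}
	return false;
}

/* Allow-list policy: unknown boards lose the feature. */
static bool sketch_ds0_allowlist(const char *compat)
{
	return sketch_in_list(compat, sketch_allow, SKETCH_ARRAY_SIZE(sketch_allow));
}

/* Deny-list policy: unknown boards keep the feature. */
static bool sketch_ds0_denylist(const char *compat)
{
	return !sketch_in_list(compat, sketch_deny, SKETCH_ARRAY_SIZE(sketch_deny));
}
```

Under the allow-list, a board such as the hypothetical "example,out-of-tree-board" is refused even if its hardware supports the mode, which is the regression raised above; under the deny-list it keeps working.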

Re: [PATCH] KVM: s390: vsie: Consolidate CRYCB validation

2018-08-22 Thread David Hildenbrand

On 22.08.2018 10:41, Pierre Morel wrote:
> On 22/08/2018 10:25, David Hildenbrand wrote:
>> On 22.08.2018 10:08, Pierre Morel wrote:
>>> Currently when shadowing the CRYCB on SIE entrance, the validation
>>> tests the following:
>>> - accept only FORMAT1 or FORMAT2
>>> - test if MSAext facility (76) is installed
>>> - accept the CRYCB if no keys are used
>>> - verifies that the CRYCB format1 is inside a page
>>> - verifies that the CRYCB origin is not 0
>>>
>>> This is not following the architecture.
>> I have to trust you on that :)
>>
>>> On SIE entrance, the CRYCB must be validated before accepting
>>> any of its entries.
>>>
>>> Let's do the validation in the right order and also verify
>>> correctly the FORMAT2 CRYCB.
>> With which facility was FORMAT2 introduced?
> With APXA.
> KVM initialization sets up the CRYCB format according to the presence
> of APXA: FORMAT2 if APXA is present, FORMAT1 otherwise.

As our guest does not see APXA, why should it be allowed to make use of
FORMAT2 here already?

In my opinion, the size check you are adding is not correct in its
current state.


-- 

Thanks,

David / dhildenb

[PATCH v4 3/4] staging: mt7621-mmc: Remove macro INIT_MSG and its usages

2018-08-22 Thread Nishad Kamdar

Removed all usages of INIT_MSG and dropped it from dbg.h.

Signed-off-by: Nishad Kamdar 
---
 drivers/staging/mt7621-mmc/dbg.h |  7 ---
 drivers/staging/mt7621-mmc/sd.c  | 16 
 2 files changed, 23 deletions(-)

diff --git a/drivers/staging/mt7621-mmc/dbg.h b/drivers/staging/mt7621-mmc/dbg.h
index 71295df59ed0..8d2c16450ef5 100644
--- a/drivers/staging/mt7621-mmc/dbg.h
+++ b/drivers/staging/mt7621-mmc/dbg.h
@@ -111,15 +111,8 @@ do { \
 
 #if 1
 //defined CONFIG_MTK_MMC_CD_POLL
-#define INIT_MSG(fmt, args...)
 #define IRQ_MSG(fmt, args...)
 #else
-#define INIT_MSG(fmt, args...) \
-do { \
-   printk(KERN_ERR TAG"%d -> "fmt" <- %s() : L<%d> PID<%s><0x%x>\n", \
-  host->id,  ##args, __FUNCTION__, __LINE__, current->comm, current->pid); \
-} while (0);
-
 /* PID in ISR in not corrent */
 #define IRQ_MSG(fmt, args...) \
 do { \
diff --git a/drivers/staging/mt7621-mmc/sd.c b/drivers/staging/mt7621-mmc/sd.c
index 6b2c72fc61f2..327c1cd7fd04 100644
--- a/drivers/staging/mt7621-mmc/sd.c
+++ b/drivers/staging/mt7621-mmc/sd.c
@@ -187,12 +187,10 @@ static u32 hclks[] = {5000}; /* +/- by chhung */
 //
 #define msdc_vcore_on(host) \
do {\
-   INIT_MSG("[+]VMC ref. count<%d>", ++host->pwr_ref); \
(void)hwPowerOn(MT65XX_POWER_LDO_VMC, VOL_3300, "SD");  \
} while (0)
 #define msdc_vcore_off(host) \
do {\
-   INIT_MSG("[-]VMC ref. count<%d>", --host->pwr_ref); \
(void)hwPowerDown(MT65XX_POWER_LDO_VMC, "SD");  \
} while (0)
 
@@ -439,7 +437,6 @@ static void msdc_select_clksrc(struct msdc_host *host, unsigned char clksrc)
u32 val;
 
BUG_ON(clksrc > 3);
-   INIT_MSG("set clock source to <%d>", clksrc);
 
val = readl(host->base + MSDC_CLKSRC_REG);
if (readl(host->base + MSDC_ECO_VER) >= 4) {
@@ -510,10 +507,6 @@ static void msdc_set_mclk(struct msdc_host *host, int ddr, unsigned int hz)
host->mclk = hz;
msdc_set_timeout(host, host->timeout_ns, host->timeout_clks); // need?
 
-   INIT_MSG("");
-   INIT_MSG("!!! Set<%dKHz> Source<%dKHz> -> sclk<%dKHz>", hz / 1000, hclk / 1000, sclk / 1000);
-   INIT_MSG("");
-
msdc_irq_restore(flags);
 }
 
@@ -671,12 +664,6 @@ static void msdc_pm(pm_message_t state, void *data)
struct msdc_host *host = (struct msdc_host *)data;
int evt = state.event;
 
-   if (evt == PM_EVENT_USER_RESUME || evt == PM_EVENT_USER_SUSPEND) {
-   INIT_MSG("USR_%s: suspend<%d> power<%d>",
-   evt == PM_EVENT_USER_RESUME ? "EVENT_USER_RESUME" : "EVENT_USER_SUSPEND",
-   host->suspend, host->power_mode);
-   }
-
if (evt == PM_EVENT_SUSPEND || evt == PM_EVENT_USER_SUSPEND) {
if (host->suspend) /* already suspend */  /* default 0*/
return;
@@ -1762,7 +1749,6 @@ static void msdc_ops_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
if (host->mclk != ios->clock) {
if (ios->clock > 2500) {
//if (!(host->hw->flags & MSDC_REMOVABLE)) {
-   INIT_MSG("SD data latch edge<%d>", MSDC_SMPL_FALLING);
sdr_set_field(host->base + MSDC_IOCON, MSDC_IOCON_RSPL,
  MSDC_SMPL_FALLING);
sdr_set_field(host->base + MSDC_IOCON, MSDC_IOCON_DSPL,
@@ -1815,7 +1801,6 @@ static int msdc_ops_get_cd(struct mmc_host *mmc)
return 1;
 #else
host->card_inserted = (host->pm_state.event == PM_EVENT_USER_RESUME) ? 1 : 0;
-   INIT_MSG("sdio ops_get_cd<%d>", host->card_inserted);
return host->card_inserted;
 #endif
}
@@ -1839,7 +1824,6 @@ static int msdc_ops_get_cd(struct mmc_host *mmc)
present = 0; /* TODO? Check DAT3 pins for card detection */
}
 
-   INIT_MSG("ops_get_cd return<%d>", present);
return present;
 }
 
-- 
2.17.1

Re: [PATCH 1/1] perf/x86/intel: make error messages less confusing

2018-08-22 Thread Peter Zijlstra

On Tue, Aug 21, 2018 at 04:05:22PM -0700, Eduardo Valentin wrote:
> On Tue, Aug 21, 2018 at 03:09:37PM -0700, Andi Kleen wrote:
> > On Tue, Aug 21, 2018 at 02:15:28PM -0700, Eduardo Valentin wrote:

> > > [ 0.100114] Performance Events: unsupported p6 CPU model 85 no PMU 
> > > driver, software events only.

> > Maybe it is confusing (why exactly?), but it doesn't seem to me that your
> > new message is any better.
> 
> Yeah, the part that says "unsupported CPU" is the confusing part,
> I get people thinking that the specific reported CPU model is not
> supported by the kernel :-)

It is prefixed by: "Performance Events:", what is the problem?

[PATCH v4 4/4] staging: mt7621-mmc: Fix debug macro IRQ_MSG and its usages

2018-08-22 Thread Nishad Kamdar

Replace all usages of IRQ_MSG with dev_ without __func__
or __LINE__ or current->comm and current->pid. Remove the do {}
while(0) loop for the single statement macro. Drop IRQ_MSG from dbg.h.
Issue found by checkpatch.

Signed-off-by: Nishad Kamdar 
---
 drivers/staging/mt7621-mmc/dbg.h | 12 ---
 drivers/staging/mt7621-mmc/sd.c  | 36 
 2 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/drivers/staging/mt7621-mmc/dbg.h b/drivers/staging/mt7621-mmc/dbg.h
index 8d2c16450ef5..5458dae5dc03 100644
--- a/drivers/staging/mt7621-mmc/dbg.h
+++ b/drivers/staging/mt7621-mmc/dbg.h
@@ -109,18 +109,6 @@ do { \
  *}
  */
 
-#if 1
-//defined CONFIG_MTK_MMC_CD_POLL
-#define IRQ_MSG(fmt, args...)
-#else
-/* PID in ISR in not corrent */
-#define IRQ_MSG(fmt, args...) \
-do { \
-   printk(KERN_ERR TAG"%d -> "fmt" <- %s() : L<%d>\n", \
-  host->id,  ##args, __FUNCTION__, __LINE__);  \
-} while (0);
-#endif
-
 void msdc_debug_proc_init(void);
 
 #if 0 /* --- chhung */
diff --git a/drivers/staging/mt7621-mmc/sd.c b/drivers/staging/mt7621-mmc/sd.c
index 327c1cd7fd04..5cb6bc36e78b 100644
--- a/drivers/staging/mt7621-mmc/sd.c
+++ b/drivers/staging/mt7621-mmc/sd.c
@@ -420,7 +420,9 @@ static void msdc_tasklet_card(struct work_struct *work)
mmc_detect_change(host->mmc, msecs_to_jiffies(20));
}
 
-   IRQ_MSG("card found<%s>", inserted ? "inserted" : "removed");
+   dev_err(mmc_dev(host->mmc),
+   "%d -> card found<%s>\n",
+   host->id, inserted ? "inserted" : "removed");
 #endif
 
spin_unlock(&host->lock);
@@ -1858,14 +1860,17 @@ static irqreturn_t msdc_irq(int irq, void *dev_id)
if (intsts & MSDC_INT_CDSC) {
if (host->mmc->caps & MMC_CAP_NEEDS_POLL)
return IRQ_HANDLED;
-   IRQ_MSG("MSDC_INT_CDSC irq<0x%.8x>", intsts);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> MSDC_INT_CDSC irq<0x%.8x>\n", host->id, intsts);
schedule_delayed_work(&host->card_delaywork, HZ);
/* tuning when plug card ? */
}
 
/* sdio interrupt */
if (intsts & MSDC_INT_SDIOIRQ) {
-   IRQ_MSG("XXX MSDC_INT_SDIOIRQ");  /* seems not sdio irq */
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX MSDC_INT_SDIOIRQ\n",
+   host->id); /* seems not sdio irq */
//mmc_signal_sdio_irq(host->mmc);
}
 
@@ -1883,10 +1888,15 @@ static irqreturn_t msdc_irq(int irq, void *dev_id)
msdc_clr_int();
 
if (intsts & MSDC_INT_DATTMO) {
-   IRQ_MSG("XXX CMD<%d> MSDC_INT_DATTMO", host->mrq->cmd->opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> MSDC_INT_DATTMO\n",
+   host->id, host->mrq->cmd->opcode);
data->error = -ETIMEDOUT;
} else if (intsts & MSDC_INT_DATCRCERR) {
-   IRQ_MSG("XXX CMD<%d> MSDC_INT_DATCRCERR, SDC_DCRC_STS<0x%x>", host->mrq->cmd->opcode, readl(host->base + SDC_DCRC_STS));
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> MSDC_INT_DATCRCERR, SDC_DCRC_STS<0x%x>\n",
+   host->id, host->mrq->cmd->opcode,
+   readl(host->base + SDC_DCRC_STS));
data->error = -EIO;
}
 
@@ -1919,15 +1929,23 @@ static irqreturn_t msdc_irq(int irq, void *dev_id)
}
} else if ((intsts & MSDC_INT_RSPCRCERR) || (intsts & MSDC_INT_ACMDCRCERR)) {
if (intsts & MSDC_INT_ACMDCRCERR)
-   IRQ_MSG("XXX CMD<%d> MSDC_INT_ACMDCRCERR", cmd->opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> MSDC_INT_ACMDCRCERR\n",
+   host->id, cmd->opcode);
else
-   IRQ_MSG("XXX CMD<%d> MSDC_INT_RSPCRCERR", cmd->opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> MSDC_INT_RSPCRCERR\n",
+   host->id, cmd->opcode);
cmd->error = -EIO;
} else if ((intsts & MSDC_INT_CMDTMO) || (intsts & MSDC_INT_ACMDTMO)) {
if (intsts & MSDC_INT_ACMDTMO)
-   IRQ_MSG("XXX CMD<%d> MSDC_INT_ACMDTMO", cmd->opcode);
+   dev_err(mmc_dev(host->mmc),
+   "%d -> XXX CMD<%d> MSDC_INT_ACMDTMO\n",
+

[PATCH] ovl: set I_CREATING on inode being created

2018-08-22 Thread Miklos Szeredi

...otherwise there will be list corruption due to inode_sb_list_add() being
called for inode already on the sb list.

Signed-off-by: Miklos Szeredi 
Fixes: e950564b97fd ("vfs: don't evict uninitialized inode")
---
This missed the 4.19 overlay pull request, because it fixes a bug
introduced by a patch not in said pull (the buggy patch is also mine,
incidentally).

 fs/overlayfs/dir.c |4 
 1 file changed, 4 insertions(+)

--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -603,6 +603,10 @@ static int ovl_create_object(struct dent
if (!inode)
goto out_drop_write;
 
+   spin_lock(&inode->i_lock);
+   inode->i_state |= I_CREATING;
+   spin_unlock(&inode->i_lock);
+
inode_init_owner(inode, dentry->d_parent->d_inode, mode);
attr.mode = inode->i_mode;

[PATCH] clk: ti: fix OF child-node lookup

2018-08-22 Thread Johan Hovold

Fix child-node lookup which by using the wrong OF helper was searching
the whole tree depth-first, something which could end up matching an
unrelated node.

Also fix the related node-reference leaks.

Fixes: 5b385a45e001 ("clk: ti: add support for clkctrl aliases")
Signed-off-by: Johan Hovold 
---
 drivers/clk/ti/clk.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/ti/clk.c b/drivers/clk/ti/clk.c
index 7d22e1af2247..27e0979b3158 100644
--- a/drivers/clk/ti/clk.c
+++ b/drivers/clk/ti/clk.c
@@ -129,7 +129,7 @@ int ti_clk_setup_ll_ops(struct ti_clk_ll_ops *ops)
 void __init ti_dt_clocks_register(struct ti_dt_clk oclks[])
 {
struct ti_dt_clk *c;
-   struct device_node *node;
+   struct device_node *node, *parent;
struct clk *clk;
struct of_phandle_args clkspec;
char buf[64];
@@ -164,8 +164,12 @@ void __init ti_dt_clocks_register(struct ti_dt_clk oclks[])
continue;
 
node = of_find_node_by_name(NULL, buf);
-   if (num_args)
-   node = of_find_node_by_name(node, "clk");
+   if (num_args) {
+   parent = node;
+   node = of_get_child_by_name(parent, "clk");
+   of_node_put(parent);
+   }
+
clkspec.np = node;
clkspec.args_count = num_args;
for (i = 0; i < num_args; i++) {
@@ -173,11 +177,12 @@ void __init ti_dt_clocks_register(struct ti_dt_clk oclks[])
if (ret) {
pr_warn("Bad tag in %s at %d: %s\n",
c->node_name, i, tags[i]);
+   of_node_put(node);
return;
}
}
clk = of_clk_get_from_provider(&clkspec);
-
+   of_node_put(node);
if (!IS_ERR(clk)) {
c->lk.clk = clk;
clkdev_add(&c->lk);
-- 
2.18.0
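
The difference between a tree-wide search and a direct-child lookup, which is the bug this patch describes, can be shown on a toy tree. This is a minimal userspace sketch with hypothetical `sketch_` names and node names; it stores the tree as a pre-order array with parent indices and does not model the node reference counting that the patch also fixes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_node {
	const char *name;
	int parent;		/* index of the parent node, -1 for the root */
};

/* Toy tree in pre-order: only clkctrl-b has a child named "clk". */
static const struct sketch_node sketch_tree[] = {
	{"root",      -1},	/* 0 */
	{"clkctrl-a",  0},	/* 1 */
	{"clkctrl-b",  0},	/* 2 */
	{"clk",        2},	/* 3 */
};

#define SKETCH_NODES ((int)(sizeof(sketch_tree) / sizeof(sketch_tree[0])))

/* Tree-wide search: first node with this name that comes after 'from'. */
static int sketch_find_after(int from, const char *name)
{
	int i;

	assert(name != NULL);
	for (i = from + 1; i < SKETCH_NODES; i++) {
		if (strcmp(sketch_tree[i].name, name) == 0)
			return i;
	}
	return -1;
}

/* Child lookup: only nodes whose parent is 'parent' are considered. */
static int sketch_get_child(int parent, const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < SKETCH_NODES; i++) {
		if (sketch_tree[i].parent == parent &&
		    strcmp(sketch_tree[i].name, name) == 0)
			return i;
	}
	return -1;
}
```

Starting from clkctrl-a (index 1), the tree-wide search returns the "clk" node that belongs to clkctrl-b, while the child lookup correctly reports that clkctrl-a has no such child.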

Re: [PATCH] staging: rtl8188eu: Fix spelling mistake

2018-08-22 Thread Bhaskar Singh

On Wed, Aug 22, 2018 at 11:16:36AM +0300, Dan Carpenter wrote:
> On Tue, Aug 21, 2018 at 07:14:28AM +0530, Bhaskar Singh wrote:
> > This patch fixes spelling mistakes in TODO.
> > 
> 
> Btw, it helps when you say which word you're changing, otherwise it
> takes a while to spot the difference.  We changed "HGz" to "GHz".
> 
> Probably someone smarter than I am would have spotted it faster...
> 
> regards,
> dan carpenter
>

Apologies for the inconvenience and thanks a lot for your suggestion.

I will definitely follow that.

Should I send the patch again?

Thanks
Bhaskar Singh

Re: general protection fault in finish_task_switch (2)

2018-08-22 Thread Peter Zijlstra

On Tue, Aug 21, 2018 at 02:28:02PM -0700, syzbot wrote:
> syzbot has found a reproducer for the following crash on:
> 
> HEAD commit:778a33959a8a Merge tag 'please-pull-noboot' of git://git.k..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14a5385a40
> kernel config:  https://syzkaller.appspot.com/x/.config?x=214e4990bd49329f
> dashboard link: https://syzkaller.appspot.com/bug?extid=1f56df64bfb3c29dde6f
> compiler:   gcc (GCC) 8.0.1 20180413 (experimental)
> userspace arch: i386
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=13ffa56140

FWIW the lack of whitespace between "repro:" and the URL makes it hard
to copy paste.

> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1002396140


> RIP: 0010:__fire_sched_in_preempt_notifiers kernel/sched/core.c:2481

That repro thing does something dodgy with KVM, which then corrupts the
preemption notifier thing. I'm sufficiently KVM clueless to not really
know where to start looking though..

Re: [PATCH v4 1/4] staging: mt7621-mmc: Fix debug macro N_MSG

2018-08-22 Thread Dan Carpenter

On Wed, Aug 22, 2018 at 02:04:55PM +0530, Nishad Kamdar wrote:
> This patch fixes the debug macro N_MSG. Replaces printk with
> dev_ without __func__ or __LINE__ or current->comm and
> current->pid. Removes the do {} while(0) loop for the single
> statement macro. Issue found by checkpatch.
> 
> Signed-off-by: Nishad Kamdar 
> ---
>  drivers/staging/mt7621-mmc/dbg.h | 11 ---
>  1 file changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/staging/mt7621-mmc/dbg.h b/drivers/staging/mt7621-mmc/dbg.h
> index 2f2c56b73987..c56fb896617a 100644
> --- a/drivers/staging/mt7621-mmc/dbg.h
> +++ b/drivers/staging/mt7621-mmc/dbg.h
> @@ -104,13 +104,10 @@ do { \
>  
>  #define N_MSG(evt, fmt, args...)
>  /*
> -do {\
> -if ((DBG_EVT_##evt) & sd_debug_zone[host->id]) { \
> -printk(KERN_ERR TAG"%d -> "fmt" <- %s() : L<%d> PID<%s><0x%x>\n", \
> -host->id,  ##args , __FUNCTION__, __LINE__, current->comm, current->pid);\
> -} \
> -} while(0)
> -*/
> + *if ((DBG_EVT_##evt) & sd_debug_zone[host->id]) { \
> + *dev_err(mmc_dev(host->mmc), "%d -> " fmt "\n", host->id, ##args) \
> + *}
> + */

I don't understand what you're trying to do here.  You just commented
out the macro and turned it into a no-op.  That's not what the patch
description says.

To me the original code seems fine.
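
For illustration only, a live (not commented-out) macro along the lines
of the patch description could be sketched as below. This is a userspace
stand-in under stated assumptions: fake_dev_err() replaces dev_err(),
and struct host, debug_zone and the DBG_EVT_* values are hypothetical.
It also shows why the do { } while (0) wrapper still matters once the
macro body contains an if.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's types and debug-zone table. */
struct host { int id; };
#define DBG_EVT_CMD 0x1u
static unsigned int debug_zone[2] = { DBG_EVT_CMD, 0 };

static int messages_logged;

/* Userspace replacement for dev_err(); counts calls so it can be checked. */
static void fake_dev_err(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	messages_logged++;
}

/*
 * The body contains an if, so it is wrapped in do { } while (0): the macro
 * then behaves as a single statement and cannot capture a following "else".
 * Like the original, it expects a variable named "host" in scope.
 */
#define N_MSG(evt, fmt, ...)						\
do {									\
	if ((DBG_EVT_##evt) & debug_zone[host->id])			\
		fake_dev_err("%d -> " fmt "\n", host->id, ##__VA_ARGS__); \
} while (0)

/* Returns 1 if the else branch ran; shows the macro is else-safe. */
static int log_or_fallback(struct host *host, int verbose)
{
	if (verbose)
		N_MSG(CMD, "cmd %d", 42);
	else
		return 1;
	return 0;
}
```

With host 0 the CMD zone is enabled and one message is printed; with
host 1 nothing is printed because its zone mask is 0.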

regards,
dan carpenter

Re: [PATCH 8/9] psi: pressure stall information for CPU, memory, and IO

2018-08-22 Thread Peter Zijlstra

On Tue, Aug 21, 2018 at 04:11:15PM -0400, Johannes Weiner wrote:
> On Fri, Aug 03, 2018 at 07:21:39PM +0200, Peter Zijlstra wrote:
> > On Wed, Aug 01, 2018 at 11:19:57AM -0400, Johannes Weiner wrote:
> > > + time = READ_ONCE(groupc->times[s]);
> > > + /*
> > > +  * In addition to already concluded states, we
> > > +  * also incorporate currently active states on
> > > +  * the CPU, since states may last for many
> > > +  * sampling periods.
> > > +  *
> > > +  * This way we keep our delta sampling buckets
> > > +  * small (u32) and our reported pressure close
> > > +  * to what's actually happening.
> > > +  */
> > > + if (test_state(groupc->tasks, cpu, s)) {
> > > + /*
> > > +  * We can race with a state change and
> > > +  * need to make sure the state_start
> > > +  * update is ordered against the
> > > +  * updates to the live state and the
> > > +  * time buckets (groupc->times).
> > > +  *
> > > +  * 1. If we observe task state that
> > > +  * needs to be recorded, make sure we
> > > +  * see state_start from when that
> > > +  * state went into effect or we'll
> > > +  * count time from the previous state.
> > > +  *
> > > +  * 2. If the time delta has already
> > > +  * been added to the bucket, make sure
> > > +  * we don't see it in state_start or
> > > +  * we'll count it twice.
> > > +  *
> > > +  * If the time delta is out of
> > > +  * state_start but not in the time
> > > +  * bucket yet, we'll miss it entirely
> > > +  * and handle it in the next period.
> > > +  */
> > > + smp_rmb();
> > > + time += cpu_clock(cpu) - groupc->state_start;
> > > + }
> > 
> > As is, groupc->state_start needs a READ_ONCE() above and a WRITE_ONCE()
> > below. But like stated earlier, doing an update in scheduler_tick() is
> > probably easier.
> 
> I've wrapped these in READ_ONCE/WRITE_ONCE.

I just realized these are u64, so READ_ONCE/WRITE_ONCE will not work
correctly on 32-bit: the access can be split into two 32-bit halves.
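
To illustrate the concern: when a 64-bit value is accessed as two
32-bit halves, a concurrent writer can be observed half-updated. One
common way around that is a sequence counter that makes the reader
retry. The sketch below is a userspace approximation using C11 atomics;
struct sample, sample_write() and sample_read() are hypothetical names,
it assumes a single writer, and it is not the psi code.

```c
#include <stdatomic.h>
#include <stdint.h>

struct sample {
	atomic_uint seq;		/* odd while a write is in progress */
	_Atomic uint32_t lo, hi;	/* the two halves of the 64-bit value */
};

/* Single writer: make seq odd, store both halves, make seq even again. */
static void sample_write(struct sample *s, uint64_t v)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader: retry until both halves were read under the same even seq. */
static uint64_t sample_read(struct sample *s)
{
	unsigned int begin, end;
	uint32_t lo, hi;

	do {
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while ((begin & 1) || begin != end);

	return ((uint64_t)hi << 32) | lo;
}
```

The reader never blocks the writer; it only pays for a retry when it
actually raced with an update.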

Re: [PATCH v4 2/4] staging: mt7621-mmc: Fix debug macro ERR_MSG and its usages

2018-08-22 Thread Dan Carpenter

On Wed, Aug 22, 2018 at 02:13:07PM +0530, Nishad Kamdar wrote:
> diff --git a/drivers/staging/mt7621-mmc/sd.c b/drivers/staging/mt7621-mmc/sd.c
> index 04d23cc7cd4a..6b2c72fc61f2 100644
> --- a/drivers/staging/mt7621-mmc/sd.c
> +++ b/drivers/staging/mt7621-mmc/sd.c
> @@ -466,7 +466,8 @@ static void msdc_set_mclk(struct msdc_host *host, int 
> ddr, unsigned int hz)
>   //u8  clksrc = hw->clk_src;
>  
>   if (!hz) { // set mmc system clock to 0 ?
> - //ERR_MSG("set mclk to 0!!!");
> + //dev_err(mmc_dev(host->mmc), "%d -> set mclk to 0!!!\n",
> + //host->id);

Just delete commented out code.

>   msdc_reset_hw(host);
>   return;
>   }

regards,
dan carpenter

Re: [PATCH 8/9] psi: pressure stall information for CPU, memory, and IO

2018-08-22 Thread Peter Zijlstra

On Tue, Aug 21, 2018 at 03:44:13PM -0400, Johannes Weiner wrote:

> > > + for (s = PSI_NONIDLE; s >= 0; s--) {
> > > + u32 time, delta;
> > > +
> > > + time = READ_ONCE(groupc->times[s]);
> > > + /*
> > > +  * In addition to already concluded states, we
> > > +  * also incorporate currently active states on
> > > +  * the CPU, since states may last for many
> > > +  * sampling periods.
> > > +  *
> > > +  * This way we keep our delta sampling buckets
> > > +  * small (u32) and our reported pressure close
> > > +  * to what's actually happening.
> > > +  */
> > > + if (test_state(groupc->tasks, cpu, s)) {
> > > + /*
> > > +  * We can race with a state change and
> > > +  * need to make sure the state_start
> > > +  * update is ordered against the
> > > +  * updates to the live state and the
> > > +  * time buckets (groupc->times).
> > > +  *
> > > +  * 1. If we observe task state that
> > > +  * needs to be recorded, make sure we
> > > +  * see state_start from when that
> > > +  * state went into effect or we'll
> > > +  * count time from the previous state.
> > > +  *
> > > +  * 2. If the time delta has already
> > > +  * been added to the bucket, make sure
> > > +  * we don't see it in state_start or
> > > +  * we'll count it twice.
> > > +  *
> > > +  * If the time delta is out of
> > > +  * state_start but not in the time
> > > +  * bucket yet, we'll miss it entirely
> > > +  * and handle it in the next period.
> > > +  */
> > > + smp_rmb();
> > > + time += cpu_clock(cpu) - groupc->state_start;
> > > + }
> > 
> > The alternative is adding an update to scheduler_tick(), that would
> > ensure you're never more than nr_cpu_ids * TICK_NSEC behind.
> 
> I wasn't able to convert *all* states to tick updates like this.
> 
> The reason is that, while testing rq->curr for PF_MEMSTALL is cheap,
> other tasks associated with the rq could be from any cgroup in the
> system. That means we'd have to do for_each_cgroup() on every tick to
> keep the groupc->times that closely uptodate, and that wouldn't scale.
> We tend to have hundreds of them, some setups have thousands.
> 
> Since we don't need to be *that* current, I left the on-demand update
> inside the aggregator for now. It's a bit trickier, but much cheaper.

ARGH indeed; I was thinking we only need to update current. But because
we're tracking blocked state that doesn't work.

Sorry for that :/

Re: general protection fault in finish_task_switch (2)

2018-08-22 Thread Paolo Bonzini

On 22/08/2018 11:08, Peter Zijlstra wrote:
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1002396140
> 
>> RIP: 0010:__fire_sched_in_preempt_notifiers kernel/sched/core.c:2481
> That repro thing does something dodgy with KVM, which then corrupts the
> preemption notifier thing. I'm sufficiently KVM clueless to not really
> know where to start looking though..

It seems to be a reference counting issue, or something like that.  I'm
looking at it...

Paolo

Re: [PATCH] staging: rtl8188eu: Fix spelling mistake

2018-08-22 Thread Dan Carpenter

On Wed, Aug 22, 2018 at 02:35:32PM +0530, Bhaskar Singh wrote:
> On Wed, Aug 22, 2018 at 11:16:36AM +0300, Dan Carpenter wrote:
> > On Tue, Aug 21, 2018 at 07:14:28AM +0530, Bhaskar Singh wrote:
> > > This patch fixes spelling mistakes in TODO.
> > > 
> > 
> > Btw, it helps when you say which word you're changing, otherwise it
> > takes a while to spot the difference.  We changed "HGz" to "GHz".
> > 
> > Probably someone smarter than I am would have spotted it faster...
> > 
> > regards,
> > dan carpenter
> >
> 
> Apologies for the inconvenience and thanks a lot for your suggestion.
> 
> I will definitely follow that.
> 
> Should I send the patch again?

No.  It's fine.

regards,
dan carpenter

Re: [PATCH 00/14] ata: ahci-platform: add reset control support except for existing drivers

2018-08-22 Thread Hans de Goede


Hi,

On 22-08-18 09:36, Kunihiko Hayashi wrote:

Add support to get and control a list of resets for the device, and
add a flag indicating whether to use the resets. Existing drivers
pass 0 for this flag.

This series solves the issue of the previous patch [1] that was already
reverted [2].
[1] https://www.spinics.net/lists/linux-ide/msg55299.html
[2] https://www.spinics.net/lists/linux-ide/msg55379.html

Kunihiko Hayashi (14):
   ata: ahci-platform: add reset control support and the flag to specify
 using reset
   ata: ahci_brcm: add second argument of ahci_platform_get_resources()
   ata: ahci_ceva: add second argument of ahci_platform_get_resources()
   ata: ahci_da850: add second argument of ahci_platform_get_resources()
   ata: ahci_dm816: add second argument of ahci_platform_get_resources()
   ata: ahci_imx: add second argument of ahci_platform_get_resources()
   ata: ahci_mtk: add second argument of ahci_platform_get_resources()
   ata: ahci_mvebu: add second argument of ahci_platform_get_resources()
   ata: ahci_qoriq: add second argument of ahci_platform_get_resources()
   ata: ahci_seattle: add second argument of
 ahci_platform_get_resources()
   ata: ahci_st: add second argument of ahci_platform_get_resources()
   ata: ahci_sunxi: add second argument of ahci_platform_get_resources()
   ata: ahci_tegra: add second argument of ahci_platform_get_resources()
   ata: ahci_xgene: add second argument of ahci_platform_get_resources()


When you change a function prototype, you must also change all
the callers in a single commit, so that all intermediate commits
will compile without errors, otherwise you will break git bisect.

Otherwise this looks good.

I suggest you split this like this:

1) Add a flags argument to ahci_platform_get_resources(),
   without adding support for any flags yet, so this just
   changes the function prototype and passes 0 for the new
   flags argument *everywhere* without any other changes
2) Add support for an AHCI_PLATFORM_GET_RESETS flag, basically
   your current first patch, minus the prototype changes
3) A patch which passes AHCI_PLATFORM_GET_RESETS for the
   generic ahci_platform driver (so break this out of your
   first patch). Also describe in the commit message of this
   patch why / for which platforms this is necessary.

The idea of doing 3. separately is that we can easily revert
it in case of problems while keeping the core functionality
in place. Note I do not expect this to be necessary.
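
A rough sketch of what steps 1 to 3 might look like from a caller's
point of view. Everything here is a userspace stand-in: the struct
layouts, the flag value and the resets_requested field are assumptions
for illustration, not the real libahci_platform code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct platform_device { const char *name; };
struct ahci_host_priv { int resets_requested; };

/* Assumed flag value; the real definition may differ. */
#define AHCI_PLATFORM_GET_RESETS 0x1u

/* Steps 1 and 2: the new flags argument, honoured only for known bits. */
static struct ahci_host_priv *
ahci_platform_get_resources(struct platform_device *pdev, unsigned int flags)
{
	struct ahci_host_priv *hpriv = calloc(1, sizeof(*hpriv));

	(void)pdev;
	if (hpriv && (flags & AHCI_PLATFORM_GET_RESETS))
		hpriv->resets_requested = 1;
	return hpriv;
}

/* Step 1: existing drivers keep their behaviour by passing 0. */
static struct ahci_host_priv *legacy_probe(struct platform_device *pdev)
{
	return ahci_platform_get_resources(pdev, 0);
}

/* Step 3: only the generic driver opts in to the resets. */
static struct ahci_host_priv *generic_probe(struct platform_device *pdev)
{
	return ahci_platform_get_resources(pdev, AHCI_PLATFORM_GET_RESETS);
}
```

Because the prototype change and every caller update land together in
step 1, each intermediate commit still compiles, and reverting step 3
alone only changes which flag the generic driver passes.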

Regards,

Hans




  .../devicetree/bindings/ata/ahci-platform.txt  |  1 +
  drivers/ata/ahci.h |  1 +
  drivers/ata/ahci_brcm.c|  2 +-
  drivers/ata/ahci_ceva.c|  2 +-
  drivers/ata/ahci_da850.c   |  2 +-
  drivers/ata/ahci_dm816.c   |  2 +-
  drivers/ata/ahci_imx.c |  2 +-
  drivers/ata/ahci_mtk.c |  2 +-
  drivers/ata/ahci_mvebu.c   |  2 +-
  drivers/ata/ahci_platform.c|  3 +-
  drivers/ata/ahci_qoriq.c   |  2 +-
  drivers/ata/ahci_seattle.c |  2 +-
  drivers/ata/ahci_st.c  |  2 +-
  drivers/ata/ahci_sunxi.c   |  2 +-
  drivers/ata/ahci_tegra.c   |  2 +-
  drivers/ata/ahci_xgene.c   |  2 +-
  drivers/ata/libahci_platform.c | 35 ++
  include/linux/ahci_platform.h  |  4 ++-
  18 files changed, 49 insertions(+), 21 deletions(-)

Howto prevent kernel from evicting code pages ever? (to avoid disk thrashing when about to run out of RAM)

2018-08-22 Thread Marcus Linsner

Hi. How can I make the kernel keep (lock?) all code pages in RAM so that
kswapd0 won't evict them when the system is low on memory?

The purpose of this is to prevent the kernel from causing lots of disk
reads (effectively freezing the whole system) when it is about to run
out of RAM, even when there is no swap enabled, well before (minutes of
real time) the OOM-killer triggers to kill the offending process (e.g. ld)!

I can replicate this consistently with 4G (and 12G) max RAM inside a
Qubes OS R4.0 AppVM running Fedora 28 while trying to compile Firefox.
The disk thrashing (continuous 192+MiB/sec reads) occurs well before
the OOM-killer triggers to kill 'ld' (or 'rustc') process and
everything is frozen for (real time) minutes. I've also encountered
this on bare metal myself, if it matters at all.

I tried to ask this question on SO here:
https://stackoverflow.com/q/51927528/10239615
but maybe I will have better luck on this mailing list where the kernel experts are.

Just think of all the frozen systems that you'll be saving (see the
related question in the above link, for one) if you figure out the
answer to this, whether it is a kernel patch, some .config options that
need changing, or whatever. Just consider it, whoever you are, reader :)
(probably a kernel god xD, because who else would know how to). I'm
actually selfish, I want this for myself, but I'm more than willing to
share it with all once I'm aware of it. (this = this howto: let the
OOM-killer kill the offending process asap, without first passing
through disk-thrashing hell that freezes the OS ;-) Er, I mean: Hi, and
how's life? It'll be even better after you've read this, I guarantee it
;-) just believe! synergize)

[RFC PATCH 5/5] mm/memory_hotplug: Simplify node_states_check_changes_offline

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

This patch tries to simplify node_states_check_changes_offline
and make the code more understandable by:

- Removing the if (N_MEMORY == N_NORMAL_MEMORY) wrong statement
- Removing the if (N_MEMORY == N_HIGH_MEMORY) wrong statement
- Re-structure the code a bit
- Removing confusing comments

Signed-off-by: Oscar Salvador 
---
 mm/memory_hotplug.c | 81 ++---
 1 file changed, 33 insertions(+), 48 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 006a7b817724..b45bc681e6db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1487,51 +1487,40 @@ static void node_states_check_changes_offline(unsigned 
long nr_pages,
enum zone_type zt, zone_last = ZONE_NORMAL;
 
/*
-* If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
-* contains nodes which have zones of 0...ZONE_NORMAL,
-* set zone_last to ZONE_NORMAL.
-*
-* If we don't have HIGHMEM nor movable node,
-* node_states[N_NORMAL_MEMORY] contains nodes which have zones of
-* 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
+* If the current zone is within (0..ZONE_NORMAL],
+* check if the amount of pages that are going to be
+* offlined is above or equal to the sum of the present
+* pages of these zones.
+* If that happens, we need to take this node out of
+* node_state[N_NORMAL_MEMORY]
 */
-   if (N_MEMORY == N_NORMAL_MEMORY)
-   zone_last = ZONE_MOVABLE;
+   if (zone_idx(zone) <= zone_last) {
+   for (zt = 0; zt <= zone_last; zt++)
+   present_pages += pgdat->node_zones[zt].present_pages;
 
-   /*
-* check whether node_states[N_NORMAL_MEMORY] will be changed.
-* If the memory to be offline is in a zone of 0...zone_last,
-* and it is the last present memory, 0...zone_last will
-* become empty after offline , thus we can determind we will
-* need to clear the node from node_states[N_NORMAL_MEMORY].
-*/
-   for (zt = 0; zt <= zone_last; zt++)
-   present_pages += pgdat->node_zones[zt].present_pages;
-   if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
-   arg->status_change_nid_normal = zone_to_nid(zone);
-   else
-   arg->status_change_nid_normal = -1;
+   if (nr_pages >= present_pages)
+   arg->status_change_nid_normal = zone_to_nid(zone);
+   else
+   arg->status_change_nid_normal = -1;
+   }
 
 #ifdef CONFIG_HIGHMEM
/*
-* If we have movable node, node_states[N_HIGH_MEMORY]
-* contains nodes which have zones of 0...ZONE_HIGHMEM,
-* set zone_last to ZONE_HIGHMEM.
-*
-* If we don't have movable node, node_states[N_NORMAL_MEMORY]
-* contains nodes which have zones of 0...ZONE_MOVABLE,
-* set zone_last to ZONE_MOVABLE.
+* If the current zone is within (0..ZONE_HIGHMEM], check if
+* the amount of pages that are going to be offlined is above
+* or equal to the sum of the present pages of these zones.
+* If that happens, we need to take this node out of
+* node_state[N_HIGH_MEMORY]
 */
-   zone_last = ZONE_HIGHMEM;
-   if (N_MEMORY == N_HIGH_MEMORY)
-   zone_last = ZONE_MOVABLE;
-
-   for (; zt <= zone_last; zt++)
+   if (zone_idx(zone) <= ZONE_HIGHMEM) {
+   zt = ZONE_HIGHMEM;
present_pages += pgdat->node_zones[zt].present_pages;
-   if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
-   arg->status_change_nid_high = zone_to_nid(zone);
-   else
-   arg->status_change_nid_high = -1;
+
+   if (nr_pages >= present_pages)
+   arg->status_change_nid_high = zone_to_nid(zone);
+   else
+   arg->status_change_nid_high = -1;
+   }
 #else
/*
 * When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
@@ -1541,18 +1530,14 @@ static void node_states_check_changes_offline(unsigned 
long nr_pages,
 #endif
 
/*
-* node_states[N_HIGH_MEMORY] contains nodes which have 0...ZONE_MOVABLE
+* Count pages from ZONE_MOVABLE as well.
+* If the amount of pages that are going to be offlined is above
+* or equal the sum of the present pages of all zones, we need
+* to remove this node from node_state[N_MEMORY]
 */
-   zone_last = ZONE_MOVABLE;
+   zt = ZONE_MOVABLE;
+   present_pages += pgdat->node_zones[zt].present_pages;
 
-   /*
-* check whether node_states[N_HIGH_MEMORY] will be changed
-* If we try to offline the last present @nr_pages from the node,
-* we can determind we will need to clear the node from
-* node_states[N_HIG

[RFC PATCH 2/5] mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when !CONFIG_HIGHMEM

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

Currently, when !CONFIG_HIGHMEM, status_change_nid_high is being set
to status_change_nid_normal, but on such systems, N_HIGH_MEMORY equals
N_NORMAL_MEMORY.
That means that if status_change_nid_normal is not -1,
we will perform two calls to node_set_state for the same memory type.

Set status_change_nid_high to -1 for !CONFIG_HIGHMEM, so we skip the
double call in node_states_set_node.

The same goes for node_clear_state.

Signed-off-by: Oscar Salvador 
---
 mm/memory_hotplug.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 4a89915e1467..1cfd0b5a9cc7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -724,7 +724,11 @@ static void node_states_check_changes_online(unsigned long 
nr_pages,
else
arg->status_change_nid_high = -1;
 #else
-   arg->status_change_nid_high = arg->status_change_nid_normal;
+   /*
+* When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
+* so setting the node for N_NORMAL_MEMORY is enough.
+*/
+   arg->status_change_nid_high = -1;
 #endif
 
/*
@@ -1547,7 +1551,11 @@ static void node_states_check_changes_offline(unsigned 
long nr_pages,
else
arg->status_change_nid_high = -1;
 #else
-   arg->status_change_nid_high = arg->status_change_nid_normal;
+   /*
+* When !CONFIG_HIGHMEM, N_HIGH_MEMORY equals N_NORMAL_MEMORY
+* so clearing the node for N_NORMAL_MEMORY is enough.
+*/
+   arg->status_change_nid_high = -1;
 #endif
 
/*
-- 
2.13.6

[RFC PATCH 3/5] mm/memory_hotplug: Simplify node_states_check_changes_online

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

While looking at node_states_check_changes_online, I saw some
confusing things, and I am not sure how they were supposed to work.

Right after entering the function, we find this:

if (N_MEMORY == N_NORMAL_MEMORY)
zone_last = ZONE_MOVABLE;

This, unless I am missing something really obvious, is wrong.
N_MEMORY cannot really be equal to N_NORMAL_MEMORY.
My guess is that this wanted to be something like:

if (N_NORMAL_MEMORY == N_HIGH_MEMORY)

to check if we have CONFIG_HIGHMEM.

Later on, in the CONFIG_HIGHMEM block, we have:

if (N_MEMORY == N_HIGH_MEMORY)
zone_last = ZONE_MOVABLE;

This is also wrong, and will never evaluate to true.

Besides this, the function can be simplified a bit.

- If the zone is within (0..ZONE_NORMAL], we need to set the node
  for node_state[N_NORMAL_MEMORY]
- If we have CONFIG_HIGHMEM, and the zone is within (0..ZONE_NORMAL],
  we need to set the node for node_state[N_HIGH_MEMORY], as
  N_HIGH_MEMORY stands for regular or high memory.
- Finally, we set the node for node_states[N_MEMORY].
  ZONE_MOVABLE ends up there.
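
The three rules above can be sketched roughly as follows. This is a
simplified userspace model of the CONFIG_HIGHMEM case, not the kernel
function: the reduced zone_type enum, struct node_flags and
check_changes_online() are hypothetical stand-ins.

```c
#include <stdbool.h>

/* Reduced zone list; the real enum has more entries. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE };

/* Hypothetical per-node view of which node_states bits are already set. */
struct node_flags { bool normal, high, memory; };

struct memory_notify {
	int status_change_nid_normal;
	int status_change_nid_high;
	int status_change_nid;
};

/* Each field becomes nid if that state must be set after onlining, else -1. */
static void check_changes_online(int nid, enum zone_type zt,
				 const struct node_flags *nf,
				 struct memory_notify *arg)
{
	bool low_zone = zt <= ZONE_NORMAL;

	/* Rule 1: zone within (0..ZONE_NORMAL] feeds N_NORMAL_MEMORY. */
	arg->status_change_nid_normal = (low_zone && !nf->normal) ? nid : -1;

	/* Rule 2: N_HIGH_MEMORY means regular or high memory. */
	arg->status_change_nid_high = (low_zone && !nf->high) ? nid : -1;

	/* Rule 3: any zone, ZONE_MOVABLE included, feeds N_MEMORY. */
	arg->status_change_nid = !nf->memory ? nid : -1;
}
```

Onlining a ZONE_MOVABLE range on an empty node therefore only reports a
change for N_MEMORY, while a ZONE_NORMAL range reports all three.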

Signed-off-by: Oscar Salvador 
---
 mm/memory_hotplug.c | 44 +---
 1 file changed, 13 insertions(+), 31 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1cfd0b5a9cc7..0f2cf6941224 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -680,46 +680,28 @@ static void node_states_check_changes_online(unsigned 
long nr_pages,
struct zone *zone, struct memory_notify *arg)
 {
int nid = zone_to_nid(zone);
-   enum zone_type zone_last = ZONE_NORMAL;
 
/*
-* If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
-* contains nodes which have zones of 0...ZONE_NORMAL,
-* set zone_last to ZONE_NORMAL.
-*
-* If we don't have HIGHMEM nor movable node,
-* node_states[N_NORMAL_MEMORY] contains nodes which have zones of
-* 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
-*/
-   if (N_MEMORY == N_NORMAL_MEMORY)
-   zone_last = ZONE_MOVABLE;
-
-   /*
-* if the memory to be online is in a zone of 0...zone_last, and
-* the zones of 0...zone_last don't have memory before online, we will
-* need to set the node to node_states[N_NORMAL_MEMORY] after
-* the memory is online.
+* node_states[N_NORMAL_MEMORY] contains nodes which have
+* zones from (0..ZONE_NORMAL]
+* We can start checking if the current zone is in that range
+* and if so, if the node needs to be set to 
node_states[N_NORMAL_MEMORY]
+* after memory is online.
 */
-   if (zone_idx(zone) <= zone_last && !node_state(nid, N_NORMAL_MEMORY))
+   if (zone_idx(zone) <= ZONE_NORMAL && !node_state(nid, N_NORMAL_MEMORY))
arg->status_change_nid_normal = nid;
else
arg->status_change_nid_normal = -1;
 
 #ifdef CONFIG_HIGHMEM
/*
-* If we have movable node, node_states[N_HIGH_MEMORY]
-* contains nodes which have zones of 0...ZONE_HIGHMEM,
-* set zone_last to ZONE_HIGHMEM.
-*
-* If we don't have movable node, node_states[N_NORMAL_MEMORY]
-* contains nodes which have zones of 0...ZONE_MOVABLE,
-* set zone_last to ZONE_MOVABLE.
+* The current zone cannot be ZONE_HIGHMEM, as zone_for_pfn_range
+* can only return (0..ZONE_NORMAL] or ZONE_MOVABLE.
+* N_HIGH_MEMORY stands for regular or high memory, so if the zone
+* is within the range (0..ZONE_NORMAL], we have to set the node
+* for N_HIGH_MEMORY as well.
 */
-   zone_last = ZONE_HIGHMEM;
-   if (N_MEMORY == N_HIGH_MEMORY)
-   zone_last = ZONE_MOVABLE;
-
-   if (zone_idx(zone) <= zone_last && !node_state(nid, N_HIGH_MEMORY))
+   if (zone_idx(zone) < ZONE_HIGHMEM && !node_state(nid, N_HIGH_MEMORY))
arg->status_change_nid_high = nid;
else
arg->status_change_nid_high = -1;
@@ -732,7 +714,7 @@ static void node_states_check_changes_online(unsigned long 
nr_pages,
 #endif
 
/*
-* if the node don't have memory befor online, we will need to
+* if the node don't have memory before online, we will need to
 * set the node to node_states[N_MEMORY] after the memory
 * is online.
 */
-- 
2.13.6

[RFC PATCH 4/5] mm/memory_hotplug: Tidy up node_states_clear_node

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

node_states_clear_node has the following if statements:

if ((N_MEMORY != N_NORMAL_MEMORY) &&
(arg->status_change_nid_high >= 0))
...

if ((N_MEMORY != N_HIGH_MEMORY) &&
(arg->status_change_nid >= 0))
...

N_MEMORY can never be equal to either N_NORMAL_MEMORY or
N_HIGH_MEMORY, so these checks are always true.
They are misleading, so let us get rid of them.

Signed-off-by: Oscar Salvador 
---
 mm/memory_hotplug.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0f2cf6941224..006a7b817724 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1564,12 +1564,10 @@ static void node_states_clear_node(int node, struct 
memory_notify *arg)
if (arg->status_change_nid_normal >= 0)
node_clear_state(node, N_NORMAL_MEMORY);
 
-   if ((N_MEMORY != N_NORMAL_MEMORY) &&
-   (arg->status_change_nid_high >= 0))
+   if (arg->status_change_nid_high >= 0)
node_clear_state(node, N_HIGH_MEMORY);
 
-   if ((N_MEMORY != N_HIGH_MEMORY) &&
-   (arg->status_change_nid >= 0))
+   if (arg->status_change_nid >= 0)
node_clear_state(node, N_MEMORY);
 }
 
-- 
2.13.6

[RFC PATCH 1/5] mm/memory_hotplug: Spare unnecessary calls to node_set_state

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

In node_states_check_changes_online, we check if the node will
have to be set for any of the N_*_MEMORY states after the pages
have been onlined.

Later on, we perform the activation in node_states_set_node.
Currently, in node_states_set_node we set the node to N_MEMORY
unconditionally.
This means that we will call node_set_state for N_MEMORY every time
pages go online, but we only need to do it if the node has not yet been
set for N_MEMORY.
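
As a rough userspace model of the intended behaviour (the kernel uses
nodemasks; the reduced enum and the set_calls[] counter here are
hypothetical stand-ins added only to make the effect visible):

```c
/* Simplified stand-ins; only the three memory states are modelled. */
enum node_states { N_NORMAL_MEMORY, N_HIGH_MEMORY, N_MEMORY, NR_NODE_STATES };

struct memory_notify {
	int status_change_nid_normal;
	int status_change_nid_high;
	int status_change_nid;
};

static int set_calls[NR_NODE_STATES];	/* counts calls per state */

static void node_set_state(int node, enum node_states state)
{
	(void)node;
	set_calls[state]++;
}

/* A field of -1 means "no change needed", so the call is skipped. */
static void node_states_set_node(int node, struct memory_notify *arg)
{
	if (arg->status_change_nid_normal >= 0)
		node_set_state(node, N_NORMAL_MEMORY);

	if (arg->status_change_nid_high >= 0)
		node_set_state(node, N_HIGH_MEMORY);

	if (arg->status_change_nid >= 0)
		node_set_state(node, N_MEMORY);
}
```

Onlining a second range on a node that already has memory then leaves
every counter untouched, instead of setting N_MEMORY again.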

Signed-off-by: Oscar Salvador 
---
 mm/memory_hotplug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 60b67f09956e..4a89915e1467 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -746,7 +746,8 @@ static void node_states_set_node(int node, struct 
memory_notify *arg)
if (arg->status_change_nid_high >= 0)
node_set_state(node, N_HIGH_MEMORY);
 
-   node_set_state(node, N_MEMORY);
+   if (arg->status_change_nid >= 0)
+   node_set_state(node, N_MEMORY);
 }
 
 static void __meminit resize_zone_range(struct zone *zone, unsigned long 
start_pfn,
-- 
2.13.6

[RFC PATCH 0/5] Clean up node_states_check_changes_online/offline

2018-08-22 Thread Oscar Salvador

From: Oscar Salvador 

This patchset cleans up the node_states_check_changes_online/offline
functions together with the node_states_set/clear_node functions.

The main reason behind this patchset is that currently, these
functions are suboptimal and confusing.

For example, they contain wrong statements like:

if (N_MEMORY == N_NORMAL_MEMORY)
if (N_MEMORY != N_NORMAL_MEMORY)
if (N_MEMORY != N_HIGH_MEMORY)
if (N_MEMORY == N_HIGH_MEMORY)

At least, I could not find any place where N_NORMAL_MEMORY gets
assigned to N_MEMORY, or the other way around.
The same holds for the N_HIGH_MEMORY case.

My rough guess is that all that was meant to compare
N_NORMAL_MEMORY to N_HIGH_MEMORY, to see if we were on
CONFIG_HIGHMEM systems.
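
As a rough illustration of why such comparisons are constant: the
states are enumerators, so comparing two of them is decided at compile
time. The sketch below is a simplified stand-in that only loosely
mirrors the layout in include/linux/nodemask.h; SKETCH_HIGHMEM is a
hypothetical replacement for CONFIG_HIGHMEM, and the helper names are
made up for the example.

```c
/* Define SKETCH_HIGHMEM to mimic a CONFIG_HIGHMEM build (hypothetical knob). */
enum node_states {
	N_POSSIBLE,
	N_ONLINE,
	N_NORMAL_MEMORY,
#ifdef SKETCH_HIGHMEM
	N_HIGH_MEMORY,
#else
	N_HIGH_MEMORY = N_NORMAL_MEMORY,
#endif
	N_MEMORY,
	NR_NODE_STATES
};

/* Always 0 with this layout: N_MEMORY has its own enumerator. */
static int memory_is_normal(void)
{
	return N_MEMORY == N_NORMAL_MEMORY;
}

/* Depends only on the build option, which is what the check likely meant. */
static int highmem_is_separate(void)
{
	return N_HIGH_MEMORY != N_NORMAL_MEMORY;
}
```

Built without SKETCH_HIGHMEM, N_HIGH_MEMORY aliases N_NORMAL_MEMORY
while N_MEMORY stays distinct from both.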

This went unnoticed because the if statements never got triggered,
so they were always silent.
For instance, let us take a look at node_states_clear_node

...
if ((N_MEMORY != N_NORMAL_MEMORY) &&
(arg->status_change_nid_high >= 0))
node_clear_state(node, N_HIGH_MEMORY);

if ((N_MEMORY != N_HIGH_MEMORY) &&
(arg->status_change_nid >= 0))
node_clear_state(node, N_MEMORY);
...

Since N_MEMORY will never be equal to either N_HIGH_MEMORY or
N_NORMAL_MEMORY, this just proceeds normally.

Another case is node_states_check_changes_offline:

...
zone_last = ZONE_HIGHMEM;
if (N_MEMORY == N_HIGH_MEMORY)
zone_last = ZONE_MOVABLE;
...

Since N_MEMORY will never be equal to N_HIGH_MEMORY, zone_last will
never be set to ZONE_MOVABLE.
But this is fine as the code works without that.

After I found all this, I tried to re-write the code in a more
understandable way, and I got rid of these confusing parts
on the way.

Another reason for this patchset is that there are some functions that are
called unconditionally when they should only be called under certain
conditions.

That is the case for:

- node_states_set_node()->node_set_state(node, N_MEMORY)

* node_states_set_node() gets called whenever we online pages,
  so we end up calling node_set_state(node, N_MEMORY) every time.
  To avoid this, we should check if the node is already in node_state[N_MEMORY].

- node_states_set_node()->node_set_state(node, N_HIGH_MEMORY)

* On !CONFIG_HIGHMEM, N_HIGH_MEMORY == N_NORMAL_MEMORY,
  but the current code sets:
  status_change_nid_high = status_change_nid_normal
  This means that we will call node_set_state(node, N_NORMAL_MEMORY) twice.
  The fix here is to set status_change_nid_high = -1 on such systems,
  so we skip the second call.


I tried it out on x86_64 so far and everything worked.
But I would like to get feedback on this since I could be
missing something.

Oscar Salvador (5):
  mm/memory_hotplug: Spare unnecessary calls to node_set_state
  mm/memory_hotplug: Avoid node_set/clear_state(N_HIGH_MEMORY) when
!CONFIG_HIGHMEM
  mm/memory_hotplug: Simplify node_states_check_changes_online
  mm/memory_hotplug: Tidy up node_states_clear_node
  mm/memory_hotplug: Simplify node_states_check_changes_offline

 mm/memory_hotplug.c | 146 +---
 1 file changed, 60 insertions(+), 86 deletions(-)

-- 
2.13.6

Re: [PATCH 01/14] ata: ahci-platform: add reset control support and the flag to specify using reset

2018-08-22 Thread Sergei Shtylyov


Hello!

On 8/22/2018 10:36 AM, Kunihiko Hayashi wrote:


Add support to get and control a list of resets for the device
as optional and shared. These resets must be kept de-asserted until
the device is enabled.

This is specified as shared because some SoCs like UniPhier series
have common reset controls with all ahci controller instances.

However, according to Thierry's view,
https://www.spinics.net/lists/linux-ide/msg55357.html
some hardware-specific drivers already use their own resets,
and the common resets open a path where resets could be controlled twice.

This now adds a flag to ahci_platform_get_resources() indicating
which resources to get, currently resets only, and existing
drivers pass 0 for this flag.

Suggested-by: Hans de Goede 
Cc: Thierry Reding 
Signed-off-by: Kunihiko Hayashi 

[...]


diff --git a/include/linux/ahci_platform.h b/include/linux/ahci_platform.h
index 1b0a17b..eaedca5f 100644
--- a/include/linux/ahci_platform.h
+++ b/include/linux/ahci_platform.h
@@ -30,7 +30,7 @@ void ahci_platform_disable_regulators(struct ahci_host_priv 
*hpriv);
  int ahci_platform_enable_resources(struct ahci_host_priv *hpriv);
  void ahci_platform_disable_resources(struct ahci_host_priv *hpriv);
  struct ahci_host_priv *ahci_platform_get_resources(
-   struct platform_device *pdev);
+   struct platform_device *pdev, unsigned int flags);


   That breaks all the users of this API. You should fix the callers in this 
same patch to avoid breakage.


[...]

MBR, Sergei

Re: [PATCH 03/11] i2c: use SPDX identifier for Renesas drivers

2018-08-22 Thread Simon Horman

On Wed, Aug 22, 2018 at 12:02:16AM +0200, Wolfram Sang wrote:
> Signed-off-by: Wolfram Sang 

Reviewed-by: Simon Horman

[GIT PULL] More power management updates for v4.19-rc1

2018-08-22 Thread Rafael J. Wysocki

Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 pm-4.19-rc1-2

with top-most commit 01ac7c4c2e035bc8d0d47dc880bbc25bf562a648

 Merge branches 'pm-cpufreq', 'pm-pci' and 'pm-sleep'

on top of commit b018fc9800557bd14a40d69501e19c340eb2c521

 Merge tag 'pm-4.19-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

to receive more power management updates for 4.19-rc1.

These fix the main idle loop and the menu cpuidle governor, clean up
the latter, fix a mistake in the PCI bus type's support for system
suspend and resume, fix the ondemand and conservative cpufreq
governors, address a build issue in the system wakeup framework and
make the ACPI C-states descriptions less confusing.

Specifics:

 - Make the idle loop handle stopped scheduler tick correctly (Rafael
   Wysocki).

 - Prevent the menu cpuidle governor from letting CPUs spend too much
   time in shallow idle states when it is invoked with scheduler tick
   stopped and clean it up somewhat (Rafael Wysocki).

 - Avoid invoking the platform firmware to make the platform enter
   the ACPI S3 sleep state with suspended PCIe root ports which may
   confuse the firmware and cause it to crash (Rafael Wysocki).

 - Fix sysfs-related race in the ondemand and conservative cpufreq
   governors which may cause the system to crash if the governor
   module is removed during an update of CPU frequency limits (Henry
   Willard).

 - Select SRCU when building the system wakeup framework to avoid a
   build issue in it (zhangyi).

 - Make the descriptions of ACPI C-states vendor-neutral to avoid
   confusion (Prarit Bhargava).

Thanks!


---

Henry Willard (1):
  cpufreq: governor: Avoid accessing invalid governor_data

Prarit Bhargava (1):
  x86/ACPI/cstate: Make APCI C1 FFH MWAIT C-state description vendor-neutral

Rafael J. Wysocki (5):
  cpuidle: menu: Fix white space
  cpuidle: menu: Update stale polling override comment
  PCI / ACPI / PM: Resume all bridges on suspend-to-RAM
  sched: idle: Avoid retaining the tick when it has been stopped
  cpuidle: menu: Handle stopped tick more aggressively

zhangyi (F) (1):
  PM / sleep: wakeup: Fix build error caused by missing SRCU support

---

 arch/x86/kernel/acpi/cstate.c  |  2 +-
 drivers/cpufreq/cpufreq_governor.c | 12 --
 drivers/cpuidle/governors/menu.c   | 45 --
 drivers/pci/pci-acpi.c |  6 ++---
 kernel/power/Kconfig   |  1 +
 kernel/sched/idle.c|  2 +-
 6 files changed, 43 insertions(+), 25 deletions(-)

[GIT PULL] More ACPI updates for v4.19-rc1

2018-08-22 Thread Rafael J. Wysocki

Hi Linus,

Please pull from the tag

 git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
 acpi-4.19-rc1-2

with top-most commit d1f3ab5b13c03b6f32d5379cd3cd5c7e50ce612c

 Merge branch 'acpi-pmic'

on top of commit 2c20443ec221dcb76484b30933593e8ecd836bbd

 Merge tag 'acpi-4.19-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

to receive more ACPI updates for 4.19-rc1.

These update the ACPICA code in the kernel to the most recent
upstream revision (which includes a regression fix and other
improvements), make ACPICA clear the status of all ACPI events
when entering sleep states (to restore the previous behavior)
and update the ACPI operation region driver for the CrystalCove
PMIC.

Specifics:

 - Update the ACPICA code in the kernel to upstream revision 20180810
   including:
   * Fix for AML parser regression causing it to mishandle opcodes
 that open a scope upon parse failures (Erik Schmauss).
   * Fix for a reference counting issue on large systems (Erik
 Schmauss).
   * Fix to discard values coming from register reads that have
 failed (Erik Schmauss).
   * Two acpiexec fixes (Bob Moore, Erik Schmauss).
   * Debugger cleanup (Bob Moore).
   * Cleanup of duplicate table error message (Bob Moore).
   * Cleanup of hex detection in the utilities (Erik Schmauss).

 - Make ACPICA clear the status of all ACPI events when entering
   sleep states again to avoid functional regressions (Rafael Wysocki).

 - Update the ACPI operation region driver for the CrystalCove PMIC
   to cover all of the known operation region fields (Hans de Goede).

Thanks!


---

Bob Moore (4):
  ACPICA: Update an error message for a duplicate table
  ACPICA: Debugger: Cleanup interface to the AML disassembler
  ACPICA: acpiexec: fix a small memory leak regression
  ACPICA: Update version to 20180810

Erik Schmauss (7):
  ACPICA: AML Parser: ignore all exceptions resulting from
incorrect AML during table load
  ACPICA: ACPICA: add status check for acpi_hw_read before
assigning return value
  ACPICA: Utilities: split hex detection into smaller functions
  ACPICA: AML Parser: skip opcodes that open a scope upon parse failure
  ACPICA: acpi_exec: fixing -fi option
  ACPICA: Reference count: add additional debugging details
  ACPICA: Reference Counts: increase max to 0x4000 for large servers

Hans de Goede (1):
  ACPI / PMIC: CrystalCove: Extend PMOP support to support all
possible fields

Rafael J. Wysocki (1):
  ACPICA: Clear status of all events when entering sleep states

---

 drivers/acpi/acpica/aclocal.h  |   1 +
 drivers/acpi/acpica/acnamesp.h |  17 +++---
 drivers/acpi/acpica/acutils.h  |   2 +
 drivers/acpi/acpica/dbinput.c  |  10 
 drivers/acpi/acpica/dbmethod.c |   8 +--
 drivers/acpi/acpica/dbxface.c  |  10 +++-
 drivers/acpi/acpica/dsfield.c  |  34 +++-
 drivers/acpi/acpica/hwregs.c   |   9 ++-
 drivers/acpi/acpica/hwsleep.c  |  11 +---
 drivers/acpi/acpica/nsaccess.c |  13 +
 drivers/acpi/acpica/psloop.c   |  43 ---
 drivers/acpi/acpica/tbdata.c   |   4 +-
 drivers/acpi/acpica/utdelete.c |   7 ++-
 drivers/acpi/acpica/utstrsuppt.c   |  26 -
 drivers/acpi/acpica/utstrtoul64.c  |   2 +-
 drivers/acpi/pmic/intel_pmic_crc.c | 109 -
 include/acpi/acconfig.h|   2 +-
 include/acpi/acexcep.h |   6 ++
 include/acpi/acpixf.h  |   2 +-
 19 files changed, 260 insertions(+), 56 deletions(-)

Re: [PATCH v9 12/22] s390: vfio-ap: sysfs interfaces to configure control domains

2018-08-22 Thread Cornelia Huck

On Wed, 22 Aug 2018 01:18:20 +0200
Halil Pasic  wrote:

> On 08/21/2018 07:07 PM, Tony Krowiak wrote:
> > This convention has been enforced by the kernel since v1. This is also
> > enforced by both the LPAR as well as in z/VM. The following is from the
> > PR/SM Planning Guide:
> > 
> > Control Domain
> > A logical partition's control domains are those cryptographic domains for 
> > which remote secure
> > administration functions can be established and administered from this 
> > logical partition. This
> > logical partition’s control domains must include its usage domains. For 
> > each index selected in the
> > usage domain index list, you must select the same index in the control 
> > domain index list
> >   

That's interesting.

> 
> IMHO this quote is quite a half-full half-empty cup one:
> * it mandates the set of usage domains is a subset of the set
> of the control domains, but
> * it speaks of independent controls, namely about the 'usage domain index'
> and the 'control domain index list' and makes the enforcement of the rule
> a job of the administrator (instead of codifying it in the controls).

I'm wondering if a configuration with a usage domain that is not also a
control domain is rejected outright? Anybody tried that? :)

> 
> > 
> > Consequently, I'm going to opt for ensuring this is clearly documented. 
> > Based on the fact you've
> > requested clarification of many points described in this section of the 
> > doc, I
> > think I'll try putting my meager skills as a wordsmith to work to hopefully 
> > clarify things.
> > I'll run it by you when I complete that task to see if I've succeeded:)  
> 
> I don't think just a doc update will do. Let me explain why.
> 
> What you describe as "... note that the AQM and ADM masks configured for the
> mediated matrix device will be logically OR'd together to create the ADM
> stored in the CRYCB referenced from the guest's SIE state description."
> is a gotcha at best. The member of struct ap_matrix and the member of the
> respective apcb in the crycb are both called 'adm', but ap_matrix.adm is
> not an ADM as we know it from the architecture, but rather ~ AQM & ADM.
> 
> I feel pretty strongly about this one. If we want to keep the enforcement
> in the kernel, I guess, the assign_domain should set the corresponding
> bit not only in ap_matrix.aqm but also in ap_matrix.adm. When the
> ap_matrix is committed into the crycb no further manipulation of the
> masks should take place.

Would you be fine if the control domain interface stated that it is
used to configure _additional_ control domains and the usage domain
interface stated that it is used to define usage and implicitly also
control domains? (And make the usage domain interface also set the
equivalent bit in the control domain mask.)
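
Roughly like the following untested sketch. The names
(struct matrix_sketch, assign_usage_domain(), assign_control_domain())
are made up for illustration and are not the vfio-ap driver's interface;
the 256-domain size and the LSB-first bit numbering are simplifying
assumptions as well:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_DOMAINS 256 /* assumed number of domain indexes */

/* Hypothetical stand-in for the two masks discussed above. */
struct matrix_sketch {
	uint64_t aqm[SKETCH_DOMAINS / 64]; /* usage domains */
	uint64_t adm[SKETCH_DOMAINS / 64]; /* control domains */
};

static void set_domain_bit(uint64_t *mask, unsigned int id)
{
	assert(id < SKETCH_DOMAINS);
	mask[id / 64] |= UINT64_C(1) << (id % 64);
}

static bool test_domain_bit(const uint64_t *mask, unsigned int id)
{
	assert(id < SKETCH_DOMAINS);
	return (mask[id / 64] >> (id % 64)) & 1;
}

/* A usage domain implicitly becomes a control domain as well. */
static void assign_usage_domain(struct matrix_sketch *m, unsigned int id)
{
	set_domain_bit(m->aqm, id);
	set_domain_bit(m->adm, id);
}

/* The control domain interface only adds _additional_ control domains. */
static void assign_control_domain(struct matrix_sketch *m, unsigned int id)
{
	set_domain_bit(m->adm, id);
}
```

With this shape the masks can be copied into the crycb as they are, and
the set of usage domains is a subset of the control domains by
construction.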

> 
> I don't feel strongly about whether to enforce this convention about AQM
> and ADM in the kernel or not. Frankly, I don't know what is behind the
> rule. Since I can't tell if any problems are to be expected if this
> convention is violated, I would feel more comfortable if the rule was
> accommodated higher in the management stack.

I guess it depends:

- If this is a case of: "Don't configure control domains that are not
  also usage domains. You are likely to go through
  {code,firmware,hardware} paths that are generally not used.",
  configure it in the kernel.
- If this rather is "Everybody is doing that, it's a general
  convention.", configure it higher up in the stack (libvirt?)

[PATCH] ARM: dts: stm32: update rtc st,syscfg property on stm32h743

2018-08-22 Thread Amelie Delaunay

To fit with latest rtc driver updates, rtc st,syscfg property must contain
the control register offset of pwrcfg and the mask corresponding to the
DBP (Disable Backup Protection) bit.

Signed-off-by: Amelie Delaunay 
---
 arch/arm/boot/dts/stm32h743.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/stm32h743.dtsi b/arch/arm/boot/dts/stm32h743.dtsi
index 637beff..cbdd69c 100644
--- a/arch/arm/boot/dts/stm32h743.dtsi
+++ b/arch/arm/boot/dts/stm32h743.dtsi
@@ -472,7 +472,7 @@
interrupt-parent = <&exti>;
interrupts = <17 IRQ_TYPE_EDGE_RISING>;
interrupt-names = "alarm";
-   st,syscfg = <&pwrcfg>;
+   st,syscfg = <&pwrcfg 0x00 0x100>;
status = "disabled";
};
 
-- 
2.7.4
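
For illustration only, a minimal sketch of how a consumer might apply
the two extra cells (register offset 0x00 and mask 0x100) to set or
clear the DBP bit. The fake_pwr_regs array and the update_bits() helper
are hypothetical stand-ins and not the rtc driver's code; in the kernel
this kind of access would typically go through the syscon regmap:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PWR_REGS 4 /* assumed size of the fake register file */

/* Fake register file standing in for the pwrcfg syscon. */
static uint32_t fake_pwr_regs[SKETCH_PWR_REGS];

/* Read-modify-write of the bits selected by mask at a byte offset. */
static void update_bits(uint32_t offset, uint32_t mask, uint32_t val)
{
	uint32_t *reg;

	assert(offset % 4 == 0 && offset / 4 < SKETCH_PWR_REGS);
	reg = &fake_pwr_regs[offset / 4];
	*reg = (*reg & ~mask) | (val & mask);
}
```

Calling update_bits(0x00, 0x100, 0x100) would set the DBP bit and
update_bits(0x00, 0x100, 0) would clear it, leaving all other bits of
the register untouched.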

Re: [PATCH v2] PCI: dwc: fix scheduling while atomic issues

2018-08-22 Thread Gustavo Pimentel

Hi Jisheng

On 21/08/2018 07:15, Jisheng Zhang wrote:
> When programming inbound/outbound atu, we call usleep_range() after
> each checking PCIE_ATU_ENABLE bit. Unfortunately, the atu programming
> can be called in atomic context:
> 
> inbound atu programming could be called through
> pci_epc_write_header()
>   =>dw_pcie_ep_write_header()
> =>dw_pcie_prog_inbound_atu()
> 
> outbound atu programming could be called through
> pci_bus_read_config_dword()
>   =>dw_pcie_rd_conf()
> =>dw_pcie_prog_outbound_atu()
> 
> Fix this issue by calling mdelay() instead.

Makes sense. Thanks.

Acked-by: Gustavo Pimentel 

Regards,
Gustavo

> 
> Signed-off-by: Jisheng Zhang 
> ---
> 
> Since v1
>  - use mdelay() instead of udelay() to avoid __bad_udelay()
> 
>  drivers/pci/controller/dwc/pcie-designware.c | 8 
>  drivers/pci/controller/dwc/pcie-designware.h | 3 +--
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware.c 
> b/drivers/pci/controller/dwc/pcie-designware.c
> index 778c4f76a884..2153956a0b20 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.c
> +++ b/drivers/pci/controller/dwc/pcie-designware.c
> @@ -135,7 +135,7 @@ static void dw_pcie_prog_outbound_atu_unroll(struct 
> dw_pcie *pci, int index,
>   if (val & PCIE_ATU_ENABLE)
>   return;
>  
> - usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
> + mdelay(LINK_WAIT_IATU);
>   }
>   dev_err(pci->dev, "Outbound iATU is not being enabled\n");
>  }
> @@ -178,7 +178,7 @@ void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int 
> index, int type,
>   if (val & PCIE_ATU_ENABLE)
>   return;
>  
> - usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
> + mdelay(LINK_WAIT_IATU);
>   }
>   dev_err(pci->dev, "Outbound iATU is not being enabled\n");
>  }
> @@ -236,7 +236,7 @@ static int dw_pcie_prog_inbound_atu_unroll(struct dw_pcie 
> *pci, int index,
>   if (val & PCIE_ATU_ENABLE)
>   return 0;
>  
> - usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
> + mdelay(LINK_WAIT_IATU);
>   }
>   dev_err(pci->dev, "Inbound iATU is not being enabled\n");
>  
> @@ -282,7 +282,7 @@ int dw_pcie_prog_inbound_atu(struct dw_pcie *pci, int 
> index, int bar,
>   if (val & PCIE_ATU_ENABLE)
>   return 0;
>  
> - usleep_range(LINK_WAIT_IATU_MIN, LINK_WAIT_IATU_MAX);
> + mdelay(LINK_WAIT_IATU);
>   }
>   dev_err(pci->dev, "Inbound iATU is not being enabled\n");
>  
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h 
> b/drivers/pci/controller/dwc/pcie-designware.h
> index 96126fd8403c..9f1a5e399b70 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -26,8 +26,7 @@
>  
>  /* Parameters for the waiting for iATU enabled routine */
>  #define LINK_WAIT_MAX_IATU_RETRIES   5
> -#define LINK_WAIT_IATU_MIN   9000
> -#define LINK_WAIT_IATU_MAX   10000
> +#define LINK_WAIT_IATU   9
>  
>  /* Synopsys-specific PCIe configuration registers */
>  #define PCIE_PORT_LINK_CONTROL   0x710
>
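
For readers following along, the retry loop touched by the patch can be
modelled in user space as below. This is only a sketch under stated
assumptions: wait_for_atu_enable(), the fake_atu structure and the
fake_mdelay() callback are hypothetical, and the bit position used for
the enable flag is assumed. The point it shows is that the wait between
polls is a busy delay passed in by the caller, so nothing in the loop
may sleep:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_RETRIES 5
#define SKETCH_ATU_ENABLE  (UINT32_C(1) << 31) /* assumed bit position */

typedef uint32_t (*read_reg_fn)(void *ctx);
typedef void (*delay_fn)(unsigned int ms);

/*
 * Poll the enable bit a bounded number of times, busy-waiting between
 * polls. Returns true once the bit reads back as set.
 */
static bool wait_for_atu_enable(read_reg_fn read_reg, void *ctx,
				delay_fn busy_delay_ms, unsigned int wait_ms)
{
	int retries;

	assert(read_reg && busy_delay_ms);
	for (retries = 0; retries < SKETCH_MAX_RETRIES; retries++) {
		if (read_reg(ctx) & SKETCH_ATU_ENABLE)
			return true;
		busy_delay_ms(wait_ms);
	}
	return false;
}

/* Fake hardware: the enable bit reads back as set on the Nth read. */
struct fake_atu {
	int reads_until_enabled;
	int reads;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_atu *atu = ctx;

	atu->reads++;
	return atu->reads >= atu->reads_until_enabled ? SKETCH_ATU_ENABLE : 0;
}

/* Fake busy delay that only records how long we would have spun. */
static unsigned int total_delay_ms;

static void fake_mdelay(unsigned int ms)
{
	total_delay_ms += ms;
}
```

With 5 retries and a 9 ms delay the worst case is roughly 45 ms of
busy-waiting, which is the price of being callable from atomic context.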

Re: [PATCH 0/2] Add SDHI support to r8a774a1

2018-08-22 Thread Ulf Hansson

On 14 August 2018 at 14:34, Fabrizio Castro
 wrote:
> Dear All,
>
> this series aims at documenting SDHI support for RZ/G2M (a.k.a. R8A774A1).
>
> Cheers,
> Fab
>
> Fabrizio Castro (2):
>   mmc: renesas_sdhi_internal_dmac: Whitelist r8a774a1
>   mmc: renesas_sdhi: Add r8a774a1 support
>
>  Documentation/devicetree/bindings/mmc/tmio_mmc.txt | 4 +++-
>  drivers/mmc/host/renesas_sdhi_internal_dmac.c  | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>

Thanks, queued for v4.20!

Kind regards
Uffe

Re: [PATCH v2 1/1] mmc: dw_mmc: hi3798cv200: add MMC_CAP_CMD23 cap

2018-08-22 Thread Ulf Hansson

On 20 August 2018 at 15:04, Igor Opaniuk  wrote:
> Enable access to the RPMB on the on-board eMMC of the
> Poplar board.
>
> Signed-off-by: Igor Opaniuk 

Thanks, queued for v4.20!

Kind regards
Uffe

> ---
>
> v2:
> - as there are three dwmmc blocks integrated on Hi3798CV200 SoC with
> identical CMD23 support, provide MMC_CAP_CMD23 capability instead of 0 for
> the third block also.
>
>  drivers/mmc/host/dw_mmc-hi3798cv200.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/mmc/host/dw_mmc-hi3798cv200.c 
> b/drivers/mmc/host/dw_mmc-hi3798cv200.c
> index f9b333f..bc51cef 100644
> --- a/drivers/mmc/host/dw_mmc-hi3798cv200.c
> +++ b/drivers/mmc/host/dw_mmc-hi3798cv200.c
> @@ -23,6 +23,12 @@ struct hi3798cv200_priv {
> struct clk *drive_clk;
>  };
>
> +static unsigned long dw_mci_hi3798cv200_caps[] = {
> +   MMC_CAP_CMD23,
> +   MMC_CAP_CMD23,
> +   MMC_CAP_CMD23
> +};
> +
>  static void dw_mci_hi3798cv200_set_ios(struct dw_mci *host, struct mmc_ios 
> *ios)
>  {
> struct hi3798cv200_priv *priv = host->priv;
> @@ -160,6 +166,8 @@ static int dw_mci_hi3798cv200_init(struct dw_mci *host)
>  }
>
>  static const struct dw_mci_drv_data hi3798cv200_data = {
> +   .caps = dw_mci_hi3798cv200_caps,
> +   .num_caps = ARRAY_SIZE(dw_mci_hi3798cv200_caps),
> .init = dw_mci_hi3798cv200_init,
> .set_ios = dw_mci_hi3798cv200_set_ios,
> .execute_tuning = dw_mci_hi3798cv200_execute_tuning,
> --
> 2.7.4
>

Re: [PATCH] mmc: jz4740: Drop dependency on MACH_JZ4740/80

2018-08-22 Thread Ulf Hansson

On 21 August 2018 at 15:03, Paul Cercueil  wrote:
> Depending on MACH_JZ4740 || MACH_JZ4780 prevents us from creating a generic
> kernel that works on more than one MIPS board. Instead, we just depend on
> MIPS being set.
>
> Signed-off-by: Paul Cercueil 

Thanks, queued for v4.20!

Kind regards
Uffe

> ---
>  drivers/mmc/host/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
> index 0581c199c996..71d9bdc52422 100644
> --- a/drivers/mmc/host/Kconfig
> +++ b/drivers/mmc/host/Kconfig
> @@ -761,7 +761,7 @@ config MMC_SH_MMCIF
>
>  config MMC_JZ4740
> tristate "Ingenic JZ47xx SD/Multimedia Card Interface support"
> -   depends on MACH_JZ4740 || MACH_JZ4780
> +   depends on MIPS
> help
>   This selects support for the SD/MMC controller on Ingenic
>   JZ4740, JZ4750, JZ4770 and JZ4780 SoCs.
> --
> 2.11.0
>

Re: [PATCH v4 0/2] Add ACPI support to IPROC SDHCI

2018-08-22 Thread Ulf Hansson

On 5 August 2018 at 09:52, Srinath Mannam  wrote:
> This patch series adds
>   - Feature to get generic device properties in the
> place of DT properties.
>   - ACPI support to IPROC SDHCI variants
>
> This patch series is based off v4.18-rc3
>
> Changes from v3:
>   - Replaced separate device tree and ACPI get match data APIs
> with single device_get_match_data API.
>
> Changes from v2:
>   - Added patch "Convert DT properties to generic device properties"
> given by Adrian Hunter to this patch series because the
> "Add ACPI support to IPROC SDHCI" patch depends on it.
>
> Changes from v1:
>   - Removed sdhci_iproc_data array change and add directly
> into of and acpi id tables.
>   - Add a change to get match data directly.
>   - Removed clock-frequency property read change.
>   - Used sdhci_get_property to get properties.
>   - Verified with patch given by Adrian Hunter
> mmc: sdhci-pltfm: Convert DT properties to generic device properties
>
> Adrian Hunter (1):
>   mmc: sdhci-pltfm: Convert DT properties to generic device properties
>
> Srinath Mannam (1):
>   mmc: host: iproc: Add ACPI support to IPROC SDHCI
>
>  drivers/mmc/host/Kconfig   |  1 +
>  drivers/mmc/host/sdhci-iproc.c | 59 
>  drivers/mmc/host/sdhci-pltfm.c | 68 +-
>  drivers/mmc/host/sdhci-pltfm.h |  7 -
>  4 files changed, 87 insertions(+), 48 deletions(-)
>
> --
> 2.7.4
>

Thanks, queued for v4.20!

Kind regards
Uffe

Re: [PATCH] mmc: jz4740: Add support for the JZ4725B

2018-08-22 Thread Ulf Hansson

On 21 August 2018 at 17:21, Paul Cercueil  wrote:
> The JZ4725B is the first JZ SoC version that introduced a 32-bit IMASK
> register, not the JZ4750.
>
> Signed-off-by: Paul Cercueil 

Thanks, queued for v4.20!

Kind regards
Uffe

> ---
>  Documentation/devicetree/bindings/mmc/jz4740.txt | 1 +
>  drivers/mmc/host/jz4740_mmc.c| 5 +++--
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/mmc/jz4740.txt 
> b/Documentation/devicetree/bindings/mmc/jz4740.txt
> index 7cd8c432d7c8..8a6f87f13114 100644
> --- a/Documentation/devicetree/bindings/mmc/jz4740.txt
> +++ b/Documentation/devicetree/bindings/mmc/jz4740.txt
> @@ -7,6 +7,7 @@ described in mmc.txt.
>  Required properties:
>  - compatible: Should be one of the following:
>- "ingenic,jz4740-mmc" for the JZ4740
> +  - "ingenic,jz4725b-mmc" for the JZ4725B
>- "ingenic,jz4780-mmc" for the JZ4780
>  - reg: Should contain the MMC controller registers location and length.
>  - interrupts: Should contain the interrupt specifier of the MMC controller.
> diff --git a/drivers/mmc/host/jz4740_mmc.c b/drivers/mmc/host/jz4740_mmc.c
> index 993386c9ea50..0c1efd5100b7 100644
> --- a/drivers/mmc/host/jz4740_mmc.c
> +++ b/drivers/mmc/host/jz4740_mmc.c
> @@ -115,7 +115,7 @@
>
>  enum jz4740_mmc_version {
> JZ_MMC_JZ4740,
> -   JZ_MMC_JZ4750,
> +   JZ_MMC_JZ4725B,
> JZ_MMC_JZ4780,
>  };
>
> @@ -176,7 +176,7 @@ struct jz4740_mmc_host {
>  static void jz4740_mmc_write_irq_mask(struct jz4740_mmc_host *host,
>   uint32_t val)
>  {
> -   if (host->version >= JZ_MMC_JZ4750)
> +   if (host->version >= JZ_MMC_JZ4725B)
> return writel(val, host->base + JZ_REG_MMC_IMASK);
> else
> return writew(val, host->base + JZ_REG_MMC_IMASK);
> @@ -1012,6 +1012,7 @@ static void jz4740_mmc_free_gpios(struct 
> platform_device *pdev)
>
>  static const struct of_device_id jz4740_mmc_of_match[] = {
> { .compatible = "ingenic,jz4740-mmc", .data = (void *) JZ_MMC_JZ4740 
> },
> +   { .compatible = "ingenic,jz4725b-mmc", .data = (void *)JZ_MMC_JZ4725B 
> },
> { .compatible = "ingenic,jz4780-mmc", .data = (void *) JZ_MMC_JZ4780 
> },
> {},
>  };
> --
> 2.11.0
>
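
The version check in the patch relies on the enum order matching the
order in which the wider IMASK register appeared. A minimal user-space
sketch of that idea follows; the mmc_version_sketch enum and both helper
functions are hypothetical and only mirror the shape of the driver code:

```c
#include <assert.h>
#include <stdint.h>

/* Ordered so that "newer than or equal to" is a plain comparison. */
enum mmc_version_sketch {
	SKETCH_JZ4740,
	SKETCH_JZ4725B,
	SKETCH_JZ4780,
};

/* Width of the IMASK register in bits for a given controller version. */
static unsigned int imask_width_bits(enum mmc_version_sketch version)
{
	assert(version <= SKETCH_JZ4780);
	return version >= SKETCH_JZ4725B ? 32 : 16;
}

/* Value that actually lands in the register for a requested mask. */
static uint32_t imask_value_written(enum mmc_version_sketch version,
				    uint32_t val)
{
	return imask_width_bits(version) == 32 ? val : (uint16_t)val;
}
```

Any new version added to such an enum has to be inserted at the right
position, otherwise the >= comparison silently selects the wrong access
width.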

[PATCH v1] KVM: s390: store DXC/VXC in fpc on DATA/Vector-processing exceptions

2018-08-22 Thread David Hildenbrand

When DATA exceptions and vector-processing exceptions (program interrupts)
are injected, the DXC/VXC is also to be stored in the fpc, if AFP is
enabled in CR0.

This can happen inside KVM when reinjecting an interrupt during program
interrupt intercepts. These are triggered for example when debugging the
guest (concurrent PER events result in an intercept instead of an
injection of such interrupts).

Signed-off-by: David Hildenbrand 
---

Only compile-tested.

 arch/s390/include/asm/ctl_reg.h | 1 +
 arch/s390/kvm/interrupt.c   | 8 
 2 files changed, 9 insertions(+)

diff --git a/arch/s390/include/asm/ctl_reg.h b/arch/s390/include/asm/ctl_reg.h
index 4600453536c2..88f3f14baee9 100644
--- a/arch/s390/include/asm/ctl_reg.h
+++ b/arch/s390/include/asm/ctl_reg.h
@@ -11,6 +11,7 @@
 #include 
 
 #define CR0_CLOCK_COMPARATOR_SIGN  _BITUL(63 - 10)
+#define CR0_AFP_REGISTER_CONTROL   _BITUL(63 - 45)
 #define CR0_EMERGENCY_SIGNAL_SUBMASK   _BITUL(63 - 49)
 #define CR0_EXTERNAL_CALL_SUBMASK  _BITUL(63 - 50)
 #define CR0_CLOCK_COMPARATOR_SUBMASK   _BITUL(63 - 52)
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index fcb55b02990e..5b5754d8f460 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -765,6 +765,14 @@ static int __must_check __deliver_prog(struct kvm_vcpu 
*vcpu)
break;
case PGM_VECTOR_PROCESSING:
case PGM_DATA:
+   if (vcpu->arch.sie_block->gcr[0] & CR0_AFP_REGISTER_CONTROL) {
+   /* make sure the new fpc will be lazily loaded */
+   save_fpu_regs();
+   /* the DXC/VXC cannot make the fpc invalid */
+   current->thread.fpu.fpc &= ~0xff00u;
+   current->thread.fpu.fpc |= (pgm_info.data_exc_code << 8)
+  & 0xff00u;
+   }
rc = put_guest_lc(vcpu, pgm_info.data_exc_code,
  (u32 *)__LC_DATA_EXC_CODE);
break;
-- 
2.17.1
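
The fpc update in the hunk above can be checked in isolation. The
following is a small user-space sketch, not the KVM code itself:
fpc_store_dxc() and SKETCH_FPC_DXC_MASK are hypothetical names, and the
AFP check is reduced to a boolean argument:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The DXC/VXC byte of the fpc, as masked in the patch. */
#define SKETCH_FPC_DXC_MASK UINT32_C(0x0000ff00)

static_assert((SKETCH_FPC_DXC_MASK >> 8) == 0xff, "DXC occupies one byte");

/*
 * Return the fpc value after a DATA or vector-processing exception with
 * the given code. Without AFP enabled the fpc is left unchanged.
 */
static uint32_t fpc_store_dxc(uint32_t fpc, uint32_t dxc, bool afp_enabled)
{
	if (!afp_enabled)
		return fpc;

	fpc &= ~SKETCH_FPC_DXC_MASK;
	fpc |= (dxc << 8) & SKETCH_FPC_DXC_MASK;
	return fpc;
}
```

Masking the shifted code keeps any stray upper bits of the exception
code from leaking into the other fpc fields.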

[PATCH] regulator: regmap helpers - support overlapping linear ranges

2018-08-22 Thread Matti Vaittinen

Don't give up voltage mapping if first range with suitable min/max uV
does not provide the wanted voltage.

Signed-off-by: Matti Vaittinen 
---

We may have HW which handles regulator voltage setting like:
LDO5 voltage reg:
bit [3] voltage range selection
bit [2:0] voltage selection where:

If D[3]=0,

000 = 0.7V
001 = 0.8V
010 = 0.9V
011 = 1.0V
100 = 1.1V
101 = 1.2V
110 = 1.3V
111 = 1.4V

If D[3]=1,

000 = 0.675V
001 = 0.775V
010 = 0.875V
011 = 0.975V
100 = 1.075V
101 = 1.175V
110 = 1.275V
111 = 1.375V

which can be described as:

REGULATOR_LINEAR_VOLTAGE(700000, 0x0, 0x7, 100000)
REGULATOR_LINEAR_VOLTAGE(675000, 0x8, 0xF, 100000)

If a consumer wants to use exactly 775000 uV, the current implementation
of regulator_map_voltage_linear_range will pick the first range (as
775000 is between min uV and max uV for this range) and fail because the
first range does not support this voltage. This change makes
regulator_map_voltage_linear_range continue checking the rest of the
ranges for the requested voltage before failing.

Unfortunately this approach only works if the 'range selection bit'
immediately follows the voltage selection bits - which is not the case for
ROHM BD71837/BD71847 PMICs. Hence I will soon submit a patch with helpers
supporting a 'pickable range' register - which would also solve this
issue. Yet, the 'pickable range' solution is not as elegant as the current
linear range solution is (in my opinion) - hence I suggest this addition
for cases where we have a contiguous vsel mask but overlapping linear
ranges.

 drivers/regulator/helpers.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/regulator/helpers.c b/drivers/regulator/helpers.c
index 2ae7c3ac5940..ef09021dc46e 100644
--- a/drivers/regulator/helpers.c
+++ b/drivers/regulator/helpers.c
@@ -321,17 +321,18 @@ int regulator_map_voltage_linear_range(struct 
regulator_dev *rdev,
 
ret += range->min_sel;
 
-   break;
+   /*
+* Map back into a voltage to verify we're still in bounds.
+* If we are not, then continue checking rest of the ranges.
+*/
+   voltage = rdev->desc->ops->list_voltage(rdev, ret);
+   if (voltage >= min_uV && voltage <= max_uV)
+   break;
}
 
if (i == rdev->desc->n_linear_ranges)
return -EINVAL;
 
-   /* Map back into a voltage to verify we're still in bounds */
-   voltage = rdev->desc->ops->list_voltage(rdev, ret);
-   if (voltage < min_uV || voltage > max_uV)
-   return -EINVAL;
-
return ret;
 }
 EXPORT_SYMBOL_GPL(regulator_map_voltage_linear_range);
-- 
2.14.3
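
To make the behaviour change easier to follow, here is a stand-alone
sketch of the "keep looking in later ranges" idea. It is not the
regulator core code: struct range_sketch, map_voltage() and the helpers
are hypothetical, a positive step is assumed, and the table values are
taken from the LDO5 example above (0.7 V and 0.675 V bases in uV, 0.1 V
steps):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified linear range description. */
struct range_sketch {
	int min_uV;
	unsigned int min_sel;
	unsigned int max_sel;
	int step_uV; /* assumed > 0 in this sketch */
};

static const struct range_sketch ldo5_ranges[] = {
	{ 700000, 0x0, 0x7, 100000 },
	{ 675000, 0x8, 0xF, 100000 },
};

static int sel_to_uV(const struct range_sketch *r, unsigned int sel)
{
	return r->min_uV + (int)(sel - r->min_sel) * r->step_uV;
}

/* Returns a selector, or -1 if no range offers a voltage in bounds. */
static int map_voltage(const struct range_sketch *ranges, size_t n,
		       int min_uV, int max_uV)
{
	size_t i;

	assert(ranges && min_uV <= max_uV);
	for (i = 0; i < n; i++) {
		const struct range_sketch *r = &ranges[i];
		unsigned int sel = r->min_sel;
		int uV;

		/* Skip ranges that cannot overlap the request at all. */
		if (min_uV > sel_to_uV(r, r->max_sel) || max_uV < r->min_uV)
			continue;

		/* Round up to the first step at or above min_uV. */
		if (min_uV > r->min_uV)
			sel += (min_uV - r->min_uV + r->step_uV - 1) /
			       r->step_uV;

		/* Map back; on a miss, try the next range instead of failing. */
		uV = sel_to_uV(r, sel);
		if (uV >= min_uV && uV <= max_uV)
			return (int)sel;
	}
	return -1;
}
```

For a request of exactly 775000 uV the first range only offers 800000
uV, so the loop moves on and the second range yields selector 0x9.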

[RESEND PATCH v4 0/6] arm64/mm: Move swapper_pg_dir to rodata

2018-08-22 Thread Jun Yao

The set_init_mm_pgd() is reimplemented using assembly in order to
avoid being instrumented by kasan.

Test following configs with CONFIG_RANDOMIZE_BASE/UNMAP_KERNEL_AT_EL0/
CONFIG_ARM64_SW_TTBR0_PAN/CONFIG_KASAN_OUTLINE enabled on qemu:

1. CONFIG_ARM64_4K_PAGES/CONFIG_ARM64_VA_BITS_48
2. CONFIG_ARM64_4K_PAGES/CONFIG_ARM64_VA_BITS_39
3. CONFIG_ARM64_64K_PAGES/CONFIG_ARM64_VA_BITS_48
4. CONFIG_ARM64_64K_PAGES/CONFIG_ARM64_VA_BITS_42

Jun Yao (6):
  arm64/mm: Introduce the init_pg_dir.
  arm64/mm: Pass ttbr1 as a parameter to __enable_mmu().
  arm64/mm: Create the initial page table in the init_pg_dir.
  arm64/mm: Create the final page table directly in swapper_pg_dir.
  arm64/mm: Populate the swapper_pg_dir by fixmap.
  arm64/mm: Move {idmap_pg_dir .. swapper_pg_dir} to rodata section.

 arch/arm64/include/asm/assembler.h | 29 +
 arch/arm64/include/asm/pgtable.h   | 66 ++
 arch/arm64/kernel/head.S   | 48 ++
 arch/arm64/kernel/sleep.S  |  1 +
 arch/arm64/kernel/vmlinux.lds.S| 47 ++---
 arch/arm64/mm/mmu.c| 45 
 6 files changed, 168 insertions(+), 68 deletions(-)

-- 
2.17.1

[RESEND PATCH v4 2/6] arm64/mm: Pass ttbr1 as a parameter to __enable_mmu().

2018-08-22 Thread Jun Yao

The kernel sets up the initial page table in the init_pg_dir.
However, it will create the final page table in the swapper_pg_dir
during the initialization process. We need to let __enable_mmu()
know which page table to use.

Signed-off-by: Jun Yao 
---
 arch/arm64/kernel/head.S  | 21 -
 arch/arm64/kernel/sleep.S |  1 +
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2c83a8c47e3f..c3e4b1886cde 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -714,6 +714,7 @@ secondary_startup:
 * Common entry point for secondary CPUs.
 */
bl  __cpu_setup // initialise processor
+   adrp    x1, swapper_pg_dir
bl  __enable_mmu
ldr x8, =__secondary_switched
br  x8
@@ -756,6 +757,7 @@ ENDPROC(__secondary_switched)
  * Enable the MMU.
  *
  *  x0  = SCTLR_EL1 value for turning on the MMU.
+ *  x1  = TTBR1_EL1 value for turning on the MMU.
  *
  * Returns to the caller via x30/lr. This requires the caller to be covered
  * by the .idmap.text section.
@@ -764,15 +766,15 @@ ENDPROC(__secondary_switched)
  * If it isn't, park the CPU
  */
 ENTRY(__enable_mmu)
-   mrs x1, ID_AA64MMFR0_EL1
-   ubfx    x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
-   cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
+   mrs x5, ID_AA64MMFR0_EL1
+   ubfx    x6, x5, #ID_AA64MMFR0_TGRAN_SHIFT, 4
+   cmp x6, #ID_AA64MMFR0_TGRAN_SUPPORTED
b.ne    __no_granule_support
-   update_early_cpu_boot_status 0, x1, x2
-   adrp    x1, idmap_pg_dir
-   adrp    x2, swapper_pg_dir
-   phys_to_ttbr x3, x1
-   phys_to_ttbr x4, x2
+   update_early_cpu_boot_status 0, x5, x6
+   adrp    x5, idmap_pg_dir
+   mov x6, x1
+   phys_to_ttbr x3, x5
+   phys_to_ttbr x4, x6
msr ttbr0_el1, x3   // load TTBR0
msr ttbr1_el1, x4   // load TTBR1
isb
@@ -791,7 +793,7 @@ ENDPROC(__enable_mmu)
 
 __no_granule_support:
/* Indicate that this CPU can't boot and is stuck in the kernel */
-   update_early_cpu_boot_status CPU_STUCK_IN_KERNEL, x1, x2
+   update_early_cpu_boot_status CPU_STUCK_IN_KERNEL, x5, x6
 1:
wfe
wfi
@@ -831,6 +833,7 @@ __primary_switch:
mrs x20, sctlr_el1  // preserve old SCTLR_EL1 value
 #endif
 
+   adrp    x1, swapper_pg_dir
bl  __enable_mmu
 #ifdef CONFIG_RELOCATABLE
bl  __relocate_kernel
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index bebec8ef9372..3e53ffa07994 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -101,6 +101,7 @@ ENTRY(cpu_resume)
bl  el2_setup   // if in EL2 drop to EL1 cleanly
bl  __cpu_setup
/* enable the MMU early - so we can access sleep_save_stash by va */
+   adrp    x1, swapper_pg_dir
bl  __enable_mmu
ldr x8, =_cpu_resume
br  x8
-- 
2.17.1

[RESEND PATCH v4 1/6] arm64/mm: Introduce the init_pg_dir.

2018-08-22 Thread Jun Yao

To make the swapper_pg_dir read only, we will move it to the rodata
section and force the kernel to set up the initial page table in
the init_pg_dir. After generating all levels of the page table, we
copy only the top level into the swapper_pg_dir during paging_init().

Signed-off-by: Jun Yao 
---
 arch/arm64/include/asm/assembler.h | 29 +
 arch/arm64/kernel/head.S   | 22 +++---
 arch/arm64/kernel/vmlinux.lds.S|  8 
 3 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index 0bcc98dbba56..eb363a915c0e 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -456,6 +456,35 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
b.ne    9998b
.endm
 
+/*
+ * clear_page - clear one page
+ *
+ * start:  page aligned virtual address
+ */
+   .macro clear_page, start:req
+9996:  stp xzr, xzr, [\start], #16
+   stp xzr, xzr, [\start], #16
+   stp xzr, xzr, [\start], #16
+   stp xzr, xzr, [\start], #16
+   tst \start, #(PAGE_SIZE - 1)
+   b.ne    9996b
+   .endm
+
+/*
+ * clear_pages - clear contiguous pages
+ *
+ * start, end: page aligned virtual addresses
+ */
+   .macro clear_pages, start:req, end:req
+   sub \end, \end, \start
+   lsr \end, \end, #(PAGE_SHIFT)
+9997:  cbz \end, 9998f
+   clear_page \start
+   sub \end, \end, #1
+   b   9997b
+9998:
+   .endm
+
 /*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index b0853069702f..2c83a8c47e3f 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -295,18 +295,21 @@ __create_page_tables:
sub x1, x1, x0
bl  __inval_dcache_area
 
+   adrp    x0, init_pg_dir
+   adrp    x1, init_pg_end
+   sub x1, x1, x0
+   bl  __inval_dcache_area
+
/*
 * Clear the idmap and swapper page tables.
 */
adrp    x0, idmap_pg_dir
adrp    x1, swapper_pg_end
-   sub x1, x1, x0
-1: stp xzr, xzr, [x0], #16
-   stp xzr, xzr, [x0], #16
-   stp xzr, xzr, [x0], #16
-   stp xzr, xzr, [x0], #16
-   subs    x1, x1, #64
-   b.ne    1b
+   clear_pages x0, x1
+
+   adrp    x0, init_pg_dir
+   adrp    x1, init_pg_end
+   clear_pages x0, x1
 
mov x7, SWAPPER_MM_MMUFLAGS
 
@@ -395,6 +398,11 @@ __create_page_tables:
dmb sy
bl  __inval_dcache_area
 
+   adrp    x0, init_pg_dir
+   adrp    x1, init_pg_end
+   sub x1, x1, x0
+   bl  __inval_dcache_area
+
ret x28
 ENDPROC(__create_page_tables)
.ltorg
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 605d1b60469c..61d7cee3eaa6 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -68,6 +68,12 @@ jiffies = jiffies_64;
 #define TRAMP_TEXT
 #endif
 
+#define INIT_PG_TABLES \
+   . = ALIGN(PAGE_SIZE);   \
+   init_pg_dir = .;\
+   . += SWAPPER_DIR_SIZE;  \
+   init_pg_end = .;
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -161,6 +167,8 @@ SECTIONS
__inittext_end = .;
__initdata_begin = .;
 
+   INIT_PG_TABLES
+
.init.data : {
INIT_DATA
INIT_SETUP(16)
-- 
2.17.1
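
As a reading aid, the two assembler macros added above correspond
roughly to the following C. This is a sketch only: the function names
and the 4K page size are assumptions, and where the assembly stops each
page when the address becomes page aligned again, the C version simply
counts bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4K pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Clear one page, 64 bytes per step like the four stp pairs in the macro. */
static void clear_page_sketch(unsigned char *page)
{
	size_t off;

	for (off = 0; off < SKETCH_PAGE_SIZE; off += 64)
		memset(page + off, 0, 64);
}

/* Clear [start, end) page by page; returns the number of pages cleared. */
static size_t clear_pages_sketch(unsigned char *start, unsigned char *end)
{
	size_t pages, i;

	assert(start <= end);
	assert((size_t)(end - start) % SKETCH_PAGE_SIZE == 0);

	/* Same arithmetic as the macro: (end - start) >> PAGE_SHIFT. */
	pages = (size_t)(end - start) >> SKETCH_PAGE_SHIFT;
	for (i = 0; i < pages; i++)
		clear_page_sketch(start + i * SKETCH_PAGE_SIZE);
	return pages;
}
```

Note that the assembly version clobbers both of its arguments (start is
advanced and end is reused as the page counter), which the C sketch does
not reproduce.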

[RESEND PATCH v4 4/6] arm64/mm: Create the final page table directly in swapper_pg_dir.

2018-08-22 Thread Jun Yao

As the initial page table is created in the init_pg_dir, we can set
up the final page table directly in the swapper_pg_dir. Since it then
only contains the top level page table, we can reduce it to a single
page.

Signed-off-by: Jun Yao 
---
 arch/arm64/kernel/vmlinux.lds.S |  2 +-
 arch/arm64/mm/mmu.c | 29 ++---
 2 files changed, 3 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 61d7cee3eaa6..2446911f4262 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -237,7 +237,7 @@ SECTIONS
. += RESERVED_TTBR0_SIZE;
 #endif
swapper_pg_dir = .;
-   . += SWAPPER_DIR_SIZE;
+   . += PAGE_SIZE;
swapper_pg_end = .;
 
__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index f7e544f6f3eb..b7f9afb628ac 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -642,35 +642,10 @@ void __init set_init_mm_pgd(pgd_t *pgd)
  */
 void __init paging_init(void)
 {
-   phys_addr_t pgd_phys = early_pgtable_alloc();
-   pgd_t *pgdp = pgd_set_fixmap(pgd_phys);
-
-   map_kernel(pgdp);
-   map_mem(pgdp);
-
-   /*
-* We want to reuse the original swapper_pg_dir so we don't have to
-* communicate the new address to non-coherent secondaries in
-* secondary_entry, and so cpu_switch_mm can generate the address with
-* adrp+add rather than a load from some global variable.
-*
-* To do this we need to go via a temporary pgd.
-*/
-   cpu_replace_ttbr1(__va(pgd_phys));
-   memcpy(swapper_pg_dir, pgdp, PGD_SIZE);
+   map_kernel(swapper_pg_dir);
+   map_mem(swapper_pg_dir);
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
set_init_mm_pgd(swapper_pg_dir);
-
-   pgd_clear_fixmap();
-   memblock_free(pgd_phys, PAGE_SIZE);
-
-   /*
-* We only reuse the PGD from the swapper_pg_dir, not the pud + pmd
-* allocated with it.
-*/
-   memblock_free(__pa_symbol(swapper_pg_dir) + PAGE_SIZE,
- __pa_symbol(swapper_pg_end) - __pa_symbol(swapper_pg_dir)
- - PAGE_SIZE);
 }
 
 /*
-- 
2.17.1

[RESEND PATCH v4 3/6] arm64/mm: Create the initial page table in the init_pg_dir.

2018-08-22 Thread Jun Yao

Create the initial page table in the init_pg_dir. Before calling
kasan_early_init(), we update the init_mm.pgd by introducing
set_init_mm_pgd(), which ensures that pgd_offset_k() works
correctly. When the final page table is created, we redirect
the init_mm.pgd to the swapper_pg_dir.

Signed-off-by: Jun Yao 
---
 arch/arm64/include/asm/pgtable.h |  2 ++
 arch/arm64/kernel/head.S |  9 ++---
 arch/arm64/mm/mmu.c  | 14 ++
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..46ef21ebfe47 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -712,6 +712,8 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 }
 #endif
 
+extern pgd_t init_pg_dir[PTRS_PER_PGD];
+extern pgd_t init_pg_end[];
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
 extern pgd_t swapper_pg_end[];
 extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c3e4b1886cde..ede2e964592b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -376,7 +376,7 @@ __create_page_tables:
/*
 * Map the kernel image (starting with PHYS_OFFSET).
 */
-   adrp    x0, swapper_pg_dir
+   adrp    x0, init_pg_dir
mov_q   x5, KIMAGE_VADDR + TEXT_OFFSET  // compile time __va(_text)
add x5, x5, x23 // add KASLR displacement
mov x4, PTRS_PER_PGD
@@ -402,7 +402,6 @@ __create_page_tables:
adrp    x1, init_pg_end
sub x1, x1, x0
bl  __inval_dcache_area
-
ret x28
 ENDPROC(__create_page_tables)
.ltorg
@@ -439,6 +438,9 @@ __primary_switched:
bl  __pi_memset
dsb ishst   // Make zero page visible to PTW
 
+   adrp    x0, init_pg_dir
+   bl  set_init_mm_pgd
+
 #ifdef CONFIG_KASAN
bl  kasan_early_init
 #endif
@@ -833,8 +835,9 @@ __primary_switch:
mrs x20, sctlr_el1  // preserve old SCTLR_EL1 value
 #endif
 
-   adrp    x1, swapper_pg_dir
+   adrp    x1, init_pg_dir
bl  __enable_mmu
+
 #ifdef CONFIG_RELOCATABLE
bl  __relocate_kernel
 #ifdef CONFIG_RANDOMIZE_BASE
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 65f86271f02b..f7e544f6f3eb 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -623,6 +623,19 @@ static void __init map_kernel(pgd_t *pgdp)
kasan_copy_shadow(pgdp);
 }
 
+/*
+ * set_init_mm_pgd() just updates init_mm.pgd. The purpose of using
+ * assembly is to prevent KASAN instrumentation, as KASAN has not
+ * been initialized when this function is called.
+ */
+void __init set_init_mm_pgd(pgd_t *pgd)
+{
+   pgd_t **addr = &(init_mm.pgd);
+
+   asm volatile("str %x0, [%1]\n"
+   : : "r" (pgd), "r" (addr) : "memory");
+}
+
 /*
  * paging_init() sets up the page tables, initialises the zone memory
  * maps and sets up the zero page.
@@ -646,6 +659,7 @@ void __init paging_init(void)
cpu_replace_ttbr1(__va(pgd_phys));
memcpy(swapper_pg_dir, pgdp, PGD_SIZE);
cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
+   set_init_mm_pgd(swapper_pg_dir);
 
pgd_clear_fixmap();
memblock_free(pgd_phys, PAGE_SIZE);
-- 
2.17.1

[RESEND PATCH v4 6/6] arm64/mm: Move {idmap_pg_dir .. swapper_pg_dir} to rodata section.

2018-08-22 Thread Jun Yao

Move idmap_pg_dir, tramp_pg_dir, reserved_ttbr0 and swapper_pg_dir to
the rodata section. Once the kernel has been initialized, idmap_pg_dir,
tramp_pg_dir and reserved_ttbr0 no longer change, so it is safe to
move them to the rodata section.

Signed-off-by: Jun Yao 
---
 arch/arm64/kernel/vmlinux.lds.S | 39 -
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 2446911f4262..142528a23b44 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -64,8 +64,13 @@ jiffies = jiffies_64;
*(.entry.tramp.text)\
. = ALIGN(PAGE_SIZE);   \
__entry_tramp_text_end = .;
+
+#define TRAMP_PG_TABLE \
+   tramp_pg_dir = .;   \
+   . += PAGE_SIZE;
 #else
 #define TRAMP_TEXT
+#define TRAMP_PG_TABLE
 #endif
 
 #define INIT_PG_TABLES \
@@ -74,6 +79,24 @@ jiffies = jiffies_64;
. += SWAPPER_DIR_SIZE;  \
init_pg_end = .;
 
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+#define RESERVED_PG_TABLE  \
+   reserved_ttbr0 = .; \
+   . += RESERVED_TTBR0_SIZE;
+#else
+#define RESERVED_PG_TABLE
+#endif
+
+#define KERNEL_PG_TABLES   \
+   . = ALIGN(PAGE_SIZE);   \
+   idmap_pg_dir = .;   \
+   . += IDMAP_DIR_SIZE;\
+   TRAMP_PG_TABLE  \
+   RESERVED_PG_TABLE   \
+   swapper_pg_dir = .; \
+   . += PAGE_SIZE; \
+   swapper_pg_end = .;
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -143,6 +166,7 @@ SECTIONS
RO_DATA(PAGE_SIZE)  /* everything from this point to */
EXCEPTION_TABLE(8)  /* __init_begin will be marked RO NX */
NOTES
+   KERNEL_PG_TABLES
 
. = ALIGN(SEGMENT_ALIGN);
__init_begin = .;
@@ -224,21 +248,6 @@ SECTIONS
BSS_SECTION(0, 0, 0)
 
. = ALIGN(PAGE_SIZE);
-   idmap_pg_dir = .;
-   . += IDMAP_DIR_SIZE;
-
-#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
-   tramp_pg_dir = .;
-   . += PAGE_SIZE;
-#endif
-
-#ifdef CONFIG_ARM64_SW_TTBR0_PAN
-   reserved_ttbr0 = .;
-   . += RESERVED_TTBR0_SIZE;
-#endif
-   swapper_pg_dir = .;
-   . += PAGE_SIZE;
-   swapper_pg_end = .;
 
__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
_end = .;
-- 
2.17.1

Re: [PATCH 1/9] CHROMIUM: v4l: Add H264 low-level decoder API compound controls.

2018-08-22 Thread Tomasz Figa

On Wed, Aug 22, 2018 at 6:16 PM Maxime Ripard  wrote:
>
> Hi,
>
> On Tue, Aug 21, 2018 at 01:58:38PM -0300, Ezequiel Garcia wrote:
> > On Wed, 2018-06-13 at 16:07 +0200, Maxime Ripard wrote:
> > > From: Pawel Osciak 
> > >
> > > Signed-off-by: Pawel Osciak 
> > > Reviewed-by: Wu-cheng Li 
> > > Tested-by: Tomasz Figa 
> > > [rebase44(groeck): include linux/types.h in v4l2-controls.h]
> > > Signed-off-by: Guenter Roeck 
> > > Signed-off-by: Maxime Ripard 
> > > ---
> > >
> > [..]
> > > diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
> > > index 242a6bfa1440..4b4a1b25a0db 100644
> > > --- a/include/uapi/linux/videodev2.h
> > > +++ b/include/uapi/linux/videodev2.h
> > > @@ -626,6 +626,7 @@ struct v4l2_pix_format {
> > >  #define V4L2_PIX_FMT_H264 v4l2_fourcc('H', '2', '6', '4') /* H264 with start codes */
> > >  #define V4L2_PIX_FMT_H264_NO_SC v4l2_fourcc('A', 'V', 'C', '1') /* H264 without start codes */
> > >  #define V4L2_PIX_FMT_H264_MVC v4l2_fourcc('M', '2', '6', '4') /* H264 MVC */
> > > +#define V4L2_PIX_FMT_H264_SLICE v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices */
> >
> > As pointed out by Tomasz, the Rockchip VPU driver expects start codes [1],
> > so the userspace should be aware of it. Perhaps we could document this
> > pixel format better as:
> >
> > #define V4L2_PIX_FMT_H264_SLICE v4l2_fourcc('S', '2', '6', '4') /* H264 parsed slices with start codes */
>
> I'm not sure this is something we want to do at that point. libva
> doesn't give the start code, so this is only going to make the life of
> the sane controllers more difficult. And if you need to have the start
> code and parse it, then you're not so stateless anymore.

I might not remember correctly, but the Rockchip decoder does some slice
parsing on its own (despite not doing any higher-level parsing).
That is probably why it needs those start codes.

I wonder if libva is the best reference here. It's been designed
almost entirely by Intel for Intel video hardware. We want something
that could work with a wide range of devices and avoid the need to
create a semi-stateless API a few months later. In fact, hardware from
another vendor we're working with also parses slice headers
internally. Moreover, we have some weird kind-of-stateful decoders
which cannot fully deal with the bitstream on their own, e.g. they
cannot parse formats, cannot handle resolution changes, need H264
bitstream NALUs split into separate buffers, etc.

As I suggested some time ago, having the full bitstream in the buffer,
with offsets of particular units included in respective controls,
would be the most scalable thing. If really needed, we could add flags
telling the driver that particular units are present, so one's
implementation of libva could put only raw slice data in the buffers.
But perhaps it's libva which needs some amendment?

Best regards,
Tomasz
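
The suggestion above, keeping the full bitstream in the buffer and passing
the offsets of individual units separately, can be sketched roughly as
follows. This is only an illustration, assuming Annex B byte-stream framing
(units prefixed by a 00 00 01 start code, optionally preceded by an extra
zero byte). The function name sketch_find_nal_offsets() is hypothetical and
is not part of V4L2 or libva.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: scan an Annex B byte stream and record the offset
 * of the first byte after each 00 00 01 start code (i.e. the NAL header).
 * A four-byte start code (00 00 00 01) is handled implicitly, because its
 * last three bytes form a three-byte start code.
 * Returns the number of offsets written (at most max_offsets).
 */
static size_t sketch_find_nal_offsets(const uint8_t *buf, size_t len,
				      size_t *offsets, size_t max_offsets)
{
	size_t count = 0;
	size_t i = 0;

	while (i + 3 <= len && count < max_offsets) {
		if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
			offsets[count++] = i + 3;
			i += 3;		/* skip past the start code */
		} else {
			i++;
		}
	}
	return count;
}
```

With offsets like these carried next to the buffer, a driver that needs
start codes can look a few bytes before each offset, while one that wants
raw slice data can start reading at the offset itself.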

[RESEND PATCH v4 5/6] arm64/mm: Populate the swapper_pg_dir by fixmap.

2018-08-22 Thread Jun Yao

Since swapper_pg_dir will be moved to the rodata section, we need a
way to update it. The fixmap can handle this: when swapper_pg_dir
needs to be updated, we map it dynamically and remove the mapping
once the update is complete. In this way, we can defend against
KSMA (Kernel Space Mirror Attack).

Signed-off-by: Jun Yao 
---
 arch/arm64/include/asm/pgtable.h | 68 ++--
 arch/arm64/mm/mmu.c  |  2 +
 2 files changed, 59 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 46ef21ebfe47..d5c3df99af7b 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -45,6 +45,13 @@
 #include 
 #include 
 
+extern pgd_t init_pg_dir[PTRS_PER_PGD];
+extern pgd_t init_pg_end[];
+extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
+extern pgd_t swapper_pg_end[];
+extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
+
 extern void __pte_error(const char *file, int line, unsigned long val);
 extern void __pmd_error(const char *file, int line, unsigned long val);
 extern void __pud_error(const char *file, int line, unsigned long val);
@@ -428,8 +435,32 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 PUD_TYPE_TABLE)
 #endif
 
+extern spinlock_t swapper_pgdir_lock;
+
+#define pgd_set_fixmap(addr)   ((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
+#define pgd_clear_fixmap() clear_fixmap(FIX_PGD)
+
+static inline bool in_swapper_pgdir(void *addr)
+{
+   return ((unsigned long)addr & PAGE_MASK) ==
+   ((unsigned long)swapper_pg_dir & PAGE_MASK);
+}
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
+#ifdef __PAGETABLE_PMD_FOLDED
+   if (in_swapper_pgdir(pmdp)) {
+   pmd_t *fixmap_pmdp;
+
+   spin_lock(&swapper_pgdir_lock);
+   fixmap_pmdp = (pmd_t *)pgd_set_fixmap(__pa(pmdp));
+   WRITE_ONCE(*fixmap_pmdp, pmd);
+   dsb(ishst);
+   pgd_clear_fixmap();
+   spin_unlock(&swapper_pgdir_lock);
+   return;
+   }
+#endif
WRITE_ONCE(*pmdp, pmd);
dsb(ishst);
 }
@@ -480,6 +511,19 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 
 static inline void set_pud(pud_t *pudp, pud_t pud)
 {
+#ifdef __PAGETABLE_PUD_FOLDED
+   if (in_swapper_pgdir(pudp)) {
+   pud_t *fixmap_pudp;
+
+   spin_lock(&swapper_pgdir_lock);
+   fixmap_pudp = (pud_t *)pgd_set_fixmap(__pa(pudp));
+   WRITE_ONCE(*fixmap_pudp, pud);
+   dsb(ishst);
+   pgd_clear_fixmap();
+   spin_unlock(&swapper_pgdir_lock);
+   return;
+   }
+#endif
WRITE_ONCE(*pudp, pud);
dsb(ishst);
 }
@@ -532,8 +576,19 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 
 static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
-   WRITE_ONCE(*pgdp, pgd);
-   dsb(ishst);
+   if (in_swapper_pgdir(pgdp)) {
+   pgd_t *fixmap_pgdp;
+
+   spin_lock(&swapper_pgdir_lock);
+   fixmap_pgdp = pgd_set_fixmap(__pa(pgdp));
+   WRITE_ONCE(*fixmap_pgdp, pgd);
+   dsb(ishst);
+   pgd_clear_fixmap();
+   spin_unlock(&swapper_pgdir_lock);
+   } else {
+   WRITE_ONCE(*pgdp, pgd);
+   dsb(ishst);
+   }
 }
 
 static inline void pgd_clear(pgd_t *pgdp)
@@ -586,8 +641,6 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 /* to find an entry in a kernel page-table-directory */
 #define pgd_offset_k(addr) pgd_offset(&init_mm, addr)
 
-#define pgd_set_fixmap(addr)   ((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
-#define pgd_clear_fixmap() clear_fixmap(FIX_PGD)
 
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
@@ -712,13 +765,6 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 }
 #endif
 
-extern pgd_t init_pg_dir[PTRS_PER_PGD];
-extern pgd_t init_pg_end[];
-extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
-extern pgd_t swapper_pg_end[];
-extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
-extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
-
 /*
  * Encode and decode a swap entry:
  * bits 0-1:   present (must be zero)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index b7f9afb628ac..691a05bbf87b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -67,6 +67,8 @@ static pte_t bm_pte[PTRS_PER_PTE] __page_aligned_bss;
 static pmd_t bm_pmd[PTRS_PER_PMD] __page_aligned_bss __maybe_unused;
 static pud_t bm_pud[PTRS_PER_PUD] __page_aligned_bss __maybe_unused;
 
+DEFINE_SPINLOCK(swapper_pgdir_lock);
+
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
  unsigned long size, pgprot_t vma_prot)
 {
-- 
2.17.1
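
The in_swapper_pgdir() test in the patch above decides whether a write must
go through the fixmap by comparing page-aligned addresses. A minimal
userspace sketch of that comparison, assuming 4 KiB pages; the SKETCH_*
names and sketch_in_pgdir_page() are made up for illustration and are not
kernel symbols:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch only: 4 KiB pages. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * True when addr falls inside the single page that starts at the
 * page-aligned base of the (hypothetical) read-only page directory.
 */
static bool sketch_in_pgdir_page(uintptr_t addr, uintptr_t pgdir_base)
{
	return (addr & SKETCH_PAGE_MASK) == (pgdir_base & SKETCH_PAGE_MASK);
}
```

Only entries that land in that one page take the slower locked path; every
other page-table write keeps the plain WRITE_ONCE() and barrier sequence.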

Re: [PATCH] ARM: use choice for kernel unwinders

2018-08-22 Thread Arnd Bergmann

On Wed, Aug 22, 2018 at 12:24 AM Stefan Agner  wrote:
>
> While in theory multiple unwinders could be compiled in, it does
> not make sense in practise. Use a choice to make the unwinder
> selection mutually exclusive and mandatory.
>
> Already before this commit it has not been possible to deselect
> FRAME_POINTER. Remove the obsolete comment.
>
> Furthermore, to produce a meaningful backtrace with FRAME_POINTER
> enabled the kernel needs a specific function prologue:
>     mov     ip, sp
>     stmfd   sp!, {fp, ip, lr, pc}
>     sub     fp, ip, #4
>
> To get to the required prologue gcc uses apcs and no-sched-prolog.
> This compiler options are not available on clang, and clang is not
> able to generate the required prologue. Make the FRAME_POINTER
> config symbol depending on !clang.
>
> Suggested-by: Arnd Bergmann 
> Signed-off-by: Stefan Agner 

Looks ok to me. I've added it to my randconfig test environment, you
will hear from me within a day if I run into build regressions.

We may still want to clean up these three lines:

lib/Kconfig.debug:  select FRAME_POINTER if !MIPS && !PPC &&
!ARM_UNWIND && !S390 && !MICROBLAZE && !ARC && !X86
lib/Kconfig.debug:  select FRAME_POINTER if !MIPS && !PPC && !S390 &&
!MICROBLAZE && !ARM_UNWIND && !ARC && !X86
lib/Kconfig.debug:  select FRAME_POINTER if !MIPS && !PPC && !S390 &&
!MICROBLAZE && !ARM_UNWIND && !ARC && !X86

in which ARM is the odd case that currently depends on an
architecture-specific symbol rather than the architecture itself.
We could introduce a 'config ARCH_HAS_UNWINDER' symbol that gets
selected by mips, ppc, s390, microblaze, arm and x86 unconditionally,
and then simplify the 'select' statements here.

   Arnd

Re: [PATCH 00/14] ata: ahci-platform: add reset control support except for existing drivers

2018-08-22 Thread Kunihiko Hayashi

Hi Hans,

Thank you for your comment.

On Wed, 22 Aug 2018 11:27:18 +0200  wrote:

> Hi,
> 
> On 22-08-18 09:36, Kunihiko Hayashi wrote:
> > Add support to get and control a list of resets for the device, and
> > add the flag indicating whether to use the reset. Existing drivers
> > set 0 to this flags.
> >
> > This series solves the issue of the previous patch [1] that was already
> > reverted [2].
> > [1] https://www.spinics.net/lists/linux-ide/msg55299.html
> > [2] https://www.spinics.net/lists/linux-ide/msg55379.html
> >
> > Kunihiko Hayashi (14):
> >ata: ahci-platform: add reset control support and the flag to specify
> >  using reset
> >ata: ahci_brcm: add second argument of ahci_platform_get_resources()
> >ata: ahci_ceva: add second argument of ahci_platform_get_resources()
> >ata: ahci_da850: add second argument of ahci_platform_get_resources()
> >ata: ahci_dm816: add second argument of ahci_platform_get_resources()
> >ata: ahci_imx: add second argument of ahci_platform_get_resources()
> >ata: ahci_brcm: add second argument of ahci_platform_get_resources()
> >ata: ahci_mvebu: add second argument of ahci_platform_get_resources()
> >ata: ahci_qoriq: add second argument of ahci_platform_get_resources()
> >ata: ahci_seattle: add second argument of
> >  ahci_platform_get_resources()
> >ata: ahci_st: add second argument of ahci_platform_get_resources()
> >ata: ahci_sunxi: add second argument of ahci_platform_get_resources()
> >ata: ahci_tegra: add second argument of ahci_platform_get_resources()
> >ata: ahci_xgene: add second argument of ahci_platform_get_resources()
> 
> When you change a function prototype, you must also change all
> the callers in a single commit, so that all intermediate commits
> will compile without errors, otherwise you will break git bisect.

Indeed, these split patches would make git bisect fail.
I'll combine them into a single commit.

> Otherwise this looks good.
> 
> I suggest you split this like this:
> 
> 1) Add a flags argument to ahci_platform_get_resources(),
> without adding support for any flags yet, so this just
> changes the function prototype and passes 0 for the new
> flags argument *everywhere* without any other changes
> 2) Add support for a AHCI_PLATFORM_GET_RESETS flag, basically
> your current first patch, minus the prototype patches
> 3) A patch which passes AHCI_PLATFORM_GET_RESETS for the
> generic ahci_platform driver (so break this out of your
> first patch). Also describe in the commit message of this
> patch why / for which platforms this is necessary.
> 
> The idea of doing 3. separately is that we can easily revert
> it in case of problems while keeping the core functionality
> in place. Note I do not expect this to be necessary.

Your split plan will be very useful for bisecting. I'll try it next.

---
Best Regards,
Kunihiko Hayashi
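
The split suggested above (first add a flags argument that every caller
passes as 0, then teach the function about one flag) can be sketched in
plain C. This is an illustration only: SKETCH_GET_RESETS, struct
sketch_hpriv and sketch_get_resources() are hypothetical stand-ins, not the
real ahci_platform API.

```c
#include <stdbool.h>

/* Hypothetical flag bit, standing in for a "get resets" request. */
#define SKETCH_GET_RESETS (1U << 0)

/* Hypothetical, heavily reduced private data. */
struct sketch_hpriv {
	bool has_resets;
};

/*
 * Step 1 of the split would add the 'flags' parameter and ignore it.
 * Step 2 adds the check below. Callers that pass 0 behave as before.
 */
static struct sketch_hpriv sketch_get_resources(unsigned int flags)
{
	struct sketch_hpriv hpriv = { .has_resets = false };

	if (flags & SKETCH_GET_RESETS)
		hpriv.has_resets = true;	/* stands in for acquiring resets */

	return hpriv;
}
```

Because existing callers keep passing 0, each intermediate commit still
builds and behaves as before, which is what keeps git bisect usable.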

Re: [PATCH 01/14] ata: ahci-platform: add reset control support and the flag to specify using reset

2018-08-22 Thread Kunihiko Hayashi

Hi Sergei,

On Wed, 22 Aug 2018 12:34:30 +0300  wrote:

> Hello!
> 
> On 8/22/2018 10:36 AM, Kunihiko Hayashi wrote:
> 
> > Add support to get and control a list of resets for the device
> > as optional and shared. These resets must be kept de-asserted until
> > the device is enabled.
> >
> > This is specified as shared because some SoCs like UniPhier series
> > have common reset controls with all ahci controller instances.
> >
> > However, according to Thierry's view,
> > https://www.spinics.net/lists/linux-ide/msg55357.html
> > some hardware-specific drivers already use their own resets,
> > and the common reset make a path to occur double controls of resets.
> >
> > Now this add the flag to ahci_platform_get_resources() indicating
> > whether to use the resources, currently resets only, and existing
> > drivers set 0 to this flags.
> >
> > Suggested-by: Hans de Goede 
> > Cc: Thierry Reding 
> > Signed-off-by: Kunihiko Hayashi 
> [...]
> 
> > diff --git a/include/linux/ahci_platform.h b/include/linux/ahci_platform.h
> > index 1b0a17b..eaedca5f 100644
> > --- a/include/linux/ahci_platform.h
> > +++ b/include/linux/ahci_platform.h
> > @@ -30,7 +30,7 @@ void ahci_platform_disable_regulators(struct ahci_host_priv *hpriv);
> >   int ahci_platform_enable_resources(struct ahci_host_priv *hpriv);
> >   void ahci_platform_disable_resources(struct ahci_host_priv *hpriv);
> >   struct ahci_host_priv *ahci_platform_get_resources(
> > -   struct platform_device *pdev);
> > +   struct platform_device *pdev, unsigned int flags);
> 
> That breaks all the users of this API. You should fix the callers in this 
> same patch to avoid breakage.

Thank you for pointing this out.
Indeed, these split patches break git bisect. I'll fix it.

---
Best Regards,
Kunihiko Hayashi

Re: [PATCH v9 22/22] s390: doc: detailed specifications for AP virtualization

2018-08-22 Thread Cornelia Huck

On Wed, 22 Aug 2018 09:04:13 +0200
Harald Freudenberger  wrote:

> Well, sooner or later this has to work. Yesterday we tested the control
> domain thing with trying to pull some simple data from a 'controlled' domain
> to the TKE - doesn't work with a Linux LPAR. I will investigate the details 
> in the
> next weeks. However, long-term it should be possible to run scenarios
> like having one KVM guest control all the domains used by other KVM guests.
> With respect to the KVM vfio driver, currently there should be just the
> rule that for a guest the control domain mask should be equal or a superset
> of the usage domain mask. This is by convention as the architecture is
> not so clear here, but this is enforced on every place which deals with
> usage and control domains (SE, TKE).

Thanks for the update; this makes me think we really should fiddle with
the masks in the kernel (as opposed to doing it higher up in the stack).

Re: [PATCH] dt-binding: arm/cpus.txt: fix dynamic-power-coefficient unit

2018-08-22 Thread Punit Agrawal

Hi Vincent,

Thanks for the patch. One comment about the choice of units below.

Vincent Guittot  writes:

> The unit of dynamic-power-coefficient is described as mW/MHz/uV^2 whereas
> its usage in the code assumes that unit is mW/GHz/V^2

Instead of choosing GHz as the base, I'd prefer to use uW/MHz/V^2. It'll
avoid introducing fractional GHz value for frequency calculations.

> In drivers/thermal/cpu_cooling.c, the code is :
>
> power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
> do_div(power, 1000000000);
>
> which can be summarized as :
> power (mW) = capacitance * freq_mhz/1000 * (voltage_mv/1000)^2

Which would then translate to -

power (uW) = capacitance * freq_mhz * (voltage_mv/1000)^2
power (mW) = power (uW) / 1000

Thanks,
Punit
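
As a worked check of the unit discussion, here is a small standalone sketch
of the arithmetic. It assumes the summary formula quoted in this thread
(frequency in MHz divided by 1000, voltage in mV divided by 1000 and
squared); the function names are made up for illustration and are not the
ones used in cpu_cooling.c.

```c
#include <stdint.h>

/*
 * Coefficient read as mW/GHz/V^2:
 *   power (mW) = coeff * (freq_mhz / 1000) * (voltage_mv / 1000)^2
 * All three divisions are folded into one integer division by 10^9.
 */
static uint64_t sketch_dyn_power_mw(uint32_t coeff, uint32_t freq_mhz,
				    uint32_t voltage_mv)
{
	uint64_t power = (uint64_t)coeff * freq_mhz * voltage_mv * voltage_mv;

	return power / 1000000000ULL;
}

/*
 * Same coefficient read as uW/MHz/V^2:
 *   power (uW) = coeff * freq_mhz * (voltage_mv / 1000)^2
 * Only the voltage scaling (10^6) remains in the divisor.
 */
static uint64_t sketch_dyn_power_uw(uint32_t coeff, uint32_t freq_mhz,
				    uint32_t voltage_mv)
{
	uint64_t power = (uint64_t)coeff * freq_mhz * voltage_mv * voltage_mv;

	return power / 1000000ULL;
}
```

For coeff = 1 at 1000 MHz and 1000 mV the first function gives 1 (mW) and
the second gives 1000 (uW). The two readings describe the same coefficient
value, since 1 mW/GHz equals 1 uW/MHz; the uW/MHz form just avoids
fractional GHz.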

>
> Furthermore, if we test basic values like :
> voltage_mv = 1000mV = 1V
> freq_mhz = 1000Mhz = 1Ghz
>
> The minimum possible power, when dynamic-power-coefficient equals 1, will
> be :
> min power = 1 * 1000 * (1000000)^2 = 10^15 mW
> which is not realistic
>
> With the unit used by the code, the min power is
> min power =  1 * 1 * 1^2 = 1mW which is far more realistic
>
> Signed-off-by: Vincent Guittot 
> ---
>  Documentation/devicetree/bindings/arm/cpus.txt | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
> index 29e1dc5..0148d7d 100644
> --- a/Documentation/devicetree/bindings/arm/cpus.txt
> +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> @@ -274,7 +274,7 @@ described below.
>   Usage: optional
>   Value type: 
>   Definition: A u32 value that represents the running time dynamic
> - power coefficient in units of mW/MHz/uV^2. The
> + power coefficient in units of mW/GHz/V^2. The
>   coefficient can either be calculated from power
>   measurements or derived by analysis.
>  
> @@ -285,7 +285,7 @@ described below.
>  
>   Pdyn = dynamic-power-coefficient * V^2 * f
>  
> - where voltage is in uV, frequency is in MHz.
> + where voltage is in V, frequency is in GHz.
>  
>  Example 1 (dual-cluster big.LITTLE system 32-bit):

Re: [PATCH v9 22/22] s390: doc: detailed specifications for AP virtualization

2018-08-22 Thread Harald Freudenberger

... about control domains

Talked with the s390 firmware guys. The convention that the control domain
mask is a superset of the usage domain mask is only true for 1st level guests.

It is absolutely valid to run a kvm guest with restricted control domain
mask bitmap in the CRYCB. It is valid to have an empty control domain mask
and the guest should be able to run crypto CPRBs on the usage domain(s) without
any problems. However, nobody has tried this.

regards
Harald Freudenberger

[PATCH v4 RESEND 0/5] KVM: x86: hyperv: PV IPI support for Windows guests

2018-08-22 Thread Vitaly Kuznetsov

Changes since v4:
- Adjust KVM_CAP_HYPERV_SEND_IPI's number [158]
- Add Roman's Reviewed-bys

Using a hypercall to send IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set), and the whole
procedure takes only one VMEXIT.

As with PV TLB flush, this allows Windows guests with > 64 vCPUs to boot
on KVM when Hyper-V extensions are enabled.

Vitaly Kuznetsov (5):
  KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
  KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()
  KVM: x86: hyperv: use get_vcpu_by_vpidx() in kvm_hv_flush_tlb()
  x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
  KVM: x86: hyperv: implement PV IPI send hypercalls

 Documentation/virtual/kvm/api.txt  |   8 ++
 arch/x86/hyperv/hv_apic.c  |  12 +--
 arch/x86/include/asm/hyperv-tlfs.h |  16 +--
 arch/x86/kvm/hyperv.c  | 211 +++--
 arch/x86/kvm/trace.h   |  42 
 arch/x86/kvm/x86.c |   1 +
 include/uapi/linux/kvm.h   |   1 +
 virt/kvm/kvm_main.c|   6 +-
 8 files changed, 224 insertions(+), 73 deletions(-)

-- 
2.14.4
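
The "sparse CPU set" mentioned in the cover letter groups vCPUs into banks
of 64 and only transmits the banks whose bit is set in a validity mask. A
rough sketch of how a VP index maps into such a set follows; the function
names are hypothetical and the layout (one packed 64-bit word per valid
bank, in ascending bank order) is an assumption made for this example.

```c
#include <stdint.h>

/*
 * Position of 'bank' within the packed array of transmitted banks,
 * or -1 when that bank is not marked valid in valid_bank_mask.
 */
static int sketch_sparse_bank_no(uint64_t valid_bank_mask, int bank)
{
	int i, sbank = 0;

	if (bank < 0 || bank >= 64 || !(valid_bank_mask & (1ULL << bank)))
		return -1;

	/* Count the valid banks that precede this one. */
	for (i = 0; i < bank; i++)
		if (valid_bank_mask & (1ULL << i))
			sbank++;

	return sbank;
}

/* Non-zero when vp_index is selected by the sparse set. */
static int sketch_vp_selected(uint64_t valid_bank_mask,
			      const uint64_t *sparse_banks, uint32_t vp_index)
{
	int sbank = sketch_sparse_bank_no(valid_bank_mask, (int)(vp_index / 64));

	if (sbank < 0)
		return 0;

	return (int)((sparse_banks[sbank] >> (vp_index % 64)) & 1);
}
```

With valid_bank_mask = 0x5 (banks 0 and 2 present), VP index 134 is bank 2,
bit 6, and is looked up in the second packed word.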

[PATCH v4 RESEND 2/5] KVM: x86: hyperv: optimize 'all cpus' case in kvm_hv_flush_tlb()

2018-08-22 Thread Vitaly Kuznetsov

We can use 'NULL' to represent the 'all cpus' case in
kvm_make_vcpus_request_mask() and avoid building a vCPU mask with
all vCPUs set.

Suggested-by: Radim Krčmář 
Signed-off-by: Vitaly Kuznetsov 
Reviewed-by: Roman Kagan 
---
 arch/x86/kvm/hyperv.c | 42 +++---
 virt/kvm/kvm_main.c   |  6 ++
 2 files changed, 25 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 0cd597b0f754..b45ce136be2f 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1325,35 +1325,39 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
 
cpumask_clear(&hv_current->tlb_lush);
 
+   if (all_cpus) {
+   kvm_make_vcpus_request_mask(kvm,
+   KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
+   NULL, &hv_current->tlb_lush);
+   goto ret_success;
+   }
+
kvm_for_each_vcpu(i, vcpu, kvm) {
struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
int bank = hv->vp_index / 64, sbank = 0;
 
-   if (!all_cpus) {
-   /* Banks >64 can't be represented */
-   if (bank >= 64)
-   continue;
-
-   /* Non-ex hypercalls can only address first 64 vCPUs */
-   if (!ex && bank)
-   continue;
+   /* Banks >64 can't be represented */
+   if (bank >= 64)
+   continue;
 
-   if (ex) {
-   /*
-* Check is the bank of this vCPU is in sparse
-* set and get the sparse bank number.
-*/
-   sbank = get_sparse_bank_no(valid_bank_mask,
-  bank);
+   /* Non-ex hypercalls can only address first 64 vCPUs */
+   if (!ex && bank)
+   continue;
 
-   if (sbank < 0)
-   continue;
-   }
+   if (ex) {
+   /*
+* Check is the bank of this vCPU is in sparse
+* set and get the sparse bank number.
+*/
+   sbank = get_sparse_bank_no(valid_bank_mask, bank);
 
-   if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+   if (sbank < 0)
continue;
}
 
+   if (!(sparse_banks[sbank] & BIT_ULL(hv->vp_index % 64)))
+   continue;
+
/*
 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we
 * can't analyze it here, flush TLB regardless of the specified
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f83239ac8be1..3340f8128dc8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -218,7 +218,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
me = get_cpu();
 
kvm_for_each_vcpu(i, vcpu, kvm) {
-   if (!test_bit(i, vcpu_bitmap))
+   if (vcpu_bitmap && !test_bit(i, vcpu_bitmap))
continue;
 
kvm_make_request(req, vcpu);
@@ -242,12 +242,10 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
 {
cpumask_var_t cpus;
bool called;
-   static unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]
-   = {[0 ... BITS_TO_LONGS(KVM_MAX_VCPUS)-1] = ULONG_MAX};
 
zalloc_cpumask_var(&cpus, GFP_ATOMIC);
 
-   called = kvm_make_vcpus_request_mask(kvm, req, vcpu_bitmap, cpus);
+   called = kvm_make_vcpus_request_mask(kvm, req, NULL, cpus);
 
free_cpumask_var(cpus);
return called;
-- 
2.14.4
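
The convention introduced by the patch above, where a NULL bitmap means
"every vCPU", can be shown with a tiny standalone sketch. The names are
hypothetical and the loop only counts targets instead of sending requests.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Minimal stand-in for a test_bit() style helper. */
static int sketch_test_bit(size_t nr, const unsigned long *bitmap)
{
	return (int)((bitmap[nr / SKETCH_BITS_PER_LONG] >>
		      (nr % SKETCH_BITS_PER_LONG)) & 1UL);
}

/*
 * Count how many of nr_vcpus would be targeted.
 * A NULL vcpu_bitmap selects all of them, so no all-ones bitmap
 * has to be built or stored.
 */
static size_t sketch_count_targets(size_t nr_vcpus,
				   const unsigned long *vcpu_bitmap)
{
	size_t i, hit = 0;

	for (i = 0; i < nr_vcpus; i++) {
		if (vcpu_bitmap && !sketch_test_bit(i, vcpu_bitmap))
			continue;
		hit++;
	}
	return hit;
}
```

The trade-off is one extra pointer check per iteration in exchange for
dropping the static all-ones bitmap.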

[PATCH v4 RESEND 1/5] KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS

2018-08-22 Thread Vitaly Kuznetsov

Hyper-V TLFS (5.0b) states:

> Virtual processors are identified by using an index (VP index). The
> maximum number of virtual processors per partition supported by the
> current implementation of the hypervisor can be obtained through CPUID
> leaf 0x40000005. A virtual processor index must be less than the
> maximum number of virtual processors per partition.

Forbid userspace from setting VP_INDEX to KVM_MAX_VCPUS or above.
get_vcpu_by_vpidx() can now be optimized to bail out early when the
supplied vpidx is >= KVM_MAX_VCPUS.

Signed-off-by: Vitaly Kuznetsov 
Reviewed-by: Roman Kagan 
---
 arch/x86/kvm/hyperv.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 01d209ab5481..0cd597b0f754 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -132,8 +132,10 @@ static struct kvm_vcpu *get_vcpu_by_vpidx(struct kvm *kvm, u32 vpidx)
struct kvm_vcpu *vcpu = NULL;
int i;
 
-   if (vpidx < KVM_MAX_VCPUS)
-   vcpu = kvm_get_vcpu(kvm, vpidx);
+   if (vpidx >= KVM_MAX_VCPUS)
+   return NULL;
+
+   vcpu = kvm_get_vcpu(kvm, vpidx);
if (vcpu && vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
return vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
@@ -1044,7 +1046,7 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 
switch (msr) {
case HV_X64_MSR_VP_INDEX:
-   if (!host)
+   if (!host || (u32)data >= KVM_MAX_VCPUS)
return 1;
hv->vp_index = (u32)data;
break;
-- 
2.14.4
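
The lookup pattern in the patch above (reject out-of-range indices, try the
slot whose position equals the VP index, then fall back to a linear scan)
can be sketched outside the kernel. The struct, the SKETCH_MAX_VCPUS limit
of 8 and the function name are assumptions made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_VCPUS 8	/* arbitrary limit for the sketch */

struct sketch_vcpu {
	int present;		/* slot is in use */
	uint32_t vp_index;	/* index assigned by userspace */
};

static struct sketch_vcpu *sketch_get_by_vpidx(struct sketch_vcpu *vcpus,
					       uint32_t vpidx)
{
	size_t i;

	/* No vCPU can carry an index at or above the limit. */
	if (vpidx >= SKETCH_MAX_VCPUS)
		return NULL;

	/* Fast path: the VP index usually equals the slot number. */
	if (vcpus[vpidx].present && vcpus[vpidx].vp_index == vpidx)
		return &vcpus[vpidx];

	/* Slow path: scan every slot. */
	for (i = 0; i < SKETCH_MAX_VCPUS; i++)
		if (vcpus[i].present && vcpus[i].vp_index == vpidx)
			return &vcpus[i];

	return NULL;
}
```

Enforcing the limit when the index is set is what makes the early return
safe: an index that can never be stored never needs to be searched for.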

[PATCH v4 RESEND 5/5] KVM: x86: hyperv: implement PV IPI send hypercalls

2018-08-22 Thread Vitaly Kuznetsov

Using a hypercall to send IPIs is faster because it allows specifying
any number of vCPUs (even > 64 with a sparse CPU set), and the whole
procedure takes only one VMEXIT.

The current Hyper-V TLFS (v5.0b) claims that the
HvCallSendSyntheticClusterIpi hypercall can't be 'fast' (passing
parameters through registers), but apparently this is not true: Windows
always uses it as 'fast', so we need to support that.

Signed-off-by: Vitaly Kuznetsov 
---
 Documentation/virtual/kvm/api.txt |   8 +++
 arch/x86/kvm/hyperv.c | 109 ++
 arch/x86/kvm/trace.h  |  42 +++
 arch/x86/kvm/x86.c|   1 +
 include/uapi/linux/kvm.h  |   1 +
 5 files changed, 161 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 7b83b176c662..832ea72d43c1 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -4690,3 +4690,11 @@ This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
 hypercalls:
 HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
 HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
+
+8.19 KVM_CAP_HYPERV_SEND_IPI
+
+Architectures: x86
+
+This capability indicates that KVM supports paravirtualized Hyper-V IPI send
+hypercalls:
+HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index d1a911132b59..3183cf9bcb63 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1360,6 +1360,101 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
((u64)rep_cnt << HV_HYPERCALL_REP_COMP_OFFSET);
 }
 
+static u64 kvm_hv_send_ipi(struct kvm_vcpu *current_vcpu, u64 ingpa, u64 outgpa,
+  bool ex, bool fast)
+{
+   struct kvm *kvm = current_vcpu->kvm;
+   struct hv_send_ipi_ex send_ipi_ex;
+   struct hv_send_ipi send_ipi;
+   struct kvm_vcpu *vcpu;
+   unsigned long valid_bank_mask;
+   u64 sparse_banks[64];
+   int sparse_banks_len, bank, i;
+   struct kvm_lapic_irq irq = {.delivery_mode = APIC_DM_FIXED};
+   bool all_cpus;
+
+   if (!ex) {
+   if (!fast) {
+   if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi,
+   sizeof(send_ipi))))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+   sparse_banks[0] = send_ipi.cpu_mask;
+   irq.vector = send_ipi.vector;
+   } else {
+   /* 'reserved' part of hv_send_ipi should be 0 */
+   if (unlikely(ingpa >> 32 != 0))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+   sparse_banks[0] = outgpa;
+   irq.vector = (u32)ingpa;
+   }
+   all_cpus = false;
+   valid_bank_mask = BIT_ULL(0);
+
+   trace_kvm_hv_send_ipi(irq.vector, sparse_banks[0]);
+   } else {
+   if (unlikely(kvm_read_guest(kvm, ingpa, &send_ipi_ex,
+   sizeof(send_ipi_ex))))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+   trace_kvm_hv_send_ipi_ex(send_ipi_ex.vector,
+send_ipi_ex.vp_set.format,
+send_ipi_ex.vp_set.valid_bank_mask);
+
+   irq.vector = send_ipi_ex.vector;
+   valid_bank_mask = send_ipi_ex.vp_set.valid_bank_mask;
+   sparse_banks_len = bitmap_weight(&valid_bank_mask, 64) *
+   sizeof(sparse_banks[0]);
+
+   all_cpus = send_ipi_ex.vp_set.format == HV_GENERIC_SET_ALL;
+
+   if (!sparse_banks_len)
+   goto ret_success;
+
+   if (!all_cpus &&
+   kvm_read_guest(kvm,
+  ingpa + offsetof(struct hv_send_ipi_ex,
+   vp_set.bank_contents),
+  sparse_banks,
+  sparse_banks_len))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+   }
+
+   if ((irq.vector < HV_IPI_LOW_VECTOR) ||
+   (irq.vector > HV_IPI_HIGH_VECTOR))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+
+   if (all_cpus) {
+   kvm_for_each_vcpu(i, vcpu, kvm) {
+   /* We fail only when APIC is disabled */
+   if (!kvm_apic_set_irq(vcpu, &irq, NULL))
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
+   }
+   goto ret_success;
+   }
+
+   for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
+BITS_PER_LONG) {
+
+   for_each_set_bit(i, (unsigned long *)&sparse_banks[ba

[PATCH v4 RESEND 3/5] KVM: x86: hyperv: use get_vcpu_by_vpidx() in kvm_hv_flush_tlb()

2018-08-22 Thread Vitaly Kuznetsov

VP_INDEX almost always matches the vCPU id, so get_vcpu_by_vpidx() is fast;
use it instead of traversing the full vCPU list every time.

To support the change, split off get_vcpu_idx_by_vpidx() from
get_vcpu_by_vpidx().
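
The lookup described above can be sketched outside the kernel as follows.
This is only an illustration under simplifying assumptions: the toy_vm and
toy_vcpu types, MAX_VCPUS and IDX_NONE are hypothetical stand-ins (for
struct kvm, struct kvm_vcpu, KVM_MAX_VCPUS and U32_MAX respectively), and
the slow path simply skips empty slots instead of using kvm_for_each_vcpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VCPUS 8u          /* hypothetical stand-in for KVM_MAX_VCPUS */
#define IDX_NONE  UINT32_MAX  /* "not found", mirrors U32_MAX in the patch */

/* Hypothetical, simplified vCPU: only the Hyper-V VP index matters here. */
struct toy_vcpu {
	uint32_t vp_index;
};

/* Hypothetical, simplified VM: a fixed table of vCPU slots. */
struct toy_vm {
	struct toy_vcpu *vcpus[MAX_VCPUS]; /* NULL means the slot is unused */
};

/* Map a VP index to a vCPU slot index, or IDX_NONE if no vCPU has it. */
static uint32_t toy_vcpu_idx_by_vpidx(const struct toy_vm *vm, uint32_t vpidx)
{
	uint32_t i;

	assert(vm != NULL);

	if (vpidx >= MAX_VCPUS)
		return IDX_NONE;

	/* Fast path: the VP index usually equals the vCPU's slot index. */
	if (vm->vcpus[vpidx] && vm->vcpus[vpidx]->vp_index == vpidx)
		return vpidx;

	/* Slow path: fall back to scanning every slot. */
	for (i = 0; i < MAX_VCPUS; i++)
		if (vm->vcpus[i] && vm->vcpus[i]->vp_index == vpidx)
			return i;

	return IDX_NONE;
}

/* Pointer-returning wrapper built on top of the index lookup. */
static struct toy_vcpu *toy_vcpu_by_vpidx(struct toy_vm *vm, uint32_t vpidx)
{
	uint32_t idx = toy_vcpu_idx_by_vpidx(vm, vpidx);

	return idx < MAX_VCPUS ? vm->vcpus[idx] : NULL;
}
```

Returning an index rather than a pointer lets a caller such as a TLB-flush
path record the target in a per-vCPU bitmap, while the wrapper keeps the
pointer-based interface for existing users.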

Signed-off-by: Vitaly Kuznetsov 
Reviewed-by: Roman Kagan 
---
 arch/x86/kvm/hyperv.c | 78 ---
 1 file changed, 31 insertions(+), 47 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b45ce136be2f..d1a911132b59 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -127,20 +127,31 @@ static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint,
return 0;
 }
 
-static struct kvm_vcpu *get_vcpu_by_vpidx(struct kvm *kvm, u32 vpidx)
+static u32 get_vcpu_idx_by_vpidx(struct kvm *kvm, u32 vpidx)
 {
struct kvm_vcpu *vcpu = NULL;
int i;
 
if (vpidx >= KVM_MAX_VCPUS)
-   return NULL;
+   return U32_MAX;
 
vcpu = kvm_get_vcpu(kvm, vpidx);
if (vcpu && vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
-   return vcpu;
+   return vpidx;
kvm_for_each_vcpu(i, vcpu, kvm)
if (vcpu_to_hv_vcpu(vcpu)->vp_index == vpidx)
-   return vcpu;
+   return i;
+   return U32_MAX;
+}
+
+static __always_inline struct kvm_vcpu *get_vcpu_by_vpidx(struct kvm *kvm,
+ u32 vpidx)
+{
+   u32 vcpu_idx = get_vcpu_idx_by_vpidx(kvm, vpidx);
+
+   if (vcpu_idx < KVM_MAX_VCPUS)
+   return kvm_get_vcpu(kvm, vcpu_idx);
+
return NULL;
 }
 
@@ -1257,20 +1268,6 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
return kvm_hv_get_msr(vcpu, msr, pdata, host);
 }
 
-static __always_inline int get_sparse_bank_no(u64 valid_bank_mask, int bank_no)
-{
-   int i = 0, j;
-
-   if (!(valid_bank_mask & BIT_ULL(bank_no)))
-   return -1;
-
-   for (j = 0; j < bank_no; j++)
-   if (valid_bank_mask & BIT_ULL(j))
-   i++;
-
-   return i;
-}
-
 static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
u16 rep_cnt, bool ex)
 {
@@ -1278,11 +1275,10 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
struct kvm_vcpu_hv *hv_current = &current_vcpu->arch.hyperv;
struct hv_tlb_flush_ex flush_ex;
struct hv_tlb_flush flush;
-   struct kvm_vcpu *vcpu;
unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)] = {0};
-   unsigned long valid_bank_mask = 0;
+   unsigned long valid_bank_mask;
u64 sparse_banks[64];
-   int sparse_banks_len, i;
+   int sparse_banks_len, bank, i;
bool all_cpus;
 
if (!ex) {
@@ -1292,6 +1288,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
trace_kvm_hv_flush_tlb(flush.processor_mask,
   flush.address_space, flush.flags);
 
+   valid_bank_mask = BIT_ULL(0);
sparse_banks[0] = flush.processor_mask;
all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
} else {
@@ -1332,38 +1329,25 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
goto ret_success;
}
 
-   kvm_for_each_vcpu(i, vcpu, kvm) {
-   struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
-   int bank = hv->vp_index / 64, sbank = 0;
+   for_each_set_bit(bank, (unsigned long *)&valid_bank_mask,
+BITS_PER_LONG) {
 
-   /* Banks >64 can't be represented */
-   if (bank >= 64)
-   continue;
+   for_each_set_bit(i, (unsigned long *)&sparse_banks[bank],
+BITS_PER_LONG) {
+   u32 vp_index = bank * 64 + i;
+   u32 vcpu_idx = get_vcpu_idx_by_vpidx(kvm, vp_index);
 
-   /* Non-ex hypercalls can only address first 64 vCPUs */
-   if (!ex && bank)
-   continue;
+   /* A non-existent vCPU was specified */
+   if (vcpu_idx >= KVM_MAX_VCPUS)
+   return HV_STATUS_INVALID_HYPERCALL_INPUT;
 
-   if (ex) {
/*
-* Check is the bank of this vCPU is in sparse
-* set and get the sparse bank number.
+* vcpu->arch.cr3 may not be up-to-date for running
+* vCPUs so we can't analyze it here, flush TLB
+* regardless of the specified address space.
 */
-   sbank = get_sparse_bank_no(valid_bank_mask, bank);
-
-   if (sbank < 0)
-   continue;
+

[PATCH v4 RESEND 4/5] x86/hyper-v: rename ipi_arg_{ex,non_ex} structures

2018-08-22 Thread Vitaly Kuznetsov

These structures are going to be used from KVM code so let's make
their names reflect their Hyper-V origin.

Signed-off-by: Vitaly Kuznetsov 
Reviewed-by: Roman Kagan 
---
 arch/x86/hyperv/hv_apic.c  | 12 ++--
 arch/x86/include/asm/hyperv-tlfs.h | 16 +---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 402338365651..49284e1506b1 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -93,14 +93,14 @@ static void hv_apic_eoi_write(u32 reg, u32 val)
  */
 static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector)
 {
-   struct ipi_arg_ex **arg;
-   struct ipi_arg_ex *ipi_arg;
+   struct hv_send_ipi_ex **arg;
+   struct hv_send_ipi_ex *ipi_arg;
unsigned long flags;
int nr_bank = 0;
int ret = 1;
 
local_irq_save(flags);
-   arg = (struct ipi_arg_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
+   arg = (struct hv_send_ipi_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
 
ipi_arg = *arg;
if (unlikely(!ipi_arg))
@@ -130,8 +130,8 @@ static bool __send_ipi_mask_ex(const struct cpumask *mask, int vector)
 static bool __send_ipi_mask(const struct cpumask *mask, int vector)
 {
int cur_cpu, vcpu;
-   struct ipi_arg_non_ex **arg;
-   struct ipi_arg_non_ex *ipi_arg;
+   struct hv_send_ipi **arg;
+   struct hv_send_ipi *ipi_arg;
int ret = 1;
unsigned long flags;
 
@@ -148,7 +148,7 @@ static bool __send_ipi_mask(const struct cpumask *mask, int vector)
return __send_ipi_mask_ex(mask, vector);
 
local_irq_save(flags);
-   arg = (struct ipi_arg_non_ex **)this_cpu_ptr(hyperv_pcpu_input_arg);
+   arg = (struct hv_send_ipi **)this_cpu_ptr(hyperv_pcpu_input_arg);
 
ipi_arg = *arg;
if (unlikely(!ipi_arg))
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 08e24f552030..d0554409a3de 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -725,19 +725,21 @@ struct hv_enlightened_vmcs {
 #define HV_STIMER_AUTOENABLE   (1ULL << 3)
 #define HV_STIMER_SINT(config) (__u8)(((config) >> 16) & 0x0F)
 
-struct ipi_arg_non_ex {
-   u32 vector;
-   u32 reserved;
-   u64 cpu_mask;
-};
-
 struct hv_vpset {
u64 format;
u64 valid_bank_mask;
u64 bank_contents[];
 };
 
-struct ipi_arg_ex {
+/* HvCallSendSyntheticClusterIpi hypercall */
+struct hv_send_ipi {
+   u32 vector;
+   u32 reserved;
+   u64 cpu_mask;
+};
+
+/* HvCallSendSyntheticClusterIpiEx hypercall */
+struct hv_send_ipi_ex {
u32 vector;
u32 reserved;
struct hv_vpset vp_set;
-- 
2.14.4
